1 Introduction
    1.1 What is ALGLIB
    1.2 ALGLIB license
    1.3 Documentation license
    1.4 Reference Manual and User Guide
    1.5 Acknowledgements
2 ALGLIB structure
    2.1 Packages
    2.2 Subpackages
    2.3 Open Source and Commercial versions
3 Compatibility
    3.1 CPU
    3.2 OS
    3.3 Compiler
    3.4 Optimization settings
4 Compiling ALGLIB
    4.1 Adding to your project
    4.2 Configuring for your compiler
    4.3 Improving performance (CPU-specific and OS-specific optimizations)
5 Working with commercial version
    5.1 Benefits of commercial version
    5.2 Working with SSE support (Intel/AMD users)
    5.3 Using multithreading
        5.3.1 SMT (CMT/hyper-threading) issues
    5.4 Linking with Intel MKL
        5.4.1 Using lightweight Intel MKL supplied by ALGLIB Project
        5.4.2 Using your own installation of Intel MKL
    5.5 Examples - compiling commercial edition of ALGLIB
        5.5.1 Introduction
        5.5.2 Compiling under Windows
6 Using ALGLIB
    6.1 Thread-safety
    6.2 Global definitions
    6.3 Datatypes
    6.4 Constants
    6.5 Functions
    6.6 Working with vectors and matrices
    6.7 Using functions: 'expert' and 'friendly' interfaces
    6.8 Handling errors
    6.9 Working with Level 1 BLAS functions
    6.10 Reading data from CSV files
7 Advanced topics
    7.1 Testing ALGLIB
8 ALGLIB packages and subpackages
    8.1 AlglibMisc package
    8.2 DataAnalysis package
    8.3 DiffEquations package
    8.4 FastTransforms package
    8.5 Integration package
    8.6 Interpolation package
    8.7 LinAlg package
    8.8 Optimization package
    8.9 Solvers package
    8.10 SpecialFunctions package
    8.11 Statistics package

1 Introduction

1.1 What is ALGLIB

ALGLIB is a cross-platform numerical analysis and data mining library. It supports several programming languages (C++, C#, Pascal, VBA) and several operating systems (Windows, *nix family).

ALGLIB features include:

ALGLIB Project (the company behind ALGLIB) delivers to you several editions of ALGLIB:

The Free Edition is a serial version without multithreading support or extensive low-level optimizations (generic C or C# code). The Commercial Edition is a heavily optimized version of ALGLIB: it supports multithreading, it was extensively optimized, and on Intel platforms our commercial users may enjoy a precompiled version of ALGLIB which internally calls Intel MKL to accelerate low-level tasks. We obtained a license from Intel Corp. which allows us to integrate Intel MKL into ALGLIB, so you don't have to buy a separate license from Intel.

1.2 ALGLIB license

ALGLIB Free Edition is distributed under the GPL 2+ license, i.e. GPL version 2 or (at your option) any later version. A copy of the GNU General Public License is available at http://www.fsf.org/licensing/licenses

ALGLIB Commercial Edition is distributed under a license which is friendly to commercial users. A copy of the commercial license can be found at http://www.alglib.net/commercial.php.

1.3 Documentation license

This reference manual is licensed under BSD-like documentation license:

Copyright 1994-2017 Sergey Bochkanov, ALGLIB Project. All rights reserved.

Redistribution and use of this document (ALGLIB Reference Manual) with or without modification, are permitted provided that such redistributions will retain the above copyright notice, this condition and the following disclaimer as the first (or last) lines of this file.

THIS DOCUMENTATION IS PROVIDED BY THE ALGLIB PROJECT "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE ALGLIB PROJECT BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS DOCUMENTATION, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

1.4 Reference Manual and User Guide

ALGLIB Project provides two sources of information: ALGLIB Reference Manual (this document) and ALGLIB User Guide.

ALGLIB Reference Manual contains a full description of all publicly accessible ALGLIB units, accompanied by examples. The Reference Manual is focused on the source code: it documents units, functions, structures and so on. If you want to know what unit YYY can do or what subroutines unit ZZZ contains, the Reference Manual is the place to go. Free software needs free documentation - that's why the ALGLIB Reference Manual is licensed under a BSD-like documentation license.

In addition to the Reference Manual we provide the User Guide. The User Guide is focused on more general questions: how fast is ALGLIB? How reliable is it? What are the strong and weak sides of the algorithms used? We aim to make the ALGLIB User Guide an important source of information both about ALGLIB and about numerical analysis algorithms in general. We want it to be a book about algorithms, not just software documentation. And we want it to be unique - that's why the ALGLIB User Guide is distributed under a less permissive personal-use-only license.

1.5 Acknowledgements

ALGLIB would not have been possible without the contribution of the following open source projects:

We also want to thank the developers at Intel's local development center (Nizhny Novgorod branch) for their help during MKL integration.

2 ALGLIB structure

2.1 Packages

ALGLIB is a C++ interface to the computational core written in C. Both the C library and the C++ wrapper are automatically generated by code generation tools developed within the ALGLIB project. Pre-3.0 versions of ALGLIB included more than 100 units, but it was difficult to work with such a large number of files. Since ALGLIB 3.0 all units are merged into 11 packages (alglibmisc.cpp, dataanalysis.cpp, diffequations.cpp, fasttransforms.cpp, integration.cpp, interpolation.cpp, linalg.cpp, optimization.cpp, solvers.cpp, specialfunctions.cpp and statistics.cpp - see chapter 8 for their contents) and two support units (ap.cpp and alglibinternal.cpp).

One package may rely on other ones, but we have tried to reduce the number of dependencies. Every package relies on ap.cpp and many packages rely on alglibinternal.cpp. Many packages require only these two to work, and many others need significantly fewer than 13 packages. For example, statistics.cpp requires the two packages mentioned above and only one additional package - specialfunctions.cpp.

2.2 Subpackages

There is one more concept to learn - subpackages. Every package was created from several source files. For example (as of ALGLIB 3.0.0), linalg.cpp was created by merging together 14 .cpp files (C++ interface) and 14 .c files (computational core). These files provide different functionality: one of them calculates triangular factorizations, another generates random matrices, and so on. We've merged source code, but what to do with their documentation?

Of course, we could merge their documentation (as we've merged the units) into one big list of functions and data structures, but such a list would be hard to read. Instead, we have decided to merge the source code but leave the documentation separate.

If you look at the list of ALGLIB packages, you will see that each package includes several subpackages. For example, linalg.cpp includes trfac, svd, evd and other subpackages. These subpackages do not exist as separate files, namespaces or other entities. They are just subsets of one large unit which provide significantly different functionality. They have separate documentation sections, but if you want to use the svd subpackage, you have to include linalg.h, not svd.h.

2.3 Open Source and Commercial versions

ALGLIB comes in two versions - an open source (GPL-licensed) one and a commercial (closed source) one. Both versions have the same functionality, i.e. they solve the same set of problems. However, the commercial version differs from the open source one in the following aspects:

This documentation applies to both versions of ALGLIB. Detailed description of commercial version can be found below.

3 Compatibility

3.1 CPU

ALGLIB is compatible with any CPU which:

Most mainstream CPUs (in particular, x86, x86_64, ARM and SPARC) satisfy these requirements.

As for Intel architecture, ALGLIB works with both FPU-based and SSE-based implementations of floating point math. 80-bit internal representation used by Intel FPU is not a problem for ALGLIB.

3.2 OS

ALGLIB for C++ (both open source and commercial versions) can be compiled in OS-agnostic mode (no OS-specific preprocessor definitions), in which case it is compatible with any OS that supports the C++98 standard library. In particular, it will work under any POSIX-compatible OS and under Windows.

If you want to use the multithreaded capabilities of the commercial version of ALGLIB, you should compile it in OS-specific mode by #defining either AE_OS=AE_WINDOWS or AE_OS=AE_POSIX at compile time, depending on the OS being used. The former corresponds to any modern OS from the Windows family (32/64-bit Windows XP and later), while the latter means almost any POSIX-compatible OS. This applies only to the commercial version of ALGLIB. The open source version is always OS-agnostic, even in the presence of OS-specific definitions.

3.3 Compiler

ALGLIB is compatible with any C++ compiler which:

All modern compilers (in particular, reasonably new versions of MSVC, GCC and Sun Studio) satisfy these requirements.

However, some very old compilers (a ten-year-old version of Borland C++ Builder, for example) may emit code which does not correctly work with IEEE special values. If you use one of these old compilers, we recommend you to run the ALGLIB test suite to ensure that the library works.

3.4 Optimization settings

ALGLIB is compatible with any kind of optimizing compiler as long as:

Generally, all kinds of optimization that were marked by compiler vendor as "safe" are possible. For example, ALGLIB can be compiled:

On the other hand, the following "unsafe" optimizations will break ALGLIB:

4 Compiling ALGLIB

4.1 Adding to your project

Adding ALGLIB to your project is easy - just pick the packages you need and... add them to your project! With the most widely used compilers (GCC, MSVC, Sun Studio) it will work without any additional settings. In other cases you will need to define several preprocessor definitions (this topic is detailed below), but everything will still be simple.

By "adding to your project" we mean that you should a) compile .cpp files with the rest of your project, and b) include .h files you need. Do not include .cpp files - these files must be compiled separately, not as part of some larger source file. The only files you should include are .h files, stored in the /src folder of the ALGLIB distribution.
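For instance, the following minimal program is a sketch of what "adding to your project" means in practice; it assumes that only the core ap.h/ap.cpp pair is needed, which is enough for the basic array types:

#include <stdio.h>
#include "ap.h"                     // header of the core 'ap' package

int main()
{
    // create a small vector and print its length - if this compiles,
    // links and runs, ALGLIB sources were added to the project correctly
    alglib::real_1d_array x("[1,2,3]");
    printf("length = %ld\n", (long)x.length());
    return 0;
}

Compile it together with the .cpp files of the packages you use, e.g. something like "g++ -I. yourprogram.cpp ap.cpp -lm" (yourprogram.cpp being your own source file).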

As you see, ALGLIB has no project files or makefiles. Why? There are several reasons:

In any case, compiling ALGLIB is so simple that even without a project file you can do it in several minutes.

4.2 Configuring for your compiler

If you use modern versions of MSVC, GCC or Sun Studio, you don't need to configure ALGLIB at all. But if you use outdated versions of these compilers (or something else), then you may need to tune definitions of several data types:

ALGLIB tries to autodetect your compiler and to define these types in compiler-specific manner:

In most cases, it is enough. But if anything goes wrong, you have several options:

4.3 Improving performance (CPU-specific and OS-specific optimizations)

You can improve the performance of ALGLIB in several ways:

ALGLIB has a two-layered structure: a set of basic performance-critical primitives is implemented using optimized code, and the rest of the library is built on top of these primitives. By default, ALGLIB uses generic C code to implement these primitives (matrix multiplication, decompositions, etc.). This code works everywhere from Intel to SPARC. However, you can tell ALGLIB that it will work under a particular architecture by defining the appropriate macro at the global level:

When the AE_CPU macro is defined and equals AE_INTEL, SSE2 support is enabled. ALGLIB will use the cpuid instruction to determine SSE2 presence at run time and - if SSE2 is present - use SSE2-capable code. ALGLIB uses SSE2 intrinsics which are portable across different compilers and efficient enough for most practical purposes. As of ALGLIB 3.4, SSE2 support is available for MSVC, GCC and Sun Studio users only.

5 Working with commercial version

5.1 Benefits of commercial version

The commercial version of ALGLIB for C++ features four important improvements over the open source one:

5.2 Working with SSE support (Intel/AMD users)

ALGLIB for C++ can utilize SSE2 set of instructions supported by all modern Intel and AMD x86 processors. This feature is optional and must be explicitly turned on during compile-time. If you do not activate it, ALGLIB will use generic C code, without any processor-specific assembly/intrinsics.

Thus, if you turn on this feature, your code will run faster on x86_32 and x86_64 processors, but will be unportable to non-x86 platforms (and the Intel MIC platform, which is not exactly x86 and does not support SSE!). On the other hand, if you do not activate this feature, your code will be portable to almost any modern CPU (SPARC, ARM, ...).

In order to turn on x86-specific optimizations, you should define the AE_CPU=AE_INTEL preprocessor definition at the global level. It tells ALGLIB to use SSE intrinsics supported by GCC, MSVC and Intel compilers. Additionally, you should tell the compiler to generate SSE-capable code. This can be done in the project settings of your IDE or on the command line:


GCC example:
> g++ -msse2 -I. -DAE_CPU=AE_INTEL *.cpp -lm

MSVC example:
> cl /I. /EHsc /DAE_CPU=AE_INTEL *.cpp

5.3 Using multithreading

Commercial version of ALGLIB includes out-of-the-box support for multithreading. Many (not all) computationally intensive problems can be solved in multithreaded mode. You should read comments on specific ALGLIB functions to determine what can be multithreaded and what can not.

ALGLIB does not depend on vendor/compiler support for technologies like OpenMP/MPI/... Under Windows ALGLIB uses OS threads and custom synchronization framework. Under POSIX-compatible OS (Solaris, Linux, FreeBSD, NetBSD, OpenBSD, ...) ALGLIB uses POSIX Threads (standard *nix library which is shipped with any POSIX system) with its threading and synchronization primitives. It gives ALGLIB unprecedented portability across operating systems and compilers. ALGLIB does not depend on presence of any custom multithreading library or compiler support for any multithreading technology.

If you want to use multithreaded capabilities of ALGLIB, you should:

  1. compile it in OS-specific mode (ALGLIB has to know what OS it is running on)
  2. tell ALGLIB the number of worker threads to use
  3. call multithreaded versions of computational functions

Let us explain this in more detail...

1. You should compile ALGLIB in OS-specific mode by #defining either AE_OS=AE_WINDOWS or AE_OS=AE_POSIX at compile time, depending on the OS being used. The former corresponds to any modern OS from the Windows family (32/64-bit Windows XP and later), while the latter means almost any POSIX-compatible OS. When compiling on POSIX, do not forget to link ALGLIB with the libpthread library.

2. ALGLIB automatically determines number of cores on application startup. On Windows it is done using GetSystemInfo() call. On POSIX systems ALGLIB performs sysconf(_SC_NPROCESSORS_ONLN) system call. This system call is supported by all modern POSIX-compatible systems: Solaris, Linux, FreeBSD, NetBSD, OpenBSD.

By default, ALGLIB uses all available cores except for one. Say, on a 4-core system it will use three cores - unless told to use more or fewer. This keeps your system responsive during lengthy computations. Such behavior may be changed with the setnworkers() call:
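A minimal sketch of such a call (setnworkers() is declared in ap.h; the value 4 - "use 4 cores" - follows the demo program from section 5.5.2, other values are described in the comments on setnworkers()):

#include "ap.h"

...

alglib::setnworkers(4);   // use 4 worker threads/cores
alglib::setnworkers(1);   // use just one worker thread, i.e. effectively serial mode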

You may want to specify maximum number of worker threads during compile time by means of preprocessor definition AE_NWORKERS=N. You can add this definition to compiler command line or change corresponding project settings in your IDE. Here N can be any positive number. ALGLIB will use exactly N worker threads, unless being told to use less by setnworkers() call.

Some old POSIX-compatible operating systems do not support sysconf(_SC_NPROCESSORS_ONLN) system call which is required in order to automatically determine number of active cores. On these systems you should specify number of cores manually at compile time. Without it ALGLIB will run in single-threaded mode.

3. When you use commercial edition of ALGLIB, you may choose between serial and multithreaded versions of SMP-capable functions:

You should carefully decide which version of a function to use. Starting/stopping a worker thread costs tens of thousands of CPU cycles. Thus you won't get a multithreading speedup on small computational problems.
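For example, using the rmatrixgemm call from the demo program in section 5.5.2 (a sketch with tiny 2x2 matrices, chosen only to keep it short - on such a small problem the SMP-capable version will still run serially):

#include "linalg.h"

...

alglib::real_2d_array a("[[1,2],[3,4]]"), b("[[1,0],[0,1]]"), c("[[0,0],[0,0]]");

// serial version - always runs in the calling thread
alglib::rmatrixgemm    (2, 2, 2, 1.0, a, 0, 0, 0, b, 0, 0, 0, 0.0, c, 0, 0);

// SMP-capable version - same arguments, "smp_" prefix; worker threads are
// used only when the problem is large enough to pay for the threading overhead
alglib::smp_rmatrixgemm(2, 2, 2, 1.0, a, 0, 0, 0, b, 0, 0, 0, 0.0, c, 0, 0);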

5.3.1 SMT (CMT/hyper-threading) issues

Simultaneous multithreading (SMT), also known as Hyper-threading (Intel) and Cluster-based Multithreading (AMD), is a CPU design where several (usually two) logical cores share the resources of one physical core. Say, on a dual-core system with a 2x HT scale factor you will see 4 logical cores. Each pair of these 4 cores, however, shares the same hardware resources. Thus, you may get only a marginal speedup when running highly optimized software which fully utilizes CPU resources.

Say, if one thread occupies floating-point unit, another thread on the same physical core may work with integer numbers at the same time without any performance penalties. In this case you may get some speedup due to having additional cores. But if both threads keep FPU unit 100% busy, they won't get any multithreaded speedup.

So, if 2 math-intensive threads are dispatched by the OS scheduler to different physical cores, you will get a 2x speedup from multithreading. But if these threads are dispatched to different logical cores - but the same physical core - you won't get any speedup at all! One physical core will be 100% busy, and another one will be 100% idle. On the other hand, if you start four threads instead of two, your system will be 100% utilized independently of thread scheduling details.

Let us stress it one more time - the multithreading speedup on SMT systems is highly dependent on the number of threads you are running and on decisions made by the OS scheduler. It is not 100% deterministic! With "true SMP", when you run 2 threads you get a 2x speedup (or 1.95, or 1.80 - it depends on the algorithm, but this factor is always the same). With SMT, when you run 2 threads you may get your 2x speedup - or no speedup at all. Modern OS schedulers do a good job on single-socket hardware, but even in this "simple" case they give no guarantees of fair distribution of hardware resources. And things become a bit tricky when you work with multi-socket hardware. On SMT systems the only guaranteed way to 100% utilize your CPU is to create as many worker threads as there are logical cores. In this case the OS scheduler has no chance to do its work in the wrong way.

5.4 Linking with Intel MKL

5.4.1 Using lightweight Intel MKL supplied by ALGLIB Project

Commercial edition of ALGLIB includes MKL extensions - special lightweight distribution of Intel MKL, highly optimized numerical library from Intel - and precompiled ALGLIB-MKL interface libraries. Linking your programs with MKL extensions allows you to run ALGLIB with maximum performance.

Current version of ALGLIB features Windows-only MKL extensions, but in future ALGLIB releases we will introduce MKL extensions for Linux systems.

Unlike the rest of the library, MKL extensions are distributed in binary-only form. ALGLIB itself is still distributed in source code form, but Intel MKL and the ALGLIB-MKL interface are distributed as precompiled dynamic/static libraries. We cannot distribute them in source form because of license restrictions associated with Intel MKL. Also due to license restrictions we cannot give you direct access to MKL functionality. You may use MKL to accelerate ALGLIB - without paying for an MKL license - but you may not call its functions directly. It is technically possible, but strictly prohibited by both MKL's EULA and the ALGLIB License Agreement. If you want to work with MKL itself, you should buy a separate license from Intel.

MKL extensions are located in the /cpp/mkl-windows subdirectory of the ALGLIB distribution. This directory includes:

In order to activate MKL extensions you should:

Examples of linking from the command line can be found in section 5.5.

5.4.2 Using your own installation of Intel MKL

If you bought a separate license for Intel MKL, and want to use your own installation of MKL - and not our lightweight distribution - then you should compile ALGLIB as described in the previous section, with all necessary preprocessor definitions. But instead of linking with the mkl4alglib dynamic library, you should add the mkl4alglib.c file from the mkl-interface directory to your project and compile it (as a C file) along with the rest of ALGLIB.

This C file implements interface between MKL and ALGLIB. Having this file in your project and defining AE_MKL preprocessor definition results in ALGLIB using MKL functions.

However, this C file is just an interface! It is your responsibility to make sure that the C/C++ compiler can find MKL headers, and that the appropriate MKL static/dynamic libraries are linked to your application.

If you link ALGLIB with your own installation of Intel MKL, you may do so on any OS where MKL works - Windows or Linux.
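As a purely illustrative sketch (the exact include/library paths and MKL link options depend on your MKL installation; MKLROOT is the environment variable set by Intel's installation scripts, and mkl_rt is MKL's single dynamic runtime library), a GCC command line for this scenario might look roughly like:

> g++ -O3 -I. -I$MKLROOT/include -DAE_CPU=AE_INTEL -DAE_OS=AE_POSIX -DAE_MKL *.cpp mkl-interface/mkl4alglib.c -L$MKLROOT/lib/intel64 -lmkl_rt -lpthread -lm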

5.5 Examples - compiling commercial edition of ALGLIB

5.5.1 Introduction

In this section we'll consider different compilation scenarios for commercial version of ALGLIB - from simple platform-agnostic compilation to linking with MKL extensions.

We assume that you unpacked the ALGLIB distribution into the current directory and saved the demo.cpp file, whose code is given below, there as well. Thus, in the current directory you should have exactly one file (demo.cpp) and exactly one subdirectory (the cpp folder with the ALGLIB distribution).

5.5.2 Compiling under Windows

The file listing below contains a very basic program which uses ALGLIB to perform matrix-matrix multiplication. After that the program evaluates the performance of GEMM (the function being called) and prints the result to the console. We'll show how the performance of this program continually increases as we add more and more sophisticated compiler options.

demo.cpp
#include <stdio.h>
#include <math.h>
#include <windows.h>
#include "linalg.h"

double counter()
{
    return 0.001*GetTickCount();
}

int main()
{
    alglib::real_2d_array a, b, c;
    int n = 2000;
    int i, j;
    double timeneeded, flops;
    
    // Initialize arrays
    a.setlength(n, n);
    b.setlength(n, n);
    c.setlength(n, n);
    for(i=0; i<n; i++)
        for(j=0; j<n; j++)
        {
            a[i][j] = alglib::randomreal()-0.5;
            b[i][j] = alglib::randomreal()-0.5;
            c[i][j] = 0.0;
        }
    
    // Set number of worker threads: "4" means "use 4 cores".
    // This line is ignored if AE_OS is UNDEFINED.
    alglib::setnworkers(4);
    
    // Perform matrix-matrix product.
    // We call function with "smp_" prefix, which means that ALGLIB
    // will try to execute it in parallel manner whenever it is possible.
    flops = 2*pow((double)n, (double)3);
    timeneeded = counter();
    alglib::smp_rmatrixgemm(
        n, n, n,
        1.0,
        a, 0, 0, 0,
        b, 0, 0, 1,
        0.0,
        c, 0, 0);
    timeneeded = counter()-timeneeded;
    
    // Evaluate performance
    printf("Performance is %.1f GFLOPS\n", (double)(1.0E-9*flops/timeneeded));
    
    return 0;
}

The examples below cover Windows compilation from the command line with MSVC. It is very straightforward to adapt them to compilation from the MSVC IDE - or to other compilers. We assume that you have already called the %VCINSTALLDIR%\bin\amd64\vcvars64.bat batch file which loads the 64-bit build environment (or its 32-bit counterpart). We also assume that the current directory is clean before each example is executed (i.e. it has ONLY the demo.cpp file and the cpp folder). We used a 3.2 GHz 4-core CPU for this test.

The first example covers platform-agnostic compilation without optimization settings - the simplest way to compile ALGLIB. However, in platform-agnostic mode ALGLIB is unable to use all the performance-related features present in the commercial edition.

We start by copying all cpp and h files to the current directory, then compile them along with demo.cpp. In this and the following examples we omit compiler output for the sake of simplicity.

OS-agnostic mode, no compiler optimizations
> copy cpp\src\*.* .
> cl /I. /EHsc /Fedemo.exe *.cpp
> demo.exe
Performance is 0.7 GFLOPS

Well, 0.7 GFLOPS is not very impressive for a 3.2GHz CPU... Let's add /Ox to the compiler parameters.

OS-agnostic mode, /Ox optimization
> copy cpp\src\*.* .
> cl /I. /EHsc /Fedemo.exe /Ox *.cpp
> demo.exe
Performance is 0.9 GFLOPS

Still not impressive. Let's turn on optimizations for the x86 architecture: define AE_CPU=AE_INTEL.

OS-agnostic mode, ALGLIB knows it is x86/x64
> copy cpp\src\*.* .
> cl /I. /EHsc /Fedemo.exe /Ox /DAE_CPU=AE_INTEL *.cpp
> demo.exe
Performance is 4.5 GFLOPS

It is good, but we have 4 cores - and only one of them was used. Defining AE_OS=AE_WINDOWS allows ALGLIB to use Windows threads to parallelize execution of some functions.

ALGLIB knows it is Windows on x86/x64 CPU
> copy cpp\src\*.* .
> cl /I. /EHsc /Fedemo.exe /Ox /DAE_CPU=AE_INTEL /DAE_OS=AE_WINDOWS *.cpp
> demo.exe
Performance is 16.0 GFLOPS

Not bad. And now we are ready for the final test - linking with MKL extensions.

Linking with MKL extensions differs a bit from the standard way of linking with ALGLIB. ALGLIB itself is compiled with one more preprocessor definition: we define the AE_MKL symbol. We also link ALGLIB with the appropriate (32-bit or 64-bit) mkl4alglib static library, which is an import library for the special lightweight MKL distribution shipped with ALGLIB at no additional cost. We should also copy to the current directory the appropriate mkl4alglib DLL file which contains Intel MKL.

Linking with MKL extensions
> copy cpp\src\*.* .
> copy cpp\mkl-windows\mkl4alglib_64.lib .
> copy cpp\mkl-windows\mkl4alglib_64.dll .
> cl /I. /EHsc /Fedemo.exe /Ox /DAE_CPU=AE_INTEL /DAE_OS=AE_WINDOWS /DAE_MKL *.cpp mkl4alglib_64.lib
> demo.exe
Performance is 33.1 GFLOPS

From 0.7 GFLOPS to 33.1 GFLOPS - you may see that commercial version of ALGLIB is really worth it!

6 Using ALGLIB

6.1 Thread-safety

Both open source and commercial versions of ALGLIB are 100% thread-safe as long as different user threads work with different instances of objects/arrays. Thread-safety is guaranteed by having no global shared variables.

However, any kind of sharing ALGLIB objects/arrays between different threads is potentially hazardous. Even when this object is seemingly used in read-only mode!

Say, you use ALGLIB neural network NET to process two input vectors X0 and X1, and get two output vectors Y0 and Y1. You may decide that neural network is used in read-only mode which does not change state of NET, because output is written to distinct arrays Y. Thus, you may want to process these vectors from parallel threads.

But it is not a read-only operation, even if it looks like one! The neural network object NET allocates internal temporary buffers which are modified by the neural processing functions. Thus, sharing one instance of a neural network between two threads is thread-unsafe!
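As an illustration, here is a minimal sketch (using C++11 std::thread, and assuming the mlpcreate1()/mlpprocess() functions of the mlpbase subpackage, dataanalysis.h) which stays thread-safe by giving every thread its own private copy of the network:

#include <thread>
#include "dataanalysis.h"

// each thread receives its OWN copy of the network (passed by value) plus
// its own input/output arrays, so no ALGLIB object is shared between threads
static void worker(alglib::multilayerperceptron net,
                   alglib::real_1d_array x,
                   alglib::real_1d_array &y)
{
    alglib::mlpprocess(net, x, y);   // modifies only thread-local buffers
}

int main()
{
    alglib::multilayerperceptron net;
    alglib::mlpcreate1(2, 5, 1, net);     // 2 inputs, 5 hidden neurons, 1 output

    alglib::real_1d_array x0("[0.1,0.2]"), x1("[0.3,0.4]"), y0, y1;

    // WRONG (thread-unsafe): letting both threads call mlpprocess() on one
    // shared 'net' instance. RIGHT: pass a copy of the network to each thread.
    std::thread t0(worker, net, x0, std::ref(y0));
    std::thread t1(worker, net, x1, std::ref(y1));
    t0.join();
    t1.join();
    return 0;
}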

6.2 Global definitions

ALGLIB defines several conditional symbols (all start with "AE_" which means "ALGLIB environment") and two namespaces: alglib_impl (contains computational core) and alglib (contains C++ interface).

Although this manual mentions both the alglib_impl and alglib namespaces, only the alglib namespace should be used by you. It contains a user-friendly C++ interface with automatic memory management, exception handling and all the other nice features. alglib_impl is less user-friendly, is less documented, and it is too easy to crash your system or cause a memory leak if you use it directly.

6.3 Datatypes

ALGLIB (ap.h header) defines several "basic" datatypes (types which are used by all packages) and many package-specific datatypes. "Basic" datatypes are:

Package-specific datatypes are classes which can be divided into two distinct groups:
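The full lists are given in ap.h. As a quick, hedged illustration of several of the "basic" datatypes (all of them live in the alglib namespace; the z.x/z.y fields of alglib::complex are assumed here):

#include <stdio.h>
#include "ap.h"

int main()
{
    alglib::ae_int_t n = 4;                      // portable signed integer type
    alglib::complex  z(1.0, 2.0);                // complex number 1+2i (fields z.x, z.y)
    alglib::real_1d_array    v("[1,2,3,4]");     // 1-dimensional real array
    alglib::integer_2d_array m("[[1,2],[3,4]]"); // 2-dimensional integer array
    alglib::boolean_1d_array b("[true,false]");  // 1-dimensional boolean array

    printf("%ld %ld %ld %d\n", (long)n, (long)v.length(), (long)m.rows(), (int)b[0]);
    printf("%.1f%+.1fi\n", z.x, z.y);
    return 0;
}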

6.4 Constants

The most important constants (defined in the ap.h header) from ALGLIB namespace are:

6.5 Functions

The most important "basic" functions from ALGLIB namespace (ap.h header) are:
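The lists themselves are given in ap.h. As a hedged illustration of a few constants (from section 6.4 above) and functions (machineepsilon, maxrealnumber and minrealnumber are assumed here to be the floating-point limits exposed by ALGLIB; randomreal() and randominteger() are basic random number helpers):

#include <stdio.h>
#include "ap.h"

int main()
{
    // constants from the alglib namespace
    printf("machine epsilon : %.3e\n", alglib::machineepsilon);
    printf("max real number : %.3e\n", alglib::maxrealnumber);
    printf("min real number : %.3e\n", alglib::minrealnumber);

    // basic functions: uniform random real in [0,1), random integer in [0,10)
    double           r = alglib::randomreal();
    alglib::ae_int_t k = alglib::randominteger(10);
    printf("random draws    : %.3f, %ld\n", r, (long)k);
    return 0;
}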

6.6 Working with vectors and matrices

ALGLIB (ap.h header) supports matrices and vectors (one-dimensional and two-dimensional arrays) of variable size, with indexing starting from zero.

Everything starts with array creation. You should distinguish between the creation of the array class instance and the memory allocation for the array. When creating the class instance, you can use the constructor without any parameters, which creates an empty array without any elements. An attempt to address such elements may cause a program failure.

You can use copy and assignment constructors that copy one array into another. If, during the copy operation, the source array has no memory allocated for its elements, the destination array will contain no elements either. If the source array has memory allocated for its elements, the destination array will allocate the same amount of memory and copy the elements there. That is, the copy operation yields two independent arrays with identical contents.

You can also create an array from a formatted string like "[]", "[true,FALSE,tRUe]", "[[]]" or "[[1,2],[3.2,4],[5.2]]" (note: '.' is used as the decimal point independently of the locale settings).

alglib::boolean_1d_array b1;
b1 = "[true]";

alglib::real_2d_array r2("[[2,3],[3,4]]");
alglib::real_2d_array r2_1("[[]]");
alglib::real_2d_array r2_2(r2);
r2_1 = r2;

alglib::complex_1d_array c2;
c2 = "[]";
c2 = "[0]";
c2 = "[1,2i]";
c2 = "[+1-2i,-1+5i]";
c2 = "[ 4i-2,  8i+2]";
c2 = "[+4i-2, +8i+2]";
c2 = "[-4i-2, -8i+2]";

After an empty array has been created, you can allocate memory for its elements using the setlength() method. The content of the created array elements is not defined. If the setlength method is called for an array which already has memory allocated, then after the resize the newly allocated elements are undefined and the old content is destroyed.

alglib::boolean_1d_array b1;
b1.setlength(2);

alglib::integer_2d_array r2;
r2.setlength(4,3);

Another way to initialize an array is to call the setcontent() method. This method accepts a pointer to data which is copied into the newly allocated array. Vectors are stored in contiguous order, matrices are stored row by row.

alglib::real_1d_array r1;
double _r1[] = {2, 3};
r1.setcontent(2,_r1);

alglib::real_2d_array r2;
double _r2[] = {11, 12, 13, 21, 22, 23};
r2.setcontent(2,3,_r2);

To access the array elements, an overloaded operator() or operator[] can be used. That is, the code addressing the element of array a with indexes [i,j] can look like a(i,j) or a[i][j].

alglib::integer_1d_array a("[1,2,3]");
alglib::integer_1d_array b("[3,9,27]");
a[0] = b(0);

alglib::integer_2d_array c("[[1,2,3],[9,9,9]]");
alglib::integer_2d_array d("[[3,9,27],[8,8,8]]");
d[1][1] = c(0,0);

You can access the contents of a 1-dimensional array by calling the getcontent() method which returns a pointer to the array memory. For historical reasons 2-dimensional arrays do not provide a getcontent() method, but you can create a reference to any element of the array. 2-dimensional arrays store data in row-major order with aligned rows (i.e. generally the distance between rows is not equal to the number of columns). You can get the stride (distance between consecutive elements in different rows) with the getstride() call.

alglib::integer_1d_array a("[1,2]");
alglib::real_2d_array b("[[0,1],[10,11]]");

alglib::ae_int_t *a_row = a.getcontent();

// all three pointers point to the same location
double *b_row0 = &b[0][0];
double *b_row0_2 = &b(0,0);
double *b_row0_3 = b[0];

// advancing to the next row of 2-dimensional array
double *b_row1 = b_row0 + b.getstride();

Finally, you can get array size with length(), rows() or cols() methods:

alglib::integer_1d_array a("[1,2]");
alglib::real_2d_array b("[[0,1],[10,11]]");

printf("%ld\n", (long)a.length());
printf("%ld\n", (long)b.rows());
printf("%ld\n", (long)b.cols());

6.7 Using functions: 'expert' and 'friendly' interfaces

Most ALGLIB functions provide two interfaces: 'expert' and 'friendly'. What is the difference between the two? When you use the 'friendly' interface, ALGLIB:

When you use 'expert' interface, ALGLIB:

Here are several examples of 'friendly' and 'expert' interfaces:

#include "interpolation.h"

...

alglib::real_1d_array    x("[0,1,2,3]");
alglib::real_1d_array    y("[1,5,3,9]");
alglib::real_1d_array   y2("[1,5,3,9,0]");
alglib::spline1dinterpolant s;

alglib::spline1dbuildlinear(x, y, 4, s);  // 'expert' interface is used
alglib::spline1dbuildlinear(x, y, s);     // 'friendly' interface - input size is
                                          // automatically determined

alglib::spline1dbuildlinear(x, y2, 4, s); // y2.length() is 5, but it will work

alglib::spline1dbuildlinear(x, y2, s);    // it won't work because sizes of x and y2
                                          // are inconsistent

'Friendly' interface - matrix semantics:

#include "linalg.h"

...

alglib::real_2d_array a;
alglib::matinvreport  rep;
alglib::ae_int_t      info;

// 
// 'Friendly' interface: spdmatrixinverse() accepts and returns symmetric matrix
// 

// symmetric positive definite matrix
a = "[[2,1],[1,2]]";

// after this line A will contain [[0.66,-0.33],[-0.33,0.66]]
// which is symmetric too
alglib::spdmatrixinverse(a, info, rep); 

// you may try to pass nonsymmetric matrix
a = "[[2,1],[0,2]]";

// but exception will be thrown in such case
alglib::spdmatrixinverse(a, info, rep); 

Same function but with 'expert' interface:

#include "linalg.h"

...

alglib::real_2d_array a;
alglib::matinvreport  rep;
alglib::ae_int_t      info;

// 
// 'Expert' interface, spdmatrixinverse()
// 

// only upper triangle is used; a[1][0] is initialized by NAN,
// but it can be arbitrary number
a = "[[2,1],[NAN,2]]";

// after this line A will contain [[0.66,-0.33],[NAN,0.66]]
// only upper triangle is modified
alglib::spdmatrixinverse(a, 2 /* N */, true /* upper triangle is used */, info, rep); 

6.8 Handling errors

ALGLIB uses two error handling strategies:

What is actually done depends on function being used and error being reported:

  1. if the function returns an error code and has a corresponding value for this kind of error, ALGLIB returns the error code
  2. if the function does not return an error code (or returns one, but there is no code for the error being reported), ALGLIB throws an alglib::ap_error exception. The exception object has a msg field which contains a short description of the error.

To make things clear we consider several examples of error handling.

Example 1. The mincgcreate function creates a nonlinear CG optimizer. It accepts the problem size N and the initial point X. Several things can go wrong - you may pass an array which is too short, filled by NANs, or otherwise pass incorrect data. However, this function returns no error code - so it throws an exception in case something goes wrong. There is no other way to tell the caller that something went wrong.

Example 2. The rmatrixinverse function calculates the inverse matrix. It returns an error code, which is set to +1 when the problem is solved and to -3 if a singular matrix was passed to the function. However, there is no error code for a matrix which is non-square or contains infinities. Well, we could have created corresponding error codes - but we didn't. So if you pass a singular matrix to rmatrixinverse, you will get completion code -3. But if you pass a matrix which contains INF in one of its elements, alglib::ap_error will be thrown.

The first error handling strategy (error codes) is used to report "frequent" errors, which can occur during normal execution of a user program. The second error handling strategy (exceptions) is used to report "rare" errors which are the result of serious flaws in your program (or in ALGLIB) - infinities/NANs in the inputs, inconsistent inputs, etc.
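A minimal sketch combining both strategies (the rmatrixinverse() call shown is the 'friendly' one from the matinv subpackage; the singular test matrix is chosen so that the -3 completion code mentioned above is returned):

#include <stdio.h>
#include "linalg.h"

int main()
{
    alglib::real_2d_array a;
    alglib::ae_int_t      info;
    alglib::matinvreport  rep;

    try
    {
        a = "[[1,2],[2,4]]";                  // singular matrix
        alglib::rmatrixinverse(a, info, rep);
        if( info<=0 )
            printf("inversion failed, completion code %ld\n", (long)info);
    }
    catch(alglib::ap_error &e)
    {
        // thrown for "rare" errors: non-square matrix, INF/NAN in inputs, ...
        printf("exception: %s\n", e.msg.c_str());
    }
    return 0;
}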

6.9 Working with Level 1 BLAS functions

ALGLIB (ap.h header) includes following Level 1 BLAS functions:

Each Level 1 BLAS function accepts input stride and output stride, which are expected to be positive. Input and output vectors should not overlap. Functions operating with complex vectors accept additional parameter conj_src, which specifies whether input vector is conjugated or not.

For each real/complex function there exists "simple" companion which accepts no stride or conjugation modifier. "Simple" function assumes that input/output stride is +1, and no input conjugation is required.

alglib::real_1d_array    rvec("[0,1,2,3]");
alglib::real_2d_array    rmat("[[1,2],[3,4]]");
alglib::complex_1d_array cvec("[0+1i,1+2i,2-1i,3-2i]");
alglib::complex_2d_array cmat("[[3i,1],[9,2i]]");

alglib::vmove(&rvec[0],  1, &rmat[0][0], rmat.getstride(), 2); // now rvec is [1,3,2,3]

alglib::vmove(&cvec[0],  1, &cmat[0][0], cmat.getstride(), "No conj", 2); // now cvec is [3i, 9, 2-1i, 3-2i]
alglib::vmove(&cvec[2],  1, &cmat[0][0], 1,                "Conj", 2);    // now cvec is [3i, 9, -3i,  1]

Here is full list of Level 1 BLAS functions implemented in ALGLIB:

double vdotproduct(
    const double *v0,
     ae_int_t stride0,
     const double *v1,
     ae_int_t stride1,
     ae_int_t n);
double vdotproduct(
    const double *v1,
     const double *v2,
     ae_int_t N);

alglib::complex vdotproduct(
    const alglib::complex *v0,
     ae_int_t stride0,
     const char *conj0,
     const alglib::complex *v1,
     ae_int_t stride1,
     const char *conj1,
     ae_int_t n);
alglib::complex vdotproduct(
    const alglib::complex *v1,
     const alglib::complex *v2,
     ae_int_t N);

void vmove(
    double *vdst,
      ae_int_t stride_dst,
     const double* vsrc,
      ae_int_t stride_src,
     ae_int_t n);
void vmove(
    double *vdst,
     const double* vsrc,
     ae_int_t N);

void vmove(
    alglib::complex *vdst,
     ae_int_t stride_dst,
     const alglib::complex* vsrc,
     ae_int_t stride_src,
     const char *conj_src,
     ae_int_t n);
void vmove(
    alglib::complex *vdst,
     const alglib::complex* vsrc,
     ae_int_t N);

void vmoveneg(
    double *vdst,
      ae_int_t stride_dst,
     const double* vsrc,
      ae_int_t stride_src,
     ae_int_t n);
void vmoveneg(
    double *vdst,
     const double *vsrc,
     ae_int_t N);

void vmoveneg(
    alglib::complex *vdst,
     ae_int_t stride_dst,
     const alglib::complex* vsrc,
     ae_int_t stride_src,
     const char *conj_src,
     ae_int_t n);
void vmoveneg(
    alglib::complex *vdst,
     const alglib::complex *vsrc,
     ae_int_t N);

void vmove(
    double *vdst,
      ae_int_t stride_dst,
     const double* vsrc,
      ae_int_t stride_src,
     ae_int_t n,
     double alpha);
void vmove(
    double *vdst,
     const double *vsrc,
     ae_int_t N,
     double alpha);

void vmove(
    alglib::complex *vdst,
     ae_int_t stride_dst,
     const alglib::complex* vsrc,
     ae_int_t stride_src,
     const char *conj_src,
     ae_int_t n,
     double alpha);
void vmove(
    alglib::complex *vdst,
     const alglib::complex *vsrc,
     ae_int_t N,
     double alpha);

void vmove(
    alglib::complex *vdst,
     ae_int_t stride_dst,
     const alglib::complex* vsrc,
     ae_int_t stride_src,
     const char *conj_src,
     ae_int_t n,
     alglib::complex alpha);
void vmove(
    alglib::complex *vdst,
     const alglib::complex *vsrc,
     ae_int_t N,
     alglib::complex alpha);

void vadd(
    double *vdst,
      ae_int_t stride_dst,
     const double *vsrc,
      ae_int_t stride_src,
     ae_int_t n);
void vadd(
    double *vdst,
     const double *vsrc,
     ae_int_t N);

void vadd(
    alglib::complex *vdst,
     ae_int_t stride_dst,
     const alglib::complex *vsrc,
     ae_int_t stride_src,
     const char *conj_src,
     ae_int_t n);
void vadd(
    alglib::complex *vdst,
     const alglib::complex *vsrc,
     ae_int_t N);

void vadd(
    double *vdst,
      ae_int_t stride_dst,
     const double *vsrc,
      ae_int_t stride_src,
     ae_int_t n,
     double alpha);
void vadd(
    double *vdst,
     const double *vsrc,
     ae_int_t N,
     double alpha);

void vadd(
    alglib::complex *vdst,
     ae_int_t stride_dst,
     const alglib::complex *vsrc,
     ae_int_t stride_src,
     const char *conj_src,
     ae_int_t n,
     double alpha);
void vadd(
    alglib::complex *vdst,
     const alglib::complex *vsrc,
     ae_int_t N,
     double alpha);

void vadd(
    alglib::complex *vdst,
     ae_int_t stride_dst,
     const alglib::complex *vsrc,
     ae_int_t stride_src,
     const char *conj_src,
     ae_int_t n,
     alglib::complex alpha);
void vadd(
    alglib::complex *vdst,
     const alglib::complex *vsrc,
     ae_int_t N,
     alglib::complex alpha);

void vsub(
    double *vdst,
      ae_int_t stride_dst,
     const double *vsrc,
      ae_int_t stride_src,
     ae_int_t n);
void vsub(
    double *vdst,
     const double *vsrc,
     ae_int_t N);

void vsub(
    alglib::complex *vdst,
     ae_int_t stride_dst,
     const alglib::complex *vsrc,
     ae_int_t stride_src,
     const char *conj_src,
     ae_int_t n);
void vsub(
    alglib::complex *vdst,
     const alglib::complex *vsrc,
     ae_int_t N);

void vsub(
    double *vdst,
      ae_int_t stride_dst,
     const double *vsrc,
      ae_int_t stride_src,
     ae_int_t n,
     double alpha);
void vsub(
    double *vdst,
     const double *vsrc,
     ae_int_t N,
     double alpha);

void vsub(
    alglib::complex *vdst,
     ae_int_t stride_dst,
     const alglib::complex *vsrc,
     ae_int_t stride_src,
     const char *conj_src,
     ae_int_t n,
     double alpha);
void vsub(
    alglib::complex *vdst,
     const alglib::complex *vsrc,
     ae_int_t N,
     double alpha);

void vsub(
    alglib::complex *vdst,
     ae_int_t stride_dst,
     const alglib::complex *vsrc,
     ae_int_t stride_src,
     const char *conj_src,
     ae_int_t n,
     alglib::complex alpha);
void vsub(
    alglib::complex *vdst,
     const alglib::complex *vsrc,
     ae_int_t N,
     alglib::complex alpha);

void vmul(
    double *vdst,
      ae_int_t stride_dst,
     ae_int_t n,
     double alpha);
void vmul(
    double *vdst,
     ae_int_t N,
     double alpha);

void vmul(
    alglib::complex *vdst,
     ae_int_t stride_dst,
     ae_int_t n,
     double alpha);
void vmul(
    alglib::complex *vdst,
     ae_int_t N,
     double alpha);

void vmul(
    alglib::complex *vdst,
     ae_int_t stride_dst,
     ae_int_t n,
     alglib::complex alpha);
void vmul(
    alglib::complex *vdst,
     ae_int_t N,
     alglib::complex alpha);

6.10 Reading data from CSV files

ALGLIB (ap.h header) has the alglib::read_csv() function which allows you to read data from a CSV file. The entire file is loaded into memory as a double precision 2D array (an alglib::real_2d_array object). This function provides the following features:

See comments on alglib::read_csv() function for more information about its functionality.
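A minimal sketch of such a call (mydata.csv is a hypothetical file name; the exact parameter list - file name, separator character, flags, output array - and the available flag values should be checked against the comments on read_csv() in ap.h):

#include <stdio.h>
#include "ap.h"

int main()
{
    alglib::real_2d_array data;

    // read a comma-separated file into a 2D array; 0 selects default parsing
    alglib::read_csv("mydata.csv", ',', 0, data);

    printf("%ld rows x %ld cols\n", (long)data.rows(), (long)data.cols());
    return 0;
}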

7 Advanced topics

7.1 Testing ALGLIB

There are two test suites in ALGLIB: computational tests and interface tests. Computational tests are located in /tests/test_c.cpp. They are focused on numerical properties of algorithms, stress testing and "deep" tests (large automatically generated problems). They require significant amount of time to finish (tens of minutes).

Interface tests are located in /tests/test_i.cpp. These tests are focused on ability to correctly pass data between computational core and caller, ability to detect simple problems in inputs, and on ability to at least compile ALGLIB with your compiler. They are very fast (about a minute to finish including compilation time).

Running test suite is easy - just

  1. compile one of these files (test_c.cpp or test_i.cpp) along with the rest of the library
  2. launch executable you will get. It may take from several seconds (interface tests) to several minutes (computational tests) to get final results

If you want to be sure that ALGLIB will work with some sophisticated optimization settings, set the corresponding flags at compile time. If your compiler/system is not in the list of supported ones, we recommend you to run both test suites. But if you are running out of time, run at least test_i.cpp.
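For example, under Windows the interface tests could be compiled and run roughly like the other examples of section 5.5.2 (a sketch which assumes the distribution layout described above, i.e. sources in cpp\src and tests in cpp\tests):

> copy cpp\src\*.* .
> copy cpp\tests\test_i.cpp .
> cl /I. /EHsc /Fetest_i.exe /Ox *.cpp
> test_i.exe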

8 ALGLIB packages and subpackages

8.1 AlglibMisc package

hqrnd High quality random numbers generator
nearestneighbor Nearest neighbor search: approximate and exact
xdebug Debug functions to test ALGLIB interface generator
 

8.2 DataAnalysis package

bdss Basic dataset functions
clustering Clustering functions (hierarchical, k-means, k-means++)
datacomp Backward compatibility functions
dforest Decision forest classifier (regression model)
filters Different filters used in data analysis
lda Linear discriminant analysis
linreg Linear models
logit Logit models
mcpd Markov Chains for Population/proportional Data
mlpbase Basic functions for neural networks
mlpe Basic functions for neural ensemble models
mlptrain Neural network training
pca Principal component analysis
 

8.3 DiffEquations package

odesolver Ordinary differential equation solver
 

8.4 FastTransforms package

conv Fast real/complex convolution
corr Fast real/complex cross-correlation
fft Real/complex FFT
fht Real Fast Hartley Transform
 

8.5 Integration package

autogk Adaptive 1-dimensional integration
gkq Gauss-Kronrod quadrature generator
gq Gaussian quadrature generator
 

8.6 Interpolation package

idwint Inverse distance weighting: interpolation/fitting
lsfit Fitting with least squares target function (linear and nonlinear least-squares)
nsfit Fitting with non-least-squares target functions (ones involving min/max operations, etc)
parametric Parametric curves
polint Polynomial interpolation/fitting
ratint Rational interpolation/fitting
rbf Scattered N-dimensional interpolation with RBF models
spline1d 1D spline interpolation/fitting
spline2d 2D spline interpolation
spline3d 3D spline interpolation
 

8.7 LinAlg package

ablas Level 2 and Level 3 BLAS operations
bdsvd Bidiagonal SVD
evd Direct and iterative eigensolvers
inverseupdate Sherman-Morrison update of the inverse matrix
matdet Determinant calculation
matgen Random matrix generation
matinv Matrix inverse
normestimator Estimates norm of the sparse matrix (from below)
ortfac Real/complex QR/LQ, bi(tri)diagonal, Hessenberg decompositions
rcond Condition number estimate
schur Schur decomposition
sparse Sparse matrices
spdgevd Generalized symmetric eigensolver
svd Singular value decomposition
trfac LU and Cholesky decompositions (dense and sparse)
 

8.8 Optimization package

minbc Box constrained optimizer with fast activation of multiple constraints per step
minbleic Bound constrained optimizer with additional linear equality/inequality constraints
mincg Conjugate gradient optimizer
mincomp Backward compatibility functions
minlbfgs Limited memory BFGS optimizer
minlm Improved Levenberg-Marquardt optimizer
minnlc Nonlinearly constrained optimizer
minns Nonsmooth constrained optimizer
minqp Quadratic programming with bound and linear equality/inequality constraints
 

8.9 Solvers package

densesolver Dense linear system solver
lincg Sparse linear CG solver
linlsqr Sparse linear LSQR solver
nleq Solvers for nonlinear equations
polynomialsolver Polynomial solver
 

8.10 SpecialFunctions package

airyf Airy functions
bessel Bessel functions
betaf Beta function
binomialdistr Binomial distribution
chebyshev Chebyshev polynomials
chisquaredistr Chi-Square distribution
dawson Dawson integral
elliptic Elliptic integrals
expintegrals Exponential integrals
fdistr F-distribution
fresnel Fresnel integrals
gammafunc Gamma function
hermite Hermite polynomials
ibetaf Incomplete beta function
igammaf Incomplete gamma function
jacobianelliptic Jacobian elliptic functions
laguerre Laguerre polynomials
legendre Legendre polynomials
normaldistr Normal distribution
poissondistr Poisson distribution
psif Psi function
studenttdistr Student's t-distribution
trigintegrals Trigonometric integrals
 

8.11 Statistics package

basestat Mean, variance, covariance, correlation, etc.
correlationtests Hypothesis testing: correlation tests
jarquebera Hypothesis testing: Jarque-Bera test
mannwhitneyu Hypothesis testing: Mann-Whitney-U test
stest Hypothesis testing: sign test
studentttests Hypothesis testing: Student's t-test
variancetests Hypothesis testing: F-test and one-sample variance test
wsr Hypothesis testing: Wilcoxon signed rank test
 
cmatrixcopy
cmatrixgemm
cmatrixherk
cmatrixlefttrsm
cmatrixmv
cmatrixrank1
cmatrixrighttrsm
cmatrixsyrk
cmatrixtranspose
rmatrixcopy
rmatrixenforcesymmetricity
rmatrixgemm
rmatrixlefttrsm
rmatrixmv
rmatrixrank1
rmatrixrighttrsm
rmatrixsyrk
rmatrixtranspose
ablas_d_gemm Matrix multiplication (single-threaded)
ablas_d_syrk Symmetric rank-K update (single-threaded)
ablas_smp_gemm Matrix multiplication (multithreaded)
ablas_smp_syrk Symmetric rank-K update (multithreaded)
/************************************************************************* Copy Input parameters: M - number of rows N - number of columns A - source matrix, MxN submatrix is copied and transposed IA - submatrix offset (row index) JA - submatrix offset (column index) B - destination matrix, must be large enough to store result IB - submatrix offset (row index) JB - submatrix offset (column index) *************************************************************************/
void alglib::cmatrixcopy( ae_int_t m, ae_int_t n, complex_2d_array a, ae_int_t ia, ae_int_t ja, complex_2d_array& b, ae_int_t ib, ae_int_t jb);
/************************************************************************* This subroutine calculates C = alpha*op1(A)*op2(B) +beta*C where: * C is MxN general matrix * op1(A) is MxK matrix * op2(B) is KxN matrix * "op" may be identity transformation, transposition, conjugate transposition Additional info: * cache-oblivious algorithm is used. * multiplication result replaces C. If Beta=0, C elements are not used in calculations (not multiplied by zero - just not referenced) * if Alpha=0, A is not used (not multiplied by zero - just not referenced) * if both Beta and Alpha are zero, C is filled by zeros. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. Because starting/stopping worker thread always ! involves some overhead, parallelism starts to be profitable for N's ! larger than 128. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. IMPORTANT: This function does NOT preallocate output matrix C, it MUST be preallocated by caller prior to calling this function. In case C does not have enough space to store result, exception will be generated. INPUT PARAMETERS M - matrix size, M>0 N - matrix size, N>0 K - matrix size, K>0 Alpha - coefficient A - matrix IA - submatrix offset JA - submatrix offset OpTypeA - transformation type: * 0 - no transformation * 1 - transposition * 2 - conjugate transposition B - matrix IB - submatrix offset JB - submatrix offset OpTypeB - transformation type: * 0 - no transformation * 1 - transposition * 2 - conjugate transposition Beta - coefficient C - matrix (PREALLOCATED, large enough to store result) IC - submatrix offset JC - submatrix offset -- ALGLIB routine -- 16.12.2009 Bochkanov Sergey *************************************************************************/
void alglib::cmatrixgemm( ae_int_t m, ae_int_t n, ae_int_t k, alglib::complex alpha, complex_2d_array a, ae_int_t ia, ae_int_t ja, ae_int_t optypea, complex_2d_array b, ae_int_t ib, ae_int_t jb, ae_int_t optypeb, alglib::complex beta, complex_2d_array& c, ae_int_t ic, ae_int_t jc);
void alglib::smp_cmatrixgemm( ae_int_t m, ae_int_t n, ae_int_t k, alglib::complex alpha, complex_2d_array a, ae_int_t ia, ae_int_t ja, ae_int_t optypea, complex_2d_array b, ae_int_t ib, ae_int_t jb, ae_int_t optypeb, alglib::complex beta, complex_2d_array& c, ae_int_t ic, ae_int_t jc);

Examples:   [1]  [2]  

/************************************************************************* This subroutine calculates C=alpha*A*A^H+beta*C or C=alpha*A^H*A+beta*C where: * C is NxN Hermitian matrix given by its upper/lower triangle * A is NxK matrix when A*A^H is calculated, KxN matrix otherwise Additional info: * cache-oblivious algorithm is used. * multiplication result replaces C. If Beta=0, C elements are not used in calculations (not multiplied by zero - just not referenced) * if Alpha=0, A is not used (not multiplied by zero - just not referenced) * if both Beta and Alpha are zero, C is filled by zeros. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. Because starting/stopping worker thread always ! involves some overhead, parallelism starts to be profitable for N's ! larger than 128. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS N - matrix size, N>=0 K - matrix size, K>=0 Alpha - coefficient A - matrix IA - submatrix offset (row index) JA - submatrix offset (column index) OpTypeA - multiplication type: * 0 - A*A^H is calculated * 2 - A^H*A is calculated Beta - coefficient C - preallocated input/output matrix IC - submatrix offset (row index) JC - submatrix offset (column index) IsUpper - whether upper or lower triangle of C is updated; this function updates only one half of C, leaving other half unchanged (not referenced at all). -- ALGLIB routine -- 16.12.2009 Bochkanov Sergey *************************************************************************/
void alglib::cmatrixherk( ae_int_t n, ae_int_t k, double alpha, complex_2d_array a, ae_int_t ia, ae_int_t ja, ae_int_t optypea, double beta, complex_2d_array& c, ae_int_t ic, ae_int_t jc, bool isupper);
void alglib::smp_cmatrixherk( ae_int_t n, ae_int_t k, double alpha, complex_2d_array a, ae_int_t ia, ae_int_t ja, ae_int_t optypea, double beta, complex_2d_array& c, ae_int_t ic, ae_int_t jc, bool isupper);

Examples:   [1]  [2]  

/************************************************************************* This subroutine calculates op(A^-1)*X where: * X is MxN general matrix * A is MxM upper/lower triangular/unitriangular matrix * "op" may be identity transformation, transposition, conjugate transposition Multiplication result replaces X. Cache-oblivious algorithm is used. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. Because starting/stopping worker thread always ! involves some overhead, parallelism starts to be profitable for N's ! larger than 128. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS N - matrix size, N>=0 M - matrix size, N>=0 A - matrix, actial matrix is stored in A[I1:I1+M-1,J1:J1+M-1] I1 - submatrix offset J1 - submatrix offset IsUpper - whether matrix is upper triangular IsUnit - whether matrix is unitriangular OpType - transformation type: * 0 - no transformation * 1 - transposition * 2 - conjugate transposition X - matrix, actial matrix is stored in X[I2:I2+M-1,J2:J2+N-1] I2 - submatrix offset J2 - submatrix offset -- ALGLIB routine -- 15.12.2009 Bochkanov Sergey *************************************************************************/
void alglib::cmatrixlefttrsm( ae_int_t m, ae_int_t n, complex_2d_array a, ae_int_t i1, ae_int_t j1, bool isupper, bool isunit, ae_int_t optype, complex_2d_array& x, ae_int_t i2, ae_int_t j2); void alglib::smp_cmatrixlefttrsm( ae_int_t m, ae_int_t n, complex_2d_array a, ae_int_t i1, ae_int_t j1, bool isupper, bool isunit, ae_int_t optype, complex_2d_array& x, ae_int_t i2, ae_int_t j2);
/************************************************************************* Matrix-vector product: y := op(A)*x INPUT PARAMETERS: M - number of rows of op(A) M>=0 N - number of columns of op(A) N>=0 A - target matrix IA - submatrix offset (row index) JA - submatrix offset (column index) OpA - operation type: * OpA=0 => op(A) = A * OpA=1 => op(A) = A^T * OpA=2 => op(A) = A^H X - input vector IX - subvector offset IY - subvector offset Y - preallocated vector, must be large enough to store result OUTPUT PARAMETERS: Y - vector which stores result if M=0, then subroutine does nothing. if N=0, Y is filled by zeros. -- ALGLIB routine -- 28.01.2010 Bochkanov Sergey *************************************************************************/
void alglib::cmatrixmv( ae_int_t m, ae_int_t n, complex_2d_array a, ae_int_t ia, ae_int_t ja, ae_int_t opa, complex_1d_array x, ae_int_t ix, complex_1d_array& y, ae_int_t iy);
/************************************************************************* Rank-1 correction: A := A + u*v' INPUT PARAMETERS: M - number of rows N - number of columns A - target matrix, MxN submatrix is updated IA - submatrix offset (row index) JA - submatrix offset (column index) U - vector #1 IU - subvector offset V - vector #2 IV - subvector offset *************************************************************************/
void alglib::cmatrixrank1( ae_int_t m, ae_int_t n, complex_2d_array& a, ae_int_t ia, ae_int_t ja, complex_1d_array& u, ae_int_t iu, complex_1d_array& v, ae_int_t iv);
/************************************************************************* This subroutine calculates X*op(A^-1) where: * X is MxN general matrix * A is NxN upper/lower triangular/unitriangular matrix * "op" may be identity transformation, transposition, conjugate transposition Multiplication result replaces X. Cache-oblivious algorithm is used. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. Because starting/stopping worker thread always ! involves some overhead, parallelism starts to be profitable for N's ! larger than 128. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS N - matrix size, N>=0 M - matrix size, M>=0 A - matrix, actual matrix is stored in A[I1:I1+N-1,J1:J1+N-1] I1 - submatrix offset J1 - submatrix offset IsUpper - whether matrix is upper triangular IsUnit - whether matrix is unitriangular OpType - transformation type: * 0 - no transformation * 1 - transposition * 2 - conjugate transposition X - matrix, actual matrix is stored in X[I2:I2+M-1,J2:J2+N-1] I2 - submatrix offset J2 - submatrix offset -- ALGLIB routine -- 15.12.2009 Bochkanov Sergey *************************************************************************/
void alglib::cmatrixrighttrsm( ae_int_t m, ae_int_t n, complex_2d_array a, ae_int_t i1, ae_int_t j1, bool isupper, bool isunit, ae_int_t optype, complex_2d_array& x, ae_int_t i2, ae_int_t j2); void alglib::smp_cmatrixrighttrsm( ae_int_t m, ae_int_t n, complex_2d_array a, ae_int_t i1, ae_int_t j1, bool isupper, bool isunit, ae_int_t optype, complex_2d_array& x, ae_int_t i2, ae_int_t j2);
/************************************************************************* This subroutine is an older version of CMatrixHERK(), one with a wrong name (it performs a HErmitian update, not a SYmmetric one). It is left here for backward compatibility. -- ALGLIB routine -- 16.12.2009 Bochkanov Sergey *************************************************************************/
void alglib::cmatrixsyrk( ae_int_t n, ae_int_t k, double alpha, complex_2d_array a, ae_int_t ia, ae_int_t ja, ae_int_t optypea, double beta, complex_2d_array& c, ae_int_t ic, ae_int_t jc, bool isupper); void alglib::smp_cmatrixsyrk( ae_int_t n, ae_int_t k, double alpha, complex_2d_array a, ae_int_t ia, ae_int_t ja, ae_int_t optypea, double beta, complex_2d_array& c, ae_int_t ic, ae_int_t jc, bool isupper);
/************************************************************************* Cache-oblivious complex "copy-and-transpose" Input parameters: M - number of rows N - number of columns A - source matrix, MxN submatrix is copied and transposed IA - submatrix offset (row index) JA - submatrix offset (column index) B - destination matrix, must be large enough to store result IB - submatrix offset (row index) JB - submatrix offset (column index) *************************************************************************/
void alglib::cmatrixtranspose( ae_int_t m, ae_int_t n, complex_2d_array a, ae_int_t ia, ae_int_t ja, complex_2d_array& b, ae_int_t ib, ae_int_t jb);
/************************************************************************* Copy Input parameters: M - number of rows N - number of columns A - source matrix, MxN submatrix is copied IA - submatrix offset (row index) JA - submatrix offset (column index) B - destination matrix, must be large enough to store result IB - submatrix offset (row index) JB - submatrix offset (column index) *************************************************************************/
void alglib::rmatrixcopy( ae_int_t m, ae_int_t n, real_2d_array a, ae_int_t ia, ae_int_t ja, real_2d_array& b, ae_int_t ib, ae_int_t jb);
/************************************************************************* This code enforces symmetry of the matrix by copying the upper part to the lower one (or vice versa). INPUT PARAMETERS: A - matrix N - number of rows/columns IsUpper - whether we want to copy upper triangle to lower one (True) or vice versa (False). *************************************************************************/
void alglib::rmatrixenforcesymmetricity( real_2d_array& a, ae_int_t n, bool isupper);
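Below is a short illustrative sketch (not part of the original example set; the numeric values are chosen arbitrarily) showing rmatrixenforcesymmetricity() copying the upper triangle of a 2x2 matrix over its lower triangle.

#include "stdafx.h"
#include <stdio.h>
#include "linalg.h"

using namespace alglib;

int main(int argc, char **argv)
{
    // the lower triangle holds a value we want to discard
    real_2d_array a = "[[1,2],[999,3]]";

    // IsUpper=true: copy upper triangle to the lower one
    rmatrixenforcesymmetricity(a, 2, true);

    printf("%s\n", a.tostring(3).c_str()); // expected: [[1,2],[2,3]]
    return 0;
}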
/************************************************************************* This subroutine calculates C = alpha*op1(A)*op2(B) +beta*C where: * C is MxN general matrix * op1(A) is MxK matrix * op2(B) is KxN matrix * "op" may be identity transformation, transposition Additional info: * cache-oblivious algorithm is used. * multiplication result replaces C. If Beta=0, C elements are not used in calculations (not multiplied by zero - just not referenced) * if Alpha=0, A is not used (not multiplied by zero - just not referenced) * if both Beta and Alpha are zero, C is filled by zeros. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. Because starting/stopping worker thread always ! involves some overhead, parallelism starts to be profitable for N's ! larger than 128. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. IMPORTANT: This function does NOT preallocate output matrix C, it MUST be preallocated by caller prior to calling this function. In case C does not have enough space to store result, exception will be generated. INPUT PARAMETERS M - matrix size, M>0 N - matrix size, N>0 K - matrix size, K>0 Alpha - coefficient A - matrix IA - submatrix offset JA - submatrix offset OpTypeA - transformation type: * 0 - no transformation * 1 - transposition B - matrix IB - submatrix offset JB - submatrix offset OpTypeB - transformation type: * 0 - no transformation * 1 - transposition Beta - coefficient C - PREALLOCATED output matrix, large enough to store result IC - submatrix offset JC - submatrix offset -- ALGLIB routine -- 2009-2013 Bochkanov Sergey *************************************************************************/
void alglib::rmatrixgemm( ae_int_t m, ae_int_t n, ae_int_t k, double alpha, real_2d_array a, ae_int_t ia, ae_int_t ja, ae_int_t optypea, real_2d_array b, ae_int_t ib, ae_int_t jb, ae_int_t optypeb, double beta, real_2d_array& c, ae_int_t ic, ae_int_t jc); void alglib::smp_rmatrixgemm( ae_int_t m, ae_int_t n, ae_int_t k, double alpha, real_2d_array a, ae_int_t ia, ae_int_t ja, ae_int_t optypea, real_2d_array b, ae_int_t ib, ae_int_t jb, ae_int_t optypeb, double beta, real_2d_array& c, ae_int_t ic, ae_int_t jc);

Examples:   [1]  [2]  

/************************************************************************* This subroutine calculates op(A^-1)*X where: * X is MxN general matrix * A is MxM upper/lower triangular/unitriangular matrix * "op" may be identity transformation, transposition Multiplication result replaces X. Cache-oblivious algorithm is used. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. Because starting/stopping worker thread always ! involves some overhead, parallelism starts to be profitable for N's ! larger than 128. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS N - matrix size, N>=0 M - matrix size, M>=0 A - matrix, actual matrix is stored in A[I1:I1+M-1,J1:J1+M-1] I1 - submatrix offset J1 - submatrix offset IsUpper - whether matrix is upper triangular IsUnit - whether matrix is unitriangular OpType - transformation type: * 0 - no transformation * 1 - transposition X - matrix, actual matrix is stored in X[I2:I2+M-1,J2:J2+N-1] I2 - submatrix offset J2 - submatrix offset -- ALGLIB routine -- 15.12.2009 Bochkanov Sergey *************************************************************************/
void alglib::rmatrixlefttrsm( ae_int_t m, ae_int_t n, real_2d_array a, ae_int_t i1, ae_int_t j1, bool isupper, bool isunit, ae_int_t optype, real_2d_array& x, ae_int_t i2, ae_int_t j2); void alglib::smp_rmatrixlefttrsm( ae_int_t m, ae_int_t n, real_2d_array a, ae_int_t i1, ae_int_t j1, bool isupper, bool isunit, ae_int_t optype, real_2d_array& x, ae_int_t i2, ae_int_t j2);
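For illustration only (this sketch is not among the manual's examples and uses made-up data), rmatrixlefttrsm() can be used to solve A*X=B for an upper triangular A: B is placed in X and overwritten by A^-1*B.

#include "stdafx.h"
#include <stdio.h>
#include "linalg.h"

using namespace alglib;

int main(int argc, char **argv)
{
    // A is 2x2 upper triangular, X initially holds the right-hand side B
    real_2d_array a = "[[2,1],[0,1]]";
    real_2d_array x = "[[4,3],[2,4]]";

    // X := A^-1 * X  (isupper=true, isunit=false, optype=0 - no transposition)
    rmatrixlefttrsm(2, 2, a, 0, 0, true, false, 0, x, 0, 0);

    // A*X reproduces B, so X should be [[1,-0.5],[2,4]]
    printf("%s\n", x.tostring(3).c_str());
    return 0;
}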
/************************************************************************* Matrix-vector product: y := op(A)*x INPUT PARAMETERS: M - number of rows of op(A) N - number of columns of op(A) A - target matrix IA - submatrix offset (row index) JA - submatrix offset (column index) OpA - operation type: * OpA=0 => op(A) = A * OpA=1 => op(A) = A^T X - input vector IX - subvector offset IY - subvector offset Y - preallocated vector, must be large enough to store result OUTPUT PARAMETERS: Y - vector which stores result if M=0, then subroutine does nothing. if N=0, Y is filled by zeros. -- ALGLIB routine -- 28.01.2010 Bochkanov Sergey *************************************************************************/
void alglib::rmatrixmv( ae_int_t m, ae_int_t n, real_2d_array a, ae_int_t ia, ae_int_t ja, ae_int_t opa, real_1d_array x, ae_int_t ix, real_1d_array& y, ae_int_t iy);
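A minimal sketch (added for illustration, with arbitrarily chosen values) of a real matrix-vector product computed with rmatrixmv(); note that Y must be preallocated by the caller.

#include "stdafx.h"
#include <stdio.h>
#include "linalg.h"

using namespace alglib;

int main(int argc, char **argv)
{
    real_2d_array a = "[[1,2],[3,4]]";
    real_1d_array x = "[1,1]";
    real_1d_array y = "[0,0]"; // preallocated output vector

    // y := A*x  (m=2 rows, n=2 columns, opa=0 - no transposition, zero offsets)
    rmatrixmv(2, 2, a, 0, 0, 0, x, 0, y, 0);

    printf("%s\n", y.tostring(3).c_str()); // expected: [3,7]
    return 0;
}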
/************************************************************************* Rank-1 correction: A := A + u*v' INPUT PARAMETERS: M - number of rows N - number of columns A - target matrix, MxN submatrix is updated IA - submatrix offset (row index) JA - submatrix offset (column index) U - vector #1 IU - subvector offset V - vector #2 IV - subvector offset *************************************************************************/
void alglib::rmatrixrank1( ae_int_t m, ae_int_t n, real_2d_array& a, ae_int_t ia, ae_int_t ja, real_1d_array& u, ae_int_t iu, real_1d_array& v, ae_int_t iv);
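The following illustrative sketch (not part of the original examples; values are made up) applies a rank-1 update A := A + u*v' to a 2x2 identity matrix.

#include "stdafx.h"
#include <stdio.h>
#include "linalg.h"

using namespace alglib;

int main(int argc, char **argv)
{
    real_2d_array a = "[[1,0],[0,1]]";
    real_1d_array u = "[1,2]";
    real_1d_array v = "[3,4]";

    // A := A + u*v', i.e. identity plus [[3,4],[6,8]]
    rmatrixrank1(2, 2, a, 0, 0, u, 0, v, 0);

    printf("%s\n", a.tostring(3).c_str()); // expected: [[4,4],[6,9]]
    return 0;
}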
/************************************************************************* This subroutine calculates X*op(A^-1) where: * X is MxN general matrix * A is NxN upper/lower triangular/unitriangular matrix * "op" may be identity transformation, transposition Multiplication result replaces X. Cache-oblivious algorithm is used. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. Because starting/stopping worker thread always ! involves some overhead, parallelism starts to be profitable for N's ! larger than 128. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS N - matrix size, N>=0 M - matrix size, M>=0 A - matrix, actual matrix is stored in A[I1:I1+N-1,J1:J1+N-1] I1 - submatrix offset J1 - submatrix offset IsUpper - whether matrix is upper triangular IsUnit - whether matrix is unitriangular OpType - transformation type: * 0 - no transformation * 1 - transposition X - matrix, actual matrix is stored in X[I2:I2+M-1,J2:J2+N-1] I2 - submatrix offset J2 - submatrix offset -- ALGLIB routine -- 15.12.2009 Bochkanov Sergey *************************************************************************/
void alglib::rmatrixrighttrsm( ae_int_t m, ae_int_t n, real_2d_array a, ae_int_t i1, ae_int_t j1, bool isupper, bool isunit, ae_int_t optype, real_2d_array& x, ae_int_t i2, ae_int_t j2); void alglib::smp_rmatrixrighttrsm( ae_int_t m, ae_int_t n, real_2d_array a, ae_int_t i1, ae_int_t j1, bool isupper, bool isunit, ae_int_t optype, real_2d_array& x, ae_int_t i2, ae_int_t j2);
/************************************************************************* This subroutine calculates C=alpha*A*A^T+beta*C or C=alpha*A^T*A+beta*C where: * C is NxN symmetric matrix given by its upper/lower triangle * A is NxK matrix when A*A^T is calculated, KxN matrix otherwise Additional info: * cache-oblivious algorithm is used. * multiplication result replaces C. If Beta=0, C elements are not used in calculations (not multiplied by zero - just not referenced) * if Alpha=0, A is not used (not multiplied by zero - just not referenced) * if both Beta and Alpha are zero, C is filled by zeros. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. Because starting/stopping worker thread always ! involves some overhead, parallelism starts to be profitable for N's ! larger than 128. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS N - matrix size, N>=0 K - matrix size, K>=0 Alpha - coefficient A - matrix IA - submatrix offset (row index) JA - submatrix offset (column index) OpTypeA - multiplication type: * 0 - A*A^T is calculated * 2 - A^T*A is calculated Beta - coefficient C - preallocated input/output matrix IC - submatrix offset (row index) JC - submatrix offset (column index) IsUpper - whether C is upper triangular or lower triangular -- ALGLIB routine -- 16.12.2009 Bochkanov Sergey *************************************************************************/
void alglib::rmatrixsyrk( ae_int_t n, ae_int_t k, double alpha, real_2d_array a, ae_int_t ia, ae_int_t ja, ae_int_t optypea, double beta, real_2d_array& c, ae_int_t ic, ae_int_t jc, bool isupper); void alglib::smp_rmatrixsyrk( ae_int_t n, ae_int_t k, double alpha, real_2d_array a, ae_int_t ia, ae_int_t ja, ae_int_t optypea, double beta, real_2d_array& c, ae_int_t ic, ae_int_t jc, bool isupper);

Examples:   [1]  [2]  

/************************************************************************* Cache-oblivious real "copy-and-transpose" Input parameters: M - number of rows N - number of columns A - source matrix, MxN submatrix is copied and transposed IA - submatrix offset (row index) JA - submatrix offset (column index) B - destination matrix, must be large enough to store result IB - submatrix offset (row index) JB - submatrix offset (column index) *************************************************************************/
void alglib::rmatrixtranspose( ae_int_t m, ae_int_t n, real_2d_array a, ae_int_t ia, ae_int_t ja, real_2d_array& b, ae_int_t ib, ae_int_t jb);
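Below is an illustrative sketch (not one of the manual's examples; values chosen for illustration) of rmatrixtranspose(): a 2x3 matrix is copied into a preallocated 3x2 destination.

#include "stdafx.h"
#include <stdio.h>
#include "linalg.h"

using namespace alglib;

int main(int argc, char **argv)
{
    real_2d_array a = "[[1,2,3],[4,5,6]]";
    real_2d_array b = "[[0,0],[0,0],[0,0]]"; // destination must be preallocated

    // copy-and-transpose the 2x3 submatrix starting at (0,0)
    rmatrixtranspose(2, 3, a, 0, 0, b, 0, 0);

    printf("%s\n", b.tostring(3).c_str()); // expected: [[1,4],[2,5],[3,6]]
    return 0;
}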
#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "linalg.h"

using namespace alglib;


int main(int argc, char **argv)
{
    real_2d_array a = "[[2,1],[1,3]]";
    real_2d_array b = "[[2,1],[0,1]]";
    real_2d_array c = "[[0,0],[0,0]]";

    //
    // rmatrixgemm() function allows us to calculate matrix product C:=A*B or
    // to perform more general operation, C:=alpha*op1(A)*op2(B)+beta*C,
    // where A, B, C are rectangular matrices, op(X) can be X or X^T,
    // alpha and beta are scalars.
    //
    // This function:
    // * can apply transposition and/or multiplication by scalar to operands
    // * can use arbitrary part of matrices A/B (given by submatrix offset)
    // * can store result into arbitrary part of C
    // * for performance reasons requires C to be preallocated
    //
    // Parameters of this function are:
    // * M, N, K            -   sizes of op1(A) (which is MxK), op2(B) (which
    //                          is KxN) and C (which is MxN)
    // * Alpha              -   coefficient before A*B
    // * A, IA, JA          -   matrix A and offset of the submatrix
    // * OpTypeA            -   transformation type:
    //                          0 - no transformation
    //                          1 - transposition
    // * B, IB, JB          -   matrix B and offset of the submatrix
    // * OpTypeB            -   transformation type:
    //                          0 - no transformation
    //                          1 - transposition
    // * Beta               -   coefficient before C
    // * C, IC, JC          -   preallocated matrix C and offset of the submatrix
    //
    // Below we perform simple product C:=A*B (alpha=1, beta=0)
    //
    // IMPORTANT: this function works with preallocated C, which must be large
    //            enough to store multiplication result.
    //
    ae_int_t m = 2;
    ae_int_t n = 2;
    ae_int_t k = 2;
    double alpha = 1.0;
    ae_int_t ia = 0;
    ae_int_t ja = 0;
    ae_int_t optypea = 0;
    ae_int_t ib = 0;
    ae_int_t jb = 0;
    ae_int_t optypeb = 0;
    double beta = 0.0;
    ae_int_t ic = 0;
    ae_int_t jc = 0;
    rmatrixgemm(m, n, k, alpha, a, ia, ja, optypea, b, ib, jb, optypeb, beta, c, ic, jc);
    printf("%s\n", c.tostring(3).c_str()); // EXPECTED: [[4,3],[2,4]]

    //
    // Now we try to apply some simple transformation to operands: C:=A*B^T
    //
    optypeb = 1;
    rmatrixgemm(m, n, k, alpha, a, ia, ja, optypea, b, ib, jb, optypeb, beta, c, ic, jc);
    printf("%s\n", c.tostring(3).c_str()); // EXPECTED: [[5,1],[5,3]]
    return 0;
}


#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "linalg.h"

using namespace alglib;


int main(int argc, char **argv)
{
    //
    // rmatrixsyrk() function allows us to calculate symmetric rank-K update
    // C := beta*C + alpha*A'*A, where C is square N*N matrix, A is square K*N
    // matrix, alpha and beta are scalars. It is also possible to update by
    // adding A*A' instead of A'*A.
    //
    // Parameters of this function are:
    // * N, K       -   matrix size
    // * Alpha      -   coefficient before A
    // * A, IA, JA  -   matrix and submatrix offsets
    // * OpTypeA    -   multiplication type:
    //                  * 0 - A*A^T is calculated
    //                  * 2 - A^T*A is calculated
    // * Beta       -   coefficient before C
    // * C, IC, JC  -   preallocated input/output matrix and submatrix offsets
    // * IsUpper    -   whether upper or lower triangle of C is updated;
    //                  this function updates only one half of C, leaving
    //                  other half unchanged (not referenced at all).
    //
    // Below we will show how to calculate simple product C:=A'*A
    //
    // NOTE: beta=0 and we do not use previous value of C, but still it
    //       MUST be preallocated.
    //
    ae_int_t n = 2;
    ae_int_t k = 1;
    double alpha = 1.0;
    ae_int_t ia = 0;
    ae_int_t ja = 0;
    ae_int_t optypea = 2;
    double beta = 0.0;
    ae_int_t ic = 0;
    ae_int_t jc = 0;
    bool isupper = true;
    real_2d_array a = "[[1,2]]";

    // preallocate space to store result
    real_2d_array c = "[[0,0],[0,0]]";

    // calculate product, store result into upper part of c
    rmatrixsyrk(n, k, alpha, a, ia, ja, optypea, beta, c, ic, jc, isupper);

    // output result.
    // IMPORTANT: lower triangle of C was NOT updated!
    printf("%s\n", c.tostring(3).c_str()); // EXPECTED: [[1,2],[0,4]]
    return 0;
}


#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "linalg.h"

using namespace alglib;


int main(int argc, char **argv)
{
    //
    // In this example we assume that you already know how to work with
    // rmatrixgemm() function. Below we concentrate on its multithreading
    // capabilities.
    //
    // SMP edition of ALGLIB includes smp_rmatrixgemm() - multithreaded
    // version of rmatrixgemm() function. In the basic edition of ALGLIB
    // (GPL edition or commercial version without SMP support) this function
    // just calls single-threaded stub. So, you may call this function from
    // ANY edition of ALGLIB, but only in SMP edition it will work in really
    // multithreaded mode.
    //
    // In order to use multithreading, you have to:
    // 1) Install SMP edition of ALGLIB.
    // 2) This step is specific for C++ users: you should activate OS-specific
    //    capabilities of ALGLIB by defining AE_OS=AE_POSIX (for *nix systems)
    //    or AE_OS=AE_WINDOWS (for Windows systems).
    //    C# users do not have to perform this step because C# programs are
    //    portable across different systems without OS-specific tuning.
    // 3) Allow ALGLIB to know about number of worker threads to use:
    //    a) autodetection (C++, C#):
    //          ALGLIB will automatically determine number of CPU cores and
    //          (by default) will use all cores except for one. Say, on 4-core
    //          system it will use three cores - unless you manually told it
    //          to use more or less. It will keep your system responsive during
    //          lengthy computations.
    //          Such behavior may be changed with setnworkers() call:
    //          * alglib::setnworkers(0)  = use all cores
    //          * alglib::setnworkers(-1) = leave one core unused
    //          * alglib::setnworkers(-2) = leave two cores unused
    //          * alglib::setnworkers(+2) = use 2 cores (even if you have more)
    //    b) manual specification (C++, C#):
    //          You may want to specify maximum number of worker threads during
    //          compile time by means of preprocessor definition AE_NWORKERS.
    //          For C++ it will be "AE_NWORKERS=X" where X can be any positive number.
    //          For C# it is "AE_NWORKERSX", where X should be replaced by number of
    //          workers (AE_NWORKERS2, AE_NWORKERS3, AE_NWORKERS4, ...).
    //          You can add this definition to compiler command line or change
    //          corresponding project settings in your IDE.
    //
    // After you installed and configured SMP edition of ALGLIB, you may choose
    // between serial and multithreaded versions of SMP-capable functions:
    // * serial version works as usual, in the context of the calling thread
    // * multithreaded version (with "smp_" prefix) creates (or wakes up) worker
    //   threads, inserts task in the worker queue, and waits for completion of
    //   the task. All processing is done in context of worker thread(s).
    //
    // NOTE: because starting/stopping worker threads costs thousands of CPU cycles,
    //       you should not use multithreading for lightweight computational problems.
    //
    // NOTE: some old POSIX-compatible operating systems do not support
    //       sysconf(_SC_NPROCESSORS_ONLN) system call which is required in order
    //       to automatically determine number of active cores. On these systems
    //       you should specify number of cores manually at compile time.
    //       Without it ALGLIB will run in single-threaded mode.
    //
    // Now, back to our example. In this example we will show you:
    // * how to call SMP version of rmatrixgemm(). Because we work with tiny 2x2
    //   matrices, we won't expect to see ANY speedup from using multithreading.
    //   The only purpose of this demo is to show how to call SMP functions.
    // * how to modify number of worker threads used by ALGLIB
    //
    real_2d_array a = "[[2,1],[1,3]]";
    real_2d_array b = "[[2,1],[0,1]]";
    real_2d_array c = "[[0,0],[0,0]]";
    ae_int_t m = 2;
    ae_int_t n = 2;
    ae_int_t k = 2;
    double alpha = 1.0;
    ae_int_t ia = 0;
    ae_int_t ja = 0;
    ae_int_t optypea = 0;
    ae_int_t ib = 0;
    ae_int_t jb = 0;
    ae_int_t optypeb = 0;
    double beta = 0.0;
    ae_int_t ic = 0;
    ae_int_t jc = 0;

    // serial code
    c = "[[0,0],[0,0]]";
    rmatrixgemm(m, n, k, alpha, a, ia, ja, optypea, b, ib, jb, optypeb, beta, c, ic, jc);

    // SMP code with default number of worker threads
    c = "[[0,0],[0,0]]";
    smp_rmatrixgemm(m, n, k, alpha, a, ia, ja, optypea, b, ib, jb, optypeb, beta, c, ic, jc);
    printf("%s\n", c.tostring(3).c_str()); // EXPECTED: [[4,3],[2,4]]

    // override number of worker threads - use two cores
    alglib::setnworkers(+2);
    c = "[[0,0],[0,0]]";
    smp_rmatrixgemm(m, n, k, alpha, a, ia, ja, optypea, b, ib, jb, optypeb, beta, c, ic, jc);
    printf("%s\n", c.tostring(3).c_str()); // EXPECTED: [[4,3],[2,4]]
    return 0;
}


#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "linalg.h"

using namespace alglib;


int main(int argc, char **argv)
{
    //
    // In this example we assume that you already know how to work with
    // rmatrixsyrk() function. Below we concentrate on its multithreading
    // capabilities.
    //
    // SMP edition of ALGLIB includes smp_rmatrixsyrk() - multithreaded
    // version of rmatrixsyrk() function. In the basic edition of ALGLIB
    // (GPL edition or commercial version without SMP support) this function
    // just calls single-threaded stub. So, you may call this function from
    // ANY edition of ALGLIB, but only in SMP edition it will work in really
    // multithreaded mode.
    //
    // In order to use multithreading, you have to:
    // 1) Install SMP edition of ALGLIB.
    // 2) This step is specific for C++ users: you should activate OS-specific
    //    capabilities of ALGLIB by defining AE_OS=AE_POSIX (for *nix systems)
    //    or AE_OS=AE_WINDOWS (for Windows systems).
    //    C# users do not have to perform this step because C# programs are
    //    portable across different systems without OS-specific tuning.
    // 3) Allow ALGLIB to know about number of worker threads to use:
    //    a) autodetection (C++, C#):
    //          ALGLIB will automatically determine number of CPU cores and
    //          (by default) will use all cores except for one. Say, on 4-core
    //          system it will use three cores - unless you manually told it
    //          to use more or less. It will keep your system responsive during
    //          lengthy computations.
    //          Such behavior may be changed with setnworkers() call:
    //          * alglib::setnworkers(0)  = use all cores
    //          * alglib::setnworkers(-1) = leave one core unused
    //          * alglib::setnworkers(-2) = leave two cores unused
    //          * alglib::setnworkers(+2) = use 2 cores (even if you have more)
    //    b) manual specification (C++, C#):
    //          You may want to specify maximum number of worker threads during
    //          compile time by means of preprocessor definition AE_NWORKERS.
    //          For C++ it will be "AE_NWORKERS=X" where X can be any positive number.
    //          For C# it is "AE_NWORKERSX", where X should be replaced by number of
    //          workers (AE_NWORKERS2, AE_NWORKERS3, AE_NWORKERS4, ...).
    //          You can add this definition to compiler command line or change
    //          corresponding project settings in your IDE.
    //
    // After you installed and configured SMP edition of ALGLIB, you may choose
    // between serial and multithreaded versions of SMP-capable functions:
    // * serial version works as usual, in the context of the calling thread
    // * multithreaded version (with "smp_" prefix) creates (or wakes up) worker
    //   threads, inserts task in the worker queue, and waits for completion of
    //   the task. All processing is done in context of worker thread(s).
    //
    // NOTE: because starting/stopping worker threads costs thousands of CPU cycles,
    //       you should not use multithreading for lightweight computational problems.
    //
    // NOTE: some old POSIX-compatible operating systems do not support
    //       sysconf(_SC_NPROCESSORS_ONLN) system call which is required in order
    //       to automatically determine number of active cores. On these systems
    //       you should specify number of cores manually at compile time.
    //       Without it ALGLIB will run in single-threaded mode.
    //
    // Now, back to our example. In this example we will show you:
    // * how to call SMP version of rmatrixsyrk(). Because we work with tiny 2x2
    //   matrices, we won't expect to see ANY speedup from using multithreading.
    //   The only purpose of this demo is to show how to call SMP functions.
    // * how to modify number of worker threads used by ALGLIB
    //
    ae_int_t n = 2;
    ae_int_t k = 1;
    double alpha = 1.0;
    ae_int_t ia = 0;
    ae_int_t ja = 0;
    ae_int_t optypea = 2;
    double beta = 0.0;
    ae_int_t ic = 0;
    ae_int_t jc = 0;
    bool isupper = true;
    real_2d_array a = "[[1,2]]";
    real_2d_array c = "[[]]";

    //
    // Default number of worker threads.
    // Preallocate space to store result, call multithreaded version, test.
    //
    // NOTE: this function updates only one triangular part of C. In our
    //       example we choose to update upper triangle.
    //
    c = "[[0,0],[0,0]]";
    smp_rmatrixsyrk(n, k, alpha, a, ia, ja, optypea, beta, c, ic, jc, isupper);
    printf("%s\n", c.tostring(3).c_str()); // EXPECTED: [[1,2],[0,4]]

    //
    // Override default number of worker threads (set to 2).
    // Preallocate space to store result, call multithreaded version, test.
    //
    // NOTE: this function updates only one triangular part of C. In our
    //       example we choose to update upper triangle.
    //
    alglib::setnworkers(+2);
    c = "[[0,0],[0,0]]";
    smp_rmatrixsyrk(n, k, alpha, a, ia, ja, optypea, beta, c, ic, jc, isupper);
    printf("%s\n", c.tostring(3).c_str()); // EXPECTED: [[1,2],[0,4]]
    return 0;
}


airy
/************************************************************************* Airy function Solution of the differential equation y"(x) = xy. The function returns the two independent solutions Ai, Bi and their first derivatives Ai'(x), Bi'(x). Evaluation is by power series summation for small x, by rational minimax approximations for large x. ACCURACY: Error criterion is absolute when function <= 1, relative when function > 1, except * denotes relative error criterion. For large negative x, the absolute error increases as x^1.5. For large positive x, the relative error increases as x^1.5. Arithmetic domain function # trials peak rms IEEE -10, 0 Ai 10000 1.6e-15 2.7e-16 IEEE 0, 10 Ai 10000 2.3e-14* 1.8e-15* IEEE -10, 0 Ai' 10000 4.6e-15 7.6e-16 IEEE 0, 10 Ai' 10000 1.8e-14* 1.5e-15* IEEE -10, 10 Bi 30000 4.2e-15 5.3e-16 IEEE -10, 10 Bi' 30000 4.9e-15 7.3e-16 Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 1989, 2000 by Stephen L. Moshier *************************************************************************/
void alglib::airy( double x, double& ai, double& aip, double& bi, double& bip);
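As an added illustration (not part of the original manual examples; the specialfunctions.h header name is assumed from the SpecialFunctions package), the sketch below evaluates the Airy functions at x=0, where Ai(0) is approximately 0.3550 and Bi(0) approximately 0.6149.

#include "stdafx.h"
#include <stdio.h>
#include "specialfunctions.h"

using namespace alglib;

int main(int argc, char **argv)
{
    double ai, aip, bi, bip;

    // evaluate Ai, Ai', Bi, Bi' at x=0
    airy(0.0, ai, aip, bi, bip);

    printf("%.4f %.4f\n", ai, bi); // expected: roughly 0.3550 0.6149
    return 0;
}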
autogkreport
autogkstate
autogkintegrate
autogkresults
autogksingular
autogksmooth
autogksmoothw
autogk_d1 Integrating f=exp(x) by adaptive integrator
/************************************************************************* Integration report: * TerminationType = completion code: * -5 non-convergence of Gauss-Kronrod nodes calculation subroutine. * -1 incorrect parameters were specified * 1 OK * Rep.NFEV contains number of function calculations * Rep.NIntervals contains number of intervals into which [a,b] was partitioned. *************************************************************************/
class autogkreport { ae_int_t terminationtype; ae_int_t nfev; ae_int_t nintervals; };
/************************************************************************* This structure stores state of the integration algorithm. Although this class has public fields, they are not intended for external use. You should use ALGLIB functions to work with this class: * autogksmooth()/AutoGKSmoothW()/... to create objects * autogkintegrate() to begin integration * autogkresults() to get results *************************************************************************/
class autogkstate { };
/************************************************************************* This function is used to launch iterations of the adaptive integrator It accepts following parameters: func - callback which calculates f(x) for given x (the values x-a and b-x are also passed, to simplify handling of behavior near the endpoints) ptr - optional object which is passed to func; can be NULL -- ALGLIB -- Copyright 07.05.2009 by Bochkanov Sergey *************************************************************************/
void autogkintegrate(autogkstate &state, void (*func)(double x, double xminusa, double bminusx, double &y, void *ptr), void *ptr = NULL);

Examples:   [1]  

/************************************************************************* Adaptive integration results Called after AutoGKIteration returned False. Input parameters: State - algorithm state (used by AutoGKIteration). Output parameters: V - integral(f(x)dx,a,b) Rep - integration report (see AutoGKReport description) -- ALGLIB -- Copyright 14.11.2007 by Bochkanov Sergey *************************************************************************/
void alglib::autogkresults( autogkstate state, double& v, autogkreport& rep);

Examples:   [1]  

/************************************************************************* Integration on a finite interval [A,B]. The integrand may have integrable singularities at A/B. F(X) must diverge as "(x-A)^alpha" at A, as "(B-x)^beta" at B, with known alpha/beta (alpha>-1, beta>-1). If alpha/beta are not known, estimates from below can be used (but these estimates should be greater than -1 too). One of alpha/beta variables (or even both alpha/beta) may be equal to 0, which means that the function F(x) is non-singular at A/B. Anyway (singular at bounds or not), function F(x) is supposed to be continuous on (A,B). Fast-convergent algorithm based on a Gauss-Kronrod formula is used. Result is calculated with accuracy close to the machine precision. INPUT PARAMETERS: A, B - interval boundaries (A<B, A=B or A>B) Alpha - power-law coefficient of the F(x) at A, Alpha>-1 Beta - power-law coefficient of the F(x) at B, Beta>-1 OUTPUT PARAMETERS State - structure which stores algorithm state SEE ALSO AutoGKSmooth, AutoGKSmoothW, AutoGKResults. -- ALGLIB -- Copyright 06.05.2009 by Bochkanov Sergey *************************************************************************/
void alglib::autogksingular( double a, double b, double alpha, double beta, autogkstate& state);
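The sketch below (an added illustration following the callback convention of the autogk_d1 example) integrates f(x)=1/sqrt(x) over [0,1] with autogksingular(); the integrand diverges as (x-A)^(-0.5) at A=0, is non-singular at B=1, and the exact value of the integral is 2.

#include "stdafx.h"
#include <stdio.h>
#include <math.h>
#include "integration.h"

using namespace alglib;

void int_singular_func(double x, double xminusa, double bminusx, double &y, void *ptr)
{
    // f(x)=1/sqrt(x); near the left endpoint we use the precomputed
    // distance xminusa = x-a, which is more accurate than x itself
    y = 1.0/sqrt(xminusa);
}

int main(int argc, char **argv)
{
    double a = 0;
    double b = 1;
    autogkstate s;
    double v;
    autogkreport rep;

    autogksingular(a, b, -0.5, 0.0, s);
    alglib::autogkintegrate(s, int_singular_func);
    autogkresults(s, v, rep);

    printf("%.4f\n", double(v)); // expected: 2.0000
    return 0;
}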
/************************************************************************* Integration of a smooth function F(x) on a finite interval [a,b]. Fast-convergent algorithm based on a Gauss-Kronrod formula is used. Result is calculated with accuracy close to the machine precision. Algorithm works well only with smooth integrands. It may be used with continuous non-smooth integrands, but with less performance. It should never be used with integrands which have integrable singularities at lower or upper limits - algorithm may crash. Use AutoGKSingular in such cases. INPUT PARAMETERS: A, B - interval boundaries (A<B, A=B or A>B) OUTPUT PARAMETERS State - structure which stores algorithm state SEE ALSO AutoGKSmoothW, AutoGKSingular, AutoGKResults. -- ALGLIB -- Copyright 06.05.2009 by Bochkanov Sergey *************************************************************************/
void alglib::autogksmooth(double a, double b, autogkstate& state);

Examples:   [1]  

/************************************************************************* Integration of a smooth function F(x) on a finite interval [a,b]. This subroutine is the same as AutoGKSmooth(), but it guarantees that interval [a,b] is partitioned into subintervals which have width at most XWidth. Subroutine can be used when integrating a nearly-constant function with narrow "bumps" (about XWidth wide). If "bumps" are too narrow, AutoGKSmooth subroutine can overlook them. INPUT PARAMETERS: A, B - interval boundaries (A<B, A=B or A>B) XWidth - maximum width of subintervals, XWidth>0 OUTPUT PARAMETERS State - structure which stores algorithm state SEE ALSO AutoGKSmooth, AutoGKSingular, AutoGKResults. -- ALGLIB -- Copyright 06.05.2009 by Bochkanov Sergey *************************************************************************/
void alglib::autogksmoothw( double a, double b, double xwidth, autogkstate& state);
#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "integration.h"

using namespace alglib;
void int_function_1_func(double x, double xminusa, double bminusx, double &y, void *ptr) 
{
    // this callback calculates f(x)=exp(x)
    y = exp(x);
}

int main(int argc, char **argv)
{
    //
    // This example demonstrates integration of f=exp(x) on [0,1]:
    // * first, autogkstate is initialized
    // * then we call integration function
    // * and finally we obtain results with autogkresults() call
    //
    double a = 0;
    double b = 1;
    autogkstate s;
    double v;
    autogkreport rep;

    autogksmooth(a, b, s);
    alglib::autogkintegrate(s, int_function_1_func);
    autogkresults(s, v, rep);

    printf("%.2f\n", double(v)); // EXPECTED: 1.7182
    return 0;
}


cov2
covm
covm2
pearsoncorr2
pearsoncorrelation
pearsoncorrm
pearsoncorrm2
rankdata
rankdatacentered
sampleadev
samplekurtosis
samplemean
samplemedian
samplemoments
samplepercentile
sampleskewness
samplevariance
spearmancorr2
spearmancorrm
spearmancorrm2
spearmanrankcorrelation
basestat_d_base Basic functionality (moments, adev, median, percentile)
basestat_d_c2 Correlation (covariance) between two random variables
basestat_d_cm Correlation (covariance) between components of random vector
basestat_d_cm2 Correlation (covariance) between two random vectors
/************************************************************************* 2-sample covariance Input parameters: X - sample 1 (array indexes: [0..N-1]) Y - sample 2 (array indexes: [0..N-1]) N - N>=0, sample size: * if given, only N leading elements of X/Y are processed * if not given, automatically determined from input sizes Result: covariance (zero for N=0 or N=1) -- ALGLIB -- Copyright 28.10.2010 by Bochkanov Sergey *************************************************************************/
double alglib::cov2(real_1d_array x, real_1d_array y); double alglib::cov2(real_1d_array x, real_1d_array y, ae_int_t n);

Examples:   [1]  
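For quick reference, here is an added minimal sketch of cov2() (the statistics.h header name is assumed, matching the Statistics package); the two samples below are perfectly linearly related, and their sample covariance works out to 2 with the conventional N-1 divisor.

#include "stdafx.h"
#include <stdio.h>
#include "statistics.h"

using namespace alglib;

int main(int argc, char **argv)
{
    real_1d_array x = "[1,2,3]";
    real_1d_array y = "[2,4,6]";

    // sample covariance of x and y (2.0 with the conventional N-1 divisor)
    double v = cov2(x, y);

    printf("%.2f\n", v);
    return 0;
}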

/************************************************************************* Covariance matrix SMP EDITION OF ALGLIB: ! This function can utilize multicore capabilities of your system. In ! order to do this you have to call version with "smp_" prefix, which ! indicates that multicore code will be used. ! ! This note is given for users of SMP edition; if you use GPL edition, ! or commercial edition of ALGLIB without SMP support, you still will ! be able to call smp-version of this function, but all computations ! will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. ! ! You should remember that starting/stopping worker thread always have ! non-zero cost. Although multicore version is pretty efficient on ! large problems, we do not recommend you to use it on small problems - ! with covariance matrices smaller than 128*128. INPUT PARAMETERS: X - array[N,M], sample matrix: * J-th column corresponds to J-th variable * I-th row corresponds to I-th observation N - N>=0, number of observations: * if given, only leading N rows of X are used * if not given, automatically determined from input size M - M>0, number of variables: * if given, only leading M columns of X are used * if not given, automatically determined from input size OUTPUT PARAMETERS: C - array[M,M], covariance matrix (zero if N=0 or N=1) -- ALGLIB -- Copyright 28.10.2010 by Bochkanov Sergey *************************************************************************/
void alglib::covm(real_2d_array x, real_2d_array& c); void alglib::covm( real_2d_array x, ae_int_t n, ae_int_t m, real_2d_array& c); void alglib::smp_covm(real_2d_array x, real_2d_array& c); void alglib::smp_covm( real_2d_array x, ae_int_t n, ae_int_t m, real_2d_array& c);

Examples:   [1]  

/************************************************************************* Cross-covariance matrix SMP EDITION OF ALGLIB: ! This function can utilize multicore capabilities of your system. In ! order to do this you have to call version with "smp_" prefix, which ! indicates that multicore code will be used. ! ! This note is given for users of SMP edition; if you use GPL edition, ! or commercial edition of ALGLIB without SMP support, you still will ! be able to call smp-version of this function, but all computations ! will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. ! ! You should remember that starting/stopping worker thread always have ! non-zero cost. Although multicore version is pretty efficient on ! large problems, we do not recommend you to use it on small problems - ! with covariance matrices smaller than 128*128. INPUT PARAMETERS: X - array[N,M1], sample matrix: * J-th column corresponds to J-th variable * I-th row corresponds to I-th observation Y - array[N,M2], sample matrix: * J-th column corresponds to J-th variable * I-th row corresponds to I-th observation N - N>=0, number of observations: * if given, only leading N rows of X/Y are used * if not given, automatically determined from input sizes M1 - M1>0, number of variables in X: * if given, only leading M1 columns of X are used * if not given, automatically determined from input size M2 - M2>0, number of variables in Y: * if given, only leading M2 columns of Y are used * if not given, automatically determined from input size OUTPUT PARAMETERS: C - array[M1,M2], cross-covariance matrix (zero if N=0 or N=1) -- ALGLIB -- Copyright 28.10.2010 by Bochkanov Sergey *************************************************************************/
void alglib::covm2(real_2d_array x, real_2d_array y, real_2d_array& c); void alglib::covm2( real_2d_array x, real_2d_array y, ae_int_t n, ae_int_t m1, ae_int_t m2, real_2d_array& c); void alglib::smp_covm2(real_2d_array x, real_2d_array y, real_2d_array& c); void alglib::smp_covm2( real_2d_array x, real_2d_array y, ae_int_t n, ae_int_t m1, ae_int_t m2, real_2d_array& c);

Examples:   [1]  

/************************************************************************* Pearson product-moment correlation coefficient Input parameters: X - sample 1 (array indexes: [0..N-1]) Y - sample 2 (array indexes: [0..N-1]) N - N>=0, sample size: * if given, only N leading elements of X/Y are processed * if not given, automatically determined from input sizes Result: Pearson product-moment correlation coefficient (zero for N=0 or N=1) -- ALGLIB -- Copyright 28.10.2010 by Bochkanov Sergey *************************************************************************/
double alglib::pearsoncorr2(real_1d_array x, real_1d_array y); double alglib::pearsoncorr2(real_1d_array x, real_1d_array y, ae_int_t n);

Examples:   [1]  
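An added minimal sketch of pearsoncorr2() (the statistics.h header name is assumed): for two perfectly linearly related samples the coefficient equals 1.

#include "stdafx.h"
#include <stdio.h>
#include "statistics.h"

using namespace alglib;

int main(int argc, char **argv)
{
    real_1d_array x = "[1,2,3]";
    real_1d_array y = "[2,4,6]";

    // y is an exact linear function of x, so the correlation is 1
    double r = pearsoncorr2(x, y);

    printf("%.2f\n", r); // expected: 1.00
    return 0;
}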

/************************************************************************* Obsolete function; we recommend using PearsonCorr2() instead. -- ALGLIB -- Copyright 09.04.2007 by Bochkanov Sergey *************************************************************************/
double alglib::pearsoncorrelation( real_1d_array x, real_1d_array y, ae_int_t n);
/************************************************************************* Pearson product-moment correlation matrix SMP EDITION OF ALGLIB: ! This function can utilize multicore capabilities of your system. In ! order to do this you have to call version with "smp_" prefix, which ! indicates that multicore code will be used. ! ! This note is given for users of SMP edition; if you use GPL edition, ! or commercial edition of ALGLIB without SMP support, you still will ! be able to call smp-version of this function, but all computations ! will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. ! ! You should remember that starting/stopping worker thread always have ! non-zero cost. Although multicore version is pretty efficient on ! large problems, we do not recommend you to use it on small problems - ! with correlation matrices smaller than 128*128. INPUT PARAMETERS: X - array[N,M], sample matrix: * J-th column corresponds to J-th variable * I-th row corresponds to I-th observation N - N>=0, number of observations: * if given, only leading N rows of X are used * if not given, automatically determined from input size M - M>0, number of variables: * if given, only leading M columns of X are used * if not given, automatically determined from input size OUTPUT PARAMETERS: C - array[M,M], correlation matrix (zero if N=0 or N=1) -- ALGLIB -- Copyright 28.10.2010 by Bochkanov Sergey *************************************************************************/
void alglib::pearsoncorrm(real_2d_array x, real_2d_array& c); void alglib::pearsoncorrm( real_2d_array x, ae_int_t n, ae_int_t m, real_2d_array& c); void alglib::smp_pearsoncorrm(real_2d_array x, real_2d_array& c); void alglib::smp_pearsoncorrm( real_2d_array x, ae_int_t n, ae_int_t m, real_2d_array& c);

Examples:   [1]  

/************************************************************************* Pearson product-moment cross-correlation matrix SMP EDITION OF ALGLIB: ! This function can utilize multicore capabilities of your system. In ! order to do this you have to call version with "smp_" prefix, which ! indicates that multicore code will be used. ! ! This note is given for users of SMP edition; if you use GPL edition, ! or commercial edition of ALGLIB without SMP support, you still will ! be able to call smp-version of this function, but all computations ! will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. ! ! You should remember that starting/stopping worker thread always have ! non-zero cost. Although multicore version is pretty efficient on ! large problems, we do not recommend you to use it on small problems - ! with correlation matrices smaller than 128*128. INPUT PARAMETERS: X - array[N,M1], sample matrix: * J-th column corresponds to J-th variable * I-th row corresponds to I-th observation Y - array[N,M2], sample matrix: * J-th column corresponds to J-th variable * I-th row corresponds to I-th observation N - N>=0, number of observations: * if given, only leading N rows of X/Y are used * if not given, automatically determined from input sizes M1 - M1>0, number of variables in X: * if given, only leading M1 columns of X are used * if not given, automatically determined from input size M2 - M2>0, number of variables in Y: * if given, only leading M2 columns of Y are used * if not given, automatically determined from input size OUTPUT PARAMETERS: C - array[M1,M2], cross-correlation matrix (zero if N=0 or N=1) -- ALGLIB -- Copyright 28.10.2010 by Bochkanov Sergey *************************************************************************/
void alglib::pearsoncorrm2( real_2d_array x, real_2d_array y, real_2d_array& c); void alglib::pearsoncorrm2( real_2d_array x, real_2d_array y, ae_int_t n, ae_int_t m1, ae_int_t m2, real_2d_array& c); void alglib::smp_pearsoncorrm2( real_2d_array x, real_2d_array y, real_2d_array& c); void alglib::smp_pearsoncorrm2( real_2d_array x, real_2d_array y, ae_int_t n, ae_int_t m1, ae_int_t m2, real_2d_array& c);

Examples:   [1]  

/************************************************************************* This function replaces data in XY by their ranks: * XY is processed row-by-row * rows are processed separately * tied data are correctly handled (tied ranks are calculated) * ranking starts from 0, ends at NFeatures-1 * sum of within-row values is equal to (NFeatures-1)*NFeatures/2 SMP EDITION OF ALGLIB: ! This function can utilize multicore capabilities of your system. In ! order to do this you have to call version with "smp_" prefix, which ! indicates that multicore code will be used. ! ! This note is given for users of SMP edition; if you use GPL edition, ! or commercial edition of ALGLIB without SMP support, you still will ! be able to call smp-version of this function, but all computations ! will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. ! ! You should remember that starting/stopping worker thread always have ! non-zero cost. Although multicore version is pretty efficient on ! large problems, we do not recommend you to use it on small problems - ! ones where expected operations count is less than 100.000 INPUT PARAMETERS: XY - array[NPoints,NFeatures], dataset NPoints - number of points NFeatures- number of features OUTPUT PARAMETERS: XY - data are replaced by their within-row ranks; ranking starts from 0, ends at NFeatures-1 -- ALGLIB -- Copyright 18.04.2013 by Bochkanov Sergey *************************************************************************/
void alglib::rankdata(real_2d_array& xy); void alglib::rankdata( real_2d_array& xy, ae_int_t npoints, ae_int_t nfeatures); void alglib::smp_rankdata(real_2d_array& xy); void alglib::smp_rankdata( real_2d_array& xy, ae_int_t npoints, ae_int_t nfeatures);
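An added illustrative sketch of rankdata() (the statistics.h header name is assumed): each row is ranked independently, ranking starts from 0, and ties receive averaged ranks.

#include "stdafx.h"
#include <stdio.h>
#include "statistics.h"

using namespace alglib;

int main(int argc, char **argv)
{
    // two rows, ranked independently; the second row contains a tie
    real_2d_array xy = "[[3,1,2],[5,5,1]]";

    rankdata(xy);

    // expected: [[2,0,1],[1.5,1.5,0]]
    printf("%s\n", xy.tostring(2).c_str());
    return 0;
}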
/************************************************************************* This function replaces data in XY by their CENTERED ranks: * XY is processed row-by-row * rows are processed separately * tied data are correctly handled (tied ranks are calculated) * centered ranks are just usual ranks, but centered in such way that sum of within-row values is equal to 0.0. * centering is performed by subtracting mean from each row, i.e it changes mean value, but does NOT change higher moments SMP EDITION OF ALGLIB: ! This function can utilize multicore capabilities of your system. In ! order to do this you have to call version with "smp_" prefix, which ! indicates that multicore code will be used. ! ! This note is given for users of SMP edition; if you use GPL edition, ! or commercial edition of ALGLIB without SMP support, you still will ! be able to call smp-version of this function, but all computations ! will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. ! ! You should remember that starting/stopping worker thread always have ! non-zero cost. Although multicore version is pretty efficient on ! large problems, we do not recommend you to use it on small problems - ! ones where expected operations count is less than 100.000 INPUT PARAMETERS: XY - array[NPoints,NFeatures], dataset NPoints - number of points NFeatures- number of features OUTPUT PARAMETERS: XY - data are replaced by their within-row ranks; ranking starts from 0, ends at NFeatures-1 -- ALGLIB -- Copyright 18.04.2013 by Bochkanov Sergey *************************************************************************/
void alglib::rankdatacentered(real_2d_array& xy); void alglib::rankdatacentered( real_2d_array& xy, ae_int_t npoints, ae_int_t nfeatures); void alglib::smp_rankdatacentered(real_2d_array& xy); void alglib::smp_rankdatacentered( real_2d_array& xy, ae_int_t npoints, ae_int_t nfeatures);
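
A minimal usage sketch for rankdata() and rankdatacentered() (this is not one of the manual's numbered examples; it assumes the statistics.h header used by the other examples in this section):

#include "stdafx.h"
#include <stdio.h>
#include "statistics.h"

using namespace alglib;

int main(int argc, char **argv)
{
    //
    // Each row is replaced by within-row ranks (0..NFeatures-1, tied values
    // get averaged ranks), then by centered ranks which sum to zero per row.
    //
    real_2d_array xy = "[[10,30,20],[5,5,9]]";
    rankdata(xy);
    printf("%s\n", xy.tostring(1).c_str());  // rows become approx [0,2,1] and [0.5,0.5,2]

    real_2d_array xy2 = "[[10,30,20],[5,5,9]]";
    rankdatacentered(xy2);
    printf("%s\n", xy2.tostring(1).c_str()); // rows become approx [-1,1,0] and [-0.5,-0.5,1]
    return 0;
}
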
/************************************************************************* Calculation of the average deviation (ADev) - the mean absolute deviation of the sample from its mean. Input parameters: X - sample N - N>=0, sample size: * if given, only leading N elements of X are processed * if not given, automatically determined from size of X Output parameters: ADev- average deviation -- ALGLIB -- Copyright 06.09.2006 by Bochkanov Sergey *************************************************************************/
void alglib::sampleadev(real_1d_array x, double& adev); void alglib::sampleadev(real_1d_array x, ae_int_t n, double& adev);

Examples:   [1]  

/************************************************************************* Calculation of the kurtosis. INPUT PARAMETERS: X - sample N - N>=0, sample size: * if given, only leading N elements of X are processed * if not given, automatically determined from size of X NOTE: This function returns the result calculated by the 'SampleMoments' function and stored in its 'Kurtosis' output variable. -- ALGLIB -- Copyright 06.09.2006 by Bochkanov Sergey *************************************************************************/
double alglib::samplekurtosis(real_1d_array x); double alglib::samplekurtosis(real_1d_array x, ae_int_t n);
/************************************************************************* Calculation of the mean. INPUT PARAMETERS: X - sample N - N>=0, sample size: * if given, only leading N elements of X are processed * if not given, automatically determined from size of X NOTE: This function returns the result calculated by the 'SampleMoments' function and stored in its 'Mean' output variable. -- ALGLIB -- Copyright 06.09.2006 by Bochkanov Sergey *************************************************************************/
double alglib::samplemean(real_1d_array x); double alglib::samplemean(real_1d_array x, ae_int_t n);
/************************************************************************* Median calculation. Input parameters: X - sample (array indexes: [0..N-1]) N - N>=0, sample size: * if given, only leading N elements of X are processed * if not given, automatically determined from size of X Output parameters: Median -- ALGLIB -- Copyright 06.09.2006 by Bochkanov Sergey *************************************************************************/
void alglib::samplemedian(real_1d_array x, double& median); void alglib::samplemedian(real_1d_array x, ae_int_t n, double& median);

Examples:   [1]  

/************************************************************************* Calculation of the distribution moments: mean, variance, skewness, kurtosis. INPUT PARAMETERS: X - sample N - N>=0, sample size: * if given, only leading N elements of X are processed * if not given, automatically determined from size of X OUTPUT PARAMETERS Mean - mean. Variance- variance. Skewness- skewness (if variance<>0; zero otherwise). Kurtosis- kurtosis (if variance<>0; zero otherwise). NOTE: variance is calculated by dividing sum of squares by N-1, not N. -- ALGLIB -- Copyright 06.09.2006 by Bochkanov Sergey *************************************************************************/
void alglib::samplemoments( real_1d_array x, double& mean, double& variance, double& skewness, double& kurtosis); void alglib::samplemoments( real_1d_array x, ae_int_t n, double& mean, double& variance, double& skewness, double& kurtosis);

Examples:   [1]  

/************************************************************************* Percentile calculation. Input parameters: X - sample (array indexes: [0..N-1]) N - N>=0, sample size: * if given, only leading N elements of X are processed * if not given, automatically determined from size of X P - percentile (0<=P<=1) Output parameters: V - percentile -- ALGLIB -- Copyright 01.03.2008 by Bochkanov Sergey *************************************************************************/
void alglib::samplepercentile(real_1d_array x, double p, double& v); void alglib::samplepercentile( real_1d_array x, ae_int_t n, double p, double& v);

Examples:   [1]  

/************************************************************************* Calculation of the skewness. INPUT PARAMETERS: X - sample N - N>=0, sample size: * if given, only leading N elements of X are processed * if not given, automatically determined from size of X NOTE: This function returns the result calculated by the 'SampleMoments' function and stored in its 'Skewness' output variable. -- ALGLIB -- Copyright 06.09.2006 by Bochkanov Sergey *************************************************************************/
double alglib::sampleskewness(real_1d_array x); double alglib::sampleskewness(real_1d_array x, ae_int_t n);
/************************************************************************* Calculation of the variance. INPUT PARAMETERS: X - sample N - N>=0, sample size: * if given, only leading N elements of X are processed * if not given, automatically determined from size of X NOTE: This function returns the result calculated by the 'SampleMoments' function and stored in its 'Variance' output variable. -- ALGLIB -- Copyright 06.09.2006 by Bochkanov Sergey *************************************************************************/
double alglib::samplevariance(real_1d_array x); double alglib::samplevariance(real_1d_array x, ae_int_t n);
/************************************************************************* Spearman's rank correlation coefficient Input parameters: X - sample 1 (array indexes: [0..N-1]) Y - sample 2 (array indexes: [0..N-1]) N - N>=0, sample size: * if given, only N leading elements of X/Y are processed * if not given, automatically determined from input sizes Result: Spearman's rank correlation coefficient (zero for N=0 or N=1) -- ALGLIB -- Copyright 09.04.2007 by Bochkanov Sergey *************************************************************************/
double alglib::spearmancorr2(real_1d_array x, real_1d_array y); double alglib::spearmancorr2( real_1d_array x, real_1d_array y, ae_int_t n);

Examples:   [1]  

/************************************************************************* Spearman's rank correlation matrix SMP EDITION OF ALGLIB: ! This function can utilize multicore capabilities of your system. In ! order to do this you have to call version with "smp_" prefix, which ! indicates that multicore code will be used. ! ! This note is given for users of SMP edition; if you use GPL edition, ! or commercial edition of ALGLIB without SMP support, you still will ! be able to call smp-version of this function, but all computations ! will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. ! ! You should remember that starting/stopping worker thread always have ! non-zero cost. Although multicore version is pretty efficient on ! large problems, we do not recommend you to use it on small problems - ! with correlation matrices smaller than 128*128. INPUT PARAMETERS: X - array[N,M], sample matrix: * J-th column corresponds to J-th variable * I-th row corresponds to I-th observation N - N>=0, number of observations: * if given, only leading N rows of X are used * if not given, automatically determined from input size M - M>0, number of variables: * if given, only leading M columns of X are used * if not given, automatically determined from input size OUTPUT PARAMETERS: C - array[M,M], correlation matrix (zero if N=0 or N=1) -- ALGLIB -- Copyright 28.10.2010 by Bochkanov Sergey *************************************************************************/
void alglib::spearmancorrm(real_2d_array x, real_2d_array& c); void alglib::spearmancorrm( real_2d_array x, ae_int_t n, ae_int_t m, real_2d_array& c); void alglib::smp_spearmancorrm(real_2d_array x, real_2d_array& c); void alglib::smp_spearmancorrm( real_2d_array x, ae_int_t n, ae_int_t m, real_2d_array& c);

Examples:   [1]  

/************************************************************************* Spearman's rank cross-correlation matrix SMP EDITION OF ALGLIB: ! This function can utilize multicore capabilities of your system. In ! order to do this you have to call version with "smp_" prefix, which ! indicates that multicore code will be used. ! ! This note is given for users of SMP edition; if you use GPL edition, ! or commercial edition of ALGLIB without SMP support, you still will ! be able to call smp-version of this function, but all computations ! will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. ! ! You should remember that starting/stopping worker thread always have ! non-zero cost. Although multicore version is pretty efficient on ! large problems, we do not recommend you to use it on small problems - ! with correlation matrices smaller than 128*128. INPUT PARAMETERS: X - array[N,M1], sample matrix: * J-th column corresponds to J-th variable * I-th row corresponds to I-th observation Y - array[N,M2], sample matrix: * J-th column corresponds to J-th variable * I-th row corresponds to I-th observation N - N>=0, number of observations: * if given, only leading N rows of X/Y are used * if not given, automatically determined from input sizes M1 - M1>0, number of variables in X: * if given, only leading M1 columns of X are used * if not given, automatically determined from input size M2 - M2>0, number of variables in Y: * if given, only leading M2 columns of Y are used * if not given, automatically determined from input size OUTPUT PARAMETERS: C - array[M1,M2], cross-correlation matrix (zero if N=0 or N=1) -- ALGLIB -- Copyright 28.10.2010 by Bochkanov Sergey *************************************************************************/
void alglib::spearmancorrm2( real_2d_array x, real_2d_array y, real_2d_array& c); void alglib::spearmancorrm2( real_2d_array x, real_2d_array y, ae_int_t n, ae_int_t m1, ae_int_t m2, real_2d_array& c); void alglib::smp_spearmancorrm2( real_2d_array x, real_2d_array y, real_2d_array& c); void alglib::smp_spearmancorrm2( real_2d_array x, real_2d_array y, ae_int_t n, ae_int_t m1, ae_int_t m2, real_2d_array& c);

Examples:   [1]  

/************************************************************************* Obsolete function, we recommend to use SpearmanCorr2(). -- ALGLIB -- Copyright 09.04.2007 by Bochkanov Sergey *************************************************************************/
double alglib::spearmanrankcorrelation( real_1d_array x, real_1d_array y, ae_int_t n);
#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "statistics.h"

using namespace alglib;


int main(int argc, char **argv)
{
    real_1d_array x = "[0,1,4,9,16,25,36,49,64,81]";
    double mean;
    double variance;
    double skewness;
    double kurtosis;
    double adev;
    double p;
    double v;

    //
    // Here we demonstrate calculation of sample moments
    // (mean, variance, skewness, kurtosis)
    //
    samplemoments(x, mean, variance, skewness, kurtosis);
    printf("%.1f\n", double(mean)); // EXPECTED: 28.5
    printf("%.1f\n", double(variance)); // EXPECTED: 801.1667
    printf("%.1f\n", double(skewness)); // EXPECTED: 0.5751
    printf("%.1f\n", double(kurtosis)); // EXPECTED: -1.2666

    //
    // Average deviation
    //
    sampleadev(x, adev);
    printf("%.1f\n", double(adev)); // EXPECTED: 23.2

    //
    // Median and percentile
    //
    samplemedian(x, v);
    printf("%.1f\n", double(v)); // EXPECTED: 20.5
    p = 0.5;
    samplepercentile(x, p, v);
    printf("%.1f\n", double(v)); // EXPECTED: 20.5
    return 0;
}


#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "statistics.h"

using namespace alglib;


int main(int argc, char **argv)
{
    //
    // We have two samples - x and y, and want to measure dependency between them
    //
    real_1d_array x = "[0,1,4,9,16,25,36,49,64,81]";
    real_1d_array y = "[0,1,2,3,4,5,6,7,8,9]";
    double v;

    //
    // Three dependency measures are calculated:
    // * covariation
    // * Pearson correlation
    // * Spearman rank correlation
    //
    v = cov2(x, y);
    printf("%.2f\n", double(v)); // EXPECTED: 82.5
    v = pearsoncorr2(x, y);
    printf("%.2f\n", double(v)); // EXPECTED: 0.9627
    v = spearmancorr2(x, y);
    printf("%.2f\n", double(v)); // EXPECTED: 1.000
    return 0;
}


#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "statistics.h"

using namespace alglib;


int main(int argc, char **argv)
{
    //
    // X is a sample matrix:
    // * I-th row corresponds to I-th observation
    // * J-th column corresponds to J-th variable
    //
    real_2d_array x = "[[1,0,1],[1,1,0],[-1,1,0],[-2,-1,1],[-1,0,9]]";
    real_2d_array c;

    //
    // Three dependency measures are calculated:
    // * covariation
    // * Pearson correlation
    // * Spearman rank correlation
    //
    // Result is stored into C, with C[i,j] equal to correlation
    // (covariance) between I-th and J-th variables of X.
    //
    covm(x, c);
    printf("%s\n", c.tostring(1).c_str()); // EXPECTED: [[1.80,0.60,-1.40],[0.60,0.70,-0.80],[-1.40,-0.80,14.70]]
    pearsoncorrm(x, c);
    printf("%s\n", c.tostring(1).c_str()); // EXPECTED: [[1.000,0.535,-0.272],[0.535,1.000,-0.249],[-0.272,-0.249,1.000]]
    spearmancorrm(x, c);
    printf("%s\n", c.tostring(1).c_str()); // EXPECTED: [[1.000,0.556,-0.306],[0.556,1.000,-0.750],[-0.306,-0.750,1.000]]
    return 0;
}


#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "statistics.h"

using namespace alglib;


int main(int argc, char **argv)
{
    //
    // X and Y are sample matrices:
    // * I-th row corresponds to I-th observation
    // * J-th column corresponds to J-th variable
    //
    real_2d_array x = "[[1,0,1],[1,1,0],[-1,1,0],[-2,-1,1],[-1,0,9]]";
    real_2d_array y = "[[2,3],[2,1],[-1,6],[-9,9],[7,1]]";
    real_2d_array c;

    //
    // Three dependency measures are calculated:
    // * covariation
    // * Pearson correlation
    // * Spearman rank correlation
    //
    // Result is stored into C, with C[i,j] equal to correlation
    // (covariance) between I-th variable of X and J-th variable of Y.
    //
    covm2(x, y, c);
    printf("%s\n", c.tostring(1).c_str()); // EXPECTED: [[4.100,-3.250],[2.450,-1.500],[13.450,-5.750]]
    pearsoncorrm2(x, y, c);
    printf("%s\n", c.tostring(1).c_str()); // EXPECTED: [[0.519,-0.699],[0.497,-0.518],[0.596,-0.433]]
    spearmancorrm2(x, y, c);
    printf("%s\n", c.tostring(1).c_str()); // EXPECTED: [[0.541,-0.649],[0.216,-0.433],[0.433,-0.135]]
    return 0;
}


dsoptimalsplit2
dsoptimalsplit2fast
/************************************************************************* Optimal binary classification. Algorithm finds the optimal (i.e. with minimal cross-entropy) binary partition. Internal subroutine. INPUT PARAMETERS: A - array[0..N-1], variable C - array[0..N-1], class numbers (0 or 1). N - array size OUTPUT PARAMETERS: Info - completion code: * -3, all values of A[] are the same (partition is impossible) * -2, one of C[] is incorrect (<0, >1) * -1, incorrect parameters were passed (N<=0). * 1, OK Threshold- partition boundary. Left part contains values which are strictly less than Threshold. Right part contains values which are greater than or equal to Threshold. PAL, PBL- probabilities P(0|v<Threshold) and P(1|v<Threshold) PAR, PBR- probabilities P(0|v>=Threshold) and P(1|v>=Threshold) CVE - cross-validation estimate of cross-entropy -- ALGLIB -- Copyright 22.05.2008 by Bochkanov Sergey *************************************************************************/
void alglib::dsoptimalsplit2( real_1d_array a, integer_1d_array c, ae_int_t n, ae_int_t& info, double& threshold, double& pal, double& pbl, double& par, double& pbr, double& cve);
/************************************************************************* Optimal partition, internal subroutine. Fast version. Accepts: A array[0..N-1] array of attributes C array[0..N-1] array of class labels TiesBuf array[0..N] temporaries (ties) CntBuf array[0..2*NC-1] temporaries (counts) Alpha centering factor (0<=alpha<=1, recommended value - 0.05) BufR array[0..N-1] temporaries BufI array[0..N-1] temporaries Output: Info error code (">0"=OK, "<0"=bad) RMS training set RMS error CVRMS leave-one-out RMS error Note: content of all arrays is changed by subroutine; it doesn't allocate temporaries. -- ALGLIB -- Copyright 11.12.2008 by Bochkanov Sergey *************************************************************************/
void alglib::dsoptimalsplit2fast( real_1d_array& a, integer_1d_array& c, integer_1d_array& tiesbuf, integer_1d_array& cntbuf, real_1d_array& bufr, integer_1d_array& bufi, ae_int_t n, ae_int_t nc, double alpha, ae_int_t& info, double& threshold, double& rms, double& cvrms);
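
A hedged usage sketch for dsoptimalsplit2() (not taken from the manual's example set; it assumes the dataanalysis.h package header, since these dataset-splitting routines belong to the DataAnalysis package):

#include "stdafx.h"
#include <stdio.h>
#include "dataanalysis.h"

using namespace alglib;

int main(int argc, char **argv)
{
    //
    // Toy problem: attribute A with binary labels C; small values are
    // mostly class 0, large values are class 1.
    //
    real_1d_array a = "[0.1,0.2,0.3,0.4,1.1,1.2,1.3,1.4]";
    integer_1d_array c = "[0,0,0,1,1,1,1,1]";
    ae_int_t info;
    double threshold, pal, pbl, par, pbr, cve;

    dsoptimalsplit2(a, c, 8, info, threshold, pal, pbl, par, pbr, cve);
    printf("%d\n", int(info));     // 1 = OK
    printf("%.2f\n", threshold);   // boundary: left part is strictly below it
    printf("%.2f\n", cve);         // cross-validation estimate of cross-entropy
    return 0;
}
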
rmatrixbdsvd
/************************************************************************* Singular value decomposition of a bidiagonal matrix (extended algorithm) COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes one important improvement of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Generally, commercial ALGLIB is several times faster than open-source ! generic C edition, and many times faster than open-source C# edition. ! ! Multithreaded acceleration is NOT supported for this function. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. The algorithm performs the singular value decomposition of a bidiagonal matrix B (upper or lower) representing it as B = Q*S*P^T, where Q and P - orthogonal matrices, S - diagonal matrix with non-negative elements on the main diagonal, in descending order. The algorithm finds singular values. In addition, the algorithm can calculate matrices Q and P (more precisely, not the matrices, but their product with given matrices U and VT - U*Q and (P^T)*VT)). Of course, matrices U and VT can be of any type, including identity. Furthermore, the algorithm can calculate Q'*C (this product is calculated more effectively than U*Q, because this calculation operates with rows instead of matrix columns). The feature of the algorithm is its ability to find all singular values including those which are arbitrarily close to 0 with relative accuracy close to machine precision. If the parameter IsFractionalAccuracyRequired is set to True, all singular values will have high relative accuracy close to machine precision. If the parameter is set to False, only the biggest singular value will have relative accuracy close to machine precision. The absolute error of other singular values is equal to the absolute error of the biggest singular value. Input parameters: D - main diagonal of matrix B. Array whose index ranges within [0..N-1]. E - superdiagonal (or subdiagonal) of matrix B. Array whose index ranges within [0..N-2]. N - size of matrix B. IsUpper - True, if the matrix is upper bidiagonal. IsFractionalAccuracyRequired - THIS PARAMETER IS IGNORED SINCE ALGLIB 3.5.0 SINGULAR VALUES ARE ALWAYS SEARCHED WITH HIGH ACCURACY. U - matrix to be multiplied by Q. Array whose indexes range within [0..NRU-1, 0..N-1]. The matrix can be bigger, in that case only the submatrix [0..NRU-1, 0..N-1] will be multiplied by Q. NRU - number of rows in matrix U. C - matrix to be multiplied by Q'. Array whose indexes range within [0..N-1, 0..NCC-1]. The matrix can be bigger, in that case only the submatrix [0..N-1, 0..NCC-1] will be multiplied by Q'. NCC - number of columns in matrix C. VT - matrix to be multiplied by P^T. Array whose indexes range within [0..N-1, 0..NCVT-1]. The matrix can be bigger, in that case only the submatrix [0..N-1, 0..NCVT-1] will be multiplied by P^T. NCVT - number of columns in matrix VT. Output parameters: D - singular values of matrix B in descending order. U - if NRU>0, contains matrix U*Q. VT - if NCVT>0, contains matrix (P^T)*VT. C - if NCC>0, contains matrix Q'*C. Result: True, if the algorithm has converged. 
False, if the algorithm hasn't converged (rare case). NOTE: multiplication U*Q is performed by means of transposition to internal buffer, multiplication and backward transposition. It helps to avoid costly columnwise operations and speeds up the algorithm. Additional information: The type of convergence is controlled by the internal parameter TOL. If the parameter is greater than 0, the singular values will have relative accuracy TOL. If TOL<0, the singular values will have absolute accuracy ABS(TOL)*norm(B). By default, |TOL| falls within the range of 10*Epsilon and 100*Epsilon, where Epsilon is the machine precision. It is not recommended to use TOL less than 10*Epsilon since this will considerably slow down the algorithm and may not decrease the error. History: * 31 March, 2007. changed MAXITR from 6 to 12. -- LAPACK routine (version 3.0) -- Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., Courant Institute, Argonne National Lab, and Rice University October 31, 1999. *************************************************************************/
bool alglib::rmatrixbdsvd( real_1d_array& d, real_1d_array e, ae_int_t n, bool isupper, bool isfractionalaccuracyrequired, real_2d_array& u, ae_int_t nru, real_2d_array& c, ae_int_t ncc, real_2d_array& vt, ae_int_t ncvt);
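
A small sketch of rmatrixbdsvd() for a 2x2 upper bidiagonal matrix (not from the manual; it assumes the linalg.h package header and that empty U/C/VT matrices are acceptable when NRU=NCC=NCVT=0, since those matrices are not touched in that case):

#include "stdafx.h"
#include <stdio.h>
#include "linalg.h"

using namespace alglib;

int main(int argc, char **argv)
{
    //
    // B = [[3,1],[0,2]]: D is the main diagonal, E the superdiagonal.
    // Only singular values are requested (NRU=NCC=NCVT=0).
    //
    real_1d_array d = "[3,2]";
    real_1d_array e = "[1]";
    real_2d_array u, c, vt;

    bool converged = rmatrixbdsvd(d, e, 2, true, false, u, 0, c, 0, vt, 0);
    printf("%d\n", converged ? 1 : 0);
    printf("%s\n", d.tostring(4).c_str()); // descending singular values, approx [3.2566,1.8424]
    return 0;
}
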
besseli0
besseli1
besselj0
besselj1
besseljn
besselk0
besselk1
besselkn
bessely0
bessely1
besselyn
/************************************************************************* Modified Bessel function of order zero Returns modified Bessel function of order zero of the argument. The function is defined as i0(x) = j0( ix ). The range is partitioned into the two intervals [0,8] and (8, infinity). Chebyshev polynomial expansions are employed in each interval. ACCURACY: Relative error: arithmetic domain # trials peak rms IEEE 0,30 30000 5.8e-16 1.4e-16 Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 2000 by Stephen L. Moshier *************************************************************************/
double alglib::besseli0(double x);
/************************************************************************* Modified Bessel function of order one Returns modified Bessel function of order one of the argument. The function is defined as i1(x) = -i j1( ix ). The range is partitioned into the two intervals [0,8] and (8, infinity). Chebyshev polynomial expansions are employed in each interval. ACCURACY: Relative error: arithmetic domain # trials peak rms IEEE 0, 30 30000 1.9e-15 2.1e-16 Cephes Math Library Release 2.8: June, 2000 Copyright 1985, 1987, 2000 by Stephen L. Moshier *************************************************************************/
double alglib::besseli1(double x);
/************************************************************************* Bessel function of order zero Returns Bessel function of order zero of the argument. The domain is divided into the intervals [0, 5] and (5, infinity). In the first interval the following rational approximation is used: (w - r1^2) * (w - r2^2) * P3(w) / Q8(w), where w = x^2 and r1, r2 are the two zeros of the function. In the second interval, the Hankel asymptotic expansion is employed with two rational functions of degree 6/6 and 7/7. ACCURACY: Absolute error: arithmetic domain # trials peak rms IEEE 0, 30 60000 4.2e-16 1.1e-16 Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 1989, 2000 by Stephen L. Moshier *************************************************************************/
double alglib::besselj0(double x);
/************************************************************************* Bessel function of order one Returns Bessel function of order one of the argument. The domain is divided into the intervals [0, 8] and (8, infinity). In the first interval a 24 term Chebyshev expansion is used. In the second, the asymptotic trigonometric representation is employed using two rational functions of degree 5/5. ACCURACY: Absolute error: arithmetic domain # trials peak rms IEEE 0, 30 30000 2.6e-16 1.1e-16 Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 1989, 2000 by Stephen L. Moshier *************************************************************************/
double alglib::besselj1(double x);
/************************************************************************* Bessel function of integer order Returns Bessel function of order n, where n is a (possibly negative) integer. The ratio of jn(x) to j0(x) is computed by backward recurrence. First the ratio jn/jn-1 is found by a continued fraction expansion. Then the recurrence relating successive orders is applied until j0 or j1 is reached. If n = 0 or 1 the routine for j0 or j1 is called directly. ACCURACY: Absolute error: arithmetic range # trials peak rms IEEE 0, 30 5000 4.4e-16 7.9e-17 Not suitable for large n or x. Use jv() (fractional order) instead. Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 2000 by Stephen L. Moshier *************************************************************************/
double alglib::besseljn(ae_int_t n, double x);
/************************************************************************* Modified Bessel function, second kind, order zero Returns modified Bessel function of the second kind of order zero of the argument. The range is partitioned into the two intervals [0,8] and (8, infinity). Chebyshev polynomial expansions are employed in each interval. ACCURACY: Tested at 2000 random points between 0 and 8. Peak absolute error (relative when K0 > 1) was 1.46e-14; rms, 4.26e-15. Relative error: arithmetic domain # trials peak rms IEEE 0, 30 30000 1.2e-15 1.6e-16 Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 2000 by Stephen L. Moshier *************************************************************************/
double alglib::besselk0(double x);
/************************************************************************* Modified Bessel function, second kind, order one Computes the modified Bessel function of the second kind of order one of the argument. The range is partitioned into the two intervals [0,2] and (2, infinity). Chebyshev polynomial expansions are employed in each interval. ACCURACY: Relative error: arithmetic domain # trials peak rms IEEE 0, 30 30000 1.2e-15 1.6e-16 Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 2000 by Stephen L. Moshier *************************************************************************/
double alglib::besselk1(double x);
/************************************************************************* Modified Bessel function, second kind, integer order Returns modified Bessel function of the second kind of order n of the argument. The range is partitioned into the two intervals [0,9.55] and (9.55, infinity). An ascending power series is used in the low range, and an asymptotic expansion in the high range. ACCURACY: Relative error: arithmetic domain # trials peak rms IEEE 0,30 90000 1.8e-8 3.0e-10 Error is high only near the crossover point x = 9.55 between the two expansions used. Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 1988, 2000 by Stephen L. Moshier *************************************************************************/
double alglib::besselkn(ae_int_t nn, double x);
/************************************************************************* Bessel function of the second kind, order zero Returns Bessel function of the second kind, of order zero, of the argument. The domain is divided into the intervals [0, 5] and (5, infinity). In the first interval a rational approximation R(x) is employed to compute y0(x) = R(x) + 2 * log(x) * j0(x) / PI. Thus a call to j0() is required. In the second interval, the Hankel asymptotic expansion is employed with two rational functions of degree 6/6 and 7/7. ACCURACY: Absolute error, when y0(x) < 1; else relative error: arithmetic domain # trials peak rms IEEE 0, 30 30000 1.3e-15 1.6e-16 Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 1989, 2000 by Stephen L. Moshier *************************************************************************/
double alglib::bessely0(double x);
/************************************************************************* Bessel function of second kind of order one Returns Bessel function of the second kind of order one of the argument. The domain is divided into the intervals [0, 8] and (8, infinity). In the first interval a 25 term Chebyshev expansion is used, and a call to j1() is required. In the second, the asymptotic trigonometric representation is employed using two rational functions of degree 5/5. ACCURACY: Absolute error: arithmetic domain # trials peak rms IEEE 0, 30 30000 1.0e-15 1.3e-16 Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 1989, 2000 by Stephen L. Moshier *************************************************************************/
double alglib::bessely1(double x);
/************************************************************************* Bessel function of second kind of integer order Returns Bessel function of order n, where n is a (possibly negative) integer. The function is evaluated by forward recurrence on n, starting with values computed by the routines y0() and y1(). If n = 0 or 1 the routine for y0 or y1 is called directly. ACCURACY: Absolute error, except relative when y > 1: arithmetic domain # trials peak rms IEEE 0, 30 30000 3.4e-15 4.3e-16 Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 2000 by Stephen L. Moshier *************************************************************************/
double alglib::besselyn(ae_int_t n, double x);
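
The Bessel routines above are plain scalar functions; here is a short sketch (assuming the specialfunctions.h package header) with reference values rounded to four digits:

#include "stdafx.h"
#include <stdio.h>
#include "specialfunctions.h"

using namespace alglib;

int main(int argc, char **argv)
{
    double x = 1.0;
    printf("%.4f\n", besselj0(x));    // J0(1)  ~  0.7652
    printf("%.4f\n", besselj1(x));    // J1(1)  ~  0.4401
    printf("%.4f\n", besseljn(3, x)); // J3(1)  ~  0.0196
    printf("%.4f\n", besseli0(x));    // I0(1)  ~  1.2661
    printf("%.4f\n", besselk0(x));    // K0(1)  ~  0.4210
    printf("%.4f\n", bessely0(x));    // Y0(1)  ~  0.0883
    return 0;
}
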
beta
/************************************************************************* Beta function beta(a,b) = Gamma(a) * Gamma(b) / Gamma(a+b) For large arguments the logarithm of the function is evaluated using lgam(), then exponentiated. ACCURACY: Relative error: arithmetic domain # trials peak rms IEEE 0,30 30000 8.1e-14 1.1e-14 Cephes Math Library Release 2.0: April, 1987 Copyright 1984, 1987 by Stephen L. Moshier *************************************************************************/
double alglib::beta(double a, double b);
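
A one-line check of beta() against the factorial identity (sketch; assumes specialfunctions.h):

#include "stdafx.h"
#include <stdio.h>
#include "specialfunctions.h"

using namespace alglib;

int main(int argc, char **argv)
{
    // beta(2,3) = Gamma(2)*Gamma(3)/Gamma(5) = 1*2/24 = 1/12
    printf("%.6f\n", beta(2.0, 3.0)); // ~ 0.083333
    return 0;
}
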
binomialcdistribution
binomialdistribution
invbinomialdistribution
/************************************************************************* Complemented binomial distribution Returns the sum of the terms k+1 through n of the Binomial probability density: sum for j=k+1..n of C(n,j) * p^j * (1-p)^(n-j) The terms are not summed directly; instead the incomplete beta integral is employed, according to the formula y = bdtrc( k, n, p ) = incbet( k+1, n-k, p ). The arguments must be positive, with p ranging from 0 to 1. ACCURACY: Tested at random points (a,b,p). Relative error: arithmetic domain # trials peak rms For p between 0.001 and 1: IEEE 0,100 100000 6.7e-15 8.2e-16 For p between 0 and .001: IEEE 0,100 100000 1.5e-13 2.7e-15 Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 1995, 2000 by Stephen L. Moshier *************************************************************************/
double alglib::binomialcdistribution(ae_int_t k, ae_int_t n, double p);
/************************************************************************* Binomial distribution Returns the sum of the terms 0 through k of the Binomial probability density: sum for j=0..k of C(n,j) * p^j * (1-p)^(n-j) The terms are not summed directly; instead the incomplete beta integral is employed, according to the formula y = bdtr( k, n, p ) = incbet( n-k, k+1, 1-p ). The arguments must be positive, with p ranging from 0 to 1. ACCURACY: Tested at random points (a,b,p), with p between 0 and 1. Relative error: arithmetic domain # trials peak rms For p between 0.001 and 1: IEEE 0,100 100000 4.3e-15 2.6e-16 Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 1995, 2000 by Stephen L. Moshier *************************************************************************/
double alglib::binomialdistribution(ae_int_t k, ae_int_t n, double p);
/************************************************************************* Inverse binomial distribution Finds the event probability p such that the sum of the terms 0 through k of the Binomial probability density is equal to the given cumulative probability y. This is accomplished using the inverse beta integral function and the relation 1 - p = incbi( n-k, k+1, y ). ACCURACY: Tested at random points (a,b,p). a,b Relative error: arithmetic domain # trials peak rms For p between 0.001 and 1: IEEE 0,100 100000 2.3e-14 6.4e-16 IEEE 0,10000 100000 6.6e-12 1.2e-13 For p between 10^-6 and 0.001: IEEE 0,100 100000 2.0e-12 1.3e-14 IEEE 0,10000 100000 1.5e-12 3.2e-14 Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 1995, 2000 by Stephen L. Moshier *************************************************************************/
double alglib::invbinomialdistribution(ae_int_t k, ae_int_t n, double y);
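
A sketch relating the three binomial routines (assumes specialfunctions.h): the distribution and its complement sum to one, and the inverse recovers the event probability p from the cumulative value.

#include "stdafx.h"
#include <stdio.h>
#include "specialfunctions.h"

using namespace alglib;

int main(int argc, char **argv)
{
    //
    // X ~ Binomial(n=10, p=0.5): P(X<=3) = 176/1024 = 0.171875
    //
    double cdf  = binomialdistribution(3, 10, 0.5);
    double ccdf = binomialcdistribution(3, 10, 0.5);
    printf("%.6f\n", cdf);                                  // ~ 0.171875
    printf("%.6f\n", cdf + ccdf);                           // ~ 1.000000
    printf("%.6f\n", invbinomialdistribution(3, 10, cdf));  // ~ 0.500000
    return 0;
}
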
chebyshevcalculate
chebyshevcoefficients
chebyshevsum
fromchebyshev
/************************************************************************* Calculation of the value of the Chebyshev polynomials of the first and second kinds. Parameters: r - polynomial kind, either 1 or 2. n - degree, n>=0 x - argument, -1 <= x <= 1 Result: the value of the Chebyshev polynomial at x *************************************************************************/
double alglib::chebyshevcalculate(ae_int_t r, ae_int_t n, double x);
/************************************************************************* Representation of Tn as C[0] + C[1]*X + ... + C[N]*X^N Input parameters: N - polynomial degree, n>=0 Output parameters: C - coefficients *************************************************************************/
void alglib::chebyshevcoefficients(ae_int_t n, real_1d_array& c);
/************************************************************************* Summation of Chebyshev polynomials using Clenshaw's recurrence formula. This routine calculates c[0]*T0(x) + c[1]*T1(x) + ... + c[N]*TN(x) or c[0]*U0(x) + c[1]*U1(x) + ... + c[N]*UN(x) depending on the R. Parameters: r - polynomial kind, either 1 or 2. n - degree, n>=0 x - argument Result: the value of the Chebyshev polynomial at x *************************************************************************/
double alglib::chebyshevsum( real_1d_array c, ae_int_t r, ae_int_t n, double x);
/************************************************************************* Conversion of a series of Chebyshev polynomials to a power series. Represents A[0]*T0(x) + A[1]*T1(x) + ... + A[N]*Tn(x) as B[0] + B[1]*X + ... + B[N]*X^N. Input parameters: A - Chebyshev series coefficients N - degree, N>=0 Output parameters B - power series coefficients *************************************************************************/
void alglib::fromchebyshev(real_1d_array a, ae_int_t n, real_1d_array& b);
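
A sketch of the Chebyshev routines (assumes the specialfunctions.h package header); T3(x) = 4x^3 - 3x is used as the reference polynomial.

#include "stdafx.h"
#include <stdio.h>
#include "specialfunctions.h"

using namespace alglib;

int main(int argc, char **argv)
{
    //
    // T3(0.5) = 4*0.125 - 3*0.5 = -1
    //
    printf("%.4f\n", chebyshevcalculate(1, 3, 0.5)); // ~ -1.0000

    //
    // power-series coefficients of T3: [0,-3,0,4]
    //
    real_1d_array c;
    chebyshevcoefficients(3, c);
    printf("%s\n", c.tostring(0).c_str());

    //
    // Clenshaw summation: 2*T0(0.5) + 1*T1(0.5) = 2 + 0.5 = 2.5
    //
    real_1d_array a = "[2,1]";
    printf("%.4f\n", chebyshevsum(a, 1, 1, 0.5)); // ~ 2.5000
    return 0;
}
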
chisquarecdistribution
chisquaredistribution
invchisquaredistribution
/************************************************************************* Complemented Chi-square distribution Returns the area under the right hand tail (from x to infinity) of the Chi square probability density function with v degrees of freedom: P( x | v ) = 1/(2^(v/2) * Gamma(v/2)) * integral from x to infinity of t^(v/2-1) * exp(-t/2) dt where x is the Chi-square variable. The incomplete gamma integral is used, according to the formula y = chdtr( v, x ) = igamc( v/2.0, x/2.0 ). The arguments must both be positive. ACCURACY: See incomplete gamma function Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 2000 by Stephen L. Moshier *************************************************************************/
double alglib::chisquarecdistribution(double v, double x);
/************************************************************************* Chi-square distribution Returns the area under the left hand tail (from 0 to x) of the Chi square probability density function with v degrees of freedom: P( x | v ) = 1/(2^(v/2) * Gamma(v/2)) * integral from 0 to x of t^(v/2-1) * exp(-t/2) dt where x is the Chi-square variable. The incomplete gamma integral is used, according to the formula y = chdtr( v, x ) = igam( v/2.0, x/2.0 ). The arguments must both be positive. ACCURACY: See incomplete gamma function Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 2000 by Stephen L. Moshier *************************************************************************/
double alglib::chisquaredistribution(double v, double x);
/************************************************************************* Inverse of complemented Chi-square distribution Finds the Chi-square argument x such that the integral from x to infinity of the Chi-square density is equal to the given cumulative probability y. This is accomplished using the inverse gamma integral function and the relation x/2 = igami( df/2, y ); ACCURACY: See inverse incomplete gamma function Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 2000 by Stephen L. Moshier *************************************************************************/
double alglib::invchisquaredistribution(double v, double y);
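
A sketch tying the chi-square routines together (assumes specialfunctions.h); for v=2 the CDF has the closed form 1-exp(-x/2), which makes the numbers easy to verify by hand.

#include "stdafx.h"
#include <stdio.h>
#include "specialfunctions.h"

using namespace alglib;

int main(int argc, char **argv)
{
    double v = 2.0, x = 2.0;
    double left  = chisquaredistribution(v, x);   // 1-exp(-1) ~ 0.6321
    double right = chisquarecdistribution(v, x);  // exp(-1)   ~ 0.3679
    printf("%.4f\n", left);
    printf("%.4f\n", right);
    //
    // the inverse works with the complemented (right tail) probability
    //
    printf("%.4f\n", invchisquaredistribution(v, right)); // ~ 2.0000
    return 0;
}
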
ahcreport
clusterizerstate
kmeansreport
clusterizercreate
clusterizergetdistances
clusterizergetkclusters
clusterizerrunahc
clusterizerrunkmeans
clusterizerseparatedbycorr
clusterizerseparatedbydist
clusterizersetahcalgo
clusterizersetdistances
clusterizersetkmeansinit
clusterizersetkmeanslimits
clusterizersetpoints
clst_ahc Simple hierarchical clusterization with Euclidean distance function
clst_distance Clusterization with different metric types
clst_kclusters Obtaining K top clusters from clusterization tree
clst_kmeans Simple k-means clusterization
clst_linkage Clusterization with different linkage types
/************************************************************************* This structure is used to store results of the agglomerative hierarchical clustering (AHC). Following information is returned: * TerminationType - completion code: * 1 for successful completion of algorithm * -5 inappropriate combination of clustering algorithm and distance function was used. As for now, it is possible only when Ward's method is called for dataset with non-Euclidean distance function. In case negative completion code is returned, other fields of report structure are invalid and should not be used. * NPoints contains number of points in the original dataset * Z contains information about merges performed (see below). Z contains indexes from the original (unsorted) dataset and it can be used when you need to know what points were merged. However, it is not convenient when you want to build a dendrograd (see below). * if you want to build dendrogram, you can use Z, but it is not good option, because Z contains indexes from unsorted dataset. Dendrogram built from such dataset is likely to have intersections. So, you have to reorder you points before building dendrogram. Permutation which reorders point is returned in P. Another representation of merges, which is more convenient for dendorgram construction, is returned in PM. * more information on format of Z, P and PM can be found below and in the examples from ALGLIB Reference Manual. FORMAL DESCRIPTION OF FIELDS: NPoints number of points Z array[NPoints-1,2], contains indexes of clusters linked in pairs to form clustering tree. I-th row corresponds to I-th merge: * Z[I,0] - index of the first cluster to merge * Z[I,1] - index of the second cluster to merge * Z[I,0]<Z[I,1] * clusters are numbered from 0 to 2*NPoints-2, with indexes from 0 to NPoints-1 corresponding to points of the original dataset, and indexes from NPoints to 2*NPoints-2 correspond to clusters generated by subsequent merges (I-th row of Z creates cluster with index NPoints+I). IMPORTANT: indexes in Z[] are indexes in the ORIGINAL, unsorted dataset. In addition to Z algorithm outputs permutation which rearranges points in such way that subsequent merges are performed on adjacent points (such order is needed if you want to build dendrogram). However, indexes in Z are related to original, unrearranged sequence of points. P array[NPoints], permutation which reorders points for dendrogram construction. P[i] contains index of the position where we should move I-th point of the original dataset in order to apply merges PZ/PM. PZ same as Z, but for permutation of points given by P. The only thing which changed are indexes of the original points; indexes of clusters remained same. MergeDist array[NPoints-1], contains distances between clusters being merged (MergeDist[i] correspond to merge stored in Z[i,...]): * CLINK, SLINK and average linkage algorithms report "raw", unmodified distance metric. * Ward's method reports weighted intra-cluster variance, which is equal to ||Ca-Cb||^2 * Sa*Sb/(Sa+Sb). Here A and B are clusters being merged, Ca is a center of A, Cb is a center of B, Sa is a size of A, Sb is a size of B. PM array[NPoints-1,6], another representation of merges, which is suited for dendrogram construction. It deals with rearranged points (permutation P is applied) and represents merges in a form which different from one used by Z. For each I from 0 to NPoints-2, I-th row of PM represents merge performed on two clusters C0 and C1. 
Here: * C0 contains points with indexes PM[I,0]...PM[I,1] * C1 contains points with indexes PM[I,2]...PM[I,3] * indexes stored in PM are given for dataset sorted according to permutation P * PM[I,1]=PM[I,2]-1 (only adjacent clusters are merged) * PM[I,0]<=PM[I,1], PM[I,2]<=PM[I,3], i.e. both clusters contain at least one point * heights of "subdendrograms" corresponding to C0/C1 are stored in PM[I,4] and PM[I,5]. Subdendrograms corresponding to single-point clusters have height=0. Dendrogram of the merge result has height H=max(H0,H1)+1. NOTE: there is one-to-one correspondence between merges described by Z and PM. I-th row of Z describes same merge of clusters as I-th row of PM, with "left" cluster from Z corresponding to the "left" one from PM. -- ALGLIB -- Copyright 10.07.2012 by Bochkanov Sergey *************************************************************************/
class ahcreport { ae_int_t terminationtype; ae_int_t npoints; integer_1d_array p; integer_2d_array z; integer_2d_array pz; integer_2d_array pm; real_1d_array mergedist; };
/************************************************************************* This structure is a clusterization engine. You should not try to access its fields directly. Use ALGLIB functions in order to work with this object. -- ALGLIB -- Copyright 10.07.2012 by Bochkanov Sergey *************************************************************************/
class clusterizerstate { };
/************************************************************************* This structure is used to store results of the k-means clustering algorithm. Following information is always returned: * NPoints contains number of points in the original dataset * TerminationType contains completion code, negative on failure, positive on success * K contains number of clusters For positive TerminationType we return: * NFeatures contains number of variables in the original dataset * C, which contains centers found by algorithm * CIdx, which maps points of the original dataset to clusters FORMAL DESCRIPTION OF FIELDS: NPoints number of points, >=0 NFeatures number of variables, >=1 TerminationType completion code: * -5 if distance type is anything different from Euclidean metric * -3 for degenerate dataset: a) less than K distinct points, b) K=0 for non-empty dataset. * +1 for successful completion K number of clusters C array[K,NFeatures], rows of the array store centers CIdx array[NPoints], which contains cluster indexes IterationsCount actual number of iterations performed by clusterizer. If algorithm performed more than one random restart, total number of iterations is returned. Energy merit function, "energy", sum of squared deviations from cluster centers -- ALGLIB -- Copyright 27.11.2012 by Bochkanov Sergey *************************************************************************/
class kmeansreport { ae_int_t npoints; ae_int_t nfeatures; ae_int_t terminationtype; ae_int_t iterationscount; double energy; ae_int_t k; real_2d_array c; integer_1d_array cidx; };
/************************************************************************* This function initializes clusterizer object. Newly initialized object is empty, i.e. it does not contain dataset. You should use it as follows: 1. creation 2. dataset is added with ClusterizerSetPoints() 3. additional parameters are set 4. clusterization is performed with one of the clustering functions -- ALGLIB -- Copyright 10.07.2012 by Bochkanov Sergey *************************************************************************/
void alglib::clusterizercreate(clusterizerstate& s);

Examples:   [1]  [2]  [3]  [4]  [5]  

/************************************************************************* This function returns distance matrix for dataset COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Agglomerative hierarchical clustering algorithm has two phases: ! distance matrix calculation and clustering itself. Only first phase ! (distance matrix calculation) is accelerated by Intel MKL and multi- ! threading. Thus, acceleration is significant only for medium or high- ! dimensional problems. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS: XY - array[NPoints,NFeatures], dataset NPoints - number of points, >=0 NFeatures- number of features, >=1 DistType- distance function: * 0 Chebyshev distance (L-inf norm) * 1 city block distance (L1 norm) * 2 Euclidean distance (L2 norm, non-squared) * 10 Pearson correlation: dist(a,b) = 1-corr(a,b) * 11 Absolute Pearson correlation: dist(a,b) = 1-|corr(a,b)| * 12 Uncentered Pearson correlation (cosine of the angle): dist(a,b) = a'*b/(|a|*|b|) * 13 Absolute uncentered Pearson correlation dist(a,b) = |a'*b|/(|a|*|b|) * 20 Spearman rank correlation: dist(a,b) = 1-rankcorr(a,b) * 21 Absolute Spearman rank correlation dist(a,b) = 1-|rankcorr(a,b)| OUTPUT PARAMETERS: D - array[NPoints,NPoints], distance matrix (full matrix is returned, with lower and upper triangles) NOTE: different distance functions have different performance penalty: * Euclidean or Pearson correlation distances are the fastest ones * Spearman correlation distance function is a bit slower * city block and Chebyshev distances are order of magnitude slower The reason behing difference in performance is that correlation-based distance functions are computed using optimized linear algebra kernels, while Chebyshev and city block distance functions are computed using simple nested loops with two branches at each iteration. -- ALGLIB -- Copyright 10.07.2012 by Bochkanov Sergey *************************************************************************/
void alglib::clusterizergetdistances( real_2d_array xy, ae_int_t npoints, ae_int_t nfeatures, ae_int_t disttype, real_2d_array& d); void alglib::smp_clusterizergetdistances( real_2d_array xy, ae_int_t npoints, ae_int_t nfeatures, ae_int_t disttype, real_2d_array& d);
/************************************************************************* This function takes as input clusterization report Rep, desired clusters count K, and builds top K clusters from hierarchical clusterization tree. It returns assignment of points to clusters (array of cluster indexes). INPUT PARAMETERS: Rep - report from ClusterizerRunAHC() performed on XY K - desired number of clusters, 1<=K<=NPoints. K can be zero only when NPoints=0. OUTPUT PARAMETERS: CIdx - array[NPoints], I-th element contains cluster index (from 0 to K-1) for I-th point of the dataset. CZ - array[K]. This array allows to convert cluster indexes returned by this function to indexes used by Rep.Z. J-th cluster returned by this function corresponds to CZ[J]-th cluster stored in Rep.Z/PZ/PM. It is guaranteed that CZ[I]<CZ[I+1]. NOTE: K clusters built by this subroutine are assumed to have no hierarchy. Although they were obtained by manipulation with top K nodes of dendrogram (i.e. hierarchical decomposition of dataset), this function does not return information about hierarchy. Each of the clusters stand on its own. NOTE: Cluster indexes returned by this function does not correspond to indexes returned in Rep.Z/PZ/PM. Either you work with hierarchical representation of the dataset (dendrogram), or you work with "flat" representation returned by this function. Each of representations has its own clusters indexing system (former uses [0, 2*NPoints-2]), while latter uses [0..K-1]), although it is possible to perform conversion from one system to another by means of CZ array, returned by this function, which allows you to convert indexes stored in CIdx to the numeration system used by Rep.Z. NOTE: this subroutine is optimized for moderate values of K. Say, for K=5 it will perform many times faster than for K=100. Its worst-case performance is O(N*K), although in average case it perform better (up to O(N*log(K))). -- ALGLIB -- Copyright 10.07.2012 by Bochkanov Sergey *************************************************************************/
void alglib::clusterizergetkclusters( ahcreport rep, ae_int_t k, integer_1d_array& cidx, integer_1d_array& cz);

Examples:   [1]  [2]  

/************************************************************************* This function performs agglomerative hierarchical clustering COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Agglomerative hierarchical clustering algorithm has two phases: ! distance matrix calculation and clustering itself. Only first phase ! (distance matrix calculation) is accelerated by Intel MKL and multi- ! threading. Thus, acceleration is significant only for medium or high- ! dimensional problems. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS: S - clusterizer state, initialized by ClusterizerCreate() OUTPUT PARAMETERS: Rep - clustering results; see description of AHCReport structure for more information. NOTE 1: hierarchical clustering algorithms require large amounts of memory. In particular, this implementation needs sizeof(double)*NPoints^2 bytes, which are used to store distance matrix. In case we work with user-supplied matrix, this amount is multiplied by 2 (we have to store original matrix and to work with its copy). For example, problem with 10000 points would require 800M of RAM, even when working in a 1-dimensional space. -- ALGLIB -- Copyright 10.07.2012 by Bochkanov Sergey *************************************************************************/
void alglib::clusterizerrunahc(clusterizerstate s, ahcreport& rep); void alglib::smp_clusterizerrunahc(clusterizerstate s, ahcreport& rep);

Examples:   [1]  [2]  [3]  [4]  [5]  

/************************************************************************* This function performs clustering by k-means++ algorithm. You may change algorithm properties by calling: * ClusterizerSetKMeansLimits() to change number of restarts or iterations * ClusterizerSetKMeansInit() to change initialization algorithm By default, one restart and unlimited number of iterations are used. Initialization algorithm is chosen automatically. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function: ! * multicore support (can be used from C# and C++) ! * access to high-performance C++ core (actual for C# users) ! ! K-means clustering algorithm has two phases: selection of initial ! centers and clustering itself. ALGLIB parallelizes both phases. ! Parallel version is optimized for the following scenario: medium or ! high-dimensional problem (20 or more dimensions) with large number of ! points and clusters. However, some speed-up can be obtained even when ! assumptions above are violated. ! ! As for native-vs-managed comparison, working with native core brings ! 30-40% improvement in speed over pure C# version of ALGLIB. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS: S - clusterizer state, initialized by ClusterizerCreate() K - number of clusters, K>=0. K can be zero only when algorithm is called for empty dataset, in this case completion code is set to success (+1). If K=0 and dataset size is non-zero, we can not meaningfully assign points to some center (there are no centers because K=0) and return -3 as completion code (failure). OUTPUT PARAMETERS: Rep - clustering results; see description of KMeansReport structure for more information. NOTE 1: k-means clustering can be performed only for datasets with Euclidean distance function. Algorithm will return negative completion code in Rep.TerminationType in case dataset was added to clusterizer with DistType other than Euclidean (or dataset was specified by distance matrix instead of explicitly given points). -- ALGLIB -- Copyright 10.07.2012 by Bochkanov Sergey *************************************************************************/
void alglib::clusterizerrunkmeans( clusterizerstate s, ae_int_t k, kmeansreport& rep); void alglib::smp_clusterizerrunkmeans( clusterizerstate s, ae_int_t k, kmeansreport& rep);
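
A minimal k-means sketch (assumes the dataanalysis.h package header and the short form of ClusterizerSetPoints(), whose full description appears with the dataset-loading functions below):

#include "stdafx.h"
#include <stdio.h>
#include "dataanalysis.h"

using namespace alglib;

int main(int argc, char **argv)
{
    //
    // Two well separated groups of 2D points, K=2 clusters requested.
    //
    real_2d_array xy = "[[1,1],[1,2],[2,1],[10,10],[10,11],[11,10]]";
    clusterizerstate s;
    kmeansreport rep;

    clusterizercreate(s);
    clusterizersetpoints(s, xy, 2);   // disttype=2, Euclidean (required by k-means)
    clusterizerrunkmeans(s, 2, rep);
    printf("%d\n", int(rep.terminationtype));     // +1 on success
    printf("%s\n", rep.cidx.tostring().c_str());  // per-point cluster indexes (labels 0/1 may swap)
    return 0;
}
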
/************************************************************************* This function accepts AHC report Rep, desired maximum intercluster correlation and returns top clusters from hierarchical clusterization tree which are separated by correlation R or LOWER. It returns assignment of points to clusters (array of cluster indexes). There is one more function with similar name - ClusterizerSeparatedByDist, which returns clusters with intercluster distance equal to R or HIGHER (note: higher for distance, lower for correlation). INPUT PARAMETERS: Rep - report from ClusterizerRunAHC() performed on XY R - desired maximum intercluster correlation, -1<=R<=+1 OUTPUT PARAMETERS: K - number of clusters, 1<=K<=NPoints CIdx - array[NPoints], I-th element contains cluster index (from 0 to K-1) for I-th point of the dataset. CZ - array[K]. This array allows to convert cluster indexes returned by this function to indexes used by Rep.Z. J-th cluster returned by this function corresponds to CZ[J]-th cluster stored in Rep.Z/PZ/PM. It is guaranteed that CZ[I]<CZ[I+1]. NOTE: K clusters built by this subroutine are assumed to have no hierarchy. Although they were obtained by manipulation with top K nodes of dendrogram (i.e. hierarchical decomposition of dataset), this function does not return information about hierarchy. Each of the clusters stand on its own. NOTE: Cluster indexes returned by this function does not correspond to indexes returned in Rep.Z/PZ/PM. Either you work with hierarchical representation of the dataset (dendrogram), or you work with "flat" representation returned by this function. Each of representations has its own clusters indexing system (former uses [0, 2*NPoints-2]), while latter uses [0..K-1]), although it is possible to perform conversion from one system to another by means of CZ array, returned by this function, which allows you to convert indexes stored in CIdx to the numeration system used by Rep.Z. NOTE: this subroutine is optimized for moderate values of K. Say, for K=5 it will perform many times faster than for K=100. Its worst-case performance is O(N*K), although in average case it perform better (up to O(N*log(K))). -- ALGLIB -- Copyright 10.07.2012 by Bochkanov Sergey *************************************************************************/
void alglib::clusterizerseparatedbycorr( ahcreport rep, double r, ae_int_t& k, integer_1d_array& cidx, integer_1d_array& cz);
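
The following minimal sketch (not one of the official ALGLIB examples; the dataset and the 0.5 threshold are made up for illustration) shows how an AHC tree built on a correlation-type distance can be cut with clusterizerseparatedbycorr():

#include <stdio.h>
#include "dataanalysis.h"

using namespace alglib;

int main(int argc, char **argv)
{
    clusterizerstate s;
    ahcreport rep;
    real_2d_array xy = "[[1,2,1,2],[6,7,6,7],[7,6,7,6]]";
    integer_1d_array cidx;
    integer_1d_array cz;
    ae_int_t k;

    clusterizercreate(s);
    clusterizersetpoints(s, xy, 10);    // Pearson correlation distance: dist(a,b)=1-corr(a,b)
    clusterizerrunahc(s, rep);

    // Return top clusters separated by correlation 0.5 or lower.
    // P0 and P1 are perfectly correlated, so they should stay together,
    // while the anti-correlated P2 should form its own cluster.
    clusterizerseparatedbycorr(rep, 0.5, k, cidx, cz);
    printf("%d\n", int(k));
    printf("%s\n", cidx.tostring().c_str());
    return 0;
}
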
/************************************************************************* This function accepts AHC report Rep, desired minimum intercluster distance and returns top clusters from hierarchical clusterization tree which are separated by distance R or HIGHER. It returns assignment of points to clusters (array of cluster indexes). There is one more function with similar name - ClusterizerSeparatedByCorr, which returns clusters with intercluster correlation equal to R or LOWER (note: higher for distance, lower for correlation). INPUT PARAMETERS: Rep - report from ClusterizerRunAHC() performed on XY R - desired minimum intercluster distance, R>=0 OUTPUT PARAMETERS: K - number of clusters, 1<=K<=NPoints CIdx - array[NPoints], I-th element contains cluster index (from 0 to K-1) for I-th point of the dataset. CZ - array[K]. This array allows you to convert cluster indexes returned by this function to indexes used by Rep.Z. J-th cluster returned by this function corresponds to CZ[J]-th cluster stored in Rep.Z/PZ/PM. It is guaranteed that CZ[I]<CZ[I+1]. NOTE: K clusters built by this subroutine are assumed to have no hierarchy. Although they were obtained by manipulation with top K nodes of dendrogram (i.e. hierarchical decomposition of dataset), this function does not return information about hierarchy. Each of the clusters stands on its own. NOTE: Cluster indexes returned by this function do not correspond to indexes returned in Rep.Z/PZ/PM. Either you work with hierarchical representation of the dataset (dendrogram), or you work with "flat" representation returned by this function. Each representation has its own cluster indexing system (the former uses [0, 2*NPoints-2], while the latter uses [0..K-1]); it is possible to convert from one system to another by means of the CZ array returned by this function, which allows you to convert indexes stored in CIdx to the numeration system used by Rep.Z. NOTE: this subroutine is optimized for moderate values of K. Say, for K=5 it will perform many times faster than for K=100. Its worst-case performance is O(N*K), although in the average case it performs better (up to O(N*log(K))). -- ALGLIB -- Copyright 10.07.2012 by Bochkanov Sergey *************************************************************************/
void alglib::clusterizerseparatedbydist( ahcreport rep, double r, ae_int_t& k, integer_1d_array& cidx, integer_1d_array& cz);
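
A minimal sketch of the distance-based variant (values chosen for illustration only): five 1D points are clustered with Euclidean distance and the dendrogram is cut so that the returned clusters are separated by distance 5 or higher, which should leave {P0,P1}, {P2} and {P3,P4} as separate clusters.

#include <stdio.h>
#include "dataanalysis.h"

using namespace alglib;

int main(int argc, char **argv)
{
    clusterizerstate s;
    ahcreport rep;
    real_2d_array xy = "[[1],[3],[10],[16],[20]]";
    integer_1d_array cidx;
    integer_1d_array cz;
    ae_int_t k;

    clusterizercreate(s);
    clusterizersetpoints(s, xy, 2);     // Euclidean distance
    clusterizerrunahc(s, rep);

    // Keep only clusters whose intercluster distance is 5.0 or higher.
    clusterizerseparatedbydist(rep, 5.0, k, cidx, cz);
    printf("%d\n", int(k));
    printf("%s\n", cidx.tostring().c_str());
    return 0;
}
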
/************************************************************************* This function sets agglomerative hierarchical clustering algorithm INPUT PARAMETERS: S - clusterizer state, initialized by ClusterizerCreate() Algo - algorithm type: * 0 complete linkage (default algorithm) * 1 single linkage * 2 unweighted average linkage * 3 weighted average linkage * 4 Ward's method NOTE: Ward's method works correctly only with Euclidean distance, that's why algorithm will return negative termination code (failure) for any other distance type. It is possible, however, to use this method with user-supplied distance matrix. It is your responsibility to pass one which was calculated with Euclidean distance function. -- ALGLIB -- Copyright 10.07.2012 by Bochkanov Sergey *************************************************************************/
void alglib::clusterizersetahcalgo(clusterizerstate s, ae_int_t algo);

Examples:   [1]  [2]  [3]  [4]  [5]  

/************************************************************************* This function adds dataset given by distance matrix to the clusterizer structure. It is important that dataset is not given explicitly - only distance matrix is given. This function overrides all previous calls of ClusterizerSetPoints() or ClusterizerSetDistances(). INPUT PARAMETERS: S - clusterizer state, initialized by ClusterizerCreate() D - array[NPoints,NPoints], distance matrix given by its upper or lower triangle (main diagonal is ignored because its entries are expected to be zero). NPoints - number of points IsUpper - whether upper or lower triangle of D is given. NOTE 1: different clustering algorithms have different limitations: * agglomerative hierarchical clustering algorithms may be used with any kind of distance metric, including one which is given by distance matrix * k-means++ clustering algorithm may be used only with Euclidean distance function and explicitly given points - it can not be used with dataset given by distance matrix Thus, if you call this function, you will be unable to use k-means clustering algorithm to process your problem. -- ALGLIB -- Copyright 10.07.2012 by Bochkanov Sergey *************************************************************************/
void alglib::clusterizersetdistances( clusterizerstate s, real_2d_array d, bool isupper); void alglib::clusterizersetdistances( clusterizerstate s, real_2d_array d, ae_int_t npoints, bool isupper);

Examples:   [1]  

/************************************************************************* This function sets k-means initialization algorithm. Several different algorithms can be chosen, including k-means++. INPUT PARAMETERS: S - clusterizer state, initialized by ClusterizerCreate() InitAlgo- initialization algorithm: * 0 automatic selection ( different versions of ALGLIB may select different algorithms) * 1 random initialization * 2 k-means++ initialization (best quality of initial centers, but long non-parallelizable initialization phase with bad cache locality) * 3 "fast-greedy" algorithm with efficient, easy to parallelize initialization. Quality of initial centers is somewhat worse than that of k-means++. This algorithm is a default one in the current version of ALGLIB. *-1 "debug" algorithm which always selects first K rows of dataset; this algorithm is used for debug purposes only. Do not use it in the industrial code! -- ALGLIB -- Copyright 21.01.2015 by Bochkanov Sergey *************************************************************************/
void alglib::clusterizersetkmeansinit( clusterizerstate s, ae_int_t initalgo);
/************************************************************************* This function sets k-means properties: number of restarts and maximum number of iterations per one run. INPUT PARAMETERS: S - clusterizer state, initialized by ClusterizerCreate() Restarts- restarts count, >=1. k-means++ algorithm performs several restarts and chooses best set of centers (one with minimum squared distance). MaxIts - maximum number of k-means iterations performed during one run. >=0, zero value means that algorithm performs unlimited number of iterations. -- ALGLIB -- Copyright 10.07.2012 by Bochkanov Sergey *************************************************************************/
void alglib::clusterizersetkmeanslimits( clusterizerstate s, ae_int_t restarts, ae_int_t maxits);
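
A minimal sketch (dataset and parameter values are illustrative, not canonical) showing how the two k-means tuning calls above are combined with clusterizerrunkmeans():

#include <stdio.h>
#include "dataanalysis.h"

using namespace alglib;

int main(int argc, char **argv)
{
    clusterizerstate s;
    kmeansreport rep;
    real_2d_array xy = "[[1,1],[1,2],[4,1],[2,3],[4,1.5]]";

    clusterizercreate(s);
    clusterizersetpoints(s, xy, 2);         // k-means requires Euclidean distance
    clusterizersetkmeansinit(s, 2);         // force k-means++ initialization
    clusterizersetkmeanslimits(s, 3, 100);  // 3 restarts, at most 100 iterations per run
    clusterizerrunkmeans(s, 2, rep);

    printf("%d\n", int(rep.terminationtype));       // +1 on success
    printf("%s\n", rep.cidx.tostring().c_str());    // cluster index for each point
    return 0;
}
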
/************************************************************************* This function adds dataset to the clusterizer structure. This function overrides all previous calls of ClusterizerSetPoints() or ClusterizerSetDistances(). INPUT PARAMETERS: S - clusterizer state, initialized by ClusterizerCreate() XY - array[NPoints,NFeatures], dataset NPoints - number of points, >=0 NFeatures- number of features, >=1 DistType- distance function: * 0 Chebyshev distance (L-inf norm) * 1 city block distance (L1 norm) * 2 Euclidean distance (L2 norm), non-squared * 10 Pearson correlation: dist(a,b) = 1-corr(a,b) * 11 Absolute Pearson correlation: dist(a,b) = 1-|corr(a,b)| * 12 Uncentered Pearson correlation (cosine of the angle): dist(a,b) = a'*b/(|a|*|b|) * 13 Absolute uncentered Pearson correlation dist(a,b) = |a'*b|/(|a|*|b|) * 20 Spearman rank correlation: dist(a,b) = 1-rankcorr(a,b) * 21 Absolute Spearman rank correlation dist(a,b) = 1-|rankcorr(a,b)| NOTE 1: different distance functions have different performance penalty: * Euclidean or Pearson correlation distances are the fastest ones * Spearman correlation distance function is a bit slower * city block and Chebyshev distances are an order of magnitude slower The reason behind the difference in performance is that correlation-based distance functions are computed using optimized linear algebra kernels, while Chebyshev and city block distance functions are computed using simple nested loops with two branches at each iteration. NOTE 2: different clustering algorithms have different limitations: * agglomerative hierarchical clustering algorithms may be used with any kind of distance metric * k-means++ clustering algorithm may be used only with Euclidean distance function Thus, list of specific clustering algorithms you may use depends on distance function you specify when you set your dataset. -- ALGLIB -- Copyright 10.07.2012 by Bochkanov Sergey *************************************************************************/
void alglib::clusterizersetpoints( clusterizerstate s, real_2d_array xy, ae_int_t disttype); void alglib::clusterizersetpoints( clusterizerstate s, real_2d_array xy, ae_int_t npoints, ae_int_t nfeatures, ae_int_t disttype);

Examples:   [1]  [2]  [3]  [4]  [5]  

#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "dataanalysis.h"

using namespace alglib;


int main(int argc, char **argv)
{
    //
    // The very simple clusterization example
    //
    // We have a set of points in 2D space:
    //     (P0,P1,P2,P3,P4) = ((1,1),(1,2),(4,1),(2,3),(4,1.5))
    //
    //  |
    //  |     P3
    //  |
    //  | P1          
    //  |             P4
    //  | P0          P2
    //  |-------------------------
    //
    // We want to perform Agglomerative Hierarchic Clusterization (AHC),
    // using complete linkage (default algorithm) and Euclidean distance
    // (default metric).
    //
    // In order to do that, we:
    // * create clusterizer with clusterizercreate()
    // * set points XY and metric (2=Euclidean) with clusterizersetpoints()
    // * run AHC algorithm with clusterizerrunahc
    //
    // You may see that clusterization itself is a minor part of the example,
    // most of which is dominated by comments :)
    //
    clusterizerstate s;
    ahcreport rep;
    real_2d_array xy = "[[1,1],[1,2],[4,1],[2,3],[4,1.5]]";

    clusterizercreate(s);
    clusterizersetpoints(s, xy, 2);
    clusterizerrunahc(s, rep);

    //
    // Now we've built our clusterization tree. Rep.z contains information which
    // is required to build dendrogram. I-th row of rep.z represents one merge
    // operation, with first cluster to merge having index rep.z[I,0] and second
    // one having index rep.z[I,1]. Merge result has index NPoints+I.
    //
    // Clusters with indexes less than NPoints are single-point initial clusters,
    // while ones with indexes from NPoints to 2*NPoints-2 are multi-point
    // clusters created during merges.
    //
    // In our example, Z=[[2,4], [0,1], [3,6], [5,7]]
    //
    // It means that:
    // * first, we merge C2=(P2) and C4=(P4),    and create C5=(P2,P4)
    // * then, we merge  C0=(P0) and C1=(P1),    and create C6=(P0,P1)
    // * then, we merge  C3=(P3) and C6=(P0,P1), and create C7=(P0,P1,P3)
    // * finally, we merge C5 and C7 and create C8=(P0,P1,P2,P3,P4)
    //
    // Thus, we have following dendrogram:
    //  
    //      ------8-----
    //      |          |
    //      |      ----7----
    //      |      |       |
    //   ---5---   |    ---6---
    //   |     |   |    |     |
    //   P2   P4   P3   P0   P1
    //
    printf("%s\n", rep.z.tostring().c_str()); // EXPECTED: [[2,4],[0,1],[3,6],[5,7]]

    //
    // We've built dendrogram above by reordering our dataset.
    //
    // Without such reordering it would be impossible to build dendrogram without
    // intersections. Luckily, ahcreport structure contains two additional fields
    // which help to build dendrogram from your data:
    // * rep.p, which contains permutation applied to dataset
    // * rep.pm, which contains another representation of merges 
    //
    // In our example we have:
    // * P=[3,4,0,2,1]
    // * PM=[[0,0,1,1,0,0],[3,3,4,4,0,0],[2,2,3,4,0,1],[0,1,2,4,1,2]]
    //
    // Permutation array P tells us that P0 should be moved to position 3,
    // P1 moved to position 4, P2 moved to position 0 and so on:
    //
    //   (P0 P1 P2 P3 P4) => (P2 P4 P3 P0 P1)
    //
    // Merges array PM tells us how to perform merges on the sorted dataset.
    // One row of PM corresponds to one merge operation, with first pair of
    // elements denoting first of the clusters to merge (start index, end
    // index) and next pair of elements denoting second of the clusters to
    // merge. Clusters being merged are always adjacent, with first one on
    // the left and second one on the right.
    //
    // For example, first row of PM tells us that clusters [0,0] and [1,1] are
    // merged (single-point clusters, with first one containing P2 and second
    // one containing P4). Third row of PM tells us that we merge one single-
    // point cluster [2,2] with one two-point cluster [3,4].
    //
    // There are two more elements in each row of PM. These are the helper
    // elements, which denote HEIGHT (not size) of left and right subdendrograms.
    // For example, according to PM, first two merges are performed on clusterization
    // trees of height 0, while next two merges are performed on 0-1 and 1-2
    // pairs of trees correspondingly.
    //
    printf("%s\n", rep.p.tostring().c_str()); // EXPECTED: [3,4,0,2,1]
    printf("%s\n", rep.pm.tostring().c_str()); // EXPECTED: [[0,0,1,1,0,0],[3,3,4,4,0,0],[2,2,3,4,0,1],[0,1,2,4,1,2]]
    return 0;
}


#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "dataanalysis.h"

using namespace alglib;


int main(int argc, char **argv)
{
    //
    // We have three points in 4D space:
    //     (P0,P1,P2) = ((1, 2, 1, 2), (6, 7, 6, 7), (7, 6, 7, 6))
    //
    // We want to try clustering them with different distance functions.
    // Distance function is chosen when we add dataset to the clusterizer.
    // We can choose several distance types - Euclidean, city block, Chebyshev,
    // several correlation measures or user-supplied distance matrix.
    //
    // Here we'll try three distances: Euclidean, Pearson correlation,
    // user-supplied distance matrix. Different distance functions lead
    // to different choices being made by algorithm during clustering.
    //
    clusterizerstate s;
    ahcreport rep;
    ae_int_t disttype;
    real_2d_array xy = "[[1, 2, 1, 2], [6, 7, 6, 7], [7, 6, 7, 6]]";
    clusterizercreate(s);

    // With Euclidean distance function (disttype=2) two closest points
    // are P1 and P2, thus:
    // * first, we merge P1 and P2 to form C3=[P1,P2]
    // * second, we merge P0 and C3 to form C4=[P0,P1,P2]
    disttype = 2;
    clusterizersetpoints(s, xy, disttype);
    clusterizerrunahc(s, rep);
    printf("%s\n", rep.z.tostring().c_str()); // EXPECTED: [[1,2],[0,3]]

    // With Pearson correlation distance function (disttype=10) situation
    // is different - distance between P0 and P1 is zero, thus:
    // * first, we merge P0 and P1 to form C3=[P0,P1]
    // * second, we merge P2 and C3 to form C4=[P0,P1,P2]
    disttype = 10;
    clusterizersetpoints(s, xy, disttype);
    clusterizerrunahc(s, rep);
    printf("%s\n", rep.z.tostring().c_str()); // EXPECTED: [[0,1],[2,3]]

    // Finally, we try clustering with user-supplied distance matrix:
    //     [ 0 3 1 ]
    // P = [ 3 0 3 ], where P[i,j] = dist(Pi,Pj)
    //     [ 1 3 0 ]
    //
    // * first, we merge P0 and P2 to form C3=[P0,P2]
    // * second, we merge P1 and C3 to form C4=[P0,P1,P2]
    real_2d_array d = "[[0,3,1],[3,0,3],[1,3,0]]";
    clusterizersetdistances(s, d, true);
    clusterizerrunahc(s, rep);
    printf("%s\n", rep.z.tostring().c_str()); // EXPECTED: [[0,2],[1,3]]
    return 0;
}


#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "dataanalysis.h"

using namespace alglib;


int main(int argc, char **argv)
{
    //
    // We have a set of points in 2D space:
    //     (P0,P1,P2,P3,P4) = ((1,1),(1,2),(4,1),(2,3),(4,1.5))
    //
    //  |
    //  |     P3
    //  |
    //  | P1          
    //  |             P4
    //  | P0          P2
    //  |-------------------------
    //
    // We perform Agglomerative Hierarchic Clusterization (AHC) and we want
    // to get top K clusters from clusterization tree for different K.
    //
    clusterizerstate s;
    ahcreport rep;
    real_2d_array xy = "[[1,1],[1,2],[4,1],[2,3],[4,1.5]]";
    integer_1d_array cidx;
    integer_1d_array cz;

    clusterizercreate(s);
    clusterizersetpoints(s, xy, 2);
    clusterizerrunahc(s, rep);

    // with K=5, every point is assigned to its own cluster:
    // C0=P0, C1=P1 and so on...
    clusterizergetkclusters(rep, 5, cidx, cz);
    printf("%s\n", cidx.tostring().c_str()); // EXPECTED: [0,1,2,3,4]

    // with K=1 we have one large cluster C0=[P0,P1,P2,P3,P4]
    clusterizergetkclusters(rep, 1, cidx, cz);
    printf("%s\n", cidx.tostring().c_str()); // EXPECTED: [0,0,0,0,0]

    // with K=3 we have three clusters C0=[P3], C1=[P2,P4], C2=[P0,P1]
    clusterizergetkclusters(rep, 3, cidx, cz);
    printf("%s\n", cidx.tostring().c_str()); // EXPECTED: [2,2,1,0,1]
    return 0;
}


#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "dataanalysis.h"

using namespace alglib;


int main(int argc, char **argv)
{
    //
    // The very simple clusterization example
    //
    // We have a set of points in 2D space:
    //     (P0,P1,P2,P3,P4) = ((1,1),(1,2),(4,1),(2,3),(4,1.5))
    //
    //  |
    //  |     P3
    //  |
    //  | P1          
    //  |             P4
    //  | P0          P2
    //  |-------------------------
    //
    // We want to perform k-means++ clustering with K=2.
    //
    // In order to do that, we:
    // * create clusterizer with clusterizercreate()
    // * set points XY and metric (must be Euclidean, disttype=2) with clusterizersetpoints()
    // * (optional) set number of restarts from random positions to 5
    // * run k-means algorithm with clusterizerrunkmeans()
    //
    // You may see that clusterization itself is a minor part of the example,
    // most of which is dominated by comments :)
    //
    clusterizerstate s;
    kmeansreport rep;
    real_2d_array xy = "[[1,1],[1,2],[4,1],[2,3],[4,1.5]]";

    clusterizercreate(s);
    clusterizersetpoints(s, xy, 2);
    clusterizersetkmeanslimits(s, 5, 0);
    clusterizerrunkmeans(s, 2, rep);

    //
    // We've performed clusterization, and it succeeded (completion code is +1).
    //
    // Now first center is stored in the first row of rep.c, second one is stored
    // in the second row. rep.cidx can be used to determine which center is
    // closest to some specific point of the dataset.
    //
    printf("%d\n", int(rep.terminationtype)); // EXPECTED: 1

    // We called clusterizersetpoints() with disttype=2 because k-means++
    // algorithm does NOT support metrics other than Euclidean. But what if we
    // try to use some other metric?
    //
    // We change metric type by calling clusterizersetpoints() one more time,
    // and try to run k-means algo again. It fails.
    //
    clusterizersetpoints(s, xy, 0);
    clusterizerrunkmeans(s, 2, rep);
    printf("%d\n", int(rep.terminationtype)); // EXPECTED: -5
    return 0;
}


#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "dataanalysis.h"

using namespace alglib;


int main(int argc, char **argv)
{
    //
    // We have a set of points in 1D space:
    //     (P0,P1,P2,P3,P4) = (1, 3, 10, 16, 20)
    //
    // We want to perform Agglomerative Hierarchic Clusterization (AHC),
    // using either complete or single linkage and Euclidean distance
    // (default metric).
    //
    // First two steps merge P0/P1 and P3/P4 independently of the linkage type.
    // However, third step depends on linkage type being used:
    // * in case of complete linkage P2=10 is merged with [P0,P1]
    // * in case of single linkage P2=10 is merged with [P3,P4]
    //
    clusterizerstate s;
    ahcreport rep;
    real_2d_array xy = "[[1],[3],[10],[16],[20]]";
    integer_1d_array cidx;
    integer_1d_array cz;

    clusterizercreate(s);
    clusterizersetpoints(s, xy, 2);

    // use complete linkage, reduce set down to 2 clusters.
    // print clusterization with clusterizergetkclusters(2).
    // P2 must belong to [P0,P1]
    clusterizersetahcalgo(s, 0);
    clusterizerrunahc(s, rep);
    clusterizergetkclusters(rep, 2, cidx, cz);
    printf("%s\n", cidx.tostring().c_str()); // EXPECTED: [1,1,1,0,0]

    // use single linkage, reduce set down to 2 clusters.
    // print clusterization with clusterizergetkclusters(2).
    // P2 must belong to [P3,P4]
    clusterizersetahcalgo(s, 1);
    clusterizerrunahc(s, rep);
    clusterizergetkclusters(rep, 2, cidx, cz);
    printf("%s\n", cidx.tostring().c_str()); // EXPECTED: [0,0,1,1,1]
    return 0;
}


convc1d
convc1dcircular
convc1dcircularinv
convc1dinv
convr1d
convr1dcircular
convr1dcircularinv
convr1dinv
/************************************************************************* 1-dimensional complex convolution. For given A/B returns conv(A,B) (non-circular). Subroutine can automatically choose between three implementations: straightforward O(M*N) formula for very small N (or M), overlap-add algorithm for cases where max(M,N) is significantly larger than min(M,N), but O(M*N) algorithm is too slow, and general FFT-based formula for cases where two previous algorithms are too slow. Algorithm has max(M,N)*log(max(M,N)) complexity for any M/N. INPUT PARAMETERS A - array[0..M-1] - complex function to be transformed M - problem size B - array[0..N-1] - complex function to be transformed N - problem size OUTPUT PARAMETERS R - convolution: A*B. array[0..N+M-2]. NOTE: It is assumed that A is zero at T<0, B is zero too. If one or both functions have non-zero values at negative T's, you can still use this subroutine - just shift its result correspondingly. -- ALGLIB -- Copyright 21.07.2009 by Bochkanov Sergey *************************************************************************/
void alglib::convc1d( complex_1d_array a, ae_int_t m, complex_1d_array b, ae_int_t n, complex_1d_array& r);
/************************************************************************* 1-dimensional circular complex convolution. For given S/R returns conv(S,R) (circular). Algorithm has linearithmic complexity for any M/N. IMPORTANT: normal convolution is commutative, i.e. it is symmetric - conv(A,B)=conv(B,A). Cyclic convolution IS NOT. One function - S - is a signal, periodic function, and another - R - is a response, non-periodic function with limited length. INPUT PARAMETERS S - array[0..M-1] - complex periodic signal M - problem size B - array[0..N-1] - complex non-periodic response N - problem size OUTPUT PARAMETERS R - convolution: A*B. array[0..M-1]. NOTE: It is assumed that B is zero at T<0. If it has non-zero values at negative T's, you can still use this subroutine - just shift its result correspondingly. -- ALGLIB -- Copyright 21.07.2009 by Bochkanov Sergey *************************************************************************/
void alglib::convc1dcircular( complex_1d_array s, ae_int_t m, complex_1d_array r, ae_int_t n, complex_1d_array& c);
/************************************************************************* 1-dimensional circular complex deconvolution (inverse of ConvC1DCircular()). Algorithm has O(M*log(M)) complexity for any M (composite or prime). INPUT PARAMETERS A - array[0..M-1] - convolved periodic signal, A = conv(R, B) M - convolved signal length B - array[0..N-1] - non-periodic response N - response length OUTPUT PARAMETERS R - deconvolved signal. array[0..M-1]. NOTE: deconvolution is an unstable process and may result in division by zero (if your response function is degenerate, i.e. has zero Fourier coefficient). NOTE: It is assumed that B is zero at T<0. If it has non-zero values at negative T's, you can still use this subroutine - just shift its result correspondingly. -- ALGLIB -- Copyright 21.07.2009 by Bochkanov Sergey *************************************************************************/
void alglib::convc1dcircularinv( complex_1d_array a, ae_int_t m, complex_1d_array b, ae_int_t n, complex_1d_array& r);
/************************************************************************* 1-dimensional complex non-circular deconvolution (inverse of ConvC1D()). Algorithm has O(M*log(M)) complexity for any M (composite or prime). INPUT PARAMETERS A - array[0..M-1] - convolved signal, A = conv(R, B) M - convolved signal length B - array[0..N-1] - response N - response length, N<=M OUTPUT PARAMETERS R - deconvolved signal. array[0..M-N]. NOTE: deconvolution is an unstable process and may result in division by zero (if your response function is degenerate, i.e. has zero Fourier coefficient). NOTE: It is assumed that A is zero at T<0, B is zero too. If one or both functions have non-zero values at negative T's, you can still use this subroutine - just shift its result correspondingly. -- ALGLIB -- Copyright 21.07.2009 by Bochkanov Sergey *************************************************************************/
void alglib::convc1dinv( complex_1d_array a, ae_int_t m, complex_1d_array b, ae_int_t n, complex_1d_array& r);
/************************************************************************* 1-dimensional real convolution. Analogous to ConvC1D(), see ConvC1D() comments for more details. INPUT PARAMETERS A - array[0..M-1] - real function to be transformed M - problem size B - array[0..N-1] - real function to be transformed N - problem size OUTPUT PARAMETERS R - convolution: A*B. array[0..N+M-2]. NOTE: It is assumed that A is zero at T<0, B is zero too. If one or both functions have non-zero values at negative T's, you can still use this subroutine - just shift its result correspondingly. -- ALGLIB -- Copyright 21.07.2009 by Bochkanov Sergey *************************************************************************/
void alglib::convr1d( real_1d_array a, ae_int_t m, real_1d_array b, ae_int_t n, real_1d_array& r);
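
A minimal sketch of ConvR1D() usage (the input vectors are made up for illustration): convolving A=[1,2,3] with B=[1,1] produces the M+N-1=4 element vector [1,3,5,3].

#include <stdio.h>
#include "fasttransforms.h"

using namespace alglib;

int main(int argc, char **argv)
{
    real_1d_array a = "[1,2,3]";
    real_1d_array b = "[1,1]";
    real_1d_array r;

    convr1d(a, 3, b, 2, r);                    // non-circular convolution
    printf("%s\n", r.tostring(2).c_str());     // approximately [1,3,5,3]
    return 0;
}
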
/************************************************************************* 1-dimensional circular real convolution. Analogous to ConvC1DCircular(), see ConvC1DCircular() comments for more details. INPUT PARAMETERS S - array[0..M-1] - real signal M - problem size B - array[0..N-1] - real response N - problem size OUTPUT PARAMETERS R - convolution: A*B. array[0..M-1]. NOTE: It is assumed that B is zero at T<0. If it has non-zero values at negative T's, you can still use this subroutine - just shift its result correspondingly. -- ALGLIB -- Copyright 21.07.2009 by Bochkanov Sergey *************************************************************************/
void alglib::convr1dcircular( real_1d_array s, ae_int_t m, real_1d_array r, ae_int_t n, real_1d_array& c);
/************************************************************************* 1-dimensional circular real deconvolution (inverse of ConvR1DCircular()). Algorithm has O(M*log(M)) complexity for any M (composite or prime). INPUT PARAMETERS A - array[0..M-1] - convolved signal, A = conv(R, B) M - convolved signal length B - array[0..N-1] - response N - response length OUTPUT PARAMETERS R - deconvolved signal. array[0..M-1]. NOTE: deconvolution is an unstable process and may result in division by zero (if your response function is degenerate, i.e. has zero Fourier coefficient). NOTE: It is assumed that B is zero at T<0. If it has non-zero values at negative T's, you can still use this subroutine - just shift its result correspondingly. -- ALGLIB -- Copyright 21.07.2009 by Bochkanov Sergey *************************************************************************/
void alglib::convr1dcircularinv( real_1d_array a, ae_int_t m, real_1d_array b, ae_int_t n, real_1d_array& r);
/************************************************************************* 1-dimensional real deconvolution (inverse of ConvR1D()). Algorithm has O(M*log(M)) complexity for any M (composite or prime). INPUT PARAMETERS A - array[0..M-1] - convolved signal, A = conv(R, B) M - convolved signal length B - array[0..N-1] - response N - response length, N<=M OUTPUT PARAMETERS R - deconvolved signal. array[0..M-N]. NOTE: deconvolution is an unstable process and may result in division by zero (if your response function is degenerate, i.e. has zero Fourier coefficient). NOTE: It is assumed that A is zero at T<0, B is zero too. If one or both functions have non-zero values at negative T's, you can still use this subroutine - just shift its result correspondingly. -- ALGLIB -- Copyright 21.07.2009 by Bochkanov Sergey *************************************************************************/
void alglib::convr1dinv( real_1d_array a, ae_int_t m, real_1d_array b, ae_int_t n, real_1d_array& r);
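
A deconvolution sketch complementing the example above (values are illustrative; note that deconvolution requires the response B to have no zero Fourier coefficients): C=conv(R,B) with R=[1,2,3] and B=[1,0.5] equals [1,2.5,4,1.5], and ConvR1DInv() recovers R from C and B.

#include <stdio.h>
#include "fasttransforms.h"

using namespace alglib;

int main(int argc, char **argv)
{
    real_1d_array c = "[1,2.5,4,1.5]";   // c = conv(r,b) with r=[1,2,3], b=[1,0.5]
    real_1d_array b = "[1,0.5]";
    real_1d_array r;

    convr1dinv(c, 4, b, 2, r);                 // result has M-N+1 = 3 elements
    printf("%s\n", r.tostring(2).c_str());     // approximately [1,2,3]
    return 0;
}
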
corrc1d
corrc1dcircular
corrr1d
corrr1dcircular
/************************************************************************* 1-dimensional complex cross-correlation. For given Pattern/Signal returns corr(Pattern,Signal) (non-circular). Correlation is calculated using reduction to convolution. Algorithm with max(M,N)*log(max(M,N)) complexity is used (see ConvC1D() for more info about performance). IMPORTANT: for historical reasons subroutine accepts its parameters in reversed order: CorrC1D(Signal, Pattern) = Pattern x Signal (using traditional definition of cross-correlation, denoting cross-correlation as "x"). INPUT PARAMETERS Signal - array[0..N-1] - complex function to be transformed, signal containing pattern N - problem size Pattern - array[0..M-1] - complex function to be transformed, pattern to search within signal M - problem size OUTPUT PARAMETERS R - cross-correlation, array[0..N+M-2]: * positive lags are stored in R[0..N-1], R[i] = sum(conj(pattern[j])*signal[i+j]) * negative lags are stored in R[N..N+M-2], R[N+M-1-i] = sum(conj(pattern[j])*signal[-i+j]) NOTE: It is assumed that pattern domain is [0..M-1]. If Pattern is non-zero on [-K..M-1], you can still use this subroutine, just shift result by K. -- ALGLIB -- Copyright 21.07.2009 by Bochkanov Sergey *************************************************************************/
void alglib::corrc1d( complex_1d_array signal, ae_int_t n, complex_1d_array pattern, ae_int_t m, complex_1d_array& r);
/************************************************************************* 1-dimensional circular complex cross-correlation. For given Pattern/Signal returns corr(Pattern,Signal) (circular). Algorithm has linearithmic complexity for any M/N. IMPORTANT: for historical reasons subroutine accepts its parameters in reversed order: CorrC1DCircular(Signal, Pattern) = Pattern x Signal (using traditional definition of cross-correlation, denoting cross-correlation as "x"). INPUT PARAMETERS Signal - array[0..N-1] - complex function to be transformed, periodic signal containing pattern N - problem size Pattern - array[0..M-1] - complex function to be transformed, non-periodic pattern to search within signal M - problem size OUTPUT PARAMETERS R - circular cross-correlation, array[0..M-1]. -- ALGLIB -- Copyright 21.07.2009 by Bochkanov Sergey *************************************************************************/
void alglib::corrc1dcircular( complex_1d_array signal, ae_int_t m, complex_1d_array pattern, ae_int_t n, complex_1d_array& c);
/************************************************************************* 1-dimensional real cross-correlation. For given Pattern/Signal returns corr(Pattern,Signal) (non-circular). Correlation is calculated using reduction to convolution. Algorithm with max(M,N)*log(max(M,N)) complexity is used (see ConvC1D() for more info about performance). IMPORTANT: for historical reasons subroutine accepts its parameters in reversed order: CorrR1D(Signal, Pattern) = Pattern x Signal (using traditional definition of cross-correlation, denoting cross-correlation as "x"). INPUT PARAMETERS Signal - array[0..N-1] - real function to be transformed, signal containing pattern N - problem size Pattern - array[0..M-1] - real function to be transformed, pattern to search within signal M - problem size OUTPUT PARAMETERS R - cross-correlation, array[0..N+M-2]: * positive lags are stored in R[0..N-1], R[i] = sum(pattern[j]*signal[i+j]) * negative lags are stored in R[N..N+M-2], R[N+M-1-i] = sum(pattern[j]*signal[-i+j]) NOTE: It is assumed that pattern domain is [0..M-1]. If Pattern is non-zero on [-K..M-1], you can still use this subroutine, just shift result by K. -- ALGLIB -- Copyright 21.07.2009 by Bochkanov Sergey *************************************************************************/
void alglib::corrr1d( real_1d_array signal, ae_int_t n, real_1d_array pattern, ae_int_t m, real_1d_array& r);
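
A minimal pattern-matching sketch with CorrR1D() (signal and pattern values are made up): positive lags are stored first, so the index of the largest element among R[0..N-1] gives the offset at which the pattern best matches the signal.

#include <stdio.h>
#include "fasttransforms.h"

using namespace alglib;

int main(int argc, char **argv)
{
    real_1d_array signal  = "[0,0,1,2,0,0]";
    real_1d_array pattern = "[1,2]";
    real_1d_array r;

    corrr1d(signal, 6, pattern, 2, r);
    printf("%s\n", r.tostring(2).c_str());    // peak at lag 2, where the pattern occurs
    return 0;
}
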
/************************************************************************* 1-dimensional circular real cross-correlation. For given Pattern/Signal returns corr(Pattern,Signal) (circular). Algorithm has linearithmic complexity for any M/N. IMPORTANT: for historical reasons subroutine accepts its parameters in reversed order: CorrR1DCircular(Signal, Pattern) = Pattern x Signal (using traditional definition of cross-correlation, denoting cross-correlation as "x"). INPUT PARAMETERS Signal - array[0..N-1] - real function to be transformed, periodic signal containing pattern N - problem size Pattern - array[0..M-1] - real function to be transformed, non-periodic pattern to search within signal M - problem size OUTPUT PARAMETERS R - circular cross-correlation, array[0..M-1]. -- ALGLIB -- Copyright 21.07.2009 by Bochkanov Sergey *************************************************************************/
void alglib::corrr1dcircular( real_1d_array signal, ae_int_t m, real_1d_array pattern, ae_int_t n, real_1d_array& c);
pearsoncorrelationsignificance
spearmanrankcorrelationsignificance
/************************************************************************* Pearson's correlation coefficient significance test This test checks hypotheses about whether X and Y are samples of two continuous distributions having zero correlation or whether their correlation is non-zero. The following tests are performed: * two-tailed test (null hypothesis - X and Y have zero correlation) * left-tailed test (null hypothesis - the correlation coefficient is greater than or equal to 0) * right-tailed test (null hypothesis - the correlation coefficient is less than or equal to 0). Requirements: * the number of elements in each sample is not less than 5 * normality of distributions of X and Y. Input parameters: R - Pearson's correlation coefficient for X and Y N - number of elements in samples, N>=5. Output parameters: BothTails - p-value for two-tailed test. If BothTails is less than the given significance level the null hypothesis is rejected. LeftTail - p-value for left-tailed test. If LeftTail is less than the given significance level, the null hypothesis is rejected. RightTail - p-value for right-tailed test. If RightTail is less than the given significance level the null hypothesis is rejected. -- ALGLIB -- Copyright 09.04.2007 by Bochkanov Sergey *************************************************************************/
void alglib::pearsoncorrelationsignificance( double r, ae_int_t n, double& bothtails, double& lefttail, double& righttail);
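
A minimal sketch (sample values are invented; pearsoncorr2() from the base statistics subpackage is assumed to be available for computing the coefficient itself):

#include <stdio.h>
#include "statistics.h"

using namespace alglib;

int main(int argc, char **argv)
{
    real_1d_array x = "[1,2,3,4,5,6,7]";
    real_1d_array y = "[1.1,2.3,2.9,4.2,5.1,5.8,7.2]";
    double bothtails, lefttail, righttail;

    double r = pearsoncorr2(x, y);   // sample Pearson correlation coefficient
    pearsoncorrelationsignificance(r, 7, bothtails, lefttail, righttail);
    printf("r=%.4f\n", r);
    printf("two-tailed p=%.6f\n", bothtails);
    return 0;
}
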
/************************************************************************* Spearman's rank correlation coefficient significance test This test checks hypotheses about whether X and Y are samples of two continuous distributions having zero correlation or whether their correlation is non-zero. The following tests are performed: * two-tailed test (null hypothesis - X and Y have zero correlation) * left-tailed test (null hypothesis - the correlation coefficient is greater than or equal to 0) * right-tailed test (null hypothesis - the correlation coefficient is less than or equal to 0). Requirements: * the number of elements in each sample is not less than 5. The test is non-parametric and doesn't require distributions X and Y to be normal. Input parameters: R - Spearman's rank correlation coefficient for X and Y N - number of elements in samples, N>=5. Output parameters: BothTails - p-value for two-tailed test. If BothTails is less than the given significance level the null hypothesis is rejected. LeftTail - p-value for left-tailed test. If LeftTail is less than the given significance level, the null hypothesis is rejected. RightTail - p-value for right-tailed test. If RightTail is less than the given significance level the null hypothesis is rejected. -- ALGLIB -- Copyright 09.04.2007 by Bochkanov Sergey *************************************************************************/
void alglib::spearmanrankcorrelationsignificance( double r, ae_int_t n, double& bothtails, double& lefttail, double& righttail);
kmeansgenerate
/************************************************************************* k-means++ clusterization. Backward compatibility function, we recommend to use CLUSTERING subpackage as better replacement. -- ALGLIB -- Copyright 21.03.2009 by Bochkanov Sergey *************************************************************************/
void alglib::kmeansgenerate( real_2d_array xy, ae_int_t npoints, ae_int_t nvars, ae_int_t k, ae_int_t restarts, ae_int_t& info, real_2d_array& c, integer_1d_array& xyc);
dawsonintegral
/************************************************************************* Dawson's Integral Approximates the integral dawsn(x) = exp(-x^2) * Integral(exp(t^2) dt, t=0..x) Three different rational approximations are employed, for the intervals 0 to 3.25; 3.25 to 6.25; and 6.25 up. ACCURACY: Relative error (IEEE arithmetic, domain [0,10], 10000 trials): peak 6.9e-16, rms 1.0e-16. Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 1989, 2000 by Stephen L. Moshier *************************************************************************/
double alglib::dawsonintegral(double x);
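
A minimal sketch evaluating Dawson's integral on a small grid (the sample points are arbitrary; the function is zero at x=0 and peaks at roughly 0.541 near x=0.92):

#include <stdio.h>
#include "specialfunctions.h"

using namespace alglib;

int main(int argc, char **argv)
{
    for(double x = 0.0; x <= 2.0; x += 0.5)
        printf("dawsn(%.1f) = %.6f\n", x, dawsonintegral(x));
    return 0;
}
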
densesolverlsreport
densesolverreport
cmatrixlusolve
cmatrixlusolvefast
cmatrixlusolvem
cmatrixlusolvemfast
cmatrixmixedsolve
cmatrixmixedsolvem
cmatrixsolve
cmatrixsolvefast
cmatrixsolvem
cmatrixsolvemfast
hpdmatrixcholeskysolve
hpdmatrixcholeskysolvefast
hpdmatrixcholeskysolvem
hpdmatrixcholeskysolvemfast
hpdmatrixsolve
hpdmatrixsolvefast
hpdmatrixsolvem
hpdmatrixsolvemfast
rmatrixlusolve
rmatrixlusolvefast
rmatrixlusolvem
rmatrixlusolvemfast
rmatrixmixedsolve
rmatrixmixedsolvem
rmatrixsolve
rmatrixsolvefast
rmatrixsolvels
rmatrixsolvem
rmatrixsolvemfast
spdmatrixcholeskysolve
spdmatrixcholeskysolvefast
spdmatrixcholeskysolvem
spdmatrixcholeskysolvemfast
spdmatrixsolve
spdmatrixsolvefast
spdmatrixsolvem
spdmatrixsolvemfast
/************************************************************************* *************************************************************************/
class densesolverlsreport { double r2; real_2d_array cx; ae_int_t n; ae_int_t k; };
/************************************************************************* *************************************************************************/
class densesolverreport { double r1; double rinf; };
/************************************************************************* Complex dense linear solver for A*x=b with complex N*N A given by its LU decomposition and N*1 vectors x and b. This is "slow-but-robust" version of the complex linear solver with additional features which add significant performance overhead. Faster version is CMatrixLUSolveFast() function. Algorithm features: * automatic detection of degenerate cases * O(N^2) complexity * condition number estimation No iterative refinement is provided because exact form of original matrix is not known to subroutine. Use CMatrixSolve or CMatrixMixedSolve if you need iterative refinement. IMPORTANT: ! this function is NOT the most efficient linear solver provided ! by ALGLIB. It estimates condition number of linear system, ! which results in 10-15x performance penalty when compared ! with "fast" version which just calls triangular solver. ! ! This performance penalty is insignificant when compared with ! cost of large LU decomposition. However, if you call this ! function many times for the same left side, this overhead ! BECOMES significant. It also becomes significant for small- ! scale problems. ! ! In such cases we strongly recommend you to use faster solver, ! CMatrixLUSolveFast() function. INPUT PARAMETERS LUA - array[0..N-1,0..N-1], LU decomposition, CMatrixLU result P - array[0..N-1], pivots array, CMatrixLU result N - size of A B - array[0..N-1], right part OUTPUT PARAMETERS Info - return code: * -3 matrix is very badly conditioned or exactly singular. * -1 N<=0 was passed * 1 task is solved (but matrix A may be ill-conditioned, check R1/RInf parameters for condition numbers). Rep - additional report, following fields are set: * rep.r1 condition number in 1-norm * rep.rinf condition number in inf-norm X - array[N], it contains: * info>0 => solution * info=-3 => filled by zeros -- ALGLIB -- Copyright 27.01.2010 by Bochkanov Sergey *************************************************************************/
void alglib::cmatrixlusolve( complex_2d_array lua, integer_1d_array p, ae_int_t n, complex_1d_array b, ae_int_t& info, densesolverreport& rep, complex_1d_array& x);
/************************************************************************* Complex dense linear solver for A*x=b with N*N complex A given by its LU decomposition and N*1 vectors x and b. This is fast lightweight version of solver, which is significantly faster than CMatrixLUSolve(), but does not provide additional information (like condition numbers). Algorithm features: * O(N^2) complexity * no additional time-consuming features, just triangular solver INPUT PARAMETERS LUA - array[0..N-1,0..N-1], LU decomposition, CMatrixLU result P - array[0..N-1], pivots array, CMatrixLU result N - size of A B - array[0..N-1], right part OUTPUT PARAMETERS Info - return code: * -3 matrix is exactly singular (ill conditioned matrices are not recognized). * -1 N<=0 was passed * 1 task is solved B - array[N]: * info>0 => overwritten by solution * info=-3 => filled by zeros NOTE: unlike CMatrixLUSolve(), this function does NOT check for near-degeneracy of input matrix. It checks for EXACT degeneracy, because this check is easy to do. However, very badly conditioned matrices may go unnoticed. -- ALGLIB -- Copyright 27.01.2010 by Bochkanov Sergey *************************************************************************/
void alglib::cmatrixlusolvefast( complex_2d_array lua, integer_1d_array p, ae_int_t n, complex_1d_array& b, ae_int_t& info);
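
A minimal "factor once, solve many" sketch (matrix and right part are invented; it assumes cmatrixlu() from the LinAlg package with signature cmatrixlu(a, m, n, pivots)): the LU factors produced by cmatrixlu() can be reused by CMatrixLUSolveFast() for any number of right-hand sides.

#include <stdio.h>
#include "linalg.h"
#include "solvers.h"

using namespace alglib;

int main(int argc, char **argv)
{
    complex_2d_array a;
    integer_1d_array p;
    complex_1d_array b;
    ae_int_t info;

    // 2x2 non-singular complex system matrix (illustrative values)
    a.setlength(2, 2);
    a[0][0] = alglib::complex(2, 1);  a[0][1] = alglib::complex(0, 1);
    a[1][0] = alglib::complex(1, 0);  a[1][1] = alglib::complex(3,-1);

    cmatrixlu(a, 2, 2, p);            // in-place LU factorization (assumed LinAlg call)

    b.setlength(2);
    b[0] = alglib::complex(1, 0);
    b[1] = alglib::complex(0, 1);
    cmatrixlusolvefast(a, p, 2, b, info);   // b is overwritten by the solution
    printf("info=%d\n", int(info));
    printf("x0 = %.4f%+.4fi\n", b[0].x, b[0].y);
    return 0;
}
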
/************************************************************************* Dense solver for A*X=B with N*N complex A given by its LU decomposition, and N*M matrices X and B (multiple right sides). "Slow-but-feature-rich" version of the solver. Algorithm features: * automatic detection of degenerate cases * O(M*N^2) complexity * condition number estimation No iterative refinement is provided because exact form of original matrix is not known to subroutine. Use CMatrixSolve or CMatrixMixedSolve if you need iterative refinement. IMPORTANT: ! this function is NOT the most efficient linear solver provided ! by ALGLIB. It estimates condition number of linear system, ! which results in significant performance penalty when ! compared with "fast" version which just calls triangular ! solver. ! ! This performance penalty is especially apparent when you use ! ALGLIB parallel capabilities (condition number estimation is ! inherently sequential). It also becomes significant for ! small-scale problems. ! ! In such cases we strongly recommend you to use faster solver, ! CMatrixLUSolveMFast() function. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. Triangular solver is relatively easy to parallelize. ! However, parallelization will be efficient only for large number of ! right parts M. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS LUA - array[0..N-1,0..N-1], LU decomposition, RMatrixLU result P - array[0..N-1], pivots array, RMatrixLU result N - size of A B - array[0..N-1,0..M-1], right part M - right part size OUTPUT PARAMETERS Info - return code: * -3 matrix is very badly conditioned or exactly singular. * -1 N<=0 was passed * 1 task is solved (but matrix A may be ill-conditioned, check R1/RInf parameters for condition numbers). Rep - additional report, following fields are set: * rep.r1 condition number in 1-norm * rep.rinf condition number in inf-norm X - array[N,M], it contains: * info>0 => solution * info=-3 => filled by zeros -- ALGLIB -- Copyright 27.01.2010 by Bochkanov Sergey *************************************************************************/
void alglib::cmatrixlusolvem( complex_2d_array lua, integer_1d_array p, ae_int_t n, complex_2d_array b, ae_int_t m, ae_int_t& info, densesolverreport& rep, complex_2d_array& x); void alglib::smp_cmatrixlusolvem( complex_2d_array lua, integer_1d_array p, ae_int_t n, complex_2d_array b, ae_int_t m, ae_int_t& info, densesolverreport& rep, complex_2d_array& x);
/************************************************************************* Dense solver for A*X=B with N*N complex A given by its LU decomposition, and N*M matrices X and B (multiple right sides). "Fast-but-lightweight" version of the solver. Algorithm features: * O(M*N^2) complexity * no additional time-consuming features COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. Triangular solver is relatively easy to parallelize. ! However, parallelization will be efficient only for large number of ! right parts M. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS LUA - array[0..N-1,0..N-1], LU decomposition, RMatrixLU result P - array[0..N-1], pivots array, RMatrixLU result N - size of A B - array[0..N-1,0..M-1], right part M - right part size OUTPUT PARAMETERS Info - return code: * -3 matrix is exactly singular (ill conditioned matrices are not recognized). * -1 N<=0 was passed * 1 task is solved B - array[N,M]: * info>0 => overwritten by solution * info=-3 => filled by zeros -- ALGLIB -- Copyright 27.01.2010 by Bochkanov Sergey *************************************************************************/
void alglib::cmatrixlusolvemfast( complex_2d_array lua, integer_1d_array p, ae_int_t n, complex_2d_array& b, ae_int_t m, ae_int_t& info); void alglib::smp_cmatrixlusolvemfast( complex_2d_array lua, integer_1d_array p, ae_int_t n, complex_2d_array& b, ae_int_t m, ae_int_t& info);
/************************************************************************* Dense solver. Same as RMatrixMixedSolve(), but for complex matrices. Algorithm features: * automatic detection of degenerate cases * condition number estimation * iterative refinement * O(N^2) complexity INPUT PARAMETERS A - array[0..N-1,0..N-1], system matrix LUA - array[0..N-1,0..N-1], LU decomposition, CMatrixLU result P - array[0..N-1], pivots array, CMatrixLU result N - size of A B - array[0..N-1], right part OUTPUT PARAMETERS Info - return code: * -3 matrix is very badly conditioned or exactly singular. * -1 N<=0 was passed * 1 task is solved (but matrix A may be ill-conditioned, check R1/RInf parameters for condition numbers). Rep - additional report, following fields are set: * rep.r1 condition number in 1-norm * rep.rinf condition number in inf-norm X - array[N], it contains: * info>0 => solution * info=-3 => filled by zeros -- ALGLIB -- Copyright 27.01.2010 by Bochkanov Sergey *************************************************************************/
void alglib::cmatrixmixedsolve( complex_2d_array a, complex_2d_array lua, integer_1d_array p, ae_int_t n, complex_1d_array b, ae_int_t& info, densesolverreport& rep, complex_1d_array& x);
/************************************************************************* Dense solver. Same as RMatrixMixedSolveM(), but for complex matrices. Algorithm features: * automatic detection of degenerate cases * condition number estimation * iterative refinement * O(M*N^2) complexity INPUT PARAMETERS A - array[0..N-1,0..N-1], system matrix LUA - array[0..N-1,0..N-1], LU decomposition, CMatrixLU result P - array[0..N-1], pivots array, CMatrixLU result N - size of A B - array[0..N-1,0..M-1], right part M - right part size OUTPUT PARAMETERS Info - return code: * -3 matrix is very badly conditioned or exactly singular. * -1 N<=0 was passed * 1 task is solved (but matrix A may be ill-conditioned, check R1/RInf parameters for condition numbers). Rep - additional report, following fields are set: * rep.r1 condition number in 1-norm * rep.rinf condition number in inf-norm X - array[N,M], it contains: * info>0 => solution * info=-3 => filled by zeros -- ALGLIB -- Copyright 27.01.2010 by Bochkanov Sergey *************************************************************************/
void alglib::cmatrixmixedsolvem( complex_2d_array a, complex_2d_array lua, integer_1d_array p, ae_int_t n, complex_2d_array b, ae_int_t m, ae_int_t& info, densesolverreport& rep, complex_2d_array& x);
/************************************************************************* Complex dense solver for A*x=B with N*N complex matrix A and N*1 complex vectors x and b. "Slow-but-feature-rich" version of the solver. Algorithm features: * automatic detection of degenerate cases * condition number estimation * iterative refinement * O(N^3) complexity IMPORTANT: ! this function is NOT the most efficient linear solver provided ! by ALGLIB. It estimates condition number of linear system ! and performs iterative refinement, which results in ! significant performance penalty when compared with "fast" ! version which just performs LU decomposition and calls ! triangular solver. ! ! This performance penalty is especially visible in the ! multithreaded mode, because both condition number estimation ! and iterative refinement are inherently sequential ! calculations. ! ! Thus, if you need high performance and if you are pretty sure ! that your system is well conditioned, we strongly recommend ! you to use faster solver, CMatrixSolveFast() function. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that LU decomposition is harder to ! parallelize than, say, matrix-matrix product - this algorithm has ! many internal synchronization points which can not be avoided. However ! parallelism starts to be profitable starting from N=1024, achieving ! near-linear speedup for N=4096 or higher. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS A - array[0..N-1,0..N-1], system matrix N - size of A B - array[0..N-1], right part OUTPUT PARAMETERS Info - return code: * -3 matrix is very badly conditioned or exactly singular. * -1 N<=0 was passed * 1 task is solved (but matrix A may be ill-conditioned, check R1/RInf parameters for condition numbers). Rep - additional report, following fields are set: * rep.r1 condition number in 1-norm * rep.rinf condition number in inf-norm X - array[N], it contains: * info>0 => solution * info=-3 => filled by zeros -- ALGLIB -- Copyright 27.01.2010 by Bochkanov Sergey *************************************************************************/
void alglib::cmatrixsolve( complex_2d_array a, ae_int_t n, complex_1d_array b, ae_int_t& info, densesolverreport& rep, complex_1d_array& x); void alglib::smp_cmatrixsolve( complex_2d_array a, ae_int_t n, complex_1d_array b, ae_int_t& info, densesolverreport& rep, complex_1d_array& x);
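
A minimal sketch of the feature-rich solver (matrix and right part are invented): besides the solution it returns condition number estimates in the densesolverreport structure.

#include <stdio.h>
#include "solvers.h"

using namespace alglib;

int main(int argc, char **argv)
{
    complex_2d_array a;
    complex_1d_array b, x;
    densesolverreport rep;
    ae_int_t info;

    // 2x2 Hermitian, well-conditioned system matrix (illustrative values)
    a.setlength(2, 2);
    a[0][0] = alglib::complex(4, 0);  a[0][1] = alglib::complex(0, 1);
    a[1][0] = alglib::complex(0,-1);  a[1][1] = alglib::complex(3, 0);
    b.setlength(2);
    b[0] = alglib::complex(1, 0);
    b[1] = alglib::complex(0, 2);

    cmatrixsolve(a, 2, b, info, rep, x);
    printf("info=%d  r1=%.3e\n", int(info), rep.r1);
    printf("x0 = %.4f%+.4fi\n", x[0].x, x[0].y);
    return 0;
}
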
/************************************************************************* Complex dense solver for A*x=B with N*N complex matrix A and N*1 complex vectors x and b. "Fast-but-lightweight" version of the solver. Algorithm features: * O(N^3) complexity * no additional time consuming features, just triangular solver COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that LU decomposition is harder to ! parallelize than, say, matrix-matrix product - this algorithm has ! many internal synchronization points which can not be avoided. However ! parallelism starts to be profitable starting from N=1024, achieving ! near-linear speedup for N=4096 or higher. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS: A - array[0..N-1,0..N-1], system matrix N - size of A B - array[0..N-1], right part OUTPUT PARAMETERS: Info - return code: * -3 matrix is exactly singular (ill conditioned matrices are not recognized). * -1 N<=0 was passed * 1 task is solved B - array[N]: * info>0 => overwritten by solution * info=-3 => filled by zeros -- ALGLIB -- Copyright 27.01.2010 by Bochkanov Sergey *************************************************************************/
void alglib::cmatrixsolvefast( complex_2d_array a, ae_int_t n, complex_1d_array& b, ae_int_t& info); void alglib::smp_cmatrixsolvefast( complex_2d_array a, ae_int_t n, complex_1d_array& b, ae_int_t& info);
/************************************************************************* Complex dense solver for A*X=B with N*N complex matrix A, N*M complex matrices X and B. "Slow-but-feature-rich" version which provides additional functions, at the cost of slower performance. Faster version may be invoked with CMatrixSolveMFast() function. Algorithm features: * automatic detection of degenerate cases * condition number estimation * iterative refinement * O(N^3+M*N^2) complexity IMPORTANT: ! this function is NOT the most efficient linear solver provided ! by ALGLIB. It estimates condition number of linear system ! and performs iterative refinement, which results in ! significant performance penalty when compared with "fast" ! version which just performs LU decomposition and calls ! triangular solver. ! ! This performance penalty is especially visible in the ! multithreaded mode, because both condition number estimation ! and iterative refinement are inherently sequential ! calculations. ! ! Thus, if you need high performance and if you are pretty sure ! that your system is well conditioned, we strongly recommend ! you to use faster solver, CMatrixSolveMFast() function. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that LU decomposition is harder to ! parallelize than, say, matrix-matrix product - this algorithm has ! many internal synchronization points which can not be avoided. However ! parallelism starts to be profitable starting from N=1024, achieving ! near-linear speedup for N=4096 or higher. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS A - array[0..N-1,0..N-1], system matrix N - size of A B - array[0..N-1,0..M-1], right part M - right part size RFS - iterative refinement switch: * True - refinement is used. Less performance, more precision. * False - refinement is not used. More performance, less precision. OUTPUT PARAMETERS Info - return code: * -3 matrix is very badly conditioned or exactly singular. X is filled by zeros in such cases. * -1 N<=0 was passed * 1 task is solved (but matrix A may be ill-conditioned, check R1/RInf parameters for condition numbers). 
Rep - additional report, following fields are set: * rep.r1 condition number in 1-norm * rep.rinf condition number in inf-norm X - array[N,M], it contains: * info>0 => solution * info=-3 => filled by zeros -- ALGLIB -- Copyright 27.01.2010 by Bochkanov Sergey *************************************************************************/
void alglib::cmatrixsolvem( complex_2d_array a, ae_int_t n, complex_2d_array b, ae_int_t m, bool rfs, ae_int_t& info, densesolverreport& rep, complex_2d_array& x); void alglib::smp_cmatrixsolvem( complex_2d_array a, ae_int_t n, complex_2d_array b, ae_int_t m, bool rfs, ae_int_t& info, densesolverreport& rep, complex_2d_array& x);
/************************************************************************* Complex dense solver for A*X=B with N*N complex matrix A, N*M complex matrices X and B. "Fast-but-lightweight" version which provides just triangular solver - and no additional functions like iterative refinement or condition number estimation. Algorithm features: * O(N^3+M*N^2) complexity * no additional time consuming functions COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that LU decomposition is harder to ! parallelize than, say, matrix-matrix product - this algorithm has ! many internal synchronization points which can not be avoided. However ! parallelism starts to be profitable starting from N=1024, achieving ! near-linear speedup for N=4096 or higher. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS A - array[0..N-1,0..N-1], system matrix N - size of A B - array[0..N-1,0..M-1], right part M - right part size OUTPUT PARAMETERS: Info - return code: * -3 matrix is exactly singular (ill conditioned matrices are not recognized). * -1 N<=0 was passed * 1 task is solved B - array[N,M]: * info>0 => overwritten by solution * info=-3 => filled by zeros -- ALGLIB -- Copyright 16.03.2015 by Bochkanov Sergey *************************************************************************/
void alglib::cmatrixsolvemfast( complex_2d_array a, ae_int_t n, complex_2d_array& b, ae_int_t m, ae_int_t& info); void alglib::smp_cmatrixsolvemfast( complex_2d_array a, ae_int_t n, complex_2d_array& b, ae_int_t m, ae_int_t& info);
/************************************************************************* Dense solver for A*x=b with N*N Hermitian positive definite matrix A given by its Cholesky decomposition, and N*1 complex vectors x and b. This is "slow-but-feature-rich" version of the solver which estimates condition number of the system. Algorithm features: * automatic detection of degenerate cases * O(N^2) complexity * condition number estimation * matrix is represented by its upper or lower triangle No iterative refinement is provided because such partial representation of matrix does not allow efficient calculation of extra-precise matrix-vector products for large matrices. Use RMatrixSolve or RMatrixMixedSolve if you need iterative refinement. IMPORTANT: ! this function is NOT the most efficient linear solver provided ! by ALGLIB. It estimates condition number of linear system, ! which results in 10-15x performance penalty when compared ! with "fast" version which just calls triangular solver. ! ! This performance penalty is insignificant when compared with ! cost of large LU decomposition. However, if you call this ! function many times for the same left side, this overhead ! BECOMES significant. It also becomes significant for small- ! scale problems (N<50). ! ! In such cases we strongly recommend you to use faster solver, ! HPDMatrixCholeskySolveFast() function. INPUT PARAMETERS CHA - array[0..N-1,0..N-1], Cholesky decomposition, HPDMatrixCholesky result N - size of A IsUpper - what half of CHA is provided B - array[0..N-1], right part OUTPUT PARAMETERS Info - return code: * -3 A is exactly singular or ill conditioned. X is filled by zeros in such cases. * -1 N<=0 was passed * 1 task is solved Rep - additional report, following fields are set: * rep.r1 condition number in 1-norm * rep.rinf condition number in inf-norm X - array[N]: * for info>0 - solution * for info=-3 - filled by zeros -- ALGLIB -- Copyright 27.01.2010 by Bochkanov Sergey *************************************************************************/
void alglib::hpdmatrixcholeskysolve( complex_2d_array cha, ae_int_t n, bool isupper, complex_1d_array b, ae_int_t& info, densesolverreport& rep, complex_1d_array& x);
/************************************************************************* Dense solver for A*x=b with N*N Hermitian positive definite matrix A given by its Cholesky decomposition, and N*1 complex vectors x and b. This is "fast-but-lightweight" version of the solver. Algorithm features: * O(N^2) complexity * matrix is represented by its upper or lower triangle * no additional time-consuming features INPUT PARAMETERS CHA - array[0..N-1,0..N-1], Cholesky decomposition, HPDMatrixCholesky result N - size of A IsUpper - what half of CHA is provided B - array[0..N-1], right part OUTPUT PARAMETERS Info - return code: * -3 A is exactly singular or ill conditioned. B is filled by zeros in such cases. * -1 N<=0 was passed * 1 task is solved B - array[N]: * for info>0 - overwritten by solution * for info=-3 - filled by zeros -- ALGLIB -- Copyright 18.03.2015 by Bochkanov Sergey *************************************************************************/
void alglib::hpdmatrixcholeskysolvefast( complex_2d_array cha, ae_int_t n, bool isupper, complex_1d_array& b, ae_int_t& info);
/************************************************************************* Dense solver for A*X=B with N*N Hermitian positive definite matrix A given by its Cholesky decomposition and N*M complex matrices X and B. This is "slow-but-feature-rich" version of the solver which, in addition to the solution, estimates condition number of the system. Algorithm features: * automatic detection of degenerate cases * O(M*N^2) complexity * condition number estimation * matrix is represented by its upper or lower triangle No iterative refinement is provided because such partial representation of matrix does not allow efficient calculation of extra-precise matrix-vector products for large matrices. Use RMatrixSolve or RMatrixMixedSolve if you need iterative refinement. IMPORTANT: ! this function is NOT the most efficient linear solver provided ! by ALGLIB. It estimates condition number of linear system, ! which results in significant performance penalty when ! compared with "fast" version which just calls triangular ! solver. Amount of overhead introduced depends on M (the ! larger - the more efficient). ! ! This performance penalty is insignificant when compared with ! cost of large Cholesky decomposition. However, if you call ! this function many times for the same left side, this ! overhead BECOMES significant. It also becomes significant ! for small-scale problems (N<50). ! ! In such cases we strongly recommend you to use faster solver, ! HPDMatrixCholeskySolveMFast() function. INPUT PARAMETERS CHA - array[N,N], Cholesky decomposition, HPDMatrixCholesky result N - size of CHA IsUpper - what half of CHA is provided B - array[N,M], right part M - right part size OUTPUT PARAMETERS: Info - return code: * -3 A is singular, or VERY close to singular. X is filled by zeros in such cases. * -1 N<=0 was passed * 1 task was solved Rep - additional report, following fields are set: * rep.r1 condition number in 1-norm * rep.rinf condition number in inf-norm X - array[N,M]: * for info>0 contains solution * for info=-3 filled by zeros -- ALGLIB -- Copyright 27.01.2010 by Bochkanov Sergey *************************************************************************/
void alglib::hpdmatrixcholeskysolvem( complex_2d_array cha, ae_int_t n, bool isupper, complex_2d_array b, ae_int_t m, ae_int_t& info, densesolverreport& rep, complex_2d_array& x); void alglib::smp_hpdmatrixcholeskysolvem( complex_2d_array cha, ae_int_t n, bool isupper, complex_2d_array b, ae_int_t m, ae_int_t& info, densesolverreport& rep, complex_2d_array& x);
/************************************************************************* Dense solver for A*X=B with N*N Hermitian positive definite matrix A given by its Cholesky decomposition and N*M complex matrices X and B. This is "fast-but-lightweight" version of the solver. Algorithm features: * O(M*N^2) complexity * matrix is represented by its upper or lower triangle * no additional time-consuming features INPUT PARAMETERS CHA - array[N,N], Cholesky decomposition, HPDMatrixCholesky result N - size of CHA IsUpper - what half of CHA is provided B - array[N,M], right part M - right part size OUTPUT PARAMETERS: Info - return code: * -3 A is singular, or VERY close to singular. B is filled by zeros in such cases. * -1 N<=0 was passed * 1 task was solved B - array[N,M]: * for info>0 overwritten by solution * for info=-3 filled by zeros -- ALGLIB -- Copyright 18.03.2015 by Bochkanov Sergey *************************************************************************/
void alglib::hpdmatrixcholeskysolvemfast( complex_2d_array cha, ae_int_t n, bool isupper, complex_2d_array& b, ae_int_t m, ae_int_t& info); void alglib::smp_hpdmatrixcholeskysolvemfast( complex_2d_array cha, ae_int_t n, bool isupper, complex_2d_array& b, ae_int_t m, ae_int_t& info);
/************************************************************************* Dense solver for A*x=b, with N*N Hermitian positive definite matrix A, and N*1 complex vectors x and b. "Slow-but-feature-rich" version of the solver. Algorithm features: * automatic detection of degenerate cases * condition number estimation * O(N^3) complexity * matrix is represented by its upper or lower triangle No iterative refinement is provided because such partial representation of matrix does not allow efficient calculation of extra-precise matrix-vector products for large matrices. Use RMatrixSolve or RMatrixMixedSolve if you need iterative refinement. IMPORTANT: ! this function is NOT the most efficient linear solver provided ! by ALGLIB. It estimates condition number of linear system, ! which results in significant performance penalty when ! compared with "fast" version which just performs Cholesky ! decomposition and calls triangular solver. ! ! This performance penalty is especially visible in the ! multithreaded mode, because both condition number estimation ! and iterative refinement are inherently sequential ! calculations. ! ! Thus, if you need high performance and if you are pretty sure ! that your system is well conditioned, we strongly recommend ! you to use faster solver, HPDMatrixSolveFast() function. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that Cholesky decomposition is harder ! to parallelize than, say, matrix-matrix product - this algorithm has ! several synchronization points which can not be avoided. However, ! parallelism starts to be profitable starting from N=500. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS A - array[0..N-1,0..N-1], system matrix N - size of A IsUpper - what half of A is provided B - array[0..N-1], right part OUTPUT PARAMETERS Info - same as in RMatrixSolve Returns -3 for non-HPD matrices. Rep - same as in RMatrixSolve X - same as in RMatrixSolve -- ALGLIB -- Copyright 27.01.2010 by Bochkanov Sergey *************************************************************************/
void alglib::hpdmatrixsolve( complex_2d_array a, ae_int_t n, bool isupper, complex_1d_array b, ae_int_t& info, densesolverreport& rep, complex_1d_array& x); void alglib::smp_hpdmatrixsolve( complex_2d_array a, ae_int_t n, bool isupper, complex_1d_array b, ae_int_t& info, densesolverreport& rep, complex_1d_array& x);
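A brief sketch (not from the original manual; values illustrative, header per the standard ALGLIB layout) of hpdmatrixsolve() on a small Hermitian positive definite system; with isupper=true only the upper triangle of A is referenced:

#include <cstdio>
#include "solvers.h"

int main()
{
    // 2x2 Hermitian positive definite matrix, values for illustration only
    alglib::complex_2d_array a;
    a.setlength(2, 2);
    a[0][0] = alglib::complex(2.0, 0.0);
    a[0][1] = alglib::complex(0.5, -0.5);
    a[1][0] = alglib::complex(0.5, 0.5);   // conjugate of a[0][1]
    a[1][1] = alglib::complex(3.0, 0.0);
    alglib::complex_1d_array b;
    b.setlength(2);
    b[0] = alglib::complex(1.0, 0.0);
    b[1] = alglib::complex(0.0, 1.0);

    alglib::ae_int_t info;
    alglib::densesolverreport rep;
    alglib::complex_1d_array x;
    alglib::hpdmatrixsolve(a, 2, true, b, info, rep, x);   // isupper=true
    printf("info=%d\n%s\n", (int)info, x.tostring(3).c_str());
    return 0;
}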
/************************************************************************* Dense solver for A*x=b, with N*N Hermitian positive definite matrix A, and N*1 complex vectors x and b. "Fast-but-lightweight" version of the solver without additional functions. Algorithm features: * O(N^3) complexity * matrix is represented by its upper or lower triangle * no additional time consuming functions COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that Cholesky decomposition is harder ! to parallelize than, say, matrix-matrix product - this algorithm has ! several synchronization points which can not be avoided. However, ! parallelism starts to be profitable starting from N=500. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS A - array[0..N-1,0..N-1], system matrix N - size of A IsUpper - what half of A is provided B - array[0..N-1], right part OUTPUT PARAMETERS Info - return code: * -3 A is exactly singular or not positive definite. B is filled by zeros in such cases. * -1 N<=0 was passed * 1 task was solved B - array[0..N-1]: * overwritten by solution * zeros, if A is exactly singular (diagonal of its Cholesky decomposition has exact zeros). -- ALGLIB -- Copyright 17.03.2015 by Bochkanov Sergey *************************************************************************/
void alglib::hpdmatrixsolvefast( complex_2d_array a, ae_int_t n, bool isupper, complex_1d_array& b, ae_int_t& info); void alglib::smp_hpdmatrixsolvefast( complex_2d_array a, ae_int_t n, bool isupper, complex_1d_array& b, ae_int_t& info);
/************************************************************************* Dense solver for A*X=B, with N*N Hermitian positive definite matrix A and N*M complex matrices X and B. "Slow-but-feature-rich" version of the solver. Algorithm features: * automatic detection of degenerate cases * condition number estimation * O(N^3+M*N^2) complexity * matrix is represented by its upper or lower triangle No iterative refinement is provided because such partial representation of matrix does not allow efficient calculation of extra-precise matrix-vector products for large matrices. Use RMatrixSolve or RMatrixMixedSolve if you need iterative refinement. IMPORTANT: ! this function is NOT the most efficient linear solver provided ! by ALGLIB. It estimates condition number of linear system, ! which results in significant performance penalty when ! compared with "fast" version which just calls triangular ! solver. ! ! This performance penalty is especially apparent when you use ! ALGLIB parallel capabilities (condition number estimation is ! inherently sequential). It also becomes significant for ! small-scale problems (N<100). ! ! In such cases we strongly recommend you to use faster solver, ! HPDMatrixSolveMFast() function. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that Cholesky decomposition is harder ! to parallelize than, say, matrix-matrix product - this algorithm has ! several synchronization points which can not be avoided. However, ! parallelism starts to be profitable starting from N=500. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS A - array[0..N-1,0..N-1], system matrix N - size of A IsUpper - what half of A is provided B - array[0..N-1,0..M-1], right part M - right part size OUTPUT PARAMETERS Info - same as in RMatrixSolve. Returns -3 for non-HPD matrices. Rep - same as in RMatrixSolve X - same as in RMatrixSolve -- ALGLIB -- Copyright 27.01.2010 by Bochkanov Sergey *************************************************************************/
void alglib::hpdmatrixsolvem( complex_2d_array a, ae_int_t n, bool isupper, complex_2d_array b, ae_int_t m, ae_int_t& info, densesolverreport& rep, complex_2d_array& x); void alglib::smp_hpdmatrixsolvem( complex_2d_array a, ae_int_t n, bool isupper, complex_2d_array b, ae_int_t m, ae_int_t& info, densesolverreport& rep, complex_2d_array& x);
/************************************************************************* Dense solver for A*X=B, with N*N Hermitian positive definite matrix A and N*M complex matrices X and B. "Fast-but-lightweight" version of the solver. Algorithm features: * O(N^3+M*N^2) complexity * matrix is represented by its upper or lower triangle * no additional time consuming features like condition number estimation COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that Cholesky decomposition is harder ! to parallelize than, say, matrix-matrix product - this algorithm has ! several synchronization points which can not be avoided. However, ! parallelism starts to be profitable starting from N=500. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS A - array[0..N-1,0..N-1], system matrix N - size of A IsUpper - what half of A is provided B - array[0..N-1,0..M-1], right part M - right part size OUTPUT PARAMETERS Info - return code: * -3 A is exactly singular or is not positive definite. B is filled by zeros in such cases. * -1 N<=0 was passed * 1 task is solved B - array[0..N-1,0..M-1]: * overwritten by solution * zeros, if problem was not solved -- ALGLIB -- Copyright 17.03.2015 by Bochkanov Sergey *************************************************************************/
void alglib::hpdmatrixsolvemfast( complex_2d_array a, ae_int_t n, bool isupper, complex_2d_array& b, ae_int_t m, ae_int_t& info); void alglib::smp_hpdmatrixsolvemfast( complex_2d_array a, ae_int_t n, bool isupper, complex_2d_array& b, ae_int_t m, ae_int_t& info);
/************************************************************************* Dense solver. This subroutine solves a system A*x=b, where A is NxN non-degenerate real matrix given by its LU decomposition, x and b are real vectors. This is "slow-but-robust" version of the linear LU-based solver. Faster version is RMatrixLUSolveFast() function. Algorithm features: * automatic detection of degenerate cases * O(N^2) complexity * condition number estimation No iterative refinement is provided because exact form of original matrix is not known to subroutine. Use RMatrixSolve or RMatrixMixedSolve if you need iterative refinement. IMPORTANT: ! this function is NOT the most efficient linear solver provided ! by ALGLIB. It estimates condition number of linear system, ! which results in 10-15x performance penalty when compared ! with "fast" version which just calls triangular solver. ! ! This performance penalty is insignificant when compared with ! cost of large LU decomposition. However, if you call this ! function many times for the same left side, this overhead ! BECOMES significant. It also becomes significant for small- ! scale problems. ! ! In such cases we strongly recommend you to use faster solver, ! RMatrixLUSolveFast() function. INPUT PARAMETERS LUA - array[N,N], LU decomposition, RMatrixLU result P - array[N], pivots array, RMatrixLU result N - size of A B - array[N], right part OUTPUT PARAMETERS Info - return code: * -3 matrix is very badly conditioned or exactly singular. * -1 N<=0 was passed * 1 task is solved (but matrix A may be ill-conditioned, check R1/RInf parameters for condition numbers). Rep - additional report, following fields are set: * rep.r1 condition number in 1-norm * rep.rinf condition number in inf-norm X - array[N], it contains: * info>0 => solution * info=-3 => filled by zeros -- ALGLIB -- Copyright 27.01.2010 by Bochkanov Sergey *************************************************************************/
void alglib::rmatrixlusolve( real_2d_array lua, integer_1d_array p, ae_int_t n, real_1d_array b, ae_int_t& info, densesolverreport& rep, real_1d_array& x);
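A sketch of the intended workflow, factor once with RMatrixLU() and reuse the factorization for one or more right parts; the values are illustrative and the header names assume the standard ALGLIB layout (rmatrixlu() is part of the LinAlg package):

#include <cstdio>
#include "linalg.h"    // rmatrixlu()
#include "solvers.h"   // rmatrixlusolve()

int main()
{
    alglib::real_2d_array a = "[[4,1],[1,3]]";   // system matrix (illustrative)
    alglib::integer_1d_array p;                  // pivot indices produced by LU
    alglib::rmatrixlu(a, 2, 2, p);               // a is overwritten by its LU factors

    alglib::real_1d_array b = "[1,2]";
    alglib::ae_int_t info;
    alglib::densesolverreport rep;
    alglib::real_1d_array x;
    alglib::rmatrixlusolve(a, p, 2, b, info, rep, x);
    printf("info=%d x=%s\n", (int)info, x.tostring(4).c_str());
    return 0;
}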
/************************************************************************* Dense solver. This subroutine solves a system A*x=b, where A is NxN non-degenerate real matrix given by its LU decomposition, x and b are real vectors. This is "fast-without-any-checks" version of the linear LU-based solver. Slower but more robust version is RMatrixLUSolve() function. Algorithm features: * O(N^2) complexity * fast algorithm without ANY additional checks, just triangular solver INPUT PARAMETERS LUA - array[0..N-1,0..N-1], LU decomposition, RMatrixLU result P - array[0..N-1], pivots array, RMatrixLU result N - size of A B - array[0..N-1], right part OUTPUT PARAMETERS Info - return code: * -3 matrix is exactly singular (ill conditioned matrices are not recognized). B is filled by zeros in such cases. * -1 N<=0 was passed * 1 task is solved B - array[N]: * info>0 => overwritten by solution * info=-3 => filled by zeros -- ALGLIB -- Copyright 18.03.2015 by Bochkanov Sergey *************************************************************************/
void alglib::rmatrixlusolvefast( real_2d_array lua, integer_1d_array p, ae_int_t n, real_1d_array& b, ae_int_t& info);
/************************************************************************* Dense solver. Similar to RMatrixLUSolve() but solves task with multiple right parts (where b and x are NxM matrices). This is "robust-but-slow" version of LU-based solver which performs additional checks for non-degeneracy of inputs (condition number estimation). If you need best performance, use "fast-without-any-checks" version, RMatrixLUSolveMFast(). Algorithm features: * automatic detection of degenerate cases * O(M*N^2) complexity * condition number estimation No iterative refinement is provided because exact form of original matrix is not known to subroutine. Use RMatrixSolve or RMatrixMixedSolve if you need iterative refinement. IMPORTANT: ! this function is NOT the most efficient linear solver provided ! by ALGLIB. It estimates condition number of linear system, ! which results in significant performance penalty when ! compared with "fast" version which just calls triangular ! solver. ! ! This performance penalty is especially apparent when you use ! ALGLIB parallel capabilities (condition number estimation is ! inherently sequential). It also becomes significant for ! small-scale problems. ! ! In such cases we strongly recommend you to use faster solver, ! RMatrixLUSolveMFast() function. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. Triangular solver is relatively easy to parallelize. ! However, parallelization will be efficient only for large number of ! right parts M. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS LUA - array[N,N], LU decomposition, RMatrixLU result P - array[N], pivots array, RMatrixLU result N - size of A B - array[0..N-1,0..M-1], right part M - right part size OUTPUT PARAMETERS Info - return code: * -3 matrix is very badly conditioned or exactly singular. X is filled by zeros in such cases. * -1 N<=0 was passed * 1 task is solved (but matrix A may be ill-conditioned, check R1/RInf parameters for condition numbers). 
Rep - additional report, following fields are set: * rep.r1 condition number in 1-norm * rep.rinf condition number in inf-norm X - array[N,M], it contains: * info>0 => solution * info=-3 => filled by zeros -- ALGLIB -- Copyright 27.01.2010 by Bochkanov Sergey *************************************************************************/
void alglib::rmatrixlusolvem( real_2d_array lua, integer_1d_array p, ae_int_t n, real_2d_array b, ae_int_t m, ae_int_t& info, densesolverreport& rep, real_2d_array& x); void alglib::smp_rmatrixlusolvem( real_2d_array lua, integer_1d_array p, ae_int_t n, real_2d_array b, ae_int_t m, ae_int_t& info, densesolverreport& rep, real_2d_array& x);
/************************************************************************* Dense solver. Similar to RMatrixLUSolve() but solves task with multiple right parts, where b and x are NxM matrices. This is "fast-without-any-checks" version of LU-based solver. It does not estimate condition number of a system, so it is extremely fast. If you need better detection of near-degenerate cases, use RMatrixLUSolveM() function. Algorithm features: * O(M*N^2) complexity * fast algorithm without ANY additional checks, just triangular solver COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. Triangular solver is relatively easy to parallelize. ! However, parallelization will be efficient only for large number of ! right parts M. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS: LUA - array[0..N-1,0..N-1], LU decomposition, RMatrixLU result P - array[0..N-1], pivots array, RMatrixLU result N - size of A B - array[0..N-1,0..M-1], right part M - right part size OUTPUT PARAMETERS: Info - return code: * -3 matrix is exactly singular (ill conditioned matrices are not recognized). * -1 N<=0 was passed * 1 task is solved B - array[N,M]: * info>0 => overwritten by solution * info=-3 => filled by zeros -- ALGLIB -- Copyright 18.03.2015 by Bochkanov Sergey *************************************************************************/
void alglib::rmatrixlusolvemfast( real_2d_array lua, integer_1d_array p, ae_int_t n, real_2d_array& b, ae_int_t m, ae_int_t& info); void alglib::smp_rmatrixlusolvemfast( real_2d_array lua, integer_1d_array p, ae_int_t n, real_2d_array& b, ae_int_t m, ae_int_t& info);
/************************************************************************* Dense solver. This subroutine solves a system A*x=b, where BOTH ORIGINAL A AND ITS LU DECOMPOSITION ARE KNOWN. You can use it if for some reasons you have both A and its LU decomposition. Algorithm features: * automatic detection of degenerate cases * condition number estimation * iterative refinement * O(N^2) complexity INPUT PARAMETERS A - array[0..N-1,0..N-1], system matrix LUA - array[0..N-1,0..N-1], LU decomposition, RMatrixLU result P - array[0..N-1], pivots array, RMatrixLU result N - size of A B - array[0..N-1], right part OUTPUT PARAMETERS Info - return code: * -3 matrix is very badly conditioned or exactly singular. * -1 N<=0 was passed * 1 task is solved (but matrix A may be ill-conditioned, check R1/RInf parameters for condition numbers). Rep - additional report, following fields are set: * rep.r1 condition number in 1-norm * rep.rinf condition number in inf-norm X - array[N], it contains: * info>0 => solution * info=-3 => filled by zeros -- ALGLIB -- Copyright 27.01.2010 by Bochkanov Sergey *************************************************************************/
void alglib::rmatrixmixedsolve( real_2d_array a, real_2d_array lua, integer_1d_array p, ae_int_t n, real_1d_array b, ae_int_t& info, densesolverreport& rep, real_1d_array& x);
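rmatrixmixedsolve() needs both the original matrix and its LU factorization, so a copy of A is kept before factoring. A minimal sketch with illustrative values (rmatrixlu() assumed from the LinAlg package):

#include <cstdio>
#include "linalg.h"
#include "solvers.h"

int main()
{
    alglib::real_2d_array a   = "[[4,1],[1,3]]";   // original matrix, kept intact
    alglib::real_2d_array lua = "[[4,1],[1,3]]";   // copy, overwritten by LU factors
    alglib::integer_1d_array p;
    alglib::rmatrixlu(lua, 2, 2, p);

    alglib::real_1d_array b = "[1,2]";
    alglib::ae_int_t info;
    alglib::densesolverreport rep;
    alglib::real_1d_array x;
    alglib::rmatrixmixedsolve(a, lua, p, 2, b, info, rep, x);   // refinement uses original A
    printf("info=%d x=%s\n", (int)info, x.tostring(4).c_str());
    return 0;
}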
/************************************************************************* Dense solver. Similar to RMatrixMixedSolve() but solves task with multiple right parts (where b and x are NxM matrices). Algorithm features: * automatic detection of degenerate cases * condition number estimation * iterative refinement * O(M*N^2) complexity INPUT PARAMETERS A - array[0..N-1,0..N-1], system matrix LUA - array[0..N-1,0..N-1], LU decomposition, RMatrixLU result P - array[0..N-1], pivots array, RMatrixLU result N - size of A B - array[0..N-1,0..M-1], right part M - right part size OUTPUT PARAMETERS Info - return code: * -3 matrix is very badly conditioned or exactly singular. * -1 N<=0 was passed * 1 task is solved (but matrix A may be ill-conditioned, check R1/RInf parameters for condition numbers). Rep - additional report, following fields are set: * rep.r1 condition number in 1-norm * rep.rinf condition number in inf-norm X - array[N,M], it contains: * info>0 => solution * info=-3 => filled by zeros -- ALGLIB -- Copyright 27.01.2010 by Bochkanov Sergey *************************************************************************/
void alglib::rmatrixmixedsolvem( real_2d_array a, real_2d_array lua, integer_1d_array p, ae_int_t n, real_2d_array b, ae_int_t m, ae_int_t& info, densesolverreport& rep, real_2d_array& x);
/************************************************************************* Dense solver for A*x=b with N*N real matrix A and N*1 real vectors x and b. This is "slow-but-feature-rich" version of the linear solver. Faster version is RMatrixSolveFast() function. Algorithm features: * automatic detection of degenerate cases * condition number estimation * iterative refinement * O(N^3) complexity IMPORTANT: ! this function is NOT the most efficient linear solver provided ! by ALGLIB. It estimates condition number of linear system ! and performs iterative refinement, which results in ! significant performance penalty when compared with "fast" ! version which just performs LU decomposition and calls ! triangular solver. ! ! This performance penalty is especially visible in the ! multithreaded mode, because both condition number estimation ! and iterative refinement are inherently sequential ! calculations. It is also very significant on small matrices. ! ! Thus, if you need high performance and if you are pretty sure ! that your system is well conditioned, we strongly recommend ! you to use faster solver, RMatrixSolveFast() function. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that LU decomposition is harder to ! parallelize than, say, matrix-matrix product - this algorithm has ! many internal synchronization points which can not be avoided. However ! parallelism starts to be profitable starting from N=1024, achieving ! near-linear speedup for N=4096 or higher. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS A - array[0..N-1,0..N-1], system matrix N - size of A B - array[0..N-1], right part OUTPUT PARAMETERS Info - return code: * -3 matrix is very badly conditioned or exactly singular. * -1 N<=0 was passed * 1 task is solved (but matrix A may be ill-conditioned, check R1/RInf parameters for condition numbers). Rep - additional report, following fields are set: * rep.r1 condition number in 1-norm * rep.rinf condition number in inf-norm X - array[N], it contains: * info>0 => solution * info=-3 => filled by zeros -- ALGLIB -- Copyright 27.01.2010 by Bochkanov Sergey *************************************************************************/
void alglib::rmatrixsolve( real_2d_array a, ae_int_t n, real_1d_array b, ae_int_t& info, densesolverreport& rep, real_1d_array& x); void alglib::smp_rmatrixsolve( real_2d_array a, ae_int_t n, real_1d_array b, ae_int_t& info, densesolverreport& rep, real_1d_array& x);
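A minimal sketch of the call sequence for rmatrixsolve(), not part of the original manual; values are illustrative and the header is assumed per the standard ALGLIB layout:

#include <cstdio>
#include "solvers.h"

int main()
{
    alglib::real_2d_array a = "[[3,1],[1,2]]";
    alglib::real_1d_array b = "[9,8]";
    alglib::ae_int_t info;
    alglib::densesolverreport rep;
    alglib::real_1d_array x;
    alglib::rmatrixsolve(a, 2, b, info, rep, x);
    // rep.r1 and rep.rinf hold the condition number estimates described above
    printf("info=%d r1=%.2e x=%s\n", (int)info, rep.r1, x.tostring(4).c_str());
    return 0;
}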
/************************************************************************* Dense solver. This subroutine solves a system A*x=b, where A is NxN non-degenerate real matrix, x and b are vectors. This is a "fast" version of linear solver which does NOT provide any additional functions like condition number estimation or iterative refinement. Algorithm features: * efficient algorithm O(N^3) complexity * no performance overhead from additional functionality If you need condition number estimation or iterative refinement, use more feature-rich version - RMatrixSolve(). COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that LU decomposition is harder to ! parallelize than, say, matrix-matrix product - this algorithm has ! many internal synchronization points which can not be avoided. However ! parallelism starts to be profitable starting from N=1024, achieving ! near-linear speedup for N=4096 or higher. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS A - array[0..N-1,0..N-1], system matrix N - size of A B - array[0..N-1], right part OUTPUT PARAMETERS Info - return code: * -3 matrix is exactly singular (ill conditioned matrices are not recognized). * -1 N<=0 was passed * 1 task is solved B - array[N]: * info>0 => overwritten by solution * info=-3 => filled by zeros -- ALGLIB -- Copyright 16.03.2015 by Bochkanov Sergey *************************************************************************/
void alglib::rmatrixsolvefast( real_2d_array a, ae_int_t n, real_1d_array& b, ae_int_t& info); void alglib::smp_rmatrixsolvefast( real_2d_array a, ae_int_t n, real_1d_array& b, ae_int_t& info);
/************************************************************************* Dense solver. This subroutine finds solution of the linear system A*X=B with non-square, possibly degenerate A. System is solved in the least squares sense, and general least squares solution X = X0 + CX*y which minimizes |A*X-B| is returned. If A is non-degenerate, solution in the usual sense is returned. Algorithm features: * automatic detection (and correct handling!) of degenerate cases * iterative refinement * O(N^3) complexity COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes one important improvement of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Generally, commercial ALGLIB is several times faster than open-source ! generic C edition, and many times faster than open-source C# edition. ! ! Multithreaded acceleration is only partially supported (some parts are ! optimized, but most - are not). ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS A - array[0..NRows-1,0..NCols-1], system matrix NRows - vertical size of A NCols - horizontal size of A B - array[0..NRows-1], right part Threshold- a number in [0,1]. Singular values beyond Threshold are considered zero. Set it to 0.0, if you don't understand what it means, so the solver will choose good value on its own. OUTPUT PARAMETERS Info - return code: * -4 SVD subroutine failed * -1 if NRows<=0 or NCols<=0 or Threshold<0 was passed * 1 if task is solved Rep - solver report, see below for more info X - array[0..N-1], it contains: * solution of A*X=B (even for singular A) * zeros, if SVD subroutine failed SOLVER REPORT Subroutine sets following fields of the Rep structure: * R2 reciprocal of condition number: 1/cond(A), 2-norm. * N = NCols * K dim(Null(A)) * CX array[0..N-1,0..K-1], kernel of A. Columns of CX store such vectors that A*CX[i]=0. -- ALGLIB -- Copyright 24.08.2009 by Bochkanov Sergey *************************************************************************/
void alglib::rmatrixsolvels( real_2d_array a, ae_int_t nrows, ae_int_t ncols, real_1d_array b, double threshold, ae_int_t& info, densesolverlsreport& rep, real_1d_array& x); void alglib::smp_rmatrixsolvels( real_2d_array a, ae_int_t nrows, ae_int_t ncols, real_1d_array b, double threshold, ae_int_t& info, densesolverlsreport& rep, real_1d_array& x);
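A sketch of a least-squares call on an overdetermined 3x2 system (illustrative values, not from the manual); threshold=0.0 lets the solver pick a cutoff on its own, as recommended above:

#include <cstdio>
#include "solvers.h"

int main()
{
    // 3 equations, 2 unknowns: A is 3x2, b has NRows=3 elements
    alglib::real_2d_array a = "[[1,1],[1,2],[1,3]]";
    alglib::real_1d_array b = "[6,5,7]";
    alglib::ae_int_t info;
    alglib::densesolverlsreport rep;
    alglib::real_1d_array x;
    alglib::rmatrixsolvels(a, 3, 2, b, 0.0, info, rep, x);
    printf("info=%d dim(Null(A))=%d x=%s\n", (int)info, (int)rep.k, x.tostring(4).c_str());
    return 0;
}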
/************************************************************************* Dense solver. Similar to RMatrixSolve() but solves task with multiple right parts (where b and x are NxM matrices). This is "slow-but-robust" version of linear solver with additional functionality like condition number estimation. There also exists faster version - RMatrixSolveMFast(). Algorithm features: * automatic detection of degenerate cases * condition number estimation * optional iterative refinement * O(N^3+M*N^2) complexity IMPORTANT: ! this function is NOT the most efficient linear solver provided ! by ALGLIB. It estimates condition number of linear system ! and performs iterative refinement, which results in ! significant performance penalty when compared with "fast" ! version which just performs LU decomposition and calls ! triangular solver. ! ! This performance penalty is especially visible in the ! multithreaded mode, because both condition number estimation ! and iterative refinement are inherently sequential ! calculations. It is also very significant on small matrices. ! ! Thus, if you need high performance and if you are pretty sure ! that your system is well conditioned, we strongly recommend ! you to use faster solver, RMatrixSolveMFast() function. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that LU decomposition is harder to ! parallelize than, say, matrix-matrix product - this algorithm has ! many internal synchronization points which can not be avoided. However ! parallelism starts to be profitable starting from N=1024, achieving ! near-linear speedup for N=4096 or higher. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS A - array[0..N-1,0..N-1], system matrix N - size of A B - array[0..N-1,0..M-1], right part M - right part size RFS - iterative refinement switch: * True - refinement is used. Less performance, more precision. * False - refinement is not used. More performance, less precision. OUTPUT PARAMETERS Info - return code: * -3 A is ill conditioned or singular. X is filled by zeros in such cases. * -1 N<=0 was passed * 1 task is solved (but matrix A may be ill-conditioned, check R1/RInf parameters for condition numbers). 
Rep - additional report, following fields are set: * rep.r1 condition number in 1-norm * rep.rinf condition number in inf-norm X - array[N,M], it contains: * info>0 => solution * info=-3 => filled by zeros -- ALGLIB -- Copyright 27.01.2010 by Bochkanov Sergey *************************************************************************/
void alglib::rmatrixsolvem( real_2d_array a, ae_int_t n, real_2d_array b, ae_int_t m, bool rfs, ae_int_t& info, densesolverreport& rep, real_2d_array& x); void alglib::smp_rmatrixsolvem( real_2d_array a, ae_int_t n, real_2d_array b, ae_int_t m, bool rfs, ae_int_t& info, densesolverreport& rep, real_2d_array& x);
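A sketch with two right-part columns passed at once and iterative refinement enabled (rfs=true); the values are illustrative and not from the manual:

#include <cstdio>
#include "solvers.h"

int main()
{
    alglib::real_2d_array a = "[[3,1],[1,2]]";
    alglib::real_2d_array b = "[[9,1],[8,0]]";   // M=2 right parts stored as columns
    alglib::ae_int_t info;
    alglib::densesolverreport rep;
    alglib::real_2d_array x;
    alglib::rmatrixsolvem(a, 2, b, 2, true, info, rep, x);   // rfs=true: use refinement
    printf("info=%d\n%s\n", (int)info, x.tostring(4).c_str());
    return 0;
}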
/************************************************************************* Dense solver. Similar to RMatrixSolve() but solves task with multiple right parts (where b and x are NxM matrices). This is "fast" version of linear solver which does NOT offer additional functions like condition number estimation or iterative refinement. Algorithm features: * O(N^3+M*N^2) complexity * no additional functionality, highest performance COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that LU decomposition is harder to ! parallelize than, say, matrix-matrix product - this algorithm has ! many internal synchronization points which can not be avoided. However ! parallelism starts to be profitable starting from N=1024, achieving ! near-linear speedup for N=4096 or higher. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS A - array[0..N-1,0..N-1], system matrix N - size of A B - array[0..N-1,0..M-1], right part M - right part size OUTPUT PARAMETERS Info - return code: * -3 matrix is exactly singular (ill conditioned matrices are not recognized). B is filled by zeros in such cases. * -1 N<=0 was passed * 1 task is solved B - array[N,M]: * info>0 => overwritten by solution * info=-3 => filled by zeros -- ALGLIB -- Copyright 18.03.2015 by Bochkanov Sergey *************************************************************************/
void alglib::rmatrixsolvemfast( real_2d_array a, ae_int_t n, real_2d_array& b, ae_int_t m, ae_int_t& info); void alglib::smp_rmatrixsolvemfast( real_2d_array a, ae_int_t n, real_2d_array& b, ae_int_t m, ae_int_t& info);
/************************************************************************* Dense solver for A*x=b with N*N symmetric positive definite matrix A given by its Cholesky decomposition, and N*1 real vectors x and b. This is "slow-but-feature-rich" version of the solver which, in addition to the solution, performs condition number estimation. Algorithm features: * automatic detection of degenerate cases * O(N^2) complexity * condition number estimation * matrix is represented by its upper or lower triangle No iterative refinement is provided because such partial representation of matrix does not allow efficient calculation of extra-precise matrix-vector products for large matrices. Use RMatrixSolve or RMatrixMixedSolve if you need iterative refinement. IMPORTANT: ! this function is NOT the most efficient linear solver provided ! by ALGLIB. It estimates condition number of linear system, ! which results in 10-15x performance penalty when compared ! with "fast" version which just calls triangular solver. ! ! This performance penalty is insignificant when compared with ! cost of large LU decomposition. However, if you call this ! function many times for the same left side, this overhead ! BECOMES significant. It also becomes significant for small- ! scale problems (N<50). ! ! In such cases we strongly recommend you to use faster solver, ! SPDMatrixCholeskySolveFast() function. INPUT PARAMETERS CHA - array[N,N], Cholesky decomposition, SPDMatrixCholesky result N - size of A IsUpper - what half of CHA is provided B - array[N], right part OUTPUT PARAMETERS Info - return code: * -3 A is exactly singular or ill conditioned. X is filled by zeros in such cases. * -1 N<=0 was passed * 1 task is solved Rep - additional report, following fields are set: * rep.r1 condition number in 1-norm * rep.rinf condition number in inf-norm X - array[N]: * for info>0 - solution * for info=-3 - filled by zeros -- ALGLIB -- Copyright 27.01.2010 by Bochkanov Sergey *************************************************************************/
void alglib::spdmatrixcholeskysolve( real_2d_array cha, ae_int_t n, bool isupper, real_1d_array b, ae_int_t& info, densesolverreport& rep, real_1d_array& x);
/*************************************************************************
Dense solver for A*x=b with N*N symmetric positive definite matrix A given
by its Cholesky decomposition, and N*1 real vectors x and b. This is "fast-
but-lightweight" version of the solver.

Algorithm features:
* O(N^2) complexity
* matrix is represented by its upper or lower triangle
* no additional features

INPUT PARAMETERS
    CHA     -   array[N,N], Cholesky decomposition, SPDMatrixCholesky result
    N       -   size of A
    IsUpper -   what half of CHA is provided
    B       -   array[N], right part

OUTPUT PARAMETERS
    Info    -   return code:
                * -3    A is exactly singular or ill conditioned.
                        B is filled by zeros in such cases.
                * -1    N<=0 was passed
                *  1    task is solved
    B       -   array[N]:
                * for info>0  - overwritten by solution
                * for info=-3 - filled by zeros

  -- ALGLIB --
     Copyright 27.01.2010 by Bochkanov Sergey
*************************************************************************/
void alglib::spdmatrixcholeskysolvefast( real_2d_array cha, ae_int_t n, bool isupper, real_1d_array& b, ae_int_t& info);
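A minimal sketch of the intended usage pattern: factor the matrix once with spdmatrixcholesky (documented in the trfac subpackage, assumed here to be available through linalg.h) and then reuse the factor for cheap O(N^2) solves. The 2x2 data is arbitrary illustration data.

    #include <stdio.h>
    #include "linalg.h"
    #include "solvers.h"

    using namespace alglib;

    int main()
    {
        real_2d_array a = "[[4,1],[1,3]]";    // SPD matrix, given by its upper triangle
        real_1d_array b = "[1,2]";
        ae_int_t info;
        if( !spdmatrixcholesky(a, 2, true) )  // upper triangle is replaced by the Cholesky factor
            return 1;
        spdmatrixcholeskysolvefast(a, 2, true, b, info);
        if( info>0 )
            printf("%s\n", b.tostring(3).c_str());   // roughly [0.091, 0.636]
        return 0;
    }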
/*************************************************************************
Dense solver for A*X=B with N*N symmetric positive definite matrix A given
by its Cholesky decomposition, and N*M vectors X and B. It is "slow-but-
feature-rich" version of the solver which estimates condition number of
the system.

Algorithm features:
* automatic detection of degenerate cases
* O(M*N^2) complexity
* condition number estimation
* matrix is represented by its upper or lower triangle

No iterative refinement is provided because such partial representation of
matrix does not allow efficient calculation of extra-precise matrix-vector
products for large matrices. Use RMatrixSolve or RMatrixMixedSolve if you
need iterative refinement.

IMPORTANT: ! this function is NOT the most efficient linear solver provided
           ! by ALGLIB. It estimates condition number of linear system,
           ! which results in significant performance penalty when
           ! compared with "fast" version which just calls triangular
           ! solver. Amount of overhead introduced depends on M (the
           ! larger - the more efficient).
           !
           ! This performance penalty is insignificant when compared with
           ! cost of large LU decomposition. However, if you call this
           ! function many times for the same left side, this overhead
           ! BECOMES significant. It also becomes significant for small-
           ! scale problems (N<50).
           !
           ! In such cases we strongly recommend you to use faster solver,
           ! SPDMatrixCholeskySolveMFast() function.

INPUT PARAMETERS
    CHA     -   array[0..N-1,0..N-1], Cholesky decomposition,
                SPDMatrixCholesky result
    N       -   size of CHA
    IsUpper -   what half of CHA is provided
    B       -   array[0..N-1,0..M-1], right part
    M       -   right part size

OUTPUT PARAMETERS
    Info    -   return code:
                * -3    A is exactly singular or badly conditioned.
                        X is filled by zeros in such cases.
                * -1    N<=0 was passed
                *  1    task was solved
    Rep     -   additional report, following fields are set:
                * rep.r1    condition number in 1-norm
                * rep.rinf  condition number in inf-norm
    X       -   array[N,M]:
                * for info>0 contains solution
                * for info=-3 filled by zeros

  -- ALGLIB --
     Copyright 27.01.2010 by Bochkanov Sergey
*************************************************************************/
void alglib::spdmatrixcholeskysolvem( real_2d_array cha, ae_int_t n, bool isupper, real_2d_array b, ae_int_t m, ae_int_t& info, densesolverreport& rep, real_2d_array& x); void alglib::smp_spdmatrixcholeskysolvem( real_2d_array cha, ae_int_t n, bool isupper, real_2d_array b, ae_int_t m, ae_int_t& info, densesolverreport& rep, real_2d_array& x);
/*************************************************************************
Dense solver for A*X=B with N*N symmetric positive definite matrix A given
by its Cholesky decomposition, and N*M vectors X and B. It is "fast-but-
lightweight" version of the solver which just solves linear system, without
any additional functions.

Algorithm features:
* O(M*N^2) complexity
* matrix is represented by its upper or lower triangle
* no additional functionality

INPUT PARAMETERS
    CHA     -   array[N,N], Cholesky decomposition, SPDMatrixCholesky result
    N       -   size of CHA
    IsUpper -   what half of CHA is provided
    B       -   array[N,M], right part
    M       -   right part size

OUTPUT PARAMETERS
    Info    -   return code:
                * -3    A is exactly singular or badly conditioned.
                        B is filled by zeros in such cases.
                * -1    N<=0 was passed
                *  1    task was solved
    B       -   array[N,M]:
                * for info>0 overwritten by solution
                * for info=-3 filled by zeros

  -- ALGLIB --
     Copyright 18.03.2015 by Bochkanov Sergey
*************************************************************************/
void alglib::spdmatrixcholeskysolvemfast( real_2d_array cha, ae_int_t n, bool isupper, real_2d_array& b, ae_int_t m, ae_int_t& info); void alglib::smp_spdmatrixcholeskysolvemfast( real_2d_array cha, ae_int_t n, bool isupper, real_2d_array& b, ae_int_t m, ae_int_t& info);
/************************************************************************* Dense linear solver for A*x=b with N*N real symmetric positive definite matrix A, N*1 vectors x and b. "Slow-but-feature-rich" version of the solver. Algorithm features: * automatic detection of degenerate cases * condition number estimation * O(N^3) complexity * matrix is represented by its upper or lower triangle No iterative refinement is provided because such partial representation of matrix does not allow efficient calculation of extra-precise matrix-vector products for large matrices. Use RMatrixSolve or RMatrixMixedSolve if you need iterative refinement. IMPORTANT: ! this function is NOT the most efficient linear solver provided ! by ALGLIB. It estimates condition number of linear system, ! which results in significant performance penalty when ! compared with "fast" version which just performs Cholesky ! decomposition and calls triangular solver. ! ! This performance penalty is especially visible in the ! multithreaded mode, because both condition number estimation ! and iterative refinement are inherently sequential ! calculations. ! ! Thus, if you need high performance and if you are pretty sure ! that your system is well conditioned, we strongly recommend ! you to use faster solver, SPDMatrixSolveFast() function. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that Cholesky decomposition is harder ! to parallelize than, say, matrix-matrix product - this algorithm has ! several synchronization points which can not be avoided. However, ! parallelism starts to be profitable starting from N=500. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS A - array[0..N-1,0..N-1], system matrix N - size of A IsUpper - what half of A is provided B - array[0..N-1], right part OUTPUT PARAMETERS Info - return code: * -3 matrix is very badly conditioned or non-SPD. * -1 N<=0 was passed * 1 task is solved (but matrix A may be ill-conditioned, check R1/RInf parameters for condition numbers). 
Rep - additional report, following fields are set: * rep.r1 condition number in 1-norm * rep.rinf condition number in inf-norm X - array[N], it contains: * info>0 => solution * info=-3 => filled by zeros -- ALGLIB -- Copyright 27.01.2010 by Bochkanov Sergey *************************************************************************/
void alglib::spdmatrixsolve( real_2d_array a, ae_int_t n, bool isupper, real_1d_array b, ae_int_t& info, densesolverreport& rep, real_1d_array& x); void alglib::smp_spdmatrixsolve( real_2d_array a, ae_int_t n, bool isupper, real_1d_array b, ae_int_t& info, densesolverreport& rep, real_1d_array& x);
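A minimal sketch (assuming solvers.h and arbitrary 2x2 test data) showing the "feature-rich" SPD solver together with the densesolverreport it fills:

    #include <stdio.h>
    #include "solvers.h"

    using namespace alglib;

    int main()
    {
        real_2d_array a = "[[4,1],[1,3]]";
        real_1d_array b = "[1,2]";
        real_1d_array x;
        ae_int_t info;
        densesolverreport rep;
        spdmatrixsolve(a, 2, true, b, info, rep, x);
        if( info>0 )
            printf("x = %s   rep.r1 = %.3e\n", x.tostring(3).c_str(), rep.r1);
        return 0;
    }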
/************************************************************************* Dense linear solver for A*x=b with N*N real symmetric positive definite matrix A, N*1 vectors x and b. "Fast-but-lightweight" version of the solver. Algorithm features: * O(N^3) complexity * matrix is represented by its upper or lower triangle * no additional time consuming features like condition number estimation COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that Cholesky decomposition is harder ! to parallelize than, say, matrix-matrix product - this algorithm has ! several synchronization points which can not be avoided. However, ! parallelism starts to be profitable starting from N=500. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS A - array[0..N-1,0..N-1], system matrix N - size of A IsUpper - what half of A is provided B - array[0..N-1], right part OUTPUT PARAMETERS Info - return code: * -3 A is is exactly singular or non-SPD * -1 N<=0 was passed * 1 task was solved B - array[N], it contains: * info>0 => solution * info=-3 => filled by zeros -- ALGLIB -- Copyright 17.03.2015 by Bochkanov Sergey *************************************************************************/
void alglib::spdmatrixsolvefast( real_2d_array a, ae_int_t n, bool isupper, real_1d_array& b, ae_int_t& info); void alglib::smp_spdmatrixsolvefast( real_2d_array a, ae_int_t n, bool isupper, real_1d_array& b, ae_int_t& info);
/************************************************************************* Dense solver for A*X=B with N*N symmetric positive definite matrix A, and N*M vectors X and B. It is "slow-but-feature-rich" version of the solver. Algorithm features: * automatic detection of degenerate cases * condition number estimation * O(N^3+M*N^2) complexity * matrix is represented by its upper or lower triangle No iterative refinement is provided because such partial representation of matrix does not allow efficient calculation of extra-precise matrix-vector products for large matrices. Use RMatrixSolve or RMatrixMixedSolve if you need iterative refinement. IMPORTANT: ! this function is NOT the most efficient linear solver provided ! by ALGLIB. It estimates condition number of linear system, ! which results in significant performance penalty when ! compared with "fast" version which just performs Cholesky ! decomposition and calls triangular solver. ! ! This performance penalty is especially visible in the ! multithreaded mode, because both condition number estimation ! and iterative refinement are inherently sequential ! calculations. ! ! Thus, if you need high performance and if you are pretty sure ! that your system is well conditioned, we strongly recommend ! you to use faster solver, SPDMatrixSolveMFast() function. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that Cholesky decomposition is harder ! to parallelize than, say, matrix-matrix product - this algorithm has ! several synchronization points which can not be avoided. However, ! parallelism starts to be profitable starting from N=500. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS A - array[0..N-1,0..N-1], system matrix N - size of A IsUpper - what half of A is provided B - array[0..N-1,0..M-1], right part M - right part size OUTPUT PARAMETERS Info - return code: * -3 matrix is very badly conditioned or non-SPD. * -1 N<=0 was passed * 1 task is solved (but matrix A may be ill-conditioned, check R1/RInf parameters for condition numbers). 
Rep - additional report, following fields are set: * rep.r1 condition number in 1-norm * rep.rinf condition number in inf-norm X - array[N,M], it contains: * info>0 => solution * info=-3 => filled by zeros -- ALGLIB -- Copyright 27.01.2010 by Bochkanov Sergey *************************************************************************/
void alglib::spdmatrixsolvem( real_2d_array a, ae_int_t n, bool isupper, real_2d_array b, ae_int_t m, ae_int_t& info, densesolverreport& rep, real_2d_array& x); void alglib::smp_spdmatrixsolvem( real_2d_array a, ae_int_t n, bool isupper, real_2d_array b, ae_int_t m, ae_int_t& info, densesolverreport& rep, real_2d_array& x);
/************************************************************************* Dense solver for A*X=B with N*N symmetric positive definite matrix A, and N*M vectors X and B. It is "fast-but-lightweight" version of the solver. Algorithm features: * O(N^3+M*N^2) complexity * matrix is represented by its upper or lower triangle * no additional time consuming features COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that Cholesky decomposition is harder ! to parallelize than, say, matrix-matrix product - this algorithm has ! several synchronization points which can not be avoided. However, ! parallelism starts to be profitable starting from N=500. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS A - array[0..N-1,0..N-1], system matrix N - size of A IsUpper - what half of A is provided B - array[0..N-1,0..M-1], right part M - right part size OUTPUT PARAMETERS Info - return code: * -3 A is is exactly singular * -1 N<=0 was passed * 1 task was solved B - array[N,M], it contains: * info>0 => solution * info=-3 => filled by zeros -- ALGLIB -- Copyright 17.03.2015 by Bochkanov Sergey *************************************************************************/
void alglib::spdmatrixsolvemfast( real_2d_array a, ae_int_t n, bool isupper, real_2d_array& b, ae_int_t m, ae_int_t& info); void alglib::smp_spdmatrixsolvemfast( real_2d_array a, ae_int_t n, bool isupper, real_2d_array& b, ae_int_t m, ae_int_t& info);
decisionforest
dfreport
dfavgce
dfavgerror
dfavgrelerror
dfbuildrandomdecisionforest
dfbuildrandomdecisionforestx1
dfprocess
dfprocessi
dfrelclserror
dfrmserror
dfserialize
dfunserialize
/*************************************************************************
Decision forest (random forest) model.

This object is built by DFBuildRandomDecisionForest (or its X1 variant),
applied to new data with DFProcess/DFProcessI, and stored/restored with
DFSerialize/DFUnserialize.
*************************************************************************/
class decisionforest { };
/*************************************************************************
Decision forest training report.

Contains errors measured on the training set (relclserror, avgce,
rmserror, avgerror, avgrelerror) and their out-of-bag counterparts
(oobrelclserror, oobavgce, oobrmserror, oobavgerror, oobavgrelerror),
which estimate generalization error.
*************************************************************************/
class dfreport { double relclserror; double avgce; double rmserror; double avgerror; double avgrelerror; double oobrelclserror; double oobavgce; double oobrmserror; double oobavgerror; double oobavgrelerror; };
/************************************************************************* Average cross-entropy (in bits per element) on the test set INPUT PARAMETERS: DF - decision forest model XY - test set NPoints - test set size RESULT: CrossEntropy/(NPoints*LN(2)). Zero if model solves regression task. -- ALGLIB -- Copyright 16.02.2009 by Bochkanov Sergey *************************************************************************/
double alglib::dfavgce( decisionforest df, real_2d_array xy, ae_int_t npoints);
/************************************************************************* Average error on the test set INPUT PARAMETERS: DF - decision forest model XY - test set NPoints - test set size RESULT: Its meaning for regression task is obvious. As for classification task, it means average error when estimating posterior probabilities. -- ALGLIB -- Copyright 16.02.2009 by Bochkanov Sergey *************************************************************************/
double alglib::dfavgerror( decisionforest df, real_2d_array xy, ae_int_t npoints);
/************************************************************************* Average relative error on the test set INPUT PARAMETERS: DF - decision forest model XY - test set NPoints - test set size RESULT: Its meaning for regression task is obvious. As for classification task, it means average relative error when estimating posterior probability of belonging to the correct class. -- ALGLIB -- Copyright 16.02.2009 by Bochkanov Sergey *************************************************************************/
double alglib::dfavgrelerror( decisionforest df, real_2d_array xy, ae_int_t npoints);
/************************************************************************* This subroutine builds random decision forest. INPUT PARAMETERS: XY - training set NPoints - training set size, NPoints>=1 NVars - number of independent variables, NVars>=1 NClasses - task type: * NClasses=1 - regression task with one dependent variable * NClasses>1 - classification task with NClasses classes. NTrees - number of trees in a forest, NTrees>=1. recommended values: 50-100. R - percent of a training set used to build individual trees. 0<R<=1. recommended values: 0.1 <= R <= 0.66. OUTPUT PARAMETERS: Info - return code: * -2, if there is a point with class number outside of [0..NClasses-1]. * -1, if incorrect parameters was passed (NPoints<1, NVars<1, NClasses<1, NTrees<1, R<=0 or R>1). * 1, if task has been solved DF - model built Rep - training report, contains error on a training set and out-of-bag estimates of generalization error. -- ALGLIB -- Copyright 19.02.2009 by Bochkanov Sergey *************************************************************************/
void alglib::dfbuildrandomdecisionforest( real_2d_array xy, ae_int_t npoints, ae_int_t nvars, ae_int_t nclasses, ae_int_t ntrees, double r, ae_int_t& info, decisionforest& df, dfreport& rep);
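A minimal end-to-end sketch: train a forest on a tiny 2-class dataset and query posterior probabilities with dfprocess (documented below). The dataset and parameter values are arbitrary illustration data, and dataanalysis.h is assumed to be the relevant header.

    #include <stdio.h>
    #include "dataanalysis.h"

    using namespace alglib;

    int main()
    {
        // two inputs per point, last column is the class label (0 or 1)
        real_2d_array xy = "[[0,0,0],[0,1,0],[1,0,1],[1,1,1]]";
        ae_int_t info;
        decisionforest df;
        dfreport rep;

        // 4 points, 2 variables, 2 classes, 50 trees, 66% of the set per tree
        dfbuildrandomdecisionforest(xy, 4, 2, 2, 50, 0.66, info, df, rep);
        if( info!=1 )
            return 1;

        real_1d_array x = "[1,0]";
        real_1d_array y;
        y.setlength(2);              // one posterior probability per class
        dfprocess(df, x, y);
        printf("%s\n", y.tostring(2).c_str());
        return 0;
    }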
/************************************************************************* This subroutine builds random decision forest. This function gives ability to tune number of variables used when choosing best split. INPUT PARAMETERS: XY - training set NPoints - training set size, NPoints>=1 NVars - number of independent variables, NVars>=1 NClasses - task type: * NClasses=1 - regression task with one dependent variable * NClasses>1 - classification task with NClasses classes. NTrees - number of trees in a forest, NTrees>=1. recommended values: 50-100. NRndVars - number of variables used when choosing best split R - percent of a training set used to build individual trees. 0<R<=1. recommended values: 0.1 <= R <= 0.66. OUTPUT PARAMETERS: Info - return code: * -2, if there is a point with class number outside of [0..NClasses-1]. * -1, if incorrect parameters was passed (NPoints<1, NVars<1, NClasses<1, NTrees<1, R<=0 or R>1). * 1, if task has been solved DF - model built Rep - training report, contains error on a training set and out-of-bag estimates of generalization error. -- ALGLIB -- Copyright 19.02.2009 by Bochkanov Sergey *************************************************************************/
void alglib::dfbuildrandomdecisionforestx1( real_2d_array xy, ae_int_t npoints, ae_int_t nvars, ae_int_t nclasses, ae_int_t ntrees, ae_int_t nrndvars, double r, ae_int_t& info, decisionforest& df, dfreport& rep);
/*************************************************************************
Processing

INPUT PARAMETERS:
    DF      -   decision forest model
    X       -   input vector, array[0..NVars-1].

OUTPUT PARAMETERS:
    Y       -   result. Regression estimate when solving regression task,
                vector of posterior probabilities for classification task.

See also DFProcessI.

  -- ALGLIB --
     Copyright 16.02.2009 by Bochkanov Sergey
*************************************************************************/
void alglib::dfprocess( decisionforest df, real_1d_array x, real_1d_array& y);
/************************************************************************* 'interactive' variant of DFProcess for languages like Python which support constructs like "Y = DFProcessI(DF,X)" and interactive mode of interpreter This function allocates new array on each call, so it is significantly slower than its 'non-interactive' counterpart, but it is more convenient when you call it from command line. -- ALGLIB -- Copyright 28.02.2010 by Bochkanov Sergey *************************************************************************/
void alglib::dfprocessi( decisionforest df, real_1d_array x, real_1d_array& y);
/************************************************************************* Relative classification error on the test set INPUT PARAMETERS: DF - decision forest model XY - test set NPoints - test set size RESULT: percent of incorrectly classified cases. Zero if model solves regression task. -- ALGLIB -- Copyright 16.02.2009 by Bochkanov Sergey *************************************************************************/
double alglib::dfrelclserror( decisionforest df, real_2d_array xy, ae_int_t npoints);
/************************************************************************* RMS error on the test set INPUT PARAMETERS: DF - decision forest model XY - test set NPoints - test set size RESULT: root mean square error. Its meaning for regression task is obvious. As for classification task, RMS error means error when estimating posterior probabilities. -- ALGLIB -- Copyright 16.02.2009 by Bochkanov Sergey *************************************************************************/
double alglib::dfrmserror( decisionforest df, real_2d_array xy, ae_int_t npoints);
/*************************************************************************
This function serializes data structure to string.

Important properties of s_out:
* it contains alphanumeric characters, dots, underscores, minus signs
* these symbols are grouped into words, which are separated by spaces and
  Windows-style (CR+LF) newlines
* although serializer uses spaces and CR+LF as separators, you can replace
  any separator character by an arbitrary combination of spaces, tabs,
  Windows or Unix newlines. It allows flexible reformatting of the string
  in case you want to include it into a text or XML file. But you should
  not insert separators into the middle of the "words" nor should you
  change the case of letters.
* s_out can be freely moved between 32-bit and 64-bit systems, little and
  big endian machines, and so on. You can serialize structure on 32-bit
  machine and unserialize it on 64-bit one (or vice versa), or serialize
  it on SPARC and unserialize on x86. You can also serialize it in C++
  version of ALGLIB and unserialize in C# one, and vice versa.
*************************************************************************/
void dfserialize(decisionforest &obj, std::string &s_out); void dfserialize(decisionforest &obj, std::ostream &s_out);
/************************************************************************* This function unserializes data structure from string. *************************************************************************/
void dfunserialize(const std::string &s_in, decisionforest &obj); void dfunserialize(const std::istream &s_in, decisionforest &obj);
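A small sketch of the serialization round-trip (assuming dataanalysis.h and an already trained forest, e.g. produced as in the training sketch above):

    #include <string>
    #include "dataanalysis.h"

    using namespace alglib;

    // Round-trips a trained forest through its portable string form.
    void roundtrip(decisionforest &df, decisionforest &restored)
    {
        std::string s;
        dfserialize(df, s);          // text form: safe to store in files or move between platforms
        dfunserialize(s, restored);
    }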
ellipticintegrale
ellipticintegralk
ellipticintegralkhighprecision
incompleteellipticintegrale
incompleteellipticintegralk
/*************************************************************************
Complete elliptic integral of the second kind

Approximates the integral

    E(m) = integral( sqrt(1 - m*sin(t)^2) dt, t = 0..pi/2 )

using the approximation

    P(x) - x*log(x)*Q(x).

ACCURACY:

Relative error (IEEE arithmetic, domain [0,1], 10000 trials):
    peak 2.1e-16, rms 7.3e-17

Cephes Math Library, Release 2.8:  June, 2000
Copyright 1984, 1987, 1989, 2000 by Stephen L. Moshier
*************************************************************************/
double alglib::ellipticintegrale(double m);
/*************************************************************************
Complete elliptic integral of the first kind

Approximates the integral

    K(m) = integral( dt / sqrt(1 - m*sin(t)^2), t = 0..pi/2 )

using the approximation

    P(x) - log(x)*Q(x).

ACCURACY:

Relative error (IEEE arithmetic, domain [0,1], 30000 trials):
    peak 2.5e-16, rms 6.8e-17

Cephes Math Library, Release 2.8:  June, 2000
Copyright 1984, 1987, 2000 by Stephen L. Moshier
*************************************************************************/
double alglib::ellipticintegralk(double m);
/*************************************************************************
Complete elliptic integral of the first kind

Approximates the integral

    K(m) = integral( dt / sqrt(1 - m*sin(t)^2), t = 0..pi/2 )

where m = 1 - m1, using the approximation

    P(x) - log(x)*Q(x).

The argument m1 is used rather than m so that the logarithmic singularity
at m = 1 will be shifted to the origin; this preserves maximum accuracy.

K(0) = pi/2.

ACCURACY:

Relative error (IEEE arithmetic, domain [0,1], 30000 trials):
    peak 2.5e-16, rms 6.8e-17

Cephes Math Library, Release 2.8:  June, 2000
Copyright 1984, 1987, 2000 by Stephen L. Moshier
*************************************************************************/
double alglib::ellipticintegralkhighprecision(double m1);
/*************************************************************************
Incomplete elliptic integral of the second kind

Approximates the integral

    E(phi|m) = integral( sqrt(1 - m*sin(t)^2) dt, t = 0..phi )

of amplitude phi and modulus m, using the arithmetic-geometric mean
algorithm.

ACCURACY:

Tested at random arguments with phi in [-10, 10] and m in [0, 1].

Relative error (IEEE arithmetic, domain [-10,10], 150000 trials):
    peak 3.3e-15, rms 1.4e-16

Cephes Math Library Release 2.8:  June, 2000
Copyright 1984, 1987, 1993, 2000 by Stephen L. Moshier
*************************************************************************/
double alglib::incompleteellipticintegrale(double phi, double m);
/*************************************************************************
Incomplete elliptic integral of the first kind F(phi|m)

Approximates the integral

    F(phi|m) = integral( dt / sqrt(1 - m*sin(t)^2), t = 0..phi )

of amplitude phi and modulus m, using the arithmetic-geometric mean
algorithm.

ACCURACY:

Tested at random points with m in [0, 1] and phi as indicated.

Relative error (IEEE arithmetic, domain [-10,10], 200000 trials):
    peak 7.4e-16, rms 1.0e-16

Cephes Math Library Release 2.8:  June, 2000
Copyright 1984, 1987, 2000 by Stephen L. Moshier
*************************************************************************/
double alglib::incompleteellipticintegralk(double phi, double m);
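A small worked example of the parameter convention used above (parameter m rather than modulus k): the perimeter of an ellipse with semi-axes a and b equals 4*a*E(m) with m = 1-(b/a)^2, and the incomplete integrals reduce to the complete ones at phi = pi/2. This is an illustrative sketch assuming specialfunctions.h.

    #include <stdio.h>
    #include <math.h>
    #include "specialfunctions.h"

    using namespace alglib;

    int main()
    {
        const double pi = acos(-1.0);
        double a = 2.0, b = 1.0;
        double m = 1.0-(b*b)/(a*a);                                // m = 0.75 for this ellipse
        printf("perimeter = %.6f\n", 4.0*a*ellipticintegrale(m));  // about 9.688448
        printf("K(m)      = %.15f\n", ellipticintegralk(m));
        printf("F(pi/2|m) = %.15f\n", incompleteellipticintegralk(0.5*pi, m));  // should match K(m)
        return 0;
    }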
eigsubspacereport
eigsubspacestate
eigsubspacecreate
eigsubspacecreatebuf
eigsubspaceooccontinue
eigsubspaceoocgetrequestdata
eigsubspaceoocgetrequestinfo
eigsubspaceoocsendresult
eigsubspaceoocstart
eigsubspaceoocstop
eigsubspacesetcond
eigsubspacesolvedenses
eigsubspacesolvesparses
hmatrixevd
hmatrixevdi
hmatrixevdr
rmatrixevd
smatrixevd
smatrixevdi
smatrixevdr
smatrixtdevd
smatrixtdevdi
smatrixtdevdr
/*************************************************************************
This object stores the report of the subspace iteration eigensolver (for
example, the number of iterations performed). You should use ALGLIB
functions to work with this object.
*************************************************************************/
class eigsubspacereport { ae_int_t iterationscount; };
/************************************************************************* This object stores state of the subspace iteration algorithm. You should use ALGLIB functions to work with this object. *************************************************************************/
class eigsubspacestate { };
/*************************************************************************
This function initializes subspace iteration solver. This solver is used
to solve symmetric real eigenproblems where just a few (top K) eigenvalues
and corresponding eigenvectors are required.

This solver can be significantly faster than complete EVD decomposition
in the following cases:
* when only a small fraction of top eigenpairs of a dense matrix is
  required. When K approaches N, this solver is slower than complete
  dense EVD
* when the problem matrix is sparse (and/or is not known explicitly, i.e.
  only the matrix-matrix product can be performed)

USAGE (explicit dense/sparse matrix):

1. User initializes algorithm state with eigsubspacecreate() call
2. [optional] User tunes solver parameters by calling eigsubspacesetcond()
   or other functions
3. User calls eigsubspacesolvedenses() or eigsubspacesolvesparses()
   methods, which take algorithm state and 2D array or alglib.sparsematrix
   object.

USAGE (out-of-core mode):

1. User initializes algorithm state with eigsubspacecreate() call
2. [optional] User tunes solver parameters by calling eigsubspacesetcond()
   or other functions
3. User activates out-of-core mode of the solver and repeatedly calls
   communication functions in a loop like below:
   > alglib.eigsubspaceoocstart(state)
   > while alglib.eigsubspaceooccontinue(state) do
   >     alglib.eigsubspaceoocgetrequestinfo(state, out RequestType, out M)
   >     alglib.eigsubspaceoocgetrequestdata(state, out X)
   >     [calculate Y=A*X, with X=R^NxM]
   >     alglib.eigsubspaceoocsendresult(state, in Y)
   > alglib.eigsubspaceoocstop(state, out W, out Z, out Report)

INPUT PARAMETERS:
    N       -   problem dimensionality, N>0
    K       -   number of top eigenvectors to calculate, 0<K<=N.

OUTPUT PARAMETERS:
    State   -   structure which stores algorithm state

  -- ALGLIB --
     Copyright 16.01.2017 by Bochkanov Sergey
*************************************************************************/
void alglib::eigsubspacecreate( ae_int_t n, ae_int_t k, eigsubspacestate& state);
/************************************************************************* Buffered version of constructor which aims to reuse previously allocated memory as much as possible. -- ALGLIB -- Copyright 16.01.2017 by Bochkanov Sergey *************************************************************************/
void alglib::eigsubspacecreatebuf( ae_int_t n, ae_int_t k, eigsubspacestate state);
/************************************************************************* This function performs subspace iteration in the out-of-core mode. It should be used in conjunction with other out-of-core-related functions of this subspackage in a loop like below: > alglib.eigsubspaceoocstart(state) > while alglib.eigsubspaceooccontinue(state) do > alglib.eigsubspaceoocgetrequestinfo(state, out RequestType, out M) > alglib.eigsubspaceoocgetrequestdata(state, out X) > [calculate Y=A*X, with X=R^NxM] > alglib.eigsubspaceoocsendresult(state, in Y) > alglib.eigsubspaceoocstop(state, out W, out Z, out Report) -- ALGLIB -- Copyright 16.01.2017 by Bochkanov Sergey *************************************************************************/
bool alglib::eigsubspaceooccontinue(eigsubspacestate state);
/*************************************************************************
This function is used to retrieve information about out-of-core request
sent by solver to user code: matrix X (array[N,RequestSize]) which has to
be multiplied by out-of-core matrix A in a product A*X.

This function returns just request data; in order to get size of the data
prior to processing the request, use eigsubspaceoocgetrequestinfo().

It should be used in conjunction with other out-of-core-related functions
of this subpackage in a loop like below:

> alglib.eigsubspaceoocstart(state)
> while alglib.eigsubspaceooccontinue(state) do
>     alglib.eigsubspaceoocgetrequestinfo(state, out RequestType, out M)
>     alglib.eigsubspaceoocgetrequestdata(state, out X)
>     [calculate Y=A*X, with X=R^NxM]
>     alglib.eigsubspaceoocsendresult(state, in Y)
> alglib.eigsubspaceoocstop(state, out W, out Z, out Report)

INPUT PARAMETERS:
    State   -   solver running in out-of-core mode
    X       -   possibly preallocated storage; reallocated if needed, left
                unchanged if large enough to store request data.

OUTPUT PARAMETERS:
    X       -   array[N,RequestSize] or larger, leading rectangle is
                filled with dense matrix X.

  -- ALGLIB --
     Copyright 16.01.2017 by Bochkanov Sergey
*************************************************************************/
void alglib::eigsubspaceoocgetrequestdata( eigsubspacestate state, real_2d_array& x);
/************************************************************************* This function is used to retrieve information about out-of-core request sent by solver to user code: request type (current version of the solver sends only requests for matrix-matrix products) and request size (size of the matrices being multiplied). This function returns just request metrics; in order to get contents of the matrices being multiplied, use eigsubspaceoocgetrequestdata(). It should be used in conjunction with other out-of-core-related functions of this subspackage in a loop like below: > alglib.eigsubspaceoocstart(state) > while alglib.eigsubspaceooccontinue(state) do > alglib.eigsubspaceoocgetrequestinfo(state, out RequestType, out M) > alglib.eigsubspaceoocgetrequestdata(state, out X) > [calculate Y=A*X, with X=R^NxM] > alglib.eigsubspaceoocsendresult(state, in Y) > alglib.eigsubspaceoocstop(state, out W, out Z, out Report) INPUT PARAMETERS: State - solver running in out-of-core mode OUTPUT PARAMETERS: RequestType - type of the request to process: * 0 - for matrix-matrix product A*X, with A being NxN matrix whose eigenvalues/vectors are needed, and X being NxREQUESTSIZE one which is returned by the eigsubspaceoocgetrequestdata(). RequestSize - size of the X matrix (number of columns), usually it is several times larger than number of vectors K requested by user. -- ALGLIB -- Copyright 16.01.2017 by Bochkanov Sergey *************************************************************************/
void alglib::eigsubspaceoocgetrequestinfo( eigsubspacestate state, ae_int_t& requesttype, ae_int_t& requestsize);
/************************************************************************* This function is used to send user reply to out-of-core request sent by solver. Usually it is product A*X for returned by solver matrix X. It should be used in conjunction with other out-of-core-related functions of this subspackage in a loop like below: > alglib.eigsubspaceoocstart(state) > while alglib.eigsubspaceooccontinue(state) do > alglib.eigsubspaceoocgetrequestinfo(state, out RequestType, out M) > alglib.eigsubspaceoocgetrequestdata(state, out X) > [calculate Y=A*X, with X=R^NxM] > alglib.eigsubspaceoocsendresult(state, in Y) > alglib.eigsubspaceoocstop(state, out W, out Z, out Report) INPUT PARAMETERS: State - solver running in out-of-core mode AX - array[N,RequestSize] or larger, leading rectangle is filled with product A*X. -- ALGLIB -- Copyright 16.01.2017 by Bochkanov Sergey *************************************************************************/
void alglib::eigsubspaceoocsendresult( eigsubspacestate state, real_2d_array ax);
/************************************************************************* This function initiates out-of-core mode of subspace eigensolver. It should be used in conjunction with other out-of-core-related functions of this subspackage in a loop like below: > alglib.eigsubspaceoocstart(state) > while alglib.eigsubspaceooccontinue(state) do > alglib.eigsubspaceoocgetrequestinfo(state, out RequestType, out M) > alglib.eigsubspaceoocgetrequestdata(state, out X) > [calculate Y=A*X, with X=R^NxM] > alglib.eigsubspaceoocsendresult(state, in Y) > alglib.eigsubspaceoocstop(state, out W, out Z, out Report) INPUT PARAMETERS: State - solver object MType - matrix type: * 0 for real symmetric matrix (solver assumes that matrix being processed is symmetric; symmetric direct eigensolver is used for smaller subproblems arising during solution of larger "full" task) Future versions of ALGLIB may introduce support for other matrix types; for now, only symmetric eigenproblems are supported. -- ALGLIB -- Copyright 16.01.2017 by Bochkanov Sergey *************************************************************************/
void alglib::eigsubspaceoocstart(eigsubspacestate state, ae_int_t mtype);
/************************************************************************* This function finalizes out-of-core mode of subspace eigensolver. It should be used in conjunction with other out-of-core-related functions of this subspackage in a loop like below: > alglib.eigsubspaceoocstart(state) > while alglib.eigsubspaceooccontinue(state) do > alglib.eigsubspaceoocgetrequestinfo(state, out RequestType, out M) > alglib.eigsubspaceoocgetrequestdata(state, out X) > [calculate Y=A*X, with X=R^NxM] > alglib.eigsubspaceoocsendresult(state, in Y) > alglib.eigsubspaceoocstop(state, out W, out Z, out Report) INPUT PARAMETERS: State - solver state OUTPUT PARAMETERS: W - array[K], depending on solver settings: * top K eigenvalues ordered by descending - if eigenvectors are returned in Z * zeros - if invariant subspace is returned in Z Z - array[N,K], depending on solver settings either: * matrix of eigenvectors found * orthogonal basis of K-dimensional invariant subspace Rep - report with additional parameters -- ALGLIB -- Copyright 16.01.2017 by Bochkanov Sergey *************************************************************************/
void alglib::eigsubspaceoocstop( eigsubspacestate state, real_1d_array& w, real_2d_array& z, eigsubspacereport& rep);
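The loop quoted in the comments above can be written in C++ roughly as follows. This is a sketch only: it assumes linalg.h and replaces the user-supplied operator A by the diagonal matrix diag(1..N), so the three largest eigenvalues should come out near N, N-1 and N-2.

    #include <stdio.h>
    #include "linalg.h"

    using namespace alglib;

    // User-supplied product A*X for an implicitly defined A; here A = diag(1..N).
    static void multiply_by_a(const real_2d_array &x, real_2d_array &ax, ae_int_t n, ae_int_t m)
    {
        ax.setlength(n, m);
        for(ae_int_t i=0; i<n; i++)
            for(ae_int_t j=0; j<m; j++)
                ax[i][j] = (i+1)*x[i][j];
    }

    int main()
    {
        const ae_int_t n = 100, k = 3;
        eigsubspacestate s;
        eigsubspacereport rep;
        real_1d_array w;
        real_2d_array z, x, ax;

        eigsubspacecreate(n, k, s);
        eigsubspaceoocstart(s, 0);                        // 0 = real symmetric matrix
        while( eigsubspaceooccontinue(s) )
        {
            ae_int_t rtype, rsize;
            eigsubspaceoocgetrequestinfo(s, rtype, rsize);
            eigsubspaceoocgetrequestdata(s, x);           // X is N x rsize
            multiply_by_a(x, ax, n, rsize);
            eigsubspaceoocsendresult(s, ax);
        }
        eigsubspaceoocstop(s, w, z, rep);
        printf("%s\n", w.tostring(2).c_str());
        return 0;
    }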
/*************************************************************************
This function sets stopping criteria for the solver:
* error in eigenvector/value allowed by solver
* maximum number of iterations to perform

INPUT PARAMETERS:
    State   -   solver structure
    Eps     -   eps>=0, with non-zero value used to tell solver that it
                can stop after all eigenvalues converged with error
                roughly proportional to eps*MAX(LAMBDA_MAX), where
                LAMBDA_MAX is a maximum eigenvalue. Zero value means that
                no check for precision is performed.
    MaxIts  -   maxits>=0, with non-zero value used to tell solver that
                it can stop after maxits steps (no matter how precise
                current estimate is)

NOTE: passing eps=0 and maxits=0 results in automatic selection of
      moderate eps as stopping criteria (1.0E-6 in current implementation,
      but it may change without notice).

NOTE: very small values of eps are possible (say, 1.0E-12), although the
      larger the problem you solve (in terms of N and/or K), the harder it
      is to find precise eigenvectors because rounding errors tend to
      accumulate.

NOTE: passing non-zero eps results in some performance penalty, roughly
      equal to 2N*(2K)^2 FLOPs per iteration. These additional
      computations are required in order to estimate current error in
      eigenvalues via Rayleigh-Ritz process. Most of this additional time
      is spent in construction of ~2Kx2K symmetric subproblem whose
      eigenvalues are checked with exact eigensolver.

      This additional time is negligible if you search for eigenvalues of
      the large dense matrix, but may become noticeable on highly sparse
      EVD problems, where cost of matrix-matrix product is low.

      If you set eps to exactly zero, Rayleigh-Ritz phase is completely
      turned off.

  -- ALGLIB --
     Copyright 16.01.2017 by Bochkanov Sergey
*************************************************************************/
void alglib::eigsubspacesetcond( eigsubspacestate state, double eps, ae_int_t maxits);
/************************************************************************* This function runs eigensolver for dense NxN symmetric matrix A, given by upper or lower triangle. This function can not process nonsymmetric matrices. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * multithreading support ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! ! For a situation when you need just a few eigenvectors (~1-10), ! multithreading typically gives sublinear (wrt to cores count) speedup. ! For larger problems it may give you nearly linear increase in ! performance. ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. Best results are achieved for high-dimensional problems ! (NVars is at least 256). ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS: State - solver state A - array[N,N], symmetric NxN matrix given by one of its triangles IsUpper - whether upper or lower triangle of A is given (the other one is not referenced at all). OUTPUT PARAMETERS: W - array[K], top K eigenvalues ordered by descending of their absolute values Z - array[N,K], matrix of eigenvectors found Rep - report with additional parameters NOTE: internally this function allocates a copy of NxN dense A. You should take it into account when working with very large matrices occupying almost all RAM. -- ALGLIB -- Copyright 16.01.2017 by Bochkanov Sergey *************************************************************************/
void alglib::eigsubspacesolvedenses( eigsubspacestate state, real_2d_array a, bool isupper, real_1d_array& w, real_2d_array& z, eigsubspacereport& rep); void alglib::smp_eigsubspacesolvedenses( eigsubspacestate state, real_2d_array a, bool isupper, real_1d_array& w, real_2d_array& z, eigsubspacereport& rep);
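A minimal dense-mode sketch (assuming linalg.h; the 4x4 matrix is arbitrary test data): create the solver for the top-2 eigenpairs, tighten the stopping criteria, and run it on a symmetric matrix given by its upper triangle.

    #include <stdio.h>
    #include "linalg.h"

    using namespace alglib;

    int main()
    {
        real_2d_array a = "[[5,1,0,0],[1,4,1,0],[0,1,3,1],[0,0,1,2]]";
        eigsubspacestate s;
        eigsubspacereport rep;
        real_1d_array w;
        real_2d_array z;

        eigsubspacecreate(4, 2, s);
        eigsubspacesetcond(s, 1.0e-9, 0);              // stop once eigenvalues are accurate to ~1e-9
        eigsubspacesolvedenses(s, a, true, w, z, rep);
        printf("%s\n", w.tostring(4).c_str());         // two largest eigenvalues, descending
        return 0;
    }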
/************************************************************************* This function runs eigensolver for dense NxN symmetric matrix A, given by upper or lower triangle. This function can not process nonsymmetric matrices. INPUT PARAMETERS: State - solver state A - NxN symmetric matrix given by one of its triangles IsUpper - whether upper or lower triangle of A is given (the other one is not referenced at all). OUTPUT PARAMETERS: W - array[K], top K eigenvalues ordered by descending of their absolute values Z - array[N,K], matrix of eigenvectors found Rep - report with additional parameters -- ALGLIB -- Copyright 16.01.2017 by Bochkanov Sergey *************************************************************************/
void alglib::eigsubspacesolvesparses( eigsubspacestate state, sparsematrix a, bool isupper, real_1d_array& w, real_2d_array& z, eigsubspacereport& rep);
/************************************************************************* Finding the eigenvalues and eigenvectors of a Hermitian matrix The algorithm finds eigen pairs of a Hermitian matrix by reducing it to real tridiagonal form and using the QL/QR algorithm. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes one important improvement of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Generally, commercial ALGLIB is several times faster than open-source ! generic C edition, and many times faster than open-source C# edition. ! ! Multithreaded acceleration is NOT supported for this function. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. Input parameters: A - Hermitian matrix which is given by its upper or lower triangular part. Array whose indexes range within [0..N-1, 0..N-1]. N - size of matrix A. IsUpper - storage format. ZNeeded - flag controlling whether the eigenvectors are needed or not. If ZNeeded is equal to: * 0, the eigenvectors are not returned; * 1, the eigenvectors are returned. Output parameters: D - eigenvalues in ascending order. Array whose index ranges within [0..N-1]. Z - if ZNeeded is equal to: * 0, Z hasn't changed; * 1, Z contains the eigenvectors. Array whose indexes range within [0..N-1, 0..N-1]. The eigenvectors are stored in the matrix columns. Result: True, if the algorithm has converged. False, if the algorithm hasn't converged (rare case). Note: eigenvectors of Hermitian matrix are defined up to multiplication by a complex number L, such that |L|=1. -- ALGLIB -- Copyright 2005, 23 March 2007 by Bochkanov Sergey *************************************************************************/
bool alglib::hmatrixevd( complex_2d_array a, ae_int_t n, ae_int_t zneeded, bool isupper, real_1d_array& d, complex_2d_array& z);
/************************************************************************* Subroutine for finding the eigenvalues and eigenvectors of a Hermitian matrix with given indexes by using bisection and inverse iteration methods Input parameters: A - Hermitian matrix which is given by its upper or lower triangular part. Array whose indexes range within [0..N-1, 0..N-1]. N - size of matrix A. ZNeeded - flag controlling whether the eigenvectors are needed or not. If ZNeeded is equal to: * 0, the eigenvectors are not returned; * 1, the eigenvectors are returned. IsUpperA - storage format of matrix A. I1, I2 - index interval for searching (from I1 to I2). 0 <= I1 <= I2 <= N-1. Output parameters: W - array of the eigenvalues found. Array whose index ranges within [0..I2-I1]. Z - if ZNeeded is equal to: * 0, Z hasn't changed; * 1, Z contains eigenvectors. Array whose indexes range within [0..N-1, 0..I2-I1]. In that case, the eigenvectors are stored in the matrix columns. Result: True, if successful. W contains the eigenvalues, Z contains the eigenvectors (if needed). False, if the bisection method subroutine wasn't able to find the eigenvalues in the given interval or if the inverse iteration subroutine wasn't able to find all the corresponding eigenvectors. In that case, the eigenvalues and eigenvectors are not returned. Note: eigen vectors of Hermitian matrix are defined up to multiplication by a complex number L, such as |L|=1. -- ALGLIB -- Copyright 07.01.2006, 24.03.2007 by Bochkanov Sergey. *************************************************************************/
bool alglib::hmatrixevdi( complex_2d_array a, ae_int_t n, ae_int_t zneeded, bool isupper, ae_int_t i1, ae_int_t i2, real_1d_array& w, complex_2d_array& z);
/************************************************************************* Subroutine for finding the eigenvalues (and eigenvectors) of a Hermitian matrix in a given half-interval (A, B] by using a bisection and inverse iteration Input parameters: A - Hermitian matrix which is given by its upper or lower triangular part. Array whose indexes range within [0..N-1, 0..N-1]. N - size of matrix A. ZNeeded - flag controlling whether the eigenvectors are needed or not. If ZNeeded is equal to: * 0, the eigenvectors are not returned; * 1, the eigenvectors are returned. IsUpperA - storage format of matrix A. B1, B2 - half-interval (B1, B2] to search eigenvalues in. Output parameters: M - number of eigenvalues found in a given half-interval, M>=0 W - array of the eigenvalues found. Array whose index ranges within [0..M-1]. Z - if ZNeeded is equal to: * 0, Z hasn't changed; * 1, Z contains eigenvectors. Array whose indexes range within [0..N-1, 0..M-1]. The eigenvectors are stored in the matrix columns. Result: True, if successful. M contains the number of eigenvalues in the given half-interval (could be equal to 0), W contains the eigenvalues, Z contains the eigenvectors (if needed). False, if the bisection method subroutine wasn't able to find the eigenvalues in the given interval or if the inverse iteration subroutine wasn't able to find all the corresponding eigenvectors. In that case, the eigenvalues and eigenvectors are not returned, M is equal to 0. Note: eigen vectors of Hermitian matrix are defined up to multiplication by a complex number L, such as |L|=1. -- ALGLIB -- Copyright 07.01.2006, 24.03.2007 by Bochkanov Sergey. *************************************************************************/
bool alglib::hmatrixevdr( complex_2d_array a, ae_int_t n, ae_int_t zneeded, bool isupper, double b1, double b2, ae_int_t& m, real_1d_array& w, complex_2d_array& z);
/*************************************************************************
Finding eigenvalues and eigenvectors of a general (unsymmetric) matrix

COMMERCIAL EDITION OF ALGLIB:

! Commercial version of ALGLIB includes one important improvement of
! this function, which can be used from C++ and C#:
! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB)
!
! Intel MKL gives approximately constant (with respect to number of
! worker threads) acceleration factor which depends on CPU being used,
! problem size and "baseline" ALGLIB edition which is used for
! comparison. Speed-up provided by MKL for this particular problem (EVD)
! is really high, because MKL uses combination of (a) better low-level
! optimizations, and (b) better EVD algorithms.
!
! On one particular SSE-capable machine for N=1024, commercial MKL-
! -capable ALGLIB was:
! * 7-10 times faster than open source "generic C" version
! * 15-18 times faster than "pure C#" version
!
! Multithreaded acceleration is NOT supported for this function.
!
! We recommend you to read 'Working with commercial version' section of
! ALGLIB Reference Manual in order to find out how to use performance-
! related features provided by commercial edition of ALGLIB.

The algorithm finds eigenvalues and eigenvectors of a general matrix by
using the QR algorithm with multiple shifts. The algorithm can find
eigenvalues and both left and right eigenvectors.

The right eigenvector is a vector x such that A*x = w*x, and the left
eigenvector is a vector y such that y'*A = w*y' (here y' implies a complex
conjugate transposition of vector y).

Input parameters:
    A       -   matrix. Array whose indexes range within [0..N-1, 0..N-1].
    N       -   size of matrix A.
    VNeeded -   flag controlling whether eigenvectors are needed or not.
                If VNeeded is equal to:
                * 0, eigenvectors are not returned;
                * 1, right eigenvectors are returned;
                * 2, left eigenvectors are returned;
                * 3, both left and right eigenvectors are returned.

Output parameters:
    WR      -   real parts of eigenvalues.
                Array whose index ranges within [0..N-1].
    WI      -   imaginary parts of eigenvalues.
                Array whose index ranges within [0..N-1].
    VL, VR  -   arrays of left and right eigenvectors (if they are needed).
                If WI[i]=0, the respective eigenvalue is a real number, and
                it corresponds to the column number I of matrices VL/VR.
                If WI[i]>0, we have a pair of complex conjugate numbers
                with positive and negative imaginary parts:
                    the first eigenvalue  WR[i] + sqrt(-1)*WI[i];
                    the second eigenvalue WR[i+1] + sqrt(-1)*WI[i+1];
                    WI[i]>0
                    WI[i+1] = -WI[i] < 0
                In that case, the eigenvector corresponding to the first
                eigenvalue is located in i and i+1 columns of matrices
                VL/VR (the column number i contains the real part, and the
                column number i+1 contains the imaginary part), and the
                vector corresponding to the second eigenvalue is a complex
                conjugate to the first vector.
                Arrays whose indexes range within [0..N-1, 0..N-1].

Result:
    True, if the algorithm has converged.
    False, if the algorithm has not converged.

Note 1: Some users may ask the following question: what if WI[N-1]>0?
        WI[N] must contain an eigenvalue which is complex conjugate to the
        N-th eigenvalue, but the array has only size N?
        The answer is as follows: such a situation cannot occur because the
        algorithm finds pairs of eigenvalues, therefore, if WI[i]>0, I is
        strictly less than N-1.

Note 2: The algorithm performance depends on the value of the internal
        parameter NS of the InternalSchurDecomposition subroutine which
        defines the number of shifts in the QR algorithm (similarly to the
        block width in block-matrix algorithms of linear algebra). If you
        require maximum performance on your machine, it is recommended to
        adjust this parameter manually.

See also the InternalTREVC subroutine.

The algorithm is based on the LAPACK 3.0 library.
*************************************************************************/
bool alglib::rmatrixevd( real_2d_array a, ae_int_t n, ae_int_t vneeded, real_1d_array& wr, real_1d_array& wi, real_2d_array& vl, real_2d_array& vr);
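A brief sketch of the conjugate-pair convention described above, using an arbitrary 2x2 rotation-like matrix whose eigenvalues are +i and -i (assuming linalg.h):

    #include <stdio.h>
    #include "linalg.h"

    using namespace alglib;

    int main()
    {
        real_2d_array a = "[[0,1],[-1,0]]";
        real_1d_array wr, wi;
        real_2d_array vl, vr;
        if( !rmatrixevd(a, 2, 1, wr, wi, vl, vr) )     // 1 = right eigenvectors only
            return 1;
        printf("re: %s\n", wr.tostring(3).c_str());    // [0.000, 0.000]
        printf("im: %s\n", wi.tostring(3).c_str());    // [1.000, -1.000]: WI[i]>0, WI[i+1]=-WI[i]
        return 0;
    }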
/************************************************************************* Finding the eigenvalues and eigenvectors of a symmetric matrix The algorithm finds eigen pairs of a symmetric matrix by reducing it to tridiagonal form and using the QL/QR algorithm. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes one important improvement of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Generally, commercial ALGLIB is several times faster than open-source ! generic C edition, and many times faster than open-source C# edition. ! ! Multithreaded acceleration is NOT supported for this function. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. Input parameters: A - symmetric matrix which is given by its upper or lower triangular part. Array whose indexes range within [0..N-1, 0..N-1]. N - size of matrix A. ZNeeded - flag controlling whether the eigenvectors are needed or not. If ZNeeded is equal to: * 0, the eigenvectors are not returned; * 1, the eigenvectors are returned. IsUpper - storage format. Output parameters: D - eigenvalues in ascending order. Array whose index ranges within [0..N-1]. Z - if ZNeeded is equal to: * 0, Z hasn't changed; * 1, Z contains the eigenvectors. Array whose indexes range within [0..N-1, 0..N-1]. The eigenvectors are stored in the matrix columns. Result: True, if the algorithm has converged. False, if the algorithm hasn't converged (rare case). -- ALGLIB -- Copyright 2005-2008 by Bochkanov Sergey *************************************************************************/
bool alglib::smatrixevd( real_2d_array a, ae_int_t n, ae_int_t zneeded, bool isupper, real_1d_array& d, real_2d_array& z);
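
A minimal usage sketch (hypothetical, not one of the official examples; it assumes the linalg.h package header): the symmetric matrix [[2,1],[1,2]] has eigenvalues 1 and 3, returned in ascending order.

#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "linalg.h"

using namespace alglib;


int main(int argc, char **argv)
{
    //
    // Hypothetical example (not part of the original manual):
    // eigenvalues/eigenvectors of the 2x2 symmetric matrix [[2,1],[1,2]].
    // ZNeeded=1 - eigenvectors are returned in the columns of Z,
    // IsUpper=true - the upper triangle of A is used.
    //
    real_2d_array a = "[[2,1],[1,2]]";
    real_1d_array d;
    real_2d_array z;
    if( smatrixevd(a, 2, 1, true, d, z) )
        printf("%s\n", d.tostring(3).c_str()); // EXPECTED: [1,3]
    return 0;
}
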
/************************************************************************* Subroutine for finding the eigenvalues and eigenvectors of a symmetric matrix with given indexes by using bisection and inverse iteration methods. Input parameters: A - symmetric matrix which is given by its upper or lower triangular part. Array whose indexes range within [0..N-1, 0..N-1]. N - size of matrix A. ZNeeded - flag controlling whether the eigenvectors are needed or not. If ZNeeded is equal to: * 0, the eigenvectors are not returned; * 1, the eigenvectors are returned. IsUpperA - storage format of matrix A. I1, I2 - index interval for searching (from I1 to I2). 0 <= I1 <= I2 <= N-1. Output parameters: W - array of the eigenvalues found. Array whose index ranges within [0..I2-I1]. Z - if ZNeeded is equal to: * 0, Z hasn't changed; * 1, Z contains eigenvectors. Array whose indexes range within [0..N-1, 0..I2-I1]. In that case, the eigenvectors are stored in the matrix columns. Result: True, if successful. W contains the eigenvalues, Z contains the eigenvectors (if needed). False, if the bisection method subroutine wasn't able to find the eigenvalues in the given interval or if the inverse iteration subroutine wasn't able to find all the corresponding eigenvectors. In that case, the eigenvalues and eigenvectors are not returned. -- ALGLIB -- Copyright 07.01.2006 by Bochkanov Sergey *************************************************************************/
bool alglib::smatrixevdi( real_2d_array a, ae_int_t n, ae_int_t zneeded, bool isupper, ae_int_t i1, ae_int_t i2, real_1d_array& w, real_2d_array& z);
/************************************************************************* Subroutine for finding the eigenvalues (and eigenvectors) of a symmetric matrix in a given half open interval (A, B] by using a bisection and inverse iteration Input parameters: A - symmetric matrix which is given by its upper or lower triangular part. Array [0..N-1, 0..N-1]. N - size of matrix A. ZNeeded - flag controlling whether the eigenvectors are needed or not. If ZNeeded is equal to: * 0, the eigenvectors are not returned; * 1, the eigenvectors are returned. IsUpperA - storage format of matrix A. B1, B2 - half open interval (B1, B2] to search eigenvalues in. Output parameters: M - number of eigenvalues found in a given half-interval (M>=0). W - array of the eigenvalues found. Array whose index ranges within [0..M-1]. Z - if ZNeeded is equal to: * 0, Z hasn't changed; * 1, Z contains eigenvectors. Array whose indexes range within [0..N-1, 0..M-1]. The eigenvectors are stored in the matrix columns. Result: True, if successful. M contains the number of eigenvalues in the given half-interval (could be equal to 0), W contains the eigenvalues, Z contains the eigenvectors (if needed). False, if the bisection method subroutine wasn't able to find the eigenvalues in the given interval or if the inverse iteration subroutine wasn't able to find all the corresponding eigenvectors. In that case, the eigenvalues and eigenvectors are not returned, M is equal to 0. -- ALGLIB -- Copyright 07.01.2006 by Bochkanov Sergey *************************************************************************/
bool alglib::smatrixevdr( real_2d_array a, ae_int_t n, ae_int_t zneeded, bool isupper, double b1, double b2, ae_int_t& m, real_1d_array& w, real_2d_array& z);
/************************************************************************* Finding the eigenvalues and eigenvectors of a tridiagonal symmetric matrix The algorithm finds the eigen pairs of a tridiagonal symmetric matrix by using an QL/QR algorithm with implicit shifts. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes one important improvement of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Generally, commercial ALGLIB is several times faster than open-source ! generic C edition, and many times faster than open-source C# edition. ! ! Multithreaded acceleration is NOT supported for this function. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. Input parameters: D - the main diagonal of a tridiagonal matrix. Array whose index ranges within [0..N-1]. E - the secondary diagonal of a tridiagonal matrix. Array whose index ranges within [0..N-2]. N - size of matrix A. ZNeeded - flag controlling whether the eigenvectors are needed or not. If ZNeeded is equal to: * 0, the eigenvectors are not needed; * 1, the eigenvectors of a tridiagonal matrix are multiplied by the square matrix Z. It is used if the tridiagonal matrix is obtained by the similarity transformation of a symmetric matrix; * 2, the eigenvectors of a tridiagonal matrix replace the square matrix Z; * 3, matrix Z contains the first row of the eigenvectors matrix. Z - if ZNeeded=1, Z contains the square matrix by which the eigenvectors are multiplied. Array whose indexes range within [0..N-1, 0..N-1]. Output parameters: D - eigenvalues in ascending order. Array whose index ranges within [0..N-1]. Z - if ZNeeded is equal to: * 0, Z hasn't changed; * 1, Z contains the product of a given matrix (from the left) and the eigenvectors matrix (from the right); * 2, Z contains the eigenvectors. * 3, Z contains the first row of the eigenvectors matrix. If ZNeeded<3, Z is the array whose indexes range within [0..N-1, 0..N-1]. In that case, the eigenvectors are stored in the matrix columns. If ZNeeded=3, Z is the array whose indexes range within [0..0, 0..N-1]. Result: True, if the algorithm has converged. False, if the algorithm hasn't converged. -- LAPACK routine (version 3.0) -- Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., Courant Institute, Argonne National Lab, and Rice University September 30, 1994 *************************************************************************/
bool alglib::smatrixtdevd( real_1d_array& d, real_1d_array e, ae_int_t n, ae_int_t zneeded, real_2d_array& z);
/************************************************************************* Subroutine for finding tridiagonal matrix eigenvalues/vectors with given indexes (in ascending order) by using the bisection and inverse iteration. Input parameters: D - the main diagonal of a tridiagonal matrix. Array whose index ranges within [0..N-1]. E - the secondary diagonal of a tridiagonal matrix. Array whose index ranges within [0..N-2]. N - size of matrix. N>=0. ZNeeded - flag controlling whether the eigenvectors are needed or not. If ZNeeded is equal to: * 0, the eigenvectors are not needed; * 1, the eigenvectors of a tridiagonal matrix are multiplied by the square matrix Z. It is used if the tridiagonal matrix is obtained by the similarity transformation of a symmetric matrix. * 2, the eigenvectors of a tridiagonal matrix replace matrix Z. I1, I2 - index interval for searching (from I1 to I2). 0 <= I1 <= I2 <= N-1. Z - if ZNeeded is equal to: * 0, Z isn't used and remains unchanged; * 1, Z contains the square matrix (array whose indexes range within [0..N-1, 0..N-1]) which reduces the given symmetric matrix to tridiagonal form; * 2, Z isn't used (but changed on the exit). Output parameters: D - array of the eigenvalues found. Array whose index ranges within [0..I2-I1]. Z - if ZNeeded is equal to: * 0, doesn't contain any information; * 1, contains the product of a given NxN matrix Z (from the left) and Nx(I2-I1) matrix of the eigenvectors found (from the right). Array whose indexes range within [0..N-1, 0..I2-I1]. * 2, contains the matrix of the eigenvectors found. Array whose indexes range within [0..N-1, 0..I2-I1]. Result: True, if successful. In that case, D contains the eigenvalues, Z contains the eigenvectors (if needed). It should be noted that the subroutine changes the size of arrays D and Z. False, if the bisection method subroutine wasn't able to find the eigenvalues in the given interval or if the inverse iteration subroutine wasn't able to find all the corresponding eigenvectors. In that case, the eigenvalues and eigenvectors are not returned. -- ALGLIB -- Copyright 25.12.2005 by Bochkanov Sergey *************************************************************************/
bool alglib::smatrixtdevdi( real_1d_array& d, real_1d_array e, ae_int_t n, ae_int_t zneeded, ae_int_t i1, ae_int_t i2, real_2d_array& z);
/************************************************************************* Subroutine for finding the tridiagonal matrix eigenvalues/vectors in a given half-interval (A, B] by using bisection and inverse iteration. Input parameters: D - the main diagonal of a tridiagonal matrix. Array whose index ranges within [0..N-1]. E - the secondary diagonal of a tridiagonal matrix. Array whose index ranges within [0..N-2]. N - size of matrix, N>=0. ZNeeded - flag controlling whether the eigenvectors are needed or not. If ZNeeded is equal to: * 0, the eigenvectors are not needed; * 1, the eigenvectors of a tridiagonal matrix are multiplied by the square matrix Z. It is used if the tridiagonal matrix is obtained by the similarity transformation of a symmetric matrix. * 2, the eigenvectors of a tridiagonal matrix replace matrix Z. A, B - half-interval (A, B] to search eigenvalues in. Z - if ZNeeded is equal to: * 0, Z isn't used and remains unchanged; * 1, Z contains the square matrix (array whose indexes range within [0..N-1, 0..N-1]) which reduces the given symmetric matrix to tridiagonal form; * 2, Z isn't used (but changed on the exit). Output parameters: D - array of the eigenvalues found. Array whose index ranges within [0..M-1]. M - number of eigenvalues found in the given half-interval (M>=0). Z - if ZNeeded is equal to: * 0, doesn't contain any information; * 1, contains the product of a given NxN matrix Z (from the left) and NxM matrix of the eigenvectors found (from the right). Array whose indexes range within [0..N-1, 0..M-1]. * 2, contains the matrix of the eigenvectors found. Array whose indexes range within [0..N-1, 0..M-1]. Result: True, if successful. In that case, M contains the number of eigenvalues in the given half-interval (could be equal to 0), D contains the eigenvalues, Z contains the eigenvectors (if needed). It should be noted that the subroutine changes the size of arrays D and Z. False, if the bisection method subroutine wasn't able to find the eigenvalues in the given interval or if the inverse iteration subroutine wasn't able to find all the corresponding eigenvectors. In that case, the eigenvalues and eigenvectors are not returned, M is equal to 0. -- ALGLIB -- Copyright 31.03.2008 by Bochkanov Sergey *************************************************************************/
bool alglib::smatrixtdevdr( real_1d_array& d, real_1d_array e, ae_int_t n, ae_int_t zneeded, double a, double b, ae_int_t& m, real_2d_array& z);
exponentialintegralei
exponentialintegralen
/************************************************************************* Exponential integral Ei(x) Ei(x) = integral(exp(t)/t dt, t=-inf..x) (Cauchy principal value) Not defined for x <= 0. See also expn.c. ACCURACY: Relative error: arithmetic domain # trials peak rms IEEE 0,100 50000 8.6e-16 1.3e-16 Cephes Math Library Release 2.8: May, 1999 Copyright 1999 by Stephen L. Moshier *************************************************************************/
double alglib::exponentialintegralei(double x);
/************************************************************************* Exponential integral En(x) Evaluates the exponential integral En(x) = integral(exp(-x*t)/t^n dt, t=1..inf) Both n and x must be nonnegative. The routine employs either a power series, a continued fraction, or an asymptotic formula depending on the relative values of n and x. ACCURACY: Relative error: arithmetic domain # trials peak rms IEEE 0, 30 10000 1.7e-15 3.6e-16 Cephes Math Library Release 2.8: June, 2000 Copyright 1985, 2000 by Stephen L. Moshier *************************************************************************/
double alglib::exponentialintegralen(double x, ae_int_t n);
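
A minimal usage sketch (hypothetical, not one of the official examples; it assumes the specialfunctions.h package header): Ei(1) is approximately 1.8951 and E1(1) is approximately 0.2194.

#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "specialfunctions.h"

using namespace alglib;


int main(int argc, char **argv)
{
    //
    // Hypothetical example (not part of the original manual):
    // evaluate Ei(1) and E1(1).
    //
    double ei = exponentialintegralei(1.0);     // Ei(1) ~ 1.8951
    double e1 = exponentialintegralen(1.0, 1);  // E1(1) ~ 0.2194
    printf("%.4f %.4f\n", ei, e1);
    return 0;
}
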
fcdistribution
fdistribution
invfdistribution
/************************************************************************* Complemented F distribution Returns the area from x to infinity under the F density function (also known as Snedecor's density or the variance ratio density). 1-P(x) = (1/B(a,b)) * integral(t^(a-1)*(1-t)^(b-1) dt, t=x..inf) The incomplete beta integral is used, according to the formula P(x) = incbet( df2/2, df1/2, df2/(df2 + df1*x) ). ACCURACY: Tested at random points (a,b,x) in the indicated intervals. x a,b Relative error: arithmetic domain domain # trials peak rms IEEE 0,1 1,100 100000 3.7e-14 5.9e-16 IEEE 1,5 1,100 100000 8.0e-15 1.6e-15 IEEE 0,1 1,10000 100000 1.8e-11 3.5e-13 IEEE 1,5 1,10000 100000 2.0e-11 3.0e-12 Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 1995, 2000 by Stephen L. Moshier *************************************************************************/
double alglib::fcdistribution(ae_int_t a, ae_int_t b, double x);
/************************************************************************* F distribution Returns the area from zero to x under the F density function (also known as Snedecor's density or the variance ratio density). This is the density of x = (u1/df1)/(u2/df2), where u1 and u2 are random variables having Chi square distributions with df1 and df2 degrees of freedom, respectively. The incomplete beta integral is used, according to the formula P(x) = incbet( df1/2, df2/2, df1*x/(df2 + df1*x) ). The arguments a and b are greater than zero, and x is nonnegative. ACCURACY: Tested at random points (a,b,x). x a,b Relative error: arithmetic domain domain # trials peak rms IEEE 0,1 0,100 100000 9.8e-15 1.7e-15 IEEE 1,5 0,100 100000 6.5e-15 3.5e-16 IEEE 0,1 1,10000 100000 2.2e-11 3.3e-12 IEEE 1,5 1,10000 100000 1.1e-11 1.7e-13 Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 1995, 2000 by Stephen L. Moshier *************************************************************************/
double alglib::fdistribution(ae_int_t a, ae_int_t b, double x);
/************************************************************************* Inverse of complemented F distribution Finds the F density argument x such that the integral from x to infinity of the F density is equal to the given probability p. This is accomplished using the inverse beta integral function and the relations z = incbi( df2/2, df1/2, p ) x = df2 (1-z) / (df1 z). Note: the following relations hold for the inverse of the uncomplemented F distribution: z = incbi( df1/2, df2/2, p ) x = df2 z / (df1 (1-z)). ACCURACY: Tested at random points (a,b,p). a,b Relative error: arithmetic domain # trials peak rms For p between .001 and 1: IEEE 1,100 100000 8.3e-15 4.7e-16 IEEE 1,10000 100000 2.1e-11 1.4e-13 For p between 10^-6 and 10^-3: IEEE 1,100 50000 1.3e-12 8.4e-15 IEEE 1,10000 50000 3.0e-12 4.8e-14 Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 1995, 2000 by Stephen L. Moshier *************************************************************************/
double alglib::invfdistribution(ae_int_t a, ae_int_t b, double y);
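
A minimal usage sketch (hypothetical, not one of the official examples; it assumes the specialfunctions.h package header): fdistribution() and fcdistribution() are complementary, and invfdistribution() inverts the complemented CDF.

#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "specialfunctions.h"

using namespace alglib;


int main(int argc, char **argv)
{
    //
    // Hypothetical example (not part of the original manual):
    // F distribution with df1=5, df2=10 degrees of freedom.
    //
    double p = fdistribution(5, 10, 2.0);   // P(x<=2)
    double q = fcdistribution(5, 10, 2.0);  // P(x>2), p+q should be 1
    double x = invfdistribution(5, 10, q);  // should recover 2.0
    printf("%.4f %.4f %.4f\n", p, q, x);
    return 0;
}
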
fftc1d
fftc1dinv
fftr1d
fftr1dinv
fft_complex_d1 Complex FFT: simple example
fft_complex_d2 Complex FFT: advanced example
fft_real_d1 Real FFT: simple example
fft_real_d2 Real FFT: advanced example
/************************************************************************* 1-dimensional complex FFT. Array size N may be an arbitrary number (composite or prime). Composite N's are handled with a cache-oblivious variation of the Cooley-Tukey algorithm. Small prime-factors are transformed using hard coded codelets (similar to FFTW codelets, but without low-level optimization), large prime-factors are handled with Bluestein's algorithm. The fastest transforms are for smooth N's (prime factors 2, 3, 5 only), with powers of 2 being fastest of all. When N has prime factors larger than these, but orders of magnitude smaller than N, computations will be about 4 times slower than for nearby highly composite N's. When N itself is prime, speed will be 6 times lower. Algorithm has O(N*logN) complexity for any N (composite or prime). INPUT PARAMETERS A - array[0..N-1] - complex function to be transformed N - problem size OUTPUT PARAMETERS A - DFT of the input array, array[0..N-1] A_out[j] = SUM(A_in[k]*exp(-2*pi*sqrt(-1)*j*k/N), k = 0..N-1) -- ALGLIB -- Copyright 29.05.2009 by Bochkanov Sergey *************************************************************************/
void alglib::fftc1d(complex_1d_array& a); void alglib::fftc1d(complex_1d_array& a, ae_int_t n);

Examples:   [1]  [2]  

/************************************************************************* 1-dimensional complex inverse FFT. Array size N may be arbitrary number (composite or prime). Algorithm has O(N*logN) complexity for any N (composite or prime). See FFTC1D() description for more information about algorithm performance. INPUT PARAMETERS A - array[0..N-1] - complex array to be transformed N - problem size OUTPUT PARAMETERS A - inverse DFT of a input array, array[0..N-1] A_out[j] = SUM(A_in[k]/N*exp(+2*pi*sqrt(-1)*j*k/N), k = 0..N-1) -- ALGLIB -- Copyright 29.05.2009 by Bochkanov Sergey *************************************************************************/
void alglib::fftc1dinv(complex_1d_array& a); void alglib::fftc1dinv(complex_1d_array& a, ae_int_t n);

Examples:   [1]  [2]  

/************************************************************************* 1-dimensional real FFT. Algorithm has O(N*logN) complexity for any N (composite or prime). INPUT PARAMETERS A - array[0..N-1] - real function to be transformed N - problem size OUTPUT PARAMETERS F - DFT of the input array, array[0..N-1] F[j] = SUM(A[k]*exp(-2*pi*sqrt(-1)*j*k/N), k = 0..N-1) NOTE: F[] satisfies symmetry property F[k] = conj(F[N-k]), so just one half of the array is usually needed. But for convenience the subroutine returns the full complex array (with frequencies above N/2), so its result may be used by other FFT-related subroutines. -- ALGLIB -- Copyright 01.06.2009 by Bochkanov Sergey *************************************************************************/
void alglib::fftr1d(real_1d_array a, complex_1d_array& f); void alglib::fftr1d(real_1d_array a, ae_int_t n, complex_1d_array& f);

Examples:   [1]  [2]  

/************************************************************************* 1-dimensional real inverse FFT. Algorithm has O(N*logN) complexity for any N (composite or prime). INPUT PARAMETERS F - array[0..floor(N/2)] - frequencies from forward real FFT N - problem size OUTPUT PARAMETERS A - inverse DFT of the input array, array[0..N-1] NOTE: F[] should satisfy symmetry property F[k] = conj(F[N-k]), so just one half of the frequencies array is needed - elements from 0 to floor(N/2). F[0] is ALWAYS real. If N is even F[floor(N/2)] is real too. If N is odd, then F[floor(N/2)] has no special properties. Relying on properties noted above, the FFTR1DInv subroutine uses only elements from 0th to floor(N/2)-th. It ignores the imaginary part of F[0], and in case N is even it ignores the imaginary part of F[floor(N/2)] too. When you call this function using the full arguments list - "FFTR1DInv(F,N,A)" - you can pass either the full frequencies array with N elements or the reduced array with roughly N/2 elements - the subroutine will successfully transform both. If you call this function using the reduced arguments list - "FFTR1DInv(F,A)" - you must pass the FULL array with N elements (although elements above N/2 are still not used) because the array size is used to automatically determine the FFT length. -- ALGLIB -- Copyright 01.06.2009 by Bochkanov Sergey *************************************************************************/
void alglib::fftr1dinv(complex_1d_array f, real_1d_array& a); void alglib::fftr1dinv(complex_1d_array f, ae_int_t n, real_1d_array& a);

Examples:   [1]  [2]  

#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "fasttransforms.h"

using namespace alglib;


int main(int argc, char **argv)
{
    //
    // first we demonstrate forward FFT:
    // [1i,1i,1i,1i] is converted to [4i, 0, 0, 0]
    //
    complex_1d_array z = "[1i,1i,1i,1i]";
    fftc1d(z);
    printf("%s\n", z.tostring(3).c_str()); // EXPECTED: [4i,0,0,0]

    //
    // now we convert [4i, 0, 0, 0] back to [1i,1i,1i,1i]
    // with backward FFT
    //
    fftc1dinv(z);
    printf("%s\n", z.tostring(3).c_str()); // EXPECTED: [1i,1i,1i,1i]
    return 0;
}


#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "fasttransforms.h"

using namespace alglib;


int main(int argc, char **argv)
{
    //
    // first we demonstrate forward FFT:
    // [0,1,0,1i] is converted to [1+1i, -1-1i, -1-1i, 1+1i]
    //
    complex_1d_array z = "[0,1,0,1i]";
    fftc1d(z);
    printf("%s\n", z.tostring(3).c_str()); // EXPECTED: [1+1i, -1-1i, -1-1i, 1+1i]

    //
    // now we convert result back with backward FFT
    //
    fftc1dinv(z);
    printf("%s\n", z.tostring(3).c_str()); // EXPECTED: [0,1,0,1i]
    return 0;
}


#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "fasttransforms.h"

using namespace alglib;


int main(int argc, char **argv)
{
    //
    // first we demonstrate forward FFT:
    // [1,1,1,1] is converted to [4, 0, 0, 0]
    //
    real_1d_array x = "[1,1,1,1]";
    complex_1d_array f;
    real_1d_array x2;
    fftr1d(x, f);
    printf("%s\n", f.tostring(3).c_str()); // EXPECTED: [4,0,0,0]

    //
    // now we convert [4, 0, 0, 0] back to [1,1,1,1]
    // with backward FFT
    //
    fftr1dinv(f, x2);
    printf("%s\n", x2.tostring(3).c_str()); // EXPECTED: [1,1,1,1]
    return 0;
}


#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "fasttransforms.h"

using namespace alglib;


int main(int argc, char **argv)
{
    //
    // first we demonstrate forward FFT:
    // [1,2,3,4] is converted to [10, -2+2i, -2, -2-2i]
    //
    // note that output array is self-adjoint:
    // * f[0] = conj(f[0])
    // * f[1] = conj(f[3])
    // * f[2] = conj(f[2])
    //
    real_1d_array x = "[1,2,3,4]";
    complex_1d_array f;
    real_1d_array x2;
    fftr1d(x, f);
    printf("%s\n", f.tostring(3).c_str()); // EXPECTED: [10, -2+2i, -2, -2-2i]

    //
    // now we convert [10, -2+2i, -2, -2-2i] back to [1,2,3,4]
    //
    fftr1dinv(f, x2);
    printf("%s\n", x2.tostring(3).c_str()); // EXPECTED: [1,2,3,4]

    //
    // remember that F is self-adjoint? It means that we can pass just half
    // (slightly larger than half) of F to inverse real FFT and still get our result.
    //
    // I.e. instead [10, -2+2i, -2, -2-2i] we pass just [10, -2+2i, -2] and everything works!
    //
    // NOTE: in this case we should explicitly pass array length (which is 4) to ALGLIB;
    // if not, it will automatically use array length to determine FFT size and
    // will erroneously make half-length FFT.
    //
    f = "[10, -2+2i, -2]";
    fftr1dinv(f, 4, x2);
    printf("%s\n", x2.tostring(3).c_str()); // EXPECTED: [1,2,3,4]
    return 0;
}


fhtr1d
fhtr1dinv
/************************************************************************* 1-dimensional Fast Hartley Transform. Algorithm has O(N*logN) complexity for any N (composite or prime). INPUT PARAMETERS A - array[0..N-1] - real function to be transformed N - problem size OUTPUT PARAMETERS A - FHT of the input array, array[0..N-1], A_out[k] = sum(A_in[j]*(cos(2*pi*j*k/N)+sin(2*pi*j*k/N)), j=0..N-1) -- ALGLIB -- Copyright 04.06.2009 by Bochkanov Sergey *************************************************************************/
void alglib::fhtr1d(real_1d_array& a, ae_int_t n);
/************************************************************************* 1-dimensional inverse FHT. Algorithm has O(N*logN) complexity for any N (composite or prime). INPUT PARAMETERS A - array[0..N-1] - real array to be transformed N - problem size OUTPUT PARAMETERS A - inverse FHT of the input array, array[0..N-1] -- ALGLIB -- Copyright 29.05.2009 by Bochkanov Sergey *************************************************************************/
void alglib::fhtr1dinv(real_1d_array& a, ae_int_t n);
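
A minimal usage sketch (hypothetical, not one of the official examples): a forward FHT followed by the inverse FHT restores the original array.

#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "fasttransforms.h"

using namespace alglib;


int main(int argc, char **argv)
{
    //
    // Hypothetical example (not part of the original manual):
    // forward FHT of [1,2,3,4] followed by the inverse FHT.
    //
    real_1d_array a = "[1,2,3,4]";
    fhtr1d(a, 4);
    fhtr1dinv(a, 4);
    printf("%s\n", a.tostring(3).c_str()); // EXPECTED: [1,2,3,4]
    return 0;
}
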
filterema
filterlrma
filtersma
filters_d_ema EMA(alpha) filter
filters_d_lrma LRMA(k) filter
filters_d_sma SMA(k) filter
/************************************************************************* Filters: exponential moving averages. This filter replaces the array by the results of the EMA(alpha) filter. EMA(alpha) is defined as a filter which replaces X[] by S[]: S[0] = X[0] S[t] = alpha*X[t] + (1-alpha)*S[t-1] INPUT PARAMETERS: X - array[N], array to process. It can be larger than N, in this case only first N points are processed. N - points count, N>=0 alpha - 0<alpha<=1, smoothing parameter. OUTPUT PARAMETERS: X - array, whose first N elements were processed with EMA(alpha) NOTE 1: this function uses efficient in-place algorithm which does not allocate temporary arrays. NOTE 2: this algorithm uses BOTH previous points and current one, i.e. new value of X[i] depends on BOTH previous point and X[i] itself. NOTE 3: technical analysis users quite often work with EMA coefficient expressed in DAYS instead of fractions. If you want to calculate EMA(N), where N is a number of days, you can use alpha=2/(N+1). -- ALGLIB -- Copyright 25.10.2011 by Bochkanov Sergey *************************************************************************/
void alglib::filterema(real_1d_array& x, double alpha); void alglib::filterema(real_1d_array& x, ae_int_t n, double alpha);

Examples:   [1]  

/************************************************************************* Filters: linear regression moving averages. This filter replaces the array by the results of the LRMA(K) filter. LRMA(K) is defined as a filter which, for each data point, builds a linear regression model using K previous points (the point itself is included in these K points) and calculates the value of this linear model at the point in question. INPUT PARAMETERS: X - array[N], array to process. It can be larger than N, in this case only first N points are processed. N - points count, N>=0 K - K>=1 (K can be larger than N, such cases will be correctly handled). Window width. K=1 corresponds to identity transformation (nothing changes). OUTPUT PARAMETERS: X - array, whose first N elements were processed with LRMA(K) NOTE 1: this function uses efficient in-place algorithm which does not allocate temporary arrays. NOTE 2: this algorithm makes only one pass through array and uses running sum to speed-up calculation of the averages. Additional measures are taken to ensure that running sum on a long sequence of zero elements will be correctly reset to zero even in the presence of round-off error. NOTE 3: this is unsymmetric version of the algorithm, which does NOT average points after the current one. Only X[i], X[i-1], ... are used when calculating new value of X[i]. We should also note that this algorithm uses BOTH previous points and current one, i.e. new value of X[i] depends on BOTH previous point and X[i] itself. -- ALGLIB -- Copyright 25.10.2011 by Bochkanov Sergey *************************************************************************/
void alglib::filterlrma(real_1d_array& x, ae_int_t k); void alglib::filterlrma(real_1d_array& x, ae_int_t n, ae_int_t k);

Examples:   [1]  

/************************************************************************* Filters: simple moving averages (unsymmetric). This filter replaces the array by the results of the SMA(K) filter. SMA(K) is defined as a filter which averages at most K previous points (previous - not points AROUND the central point) - or less, in case of the first K-1 points. INPUT PARAMETERS: X - array[N], array to process. It can be larger than N, in this case only first N points are processed. N - points count, N>=0 K - K>=1 (K can be larger than N, such cases will be correctly handled). Window width. K=1 corresponds to identity transformation (nothing changes). OUTPUT PARAMETERS: X - array, whose first N elements were processed with SMA(K) NOTE 1: this function uses efficient in-place algorithm which does not allocate temporary arrays. NOTE 2: this algorithm makes only one pass through array and uses running sum to speed-up calculation of the averages. Additional measures are taken to ensure that running sum on a long sequence of zero elements will be correctly reset to zero even in the presence of round-off error. NOTE 3: this is unsymmetric version of the algorithm, which does NOT average points after the current one. Only X[i], X[i-1], ... are used when calculating new value of X[i]. We should also note that this algorithm uses BOTH previous points and current one, i.e. new value of X[i] depends on BOTH previous point and X[i] itself. -- ALGLIB -- Copyright 25.10.2011 by Bochkanov Sergey *************************************************************************/
void alglib::filtersma(real_1d_array& x, ae_int_t k); void alglib::filtersma(real_1d_array& x, ae_int_t n, ae_int_t k);

Examples:   [1]  

#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "dataanalysis.h"

using namespace alglib;


int main(int argc, char **argv)
{
    //
    // Here we demonstrate EMA(0.5) filtering for time series.
    //
    real_1d_array x = "[5,6,7,8]";

    //
    // Apply filter.
    // We should get [5, 5.5, 6.25, 7.125] as result
    //
    filterema(x, 0.5);
    printf("%s\n", x.tostring(4).c_str()); // EXPECTED: [5,5.5,6.25,7.125]
    return 0;
}


#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "dataanalysis.h"

using namespace alglib;


int main(int argc, char **argv)
{
    //
    // Here we demonstrate LRMA(3) filtering for time series.
    //
    real_1d_array x = "[7,8,8,9,12,12]";

    //
    // Apply filter.
    // We should get [7.0000, 8.0000, 8.1667, 8.8333, 11.6667, 12.5000] as result
    //    
    filterlrma(x, 3);
    printf("%s\n", x.tostring(4).c_str()); // EXPECTED: [7.0000,8.0000,8.1667,8.8333,11.6667,12.5000]
    return 0;
}


#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "dataanalysis.h"

using namespace alglib;


int main(int argc, char **argv)
{
    //
    // Here we demonstrate SMA(k) filtering for time series.
    //
    real_1d_array x = "[5,6,7,8]";

    //
    // Apply filter.
    // We should get [5, 5.5, 6.5, 7.5] as result
    //
    filtersma(x, 2);
    printf("%s\n", x.tostring(4).c_str()); // EXPECTED: [5,5.5,6.5,7.5]
    return 0;
}


fresnelintegral
/************************************************************************* Fresnel integral Evaluates the Fresnel integrals C(x) = integral(cos(pi/2*t**2) dt, t=0..x), S(x) = integral(sin(pi/2*t**2) dt, t=0..x). The integrals are evaluated by a power series for x < 1. For x >= 1 auxiliary functions f(x) and g(x) are employed such that C(x) = 0.5 + f(x) sin( pi/2 x**2 ) - g(x) cos( pi/2 x**2 ) S(x) = 0.5 - f(x) cos( pi/2 x**2 ) - g(x) sin( pi/2 x**2 ) ACCURACY: Relative error. Arithmetic function domain # trials peak rms IEEE S(x) 0, 10 10000 2.0e-15 3.2e-16 IEEE C(x) 0, 10 10000 1.8e-15 3.3e-16 Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 1989, 2000 by Stephen L. Moshier *************************************************************************/
void alglib::fresnelintegral(double x, double& c, double& s);
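
A minimal usage sketch (hypothetical, not one of the official examples; it assumes the specialfunctions.h package header): C(1) is approximately 0.7799 and S(1) is approximately 0.4383.

#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "specialfunctions.h"

using namespace alglib;


int main(int argc, char **argv)
{
    //
    // Hypothetical example (not part of the original manual):
    // Fresnel integrals at x=1.
    //
    double c, s;
    fresnelintegral(1.0, c, s);
    printf("%.4f %.4f\n", c, s); // EXPECTED: roughly 0.7799 0.4383
    return 0;
}
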
gammafunction
lngamma
/************************************************************************* Gamma function Input parameters: X - argument Domain: 0 < X < 171.6 -170 < X < 0, X is not an integer. Relative error: arithmetic domain # trials peak rms IEEE -170,-33 20000 2.3e-15 3.3e-16 IEEE -33, 33 20000 9.4e-16 2.2e-16 IEEE 33, 171.6 20000 2.3e-15 3.2e-16 Cephes Math Library Release 2.8: June, 2000 Original copyright 1984, 1987, 1989, 1992, 2000 by Stephen L. Moshier Translated to AlgoPascal by Bochkanov Sergey (2005, 2006, 2007). *************************************************************************/
double alglib::gammafunction(double x);
/************************************************************************* Natural logarithm of gamma function Input parameters: X - argument Result: logarithm of the absolute value of the Gamma(X). Output parameters: SgnGam - sign(Gamma(X)) Domain: 0 < X < 2.55e305 -2.55e305 < X < 0, X is not an integer. ACCURACY: arithmetic domain # trials peak rms IEEE 0, 3 28000 5.4e-16 1.1e-16 IEEE 2.718, 2.556e305 40000 3.5e-16 8.3e-17 The error criterion was relative when the function magnitude was greater than one but absolute when it was less than one. The following test used the relative error criterion, though at certain points the relative error could be much higher than indicated. IEEE -200, -4 10000 4.8e-16 1.3e-16 Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 1989, 1992, 2000 by Stephen L. Moshier Translated to AlgoPascal by Bochkanov Sergey (2005, 2006, 2007). *************************************************************************/
double alglib::lngamma(double x, double& sgngam);
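
A minimal usage sketch (hypothetical, not one of the official examples; it assumes the specialfunctions.h package header): since lngamma() returns ln|Gamma(x)| together with the sign of Gamma(x), sgngam*exp(lngamma(x)) should reproduce gammafunction(x).

#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "specialfunctions.h"

using namespace alglib;


int main(int argc, char **argv)
{
    //
    // Hypothetical example (not part of the original manual):
    // Gamma(5)=24, so both values printed below should be ~24.
    //
    double sgngam;
    double g  = gammafunction(5.0);       // 24
    double lg = lngamma(5.0, sgngam);     // ln(24) ~ 3.1781, sgngam=+1
    printf("%.4f %.4f\n", g, sgngam*exp(lg));
    return 0;
}
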
gkqgenerategaussjacobi
gkqgenerategausslegendre
gkqgeneraterec
gkqlegendrecalc
gkqlegendretbl
/************************************************************************* Returns Gauss and Gauss-Kronrod nodes/weights for Gauss-Jacobi quadrature on [-1,1] with weight function W(x)=Power(1-x,Alpha)*Power(1+x,Beta). INPUT PARAMETERS: N - number of Kronrod nodes, must be odd number, >=3. Alpha - power-law coefficient, Alpha>-1 Beta - power-law coefficient, Beta>-1 OUTPUT PARAMETERS: Info - error code: * -5 no real and positive Gauss-Kronrod formula can be created for such a weight function with a given number of nodes. * -4 an error was detected when calculating weights/nodes. Alpha or Beta are too close to -1 to obtain weights/nodes with high enough accuracy, or, may be, N is too large. Try to use multiple precision version. * -3 internal eigenproblem solver hasn't converged * -1 incorrect N was passed * +1 OK * +2 OK, but quadrature rule have exterior nodes, x[0]<-1 or x[n-1]>+1 X - array[0..N-1] - array of quadrature nodes, ordered in ascending order. WKronrod - array[0..N-1] - Kronrod weights WGauss - array[0..N-1] - Gauss weights (interleaved with zeros corresponding to extended Kronrod nodes). -- ALGLIB -- Copyright 12.05.2009 by Bochkanov Sergey *************************************************************************/
void alglib::gkqgenerategaussjacobi( ae_int_t n, double alpha, double beta, ae_int_t& info, real_1d_array& x, real_1d_array& wkronrod, real_1d_array& wgauss);
/************************************************************************* Returns Gauss and Gauss-Kronrod nodes/weights for Gauss-Legendre quadrature with N points. GKQLegendreCalc (calculation) or GKQLegendreTbl (precomputed table) is used depending on machine precision and number of nodes. INPUT PARAMETERS: N - number of Kronrod nodes, must be odd number, >=3. OUTPUT PARAMETERS: Info - error code: * -4 an error was detected when calculating weights/nodes. N is too large to obtain weights/nodes with high enough accuracy. Try to use multiple precision version. * -3 internal eigenproblem solver hasn't converged * -1 incorrect N was passed * +1 OK X - array[0..N-1] - array of quadrature nodes, ordered in ascending order. WKronrod - array[0..N-1] - Kronrod weights WGauss - array[0..N-1] - Gauss weights (interleaved with zeros corresponding to extended Kronrod nodes). -- ALGLIB -- Copyright 12.05.2009 by Bochkanov Sergey *************************************************************************/
void alglib::gkqgenerategausslegendre( ae_int_t n, ae_int_t& info, real_1d_array& x, real_1d_array& wkronrod, real_1d_array& wgauss);
/************************************************************************* Computation of nodes and weights of a Gauss-Kronrod quadrature formula The algorithm generates the N-point Gauss-Kronrod quadrature formula with weight function given by coefficients alpha and beta of a recurrence relation which generates a system of orthogonal polynomials: P-1(x) = 0 P0(x) = 1 Pn+1(x) = (x-alpha(n))*Pn(x) - beta(n)*Pn-1(x) and zero moment Mu0 Mu0 = integral(W(x)dx,a,b) INPUT PARAMETERS: Alpha - alpha coefficients, array[0..floor(3*K/2)]. Beta - beta coefficients, array[0..ceil(3*K/2)]. Beta[0] is not used and may be arbitrary. Beta[I]>0. Mu0 - zeroth moment of the weight function. N - number of nodes of the Gauss-Kronrod quadrature formula, N >= 3, N = 2*K+1. OUTPUT PARAMETERS: Info - error code: * -5 no real and positive Gauss-Kronrod formula can be created for such a weight function with a given number of nodes. * -4 N is too large, task may be ill conditioned - x[i]=x[i+1] found. * -3 internal eigenproblem solver hasn't converged * -2 Beta[i]<=0 * -1 incorrect N was passed * +1 OK X - array[0..N-1] - array of quadrature nodes, in ascending order. WKronrod - array[0..N-1] - Kronrod weights WGauss - array[0..N-1] - Gauss weights (interleaved with zeros corresponding to extended Kronrod nodes). -- ALGLIB -- Copyright 08.05.2009 by Bochkanov Sergey *************************************************************************/
void alglib::gkqgeneraterec( real_1d_array alpha, real_1d_array beta, double mu0, ae_int_t n, ae_int_t& info, real_1d_array& x, real_1d_array& wkronrod, real_1d_array& wgauss);
/************************************************************************* Returns Gauss and Gauss-Kronrod nodes for quadrature with N points. Reduction to tridiagonal eigenproblem is used. INPUT PARAMETERS: N - number of Kronrod nodes, must be odd number, >=3. OUTPUT PARAMETERS: Info - error code: * -4 an error was detected when calculating weights/nodes. N is too large to obtain weights/nodes with high enough accuracy. Try to use multiple precision version. * -3 internal eigenproblem solver hasn't converged * -1 incorrect N was passed * +1 OK X - array[0..N-1] - array of quadrature nodes, ordered in ascending order. WKronrod - array[0..N-1] - Kronrod weights WGauss - array[0..N-1] - Gauss weights (interleaved with zeros corresponding to extended Kronrod nodes). -- ALGLIB -- Copyright 12.05.2009 by Bochkanov Sergey *************************************************************************/
void alglib::gkqlegendrecalc( ae_int_t n, ae_int_t& info, real_1d_array& x, real_1d_array& wkronrod, real_1d_array& wgauss);
/************************************************************************* Returns Gauss and Gauss-Kronrod nodes for quadrature with N points using pre-calculated table. Nodes/weights were computed with accuracy up to 1.0E-32 (if MPFR version of ALGLIB is used). In standard double precision accuracy reduces to something about 2.0E-16 (depending on your compiler's handling of long floating point constants). INPUT PARAMETERS: N - number of Kronrod nodes. N can be 15, 21, 31, 41, 51, 61. OUTPUT PARAMETERS: X - array[0..N-1] - array of quadrature nodes, ordered in ascending order. WKronrod - array[0..N-1] - Kronrod weights WGauss - array[0..N-1] - Gauss weights (interleaved with zeros corresponding to extended Kronrod nodes). -- ALGLIB -- Copyright 12.05.2009 by Bochkanov Sergey *************************************************************************/
void alglib::gkqlegendretbl( ae_int_t n, real_1d_array& x, real_1d_array& wkronrod, real_1d_array& wgauss, double& eps);
gqgenerategausshermite
gqgenerategaussjacobi
gqgenerategausslaguerre
gqgenerategausslegendre
gqgenerategausslobattorec
gqgenerategaussradaurec
gqgeneraterec
/************************************************************************* Returns nodes/weights for Gauss-Hermite quadrature on (-inf,+inf) with weight function W(x)=Exp(-x*x) INPUT PARAMETERS: N - number of nodes, >=1 OUTPUT PARAMETERS: Info - error code: * -4 an error was detected when calculating weights/nodes. May be, N is too large. Try to use multiple precision version. * -3 internal eigenproblem solver hasn't converged * -1 incorrect N/Alpha was passed * +1 OK X - array[0..N-1] - array of quadrature nodes, in ascending order. W - array[0..N-1] - array of quadrature weights. -- ALGLIB -- Copyright 12.05.2009 by Bochkanov Sergey *************************************************************************/
void alglib::gqgenerategausshermite( ae_int_t n, ae_int_t& info, real_1d_array& x, real_1d_array& w);
/************************************************************************* Returns nodes/weights for Gauss-Jacobi quadrature on [-1,1] with weight function W(x)=Power(1-x,Alpha)*Power(1+x,Beta). INPUT PARAMETERS: N - number of nodes, >=1 Alpha - power-law coefficient, Alpha>-1 Beta - power-law coefficient, Beta>-1 OUTPUT PARAMETERS: Info - error code: * -4 an error was detected when calculating weights/nodes. Alpha or Beta are too close to -1 to obtain weights/nodes with high enough accuracy, or, may be, N is too large. Try to use multiple precision version. * -3 internal eigenproblem solver hasn't converged * -1 incorrect N/Alpha/Beta was passed * +1 OK X - array[0..N-1] - array of quadrature nodes, in ascending order. W - array[0..N-1] - array of quadrature weights. -- ALGLIB -- Copyright 12.05.2009 by Bochkanov Sergey *************************************************************************/
void alglib::gqgenerategaussjacobi( ae_int_t n, double alpha, double beta, ae_int_t& info, real_1d_array& x, real_1d_array& w);
/************************************************************************* Returns nodes/weights for Gauss-Laguerre quadrature on [0,+inf) with weight function W(x)=Power(x,Alpha)*Exp(-x) INPUT PARAMETERS: N - number of nodes, >=1 Alpha - power-law coefficient, Alpha>-1 OUTPUT PARAMETERS: Info - error code: * -4 an error was detected when calculating weights/nodes. Alpha is too close to -1 to obtain weights/nodes with high enough accuracy or, may be, N is too large. Try to use multiple precision version. * -3 internal eigenproblem solver hasn't converged * -1 incorrect N/Alpha was passed * +1 OK X - array[0..N-1] - array of quadrature nodes, in ascending order. W - array[0..N-1] - array of quadrature weights. -- ALGLIB -- Copyright 12.05.2009 by Bochkanov Sergey *************************************************************************/
void alglib::gqgenerategausslaguerre( ae_int_t n, double alpha, ae_int_t& info, real_1d_array& x, real_1d_array& w);
/************************************************************************* Returns nodes/weights for Gauss-Legendre quadrature on [-1,1] with N nodes. INPUT PARAMETERS: N - number of nodes, >=1 OUTPUT PARAMETERS: Info - error code: * -4 an error was detected when calculating weights/nodes. N is too large to obtain weights/nodes with high enough accuracy. Try to use multiple precision version. * -3 internal eigenproblem solver hasn't converged * -1 incorrect N was passed * +1 OK X - array[0..N-1] - array of quadrature nodes, in ascending order. W - array[0..N-1] - array of quadrature weights. -- ALGLIB -- Copyright 12.05.2009 by Bochkanov Sergey *************************************************************************/
void alglib::gqgenerategausslegendre( ae_int_t n, ae_int_t& info, real_1d_array& x, real_1d_array& w);
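
A minimal usage sketch (hypothetical, not one of the official examples; it assumes the integration.h package header): a 5-point Gauss-Legendre rule integrates f(x)=x^2 over [-1,1] exactly, giving 2/3.

#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "integration.h"

using namespace alglib;


int main(int argc, char **argv)
{
    //
    // Hypothetical example (not part of the original manual):
    // generate a 5-point Gauss-Legendre rule and apply it to f(x)=x^2.
    //
    ae_int_t info;
    real_1d_array x, w;
    gqgenerategausslegendre(5, info, x, w);
    double s = 0;
    for(int i=0; i<5; i++)
        s += w[i]*x[i]*x[i];
    printf("%d %.6f\n", int(info), s); // EXPECTED: 1 0.666667
    return 0;
}
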
/************************************************************************* Computation of nodes and weights for a Gauss-Lobatto quadrature formula The algorithm generates the N-point Gauss-Lobatto quadrature formula with weight function given by coefficients alpha and beta of a recurrence which generates a system of orthogonal polynomials. P-1(x) = 0 P0(x) = 1 Pn+1(x) = (x-alpha(n))*Pn(x) - beta(n)*Pn-1(x) and zeroth moment Mu0 Mu0 = integral(W(x)dx,a,b) INPUT PARAMETERS: Alpha - array[0..N-2], alpha coefficients Beta - array[0..N-2], beta coefficients. Zero-indexed element is not used, may be arbitrary. Beta[I]>0 Mu0 - zeroth moment of the weighting function. A - left boundary of the integration interval. B - right boundary of the integration interval. N - number of nodes of the quadrature formula, N>=3 (including the left and right boundary nodes). OUTPUT PARAMETERS: Info - error code: * -3 internal eigenproblem solver hasn't converged * -2 Beta[i]<=0 * -1 incorrect N was passed * 1 OK X - array[0..N-1] - array of quadrature nodes, in ascending order. W - array[0..N-1] - array of quadrature weights. -- ALGLIB -- Copyright 2005-2009 by Bochkanov Sergey *************************************************************************/
void alglib::gqgenerategausslobattorec( real_1d_array alpha, real_1d_array beta, double mu0, double a, double b, ae_int_t n, ae_int_t& info, real_1d_array& x, real_1d_array& w);
/************************************************************************* Computation of nodes and weights for a Gauss-Radau quadrature formula The algorithm generates the N-point Gauss-Radau quadrature formula with weight function given by the coefficients alpha and beta of a recurrence which generates a system of orthogonal polynomials. P-1(x) = 0 P0(x) = 1 Pn+1(x) = (x-alpha(n))*Pn(x) - beta(n)*Pn-1(x) and zeroth moment Mu0 Mu0 = integral(W(x)dx,a,b) INPUT PARAMETERS: Alpha - array[0..N-2], alpha coefficients. Beta - array[0..N-1], beta coefficients Zero-indexed element is not used. Beta[I]>0 Mu0 - zeroth moment of the weighting function. A - left boundary of the integration interval. N - number of nodes of the quadrature formula, N>=2 (including the left boundary node). OUTPUT PARAMETERS: Info - error code: * -3 internal eigenproblem solver hasn't converged * -2 Beta[i]<=0 * -1 incorrect N was passed * 1 OK X - array[0..N-1] - array of quadrature nodes, in ascending order. W - array[0..N-1] - array of quadrature weights. -- ALGLIB -- Copyright 2005-2009 by Bochkanov Sergey *************************************************************************/
void alglib::gqgenerategaussradaurec( real_1d_array alpha, real_1d_array beta, double mu0, double a, ae_int_t n, ae_int_t& info, real_1d_array& x, real_1d_array& w);
/************************************************************************* Computation of nodes and weights for a Gauss quadrature formula The algorithm generates the N-point Gauss quadrature formula with weight function given by coefficients alpha and beta of a recurrence relation which generates a system of orthogonal polynomials: P-1(x) = 0 P0(x) = 1 Pn+1(x) = (x-alpha(n))*Pn(x) - beta(n)*Pn-1(x) and zeroth moment Mu0 Mu0 = integral(W(x)dx,a,b) INPUT PARAMETERS: Alpha - array[0..N-1], alpha coefficients Beta - array[0..N-1], beta coefficients Zero-indexed element is not used and may be arbitrary. Beta[I]>0. Mu0 - zeroth moment of the weight function. N - number of nodes of the quadrature formula, N>=1 OUTPUT PARAMETERS: Info - error code: * -3 internal eigenproblem solver hasn't converged * -2 Beta[i]<=0 * -1 incorrect N was passed * 1 OK X - array[0..N-1] - array of quadrature nodes, in ascending order. W - array[0..N-1] - array of quadrature weights. -- ALGLIB -- Copyright 2005-2009 by Bochkanov Sergey *************************************************************************/
void alglib::gqgeneraterec( real_1d_array alpha, real_1d_array beta, double mu0, ae_int_t n, ae_int_t& info, real_1d_array& x, real_1d_array& w);
hermitecalculate
hermitecoefficients
hermitesum
/************************************************************************* Calculation of the value of the Hermite polynomial. Parameters: n - degree, n>=0 x - argument Result: the value of the Hermite polynomial Hn at x *************************************************************************/
double alglib::hermitecalculate(ae_int_t n, double x);
/************************************************************************* Representation of Hn as C[0] + C[1]*X + ... + C[N]*X^N Input parameters: N - polynomial degree, n>=0 Output parameters: C - coefficients *************************************************************************/
void alglib::hermitecoefficients(ae_int_t n, real_1d_array& c);
/************************************************************************* Summation of Hermite polynomials using Clenshaw's recurrence formula. This routine calculates c[0]*H0(x) + c[1]*H1(x) + ... + c[N]*HN(x) Parameters: n - degree, n>=0 x - argument Result: the value of the Hermite polynomial at x *************************************************************************/
double alglib::hermitesum(real_1d_array c, ae_int_t n, double x);
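
A minimal usage sketch (hypothetical, not one of the official examples; it assumes the specialfunctions.h package header): H2(x)=4*x^2-2, so hermitecalculate(2,1.5)=7, and the same value is obtained with hermitesum() and coefficients c=[0,0,1].

#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "specialfunctions.h"

using namespace alglib;


int main(int argc, char **argv)
{
    //
    // Hypothetical example (not part of the original manual):
    // direct evaluation of H2(1.5) vs Clenshaw summation of 0*H0+0*H1+1*H2.
    //
    double direct = hermitecalculate(2, 1.5);  // 7
    real_1d_array c = "[0,0,1]";
    double summed = hermitesum(c, 2, 1.5);     // 7
    printf("%.4f %.4f\n", direct, summed);
    return 0;
}
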
hqrndstate
hqrndcontinuous
hqrnddiscrete
hqrndexponential
hqrndnormal
hqrndnormal2
hqrndrandomize
hqrndseed
hqrnduniformi
hqrnduniformr
hqrndunit2
/************************************************************************* Portable high quality random number generator state. Initialized with HQRNDRandomize() or HQRNDSeed(). Fields: S1, S2 - seed values V - precomputed value MagicV - 'magic' value used to determine whether State structure was correctly initialized. *************************************************************************/
class hqrndstate { };
/************************************************************************* This function generates a random number from the continuous distribution given by finite sample X. INPUT PARAMETERS State - high quality random number generator, must be initialized with HQRNDRandomize() or HQRNDSeed(). X - finite sample, array[N] (can be larger, in this case only leading N elements are used). THIS ARRAY MUST BE SORTED IN ASCENDING ORDER. N - number of elements to use, N>=1 RESULT this function returns a random number from the continuous distribution which tries to approximate X as much as possible. min(X)<=Result<=max(X). -- ALGLIB -- Copyright 08.11.2011 by Bochkanov Sergey *************************************************************************/
double alglib::hqrndcontinuous( hqrndstate state, real_1d_array x, ae_int_t n);
/************************************************************************* This function generates random number from discrete distribution given by finite sample X. INPUT PARAMETERS State - high quality random number generator, must be initialized with HQRNDRandomize() or HQRNDSeed(). X - finite sample N - number of elements to use, N>=1 RESULT this function returns one of the X[i] for random i=0..N-1 -- ALGLIB -- Copyright 08.11.2011 by Bochkanov Sergey *************************************************************************/
double alglib::hqrnddiscrete( hqrndstate state, real_1d_array x, ae_int_t n);
/************************************************************************* Random number generator: exponential distribution State structure must be initialized with HQRNDRandomize() or HQRNDSeed(). -- ALGLIB -- Copyright 11.08.2007 by Bochkanov Sergey *************************************************************************/
double alglib::hqrndexponential(hqrndstate state, double lambdav);
/************************************************************************* Random number generator: normal numbers This function generates one random number from normal distribution. Its performance is equal to that of HQRNDNormal2() State structure must be initialized with HQRNDRandomize() or HQRNDSeed(). -- ALGLIB -- Copyright 02.12.2009 by Bochkanov Sergey *************************************************************************/
double alglib::hqrndnormal(hqrndstate state);
/************************************************************************* Random number generator: normal numbers This function generates two independent random numbers from normal distribution. Its performance is equal to that of HQRNDNormal() State structure must be initialized with HQRNDRandomize() or HQRNDSeed(). -- ALGLIB -- Copyright 02.12.2009 by Bochkanov Sergey *************************************************************************/
void alglib::hqrndnormal2(hqrndstate state, double& x1, double& x2);
/************************************************************************* HQRNDState initialization with random values which come from standard RNG. -- ALGLIB -- Copyright 02.12.2009 by Bochkanov Sergey *************************************************************************/
void alglib::hqrndrandomize(hqrndstate& state);
/************************************************************************* HQRNDState initialization with seed values -- ALGLIB -- Copyright 02.12.2009 by Bochkanov Sergey *************************************************************************/
void alglib::hqrndseed(ae_int_t s1, ae_int_t s2, hqrndstate& state);
/************************************************************************* This function generates random integer number in [0, N) 1. State structure must be initialized with HQRNDRandomize() or HQRNDSeed() 2. N can be any positive number except for very large numbers: * close to 2^31 on 32-bit systems * close to 2^62 on 64-bit systems An exception will be generated if N is too large. -- ALGLIB -- Copyright 02.12.2009 by Bochkanov Sergey *************************************************************************/
ae_int_t alglib::hqrnduniformi(hqrndstate state, ae_int_t n);
/************************************************************************* This function generates random real number in (0,1), not including interval boundaries State structure must be initialized with HQRNDRandomize() or HQRNDSeed(). -- ALGLIB -- Copyright 02.12.2009 by Bochkanov Sergey *************************************************************************/
double alglib::hqrnduniformr(hqrndstate state);
/************************************************************************* Random number generator: random X and Y such that X^2+Y^2=1 State structure must be initialized with HQRNDRandomize() or HQRNDSeed(). -- ALGLIB -- Copyright 02.12.2009 by Bochkanov Sergey *************************************************************************/
void alglib::hqrndunit2(hqrndstate state, double& x, double& y);
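
A minimal usage sketch (hypothetical, not one of the official examples; it assumes the alglibmisc.h package header): deterministic seeding with hqrndseed(), followed by a few draws from the uniform, integer and normal generators.

#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "alglibmisc.h"

using namespace alglib;


int main(int argc, char **argv)
{
    //
    // Hypothetical example (not part of the original manual):
    // seed the generator, then draw one value from each distribution.
    //
    hqrndstate state;
    hqrndseed(11, 12, state);
    double   u = hqrnduniformr(state);       // uniform on (0,1)
    ae_int_t k = hqrnduniformi(state, 100);  // integer in [0,100)
    double   n = hqrndnormal(state);         // standard normal
    printf("%.4f %d %.4f\n", u, int(k), n);
    return 0;
}
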
incompletebeta
invincompletebeta
/************************************************************************* Incomplete beta integral Returns incomplete beta integral of the arguments, evaluated from zero to x. The function is defined as incbet(a,b,x) = (Gamma(a+b)/(Gamma(a)*Gamma(b))) * integral(t^(a-1)*(1-t)^(b-1) dt, t=0..x). The domain of definition is 0 <= x <= 1. In this implementation a and b are restricted to positive values. The integral from x to 1 may be obtained by the symmetry relation 1 - incbet( a, b, x ) = incbet( b, a, 1-x ). The integral is evaluated by a continued fraction expansion or, when b*x is small, by a power series. ACCURACY: Tested at uniformly distributed random points (a,b,x) with a and b in "domain" and x between 0 and 1. Relative error arithmetic domain # trials peak rms IEEE 0,5 10000 6.9e-15 4.5e-16 IEEE 0,85 250000 2.2e-13 1.7e-14 IEEE 0,1000 30000 5.3e-12 6.3e-13 IEEE 0,10000 250000 9.3e-11 7.1e-12 IEEE 0,100000 10000 8.7e-10 4.8e-11 Outputs smaller than the IEEE gradual underflow threshold were excluded from these statistics. Cephes Math Library, Release 2.8: June, 2000 Copyright 1984, 1995, 2000 by Stephen L. Moshier *************************************************************************/
double alglib::incompletebeta(double a, double b, double x);
/************************************************************************* Inverse of incomplete beta integral Given y, the function finds x such that incbet( a, b, x ) = y. The routine performs interval halving or Newton iterations to find the root of incbet(a,b,x) - y = 0. ACCURACY: Relative error: x a,b arithmetic domain domain # trials peak rms IEEE 0,1 .5,10000 50000 5.8e-12 1.3e-13 IEEE 0,1 .25,100 100000 1.8e-13 3.9e-15 IEEE 0,1 0,5 50000 1.1e-12 5.5e-15 With a and b constrained to half-integer or integer values: IEEE 0,1 .5,10000 50000 5.8e-12 1.1e-13 IEEE 0,1 .5,100 100000 1.7e-14 7.9e-16 With a = .5, b constrained to half-integer or integer values: IEEE 0,1 .5,10000 10000 8.3e-11 1.0e-11 Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1996, 2000 by Stephen L. Moshier *************************************************************************/
double alglib::invincompletebeta(double a, double b, double y);
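
A minimal usage sketch (hypothetical, not one of the official examples; it assumes the specialfunctions.h package header): incompletebeta() and invincompletebeta() are mutual inverses in x.

#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "specialfunctions.h"

using namespace alglib;


int main(int argc, char **argv)
{
    //
    // Hypothetical example (not part of the original manual):
    // starting from x=0.3 we should recover it after a round trip.
    //
    double y = incompletebeta(2.0, 3.0, 0.3);
    double x = invincompletebeta(2.0, 3.0, y);
    printf("%.4f %.4f\n", y, x); // x ~ 0.3
    return 0;
}
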
idwinterpolant
idwbuildmodifiedshepard
idwbuildmodifiedshepardr
idwbuildnoisy
idwcalc
/************************************************************************* IDW interpolant. *************************************************************************/
class idwinterpolant { };
/************************************************************************* IDW interpolant using modified Shepard method for uniform point distributions. INPUT PARAMETERS: XY - X and Y values, array[0..N-1,0..NX]. First NX columns contain X-values, last column contains Y-values. N - number of nodes, N>0. NX - space dimension, NX>=1. D - nodal function type, either: * 0 constant model. Just for demonstration only, worst model ever. * 1 linear model, least squares fitting. Simple model for datasets too small for quadratic models * 2 quadratic model, least squares fitting. Best model available (if your dataset is large enough). * -1 "fast" linear model, use with caution!!! It is significantly faster than linear/quadratic and better than constant model. But it is less robust (especially in the presence of noise). NQ - number of points used to calculate nodal functions (ignored for constant models). NQ should be LARGER than: * max(1.5*(1+NX),2^NX+1) for linear model, * max(3/4*(NX+2)*(NX+1),2^NX+1) for quadratic model. Values less than this threshold will be silently increased. NW - number of points used to calculate weights and to interpolate. Required: >=2^NX+1, values less than this threshold will be silently increased. Recommended value: about 2*NQ OUTPUT PARAMETERS: Z - IDW interpolant. NOTES: * best results are obtained with quadratic models, worst - with constant models * when N is large, NQ and NW must be significantly smaller than N both to obtain optimal performance and to obtain optimal accuracy. In 2 or 3-dimensional tasks NQ=15 and NW=25 are good values to start with. * NQ and NW may be greater than N. In such cases they will be automatically decreased. * this subroutine always succeeds (as long as correct parameters are passed). * see 'Multivariate Interpolation of Large Sets of Scattered Data' by Robert J. Renka for more information on this algorithm. * this subroutine assumes that point distribution is uniform at the small scales. If it isn't - for example, points are concentrated along "lines", but "lines" distribution is uniform at the larger scale - then you should use IDWBuildModifiedShepardR() -- ALGLIB PROJECT -- Copyright 02.03.2010 by Bochkanov Sergey *************************************************************************/
void alglib::idwbuildmodifiedshepard( real_2d_array xy, ae_int_t n, ae_int_t nx, ae_int_t d, ae_int_t nq, ae_int_t nw, idwinterpolant& z);
/************************************************************************* IDW interpolant using modified Shepard method for non-uniform datasets. This type of model uses constant nodal functions and interpolates using all nodes which are closer than user-specified radius R. It may be used when points distribution is non-uniform at the small scale, but it is at the distances as large as R. INPUT PARAMETERS: XY - X and Y values, array[0..N-1,0..NX]. First NX columns contain X-values, last column contain Y-values. N - number of nodes, N>0. NX - space dimension, NX>=1. R - radius, R>0 OUTPUT PARAMETERS: Z - IDW interpolant. NOTES: * if there is less than IDWKMin points within R-ball, algorithm selects IDWKMin closest ones, so that continuity properties of interpolant are preserved even far from points. -- ALGLIB PROJECT -- Copyright 11.04.2010 by Bochkanov Sergey *************************************************************************/
void alglib::idwbuildmodifiedshepardr( real_2d_array xy, ae_int_t n, ae_int_t nx, double r, idwinterpolant& z);
/************************************************************************* IDW model for noisy data. This subroutine may be used to handle noisy data, i.e. data with noise in OUTPUT values. It differs from IDWBuildModifiedShepard() in the following aspects: * nodal functions are not constrained to pass through nodes: Qi(xi)<>yi, i.e. we have fitting instead of interpolation. * weights which are used during least squares fitting stage are all equal to 1.0 (independently of distance) * "fast"-linear or constant nodal functions are not supported (either not robust enough or too rigid) This problem require far more complex tuning than interpolation problems. Below you can find some recommendations regarding this problem: * focus on tuning NQ; it controls noise reduction. As for NW, you can just make it equal to 2*NQ. * you can use cross-validation to determine optimal NQ. * optimal NQ is a result of complex tradeoff between noise level (more noise = larger NQ required) and underlying function complexity (given fixed N, larger NQ means smoothing of compex features in the data). For example, NQ=N will reduce noise to the minimum level possible, but you will end up with just constant/linear/quadratic (depending on D) least squares model for the whole dataset. INPUT PARAMETERS: XY - X and Y values, array[0..N-1,0..NX]. First NX columns contain X-values, last column contain Y-values. N - number of nodes, N>0. NX - space dimension, NX>=1. D - nodal function degree, either: * 1 linear model, least squares fitting. Simpe model for datasets too small for quadratic models (or for very noisy problems). * 2 quadratic model, least squares fitting. Best model available (if your dataset is large enough). NQ - number of points used to calculate nodal functions. NQ should be significantly larger than 1.5 times the number of coefficients in a nodal function to overcome effects of noise: * larger than 1.5*(1+NX) for linear model, * larger than 3/4*(NX+2)*(NX+1) for quadratic model. Values less than this threshold will be silently increased. NW - number of points used to calculate weights and to interpolate. Required: >=2^NX+1, values less than this threshold will be silently increased. Recommended value: about 2*NQ or larger OUTPUT PARAMETERS: Z - IDW interpolant. NOTES: * best results are obtained with quadratic models, linear models are not recommended to use unless you are pretty sure that it is what you want * this subroutine is always succeeds (as long as correct parameters are passed). * see 'Multivariate Interpolation of Large Sets of Scattered Data' by Robert J. Renka for more information on this algorithm. -- ALGLIB PROJECT -- Copyright 02.03.2010 by Bochkanov Sergey *************************************************************************/
void alglib::idwbuildnoisy( real_2d_array xy, ae_int_t n, ae_int_t nx, ae_int_t d, ae_int_t nq, ae_int_t nw, idwinterpolant& z);
/************************************************************************* IDW interpolation INPUT PARAMETERS: Z - IDW interpolant built with one of model building subroutines. X - array[0..NX-1], interpolation point Result: IDW interpolant Z(X) -- ALGLIB -- Copyright 02.03.2010 by Bochkanov Sergey *************************************************************************/
double alglib::idwcalc(idwinterpolant z, real_1d_array x);
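
A small illustrative sketch, assuming the interpolation.h header: it builds a modified Shepard model on a 2-dimensional dataset sampled from f(x0,x1)=x0+x1 and evaluates it at an off-grid point. NQ/NW larger than N are silently decreased, as noted above.

#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "interpolation.h"

using namespace alglib;


int main(int argc, char **argv)
{
    //
    // 9 nodes on a 3x3 grid; first two columns are coordinates,
    // last column stores f(x0,x1) = x0+x1.
    //
    real_2d_array xy = "[[0,0,0],[0,1,1],[0,2,2],[1,0,1],[1,1,2],[1,2,3],[2,0,2],[2,1,3],[2,2,4]]";
    idwinterpolant z;

    //
    // D=2 (quadratic nodal functions), NQ=15, NW=25 - recommended values
    // for 2-dimensional tasks; they exceed N=9 and will be reduced.
    //
    idwbuildmodifiedshepard(xy, 9, 2, 2, 15, 25, z);

    real_1d_array x = "[1.5,0.5]";
    printf("%.2f\n", idwcalc(z, x)); // should be close to 1.5+0.5 = 2.0
    return 0;
}
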
incompletegamma
incompletegammac
invincompletegammac
/************************************************************************* Incomplete gamma integral The function is defined by igam(a,x) = (1/Gamma(a)) * integral from 0 to x of exp(-t)*t^(a-1) dt. In this implementation both arguments must be positive. The integral is evaluated by either a power series or continued fraction expansion, depending on the relative values of a and x. ACCURACY: Relative error: arithmetic domain # trials peak rms IEEE 0,30 200000 3.6e-14 2.9e-15 IEEE 0,100 300000 9.9e-14 1.5e-14 Cephes Math Library Release 2.8: June, 2000 Copyright 1985, 1987, 2000 by Stephen L. Moshier *************************************************************************/
double alglib::incompletegamma(double a, double x);
/************************************************************************* Complemented incomplete gamma integral The function is defined by igamc(a,x) = 1 - igam(a,x) = (1/Gamma(a)) * integral from x to infinity of exp(-t)*t^(a-1) dt. In this implementation both arguments must be positive. The integral is evaluated by either a power series or continued fraction expansion, depending on the relative values of a and x. ACCURACY: Tested at random a, x. a x Relative error: arithmetic domain domain # trials peak rms IEEE 0.5,100 0,100 200000 1.9e-14 1.7e-15 IEEE 0.01,0.5 0,100 200000 1.4e-13 1.6e-15 Cephes Math Library Release 2.8: June, 2000 Copyright 1985, 1987, 2000 by Stephen L. Moshier *************************************************************************/
double alglib::incompletegammac(double a, double x);
/************************************************************************* Inverse of complemented incomplete gamma integral Given p, the function finds x such that igamc( a, x ) = p. Starting with the approximate value x = a*t^3, where t = 1 - d - ndtri(p)*sqrt(d) and d = 1/(9a), the routine performs up to 10 Newton iterations to find the root of igamc(a,x) - p = 0. ACCURACY: Tested at random a, p in the intervals indicated. a p Relative error: arithmetic domain domain # trials peak rms IEEE 0.5,100 0,0.5 100000 1.0e-14 1.7e-15 IEEE 0.01,0.5 0,0.5 100000 9.0e-14 3.4e-15 IEEE 0.5,10000 0,0.5 20000 2.3e-13 3.8e-14 Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 1995, 2000 by Stephen L. Moshier *************************************************************************/
double alglib::invincompletegammac(double a, double y0);
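
A minimal sketch, assuming the specialfunctions.h header, which checks the complementarity relation igam(a,x) + igamc(a,x) = 1 and the inverse routine:

#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "specialfunctions.h"

using namespace alglib;


int main(int argc, char **argv)
{
    double a = 3.0, x = 2.5;
    double p  = incompletegamma(a, x);       // igam(a,x)
    double q  = incompletegammac(a, x);      // igamc(a,x) = 1 - igam(a,x)
    double xr = invincompletegammac(a, q);   // should recover x

    printf("%.6f\n", p + q);  // should be 1.000000
    printf("%.6f\n", xr);     // should be close to 2.500000
    return 0;
}
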
rmatrixinvupdatecolumn
rmatrixinvupdaterow
rmatrixinvupdatesimple
rmatrixinvupdateuv
/************************************************************************* Inverse matrix update by the Sherman-Morrison formula The algorithm updates matrix A^-1 when adding a vector to a column of matrix A. Input parameters: InvA - inverse of matrix A. Array whose indexes range within [0..N-1, 0..N-1]. N - size of matrix A. UpdColumn - the column of A whose vector U was added. 0 <= UpdColumn <= N-1 U - the vector to be added to a column. Array whose index ranges within [0..N-1]. Output parameters: InvA - inverse of modified matrix A. -- ALGLIB -- Copyright 2005 by Bochkanov Sergey *************************************************************************/
void alglib::rmatrixinvupdatecolumn( real_2d_array& inva, ae_int_t n, ae_int_t updcolumn, real_1d_array u);
/************************************************************************* Inverse matrix update by the Sherman-Morrison formula The algorithm updates matrix A^-1 when adding a vector to a row of matrix A. Input parameters: InvA - inverse of matrix A. Array whose indexes range within [0..N-1, 0..N-1]. N - size of matrix A. UpdRow - the row of A whose vector V was added. 0 <= Row <= N-1 V - the vector to be added to a row. Array whose index ranges within [0..N-1]. Output parameters: InvA - inverse of modified matrix A. -- ALGLIB -- Copyright 2005 by Bochkanov Sergey *************************************************************************/
void alglib::rmatrixinvupdaterow( real_2d_array& inva, ae_int_t n, ae_int_t updrow, real_1d_array v);
/************************************************************************* Inverse matrix update by the Sherman-Morrison formula The algorithm updates matrix A^-1 when adding a number to an element of matrix A. Input parameters: InvA - inverse of matrix A. Array whose indexes range within [0..N-1, 0..N-1]. N - size of matrix A. UpdRow - row where the element to be updated is stored. UpdColumn - column where the element to be updated is stored. UpdVal - a number to be added to the element. Output parameters: InvA - inverse of modified matrix A. -- ALGLIB -- Copyright 2005 by Bochkanov Sergey *************************************************************************/
void alglib::rmatrixinvupdatesimple( real_2d_array& inva, ae_int_t n, ae_int_t updrow, ae_int_t updcolumn, double updval);
/************************************************************************* Inverse matrix update by the Sherman-Morrison formula The algorithm computes the inverse of matrix A+u*v' by using the given matrix A^-1 and the vectors u and v. Input parameters: InvA - inverse of matrix A. Array whose indexes range within [0..N-1, 0..N-1]. N - size of matrix A. U - the vector modifying the matrix. Array whose index ranges within [0..N-1]. V - the vector modifying the matrix. Array whose index ranges within [0..N-1]. Output parameters: InvA - inverse of matrix A + u*v'. -- ALGLIB -- Copyright 2005 by Bochkanov Sergey *************************************************************************/
void alglib::rmatrixinvupdateuv( real_2d_array& inva, ae_int_t n, real_1d_array u, real_1d_array v);
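
A small sketch of the simplest update, assuming the linalg.h header: starting from the inverse of A = [[2,0],[0,2]], a single element of A is modified and the stored inverse is updated in O(N^2) operations instead of being recomputed from scratch.

#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "linalg.h"

using namespace alglib;


int main(int argc, char **argv)
{
    //
    // A = [[2,0],[0,2]], so InvA = [[0.5,0],[0,0.5]]
    //
    real_2d_array inva = "[[0.5,0.0],[0.0,0.5]]";

    //
    // add 1.0 to A[0,1]; updated A is [[2,1],[0,2]] and its exact
    // inverse is [[0.5,-0.25],[0,0.5]]
    //
    rmatrixinvupdatesimple(inva, 2, 0, 1, 1.0);
    printf("%s\n", inva.tostring(3).c_str());
    return 0;
}
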
jacobianellipticfunctions
/************************************************************************* Jacobian Elliptic Functions Evaluates the Jacobian elliptic functions sn(u|m), cn(u|m), and dn(u|m) of parameter m between 0 and 1, and real argument u. These functions are periodic, with quarter-period on the real axis equal to the complete elliptic integral ellpk(1.0-m). Relation to incomplete elliptic integral: If u = ellik(phi,m), then sn(u|m) = sin(phi), and cn(u|m) = cos(phi). Phi is called the amplitude of u. Computation is by means of the arithmetic-geometric mean algorithm, except when m is within 1e-9 of 0 or 1. In the latter case with m close to 1, the approximation applies only for phi < pi/2. ACCURACY: Tested at random points with u between 0 and 10, m between 0 and 1. Absolute error (* = relative error): arithmetic function # trials peak rms IEEE phi 10000 9.2e-16* 1.4e-16* IEEE sn 50000 4.1e-15 4.6e-16 IEEE cn 40000 3.6e-15 4.4e-16 IEEE dn 10000 1.3e-12 1.8e-14 Peak error observed in consistency check using addition theorem for sn(u+v) was 4e-16 (absolute). Also tested by the above relation to the incomplete elliptic integral. Accuracy deteriorates when u is large. Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 2000 by Stephen L. Moshier *************************************************************************/
void alglib::jacobianellipticfunctions( double u, double m, double& sn, double& cn, double& dn, double& ph);
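
A short sketch, assuming the specialfunctions.h header, which evaluates sn, cn, dn and verifies the identities sn^2+cn^2 = 1 and dn^2+m*sn^2 = 1:

#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "specialfunctions.h"

using namespace alglib;


int main(int argc, char **argv)
{
    double u = 0.7, m = 0.3;
    double sn, cn, dn, ph;
    jacobianellipticfunctions(u, m, sn, cn, dn, ph);

    printf("%.6f\n", sn*sn + cn*cn);    // should be 1.000000
    printf("%.6f\n", dn*dn + m*sn*sn);  // should be 1.000000
    return 0;
}
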
jarqueberatest
/************************************************************************* Jarque-Bera test This test checks the hypothesis that a given sample X is drawn from a normal random variable. Requirements: * the number of elements in the sample is not less than 5. Input parameters: X - sample. Array whose index goes from 0 to N-1. N - size of the sample. N>=5 Output parameters: P - p-value for the test Accuracy of the approximation used (5<=N<=1951): p-value relative error (5<=N<=1951) [1, 0.1] < 1% [0.1, 0.01] < 2% [0.01, 0.001] < 6% [0.001, 0] wasn't measured For N>1951 accuracy wasn't measured but it shouldn't be sharply different from table values. -- ALGLIB -- Copyright 09.04.2007 by Bochkanov Sergey *************************************************************************/
void alglib::jarqueberatest(real_1d_array x, ae_int_t n, double& p);
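
A minimal sketch, assuming the statistics.h header and an illustrative 8-element sample:

#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "statistics.h"

using namespace alglib;


int main(int argc, char **argv)
{
    //
    // small sample with no pronounced skewness or heavy tails; the test
    // returns the p-value of the normality hypothesis
    //
    real_1d_array x = "[0.2,-0.5,1.1,0.4,-0.8,0.1,0.7,-0.3]";
    double p;
    jarqueberatest(x, 8, p);
    printf("%.4f\n", p); // a large p-value is expected: no evidence against normality
    return 0;
}
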
laguerrecalculate
laguerrecoefficients
laguerresum
/************************************************************************* Calculation of the value of the Laguerre polynomial. Parameters: n - degree, n>=0 x - argument Result: the value of the Laguerre polynomial Ln at x *************************************************************************/
double alglib::laguerrecalculate(ae_int_t n, double x);
/************************************************************************* Representation of Ln as C[0] + C[1]*X + ... + C[N]*X^N Input parameters: N - polynomial degree, n>=0 Output parameters: C - coefficients *************************************************************************/
void alglib::laguerrecoefficients(ae_int_t n, real_1d_array& c);
/************************************************************************* Summation of Laguerre polynomials using Clenshaw's recurrence formula. This routine calculates c[0]*L0(x) + c[1]*L1(x) + ... + c[N]*LN(x) Parameters: n - degree, n>=0 x - argument Result: the value of the Laguerre polynomial at x *************************************************************************/
double alglib::laguerresum(real_1d_array c, ae_int_t n, double x);
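
A short sketch, assuming the specialfunctions.h header, comparing direct evaluation of L2(x) with Clenshaw summation of the expansion 0*L0 + 0*L1 + 1*L2:

#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "specialfunctions.h"

using namespace alglib;


int main(int argc, char **argv)
{
    double x = 0.5;

    // direct evaluation: L2(x) = 1 - 2*x + x^2/2, so L2(0.5) = 0.125
    double v = laguerrecalculate(2, x);

    // same value obtained as c[0]*L0(x) + c[1]*L1(x) + c[2]*L2(x)
    real_1d_array c = "[0,0,1]";
    double s = laguerresum(c, 2, x);

    printf("%.6f %.6f\n", v, s); // both should be 0.125000
    return 0;
}
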
fisherlda
fisherldan
/************************************************************************* Multiclass Fisher LDA Subroutine finds coefficients of linear combination which optimally separates training set on classes. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multithreading support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. Best results are achieved for high-dimensional problems ! (NVars is at least 256). ! ! Multithreading is used to accelerate initial phase of LDA, which ! includes calculation of products of large matrices. Again, for best ! efficiency problem must be high-dimensional. ! ! Generally, commercial ALGLIB is several times faster than open-source ! generic C edition, and many times faster than open-source C# edition. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS: XY - training set, array[0..NPoints-1,0..NVars]. First NVars columns store values of independent variables, next column stores number of class (from 0 to NClasses-1) which dataset element belongs to. Fractional values are rounded to nearest integer. NPoints - training set size, NPoints>=0 NVars - number of independent variables, NVars>=1 NClasses - number of classes, NClasses>=2 OUTPUT PARAMETERS: Info - return code: * -4, if internal EVD subroutine hasn't converged * -2, if there is a point with class number outside of [0..NClasses-1]. * -1, if incorrect parameters was passed (NPoints<0, NVars<1, NClasses<2) * 1, if task has been solved * 2, if there was a multicollinearity in training set, but task has been solved. W - linear combination coefficients, array[0..NVars-1] -- ALGLIB -- Copyright 31.05.2008 by Bochkanov Sergey *************************************************************************/
void alglib::fisherlda( real_2d_array xy, ae_int_t npoints, ae_int_t nvars, ae_int_t nclasses, ae_int_t& info, real_1d_array& w);
/************************************************************************* N-dimensional multiclass Fisher LDA Subroutine finds coefficients of linear combinations which optimally separates training set on classes. It returns N-dimensional basis whose vector are sorted by quality of training set separation (in descending order). COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multithreading support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. Best results are achieved for high-dimensional problems ! (NVars is at least 256). ! ! Multithreading is used to accelerate initial phase of LDA, which ! includes calculation of products of large matrices. Again, for best ! efficiency problem must be high-dimensional. ! ! Generally, commercial ALGLIB is several times faster than open-source ! generic C edition, and many times faster than open-source C# edition. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS: XY - training set, array[0..NPoints-1,0..NVars]. First NVars columns store values of independent variables, next column stores number of class (from 0 to NClasses-1) which dataset element belongs to. Fractional values are rounded to nearest integer. NPoints - training set size, NPoints>=0 NVars - number of independent variables, NVars>=1 NClasses - number of classes, NClasses>=2 OUTPUT PARAMETERS: Info - return code: * -4, if internal EVD subroutine hasn't converged * -2, if there is a point with class number outside of [0..NClasses-1]. * -1, if incorrect parameters was passed (NPoints<0, NVars<1, NClasses<2) * 1, if task has been solved * 2, if there was a multicollinearity in training set, but task has been solved. W - basis, array[0..NVars-1,0..NVars-1] columns of matrix stores basis vectors, sorted by quality of training set separation (in descending order) -- ALGLIB -- Copyright 31.05.2008 by Bochkanov Sergey *************************************************************************/
void alglib::fisherldan( real_2d_array xy, ae_int_t npoints, ae_int_t nvars, ae_int_t nclasses, ae_int_t& info, real_2d_array& w); void alglib::smp_fisherldan( real_2d_array xy, ae_int_t npoints, ae_int_t nvars, ae_int_t nclasses, ae_int_t& info, real_2d_array& w);
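
A small illustrative sketch, assuming the dataanalysis.h header, with two classes that differ mainly along the first of two variables, so the LDA direction returned in W should be dominated by the first coordinate:

#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "dataanalysis.h"

using namespace alglib;


int main(int argc, char **argv)
{
    //
    // two independent variables, class label (0 or 1) in the last column
    //
    real_2d_array xy = "[[0.0,0.1,0],[0.2,0.0,0],[0.1,0.2,0],[1.0,0.1,1],[1.2,0.0,1],[1.1,0.2,1]]";
    ae_int_t info;
    real_1d_array w;

    fisherlda(xy, 6, 2, 2, info, w);
    printf("%d\n", int(info));             // 1 on success
    printf("%s\n", w.tostring(2).c_str()); // direction dominated by the first variable
    return 0;
}
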
legendrecalculate
legendrecoefficients
legendresum
/************************************************************************* Calculation of the value of the Legendre polynomial Pn. Parameters: n - degree, n>=0 x - argument Result: the value of the Legendre polynomial Pn at x *************************************************************************/
double alglib::legendrecalculate(ae_int_t n, double x);
/************************************************************************* Representation of Pn as C[0] + C[1]*X + ... + C[N]*X^N Input parameters: N - polynomial degree, n>=0 Output parameters: C - coefficients *************************************************************************/
void alglib::legendrecoefficients(ae_int_t n, real_1d_array& c);
/************************************************************************* Summation of Legendre polynomials using Clenshaw's recurrence formula. This routine calculates c[0]*P0(x) + c[1]*P1(x) + ... + c[N]*PN(x) Parameters: n - degree, n>=0 x - argument Result: the value of the Legendre polynomial at x *************************************************************************/
double alglib::legendresum(real_1d_array c, ae_int_t n, double x);
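
A short sketch, assuming the specialfunctions.h header: legendrecoefficients() returns the power-basis representation of P3(x) = (5*x^3 - 3*x)/2, which agrees with direct evaluation.

#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "specialfunctions.h"

using namespace alglib;


int main(int argc, char **argv)
{
    //
    // power-basis coefficients of P3: [0, -1.5, 0, 2.5]
    //
    real_1d_array c;
    legendrecoefficients(3, c);
    printf("%s\n", c.tostring(2).c_str());

    //
    // direct evaluation: P3(0.4) = (5*0.4^3 - 3*0.4)/2 = -0.44
    //
    printf("%.4f\n", legendrecalculate(3, 0.4));
    return 0;
}
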
lincgreport
lincgstate
lincgcreate
lincgresults
lincgsetcond
lincgsetprecdiag
lincgsetprecunit
lincgsetrestartfreq
lincgsetrupdatefreq
lincgsetstartingpoint
lincgsetxrep
lincgsolvesparse
lincg_d_1 Solution of sparse linear systems with CG
/************************************************************************* *************************************************************************/
class lincgreport { ae_int_t iterationscount; ae_int_t nmv; ae_int_t terminationtype; double r2; };
/************************************************************************* This object stores state of the linear CG method. You should use ALGLIB functions to work with this object. Never try to access its fields directly! *************************************************************************/
class lincgstate { };
/************************************************************************* This function initializes linear CG Solver. This solver is used to solve symmetric positive definite problems. If you want to solve nonsymmetric (or non-positive definite) problem you may use LinLSQR solver provided by ALGLIB. USAGE: 1. User initializes algorithm state with LinCGCreate() call 2. User tunes solver parameters with LinCGSetCond() and other functions 3. Optionally, user sets starting point with LinCGSetStartingPoint() 4. User calls LinCGSolveSparse() function which takes algorithm state and SparseMatrix object. 5. User calls LinCGResults() to get solution 6. Optionally, user may call LinCGSolveSparse() again to solve another problem with different matrix and/or right part without reinitializing LinCGState structure. INPUT PARAMETERS: N - problem dimension, N>0 OUTPUT PARAMETERS: State - structure which stores algorithm state -- ALGLIB -- Copyright 14.11.2011 by Bochkanov Sergey *************************************************************************/
void alglib::lincgcreate(ae_int_t n, lincgstate& state);

Examples:   [1]  

/************************************************************************* CG-solver: results. This function must be called after LinCGSolveSparse() INPUT PARAMETERS: State - algorithm state OUTPUT PARAMETERS: X - array[N], solution Rep - optimization report: * Rep.TerminationType completion code: * -5 input matrix is either not positive definite, too large or too small * -4 overflow/underflow during solution (ill conditioned problem) * 1 ||residual||<=EpsF*||b|| * 5 MaxIts steps were taken * 7 rounding errors prevent further progress, best point found is returned * Rep.IterationsCount contains iterations count * NMV contains number of matrix-vector calculations -- ALGLIB -- Copyright 14.11.2011 by Bochkanov Sergey *************************************************************************/
void alglib::lincgresults( lincgstate state, real_1d_array& x, lincgreport& rep);

Examples:   [1]  

/************************************************************************* This function sets stopping criteria. INPUT PARAMETERS: EpsF - algorithm will be stopped if norm of residual is less than EpsF*||b||. MaxIts - algorithm will be stopped if number of iterations is more than MaxIts. OUTPUT PARAMETERS: State - structure which stores algorithm state NOTES: If both EpsF and MaxIts are zero then EpsF will be set to a small default value. -- ALGLIB -- Copyright 14.11.2011 by Bochkanov Sergey *************************************************************************/
void alglib::lincgsetcond(lincgstate state, double epsf, ae_int_t maxits);
/************************************************************************* This function changes preconditioning settings of LinCGSolveSparse() function. LinCGSolveSparse() will use diagonal of the system matrix as preconditioner. This preconditioning mode is active by default. INPUT PARAMETERS: State - structure which stores algorithm state -- ALGLIB -- Copyright 19.11.2012 by Bochkanov Sergey *************************************************************************/
void alglib::lincgsetprecdiag(lincgstate state);
/************************************************************************* This function changes preconditioning settings of LinCGSolveSparse() function. By default, SolveSparse() uses diagonal preconditioner, but if you want to use solver without preconditioning, you can call this function which forces solver to use unit matrix for preconditioning. INPUT PARAMETERS: State - structure which stores algorithm state -- ALGLIB -- Copyright 19.11.2012 by Bochkanov Sergey *************************************************************************/
void alglib::lincgsetprecunit(lincgstate state);
/************************************************************************* This function sets restart frequency. By default, algorithm is restarted after N subsequent iterations. -- ALGLIB -- Copyright 14.11.2011 by Bochkanov Sergey *************************************************************************/
void alglib::lincgsetrestartfreq(lincgstate state, ae_int_t srf);
/************************************************************************* This function sets frequency of residual recalculations. Algorithm updates residual r_k using iterative formula, but recalculates it from scratch after each 10 iterations. It is done to avoid accumulation of numerical errors and to stop algorithm when r_k starts to grow. Such low update frequency (1/10) gives very little overhead, but makes algorithm a bit more robust against numerical errors. However, you may change it with this function. INPUT PARAMETERS: Freq - desired update frequency, Freq>=0. Zero value means that no updates will be done. -- ALGLIB -- Copyright 14.11.2011 by Bochkanov Sergey *************************************************************************/
void alglib::lincgsetrupdatefreq(lincgstate state, ae_int_t freq);
/************************************************************************* This function sets starting point. By default, zero starting point is used. INPUT PARAMETERS: X - starting point, array[N] OUTPUT PARAMETERS: State - structure which stores algorithm state -- ALGLIB -- Copyright 14.11.2011 by Bochkanov Sergey *************************************************************************/
void alglib::lincgsetstartingpoint(lincgstate state, real_1d_array x);
/************************************************************************* This function turns on/off reporting. INPUT PARAMETERS: State - structure which stores algorithm state NeedXRep- whether iteration reports are needed or not If NeedXRep is True, algorithm will call rep() callback function if it is provided to MinCGOptimize(). -- ALGLIB -- Copyright 14.11.2011 by Bochkanov Sergey *************************************************************************/
void alglib::lincgsetxrep(lincgstate state, bool needxrep);
/************************************************************************* Procedure for solution of A*x=b with sparse A. INPUT PARAMETERS: State - algorithm state A - sparse matrix in the CRS format (you MUST convert it to CRS format by calling SparseConvertToCRS() function). IsUpper - whether upper or lower triangle of A is used: * IsUpper=True => only upper triangle is used and lower triangle is not referenced at all * IsUpper=False => only lower triangle is used and upper triangle is not referenced at all B - right part, array[N] RESULT: This function returns no result. You can get solution by calling LinCGResults() NOTE: this function uses lightweight preconditioning - multiplication by inverse of diag(A). If you want, you can turn preconditioning off by calling LinCGSetPrecUnit(). However, preconditioning cost is low and preconditioner is very important for solution of badly scaled problems. -- ALGLIB -- Copyright 14.11.2011 by Bochkanov Sergey *************************************************************************/
void alglib::lincgsolvesparse( lincgstate state, sparsematrix a, bool isupper, real_1d_array b);

Examples:   [1]  

#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "solvers.h"

using namespace alglib;


int main(int argc, char **argv)
{
    //
    // This example illustrates solution of sparse linear systems with
    // conjugate gradient method.
    // 
    // Suppose that we have linear system A*x=b with sparse symmetric
    // positive definite A (represented by sparsematrix object)
    //         [ 5 1       ]
    //         [ 1 7 2     ]
    //     A = [   2 8 1   ]
    //         [     1 4 1 ]
    //         [       1 4 ]
    // and right part b
    //     [  7 ]
    //     [ 17 ]
    // b = [ 14 ]
    //     [ 10 ]
    //     [  6 ]
    // and we want to solve this system using sparse linear CG. In order
    // to do so, we have to create left part (sparsematrix object) and
    // right part (dense array).
    //
    // Initially, sparse matrix is created in the Hash-Table format,
    // which allows easy initialization, but does not allow the matrix to
    // be used in the linear solvers. So after construction you should
    // convert the sparse matrix to CRS format (one suited for linear operations).
    //
    // It is important to note that in our example we initialize full
    // matrix A, both lower and upper triangles. However, it is symmetric
    // and sparse solver needs just one half of the matrix. So you may
    // save about half of the space by filling only one of the triangles.
    //
    sparsematrix a;
    sparsecreate(5, 5, a);
    sparseset(a, 0, 0, 5.0);
    sparseset(a, 0, 1, 1.0);
    sparseset(a, 1, 0, 1.0);
    sparseset(a, 1, 1, 7.0);
    sparseset(a, 1, 2, 2.0);
    sparseset(a, 2, 1, 2.0);
    sparseset(a, 2, 2, 8.0);
    sparseset(a, 2, 3, 1.0);
    sparseset(a, 3, 2, 1.0);
    sparseset(a, 3, 3, 4.0);
    sparseset(a, 3, 4, 1.0);
    sparseset(a, 4, 3, 1.0);
    sparseset(a, 4, 4, 4.0);

    //
    // Now our matrix is fully initialized, but we have to do one more
    // step - convert it from Hash-Table format to CRS format (see
    // documentation on sparse matrices for more information about these
    // formats).
    //
    // If you omit this call, ALGLIB will generate exception on the first
    // attempt to use A in linear operations. 
    //
    sparseconverttocrs(a);

    //
    // Initialization of the right part
    //
    real_1d_array b = "[7,17,14,10,6]";

    //
    // Now we have to create linear solver object and to use it for the
    // solution of the linear system.
    //
    // NOTE: lincgsolvesparse() accepts additional parameter which tells
    //       what triangle of the symmetric matrix should be used - upper
    //       or lower. Because we've filled both parts of the matrix, we
    //       can use any part - upper or lower.
    //
    lincgstate s;
    lincgreport rep;
    real_1d_array x;
    lincgcreate(5, s);
    lincgsolvesparse(s, a, true, b);
    lincgresults(s, x, rep);

    printf("%d\n", int(rep.terminationtype)); // EXPECTED: 1
    printf("%s\n", x.tostring(2).c_str()); // EXPECTED: [1.000,2.000,1.000,2.000,1.000]
    return 0;
}


linlsqrreport
linlsqrstate
linlsqrcreate
linlsqrresults
linlsqrsetcond
linlsqrsetlambdai
linlsqrsetprecdiag
linlsqrsetprecunit
linlsqrsetxrep
linlsqrsolvesparse
linlsqr_d_1 Solution of sparse linear least squares problems with LSQR
/************************************************************************* *************************************************************************/
class linlsqrreport { ae_int_t iterationscount; ae_int_t nmv; ae_int_t terminationtype; };
/************************************************************************* This object stores state of the LinLSQR method. You should use ALGLIB functions to work with this object. *************************************************************************/
class linlsqrstate { };
/************************************************************************* This function initializes linear LSQR Solver. This solver is used to solve non-symmetric (and, possibly, non-square) problems. Least squares solution is returned for non-compatible systems. USAGE: 1. User initializes algorithm state with LinLSQRCreate() call 2. User tunes solver parameters with LinLSQRSetCond() and other functions 3. User calls LinLSQRSolveSparse() function which takes algorithm state and SparseMatrix object. 4. User calls LinLSQRResults() to get solution 5. Optionally, user may call LinLSQRSolveSparse() again to solve another problem with different matrix and/or right part without reinitializing LinLSQRState structure. INPUT PARAMETERS: M - number of rows in A N - number of variables, N>0 OUTPUT PARAMETERS: State - structure which stores algorithm state -- ALGLIB -- Copyright 30.11.2011 by Bochkanov Sergey *************************************************************************/
void alglib::linlsqrcreate(ae_int_t m, ae_int_t n, linlsqrstate& state);

Examples:   [1]  

/************************************************************************* LSQR solver: results. This function must be called after LinLSQRSolveSparse() INPUT PARAMETERS: State - algorithm state OUTPUT PARAMETERS: X - array[N], solution Rep - optimization report: * Rep.TerminationType completion code: * 1 ||Rk||<=EpsB*||B|| * 4 ||A^T*Rk||/(||A||*||Rk||)<=EpsA * 5 MaxIts steps were taken * 7 rounding errors prevent further progress, X contains best point found so far. (sometimes returned on singular systems) * Rep.IterationsCount contains iterations count * NMV contains number of matrix-vector calculations -- ALGLIB -- Copyright 30.11.2011 by Bochkanov Sergey *************************************************************************/
void alglib::linlsqrresults( linlsqrstate state, real_1d_array& x, linlsqrreport& rep);

Examples:   [1]  

/************************************************************************* This function sets stopping criteria. INPUT PARAMETERS: EpsA - algorithm will be stopped if ||A^T*Rk||/(||A||*||Rk||)<=EpsA. EpsB - algorithm will be stopped if ||Rk||<=EpsB*||B|| MaxIts - algorithm will be stopped if number of iterations is more than MaxIts. OUTPUT PARAMETERS: State - structure which stores algorithm state NOTE: if EpsA, EpsB and MaxIts are all zero then these variables will be set to default values. -- ALGLIB -- Copyright 30.11.2011 by Bochkanov Sergey *************************************************************************/
void alglib::linlsqrsetcond( linlsqrstate state, double epsa, double epsb, ae_int_t maxits);
/************************************************************************* This function sets optional Tikhonov regularization coefficient. It is zero by default. INPUT PARAMETERS: LambdaI - regularization factor, LambdaI>=0 OUTPUT PARAMETERS: State - structure which stores algorithm state -- ALGLIB -- Copyright 30.11.2011 by Bochkanov Sergey *************************************************************************/
void alglib::linlsqrsetlambdai(linlsqrstate state, double lambdai);
/************************************************************************* This function changes preconditioning settings of LinLSQRSolveSparse() function. LinLSQRSolveSparse() will use diagonal of the system matrix as preconditioner. This preconditioning mode is active by default. INPUT PARAMETERS: State - structure which stores algorithm state -- ALGLIB -- Copyright 19.11.2012 by Bochkanov Sergey *************************************************************************/
void alglib::linlsqrsetprecdiag(linlsqrstate state);
/************************************************************************* This function changes preconditioning settings of LinLSQRSolveSparse() function. By default, SolveSparse() uses diagonal preconditioner, but if you want to use solver without preconditioning, you can call this function which forces solver to use unit matrix for preconditioning. INPUT PARAMETERS: State - structure which stores algorithm state -- ALGLIB -- Copyright 19.11.2012 by Bochkanov Sergey *************************************************************************/
void alglib::linlsqrsetprecunit(linlsqrstate state);
/************************************************************************* This function turns on/off reporting. INPUT PARAMETERS: State - structure which stores algorithm state NeedXRep- whether iteration reports are needed or not If NeedXRep is True, algorithm will call rep() callback function if it is provided to MinCGOptimize(). -- ALGLIB -- Copyright 30.11.2011 by Bochkanov Sergey *************************************************************************/
void alglib::linlsqrsetxrep(linlsqrstate state, bool needxrep);
/************************************************************************* Procedure for solution of A*x=b with sparse A. INPUT PARAMETERS: State - algorithm state A - sparse M*N matrix in the CRS format (you MUST convert it to CRS format by calling SparseConvertToCRS() function BEFORE you pass it to this function). B - right part, array[M] RESULT: This function returns no result. You can get solution by calling LinLSQRResults() NOTE: this function uses lightweight preconditioning - multiplication by inverse of diag(A). If you want, you can turn preconditioning off by calling LinLSQRSetPrecUnit(). However, preconditioning cost is low and preconditioner is very important for solution of badly scaled problems. -- ALGLIB -- Copyright 30.11.2011 by Bochkanov Sergey *************************************************************************/
void alglib::linlsqrsolvesparse( linlsqrstate state, sparsematrix a, real_1d_array b);

Examples:   [1]  

#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "solvers.h"

using namespace alglib;


int main(int argc, char **argv)
{
    //
    // This example illustrates solution of sparse linear least squares problem
    // with LSQR algorithm.
    // 
    // Suppose that we have least squares problem min|A*x-b| with sparse A
    // represented by sparsematrix object
    //         [ 1 1 ]
    //         [ 1 1 ]
    //     A = [ 2 1 ]
    //         [ 1   ]
    //         [   1 ]
    // and right part b
    //     [ 4 ]
    //     [ 2 ]
    // b = [ 4 ]
    //     [ 1 ]
    //     [ 2 ]
    // and we want to solve this system in the least squares sense using
    // LSQR algorithm. In order to do so, we have to create left part
    // (sparsematrix object) and right part (dense array).
    //
    // Initially, sparse matrix is created in the Hash-Table format,
    // which allows easy initialization, but does not allow the matrix to
    // be used in the linear solvers. So after construction you should
    // convert the sparse matrix to CRS format (one suited for linear operations).
    //
    sparsematrix a;
    sparsecreate(5, 2, a);
    sparseset(a, 0, 0, 1.0);
    sparseset(a, 0, 1, 1.0);
    sparseset(a, 1, 0, 1.0);
    sparseset(a, 1, 1, 1.0);
    sparseset(a, 2, 0, 2.0);
    sparseset(a, 2, 1, 1.0);
    sparseset(a, 3, 0, 1.0);
    sparseset(a, 4, 1, 1.0);

    //
    // Now our matrix is fully initialized, but we have to do one more
    // step - convert it from Hash-Table format to CRS format (see
    // documentation on sparse matrices for more information about these
    // formats).
    //
    // If you omit this call, ALGLIB will generate exception on the first
    // attempt to use A in linear operations. 
    //
    sparseconverttocrs(a);

    //
    // Initialization of the right part
    //
    real_1d_array b = "[4,2,4,1,2]";

    //
    // Now we have to create linear solver object and to use it for the
    // solution of the linear system.
    //
    linlsqrstate s;
    linlsqrreport rep;
    real_1d_array x;
    linlsqrcreate(5, 2, s);
    linlsqrsolvesparse(s, a, b);
    linlsqrresults(s, x, rep);

    printf("%d\n", int(rep.terminationtype)); // EXPECTED: 4
    printf("%s\n", x.tostring(2).c_str()); // EXPECTED: [1.000,2.000]
    return 0;
}


linearmodel
lrreport
lravgerror
lravgrelerror
lrbuild
lrbuilds
lrbuildz
lrbuildzs
lrpack
lrprocess
lrrmserror
lrunpack
linreg_d_basic Linear regression used to build the very basic model and unpack coefficients
/************************************************************************* *************************************************************************/
class linearmodel { };
/************************************************************************* LRReport structure contains additional information about linear model: * C - covariation matrix, array[0..NVars,0..NVars]. C[i,j] = Cov(A[i],A[j]) * RMSError - root mean square error on a training set * AvgError - average error on a training set * AvgRelError - average relative error on a training set (excluding observations with zero function value). * CVRMSError - leave-one-out cross-validation estimate of generalization error. Calculated using fast algorithm with O(NVars*NPoints) complexity. * CVAvgError - cross-validation estimate of average error * CVAvgRelError - cross-validation estimate of average relative error All other fields of the structure are intended for internal use and should not be used outside ALGLIB. *************************************************************************/
class lrreport { real_2d_array c; double rmserror; double avgerror; double avgrelerror; double cvrmserror; double cvavgerror; double cvavgrelerror; ae_int_t ncvdefects; integer_1d_array cvdefects; };
/************************************************************************* Average error on the test set INPUT PARAMETERS: LM - linear model XY - test set NPoints - test set size RESULT: average error. -- ALGLIB -- Copyright 30.08.2008 by Bochkanov Sergey *************************************************************************/
double alglib::lravgerror( linearmodel lm, real_2d_array xy, ae_int_t npoints);
/************************************************************************* Average relative error on the test set INPUT PARAMETERS: LM - linear model XY - test set NPoints - test set size RESULT: average relative error. -- ALGLIB -- Copyright 30.08.2008 by Bochkanov Sergey *************************************************************************/
double alglib::lravgrelerror( linearmodel lm, real_2d_array xy, ae_int_t npoints);
/************************************************************************* Linear regression Subroutine builds model: Y = A(0)*X[0] + ... + A(N-1)*X[N-1] + A(N) and model found in ALGLIB format, covariation matrix, training set errors (rms, average, average relative) and leave-one-out cross-validation estimate of the generalization error. CV estimate calculated using fast algorithm with O(NPoints*NVars) complexity. When covariation matrix is calculated standard deviations of function values are assumed to be equal to RMS error on the training set. INPUT PARAMETERS: XY - training set, array [0..NPoints-1,0..NVars]: * NVars columns - independent variables * last column - dependent variable NPoints - training set size, NPoints>NVars+1 NVars - number of independent variables OUTPUT PARAMETERS: Info - return code: * -255, in case of unknown internal error * -4, if internal SVD subroutine haven't converged * -1, if incorrect parameters was passed (NPoints<NVars+2, NVars<1). * 1, if subroutine successfully finished LM - linear model in the ALGLIB format. Use subroutines of this unit to work with the model. AR - additional results -- ALGLIB -- Copyright 02.08.2008 by Bochkanov Sergey *************************************************************************/
void alglib::lrbuild( real_2d_array xy, ae_int_t npoints, ae_int_t nvars, ae_int_t& info, linearmodel& lm, lrreport& ar);

Examples:   [1]  

/************************************************************************* Linear regression Variant of LRBuild which uses vector of standard deviations (errors in function values). INPUT PARAMETERS: XY - training set, array [0..NPoints-1,0..NVars]: * NVars columns - independent variables * last column - dependent variable S - standard deviations (errors in function values) array[0..NPoints-1], S[i]>0. NPoints - training set size, NPoints>NVars+1 NVars - number of independent variables OUTPUT PARAMETERS: Info - return code: * -255, in case of unknown internal error * -4, if internal SVD subroutine hasn't converged * -1, if incorrect parameters were passed (NPoints<NVars+2, NVars<1). * -2, if S[I]<=0 * 1, if subroutine successfully finished LM - linear model in the ALGLIB format. Use subroutines of this unit to work with the model. AR - additional results -- ALGLIB -- Copyright 02.08.2008 by Bochkanov Sergey *************************************************************************/
void alglib::lrbuilds( real_2d_array xy, real_1d_array s, ae_int_t npoints, ae_int_t nvars, ae_int_t& info, linearmodel& lm, lrreport& ar);
/************************************************************************* Like LRBuild but builds model Y = A(0)*X[0] + ... + A(N-1)*X[N-1] i.e. with zero constant term. -- ALGLIB -- Copyright 30.10.2008 by Bochkanov Sergey *************************************************************************/
void alglib::lrbuildz( real_2d_array xy, ae_int_t npoints, ae_int_t nvars, ae_int_t& info, linearmodel& lm, lrreport& ar);
/************************************************************************* Like LRBuildS, but builds model Y = A(0)*X[0] + ... + A(N-1)*X[N-1] i.e. with zero constant term. -- ALGLIB -- Copyright 30.10.2008 by Bochkanov Sergey *************************************************************************/
void alglib::lrbuildzs( real_2d_array xy, real_1d_array s, ae_int_t npoints, ae_int_t nvars, ae_int_t& info, linearmodel& lm, lrreport& ar);
/************************************************************************* "Packs" coefficients and creates linear model in ALGLIB format (LRUnpack reversed). INPUT PARAMETERS: V - coefficients, array[0..NVars] NVars - number of independent variables OUTPUT PARAMETERS: LM - linear model. -- ALGLIB -- Copyright 30.08.2008 by Bochkanov Sergey *************************************************************************/
void alglib::lrpack(real_1d_array v, ae_int_t nvars, linearmodel& lm);
/************************************************************************* Processing INPUT PARAMETERS: LM - linear model X - input vector, array[0..NVars-1]. Result: value of linear model regression estimate -- ALGLIB -- Copyright 03.09.2008 by Bochkanov Sergey *************************************************************************/
double alglib::lrprocess(linearmodel lm, real_1d_array x);
/************************************************************************* RMS error on the test set INPUT PARAMETERS: LM - linear model XY - test set NPoints - test set size RESULT: root mean square error. -- ALGLIB -- Copyright 30.08.2008 by Bochkanov Sergey *************************************************************************/
double alglib::lrrmserror( linearmodel lm, real_2d_array xy, ae_int_t npoints);
/************************************************************************* Unpacks coefficients of linear model. INPUT PARAMETERS: LM - linear model in ALGLIB format OUTPUT PARAMETERS: V - coefficients, array[0..NVars] constant term (intercept) is stored in the V[NVars]. NVars - number of independent variables (one less than number of coefficients) -- ALGLIB -- Copyright 30.08.2008 by Bochkanov Sergey *************************************************************************/
void alglib::lrunpack(linearmodel lm, real_1d_array& v, ae_int_t& nvars);

Examples:   [1]  

#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "dataanalysis.h"

using namespace alglib;


int main(int argc, char **argv)
{
    //
    // In this example we demonstrate linear fitting by f(x|a) = a*exp(0.5*x).
    //
    // We have:
    // * xy - matrix of basic function values (exp(0.5*x)) and expected values
    //
    real_2d_array xy = "[[0.606531,1.133719],[0.670320,1.306522],[0.740818,1.504604],[0.818731,1.554663],[0.904837,1.884638],[1.000000,2.072436],[1.105171,2.257285],[1.221403,2.534068],[1.349859,2.622017],[1.491825,2.897713],[1.648721,3.219371]]";
    ae_int_t info;
    ae_int_t nvars;
    linearmodel model;
    lrreport rep;
    real_1d_array c;

    lrbuildz(xy, 11, 1, info, model, rep);
    printf("%d\n", int(info)); // EXPECTED: 1
    lrunpack(model, c, nvars);
    printf("%s\n", c.tostring(4).c_str()); // EXPECTED: [1.98650,0.00000]
    return 0;
}


logitmodel
mnlreport
mnlavgce
mnlavgerror
mnlavgrelerror
mnlclserror
mnlpack
mnlprocess
mnlprocessi
mnlrelclserror
mnlrmserror
mnltrainh
mnlunpack
/************************************************************************* *************************************************************************/
class logitmodel { };
/************************************************************************* MNLReport structure contains information about training process: * NGrad - number of gradient calculations * NHess - number of Hessian calculations *************************************************************************/
class mnlreport { ae_int_t ngrad; ae_int_t nhess; };
/************************************************************************* Average cross-entropy (in bits per element) on the test set INPUT PARAMETERS: LM - logit model XY - test set NPoints - test set size RESULT: CrossEntropy/(NPoints*ln(2)). -- ALGLIB -- Copyright 10.09.2008 by Bochkanov Sergey *************************************************************************/
double alglib::mnlavgce( logitmodel lm, real_2d_array xy, ae_int_t npoints);
/************************************************************************* Average error on the test set INPUT PARAMETERS: LM - logit model XY - test set NPoints - test set size RESULT: average error (error when estimating posterior probabilities). -- ALGLIB -- Copyright 30.08.2008 by Bochkanov Sergey *************************************************************************/
double alglib::mnlavgerror( logitmodel lm, real_2d_array xy, ae_int_t npoints);
/************************************************************************* Average relative error on the test set INPUT PARAMETERS: LM - logit model XY - test set NPoints - test set size RESULT: average relative error (error when estimating posterior probabilities). -- ALGLIB -- Copyright 30.08.2008 by Bochkanov Sergey *************************************************************************/
double alglib::mnlavgrelerror( logitmodel lm, real_2d_array xy, ae_int_t ssize);
/************************************************************************* Classification error on test set = MNLRelClsError*NPoints -- ALGLIB -- Copyright 10.09.2008 by Bochkanov Sergey *************************************************************************/
ae_int_t alglib::mnlclserror( logitmodel lm, real_2d_array xy, ae_int_t npoints);
/************************************************************************* "Packs" coefficients and creates logit model in ALGLIB format (MNLUnpack reversed). INPUT PARAMETERS: A - model (see MNLUnpack) NVars - number of independent variables NClasses - number of classes OUTPUT PARAMETERS: LM - logit model. -- ALGLIB -- Copyright 10.09.2008 by Bochkanov Sergey *************************************************************************/
void alglib::mnlpack( real_2d_array a, ae_int_t nvars, ae_int_t nclasses, logitmodel& lm);
/************************************************************************* Processing INPUT PARAMETERS: LM - logit model, passed by non-constant reference (some fields of structure are used as temporaries when calculating model output). X - input vector, array[0..NVars-1]. Y - (possibly) preallocated buffer; if size of Y is less than NClasses, it will be reallocated. If it is large enough, it is NOT reallocated, so we can save some time on reallocation. OUTPUT PARAMETERS: Y - result, array[0..NClasses-1] Vector of posterior probabilities for classification task. -- ALGLIB -- Copyright 10.09.2008 by Bochkanov Sergey *************************************************************************/
void alglib::mnlprocess(logitmodel lm, real_1d_array x, real_1d_array& y);
/************************************************************************* 'interactive' variant of MNLProcess for languages like Python which support constructs like "Y = MNLProcess(LM,X)" and interactive mode of the interpreter This function allocates new array on each call, so it is significantly slower than its 'non-interactive' counterpart, but it is more convenient when you call it from command line. -- ALGLIB -- Copyright 10.09.2008 by Bochkanov Sergey *************************************************************************/
void alglib::mnlprocessi( logitmodel lm, real_1d_array x, real_1d_array& y);
/************************************************************************* Relative classification error on the test set INPUT PARAMETERS: LM - logit model XY - test set NPoints - test set size RESULT: percent of incorrectly classified cases. -- ALGLIB -- Copyright 10.09.2008 by Bochkanov Sergey *************************************************************************/
double alglib::mnlrelclserror( logitmodel lm, real_2d_array xy, ae_int_t npoints);
/************************************************************************* RMS error on the test set INPUT PARAMETERS: LM - logit model XY - test set NPoints - test set size RESULT: root mean square error (error when estimating posterior probabilities). -- ALGLIB -- Copyright 30.08.2008 by Bochkanov Sergey *************************************************************************/
double alglib::mnlrmserror( logitmodel lm, real_2d_array xy, ae_int_t npoints);
/************************************************************************* This subroutine trains logit model. INPUT PARAMETERS: XY - training set, array[0..NPoints-1,0..NVars] First NVars columns store values of independent variables, next column stores number of class (from 0 to NClasses-1) which dataset element belongs to. Fractional values are rounded to nearest integer. NPoints - training set size, NPoints>=1 NVars - number of independent variables, NVars>=1 NClasses - number of classes, NClasses>=2 OUTPUT PARAMETERS: Info - return code: * -2, if there is a point with class number outside of [0..NClasses-1]. * -1, if incorrect parameters was passed (NPoints<NVars+2, NVars<1, NClasses<2). * 1, if task has been solved LM - model built Rep - training report -- ALGLIB -- Copyright 10.09.2008 by Bochkanov Sergey *************************************************************************/
void alglib::mnltrainh( real_2d_array xy, ae_int_t npoints, ae_int_t nvars, ae_int_t nclasses, ae_int_t& info, logitmodel& lm, mnlreport& rep);
/************************************************************************* Unpacks coefficients of logit model. Logit model have form: P(class=i) = S(i) / (S(0) + S(1) + ... +S(M-1)) S(i) = Exp(A[i,0]*X[0] + ... + A[i,N-1]*X[N-1] + A[i,N]), when i<M-1 S(M-1) = 1 INPUT PARAMETERS: LM - logit model in ALGLIB format OUTPUT PARAMETERS: V - coefficients, array[0..NClasses-2,0..NVars] NVars - number of independent variables NClasses - number of classes -- ALGLIB -- Copyright 10.09.2008 by Bochkanov Sergey *************************************************************************/
void alglib::mnlunpack( logitmodel lm, real_2d_array& a, ae_int_t& nvars, ae_int_t& nclasses);
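
A minimal training/evaluation sketch for the logit model, assuming the dataanalysis.h header, with one independent variable and two overlapping classes:

#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "dataanalysis.h"

using namespace alglib;


int main(int argc, char **argv)
{
    //
    // one independent variable, class number (0 or 1) in the last column
    //
    real_2d_array xy = "[[-1.0,0],[-0.3,0],[0.2,0],[-0.2,1],[0.4,1],[1.0,1]]";
    ae_int_t info;
    logitmodel lm;
    mnlreport rep;

    mnltrainh(xy, 6, 1, 2, info, lm, rep);
    printf("%d\n", int(info)); // 1 on success

    //
    // posterior probabilities P(class=0), P(class=1) for x=1.0;
    // the second one should dominate for this dataset
    //
    real_1d_array x = "[1.0]";
    real_1d_array y;
    mnlprocess(lm, x, y);
    printf("%s\n", y.tostring(2).c_str());
    return 0;
}
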
barycentricfitreport
lsfitreport
lsfitstate
polynomialfitreport
spline1dfitreport
barycentricfitfloaterhormann
barycentricfitfloaterhormannwc
logisticcalc4
logisticcalc5
logisticfit4
logisticfit45x
logisticfit4ec
logisticfit5
logisticfit5ec
lsfitcreatef
lsfitcreatefg
lsfitcreatefgh
lsfitcreatewf
lsfitcreatewfg
lsfitcreatewfgh
lsfitfit
lsfitlinear
lsfitlinearc
lsfitlinearw
lsfitlinearwc
lsfitresults
lsfitsetbc
lsfitsetcond
lsfitsetgradientcheck
lsfitsetlc
lsfitsetscale
lsfitsetstpmax
lsfitsetxrep
lstfitpiecewiselinearrdp
lstfitpiecewiselinearrdpfixed
polynomialfit
polynomialfitwc
spline1dfitcubic
spline1dfitcubicwc
spline1dfithermite
spline1dfithermitewc
spline1dfitpenalized
spline1dfitpenalizedw
lsfit_d_lin Unconstrained (general) linear least squares fitting with and without weights
lsfit_d_linc Constrained (general) linear least squares fitting with and without weights
lsfit_d_nlf Nonlinear fitting using function value only
lsfit_d_nlfb Bound constrained nonlinear fitting using function value only
lsfit_d_nlfg Nonlinear fitting using gradient
lsfit_d_nlfgh Nonlinear fitting using gradient and Hessian
lsfit_d_nlscale Nonlinear fitting with custom scaling and bound constraints
lsfit_d_pol Unconstrained polynomial fitting
lsfit_d_polc Constrained polynomial fitting
lsfit_d_spline Unconstrained fitting by penalized regression spline
lsfit_t_4pl 4-parameter logistic fitting
lsfit_t_5pl 5-parameter logistic fitting
/************************************************************************* Barycentric fitting report: RMSError RMS error AvgError average error AvgRelError average relative error (for non-zero Y[I]) MaxError maximum error TaskRCond reciprocal of task's condition number *************************************************************************/
class barycentricfitreport { double taskrcond; ae_int_t dbest; double rmserror; double avgerror; double avgrelerror; double maxerror; };
/************************************************************************* Least squares fitting report. This structure contains informational fields which are set by fitting functions provided by this unit. Different functions initialize different sets of fields, so you should read documentation on specific function you used in order to know which fields are initialized. TaskRCond reciprocal of task's condition number IterationsCount number of internal iterations VarIdx if user-supplied gradient contains errors which were detected by nonlinear fitter, this field is set to index of the first component of gradient which is suspected to be spoiled by bugs. RMSError RMS error AvgError average error AvgRelError average relative error (for non-zero Y[I]) MaxError maximum error WRMSError weighted RMS error CovPar covariance matrix for parameters, filled by some solvers ErrPar vector of errors in parameters, filled by some solvers ErrCurve vector of fit errors - variability of the best-fit curve, filled by some solvers. Noise vector of per-point noise estimates, filled by some solvers. R2 coefficient of determination (non-weighted, non-adjusted), filled by some solvers. *************************************************************************/
class lsfitreport { double taskrcond; ae_int_t iterationscount; ae_int_t varidx; double rmserror; double avgerror; double avgrelerror; double maxerror; double wrmserror; real_2d_array covpar; real_1d_array errpar; real_1d_array errcurve; real_1d_array noise; double r2; };
/************************************************************************* Nonlinear fitter. You should use ALGLIB functions to work with fitter. Never try to access its fields directly! *************************************************************************/
class lsfitstate { };
/************************************************************************* Polynomial fitting report: TaskRCond reciprocal of task's condition number RMSError RMS error AvgError average error AvgRelError average relative error (for non-zero Y[I]) MaxError maximum error *************************************************************************/
class polynomialfitreport { double taskrcond; double rmserror; double avgerror; double avgrelerror; double maxerror; };
/************************************************************************* Spline fitting report: RMSError RMS error AvgError average error AvgRelError average relative error (for non-zero Y[I]) MaxError maximum error Fields below are filled by obsolete functions (Spline1DFitCubic, Spline1DFitHermite). Modern fitting functions do NOT fill these fields: TaskRCond reciprocal of task's condition number *************************************************************************/
class spline1dfitreport { double taskrcond; double rmserror; double avgerror; double avgrelerror; double maxerror; };
/************************************************************************* Rational least squares fitting using Floater-Hormann rational functions with optimal D chosen from [0,9]. Equidistant grid with M node on [min(x),max(x)] is used to build basis functions. Different values of D are tried, optimal D (least root mean square error) is chosen. Task is linear, so linear least squares solver is used. Complexity of this computational scheme is O(N*M^2) (mostly dominated by the least squares solver). COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multithreading support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Speed-up provided by multithreading greatly depends on problem size ! - only large problems (number of coefficients is more than 500) can be ! efficiently multithreaded. ! ! Generally, commercial ALGLIB is several times faster than open-source ! generic C edition, and many times faster than open-source C# edition. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS: X - points, array[0..N-1]. Y - function values, array[0..N-1]. N - number of points, N>0. M - number of basis functions ( = number_of_nodes), M>=2. OUTPUT PARAMETERS: Info- same format as in LSFitLinearWC() subroutine. * Info>0 task is solved * Info<=0 an error occured: -4 means inconvergence of internal SVD -3 means inconsistent constraints B - barycentric interpolant. Rep - report, same format as in LSFitLinearWC() subroutine. Following fields are set: * DBest best value of the D parameter * RMSError rms error on the (X,Y). * AvgError average error on the (X,Y). * AvgRelError average relative error on the non-zero Y * MaxError maximum error NON-WEIGHTED ERRORS ARE CALCULATED -- ALGLIB PROJECT -- Copyright 18.08.2009 by Bochkanov Sergey *************************************************************************/
void alglib::barycentricfitfloaterhormann( real_1d_array x, real_1d_array y, ae_int_t n, ae_int_t m, ae_int_t& info, barycentricinterpolant& b, barycentricfitreport& rep); void alglib::smp_barycentricfitfloaterhormann( real_1d_array x, real_1d_array y, ae_int_t n, ae_int_t m, ae_int_t& info, barycentricinterpolant& b, barycentricfitreport& rep);
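A brief usage sketch, not taken from the manual's example set: ten samples are fitted with M=5 Floater-Hormann nodes, and the resulting interpolant is evaluated with barycentriccalc(), which is documented in the rational interpolation subpackage. The interpolation.h header name and the data values are assumptions.

#include "interpolation.h"
using namespace alglib;

int main()
{
    real_1d_array x = "[0.0,0.1,0.2,0.3,0.4,0.5,0.6,0.7,0.8,0.9]";
    real_1d_array y = "[0.00,0.05,0.21,0.29,0.41,0.52,0.58,0.71,0.82,0.91]";
    ae_int_t info;
    barycentricinterpolant b;
    barycentricfitreport rep;

    barycentricfitfloaterhormann(x, y, 10, 5, info, b, rep);   // N=10, M=5
    if( info>0 )
    {
        double v = barycentriccalc(b, 0.25);   // evaluate the fitted interpolant
        (void)v;
    }
    return 0;
}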
/************************************************************************* Weghted rational least squares fitting using Floater-Hormann rational functions with optimal D chosen from [0,9], with constraints and individual weights. Equidistant grid with M node on [min(x),max(x)] is used to build basis functions. Different values of D are tried, optimal D (least WEIGHTED root mean square error) is chosen. Task is linear, so linear least squares solver is used. Complexity of this computational scheme is O(N*M^2) (mostly dominated by the least squares solver). SEE ALSO * BarycentricFitFloaterHormann(), "lightweight" fitting without invididual weights and constraints. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multithreading support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Speed-up provided by multithreading greatly depends on problem size ! - only large problems (number of coefficients is more than 500) can be ! efficiently multithreaded. ! ! Generally, commercial ALGLIB is several times faster than open-source ! generic C edition, and many times faster than open-source C# edition. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS: X - points, array[0..N-1]. Y - function values, array[0..N-1]. W - weights, array[0..N-1] Each summand in square sum of approximation deviations from given values is multiplied by the square of corresponding weight. Fill it by 1's if you don't want to solve weighted task. N - number of points, N>0. XC - points where function values/derivatives are constrained, array[0..K-1]. YC - values of constraints, array[0..K-1] DC - array[0..K-1], types of constraints: * DC[i]=0 means that S(XC[i])=YC[i] * DC[i]=1 means that S'(XC[i])=YC[i] SEE BELOW FOR IMPORTANT INFORMATION ON CONSTRAINTS K - number of constraints, 0<=K<M. K=0 means no constraints (XC/YC/DC are not used in such cases) M - number of basis functions ( = number_of_nodes), M>=2. OUTPUT PARAMETERS: Info- same format as in LSFitLinearWC() subroutine. * Info>0 task is solved * Info<=0 an error occured: -4 means inconvergence of internal SVD -3 means inconsistent constraints -1 means another errors in parameters passed (N<=0, for example) B - barycentric interpolant. Rep - report, same format as in LSFitLinearWC() subroutine. Following fields are set: * DBest best value of the D parameter * RMSError rms error on the (X,Y). * AvgError average error on the (X,Y). * AvgRelError average relative error on the non-zero Y * MaxError maximum error NON-WEIGHTED ERRORS ARE CALCULATED IMPORTANT: this subroutine doesn't calculate task's condition number for K<>0. SETTING CONSTRAINTS - DANGERS AND OPPORTUNITIES: Setting constraints can lead to undesired results, like ill-conditioned behavior, or inconsistency being detected. From the other side, it allows us to improve quality of the fit. Here we summarize our experience with constrained barycentric interpolants: * excessive constraints can be inconsistent. Floater-Hormann basis functions aren't as flexible as splines (although they are very smooth). 
* the more evenly constraints are spread across [min(x),max(x)], the better the chances that they will be consistent * the greater M is (given fixed constraints), the better the chances that constraints will be consistent * in the general case, consistency of constraints IS NOT GUARANTEED. * in several special cases, however, we CAN guarantee consistency. * one of these cases is constraints on the function VALUES at the interval boundaries. Note that consistency of the constraints on the function DERIVATIVES is NOT guaranteed (in such cases you can use cubic splines, which are more flexible). * another special case is ONE constraint on the function value (OR, but not AND, derivative) anywhere in the interval Our final recommendation is to use constraints WHEN AND ONLY WHEN you can't solve your task without them. Anything beyond the special cases given above is not guaranteed and may result in inconsistency. -- ALGLIB PROJECT -- Copyright 18.08.2009 by Bochkanov Sergey *************************************************************************/
void alglib::barycentricfitfloaterhormannwc( real_1d_array x, real_1d_array y, real_1d_array w, ae_int_t n, real_1d_array xc, real_1d_array yc, integer_1d_array dc, ae_int_t k, ae_int_t m, ae_int_t& info, barycentricinterpolant& b, barycentricfitreport& rep); void alglib::smp_barycentricfitfloaterhormannwc( real_1d_array x, real_1d_array y, real_1d_array w, ae_int_t n, real_1d_array xc, real_1d_array yc, integer_1d_array dc, ae_int_t k, ae_int_t m, ae_int_t& info, barycentricinterpolant& b, barycentricfitreport& rep);
/************************************************************************* This function calculates value of four-parameter logistic (4PL) model at specified point X. 4PL model has following form: F(x|A,B,C,D) = D+(A-D)/(1+Power(x/C,B)) INPUT PARAMETERS: X - current point, X>=0: * zero X is correctly handled even for B<=0 * negative X results in exception. A, B, C, D- parameters of 4PL model: * A is unconstrained * B is unconstrained; zero or negative values are handled correctly. * C>0, non-positive value results in exception * D is unconstrained RESULT: model value at X NOTE: if B=0, denominator is assumed to be equal to 2.0 even for zero X (strictly speaking, 0^0 is undefined). NOTE: this function also throws exception if all input parameters are correct, but overflow was detected during calculations. NOTE: this function performs a lot of checks; if you need really high performance, consider evaluating model yourself, without checking for degenerate cases. -- ALGLIB PROJECT -- Copyright 14.05.2014 by Bochkanov Sergey *************************************************************************/
double alglib::logisticcalc4( double x, double a, double b, double c, double d);

Examples:   [1]  
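As an informal illustration (not one of the linked examples), the call below evaluates the 4PL model with A=1, B=2, C=1, D=0 at X=1.5, i.e. 0+(1-0)/(1+(1.5/1)^2) = 1/3.25, roughly 0.308. The interpolation.h header name is an assumption.

#include "interpolation.h"
using namespace alglib;

int main()
{
    // F(1.5 | A=1, B=2, C=1, D=0) = 0 + (1-0)/(1+(1.5/1)^2) = 1/3.25
    double v = logisticcalc4(1.5, 1.0, 2.0, 1.0, 0.0);
    return v>0 ? 0 : 1;
}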

/************************************************************************* This function calculates value of five-parameter logistic (5PL) model at specified point X. 5PL model has following form: F(x|A,B,C,D,G) = D+(A-D)/Power(1+Power(x/C,B),G) INPUT PARAMETERS: X - current point, X>=0: * zero X is correctly handled even for B<=0 * negative X results in exception. A, B, C, D, G- parameters of 5PL model: * A is unconstrained * B is unconstrained; zero or negative values are handled correctly. * C>0, non-positive value results in exception * D is unconstrained * G>0, non-positive value results in exception RESULT: model value at X NOTE: if B=0, denominator is assumed to be equal to Power(2.0,G) even for zero X (strictly speaking, 0^0 is undefined). NOTE: this function also throws exception if all input parameters are correct, but overflow was detected during calculations. NOTE: this function performs a lot of checks; if you need really high performance, consider evaluating model yourself, without checking for degenerate cases. -- ALGLIB PROJECT -- Copyright 14.05.2014 by Bochkanov Sergey *************************************************************************/
double alglib::logisticcalc5( double x, double a, double b, double c, double d, double g);

Examples:   [1]  

/************************************************************************* This function fits four-parameter logistic (4PL) model to data provided by user. 4PL model has following form: F(x|A,B,C,D) = D+(A-D)/(1+Power(x/C,B)) Here: * A, D - unconstrained (see LogisticFit4EC() for constrained 4PL) * B>=0 * C>0 IMPORTANT: output of this function is constrained in such way that B>0. Because 4PL model is symmetric with respect to B, there is no need to explore B<0. Constraining B makes algorithm easier to stabilize and debug. Users who for some reason prefer to work with negative B's should transform output themselves (swap A and D, replace B by -B). 4PL fitting is implemented as follows: * we perform small number of restarts from random locations which helps to solve problem of bad local extrema. Locations are only partially random - we use input data to determine good initial guess, but we include controlled amount of randomness. * we perform Levenberg-Marquardt fitting with very tight constraints on parameters B and C - it allows us to find good initial guess for the second stage without risk of running into "flat spot". * second Levenberg-Marquardt round is performed without excessive constraints. Results from the previous round are used as initial guess. * after fitting is done, we compare results with best values found so far, rewrite "best solution" if needed, and move to next random location. Overall algorithm is very stable and is not prone to bad local extrema. Furthermore, it automatically scales when input data have very large or very small range. INPUT PARAMETERS: X - array[N], stores X-values. MUST include only non-negative numbers (but may include zero values). Can be unsorted. Y - array[N], values to fit. N - number of points. If N is less than length of X/Y, only leading N elements are used. OUTPUT PARAMETERS: A, B, C, D- parameters of 4PL model Rep - fitting report. This structure has many fields, but ONLY ONES LISTED BELOW ARE SET: * Rep.IterationsCount - number of iterations performed * Rep.RMSError - root-mean-square error * Rep.AvgError - average absolute error * Rep.AvgRelError - average relative error (calculated for non-zero Y-values) * Rep.MaxError - maximum absolute error * Rep.R2 - coefficient of determination, R-squared. This coefficient is calculated as R2=1-RSS/TSS (in case of nonlinear regression there are multiple ways to define R2, each of them giving different results). NOTE: after you obtained coefficients, you can evaluate model with LogisticCalc4() function. NOTE: if you need better control over fitting process than provided by this function, you may use LogisticFit45X(). NOTE: step is automatically scaled according to scale of parameters being fitted before we compare its length with EpsX. Thus, this function can be used to fit data with very small or very large values without changing EpsX. -- ALGLIB PROJECT -- Copyright 14.02.2014 by Bochkanov Sergey *************************************************************************/
void alglib::logisticfit4( real_1d_array x, real_1d_array y, ae_int_t n, double& a, double& b, double& c, double& d, lsfitreport& rep);

Examples:   [1]  
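The following sketch is not an official ALGLIB example; it fits a 4PL model to a small dose-response-style dataset and then evaluates the fitted model with logisticcalc4(). Header name and data are assumptions for illustration.

#include "interpolation.h"
using namespace alglib;

int main()
{
    // non-negative X values (may include zero), responses Y
    real_1d_array x = "[0.0,0.5,1.0,2.0,4.0,8.0,16.0]";
    real_1d_array y = "[0.02,0.08,0.15,0.38,0.65,0.88,0.97]";
    double a, b, c, d;
    lsfitreport rep;

    logisticfit4(x, y, 7, a, b, c, d, rep);

    // rep.rmserror, rep.r2 etc. are filled as described above;
    // the fitted model can then be evaluated at any X>=0
    double v = logisticcalc4(1.5, a, b, c, d);
    (void)v;
    return 0;
}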

/************************************************************************* This is "expert" 4PL/5PL fitting function, which can be used if you need better control over fitting process than provided by LogisticFit4() or LogisticFit5(). This function fits model of the form F(x|A,B,C,D) = D+(A-D)/(1+Power(x/C,B)) (4PL model) or F(x|A,B,C,D,G) = D+(A-D)/Power(1+Power(x/C,B),G) (5PL model) Here: * A, D - unconstrained * B>=0 for 4PL, unconstrained for 5PL * C>0 * G>0 (if present) INPUT PARAMETERS: X - array[N], stores X-values. MUST include only non-negative numbers (but may include zero values). Can be unsorted. Y - array[N], values to fit. N - number of points. If N is less than length of X/Y, only leading N elements are used. CnstrLeft- optional equality constraint for model value at the left boundary (at X=0). Specify NAN (Not-a-Number) if you do not need constraint on the model value at X=0 (in C++ you can pass alglib::fp_nan as parameter, in C# it will be Double.NaN). See below, section "EQUALITY CONSTRAINTS" for more information about constraints. CnstrRight- optional equality constraint for model value at X=infinity. Specify NAN (Not-a-Number) if you do not need constraint on the model value (in C++ you can pass alglib::fp_nan as parameter, in C# it will be Double.NaN). See below, section "EQUALITY CONSTRAINTS" for more information about constraints. Is4PL - whether 4PL or 5PL models are fitted LambdaV - regularization coefficient, LambdaV>=0. Set it to zero unless you know what you are doing. EpsX - stopping condition (step size), EpsX>=0. Zero value means that small step is automatically chosen. See notes below for more information. RsCnt - number of repeated restarts from random points. 4PL/5PL models are prone to problem of bad local extrema. Utilizing multiple random restarts allows us to improve algorithm convergence. RsCnt>=0. Zero value means that function automatically choose small amount of restarts (recommended). OUTPUT PARAMETERS: A, B, C, D- parameters of 4PL model G - parameter of 5PL model; for Is4PL=True, G=1 is returned. Rep - fitting report. This structure has many fields, but ONLY ONES LISTED BELOW ARE SET: * Rep.IterationsCount - number of iterations performed * Rep.RMSError - root-mean-square error * Rep.AvgError - average absolute error * Rep.AvgRelError - average relative error (calculated for non-zero Y-values) * Rep.MaxError - maximum absolute error * Rep.R2 - coefficient of determination, R-squared. This coefficient is calculated as R2=1-RSS/TSS (in case of nonlinear regression there are multiple ways to define R2, each of them giving different results). NOTE: after you obtained coefficients, you can evaluate model with LogisticCalc5() function. NOTE: step is automatically scaled according to scale of parameters being fitted before we compare its length with EpsX. Thus, this function can be used to fit data with very small or very large values without changing EpsX. EQUALITY CONSTRAINTS ON PARAMETERS 4PL/5PL solver supports equality constraints on model values at the left boundary (X=0) and right boundary (X=infinity). These constraints are completely optional and you can specify both of them, only one - or no constraints at all. Parameter CnstrLeft contains left constraint (or NAN for unconstrained fitting), and CnstrRight contains right one. For 4PL, left constraint ALWAYS corresponds to parameter A, and right one is ALWAYS constraint on D. That's because 4PL model is normalized in such way that B>=0. For 5PL model things are different. 
Unlike the 4PL one, the 5PL model is NOT symmetric with respect to change in sign of B. Thus, negative B's are possible, and the left constraint may constrain parameter A (for positive B's) - or parameter D (for negative B's). The meaning of the right constraint changes similarly. You do not have to decide which parameter to constrain - the algorithm automatically determines the correct parameters as fitting progresses. However, the distinction highlighted above is important when you interpret fitting results. -- ALGLIB PROJECT -- Copyright 14.02.2014 by Bochkanov Sergey *************************************************************************/
void alglib::logisticfit45x( real_1d_array x, real_1d_array y, ae_int_t n, double cnstrleft, double cnstrright, bool is4pl, double lambdav, double epsx, ae_int_t rscnt, double& a, double& b, double& c, double& d, double& g, lsfitreport& rep);
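A hedged sketch of the "expert" interface above (not from the manual): a 5PL model is fitted with the value at X=0 constrained to 0.0 and the right boundary left unconstrained via alglib::fp_nan, using default EpsX, no regularization and the automatically chosen number of restarts. Header name and data are assumptions.

#include "interpolation.h"
using namespace alglib;

int main()
{
    real_1d_array x = "[0.0,0.5,1.0,2.0,4.0,8.0,16.0]";
    real_1d_array y = "[0.02,0.08,0.15,0.38,0.65,0.88,0.97]";
    double a, b, c, d, g;
    lsfitreport rep;

    // CnstrLeft=0.0, CnstrRight=NAN (unconstrained), Is4PL=false (fit 5PL),
    // LambdaV=0 (no regularization), EpsX=0 and RsCnt=0 (automatic defaults)
    logisticfit45x(x, y, 7, 0.0, fp_nan, false, 0.0, 0.0, 0,
                   a, b, c, d, g, rep);
    return 0;
}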
/************************************************************************* This function fits four-parameter logistic (4PL) model to data provided by user, with optional constraints on parameters A and D. 4PL model has following form: F(x|A,B,C,D) = D+(A-D)/(1+Power(x/C,B)) Here: * A, D - with optional equality constraints * B>=0 * C>0 IMPORTANT: output of this function is constrained in such way that B>0. Because 4PL model is symmetric with respect to B, there is no need to explore B<0. Constraining B makes algorithm easier to stabilize and debug. Users who for some reason prefer to work with negative B's should transform output themselves (swap A and D, replace B by -B). 4PL fitting is implemented as follows: * we perform small number of restarts from random locations which helps to solve problem of bad local extrema. Locations are only partially random - we use input data to determine good initial guess, but we include controlled amount of randomness. * we perform Levenberg-Marquardt fitting with very tight constraints on parameters B and C - it allows us to find good initial guess for the second stage without risk of running into "flat spot". * second Levenberg-Marquardt round is performed without excessive constraints. Results from the previous round are used as initial guess. * after fitting is done, we compare results with best values found so far, rewrite "best solution" if needed, and move to next random location. Overall algorithm is very stable and is not prone to bad local extrema. Furthermore, it automatically scales when input data have very large or very small range. INPUT PARAMETERS: X - array[N], stores X-values. MUST include only non-negative numbers (but may include zero values). Can be unsorted. Y - array[N], values to fit. N - number of points. If N is less than length of X/Y, only leading N elements are used. CnstrLeft- optional equality constraint for model value at the left boundary (at X=0). Specify NAN (Not-a-Number) if you do not need constraint on the model value at X=0 (in C++ you can pass alglib::fp_nan as parameter, in C# it will be Double.NaN). See below, section "EQUALITY CONSTRAINTS" for more information about constraints. CnstrRight- optional equality constraint for model value at X=infinity. Specify NAN (Not-a-Number) if you do not need constraint on the model value (in C++ you can pass alglib::fp_nan as parameter, in C# it will be Double.NaN). See below, section "EQUALITY CONSTRAINTS" for more information about constraints. OUTPUT PARAMETERS: A, B, C, D- parameters of 4PL model Rep - fitting report. This structure has many fields, but ONLY ONES LISTED BELOW ARE SET: * Rep.IterationsCount - number of iterations performed * Rep.RMSError - root-mean-square error * Rep.AvgError - average absolute error * Rep.AvgRelError - average relative error (calculated for non-zero Y-values) * Rep.MaxError - maximum absolute error * Rep.R2 - coefficient of determination, R-squared. This coefficient is calculated as R2=1-RSS/TSS (in case of nonlinear regression there are multiple ways to define R2, each of them giving different results). NOTE: after you obtained coefficients, you can evaluate model with LogisticCalc4() function. NOTE: if you need better control over fitting process than provided by this function, you may use LogisticFit45X(). NOTE: step is automatically scaled according to scale of parameters being fitted before we compare its length with EpsX. Thus, this function can be used to fit data with very small or very large values without changing EpsX. 
EQUALITY CONSTRAINTS ON PARAMETERS 4PL/5PL solver supports equality constraints on model values at the left boundary (X=0) and right boundary (X=infinity). These constraints are completely optional and you can specify both of them, only one - or no constraints at all. Parameter CnstrLeft contains left constraint (or NAN for unconstrained fitting), and CnstrRight contains right one. For 4PL, left constraint ALWAYS corresponds to parameter A, and right one is ALWAYS constraint on D. That's because 4PL model is normalized in such way that B>=0. -- ALGLIB PROJECT -- Copyright 14.02.2014 by Bochkanov Sergey *************************************************************************/
void alglib::logisticfit4ec( real_1d_array x, real_1d_array y, ae_int_t n, double cnstrleft, double cnstrright, double& a, double& b, double& c, double& d, lsfitreport& rep);
/************************************************************************* This function fits five-parameter logistic (5PL) model to data provided by user. 5PL model has following form: F(x|A,B,C,D,G) = D+(A-D)/Power(1+Power(x/C,B),G) Here: * A, D - unconstrained * B - unconstrained * C>0 * G>0 IMPORTANT: unlike in 4PL fitting, output of this function is NOT constrained in such way that B is guaranteed to be positive. Furthermore, unlike 4PL, 5PL model is NOT symmetric with respect to B, so you can NOT transform model to equivalent one, with B having desired sign (>0 or <0). 5PL fitting is implemented as follows: * we perform small number of restarts from random locations which helps to solve problem of bad local extrema. Locations are only partially random - we use input data to determine good initial guess, but we include controlled amount of randomness. * we perform Levenberg-Marquardt fitting with very tight constraints on parameters B and C - it allows us to find good initial guess for the second stage without risk of running into "flat spot". Parameter G is fixed at G=1. * second Levenberg-Marquardt round is performed without excessive constraints on B and C, but with G still equal to 1. Results from the previous round are used as initial guess. * third Levenberg-Marquardt round relaxes constraints on G and tries two different models - one with B>0 and one with B<0. * after fitting is done, we compare results with best values found so far, rewrite "best solution" if needed, and move to next random location. Overall algorithm is very stable and is not prone to bad local extrema. Furthermore, it automatically scales when input data have very large or very small range. INPUT PARAMETERS: X - array[N], stores X-values. MUST include only non-negative numbers (but may include zero values). Can be unsorted. Y - array[N], values to fit. N - number of points. If N is less than length of X/Y, only leading N elements are used. OUTPUT PARAMETERS: A,B,C,D,G- parameters of 5PL model Rep - fitting report. This structure has many fields, but ONLY ONES LISTED BELOW ARE SET: * Rep.IterationsCount - number of iterations performed * Rep.RMSError - root-mean-square error * Rep.AvgError - average absolute error * Rep.AvgRelError - average relative error (calculated for non-zero Y-values) * Rep.MaxError - maximum absolute error * Rep.R2 - coefficient of determination, R-squared. This coefficient is calculated as R2=1-RSS/TSS (in case of nonlinear regression there are multiple ways to define R2, each of them giving different results). NOTE: after you obtained coefficients, you can evaluate model with LogisticCalc5() function. NOTE: if you need better control over fitting process than provided by this function, you may use LogisticFit45X(). NOTE: step is automatically scaled according to scale of parameters being fitted before we compare its length with EpsX. Thus, this function can be used to fit data with very small or very large values without changing EpsX. -- ALGLIB PROJECT -- Copyright 14.02.2014 by Bochkanov Sergey *************************************************************************/
void alglib::logisticfit5( real_1d_array x, real_1d_array y, ae_int_t n, double& a, double& b, double& c, double& d, double& g, lsfitreport& rep);

Examples:   [1]  

/************************************************************************* This function fits five-parameter logistic (5PL) model to data provided by user, subject to optional equality constraints on parameters A and D. 5PL model has following form: F(x|A,B,C,D,G) = D+(A-D)/Power(1+Power(x/C,B),G) Here: * A, D - with optional equality constraints * B - unconstrained * C>0 * G>0 IMPORTANT: unlike in 4PL fitting, output of this function is NOT constrained in such way that B is guaranteed to be positive. Furthermore, unlike 4PL, 5PL model is NOT symmetric with respect to B, so you can NOT transform model to equivalent one, with B having desired sign (>0 or <0). 5PL fitting is implemented as follows: * we perform small number of restarts from random locations which helps to solve problem of bad local extrema. Locations are only partially random - we use input data to determine good initial guess, but we include controlled amount of randomness. * we perform Levenberg-Marquardt fitting with very tight constraints on parameters B and C - it allows us to find good initial guess for the second stage without risk of running into "flat spot". Parameter G is fixed at G=1. * second Levenberg-Marquardt round is performed without excessive constraints on B and C, but with G still equal to 1. Results from the previous round are used as initial guess. * third Levenberg-Marquardt round relaxes constraints on G and tries two different models - one with B>0 and one with B<0. * after fitting is done, we compare results with best values found so far, rewrite "best solution" if needed, and move to next random location. Overall algorithm is very stable and is not prone to bad local extrema. Furthermore, it automatically scales when input data have very large or very small range. INPUT PARAMETERS: X - array[N], stores X-values. MUST include only non-negative numbers (but may include zero values). Can be unsorted. Y - array[N], values to fit. N - number of points. If N is less than length of X/Y, only leading N elements are used. CnstrLeft- optional equality constraint for model value at the left boundary (at X=0). Specify NAN (Not-a-Number) if you do not need constraint on the model value at X=0 (in C++ you can pass alglib::fp_nan as parameter, in C# it will be Double.NaN). See below, section "EQUALITY CONSTRAINTS" for more information about constraints. CnstrRight- optional equality constraint for model value at X=infinity. Specify NAN (Not-a-Number) if you do not need constraint on the model value (in C++ you can pass alglib::fp_nan as parameter, in C# it will be Double.NaN). See below, section "EQUALITY CONSTRAINTS" for more information about constraints. OUTPUT PARAMETERS: A,B,C,D,G- parameters of 5PL model Rep - fitting report. This structure has many fields, but ONLY ONES LISTED BELOW ARE SET: * Rep.IterationsCount - number of iterations performed * Rep.RMSError - root-mean-square error * Rep.AvgError - average absolute error * Rep.AvgRelError - average relative error (calculated for non-zero Y-values) * Rep.MaxError - maximum absolute error * Rep.R2 - coefficient of determination, R-squared. This coefficient is calculated as R2=1-RSS/TSS (in case of nonlinear regression there are multiple ways to define R2, each of them giving different results). NOTE: after you obtained coefficients, you can evaluate model with LogisticCalc5() function. NOTE: if you need better control over fitting process than provided by this function, you may use LogisticFit45X(). 
NOTE: step is automatically scaled according to scale of parameters being fitted before we compare its length with EpsX. Thus, this function can be used to fit data with very small or very large values without changing EpsX. EQUALITY CONSTRAINTS ON PARAMETERS 5PL solver supports equality constraints on model values at the left boundary (X=0) and right boundary (X=infinity). These constraints are completely optional and you can specify both of them, only one - or no constraints at all. Parameter CnstrLeft contains the left constraint (or NAN for unconstrained fitting), and CnstrRight contains the right one. Unlike the 4PL one, the 5PL model is NOT symmetric with respect to change in sign of B. Thus, negative B's are possible, and the left constraint may constrain parameter A (for positive B's) - or parameter D (for negative B's). The meaning of the right constraint changes similarly. You do not have to decide which parameter to constrain - the algorithm automatically determines the correct parameters as fitting progresses. However, the distinction highlighted above is important when you interpret fitting results. -- ALGLIB PROJECT -- Copyright 14.02.2014 by Bochkanov Sergey *************************************************************************/
void alglib::logisticfit5ec( real_1d_array x, real_1d_array y, ae_int_t n, double cnstrleft, double cnstrright, double& a, double& b, double& c, double& d, double& g, lsfitreport& rep);
/************************************************************************* Nonlinear least squares fitting using function values only. Combination of numerical differentiation and secant updates is used to obtain function Jacobian. Nonlinear task min(F(c)) is solved, where F(c) = (f(c,x[0])-y[0])^2 + ... + (f(c,x[n-1])-y[n-1])^2, * N is a number of points, * M is a dimension of a space points belong to, * K is a dimension of a space of parameters being fitted, * x is a set of N points, each of them is an M-dimensional vector, * c is a K-dimensional vector of parameters being fitted This subroutine uses only f(c,x[i]). INPUT PARAMETERS: X - array[0..N-1,0..M-1], points (one row = one point) Y - array[0..N-1], function values. C - array[0..K-1], initial approximation to the solution, N - number of points, N>1 M - dimension of space K - number of parameters being fitted DiffStep- numerical differentiation step; should not be very small or large; large = loss of accuracy small = growth of round-off errors OUTPUT PARAMETERS: State - structure which stores algorithm state -- ALGLIB -- Copyright 18.10.2008 by Bochkanov Sergey *************************************************************************/
void alglib::lsfitcreatef( real_2d_array x, real_1d_array y, real_1d_array c, double diffstep, lsfitstate& state); void alglib::lsfitcreatef( real_2d_array x, real_1d_array y, real_1d_array c, ae_int_t n, ae_int_t m, ae_int_t k, double diffstep, lsfitstate& state);

Examples:   [1]  [2]  [3]  
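A minimal derivative-free fitting sketch (not a verbatim manual example): the model f(c,x)=exp(-c0*x0^2) is fitted to five points using only function values. The callback signature matches the one documented for lsfitfit() further below; lsfitresults() (listed in this subpackage) is assumed to return the solution as (info, c, rep), and the header name and data are assumptions.

#include <math.h>
#include "interpolation.h"
using namespace alglib;

// f(c,x) = exp(-c0*x0^2); only the function value is supplied,
// the Jacobian is obtained by numerical differentiation
void f_exp(const real_1d_array &c, const real_1d_array &x, double &func, void *ptr)
{
    func = exp(-c[0]*x[0]*x[0]);
}

int main()
{
    real_2d_array x = "[[-1],[-0.5],[0],[0.5],[1]]";   // N=5 points, M=1
    real_1d_array y = "[0.22,0.70,1.00,0.70,0.22]";     // observed values
    real_1d_array c = "[0.3]";                          // initial guess, K=1
    lsfitstate state;
    lsfitreport rep;
    ae_int_t info;

    lsfitcreatef(x, y, c, 0.0001, state);   // DiffStep = 1e-4
    lsfitfit(state, f_exp);
    lsfitresults(state, info, c, rep);      // c[0] should end up close to 1.5
    return 0;
}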

/************************************************************************* Nonlinear least squares fitting using gradient only, without individual weights. Nonlinear task min(F(c)) is solved, where F(c) = ((f(c,x[0])-y[0]))^2 + ... + ((f(c,x[n-1])-y[n-1]))^2, * N is a number of points, * M is a dimension of a space points belong to, * K is a dimension of a space of parameters being fitted, * x is a set of N points, each of them is an M-dimensional vector, * c is a K-dimensional vector of parameters being fitted This subroutine uses only f(c,x[i]) and its gradient. INPUT PARAMETERS: X - array[0..N-1,0..M-1], points (one row = one point) Y - array[0..N-1], function values. C - array[0..K-1], initial approximation to the solution, N - number of points, N>1 M - dimension of space K - number of parameters being fitted CheapFG - boolean flag, which is: * True if both function and gradient calculation complexity are less than O(M^2). An improved algorithm can be used which corresponds to FGJ scheme from MINLM unit. * False otherwise. Standard Jacobian-based Levenberg-Marquardt algorithm will be used (FJ scheme). OUTPUT PARAMETERS: State - structure which stores algorithm state -- ALGLIB -- Copyright 17.08.2009 by Bochkanov Sergey *************************************************************************/
void alglib::lsfitcreatefg( real_2d_array x, real_1d_array y, real_1d_array c, bool cheapfg, lsfitstate& state); void alglib::lsfitcreatefg( real_2d_array x, real_1d_array y, real_1d_array c, ae_int_t n, ae_int_t m, ae_int_t k, bool cheapfg, lsfitstate& state);

Examples:   [1]  

/************************************************************************* Nonlinear least squares fitting using gradient/Hessian, without individual weights. Nonlinear task min(F(c)) is solved, where F(c) = ((f(c,x[0])-y[0]))^2 + ... + ((f(c,x[n-1])-y[n-1]))^2, * N is a number of points, * M is a dimension of a space points belong to, * K is a dimension of a space of parameters being fitted, * x is a set of N points, each of them is an M-dimensional vector, * c is a K-dimensional vector of parameters being fitted This subroutine uses f(c,x[i]), its gradient and its Hessian. INPUT PARAMETERS: X - array[0..N-1,0..M-1], points (one row = one point) Y - array[0..N-1], function values. C - array[0..K-1], initial approximation to the solution, N - number of points, N>1 M - dimension of space K - number of parameters being fitted OUTPUT PARAMETERS: State - structure which stores algorithm state -- ALGLIB -- Copyright 17.08.2009 by Bochkanov Sergey *************************************************************************/
void alglib::lsfitcreatefgh( real_2d_array x, real_1d_array y, real_1d_array c, lsfitstate& state); void alglib::lsfitcreatefgh( real_2d_array x, real_1d_array y, real_1d_array c, ae_int_t n, ae_int_t m, ae_int_t k, lsfitstate& state);

Examples:   [1]  

/************************************************************************* Weighted nonlinear least squares fitting using function values only. Combination of numerical differentiation and secant updates is used to obtain function Jacobian. Nonlinear task min(F(c)) is solved, where F(c) = (w[0]*(f(c,x[0])-y[0]))^2 + ... + (w[n-1]*(f(c,x[n-1])-y[n-1]))^2, * N is a number of points, * M is a dimension of a space points belong to, * K is a dimension of a space of parameters being fitted, * w is an N-dimensional vector of weight coefficients, * x is a set of N points, each of them is an M-dimensional vector, * c is a K-dimensional vector of parameters being fitted This subroutine uses only f(c,x[i]). INPUT PARAMETERS: X - array[0..N-1,0..M-1], points (one row = one point) Y - array[0..N-1], function values. W - weights, array[0..N-1] C - array[0..K-1], initial approximation to the solution, N - number of points, N>1 M - dimension of space K - number of parameters being fitted DiffStep- numerical differentiation step; should not be very small or large; large = loss of accuracy small = growth of round-off errors OUTPUT PARAMETERS: State - structure which stores algorithm state -- ALGLIB -- Copyright 18.10.2008 by Bochkanov Sergey *************************************************************************/
void alglib::lsfitcreatewf( real_2d_array x, real_1d_array y, real_1d_array w, real_1d_array c, double diffstep, lsfitstate& state); void alglib::lsfitcreatewf( real_2d_array x, real_1d_array y, real_1d_array w, real_1d_array c, ae_int_t n, ae_int_t m, ae_int_t k, double diffstep, lsfitstate& state);

Examples:   [1]  [2]  

/************************************************************************* Weighted nonlinear least squares fitting using gradient only. Nonlinear task min(F(c)) is solved, where F(c) = (w[0]*(f(c,x[0])-y[0]))^2 + ... + (w[n-1]*(f(c,x[n-1])-y[n-1]))^2, * N is a number of points, * M is a dimension of a space points belong to, * K is a dimension of a space of parameters being fitted, * w is an N-dimensional vector of weight coefficients, * x is a set of N points, each of them is an M-dimensional vector, * c is a K-dimensional vector of parameters being fitted This subroutine uses only f(c,x[i]) and its gradient. INPUT PARAMETERS: X - array[0..N-1,0..M-1], points (one row = one point) Y - array[0..N-1], function values. W - weights, array[0..N-1] C - array[0..K-1], initial approximation to the solution, N - number of points, N>1 M - dimension of space K - number of parameters being fitted CheapFG - boolean flag, which is: * True if both function and gradient calculation complexity are less than O(M^2). An improved algorithm can be used which corresponds to FGJ scheme from MINLM unit. * False otherwise. Standard Jacobian-based Levenberg-Marquardt algorithm will be used (FJ scheme). OUTPUT PARAMETERS: State - structure which stores algorithm state See also: LSFitResults LSFitCreateFG (fitting without weights) LSFitCreateWFGH (fitting using Hessian) LSFitCreateFGH (fitting using Hessian, without weights) -- ALGLIB -- Copyright 17.08.2009 by Bochkanov Sergey *************************************************************************/
void alglib::lsfitcreatewfg( real_2d_array x, real_1d_array y, real_1d_array w, real_1d_array c, bool cheapfg, lsfitstate& state); void alglib::lsfitcreatewfg( real_2d_array x, real_1d_array y, real_1d_array w, real_1d_array c, ae_int_t n, ae_int_t m, ae_int_t k, bool cheapfg, lsfitstate& state);

Examples:   [1]  

/************************************************************************* Weighted nonlinear least squares fitting using gradient/Hessian. Nonlinear task min(F(c)) is solved, where F(c) = (w[0]*(f(c,x[0])-y[0]))^2 + ... + (w[n-1]*(f(c,x[n-1])-y[n-1]))^2, * N is a number of points, * M is a dimension of a space points belong to, * K is a dimension of a space of parameters being fitted, * w is an N-dimensional vector of weight coefficients, * x is a set of N points, each of them is an M-dimensional vector, * c is a K-dimensional vector of parameters being fitted This subroutine uses f(c,x[i]), its gradient and its Hessian. INPUT PARAMETERS: X - array[0..N-1,0..M-1], points (one row = one point) Y - array[0..N-1], function values. W - weights, array[0..N-1] C - array[0..K-1], initial approximation to the solution, N - number of points, N>1 M - dimension of space K - number of parameters being fitted OUTPUT PARAMETERS: State - structure which stores algorithm state -- ALGLIB -- Copyright 17.08.2009 by Bochkanov Sergey *************************************************************************/
void alglib::lsfitcreatewfgh( real_2d_array x, real_1d_array y, real_1d_array w, real_1d_array c, lsfitstate& state); void alglib::lsfitcreatewfgh( real_2d_array x, real_1d_array y, real_1d_array w, real_1d_array c, ae_int_t n, ae_int_t m, ae_int_t k, lsfitstate& state);

Examples:   [1]  

/************************************************************************* This family of functions is used to launch iterations of the nonlinear fitter. These functions accept the following parameters: state - algorithm state func - callback which calculates function (or merit function) value func at given point x grad - callback which calculates function (or merit function) value func and gradient grad at given point x hess - callback which calculates function (or merit function) value func, gradient grad and Hessian hess at given point x rep - optional callback which is called after each iteration can be NULL ptr - optional pointer which is passed to func/grad/hess/jac/rep can be NULL NOTES: 1. this algorithm is somewhat unusual because it works with parameterized function f(C,X), where X is a function argument (we have many points which are characterized by different argument values), and C is a parameter to fit. For example, if we want to do linear fit by f(c0,c1,x) = c0*x+c1, then x will be argument, and {c0,c1} will be parameters. It is important to understand that this algorithm finds minimum in the space of function PARAMETERS (not arguments), so it needs derivatives of f() with respect to C, not X. In the example above it will need f=c0*x+c1 and {df/dc0,df/dc1} = {x,1} instead of {df/dx} = {c0}. 2. Callback functions accept C as the first parameter, and X as the second 3. If state was created with LSFitCreateFG(), algorithm needs just function and its gradient, but if state was created with LSFitCreateFGH(), algorithm will need function, gradient and Hessian. As noted above, there are several versions of this function, which accept different sets of callbacks. This flexibility opens the way to subtle errors - you may create state with LSFitCreateFGH() (optimization using Hessian), but call a version which does not accept a Hessian callback. When the algorithm requests the Hessian, there will be no callback to call, and an exception will be thrown. Be careful to avoid such errors - there is no way to find them at compile time, you will see them at runtime only. -- ALGLIB -- Copyright 17.08.2009 by Bochkanov Sergey *************************************************************************/
void lsfitfit(lsfitstate &state, void (*func)(const real_1d_array &c, const real_1d_array &x, double &func, void *ptr), void (*rep)(const real_1d_array &c, double func, void *ptr) = NULL, void *ptr = NULL); void lsfitfit(lsfitstate &state, void (*func)(const real_1d_array &c, const real_1d_array &x, double &func, void *ptr), void (*grad)(const real_1d_array &c, const real_1d_array &x, double &func, real_1d_array &grad, void *ptr), void (*rep)(const real_1d_array &c, double func, void *ptr) = NULL, void *ptr = NULL); void lsfitfit(lsfitstate &state, void (*func)(const real_1d_array &c, const real_1d_array &x, double &func, void *ptr), void (*grad)(const real_1d_array &c, const real_1d_array &x, double &func, real_1d_array &grad, void *ptr), void (*hess)(const real_1d_array &c, const real_1d_array &x, double &func, real_1d_array &grad, real_2d_array &hess, void *ptr), void (*rep)(const real_1d_array &c, double func, void *ptr) = NULL, void *ptr = NULL);

Examples:   [1]  [2]  [3]  [4]  [5]  
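A gradient-based sketch of the same fit as above (not from the manual's example set): the state is created with lsfitcreatefg(), and the gradient callback supplies df/dc0 as required by NOTE 1. Header name, data and the use of lsfitresults() are assumptions.

#include <math.h>
#include "interpolation.h"
using namespace alglib;

// f(c,x) = exp(-c0*x0^2) and its derivative with respect to the PARAMETER c0
void f_exp(const real_1d_array &c, const real_1d_array &x, double &func, void *ptr)
{
    func = exp(-c[0]*x[0]*x[0]);
}
void g_exp(const real_1d_array &c, const real_1d_array &x, double &func,
           real_1d_array &grad, void *ptr)
{
    func = exp(-c[0]*x[0]*x[0]);
    grad[0] = -x[0]*x[0]*func;   // df/dc0, NOT df/dx
}

int main()
{
    real_2d_array x = "[[-1],[-0.5],[0],[0.5],[1]]";
    real_1d_array y = "[0.22,0.70,1.00,0.70,0.22]";
    real_1d_array c = "[0.3]";
    lsfitstate state;
    lsfitreport rep;
    ae_int_t info;

    lsfitcreatefg(x, y, c, true, state);   // CheapFG=true: FGJ scheme
    lsfitfit(state, f_exp, g_exp);
    lsfitresults(state, info, c, rep);
    return 0;
}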

/************************************************************************* Linear least squares fitting. QR decomposition is used to reduce task to MxM, then triangular solver or SVD-based solver is used depending on condition number of the system. It allows to maximize speed and retain decent accuracy. IMPORTANT: if you want to perform polynomial fitting, it may be more convenient to use PolynomialFit() function. This function gives best results on polynomial problems and solves numerical stability issues which arise when you fit high-degree polynomials to your data. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multithreading support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Speed-up provided by multithreading greatly depends on problem size ! - only large problems (number of coefficients is more than 500) can be ! efficiently multithreaded. ! ! Generally, commercial ALGLIB is several times faster than open-source ! generic C edition, and many times faster than open-source C# edition. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS: Y - array[0..N-1] Function values in N points. FMatrix - a table of basis functions values, array[0..N-1, 0..M-1]. FMatrix[I, J] - value of J-th basis function in I-th point. N - number of points used. N>=1. M - number of basis functions, M>=1. OUTPUT PARAMETERS: Info - error code: * -4 internal SVD decomposition subroutine failed (very rare and for degenerate systems only) * 1 task is solved C - decomposition coefficients, array[0..M-1] Rep - fitting report. Following fields are set: * Rep.TaskRCond reciprocal of condition number * R2 non-adjusted coefficient of determination (non-weighted) * RMSError rms error on the (X,Y). * AvgError average error on the (X,Y). * AvgRelError average relative error on the non-zero Y * MaxError maximum error NON-WEIGHTED ERRORS ARE CALCULATED ERRORS IN PARAMETERS This solver also calculates different kinds of errors in parameters and fills corresponding fields of report: * Rep.CovPar covariance matrix for parameters, array[K,K]. * Rep.ErrPar errors in parameters, array[K], errpar = sqrt(diag(CovPar)) * Rep.ErrCurve vector of fit errors - standard deviations of empirical best-fit curve from "ideal" best-fit curve built with infinite number of samples, array[N]. errcurve = sqrt(diag(F*CovPar*F')), where F is functions matrix. * Rep.Noise vector of per-point estimates of noise, array[N] NOTE: noise in the data is estimated as follows: * for fitting without user-supplied weights all points are assumed to have same level of noise, which is estimated from the data * for fitting with user-supplied weights we assume that noise level in I-th point is inversely proportional to Ith weight. Coefficient of proportionality is estimated from the data. NOTE: we apply small amount of regularization when we invert squared Jacobian and calculate covariance matrix. It guarantees that algorithm won't divide by zero during inversion, but skews error estimates a bit (fractional error is about 10^-9). 
However, we believe that this difference is insignificant for all practical purposes except for the situation when you want to compare ALGLIB results with "reference" implementation up to the last significant digit. NOTE: covariance matrix is estimated using correction for degrees of freedom (covariances are divided by N-M instead of dividing by N). -- ALGLIB -- Copyright 17.08.2009 by Bochkanov Sergey *************************************************************************/
void alglib::lsfitlinear( real_1d_array y, real_2d_array fmatrix, ae_int_t& info, real_1d_array& c, lsfitreport& rep); void alglib::lsfitlinear( real_1d_array y, real_2d_array fmatrix, ae_int_t n, ae_int_t m, ae_int_t& info, real_1d_array& c, lsfitreport& rep); void alglib::smp_lsfitlinear( real_1d_array y, real_2d_array fmatrix, ae_int_t& info, real_1d_array& c, lsfitreport& rep); void alglib::smp_lsfitlinear( real_1d_array y, real_2d_array fmatrix, ae_int_t n, ae_int_t m, ae_int_t& info, real_1d_array& c, lsfitreport& rep);

Examples:   [1]  
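A short illustrative sketch (not an official example): the line y = c0 + c1*x is fitted by building FMatrix from the basis functions f0(x)=1 and f1(x)=x evaluated at x = 0,1,2,3. Header name and data are assumptions.

#include "interpolation.h"
using namespace alglib;

int main()
{
    // basis functions f0(x)=1 and f1(x)=x evaluated at x = 0,1,2,3
    real_2d_array fmatrix = "[[1,0],[1,1],[1,2],[1,3]]";
    real_1d_array y = "[0.9,3.1,4.9,7.2]";   // roughly 1 + 2*x
    real_1d_array c;
    lsfitreport rep;
    ae_int_t info;

    lsfitlinear(y, fmatrix, info, c, rep);
    // info=1 on success; c[0] is the intercept, c[1] the slope,
    // and rep.rmserror, rep.r2 etc. are filled as described above
    return info==1 ? 0 : 1;
}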

/************************************************************************* Constrained linear least squares fitting. This is a variation of LSFitLinear(), which searches for min|A*x-b| given that K additional constraints C*x=bc are satisfied. It reduces the original task to a modified one: min|B*y-d| WITHOUT constraints, then LSFitLinear() is called. IMPORTANT: if you want to perform polynomial fitting, it may be more convenient to use PolynomialFit() function. This function gives best results on polynomial problems and solves numerical stability issues which arise when you fit high-degree polynomials to your data. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multithreading support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Speed-up provided by multithreading greatly depends on problem size ! - only large problems (number of coefficients is more than 500) can be ! efficiently multithreaded. ! ! Generally, commercial ALGLIB is several times faster than open-source ! generic C edition, and many times faster than open-source C# edition. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS: Y - array[0..N-1] Function values in N points. FMatrix - a table of basis functions values, array[0..N-1, 0..M-1]. FMatrix[I,J] - value of J-th basis function in I-th point. CMatrix - a table of constraints, array[0..K-1,0..M]. I-th row of CMatrix corresponds to I-th linear constraint: CMatrix[I,0]*C[0] + ... + CMatrix[I,M-1]*C[M-1] = CMatrix[I,M] N - number of points used. N>=1. M - number of basis functions, M>=1. K - number of constraints, 0 <= K < M K=0 corresponds to absence of constraints. OUTPUT PARAMETERS: Info - error code: * -4 internal SVD decomposition subroutine failed (very rare and for degenerate systems only) * -3 either too many constraints (M or more), degenerate constraints (some constraints are repeated twice) or inconsistent constraints were specified. * 1 task is solved C - decomposition coefficients, array[0..M-1] Rep - fitting report. Following fields are set: * R2 non-adjusted coefficient of determination (non-weighted) * RMSError rms error on the (X,Y). * AvgError average error on the (X,Y). * AvgRelError average relative error on the non-zero Y * MaxError maximum error NON-WEIGHTED ERRORS ARE CALCULATED IMPORTANT: this subroutine doesn't calculate task's condition number for K<>0. ERRORS IN PARAMETERS This solver also calculates different kinds of errors in parameters and fills corresponding fields of report: * Rep.CovPar covariance matrix for parameters, array[K,K]. * Rep.ErrPar errors in parameters, array[K], errpar = sqrt(diag(CovPar)) * Rep.ErrCurve vector of fit errors - standard deviations of empirical best-fit curve from "ideal" best-fit curve built with infinite number of samples, array[N]. errcurve = sqrt(diag(F*CovPar*F')), where F is functions matrix. * Rep.Noise vector of per-point estimates of noise, array[N] IMPORTANT: errors in parameters are calculated without taking into account boundary/linear constraints! 
Presence of constraints changes distribution of errors, but there is no easy way to account for constraints when you calculate covariance matrix. NOTE: noise in the data is estimated as follows: * for fitting without user-supplied weights all points are assumed to have same level of noise, which is estimated from the data * for fitting with user-supplied weights we assume that noise level in I-th point is inversely proportional to Ith weight. Coefficient of proportionality is estimated from the data. NOTE: we apply small amount of regularization when we invert squared Jacobian and calculate covariance matrix. It guarantees that algorithm won't divide by zero during inversion, but skews error estimates a bit (fractional error is about 10^-9). However, we believe that this difference is insignificant for all practical purposes except for the situation when you want to compare ALGLIB results with "reference" implementation up to the last significant digit. NOTE: covariance matrix is estimated using correction for degrees of freedom (covariances are divided by N-M instead of dividing by N). -- ALGLIB -- Copyright 07.09.2009 by Bochkanov Sergey *************************************************************************/
void alglib::lsfitlinearc( real_1d_array y, real_2d_array fmatrix, real_2d_array cmatrix, ae_int_t& info, real_1d_array& c, lsfitreport& rep); void alglib::lsfitlinearc( real_1d_array y, real_2d_array fmatrix, real_2d_array cmatrix, ae_int_t n, ae_int_t m, ae_int_t k, ae_int_t& info, real_1d_array& c, lsfitreport& rep); void alglib::smp_lsfitlinearc( real_1d_array y, real_2d_array fmatrix, real_2d_array cmatrix, ae_int_t& info, real_1d_array& c, lsfitreport& rep); void alglib::smp_lsfitlinearc( real_1d_array y, real_2d_array fmatrix, real_2d_array cmatrix, ae_int_t n, ae_int_t m, ae_int_t k, ae_int_t& info, real_1d_array& c, lsfitreport& rep);

Examples:   [1]  
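An illustrative sketch of the constrained variant (not from the manual): the same 1, x basis is used, and a single row of CMatrix, [1, 0 | 1], fixes the intercept c0 to 1 so the fitted line passes through (0, 1). Header name and data are assumptions.

#include "interpolation.h"
using namespace alglib;

int main()
{
    real_2d_array fmatrix = "[[1,0],[1,1],[1,2],[1,3]]";   // basis 1, x
    real_1d_array y = "[0.9,3.1,4.9,7.2]";
    // one constraint (K=1): 1*c[0] + 0*c[1] = 1, i.e. intercept fixed to 1
    real_2d_array cmatrix = "[[1,0,1]]";
    real_1d_array c;
    lsfitreport rep;
    ae_int_t info;

    lsfitlinearc(y, fmatrix, cmatrix, info, c, rep);
    return info==1 ? 0 : 1;
}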

/************************************************************************* Weighted linear least squares fitting. QR decomposition is used to reduce task to MxM, then triangular solver or SVD-based solver is used depending on condition number of the system. It allows to maximize speed and retain decent accuracy. IMPORTANT: if you want to perform polynomial fitting, it may be more convenient to use PolynomialFit() function. This function gives best results on polynomial problems and solves numerical stability issues which arise when you fit high-degree polynomials to your data. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multithreading support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Speed-up provided by multithreading greatly depends on problem size ! - only large problems (number of coefficients is more than 500) can be ! efficiently multithreaded. ! ! Generally, commercial ALGLIB is several times faster than open-source ! generic C edition, and many times faster than open-source C# edition. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS: Y - array[0..N-1] Function values in N points. W - array[0..N-1] Weights corresponding to function values. Each summand in square sum of approximation deviations from given values is multiplied by the square of corresponding weight. FMatrix - a table of basis functions values, array[0..N-1, 0..M-1]. FMatrix[I, J] - value of J-th basis function in I-th point. N - number of points used. N>=1. M - number of basis functions, M>=1. OUTPUT PARAMETERS: Info - error code: * -4 internal SVD decomposition subroutine failed (very rare and for degenerate systems only) * -1 incorrect N/M were specified * 1 task is solved C - decomposition coefficients, array[0..M-1] Rep - fitting report. Following fields are set: * Rep.TaskRCond reciprocal of condition number * R2 non-adjusted coefficient of determination (non-weighted) * RMSError rms error on the (X,Y). * AvgError average error on the (X,Y). * AvgRelError average relative error on the non-zero Y * MaxError maximum error NON-WEIGHTED ERRORS ARE CALCULATED ERRORS IN PARAMETERS This solver also calculates different kinds of errors in parameters and fills corresponding fields of report: * Rep.CovPar covariance matrix for parameters, array[K,K]. * Rep.ErrPar errors in parameters, array[K], errpar = sqrt(diag(CovPar)) * Rep.ErrCurve vector of fit errors - standard deviations of empirical best-fit curve from "ideal" best-fit curve built with infinite number of samples, array[N]. errcurve = sqrt(diag(F*CovPar*F')), where F is functions matrix. * Rep.Noise vector of per-point estimates of noise, array[N] NOTE: noise in the data is estimated as follows: * for fitting without user-supplied weights all points are assumed to have same level of noise, which is estimated from the data * for fitting with user-supplied weights we assume that noise level in I-th point is inversely proportional to Ith weight. Coefficient of proportionality is estimated from the data. 
NOTE: we apply small amount of regularization when we invert squared Jacobian and calculate covariance matrix. It guarantees that algorithm won't divide by zero during inversion, but skews error estimates a bit (fractional error is about 10^-9). However, we believe that this difference is insignificant for all practical purposes except for the situation when you want to compare ALGLIB results with "reference" implementation up to the last significant digit. NOTE: covariance matrix is estimated using correction for degrees of freedom (covariances are divided by N-M instead of dividing by N). -- ALGLIB -- Copyright 17.08.2009 by Bochkanov Sergey *************************************************************************/
void alglib::lsfitlinearw( real_1d_array y, real_1d_array w, real_2d_array fmatrix, ae_int_t& info, real_1d_array& c, lsfitreport& rep);
void alglib::lsfitlinearw( real_1d_array y, real_1d_array w, real_2d_array fmatrix, ae_int_t n, ae_int_t m, ae_int_t& info, real_1d_array& c, lsfitreport& rep);
void alglib::smp_lsfitlinearw( real_1d_array y, real_1d_array w, real_2d_array fmatrix, ae_int_t& info, real_1d_array& c, lsfitreport& rep);
void alglib::smp_lsfitlinearw( real_1d_array y, real_1d_array w, real_2d_array fmatrix, ae_int_t n, ae_int_t m, ae_int_t& info, real_1d_array& c, lsfitreport& rep);

Examples:   [1]  

/************************************************************************* Weighted constained linear least squares fitting. This is variation of LSFitLinearW(), which searchs for min|A*x=b| given that K additional constaints C*x=bc are satisfied. It reduces original task to modified one: min|B*y-d| WITHOUT constraints, then LSFitLinearW() is called. IMPORTANT: if you want to perform polynomial fitting, it may be more convenient to use PolynomialFit() function. This function gives best results on polynomial problems and solves numerical stability issues which arise when you fit high-degree polynomials to your data. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multithreading support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Speed-up provided by multithreading greatly depends on problem size ! - only large problems (number of coefficients is more than 500) can be ! efficiently multithreaded. ! ! Generally, commercial ALGLIB is several times faster than open-source ! generic C edition, and many times faster than open-source C# edition. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS: Y - array[0..N-1] Function values in N points. W - array[0..N-1] Weights corresponding to function values. Each summand in square sum of approximation deviations from given values is multiplied by the square of corresponding weight. FMatrix - a table of basis functions values, array[0..N-1, 0..M-1]. FMatrix[I,J] - value of J-th basis function in I-th point. CMatrix - a table of constaints, array[0..K-1,0..M]. I-th row of CMatrix corresponds to I-th linear constraint: CMatrix[I,0]*C[0] + ... + CMatrix[I,M-1]*C[M-1] = CMatrix[I,M] N - number of points used. N>=1. M - number of basis functions, M>=1. K - number of constraints, 0 <= K < M K=0 corresponds to absence of constraints. OUTPUT PARAMETERS: Info - error code: * -4 internal SVD decomposition subroutine failed (very rare and for degenerate systems only) * -3 either too many constraints (M or more), degenerate constraints (some constraints are repetead twice) or inconsistent constraints were specified. * 1 task is solved C - decomposition coefficients, array[0..M-1] Rep - fitting report. Following fields are set: * R2 non-adjusted coefficient of determination (non-weighted) * RMSError rms error on the (X,Y). * AvgError average error on the (X,Y). * AvgRelError average relative error on the non-zero Y * MaxError maximum error NON-WEIGHTED ERRORS ARE CALCULATED IMPORTANT: this subroitine doesn't calculate task's condition number for K<>0. ERRORS IN PARAMETERS This solver also calculates different kinds of errors in parameters and fills corresponding fields of report: * Rep.CovPar covariance matrix for parameters, array[K,K]. * Rep.ErrPar errors in parameters, array[K], errpar = sqrt(diag(CovPar)) * Rep.ErrCurve vector of fit errors - standard deviations of empirical best-fit curve from "ideal" best-fit curve built with infinite number of samples, array[N]. errcurve = sqrt(diag(F*CovPar*F')), where F is functions matrix. 
* Rep.Noise vector of per-point estimates of noise, array[N] IMPORTANT: errors in parameters are calculated without taking into account boundary/linear constraints! Presence of constraints changes distribution of errors, but there is no easy way to account for constraints when you calculate covariance matrix. NOTE: noise in the data is estimated as follows: * for fitting without user-supplied weights all points are assumed to have same level of noise, which is estimated from the data * for fitting with user-supplied weights we assume that noise level in I-th point is inversely proportional to Ith weight. Coefficient of proportionality is estimated from the data. NOTE: we apply small amount of regularization when we invert squared Jacobian and calculate covariance matrix. It guarantees that algorithm won't divide by zero during inversion, but skews error estimates a bit (fractional error is about 10^-9). However, we believe that this difference is insignificant for all practical purposes except for the situation when you want to compare ALGLIB results with "reference" implementation up to the last significant digit. NOTE: covariance matrix is estimated using correction for degrees of freedom (covariances are divided by N-M instead of dividing by N). -- ALGLIB -- Copyright 07.09.2009 by Bochkanov Sergey *************************************************************************/
void alglib::lsfitlinearwc( real_1d_array y, real_1d_array w, real_2d_array fmatrix, real_2d_array cmatrix, ae_int_t& info, real_1d_array& c, lsfitreport& rep);
void alglib::lsfitlinearwc( real_1d_array y, real_1d_array w, real_2d_array fmatrix, real_2d_array cmatrix, ae_int_t n, ae_int_t m, ae_int_t k, ae_int_t& info, real_1d_array& c, lsfitreport& rep);
void alglib::smp_lsfitlinearwc( real_1d_array y, real_1d_array w, real_2d_array fmatrix, real_2d_array cmatrix, ae_int_t& info, real_1d_array& c, lsfitreport& rep);
void alglib::smp_lsfitlinearwc( real_1d_array y, real_1d_array w, real_2d_array fmatrix, real_2d_array cmatrix, ae_int_t n, ae_int_t m, ae_int_t k, ae_int_t& info, real_1d_array& c, lsfitreport& rep);

Examples:   [1]  

/************************************************************************* Nonlinear least squares fitting results. Called after return from LSFitFit(). INPUT PARAMETERS: State - algorithm state OUTPUT PARAMETERS: Info - completion code: * -7 gradient verification failed. See LSFitSetGradientCheck() for more information. * 2 relative step is no more than EpsX. * 5 MaxIts steps was taken * 7 stopping conditions are too stringent, further improvement is impossible C - array[0..K-1], solution Rep - optimization report. On success following fields are set: * R2 non-adjusted coefficient of determination (non-weighted) * RMSError rms error on the (X,Y). * AvgError average error on the (X,Y). * AvgRelError average relative error on the non-zero Y * MaxError maximum error NON-WEIGHTED ERRORS ARE CALCULATED * WRMSError weighted rms error on the (X,Y). ERRORS IN PARAMETERS This solver also calculates different kinds of errors in parameters and fills corresponding fields of report: * Rep.CovPar covariance matrix for parameters, array[K,K]. * Rep.ErrPar errors in parameters, array[K], errpar = sqrt(diag(CovPar)) * Rep.ErrCurve vector of fit errors - standard deviations of empirical best-fit curve from "ideal" best-fit curve built with infinite number of samples, array[N]. errcurve = sqrt(diag(J*CovPar*J')), where J is Jacobian matrix. * Rep.Noise vector of per-point estimates of noise, array[N] IMPORTANT: errors in parameters are calculated without taking into account boundary/linear constraints! Presence of constraints changes distribution of errors, but there is no easy way to account for constraints when you calculate covariance matrix. NOTE: noise in the data is estimated as follows: * for fitting without user-supplied weights all points are assumed to have same level of noise, which is estimated from the data * for fitting with user-supplied weights we assume that noise level in I-th point is inversely proportional to Ith weight. Coefficient of proportionality is estimated from the data. NOTE: we apply small amount of regularization when we invert squared Jacobian and calculate covariance matrix. It guarantees that algorithm won't divide by zero during inversion, but skews error estimates a bit (fractional error is about 10^-9). However, we believe that this difference is insignificant for all practical purposes except for the situation when you want to compare ALGLIB results with "reference" implementation up to the last significant digit. NOTE: covariance matrix is estimated using correction for degrees of freedom (covariances are divided by N-M instead of dividing by N). -- ALGLIB -- Copyright 17.08.2009 by Bochkanov Sergey *************************************************************************/
void alglib::lsfitresults( lsfitstate state, ae_int_t& info, real_1d_array& c, lsfitreport& rep);

Examples:   [1]  [2]  [3]  [4]  [5]  
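
The following minimal sketch shows the typical nonlinear fitting cycle ending in LSFitResults(): create a state with lsfitcreatef() (documented elsewhere in this manual), set stopping conditions, run lsfitfit(), then read Info/C/Rep. The model and data are illustrative only (y was generated from exp(-1.5*x^2), so the fit is expected to converge near c[0]=1.5); the printed output is not a verified reference value.

#include "stdafx.h"
#include <stdio.h>
#include <math.h>
#include "interpolation.h"

using namespace alglib;

// model f(c,x) = exp(-c0*x0^2), same form as in the nonlinear fitting examples
static void gauss_func(const real_1d_array &c, const real_1d_array &x, double &func, void *ptr)
{
    func = exp(-c[0]*pow(x[0],2));
}

int main(int argc, char **argv)
{
    // data sampled from exp(-1.5*x^2); the fit should converge to c[0] close to 1.5
    real_2d_array x = "[[-1],[-0.8],[-0.6],[-0.4],[-0.2],[0],[0.2],[0.4],[0.6],[0.8],[1.0]]";
    real_1d_array y = "[0.223130, 0.382893, 0.582748, 0.786628, 0.941765, 1.000000, 0.941765, 0.786628, 0.582748, 0.382893, 0.223130]";
    real_1d_array c = "[0.3]";              // initial guess for the parameter
    ae_int_t info;
    lsfitstate state;
    lsfitreport rep;

    lsfitcreatef(x, y, c, 0.0001, state);   // numerical differentiation with step 1e-4
    lsfitsetcond(state, 0.000001, 0);       // EpsX=1e-6, unlimited iterations (see lsfitsetcond() below)
    alglib::lsfitfit(state, gauss_func);
    lsfitresults(state, info, c, rep);      // info>0 on success, c holds fitted parameters
    printf("%d\n", int(info));
    printf("%s\n", c.tostring(1).c_str());
    return 0;
}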

/************************************************************************* This function sets boundary constraints for underlying optimizer Boundary constraints are inactive by default (after initial creation). They are preserved until explicitly turned off with another SetBC() call. INPUT PARAMETERS: State - structure stores algorithm state BndL - lower bounds, array[K]. If some (all) variables are unbounded, you may specify very small number or -INF (latter is recommended because it will allow solver to use better algorithm). BndU - upper bounds, array[K]. If some (all) variables are unbounded, you may specify very large number or +INF (latter is recommended because it will allow solver to use better algorithm). NOTE 1: it is possible to specify BndL[i]=BndU[i]. In this case I-th variable will be "frozen" at X[i]=BndL[i]=BndU[i]. NOTE 2: unlike other constrained optimization algorithms, this solver has following useful properties: * bound constraints are always satisfied exactly * function is evaluated only INSIDE area specified by bound constraints -- ALGLIB -- Copyright 14.01.2011 by Bochkanov Sergey *************************************************************************/
void alglib::lsfitsetbc( lsfitstate state, real_1d_array bndl, real_1d_array bndu);
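
An illustrative fragment (not a complete program) showing how box constraints might be attached to an existing lsfitstate named 'state' for a hypothetical two-parameter model; the bound values are arbitrary:

// fragment: 'state' is an lsfitstate for a model with K=2 parameters (created elsewhere)
real_1d_array bndl = "[0.0, -10.0]";   // c[0] >= 0,  c[1] >= -10
real_1d_array bndu = "[5.0, +10.0]";   // c[0] <= 5,  c[1] <= +10
lsfitsetbc(state, bndl, bndu);
// setting bndl[i]=bndu[i] would freeze the i-th parameter at that value
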
/************************************************************************* Stopping conditions for nonlinear least squares fitting. INPUT PARAMETERS: State - structure which stores algorithm state EpsX - >=0 The subroutine finishes its work if on k+1-th iteration the condition |v|<=EpsX is fulfilled, where: * |.| means Euclidean norm * v - scaled step vector, v[i]=dx[i]/s[i] * dx - step vector, dx=X(k+1)-X(k) * s - scaling coefficients set by LSFitSetScale() MaxIts - maximum number of iterations. If MaxIts=0, the number of iterations is unlimited. Only Levenberg-Marquardt iterations are counted (L-BFGS/CG iterations are NOT counted because their cost is very low compared to that of LM). NOTE: Passing EpsX=0 and MaxIts=0 (simultaneously) will lead to automatic stopping criterion selection (according to the scheme used by MINLM unit). -- ALGLIB -- Copyright 17.08.2009 by Bochkanov Sergey *************************************************************************/
void alglib::lsfitsetcond(lsfitstate state, double epsx, ae_int_t maxits);

Examples:   [1]  [2]  [3]  [4]  [5]  

/************************************************************************* This subroutine turns on verification of the user-supplied analytic gradient: * user calls this subroutine before fitting begins * LSFitFit() is called * prior to actual fitting, for each point in data set X_i and each component of parameters being fited C_j algorithm performs following steps: * two trial steps are made to C_j-TestStep*S[j] and C_j+TestStep*S[j], where C_j is j-th parameter and S[j] is a scale of j-th parameter * if needed, steps are bounded with respect to constraints on C[] * F(X_i|C) is evaluated at these trial points * we perform one more evaluation in the middle point of the interval * we build cubic model using function values and derivatives at trial points and we compare its prediction with actual value in the middle point * in case difference between prediction and actual value is higher than some predetermined threshold, algorithm stops with completion code -7; Rep.VarIdx is set to index of the parameter with incorrect derivative. * after verification is over, algorithm proceeds to the actual optimization. NOTE 1: verification needs N*K (points count * parameters count) gradient evaluations. It is very costly and you should use it only for low dimensional problems, when you want to be sure that you've correctly calculated analytic derivatives. You should not use it in the production code (unless you want to check derivatives provided by some third party). NOTE 2: you should carefully choose TestStep. Value which is too large (so large that function behaviour is significantly non-cubic) will lead to false alarms. You may use different step for different parameters by means of setting scale with LSFitSetScale(). NOTE 3: this function may lead to false positives. In case it reports that I-th derivative was calculated incorrectly, you may decrease test step and try one more time - maybe your function changes too sharply and your step is too large for such rapidly chanding function. NOTE 4: this function works only for optimizers created with LSFitCreateWFG() or LSFitCreateFG() constructors. INPUT PARAMETERS: State - structure used to store algorithm state TestStep - verification step: * TestStep=0 turns verification off * TestStep>0 activates verification -- ALGLIB -- Copyright 15.06.2012 by Bochkanov Sergey *************************************************************************/
void alglib::lsfitsetgradientcheck(lsfitstate state, double teststep);
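
A hedged fragment: for a state created with LSFitCreateFG()/LSFitCreateWFG() (analytic gradient), derivative verification could be switched on before lsfitfit() roughly as shown below; the test step value is only an example.

// fragment: 'state' must come from lsfitcreatefg()/lsfitcreatewfg()
lsfitsetgradientcheck(state, 1.0e-3);  // verify the user-supplied gradient with trial step 1e-3 (scaled by S[j])
// run lsfitfit() as usual; completion code -7 together with Rep.VarIdx flags a suspicious derivative
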
/************************************************************************* This function sets linear constraints for underlying optimizer Linear constraints are inactive by default (after initial creation). They are preserved until explicitly turned off with another SetLC() call. INPUT PARAMETERS: State - structure stores algorithm state C - linear constraints, array[K,N+1]. Each row of C represents one constraint, either equality or inequality (see below): * first N elements correspond to coefficients, * last element corresponds to the right part. All elements of C (including right part) must be finite. CT - type of constraints, array[K]: * if CT[i]>0, then I-th constraint is C[i,*]*x >= C[i,n+1] * if CT[i]=0, then I-th constraint is C[i,*]*x = C[i,n+1] * if CT[i]<0, then I-th constraint is C[i,*]*x <= C[i,n+1] K - number of equality/inequality constraints, K>=0: * if given, only leading K elements of C/CT are used * if not given, automatically determined from sizes of C/CT IMPORTANT: if you have linear constraints, it is strongly recommended to set scale of variables with lsfitsetscale(). QP solver which is used to calculate linearly constrained steps heavily relies on good scaling of input problems. NOTE: linear (non-box) constraints are satisfied only approximately - there always exists some violation due to numerical errors and algorithmic limitations. NOTE: general linear constraints add significant overhead to solution process. Although solver performs roughly same amount of iterations (when compared with similar box-only constrained problem), each iteration now involves solution of linearly constrained QP subproblem, which requires ~3-5 times more Cholesky decompositions. Thus, if you can reformulate your problem in such way this it has only box constraints, it may be beneficial to do so. -- ALGLIB -- Copyright 29.04.2017 by Bochkanov Sergey *************************************************************************/
void alglib::lsfitsetlc( lsfitstate state, real_2d_array c, integer_1d_array ct);
void alglib::lsfitsetlc( lsfitstate state, real_2d_array c, integer_1d_array ct, ae_int_t k);
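
An illustrative fragment (assuming an existing lsfitstate 'state' for a model with N=2 parameters; constraint values are arbitrary) showing one equality and one inequality constraint in the array[K,N+1] format described above:

// fragment: two linear constraints on parameters c[0], c[1]
real_2d_array lc = "[[1.0, 1.0, 1.0], [1.0, -1.0, 0.0]]";  // each row: N coefficients, then right part
integer_1d_array lct = "[0, 1]";                           // row 0: c0+c1 = 1;  row 1: c0-c1 >= 0
lsfitsetlc(state, lc, lct);
real_1d_array s = "[1.0, 1.0]";
lsfitsetscale(state, s);   // scaling is strongly recommended when linear constraints are present (see lsfitsetscale() below)
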
/************************************************************************* This function sets scaling coefficients for underlying optimizer. ALGLIB optimizers use scaling matrices to test stopping conditions (step size and gradient are scaled before comparison with tolerances). Scale of the I-th variable is a translation invariant measure of: a) "how large" the variable is b) how large the step should be to make significant changes in the function Generally, scale is NOT considered to be a form of preconditioner. But LM optimizer is unique in that it uses scaling matrix both in the stopping condition tests and as Marquardt damping factor. Proper scaling is very important for the algorithm performance. It is less important for the quality of results, but still has some influence (it is easier to converge when variables are properly scaled, so premature stopping is possible when very badly scalled variables are combined with relaxed stopping conditions). INPUT PARAMETERS: State - structure stores algorithm state S - array[N], non-zero scaling coefficients S[i] may be negative, sign doesn't matter. -- ALGLIB -- Copyright 14.01.2011 by Bochkanov Sergey *************************************************************************/
void alglib::lsfitsetscale(lsfitstate state, real_1d_array s);

Examples:   [1]  
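
A short fragment (again assuming an existing lsfitstate 'state') illustrating scale coefficients for two parameters of very different magnitude; the values are purely illustrative:

// fragment: first parameter is of order 1e+3, second of order 1e-3
real_1d_array s = "[1.0e+3, 1.0e-3]";
lsfitsetscale(state, s);   // stopping tests and LM damping now use scaled steps v[i]=dx[i]/s[i]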

/************************************************************************* This function sets maximum step length INPUT PARAMETERS: State - structure which stores algorithm state StpMax - maximum step length, >=0. Set StpMax to 0.0, if you don't want to limit step length. Use this subroutine when you optimize target function which contains exp() or other fast growing functions, and optimization algorithm makes too large steps which leads to overflow. This function allows us to reject steps that are too large (and therefore expose us to the possible overflow) without actually calculating function value at the x+stp*d. NOTE: non-zero StpMax leads to moderate performance degradation because intermediate step of preconditioned L-BFGS optimization is incompatible with limits on step size. -- ALGLIB -- Copyright 02.04.2010 by Bochkanov Sergey *************************************************************************/
void alglib::lsfitsetstpmax(lsfitstate state, double stpmax);
/************************************************************************* This function turns on/off reporting. INPUT PARAMETERS: State - structure which stores algorithm state NeedXRep- whether iteration reports are needed or not When reports are needed, State.C (current parameters) and State.F (current value of fitting function) are reported. -- ALGLIB -- Copyright 15.08.2010 by Bochkanov Sergey *************************************************************************/
void alglib::lsfitsetxrep(lsfitstate state, bool needxrep);
/************************************************************************* This subroutine fits piecewise linear curve to points with Ramer-Douglas- Peucker algorithm, which stops after achieving desired precision. IMPORTANT: * it performs non-least-squares fitting; it builds curve, but this curve does not minimize some least squares metric. See description of RDP algorithm (say, in Wikipedia) for more details on WHAT is performed. * this function does NOT work with parametric curves (i.e. curves which can be represented as {X(t),Y(t)}. It works with curves which can be represented as Y(X). Thus, it is impossible to model figures like circles with this functions. If you want to work with parametric curves, you should use ParametricRDPFixed() function provided by "Parametric" subpackage of "Interpolation" package. INPUT PARAMETERS: X - array of X-coordinates: * at least N elements * can be unordered (points are automatically sorted) * this function may accept non-distinct X (see below for more information on handling of such inputs) Y - array of Y-coordinates: * at least N elements N - number of elements in X/Y Eps - positive number, desired precision. OUTPUT PARAMETERS: X2 - X-values of corner points for piecewise approximation, has length NSections+1 or zero (for NSections=0). Y2 - Y-values of corner points, has length NSections+1 or zero (for NSections=0). NSections- number of sections found by algorithm, NSections can be zero for degenerate datasets (N<=1 or all X[] are non-distinct). NOTE: X2/Y2 are ordered arrays, i.e. (X2[0],Y2[0]) is a first point of curve, (X2[NSection-1],Y2[NSection-1]) is the last point. -- ALGLIB -- Copyright 02.10.2014 by Bochkanov Sergey *************************************************************************/
void alglib::lstfitpiecewiselinearrdp( real_1d_array x, real_1d_array y, ae_int_t n, double eps, real_1d_array& x2, real_1d_array& y2, ae_int_t& nsections);
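
A minimal sketch of RDP-based piecewise linear fitting on a synthetic dataset with a single corner. For this exact triangle-shaped data the algorithm should recover two sections with the corner at x=4; the printed values are what is expected mathematically, not verified program output.

#include "stdafx.h"
#include <stdio.h>
#include "interpolation.h"

using namespace alglib;

int main(int argc, char **argv)
{
    // triangle-shaped data: exactly linear on [0,4] and [4,8], corner at x=4
    real_1d_array x = "[0,1,2,3,4,5,6,7,8]";
    real_1d_array y = "[0,1,2,3,4,3,2,1,0]";
    real_1d_array x2, y2;
    ae_int_t nsections;

    lstfitpiecewiselinearrdp(x, y, 9, 0.1, x2, y2, nsections);
    printf("%d\n", int(nsections));            // 2 sections expected for this dataset
    printf("%s\n", x2.tostring(1).c_str());    // corner X-values, expected [0.0,4.0,8.0]
    return 0;
}
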
/************************************************************************* This subroutine fits piecewise linear curve to points with Ramer-Douglas- Peucker algorithm, which stops after generating specified number of linear sections. IMPORTANT: * it does NOT perform least-squares fitting; it builds curve, but this curve does not minimize some least squares metric. See description of RDP algorithm (say, in Wikipedia) for more details on WHAT is performed. * this function does NOT work with parametric curves (i.e. curves which can be represented as {X(t),Y(t)}. It works with curves which can be represented as Y(X). Thus, it is impossible to model figures like circles with this functions. If you want to work with parametric curves, you should use ParametricRDPFixed() function provided by "Parametric" subpackage of "Interpolation" package. INPUT PARAMETERS: X - array of X-coordinates: * at least N elements * can be unordered (points are automatically sorted) * this function may accept non-distinct X (see below for more information on handling of such inputs) Y - array of Y-coordinates: * at least N elements N - number of elements in X/Y M - desired number of sections: * at most M sections are generated by this function * less than M sections can be generated if we have N<M (or some X are non-distinct). OUTPUT PARAMETERS: X2 - X-values of corner points for piecewise approximation, has length NSections+1 or zero (for NSections=0). Y2 - Y-values of corner points, has length NSections+1 or zero (for NSections=0). NSections- number of sections found by algorithm, NSections<=M, NSections can be zero for degenerate datasets (N<=1 or all X[] are non-distinct). NOTE: X2/Y2 are ordered arrays, i.e. (X2[0],Y2[0]) is a first point of curve, (X2[NSection-1],Y2[NSection-1]) is the last point. -- ALGLIB -- Copyright 02.10.2014 by Bochkanov Sergey *************************************************************************/
void alglib::lstfitpiecewiselinearrdpfixed( real_1d_array x, real_1d_array y, ae_int_t n, ae_int_t m, real_1d_array& x2, real_1d_array& y2, ae_int_t& nsections);
/************************************************************************* Fitting by polynomials in barycentric form. This function provides simple unterface for unconstrained unweighted fitting. See PolynomialFitWC() if you need constrained fitting. Task is linear, so linear least squares solver is used. Complexity of this computational scheme is O(N*M^2), mostly dominated by least squares solver SEE ALSO: PolynomialFitWC() COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multithreading support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Speed-up provided by multithreading greatly depends on problem size ! - only large problems (number of coefficients is more than 500) can be ! efficiently multithreaded. ! ! Generally, commercial ALGLIB is several times faster than open-source ! generic C edition, and many times faster than open-source C# edition. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS: X - points, array[0..N-1]. Y - function values, array[0..N-1]. N - number of points, N>0 * if given, only leading N elements of X/Y are used * if not given, automatically determined from sizes of X/Y M - number of basis functions (= polynomial_degree + 1), M>=1 OUTPUT PARAMETERS: Info- same format as in LSFitLinearW() subroutine: * Info>0 task is solved * Info<=0 an error occured: -4 means inconvergence of internal SVD P - interpolant in barycentric form. Rep - report, same format as in LSFitLinearW() subroutine. Following fields are set: * RMSError rms error on the (X,Y). * AvgError average error on the (X,Y). * AvgRelError average relative error on the non-zero Y * MaxError maximum error NON-WEIGHTED ERRORS ARE CALCULATED NOTES: you can convert P from barycentric form to the power or Chebyshev basis with PolynomialBar2Pow() or PolynomialBar2Cheb() functions from POLINT subpackage. -- ALGLIB PROJECT -- Copyright 10.12.2009 by Bochkanov Sergey *************************************************************************/
void alglib::polynomialfit( real_1d_array x, real_1d_array y, ae_int_t m, ae_int_t& info, barycentricinterpolant& p, polynomialfitreport& rep); void alglib::polynomialfit( real_1d_array x, real_1d_array y, ae_int_t n, ae_int_t m, ae_int_t& info, barycentricinterpolant& p, polynomialfitreport& rep); void alglib::smp_polynomialfit( real_1d_array x, real_1d_array y, ae_int_t m, ae_int_t& info, barycentricinterpolant& p, polynomialfitreport& rep); void alglib::smp_polynomialfit( real_1d_array x, real_1d_array y, ae_int_t n, ae_int_t m, ae_int_t& info, barycentricinterpolant& p, polynomialfitreport& rep);

Examples:   [1]  
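
A minimal sketch of unconstrained polynomial fitting: five points lying exactly on y=x^2 are fitted with M=3 (degree 2), then the barycentric model is evaluated with the barycentriccalc() routine documented elsewhere in the Interpolation package. Since the data is an exact parabola, the evaluation should be close to 6.25, but the printed output is not a verified reference value.

#include "stdafx.h"
#include <stdio.h>
#include "interpolation.h"

using namespace alglib;

int main(int argc, char **argv)
{
    // exact parabola y = x^2, so the degree-2 fit should reproduce it almost exactly
    real_1d_array x = "[0.0,1.0,2.0,3.0,4.0]";
    real_1d_array y = "[0.0,1.0,4.0,9.0,16.0]";
    ae_int_t info;
    barycentricinterpolant p;
    polynomialfitreport rep;

    polynomialfit(x, y, 3, info, p, rep);          // M=3 basis functions => polynomial degree 2
    printf("%d\n", int(info));                     // info>0 on success
    printf("%.2f\n", barycentriccalc(p, 2.5));     // should be close to 6.25
    return 0;
}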

/************************************************************************* Weighted fitting by polynomials in barycentric form, with constraints on function values or first derivatives. Small regularizing term is used when solving constrained tasks (to improve stability). Task is linear, so linear least squares solver is used. Complexity of this computational scheme is O(N*M^2), mostly dominated by least squares solver SEE ALSO: PolynomialFit() COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multithreading support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Speed-up provided by multithreading greatly depends on problem size ! - only large problems (number of coefficients is more than 500) can be ! efficiently multithreaded. ! ! Generally, commercial ALGLIB is several times faster than open-source ! generic C edition, and many times faster than open-source C# edition. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS: X - points, array[0..N-1]. Y - function values, array[0..N-1]. W - weights, array[0..N-1] Each summand in square sum of approximation deviations from given values is multiplied by the square of corresponding weight. Fill it by 1's if you don't want to solve weighted task. N - number of points, N>0. * if given, only leading N elements of X/Y/W are used * if not given, automatically determined from sizes of X/Y/W XC - points where polynomial values/derivatives are constrained, array[0..K-1]. YC - values of constraints, array[0..K-1] DC - array[0..K-1], types of constraints: * DC[i]=0 means that P(XC[i])=YC[i] * DC[i]=1 means that P'(XC[i])=YC[i] SEE BELOW FOR IMPORTANT INFORMATION ON CONSTRAINTS K - number of constraints, 0<=K<M. K=0 means no constraints (XC/YC/DC are not used in such cases) M - number of basis functions (= polynomial_degree + 1), M>=1 OUTPUT PARAMETERS: Info- same format as in LSFitLinearW() subroutine: * Info>0 task is solved * Info<=0 an error occured: -4 means inconvergence of internal SVD -3 means inconsistent constraints P - interpolant in barycentric form. Rep - report, same format as in LSFitLinearW() subroutine. Following fields are set: * RMSError rms error on the (X,Y). * AvgError average error on the (X,Y). * AvgRelError average relative error on the non-zero Y * MaxError maximum error NON-WEIGHTED ERRORS ARE CALCULATED IMPORTANT: this subroitine doesn't calculate task's condition number for K<>0. NOTES: you can convert P from barycentric form to the power or Chebyshev basis with PolynomialBar2Pow() or PolynomialBar2Cheb() functions from POLINT subpackage. SETTING CONSTRAINTS - DANGERS AND OPPORTUNITIES: Setting constraints can lead to undesired results, like ill-conditioned behavior, or inconsistency being detected. From the other side, it allows us to improve quality of the fit. 
Here we summarize our experience with constrained regression splines: * even simple constraints can be inconsistent, see Wikipedia article on this subject: http://en.wikipedia.org/wiki/Birkhoff_interpolation * the greater M is (given fixed constraints), the more chances that constraints will be consistent * in the general case, consistency of constraints is NOT GUARANTEED. * in one special case, however, we can guarantee consistency. This case is: M>1 and constraints on the function values (NOT DERIVATIVES) Our final recommendation is to use constraints WHEN AND ONLY WHEN you can't solve your task without them. Anything beyond special cases given above is not guaranteed and may result in inconsistency. -- ALGLIB PROJECT -- Copyright 10.12.2009 by Bochkanov Sergey *************************************************************************/
void alglib::polynomialfitwc( real_1d_array x, real_1d_array y, real_1d_array w, real_1d_array xc, real_1d_array yc, integer_1d_array dc, ae_int_t m, ae_int_t& info, barycentricinterpolant& p, polynomialfitreport& rep);
void alglib::polynomialfitwc( real_1d_array x, real_1d_array y, real_1d_array w, ae_int_t n, real_1d_array xc, real_1d_array yc, integer_1d_array dc, ae_int_t k, ae_int_t m, ae_int_t& info, barycentricinterpolant& p, polynomialfitreport& rep);
void alglib::smp_polynomialfitwc( real_1d_array x, real_1d_array y, real_1d_array w, real_1d_array xc, real_1d_array yc, integer_1d_array dc, ae_int_t m, ae_int_t& info, barycentricinterpolant& p, polynomialfitreport& rep);
void alglib::smp_polynomialfitwc( real_1d_array x, real_1d_array y, real_1d_array w, ae_int_t n, real_1d_array xc, real_1d_array yc, integer_1d_array dc, ae_int_t k, ae_int_t m, ae_int_t& info, barycentricinterpolant& p, polynomialfitreport& rep);

Examples:   [1]  
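
A minimal sketch of constrained fitting: a nearly linear dataset is fitted with M=2 (linear polynomial) while P(0)=0 is forced through XC/YC/DC. The data values and the expected slope are illustrative only.

#include "stdafx.h"
#include <stdio.h>
#include "interpolation.h"

using namespace alglib;

int main(int argc, char **argv)
{
    real_1d_array x = "[0.0,1.0,2.0,3.0]";
    real_1d_array y = "[0.1,2.1,3.9,6.1]";   // roughly y = 2*x
    real_1d_array w = "[1,1,1,1]";           // unit weights (unweighted task)
    real_1d_array xc = "[0.0]";
    real_1d_array yc = "[0.0]";
    integer_1d_array dc = "[0]";             // DC[0]=0: constrain the value, i.e. P(0)=0
    ae_int_t info;
    barycentricinterpolant p;
    polynomialfitreport rep;

    polynomialfitwc(x, y, w, xc, yc, dc, 2, info, p, rep);  // M=2 => linear polynomial
    printf("%d\n", int(info));                              // info>0 on success
    printf("%.3f\n", barycentriccalc(p, 1.0));              // value at x=1, roughly the fitted slope (~2)
    return 0;
}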

/************************************************************************* Least squares fitting by cubic spline. This subroutine is "lightweight" alternative for more complex and feature- rich Spline1DFitCubicWC(). See Spline1DFitCubicWC() for more information about subroutine parameters (we don't duplicate it here because of length) COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multithreading support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Speed-up provided by multithreading greatly depends on problem size ! - only large problems (number of coefficients is more than 500) can be ! efficiently multithreaded. ! ! Generally, commercial ALGLIB is several times faster than open-source ! generic C edition, and many times faster than open-source C# edition. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. -- ALGLIB PROJECT -- Copyright 18.08.2009 by Bochkanov Sergey *************************************************************************/
void alglib::spline1dfitcubic( real_1d_array x, real_1d_array y, ae_int_t m, ae_int_t& info, spline1dinterpolant& s, spline1dfitreport& rep);
void alglib::spline1dfitcubic( real_1d_array x, real_1d_array y, ae_int_t n, ae_int_t m, ae_int_t& info, spline1dinterpolant& s, spline1dfitreport& rep);
void alglib::smp_spline1dfitcubic( real_1d_array x, real_1d_array y, ae_int_t m, ae_int_t& info, spline1dinterpolant& s, spline1dfitreport& rep);
void alglib::smp_spline1dfitcubic( real_1d_array x, real_1d_array y, ae_int_t n, ae_int_t m, ae_int_t& info, spline1dinterpolant& s, spline1dfitreport& rep);
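
A minimal sketch: noise-free samples of sin(x) fitted with a small cubic spline basis, then evaluated with the spline1dcalc() routine documented elsewhere in this package. The choice M=8 and the expected accuracy are illustrative, not verified output.

#include "stdafx.h"
#include <stdio.h>
#include "interpolation.h"

using namespace alglib;

int main(int argc, char **argv)
{
    // samples of sin(x) at x = 0..9
    real_1d_array x = "[0,1,2,3,4,5,6,7,8,9]";
    real_1d_array y = "[0.000,0.841,0.909,0.141,-0.757,-0.959,-0.279,0.657,0.989,0.412]";
    ae_int_t info;
    spline1dinterpolant s;
    spline1dfitreport rep;

    spline1dfitcubic(x, y, 8, info, s, rep);   // M=8 basis functions
    printf("%d\n", int(info));                 // info>0 on success
    printf("%.3f\n", spline1dcalc(s, 4.5));    // should be reasonably close to sin(4.5) ~ -0.978
    return 0;
}
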
/************************************************************************* Weighted fitting by cubic spline, with constraints on function values or derivatives. Equidistant grid with M-2 nodes on [min(x,xc),max(x,xc)] is used to build basis functions. Basis functions are cubic splines with continuous second derivatives and non-fixed first derivatives at interval ends. Small regularizing term is used when solving constrained tasks (to improve stability). Task is linear, so linear least squares solver is used. Complexity of this computational scheme is O(N*M^2), mostly dominated by least squares solver SEE ALSO Spline1DFitHermiteWC() - fitting by Hermite splines (more flexible, less smooth) Spline1DFitCubic() - "lightweight" fitting by cubic splines, without invididual weights and constraints COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multithreading support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Speed-up provided by multithreading greatly depends on problem size ! - only large problems (number of coefficients is more than 500) can be ! efficiently multithreaded. ! ! Generally, commercial ALGLIB is several times faster than open-source ! generic C edition, and many times faster than open-source C# edition. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS: X - points, array[0..N-1]. Y - function values, array[0..N-1]. W - weights, array[0..N-1] Each summand in square sum of approximation deviations from given values is multiplied by the square of corresponding weight. Fill it by 1's if you don't want to solve weighted task. N - number of points (optional): * N>0 * if given, only first N elements of X/Y/W are processed * if not given, automatically determined from X/Y/W sizes XC - points where spline values/derivatives are constrained, array[0..K-1]. YC - values of constraints, array[0..K-1] DC - array[0..K-1], types of constraints: * DC[i]=0 means that S(XC[i])=YC[i] * DC[i]=1 means that S'(XC[i])=YC[i] SEE BELOW FOR IMPORTANT INFORMATION ON CONSTRAINTS K - number of constraints (optional): * 0<=K<M. * K=0 means no constraints (XC/YC/DC are not used) * if given, only first K elements of XC/YC/DC are used * if not given, automatically determined from XC/YC/DC M - number of basis functions ( = number_of_nodes+2), M>=4. OUTPUT PARAMETERS: Info- same format as in LSFitLinearWC() subroutine. * Info>0 task is solved * Info<=0 an error occured: -4 means inconvergence of internal SVD -3 means inconsistent constraints S - spline interpolant. Rep - report, same format as in LSFitLinearWC() subroutine. Following fields are set: * RMSError rms error on the (X,Y). * AvgError average error on the (X,Y). * AvgRelError average relative error on the non-zero Y * MaxError maximum error NON-WEIGHTED ERRORS ARE CALCULATED IMPORTANT: this subroitine doesn't calculate task's condition number for K<>0. ORDER OF POINTS Subroutine automatically sorts points, so caller may pass unsorted array. 
SETTING CONSTRAINTS - DANGERS AND OPPORTUNITIES: Setting constraints can lead to undesired results, like ill-conditioned behavior, or inconsistency being detected. From the other side, it allows us to improve quality of the fit. Here we summarize our experience with constrained regression splines: * excessive constraints can be inconsistent. Splines are piecewise cubic functions, and it is easy to create an example, where large number of constraints concentrated in small area will result in inconsistency. Just because spline is not flexible enough to satisfy all of them. And same constraints spread across the [min(x),max(x)] will be perfectly consistent. * the more evenly constraints are spread across [min(x),max(x)], the more chances that they will be consistent * the greater is M (given fixed constraints), the more chances that constraints will be consistent * in the general case, consistency of constraints IS NOT GUARANTEED. * in the several special cases, however, we CAN guarantee consistency. * one of this cases is constraints on the function values AND/OR its derivatives at the interval boundaries. * another special case is ONE constraint on the function value (OR, but not AND, derivative) anywhere in the interval Our final recommendation is to use constraints WHEN AND ONLY WHEN you can't solve your task without them. Anything beyond special cases given above is not guaranteed and may result in inconsistency. -- ALGLIB PROJECT -- Copyright 18.08.2009 by Bochkanov Sergey *************************************************************************/
void alglib::spline1dfitcubicwc( real_1d_array x, real_1d_array y, real_1d_array w, real_1d_array xc, real_1d_array yc, integer_1d_array dc, ae_int_t m, ae_int_t& info, spline1dinterpolant& s, spline1dfitreport& rep);
void alglib::spline1dfitcubicwc( real_1d_array x, real_1d_array y, real_1d_array w, ae_int_t n, real_1d_array xc, real_1d_array yc, integer_1d_array dc, ae_int_t k, ae_int_t m, ae_int_t& info, spline1dinterpolant& s, spline1dfitreport& rep);
void alglib::smp_spline1dfitcubicwc( real_1d_array x, real_1d_array y, real_1d_array w, real_1d_array xc, real_1d_array yc, integer_1d_array dc, ae_int_t m, ae_int_t& info, spline1dinterpolant& s, spline1dfitreport& rep);
void alglib::smp_spline1dfitcubicwc( real_1d_array x, real_1d_array y, real_1d_array w, ae_int_t n, real_1d_array xc, real_1d_array yc, integer_1d_array dc, ae_int_t k, ae_int_t m, ae_int_t& info, spline1dinterpolant& s, spline1dfitreport& rep);
/************************************************************************* Least squares fitting by Hermite spline. This subroutine is "lightweight" alternative for more complex and feature- rich Spline1DFitHermiteWC(). See Spline1DFitHermiteWC() description for more information about subroutine parameters (we don't duplicate it here because of length). COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multithreading support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Speed-up provided by multithreading greatly depends on problem size ! - only large problems (number of coefficients is more than 500) can be ! efficiently multithreaded. ! ! Generally, commercial ALGLIB is several times faster than open-source ! generic C edition, and many times faster than open-source C# edition. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. -- ALGLIB PROJECT -- Copyright 18.08.2009 by Bochkanov Sergey *************************************************************************/
void alglib::spline1dfithermite( real_1d_array x, real_1d_array y, ae_int_t m, ae_int_t& info, spline1dinterpolant& s, spline1dfitreport& rep);
void alglib::spline1dfithermite( real_1d_array x, real_1d_array y, ae_int_t n, ae_int_t m, ae_int_t& info, spline1dinterpolant& s, spline1dfitreport& rep);
void alglib::smp_spline1dfithermite( real_1d_array x, real_1d_array y, ae_int_t m, ae_int_t& info, spline1dinterpolant& s, spline1dfitreport& rep);
void alglib::smp_spline1dfithermite( real_1d_array x, real_1d_array y, ae_int_t n, ae_int_t m, ae_int_t& info, spline1dinterpolant& s, spline1dfitreport& rep);
/************************************************************************* Weighted fitting by Hermite spline, with constraints on function values or first derivatives. Equidistant grid with M nodes on [min(x,xc),max(x,xc)] is used to build basis functions. Basis functions are Hermite splines. Small regularizing term is used when solving constrained tasks (to improve stability). Task is linear, so linear least squares solver is used. Complexity of this computational scheme is O(N*M^2), mostly dominated by least squares solver SEE ALSO Spline1DFitCubicWC() - fitting by Cubic splines (less flexible, more smooth) Spline1DFitHermite() - "lightweight" Hermite fitting, without invididual weights and constraints COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multithreading support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Speed-up provided by multithreading greatly depends on problem size ! - only large problems (number of coefficients is more than 500) can be ! efficiently multithreaded. ! ! Generally, commercial ALGLIB is several times faster than open-source ! generic C edition, and many times faster than open-source C# edition. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS: X - points, array[0..N-1]. Y - function values, array[0..N-1]. W - weights, array[0..N-1] Each summand in square sum of approximation deviations from given values is multiplied by the square of corresponding weight. Fill it by 1's if you don't want to solve weighted task. N - number of points (optional): * N>0 * if given, only first N elements of X/Y/W are processed * if not given, automatically determined from X/Y/W sizes XC - points where spline values/derivatives are constrained, array[0..K-1]. YC - values of constraints, array[0..K-1] DC - array[0..K-1], types of constraints: * DC[i]=0 means that S(XC[i])=YC[i] * DC[i]=1 means that S'(XC[i])=YC[i] SEE BELOW FOR IMPORTANT INFORMATION ON CONSTRAINTS K - number of constraints (optional): * 0<=K<M. * K=0 means no constraints (XC/YC/DC are not used) * if given, only first K elements of XC/YC/DC are used * if not given, automatically determined from XC/YC/DC M - number of basis functions (= 2 * number of nodes), M>=4, M IS EVEN! OUTPUT PARAMETERS: Info- same format as in LSFitLinearW() subroutine: * Info>0 task is solved * Info<=0 an error occured: -4 means inconvergence of internal SVD -3 means inconsistent constraints -2 means odd M was passed (which is not supported) -1 means another errors in parameters passed (N<=0, for example) S - spline interpolant. Rep - report, same format as in LSFitLinearW() subroutine. Following fields are set: * RMSError rms error on the (X,Y). * AvgError average error on the (X,Y). * AvgRelError average relative error on the non-zero Y * MaxError maximum error NON-WEIGHTED ERRORS ARE CALCULATED IMPORTANT: this subroitine doesn't calculate task's condition number for K<>0. IMPORTANT: this subroitine supports only even M's ORDER OF POINTS Subroutine automatically sorts points, so caller may pass unsorted array. 
SETTING CONSTRAINTS - DANGERS AND OPPORTUNITIES: Setting constraints can lead to undesired results, like ill-conditioned behavior, or inconsistency being detected. From the other side, it allows us to improve quality of the fit. Here we summarize our experience with constrained regression splines: * excessive constraints can be inconsistent. Splines are piecewise cubic functions, and it is easy to create an example, where large number of constraints concentrated in small area will result in inconsistency. Just because spline is not flexible enough to satisfy all of them. And same constraints spread across the [min(x),max(x)] will be perfectly consistent. * the more evenly constraints are spread across [min(x),max(x)], the more chances that they will be consistent * the greater is M (given fixed constraints), the more chances that constraints will be consistent * in the general case, consistency of constraints is NOT GUARANTEED. * in the several special cases, however, we can guarantee consistency. * one of this cases is M>=4 and constraints on the function value (AND/OR its derivative) at the interval boundaries. * another special case is M>=4 and ONE constraint on the function value (OR, BUT NOT AND, derivative) anywhere in [min(x),max(x)] Our final recommendation is to use constraints WHEN AND ONLY when you can't solve your task without them. Anything beyond special cases given above is not guaranteed and may result in inconsistency. -- ALGLIB PROJECT -- Copyright 18.08.2009 by Bochkanov Sergey *************************************************************************/
void alglib::spline1dfithermitewc( real_1d_array x, real_1d_array y, real_1d_array w, real_1d_array xc, real_1d_array yc, integer_1d_array dc, ae_int_t m, ae_int_t& info, spline1dinterpolant& s, spline1dfitreport& rep);
void alglib::spline1dfithermitewc( real_1d_array x, real_1d_array y, real_1d_array w, ae_int_t n, real_1d_array xc, real_1d_array yc, integer_1d_array dc, ae_int_t k, ae_int_t m, ae_int_t& info, spline1dinterpolant& s, spline1dfitreport& rep);
void alglib::smp_spline1dfithermitewc( real_1d_array x, real_1d_array y, real_1d_array w, real_1d_array xc, real_1d_array yc, integer_1d_array dc, ae_int_t m, ae_int_t& info, spline1dinterpolant& s, spline1dfitreport& rep);
void alglib::smp_spline1dfithermitewc( real_1d_array x, real_1d_array y, real_1d_array w, ae_int_t n, real_1d_array xc, real_1d_array yc, integer_1d_array dc, ae_int_t k, ae_int_t m, ae_int_t& info, spline1dinterpolant& s, spline1dfitreport& rep);
/************************************************************************* Fitting by penalized cubic spline. Equidistant grid with M nodes on [min(x,xc),max(x,xc)] is used to build basis functions. Basis functions are cubic splines with natural boundary conditions. Problem is regularized by adding non-linearity penalty to the usual least squares penalty function: S(x) = arg min { LS + P }, where LS = SUM { w[i]^2*(y[i] - S(x[i]))^2 } - least squares penalty P = C*10^rho*integral{ S''(x)^2*dx } - non-linearity penalty rho - tunable constant given by user C - automatically determined scale parameter, makes penalty invariant with respect to scaling of X, Y, W. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multithreading support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Speed-up provided by multithreading greatly depends on problem size ! - only large problems (number of coefficients is more than 500) can be ! efficiently multithreaded. ! ! Generally, commercial ALGLIB is several times faster than open-source ! generic C edition, and many times faster than open-source C# edition. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS: X - points, array[0..N-1]. Y - function values, array[0..N-1]. N - number of points (optional): * N>0 * if given, only first N elements of X/Y are processed * if not given, automatically determined from X/Y sizes M - number of basis functions ( = number_of_nodes), M>=4. Rho - regularization constant passed by user. It penalizes nonlinearity in the regression spline. It is logarithmically scaled, i.e. actual value of regularization constant is calculated as 10^Rho. It is automatically scaled so that: * Rho=2.0 corresponds to moderate amount of nonlinearity * generally, it should be somewhere in the [-8.0,+8.0] If you do not want to penalize nonlineary, pass small Rho. Values as low as -15 should work. OUTPUT PARAMETERS: Info- same format as in LSFitLinearWC() subroutine. * Info>0 task is solved * Info<=0 an error occured: -4 means inconvergence of internal SVD or Cholesky decomposition; problem may be too ill-conditioned (very rare) S - spline interpolant. Rep - Following fields are set: * RMSError rms error on the (X,Y). * AvgError average error on the (X,Y). * AvgRelError average relative error on the non-zero Y * MaxError maximum error NON-WEIGHTED ERRORS ARE CALCULATED IMPORTANT: this subroitine doesn't calculate task's condition number for K<>0. NOTE 1: additional nodes are added to the spline outside of the fitting interval to force linearity when x<min(x,xc) or x>max(x,xc). It is done for consistency - we penalize non-linearity at [min(x,xc),max(x,xc)], so it is natural to force linearity outside of this interval. NOTE 2: function automatically sorts points, so caller may pass unsorted array. -- ALGLIB PROJECT -- Copyright 18.08.2009 by Bochkanov Sergey *************************************************************************/
void alglib::spline1dfitpenalized( real_1d_array x, real_1d_array y, ae_int_t m, double rho, ae_int_t& info, spline1dinterpolant& s, spline1dfitreport& rep);
void alglib::spline1dfitpenalized( real_1d_array x, real_1d_array y, ae_int_t n, ae_int_t m, double rho, ae_int_t& info, spline1dinterpolant& s, spline1dfitreport& rep);
void alglib::smp_spline1dfitpenalized( real_1d_array x, real_1d_array y, ae_int_t m, double rho, ae_int_t& info, spline1dinterpolant& s, spline1dfitreport& rep);
void alglib::smp_spline1dfitpenalized( real_1d_array x, real_1d_array y, ae_int_t n, ae_int_t m, double rho, ae_int_t& info, spline1dinterpolant& s, spline1dfitreport& rep);

Examples:   [1]  
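
A minimal sketch of penalized fitting on slightly noisy data; the node count M=10 and Rho=2.0 (moderate smoothing) are arbitrary choices, and the printed value is only indicative.

#include "stdafx.h"
#include <stdio.h>
#include "interpolation.h"

using namespace alglib;

int main(int argc, char **argv)
{
    // noisy samples of a roughly linear trend y ~ x
    real_1d_array x = "[0.0,0.1,0.2,0.3,0.4,0.5,0.6,0.7,0.8,0.9,1.0]";
    real_1d_array y = "[0.00,0.12,0.18,0.32,0.38,0.52,0.58,0.72,0.78,0.92,1.00]";
    ae_int_t info;
    spline1dinterpolant s;
    spline1dfitreport rep;

    spline1dfitpenalized(x, y, 10, 2.0, info, s, rep);  // M=10 nodes, Rho=2.0 (moderate penalty)
    printf("%d\n", int(info));                          // info>0 on success
    printf("%.3f\n", spline1dcalc(s, 0.55));            // smoothed value near the middle of the interval
    return 0;
}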

/************************************************************************* Weighted fitting by penalized cubic spline. Equidistant grid with M nodes on [min(x,xc),max(x,xc)] is used to build basis functions. Basis functions are cubic splines with natural boundary conditions. Problem is regularized by adding non-linearity penalty to the usual least squares penalty function: S(x) = arg min { LS + P }, where LS = SUM { w[i]^2*(y[i] - S(x[i]))^2 } - least squares penalty P = C*10^rho*integral{ S''(x)^2*dx } - non-linearity penalty rho - tunable constant given by user C - automatically determined scale parameter, makes penalty invariant with respect to scaling of X, Y, W. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multithreading support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Speed-up provided by multithreading greatly depends on problem size ! - only large problems (number of coefficients is more than 500) can be ! efficiently multithreaded. ! ! Generally, commercial ALGLIB is several times faster than open-source ! generic C edition, and many times faster than open-source C# edition. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS: X - points, array[0..N-1]. Y - function values, array[0..N-1]. W - weights, array[0..N-1] Each summand in square sum of approximation deviations from given values is multiplied by the square of corresponding weight. Fill it by 1's if you don't want to solve weighted problem. N - number of points (optional): * N>0 * if given, only first N elements of X/Y/W are processed * if not given, automatically determined from X/Y/W sizes M - number of basis functions ( = number_of_nodes), M>=4. Rho - regularization constant passed by user. It penalizes nonlinearity in the regression spline. It is logarithmically scaled, i.e. actual value of regularization constant is calculated as 10^Rho. It is automatically scaled so that: * Rho=2.0 corresponds to moderate amount of nonlinearity * generally, it should be somewhere in the [-8.0,+8.0] If you do not want to penalize nonlineary, pass small Rho. Values as low as -15 should work. OUTPUT PARAMETERS: Info- same format as in LSFitLinearWC() subroutine. * Info>0 task is solved * Info<=0 an error occured: -4 means inconvergence of internal SVD or Cholesky decomposition; problem may be too ill-conditioned (very rare) S - spline interpolant. Rep - Following fields are set: * RMSError rms error on the (X,Y). * AvgError average error on the (X,Y). * AvgRelError average relative error on the non-zero Y * MaxError maximum error NON-WEIGHTED ERRORS ARE CALCULATED IMPORTANT: this subroitine doesn't calculate task's condition number for K<>0. NOTE 1: additional nodes are added to the spline outside of the fitting interval to force linearity when x<min(x,xc) or x>max(x,xc). It is done for consistency - we penalize non-linearity at [min(x,xc),max(x,xc)], so it is natural to force linearity outside of this interval. NOTE 2: function automatically sorts points, so caller may pass unsorted array. 
-- ALGLIB PROJECT -- Copyright 19.10.2010 by Bochkanov Sergey *************************************************************************/
void alglib::spline1dfitpenalizedw( real_1d_array x, real_1d_array y, real_1d_array w, ae_int_t m, double rho, ae_int_t& info, spline1dinterpolant& s, spline1dfitreport& rep);
void alglib::spline1dfitpenalizedw( real_1d_array x, real_1d_array y, real_1d_array w, ae_int_t n, ae_int_t m, double rho, ae_int_t& info, spline1dinterpolant& s, spline1dfitreport& rep);
void alglib::smp_spline1dfitpenalizedw( real_1d_array x, real_1d_array y, real_1d_array w, ae_int_t m, double rho, ae_int_t& info, spline1dinterpolant& s, spline1dfitreport& rep);
void alglib::smp_spline1dfitpenalizedw( real_1d_array x, real_1d_array y, real_1d_array w, ae_int_t n, ae_int_t m, double rho, ae_int_t& info, spline1dinterpolant& s, spline1dfitreport& rep);

Examples:   [1]  

#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "interpolation.h"

using namespace alglib;


int main(int argc, char **argv)
{
    //
    // In this example we demonstrate linear fitting by f(x|a) = a*exp(0.5*x).
    //
    // We have:
    // * y - vector of experimental data
    // * fmatrix -  matrix of basis functions calculated at sample points
    //              Actually, we have only one basis function F0 = exp(0.5*x).
    //
    real_2d_array fmatrix = "[[0.606531],[0.670320],[0.740818],[0.818731],[0.904837],[1.000000],[1.105171],[1.221403],[1.349859],[1.491825],[1.648721]]";
    real_1d_array y = "[1.133719, 1.306522, 1.504604, 1.554663, 1.884638, 2.072436, 2.257285, 2.534068, 2.622017, 2.897713, 3.219371]";
    ae_int_t info;
    real_1d_array c;
    lsfitreport rep;

    //
    // Linear fitting without weights
    //
    lsfitlinear(y, fmatrix, info, c, rep);
    printf("%d\n", int(info)); // EXPECTED: 1
    printf("%s\n", c.tostring(4).c_str()); // EXPECTED: [1.98650]

    //
    // Linear fitting with individual weights.
    // Slightly different result is returned.
    //
    real_1d_array w = "[1.414213, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]";
    lsfitlinearw(y, w, fmatrix, info, c, rep);
    printf("%d\n", int(info)); // EXPECTED: 1
    printf("%s\n", c.tostring(4).c_str()); // EXPECTED: [1.983354]
    return 0;
}


#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "interpolation.h"

using namespace alglib;


int main(int argc, char **argv)
{
    //
    // In this example we demonstrate linear fitting by f(x|a,b) = a*x+b
    // with simple constraint f(0)=0.
    //
    // We have:
    // * y - vector of experimental data
    // * fmatrix -  matrix of basis functions sampled at [0,1] with step 0.2:
    //                  [ 1.0   0.0 ]
    //                  [ 1.0   0.2 ]
    //                  [ 1.0   0.4 ]
    //                  [ 1.0   0.6 ]
    //                  [ 1.0   0.8 ]
    //                  [ 1.0   1.0 ]
    //              first column contains value of first basis function (constant term)
    //              second column contains second basis function (linear term)
    // * cmatrix -  matrix of linear constraints:
    //                  [ 1.0  0.0  0.0 ]
    //              first two columns contain coefficients before basis functions,
    //              last column contains desired value of their sum.
    //              So [1,0,0] means "1*constant_term + 0*linear_term = 0" 
    //
    real_1d_array y = "[0.072436,0.246944,0.491263,0.522300,0.714064,0.921929]";
    real_2d_array fmatrix = "[[1,0.0],[1,0.2],[1,0.4],[1,0.6],[1,0.8],[1,1.0]]";
    real_2d_array cmatrix = "[[1,0,0]]";
    ae_int_t info;
    real_1d_array c;
    lsfitreport rep;

    //
    // Constrained fitting without weights
    //
    lsfitlinearc(y, fmatrix, cmatrix, info, c, rep);
    printf("%d\n", int(info)); // EXPECTED: 1
    printf("%s\n", c.tostring(3).c_str()); // EXPECTED: [0,0.932933]

    //
    // Constrained fitting with individual weights
    //
    real_1d_array w = "[1, 1.414213, 1, 1, 1, 1]";
    lsfitlinearwc(y, w, fmatrix, cmatrix, info, c, rep);
    printf("%d\n", int(info)); // EXPECTED: 1
    printf("%s\n", c.tostring(3).c_str()); // EXPECTED: [0,0.938322]
    return 0;
}


#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "interpolation.h"

using namespace alglib;
void function_cx_1_func(const real_1d_array &c, const real_1d_array &x, double &func, void *ptr) 
{
    // this callback calculates f(c,x)=exp(-c0*sqr(x0))
    // where x is a position on X-axis and c is adjustable parameter
    func = exp(-c[0]*pow(x[0],2));
}

int main(int argc, char **argv)
{
    //
    // In this example we demonstrate exponential fitting
    // by f(x) = exp(-c*x^2)
    // using function value only.
    //
    // Gradient is estimated using combination of numerical differences
    // and secant updates. diffstep variable stores differentiation step 
    // (we have to tell algorithm what step to use).
    //
    real_2d_array x = "[[-1],[-0.8],[-0.6],[-0.4],[-0.2],[0],[0.2],[0.4],[0.6],[0.8],[1.0]]";
    real_1d_array y = "[0.223130, 0.382893, 0.582748, 0.786628, 0.941765, 1.000000, 0.941765, 0.786628, 0.582748, 0.382893, 0.223130]";
    real_1d_array c = "[0.3]";
    double epsx = 0.000001;
    ae_int_t maxits = 0;
    ae_int_t info;
    lsfitstate state;
    lsfitreport rep;
    double diffstep = 0.0001;

    //
    // Fitting without weights
    //
    lsfitcreatef(x, y, c, diffstep, state);
    lsfitsetcond(state, epsx, maxits);
    alglib::lsfitfit(state, function_cx_1_func);
    lsfitresults(state, info, c, rep);
    printf("%d\n", int(info)); // EXPECTED: 2
    printf("%s\n", c.tostring(1).c_str()); // EXPECTED: [1.5]

    //
    // Fitting with weights
    // (you can change weights and see how it changes result)
    //
    real_1d_array w = "[1,1,1,1,1,1,1,1,1,1,1]";
    lsfitcreatewf(x, y, w, c, diffstep, state);
    lsfitsetcond(state, epsx, maxits);
    alglib::lsfitfit(state, function_cx_1_func);
    lsfitresults(state, info, c, rep);
    printf("%d\n", int(info)); // EXPECTED: 2
    printf("%s\n", c.tostring(1).c_str()); // EXPECTED: [1.5]
    return 0;
}


#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "interpolation.h"

using namespace alglib;
void function_cx_1_func(const real_1d_array &c, const real_1d_array &x, double &func, void *ptr) 
{
    // this callback calculates f(c,x)=exp(-c0*sqr(x0))
    // where x is a position on X-axis and c is adjustable parameter
    func = exp(-c[0]*pow(x[0],2));
}

int main(int argc, char **argv)
{
    //
    // In this example we demonstrate exponential fitting by
    //     f(x) = exp(-c*x^2)
    // subject to bound constraints
    //     0.0 <= c <= 1.0
    // using function value only.
    //
    // Gradient is estimated using combination of numerical differences
    // and secant updates. diffstep variable stores differentiation step 
    // (we have to tell algorithm what step to use).
    //
    // Unconstrained solution is c=1.5, but because of constraints we should
    // get c=1.0 (at the boundary).
    //
    real_2d_array x = "[[-1],[-0.8],[-0.6],[-0.4],[-0.2],[0],[0.2],[0.4],[0.6],[0.8],[1.0]]";
    real_1d_array y = "[0.223130, 0.382893, 0.582748, 0.786628, 0.941765, 1.000000, 0.941765, 0.786628, 0.582748, 0.382893, 0.223130]";
    real_1d_array c = "[0.3]";
    real_1d_array bndl = "[0.0]";
    real_1d_array bndu = "[1.0]";
    double epsx = 0.000001;
    ae_int_t maxits = 0;
    ae_int_t info;
    lsfitstate state;
    lsfitreport rep;
    double diffstep = 0.0001;

    lsfitcreatef(x, y, c, diffstep, state);
    lsfitsetbc(state, bndl, bndu);
    lsfitsetcond(state, epsx, maxits);
    alglib::lsfitfit(state, function_cx_1_func);
    lsfitresults(state, info, c, rep);
    printf("%s\n", c.tostring(1).c_str()); // EXPECTED: [1.0]
    return 0;
}


#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "interpolation.h"

using namespace alglib;
void function_cx_1_func(const real_1d_array &c, const real_1d_array &x, double &func, void *ptr) 
{
    // this callback calculates f(c,x)=exp(-c0*sqr(x0))
    // where x is a position on X-axis and c is adjustable parameter
    func = exp(-c[0]*pow(x[0],2));
}
void function_cx_1_grad(const real_1d_array &c, const real_1d_array &x, double &func, real_1d_array &grad, void *ptr) 
{
    // this callback calculates f(c,x)=exp(-c0*sqr(x0)) and gradient G={df/dc[i]}
    // where x is a position on X-axis and c is adjustable parameter.
    // IMPORTANT: gradient is calculated with respect to C, not to X
    func = exp(-c[0]*pow(x[0],2));
    grad[0] = -pow(x[0],2)*func;
}

int main(int argc, char **argv)
{
    //
    // In this example we demonstrate exponential fitting
    // by f(x) = exp(-c*x^2)
    // using function value and gradient (with respect to c).
    //
    real_2d_array x = "[[-1],[-0.8],[-0.6],[-0.4],[-0.2],[0],[0.2],[0.4],[0.6],[0.8],[1.0]]";
    real_1d_array y = "[0.223130, 0.382893, 0.582748, 0.786628, 0.941765, 1.000000, 0.941765, 0.786628, 0.582748, 0.382893, 0.223130]";
    real_1d_array c = "[0.3]";
    double epsx = 0.000001;
    ae_int_t maxits = 0;
    ae_int_t info;
    lsfitstate state;
    lsfitreport rep;

    //
    // Fitting without weights
    //
    lsfitcreatefg(x, y, c, true, state);
    lsfitsetcond(state, epsx, maxits);
    alglib::lsfitfit(state, function_cx_1_func, function_cx_1_grad);
    lsfitresults(state, info, c, rep);
    printf("%d\n", int(info)); // EXPECTED: 2
    printf("%s\n", c.tostring(1).c_str()); // EXPECTED: [1.5]

    //
    // Fitting with weights
    // (you can change weights and see how it changes result)
    //
    real_1d_array w = "[1,1,1,1,1,1,1,1,1,1,1]";
    lsfitcreatewfg(x, y, w, c, true, state);
    lsfitsetcond(state, epsx, maxits);
    alglib::lsfitfit(state, function_cx_1_func, function_cx_1_grad);
    lsfitresults(state, info, c, rep);
    printf("%d\n", int(info)); // EXPECTED: 2
    printf("%s\n", c.tostring(1).c_str()); // EXPECTED: [1.5]
    return 0;
}


#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "interpolation.h"

using namespace alglib;
void function_cx_1_func(const real_1d_array &c, const real_1d_array &x, double &func, void *ptr) 
{
    // this callback calculates f(c,x)=exp(-c0*sqr(x0))
    // where x is a position on X-axis and c is adjustable parameter
    func = exp(-c[0]*pow(x[0],2));
}
void function_cx_1_grad(const real_1d_array &c, const real_1d_array &x, double &func, real_1d_array &grad, void *ptr) 
{
    // this callback calculates f(c,x)=exp(-c0*sqr(x0)) and gradient G={df/dc[i]}
    // where x is a position on X-axis and c is adjustable parameter.
    // IMPORTANT: gradient is calculated with respect to C, not to X
    func = exp(-c[0]*pow(x[0],2));
    grad[0] = -pow(x[0],2)*func;
}
void function_cx_1_hess(const real_1d_array &c, const real_1d_array &x, double &func, real_1d_array &grad, real_2d_array &hess, void *ptr) 
{
    // this callback calculates f(c,x)=exp(-c0*sqr(x0)), gradient G={df/dc[i]} and Hessian H={d2f/(dc[i]*dc[j])}
    // where x is a position on X-axis and c is adjustable parameter.
    // IMPORTANT: gradient/Hessian are calculated with respect to C, not to X
    func = exp(-c[0]*pow(x[0],2));
    grad[0] = -pow(x[0],2)*func;
    hess[0][0] = pow(x[0],4)*func;
}

int main(int argc, char **argv)
{
    //
    // In this example we demonstrate exponential fitting
    // by f(x) = exp(-c*x^2)
    // using function value, gradient and Hessian (with respect to c)
    //
    real_2d_array x = "[[-1],[-0.8],[-0.6],[-0.4],[-0.2],[0],[0.2],[0.4],[0.6],[0.8],[1.0]]";
    real_1d_array y = "[0.223130, 0.382893, 0.582748, 0.786628, 0.941765, 1.000000, 0.941765, 0.786628, 0.582748, 0.382893, 0.223130]";
    real_1d_array c = "[0.3]";
    double epsx = 0.000001;
    ae_int_t maxits = 0;
    ae_int_t info;
    lsfitstate state;
    lsfitreport rep;

    //
    // Fitting without weights
    //
    lsfitcreatefgh(x, y, c, state);
    lsfitsetcond(state, epsx, maxits);
    alglib::lsfitfit(state, function_cx_1_func, function_cx_1_grad, function_cx_1_hess);
    lsfitresults(state, info, c, rep);
    printf("%d\n", int(info)); // EXPECTED: 2
    printf("%s\n", c.tostring(1).c_str()); // EXPECTED: [1.5]

    //
    // Fitting with weights
    // (you can change weights and see how it changes result)
    //
    real_1d_array w = "[1,1,1,1,1,1,1,1,1,1,1]";
    lsfitcreatewfgh(x, y, w, c, state);
    lsfitsetcond(state, epsx, maxits);
    alglib::lsfitfit(state, function_cx_1_func, function_cx_1_grad, function_cx_1_hess);
    lsfitresults(state, info, c, rep);
    printf("%d\n", int(info)); // EXPECTED: 2
    printf("%s\n", c.tostring(1).c_str()); // EXPECTED: [1.5]
    return 0;
}


#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "interpolation.h"

using namespace alglib;
void function_debt_func(const real_1d_array &c, const real_1d_array &x, double &func, void *ptr) 
{
    //
    // this callback calculates f(c,x)=c[0]*(1+c[1]*(pow(x[0]-1999,c[2])-1))
    //
    func = c[0]*(1+c[1]*(pow(x[0]-1999,c[2])-1));
}

int main(int argc, char **argv)
{
    //
    // In this example we demonstrate fitting by
    //     f(x) = c[0]*(1+c[1]*((x-1999)^c[2]-1))
    // subject to bound constraints
    //     -INF  < c[0] < +INF
    //      -10 <= c[1] <= +10
    //      0.1 <= c[2] <= 2.0
    // Data we want to fit are time series of Japan national debt
    // collected from 2000 to 2008 measured in USD (dollars, not
    // millions of dollars).
    //
    // Our variables are:
    //     c[0] - debt value at initial moment (2000),
    //     c[1] - direction coefficient (growth or decrease),
    //     c[2] - curvature coefficient.
    // You may see that our variables are badly scaled: the first one
    // is of order 10^12, while the next two are about 1 in magnitude.
    // Such a problem is difficult to solve without some kind of scaling.
    // That is exactly where lsfitsetscale() function can be used.
    // We set scale of our variables to [1.0E12, 1, 1], which allows
    // us to easily solve this problem.
    //
    // You can try commenting out the lsfitsetscale() call - you will see
    // that the algorithm fails to converge.
    //
    real_2d_array x = "[[2000],[2001],[2002],[2003],[2004],[2005],[2006],[2007],[2008]]";
    real_1d_array y = "[4323239600000.0, 4560913100000.0, 5564091500000.0, 6743189300000.0, 7284064600000.0, 7050129600000.0, 7092221500000.0, 8483907600000.0, 8625804400000.0]";
    real_1d_array c = "[1.0e+13, 1, 1]";
    double epsx = 1.0e-5;
    real_1d_array bndl = "[-inf, -10, 0.1]";
    real_1d_array bndu = "[+inf, +10, 2.0]";
    real_1d_array s = "[1.0e+12, 1, 1]";
    ae_int_t maxits = 0;
    ae_int_t info;
    lsfitstate state;
    lsfitreport rep;
    double diffstep = 1.0e-5;

    lsfitcreatef(x, y, c, diffstep, state);
    lsfitsetcond(state, epsx, maxits);
    lsfitsetbc(state, bndl, bndu);
    lsfitsetscale(state, s);
    alglib::lsfitfit(state, function_debt_func);
    lsfitresults(state, info, c, rep);
    printf("%d\n", int(info)); // EXPECTED: 2
    printf("%s\n", c.tostring(-2).c_str()); // EXPECTED: [4.142560E+12, 0.434240, 0.565376]
    return 0;
}


#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "interpolation.h"

using namespace alglib;


int main(int argc, char **argv)
{
    //
    // This example demonstrates polynomial fitting.
    //
    // Fitting is done by two (M=2) functions from polynomial basis:
    //     f0 = 1
    //     f1 = x
    // Basically, it is just a linear fit; more complex polynomials may be used
    // (e.g. parabolas with M=3, cubic with M=4), but even such simple fit allows
    // us to demonstrate polynomialfit() function in action.
    //
    // We have:
    // * x      set of abscissas
    // * y      experimental data
    //
    // Additionally we demonstrate weighted fitting, where second point has
    // more weight than other ones.
    //
    real_1d_array x = "[0.0,0.1,0.2,0.3,0.4,0.5,0.6,0.7,0.8,0.9,1.0]";
    real_1d_array y = "[0.00,0.05,0.26,0.32,0.33,0.43,0.60,0.60,0.77,0.98,1.02]";
    ae_int_t m = 2;
    double t = 2;
    ae_int_t info;
    barycentricinterpolant p;
    polynomialfitreport rep;
    double v;

    //
    // Fitting without individual weights
    //
    // NOTE: result is returned as barycentricinterpolant structure.
    //       if you want to get representation in the power basis,
    //       you can use the polynomialbar2pow() function to convert
    //       from barycentric to power representation (see docs for 
    //       POLINT subpackage for more info).
    //
    polynomialfit(x, y, m, info, p, rep);
    v = barycentriccalc(p, t);
    printf("%.2f\n", double(v)); // EXPECTED: 2.011

    //
    // Fitting with individual weights
    //
    // NOTE: slightly different result is returned
    //
    real_1d_array w = "[1,1.414213562,1,1,1,1,1,1,1,1,1]";
    real_1d_array xc = "[]";
    real_1d_array yc = "[]";
    integer_1d_array dc = "[]";
    polynomialfitwc(x, y, w, xc, yc, dc, m, info, p, rep);
    v = barycentriccalc(p, t);
    printf("%.2f\n", double(v)); // EXPECTED: 2.023
    return 0;
}


#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "interpolation.h"

using namespace alglib;


int main(int argc, char **argv)
{
    //
    // This example demonstrates polynomial fitting.
    //
    // Fitting is done by two (M=2) functions from polynomial basis:
    //     f0 = 1
    //     f1 = x
    // with simple constraint on function value
    //     f(0) = 0
    // Basically, it is just a linear fit; more complex polynomials may be used
    // (e.g. parabolas with M=3, cubic with M=4), but even such simple fit allows
    // us to demonstrate polynomialfit() function in action.
    //
    // We have:
    // * x      set of abscissas
    // * y      experimental data
    // * xc     points where constraints are placed
    // * yc     constraints on derivatives
    // * dc     derivative indices
    //          (0 means function itself, 1 means first derivative)
    //
    real_1d_array x = "[1.0,1.0]";
    real_1d_array y = "[0.9,1.1]";
    real_1d_array w = "[1,1]";
    real_1d_array xc = "[0]";
    real_1d_array yc = "[0]";
    integer_1d_array dc = "[0]";
    double t = 2;
    ae_int_t m = 2;
    ae_int_t info;
    barycentricinterpolant p;
    polynomialfitreport rep;
    double v;

    polynomialfitwc(x, y, w, xc, yc, dc, m, info, p, rep);
    v = barycentriccalc(p, t);
    printf("%.2f\n", double(v)); // EXPECTED: 2.000
    return 0;
}


#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "interpolation.h"

using namespace alglib;


int main(int argc, char **argv)
{
    //
    // In this example we demonstrate penalized spline fitting of noisy data
    //
    // We have:
    // * x - abscissas
    // * y - vector of experimental data, straight line with small noise
    //
    real_1d_array x = "[0.00,0.10,0.20,0.30,0.40,0.50,0.60,0.70,0.80,0.90]";
    real_1d_array y = "[0.10,0.00,0.30,0.40,0.30,0.40,0.62,0.68,0.75,0.95]";
    ae_int_t info;
    double v;
    spline1dinterpolant s;
    spline1dfitreport rep;
    double rho;

    //
    // Fit with VERY small amount of smoothing (rho = -5.0)
    // and large number of basis functions (M=50).
    //
    // With such small regularization penalized spline almost fully reproduces function values
    //
    rho = -5.0;
    spline1dfitpenalized(x, y, 50, rho, info, s, rep);
    printf("%d\n", int(info)); // EXPECTED: 1
    v = spline1dcalc(s, 0.0);
    printf("%.1f\n", double(v)); // EXPECTED: 0.10

    //
    // Fit with VERY large amount of smoothing (rho = 10.0)
    // and large number of basis functions (M=50).
    //
    // With such regularization our spline should become close to the straight line fit.
    // We will compare its value in x=1.0 with results obtained from such fit.
    //
    rho = +10.0;
    spline1dfitpenalized(x, y, 50, rho, info, s, rep);
    printf("%d\n", int(info)); // EXPECTED: 1
    v = spline1dcalc(s, 1.0);
    printf("%.2f\n", double(v)); // EXPECTED: 0.969

    //
    // In real life applications you may need some moderate degree of fitting,
    // so we try to fit once more with rho=3.0.
    //
    rho = +3.0;
    spline1dfitpenalized(x, y, 50, rho, info, s, rep);
    printf("%d\n", int(info)); // EXPECTED: 1
    return 0;
}


#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "interpolation.h"

using namespace alglib;


int main(int argc, char **argv)
{
    real_1d_array x = "[1,2,3,4,5,6,7,8]";
    real_1d_array y = "[0.06313223,0.44552624,0.61838364,0.71385108,0.77345838,0.81383140,0.84280033,0.86449822]";
    ae_int_t n = 8;
    double a;
    double b;
    double c;
    double d;
    lsfitreport rep;

    //
    // Test logisticfit4() on carefully designed data with a priori known answer.
    //
    logisticfit4(x, y, n, a, b, c, d, rep);
    printf("%.1f\n", double(a)); // EXPECTED: -1.000
    printf("%.1f\n", double(b)); // EXPECTED: 1.200
    printf("%.1f\n", double(c)); // EXPECTED: 0.900
    printf("%.1f\n", double(d)); // EXPECTED: 1.000

    //
    // Evaluate model at point x=0.5
    //
    double v;
    v = logisticcalc4(0.5, a, b, c, d);
    printf("%.2f\n", double(v)); // EXPECTED: -0.33874308
    return 0;
}


#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "interpolation.h"

using namespace alglib;


int main(int argc, char **argv)
{
    real_1d_array x = "[1,2,3,4,5,6,7,8]";
    real_1d_array y = "[0.1949776139,0.5710060208,0.726002637,0.8060434158,0.8534547965,0.8842071579,0.9054773317,0.9209088299]";
    ae_int_t n = 8;
    double a;
    double b;
    double c;
    double d;
    double g;
    lsfitreport rep;

    //
    // Test logisticfit5() on carefully designed data with a priori known answer.
    //
    logisticfit5(x, y, n, a, b, c, d, g, rep);
    printf("%.1f\n", double(a)); // EXPECTED: -1.000
    printf("%.1f\n", double(b)); // EXPECTED: 1.200
    printf("%.1f\n", double(c)); // EXPECTED: 0.900
    printf("%.1f\n", double(d)); // EXPECTED: 1.000
    printf("%.1f\n", double(g)); // EXPECTED: 1.200

    //
    // Evaluate model at point x=0.5
    //
    double v;
    v = logisticcalc5(0.5, a, b, c, d, g);
    printf("%.2f\n", double(v)); // EXPECTED: -0.2354656824
    return 0;
}


mannwhitneyutest
/************************************************************************* Mann-Whitney U-test This test checks hypotheses about whether X and Y are samples of two continuous distributions of the same shape and same median or whether their medians are different. The following tests are performed: * two-tailed test (null hypothesis - the medians are equal) * left-tailed test (null hypothesis - the median of the first sample is greater than or equal to the median of the second sample) * right-tailed test (null hypothesis - the median of the first sample is less than or equal to the median of the second sample). Requirements: * the samples are independent * X and Y are continuous distributions (or discrete distributions well- approximating continuous distributions) * distributions of X and Y have the same shape. The only possible difference is their position (i.e. the value of the median) * the number of elements in each sample is not less than 5 * the scale of measurement should be ordinal, interval or ratio (i.e. the test cannot be applied to nominal variables). The test is non-parametric and doesn't require distributions to be normal. Input parameters: X - sample 1. Array whose index goes from 0 to N-1. N - size of the sample. N>=5 Y - sample 2. Array whose index goes from 0 to M-1. M - size of the sample. M>=5 Output parameters: BothTails - p-value for two-tailed test. If BothTails is less than the given significance level, the null hypothesis is rejected. LeftTail - p-value for left-tailed test. If LeftTail is less than the given significance level, the null hypothesis is rejected. RightTail - p-value for right-tailed test. If RightTail is less than the given significance level, the null hypothesis is rejected. To calculate p-values, a special approximation is used. This method lets us calculate p-values with satisfactory accuracy in the interval [0.0001, 1]. There is no approximation outside the [0.0001, 1] interval. Therefore, if the significance level lies outside this interval, the test returns 0.0001. Relative precision of approximation of p-value: N M Max.err. Rms.err. 5..10 N..10 1.4e-02 6.0e-04 5..10 N..100 2.2e-02 5.3e-06 10..15 N..15 1.0e-02 3.2e-04 10..15 N..100 1.0e-02 2.2e-05 15..100 N..100 6.1e-03 2.7e-06 For N,M>100 accuracy checks weren't performed, but taking into account the characteristics of the asymptotic approximation used, precision should not differ sharply from the values for the interval [5, 100]. NOTE: P-value approximation was optimized for 0.0001<=p<=0.2500. Thus, P-values outside of this interval are clipped to these bounds. For example, you may quite often get P equal to exactly 0.25 or 0.0001. -- ALGLIB -- Copyright 09.04.2007 by Bochkanov Sergey *************************************************************************/
void alglib::mannwhitneyutest( real_1d_array x, ae_int_t n, real_1d_array y, ae_int_t m, double& bothtails, double& lefttail, double& righttail);
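
No example accompanies this function in the manual, so here is a minimal usage sketch. The two samples and the 5% significance level are illustrative assumptions, and the test is assumed to be declared in statistics.h (Statistics package).

#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "statistics.h"

using namespace alglib;


int main(int argc, char **argv)
{
    //
    // Two small independent samples; we test whether their medians differ.
    //
    real_1d_array x = "[8.6, 9.1, 7.9, 8.4, 9.0, 8.8]";
    real_1d_array y = "[7.2, 7.8, 7.5, 8.0, 7.1, 7.6]";
    double bothtails;
    double lefttail;
    double righttail;

    mannwhitneyutest(x, 6, y, 6, bothtails, lefttail, righttail);

    //
    // Reject the two-tailed null hypothesis (equal medians) at the 5%
    // level if bothtails is less than 0.05.
    //
    printf("%.4f %.4f %.4f\n", bothtails, lefttail, righttail);
    return 0;
}
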
cmatrixdet
cmatrixludet
rmatrixdet
rmatrixludet
spdmatrixcholeskydet
spdmatrixdet
matdet_d_1 Determinant calculation, real matrix, short form
matdet_d_2 Determinant calculation, real matrix, full form
matdet_d_3 Determinant calculation, complex matrix, short form
matdet_d_4 Determinant calculation, complex matrix, full form
matdet_d_5 Determinant calculation, complex matrix with zero imaginary part, short form
/************************************************************************* Calculation of the determinant of a general matrix Input parameters: A - matrix, array[0..N-1, 0..N-1] N - (optional) size of matrix A: * if given, only principal NxN submatrix is processed and overwritten. other elements are unchanged. * if not given, automatically determined from matrix size (A must be square matrix) Result: determinant of matrix A. -- ALGLIB -- Copyright 2005 by Bochkanov Sergey *************************************************************************/
alglib::complex alglib::cmatrixdet(complex_2d_array a); alglib::complex alglib::cmatrixdet(complex_2d_array a, ae_int_t n);

Examples:   [1]  [2]  [3]  

/************************************************************************* Determinant calculation of the matrix given by its LU decomposition. Input parameters: A - LU decomposition of the matrix (output of RMatrixLU subroutine). Pivots - table of permutations which were made during the LU decomposition. Output of RMatrixLU subroutine. N - (optional) size of matrix A: * if given, only principal NxN submatrix is processed and overwritten. other elements are unchanged. * if not given, automatically determined from matrix size (A must be square matrix) Result: matrix determinant. -- ALGLIB -- Copyright 2005 by Bochkanov Sergey *************************************************************************/
alglib::complex alglib::cmatrixludet( complex_2d_array a, integer_1d_array pivots); alglib::complex alglib::cmatrixludet( complex_2d_array a, integer_1d_array pivots, ae_int_t n);
/************************************************************************* Calculation of the determinant of a general matrix Input parameters: A - matrix, array[0..N-1, 0..N-1] N - (optional) size of matrix A: * if given, only principal NxN submatrix is processed and overwritten. other elements are unchanged. * if not given, automatically determined from matrix size (A must be square matrix) Result: determinant of matrix A. -- ALGLIB -- Copyright 2005 by Bochkanov Sergey *************************************************************************/
double alglib::rmatrixdet(real_2d_array a); double alglib::rmatrixdet(real_2d_array a, ae_int_t n);

Examples:   [1]  [2]  

/************************************************************************* Determinant calculation of the matrix given by its LU decomposition. Input parameters: A - LU decomposition of the matrix (output of RMatrixLU subroutine). Pivots - table of permutations which were made during the LU decomposition. Output of RMatrixLU subroutine. N - (optional) size of matrix A: * if given, only principal NxN submatrix is processed and overwritten. other elements are unchanged. * if not given, automatically determined from matrix size (A must be square matrix) Result: matrix determinant. -- ALGLIB -- Copyright 2005 by Bochkanov Sergey *************************************************************************/
double alglib::rmatrixludet(real_2d_array a, integer_1d_array pivots); double alglib::rmatrixludet( real_2d_array a, integer_1d_array pivots, ae_int_t n);
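
A minimal sketch of the LU-based workflow, assuming rmatrixlu() from the LinAlg package is used to produce the factorization; the matrix is the same one used in the rmatrixdet() example below, so the expected determinant is -3.

#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "linalg.h"

using namespace alglib;


int main(int argc, char **argv)
{
    //
    // Factorize A with rmatrixlu(), then reuse the factorization
    // to compute the determinant: det([[1,2],[2,1]]) = -3.
    //
    real_2d_array a = "[[1,2],[2,1]]";
    integer_1d_array pivots;
    double d;

    rmatrixlu(a, 2, 2, pivots);  // A is overwritten by its LU factors
    d = rmatrixludet(a, pivots);
    printf("%.3f\n", double(d)); // expected: -3
    return 0;
}
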
/************************************************************************* Determinant calculation of the matrix given by the Cholesky decomposition. Input parameters: A - Cholesky decomposition, output of SMatrixCholesky subroutine. N - (optional) size of matrix A: * if given, only principal NxN submatrix is processed and overwritten. other elements are unchanged. * if not given, automatically determined from matrix size (A must be square matrix) As the determinant is equal to the product of squares of diagonal elements, it's not necessary to specify which triangle - lower or upper - the matrix is stored in. Result: matrix determinant. -- ALGLIB -- Copyright 2005-2008 by Bochkanov Sergey *************************************************************************/
double alglib::spdmatrixcholeskydet(real_2d_array a); double alglib::spdmatrixcholeskydet(real_2d_array a, ae_int_t n);
/************************************************************************* Determinant calculation of the symmetric positive definite matrix. Input parameters: A - matrix. Array with elements [0..N-1, 0..N-1]. N - (optional) size of matrix A: * if given, only principal NxN submatrix is processed and overwritten. other elements are unchanged. * if not given, automatically determined from matrix size (A must be square matrix) IsUpper - (optional) storage type: * if True, symmetric matrix A is given by its upper triangle, and the lower triangle isn't used/changed by function * if False, symmetric matrix A is given by its lower triangle, and the upper triangle isn't used/changed by function * if not given, both lower and upper triangles must be filled. Result: determinant of matrix A. If matrix A is not positive definite, exception is thrown. -- ALGLIB -- Copyright 2005-2008 by Bochkanov Sergey *************************************************************************/
double alglib::spdmatrixdet(real_2d_array a); double alglib::spdmatrixdet(real_2d_array a, ae_int_t n, bool isupper);
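
A minimal sketch showing both the direct and the Cholesky-based determinant of an SPD matrix; spdmatrixcholesky() from the LinAlg package is assumed for the factorization, and the matrix is the same one used in the rmatrixdet() examples below, so the expected value is 9.

#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "linalg.h"

using namespace alglib;


int main(int argc, char **argv)
{
    //
    // det([[5,4],[4,5]]) = 9, computed two ways.
    //
    real_2d_array a = "[[5,4],[4,5]]";
    real_2d_array chol = "[[5,4],[4,5]]";
    double d;

    // direct calculation for a symmetric positive definite matrix
    d = spdmatrixdet(a);
    printf("%.3f\n", double(d)); // expected: 9

    // calculation from the Cholesky factor (upper triangle)
    if( spdmatrixcholesky(chol, 2, true) )
    {
        d = spdmatrixcholeskydet(chol);
        printf("%.3f\n", double(d)); // expected: 9
    }
    return 0;
}
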
#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "linalg.h"

using namespace alglib;


int main(int argc, char **argv)
{
    real_2d_array b = "[[1,2],[2,1]]";
    double a;
    a = rmatrixdet(b);
    printf("%.3f\n", double(a)); // EXPECTED: -3
    return 0;
}


#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "linalg.h"

using namespace alglib;


int main(int argc, char **argv)
{
    real_2d_array b = "[[5,4],[4,5]]";
    double a;
    a = rmatrixdet(b, 2);
    printf("%.3f\n", double(a)); // EXPECTED: 9
    return 0;
}


#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "linalg.h"

using namespace alglib;


int main(int argc, char **argv)
{
    complex_2d_array b = "[[1+1i,2],[2,1-1i]]";
    alglib::complex a;
    a = cmatrixdet(b);
    printf("%s\n", a.tostring(3).c_str()); // EXPECTED: -2
    return 0;
}


#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "linalg.h"

using namespace alglib;


int main(int argc, char **argv)
{
    alglib::complex a;
    complex_2d_array b = "[[5i,4],[4i,5]]";
    a = cmatrixdet(b, 2);
    printf("%s\n", a.tostring(3).c_str()); // EXPECTED: 9i
    return 0;
}


#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "linalg.h"

using namespace alglib;


int main(int argc, char **argv)
{
    alglib::complex a;
    complex_2d_array b = "[[9,1],[2,1]]";
    a = cmatrixdet(b);
    printf("%s\n", a.tostring(3).c_str()); // EXPECTED: 7
    return 0;
}


cmatrixrndcond
cmatrixrndorthogonal
cmatrixrndorthogonalfromtheleft
cmatrixrndorthogonalfromtheright
hmatrixrndcond
hmatrixrndmultiply
hpdmatrixrndcond
rmatrixrndcond
rmatrixrndorthogonal
rmatrixrndorthogonalfromtheleft
rmatrixrndorthogonalfromtheright
smatrixrndcond
smatrixrndmultiply
spdmatrixrndcond
/************************************************************************* Generation of random NxN complex matrix with given condition number C and norm2(A)=1 INPUT PARAMETERS: N - matrix size C - condition number (in 2-norm) OUTPUT PARAMETERS: A - random matrix with norm2(A)=1 and cond(A)=C -- ALGLIB routine -- 04.12.2009 Bochkanov Sergey *************************************************************************/
void alglib::cmatrixrndcond(ae_int_t n, double c, complex_2d_array& a);
/************************************************************************* Generation of a random Haar distributed orthogonal complex matrix INPUT PARAMETERS: N - matrix size, N>=1 OUTPUT PARAMETERS: A - orthogonal NxN matrix, array[0..N-1,0..N-1] NOTE: this function uses algorithm described in Stewart, G. W. (1980), "The Efficient Generation of Random Orthogonal Matrices with an Application to Condition Estimators". Speaking short, to generate an (N+1)x(N+1) orthogonal matrix, it: * takes an NxN one * takes uniformly distributed unit vector of dimension N+1. * constructs a Householder reflection from the vector, then applies it to the smaller matrix (embedded in the larger size with a 1 at the bottom right corner). -- ALGLIB routine -- 04.12.2009 Bochkanov Sergey *************************************************************************/
void alglib::cmatrixrndorthogonal(ae_int_t n, complex_2d_array& a);
/************************************************************************* Multiplication of MxN complex matrix by MxM random Haar distributed complex orthogonal matrix INPUT PARAMETERS: A - matrix, array[0..M-1, 0..N-1] M, N- matrix size OUTPUT PARAMETERS: A - Q*A, where Q is random MxM orthogonal matrix -- ALGLIB routine -- 04.12.2009 Bochkanov Sergey *************************************************************************/
void alglib::cmatrixrndorthogonalfromtheleft( complex_2d_array& a, ae_int_t m, ae_int_t n);
/************************************************************************* Multiplication of MxN complex matrix by NxN random Haar distributed complex orthogonal matrix INPUT PARAMETERS: A - matrix, array[0..M-1, 0..N-1] M, N- matrix size OUTPUT PARAMETERS: A - A*Q, where Q is random NxN orthogonal matrix -- ALGLIB routine -- 04.12.2009 Bochkanov Sergey *************************************************************************/
void alglib::cmatrixrndorthogonalfromtheright( complex_2d_array& a, ae_int_t m, ae_int_t n);
/************************************************************************* Generation of random NxN Hermitian matrix with given condition number and norm2(A)=1 INPUT PARAMETERS: N - matrix size C - condition number (in 2-norm) OUTPUT PARAMETERS: A - random matrix with norm2(A)=1 and cond(A)=C -- ALGLIB routine -- 04.12.2009 Bochkanov Sergey *************************************************************************/
void alglib::hmatrixrndcond(ae_int_t n, double c, complex_2d_array& a);
/************************************************************************* Hermitian multiplication of NxN matrix by random Haar distributed complex orthogonal matrix INPUT PARAMETERS: A - matrix, array[0..N-1, 0..N-1] N - matrix size OUTPUT PARAMETERS: A - Q^H*A*Q, where Q is random NxN orthogonal matrix -- ALGLIB routine -- 04.12.2009 Bochkanov Sergey *************************************************************************/
void alglib::hmatrixrndmultiply(complex_2d_array& a, ae_int_t n);
/************************************************************************* Generation of random NxN Hermitian positive definite matrix with given condition number and norm2(A)=1 INPUT PARAMETERS: N - matrix size C - condition number (in 2-norm) OUTPUT PARAMETERS: A - random HPD matrix with norm2(A)=1 and cond(A)=C -- ALGLIB routine -- 04.12.2009 Bochkanov Sergey *************************************************************************/
void alglib::hpdmatrixrndcond(ae_int_t n, double c, complex_2d_array& a);
/************************************************************************* Generation of random NxN matrix with given condition number and norm2(A)=1 INPUT PARAMETERS: N - matrix size C - condition number (in 2-norm) OUTPUT PARAMETERS: A - random matrix with norm2(A)=1 and cond(A)=C -- ALGLIB routine -- 04.12.2009 Bochkanov Sergey *************************************************************************/
void alglib::rmatrixrndcond(ae_int_t n, double c, real_2d_array& a);
/************************************************************************* Generation of a random uniformly distributed (Haar) orthogonal matrix INPUT PARAMETERS: N - matrix size, N>=1 OUTPUT PARAMETERS: A - orthogonal NxN matrix, array[0..N-1,0..N-1] NOTE: this function uses algorithm described in Stewart, G. W. (1980), "The Efficient Generation of Random Orthogonal Matrices with an Application to Condition Estimators". Speaking short, to generate an (N+1)x(N+1) orthogonal matrix, it: * takes an NxN one * takes uniformly distributed unit vector of dimension N+1. * constructs a Householder reflection from the vector, then applies it to the smaller matrix (embedded in the larger size with a 1 at the bottom right corner). -- ALGLIB routine -- 04.12.2009 Bochkanov Sergey *************************************************************************/
void alglib::rmatrixrndorthogonal(ae_int_t n, real_2d_array& a);
/************************************************************************* Multiplication of MxN matrix by MxM random Haar distributed orthogonal matrix INPUT PARAMETERS: A - matrix, array[0..M-1, 0..N-1] M, N- matrix size OUTPUT PARAMETERS: A - Q*A, where Q is random MxM orthogonal matrix -- ALGLIB routine -- 04.12.2009 Bochkanov Sergey *************************************************************************/
void alglib::rmatrixrndorthogonalfromtheleft( real_2d_array& a, ae_int_t m, ae_int_t n);
/************************************************************************* Multiplication of MxN matrix by NxN random Haar distributed orthogonal matrix INPUT PARAMETERS: A - matrix, array[0..M-1, 0..N-1] M, N- matrix size OUTPUT PARAMETERS: A - A*Q, where Q is random NxN orthogonal matrix -- ALGLIB routine -- 04.12.2009 Bochkanov Sergey *************************************************************************/
void alglib::rmatrixrndorthogonalfromtheright( real_2d_array& a, ae_int_t m, ae_int_t n);
/************************************************************************* Generation of random NxN symmetric matrix with given condition number and norm2(A)=1 INPUT PARAMETERS: N - matrix size C - condition number (in 2-norm) OUTPUT PARAMETERS: A - random matrix with norm2(A)=1 and cond(A)=C -- ALGLIB routine -- 04.12.2009 Bochkanov Sergey *************************************************************************/
void alglib::smatrixrndcond(ae_int_t n, double c, real_2d_array& a);
/************************************************************************* Symmetric multiplication of NxN matrix by random Haar distributed orthogonal matrix INPUT PARAMETERS: A - matrix, array[0..N-1, 0..N-1] N - matrix size OUTPUT PARAMETERS: A - Q'*A*Q, where Q is random NxN orthogonal matrix -- ALGLIB routine -- 04.12.2009 Bochkanov Sergey *************************************************************************/
void alglib::smatrixrndmultiply(real_2d_array& a, ae_int_t n);
/************************************************************************* Generation of random NxN symmetric positive definite matrix with given condition number and norm2(A)=1 INPUT PARAMETERS: N - matrix size C - condition number (in 2-norm) OUTPUT PARAMETERS: A - random SPD matrix with norm2(A)=1 and cond(A)=C -- ALGLIB routine -- 04.12.2009 Bochkanov Sergey *************************************************************************/
void alglib::spdmatrixrndcond(ae_int_t n, double c, real_2d_array& a);
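
These generators have no examples in the manual; the sketch below exercises the real-valued routines. Output is random, so nothing is compared against an expected value, and the matrix sizes and condition number are illustrative assumptions.

#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "linalg.h"

using namespace alglib;


int main(int argc, char **argv)
{
    real_2d_array q;
    real_2d_array a;

    //
    // Random 4x4 Haar-distributed orthogonal matrix.
    //
    rmatrixrndorthogonal(4, q);
    printf("%s\n", q.tostring(3).c_str());

    //
    // Random 4x4 SPD matrix with norm2(A)=1 and cond(A)=1000.
    //
    spdmatrixrndcond(4, 1000.0, a);
    printf("%s\n", a.tostring(3).c_str());

    //
    // Random orthogonal similarity transform Q'*A*Q. The spectrum of the
    // symmetric matrix A (and hence its condition number) is preserved.
    //
    smatrixrndmultiply(a, 4);
    printf("%s\n", a.tostring(3).c_str());
    return 0;
}
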
matinvreport
cmatrixinverse
cmatrixluinverse
cmatrixtrinverse
hpdmatrixcholeskyinverse
hpdmatrixinverse
rmatrixinverse
rmatrixluinverse
rmatrixtrinverse
spdmatrixcholeskyinverse
spdmatrixinverse
matinv_d_c1 Complex matrix inverse
matinv_d_hpd1 HPD matrix inverse
matinv_d_r1 Real matrix inverse
matinv_d_spd1 SPD matrix inverse
/************************************************************************* Matrix inverse report: * R1 reciprocal of condition number in 1-norm * RInf reciprocal of condition number in inf-norm *************************************************************************/
class matinvreport { double r1; double rinf; };
/************************************************************************* Inversion of a general matrix. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that matrix inversion is harder to ! parallelize than, say, matrix-matrix product - this algorithm has ! many internal synchronization points which can not be avoided. However ! parallelism starts to be profitable starting from N=1024, achieving ! near-linear speedup for N=4096 or higher. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. Input parameters: A - matrix N - size of matrix A (optional) : * if given, only principal NxN submatrix is processed and overwritten. other elements are unchanged. * if not given, size is automatically determined from matrix size (A must be square matrix) Output parameters: Info - return code, same as in RMatrixLUInverse Rep - solver report, same as in RMatrixLUInverse A - inverse of matrix A, same as in RMatrixLUInverse -- ALGLIB -- Copyright 2005 by Bochkanov Sergey *************************************************************************/
void alglib::cmatrixinverse( complex_2d_array& a, ae_int_t& info, matinvreport& rep); void alglib::cmatrixinverse( complex_2d_array& a, ae_int_t n, ae_int_t& info, matinvreport& rep); void alglib::smp_cmatrixinverse( complex_2d_array& a, ae_int_t& info, matinvreport& rep); void alglib::smp_cmatrixinverse( complex_2d_array& a, ae_int_t n, ae_int_t& info, matinvreport& rep);

Examples:   [1]  

/************************************************************************* Inversion of a matrix given by its LU decomposition. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that matrix inversion is harder to ! parallelize than, say, matrix-matrix product - this algorithm has ! many internal synchronization points which can not be avoided. However ! parallelism starts to be profitable starting from N=1024, achieving ! near-linear speedup for N=4096 or higher. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS: A - LU decomposition of the matrix (output of CMatrixLU subroutine). Pivots - table of permutations (the output of CMatrixLU subroutine). N - size of matrix A (optional) : * if given, only principal NxN submatrix is processed and overwritten. other elements are unchanged. * if not given, size is automatically determined from matrix size (A must be square matrix) OUTPUT PARAMETERS: Info - return code, same as in RMatrixLUInverse Rep - solver report, same as in RMatrixLUInverse A - inverse of matrix A, same as in RMatrixLUInverse -- ALGLIB routine -- 05.02.2010 Bochkanov Sergey *************************************************************************/
void alglib::cmatrixluinverse( complex_2d_array& a, integer_1d_array pivots, ae_int_t& info, matinvreport& rep); void alglib::cmatrixluinverse( complex_2d_array& a, integer_1d_array pivots, ae_int_t n, ae_int_t& info, matinvreport& rep); void alglib::smp_cmatrixluinverse( complex_2d_array& a, integer_1d_array pivots, ae_int_t& info, matinvreport& rep); void alglib::smp_cmatrixluinverse( complex_2d_array& a, integer_1d_array pivots, ae_int_t n, ae_int_t& info, matinvreport& rep);
/************************************************************************* Triangular matrix inverse (complex) The subroutine inverts the following types of matrices: * upper triangular * upper triangular with unit diagonal * lower triangular * lower triangular with unit diagonal In case of an upper (lower) triangular matrix, the inverse matrix will also be upper (lower) triangular, and after the end of the algorithm, the inverse matrix replaces the source matrix. The elements below (above) the main diagonal are not changed by the algorithm. If the matrix has a unit diagonal, the inverse matrix also has a unit diagonal, and the diagonal elements are not passed to the algorithm. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that triangular inverse is harder to ! parallelize than, say, matrix-matrix product - this algorithm has ! many internal synchronization points which can not be avoided. However ! parallelism starts to be profitable starting from N=1024, achieving ! near-linear speedup for N=4096 or higher. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. Input parameters: A - matrix, array[0..N-1, 0..N-1]. N - size of matrix A (optional) : * if given, only principal NxN submatrix is processed and overwritten. other elements are unchanged. * if not given, size is automatically determined from matrix size (A must be square matrix) IsUpper - True, if the matrix is upper triangular. IsUnit - diagonal type (optional): * if True, matrix has unit diagonal (a[i,i] are NOT used) * if False, matrix diagonal is arbitrary * if not given, False is assumed Output parameters: Info - same as for RMatrixLUInverse Rep - same as for RMatrixLUInverse A - same as for RMatrixLUInverse. -- ALGLIB -- Copyright 05.02.2010 by Bochkanov Sergey *************************************************************************/
void alglib::cmatrixtrinverse( complex_2d_array& a, bool isupper, ae_int_t& info, matinvreport& rep); void alglib::cmatrixtrinverse( complex_2d_array& a, ae_int_t n, bool isupper, bool isunit, ae_int_t& info, matinvreport& rep); void alglib::smp_cmatrixtrinverse( complex_2d_array& a, bool isupper, ae_int_t& info, matinvreport& rep); void alglib::smp_cmatrixtrinverse( complex_2d_array& a, ae_int_t n, bool isupper, bool isunit, ae_int_t& info, matinvreport& rep);
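
A minimal sketch of the triangular inverse; it is not part of the official example set, and the matrix together with its expected inverse is worked out by hand for illustration.

#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "linalg.h"

using namespace alglib;


int main(int argc, char **argv)
{
    //
    // Invert an upper triangular complex matrix in place.
    // For [[2,1i],[0,4]] the inverse is [[0.5,-0.125i],[0,0.25]].
    //
    complex_2d_array a = "[[2,1i],[0,4]]";
    ae_int_t info;
    matinvreport rep;

    cmatrixtrinverse(a, 2, true, false, info, rep);
    printf("%d\n", int(info));             // 1 on success
    printf("%s\n", a.tostring(4).c_str()); // expected: [[0.5,-0.125i],[0,0.25]]
    return 0;
}
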
/************************************************************************* Inversion of a Hermitian positive definite matrix which is given by Cholesky decomposition. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. However, Cholesky inversion is a "difficult" ! algorithm - it has lots of internal synchronization points which ! prevents efficient parallelization of algorithm. Only very large ! problems (N=thousands) can be efficiently parallelized. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. Input parameters: A - Cholesky decomposition of the matrix to be inverted: A=U'*U or A = L*L'. Output of HPDMatrixCholesky subroutine. N - size of matrix A (optional) : * if given, only principal NxN submatrix is processed and overwritten. other elements are unchanged. * if not given, size is automatically determined from matrix size (A must be square matrix) IsUpper - storage type (optional): * if True, symmetric matrix A is given by its upper triangle, and the lower triangle isn't used/changed by function * if False, symmetric matrix A is given by its lower triangle, and the upper triangle isn't used/changed by function * if not given, lower half is used. Output parameters: Info - return code, same as in RMatrixLUInverse Rep - solver report, same as in RMatrixLUInverse A - inverse of matrix A, same as in RMatrixLUInverse -- ALGLIB routine -- 10.02.2010 Bochkanov Sergey *************************************************************************/
void alglib::hpdmatrixcholeskyinverse( complex_2d_array& a, ae_int_t& info, matinvreport& rep); void alglib::hpdmatrixcholeskyinverse( complex_2d_array& a, ae_int_t n, bool isupper, ae_int_t& info, matinvreport& rep); void alglib::smp_hpdmatrixcholeskyinverse( complex_2d_array& a, ae_int_t& info, matinvreport& rep); void alglib::smp_hpdmatrixcholeskyinverse( complex_2d_array& a, ae_int_t n, bool isupper, ae_int_t& info, matinvreport& rep);
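
A minimal sketch of the Cholesky-based inversion workflow; hpdmatrixcholesky() from the LinAlg package is assumed for the factorization, and the 2x2 Hermitian matrix and its inverse are worked out by hand for illustration. Only the triangle selected by IsUpper holds the inverse on output, so individual elements are printed.

#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "linalg.h"

using namespace alglib;


int main(int argc, char **argv)
{
    //
    // Cholesky-factorize an HPD matrix, then invert it from the factor.
    // For A = [[2,1i],[-1i,2]] we have inv(A) = [[2/3,-1i/3],[1i/3,2/3]].
    //
    complex_2d_array a = "[[2,1i],[-1i,2]]";
    ae_int_t info;
    matinvreport rep;

    if( hpdmatrixcholesky(a, 2, true) ) // upper-triangular factor stored in A
    {
        hpdmatrixcholeskyinverse(a, 2, true, info, rep);
        printf("%d\n", int(info));                   // 1 on success
        printf("%s\n", a[0][0].tostring(4).c_str()); // expected: about 0.6667
        printf("%s\n", a[0][1].tostring(4).c_str()); // expected: about -0.3333i
    }
    return 0;
}
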
/************************************************************************* Inversion of a Hermitian positive definite matrix. Given an upper or lower triangle of a Hermitian positive definite matrix, the algorithm generates matrix A^-1 and saves the upper or lower triangle depending on the input. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. However, Cholesky inversion is a "difficult" ! algorithm - it has lots of internal synchronization points which ! prevents efficient parallelization of algorithm. Only very large ! problems (N=thousands) can be efficiently parallelized. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. Input parameters: A - matrix to be inverted (upper or lower triangle). Array with elements [0..N-1,0..N-1]. N - size of matrix A (optional) : * if given, only principal NxN submatrix is processed and overwritten. other elements are unchanged. * if not given, size is automatically determined from matrix size (A must be square matrix) IsUpper - storage type (optional): * if True, symmetric matrix A is given by its upper triangle, and the lower triangle isn't used/changed by function * if False, symmetric matrix A is given by its lower triangle, and the upper triangle isn't used/changed by function * if not given, both lower and upper triangles must be filled. Output parameters: Info - return code, same as in RMatrixLUInverse Rep - solver report, same as in RMatrixLUInverse A - inverse of matrix A, same as in RMatrixLUInverse -- ALGLIB routine -- 10.02.2010 Bochkanov Sergey *************************************************************************/
void alglib::hpdmatrixinverse( complex_2d_array& a, ae_int_t& info, matinvreport& rep); void alglib::hpdmatrixinverse( complex_2d_array& a, ae_int_t n, bool isupper, ae_int_t& info, matinvreport& rep); void alglib::smp_hpdmatrixinverse( complex_2d_array& a, ae_int_t& info, matinvreport& rep); void alglib::smp_hpdmatrixinverse( complex_2d_array& a, ae_int_t n, bool isupper, ae_int_t& info, matinvreport& rep);

Examples:   [1]  

/************************************************************************* Inversion of a general matrix. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that matrix inversion is harder to ! parallelize than, say, matrix-matrix product - this algorithm has ! many internal synchronization points which can not be avoided. However ! parallelism starts to be profitable starting from N=1024, achieving ! near-linear speedup for N=4096 or higher. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. Input parameters: A - matrix. N - size of matrix A (optional) : * if given, only principal NxN submatrix is processed and overwritten. other elements are unchanged. * if not given, size is automatically determined from matrix size (A must be square matrix) Output parameters: Info - return code, same as in RMatrixLUInverse Rep - solver report, same as in RMatrixLUInverse A - inverse of matrix A, same as in RMatrixLUInverse Result: True, if the matrix is not singular. False, if the matrix is singular. -- ALGLIB -- Copyright 2005-2010 by Bochkanov Sergey *************************************************************************/
void alglib::rmatrixinverse(real_2d_array& a, ae_int_t& info, matinvreport& rep);
void alglib::rmatrixinverse(real_2d_array& a, ae_int_t n, ae_int_t& info, matinvreport& rep);
void alglib::smp_rmatrixinverse(real_2d_array& a, ae_int_t& info, matinvreport& rep);
void alglib::smp_rmatrixinverse(real_2d_array& a, ae_int_t n, ae_int_t& info, matinvreport& rep);

Examples:   [1]  

/************************************************************************* Inversion of a matrix given by its LU decomposition. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that matrix inversion is harder to ! parallelize than, say, matrix-matrix product - this algorithm has ! many internal synchronization points which can not be avoided. However ! parallelism starts to be profitable starting from N=1024, achieving ! near-linear speedup for N=4096 or higher. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS: A - LU decomposition of the matrix (output of RMatrixLU subroutine). Pivots - table of permutations (the output of RMatrixLU subroutine). N - size of matrix A (optional) : * if given, only principal NxN submatrix is processed and overwritten. other elements are unchanged. * if not given, size is automatically determined from matrix size (A must be square matrix) OUTPUT PARAMETERS: Info - return code: * -3 A is singular, or VERY close to singular. it is filled by zeros in such cases. * 1 task is solved (but matrix A may be ill-conditioned, check R1/RInf parameters for condition numbers). Rep - solver report, see below for more info A - inverse of matrix A. Array whose indexes range within [0..N-1, 0..N-1]. SOLVER REPORT Subroutine sets following fields of the Rep structure: * R1 reciprocal of condition number: 1/cond(A), 1-norm. * RInf reciprocal of condition number: 1/cond(A), inf-norm. -- ALGLIB routine -- 05.02.2010 Bochkanov Sergey *************************************************************************/
void alglib::rmatrixluinverse(real_2d_array& a, integer_1d_array pivots, ae_int_t& info, matinvreport& rep);
void alglib::rmatrixluinverse(real_2d_array& a, integer_1d_array pivots, ae_int_t n, ae_int_t& info, matinvreport& rep);
void alglib::smp_rmatrixluinverse(real_2d_array& a, integer_1d_array pivots, ae_int_t& info, matinvreport& rep);
void alglib::smp_rmatrixluinverse(real_2d_array& a, integer_1d_array pivots, ae_int_t n, ae_int_t& info, matinvreport& rep);
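
No standalone example accompanies the LU-based inverse above, so here is a minimal sketch (illustrative only, not one of the official examples): the matrix is first factorized with rmatrixlu(), declared in linalg.h, and then inverted from its LU decomposition.

#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "linalg.h"

using namespace alglib;


int main(int argc, char **argv)
{
    real_2d_array a = "[[1,-1],[1,1]]";
    integer_1d_array pivots;
    ae_int_t info;
    matinvreport rep;
    rmatrixlu(a, 2, 2, pivots);             // A is overwritten by its LU factors
    rmatrixluinverse(a, pivots, info, rep); // A now holds the inverse of the original matrix
    printf("%d\n", int(info));              // 1 on success
    printf("%s\n", a.tostring(4).c_str());  // approximately [[0.5,0.5],[-0.5,0.5]]
    return 0;
}
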
/************************************************************************* Triangular matrix inverse (real) The subroutine inverts the following types of matrices: * upper triangular * upper triangular with unit diagonal * lower triangular * lower triangular with unit diagonal In case of an upper (lower) triangular matrix, the inverse matrix will also be upper (lower) triangular, and after the end of the algorithm, the inverse matrix replaces the source matrix. The elements below (above) the main diagonal are not changed by the algorithm. If the matrix has a unit diagonal, the inverse matrix also has a unit diagonal, and the diagonal elements are not passed to the algorithm. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that triangular inverse is harder to ! parallelize than, say, matrix-matrix product - this algorithm has ! many internal synchronization points which can not be avoided. However ! parallelism starts to be profitable starting from N=1024, achieving ! near-linear speedup for N=4096 or higher. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. Input parameters: A - matrix, array[0..N-1, 0..N-1]. N - size of matrix A (optional) : * if given, only principal NxN submatrix is processed and overwritten. other elements are unchanged. * if not given, size is automatically determined from matrix size (A must be square matrix) IsUpper - True, if the matrix is upper triangular. IsUnit - diagonal type (optional): * if True, matrix has unit diagonal (a[i,i] are NOT used) * if False, matrix diagonal is arbitrary * if not given, False is assumed Output parameters: Info - same as for RMatrixLUInverse Rep - same as for RMatrixLUInverse A - same as for RMatrixLUInverse. -- ALGLIB -- Copyright 05.02.2010 by Bochkanov Sergey *************************************************************************/
void alglib::rmatrixtrinverse(real_2d_array& a, bool isupper, ae_int_t& info, matinvreport& rep);
void alglib::rmatrixtrinverse(real_2d_array& a, ae_int_t n, bool isupper, bool isunit, ae_int_t& info, matinvreport& rep);
void alglib::smp_rmatrixtrinverse(real_2d_array& a, bool isupper, ae_int_t& info, matinvreport& rep);
void alglib::smp_rmatrixtrinverse(real_2d_array& a, ae_int_t n, bool isupper, bool isunit, ae_int_t& info, matinvreport& rep);
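
The triangular inverse has no dedicated example in this section, so here is a small illustrative sketch (hand-checked values, not an official example): an upper triangular 2x2 matrix is inverted in place, and the element below the diagonal is left untouched.

#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "linalg.h"

using namespace alglib;


int main(int argc, char **argv)
{
    real_2d_array a = "[[2,1],[0,4]]";      // upper triangular, non-unit diagonal
    ae_int_t info;
    matinvreport rep;
    rmatrixtrinverse(a, true, info, rep);   // short overload: isunit is assumed to be false
    printf("%d\n", int(info));              // 1 on success
    printf("%s\n", a.tostring(4).c_str());  // approximately [[0.5,-0.125],[0,0.25]]
    return 0;
}
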
/************************************************************************* Inversion of a symmetric positive definite matrix which is given by Cholesky decomposition. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. However, Cholesky inversion is a "difficult" ! algorithm - it has lots of internal synchronization points which ! prevents efficient parallelization of algorithm. Only very large ! problems (N=thousands) can be efficiently parallelized. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. Input parameters: A - Cholesky decomposition of the matrix to be inverted: A=U'*U or A = L*L'. Output of SPDMatrixCholesky subroutine. N - size of matrix A (optional) : * if given, only principal NxN submatrix is processed and overwritten. other elements are unchanged. * if not given, size is automatically determined from matrix size (A must be square matrix) IsUpper - storage type (optional): * if True, symmetric matrix A is given by its upper triangle, and the lower triangle isn't used/changed by function * if False, symmetric matrix A is given by its lower triangle, and the upper triangle isn't used/changed by function * if not given, lower half is used. Output parameters: Info - return code, same as in RMatrixLUInverse Rep - solver report, same as in RMatrixLUInverse A - inverse of matrix A, same as in RMatrixLUInverse -- ALGLIB routine -- 10.02.2010 Bochkanov Sergey *************************************************************************/
void alglib::spdmatrixcholeskyinverse(real_2d_array& a, ae_int_t& info, matinvreport& rep);
void alglib::spdmatrixcholeskyinverse(real_2d_array& a, ae_int_t n, bool isupper, ae_int_t& info, matinvreport& rep);
void alglib::smp_spdmatrixcholeskyinverse(real_2d_array& a, ae_int_t& info, matinvreport& rep);
void alglib::smp_spdmatrixcholeskyinverse(real_2d_array& a, ae_int_t n, bool isupper, ae_int_t& info, matinvreport& rep);
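
The Cholesky-based inverse is normally used right after spdmatrixcholesky(). The sketch below (illustrative only, with a hand-checked 2x2 matrix) shows this combination; note that only the selected triangle of the output contains the inverse, the other triangle is left unchanged.

#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "linalg.h"

using namespace alglib;


int main(int argc, char **argv)
{
    real_2d_array a = "[[2,1],[1,2]]";
    ae_int_t info;
    matinvreport rep;
    if( !spdmatrixcholesky(a, 2, false) )               // lower factor L, A = L*L'
    {
        printf("matrix is not positive definite\n");
        return 1;
    }
    spdmatrixcholeskyinverse(a, 2, false, info, rep);
    printf("%d\n", int(info));                          // 1 on success
    // lower triangle of A^-1 is [[0.6667, .],[-0.3333, 0.6667]];
    // the upper triangle still holds untouched elements of the input
    printf("%s\n", a.tostring(4).c_str());
    return 0;
}
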
/************************************************************************* Inversion of a symmetric positive definite matrix. Given an upper or lower triangle of a symmetric positive definite matrix, the algorithm generates matrix A^-1 and saves the upper or lower triangle depending on the input. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. However, Cholesky inversion is a "difficult" ! algorithm - it has lots of internal synchronization points which ! prevents efficient parallelization of algorithm. Only very large ! problems (N=thousands) can be efficiently parallelized. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. Input parameters: A - matrix to be inverted (upper or lower triangle). Array with elements [0..N-1,0..N-1]. N - size of matrix A (optional) : * if given, only principal NxN submatrix is processed and overwritten. other elements are unchanged. * if not given, size is automatically determined from matrix size (A must be square matrix) IsUpper - storage type (optional): * if True, symmetric matrix A is given by its upper triangle, and the lower triangle isn't used/changed by function * if False, symmetric matrix A is given by its lower triangle, and the upper triangle isn't used/changed by function * if not given, both lower and upper triangles must be filled. Output parameters: Info - return code, same as in RMatrixLUInverse Rep - solver report, same as in RMatrixLUInverse A - inverse of matrix A, same as in RMatrixLUInverse -- ALGLIB routine -- 10.02.2010 Bochkanov Sergey *************************************************************************/
void alglib::spdmatrixinverse(real_2d_array& a, ae_int_t& info, matinvreport& rep);
void alglib::spdmatrixinverse(real_2d_array& a, ae_int_t n, bool isupper, ae_int_t& info, matinvreport& rep);
void alglib::smp_spdmatrixinverse(real_2d_array& a, ae_int_t& info, matinvreport& rep);
void alglib::smp_spdmatrixinverse(real_2d_array& a, ae_int_t n, bool isupper, ae_int_t& info, matinvreport& rep);

Examples:   [1]  

#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "linalg.h"

using namespace alglib;


int main(int argc, char **argv)
{
    complex_2d_array a = "[[1i,-1],[1i,1]]";
    ae_int_t info;
    matinvreport rep;
    cmatrixinverse(a, info, rep);
    printf("%d\n", int(info)); // EXPECTED: 1
    printf("%s\n", a.tostring(4).c_str()); // EXPECTED: [[-0.5i,-0.5i],[-0.5,0.5]]
    printf("%.4f\n", double(rep.r1)); // EXPECTED: 0.5
    printf("%.4f\n", double(rep.rinf)); // EXPECTED: 0.5
    return 0;
}


#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "linalg.h"

using namespace alglib;


int main(int argc, char **argv)
{
    complex_2d_array a = "[[2,1],[1,2]]";
    ae_int_t info;
    matinvreport rep;
    hpdmatrixinverse(a, info, rep);
    printf("%d\n", int(info)); // EXPECTED: 1
    printf("%s\n", a.tostring(4).c_str()); // EXPECTED: [[0.666666,-0.333333],[-0.333333,0.666666]]
    return 0;
}


#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "linalg.h"

using namespace alglib;


int main(int argc, char **argv)
{
    real_2d_array a = "[[1,-1],[1,1]]";
    ae_int_t info;
    matinvreport rep;
    rmatrixinverse(a, info, rep);
    printf("%d\n", int(info)); // EXPECTED: 1
    printf("%s\n", a.tostring(4).c_str()); // EXPECTED: [[0.5,0.5],[-0.5,0.5]]
    printf("%.4f\n", double(rep.r1)); // EXPECTED: 0.5
    printf("%.4f\n", double(rep.rinf)); // EXPECTED: 0.5
    return 0;
}


#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "linalg.h"

using namespace alglib;


int main(int argc, char **argv)
{
    real_2d_array a = "[[2,1],[1,2]]";
    ae_int_t info;
    matinvreport rep;
    spdmatrixinverse(a, info, rep);
    printf("%d\n", int(info)); // EXPECTED: 1
    printf("%s\n", a.tostring(4).c_str()); // EXPECTED: [[0.666666,-0.333333],[-0.333333,0.666666]]
    return 0;
}


mcpdreport
mcpdstate
mcpdaddbc
mcpdaddec
mcpdaddtrack
mcpdcreate
mcpdcreateentry
mcpdcreateentryexit
mcpdcreateexit
mcpdresults
mcpdsetbc
mcpdsetec
mcpdsetlc
mcpdsetpredictionweights
mcpdsetprior
mcpdsettikhonovregularizer
mcpdsolve
mcpd_simple1 Simple unconstrained MCPD model (no entry/exit states)
mcpd_simple2 Simple MCPD model (no entry/exit states) with equality constraints
/************************************************************************* This structure is a MCPD training report: InnerIterationsCount - number of inner iterations of the underlying optimization algorithm OuterIterationsCount - number of outer iterations of the underlying optimization algorithm NFEV - number of merit function evaluations TerminationType - termination type (same as for MinBLEIC optimizer, positive values denote success, negative ones - failure) -- ALGLIB -- Copyright 23.05.2010 by Bochkanov Sergey *************************************************************************/
class mcpdreport { ae_int_t inneriterationscount; ae_int_t outeriterationscount; ae_int_t nfev; ae_int_t terminationtype; };
/************************************************************************* This structure is a MCPD (Markov Chains for Population Data) solver. You should use ALGLIB functions in order to work with this object. -- ALGLIB -- Copyright 23.05.2010 by Bochkanov Sergey *************************************************************************/
class mcpdstate { };
/************************************************************************* This function is used to add bound constraints on the elements of the transition matrix P. MCPD solver has four types of constraints which can be placed on P: * user-specified equality constraints (optional) * user-specified bound constraints (optional) * user-specified general linear constraints (optional) * basic constraints (always present): * non-negativity: P[i,j]>=0 * consistency: every column of P sums to 1.0 Final constraints which are passed to the underlying optimizer are calculated as intersection of all present constraints. For example, you may specify boundary constraint on P[0,0] and equality one: 0.1<=P[0,0]<=0.9 P[0,0]=0.5 Such combination of constraints will be silently reduced to their intersection, which is P[0,0]=0.5. This function can be used to ADD bound constraint for one element of P without changing constraints for other elements. You can also use MCPDSetBC() function which allows to place bound constraints on arbitrary subset of elements of P. Set of constraints is specified by BndL/BndU matrices, which may contain arbitrary combination of finite numbers or infinities (like -INF<x<=0.5 or 0.1<=x<+INF). These functions (MCPDSetBC and MCPDAddBC) interact as follows: * there is internal matrix of bound constraints which is stored in the MCPD solver * MCPDSetBC() replaces this matrix by another one (SET) * MCPDAddBC() modifies one element of this matrix and leaves other ones unchanged (ADD) * thus MCPDAddBC() call preserves all modifications done by previous calls, while MCPDSetBC() completely discards all changes done to the equality constraints. INPUT PARAMETERS: S - solver I - row index of element being constrained J - column index of element being constrained BndL - lower bound BndU - upper bound -- ALGLIB -- Copyright 23.05.2010 by Bochkanov Sergey *************************************************************************/
void alglib::mcpdaddbc( mcpdstate s, ae_int_t i, ae_int_t j, double bndl, double bndu);
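
As a small illustration of the ADD semantics described above (a sketch, not one of the referenced examples), the following program constrains a single element of P, 0.1<=P[0,0]<=0.9, while leaving the constraints on all other elements unchanged.

#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "dataanalysis.h"

using namespace alglib;


int main(int argc, char **argv)
{
    mcpdstate s;
    mcpdcreate(2, s);               // 2-state model
    mcpdaddbc(s, 0, 0, 0.1, 0.9);   // bound constraint on P[0,0] only
    // tracks would then be added with mcpdaddtrack() and the problem
    // solved with mcpdsolve()/mcpdresults(), as in the examples below
    return 0;
}
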
/************************************************************************* This function is used to add equality constraints on the elements of the transition matrix P. MCPD solver has four types of constraints which can be placed on P: * user-specified equality constraints (optional) * user-specified bound constraints (optional) * user-specified general linear constraints (optional) * basic constraints (always present): * non-negativity: P[i,j]>=0 * consistency: every column of P sums to 1.0 Final constraints which are passed to the underlying optimizer are calculated as intersection of all present constraints. For example, you may specify boundary constraint on P[0,0] and equality one: 0.1<=P[0,0]<=0.9 P[0,0]=0.5 Such combination of constraints will be silently reduced to their intersection, which is P[0,0]=0.5. This function can be used to ADD equality constraint for one element of P without changing constraints for other elements. You can also use MCPDSetEC() function which allows you to specify arbitrary set of equality constraints in one call. These functions (MCPDSetEC and MCPDAddEC) interact as follows: * there is internal matrix of equality constraints which is stored in the MCPD solver * MCPDSetEC() replaces this matrix by another one (SET) * MCPDAddEC() modifies one element of this matrix and leaves other ones unchanged (ADD) * thus MCPDAddEC() call preserves all modifications done by previous calls, while MCPDSetEC() completely discards all changes done to the equality constraints. INPUT PARAMETERS: S - solver I - row index of element being constrained J - column index of element being constrained C - value (constraint for P[I,J]). Can be either NAN (no constraint) or finite value from [0,1]. NOTES: 1. infinite values of C will lead to exception being thrown. Values less than 0.0 or greater than 1.0 will lead to error code being returned after call to MCPDSolve(). -- ALGLIB -- Copyright 23.05.2010 by Bochkanov Sergey *************************************************************************/
void alglib::mcpdaddec(mcpdstate s, ae_int_t i, ae_int_t j, double c);

Examples:   [1]  

/************************************************************************* This function is used to add a track - sequence of system states at the different moments of its evolution. You may add one or several tracks to the MCPD solver. In case you have several tracks, they won't overwrite each other. For example, if you pass two tracks, A1-A2-A3 (system at t=A+1, t=A+2 and t=A+3) and B1-B2-B3, then solver will try to model transitions from t=A+1 to t=A+2, t=A+2 to t=A+3, t=B+1 to t=B+2, t=B+2 to t=B+3. But it WONT mix these two tracks - i.e. it wont try to model transition from t=A+3 to t=B+1. INPUT PARAMETERS: S - solver XY - track, array[K,N]: * I-th row is a state at t=I * elements of XY must be non-negative (exception will be thrown on negative elements) K - number of points in a track * if given, only leading K rows of XY are used * if not given, automatically determined from size of XY NOTES: 1. Track may contain either proportional or population data: * with proportional data all rows of XY must sum to 1.0, i.e. we have proportions instead of absolute population values * with population data rows of XY contain population counts and generally do not sum to 1.0 (although they still must be non-negative) -- ALGLIB -- Copyright 23.05.2010 by Bochkanov Sergey *************************************************************************/
void alglib::mcpdaddtrack(mcpdstate s, real_2d_array xy);
void alglib::mcpdaddtrack(mcpdstate s, real_2d_array xy, ae_int_t k);

Examples:   [1]  [2]  

/************************************************************************* DESCRIPTION: This function creates MCPD (Markov Chains for Population Data) solver. This solver can be used to find transition matrix P for N-dimensional prediction problem where transition from X[i] to X[i+1] is modelled as X[i+1] = P*X[i] where X[i] and X[i+1] are N-dimensional population vectors (components of each X are non-negative), and P is a N*N transition matrix (elements of P are non-negative, each column sums to 1.0). Such models arise when: * there is some population of individuals * individuals can have different states * individuals can transit from one state to another * population size is constant, i.e. there are no new individuals and no one leaves the population * you want to model transitions of individuals from one state into another USAGE: Here we give very brief outline of the MCPD. We strongly recommend you to read examples in the ALGLIB Reference Manual and to read ALGLIB User Guide on data analysis which is available at http://www.alglib.net/dataanalysis/ 1. User initializes algorithm state with MCPDCreate() call 2. User adds one or more tracks - sequences of states which describe evolution of a system being modelled from different starting conditions 3. User may add optional boundary, equality and/or linear constraints on the coefficients of P by calling one of the following functions: * MCPDSetEC() to set equality constraints * MCPDSetBC() to set bound constraints * MCPDSetLC() to set linear constraints 4. Optionally, user may set custom weights for prediction errors (by default, algorithm assigns non-equal, automatically chosen weights for errors in the prediction of different components of X). It can be done with a call of MCPDSetPredictionWeights() function. 5. User calls MCPDSolve() function which takes algorithm state and solves the problem 6. User calls MCPDResults() to get solution INPUT PARAMETERS: N - problem dimension, N>=1 OUTPUT PARAMETERS: State - structure stores algorithm state -- ALGLIB -- Copyright 23.05.2010 by Bochkanov Sergey *************************************************************************/
void alglib::mcpdcreate(ae_int_t n, mcpdstate& s);

Examples:   [1]  [2]  

/************************************************************************* DESCRIPTION: This function is a specialized version of MCPDCreate() function, and we recommend you to read comments for this function for general information about MCPD solver. This function creates MCPD (Markov Chains for Population Data) solver for "Entry-state" model, i.e. model where transition from X[i] to X[i+1] is modelled as X[i+1] = P*X[i] where X[i] and X[i+1] are N-dimensional state vectors P is a N*N transition matrix and one selected component of X[] is called "entry" state and is treated in a special way: system state always transits from "entry" state to some another state system state can not transit from any state into "entry" state Such conditions basically mean that row of P which corresponds to "entry" state is zero. Such models arise when: * there is some population of individuals * individuals can have different states * individuals can transit from one state to another * population size is NOT constant - at every moment of time there is some (unpredictable) amount of "new" individuals, which can transit into one of the states at the next turn, but still no one leaves population * you want to model transitions of individuals from one state into another * but you do NOT want to predict amount of "new" individuals because it does not depends on individuals already present (hence system can not transit INTO entry state - it can only transit FROM it). This model is discussed in more details in the ALGLIB User Guide (see http://www.alglib.net/dataanalysis/ for more data). INPUT PARAMETERS: N - problem dimension, N>=2 EntryState- index of entry state, in 0..N-1 OUTPUT PARAMETERS: State - structure stores algorithm state -- ALGLIB -- Copyright 23.05.2010 by Bochkanov Sergey *************************************************************************/
void alglib::mcpdcreateentry( ae_int_t n, ae_int_t entrystate, mcpdstate& s);
/************************************************************************* DESCRIPTION: This function is a specialized version of MCPDCreate() function, and we recommend you to read comments for this function for general information about MCPD solver. This function creates MCPD (Markov Chains for Population Data) solver for "Entry-Exit-states" model, i.e. model where transition from X[i] to X[i+1] is modelled as X[i+1] = P*X[i] where X[i] and X[i+1] are N-dimensional state vectors P is a N*N transition matrix one selected component of X[] is called "entry" state and is treated in a special way: system state always transits from "entry" state to some another state system state can not transit from any state into "entry" state and another one component of X[] is called "exit" state and is treated in a special way too: system state can transit from any state into "exit" state system state can not transit from "exit" state into any other state transition operator discards "exit" state (makes it zero at each turn) Such conditions basically mean that: row of P which corresponds to "entry" state is zero column of P which corresponds to "exit" state is zero Multiplication by such P may decrease sum of vector components. Such models arise when: * there is some population of individuals * individuals can have different states * individuals can transit from one state to another * population size is NOT constant * at every moment of time there is some (unpredictable) amount of "new" individuals, which can transit into one of the states at the next turn * some individuals can move (predictably) into "exit" state and leave population at the next turn * you want to model transitions of individuals from one state into another, including transitions from the "entry" state and into the "exit" state. * but you do NOT want to predict amount of "new" individuals because it does not depends on individuals already present (hence system can not transit INTO entry state - it can only transit FROM it). This model is discussed in more details in the ALGLIB User Guide (see http://www.alglib.net/dataanalysis/ for more data). INPUT PARAMETERS: N - problem dimension, N>=2 EntryState- index of entry state, in 0..N-1 ExitState- index of exit state, in 0..N-1 OUTPUT PARAMETERS: State - structure stores algorithm state -- ALGLIB -- Copyright 23.05.2010 by Bochkanov Sergey *************************************************************************/
void alglib::mcpdcreateentryexit( ae_int_t n, ae_int_t entrystate, ae_int_t exitstate, mcpdstate& s);
/************************************************************************* DESCRIPTION: This function is a specialized version of MCPDCreate() function, and we recommend you to read comments for this function for general information about MCPD solver. This function creates MCPD (Markov Chains for Population Data) solver for "Exit-state" model, i.e. model where transition from X[i] to X[i+1] is modelled as X[i+1] = P*X[i] where X[i] and X[i+1] are N-dimensional state vectors P is a N*N transition matrix and one selected component of X[] is called "exit" state and is treated in a special way: system state can transit from any state into "exit" state system state can not transit from "exit" state into any other state transition operator discards "exit" state (makes it zero at each turn) Such conditions basically mean that column of P which corresponds to "exit" state is zero. Multiplication by such P may decrease sum of vector components. Such models arise when: * there is some population of individuals * individuals can have different states * individuals can transit from one state to another * population size is NOT constant - individuals can move into "exit" state and leave population at the next turn, but there are no new individuals * amount of individuals which leave population can be predicted * you want to model transitions of individuals from one state into another (including transitions into the "exit" state) This model is discussed in more details in the ALGLIB User Guide (see http://www.alglib.net/dataanalysis/ for more data). INPUT PARAMETERS: N - problem dimension, N>=2 ExitState- index of exit state, in 0..N-1 OUTPUT PARAMETERS: State - structure stores algorithm state -- ALGLIB -- Copyright 23.05.2010 by Bochkanov Sergey *************************************************************************/
void alglib::mcpdcreateexit(ae_int_t n, ae_int_t exitstate, mcpdstate& s);
/************************************************************************* MCPD results INPUT PARAMETERS: State - algorithm state OUTPUT PARAMETERS: P - array[N,N], transition matrix Rep - optimization report. You should check Rep.TerminationType in order to distinguish successful termination from unsuccessful one. Speaking short, positive values denote success, negative ones are failures. More information about fields of this structure can be found in the comments on MCPDReport datatype. -- ALGLIB -- Copyright 23.05.2010 by Bochkanov Sergey *************************************************************************/
void alglib::mcpdresults(mcpdstate s, real_2d_array& p, mcpdreport& rep);

Examples:   [1]  [2]  

/************************************************************************* This function is used to add bound constraints on the elements of the transition matrix P. MCPD solver has four types of constraints which can be placed on P: * user-specified equality constraints (optional) * user-specified bound constraints (optional) * user-specified general linear constraints (optional) * basic constraints (always present): * non-negativity: P[i,j]>=0 * consistency: every column of P sums to 1.0 Final constraints which are passed to the underlying optimizer are calculated as intersection of all present constraints. For example, you may specify boundary constraint on P[0,0] and equality one: 0.1<=P[0,0]<=0.9 P[0,0]=0.5 Such combination of constraints will be silently reduced to their intersection, which is P[0,0]=0.5. This function can be used to place bound constraints on arbitrary subset of elements of P. Set of constraints is specified by BndL/BndU matrices, which may contain arbitrary combination of finite numbers or infinities (like -INF<x<=0.5 or 0.1<=x<+INF). You can also use MCPDAddBC() function which allows to ADD bound constraint for one element of P without changing constraints for other elements. These functions (MCPDSetBC and MCPDAddBC) interact as follows: * there is internal matrix of bound constraints which is stored in the MCPD solver * MCPDSetBC() replaces this matrix by another one (SET) * MCPDAddBC() modifies one element of this matrix and leaves other ones unchanged (ADD) * thus MCPDAddBC() call preserves all modifications done by previous calls, while MCPDSetBC() completely discards all changes done to the equality constraints. INPUT PARAMETERS: S - solver BndL - lower bounds constraints, array[N,N]. Elements of BndL can be finite numbers or -INF. BndU - upper bounds constraints, array[N,N]. Elements of BndU can be finite numbers or +INF. -- ALGLIB -- Copyright 23.05.2010 by Bochkanov Sergey *************************************************************************/
void alglib::mcpdsetbc( mcpdstate s, real_2d_array bndl, real_2d_array bndu);
/************************************************************************* This function is used to add equality constraints on the elements of the transition matrix P. MCPD solver has four types of constraints which can be placed on P: * user-specified equality constraints (optional) * user-specified bound constraints (optional) * user-specified general linear constraints (optional) * basic constraints (always present): * non-negativity: P[i,j]>=0 * consistency: every column of P sums to 1.0 Final constraints which are passed to the underlying optimizer are calculated as intersection of all present constraints. For example, you may specify boundary constraint on P[0,0] and equality one: 0.1<=P[0,0]<=0.9 P[0,0]=0.5 Such combination of constraints will be silently reduced to their intersection, which is P[0,0]=0.5. This function can be used to place equality constraints on arbitrary subset of elements of P. Set of constraints is specified by EC, which may contain either NAN's or finite numbers from [0,1]. NAN denotes absence of constraint, finite number denotes equality constraint on specific element of P. You can also use MCPDAddEC() function which allows to ADD equality constraint for one element of P without changing constraints for other elements. These functions (MCPDSetEC and MCPDAddEC) interact as follows: * there is internal matrix of equality constraints which is stored in the MCPD solver * MCPDSetEC() replaces this matrix by another one (SET) * MCPDAddEC() modifies one element of this matrix and leaves other ones unchanged (ADD) * thus MCPDAddEC() call preserves all modifications done by previous calls, while MCPDSetEC() completely discards all changes done to the equality constraints. INPUT PARAMETERS: S - solver EC - equality constraints, array[N,N]. Elements of EC can be either NAN's or finite numbers from [0,1]. NAN denotes absence of constraints, while finite value denotes equality constraint on the corresponding element of P. NOTES: 1. infinite values of EC will lead to exception being thrown. Values less than 0.0 or greater than 1.0 will lead to error code being returned after call to MCPDSolve(). -- ALGLIB -- Copyright 23.05.2010 by Bochkanov Sergey *************************************************************************/
void alglib::mcpdsetec(mcpdstate s, real_2d_array ec);
/************************************************************************* This function is used to set linear equality/inequality constraints on the elements of the transition matrix P. This function can be used to set one or several general linear constraints on the elements of P. Two types of constraints are supported: * equality constraints * inequality constraints (both less-or-equal and greater-or-equal) Coefficients of constraints are specified by matrix C (one of the parameters). One row of C corresponds to one constraint. Because transition matrix P has N*N elements, we need N*N columns to store all coefficients (they are stored row by row), and one more column to store right part - hence C has N*N+1 columns. Constraint kind is stored in the CT array. Thus, I-th linear constraint is P[0,0]*C[I,0] + P[0,1]*C[I,1] + .. + P[0,N-1]*C[I,N-1] + + P[1,0]*C[I,N] + P[1,1]*C[I,N+1] + ... + + P[N-1,N-1]*C[I,N*N-1] ?=? C[I,N*N] where ?=? can be either "=" (CT[i]=0), "<=" (CT[i]<0) or ">=" (CT[i]>0). Your constraint may involve only some subset of P (less than N*N elements). For example it can be something like P[0,0] + P[0,1] = 0.5 In this case you still should pass matrix with N*N+1 columns, but all its elements (except for C[0,0], C[0,1] and C[0,N*N]) will be zero. INPUT PARAMETERS: S - solver C - array[K,N*N+1] - coefficients of constraints (see above for complete description) CT - array[K] - constraint types (see above for complete description) K - number of equality/inequality constraints, K>=0: * if given, only leading K elements of C/CT are used * if not given, automatically determined from sizes of C/CT -- ALGLIB -- Copyright 23.05.2010 by Bochkanov Sergey *************************************************************************/
void alglib::mcpdsetlc(mcpdstate s, real_2d_array c, integer_1d_array ct);
void alglib::mcpdsetlc(mcpdstate s, real_2d_array c, integer_1d_array ct, ae_int_t k);
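
The row-by-row packing of P into the rows of C is easiest to see on a tiny case. In the sketch below (illustrative values, not an official example) a 2-state model receives one inequality constraint, P[0,0]+P[0,1]<=0.5, so C has N*N+1=5 columns (four coefficients plus the right part) and CT has a single negative element selecting the "<=" constraint type.

#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "dataanalysis.h"

using namespace alglib;


int main(int argc, char **argv)
{
    mcpdstate s;
    mcpdcreate(2, s);
    // columns 0..3 are coefficients of P[0,0], P[0,1], P[1,0], P[1,1];
    // column 4 is the right part, CT[0]<0 means "<="
    real_2d_array c = "[[1,1,0,0,0.5]]";
    integer_1d_array ct = "[-1]";
    mcpdsetlc(s, c, ct);
    return 0;
}
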
/************************************************************************* This function is used to change prediction weights MCPD solver scales prediction errors as follows Error(P) = ||W*(y-P*x)||^2 where x is a system state at time t y is a system state at time t+1 P is a transition matrix W is a diagonal scaling matrix By default, weights are chosen in order to minimize relative prediction error instead of absolute one. For example, if one component of state is about 0.5 in magnitude and another one is about 0.05, then algorithm will make corresponding weights equal to 2.0 and 20.0. INPUT PARAMETERS: S - solver PW - array[N], weights: * must be non-negative values (exception will be thrown otherwise) * zero values will be replaced by automatically chosen values -- ALGLIB -- Copyright 23.05.2010 by Bochkanov Sergey *************************************************************************/
void alglib::mcpdsetpredictionweights(mcpdstate s, real_1d_array pw);
/************************************************************************* This function allows to set prior values used for regularization of your problem. By default, regularizing term is equal to r*||P-prior_P||^2, where r is a small non-zero value, P is transition matrix, prior_P is identity matrix, ||X||^2 is a sum of squared elements of X. This function allows you to change prior values prior_P. You can also change r with MCPDSetTikhonovRegularizer() function. INPUT PARAMETERS: S - solver PP - array[N,N], matrix of prior values: 1. elements must be real numbers from [0,1] 2. columns must sum to 1.0. First property is checked (exception is thrown otherwise), while second one is not checked/enforced. -- ALGLIB -- Copyright 23.05.2010 by Bochkanov Sergey *************************************************************************/
void alglib::mcpdsetprior(mcpdstate s, real_2d_array pp);
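
A short sketch (illustrative values only, not an official example) of replacing the default identity prior for a 2-state model; elements of the prior lie in [0,1] and every column sums to 1.0. The regularization coefficient r can be changed independently, see the next function.

#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "dataanalysis.h"

using namespace alglib;


int main(int argc, char **argv)
{
    mcpdstate s;
    mcpdcreate(2, s);
    real_2d_array pp = "[[0.5,0.5],[0.5,0.5]]";   // prior_P: uniform mixing instead of identity
    mcpdsetprior(s, pp);
    mcpdsettikhonovregularizer(s, 0.001);         // coefficient r of r*||P-prior_P||^2
    return 0;
}
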
/************************************************************************* This function allows to tune amount of Tikhonov regularization being applied to your problem. By default, regularizing term is equal to r*||P-prior_P||^2, where r is a small non-zero value, P is transition matrix, prior_P is identity matrix, ||X||^2 is a sum of squared elements of X. This function allows you to change coefficient r. You can also change prior values with MCPDSetPrior() function. INPUT PARAMETERS: S - solver V - regularization coefficient, finite non-negative value. It is not recommended to specify zero value unless you are pretty sure that you want it. -- ALGLIB -- Copyright 23.05.2010 by Bochkanov Sergey *************************************************************************/
void alglib::mcpdsettikhonovregularizer(mcpdstate s, double v);
/************************************************************************* This function is used to start solution of the MCPD problem. After return from this function, you can use MCPDResults() to get solution and completion code. -- ALGLIB -- Copyright 23.05.2010 by Bochkanov Sergey *************************************************************************/
void alglib::mcpdsolve(mcpdstate s);

Examples:   [1]  [2]  

#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "dataanalysis.h"

using namespace alglib;


int main(int argc, char **argv)
{
    //
    // The very simple MCPD example
    //
    // We have a loan portfolio. Our loans can be in one of two states:
    // * normal loans ("good" ones)
    // * past due loans ("bad" ones)
    //
    // We assume that:
    // * loans can transition from any state to any other state. In 
    //   particular, a past due loan can become a "good" one at any moment 
    //   with the same (fixed) probability. Not realistic, but it is a toy example :)
    // * portfolio size does not change over time
    //
    // Thus, we have following model
    //     state_new = P*state_old
    // where
    //         ( p00  p01 )
    //     P = (          )
    //         ( p10  p11 )
    //
    // We want to model transitions between these two states using MCPD
    // approach (Markov Chains for Proportional/Population Data), i.e.
    // to restore hidden transition matrix P using actual portfolio data.
    // We have:
    // * proportional data, i.e. proportion of loans in the normal and past 
    //   due states (not portfolio size measured in some currency, although 
    //   it is possible to work with population data too)
    // * two tracks, i.e. two sequences which describe portfolio
    //   evolution from two different starting states: [1,0] (all loans 
    //   are "good") and [0.8,0.2] (only 80% of portfolio is in the "good"
    //   state)
    //
    mcpdstate s;
    mcpdreport rep;
    real_2d_array p;
    real_2d_array track0 = "[[1.00000,0.00000],[0.95000,0.05000],[0.92750,0.07250],[0.91738,0.08263],[0.91282,0.08718]]";
    real_2d_array track1 = "[[0.80000,0.20000],[0.86000,0.14000],[0.88700,0.11300],[0.89915,0.10085]]";

    mcpdcreate(2, s);
    mcpdaddtrack(s, track0);
    mcpdaddtrack(s, track1);
    mcpdsolve(s);
    mcpdresults(s, p, rep);

    //
    // Hidden matrix P is equal to
    //         ( 0.95  0.50 )
    //         (            )
    //         ( 0.05  0.50 )
    // which means that "good" loans can become "bad" with 5% probability, 
    // while "bad" loans will return to good state with 50% probability.
    //
    printf("%s\n", p.tostring(2).c_str()); // EXPECTED: [[0.95,0.50],[0.05,0.50]]
    return 0;
}


#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "dataanalysis.h"

using namespace alglib;


int main(int argc, char **argv)
{
    //
    // Simple MCPD example
    //
    // We have a loan portfolio. Our loans can be in one of three states:
    // * normal loans
    // * past due loans
    // * charged off loans
    //
    // We assume that:
    // * normal loan can stay normal or become past due (but not charged off)
    // * past due loan can stay past due, become normal or charged off
    // * charged off loan will stay charged off for the rest of eternity
    // * portfolio size does not change over time
    // Not realistic, but it is a toy example :)
    //
    // Thus, we have following model
    //     state_new = P*state_old
    // where
    //         ( p00  p01    )
    //     P = ( p10  p11    )
    //         (      p21  1 )
    // i.e. four elements of P are known a priori.
    //
    // Although it is possible (given enough data) to estimate these values 
    // from the tracks alone, in order to enforce this property we set equality 
    // constraints on these elements.
    //
    // We want to model transitions between these states using MCPD
    // approach (Markov Chains for Proportional/Population Data), i.e.
    // to restore hidden transition matrix P using actual portfolio data.
    // We have:
    // * proportional data, i.e. proportion of loans in the current and past 
    //   due states (not portfolio size measured in some currency, although 
    //   it is possible to work with population data too)
    // * two tracks, i.e. two sequences which describe portfolio
    //   evolution from two different starting states: [1,0,0] (all loans 
    //   are "good") and [0.8,0.2,0.0] (only 80% of portfolio is in the "good"
    //   state)
    //
    mcpdstate s;
    mcpdreport rep;
    real_2d_array p;
    real_2d_array track0 = "[[1.000000,0.000000,0.000000],[0.950000,0.050000,0.000000],[0.927500,0.060000,0.012500],[0.911125,0.061375,0.027500],[0.896256,0.060900,0.042844]]";
    real_2d_array track1 = "[[0.800000,0.200000,0.000000],[0.860000,0.090000,0.050000],[0.862000,0.065500,0.072500],[0.851650,0.059475,0.088875],[0.838805,0.057451,0.103744]]";

    mcpdcreate(3, s);
    mcpdaddtrack(s, track0);
    mcpdaddtrack(s, track1);
    mcpdaddec(s, 0, 2, 0.0);
    mcpdaddec(s, 1, 2, 0.0);
    mcpdaddec(s, 2, 2, 1.0);
    mcpdaddec(s, 2, 0, 0.0);
    mcpdsolve(s);
    mcpdresults(s, p, rep);

    //
    // Hidden matrix P is equal to
    //         ( 0.95 0.50      )
    //         ( 0.05 0.25      )
    //         (      0.25 1.00 ) 
    // which means that "good" loans can become past due with 5% probability, 
    // while past due loans will become charged off with 25% probability or
    // return back to normal state with 50% probability.
    //
    printf("%s\n", p.tostring(2).c_str()); // EXPECTED: [[0.95,0.50,0.00],[0.05,0.25,0.00],[0.00,0.25,1.00]]
    return 0;
}


minbcreport
minbcstate
minbccreate
minbccreatef
minbcoptimize
minbcrequesttermination
minbcrestartfrom
minbcresults
minbcresultsbuf
minbcsetbc
minbcsetcond
minbcsetgradientcheck
minbcsetprecdefault
minbcsetprecdiag
minbcsetprecscale
minbcsetscale
minbcsetstpmax
minbcsetxrep
minbc_d_1 Nonlinear optimization with box constraints
minbc_numdiff Nonlinear optimization with bound constraints and numerical differentiation
/************************************************************************* This structure stores optimization report: * IterationsCount number of iterations * NFEV number of gradient evaluations * TerminationType termination type (see below) TERMINATION CODES TerminationType field contains completion code, which can be: -8 internal integrity control detected infinite or NAN values in function/gradient. Abnormal termination signalled. -7 gradient verification failed. See MinBCSetGradientCheck() for more information. -3 inconsistent constraints. 1 relative function improvement is no more than EpsF. 2 relative step is no more than EpsX. 4 gradient norm is no more than EpsG 5 MaxIts steps was taken 7 stopping conditions are too stringent, further improvement is impossible, X contains best point found so far. 8 terminated by user who called minbcrequesttermination(). X contains point which was "current accepted" when termination request was submitted. ADDITIONAL FIELDS There are additional fields which can be used for debugging: * DebugEqErr error in the equality constraints (2-norm) * DebugFS f, calculated at projection of initial point to the feasible set * DebugFF f, calculated at the final point * DebugDX |X_start-X_final| *************************************************************************/
class minbcreport { ae_int_t iterationscount; ae_int_t nfev; ae_int_t varidx; ae_int_t terminationtype; };
/************************************************************************* This object stores nonlinear optimizer state. You should use functions provided by MinBC subpackage to work with this object *************************************************************************/
class minbcstate { };
/************************************************************************* BOX CONSTRAINED OPTIMIZATION WITH FAST ACTIVATION OF MULTIPLE BOX CONSTRAINTS DESCRIPTION: The subroutine minimizes function F(x) of N arguments subject to box constraints (with some of box constraints actually being equality ones). This optimizer uses algorithm similar to that of MinBLEIC (optimizer with general linear constraints), but presence of box-only constraints allows us to use faster constraint activation strategies. On large-scale problems, with multiple constraints active at the solution, this optimizer can be several times faster than BLEIC. REQUIREMENTS: * user must provide function value and gradient * starting point X0 must be feasible or not too far away from the feasible set * grad(f) must be Lipschitz continuous on a level set: L = { x : f(x)<=f(x0) } * function must be defined everywhere on the feasible set F USAGE: Constrained optimization is far more complex than the unconstrained one. Here we give very brief outline of the BC optimizer. We strongly recommend you to read examples in the ALGLIB Reference Manual and to read ALGLIB User Guide on optimization, which is available at http://www.alglib.net/optimization/ 1. User initializes algorithm state with MinBCCreate() call 2. User adds box constraints by calling MinBCSetBC() function. 3. User sets stopping conditions with MinBCSetCond(). 4. User calls MinBCOptimize() function which takes algorithm state and pointer (delegate, etc.) to callback function which calculates F/G. 5. User calls MinBCResults() to get solution 6. Optionally user may call MinBCRestartFrom() to solve another problem with same N but another starting point. MinBCRestartFrom() allows to reuse already initialized structure. INPUT PARAMETERS: N - problem dimension, N>0: * if given, only leading N elements of X are used * if not given, automatically determined from size of X X - starting point, array[N]: * it is better to set X to a feasible point * but X can be infeasible, in which case algorithm will try to find feasible point first, using X as initial approximation. OUTPUT PARAMETERS: State - structure stores algorithm state -- ALGLIB -- Copyright 28.11.2010 by Bochkanov Sergey *************************************************************************/
void alglib::minbccreate(real_1d_array x, minbcstate& state);
void alglib::minbccreate(ae_int_t n, real_1d_array x, minbcstate& state);

Examples:   [1]  

/************************************************************************* The subroutine is finite difference variant of MinBCCreate(). It uses finite differences in order to differentiate target function. Description below contains information which is specific to this function only. We recommend to read comments on MinBCCreate() in order to get more information about creation of BC optimizer. INPUT PARAMETERS: N - problem dimension, N>0: * if given, only leading N elements of X are used * if not given, automatically determined from size of X X - starting point, array[0..N-1]. DiffStep- differentiation step, >0 OUTPUT PARAMETERS: State - structure which stores algorithm state NOTES: 1. algorithm uses 4-point central formula for differentiation. 2. differentiation step along I-th axis is equal to DiffStep*S[I] where S[] is scaling vector which can be set by MinBCSetScale() call. 3. we recommend you to use moderate values of differentiation step. Too large step will result in too large truncation errors, while too small step will result in too large numerical errors. 1.0E-6 can be good value to start with. 4. Numerical differentiation is very inefficient - one gradient calculation needs 4*N function evaluations. This function will work for any N - either small (1...10), moderate (10...100) or large (100...). However, performance penalty will be too severe for any N's except for small ones. We should also say that code which relies on numerical differentiation is less robust and precise. CG needs exact gradient values. Imprecise gradient may slow down convergence, especially on highly nonlinear problems. Thus we recommend to use this function for fast prototyping on small- dimensional problems only, and to implement analytical gradient as soon as possible. -- ALGLIB -- Copyright 16.05.2011 by Bochkanov Sergey *************************************************************************/
void alglib::minbccreatef(real_1d_array x, double diffstep, minbcstate& state);
void alglib::minbccreatef(ae_int_t n, real_1d_array x, double diffstep, minbcstate& state);

Examples:   [1]  

/************************************************************************* This family of functions is used to launch iterations of nonlinear optimizer. These functions accept following parameters: state - algorithm state func - callback which calculates function (or merit function) value func at given point x grad - callback which calculates function (or merit function) value func and gradient grad at given point x rep - optional callback which is called after each iteration can be NULL ptr - optional pointer which is passed to func/grad/hess/jac/rep can be NULL NOTES: 1. This function has two different implementations: one which uses exact (analytical) user-supplied gradient, and one which uses function value only and numerically differentiates function in order to obtain gradient. Depending on the specific function used to create optimizer object (either MinBCCreate() for analytical gradient or MinBCCreateF() for numerical differentiation) you should choose appropriate variant of MinBCOptimize() - one which accepts function AND gradient or one which accepts function ONLY. Be careful to choose variant of MinBCOptimize() which corresponds to your optimization scheme! Table below lists different combinations of callback (function/gradient) passed to MinBCOptimize() and specific function used to create optimizer. | USER PASSED TO MinBCOptimize() CREATED WITH | function only | function and gradient ------------------------------------------------------------ MinBCCreateF() | works FAILS MinBCCreate() | FAILS works Here "FAIL" denotes inappropriate combinations of optimizer creation function and MinBCOptimize() version. Attempts to use such combination (for example, to create optimizer with MinBCCreateF() and to pass gradient information to MinBCOptimize()) will lead to exception being thrown. Either you did not pass gradient when it WAS needed or you passed gradient when it was NOT needed. -- ALGLIB -- Copyright 28.11.2010 by Bochkanov Sergey *************************************************************************/
void minbcoptimize(minbcstate &state,
    void (*func)(const real_1d_array &x, double &func, void *ptr),
    void (*rep)(const real_1d_array &x, double func, void *ptr) = NULL,
    void *ptr = NULL);
void minbcoptimize(minbcstate &state,
    void (*grad)(const real_1d_array &x, double &func, real_1d_array &grad, void *ptr),
    void (*rep)(const real_1d_array &x, double func, void *ptr) = NULL,
    void *ptr = NULL);

Examples:   [1]  [2]  
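
For reference, the typical call sequence (create the optimizer, set box constraints, set stopping conditions, run the optimizer with a gradient callback, read results) is sketched below. The test function f(x0,x1)=100*(x0+3)^4+(x1-3)^4 and all numeric values are illustrative; this is a sketch, not a verbatim copy of the referenced examples.

#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "optimization.h"

using namespace alglib;

// callback matching the "function and gradient" variant of minbcoptimize()
void function1_grad(const real_1d_array &x, double &func, real_1d_array &grad, void *ptr)
{
    func = 100*pow(x[0]+3,4) + pow(x[1]-3,4);
    grad[0] = 400*pow(x[0]+3,3);
    grad[1] = 4*pow(x[1]-3,3);
}

int main(int argc, char **argv)
{
    real_1d_array x = "[0,0]";
    real_1d_array bndl = "[-1,-1]";
    real_1d_array bndu = "[+1,+1]";
    minbcstate state;
    minbcreport rep;
    double epsg = 0.000001;
    double epsf = 0;
    double epsx = 0;
    ae_int_t maxits = 0;

    minbccreate(x, state);
    minbcsetbc(state, bndl, bndu);
    minbcsetcond(state, epsg, epsf, epsx, maxits);
    minbcoptimize(state, function1_grad);
    minbcresults(state, x, rep);

    printf("%d\n", int(rep.terminationtype)); // positive value on success
    printf("%s\n", x.tostring(2).c_str());    // constrained minimum is near [-1,+1]
    return 0;
}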

/************************************************************************* This subroutine submits request for termination of running optimizer. It should be called from user-supplied callback when user decides that it is time to "smoothly" terminate optimization process. As result, optimizer stops at point which was "current accepted" when termination request was submitted and returns error code 8 (successful termination). INPUT PARAMETERS: State - optimizer structure NOTE: after request for termination optimizer may perform several additional calls to user-supplied callbacks. It does NOT guarantee to stop immediately - it just guarantees that these additional calls will be discarded later. NOTE: calling this function on optimizer which is NOT running will have no effect. NOTE: multiple calls to this function are possible. First call is counted, subsequent calls are silently ignored. -- ALGLIB -- Copyright 08.10.2014 by Bochkanov Sergey *************************************************************************/
void alglib::minbcrequesttermination(minbcstate state);
/************************************************************************* This subroutine restarts algorithm from new point. All optimization parameters (including constraints) are left unchanged. This function allows to solve multiple optimization problems (which must have same number of dimensions) without object reallocation penalty. INPUT PARAMETERS: State - structure previously allocated with MinBCCreate call. X - new starting point. -- ALGLIB -- Copyright 28.11.2010 by Bochkanov Sergey *************************************************************************/
void alglib::minbcrestartfrom(minbcstate state, real_1d_array x);
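
As a quick illustration (not from the original manual), the fragment below reuses one optimizer object for two starting points; it assumes the x, bndl, bndu, state, rep variables and the function1_grad callback from the first MinBC example program at the end of this subpackage.

    // first solve: create once, constrain, optimize
    minbccreate(x, state);
    minbcsetbc(state, bndl, bndu);
    minbcsetcond(state, 1.0e-6, 0, 0, 0);
    alglib::minbcoptimize(state, function1_grad);
    minbcresults(state, x, rep);

    // second solve: same dimensionality and constraints, new starting point;
    // the already initialized structure is reused, no reallocation occurs
    real_1d_array x2 = "[0.5,-0.5]";
    minbcrestartfrom(state, x2);
    alglib::minbcoptimize(state, function1_grad);
    minbcresults(state, x, rep);
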
/************************************************************************* BC results INPUT PARAMETERS: State - algorithm state OUTPUT PARAMETERS: X - array[0..N-1], solution Rep - optimization report. You should check Rep.TerminationType in order to distinguish successful termination from unsuccessful one: * -8 internal integrity control detected infinite or NAN values in function/gradient. Abnormal termination signalled. * -7 gradient verification failed. See MinBCSetGradientCheck() for more information. * -3 inconsistent constraints. * 1 relative function improvement is no more than EpsF. * 2 scaled step is no more than EpsX. * 4 scaled gradient norm is no more than EpsG. * 5 MaxIts steps was taken * 8 terminated by user who called minbcrequesttermination(). X contains point which was "current accepted" when termination request was submitted. More information about fields of this structure can be found in the comments on MinBCReport datatype. -- ALGLIB -- Copyright 28.11.2010 by Bochkanov Sergey *************************************************************************/
void alglib::minbcresults( minbcstate state, real_1d_array& x, minbcreport& rep);

Examples:   [1]  [2]  

/************************************************************************* BC results Buffered implementation of MinBCResults() which uses pre-allocated buffer to store X[]. If buffer size is too small, it resizes buffer. It is intended to be used in the inner cycles of performance critical algorithms where array reallocation penalty is too large to be ignored. -- ALGLIB -- Copyright 28.11.2010 by Bochkanov Sergey *************************************************************************/
void alglib::minbcresultsbuf( minbcstate state, real_1d_array& x, minbcreport& rep);
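
A small sketch of the buffered variant inside a loop; it assumes a state which has already been created and configured, the function1_grad callback from the examples below, and hypothetical user data nproblems/startingpoints (none of these come from the manual).

    // xbuf is allocated once; minbcresultsbuf() reuses it and only grows it if needed
    real_1d_array xbuf;
    xbuf.setlength(2);
    minbcreport rep;
    for(int k=0; k<nproblems; k++)
    {
        minbcrestartfrom(state, startingpoints[k]);
        alglib::minbcoptimize(state, function1_grad);
        minbcresultsbuf(state, xbuf, rep);
        // ... consume xbuf here without per-iteration reallocation ...
    }
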
/************************************************************************* This function sets boundary constraints for BC optimizer. Boundary constraints are inactive by default (after initial creation). They are preserved after algorithm restart with MinBCRestartFrom(). INPUT PARAMETERS: State - structure stores algorithm state BndL - lower bounds, array[N]. If some (all) variables are unbounded, you may specify very small number or -INF. BndU - upper bounds, array[N]. If some (all) variables are unbounded, you may specify very large number or +INF. NOTE 1: it is possible to specify BndL[i]=BndU[i]. In this case I-th variable will be "frozen" at X[i]=BndL[i]=BndU[i]. NOTE 2: this solver has following useful properties: * bound constraints are always satisfied exactly * function is evaluated only INSIDE area specified by bound constraints, even when numerical differentiation is used (algorithm adjusts nodes according to boundary constraints) -- ALGLIB -- Copyright 28.11.2010 by Bochkanov Sergey *************************************************************************/
void alglib::minbcsetbc( minbcstate state, real_1d_array bndl, real_1d_array bndu);

Examples:   [1]  [2]  

/************************************************************************* This function sets stopping conditions for the optimizer. INPUT PARAMETERS: State - structure which stores algorithm state EpsG - >=0 The subroutine finishes its work if the condition |v|<EpsG is satisfied, where: * |.| means Euclidian norm * v - scaled gradient vector, v[i]=g[i]*s[i] * g - gradient * s - scaling coefficients set by MinBCSetScale() EpsF - >=0 The subroutine finishes its work if on k+1-th iteration the condition |F(k+1)-F(k)|<=EpsF*max{|F(k)|,|F(k+1)|,1} is satisfied. EpsX - >=0 The subroutine finishes its work if on k+1-th iteration the condition |v|<=EpsX is fulfilled, where: * |.| means Euclidian norm * v - scaled step vector, v[i]=dx[i]/s[i] * dx - step vector, dx=X(k+1)-X(k) * s - scaling coefficients set by MinBCSetScale() MaxIts - maximum number of iterations. If MaxIts=0, the number of iterations is unlimited. Passing EpsG=0, EpsF=0 and EpsX=0 and MaxIts=0 (simultaneously) will lead to automatic stopping criterion selection. NOTE: when SetCond() called with non-zero MaxIts, BC solver may perform slightly more than MaxIts iterations. I.e., MaxIts sets non-strict limit on iterations count. -- ALGLIB -- Copyright 28.11.2010 by Bochkanov Sergey *************************************************************************/
void alglib::minbcsetcond( minbcstate state, double epsg, double epsf, double epsx, ae_int_t maxits);

Examples:   [1]  [2]  

/************************************************************************* This subroutine turns on verification of the user-supplied analytic gradient: * user calls this subroutine before optimization begins * MinBCOptimize() is called * prior to actual optimization, for each component of parameters being optimized X[i] algorithm performs following steps: * two trial steps are made to X[i]-TestStep*S[i] and X[i]+TestStep*S[i], where X[i] is i-th component of the initial point and S[i] is a scale of i-th parameter * if needed, steps are bounded with respect to constraints on X[] * F(X) is evaluated at these trial points * we perform one more evaluation in the middle point of the interval * we build cubic model using function values and derivatives at trial points and we compare its prediction with actual value in the middle point * in case difference between prediction and actual value is higher than some predetermined threshold, algorithm stops with completion code -7; Rep.VarIdx is set to index of the parameter with incorrect derivative. * after verification is over, algorithm proceeds to the actual optimization. NOTE 1: verification needs N (parameters count) gradient evaluations. It is very costly and you should use it only for low dimensional problems, when you want to be sure that you've correctly calculated analytic derivatives. You should not use it in the production code (unless you want to check derivatives provided by some third party). NOTE 2: you should carefully choose TestStep. Value which is too large (so large that function behaviour is significantly non-cubic) will lead to false alarms. You may use different step for different parameters by means of setting scale with MinBCSetScale(). NOTE 3: this function may lead to false positives. In case it reports that I-th derivative was calculated incorrectly, you may decrease test step and try one more time - maybe your function changes too sharply and your step is too large for such rapidly changing function. INPUT PARAMETERS: State - structure used to store algorithm state TestStep - verification step: * TestStep=0 turns verification off * TestStep>0 activates verification -- ALGLIB -- Copyright 15.06.2012 by Bochkanov Sergey *************************************************************************/
void alglib::minbcsetgradientcheck(minbcstate state, double teststep);
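
No example is linked for this function, so here is a short hedged sketch; it assumes the x, bndl, bndu, state, rep variables and the function1_grad callback from the first MinBC example below, and the test step of 1.0E-3 is only an illustrative value.

    minbccreate(x, state);
    minbcsetbc(state, bndl, bndu);
    minbcsetcond(state, 1.0e-6, 0, 0, 0);
    minbcsetgradientcheck(state, 1.0e-3);   // turn verification on before optimization
    alglib::minbcoptimize(state, function1_grad);
    minbcresults(state, x, rep);
    if( rep.terminationtype==-7 )
        printf("suspicious derivative, variable index = %d\n", int(rep.varidx));
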
/************************************************************************* Modification of the preconditioner: preconditioning is turned off. INPUT PARAMETERS: State - structure which stores algorithm state -- ALGLIB -- Copyright 13.10.2010 by Bochkanov Sergey *************************************************************************/
void alglib::minbcsetprecdefault(minbcstate state);
/************************************************************************* Modification of the preconditioner: diagonal of approximate Hessian is used. INPUT PARAMETERS: State - structure which stores algorithm state D - diagonal of the approximate Hessian, array[0..N-1], (if larger, only leading N elements are used). NOTE 1: D[i] should be positive. Exception will be thrown otherwise. NOTE 2: you should pass diagonal of approximate Hessian - NOT ITS INVERSE. -- ALGLIB -- Copyright 13.10.2010 by Bochkanov Sergey *************************************************************************/
void alglib::minbcsetprecdiag(minbcstate state, real_1d_array d);
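
For illustration only, assuming state is a MinBC optimizer created as in the examples below: for the example target f(x0,x1) = 100*(x0+3)^4 + (x1-3)^4 the second derivatives at the starting point [0,0] are 1200*(0+3)^2 = 10800 and 12*(0-3)^2 = 108, so a rough diagonal preconditioner could be set as follows (the numbers are specific to this assumed example).

    // diagonal of the approximate Hessian - NOT its inverse; entries must be positive
    real_1d_array h = "[10800, 108]";
    minbcsetprecdiag(state, h);
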
/************************************************************************* Modification of the preconditioner: scale-based diagonal preconditioning. This preconditioning mode can be useful when you don't have approximate diagonal of Hessian, but you know that your variables are badly scaled (for example, one variable is in [1,10], and another in [1000,100000]), and most part of the ill-conditioning comes from different scales of vars. In this case simple scale-based preconditioner, with H[i] = 1/(s[i]^2), can greatly improve convergence. IMPORTANT: you should set scale of your variables with MinBCSetScale() call (before or after MinBCSetPrecScale() call). Without knowledge of the scale of your variables scale-based preconditioner will be just unit matrix. INPUT PARAMETERS: State - structure which stores algorithm state -- ALGLIB -- Copyright 13.10.2010 by Bochkanov Sergey *************************************************************************/
void alglib::minbcsetprecscale(minbcstate state);
/************************************************************************* This function sets scaling coefficients for BC optimizer. ALGLIB optimizers use scaling matrices to test stopping conditions (step size and gradient are scaled before comparison with tolerances). Scale of the I-th variable is a translation invariant measure of: a) "how large" the variable is b) how large the step should be to make significant changes in the function Scaling is also used by finite difference variant of the optimizer - step along I-th axis is equal to DiffStep*S[I]. In most optimizers (and in the BC too) scaling is NOT a form of preconditioning. It just affects stopping conditions. You should set preconditioner by separate call to one of the MinBCSetPrec...() functions. There is a special preconditioning mode, however, which uses scaling coefficients to form diagonal preconditioning matrix. You can turn this mode on, if you want. But you should understand that scaling is not the same thing as preconditioning - these are two different, although related forms of tuning solver. INPUT PARAMETERS: State - structure stores algorithm state S - array[N], non-zero scaling coefficients S[i] may be negative, sign doesn't matter. -- ALGLIB -- Copyright 14.01.2011 by Bochkanov Sergey *************************************************************************/
void alglib::minbcsetscale(minbcstate state, real_1d_array s);
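
A hedged sketch of a typical call, for a hypothetical problem where the first variable is of order 1 and the second of order 1.0E+5 (neither the problem nor the values come from the manual; state is assumed to be an already created MinBC optimizer):

    // tell the optimizer about typical magnitudes of the variables
    real_1d_array s = "[1, 1.0e+5]";
    minbcsetscale(state, s);

    // optionally reuse the same scales as a diagonal preconditioner
    minbcsetprecscale(state);

    // note: with numerical differentiation (minbccreatef) the step along
    // axis I becomes DiffStep*s[I]
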
/************************************************************************* This function sets maximum step length INPUT PARAMETERS: State - structure which stores algorithm state StpMax - maximum step length, >=0. Set StpMax to 0.0, if you don't want to limit step length. Use this subroutine when you optimize target function which contains exp() or other fast growing functions, and optimization algorithm makes too large steps which lead to overflow. This function allows us to reject steps that are too large (and therefore expose us to the possible overflow) without actually calculating function value at the x+stp*d. -- ALGLIB -- Copyright 02.04.2010 by Bochkanov Sergey *************************************************************************/
void alglib::minbcsetstpmax(minbcstate state, double stpmax);
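
For example (an illustrative value and an assumed, already created state, not a recommendation), a target containing exp() could be protected against overflow-sized trial steps like this:

    minbcsetstpmax(state, 10.0);   // trial steps longer than 10 are rejected
    // minbcsetstpmax(state, 0.0) would remove the limit again
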
/************************************************************************* This function turns on/off reporting. INPUT PARAMETERS: State - structure which stores algorithm state NeedXRep- whether iteration reports are needed or not If NeedXRep is True, algorithm will call rep() callback function if it is provided to MinBCOptimize(). -- ALGLIB -- Copyright 28.11.2010 by Bochkanov Sergey *************************************************************************/
void alglib::minbcsetxrep(minbcstate state, bool needxrep);
#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "optimization.h"

using namespace alglib;
void function1_grad(const real_1d_array &x, double &func, real_1d_array &grad, void *ptr) 
{
    //
    // this callback calculates f(x0,x1) = 100*(x0+3)^4 + (x1-3)^4
    // and its derivatives df/dx0 and df/dx1
    //
    func = 100*pow(x[0]+3,4) + pow(x[1]-3,4);
    grad[0] = 400*pow(x[0]+3,3);
    grad[1] = 4*pow(x[1]-3,3);
}

int main(int argc, char **argv)
{
    //
    // This example demonstrates minimization of f(x,y) = 100*(x+3)^4+(y-3)^4
    // subject to bound constraints -1<=x<=+1, -1<=y<=+1, using MinBC optimizer.
    //
    real_1d_array x = "[0,0]";
    real_1d_array bndl = "[-1,-1]";
    real_1d_array bndu = "[+1,+1]";
    minbcstate state;
    minbcreport rep;

    //
    // These variables define stopping conditions for the optimizer.
    //
    // We use very simple condition - |g|<=epsg
    //
    double epsg = 0.000001;
    double epsf = 0;
    double epsx = 0;
    ae_int_t maxits = 0;

    //
    // Now we are ready to actually optimize something:
    // * first we create optimizer
    // * we add boundary constraints
    // * we tune stopping conditions
    // * and, finally, optimize and obtain results...
    //
    minbccreate(x, state);
    minbcsetbc(state, bndl, bndu);
    minbcsetcond(state, epsg, epsf, epsx, maxits);
    alglib::minbcoptimize(state, function1_grad);
    minbcresults(state, x, rep);

    //
    // ...and evaluate these results
    //
    printf("%d\n", int(rep.terminationtype)); // EXPECTED: 4
    printf("%s\n", x.tostring(2).c_str()); // EXPECTED: [-1,1]
    return 0;
}


#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "optimization.h"

using namespace alglib;
void function1_func(const real_1d_array &x, double &func, void *ptr)
{
    //
    // this callback calculates f(x0,x1) = 100*(x0+3)^4 + (x1-3)^4
    //
    func = 100*pow(x[0]+3,4) + pow(x[1]-3,4);
}

int main(int argc, char **argv)
{
    //
    // This example demonstrates minimization of f(x,y) = 100*(x+3)^4+(y-3)^4
    // subject to bound constraints -1<=x<=+1, -1<=y<=+1, using MinBC optimizer.
    //
    real_1d_array x = "[0,0]";
    real_1d_array bndl = "[-1,-1]";
    real_1d_array bndu = "[+1,+1]";
    minbcstate state;
    minbcreport rep;

    //
    // These variables define stopping conditions for the optimizer.
    //
    // We use very simple condition - |g|<=epsg
    //
    double epsg = 0.000001;
    double epsf = 0;
    double epsx = 0;
    ae_int_t maxits = 0;

    //
    // This variable contains differentiation step
    //
    double diffstep = 1.0e-6;

    //
    // Now we are ready to actually optimize something:
    // * first we create optimizer
    // * we add boundary constraints
    // * we tune stopping conditions
    // * and, finally, optimize and obtain results...
    //
    minbccreatef(x, diffstep, state);
    minbcsetbc(state, bndl, bndu);
    minbcsetcond(state, epsg, epsf, epsx, maxits);
    alglib::minbcoptimize(state, function1_func);
    minbcresults(state, x, rep);

    //
    // ...and evaluate these results
    //
    printf("%d\n", int(rep.terminationtype)); // EXPECTED: 4
    printf("%s\n", x.tostring(2).c_str()); // EXPECTED: [-1,1]
    return 0;
}


minbleicreport
minbleicstate
minbleiccreate
minbleiccreatef
minbleicoptimize
minbleicrequesttermination
minbleicrestartfrom
minbleicresults
minbleicresultsbuf
minbleicsetbc
minbleicsetcond
minbleicsetgradientcheck
minbleicsetlc
minbleicsetprecdefault
minbleicsetprecdiag
minbleicsetprecscale
minbleicsetscale
minbleicsetstpmax
minbleicsetxrep
minbleic_d_1 Nonlinear optimization with bound constraints
minbleic_d_2 Nonlinear optimization with linear inequality constraints
minbleic_ftrim Nonlinear optimization by BLEIC, function with singularities
minbleic_numdiff Nonlinear optimization with bound constraints and numerical differentiation
/************************************************************************* This structure stores optimization report: * IterationsCount number of iterations * NFEV number of gradient evaluations * TerminationType termination type (see below) TERMINATION CODES TerminationType field contains completion code, which can be: -8 internal integrity control detected infinite or NAN values in function/gradient. Abnormal termination signalled. -7 gradient verification failed. See MinBLEICSetGradientCheck() for more information. -3 inconsistent constraints. Feasible point is either nonexistent or too hard to find. Try to restart optimizer with better initial approximation 1 relative function improvement is no more than EpsF. 2 relative step is no more than EpsX. 4 gradient norm is no more than EpsG 5 MaxIts steps was taken 7 stopping conditions are too stringent, further improvement is impossible, X contains best point found so far. 8 terminated by user who called minbleicrequesttermination(). X contains point which was "current accepted" when termination request was submitted. ADDITIONAL FIELDS There are additional fields which can be used for debugging: * DebugEqErr error in the equality constraints (2-norm) * DebugFS f, calculated at projection of initial point to the feasible set * DebugFF f, calculated at the final point * DebugDX |X_start-X_final| *************************************************************************/
class minbleicreport { ae_int_t iterationscount; ae_int_t nfev; ae_int_t varidx; ae_int_t terminationtype; double debugeqerr; double debugfs; double debugff; double debugdx; ae_int_t debugfeasqpits; ae_int_t debugfeasgpaits; ae_int_t inneriterationscount; ae_int_t outeriterationscount; };
/************************************************************************* This object stores nonlinear optimizer state. You should use functions provided by MinBLEIC subpackage to work with this object *************************************************************************/
class minbleicstate { };
/************************************************************************* BOUND CONSTRAINED OPTIMIZATION WITH ADDITIONAL LINEAR EQUALITY AND INEQUALITY CONSTRAINTS DESCRIPTION: The subroutine minimizes function F(x) of N arguments subject to any combination of: * bound constraints * linear inequality constraints * linear equality constraints REQUIREMENTS: * user must provide function value and gradient * starting point X0 must be feasible or not too far away from the feasible set * grad(f) must be Lipschitz continuous on a level set: L = { x : f(x)<=f(x0) } * function must be defined everywhere on the feasible set F USAGE: Constrained optimization is far more complex than the unconstrained one. Here we give very brief outline of the BLEIC optimizer. We strongly recommend you to read examples in the ALGLIB Reference Manual and to read ALGLIB User Guide on optimization, which is available at http://www.alglib.net/optimization/ 1. User initializes algorithm state with MinBLEICCreate() call 2. User adds boundary and/or linear constraints by calling MinBLEICSetBC() and MinBLEICSetLC() functions. 3. User sets stopping conditions with MinBLEICSetCond(). 4. User calls MinBLEICOptimize() function which takes algorithm state and pointer (delegate, etc.) to callback function which calculates F/G. 5. User calls MinBLEICResults() to get solution 6. Optionally user may call MinBLEICRestartFrom() to solve another problem with same N but another starting point. MinBLEICRestartFrom() allows to reuse already initialized structure. NOTE: if you have box-only constraints (no general linear constraints), then MinBC optimizer can be better option. It uses special, faster constraint activation method, which performs better on problems with multiple constraints active at the solution. On small-scale problems performance of MinBC is similar to that of MinBLEIC, but on large-scale ones (hundreds and thousands of active constraints) it can be several times faster than MinBLEIC. INPUT PARAMETERS: N - problem dimension, N>0: * if given, only leading N elements of X are used * if not given, automatically determined from size of X X - starting point, array[N]: * it is better to set X to a feasible point * but X can be infeasible, in which case algorithm will try to find feasible point first, using X as initial approximation. OUTPUT PARAMETERS: State - structure stores algorithm state -- ALGLIB -- Copyright 28.11.2010 by Bochkanov Sergey *************************************************************************/
void alglib::minbleiccreate(real_1d_array x, minbleicstate& state); void alglib::minbleiccreate( ae_int_t n, real_1d_array x, minbleicstate& state);

Examples:   [1]  [2]  [3]  

/************************************************************************* The subroutine is finite difference variant of MinBLEICCreate(). It uses finite differences in order to differentiate target function. Description below contains information which is specific to this function only. We recommend to read comments on MinBLEICCreate() in order to get more information about creation of BLEIC optimizer. INPUT PARAMETERS: N - problem dimension, N>0: * if given, only leading N elements of X are used * if not given, automatically determined from size of X X - starting point, array[0..N-1]. DiffStep- differentiation step, >0 OUTPUT PARAMETERS: State - structure which stores algorithm state NOTES: 1. algorithm uses 4-point central formula for differentiation. 2. differentiation step along I-th axis is equal to DiffStep*S[I] where S[] is scaling vector which can be set by MinBLEICSetScale() call. 3. we recommend you to use moderate values of differentiation step. Too large step will result in too large truncation errors, while too small step will result in too large numerical errors. 1.0E-6 can be good value to start with. 4. Numerical differentiation is very inefficient - one gradient calculation needs 4*N function evaluations. This function will work for any N - either small (1...10), moderate (10...100) or large (100...). However, performance penalty will be too severe for any N's except for small ones. We should also say that code which relies on numerical differentiation is less robust and precise. CG needs exact gradient values. Imprecise gradient may slow down convergence, especially on highly nonlinear problems. Thus we recommend to use this function for fast prototyping on small- dimensional problems only, and to implement analytical gradient as soon as possible. -- ALGLIB -- Copyright 16.05.2011 by Bochkanov Sergey *************************************************************************/
void alglib::minbleiccreatef( real_1d_array x, double diffstep, minbleicstate& state); void alglib::minbleiccreatef( ae_int_t n, real_1d_array x, double diffstep, minbleicstate& state);

Examples:   [1]  

/************************************************************************* This family of functions is used to launch iterations of nonlinear optimizer. These functions accept following parameters: state - algorithm state func - callback which calculates function (or merit function) value func at given point x grad - callback which calculates function (or merit function) value func and gradient grad at given point x rep - optional callback which is called after each iteration can be NULL ptr - optional pointer which is passed to func/grad/hess/jac/rep can be NULL NOTES: 1. This function has two different implementations: one which uses exact (analytical) user-supplied gradient, and one which uses function value only and numerically differentiates function in order to obtain gradient. Depending on the specific function used to create optimizer object (either MinBLEICCreate() for analytical gradient or MinBLEICCreateF() for numerical differentiation) you should choose appropriate variant of MinBLEICOptimize() - one which accepts function AND gradient or one which accepts function ONLY. Be careful to choose variant of MinBLEICOptimize() which corresponds to your optimization scheme! Table below lists different combinations of callback (function/gradient) passed to MinBLEICOptimize() and specific function used to create optimizer. | USER PASSED TO MinBLEICOptimize() CREATED WITH | function only | function and gradient ------------------------------------------------------------ MinBLEICCreateF() | work FAIL MinBLEICCreate() | FAIL work Here "FAIL" denotes inappropriate combinations of optimizer creation function and MinBLEICOptimize() version. Attempts to use such combination (for example, to create optimizer with MinBLEICCreateF() and to pass gradient information to MinBLEICOptimize()) will lead to exception being thrown. Either you did not pass gradient when it WAS needed or you passed gradient when it was NOT needed. -- ALGLIB -- Copyright 28.11.2010 by Bochkanov Sergey *************************************************************************/
void minbleicoptimize(minbleicstate &state, void (*func)(const real_1d_array &x, double &func, void *ptr), void (*rep)(const real_1d_array &x, double func, void *ptr) = NULL, void *ptr = NULL); void minbleicoptimize(minbleicstate &state, void (*grad)(const real_1d_array &x, double &func, real_1d_array &grad, void *ptr), void (*rep)(const real_1d_array &x, double func, void *ptr) = NULL, void *ptr = NULL);

Examples:   [1]  [2]  [3]  [4]  

/************************************************************************* This subroutine submits request for termination of running optimizer. It should be called from user-supplied callback when user decides that it is time to "smoothly" terminate optimization process. As result, optimizer stops at point which was "current accepted" when termination request was submitted and returns error code 8 (successful termination). INPUT PARAMETERS: State - optimizer structure NOTE: after request for termination optimizer may perform several additional calls to user-supplied callbacks. It does NOT guarantee to stop immediately - it just guarantees that these additional calls will be discarded later. NOTE: calling this function on optimizer which is NOT running will have no effect. NOTE: multiple calls to this function are possible. First call is counted, subsequent calls are silently ignored. -- ALGLIB -- Copyright 08.10.2014 by Bochkanov Sergey *************************************************************************/
void alglib::minbleicrequesttermination(minbleicstate state);
/************************************************************************* This subroutine restarts algorithm from new point. All optimization parameters (including constraints) are left unchanged. This function allows to solve multiple optimization problems (which must have same number of dimensions) without object reallocation penalty. INPUT PARAMETERS: State - structure previously allocated with MinBLEICCreate call. X - new starting point. -- ALGLIB -- Copyright 28.11.2010 by Bochkanov Sergey *************************************************************************/
void alglib::minbleicrestartfrom(minbleicstate state, real_1d_array x);
/************************************************************************* BLEIC results INPUT PARAMETERS: State - algorithm state OUTPUT PARAMETERS: X - array[0..N-1], solution Rep - optimization report. You should check Rep.TerminationType in order to distinguish successful termination from unsuccessful one: * -8 internal integrity control detected infinite or NAN values in function/gradient. Abnormal termination signalled. * -7 gradient verification failed. See MinBLEICSetGradientCheck() for more information. * -3 inconsistent constraints. Feasible point is either nonexistent or too hard to find. Try to restart optimizer with better initial approximation * 1 relative function improvement is no more than EpsF. * 2 scaled step is no more than EpsX. * 4 scaled gradient norm is no more than EpsG. * 5 MaxIts steps was taken * 8 terminated by user who called minbleicrequesttermination(). X contains point which was "current accepted" when termination request was submitted. More information about fields of this structure can be found in the comments on MinBLEICReport datatype. -- ALGLIB -- Copyright 28.11.2010 by Bochkanov Sergey *************************************************************************/
void alglib::minbleicresults( minbleicstate state, real_1d_array& x, minbleicreport& rep);

Examples:   [1]  [2]  [3]  [4]  

/************************************************************************* BLEIC results Buffered implementation of MinBLEICResults() which uses pre-allocated buffer to store X[]. If buffer size is too small, it resizes buffer. It is intended to be used in the inner cycles of performance critical algorithms where array reallocation penalty is too large to be ignored. -- ALGLIB -- Copyright 28.11.2010 by Bochkanov Sergey *************************************************************************/
void alglib::minbleicresultsbuf( minbleicstate state, real_1d_array& x, minbleicreport& rep);
/************************************************************************* This function sets boundary constraints for BLEIC optimizer. Boundary constraints are inactive by default (after initial creation). They are preserved after algorithm restart with MinBLEICRestartFrom(). NOTE: if you have box-only constraints (no general linear constraints), then MinBC optimizer can be better option. It uses special, faster constraint activation method, which performs better on problems with multiple constraints active at the solution. On small-scale problems performance of MinBC is similar to that of MinBLEIC, but on large-scale ones (hundreds and thousands of active constraints) it can be several times faster than MinBLEIC. INPUT PARAMETERS: State - structure stores algorithm state BndL - lower bounds, array[N]. If some (all) variables are unbounded, you may specify very small number or -INF. BndU - upper bounds, array[N]. If some (all) variables are unbounded, you may specify very large number or +INF. NOTE 1: it is possible to specify BndL[i]=BndU[i]. In this case I-th variable will be "frozen" at X[i]=BndL[i]=BndU[i]. NOTE 2: this solver has following useful properties: * bound constraints are always satisfied exactly * function is evaluated only INSIDE area specified by bound constraints, even when numerical differentiation is used (algorithm adjusts nodes according to boundary constraints) -- ALGLIB -- Copyright 28.11.2010 by Bochkanov Sergey *************************************************************************/
void alglib::minbleicsetbc( minbleicstate state, real_1d_array bndl, real_1d_array bndu);

Examples:   [1]  [2]  

/************************************************************************* This function sets stopping conditions for the optimizer. INPUT PARAMETERS: State - structure which stores algorithm state EpsG - >=0 The subroutine finishes its work if the condition |v|<EpsG is satisfied, where: * |.| means Euclidian norm * v - scaled gradient vector, v[i]=g[i]*s[i] * g - gradient * s - scaling coefficients set by MinBLEICSetScale() EpsF - >=0 The subroutine finishes its work if on k+1-th iteration the condition |F(k+1)-F(k)|<=EpsF*max{|F(k)|,|F(k+1)|,1} is satisfied. EpsX - >=0 The subroutine finishes its work if on k+1-th iteration the condition |v|<=EpsX is fulfilled, where: * |.| means Euclidian norm * v - scaled step vector, v[i]=dx[i]/s[i] * dx - step vector, dx=X(k+1)-X(k) * s - scaling coefficients set by MinBLEICSetScale() MaxIts - maximum number of iterations. If MaxIts=0, the number of iterations is unlimited. Passing EpsG=0, EpsF=0 and EpsX=0 and MaxIts=0 (simultaneously) will lead to automatic stopping criterion selection. NOTE: when SetCond() called with non-zero MaxIts, BLEIC solver may perform slightly more than MaxIts iterations. I.e., MaxIts sets non-strict limit on iterations count. -- ALGLIB -- Copyright 28.11.2010 by Bochkanov Sergey *************************************************************************/
void alglib::minbleicsetcond( minbleicstate state, double epsg, double epsf, double epsx, ae_int_t maxits);

Examples:   [1]  [2]  [3]  

/************************************************************************* This subroutine turns on verification of the user-supplied analytic gradient: * user calls this subroutine before optimization begins * MinBLEICOptimize() is called * prior to actual optimization, for each component of parameters being optimized X[i] algorithm performs following steps: * two trial steps are made to X[i]-TestStep*S[i] and X[i]+TestStep*S[i], where X[i] is i-th component of the initial point and S[i] is a scale of i-th parameter * if needed, steps are bounded with respect to constraints on X[] * F(X) is evaluated at these trial points * we perform one more evaluation in the middle point of the interval * we build cubic model using function values and derivatives at trial points and we compare its prediction with actual value in the middle point * in case difference between prediction and actual value is higher than some predetermined threshold, algorithm stops with completion code -7; Rep.VarIdx is set to index of the parameter with incorrect derivative. * after verification is over, algorithm proceeds to the actual optimization. NOTE 1: verification needs N (parameters count) gradient evaluations. It is very costly and you should use it only for low dimensional problems, when you want to be sure that you've correctly calculated analytic derivatives. You should not use it in the production code (unless you want to check derivatives provided by some third party). NOTE 2: you should carefully choose TestStep. Value which is too large (so large that function behaviour is significantly non-cubic) will lead to false alarms. You may use different step for different parameters by means of setting scale with MinBLEICSetScale(). NOTE 3: this function may lead to false positives. In case it reports that I-th derivative was calculated incorrectly, you may decrease test step and try one more time - maybe your function changes too sharply and your step is too large for such rapidly changing function. INPUT PARAMETERS: State - structure used to store algorithm state TestStep - verification step: * TestStep=0 turns verification off * TestStep>0 activates verification -- ALGLIB -- Copyright 15.06.2012 by Bochkanov Sergey *************************************************************************/
void alglib::minbleicsetgradientcheck( minbleicstate state, double teststep);
/************************************************************************* This function sets linear constraints for BLEIC optimizer. Linear constraints are inactive by default (after initial creation). They are preserved after algorithm restart with MinBLEICRestartFrom(). INPUT PARAMETERS: State - structure previously allocated with MinBLEICCreate call. C - linear constraints, array[K,N+1]. Each row of C represents one constraint, either equality or inequality (see below): * first N elements correspond to coefficients, * last element corresponds to the right part. All elements of C (including right part) must be finite. CT - type of constraints, array[K]: * if CT[i]>0, then I-th constraint is C[i,*]*x >= C[i,n+1] * if CT[i]=0, then I-th constraint is C[i,*]*x = C[i,n+1] * if CT[i]<0, then I-th constraint is C[i,*]*x <= C[i,n+1] K - number of equality/inequality constraints, K>=0: * if given, only leading K elements of C/CT are used * if not given, automatically determined from sizes of C/CT NOTE 1: linear (non-bound) constraints are satisfied only approximately: * there always exists some minor violation (about Epsilon in magnitude) due to rounding errors * numerical differentiation, if used, may lead to function evaluations outside of the feasible area, because algorithm does NOT change numerical differentiation formula according to linear constraints. If you want constraints to be satisfied exactly, try to reformulate your problem in such manner that all constraints will become boundary ones (this kind of constraints is always satisfied exactly, both in the final solution and in all intermediate points). -- ALGLIB -- Copyright 28.11.2010 by Bochkanov Sergey *************************************************************************/
void alglib::minbleicsetlc( minbleicstate state, real_2d_array c, integer_1d_array ct); void alglib::minbleicsetlc( minbleicstate state, real_2d_array c, integer_1d_array ct, ae_int_t k);

Examples:   [1]  

/************************************************************************* Modification of the preconditioner: preconditioning is turned off. INPUT PARAMETERS: State - structure which stores algorithm state -- ALGLIB -- Copyright 13.10.2010 by Bochkanov Sergey *************************************************************************/
void alglib::minbleicsetprecdefault(minbleicstate state);
/************************************************************************* Modification of the preconditioner: diagonal of approximate Hessian is used. INPUT PARAMETERS: State - structure which stores algorithm state D - diagonal of the approximate Hessian, array[0..N-1], (if larger, only leading N elements are used). NOTE 1: D[i] should be positive. Exception will be thrown otherwise. NOTE 2: you should pass diagonal of approximate Hessian - NOT ITS INVERSE. -- ALGLIB -- Copyright 13.10.2010 by Bochkanov Sergey *************************************************************************/
void alglib::minbleicsetprecdiag(minbleicstate state, real_1d_array d);
/************************************************************************* Modification of the preconditioner: scale-based diagonal preconditioning. This preconditioning mode can be useful when you don't have approximate diagonal of Hessian, but you know that your variables are badly scaled (for example, one variable is in [1,10], and another in [1000,100000]), and most part of the ill-conditioning comes from different scales of vars. In this case simple scale-based preconditioner, with H[i] = 1/(s[i]^2), can greatly improve convergence. IMPORTANT: you should set scale of your variables with MinBLEICSetScale() call (before or after MinBLEICSetPrecScale() call). Without knowledge of the scale of your variables scale-based preconditioner will be just unit matrix. INPUT PARAMETERS: State - structure which stores algorithm state -- ALGLIB -- Copyright 13.10.2010 by Bochkanov Sergey *************************************************************************/
void alglib::minbleicsetprecscale(minbleicstate state);
/************************************************************************* This function sets scaling coefficients for BLEIC optimizer. ALGLIB optimizers use scaling matrices to test stopping conditions (step size and gradient are scaled before comparison with tolerances). Scale of the I-th variable is a translation invariant measure of: a) "how large" the variable is b) how large the step should be to make significant changes in the function Scaling is also used by finite difference variant of the optimizer - step along I-th axis is equal to DiffStep*S[I]. In most optimizers (and in the BLEIC too) scaling is NOT a form of preconditioning. It just affects stopping conditions. You should set preconditioner by separate call to one of the MinBLEICSetPrec...() functions. There is a special preconditioning mode, however, which uses scaling coefficients to form diagonal preconditioning matrix. You can turn this mode on, if you want. But you should understand that scaling is not the same thing as preconditioning - these are two different, although related forms of tuning solver. INPUT PARAMETERS: State - structure stores algorithm state S - array[N], non-zero scaling coefficients S[i] may be negative, sign doesn't matter. -- ALGLIB -- Copyright 14.01.2011 by Bochkanov Sergey *************************************************************************/
void alglib::minbleicsetscale(minbleicstate state, real_1d_array s);
/************************************************************************* This function sets maximum step length IMPORTANT: this feature is hard to combine with preconditioning. You can't set upper limit on step length, when you solve optimization problem with linear (non-boundary) constraints AND preconditioner turned on. When non-boundary constraints are present, you have to either a) use preconditioner, or b) use upper limit on step length. YOU CAN'T USE BOTH! In this case algorithm will terminate with appropriate error code. INPUT PARAMETERS: State - structure which stores algorithm state StpMax - maximum step length, >=0. Set StpMax to 0.0, if you don't want to limit step length. Use this subroutine when you optimize target function which contains exp() or other fast growing functions, and optimization algorithm makes too large steps which lead to overflow. This function allows us to reject steps that are too large (and therefore expose us to the possible overflow) without actually calculating function value at the x+stp*d. -- ALGLIB -- Copyright 02.04.2010 by Bochkanov Sergey *************************************************************************/
void alglib::minbleicsetstpmax(minbleicstate state, double stpmax);
/************************************************************************* This function turns on/off reporting. INPUT PARAMETERS: State - structure which stores algorithm state NeedXRep- whether iteration reports are needed or not If NeedXRep is True, algorithm will call rep() callback function if it is provided to MinBLEICOptimize(). -- ALGLIB -- Copyright 28.11.2010 by Bochkanov Sergey *************************************************************************/
void alglib::minbleicsetxrep(minbleicstate state, bool needxrep);
#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "optimization.h"

using namespace alglib;
void function1_grad(const real_1d_array &x, double &func, real_1d_array &grad, void *ptr) 
{
    //
    // this callback calculates f(x0,x1) = 100*(x0+3)^4 + (x1-3)^4
    // and its derivatives df/dx0 and df/dx1
    //
    func = 100*pow(x[0]+3,4) + pow(x[1]-3,4);
    grad[0] = 400*pow(x[0]+3,3);
    grad[1] = 4*pow(x[1]-3,3);
}

int main(int argc, char **argv)
{
    //
    // This example demonstrates minimization of f(x,y) = 100*(x+3)^4+(y-3)^4
    // subject to bound constraints -1<=x<=+1, -1<=y<=+1, using BLEIC optimizer.
    //
    real_1d_array x = "[0,0]";
    real_1d_array bndl = "[-1,-1]";
    real_1d_array bndu = "[+1,+1]";
    minbleicstate state;
    minbleicreport rep;

    //
    // These variables define stopping conditions for the optimizer.
    //
    // We use very simple condition - |g|<=epsg
    //
    double epsg = 0.000001;
    double epsf = 0;
    double epsx = 0;
    ae_int_t maxits = 0;

    //
    // Now we are ready to actually optimize something:
    // * first we create optimizer
    // * we add boundary constraints
    // * we tune stopping conditions
    // * and, finally, optimize and obtain results...
    //
    minbleiccreate(x, state);
    minbleicsetbc(state, bndl, bndu);
    minbleicsetcond(state, epsg, epsf, epsx, maxits);
    alglib::minbleicoptimize(state, function1_grad);
    minbleicresults(state, x, rep);

    //
    // ...and evaluate these results
    //
    printf("%d\n", int(rep.terminationtype)); // EXPECTED: 4
    printf("%s\n", x.tostring(2).c_str()); // EXPECTED: [-1,1]
    return 0;
}


#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "optimization.h"

using namespace alglib;
void function1_grad(const real_1d_array &x, double &func, real_1d_array &grad, void *ptr) 
{
    //
    // this callback calculates f(x0,x1) = 100*(x0+3)^4 + (x1-3)^4
    // and its derivatives df/dx0 and df/dx1
    //
    func = 100*pow(x[0]+3,4) + pow(x[1]-3,4);
    grad[0] = 400*pow(x[0]+3,3);
    grad[1] = 4*pow(x[1]-3,3);
}

int main(int argc, char **argv)
{
    //
    // This example demonstrates minimization of f(x,y) = 100*(x+3)^4+(y-3)^4
    // subject to inequality constraints:
    // * x>=2 (posed as general linear constraint),
    // * x+y>=6
    // using BLEIC optimizer.
    //
    real_1d_array x = "[5,5]";
    real_2d_array c = "[[1,0,2],[1,1,6]]";
    integer_1d_array ct = "[1,1]";
    minbleicstate state;
    minbleicreport rep;

    //
    // These variables define stopping conditions for the optimizer.
    //
    // We use very simple condition - |g|<=epsg
    //
    double epsg = 0.000001;
    double epsf = 0;
    double epsx = 0;
    ae_int_t maxits = 0;

    //
    // Now we are ready to actually optimize something:
    // * first we create optimizer
    // * we add linear constraints
    // * we tune stopping conditions
    // * and, finally, optimize and obtain results...
    //
    minbleiccreate(x, state);
    minbleicsetlc(state, c, ct);
    minbleicsetcond(state, epsg, epsf, epsx, maxits);
    alglib::minbleicoptimize(state, function1_grad);
    minbleicresults(state, x, rep);

    //
    // ...and evaluate these results
    //
    printf("%d\n", int(rep.terminationtype)); // EXPECTED: 4
    printf("%s\n", x.tostring(2).c_str()); // EXPECTED: [2,4]
    return 0;
}


#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "optimization.h"

using namespace alglib;
void s1_grad(const real_1d_array &x, double &func, real_1d_array &grad, void *ptr)
{
    //
    // this callback calculates f(x) = (1+x)^(-0.2) + (1-x)^(-0.3) + 1000*x and its gradient.
    //
    // function is trimmed when we calculate it near the singular points or outside of the interval [-1,+1].
    // Note that we do NOT calculate gradient in this case.
    //
    if( (x[0]<=-0.999999999999) || (x[0]>=+0.999999999999) )
    {
        func = 1.0E+300;
        return;
    }
    func = pow(1+x[0],-0.2) + pow(1-x[0],-0.3) + 1000*x[0];
    grad[0] = -0.2*pow(1+x[0],-1.2) +0.3*pow(1-x[0],-1.3) + 1000;
}

int main(int argc, char **argv)
{
    //
    // This example demonstrates minimization of f(x) = (1+x)^(-0.2) + (1-x)^(-0.3) + 1000*x.
    //
    // This function is undefined outside of (-1,+1) and has singularities at x=-1 and x=+1.
    // Special technique called "function trimming" allows us to solve this optimization problem 
    // - without using boundary constraints!
    //
    // See http://www.alglib.net/optimization/tipsandtricks.php#ftrimming for more information
    // on this subject.
    //
    real_1d_array x = "[0]";
    double epsg = 1.0e-6;
    double epsf = 0;
    double epsx = 0;
    ae_int_t maxits = 0;
    minbleicstate state;
    minbleicreport rep;

    minbleiccreate(x, state);
    minbleicsetcond(state, epsg, epsf, epsx, maxits);
    alglib::minbleicoptimize(state, s1_grad);
    minbleicresults(state, x, rep);

    printf("%s\n", x.tostring(5).c_str()); // EXPECTED: [-0.99917305]
    return 0;
}


#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "optimization.h"

using namespace alglib;
void function1_func(const real_1d_array &x, double &func, void *ptr)
{
    //
    // this callback calculates f(x0,x1) = 100*(x0+3)^4 + (x1-3)^4
    //
    func = 100*pow(x[0]+3,4) + pow(x[1]-3,4);
}

int main(int argc, char **argv)
{
    //
    // This example demonstrates minimization of f(x,y) = 100*(x+3)^4+(y-3)^4
    // subject to bound constraints -1<=x<=+1, -1<=y<=+1, using BLEIC optimizer.
    //
    real_1d_array x = "[0,0]";
    real_1d_array bndl = "[-1,-1]";
    real_1d_array bndu = "[+1,+1]";
    minbleicstate state;
    minbleicreport rep;

    //
    // These variables define stopping conditions for the optimizer.
    //
    // We use very simple condition - |g|<=epsg
    //
    double epsg = 0.000001;
    double epsf = 0;
    double epsx = 0;
    ae_int_t maxits = 0;

    //
    // This variable contains differentiation step
    //
    double diffstep = 1.0e-6;

    //
    // Now we are ready to actually optimize something:
    // * first we create optimizer
    // * we add boundary constraints
    // * we tune stopping conditions
    // * and, finally, optimize and obtain results...
    //
    minbleiccreatef(x, diffstep, state);
    minbleicsetbc(state, bndl, bndu);
    minbleicsetcond(state, epsg, epsf, epsx, maxits);
    alglib::minbleicoptimize(state, function1_func);
    minbleicresults(state, x, rep);

    //
    // ...and evaluate these results
    //
    printf("%d\n", int(rep.terminationtype)); // EXPECTED: 4
    printf("%s\n", x.tostring(2).c_str()); // EXPECTED: [-1,1]
    return 0;
}


mincgreport
mincgstate
mincgcreate
mincgcreatef
mincgoptimize
mincgrequesttermination
mincgrestartfrom
mincgresults
mincgresultsbuf
mincgsetcgtype
mincgsetcond
mincgsetgradientcheck
mincgsetprecdefault
mincgsetprecdiag
mincgsetprecscale
mincgsetscale
mincgsetstpmax
mincgsetxrep
mincgsuggeststep
mincg_d_1 Nonlinear optimization by CG
mincg_d_2 Nonlinear optimization with additional settings and restarts
mincg_ftrim Nonlinear optimization by CG, function with singularities
mincg_numdiff Nonlinear optimization by CG with numerical differentiation
/************************************************************************* This structure stores optimization report: * IterationsCount total number of inner iterations * NFEV number of gradient evaluations * TerminationType termination type (see below) TERMINATION CODES TerminationType field contains completion code, which can be: -8 internal integrity control detected infinite or NAN values in function/gradient. Abnormal termination signalled. -7 gradient verification failed. See MinCGSetGradientCheck() for more information. 1 relative function improvement is no more than EpsF. 2 relative step is no more than EpsX. 4 gradient norm is no more than EpsG 5 MaxIts steps was taken 7 stopping conditions are too stringent, further improvement is impossible, X contains best point found so far. 8 terminated by user who called mincgrequesttermination(). X contains point which was "current accepted" when termination request was submitted. Other fields of this structure are not documented and should not be used! *************************************************************************/
class mincgreport { ae_int_t iterationscount; ae_int_t nfev; ae_int_t varidx; ae_int_t terminationtype; };
/************************************************************************* This object stores state of the nonlinear CG optimizer. You should use ALGLIB functions to work with this object. *************************************************************************/
class mincgstate { };
/************************************************************************* NONLINEAR CONJUGATE GRADIENT METHOD DESCRIPTION: The subroutine minimizes function F(x) of N arguments by using one of the nonlinear conjugate gradient methods. These CG methods are globally convergent (even on non-convex functions) as long as grad(f) is Lipschitz continuous in some neighborhood of the level set L = { x : f(x)<=f(x0) }. REQUIREMENTS: Algorithm will request following information during its operation: * function value F and its gradient G (simultaneously) at given point X USAGE: 1. User initializes algorithm state with MinCGCreate() call 2. User tunes solver parameters with MinCGSetCond(), MinCGSetStpMax() and other functions 3. User calls MinCGOptimize() function which takes algorithm state and pointer (delegate, etc.) to callback function which calculates F/G. 4. User calls MinCGResults() to get solution 5. Optionally, user may call MinCGRestartFrom() to solve another problem with same N but another starting point and/or another function. MinCGRestartFrom() allows to reuse already initialized structure. INPUT PARAMETERS: N - problem dimension, N>0: * if given, only leading N elements of X are used * if not given, automatically determined from size of X X - starting point, array[0..N-1]. OUTPUT PARAMETERS: State - structure which stores algorithm state -- ALGLIB -- Copyright 25.03.2010 by Bochkanov Sergey *************************************************************************/
void alglib::mincgcreate(real_1d_array x, mincgstate& state); void alglib::mincgcreate(ae_int_t n, real_1d_array x, mincgstate& state);

Examples:   [1]  [2]  [3]  

/************************************************************************* The subroutine is finite difference variant of MinCGCreate(). It uses finite differences in order to differentiate target function. Description below contains information which is specific to this function only. We recommend to read comments on MinCGCreate() in order to get more information about creation of CG optimizer. INPUT PARAMETERS: N - problem dimension, N>0: * if given, only leading N elements of X are used * if not given, automatically determined from size of X X - starting point, array[0..N-1]. DiffStep- differentiation step, >0 OUTPUT PARAMETERS: State - structure which stores algorithm state NOTES: 1. algorithm uses 4-point central formula for differentiation. 2. differentiation step along I-th axis is equal to DiffStep*S[I] where S[] is scaling vector which can be set by MinCGSetScale() call. 3. we recommend you to use moderate values of differentiation step. Too large step will result in too large truncation errors, while too small step will result in too large numerical errors. 1.0E-6 can be good value to start with. 4. Numerical differentiation is very inefficient - one gradient calculation needs 4*N function evaluations. This function will work for any N - either small (1...10), moderate (10...100) or large (100...). However, performance penalty will be too severe for any N's except for small ones. We should also say that code which relies on numerical differentiation is less robust and precise. L-BFGS needs exact gradient values. Imprecise gradient may slow down convergence, especially on highly nonlinear problems. Thus we recommend to use this function for fast prototyping on small- dimensional problems only, and to implement analytical gradient as soon as possible. -- ALGLIB -- Copyright 16.05.2011 by Bochkanov Sergey *************************************************************************/
void alglib::mincgcreatef( real_1d_array x, double diffstep, mincgstate& state); void alglib::mincgcreatef( ae_int_t n, real_1d_array x, double diffstep, mincgstate& state);

Examples:   [1]  

/************************************************************************* This family of functions is used to launch iterations of nonlinear optimizer. These functions accept following parameters: state - algorithm state func - callback which calculates function (or merit function) value func at given point x grad - callback which calculates function (or merit function) value func and gradient grad at given point x rep - optional callback which is called after each iteration can be NULL ptr - optional pointer which is passed to func/grad/hess/jac/rep can be NULL NOTES: 1. This function has two different implementations: one which uses exact (analytical) user-supplied gradient, and one which uses function value only and numerically differentiates function in order to obtain gradient. Depending on the specific function used to create optimizer object (either MinCGCreate() for analytical gradient or MinCGCreateF() for numerical differentiation) you should choose appropriate variant of MinCGOptimize() - one which accepts function AND gradient or one which accepts function ONLY. Be careful to choose variant of MinCGOptimize() which corresponds to your optimization scheme! Table below lists different combinations of callback (function/gradient) passed to MinCGOptimize() and specific function used to create optimizer. | USER PASSED TO MinCGOptimize() CREATED WITH | function only | function and gradient ------------------------------------------------------------ MinCGCreateF() | work FAIL MinCGCreate() | FAIL work Here "FAIL" denotes inappropriate combinations of optimizer creation function and MinCGOptimize() version. Attempts to use such combination (for example, to create optimizer with MinCGCreateF() and to pass gradient information to MinCGOptimize()) will lead to exception being thrown. Either you did not pass gradient when it WAS needed or you passed gradient when it was NOT needed. -- ALGLIB -- Copyright 20.04.2009 by Bochkanov Sergey *************************************************************************/
void mincgoptimize(mincgstate &state, void (*func)(const real_1d_array &x, double &func, void *ptr), void (*rep)(const real_1d_array &x, double func, void *ptr) = NULL, void *ptr = NULL); void mincgoptimize(mincgstate &state, void (*grad)(const real_1d_array &x, double &func, real_1d_array &grad, void *ptr), void (*rep)(const real_1d_array &x, double func, void *ptr) = NULL, void *ptr = NULL);

Examples:   [1]  [2]  [3]  [4]  

/************************************************************************* This subroutine submits request for termination of running optimizer. It should be called from user-supplied callback when user decides that it is time to "smoothly" terminate optimization process. As result, optimizer stops at point which was "current accepted" when termination request was submitted and returns error code 8 (successful termination). INPUT PARAMETERS: State - optimizer structure NOTE: after request for termination optimizer may perform several additional calls to user-supplied callbacks. It does NOT guarantee to stop immediately - it just guarantees that these additional calls will be discarded later. NOTE: calling this function on optimizer which is NOT running will have no effect. NOTE: multiple calls to this function are possible. First call is counted, subsequent calls are silently ignored. -- ALGLIB -- Copyright 08.10.2014 by Bochkanov Sergey *************************************************************************/
void alglib::mincgrequesttermination(mincgstate state);
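Below is a minimal sketch of how a termination request might be issued from a user-supplied callback. It assumes that the optimizer state object is passed to the callback through the ptr argument and that reporting was enabled with mincgsetxrep(); the callback name rep_with_termination, the threshold, and the gradient callback function1_grad (taken from the examples further below) are illustrative only.

void rep_with_termination(const real_1d_array &x, double func, void *ptr)
{
    // ptr is assumed to point to the mincgstate object which drives this optimization
    mincgstate *state = (mincgstate*)ptr;

    // ask the optimizer to stop "smoothly" once the target drops below a threshold;
    // the optimizer then returns with terminationtype=8 at the last accepted point
    if( func<1.0E-3 )
        mincgrequesttermination(*state);
}

...
mincgsetxrep(state, true);      // rep() is called only when reporting is turned on
alglib::mincgoptimize(state, function1_grad, rep_with_termination, &state);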
/************************************************************************* This subroutine restarts CG algorithm from new point. All optimization parameters are left unchanged. This function allows to solve multiple optimization problems (which must have same number of dimensions) without object reallocation penalty. INPUT PARAMETERS: State - structure used to store algorithm state. X - new starting point. -- ALGLIB -- Copyright 30.07.2010 by Bochkanov Sergey *************************************************************************/
void alglib::mincgrestartfrom(mincgstate state, real_1d_array x);

Examples:   [1]  

/************************************************************************* Conjugate gradient results INPUT PARAMETERS: State - algorithm state OUTPUT PARAMETERS: X - array[0..N-1], solution Rep - optimization report: * Rep.TerminationType completion code: * -8 internal integrity control detected infinite or NAN values in function/gradient. Abnormal termination signalled. * -7 gradient verification failed. See MinCGSetGradientCheck() for more information. * 1 relative function improvement is no more than EpsF. * 2 relative step is no more than EpsX. * 4 gradient norm is no more than EpsG * 5 MaxIts steps were taken * 7 stopping conditions are too stringent, further improvement is impossible, we return best X found so far * 8 terminated by user * Rep.IterationsCount contains iterations count * NFEV contains number of function calculations -- ALGLIB -- Copyright 20.04.2009 by Bochkanov Sergey *************************************************************************/
void alglib::mincgresults( mincgstate state, real_1d_array& x, mincgreport& rep);

Examples:   [1]  [2]  [3]  [4]  

/************************************************************************* Conjugate gradient results Buffered implementation of MinCGResults(), which uses pre-allocated buffer to store X[]. If buffer size is too small, it resizes buffer. It is intended to be used in the inner cycles of performance critical algorithms where array reallocation penalty is too large to be ignored. -- ALGLIB -- Copyright 20.04.2009 by Bochkanov Sergey *************************************************************************/
void alglib::mincgresultsbuf( mincgstate state, real_1d_array& x, mincgreport& rep);
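As a rough illustration of the buffered variant, the sketch below reuses one pre-allocated output array across several solves of same-dimension problems. The loop bound nproblems and the array starting_points are hypothetical names introduced for this sketch; state, rep and function1_grad are as in the examples further below.

real_1d_array xsol;
xsol.setlength(2);                                   // allocated once, reused by every call below
for(int i=0; i<nproblems; i++)
{
    mincgrestartfrom(state, starting_points[i]);     // same optimizer object, new starting point
    alglib::mincgoptimize(state, function1_grad);
    mincgresultsbuf(state, xsol, rep);               // writes into xsol without reallocating it
}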
/************************************************************************* This function sets CG algorithm. INPUT PARAMETERS: State - structure which stores algorithm state CGType - algorithm type: * -1 automatic selection of the best algorithm * 0 DY (Dai and Yuan) algorithm * 1 Hybrid DY-HS algorithm -- ALGLIB -- Copyright 02.04.2010 by Bochkanov Sergey *************************************************************************/
void alglib::mincgsetcgtype(mincgstate state, ae_int_t cgtype);
/************************************************************************* This function sets stopping conditions for CG optimization algorithm. INPUT PARAMETERS: State - structure which stores algorithm state EpsG - >=0 The subroutine finishes its work if the condition |v|<EpsG is satisfied, where: * |.| means Euclidean norm * v - scaled gradient vector, v[i]=g[i]*s[i] * g - gradient * s - scaling coefficients set by MinCGSetScale() EpsF - >=0 The subroutine finishes its work if on k+1-th iteration the condition |F(k+1)-F(k)|<=EpsF*max{|F(k)|,|F(k+1)|,1} is satisfied. EpsX - >=0 The subroutine finishes its work if on k+1-th iteration the condition |v|<=EpsX is fulfilled, where: * |.| means Euclidean norm * v - scaled step vector, v[i]=dx[i]/s[i] * dx - step vector, dx=X(k+1)-X(k) * s - scaling coefficients set by MinCGSetScale() MaxIts - maximum number of iterations. If MaxIts=0, the number of iterations is unlimited. Passing EpsG=0, EpsF=0, EpsX=0 and MaxIts=0 (simultaneously) will lead to automatic stopping criterion selection (small EpsX). -- ALGLIB -- Copyright 02.04.2010 by Bochkanov Sergey *************************************************************************/
void alglib::mincgsetcond( mincgstate state, double epsg, double epsf, double epsx, ae_int_t maxits);

Examples:   [1]  [2]  [3]  

/************************************************************************* This subroutine turns on verification of the user-supplied analytic gradient: * user calls this subroutine before optimization begins * MinCGOptimize() is called * prior to actual optimization, for each component of parameters being optimized X[i] algorithm performs following steps: * two trial steps are made to X[i]-TestStep*S[i] and X[i]+TestStep*S[i], where X[i] is i-th component of the initial point and S[i] is a scale of i-th parameter * F(X) is evaluated at these trial points * we perform one more evaluation in the middle point of the interval * we build cubic model using function values and derivatives at trial points and we compare its prediction with actual value in the middle point * in case difference between prediction and actual value is higher than some predetermined threshold, algorithm stops with completion code -7; Rep.VarIdx is set to index of the parameter with incorrect derivative. * after verification is over, algorithm proceeds to the actual optimization. NOTE 1: verification needs N (parameters count) gradient evaluations. It is very costly and you should use it only for low dimensional problems, when you want to be sure that you've correctly calculated analytic derivatives. You should not use it in the production code (unless you want to check derivatives provided by some third party). NOTE 2: you should carefully choose TestStep. Value which is too large (so large that function behaviour is significantly non-cubic) will lead to false alarms. You may use different step for different parameters by means of setting scale with MinCGSetScale(). NOTE 3: this function may lead to false positives. In case it reports that I-th derivative was calculated incorrectly, you may decrease test step and try one more time - maybe your function changes too sharply and your step is too large for such a rapidly changing function. INPUT PARAMETERS: State - structure used to store algorithm state TestStep - verification step: * TestStep=0 turns verification off * TestStep>0 activates verification -- ALGLIB -- Copyright 31.05.2012 by Bochkanov Sergey *************************************************************************/
void alglib::mincgsetgradientcheck(mincgstate state, double teststep);
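A possible usage pattern is sketched below; it follows the structure of the examples at the end of this section, and the test step 1.0E-4 is illustrative only (tune it for your own function). On failure the report fields referenced in the comment above (terminationtype=-7, varidx) identify the suspect variable.

mincgcreate(x, state);
mincgsetcond(state, epsg, epsf, epsx, maxits);
mincgsetgradientcheck(state, 1.0E-4);             // verify the analytic gradient before optimizing
alglib::mincgoptimize(state, function1_grad);
mincgresults(state, x, rep);
if( rep.terminationtype==-7 )                     // verification failed
    printf("suspicious derivative for variable %d\n", int(rep.varidx));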
/************************************************************************* Modification of the preconditioner: preconditioning is turned off. INPUT PARAMETERS: State - structure which stores algorithm state NOTE: you can change preconditioner "on the fly", during algorithm iterations. -- ALGLIB -- Copyright 13.10.2010 by Bochkanov Sergey *************************************************************************/
void alglib::mincgsetprecdefault(mincgstate state);
/************************************************************************* Modification of the preconditioner: diagonal of approximate Hessian is used. INPUT PARAMETERS: State - structure which stores algorithm state D - diagonal of the approximate Hessian, array[0..N-1], (if larger, only leading N elements are used). NOTE: you can change preconditioner "on the fly", during algorithm iterations. NOTE 2: D[i] should be positive. Exception will be thrown otherwise. NOTE 3: you should pass diagonal of approximate Hessian - NOT ITS INVERSE. -- ALGLIB -- Copyright 13.10.2010 by Bochkanov Sergey *************************************************************************/
void alglib::mincgsetprecdiag(mincgstate state, real_1d_array d);
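For example, a diagonal preconditioner for the f(x0,x1) = 100*(x0+3)^4 + (x1-3)^4 problem used in the examples below could be built from the Hessian diagonal at the starting point [0,0]; the sketch assumes the state object has already been created.

// Hessian diagonal of 100*(x0+3)^4 + (x1-3)^4 at [0,0]:
// d2f/dx0^2 = 1200*(x0+3)^2 = 10800,  d2f/dx1^2 = 12*(x1-3)^2 = 108
real_1d_array h = "[10800, 108]";
mincgsetprecdiag(state, h);              // pass the diagonal itself, NOT its inverse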
/************************************************************************* Modification of the preconditioner: scale-based diagonal preconditioning. This preconditioning mode can be useful when you don't have approximate diagonal of Hessian, but you know that your variables are badly scaled (for example, one variable is in [1,10], and another in [1000,100000]), and most part of the ill-conditioning comes from different scales of vars. In this case simple scale-based preconditioner, with H[i] = 1/(s[i]^2), can greatly improve convergence. IMPORTANT: you should set scale of your variables with MinCGSetScale() call (before or after MinCGSetPrecScale() call). Without knowledge of the scale of your variables scale-based preconditioner will be just a unit matrix. INPUT PARAMETERS: State - structure which stores algorithm state NOTE: you can change preconditioner "on the fly", during algorithm iterations. -- ALGLIB -- Copyright 13.10.2010 by Bochkanov Sergey *************************************************************************/
void alglib::mincgsetprecscale(mincgstate state);
/************************************************************************* This function sets scaling coefficients for CG optimizer. ALGLIB optimizers use scaling matrices to test stopping conditions (step size and gradient are scaled before comparison with tolerances). Scale of the I-th variable is a translation invariant measure of: a) "how large" the variable is b) how large the step should be to make significant changes in the function Scaling is also used by finite difference variant of CG optimizer - step along I-th axis is equal to DiffStep*S[I]. In most optimizers (and in the CG too) scaling is NOT a form of preconditioning. It just affects stopping conditions. You should set preconditioner by separate call to one of the MinCGSetPrec...() functions. There is special preconditioning mode, however, which uses scaling coefficients to form diagonal preconditioning matrix. You can turn this mode on, if you want. But you should understand that scaling is not the same thing as preconditioning - these are two different, although related forms of tuning solver. INPUT PARAMETERS: State - structure stores algorithm state S - array[N], non-zero scaling coefficients S[i] may be negative, sign doesn't matter. -- ALGLIB -- Copyright 14.01.2011 by Bochkanov Sergey *************************************************************************/
void alglib::mincgsetscale(mincgstate state, real_1d_array s);
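A short sketch, assuming a two-variable problem where the first variable is of order 1 and the second of order 10^4 (the scale values themselves are illustrative, not prescriptive):

real_1d_array s = "[1.0e+0, 1.0e+4]";
mincgsetscale(state, s);        // used by stopping tests and by finite-difference steps
mincgsetprecscale(state);       // optional: also reuse these scales as a diagonal preconditioner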
/************************************************************************* This function sets maximum step length INPUT PARAMETERS: State - structure which stores algorithm state StpMax - maximum step length, >=0. Set StpMax to 0.0, if you don't want to limit step length. Use this subroutine when you optimize target function which contains exp() or other fast growing functions, and optimization algorithm makes too large steps which leads to overflow. This function allows us to reject steps that are too large (and therefore expose us to the possible overflow) without actually calculating function value at the x+stp*d. -- ALGLIB -- Copyright 02.04.2010 by Bochkanov Sergey *************************************************************************/
void alglib::mincgsetstpmax(mincgstate state, double stpmax);
/************************************************************************* This function turns on/off reporting. INPUT PARAMETERS: State - structure which stores algorithm state NeedXRep- whether iteration reports are needed or not If NeedXRep is True, algorithm will call rep() callback function if it is provided to MinCGOptimize(). -- ALGLIB -- Copyright 02.04.2010 by Bochkanov Sergey *************************************************************************/
void alglib::mincgsetxrep(mincgstate state, bool needxrep);
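For instance, a simple progress-printing callback could be attached as sketched below; print_progress is an illustrative name and function1_grad refers to the gradient callback from the examples that follow.

void print_progress(const real_1d_array &x, double func, void *ptr)
{
    // called by the optimizer after each iteration when reporting is turned on
    printf("f = %.6f at x = %s\n", func, x.tostring(3).c_str());
}

...
mincgsetxrep(state, true);
alglib::mincgoptimize(state, function1_grad, print_progress);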
/************************************************************************* This function allows to suggest initial step length to the CG algorithm. Suggested step length is used as starting point for the line search. It can be useful when you have badly scaled problem, i.e. when ||grad|| (which is used as initial estimate for the first step) is many orders of magnitude different from the desired step. Line search may fail on such problems without good estimate of initial step length. Imagine, for example, problem with ||grad||=10^50 and desired step equal to 0.1. Line search function will use 10^50 as initial step, then it will decrease step length by 2 (up to 20 attempts) and will get 10^44, which is still too large. This function allows us to tell that line search should be started from some moderate step length, like 1.0, so algorithm will be able to detect desired step length in a few searches. Default behavior (when no step is suggested) is to use preconditioner, if it is available, to generate initial estimate of step length. This function influences only first iteration of algorithm. It should be called between MinCGCreate/MinCGRestartFrom() call and MinCGOptimize call. Suggested step is ignored if you have preconditioner. INPUT PARAMETERS: State - structure used to store algorithm state. Stp - initial estimate of the step length. Can be zero (no estimate). -- ALGLIB -- Copyright 30.07.2010 by Bochkanov Sergey *************************************************************************/
void alglib::mincgsuggeststep(mincgstate state, double stp);
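A minimal sketch of suggesting a moderate first step; the value 1.0 is illustrative, and (as noted above) the hint affects only the first iteration and is ignored when a preconditioner is set.

mincgcreate(x, state);
mincgsetcond(state, epsg, epsf, epsx, maxits);
mincgsuggeststep(state, 1.0);       // only the very first line search is affected
alglib::mincgoptimize(state, function1_grad);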
#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "optimization.h"

using namespace alglib;
void function1_grad(const real_1d_array &x, double &func, real_1d_array &grad, void *ptr) 
{
    //
    // this callback calculates f(x0,x1) = 100*(x0+3)^4 + (x1-3)^4
    // and its derivatives df/dx0 and df/dx1
    //
    func = 100*pow(x[0]+3,4) + pow(x[1]-3,4);
    grad[0] = 400*pow(x[0]+3,3);
    grad[1] = 4*pow(x[1]-3,3);
}

int main(int argc, char **argv)
{
    //
    // This example demonstrates minimization of f(x,y) = 100*(x+3)^4+(y-3)^4
    // with nonlinear conjugate gradient method.
    //
    real_1d_array x = "[0,0]";
    double epsg = 0.0000000001;
    double epsf = 0;
    double epsx = 0;
    ae_int_t maxits = 0;
    mincgstate state;
    mincgreport rep;

    mincgcreate(x, state);
    mincgsetcond(state, epsg, epsf, epsx, maxits);
    alglib::mincgoptimize(state, function1_grad);
    mincgresults(state, x, rep);

    printf("%d\n", int(rep.terminationtype)); // EXPECTED: 4
    printf("%s\n", x.tostring(2).c_str()); // EXPECTED: [-3,3]
    return 0;
}


#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "optimization.h"

using namespace alglib;
void function1_grad(const real_1d_array &x, double &func, real_1d_array &grad, void *ptr) 
{
    //
    // this callback calculates f(x0,x1) = 100*(x0+3)^4 + (x1-3)^4
    // and its derivatives df/dx0 and df/dx1
    //
    func = 100*pow(x[0]+3,4) + pow(x[1]-3,4);
    grad[0] = 400*pow(x[0]+3,3);
    grad[1] = 4*pow(x[1]-3,3);
}

int main(int argc, char **argv)
{
    //
    // This example demonstrates minimization of f(x,y) = 100*(x+3)^4+(y-3)^4
    // with nonlinear conjugate gradient method.
    //
    // Several advanced techniques are demonstrated:
    // * upper limit on step size
    // * restart from new point
    //
    real_1d_array x = "[0,0]";
    double epsg = 0.0000000001;
    double epsf = 0;
    double epsx = 0;
    double stpmax = 0.1;
    ae_int_t maxits = 0;
    mincgstate state;
    mincgreport rep;

    // first run
    mincgcreate(x, state);
    mincgsetcond(state, epsg, epsf, epsx, maxits);
    mincgsetstpmax(state, stpmax);
    alglib::mincgoptimize(state, function1_grad);
    mincgresults(state, x, rep);

    printf("%s\n", x.tostring(2).c_str()); // EXPECTED: [-3,3]

    // second run - algorithm is restarted with mincgrestartfrom()
    x = "[10,10]";
    mincgrestartfrom(state, x);
    alglib::mincgoptimize(state, function1_grad);
    mincgresults(state, x, rep);

    printf("%d\n", int(rep.terminationtype)); // EXPECTED: 4
    printf("%s\n", x.tostring(2).c_str()); // EXPECTED: [-3,3]
    return 0;
}


#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "optimization.h"

using namespace alglib;
void s1_grad(const real_1d_array &x, double &func, real_1d_array &grad, void *ptr)
{
    //
    // this callback calculates f(x) = (1+x)^(-0.2) + (1-x)^(-0.3) + 1000*x and its gradient.
    //
    // function is trimmed when we calculate it near the singular points or outside of the [-1,+1].
    // Note that we do NOT calculate gradient in this case.
    //
    if( (x[0]<=-0.999999999999) || (x[0]>=+0.999999999999) )
    {
        func = 1.0E+300;
        return;
    }
    func = pow(1+x[0],-0.2) + pow(1-x[0],-0.3) + 1000*x[0];
    grad[0] = -0.2*pow(1+x[0],-1.2) +0.3*pow(1-x[0],-1.3) + 1000;
}

int main(int argc, char **argv)
{
    //
    // This example demonstrates minimization of f(x) = (1+x)^(-0.2) + (1-x)^(-0.3) + 1000*x.
    // This function has singularities at the boundary of the [-1,+1], but technique called
    // "function trimming" allows us to solve this optimization problem.
    //
    // See http://www.alglib.net/optimization/tipsandtricks.php#ftrimming for more information
    // on this subject.
    //
    real_1d_array x = "[0]";
    double epsg = 1.0e-6;
    double epsf = 0;
    double epsx = 0;
    ae_int_t maxits = 0;
    mincgstate state;
    mincgreport rep;

    mincgcreate(x, state);
    mincgsetcond(state, epsg, epsf, epsx, maxits);
    alglib::mincgoptimize(state, s1_grad);
    mincgresults(state, x, rep);

    printf("%s\n", x.tostring(5).c_str()); // EXPECTED: [-0.99917305]
    return 0;
}


#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "optimization.h"

using namespace alglib;
void function1_func(const real_1d_array &x, double &func, void *ptr)
{
    //
    // this callback calculates f(x0,x1) = 100*(x0+3)^4 + (x1-3)^4
    //
    func = 100*pow(x[0]+3,4) + pow(x[1]-3,4);
}

int main(int argc, char **argv)
{
    //
    // This example demonstrates minimization of f(x,y) = 100*(x+3)^4+(y-3)^4
    // using numerical differentiation to calculate gradient.
    //
    real_1d_array x = "[0,0]";
    double epsg = 0.0000000001;
    double epsf = 0;
    double epsx = 0;
    double diffstep = 1.0e-6;
    ae_int_t maxits = 0;
    mincgstate state;
    mincgreport rep;

    mincgcreatef(x, diffstep, state);
    mincgsetcond(state, epsg, epsf, epsx, maxits);
    alglib::mincgoptimize(state, function1_func);
    mincgresults(state, x, rep);

    printf("%d\n", int(rep.terminationtype)); // EXPECTED: 4
    printf("%s\n", x.tostring(2).c_str()); // EXPECTED: [-3,3]
    return 0;
}


minasareport
minasastate
minasacreate
minasaoptimize
minasarestartfrom
minasaresults
minasaresultsbuf
minasasetalgorithm
minasasetcond
minasasetstpmax
minasasetxrep
minbleicsetbarrierdecay
minbleicsetbarrierwidth
minlbfgssetcholeskypreconditioner
minlbfgssetdefaultpreconditioner
/************************************************************************* *************************************************************************/
class minasareport { ae_int_t iterationscount; ae_int_t nfev; ae_int_t terminationtype; ae_int_t activeconstraints; };
/************************************************************************* *************************************************************************/
class minasastate { };
/************************************************************************* Obsolete optimization algorithm. Was replaced by MinBLEIC subpackage. -- ALGLIB -- Copyright 25.03.2010 by Bochkanov Sergey *************************************************************************/
void alglib::minasacreate( real_1d_array x, real_1d_array bndl, real_1d_array bndu, minasastate& state);
void alglib::minasacreate( ae_int_t n, real_1d_array x, real_1d_array bndl, real_1d_array bndu, minasastate& state);
/************************************************************************* This family of functions is used to launch iterations of the nonlinear optimizer. These functions accept the following parameters: state - algorithm state grad - callback which calculates function (or merit function) value func and gradient grad at given point x rep - optional callback which is called after each iteration, can be NULL ptr - optional pointer which is passed to func/grad/hess/jac/rep, can be NULL -- ALGLIB -- Copyright 20.03.2009 by Bochkanov Sergey *************************************************************************/
void minasaoptimize(minasastate &state, void (*grad)(const real_1d_array &x, double &func, real_1d_array &grad, void *ptr), void (*rep)(const real_1d_array &x, double func, void *ptr) = NULL, void *ptr = NULL);
/************************************************************************* Obsolete optimization algorithm. Was replaced by MinBLEIC subpackage. -- ALGLIB -- Copyright 30.07.2010 by Bochkanov Sergey *************************************************************************/
void alglib::minasarestartfrom( minasastate state, real_1d_array x, real_1d_array bndl, real_1d_array bndu);
/************************************************************************* Obsolete optimization algorithm. Was replaced by MinBLEIC subpackage. -- ALGLIB -- Copyright 20.03.2009 by Bochkanov Sergey *************************************************************************/
void alglib::minasaresults( minasastate state, real_1d_array& x, minasareport& rep);
/************************************************************************* Obsolete optimization algorithm. Was replaced by MinBLEIC subpackage. -- ALGLIB -- Copyright 20.03.2009 by Bochkanov Sergey *************************************************************************/
void alglib::minasaresultsbuf( minasastate state, real_1d_array& x, minasareport& rep);
/************************************************************************* Obsolete optimization algorithm. Was replaced by MinBLEIC subpackage. -- ALGLIB -- Copyright 02.04.2010 by Bochkanov Sergey *************************************************************************/
void alglib::minasasetalgorithm(minasastate state, ae_int_t algotype);
/************************************************************************* Obsolete optimization algorithm. Was replaced by MinBLEIC subpackage. -- ALGLIB -- Copyright 02.04.2010 by Bochkanov Sergey *************************************************************************/
void alglib::minasasetcond( minasastate state, double epsg, double epsf, double epsx, ae_int_t maxits);
/************************************************************************* Obsolete optimization algorithm. Was replaced by MinBLEIC subpackage. -- ALGLIB -- Copyright 02.04.2010 by Bochkanov Sergey *************************************************************************/
void alglib::minasasetstpmax(minasastate state, double stpmax);
/************************************************************************* Obsolete optimization algorithm. Was replaced by MinBLEIC subpackage. -- ALGLIB -- Copyright 02.04.2010 by Bochkanov Sergey *************************************************************************/
void alglib::minasasetxrep(minasastate state, bool needxrep);
/************************************************************************* This is obsolete function which was used by previous version of the BLEIC optimizer. It does nothing in the current version of BLEIC. -- ALGLIB -- Copyright 28.11.2010 by Bochkanov Sergey *************************************************************************/
void alglib::minbleicsetbarrierdecay(minbleicstate state, double mudecay);
/************************************************************************* This is obsolete function which was used by previous version of the BLEIC optimizer. It does nothing in the current version of BLEIC. -- ALGLIB -- Copyright 28.11.2010 by Bochkanov Sergey *************************************************************************/
void alglib::minbleicsetbarrierwidth(minbleicstate state, double mu);
/************************************************************************* Obsolete function, use MinLBFGSSetCholeskyPreconditioner() instead. -- ALGLIB -- Copyright 13.10.2010 by Bochkanov Sergey *************************************************************************/
void alglib::minlbfgssetcholeskypreconditioner( minlbfgsstate state, real_2d_array p, bool isupper);
/************************************************************************* Obsolete function, use MinLBFGSSetPrecDefault() instead. -- ALGLIB -- Copyright 13.10.2010 by Bochkanov Sergey *************************************************************************/
void alglib::minlbfgssetdefaultpreconditioner(minlbfgsstate state);
minlbfgsreport
minlbfgsstate
minlbfgscreate
minlbfgscreatef
minlbfgsoptimize
minlbfgsrequesttermination
minlbfgsrestartfrom
minlbfgsresults
minlbfgsresultsbuf
minlbfgssetcond
minlbfgssetgradientcheck
minlbfgssetpreccholesky
minlbfgssetprecdefault
minlbfgssetprecdiag
minlbfgssetprecscale
minlbfgssetscale
minlbfgssetstpmax
minlbfgssetxrep
minlbfgs_d_1 Nonlinear optimization by L-BFGS
minlbfgs_d_2 Nonlinear optimization with additional settings and restarts
minlbfgs_ftrim Nonlinear optimization by LBFGS, function with singularities
minlbfgs_numdiff Nonlinear optimization by L-BFGS with numerical differentiation
/************************************************************************* This structure stores optimization report: * IterationsCount total number of inner iterations * NFEV number of gradient evaluations * TerminationType termination type (see below) TERMINATION CODES TerminationType field contains completion code, which can be: -8 internal integrity control detected infinite or NAN values in function/gradient. Abnormal termination signalled. -7 gradient verification failed. See MinLBFGSSetGradientCheck() for more information. 1 relative function improvement is no more than EpsF. 2 relative step is no more than EpsX. 4 gradient norm is no more than EpsG 5 MaxIts steps were taken 7 stopping conditions are too stringent, further improvement is impossible, X contains best point found so far. 8 terminated by user who called minlbfgsrequesttermination(). X contains point which was "current accepted" when termination request was submitted. Other fields of this structure are not documented and should not be used! *************************************************************************/
class minlbfgsreport { ae_int_t iterationscount; ae_int_t nfev; ae_int_t varidx; ae_int_t terminationtype; };
/************************************************************************* *************************************************************************/
class minlbfgsstate { };
/************************************************************************* LIMITED MEMORY BFGS METHOD FOR LARGE SCALE OPTIMIZATION DESCRIPTION: The subroutine minimizes function F(x) of N arguments by using a quasi-Newton method (LBFGS scheme) which is optimized to use a minimum amount of memory. The subroutine generates the approximation of an inverse Hessian matrix by using information about the last M steps of the algorithm (instead of N). It lessens a required amount of memory from a value of order N^2 to a value of order 2*N*M. REQUIREMENTS: Algorithm will request following information during its operation: * function value F and its gradient G (simultaneously) at given point X USAGE: 1. User initializes algorithm state with MinLBFGSCreate() call 2. User tunes solver parameters with MinLBFGSSetCond(), MinLBFGSSetStpMax() and other functions 3. User calls MinLBFGSOptimize() function which takes algorithm state and pointer (delegate, etc.) to callback function which calculates F/G. 4. User calls MinLBFGSResults() to get solution 5. Optionally user may call MinLBFGSRestartFrom() to solve another problem with same N/M but another starting point and/or another function. MinLBFGSRestartFrom() allows to reuse already initialized structure. INPUT PARAMETERS: N - problem dimension. N>0 M - number of corrections in the BFGS scheme of Hessian approximation update. Recommended value: 3<=M<=7. The smaller value causes worse convergence, the bigger will not cause a considerably better convergence, but will cause a fall in the performance. M<=N. X - initial solution approximation, array[0..N-1]. OUTPUT PARAMETERS: State - structure which stores algorithm state NOTES: 1. you may tune stopping conditions with MinLBFGSSetCond() function 2. if target function contains exp() or other fast growing functions, and optimization algorithm makes too large steps which leads to overflow, use MinLBFGSSetStpMax() function to bound algorithm's steps. However, L-BFGS rarely needs such a tuning. -- ALGLIB -- Copyright 02.04.2010 by Bochkanov Sergey *************************************************************************/
void alglib::minlbfgscreate( ae_int_t m, real_1d_array x, minlbfgsstate& state);
void alglib::minlbfgscreate( ae_int_t n, ae_int_t m, real_1d_array x, minlbfgsstate& state);

Examples:   [1]  [2]  [3]  

/************************************************************************* The subroutine is finite difference variant of MinLBFGSCreate(). It uses finite differences in order to differentiate target function. Description below contains information which is specific to this function only. We recommend to read comments on MinLBFGSCreate() in order to get more information about creation of LBFGS optimizer. INPUT PARAMETERS: N - problem dimension, N>0: * if given, only leading N elements of X are used * if not given, automatically determined from size of X M - number of corrections in the BFGS scheme of Hessian approximation update. Recommended value: 3<=M<=7. The smaller value causes worse convergence, the bigger will not cause a considerably better convergence, but will cause a fall in the performance. M<=N. X - starting point, array[0..N-1]. DiffStep- differentiation step, >0 OUTPUT PARAMETERS: State - structure which stores algorithm state NOTES: 1. algorithm uses 4-point central formula for differentiation. 2. differentiation step along I-th axis is equal to DiffStep*S[I] where S[] is scaling vector which can be set by MinLBFGSSetScale() call. 3. we recommend you to use moderate values of differentiation step. Too large step will result in too large truncation errors, while too small step will result in too large numerical errors. 1.0E-6 can be good value to start with. 4. Numerical differentiation is very inefficient - one gradient calculation needs 4*N function evaluations. This function will work for any N - either small (1...10), moderate (10...100) or large (100...). However, performance penalty will be too severe for any N's except for small ones. We should also say that code which relies on numerical differentiation is less robust and precise. LBFGS needs exact gradient values. Imprecise gradient may slow down convergence, especially on highly nonlinear problems. Thus we recommend to use this function for fast prototyping on small- dimensional problems only, and to implement analytical gradient as soon as possible. -- ALGLIB -- Copyright 16.05.2011 by Bochkanov Sergey *************************************************************************/
void alglib::minlbfgscreatef( ae_int_t m, real_1d_array x, double diffstep, minlbfgsstate& state);
void alglib::minlbfgscreatef( ae_int_t n, ae_int_t m, real_1d_array x, double diffstep, minlbfgsstate& state);

Examples:   [1]  

/************************************************************************* This family of functions is used to launch iterations of the nonlinear optimizer. These functions accept the following parameters: state - algorithm state func - callback which calculates function (or merit function) value func at given point x grad - callback which calculates function (or merit function) value func and gradient grad at given point x rep - optional callback which is called after each iteration, can be NULL ptr - optional pointer which is passed to func/grad/hess/jac/rep, can be NULL NOTES: 1. This function has two different implementations: one which uses exact (analytical) user-supplied gradient, and one which uses function value only and numerically differentiates function in order to obtain gradient. Depending on the specific function used to create optimizer object (either MinLBFGSCreate() for analytical gradient or MinLBFGSCreateF() for numerical differentiation) you should choose appropriate variant of MinLBFGSOptimize() - one which accepts function AND gradient or one which accepts function ONLY. Be careful to choose variant of MinLBFGSOptimize() which corresponds to your optimization scheme! Table below lists different combinations of callback (function/gradient) passed to MinLBFGSOptimize() and specific function used to create optimizer.
                      |      USER PASSED TO MinLBFGSOptimize()
  CREATED WITH        |  function only  |  function and gradient
  ------------------------------------------------------------
  MinLBFGSCreateF()   |      work       |          FAIL
  MinLBFGSCreate()    |      FAIL       |          work
Here "FAIL" denotes inappropriate combinations of optimizer creation function and MinLBFGSOptimize() version. Attempts to use such a combination (for example, to create optimizer with MinLBFGSCreateF() and to pass gradient information to MinLBFGSOptimize()) will lead to an exception being thrown. Either you did not pass gradient when it WAS needed or you passed gradient when it was NOT needed. -- ALGLIB -- Copyright 20.03.2009 by Bochkanov Sergey *************************************************************************/
void minlbfgsoptimize(minlbfgsstate &state, void (*func)(const real_1d_array &x, double &func, void *ptr), void (*rep)(const real_1d_array &x, double func, void *ptr) = NULL, void *ptr = NULL);
void minlbfgsoptimize(minlbfgsstate &state, void (*grad)(const real_1d_array &x, double &func, real_1d_array &grad, void *ptr), void (*rep)(const real_1d_array &x, double func, void *ptr) = NULL, void *ptr = NULL);

Examples:   [1]  [2]  [3]  [4]  

/************************************************************************* This subroutine submits request for termination of running optimizer. It should be called from user-supplied callback when user decides that it is time to "smoothly" terminate optimization process. As result, optimizer stops at point which was "current accepted" when termination request was submitted and returns error code 8 (successful termination). INPUT PARAMETERS: State - optimizer structure NOTE: after request for termination optimizer may perform several additional calls to user-supplied callbacks. It does NOT guarantee to stop immediately - it just guarantees that these additional calls will be discarded later. NOTE: calling this function on optimizer which is NOT running will have no effect. NOTE: multiple calls to this function are possible. First call is counted, subsequent calls are silently ignored. -- ALGLIB -- Copyright 08.10.2014 by Bochkanov Sergey *************************************************************************/
void alglib::minlbfgsrequesttermination(minlbfgsstate state);
/************************************************************************* This subroutine restarts LBFGS algorithm from new point. All optimization parameters are left unchanged. This function allows to solve multiple optimization problems (which must have same number of dimensions) without object reallocation penalty. INPUT PARAMETERS: State - structure used to store algorithm state X - new starting point. -- ALGLIB -- Copyright 30.07.2010 by Bochkanov Sergey *************************************************************************/
void alglib::minlbfgsrestartfrom(minlbfgsstate state, real_1d_array x);

Examples:   [1]  

/************************************************************************* L-BFGS algorithm results INPUT PARAMETERS: State - algorithm state OUTPUT PARAMETERS: X - array[0..N-1], solution Rep - optimization report: * Rep.TerminationType completion code: * -8 internal integrity control detected infinite or NAN values in function/gradient. Abnormal termination signalled. * -7 gradient verification failed. See MinLBFGSSetGradientCheck() for more information. * -2 rounding errors prevent further improvement. X contains best point found. * -1 incorrect parameters were specified * 1 relative function improvement is no more than EpsF. * 2 relative step is no more than EpsX. * 4 gradient norm is no more than EpsG * 5 MaxIts steps were taken * 7 stopping conditions are too stringent, further improvement is impossible * 8 terminated by user who called minlbfgsrequesttermination(). X contains point which was "current accepted" when termination request was submitted. * Rep.IterationsCount contains iterations count * NFEV contains number of function calculations -- ALGLIB -- Copyright 02.04.2010 by Bochkanov Sergey *************************************************************************/
void alglib::minlbfgsresults( minlbfgsstate state, real_1d_array& x, minlbfgsreport& rep);

Examples:   [1]  [2]  [3]  [4]  

/************************************************************************* L-BFGS algorithm results Buffered implementation of MinLBFGSResults which uses pre-allocated buffer to store X[]. If buffer size is too small, it resizes buffer. It is intended to be used in the inner cycles of performance critical algorithms where array reallocation penalty is too large to be ignored. -- ALGLIB -- Copyright 20.08.2010 by Bochkanov Sergey *************************************************************************/
void alglib::minlbfgsresultsbuf( minlbfgsstate state, real_1d_array& x, minlbfgsreport& rep);
/************************************************************************* This function sets stopping conditions for L-BFGS optimization algorithm. INPUT PARAMETERS: State - structure which stores algorithm state EpsG - >=0 The subroutine finishes its work if the condition |v|<EpsG is satisfied, where: * |.| means Euclidean norm * v - scaled gradient vector, v[i]=g[i]*s[i] * g - gradient * s - scaling coefficients set by MinLBFGSSetScale() EpsF - >=0 The subroutine finishes its work if on k+1-th iteration the condition |F(k+1)-F(k)|<=EpsF*max{|F(k)|,|F(k+1)|,1} is satisfied. EpsX - >=0 The subroutine finishes its work if on k+1-th iteration the condition |v|<=EpsX is fulfilled, where: * |.| means Euclidean norm * v - scaled step vector, v[i]=dx[i]/s[i] * dx - step vector, dx=X(k+1)-X(k) * s - scaling coefficients set by MinLBFGSSetScale() MaxIts - maximum number of iterations. If MaxIts=0, the number of iterations is unlimited. Passing EpsG=0, EpsF=0, EpsX=0 and MaxIts=0 (simultaneously) will lead to automatic stopping criterion selection (small EpsX). -- ALGLIB -- Copyright 02.04.2010 by Bochkanov Sergey *************************************************************************/
void alglib::minlbfgssetcond( minlbfgsstate state, double epsg, double epsf, double epsx, ae_int_t maxits);

Examples:   [1]  [2]  [3]  

/************************************************************************* This subroutine turns on verification of the user-supplied analytic gradient: * user calls this subroutine before optimization begins * MinLBFGSOptimize() is called * prior to actual optimization, for each component of parameters being optimized X[i] algorithm performs following steps: * two trial steps are made to X[i]-TestStep*S[i] and X[i]+TestStep*S[i], where X[i] is i-th component of the initial point and S[i] is a scale of i-th parameter * if needed, steps are bounded with respect to constraints on X[] * F(X) is evaluated at these trial points * we perform one more evaluation in the middle point of the interval * we build cubic model using function values and derivatives at trial points and we compare its prediction with actual value in the middle point * in case difference between prediction and actual value is higher than some predetermined threshold, algorithm stops with completion code -7; Rep.VarIdx is set to index of the parameter with incorrect derivative. * after verification is over, algorithm proceeds to the actual optimization. NOTE 1: verification needs N (parameters count) gradient evaluations. It is very costly and you should use it only for low dimensional problems, when you want to be sure that you've correctly calculated analytic derivatives. You should not use it in the production code (unless you want to check derivatives provided by some third party). NOTE 2: you should carefully choose TestStep. Value which is too large (so large that function behaviour is significantly non-cubic) will lead to false alarms. You may use different step for different parameters by means of setting scale with MinLBFGSSetScale(). NOTE 3: this function may lead to false positives. In case it reports that I-th derivative was calculated incorrectly, you may decrease test step and try one more time - maybe your function changes too sharply and your step is too large for such a rapidly changing function. INPUT PARAMETERS: State - structure used to store algorithm state TestStep - verification step: * TestStep=0 turns verification off * TestStep>0 activates verification -- ALGLIB -- Copyright 24.05.2012 by Bochkanov Sergey *************************************************************************/
void alglib::minlbfgssetgradientcheck( minlbfgsstate state, double teststep);
/************************************************************************* Modification of the preconditioner: Cholesky factorization of approximate Hessian is used. INPUT PARAMETERS: State - structure which stores algorithm state P - triangular preconditioner, Cholesky factorization of the approximate Hessian. array[0..N-1,0..N-1], (if larger, only leading N elements are used). IsUpper - whether upper or lower triangle of P is given (other triangle is not referenced) After call to this function preconditioner is changed to P (P is copied into the internal buffer). NOTE: you can change preconditioner "on the fly", during algorithm iterations. NOTE 2: P should be nonsingular. Exception will be thrown otherwise. -- ALGLIB -- Copyright 13.10.2010 by Bochkanov Sergey *************************************************************************/
void alglib::minlbfgssetpreccholesky( minlbfgsstate state, real_2d_array p, bool isupper);
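As an illustration, an upper-triangular Cholesky factor of a rough 2x2 Hessian approximation might be supplied as sketched below; the numeric values are placeholders, not recommendations, and the state object is assumed to be already created.

real_2d_array p = "[[20.0, 0.0], [0.0, 2.0]]";   // upper triangular, nonsingular Cholesky factor
minlbfgssetpreccholesky(state, p, true);         // true: the upper triangle of p is referenced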
/************************************************************************* Modification of the preconditioner: default preconditioner (simple scaling, same for all elements of X) is used. INPUT PARAMETERS: State - structure which stores algorithm state NOTE: you can change preconditioner "on the fly", during algorithm iterations. -- ALGLIB -- Copyright 13.10.2010 by Bochkanov Sergey *************************************************************************/
void alglib::minlbfgssetprecdefault(minlbfgsstate state);
/************************************************************************* Modification of the preconditioner: diagonal of approximate Hessian is used. INPUT PARAMETERS: State - structure which stores algorithm state D - diagonal of the approximate Hessian, array[0..N-1], (if larger, only leading N elements are used). NOTE: you can change preconditioner "on the fly", during algorithm iterations. NOTE 2: D[i] should be positive. Exception will be thrown otherwise. NOTE 3: you should pass diagonal of approximate Hessian - NOT ITS INVERSE. -- ALGLIB -- Copyright 13.10.2010 by Bochkanov Sergey *************************************************************************/
void alglib::minlbfgssetprecdiag(minlbfgsstate state, real_1d_array d);
/************************************************************************* Modification of the preconditioner: scale-based diagonal preconditioning. This preconditioning mode can be useful when you don't have approximate diagonal of Hessian, but you know that your variables are badly scaled (for example, one variable is in [1,10], and another in [1000,100000]), and most part of the ill-conditioning comes from different scales of vars. In this case simple scale-based preconditioner, with H[i] = 1/(s[i]^2), can greatly improve convergence. IMPORTANT: you should set scale of your variables with MinLBFGSSetScale() call (before or after MinLBFGSSetPrecScale() call). Without knowledge of the scale of your variables scale-based preconditioner will be just a unit matrix. INPUT PARAMETERS: State - structure which stores algorithm state -- ALGLIB -- Copyright 13.10.2010 by Bochkanov Sergey *************************************************************************/
void alglib::minlbfgssetprecscale(minlbfgsstate state);
/************************************************************************* This function sets scaling coefficients for LBFGS optimizer. ALGLIB optimizers use scaling matrices to test stopping conditions (step size and gradient are scaled before comparison with tolerances). Scale of the I-th variable is a translation invariant measure of: a) "how large" the variable is b) how large the step should be to make significant changes in the function Scaling is also used by finite difference variant of the optimizer - step along I-th axis is equal to DiffStep*S[I]. In most optimizers (and in the LBFGS too) scaling is NOT a form of preconditioning. It just affects stopping conditions. You should set preconditioner by separate call to one of the MinLBFGSSetPrec...() functions. There is special preconditioning mode, however, which uses scaling coefficients to form diagonal preconditioning matrix. You can turn this mode on, if you want. But you should understand that scaling is not the same thing as preconditioning - these are two different, although related forms of tuning solver. INPUT PARAMETERS: State - structure stores algorithm state S - array[N], non-zero scaling coefficients S[i] may be negative, sign doesn't matter. -- ALGLIB -- Copyright 14.01.2011 by Bochkanov Sergey *************************************************************************/
void alglib::minlbfgssetscale(minlbfgsstate state, real_1d_array s);
/************************************************************************* This function sets maximum step length INPUT PARAMETERS: State - structure which stores algorithm state StpMax - maximum step length, >=0. Set StpMax to 0.0 (default), if you don't want to limit step length. Use this subroutine when you optimize target function which contains exp() or other fast growing functions, and optimization algorithm makes too large steps which leads to overflow. This function allows us to reject steps that are too large (and therefore expose us to the possible overflow) without actually calculating function value at the x+stp*d. -- ALGLIB -- Copyright 02.04.2010 by Bochkanov Sergey *************************************************************************/
void alglib::minlbfgssetstpmax(minlbfgsstate state, double stpmax);
/************************************************************************* This function turns on/off reporting. INPUT PARAMETERS: State - structure which stores algorithm state NeedXRep- whether iteration reports are needed or not If NeedXRep is True, algorithm will call rep() callback function if it is provided to MinLBFGSOptimize(). -- ALGLIB -- Copyright 02.04.2010 by Bochkanov Sergey *************************************************************************/
void alglib::minlbfgssetxrep(minlbfgsstate state, bool needxrep);
#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "optimization.h"

using namespace alglib;
void function1_grad(const real_1d_array &x, double &func, real_1d_array &grad, void *ptr) 
{
    //
    // this callback calculates f(x0,x1) = 100*(x0+3)^4 + (x1-3)^4
    // and its derivatives df/dx0 and df/dx1
    //
    func = 100*pow(x[0]+3,4) + pow(x[1]-3,4);
    grad[0] = 400*pow(x[0]+3,3);
    grad[1] = 4*pow(x[1]-3,3);
}

int main(int argc, char **argv)
{
    //
    // This example demonstrates minimization of f(x,y) = 100*(x+3)^4+(y-3)^4
    // using LBFGS method.
    //
    real_1d_array x = "[0,0]";
    double epsg = 0.0000000001;
    double epsf = 0;
    double epsx = 0;
    ae_int_t maxits = 0;
    minlbfgsstate state;
    minlbfgsreport rep;

    minlbfgscreate(1, x, state);
    minlbfgssetcond(state, epsg, epsf, epsx, maxits);
    alglib::minlbfgsoptimize(state, function1_grad);
    minlbfgsresults(state, x, rep);

    printf("%d\n", int(rep.terminationtype)); // EXPECTED: 4
    printf("%s\n", x.tostring(2).c_str()); // EXPECTED: [-3,3]
    return 0;
}


#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "optimization.h"

using namespace alglib;
void function1_grad(const real_1d_array &x, double &func, real_1d_array &grad, void *ptr) 
{
    //
    // this callback calculates f(x0,x1) = 100*(x0+3)^4 + (x1-3)^4
    // and its derivatives df/dx0 and df/dx1
    //
    func = 100*pow(x[0]+3,4) + pow(x[1]-3,4);
    grad[0] = 400*pow(x[0]+3,3);
    grad[1] = 4*pow(x[1]-3,3);
}

int main(int argc, char **argv)
{
    //
    // This example demonstrates minimization of f(x,y) = 100*(x+3)^4+(y-3)^4
    // using LBFGS method.
    //
    // Several advanced techniques are demonstrated:
    // * upper limit on step size
    // * restart from new point
    //
    real_1d_array x = "[0,0]";
    double epsg = 0.0000000001;
    double epsf = 0;
    double epsx = 0;
    double stpmax = 0.1;
    ae_int_t maxits = 0;
    minlbfgsstate state;
    minlbfgsreport rep;

    // first run
    minlbfgscreate(1, x, state);
    minlbfgssetcond(state, epsg, epsf, epsx, maxits);
    minlbfgssetstpmax(state, stpmax);
    alglib::minlbfgsoptimize(state, function1_grad);
    minlbfgsresults(state, x, rep);

    printf("%s\n", x.tostring(2).c_str()); // EXPECTED: [-3,3]

    // second run - algorithm is restarted
    x = "[10,10]";
    minlbfgsrestartfrom(state, x);
    alglib::minlbfgsoptimize(state, function1_grad);
    minlbfgsresults(state, x, rep);

    printf("%d\n", int(rep.terminationtype)); // EXPECTED: 4
    printf("%s\n", x.tostring(2).c_str()); // EXPECTED: [-3,3]
    return 0;
}


#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "optimization.h"

using namespace alglib;
void s1_grad(const real_1d_array &x, double &func, real_1d_array &grad, void *ptr)
{
    //
    // this callback calculates f(x) = (1+x)^(-0.2) + (1-x)^(-0.3) + 1000*x and its gradient.
    //
    // function is trimmed when we calculate it near the singular points or outside of the [-1,+1].
    // Note that we do NOT calculate gradient in this case.
    //
    if( (x[0]<=-0.999999999999) || (x[0]>=+0.999999999999) )
    {
        func = 1.0E+300;
        return;
    }
    func = pow(1+x[0],-0.2) + pow(1-x[0],-0.3) + 1000*x[0];
    grad[0] = -0.2*pow(1+x[0],-1.2) +0.3*pow(1-x[0],-1.3) + 1000;
}

int main(int argc, char **argv)
{
    //
    // This example demonstrates minimization of f(x) = (1+x)^(-0.2) + (1-x)^(-0.3) + 1000*x.
    // This function has singularities at the boundary of the [-1,+1], but technique called
    // "function trimming" allows us to solve this optimization problem.
    //
    // See http://www.alglib.net/optimization/tipsandtricks.php#ftrimming for more information
    // on this subject.
    //
    real_1d_array x = "[0]";
    double epsg = 1.0e-6;
    double epsf = 0;
    double epsx = 0;
    ae_int_t maxits = 0;
    minlbfgsstate state;
    minlbfgsreport rep;

    minlbfgscreate(1, x, state);
    minlbfgssetcond(state, epsg, epsf, epsx, maxits);
    alglib::minlbfgsoptimize(state, s1_grad);
    minlbfgsresults(state, x, rep);

    printf("%s\n", x.tostring(5).c_str()); // EXPECTED: [-0.99917305]
    return 0;
}


#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "optimization.h"

using namespace alglib;
void function1_func(const real_1d_array &x, double &func, void *ptr)
{
    //
    // this callback calculates f(x0,x1) = 100*(x0+3)^4 + (x1-3)^4
    //
    func = 100*pow(x[0]+3,4) + pow(x[1]-3,4);
}

int main(int argc, char **argv)
{
    //
    // This example demonstrates minimization of f(x,y) = 100*(x+3)^4+(y-3)^4
    // using numerical differentiation to calculate gradient.
    //
    real_1d_array x = "[0,0]";
    double epsg = 0.0000000001;
    double epsf = 0;
    double epsx = 0;
    double diffstep = 1.0e-6;
    ae_int_t maxits = 0;
    minlbfgsstate state;
    minlbfgsreport rep;

    minlbfgscreatef(1, x, diffstep, state);
    minlbfgssetcond(state, epsg, epsf, epsx, maxits);
    alglib::minlbfgsoptimize(state, function1_func);
    minlbfgsresults(state, x, rep);

    printf("%d\n", int(rep.terminationtype)); // EXPECTED: 4
    printf("%s\n", x.tostring(2).c_str()); // EXPECTED: [-3,3]
    return 0;
}


minlmreport
minlmstate
minlmcreatefgh
minlmcreatefgj
minlmcreatefj
minlmcreatev
minlmcreatevgj
minlmcreatevj
minlmoptimize
minlmrequesttermination
minlmrestartfrom
minlmresults
minlmresultsbuf
minlmsetacctype
minlmsetbc
minlmsetcond
minlmsetgradientcheck
minlmsetlc
minlmsetscale
minlmsetstpmax
minlmsetxrep
minlm_d_fgh Nonlinear Hessian-based optimization for general functions
minlm_d_restarts Efficient restarts of LM optimizer
minlm_d_v Nonlinear least squares optimization using function vector only
minlm_d_vb Bound constrained nonlinear least squares optimization
minlm_d_vj Nonlinear least squares optimization using function vector and Jacobian
/************************************************************************* Optimization report, filled by MinLMResults() function FIELDS: * TerminationType, completion code: * -8 optimizer detected NAN/INF values either in the function itself, or in its Jacobian * -7 derivative correctness check failed; see rep.funcidx, rep.varidx for more information. * -5 inappropriate solver was used: * solver created with minlmcreatefgh() used on problem with general linear constraints (set with minlmsetlc() call). * -3 constraints are inconsistent * 2 relative step is no more than EpsX. * 5 MaxIts steps were taken * 7 stopping conditions are too stringent, further improvement is impossible * 8 terminated by user who called MinLMRequestTermination(). X contains point which was "current accepted" when termination request was submitted. * IterationsCount, contains iterations count * NFunc, number of function calculations * NJac, number of Jacobian matrix calculations * NGrad, number of gradient calculations * NHess, number of Hessian calculations * NCholesky, number of Cholesky decomposition calculations *************************************************************************/
class minlmreport { ae_int_t iterationscount; ae_int_t terminationtype; ae_int_t funcidx; ae_int_t varidx; ae_int_t nfunc; ae_int_t njac; ae_int_t ngrad; ae_int_t nhess; ae_int_t ncholesky; };
/************************************************************************* Levenberg-Marquardt optimizer. This structure should be created using one of the MinLMCreate???() functions. You should not access its fields directly; use ALGLIB functions to work with it. *************************************************************************/
class minlmstate { };
/************************************************************************* LEVENBERG-MARQUARDT-LIKE METHOD FOR NON-LINEAR OPTIMIZATION DESCRIPTION: This function is used to find minimum of general form (not "sum-of-squares") function F = F(x[0], ..., x[n-1]) using its gradient and Hessian. Levenberg-Marquardt modification with L-BFGS pre-optimization and internal pre-conditioned L-BFGS optimization after each Levenberg-Marquardt step is used. REQUIREMENTS: This algorithm will request following information during its operation: * function value F at given point X * F and gradient G (simultaneously) at given point X * F, G and Hessian H (simultaneously) at given point X There are several overloaded versions of MinLMOptimize() function which correspond to different LM-like optimization algorithms provided by this unit. You should choose version which accepts func(), grad() and hess() function pointers. First pointer is used to calculate F at given point, second one calculates F(x) and grad F(x), third one calculates F(x), grad F(x), hess F(x). You can try to initialize MinLMState structure with FGH-function and then use incorrect version of MinLMOptimize() (for example, version which does not provide Hessian matrix), but it will lead to exception being thrown after first attempt to calculate Hessian. USAGE: 1. User initializes algorithm state with MinLMCreateFGH() call 2. User tunes solver parameters with MinLMSetCond(), MinLMSetStpMax() and other functions 3. User calls MinLMOptimize() function which takes algorithm state and pointers (delegates, etc.) to callback functions. 4. User calls MinLMResults() to get solution 5. Optionally, user may call MinLMRestartFrom() to solve another problem with same N but another starting point and/or another function. MinLMRestartFrom() allows to reuse already initialized structure. INPUT PARAMETERS: N - dimension, N>1 * if given, only leading N elements of X are used * if not given, automatically determined from size of X X - initial solution, array[0..N-1] OUTPUT PARAMETERS: State - structure which stores algorithm state NOTES: 1. you may tune stopping conditions with MinLMSetCond() function 2. if target function contains exp() or other fast growing functions, and optimization algorithm makes too large steps which leads to overflow, use MinLMSetStpMax() function to bound algorithm's steps. -- ALGLIB -- Copyright 30.03.2009 by Bochkanov Sergey *************************************************************************/
void alglib::minlmcreatefgh(real_1d_array x, minlmstate& state); void alglib::minlmcreatefgh( ae_int_t n, real_1d_array x, minlmstate& state);

Examples:   [1]  

/************************************************************************* This is an obsolete function. Since ALGLIB 3.3 it is equivalent to MinLMCreateFJ(). -- ALGLIB -- Copyright 30.03.2009 by Bochkanov Sergey *************************************************************************/
void alglib::minlmcreatefgj( ae_int_t m, real_1d_array x, minlmstate& state); void alglib::minlmcreatefgj( ae_int_t n, ae_int_t m, real_1d_array x, minlmstate& state);
/************************************************************************* This function is considered obsolete since ALGLIB 3.1.0 and is present for backward compatibility only. We recommend to use MinLMCreateVJ, which provides a similar but more consistent and feature-rich interface. -- ALGLIB -- Copyright 30.03.2009 by Bochkanov Sergey *************************************************************************/
void alglib::minlmcreatefj( ae_int_t m, real_1d_array x, minlmstate& state); void alglib::minlmcreatefj( ae_int_t n, ae_int_t m, real_1d_array x, minlmstate& state);
/************************************************************************* IMPROVED LEVENBERG-MARQUARDT METHOD FOR NON-LINEAR LEAST SQUARES OPTIMIZATION DESCRIPTION: This function is used to find minimum of function which is represented as sum of squares: F(x) = f[0]^2(x[0],...,x[n-1]) + ... + f[m-1]^2(x[0],...,x[n-1]) using value of function vector f[] only. Finite differences are used to calculate Jacobian. REQUIREMENTS: This algorithm will request following information during its operation: * function vector f[] at given point X There are several overloaded versions of MinLMOptimize() function which correspond to different LM-like optimization algorithms provided by this unit. You should choose version which accepts fvec() callback. You can try to initialize MinLMState structure with VJ function and then use incorrect version of MinLMOptimize() (for example, version which works with general form function and does not accept function vector), but it will lead to exception being thrown after first attempt to calculate Jacobian. USAGE: 1. User initializes algorithm state with MinLMCreateV() call 2. User tunes solver parameters with MinLMSetCond(), MinLMSetStpMax() and other functions 3. User calls MinLMOptimize() function which takes algorithm state and callback functions. 4. User calls MinLMResults() to get solution 5. Optionally, user may call MinLMRestartFrom() to solve another problem with same N/M but another starting point and/or another function. MinLMRestartFrom() allows to reuse already initialized structure. INPUT PARAMETERS: N - dimension, N>1 * if given, only leading N elements of X are used * if not given, automatically determined from size of X M - number of functions f[i] X - initial solution, array[0..N-1] DiffStep- differentiation step, >0 OUTPUT PARAMETERS: State - structure which stores algorithm state See also MinLMIteration, MinLMResults. NOTES: 1. you may tune stopping conditions with MinLMSetCond() function 2. if target function contains exp() or other fast growing functions, and optimization algorithm makes too large steps which leads to overflow, use MinLMSetStpMax() function to bound algorithm's steps. -- ALGLIB -- Copyright 30.03.2009 by Bochkanov Sergey *************************************************************************/
void alglib::minlmcreatev( ae_int_t m, real_1d_array x, double diffstep, minlmstate& state); void alglib::minlmcreatev( ae_int_t n, ae_int_t m, real_1d_array x, double diffstep, minlmstate& state);

Examples:   [1]  [2]  [3]  

/************************************************************************* This is an obsolete function. Since ALGLIB 3.3 it is equivalent to MinLMCreateVJ(). -- ALGLIB -- Copyright 30.03.2009 by Bochkanov Sergey *************************************************************************/
void alglib::minlmcreatevgj( ae_int_t m, real_1d_array x, minlmstate& state); void alglib::minlmcreatevgj( ae_int_t n, ae_int_t m, real_1d_array x, minlmstate& state);
/************************************************************************* IMPROVED LEVENBERG-MARQUARDT METHOD FOR NON-LINEAR LEAST SQUARES OPTIMIZATION DESCRIPTION: This function is used to find minimum of function which is represented as sum of squares: F(x) = f[0]^2(x[0],...,x[n-1]) + ... + f[m-1]^2(x[0],...,x[n-1]) using value of function vector f[] and Jacobian of f[]. REQUIREMENTS: This algorithm will request following information during its operation: * function vector f[] at given point X * function vector f[] and Jacobian of f[] (simultaneously) at given point There are several overloaded versions of MinLMOptimize() function which correspond to different LM-like optimization algorithms provided by this unit. You should choose version which accepts fvec() and jac() callbacks. First one is used to calculate f[] at given point, second one calculates f[] and Jacobian df[i]/dx[j]. You can try to initialize MinLMState structure with VJ function and then use incorrect version of MinLMOptimize() (for example, version which works with general form function and does not provide Jacobian), but it will lead to exception being thrown after first attempt to calculate Jacobian. USAGE: 1. User initializes algorithm state with MinLMCreateVJ() call 2. User tunes solver parameters with MinLMSetCond(), MinLMSetStpMax() and other functions 3. User calls MinLMOptimize() function which takes algorithm state and callback functions. 4. User calls MinLMResults() to get solution 5. Optionally, user may call MinLMRestartFrom() to solve another problem with same N/M but another starting point and/or another function. MinLMRestartFrom() allows to reuse already initialized structure. INPUT PARAMETERS: N - dimension, N>1 * if given, only leading N elements of X are used * if not given, automatically determined from size of X M - number of functions f[i] X - initial solution, array[0..N-1] OUTPUT PARAMETERS: State - structure which stores algorithm state NOTES: 1. you may tune stopping conditions with MinLMSetCond() function 2. if target function contains exp() or other fast growing functions, and optimization algorithm makes too large steps which leads to overflow, use MinLMSetStpMax() function to bound algorithm's steps. -- ALGLIB -- Copyright 30.03.2009 by Bochkanov Sergey *************************************************************************/
void alglib::minlmcreatevj( ae_int_t m, real_1d_array x, minlmstate& state); void alglib::minlmcreatevj( ae_int_t n, ae_int_t m, real_1d_array x, minlmstate& state);

Examples:   [1]  

/************************************************************************* This family of functions is used to launch iterations of nonlinear optimizer. These functions accept following parameters: state - algorithm state func - callback which calculates function (or merit function) value func at given point x grad - callback which calculates function (or merit function) value func and gradient grad at given point x hess - callback which calculates function (or merit function) value func, gradient grad and Hessian hess at given point x fvec - callback which calculates function vector fi[] at given point x jac - callback which calculates function vector fi[] and Jacobian jac at given point x rep - optional callback which is called after each iteration, can be NULL ptr - optional pointer which is passed to func/grad/hess/jac/rep, can be NULL NOTES: 1. Depending on function used to create state structure, this algorithm may accept Jacobian and/or Hessian and/or gradient. As said above, there are several versions of this function, which accept different sets of callbacks. This flexibility opens the way to subtle errors - you may create state with MinLMCreateFGH() (optimization using Hessian), but call function which does not accept Hessian. So when the algorithm requests the Hessian, there will be no callback to call. In this case an exception will be thrown. Be careful to avoid such errors because there is no way to find them at compile time - you can see them at runtime only. -- ALGLIB -- Copyright 10.03.2009 by Bochkanov Sergey *************************************************************************/
void minlmoptimize(minlmstate &state, void (*fvec)(const real_1d_array &x, real_1d_array &fi, void *ptr), void (*rep)(const real_1d_array &x, double func, void *ptr) = NULL, void *ptr = NULL); void minlmoptimize(minlmstate &state, void (*fvec)(const real_1d_array &x, real_1d_array &fi, void *ptr), void (*jac)(const real_1d_array &x, real_1d_array &fi, real_2d_array &jac, void *ptr), void (*rep)(const real_1d_array &x, double func, void *ptr) = NULL, void *ptr = NULL); void minlmoptimize(minlmstate &state, void (*func)(const real_1d_array &x, double &func, void *ptr), void (*grad)(const real_1d_array &x, double &func, real_1d_array &grad, void *ptr), void (*hess)(const real_1d_array &x, double &func, real_1d_array &grad, real_2d_array &hess, void *ptr), void (*rep)(const real_1d_array &x, double func, void *ptr) = NULL, void *ptr = NULL); void minlmoptimize(minlmstate &state, void (*func)(const real_1d_array &x, double &func, void *ptr), void (*jac)(const real_1d_array &x, real_1d_array &fi, real_2d_array &jac, void *ptr), void (*rep)(const real_1d_array &x, double func, void *ptr) = NULL, void *ptr = NULL); void minlmoptimize(minlmstate &state, void (*func)(const real_1d_array &x, double &func, void *ptr), void (*grad)(const real_1d_array &x, double &func, real_1d_array &grad, void *ptr), void (*jac)(const real_1d_array &x, real_1d_array &fi, real_2d_array &jac, void *ptr), void (*rep)(const real_1d_array &x, double func, void *ptr) = NULL, void *ptr = NULL);

Examples:   [1]  [2]  [3]  [4]  [5]  

/************************************************************************* This subroutine submits request for termination of running optimizer. It should be called from user-supplied callback when user decides that it is time to "smoothly" terminate optimization process. As a result, optimizer stops at point which was "current accepted" when termination request was submitted and returns error code 8 (successful termination). INPUT PARAMETERS: State - optimizer structure NOTE: after request for termination optimizer may perform several additional calls to user-supplied callbacks. It does NOT guarantee to stop immediately - it just guarantees that these additional calls will be discarded later. NOTE: calling this function on optimizer which is NOT running will have no effect. NOTE: multiple calls to this function are possible. First call is counted, subsequent calls are silently ignored. -- ALGLIB -- Copyright 08.10.2014 by Bochkanov Sergey *************************************************************************/
void alglib::minlmrequesttermination(minlmstate state);
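The following sketch is not part of the official example set; the callback and flag names are invented for illustration, and it assumes the includes and 'using namespace alglib;' from the examples later in this section. It shows one way to issue the request from a rep() callback by passing the optimizer state through the ptr argument of minlmoptimize():

volatile bool g_stop_requested = false;   // hypothetical flag, e.g. raised by another thread or a UI handler

void rep_with_termination(const real_1d_array &x, double func, void *ptr)
{
    // ptr is assumed to carry a pointer to the minlmstate passed to minlmoptimize()
    minlmstate *s = (minlmstate*)ptr;
    if( g_stop_requested )
        minlmrequesttermination(*s);      // optimizer stops at the current accepted point, code 8
}

// ... in the calling code, with state/x/rep prepared as in the examples below:
//     minlmsetxrep(state, true);                                           // enable rep() reports
//     alglib::minlmoptimize(state, function1_fvec, rep_with_termination, &state);
//     minlmresults(state, x, rep);                                         // rep.terminationtype==8 after termination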
/************************************************************************* This subroutine restarts LM algorithm from new point. All optimization parameters are left unchanged. This function allows to solve multiple optimization problems (which must have same number of dimensions) without object reallocation penalty. INPUT PARAMETERS: State - structure used for reverse communication previously allocated with MinLMCreateXXX call. X - new starting point. -- ALGLIB -- Copyright 30.07.2010 by Bochkanov Sergey *************************************************************************/
void alglib::minlmrestartfrom(minlmstate state, real_1d_array x);

Examples:   [1]  

/************************************************************************* Levenberg-Marquardt algorithm results INPUT PARAMETERS: State - algorithm state OUTPUT PARAMETERS: X - array[0..N-1], solution Rep - optimization report; includes termination codes and additional information. Termination codes are listed below, see comments for this structure for more info. Termination code is stored in rep.terminationtype field: * -8 optimizer detected NAN/INF values either in the function itself, or in its Jacobian * -7 derivative correctness check failed; see rep.funcidx, rep.varidx for more information. * -3 constraints are inconsistent * 2 relative step is no more than EpsX. * 5 MaxIts steps were taken * 7 stopping conditions are too stringent, further improvement is impossible * 8 terminated by user who called minlmrequesttermination(). X contains point which was "current accepted" when termination request was submitted. -- ALGLIB -- Copyright 10.03.2009 by Bochkanov Sergey *************************************************************************/
void alglib::minlmresults( minlmstate state, real_1d_array& x, minlmreport& rep);

Examples:   [1]  [2]  [3]  [4]  [5]  

/************************************************************************* Levenberg-Marquardt algorithm results Buffered implementation of MinLMResults(), which uses pre-allocated buffer to store X[]. If buffer size is too small, it resizes buffer. It is intended to be used in the inner cycles of performance critical algorithms where array reallocation penalty is too large to be ignored. -- ALGLIB -- Copyright 10.03.2009 by Bochkanov Sergey *************************************************************************/
void alglib::minlmresultsbuf( minlmstate state, real_1d_array& x, minlmreport& rep);
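A hedged usage sketch (the loop body and problem setup are placeholders, assuming a minlmstate named state created as in the examples below): reuse one pre-allocated array across many solves so that minlmresultsbuf() never has to reallocate it:

    real_1d_array xsol;
    xsol.setlength(2);                        // pre-allocate once, outside of the hot loop
    minlmreport rep;
    for(int k=0; k<1000; k++)
    {
        // ... load k-th starting point, call minlmrestartfrom() and minlmoptimize() ...
        minlmresultsbuf(state, xsol, rep);    // xsol is reused; it is resized only if too small
    }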
/************************************************************************* This function is used to change acceleration settings. You can choose between two acceleration strategies: * AccType=0, no acceleration. * AccType=1, secant updates are used to update quadratic model after each iteration. After fixed number of iterations (or after model breakdown) we recalculate quadratic model using analytic Jacobian or finite differences. Number of secant-based iterations depends on optimization settings: about 3 iterations - when we have analytic Jacobian, up to 2*N iterations - when we use finite differences to calculate Jacobian. AccType=1 is recommended when Jacobian calculation cost is prohibitively high (several Mx1 function vector calculations followed by several NxN Cholesky factorizations are faster than calculation of one M*N Jacobian). It should also be used when we have no Jacobian, because finite difference approximation takes too much time to compute. Table below lists optimization protocols (XYZ protocol corresponds to MinLMCreateXYZ) and acceleration types they support (and use by default). ACCELERATION TYPES SUPPORTED BY OPTIMIZATION PROTOCOLS: protocol 0 1 comment V + + VJ + + FGH + DEFAULT VALUES: protocol 0 1 comment V x without acceleration it is so slooooooooow VJ x FGH x NOTE: this function should be called before optimization. Attempt to call it during algorithm iterations may result in unexpected behavior. NOTE: attempt to call this function with unsupported protocol/acceleration combination will result in exception being thrown. -- ALGLIB -- Copyright 14.10.2010 by Bochkanov Sergey *************************************************************************/
void alglib::minlmsetacctype(minlmstate state, ae_int_t acctype);
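For instance (hedged sketch, reusing the variables from the V-mode example later in this section), acceleration can be disabled explicitly before the optimization starts:

    minlmcreatev(2, x, 0.0001, state);    // V protocol uses AccType=1 by default
    minlmsetacctype(state, 0);            // switch secant-based acceleration off
    minlmsetcond(state, epsx, maxits);
    alglib::minlmoptimize(state, function1_fvec);
    minlmresults(state, x, rep);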
/************************************************************************* This function sets boundary constraints for LM optimizer Boundary constraints are inactive by default (after initial creation). They are preserved until explicitly turned off with another SetBC() call. INPUT PARAMETERS: State - structure stores algorithm state BndL - lower bounds, array[N]. If some (all) variables are unbounded, you may specify very small number or -INF (latter is recommended because it will allow solver to use better algorithm). BndU - upper bounds, array[N]. If some (all) variables are unbounded, you may specify very large number or +INF (latter is recommended because it will allow solver to use better algorithm). NOTE 1: it is possible to specify BndL[i]=BndU[i]. In this case I-th variable will be "frozen" at X[i]=BndL[i]=BndU[i]. NOTE 2: this solver has following useful properties: * bound constraints are always satisfied exactly * function is evaluated only INSIDE area specified by bound constraints or at its boundary -- ALGLIB -- Copyright 14.01.2011 by Bochkanov Sergey *************************************************************************/
void alglib::minlmsetbc( minlmstate state, real_1d_array bndl, real_1d_array bndu);
/************************************************************************* This function sets stopping conditions for Levenberg-Marquardt optimization algorithm. INPUT PARAMETERS: State - structure which stores algorithm state EpsX - >=0 The subroutine finishes its work if on k+1-th iteration the condition |v|<=EpsX is fulfilled, where: * |.| means Euclidean norm * v - scaled step vector, v[i]=dx[i]/s[i] * dx - step vector, dx=X(k+1)-X(k) * s - scaling coefficients set by MinLMSetScale() Recommended values: 1E-9 ... 1E-12. MaxIts - maximum number of iterations. If MaxIts=0, the number of iterations is unlimited. Only Levenberg-Marquardt iterations are counted (L-BFGS/CG iterations are NOT counted because their cost is very low compared to that of LM). Passing EpsX=0 and MaxIts=0 (simultaneously) will lead to automatic stopping criterion selection (small EpsX). NOTE: it is not recommended to set large EpsX (say, 0.001). Because LM is a second-order method, it performs very precise steps anyway. -- ALGLIB -- Copyright 02.04.2010 by Bochkanov Sergey *************************************************************************/
void alglib::minlmsetcond(minlmstate state, double epsx, ae_int_t maxits);

Examples:   [1]  [2]  [3]  [4]  [5]  

/************************************************************************* This subroutine turns on verification of the user-supplied analytic gradient: * user calls this subroutine before optimization begins * MinLMOptimize() is called * prior to actual optimization, for each function Fi and each component of parameters being optimized X[j] algorithm performs following steps: * two trial steps are made to X[j]-TestStep*S[j] and X[j]+TestStep*S[j], where X[j] is j-th parameter and S[j] is a scale of j-th parameter * if needed, steps are bounded with respect to constraints on X[] * Fi(X) is evaluated at these trial points * we perform one more evaluation in the middle point of the interval * we build cubic model using function values and derivatives at trial points and we compare its prediction with actual value in the middle point * in case difference between prediction and actual value is higher than some predetermined threshold, algorithm stops with completion code -7; Rep.VarIdx is set to index of the parameter with incorrect derivative, Rep.FuncIdx is set to index of the function. * after verification is over, algorithm proceeds to the actual optimization. NOTE 1: verification needs N (parameters count) Jacobian evaluations. It is very costly and you should use it only for low dimensional problems, when you want to be sure that you've correctly calculated analytic derivatives. You should not use it in the production code (unless you want to check derivatives provided by some third party). NOTE 2: you should carefully choose TestStep. Value which is too large (so large that function behaviour is significantly non-cubic) will lead to false alarms. You may use different step for different parameters by means of setting scale with MinLMSetScale(). NOTE 3: this function may lead to false positives. In case it reports that I-th derivative was calculated incorrectly, you may decrease test step and try one more time - maybe your function changes too sharply and your step is too large for such rapidly changing function. INPUT PARAMETERS: State - structure used to store algorithm state TestStep - verification step: * TestStep=0 turns verification off * TestStep>0 activates verification -- ALGLIB -- Copyright 15.06.2012 by Bochkanov Sergey *************************************************************************/
void alglib::minlmsetgradientcheck(minlmstate state, double teststep);
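A hedged sketch of derivative verification, built on the VJ-mode example near the end of this section (the 1.0E-4 test step is illustrative only):

    minlmcreatevj(2, x, state);
    minlmsetgradientcheck(state, 1.0E-4);     // TestStep>0 activates verification
    minlmsetcond(state, epsx, maxits);
    alglib::minlmoptimize(state, function1_fvec, function1_jac);
    minlmresults(state, x, rep);
    if( rep.terminationtype==-7 )
        printf("suspicious derivative: function %d, variable %d\n",
            (int)rep.funcidx, (int)rep.varidx);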
/************************************************************************* This function sets general linear constraints for LM optimizer. Linear constraints are inactive by default (after initial creation). They are preserved until explicitly turned off with another minlmsetlc() call. INPUT PARAMETERS: State - structure stores algorithm state C - linear constraints, array[K,N+1]. Each row of C represents one constraint, either equality or inequality (see below): * first N elements correspond to coefficients, * last element corresponds to the right part. All elements of C (including right part) must be finite. CT - type of constraints, array[K]: * if CT[i]>0, then I-th constraint is C[i,*]*x >= C[i,n+1] * if CT[i]=0, then I-th constraint is C[i,*]*x = C[i,n+1] * if CT[i]<0, then I-th constraint is C[i,*]*x <= C[i,n+1] K - number of equality/inequality constraints, K>=0: * if given, only leading K elements of C/CT are used * if not given, automatically determined from sizes of C/CT IMPORTANT: if you have linear constraints, it is strongly recommended to set scale of variables with minlmsetscale(). QP solver which is used to calculate linearly constrained steps heavily relies on good scaling of input problems. IMPORTANT: solvers created with minlmcreatefgh() do not support linear constraints. NOTE: linear (non-bound) constraints are satisfied only approximately - there always exists some violation due to numerical errors and algorithmic limitations. NOTE: general linear constraints add significant overhead to solution process. Although solver performs roughly the same number of iterations (when compared with similar box-only constrained problem), each iteration now involves solution of linearly constrained QP subproblem, which requires ~3-5 times more Cholesky decompositions. Thus, if you can reformulate your problem in such a way that it has only box constraints, it may be beneficial to do so. -- ALGLIB -- Copyright 14.01.2011 by Bochkanov Sergey *************************************************************************/
void alglib::minlmsetlc( minlmstate state, real_2d_array c, integer_1d_array ct); void alglib::minlmsetlc( minlmstate state, real_2d_array c, integer_1d_array ct, ae_int_t k);
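A hedged sketch of a single linear inequality constraint x0+x1<=4 on a 2-dimensional V-protocol problem, building on the V-mode example later in this section (the constraint itself and the scale vector are illustrative):

    real_2d_array c = "[[1,1,4]]";            // row = [coefficients..., right part]
    integer_1d_array ct = "[-1]";             // CT[i]<0 means C[i,*]*x <= right part
    real_1d_array s = "[1,1]";
    minlmcreatev(2, x, 0.0001, state);        // note: minlmcreatefgh() solvers do not support minlmsetlc()
    minlmsetscale(state, s);                  // strongly recommended when linear constraints are present
    minlmsetlc(state, c, ct);
    minlmsetcond(state, epsx, maxits);
    alglib::minlmoptimize(state, function1_fvec);
    minlmresults(state, x, rep);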
/************************************************************************* This function sets scaling coefficients for LM optimizer. ALGLIB optimizers use scaling matrices to test stopping conditions (step size and gradient are scaled before comparison with tolerances). Scale of the I-th variable is a translation invariant measure of: a) "how large" the variable is b) how large the step should be to make significant changes in the function Generally, scale is NOT considered to be a form of preconditioner. But LM optimizer is unique in that it uses scaling matrix both in the stopping condition tests and as Marquardt damping factor. Proper scaling is very important for the algorithm performance. It is less important for the quality of results, but still has some influence (it is easier to converge when variables are properly scaled, so premature stopping is possible when very badly scaled variables are combined with relaxed stopping conditions). INPUT PARAMETERS: State - structure stores algorithm state S - array[N], non-zero scaling coefficients S[i] may be negative, sign doesn't matter. -- ALGLIB -- Copyright 14.01.2011 by Bochkanov Sergey *************************************************************************/
void alglib::minlmsetscale(minlmstate state, real_1d_array s);
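For example (hedged sketch; the magnitudes are illustrative), when x0 is of order 1 and x1 of order 1000:

    real_1d_array s = "[1,1000]";
    minlmsetscale(state, s);      // stopping tests and damping now work with scaled quantities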
/************************************************************************* This function sets maximum step length INPUT PARAMETERS: State - structure which stores algorithm state StpMax - maximum step length, >=0. Set StpMax to 0.0, if you don't want to limit step length. Use this subroutine when you optimize target function which contains exp() or other fast growing functions, and optimization algorithm makes too large steps which leads to overflow. This function allows us to reject steps that are too large (and therefore expose us to the possible overflow) without actually calculating function value at the x+stp*d. NOTE: non-zero StpMax leads to moderate performance degradation because intermediate step of preconditioned L-BFGS optimization is incompatible with limits on step size. -- ALGLIB -- Copyright 02.04.2010 by Bochkanov Sergey *************************************************************************/
void alglib::minlmsetstpmax(minlmstate state, double stpmax);
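A minimal hedged sketch (the bound of 10 is arbitrary):

    minlmsetstpmax(state, 10.0);   // no single step will be longer than 10
    // ...
    minlmsetstpmax(state, 0.0);    // remove the limit again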
/************************************************************************* This function turns on/off reporting. INPUT PARAMETERS: State - structure which stores algorithm state NeedXRep- whether iteration reports are needed or not If NeedXRep is True, algorithm will call rep() callback function if it is provided to MinLMOptimize(). Both Levenberg-Marquardt and internal L-BFGS iterations are reported. -- ALGLIB -- Copyright 02.04.2010 by Bochkanov Sergey *************************************************************************/
void alglib::minlmsetxrep(minlmstate state, bool needxrep);
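A hedged sketch of a reporting callback (the callback name is invented; it assumes the includes and 'using namespace alglib;' used by the examples below):

void print_progress(const real_1d_array &x, double func, void *ptr)
{
    // called after each reported iteration when NeedXRep is true
    printf("F = %.6e at x = %s\n", func, x.tostring(4).c_str());
}

// ... in the calling code:
//     minlmsetxrep(state, true);
//     alglib::minlmoptimize(state, function1_fvec, print_progress, NULL);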
#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "optimization.h"

using namespace alglib;
void function1_func(const real_1d_array &x, double &func, void *ptr)
{
    //
    // this callback calculates f(x0,x1) = 100*(x0+3)^4 + (x1-3)^4
    //
    func = 100*pow(x[0]+3,4) + pow(x[1]-3,4);
}
void function1_grad(const real_1d_array &x, double &func, real_1d_array &grad, void *ptr) 
{
    //
    // this callback calculates f(x0,x1) = 100*(x0+3)^4 + (x1-3)^4
    // and its derivatives df/dx0 and df/dx1
    //
    func = 100*pow(x[0]+3,4) + pow(x[1]-3,4);
    grad[0] = 400*pow(x[0]+3,3);
    grad[1] = 4*pow(x[1]-3,3);
}
void function1_hess(const real_1d_array &x, double &func, real_1d_array &grad, real_2d_array &hess, void *ptr)
{
    //
    // this callback calculates f(x0,x1) = 100*(x0+3)^4 + (x1-3)^4
    // its derivatives df/dx0 and df/dx1
    // and its Hessian.
    //
    func = 100*pow(x[0]+3,4) + pow(x[1]-3,4);
    grad[0] = 400*pow(x[0]+3,3);
    grad[1] = 4*pow(x[1]-3,3);
    hess[0][0] = 1200*pow(x[0]+3,2);
    hess[0][1] = 0;
    hess[1][0] = 0;
    hess[1][1] = 12*pow(x[1]-3,2);
}

int main(int argc, char **argv)
{
    //
    // This example demonstrates minimization of F(x0,x1) = 100*(x0+3)^4+(x1-3)^4
    // using "FGH" mode of the Levenberg-Marquardt optimizer.
    //
    // F is treated like a monolithic function without internal structure,
    // i.e. we do NOT represent it as a sum of squares.
    //
    // Optimization algorithm uses:
    // * function value F(x0,x1)
    // * gradient G={dF/dxi}
    // * Hessian H={d2F/(dxi*dxj)}
    //
    real_1d_array x = "[0,0]";
    double epsx = 0.0000000001;
    ae_int_t maxits = 0;
    minlmstate state;
    minlmreport rep;

    minlmcreatefgh(x, state);
    minlmsetcond(state, epsx, maxits);
    alglib::minlmoptimize(state, function1_func, function1_grad, function1_hess);
    minlmresults(state, x, rep);

    printf("%s\n", x.tostring(2).c_str()); // EXPECTED: [-3,+3]
    return 0;
}


#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "optimization.h"

using namespace alglib;
void  function1_fvec(const real_1d_array &x, real_1d_array &fi, void *ptr)
{
    //
    // this callback calculates
    // f0(x0,x1) = 10*(x0+3)^2,
    // f1(x0,x1) = (x1-3)^2
    //
    fi[0] = 10*pow(x[0]+3,2);
    fi[1] = pow(x[1]-3,2);
}
void  function2_fvec(const real_1d_array &x, real_1d_array &fi, void *ptr)
{
    //
    // this callback calculates
    // f0(x0,x1) = x0^2+1
    // f1(x0,x1) = x1-1
    //
    fi[0] = x[0]*x[0]+1;
    fi[1] = x[1]-1;
}

int main(int argc, char **argv)
{
    //
    // This example demonstrates minimization of F(x0,x1) = f0^2+f1^2, where 
    //
    //     f0(x0,x1) = 10*(x0+3)^2
    //     f1(x0,x1) = (x1-3)^2
    //
    // using several starting points and efficient restarts.
    //
    real_1d_array x;
    double epsx = 0.0000000001;
    ae_int_t maxits = 0;
    minlmstate state;
    minlmreport rep;

    //
    // create optimizer using minlmcreatev()
    //
    x = "[10,10]";
    minlmcreatev(2, x, 0.0001, state);
    minlmsetcond(state, epsx, maxits);
    alglib::minlmoptimize(state, function1_fvec);
    minlmresults(state, x, rep);
    printf("%s\n", x.tostring(2).c_str()); // EXPECTED: [-3,+3]

    //
    // restart optimizer using minlmrestartfrom()
    //
    // we can use different starting point, different function,
    // different stopping conditions, but problem size
    // must remain unchanged.
    //
    x = "[4,4]";
    minlmrestartfrom(state, x);
    alglib::minlmoptimize(state, function2_fvec);
    minlmresults(state, x, rep);
    printf("%s\n", x.tostring(2).c_str()); // EXPECTED: [0,1]
    return 0;
}


#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "optimization.h"

using namespace alglib;
void  function1_fvec(const real_1d_array &x, real_1d_array &fi, void *ptr)
{
    //
    // this callback calculates
    // f0(x0,x1) = 10*(x0+3)^2,
    // f1(x0,x1) = (x1-3)^2
    //
    fi[0] = 10*pow(x[0]+3,2);
    fi[1] = pow(x[1]-3,2);
}

int main(int argc, char **argv)
{
    //
    // This example demonstrates minimization of F(x0,x1) = f0^2+f1^2, where 
    //
    //     f0(x0,x1) = 10*(x0+3)^2
    //     f1(x0,x1) = (x1-3)^2
    //
    // using "V" mode of the Levenberg-Marquardt optimizer.
    //
    // Optimization algorithm uses:
    // * function vector f[] = {f0,f1}
    //
    // No other information (Jacobian, gradient, etc.) is needed.
    //
    real_1d_array x = "[0,0]";
    double epsx = 0.0000000001;
    ae_int_t maxits = 0;
    minlmstate state;
    minlmreport rep;

    minlmcreatev(2, x, 0.0001, state);
    minlmsetcond(state, epsx, maxits);
    alglib::minlmoptimize(state, function1_fvec);
    minlmresults(state, x, rep);

    printf("%s\n", x.tostring(2).c_str()); // EXPECTED: [-3,+3]
    return 0;
}


#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "optimization.h"

using namespace alglib;
void  function1_fvec(const real_1d_array &x, real_1d_array &fi, void *ptr)
{
    //
    // this callback calculates
    // f0(x0,x1) = 10*(x0+3)^2,
    // f1(x0,x1) = (x1-3)^2
    //
    fi[0] = 10*pow(x[0]+3,2);
    fi[1] = pow(x[1]-3,2);
}

int main(int argc, char **argv)
{
    //
    // This example demonstrates minimization of F(x0,x1) = f0^2+f1^2, where 
    //
    //     f0(x0,x1) = 10*(x0+3)^2
    //     f1(x0,x1) = (x1-3)^2
    //
    // with boundary constraints
    //
    //     -1 <= x0 <= +1
    //     -1 <= x1 <= +1
    //
    // using "V" mode of the Levenberg-Marquardt optimizer.
    //
    // Optimization algorithm uses:
    // * function vector f[] = {f0,f1}
    //
    // No other information (Jacobian, gradient, etc.) is needed.
    //
    real_1d_array x = "[0,0]";
    real_1d_array bndl = "[-1,-1]";
    real_1d_array bndu = "[+1,+1]";
    double epsx = 0.0000000001;
    ae_int_t maxits = 0;
    minlmstate state;
    minlmreport rep;

    minlmcreatev(2, x, 0.0001, state);
    minlmsetbc(state, bndl, bndu);
    minlmsetcond(state, epsx, maxits);
    alglib::minlmoptimize(state, function1_fvec);
    minlmresults(state, x, rep);

    printf("%s\n", x.tostring(2).c_str()); // EXPECTED: [-1,+1]
    return 0;
}


#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "optimization.h"

using namespace alglib;
void  function1_fvec(const real_1d_array &x, real_1d_array &fi, void *ptr)
{
    //
    // this callback calculates
    // f0(x0,x1) = 10*(x0+3)^2,
    // f1(x0,x1) = (x1-3)^2
    //
    fi[0] = 10*pow(x[0]+3,2);
    fi[1] = pow(x[1]-3,2);
}
void  function1_jac(const real_1d_array &x, real_1d_array &fi, real_2d_array &jac, void *ptr)
{
    //
    // this callback calculates
    // f0(x0,x1) = 10*(x0+3)^2,
    // f1(x0,x1) = (x1-3)^2
    // and Jacobian matrix J = [dfi/dxj]
    //
    fi[0] = 10*pow(x[0]+3,2);
    fi[1] = pow(x[1]-3,2);
    jac[0][0] = 20*(x[0]+3);
    jac[0][1] = 0;
    jac[1][0] = 0;
    jac[1][1] = 2*(x[1]-3);
}

int main(int argc, char **argv)
{
    //
    // This example demonstrates minimization of F(x0,x1) = f0^2+f1^2, where 
    //
    //     f0(x0,x1) = 10*(x0+3)^2
    //     f1(x0,x1) = (x1-3)^2
    //
    // using "VJ" mode of the Levenberg-Marquardt optimizer.
    //
    // Optimization algorithm uses:
    // * function vector f[] = {f0,f1}
    // * Jacobian matrix J = {dfi/dxj}.
    //
    real_1d_array x = "[0,0]";
    double epsx = 0.0000000001;
    ae_int_t maxits = 0;
    minlmstate state;
    minlmreport rep;

    minlmcreatevj(2, x, state);
    minlmsetcond(state, epsx, maxits);
    alglib::minlmoptimize(state, function1_fvec, function1_jac);
    minlmresults(state, x, rep);

    printf("%s\n", x.tostring(2).c_str()); // EXPECTED: [-3,+3]
    return 0;
}


minnlcreport
minnlcstate
minnlccreate
minnlccreatef
minnlcoptimize
minnlcrestartfrom
minnlcresults
minnlcresultsbuf
minnlcsetalgoaul
minnlcsetbc
minnlcsetcond
minnlcsetgradientcheck
minnlcsetlc
minnlcsetnlc
minnlcsetprecexactlowrank
minnlcsetprecexactrobust
minnlcsetprecinexact
minnlcsetprecnone
minnlcsetscale
minnlcsetstpmax
minnlcsetxrep
minnlc_d_equality Nonlinearly constrained optimization (equality constraints)
minnlc_d_inequality Nonlinearly constrained optimization (inequality constraints)
minnlc_d_mixed Nonlinearly constrained optimization with mixed equality/inequality constraints
/************************************************************************* This structure stores optimization report: * IterationsCount total number of inner iterations * NFEV number of gradient evaluations * TerminationType termination type (see below) TERMINATION CODES TerminationType field contains completion code, which can be: -8 internal integrity control detected infinite or NAN values in function/gradient. Abnormal termination signalled. -7 gradient verification failed. See MinNLCSetGradientCheck() for more information. 1 relative function improvement is no more than EpsF. 2 relative step is no more than EpsX. 4 gradient norm is no more than EpsG 5 MaxIts steps were taken 7 stopping conditions are too stringent, further improvement is impossible, X contains best point found so far. Other fields of this structure are not documented and should not be used! *************************************************************************/
class minnlcreport { ae_int_t iterationscount; ae_int_t nfev; ae_int_t varidx; ae_int_t funcidx; ae_int_t terminationtype; ae_int_t dbgphase0its; };
/************************************************************************* This object stores nonlinear optimizer state. You should use functions provided by MinNLC subpackage to work with this object *************************************************************************/
class minnlcstate { };
/************************************************************************* NONLINEARLY CONSTRAINED OPTIMIZATION WITH PRECONDITIONED AUGMENTED LAGRANGIAN ALGORITHM DESCRIPTION: The subroutine minimizes function F(x) of N arguments subject to any combination of: * bound constraints * linear inequality constraints * linear equality constraints * nonlinear equality constraints Gi(x)=0 * nonlinear inequality constraints Hi(x)<=0 REQUIREMENTS: * user must provide function value and gradient for F(), H(), G() * starting point X0 must be feasible or not too far away from the feasible set * F(), G(), H() are twice continuously differentiable on the feasible set and its neighborhood * nonlinear constraints G() and H() must have non-zero gradient at G(x)=0 and at H(x)=0. Say, constraint like x^2>=1 is supported, but x^2>=0 is NOT supported. USAGE: Constrained optimization is far more complex than the unconstrained one. Nonlinearly constrained optimization is one of the most esoteric numerical procedures. Here we give a very brief outline of the MinNLC optimizer. We strongly recommend you to study examples in the ALGLIB Reference Manual and to read ALGLIB User Guide on optimization, which is available at http://www.alglib.net/optimization/ 1. User initializes algorithm state with MinNLCCreate() call and chooses what NLC solver to use. There is some solver which is used by default, with default settings, but you should NOT rely on default choice. It may change in future releases of ALGLIB without notice, and no one can guarantee that new solver will be able to solve your problem with default settings. From the other side, if you choose solver explicitly, you can be pretty sure that it will work with new ALGLIB releases. In the current release following solvers can be used: * AUL solver (activated with MinNLCSetAlgoAUL() function) 2. User adds boundary and/or linear and/or nonlinear constraints by means of calling one of the following functions: a) MinNLCSetBC() for boundary constraints b) MinNLCSetLC() for linear constraints c) MinNLCSetNLC() for nonlinear constraints You may combine (a), (b) and (c) in one optimization problem. 3. User sets scale of the variables with MinNLCSetScale() function. It is VERY important to set scale of the variables, because nonlinearly constrained problems are hard to solve when variables are badly scaled. 4. User sets stopping conditions with MinNLCSetCond(). If NLC solver uses inner/outer iteration layout, this function sets stopping conditions for INNER iterations. 5. User chooses one of the preconditioning methods. Preconditioning is very important for efficient handling of boundary/linear/nonlinear constraints. Without preconditioning algorithm would require thousands of iterations even for simple problems. Several preconditioners can be used: a) inexact LBFGS-based, with O(N*K) evaluation time b) exact low rank one, with O(N*K^2) evaluation time c) exact robust one, with O(N^3+K*N^2) evaluation time where K is a total number of general linear and nonlinear constraints (box ones are not counted). Since version 3.11.0 ALGLIB uses exact robust preconditioner as default option, but in some cases exact low rank one may be better option. 6. Finally, user calls MinNLCOptimize() function which takes algorithm state and pointer (delegate, etc.) to callback function which calculates F/G/H. 7. User calls MinNLCResults() to get solution 8. Optionally user may call MinNLCRestartFrom() to solve another problem with same N but another starting point. 
MinNLCRestartFrom() allows to reuse already initialized structure. INPUT PARAMETERS: N - problem dimension, N>0: * if given, only leading N elements of X are used * if not given, automatically determined from size of X X - starting point, array[N]: * it is better to set X to a feasible point * but X can be infeasible, in which case algorithm will try to find feasible point first, using X as initial approximation. OUTPUT PARAMETERS: State - structure stores algorithm state -- ALGLIB -- Copyright 06.06.2014 by Bochkanov Sergey *************************************************************************/
void alglib::minnlccreate(real_1d_array x, minnlcstate& state); void alglib::minnlccreate( ae_int_t n, real_1d_array x, minnlcstate& state);

Examples:   [1]  [2]  [3]  
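A compact hedged sketch of the workflow described above; the callback nlc_jac, the constraint counts, scales and tolerances are placeholders, and complete worked programs are given in the examples referenced above. It minimizes x0+x1 subject to the nonlinear equality constraint x0^2+x1^2-1=0, using the analytic-Jacobian variant (it assumes the includes and 'using namespace alglib;' used by the examples elsewhere in this manual):

void nlc_jac(const real_1d_array &x, real_1d_array &fi, real_2d_array &jac, void *ptr)
{
    // fi[0] is the target, fi[1] is the equality constraint G(x)=x0^2+x1^2-1
    fi[0] = x[0] + x[1];
    fi[1] = x[0]*x[0] + x[1]*x[1] - 1.0;
    jac[0][0] = 1.0;      jac[0][1] = 1.0;
    jac[1][0] = 2.0*x[0]; jac[1][1] = 2.0*x[1];
}

// ... in the calling code:
//     real_1d_array x0 = "[0,0]", s = "[1,1]", x;
//     minnlcstate state;
//     minnlcreport rep;
//     minnlccreate(x0, state);                  // 1. create optimizer...
//     minnlcsetalgoaul(state, 1000.0, 5);       //    ...and choose the AUL solver explicitly
//     minnlcsetnlc(state, 1, 0);                // 2. one nonlinear equality constraint, no inequalities
//     minnlcsetscale(state, s);                 // 3. variable scales
//     minnlcsetcond(state, 0, 0, 1.0E-8, 0);    // 4. stopping conditions for inner iterations
//                                               // 5. default preconditioner (exact robust) is kept
//     alglib::minnlcoptimize(state, nlc_jac);   // 6. run optimization
//     minnlcresults(state, x, rep);             // 7. extract solution and report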

/************************************************************************* This subroutine is a finite difference variant of MinNLCCreate(). It uses finite differences in order to differentiate target function. Description below contains information which is specific to this function only. We recommend to read comments on MinNLCCreate() in order to get more information about creation of NLC optimizer. INPUT PARAMETERS: N - problem dimension, N>0: * if given, only leading N elements of X are used * if not given, automatically determined from size of X X - starting point, array[N]: * it is better to set X to a feasible point * but X can be infeasible, in which case algorithm will try to find feasible point first, using X as initial approximation. DiffStep- differentiation step, >0 OUTPUT PARAMETERS: State - structure stores algorithm state NOTES: 1. algorithm uses 4-point central formula for differentiation. 2. differentiation step along I-th axis is equal to DiffStep*S[I] where S[] is scaling vector which can be set by MinNLCSetScale() call. 3. we recommend you to use moderate values of differentiation step. Too large step will result in too large TRUNCATION errors, while too small step will result in too large NUMERICAL errors. 1.0E-4 can be a good value to start from. 4. Numerical differentiation is very inefficient - one gradient calculation needs 4*N function evaluations. This function will work for any N - either small (1...10), moderate (10...100) or large (100...). However, performance penalty will be too severe for any N's except for small ones. We should also say that code which relies on numerical differentiation is less robust and precise. Imprecise gradient may slow down convergence, especially on highly nonlinear problems. Thus we recommend to use this function for fast prototyping on small-dimensional problems only, and to implement analytical gradient as soon as possible. -- ALGLIB -- Copyright 06.06.2014 by Bochkanov Sergey *************************************************************************/
void alglib::minnlccreatef( real_1d_array x, double diffstep, minnlcstate& state); void alglib::minnlccreatef( ae_int_t n, real_1d_array x, double diffstep, minnlcstate& state);
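A short hedged sketch (the step value is the one suggested in note 3 above; x0 and state are assumed to be prepared as in the workflow sketch earlier):

    minnlccreatef(x0, 1.0E-4, state);   // numerical differentiation with step 1.0E-4 (times per-variable scale)
    // ... then only the fvec-style overload of minnlcoptimize() may be used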
/************************************************************************* This family of functions is used to launch iterations of nonlinear optimizer. These functions accept following parameters: state - algorithm state fvec - callback which calculates function vector fi[] at given point x jac - callback which calculates function vector fi[] and Jacobian jac at given point x rep - optional callback which is called after each iteration, can be NULL ptr - optional pointer which is passed to func/grad/hess/jac/rep, can be NULL NOTES: 1. This function has two different implementations: one which uses exact (analytical) user-supplied Jacobian, and one which uses only function vector and numerically differentiates function in order to obtain gradient. Depending on the specific function used to create optimizer object you should choose appropriate variant of MinNLCOptimize() - one which accepts function AND Jacobian or one which accepts ONLY function. Be careful to choose variant of MinNLCOptimize() which corresponds to your optimization scheme! Table below lists different combinations of callback (function/gradient) passed to MinNLCOptimize() and specific function used to create optimizer. | USER PASSED TO MinNLCOptimize() CREATED WITH | function only | function and gradient ------------------------------------------------------------ MinNLCCreateF() | works FAILS MinNLCCreate() | FAILS works Here "FAILS" denotes inappropriate combinations of optimizer creation function and MinNLCOptimize() version. Attempts to use such combination will lead to an exception. Either you did not pass gradient when it WAS needed or you passed gradient when it was NOT needed. -- ALGLIB -- Copyright 06.06.2014 by Bochkanov Sergey *************************************************************************/
void minnlcoptimize(minnlcstate &state, void (*fvec)(const real_1d_array &x, real_1d_array &fi, void *ptr), void (*rep)(const real_1d_array &x, double func, void *ptr) = NULL, void *ptr = NULL); void minnlcoptimize(minnlcstate &state, void (*jac)(const real_1d_array &x, real_1d_array &fi, real_2d_array &jac, void *ptr), void (*rep)(const real_1d_array &x, double func, void *ptr) = NULL, void *ptr = NULL);

Examples:   [1]  [2]  [3]  

/************************************************************************* This subroutine restarts algorithm from new point. All optimization parameters (including constraints) are left unchanged. This function allows to solve multiple optimization problems (which must have same number of dimensions) without object reallocation penalty. INPUT PARAMETERS: State - structure previously allocated with MinNLCCreate call. X - new starting point. -- ALGLIB -- Copyright 28.11.2010 by Bochkanov Sergey *************************************************************************/
void alglib::minnlcrestartfrom(minnlcstate state, real_1d_array x);
/************************************************************************* MinNLC results INPUT PARAMETERS: State - algorithm state OUTPUT PARAMETERS: X - array[0..N-1], solution Rep - optimization report. You should check Rep.TerminationType in order to distinguish successful termination from unsuccessful one: * -8 internal integrity control detected infinite or NAN values in function/gradient. Abnormal termination signalled. * -7 gradient verification failed. See MinNLCSetGradientCheck() for more information. * 1 relative function improvement is no more than EpsF. * 2 scaled step is no more than EpsX. * 4 scaled gradient norm is no more than EpsG. * 5 MaxIts steps were taken More information about fields of this structure can be found in the comments on MinNLCReport datatype. -- ALGLIB -- Copyright 06.06.2014 by Bochkanov Sergey *************************************************************************/
void alglib::minnlcresults( minnlcstate state, real_1d_array& x, minnlcreport& rep);

Examples:   [1]  [2]  [3]  

/************************************************************************* NLC results Buffered implementation of MinNLCResults() which uses pre-allocated buffer to store X[]. If buffer size is too small, it resizes buffer. It is intended to be used in the inner cycles of performance critical algorithms where array reallocation penalty is too large to be ignored. -- ALGLIB -- Copyright 28.11.2010 by Bochkanov Sergey *************************************************************************/
void alglib::minnlcresultsbuf( minnlcstate state, real_1d_array& x, minnlcreport& rep);
/************************************************************************* This function tells MinNLC unit to use Augmented Lagrangian algorithm for nonlinearly constrained optimization. This algorithm is a slight modification of one described in "A Modified Barrier-Augmented Lagrangian Method for Constrained Minimization (1999)" by D.GOLDFARB, R.POLYAK, K. SCHEINBERG, I.YUZEFOVICH. Augmented Lagrangian algorithm works by converting problem of minimizing F(x) subject to equality/inequality constraints to unconstrained problem of the form min[ f(x) + Rho*PENALTY_EQ(x) + SHIFT_EQ(x,Nu1) + Rho*PENALTY_INEQ(x) + SHIFT_INEQ(x,Nu2) ] where: * Rho is a fixed penalization coefficient * PENALTY_EQ(x) is a penalty term, which is used to APPROXIMATELY enforce equality constraints * SHIFT_EQ(x) is a special "shift" term which is used to "fine-tune" equality constraints, greatly increasing precision * PENALTY_INEQ(x) is a penalty term which is used to approximately enforce inequality constraints * SHIFT_INEQ(x) is a special "shift" term which is used to "fine-tune" inequality constraints, greatly increasing precision * Nu1/Nu2 are vectors of Lagrange coefficients which are fine-tuned during outer iterations of algorithm This version of AUL algorithm uses preconditioner, which greatly accelerates convergence. Because this algorithm is similar to penalty methods, it may perform steps into infeasible area. All kinds of constraints (boundary, linear and nonlinear ones) may be violated in intermediate points - and in the solution. However, properly configured AUL method is significantly better at handling constraints than barrier and/or penalty methods. The very basic outline of algorithm is given below: 1) first outer iteration is performed with "default" values of Lagrange multipliers Nu1/Nu2. Solution quality is low (candidate point can be too far away from true solution; large violation of constraints is possible) and is comparable with that of penalty methods. 2) subsequent outer iterations refine Lagrange multipliers and improve quality of the solution. INPUT PARAMETERS: State - structure which stores algorithm state Rho - penalty coefficient, Rho>0: * large enough that algorithm converges with desired precision. Minimum value is 10*max(S'*diag(H)*S), where S is a scale matrix (set by MinNLCSetScale) and H is a Hessian of the function being minimized. If you can not easily estimate Hessian norm, see our recommendations below. * not TOO large to prevent ill-conditioning * for unit-scale problems (variables and Hessian have unit magnitude), Rho=100 or Rho=1000 can be used. * it is important to note that Rho is internally multiplied by scaling matrix, i.e. optimum value of Rho depends on scale of variables specified by MinNLCSetScale(). ItsCnt - number of outer iterations: * ItsCnt=0 means that small number of outer iterations is automatically chosen (10 iterations in current version). * ItsCnt=1 means that AUL algorithm performs just as usual barrier method. * ItsCnt>1 means that AUL algorithm performs specified number of outer iterations HOW TO CHOOSE PARAMETERS Nonlinear optimization is a tricky area and Augmented Lagrangian algorithm is sometimes hard to tune. Good values of Rho and ItsCnt are problem-specific. In order to help you we prepared following set of recommendations: * for unit-scale problems (variables and Hessian have unit magnitude), Rho=100 or Rho=1000 can be used. * start from some small value of Rho and solve problem with just one outer iteration (ItsCnt=1). 
In this case algorithm behaves like penalty method. Increase Rho in 2x or 10x steps until you see that one outer iteration returns point which is "rough approximation to solution". It is very important to have Rho so large that penalty term becomes constraining i.e. modified function becomes highly convex in constrained directions. From the other side, too large Rho may prevent you from converging to the solution. You can diagnose it by studying number of inner iterations performed by algorithm: too few (5-10 on 1000-dimensional problem) or too many (orders of magnitude more than dimensionality) usually means that Rho is too large. * with just one outer iteration you usually have low-quality solution. Some constraints can be violated with very large margin, while other ones (which are NOT violated in the true solution) can push final point too far in the inner area of the feasible set. For example, if you have constraint x0>=0 and true solution x0=1, then merely a presence of "x0>=0" will introduce a bias towards larger values of x0. Say, algorithm may stop at x0=1.5 instead of 1.0. * after you found good Rho, you may increase number of outer iterations. ItsCnt=10 is a good value. Subsequent outer iteration will refine values of Lagrange multipliers. Constraints which were violated will be enforced, inactive constraints will be dropped (corresponding multipliers will be decreased). Ideally, you should see 10-1000x improvement in constraint handling (constraint violation is reduced). * if you see that algorithm converges to vicinity of solution, but additional outer iterations do not refine solution, it may mean that algorithm is unstable - it wanders around true solution, but can not approach it. Sometimes algorithm may be stabilized by increasing Rho one more time, making it 5x or 10x larger. SCALING OF CONSTRAINTS [IMPORTANT] AUL optimizer scales variables according to scale specified by MinNLCSetScale() function, so it can handle problems with badly scaled variables (as long as we KNOW their scales). However, because function being optimized is a mix of original function and constraint-dependent penalty functions, it is important to rescale both variables AND constraints. Say, if you minimize f(x)=x^2 subject to 1000000*x>=0, then you have constraint whose scale is different from that of target function (another example is 0.000001*x>=0). It is also possible to have constraints whose scales are misaligned: 1000000*x0>=0, 0.000001*x1<=0. Inappropriate scaling may ruin convergence because minimizing x^2 subject to x>=0 is NOT same as minimizing it subject to 1000000*x>=0. Because we know coefficients of boundary/linear constraints, we can automatically rescale and normalize them. However, there is no way to automatically rescale nonlinear constraints Gi(x) and Hi(x) - they are black boxes. It means that YOU are the one who is responsible for correct scaling of nonlinear constraints Gi(x) and Hi(x). We recommend you to rescale nonlinear constraints in such way that I-th component of dG/dX (or dH/dx) has magnitude approximately equal to 1/S[i] (where S is a scale set by MinNLCSetScale() function). WHAT IF IT DOES NOT CONVERGE? It is possible that AUL algorithm fails to converge to precise values of Lagrange multipliers. It stops somewhere around true solution, but candidate point is still too far from solution, and some constraints are violated. Such kind of failure is specific for Lagrangian algorithms - technically, they stop at some point, but this point is not constrained solution. 
There exist several reasons why algorithm may fail to converge: a) too loose stopping criteria for inner iteration b) degenerate, redundant constraints c) target function has unconstrained extremum exactly at the boundary of some constraint d) numerical noise in the target function In all these cases algorithm is unstable - each outer iteration results in large and almost random step which improves handling of some constraints, but violates other ones (ideally outer iterations should form a sequence of progressively decreasing steps towards solution). The first possible reason is that too loose stopping criteria for inner iteration were specified. Augmented Lagrangian algorithm solves a sequence of intermediate problems, and requires each of them to be solved with high precision. Insufficient precision results in incorrect update of Lagrange multipliers. Another reason is that you may have specified degenerate constraints: say, some constraint was repeated twice. In most cases AUL algorithm gracefully handles such situations, but sometimes it may spend too much time figuring out subtle degeneracies in constraint matrix. Third reason is tricky and hard to diagnose. Consider situation when you minimize f=x^2 subject to constraint x>=0. Unconstrained extremum is located exactly at the boundary of constrained area. In this case algorithm will tend to oscillate between negative and positive x. Each time it stops at x<0 it "reinforces" constraint x>=0, and each time it is bounced to x>0 it "relaxes" constraint (and is attracted to x<0). Such situation sometimes happens in problems with hidden symmetries. The algorithm gets caught in a loop with Lagrange multipliers being continuously increased/decreased. Luckily, such loop forms after at least three iterations, so this problem can be solved by DECREASING number of outer iterations down to 1-2 and increasing penalty coefficient Rho as much as possible. Final reason is numerical noise. AUL algorithm is robust against moderate noise (more robust than, say, active set methods), but large noise may destabilize algorithm. -- ALGLIB -- Copyright 06.06.2014 by Bochkanov Sergey *************************************************************************/
void alglib::minnlcsetalgoaul( minnlcstate state, double rho, ae_int_t itscnt);

Examples:   [1]  [2]  [3]  
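For a unit-scale problem the recommendations above suggest something like the following (hedged sketch; values are illustrative):

    minnlcsetalgoaul(state, 1000.0, 10);   // Rho=1000, 10 outer iterations to refine Lagrange multipliers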

/************************************************************************* This function sets boundary constraints for NLC optimizer. Boundary constraints are inactive by default (after initial creation). They are preserved after algorithm restart with MinNLCRestartFrom(). You may combine boundary constraints with general linear ones - and with nonlinear ones! Boundary constraints are handled more efficiently than other types. Thus, if your problem has mixed constraints, you may explicitly specify some of them as boundary and save some time/space. INPUT PARAMETERS: State - structure stores algorithm state BndL - lower bounds, array[N]. If some (all) variables are unbounded, you may specify very small number or -INF. BndU - upper bounds, array[N]. If some (all) variables are unbounded, you may specify very large number or +INF. NOTE 1: it is possible to specify BndL[i]=BndU[i]. In this case I-th variable will be "frozen" at X[i]=BndL[i]=BndU[i]. NOTE 2: when you solve your problem with augmented Lagrangian solver, boundary constraints are satisfied only approximately! It is possible that algorithm will evaluate function outside of feasible area! -- ALGLIB -- Copyright 06.06.2014 by Bochkanov Sergey *************************************************************************/
void alglib::minnlcsetbc( minnlcstate state, real_1d_array bndl, real_1d_array bndu);
/************************************************************************* This function sets stopping conditions for inner iterations of optimizer. INPUT PARAMETERS: State - structure which stores algorithm state EpsG - >=0 The subroutine finishes its work if the condition |v|<EpsG is satisfied, where: * |.| means Euclidean norm * v - scaled gradient vector, v[i]=g[i]*s[i] * g - gradient * s - scaling coefficients set by MinNLCSetScale() EpsF - >=0 The subroutine finishes its work if on k+1-th iteration the condition |F(k+1)-F(k)|<=EpsF*max{|F(k)|,|F(k+1)|,1} is satisfied. EpsX - >=0 The subroutine finishes its work if on k+1-th iteration the condition |v|<=EpsX is fulfilled, where: * |.| means Euclidean norm * v - scaled step vector, v[i]=dx[i]/s[i] * dx - step vector, dx=X(k+1)-X(k) * s - scaling coefficients set by MinNLCSetScale() MaxIts - maximum number of iterations. If MaxIts=0, the number of iterations is unlimited. Passing EpsG=0, EpsF=0 and EpsX=0 and MaxIts=0 (simultaneously) will lead to automatic stopping criterion selection. -- ALGLIB -- Copyright 06.06.2014 by Bochkanov Sergey *************************************************************************/
void alglib::minnlcsetcond( minnlcstate state, double epsg, double epsf, double epsx, ae_int_t maxits);

Examples:   [1]  [2]  [3]  

/************************************************************************* This subroutine turns on verification of the user-supplied analytic gradient: * user calls this subroutine before optimization begins * MinNLCOptimize() is called * prior to actual optimization, for each component of parameters being optimized X[i] algorithm performs following steps: * two trial steps are made to X[i]-TestStep*S[i] and X[i]+TestStep*S[i], where X[i] is i-th component of the initial point and S[i] is a scale of i-th parameter * F(X) is evaluated at these trial points * we perform one more evaluation in the middle point of the interval * we build cubic model using function values and derivatives at trial points and we compare its prediction with actual value in the middle point * in case difference between prediction and actual value is higher than some predetermined threshold, algorithm stops with completion code -7; Rep.VarIdx is set to index of the parameter with incorrect derivative, and Rep.FuncIdx is set to index of the function. * after verification is over, algorithm proceeds to the actual optimization. NOTE 1: verification needs N (parameters count) gradient evaluations. It is very costly and you should use it only for low dimensional problems, when you want to be sure that you've correctly calculated analytic derivatives. You should not use it in the production code (unless you want to check derivatives provided by some third party). NOTE 2: you should carefully choose TestStep. Value which is too large (so large that function behaviour is significantly non-cubic) will lead to false alarms. You may use different step for different parameters by means of setting scale with MinNLCSetScale(). NOTE 3: this function may lead to false positives. In case it reports that I-th derivative was calculated incorrectly, you may decrease test step and try one more time - maybe your function changes too sharply and your step is too large for such rapidly changing function. INPUT PARAMETERS: State - structure used to store algorithm state TestStep - verification step: * TestStep=0 turns verification off * TestStep>0 activates verification -- ALGLIB -- Copyright 15.06.2014 by Bochkanov Sergey *************************************************************************/
void alglib::minnlcsetgradientcheck(minnlcstate state, double teststep);
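A short sketch of how verification might be wired into setup and result checking is shown below. The TestStep value is an illustrative assumption, and the lowercase report fields rep.terminationtype/rep.varidx/rep.funcidx are assumed to mirror the Rep.VarIdx/Rep.FuncIdx fields mentioned above.

#include <stdio.h>
#include "optimization.h"

using namespace alglib;

// Hedged sketch: enable derivative verification before calling minnlcoptimize().
// TestStep=0.001 is an illustrative value, not a recommendation.
void enable_gradient_check(minnlcstate &state)
{
    minnlcsetgradientcheck(state, 0.001);
}

// Hedged sketch: inspect the report returned by minnlcresults(); completion
// code -7 signals a suspicious derivative (see the comment above).
void inspect_gradient_check(const minnlcreport &rep)
{
    if( rep.terminationtype==-7 )
        printf("suspicious derivative: parameter %d, function %d\n",
               (int)rep.varidx, (int)rep.funcidx);
}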
/************************************************************************* This function sets linear constraints for MinNLC optimizer. Linear constraints are inactive by default (after initial creation). They are preserved after algorithm restart with MinNLCRestartFrom(). You may combine linear constraints with boundary ones - and with nonlinear ones! If your problem has mixed constraints, you may explicitly specify some of them as linear. It may help optimizer to handle them more efficiently. INPUT PARAMETERS: State - structure previously allocated with MinNLCCreate call. C - linear constraints, array[K,N+1]. Each row of C represents one constraint, either equality or inequality (see below): * first N elements correspond to coefficients, * last element corresponds to the right part. All elements of C (including right part) must be finite. CT - type of constraints, array[K]: * if CT[i]>0, then I-th constraint is C[i,*]*x >= C[i,n+1] * if CT[i]=0, then I-th constraint is C[i,*]*x = C[i,n+1] * if CT[i]<0, then I-th constraint is C[i,*]*x <= C[i,n+1] K - number of equality/inequality constraints, K>=0: * if given, only leading K elements of C/CT are used * if not given, automatically determined from sizes of C/CT NOTE 1: when you solve your problem with augmented Lagrangian solver, linear constraints are satisfied only approximately! It is possible that algorithm will evaluate function outside of feasible area! -- ALGLIB -- Copyright 06.06.2014 by Bochkanov Sergey *************************************************************************/
void alglib::minnlcsetlc( minnlcstate state, real_2d_array c, integer_1d_array ct); void alglib::minnlcsetlc( minnlcstate state, real_2d_array c, integer_1d_array ct, ae_int_t k);
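The MinNLC examples below do not demonstrate general linear constraints, so a minimal sketch is given here. It assumes a 2-variable problem with one inequality (x0+x1<=1) and one equality (x0-x1=0); the constraint data itself is an illustrative assumption.

#include "optimization.h"

using namespace alglib;

// Hedged sketch: two linear constraints for a 2-variable problem.
// Each row of C holds N coefficients plus the right part;
// CT[i]<0 means "<=", CT[i]=0 means "=", CT[i]>0 means ">=".
void set_linear_constraints(minnlcstate &state)
{
    real_2d_array c = "[[1,1,1],[1,-1,0]]";   // x0+x1<=1  and  x0-x1=0
    integer_1d_array ct = "[-1,0]";
    minnlcsetlc(state, c, ct);
}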
/************************************************************************* This function sets nonlinear constraints for MinNLC optimizer. In fact, this function sets NUMBER of nonlinear constraints. Constraints themselves (constraint functions) are passed to MinNLCOptimize() method. This method requires user-defined vector function F[] and its Jacobian J[], where: * first component of F[] and first row of Jacobian J[] corresponds to function being minimized * next NLEC components of F[] (and rows of J) correspond to nonlinear equality constraints G_i(x)=0 * next NLIC components of F[] (and rows of J) correspond to nonlinear inequality constraints H_i(x)<=0 NOTE: you may combine nonlinear constraints with linear/boundary ones. If your problem has mixed constraints, you may explicitly specify some of them as linear ones. It may help optimizer to handle them more efficiently. INPUT PARAMETERS: State - structure previously allocated with MinNLCCreate call. NLEC - number of Non-Linear Equality Constraints (NLEC), >=0 NLIC - number of Non-Linear Inequality Constraints (NLIC), >=0 NOTE 1: when you solve your problem with augmented Lagrangian solver, nonlinear constraints are satisfied only approximately! It is possible that algorithm will evaluate function outside of feasible area! NOTE 2: algorithm scales variables according to scale specified by MinNLCSetScale() function, so it can handle problems with badly scaled variables (as long as we KNOW their scales). However, there is no way to automatically scale nonlinear constraints Gi(x) and Hi(x). Inappropriate scaling of Gi/Hi may ruin convergence. Solving problem with constraint "1000*G0(x)=0" is NOT same as solving it with constraint "0.001*G0(x)=0". It means that YOU are the one who is responsible for correct scaling of nonlinear constraints Gi(x) and Hi(x). We recommend you to scale nonlinear constraints in such way that I-th component of dG/dX (or dH/dx) has approximately unit magnitude (for problems with unit scale) or has magnitude approximately equal to 1/S[i] (where S is a scale set by MinNLCSetScale() function). -- ALGLIB -- Copyright 06.06.2014 by Bochkanov Sergey *************************************************************************/
void alglib::minnlcsetnlc( minnlcstate state, ae_int_t nlec, ae_int_t nlic);

Examples:   [1]  [2]  [3]  

/************************************************************************* This function sets preconditioner to "exact low rank" mode. Preconditioning is very important for convergence of Augmented Lagrangian algorithm because presence of penalty term makes problem ill-conditioned. Difference between performance of preconditioned and unpreconditioned methods can be as large as 100x! MinNLC optimizer may use following preconditioners, each with its own benefits and drawbacks: a) inexact LBFGS-based, with O(N*K) evaluation time b) exact low rank one, with O(N*K^2) evaluation time c) exact robust one, with O(N^3+K*N^2) evaluation time where K is a total number of general linear and nonlinear constraints (box ones are not counted). It also provides special unpreconditioned mode of operation which can be used for test purposes. Comments below discuss low rank preconditioner. Exact low-rank preconditioner uses Woodbury matrix identity to build quadratic model of the penalized function. It has following features: * no special assumptions about orthogonality of constraints * preconditioner evaluation is optimized for K<<N. Its cost is O(N*K^2), so it may become prohibitively slow for K>=N. * finally, stability of the process is guaranteed only for K<<N. Woodbury update often fails for K>=N due to degeneracy of intermediate matrices. That's why we recommend to use "exact robust" preconditioner for such cases. RECOMMENDATIONS We recommend to choose between "exact low rank" and "exact robust" preconditioners, with "low rank" version being chosen when you know in advance that total count of non-box constraints won't exceed N, and "robust" version being chosen when you need bulletproof solution. INPUT PARAMETERS: State - structure stores algorithm state UpdateFreq- update frequency. Preconditioner is rebuilt after every UpdateFreq iterations. Recommended value: 10 or higher. Zero value means that good default value will be used. -- ALGLIB -- Copyright 26.09.2014 by Bochkanov Sergey *************************************************************************/
void alglib::minnlcsetprecexactlowrank( minnlcstate state, ae_int_t updatefreq);

Examples:   [1]  [2]  [3]  
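Since the two exact preconditioners differ mainly in their cost/stability trade-off, the choice can be driven by the count K of non-box constraints, as sketched below. UpdateFreq=10 follows the recommendation above; the K<N/2 threshold is an illustrative assumption.

#include "optimization.h"

using namespace alglib;

// Hedged sketch: prefer "exact low rank" when K is well below N,
// fall back to "exact robust" otherwise (see recommendations above).
void choose_preconditioner(minnlcstate &state, ae_int_t n, ae_int_t k)
{
    ae_int_t updatefreq = 10;
    if( k<n/2 )
        minnlcsetprecexactlowrank(state, updatefreq);
    else
        minnlcsetprecexactrobust(state, updatefreq);
}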

/************************************************************************* This function sets preconditioner to "exact robust" mode. Preconditioning is very important for convergence of Augmented Lagrangian algorithm because presence of penalty term makes problem ill-conditioned. Difference between performance of preconditioned and unpreconditioned methods can be as large as 100x! MinNLC optimizer may use following preconditioners, each with its own benefits and drawbacks: a) inexact LBFGS-based, with O(N*K) evaluation time b) exact low rank one, with O(N*K^2) evaluation time c) exact robust one, with O(N^3+K*N^2) evaluation time where K is a total number of general linear and nonlinear constraints (box ones are not counted). It also provides special unpreconditioned mode of operation which can be used for test purposes. Comments below discuss robust preconditioner. Exact robust preconditioner uses Cholesky decomposition to invert approximate Hessian matrix H=D+W'*C*W (where D stands for diagonal terms of Hessian, combined result of initial scaling matrix and penalty from box constraints; W stands for general linear constraints and linearization of nonlinear ones; C stands for diagonal matrix of penalty coefficients). This preconditioner has following features: * no special assumptions about constraint structure * preconditioner is optimized for stability; unlike "exact low rank" version which fails for K>=N, this one works well for any value of K. * the only drawback is that it takes O(N^3+K*N^2) time to build it. No economical Woodbury update is applied even when it makes sense, thus there exist situations (K<<N) when "exact low rank" preconditioner outperforms this one. RECOMMENDATIONS We recommend to choose between "exact low rank" and "exact robust" preconditioners, with "low rank" version being chosen when you know in advance that total count of non-box constraints won't exceed N, and "robust" version being chosen when you need bulletproof solution. INPUT PARAMETERS: State - structure stores algorithm state UpdateFreq- update frequency. Preconditioner is rebuilt after every UpdateFreq iterations. Recommended value: 10 or higher. Zero value means that good default value will be used. -- ALGLIB -- Copyright 26.09.2014 by Bochkanov Sergey *************************************************************************/
void alglib::minnlcsetprecexactrobust( minnlcstate state, ae_int_t updatefreq);
/************************************************************************* This function sets preconditioner to "inexact LBFGS-based" mode. Preconditioning is very important for convergence of Augmented Lagrangian algorithm because presence of penalty term makes problem ill-conditioned. Difference between performance of preconditioned and unpreconditioned methods can be as large as 100x! MinNLC optimizer may use following preconditioners, each with its own benefits and drawbacks: a) inexact LBFGS-based, with O(N*K) evaluation time b) exact low rank one, with O(N*K^2) evaluation time c) exact robust one, with O(N^3+K*N^2) evaluation time where K is a total number of general linear and nonlinear constraints (box ones are not counted). Inexact LBFGS-based preconditioner uses L-BFGS formula combined with orthogonality assumption to perform very fast updates. For a N-dimensional problem with K general linear or nonlinear constraints (boundary ones are not counted) it has O(N*K) cost per iteration. This preconditioner has best quality (less iterations) when general linear and nonlinear constraints are orthogonal to each other (orthogonality with respect to boundary constraints is not required). Number of iterations increases when constraints are non-orthogonal, because algorithm assumes orthogonality, but still it is better than no preconditioner at all. INPUT PARAMETERS: State - structure stores algorithm state -- ALGLIB -- Copyright 26.09.2014 by Bochkanov Sergey *************************************************************************/
void alglib::minnlcsetprecinexact(minnlcstate state);
/************************************************************************* This function sets preconditioner to "turned off" mode. Preconditioning is very important for convergence of Augmented Lagrangian algorithm because presence of penalty term makes problem ill-conditioned. Difference between performance of preconditioned and unpreconditioned methods can be as large as 100x! MinNLC optimizer may utilize two preconditioners, each with its own benefits and drawbacks: a) inexact LBFGS-based, and b) exact low rank one. It also provides special unpreconditioned mode of operation which can be used for test purposes. This function activates this test mode. Do not use it in production code to solve real-life problems. INPUT PARAMETERS: State - structure stores algorithm state -- ALGLIB -- Copyright 26.09.2014 by Bochkanov Sergey *************************************************************************/
void alglib::minnlcsetprecnone(minnlcstate state);
/************************************************************************* This function sets scaling coefficients for NLC optimizer. ALGLIB optimizers use scaling matrices to test stopping conditions (step size and gradient are scaled before comparison with tolerances). Scale of the I-th variable is a translation invariant measure of: a) "how large" the variable is b) how large the step should be to make significant changes in the function Scaling is also used by finite difference variant of the optimizer - step along I-th axis is equal to DiffStep*S[I]. INPUT PARAMETERS: State - structure stores algorithm state S - array[N], non-zero scaling coefficients S[i] may be negative, sign doesn't matter. -- ALGLIB -- Copyright 06.06.2014 by Bochkanov Sergey *************************************************************************/
void alglib::minnlcsetscale(minnlcstate state, real_1d_array s);

Examples:   [1]  [2]  [3]  
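All examples below use unit scales, so a short sketch for a badly scaled problem is given here; the variable magnitudes are illustrative assumptions.

#include "optimization.h"

using namespace alglib;

// Hedged sketch: a 3-variable problem where the variables have magnitudes
// of roughly 1e-6, 1 and 1e+4. Scales affect stopping tests and
// finite-difference steps, as explained above.
void set_scales(minnlcstate &state)
{
    real_1d_array s = "[0.000001,1.0,10000.0]";
    minnlcsetscale(state, s);
}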

/************************************************************************* This function sets maximum step length (after scaling of step vector with respect to variable scales specified by minnlcsetscale() call). INPUT PARAMETERS: State - structure which stores algorithm state StpMax - maximum step length, >=0. Set StpMax to 0.0 (default), if you don't want to limit step length. Use this subroutine when you optimize target function which contains exp() or other fast growing functions, and optimization algorithm makes too large steps which leads to overflow. This function allows us to reject steps that are too large (and therefore expose us to the possible overflow) without actually calculating function value at the x+stp*d. -- ALGLIB -- Copyright 02.04.2010 by Bochkanov Sergey *************************************************************************/
void alglib::minnlcsetstpmax(minnlcstate state, double stpmax);
/************************************************************************* This function turns on/off reporting. INPUT PARAMETERS: State - structure which stores algorithm state NeedXRep- whether iteration reports are needed or not If NeedXRep is True, algorithm will call rep() callback function if it is provided to MinNLCOptimize(). NOTE: algorithm passes two parameters to rep() callback - current point and penalized function value at current point. Important: the reported value is NOT the function being minimized; it is the sum of the function being minimized and the penalty term. -- ALGLIB -- Copyright 28.11.2010 by Bochkanov Sergey *************************************************************************/
void alglib::minnlcsetxrep(minnlcstate state, bool needxrep);
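A sketch of the iteration-report callback is shown below. Its signature mirrors the rep() prototype documented for minnsoptimize() later in this section and is assumed to apply to minnlcoptimize() as well; remember that, as stated above, the reported value includes the penalty term.

#include <stdio.h>
#include "optimization.h"

using namespace alglib;

// Hedged sketch of an iteration-report callback; 'func' is the PENALIZED
// value, not the raw target being minimized.
void my_report(const real_1d_array &x, double func, void *ptr)
{
    printf("current point %s, penalized f = %0.6f\n", x.tostring(4).c_str(), func);
}

// Usage sketch (assumed): after minnlcsetxrep(state, true), pass the callback
// to the optimizer, e.g. alglib::minnlcoptimize(state, nlcfunc1_jac, my_report, NULL);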
#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "optimization.h"

using namespace alglib;
void  nlcfunc1_jac(const real_1d_array &x, real_1d_array &fi, real_2d_array &jac, void *ptr)
{
    //
    // this callback calculates
    //
    //     f0(x0,x1) = -x0+x1
    //     f1(x0,x1) = x0^2+x1^2-1
    //
    // and Jacobian matrix J = [dfi/dxj]
    //
    fi[0] = -x[0]+x[1];
    fi[1] = x[0]*x[0] + x[1]*x[1] - 1.0;
    jac[0][0] = -1.0;
    jac[0][1] = +1.0;
    jac[1][0] = 2*x[0];
    jac[1][1] = 2*x[1];
}

int main(int argc, char **argv)
{
    //
    // This example demonstrates minimization of
    //
    //     f(x0,x1) = -x0+x1
    //
    // subject to nonlinear equality constraint
    //
    //    x0^2 + x1^2 - 1 = 0
    //
    real_1d_array x0 = "[0,0]";
    real_1d_array s = "[1,1]";
    double epsg = 0;
    double epsf = 0;
    double epsx = 0.000001;
    ae_int_t maxits = 0;
    ae_int_t outerits = 5;
    ae_int_t updatefreq = 10;
    double rho = 1000;
    minnlcstate state;
    minnlcreport rep;
    real_1d_array x1;

    //
    // Create optimizer object, choose AUL algorithm and tune its settings:
    // * rho=1000       penalty coefficient
    // * outerits=5     number of outer iterations to tune Lagrange coefficients
    // * epsx=0.000001  stopping condition for inner iterations
    // * s=[1,1]        all variables have unit scale
    // * exact low-rank preconditioner is used, updated after each 10 iterations
    //
    minnlccreate(2, x0, state);
    minnlcsetalgoaul(state, rho, outerits);
    minnlcsetcond(state, epsg, epsf, epsx, maxits);
    minnlcsetscale(state, s);
    minnlcsetprecexactlowrank(state, updatefreq);

    //
    // Set constraints:
    //
    // Nonlinear constraints are tricky - you can not "pack" general
    // nonlinear function into double precision array. That's why
    // minnlcsetnlc() does not accept constraints itself - only constraint
    // counts are passed: first parameter is number of equality constraints,
    // second one is number of inequality constraints.
    //
    // As for constraining functions - these functions are passed as part
    // of problem Jacobian (see below).
    //
    // NOTE: MinNLC optimizer supports arbitrary combination of boundary, general
    //       linear and general nonlinear constraints. This example does not
    //       show how to work with general linear constraints, but you can
    //       easily find it in documentation on minnlcsetbc() and
    //       minnlcsetlc() functions.
    //
    minnlcsetnlc(state, 1, 0);

    //
    // Optimize and test results.
    //
    // Optimizer object accepts vector function and its Jacobian, with first
    // component (Jacobian row) being target function, and next components
    // (Jacobian rows) being nonlinear equality and inequality constraints.
    //
    // So, our vector function has form
    //
    //     {f0,f1} = { -x0+x1 , x0^2+x1^2-1 }
    //
    // with Jacobian
    //
    //         [  -1    +1  ]
    //     J = [            ]
    //         [ 2*x0  2*x1 ]
    //
    // with f0 being target function, f1 being constraining function. Number
    // of equality/inequality constraints is specified by minnlcsetnlc(),
    // with equality ones always being first, inequality ones being last.
    //
    alglib::minnlcoptimize(state, nlcfunc1_jac);
    minnlcresults(state, x1, rep);
    printf("%s\n", x1.tostring(2).c_str()); // EXPECTED: [0.70710,-0.70710]
    return 0;
}


#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "optimization.h"

using namespace alglib;
void  nlcfunc1_jac(const real_1d_array &x, real_1d_array &fi, real_2d_array &jac, void *ptr)
{
    //
    // this callback calculates
    //
    //     f0(x0,x1) = -x0+x1
    //     f1(x0,x1) = x0^2+x1^2-1
    //
    // and Jacobian matrix J = [dfi/dxj]
    //
    fi[0] = -x[0]+x[1];
    fi[1] = x[0]*x[0] + x[1]*x[1] - 1.0;
    jac[0][0] = -1.0;
    jac[0][1] = +1.0;
    jac[1][0] = 2*x[0];
    jac[1][1] = 2*x[1];
}

int main(int argc, char **argv)
{
    //
    // This example demonstrates minimization of
    //
    //     f(x0,x1) = -x0+x1
    //
    // subject to boundary constraints
    //
    //    x0>=0, x1>=0
    //
    // and nonlinear inequality constraint
    //
    //    x0^2 + x1^2 - 1 <= 0
    //
    real_1d_array x0 = "[0,0]";
    real_1d_array s = "[1,1]";
    double epsg = 0;
    double epsf = 0;
    double epsx = 0.000001;
    ae_int_t maxits = 0;
    ae_int_t outerits = 5;
    ae_int_t updatefreq = 10;
    double rho = 1000;
    real_1d_array bndl = "[0,0]";
    real_1d_array bndu = "[+inf,+inf]";
    minnlcstate state;
    minnlcreport rep;
    real_1d_array x1;

    //
    // Create optimizer object, choose AUL algorithm and tune its settings:
    // * rho=1000       penalty coefficient
    // * outerits=5     number of outer iterations to tune Lagrange coefficients
    // * epsx=0.000001  stopping condition for inner iterations
    // * s=[1,1]        all variables have unit scale
    // * exact low-rank preconditioner is used, updated after each 10 iterations
    //
    minnlccreate(2, x0, state);
    minnlcsetalgoaul(state, rho, outerits);
    minnlcsetcond(state, epsg, epsf, epsx, maxits);
    minnlcsetscale(state, s);
    minnlcsetprecexactlowrank(state, updatefreq);

    //
    // Set constraints:
    //
    // 1. boundary constraints are passed with minnlcsetbc() call
    //
    // 2. nonlinear constraints are more tricky - you can not "pack" general
    //    nonlinear function into double precision array. That's why
    //    minnlcsetnlc() does not accept constraints itself - only constraint
    //    counts are passed: first parameter is number of equality constraints,
    //    second one is number of inequality constraints.
    //
    //    As for constraining functions - these functions are passed as part
    //    of problem Jacobian (see below).
    //
    // NOTE: MinNLC optimizer supports arbitrary combination of boundary, general
    //       linear and general nonlinear constraints. This example does not
    //       show how to work with general linear constraints, but you can
    //       easily find it in documentation on minnlcsetlc() function.
    //
    minnlcsetbc(state, bndl, bndu);
    minnlcsetnlc(state, 0, 1);

    //
    // Optimize and test results.
    //
    // Optimizer object accepts vector function and its Jacobian, with first
    // component (Jacobian row) being target function, and next components
    // (Jacobian rows) being nonlinear equality and inequality constraints.
    //
    // So, our vector function has form
    //
    //     {f0,f1} = { -x0+x1 , x0^2+x1^2-1 }
    //
    // with Jacobian
    //
    //         [  -1    +1  ]
    //     J = [            ]
    //         [ 2*x0  2*x1 ]
    //
    // with f0 being target function, f1 being constraining function. Number
    // of equality/inequality constraints is specified by minnlcsetnlc(),
    // with equality ones always being first, inequality ones being last.
    //
    alglib::minnlcoptimize(state, nlcfunc1_jac);
    minnlcresults(state, x1, rep);
    printf("%s\n", x1.tostring(2).c_str()); // EXPECTED: [1.0000,0.0000]
    return 0;
}


#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "optimization.h"

using namespace alglib;
void  nlcfunc2_jac(const real_1d_array &x, real_1d_array &fi, real_2d_array &jac, void *ptr)
{
    //
    // this callback calculates
    //
    //     f0(x0,x1,x2) = x0+x1
    //     f1(x0,x1,x2) = x2-exp(x0)
    //     f2(x0,x1,x2) = x0^2+x1^2-1
    //
    // and Jacobian matrix J = [dfi/dxj]
    //
    fi[0] = x[0]+x[1];
    fi[1] = x[2]-exp(x[0]);
    fi[2] = x[0]*x[0] + x[1]*x[1] - 1.0;
    jac[0][0] = 1.0;
    jac[0][1] = 1.0;
    jac[0][2] = 0.0;
    jac[1][0] = -exp(x[0]);
    jac[1][1] = 0.0;
    jac[1][2] = 1.0;
    jac[2][0] = 2*x[0];
    jac[2][1] = 2*x[1];
    jac[2][2] = 0.0;
}

int main(int argc, char **argv)
{
    //
    // This example demonstrates minimization of
    //
    //     f(x0,x1) = x0+x1
    //
    // subject to nonlinear inequality constraint
    //
    //    x0^2 + x1^2 - 1 <= 0
    //
    // and nonlinear equality constraint
    //
    //    x2-exp(x0) = 0
    //
    real_1d_array x0 = "[0,0,0]";
    real_1d_array s = "[1,1,1]";
    double epsg = 0;
    double epsf = 0;
    double epsx = 0.000001;
    ae_int_t maxits = 0;
    ae_int_t outerits = 5;
    ae_int_t updatefreq = 10;
    double rho = 1000;
    minnlcstate state;
    minnlcreport rep;
    real_1d_array x1;

    //
    // Create optimizer object, choose AUL algorithm and tune its settings:
    // * rho=1000       penalty coefficient
    // * outerits=5     number of outer iterations to tune Lagrange coefficients
    // * epsx=0.000001  stopping condition for inner iterations
    // * s=[1,1,1]      all variables have unit scale
    // * exact low-rank preconditioner is used, updated after each 10 iterations
    // * upper limit on step length is specified (to avoid probing locations where exp() is large)
    //
    minnlccreate(3, x0, state);
    minnlcsetalgoaul(state, rho, outerits);
    minnlcsetcond(state, epsg, epsf, epsx, maxits);
    minnlcsetscale(state, s);
    minnlcsetprecexactlowrank(state, updatefreq);
    minnlcsetstpmax(state, 10.0);

    //
    // Set constraints:
    //
    // Nonlinear constraints are tricky - you can not "pack" general
    // nonlinear function into double precision array. That's why
    // minnlcsetnlc() does not accept constraints itself - only constraint
    // counts are passed: first parameter is number of equality constraints,
    // second one is number of inequality constraints.
    //
    // As for constraining functions - these functions are passed as part
    // of problem Jacobian (see below).
    //
    // NOTE: MinNLC optimizer supports arbitrary combination of boundary, general
    //       linear and general nonlinear constraints. This example does not
    //       show how to work with boundary or general linear constraints, but you
    //       can easily find it in documentation on minnlcsetbc() and
    //       minnlcsetlc() functions.
    //
    minnlcsetnlc(state, 1, 1);

    //
    // Optimize and test results.
    //
    // Optimizer object accepts vector function and its Jacobian, with first
    // component (Jacobian row) being target function, and next components
    // (Jacobian rows) being nonlinear equality and inequality constraints.
    //
    // So, our vector function has form
    //
    //     {f0,f1,f2} = { x0+x1 , x2-exp(x0) , x0^2+x1^2-1 }
    //
    // with Jacobian
    //
    //         [  +1      +1       0 ]
    //     J = [-exp(x0)  0        1 ]
    //         [ 2*x0    2*x1      0 ]
    //
    // with f0 being target function, f1 being equality constraint "f1=0",
    // f2 being inequality constraint "f2<=0". Number of equality/inequality
    // constraints is specified by minnlcsetnlc(), with equality ones always
    // being first, inequality ones being last.
    //
    alglib::minnlcoptimize(state, nlcfunc2_jac);
    minnlcresults(state, x1, rep);
    printf("%s\n", x1.tostring(2).c_str()); // EXPECTED: [-0.70710,-0.70710,0.49306]
    return 0;
}


minnsreport
minnsstate
minnscreate
minnscreatef
minnsoptimize
minnsrequesttermination
minnsrestartfrom
minnsresults
minnsresultsbuf
minnssetalgoags
minnssetbc
minnssetcond
minnssetlc
minnssetnlc
minnssetscale
minnssetxrep
minns_d_bc Nonsmooth box constrained optimization
minns_d_diff Nonsmooth unconstrained optimization with numerical differentiation
minns_d_nlc Nonsmooth nonlinearly constrained optimization
minns_d_unconstrained Nonsmooth unconstrained optimization
/************************************************************************* This structure stores optimization report: * IterationsCount total number of inner iterations * NFEV number of gradient evaluations * TerminationType termination type (see below) * CErr maximum violation of all types of constraints * LCErr maximum violation of linear constraints * NLCErr maximum violation of nonlinear constraints TERMINATION CODES TerminationType field contains completion code, which can be: -8 internal integrity control detected infinite or NAN values in function/gradient. Abnormal termination signalled. -3 box constraints are inconsistent -1 inconsistent parameters were passed: * penalty parameter for minnssetalgoags() is zero, but we have nonlinear constraints set by minnssetnlc() 2 sampling radius decreased below epsx 5 MaxIts steps were taken 7 stopping conditions are too stringent, further improvement is impossible, X contains best point found so far. 8 User requested termination via MinNSRequestTermination() Other fields of this structure are not documented and should not be used! *************************************************************************/
class minnsreport { ae_int_t iterationscount; ae_int_t nfev; double cerr; double lcerr; double nlcerr; ae_int_t terminationtype; ae_int_t varidx; ae_int_t funcidx; };
/************************************************************************* This object stores nonlinear optimizer state. You should use functions provided by MinNS subpackage to work with this object *************************************************************************/
class minnsstate { };
/************************************************************************* NONSMOOTH NONCONVEX OPTIMIZATION SUBJECT TO BOX/LINEAR/NONLINEAR-NONSMOOTH CONSTRAINTS DESCRIPTION: The subroutine minimizes function F(x) of N arguments subject to any combination of: * bound constraints * linear inequality constraints * linear equality constraints * nonlinear equality constraints Gi(x)=0 * nonlinear inequality constraints Hi(x)<=0 IMPORTANT: see MinNSSetAlgoAGS for important information on performance restrictions of AGS solver. REQUIREMENTS: * starting point X0 must be feasible or not too far away from the feasible set * F(), G(), H() are continuous, locally Lipschitz and continuously (but not necessarily twice) differentiable in an open dense subset of R^N. Functions F(), G() and H() may be nonsmooth and non-convex. Informally speaking, it means that functions are composed of large differentiable "patches" with nonsmoothness having place only at the boundaries between these "patches". Most real-life nonsmooth functions satisfy these requirements. Say, anything which involves finite number of abs(), min() and max() is very likely to pass the test. Say, it is possible to optimize anything of the following: * f=abs(x0)+2*abs(x1) * f=max(x0,x1) * f=sin(max(x0,x1)+abs(x2)) * for nonlinearly constrained problems: F() must be bounded from below without nonlinear constraints (this requirement is due to the fact that, contrary to box and linear constraints, nonlinear ones require special handling). * user must provide function value and gradient for F(), H(), G() at all points where function/gradient can be calculated. If optimizer requires value exactly at the boundary between "patches" (say, at x=0 for f=abs(x)), where gradient is not defined, user may resolve tie arbitrarily (in our case - return +1 or -1 at its discretion). * NS solver supports numerical differentiation, i.e. it may differentiate your function for you, but it results in 2N increase of function evaluations. Not recommended unless you solve really small problems. See minnscreatef() for more information on this functionality. USAGE: 1. User initializes algorithm state with MinNSCreate() call and chooses what NLC solver to use. There is some solver which is used by default, with default settings, but you should NOT rely on default choice. It may change in future releases of ALGLIB without notice, and no one can guarantee that new solver will be able to solve your problem with default settings. From the other side, if you choose solver explicitly, you can be pretty sure that it will work with new ALGLIB releases. In the current release following solvers can be used: * AGS solver (activated with MinNSSetAlgoAGS() function) 2. User adds boundary and/or linear and/or nonlinear constraints by means of calling one of the following functions: a) MinNSSetBC() for boundary constraints b) MinNSSetLC() for linear constraints c) MinNSSetNLC() for nonlinear constraints You may combine (a), (b) and (c) in one optimization problem. 3. User sets scale of the variables with MinNSSetScale() function. It is VERY important to set scale of the variables, because nonlinearly constrained problems are hard to solve when variables are badly scaled. 4. User sets stopping conditions with MinNSSetCond(). 5. Finally, user calls MinNSOptimize() function which takes algorithm state and pointer (delegate, etc) to callback function which calculates F/G/H. 7. User calls MinNSResults() to get solution 8. 
Optionally user may call MinNSRestartFrom() to solve another problem with same N but another starting point. MinNSRestartFrom() allows to reuse already initialized structure. INPUT PARAMETERS: N - problem dimension, N>0: * if given, only leading N elements of X are used * if not given, automatically determined from size of X X - starting point, array[N]: * it is better to set X to a feasible point * but X can be infeasible, in which case algorithm will try to find feasible point first, using X as initial approximation. OUTPUT PARAMETERS: State - structure stores algorithm state NOTE: minnscreatef() function may be used if you do not have analytic gradient. This function creates solver which uses numerical differentiation with user-specified step. -- ALGLIB -- Copyright 18.05.2015 by Bochkanov Sergey *************************************************************************/
void alglib::minnscreate(real_1d_array x, minnsstate& state); void alglib::minnscreate(ae_int_t n, real_1d_array x, minnsstate& state);

Examples:   [1]  [2]  [3]  

/************************************************************************* Version of minnscreate() which uses numerical differentiation. I.e., you do not have to calculate derivatives yourself. However, this version needs 2N times more function evaluations. 2-point differentiation formula is used, because more precise 4-point formula is unstable when used on non-smooth functions. INPUT PARAMETERS: N - problem dimension, N>0: * if given, only leading N elements of X are used * if not given, automatically determined from size of X X - starting point, array[N]: * it is better to set X to a feasible point * but X can be infeasible, in which case algorithm will try to find feasible point first, using X as initial approximation. DiffStep- differentiation step, DiffStep>0. Algorithm performs numerical differentiation with step for I-th variable being equal to DiffStep*S[I] (here S[] is a scale vector, set by minnssetscale() function). Do not use too small steps, because it may lead to catastrophic cancellation during intermediate calculations. OUTPUT PARAMETERS: State - structure stores algorithm state -- ALGLIB -- Copyright 18.05.2015 by Bochkanov Sergey *************************************************************************/
void alglib::minnscreatef( real_1d_array x, double diffstep, minnsstate& state); void alglib::minnscreatef( ae_int_t n, real_1d_array x, double diffstep, minnsstate& state);

Examples:   [1]  

/************************************************************************* This family of functions is used to launch iterations of nonlinear optimizer. These functions accept following parameters: state - algorithm state fvec - callback which calculates function vector fi[] at given point x jac - callback which calculates function vector fi[] and Jacobian jac at given point x rep - optional callback which is called after each iteration can be NULL ptr - optional pointer which is passed to func/grad/hess/jac/rep can be NULL NOTES: 1. This function has two different implementations: one which uses exact (analytical) user-supplied Jacobian, and one which uses only function vector and numerically differentiates function in order to obtain gradient. Depending on the specific function used to create optimizer object you should choose appropriate variant of minnsoptimize() - one which accepts function AND Jacobian or one which accepts ONLY function. Be careful to choose variant of minnsoptimize() which corresponds to your optimization scheme! Table below lists different combinations of callback (function/gradient) passed to minnsoptimize() and specific function used to create optimizer. | USER PASSED TO minnsoptimize() CREATED WITH | function only | function and gradient ------------------------------------------------------------ minnscreatef() | works FAILS minnscreate() | FAILS works Here "FAILS" denotes inappropriate combinations of optimizer creation function and minnsoptimize() version. Attempts to use such combination will lead to exception. Either you did not pass gradient when it WAS needed or you passed gradient when it was NOT needed. -- ALGLIB -- Copyright 18.05.2015 by Bochkanov Sergey *************************************************************************/
void minnsoptimize(minnsstate &state, void (*fvec)(const real_1d_array &x, real_1d_array &fi, void *ptr), void (*rep)(const real_1d_array &x, double func, void *ptr) = NULL, void *ptr = NULL); void minnsoptimize(minnsstate &state, void (*jac)(const real_1d_array &x, real_1d_array &fi, real_2d_array &jac, void *ptr), void (*rep)(const real_1d_array &x, double func, void *ptr) = NULL, void *ptr = NULL);

Examples:   [1]  [2]  [3]  [4]  

/************************************************************************* This subroutine submits request for termination of running optimizer. It should be called from user-supplied callback when user decides that it is time to "smoothly" terminate optimization process. As result, optimizer stops at point which was "current accepted" when termination request was submitted and returns error code 8 (successful termination). INPUT PARAMETERS: State - optimizer structure NOTE: after request for termination optimizer may perform several additional calls to user-supplied callbacks. It does NOT guarantee to stop immediately - it just guarantees that these additional calls will be discarded later. NOTE: calling this function on optimizer which is NOT running will have no effect. NOTE: multiple calls to this function are possible. First call is counted, subsequent calls are silently ignored. -- ALGLIB -- Copyright 18.05.2015 by Bochkanov Sergey *************************************************************************/
void alglib::minnsrequesttermination(minnsstate state);
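A sketch of smooth termination from within the reporting callback is given below. Passing the optimizer state through the ptr argument and the stop threshold are illustrative conventions, not requirements of the library; reporting must be enabled with minnssetxrep() for the callback to be invoked.

#include "optimization.h"

using namespace alglib;

// assumed defined as in the examples below
void nsfunc1_jac(const real_1d_array &x, real_1d_array &fi, real_2d_array &jac, void *ptr);

// Hedged sketch: request termination from the rep() callback once a
// user-defined condition is met; the optimizer then returns code 8.
void my_rep(const real_1d_array &x, double func, void *ptr)
{
    minnsstate *state = (minnsstate*)ptr;
    if( func<1.0e-3 )                     // illustrative stopping condition
        minnsrequesttermination(*state);
}

// Usage sketch: minnssetxrep(state, true);
//               alglib::minnsoptimize(state, nsfunc1_jac, my_rep, &state);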
/************************************************************************* This subroutine restarts algorithm from new point. All optimization parameters (including constraints) are left unchanged. This function allows to solve multiple optimization problems (which must have same number of dimensions) without object reallocation penalty. INPUT PARAMETERS: State - structure previously allocated with minnscreate() call. X - new starting point. -- ALGLIB -- Copyright 18.05.2015 by Bochkanov Sergey *************************************************************************/
void alglib::minnsrestartfrom(minnsstate state, real_1d_array x);
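A minimal restart sketch is shown below, assuming the solver state was already configured and used once as in the examples that follow; the new starting point is an illustrative value.

#include "optimization.h"

using namespace alglib;

// assumed defined as in the examples below
void nsfunc1_jac(const real_1d_array &x, real_1d_array &fi, real_2d_array &jac, void *ptr);

// Hedged sketch: reuse an already configured minnsstate for a second problem
// of the same dimension, avoiding reallocation of internal structures.
void solve_again(minnsstate &state, real_1d_array &xsol, minnsreport &rep)
{
    real_1d_array xnew = "[0.5,-0.5]";    // illustrative new starting point
    minnsrestartfrom(state, xnew);
    alglib::minnsoptimize(state, nsfunc1_jac);
    minnsresults(state, xsol, rep);
}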
/************************************************************************* MinNS results INPUT PARAMETERS: State - algorithm state OUTPUT PARAMETERS: X - array[0..N-1], solution Rep - optimization report. You should check Rep.TerminationType in order to distinguish successful termination from unsuccessful one: * -8 internal integrity control detected infinite or NAN values in function/gradient. Abnormal termination signalled. * -3 box constraints are inconsistent * -1 inconsistent parameters were passed: * penalty parameter for minnssetalgoags() is zero, but we have nonlinear constraints set by minnssetnlc() * 2 sampling radius decreased below epsx * 7 stopping conditions are too stringent, further improvement is impossible, X contains best point found so far. * 8 User requested termination via minnsrequesttermination() -- ALGLIB -- Copyright 18.05.2015 by Bochkanov Sergey *************************************************************************/
void alglib::minnsresults( minnsstate state, real_1d_array& x, minnsreport& rep);

Examples:   [1]  [2]  [3]  [4]  

/************************************************************************* Buffered implementation of minnsresults() which uses pre-allocated buffer to store X[]. If buffer size is too small, it resizes buffer. It is intended to be used in the inner cycles of performance critical algorithms where array reallocation penalty is too large to be ignored. -- ALGLIB -- Copyright 18.05.2015 by Bochkanov Sergey *************************************************************************/
void alglib::minnsresultsbuf( minnsstate state, real_1d_array& x, minnsreport& rep);
/************************************************************************* This function tells MinNS unit to use AGS (adaptive gradient sampling) algorithm for nonsmooth constrained optimization. This algorithm is a slight modification of one described in "An Adaptive Gradient Sampling Algorithm for Nonsmooth Optimization" by Frank E. Curtis and Xiaocun Que. This optimizer has following benefits and drawbacks: + robustness; it can be used with nonsmooth and nonconvex functions. + relatively easy tuning; most of the metaparameters are easy to select. - it has convergence of steepest descent, slower than CG/LBFGS. - each iteration involves evaluation of ~2N gradient values and solution of 2Nx2N quadratic programming problem, which limits applicability of algorithm to small-scale problems (up to 50-100). IMPORTANT: this algorithm has convergence guarantees, i.e. it will steadily move towards some stationary point of the function. However, "stationary point" does not always mean "solution". Nonsmooth problems often have "flat spots", i.e. areas where function does not change at all. Such "flat spots" are stationary points by definition, and algorithm may be caught here. Nonsmooth CONVEX tasks are not prone to this problem. Say, if your function has form f()=MAX(f0,f1,...), and f_i are convex, then f() is convex too and you have guaranteed convergence to solution. INPUT PARAMETERS: State - structure which stores algorithm state Radius - initial sampling radius, >=0. Internally multiplied by vector of per-variable scales specified by minnssetscale(). You should select relatively large sampling radius, roughly proportional to scaled length of the first steps of the algorithm. Something close to 0.1 in magnitude should be good for most problems. AGS solver can automatically decrease radius, so too large radius is not a problem (assuming that you won't choose so large radius that algorithm will sample function in too far away points, where gradient value is irrelevant). Too small radius won't cause algorithm to fail, but it may slow down algorithm (it may have to perform too short steps). Penalty - penalty coefficient for nonlinear constraints: * for problem with nonlinear constraints should be some problem-specific positive value, large enough that penalty term changes shape of the function. Starting from some problem-specific value penalty coefficient becomes large enough to exactly enforce nonlinear constraints; larger values do not improve precision. Increasing it too much may slow down convergence, so you should choose it carefully. * can be zero for problems WITHOUT nonlinear constraints (i.e. for unconstrained ones or ones with just box or linear constraints) * if you specify zero value for problem with at least one nonlinear constraint, algorithm will terminate with error code -1. ALGORITHM OUTLINE The very basic outline of unconstrained AGS algorithm is given below: 0. If sampling radius is below EpsX or we performed more than MaxIts iterations - STOP. 1. sample O(N) gradient values at random locations around current point; informally speaking, this sample is an implicit piecewise linear model of the function, although algorithm formulation does not mention that explicitly 2. solve quadratic programming problem in order to find descent direction 3. if QP solver tells us that we are near solution, decrease sampling radius and move to (0) 4. perform backtracking line search 5. 
after moving to new point, goto (0) As for the constraints: * box constraints are handled exactly by modification of the function being minimized * linear/nonlinear constraints are handled by adding L1 penalty. Because our solver can handle nonsmoothness, we can use L1 penalty function, which is an exact one (i.e. exact solution is returned under such penalty). * penalty coefficient for linear constraints is chosen automatically; however, penalty coefficient for nonlinear constraints must be specified by user. -- ALGLIB -- Copyright 18.05.2015 by Bochkanov Sergey *************************************************************************/
void alglib::minnssetalgoags( minnsstate state, double radius, double penalty);

Examples:   [1]  [2]  [3]  [4]  

/************************************************************************* This function sets boundary constraints. Boundary constraints are inactive by default (after initial creation). They are preserved after algorithm restart with minnsrestartfrom(). INPUT PARAMETERS: State - structure stores algorithm state BndL - lower bounds, array[N]. If some (all) variables are unbounded, you may specify very small number or -INF. BndU - upper bounds, array[N]. If some (all) variables are unbounded, you may specify very large number or +INF. NOTE 1: it is possible to specify BndL[i]=BndU[i]. In this case I-th variable will be "frozen" at X[i]=BndL[i]=BndU[i]. NOTE 2: AGS solver has following useful properties: * bound constraints are always satisfied exactly * function is evaluated only INSIDE area specified by bound constraints, even when numerical differentiation is used (algorithm adjusts nodes according to boundary constraints) -- ALGLIB -- Copyright 18.05.2015 by Bochkanov Sergey *************************************************************************/
void alglib::minnssetbc( minnsstate state, real_1d_array bndl, real_1d_array bndu);

Examples:   [1]  

/************************************************************************* This function sets stopping conditions for iterations of optimizer. INPUT PARAMETERS: State - structure which stores algorithm state EpsX - >=0 The AGS solver finishes its work if on k+1-th iteration sampling radius decreases below EpsX. MaxIts - maximum number of iterations. If MaxIts=0, the number of iterations is unlimited. Passing EpsX=0 and MaxIts=0 (simultaneously) will lead to automatic stopping criterion selection. We do not recommend you to rely on default choice in production code. -- ALGLIB -- Copyright 18.05.2015 by Bochkanov Sergey *************************************************************************/
void alglib::minnssetcond(minnsstate state, double epsx, ae_int_t maxits);

Examples:   [1]  [2]  [3]  [4]  

/************************************************************************* This function sets linear constraints. Linear constraints are inactive by default (after initial creation). They are preserved after algorithm restart with minnsrestartfrom(). INPUT PARAMETERS: State - structure previously allocated with minnscreate() call. C - linear constraints, array[K,N+1]. Each row of C represents one constraint, either equality or inequality (see below): * first N elements correspond to coefficients, * last element corresponds to the right part. All elements of C (including right part) must be finite. CT - type of constraints, array[K]: * if CT[i]>0, then I-th constraint is C[i,*]*x >= C[i,n+1] * if CT[i]=0, then I-th constraint is C[i,*]*x = C[i,n+1] * if CT[i]<0, then I-th constraint is C[i,*]*x <= C[i,n+1] K - number of equality/inequality constraints, K>=0: * if given, only leading K elements of C/CT are used * if not given, automatically determined from sizes of C/CT NOTE: linear (non-bound) constraints are satisfied only approximately: * there always exists some minor violation (about current sampling radius in magnitude during optimization, about EpsX in the solution) due to use of penalty method to handle constraints. * numerical differentiation, if used, may lead to function evaluations outside of the feasible area, because algorithm does NOT change numerical differentiation formula according to linear constraints. If you want constraints to be satisfied exactly, try to reformulate your problem in such manner that all constraints will become boundary ones (this kind of constraints is always satisfied exactly, both in the final solution and in all intermediate points). -- ALGLIB -- Copyright 18.05.2015 by Bochkanov Sergey *************************************************************************/
void alglib::minnssetlc( minnsstate state, real_2d_array c, integer_1d_array ct); void alglib::minnssetlc( minnsstate state, real_2d_array c, integer_1d_array ct, ae_int_t k);
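The AGS examples below use only box constraints, so a minimal linear-constraint sketch is given here; the constraint itself (x0+2*x1>=1) and the use of the explicit K argument are illustrative assumptions.

#include "optimization.h"

using namespace alglib;

// Hedged sketch: one linear inequality x0+2*x1>=1 for a 2-variable problem,
// passed with an explicit constraint count K=1.
void set_minns_linear_constraint(minnsstate &state)
{
    real_2d_array c = "[[1,2,1]]";   // coefficients [1,2], right part 1
    integer_1d_array ct = "[1]";     // CT[0]>0  =>  ">=" constraint
    minnssetlc(state, c, ct, 1);
}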
/************************************************************************* This function sets nonlinear constraints. In fact, this function sets NUMBER of nonlinear constraints. Constraints themselves (constraint functions) are passed to minnsoptimize() method. This method requires user-defined vector function F[] and its Jacobian J[], where: * first component of F[] and first row of Jacobian J[] correspond to function being minimized * next NLEC components of F[] (and rows of J) correspond to nonlinear equality constraints G_i(x)=0 * next NLIC components of F[] (and rows of J) correspond to nonlinear inequality constraints H_i(x)<=0 NOTE: you may combine nonlinear constraints with linear/boundary ones. If your problem has mixed constraints, you may explicitly specify some of them as linear ones. It may help optimizer to handle them more efficiently. INPUT PARAMETERS: State - structure previously allocated with minnscreate() call. NLEC - number of Non-Linear Equality Constraints (NLEC), >=0 NLIC - number of Non-Linear Inequality Constraints (NLIC), >=0 NOTE 1: nonlinear constraints are satisfied only approximately! It is possible that algorithm will evaluate function outside of the feasible area! NOTE 2: algorithm scales variables according to scale specified by minnssetscale() function, so it can handle problems with badly scaled variables (as long as we KNOW their scales). However, there is no way to automatically scale nonlinear constraints Gi(x) and Hi(x). Inappropriate scaling of Gi/Hi may ruin convergence. Solving problem with constraint "1000*G0(x)=0" is NOT same as solving it with constraint "0.001*G0(x)=0". It means that YOU are the one who is responsible for correct scaling of nonlinear constraints Gi(x) and Hi(x). We recommend you to scale nonlinear constraints in such way that I-th component of dG/dX (or dH/dx) has approximately unit magnitude (for problems with unit scale) or has magnitude approximately equal to 1/S[i] (where S is a scale set by minnssetscale() function). NOTE 3: nonlinear constraints are always hard to handle, no matter what algorithm you try to use. Even basic box/linear constraints modify function curvature by adding valleys and ridges. However, nonlinear constraints add valleys which are very hard to follow due to their "curved" nature. It means that optimization with single nonlinear constraint may be significantly slower than optimization with multiple linear ones. It is normal situation, and we recommend you to carefully choose Rho parameter of minnssetalgoags(), because too large value may slow down convergence. -- ALGLIB -- Copyright 18.05.2015 by Bochkanov Sergey *************************************************************************/
void alglib::minnssetnlc(minnsstate state, ae_int_t nlec, ae_int_t nlic);

Examples:   [1]  

/************************************************************************* This function sets scaling coefficients for NLC optimizer. ALGLIB optimizers use scaling matrices to test stopping conditions (step size and gradient are scaled before comparison with tolerances). Scale of the I-th variable is a translation invariant measure of: a) "how large" the variable is b) how large the step should be to make significant changes in the function Scaling is also used by finite difference variant of the optimizer - step along I-th axis is equal to DiffStep*S[I]. INPUT PARAMETERS: State - structure stores algorithm state S - array[N], non-zero scaling coefficients S[i] may be negative, sign doesn't matter. -- ALGLIB -- Copyright 18.05.2015 by Bochkanov Sergey *************************************************************************/
void alglib::minnssetscale(minnsstate state, real_1d_array s);

Examples:   [1]  [2]  [3]  [4]  

/************************************************************************* This function turns on/off reporting. INPUT PARAMETERS: State - structure which stores algorithm state NeedXRep- whether iteration reports are needed or not If NeedXRep is True, algorithm will call rep() callback function if it is provided to minnsoptimize(). -- ALGLIB -- Copyright 28.11.2010 by Bochkanov Sergey *************************************************************************/
void alglib::minnssetxrep(minnsstate state, bool needxrep);
#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "optimization.h"

using namespace alglib;
void  nsfunc1_jac(const real_1d_array &x, real_1d_array &fi, real_2d_array &jac, void *ptr)
{
    //
    // this callback calculates
    //
    //     f0(x0,x1) = 2*|x0|+|x1|
    //
    // and Jacobian matrix J = [df0/dx0 df0/dx1]
    //
    fi[0] = 2*fabs(double(x[0]))+fabs(double(x[1]));
    jac[0][0] = 2*alglib::sign(x[0]);
    jac[0][1] = alglib::sign(x[1]);
}

int main(int argc, char **argv)
{
    //
    // This example demonstrates minimization of
    //
    //     f(x0,x1) = 2*|x0|+|x1|
    //
    // subject to box constraints
    //
    //        1 <= x0 < +INF
    //     -INF <= x1 < +INF
    //
    // using nonsmooth nonlinear optimizer.
    //
    real_1d_array x0 = "[1,1]";
    real_1d_array s = "[1,1]";
    real_1d_array bndl = "[1,-inf]";
    real_1d_array bndu = "[+inf,+inf]";
    double epsx = 0.00001;
    double radius = 0.1;
    double rho = 0.0;
    ae_int_t maxits = 0;
    minnsstate state;
    minnsreport rep;
    real_1d_array x1;

    //
    // Create optimizer object, choose AGS algorithm and tune its settings:
    // * radius=0.1     good initial value; will be automatically decreased later.
    // * rho=0.0        penalty coefficient for nonlinear constraints; can be zero
    //                  because we do not have such constraints
    // * epsx=0.00001   stopping conditions
    // * s=[1,1]        all variables have unit scale
    //
    minnscreate(2, x0, state);
    minnssetalgoags(state, radius, rho);
    minnssetcond(state, epsx, maxits);
    minnssetscale(state, s);

    //
    // Set box constraints.
    //
    // General linear constraints are set in similar way (see comments on
    // minnssetlc() function for more information).
    //
    // You may combine box, linear and nonlinear constraints in one optimization
    // problem.
    //
    minnssetbc(state, bndl, bndu);

    //
    // Optimize and test results.
    //
    // Optimizer object accepts vector function and its Jacobian, with first
    // component (Jacobian row) being target function, and next components
    // (Jacobian rows) being nonlinear equality and inequality constraints
    // (box/linear ones are passed separately by means of minnssetbc() and
    // minnssetlc() calls).
    //
    // If you do not have nonlinear constraints (exactly our situation), then
    // you will have one-component function vector and 1xN Jacobian matrix.
    //
    // So, our vector function has form
    //
    //     {f0} = { 2*|x0|+|x1| }
    //
    // with Jacobian
    //
    //         [                       ]
    //     J = [ 2*sign(x0)   sign(x1) ]
    //         [                       ]
    //
    // NOTE: nonsmooth optimizer requires considerably more function
    //       evaluations than smooth solver - about 2N times more. Using
    //       numerical differentiation introduces an additional (multiplicative)
    //       2N increase in the number of function evaluations.
    //
    //       It means that if smooth optimizer WITH user-supplied gradient
    //       needs 100 function evaluations to solve 50-dimensional problem,
    //       then AGS solver with user-supplied gradient will need about 10,000
    //       function evaluations, and with numerical gradient about 1,000,000
    //       function evaluations will be performed.
    //
    // NOTE: AGS solver used by us can handle nonsmooth and nonconvex
    //       optimization problems. It has convergence guarantees, i.e. it will
    //       converge to stationary point of the function after running for some
    //       time.
    //
    //       However, it is important to remember that "stationary point" is not
    //       equal to "solution". If your problem is convex, everything is OK.
    //       But nonconvex optimization problems may have "flat spots" - large
    //       areas where gradient is exactly zero, but function value is far away
    //       from optimal. Such areas are stationary points too, and optimizer
    //       may be trapped here.
    //
    //       "Flat spots" are nonsmooth equivalent of the saddle points, but with
    //       orders of magnitude worse properties - they may be quite large and
    //       hard to avoid. All nonsmooth optimizers are prone to this kind of
    //       problem, because it is impossible to automatically distinguish "flat
    //       spot" from true solution.
    //
    //       This note is here to warn you that you should be very careful when
    //       you solve nonsmooth optimization problems. Visual inspection of
    //       results is essential.
    //
    //
    alglib::minnsoptimize(state, nsfunc1_jac);
    minnsresults(state, x1, rep);
    printf("%s\n", x1.tostring(2).c_str()); // EXPECTED: [1.0000,0.0000]
    return 0;
}


#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "optimization.h"

using namespace alglib;
void  nsfunc1_fvec(const real_1d_array &x, real_1d_array &fi, void *ptr)
{
    //
    // this callback calculates
    //
    //     f0(x0,x1) = 2*|x0|+|x1|
    //
    fi[0] = 2*fabs(double(x[0]))+fabs(double(x[1]));
}

int main(int argc, char **argv)
{
    //
    // This example demonstrates minimization of
    //
    //     f(x0,x1) = 2*|x0|+|x1|
    //
    // using nonsmooth nonlinear optimizer with numerical
    // differentiation provided by ALGLIB.
    //
    // NOTE: nonsmooth optimizer requires considerably more function
    //       evaluations than smooth solver - about 2N times more. Using
    //       numerical differentiation introduces additional (multiplicative)
    //       2N slowdown.
    //
    //       It means that if smooth optimizer WITH user-supplied gradient
    //       needs 100 function evaluations to solve 50-dimensional problem,
    //       then AGS solver with user-supplied gradient will need about 10,000
    //       function evaluations, and with numerical gradient about 1,000,000
    //       function evaluations will be performed.
    //
    real_1d_array x0 = "[1,1]";
    real_1d_array s = "[1,1]";
    double epsx = 0.00001;
    double diffstep = 0.000001;
    double radius = 0.1;
    double rho = 0.0;
    ae_int_t maxits = 0;
    minnsstate state;
    minnsreport rep;
    real_1d_array x1;

    //
    // Create optimizer object, choose AGS algorithm and tune its settings:
    // * radius=0.1     good initial value; will be automatically decreased later.
    // * rho=0.0        penalty coefficient for nonlinear constraints; can be zero
    //                  because we do not have such constraints
    // * epsx=0.00001   stopping conditions
    // * s=[1,1]        all variables have unit scale
    //
    minnscreatef(2, x0, diffstep, state);
    minnssetalgoags(state, radius, rho);
    minnssetcond(state, epsx, maxits);
    minnssetscale(state, s);

    //
    // Optimize and test results.
    //
    // Optimizer object accepts vector function, with first component
    // being target function, and next components being nonlinear equality
    // and inequality constraints (box/linear ones are passed separately
    // by means of minnssetbc() and minnssetlc() calls).
    //
    // If you do not have nonlinear constraints (exactly our situation), then
    // you will have one-component function vector.
    //
    // So, our vector function has form
    //
    //     {f0} = { 2*|x0|+|x1| }
    //
    alglib::minnsoptimize(state, nsfunc1_fvec);
    minnsresults(state, x1, rep);
    printf("%s\n", x1.tostring(2).c_str()); // EXPECTED: [0.0000,0.0000]
    return 0;
}


#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "optimization.h"

using namespace alglib;
void  nsfunc2_jac(const real_1d_array &x, real_1d_array &fi, real_2d_array &jac, void *ptr)
{
    //
    // this callback calculates function vector
    //
    //     f0(x0,x1) = 2*|x0|+|x1|
    //     f1(x0,x1) = x0-1
    //     f2(x0,x1) = -x1-1
    //
    // and Jacobian matrix J
    //
    //         [ df0/dx0   df0/dx1 ]
    //     J = [ df1/dx0   df1/dx1 ]
    //         [ df2/dx0   df2/dx1 ]
    //
    fi[0] = 2*fabs(double(x[0]))+fabs(double(x[1]));
    jac[0][0] = 2*alglib::sign(x[0]);
    jac[0][1] = alglib::sign(x[1]);
    fi[1] = x[0]-1;
    jac[1][0] = 1;
    jac[1][1] = 0;
    fi[2] = -x[1]-1;
    jac[2][0] = 0;
    jac[2][1] = -1;
}

int main(int argc, char **argv)
{
    //
    // This example demonstrates minimization of
    //
    //     f(x0,x1) = 2*|x0|+|x1|
    //
    // subject to combination of equality and inequality constraints
    //
    //      x0  =  1
    //      x1 >= -1
    //
    // using nonsmooth nonlinear optimizer. Although these constraints
    // are linear, we treat them as general nonlinear ones in order to
    // demonstrate nonlinearly constrained optimization setup.
    //
    real_1d_array x0 = "[1,1]";
    real_1d_array s = "[1,1]";
    double epsx = 0.00001;
    double radius = 0.1;
    double rho = 50.0;
    ae_int_t maxits = 0;
    minnsstate state;
    minnsreport rep;
    real_1d_array x1;

    //
    // Create optimizer object, choose AGS algorithm and tune its settings:
    // * radius=0.1     good initial value; will be automatically decreased later.
    // * rho=50.0       penalty coefficient for nonlinear constraints. It is your
    //                  responsibility to choose good one - large enough that it
    //                  enforces constraints, but small enough in order to avoid
    //                  extreme slowdown due to ill-conditioning.
    // * epsx=0.00001   stopping conditions
    // * s=[1,1]        all variables have unit scale
    //
    minnscreate(2, x0, state);
    minnssetalgoags(state, radius, rho);
    minnssetcond(state, epsx, maxits);
    minnssetscale(state, s);

    //
    // Set general nonlinear constraints.
    //
    // This part is more tricky than working with box/linear constraints - you
    // can not "pack" general nonlinear function into double precision array.
    // That's why minnssetnlc() does not accept constraints itself - only
    // constraint COUNTS are passed: first parameter is number of equality
    // constraints, second one is number of inequality constraints.
    //
    // As for constraining functions - these functions are passed as part
    // of problem Jacobian (see below).
    //
    // NOTE: MinNS optimizer supports arbitrary combination of boundary, general
    //       linear and general nonlinear constraints. This example does not
    //       show how to work with general linear constraints, but you can
    //       easily find it in documentation on minnssetlc() function.
    //
    minnssetnlc(state, 1, 1);

    //
    // Optimize and test results.
    //
    // Optimizer object accepts vector function and its Jacobian, with first
    // component (Jacobian row) being target function, and next components
    // (Jacobian rows) being nonlinear equality and inequality constraints
    // (box/linear ones are passed separately by means of minnssetbc() and
    // minnssetlc() calls).
    //
    // Nonlinear equality constraints have form Gi(x)=0, inequality ones
    // have form Hi(x)<=0, so we may have to "normalize" constraints prior
    // to passing them to optimizer (right side is zero, constraints are
    // sorted, multiplied by -1 when needed).
    //
    // So, our vector function has form
    //
    //     {f0,f1,f2} = { 2*|x0|+|x1|,  x0-1, -x1-1 }
    //
    // with Jacobian
    //
    //         [ 2*sign(x0)   sign(x1) ]
    //     J = [     1           0     ]
    //         [     0          -1     ]
    //
    // which means that we have optimization problem
    //
    //     min{f0} subject to f1=0, f2<=0
    //
    // which is essentially same as
    //
    //     min { 2*|x0|+|x1| } subject to x0=1, x1>=-1
    //
    // NOTE: AGS solver used by us can handle nonsmooth and nonconvex
    //       optimization problems. It has convergence guarantees, i.e. it will
    //       converge to stationary point of the function after running for some
    //       time.
    //
    //       However, it is important to remember that "stationary point" is not
    //       equal to "solution". If your problem is convex, everything is OK.
    //       But nonconvex optimization problems may have "flat spots" - large
    //       areas where gradient is exactly zero, but function value is far away
    //       from optimal. Such areas are stationary points too, and optimizer
    //       may be trapped here.
    //
    //       "Flat spots" are nonsmooth equivalent of the saddle points, but with
    //       orders of magnitude worse properties - they may be quite large and
    //       hard to avoid. All nonsmooth optimizers are prone to this kind of
    //       problem, because it is impossible to automatically distinguish "flat
    //       spot" from true solution.
    //
    //       This note is here to warn you that you should be very careful when
    //       you solve nonsmooth optimization problems. Visual inspection of
    //       results is essential.
    //
    alglib::minnsoptimize(state, nsfunc2_jac);
    minnsresults(state, x1, rep);
    printf("%s\n", x1.tostring(2).c_str()); // EXPECTED: [1.0000,0.0000]
    return 0;
}


#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "optimization.h"

using namespace alglib;
void  nsfunc1_jac(const real_1d_array &x, real_1d_array &fi, real_2d_array &jac, void *ptr)
{
    //
    // this callback calculates
    //
    //     f0(x0,x1) = 2*|x0|+|x1|
    //
    // and Jacobian matrix J = [df0/dx0 df0/dx1]
    //
    fi[0] = 2*fabs(double(x[0]))+fabs(double(x[1]));
    jac[0][0] = 2*alglib::sign(x[0]);
    jac[0][1] = alglib::sign(x[1]);
}

int main(int argc, char **argv)
{
    //
    // This example demonstrates minimization of
    //
    //     f(x0,x1) = 2*|x0|+|x1|
    //
    // using nonsmooth nonlinear optimizer.
    //
    real_1d_array x0 = "[1,1]";
    real_1d_array s = "[1,1]";
    double epsx = 0.00001;
    double radius = 0.1;
    double rho = 0.0;
    ae_int_t maxits = 0;
    minnsstate state;
    minnsreport rep;
    real_1d_array x1;

    //
    // Create optimizer object, choose AGS algorithm and tune its settings:
    // * radius=0.1     good initial value; will be automatically decreased later.
    // * rho=0.0        penalty coefficient for nonlinear constraints; can be zero
    //                  because we do not have such constraints
    // * epsx=0.00001   stopping conditions
    // * s=[1,1]        all variables have unit scale
    //
    minnscreate(2, x0, state);
    minnssetalgoags(state, radius, rho);
    minnssetcond(state, epsx, maxits);
    minnssetscale(state, s);

    //
    // Optimize and test results.
    //
    // Optimizer object accepts vector function and its Jacobian, with first
    // component (Jacobian row) being target function, and next components
    // (Jacobian rows) being nonlinear equality and inequality constraints
    // (box/linear ones are passed separately by means of minnssetbc() and
    // minnssetlc() calls).
    //
    // If you do not have nonlinear constraints (exactly our situation), then
    // you will have one-component function vector and 1xN Jacobian matrix.
    //
    // So, our vector function has form
    //
    //     {f0} = { 2*|x0|+|x1| }
    //
    // with Jacobian
    //
    //         [                       ]
    //     J = [ 2*sign(x0)   sign(x1) ]
    //         [                       ]
    //
    // NOTE: nonsmooth optimizer requires considerably more function
    //       evaluations than smooth solver - about 2N times more. Using
    //       numerical differentiation introduces additional (multiplicative)
    //       2N slowdown.
    //
    //       It means that if smooth optimizer WITH user-supplied gradient
    //       needs 100 function evaluations to solve 50-dimensional problem,
    //       then AGS solver with user-supplied gradient will need about 10,000
    //       function evaluations, and with numerical gradient about 1,000,000
    //       function evaluations will be performed.
    //
    // NOTE: AGS solver used by us can handle nonsmooth and nonconvex
    //       optimization problems. It has convergence guarantees, i.e. it will
    //       converge to stationary point of the function after running for some
    //       time.
    //
    //       However, it is important to remember that "stationary point" is not
    //       equal to "solution". If your problem is convex, everything is OK.
    //       But nonconvex optimization problems may have "flat spots" - large
    //       areas where gradient is exactly zero, but function value is far away
    //       from optimal. Such areas are stationary points too, and optimizer
    //       may be trapped here.
    //
    //       "Flat spots" are nonsmooth equivalent of the saddle points, but with
    //       orders of magnitude worse properties - they may be quite large and
    //       hard to avoid. All nonsmooth optimizers are prone to this kind of
    //       problem, because it is impossible to automatically distinguish "flat
    //       spot" from true solution.
    //
    //       This note is here to warn you that you should be very careful when
    //       you solve nonsmooth optimization problems. Visual inspection of
    //       results is essential.
    //
    alglib::minnsoptimize(state, nsfunc1_jac);
    minnsresults(state, x1, rep);
    printf("%s\n", x1.tostring(2).c_str()); // EXPECTED: [0.0000,0.0000]
    return 0;
}


minqpreport
minqpstate
minqpcreate
minqpoptimize
minqpresults
minqpresultsbuf
minqpsetalgobleic
minqpsetalgocholesky
minqpsetalgodenseaul
minqpsetalgoquickqp
minqpsetbc
minqpsetlc
minqpsetlcmixed
minqpsetlcsparse
minqpsetlinearterm
minqpsetorigin
minqpsetquadraticterm
minqpsetquadratictermsparse
minqpsetscale
minqpsetstartingpoint
minqp_d_bc1 Bound constrained dense quadratic programming
minqp_d_lc1 Linearly constrained dense quadratic programming
minqp_d_nonconvex Nonconvex quadratic programming
minqp_d_u1 Unconstrained dense quadratic programming
minqp_d_u2 Unconstrained sparse quadratic programming
/************************************************************************* This structure stores optimization report: * InnerIterationsCount number of inner iterations * OuterIterationsCount number of outer iterations * NCholesky number of Cholesky decompositions * NMV number of matrix-vector products (only products calculated as part of iterative process are counted) * TerminationType completion code (see below) Completion codes: * -5 inappropriate solver was used: * QuickQP solver for problem with general linear constraints (dense/sparse) * -4 BLEIC-QP or QuickQP solver found unconstrained direction of negative curvature (function is unbounded from below even under constraints), no meaningful minimum can be found. * -3 inconsistent constraints (or, maybe, feasible point is too hard to find). If you are sure that constraints are feasible, try to restart optimizer with better initial approximation. * -1 solver error * 1..4 successful completion * 5 MaxIts steps were taken * 7 stopping conditions are too stringent, further improvement is impossible, X contains best point found so far. *************************************************************************/
class minqpreport { ae_int_t inneriterationscount; ae_int_t outeriterationscount; ae_int_t nmv; ae_int_t ncholesky; ae_int_t terminationtype; };
/************************************************************************* This object stores nonlinear optimizer state. You should use functions provided by MinQP subpackage to work with this object *************************************************************************/
class minqpstate { };
/************************************************************************* CONSTRAINED QUADRATIC PROGRAMMING The subroutine creates QP optimizer. After initial creation, it contains default optimization problem with zero quadratic and linear terms and no constraints. You should set quadratic/linear terms with calls to functions provided by MinQP subpackage. You should also choose appropriate QP solver and set it and its stopping criteria by means of MinQPSetAlgo??????() function. Then, you should start solution process by means of MinQPOptimize() call. Solution itself can be obtained with MinQPResults() function. Following solvers are recommended: * QuickQP for dense problems with box-only constraints (or no constraints at all) * QP-BLEIC for dense/sparse problems with moderate (up to 50) number of general linear constraints * DENSE-AUL-QP for dense problems with any (small or large) number of general linear constraints INPUT PARAMETERS: N - problem size OUTPUT PARAMETERS: State - optimizer with zero quadratic/linear terms and no constraints -- ALGLIB -- Copyright 11.01.2011 by Bochkanov Sergey *************************************************************************/
void alglib::minqpcreate(ae_int_t n, minqpstate& state);

Examples:   [1]  [2]  [3]  [4]  [5]  

/************************************************************************* This function solves quadratic programming problem. Prior to calling this function you should choose solver by means of one of the following functions: * minqpsetalgoquickqp() - for QuickQP solver * minqpsetalgobleic() - for BLEIC-QP solver * minqpsetalgodenseaul() - for Dense-AUL-QP solver These functions also allow you to control stopping criteria of the solver. If you did not set solver, MinQP subpackage will automatically select solver for your problem and will run it with default stopping criteria. However, it is better to explicitly set the solver and its stopping criteria. INPUT PARAMETERS: State - algorithm state You should use MinQPResults() function to access results after calls to this function. -- ALGLIB -- Copyright 11.01.2011 by Bochkanov Sergey. Special thanks to Elvira Illarionova for important suggestions on the linearly constrained QP algorithm. *************************************************************************/
void alglib::minqpoptimize(minqpstate state);

Examples:   [1]  [2]  [3]  [4]  [5]  

/************************************************************************* QP solver results INPUT PARAMETERS: State - algorithm state OUTPUT PARAMETERS: X - array[0..N-1], solution. This array is allocated and initialized only when Rep.TerminationType parameter is positive (success). Rep - optimization report. You should check Rep.TerminationType, which contains completion code, and you may check other fields which contain additional information about algorithm functioning. Failure codes returned by algorithm are: * -5 inappropriate solver was used: * QuickQP solver for problem with general linear constraints * -4 BLEIC-QP/QuickQP solver found unconstrained direction of negative curvature (function is unbounded from below even under constraints), no meaningful minimum can be found. * -3 inconsistent constraints (or maybe feasible point is too hard to find). If you are sure that constraints are feasible, try to restart optimizer with better initial approximation. Completion codes specific for Cholesky algorithm: * 4 successful completion Completion codes specific for BLEIC/QuickQP algorithms: * 1 relative function improvement is no more than EpsF. * 2 scaled step is no more than EpsX. * 4 scaled gradient norm is no more than EpsG. * 5 MaxIts steps were taken -- ALGLIB -- Copyright 11.01.2011 by Bochkanov Sergey *************************************************************************/
void alglib::minqpresults( minqpstate state, real_1d_array& x, minqpreport& rep);

Examples:   [1]  [2]  [3]  [4]  [5]  

/************************************************************************* QP results Buffered implementation of MinQPResults() which uses pre-allocated buffer to store X[]. If buffer size is too small, it resizes buffer. It is intended to be used in the inner cycles of performance critical algorithms where array reallocation penalty is too large to be ignored. -- ALGLIB -- Copyright 11.01.2011 by Bochkanov Sergey *************************************************************************/
void alglib::minqpresultsbuf( minqpstate state, real_1d_array& x, minqpreport& rep);
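
A minimal sketch of the buffer-reuse pattern described above: a one-dimensional QP is re-solved with several illustrative linear terms, and minqpresultsbuf() fills the same output array each time instead of allocating a new one. The problem data and the loop are purely illustrative.

#include "stdafx.h"
#include <stdio.h>
#include "optimization.h"

using namespace alglib;

int main(int argc, char **argv)
{
    // f(x) = 0.5*2*x^2 + b*x = x^2 + b*x, minimum at x = -b/2
    real_2d_array a = "[[2]]";
    real_1d_array x;
    minqpstate state;
    minqpreport rep;
    minqpcreate(1, state);
    minqpsetquadraticterm(state, a);
    minqpsetalgoquickqp(state, 0.0, 0.0, 0.0, 0, true);
    for(int k=1; k<=3; k++)
    {
        real_1d_array b;
        b.setlength(1);
        b[0] = -double(k);                      // minimum of x^2-k*x is at x=k/2
        minqpsetlinearterm(state, b);
        minqpoptimize(state);
        minqpresultsbuf(state, x, rep);         // x is allocated once, then reused
        printf("%s\n", x.tostring(2).c_str());  // expected approximately [k/2]
    }
    return 0;
}
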
/************************************************************************* This function tells solver to use BLEIC-based algorithm and sets stopping criteria for the algorithm. ALGORITHM FEATURES: * supports dense and sparse QP problems * supports box and general linear equality/inequality constraints * can solve all types of problems (convex, semidefinite, nonconvex) as long as they are bounded from below under constraints. Say, it is possible to solve "min{-x^2} subject to -1<=x<=+1". Of course, global minimum is found only for positive definite and semidefinite problems. As for indefinite ones - only local minimum is found. ALGORITHM OUTLINE: * BLEIC-QP solver is just a driver function for MinBLEIC solver; it solves quadratic programming problem as general linearly constrained optimization problem, which is solved by means of BLEIC solver (part of ALGLIB, active set method). ALGORITHM LIMITATIONS: * this algorithm is fast enough for large-scale problems with small amount of general linear constraints (say, up to 50), but it is inefficient for problems with several hundreds of constraints. Iteration cost is roughly quadratic w.r.t. constraint count. Furthermore, it can not efficiently handle sparse constraints (they are converted to dense format prior to solution). Thus, if you have large and/or sparse constraint matrix and convex QP problem, Dense-AUL-QP solver may be a better solution. * unlike QuickQP solver, this algorithm does not perform Newton steps and does not use Level 3 BLAS. Being general-purpose active set method, it can activate constraints only one-by-one. Thus, its performance is lower than that of QuickQP. * its precision is also a bit inferior to that of QuickQP. BLEIC-QP performs only LBFGS steps (no Newton steps), which are good at detecting neighborhood of the solution, but need many iterations to find solution with more than 6 digits of precision. INPUT PARAMETERS: State - structure which stores algorithm state EpsG - >=0 The subroutine finishes its work if the condition |v|<EpsG is satisfied, where: * |.| means Euclidean norm * v - scaled constrained gradient vector, v[i]=g[i]*s[i] * g - gradient * s - scaling coefficients set by MinQPSetScale() EpsF - >=0 The subroutine finishes its work if exploratory steepest descent step on k+1-th iteration satisfies following condition: |F(k+1)-F(k)|<=EpsF*max{|F(k)|,|F(k+1)|,1} EpsX - >=0 The subroutine finishes its work if exploratory steepest descent step on k+1-th iteration satisfies condition |v|<=EpsX, where: * |.| means Euclidean norm * v - scaled step vector, v[i]=dx[i]/s[i] * dx - step vector, dx=X(k+1)-X(k) * s - scaling coefficients set by MinQPSetScale() MaxIts - maximum number of iterations. If MaxIts=0, the number of iterations is unlimited. NOTE: this algorithm uses LBFGS iterations, which are relatively cheap, but improve function value only a bit. So you will need many iterations to converge - from 0.1*N to 10*N, depending on problem's condition number. IT IS VERY IMPORTANT TO CALL MinQPSetScale() WHEN YOU USE THIS ALGORITHM BECAUSE ITS STOPPING CRITERIA ARE SCALE-DEPENDENT! Passing EpsG=0, EpsF=0 and EpsX=0 and MaxIts=0 (simultaneously) will lead to automatic stopping criterion selection (presently it is small step length, but it may change in the future versions of ALGLIB). -- ALGLIB -- Copyright 11.01.2011 by Bochkanov Sergey *************************************************************************/
void alglib::minqpsetalgobleic( minqpstate state, double epsg, double epsf, double epsx, ae_int_t maxits);
/************************************************************************* DEPRECATED DEPRECATED DEPRECATED DEPRECATED DEPRECATED DEPRECATED This function tells solver to use Cholesky-based algorithm. This algorithm was deprecated in ALGLIB 3.9.0 because its performance is inferior to that of BLEIC-QP or QuickQP on high-dimensional problems. Furthermore, it supports only dense convex QP problems. This solver is no longer active by default. We recommend you to switch to AUL-QP, BLEIC-QP or QuickQP solver. DEPRECATED DEPRECATED DEPRECATED DEPRECATED DEPRECATED DEPRECATED -- ALGLIB -- Copyright 11.01.2011 by Bochkanov Sergey *************************************************************************/
void alglib::minqpsetalgocholesky(minqpstate state);
/************************************************************************* This function tells QP solver to use Dense-AUL algorithm and sets stopping criteria for the algorithm. ALGORITHM FEATURES: * supports dense and sparse QP problems; although it uses dense Cholesky to build preconditioner, it still works faster for sparse problems. * supports box and dense/sparse general linear equality/inequality constraints * convergence is theoretically proved for positive-definite (convex) QP problems. Semidefinite and non-convex problems can be solved as long as they are bounded from below under constraints, although without theoretical guarantees. * this solver is better than QP-BLEIC on problems with large number of general linear constraints. ALGORITHM OUTLINE: * this algorithm is an augmented Lagrangian method with dense preconditioner (hence its name). It is similar to barrier/penalty methods, but much more precise and faster. * it performs several outer iterations in order to refine values of the Lagrange multipliers. Single outer iteration is a solution of some unconstrained optimization problem: first it performs dense Cholesky factorization of the Hessian in order to build preconditioner (adaptive regularization is applied to enforce positive definiteness), and then it uses L-BFGS optimizer to solve optimization problem. * typically you need about 5-10 outer iterations to converge to solution ALGORITHM LIMITATIONS: * because dense Cholesky driver is used, this algorithm has O(N^2) memory requirements and O(OuterIterations*N^3) minimum running time. From the practical point of view, it limits its applicability to several thousands of variables. On the other hand, variable count is the most limiting factor, and dependence on constraint count is much lower. Assuming that constraint matrix is sparse, it may handle tens of thousands of general linear constraints. * its precision is lower than that of BLEIC-QP and QuickQP. It is hard to find solution with more than 6 digits of precision. INPUT PARAMETERS: State - structure which stores algorithm state EpsX - >=0, stopping criteria for inner optimizer. Inner iterations are stopped when step length (with variable scaling being applied) is less than EpsX. See minqpsetscale() for more information on variable scaling. Rho - penalty coefficient, Rho>0: * large enough that algorithm converges with desired precision. * not TOO large to prevent ill-conditioning * recommended values are 100, 1000 or 10000 ItsCnt - number of outer iterations: * recommended values: 10-15 (although in most cases it converges within 5 iterations, you may need a few more to be sure). * ItsCnt=0 means that small number of outer iterations is automatically chosen (10 iterations in current version). * ItsCnt=1 means that AUL algorithm performs just as usual barrier method. * ItsCnt>1 means that AUL algorithm performs specified number of outer iterations IT IS VERY IMPORTANT TO CALL minqpsetscale() WHEN YOU USE THIS ALGORITHM BECAUSE ITS CONVERGENCE PROPERTIES AND STOPPING CRITERIA ARE SCALE-DEPENDENT! NOTE: Passing EpsX=0 will lead to automatic step length selection (specific step length chosen may change in the future versions of ALGLIB, so it is better to specify step length explicitly). -- ALGLIB -- Copyright 20.08.2016 by Bochkanov Sergey *************************************************************************/
void alglib::minqpsetalgodenseaul( minqpstate state, double epsx, double rho, ae_int_t itscnt);
/************************************************************************* This function tells solver to use QuickQP algorithm: special extra-fast algorithm for problems with box-only constraints. It may solve non-convex problems as long as they are bounded from below under constraints. ALGORITHM FEATURES: * many times (from 5x to 50x!) faster than BLEIC-based QP solver; utilizes accelerated methods for activation of constraints. * supports dense and sparse QP problems * supports ONLY box constraints; general linear constraints are NOT supported by this solver * can solve all types of problems (convex, semidefinite, nonconvex) as long as they are bounded from below under constraints. Say, it is possible to solve "min{-x^2} subject to -1<=x<=+1". In convex/semidefinite case global minimum is returned, in nonconvex case - algorithm returns one of the local minimums. ALGORITHM OUTLINE: * algorithm performs two kinds of iterations: constrained CG iterations and constrained Newton iterations * initially it performs small number of constrained CG iterations, which can efficiently activate/deactivate multiple constraints * after CG phase algorithm tries to calculate Cholesky decomposition and to perform several constrained Newton steps. If Cholesky decomposition failed (matrix is indefinite even under constraints), we perform more CG iterations until we converge to such set of constraints that system matrix becomes positive definite. Constrained Newton steps greatly increase convergence speed and precision. * algorithm interleaves CG and Newton iterations which allows it to handle indefinite matrices (CG phase) and quickly converge after final set of constraints is found (Newton phase). Combination of CG and Newton phases is called "outer iteration". * it is possible to turn off Newton phase (beneficial for semidefinite problems - Cholesky decomposition will fail too often) ALGORITHM LIMITATIONS: * algorithm does not support general linear constraints; only box ones are supported * Cholesky decomposition for sparse problems is performed with Skyline Cholesky solver, which is intended for low-profile matrices. No profile- reducing reordering of variables is performed in this version of ALGLIB. * problems with near-zero negative eigenvalues (or exactly zero ones) may experience about 2-3x performance penalty. The reason is that Cholesky decomposition can not be performed until we identify directions of zero and negative curvature and activate corresponding boundary constraints - but we need a lot of trial and error because these directions are hard to notice in the matrix spectrum. In this case you may turn off Newton phase of algorithm. Large negative eigenvalues are not an issue, so highly non-convex problems can be solved very efficiently.
INPUT PARAMETERS: State - structure which stores algorithm state EpsG - >=0 The subroutine finishes its work if the condition |v|<EpsG is satisfied, where: * |.| means Euclidean norm * v - scaled constrained gradient vector, v[i]=g[i]*s[i] * g - gradient * s - scaling coefficients set by MinQPSetScale() EpsF - >=0 The subroutine finishes its work if exploratory steepest descent step on k+1-th iteration satisfies following condition: |F(k+1)-F(k)|<=EpsF*max{|F(k)|,|F(k+1)|,1} EpsX - >=0 The subroutine finishes its work if exploratory steepest descent step on k+1-th iteration satisfies condition |v|<=EpsX, where: * |.| means Euclidean norm * v - scaled step vector, v[i]=dx[i]/s[i] * dx - step vector, dx=X(k+1)-X(k) * s - scaling coefficients set by MinQPSetScale() MaxOuterIts-maximum number of OUTER iterations. One outer iteration includes some amount of CG iterations (from 5 to ~N) and one or several (usually small amount) Newton steps. Thus, one outer iteration has high cost, but can greatly reduce function value. Use 0 if you do not want to limit number of outer iterations. UseNewton- use Newton phase or not: * Newton phase improves performance of positive definite dense problems (about 2 times improvement can be observed) * can result in some performance penalty on semidefinite or slightly negative definite problems - each Newton phase will bring no improvement (Cholesky failure), but still will require computational time. * if in doubt, you can turn off this phase - optimizer will retain most of its high speed. IT IS VERY IMPORTANT TO CALL MinQPSetScale() WHEN YOU USE THIS ALGORITHM BECAUSE ITS STOPPING CRITERIA ARE SCALE-DEPENDENT! Passing EpsG=0, EpsF=0, EpsX=0 and MaxOuterIts=0 (simultaneously) will lead to automatic stopping criterion selection (presently it is small step length, but it may change in the future versions of ALGLIB). -- ALGLIB -- Copyright 22.05.2014 by Bochkanov Sergey *************************************************************************/
void alglib::minqpsetalgoquickqp( minqpstate state, double epsg, double epsf, double epsx, ae_int_t maxouterits, bool usenewton);
/************************************************************************* This function sets box constraints for QP solver Box constraints are inactive by default (after initial creation). After being set, they are preserved until explicitly turned off with another SetBC() call. All QP solvers may handle box constraints. INPUT PARAMETERS: State - structure stores algorithm state BndL - lower bounds, array[N]. If some (all) variables are unbounded, you may specify very small number or -INF (latter is recommended because it will allow solver to use better algorithm). BndU - upper bounds, array[N]. If some (all) variables are unbounded, you may specify very large number or +INF (latter is recommended because it will allow solver to use better algorithm). NOTE: it is possible to specify BndL[i]=BndU[i]. In this case I-th variable will be "frozen" at X[i]=BndL[i]=BndU[i]. -- ALGLIB -- Copyright 11.01.2011 by Bochkanov Sergey *************************************************************************/
void alglib::minqpsetbc( minqpstate state, real_1d_array bndl, real_1d_array bndu);

Examples:   [1]  

/************************************************************************* This function sets dense linear constraints for QP optimizer. This function overrides results of previous calls to minqpsetlc(), minqpsetlcsparse() and minqpsetlcmixed(). After call to this function sparse constraints are dropped, and you have only those constraints which were specified in the present call. If you want to specify mixed (with dense and sparse terms) linear constraints, you should call minqpsetlcmixed(). SUPPORT BY QP SOLVERS: Following QP solvers can handle dense linear constraints: * BLEIC-QP - handles them with high precision, but may be inefficient for problems with hundreds of constraints * Dense-AUL-QP - handles them with moderate precision (approx. 10^-6), may efficiently handle thousands of constraints. Following QP solvers can NOT handle dense linear constraints: * QuickQP - can not handle general linear constraints INPUT PARAMETERS: State - structure previously allocated with MinQPCreate call. C - linear constraints, array[K,N+1]. Each row of C represents one constraint, either equality or inequality (see below): * first N elements correspond to coefficients, * last element corresponds to the right part. All elements of C (including right part) must be finite. CT - type of constraints, array[K]: * if CT[i]>0, then I-th constraint is C[i,*]*x >= C[i,n+1] * if CT[i]=0, then I-th constraint is C[i,*]*x = C[i,n+1] * if CT[i]<0, then I-th constraint is C[i,*]*x <= C[i,n+1] K - number of equality/inequality constraints, K>=0: * if given, only leading K elements of C/CT are used * if not given, automatically determined from sizes of C/CT NOTE 1: linear (non-bound) constraints are satisfied only approximately - there always exists some violation due to numerical errors and algorithmic limitations (BLEIC-QP solver is most precise, AUL-QP solver is less precise). -- ALGLIB -- Copyright 19.06.2012 by Bochkanov Sergey *************************************************************************/
void alglib::minqpsetlc( minqpstate state, real_2d_array c, integer_1d_array ct); void alglib::minqpsetlc( minqpstate state, real_2d_array c, integer_1d_array ct, ae_int_t k);

Examples:   [1]  
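
Below is a short sketch of how the C/CT arrays described above might be filled for one equality and one inequality constraint. The constraints (x0+x1=2 and x1>=1) and the quadratic problem are chosen purely for illustration; each row of C stores the N coefficients followed by the right-hand side, and CT selects the constraint sign.

#include "stdafx.h"
#include <stdio.h>
#include "optimization.h"

using namespace alglib;

int main(int argc, char **argv)
{
    //
    // minimize f(x) = 0.5*(2*x0^2+2*x1^2) - 6*x0 - 4*x1
    // subject to
    //     x0 + x1  = 2     (row [1,1,2],  ct=0)
    //          x1 >= 1     (row [0,1,1],  ct=+1)
    //
    real_2d_array a = "[[2,0],[0,2]]";
    real_1d_array b = "[-6,-4]";
    real_1d_array s = "[1,1]";
    real_2d_array c = "[[1.0,1.0,2.0],[0.0,1.0,1.0]]";
    integer_1d_array ct = "[0,1]";
    real_1d_array x;
    minqpstate state;
    minqpreport rep;
    minqpcreate(2, state);
    minqpsetquadraticterm(state, a);
    minqpsetlinearterm(state, b);
    minqpsetlc(state, c, ct);
    minqpsetscale(state, s);
    minqpsetalgobleic(state, 0.0, 0.0, 0.0, 0);
    minqpoptimize(state);
    minqpresults(state, x, rep);
    printf("%s\n", x.tostring(2).c_str()); // expected approximately [1,1]
    return 0;
}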

/************************************************************************* This function sets mixed linear constraints, which include a set of dense rows, and a set of sparse rows. This function overrides results of previous calls to minqpsetlc(), minqpsetlcsparse() and minqpsetlcmixed(). This function may be useful if constraint matrix includes large number of both types of rows - dense and sparse. If you have just a few sparse rows, you may represent them in dense format without losing performance. Similarly, if you have just a few dense rows, you may store them in sparse format with almost same performance. SUPPORT BY QP SOLVERS: Following QP solvers can handle mixed dense/sparse linear constraints: * BLEIC-QP - handles them with high precision, but can not utilize their sparsity - sparse constraint matrix is silently converted to dense format. Thus, it may be inefficient for problems with hundreds of constraints. * Dense-AUL-QP - although this solver uses dense linear algebra to calculate Cholesky preconditioner, it may efficiently handle sparse constraints. It may solve problems with hundreds and thousands of constraints. The only drawback is that precision of constraint handling is typically within 1E-4... ..1E-6 range. Following QP solvers can NOT handle mixed linear constraints: * QuickQP - can not handle general linear constraints at all INPUT PARAMETERS: State - structure previously allocated with MinQPCreate call. DenseC - dense linear constraints, array[K,N+1]. Each row of DenseC represents one constraint, either equality or inequality (see below): * first N elements correspond to coefficients, * last element corresponds to the right part. All elements of DenseC (including right part) must be finite. DenseCT - type of constraints, array[K]: * if DenseCT[i]>0, then I-th constraint is DenseC[i,*]*x >= DenseC[i,n+1] * if DenseCT[i]=0, then I-th constraint is DenseC[i,*]*x = DenseC[i,n+1] * if DenseCT[i]<0, then I-th constraint is DenseC[i,*]*x <= DenseC[i,n+1] DenseK - number of equality/inequality constraints, DenseK>=0 SparseC - linear constraints, sparse matrix with dimensions at least [SparseK,N+1]. If matrix has larger size, only leading SPARSEKx(N+1) rectangle is used. Each row of C represents one constraint, either equality or inequality (see below): * first N elements correspond to coefficients, * last element corresponds to the right part. All elements of C (including right part) must be finite. SparseCT- type of sparse constraints, array[SparseK]: * if SparseCT[i]>0, then I-th constraint is SparseC[i,*]*x >= SparseC[i,n+1] * if SparseCT[i]=0, then I-th constraint is SparseC[i,*]*x = SparseC[i,n+1] * if SparseCT[i]<0, then I-th constraint is SparseC[i,*]*x <= SparseC[i,n+1] SparseK - number of sparse equality/inequality constraints, SparseK>=0 NOTE 1: linear (non-bound) constraints are satisfied only approximately - there always exists some violation due to numerical errors and algorithmic limitations (BLEIC-QP solver is most precise, AUL-QP solver is less precise). -- ALGLIB -- Copyright 22.08.2016 by Bochkanov Sergey *************************************************************************/
void alglib::minqpsetlcmixed( minqpstate state, real_2d_array densec, integer_1d_array densect, ae_int_t densek, sparsematrix sparsec, integer_1d_array sparsect, ae_int_t sparsek);
/************************************************************************* This function sets sparse linear constraints for QP optimizer. This function overrides results of previous calls to minqpsetlc(), minqpsetlcsparse() and minqpsetlcmixed(). After call to this function dense constraints are dropped, and you have only those constraints which were specified in the present call. If you want to specify mixed (with dense and sparse terms) linear constraints, you should call minqpsetlcmixed(). SUPPORT BY QP SOLVERS: Following QP solvers can handle sparse linear constraints: * BLEIC-QP - handles them with high precision, but can not utilize their sparsity - sparse constraint matrix is silently converted to dense format. Thus, it may be inefficient for problems with hundreds of constraints. * Dense-AUL-QP - although this solver uses dense linear algebra to calculate Cholesky preconditioner, it may efficiently handle sparse constraints. It may solve problems with hundreds and thousands of constraints. The only drawback is that precision of constraint handling is typically within 1E-4... ..1E-6 range. Following QP solvers can NOT handle sparse linear constraints: * QuickQP - can not handle general linear constraints INPUT PARAMETERS: State - structure previously allocated with MinQPCreate call. C - linear constraints, sparse matrix with dimensions at least [K,N+1]. If matrix has larger size, only leading Kx(N+1) rectangle is used. Each row of C represents one constraint, either equality or inequality (see below): * first N elements correspond to coefficients, * last element corresponds to the right part. All elements of C (including right part) must be finite. CT - type of constraints, array[K]: * if CT[i]>0, then I-th constraint is C[i,*]*x >= C[i,n+1] * if CT[i]=0, then I-th constraint is C[i,*]*x = C[i,n+1] * if CT[i]<0, then I-th constraint is C[i,*]*x <= C[i,n+1] K - number of equality/inequality constraints, K>=0 NOTE 1: linear (non-bound) constraints are satisfied only approximately - there always exists some violation due to numerical errors and algorithmic limitations (BLEIC-QP solver is most precise, AUL-QP solver is less precise). -- ALGLIB -- Copyright 22.08.2016 by Bochkanov Sergey *************************************************************************/
void alglib::minqpsetlcsparse( minqpstate state, sparsematrix c, integer_1d_array ct, ae_int_t k);
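
The sketch below shows one way the sparse constraint matrix might be assembled before calling minqpsetlcsparse(). It builds the matrix with the generic sparsecreate()/sparseset() routines from the LinAlg package (documented elsewhere in this manual); the problem, the constraints and the call to sparseconverttocrs() are illustrative assumptions - check the sparse matrix documentation for the storage format expected by this function.

#include "stdafx.h"
#include <stdio.h>
#include "optimization.h"

using namespace alglib;

int main(int argc, char **argv)
{
    //
    // minimize f(x) = 0.5*||x||^2 over 4 variables subject to two sparse rows:
    //     x0 + x3  = 1     (ct=0)
    //          x1 <= 2     (ct=-1)
    // constraint rows have N+1 columns, the last one being the right-hand side
    //
    real_2d_array a = "[[1,0,0,0],[0,1,0,0],[0,0,1,0],[0,0,0,1]]";
    real_1d_array s = "[1,1,1,1]";
    integer_1d_array ct = "[0,-1]";
    real_1d_array x;
    sparsematrix c;
    minqpstate state;
    minqpreport rep;
    sparsecreate(2, 5, c);              // 2 constraint rows, N+1=5 columns
    sparseset(c, 0, 0, 1.0);            // x0 coefficient of the equality row
    sparseset(c, 0, 3, 1.0);            // x3 coefficient of the equality row
    sparseset(c, 0, 4, 1.0);            // right part of the equality row
    sparseset(c, 1, 1, 1.0);            // x1 coefficient of the inequality row
    sparseset(c, 1, 4, 2.0);            // right part of the inequality row
    sparseconverttocrs(c);              // assumed: CRS storage is acceptable (and possibly required)
    minqpcreate(4, state);
    minqpsetquadraticterm(state, a);
    minqpsetscale(state, s);
    minqpsetlcsparse(state, c, ct, 2);
    minqpsetalgodenseaul(state, 1.0e-9, 1.0e+3, 5);
    minqpoptimize(state);
    minqpresults(state, x, rep);
    printf("%s\n", x.tostring(2).c_str()); // expected approximately [0.5,0,0,0.5]
    return 0;
}
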
/************************************************************************* This function sets linear term for QP solver. By default, linear term is zero. INPUT PARAMETERS: State - structure which stores algorithm state B - linear term, array[N]. -- ALGLIB -- Copyright 11.01.2011 by Bochkanov Sergey *************************************************************************/
void alglib::minqpsetlinearterm(minqpstate state, real_1d_array b);

Examples:   [1]  [2]  [3]  [4]  [5]  

/************************************************************************* This function sets origin for QP solver. By default, following QP program is solved: min(0.5*x'*A*x+b'*x) This function allows to solve different problem: min(0.5*(x-x_origin)'*A*(x-x_origin)+b'*(x-x_origin)) Specification of non-zero origin affects function being minimized, but not constraints. Box and linear constraints are still calculated without origin. INPUT PARAMETERS: State - structure which stores algorithm state XOrigin - origin, array[N]. -- ALGLIB -- Copyright 11.01.2011 by Bochkanov Sergey *************************************************************************/
void alglib::minqpsetorigin(minqpstate state, real_1d_array xorigin);
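
A tiny sketch of the origin shift described above, with illustrative values: with A=[[2]], zero linear term and XOrigin=[5], the minimized function is (x-5)^2, so the unconstrained solution moves from 0 to 5.

#include "stdafx.h"
#include <stdio.h>
#include "optimization.h"

using namespace alglib;

int main(int argc, char **argv)
{
    // f(x) = 0.5*2*(x-5)^2 = (x-5)^2, minimum at x=5
    real_2d_array a = "[[2]]";
    real_1d_array xorigin = "[5]";
    real_1d_array x;
    minqpstate state;
    minqpreport rep;
    minqpcreate(1, state);
    minqpsetquadraticterm(state, a);
    minqpsetorigin(state, xorigin);
    minqpsetalgoquickqp(state, 0.0, 0.0, 0.0, 0, true);
    minqpoptimize(state);
    minqpresults(state, x, rep);
    printf("%s\n", x.tostring(2).c_str()); // expected approximately [5]
    return 0;
}
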
/************************************************************************* This function sets dense quadratic term for QP solver. By default, quadratic term is zero. SUPPORT BY QP SOLVERS: Dense quadratic term can be handled by following QP solvers: * QuickQP * BLEIC-QP * Dense-AUL-QP IMPORTANT: This solver minimizes following function: f(x) = 0.5*x'*A*x + b'*x. Note that quadratic term has 0.5 before it. So if you want to minimize f(x) = x^2 + x you should rewrite your problem as follows: f(x) = 0.5*(2*x^2) + x and your matrix A will be equal to [[2.0]], not to [[1.0]] INPUT PARAMETERS: State - structure which stores algorithm state A - matrix, array[N,N] IsUpper - (optional) storage type: * if True, symmetric matrix A is given by its upper triangle, and the lower triangle isn't used * if False, symmetric matrix A is given by its lower triangle, and the upper triangle isn't used * if not given, both lower and upper triangles must be filled. -- ALGLIB -- Copyright 11.01.2011 by Bochkanov Sergey *************************************************************************/
void alglib::minqpsetquadraticterm(minqpstate state, real_2d_array a); void alglib::minqpsetquadraticterm( minqpstate state, real_2d_array a, bool isupper);

Examples:   [1]  [2]  [3]  [4]  
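
Since the 0.5 factor above is a common source of confusion, here is a minimal one-dimensional sketch with illustrative values: to minimize x^2 + x the problem is rewritten as 0.5*(2*x^2) + 1*x, so A=[[2]] and b=[1], and the solver should return approximately x = -0.5.

#include "stdafx.h"
#include <stdio.h>
#include "optimization.h"

using namespace alglib;

int main(int argc, char **argv)
{
    // minimize x^2 + x = 0.5*(2*x^2) + 1*x, hence A=[[2]] (NOT [[1]]) and b=[1]
    real_2d_array a = "[[2]]";
    real_1d_array b = "[1]";
    real_1d_array x;
    minqpstate state;
    minqpreport rep;
    minqpcreate(1, state);
    minqpsetquadraticterm(state, a);
    minqpsetlinearterm(state, b);
    minqpsetalgoquickqp(state, 0.0, 0.0, 0.0, 0, true);
    minqpoptimize(state);
    minqpresults(state, x, rep);
    printf("%s\n", x.tostring(2).c_str()); // expected approximately [-0.5]
    return 0;
}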

/************************************************************************* This function sets sparse quadratic term for QP solver. By default, quadratic term is zero. This function overrides previous calls to minqpsetquadraticterm() or minqpsetquadratictermsparse(). SUPPORT BY QP SOLVERS: Sparse quadratic term can be handled by following QP solvers: * QuickQP * BLEIC-QP * Dense-AUL-QP (internally converts sparse matrix to dense format) IMPORTANT: This solver minimizes following function: f(x) = 0.5*x'*A*x + b'*x. Note that quadratic term has 0.5 before it. So if you want to minimize f(x) = x^2 + x you should rewrite your problem as follows: f(x) = 0.5*(2*x^2) + x and your matrix A will be equal to [[2.0]], not to [[1.0]] INPUT PARAMETERS: State - structure which stores algorithm state A - matrix, array[N,N] IsUpper - (optional) storage type: * if True, symmetric matrix A is given by its upper triangle, and the lower triangle isn't used * if False, symmetric matrix A is given by its lower triangle, and the upper triangle isn't used * if not given, both lower and upper triangles must be filled. -- ALGLIB -- Copyright 11.01.2011 by Bochkanov Sergey *************************************************************************/
void alglib::minqpsetquadratictermsparse( minqpstate state, sparsematrix a, bool isupper);

Examples:   [1]  

/************************************************************************* This function sets scaling coefficients. ALGLIB optimizers use scaling matrices to test stopping conditions (step size and gradient are scaled before comparison with tolerances) and as preconditioner. Scale of the I-th variable is a translation invariant measure of: a) "how large" the variable is b) how large the step should be to make significant changes in the function INPUT PARAMETERS: State - structure stores algorithm state S - array[N], non-zero scaling coefficients S[i] may be negative, sign doesn't matter. -- ALGLIB -- Copyright 14.01.2011 by Bochkanov Sergey *************************************************************************/
void alglib::minqpsetscale(minqpstate state, real_1d_array s);
/************************************************************************* This function sets starting point for QP solver. It is useful to have good initial approximation to the solution, because it will increase speed of convergence and identification of active constraints. INPUT PARAMETERS: State - structure which stores algorithm state X - starting point, array[N]. -- ALGLIB -- Copyright 11.01.2011 by Bochkanov Sergey *************************************************************************/
void alglib::minqpsetstartingpoint(minqpstate state, real_1d_array x);

Examples:   [1]  [2]  [3]  [4]  [5]  

#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "optimization.h"

using namespace alglib;


int main(int argc, char **argv)
{
    //
    // This example demonstrates minimization of F(x0,x1) = x0^2 + x1^2 -6*x0 - 4*x1
    // subject to bound constraints 0<=x0<=2.5, 0<=x1<=2.5
    //
    // Exact solution is [x0,x1] = [2.5,2]
    //
    // We provide algorithm with starting point. With such small problem good starting
    // point is not really necessary, but with high-dimensional problem it can save us
    // a lot of time.
    //
    // Several QP solvers are tried: QuickQP, BLEIC, DENSE-AUL.
    //
    // IMPORTANT: this solver minimizes the following function:
    //     f(x) = 0.5*x'*A*x + b'*x.
    // Note that quadratic term has 0.5 before it. So if you want to minimize
    // quadratic function, you should rewrite it in such way that quadratic term
    // is multiplied by 0.5 too.
    // For example, our function is f(x)=x0^2+x1^2+..., but we rewrite it as 
    //     f(x) = 0.5*(2*x0^2+2*x1^2) + ....
    // and pass diag(2,2) as quadratic term - NOT diag(1,1)!
    //
    real_2d_array a = "[[2,0],[0,2]]";
    real_1d_array b = "[-6,-4]";
    real_1d_array x0 = "[0,1]";
    real_1d_array s = "[1,1]";
    real_1d_array bndl = "[0.0,0.0]";
    real_1d_array bndu = "[2.5,2.5]";
    real_1d_array x;
    minqpstate state;
    minqpreport rep;

    // create solver, set quadratic/linear terms
    minqpcreate(2, state);
    minqpsetquadraticterm(state, a);
    minqpsetlinearterm(state, b);
    minqpsetstartingpoint(state, x0);
    minqpsetbc(state, bndl, bndu);

    // Set scale of the parameters.
    // It is strongly recommended that you set scale of your variables.
    // Knowing their scales is essential for evaluation of stopping criteria
    // and for preconditioning of the algorithm steps.
    // You can find more information on scaling at http://www.alglib.net/optimization/scaling.php
    minqpsetscale(state, s);

    //
    // Solve problem with QuickQP solver.
    //
    // This solver is intended for medium and large-scale problems with box
    // constraints (general linear constraints are not supported).
    //
    // Default stopping criteria are used, Newton phase is active.
    //
    minqpsetalgoquickqp(state, 0.0, 0.0, 0.0, 0, true);
    minqpoptimize(state);
    minqpresults(state, x, rep);
    printf("%d\n", int(rep.terminationtype)); // EXPECTED: 4
    printf("%s\n", x.tostring(2).c_str()); // EXPECTED: [2.5,2]

    //
    // Solve problem with BLEIC-based QP solver.
    //
    // This solver is intended for problems with moderate (up to 50) number
    // of general linear constraints and unlimited number of box constraints.
    //
    // Default stopping criteria are used.
    //
    minqpsetalgobleic(state, 0.0, 0.0, 0.0, 0);
    minqpoptimize(state);
    minqpresults(state, x, rep);
    printf("%s\n", x.tostring(2).c_str()); // EXPECTED: [2.5,2]

    //
    // Solve problem with DENSE-AUL solver.
    //
    // This solver is optimized for problems with up to several thousands of
    // variables and large amount of general linear constraints. Problems with
    // less than 50 general linear constraints can be efficiently solved with
    // BLEIC, problems with box-only constraints can be solved with QuickQP.
    // However, DENSE-AUL will work in any (including unconstrained) case.
    //
    // Default stopping criteria are used.
    //
    minqpsetalgodenseaul(state, 1.0e-9, 1.0e+4, 5);
    minqpoptimize(state);
    minqpresults(state, x, rep);
    printf("%s\n", x.tostring(2).c_str()); // EXPECTED: [2.5,2]
    return 0;
}


#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "optimization.h"

using namespace alglib;


int main(int argc, char **argv)
{
    //
    // This example demonstrates minimization of F(x0,x1) = x0^2 + x1^2 -6*x0 - 4*x1
    // subject to linear constraint x0+x1<=2
    //
    // Exact solution is [x0,x1] = [1.5,0.5]
    //
    // IMPORTANT: this solver minimizes the following function:
    //     f(x) = 0.5*x'*A*x + b'*x.
    // Note that quadratic term has 0.5 before it. So if you want to minimize
    // quadratic function, you should rewrite it in such way that quadratic term
    // is multiplied by 0.5 too.
    // For example, our function is f(x)=x0^2+x1^2+..., but we rewrite it as 
    //     f(x) = 0.5*(2*x0^2+2*x1^2) + ....
    // and pass diag(2,2) as quadratic term - NOT diag(1,1)!
    //
    real_2d_array a = "[[2,0],[0,2]]";
    real_1d_array b = "[-6,-4]";
    real_1d_array s = "[1,1]";
    real_2d_array c = "[[1.0,1.0,2.0]]";
    integer_1d_array ct = "[-1]";
    real_1d_array x;
    minqpstate state;
    minqpreport rep;

    // create solver, set quadratic/linear terms
    minqpcreate(2, state);
    minqpsetquadraticterm(state, a);
    minqpsetlinearterm(state, b);
    minqpsetlc(state, c, ct);

    // Set scale of the parameters.
    // It is strongly recommended that you set scale of your variables.
    // Knowing their scales is essential for evaluation of stopping criteria
    // and for preconditioning of the algorithm steps.
    // You can find more information on scaling at http://www.alglib.net/optimization/scaling.php
    minqpsetscale(state, s);

    //
    // Solve problem with BLEIC-based QP solver.
    //
    // This solver is intended for problems with moderate (up to 50) number
    // of general linear constraints and unlimited number of box constraints.
    //
    // Default stopping criteria are used.
    //
    minqpsetalgobleic(state, 0.0, 0.0, 0.0, 0);
    minqpoptimize(state);
    minqpresults(state, x, rep);
    printf("%s\n", x.tostring(1).c_str()); // EXPECTED: [1.500,0.500]

    //
    // Solve problem with DENSE-AUL solver.
    //
    // This solver is optimized for problems with up to several thousands of
    // variables and large amount of general linear constraints. Problems with
    // less than 50 general linear constraints can be efficiently solved with
    // BLEIC, problems with box-only constraints can be solved with QuickQP.
    // However, DENSE-AUL will work in any (including unconstrained) case.
    //
    // Default stopping criteria are used.
    //
    minqpsetalgodenseaul(state, 1.0e-9, 1.0e+4, 5);
    minqpoptimize(state);
    minqpresults(state, x, rep);
    printf("%s\n", x.tostring(1).c_str()); // EXPECTED: [1.500,0.500]

    //
    // Solve problem with QuickQP solver.
    //
    // This solver is intended for medium and large-scale problems with box
    // constraints, and...
    //
    // ...Oops! It does not support general linear constraints, -5 returned as completion code!
    //
    minqpsetalgoquickqp(state, 0.0, 0.0, 0.0, 0, true);
    minqpoptimize(state);
    minqpresults(state, x, rep);
    printf("%d\n", int(rep.terminationtype)); // EXPECTED: -5
    return 0;
}


#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "optimization.h"

using namespace alglib;


int main(int argc, char **argv)
{
    //
    // This example demonstrates minimization of nonconvex function
    //     F(x0,x1) = -(x0^2+x1^2)
    // subject to constraints x0,x1 in [1.0,2.0]
    // Exact solution is [x0,x1] = [2,2].
    //
    // Non-convex problems are harder to solve than convex ones, and they
    // may have more than one local minimum. However, ALGLIB solvers may deal
    // with such problems (although they do not guarantee convergence to
    // the global minimum).
    //
    // IMPORTANT: this solver minimizes the following function:
    //     f(x) = 0.5*x'*A*x + b'*x.
    // Note that quadratic term has 0.5 before it. So if you want to minimize
    // quadratic function, you should rewrite it in such way that quadratic term
    // is multiplied by 0.5 too.
    //
    // For example, our function is f(x)=-(x0^2+x1^2), but we rewrite it as 
    //     f(x) = 0.5*(-2*x0^2-2*x1^2)
    // and pass diag(-2,-2) as quadratic term - NOT diag(-1,-1)!
    //
    real_2d_array a = "[[-2,0],[0,-2]]";
    real_1d_array x0 = "[1,1]";
    real_1d_array s = "[1,1]";
    real_1d_array bndl = "[1.0,1.0]";
    real_1d_array bndu = "[2.0,2.0]";
    real_1d_array x;
    minqpstate state;
    minqpreport rep;

    // create solver, set quadratic/linear terms, constraints
    minqpcreate(2, state);
    minqpsetquadraticterm(state, a);
    minqpsetstartingpoint(state, x0);
    minqpsetbc(state, bndl, bndu);

    // Set scale of the parameters.
    // It is strongly recommended that you set scale of your variables.
    // Knowing their scales is essential for evaluation of stopping criteria
    // and for preconditioning of the algorithm steps.
    // You can find more information on scaling at http://www.alglib.net/optimization/scaling.php
    minqpsetscale(state, s);

    //
    // Solve problem with BLEIC-based QP solver.
    //
    // This solver is intended for problems with moderate (up to 50) number
    // of general linear constraints and unlimited number of box constraints.
    //
    // It may solve non-convex problems as long as they are bounded from
    // below under constraints.
    //
    // Default stopping criteria are used.
    //
    minqpsetalgobleic(state, 0.0, 0.0, 0.0, 0);
    minqpoptimize(state);
    minqpresults(state, x, rep);
    printf("%s\n", x.tostring(2).c_str()); // EXPECTED: [2,2]

    //
    // Solve problem with DENSE-AUL solver.
    //
    // This solver is optimized for problems with up to several thousands of
    // variables and large amount of general linear constraints. Problems with
    // less than 50 general linear constraints can be efficiently solved with
    // BLEIC, problems with box-only constraints can be solved with QuickQP.
    // However, DENSE-AUL will work in any (including unconstrained) case.
    //
    // Algorithm convergence is guaranteed only for convex case, but you may
    // expect that it will work for non-convex problems too (because near the
    // solution they are locally convex).
    //
    // Default stopping criteria are used.
    //
    minqpsetalgodenseaul(state, 1.0e-9, 1.0e+4, 5);
    minqpoptimize(state);
    minqpresults(state, x, rep);
    printf("%s\n", x.tostring(2).c_str()); // EXPECTED: [2,2]

    // Hmm... this problem is bounded from below (has a solution) only under constraints.
    // What if we remove them?
    //
    // You may see that the BLEIC algorithm detects unboundedness of the problem:
    // -4 is returned as the completion code. However, DENSE-AUL is unable to detect
    // such a situation and will cycle forever (we do not test it here).
    real_1d_array nobndl = "[-inf,-inf]";
    real_1d_array nobndu = "[+inf,+inf]";
    minqpsetbc(state, nobndl, nobndu);
    minqpsetalgobleic(state, 0.0, 0.0, 0.0, 0);
    minqpoptimize(state);
    minqpresults(state, x, rep);
    printf("%d\n", int(rep.terminationtype)); // EXPECTED: -4
    return 0;
}


#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "optimization.h"

using namespace alglib;


int main(int argc, char **argv)
{
    //
    // This example demonstrates minimization of F(x0,x1) = x0^2 + x1^2 -6*x0 - 4*x1
    //
    // Exact solution is [x0,x1] = [3,2]
    //
    // We provide the algorithm with a starting point, although in this case
    // (dense matrix, no constraints) it can work without such information.
    //
    // Several QP solvers are tried: QuickQP, BLEIC, DENSE-AUL.
    //
    // IMPORTANT: this solver minimizes the following function:
    //     f(x) = 0.5*x'*A*x + b'*x.
    // Note that the quadratic term has 0.5 before it. So if you want to minimize
    // a quadratic function, you should rewrite it in such a way that the
    // quadratic term is multiplied by 0.5 too.
    //
    // For example, our function is f(x)=x0^2+x1^2+..., but we rewrite it as 
    //     f(x) = 0.5*(2*x0^2+2*x1^2) + .... 
    // and pass diag(2,2) as quadratic term - NOT diag(1,1)!
    //
    real_2d_array a = "[[2,0],[0,2]]";
    real_1d_array b = "[-6,-4]";
    real_1d_array x0 = "[0,1]";
    real_1d_array s = "[1,1]";
    real_1d_array x;
    minqpstate state;
    minqpreport rep;

    // create solver, set quadratic/linear terms
    minqpcreate(2, state);
    minqpsetquadraticterm(state, a);
    minqpsetlinearterm(state, b);
    minqpsetstartingpoint(state, x0);

    // Set scale of the parameters.
    // It is strongly recommended that you set scale of your variables.
    // Knowing their scales is essential for evaluation of stopping criteria
    // and for preconditioning of the algorithm steps.
    // You can find more information on scaling at http://www.alglib.net/optimization/scaling.php
    minqpsetscale(state, s);

    //
    // Solve problem with QuickQP solver.
    //
    // This solver is intended for medium and large-scale problems with box
    // constraints (general linear constraints are not supported), but it can
    // also be efficiently used on unconstrained problems.
    //
    // Default stopping criteria are used, Newton phase is active.
    //
    minqpsetalgoquickqp(state, 0.0, 0.0, 0.0, 0, true);
    minqpoptimize(state);
    minqpresults(state, x, rep);
    printf("%s\n", x.tostring(2).c_str()); // EXPECTED: [3,2]

    //
    // Solve problem with BLEIC-based QP solver.
    //
    // This solver is intended for problems with moderate (up to 50) number
    // of general linear constraints and unlimited number of box constraints.
    // Of course, unconstrained problems can be solved too.
    //
    // Default stopping criteria are used.
    //
    minqpsetalgobleic(state, 0.0, 0.0, 0.0, 0);
    minqpoptimize(state);
    minqpresults(state, x, rep);
    printf("%s\n", x.tostring(2).c_str()); // EXPECTED: [3,2]

    //
    // Solve problem with DENSE-AUL solver.
    //
    // This solver is optimized for problems with up to several thousands of
    // variables and large amount of general linear constraints. Problems with
    // less than 50 general linear constraints can be efficiently solved with
    // BLEIC, problems with box-only constraints can be solved with QuickQP.
    // However, DENSE-AUL will work in any (including unconstrained) case.
    //
    // Default stopping criteria are used.
    //
    minqpsetalgodenseaul(state, 1.0e-9, 1.0e+4, 5);
    minqpoptimize(state);
    minqpresults(state, x, rep);
    printf("%s\n", x.tostring(2).c_str()); // EXPECTED: [3,2]
    return 0;
}


#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "optimization.h"

using namespace alglib;


int main(int argc, char **argv)
{
    //
    // This example demonstrates minimization of F(x0,x1) = x0^2 + x1^2 -6*x0 - 4*x1,
    // with quadratic term given by sparse matrix structure.
    //
    // Exact solution is [x0,x1] = [3,2]
    //
    // We provide the algorithm with a starting point, although in this case
    // (sparse matrix, no constraints) it can work without such information.
    //
    // IMPORTANT: this solver minimizes the following function:
    //     f(x) = 0.5*x'*A*x + b'*x.
    // Note that the quadratic term has 0.5 before it. So if you want to minimize
    // a quadratic function, you should rewrite it in such a way that the
    // quadratic term is multiplied by 0.5 too.
    //
    // For example, our function is f(x)=x0^2+x1^2+..., but we rewrite it as 
    //     f(x) = 0.5*(2*x0^2+2*x1^2) + ....
    // and pass diag(2,2) as quadratic term - NOT diag(1,1)!
    //
    sparsematrix a;
    real_1d_array b = "[-6,-4]";
    real_1d_array x0 = "[0,1]";
    real_1d_array s = "[1,1]";
    real_1d_array x;
    minqpstate state;
    minqpreport rep;

    // initialize sparsematrix structure
    sparsecreate(2, 2, 0, a);
    sparseset(a, 0, 0, 2.0);
    sparseset(a, 1, 1, 2.0);

    // create solver, set quadratic/linear terms
    minqpcreate(2, state);
    minqpsetquadratictermsparse(state, a, true);
    minqpsetlinearterm(state, b);
    minqpsetstartingpoint(state, x0);

    // Set scale of the parameters.
    // It is strongly recommended that you set scale of your variables.
    // Knowing their scales is essential for evaluation of stopping criteria
    // and for preconditioning of the algorithm steps.
    // You can find more information on scaling at http://www.alglib.net/optimization/scaling.php
    minqpsetscale(state, s);

    //
    // Solve problem with BLEIC-based QP solver.
    //
    // This solver is intended for problems with moderate (up to 50) number
    // of general linear constraints and unlimited number of box constraints.
    // It also supports sparse problems.
    //
    // Default stopping criteria are used.
    //
    minqpsetalgobleic(state, 0.0, 0.0, 0.0, 0);
    minqpoptimize(state);
    minqpresults(state, x, rep);
    printf("%s\n", x.tostring(2).c_str()); // EXPECTED: [3,2]
    return 0;
}


modelerrors
multilayerperceptron
mlpactivationfunction
mlpallerrorssparsesubset
mlpallerrorssubset
mlpavgce
mlpavgcesparse
mlpavgerror
mlpavgerrorsparse
mlpavgrelerror
mlpavgrelerrorsparse
mlpclserror
mlpcopy
mlpcopytunableparameters
mlpcreate0
mlpcreate1
mlpcreate2
mlpcreateb0
mlpcreateb1
mlpcreateb2
mlpcreatec0
mlpcreatec1
mlpcreatec2
mlpcreater0
mlpcreater1
mlpcreater2
mlperror
mlperrorn
mlperrorsparse
mlperrorsparsesubset
mlperrorsubset
mlpgetinputscaling
mlpgetinputscount
mlpgetlayerscount
mlpgetlayersize
mlpgetneuroninfo
mlpgetoutputscaling
mlpgetoutputscount
mlpgetweight
mlpgetweightscount
mlpgrad
mlpgradbatch
mlpgradbatchsparse
mlpgradbatchsparsesubset
mlpgradbatchsubset
mlpgradn
mlpgradnbatch
mlphessianbatch
mlphessiannbatch
mlpinitpreprocessor
mlpissoftmax
mlpprocess
mlpprocessi
mlpproperties
mlprandomize
mlprandomizefull
mlprelclserror
mlprelclserrorsparse
mlprmserror
mlprmserrorsparse
mlpserialize
mlpsetinputscaling
mlpsetneuroninfo
mlpsetoutputscaling
mlpsetweight
mlpunserialize
/************************************************************************* Model's errors: * RelCLSError - fraction of misclassified cases. * AvgCE - average cross-entropy * RMSError - root-mean-square error * AvgError - average error * AvgRelError - average relative error NOTE 1: RelCLSError/AvgCE are zero on regression problems. NOTE 2: on classification problems RMSError/AvgError/AvgRelError contain errors in prediction of posterior probabilities *************************************************************************/
class modelerrors { double relclserror; double avgce; double rmserror; double avgerror; double avgrelerror; };
/************************************************************************* *************************************************************************/
class multilayerperceptron { };
/************************************************************************* Neural network activation function INPUT PARAMETERS: NET - neuron input K - function index (zero for linear function) OUTPUT PARAMETERS: F - function DF - its derivative D2F - its second derivative -- ALGLIB -- Copyright 04.11.2007 by Bochkanov Sergey *************************************************************************/
void alglib::mlpactivationfunction( double net, ae_int_t k, double& f, double& df, double& d2f);
/************************************************************************* Calculation of all types of errors on subset of dataset. FOR USERS OF COMMERCIAL EDITION: ! Commercial version of ALGLIB includes two important improvements of ! this function: ! * multicore support (C++ and C# computational cores) ! * SSE support ! ! First improvement gives close-to-linear speedup on multicore systems. ! Second improvement gives constant speedup (2-3x depending on your CPU) ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! In order to use SSE features you have to: ! * use commercial version of ALGLIB on Intel processors ! * use C++ computational core ! ! This note is given for users of commercial edition; if you use GPL ! edition, you still will be able to call smp-version of this function, ! but all computations will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. INPUT PARAMETERS: Network - network initialized with one of the network creation funcs XY - original dataset given by sparse matrix; one sample = one row; first NIn columns contain inputs, next NOut columns - desired outputs. SetSize - real size of XY, SetSize>=0; Subset - subset of SubsetSize elements, array[SubsetSize]; SubsetSize- number of elements in Subset[] array: * if SubsetSize>0, rows of XY with indices Subset[0]... ...Subset[SubsetSize-1] are processed * if SubsetSize=0, zeros are returned * if SubsetSize<0, entire dataset is processed; Subset[] array is ignored in this case. OUTPUT PARAMETERS: Rep - it contains all type of errors. -- ALGLIB -- Copyright 04.09.2012 by Bochkanov Sergey *************************************************************************/
void alglib::mlpallerrorssparsesubset( multilayerperceptron network, sparsematrix xy, ae_int_t setsize, integer_1d_array subset, ae_int_t subsetsize, modelerrors& rep); void alglib::smp_mlpallerrorssparsesubset( multilayerperceptron network, sparsematrix xy, ae_int_t setsize, integer_1d_array subset, ae_int_t subsetsize, modelerrors& rep);
/************************************************************************* Calculation of all types of errors on subset of dataset. FOR USERS OF COMMERCIAL EDITION: ! Commercial version of ALGLIB includes two important improvements of ! this function: ! * multicore support (C++ and C# computational cores) ! * SSE support ! ! First improvement gives close-to-linear speedup on multicore systems. ! Second improvement gives constant speedup (2-3x depending on your CPU) ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! In order to use SSE features you have to: ! * use commercial version of ALGLIB on Intel processors ! * use C++ computational core ! ! This note is given for users of commercial edition; if you use GPL ! edition, you still will be able to call smp-version of this function, ! but all computations will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. INPUT PARAMETERS: Network - network initialized with one of the network creation funcs XY - original dataset; one sample = one row; first NIn columns contain inputs, next NOut columns - desired outputs. SetSize - real size of XY, SetSize>=0; Subset - subset of SubsetSize elements, array[SubsetSize]; SubsetSize- number of elements in Subset[] array: * if SubsetSize>0, rows of XY with indices Subset[0]... ...Subset[SubsetSize-1] are processed * if SubsetSize=0, zeros are returned * if SubsetSize<0, entire dataset is processed; Subset[] array is ignored in this case. OUTPUT PARAMETERS: Rep - it contains all type of errors. -- ALGLIB -- Copyright 04.09.2012 by Bochkanov Sergey *************************************************************************/
void alglib::mlpallerrorssubset( multilayerperceptron network, real_2d_array xy, ae_int_t setsize, integer_1d_array subset, ae_int_t subsetsize, modelerrors& rep); void alglib::smp_mlpallerrorssubset( multilayerperceptron network, real_2d_array xy, ae_int_t setsize, integer_1d_array subset, ae_int_t subsetsize, modelerrors& rep);
/************************************************************************* Average cross-entropy (in bits per element) on the test set. FOR USERS OF COMMERCIAL EDITION: ! Commercial version of ALGLIB includes two important improvements of ! this function: ! * multicore support (C++ and C# computational cores) ! * SSE support ! ! First improvement gives close-to-linear speedup on multicore systems. ! Second improvement gives constant speedup (2-3x depending on your CPU) ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! In order to use SSE features you have to: ! * use commercial version of ALGLIB on Intel processors ! * use C++ computational core ! ! This note is given for users of commercial edition; if you use GPL ! edition, you still will be able to call smp-version of this function, ! but all computations will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. INPUT PARAMETERS: Network - neural network; XY - training set, see below for information on the training set format; NPoints - points count. RESULT: CrossEntropy/(NPoints*LN(2)). Zero if network solves regression task. DATASET FORMAT: This function uses two different dataset formats - one for regression networks, another one for classification networks. For regression networks with NIn inputs and NOut outputs following dataset format is used: * dataset is given by NPoints*(NIn+NOut) matrix * each row corresponds to one example * first NIn columns are inputs, next NOut columns are outputs For classification networks with NIn inputs and NClasses clases following dataset format is used: * dataset is given by NPoints*(NIn+1) matrix * each row corresponds to one example * first NIn columns are inputs, last column stores class number (from 0 to NClasses-1). -- ALGLIB -- Copyright 08.01.2009 by Bochkanov Sergey *************************************************************************/
double alglib::mlpavgce( multilayerperceptron network, real_2d_array xy, ae_int_t npoints); double alglib::smp_mlpavgce( multilayerperceptron network, real_2d_array xy, ae_int_t npoints);
/************************************************************************* Average cross-entropy (in bits per element) on the test set given by sparse matrix. FOR USERS OF COMMERCIAL EDITION: ! Commercial version of ALGLIB includes two important improvements of ! this function: ! * multicore support (C++ and C# computational cores) ! * SSE support ! ! First improvement gives close-to-linear speedup on multicore systems. ! Second improvement gives constant speedup (2-3x depending on your CPU) ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! In order to use SSE features you have to: ! * use commercial version of ALGLIB on Intel processors ! * use C++ computational core ! ! This note is given for users of commercial edition; if you use GPL ! edition, you still will be able to call smp-version of this function, ! but all computations will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. INPUT PARAMETERS: Network - neural network; XY - training set, see below for information on the training set format. This function checks correctness of the dataset (no NANs/INFs, class numbers are correct) and throws exception when incorrect dataset is passed. Sparse matrix must use CRS format for storage. NPoints - points count, >=0. RESULT: CrossEntropy/(NPoints*LN(2)). Zero if network solves regression task. DATASET FORMAT: This function uses two different dataset formats - one for regression networks, another one for classification networks. For regression networks with NIn inputs and NOut outputs following dataset format is used: * dataset is given by NPoints*(NIn+NOut) matrix * each row corresponds to one example * first NIn columns are inputs, next NOut columns are outputs For classification networks with NIn inputs and NClasses clases following dataset format is used: * dataset is given by NPoints*(NIn+1) matrix * each row corresponds to one example * first NIn columns are inputs, last column stores class number (from 0 to NClasses-1). -- ALGLIB -- Copyright 9.08.2012 by Bochkanov Sergey *************************************************************************/
double alglib::mlpavgcesparse( multilayerperceptron network, sparsematrix xy, ae_int_t npoints); double alglib::smp_mlpavgcesparse( multilayerperceptron network, sparsematrix xy, ae_int_t npoints);
/************************************************************************* Average absolute error on the test set. FOR USERS OF COMMERCIAL EDITION: ! Commercial version of ALGLIB includes two important improvements of ! this function: ! * multicore support (C++ and C# computational cores) ! * SSE support ! ! First improvement gives close-to-linear speedup on multicore systems. ! Second improvement gives constant speedup (2-3x depending on your CPU) ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! In order to use SSE features you have to: ! * use commercial version of ALGLIB on Intel processors ! * use C++ computational core ! ! This note is given for users of commercial edition; if you use GPL ! edition, you still will be able to call smp-version of this function, ! but all computations will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. INPUT PARAMETERS: Network - neural network; XY - training set, see below for information on the training set format; NPoints - points count. RESULT: Its meaning for regression task is obvious. As for classification task, it means average error when estimating posterior probabilities. DATASET FORMAT: This function uses two different dataset formats - one for regression networks, another one for classification networks. For regression networks with NIn inputs and NOut outputs following dataset format is used: * dataset is given by NPoints*(NIn+NOut) matrix * each row corresponds to one example * first NIn columns are inputs, next NOut columns are outputs For classification networks with NIn inputs and NClasses clases following dataset format is used: * dataset is given by NPoints*(NIn+1) matrix * each row corresponds to one example * first NIn columns are inputs, last column stores class number (from 0 to NClasses-1). -- ALGLIB -- Copyright 11.03.2008 by Bochkanov Sergey *************************************************************************/
double alglib::mlpavgerror( multilayerperceptron network, real_2d_array xy, ae_int_t npoints); double alglib::smp_mlpavgerror( multilayerperceptron network, real_2d_array xy, ae_int_t npoints);
/************************************************************************* Average absolute error on the test set given by sparse matrix. FOR USERS OF COMMERCIAL EDITION: ! Commercial version of ALGLIB includes two important improvements of ! this function: ! * multicore support (C++ and C# computational cores) ! * SSE support ! ! First improvement gives close-to-linear speedup on multicore systems. ! Second improvement gives constant speedup (2-3x depending on your CPU) ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! In order to use SSE features you have to: ! * use commercial version of ALGLIB on Intel processors ! * use C++ computational core ! ! This note is given for users of commercial edition; if you use GPL ! edition, you still will be able to call smp-version of this function, ! but all computations will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. INPUT PARAMETERS: Network - neural network; XY - training set, see below for information on the training set format. This function checks correctness of the dataset (no NANs/INFs, class numbers are correct) and throws exception when incorrect dataset is passed. Sparse matrix must use CRS format for storage. NPoints - points count, >=0. RESULT: Its meaning for regression task is obvious. As for classification task, it means average error when estimating posterior probabilities. DATASET FORMAT: This function uses two different dataset formats - one for regression networks, another one for classification networks. For regression networks with NIn inputs and NOut outputs following dataset format is used: * dataset is given by NPoints*(NIn+NOut) matrix * each row corresponds to one example * first NIn columns are inputs, next NOut columns are outputs For classification networks with NIn inputs and NClasses clases following dataset format is used: * dataset is given by NPoints*(NIn+1) matrix * each row corresponds to one example * first NIn columns are inputs, last column stores class number (from 0 to NClasses-1). -- ALGLIB -- Copyright 09.08.2012 by Bochkanov Sergey *************************************************************************/
double alglib::mlpavgerrorsparse( multilayerperceptron network, sparsematrix xy, ae_int_t npoints); double alglib::smp_mlpavgerrorsparse( multilayerperceptron network, sparsematrix xy, ae_int_t npoints);
/************************************************************************* Average relative error on the test set. FOR USERS OF COMMERCIAL EDITION: ! Commercial version of ALGLIB includes two important improvements of ! this function: ! * multicore support (C++ and C# computational cores) ! * SSE support ! ! First improvement gives close-to-linear speedup on multicore systems. ! Second improvement gives constant speedup (2-3x depending on your CPU) ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! In order to use SSE features you have to: ! * use commercial version of ALGLIB on Intel processors ! * use C++ computational core ! ! This note is given for users of commercial edition; if you use GPL ! edition, you still will be able to call smp-version of this function, ! but all computations will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. INPUT PARAMETERS: Network - neural network; XY - training set, see below for information on the training set format; NPoints - points count. RESULT: Its meaning for regression task is obvious. As for classification task, it means average relative error when estimating posterior probability of belonging to the correct class. DATASET FORMAT: This function uses two different dataset formats - one for regression networks, another one for classification networks. For regression networks with NIn inputs and NOut outputs following dataset format is used: * dataset is given by NPoints*(NIn+NOut) matrix * each row corresponds to one example * first NIn columns are inputs, next NOut columns are outputs For classification networks with NIn inputs and NClasses clases following dataset format is used: * dataset is given by NPoints*(NIn+1) matrix * each row corresponds to one example * first NIn columns are inputs, last column stores class number (from 0 to NClasses-1). -- ALGLIB -- Copyright 11.03.2008 by Bochkanov Sergey *************************************************************************/
double alglib::mlpavgrelerror( multilayerperceptron network, real_2d_array xy, ae_int_t npoints); double alglib::smp_mlpavgrelerror( multilayerperceptron network, real_2d_array xy, ae_int_t npoints);
/************************************************************************* Average relative error on the test set given by sparse matrix. FOR USERS OF COMMERCIAL EDITION: ! Commercial version of ALGLIB includes two important improvements of ! this function: ! * multicore support (C++ and C# computational cores) ! * SSE support ! ! First improvement gives close-to-linear speedup on multicore systems. ! Second improvement gives constant speedup (2-3x depending on your CPU) ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! In order to use SSE features you have to: ! * use commercial version of ALGLIB on Intel processors ! * use C++ computational core ! ! This note is given for users of commercial edition; if you use GPL ! edition, you still will be able to call smp-version of this function, ! but all computations will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. INPUT PARAMETERS: Network - neural network; XY - training set, see below for information on the training set format. This function checks correctness of the dataset (no NANs/INFs, class numbers are correct) and throws exception when incorrect dataset is passed. Sparse matrix must use CRS format for storage. NPoints - points count, >=0. RESULT: Its meaning for regression task is obvious. As for classification task, it means average relative error when estimating posterior probability of belonging to the correct class. DATASET FORMAT: This function uses two different dataset formats - one for regression networks, another one for classification networks. For regression networks with NIn inputs and NOut outputs following dataset format is used: * dataset is given by NPoints*(NIn+NOut) matrix * each row corresponds to one example * first NIn columns are inputs, next NOut columns are outputs For classification networks with NIn inputs and NClasses clases following dataset format is used: * dataset is given by NPoints*(NIn+1) matrix * each row corresponds to one example * first NIn columns are inputs, last column stores class number (from 0 to NClasses-1). -- ALGLIB -- Copyright 09.08.2012 by Bochkanov Sergey *************************************************************************/
double alglib::mlpavgrelerrorsparse( multilayerperceptron network, sparsematrix xy, ae_int_t npoints); double alglib::smp_mlpavgrelerrorsparse( multilayerperceptron network, sparsematrix xy, ae_int_t npoints);
/************************************************************************* Classification error of the neural network on dataset. FOR USERS OF COMMERCIAL EDITION: ! Commercial version of ALGLIB includes two important improvements of ! this function: ! * multicore support (C++ and C# computational cores) ! * SSE support ! ! First improvement gives close-to-linear speedup on multicore systems. ! Second improvement gives constant speedup (2-3x depending on your CPU) ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! In order to use SSE features you have to: ! * use commercial version of ALGLIB on Intel processors ! * use C++ computational core ! ! This note is given for users of commercial edition; if you use GPL ! edition, you still will be able to call smp-version of this function, ! but all computations will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. INPUT PARAMETERS: Network - neural network; XY - training set, see below for information on the training set format; NPoints - points count. RESULT: classification error (number of misclassified cases) DATASET FORMAT: This function uses two different dataset formats - one for regression networks, another one for classification networks. For regression networks with NIn inputs and NOut outputs following dataset format is used: * dataset is given by NPoints*(NIn+NOut) matrix * each row corresponds to one example * first NIn columns are inputs, next NOut columns are outputs For classification networks with NIn inputs and NClasses clases following dataset format is used: * dataset is given by NPoints*(NIn+1) matrix * each row corresponds to one example * first NIn columns are inputs, last column stores class number (from 0 to NClasses-1). -- ALGLIB -- Copyright 04.11.2007 by Bochkanov Sergey *************************************************************************/
ae_int_t alglib::mlpclserror( multilayerperceptron network, real_2d_array xy, ae_int_t npoints); ae_int_t alglib::smp_mlpclserror( multilayerperceptron network, real_2d_array xy, ae_int_t npoints);
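The following sketch is not one of the official examples; it assumes the DataAnalysis package header (dataanalysis.h) and a purely illustrative dataset. It creates an untrained SOFTMAX classifier with mlpcreatec0 and counts misclassified samples with mlpclserror.

#include "stdafx.h"
#include <stdio.h>
#include "dataanalysis.h"

using namespace alglib;

int main(int argc, char **argv)
{
    // untrained classifier: 2 inputs, 2 classes, SOFTMAX output layer (illustrative sizes)
    multilayerperceptron net;
    mlpcreatec0(2, 2, net);

    // classification dataset: first NIn=2 columns are inputs,
    // last column stores the class number (0 or 1)
    real_2d_array xy = "[[0,0,0],[1,1,1],[0,1,1],[1,0,0]]";

    // number of misclassified cases among the 4 samples
    ae_int_t nmis = mlpclserror(net, xy, 4);
    printf("%d\n", int(nmis));
    return 0;
}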
/************************************************************************* Copying of neural network INPUT PARAMETERS: Network1 - original OUTPUT PARAMETERS: Network2 - copy -- ALGLIB -- Copyright 04.11.2007 by Bochkanov Sergey *************************************************************************/
void alglib::mlpcopy( multilayerperceptron network1, multilayerperceptron& network2);
/************************************************************************* This function copies tunable parameters (weights/means/sigmas) from one network to another with same architecture. It performs some rudimentary checks that architectures are same, and throws exception if check fails. It is intended for fast copying of states between two network which are known to have same geometry. INPUT PARAMETERS: Network1 - source, must be correctly initialized Network2 - target, must have same architecture OUTPUT PARAMETERS: Network2 - network state is copied from source to target -- ALGLIB -- Copyright 20.06.2013 by Bochkanov Sergey *************************************************************************/
void alglib::mlpcopytunableparameters( multilayerperceptron network1, multilayerperceptron network2);
/************************************************************************* Creates neural network with NIn inputs, NOut outputs, without hidden layers, with linear output layer. Network weights are filled with small random values. -- ALGLIB -- Copyright 04.11.2007 by Bochkanov Sergey *************************************************************************/
void alglib::mlpcreate0( ae_int_t nin, ae_int_t nout, multilayerperceptron& network);
/************************************************************************* Same as MLPCreate0, but with one hidden layer (NHid neurons) with non-linear activation function. Output layer is linear. -- ALGLIB -- Copyright 04.11.2007 by Bochkanov Sergey *************************************************************************/
void alglib::mlpcreate1( ae_int_t nin, ae_int_t nhid, ae_int_t nout, multilayerperceptron& network);
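A minimal usage sketch (not part of the official examples): assuming the DataAnalysis package header (dataanalysis.h) and illustrative layer sizes and input values, it creates a 2-input, 1-output regression network with one hidden layer and evaluates it with mlpprocess. Since the weights are small random values, the output is not meaningful until the network is trained.

#include "stdafx.h"
#include <stdio.h>
#include "dataanalysis.h"

using namespace alglib;

int main(int argc, char **argv)
{
    // create network: 2 inputs, 5 hidden neurons, 1 output, linear output layer (illustrative sizes)
    multilayerperceptron net;
    mlpcreate1(2, 5, 1, net);

    // evaluate the network on a single input vector
    real_1d_array x = "[1.0,2.0]";
    real_1d_array y;
    mlpprocess(net, x, y);
    printf("%s\n", y.tostring(4).c_str());
    return 0;
}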
/************************************************************************* Same as MLPCreate0, but with two hidden layers (NHid1 and NHid2 neurons) with non-linear activation function. Output layer is linear. -- ALGLIB -- Copyright 04.11.2007 by Bochkanov Sergey *************************************************************************/
void alglib::mlpcreate2( ae_int_t nin, ae_int_t nhid1, ae_int_t nhid2, ae_int_t nout, multilayerperceptron& network);
/************************************************************************* Creates neural network with NIn inputs, NOut outputs, without hidden layers with non-linear output layer. Network weights are filled with small random values. Activation function of the output layer takes values: (B, +INF), if D>=0 or (-INF, B), if D<0. -- ALGLIB -- Copyright 30.03.2008 by Bochkanov Sergey *************************************************************************/
void alglib::mlpcreateb0( ae_int_t nin, ae_int_t nout, double b, double d, multilayerperceptron& network);
/************************************************************************* Same as MLPCreateB0 but with non-linear hidden layer. -- ALGLIB -- Copyright 30.03.2008 by Bochkanov Sergey *************************************************************************/
void alglib::mlpcreateb1( ae_int_t nin, ae_int_t nhid, ae_int_t nout, double b, double d, multilayerperceptron& network);
/************************************************************************* Same as MLPCreateB0 but with two non-linear hidden layers. -- ALGLIB -- Copyright 30.03.2008 by Bochkanov Sergey *************************************************************************/
void alglib::mlpcreateb2( ae_int_t nin, ae_int_t nhid1, ae_int_t nhid2, ae_int_t nout, double b, double d, multilayerperceptron& network);
/************************************************************************* Creates classifier network with NIn inputs and NOut possible classes. Network contains no hidden layers and linear output layer with SOFTMAX- normalization (so outputs sums up to 1.0 and converge to posterior probabilities). -- ALGLIB -- Copyright 04.11.2007 by Bochkanov Sergey *************************************************************************/
void alglib::mlpcreatec0( ae_int_t nin, ae_int_t nout, multilayerperceptron& network);
/************************************************************************* Same as MLPCreateC0, but with one non-linear hidden layer. -- ALGLIB -- Copyright 04.11.2007 by Bochkanov Sergey *************************************************************************/
void alglib::mlpcreatec1( ae_int_t nin, ae_int_t nhid, ae_int_t nout, multilayerperceptron& network);
/************************************************************************* Same as MLPCreateC0, but with two non-linear hidden layers. -- ALGLIB -- Copyright 04.11.2007 by Bochkanov Sergey *************************************************************************/
void alglib::mlpcreatec2( ae_int_t nin, ae_int_t nhid1, ae_int_t nhid2, ae_int_t nout, multilayerperceptron& network);
/************************************************************************* Creates neural network with NIn inputs, NOut outputs, without hidden layers with non-linear output layer. Network weights are filled with small random values. Activation function of the output layer takes values [A,B]. -- ALGLIB -- Copyright 30.03.2008 by Bochkanov Sergey *************************************************************************/
void alglib::mlpcreater0( ae_int_t nin, ae_int_t nout, double a, double b, multilayerperceptron& network);
/************************************************************************* Same as MLPCreateR0, but with non-linear hidden layer. -- ALGLIB -- Copyright 30.03.2008 by Bochkanov Sergey *************************************************************************/
void alglib::mlpcreater1( ae_int_t nin, ae_int_t nhid, ae_int_t nout, double a, double b, multilayerperceptron& network);
/************************************************************************* Same as MLPCreateR0, but with two non-linear hidden layers. -- ALGLIB -- Copyright 30.03.2008 by Bochkanov Sergey *************************************************************************/
void alglib::mlpcreater2( ae_int_t nin, ae_int_t nhid1, ae_int_t nhid2, ae_int_t nout, double a, double b, multilayerperceptron& network);
/************************************************************************* Error of the neural network on dataset. FOR USERS OF COMMERCIAL EDITION: ! Commercial version of ALGLIB includes two important improvements of ! this function: ! * multicore support (C++ and C# computational cores) ! * SSE support ! ! First improvement gives close-to-linear speedup on multicore systems. ! Second improvement gives constant speedup (2-3x, depending on your CPU) ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! In order to use SSE features you have to: ! * use commercial version of ALGLIB on Intel processors ! * use C++ computational core ! ! This note is given for users of commercial edition; if you use GPL ! edition, you still will be able to call smp-version of this function, ! but all computations will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. INPUT PARAMETERS: Network - neural network; XY - training set, see below for information on the training set format; NPoints - points count. RESULT: sum-of-squares error, SUM(sqr(y[i]-desired_y[i])/2) DATASET FORMAT: This function uses two different dataset formats - one for regression networks, another one for classification networks. For regression networks with NIn inputs and NOut outputs following dataset format is used: * dataset is given by NPoints*(NIn+NOut) matrix * each row corresponds to one example * first NIn columns are inputs, next NOut columns are outputs For classification networks with NIn inputs and NClasses clases following dataset format is used: * dataset is given by NPoints*(NIn+1) matrix * each row corresponds to one example * first NIn columns are inputs, last column stores class number (from 0 to NClasses-1). -- ALGLIB -- Copyright 04.11.2007 by Bochkanov Sergey *************************************************************************/
double alglib::mlperror( multilayerperceptron network, real_2d_array xy, ae_int_t npoints); double alglib::smp_mlperror( multilayerperceptron network, real_2d_array xy, ae_int_t npoints);
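A small sketch of the dense-dataset error calculation (not an official example; dataanalysis.h header and dataset values are assumptions for illustration). It builds a regression dataset in the format described above and evaluates the sum-of-squares error of an untrained network.

#include "stdafx.h"
#include <stdio.h>
#include "dataanalysis.h"

using namespace alglib;

int main(int argc, char **argv)
{
    // untrained regression network: 2 inputs, 1 output (illustrative sizes, random weights)
    multilayerperceptron net;
    mlpcreate0(2, 1, net);

    // dense regression dataset, one sample per row:
    // first NIn=2 columns are inputs, last NOut=1 column is the desired output
    real_2d_array xy = "[[0,0,0],[1,0,1],[0,1,1],[1,1,2]]";

    // sum-of-squares error SUM(sqr(y[i]-desired_y[i])/2) over all 4 samples
    double e = mlperror(net, xy, 4);
    printf("%.6f\n", e);
    return 0;
}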
/************************************************************************* Natural error function for neural network, internal subroutine. NOTE: this function is single-threaded. Unlike other error functions, it receives no speed-up from being executed in SMP mode. -- ALGLIB -- Copyright 04.11.2007 by Bochkanov Sergey *************************************************************************/
double alglib::mlperrorn( multilayerperceptron network, real_2d_array xy, ae_int_t ssize);
/************************************************************************* Error of the neural network on dataset given by sparse matrix. FOR USERS OF COMMERCIAL EDITION: ! Commercial version of ALGLIB includes two important improvements of ! this function: ! * multicore support (C++ and C# computational cores) ! * SSE support ! ! First improvement gives close-to-linear speedup on multicore systems. ! Second improvement gives constant speedup (2-3x, depending on your CPU) ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! In order to use SSE features you have to: ! * use commercial version of ALGLIB on Intel processors ! * use C++ computational core ! ! This note is given for users of commercial edition; if you use GPL ! edition, you still will be able to call smp-version of this function, ! but all computations will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. INPUT PARAMETERS: Network - neural network XY - training set, see below for information on the training set format. This function checks correctness of the dataset (no NANs/INFs, class numbers are correct) and throws exception when incorrect dataset is passed. Sparse matrix must use CRS format for storage. NPoints - points count, >=0 RESULT: sum-of-squares error, SUM(sqr(y[i]-desired_y[i])/2) DATASET FORMAT: This function uses two different dataset formats - one for regression networks, another one for classification networks. For regression networks with NIn inputs and NOut outputs following dataset format is used: * dataset is given by NPoints*(NIn+NOut) matrix * each row corresponds to one example * first NIn columns are inputs, next NOut columns are outputs For classification networks with NIn inputs and NClasses clases following dataset format is used: * dataset is given by NPoints*(NIn+1) matrix * each row corresponds to one example * first NIn columns are inputs, last column stores class number (from 0 to NClasses-1). -- ALGLIB -- Copyright 23.07.2012 by Bochkanov Sergey *************************************************************************/
double alglib::mlperrorsparse( multilayerperceptron network, sparsematrix xy, ae_int_t npoints); double alglib::smp_mlperrorsparse( multilayerperceptron network, sparsematrix xy, ae_int_t npoints);
/************************************************************************* Error of the neural network on subset of sparse dataset. FOR USERS OF COMMERCIAL EDITION: ! Commercial version of ALGLIB includes two important improvements of ! this function: ! * multicore support (C++ and C# computational cores) ! * SSE support ! ! First improvement gives close-to-linear speedup on multicore systems. ! Second improvement gives constant speedup (2-3x depending on your CPU) ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! In order to use SSE features you have to: ! * use commercial version of ALGLIB on Intel processors ! * use C++ computational core ! ! This note is given for users of commercial edition; if you use GPL ! edition, you still will be able to call smp-version of this function, ! but all computations will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. INPUT PARAMETERS: Network - neural network; XY - training set, see below for information on the training set format. This function checks correctness of the dataset (no NANs/INFs, class numbers are correct) and throws exception when incorrect dataset is passed. Sparse matrix must use CRS format for storage. SetSize - real size of XY, SetSize>=0; it is used when SubsetSize<0; Subset - subset of SubsetSize elements, array[SubsetSize]; SubsetSize- number of elements in Subset[] array: * if SubsetSize>0, rows of XY with indices Subset[0]... ...Subset[SubsetSize-1] are processed * if SubsetSize=0, zeros are returned * if SubsetSize<0, entire dataset is processed; Subset[] array is ignored in this case. RESULT: sum-of-squares error, SUM(sqr(y[i]-desired_y[i])/2) DATASET FORMAT: This function uses two different dataset formats - one for regression networks, another one for classification networks. For regression networks with NIn inputs and NOut outputs following dataset format is used: * dataset is given by NPoints*(NIn+NOut) matrix * each row corresponds to one example * first NIn columns are inputs, next NOut columns are outputs For classification networks with NIn inputs and NClasses clases following dataset format is used: * dataset is given by NPoints*(NIn+1) matrix * each row corresponds to one example * first NIn columns are inputs, last column stores class number (from 0 to NClasses-1). -- ALGLIB -- Copyright 04.09.2012 by Bochkanov Sergey *************************************************************************/
double alglib::mlperrorsparsesubset( multilayerperceptron network, sparsematrix xy, ae_int_t setsize, integer_1d_array subset, ae_int_t subsetsize); double alglib::smp_mlperrorsparsesubset( multilayerperceptron network, sparsematrix xy, ae_int_t setsize, integer_1d_array subset, ae_int_t subsetsize);
/************************************************************************* Error of the neural network on subset of dataset. FOR USERS OF COMMERCIAL EDITION: ! Commercial version of ALGLIB includes two important improvements of ! this function: ! * multicore support (C++ and C# computational cores) ! * SSE support ! ! First improvement gives close-to-linear speedup on multicore systems. ! Second improvement gives constant speedup (2-3x depending on your CPU) ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! In order to use SSE features you have to: ! * use commercial version of ALGLIB on Intel processors ! * use C++ computational core ! ! This note is given for users of commercial edition; if you use GPL ! edition, you still will be able to call smp-version of this function, ! but all computations will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. INPUT PARAMETERS: Network - neural network; XY - training set, see below for information on the training set format; SetSize - real size of XY, SetSize>=0; Subset - subset of SubsetSize elements, array[SubsetSize]; SubsetSize- number of elements in Subset[] array: * if SubsetSize>0, rows of XY with indices Subset[0]... ...Subset[SubsetSize-1] are processed * if SubsetSize=0, zeros are returned * if SubsetSize<0, entire dataset is processed; Subset[] array is ignored in this case. RESULT: sum-of-squares error, SUM(sqr(y[i]-desired_y[i])/2) DATASET FORMAT: This function uses two different dataset formats - one for regression networks, another one for classification networks. For regression networks with NIn inputs and NOut outputs following dataset format is used: * dataset is given by NPoints*(NIn+NOut) matrix * each row corresponds to one example * first NIn columns are inputs, next NOut columns are outputs For classification networks with NIn inputs and NClasses clases following dataset format is used: * dataset is given by NPoints*(NIn+1) matrix * each row corresponds to one example * first NIn columns are inputs, last column stores class number (from 0 to NClasses-1). -- ALGLIB -- Copyright 04.09.2012 by Bochkanov Sergey *************************************************************************/
double alglib::mlperrorsubset( multilayerperceptron network, real_2d_array xy, ae_int_t setsize, integer_1d_array subset, ae_int_t subsetsize); double alglib::smp_mlperrorsubset( multilayerperceptron network, real_2d_array xy, ae_int_t setsize, integer_1d_array subset, ae_int_t subsetsize);
/************************************************************************* This function returns offset/scaling coefficients for I-th input of the network. INPUT PARAMETERS: Network - network I - input index OUTPUT PARAMETERS: Mean - mean term Sigma - sigma term, guaranteed to be nonzero. I-th input is passed through linear transformation IN[i] = (IN[i]-Mean)/Sigma before feeding to the network -- ALGLIB -- Copyright 25.03.2011 by Bochkanov Sergey *************************************************************************/
void alglib::mlpgetinputscaling( multilayerperceptron network, ae_int_t i, double& mean, double& sigma);
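The sketch below (not an official example; dataanalysis.h and the concrete coefficients are illustrative assumptions) shows the round trip between mlpsetinputscaling and mlpgetinputscaling for the preprocessing of a single input.

#include "stdafx.h"
#include <stdio.h>
#include "dataanalysis.h"

using namespace alglib;

int main(int argc, char **argv)
{
    // small network, sizes are illustrative
    multilayerperceptron net;
    mlpcreate0(2, 1, net);

    // override preprocessing of input 0: IN[0] = (IN[0]-10.0)/2.0
    mlpsetinputscaling(net, 0, 10.0, 2.0);

    // read the coefficients back
    double mean, sigma;
    mlpgetinputscaling(net, 0, mean, sigma);
    printf("%.1f %.1f\n", mean, sigma);
    return 0;
}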
/************************************************************************* Returns number of inputs. -- ALGLIB -- Copyright 19.10.2011 by Bochkanov Sergey *************************************************************************/
ae_int_t alglib::mlpgetinputscount(multilayerperceptron network);
/************************************************************************* This function returns total number of layers (including input, hidden and output layers). -- ALGLIB -- Copyright 25.03.2011 by Bochkanov Sergey *************************************************************************/
ae_int_t alglib::mlpgetlayerscount(multilayerperceptron network);
/************************************************************************* This function returns size of K-th layer. K=0 corresponds to input layer, K=CNT-1 corresponds to output layer. Size of the output layer is always equal to the number of outputs, although when we have softmax-normalized network, last neuron doesn't have any connections - it is just zero. -- ALGLIB -- Copyright 25.03.2011 by Bochkanov Sergey *************************************************************************/
ae_int_t alglib::mlpgetlayersize( multilayerperceptron network, ae_int_t k);
/************************************************************************* This function returns information about Ith neuron of Kth layer INPUT PARAMETERS: Network - network K - layer index I - neuron index (within layer) OUTPUT PARAMETERS: FKind - activation function type (used by MLPActivationFunction()) this value is zero for input or linear neurons Threshold - also called offset, bias zero for input neurons NOTE: this function throws exception if layer or neuron with given index do not exists. -- ALGLIB -- Copyright 25.03.2011 by Bochkanov Sergey *************************************************************************/
void alglib::mlpgetneuroninfo( multilayerperceptron network, ae_int_t k, ae_int_t i, ae_int_t& fkind, double& threshold);
/************************************************************************* This function returns offset/scaling coefficients for I-th output of the network. INPUT PARAMETERS: Network - network I - input index OUTPUT PARAMETERS: Mean - mean term Sigma - sigma term, guaranteed to be nonzero. I-th output is passed through linear transformation OUT[i] = OUT[i]*Sigma+Mean before returning it to user. In case we have SOFTMAX-normalized network, we return (Mean,Sigma)=(0.0,1.0). -- ALGLIB -- Copyright 25.03.2011 by Bochkanov Sergey *************************************************************************/
void alglib::mlpgetoutputscaling( multilayerperceptron network, ae_int_t i, double& mean, double& sigma);
/************************************************************************* Returns number of outputs. -- ALGLIB -- Copyright 19.10.2011 by Bochkanov Sergey *************************************************************************/
ae_int_t alglib::mlpgetoutputscount(multilayerperceptron network);
/************************************************************************* This function returns information about connection from I0-th neuron of K0-th layer to I1-th neuron of K1-th layer. INPUT PARAMETERS: Network - network K0 - layer index I0 - neuron index (within layer) K1 - layer index I1 - neuron index (within layer) RESULT: connection weight (zero for non-existent connections) This function: 1. throws exception if layer or neuron with given index do not exists. 2. returns zero if neurons exist, but there is no connection between them -- ALGLIB -- Copyright 25.03.2011 by Bochkanov Sergey *************************************************************************/
double alglib::mlpgetweight( multilayerperceptron network, ae_int_t k0, ae_int_t i0, ae_int_t k1, ae_int_t i1);
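A sketch of per-connection access (not an official example; header, sizes and the chosen weight value are assumptions). It sets one connection weight with mlpsetweight and reads it back with mlpgetweight; layer 0 is the input layer, layer 1 the hidden layer.

#include "stdafx.h"
#include <stdio.h>
#include "dataanalysis.h"

using namespace alglib;

int main(int argc, char **argv)
{
    // 2-3-1 network: layer 0 = inputs, layer 1 = hidden, layer 2 = output (illustrative sizes)
    multilayerperceptron net;
    mlpcreate1(2, 3, 1, net);

    // set the connection weight from neuron 0 of the input layer
    // to neuron 1 of the hidden layer, then read it back
    mlpsetweight(net, 0, 0, 1, 1, 0.25);
    double w = mlpgetweight(net, 0, 0, 1, 1);
    printf("%.2f\n", w);
    return 0;
}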
/************************************************************************* Returns number of weights. -- ALGLIB -- Copyright 19.10.2011 by Bochkanov Sergey *************************************************************************/
ae_int_t alglib::mlpgetweightscount(multilayerperceptron network);
/************************************************************************* Gradient calculation INPUT PARAMETERS: Network - network initialized with one of the network creation funcs X - input vector, length of array must be at least NIn DesiredY- desired outputs, length of array must be at least NOut Grad - possibly preallocated array. If size of array is smaller than WCount, it will be reallocated. It is recommended to reuse previously allocated array to reduce allocation overhead. OUTPUT PARAMETERS: E - error function, SUM(sqr(y[i]-desiredy[i])/2,i) Grad - gradient of E with respect to weights of network, array[WCount] -- ALGLIB -- Copyright 04.11.2007 by Bochkanov Sergey *************************************************************************/
void alglib::mlpgrad( multilayerperceptron network, real_1d_array x, real_1d_array desiredy, double& e, real_1d_array& grad);
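A single-sample gradient sketch (not an official example; dataanalysis.h, sizes and sample values are illustrative assumptions). It evaluates E and the gradient of E with respect to all network weights for one input/desired-output pair.

#include "stdafx.h"
#include <stdio.h>
#include "dataanalysis.h"

using namespace alglib;

int main(int argc, char **argv)
{
    // small regression network (illustrative sizes, random initial weights)
    multilayerperceptron net;
    mlpcreate1(2, 3, 1, net);

    // one input vector and the desired output for it
    real_1d_array x = "[1.0,2.0]";
    real_1d_array desiredy = "[0.5]";

    // E = SUM(sqr(y[i]-desiredy[i])/2); Grad is (re)allocated to WCount elements
    double e;
    real_1d_array grad;
    mlpgrad(net, x, desiredy, e, grad);
    printf("%d\n", int(grad.length()));
    return 0;
}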
/************************************************************************* Batch gradient calculation for a set of inputs/outputs FOR USERS OF COMMERCIAL EDITION: ! Commercial version of ALGLIB includes two important improvements of ! this function: ! * multicore support (C++ and C# computational cores) ! * SSE support ! ! First improvement gives close-to-linear speedup on multicore systems. ! Second improvement gives constant speedup (2-3x depending on your CPU) ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! In order to use SSE features you have to: ! * use commercial version of ALGLIB on Intel processors ! * use C++ computational core ! ! This note is given for users of commercial edition; if you use GPL ! edition, you still will be able to call smp-version of this function, ! but all computations will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. INPUT PARAMETERS: Network - network initialized with one of the network creation funcs XY - original dataset in dense format; one sample = one row: * first NIn columns contain inputs, * for regression problem, next NOut columns store desired outputs. * for classification problem, next column (just one!) stores class number. SSize - number of elements in XY Grad - possibly preallocated array. If size of array is smaller than WCount, it will be reallocated. It is recommended to reuse previously allocated array to reduce allocation overhead. OUTPUT PARAMETERS: E - error function, SUM(sqr(y[i]-desiredy[i])/2,i) Grad - gradient of E with respect to weights of network, array[WCount] -- ALGLIB -- Copyright 04.11.2007 by Bochkanov Sergey *************************************************************************/
void alglib::mlpgradbatch( multilayerperceptron network, real_2d_array xy, ae_int_t ssize, double& e, real_1d_array& grad); void alglib::smp_mlpgradbatch( multilayerperceptron network, real_2d_array xy, ae_int_t ssize, double& e, real_1d_array& grad);
/************************************************************************* Batch gradient calculation for a set of inputs/outputs given by sparse matrices FOR USERS OF COMMERCIAL EDITION: ! Commercial version of ALGLIB includes two important improvements of ! this function: ! * multicore support (C++ and C# computational cores) ! * SSE support ! ! First improvement gives close-to-linear speedup on multicore systems. ! Second improvement gives constant speedup (2-3x depending on your CPU) ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! In order to use SSE features you have to: ! * use commercial version of ALGLIB on Intel processors ! * use C++ computational core ! ! This note is given for users of commercial edition; if you use GPL ! edition, you still will be able to call smp-version of this function, ! but all computations will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. INPUT PARAMETERS: Network - network initialized with one of the network creation funcs XY - original dataset in sparse format; one sample = one row: * MATRIX MUST BE STORED IN CRS FORMAT * first NIn columns contain inputs. * for regression problem, next NOut columns store desired outputs. * for classification problem, next column (just one!) stores class number. SSize - number of elements in XY Grad - possibly preallocated array. If size of array is smaller than WCount, it will be reallocated. It is recommended to reuse previously allocated array to reduce allocation overhead. OUTPUT PARAMETERS: E - error function, SUM(sqr(y[i]-desiredy[i])/2,i) Grad - gradient of E with respect to weights of network, array[WCount] -- ALGLIB -- Copyright 26.07.2012 by Bochkanov Sergey *************************************************************************/
void alglib::mlpgradbatchsparse( multilayerperceptron network, sparsematrix xy, ae_int_t ssize, double& e, real_1d_array& grad); void alglib::smp_mlpgradbatchsparse( multilayerperceptron network, sparsematrix xy, ae_int_t ssize, double& e, real_1d_array& grad);
/************************************************************************* Batch gradient calculation for a set of inputs/outputs for a subset of dataset given by set of indexes. FOR USERS OF COMMERCIAL EDITION: ! Commercial version of ALGLIB includes two important improvements of ! this function: ! * multicore support (C++ and C# computational cores) ! * SSE support ! ! First improvement gives close-to-linear speedup on multicore systems. ! Second improvement gives constant speedup (2-3x depending on your CPU) ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! In order to use SSE features you have to: ! * use commercial version of ALGLIB on Intel processors ! * use C++ computational core ! ! This note is given for users of commercial edition; if you use GPL ! edition, you still will be able to call smp-version of this function, ! but all computations will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. INPUT PARAMETERS: Network - network initialized with one of the network creation funcs XY - original dataset in sparse format; one sample = one row: * MATRIX MUST BE STORED IN CRS FORMAT * first NIn columns contain inputs, * for regression problem, next NOut columns store desired outputs. * for classification problem, next column (just one!) stores class number. SetSize - real size of XY, SetSize>=0; Idx - subset of SubsetSize elements, array[SubsetSize]: * Idx[I] stores row index in the original dataset which is given by XY. Gradient is calculated with respect to rows whose indexes are stored in Idx[]. * Idx[] must store correct indexes; this function throws an exception in case incorrect index (less than 0 or larger than rows(XY)) is given * Idx[] may store indexes in any order and even with repetitions. SubsetSize- number of elements in Idx[] array: * positive value means that subset given by Idx[] is processed * zero value results in zero gradient * negative value means that full dataset is processed Grad - possibly preallocated array. If size of array is smaller than WCount, it will be reallocated. It is recommended to reuse previously allocated array to reduce allocation overhead. OUTPUT PARAMETERS: E - error function, SUM(sqr(y[i]-desiredy[i])/2,i) Grad - gradient of E with respect to weights of network, array[WCount] NOTE: when SubsetSize<0 is used, the full dataset is processed (this is equivalent to calling MLPGradBatchSparse). -- ALGLIB -- Copyright 26.07.2012 by Bochkanov Sergey *************************************************************************/
void alglib::mlpgradbatchsparsesubset( multilayerperceptron network, sparsematrix xy, ae_int_t setsize, integer_1d_array idx, ae_int_t subsetsize, double& e, real_1d_array& grad); void alglib::smp_mlpgradbatchsparsesubset( multilayerperceptron network, sparsematrix xy, ae_int_t setsize, integer_1d_array idx, ae_int_t subsetsize, double& e, real_1d_array& grad);
/************************************************************************* Batch gradient calculation for a subset of dataset FOR USERS OF COMMERCIAL EDITION: ! Commercial version of ALGLIB includes two important improvements of ! this function: ! * multicore support (C++ and C# computational cores) ! * SSE support ! ! First improvement gives close-to-linear speedup on multicore systems. ! Second improvement gives constant speedup (2-3x depending on your CPU) ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! In order to use SSE features you have to: ! * use commercial version of ALGLIB on Intel processors ! * use C++ computational core ! ! This note is given for users of commercial edition; if you use GPL ! edition, you still will be able to call smp-version of this function, ! but all computations will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. INPUT PARAMETERS: Network - network initialized with one of the network creation funcs XY - original dataset in dense format; one sample = one row: * first NIn columns contain inputs, * for regression problem, next NOut columns store desired outputs. * for classification problem, next column (just one!) stores class number. SetSize - real size of XY, SetSize>=0; Idx - subset of SubsetSize elements, array[SubsetSize]: * Idx[I] stores row index in the original dataset which is given by XY. Gradient is calculated with respect to rows whose indexes are stored in Idx[]. * Idx[] must store correct indexes; this function throws an exception in case incorrect index (less than 0 or larger than rows(XY)) is given * Idx[] may store indexes in any order and even with repetitions. SubsetSize- number of elements in Idx[] array: * positive value means that subset given by Idx[] is processed * zero value results in zero gradient * negative value means that full dataset is processed Grad - possibly preallocated array. If size of array is smaller than WCount, it will be reallocated. It is recommended to reuse previously allocated array to reduce allocation overhead. OUTPUT PARAMETERS: E - error function, SUM(sqr(y[i]-desiredy[i])/2,i) Grad - gradient of E with respect to weights of network, array[WCount] -- ALGLIB -- Copyright 26.07.2012 by Bochkanov Sergey *************************************************************************/
void alglib::mlpgradbatchsubset( multilayerperceptron network, real_2d_array xy, ae_int_t setsize, integer_1d_array idx, ae_int_t subsetsize, double& e, real_1d_array& grad); void alglib::smp_mlpgradbatchsubset( multilayerperceptron network, real_2d_array xy, ae_int_t setsize, integer_1d_array idx, ae_int_t subsetsize, double& e, real_1d_array& grad);
/************************************************************************* Gradient calculation (natural error function is used) INPUT PARAMETERS: Network - network initialized with one of the network creation funcs X - input vector, length of array must be at least NIn DesiredY- desired outputs, length of array must be at least NOut Grad - possibly preallocated array. If size of array is smaller than WCount, it will be reallocated. It is recommended to reuse previously allocated array to reduce allocation overhead. OUTPUT PARAMETERS: E - error function, sum-of-squares for regression networks, cross-entropy for classification networks. Grad - gradient of E with respect to weights of network, array[WCount] -- ALGLIB -- Copyright 04.11.2007 by Bochkanov Sergey *************************************************************************/
void alglib::mlpgradn( multilayerperceptron network, real_1d_array x, real_1d_array desiredy, double& e, real_1d_array& grad);
/************************************************************************* Batch gradient calculation for a set of inputs/outputs (natural error function is used) INPUT PARAMETERS: Network - network initialized with one of the network creation funcs XY - set of inputs/outputs; one sample = one row; first NIn columns contain inputs, next NOut columns - desired outputs. SSize - number of elements in XY Grad - possibly preallocated array. If size of array is smaller than WCount, it will be reallocated. It is recommended to reuse previously allocated array to reduce allocation overhead. OUTPUT PARAMETERS: E - error function, sum-of-squares for regression networks, cross-entropy for classification networks. Grad - gradient of E with respect to weights of network, array[WCount] -- ALGLIB -- Copyright 04.11.2007 by Bochkanov Sergey *************************************************************************/
void alglib::mlpgradnbatch( multilayerperceptron network, real_2d_array xy, ae_int_t ssize, double& e, real_1d_array& grad);
/************************************************************************* Batch Hessian calculation using R-algorithm. Internal subroutine. -- ALGLIB -- Copyright 26.01.2008 by Bochkanov Sergey. Hessian calculation based on R-algorithm described in "Fast Exact Multiplication by the Hessian", B. A. Pearlmutter, Neural Computation, 1994. *************************************************************************/
void alglib::mlphessianbatch( multilayerperceptron network, real_2d_array xy, ae_int_t ssize, double& e, real_1d_array& grad, real_2d_array& h);
/************************************************************************* Batch Hessian calculation (natural error function) using R-algorithm. Internal subroutine. -- ALGLIB -- Copyright 26.01.2008 by Bochkanov Sergey. Hessian calculation based on R-algorithm described in "Fast Exact Multiplication by the Hessian", B. A. Pearlmutter, Neural Computation, 1994. *************************************************************************/
void alglib::mlphessiannbatch( multilayerperceptron network, real_2d_array xy, ae_int_t ssize, double& e, real_1d_array& grad, real_2d_array& h);
/************************************************************************* Internal subroutine. -- ALGLIB -- Copyright 30.03.2008 by Bochkanov Sergey *************************************************************************/
void alglib::mlpinitpreprocessor( multilayerperceptron network, real_2d_array xy, ae_int_t ssize);
/************************************************************************* Tells whether network is SOFTMAX-normalized (i.e. classifier) or not. -- ALGLIB -- Copyright 04.11.2007 by Bochkanov Sergey *************************************************************************/
bool alglib::mlpissoftmax(multilayerperceptron network);
/************************************************************************* Processing INPUT PARAMETERS: Network - neural network X - input vector, array[0..NIn-1]. OUTPUT PARAMETERS: Y - result. Regression estimate when solving regression task, vector of posterior probabilities for classification task. See also MLPProcessI -- ALGLIB -- Copyright 04.11.2007 by Bochkanov Sergey *************************************************************************/
void alglib::mlpprocess( multilayerperceptron network, real_1d_array x, real_1d_array& y);
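A short sketch of a forward pass through a classifier network (the mlpcreatec0 architecture and input values are illustrative assumptions):

#include <cstdio>
#include "dataanalysis.h"
using namespace alglib;

int main()
{
    multilayerperceptron net;
    mlpcreatec0(2, 3, net);            // illustrative classifier: 2 inputs, 3 classes

    real_1d_array x = "[1.0, 2.0]";
    real_1d_array y;                   // receives the vector of posterior probabilities
    mlpprocess(net, x, y);
    printf("p0=%0.3f p1=%0.3f p2=%0.3f\n", double(y[0]), double(y[1]), double(y[2]));
    return 0;
}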
/************************************************************************* 'interactive' variant of MLPProcess for languages like Python which support constructs like "Y = MLPProcess(NN,X)" and interactive mode of the interpreter This function allocates new array on each call, so it is significantly slower than its 'non-interactive' counterpart, but it is more convenient when you call it from command line. -- ALGLIB -- Copyright 21.09.2010 by Bochkanov Sergey *************************************************************************/
void alglib::mlpprocessi( multilayerperceptron network, real_1d_array x, real_1d_array& y);
/************************************************************************* Returns information about initialized network: number of inputs, outputs, weights. -- ALGLIB -- Copyright 04.11.2007 by Bochkanov Sergey *************************************************************************/
void alglib::mlpproperties( multilayerperceptron network, ae_int_t& nin, ae_int_t& nout, ae_int_t& wcount);
/************************************************************************* Randomization of neural network weights -- ALGLIB -- Copyright 06.11.2007 by Bochkanov Sergey *************************************************************************/
void alglib::mlprandomize(multilayerperceptron network);
/************************************************************************* Randomization of neural network weights and standardizer -- ALGLIB -- Copyright 10.03.2008 by Bochkanov Sergey *************************************************************************/
void alglib::mlprandomizefull(multilayerperceptron network);
/************************************************************************* Relative classification error on the test set. FOR USERS OF COMMERCIAL EDITION: ! Commercial version of ALGLIB includes two important improvements of ! this function: ! * multicore support (C++ and C# computational cores) ! * SSE support ! ! First improvement gives close-to-linear speedup on multicore systems. ! Second improvement gives constant speedup (2-3x depending on your CPU) ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! In order to use SSE features you have to: ! * use commercial version of ALGLIB on Intel processors ! * use C++ computational core ! ! This note is given for users of commercial edition; if you use GPL ! edition, you still will be able to call smp-version of this function, ! but all computations will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. INPUT PARAMETERS: Network - neural network; XY - training set, see below for information on the training set format; NPoints - points count. RESULT: Percent of incorrectly classified cases. Works both for classifier networks and general purpose networks used as classifiers. DATASET FORMAT: This function uses two different dataset formats - one for regression networks, another one for classification networks. For regression networks with NIn inputs and NOut outputs following dataset format is used: * dataset is given by NPoints*(NIn+NOut) matrix * each row corresponds to one example * first NIn columns are inputs, next NOut columns are outputs For classification networks with NIn inputs and NClasses clases following dataset format is used: * dataset is given by NPoints*(NIn+1) matrix * each row corresponds to one example * first NIn columns are inputs, last column stores class number (from 0 to NClasses-1). -- ALGLIB -- Copyright 25.12.2008 by Bochkanov Sergey *************************************************************************/
double alglib::mlprelclserror( multilayerperceptron network, real_2d_array xy, ae_int_t npoints); double alglib::smp_mlprelclserror( multilayerperceptron network, real_2d_array xy, ae_int_t npoints);
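A minimal sketch of computing the classification error on a small dense test set (the network and data are toy assumptions; see DATASET FORMAT above for the required layout):

#include <cstdio>
#include "dataanalysis.h"
using namespace alglib;

int main()
{
    multilayerperceptron net;
    mlpcreatec0(2, 2, net);                      // illustrative classifier: 2 inputs, 2 classes

    // classification format: first NIn columns are inputs, last column is the class number
    real_2d_array xy = "[[0,0,0],[0,1,1],[1,0,1],[1,1,0]]";
    double err = mlprelclserror(net, xy, 4);     // error rate on the test set (see RESULT above)
    printf("classification error = %0.3f\n", err);
    return 0;
}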
/************************************************************************* Relative classification error on the test set given by sparse matrix. FOR USERS OF COMMERCIAL EDITION: ! Commercial version of ALGLIB includes two important improvements of ! this function: ! * multicore support (C++ and C# computational cores) ! * SSE support ! ! First improvement gives close-to-linear speedup on multicore systems. ! Second improvement gives constant speedup (2-3x depending on your CPU) ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! In order to use SSE features you have to: ! * use commercial version of ALGLIB on Intel processors ! * use C++ computational core ! ! This note is given for users of commercial edition; if you use GPL ! edition, you still will be able to call smp-version of this function, ! but all computations will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. INPUT PARAMETERS: Network - neural network; XY - training set, see below for information on the training set format. Sparse matrix must use CRS format for storage. NPoints - points count, >=0. RESULT: Percent of incorrectly classified cases. Works both for classifier networks and general purpose networks used as classifiers. DATASET FORMAT: This function uses two different dataset formats - one for regression networks, another one for classification networks. For regression networks with NIn inputs and NOut outputs following dataset format is used: * dataset is given by NPoints*(NIn+NOut) matrix * each row corresponds to one example * first NIn columns are inputs, next NOut columns are outputs For classification networks with NIn inputs and NClasses clases following dataset format is used: * dataset is given by NPoints*(NIn+1) matrix * each row corresponds to one example * first NIn columns are inputs, last column stores class number (from 0 to NClasses-1). -- ALGLIB -- Copyright 09.08.2012 by Bochkanov Sergey *************************************************************************/
double alglib::mlprelclserrorsparse( multilayerperceptron network, sparsematrix xy, ae_int_t npoints); double alglib::smp_mlprelclserrorsparse( multilayerperceptron network, sparsematrix xy, ae_int_t npoints);
/************************************************************************* RMS error on the test set given. FOR USERS OF COMMERCIAL EDITION: ! Commercial version of ALGLIB includes two important improvements of ! this function: ! * multicore support (C++ and C# computational cores) ! * SSE support ! ! First improvement gives close-to-linear speedup on multicore systems. ! Second improvement gives constant speedup (2-3x depending on your CPU) ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! In order to use SSE features you have to: ! * use commercial version of ALGLIB on Intel processors ! * use C++ computational core ! ! This note is given for users of commercial edition; if you use GPL ! edition, you still will be able to call smp-version of this function, ! but all computations will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. INPUT PARAMETERS: Network - neural network; XY - training set, see below for information on the training set format; NPoints - points count. RESULT: Root mean square error. Its meaning for regression task is obvious. As for classification task, RMS error means error when estimating posterior probabilities. DATASET FORMAT: This function uses two different dataset formats - one for regression networks, another one for classification networks. For regression networks with NIn inputs and NOut outputs following dataset format is used: * dataset is given by NPoints*(NIn+NOut) matrix * each row corresponds to one example * first NIn columns are inputs, next NOut columns are outputs For classification networks with NIn inputs and NClasses clases following dataset format is used: * dataset is given by NPoints*(NIn+1) matrix * each row corresponds to one example * first NIn columns are inputs, last column stores class number (from 0 to NClasses-1). -- ALGLIB -- Copyright 04.11.2007 by Bochkanov Sergey *************************************************************************/
double alglib::mlprmserror( multilayerperceptron network, real_2d_array xy, ae_int_t npoints); double alglib::smp_mlprmserror( multilayerperceptron network, real_2d_array xy, ae_int_t npoints);
/************************************************************************* RMS error on the test set given by sparse matrix. FOR USERS OF COMMERCIAL EDITION: ! Commercial version of ALGLIB includes two important improvements of ! this function: ! * multicore support (C++ and C# computational cores) ! * SSE support ! ! First improvement gives close-to-linear speedup on multicore systems. ! Second improvement gives constant speedup (2-3x depending on your CPU) ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! In order to use SSE features you have to: ! * use commercial version of ALGLIB on Intel processors ! * use C++ computational core ! ! This note is given for users of commercial edition; if you use GPL ! edition, you still will be able to call smp-version of this function, ! but all computations will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. INPUT PARAMETERS: Network - neural network; XY - training set, see below for information on the training set format. This function checks correctness of the dataset (no NANs/INFs, class numbers are correct) and throws exception when incorrect dataset is passed. Sparse matrix must use CRS format for storage. NPoints - points count, >=0. RESULT: Root mean square error. Its meaning for regression task is obvious. As for classification task, RMS error means error when estimating posterior probabilities. DATASET FORMAT: This function uses two different dataset formats - one for regression networks, another one for classification networks. For regression networks with NIn inputs and NOut outputs following dataset format is used: * dataset is given by NPoints*(NIn+NOut) matrix * each row corresponds to one example * first NIn columns are inputs, next NOut columns are outputs For classification networks with NIn inputs and NClasses clases following dataset format is used: * dataset is given by NPoints*(NIn+1) matrix * each row corresponds to one example * first NIn columns are inputs, last column stores class number (from 0 to NClasses-1). -- ALGLIB -- Copyright 09.08.2012 by Bochkanov Sergey *************************************************************************/
double alglib::mlprmserrorsparse( multilayerperceptron network, sparsematrix xy, ae_int_t npoints); double alglib::smp_mlprmserrorsparse( multilayerperceptron network, sparsematrix xy, ae_int_t npoints);
/************************************************************************* This function serializes data structure to string. Important properties of s_out: * it contains alphanumeric characters, dots, underscores, minus signs * these symbols are grouped into words, which are separated by spaces and Windows-style (CR+LF) newlines * although serializer uses spaces and CR+LF as separators, you can replace any separator character by arbitrary combination of spaces, tabs, Windows or Unix newlines. It allows flexible reformatting of the string in case you want to include it into text or XML file. But you should not insert separators into the middle of the "words" nor you should change case of letters. * s_out can be freely moved between 32-bit and 64-bit systems, little and big endian machines, and so on. You can serialize structure on 32-bit machine and unserialize it on 64-bit one (or vice versa), or serialize it on SPARC and unserialize on x86. You can also serialize it in C++ version of ALGLIB and unserialize in C# one, and vice versa. *************************************************************************/
void mlpserialize(multilayerperceptron &obj, std::string &s_out); void mlpserialize(multilayerperceptron &obj, std::ostream &s_out);
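A minimal round-trip sketch (the network architecture is illustrative): serialize a network to a string, then restore an identical copy with mlpunserialize.

#include <string>
#include "dataanalysis.h"
using namespace alglib;

int main()
{
    multilayerperceptron net, restored;
    mlpcreate1(3, 4, 2, net);       // illustrative 3-4-2 network

    std::string s;
    mlpserialize(net, s);           // s is portable across platforms and ALGLIB language versions
    mlpunserialize(s, restored);    // restored now has the same architecture and weights as net
    return 0;
}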
/************************************************************************* This function sets offset/scaling coefficients for I-th input of the network. INPUT PARAMETERS: Network - network I - input index Mean - mean term Sigma - sigma term (if zero, will be replaced by 1.0) NOTE: I-th input is passed through linear transformation IN[i] = (IN[i]-Mean)/Sigma before feeding to the network. This function sets Mean and Sigma. -- ALGLIB -- Copyright 25.03.2011 by Bochkanov Sergey *************************************************************************/
void alglib::mlpsetinputscaling( multilayerperceptron network, ae_int_t i, double mean, double sigma);
/************************************************************************* This function modifies information about I-th neuron of K-th layer INPUT PARAMETERS: Network - network K - layer index I - neuron index (within layer) FKind - activation function type (used by MLPActivationFunction()) this value must be zero for input neurons (you cannot set an activation function for input neurons) Threshold - also called offset, bias this value must be zero for input neurons (you cannot set a threshold for input neurons) NOTES: 1. this function throws an exception if a layer or neuron with the given index does not exist. 2. this function also throws an exception when you try to set a non-linear activation function for input neurons (any kind of network) or for output neurons of a classifier network. 3. this function throws an exception when you try to set a non-zero threshold for input neurons (any kind of network). -- ALGLIB -- Copyright 25.03.2011 by Bochkanov Sergey *************************************************************************/
void alglib::mlpsetneuroninfo( multilayerperceptron network, ae_int_t k, ae_int_t i, ae_int_t fkind, double threshold);
/************************************************************************* This function sets offset/scaling coefficients for I-th output of the network. INPUT PARAMETERS: Network - network I - output index Mean - mean term Sigma - sigma term (if zero, will be replaced by 1.0) NOTE: I-th output is passed through linear transformation OUT[i] = OUT[i]*Sigma+Mean before returning it to user. This function sets Sigma/Mean. In case we have a SOFTMAX-normalized network, you cannot set (Sigma,Mean) to anything other than (0.0,1.0) - this function will throw an exception. -- ALGLIB -- Copyright 25.03.2011 by Bochkanov Sergey *************************************************************************/
void alglib::mlpsetoutputscaling( multilayerperceptron network, ae_int_t i, double mean, double sigma);
/************************************************************************* This function modifies information about connection from I0-th neuron of K0-th layer to I1-th neuron of K1-th layer. INPUT PARAMETERS: Network - network K0 - layer index I0 - neuron index (within layer) K1 - layer index I1 - neuron index (within layer) W - connection weight (must be zero for non-existent connections) This function: 1. throws an exception if a layer or neuron with the given index does not exist. 2. throws an exception if you try to set a non-zero weight for a non-existent connection -- ALGLIB -- Copyright 25.03.2011 by Bochkanov Sergey *************************************************************************/
void alglib::mlpsetweight( multilayerperceptron network, ae_int_t k0, ae_int_t i0, ae_int_t k1, ae_int_t i1, double w);
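A small sketch of reading a weight back after setting it (it assumes the usual convention that layer 0 is the input layer and layer 1 the first hidden layer; the 2-5-1 network is an illustrative assumption):

#include <cstdio>
#include "dataanalysis.h"
using namespace alglib;

int main()
{
    multilayerperceptron net;
    mlpcreate1(2, 5, 1, net);              // illustrative 2-5-1 network

    // connection from neuron 0 of layer 0 (input) to neuron 0 of layer 1 (hidden);
    // setting a non-zero weight for a non-existent connection would throw an exception
    mlpsetweight(net, 0, 0, 1, 0, 0.5);
    double w = mlpgetweight(net, 0, 0, 1, 0);
    printf("w = %0.2f\n", w);              // prints 0.50
    return 0;
}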
/************************************************************************* This function unserializes data structure from string. *************************************************************************/
void mlpunserialize(const std::string &s_in, multilayerperceptron &obj); void mlpunserialize(const std::istream &s_in, multilayerperceptron &obj);
mlpensemble
mlpeavgce
mlpeavgerror
mlpeavgrelerror
mlpecreate0
mlpecreate1
mlpecreate2
mlpecreateb0
mlpecreateb1
mlpecreateb2
mlpecreatec0
mlpecreatec1
mlpecreatec2
mlpecreatefromnetwork
mlpecreater0
mlpecreater1
mlpecreater2
mlpeissoftmax
mlpeprocess
mlpeprocessi
mlpeproperties
mlperandomize
mlperelclserror
mlpermserror
mlpeserialize
mlpeunserialize
/************************************************************************* Neural networks ensemble *************************************************************************/
class mlpensemble { };
/************************************************************************* Average cross-entropy (in bits per element) on the test set INPUT PARAMETERS: Ensemble- ensemble XY - test set NPoints - test set size RESULT: CrossEntropy/(NPoints*LN(2)). Zero if ensemble solves regression task. -- ALGLIB -- Copyright 17.02.2009 by Bochkanov Sergey *************************************************************************/
double alglib::mlpeavgce( mlpensemble ensemble, real_2d_array xy, ae_int_t npoints);
/************************************************************************* Average error on the test set INPUT PARAMETERS: Ensemble- ensemble XY - test set NPoints - test set size RESULT: Its meaning for regression task is obvious. As for classification task it means average error when estimating posterior probabilities. -- ALGLIB -- Copyright 17.02.2009 by Bochkanov Sergey *************************************************************************/
double alglib::mlpeavgerror( mlpensemble ensemble, real_2d_array xy, ae_int_t npoints);
/************************************************************************* Average relative error on the test set INPUT PARAMETERS: Ensemble- ensemble XY - test set NPoints - test set size RESULT: Its meaning for regression task is obvious. As for classification task it means average relative error when estimating posterior probabilities. -- ALGLIB -- Copyright 17.02.2009 by Bochkanov Sergey *************************************************************************/
double alglib::mlpeavgrelerror( mlpensemble ensemble, real_2d_array xy, ae_int_t npoints);
/************************************************************************* Like MLPCreate0, but for ensembles. -- ALGLIB -- Copyright 18.02.2009 by Bochkanov Sergey *************************************************************************/
void alglib::mlpecreate0( ae_int_t nin, ae_int_t nout, ae_int_t ensemblesize, mlpensemble& ensemble);
/************************************************************************* Like MLPCreate1, but for ensembles. -- ALGLIB -- Copyright 18.02.2009 by Bochkanov Sergey *************************************************************************/
void alglib::mlpecreate1( ae_int_t nin, ae_int_t nhid, ae_int_t nout, ae_int_t ensemblesize, mlpensemble& ensemble);
/************************************************************************* Like MLPCreate2, but for ensembles. -- ALGLIB -- Copyright 18.02.2009 by Bochkanov Sergey *************************************************************************/
void alglib::mlpecreate2( ae_int_t nin, ae_int_t nhid1, ae_int_t nhid2, ae_int_t nout, ae_int_t ensemblesize, mlpensemble& ensemble);
/************************************************************************* Like MLPCreateB0, but for ensembles. -- ALGLIB -- Copyright 18.02.2009 by Bochkanov Sergey *************************************************************************/
void alglib::mlpecreateb0( ae_int_t nin, ae_int_t nout, double b, double d, ae_int_t ensemblesize, mlpensemble& ensemble);
/************************************************************************* Like MLPCreateB1, but for ensembles. -- ALGLIB -- Copyright 18.02.2009 by Bochkanov Sergey *************************************************************************/
void alglib::mlpecreateb1( ae_int_t nin, ae_int_t nhid, ae_int_t nout, double b, double d, ae_int_t ensemblesize, mlpensemble& ensemble);
/************************************************************************* Like MLPCreateB2, but for ensembles. -- ALGLIB -- Copyright 18.02.2009 by Bochkanov Sergey *************************************************************************/
void alglib::mlpecreateb2( ae_int_t nin, ae_int_t nhid1, ae_int_t nhid2, ae_int_t nout, double b, double d, ae_int_t ensemblesize, mlpensemble& ensemble);
/************************************************************************* Like MLPCreateC0, but for ensembles. -- ALGLIB -- Copyright 18.02.2009 by Bochkanov Sergey *************************************************************************/
void alglib::mlpecreatec0( ae_int_t nin, ae_int_t nout, ae_int_t ensemblesize, mlpensemble& ensemble);
/************************************************************************* Like MLPCreateC1, but for ensembles. -- ALGLIB -- Copyright 18.02.2009 by Bochkanov Sergey *************************************************************************/
void alglib::mlpecreatec1( ae_int_t nin, ae_int_t nhid, ae_int_t nout, ae_int_t ensemblesize, mlpensemble& ensemble);
/************************************************************************* Like MLPCreateC2, but for ensembles. -- ALGLIB -- Copyright 18.02.2009 by Bochkanov Sergey *************************************************************************/
void alglib::mlpecreatec2( ae_int_t nin, ae_int_t nhid1, ae_int_t nhid2, ae_int_t nout, ae_int_t ensemblesize, mlpensemble& ensemble);
/************************************************************************* Creates ensemble from network. Only network geometry is copied. -- ALGLIB -- Copyright 17.02.2009 by Bochkanov Sergey *************************************************************************/
void alglib::mlpecreatefromnetwork( multilayerperceptron network, ae_int_t ensemblesize, mlpensemble& ensemble);
/************************************************************************* Like MLPCreateR0, but for ensembles. -- ALGLIB -- Copyright 18.02.2009 by Bochkanov Sergey *************************************************************************/
void alglib::mlpecreater0( ae_int_t nin, ae_int_t nout, double a, double b, ae_int_t ensemblesize, mlpensemble& ensemble);
/************************************************************************* Like MLPCreateR1, but for ensembles. -- ALGLIB -- Copyright 18.02.2009 by Bochkanov Sergey *************************************************************************/
void alglib::mlpecreater1( ae_int_t nin, ae_int_t nhid, ae_int_t nout, double a, double b, ae_int_t ensemblesize, mlpensemble& ensemble);
/************************************************************************* Like MLPCreateR2, but for ensembles. -- ALGLIB -- Copyright 18.02.2009 by Bochkanov Sergey *************************************************************************/
void alglib::mlpecreater2( ae_int_t nin, ae_int_t nhid1, ae_int_t nhid2, ae_int_t nout, double a, double b, ae_int_t ensemblesize, mlpensemble& ensemble);
/************************************************************************* Return normalization type (whether ensemble is SOFTMAX-normalized or not). -- ALGLIB -- Copyright 17.02.2009 by Bochkanov Sergey *************************************************************************/
bool alglib::mlpeissoftmax(mlpensemble ensemble);
/************************************************************************* Processing INPUT PARAMETERS: Ensemble- neural networks ensemble X - input vector, array[0..NIn-1]. Y - (possibly) preallocated buffer; if size of Y is less than NOut, it will be reallocated. If it is large enough, it is NOT reallocated, so we can save some time on reallocation. OUTPUT PARAMETERS: Y - result. Regression estimate when solving regression task, vector of posterior probabilities for classification task. -- ALGLIB -- Copyright 17.02.2009 by Bochkanov Sergey *************************************************************************/
void alglib::mlpeprocess( mlpensemble ensemble, real_1d_array x, real_1d_array& y);
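A short sketch of building and evaluating an ensemble (ensemble size and member architecture are illustrative assumptions):

#include <cstdio>
#include "dataanalysis.h"
using namespace alglib;

int main()
{
    mlpensemble ens;
    mlpecreate1(2, 5, 1, 10, ens);     // illustrative ensemble: 10 networks, each 2-5-1

    real_1d_array x = "[0.1, 0.9]";
    real_1d_array y;                   // combined output of the ensemble members
    mlpeprocess(ens, x, y);
    printf("y = %0.4f\n", double(y[0]));
    return 0;
}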
/************************************************************************* 'interactive' variant of MLPEProcess for languages like Python which support constructs like "Y = MLPEProcess(LM,X)" and interactive mode of the interpreter This function allocates new array on each call, so it is significantly slower than its 'non-interactive' counterpart, but it is more convenient when you call it from command line. -- ALGLIB -- Copyright 17.02.2009 by Bochkanov Sergey *************************************************************************/
void alglib::mlpeprocessi( mlpensemble ensemble, real_1d_array x, real_1d_array& y);
/************************************************************************* Return ensemble properties (number of inputs and outputs). -- ALGLIB -- Copyright 17.02.2009 by Bochkanov Sergey *************************************************************************/
void alglib::mlpeproperties( mlpensemble ensemble, ae_int_t& nin, ae_int_t& nout);
/************************************************************************* Randomization of MLP ensemble -- ALGLIB -- Copyright 17.02.2009 by Bochkanov Sergey *************************************************************************/
void alglib::mlperandomize(mlpensemble ensemble);
/************************************************************************* Relative classification error on the test set INPUT PARAMETERS: Ensemble- ensemble XY - test set NPoints - test set size RESULT: percent of incorrectly classified cases. Works both for classifier networks and for regression networks which are used as classifiers. -- ALGLIB -- Copyright 17.02.2009 by Bochkanov Sergey *************************************************************************/
double alglib::mlperelclserror( mlpensemble ensemble, real_2d_array xy, ae_int_t npoints);
/************************************************************************* RMS error on the test set INPUT PARAMETERS: Ensemble- ensemble XY - test set NPoints - test set size RESULT: root mean square error. Its meaning for regression task is obvious. As for classification task RMS error means error when estimating posterior probabilities. -- ALGLIB -- Copyright 17.02.2009 by Bochkanov Sergey *************************************************************************/
double alglib::mlpermserror( mlpensemble ensemble, real_2d_array xy, ae_int_t npoints);
/************************************************************************* This function serializes data structure to string. Important properties of s_out: * it contains alphanumeric characters, dots, underscores, minus signs * these symbols are grouped into words, which are separated by spaces and Windows-style (CR+LF) newlines * although serializer uses spaces and CR+LF as separators, you can replace any separator character by arbitrary combination of spaces, tabs, Windows or Unix newlines. It allows flexible reformatting of the string in case you want to include it into text or XML file. But you should not insert separators into the middle of the "words" nor you should change case of letters. * s_out can be freely moved between 32-bit and 64-bit systems, little and big endian machines, and so on. You can serialize structure on 32-bit machine and unserialize it on 64-bit one (or vice versa), or serialize it on SPARC and unserialize on x86. You can also serialize it in C++ version of ALGLIB and unserialize in C# one, and vice versa. *************************************************************************/
void mlpeserialize(mlpensemble &obj, std::string &s_out); void mlpeserialize(mlpensemble &obj, std::ostream &s_out);
/************************************************************************* This function unserializes data structure from string. *************************************************************************/
void mlpeunserialize(const std::string &s_in, mlpensemble &obj); void mlpeunserialize(const std::istream &s_in, mlpensemble &obj);
mlpcvreport
mlpreport
mlptrainer
mlpcontinuetraining
mlpcreatetrainer
mlpcreatetrainercls
mlpebagginglbfgs
mlpebagginglm
mlpetraines
mlpkfoldcv
mlpkfoldcvlbfgs
mlpkfoldcvlm
mlpsetalgobatch
mlpsetcond
mlpsetdataset
mlpsetdecay
mlpsetsparsedataset
mlpstarttraining
mlptrainensemblees
mlptraines
mlptrainlbfgs
mlptrainlm
mlptrainnetwork
nn_cls2 Binary classification problem
nn_cls3 Multiclass classification problem
nn_crossvalidation Cross-validation
nn_ensembles_es Early stopping ensembles
nn_parallel Parallel training
nn_regr Regression problem with one output (2=>1)
nn_regr_n Regression problem with multiple outputs (2=>2)
nn_trainerobject Advanced example on trainer object
/************************************************************************* Cross-validation estimates of generalization error *************************************************************************/
class mlpcvreport { double relclserror; double avgce; double rmserror; double avgerror; double avgrelerror; };
/************************************************************************* Training report: * RelCLSError - fraction of misclassified cases. * AvgCE - average cross-entropy * RMSError - root-mean-square error * AvgError - average error * AvgRelError - average relative error * NGrad - number of gradient calculations * NHess - number of Hessian calculations * NCholesky - number of Cholesky decompositions NOTE 1: RelCLSError/AvgCE are zero on regression problems. NOTE 2: on classification problems RMSError/AvgError/AvgRelError contain errors in prediction of posterior probabilities *************************************************************************/
class mlpreport { double relclserror; double avgce; double rmserror; double avgerror; double avgrelerror; ae_int_t ngrad; ae_int_t nhess; ae_int_t ncholesky; };
/************************************************************************* Trainer object for neural network. You should not try to access fields of this object directly - use ALGLIB functions to work with this object. *************************************************************************/
class mlptrainer { };
/************************************************************************* IMPORTANT: this is an "expert" version of the MLPTrain() function. We do not recommend you to use it unless you are pretty sure that you need ability to monitor training progress. FOR USERS OF COMMERCIAL EDITION: ! Commercial version of ALGLIB includes two important improvements of ! this function: ! * multicore support (C++ and C# computational cores) ! * SSE support (C++ computational core) ! ! Second improvement gives constant speedup (2-3X). First improvement ! gives close-to-linear speedup on multicore systems. Following ! operations can be executed in parallel: ! * gradient calculation over large dataset (if dataset is large enough) ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! In order to use SSE features you have to: ! * use commercial version of ALGLIB on Intel processors ! * use C++ computational core ! ! This note is given for users of commercial edition; if you use GPL ! edition, you still will be able to call smp-version of this function, ! but all computations will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. This function performs step-by-step training of the neural network. Here "step-by-step" means that training starts with MLPStartTraining() call, and then user subsequently calls MLPContinueTraining() to perform one more iteration of the training. This function performs one more iteration of the training and returns either True (training continues) or False (training stopped). In case True was returned, Network weights are updated according to the current state of the optimization progress. In case False was returned, no additional updates is performed (previous update of the network weights moved us to the final point, and no additional updates is needed). EXAMPLE: > > [initialize network and trainer object] > > MLPStartTraining(Trainer, Network, True) > while MLPContinueTraining(Trainer, Network) do > [visualize training progress] > INPUT PARAMETERS: S - trainer object Network - neural network structure, which is used to store current state of the training process. OUTPUT PARAMETERS: Network - weights of the neural network are rewritten by the current approximation. NOTE: this method uses sum-of-squares error function for training. NOTE: it is expected that trainer object settings are NOT changed during step-by-step training, i.e. no one changes stopping criteria or training set during training. It is possible and there is no defense against such actions, but algorithm behavior in such cases is undefined and can be unpredictable. NOTE: It is expected that Network is the same one which was passed to MLPStartTraining() function. However, THIS function checks only following: * that number of network inputs is consistent with trainer object settings * that number of network outputs/classes is consistent with trainer object settings * that number of network weights is the same as number of weights in the network passed to MLPStartTraining() function Exception is thrown when these conditions are violated. It is also expected that you do not change state of the network on your own - the only party who has right to change network during its training is a trainer object. 
Any attempt to interfere with trainer may lead to unpredictable results. -- ALGLIB -- Copyright 23.07.2012 by Bochkanov Sergey *************************************************************************/
bool alglib::mlpcontinuetraining( mlptrainer s, multilayerperceptron network); bool alglib::smp_mlpcontinuetraining( mlptrainer s, multilayerperceptron network);
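A C++ rendering of the pseudocode above (the trainer setup, toy dataset and network architecture are illustrative assumptions):

#include "dataanalysis.h"
using namespace alglib;

int main()
{
    // illustrative regression problem: 2 inputs, 1 output, toy dataset
    mlptrainer trn;
    mlpcreatetrainer(2, 1, trn);
    real_2d_array xy = "[[0,0,0],[0,1,1],[1,0,1],[1,1,2]]";
    mlpsetdataset(trn, xy, 4);

    multilayerperceptron net;
    mlpcreate1(2, 5, 1, net);

    mlpstarttraining(trn, net, true);          // true = start from random weights
    while( mlpcontinuetraining(trn, net) )
    {
        // each iteration updates the network weights; visualize progress here
    }
    return 0;
}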
/************************************************************************* Creation of the network trainer object for regression networks INPUT PARAMETERS: NIn - number of inputs, NIn>=1 NOut - number of outputs, NOut>=1 OUTPUT PARAMETERS: S - neural network trainer object. This structure can be used to train any regression network with NIn inputs and NOut outputs. -- ALGLIB -- Copyright 23.07.2012 by Bochkanov Sergey *************************************************************************/
void alglib::mlpcreatetrainer(ae_int_t nin, ae_int_t nout, mlptrainer& s);

Examples:   [1]  [2]  [3]  [4]  [5]  [6]  
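A compact end-to-end sketch of the trainer workflow for a regression network (toy dataset and settings are illustrative assumptions; MLPSetDecay, MLPSetCond and MLPTrainNetwork are documented below in this section):

#include <cstdio>
#include "dataanalysis.h"
using namespace alglib;

int main()
{
    mlptrainer trn;
    mlpcreatetrainer(2, 1, trn);                     // regression trainer: 2 inputs, 1 output

    real_2d_array xy = "[[0,0,0],[0,1,1],[1,0,1],[1,1,2]]";
    mlpsetdataset(trn, xy, 4);
    mlpsetdecay(trn, 0.001);                         // weight decay coefficient
    mlpsetcond(trn, 0.01, 100);                      // stop when step < 0.01 or after 100 epochs

    multilayerperceptron net;
    mlpcreate1(2, 5, 1, net);

    mlpreport rep;
    mlptrainnetwork(trn, net, 3, rep);               // 3 random restarts, best network is kept
    printf("rms error = %0.6f\n", double(rep.rmserror));
    return 0;
}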

/************************************************************************* Creation of the network trainer object for classification networks INPUT PARAMETERS: NIn - number of inputs, NIn>=1 NClasses - number of classes, NClasses>=2 OUTPUT PARAMETERS: S - neural network trainer object. This structure can be used to train any classification network with NIn inputs and NClasses outputs. -- ALGLIB -- Copyright 23.07.2012 by Bochkanov Sergey *************************************************************************/
void alglib::mlpcreatetrainercls( ae_int_t nin, ae_int_t nclasses, mlptrainer& s);

Examples:   [1]  [2]  

/************************************************************************* Training neural networks ensemble using bootstrap aggregating (bagging). L-BFGS algorithm is used as base training method. INPUT PARAMETERS: Ensemble - model with initialized geometry XY - training set NPoints - training set size Decay - weight decay coefficient, >=0.001 Restarts - restarts, >0. WStep - stopping criterion, same as in MLPTrainLBFGS MaxIts - stopping criterion, same as in MLPTrainLBFGS OUTPUT PARAMETERS: Ensemble - trained model Info - return code: * -8, if both WStep=0 and MaxIts=0 * -2, if there is a point with class number outside of [0..NClasses-1]. * -1, if incorrect parameters were passed (NPoints<0, Restarts<1). * 2, if task has been solved. Rep - training report. OOBErrors - out-of-bag generalization error estimate -- ALGLIB -- Copyright 17.02.2009 by Bochkanov Sergey *************************************************************************/
void alglib::mlpebagginglbfgs( mlpensemble ensemble, real_2d_array xy, ae_int_t npoints, double decay, ae_int_t restarts, double wstep, ae_int_t maxits, ae_int_t& info, mlpreport& rep, mlpcvreport& ooberrors);
/************************************************************************* Training neural networks ensemble using bootstrap aggregating (bagging). Modified Levenberg-Marquardt algorithm is used as base training method. INPUT PARAMETERS: Ensemble - model with initialized geometry XY - training set NPoints - training set size Decay - weight decay coefficient, >=0.001 Restarts - restarts, >0. OUTPUT PARAMETERS: Ensemble - trained model Info - return code: * -2, if there is a point with class number outside of [0..NClasses-1]. * -1, if incorrect parameters were passed (NPoints<0, Restarts<1). * 2, if task has been solved. Rep - training report. OOBErrors - out-of-bag generalization error estimate -- ALGLIB -- Copyright 17.02.2009 by Bochkanov Sergey *************************************************************************/
void alglib::mlpebagginglm( mlpensemble ensemble, real_2d_array xy, ae_int_t npoints, double decay, ae_int_t restarts, ae_int_t& info, mlpreport& rep, mlpcvreport& ooberrors);
/************************************************************************* Training neural networks ensemble using early stopping. INPUT PARAMETERS: Ensemble - model with initialized geometry XY - training set NPoints - training set size Decay - weight decay coefficient, >=0.001 Restarts - restarts, >0. OUTPUT PARAMETERS: Ensemble - trained model Info - return code: * -2, if there is a point with class number outside of [0..NClasses-1]. * -1, if incorrect parameters were passed (NPoints<0, Restarts<1). * 6, if task has been solved. Rep - training report. -- ALGLIB -- Copyright 10.03.2009 by Bochkanov Sergey *************************************************************************/
void alglib::mlpetraines( mlpensemble ensemble, real_2d_array xy, ae_int_t npoints, double decay, ae_int_t restarts, ae_int_t& info, mlpreport& rep);
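A hedged sketch of early-stopping ensemble training (the toy dataset is illustrative only; in practice the dataset should be reasonably large, since the early-stopping algorithm sets part of it aside for validation):

#include <cstdio>
#include "dataanalysis.h"
using namespace alglib;

int main()
{
    mlpensemble ens;
    mlpecreate1(2, 5, 1, 20, ens);      // illustrative ensemble of 20 networks, each 2-5-1

    real_2d_array xy = "[[0,0,0],[0,1,1],[1,0,1],[1,1,2],[0.5,0.5,1],[0.2,0.8,1]]";
    ae_int_t info;
    mlpreport rep;
    mlpetraines(ens, xy, 6, 0.001, 3, info, rep);   // Decay=0.001, 3 restarts
    printf("info = %d (6 means the task has been solved)\n", int(info));
    return 0;
}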
/************************************************************************* This function estimates generalization error using cross-validation on the current dataset with current training settings. FOR USERS OF COMMERCIAL EDITION: ! Commercial version of ALGLIB includes two important improvements of ! this function: ! * multicore support (C++ and C# computational cores) ! * SSE support (C++ computational core) ! ! Second improvement gives constant speedup (2-3X). First improvement ! gives close-to-linear speedup on multicore systems. Following ! operations can be executed in parallel: ! * FoldsCount cross-validation rounds (always) ! * NRestarts training sessions performed within each of ! cross-validation rounds (if NRestarts>1) ! * gradient calculation over large dataset (if dataset is large enough) ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! In order to use SSE features you have to: ! * use commercial version of ALGLIB on Intel processors ! * use C++ computational core ! ! This note is given for users of commercial edition; if you use GPL ! edition, you still will be able to call smp-version of this function, ! but all computations will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. INPUT PARAMETERS: S - trainer object Network - neural network. It must have same number of inputs and output/classes as was specified during creation of the trainer object. Network is not changed during cross- validation and is not trained - it is used only as representative of its architecture. I.e., we estimate generalization properties of ARCHITECTURE, not some specific network. NRestarts - number of restarts, >=0: * NRestarts>0 means that for each cross-validation round specified number of random restarts is performed, with best network being chosen after training. * NRestarts=0 is same as NRestarts=1 FoldsCount - number of folds in k-fold cross-validation: * 2<=FoldsCount<=size of dataset * recommended value: 10. * values larger than dataset size will be silently truncated down to dataset size OUTPUT PARAMETERS: Rep - structure which contains cross-validation estimates: * Rep.RelCLSError - fraction of misclassified cases. * Rep.AvgCE - acerage cross-entropy * Rep.RMSError - root-mean-square error * Rep.AvgError - average error * Rep.AvgRelError - average relative error NOTE: when no dataset was specified with MLPSetDataset/SetSparseDataset(), or subset with only one point was given, zeros are returned as estimates. NOTE: this method performs FoldsCount cross-validation rounds, each one with NRestarts random starts. Thus, FoldsCount*NRestarts networks are trained in total. NOTE: Rep.RelCLSError/Rep.AvgCE are zero on regression problems. NOTE: on classification problems Rep.RMSError/Rep.AvgError/Rep.AvgRelError contain errors in prediction of posterior probabilities. -- ALGLIB -- Copyright 23.07.2012 by Bochkanov Sergey *************************************************************************/
void alglib::mlpkfoldcv( mlptrainer s, multilayerperceptron network, ae_int_t nrestarts, ae_int_t foldscount, mlpreport& rep); void alglib::smp_mlpkfoldcv( mlptrainer s, multilayerperceptron network, ae_int_t nrestarts, ae_int_t foldscount, mlpreport& rep);

Examples:   [1]  [2]  
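A minimal sketch of a k-fold cross-validation run (the classification trainer, toy dataset and fold count are illustrative assumptions):

#include <cstdio>
#include "dataanalysis.h"
using namespace alglib;

int main()
{
    mlptrainer trn;
    mlpcreatetrainercls(2, 2, trn);                  // classification trainer: 2 inputs, 2 classes
    real_2d_array xy = "[[0,0,0],[0,1,1],[1,0,1],[1,1,0],[0.2,0.3,0],[0.8,0.7,1]]";
    mlpsetdataset(trn, xy, 6);

    multilayerperceptron net;
    mlpcreatec1(2, 5, 2, net);                       // used only as a representative of its architecture

    mlpreport rep;
    mlpkfoldcv(trn, net, 2, 3, rep);                 // 2 restarts per round, 3 folds
    printf("CV relative classification error = %0.3f\n", double(rep.relclserror));
    return 0;
}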

/************************************************************************* Cross-validation estimate of generalization error. Base algorithm - L-BFGS. INPUT PARAMETERS: Network - neural network with initialized geometry. Network is not changed during cross-validation - it is used only as a representative of its architecture. XY - training set. SSize - training set size Decay - weight decay, same as in MLPTrainLBFGS Restarts - number of restarts, >0. restarts are counted for each partition separately, so total number of restarts will be Restarts*FoldsCount. WStep - stopping criterion, same as in MLPTrainLBFGS MaxIts - stopping criterion, same as in MLPTrainLBFGS FoldsCount - number of folds in k-fold cross-validation, 2<=FoldsCount<=SSize. recommended value: 10. OUTPUT PARAMETERS: Info - return code, same as in MLPTrainLBFGS Rep - report, same as in MLPTrainLM/MLPTrainLBFGS CVRep - generalization error estimates -- ALGLIB -- Copyright 09.12.2007 by Bochkanov Sergey *************************************************************************/
void alglib::mlpkfoldcvlbfgs( multilayerperceptron network, real_2d_array xy, ae_int_t npoints, double decay, ae_int_t restarts, double wstep, ae_int_t maxits, ae_int_t foldscount, ae_int_t& info, mlpreport& rep, mlpcvreport& cvrep);
/************************************************************************* Cross-validation estimate of generalization error. Base algorithm - Levenberg-Marquardt. INPUT PARAMETERS: Network - neural network with initialized geometry. Network is not changed during cross-validation - it is used only as a representative of its architecture. XY - training set. SSize - training set size Decay - weight decay, same as in MLPTrainLBFGS Restarts - number of restarts, >0. restarts are counted for each partition separately, so total number of restarts will be Restarts*FoldsCount. FoldsCount - number of folds in k-fold cross-validation, 2<=FoldsCount<=SSize. recommended value: 10. OUTPUT PARAMETERS: Info - return code, same as in MLPTrainLBFGS Rep - report, same as in MLPTrainLM/MLPTrainLBFGS CVRep - generalization error estimates -- ALGLIB -- Copyright 09.12.2007 by Bochkanov Sergey *************************************************************************/
void alglib::mlpkfoldcvlm( multilayerperceptron network, real_2d_array xy, ae_int_t npoints, double decay, ae_int_t restarts, ae_int_t foldscount, ae_int_t& info, mlpreport& rep, mlpcvreport& cvrep);
/************************************************************************* This function sets training algorithm: batch training using L-BFGS will be used. This algorithm: * is the most robust for small-scale problems, but may be too slow for large-scale ones * performs a full pass through the dataset before performing a step * uses conditions specified by MLPSetCond() for stopping * is the default one used by the trainer object INPUT PARAMETERS: S - trainer object -- ALGLIB -- Copyright 23.07.2012 by Bochkanov Sergey *************************************************************************/
void alglib::mlpsetalgobatch(mlptrainer s);
/************************************************************************* This function sets stopping criteria for the optimizer. INPUT PARAMETERS: S - trainer object WStep - stopping criterion. Algorithm stops if step size is less than WStep. Recommended value - 0.01. Zero step size means stopping after MaxIts iterations. WStep>=0. MaxIts - stopping criterion. Algorithm stops after MaxIts epochs (full passes over the entire dataset). Zero MaxIts means stopping when step is sufficiently small. MaxIts>=0. NOTE: by default, WStep=0.005 and MaxIts=0 are used. These values are also used when MLPSetCond() is called with WStep=0 and MaxIts=0. NOTE: these stopping criteria are used for all kinds of neural training - from "conventional" networks to early stopping ensembles. When used for "conventional" networks, they are used as the only stopping criteria. When combined with early stopping, they are used as ADDITIONAL stopping criteria which can terminate the early stopping algorithm. -- ALGLIB -- Copyright 23.07.2012 by Bochkanov Sergey *************************************************************************/
void alglib::mlpsetcond(mlptrainer s, double wstep, ae_int_t maxits);
/************************************************************************* This function sets the "current dataset" of the trainer object to the one passed by the user. INPUT PARAMETERS: S - trainer object XY - training set, see below for information on the training set format. This function checks correctness of the dataset (no NANs/INFs, class numbers are correct) and throws an exception when an incorrect dataset is passed. NPoints - points count, >=0. DATASET FORMAT: This function uses two different dataset formats - one for regression networks, another one for classification networks. For regression networks with NIn inputs and NOut outputs the following dataset format is used: * dataset is given by an NPoints*(NIn+NOut) matrix * each row corresponds to one example * first NIn columns are inputs, next NOut columns are outputs For classification networks with NIn inputs and NClasses classes the following dataset format is used: * dataset is given by an NPoints*(NIn+1) matrix * each row corresponds to one example * first NIn columns are inputs, last column stores class number (from 0 to NClasses-1). -- ALGLIB -- Copyright 23.07.2012 by Bochkanov Sergey *************************************************************************/
void alglib::mlpsetdataset( mlptrainer s, real_2d_array xy, ae_int_t npoints);

Examples:   [1]  [2]  [3]  [4]  [5]  [6]  [7]  [8]  

/************************************************************************* This function sets weight decay coefficient which is used for training. INPUT PARAMETERS: S - trainer object Decay - weight decay coefficient, >=0. Weight decay term 'Decay*||Weights||^2' is added to error function. If you don't know what Decay to choose, use 1.0E-3. Weight decay can be set to zero, in this case network is trained without weight decay. NOTE: by default network uses some small nonzero value for weight decay. -- ALGLIB -- Copyright 23.07.2012 by Bochkanov Sergey *************************************************************************/
void alglib::mlpsetdecay(mlptrainer s, double decay);
/************************************************************************* This function sets the "current dataset" of the trainer object to the one passed by the user (a sparse matrix is used to store the dataset). INPUT PARAMETERS: S - trainer object XY - training set, see below for information on the training set format. This function checks correctness of the dataset (no NANs/INFs, class numbers are correct) and throws an exception when an incorrect dataset is passed. Any sparse storage format can be used: Hash-table, CRS... NPoints - points count, >=0 DATASET FORMAT: This function uses two different dataset formats - one for regression networks, another one for classification networks. For regression networks with NIn inputs and NOut outputs the following dataset format is used: * dataset is given by an NPoints*(NIn+NOut) matrix * each row corresponds to one example * first NIn columns are inputs, next NOut columns are outputs For classification networks with NIn inputs and NClasses classes the following dataset format is used: * dataset is given by an NPoints*(NIn+1) matrix * each row corresponds to one example * first NIn columns are inputs, last column stores class number (from 0 to NClasses-1). -- ALGLIB -- Copyright 23.07.2012 by Bochkanov Sergey *************************************************************************/
void alglib::mlpsetsparsedataset( mlptrainer s, sparsematrix xy, ae_int_t npoints);
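
The sketch below shows one way to attach a sparse dataset to a trainer object. It assumes the sparsecreate()/sparseset() constructors from the sparse matrix subpackage (documented elsewhere in this manual); the 4-point regression dataset is illustrative.

#include "stdafx.h"
#include <stdio.h>
#include "linalg.h"
#include "dataanalysis.h"

using namespace alglib;

int main(int argc, char **argv)
{
    // trainer for regression problems with 2 inputs and 1 output
    mlptrainer trn;
    mlpcreatetrainer(2, 1, trn);

    // 4x3 sparse matrix: first two columns are inputs, last column is output
    // (zero entries may simply be left unset)
    sparsematrix xy;
    sparsecreate(4, 3, xy);
    sparseset(xy, 0, 0, 1); sparseset(xy, 0, 1, 1); sparseset(xy, 0, 2, 1);
    sparseset(xy, 1, 0, 1); sparseset(xy, 1, 1, 2); sparseset(xy, 1, 2, 2);
    sparseset(xy, 2, 0, 2); sparseset(xy, 2, 1, 1); sparseset(xy, 2, 2, 2);
    sparseset(xy, 3, 0, 2); sparseset(xy, 3, 1, 2); sparseset(xy, 3, 2, 4);

    // attach sparse dataset and train a 2x5x1 network with 5 restarts
    mlpsetsparsedataset(trn, xy, 4);
    multilayerperceptron network;
    mlpreport rep;
    mlpcreate1(2, 5, 1, network);
    mlptrainnetwork(trn, network, 5, rep);
    return 0;
}
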
/************************************************************************* IMPORTANT: this is an "expert" version of the MLPTrain() function. We do not recommend you to use it unless you are pretty sure that you need ability to monitor training progress. This function performs step-by-step training of the neural network. Here "step-by-step" means that training starts with MLPStartTraining() call, and then user subsequently calls MLPContinueTraining() to perform one more iteration of the training. After call to this function trainer object remembers network and is ready to train it. However, no training is performed until first call to MLPContinueTraining() function. Subsequent calls to MLPContinueTraining() will advance training progress one iteration further. EXAMPLE: > > ...initialize network and trainer object.... > > MLPStartTraining(Trainer, Network, True) > while MLPContinueTraining(Trainer, Network) do > ...visualize training progress... > INPUT PARAMETERS: S - trainer object Network - neural network. It must have same number of inputs and output/classes as was specified during creation of the trainer object. RandomStart - randomize network before training or not: * True means that network is randomized and its initial state (one which was passed to the trainer object) is lost. * False means that training is started from the current state of the network OUTPUT PARAMETERS: Network - neural network which is ready to training (weights are initialized, preprocessor is initialized using current training set) NOTE: this method uses sum-of-squares error function for training. NOTE: it is expected that trainer object settings are NOT changed during step-by-step training, i.e. no one changes stopping criteria or training set during training. It is possible and there is no defense against such actions, but algorithm behavior in such cases is undefined and can be unpredictable. -- ALGLIB -- Copyright 23.07.2012 by Bochkanov Sergey *************************************************************************/
void alglib::mlpstarttraining( mlptrainer s, multilayerperceptron network, bool randomstart);
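
A minimal step-by-step training sketch is given below. It assumes that MLPContinueTraining(), mentioned above, is exposed in C++ as a bool-returning mlpcontinuetraining(trainer, network); the dataset is the multiplication table reused from the examples in this section.

#include "stdafx.h"
#include <stdio.h>
#include "dataanalysis.h"

using namespace alglib;

int main(int argc, char **argv)
{
    // trainer and 2x5x1 network for the 2x2 multiplication table
    mlptrainer trn;
    multilayerperceptron network;
    real_2d_array xy = "[[1,1,1],[1,2,2],[2,1,2],[2,2,4]]";
    mlpcreatetrainer(2, 1, trn);
    mlpcreate1(2, 5, 1, network);
    mlpsetdataset(trn, xy, 4);
    mlpsetcond(trn, 0.000, 50); // stop after at most 50 epochs

    // step-by-step training: start from a random state, then advance one
    // iteration per call until the trainer reports completion
    ae_int_t epochs = 0;
    mlpstarttraining(trn, network, true);
    while( mlpcontinuetraining(trn, network) )
        epochs++;                 // training progress could be visualized here
    printf("%ld\n", long(epochs)); // number of epochs actually performed (at most 50)
    return 0;
}
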
/************************************************************************* This function trains neural network ensemble passed to this function using current dataset and early stopping training algorithm. Each early stopping round performs NRestarts random restarts (thus, EnsembleSize*NRestarts training rounds is performed in total). FOR USERS OF COMMERCIAL EDITION: ! Commercial version of ALGLIB includes two important improvements of ! this function: ! * multicore support (C++ and C# computational cores) ! * SSE support (C++ computational core) ! ! Second improvement gives constant speedup (2-3X). First improvement ! gives close-to-linear speedup on multicore systems. Following ! operations can be executed in parallel: ! * EnsembleSize training sessions performed for each of ensemble ! members (always parallelized) ! * NRestarts training sessions performed within each of training ! sessions (if NRestarts>1) ! * gradient calculation over large dataset (if dataset is large enough) ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! In order to use SSE features you have to: ! * use commercial version of ALGLIB on Intel processors ! * use C++ computational core ! ! This note is given for users of commercial edition; if you use GPL ! edition, you still will be able to call smp-version of this function, ! but all computations will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. INPUT PARAMETERS: S - trainer object; Ensemble - neural network ensemble. It must have same number of inputs and outputs/classes as was specified during creation of the trainer object. NRestarts - number of restarts, >=0: * NRestarts>0 means that specified number of random restarts are performed during each ES round; * NRestarts=0 is silently replaced by 1. OUTPUT PARAMETERS: Ensemble - trained ensemble; Rep - it contains all type of errors. NOTE: this training method uses BOTH early stopping and weight decay! So, you should select weight decay before starting training just as you select it before training "conventional" networks. NOTE: when no dataset was specified with MLPSetDataset/SetSparseDataset(), or single-point dataset was passed, ensemble is filled by zero values. NOTE: this method uses sum-of-squares error function for training. -- ALGLIB -- Copyright 22.08.2012 by Bochkanov Sergey *************************************************************************/
void alglib::mlptrainensemblees( mlptrainer s, mlpensemble ensemble, ae_int_t nrestarts, mlpreport& rep); void alglib::smp_mlptrainensemblees( mlptrainer s, mlpensemble ensemble, ae_int_t nrestarts, mlpreport& rep);

Examples:   [1]  [2]  

/************************************************************************* Neural network training using early stopping (base algorithm - L-BFGS with regularization). INPUT PARAMETERS: Network - neural network with initialized geometry TrnXY - training set TrnSize - training set size, TrnSize>0 ValXY - validation set ValSize - validation set size, ValSize>0 Decay - weight decay constant, >=0.001 Decay term 'Decay*||Weights||^2' is added to error function. If you don't know what Decay to choose, use 0.001. Restarts - number of restarts, either: * strictly positive number - algorithm makes the specified number of restarts from random positions. * -1, in which case algorithm makes exactly one run from the initial state of the network (no randomization). If you don't know what Restarts to choose, choose one of the following: * -1 (deterministic start) * +1 (one random restart) * +5 (moderate amount of random restarts) OUTPUT PARAMETERS: Network - trained neural network. Info - return code: * -2, if there is a point with class number outside of [0..NOut-1]. * -1, if wrong parameters specified (NPoints<0, Restarts<1, ...). * 2, task has been solved, stopping criterion met - sufficiently small step size. Not expected (we use EARLY stopping) but possible and not an error. * 6, task has been solved, stopping criterion met - increase of validation set error. Rep - training report NOTE: Algorithm stops if validation set error increases for long enough or if step size is small enough (there are tasks where validation set error may decrease for eternity). In any case, the solution returned corresponds to the minimum of validation set error. -- ALGLIB -- Copyright 10.03.2009 by Bochkanov Sergey *************************************************************************/
void alglib::mlptraines( multilayerperceptron network, real_2d_array trnxy, ae_int_t trnsize, real_2d_array valxy, ae_int_t valsize, double decay, ae_int_t restarts, ae_int_t& info, mlpreport& rep);
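
A minimal sketch of early stopping training with an explicit validation set follows; the datasets and parameter values are illustrative assumptions.

#include "stdafx.h"
#include <stdio.h>
#include "dataanalysis.h"

using namespace alglib;

int main(int argc, char **argv)
{
    // 1x4x1 network trained to reproduce f(x)=x^2 with early stopping
    multilayerperceptron network;
    mlpcreate1(1, 4, 1, network);

    // training and validation sets, one row = [x, f(x)]
    real_2d_array trnxy = "[[-2,4],[-1,1],[0,0],[1,1],[2,4]]";
    real_2d_array valxy = "[[-1.5,2.25],[-0.5,0.25],[0.5,0.25],[1.5,2.25]]";

    // 3 random restarts, moderate weight decay
    ae_int_t info;
    mlpreport rep;
    mlptraines(network, trnxy, 5, valxy, 4, 0.001, 3, info, rep);
    printf("%d\n", int(info)); // info=2 or info=6 both indicate successful completion
    return 0;
}
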
/************************************************************************* Neural network training using L-BFGS algorithm with regularization. Subroutine trains neural network with restarts from random positions. Algorithm is well suited for problems of any dimensionality (memory requirements and step complexity are linear by weights number). INPUT PARAMETERS: Network - neural network with initialized geometry XY - training set NPoints - training set size Decay - weight decay constant, >=0.001 Decay term 'Decay*||Weights||^2' is added to error function. If you don't know what Decay to choose, use 0.001. Restarts - number of restarts from random position, >0. If you don't know what Restarts to choose, use 2. WStep - stopping criterion. Algorithm stops if step size is less than WStep. Recommended value - 0.01. Zero step size means stopping after MaxIts iterations. MaxIts - stopping criterion. Algorithm stops after MaxIts iterations (NOT gradient calculations). Zero MaxIts means stopping when step is sufficiently small. OUTPUT PARAMETERS: Network - trained neural network. Info - return code: * -8, if both WStep=0 and MaxIts=0 * -2, if there is a point with class number outside of [0..NOut-1]. * -1, if wrong parameters specified (NPoints<0, Restarts<1). * 2, if task has been solved. Rep - training report -- ALGLIB -- Copyright 09.12.2007 by Bochkanov Sergey *************************************************************************/
void alglib::mlptrainlbfgs( multilayerperceptron network, real_2d_array xy, ae_int_t npoints, double decay, ae_int_t restarts, double wstep, ae_int_t maxits, ae_int_t& info, mlpreport& rep);
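
A minimal sketch of direct L-BFGS training (without a trainer object), reusing the multiplication-table dataset from the examples in this section.

#include "stdafx.h"
#include <stdio.h>
#include "dataanalysis.h"

using namespace alglib;

int main(int argc, char **argv)
{
    // 2x5x1 network trained on the 2x2 multiplication table with L-BFGS
    multilayerperceptron network;
    mlpcreate1(2, 5, 1, network);
    real_2d_array xy = "[[1,1,1],[1,2,2],[2,1,2],[2,2,4]]";

    // Decay=0.001, 2 restarts, WStep=0.01, unlimited iterations
    ae_int_t info;
    mlpreport rep;
    mlptrainlbfgs(network, xy, 4, 0.001, 2, 0.01, 0, info, rep);

    // 2*2=?
    real_1d_array x = "[2,2]";
    real_1d_array y = "[0]";
    mlpprocess(network, x, y);
    printf("%s\n", y.tostring(1).c_str()); // EXPECTED: [4.000]
    return 0;
}
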
/************************************************************************* Neural network training using modified Levenberg-Marquardt with exact Hessian calculation and regularization. Subroutine trains neural network with restarts from random positions. Algorithm is well suited for small and medium scale problems (hundreds of weights). INPUT PARAMETERS: Network - neural network with initialized geometry XY - training set NPoints - training set size Decay - weight decay constant, >=0.001 Decay term 'Decay*||Weights||^2' is added to error function. If you don't know what Decay to choose, use 0.001. Restarts - number of restarts from random position, >0. If you don't know what Restarts to choose, use 2. OUTPUT PARAMETERS: Network - trained neural network. Info - return code: * -9, if internal matrix inverse subroutine failed * -2, if there is a point with class number outside of [0..NOut-1]. * -1, if wrong parameters specified (NPoints<0, Restarts<1). * 2, if task has been solved. Rep - training report -- ALGLIB -- Copyright 10.03.2009 by Bochkanov Sergey *************************************************************************/
void alglib::mlptrainlm( multilayerperceptron network, real_2d_array xy, ae_int_t npoints, double decay, ae_int_t restarts, ae_int_t& info, mlpreport& rep);
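
The same problem can be trained with the Levenberg-Marquardt routine; a minimal sketch follows.

#include "stdafx.h"
#include <stdio.h>
#include "dataanalysis.h"

using namespace alglib;

int main(int argc, char **argv)
{
    // same multiplication-table problem, now trained with Levenberg-Marquardt
    multilayerperceptron network;
    mlpcreate1(2, 5, 1, network);
    real_2d_array xy = "[[1,1,1],[1,2,2],[2,1,2],[2,2,4]]";

    ae_int_t info;
    mlpreport rep;
    mlptrainlm(network, xy, 4, 0.001, 2, info, rep);

    // 2*2=?
    real_1d_array x = "[2,2]";
    real_1d_array y = "[0]";
    mlpprocess(network, x, y);
    printf("%s\n", y.tostring(1).c_str()); // EXPECTED: [4.000]
    return 0;
}
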
/************************************************************************* This function trains neural network passed to this function, using current dataset (one which was passed to MLPSetDataset() or MLPSetSparseDataset()) and current training settings. Training from NRestarts random starting positions is performed, best network is chosen. Training is performed using current training algorithm. FOR USERS OF COMMERCIAL EDITION: ! Commercial version of ALGLIB includes two important improvements of ! this function: ! * multicore support (C++ and C# computational cores) ! * SSE support (C++ computational core) ! ! Second improvement gives constant speedup (2-3X). First improvement ! gives close-to-linear speedup on multicore systems. Following ! operations can be executed in parallel: ! * NRestarts training sessions performed within each of ! cross-validation rounds (if NRestarts>1) ! * gradient calculation over large dataset (if dataset is large enough) ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! In order to use SSE features you have to: ! * use commercial version of ALGLIB on Intel processors ! * use C++ computational core ! ! This note is given for users of commercial edition; if you use GPL ! edition, you still will be able to call smp-version of this function, ! but all computations will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. INPUT PARAMETERS: S - trainer object Network - neural network. It must have same number of inputs and output/classes as was specified during creation of the trainer object. NRestarts - number of restarts, >=0: * NRestarts>0 means that specified number of random restarts are performed, best network is chosen after training * NRestarts=0 means that current state of the network is used for training. OUTPUT PARAMETERS: Network - trained network NOTE: when no dataset was specified with MLPSetDataset/SetSparseDataset(), network is filled by zero values. Same behavior for functions MLPStartTraining and MLPContinueTraining. NOTE: this method uses sum-of-squares error function for training. -- ALGLIB -- Copyright 23.07.2012 by Bochkanov Sergey *************************************************************************/
void alglib::mlptrainnetwork( mlptrainer s, multilayerperceptron network, ae_int_t nrestarts, mlpreport& rep); void alglib::smp_mlptrainnetwork( mlptrainer s, multilayerperceptron network, ae_int_t nrestarts, mlpreport& rep);

Examples:   [1]  [2]  [3]  [4]  [5]  [6]  

#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "dataanalysis.h"

using namespace alglib;


int main(int argc, char **argv)
{
    //
    // Suppose that we want to classify numbers as positive (class 0) and negative
    // (class 1). We have training set which includes several strictly positive
    // or negative numbers - and zero.
    //
    // The problem is that we are not sure how to classify zero, so from time to
    // time we mark it as positive or negative (with equal probability). Other
    // numbers are marked in pure deterministic setting. How will neural network
    // cope with such classification task?
    //
    // NOTE: we use network with excessive amount of neurons, which guarantees
    //       almost exact reproduction of the training set. Generalization ability
    //       of such network is rather low, but we are not concerned with such
    //       questions in this basic demo.
    //
    mlptrainer trn;
    multilayerperceptron network;
    mlpreport rep;
    real_1d_array x = "[0]";
    real_1d_array y = "[0,0]";

    //
    // Training set. One row corresponds to one record [A => class(A)].
    //
    // Classes are denoted by numbers from 0 to 1, where 0 corresponds to positive
    // numbers and 1 to negative numbers.
    //
    // [ +1  0]
    // [ +2  0]
    // [ -1  1]
    // [ -2  1]
    // [  0  0]   !! sometimes we classify 0 as positive, sometimes as negative
    // [  0  1]   !!
    //
    real_2d_array xy = "[[+1,0],[+2,0],[-1,1],[-2,1],[0,0],[0,1]]";

    //
    //
    // When we solve classification problems, everything is slightly different from
    // the regression ones:
    //
    // 1. Network is created. Because we solve classification problem, we use
    //    mlpcreatec1() function instead of mlpcreate1(). This function creates
    //    classifier network with SOFTMAX-normalized outputs. This network returns
    //    vector of class membership probabilities which are normalized to be
    //    non-negative and sum to 1.0
    //
    // 2. We use mlpcreatetrainercls() function instead of mlpcreatetrainer() to
    //    create trainer object. The trainer object processes the dataset and neural network
    //    slightly differently to account for specifics of the classification
    //    problems.
    //
    // 3. Dataset is attached to trainer object. Note that dataset format is slightly
    //    different from one used for regression.
    //
    mlpcreatetrainercls(1, 2, trn);
    mlpcreatec1(1, 5, 2, network);
    mlpsetdataset(trn, xy, 6);

    //
    // Network is trained with 5 restarts from random positions
    //
    mlptrainnetwork(trn, network, 5, rep);

    //
    // Test our neural network on strictly positive and strictly negative numbers.
    //
    // IMPORTANT! Classifier network returns class membership probabilities instead
    // of class indexes. Network returns two values (probabilities) instead of one
    // (class index).
    //
    // Thus, for +1 we expect to get [P0,P1] = [1,0], where P0 is probability that
    // number is positive (belongs to class 0), and P1 is probability that number
    // is negative (belongs to class 1).
    //
    // For -1 we expect to get [P0,P1] = [0,1]
    //
    // Following properties are guaranteed by network architecture:
    // * P0>=0, P1>=0   non-negativity
    // * P0+P1=1        normalization
    //
    x = "[1]";
    mlpprocess(network, x, y);
    printf("%s\n", y.tostring(1).c_str()); // EXPECTED: [1.000,0.000]
    x = "[-1]";
    mlpprocess(network, x, y);
    printf("%s\n", y.tostring(1).c_str()); // EXPECTED: [0.000,1.000]

    //
    // But what our network will return for 0, which is between classes 0 and 1?
    //
    // In our dataset it has two different marks assigned (class 0 AND class 1).
    // So network will return something average between class 0 and class 1:
    //     0 => [0.5, 0.5]
    //
    x = "[0]";
    mlpprocess(network, x, y);
    printf("%s\n", y.tostring(1).c_str()); // EXPECTED: [0.500,0.500]
    return 0;
}


#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "dataanalysis.h"

using namespace alglib;


int main(int argc, char **argv)
{
    //
    // Suppose that we want to classify numbers as positive (class 0) and negative
    // (class 1). We also have one more class for zero (class 2).
    //
    // NOTE: we use network with excessive amount of neurons, which guarantees
    //       almost exact reproduction of the training set. Generalization ability
    //       of such network is rather low, but we are not concerned with such
    //       questions in this basic demo.
    //
    mlptrainer trn;
    multilayerperceptron network;
    mlpreport rep;
    real_1d_array x = "[0]";
    real_1d_array y = "[0,0,0]";

    //
    // Training set. One row corresponds to one record [A => class(A)].
    //
    // Classes are denoted by numbers from 0 to 2, where 0 corresponds to positive
    // numbers, 1 to negative numbers, 2 to zero
    //
    // [ +1  0]
    // [ +2  0]
    // [ -1  1]
    // [ -2  1]
    // [  0  2]
    //
    real_2d_array xy = "[[+1,0],[+2,0],[-1,1],[-2,1],[0,2]]";

    //
    //
    // When we solve classification problems, everything is slightly different from
    // the regression ones:
    //
    // 1. Network is created. Because we solve classification problem, we use
    //    mlpcreatec1() function instead of mlpcreate1(). This function creates
    //    classifier network with SOFTMAX-normalized outputs. This network returns
    //    vector of class membership probabilities which are normalized to be
    //    non-negative and sum to 1.0
    //
    // 2. We use mlpcreatetrainercls() function instead of mlpcreatetrainer() to
    //    create trainer object. The trainer object processes the dataset and neural network
    //    slightly differently to account for specifics of the classification
    //    problems.
    //
    // 3. Dataset is attached to trainer object. Note that dataset format is slightly
    //    different from one used for regression.
    //
    mlpcreatetrainercls(1, 3, trn);
    mlpcreatec1(1, 5, 3, network);
    mlpsetdataset(trn, xy, 5);

    //
    // Network is trained with 5 restarts from random positions
    //
    mlptrainnetwork(trn, network, 5, rep);

    //
    // Test our neural network on strictly positive and strictly negative numbers.
    //
    // IMPORTANT! Classifier network returns class membership probabilities instead
    // of class indexes. Network returns three values (probabilities) instead of one
    // (class index).
    //
    // Thus, for +1 we expect to get [P0,P1,P2] = [1,0,0],
    // for -1 we expect to get [P0,P1,P2] = [0,1,0],
    // and for 0 we will get [P0,P1,P2] = [0,0,1].
    //
    // Following properties are guaranteed by network architecture:
    // * P0>=0, P1>=0, P2>=0    non-negativity
    // * P0+P1+P2=1             normalization
    //
    x = "[1]";
    mlpprocess(network, x, y);
    printf("%s\n", y.tostring(1).c_str()); // EXPECTED: [1.000,0.000,0.000]
    x = "[-1]";
    mlpprocess(network, x, y);
    printf("%s\n", y.tostring(1).c_str()); // EXPECTED: [0.000,1.000,0.000]
    x = "[0]";
    mlpprocess(network, x, y);
    printf("%s\n", y.tostring(1).c_str()); // EXPECTED: [0.000,0.000,1.000]
    return 0;
}


#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "dataanalysis.h"

using namespace alglib;


int main(int argc, char **argv)
{
    //
    // This example shows how to perform cross-validation with ALGLIB
    //
    mlptrainer trn;
    multilayerperceptron network;
    mlpreport rep;

    //
    // Training set: f(x)=1/(x^2+1)
    // One row corresponds to one record [x,f(x)]
    //
    real_2d_array xy = "[[-2.0,0.2],[-1.6,0.3],[-1.3,0.4],[-1,0.5],[-0.6,0.7],[-0.3,0.9],[0,1],[2.0,0.2],[1.6,0.3],[1.3,0.4],[1,0.5],[0.6,0.7],[0.3,0.9]]";

    //
    // Trainer object is created.
    // Dataset is attached to trainer object.
    //
    // NOTE: it is not a good idea to perform cross-validation on a sample
    //       as small as ours (13 examples). It is done for demonstration
    //       purposes only. Generalization error estimates won't be
    //       precise enough for practical purposes.
    //
    mlpcreatetrainer(1, 1, trn);
    mlpsetdataset(trn, xy, 13);

    //
    // The key property of the cross-validation is that it estimates
    // generalization properties of neural ARCHITECTURE. It does NOT
    // estimates generalization error of some specific network which
    // is passed to the k-fold CV routine.
    //
    // In our example we create 1x4x1 neural network and pass it to
    // CV routine without training it. Original state of the network
    // is not used for cross-validation - each round is restarted from
    // random initial state. Only geometry of network matters.
    //
    // We perform 5 restarts from different random positions for each
    // of the 10 cross-validation rounds.
    //
    mlpcreate1(1, 4, 1, network);
    mlpkfoldcv(trn, network, 5, 10, rep);

    //
    // Cross-validation routine stores estimates of the generalization
    // error to MLP report structure. You may examine its fields and
    // see estimates of different errors (RMS, CE, Avg).
    //
    // Because cross-validation is non-deterministic, in our manual we
    // can not say what values will be stored to rep after call to
    // mlpkfoldcv(). Every CV round will return slightly different
    // estimates.
    //
    return 0;
}


#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "dataanalysis.h"

using namespace alglib;


int main(int argc, char **argv)
{
    //
    // This example shows how to train early stopping ensembles.
    //
    mlptrainer trn;
    mlpensemble ensemble;
    mlpreport rep;

    //
    // Training set: f(x)=1/(x^2+1)
    // One row corresponds to one record [x,f(x)]
    //
    real_2d_array xy = "[[-2.0,0.2],[-1.6,0.3],[-1.3,0.4],[-1,0.5],[-0.6,0.7],[-0.3,0.9],[0,1],[2.0,0.2],[1.6,0.3],[1.3,0.4],[1,0.5],[0.6,0.7],[0.3,0.9]]";

    //
    // Trainer object is created.
    // Dataset is attached to trainer object.
    //
    // NOTE: it is not a good idea to use an early stopping ensemble on a sample
    //       as small as ours (13 examples). It is done for demonstration
    //       purposes only. Ensemble training algorithm won't find good
    //       solution on such small sample.
    //
    mlpcreatetrainer(1, 1, trn);
    mlpsetdataset(trn, xy, 13);

    //
    // Ensemble is created and trained. Each of the 50 networks is trained
    // with 5 restarts.
    //
    mlpecreate1(1, 4, 1, 50, ensemble);
    mlptrainensemblees(trn, ensemble, 5, rep);
    return 0;
}


#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "dataanalysis.h"

using namespace alglib;


int main(int argc, char **argv)
{
    //
    // This example shows how to use parallel functionality of ALGLIB.
    // We generate simple 1-dimensional regression problem and show how
    // to use parallel training, parallel cross-validation, parallel
    // training of neural ensembles.
    //
    // We assume that you already know how to use ALGLIB in serial mode
    // and concentrate on its parallel capabilities.
    //
    // NOTE: it is not a good idea to use parallel features on a sample as small
    //       as ours (13 examples). It is done only for demonstration purposes.
    //
    mlptrainer trn;
    multilayerperceptron network;
    mlpensemble ensemble;
    mlpreport rep;
    real_2d_array xy = "[[-2.0,0.2],[-1.6,0.3],[-1.3,0.4],[-1,0.5],[-0.6,0.7],[-0.3,0.9],[0,1],[2.0,0.2],[1.6,0.3],[1.3,0.4],[1,0.5],[0.6,0.7],[0.3,0.9]]";
    mlpcreatetrainer(1, 1, trn);
    mlpsetdataset(trn, xy, 13);
    mlpcreate1(1, 4, 1, network);
    mlpecreate1(1, 4, 1, 50, ensemble);

    //
    // Below we demonstrate how to perform:
    // * parallel training of individual networks
    // * parallel cross-validation
    // * parallel training of neural ensembles
    //
    // In order to use multithreading, you have to:
    // 1) Install SMP edition of ALGLIB.
    // 2) This step is specific for C++ users: you should activate OS-specific
    //    capabilities of ALGLIB by defining AE_OS=AE_POSIX (for *nix systems)
    //    or AE_OS=AE_WINDOWS (for Windows systems).
    //    C# users do not have to perform this step because C# programs are
    //    portable across different systems without OS-specific tuning.
    // 3) Allow ALGLIB to know about number of worker threads to use:
    //    a) autodetection (C++, C#):
    //          ALGLIB will automatically determine number of CPU cores and
    //          (by default) will use all cores except for one. Say, on 4-core
    //          system it will use three cores - unless you manually told it
    //          to use more or less. It will keep your system responsive during
    //          lengthy computations.
    //          Such behavior may be changed with setnworkers() call:
    //          * alglib::setnworkers(0)  = use all cores
    //          * alglib::setnworkers(-1) = leave one core unused
    //          * alglib::setnworkers(-2) = leave two cores unused
    //          * alglib::setnworkers(+2) = use 2 cores (even if you have more)
    //    b) manual specification (C++, C#):
    //          You may want to specify maximum number of worker threads during
    //          compile time by means of preprocessor definition AE_NWORKERS.
    //          For C++ it will be "AE_NWORKERS=X" where X can be any positive number.
    //          For C# it is "AE_NWORKERSX", where X should be replaced by number of
    //          workers (AE_NWORKERS2, AE_NWORKERS3, AE_NWORKERS4, ...).
    //          You can add this definition to compiler command line or change
    //          corresponding project settings in your IDE.
    //
    // After you installed and configured SMP edition of ALGLIB, you may choose
    // between serial and multithreaded versions of SMP-capable functions:
    // * serial version works as usual, in the context of the calling thread
    // * multithreaded version (with "smp_" prefix) creates (or wakes up) worker
    //   threads, inserts task in the worker queue, and waits for completion of
    //   the task. All processing is done in context of worker thread(s).
    //
    // NOTE: because starting/stopping worker threads costs thousands of CPU cycles,
    //       you should not use multithreading for lightweight computational problems.
    //
    // NOTE: some old POSIX-compatible operating systems do not support
    //       sysconf(_SC_NPROCESSORS_ONLN) system call which is required in order
    //       to automatically determine number of active cores. On these systems
    //       you should specify number of cores manually at compile time.
    //       Without it ALGLIB will run in single-threaded mode.
    //

    //
    // First, we perform parallel training of an individual network with 5
    // restarts from random positions. These 5 rounds of training are
    // executed in parallel, with the best network chosen after training.
    //
    // ALGLIB can use an additional way to speed up computations - dividing
    // the dataset into smaller subsets and processing these subsets
    // simultaneously. It allows us to efficiently parallelize even a
    // single training round. This operation is performed automatically
    // for large datasets, but our toy dataset is too small.
    //
    smp_mlptrainnetwork(trn, network, 5, rep);

    //
    // Then, we perform parallel 10-fold cross-validation, with 5 random
    // restarts for each CV round. I.e., 5*10=50 networks are trained
    // in total. All these operations can be parallelized.
    //
    // NOTE: again, ALGLIB can parallelize calculation of the gradient
    //       over the entire dataset - but our dataset is too small.
    //
    smp_mlpkfoldcv(trn, network, 5, 10, rep);

    //
    // Finally, we train an early stopping ensemble of 50 neural networks,
    // each of them trained with 5 random restarts. I.e., 5*50=250
    // networks are trained in total.
    //
    smp_mlptrainensemblees(trn, ensemble, 5, rep);
    return 0;
}


#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "dataanalysis.h"

using namespace alglib;


int main(int argc, char **argv)
{
    //
    // The very simple example on neural network: network is trained to reproduce
    // small 2x2 multiplication table.
    //
    // NOTE: we use network with excessive amount of neurons, which guarantees
    //       almost exact reproduction of the training set. Generalization ability
    //       of such network is rather low, but we are not concerned with such
    //       questions in this basic demo.
    //
    mlptrainer trn;
    multilayerperceptron network;
    mlpreport rep;

    //
    // Training set:
    // * one row corresponds to one record A*B=C in the multiplication table
    // * first two columns store A and B, last column stores C
    //
    // [1 * 1 = 1]
    // [1 * 2 = 2]
    // [2 * 1 = 2]
    // [2 * 2 = 4]
    //
    real_2d_array xy = "[[1,1,1],[1,2,2],[2,1,2],[2,2,4]]";

    //
    // Network is created.
    // Trainer object is created.
    // Dataset is attached to trainer object.
    //
    mlpcreatetrainer(2, 1, trn);
    mlpcreate1(2, 5, 1, network);
    mlpsetdataset(trn, xy, 4);

    //
    // Network is trained with 5 restarts from random positions
    //
    mlptrainnetwork(trn, network, 5, rep);

    //
    // 2*2=?
    //
    real_1d_array x = "[2,2]";
    real_1d_array y = "[0]";
    mlpprocess(network, x, y);
    printf("%s\n", y.tostring(1).c_str()); // EXPECTED: [4.000]
    return 0;
}


#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "dataanalysis.h"

using namespace alglib;


int main(int argc, char **argv)
{
    //
    // Network with 2 inputs and 2 outputs is trained to reproduce vector function:
    //     (x0,x1) => (x0+x1, x0*x1)
    //
    // Informally speaking, we want neural network to simultaneously calculate
    // both sum of two numbers and their product.
    //
    // NOTE: we use network with excessive amount of neurons, which guarantees
    //       almost exact reproduction of the training set. Generalization ability
    //       of such network is rather low, but we are not concerned with such
    //       questions in this basic demo.
    //
    mlptrainer trn;
    multilayerperceptron network;
    mlpreport rep;

    //
    // Training set. One row corresponds to one record [A,B,A+B,A*B].
    //
    // [ 1   1  1+1  1*1 ]
    // [ 1   2  1+2  1*2 ]
    // [ 2   1  2+1  2*1 ]
    // [ 2   2  2+2  2*2 ]
    //
    real_2d_array xy = "[[1,1,2,1],[1,2,3,2],[2,1,3,2],[2,2,4,4]]";

    //
    // Network is created.
    // Trainer object is created.
    // Dataset is attached to trainer object.
    //
    mlpcreatetrainer(2, 2, trn);
    mlpcreate1(2, 5, 2, network);
    mlpsetdataset(trn, xy, 4);

    //
    // Network is trained with 5 restarts from random positions
    //
    mlptrainnetwork(trn, network, 5, rep);

    //
    // 2+1=?
    // 2*1=?
    //
    real_1d_array x = "[2,1]";
    real_1d_array y = "[0,0]";
    mlpprocess(network, x, y);
    printf("%s\n", y.tostring(1).c_str()); // EXPECTED: [3.000,2.000]
    return 0;
}


#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "dataanalysis.h"

using namespace alglib;


int main(int argc, char **argv)
{
    //
    // Trainer object is used to train network. It stores dataset, training settings,
    // and other information which is NOT part of neural network. You should use
    // trainer object as follows:
    // (1) you create trainer object and specify task type (classification/regression)
    //     and number of inputs/outputs
    // (2) you add dataset to the trainer object
    // (3) you may change training settings (stopping criteria or weight decay)
    // (4) finally, you may train one or more networks
    //
    // You may interleave stages 2...4 and repeat them many times. Trainer object
    // remembers its internal state and can be used several times after its creation
    // and initialization.
    //
    mlptrainer trn;

    //
    // Stage 1: object creation.
    //
    // We have to specify number of inputs and outputs. Trainer object can be used
    // only for problems with same number of inputs/outputs as was specified during
    // its creation.
    //
    // In case you want to train a SOFTMAX-normalized network which solves
    // classification problems, you must use another function to create the
    // trainer object: mlpcreatetrainercls().
    //
    // Below we create trainer object which can be used to train regression networks
    // with 2 inputs and 1 output.
    //
    mlpcreatetrainer(2, 1, trn);

    //
    // Stage 2: specification of the training set
    //
    // By default the trainer object stores an empty dataset. So to solve your
    // non-empty problem you have to set the dataset by passing a dense or
    // sparse matrix to the trainer.
    //
    // One row of the matrix corresponds to one record A*B=C in the multiplication table.
    // First two columns store A and B, last column stores C
    //
    //     [1 * 1 = 1]   [ 1 1 1 ]
    //     [1 * 2 = 2]   [ 1 2 2 ]
    //     [2 * 1 = 2] = [ 2 1 2 ]
    //     [2 * 2 = 4]   [ 2 2 4 ]
    //
    real_2d_array xy = "[[1,1,1],[1,2,2],[2,1,2],[2,2,4]]";
    mlpsetdataset(trn, xy, 4);

    //
    // Stage 3: modification of the training parameters.
    //
    // You may modify parameters like weights decay or stopping criteria:
    // * we set moderate weight decay
    // * we choose iterations limit as stopping condition (another condition - step size -
    //   is zero, which means that this condition is not active)
    //
    double wstep = 0.000;
    ae_int_t maxits = 100;
    mlpsetdecay(trn, 0.01);
    mlpsetcond(trn, wstep, maxits);

    //
    // Stage 4: training.
    //
    // We will train several networks with different architecture using same trainer object.
    // We may change training parameters or even dataset, so different networks are trained
    // differently. But in this simple example we will train all networks with same settings.
    //
    // We create and train three networks:
    // * network 1 has 2x1 architecture     (2 inputs, no hidden neurons, 1 output)
    // * network 2 has 2x5x1 architecture   (2 inputs, 5 hidden neurons, 1 output)
    // * network 3 has 2x5x5x1 architecture (2 inputs, two hidden layers, 1 output)
    //
    // NOTE: these networks solve regression problems. For classification problems you
    //       should use mlpcreatec0/c1/c2 to create neural networks which have SOFTMAX-
    //       normalized outputs.
    //
    multilayerperceptron net1;
    multilayerperceptron net2;
    multilayerperceptron net3;
    mlpreport rep;

    mlpcreate0(2, 1, net1);
    mlpcreate1(2, 5, 1, net2);
    mlpcreate2(2, 5, 5, 1, net3);

    mlptrainnetwork(trn, net1, 5, rep);
    mlptrainnetwork(trn, net2, 5, rep);
    mlptrainnetwork(trn, net3, 5, rep);
    return 0;
}


kdtree
kdtreerequestbuffer
kdtreebuild
kdtreebuildtagged
kdtreecreaterequestbuffer
kdtreequeryaknn
kdtreequerybox
kdtreequeryknn
kdtreequeryresultsdistances
kdtreequeryresultsdistancesi
kdtreequeryresultstags
kdtreequeryresultstagsi
kdtreequeryresultsx
kdtreequeryresultsxi
kdtreequeryresultsxy
kdtreequeryresultsxyi
kdtreequeryrnn
kdtreeserialize
kdtreetsqueryaknn
kdtreetsquerybox
kdtreetsqueryknn
kdtreetsqueryresultsdistances
kdtreetsqueryresultstags
kdtreetsqueryresultsx
kdtreetsqueryresultsxy
kdtreetsqueryrnn
kdtreeunserialize
nneighbor_d_1 Nearest neighbor search, KNN queries
nneighbor_d_2 Serialization of KD-trees
/************************************************************************* KD-tree object. *************************************************************************/
class kdtree { };
/************************************************************************* Buffer object which is used to perform nearest neighbor requests in multithreaded mode (multiple threads working with the same KD-tree object). This object should be created with KDTreeCreateRequestBuffer(). *************************************************************************/
class kdtreerequestbuffer { };
/************************************************************************* KD-tree creation This subroutine creates KD-tree from set of X-values and optional Y-values INPUT PARAMETERS XY - dataset, array[0..N-1,0..NX+NY-1]. one row corresponds to one point. first NX columns contain X-values, next NY (NY may be zero) columns may contain associated Y-values N - number of points, N>=0. NX - space dimension, NX>=1. NY - number of optional Y-values, NY>=0. NormType- norm type: * 0 denotes infinity-norm * 1 denotes 1-norm * 2 denotes 2-norm (Euclidean norm) OUTPUT PARAMETERS KDT - KD-tree NOTES 1. KD-tree creation have O(N*logN) complexity and O(N*(2*NX+NY)) memory requirements. 2. Although KD-trees may be used with any combination of N and NX, they are more efficient than brute-force search only when N >> 4^NX. So they are most useful in low-dimensional tasks (NX=2, NX=3). NX=1 is another inefficient case, because simple binary search (without additional structures) is much more efficient in such tasks than KD-trees. -- ALGLIB -- Copyright 28.02.2010 by Bochkanov Sergey *************************************************************************/
void alglib::kdtreebuild( real_2d_array xy, ae_int_t nx, ae_int_t ny, ae_int_t normtype, kdtree& kdt); void alglib::kdtreebuild( real_2d_array xy, ae_int_t n, ae_int_t nx, ae_int_t ny, ae_int_t normtype, kdtree& kdt);

Examples:   [1]  [2]  

/************************************************************************* KD-tree creation This subroutine creates KD-tree from set of X-values, integer tags and optional Y-values INPUT PARAMETERS XY - dataset, array[0..N-1,0..NX+NY-1]. one row corresponds to one point. first NX columns contain X-values, next NY (NY may be zero) columns may contain associated Y-values Tags - tags, array[0..N-1], contains integer tags associated with points. N - number of points, N>=0 NX - space dimension, NX>=1. NY - number of optional Y-values, NY>=0. NormType- norm type: * 0 denotes infinity-norm * 1 denotes 1-norm * 2 denotes 2-norm (Euclidean norm) OUTPUT PARAMETERS KDT - KD-tree NOTES 1. KD-tree creation have O(N*logN) complexity and O(N*(2*NX+NY)) memory requirements. 2. Although KD-trees may be used with any combination of N and NX, they are more efficient than brute-force search only when N >> 4^NX. So they are most useful in low-dimensional tasks (NX=2, NX=3). NX=1 is another inefficient case, because simple binary search (without additional structures) is much more efficient in such tasks than KD-trees. -- ALGLIB -- Copyright 28.02.2010 by Bochkanov Sergey *************************************************************************/
void alglib::kdtreebuildtagged( real_2d_array xy, integer_1d_array tags, ae_int_t nx, ae_int_t ny, ae_int_t normtype, kdtree& kdt); void alglib::kdtreebuildtagged( real_2d_array xy, integer_1d_array tags, ae_int_t n, ae_int_t nx, ae_int_t ny, ae_int_t normtype, kdtree& kdt);

Examples:   [1]  
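
A minimal sketch of building a tagged tree and reading tags back after a query; the points and tag values below are illustrative.

#include "stdafx.h"
#include <stdio.h>
#include "alglibmisc.h"

using namespace alglib;

int main(int argc, char **argv)
{
    // five 2D points with integer tags (e.g. record IDs)
    real_2d_array xy = "[[0,0],[0,1],[1,0],[1,1],[2,2]]";
    integer_1d_array tags = "[10,11,12,13,14]";

    // build tree in the Euclidean (2-) norm, no Y-values
    kdtree kdt;
    kdtreebuildtagged(xy, tags, 2, 0, 2, kdt);

    // nearest neighbor of (2.1,2.1) and its tag
    real_1d_array x = "[2.1,2.1]";
    kdtreequeryknn(kdt, x, 1);
    integer_1d_array r;
    kdtreequeryresultstags(kdt, r);
    printf("%s\n", r.tostring().c_str()); // nearest point is (2,2), whose tag is 14
    return 0;
}
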

/************************************************************************* This function creates a buffer structure which can be used to perform parallel KD-tree requests. The KD-tree subpackage provides two sets of request functions - ones which use the internal buffer of the KD-tree object (these functions are single-threaded because they use the same buffer, which cannot be shared between threads), and ones which use an external buffer. This function is used to initialize such an external buffer. INPUT PARAMETERS KDT - KD-tree which is associated with newly created buffer OUTPUT PARAMETERS Buf - external buffer. IMPORTANT: the KD-tree buffer should be used only with the KD-tree object which was used to initialize it. Any attempt to use the buffer with a different object is dangerous - you may get an integrity check failure (exception) because sizes of internal arrays do not fit the dimensions of the KD-tree structure. -- ALGLIB -- Copyright 18.03.2016 by Bochkanov Sergey *************************************************************************/
void alglib::kdtreecreaterequestbuffer( kdtree kdt, kdtreerequestbuffer& buf);
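
A minimal sketch of buffered (thread-safe) querying; it assumes that the kdtreetsqueryknn()/kdtreetsqueryresultsx() counterparts listed above take the request buffer as their second argument. For brevity the buffer is used from a single thread; in real code each worker thread would own its private buffer.

#include "stdafx.h"
#include <stdio.h>
#include "alglibmisc.h"

using namespace alglib;

int main(int argc, char **argv)
{
    // build a small 2D tree
    real_2d_array xy = "[[0,0],[0,1],[1,0],[1,1]]";
    kdtree kdt;
    kdtreebuild(xy, 2, 0, 2, kdt);

    // create an external request buffer tied to this tree
    kdtreerequestbuffer buf;
    kdtreecreaterequestbuffer(kdt, buf);

    // thread-safe KNN query which stores its results in buf instead of
    // the internal buffer of the kd-tree object
    real_1d_array x = "[0.9,0.9]";
    kdtreetsqueryknn(kdt, buf, x, 1);
    real_2d_array r;
    kdtreetsqueryresultsx(kdt, buf, r);
    printf("%s\n", r.tostring(1).c_str()); // prints the nearest neighbor, which is (1,1)
    return 0;
}
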
/************************************************************************* K-NN query: approximate K nearest neighbors IMPORTANT: this function can not be used in multithreaded code because it uses internal temporary buffer of kd-tree object, which can not be shared between multiple threads. If you want to perform parallel requests, use function which uses external request buffer: KDTreeTsQueryAKNN() ("Ts" stands for "thread-safe"). INPUT PARAMETERS KDT - KD-tree X - point, array[0..NX-1]. K - number of neighbors to return, K>=1 SelfMatch - whether self-matches are allowed: * if True, nearest neighbor may be the point itself (if it exists in original dataset) * if False, then only points with non-zero distance are returned * if not given, considered True Eps - approximation factor, Eps>=0. eps-approximate nearest neighbor is a neighbor whose distance from X is at most (1+eps) times distance of true nearest neighbor. RESULT number of actual neighbors found (either K or N, if K>N). NOTES significant performance gain may be achieved only when Eps is on the order of magnitude of 1 or larger. This subroutine performs query and stores its result in the internal structures of the KD-tree. You can use following subroutines to obtain these results: * KDTreeQueryResultsX() to get X-values * KDTreeQueryResultsXY() to get X- and Y-values * KDTreeQueryResultsTags() to get tag values * KDTreeQueryResultsDistances() to get distances -- ALGLIB -- Copyright 28.02.2010 by Bochkanov Sergey *************************************************************************/
ae_int_t alglib::kdtreequeryaknn( kdtree kdt, real_1d_array x, ae_int_t k, double eps); ae_int_t alglib::kdtreequeryaknn( kdtree kdt, real_1d_array x, ae_int_t k, bool selfmatch, double eps);

Examples:   [1]  
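
A minimal sketch of an approximate KNN query with Eps=2; the dataset is illustrative.

#include "stdafx.h"
#include <stdio.h>
#include "alglibmisc.h"

using namespace alglib;

int main(int argc, char **argv)
{
    // small 2D tree
    real_2d_array xy = "[[0,0],[0,1],[1,0],[1,1],[2,2]]";
    kdtree kdt;
    kdtreebuild(xy, 2, 0, 2, kdt);

    // approximate 2-NN query: returned neighbors are at most (1+eps)
    // times farther away than the true nearest neighbors
    real_1d_array x = "[-0.1,-0.1]";
    double eps = 2.0;
    ae_int_t k = kdtreequeryaknn(kdt, x, 2, eps);

    // distances are reported in the norm chosen at construction time
    real_1d_array r;
    kdtreequeryresultsdistances(kdt, r);
    printf("%d\n", int(k)); // EXPECTED: 2
    return 0;
}
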

/************************************************************************* Box query: all points within user-specified box. IMPORTANT: this function can not be used in multithreaded code because it uses internal temporary buffer of kd-tree object, which can not be shared between multiple threads. If you want to perform parallel requests, use function which uses external request buffer: KDTreeTsQueryBox() ("Ts" stands for "thread-safe"). INPUT PARAMETERS KDT - KD-tree BoxMin - lower bounds, array[0..NX-1]. BoxMax - upper bounds, array[0..NX-1]. RESULT number of actual neighbors found (in [0,N]). This subroutine performs query and stores its result in the internal structures of the KD-tree. You can use following subroutines to obtain these results: * KDTreeQueryResultsX() to get X-values * KDTreeQueryResultsXY() to get X- and Y-values * KDTreeQueryResultsTags() to get tag values * KDTreeQueryResultsDistances() returns zeros for this request NOTE: this particular query returns unordered results, because there is no meaningful way of ordering points. Furthermore, no 'distance' is associated with points - it is either INSIDE or OUTSIDE (so request for distances will return zeros). -- ALGLIB -- Copyright 14.05.2016 by Bochkanov Sergey *************************************************************************/
ae_int_t alglib::kdtreequerybox( kdtree kdt, real_1d_array boxmin, real_1d_array boxmax);
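
A minimal sketch of a box query; the dataset and box bounds are illustrative.

#include "stdafx.h"
#include <stdio.h>
#include "alglibmisc.h"

using namespace alglib;

int main(int argc, char **argv)
{
    // five 2D points
    real_2d_array xy = "[[0,0],[0,1],[1,0],[1,1],[2,2]]";
    kdtree kdt;
    kdtreebuild(xy, 2, 0, 2, kdt);

    // all points inside the box [-0.5,1.5]x[-0.5,0.5]
    real_1d_array boxmin = "[-0.5,-0.5]";
    real_1d_array boxmax = "[1.5,0.5]";
    ae_int_t count = kdtreequerybox(kdt, boxmin, boxmax);

    // results are unordered; distances are not defined for box queries
    real_2d_array r;
    kdtreequeryresultsx(kdt, r); // r now holds the two matching points, (0,0) and (1,0)
    printf("%d\n", int(count)); // EXPECTED: 2
    return 0;
}
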
/************************************************************************* K-NN query: K nearest neighbors IMPORTANT: this function can not be used in multithreaded code because it uses internal temporary buffer of kd-tree object, which can not be shared between multiple threads. If you want to perform parallel requests, use function which uses external request buffer: KDTreeTsQueryKNN() ("Ts" stands for "thread-safe"). INPUT PARAMETERS KDT - KD-tree X - point, array[0..NX-1]. K - number of neighbors to return, K>=1 SelfMatch - whether self-matches are allowed: * if True, nearest neighbor may be the point itself (if it exists in original dataset) * if False, then only points with non-zero distance are returned * if not given, considered True RESULT number of actual neighbors found (either K or N, if K>N). This subroutine performs query and stores its result in the internal structures of the KD-tree. You can use following subroutines to obtain these results: * KDTreeQueryResultsX() to get X-values * KDTreeQueryResultsXY() to get X- and Y-values * KDTreeQueryResultsTags() to get tag values * KDTreeQueryResultsDistances() to get distances -- ALGLIB -- Copyright 28.02.2010 by Bochkanov Sergey *************************************************************************/
ae_int_t alglib::kdtreequeryknn(kdtree kdt, real_1d_array x, ae_int_t k); ae_int_t alglib::kdtreequeryknn( kdtree kdt, real_1d_array x, ae_int_t k, bool selfmatch);

Examples:   [1]  

/************************************************************************* Distances from last query This function returns results stored in the internal buffer of the kd-tree object. If you performed buffered requests (ones which use instances of the kdtreerequestbuffer class), you should call the buffered version of this function - kdtreetsqueryresultsdistances(). INPUT PARAMETERS KDT - KD-tree R - possibly pre-allocated buffer. If R is too small to store the result, it is resized. If size(R) is enough to store the result, it is left unchanged. OUTPUT PARAMETERS R - filled with distances (in corresponding norm) NOTES 1. points are ordered by distance from the query point (first = closest) 2. if R is larger than required to store the result, only its leading part will be overwritten; the trailing part will be left unchanged. So if on input R = [A,B,C,D], and the result is [1,2], then on exit we will get R = [1,2,C,D]. This is done purposely to increase performance; if you want the function to resize the array according to the result size, use the function with the same name and suffix 'I'. SEE ALSO * KDTreeQueryResultsX() X-values * KDTreeQueryResultsXY() X- and Y-values * KDTreeQueryResultsTags() tag values -- ALGLIB -- Copyright 28.02.2010 by Bochkanov Sergey *************************************************************************/
void alglib::kdtreequeryresultsdistances(kdtree kdt, real_1d_array& r);

Examples:   [1]  

/************************************************************************* Distances from last query; 'interactive' variant for languages like Python which support constructs like "R = KDTreeQueryResultsDistancesI(KDT)" and interactive mode of interpreter. This function allocates new array on each call, so it is significantly slower than its 'non-interactive' counterpart, but it is more convenient when you call it from command line. -- ALGLIB -- Copyright 28.02.2010 by Bochkanov Sergey *************************************************************************/
void alglib::kdtreequeryresultsdistancesi(kdtree kdt, real_1d_array& r);
/************************************************************************* Tags from last query This function returns results stored in the internal buffer of the kd-tree object. If you performed buffered requests (ones which use instances of the kdtreerequestbuffer class), you should call the buffered version of this function - kdtreetsqueryresultstags(). INPUT PARAMETERS KDT - KD-tree Tags - possibly pre-allocated buffer. If Tags is too small to store the result, it is resized. If size(Tags) is enough to store the result, it is left unchanged. OUTPUT PARAMETERS Tags - filled with tags associated with points, or, when no tags were supplied, with zeros NOTES 1. points are ordered by distance from the query point (first = closest) 2. if Tags is larger than required to store the result, only its leading part will be overwritten; the trailing part will be left unchanged. So if on input Tags = [A,B,C,D], and the result is [1,2], then on exit we will get Tags = [1,2,C,D]. This is done purposely to increase performance; if you want the function to resize the array according to the result size, use the function with the same name and suffix 'I'. SEE ALSO * KDTreeQueryResultsX() X-values * KDTreeQueryResultsXY() X- and Y-values * KDTreeQueryResultsDistances() distances -- ALGLIB -- Copyright 28.02.2010 by Bochkanov Sergey *************************************************************************/
void alglib::kdtreequeryresultstags(kdtree kdt, integer_1d_array& tags);

Examples:   [1]  

/************************************************************************* Tags from last query; 'interactive' variant for languages like Python which support constructs like "Tags = KDTreeQueryResultsTagsI(KDT)" and interactive mode of interpreter. This function allocates new array on each call, so it is significantly slower than its 'non-interactive' counterpart, but it is more convenient when you call it from command line. -- ALGLIB -- Copyright 28.02.2010 by Bochkanov Sergey *************************************************************************/
void alglib::kdtreequeryresultstagsi(kdtree kdt, integer_1d_array& tags);
/************************************************************************* X-values from last query. This function returns results stored in the internal buffer of the kd-tree object. If you performed buffered requests (ones which use instances of the kdtreerequestbuffer class), you should call the buffered version of this function - kdtreetsqueryresultsx(). INPUT PARAMETERS KDT - KD-tree X - possibly pre-allocated buffer. If X is too small to store the result, it is resized. If size(X) is enough to store the result, it is left unchanged. OUTPUT PARAMETERS X - rows are filled with X-values NOTES 1. points are ordered by distance from the query point (first = closest) 2. if X is larger than required to store the result, only its leading part will be overwritten; the trailing part will be left unchanged. So if on input X = [[A,B],[C,D]], and the result is [1,2], then on exit we will get X = [[1,2],[C,D]]. This is done purposely to increase performance; if you want the function to resize the array according to the result size, use the function with the same name and suffix 'I'. SEE ALSO * KDTreeQueryResultsXY() X- and Y-values * KDTreeQueryResultsTags() tag values * KDTreeQueryResultsDistances() distances -- ALGLIB -- Copyright 28.02.2010 by Bochkanov Sergey *************************************************************************/
void alglib::kdtreequeryresultsx(kdtree kdt, real_2d_array& x);

Examples:   [1]  

/************************************************************************* X-values from last query; 'interactive' variant for languages like Python which support constructs like "X = KDTreeQueryResultsXI(KDT)" and interactive mode of interpreter. This function allocates new array on each call, so it is significantly slower than its 'non-interactive' counterpart, but it is more convenient when you call it from command line. -- ALGLIB -- Copyright 28.02.2010 by Bochkanov Sergey *************************************************************************/
void alglib::kdtreequeryresultsxi(kdtree kdt, real_2d_array& x);
/************************************************************************* X- and Y-values from last query This function returns results stored in the internal buffer of the kd-tree object. If you performed buffered requests (ones which use instances of the kdtreerequestbuffer class), you should call the buffered version of this function - kdtreetsqueryresultsxy(). INPUT PARAMETERS KDT - KD-tree XY - possibly pre-allocated buffer. If XY is too small to store the result, it is resized. If size(XY) is enough to store the result, it is left unchanged. OUTPUT PARAMETERS XY - rows are filled with points: first NX columns with X-values, next NY columns - with Y-values. NOTES 1. points are ordered by distance from the query point (first = closest) 2. if XY is larger than required to store the result, only its leading part will be overwritten; the trailing part will be left unchanged. So if on input XY = [[A,B],[C,D]] and the result is [[1,2]], then on exit we will get XY = [[1,2],[C,D]]. This is done purposely to increase performance; if you want the function to resize the array according to the result size, use the function with the same name and suffix 'I'. SEE ALSO * KDTreeQueryResultsX() X-values * KDTreeQueryResultsTags() tag values * KDTreeQueryResultsDistances() distances -- ALGLIB -- Copyright 28.02.2010 by Bochkanov Sergey *************************************************************************/
void alglib::kdtreequeryresultsxy(kdtree kdt, real_2d_array& xy);

Examples:   [1]  

/************************************************************************* XY-values from last query; 'interactive' variant for languages like Python which support constructs like "XY = KDTreeQueryResultsXYI(KDT)" and interactive mode of interpreter. This function allocates new array on each call, so it is significantly slower than its 'non-interactive' counterpart, but it is more convenient when you call it from command line. -- ALGLIB -- Copyright 28.02.2010 by Bochkanov Sergey *************************************************************************/
void alglib::kdtreequeryresultsxyi(kdtree kdt, real_2d_array& xy);
/************************************************************************* R-NN query: all points within R-sphere centered at X IMPORTANT: this function can not be used in multithreaded code because it uses internal temporary buffer of kd-tree object, which can not be shared between multiple threads. If you want to perform parallel requests, use function which uses external request buffer: KDTreeTsQueryRNN() ("Ts" stands for "thread-safe"). INPUT PARAMETERS KDT - KD-tree X - point, array[0..NX-1]. R - radius of sphere (in corresponding norm), R>0 SelfMatch - whether self-matches are allowed: * if True, nearest neighbor may be the point itself (if it exists in original dataset) * if False, then only points with non-zero distance are returned * if not given, considered True RESULT number of neighbors found, >=0 This subroutine performs query and stores its result in the internal structures of the KD-tree. You can use following subroutines to obtain actual results: * KDTreeQueryResultsX() to get X-values * KDTreeQueryResultsXY() to get X- and Y-values * KDTreeQueryResultsTags() to get tag values * KDTreeQueryResultsDistances() to get distances -- ALGLIB -- Copyright 28.02.2010 by Bochkanov Sergey *************************************************************************/
ae_int_t alglib::kdtreequeryrnn(kdtree kdt, real_1d_array x, double r); ae_int_t alglib::kdtreequeryrnn( kdtree kdt, real_1d_array x, double r, bool selfmatch);

Examples:   [1]  
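
The following minimal sketch (not part of the official example set) shows a typical R-NN query on the same four-point dataset used by the other kd-tree examples in this manual; the radius 1.2 is chosen so that exactly three of the four points fall inside the sphere.

#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "alglibmisc.h"

using namespace alglib;


int main(int argc, char **argv)
{
    // build a 2-dimensional tree (NX=2, NY=0, Euclidean norm) over four points
    real_2d_array a = "[[0,0],[0,1],[1,0],[1,1]]";
    kdtree kdt;
    real_1d_array x = "[0,0]";
    real_2d_array r = "[[]]";
    ae_int_t cnt;
    kdtreebuild(a, 2, 0, 2, kdt);
    // all points within distance 1.2 of (0,0); self-matches are allowed by default
    cnt = kdtreequeryrnn(kdt, x, 1.2);
    printf("%d\n", int(cnt)); // EXPECTED: 3
    kdtreequeryresultsx(kdt, r);
    printf("%s\n", r.tostring(1).c_str()); // points ordered by distance, (0,0) first
    return 0;
}
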

/************************************************************************* This function serializes a data structure to a string. Important properties of s_out: * it contains alphanumeric characters, dots, underscores, minus signs * these symbols are grouped into words, which are separated by spaces and Windows-style (CR+LF) newlines * although the serializer uses spaces and CR+LF as separators, you can replace any separator character by an arbitrary combination of spaces, tabs, Windows or Unix newlines. This allows flexible reformatting of the string in case you want to include it into a text or XML file. But you should not insert separators into the middle of the "words", nor should you change the case of letters. * s_out can be freely moved between 32-bit and 64-bit systems, little and big endian machines, and so on. You can serialize a structure on a 32-bit machine and unserialize it on a 64-bit one (or vice versa), or serialize it on SPARC and unserialize on x86. You can also serialize it in the C++ version of ALGLIB and unserialize it in the C# one, and vice versa. *************************************************************************/
void kdtreeserialize(kdtree &obj, std::string &s_out); void kdtreeserialize(kdtree &obj, std::ostream &s_out);
/************************************************************************* K-NN query: approximate K nearest neighbors, using thread-local buffer. You can call this function from multiple threads for the same kd-tree instance, assuming that different instances of the buffer object are passed to different threads. INPUT PARAMETERS KDT - KD-tree Buf - request buffer object created for this particular instance of the kd-tree structure with the kdtreecreaterequestbuffer() function. X - point, array[0..NX-1]. K - number of neighbors to return, K>=1 SelfMatch - whether self-matches are allowed: * if True, nearest neighbor may be the point itself (if it exists in original dataset) * if False, then only points with non-zero distance are returned * if not given, considered True Eps - approximation factor, Eps>=0. An eps-approximate nearest neighbor is a neighbor whose distance from X is at most (1+eps) times the distance of the true nearest neighbor. RESULT number of actual neighbors found (either K or N, if K>N). NOTES significant performance gain may be achieved only when Eps is on the order of magnitude of 1 or larger. This subroutine performs the query and stores its result in the internal structures of the buffer object. You can use the following subroutines to obtain these results (pay attention to "Ts" in their names): * KDTreeTsQueryResultsX() to get X-values * KDTreeTsQueryResultsXY() to get X- and Y-values * KDTreeTsQueryResultsTags() to get tag values * KDTreeTsQueryResultsDistances() to get distances IMPORTANT: a kd-tree buffer should be used only with the KD-tree object which was used to initialize the buffer. Any attempt to use the buffer with a different object is dangerous - you may get an integrity check failure (exception) because sizes of internal arrays do not fit the dimensions of the KD-tree structure. -- ALGLIB -- Copyright 18.03.2016 by Bochkanov Sergey *************************************************************************/
ae_int_t alglib::kdtreetsqueryaknn( kdtree kdt, kdtreerequestbuffer buf, real_1d_array x, ae_int_t k, double eps); ae_int_t alglib::kdtreetsqueryaknn( kdtree kdt, kdtreerequestbuffer buf, real_1d_array x, ae_int_t k, bool selfmatch, double eps);
/************************************************************************* Box query: all points within user-specified box, using thread-local buffer. You can call this function from multiple threads for the same kd-tree instance, assuming that different instances of the buffer object are passed to different threads. INPUT PARAMETERS KDT - KD-tree Buf - request buffer object created for this particular instance of the kd-tree structure with the kdtreecreaterequestbuffer() function. BoxMin - lower bounds, array[0..NX-1]. BoxMax - upper bounds, array[0..NX-1]. RESULT number of actual neighbors found (in [0,N]). This subroutine performs the query and stores its result in the internal structures of the buffer object. You can use the following subroutines to obtain these results (pay attention to "Ts" in their names): * KDTreeTsQueryResultsX() to get X-values * KDTreeTsQueryResultsXY() to get X- and Y-values * KDTreeTsQueryResultsTags() to get tag values * KDTreeTsQueryResultsDistances() returns zeros for this query NOTE: this particular query returns unordered results, because there is no meaningful way of ordering points. Furthermore, no 'distance' is associated with points - a point is either INSIDE or OUTSIDE (so a request for distances will return zeros). IMPORTANT: a kd-tree buffer should be used only with the KD-tree object which was used to initialize the buffer. Any attempt to use the buffer with a different object is dangerous - you may get an integrity check failure (exception) because sizes of internal arrays do not fit the dimensions of the KD-tree structure. -- ALGLIB -- Copyright 14.05.2016 by Bochkanov Sergey *************************************************************************/
ae_int_t alglib::kdtreetsquerybox( kdtree kdt, kdtreerequestbuffer buf, real_1d_array boxmin, real_1d_array boxmax);
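
Below is a minimal sketch of a box query; it assumes kdtreecreaterequestbuffer() (documented in this subpackage) and reuses the four-point dataset from the kd-tree examples. The values in the comments are what one would expect for this data.

#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "alglibmisc.h"

using namespace alglib;


int main(int argc, char **argv)
{
    real_2d_array a = "[[0,0],[0,1],[1,0],[1,1]]";
    kdtree kdt;
    kdtreerequestbuffer buf;
    real_1d_array boxmin = "[-0.5,-0.5]";
    real_1d_array boxmax = "[0.5,0.5]";
    real_2d_array r = "[[]]";
    ae_int_t cnt;
    kdtreebuild(a, 2, 0, 2, kdt);
    kdtreecreaterequestbuffer(kdt, buf);
    // unordered query: all points with -0.5<=x0<=0.5 and -0.5<=x1<=0.5
    cnt = kdtreetsquerybox(kdt, buf, boxmin, boxmax);
    printf("%d\n", int(cnt)); // EXPECTED: 1, only (0,0) lies in the box
    kdtreetsqueryresultsx(kdt, buf, r);
    printf("%s\n", r.tostring(1).c_str()); // EXPECTED: [[0,0]]
    return 0;
}
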
/************************************************************************* K-NN query: K nearest neighbors, using external thread-local buffer. You can call this function from multiple threads for the same kd-tree instance, assuming that different instances of the buffer object are passed to different threads. INPUT PARAMETERS KDT - kd-tree Buf - request buffer object created for this particular instance of the kd-tree structure with the kdtreecreaterequestbuffer() function. X - point, array[0..NX-1]. K - number of neighbors to return, K>=1 SelfMatch - whether self-matches are allowed: * if True, nearest neighbor may be the point itself (if it exists in original dataset) * if False, then only points with non-zero distance are returned * if not given, considered True RESULT number of actual neighbors found (either K or N, if K>N). This subroutine performs the query and stores its result in the internal structures of the buffer object. You can use the following subroutines to obtain these results (pay attention to "Ts" in their names): * KDTreeTsQueryResultsX() to get X-values * KDTreeTsQueryResultsXY() to get X- and Y-values * KDTreeTsQueryResultsTags() to get tag values * KDTreeTsQueryResultsDistances() to get distances IMPORTANT: a kd-tree buffer should be used only with the KD-tree object which was used to initialize the buffer. Any attempt to use the buffer with a different object is dangerous - you may get an integrity check failure (exception) because sizes of internal arrays do not fit the dimensions of the KD-tree structure. -- ALGLIB -- Copyright 18.03.2016 by Bochkanov Sergey *************************************************************************/
ae_int_t alglib::kdtreetsqueryknn( kdtree kdt, kdtreerequestbuffer buf, real_1d_array x, ae_int_t k); ae_int_t alglib::kdtreetsqueryknn( kdtree kdt, kdtreerequestbuffer buf, real_1d_array x, ae_int_t k, bool selfmatch);
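
The sketch below illustrates the intended multi-threaded usage pattern described above: one kdtreerequestbuffer per worker thread, created with kdtreecreaterequestbuffer(), with results retrieved through the buffered ('Ts') result functions. For brevity it performs a single query from the main thread.

#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "alglibmisc.h"

using namespace alglib;


int main(int argc, char **argv)
{
    real_2d_array a = "[[0,0],[0,1],[1,0],[1,1]]";
    kdtree kdt;
    kdtreerequestbuffer buf;
    real_1d_array x = "[-1,0]";
    real_2d_array r = "[[]]";
    ae_int_t k;
    kdtreebuild(a, 2, 0, 2, kdt);
    // each worker thread should own its private request buffer
    kdtreecreaterequestbuffer(kdt, buf);
    k = kdtreetsqueryknn(kdt, buf, x, 1);
    printf("%d\n", int(k)); // EXPECTED: 1
    kdtreetsqueryresultsx(kdt, buf, r); // note the buffered results function
    printf("%s\n", r.tostring(1).c_str()); // EXPECTED: [[0,0]]
    return 0;
}
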
/************************************************************************* Distances from last query associated with kdtreerequestbuffer object. This function returns results of buffered requests (ones which use instances of the kdtreerequestbuffer class, i.e. queries with the "Ts" prefix); results are stored in the request buffer rather than in the kd-tree object itself. INPUT PARAMETERS KDT - KD-tree Buf - request buffer object created for this particular instance of the kd-tree structure. R - possibly pre-allocated buffer. If R is too small to store the result, it is resized. If size(R) is enough to store the result, it is left unchanged. OUTPUT PARAMETERS R - filled with distances (in corresponding norm) NOTES 1. points are ordered by distance from the query point (first = closest) 2. if the buffer is larger than required to store the result, only its leading part will be overwritten; the trailing part will be left unchanged. So if on input R = [A,B,C,D] and the result is [1,2], then on exit we will get R = [1,2,C,D]. This is done purposely to increase performance; if you want the function to resize the array according to the result size, use the function with the same name and suffix 'I'. SEE ALSO * KDTreeQueryResultsX() X-values * KDTreeQueryResultsXY() X- and Y-values * KDTreeQueryResultsTags() tag values -- ALGLIB -- Copyright 28.02.2010 by Bochkanov Sergey *************************************************************************/
void alglib::kdtreetsqueryresultsdistances( kdtree kdt, kdtreerequestbuffer buf, real_1d_array& r);
/************************************************************************* Tags from last query associated with kdtreerequestbuffer object. This function returns results of buffered requests (ones which use instances of the kdtreerequestbuffer class, i.e. queries with the "Ts" prefix); results are stored in the request buffer rather than in the kd-tree object itself. INPUT PARAMETERS KDT - KD-tree Buf - request buffer object created for this particular instance of the kd-tree structure. Tags - possibly pre-allocated buffer. If Tags is too small to store the result, it is resized. If size(Tags) is enough to store the result, it is left unchanged. OUTPUT PARAMETERS Tags - filled with tags associated with points, or, when no tags were supplied, with zeros NOTES 1. points are ordered by distance from the query point (first = closest) 2. if the buffer is larger than required to store the result, only its leading part will be overwritten; the trailing part will be left unchanged. So if on input Tags = [A,B,C,D] and the result is [1,2], then on exit we will get Tags = [1,2,C,D]. This is done purposely to increase performance; if you want the function to resize the array according to the result size, use the function with the same name and suffix 'I'. SEE ALSO * KDTreeQueryResultsX() X-values * KDTreeQueryResultsXY() X- and Y-values * KDTreeQueryResultsDistances() distances -- ALGLIB -- Copyright 28.02.2010 by Bochkanov Sergey *************************************************************************/
void alglib::kdtreetsqueryresultstags( kdtree kdt, kdtreerequestbuffer buf, integer_1d_array& tags);
/************************************************************************* X-values from last query associated with kdtreerequestbuffer object. INPUT PARAMETERS KDT - KD-tree Buf - request buffer object created for this particular instance of the kd-tree structure. X - possibly pre-allocated buffer. If X is too small to store the result, it is resized. If size(X) is enough to store the result, it is left unchanged. OUTPUT PARAMETERS X - rows are filled with X-values NOTES 1. points are ordered by distance from the query point (first = closest) 2. if X is larger than required to store the result, only its leading part will be overwritten; the trailing part will be left unchanged. So if on input X = [[A,B],[C,D]] and the result is [[1,2]], then on exit we will get X = [[1,2],[C,D]]. This is done purposely to increase performance; if you want the function to resize the array according to the result size, use the function with the same name and suffix 'I'. SEE ALSO * KDTreeQueryResultsXY() X- and Y-values * KDTreeQueryResultsTags() tag values * KDTreeQueryResultsDistances() distances -- ALGLIB -- Copyright 28.02.2010 by Bochkanov Sergey *************************************************************************/
void alglib::kdtreetsqueryresultsx( kdtree kdt, kdtreerequestbuffer buf, real_2d_array& x);
/************************************************************************* X- and Y-values from last query associated with kdtreerequestbuffer object. INPUT PARAMETERS KDT - KD-tree Buf - request buffer object created for this particular instance of kd-tree structure. XY - possibly pre-allocated buffer. If XY is too small to store result, it is resized. If size(XY) is enough to store result, it is left unchanged. OUTPUT PARAMETERS XY - rows are filled with points: first NX columns with X-values, next NY columns - with Y-values. NOTES 1. points are ordered by distance from the query point (first = closest) 2. if XY is larger than required to store result, only leading part will be overwritten; trailing part will be left unchanged. So if on input XY = [[A,B],[C,D]], and result is [1,2], then on exit we will get XY = [[1,2],[C,D]]. This is done purposely to increase performance; if you want function to resize array according to result size, use function with same name and suffix 'I'. SEE ALSO * KDTreeQueryResultsX() X-values * KDTreeQueryResultsTags() tag values * KDTreeQueryResultsDistances() distances -- ALGLIB -- Copyright 28.02.2010 by Bochkanov Sergey *************************************************************************/
void alglib::kdtreetsqueryresultsxy( kdtree kdt, kdtreerequestbuffer buf, real_2d_array& xy);
/************************************************************************* R-NN query: all points within R-sphere centered at X, using external thread-local buffer. You can call this function from multiple threads for the same kd-tree instance, assuming that different instances of the buffer object are passed to different threads. INPUT PARAMETERS KDT - KD-tree Buf - request buffer object created for this particular instance of the kd-tree structure with the kdtreecreaterequestbuffer() function. X - point, array[0..NX-1]. R - radius of sphere (in corresponding norm), R>0 SelfMatch - whether self-matches are allowed: * if True, nearest neighbor may be the point itself (if it exists in original dataset) * if False, then only points with non-zero distance are returned * if not given, considered True RESULT number of neighbors found, >=0 This subroutine performs the query and stores its result in the internal structures of the buffer object. You can use the following subroutines to obtain these results (pay attention to "Ts" in their names): * KDTreeTsQueryResultsX() to get X-values * KDTreeTsQueryResultsXY() to get X- and Y-values * KDTreeTsQueryResultsTags() to get tag values * KDTreeTsQueryResultsDistances() to get distances IMPORTANT: a kd-tree buffer should be used only with the KD-tree object which was used to initialize the buffer. Any attempt to use the buffer with a different object is dangerous - you may get an integrity check failure (exception) because sizes of internal arrays do not fit the dimensions of the KD-tree structure. -- ALGLIB -- Copyright 18.03.2016 by Bochkanov Sergey *************************************************************************/
ae_int_t alglib::kdtreetsqueryrnn( kdtree kdt, kdtreerequestbuffer buf, real_1d_array x, double r); ae_int_t alglib::kdtreetsqueryrnn( kdtree kdt, kdtreerequestbuffer buf, real_1d_array x, double r, bool selfmatch);
/************************************************************************* This function unserializes data structure from string. *************************************************************************/
void kdtreeunserialize(const std::string &s_in, kdtree &obj); void kdtreeunserialize(const std::istream &s_in, kdtree &obj);
#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "alglibmisc.h"

using namespace alglib;


int main(int argc, char **argv)
{
    real_2d_array a = "[[0,0],[0,1],[1,0],[1,1]]";
    ae_int_t nx = 2;
    ae_int_t ny = 0;
    ae_int_t normtype = 2;
    kdtree kdt;
    real_1d_array x;
    real_2d_array r = "[[]]";
    ae_int_t k;
    kdtreebuild(a, nx, ny, normtype, kdt);
    x = "[-1,0]";
    k = kdtreequeryknn(kdt, x, 1);
    printf("%d\n", int(k)); // EXPECTED: 1
    kdtreequeryresultsx(kdt, r);
    printf("%s\n", r.tostring(1).c_str()); // EXPECTED: [[0,0]]
    return 0;
}


#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "alglibmisc.h"

using namespace alglib;


int main(int argc, char **argv)
{
    real_2d_array a = "[[0,0],[0,1],[1,0],[1,1]]";
    ae_int_t nx = 2;
    ae_int_t ny = 0;
    ae_int_t normtype = 2;
    kdtree kdt0;
    kdtree kdt1;
    std::string s;
    real_1d_array x;
    real_2d_array r0 = "[[]]";
    real_2d_array r1 = "[[]]";

    //
    // Build tree and serialize it
    //
    kdtreebuild(a, nx, ny, normtype, kdt0);
    alglib::kdtreeserialize(kdt0, s);
    alglib::kdtreeunserialize(s, kdt1);

    //
    // Compare results from KNN queries
    //
    x = "[-1,0]";
    kdtreequeryknn(kdt0, x, 1);
    kdtreequeryresultsx(kdt0, r0);
    kdtreequeryknn(kdt1, x, 1);
    kdtreequeryresultsx(kdt1, r1);
    printf("%s\n", r0.tostring(1).c_str()); // EXPECTED: [[0,0]]
    printf("%s\n", r1.tostring(1).c_str()); // EXPECTED: [[0,0]]
    return 0;
}


nleqreport
nleqstate
nleqcreatelm
nleqrestartfrom
nleqresults
nleqresultsbuf
nleqsetcond
nleqsetstpmax
nleqsetxrep
nleqsolve
/************************************************************************* *************************************************************************/
class nleqreport { ae_int_t iterationscount; ae_int_t nfunc; ae_int_t njac; ae_int_t terminationtype; };
/************************************************************************* *************************************************************************/
class nleqstate { };
/************************************************************************* LEVENBERG-MARQUARDT-LIKE NONLINEAR SOLVER DESCRIPTION: This algorithm solves a system of nonlinear equations F[0](x[0], ..., x[n-1]) = 0 F[1](x[0], ..., x[n-1]) = 0 ... F[M-1](x[0], ..., x[n-1]) = 0 where M and N do not necessarily coincide. The algorithm converges quadratically under the following conditions: * the solution set XS is nonempty * for some xs in XS there exists a neighbourhood N(xs) such that: * the vector function F(x) and its Jacobian J(x) are continuously differentiable on N * ||F(x)|| provides a local error bound on N, i.e. there exists c1>0 such that ||F(x)||>c1*distance(x,XS) Note that these conditions are much weaker than the usual non-singularity conditions. For example, the algorithm will converge for any affine function F (whether its Jacobian is singular or not). REQUIREMENTS: The algorithm will request the following information during its operation: * function vector F[] and Jacobian matrix at a given point X * value of the merit function f(x)=F[0]^2(x)+...+F[M-1]^2(x) at a given point X USAGE: 1. User initializes algorithm state with NLEQCreateLM() call 2. User tunes solver parameters with NLEQSetCond(), NLEQSetStpMax() and other functions 3. User calls NLEQSolve() function which takes algorithm state and pointers (delegates, etc.) to callback functions which calculate the merit function value and the Jacobian. 4. User calls NLEQResults() to get the solution 5. Optionally, user may call NLEQRestartFrom() to solve another problem with same parameters (N/M) but another starting point and/or another function vector. NLEQRestartFrom() allows reusing an already initialized structure. INPUT PARAMETERS: N - space dimension, N>1: * if provided, only leading N elements of X are used * if not provided, determined automatically from size of X M - system size X - starting point OUTPUT PARAMETERS: State - structure which stores algorithm state NOTES: 1. you may tune stopping conditions with NLEQSetCond() function 2. if the target function contains exp() or other fast growing functions, and the algorithm makes steps so large that they lead to overflow, use the NLEQSetStpMax() function to bound the algorithm's steps. 3. this algorithm is a slightly modified implementation of the method described in 'Levenberg-Marquardt method for constrained nonlinear equations with strong local convergence properties' by Christian Kanzow, Nobuo Yamashita and Masao Fukushima, and further developed in 'On the convergence of a New Levenberg-Marquardt Method' by Jin-yan Fan and Ya-Xiang Yuan. -- ALGLIB -- Copyright 20.08.2009 by Bochkanov Sergey *************************************************************************/
void alglib::nleqcreatelm(ae_int_t m, real_1d_array x, nleqstate& state); void alglib::nleqcreatelm( ae_int_t n, ae_int_t m, real_1d_array x, nleqstate& state);
/************************************************************************* This subroutine restarts the NLEQ solver from a new point. All solver parameters are left unchanged. This function allows solving multiple problems (which must have the same number of dimensions) without object reallocation penalty. INPUT PARAMETERS: State - structure used for reverse communication, previously allocated with a NLEQCreateLM() call. X - new starting point. -- ALGLIB -- Copyright 30.07.2010 by Bochkanov Sergey *************************************************************************/
void alglib::nleqrestartfrom(nleqstate state, real_1d_array x);
/************************************************************************* NLEQ solver results INPUT PARAMETERS: State - algorithm state. OUTPUT PARAMETERS: X - array[0..N-1], solution Rep - optimization report: * Rep.TerminationType completion code: * -4 ERROR: algorithm has converged to a stationary point Xf which is a local minimum of f=F[0]^2+...+F[m-1]^2, but is not a solution of the nonlinear system. * 1 sqrt(f)<=EpsF. * 5 MaxIts steps were taken * 7 stopping conditions are too stringent, further improvement is impossible * Rep.IterationsCount contains the iterations count * Rep.NFunc contains the number of function evaluations * Rep.NJac contains the number of Jacobian evaluations -- ALGLIB -- Copyright 20.08.2009 by Bochkanov Sergey *************************************************************************/
void alglib::nleqresults( nleqstate state, real_1d_array& x, nleqreport& rep);
/************************************************************************* NLEQ solver results Buffered implementation of NLEQResults(), which uses pre-allocated buffer to store X[]. If buffer size is too small, it resizes buffer. It is intended to be used in the inner cycles of performance critical algorithms where array reallocation penalty is too large to be ignored. -- ALGLIB -- Copyright 20.08.2009 by Bochkanov Sergey *************************************************************************/
void alglib::nleqresultsbuf( nleqstate state, real_1d_array& x, nleqreport& rep);
/************************************************************************* This function sets stopping conditions for the nonlinear solver INPUT PARAMETERS: State - structure which stores algorithm state EpsF - >=0 The subroutine finishes its work if on k+1-th iteration the condition ||F||<=EpsF is satisfied MaxIts - maximum number of iterations. If MaxIts=0, the number of iterations is unlimited. Passing EpsF=0 and MaxIts=0 simultaneously will lead to automatic stopping criterion selection (small EpsF). -- ALGLIB -- Copyright 20.08.2010 by Bochkanov Sergey *************************************************************************/
void alglib::nleqsetcond(nleqstate state, double epsf, ae_int_t maxits);
/************************************************************************* This function sets maximum step length INPUT PARAMETERS: State - structure which stores algorithm state StpMax - maximum step length, >=0. Set StpMax to 0.0, if you don't want to limit step length. Use this subroutine when target function contains exp() or other fast growing functions, and algorithm makes too large steps which lead to overflow. This function allows us to reject steps that are too large (and therefore expose us to the possible overflow) without actually calculating function value at the x+stp*d. -- ALGLIB -- Copyright 20.08.2010 by Bochkanov Sergey *************************************************************************/
void alglib::nleqsetstpmax(nleqstate state, double stpmax);
/************************************************************************* This function turns on/off reporting. INPUT PARAMETERS: State - structure which stores algorithm state NeedXRep- whether iteration reports are needed or not If NeedXRep is True, algorithm will call rep() callback function if it is provided to NLEQSolve(). -- ALGLIB -- Copyright 20.08.2010 by Bochkanov Sergey *************************************************************************/
void alglib::nleqsetxrep(nleqstate state, bool needxrep);
/************************************************************************* This family of functions is used to launch iterations of the nonlinear solver. These functions accept the following parameters: state - algorithm state func - callback which calculates the function (or merit function) value func at a given point x jac - callback which calculates the function vector fi[] and the Jacobian jac at a given point x rep - optional callback which is called after each iteration; can be NULL ptr - optional pointer which is passed to func/grad/hess/jac/rep; can be NULL -- ALGLIB -- Copyright 20.03.2009 by Bochkanov Sergey *************************************************************************/
void nleqsolve(nleqstate &state, void (*func)(const real_1d_array &x, double &func, void *ptr), void (*jac)(const real_1d_array &x, real_1d_array &fi, real_2d_array &jac, void *ptr), void (*rep)(const real_1d_array &x, double func, void *ptr) = NULL, void *ptr = NULL);
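
The sketch below walks through the USAGE steps from NLEQCreateLM() on a small hypothetical system x0^2+x1^2=4, x0=x1 (one of its solutions is x0=x1=sqrt(2)); the callback signatures match nleqsolve() above, it assumes the solver is declared in solvers.h, and the values in the comments are approximate expectations rather than official test output.

#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "solvers.h"

using namespace alglib;
void nleq_func(const real_1d_array &x, double &func, void *ptr)
{
    // merit function f(x) = F0^2 + F1^2
    double f0 = x[0]*x[0] + x[1]*x[1] - 4.0;
    double f1 = x[0] - x[1];
    func = f0*f0 + f1*f1;
}
void nleq_jac(const real_1d_array &x, real_1d_array &fi, real_2d_array &jac, void *ptr)
{
    // function vector F[] and its Jacobian
    fi[0] = x[0]*x[0] + x[1]*x[1] - 4.0;
    fi[1] = x[0] - x[1];
    jac[0][0] = 2.0*x[0];
    jac[0][1] = 2.0*x[1];
    jac[1][0] = 1.0;
    jac[1][1] = -1.0;
}

int main(int argc, char **argv)
{
    real_1d_array x = "[1,1]";
    nleqstate state;
    nleqreport rep;
    nleqcreatelm(2, 2, x, state);    // N=2 variables, M=2 equations
    nleqsetcond(state, 1.0e-10, 0);  // stop when ||F|| <= 1.0E-10
    alglib::nleqsolve(state, nleq_func, nleq_jac);
    nleqresults(state, x, rep);
    printf("%d\n", int(rep.terminationtype)); // expected: 1 (||F||<=EpsF)
    printf("%s\n", x.tostring(3).c_str());    // expected to be close to [1.414,1.414]
    return 0;
}
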
errorfunction
errorfunctionc
inverf
invnormaldistribution
normaldistribution
/************************************************************************* Error function The integral is erf(x) = (2/sqrt(pi)) * Integral(0..x) exp(-t^2) dt For 0 <= |x| < 1, erf(x) = x * P4(x**2)/Q5(x**2); otherwise erf(x) = 1 - erfc(x). ACCURACY: Relative error: arithmetic domain # trials peak rms IEEE 0,1 30000 3.7e-16 1.0e-16 Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 1988, 1992, 2000 by Stephen L. Moshier *************************************************************************/
double alglib::errorfunction(double x);
/************************************************************************* Complementary error function erfc(x) = 1 - erf(x) = (2/sqrt(pi)) * Integral(x..+inf) exp(-t^2) dt For small x, erfc(x) = 1 - erf(x); otherwise rational approximations are computed. ACCURACY: Relative error: arithmetic domain # trials peak rms IEEE 0,26.6417 30000 5.7e-14 1.5e-14 Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 1988, 1992, 2000 by Stephen L. Moshier *************************************************************************/
double alglib::errorfunctionc(double x);
/************************************************************************* Inverse of the error function Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 1988, 1992, 2000 by Stephen L. Moshier *************************************************************************/
double alglib::inverf(double e);
/************************************************************************* Inverse of Normal distribution function Returns the argument, x, for which the area under the Gaussian probability density function (integrated from minus infinity to x) is equal to y. For small arguments 0 < y < exp(-2), the program computes z = sqrt( -2.0 * log(y) ); then the approximation is x = z - log(z)/z - (1/z) P(1/z) / Q(1/z). There are two rational functions P/Q, one for 0 < y < exp(-32) and the other for y up to exp(-2). For larger arguments, w = y - 0.5, and x/sqrt(2pi) = w + w**3 * R(w**2)/S(w**2). ACCURACY: Relative error: arithmetic domain # trials peak rms IEEE 0.125, 1 20000 7.2e-16 1.3e-16 IEEE 3e-308, 0.135 50000 4.6e-16 9.8e-17 Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 1988, 1992, 2000 by Stephen L. Moshier *************************************************************************/
double alglib::invnormaldistribution(double y0);
/************************************************************************* Normal distribution function Returns the area under the Gaussian probability density function, integrated from minus infinity to x: ndtr(x) = (1/sqrt(2*pi)) * Integral(-inf..x) exp(-t^2/2) dt = ( 1 + erf(z) ) / 2 = erfc(z) / 2, where z = x/sqrt(2). Computation is via the functions erf and erfc. ACCURACY: Relative error: arithmetic domain # trials peak rms IEEE -13,0 30000 3.4e-14 6.7e-15 Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 1988, 1992, 2000 by Stephen L. Moshier *************************************************************************/
double alglib::normaldistribution(double x);
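
A short sketch which exercises the identities stated above (erf(x)+erfc(x)=1, ndtr(0)=1/2, and the inverse functions); it assumes these functions are declared in specialfunctions.h.

#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "specialfunctions.h"

using namespace alglib;


int main(int argc, char **argv)
{
    // erf and erfc are complementary
    printf("%.6f\n", errorfunction(1.0) + errorfunctionc(1.0)); // EXPECTED: 1.000000
    // half of the Gaussian mass lies below zero
    printf("%.6f\n", normaldistribution(0.0));                  // EXPECTED: 0.500000
    // the inverse functions undo the direct ones
    printf("%.6f\n", inverf(errorfunction(0.5)));               // EXPECTED: 0.500000
    printf("%.6f\n", invnormaldistribution(normaldistribution(1.0))); // EXPECTED: 1.000000
    return 0;
}
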
normestimatorstate
normestimatorcreate
normestimatorestimatesparse
normestimatorresults
normestimatorsetseed
/************************************************************************* This object stores state of the iterative norm estimation algorithm. You should use ALGLIB functions to work with this object. *************************************************************************/
class normestimatorstate { };
/************************************************************************* This procedure initializes matrix norm estimator. USAGE: 1. User initializes algorithm state with NormEstimatorCreate() call 2. User calls NormEstimatorEstimateSparse() (or NormEstimatorIteration()) 3. User calls NormEstimatorResults() to get solution. INPUT PARAMETERS: M - number of rows in the matrix being estimated, M>0 N - number of columns in the matrix being estimated, N>0 NStart - number of random starting vectors recommended value - at least 5. NIts - number of iterations to do with best starting vector recommended value - at least 5. OUTPUT PARAMETERS: State - structure which stores algorithm state NOTE: this algorithm is effectively deterministic, i.e. it always returns same result when repeatedly called for the same matrix. In fact, algorithm uses randomized starting vectors, but internal random numbers generator always generates same sequence of the random values (it is a feature, not bug). Algorithm can be made non-deterministic with NormEstimatorSetSeed(0) call. -- ALGLIB -- Copyright 06.12.2011 by Bochkanov Sergey *************************************************************************/
void alglib::normestimatorcreate( ae_int_t m, ae_int_t n, ae_int_t nstart, ae_int_t nits, normestimatorstate& state);
/************************************************************************* This function estimates norm of the sparse M*N matrix A. INPUT PARAMETERS: State - norm estimator state, must be initialized with a call to NormEstimatorCreate() A - sparse M*N matrix, must be converted to CRS format prior to calling this function. After this function is over you can call NormEstimatorResults() to get estimate of the norm(A). -- ALGLIB -- Copyright 06.12.2011 by Bochkanov Sergey *************************************************************************/
void alglib::normestimatorestimatesparse( normestimatorstate state, sparsematrix a);
/************************************************************************* Matrix norm estimation results INPUT PARAMETERS: State - algorithm state OUTPUT PARAMETERS: Nrm - estimate of the matrix norm, Nrm>=0 -- ALGLIB -- Copyright 06.12.2011 by Bochkanov Sergey *************************************************************************/
void alglib::normestimatorresults(normestimatorstate state, double& nrm);
/************************************************************************* This function changes seed value used by algorithm. In some cases we need deterministic processing, i.e. subsequent calls must return equal results, in other cases we need non-deterministic algorithm which returns different results for the same matrix on every pass. Setting zero seed will lead to non-deterministic algorithm, while non-zero value will make our algorithm deterministic. INPUT PARAMETERS: State - norm estimator state, must be initialized with a call to NormEstimatorCreate() SeedVal - seed value, >=0. Zero value = non-deterministic algo. -- ALGLIB -- Copyright 06.12.2011 by Bochkanov Sergey *************************************************************************/
void alglib::normestimatorsetseed( normestimatorstate state, ae_int_t seedval);
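
A minimal sketch of the workflow described in NormEstimatorCreate(): create the estimator, feed it a CRS sparse matrix, read the estimate. It assumes the sparsematrix helpers sparsecreate(), sparseset() and sparseconverttocrs() from the LinAlg package; for the diagonal matrix diag(2,3,5) the 2-norm estimate should be close to 5.

#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "linalg.h"

using namespace alglib;


int main(int argc, char **argv)
{
    sparsematrix a;
    normestimatorstate s;
    double nrm;
    // 3x3 diagonal matrix diag(2,3,5), built in hash-table format, then converted to CRS
    sparsecreate(3, 3, a);
    sparseset(a, 0, 0, 2.0);
    sparseset(a, 1, 1, 3.0);
    sparseset(a, 2, 2, 5.0);
    sparseconverttocrs(a);
    normestimatorcreate(3, 3, 5, 5, s); // M=3, N=3, 5 starting vectors, 5 iterations
    normestimatorestimatesparse(s, a);
    normestimatorresults(s, nrm);
    printf("%.3f\n", nrm); // expected to be close to 5.000 (largest singular value)
    return 0;
}
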
nsfitspheremcc
nsfitspheremic
nsfitspheremzc
nsfitspherex
/************************************************************************* Fits minimum circumscribed (MCC) circle (or NX-dimensional sphere) to data (a set of points in NX-dimensional space). INPUT PARAMETERS: XY - array[NPoints,NX] (or larger), contains dataset. One row = one point in NX-dimensional space. NPoints - dataset size, NPoints>0 NX - space dimensionality, NX>0 (1, 2, 3, 4, 5 and so on) OUTPUT PARAMETERS: CX - central point for a sphere RHi - radius NOTE: this function is an easy-to-use wrapper around more powerful "expert" function nsfitspherex(). This wrapper is optimized for ease of use and stability - at the cost of somewhat lower performance (we have to use very tight stopping criteria for inner optimizer because we want to make sure that it will converge on any dataset). If you are ready to experiment with settings of "expert" function, you can achieve ~2-4x speedup over standard "bulletproof" settings. -- ALGLIB -- Copyright 14.04.2017 by Bochkanov Sergey *************************************************************************/
void alglib::nsfitspheremcc( real_2d_array xy, ae_int_t npoints, ae_int_t nx, real_1d_array& cx, double& rhi);
/************************************************************************* Fits maximum inscribed circle (or NX-dimensional sphere) to data (a set of points in NX-dimensional space). INPUT PARAMETERS: XY - array[NPoints,NX] (or larger), contains dataset. One row = one point in NX-dimensional space. NPoints - dataset size, NPoints>0 NX - space dimensionality, NX>0 (1, 2, 3, 4, 5 and so on) OUTPUT PARAMETERS: CX - central point for a sphere RLo - radius NOTE: this function is an easy-to-use wrapper around more powerful "expert" function nsfitspherex(). This wrapper is optimized for ease of use and stability - at the cost of somewhat lower performance (we have to use very tight stopping criteria for inner optimizer because we want to make sure that it will converge on any dataset). If you are ready to experiment with settings of "expert" function, you can achieve ~2-4x speedup over standard "bulletproof" settings. -- ALGLIB -- Copyright 14.04.2017 by Bochkanov Sergey *************************************************************************/
void alglib::nsfitspheremic( real_2d_array xy, ae_int_t npoints, ae_int_t nx, real_1d_array& cx, double& rlo);
/************************************************************************* Fits minimum zone circle (or NX-dimensional sphere) to data (a set of points in NX-dimensional space). INPUT PARAMETERS: XY - array[NPoints,NX] (or larger), contains dataset. One row = one point in NX-dimensional space. NPoints - dataset size, NPoints>0 NX - space dimensionality, NX>0 (1, 2, 3, 4, 5 and so on) OUTPUT PARAMETERS: CX - central point for a sphere RLo - radius of inscribed circle RHi - radius of circumscribed circle NOTE: this function is an easy-to-use wrapper around more powerful "expert" function nsfitspherex(). This wrapper is optimized for ease of use and stability - at the cost of somewhat lower performance (we have to use very tight stopping criteria for inner optimizer because we want to make sure that it will converge on any dataset). If you are ready to experiment with settings of "expert" function, you can achieve ~2-4x speedup over standard "bulletproof" settings. -- ALGLIB -- Copyright 14.04.2017 by Bochkanov Sergey *************************************************************************/
void alglib::nsfitspheremzc( real_2d_array xy, ae_int_t npoints, ae_int_t nx, real_1d_array& cx, double& rlo, double& rhi);
/************************************************************************* Fitting minimum circumscribed, maximum inscribed or minimum zone circles (or NX-dimensional spheres) to data (a set of points in NX-dimensional space). This is an expert function which allows you to tweak many parameters of the underlying nonlinear solver: * stopping criteria for inner iterations * number of outer iterations * penalty coefficient used to handle nonlinear constraints (we convert an unconstrained nonsmooth optimization problem involving max() and/or min() operations to a quadratically constrained smooth one). You may tweak all these parameters or only some of them, leaving other ones at their default state - just specify a zero value, and the solver will fill it with an appropriate default one. These comments also include some discussion of the approach used to handle such an unusual fitting problem, its stability, drawbacks of alternative methods, and convergence properties. INPUT PARAMETERS: XY - array[NPoints,NX] (or larger), contains dataset. One row = one point in NX-dimensional space. NPoints - dataset size, NPoints>0 NX - space dimensionality, NX>0 (1, 2, 3, 4, 5 and so on) ProblemType - used to encode problem type: * 1 for minimum circumscribed circle/sphere fitting (MCC) * 2 for maximum inscribed circle/sphere fitting (MIC) * 3 for minimum zone circle fitting (difference between Rhi and Rlo is minimized), denoted as MZC EpsX - stopping condition for NLC optimizer: * must be non-negative * use 0 to choose default value (1.0E-12 is used by default) * you may specify larger values, up to 1.0E-6, if you want to speed up the solver; the NLC solver performs several preconditioned outer iterations, so the final result typically has precision much better than EpsX. AULIts - number of outer iterations performed by NLC optimizer: * must be non-negative * use 0 to choose default value (20 is used by default) * you may specify values smaller than 20 if you want to speed up the solver; 10 often results in a good combination of precision and speed; sometimes you may get good results with just 6 outer iterations. Penalty - penalty coefficient for NLC optimizer: * must be non-negative * use 0 to choose default value (1.0E6 in current version) * it should be really large, 1.0E6...1.0E7 is a good value to start from; * generally, default value is good enough OUTPUT PARAMETERS: CX - central point for a sphere RLo - radius: * for ProblemType=2,3, radius of the inscribed sphere * for ProblemType=1 - zero RHi - radius: * for ProblemType=1,3, radius of the circumscribed sphere * for ProblemType=2 - zero NOTE: ON THE UNIQUENESS OF SOLUTIONS ALGLIB provides solution to several related circle fitting problems: MCC (minimum circumscribed), MIC (maximum inscribed) and MZC (minimum zone) fitting. It is important to note that among these problems only MCC is convex and has a unique solution independent of the starting point. As for MIC, it may (or may not, depending on dataset properties) have multiple solutions, and it always has one degenerate solution C=infinity which corresponds to an infinitely large radius. Thus, there are no guarantees that the solution to MIC returned by this solver will be the best one (and no one can provide you with such a guarantee because the problem is NP-hard). The only guarantee you have is that this solution is locally optimal, i.e. it cannot be improved by infinitesimally small tweaks in the parameters. 
It is also possible to "run away" to infinity when started from a bad initial point located outside of the point cloud (or when the point cloud does not span the entire circumference/surface of the sphere). Finally, MZC (minimum zone circle) stands somewhere between MCC and MIC in stability. It is somewhat regularized by the "circumscribed" term of the merit function; however, solutions to MZC may be non-unique, and in some unlucky cases it is also possible to "run away to infinity". NOTE: ON THE NONLINEARLY CONSTRAINED PROGRAMMING APPROACH The problem formulation for MCC (minimum circumscribed circle; for the sake of simplicity we omit MZC and MIC here) is: min(C) max(i) ||XY[i]-C||^2 i.e. it is an unconstrained nonsmooth optimization problem of finding the "best" central point, with the radius R being unambiguously determined from C. In order to move away from non-smoothness we use the following reformulation: min(C,R) R subject to R>=0 and ||XY[i]-C||^2 <= R^2 for all i i.e. it becomes a smooth quadratically constrained optimization problem with a linear target function. Such a problem statement is 100% equivalent to the original nonsmooth one, but much easier to approach. We solve it with the MinNLC solver provided by ALGLIB. NOTE: ON INSTABILITY OF THE SEQUENTIAL LINEAR PROGRAMMING APPROACH ALGLIB has a nonlinearly constrained solver which proved to be stable on such problems. However, some authors proposed to linearize constraints in the vicinity of the current approximation (Ci,Ri) and to get the next approximate solution (Ci+1,Ri+1) as a solution to a linear programming problem. Obviously, LP problems are easier than nonlinearly constrained ones. Indeed, the SLP approach to MCC/MIC/MZC resulted in a ~10-20x increase in performance (when compared with the NLC solver). However, it turned out that in some cases the linearized model fails to predict the correct direction for the next step and tells us that we converged to a solution even when we are still 2-4 digits of precision away from it. It is important that it is not a failure of the LP solver - it is a failure of the linear model; even when solved exactly, it fails to handle subtle nonlinearities which arise near the solution. We validated it by comparing results returned by the ALGLIB linear solver with those of MATLAB. In our experiments with the SLP solver: * MCC failed most often, at both realistic and synthetic datasets * MIC sometimes failed, but sometimes succeeded * MZC often succeeded; our guess is that the presence of two independent sets of constraints (one set for Rlo and another one for Rhi) and two terms in the target function (Rlo and Rhi) regularizes the task, so when the linear model fails to handle nonlinearities from Rlo, it uses Rhi as a hint (and vice versa). Because the SLP approach failed to achieve stable results, we do not include it in ALGLIB. -- ALGLIB -- Copyright 14.04.2017 by Bochkanov Sergey *************************************************************************/
void alglib::nsfitspherex( real_2d_array xy, ae_int_t npoints, ae_int_t nx, ae_int_t problemtype, double epsx, ae_int_t aulits, double penalty, real_1d_array& cx, double& rlo, double& rhi);
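
A minimal sketch of the easy-to-use wrapper nsfitspheremcc() applied to the corners of the unit square; the header name used below (interpolation.h) is an assumption - check the package layout of your ALGLIB edition. The exact minimum circumscribed circle for this data is centered at (0.5,0.5) with radius sqrt(0.5).

#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "interpolation.h"

using namespace alglib;


int main(int argc, char **argv)
{
    // four corners of the unit square, NPoints=4, NX=2
    real_2d_array xy = "[[0,0],[0,1],[1,0],[1,1]]";
    real_1d_array cx;
    double rhi;
    nsfitspheremcc(xy, 4, 2, cx, rhi);
    printf("%s\n", cx.tostring(2).c_str()); // expected to be close to [0.50,0.50]
    printf("%.3f\n", rhi);                  // expected to be close to 0.707
    return 0;
}
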
odesolverreport
odesolverstate
odesolverresults
odesolverrkck
odesolversolve
odesolver_d1 Solving y'=-y with ODE solver
/************************************************************************* *************************************************************************/
class odesolverreport { ae_int_t nfev; ae_int_t terminationtype; };
/************************************************************************* *************************************************************************/
class odesolverstate { };
/************************************************************************* ODE solver results Called after OdeSolverIteration returned False. INPUT PARAMETERS: State - algorithm state (used by OdeSolverIteration). OUTPUT PARAMETERS: M - number of tabulated values, M>=1 XTbl - array[0..M-1], values of X YTbl - array[0..M-1,0..N-1], values of Y in X[i] Rep - solver report: * Rep.TerminationType completion code: * -2 X is not ordered by ascending or descending, or there are non-distinct X[], i.e. X[i]=X[i+1] * -1 incorrect parameters were specified * 1 task has been solved * Rep.NFEV contains number of function calculations -- ALGLIB -- Copyright 01.09.2009 by Bochkanov Sergey *************************************************************************/
void alglib::odesolverresults( odesolverstate state, ae_int_t& m, real_1d_array& xtbl, real_2d_array& ytbl, odesolverreport& rep);

Examples:   [1]  

/************************************************************************* Cash-Karp adaptive ODE solver. This subroutine solves ODE Y'=f(Y,x) with initial conditions Y(xs)=Ys (here Y may be a single variable or a vector of N variables). INPUT PARAMETERS: Y - initial conditions, array[0..N-1]. contains values of Y[] at X[0] N - system size X - points at which Y should be tabulated, array[0..M-1] integration starts at X[0], ends at X[M-1], intermediate values at X[i] are returned too. SHOULD BE ORDERED BY ASCENDING OR BY DESCENDING! M - number of intermediate points + first point + last point: * M>2 means that you need both Y(X[M-1]) and M-2 values at intermediate points * M=2 means that you want just to integrate from X[0] to X[1] and are not interested in intermediate values. * M=1 means that you don't want to integrate :) it is a degenerate case, but it will be handled correctly. * M<1 means error Eps - tolerance (absolute/relative error on each step will be less than Eps). When passing: * Eps>0, it means desired ABSOLUTE error * Eps<0, it means desired RELATIVE error. Relative errors are calculated with respect to maximum values of Y seen so far. Be careful to use this criterion when starting from Y[] that are close to zero. H - initial step length, it will be adjusted automatically after the first step. If H=0, step will be selected automatically (usually it will be equal to 0.001 of min(x[i]-x[j])). OUTPUT PARAMETERS State - structure which stores algorithm state between subsequent calls of OdeSolverIteration. Used for reverse communication. This structure should be passed to the OdeSolverIteration subroutine. SEE ALSO AutoGKSmoothW, AutoGKSingular, AutoGKIteration, AutoGKResults. -- ALGLIB -- Copyright 01.09.2009 by Bochkanov Sergey *************************************************************************/
void alglib::odesolverrkck( real_1d_array y, real_1d_array x, double eps, double h, odesolverstate& state); void alglib::odesolverrkck( real_1d_array y, ae_int_t n, real_1d_array x, ae_int_t m, double eps, double h, odesolverstate& state);

Examples:   [1]  

/************************************************************************* This function is used to launch iterations of the ODE solver It accepts the following parameters: diff - callback which calculates dy/dx for given y and x ptr - optional pointer which is passed to diff; can be NULL -- ALGLIB -- Copyright 01.09.2009 by Bochkanov Sergey *************************************************************************/
void odesolversolve(odesolverstate &state, void (*diff)(const real_1d_array &y, double x, real_1d_array &dy, void *ptr), void *ptr = NULL);

Examples:   [1]  

#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "diffequations.h"

using namespace alglib;
void ode_function_1_diff(const real_1d_array &y, double x, real_1d_array &dy, void *ptr) 
{
    // this callback calculates f(y[],x)=-y[0]
    dy[0] = -y[0];
}

int main(int argc, char **argv)
{
    real_1d_array y = "[1]";
    real_1d_array x = "[0, 1, 2, 3]";
    double eps = 0.00001;
    double h = 0;
    odesolverstate s;
    ae_int_t m;
    real_1d_array xtbl;
    real_2d_array ytbl;
    odesolverreport rep;
    odesolverrkck(y, x, eps, h, s);
    alglib::odesolversolve(s, ode_function_1_diff);
    odesolverresults(s, m, xtbl, ytbl, rep);
    printf("%d\n", int(m)); // EXPECTED: 4
    printf("%s\n", xtbl.tostring(2).c_str()); // EXPECTED: [0, 1, 2, 3]
    printf("%s\n", ytbl.tostring(2).c_str()); // EXPECTED: [[1], [0.367], [0.135], [0.050]]
    return 0;
}


cmatrixlq
cmatrixlqunpackl
cmatrixlqunpackq
cmatrixqr
cmatrixqrunpackq
cmatrixqrunpackr
hmatrixtd
hmatrixtdunpackq
rmatrixbd
rmatrixbdmultiplybyp
rmatrixbdmultiplybyq
rmatrixbdunpackdiagonals
rmatrixbdunpackpt
rmatrixbdunpackq
rmatrixhessenberg
rmatrixhessenbergunpackh
rmatrixhessenbergunpackq
rmatrixlq
rmatrixlqunpackl
rmatrixlqunpackq
rmatrixqr
rmatrixqrunpackq
rmatrixqrunpackr
smatrixtd
smatrixtdunpackq
/************************************************************************* LQ decomposition of a rectangular complex matrix of size MxN COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that QP decomposition is harder to ! parallelize than, say, matrix-matrix product - this algorithm has ! many internal synchronization points which can not be avoided. However ! parallelism starts to be profitable starting from N=512, achieving ! near-linear speedup for N=4096 or higher. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. Input parameters: A - matrix A whose indexes range within [0..M-1, 0..N-1] M - number of rows in matrix A. N - number of columns in matrix A. Output parameters: A - matrices Q and L in compact form Tau - array of scalar factors which are used to form matrix Q. Array whose indexes range within [0.. Min(M,N)-1] Matrix A is represented as A = LQ, where Q is an orthogonal matrix of size MxM, L - lower triangular (or lower trapezoid) matrix of size MxN. -- LAPACK routine (version 3.0) -- Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., Courant Institute, Argonne National Lab, and Rice University September 30, 1994 *************************************************************************/
void alglib::cmatrixlq( complex_2d_array& a, ae_int_t m, ae_int_t n, complex_1d_array& tau); void alglib::smp_cmatrixlq( complex_2d_array& a, ae_int_t m, ae_int_t n, complex_1d_array& tau);
/************************************************************************* Unpacking of matrix L from the LQ decomposition of a matrix A Input parameters: A - matrices Q and L in compact form. Output of CMatrixLQ subroutine. M - number of rows in given matrix A. M>=0. N - number of columns in given matrix A. N>=0. Output parameters: L - matrix L, array[0..M-1, 0..N-1]. -- ALGLIB routine -- 17.02.2010 Bochkanov Sergey *************************************************************************/
void alglib::cmatrixlqunpackl( complex_2d_array a, ae_int_t m, ae_int_t n, complex_2d_array& l);
/************************************************************************* Partial unpacking of matrix Q from LQ decomposition of a complex matrix A. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that QP decomposition is harder to ! parallelize than, say, matrix-matrix product - this algorithm has ! many internal synchronization points which can not be avoided. However ! parallelism starts to be profitable starting from N=512, achieving ! near-linear speedup for N=4096 or higher. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. Input parameters: A - matrices Q and R in compact form. Output of CMatrixLQ subroutine . M - number of rows in matrix A. M>=0. N - number of columns in matrix A. N>=0. Tau - scalar factors which are used to form Q. Output of CMatrixLQ subroutine . QRows - required number of rows in matrix Q. N>=QColumns>=0. Output parameters: Q - first QRows rows of matrix Q. Array whose index ranges within [0..QRows-1, 0..N-1]. If QRows=0, array isn't changed. -- ALGLIB routine -- 17.02.2010 Bochkanov Sergey *************************************************************************/
void alglib::cmatrixlqunpackq( complex_2d_array a, ae_int_t m, ae_int_t n, complex_1d_array tau, ae_int_t qrows, complex_2d_array& q); void alglib::smp_cmatrixlqunpackq( complex_2d_array a, ae_int_t m, ae_int_t n, complex_1d_array tau, ae_int_t qrows, complex_2d_array& q);
/************************************************************************* QR decomposition of a rectangular complex matrix of size MxN COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that QP decomposition is harder to ! parallelize than, say, matrix-matrix product - this algorithm has ! many internal synchronization points which can not be avoided. However ! parallelism starts to be profitable starting from N=512, achieving ! near-linear speedup for N=4096 or higher. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. Input parameters: A - matrix A whose indexes range within [0..M-1, 0..N-1] M - number of rows in matrix A. N - number of columns in matrix A. Output parameters: A - matrices Q and R in compact form Tau - array of scalar factors which are used to form matrix Q. Array whose indexes range within [0.. Min(M,N)-1] Matrix A is represented as A = QR, where Q is an orthogonal matrix of size MxM, R - upper triangular (or upper trapezoid) matrix of size MxN. -- LAPACK routine (version 3.0) -- Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., Courant Institute, Argonne National Lab, and Rice University September 30, 1994 *************************************************************************/
void alglib::cmatrixqr( complex_2d_array& a, ae_int_t m, ae_int_t n, complex_1d_array& tau); void alglib::smp_cmatrixqr( complex_2d_array& a, ae_int_t m, ae_int_t n, complex_1d_array& tau);
/************************************************************************* Partial unpacking of matrix Q from QR decomposition of a complex matrix A. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that QP decomposition is harder to ! parallelize than, say, matrix-matrix product - this algorithm has ! many internal synchronization points which can not be avoided. However ! parallelism starts to be profitable starting from N=512, achieving ! near-linear speedup for N=4096 or higher. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. Input parameters: A - matrices Q and R in compact form. Output of CMatrixQR subroutine . M - number of rows in matrix A. M>=0. N - number of columns in matrix A. N>=0. Tau - scalar factors which are used to form Q. Output of CMatrixQR subroutine . QColumns - required number of columns in matrix Q. M>=QColumns>=0. Output parameters: Q - first QColumns columns of matrix Q. Array whose index ranges within [0..M-1, 0..QColumns-1]. If QColumns=0, array isn't changed. -- ALGLIB routine -- 17.02.2010 Bochkanov Sergey *************************************************************************/
void alglib::cmatrixqrunpackq( complex_2d_array a, ae_int_t m, ae_int_t n, complex_1d_array tau, ae_int_t qcolumns, complex_2d_array& q); void alglib::smp_cmatrixqrunpackq( complex_2d_array a, ae_int_t m, ae_int_t n, complex_1d_array tau, ae_int_t qcolumns, complex_2d_array& q);
/************************************************************************* Unpacking of matrix R from the QR decomposition of a matrix A Input parameters: A - matrices Q and R in compact form. Output of CMatrixQR subroutine. M - number of rows in given matrix A. M>=0. N - number of columns in given matrix A. N>=0. Output parameters: R - matrix R, array[0..M-1, 0..N-1]. -- ALGLIB routine -- 17.02.2010 Bochkanov Sergey *************************************************************************/
void alglib::cmatrixqrunpackr( complex_2d_array a, ae_int_t m, ae_int_t n, complex_2d_array& r);
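As with the LQ routines, cmatrixqr() produces a compact factorization which the two unpacking routines above turn into explicit Q and R factors. A minimal sketch with arbitrary data, assuming the linalg.h header:

#include "stdafx.h"
#include <stdio.h>
#include "linalg.h"

using namespace alglib;

int main(int argc, char **argv)
{
    //
    // QR decomposition of a 3x2 complex matrix: A = Q*R.
    //
    complex_2d_array a = "[[1,2],[3i,4],[5,6i]]";
    ae_int_t m = 3;
    ae_int_t n = 2;
    complex_1d_array tau;
    cmatrixqr(a, m, n, tau);

    complex_2d_array q, r;
    cmatrixqrunpackq(a, m, n, tau, m, q);   // all M columns of Q (QColumns=M)
    cmatrixqrunpackr(a, m, n, r);           // MxN upper trapezoid factor R

    printf("%s\n", q.tostring(3).c_str());
    printf("%s\n", r.tostring(3).c_str());
    return 0;
}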
/************************************************************************* Reduction of a Hermitian matrix which is given by its higher or lower triangular part to a real tridiagonal matrix using unitary similarity transformation: Q'*A*Q = T. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes one important improvement of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Generally, commercial ALGLIB is several times faster than open-source ! generic C edition, and many times faster than open-source C# edition. ! ! Multithreaded acceleration is NOT supported for this function. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. Input parameters: A - matrix to be transformed array with elements [0..N-1, 0..N-1]. N - size of matrix A. IsUpper - storage format. If IsUpper = True, then matrix A is given by its upper triangle, and the lower triangle is not used and not modified by the algorithm, and vice versa if IsUpper = False. Output parameters: A - matrices T and Q in compact form (see lower) Tau - array of factors which are forming matrices H(i) array with elements [0..N-2]. D - main diagonal of real symmetric matrix T. array with elements [0..N-1]. E - secondary diagonal of real symmetric matrix T. array with elements [0..N-2]. If IsUpper=True, the matrix Q is represented as a product of elementary reflectors Q = H(n-2) . . . H(2) H(0). Each H(i) has the form H(i) = I - tau * v * v' where tau is a complex scalar, and v is a complex vector with v(i+1:n-1) = 0, v(i) = 1, v(0:i-1) is stored on exit in A(0:i-1,i+1), and tau in TAU(i). If IsUpper=False, the matrix Q is represented as a product of elementary reflectors Q = H(0) H(2) . . . H(n-2). Each H(i) has the form H(i) = I - tau * v * v' where tau is a complex scalar, and v is a complex vector with v(0:i) = 0, v(i+1) = 1, v(i+2:n-1) is stored on exit in A(i+2:n-1,i), and tau in TAU(i). The contents of A on exit are illustrated by the following examples with n = 5: if UPLO = 'U': if UPLO = 'L': ( d e v1 v2 v3 ) ( d ) ( d e v2 v3 ) ( e d ) ( d e v3 ) ( v0 e d ) ( d e ) ( v0 v1 e d ) ( d ) ( v0 v1 v2 e d ) where d and e denote diagonal and off-diagonal elements of T, and vi denotes an element of the vector defining H(i). -- LAPACK routine (version 3.0) -- Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., Courant Institute, Argonne National Lab, and Rice University October 31, 1992 *************************************************************************/
void alglib::hmatrixtd( complex_2d_array& a, ae_int_t n, bool isupper, complex_1d_array& tau, real_1d_array& d, real_1d_array& e);
/************************************************************************* Unpacking matrix Q which reduces a Hermitian matrix to a real tridiagonal form. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes one important improvement of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Generally, commercial ALGLIB is several times faster than open-source ! generic C edition, and many times faster than open-source C# edition. ! ! Multithreaded acceleration is NOT supported for this function. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. Input parameters: A - the result of a HMatrixTD subroutine N - size of matrix A. IsUpper - storage format (a parameter of HMatrixTD subroutine) Tau - the result of a HMatrixTD subroutine Output parameters: Q - transformation matrix. array with elements [0..N-1, 0..N-1]. -- ALGLIB -- Copyright 2005-2010 by Bochkanov Sergey *************************************************************************/
void alglib::hmatrixtdunpackq( complex_2d_array a, ae_int_t n, bool isupper, complex_1d_array tau, complex_2d_array& q);
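A short sketch of the typical call sequence: hmatrixtd() reduces the Hermitian matrix to real tridiagonal form T (diagonals D and E), and hmatrixtdunpackq() recovers the unitary matrix Q such that Q'*A*Q = T. This is not an official manual example; the Hermitian test matrix is arbitrary and the linalg.h header is assumed.

#include "stdafx.h"
#include <stdio.h>
#include "linalg.h"

using namespace alglib;

int main(int argc, char **argv)
{
    //
    // Reduce a 3x3 Hermitian matrix (upper triangle is used) to
    // real tridiagonal form, then unpack the unitary factor Q.
    //
    complex_2d_array a = "[[2,1i,0],[-1i,3,2i],[0,-2i,4]]";
    ae_int_t n = 3;
    complex_1d_array tau;
    real_1d_array d, e;
    hmatrixtd(a, n, true, tau, d, e);

    complex_2d_array q;
    hmatrixtdunpackq(a, n, true, tau, q);

    printf("%s\n", d.tostring(3).c_str());  // main diagonal of T
    printf("%s\n", e.tostring(3).c_str());  // off-diagonal of T
    printf("%s\n", q.tostring(3).c_str());
    return 0;
}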
/************************************************************************* Reduction of a rectangular matrix to bidiagonal form The algorithm reduces the rectangular matrix A to bidiagonal form by orthogonal transformations P and Q: A = Q*B*(P^T). COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes one important improvement of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Multithreaded acceleration is NOT supported for this function because ! bidiagonal decompostion is inherently sequential in nature. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. Input parameters: A - source matrix. array[0..M-1, 0..N-1] M - number of rows in matrix A. N - number of columns in matrix A. Output parameters: A - matrices Q, B, P in compact form (see below). TauQ - scalar factors which are used to form matrix Q. TauP - scalar factors which are used to form matrix P. The main diagonal and one of the secondary diagonals of matrix A are replaced with bidiagonal matrix B. Other elements contain elementary reflections which form MxM matrix Q and NxN matrix P, respectively. If M>=N, B is the upper bidiagonal MxN matrix and is stored in the corresponding elements of matrix A. Matrix Q is represented as a product of elementary reflections Q = H(0)*H(1)*...*H(n-1), where H(i) = 1-tau*v*v'. Here tau is a scalar which is stored in TauQ[i], and vector v has the following structure: v(0:i-1)=0, v(i)=1, v(i+1:m-1) is stored in elements A(i+1:m-1,i). Matrix P is as follows: P = G(0)*G(1)*...*G(n-2), where G(i) = 1 - tau*u*u'. Tau is stored in TauP[i], u(0:i)=0, u(i+1)=1, u(i+2:n-1) is stored in elements A(i,i+2:n-1). If M<N, B is the lower bidiagonal MxN matrix and is stored in the corresponding elements of matrix A. Q = H(0)*H(1)*...*H(m-2), where H(i) = 1 - tau*v*v', tau is stored in TauQ, v(0:i)=0, v(i+1)=1, v(i+2:m-1) is stored in elements A(i+2:m-1,i). P = G(0)*G(1)*...*G(m-1), G(i) = 1-tau*u*u', tau is stored in TauP, u(0:i-1)=0, u(i)=1, u(i+1:n-1) is stored in A(i,i+1:n-1). EXAMPLE: m=6, n=5 (m > n): m=5, n=6 (m < n): ( d e u1 u1 u1 ) ( d u1 u1 u1 u1 u1 ) ( v1 d e u2 u2 ) ( e d u2 u2 u2 u2 ) ( v1 v2 d e u3 ) ( v1 e d u3 u3 u3 ) ( v1 v2 v3 d e ) ( v1 v2 e d u4 u4 ) ( v1 v2 v3 v4 d ) ( v1 v2 v3 e d u5 ) ( v1 v2 v3 v4 v5 ) Here vi and ui are vectors which form H(i) and G(i), and d and e - are the diagonal and off-diagonal elements of matrix B. -- LAPACK routine (version 3.0) -- Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., Courant Institute, Argonne National Lab, and Rice University September 30, 1994. Sergey Bochkanov, ALGLIB project, translation from FORTRAN to pseudocode, 2007-2010. *************************************************************************/
void alglib::rmatrixbd( real_2d_array& a, ae_int_t m, ae_int_t n, real_1d_array& tauq, real_1d_array& taup);
/************************************************************************* Multiplication by matrix P which reduces matrix A to bidiagonal form. The algorithm allows pre- or post-multiply by P or P'. Input parameters: QP - matrices Q and P in compact form. Output of RMatrixBD subroutine. M - number of rows in matrix A. N - number of columns in matrix A. TAUP - scalar factors which are used to form P. Output of RMatrixBD subroutine. Z - multiplied matrix. Array whose indexes range within [0..ZRows-1,0..ZColumns-1]. ZRows - number of rows in matrix Z. If FromTheRight=False, ZRows=N, otherwise ZRows can be arbitrary. ZColumns - number of columns in matrix Z. If FromTheRight=True, ZColumns=N, otherwise ZColumns can be arbitrary. FromTheRight - pre- or post-multiply. DoTranspose - multiply by P or P'. Output parameters: Z - product of Z and P. Array whose indexes range within [0..ZRows-1,0..ZColumns-1]. If ZRows=0 or ZColumns=0, the array is not modified. -- ALGLIB -- 2005-2010 Bochkanov Sergey *************************************************************************/
void alglib::rmatrixbdmultiplybyp( real_2d_array qp, ae_int_t m, ae_int_t n, real_1d_array taup, real_2d_array& z, ae_int_t zrows, ae_int_t zcolumns, bool fromtheright, bool dotranspose);
/************************************************************************* Multiplication by matrix Q which reduces matrix A to bidiagonal form. The algorithm allows pre- or post-multiply by Q or Q'. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes one important improvement of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Multithreaded acceleration is NOT supported for this function. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. Input parameters: QP - matrices Q and P in compact form. Output of ToBidiagonal subroutine. M - number of rows in matrix A. N - number of columns in matrix A. TAUQ - scalar factors which are used to form Q. Output of ToBidiagonal subroutine. Z - multiplied matrix. array[0..ZRows-1,0..ZColumns-1] ZRows - number of rows in matrix Z. If FromTheRight=False, ZRows=M, otherwise ZRows can be arbitrary. ZColumns - number of columns in matrix Z. If FromTheRight=True, ZColumns=M, otherwise ZColumns can be arbitrary. FromTheRight - pre- or post-multiply. DoTranspose - multiply by Q or Q'. Output parameters: Z - product of Z and Q. Array[0..ZRows-1,0..ZColumns-1] If ZRows=0 or ZColumns=0, the array is not modified. -- ALGLIB -- 2005-2010 Bochkanov Sergey *************************************************************************/
void alglib::rmatrixbdmultiplybyq( real_2d_array qp, ae_int_t m, ae_int_t n, real_1d_array tauq, real_2d_array& z, ae_int_t zrows, ae_int_t zcolumns, bool fromtheright, bool dotranspose);
/************************************************************************* Unpacking of the main and secondary diagonals of bidiagonal decomposition of matrix A. Input parameters: B - output of RMatrixBD subroutine. M - number of rows in matrix B. N - number of columns in matrix B. Output parameters: IsUpper - True, if the matrix is upper bidiagonal. otherwise IsUpper is False. D - the main diagonal. Array whose index ranges within [0..Min(M,N)-1]. E - the secondary diagonal (upper or lower, depending on the value of IsUpper). Array index ranges within [0..Min(M,N)-1], the last element is not used. -- ALGLIB -- 2005-2010 Bochkanov Sergey *************************************************************************/
void alglib::rmatrixbdunpackdiagonals( real_2d_array b, ae_int_t m, ae_int_t n, bool& isupper, real_1d_array& d, real_1d_array& e);
/************************************************************************* Unpacking matrix P which reduces matrix A to bidiagonal form. The subroutine returns transposed matrix P. Input parameters: QP - matrices Q and P in compact form. Output of ToBidiagonal subroutine. M - number of rows in matrix A. N - number of columns in matrix A. TAUP - scalar factors which are used to form P. Output of ToBidiagonal subroutine. PTRows - required number of rows of matrix P^T. N >= PTRows >= 0. Output parameters: PT - first PTRows columns of matrix P^T Array[0..PTRows-1, 0..N-1] If PTRows=0, the array is not modified. -- ALGLIB -- 2005-2010 Bochkanov Sergey *************************************************************************/
void alglib::rmatrixbdunpackpt( real_2d_array qp, ae_int_t m, ae_int_t n, real_1d_array taup, ae_int_t ptrows, real_2d_array& pt);
/************************************************************************* Unpacking matrix Q which reduces a matrix to bidiagonal form. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes one important improvement of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Multithreaded acceleration is NOT supported for this function. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. Input parameters: QP - matrices Q and P in compact form. Output of ToBidiagonal subroutine. M - number of rows in matrix A. N - number of columns in matrix A. TAUQ - scalar factors which are used to form Q. Output of ToBidiagonal subroutine. QColumns - required number of columns in matrix Q. M>=QColumns>=0. Output parameters: Q - first QColumns columns of matrix Q. Array[0..M-1, 0..QColumns-1] If QColumns=0, the array is not modified. -- ALGLIB -- 2005-2010 Bochkanov Sergey *************************************************************************/
void alglib::rmatrixbdunpackq( real_2d_array qp, ae_int_t m, ae_int_t n, real_1d_array tauq, ae_int_t qcolumns, real_2d_array& q);
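The bidiagonal routines above are normally used together: rmatrixbd() computes the compact factorization A = Q*B*P^T, and the unpacking routines extract the diagonals of B and the factors Q and P^T. A minimal sketch with arbitrary data, assuming the linalg.h header:

#include "stdafx.h"
#include <stdio.h>
#include "linalg.h"

using namespace alglib;

int main(int argc, char **argv)
{
    //
    // Reduce a 3x2 matrix to bidiagonal form A = Q*B*P^T and unpack
    // the diagonals of B, the first columns of Q and the rows of P^T.
    //
    real_2d_array a = "[[1,2],[3,4],[5,6]]";
    ae_int_t m = 3;
    ae_int_t n = 2;
    real_1d_array tauq, taup;
    rmatrixbd(a, m, n, tauq, taup);

    bool isupper;
    real_1d_array d, e;
    rmatrixbdunpackdiagonals(a, m, n, isupper, d, e);

    real_2d_array q, pt;
    rmatrixbdunpackq(a, m, n, tauq, n, q);     // first N columns of Q
    rmatrixbdunpackpt(a, m, n, taup, n, pt);   // all N rows of P^T

    printf("%s\n", d.tostring(3).c_str());
    printf("%s\n", e.tostring(3).c_str());
    printf("%s\n", q.tostring(3).c_str());
    printf("%s\n", pt.tostring(3).c_str());
    return 0;
}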
/************************************************************************* Reduction of a square matrix to upper Hessenberg form: Q'*A*Q = H, where Q is an orthogonal matrix, H - Hessenberg matrix. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes one important improvement of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Generally, commercial ALGLIB is several times faster than open-source ! generic C edition, and many times faster than open-source C# edition. ! ! Multithreaded acceleration is NOT supported for this function. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. Input parameters: A - matrix A with elements [0..N-1, 0..N-1] N - size of matrix A. Output parameters: A - matrices Q and P in compact form (see below). Tau - array of scalar factors which are used to form matrix Q. Array whose index ranges within [0..N-2] Matrix H is located on the main diagonal, on the lower secondary diagonal and above the main diagonal of matrix A. The elements which are used to form matrix Q are situated in array Tau and below the lower secondary diagonal of matrix A as follows: Matrix Q is represented as a product of elementary reflections Q = H(0)*H(2)*...*H(n-2), where each H(i) is given by H(i) = 1 - tau * v * (v^T) where tau is a scalar stored in Tau[I]; v - is a real vector, so that v(0:i) = 0, v(i+1) = 1, v(i+2:n-1) stored in A(i+2:n-1,i). -- LAPACK routine (version 3.0) -- Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., Courant Institute, Argonne National Lab, and Rice University October 31, 1992 *************************************************************************/
void alglib::rmatrixhessenberg( real_2d_array& a, ae_int_t n, real_1d_array& tau);
/************************************************************************* Unpacking matrix H (the result of matrix A reduction to upper Hessenberg form) Input parameters: A - output of RMatrixHessenberg subroutine. N - size of matrix A. Output parameters: H - matrix H. Array whose indexes range within [0..N-1, 0..N-1]. -- ALGLIB -- 2005-2010 Bochkanov Sergey *************************************************************************/
void alglib::rmatrixhessenbergunpackh( real_2d_array a, ae_int_t n, real_2d_array& h);
/************************************************************************* Unpacking matrix Q which reduces matrix A to upper Hessenberg form COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes one important improvement of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Generally, commercial ALGLIB is several times faster than open-source ! generic C edition, and many times faster than open-source C# edition. ! ! Multithreaded acceleration is NOT supported for this function. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. Input parameters: A - output of RMatrixHessenberg subroutine. N - size of matrix A. Tau - scalar factors which are used to form Q. Output of RMatrixHessenberg subroutine. Output parameters: Q - matrix Q. Array whose indexes range within [0..N-1, 0..N-1]. -- ALGLIB -- 2005-2010 Bochkanov Sergey *************************************************************************/
void alglib::rmatrixhessenbergunpackq( real_2d_array a, ae_int_t n, real_1d_array tau, real_2d_array& q);
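A minimal call-sequence sketch for the Hessenberg routines (arbitrary data, linalg.h header assumed): rmatrixhessenberg() produces the compact form, and the two unpacking routines recover H and Q with Q'*A*Q = H.

#include "stdafx.h"
#include <stdio.h>
#include "linalg.h"

using namespace alglib;

int main(int argc, char **argv)
{
    //
    // Reduce a 3x3 matrix to upper Hessenberg form, then unpack H and Q.
    //
    real_2d_array a = "[[4,1,2],[2,3,1],[1,2,5]]";
    ae_int_t n = 3;
    real_1d_array tau;
    rmatrixhessenberg(a, n, tau);

    real_2d_array h, q;
    rmatrixhessenbergunpackh(a, n, h);
    rmatrixhessenbergunpackq(a, n, tau, q);

    printf("%s\n", h.tostring(3).c_str());
    printf("%s\n", q.tostring(3).c_str());
    return 0;
}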
/************************************************************************* LQ decomposition of a rectangular matrix of size MxN COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that QP decomposition is harder to ! parallelize than, say, matrix-matrix product - this algorithm has ! many internal synchronization points which can not be avoided. However ! parallelism starts to be profitable starting from N=512, achieving ! near-linear speedup for N=4096 or higher. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. Input parameters: A - matrix A whose indexes range within [0..M-1, 0..N-1]. M - number of rows in matrix A. N - number of columns in matrix A. Output parameters: A - matrices L and Q in compact form (see below) Tau - array of scalar factors which are used to form matrix Q. Array whose index ranges within [0..Min(M,N)-1]. Matrix A is represented as A = LQ, where Q is an orthogonal matrix of size MxM, L - lower triangular (or lower trapezoid) matrix of size M x N. The elements of matrix L are located on and below the main diagonal of matrix A. The elements which are located in Tau array and above the main diagonal of matrix A are used to form matrix Q as follows: Matrix Q is represented as a product of elementary reflections Q = H(k-1)*H(k-2)*...*H(1)*H(0), where k = min(m,n), and each H(i) is of the form H(i) = 1 - tau * v * (v^T) where tau is a scalar stored in Tau[I]; v - real vector, so that v(0:i-1)=0, v(i) = 1, v(i+1:n-1) stored in A(i,i+1:n-1). -- ALGLIB routine -- 17.02.2010 Bochkanov Sergey *************************************************************************/
void alglib::rmatrixlq( real_2d_array& a, ae_int_t m, ae_int_t n, real_1d_array& tau); void alglib::smp_rmatrixlq( real_2d_array& a, ae_int_t m, ae_int_t n, real_1d_array& tau);
/************************************************************************* Unpacking of matrix L from the LQ decomposition of a matrix A Input parameters: A - matrices Q and L in compact form. Output of RMatrixLQ subroutine. M - number of rows in given matrix A. M>=0. N - number of columns in given matrix A. N>=0. Output parameters: L - matrix L, array[0..M-1, 0..N-1]. -- ALGLIB routine -- 17.02.2010 Bochkanov Sergey *************************************************************************/
void alglib::rmatrixlqunpackl( real_2d_array a, ae_int_t m, ae_int_t n, real_2d_array& l);
/************************************************************************* Partial unpacking of matrix Q from the LQ decomposition of a matrix A COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that QP decomposition is harder to ! parallelize than, say, matrix-matrix product - this algorithm has ! many internal synchronization points which can not be avoided. However ! parallelism starts to be profitable starting from N=512, achieving ! near-linear speedup for N=4096 or higher. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. Input parameters: A - matrices L and Q in compact form. Output of RMatrixLQ subroutine. M - number of rows in given matrix A. M>=0. N - number of columns in given matrix A. N>=0. Tau - scalar factors which are used to form Q. Output of the RMatrixLQ subroutine. QRows - required number of rows in matrix Q. N>=QRows>=0. Output parameters: Q - first QRows rows of matrix Q. Array whose indexes range within [0..QRows-1, 0..N-1]. If QRows=0, the array remains unchanged. -- ALGLIB routine -- 17.02.2010 Bochkanov Sergey *************************************************************************/
void alglib::rmatrixlqunpackq( real_2d_array a, ae_int_t m, ae_int_t n, real_1d_array tau, ae_int_t qrows, real_2d_array& q); void alglib::smp_rmatrixlqunpackq( real_2d_array a, ae_int_t m, ae_int_t n, real_1d_array tau, ae_int_t qrows, real_2d_array& q);
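The real LQ routines follow the same factor/unpack pattern as their complex counterparts. A minimal sketch (not an official manual example; arbitrary data, linalg.h header assumed):

#include "stdafx.h"
#include <stdio.h>
#include "linalg.h"

using namespace alglib;

int main(int argc, char **argv)
{
    //
    // LQ decomposition of a 2x3 real matrix: A = L*Q.
    //
    real_2d_array a = "[[1,2,3],[4,5,6]]";
    ae_int_t m = 2;
    ae_int_t n = 3;
    real_1d_array tau;
    rmatrixlq(a, m, n, tau);

    real_2d_array l, q;
    rmatrixlqunpackl(a, m, n, l);           // MxN lower trapezoid factor L
    rmatrixlqunpackq(a, m, n, tau, n, q);   // all N rows of Q (QRows=N)

    printf("%s\n", l.tostring(3).c_str());
    printf("%s\n", q.tostring(3).c_str());
    return 0;
}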
/************************************************************************* QR decomposition of a rectangular matrix of size MxN COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that QP decomposition is harder to ! parallelize than, say, matrix-matrix product - this algorithm has ! many internal synchronization points which can not be avoided. However ! parallelism starts to be profitable starting from N=512, achieving ! near-linear speedup for N=4096 or higher. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. Input parameters: A - matrix A whose indexes range within [0..M-1, 0..N-1]. M - number of rows in matrix A. N - number of columns in matrix A. Output parameters: A - matrices Q and R in compact form (see below). Tau - array of scalar factors which are used to form matrix Q. Array whose index ranges within [0.. Min(M-1,N-1)]. Matrix A is represented as A = QR, where Q is an orthogonal matrix of size MxM, R - upper triangular (or upper trapezoid) matrix of size M x N. The elements of matrix R are located on and above the main diagonal of matrix A. The elements which are located in Tau array and below the main diagonal of matrix A are used to form matrix Q as follows: Matrix Q is represented as a product of elementary reflections Q = H(0)*H(2)*...*H(k-1), where k = min(m,n), and each H(i) is in the form H(i) = 1 - tau * v * (v^T) where tau is a scalar stored in Tau[I]; v - real vector, so that v(0:i-1) = 0, v(i) = 1, v(i+1:m-1) stored in A(i+1:m-1,i). -- ALGLIB routine -- 17.02.2010 Bochkanov Sergey *************************************************************************/
void alglib::rmatrixqr( real_2d_array& a, ae_int_t m, ae_int_t n, real_1d_array& tau); void alglib::smp_rmatrixqr( real_2d_array& a, ae_int_t m, ae_int_t n, real_1d_array& tau);
/************************************************************************* Partial unpacking of matrix Q from the QR decomposition of a matrix A COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that QP decomposition is harder to ! parallelize than, say, matrix-matrix product - this algorithm has ! many internal synchronization points which can not be avoided. However ! parallelism starts to be profitable starting from N=512, achieving ! near-linear speedup for N=4096 or higher. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. Input parameters: A - matrices Q and R in compact form. Output of RMatrixQR subroutine. M - number of rows in given matrix A. M>=0. N - number of columns in given matrix A. N>=0. Tau - scalar factors which are used to form Q. Output of the RMatrixQR subroutine. QColumns - required number of columns of matrix Q. M>=QColumns>=0. Output parameters: Q - first QColumns columns of matrix Q. Array whose indexes range within [0..M-1, 0..QColumns-1]. If QColumns=0, the array remains unchanged. -- ALGLIB routine -- 17.02.2010 Bochkanov Sergey *************************************************************************/
void alglib::rmatrixqrunpackq( real_2d_array a, ae_int_t m, ae_int_t n, real_1d_array tau, ae_int_t qcolumns, real_2d_array& q); void alglib::smp_rmatrixqrunpackq( real_2d_array a, ae_int_t m, ae_int_t n, real_1d_array tau, ae_int_t qcolumns, real_2d_array& q);
/************************************************************************* Unpacking of matrix R from the QR decomposition of a matrix A Input parameters: A - matrices Q and R in compact form. Output of RMatrixQR subroutine. M - number of rows in given matrix A. M>=0. N - number of columns in given matrix A. N>=0. Output parameters: R - matrix R, array[0..M-1, 0..N-1]. -- ALGLIB routine -- 17.02.2010 Bochkanov Sergey *************************************************************************/
void alglib::rmatrixqrunpackr( real_2d_array a, ae_int_t m, ae_int_t n, real_2d_array& r);
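A minimal sketch of the QR workflow described above (arbitrary data, linalg.h header assumed). In the commercial edition, smp_rmatrixqr() can be substituted for rmatrixqr() without other changes.

#include "stdafx.h"
#include <stdio.h>
#include "linalg.h"

using namespace alglib;

int main(int argc, char **argv)
{
    //
    // QR decomposition of a 3x2 real matrix: A = Q*R.
    //
    real_2d_array a = "[[12,-51],[6,167],[-4,24]]";
    ae_int_t m = 3;
    ae_int_t n = 2;
    real_1d_array tau;
    rmatrixqr(a, m, n, tau);

    real_2d_array q, r;
    rmatrixqrunpackq(a, m, n, tau, m, q);   // all M columns of Q (QColumns=M)
    rmatrixqrunpackr(a, m, n, r);           // MxN upper trapezoid factor R

    printf("%s\n", q.tostring(3).c_str());
    printf("%s\n", r.tostring(3).c_str());
    return 0;
}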
/************************************************************************* Reduction of a symmetric matrix which is given by its higher or lower triangular part to a tridiagonal matrix using orthogonal similarity transformation: Q'*A*Q=T. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes one important improvement of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Generally, commercial ALGLIB is several times faster than open-source ! generic C edition, and many times faster than open-source C# edition. ! ! Multithreaded acceleration is NOT supported for this function. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. Input parameters: A - matrix to be transformed array with elements [0..N-1, 0..N-1]. N - size of matrix A. IsUpper - storage format. If IsUpper = True, then matrix A is given by its upper triangle, and the lower triangle is not used and not modified by the algorithm, and vice versa if IsUpper = False. Output parameters: A - matrices T and Q in compact form (see lower) Tau - array of factors which are forming matrices H(i) array with elements [0..N-2]. D - main diagonal of symmetric matrix T. array with elements [0..N-1]. E - secondary diagonal of symmetric matrix T. array with elements [0..N-2]. If IsUpper=True, the matrix Q is represented as a product of elementary reflectors Q = H(n-2) . . . H(2) H(0). Each H(i) has the form H(i) = I - tau * v * v' where tau is a real scalar, and v is a real vector with v(i+1:n-1) = 0, v(i) = 1, v(0:i-1) is stored on exit in A(0:i-1,i+1), and tau in TAU(i). If IsUpper=False, the matrix Q is represented as a product of elementary reflectors Q = H(0) H(2) . . . H(n-2). Each H(i) has the form H(i) = I - tau * v * v' where tau is a real scalar, and v is a real vector with v(0:i) = 0, v(i+1) = 1, v(i+2:n-1) is stored on exit in A(i+2:n-1,i), and tau in TAU(i). The contents of A on exit are illustrated by the following examples with n = 5: if UPLO = 'U': if UPLO = 'L': ( d e v1 v2 v3 ) ( d ) ( d e v2 v3 ) ( e d ) ( d e v3 ) ( v0 e d ) ( d e ) ( v0 v1 e d ) ( d ) ( v0 v1 v2 e d ) where d and e denote diagonal and off-diagonal elements of T, and vi denotes an element of the vector defining H(i). -- LAPACK routine (version 3.0) -- Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., Courant Institute, Argonne National Lab, and Rice University October 31, 1992 *************************************************************************/
void alglib::smatrixtd( real_2d_array& a, ae_int_t n, bool isupper, real_1d_array& tau, real_1d_array& d, real_1d_array& e);
/************************************************************************* Unpacking matrix Q which reduces symmetric matrix to a tridiagonal form. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes one important improvement of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Generally, commercial ALGLIB is several times faster than open-source ! generic C edition, and many times faster than open-source C# edition. ! ! Multithreaded acceleration is NOT supported for this function. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. Input parameters: A - the result of a SMatrixTD subroutine N - size of matrix A. IsUpper - storage format (a parameter of SMatrixTD subroutine) Tau - the result of a SMatrixTD subroutine Output parameters: Q - transformation matrix. array with elements [0..N-1, 0..N-1]. -- ALGLIB -- Copyright 2005-2010 by Bochkanov Sergey *************************************************************************/
void alglib::smatrixtdunpackq( real_2d_array a, ae_int_t n, bool isupper, real_1d_array tau, real_2d_array& q);
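A minimal sketch of the symmetric reduction pair above (arbitrary symmetric data, linalg.h header assumed): smatrixtd() returns the diagonals of T, and smatrixtdunpackq() recovers Q with Q'*A*Q = T.

#include "stdafx.h"
#include <stdio.h>
#include "linalg.h"

using namespace alglib;

int main(int argc, char **argv)
{
    //
    // Reduce a 3x3 symmetric matrix (upper triangle is used) to
    // tridiagonal form, then unpack the orthogonal factor Q.
    //
    real_2d_array a = "[[4,1,2],[1,3,0],[2,0,5]]";
    ae_int_t n = 3;
    real_1d_array tau, d, e;
    smatrixtd(a, n, true, tau, d, e);

    real_2d_array q;
    smatrixtdunpackq(a, n, true, tau, q);

    printf("%s\n", d.tostring(3).c_str());  // main diagonal of T
    printf("%s\n", e.tostring(3).c_str());  // off-diagonal of T
    printf("%s\n", q.tostring(3).c_str());
    return 0;
}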
pspline2interpolant
pspline3interpolant
parametricrdpfixed
pspline2arclength
pspline2build
pspline2buildperiodic
pspline2calc
pspline2diff
pspline2diff2
pspline2parametervalues
pspline2tangent
pspline3arclength
pspline3build
pspline3buildperiodic
pspline3calc
pspline3diff
pspline3diff2
pspline3parametervalues
pspline3tangent
parametric_rdp Parametric Ramer-Douglas-Peucker approximation
/************************************************************************* Parametric spline interpolant: 2-dimensional curve. You should not try to access its members directly - use PSpline2XXXXXXXX() functions instead. *************************************************************************/
class pspline2interpolant { };
/************************************************************************* Parametric spline interpolant: 3-dimensional curve. You should not try to access its members directly - use PSpline3XXXXXXXX() functions instead. *************************************************************************/
class pspline3interpolant { };
/************************************************************************* This subroutine fits piecewise linear curve to points with Ramer-Douglas- Peucker algorithm. This function performs PARAMETRIC fit, i.e. it can be used to fit curves like circles. On input it accepts dataset which describes parametric multidimensional curve X(t), with X being vector, and t taking values in [0,N), where N is a number of points in dataset. As result, it returns reduced dataset X2, which can be used to build parametric curve X2(t), which approximates X(t) with desired precision (or has specified number of sections). INPUT PARAMETERS: X - array of multidimensional points: * at least N elements, leading N elements are used if more than N elements were specified * order of points is IMPORTANT because it is parametric fit * each row of array is one point which has D coordinates N - number of elements in X D - number of dimensions (elements per row of X) StopM - stopping condition - desired number of sections: * at most M sections are generated by this function * less than M sections can be generated if we have N<M (or some X are non-distinct). * zero StopM means that algorithm does not stop after achieving some pre-specified section count StopEps - stopping condition - desired precision: * algorithm stops after error in each section is at most Eps * zero Eps means that algorithm does not stop after achieving some pre-specified precision OUTPUT PARAMETERS: X2 - array of corner points for piecewise approximation, has length NSections+1 or zero (for NSections=0). Idx2 - array of indexes (parameter values): * has length NSections+1 or zero (for NSections=0). * each element of Idx2 corresponds to same-numbered element of X2 * each element of Idx2 is index of corresponding element of X2 at original array X, i.e. I-th row of X2 is Idx2[I]-th row of X. * elements of Idx2 can be treated as parameter values which should be used when building new parametric curve * Idx2[0]=0, Idx2[NSections]=N-1 NSections- number of sections found by algorithm, NSections<=M, NSections can be zero for degenerate datasets (N<=1 or all X[] are non-distinct). NOTE: algorithm stops after: a) dividing curve into StopM sections b) achieving required precision StopEps c) dividing curve into N-1 sections If both StopM and StopEps are non-zero, algorithm is stopped by the FIRST criterion which is satisfied. In case both StopM and StopEps are zero, algorithm stops because of (c). -- ALGLIB -- Copyright 02.10.2014 by Bochkanov Sergey *************************************************************************/
void alglib::parametricrdpfixed( real_2d_array x, ae_int_t n, ae_int_t d, ae_int_t stopm, double stopeps, real_2d_array& x2, integer_1d_array& idx2, ae_int_t& nsections);

Examples:   [1]  

/************************************************************************* This function calculates arc length, i.e. length of curve between t=a and t=b. INPUT PARAMETERS: P - parametric spline interpolant A,B - parameter values corresponding to arc ends: * B>A will result in positive length returned * B<A will result in negative length returned RESULT: length of arc starting at T=A and ending at T=B. -- ALGLIB PROJECT -- Copyright 30.05.2010 by Bochkanov Sergey *************************************************************************/
double alglib::pspline2arclength( pspline2interpolant p, double a, double b);
/************************************************************************* This function builds non-periodic 2-dimensional parametric spline which starts at (X[0],Y[0]) and ends at (X[N-1],Y[N-1]). INPUT PARAMETERS: XY - points, array[0..N-1,0..1]. XY[I,0:1] corresponds to the Ith point. Order of points is important! N - points count, N>=5 for Akima splines, N>=2 for other types of splines. ST - spline type: * 0 Akima spline * 1 parabolically terminated Catmull-Rom spline (Tension=0) * 2 parabolically terminated cubic spline PT - parameterization type: * 0 uniform * 1 chord length * 2 centripetal OUTPUT PARAMETERS: P - parametric spline interpolant NOTES: * this function assumes that all consecutive points are distinct. I.e. (x0,y0)<>(x1,y1), (x1,y1)<>(x2,y2), (x2,y2)<>(x3,y3) and so on. However, non-consecutive points may coincide, i.e. we can have (x0,y0)=(x2,y2). -- ALGLIB PROJECT -- Copyright 28.05.2010 by Bochkanov Sergey *************************************************************************/
void alglib::pspline2build( real_2d_array xy, ae_int_t n, ae_int_t st, ae_int_t pt, pspline2interpolant& p);
/************************************************************************* This function builds periodic 2-dimensional parametric spline which starts at (X[0],Y[0]), goes through all points to (X[N-1],Y[N-1]) and then back to (X[0],Y[0]). INPUT PARAMETERS: XY - points, array[0..N-1,0..1]. XY[I,0:1] corresponds to the Ith point. XY[N-1,0:1] must be different from XY[0,0:1]. Order of points is important! N - points count, N>=3. ST - spline type: * 1 Catmull-Rom spline (Tension=0) with cyclic boundary conditions * 2 cubic spline with cyclic boundary conditions PT - parameterization type: * 0 uniform * 1 chord length * 2 centripetal OUTPUT PARAMETERS: P - parametric spline interpolant NOTES: * this function assumes that all consecutive points are distinct. I.e. (x0,y0)<>(x1,y1), (x1,y1)<>(x2,y2), (x2,y2)<>(x3,y3) and so on. However, non-consecutive points may coincide, i.e. we can have (x0,y0)=(x2,y2). * last point of sequence is NOT equal to the first point. You shouldn't make curve "explicitly periodic" by making them equal. -- ALGLIB PROJECT -- Copyright 28.05.2010 by Bochkanov Sergey *************************************************************************/
void alglib::pspline2buildperiodic( real_2d_array xy, ae_int_t n, ae_int_t st, ae_int_t pt, pspline2interpolant& p);
/************************************************************************* This function calculates the value of the parametric spline for a given value of parameter T INPUT PARAMETERS: P - parametric spline interpolant T - point: * T in [0,1] corresponds to interval spanned by points * for non-periodic splines T<0 (or T>1) correspond to parts of the curve before the first (after the last) point * for periodic splines T<0 (or T>1) are projected into [0,1] by making T=T-floor(T). OUTPUT PARAMETERS: X - X-position Y - Y-position -- ALGLIB PROJECT -- Copyright 28.05.2010 by Bochkanov Sergey *************************************************************************/
void alglib::pspline2calc( pspline2interpolant p, double t, double& x, double& y);
/************************************************************************* This function calculates derivative, i.e. it returns (dX/dT,dY/dT). INPUT PARAMETERS: P - parametric spline interpolant T - point: * T in [0,1] corresponds to interval spanned by points * for non-periodic splines T<0 (or T>1) correspond to parts of the curve before the first (after the last) point * for periodic splines T<0 (or T>1) are projected into [0,1] by making T=T-floor(T). OUTPUT PARAMETERS: X - X-value DX - X-derivative Y - Y-value DY - Y-derivative -- ALGLIB PROJECT -- Copyright 28.05.2010 by Bochkanov Sergey *************************************************************************/
void alglib::pspline2diff( pspline2interpolant p, double t, double& x, double& dx, double& y, double& dy);
/************************************************************************* This function calculates first and second derivative with respect to T. INPUT PARAMETERS: P - parametric spline interpolant T - point: * T in [0,1] corresponds to interval spanned by points * for non-periodic splines T<0 (or T>1) correspond to parts of the curve before the first (after the last) point * for periodic splines T<0 (or T>1) are projected into [0,1] by making T=T-floor(T). OUTPUT PARAMETERS: X - X-value DX - derivative D2X - second derivative Y - Y-value DY - derivative D2Y - second derivative -- ALGLIB PROJECT -- Copyright 28.05.2010 by Bochkanov Sergey *************************************************************************/
void alglib::pspline2diff2( pspline2interpolant p, double t, double& x, double& dx, double& d2x, double& y, double& dy, double& d2y);
/************************************************************************* This function returns a vector of parameter values corresponding to points. I.e. for P created from (X[0],Y[0])...(X[N-1],Y[N-1]) and U=TValues(P) we have (X[0],Y[0]) = PSpline2Calc(P,U[0]), (X[1],Y[1]) = PSpline2Calc(P,U[1]), (X[2],Y[2]) = PSpline2Calc(P,U[2]), ... INPUT PARAMETERS: P - parametric spline interpolant OUTPUT PARAMETERS: N - array size T - array[0..N-1] NOTES: * for non-periodic splines U[0]=0, U[0]<U[1]<...<U[N-1], U[N-1]=1 * for periodic splines U[0]=0, U[0]<U[1]<...<U[N-1], U[N-1]<1 -- ALGLIB PROJECT -- Copyright 28.05.2010 by Bochkanov Sergey *************************************************************************/
void alglib::pspline2parametervalues( pspline2interpolant p, ae_int_t& n, real_1d_array& t);
/************************************************************************* This function calculates tangent vector for a given value of parameter T INPUT PARAMETERS: P - parametric spline interpolant T - point: * T in [0,1] corresponds to interval spanned by points * for non-periodic splines T<0 (or T>1) correspond to parts of the curve before the first (after the last) point * for periodic splines T<0 (or T>1) are projected into [0,1] by making T=T-floor(T). OUTPUT PARAMETERS: X - X-component of tangent vector (normalized) Y - Y-component of tangent vector (normalized) NOTE: X^2+Y^2 is either 1 (for non-zero tangent vector) or 0. -- ALGLIB PROJECT -- Copyright 28.05.2010 by Bochkanov Sergey *************************************************************************/
void alglib::pspline2tangent( pspline2interpolant p, double t, double& x, double& y);
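The 2-dimensional spline functions above combine naturally: build an interpolant from a point sequence, then query positions, tangents and arc length at arbitrary parameter values. The sketch below is not one of the manual's official examples; the points, spline type (Catmull-Rom, ST=1) and parameterization (chord length, PT=1) are arbitrary choices, and the interpolation.h header is assumed.

#include "stdafx.h"
#include <stdio.h>
#include "interpolation.h"

using namespace alglib;

int main(int argc, char **argv)
{
    //
    // Build a non-periodic 2D parametric spline through four points, then
    // evaluate position and tangent at T=0.5 and the total arc length.
    //
    real_2d_array xy = "[[0,0],[1,1],[2,0],[3,1]]";
    ae_int_t n = 4;
    pspline2interpolant p;
    pspline2build(xy, n, 1, 1, p);

    double x, y;
    pspline2calc(p, 0.5, x, y);             // position at T=0.5
    printf("%.3f %.3f\n", x, y);

    double tx, ty;
    pspline2tangent(p, 0.5, tx, ty);        // normalized tangent at T=0.5
    printf("%.3f %.3f\n", tx, ty);

    double len = pspline2arclength(p, 0.0, 1.0);  // length of the whole curve
    printf("%.3f\n", len);
    return 0;
}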
/************************************************************************* This function calculates arc length, i.e. length of curve between t=a and t=b. INPUT PARAMETERS: P - parametric spline interpolant A,B - parameter values corresponding to arc ends: * B>A will result in positive length returned * B<A will result in negative length returned RESULT: length of arc starting at T=A and ending at T=B. -- ALGLIB PROJECT -- Copyright 30.05.2010 by Bochkanov Sergey *************************************************************************/
double alglib::pspline3arclength( pspline3interpolant p, double a, double b);
/************************************************************************* This function builds non-periodic 3-dimensional parametric spline which starts at (X[0],Y[0],Z[0]) and ends at (X[N-1],Y[N-1],Z[N-1]). Same as PSpline2Build() function, but for 3D, so we won't duplicate its description here. -- ALGLIB PROJECT -- Copyright 28.05.2010 by Bochkanov Sergey *************************************************************************/
void alglib::pspline3build( real_2d_array xy, ae_int_t n, ae_int_t st, ae_int_t pt, pspline3interpolant& p);
/************************************************************************* This function builds periodic 3-dimensional parametric spline which starts at (X[0],Y[0],Z[0]), goes through all points to (X[N-1],Y[N-1],Z[N-1]) and then back to (X[0],Y[0],Z[0]). Same as PSpline2Build() function, but for 3D, so we won't duplicate its description here. -- ALGLIB PROJECT -- Copyright 28.05.2010 by Bochkanov Sergey *************************************************************************/
void alglib::pspline3buildperiodic( real_2d_array xy, ae_int_t n, ae_int_t st, ae_int_t pt, pspline3interpolant& p);
/************************************************************************* This function calculates the value of the parametric spline for a given value of parameter T. INPUT PARAMETERS: P - parametric spline interpolant T - point: * T in [0,1] corresponds to interval spanned by points * for non-periodic splines T<0 (or T>1) correspond to parts of the curve before the first (after the last) point * for periodic splines T<0 (or T>1) are projected into [0,1] by making T=T-floor(T). OUTPUT PARAMETERS: X - X-position Y - Y-position Z - Z-position -- ALGLIB PROJECT -- Copyright 28.05.2010 by Bochkanov Sergey *************************************************************************/
void alglib::pspline3calc( pspline3interpolant p, double t, double& x, double& y, double& z);
/************************************************************************* This function calculates derivative, i.e. it returns (dX/dT,dY/dT,dZ/dT). INPUT PARAMETERS: P - parametric spline interpolant T - point: * T in [0,1] corresponds to interval spanned by points * for non-periodic splines T<0 (or T>1) correspond to parts of the curve before the first (after the last) point * for periodic splines T<0 (or T>1) are projected into [0,1] by making T=T-floor(T). OUTPUT PARAMETERS: X - X-value DX - X-derivative Y - Y-value DY - Y-derivative Z - Z-value DZ - Z-derivative -- ALGLIB PROJECT -- Copyright 28.05.2010 by Bochkanov Sergey *************************************************************************/
void alglib::pspline3diff( pspline3interpolant p, double t, double& x, double& dx, double& y, double& dy, double& z, double& dz);
/************************************************************************* This function calculates first and second derivative with respect to T. INPUT PARAMETERS: P - parametric spline interpolant T - point: * T in [0,1] corresponds to interval spanned by points * for non-periodic splines T<0 (or T>1) correspond to parts of the curve before the first (after the last) point * for periodic splines T<0 (or T>1) are projected into [0,1] by making T=T-floor(T). OUTPUT PARAMETERS: X - X-value DX - derivative D2X - second derivative Y - Y-value DY - derivative D2Y - second derivative Z - Z-value DZ - derivative D2Z - second derivative -- ALGLIB PROJECT -- Copyright 28.05.2010 by Bochkanov Sergey *************************************************************************/
void alglib::pspline3diff2( pspline3interpolant p, double t, double& x, double& dx, double& d2x, double& y, double& dy, double& d2y, double& z, double& dz, double& d2z);
/************************************************************************* This function returns the vector of parameter values corresponding to points. Same as PSpline2ParameterValues(), but for 3D. -- ALGLIB PROJECT -- Copyright 28.05.2010 by Bochkanov Sergey *************************************************************************/
void alglib::pspline3parametervalues( pspline3interpolant p, ae_int_t& n, real_1d_array& t);
/************************************************************************* This function calculates tangent vector for a given value of parameter T INPUT PARAMETERS: P - parametric spline interpolant T - point: * T in [0,1] corresponds to interval spanned by points * for non-periodic splines T<0 (or T>1) correspond to parts of the curve before the first (after the last) point * for periodic splines T<0 (or T>1) are projected into [0,1] by making T=T-floor(T). OUTPUT PARAMETERS: X - X-component of tangent vector (normalized) Y - Y-component of tangent vector (normalized) Z - Z-component of tangent vector (normalized) NOTE: X^2+Y^2+Z^2 is either 1 (for non-zero tangent vector) or 0. -- ALGLIB PROJECT -- Copyright 28.05.2010 by Bochkanov Sergey *************************************************************************/
void alglib::pspline3tangent( pspline3interpolant p, double t, double& x, double& y, double& z);
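
The following minimal sketch (not one of the official ALGLIB examples) shows how the 3D parametric spline functions above fit together; the sample coordinates and the ST/PT codes passed to pspline3build() are illustrative assumptions, chosen per the PSpline2Build() description.

#include "stdafx.h"
#include <stdio.h>
#include "interpolation.h"

using namespace alglib;

int main(int argc, char **argv)
{
    // four points of a 3D polyline, stored row-wise as (x,y,z)
    real_2d_array xyz = "[[0,0,0],[1,0,0],[1,1,0],[1,1,1]]";
    pspline3interpolant p;
    double x, y, z;

    // non-periodic 3D parametric spline; spline type (ST) and
    // parameterization type (PT) codes are illustrative here
    pspline3build(xyz, 4, 1, 1, p);

    // position on the curve at the middle of the parameter range
    pspline3calc(p, 0.5, x, y, z);
    printf("position: %.3f %.3f %.3f\n", x, y, z);

    // normalized tangent vector at the same parameter value
    pspline3tangent(p, 0.5, x, y, z);
    printf("tangent:  %.3f %.3f %.3f\n", x, y, z);
    return 0;
}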
#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "interpolation.h"

using namespace alglib;


int main(int argc, char **argv)
{
    //
    // We use the RDP algorithm to approximate a parametric 2D curve given by
    // its locations at t=0,1,2,3 (see below), which form a piecewise linear
    // trajectory through D-dimensional space (2-dimensional in our example).
    // 
    //     |
    //     |
    //     -     *     *     X2................X3
    //     |                .
    //     |               .
    //     -     *     *  .  *     *     *     *
    //     |             .
    //     |            .
    //     -     *     X1    *     *     *     *
    //     |      .....
    //     |  ....
    //     X0----|-----|-----|-----|-----|-----|---
    //
    ae_int_t npoints = 4;
    ae_int_t ndimensions = 2;
    real_2d_array x = "[[0,0],[2,1],[3,3],[6,3]]";

    //
    // The parametric curve is approximated by another parametric curve
    // with fewer points. This gives a "compressed" representation which
    // needs less memory. In our example (we allow a per-point error smaller
    // than 0.8) the approximation will have just two sequential sections
    // connecting X0 with X2, and X2 with X3.
    // 
    //     |
    //     |
    //     -     *     *     X2................X3
    //     |               . 
    //     |             .  
    //     -     *     .     *     *     *     *
    //     |         .    
    //     |       .     
    //     -     .     X1    *     *     *     *
    //     |   .       
    //     | .    
    //     X0----|-----|-----|-----|-----|-----|---
    //
    //
    real_2d_array y;
    integer_1d_array idxy;
    ae_int_t nsections;
    ae_int_t limitcnt = 0;
    double limiteps = 0.8;
    parametricrdpfixed(x, npoints, ndimensions, limitcnt, limiteps, y, idxy, nsections);
    printf("%d\n", int(nsections)); // EXPECTED: 2
    printf("%s\n", idxy.tostring().c_str()); // EXPECTED: [0,2,3]
    return 0;
}


pcabuildbasis
pcatruncatedsubspace
/************************************************************************* Principal components analysis This function builds orthogonal basis where first axis corresponds to direction with maximum variance, second axis maximizes variance in the subspace orthogonal to first axis and so on. This function builds FULL basis, i.e. returns N vectors corresponding to ALL directions, no matter how informative. If you need just a few (say, 10 or 50) of the most important directions, you may find it faster to use one of the reduced versions: * pcatruncatedsubspace() - for subspace iteration based method It should be noted that, unlike LDA, PCA does not use class labels. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * multithreading support ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! ! Multithreading typically gives sublinear (wrt to cores count) speedup, ! because only some parts of the algorithm can be parallelized. ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. Best results are achieved for high-dimensional problems ! (NVars is at least 256). ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS: X - dataset, array[0..NPoints-1,0..NVars-1]. matrix contains ONLY INDEPENDENT VARIABLES. NPoints - dataset size, NPoints>=0 NVars - number of independent variables, NVars>=1 OUTPUT PARAMETERS: Info - return code: * -4, if SVD subroutine haven't converged * -1, if wrong parameters has been passed (NPoints<0, NVars<1) * 1, if task is solved S2 - array[0..NVars-1]. variance values corresponding to basis vectors. V - array[0..NVars-1,0..NVars-1] matrix, whose columns store basis vectors. -- ALGLIB -- Copyright 25.08.2008 by Bochkanov Sergey *************************************************************************/
void alglib::pcabuildbasis( real_2d_array x, ae_int_t npoints, ae_int_t nvars, ae_int_t& info, real_1d_array& s2, real_2d_array& v); void alglib::smp_pcabuildbasis( real_2d_array x, ae_int_t npoints, ae_int_t nvars, ae_int_t& info, real_1d_array& s2, real_2d_array& v);
/************************************************************************* Principal components analysis This function performs truncated PCA, i.e. returns just a few most important directions. Internally it uses iterative eigensolver which is very efficient when only a minor fraction of full basis is required. Thus, if you need full basis, it is better to use pcabuildbasis() function. It should be noted that, unlike LDA, PCA does not use class labels. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * multithreading support ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! ! For a situation when you need just a few eigenvectors (~1-10), ! multithreading typically gives sublinear (wrt to cores count) speedup. ! For larger problems it may give you nearly linear increase in ! performance. ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. Best results are achieved for high-dimensional problems ! (NVars is at least 256). ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS: X - dataset, array[0..NPoints-1,0..NVars-1]. matrix contains ONLY INDEPENDENT VARIABLES. NPoints - dataset size, NPoints>=0 NVars - number of independent variables, NVars>=1 NNeeded - number of requested directions, in [1,NVars] range; this function is efficient only for NNeeded<<NVars. Eps - desired precision of vectors returned; underlying solver will stop iterations as soon as absolute error in corresponding singular values reduces to roughly eps*MAX(lambda[]), with lambda[] being array of eigen values. Zero value means that algorithm performs number of iterations specified by maxits parameter, without paying attention to precision. MaxIts - number of iterations performed by subspace iteration method. Zero value means that no limit on iteration count is placed (eps-based stopping condition is used). OUTPUT PARAMETERS: S2 - array[NNeeded]. Variance values corresponding to basis vectors. V - array[NVars,NNeeded] matrix, whose columns store basis vectors. NOTE: passing eps=0 and maxits=0 results in small eps being selected as stopping condition. Exact value of automatically selected eps is version- -dependent. -- ALGLIB -- Copyright 10.01.2017 by Bochkanov Sergey *************************************************************************/
void alglib::pcatruncatedsubspace( real_2d_array x, ae_int_t npoints, ae_int_t nvars, ae_int_t nneeded, double eps, ae_int_t maxits, real_1d_array& s2, real_2d_array& v); void alglib::smp_pcatruncatedsubspace( real_2d_array x, ae_int_t npoints, ae_int_t nvars, ae_int_t nneeded, double eps, ae_int_t maxits, real_1d_array& s2, real_2d_array& v);
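
A minimal sketch of both PCA entry points (assuming the dataanalysis.h header; the toy dataset and printed values are illustrative, not part of the official examples):

#include "stdafx.h"
#include <stdio.h>
#include "dataanalysis.h"

using namespace alglib;

int main(int argc, char **argv)
{
    // small 2D dataset stretched along the x0=x1 direction
    real_2d_array x = "[[1,1],[2,2],[3,3],[2.1,1.9],[0.9,1.1]]";
    ae_int_t info;
    real_1d_array s2;
    real_2d_array v;

    // full basis: columns of V are principal directions,
    // S2 holds the corresponding variances
    pcabuildbasis(x, 5, 2, info, s2, v);
    printf("%d\n", int(info));               // 1 on success
    printf("%s\n", s2.tostring(3).c_str());
    printf("%s\n", v.tostring(3).c_str());

    // truncated PCA: only the single most important direction,
    // automatic stopping (eps=0, maxits=0)
    real_1d_array s2t;
    real_2d_array vt;
    pcatruncatedsubspace(x, 5, 2, 1, 0.0, 0, s2t, vt);
    printf("%s\n", vt.tostring(3).c_str());
    return 0;
}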
invpoissondistribution
poissoncdistribution
poissondistribution
/************************************************************************* Inverse Poisson distribution Finds the Poisson variable x such that the integral from 0 to x of the Poisson density is equal to the given probability y. This is accomplished using the inverse gamma integral function and the relation m = igami( k+1, y ). ACCURACY: See inverse incomplete gamma function Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 1995, 2000 by Stephen L. Moshier *************************************************************************/
double alglib::invpoissondistribution(ae_int_t k, double y);
/************************************************************************* Complemented Poisson distribution Returns the sum of the terms k+1 to infinity of the Poisson distribution: sum( j=k+1..inf, exp(-m)*m^j/j! ) The terms are not summed directly; instead the incomplete gamma integral is employed, according to the formula y = pdtrc( k, m ) = igam( k+1, m ). The arguments must both be positive. ACCURACY: See incomplete gamma function Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 1995, 2000 by Stephen L. Moshier *************************************************************************/
double alglib::poissoncdistribution(ae_int_t k, double m);
/************************************************************************* Poisson distribution Returns the sum of the first k+1 terms of the Poisson distribution: sum( j=0..k, exp(-m)*m^j/j! ) The terms are not summed directly; instead the incomplete gamma integral is employed, according to the relation y = pdtr( k, m ) = igamc( k+1, m ). The arguments must both be positive. ACCURACY: See incomplete gamma function Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 1995, 2000 by Stephen L. Moshier *************************************************************************/
double alglib::poissondistribution(ae_int_t k, double m);
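
A minimal sketch of the three Poisson routines (assuming the specialfunctions.h header; the k and m values are illustrative):

#include "stdafx.h"
#include <stdio.h>
#include "specialfunctions.h"

using namespace alglib;

int main(int argc, char **argv)
{
    // P(X<=k) and P(X>k) for a Poisson variable with mean m=2.0
    double m = 2.0;
    double cdf  = poissondistribution(3, m);   // sum of first k+1 terms
    double ccdf = poissoncdistribution(3, m);  // complementary tail
    printf("%.6f\n", cdf);
    printf("%.6f\n", ccdf);
    printf("%.6f\n", cdf + ccdf);              // should be close to 1.0

    // invpoissondistribution() recovers the mean m from k and the CDF value
    double minv = invpoissondistribution(3, cdf);
    printf("%.6f\n", minv);                    // should be close to 2.0
    return 0;
}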
polynomialbar2cheb
polynomialbar2pow
polynomialbuild
polynomialbuildcheb1
polynomialbuildcheb2
polynomialbuildeqdist
polynomialcalccheb1
polynomialcalccheb2
polynomialcalceqdist
polynomialcheb2bar
polynomialpow2bar
polint_d_calcdiff Interpolation and differentiation using barycentric representation
polint_d_conv Conversion between power basis and barycentric representation
polint_d_spec Polynomial interpolation on special grids (equidistant, Chebyshev I/II)
/************************************************************************* Conversion from barycentric representation to Chebyshev basis. This function has O(N^2) complexity. INPUT PARAMETERS: P - polynomial in barycentric form A,B - base interval for Chebyshev polynomials (see below) A<>B OUTPUT PARAMETERS T - coefficients of Chebyshev representation; P(x) = sum { T[i]*Ti(2*(x-A)/(B-A)-1), i=0..N-1 }, where Ti - I-th Chebyshev polynomial. NOTES: barycentric interpolant passed as P may be either polynomial obtained from polynomial interpolation/ fitting or rational function which is NOT polynomial. We can't distinguish between these two cases, and this algorithm just tries to work assuming that P IS a polynomial. If not, algorithm will return results, but they won't have any meaning. -- ALGLIB -- Copyright 30.09.2010 by Bochkanov Sergey *************************************************************************/
void alglib::polynomialbar2cheb( barycentricinterpolant p, double a, double b, real_1d_array& t);
/************************************************************************* Conversion from barycentric representation to power basis. This function has O(N^2) complexity. INPUT PARAMETERS: P - polynomial in barycentric form C - offset (see below); 0.0 is used as default value. S - scale (see below); 1.0 is used as default value. S<>0. OUTPUT PARAMETERS A - coefficients, P(x) = sum { A[i]*((X-C)/S)^i, i=0..N-1 } N - number of coefficients (polynomial degree plus 1) NOTES: 1. this function accepts offset and scale, which can be set to improve numerical properties of polynomial. For example, if P was obtained as result of interpolation on [-1,+1], you can set C=0 and S=1 and represent P as sum of 1, x, x^2, x^3 and so on. In most cases this is exactly what you need. However, if your interpolation model was built on [999,1001], you will see significant growth of numerical errors when using {1, x, x^2, x^3} as basis. Representing P as sum of 1, (x-1000), (x-1000)^2, (x-1000)^3 will be a better option. Such representation can be obtained by using 1000.0 as offset C and 1.0 as scale S. 2. power basis is ill-conditioned and tricks described above can't solve this problem completely. This function will return coefficients in any case, but for N>8 they will become unreliable. However, N's less than 5 are pretty safe. 3. barycentric interpolant passed as P may be either a polynomial obtained from polynomial interpolation/fitting or a rational function which is NOT a polynomial. We can't distinguish between these two cases, and this algorithm just tries to work assuming that P IS a polynomial. If not, algorithm will return results, but they won't have any meaning. -- ALGLIB -- Copyright 30.09.2010 by Bochkanov Sergey *************************************************************************/
void alglib::polynomialbar2pow( barycentricinterpolant p, real_1d_array& a); void alglib::polynomialbar2pow( barycentricinterpolant p, double c, double s, real_1d_array& a);

Examples:   [1]  

/************************************************************************* Lagrange intepolant: generation of the model on the general grid. This function has O(N^2) complexity. INPUT PARAMETERS: X - abscissas, array[0..N-1] Y - function values, array[0..N-1] N - number of points, N>=1 OUTPUT PARAMETERS P - barycentric model which represents Lagrange interpolant (see ratint unit info and BarycentricCalc() description for more information). -- ALGLIB -- Copyright 02.12.2009 by Bochkanov Sergey *************************************************************************/
void alglib::polynomialbuild( real_1d_array x, real_1d_array y, barycentricinterpolant& p); void alglib::polynomialbuild( real_1d_array x, real_1d_array y, ae_int_t n, barycentricinterpolant& p);

Examples:   [1]  

/************************************************************************* Lagrange intepolant on Chebyshev grid (first kind). This function has O(N) complexity. INPUT PARAMETERS: A - left boundary of [A,B] B - right boundary of [A,B] Y - function values at the nodes, array[0..N-1], Y[I] = Y(0.5*(B+A) + 0.5*(B-A)*Cos(PI*(2*i+1)/(2*n))) N - number of points, N>=1 for N=1 a constant model is constructed. OUTPUT PARAMETERS P - barycentric model which represents Lagrange interpolant (see ratint unit info and BarycentricCalc() description for more information). -- ALGLIB -- Copyright 03.12.2009 by Bochkanov Sergey *************************************************************************/
void alglib::polynomialbuildcheb1( double a, double b, real_1d_array y, barycentricinterpolant& p); void alglib::polynomialbuildcheb1( double a, double b, real_1d_array y, ae_int_t n, barycentricinterpolant& p);

Examples:   [1]  

/************************************************************************* Lagrange intepolant on Chebyshev grid (second kind). This function has O(N) complexity. INPUT PARAMETERS: A - left boundary of [A,B] B - right boundary of [A,B] Y - function values at the nodes, array[0..N-1], Y[I] = Y(0.5*(B+A) + 0.5*(B-A)*Cos(PI*i/(n-1))) N - number of points, N>=1 for N=1 a constant model is constructed. OUTPUT PARAMETERS P - barycentric model which represents Lagrange interpolant (see ratint unit info and BarycentricCalc() description for more information). -- ALGLIB -- Copyright 03.12.2009 by Bochkanov Sergey *************************************************************************/
void alglib::polynomialbuildcheb2( double a, double b, real_1d_array y, barycentricinterpolant& p); void alglib::polynomialbuildcheb2( double a, double b, real_1d_array y, ae_int_t n, barycentricinterpolant& p);

Examples:   [1]  

/************************************************************************* Lagrange intepolant: generation of the model on equidistant grid. This function has O(N) complexity. INPUT PARAMETERS: A - left boundary of [A,B] B - right boundary of [A,B] Y - function values at the nodes, array[0..N-1] N - number of points, N>=1 for N=1 a constant model is constructed. OUTPUT PARAMETERS P - barycentric model which represents Lagrange interpolant (see ratint unit info and BarycentricCalc() description for more information). -- ALGLIB -- Copyright 03.12.2009 by Bochkanov Sergey *************************************************************************/
void alglib::polynomialbuildeqdist( double a, double b, real_1d_array y, barycentricinterpolant& p); void alglib::polynomialbuildeqdist( double a, double b, real_1d_array y, ae_int_t n, barycentricinterpolant& p);

Examples:   [1]  

/************************************************************************* Fast polynomial interpolation function on Chebyshev points (first kind) with O(N) complexity. INPUT PARAMETERS: A - left boundary of [A,B] B - right boundary of [A,B] F - function values, array[0..N-1] N - number of points on Chebyshev grid (first kind), X[i] = 0.5*(B+A) + 0.5*(B-A)*Cos(PI*(2*i+1)/(2*n)) for N=1 a constant model is constructed. T - position where P(x) is calculated RESULT value of the Lagrange interpolant at T IMPORTANT this function provides fast interface which is not overflow-safe nor it is very precise. the best option is to use PolIntBuildCheb1()/BarycentricCalc() subroutines unless you are pretty sure that your data will not result in overflow. -- ALGLIB -- Copyright 02.12.2009 by Bochkanov Sergey *************************************************************************/
double alglib::polynomialcalccheb1( double a, double b, real_1d_array f, double t); double alglib::polynomialcalccheb1( double a, double b, real_1d_array f, ae_int_t n, double t);

Examples:   [1]  

/************************************************************************* Fast polynomial interpolation function on Chebyshev points (second kind) with O(N) complexity. INPUT PARAMETERS: A - left boundary of [A,B] B - right boundary of [A,B] F - function values, array[0..N-1] N - number of points on Chebyshev grid (second kind), X[i] = 0.5*(B+A) + 0.5*(B-A)*Cos(PI*i/(n-1)) for N=1 a constant model is constructed. T - position where P(x) is calculated RESULT value of the Lagrange interpolant at T IMPORTANT this function provides fast interface which is not overflow-safe nor it is very precise. the best option is to use PolIntBuildCheb2()/BarycentricCalc() subroutines unless you are pretty sure that your data will not result in overflow. -- ALGLIB -- Copyright 02.12.2009 by Bochkanov Sergey *************************************************************************/
double alglib::polynomialcalccheb2( double a, double b, real_1d_array f, double t); double alglib::polynomialcalccheb2( double a, double b, real_1d_array f, ae_int_t n, double t);

Examples:   [1]  

/************************************************************************* Fast equidistant polynomial interpolation function with O(N) complexity INPUT PARAMETERS: A - left boundary of [A,B] B - right boundary of [A,B] F - function values, array[0..N-1] N - number of points on equidistant grid, N>=1 for N=1 a constant model is constructed. T - position where P(x) is calculated RESULT value of the Lagrange interpolant at T IMPORTANT this function provides fast interface which is not overflow-safe nor it is very precise. the best option is to use PolynomialBuildEqDist()/BarycentricCalc() subroutines unless you are pretty sure that your data will not result in overflow. -- ALGLIB -- Copyright 02.12.2009 by Bochkanov Sergey *************************************************************************/
double alglib::polynomialcalceqdist( double a, double b, real_1d_array f, double t); double alglib::polynomialcalceqdist( double a, double b, real_1d_array f, ae_int_t n, double t);

Examples:   [1]  

/************************************************************************* Conversion from Chebyshev basis to barycentric representation. This function has O(N^2) complexity. INPUT PARAMETERS: T - coefficients of Chebyshev representation; P(x) = sum { T[i]*Ti(2*(x-A)/(B-A)-1), i=0..N }, where Ti - I-th Chebyshev polynomial. N - number of coefficients: * if given, only leading N elements of T are used * if not given, automatically determined from size of T A,B - base interval for Chebyshev polynomials (see above) A<B OUTPUT PARAMETERS P - polynomial in barycentric form -- ALGLIB -- Copyright 30.09.2010 by Bochkanov Sergey *************************************************************************/
void alglib::polynomialcheb2bar( real_1d_array t, double a, double b, barycentricinterpolant& p); void alglib::polynomialcheb2bar( real_1d_array t, ae_int_t n, double a, double b, barycentricinterpolant& p);
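
A minimal round-trip sketch for the Chebyshev conversions polynomialcheb2bar() and polynomialbar2cheb() (not an official example; the coefficient values are illustrative):

#include "stdafx.h"
#include <stdio.h>
#include "interpolation.h"

using namespace alglib;

int main(int argc, char **argv)
{
    // Chebyshev coefficients of P(x) = 1*T0(x) + 0.5*T1(x) on [-1,+1]
    real_1d_array t = "[1.0,0.5]";
    real_1d_array t2;
    barycentricinterpolant p;

    // Chebyshev basis -> barycentric form and back again
    polynomialcheb2bar(t, -1.0, +1.0, p);
    polynomialbar2cheb(p, -1.0, +1.0, t2);
    printf("%s\n", t2.tostring(2).c_str());   // should reproduce [1.00,0.50]
    return 0;
}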
/************************************************************************* Conversion from power basis to barycentric representation. This function has O(N^2) complexity. INPUT PARAMETERS: A - coefficients, P(x) = sum { A[i]*((X-C)/S)^i, i=0..N-1 } N - number of coefficients (polynomial degree plus 1) * if given, only leading N elements of A are used * if not given, automatically determined from size of A C - offset (see below); 0.0 is used as default value. S - scale (see below); 1.0 is used as default value. S<>0. OUTPUT PARAMETERS P - polynomial in barycentric form NOTES: 1. this function accepts offset and scale, which can be set to improve numerical properties of polynomial. For example, if you interpolate on [-1,+1], you can set C=0 and S=1 and convert from sum of 1, x, x^2, x^3 and so on. In most cases this is exactly what you need. However, if your interpolation model was built on [999,1001], you will see significant growth of numerical errors when using {1, x, x^2, x^3} as input basis. Converting from sum of 1, (x-1000), (x-1000)^2, (x-1000)^3 will be a better option (you have to specify 1000.0 as offset C and 1.0 as scale S). 2. power basis is ill-conditioned and tricks described above can't solve this problem completely. This function will return barycentric model in any case, but for N>8 accuracy will degrade. However, N's less than 5 are pretty safe. -- ALGLIB -- Copyright 30.09.2010 by Bochkanov Sergey *************************************************************************/
void alglib::polynomialpow2bar( real_1d_array a, barycentricinterpolant& p); void alglib::polynomialpow2bar( real_1d_array a, ae_int_t n, double c, double s, barycentricinterpolant& p);

Examples:   [1]  

#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "interpolation.h"

using namespace alglib;


int main(int argc, char **argv)
{
    //
    // Here we demonstrate polynomial interpolation and differentiation
    // of y=x^2-x sampled at [0,1,2]. Barycentric representation of polynomial is used.
    //
    real_1d_array x = "[0,1,2]";
    real_1d_array y = "[0,0,2]";
    double t = -1;
    double v;
    double dv;
    double d2v;
    barycentricinterpolant p;

    // barycentric model is created
    polynomialbuild(x, y, p);

    // barycentric interpolation is demonstrated
    v = barycentriccalc(p, t);
    printf("%.4f\n", double(v)); // EXPECTED: 2.0

    // barycentric differentiation (first derivative) is demonstrated
    barycentricdiff1(p, t, v, dv);
    printf("%.4f\n", double(v)); // EXPECTED: 2.0
    printf("%.4f\n", double(dv)); // EXPECTED: -3.0

    // first and second derivatives with barycentric representation
    barycentricdiff2(p, t, v, dv, d2v);
    printf("%.4f\n", double(v)); // EXPECTED: 2.0
    printf("%.4f\n", double(dv)); // EXPECTED: -3.0
    printf("%.4f\n", double(d2v)); // EXPECTED: 2.0
    return 0;
}


#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "interpolation.h"

using namespace alglib;


int main(int argc, char **argv)
{
    //
    // Here we demonstrate conversion of y=x^2-x
    // between power basis and barycentric representation.
    //
    real_1d_array a = "[0,-1,+1]";
    double t = 2;
    real_1d_array a2;
    double v;
    barycentricinterpolant p;

    //
    // a=[0,-1,+1] is decomposition of y=x^2-x in the power basis:
    //
    //     y = 0 - 1*x + 1*x^2
    //
    // We convert it to the barycentric form.
    //
    polynomialpow2bar(a, p);

    // now we have barycentric interpolation; we can use it for interpolation
    v = barycentriccalc(p, t);
    printf("%.2f\n", double(v)); // EXPECTED: 2.0

    // we can also convert back from barycentric representation to power basis
    polynomialbar2pow(p, a2);
    printf("%s\n", a2.tostring(2).c_str()); // EXPECTED: [0,-1,+1]
    return 0;
}


#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "interpolation.h"

using namespace alglib;


int main(int argc, char **argv)
{
    //
    // Temporaries:
    // * values of y=x^2-x sampled at three special grids:
    //   * equidistant grid spanning [0,2],    x[i] = 2*i/(N-1), i=0..N-1
    //   * Chebyshev-I grid spanning [-1,+1],  x[i] = Cos(PI*(2*i+1)/(2*n)), i=0..N-1
    //   * Chebyshev-II grid spanning [-1,+1], x[i] = Cos(PI*i/(n-1)), i=0..N-1
    // * barycentric interpolants for these three grids
    // * vectors to store coefficients of quadratic representation
    //
    real_1d_array y_eqdist = "[0,0,2]";
    real_1d_array y_cheb1 = "[-0.116025,0.000000,1.616025]";
    real_1d_array y_cheb2 = "[0,0,2]";
    barycentricinterpolant p_eqdist;
    barycentricinterpolant p_cheb1;
    barycentricinterpolant p_cheb2;
    real_1d_array a_eqdist;
    real_1d_array a_cheb1;
    real_1d_array a_cheb2;

    //
    // First, we demonstrate construction of barycentric interpolants on
    // special grids. We unpack power representation to ensure that
    // interpolant was built correctly.
    //
    // In all three cases we should get same quadratic function.
    //
    polynomialbuildeqdist(0.0, 2.0, y_eqdist, p_eqdist);
    polynomialbar2pow(p_eqdist, a_eqdist);
    printf("%s\n", a_eqdist.tostring(4).c_str()); // EXPECTED: [0,-1,+1]

    polynomialbuildcheb1(-1, +1, y_cheb1, p_cheb1);
    polynomialbar2pow(p_cheb1, a_cheb1);
    printf("%s\n", a_cheb1.tostring(4).c_str()); // EXPECTED: [0,-1,+1]

    polynomialbuildcheb2(-1, +1, y_cheb2, p_cheb2);
    polynomialbar2pow(p_cheb2, a_cheb2);
    printf("%s\n", a_cheb2.tostring(4).c_str()); // EXPECTED: [0,-1,+1]

    //
    // Now we demonstrate polynomial interpolation without construction 
    // of the barycentricinterpolant structure.
    //
    // We calculate interpolant value at x=-2.
    // In all three cases we should get same f=6
    //
    double t = -2;
    double v;
    v = polynomialcalceqdist(0.0, 2.0, y_eqdist, t);
    printf("%.4f\n", double(v)); // EXPECTED: 6.0

    v = polynomialcalccheb1(-1, +1, y_cheb1, t);
    printf("%.4f\n", double(v)); // EXPECTED: 6.0

    v = polynomialcalccheb2(-1, +1, y_cheb2, t);
    printf("%.4f\n", double(v)); // EXPECTED: 6.0
    return 0;
}


polynomialsolverreport
polynomialsolve
/************************************************************************* Polynomial root finder report: * MaxErr - estimate of max(|P(xi)|) over the returned roots (see polynomialsolve() below for details). *************************************************************************/
class polynomialsolverreport { double maxerr; };
/************************************************************************* Polynomial root finding. This function returns all roots of the polynomial P(x) = a0 + a1*x + a2*x^2 + ... + an*x^n Both real and complex roots are returned (see below). INPUT PARAMETERS: A - array[N+1], polynomial coefficients: * A[0] is constant term * A[N] is a coefficient of X^N N - polynomial degree OUTPUT PARAMETERS: X - array of complex roots: * for isolated real root, X[I] is strictly real: IMAGE(X[I])=0 * complex roots are always returned in pairs - roots occupy positions I and I+1, with: * X[I+1]=Conj(X[I]) * IMAGE(X[I]) > 0 * IMAGE(X[I+1]) = -IMAGE(X[I]) < 0 * multiple real roots may have non-zero imaginary part due to roundoff errors. There is no reliable way to distinguish real root of multiplicity 2 from two complex roots in the presence of roundoff errors. Rep - report, additional information, following fields are set: * Rep.MaxErr - max( |P(xi)| ) for i=0..N-1. This field allows you to quickly estimate "quality" of the roots being returned. NOTE: this function uses the companion matrix method to find roots. In case internal EVD solver fails to find eigenvalues, exception is generated. NOTE: roots are not "polished" and no matrix balancing is performed for them. -- ALGLIB -- Copyright 24.02.2014 by Bochkanov Sergey *************************************************************************/
void alglib::polynomialsolve( real_1d_array a, ae_int_t n, complex_1d_array& x, polynomialsolverreport& rep);
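
A minimal sketch of polynomialsolve() applied to P(x) = x^2 - 1 (not an official example; assumes the solvers.h header):

#include "stdafx.h"
#include <stdio.h>
#include "solvers.h"

using namespace alglib;

int main(int argc, char **argv)
{
    // coefficients of P(x) = -1 + 0*x + 1*x^2, whose roots are -1 and +1
    real_1d_array a = "[-1,0,1]";
    complex_1d_array x;
    polynomialsolverreport rep;

    polynomialsolve(a, 2, x, rep);
    printf("%s\n", x.tostring(3).c_str());     // two real roots near -1 and +1
    printf("%.2e\n", double(rep.maxerr));      // residual estimate
    return 0;
}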
psi
/************************************************************************* Psi (digamma) function psi(x) = d/dx ln Gamma(x) is the logarithmic derivative of the gamma function. For integer x, psi(n) = -EUL + sum( k=1..n-1, 1/k ). This formula is used for 0 < n <= 10. If x is negative, it is transformed to a positive argument by the reflection formula psi(1-x) = psi(x) + pi*cot(pi*x). For general positive x, the argument is made greater than 10 using the recurrence psi(x+1) = psi(x) + 1/x. Then the following asymptotic expansion is applied: psi(x) = log(x) - 1/(2x) - sum( k=1..inf, B_2k / (2k*x^(2k)) ), where the B_2k are Bernoulli numbers. ACCURACY: Relative error (except absolute when |psi| < 1): arithmetic domain # trials peak rms IEEE 0,30 30000 1.3e-15 1.4e-16 IEEE -30,0 40000 1.5e-15 2.2e-16 Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 1992, 2000 by Stephen L. Moshier *************************************************************************/
double alglib::psi(double x);
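
A minimal sanity-check sketch (illustrative only, assumes the specialfunctions.h header): psi(1) equals -EUL and psi(2) equals 1-EUL.

#include "stdafx.h"
#include <stdio.h>
#include "specialfunctions.h"

using namespace alglib;

int main(int argc, char **argv)
{
    // psi(1) = -EulerGamma, psi(2) = 1 - EulerGamma
    printf("%.6f\n", psi(1.0));   // approx -0.577216
    printf("%.6f\n", psi(2.0));   // approx  0.422784
    return 0;
}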
barycentricinterpolant
barycentricbuildfloaterhormann
barycentricbuildxyw
barycentriccalc
barycentricdiff1
barycentricdiff2
barycentriclintransx
barycentriclintransy
barycentricunpack
/************************************************************************* Barycentric interpolant. *************************************************************************/
class barycentricinterpolant { };
/************************************************************************* Rational interpolant without poles The subroutine constructs the rational interpolating function without real poles (see 'Barycentric rational interpolation with no poles and high rates of approximation', Michael S. Floater and Kai Hormann, for more information on this subject). Input parameters: X - interpolation nodes, array[0..N-1]. Y - function values, array[0..N-1]. N - number of nodes, N>0. D - order of the interpolation scheme, 0 <= D <= N-1. D<0 will cause an error. D>=N will be replaced with D=N-1. If you don't know what D to choose, use a small value, about 3-5. Output parameters: B - barycentric interpolant. Note: this algorithm always succeeds and calculates the weights with close to machine precision. -- ALGLIB PROJECT -- Copyright 17.06.2007 by Bochkanov Sergey *************************************************************************/
void alglib::barycentricbuildfloaterhormann( real_1d_array x, real_1d_array y, ae_int_t n, ae_int_t d, barycentricinterpolant& b);
/************************************************************************* Rational interpolant from X/Y/W arrays F(t) = SUM(i=0,n-1,w[i]*f[i]/(t-x[i])) / SUM(i=0,n-1,w[i]/(t-x[i])) INPUT PARAMETERS: X - interpolation nodes, array[0..N-1] F - function values, array[0..N-1] W - barycentric weights, array[0..N-1] N - nodes count, N>0 OUTPUT PARAMETERS: B - barycentric interpolant built from (X, Y, W) -- ALGLIB -- Copyright 17.08.2009 by Bochkanov Sergey *************************************************************************/
void alglib::barycentricbuildxyw( real_1d_array x, real_1d_array y, real_1d_array w, ae_int_t n, barycentricinterpolant& b);
/************************************************************************* Rational interpolation using barycentric formula F(t) = SUM(i=0,n-1,w[i]*f[i]/(t-x[i])) / SUM(i=0,n-1,w[i]/(t-x[i])) Input parameters: B - barycentric interpolant built with one of model building subroutines. T - interpolation point Result: barycentric interpolant F(t) -- ALGLIB -- Copyright 17.08.2009 by Bochkanov Sergey *************************************************************************/
double alglib::barycentriccalc(barycentricinterpolant b, double t);
/************************************************************************* Differentiation of barycentric interpolant: first derivative. The algorithm used in this subroutine is very robust and should not fail until provided with values too close to MaxRealNumber (usually MaxRealNumber/N or greater will overflow). INPUT PARAMETERS: B - barycentric interpolant built with one of model building subroutines. T - interpolation point OUTPUT PARAMETERS: F - barycentric interpolant at T DF - first derivative -- ALGLIB -- Copyright 17.08.2009 by Bochkanov Sergey *************************************************************************/
void alglib::barycentricdiff1( barycentricinterpolant b, double t, double& f, double& df);
/************************************************************************* Differentiation of barycentric interpolant: first/second derivatives. INPUT PARAMETERS: B - barycentric interpolant built with one of model building subroutines. T - interpolation point OUTPUT PARAMETERS: F - barycentric interpolant at T DF - first derivative D2F - second derivative NOTE: this algorithm may fail due to overflow/underflow if used on data whose values are close to MaxRealNumber or MinRealNumber. Use the more robust BarycentricDiff1() subroutine in such cases. -- ALGLIB -- Copyright 17.08.2009 by Bochkanov Sergey *************************************************************************/
void alglib::barycentricdiff2( barycentricinterpolant b, double t, double& f, double& df, double& d2f);
/************************************************************************* This subroutine performs linear transformation of the argument. INPUT PARAMETERS: B - rational interpolant in barycentric form CA, CB - transformation coefficients: x = CA*t + CB OUTPUT PARAMETERS: B - transformed interpolant with X replaced by T -- ALGLIB PROJECT -- Copyright 19.08.2009 by Bochkanov Sergey *************************************************************************/
void alglib::barycentriclintransx( barycentricinterpolant b, double ca, double cb);
/************************************************************************* This subroutine performs linear transformation of the barycentric interpolant. INPUT PARAMETERS: B - rational interpolant in barycentric form CA, CB - transformation coefficients: B2(x) = CA*B(x) + CB OUTPUT PARAMETERS: B - transformed interpolant -- ALGLIB PROJECT -- Copyright 19.08.2009 by Bochkanov Sergey *************************************************************************/
void alglib::barycentriclintransy( barycentricinterpolant b, double ca, double cb);
/************************************************************************* Extracts X/Y/W arrays from rational interpolant INPUT PARAMETERS: B - barycentric interpolant OUTPUT PARAMETERS: N - nodes count, N>0 X - interpolation nodes, array[0..N-1] F - function values, array[0..N-1] W - barycentric weights, array[0..N-1] -- ALGLIB -- Copyright 17.08.2009 by Bochkanov Sergey *************************************************************************/
void alglib::barycentricunpack( barycentricinterpolant b, ae_int_t& n, real_1d_array& x, real_1d_array& y, real_1d_array& w);
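
A minimal sketch tying together barycentricbuildfloaterhormann(), barycentriccalc() and barycentricdiff1() (the nodes, values and order D below are illustrative assumptions, not an official example):

#include "stdafx.h"
#include <stdio.h>
#include "interpolation.h"

using namespace alglib;

int main(int argc, char **argv)
{
    // rational (Floater-Hormann) interpolation of y=1/(1+x^2) on 5 nodes
    real_1d_array x = "[-2,-1,0,1,2]";
    real_1d_array y = "[0.2,0.5,1.0,0.5,0.2]";
    barycentricinterpolant b;
    double v, dv;

    // D=3 is a commonly suggested order for the Floater-Hormann scheme
    barycentricbuildfloaterhormann(x, y, 5, 3, b);

    // interpolant value at an off-node point
    v = barycentriccalc(b, 0.5);
    printf("%.4f\n", v);

    // value and first derivative at the same point
    barycentricdiff1(b, 0.5, v, dv);
    printf("%.4f %.4f\n", v, dv);
    return 0;
}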
rbfcalcbuffer
rbfmodel
rbfreport
rbfbuildmodel
rbfcalc
rbfcalc1
rbfcalc2
rbfcalc3
rbfcalcbuf
rbfcreate
rbfcreatecalcbuffer
rbfgetmodelversion
rbfgridcalc2
rbfgridcalc2v
rbfgridcalc2vsubset
rbfgridcalc3v
rbfgridcalc3vsubset
rbfserialize
rbfsetalgohierarchical
rbfsetalgomultilayer
rbfsetalgoqnn
rbfsetconstterm
rbfsetlinterm
rbfsetpoints
rbfsetpointsandscales
rbfsetv2bf
rbfsetv2its
rbfsetv2supportr
rbfsetzeroterm
rbftscalcbuf
rbfunpack
rbfunserialize
rbf_d_hrbf Simple model built with HRBF algorithm
rbf_d_polterm RBF models - working with polynomial term
rbf_d_serialize Serialization/unserialization
rbf_d_vector Working with vector functions
/************************************************************************* Buffer object which is used to perform RBF model evaluation requests in the multithreaded mode (multiple threads working with the same RBF model object). This object should be created with rbfcreatecalcbuffer(). *************************************************************************/
class rbfcalcbuffer { };
/************************************************************************* RBF model. Never try to directly work with fields of this object - always use ALGLIB functions to use this object. *************************************************************************/
class rbfmodel { };
/************************************************************************* RBF solution report: * TerminationType - termination type, positive values - success, non-positive - failure. Fields which are set by modern RBF solvers (hierarchical): * RMSError - root-mean-square error; NAN for old solvers (ML, QNN) * MaxError - maximum error; NAN for old solvers (ML, QNN) *************************************************************************/
class rbfreport { double rmserror; double maxerror; ae_int_t arows; ae_int_t acols; ae_int_t annz; ae_int_t iterationscount; ae_int_t nmv; ae_int_t terminationtype; };
/************************************************************************* This function builds RBF model and returns report (contains some information which can be used for evaluation of the algorithm properties). Call to this function modifies RBF model by calculating its centers/radii/ weights and saving them into RBFModel structure. Initially RBFModel contain zero coefficients, but after call to this function we will have coefficients which were calculated in order to fit our dataset. After you called this function you can call RBFCalc(), RBFGridCalc() and other model calculation functions. INPUT PARAMETERS: S - RBF model, initialized by RBFCreate() call Rep - report: * Rep.TerminationType: * -5 - non-distinct basis function centers were detected, interpolation aborted; only QNN returns this error code, other algorithms can handle non- distinct nodes. * -4 - nonconvergence of the internal SVD solver * -3 incorrect model construction algorithm was chosen: QNN or RBF-ML, combined with one of the incompatible features - NX=1 or NX>3; points with per-dimension scales. * 1 - successful termination Fields which are set only by modern RBF solvers (hierarchical or nonnegative; older solvers like QNN and ML initialize these fields by NANs): * rep.rmserror - root-mean-square error at nodes * rep.maxerror - maximum error at nodes Fields are used for debugging purposes: * Rep.IterationsCount - iterations count of the LSQR solver * Rep.NMV - number of matrix-vector products * Rep.ARows - rows count for the system matrix * Rep.ACols - columns count for the system matrix * Rep.ANNZ - number of significantly non-zero elements (elements above some algorithm-determined threshold) NOTE: failure to build model will leave current state of the structure unchanged. -- ALGLIB -- Copyright 13.12.2011 by Bochkanov Sergey *************************************************************************/
void alglib::rbfbuildmodel(rbfmodel s, rbfreport& rep);

Examples:   [1]  [2]  [3]  

/************************************************************************* This function calculates values of the RBF model at the given point. This is general function which can be used for arbitrary NX (dimension of the space of arguments) and NY (dimension of the function itself). However when you have NY=1 you may find more convenient to use rbfcalc2() or rbfcalc3(). If you want to perform parallel model evaluation from multiple threads, use rbftscalcbuf() with per-thread buffer object. This function returns 0.0 when model is not initialized. INPUT PARAMETERS: S - RBF model X - coordinates, array[NX]. X may have more than NX elements, in this case only leading NX will be used. OUTPUT PARAMETERS: Y - function value, array[NY]. Y is out-parameter and reallocated after call to this function. In case you want to reuse previously allocated Y, you may use RBFCalcBuf(), which reallocates Y only when it is too small. -- ALGLIB -- Copyright 13.12.2011 by Bochkanov Sergey *************************************************************************/
void alglib::rbfcalc(rbfmodel s, real_1d_array x, real_1d_array& y);

Examples:   [1]  

/************************************************************************* This function calculates values of the RBF model at the given point. IMPORTANT: this function works only with modern (hierarchical) RBFs. It cannot be used with legacy (version 1) RBFs because older RBF code does not support 1-dimensional models. This function should be used when we have NY=1 (scalar function) and NX=1 (1-dimensional space). If you have 3-dimensional space, use rbfcalc3(). If you have 2-dimensional space, use rbfcalc2(). If you have a general situation (NX-dimensional space, NY-dimensional function) you should use generic rbfcalc(). If you want to perform parallel model evaluation from multiple threads, use rbftscalcbuf() with per-thread buffer object. This function returns 0.0 when: * model is not initialized * NX<>1 * NY<>1 INPUT PARAMETERS: S - RBF model X0 - X-coordinate, finite number RESULT: value of the model or 0.0 (as defined above) -- ALGLIB -- Copyright 13.12.2011 by Bochkanov Sergey *************************************************************************/
double alglib::rbfcalc1(rbfmodel s, double x0);
/************************************************************************* This function calculates values of the RBF model in the given point. This function should be used when we have NY=1 (scalar function) and NX=2 (2-dimensional space). If you have 3-dimensional space, use rbfcalc3(). If you have general situation (NX-dimensional space, NY-dimensional function) you should use generic rbfcalc(). If you want to calculate function values many times, consider using rbfgridcalc2v(), which is far more efficient than many subsequent calls to rbfcalc2(). If you want to perform parallel model evaluation from multiple threads, use rbftscalcbuf() with per-thread buffer object. This function returns 0.0 when: * model is not initialized * NX<>2 *NY<>1 INPUT PARAMETERS: S - RBF model X0 - first coordinate, finite number X1 - second coordinate, finite number RESULT: value of the model or 0.0 (as defined above) -- ALGLIB -- Copyright 13.12.2011 by Bochkanov Sergey *************************************************************************/
double alglib::rbfcalc2(rbfmodel s, double x0, double x1);

Examples:   [1]  [2]  

/************************************************************************* This function calculates value of the RBF model in the given point. This function should be used when we have NY=1 (scalar function) and NX=3 (3-dimensional space). If you have 2-dimensional space, use rbfcalc2(). If you have general situation (NX-dimensional space, NY-dimensional function) you should use generic rbfcalc(). If you want to calculate function values many times, consider using rbfgridcalc3v(), which is far more efficient than many subsequent calls to rbfcalc3(). If you want to perform parallel model evaluation from multiple threads, use rbftscalcbuf() with per-thread buffer object. This function returns 0.0 when: * model is not initialized * NX<>3 *NY<>1 INPUT PARAMETERS: S - RBF model X0 - first coordinate, finite number X1 - second coordinate, finite number X2 - third coordinate, finite number RESULT: value of the model or 0.0 (as defined above) -- ALGLIB -- Copyright 13.12.2011 by Bochkanov Sergey *************************************************************************/
double alglib::rbfcalc3(rbfmodel s, double x0, double x1, double x2);
/************************************************************************* This function calculates values of the RBF model at the given point. Same as rbfcalc(), but does not reallocate Y when in is large enough to store function values. If you want to perform parallel model evaluation from multiple threads, use rbftscalcbuf() with per-thread buffer object. INPUT PARAMETERS: S - RBF model X - coordinates, array[NX]. X may have more than NX elements, in this case only leading NX will be used. Y - possibly preallocated array OUTPUT PARAMETERS: Y - function value, array[NY]. Y is not reallocated when it is larger than NY. -- ALGLIB -- Copyright 13.12.2011 by Bochkanov Sergey *************************************************************************/
void alglib::rbfcalcbuf(rbfmodel s, real_1d_array x, real_1d_array& y);
/************************************************************************* This function creates RBF model for a scalar (NY=1) or vector (NY>1) function in a NX-dimensional space (NX>=1). Newly created model is empty. It can be used for interpolation right after creation, but it just returns zeros. You have to add points to the model, tune interpolation settings, and then call model construction function rbfbuildmodel() which will update model according to your specification. USAGE: 1. User creates model with rbfcreate() 2. User adds dataset with rbfsetpoints() (points do NOT have to be on a regular grid) or rbfsetpointsandscales(). 3. (OPTIONAL) User chooses polynomial term by calling: * rbflinterm() to set linear term * rbfconstterm() to set constant term * rbfzeroterm() to set zero term By default, linear term is used. 4. User tweaks algorithm properties with rbfsetalgohierarchical() method (or chooses one of the legacy algorithms - QNN (rbfsetalgoqnn) or ML (rbfsetalgomultilayer)). 5. User calls rbfbuildmodel() function which rebuilds model according to the specification 6. User may call rbfcalc() to calculate model value at the specified point, rbfgridcalc() to calculate model values at the points of the regular grid. User may extract model coefficients with rbfunpack() call. IMPORTANT: we recommend you to use latest model construction algorithm - hierarchical RBFs, which is activated by rbfsetalgohierarchical() function. This algorithm is the fastest one, and most memory- efficient. However, it is incompatible with older versions of ALGLIB (pre-3.11). So, if you serialize hierarchical model, you will be unable to load it in pre-3.11 ALGLIB. Other model types (QNN and RBF-ML) are still backward-compatible. INPUT PARAMETERS: NX - dimension of the space, NX>=1 NY - function dimension, NY>=1 OUTPUT PARAMETERS: S - RBF model (initially equals to zero) NOTE 1: memory requirements. RBF models require amount of memory which is proportional to the number of data points. Some additional memory is allocated during model construction, but most of this memory is freed after model coefficients are calculated. Amount of this additional memory depends on model construction algorithm being used. NOTE 2: prior to ALGLIB version 3.11, RBF models supported only NX=2 or NX=3. Any attempt to create single-dimensional or more than 3-dimensional RBF model resulted in exception. ALGLIB 3.11 supports any NX>0, but models created with NX!=2 and NX!=3 are incompatible with (a) older versions of ALGLIB, (b) old model construction algorithms (QNN or RBF-ML). So, if you create a model with NX=2 or NX=3, then, depending on specific model construction algorithm being chosen, you will (QNN and RBF-ML) or will not (HierarchicalRBF) get backward compatibility with older versions of ALGLIB. You have a choice here. However, if you create a model with NX neither 2 nor 3, you have no backward compatibility from the start, and you are forced to use hierarchical RBFs and ALGLIB 3.11 or later. -- ALGLIB -- Copyright 13.12.2011, 20.06.2016 by Bochkanov Sergey *************************************************************************/
void alglib::rbfcreate(ae_int_t nx, ae_int_t ny, rbfmodel& s);

Examples:   [1]  [2]  [3]  [4]  
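
In addition to the official examples referenced above, the following minimal sketch walks through the basic workflow described in the USAGE section (the dataset, base radius, layer count and smoothing parameter are illustrative assumptions):

#include "stdafx.h"
#include <stdio.h>
#include "interpolation.h"

using namespace alglib;

int main(int argc, char **argv)
{
    // 2-dimensional scalar dataset: each row is (x0, x1, f)
    real_2d_array xy = "[[-1,0,2],[+1,0,0],[0,1,1]]";
    rbfmodel model;
    rbfreport rep;

    // NX=2, NY=1 model, hierarchical algorithm with base radius 1.0,
    // 3 layers and zero smoothing (parameter values are illustrative)
    rbfcreate(2, 1, model);
    rbfsetpoints(model, xy);
    rbfsetalgohierarchical(model, 1.0, 3, 0.0);
    rbfbuildmodel(model, rep);
    printf("%d\n", int(rep.terminationtype));  // positive value on success

    // evaluate the model at one of the dataset points
    printf("%.2f\n", rbfcalc2(model, 0.0, 1.0));
    return 0;
}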

/************************************************************************* This function creates buffer structure which can be used to perform parallel RBF model evaluations (with one RBF model instance being used from multiple threads, as long as different threads use different instances of buffer). This buffer object can be used with rbftscalcbuf() function (here "ts" stands for "thread-safe", "buf" is a suffix which denotes function which reuses previously allocated output space). How to use it: * create RBF model structure with rbfcreate() * load data, tune parameters * call rbfbuildmodel() * call rbfcreatecalcbuffer(), once per thread working with RBF model (you should call this function only AFTER call to rbfbuildmodel(), see below for more information) * call rbftscalcbuf() from different threads, with each thread working with its own copy of buffer object. INPUT PARAMETERS S - RBF model OUTPUT PARAMETERS Buf - external buffer. IMPORTANT: buffer object should be used only with RBF model object which was used to initialize buffer. Any attempt to use buffer with different object is dangerous - you may get memory violation error because sizes of internal arrays do not fit to dimensions of RBF structure. IMPORTANT: you should call this function only for model which was built with rbfbuildmodel() function, after successful invocation of rbfbuildmodel(). Sizes of some internal structures are determined only after model is built, so buffer object created before model construction stage will be useless (and any attempt to use it will result in exception). -- ALGLIB -- Copyright 02.04.2016 by Sergey Bochkanov *************************************************************************/
void alglib::rbfcreatecalcbuffer(rbfmodel s, rbfcalcbuffer& buf);
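
A minimal single-threaded sketch of the buffer workflow described above; in real code each worker thread would own its buffer and call rbftscalcbuf() concurrently. The parameter order of rbftscalcbuf() (model, buffer, point, output) is assumed here, as is the hierarchical-algorithm setup.

#include "stdafx.h"
#include <stdio.h>
#include "interpolation.h"

using namespace alglib;

int main(int argc, char **argv)
{
    // tiny 2D scalar dataset, rows are (x0, x1, f)
    real_2d_array xy = "[[-1,0,2],[+1,0,0]]";
    rbfmodel model;
    rbfreport rep;
    rbfcalcbuffer buf;
    real_1d_array x = "[0,0]";
    real_1d_array y;

    rbfcreate(2, 1, model);
    rbfsetpoints(model, xy);
    rbfsetalgohierarchical(model, 1.0, 3, 0.0);   // illustrative settings
    rbfbuildmodel(model, rep);

    // buffer must be created AFTER the model is built
    rbfcreatecalcbuffer(model, buf);
    rbftscalcbuf(model, buf, x, y);               // assumed parameter order
    printf("%s\n", y.tostring(2).c_str());
    return 0;
}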
/************************************************************************* This function returns model version. INPUT PARAMETERS: S - RBF model RESULT: * 1 - for models created by QNN and RBF-ML algorithms, compatible with ALGLIB 3.10 or earlier. * 2 - for models created by HierarchicalRBF, requires ALGLIB 3.11 or later -- ALGLIB -- Copyright 06.07.2016 by Bochkanov Sergey *************************************************************************/
ae_int_t alglib::rbfgetmodelversion(rbfmodel s);
/************************************************************************* This is legacy function for gridded calculation of RBF model. It is superseded by rbfgridcalc2v() and rbfgridcalc2vsubset() functions. -- ALGLIB -- Copyright 13.12.2011 by Bochkanov Sergey *************************************************************************/
void alglib::rbfgridcalc2( rbfmodel s, real_1d_array x0, ae_int_t n0, real_1d_array x1, ae_int_t n1, real_2d_array& y);
/************************************************************************* This function calculates values of the RBF model at the regular grid, which has N0*N1 points, with Point[I,J] = (X0[I], X1[J]). Vector-valued RBF models are supported. This function returns 0.0 when: * model is not initialized * NX<>2 FOR USERS OF COMMERCIAL EDITION: ! Commercial version of ALGLIB includes multicore support (C++ and C# ! computational cores) for this function. ! ! Parallel processing gives close-to-linear speedup on multicore systems, ! assuming that problem is large enough to be divided between cores. ! You should solve at least 200x200 problem to get nearly-linear speed- ! up (high efficiency). ! ! Parallel processing is implemented only for modern (hierarchical) ! RBFs. Legacy version 1 RBFs (created by QNN or RBF-ML) are still ! processed serially. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! This note is given for users of commercial edition; if you use GPL ! edition, you still will be able to call smp-version of this function, ! but all computations will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. INPUT PARAMETERS: S - RBF model, used in read-only mode, can be shared between multiple invocations of this function from multiple threads. X0 - array of grid nodes, first coordinates, array[N0]. Must be ordered by ascending. Exception is generated if the array is not correctly ordered. N0 - grid size (number of nodes) in the first dimension X1 - array of grid nodes, second coordinates, array[N1] Must be ordered by ascending. Exception is generated if the array is not correctly ordered. N1 - grid size (number of nodes) in the second dimension OUTPUT PARAMETERS: Y - function values, array[NY*N0*N1], where NY is a number of "output" vector values (this function supports vector- valued RBF models). Y is out-variable and is reallocated by this function. Y[K+NY*(I0+I1*N0)]=F_k(X0[I0],X1[I1]), for: * K=0...NY-1 * I0=0...N0-1 * I1=0...N1-1 NOTE: this function supports weakly ordered grid nodes, i.e. you may have X[i]=X[i+1] for some i. It does not provide you any performance benefits due to duplication of points, just convenience and flexibility. NOTE: this function is re-entrant, i.e. you may use same rbfmodel structure in multiple threads calling this function for different grids. NOTE: if you need function values on some subset of regular grid, which may be described as "several compact and dense islands", you may use rbfgridcalc2vsubset(). -- ALGLIB -- Copyright 27.01.2017 by Bochkanov Sergey *************************************************************************/
void alglib::rbfgridcalc2v( rbfmodel s, real_1d_array x0, ae_int_t n0, real_1d_array x1, ae_int_t n1, real_1d_array& y); void alglib::smp_rbfgridcalc2v( rbfmodel s, real_1d_array x0, ae_int_t n0, real_1d_array x1, ae_int_t n1, real_1d_array& y);
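
A minimal sketch of gridded evaluation with rbfgridcalc2v(); for NY=1 the flat output array is indexed as Y[I0+I1*N0], as described above. The dataset and model settings are illustrative assumptions.

#include "stdafx.h"
#include <stdio.h>
#include "interpolation.h"

using namespace alglib;

int main(int argc, char **argv)
{
    // 2D scalar dataset, rows are (x0, x1, f)
    real_2d_array xy = "[[-1,-1,1],[1,1,2],[-1,1,0],[1,-1,3]]";
    rbfmodel model;
    rbfreport rep;

    rbfcreate(2, 1, model);
    rbfsetpoints(model, xy);
    rbfsetalgohierarchical(model, 1.0, 3, 0.0);   // illustrative settings
    rbfbuildmodel(model, rep);

    // 3x2 grid; Y is a flat array with Y[i0+i1*n0] = F(x0[i0], x1[i1]) for NY=1
    real_1d_array x0 = "[-1,0,1]";
    real_1d_array x1 = "[-1,1]";
    real_1d_array y;
    rbfgridcalc2v(model, x0, 3, x1, 2, y);
    printf("%s\n", y.tostring(2).c_str());        // 6 values
    return 0;
}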
/************************************************************************* This function calculates values of the RBF model at some subset of regular grid: * grid has N0*N1 points, with Point[I,J] = (X0[I], X1[J]) * only values at some subset of this grid are required Vector-valued RBF models are supported. This function returns 0.0 when: * model is not initialized * NX<>2 FOR USERS OF COMMERCIAL EDITION: ! Commercial version of ALGLIB includes multicore support (C++ and C# ! computational cores) for this function. ! ! Parallel processing gives close-to-linear speedup on multicore systems, ! assuming that problem is large enough to be divided between cores. ! You should solve at least 200x200 problem to get nearly-linear speed- ! up (high efficiency). ! ! Parallel processing is implemented only for modern (hierarchical) ! RBFs. Legacy version 1 RBFs (created by QNN or RBF-ML) are still ! processed serially. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! This note is given for users of commercial edition; if you use GPL ! edition, you still will be able to call smp-version of this function, ! but all computations will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. INPUT PARAMETERS: S - RBF model, used in read-only mode, can be shared between multiple invocations of this function from multiple threads. X0 - array of grid nodes, first coordinates, array[N0]. Must be ordered by ascending. Exception is generated if the array is not correctly ordered. N0 - grid size (number of nodes) in the first dimension X1 - array of grid nodes, second coordinates, array[N1] Must be ordered by ascending. Exception is generated if the array is not correctly ordered. N1 - grid size (number of nodes) in the second dimension FlagY - array[N0*N1]: * Y[I0+I1*N0] corresponds to node (X0[I0],X1[I1]) * it is a "bitmap" array which contains False for nodes which are NOT calculated, and True for nodes which are required. OUTPUT PARAMETERS: Y - function values, array[NY*N0*N1*N2], where NY is a number of "output" vector values (this function supports vector- valued RBF models): * Y[K+NY*(I0+I1*N0)]=F_k(X0[I0],X1[I1]), for K=0...NY-1, I0=0...N0-1, I1=0...N1-1. * elements of Y[] which correspond to FlagY[]=True are loaded by model values (which may be exactly zero for some nodes). * elements of Y[] which correspond to FlagY[]=False MAY be initialized by zeros OR may be calculated. This function processes grid as a hierarchy of nested blocks and micro-rows. If just one element of micro-row is required, entire micro-row (up to 8 nodes in the current version, but no promises) is calculated. NOTE: this function supports weakly ordered grid nodes, i.e. you may have X[i]=X[i+1] for some i. It does not provide you any performance benefits due to duplication of points, just convenience and flexibility. NOTE: this function is re-entrant, i.e. you may use same rbfmodel structure in multiple threads calling this function for different grids. -- ALGLIB -- Copyright 04.03.2016 by Bochkanov Sergey *************************************************************************/
void alglib::rbfgridcalc2vsubset( rbfmodel s, real_1d_array x0, ae_int_t n0, real_1d_array x1, ae_int_t n1, boolean_1d_array flagy, real_1d_array& y); void alglib::smp_rbfgridcalc2vsubset( rbfmodel s, real_1d_array x0, ae_int_t n0, real_1d_array x1, ae_int_t n1, boolean_1d_array flagy, real_1d_array& y);
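A minimal sketch of the subset-evaluation call above, assuming a small illustrative scalar model and a 3x3 grid; only nodes flagged True in FlagY are guaranteed to be computed, the rest may remain zero:

#include "stdafx.h"
#include <stdio.h>
#include "interpolation.h"

using namespace alglib;

int main(int argc, char **argv)
{
    // Illustrative scalar 2D model (same toy dataset as the basic examples).
    rbfmodel model;
    rbfreport rep;
    real_2d_array xy = "[[-1,0,2],[+1,0,3]]";
    rbfcreate(2, 1, model);
    rbfsetpoints(model, xy);
    rbfsetalgohierarchical(model, 1.0, 3, 0.0);
    rbfbuildmodel(model, rep);

    // 3x3 grid; X0 and X1 must be sorted in ascending order.
    real_1d_array x0 = "[-1,0,+1]";
    real_1d_array x1 = "[-1,0,+1]";

    // Request only the central node (I0=1, I1=1); FlagY is indexed as FlagY[I0+I1*N0].
    boolean_1d_array flagy = "[false,false,false,false,true,false,false,false,false]";
    real_1d_array y;
    rbfgridcalc2vsubset(model, x0, 3, x1, 3, flagy, y);

    // Y is indexed as Y[K+NY*(I0+I1*N0)]; here NY=1, so the requested value is y[4].
    printf("%.2f\n", double(y[4])); // model value at (0,0)
    return 0;
}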
/************************************************************************* This function calculates values of the RBF model at the regular grid, which has N0*N1*N2 points, with Point[I,J,K] = (X0[I], X1[J], X2[K]). Vector-valued RBF models are supported. This function returns 0.0 when: * model is not initialized * NX<>3 FOR USERS OF COMMERCIAL EDITION: ! Commercial version of ALGLIB includes multicore support (C++ and C# ! computational cores) for this function. ! ! Parallel processing gives close-to-linear speedup on multicore systems, ! assuming that problem is large enough to be divided between cores. ! You should solve at least 50x50x50 problem to get nearly-linear speed- ! up (high efficiency). ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! This note is given for users of commercial edition; if you use GPL ! edition, you still will be able to call smp-version of this function, ! but all computations will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. INPUT PARAMETERS: S - RBF model, used in read-only mode, can be shared between multiple invocations of this function from multiple threads. X0 - array of grid nodes, first coordinates, array[N0]. Must be ordered by ascending. Exception is generated if the array is not correctly ordered. N0 - grid size (number of nodes) in the first dimension X1 - array of grid nodes, second coordinates, array[N1] Must be ordered by ascending. Exception is generated if the array is not correctly ordered. N1 - grid size (number of nodes) in the second dimension X2 - array of grid nodes, third coordinates, array[N2] Must be ordered by ascending. Exception is generated if the array is not correctly ordered. N2 - grid size (number of nodes) in the third dimension OUTPUT PARAMETERS: Y - function values, array[NY*N0*N1*N2], where NY is a number of "output" vector values (this function supports vector- valued RBF models). Y is out-variable and is reallocated by this function. Y[K+NY*(I0+I1*N0+I2*N0*N1)]=F_k(X0[I0],X1[I1],X2[I2]), for: * K=0...NY-1 * I0=0...N0-1 * I1=0...N1-1 * I2=0...N2-1 NOTE: this function supports weakly ordered grid nodes, i.e. you may have X[i]=X[i+1] for some i. It does not provide you any performance benefits due to duplication of points, just convenience and flexibility. NOTE: this function is re-entrant, i.e. you may use same rbfmodel structure in multiple threads calling this function for different grids. NOTE: if you need function values on some subset of regular grid, which may be described as "several compact and dense islands", you may use rbfgridcalc3vsubset(). -- ALGLIB -- Copyright 04.03.2016 by Bochkanov Sergey *************************************************************************/
void alglib::rbfgridcalc3v( rbfmodel s, real_1d_array x0, ae_int_t n0, real_1d_array x1, ae_int_t n1, real_1d_array x2, ae_int_t n2, real_1d_array& y); void alglib::smp_rbfgridcalc3v( rbfmodel s, real_1d_array x0, ae_int_t n0, real_1d_array x1, ae_int_t n1, real_1d_array x2, ae_int_t n2, real_1d_array& y);
/************************************************************************* This function calculates values of the RBF model at some subset of regular grid: * grid has N0*N1*N2 points, with Point[I,J,K] = (X0[I], X1[J], X2[K]) * only values at some subset of this grid are required Vector-valued RBF models are supported. This function returns 0.0 when: * model is not initialized * NX<>3 FOR USERS OF COMMERCIAL EDITION: ! Commercial version of ALGLIB includes multicore support (C++ and C# ! computational cores) for this function. ! ! Parallel processing gives close-to-linear speedup on multicore systems, ! assuming that problem is large enough to be divided between cores. ! You should solve at least 50x50x50 problem to get nearly-linear speed- ! up (high efficiency). ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! This note is given for users of commercial edition; if you use GPL ! edition, you still will be able to call smp-version of this function, ! but all computations will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. INPUT PARAMETERS: S - RBF model, used in read-only mode, can be shared between multiple invocations of this function from multiple threads. X0 - array of grid nodes, first coordinates, array[N0]. Must be ordered by ascending. Exception is generated if the array is not correctly ordered. N0 - grid size (number of nodes) in the first dimension X1 - array of grid nodes, second coordinates, array[N1] Must be ordered by ascending. Exception is generated if the array is not correctly ordered. N1 - grid size (number of nodes) in the second dimension X2 - array of grid nodes, third coordinates, array[N2] Must be ordered by ascending. Exception is generated if the array is not correctly ordered. N2 - grid size (number of nodes) in the third dimension FlagY - array[N0*N1*N2]: * Y[I0+I1*N0+I2*N0*N1] corresponds to node (X0[I0],X1[I1],X2[I2]) * it is a "bitmap" array which contains False for nodes which are NOT calculated, and True for nodes which are required. OUTPUT PARAMETERS: Y - function values, array[NY*N0*N1*N2], where NY is a number of "output" vector values (this function supports vector- valued RBF models): * Y[K+NY*(I0+I1*N0+I2*N0*N1)]=F_k(X0[I0],X1[I1],X2[I2]), for K=0...NY-1, I0=0...N0-1, I1=0...N1-1, I2=0...N2-1. * elements of Y[] which correspond to FlagY[]=True are loaded by model values (which may be exactly zero for some nodes). * elements of Y[] which correspond to FlagY[]=False MAY be initialized by zeros OR may be calculated. This function processes grid as a hierarchy of nested blocks and micro-rows. If just one element of micro-row is required, entire micro-row (up to 8 nodes in the current version, but no promises) is calculated. NOTE: this function supports weakly ordered grid nodes, i.e. you may have X[i]=X[i+1] for some i. It does not provide you any performance benefits due to duplication of points, just convenience and flexibility. NOTE: this function is re-entrant, i.e. you may use same rbfmodel structure in multiple threads calling this function for different grids. -- ALGLIB -- Copyright 04.03.2016 by Bochkanov Sergey *************************************************************************/
void alglib::rbfgridcalc3vsubset( rbfmodel s, real_1d_array x0, ae_int_t n0, real_1d_array x1, ae_int_t n1, real_1d_array x2, ae_int_t n2, boolean_1d_array flagy, real_1d_array& y); void alglib::smp_rbfgridcalc3vsubset( rbfmodel s, real_1d_array x0, ae_int_t n0, real_1d_array x1, ae_int_t n1, real_1d_array x2, ae_int_t n2, boolean_1d_array flagy, real_1d_array& y);
/************************************************************************* This function serializes data structure to string. Important properties of s_out: * it contains alphanumeric characters, dots, underscores, minus signs * these symbols are grouped into words, which are separated by spaces and Windows-style (CR+LF) newlines * although serializer uses spaces and CR+LF as separators, you can replace any separator character by arbitrary combination of spaces, tabs, Windows or Unix newlines. It allows flexible reformatting of the string in case you want to include it into text or XML file. But you should not insert separators into the middle of the "words" nor should you change case of letters. * s_out can be freely moved between 32-bit and 64-bit systems, little and big endian machines, and so on. You can serialize structure on 32-bit machine and unserialize it on 64-bit one (or vice versa), or serialize it on SPARC and unserialize on x86. You can also serialize it in C++ version of ALGLIB and unserialize in C# one, and vice versa. *************************************************************************/
void rbfserialize(rbfmodel &obj, std::string &s_out); void rbfserialize(rbfmodel &obj, std::ostream &s_out);
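As a hedged illustration of the stream overloads listed above, a model can also be written straight to a std::ofstream and read back with the matching stream-based unserializer; the file name used in this sketch is an assumption:

#include "stdafx.h"
#include <fstream>
#include "interpolation.h"

using namespace alglib;

// Sketch: persist an already-built model to disk and load it back.
void save_and_restore(rbfmodel &model)
{
    {
        std::ofstream out("model.rbf");   // illustrative file name
        rbfserialize(model, out);         // std::ostream overload
    }

    rbfmodel restored;
    std::ifstream in("model.rbf");
    rbfunserialize(in, restored);         // std::istream overload (see rbfunserialize below)
}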
/************************************************************************* This function sets RBF interpolation algorithm. ALGLIB supports several RBF algorithms with different properties. This algorithm is called Hierarchical RBF. It similar to its previous incarnation, RBF-ML, i.e. it also builds a sequence of models with decreasing radii. However, it uses more economical way of building upper layers (ones with large radii), which results in faster model construction and evaluation, as well as smaller memory footprint during construction. This algorithm has following important features: * ability to handle millions of points * controllable smoothing via nonlinearity penalization * support for NX-dimensional models with NX=1 or NX>3 (unlike QNN or RBF-ML) * support for specification of per-dimensional radii via scale vector, which is set by means of rbfsetpointsandscales() function. This feature is useful if you solve spatio-temporal interpolation problems, where different radii are required for spatial and temporal dimensions. Running times are roughly proportional to: * N*log(N)*NLayers - for model construction * N*NLayers - for model evaluation You may see that running time does not depend on search radius or points density, just on number of layers in the hierarchy. IMPORTANT: this model construction algorithm was introduced in ALGLIB 3.11 and produces models which are INCOMPATIBLE with previous versions of ALGLIB. You can not unserialize models produced with this function in ALGLIB 3.10 or earlier. INPUT PARAMETERS: S - RBF model, initialized by rbfcreate() call RBase - RBase parameter, RBase>0 NLayers - NLayers parameter, NLayers>0, recommended value to start with - about 5. LambdaNS- >=0, nonlinearity penalty coefficient, negative values are not allowed. This parameter adds controllable smoothing to the problem, which may reduce noise. Specification of non- zero lambda means that in addition to fitting error solver will also minimize LambdaNS*|S''(x)|^2 (appropriately generalized to multiple dimensions. Specification of exactly zero value means that no penalty is added (we do not even evaluate matrix of second derivatives which is necessary for smoothing). Calculation of nonlinearity penalty is costly - it results in several-fold increase of model construction time. Evaluation time remains the same. Optimal lambda is problem-dependent and requires trial and error. Good value to start from is 1e-5...1e-6, which corresponds to slightly noticeable smoothing of the function. Value 1e-2 usually means that quite heavy smoothing is applied. TUNING ALGORITHM In order to use this algorithm you have to choose three parameters: * initial radius RBase * number of layers in the model NLayers * penalty coefficient LambdaNS Initial radius is easy to choose - you can pick any number several times larger than the average distance between points. Algorithm won't break down if you choose radius which is too large (model construction time will increase, but model will be built correctly). Choose such number of layers that RLast=RBase/2^(NLayers-1) (radius used by the last layer) will be smaller than the typical distance between points. In case model error is too large, you can increase number of layers. Having more layers will make model construction and evaluation proportionally slower, but it will allow you to have model which precisely fits your data. 
From the other side, if you want to suppress noise, you can DECREASE number of layers to make your model less flexible (or specify non-zero LambdaNS). TYPICAL ERRORS 1. Using too small number of layers - RBF models with large radius are not flexible enough to reproduce small variations in the target function. You need many layers with different radii, from large to small, in order to have good model. 2. Using initial radius which is too small. You will get model with "holes" in the areas which are too far away from interpolation centers. However, algorithm will work correctly (and quickly) in this case. -- ALGLIB -- Copyright 20.06.2016 by Bochkanov Sergey *************************************************************************/
void alglib::rbfsetalgohierarchical( rbfmodel s, double rbase, ae_int_t nlayers, double lambdans);

Examples:   [1]  [2]  [3]  
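The tuning rule quoted above (pick NLayers so that the last-layer radius RBase/2^(NLayers-1) falls below the typical distance between points) can be coded directly; the spacing, RBase and LambdaNS values in this sketch are illustrative assumptions, not recommended defaults:

#include <math.h>
#include "interpolation.h"

using namespace alglib;

// Sketch: derive NLayers from RBase and an assumed typical point spacing,
// then request slight smoothing via a small nonzero LambdaNS.
void configure_hierarchical(rbfmodel &model)
{
    double rbase   = 4.0;     // several times the average inter-point distance (assumption)
    double spacing = 0.25;    // typical distance between neighbouring points (assumption)

    // smallest layer count with RBase/2^(NLayers-1) below the spacing
    ae_int_t nlayers = 2 + (ae_int_t)floor(log(rbase/spacing)/log(2.0));

    double lambdans = 1.0e-5; // slight, barely noticeable smoothing
    rbfsetalgohierarchical(model, rbase, nlayers, lambdans);
}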

/************************************************************************* DEPRECATED:since version 3.11 ALGLIB includes new RBF model construction algorithm, Hierarchical RBF. This algorithm is faster and requires less memory than QNN and RBF-ML. It is especially good for large-scale interpolation problems. So, we recommend you to consider Hierarchical RBF as default option. ========================================================================== This function sets RBF interpolation algorithm. ALGLIB supports several RBF algorithms with different properties. This algorithm is called RBF-ML. It builds multilayer RBF model, i.e. model with subsequently decreasing radii, which allows us to combine smoothness (due to large radii of the first layers) with exactness (due to small radii of the last layers) and fast convergence. Internally RBF-ML uses many different means of acceleration, from sparse matrices to KD-trees, which results in algorithm whose working time is roughly proportional to N*log(N)*Density*RBase^2*NLayers, where N is a number of points, Density is an average density if points per unit of the interpolation space, RBase is an initial radius, NLayers is a number of layers. RBF-ML is good for following kinds of interpolation problems: 1. "exact" problems (perfect fit) with well separated points 2. least squares problems with arbitrary distribution of points (algorithm gives perfect fit where it is possible, and resorts to least squares fit in the hard areas). 3. noisy problems where we want to apply some controlled amount of smoothing. INPUT PARAMETERS: S - RBF model, initialized by RBFCreate() call RBase - RBase parameter, RBase>0 NLayers - NLayers parameter, NLayers>0, recommended value to start with - about 5. LambdaV - regularization value, can be useful when solving problem in the least squares sense. Optimal lambda is problem- dependent and require trial and error. In our experience, good lambda can be as large as 0.1, and you can use 0.001 as initial guess. Default value - 0.01, which is used when LambdaV is not given. You can specify zero value, but it is not recommended to do so. TUNING ALGORITHM In order to use this algorithm you have to choose three parameters: * initial radius RBase * number of layers in the model NLayers * regularization coefficient LambdaV Initial radius is easy to choose - you can pick any number several times larger than the average distance between points. Algorithm won't break down if you choose radius which is too large (model construction time will increase, but model will be built correctly). Choose such number of layers that RLast=RBase/2^(NLayers-1) (radius used by the last layer) will be smaller than the typical distance between points. In case model error is too large, you can increase number of layers. Having more layers will make model construction and evaluation proportionally slower, but it will allow you to have model which precisely fits your data. From the other side, if you want to suppress noise, you can DECREASE number of layers to make your model less flexible. Regularization coefficient LambdaV controls smoothness of the individual models built for each layer. We recommend you to use default value in case you don't want to tune this parameter, because having non-zero LambdaV accelerates and stabilizes internal iterative algorithm. In case you want to suppress noise you can use LambdaV as additional parameter (larger value = more smoothness) to tune. TYPICAL ERRORS 1. Using initial radius which is too large. 
Memory requirements of the RBF-ML are roughly proportional to N*Density*RBase^2 (where Density is an average density of points per unit of the interpolation space). In the extreme case of the very large RBase we will need O(N^2) units of memory - and many layers in order to decrease radius to some reasonably small value. 2. Using too small number of layers - RBF models with large radius are not flexible enough to reproduce small variations in the target function. You need many layers with different radii, from large to small, in order to have good model. 3. Using initial radius which is too small. You will get model with "holes" in the areas which are too far away from interpolation centers. However, algorithm will work correctly (and quickly) in this case. 4. Using too many layers - you will get too large and too slow model. This model will perfectly reproduce your function, but maybe you will be able to achieve similar results with less layers (and less memory). -- ALGLIB -- Copyright 02.03.2012 by Bochkanov Sergey *************************************************************************/
void alglib::rbfsetalgomultilayer( rbfmodel s, double rbase, ae_int_t nlayers); void alglib::rbfsetalgomultilayer( rbfmodel s, double rbase, ae_int_t nlayers, double lambdav);
/************************************************************************* DEPRECATED:since version 3.11 ALGLIB includes new RBF model construction algorithm, Hierarchical RBF. This algorithm is faster and requires less memory than QNN and RBF-ML. It is especially good for large-scale interpolation problems. So, we recommend you to consider Hierarchical RBF as default option. ========================================================================== This function sets RBF interpolation algorithm. ALGLIB supports several RBF algorithms with different properties. This algorithm is called RBF-QNN and it is good for point sets with following properties: a) all points are distinct b) all points are well separated. c) points distribution is approximately uniform. There is no "contour lines", clusters of points, or other small-scale structures. Algorithm description: 1) interpolation centers are allocated to data points 2) interpolation radii are calculated as distances to the nearest centers times Q coefficient (where Q is a value from [0.75,1.50]). 3) after performing (2) radii are transformed in order to avoid situation when single outlier has very large radius and influences many points across all dataset. Transformation has following form: new_r[i] = min(r[i],Z*median(r[])) where r[i] is I-th radius, median() is a median radius across entire dataset, Z is user-specified value which controls amount of deviation from median radius. When (a) is violated, we will be unable to build RBF model. When (b) or (c) are violated, model will be built, but interpolation quality will be low. See http://www.alglib.net/interpolation/ for more information on this subject. This algorithm is used by default. Additional Q parameter controls smoothness properties of the RBF basis: * Q<0.75 will give perfectly conditioned basis, but terrible smoothness properties (RBF interpolant will have sharp peaks around function values) * Q around 1.0 gives good balance between smoothness and condition number * Q>1.5 will lead to badly conditioned systems and slow convergence of the underlying linear solver (although smoothness will be very good) * Q>2.0 will effectively make optimizer useless because it won't converge within reasonable amount of iterations. It is possible to set such large Q, but it is advised not to do so. INPUT PARAMETERS: S - RBF model, initialized by RBFCreate() call Q - Q parameter, Q>0, recommended value - 1.0 Z - Z parameter, Z>0, recommended value - 5.0 NOTE: this function has some serialization-related subtleties. We recommend you to study serialization examples from ALGLIB Reference Manual if you want to perform serialization of your models. -- ALGLIB -- Copyright 13.12.2011 by Bochkanov Sergey *************************************************************************/
void alglib::rbfsetalgoqnn(rbfmodel s); void alglib::rbfsetalgoqnn(rbfmodel s, double q, double z);
/************************************************************************* This function sets constant term (model is a sum of radial basis functions plus constant). This function won't have effect until next call to RBFBuildModel(). INPUT PARAMETERS: S - RBF model, initialized by RBFCreate() call NOTE: this function has some serialization-related subtleties. We recommend you to study serialization examples from ALGLIB Reference Manual if you want to perform serialization of your models. -- ALGLIB -- Copyright 13.12.2011 by Bochkanov Sergey *************************************************************************/
void alglib::rbfsetconstterm(rbfmodel s);

Examples:   [1]  

/************************************************************************* This function sets linear term (model is a sum of radial basis functions plus linear polynomial). This function won't have effect until next call to RBFBuildModel(). INPUT PARAMETERS: S - RBF model, initialized by RBFCreate() call NOTE: this function has some serialization-related subtleties. We recommend you to study serialization examples from ALGLIB Reference Manual if you want to perform serialization of your models. -- ALGLIB -- Copyright 13.12.2011 by Bochkanov Sergey *************************************************************************/
void alglib::rbfsetlinterm(rbfmodel s);

Examples:   [1]  

/************************************************************************* This function adds dataset. This function overrides results of the previous calls, i.e. multiple calls of this function will result in only the last set being added. IMPORTANT: ALGLIB version 3.11 and later allows you to specify a set of per-dimension scales. Interpolation radii are multiplied by the scale vector. It may be useful if you have mixed spatio-temporal data (say, a set of 3D slices recorded at different times). You should call rbfsetpointsandscales() function to use this feature. INPUT PARAMETERS: S - RBF model, initialized by rbfcreate() call. XY - points, array[N,NX+NY]. One row corresponds to one point in the dataset. First NX elements are coordinates, next NY elements are function values. Array may be larger than specified, in this case only leading [N,NX+NY] elements will be used. N - number of points in the dataset After you've added dataset and (optionally) tuned algorithm settings you should call rbfbuildmodel() in order to build a model for you. NOTE: dataset added by this function is not saved during model serialization. MODEL ITSELF is serialized, but data used to build it are not. So, if you 1) add dataset to empty RBF model, 2) serialize and unserialize it, then you will get an empty RBF model with no dataset being attached. From the other side, if you call rbfbuildmodel() between (1) and (2), then after (2) you will get your fully constructed RBF model - but again with no dataset attached, so subsequent calls to rbfbuildmodel() will produce empty model. -- ALGLIB -- Copyright 13.12.2011 by Bochkanov Sergey *************************************************************************/
void alglib::rbfsetpoints(rbfmodel s, real_2d_array xy); void alglib::rbfsetpoints(rbfmodel s, real_2d_array xy, ae_int_t n);

Examples:   [1]  [2]  [3]  

/************************************************************************* This function adds dataset and a vector of per-dimension scales. It may be useful if you have mixed spatio-temporal data - say, a set of 3D slices recorded at different times. Such data typically require different RBF radii for spatial and temporal dimensions. ALGLIB solves this problem by specifying single RBF radius, which is (optionally) multiplied by the scale vector. This function overrides results of the previous calls, i.e. multiple calls of this function will result in only the last set being added. IMPORTANT: only HierarchicalRBF algorithm can work with scaled points. So, using this function results in RBF models which can be used in ALGLIB 3.11 or later. Previous versions of the library will be unable to unserialize models produced by HierarchicalRBF algo. Any attempt to use this function with RBF-ML or QNN algorithms will result in -3 error code being returned (incorrect algorithm). INPUT PARAMETERS: R - RBF model, initialized by rbfcreate() call. XY - points, array[N,NX+NY]. One row corresponds to one point in the dataset. First NX elements are coordinates, next NY elements are function values. Array may be larger than specified, in this case only leading [N,NX+NY] elements will be used. N - number of points in the dataset S - array[NX], scale vector, S[i]>0. After you've added dataset and (optionally) tuned algorithm settings you should call rbfbuildmodel() in order to build a model for you. NOTE: dataset added by this function is not saved during model serialization. MODEL ITSELF is serialized, but data used to build it are not. So, if you 1) add dataset to empty RBF model, 2) serialize and unserialize it, then you will get an empty RBF model with no dataset being attached. From the other side, if you call rbfbuildmodel() between (1) and (2), then after (2) you will get your fully constructed RBF model - but again with no dataset attached, so subsequent calls to rbfbuildmodel() will produce empty model. -- ALGLIB -- Copyright 20.06.2016 by Bochkanov Sergey *************************************************************************/
void alglib::rbfsetpointsandscales( rbfmodel r, real_2d_array xy, real_1d_array s); void alglib::rbfsetpointsandscales( rbfmodel r, real_2d_array xy, ae_int_t n, real_1d_array s);
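A sketch of the scaled-points workflow, assuming an NX=3 spatio-temporal dataset (x, y, t) with one function value per point; the data and the scale vector below are purely illustrative:

#include "stdafx.h"
#include <stdio.h>
#include "interpolation.h"

using namespace alglib;

int main(int argc, char **argv)
{
    rbfmodel model;
    rbfreport rep;
    rbfcreate(3, 1, model);   // NX=3 (x, y, t), NY=1

    // Four sample points: two spatial locations recorded at t=0 and t=10.
    real_2d_array xy = "[[0,0,0,1],[1,0,0,2],[0,0,10,3],[1,0,10,4]]";

    // Per-dimension scales: the temporal axis is stretched by a factor of 10,
    // so one radius unit spans 10 time units but only 1 spatial unit.
    real_1d_array s = "[1,1,10]";
    rbfsetpointsandscales(model, xy, s);

    // Scaled points are supported by the HierarchicalRBF algorithm only.
    rbfsetalgohierarchical(model, 1.0, 3, 0.0);
    rbfbuildmodel(model, rep);
    printf("%d\n", int(rep.terminationtype));
    return 0;
}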
/************************************************************************* This function sets basis function type, which can be: * 0 for classic Gaussian * 1 for fast and compact bell-like basis function, which becomes exactly zero at distance equal to 3*R (default option). INPUT PARAMETERS: S - RBF model, initialized by RBFCreate() call BF - basis function type: * 0 - classic Gaussian * 1 - fast and compact one -- ALGLIB -- Copyright 01.02.2017 by Bochkanov Sergey *************************************************************************/
void alglib::rbfsetv2bf(rbfmodel s, ae_int_t bf);
/************************************************************************* This function sets stopping criteria of the underlying linear solver for hierarchical (version 2) RBF constructor. INPUT PARAMETERS: S - RBF model, initialized by RBFCreate() call MaxIts - this criterion will stop algorithm after MaxIts iterations. Typically a few hundred iterations are required, with 400 being a good default value to start experimentation. Zero value means that default value will be selected. -- ALGLIB -- Copyright 01.02.2017 by Bochkanov Sergey *************************************************************************/
void alglib::rbfsetv2its(rbfmodel s, ae_int_t maxits);
/************************************************************************* This function sets support radius parameter of hierarchical (version 2) RBF constructor. Hierarchical RBF model achieves great speed-up by removing from the model excessive (too dense) nodes. Say, if you have RBF radius equal to 1 meter, and two nodes are just 1 millimeter apart, you may remove one of them without reducing model quality. Support radius parameter is used to decide which points need removal, and which do not. If two points are less than SUPPORT_R*CUR_RADIUS units of distance apart, one of them is removed from the model. The larger support radius is, the faster model construction AND evaluation are. However, too large values result in "bumpy" models. INPUT PARAMETERS: S - RBF model, initialized by RBFCreate() call R - support radius coefficient, >=0. Recommended values are in the [0.1,0.4] range, with 0.1 being default value. -- ALGLIB -- Copyright 01.02.2017 by Bochkanov Sergey *************************************************************************/
void alglib::rbfsetv2supportr(rbfmodel s, double r);
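For reference, a short hedged sketch combining the three version-2 (hierarchical) tuning calls documented above; the particular values are illustrative, not recommendations:

#include "interpolation.h"

using namespace alglib;

// Sketch: tune the hierarchical (version 2) RBF constructor before rbfbuildmodel().
void tune_v2(rbfmodel &model)
{
    rbfsetv2bf(model, 1);           // compact bell-like basis (default); 0 = classic Gaussian
    rbfsetv2its(model, 400);        // cap the underlying linear solver at 400 iterations
    rbfsetv2supportr(model, 0.1);   // support radius coefficient, 0.1 is the documented default
}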
/************************************************************************* This function sets zero term (model is a sum of radial basis functions without polynomial term). This function won't have effect until next call to RBFBuildModel(). INPUT PARAMETERS: S - RBF model, initialized by RBFCreate() call NOTE: this function has some serialization-related subtleties. We recommend you to study serialization examples from ALGLIB Reference Manual if you want to perform serialization of your models. -- ALGLIB -- Copyright 13.12.2011 by Bochkanov Sergey *************************************************************************/
void alglib::rbfsetzeroterm(rbfmodel s);

Examples:   [1]  

/************************************************************************* This function calculates values of the RBF model at the given point, using external buffer object (internal temporaries of RBF model are not modified). This function allows to use same RBF model object in different threads, assuming that different threads use different instances of buffer structure. INPUT PARAMETERS: S - RBF model, may be shared between different threads Buf - buffer object created for this particular instance of RBF model with rbfcreatecalcbuffer(). X - coordinates, array[NX]. X may have more than NX elements, in this case only leading NX will be used. Y - possibly preallocated array OUTPUT PARAMETERS: Y - function value, array[NY]. Y is not reallocated when it is larger than NY. -- ALGLIB -- Copyright 13.12.2011 by Bochkanov Sergey *************************************************************************/
void alglib::rbftscalcbuf( rbfmodel s, rbfcalcbuffer buf, real_1d_array x, real_1d_array& y);
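A minimal sketch of the thread-safe evaluation pattern described above: the model is shared read-only, while each worker owns its own rbfcalcbuffer created with rbfcreatecalcbuffer(); the threading machinery around this call is assumed to exist and is not shown:

#include <stdio.h>
#include "interpolation.h"

using namespace alglib;

// Sketch: evaluate a shared model using a per-thread buffer. In a real program
// each worker thread would create and keep its own rbfcalcbuffer instance.
void worker_eval(const rbfmodel &model, double px, double py)
{
    rbfcalcbuffer buf;
    rbfcreatecalcbuffer(model, buf);   // thread-local temporaries for this model

    real_1d_array x;
    x.setlength(2);
    x[0] = px;
    x[1] = py;

    real_1d_array y;
    rbftscalcbuf(model, buf, x, y);    // internals of 'model' are not modified
    printf("%.3f\n", double(y[0]));
}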
/************************************************************************* This function "unpacks" RBF model by extracting its coefficients. INPUT PARAMETERS: S - RBF model OUTPUT PARAMETERS: NX - dimensionality of argument NY - dimensionality of the target function XWR - model information, array[NC,NX+NY+1]. One row of the array corresponds to one basis function: * first NX columns - coordinates of the center * next NY columns - weights, one per dimension of the function being modelled For ModelVersion=1: * last column - radius, same for all dimensions of the function being modelled For ModelVersion=2: * last NX columns - radii, one per dimension NC - number of the centers V - polynomial term , array[NY,NX+1]. One row per one dimension of the function being modelled. First NX elements are linear coefficients, V[NX] is equal to the constant part. ModelVersion-version of the RBF model: * 1 - for models created by QNN and RBF-ML algorithms, compatible with ALGLIB 3.10 or earlier. * 2 - for models created by HierarchicalRBF, requires ALGLIB 3.11 or later -- ALGLIB -- Copyright 13.12.2011 by Bochkanov Sergey *************************************************************************/
void alglib::rbfunpack( rbfmodel s, ae_int_t& nx, ae_int_t& ny, real_2d_array& xwr, ae_int_t& nc, real_2d_array& v, ae_int_t& modelversion);

Examples:   [1]  

/************************************************************************* This function unserializes data structure from string. *************************************************************************/
void rbfunserialize(const std::string &s_in, rbfmodel &obj); void rbfunserialize(const std::istream &s_in, rbfmodel &obj);
#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "interpolation.h"

using namespace alglib;


int main(int argc, char **argv)
{
    //
    // This example illustrates basic concepts of the RBF models: creation, modification,
    // evaluation.
    // 
    // Suppose that we have a set of 2-dimensional points with associated
    // scalar function values, and we want to build a RBF model using
    // our data.
    // 
    // NOTE: we can work with 3D models too :)
    // 
    // Typical sequence of steps is given below:
    // 1. we create RBF model object
    // 2. we attach our dataset to the RBF model and tune algorithm settings
    // 3. we rebuild RBF model using the hierarchical algorithm on new data
    // 4. we use RBF model (evaluate, serialize, etc.)
    //
    double v;

    //
    // Step 1: RBF model creation.
    //
    // We have to specify dimensionality of the space (2 or 3) and
    // dimensionality of the function (scalar or vector).
    //
    // New model is empty - it can be evaluated,
    // but we just get zero value at any point.
    //
    rbfmodel model;
    rbfcreate(2, 1, model);

    v = rbfcalc2(model, 0.0, 0.0);
    printf("%.2f\n", double(v)); // EXPECTED: 0.000

    //
    // Step 2: we add dataset.
    //
    // XY contains two points - x0=(-1,0) and x1=(+1,0) -
    // and two function values f(x0)=2, f(x1)=3.
    //
    // We added points, but model was not rebuilt yet.
    // If we call rbfcalc2(), we still will get 0.0 as result.
    //
    real_2d_array xy = "[[-1,0,2],[+1,0,3]]";
    rbfsetpoints(model, xy);

    v = rbfcalc2(model, 0.0, 0.0);
    printf("%.2f\n", double(v)); // EXPECTED: 0.000

    //
    // Step 3: rebuild model
    //
    // After we've configured model, we should rebuild it -
    // it will change coefficients stored internally in the
    // rbfmodel structure.
    //
    // We use hierarchical RBF algorithm with following parameters:
    // * RBase - set to 1.0
    // * NLayers - three layers are used (although such simple problem
    //   does not need more than 1 layer)
    // * LambdaNS - is set to zero value, no smoothing is required
    //
    rbfreport rep;
    rbfsetalgohierarchical(model, 1.0, 3, 0.0);
    rbfbuildmodel(model, rep);
    printf("%d\n", int(rep.terminationtype)); // EXPECTED: 1

    //
    // Step 4: model was built
    //
    // After call of rbfbuildmodel(), rbfcalc2() will return
    // value of the new model.
    //
    v = rbfcalc2(model, 0.0, 0.0);
    printf("%.2f\n", double(v)); // EXPECTED: 2.500
    return 0;
}


#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "interpolation.h"

using namespace alglib;


int main(int argc, char **argv)
{
    //
    // This example shows how to work with the polynomial term
    // 
    // Suppose that we have a set of 2-dimensional points with associated
    // scalar function values, and we want to build a RBF model using
    // our data.
    //
    // We use hierarchical RBF algorithm with following parameters:
    // * RBase - set to 1.0
    // * NLayers - three layers are used (although such simple problem
    //   does not need more than 1 layer)
    // * LambdaNS - is set to zero value, no smoothing is required
    //
    double v;
    rbfmodel model;
    real_2d_array xy = "[[-1,0,2],[+1,0,3]]";
    rbfreport rep;

    rbfcreate(2, 1, model);
    rbfsetpoints(model, xy);
    rbfsetalgohierarchical(model, 1.0, 3, 0.0);

    //
    // By default, RBF model uses linear term. It means that model
    // looks like
    //     f(x,y) = SUM(RBF[i]) + a*x + b*y + c
    // where RBF[i] is I-th radial basis function and a*x+b*y+c is a
    // linear term. Having linear terms in a model gives us:
    // (1) improved extrapolation properties
    // (2) linearity of the model when data can be perfectly fitted
    //     by the linear function
    // (3) linear asymptotic behavior
    //
    // Our simple dataset can be modelled by the linear function
    //     f(x,y) = 0.5*x + 2.5
    // and rbfbuildmodel() with default settings should preserve this
    // linearity.
    //
    ae_int_t nx;
    ae_int_t ny;
    ae_int_t nc;
    ae_int_t modelversion;
    real_2d_array xwr;
    real_2d_array c;
    rbfbuildmodel(model, rep);
    printf("%d\n", int(rep.terminationtype)); // EXPECTED: 1
    rbfunpack(model, nx, ny, xwr, nc, c, modelversion);
    printf("%s\n", c.tostring(2).c_str()); // EXPECTED: [[0.500,0.000,2.500]]

    // asymptotic behavior of our function is linear
    v = rbfcalc2(model, 1000.0, 0.0);
    printf("%.1f\n", double(v)); // EXPECTED: 502.50

    //
    // Instead of linear term we can use constant term. In this case
    // we will get model which has form
    //     f(x,y) = SUM(RBF[i]) + c
    // where RBF[i] is I-th radial basis function and c is a constant,
    // which is equal to the average function value on the dataset.
    //
    // Because we've already attached dataset to the model the only
    // thing we have to do is to call rbfsetconstterm() and then
    // rebuild model with rbfbuildmodel().
    //
    rbfsetconstterm(model);
    rbfbuildmodel(model, rep);
    printf("%d\n", int(rep.terminationtype)); // EXPECTED: 1
    rbfunpack(model, nx, ny, xwr, nc, c, modelversion);
    printf("%s\n", c.tostring(2).c_str()); // EXPECTED: [[0.000,0.000,2.500]]

    // asymptotic behavior of our function is constant
    v = rbfcalc2(model, 1000.0, 0.0);
    printf("%.2f\n", double(v)); // EXPECTED: 2.500

    //
    // Finally, we can use zero term. Just plain RBF without polynomial
    // part:
    //     f(x,y) = SUM(RBF[i])
    // where RBF[i] is I-th radial basis function.
    //
    rbfsetzeroterm(model);
    rbfbuildmodel(model, rep);
    printf("%d\n", int(rep.terminationtype)); // EXPECTED: 1
    rbfunpack(model, nx, ny, xwr, nc, c, modelversion);
    printf("%s\n", c.tostring(2).c_str()); // EXPECTED: [[0.000,0.000,0.000]]

    // asymptotic behavior of our function is just zero constant
    v = rbfcalc2(model, 1000.0, 0.0);
    printf("%.2f\n", double(v)); // EXPECTED: 0.000
    return 0;
}


#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "interpolation.h"

using namespace alglib;


int main(int argc, char **argv)
{
    //
    // This example shows how to serialize and unserialize an RBF model
    // 
    // Suppose that we have a set of 2-dimensional points with associated
    // scalar function values, and we want to build a RBF model using
    // our data. Then we want to serialize it to string and to unserialize
    // from string, loading to another instance of RBF model.
    //
    // Here we assume that you already know how to create RBF models.
    //
    std::string s;
    double v;
    rbfmodel model0;
    rbfmodel model1;
    real_2d_array xy = "[[-1,0,2],[+1,0,3]]";
    rbfreport rep;

    // model initialization
    rbfcreate(2, 1, model0);
    rbfsetpoints(model0, xy);
    rbfsetalgohierarchical(model0, 1.0, 3, 0.0);
    rbfbuildmodel(model0, rep);
    printf("%d\n", int(rep.terminationtype)); // EXPECTED: 1

    //
    // Serialization - it looks easy,
    // but you should carefully read next section.
    //
    alglib::rbfserialize(model0, s);
    alglib::rbfunserialize(s, model1);

    // both models return same value
    v = rbfcalc2(model0, 0.0, 0.0);
    printf("%.2f\n", double(v)); // EXPECTED: 2.500
    v = rbfcalc2(model1, 0.0, 0.0);
    printf("%.2f\n", double(v)); // EXPECTED: 2.500

    //
    // Previous section shows that model state is saved/restored during
    // serialization. However, some properties are NOT serialized.
    //
    // Serialization saves/restores RBF model, but it does NOT save/restore
    // settings which were used to build current model. In particular, dataset
    // which was used to build model, is not preserved.
    //
    // What does it mean for us?
    //
    // Do you remember this sequence: rbfcreate-rbfsetpoints-rbfbuildmodel?
    // First step creates model, second step adds dataset and tunes model
    // settings, third step builds model using current dataset and model
    // construction settings.
    //
    // If you call rbfbuildmodel() without calling rbfsetpoints() first, you
    // will get empty (zero) RBF model. In our example, model0 contains
    // dataset which was added by rbfsetpoints() call. However, model1 does
    // NOT contain dataset - because dataset is NOT serialized.
    //
    // Thus, if we call rbfbuildmodel(model0,rep), we will get the same model,
    // which returns 2.5 at (x,y)=(0,0). However, after same call model1 will
    // return zero - because it contains RBF model (coefficients), but does NOT
    // contain dataset which was used to build this model.
    //
    // Basically, it means that:
    // * serialization of the RBF model preserves anything related to the model
    //   EVALUATION
    // * but it does NOT create a perfect copy of the original object.
    //
    rbfbuildmodel(model0, rep);
    v = rbfcalc2(model0, 0.0, 0.0);
    printf("%.2f\n", double(v)); // EXPECTED: 2.500

    rbfbuildmodel(model1, rep);
    v = rbfcalc2(model1, 0.0, 0.0);
    printf("%.2f\n", double(v)); // EXPECTED: 0.000
    return 0;
}


#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "interpolation.h"

using namespace alglib;


int main(int argc, char **argv)
{
    //
    // Suppose that we have a set of 2-dimensional points with associated VECTOR
    // function values, and we want to build a RBF model using our data.
    // 
    // Typical sequence of steps is given below:
    // 1. we create RBF model object
    // 2. we attach our dataset to the RBF model and tune algorithm settings
    // 3. we rebuild RBF model using new data
    // 4. we use RBF model (evaluate, serialize, etc.)
    //
    real_1d_array x;
    real_1d_array y;

    //
    // Step 1: RBF model creation.
    //
    // We have to specify dimensionality of the space (equal to 2) and
    // dimensionality of the function (2-dimensional vector function).
    //
    // New model is empty - it can be evaluated,
    // but we just get zero value at any point.
    //
    rbfmodel model;
    rbfcreate(2, 2, model);

    x = "[+1,+1]";
    rbfcalc(model, x, y);
    printf("%s\n", y.tostring(2).c_str()); // EXPECTED: [0.000,0.000]

    //
    // Step 2: we add dataset.
    //
    // XY array contains four points:
    // * (x0,y0) = (+1,+1), f(x0,y0)=(0,-1)
    // * (x1,y1) = (+1,-1), f(x1,y1)=(-1,0)
    // * (x2,y2) = (-1,-1), f(x2,y2)=(0,+1)
    // * (x3,y3) = (-1,+1), f(x3,y3)=(+1,0)
    //
    real_2d_array xy = "[[+1,+1,0,-1],[+1,-1,-1,0],[-1,-1,0,+1],[-1,+1,+1,0]]";
    rbfsetpoints(model, xy);

    // We added points, but model was not rebuilt yet.
    // If we call rbfcalc(), we still will get 0.0 as result.
    rbfcalc(model, x, y);
    printf("%s\n", y.tostring(2).c_str()); // EXPECTED: [0.000,0.000]

    //
    // Step 3: rebuild model
    //
    // We use hierarchical RBF algorithm with following parameters:
    // * RBase - set to 1.0
    // * NLayers - three layers are used (although such simple problem
    //   does not need more than 1 layer)
    // * LambdaNS - is set to zero value, no smoothing is required
    //
    // After we've configured model, we should rebuild it -
    // it will change coefficients stored internally in the
    // rbfmodel structure.
    //
    rbfreport rep;
    rbfsetalgohierarchical(model, 1.0, 3, 0.0);
    rbfbuildmodel(model, rep);
    printf("%d\n", int(rep.terminationtype)); // EXPECTED: 1

    //
    // Step 4: model was built
    //
    // After call of rbfbuildmodel(), rbfcalc() will return
    // value of the new model.
    //
    rbfcalc(model, x, y);
    printf("%s\n", y.tostring(2).c_str()); // EXPECTED: [0.000,-1.000]
    return 0;
}


cmatrixlurcond1
cmatrixlurcondinf
cmatrixrcond1
cmatrixrcondinf
cmatrixtrrcond1
cmatrixtrrcondinf
hpdmatrixcholeskyrcond
hpdmatrixrcond
rmatrixlurcond1
rmatrixlurcondinf
rmatrixrcond1
rmatrixrcondinf
rmatrixtrrcond1
rmatrixtrrcondinf
spdmatrixcholeskyrcond
spdmatrixrcond
/************************************************************************* Estimate of the condition number of a matrix given by its LU decomposition (1-norm) The algorithm calculates a lower bound of the condition number. In this case, the algorithm does not return a lower bound of the condition number, but an inverse number (to avoid an overflow in case of a singular matrix). Input parameters: LUA - LU decomposition of a matrix in compact form. Output of the CMatrixLU subroutine. N - size of matrix A. Result: 1/LowerBound(cond(A)) NOTE: if k(A) is very large, then matrix is assumed degenerate, k(A)=INF, 0.0 is returned in such cases. *************************************************************************/
double alglib::cmatrixlurcond1(complex_2d_array lua, ae_int_t n);
/************************************************************************* Estimate of the condition number of a matrix given by its LU decomposition (infinity norm). The algorithm calculates a lower bound of the condition number. In this case, the algorithm does not return a lower bound of the condition number, but an inverse number (to avoid an overflow in case of a singular matrix). Input parameters: LUA - LU decomposition of a matrix in compact form. Output of the CMatrixLU subroutine. N - size of matrix A. Result: 1/LowerBound(cond(A)) NOTE: if k(A) is very large, then matrix is assumed degenerate, k(A)=INF, 0.0 is returned in such cases. *************************************************************************/
double alglib::cmatrixlurcondinf(complex_2d_array lua, ae_int_t n);
/************************************************************************* Estimate of a matrix condition number (1-norm) The algorithm calculates a lower bound of the condition number. In this case, the algorithm does not return a lower bound of the condition number, but an inverse number (to avoid an overflow in case of a singular matrix). Input parameters: A - matrix. Array whose indexes range within [0..N-1, 0..N-1]. N - size of matrix A. Result: 1/LowerBound(cond(A)) NOTE: if k(A) is very large, then matrix is assumed degenerate, k(A)=INF, 0.0 is returned in such cases. *************************************************************************/
double alglib::cmatrixrcond1(complex_2d_array a, ae_int_t n);
/************************************************************************* Estimate of a matrix condition number (infinity-norm). The algorithm calculates a lower bound of the condition number. In this case, the algorithm does not return a lower bound of the condition number, but an inverse number (to avoid an overflow in case of a singular matrix). Input parameters: A - matrix. Array whose indexes range within [0..N-1, 0..N-1]. N - size of matrix A. Result: 1/LowerBound(cond(A)) NOTE: if k(A) is very large, then matrix is assumed degenerate, k(A)=INF, 0.0 is returned in such cases. *************************************************************************/
double alglib::cmatrixrcondinf(complex_2d_array a, ae_int_t n);
/************************************************************************* Triangular matrix: estimate of a condition number (1-norm) The algorithm calculates a lower bound of the condition number. In this case, the algorithm does not return a lower bound of the condition number, but an inverse number (to avoid an overflow in case of a singular matrix). Input parameters: A - matrix. Array[0..N-1, 0..N-1]. N - size of A. IsUpper - True, if the matrix is upper triangular. IsUnit - True, if the matrix has a unit diagonal. Result: 1/LowerBound(cond(A)) NOTE: if k(A) is very large, then matrix is assumed degenerate, k(A)=INF, 0.0 is returned in such cases. *************************************************************************/
double alglib::cmatrixtrrcond1( complex_2d_array a, ae_int_t n, bool isupper, bool isunit);
/************************************************************************* Triangular matrix: estimate of a matrix condition number (infinity-norm). The algorithm calculates a lower bound of the condition number. In this case, the algorithm does not return a lower bound of the condition number, but an inverse number (to avoid an overflow in case of a singular matrix). Input parameters: A - matrix. Array whose indexes range within [0..N-1, 0..N-1]. N - size of matrix A. IsUpper - True, if the matrix is upper triangular. IsUnit - True, if the matrix has a unit diagonal. Result: 1/LowerBound(cond(A)) NOTE: if k(A) is very large, then matrix is assumed degenerate, k(A)=INF, 0.0 is returned in such cases. *************************************************************************/
double alglib::cmatrixtrrcondinf( complex_2d_array a, ae_int_t n, bool isupper, bool isunit);
/************************************************************************* Condition number estimate of a Hermitian positive definite matrix given by Cholesky decomposition. The algorithm calculates a lower bound of the condition number. In this case, the algorithm does not return a lower bound of the condition number, but an inverse number (to avoid an overflow in case of a singular matrix). It should be noted that 1-norm and inf-norm condition numbers of symmetric matrices are equal, so the algorithm doesn't take into account the differences between these types of norms. Input parameters: CD - Cholesky decomposition of matrix A, output of SMatrixCholesky subroutine. N - size of matrix A. Result: 1/LowerBound(cond(A)) NOTE: if k(A) is very large, then matrix is assumed degenerate, k(A)=INF, 0.0 is returned in such cases. *************************************************************************/
double alglib::hpdmatrixcholeskyrcond( complex_2d_array a, ae_int_t n, bool isupper);
/************************************************************************* Condition number estimate of a Hermitian positive definite matrix. The algorithm calculates a lower bound of the condition number. In this case, the algorithm does not return a lower bound of the condition number, but an inverse number (to avoid an overflow in case of a singular matrix). It should be noted that 1-norm and inf-norm of condition numbers of symmetric matrices are equal, so the algorithm doesn't take into account the differences between these types of norms. Input parameters: A - Hermitian positive definite matrix which is given by its upper or lower triangle depending on the value of IsUpper. Array with elements [0..N-1, 0..N-1]. N - size of matrix A. IsUpper - storage format. Result: 1/LowerBound(cond(A)), if matrix A is positive definite, -1, if matrix A is not positive definite, and its condition number could not be found by this algorithm. NOTE: if k(A) is very large, then matrix is assumed degenerate, k(A)=INF, 0.0 is returned in such cases. *************************************************************************/
double alglib::hpdmatrixrcond( complex_2d_array a, ae_int_t n, bool isupper);
/************************************************************************* Estimate of the condition number of a matrix given by its LU decomposition (1-norm) The algorithm calculates a lower bound of the condition number. In this case, the algorithm does not return a lower bound of the condition number, but an inverse number (to avoid an overflow in case of a singular matrix). Input parameters: LUA - LU decomposition of a matrix in compact form. Output of the RMatrixLU subroutine. N - size of matrix A. Result: 1/LowerBound(cond(A)) NOTE: if k(A) is very large, then matrix is assumed degenerate, k(A)=INF, 0.0 is returned in such cases. *************************************************************************/
double alglib::rmatrixlurcond1(real_2d_array lua, ae_int_t n);
/************************************************************************* Estimate of the condition number of a matrix given by its LU decomposition (infinity norm). The algorithm calculates a lower bound of the condition number. In this case, the algorithm does not return a lower bound of the condition number, but an inverse number (to avoid an overflow in case of a singular matrix). Input parameters: LUA - LU decomposition of a matrix in compact form. Output of the RMatrixLU subroutine. N - size of matrix A. Result: 1/LowerBound(cond(A)) NOTE: if k(A) is very large, then matrix is assumed degenerate, k(A)=INF, 0.0 is returned in such cases. *************************************************************************/
double alglib::rmatrixlurcondinf(real_2d_array lua, ae_int_t n);
/************************************************************************* Estimate of a matrix condition number (1-norm) The algorithm calculates a lower bound of the condition number. In this case, the algorithm does not return a lower bound of the condition number, but an inverse number (to avoid an overflow in case of a singular matrix). Input parameters: A - matrix. Array whose indexes range within [0..N-1, 0..N-1]. N - size of matrix A. Result: 1/LowerBound(cond(A)) NOTE: if k(A) is very large, then matrix is assumed degenerate, k(A)=INF, 0.0 is returned in such cases. *************************************************************************/
double alglib::rmatrixrcond1(real_2d_array a, ae_int_t n);
/************************************************************************* Estimate of a matrix condition number (infinity-norm). The algorithm calculates a lower bound of the condition number. In this case, the algorithm does not return a lower bound of the condition number, but an inverse number (to avoid an overflow in case of a singular matrix). Input parameters: A - matrix. Array whose indexes range within [0..N-1, 0..N-1]. N - size of matrix A. Result: 1/LowerBound(cond(A)) NOTE: if k(A) is very large, then matrix is assumed degenerate, k(A)=INF, 0.0 is returned in such cases. *************************************************************************/
double alglib::rmatrixrcondinf(real_2d_array a, ae_int_t n);
/************************************************************************* Triangular matrix: estimate of a condition number (1-norm) The algorithm calculates a lower bound of the condition number. In this case, the algorithm does not return a lower bound of the condition number, but an inverse number (to avoid an overflow in case of a singular matrix). Input parameters: A - matrix. Array[0..N-1, 0..N-1]. N - size of A. IsUpper - True, if the matrix is upper triangular. IsUnit - True, if the matrix has a unit diagonal. Result: 1/LowerBound(cond(A)) NOTE: if k(A) is very large, then matrix is assumed degenerate, k(A)=INF, 0.0 is returned in such cases. *************************************************************************/
double alglib::rmatrixtrrcond1( real_2d_array a, ae_int_t n, bool isupper, bool isunit);
/************************************************************************* Triangular matrix: estimate of a matrix condition number (infinity-norm). The algorithm calculates a lower bound of the condition number. In this case, the algorithm does not return a lower bound of the condition number, but an inverse number (to avoid an overflow in case of a singular matrix). Input parameters: A - matrix. Array whose indexes range within [0..N-1, 0..N-1]. N - size of matrix A. IsUpper - True, if the matrix is upper triangular. IsUnit - True, if the matrix has a unit diagonal. Result: 1/LowerBound(cond(A)) NOTE: if k(A) is very large, then matrix is assumed degenerate, k(A)=INF, 0.0 is returned in such cases. *************************************************************************/
double alglib::rmatrixtrrcondinf( real_2d_array a, ae_int_t n, bool isupper, bool isunit);
/************************************************************************* Condition number estimate of a symmetric positive definite matrix given by Cholesky decomposition. The algorithm calculates a lower bound of the condition number. In this case, the algorithm does not return a lower bound of the condition number, but an inverse number (to avoid an overflow in case of a singular matrix). It should be noted that 1-norm and inf-norm condition numbers of symmetric matrices are equal, so the algorithm doesn't take into account the differences between these types of norms. Input parameters: CD - Cholesky decomposition of matrix A, output of SMatrixCholesky subroutine. N - size of matrix A. Result: 1/LowerBound(cond(A)) NOTE: if k(A) is very large, then matrix is assumed degenerate, k(A)=INF, 0.0 is returned in such cases. *************************************************************************/
double alglib::spdmatrixcholeskyrcond( real_2d_array a, ae_int_t n, bool isupper);
/************************************************************************* Condition number estimate of a symmetric positive definite matrix. The algorithm calculates a lower bound of the condition number. In this case, the algorithm does not return a lower bound of the condition number, but an inverse number (to avoid an overflow in case of a singular matrix). It should be noted that 1-norm and inf-norm of condition numbers of symmetric matrices are equal, so the algorithm doesn't take into account the differences between these types of norms. Input parameters: A - symmetric positive definite matrix which is given by its upper or lower triangle depending on the value of IsUpper. Array with elements [0..N-1, 0..N-1]. N - size of matrix A. IsUpper - storage format. Result: 1/LowerBound(cond(A)), if matrix A is positive definite, -1, if matrix A is not positive definite, and its condition number could not be found by this algorithm. NOTE: if k(A) is very large, then matrix is assumed degenerate, k(A)=INF, 0.0 is returned in such cases. *************************************************************************/
double alglib::spdmatrixrcond(real_2d_array a, ae_int_t n, bool isupper);
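
A minimal sketch (illustrative data only, not an official example) of calling spdmatrixrcond() on a small symmetric positive definite matrix:

#include "stdafx.h"
#include <stdio.h>
#include "linalg.h"

using namespace alglib;

int main(int argc, char **argv)
{
    // arbitrary symmetric positive definite matrix; upper triangle is used
    real_2d_array a = "[[5,1],[1,3]]";

    // returns 1/LowerBound(cond(A)), or -1 if A is not positive definite
    double rc = spdmatrixrcond(a, 2, true);
    printf("%.4f\n", rc);
    return 0;
}
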
rmatrixschur
/************************************************************************* Subroutine performing the Schur decomposition of a general matrix by using the QR algorithm with multiple shifts. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes one important improvement of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Multithreaded acceleration is NOT supported for this function. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. The source matrix A is represented as S'*A*S = T, where S is an orthogonal matrix (Schur vectors), T - upper quasi-triangular matrix (with blocks of sizes 1x1 and 2x2 on the main diagonal). Input parameters: A - matrix to be decomposed. Array whose indexes range within [0..N-1, 0..N-1]. N - size of A, N>=0. Output parameters: A - contains matrix T. Array whose indexes range within [0..N-1, 0..N-1]. S - contains Schur vectors. Array whose indexes range within [0..N-1, 0..N-1]. Note 1: The block structure of matrix T can be easily recognized: since all the elements below the blocks are zeros, the elements a[i+1,i] which are equal to 0 show the block border. Note 2: The algorithm performance depends on the value of the internal parameter NS of the InternalSchurDecomposition subroutine which defines the number of shifts in the QR algorithm (similarly to the block width in block-matrix algorithms in linear algebra). If you require maximum performance on your machine, it is recommended to adjust this parameter manually. Result: True, if the algorithm has converged and parameters A and S contain the result. False, if the algorithm has not converged. Algorithm implemented on the basis of the DHSEQR subroutine (LAPACK 3.0 library). *************************************************************************/
bool alglib::rmatrixschur(real_2d_array& a, ae_int_t n, real_2d_array& s);
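
A minimal usage sketch for rmatrixschur() (the 2x2 matrix is arbitrary illustrative data, not part of the manual's example set):

#include "stdafx.h"
#include <stdio.h>
#include "linalg.h"

using namespace alglib;

int main(int argc, char **argv)
{
    // after the call A is overwritten with the quasi-triangular matrix T
    // and S holds the orthogonal matrix of Schur vectors
    real_2d_array a = "[[0,1],[-2,-3]]";
    real_2d_array s;
    if( !rmatrixschur(a, 2, s) )
    {
        printf("Schur decomposition did not converge\n");
        return 1;
    }
    printf("T = %s\n", a.tostring(3).c_str());
    printf("S = %s\n", s.tostring(3).c_str());
    return 0;
}
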
sparsebuffers
sparsematrix
sparseadd
sparseconvertto
sparseconverttocrs
sparseconverttohash
sparseconverttosks
sparsecopy
sparsecopybuf
sparsecopytobuf
sparsecopytocrs
sparsecopytocrsbuf
sparsecopytohash
sparsecopytohashbuf
sparsecopytosks
sparsecopytosksbuf
sparsecreate
sparsecreatebuf
sparsecreatecrs
sparsecreatecrsbuf
sparsecreatesks
sparsecreatesksbuf
sparseenumerate
sparsefree
sparseget
sparsegetcompressedrow
sparsegetdiagonal
sparsegetlowercount
sparsegetmatrixtype
sparsegetncols
sparsegetnrows
sparsegetrow
sparsegetuppercount
sparseiscrs
sparseishash
sparseissks
sparsemm
sparsemm2
sparsemtm
sparsemtv
sparsemv
sparsemv2
sparseresizematrix
sparserewriteexisting
sparseset
sparsesmm
sparsesmv
sparseswap
sparsetransposesks
sparsetrmv
sparsetrsv
sparsevsmv
sparse_d_1 Basic operations with sparse matrices
sparse_d_crs Advanced topic: creation in the CRS format.
/************************************************************************* Temporary buffers for sparse matrix operations. You should pass an instance of this structure to factorization functions. It allows memory to be reused across repeated sparse factorizations. You do not have to call any initialization function - simply passing an instance to a factorization function is enough. *************************************************************************/
class sparsebuffers { };
/************************************************************************* Sparse matrix structure. You should use ALGLIB functions to work with sparse matrix. Never try to access its fields directly! NOTES ON THE SPARSE STORAGE FORMATS Sparse matrices can be stored using several formats: * Hash-Table representation * Compressed Row Storage (CRS) * Skyline matrix storage (SKS) Each of the formats has benefits and drawbacks: * Hash-table is good for dynamic operations (insertion of new elements), but does not support linear algebra operations * CRS is good for operations like matrix-vector or matrix-matrix products, but its initialization is less convenient - you have to tell row sizes at the initialization, and you have to fill matrix only row by row, from left to right. * SKS is a special format which is used to store triangular factors from Cholesky factorization. It does not support dynamic modification, and support for linear algebra operations is very limited. The table below outlines information about these three formats:

    OPERATIONS WITH MATRIX          HASH   CRS   SKS
    creation                         +      +     +
    SparseGet                        +      +     +
    SparseRewriteExisting            +      +     +
    SparseSet                        +
    SparseAdd                        +
    SparseGetRow                            +     +
    SparseGetCompressedRow                  +     +
    sparse-dense linear algebra             +     +
*************************************************************************/
class sparsematrix { };
/************************************************************************* This function adds value to S[i,j] - element of the sparse matrix. Matrix must be in a Hash-Table mode. In case S[i,j] already exists in the table, V is added to its value. In case S[i,j] is non-existent, it is inserted in the table. Table automatically grows when necessary. INPUT PARAMETERS S - sparse M*N matrix in Hash-Table representation. Exception will be thrown for CRS matrix. I - row index of the element to modify, 0<=I<M J - column index of the element to modify, 0<=J<N V - value to add, must be finite number OUTPUT PARAMETERS S - modified matrix NOTE 1: when S[i,j] is exactly zero after modification, it is deleted from the table. -- ALGLIB PROJECT -- Copyright 14.10.2011 by Bochkanov Sergey *************************************************************************/
void alglib::sparseadd(sparsematrix s, ae_int_t i, ae_int_t j, double v);

Examples:   [1]  

/************************************************************************* This function performs in-place conversion to desired sparse storage format. INPUT PARAMETERS S0 - sparse matrix in any format. Fmt - desired storage format of the output, as returned by SparseGetMatrixType() function: * 0 for hash-based storage * 1 for CRS * 2 for SKS OUTPUT PARAMETERS S0 - sparse matrix in requested format. NOTE: in-place conversion wastes a lot of memory which is used to store temporaries. If you perform a lot of repeated conversions, we recommend using out-of-place buffered conversion functions, like SparseCopyToBuf(), which can reuse already allocated memory. -- ALGLIB PROJECT -- Copyright 16.01.2014 by Bochkanov Sergey *************************************************************************/
void alglib::sparseconvertto(sparsematrix s0, ae_int_t fmt);
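
A minimal sketch (illustrative data only) of in-place conversion using the numeric format codes returned by SparseGetMatrixType():

#include "stdafx.h"
#include <stdio.h>
#include "linalg.h"

using namespace alglib;

int main(int argc, char **argv)
{
    // build a small matrix in Hash-Table format, then convert it in-place
    sparsematrix s;
    sparsecreate(2, 2, s);
    sparseset(s, 0, 0, 1.0);
    sparseset(s, 1, 1, 2.0);

    sparseconvertto(s, 1);                          // 1 = CRS
    printf("%d\n", int(sparsegetmatrixtype(s)));    // prints 1
    return 0;
}
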
/************************************************************************* This function converts matrix to CRS format. Some algorithms (linear algebra ones, for example) require matrices in CRS format. This function allows you to perform the conversion in-place. INPUT PARAMETERS S - sparse M*N matrix in any format OUTPUT PARAMETERS S - matrix in CRS format NOTE: this function has no effect when called with matrix which is already in CRS mode. NOTE: this function allocates temporary memory to store a copy of the matrix. If you perform a lot of repeated conversions, we recommend you to use SparseCopyToCRSBuf() function, which can reuse previously allocated memory. -- ALGLIB PROJECT -- Copyright 14.10.2011 by Bochkanov Sergey *************************************************************************/
void alglib::sparseconverttocrs(sparsematrix s);

Examples:   [1]  

/************************************************************************* This function performs in-place conversion to Hash table storage. INPUT PARAMETERS S - sparse matrix in CRS format. OUTPUT PARAMETERS S - sparse matrix in Hash table format. NOTE: this function has no effect when called with matrix which is already in Hash table mode. NOTE: in-place conversion involves allocation of temporary arrays. If you perform a lot of repeated in-place conversions, it may lead to memory fragmentation. Consider using out-of-place SparseCopyToHashBuf() function in this case. -- ALGLIB PROJECT -- Copyright 20.07.2012 by Bochkanov Sergey *************************************************************************/
void alglib::sparseconverttohash(sparsematrix s);
/************************************************************************* This function performs in-place conversion to SKS format. INPUT PARAMETERS S - sparse matrix in any format. OUTPUT PARAMETERS S - sparse matrix in SKS format. NOTE: this function has no effect when called with matrix which is already in SKS mode. NOTE: in-place conversion involves allocation of temporary arrays. If you perform a lot of repeated in-place conversions, it may lead to memory fragmentation. Consider using out-of-place SparseCopyToSKSBuf() function in this case. -- ALGLIB PROJECT -- Copyright 15.01.2014 by Bochkanov Sergey *************************************************************************/
void alglib::sparseconverttosks(sparsematrix s);
/************************************************************************* This function copies S0 to S1. This function completely deallocates memory owned by S1 before creating a copy of S0. If you want to reuse memory, use SparseCopyBuf. NOTE: this function does not verify its arguments, it just copies all fields of the structure. -- ALGLIB PROJECT -- Copyright 14.10.2011 by Bochkanov Sergey *************************************************************************/
void alglib::sparsecopy(sparsematrix s0, sparsematrix& s1);
/************************************************************************* This function copies S0 to S1. Memory already allocated in S1 is reused as much as possible. NOTE: this function does not verify its arguments, it just copies all fields of the structure. -- ALGLIB PROJECT -- Copyright 14.10.2011 by Bochkanov Sergey *************************************************************************/
void alglib::sparsecopybuf(sparsematrix s0, sparsematrix s1);
/************************************************************************* This function performs out-of-place conversion to desired sparse storage format. S0 is copied to S1 and converted on-the-fly. Memory allocated in S1 is reused to maximum extent possible. INPUT PARAMETERS S0 - sparse matrix in any format. Fmt - desired storage format of the output, as returned by SparseGetMatrixType() function: * 0 for hash-based storage * 1 for CRS * 2 for SKS OUTPUT PARAMETERS S1 - sparse matrix in requested format. -- ALGLIB PROJECT -- Copyright 16.01.2014 by Bochkanov Sergey *************************************************************************/
void alglib::sparsecopytobuf( sparsematrix s0, ae_int_t fmt, sparsematrix s1);
/************************************************************************* This function performs out-of-place conversion to CRS format. S0 is copied to S1 and converted on-the-fly. INPUT PARAMETERS S0 - sparse matrix in any format. OUTPUT PARAMETERS S1 - sparse matrix in CRS format. NOTE: if S0 is stored as CRS, it is just copied without conversion. NOTE: this function de-allocates memory occupied by S1 before starting CRS conversion. If you perform a lot of repeated CRS conversions, it may lead to memory fragmentation. In this case we recommend you to use SparseCopyToCRSBuf() function which re-uses memory in S1 as much as possible. -- ALGLIB PROJECT -- Copyright 20.07.2012 by Bochkanov Sergey *************************************************************************/
void alglib::sparsecopytocrs(sparsematrix s0, sparsematrix& s1);
/************************************************************************* This function performs out-of-place conversion to CRS format. S0 is copied to S1 and converted on-the-fly. Memory allocated in S1 is reused to maximum extent possible. INPUT PARAMETERS S0 - sparse matrix in any format. S1 - matrix which may contain some pre-allocated memory, or can be just uninitialized structure. OUTPUT PARAMETERS S1 - sparse matrix in CRS format. NOTE: if S0 is stored as CRS, it is just copied without conversion. -- ALGLIB PROJECT -- Copyright 20.07.2012 by Bochkanov Sergey *************************************************************************/
void alglib::sparsecopytocrsbuf(sparsematrix s0, sparsematrix s1);
/************************************************************************* This function performs out-of-place conversion to Hash table storage format. S0 is copied to S1 and converted on-the-fly. INPUT PARAMETERS S0 - sparse matrix in any format. OUTPUT PARAMETERS S1 - sparse matrix in Hash table format. NOTE: if S0 is stored as Hash-table, it is just copied without conversion. NOTE: this function de-allocates memory occupied by S1 before starting conversion. If you perform a lot of repeated conversions, it may lead to memory fragmentation. In this case we recommend you to use SparseCopyToHashBuf() function which re-uses memory in S1 as much as possible. -- ALGLIB PROJECT -- Copyright 20.07.2012 by Bochkanov Sergey *************************************************************************/
void alglib::sparsecopytohash(sparsematrix s0, sparsematrix& s1);
/************************************************************************* This function performs out-of-place conversion to Hash table storage format. S0 is copied to S1 and converted on-the-fly. Memory allocated in S1 is reused to maximum extent possible. INPUT PARAMETERS S0 - sparse matrix in any format. OUTPUT PARAMETERS S1 - sparse matrix in Hash table format. NOTE: if S0 is stored as Hash-table, it is just copied without conversion. -- ALGLIB PROJECT -- Copyright 20.07.2012 by Bochkanov Sergey *************************************************************************/
void alglib::sparsecopytohashbuf(sparsematrix s0, sparsematrix s1);
/************************************************************************* This function performs out-of-place conversion to SKS storage format. S0 is copied to S1 and converted on-the-fly. INPUT PARAMETERS S0 - sparse matrix in any format. OUTPUT PARAMETERS S1 - sparse matrix in SKS format. NOTE: if S0 is stored as SKS, it is just copied without conversion. NOTE: this function de-allocates memory occupied by S1 before starting conversion. If you perform a lot of repeated conversions, it may lead to memory fragmentation. In this case we recommend you to use SparseCopyToSKSBuf() function which re-uses memory in S1 as much as possible. -- ALGLIB PROJECT -- Copyright 20.07.2012 by Bochkanov Sergey *************************************************************************/
void alglib::sparsecopytosks(sparsematrix s0, sparsematrix& s1);
/************************************************************************* This function performs out-of-place conversion to SKS format. S0 is copied to S1 and converted on-the-fly. Memory allocated in S1 is reused to maximum extent possible. INPUT PARAMETERS S0 - sparse matrix in any format. OUTPUT PARAMETERS S1 - sparse matrix in SKS format. NOTE: if S0 is stored as SKS, it is just copied without conversion. -- ALGLIB PROJECT -- Copyright 20.07.2012 by Bochkanov Sergey *************************************************************************/
void alglib::sparsecopytosksbuf(sparsematrix s0, sparsematrix s1);
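
The buffered copy/conversion functions pay off when the same target structure is reused many times. The sketch below (illustrative data, hypothetical loop) rebuilds a Hash-Table matrix repeatedly and converts it into the same CRS target, so the storage allocated in S1 is reused instead of being reallocated on every iteration:

#include "stdafx.h"
#include <stdio.h>
#include "linalg.h"

using namespace alglib;

int main(int argc, char **argv)
{
    sparsematrix s0, s1;
    for(int iter=0; iter<100; iter++)
    {
        // rebuild s0 in Hash-Table format (contents are illustrative)
        sparsecreate(3, 3, s0);
        sparseset(s0, iter%3, iter%3, 1.0);

        // buffered out-of-place conversion: memory already allocated
        // in s1 is reused as much as possible
        sparsecopytocrsbuf(s0, s1);
    }
    printf("%s\n", sparseiscrs(s1) ? "true" : "false"); // true
    return 0;
}
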
/************************************************************************* This function creates sparse matrix in a Hash-Table format. This function creates Hash-Table matrix, which can be converted to CRS format after its initialization is over. Typical usage scenario for a sparse matrix is: 1. creation in a Hash-Table format 2. insertion of the matrix elements 3. conversion to the CRS representation 4. matrix is passed to some linear algebra algorithm Some information about different matrix formats can be found below, in the "NOTES" section. INPUT PARAMETERS M - number of rows in a matrix, M>=1 N - number of columns in a matrix, N>=1 K - K>=0, expected number of non-zero elements in a matrix. K can be inexact approximation, can be less than actual number of elements (table will grow when needed) or even zero. It is important to understand that although hash-table may grow automatically, it is better to provide good estimate of data size. OUTPUT PARAMETERS S - sparse M*N matrix in Hash-Table representation. All elements of the matrix are zero. NOTE 1 Hash-tables use memory inefficiently, and they have to keep some amount of the "spare memory" in order to have good performance. Hash table for matrix with K non-zero elements will need C*K*(8+2*sizeof(int)) bytes, where C is a small constant, about 1.5-2 in magnitude. CRS storage, on the other hand, is more memory-efficient, and needs just K*(8+sizeof(int))+M*sizeof(int) bytes, where M is a number of rows in a matrix. When you convert from the Hash-Table to CRS representation, all unneeded memory will be freed. NOTE 2 Comments of SparseMatrix structure outline information about different sparse storage formats. We recommend you to read them before starting to use ALGLIB sparse matrices. NOTE 3 This function completely overwrites S with new sparse matrix. Previously allocated storage is NOT reused. If you want to reuse already allocated memory, call SparseCreateBuf function. -- ALGLIB PROJECT -- Copyright 14.10.2011 by Bochkanov Sergey *************************************************************************/
void alglib::sparsecreate(ae_int_t m, ae_int_t n, sparsematrix& s); void alglib::sparsecreate( ae_int_t m, ae_int_t n, ae_int_t k, sparsematrix& s);

Examples:   [1]  

/************************************************************************* This version of SparseCreate function creates sparse matrix in Hash-Table format, reusing previously allocated storage as much as possible. Read comments for SparseCreate() for more information. INPUT PARAMETERS M - number of rows in a matrix, M>=1 N - number of columns in a matrix, N>=1 K - K>=0, expected number of non-zero elements in a matrix. K can be inexact approximation, can be less than actual number of elements (table will grow when needed) or even zero. It is important to understand that although hash-table may grow automatically, it is better to provide good estimate of data size. S - SparseMatrix structure which MAY contain some already allocated storage. OUTPUT PARAMETERS S - sparse M*N matrix in Hash-Table representation. All elements of the matrix are zero. Previously allocated storage is reused, if its size is compatible with expected number of non-zeros K. -- ALGLIB PROJECT -- Copyright 14.01.2014 by Bochkanov Sergey *************************************************************************/
void alglib::sparsecreatebuf(ae_int_t m, ae_int_t n, sparsematrix s); void alglib::sparsecreatebuf( ae_int_t m, ae_int_t n, ae_int_t k, sparsematrix s);
/************************************************************************* This function creates sparse matrix in a CRS format (expert function for situations when you are running out of memory). This function creates CRS matrix. Typical usage scenario for a CRS matrix is: 1. creation (you have to tell number of non-zero elements at each row at this moment) 2. insertion of the matrix elements (row by row, from left to right) 3. matrix is passed to some linear algebra algorithm This function is a memory-efficient alternative to SparseCreate(), but it is more complex because it requires you to know in advance how large your matrix is. Some information about different matrix formats can be found in comments on SparseMatrix structure. We recommend you to read them before starting to use ALGLIB sparse matrices. INPUT PARAMETERS M - number of rows in a matrix, M>=1 N - number of columns in a matrix, N>=1 NER - number of elements at each row, array[M], NER[I]>=0 OUTPUT PARAMETERS S - sparse M*N matrix in CRS representation. You have to fill ALL non-zero elements by calling SparseSet() BEFORE you try to use this matrix. NOTE: this function completely overwrites S with new sparse matrix. Previously allocated storage is NOT reused. If you want to reuse already allocated memory, call SparseCreateCRSBuf function. -- ALGLIB PROJECT -- Copyright 14.10.2011 by Bochkanov Sergey *************************************************************************/
void alglib::sparsecreatecrs( ae_int_t m, ae_int_t n, integer_1d_array ner, sparsematrix& s);

Examples:   [1]  

/************************************************************************* This function creates sparse matrix in a CRS format (expert function for situations when you are running out of memory). This version of CRS matrix creation function may reuse memory already allocated in S. This function creates CRS matrix. Typical usage scenario for a CRS matrix is: 1. creation (you have to tell number of non-zero elements at each row at this moment) 2. insertion of the matrix elements (row by row, from left to right) 3. matrix is passed to some linear algebra algorithm This function is a memory-efficient alternative to SparseCreate(), but it is more complex because it requires you to know in advance how large your matrix is. Some information about different matrix formats can be found in comments on SparseMatrix structure. We recommend you to read them before starting to use ALGLIB sparse matrices. INPUT PARAMETERS M - number of rows in a matrix, M>=1 N - number of columns in a matrix, N>=1 NER - number of elements at each row, array[M], NER[I]>=0 S - sparse matrix structure with possibly preallocated memory. OUTPUT PARAMETERS S - sparse M*N matrix in CRS representation. You have to fill ALL non-zero elements by calling SparseSet() BEFORE you try to use this matrix. -- ALGLIB PROJECT -- Copyright 14.10.2011 by Bochkanov Sergey *************************************************************************/
void alglib::sparsecreatecrsbuf( ae_int_t m, ae_int_t n, integer_1d_array ner, sparsematrix s);
/************************************************************************* This function creates sparse matrix in a SKS format (skyline storage format). In most cases you do not need this function - CRS format better suits most use cases. INPUT PARAMETERS M, N - number of rows(M) and columns (N) in a matrix: * M=N (as for now, ALGLIB supports only square SKS) * N>=1 * M>=1 D - "bottom" bandwidths, array[M], D[I]>=0. I-th element stores number of non-zeros at I-th row, below the diagonal (diagonal itself is not included) U - "top" bandwidths, array[N], U[I]>=0. I-th element stores number of non-zeros at I-th row, above the diagonal (diagonal itself is not included) OUTPUT PARAMETERS S - sparse M*N matrix in SKS representation. All elements are filled by zeros. You may use SparseRewriteExisting() to change their values. NOTE: this function completely overwrites S with new sparse matrix. Previously allocated storage is NOT reused. If you want to reuse already allocated memory, call SparseCreateSKSBuf function. -- ALGLIB PROJECT -- Copyright 13.01.2014 by Bochkanov Sergey *************************************************************************/
void alglib::sparsecreatesks( ae_int_t m, ae_int_t n, integer_1d_array d, integer_1d_array u, sparsematrix& s);
/************************************************************************* This is "buffered" version of SparseCreateSKS() which reuses memory previously allocated in S (of course, memory is reallocated if needed). This function creates sparse matrix in a SKS format (skyline storage format). In most cases you do not need this function - CRS format better suits most use cases. INPUT PARAMETERS M, N - number of rows(M) and columns (N) in a matrix: * M=N (as for now, ALGLIB supports only square SKS) * N>=1 * M>=1 D - "bottom" bandwidths, array[M], 0<=D[I]<=I. I-th element stores number of non-zeros at I-th row, below the diagonal (diagonal itself is not included) U - "top" bandwidths, array[N], 0<=U[I]<=I. I-th element stores number of non-zeros at I-th row, above the diagonal (diagonal itself is not included) OUTPUT PARAMETERS S - sparse M*N matrix in SKS representation. All elements are filled by zeros. You may use SparseRewriteExisting() to change their values. -- ALGLIB PROJECT -- Copyright 13.01.2014 by Bochkanov Sergey *************************************************************************/
void alglib::sparsecreatesksbuf( ae_int_t m, ae_int_t n, integer_1d_array d, integer_1d_array u, sparsematrix s);
/************************************************************************* This function is used to enumerate all elements of the sparse matrix. Before first call user initializes T0 and T1 counters by zero. These counters are used to remember current position in a matrix; after each call they are updated by the function. Subsequent calls to this function return non-zero elements of the sparse matrix, one by one. If you enumerate CRS matrix, matrix is traversed from left to right, from top to bottom. In case you enumerate matrix stored as Hash table, elements are returned in random order. EXAMPLE > T0=0 > T1=0 > while SparseEnumerate(S,T0,T1,I,J,V) do > ....do something with I,J,V INPUT PARAMETERS S - sparse M*N matrix in Hash-Table or CRS representation. T0 - internal counter T1 - internal counter OUTPUT PARAMETERS T0 - new value of the internal counter T1 - new value of the internal counter I - row index of non-zero element, 0<=I<M. J - column index of non-zero element, 0<=J<N V - value of the T-th element RESULT True in case of success (next non-zero element was retrieved) False in case all non-zero elements were enumerated NOTE: you may call SparseRewriteExisting() during enumeration, but it is THE ONLY matrix modification function you can call!!! Other matrix modification functions should not be called during enumeration! -- ALGLIB PROJECT -- Copyright 14.03.2012 by Bochkanov Sergey *************************************************************************/
bool alglib::sparseenumerate( sparsematrix s, ae_int_t& t0, ae_int_t& t1, ae_int_t& i, ae_int_t& j, double& v);
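
The pseudocode in the comment above translates directly into C++. A minimal sketch (illustrative matrix contents) of the enumeration loop:

#include "stdafx.h"
#include <stdio.h>
#include "linalg.h"

using namespace alglib;

int main(int argc, char **argv)
{
    sparsematrix s;
    sparsecreate(2, 2, s);
    sparseset(s, 0, 0, 2.0);
    sparseset(s, 1, 1, 5.0);
    sparseconverttocrs(s);   // CRS enumeration is ordered; Hash-Table is not

    // T0/T1 are opaque position counters and must be initialized to zero
    ae_int_t t0 = 0, t1 = 0, i, j;
    double v;
    while( sparseenumerate(s, t0, t1, i, j, v) )
        printf("S[%d,%d]=%.1f\n", int(i), int(j), v);
    return 0;
}
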
/************************************************************************* The function frees all memory occupied by sparse matrix. Sparse matrix structure becomes unusable after this call. OUTPUT PARAMETERS S - sparse matrix to delete -- ALGLIB PROJECT -- Copyright 24.07.2012 by Bochkanov Sergey *************************************************************************/
void alglib::sparsefree(sparsematrix& s);
/************************************************************************* This function returns S[i,j] - element of the sparse matrix. Matrix can be in any mode (Hash-Table, CRS, SKS), but this function is less efficient for CRS matrices. Hash-Table and SKS matrices can find element in O(1) time, while CRS matrices need O(log(RS)) time, where RS is the number of non-zero elements in a row. INPUT PARAMETERS S - sparse M*N matrix in any representation (Hash-Table, CRS or SKS). I - row index of the element to read, 0<=I<M J - column index of the element to read, 0<=J<N RESULT value of S[I,J] or zero (in case no element with such index is found) -- ALGLIB PROJECT -- Copyright 14.10.2011 by Bochkanov Sergey *************************************************************************/
double alglib::sparseget(sparsematrix s, ae_int_t i, ae_int_t j);

Examples:   [1]  [2]  

/************************************************************************* This function returns I-th row of the sparse matrix IN COMPRESSED FORMAT - only non-zero elements are returned (with their indexes). Matrix must be stored in CRS or SKS format. INPUT PARAMETERS: S - sparse M*N matrix in CRS format I - row index, 0<=I<M ColIdx - output buffer for column indexes, can be preallocated. In case buffer size is too small to store I-th row, it is automatically reallocated. Vals - output buffer for values, can be preallocated. In case buffer size is too small to store I-th row, it is automatically reallocated. OUTPUT PARAMETERS: ColIdx - column indexes of non-zero elements, sorted in ascending order. Symbolically non-zero elements are counted (i.e. if you allocated place for element, but it has zero numerical value - it is counted). Vals - values. Vals[K] stores value of matrix element with indexes (I,ColIdx[K]). Symbolically non-zero elements are counted (i.e. if you allocated place for element, but it has zero numerical value - it is counted). NZCnt - number of symbolically non-zero elements per row. NOTE: when incorrect I (outside of [0,M-1]) or matrix (non CRS/SKS) is passed, this function throws exception. NOTE: this function may allocate additional, unnecessary space for ColIdx and Vals arrays. It is dictated by performance reasons - on SKS matrices it is faster to allocate space at the beginning with some "extra"-space, than to perform two passes over the matrix - first time to calculate exact space required for data, second time - to store data itself. -- ALGLIB PROJECT -- Copyright 10.12.2014 by Bochkanov Sergey *************************************************************************/
void alglib::sparsegetcompressedrow( sparsematrix s, ae_int_t i, integer_1d_array& colidx, real_1d_array& vals, ae_int_t& nzcnt);
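
A minimal sketch (illustrative data) of reading one row in compressed form; only the stored column indexes and values are returned:

#include "stdafx.h"
#include <stdio.h>
#include "linalg.h"

using namespace alglib;

int main(int argc, char **argv)
{
    // 2x3 CRS matrix with two non-zeros in row 0 and one in row 1
    sparsematrix s;
    integer_1d_array ner = "[2,1]";
    sparsecreatecrs(2, 3, ner, s);
    sparseset(s, 0, 0, 1.0);
    sparseset(s, 0, 2, 3.0);
    sparseset(s, 1, 1, 2.0);

    // retrieve row 0 in compressed form: column indexes plus values
    integer_1d_array colidx;
    real_1d_array vals;
    ae_int_t nzcnt;
    sparsegetcompressedrow(s, 0, colidx, vals, nzcnt);
    printf("%d non-zeros\n", int(nzcnt));    // 2 non-zeros
    return 0;
}
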
/************************************************************************* This function returns I-th diagonal element of the sparse matrix. Matrix can be in any mode (Hash-Table or CRS storage), but this function is most efficient for CRS matrices - it requires less than 50 CPU cycles to extract diagonal element. For Hash-Table matrices we still have O(1) query time, but function is many times slower. INPUT PARAMETERS S - sparse M*N matrix in Hash-Table or CRS representation. I - index of the diagonal element to read, 0<=I<min(M,N) RESULT value of S[I,I] or zero (in case no element with such index is found) -- ALGLIB PROJECT -- Copyright 14.10.2011 by Bochkanov Sergey *************************************************************************/
double alglib::sparsegetdiagonal(sparsematrix s, ae_int_t i);
/************************************************************************* The function returns number of strictly lower triangular non-zero elements in the matrix. It counts SYMBOLICALLY non-zero elements, i.e. entries in the sparse matrix data structure. If some element has zero numerical value, it is still counted. This function has different cost for different types of matrices: * for hash-based matrices it involves complete pass over entire hash-table with O(NNZ) cost, where NNZ is number of non-zero elements * for CRS and SKS matrix types cost of counting is O(N) (N - matrix size). RESULT: number of non-zero elements strictly below main diagonal -- ALGLIB PROJECT -- Copyright 12.02.2014 by Bochkanov Sergey *************************************************************************/
ae_int_t alglib::sparsegetlowercount(sparsematrix s);
/************************************************************************* This function returns type of the matrix storage format. INPUT PARAMETERS: S - sparse matrix. RESULT: sparse storage format used by matrix: 0 - Hash-table 1 - CRS (compressed row storage) 2 - SKS (skyline) NOTE: future versions of ALGLIB may include additional sparse storage formats. -- ALGLIB PROJECT -- Copyright 20.07.2012 by Bochkanov Sergey *************************************************************************/
ae_int_t alglib::sparsegetmatrixtype(sparsematrix s);
/************************************************************************* The function returns number of columns of a sparse matrix. RESULT: number of columns of a sparse matrix. -- ALGLIB PROJECT -- Copyright 23.08.2012 by Bochkanov Sergey *************************************************************************/
ae_int_t alglib::sparsegetncols(sparsematrix s);
/************************************************************************* The function returns number of rows of a sparse matrix. RESULT: number of rows of a sparse matrix. -- ALGLIB PROJECT -- Copyright 23.08.2012 by Bochkanov Sergey *************************************************************************/
ae_int_t alglib::sparsegetnrows(sparsematrix s);
/************************************************************************* This function returns I-th row of the sparse matrix. Matrix must be stored in CRS or SKS format. INPUT PARAMETERS: S - sparse M*N matrix in CRS format I - row index, 0<=I<M IRow - output buffer, can be preallocated. In case buffer size is too small to store I-th row, it is automatically reallocated. OUTPUT PARAMETERS: IRow - array[N], I-th row. NOTE: this function has O(N) running time, where N is a column count. It allocates and fills N-element array, even though most of its elements are zero. NOTE: If you need O(non-zeros-per-row) time and memory requirements, use SparseGetCompressedRow() function. It returns data in compressed format. NOTE: when incorrect I (outside of [0,M-1]) or matrix (non CRS/SKS) is passed, this function throws exception. -- ALGLIB PROJECT -- Copyright 10.12.2014 by Bochkanov Sergey *************************************************************************/
void alglib::sparsegetrow( sparsematrix s, ae_int_t i, real_1d_array& irow);
/************************************************************************* The function returns number of strictly upper triangular non-zero elements in the matrix. It counts SYMBOLICALLY non-zero elements, i.e. entries in the sparse matrix data structure. If some element has zero numerical value, it is still counted. This function has different cost for different types of matrices: * for hash-based matrices it involves complete pass over entire hash-table with O(NNZ) cost, where NNZ is number of non-zero elements * for CRS and SKS matrix types cost of counting is O(N) (N - matrix size). RESULT: number of non-zero elements strictly above main diagonal -- ALGLIB PROJECT -- Copyright 12.02.2014 by Bochkanov Sergey *************************************************************************/
ae_int_t alglib::sparsegetuppercount(sparsematrix s);
/************************************************************************* This function checks matrix storage format and returns True when matrix is stored using CRS representation. INPUT PARAMETERS: S - sparse matrix. RESULT: True if matrix type is CRS False if matrix type is not CRS -- ALGLIB PROJECT -- Copyright 20.07.2012 by Bochkanov Sergey *************************************************************************/
bool alglib::sparseiscrs(sparsematrix s);
/************************************************************************* This function checks matrix storage format and returns True when matrix is stored using Hash table representation. INPUT PARAMETERS: S - sparse matrix. RESULT: True if matrix type is Hash table False if matrix type is not Hash table -- ALGLIB PROJECT -- Copyright 20.07.2012 by Bochkanov Sergey *************************************************************************/
bool alglib::sparseishash(sparsematrix s);
/************************************************************************* This function checks matrix storage format and returns True when matrix is stored using SKS representation. INPUT PARAMETERS: S - sparse matrix. RESULT: True if matrix type is SKS False if matrix type is not SKS -- ALGLIB PROJECT -- Copyright 20.07.2012 by Bochkanov Sergey *************************************************************************/
bool alglib::sparseissks(sparsematrix s);
/************************************************************************* This function calculates matrix-matrix product S*A. Matrix S must be stored in CRS or SKS format (exception will be thrown otherwise). INPUT PARAMETERS S - sparse M*N matrix in CRS or SKS format. A - array[N][K], input dense matrix. For performance reasons we make only quick checks - we check that array size is at least N, but we do not check for NAN's or INF's. K - number of columns of matrix (A). B - output buffer, possibly preallocated. In case buffer size is too small to store result, this buffer is automatically resized. OUTPUT PARAMETERS B - array[M][K], S*A NOTE: this function throws exception when called for non-CRS/SKS matrix. You must convert your matrix with SparseConvertToCRS/SKS() before using this function. -- ALGLIB PROJECT -- Copyright 14.10.2011 by Bochkanov Sergey *************************************************************************/
void alglib::sparsemm( sparsematrix s, real_2d_array a, ae_int_t k, real_2d_array& b);
/************************************************************************* This function simultaneously calculates two matrix-matrix products: S*A and S^T*A. S must be square (non-rectangular) matrix stored in CRS or SKS format (exception will be thrown otherwise). INPUT PARAMETERS S - sparse N*N matrix in CRS or SKS format. A - array[N][K], input dense matrix. For performance reasons we make only quick checks - we check that array size is at least N, but we do not check for NAN's or INF's. K - number of columns of matrix (A). B0 - output buffer, possibly preallocated. In case buffer size is too small to store result, this buffer is automatically resized. B1 - output buffer, possibly preallocated. In case buffer size is too small to store result, this buffer is automatically resized. OUTPUT PARAMETERS B0 - array[N][K], S*A B1 - array[N][K], S^T*A NOTE: this function throws exception when called for non-CRS/SKS matrix. You must convert your matrix with SparseConvertToCRS/SKS() before using this function. -- ALGLIB PROJECT -- Copyright 14.10.2011 by Bochkanov Sergey *************************************************************************/
void alglib::sparsemm2( sparsematrix s, real_2d_array a, ae_int_t k, real_2d_array& b0, real_2d_array& b1);
/************************************************************************* This function calculates matrix-matrix product S^T*A. Matrix S must be stored in CRS or SKS format (exception will be thrown otherwise). INPUT PARAMETERS S - sparse M*N matrix in CRS or SKS format. A - array[M][K], input dense matrix. For performance reasons we make only quick checks - we check that array size is at least M, but we do not check for NAN's or INF's. K - number of columns of matrix (A). B - output buffer, possibly preallocated. In case buffer size is too small to store result, this buffer is automatically resized. OUTPUT PARAMETERS B - array[N][K], S^T*A NOTE: this function throws exception when called for non-CRS/SKS matrix. You must convert your matrix with SparseConvertToCRS/SKS() before using this function. -- ALGLIB PROJECT -- Copyright 14.10.2011 by Bochkanov Sergey *************************************************************************/
void alglib::sparsemtm( sparsematrix s, real_2d_array a, ae_int_t k, real_2d_array& b);
/************************************************************************* This function calculates matrix-vector product S^T*x. Matrix S must be stored in CRS or SKS format (exception will be thrown otherwise). INPUT PARAMETERS S - sparse M*N matrix in CRS or SKS format. X - array[M], input vector. For performance reasons we make only quick checks - we check that array size is at least M, but we do not check for NAN's or INF's. Y - output buffer, possibly preallocated. In case buffer size is too small to store result, this buffer is automatically resized. OUTPUT PARAMETERS Y - array[N], S^T*x NOTE: this function throws exception when called for non-CRS/SKS matrix. You must convert your matrix with SparseConvertToCRS/SKS() before using this function. -- ALGLIB PROJECT -- Copyright 14.10.2011 by Bochkanov Sergey *************************************************************************/
void alglib::sparsemtv(sparsematrix s, real_1d_array x, real_1d_array& y);
/************************************************************************* This function calculates matrix-vector product S*x. Matrix S must be stored in CRS or SKS format (exception will be thrown otherwise). INPUT PARAMETERS S - sparse M*N matrix in CRS or SKS format. X - array[N], input vector. For performance reasons we make only quick checks - we check that array size is at least N, but we do not check for NAN's or INF's. Y - output buffer, possibly preallocated. In case buffer size is too small to store result, this buffer is automatically resized. OUTPUT PARAMETERS Y - array[M], S*x NOTE: this function throws exception when called for non-CRS/SKS matrix. You must convert your matrix with SparseConvertToCRS/SKS() before using this function. -- ALGLIB PROJECT -- Copyright 14.10.2011 by Bochkanov Sergey *************************************************************************/
void alglib::sparsemv(sparsematrix s, real_1d_array x, real_1d_array& y);

Examples:   [1]  [2]  

/************************************************************************* This function simultaneously calculates two matrix-vector products: S*x and S^T*x. S must be square (non-rectangular) matrix stored in CRS or SKS format (exception will be thrown otherwise). INPUT PARAMETERS S - sparse N*N matrix in CRS or SKS format. X - array[N], input vector. For performance reasons we make only quick checks - we check that array size is at least N, but we do not check for NAN's or INF's. Y0 - output buffer, possibly preallocated. In case buffer size is too small to store result, this buffer is automatically resized. Y1 - output buffer, possibly preallocated. In case buffer size is too small to store result, this buffer is automatically resized. OUTPUT PARAMETERS Y0 - array[N], S*x Y1 - array[N], S^T*x NOTE: this function throws exception when called for non-CRS/SKS matrix. You must convert your matrix with SparseConvertToCRS/SKS() before using this function. -- ALGLIB PROJECT -- Copyright 14.10.2011 by Bochkanov Sergey *************************************************************************/
void alglib::sparsemv2( sparsematrix s, real_1d_array x, real_1d_array& y0, real_1d_array& y1);
/************************************************************************* This procedure resizes Hash-Table matrix. It can be called when you have deleted too many elements from the matrix, and you want to free unneeded memory. -- ALGLIB PROJECT -- Copyright 14.10.2011 by Bochkanov Sergey *************************************************************************/
void alglib::sparseresizematrix(sparsematrix s);
/************************************************************************* This function rewrites existing (non-zero) element. It returns True if element exists or False, when it is called for non-existing (zero) element. This function works with any kind of the matrix. The purpose of this function is to provide convenient thread-safe way to modify sparse matrix. Such modification (already existing element is rewritten) is guaranteed to be thread-safe without any synchronization, as long as different threads modify different elements. INPUT PARAMETERS S - sparse M*N matrix in any kind of representation (Hash, SKS, CRS). I - row index of non-zero element to modify, 0<=I<M J - column index of non-zero element to modify, 0<=J<N V - value to rewrite, must be finite number OUTPUT PARAMETERS S - modified matrix RESULT True in case when element exists False in case when element doesn't exist or it is zero -- ALGLIB PROJECT -- Copyright 14.03.2012 by Bochkanov Sergey *************************************************************************/
bool alglib::sparserewriteexisting( sparsematrix s, ae_int_t i, ae_int_t j, double v);
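
A minimal sketch (illustrative data) showing the two possible outcomes of sparserewriteexisting(): rewriting a stored element succeeds even after conversion to CRS, while an element that was never stored cannot be rewritten:

#include "stdafx.h"
#include <stdio.h>
#include "linalg.h"

using namespace alglib;

int main(int argc, char **argv)
{
    sparsematrix s;
    sparsecreate(2, 2, s);
    sparseset(s, 0, 0, 2.0);
    sparseconverttocrs(s);

    bool ok1 = sparserewriteexisting(s, 0, 0, 7.0);   // element exists
    bool ok2 = sparserewriteexisting(s, 1, 1, 7.0);   // element was never stored
    printf("%s %s\n", ok1 ? "true" : "false", ok2 ? "true" : "false"); // true false
    return 0;
}
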
/************************************************************************* This function modifies S[i,j] - element of the sparse matrix. For Hash-based storage format: * this function can be called at any moment - during matrix initialization or later * new value can be zero or non-zero. In case new value of S[i,j] is zero, this element is deleted from the table. * this function has no effect when called with zero V for non-existent element. For CRS-based storage format: * this function can be called ONLY DURING MATRIX INITIALIZATION * new value MUST be non-zero. Exception will be thrown for zero V. * elements must be initialized in correct order - from top row to bottom, within row - from left to right. For SKS storage: NOT SUPPORTED! Use SparseRewriteExisting() to work with SKS matrices. INPUT PARAMETERS S - sparse M*N matrix in Hash-Table or CRS representation. I - row index of the element to modify, 0<=I<M J - column index of the element to modify, 0<=J<N V - value to set, must be finite number, can be zero OUTPUT PARAMETERS S - modified matrix -- ALGLIB PROJECT -- Copyright 14.10.2011 by Bochkanov Sergey *************************************************************************/
void alglib::sparseset(sparsematrix s, ae_int_t i, ae_int_t j, double v);

Examples:   [1]  [2]  

/************************************************************************* This function calculates matrix-matrix product S*A, when S is symmetric matrix. Matrix S must be stored in CRS or SKS format (exception will be thrown otherwise). INPUT PARAMETERS S - sparse M*M matrix in CRS or SKS format. IsUpper - whether upper or lower triangle of S is given: * if upper triangle is given, only S[i,j] for j>=i are used, and lower triangle is ignored (it can be empty - these elements are not referenced at all). * if lower triangle is given, only S[i,j] for j<=i are used, and upper triangle is ignored. A - array[N][K], input dense matrix. For performance reasons we make only quick checks - we check that array size is at least N, but we do not check for NAN's or INF's. K - number of columns of matrix (A). B - output buffer, possibly preallocated. In case buffer size is too small to store result, this buffer is automatically resized. OUTPUT PARAMETERS B - array[M][K], S*A NOTE: this function throws exception when called for non-CRS/SKS matrix. You must convert your matrix with SparseConvertToCRS/SKS() before using this function. -- ALGLIB PROJECT -- Copyright 14.10.2011 by Bochkanov Sergey *************************************************************************/
void alglib::sparsesmm( sparsematrix s, bool isupper, real_2d_array a, ae_int_t k, real_2d_array& b);
/************************************************************************* This function calculates matrix-vector product S*x, when S is symmetric matrix. Matrix S must be stored in CRS or SKS format (exception will be thrown otherwise). INPUT PARAMETERS S - sparse M*M matrix in CRS or SKS format. IsUpper - whether upper or lower triangle of S is given: * if upper triangle is given, only S[i,j] for j>=i are used, and lower triangle is ignored (it can be empty - these elements are not referenced at all). * if lower triangle is given, only S[i,j] for j<=i are used, and upper triangle is ignored. X - array[N], input vector. For performance reasons we make only quick checks - we check that array size is at least N, but we do not check for NAN's or INF's. Y - output buffer, possibly preallocated. In case buffer size is too small to store result, this buffer is automatically resized. OUTPUT PARAMETERS Y - array[M], S*x NOTE: this function throws exception when called for non-CRS/SKS matrix. You must convert your matrix with SparseConvertToCRS/SKS() before using this function. -- ALGLIB PROJECT -- Copyright 14.10.2011 by Bochkanov Sergey *************************************************************************/
void alglib::sparsesmv( sparsematrix s, bool isupper, real_1d_array x, real_1d_array& y);
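
A minimal sketch (illustrative data) of a symmetric matrix-vector product where only the upper triangle is stored; the lower triangle is never referenced by sparsesmv():

#include "stdafx.h"
#include <stdio.h>
#include "linalg.h"

using namespace alglib;

int main(int argc, char **argv)
{
    // symmetric matrix [[2,1],[1,3]] given by its upper triangle only
    sparsematrix s;
    sparsecreate(2, 2, s);
    sparseset(s, 0, 0, 2.0);
    sparseset(s, 0, 1, 1.0);
    sparseset(s, 1, 1, 3.0);
    sparseconverttocrs(s);

    real_1d_array x = "[1,1]";
    real_1d_array y = "[]";
    sparsesmv(s, true, x, y);
    printf("%s\n", y.tostring(2).c_str());   // [3.00,4.00]
    return 0;
}
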
/************************************************************************* This function efficiently swaps contents of S0 and S1. -- ALGLIB PROJECT -- Copyright 16.01.2014 by Bochkanov Sergey *************************************************************************/
void alglib::sparseswap(sparsematrix s0, sparsematrix s1);
/************************************************************************* This function performs efficient in-place transpose of SKS matrix. No additional memory is allocated during transposition. This function supports only skyline storage format (SKS). INPUT PARAMETERS S - sparse matrix in SKS format. OUTPUT PARAMETERS S - sparse matrix, transposed. -- ALGLIB PROJECT -- Copyright 16.01.2014 by Bochkanov Sergey *************************************************************************/
void alglib::sparsetransposesks(sparsematrix s);
/************************************************************************* This function calculates matrix-vector product op(S)*x, where x is a vector, S is a triangular matrix, and op(S) is either transposition or no operation. Matrix S must be stored in CRS or SKS format (exception will be thrown otherwise). INPUT PARAMETERS S - sparse square matrix in CRS or SKS format. IsUpper - whether upper or lower triangle of S is used: * if upper triangle is given, only S[i,j] for j>=i are used, and lower triangle is ignored (it can be empty - these elements are not referenced at all). * if lower triangle is given, only S[i,j] for j<=i are used, and upper triangle is ignored. IsUnit - unit or non-unit diagonal: * if True, diagonal elements of triangular matrix are considered equal to 1.0. Actual elements stored in S are not referenced at all. * if False, diagonal stored in S is used OpType - operation type: * if 0, S*x is calculated * if 1, (S^T)*x is calculated (transposition) X - array[N] which stores input vector. For performance reasons we make only quick checks - we check that array size is at least N, but we do not check for NAN's or INF's. Y - possibly preallocated output buffer. Automatically resized if its size is too small. OUTPUT PARAMETERS Y - array[N], op(S)*x NOTE: this function throws exception when called for non-CRS/SKS matrix. You must convert your matrix with SparseConvertToCRS/SKS() before using this function. -- ALGLIB PROJECT -- Copyright 20.01.2014 by Bochkanov Sergey *************************************************************************/
void alglib::sparsetrmv( sparsematrix s, bool isupper, bool isunit, ae_int_t optype, real_1d_array& x, real_1d_array& y);
/************************************************************************* This function solves linear system op(S)*y=x where x is a vector, S is a triangular matrix, and op(S) is either transposition or no operation. Matrix S must be stored in CRS or SKS format (exception will be thrown otherwise). INPUT PARAMETERS S - sparse square matrix in CRS or SKS format. IsUpper - whether upper or lower triangle of S is used: * if upper triangle is given, only S[i,j] for j>=i are used, and lower triangle is ignored (it can be empty - these elements are not referenced at all). * if lower triangle is given, only S[i,j] for j<=i are used, and upper triangle is ignored. IsUnit - unit or non-unit diagonal: * if True, diagonal elements of triangular matrix are considered equal to 1.0. Actual elements stored in S are not referenced at all. * if False, diagonal stored in S is used. It is your responsibility to make sure that diagonal is non-zero. OpType - operation type: * if 0, the system S*y=x is solved * if 1, the system (S^T)*y=x is solved (transposition) X - array[N] which stores input vector. For performance reasons we make only quick checks - we check that array size is at least N, but we do not check for NAN's or INF's. OUTPUT PARAMETERS X - array[N], inv(op(S))*x NOTE: this function throws exception when called for non-CRS/SKS matrix. You must convert your matrix with SparseConvertToCRS/SKS() before using this function. NOTE: no assertion or tests are done during algorithm operation. It is your responsibility to provide invertible matrix to algorithm. -- ALGLIB PROJECT -- Copyright 20.01.2014 by Bochkanov Sergey *************************************************************************/
void alglib::sparsetrsv( sparsematrix s, bool isupper, bool isunit, ae_int_t optype, real_1d_array& x);
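
A minimal sketch (illustrative data) of an in-place triangular solve; X is overwritten with the solution of S*y=x:

#include "stdafx.h"
#include <stdio.h>
#include "linalg.h"

using namespace alglib;

int main(int argc, char **argv)
{
    // lower triangular S = [[2,0],[1,4]] stored as a sparse CRS matrix
    sparsematrix s;
    sparsecreate(2, 2, s);
    sparseset(s, 0, 0, 2.0);
    sparseset(s, 1, 0, 1.0);
    sparseset(s, 1, 1, 4.0);
    sparseconverttocrs(s);

    // solve S*y = [2,6]: x is overwritten with the solution [1,1.25]
    real_1d_array x = "[2,6]";
    sparsetrsv(s, false, false, 0, x);
    printf("%s\n", x.tostring(2).c_str());
    return 0;
}
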
/************************************************************************* This function calculates vector-matrix-vector product x'*S*x, where S is symmetric matrix. Matrix S must be stored in CRS or SKS format (exception will be thrown otherwise). INPUT PARAMETERS S - sparse M*M matrix in CRS or SKS format. IsUpper - whether upper or lower triangle of S is given: * if upper triangle is given, only S[i,j] for j>=i are used, and lower triangle is ignored (it can be empty - these elements are not referenced at all). * if lower triangle is given, only S[i,j] for j<=i are used, and upper triangle is ignored. X - array[N], input vector. For performance reasons we make only quick checks - we check that array size is at least N, but we do not check for NAN's or INF's. RESULT x'*S*x NOTE: this function throws exception when called for non-CRS/SKS matrix. You must convert your matrix with SparseConvertToCRS/SKS() before using this function. -- ALGLIB PROJECT -- Copyright 27.01.2014 by Bochkanov Sergey *************************************************************************/
double alglib::sparsevsmv(sparsematrix s, bool isupper, real_1d_array x);
#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "linalg.h"

using namespace alglib;


int main(int argc, char **argv)
{
    //
    // This example demonstrates creation/initialization of the sparse matrix
    // and matrix-vector multiplication.
    //
    // First, we have to create matrix and initialize it. Matrix is initially created
    // in the Hash-Table format, which allows convenient initialization. We can modify
    // Hash-Table matrix with sparseset() and sparseadd() functions.
    //
    // NOTE: Unlike CRS format, Hash-Table representation allows you to initialize
    // elements in arbitrary order. You may see that we initialize a[0][0] first,
    // then move to the second row, and then move back to the first row.
    //
    sparsematrix s;
    sparsecreate(2, 2, s);
    sparseset(s, 0, 0, 2.0);
    sparseset(s, 1, 1, 1.0);
    sparseset(s, 0, 1, 1.0);

    sparseadd(s, 1, 1, 4.0);

    //
    // Now S is equal to
    //   [ 2 1 ]
    //   [   5 ]
    // Let's check it by reading matrix contents with sparseget().
    // You may see that with sparseget() you may read both non-zero
    // and zero elements.
    //
    double v;
    v = sparseget(s, 0, 0);
    printf("%.2f\n", double(v)); // EXPECTED: 2.0000
    v = sparseget(s, 0, 1);
    printf("%.2f\n", double(v)); // EXPECTED: 1.0000
    v = sparseget(s, 1, 0);
    printf("%.2f\n", double(v)); // EXPECTED: 0.0000
    v = sparseget(s, 1, 1);
    printf("%.2f\n", double(v)); // EXPECTED: 5.0000

    //
    // After successful creation we can use our matrix for linear operations.
    //
    // However, there is one more thing we MUST do before using S in linear
    // operations: we have to convert it from HashTable representation (used for
    // initialization and dynamic operations) to CRS format with sparseconverttocrs()
    // call. If you omit this call, ALGLIB will generate exception on the first
    // attempt to use S in linear operations. 
    //
    sparseconverttocrs(s);

    //
    // Now S is in the CRS format and we are ready to do linear operations.
    // Let's calculate A*x for some x.
    //
    real_1d_array x = "[1,-1]";
    real_1d_array y = "[]";
    sparsemv(s, x, y);
    printf("%s\n", y.tostring(2).c_str()); // EXPECTED: [1.000,-5.000]
    return 0;
}


#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "linalg.h"

using namespace alglib;


int main(int argc, char **argv)
{
    //
    // This example demonstrates creation/initialization of the sparse matrix in the
    // CRS format.
    //
    // Hash-Table format used by default is very convenient (it allows easy
    // insertion of elements, automatic memory reallocation), but has
    // significant memory and performance overhead. Insertion of one element 
    // costs hundreds of CPU cycles, and memory consumption is several times
    // higher than that of CRS.
    //
    // When you work with really large matrices and when you can tell in 
    // advance how many elements EXACTLY you need, it can be beneficial to 
    // create matrix in the CRS format from the very beginning.
    //
    // If you want to create matrix in the CRS format, you should:
    // * use sparsecreatecrs() function
    // * know row sizes in advance (number of non-zero entries in each row)
    // * initialize matrix with sparseset() - another function, sparseadd(), is not allowed
    // * initialize elements from left to right, from top to bottom, each
    //   element is initialized only once.
    //
    sparsematrix s;
    integer_1d_array row_sizes = "[2,2,2,1]";
    sparsecreatecrs(4, 4, row_sizes, s);
    sparseset(s, 0, 0, 2.0);
    sparseset(s, 0, 1, 1.0);
    sparseset(s, 1, 1, 4.0);
    sparseset(s, 1, 2, 2.0);
    sparseset(s, 2, 2, 3.0);
    sparseset(s, 2, 3, 1.0);
    sparseset(s, 3, 3, 9.0);

    //
    // Now S is equal to
    //   [ 2 1     ]
    //   [   4 2   ]
    //   [     3 1 ]
    //   [       9 ]
    //
    // We should point out that we have initialized S elements from left to right,
    // from top to bottom. The CRS representation does NOT allow you to do so in
    // a different order. Try changing the order of the sparseset() calls above,
    // and you will see that your program generates an exception.
    //
    // We can check it by reading matrix contents with sparseget().
    // However, you should remember that sparseget() is inefficient on
    // CRS matrices (it may have to pass through all elements of the row
    // until it finds the element you need).
    //
    double v;
    v = sparseget(s, 0, 0);
    printf("%.2f\n", double(v)); // EXPECTED: 2.0000
    v = sparseget(s, 2, 3);
    printf("%.2f\n", double(v)); // EXPECTED: 1.0000

    // you may see that you can read zero elements (which are not stored) with sparseget()
    v = sparseget(s, 3, 2);
    printf("%.2f\n", double(v)); // EXPECTED: 0.0000

    //
    // After successful creation we can use our matrix for linear operations.
    // Let's calculate A*x for some x.
    //
    real_1d_array x = "[1,-1,1,-1]";
    real_1d_array y = "[]";
    sparsemv(s, x, y);
    printf("%s\n", y.tostring(2).c_str()); // EXPECTED: [1.000,-2.000,2.000,-9]
    return 0;
}


smatrixgevd
smatrixgevdreduce
/************************************************************************* Algorithm for solving the following generalized symmetric positive-definite eigenproblem: A*x = lambda*B*x (1) or A*B*x = lambda*x (2) or B*A*x = lambda*x (3). where A is a symmetric matrix, B - symmetric positive-definite matrix. The problem is solved by reducing it to an ordinary symmetric eigenvalue problem. Input parameters: A - symmetric matrix which is given by its upper or lower triangular part. Array whose indexes range within [0..N-1, 0..N-1]. N - size of matrices A and B. IsUpperA - storage format of matrix A. B - symmetric positive-definite matrix which is given by its upper or lower triangular part. Array whose indexes range within [0..N-1, 0..N-1]. IsUpperB - storage format of matrix B. ZNeeded - if ZNeeded is equal to: * 0, the eigenvectors are not returned; * 1, the eigenvectors are returned. ProblemType - if ProblemType is equal to: * 1, the following problem is solved: A*x = lambda*B*x; * 2, the following problem is solved: A*B*x = lambda*x; * 3, the following problem is solved: B*A*x = lambda*x. Output parameters: D - eigenvalues in ascending order. Array whose index ranges within [0..N-1]. Z - if ZNeeded is equal to: * 0, Z hasn't changed; * 1, Z contains eigenvectors. Array whose indexes range within [0..N-1, 0..N-1]. The eigenvectors are stored in matrix columns. It should be noted that the eigenvectors in such problems do not form an orthogonal system. Result: True, if the problem was solved successfully. False, if the error occurred during the Cholesky decomposition of matrix B (the matrix isn't positive-definite) or during the work of the iterative algorithm for solving the symmetric eigenproblem. See also the GeneralizedSymmetricDefiniteEVDReduce subroutine. -- ALGLIB -- Copyright 1.28.2006 by Bochkanov Sergey *************************************************************************/
bool alglib::smatrixgevd( real_2d_array a, ae_int_t n, bool isuppera, real_2d_array b, bool isupperb, ae_int_t zneeded, ae_int_t problemtype, real_1d_array& d, real_2d_array& z);
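Below is a minimal usage sketch (not one of the bundled examples) based on the signature above. With B set to the identity matrix, the generalized problem A*x = lambda*B*x degenerates to the ordinary symmetric eigenproblem for A, so the returned eigenvalues should be approximately 1.3820 and 3.6180.

#include "stdafx.h"
#include <stdio.h>
#include "linalg.h"

using namespace alglib;

int main(int argc, char **argv)
{
    // A is symmetric, B is the identity (trivially positive-definite),
    // so problem type 1 (A*x = lambda*B*x) reduces to the ordinary EVD of A.
    real_2d_array a = "[[2,1],[1,3]]";
    real_2d_array b = "[[1,0],[0,1]]";
    real_1d_array d;
    real_2d_array z;
    bool ok = smatrixgevd(a, 2, true, b, true, 1, 1, d, z);
    if( !ok )
    {
        printf("smatrixgevd failed\n");
        return 1;
    }
    // eigenvalues in ascending order, approximately [1.3820, 3.6180]
    printf("%s\n", d.tostring(4).c_str());
    return 0;
}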
/************************************************************************* Algorithm for reduction of the following generalized symmetric positive- definite eigenvalue problem: A*x = lambda*B*x (1) or A*B*x = lambda*x (2) or B*A*x = lambda*x (3) to the symmetric eigenvalues problem C*y = lambda*y (eigenvalues of this and the given problems are the same, and the eigenvectors of the given problem could be obtained by multiplying the obtained eigenvectors by the transformation matrix x = R*y). Here A is a symmetric matrix, B - symmetric positive-definite matrix. Input parameters: A - symmetric matrix which is given by its upper or lower triangular part. Array whose indexes range within [0..N-1, 0..N-1]. N - size of matrices A and B. IsUpperA - storage format of matrix A. B - symmetric positive-definite matrix which is given by its upper or lower triangular part. Array whose indexes range within [0..N-1, 0..N-1]. IsUpperB - storage format of matrix B. ProblemType - if ProblemType is equal to: * 1, the following problem is solved: A*x = lambda*B*x; * 2, the following problem is solved: A*B*x = lambda*x; * 3, the following problem is solved: B*A*x = lambda*x. Output parameters: A - symmetric matrix which is given by its upper or lower triangle depending on IsUpperA. Contains matrix C. Array whose indexes range within [0..N-1, 0..N-1]. R - upper triangular or low triangular transformation matrix which is used to obtain the eigenvectors of a given problem as the product of eigenvectors of C (from the right) and matrix R (from the left). If the matrix is upper triangular, the elements below the main diagonal are equal to 0 (and vice versa). Thus, we can perform the multiplication without taking into account the internal structure (which is an easier though less effective way). Array whose indexes range within [0..N-1, 0..N-1]. IsUpperR - type of matrix R (upper or lower triangular). Result: True, if the problem was reduced successfully. False, if the error occurred during the Cholesky decomposition of matrix B (the matrix is not positive-definite). -- ALGLIB -- Copyright 1.28.2006 by Bochkanov Sergey *************************************************************************/
bool alglib::smatrixgevdreduce( real_2d_array& a, ae_int_t n, bool isuppera, real_2d_array b, bool isupperb, ae_int_t problemtype, real_2d_array& r, bool& isupperr);
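A similar sketch for the reduction routine. With B equal to the identity its Cholesky factor is the identity, so the stored triangle of C should coincide with that of A and R should be an identity matrix; this degenerate choice is made only to keep the illustration easy to check.

#include "stdafx.h"
#include <stdio.h>
#include "linalg.h"

using namespace alglib;

int main(int argc, char **argv)
{
    // Reduce A*x = lambda*B*x (problem type 1) to C*y = lambda*y.
    // A is overwritten by C; R is the back-transformation matrix (x = R*y).
    real_2d_array a = "[[2,1],[1,3]]";
    real_2d_array b = "[[1,0],[0,1]]";
    real_2d_array r;
    bool isupperr;
    bool ok = smatrixgevdreduce(a, 2, true, b, true, 1, r, isupperr);
    if( !ok )
    {
        printf("smatrixgevdreduce failed\n");
        return 1;
    }
    printf("%s\n", a.tostring(4).c_str()); // upper triangle of C (same as A here, since B = I)
    printf("%s\n", r.tostring(4).c_str()); // R (identity here, since B = I)
    return 0;
}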
spline1dinterpolant
spline1dbuildakima
spline1dbuildcatmullrom
spline1dbuildcubic
spline1dbuildhermite
spline1dbuildlinear
spline1dbuildmonotone
spline1dcalc
spline1dconvcubic
spline1dconvdiff2cubic
spline1dconvdiffcubic
spline1ddiff
spline1dgriddiff2cubic
spline1dgriddiffcubic
spline1dintegrate
spline1dlintransx
spline1dlintransy
spline1dunpack
spline1d_d_convdiff Resampling using cubic splines
spline1d_d_cubic Cubic spline interpolation
spline1d_d_griddiff Differentiation on the grid using cubic splines
spline1d_d_linear Piecewise linear spline interpolation
spline1d_d_monotone Monotone interpolation
/************************************************************************* 1-dimensional spline interpolant *************************************************************************/
class spline1dinterpolant { };
/************************************************************************* This subroutine builds Akima spline interpolant INPUT PARAMETERS: X - spline nodes, array[0..N-1] Y - function values, array[0..N-1] N - points count (optional): * N>=2 * if given, only first N points are used to build spline * if not given, automatically detected from X/Y sizes (len(X) must be equal to len(Y)) OUTPUT PARAMETERS: C - spline interpolant ORDER OF POINTS Subroutine automatically sorts points, so caller may pass unsorted array. -- ALGLIB PROJECT -- Copyright 24.06.2007 by Bochkanov Sergey *************************************************************************/
void alglib::spline1dbuildakima( real_1d_array x, real_1d_array y, spline1dinterpolant& c); void alglib::spline1dbuildakima( real_1d_array x, real_1d_array y, ae_int_t n, spline1dinterpolant& c);
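A short illustrative sketch; it reuses the f(x)=x^2 test data from the bundled spline1d examples, and for this smooth dataset the Akima value at x=0.25 should be close to the true value 0.0625.

#include "stdafx.h"
#include <stdio.h>
#include "interpolation.h"

using namespace alglib;

int main(int argc, char **argv)
{
    // f(x)=x^2 sampled at 5 equidistant nodes on [-1,+1]
    real_1d_array x = "[-1.0,-0.5,0.0,+0.5,+1.0]";
    real_1d_array y = "[+1.0,0.25,0.0,0.25,+1.0]";
    spline1dinterpolant c;

    // build Akima spline and evaluate it at x=0.25
    spline1dbuildakima(x, y, c);
    double v = spline1dcalc(c, 0.25);
    printf("%.4f\n", double(v)); // should be close to the true value 0.0625
    return 0;
}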
/************************************************************************* This subroutine builds Catmull-Rom spline interpolant. INPUT PARAMETERS: X - spline nodes, array[0..N-1]. Y - function values, array[0..N-1]. OPTIONAL PARAMETERS: N - points count: * N>=2 * if given, only first N points are used to build spline * if not given, automatically detected from X/Y sizes (len(X) must be equal to len(Y)) BoundType - boundary condition type: * -1 for periodic boundary condition * 0 for parabolically terminated spline (default) Tension - tension parameter: * tension=0 corresponds to classic Catmull-Rom spline (default) * 0<tension<1 corresponds to more general form - cardinal spline OUTPUT PARAMETERS: C - spline interpolant ORDER OF POINTS Subroutine automatically sorts points, so caller may pass unsorted array. PROBLEMS WITH PERIODIC BOUNDARY CONDITIONS: Problems with periodic boundary conditions have Y[first_point]=Y[last_point]. However, this subroutine doesn't require you to specify equal values for the first and last points - it automatically forces them to be equal by copying Y[first_point] (corresponds to the leftmost, minimal X[]) to Y[last_point]. However it is recommended to pass consistent values of Y[], i.e. to make Y[first_point]=Y[last_point]. -- ALGLIB PROJECT -- Copyright 23.06.2007 by Bochkanov Sergey *************************************************************************/
void alglib::spline1dbuildcatmullrom( real_1d_array x, real_1d_array y, spline1dinterpolant& c); void alglib::spline1dbuildcatmullrom( real_1d_array x, real_1d_array y, ae_int_t n, ae_int_t boundtype, double tension, spline1dinterpolant& c);
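A similar sketch for the Catmull-Rom builder, using the full-argument overload with the default boundary type 0 (parabolic termination) and zero tension.

#include "stdafx.h"
#include <stdio.h>
#include "interpolation.h"

using namespace alglib;

int main(int argc, char **argv)
{
    real_1d_array x = "[-1.0,-0.5,0.0,+0.5,+1.0]";
    real_1d_array y = "[+1.0,0.25,0.0,0.25,+1.0]";
    spline1dinterpolant c;

    // classic Catmull-Rom spline: BoundType=0 (parabolic termination), Tension=0.0
    spline1dbuildcatmullrom(x, y, 5, 0, 0.0, c);
    double v = spline1dcalc(c, 0.25);
    printf("%.4f\n", double(v)); // should be close to the true value 0.0625
    return 0;
}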
/************************************************************************* This subroutine builds cubic spline interpolant. INPUT PARAMETERS: X - spline nodes, array[0..N-1]. Y - function values, array[0..N-1]. OPTIONAL PARAMETERS: N - points count: * N>=2 * if given, only first N points are used to build spline * if not given, automatically detected from X/Y sizes (len(X) must be equal to len(Y)) BoundLType - boundary condition type for the left boundary BoundL - left boundary condition (first or second derivative, depending on the BoundLType) BoundRType - boundary condition type for the right boundary BoundR - right boundary condition (first or second derivative, depending on the BoundRType) OUTPUT PARAMETERS: C - spline interpolant ORDER OF POINTS Subroutine automatically sorts points, so caller may pass unsorted array. SETTING BOUNDARY VALUES: The BoundLType/BoundRType parameters can have the following values: * -1, which corresonds to the periodic (cyclic) boundary conditions. In this case: * both BoundLType and BoundRType must be equal to -1. * BoundL/BoundR are ignored * Y[last] is ignored (it is assumed to be equal to Y[first]). * 0, which corresponds to the parabolically terminated spline (BoundL and/or BoundR are ignored). * 1, which corresponds to the first derivative boundary condition * 2, which corresponds to the second derivative boundary condition * by default, BoundType=0 is used PROBLEMS WITH PERIODIC BOUNDARY CONDITIONS: Problems with periodic boundary conditions have Y[first_point]=Y[last_point]. However, this subroutine doesn't require you to specify equal values for the first and last points - it automatically forces them to be equal by copying Y[first_point] (corresponds to the leftmost, minimal X[]) to Y[last_point]. However it is recommended to pass consistent values of Y[], i.e. to make Y[first_point]=Y[last_point]. -- ALGLIB PROJECT -- Copyright 23.06.2007 by Bochkanov Sergey *************************************************************************/
void alglib::spline1dbuildcubic( real_1d_array x, real_1d_array y, spline1dinterpolant& c); void alglib::spline1dbuildcubic( real_1d_array x, real_1d_array y, ae_int_t n, ae_int_t boundltype, double boundl, ae_int_t boundrtype, double boundr, spline1dinterpolant& c);
/************************************************************************* This subroutine builds Hermite spline interpolant. INPUT PARAMETERS: X - spline nodes, array[0..N-1] Y - function values, array[0..N-1] D - derivatives, array[0..N-1] N - points count (optional): * N>=2 * if given, only first N points are used to build spline * if not given, automatically detected from X/Y sizes (len(X) must be equal to len(Y)) OUTPUT PARAMETERS: C - spline interpolant. ORDER OF POINTS Subroutine automatically sorts points, so caller may pass unsorted array. -- ALGLIB PROJECT -- Copyright 23.06.2007 by Bochkanov Sergey *************************************************************************/
void alglib::spline1dbuildhermite( real_1d_array x, real_1d_array y, real_1d_array d, spline1dinterpolant& c); void alglib::spline1dbuildhermite( real_1d_array x, real_1d_array y, real_1d_array d, ae_int_t n, spline1dinterpolant& c);
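A sketch for the Hermite builder; since exact values and exact derivatives of f(x)=x^2 are supplied, the cubic Hermite interpolant reproduces f exactly and S(0.25) equals 0.0625.

#include "stdafx.h"
#include <stdio.h>
#include "interpolation.h"

using namespace alglib;

int main(int argc, char **argv)
{
    // values and derivatives of f(x)=x^2 at 5 equidistant nodes
    real_1d_array x = "[-1.0,-0.5,0.0,+0.5,+1.0]";
    real_1d_array y = "[+1.0,0.25,0.0,0.25,+1.0]";
    real_1d_array d = "[-2.0,-1.0,0.0,+1.0,+2.0]";
    spline1dinterpolant c;

    spline1dbuildhermite(x, y, d, c);
    double v = spline1dcalc(c, 0.25);
    printf("%.4f\n", double(v)); // 0.0625, since the supplied data is exact
    return 0;
}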
/************************************************************************* This subroutine builds linear spline interpolant INPUT PARAMETERS: X - spline nodes, array[0..N-1] Y - function values, array[0..N-1] N - points count (optional): * N>=2 * if given, only first N points are used to build spline * if not given, automatically detected from X/Y sizes (len(X) must be equal to len(Y)) OUTPUT PARAMETERS: C - spline interpolant ORDER OF POINTS Subroutine automatically sorts points, so caller may pass unsorted array. -- ALGLIB PROJECT -- Copyright 24.06.2007 by Bochkanov Sergey *************************************************************************/
void alglib::spline1dbuildlinear( real_1d_array x, real_1d_array y, spline1dinterpolant& c); void alglib::spline1dbuildlinear( real_1d_array x, real_1d_array y, ae_int_t n, spline1dinterpolant& c);

Examples:   [1]  [2]  

/************************************************************************* This function builds monotone cubic Hermite interpolant. This interpolant is monotonic in [x(0),x(n-1)] and is constant outside of this interval. In case y[] form non-monotonic sequence, interpolant is piecewise monotonic. Say, for x=(0,1,2,3,4) and y=(0,1,2,1,0) interpolant will monotonically grow at [0..2] and monotonically decrease at [2..4]. INPUT PARAMETERS: X - spline nodes, array[0..N-1]. Subroutine automatically sorts points, so caller may pass unsorted array. Y - function values, array[0..N-1] N - the number of points(N>=2). OUTPUT PARAMETERS: C - spline interpolant. -- ALGLIB PROJECT -- Copyright 21.06.2012 by Bochkanov Sergey *************************************************************************/
void alglib::spline1dbuildmonotone( real_1d_array x, real_1d_array y, spline1dinterpolant& c); void alglib::spline1dbuildmonotone( real_1d_array x, real_1d_array y, ae_int_t n, spline1dinterpolant& c);

Examples:   [1]  

/************************************************************************* This subroutine calculates the value of the spline at the given point X. INPUT PARAMETERS: C - spline interpolant X - point Result: S(x) -- ALGLIB PROJECT -- Copyright 23.06.2007 by Bochkanov Sergey *************************************************************************/
double alglib::spline1dcalc(spline1dinterpolant c, double x);

Examples:   [1]  [2]  [3]  

/************************************************************************* This function solves following problem: given table y[] of function values at old nodes x[] and new nodes x2[], it calculates and returns table of function values y2[] (calculated at x2[]). This function yields same result as Spline1DBuildCubic() call followed by sequence of Spline1DDiff() calls, but it can be several times faster when called for ordered X[] and X2[]. INPUT PARAMETERS: X - old spline nodes Y - function values X2 - new spline nodes OPTIONAL PARAMETERS: N - points count: * N>=2 * if given, only first N points from X/Y are used * if not given, automatically detected from X/Y sizes (len(X) must be equal to len(Y)) BoundLType - boundary condition type for the left boundary BoundL - left boundary condition (first or second derivative, depending on the BoundLType) BoundRType - boundary condition type for the right boundary BoundR - right boundary condition (first or second derivative, depending on the BoundRType) N2 - new points count: * N2>=2 * if given, only first N2 points from X2 are used * if not given, automatically detected from X2 size OUTPUT PARAMETERS: F2 - function values at X2[] ORDER OF POINTS Subroutine automatically sorts points, so caller may pass unsorted array. Function values are correctly reordered on return, so F2[I] is always equal to S(X2[I]) independently of points order. SETTING BOUNDARY VALUES: The BoundLType/BoundRType parameters can have the following values: * -1, which corresonds to the periodic (cyclic) boundary conditions. In this case: * both BoundLType and BoundRType must be equal to -1. * BoundL/BoundR are ignored * Y[last] is ignored (it is assumed to be equal to Y[first]). * 0, which corresponds to the parabolically terminated spline (BoundL and/or BoundR are ignored). * 1, which corresponds to the first derivative boundary condition * 2, which corresponds to the second derivative boundary condition * by default, BoundType=0 is used PROBLEMS WITH PERIODIC BOUNDARY CONDITIONS: Problems with periodic boundary conditions have Y[first_point]=Y[last_point]. However, this subroutine doesn't require you to specify equal values for the first and last points - it automatically forces them to be equal by copying Y[first_point] (corresponds to the leftmost, minimal X[]) to Y[last_point]. However it is recommended to pass consistent values of Y[], i.e. to make Y[first_point]=Y[last_point]. -- ALGLIB PROJECT -- Copyright 03.09.2010 by Bochkanov Sergey *************************************************************************/
void alglib::spline1dconvcubic( real_1d_array x, real_1d_array y, real_1d_array x2, real_1d_array& y2); void alglib::spline1dconvcubic( real_1d_array x, real_1d_array y, ae_int_t n, ae_int_t boundltype, double boundl, ae_int_t boundrtype, double boundr, real_1d_array x2, ae_int_t n2, real_1d_array& y2);

Examples:   [1]  

/************************************************************************* This function solves following problem: given table y[] of function values at old nodes x[] and new nodes x2[], it calculates and returns table of function values y2[], first and second derivatives d2[] and dd2[] (calculated at x2[]). This function yields same result as Spline1DBuildCubic() call followed by sequence of Spline1DDiff() calls, but it can be several times faster when called for ordered X[] and X2[]. INPUT PARAMETERS: X - old spline nodes Y - function values X2 - new spline nodes OPTIONAL PARAMETERS: N - points count: * N>=2 * if given, only first N points from X/Y are used * if not given, automatically detected from X/Y sizes (len(X) must be equal to len(Y)) BoundLType - boundary condition type for the left boundary BoundL - left boundary condition (first or second derivative, depending on the BoundLType) BoundRType - boundary condition type for the right boundary BoundR - right boundary condition (first or second derivative, depending on the BoundRType) N2 - new points count: * N2>=2 * if given, only first N2 points from X2 are used * if not given, automatically detected from X2 size OUTPUT PARAMETERS: F2 - function values at X2[] D2 - first derivatives at X2[] DD2 - second derivatives at X2[] ORDER OF POINTS Subroutine automatically sorts points, so caller may pass unsorted array. Function values are correctly reordered on return, so F2[I] is always equal to S(X2[I]) independently of points order. SETTING BOUNDARY VALUES: The BoundLType/BoundRType parameters can have the following values: * -1, which corresonds to the periodic (cyclic) boundary conditions. In this case: * both BoundLType and BoundRType must be equal to -1. * BoundL/BoundR are ignored * Y[last] is ignored (it is assumed to be equal to Y[first]). * 0, which corresponds to the parabolically terminated spline (BoundL and/or BoundR are ignored). * 1, which corresponds to the first derivative boundary condition * 2, which corresponds to the second derivative boundary condition * by default, BoundType=0 is used PROBLEMS WITH PERIODIC BOUNDARY CONDITIONS: Problems with periodic boundary conditions have Y[first_point]=Y[last_point]. However, this subroutine doesn't require you to specify equal values for the first and last points - it automatically forces them to be equal by copying Y[first_point] (corresponds to the leftmost, minimal X[]) to Y[last_point]. However it is recommended to pass consistent values of Y[], i.e. to make Y[first_point]=Y[last_point]. -- ALGLIB PROJECT -- Copyright 03.09.2010 by Bochkanov Sergey *************************************************************************/
void alglib::spline1dconvdiff2cubic( real_1d_array x, real_1d_array y, real_1d_array x2, real_1d_array& y2, real_1d_array& d2, real_1d_array& dd2); void alglib::spline1dconvdiff2cubic( real_1d_array x, real_1d_array y, ae_int_t n, ae_int_t boundltype, double boundl, ae_int_t boundrtype, double boundr, real_1d_array x2, ae_int_t n2, real_1d_array& y2, real_1d_array& d2, real_1d_array& dd2);

Examples:   [1]  

/************************************************************************* This function solves following problem: given table y[] of function values at old nodes x[] and new nodes x2[], it calculates and returns table of function values y2[] and derivatives d2[] (calculated at x2[]). This function yields same result as Spline1DBuildCubic() call followed by sequence of Spline1DDiff() calls, but it can be several times faster when called for ordered X[] and X2[]. INPUT PARAMETERS: X - old spline nodes Y - function values X2 - new spline nodes OPTIONAL PARAMETERS: N - points count: * N>=2 * if given, only first N points from X/Y are used * if not given, automatically detected from X/Y sizes (len(X) must be equal to len(Y)) BoundLType - boundary condition type for the left boundary BoundL - left boundary condition (first or second derivative, depending on the BoundLType) BoundRType - boundary condition type for the right boundary BoundR - right boundary condition (first or second derivative, depending on the BoundRType) N2 - new points count: * N2>=2 * if given, only first N2 points from X2 are used * if not given, automatically detected from X2 size OUTPUT PARAMETERS: F2 - function values at X2[] D2 - first derivatives at X2[] ORDER OF POINTS Subroutine automatically sorts points, so caller may pass unsorted array. Function values are correctly reordered on return, so F2[I] is always equal to S(X2[I]) independently of points order. SETTING BOUNDARY VALUES: The BoundLType/BoundRType parameters can have the following values: * -1, which corresonds to the periodic (cyclic) boundary conditions. In this case: * both BoundLType and BoundRType must be equal to -1. * BoundL/BoundR are ignored * Y[last] is ignored (it is assumed to be equal to Y[first]). * 0, which corresponds to the parabolically terminated spline (BoundL and/or BoundR are ignored). * 1, which corresponds to the first derivative boundary condition * 2, which corresponds to the second derivative boundary condition * by default, BoundType=0 is used PROBLEMS WITH PERIODIC BOUNDARY CONDITIONS: Problems with periodic boundary conditions have Y[first_point]=Y[last_point]. However, this subroutine doesn't require you to specify equal values for the first and last points - it automatically forces them to be equal by copying Y[first_point] (corresponds to the leftmost, minimal X[]) to Y[last_point]. However it is recommended to pass consistent values of Y[], i.e. to make Y[first_point]=Y[last_point]. -- ALGLIB PROJECT -- Copyright 03.09.2010 by Bochkanov Sergey *************************************************************************/
void alglib::spline1dconvdiffcubic( real_1d_array x, real_1d_array y, real_1d_array x2, real_1d_array& y2, real_1d_array& d2); void alglib::spline1dconvdiffcubic( real_1d_array x, real_1d_array y, ae_int_t n, ae_int_t boundltype, double boundl, ae_int_t boundrtype, double boundr, real_1d_array x2, ae_int_t n2, real_1d_array& y2, real_1d_array& d2);

Examples:   [1]  

/************************************************************************* This subroutine differentiates the spline. INPUT PARAMETERS: C - spline interpolant. X - point Result: S - S(x) DS - S'(x) D2S - S''(x) -- ALGLIB PROJECT -- Copyright 24.06.2007 by Bochkanov Sergey *************************************************************************/
void alglib::spline1ddiff( spline1dinterpolant c, double x, double& s, double& ds, double& d2s);
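A usage sketch for spline1ddiff; as the spline1d_d_cubic example shows, the default parabolically terminated cubic spline reproduces the quadratic test function exactly, so the returned values should be those of x^2.

#include "stdafx.h"
#include <stdio.h>
#include "interpolation.h"

using namespace alglib;

int main(int argc, char **argv)
{
    real_1d_array x = "[-1.0,-0.5,0.0,+0.5,+1.0]";
    real_1d_array y = "[+1.0,0.25,0.0,0.25,+1.0]";
    spline1dinterpolant c;
    double s, ds, d2s;

    // default boundary conditions reproduce f(x)=x^2 exactly
    spline1dbuildcubic(x, y, c);
    spline1ddiff(c, 0.25, s, ds, d2s);
    printf("%.4f %.4f %.4f\n", s, ds, d2s); // 0.0625 0.5000 2.0000
    return 0;
}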
/************************************************************************* This function solves following problem: given table y[] of function values at nodes x[], it calculates and returns tables of first and second function derivatives d1[] and d2[] (calculated at the same nodes x[]). This function yields same result as Spline1DBuildCubic() call followed by sequence of Spline1DDiff() calls, but it can be several times faster when called for ordered X[] and X2[]. INPUT PARAMETERS: X - spline nodes Y - function values OPTIONAL PARAMETERS: N - points count: * N>=2 * if given, only first N points are used * if not given, automatically detected from X/Y sizes (len(X) must be equal to len(Y)) BoundLType - boundary condition type for the left boundary BoundL - left boundary condition (first or second derivative, depending on the BoundLType) BoundRType - boundary condition type for the right boundary BoundR - right boundary condition (first or second derivative, depending on the BoundRType) OUTPUT PARAMETERS: D1 - S' values at X[] D2 - S'' values at X[] ORDER OF POINTS Subroutine automatically sorts points, so caller may pass unsorted array. Derivative values are correctly reordered on return, so D[I] is always equal to S'(X[I]) independently of points order. SETTING BOUNDARY VALUES: The BoundLType/BoundRType parameters can have the following values: * -1, which corresonds to the periodic (cyclic) boundary conditions. In this case: * both BoundLType and BoundRType must be equal to -1. * BoundL/BoundR are ignored * Y[last] is ignored (it is assumed to be equal to Y[first]). * 0, which corresponds to the parabolically terminated spline (BoundL and/or BoundR are ignored). * 1, which corresponds to the first derivative boundary condition * 2, which corresponds to the second derivative boundary condition * by default, BoundType=0 is used PROBLEMS WITH PERIODIC BOUNDARY CONDITIONS: Problems with periodic boundary conditions have Y[first_point]=Y[last_point]. However, this subroutine doesn't require you to specify equal values for the first and last points - it automatically forces them to be equal by copying Y[first_point] (corresponds to the leftmost, minimal X[]) to Y[last_point]. However it is recommended to pass consistent values of Y[], i.e. to make Y[first_point]=Y[last_point]. -- ALGLIB PROJECT -- Copyright 03.09.2010 by Bochkanov Sergey *************************************************************************/
void alglib::spline1dgriddiff2cubic( real_1d_array x, real_1d_array y, real_1d_array& d1, real_1d_array& d2); void alglib::spline1dgriddiff2cubic( real_1d_array x, real_1d_array y, ae_int_t n, ae_int_t boundltype, double boundl, ae_int_t boundrtype, double boundr, real_1d_array& d1, real_1d_array& d2);

Examples:   [1]  

/************************************************************************* This function solves following problem: given table y[] of function values at nodes x[], it calculates and returns table of function derivatives d[] (calculated at the same nodes x[]). This function yields same result as Spline1DBuildCubic() call followed by sequence of Spline1DDiff() calls, but it can be several times faster when called for ordered X[] and X2[]. INPUT PARAMETERS: X - spline nodes Y - function values OPTIONAL PARAMETERS: N - points count: * N>=2 * if given, only first N points are used * if not given, automatically detected from X/Y sizes (len(X) must be equal to len(Y)) BoundLType - boundary condition type for the left boundary BoundL - left boundary condition (first or second derivative, depending on the BoundLType) BoundRType - boundary condition type for the right boundary BoundR - right boundary condition (first or second derivative, depending on the BoundRType) OUTPUT PARAMETERS: D - derivative values at X[] ORDER OF POINTS Subroutine automatically sorts points, so caller may pass unsorted array. Derivative values are correctly reordered on return, so D[I] is always equal to S'(X[I]) independently of points order. SETTING BOUNDARY VALUES: The BoundLType/BoundRType parameters can have the following values: * -1, which corresonds to the periodic (cyclic) boundary conditions. In this case: * both BoundLType and BoundRType must be equal to -1. * BoundL/BoundR are ignored * Y[last] is ignored (it is assumed to be equal to Y[first]). * 0, which corresponds to the parabolically terminated spline (BoundL and/or BoundR are ignored). * 1, which corresponds to the first derivative boundary condition * 2, which corresponds to the second derivative boundary condition * by default, BoundType=0 is used PROBLEMS WITH PERIODIC BOUNDARY CONDITIONS: Problems with periodic boundary conditions have Y[first_point]=Y[last_point]. However, this subroutine doesn't require you to specify equal values for the first and last points - it automatically forces them to be equal by copying Y[first_point] (corresponds to the leftmost, minimal X[]) to Y[last_point]. However it is recommended to pass consistent values of Y[], i.e. to make Y[first_point]=Y[last_point]. -- ALGLIB PROJECT -- Copyright 03.09.2010 by Bochkanov Sergey *************************************************************************/
void alglib::spline1dgriddiffcubic( real_1d_array x, real_1d_array y, real_1d_array& d); void alglib::spline1dgriddiffcubic( real_1d_array x, real_1d_array y, ae_int_t n, ae_int_t boundltype, double boundl, ae_int_t boundrtype, double boundr, real_1d_array& d);

Examples:   [1]  

/************************************************************************* This subroutine integrates the spline. INPUT PARAMETERS: C - spline interpolant. X - right bound of the integration interval [a, x], here 'a' denotes min(x[]) Result: integral(S(t)dt,a,x) -- ALGLIB PROJECT -- Copyright 23.06.2007 by Bochkanov Sergey *************************************************************************/
double alglib::spline1dintegrate(spline1dinterpolant c, double x);
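A sketch for spline1dintegrate under the same assumption (exact reproduction of x^2 by the default cubic spline), so the integral from a=min(x[])=-1 to x=+1 should be approximately 2/3.

#include "stdafx.h"
#include <stdio.h>
#include "interpolation.h"

using namespace alglib;

int main(int argc, char **argv)
{
    real_1d_array x = "[-1.0,-0.5,0.0,+0.5,+1.0]";
    real_1d_array y = "[+1.0,0.25,0.0,0.25,+1.0]";
    spline1dinterpolant c;

    spline1dbuildcubic(x, y, c);
    // integral of S(t) over [a,x] with a=min(x[])=-1 and x=+1
    double v = spline1dintegrate(c, +1.0);
    printf("%.4f\n", double(v)); // approximately 0.6667 (= 2/3)
    return 0;
}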
/************************************************************************* This subroutine performs linear transformation of the spline argument. INPUT PARAMETERS: C - spline interpolant. A, B- transformation coefficients: x = A*t + B Result: C - transformed spline -- ALGLIB PROJECT -- Copyright 30.06.2007 by Bochkanov Sergey *************************************************************************/
void alglib::spline1dlintransx(spline1dinterpolant c, double a, double b);
/************************************************************************* This subroutine performs linear transformation of the spline. INPUT PARAMETERS: C - spline interpolant. A, B- transformation coefficients: S2(x) = A*S(x) + B Result: C - transformed spline -- ALGLIB PROJECT -- Copyright 30.06.2007 by Bochkanov Sergey *************************************************************************/
void alglib::spline1dlintransy(spline1dinterpolant c, double a, double b);
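The two transformation routines can be illustrated together under the same x^2 assumption: spline1dlintransx() substitutes x = A*t + B into the argument, while spline1dlintransy() rescales the spline values.

#include "stdafx.h"
#include <stdio.h>
#include "interpolation.h"

using namespace alglib;

int main(int argc, char **argv)
{
    real_1d_array x = "[-1.0,-0.5,0.0,+0.5,+1.0]";
    real_1d_array y = "[+1.0,0.25,0.0,0.25,+1.0]";
    spline1dinterpolant c;
    double v;

    // S(x) reproduces x^2 on [-1,+1]
    spline1dbuildcubic(x, y, c);

    // argument transformation x = 2*t: new spline is S2(t) = S(2*t), i.e. 4*t^2
    spline1dlintransx(c, 2.0, 0.0);
    v = spline1dcalc(c, 0.25);
    printf("%.4f\n", double(v)); // 0.2500

    // value transformation S3(t) = 2*S2(t) + 1
    spline1dlintransy(c, 2.0, 1.0);
    v = spline1dcalc(c, 0.25);
    printf("%.4f\n", double(v)); // 1.5000
    return 0;
}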
/************************************************************************* This subroutine unpacks the spline into the coefficients table. INPUT PARAMETERS: C - spline interpolant. X - point OUTPUT PARAMETERS: Tbl - coefficients table, unpacked format, array[0..N-2, 0..5]. For I = 0...N-2: Tbl[I,0] = X[i] Tbl[I,1] = X[i+1] Tbl[I,2] = C0 Tbl[I,3] = C1 Tbl[I,4] = C2 Tbl[I,5] = C3 On [x[i], x[i+1]] spline is equals to: S(x) = C0 + C1*t + C2*t^2 + C3*t^3 t = x-x[i] NOTE: You can rebuild spline with Spline1DBuildHermite() function, which accepts as inputs function values and derivatives at nodes, which are easy to calculate when you have coefficients. -- ALGLIB PROJECT -- Copyright 29.06.2007 by Bochkanov Sergey *************************************************************************/
void alglib::spline1dunpack( spline1dinterpolant c, ae_int_t& n, real_2d_array& tbl);
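A sketch for spline1dunpack applied to a piecewise linear spline, whose per-interval coefficients are easy to verify by hand.

#include "stdafx.h"
#include <stdio.h>
#include "interpolation.h"

using namespace alglib;

int main(int argc, char **argv)
{
    // piecewise linear spline through (0,0), (1,1), (2,1)
    real_1d_array x = "[0,1,2]";
    real_1d_array y = "[0,1,1]";
    spline1dinterpolant c;
    ae_int_t n;
    real_2d_array tbl;

    spline1dbuildlinear(x, y, c);
    spline1dunpack(c, n, tbl);

    // each row is [x[i], x[i+1], C0, C1, C2, C3] with S(x) = C0 + C1*t, t = x-x[i];
    // here we expect rows [0,1, 0,1,0,0] and [1,2, 1,0,0,0]
    printf("%s\n", tbl.tostring(4).c_str());
    return 0;
}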
#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "interpolation.h"

using namespace alglib;


int main(int argc, char **argv)
{
    //
    // We use cubic spline to do resampling, i.e. having
    // values of f(x)=x^2 sampled at 5 equidistant nodes on [-1,+1]
    // we calculate values/derivatives of cubic spline on 
    // another grid (equidistant with 9 nodes on [-1,+1])
    // WITHOUT CONSTRUCTION OF SPLINE OBJECT.
    //
    // There are efficient functions spline1dconvcubic(),
    // spline1dconvdiffcubic() and spline1dconvdiff2cubic() 
    // for such calculations.
    //
    // We use default boundary conditions ("parabolically terminated
    // spline") because cubic spline built with such boundary conditions 
    // will exactly reproduce any quadratic f(x).
    //
    // Actually, we could use natural conditions, but we feel that 
    // spline which exactly reproduces f() will show us more 
    // understandable results.
    //
    real_1d_array x_old = "[-1.0,-0.5,0.0,+0.5,+1.0]";
    real_1d_array y_old = "[+1.0,0.25,0.0,0.25,+1.0]";
    real_1d_array x_new = "[-1.00,-0.75,-0.50,-0.25,0.00,+0.25,+0.50,+0.75,+1.00]";
    real_1d_array y_new;
    real_1d_array d1_new;
    real_1d_array d2_new;

    //
    // First, conversion without differentiation.
    //
    //
    spline1dconvcubic(x_old, y_old, x_new, y_new);
    printf("%s\n", y_new.tostring(3).c_str()); // EXPECTED: [1.0000, 0.5625, 0.2500, 0.0625, 0.0000, 0.0625, 0.2500, 0.5625, 1.0000]

    //
    // Then, conversion with differentiation (first derivatives only)
    //
    //
    spline1dconvdiffcubic(x_old, y_old, x_new, y_new, d1_new);
    printf("%s\n", y_new.tostring(3).c_str()); // EXPECTED: [1.0000, 0.5625, 0.2500, 0.0625, 0.0000, 0.0625, 0.2500, 0.5625, 1.0000]
    printf("%s\n", d1_new.tostring(3).c_str()); // EXPECTED: [-2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0]

    //
    // Finally, conversion with first and second derivatives
    //
    //
    spline1dconvdiff2cubic(x_old, y_old, x_new, y_new, d1_new, d2_new);
    printf("%s\n", y_new.tostring(3).c_str()); // EXPECTED: [1.0000, 0.5625, 0.2500, 0.0625, 0.0000, 0.0625, 0.2500, 0.5625, 1.0000]
    printf("%s\n", d1_new.tostring(3).c_str()); // EXPECTED: [-2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0]
    printf("%s\n", d2_new.tostring(3).c_str()); // EXPECTED: [2.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0]
    return 0;
}


#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "interpolation.h"

using namespace alglib;


int main(int argc, char **argv)
{
    //
    // We use cubic spline to interpolate f(x)=x^2 sampled 
    // at 5 equidistant nodes on [-1,+1].
    //
    // First, we use default boundary conditions ("parabolically terminated
    // spline") because cubic spline built with such boundary conditions 
    // will exactly reproduce any quadratic f(x).
    //
    // Then we try to use natural boundary conditions
    //     d2S(-1)/dx^2 = 0.0
    //     d2S(+1)/dx^2 = 0.0
    // and see that such a spline interpolates f(x) with a small error.
    //
    real_1d_array x = "[-1.0,-0.5,0.0,+0.5,+1.0]";
    real_1d_array y = "[+1.0,0.25,0.0,0.25,+1.0]";
    double t = 0.25;
    double v;
    spline1dinterpolant s;
    ae_int_t natural_bound_type = 2;
    //
    // Test exact boundary conditions: build S(x), calculate S(0.25)
    // (almost same as original function)
    //
    spline1dbuildcubic(x, y, s);
    v = spline1dcalc(s, t);
    printf("%.4f\n", double(v)); // EXPECTED: 0.0625

    //
    // Test natural boundary conditions: build S(x), calculate S(0.25)
    // (small interpolation error)
    //
    spline1dbuildcubic(x, y, 5, natural_bound_type, 0.0, natural_bound_type, 0.0, s);
    v = spline1dcalc(s, t);
    printf("%.3f\n", double(v)); // EXPECTED: 0.0580
    return 0;
}


#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "interpolation.h"

using namespace alglib;


int main(int argc, char **argv)
{
    //
    // We use cubic spline to do grid differentiation, i.e. having
    // values of f(x)=x^2 sampled at 5 equidistant nodes on [-1,+1]
    // we calculate derivatives of cubic spline at nodes WITHOUT
    // CONSTRUCTION OF SPLINE OBJECT.
    //
    // There are efficient functions spline1dgriddiffcubic() and
    // spline1dgriddiff2cubic() for such calculations.
    //
    // We use default boundary conditions ("parabolically terminated
    // spline") because cubic spline built with such boundary conditions 
    // will exactly reproduce any quadratic f(x).
    //
    // Actually, we could use natural conditions, but we feel that 
    // spline which exactly reproduces f() will show us more 
    // understandable results.
    //
    real_1d_array x = "[-1.0,-0.5,0.0,+0.5,+1.0]";
    real_1d_array y = "[+1.0,0.25,0.0,0.25,+1.0]";
    real_1d_array d1;
    real_1d_array d2;

    //
    // We calculate first derivatives: they must be equal to 2*x
    //
    spline1dgriddiffcubic(x, y, d1);
    printf("%s\n", d1.tostring(3).c_str()); // EXPECTED: [-2.0, -1.0, 0.0, +1.0, +2.0]

    //
    // Now test griddiff2, which returns first AND second derivatives.
    // First derivative is 2*x, second is equal to 2.0
    //
    spline1dgriddiff2cubic(x, y, d1, d2);
    printf("%s\n", d1.tostring(3).c_str()); // EXPECTED: [-2.0, -1.0, 0.0, +1.0, +2.0]
    printf("%s\n", d2.tostring(3).c_str()); // EXPECTED: [ 2.0,  2.0, 2.0,  2.0,  2.0]
    return 0;
}


#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "interpolation.h"

using namespace alglib;


int main(int argc, char **argv)
{
    //
    // We use piecewise linear spline to interpolate f(x)=x^2 sampled 
    // at 5 equidistant nodes on [-1,+1].
    //
    real_1d_array x = "[-1.0,-0.5,0.0,+0.5,+1.0]";
    real_1d_array y = "[+1.0,0.25,0.0,0.25,+1.0]";
    double t = 0.25;
    double v;
    spline1dinterpolant s;

    // build spline
    spline1dbuildlinear(x, y, s);

    // calculate S(0.25) - it is quite different from 0.25^2=0.0625
    v = spline1dcalc(s, t);
    printf("%.4f\n", double(v)); // EXPECTED: 0.125
    return 0;
}


#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "interpolation.h"

using namespace alglib;


int main(int argc, char **argv)
{
    //
    // A spline built with spline1dbuildcubic() can be non-monotone even when
    // Y-values form monotone sequence. Say, for x=[0,1,2] and y=[0,1,1]
    // cubic spline will monotonically grow until x=1.5 and then start
    // decreasing.
    //
    // That's why ALGLIB provides special spline construction function
    // which builds spline which preserves monotonicity of the original
    // dataset.
    //
    // NOTE: in case original dataset is non-monotonic, ALGLIB splits it
    // into monotone subsequences and builds piecewise monotonic spline.
    //
    real_1d_array x = "[0,1,2]";
    real_1d_array y = "[0,1,1]";
    spline1dinterpolant s;

    // build spline
    spline1dbuildmonotone(x, y, s);

    // calculate S at x = [-0.5, 0.0, 0.5, 1.0, 1.5, 2.0]
    // you may see that the spline is really monotonic
    double v;
    v = spline1dcalc(s, -0.5);
    printf("%.4f\n", double(v)); // EXPECTED: 0.0000
    v = spline1dcalc(s, 0.0);
    printf("%.4f\n", double(v)); // EXPECTED: 0.0000
    v = spline1dcalc(s, +0.5);
    printf("%.4f\n", double(v)); // EXPECTED: 0.5000
    v = spline1dcalc(s, 1.0);
    printf("%.4f\n", double(v)); // EXPECTED: 1.0000
    v = spline1dcalc(s, 1.5);
    printf("%.4f\n", double(v)); // EXPECTED: 1.0000
    v = spline1dcalc(s, 2.0);
    printf("%.4f\n", double(v)); // EXPECTED: 1.0000
    return 0;
}


spline2dinterpolant
spline2dbuildbicubic
spline2dbuildbicubicv
spline2dbuildbilinear
spline2dbuildbilinearv
spline2dcalc
spline2dcalcv
spline2dcalcvbuf
spline2dcopy
spline2ddiff
spline2dlintransf
spline2dlintransxy
spline2dresamplebicubic
spline2dresamplebilinear
spline2dunpack
spline2dunpackv
spline2d_bicubic Bicubic spline interpolation
spline2d_bilinear Bilinear spline interpolation
spline2d_copytrans Copy and transform
spline2d_unpack Unpacking bilinear spline
spline2d_vector Vector-valued bilinear spline interpolation
/************************************************************************* 2-dimensional spline interpolant *************************************************************************/
class spline2dinterpolant { };
/************************************************************************* This subroutine was deprecated in ALGLIB 3.6.0 We recommend you to switch to Spline2DBuildBicubicV(), which is more flexible and accepts its arguments in more convenient order. -- ALGLIB PROJECT -- Copyright 05.07.2007 by Bochkanov Sergey *************************************************************************/
void alglib::spline2dbuildbicubic( real_1d_array x, real_1d_array y, real_2d_array f, ae_int_t m, ae_int_t n, spline2dinterpolant& c);
/************************************************************************* This subroutine builds bicubic vector-valued spline. Input parameters: X - spline abscissas, array[0..N-1] Y - spline ordinates, array[0..M-1] F - function values, array[0..M*N*D-1]: * first D elements store D values at (X[0],Y[0]) * next D elements store D values at (X[1],Y[0]) * general form - D function values at (X[i],Y[j]) are stored at F[D*(J*N+I)...D*(J*N+I)+D-1]. M,N - grid size, M>=2, N>=2 D - vector dimension, D>=1 Output parameters: C - spline interpolant -- ALGLIB PROJECT -- Copyright 16.04.2012 by Bochkanov Sergey *************************************************************************/
void alglib::spline2dbuildbicubicv( real_1d_array x, ae_int_t n, real_1d_array y, ae_int_t m, real_1d_array f, ae_int_t d, spline2dinterpolant& c);

Examples:   [1]  

/************************************************************************* This subroutine was deprecated in ALGLIB 3.6.0 We recommend you to switch to Spline2DBuildBilinearV(), which is more flexible and accepts its arguments in more convenient order. -- ALGLIB PROJECT -- Copyright 05.07.2007 by Bochkanov Sergey *************************************************************************/
void alglib::spline2dbuildbilinear( real_1d_array x, real_1d_array y, real_2d_array f, ae_int_t m, ae_int_t n, spline2dinterpolant& c);
/************************************************************************* This subroutine builds bilinear vector-valued spline. Input parameters: X - spline abscissas, array[0..N-1] Y - spline ordinates, array[0..M-1] F - function values, array[0..M*N*D-1]: * first D elements store D values at (X[0],Y[0]) * next D elements store D values at (X[1],Y[0]) * general form - D function values at (X[i],Y[j]) are stored at F[D*(J*N+I)...D*(J*N+I)+D-1]. M,N - grid size, M>=2, N>=2 D - vector dimension, D>=1 Output parameters: C - spline interpolant -- ALGLIB PROJECT -- Copyright 16.04.2012 by Bochkanov Sergey *************************************************************************/
void alglib::spline2dbuildbilinearv( real_1d_array x, ae_int_t n, real_1d_array y, ae_int_t m, real_1d_array f, ae_int_t d, spline2dinterpolant& c);

Examples:   [1]  [2]  

/************************************************************************* This subroutine calculates the value of the bilinear or bicubic spline at the given point X. Input parameters: C - coefficients table. Built by BuildBilinearSpline or BuildBicubicSpline. X, Y- point Result: S(x,y) -- ALGLIB PROJECT -- Copyright 05.07.2007 by Bochkanov Sergey *************************************************************************/
double alglib::spline2dcalc(spline2dinterpolant c, double x, double y);

Examples:   [1]  [2]  

/************************************************************************* This subroutine calculates bilinear or bicubic vector-valued spline at the given point (X,Y). INPUT PARAMETERS: C - spline interpolant. X, Y- point OUTPUT PARAMETERS: F - array[D] which stores function values. F is out-parameter and it is reallocated after call to this function. In case you want to reuse previously allocated F, you may use Spline2DCalcVBuf(), which reallocates F only when it is too small. -- ALGLIB PROJECT -- Copyright 16.04.2012 by Bochkanov Sergey *************************************************************************/
void alglib::spline2dcalcv( spline2dinterpolant c, double x, double y, real_1d_array& f);

Examples:   [1]  

/************************************************************************* This subroutine calculates bilinear or bicubic vector-valued spline at the given point (X,Y). INPUT PARAMETERS: C - spline interpolant. X, Y- point F - output buffer, possibly preallocated array. In case array size is large enough to store result, it is not reallocated. Array which is too short will be reallocated OUTPUT PARAMETERS: F - array[D] (or larger) which stores function values -- ALGLIB PROJECT -- Copyright 16.04.2012 by Bochkanov Sergey *************************************************************************/
void alglib::spline2dcalcvbuf( spline2dinterpolant c, double x, double y, real_1d_array& f);
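A buffered-evaluation sketch reusing the vector-valued test spline (f0,f1) = {x+2*y, 2*x+y} from the spline2d_vector example; the point is that the preallocated buffer is reused across calls instead of being reallocated.

#include "stdafx.h"
#include <stdio.h>
#include "interpolation.h"

using namespace alglib;

int main(int argc, char **argv)
{
    real_1d_array x = "[0.0, 1.0]";
    real_1d_array y = "[0.0, 1.0]";
    real_1d_array f = "[0.00,0.00, 1.00,2.00, 2.00,1.00, 3.00,3.00]";
    spline2dinterpolant s;
    spline2dbuildbilinearv(x, 2, y, 2, f, 2, s);

    // preallocate a buffer of length D=2 once, then reuse it for several evaluations
    real_1d_array buf;
    buf.setlength(2);
    spline2dcalcvbuf(s, 0.1, 0.3, buf);
    printf("%s\n", buf.tostring(4).c_str()); // [0.7000,0.5000]
    spline2dcalcvbuf(s, 0.5, 1.0, buf);
    printf("%s\n", buf.tostring(4).c_str()); // [2.5000,2.0000]
    return 0;
}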
/************************************************************************* This subroutine makes the copy of the spline model. Input parameters: C - spline interpolant Output parameters: CC - spline copy -- ALGLIB PROJECT -- Copyright 29.06.2007 by Bochkanov Sergey *************************************************************************/
void alglib::spline2dcopy(spline2dinterpolant c, spline2dinterpolant& cc);

Examples:   [1]  

/************************************************************************* This subroutine calculates the value of the bilinear or bicubic spline at the given point X and its derivatives. Input parameters: C - spline interpolant. X, Y- point Output parameters: F - S(x,y) FX - dS(x,y)/dX FY - dS(x,y)/dY FXY - d2S(x,y)/dXdY -- ALGLIB PROJECT -- Copyright 05.07.2007 by Bochkanov Sergey *************************************************************************/
void alglib::spline2ddiff( spline2dinterpolant c, double x, double y, double& f, double& fx, double& fy, double& fxy);
/************************************************************************* This subroutine performs linear transformation of the spline. Input parameters: C - spline interpolant. A, B- transformation coefficients: S2(x,y) = A*S(x,y) + B Output parameters: C - transformed spline -- ALGLIB PROJECT -- Copyright 30.06.2007 by Bochkanov Sergey *************************************************************************/
void alglib::spline2dlintransf(spline2dinterpolant c, double a, double b);

Examples:   [1]  

/************************************************************************* This subroutine performs linear transformation of the spline argument. Input parameters: C - spline interpolant AX, BX - transformation coefficients: x = A*t + B AY, BY - transformation coefficients: y = A*u + B Result: C - transformed spline -- ALGLIB PROJECT -- Copyright 30.06.2007 by Bochkanov Sergey *************************************************************************/
void alglib::spline2dlintransxy( spline2dinterpolant c, double ax, double bx, double ay, double by);

Examples:   [1]  

/************************************************************************* Bicubic spline resampling Input parameters: A - function values at the old grid, array[0..OldHeight-1, 0..OldWidth-1] OldHeight - old grid height, OldHeight>1 OldWidth - old grid width, OldWidth>1 NewHeight - new grid height, NewHeight>1 NewWidth - new grid width, NewWidth>1 Output parameters: B - function values at the new grid, array[0..NewHeight-1, 0..NewWidth-1] -- ALGLIB routine -- 15 May, 2007 Copyright by Bochkanov Sergey *************************************************************************/
void alglib::spline2dresamplebicubic( real_2d_array a, ae_int_t oldheight, ae_int_t oldwidth, real_2d_array& b, ae_int_t newheight, ae_int_t newwidth);
/************************************************************************* Bilinear spline resampling Input parameters: A - function values at the old grid, array[0..OldHeight-1, 0..OldWidth-1] OldHeight - old grid height, OldHeight>1 OldWidth - old grid width, OldWidth>1 NewHeight - new grid height, NewHeight>1 NewWidth - new grid width, NewWidth>1 Output parameters: B - function values at the new grid, array[0..NewHeight-1, 0..NewWidth-1] -- ALGLIB routine -- 09.07.2007 Copyright by Bochkanov Sergey *************************************************************************/
void alglib::spline2dresamplebilinear( real_2d_array a, ae_int_t oldheight, ae_int_t oldwidth, real_2d_array& b, ae_int_t newheight, ae_int_t newwidth);
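A resampling sketch; it assumes that the old and new grids are treated as equidistant over the same rectangle, in which case bilinear resampling of a bilinear function is exact.

#include "stdafx.h"
#include <stdio.h>
#include "interpolation.h"

using namespace alglib;

int main(int argc, char **argv)
{
    // f(x,y)=x+y sampled on a 2x2 grid
    real_2d_array a = "[[0,1],[1,2]]";
    real_2d_array b;

    // resample to a 3x3 grid
    spline2dresamplebilinear(a, 2, 2, b, 3, 3);

    // for a bilinear f the result should be [[0,0.5,1],[0.5,1,1.5],[1,1.5,2]]
    printf("%s\n", b.tostring(2).c_str());
    return 0;
}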
/************************************************************************* This subroutine was deprecated in ALGLIB 3.6.0 We recommend you to switch to Spline2DUnpackV(), which is more flexible and accepts its arguments in more convenient order. -- ALGLIB PROJECT -- Copyright 29.06.2007 by Bochkanov Sergey *************************************************************************/
void alglib::spline2dunpack( spline2dinterpolant c, ae_int_t& m, ae_int_t& n, real_2d_array& tbl);
/************************************************************************* This subroutine unpacks two-dimensional spline into the coefficients table Input parameters: C - spline interpolant. Result: M, N- grid size (x-axis and y-axis) D - number of components Tbl - coefficients table, unpacked format, D - components: [0..(N-1)*(M-1)*D-1, 0..19]. For T=0..D-1 (component index), I = 0...N-2 (x index), J=0..M-2 (y index): K := T + I*D + J*D*(N-1) K-th row stores decomposition for T-th component of the vector-valued function Tbl[K,0] = X[i] Tbl[K,1] = X[i+1] Tbl[K,2] = Y[j] Tbl[K,3] = Y[j+1] Tbl[K,4] = C00 Tbl[K,5] = C01 Tbl[K,6] = C02 Tbl[K,7] = C03 Tbl[K,8] = C10 Tbl[K,9] = C11 ... Tbl[K,19] = C33 On each grid square spline is equals to: S(x) = SUM(c[i,j]*(t^i)*(u^j), i=0..3, j=0..3) t = x-x[j] u = y-y[i] -- ALGLIB PROJECT -- Copyright 16.04.2012 by Bochkanov Sergey *************************************************************************/
void alglib::spline2dunpackv( spline2dinterpolant c, ae_int_t& m, ae_int_t& n, ae_int_t& d, real_2d_array& tbl);

Examples:   [1]  

#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "interpolation.h"

using namespace alglib;


int main(int argc, char **argv)
{
    //
    // We use bicubic spline to interpolate f(x,y)=x^2+2*y^2 sampled 
    // at (x,y) from [0.0, 0.5, 1.0] X [0.0, 1.0].
    //
    real_1d_array x = "[0.0, 0.5, 1.0]";
    real_1d_array y = "[0.0, 1.0]";
    real_1d_array f = "[0.00,0.25,1.00,2.00,2.25,3.00]";
    double vx = 0.25;
    double vy = 0.50;
    double v;
    double dx;
    double dy;
    double dxy;
    spline2dinterpolant s;

    // build spline
    spline2dbuildbicubicv(x, 3, y, 2, f, 1, s);

    // calculate S(0.25,0.50)
    v = spline2dcalc(s, vx, vy);
    printf("%.4f\n", double(v)); // EXPECTED: 1.0625

    // calculate derivatives
    spline2ddiff(s, vx, vy, v, dx, dy, dxy);
    printf("%.4f\n", double(v)); // EXPECTED: 1.0625
    printf("%.4f\n", double(dx)); // EXPECTED: 0.5000
    printf("%.4f\n", double(dy)); // EXPECTED: 2.0000
    return 0;
}


#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "interpolation.h"

using namespace alglib;


int main(int argc, char **argv)
{
    //
    // We use bilinear spline to interpolate f(x,y)=x^2+2*y^2 sampled 
    // at (x,y) from [0.0, 0.5, 1.0] X [0.0, 1.0].
    //
    real_1d_array x = "[0.0, 0.5, 1.0]";
    real_1d_array y = "[0.0, 1.0]";
    real_1d_array f = "[0.00,0.25,1.00,2.00,2.25,3.00]";
    double vx = 0.25;
    double vy = 0.50;
    double v;
    spline2dinterpolant s;

    // build spline
    spline2dbuildbilinearv(x, 3, y, 2, f, 1, s);

    // calculate S(0.25,0.50)
    v = spline2dcalc(s, vx, vy);
    printf("%.4f\n", double(v)); // EXPECTED: 1.1250
    return 0;
}


#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "interpolation.h"

using namespace alglib;


int main(int argc, char **argv)
{
    //
    // We build bilinear spline for f(x,y)=x+2*y for (x,y) in [0,1].
    // Then we apply several transformations to this spline.
    //
    real_1d_array x = "[0.0, 1.0]";
    real_1d_array y = "[0.0, 1.0]";
    real_1d_array f = "[0.00,1.00,2.00,3.00]";
    spline2dinterpolant s;
    spline2dinterpolant snew;
    double v;
    spline2dbuildbilinearv(x, 2, y, 2, f, 1, s);

    // copy spline, apply transformation x:=2*xnew, y:=4*ynew
    // evaluate at (xnew,ynew) = (0.25,0.25) - should be same as (x,y)=(0.5,1.0)
    spline2dcopy(s, snew);
    spline2dlintransxy(snew, 2.0, 0.0, 4.0, 0.0);
    v = spline2dcalc(snew, 0.25, 0.25);
    printf("%.4f\n", double(v)); // EXPECTED: 2.500

    // copy spline, apply transformation SNew:=2*S+3
    spline2dcopy(s, snew);
    spline2dlintransf(snew, 2.0, 3.0);
    v = spline2dcalc(snew, 0.5, 1.0);
    printf("%.4f\n", double(v)); // EXPECTED: 8.000

    //
    // Same example, but for vector spline (f0,f1) = {x+2*y, 2*x+y}
    //
    real_1d_array f2 = "[0.00,0.00, 1.00,2.00, 2.00,1.00, 3.00,3.00]";
    real_1d_array vr;
    spline2dbuildbilinearv(x, 2, y, 2, f2, 2, s);

    // copy spline, apply transformation x:=2*xnew, y:=4*ynew
    spline2dcopy(s, snew);
    spline2dlintransxy(snew, 2.0, 0.0, 4.0, 0.0);
    spline2dcalcv(snew, 0.25, 0.25, vr);
    printf("%s\n", vr.tostring(4).c_str()); // EXPECTED: [2.500,2.000]

    // copy spline, apply transformation SNew:=2*S+3
    spline2dcopy(s, snew);
    spline2dlintransf(snew, 2.0, 3.0);
    spline2dcalcv(snew, 0.5, 1.0, vr);
    printf("%s\n", vr.tostring(4).c_str()); // EXPECTED: [8.000,7.000]
    return 0;
}


#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "interpolation.h"

using namespace alglib;


int main(int argc, char **argv)
{
    //
    // We build bilinear spline for f(x,y)=x+2*y+3*xy for (x,y) in [0,1].
    // Then we demonstrate how to unpack it.
    //
    real_1d_array x = "[0.0, 1.0]";
    real_1d_array y = "[0.0, 1.0]";
    real_1d_array f = "[0.00,1.00,2.00,6.00]";
    real_2d_array c;
    ae_int_t m;
    ae_int_t n;
    ae_int_t d;
    spline2dinterpolant s;

    // build spline
    spline2dbuildbilinearv(x, 2, y, 2, f, 1, s);

    // unpack and test
    spline2dunpackv(s, m, n, d, c);
    printf("%s\n", c.tostring(4).c_str()); // EXPECTED: [[0, 1, 0, 1, 0,2,0,0, 1,3,0,0, 0,0,0,0, 0,0,0,0 ]]
    return 0;
}


#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "interpolation.h"

using namespace alglib;


int main(int argc, char **argv)
{
    //
    // We build bilinear vector-valued spline (f0,f1) = {x+2*y, 2*x+y}
    // Spline is built using function values at 2x2 grid: (x,y)=[0,1]*[0,1]
    // Then we perform evaluation at (x,y)=(0.1,0.3)
    //
    real_1d_array x = "[0.0, 1.0]";
    real_1d_array y = "[0.0, 1.0]";
    real_1d_array f = "[0.00,0.00, 1.00,2.00, 2.00,1.00, 3.00,3.00]";
    spline2dinterpolant s;
    real_1d_array vr;
    spline2dbuildbilinearv(x, 2, y, 2, f, 2, s);
    spline2dcalcv(s, 0.1, 0.3, vr);
    printf("%s\n", vr.tostring(4).c_str()); // EXPECTED: [0.700,0.500]
    return 0;
}


spline3dinterpolant
spline3dbuildtrilinearv
spline3dcalc
spline3dcalcv
spline3dcalcvbuf
spline3dlintransf
spline3dlintransxyz
spline3dresampletrilinear
spline3dunpackv
spline3d_trilinear Trilinear spline interpolation
spline3d_vector Vector-valued trilinear spline interpolation
/************************************************************************* 3-dimensional spline interpolant *************************************************************************/
class spline3dinterpolant { };
/************************************************************************* This subroutine builds trilinear vector-valued spline. INPUT PARAMETERS: X - spline abscissas, array[0..N-1] Y - spline ordinates, array[0..M-1] Z - spline applicates, array[0..L-1] F - function values, array[0..M*N*L*D-1]: * first D elements store D values at (X[0],Y[0],Z[0]) * next D elements store D values at (X[1],Y[0],Z[0]) * next D elements store D values at (X[2],Y[0],Z[0]) * ... * next D elements store D values at (X[0],Y[1],Z[0]) * next D elements store D values at (X[1],Y[1],Z[0]) * next D elements store D values at (X[2],Y[1],Z[0]) * ... * next D elements store D values at (X[0],Y[0],Z[1]) * next D elements store D values at (X[1],Y[0],Z[1]) * next D elements store D values at (X[2],Y[0],Z[1]) * ... * general form - D function values at (X[i],Y[j]) are stored at F[D*(N*(M*K+J)+I)...D*(N*(M*K+J)+I)+D-1]. M,N, L - grid size, M>=2, N>=2, L>=2 D - vector dimension, D>=1 OUTPUT PARAMETERS: C - spline interpolant -- ALGLIB PROJECT -- Copyright 26.04.2012 by Bochkanov Sergey *************************************************************************/
void alglib::spline3dbuildtrilinearv( real_1d_array x, ae_int_t n, real_1d_array y, ae_int_t m, real_1d_array z, ae_int_t l, real_1d_array f, ae_int_t d, spline3dinterpolant& c);

Examples:   [1]  [2]  

/************************************************************************* This subroutine calculates the value of the trilinear or tricubic spline at the given point (X,Y,Z). INPUT PARAMETERS: C - coefficients table. Built by Spline3DBuildTrilinearV. X, Y, Z - point Result: S(x,y,z) -- ALGLIB PROJECT -- Copyright 26.04.2012 by Bochkanov Sergey *************************************************************************/
double alglib::spline3dcalc( spline3dinterpolant c, double x, double y, double z);

Examples:   [1]  

/************************************************************************* This subroutine calculates trilinear or tricubic vector-valued spline at the given point (X,Y,Z). INPUT PARAMETERS: C - spline interpolant. X, Y, Z - point OUTPUT PARAMETERS: F - array[D] which stores function values. F is out-parameter and it is reallocated after call to this function. In case you want to reuse previously allocated F, you may use Spline3DCalcVBuf(), which reallocates F only when it is too small. -- ALGLIB PROJECT -- Copyright 26.04.2012 by Bochkanov Sergey *************************************************************************/
void alglib::spline3dcalcv( spline3dinterpolant c, double x, double y, double z, real_1d_array& f);

Examples:   [1]  

/************************************************************************* This subroutine calculates trilinear or tricubic vector-valued spline at the given point (X,Y,Z). INPUT PARAMETERS: C - spline interpolant. X, Y, Z - point F - output buffer, possibly preallocated array. In case array size is large enough to store result, it is not reallocated. Array which is too short will be reallocated OUTPUT PARAMETERS: F - array[D] (or larger) which stores function values -- ALGLIB PROJECT -- Copyright 26.04.2012 by Bochkanov Sergey *************************************************************************/
void alglib::spline3dcalcvbuf( spline3dinterpolant c, double x, double y, double z, real_1d_array& f);
/************************************************************************* This subroutine performs linear transformation of the spline. INPUT PARAMETERS: C - spline interpolant. A, B - transformation coefficients: S2(x,y,z) = A*S(x,y,z) + B OUTPUT PARAMETERS: C - transformed spline -- ALGLIB PROJECT -- Copyright 26.04.2012 by Bochkanov Sergey *************************************************************************/
void alglib::spline3dlintransf(spline3dinterpolant c, double a, double b);
/************************************************************************* This subroutine performs linear transformation of the spline argument. INPUT PARAMETERS: C - spline interpolant AX, BX - transformation coefficients: x = AX*u + BX AY, BY - transformation coefficients: y = AY*v + BY AZ, BZ - transformation coefficients: z = AZ*w + BZ OUTPUT PARAMETERS: C - transformed spline -- ALGLIB PROJECT -- Copyright 26.04.2012 by Bochkanov Sergey *************************************************************************/
void alglib::spline3dlintransxyz( spline3dinterpolant c, double ax, double bx, double ay, double by, double az, double bz);
/************************************************************************* Trilinear spline resampling INPUT PARAMETERS: A - array[0..OldXCount*OldYCount*OldZCount-1], function values at the old grid, : A[0] x=0,y=0,z=0 A[1] x=1,y=0,z=0 A[..] ... A[..] x=oldxcount-1,y=0,z=0 A[..] x=0,y=1,z=0 A[..] ... ... OldZCount - old Z-count, OldZCount>1 OldYCount - old Y-count, OldYCount>1 OldXCount - old X-count, OldXCount>1 NewZCount - new Z-count, NewZCount>1 NewYCount - new Y-count, NewYCount>1 NewXCount - new X-count, NewXCount>1 OUTPUT PARAMETERS: B - array[0..NewXCount*NewYCount*NewZCount-1], function values at the new grid: B[0] x=0,y=0,z=0 B[1] x=1,y=0,z=0 B[..] ... B[..] x=newxcount-1,y=0,z=0 B[..] x=0,y=1,z=0 B[..] ... ... -- ALGLIB routine -- 26.04.2012 Copyright by Bochkanov Sergey *************************************************************************/
void alglib::spline3dresampletrilinear( real_1d_array a, ae_int_t oldzcount, ae_int_t oldycount, ae_int_t oldxcount, ae_int_t newzcount, ae_int_t newycount, ae_int_t newxcount, real_1d_array& b);
/************************************************************************* This subroutine unpacks three-dimensional spline into the coefficients table INPUT PARAMETERS: C - spline interpolant. Result: N - grid size (X) M - grid size (Y) L - grid size (Z) D - number of components SType- spline type. Currently, only one spline type is supported: trilinear spline, as indicated by SType=1. Tbl - spline coefficients: [0..(N-1)*(M-1)*(L-1)*D-1, 0..13]. For T=0..D-1 (component index), I = 0...N-2 (x index), J=0..M-2 (y index), K=0..L-2 (z index): Q := T + I*D + J*D*(N-1) + K*D*(N-1)*(M-1), Q-th row stores decomposition for T-th component of the vector-valued function Tbl[Q,0] = X[i] Tbl[Q,1] = X[i+1] Tbl[Q,2] = Y[j] Tbl[Q,3] = Y[j+1] Tbl[Q,4] = Z[k] Tbl[Q,5] = Z[k+1] Tbl[Q,6] = C000 Tbl[Q,7] = C100 Tbl[Q,8] = C010 Tbl[Q,9] = C110 Tbl[Q,10]= C001 Tbl[Q,11]= C101 Tbl[Q,12]= C011 Tbl[Q,13]= C111 On each grid cell the spline is equal to S(x,y,z) = SUM(Cijk*(t^i)*(u^j)*(v^k), i=0..1, j=0..1, k=0..1), where t = x-X[i], u = y-Y[j], v = z-Z[k]. NOTE: format of Tbl is given for SType=1. Future versions of ALGLIB can use different formats for different values of SType. -- ALGLIB PROJECT -- Copyright 26.04.2012 by Bochkanov Sergey *************************************************************************/
void alglib::spline3dunpackv( spline3dinterpolant c, ae_int_t& n, ae_int_t& m, ae_int_t& l, ae_int_t& d, ae_int_t& stype, real_2d_array& tbl);
#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "interpolation.h"

using namespace alglib;


int main(int argc, char **argv)
{
    //
    // We use trilinear spline to interpolate f(x,y,z)=x+xy+z sampled 
    // at (x,y,z) from [0.0, 1.0] X [0.0, 1.0] X [0.0, 1.0].
    //
    // We store x, y and z-values at local arrays with same names.
    // Function values are stored in the array F as follows:
    //     f[0]     (x,y,z) = (0,0,0)
    //     f[1]     (x,y,z) = (1,0,0)
    //     f[2]     (x,y,z) = (0,1,0)
    //     f[3]     (x,y,z) = (1,1,0)
    //     f[4]     (x,y,z) = (0,0,1)
    //     f[5]     (x,y,z) = (1,0,1)
    //     f[6]     (x,y,z) = (0,1,1)
    //     f[7]     (x,y,z) = (1,1,1)
    //
    real_1d_array x = "[0.0, 1.0]";
    real_1d_array y = "[0.0, 1.0]";
    real_1d_array z = "[0.0, 1.0]";
    real_1d_array f = "[0,1,0,2,1,2,1,3]";
    double vx = 0.50;
    double vy = 0.50;
    double vz = 0.50;
    double v;
    spline3dinterpolant s;

    // build spline
    spline3dbuildtrilinearv(x, 2, y, 2, z, 2, f, 1, s);

    // calculate S(0.5,0.5,0.5)
    v = spline3dcalc(s, vx, vy, vz);
    printf("%.4f\n", double(v)); // EXPECTED: 1.2500
    return 0;
}


#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "interpolation.h"

using namespace alglib;


int main(int argc, char **argv)
{
    //
    // We use trilinear vector-valued spline to interpolate {f0,f1}={x+xy+z,x+xy+yz+z}
    // sampled at (x,y,z) from [0.0, 1.0] X [0.0, 1.0] X [0.0, 1.0].
    //
    // We store x, y and z-values at local arrays with same names.
    // Function values are stored in the array F as follows:
    //     f[0]     f0, (x,y,z) = (0,0,0)
    //     f[1]     f1, (x,y,z) = (0,0,0)
    //     f[2]     f0, (x,y,z) = (1,0,0)
    //     f[3]     f1, (x,y,z) = (1,0,0)
    //     f[4]     f0, (x,y,z) = (0,1,0)
    //     f[5]     f1, (x,y,z) = (0,1,0)
    //     f[6]     f0, (x,y,z) = (1,1,0)
    //     f[7]     f1, (x,y,z) = (1,1,0)
    //     f[8]     f0, (x,y,z) = (0,0,1)
    //     f[9]     f1, (x,y,z) = (0,0,1)
    //     f[10]    f0, (x,y,z) = (1,0,1)
    //     f[11]    f1, (x,y,z) = (1,0,1)
    //     f[12]    f0, (x,y,z) = (0,1,1)
    //     f[13]    f1, (x,y,z) = (0,1,1)
    //     f[14]    f0, (x,y,z) = (1,1,1)
    //     f[15]    f1, (x,y,z) = (1,1,1)
    //
    real_1d_array x = "[0.0, 1.0]";
    real_1d_array y = "[0.0, 1.0]";
    real_1d_array z = "[0.0, 1.0]";
    real_1d_array f = "[0,0, 1,1, 0,0, 2,2, 1,1, 2,2, 1,2, 3,4]";
    double vx = 0.50;
    double vy = 0.50;
    double vz = 0.50;
    spline3dinterpolant s;

    // build spline
    spline3dbuildtrilinearv(x, 2, y, 2, z, 2, f, 2, s);

    // calculate S(0.5,0.5,0.5) - we have vector of values instead of single value
    real_1d_array v;
    spline3dcalcv(s, vx, vy, vz, v);
    printf("%s\n", v.tostring(4).c_str()); // EXPECTED: [1.2500,1.5000]
    return 0;
}
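
The two programs above are the official examples for this subpackage. The following additional program is a minimal sketch (not part of the shipped example set) illustrating spline3dlintransf() and spline3dunpackv(), assuming the prototypes documented above; the commented output values follow from the formulas in the comments.

#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "interpolation.h"

using namespace alglib;


int main(int argc, char **argv)
{
    //
    // Illustrative sketch (not an official ALGLIB example):
    // build the same trilinear spline for f(x,y,z)=x+xy+z as above,
    // apply the transformation S := 2*S+3, then unpack the coefficients.
    //
    real_1d_array x = "[0.0, 1.0]";
    real_1d_array y = "[0.0, 1.0]";
    real_1d_array z = "[0.0, 1.0]";
    real_1d_array f = "[0,1,0,2,1,2,1,3]";
    spline3dinterpolant s;
    ae_int_t n, m, l, d, stype;
    real_2d_array tbl;

    spline3dbuildtrilinearv(x, 2, y, 2, z, 2, f, 1, s);

    // S(0.5,0.5,0.5) was 1.25, so after S := 2*S+3 it becomes 5.5
    spline3dlintransf(s, 2.0, 3.0);
    printf("%.4f\n", double(spline3dcalc(s, 0.5, 0.5, 0.5))); // should print: 5.5000

    // unpack: for a trilinear spline SType=1 and Tbl has 14 columns per cell
    spline3dunpackv(s, n, m, l, d, stype, tbl);
    printf("%d\n", int(stype)); // should print: 1
    return 0;
}
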


onesamplesigntest
/************************************************************************* Sign test This test checks three hypotheses about the median of the given sample. The following tests are performed: * two-tailed test (null hypothesis - the median is equal to the given value) * left-tailed test (null hypothesis - the median is greater than or equal to the given value) * right-tailed test (null hypothesis - the median is less than or equal to the given value) Requirements: * the scale of measurement should be ordinal, interval or ratio (i.e. the test could not be applied to nominal variables). The test is non-parametric and doesn't require distribution X to be normal Input parameters: X - sample. Array whose index goes from 0 to N-1. N - size of the sample. Median - assumed median value. Output parameters: BothTails - p-value for two-tailed test. If BothTails is less than the given significance level the null hypothesis is rejected. LeftTail - p-value for left-tailed test. If LeftTail is less than the given significance level, the null hypothesis is rejected. RightTail - p-value for right-tailed test. If RightTail is less than the given significance level the null hypothesis is rejected. While calculating p-values high-precision binomial distribution approximation is used, so significance levels have about 15 exact digits. -- ALGLIB -- Copyright 08.09.2006 by Bochkanov Sergey *************************************************************************/
void alglib::onesamplesigntest( real_1d_array x, ae_int_t n, double median, double& bothtails, double& lefttail, double& righttail);
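
A minimal usage sketch (not an official ALGLIB example) for onesamplesigntest(), assuming the prototype above. The sample values are hypothetical, so the p-values are simply printed rather than asserted.

#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "statistics.h"

using namespace alglib;


int main(int argc, char **argv)
{
    //
    // Illustrative sketch: sign test of the hypothesis "median = 5.0"
    // against a small hypothetical sample.
    //
    real_1d_array x = "[2.0, 3.5, 4.0, 5.5, 6.0, 6.5, 7.0, 7.5, 8.0, 9.0]";
    double bothtails, lefttail, righttail;

    onesamplesigntest(x, 10, 5.0, bothtails, lefttail, righttail);

    // compare each p-value with your significance level (e.g. 0.05)
    printf("%.4f %.4f %.4f\n", bothtails, lefttail, righttail);
    return 0;
}
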
invstudenttdistribution
studenttdistribution
/************************************************************************* Functional inverse of Student's t distribution Given probability p, finds the argument t such that stdtr(k,t) is equal to p. ACCURACY: Tested at random 1 <= k <= 100. The "domain" refers to p: Relative error: arithmetic domain # trials peak rms IEEE .001,.999 25000 5.7e-15 8.0e-16 IEEE 10^-6,.001 25000 2.0e-12 2.9e-14 Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 1995, 2000 by Stephen L. Moshier *************************************************************************/
double alglib::invstudenttdistribution(ae_int_t k, double p);
/************************************************************************* Student's t distribution Computes the integral from minus infinity to t of the Student t distribution with integer k > 0 degrees of freedom: stdtr(k,t) = Gamma((k+1)/2) / ( sqrt(k*pi) * Gamma(k/2) ) * Integral[-inf..t] (1 + x^2/k)^(-(k+1)/2) dx Relation to incomplete beta integral: 1 - stdtr(k,t) = 0.5 * incbet( k/2, 1/2, z ) where z = k/(k + t**2). For t < -2, this is the method of computation. For higher t, a direct method is derived from integration by parts. Since the function is symmetric about t=0, the area under the right tail of the density is found by calling the function with -t instead of t. ACCURACY: Tested at random 1 <= k <= 25. The "domain" refers to t. Relative error: arithmetic domain # trials peak rms IEEE -100,-2 50000 5.9e-15 1.4e-15 IEEE -2,100 500000 2.7e-15 4.9e-17 Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 1995, 2000 by Stephen L. Moshier *************************************************************************/
double alglib::studenttdistribution(ae_int_t k, double t);
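
A minimal sketch (not an official ALGLIB example) showing that studenttdistribution() and invstudenttdistribution() invert each other, assuming the prototypes above; the commented CDF value is approximate.

#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "specialfunctions.h"

using namespace alglib;


int main(int argc, char **argv)
{
    //
    // Illustrative sketch: CDF of Student's t with k=5 degrees of freedom
    // at t=1.5, then the functional inverse applied to the result.
    //
    double p = studenttdistribution(5, 1.5);
    double t = invstudenttdistribution(5, p);
    printf("%.6f\n", p); // P(T<=1.5) for k=5, roughly 0.90
    printf("%.6f\n", t); // should recover approximately 1.500000
    return 0;
}
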
studentttest1
studentttest2
unequalvariancettest
/************************************************************************* One-sample t-test This test checks three hypotheses about the mean of the given sample. The following tests are performed: * two-tailed test (null hypothesis - the mean is equal to the given value) * left-tailed test (null hypothesis - the mean is greater than or equal to the given value) * right-tailed test (null hypothesis - the mean is less than or equal to the given value). The test is based on the assumption that a given sample has a normal distribution and an unknown dispersion. If the distribution sharply differs from normal, the test will work incorrectly. INPUT PARAMETERS: X - sample. Array whose index goes from 0 to N-1. N - size of sample, N>=0 Mean - assumed value of the mean. OUTPUT PARAMETERS: BothTails - p-value for two-tailed test. If BothTails is less than the given significance level the null hypothesis is rejected. LeftTail - p-value for left-tailed test. If LeftTail is less than the given significance level, the null hypothesis is rejected. RightTail - p-value for right-tailed test. If RightTail is less than the given significance level the null hypothesis is rejected. NOTE: this function correctly handles degenerate cases: * when N=0, all p-values are set to 1.0 * when variance of X[] is exactly zero, p-values are set to 1.0 or 0.0, depending on difference between sample mean and value of mean being tested. -- ALGLIB -- Copyright 08.09.2006 by Bochkanov Sergey *************************************************************************/
void alglib::studentttest1( real_1d_array x, ae_int_t n, double mean, double& bothtails, double& lefttail, double& righttail);
/************************************************************************* Two-sample pooled test This test checks three hypotheses about the mean of the given samples. The following tests are performed: * two-tailed test (null hypothesis - the means are equal) * left-tailed test (null hypothesis - the mean of the first sample is greater than or equal to the mean of the second sample) * right-tailed test (null hypothesis - the mean of the first sample is less than or equal to the mean of the second sample). Test is based on the following assumptions: * given samples have normal distributions * dispersions are equal * samples are independent. Input parameters: X - sample 1. Array whose index goes from 0 to N-1. N - size of sample. Y - sample 2. Array whose index goes from 0 to M-1. M - size of sample. Output parameters: BothTails - p-value for two-tailed test. If BothTails is less than the given significance level the null hypothesis is rejected. LeftTail - p-value for left-tailed test. If LeftTail is less than the given significance level, the null hypothesis is rejected. RightTail - p-value for right-tailed test. If RightTail is less than the given significance level the null hypothesis is rejected. NOTE: this function correctly handles degenerate cases: * when N=0 or M=0, all p-values are set to 1.0 * when both samples have exactly zero variance, p-values are set to 1.0 or 0.0, depending on difference between means. -- ALGLIB -- Copyright 18.09.2006 by Bochkanov Sergey *************************************************************************/
void alglib::studentttest2( real_1d_array x, ae_int_t n, real_1d_array y, ae_int_t m, double& bothtails, double& lefttail, double& righttail);
/************************************************************************* Two-sample unpooled test This test checks three hypotheses about the mean of the given samples. The following tests are performed: * two-tailed test (null hypothesis - the means are equal) * left-tailed test (null hypothesis - the mean of the first sample is greater than or equal to the mean of the second sample) * right-tailed test (null hypothesis - the mean of the first sample is less than or equal to the mean of the second sample). Test is based on the following assumptions: * given samples have normal distributions * samples are independent. Equality of variances is NOT required. Input parameters: X - sample 1. Array whose index goes from 0 to N-1. N - size of the sample. Y - sample 2. Array whose index goes from 0 to M-1. M - size of the sample. Output parameters: BothTails - p-value for two-tailed test. If BothTails is less than the given significance level the null hypothesis is rejected. LeftTail - p-value for left-tailed test. If LeftTail is less than the given significance level, the null hypothesis is rejected. RightTail - p-value for right-tailed test. If RightTail is less than the given significance level the null hypothesis is rejected. NOTE: this function correctly handles degenerate cases: * when N=0 or M=0, all p-values are set to 1.0 * when both samples have zero variance, p-values are set to 1.0 or 0.0, depending on difference between means. * when only one sample has zero variance, the test reduces to the 1-sample version. -- ALGLIB -- Copyright 18.09.2006 by Bochkanov Sergey *************************************************************************/
void alglib::unequalvariancettest( real_1d_array x, ae_int_t n, real_1d_array y, ae_int_t m, double& bothtails, double& lefttail, double& righttail);
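
A minimal sketch (not an official ALGLIB example) for the one- and two-sample t-tests documented above, assuming the prototypes; the samples are hypothetical and the p-values are only printed.

#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "statistics.h"

using namespace alglib;


int main(int argc, char **argv)
{
    //
    // Illustrative sketch: one-sample and two-sample t-tests
    // on small hypothetical samples.
    //
    real_1d_array x = "[4.2, 5.1, 4.8, 5.6, 4.9, 5.3, 5.0, 4.7]";
    real_1d_array y = "[5.9, 6.3, 5.7, 6.1, 6.0, 5.8]";
    double bothtails, lefttail, righttail;

    // H0: mean of X equals 5.0
    studentttest1(x, 8, 5.0, bothtails, lefttail, righttail);
    printf("%.4f %.4f %.4f\n", bothtails, lefttail, righttail);

    // H0: means of X and Y are equal (pooled test, equal variances assumed)
    studentttest2(x, 8, y, 6, bothtails, lefttail, righttail);
    printf("%.4f %.4f %.4f\n", bothtails, lefttail, righttail);

    // same hypothesis without the equal-variances assumption
    unequalvariancettest(x, 8, y, 6, bothtails, lefttail, righttail);
    printf("%.4f %.4f %.4f\n", bothtails, lefttail, righttail);
    return 0;
}
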
rmatrixsvd
/************************************************************************* Singular value decomposition of a rectangular matrix. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes one important improvement of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Generally, commercial ALGLIB is several times faster than open-source ! generic C edition, and many times faster than open-source C# edition. ! ! Multithreaded acceleration is only partially supported (some parts are ! optimized, but most - are not). ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. The algorithm calculates the singular value decomposition of a matrix of size MxN: A = U * S * V^T The algorithm finds the singular values and, optionally, matrices U and V^T. The algorithm can find both first min(M,N) columns of matrix U and rows of matrix V^T (singular vectors), and matrices U and V^T wholly (of sizes MxM and NxN respectively). Take into account that the subroutine does not return matrix V but V^T. Input parameters: A - matrix to be decomposed. Array whose indexes range within [0..M-1, 0..N-1]. M - number of rows in matrix A. N - number of columns in matrix A. UNeeded - 0, 1 or 2. See the description of the parameter U. VTNeeded - 0, 1 or 2. See the description of the parameter VT. AdditionalMemory - If the parameter: * equals 0, the algorithm doesn't use additional memory (lower requirements, lower performance). * equals 1, the algorithm uses additional memory of size min(M,N)*min(M,N) of real numbers. It often speeds up the algorithm. * equals 2, the algorithm uses additional memory of size M*min(M,N) of real numbers. It allows to get a maximum performance. The recommended value of the parameter is 2. Output parameters: W - contains singular values in descending order. U - if UNeeded=0, U isn't changed, the left singular vectors are not calculated. if Uneeded=1, U contains left singular vectors (first min(M,N) columns of matrix U). Array whose indexes range within [0..M-1, 0..Min(M,N)-1]. if UNeeded=2, U contains matrix U wholly. Array whose indexes range within [0..M-1, 0..M-1]. VT - if VTNeeded=0, VT isn't changed, the right singular vectors are not calculated. if VTNeeded=1, VT contains right singular vectors (first min(M,N) rows of matrix V^T). Array whose indexes range within [0..min(M,N)-1, 0..N-1]. if VTNeeded=2, VT contains matrix V^T wholly. Array whose indexes range within [0..N-1, 0..N-1]. -- ALGLIB -- Copyright 2005 by Bochkanov Sergey *************************************************************************/
bool alglib::rmatrixsvd( real_2d_array a, ae_int_t m, ae_int_t n, ae_int_t uneeded, ae_int_t vtneeded, ae_int_t additionalmemory, real_1d_array& w, real_2d_array& u, real_2d_array& vt); bool alglib::smp_rmatrixsvd( real_2d_array a, ae_int_t m, ae_int_t n, ae_int_t uneeded, ae_int_t vtneeded, ae_int_t additionalmemory, real_1d_array& w, real_2d_array& u, real_2d_array& vt);
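
A minimal sketch (not an official ALGLIB example) for rmatrixsvd(), assuming the prototype above. The matrix is chosen so that its singular values (2 and 1) are known in advance.

#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "linalg.h"

using namespace alglib;


int main(int argc, char **argv)
{
    //
    // Illustrative sketch: SVD of a 3x2 matrix with singular values 2 and 1.
    //
    real_2d_array a = "[[1,0],[0,2],[0,0]]";
    real_1d_array w;
    real_2d_array u;
    real_2d_array vt;

    // UNeeded=1, VTNeeded=1: thin U and V^T; AdditionalMemory=2 (recommended)
    bool ok = rmatrixsvd(a, 3, 2, 1, 1, 2, w, u, vt);
    printf("%d\n", int(ok));               // should print: 1
    printf("%s\n", w.tostring(3).c_str()); // approximately [2.000,1.000]
    return 0;
}
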
cmatrixlu
hpdmatrixcholesky
rmatrixlu
sparsecholeskyskyline
spdmatrixcholesky
spdmatrixcholeskyupdateadd1
spdmatrixcholeskyupdateadd1buf
spdmatrixcholeskyupdatefix
spdmatrixcholeskyupdatefixbuf
/************************************************************************* LU decomposition of a general complex matrix with row pivoting A is represented as A = P*L*U, where: * L is lower unitriangular matrix * U is upper triangular matrix * P = P0*P1*...*PK, K=min(M,N)-1, Pi - permutation matrix for I and Pivots[I] This is cache-oblivous implementation of LU decomposition. It is optimized for square matrices. As for rectangular matrices: * best case - M>>N * worst case - N>>M, small M, large N, matrix does not fit in CPU cache COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that LU decomposition is harder to ! parallelize than, say, matrix-matrix product - this algorithm has ! many internal synchronization points which can not be avoided. However ! parallelism starts to be profitable starting from N=1024, achieving ! near-linear speedup for N=4096 or higher. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS: A - array[0..M-1, 0..N-1]. M - number of rows in matrix A. N - number of columns in matrix A. OUTPUT PARAMETERS: A - matrices L and U in compact form: * L is stored under main diagonal * U is stored on and above main diagonal Pivots - permutation matrix in compact form. array[0..Min(M-1,N-1)]. -- ALGLIB routine -- 10.01.2010 Bochkanov Sergey *************************************************************************/
void alglib::cmatrixlu( complex_2d_array& a, ae_int_t m, ae_int_t n, integer_1d_array& pivots); void alglib::smp_cmatrixlu( complex_2d_array& a, ae_int_t m, ae_int_t n, integer_1d_array& pivots);
/************************************************************************* Cache-oblivious Cholesky decomposition The algorithm computes Cholesky decomposition of a Hermitian positive-definite matrix. The result of the algorithm is a representation of A as A=U'*U or A=L*L' (here X' denotes conj(X^T)). COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that Cholesky decomposition is harder ! to parallelize than, say, matrix-matrix product - this algorithm has ! several synchronization points which can not be avoided. However, ! parallelism starts to be profitable starting from N=500. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS: A - upper or lower triangle of a factorized matrix. array with elements [0..N-1, 0..N-1]. N - size of matrix A. IsUpper - if IsUpper=True, then A contains an upper triangle of a symmetric matrix, otherwise A contains a lower one. OUTPUT PARAMETERS: A - the result of factorization. If IsUpper=True, then the upper triangle contains matrix U, so that A = U'*U, and the elements below the main diagonal are not modified. Similarly, if IsUpper = False. RESULT: If the matrix is positive-definite, the function returns True. Otherwise, the function returns False. Contents of A is not determined in such case. -- ALGLIB routine -- 15.12.2009 Bochkanov Sergey *************************************************************************/
bool alglib::hpdmatrixcholesky( complex_2d_array& a, ae_int_t n, bool isupper); bool alglib::smp_hpdmatrixcholesky( complex_2d_array& a, ae_int_t n, bool isupper);
/************************************************************************* LU decomposition of a general real matrix with row pivoting A is represented as A = P*L*U, where: * L is lower unitriangular matrix * U is upper triangular matrix * P = P0*P1*...*PK, K=min(M,N)-1, Pi - permutation matrix for I and Pivots[I] This is cache-oblivous implementation of LU decomposition. It is optimized for square matrices. As for rectangular matrices: * best case - M>>N * worst case - N>>M, small M, large N, matrix does not fit in CPU cache COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that LU decomposition is harder to ! parallelize than, say, matrix-matrix product - this algorithm has ! many internal synchronization points which can not be avoided. However ! parallelism starts to be profitable starting from N=1024, achieving ! near-linear speedup for N=4096 or higher. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS: A - array[0..M-1, 0..N-1]. M - number of rows in matrix A. N - number of columns in matrix A. OUTPUT PARAMETERS: A - matrices L and U in compact form: * L is stored under main diagonal * U is stored on and above main diagonal Pivots - permutation matrix in compact form. array[0..Min(M-1,N-1)]. -- ALGLIB routine -- 10.01.2010 Bochkanov Sergey *************************************************************************/
void alglib::rmatrixlu( real_2d_array& a, ae_int_t m, ae_int_t n, integer_1d_array& pivots); void alglib::smp_rmatrixlu( real_2d_array& a, ae_int_t m, ae_int_t n, integer_1d_array& pivots);
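
A minimal sketch (not an official ALGLIB example) for rmatrixlu(), assuming the prototype above; the expected compact factors are computed by hand for this tiny matrix.

#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "linalg.h"

using namespace alglib;


int main(int argc, char **argv)
{
    //
    // Illustrative sketch: LU decomposition with row pivoting of a 2x2 matrix.
    // With partial pivoting the rows are swapped first, so
    // U = [[6,3],[0,1]] and L has 2/3 below the diagonal.
    //
    real_2d_array a = "[[4,3],[6,3]]";
    integer_1d_array pivots;

    rmatrixlu(a, 2, 2, pivots);

    // compact storage: U on/above the diagonal, L strictly below it
    printf("%s\n", a.tostring(4).c_str()); // approximately [[6.0000,3.0000],[0.6667,1.0000]]
    return 0;
}
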
/************************************************************************* Sparse Cholesky decomposition for skyline matrix using in-place algorithm without allocating additional storage. The algorithm computes Cholesky decomposition of a symmetric positive-definite sparse matrix. The result of the algorithm is a representation of A as A=U^T*U or A=L*L^T This function is a more efficient alternative to general, but slower SparseCholeskyX(), because it does not create temporary copies of the target. It performs factorization in-place, which gives best performance on low-profile matrices. Its drawback, however, is that it can not perform profile-reducing permutation of input matrix. INPUT PARAMETERS: A - sparse matrix in skyline storage (SKS) format. N - size of matrix A (can be smaller than actual size of A) IsUpper - if IsUpper=True, then factorization is performed on upper triangle. The other triangle is ignored (it may contain some data, but it is not changed). OUTPUT PARAMETERS: A - the result of factorization, stored in SKS. If IsUpper=True, then the upper triangle contains matrix U, such that A = U^T*U. Lower triangle is not changed. Similarly, if IsUpper = False. In this case L is returned, and we have A = L*(L^T). Note that THIS function does not perform permutation of rows to reduce bandwidth. RESULT: If the matrix is positive-definite, the function returns True. Otherwise, the function returns False. Contents of A is not determined in such case. NOTE: for performance reasons this function does NOT check that input matrix includes only finite values. It is your responsibility to make sure that there are no infinite or NAN values in the matrix. -- ALGLIB routine -- 16.01.2014 Bochkanov Sergey *************************************************************************/
bool alglib::sparsecholeskyskyline( sparsematrix a, ae_int_t n, bool isupper);
/************************************************************************* Cache-oblivious Cholesky decomposition The algorithm computes Cholesky decomposition of a symmetric positive- definite matrix. The result of an algorithm is a representation of A as A=U^T*U or A=L*L^T COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that Cholesky decomposition is harder ! to parallelize than, say, matrix-matrix product - this algorithm has ! several synchronization points which can not be avoided. However, ! parallelism starts to be profitable starting from N=500. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS: A - upper or lower triangle of a factorized matrix. array with elements [0..N-1, 0..N-1]. N - size of matrix A. IsUpper - if IsUpper=True, then A contains an upper triangle of a symmetric matrix, otherwise A contains a lower one. OUTPUT PARAMETERS: A - the result of factorization. If IsUpper=True, then the upper triangle contains matrix U, so that A = U^T*U, and the elements below the main diagonal are not modified. Similarly, if IsUpper = False. RESULT: If the matrix is positive-definite, the function returns True. Otherwise, the function returns False. Contents of A is not determined in such case. -- ALGLIB routine -- 15.12.2009 Bochkanov Sergey *************************************************************************/
bool alglib::spdmatrixcholesky( real_2d_array& a, ae_int_t n, bool isupper); bool alglib::smp_spdmatrixcholesky( real_2d_array& a, ae_int_t n, bool isupper);
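
A minimal sketch (not an official ALGLIB example) for spdmatrixcholesky(), assuming the prototype above. For A=[[4,2],[2,3]] the upper factor is U=[[2,1],[0,sqrt(2)]].

#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "linalg.h"

using namespace alglib;


int main(int argc, char **argv)
{
    //
    // Illustrative sketch: upper Cholesky factor of a 2x2 SPD matrix.
    // Only the upper triangle is overwritten; the lower one keeps old values.
    //
    real_2d_array a = "[[4,2],[2,3]]";

    bool ok = spdmatrixcholesky(a, 2, true);
    printf("%d\n", int(ok));               // should print: 1
    printf("%s\n", a.tostring(4).c_str()); // approximately [[2.0000,1.0000],[2.0000,1.4142]]
    return 0;
}
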
/************************************************************************* Update of Cholesky decomposition: rank-1 update to original A. This function uses an internally allocated buffer which is not saved between subsequent calls. So, if you perform a lot of subsequent updates, we recommend you to use the "buffered" version of this function: SPDMatrixCholeskyUpdateAdd1Buf(). INPUT PARAMETERS: A - upper or lower Cholesky factor. array with elements [0..N-1, 0..N-1]. Exception is thrown if array size is too small. N - size of matrix A, N>0 IsUpper - if IsUpper=True, then A contains upper Cholesky factor; otherwise A contains a lower one. U - array[N], rank-1 update to A: A_mod = A + u*u' Exception is thrown if array size is too small. OUTPUT PARAMETERS: A - updated factorization. If IsUpper=True, then the upper triangle contains matrix U, and the elements below the main diagonal are not modified. Similarly, if IsUpper = False. NOTE: this function always succeeds, so it does not return completion code NOTE: this function checks sizes of input arrays, but it does NOT check for presence of infinities or NAN's. -- ALGLIB -- 03.02.2014 Sergey Bochkanov *************************************************************************/
void alglib::spdmatrixcholeskyupdateadd1( real_2d_array& a, ae_int_t n, bool isupper, real_1d_array u);
/************************************************************************* Update of Cholesky decomposition: rank-1 update to original A. "Buffered" version which uses preallocated buffer which is saved between subsequent function calls. See comments for SPDMatrixCholeskyUpdateAdd1() for more information. INPUT PARAMETERS: A - upper or lower Cholesky factor. array with elements [0..N-1, 0..N-1]. Exception is thrown if array size is too small. N - size of matrix A, N>0 IsUpper - if IsUpper=True, then A contains upper Cholesky factor; otherwise A contains a lower one. U - array[N], rank-1 update to A: A_mod = A + u*u' Exception is thrown if array size is too small. BufR - possibly preallocated buffer; automatically resized if needed. It is recommended to reuse this buffer if you perform a lot of subsequent decompositions. OUTPUT PARAMETERS: A - updated factorization. If IsUpper=True, then the upper triangle contains matrix U, and the elements below the main diagonal are not modified. Similarly, if IsUpper = False. -- ALGLIB -- 03.02.2014 Sergey Bochkanov *************************************************************************/
void alglib::spdmatrixcholeskyupdateadd1buf( real_2d_array& a, ae_int_t n, bool isupper, real_1d_array u, real_1d_array& bufr);
/************************************************************************* Update of Cholesky decomposition: "fixing" some variables. This function uses internally allocated buffer which is not saved between subsequent calls. So, if you perform a lot of subsequent updates, we recommend you to use "buffered" version of this function: SPDMatrixCholeskyUpdateFixBuf(). "FIXING" EXPLAINED: Suppose we have N*N positive definite matrix A. "Fixing" some variable means filling corresponding row/column of A by zeros, and setting diagonal element to 1. For example, if we fix the 2nd variable in a 4*4 matrix A, it becomes Af: A = [ A00 A01 A02 A03 ; A10 A11 A12 A13 ; A20 A21 A22 A23 ; A30 A31 A32 A33 ] => Af = [ Af00 0 Af02 Af03 ; 0 1 0 0 ; Af20 0 Af22 Af23 ; Af30 0 Af32 Af33 ] If we have Cholesky decomposition of A, it must be recalculated after variables were fixed. However, it is possible to use efficient algorithm, which needs O(K*N^2) time to "fix" K variables, given Cholesky decomposition of original, "unfixed" A. INPUT PARAMETERS: A - upper or lower Cholesky factor. array with elements [0..N-1, 0..N-1]. Exception is thrown if array size is too small. N - size of matrix A, N>0 IsUpper - if IsUpper=True, then A contains upper Cholesky factor; otherwise A contains a lower one. Fix - array[N], I-th element is True if I-th variable must be fixed. Exception is thrown if array size is too small. OUTPUT PARAMETERS: A - updated factorization. If IsUpper=True, then the upper triangle contains matrix U, and the elements below the main diagonal are not modified. Similarly, if IsUpper = False. NOTE: this function always succeeds, so it does not return completion code NOTE: this function checks sizes of input arrays, but it does NOT check for presence of infinities or NAN's. NOTE: this function is efficient only for moderate amount of updated variables - say, 0.1*N or 0.3*N. For larger amount of variables it will still work, but you may get better performance with straightforward Cholesky. -- ALGLIB -- 03.02.2014 Sergey Bochkanov *************************************************************************/
void alglib::spdmatrixcholeskyupdatefix( real_2d_array& a, ae_int_t n, bool isupper, boolean_1d_array fix);
/************************************************************************* Update of Cholesky decomposition: "fixing" some variables. "Buffered" version which uses preallocated buffer which is saved between subsequent function calls. See comments for SPDMatrixCholeskyUpdateFix() for more information. INPUT PARAMETERS: A - upper or lower Cholesky factor. array with elements [0..N-1, 0..N-1]. Exception is thrown if array size is too small. N - size of matrix A, N>0 IsUpper - if IsUpper=True, then A contains upper Cholesky factor; otherwise A contains a lower one. Fix - array[N], I-th element is True if I-th variable must be fixed. Exception is thrown if array size is too small. BufR - possibly preallocated buffer; automatically resized if needed. It is recommended to reuse this buffer if you perform a lot of subsequent decompositions. OUTPUT PARAMETERS: A - updated factorization. If IsUpper=True, then the upper triangle contains matrix U, and the elements below the main diagonal are not modified. Similarly, if IsUpper = False. -- ALGLIB -- 03.02.2014 Sergey Bochkanov *************************************************************************/
void alglib::spdmatrixcholeskyupdatefixbuf( real_2d_array& a, ae_int_t n, bool isupper, boolean_1d_array fix, real_1d_array& bufr);
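
A minimal sketch (not an official ALGLIB example) combining spdmatrixcholesky() with spdmatrixcholeskyupdateadd1(), assuming the prototypes above; the updated factor is checked against a hand computation.

#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "linalg.h"

using namespace alglib;


int main(int argc, char **argv)
{
    //
    // Illustrative sketch: rank-1 update of an existing Cholesky factorization.
    // A = [[4,2],[2,3]], u = [1,1], so A + u*u' = [[5,3],[3,4]] and its upper
    // factor is approximately [[2.2361,1.3416],[0,1.4832]].
    //
    real_2d_array a = "[[4,2],[2,3]]";
    real_1d_array u = "[1,1]";

    spdmatrixcholesky(a, 2, true);              // factor the original matrix
    spdmatrixcholeskyupdateadd1(a, 2, true, u); // update the factor in-place

    // lower triangle keeps the original values (here 2.0)
    printf("%s\n", a.tostring(4).c_str()); // approximately [[2.2361,1.3416],[2.0000,1.4832]]
    return 0;
}
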
hyperbolicsinecosineintegrals
sinecosineintegrals
/************************************************************************* Hyperbolic sine and cosine integrals Approximates the integrals Chi(x) = eul + ln x + Integral[0..x] (cosh t - 1)/t dt, Shi(x) = Integral[0..x] (sinh t)/t dt, where eul = 0.57721566490153286061 is Euler's constant. The integrals are evaluated by power series for x < 8 and by Chebyshev expansions for x between 8 and 88. For large x, both functions approach exp(x)/2x. Arguments greater than 88 in magnitude return MAXNUM. ACCURACY: Test interval 0 to 88. Relative error: arithmetic function # trials peak rms IEEE Shi 30000 6.9e-16 1.6e-16 Absolute error, except relative when |Chi| > 1: IEEE Chi 30000 8.4e-16 1.4e-16 Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 2000 by Stephen L. Moshier *************************************************************************/
void alglib::hyperbolicsinecosineintegrals( double x, double& shi, double& chi);
/************************************************************************* Sine and cosine integrals Evaluates the integrals Ci(x) = eul + ln x + Integral[0..x] (cos t - 1)/t dt, Si(x) = Integral[0..x] (sin t)/t dt, where eul = 0.57721566490153286061 is Euler's constant. The integrals are approximated by rational functions. For x > 8 auxiliary functions f(x) and g(x) are employed such that Ci(x) = f(x) sin(x) - g(x) cos(x) Si(x) = pi/2 - f(x) cos(x) - g(x) sin(x) ACCURACY: Test interval = [0,50]. Absolute error, except relative when > 1: arithmetic function # trials peak rms IEEE Si 30000 4.4e-16 7.3e-17 IEEE Ci 30000 6.9e-16 5.1e-17 Cephes Math Library Release 2.1: January, 1989 Copyright 1984, 1987, 1989 by Stephen L. Moshier *************************************************************************/
void alglib::sinecosineintegrals(double x, double& si, double& ci);
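
A minimal sketch (not an official ALGLIB example) for the two integral functions above; the reference values are standard tabulated values, quoted approximately.

#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "specialfunctions.h"

using namespace alglib;


int main(int argc, char **argv)
{
    //
    // Illustrative sketch: Si/Ci and Shi/Chi evaluated at x=1.
    //
    double si, ci, shi, chi;

    sinecosineintegrals(1.0, si, ci);
    printf("%.6f %.6f\n", si, ci);   // approximately 0.946083 0.337404

    hyperbolicsinecosineintegrals(1.0, shi, chi);
    printf("%.6f %.6f\n", shi, chi); // approximately 1.057251 0.837867
    return 0;
}
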
ftest
onesamplevariancetest
/************************************************************************* Two-sample F-test This test checks three hypotheses about dispersions of the given samples. The following tests are performed: * two-tailed test (null hypothesis - the dispersions are equal) * left-tailed test (null hypothesis - the dispersion of the first sample is greater than or equal to the dispersion of the second sample). * right-tailed test (null hypothesis - the dispersion of the first sample is less than or equal to the dispersion of the second sample) The test is based on the following assumptions: * the given samples have normal distributions * the samples are independent. Input parameters: X - sample 1. Array whose index goes from 0 to N-1. N - sample size. Y - sample 2. Array whose index goes from 0 to M-1. M - sample size. Output parameters: BothTails - p-value for two-tailed test. If BothTails is less than the given significance level the null hypothesis is rejected. LeftTail - p-value for left-tailed test. If LeftTail is less than the given significance level, the null hypothesis is rejected. RightTail - p-value for right-tailed test. If RightTail is less than the given significance level the null hypothesis is rejected. -- ALGLIB -- Copyright 19.09.2006 by Bochkanov Sergey *************************************************************************/
void alglib::ftest( real_1d_array x, ae_int_t n, real_1d_array y, ae_int_t m, double& bothtails, double& lefttail, double& righttail);
/************************************************************************* One-sample chi-square test This test checks three hypotheses about the dispersion of the given sample The following tests are performed: * two-tailed test (null hypothesis - the dispersion equals the given number) * left-tailed test (null hypothesis - the dispersion is greater than or equal to the given number) * right-tailed test (null hypothesis - dispersion is less than or equal to the given number). Test is based on the following assumptions: * the given sample has a normal distribution. Input parameters: X - sample 1. Array whose index goes from 0 to N-1. N - size of the sample. Variance - dispersion value to compare with. Output parameters: BothTails - p-value for two-tailed test. If BothTails is less than the given significance level the null hypothesis is rejected. LeftTail - p-value for left-tailed test. If LeftTail is less than the given significance level, the null hypothesis is rejected. RightTail - p-value for right-tailed test. If RightTail is less than the given significance level the null hypothesis is rejected. -- ALGLIB -- Copyright 19.09.2006 by Bochkanov Sergey *************************************************************************/
void alglib::onesamplevariancetest( real_1d_array x, ae_int_t n, double variance, double& bothtails, double& lefttail, double& righttail);
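
A minimal sketch (not an official ALGLIB example) for the variance tests documented above, assuming the prototypes; the samples are hypothetical and the p-values are only printed.

#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "statistics.h"

using namespace alglib;


int main(int argc, char **argv)
{
    //
    // Illustrative sketch: F-test for equality of dispersions and
    // one-sample chi-square test for a given dispersion value.
    //
    real_1d_array x = "[4.2, 5.1, 4.8, 5.6, 4.9, 5.3, 5.0, 4.7]";
    real_1d_array y = "[3.9, 6.4, 4.1, 6.6, 4.3, 6.2]";
    double bothtails, lefttail, righttail;

    // H0: dispersions of X and Y are equal
    ftest(x, 8, y, 6, bothtails, lefttail, righttail);
    printf("%.4f %.4f %.4f\n", bothtails, lefttail, righttail);

    // H0: dispersion of X equals 0.25
    onesamplevariancetest(x, 8, 0.25, bothtails, lefttail, righttail);
    printf("%.4f %.4f %.4f\n", bothtails, lefttail, righttail);
    return 0;
}
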
wilcoxonsignedranktest
/************************************************************************* Wilcoxon signed-rank test This test checks three hypotheses about the median of the given sample. The following tests are performed: * two-tailed test (null hypothesis - the median is equal to the given value) * left-tailed test (null hypothesis - the median is greater than or equal to the given value) * right-tailed test (null hypothesis - the median is less than or equal to the given value) Requirements: * the scale of measurement should be ordinal, interval or ratio (i.e. the test could not be applied to nominal variables). * the distribution should be continuous and symmetric relative to its median. * number of distinct values in the X array should be greater than 4 The test is non-parametric and doesn't require distribution X to be normal Input parameters: X - sample. Array whose index goes from 0 to N-1. N - size of the sample. Median - assumed median value. Output parameters: BothTails - p-value for two-tailed test. If BothTails is less than the given significance level the null hypothesis is rejected. LeftTail - p-value for left-tailed test. If LeftTail is less than the given significance level, the null hypothesis is rejected. RightTail - p-value for right-tailed test. If RightTail is less than the given significance level the null hypothesis is rejected. To calculate p-values, a special approximation is used. This method lets us calculate p-values with two decimal places in the interval [0.0001, 1]. "Two decimal places" does not sound very impressive, but in practice the relative error of less than 1% is enough to make a decision. There is no approximation outside the [0.0001, 1] interval. Therefore, if the significance level lies outside this interval, the test returns 0.0001. -- ALGLIB -- Copyright 08.09.2006 by Bochkanov Sergey *************************************************************************/
void alglib::wilcoxonsignedranktest( real_1d_array x, ae_int_t n, double e, double& bothtails, double& lefttail, double& righttail);
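
A minimal sketch (not an official ALGLIB example) for wilcoxonsignedranktest(), assuming the prototype above; note that the third argument of the prototype (named e) carries the assumed median described in the comments.

#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "statistics.h"

using namespace alglib;


int main(int argc, char **argv)
{
    //
    // Illustrative sketch: signed-rank test of "median = 5.0" against a
    // hypothetical sample with more than 4 distinct values (as required).
    //
    real_1d_array x = "[2.0, 3.5, 4.0, 5.5, 6.0, 6.5, 7.0, 7.5, 8.0, 9.0]";
    double bothtails, lefttail, righttail;

    wilcoxonsignedranktest(x, 10, 5.0, bothtails, lefttail, righttail);

    // compare each p-value with your significance level (e.g. 0.05)
    printf("%.4f %.4f %.4f\n", bothtails, lefttail, righttail);
    return 0;
}
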
xdebugrecord1
xdebugb1appendcopy
xdebugb1count
xdebugb1not
xdebugb1outeven
xdebugb2count
xdebugb2not
xdebugb2outsin
xdebugb2transpose
xdebugc1appendcopy
xdebugc1neg
xdebugc1outeven
xdebugc1sum
xdebugc2neg
xdebugc2outsincos
xdebugc2sum
xdebugc2transpose
xdebugi1appendcopy
xdebugi1neg
xdebugi1outeven
xdebugi1sum
xdebugi2neg
xdebugi2outsin
xdebugi2sum
xdebugi2transpose
xdebuginitrecord1
xdebugmaskedbiasedproductsum
xdebugr1appendcopy
xdebugr1neg
xdebugr1outeven
xdebugr1sum
xdebugr2neg
xdebugr2outsin
xdebugr2sum
xdebugr2transpose
/************************************************************************* *************************************************************************/
class xdebugrecord1 { ae_int_t i; alglib::complex c; real_1d_array a; };
/************************************************************************* This is debug function intended for testing ALGLIB interface generator. Never use it in any real life project. Appends copy of array to itself. Array is passed using "var" convention. -- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/
void alglib::xdebugb1appendcopy(boolean_1d_array& a);
/************************************************************************* This is debug function intended for testing ALGLIB interface generator. Never use it in any real life project. Counts number of True values in the boolean 1D array. -- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/
ae_int_t alglib::xdebugb1count(boolean_1d_array a);
/************************************************************************* This is debug function intended for testing ALGLIB interface generator. Never use it in any real life project. Replace all values in array by NOT(a[i]). Array is passed using "shared" convention. -- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/
void alglib::xdebugb1not(boolean_1d_array& a);
/************************************************************************* This is debug function intended for testing ALGLIB interface generator. Never use it in any real life project. Generate N-element array with even-numbered elements set to True. Array is passed using "out" convention. -- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/
void alglib::xdebugb1outeven(ae_int_t n, boolean_1d_array& a);
/************************************************************************* This is debug function intended for testing ALGLIB interface generator. Never use it in any real life project. Counts number of True values in the boolean 2D array. -- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/
ae_int_t alglib::xdebugb2count(boolean_2d_array a);
/************************************************************************* This is debug function intended for testing ALGLIB interface generator. Never use it in any real life project. Replace all values in array by NOT(a[i]). Array is passed using "shared" convention. -- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/
void alglib::xdebugb2not(boolean_2d_array& a);
/************************************************************************* This is debug function intended for testing ALGLIB interface generator. Never use it in any real life project. Generate MxN matrix with elements set to "Sin(3*I+5*J)>0" Array is passed using "out" convention. -- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/
void alglib::xdebugb2outsin(ae_int_t m, ae_int_t n, boolean_2d_array& a);
/************************************************************************* This is debug function intended for testing ALGLIB interface generator. Never use it in any real life project. Transposes array. Array is passed using "var" convention. -- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/
void alglib::xdebugb2transpose(boolean_2d_array& a);
/************************************************************************* This is debug function intended for testing ALGLIB interface generator. Never use it in any real life project. Appends copy of array to itself. Array is passed using "var" convention. -- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/
void alglib::xdebugc1appendcopy(complex_1d_array& a);
/************************************************************************* This is debug function intended for testing ALGLIB interface generator. Never use it in any real life project. Replace all values in array by -A[I] Array is passed using "shared" convention. -- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/
void alglib::xdebugc1neg(complex_1d_array& a);
/************************************************************************* This is debug function intended for testing ALGLIB interface generator. Never use it in any real life project. Generate N-element array with even-numbered A[K] set to (x,y) = (K*0.25, K*0.125) and odd-numbered ones are set to 0. Array is passed using "out" convention. -- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/
void alglib::xdebugc1outeven(ae_int_t n, complex_1d_array& a);
/************************************************************************* This is debug function intended for testing ALGLIB interface generator. Never use it in any real life project. Returns sum of elements in the array. -- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/
alglib::complex alglib::xdebugc1sum(complex_1d_array a);
/************************************************************************* This is debug function intended for testing ALGLIB interface generator. Never use it in any real life project. Replace all values in array by -a[i,j] Array is passed using "shared" convention. -- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/
void alglib::xdebugc2neg(complex_2d_array& a);
/************************************************************************* This is debug function intended for testing ALGLIB interface generator. Never use it in any real life project. Generate MxN matrix with elements set to "Sin(3*I+5*J),Cos(3*I+5*J)" Array is passed using "out" convention. -- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/
void alglib::xdebugc2outsincos( ae_int_t m, ae_int_t n, complex_2d_array& a);
/************************************************************************* This is debug function intended for testing ALGLIB interface generator. Never use it in any real life project. Returns sum of elements in the array. -- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/
alglib::complex alglib::xdebugc2sum(complex_2d_array a);
/************************************************************************* This is debug function intended for testing ALGLIB interface generator. Never use it in any real life project. Transposes array. Array is passed using "var" convention. -- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/
void alglib::xdebugc2transpose(complex_2d_array& a);
/************************************************************************* This is debug function intended for testing ALGLIB interface generator. Never use it in any real life project. Appends copy of array to itself. Array is passed using "var" convention. -- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/
void alglib::xdebugi1appendcopy(integer_1d_array& a);
/************************************************************************* This is debug function intended for testing ALGLIB interface generator. Never use it in any real life project. Replace all values in array by -A[I] Array is passed using "shared" convention. -- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/
void alglib::xdebugi1neg(integer_1d_array& a);
/************************************************************************* This is debug function intended for testing ALGLIB interface generator. Never use it in any real life project. Generate N-element array with even-numbered A[I] set to I, and odd-numbered ones set to 0. Array is passed using "out" convention. -- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/
void alglib::xdebugi1outeven(ae_int_t n, integer_1d_array& a);
/************************************************************************* This is a debug function intended for testing the ALGLIB interface generator. Never use it in any real-life project. Returns the sum of the elements in the array. -- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/
ae_int_t alglib::xdebugi1sum(integer_1d_array a);
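With the "out" convention the function allocates the output array itself, so the caller can pass an empty array. A minimal sketch combining xdebugi1outeven and xdebugi1sum (same include assumption):

#include <cstdio>
#include "alglibmisc.h"   // assumed location of the xdebug subpackage

int main()
{
    alglib::integer_1d_array a;                     // no setlength() needed - "out" argument
    alglib::xdebugi1outeven(5, a);                  // a becomes [0,0,2,0,4]
    printf("%ld\n", (long)alglib::xdebugi1sum(a));  // expected output: 6
    return 0;
}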
/************************************************************************* This is a debug function intended for testing the ALGLIB interface generator. Never use it in any real-life project. Replaces all values in the array with -a[i,j]. The array is passed using the "shared" convention. -- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/
void alglib::xdebugi2neg(integer_2d_array& a);
/************************************************************************* This is a debug function intended for testing the ALGLIB interface generator. Never use it in any real-life project. Generates an MxN matrix with elements set to "Sign(Sin(3*I+5*J))". The array is passed using the "out" convention. -- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/
void alglib::xdebugi2outsin(ae_int_t m, ae_int_t n, integer_2d_array& a);
/************************************************************************* This is a debug function intended for testing the ALGLIB interface generator. Never use it in any real-life project. Returns the sum of the elements in the array. -- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/
ae_int_t alglib::xdebugi2sum(integer_2d_array a);
/************************************************************************* This is a debug function intended for testing the ALGLIB interface generator. Never use it in any real-life project. Transposes the array. The array is passed using the "var" convention. -- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/
void alglib::xdebugi2transpose(integer_2d_array& a);
/************************************************************************* This is a debug function intended for testing the ALGLIB interface generator. Never use it in any real-life project. Creates and returns an XDebugRecord1 structure: * the integer and complex fields of Rec1 are set to 1 and 1+i respectively * the array field of Rec1 is set to [2,3] -- ALGLIB -- Copyright 27.05.2014 by Bochkanov Sergey *************************************************************************/
void alglib::xdebuginitrecord1(xdebugrecord1& rec1);
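A minimal sketch of reading the returned structure. The field names i, c and a used below are assumptions about the xdebugrecord1 layout and may differ in your ALGLIB version; the include is the same assumption as above.

#include <cstdio>
#include "alglibmisc.h"   // assumed location of the xdebug subpackage

int main()
{
    alglib::xdebugrecord1 rec;
    alglib::xdebuginitrecord1(rec);
    // per the description above: integer field = 1, complex field = 1+i, array field = [2,3]
    printf("%ld\n", (long)rec.i);              // assumed field name for the integer field
    printf("%.1f%+.1fi\n", rec.c.x, rec.c.y);  // assumed field name for the complex field
    printf("%ld\n", (long)rec.a.length());     // assumed field name for the array field
    return 0;
}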
/************************************************************************* This is a debug function intended for testing the ALGLIB interface generator. Never use it in any real-life project. Returns the sum of a[i,j]*(1+b[i,j]) over all elements for which c[i,j] is True. -- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/
double alglib::xdebugmaskedbiasedproductsum( ae_int_t m, ae_int_t n, real_2d_array a, real_2d_array b, boolean_2d_array c);
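A minimal sketch of the masked sum: only elements with c[i,j]=True contribute a[i,j]*(1+b[i,j]). The string constructor for boolean_2d_array and the include are the assumptions here.

#include <cstdio>
#include "alglibmisc.h"   // assumed location of the xdebug subpackage

int main()
{
    alglib::real_2d_array a("[[2,3]]");
    alglib::real_2d_array b("[[1,7]]");
    alglib::boolean_2d_array c("[[true,false]]");
    // only the (0,0) element is counted: 2*(1+1) = 4
    double s = alglib::xdebugmaskedbiasedproductsum(1, 2, a, b, c);
    printf("%.1f\n", s);                    // expected output: 4.0
    return 0;
}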
/************************************************************************* This is a debug function intended for testing the ALGLIB interface generator. Never use it in any real-life project. Appends a copy of the array to itself. The array is passed using the "var" convention. -- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/
void alglib::xdebugr1appendcopy(real_1d_array& a);
/************************************************************************* This is a debug function intended for testing the ALGLIB interface generator. Never use it in any real-life project. Replaces all values in the array with -A[I]. The array is passed using the "shared" convention. -- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/
void alglib::xdebugr1neg(real_1d_array& a);
/************************************************************************* This is a debug function intended for testing the ALGLIB interface generator. Never use it in any real-life project. Generates an N-element array with even-numbered A[I] set to I*0.25 and odd-numbered elements set to 0. The array is passed using the "out" convention. -- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/
void alglib::xdebugr1outeven(ae_int_t n, real_1d_array& a);
/************************************************************************* This is a debug function intended for testing the ALGLIB interface generator. Never use it in any real-life project. Returns the sum of the elements in the array. -- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/
double alglib::xdebugr1sum(real_1d_array a);
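As with the integer variant, a minimal sketch combining xdebugr1outeven and xdebugr1sum (same include assumption):

#include <cstdio>
#include "alglibmisc.h"   // assumed location of the xdebug subpackage

int main()
{
    alglib::real_1d_array a;
    alglib::xdebugr1outeven(4, a);            // a becomes [0.0, 0.0, 0.5, 0.0]
    printf("%.2f\n", alglib::xdebugr1sum(a)); // expected output: 0.50
    return 0;
}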
/************************************************************************* This is a debug function intended for testing the ALGLIB interface generator. Never use it in any real-life project. Replaces all values in the array with -a[i,j]. The array is passed using the "shared" convention. -- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/
void alglib::xdebugr2neg(real_2d_array& a);
/************************************************************************* This is a debug function intended for testing the ALGLIB interface generator. Never use it in any real-life project. Generates an MxN matrix with elements set to "Sin(3*I+5*J)". The array is passed using the "out" convention. -- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/
void alglib::xdebugr2outsin(ae_int_t m, ae_int_t n, real_2d_array& a);
/************************************************************************* This is a debug function intended for testing the ALGLIB interface generator. Never use it in any real-life project. Returns the sum of the elements in the array. -- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/
double alglib::xdebugr2sum(real_2d_array a);
/************************************************************************* This is a debug function intended for testing the ALGLIB interface generator. Never use it in any real-life project. Transposes the array. The array is passed using the "var" convention. -- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/
void alglib::xdebugr2transpose(real_2d_array& a);
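A minimal sketch combining xdebugr2outsin and xdebugr2transpose; it only checks the resulting shape (same include assumption):

#include <cstdio>
#include "alglibmisc.h"   // assumed location of the xdebug subpackage

int main()
{
    alglib::real_2d_array a;
    alglib::xdebugr2outsin(2, 3, a);        // 2x3 matrix with a[i][j] = Sin(3*i+5*j)
    alglib::xdebugr2transpose(a);           // in-place transpose, now 3x2
    printf("%ldx%ld\n", (long)a.rows(), (long)a.cols()); // expected output: 3x2
    return 0;
}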
cpp/src/solvers.cpp
/************************************************************************* ALGLIB 3.11.0 (source code generated 2017-05-11) Copyright (c) Sergey Bochkanov (ALGLIB project). >>> SOURCE LICENSE >>> This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation (www.fsf.org); either version 2 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. A copy of the GNU General Public License is available at http://www.fsf.org/licensing/licenses >>> END OF LICENSE >>> *************************************************************************/ #include "stdafx.h" #include "solvers.h" // disable some irrelevant warnings #if (AE_COMPILER==AE_MSVC) #pragma warning(disable:4100) #pragma warning(disable:4127) #pragma warning(disable:4702) #pragma warning(disable:4996) #endif using namespace std; ///////////////////////////////////////////////////////////////////////// // // THIS SECTION CONTAINS IMPLEMENTATION OF C++ INTERFACE // ///////////////////////////////////////////////////////////////////////// namespace alglib { /************************************************************************* *************************************************************************/ _densesolverreport_owner::_densesolverreport_owner() { p_struct = (alglib_impl::densesolverreport*)alglib_impl::ae_malloc(sizeof(alglib_impl::densesolverreport), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_densesolverreport_init(p_struct, NULL); } _densesolverreport_owner::_densesolverreport_owner(const _densesolverreport_owner &rhs) { p_struct = (alglib_impl::densesolverreport*)alglib_impl::ae_malloc(sizeof(alglib_impl::densesolverreport), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_densesolverreport_init_copy(p_struct, const_cast(rhs.p_struct), NULL); } _densesolverreport_owner& _densesolverreport_owner::operator=(const _densesolverreport_owner &rhs) { if( this==&rhs ) return *this; alglib_impl::_densesolverreport_clear(p_struct); alglib_impl::_densesolverreport_init_copy(p_struct, const_cast(rhs.p_struct), NULL); return *this; } _densesolverreport_owner::~_densesolverreport_owner() { alglib_impl::_densesolverreport_clear(p_struct); ae_free(p_struct); } alglib_impl::densesolverreport* _densesolverreport_owner::c_ptr() { return p_struct; } alglib_impl::densesolverreport* _densesolverreport_owner::c_ptr() const { return const_cast(p_struct); } densesolverreport::densesolverreport() : _densesolverreport_owner() ,r1(p_struct->r1),rinf(p_struct->rinf) { } densesolverreport::densesolverreport(const densesolverreport &rhs):_densesolverreport_owner(rhs) ,r1(p_struct->r1),rinf(p_struct->rinf) { } densesolverreport& densesolverreport::operator=(const densesolverreport &rhs) { if( this==&rhs ) return *this; _densesolverreport_owner::operator=(rhs); return *this; } densesolverreport::~densesolverreport() { } /************************************************************************* *************************************************************************/ 
_densesolverlsreport_owner::_densesolverlsreport_owner() { p_struct = (alglib_impl::densesolverlsreport*)alglib_impl::ae_malloc(sizeof(alglib_impl::densesolverlsreport), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_densesolverlsreport_init(p_struct, NULL); } _densesolverlsreport_owner::_densesolverlsreport_owner(const _densesolverlsreport_owner &rhs) { p_struct = (alglib_impl::densesolverlsreport*)alglib_impl::ae_malloc(sizeof(alglib_impl::densesolverlsreport), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_densesolverlsreport_init_copy(p_struct, const_cast(rhs.p_struct), NULL); } _densesolverlsreport_owner& _densesolverlsreport_owner::operator=(const _densesolverlsreport_owner &rhs) { if( this==&rhs ) return *this; alglib_impl::_densesolverlsreport_clear(p_struct); alglib_impl::_densesolverlsreport_init_copy(p_struct, const_cast(rhs.p_struct), NULL); return *this; } _densesolverlsreport_owner::~_densesolverlsreport_owner() { alglib_impl::_densesolverlsreport_clear(p_struct); ae_free(p_struct); } alglib_impl::densesolverlsreport* _densesolverlsreport_owner::c_ptr() { return p_struct; } alglib_impl::densesolverlsreport* _densesolverlsreport_owner::c_ptr() const { return const_cast(p_struct); } densesolverlsreport::densesolverlsreport() : _densesolverlsreport_owner() ,r2(p_struct->r2),cx(&p_struct->cx),n(p_struct->n),k(p_struct->k) { } densesolverlsreport::densesolverlsreport(const densesolverlsreport &rhs):_densesolverlsreport_owner(rhs) ,r2(p_struct->r2),cx(&p_struct->cx),n(p_struct->n),k(p_struct->k) { } densesolverlsreport& densesolverlsreport::operator=(const densesolverlsreport &rhs) { if( this==&rhs ) return *this; _densesolverlsreport_owner::operator=(rhs); return *this; } densesolverlsreport::~densesolverlsreport() { } /************************************************************************* Dense solver for A*x=b with N*N real matrix A and N*1 real vectorx x and b. This is "slow-but-feature rich" version of the linear solver. Faster version is RMatrixSolveFast() function. Algorithm features: * automatic detection of degenerate cases * condition number estimation * iterative refinement * O(N^3) complexity IMPORTANT: ! this function is NOT the most efficient linear solver provided ! by ALGLIB. It estimates condition number of linear system ! and performs iterative refinement, which results in ! significant performance penalty when compared with "fast" ! version which just performs LU decomposition and calls ! triangular solver. ! ! This performance penalty is especially visible in the ! multithreaded mode, because both condition number estimation ! and iterative refinement are inherently sequential ! calculations. It also very significant on small matrices. ! ! Thus, if you need high performance and if you are pretty sure ! that your system is well conditioned, we strongly recommend ! you to use faster solver, RMatrixSolveFast() function. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! 
* about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that LU decomposition is harder to ! parallelize than, say, matrix-matrix product - this algorithm has ! many internal synchronization points which can not be avoided. However ! parallelism starts to be profitable starting from N=1024, achieving ! near-linear speedup for N=4096 or higher. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS A - array[0..N-1,0..N-1], system matrix N - size of A B - array[0..N-1], right part OUTPUT PARAMETERS Info - return code: * -3 matrix is very badly conditioned or exactly singular. * -1 N<=0 was passed * 1 task is solved (but matrix A may be ill-conditioned, check R1/RInf parameters for condition numbers). Rep - additional report, following fields are set: * rep.r1 condition number in 1-norm * rep.rinf condition number in inf-norm X - array[N], it contains: * info>0 => solution * info=-3 => filled by zeros -- ALGLIB -- Copyright 27.01.2010 by Bochkanov Sergey *************************************************************************/ void rmatrixsolve(const real_2d_array &a, const ae_int_t n, const real_1d_array &b, ae_int_t &info, densesolverreport &rep, real_1d_array &x) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::rmatrixsolve(const_cast(a.c_ptr()), n, const_cast(b.c_ptr()), &info, const_cast(rep.c_ptr()), const_cast(x.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void smp_rmatrixsolve(const real_2d_array &a, const ae_int_t n, const real_1d_array &b, ae_int_t &info, densesolverreport &rep, real_1d_array &x) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::_pexec_rmatrixsolve(const_cast(a.c_ptr()), n, const_cast(b.c_ptr()), &info, const_cast(rep.c_ptr()), const_cast(x.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Dense solver. This subroutine solves a system A*x=b, where A is NxN non-denegerate real matrix, x and b are vectors. This is a "fast" version of linear solver which does NOT provide any additional functions like condition number estimation or iterative refinement. Algorithm features: * efficient algorithm O(N^3) complexity * no performance overhead from additional functionality If you need condition number estimation or iterative refinement, use more feature-rich version - RMatrixSolve(). COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! 
this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that LU decomposition is harder to ! parallelize than, say, matrix-matrix product - this algorithm has ! many internal synchronization points which can not be avoided. However ! parallelism starts to be profitable starting from N=1024, achieving ! near-linear speedup for N=4096 or higher. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS A - array[0..N-1,0..N-1], system matrix N - size of A B - array[0..N-1], right part OUTPUT PARAMETERS Info - return code: * -3 matrix is exactly singular (ill conditioned matrices are not recognized). * -1 N<=0 was passed * 1 task is solved B - array[N]: * info>0 => overwritten by solution * info=-3 => filled by zeros -- ALGLIB -- Copyright 16.03.2015 by Bochkanov Sergey *************************************************************************/ void rmatrixsolvefast(const real_2d_array &a, const ae_int_t n, const real_1d_array &b, ae_int_t &info) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::rmatrixsolvefast(const_cast(a.c_ptr()), n, const_cast(b.c_ptr()), &info, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void smp_rmatrixsolvefast(const real_2d_array &a, const ae_int_t n, const real_1d_array &b, ae_int_t &info) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::_pexec_rmatrixsolvefast(const_cast(a.c_ptr()), n, const_cast(b.c_ptr()), &info, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Dense solver. Similar to RMatrixSolve() but solves task with multiple right parts (where b and x are NxM matrices). This is "slow-but-robust" version of linear solver with additional functionality like condition number estimation. There also exists faster version - RMatrixSolveMFast(). Algorithm features: * automatic detection of degenerate cases * condition number estimation * optional iterative refinement * O(N^3+M*N^2) complexity IMPORTANT: ! this function is NOT the most efficient linear solver provided ! 
by ALGLIB. It estimates condition number of linear system ! and performs iterative refinement, which results in ! significant performance penalty when compared with "fast" ! version which just performs LU decomposition and calls ! triangular solver. ! ! This performance penalty is especially visible in the ! multithreaded mode, because both condition number estimation ! and iterative refinement are inherently sequential ! calculations. It also very significant on small matrices. ! ! Thus, if you need high performance and if you are pretty sure ! that your system is well conditioned, we strongly recommend ! you to use faster solver, RMatrixSolveMFast() function. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that LU decomposition is harder to ! parallelize than, say, matrix-matrix product - this algorithm has ! many internal synchronization points which can not be avoided. However ! parallelism starts to be profitable starting from N=1024, achieving ! near-linear speedup for N=4096 or higher. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS A - array[0..N-1,0..N-1], system matrix N - size of A B - array[0..N-1,0..M-1], right part M - right part size RFS - iterative refinement switch: * True - refinement is used. Less performance, more precision. * False - refinement is not used. More performance, less precision. OUTPUT PARAMETERS Info - return code: * -3 A is ill conditioned or singular. X is filled by zeros in such cases. * -1 N<=0 was passed * 1 task is solved (but matrix A may be ill-conditioned, check R1/RInf parameters for condition numbers). 
Rep - additional report, following fields are set: * rep.r1 condition number in 1-norm * rep.rinf condition number in inf-norm X - array[N], it contains: * info>0 => solution * info=-3 => filled by zeros -- ALGLIB -- Copyright 27.01.2010 by Bochkanov Sergey *************************************************************************/ void rmatrixsolvem(const real_2d_array &a, const ae_int_t n, const real_2d_array &b, const ae_int_t m, const bool rfs, ae_int_t &info, densesolverreport &rep, real_2d_array &x) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::rmatrixsolvem(const_cast(a.c_ptr()), n, const_cast(b.c_ptr()), m, rfs, &info, const_cast(rep.c_ptr()), const_cast(x.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void smp_rmatrixsolvem(const real_2d_array &a, const ae_int_t n, const real_2d_array &b, const ae_int_t m, const bool rfs, ae_int_t &info, densesolverreport &rep, real_2d_array &x) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::_pexec_rmatrixsolvem(const_cast(a.c_ptr()), n, const_cast(b.c_ptr()), m, rfs, &info, const_cast(rep.c_ptr()), const_cast(x.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Dense solver. Similar to RMatrixSolve() but solves task with multiple right parts (where b and x are NxM matrices). This is "fast" version of linear solver which does NOT offer additional functions like condition number estimation or iterative refinement. Algorithm features: * O(N^3+M*N^2) complexity * no additional functionality, highest performance COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that LU decomposition is harder to ! parallelize than, say, matrix-matrix product - this algorithm has ! many internal synchronization points which can not be avoided. However ! parallelism starts to be profitable starting from N=1024, achieving ! near-linear speedup for N=4096 or higher. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! 
related features provided by commercial edition of ALGLIB. INPUT PARAMETERS A - array[0..N-1,0..N-1], system matrix N - size of A B - array[0..N-1,0..M-1], right part M - right part size RFS - iterative refinement switch: * True - refinement is used. Less performance, more precision. * False - refinement is not used. More performance, less precision. OUTPUT PARAMETERS Info - return code: * -3 matrix is exactly singular (ill conditioned matrices are not recognized). X is filled by zeros in such cases. * -1 N<=0 was passed * 1 task is solved Rep - additional report, following fields are set: * rep.r1 condition number in 1-norm * rep.rinf condition number in inf-norm B - array[N]: * info>0 => overwritten by solution * info=-3 => filled by zeros -- ALGLIB -- Copyright 27.01.2010 by Bochkanov Sergey *************************************************************************/ void rmatrixsolvemfast(const real_2d_array &a, const ae_int_t n, const real_2d_array &b, const ae_int_t m, ae_int_t &info) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::rmatrixsolvemfast(const_cast(a.c_ptr()), n, const_cast(b.c_ptr()), m, &info, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void smp_rmatrixsolvemfast(const real_2d_array &a, const ae_int_t n, const real_2d_array &b, const ae_int_t m, ae_int_t &info) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::_pexec_rmatrixsolvemfast(const_cast(a.c_ptr()), n, const_cast(b.c_ptr()), m, &info, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Dense solver. This subroutine solves a system A*x=b, where A is NxN non-denegerate real matrix given by its LU decomposition, x and b are real vectors. This is "slow-but-robust" version of the linear LU-based solver. Faster version is RMatrixLUSolveFast() function. Algorithm features: * automatic detection of degenerate cases * O(N^2) complexity * condition number estimation No iterative refinement is provided because exact form of original matrix is not known to subroutine. Use RMatrixSolve or RMatrixMixedSolve if you need iterative refinement. IMPORTANT: ! this function is NOT the most efficient linear solver provided ! by ALGLIB. It estimates condition number of linear system, ! which results in 10-15x performance penalty when compared ! with "fast" version which just calls triangular solver. ! ! This performance penalty is insignificant when compared with ! cost of large LU decomposition. However, if you call this ! function many times for the same left side, this overhead ! BECOMES significant. It also becomes significant for small- ! scale problems. ! ! In such cases we strongly recommend you to use faster solver, ! RMatrixLUSolveFast() function. INPUT PARAMETERS LUA - array[N,N], LU decomposition, RMatrixLU result P - array[N], pivots array, RMatrixLU result N - size of A B - array[N], right part OUTPUT PARAMETERS Info - return code: * -3 matrix is very badly conditioned or exactly singular. * -1 N<=0 was passed * 1 task is solved (but matrix A may be ill-conditioned, check R1/RInf parameters for condition numbers). 
Rep - additional report, following fields are set: * rep.r1 condition number in 1-norm * rep.rinf condition number in inf-norm X - array[N], it contains: * info>0 => solution * info=-3 => filled by zeros -- ALGLIB -- Copyright 27.01.2010 by Bochkanov Sergey *************************************************************************/ void rmatrixlusolve(const real_2d_array &lua, const integer_1d_array &p, const ae_int_t n, const real_1d_array &b, ae_int_t &info, densesolverreport &rep, real_1d_array &x) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::rmatrixlusolve(const_cast(lua.c_ptr()), const_cast(p.c_ptr()), n, const_cast(b.c_ptr()), &info, const_cast(rep.c_ptr()), const_cast(x.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Dense solver. This subroutine solves a system A*x=b, where A is NxN non-denegerate real matrix given by its LU decomposition, x and b are real vectors. This is "fast-without-any-checks" version of the linear LU-based solver. Slower but more robust version is RMatrixLUSolve() function. Algorithm features: * O(N^2) complexity * fast algorithm without ANY additional checks, just triangular solver INPUT PARAMETERS LUA - array[0..N-1,0..N-1], LU decomposition, RMatrixLU result P - array[0..N-1], pivots array, RMatrixLU result N - size of A B - array[0..N-1], right part OUTPUT PARAMETERS Info - return code: * -3 matrix is exactly singular (ill conditioned matrices are not recognized). X is filled by zeros in such cases. * -1 N<=0 was passed * 1 task is solved B - array[N]: * info>0 => overwritten by solution * info=-3 => filled by zeros -- ALGLIB -- Copyright 18.03.2015 by Bochkanov Sergey *************************************************************************/ void rmatrixlusolvefast(const real_2d_array &lua, const integer_1d_array &p, const ae_int_t n, const real_1d_array &b, ae_int_t &info) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::rmatrixlusolvefast(const_cast(lua.c_ptr()), const_cast(p.c_ptr()), n, const_cast(b.c_ptr()), &info, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Dense solver. Similar to RMatrixLUSolve() but solves task with multiple right parts (where b and x are NxM matrices). This is "robust-but-slow" version of LU-based solver which performs additional checks for non-degeneracy of inputs (condition number estimation). If you need best performance, use "fast-without-any-checks" version, RMatrixLUSolveMFast(). Algorithm features: * automatic detection of degenerate cases * O(M*N^2) complexity * condition number estimation No iterative refinement is provided because exact form of original matrix is not known to subroutine. Use RMatrixSolve or RMatrixMixedSolve if you need iterative refinement. IMPORTANT: ! this function is NOT the most efficient linear solver provided ! by ALGLIB. It estimates condition number of linear system, ! which results in significant performance penalty when ! compared with "fast" version which just calls triangular ! solver. ! ! This performance penalty is especially apparent when you use ! 
ALGLIB parallel capabilities (condition number estimation is ! inherently sequential). It also becomes significant for ! small-scale problems. ! ! In such cases we strongly recommend you to use faster solver, ! RMatrixLUSolveMFast() function. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. Triangular solver is relatively easy to parallelize. ! However, parallelization will be efficient only for large number of ! right parts M. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS LUA - array[N,N], LU decomposition, RMatrixLU result P - array[N], pivots array, RMatrixLU result N - size of A B - array[0..N-1,0..M-1], right part M - right part size OUTPUT PARAMETERS Info - return code: * -3 matrix is very badly conditioned or exactly singular. X is filled by zeros in such cases. * -1 N<=0 was passed * 1 task is solved (but matrix A may be ill-conditioned, check R1/RInf parameters for condition numbers). 
Rep - additional report, following fields are set: * rep.r1 condition number in 1-norm * rep.rinf condition number in inf-norm X - array[N,M], it contains: * info>0 => solution * info=-3 => filled by zeros -- ALGLIB -- Copyright 27.01.2010 by Bochkanov Sergey *************************************************************************/ void rmatrixlusolvem(const real_2d_array &lua, const integer_1d_array &p, const ae_int_t n, const real_2d_array &b, const ae_int_t m, ae_int_t &info, densesolverreport &rep, real_2d_array &x) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::rmatrixlusolvem(const_cast(lua.c_ptr()), const_cast(p.c_ptr()), n, const_cast(b.c_ptr()), m, &info, const_cast(rep.c_ptr()), const_cast(x.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void smp_rmatrixlusolvem(const real_2d_array &lua, const integer_1d_array &p, const ae_int_t n, const real_2d_array &b, const ae_int_t m, ae_int_t &info, densesolverreport &rep, real_2d_array &x) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::_pexec_rmatrixlusolvem(const_cast(lua.c_ptr()), const_cast(p.c_ptr()), n, const_cast(b.c_ptr()), m, &info, const_cast(rep.c_ptr()), const_cast(x.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Dense solver. Similar to RMatrixLUSolve() but solves task with multiple right parts, where b and x are NxM matrices. This is "fast-without-any-checks" version of LU-based solver. It does not estimate condition number of a system, so it is extremely fast. If you need better detection of near-degenerate cases, use RMatrixLUSolveM() function. Algorithm features: * O(M*N^2) complexity * fast algorithm without ANY additional checks, just triangular solver COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. Triangular solver is relatively easy to parallelize. ! However, parallelization will be efficient only for large number of ! right parts M. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! 
ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS: LUA - array[0..N-1,0..N-1], LU decomposition, RMatrixLU result P - array[0..N-1], pivots array, RMatrixLU result N - size of A B - array[0..N-1,0..M-1], right part M - right part size OUTPUT PARAMETERS: Info - return code: * -3 matrix is exactly singular (ill conditioned matrices are not recognized). * -1 N<=0 was passed * 1 task is solved B - array[N,M]: * info>0 => overwritten by solution * info=-3 => filled by zeros -- ALGLIB -- Copyright 18.03.2015 by Bochkanov Sergey *************************************************************************/ void rmatrixlusolvemfast(const real_2d_array &lua, const integer_1d_array &p, const ae_int_t n, const real_2d_array &b, const ae_int_t m, ae_int_t &info) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::rmatrixlusolvemfast(const_cast(lua.c_ptr()), const_cast(p.c_ptr()), n, const_cast(b.c_ptr()), m, &info, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void smp_rmatrixlusolvemfast(const real_2d_array &lua, const integer_1d_array &p, const ae_int_t n, const real_2d_array &b, const ae_int_t m, ae_int_t &info) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::_pexec_rmatrixlusolvemfast(const_cast(lua.c_ptr()), const_cast(p.c_ptr()), n, const_cast(b.c_ptr()), m, &info, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Dense solver. This subroutine solves a system A*x=b, where BOTH ORIGINAL A AND ITS LU DECOMPOSITION ARE KNOWN. You can use it if for some reasons you have both A and its LU decomposition. Algorithm features: * automatic detection of degenerate cases * condition number estimation * iterative refinement * O(N^2) complexity INPUT PARAMETERS A - array[0..N-1,0..N-1], system matrix LUA - array[0..N-1,0..N-1], LU decomposition, RMatrixLU result P - array[0..N-1], pivots array, RMatrixLU result N - size of A B - array[0..N-1], right part OUTPUT PARAMETERS Info - return code: * -3 matrix is very badly conditioned or exactly singular. * -1 N<=0 was passed * 1 task is solved (but matrix A may be ill-conditioned, check R1/RInf parameters for condition numbers). 
Rep - additional report, following fields are set: * rep.r1 condition number in 1-norm * rep.rinf condition number in inf-norm X - array[N], it contains: * info>0 => solution * info=-3 => filled by zeros -- ALGLIB -- Copyright 27.01.2010 by Bochkanov Sergey *************************************************************************/ void rmatrixmixedsolve(const real_2d_array &a, const real_2d_array &lua, const integer_1d_array &p, const ae_int_t n, const real_1d_array &b, ae_int_t &info, densesolverreport &rep, real_1d_array &x) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::rmatrixmixedsolve(const_cast(a.c_ptr()), const_cast(lua.c_ptr()), const_cast(p.c_ptr()), n, const_cast(b.c_ptr()), &info, const_cast(rep.c_ptr()), const_cast(x.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Dense solver. Similar to RMatrixMixedSolve() but solves task with multiple right parts (where b and x are NxM matrices). Algorithm features: * automatic detection of degenerate cases * condition number estimation * iterative refinement * O(M*N^2) complexity INPUT PARAMETERS A - array[0..N-1,0..N-1], system matrix LUA - array[0..N-1,0..N-1], LU decomposition, RMatrixLU result P - array[0..N-1], pivots array, RMatrixLU result N - size of A B - array[0..N-1,0..M-1], right part M - right part size OUTPUT PARAMETERS Info - return code: * -3 matrix is very badly conditioned or exactly singular. * -1 N<=0 was passed * 1 task is solved (but matrix A may be ill-conditioned, check R1/RInf parameters for condition numbers). Rep - additional report, following fields are set: * rep.r1 condition number in 1-norm * rep.rinf condition number in inf-norm X - array[N,M], it contains: * info>0 => solution * info=-3 => filled by zeros -- ALGLIB -- Copyright 27.01.2010 by Bochkanov Sergey *************************************************************************/ void rmatrixmixedsolvem(const real_2d_array &a, const real_2d_array &lua, const integer_1d_array &p, const ae_int_t n, const real_2d_array &b, const ae_int_t m, ae_int_t &info, densesolverreport &rep, real_2d_array &x) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::rmatrixmixedsolvem(const_cast(a.c_ptr()), const_cast(lua.c_ptr()), const_cast(p.c_ptr()), n, const_cast(b.c_ptr()), m, &info, const_cast(rep.c_ptr()), const_cast(x.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Complex dense solver for A*X=B with N*N complex matrix A, N*M complex matrices X and B. "Slow-but-feature-rich" version which provides additional functions, at the cost of slower performance. Faster version may be invoked with CMatrixSolveMFast() function. Algorithm features: * automatic detection of degenerate cases * condition number estimation * iterative refinement * O(N^3+M*N^2) complexity IMPORTANT: ! this function is NOT the most efficient linear solver provided ! by ALGLIB. It estimates condition number of linear system ! and performs iterative refinement, which results in ! significant performance penalty when compared with "fast" ! 
version which just performs LU decomposition and calls ! triangular solver. ! ! This performance penalty is especially visible in the ! multithreaded mode, because both condition number estimation ! and iterative refinement are inherently sequential ! calculations. ! ! Thus, if you need high performance and if you are pretty sure ! that your system is well conditioned, we strongly recommend ! you to use faster solver, CMatrixSolveMFast() function. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that LU decomposition is harder to ! parallelize than, say, matrix-matrix product - this algorithm has ! many internal synchronization points which can not be avoided. However ! parallelism starts to be profitable starting from N=1024, achieving ! near-linear speedup for N=4096 or higher. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS A - array[0..N-1,0..N-1], system matrix N - size of A B - array[0..N-1,0..M-1], right part M - right part size RFS - iterative refinement switch: * True - refinement is used. Less performance, more precision. * False - refinement is not used. More performance, less precision. OUTPUT PARAMETERS Info - return code: * -3 matrix is very badly conditioned or exactly singular. X is filled by zeros in such cases. * -1 N<=0 was passed * 1 task is solved (but matrix A may be ill-conditioned, check R1/RInf parameters for condition numbers). 
Rep - additional report, following fields are set: * rep.r1 condition number in 1-norm * rep.rinf condition number in inf-norm X - array[N,M], it contains: * info>0 => solution * info=-3 => filled by zeros -- ALGLIB -- Copyright 27.01.2010 by Bochkanov Sergey *************************************************************************/ void cmatrixsolvem(const complex_2d_array &a, const ae_int_t n, const complex_2d_array &b, const ae_int_t m, const bool rfs, ae_int_t &info, densesolverreport &rep, complex_2d_array &x) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::cmatrixsolvem(const_cast(a.c_ptr()), n, const_cast(b.c_ptr()), m, rfs, &info, const_cast(rep.c_ptr()), const_cast(x.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void smp_cmatrixsolvem(const complex_2d_array &a, const ae_int_t n, const complex_2d_array &b, const ae_int_t m, const bool rfs, ae_int_t &info, densesolverreport &rep, complex_2d_array &x) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::_pexec_cmatrixsolvem(const_cast(a.c_ptr()), n, const_cast(b.c_ptr()), m, rfs, &info, const_cast(rep.c_ptr()), const_cast(x.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Complex dense solver for A*X=B with N*N complex matrix A, N*M complex matrices X and B. "Fast-but-lightweight" version which provides just triangular solver - and no additional functions like iterative refinement or condition number estimation. Algorithm features: * O(N^3+M*N^2) complexity * no additional time consuming functions COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that LU decomposition is harder to ! parallelize than, say, matrix-matrix product - this algorithm has ! many internal synchronization points which can not be avoided. However ! parallelism starts to be profitable starting from N=1024, achieving ! near-linear speedup for N=4096 or higher. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! 
related features provided by commercial edition of ALGLIB. INPUT PARAMETERS A - array[0..N-1,0..N-1], system matrix N - size of A B - array[0..N-1,0..M-1], right part M - right part size OUTPUT PARAMETERS: Info - return code: * -3 matrix is exactly singular (ill conditioned matrices are not recognized). * -1 N<=0 was passed * 1 task is solved B - array[N,M]: * info>0 => overwritten by solution * info=-3 => filled by zeros -- ALGLIB -- Copyright 16.03.2015 by Bochkanov Sergey *************************************************************************/ void cmatrixsolvemfast(const complex_2d_array &a, const ae_int_t n, const complex_2d_array &b, const ae_int_t m, ae_int_t &info) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::cmatrixsolvemfast(const_cast(a.c_ptr()), n, const_cast(b.c_ptr()), m, &info, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void smp_cmatrixsolvemfast(const complex_2d_array &a, const ae_int_t n, const complex_2d_array &b, const ae_int_t m, ae_int_t &info) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::_pexec_cmatrixsolvemfast(const_cast(a.c_ptr()), n, const_cast(b.c_ptr()), m, &info, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Complex dense solver for A*x=B with N*N complex matrix A and N*1 complex vectors x and b. "Slow-but-feature-rich" version of the solver. Algorithm features: * automatic detection of degenerate cases * condition number estimation * iterative refinement * O(N^3) complexity IMPORTANT: ! this function is NOT the most efficient linear solver provided ! by ALGLIB. It estimates condition number of linear system ! and performs iterative refinement, which results in ! significant performance penalty when compared with "fast" ! version which just performs LU decomposition and calls ! triangular solver. ! ! This performance penalty is especially visible in the ! multithreaded mode, because both condition number estimation ! and iterative refinement are inherently sequential ! calculations. ! ! Thus, if you need high performance and if you are pretty sure ! that your system is well conditioned, we strongly recommend ! you to use faster solver, CMatrixSolveFast() function. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. 
We should note that LU decomposition is harder to ! parallelize than, say, matrix-matrix product - this algorithm has ! many internal synchronization points which can not be avoided. However ! parallelism starts to be profitable starting from N=1024, achieving ! near-linear speedup for N=4096 or higher. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS A - array[0..N-1,0..N-1], system matrix N - size of A B - array[0..N-1], right part OUTPUT PARAMETERS Info - return code: * -3 matrix is very badly conditioned or exactly singular. * -1 N<=0 was passed * 1 task is solved (but matrix A may be ill-conditioned, check R1/RInf parameters for condition numbers). Rep - additional report, following fields are set: * rep.r1 condition number in 1-norm * rep.rinf condition number in inf-norm X - array[N], it contains: * info>0 => solution * info=-3 => filled by zeros -- ALGLIB -- Copyright 27.01.2010 by Bochkanov Sergey *************************************************************************/ void cmatrixsolve(const complex_2d_array &a, const ae_int_t n, const complex_1d_array &b, ae_int_t &info, densesolverreport &rep, complex_1d_array &x) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::cmatrixsolve(const_cast(a.c_ptr()), n, const_cast(b.c_ptr()), &info, const_cast(rep.c_ptr()), const_cast(x.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void smp_cmatrixsolve(const complex_2d_array &a, const ae_int_t n, const complex_1d_array &b, ae_int_t &info, densesolverreport &rep, complex_1d_array &x) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::_pexec_cmatrixsolve(const_cast(a.c_ptr()), n, const_cast(b.c_ptr()), &info, const_cast(rep.c_ptr()), const_cast(x.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Complex dense solver for A*x=B with N*N complex matrix A and N*1 complex vectors x and b. "Fast-but-lightweight" version of the solver. Algorithm features: * O(N^3) complexity * no additional time consuming features, just triangular solver COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. 
Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that LU decomposition is harder to ! parallelize than, say, matrix-matrix product - this algorithm has ! many internal synchronization points which can not be avoided. However ! parallelism starts to be profitable starting from N=1024, achieving ! near-linear speedup for N=4096 or higher. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS: A - array[0..N-1,0..N-1], system matrix N - size of A B - array[0..N-1], right part OUTPUT PARAMETERS: Info - return code: * -3 matrix is exactly singular (ill conditioned matrices are not recognized). * -1 N<=0 was passed * 1 task is solved B - array[N]: * info>0 => overwritten by solution * info=-3 => filled by zeros -- ALGLIB -- Copyright 27.01.2010 by Bochkanov Sergey *************************************************************************/ void cmatrixsolvefast(const complex_2d_array &a, const ae_int_t n, const complex_1d_array &b, ae_int_t &info) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::cmatrixsolvefast(const_cast(a.c_ptr()), n, const_cast(b.c_ptr()), &info, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void smp_cmatrixsolvefast(const complex_2d_array &a, const ae_int_t n, const complex_1d_array &b, ae_int_t &info) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::_pexec_cmatrixsolvefast(const_cast(a.c_ptr()), n, const_cast(b.c_ptr()), &info, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Dense solver for A*X=B with N*N complex A given by its LU decomposition, and N*M matrices X and B (multiple right sides). "Slow-but-feature-rich" version of the solver. Algorithm features: * automatic detection of degenerate cases * O(M*N^2) complexity * condition number estimation No iterative refinement is provided because exact form of original matrix is not known to subroutine. Use CMatrixSolve or CMatrixMixedSolve if you need iterative refinement. IMPORTANT: ! this function is NOT the most efficient linear solver provided ! by ALGLIB. It estimates condition number of linear system, ! which results in significant performance penalty when ! compared with "fast" version which just calls triangular ! solver. ! ! This performance penalty is especially apparent when you use ! ALGLIB parallel capabilities (condition number estimation is ! inherently sequential). It also becomes significant for ! small-scale problems. ! ! In such cases we strongly recommend you to use faster solver, ! CMatrixLUSolveMFast() function. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! 
this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. Triangular solver is relatively easy to parallelize. ! However, parallelization will be efficient only for large number of ! right parts M. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS LUA - array[0..N-1,0..N-1], LU decomposition, RMatrixLU result P - array[0..N-1], pivots array, RMatrixLU result N - size of A B - array[0..N-1,0..M-1], right part M - right part size OUTPUT PARAMETERS Info - return code: * -3 matrix is very badly conditioned or exactly singular. * -1 N<=0 was passed * 1 task is solved (but matrix A may be ill-conditioned, check R1/RInf parameters for condition numbers). 
Rep - additional report, following fields are set: * rep.r1 condition number in 1-norm * rep.rinf condition number in inf-norm X - array[N,M], it contains: * info>0 => solution * info=-3 => filled by zeros -- ALGLIB -- Copyright 27.01.2010 by Bochkanov Sergey *************************************************************************/ void cmatrixlusolvem(const complex_2d_array &lua, const integer_1d_array &p, const ae_int_t n, const complex_2d_array &b, const ae_int_t m, ae_int_t &info, densesolverreport &rep, complex_2d_array &x) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::cmatrixlusolvem(const_cast(lua.c_ptr()), const_cast(p.c_ptr()), n, const_cast(b.c_ptr()), m, &info, const_cast(rep.c_ptr()), const_cast(x.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void smp_cmatrixlusolvem(const complex_2d_array &lua, const integer_1d_array &p, const ae_int_t n, const complex_2d_array &b, const ae_int_t m, ae_int_t &info, densesolverreport &rep, complex_2d_array &x) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::_pexec_cmatrixlusolvem(const_cast(lua.c_ptr()), const_cast(p.c_ptr()), n, const_cast(b.c_ptr()), m, &info, const_cast(rep.c_ptr()), const_cast(x.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Dense solver for A*X=B with N*N complex A given by its LU decomposition, and N*M matrices X and B (multiple right sides). "Fast-but-lightweight" version of the solver. Algorithm features: * O(M*N^2) complexity * no additional time-consuming features COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. Triangular solver is relatively easy to parallelize. ! However, parallelization will be efficient only for large number of ! right parts M. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. 
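NOTE: the sketch below illustrates the factor-once/solve-many pattern with CMatrixLUSolveM() above. The sample matrix, the header names and the CMatrixLU() calling sequence (LinAlg package) are assumptions of this illustration.

    #include <cstdio>
    #include "linalg.h"
    #include "solvers.h"
    using namespace alglib;

    int main()
    {
        // illustrative 2x2 complex matrix; CMatrixLU() factors it in place
        complex_2d_array lua;
        lua.setlength(2, 2);
        lua(0,0) = complex(2,0);  lua(0,1) = complex(0,1);
        lua(1,0) = complex(0,-1); lua(1,1) = complex(2,0);
        integer_1d_array p;
        cmatrixlu(lua, 2, 2, p);

        // two right parts, solved in one call with condition number estimation
        complex_2d_array b;
        b.setlength(2, 2);
        b(0,0) = complex(1,0); b(0,1) = complex(0,0);
        b(1,0) = complex(0,0); b(1,1) = complex(1,0);

        ae_int_t info;
        densesolverreport rep;
        complex_2d_array x;
        cmatrixlusolvem(lua, p, 2, b, 2, info, rep, x);
        if( info>0 )
            printf("r1=%.2e rinf=%.2e\n", rep.r1, rep.rinf);
        // if the report is not needed, CMatrixLUSolveMFast() solves the same
        // system in place, overwriting B with the solution
        return 0;
    }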
INPUT PARAMETERS LUA - array[0..N-1,0..N-1], LU decomposition, RMatrixLU result P - array[0..N-1], pivots array, RMatrixLU result N - size of A B - array[0..N-1,0..M-1], right part M - right part size OUTPUT PARAMETERS Info - return code: * -3 matrix is exactly singular (ill conditioned matrices are not recognized). * -1 N<=0 was passed * 1 task is solved B - array[N,M]: * info>0 => overwritten by solution * info=-3 => filled by zeros -- ALGLIB -- Copyright 27.01.2010 by Bochkanov Sergey *************************************************************************/ void cmatrixlusolvemfast(const complex_2d_array &lua, const integer_1d_array &p, const ae_int_t n, const complex_2d_array &b, const ae_int_t m, ae_int_t &info) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::cmatrixlusolvemfast(const_cast(lua.c_ptr()), const_cast(p.c_ptr()), n, const_cast(b.c_ptr()), m, &info, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void smp_cmatrixlusolvemfast(const complex_2d_array &lua, const integer_1d_array &p, const ae_int_t n, const complex_2d_array &b, const ae_int_t m, ae_int_t &info) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::_pexec_cmatrixlusolvemfast(const_cast(lua.c_ptr()), const_cast(p.c_ptr()), n, const_cast(b.c_ptr()), m, &info, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Complex dense linear solver for A*x=b with complex N*N A given by its LU decomposition and N*1 vectors x and b. This is "slow-but-robust" version of the complex linear solver with additional features which add significant performance overhead. Faster version is CMatrixLUSolveFast() function. Algorithm features: * automatic detection of degenerate cases * O(N^2) complexity * condition number estimation No iterative refinement is provided because exact form of original matrix is not known to subroutine. Use CMatrixSolve or CMatrixMixedSolve if you need iterative refinement. IMPORTANT: ! this function is NOT the most efficient linear solver provided ! by ALGLIB. It estimates condition number of linear system, ! which results in 10-15x performance penalty when compared ! with "fast" version which just calls triangular solver. ! ! This performance penalty is insignificant when compared with ! cost of large LU decomposition. However, if you call this ! function many times for the same left side, this overhead ! BECOMES significant. It also becomes significant for small- ! scale problems. ! ! In such cases we strongly recommend you to use faster solver, ! CMatrixLUSolveFast() function. INPUT PARAMETERS LUA - array[0..N-1,0..N-1], LU decomposition, CMatrixLU result P - array[0..N-1], pivots array, CMatrixLU result N - size of A B - array[0..N-1], right part OUTPUT PARAMETERS Info - return code: * -3 matrix is very badly conditioned or exactly singular. * -1 N<=0 was passed * 1 task is solved (but matrix A may be ill-conditioned, check R1/RInf parameters for condition numbers). 
Rep - additional report, following fields are set: * rep.r1 condition number in 1-norm * rep.rinf condition number in inf-norm X - array[N], it contains: * info>0 => solution * info=-3 => filled by zeros -- ALGLIB -- Copyright 27.01.2010 by Bochkanov Sergey *************************************************************************/ void cmatrixlusolve(const complex_2d_array &lua, const integer_1d_array &p, const ae_int_t n, const complex_1d_array &b, ae_int_t &info, densesolverreport &rep, complex_1d_array &x) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::cmatrixlusolve(const_cast(lua.c_ptr()), const_cast(p.c_ptr()), n, const_cast(b.c_ptr()), &info, const_cast(rep.c_ptr()), const_cast(x.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Complex dense linear solver for A*x=b with N*N complex A given by its LU decomposition and N*1 vectors x and b. This is fast lightweight version of solver, which is significantly faster than CMatrixLUSolve(), but does not provide additional information (like condition numbers). Algorithm features: * O(N^2) complexity * no additional time-consuming features, just triangular solver INPUT PARAMETERS LUA - array[0..N-1,0..N-1], LU decomposition, CMatrixLU result P - array[0..N-1], pivots array, CMatrixLU result N - size of A B - array[0..N-1], right part OUTPUT PARAMETERS Info - return code: * -3 matrix is exactly singular (ill conditioned matrices are not recognized). * -1 N<=0 was passed * 1 task is solved B - array[N]: * info>0 => overwritten by solution * info=-3 => filled by zeros NOTE: unlike CMatrixLUSolve(), this function does NOT check for near-degeneracy of input matrix. It checks for EXACT degeneracy, because this check is easy to do. However, very badly conditioned matrices may went unnoticed. -- ALGLIB -- Copyright 27.01.2010 by Bochkanov Sergey *************************************************************************/ void cmatrixlusolvefast(const complex_2d_array &lua, const integer_1d_array &p, const ae_int_t n, const complex_1d_array &b, ae_int_t &info) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::cmatrixlusolvefast(const_cast(lua.c_ptr()), const_cast(p.c_ptr()), n, const_cast(b.c_ptr()), &info, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Dense solver. Same as RMatrixMixedSolveM(), but for complex matrices. Algorithm features: * automatic detection of degenerate cases * condition number estimation * iterative refinement * O(M*N^2) complexity INPUT PARAMETERS A - array[0..N-1,0..N-1], system matrix LUA - array[0..N-1,0..N-1], LU decomposition, CMatrixLU result P - array[0..N-1], pivots array, CMatrixLU result N - size of A B - array[0..N-1,0..M-1], right part M - right part size OUTPUT PARAMETERS Info - return code: * -3 matrix is very badly conditioned or exactly singular. * -1 N<=0 was passed * 1 task is solved (but matrix A may be ill-conditioned, check R1/RInf parameters for condition numbers). 
Rep - additional report, following fields are set: * rep.r1 condition number in 1-norm * rep.rinf condition number in inf-norm X - array[N,M], it contains: * info>0 => solution * info=-3 => filled by zeros -- ALGLIB -- Copyright 27.01.2010 by Bochkanov Sergey *************************************************************************/ void cmatrixmixedsolvem(const complex_2d_array &a, const complex_2d_array &lua, const integer_1d_array &p, const ae_int_t n, const complex_2d_array &b, const ae_int_t m, ae_int_t &info, densesolverreport &rep, complex_2d_array &x) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::cmatrixmixedsolvem(const_cast(a.c_ptr()), const_cast(lua.c_ptr()), const_cast(p.c_ptr()), n, const_cast(b.c_ptr()), m, &info, const_cast(rep.c_ptr()), const_cast(x.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Dense solver. Same as RMatrixMixedSolve(), but for complex matrices. Algorithm features: * automatic detection of degenerate cases * condition number estimation * iterative refinement * O(N^2) complexity INPUT PARAMETERS A - array[0..N-1,0..N-1], system matrix LUA - array[0..N-1,0..N-1], LU decomposition, CMatrixLU result P - array[0..N-1], pivots array, CMatrixLU result N - size of A B - array[0..N-1], right part OUTPUT PARAMETERS Info - return code: * -3 matrix is very badly conditioned or exactly singular. * -1 N<=0 was passed * 1 task is solved (but matrix A may be ill-conditioned, check R1/RInf parameters for condition numbers). Rep - additional report, following fields are set: * rep.r1 condition number in 1-norm * rep.rinf condition number in inf-norm X - array[N], it contains: * info>0 => solution * info=-3 => filled by zeros -- ALGLIB -- Copyright 27.01.2010 by Bochkanov Sergey *************************************************************************/ void cmatrixmixedsolve(const complex_2d_array &a, const complex_2d_array &lua, const integer_1d_array &p, const ae_int_t n, const complex_1d_array &b, ae_int_t &info, densesolverreport &rep, complex_1d_array &x) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::cmatrixmixedsolve(const_cast(a.c_ptr()), const_cast(lua.c_ptr()), const_cast(p.c_ptr()), n, const_cast(b.c_ptr()), &info, const_cast(rep.c_ptr()), const_cast(x.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Dense solver for A*X=B with N*N symmetric positive definite matrix A, and N*M vectors X and B. It is "slow-but-feature-rich" version of the solver. Algorithm features: * automatic detection of degenerate cases * condition number estimation * O(N^3+M*N^2) complexity * matrix is represented by its upper or lower triangle No iterative refinement is provided because such partial representation of matrix does not allow efficient calculation of extra-precise matrix-vector products for large matrices. Use RMatrixSolve or RMatrixMixedSolve if you need iterative refinement. IMPORTANT: ! this function is NOT the most efficient linear solver provided ! by ALGLIB. It estimates condition number of linear system, ! 
which results in significant performance penalty when ! compared with "fast" version which just performs Cholesky ! decomposition and calls triangular solver. ! ! This performance penalty is especially visible in the ! multithreaded mode, because both condition number estimation ! and iterative refinement are inherently sequential ! calculations. ! ! Thus, if you need high performance and if you are pretty sure ! that your system is well conditioned, we strongly recommend ! you to use faster solver, SPDMatrixSolveMFast() function. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that Cholesky decomposition is harder ! to parallelize than, say, matrix-matrix product - this algorithm has ! several synchronization points which can not be avoided. However, ! parallelism starts to be profitable starting from N=500. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS A - array[0..N-1,0..N-1], system matrix N - size of A IsUpper - what half of A is provided B - array[0..N-1,0..M-1], right part M - right part size OUTPUT PARAMETERS Info - return code: * -3 matrix is very badly conditioned or non-SPD. * -1 N<=0 was passed * 1 task is solved (but matrix A may be ill-conditioned, check R1/RInf parameters for condition numbers). 
Rep - additional report, following fields are set: * rep.r1 condition number in 1-norm * rep.rinf condition number in inf-norm X - array[N,M], it contains: * info>0 => solution * info=-3 => filled by zeros -- ALGLIB -- Copyright 27.01.2010 by Bochkanov Sergey *************************************************************************/ void spdmatrixsolvem(const real_2d_array &a, const ae_int_t n, const bool isupper, const real_2d_array &b, const ae_int_t m, ae_int_t &info, densesolverreport &rep, real_2d_array &x) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::spdmatrixsolvem(const_cast(a.c_ptr()), n, isupper, const_cast(b.c_ptr()), m, &info, const_cast(rep.c_ptr()), const_cast(x.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void smp_spdmatrixsolvem(const real_2d_array &a, const ae_int_t n, const bool isupper, const real_2d_array &b, const ae_int_t m, ae_int_t &info, densesolverreport &rep, real_2d_array &x) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::_pexec_spdmatrixsolvem(const_cast(a.c_ptr()), n, isupper, const_cast(b.c_ptr()), m, &info, const_cast(rep.c_ptr()), const_cast(x.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Dense solver for A*X=B with N*N symmetric positive definite matrix A, and N*M vectors X and B. It is "fast-but-lightweight" version of the solver. Algorithm features: * O(N^3+M*N^2) complexity * matrix is represented by its upper or lower triangle * no additional time consuming features COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that Cholesky decomposition is harder ! to parallelize than, say, matrix-matrix product - this algorithm has ! several synchronization points which can not be avoided. However, ! parallelism starts to be profitable starting from N=500. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. 
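NOTE: a short illustrative call to SPDMatrixSolveM() above, with the lightweight SPDMatrixSolveMFast() shown as a comment. The 2x2 SPD system and the header names are assumptions of the sketch.

    #include <cstdio>
    #include "solvers.h"
    using namespace alglib;

    int main()
    {
        // small SPD system with two right parts; IsUpper=true, so only the
        // upper triangle of A is referenced
        real_2d_array a = "[[4,1],[1,3]]";
        real_2d_array b = "[[1,0],[0,1]]";
        ae_int_t info;
        densesolverreport rep;
        real_2d_array x;
        spdmatrixsolvem(a, 2, true, b, 2, info, rep, x);   // feature-rich variant
        if( info>0 )
            printf("rep.r1=%.2e rep.rinf=%.2e\n", rep.r1, rep.rinf);
        // the lightweight variant overwrites B instead of producing X and Rep:
        // spdmatrixsolvemfast(a, 2, true, b, 2, info);
        return 0;
    }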
INPUT PARAMETERS A - array[0..N-1,0..N-1], system matrix N - size of A IsUpper - what half of A is provided B - array[0..N-1,0..M-1], right part M - right part size OUTPUT PARAMETERS Info - return code: * -3 A is is exactly singular * -1 N<=0 was passed * 1 task was solved B - array[N,M], it contains: * info>0 => solution * info=-3 => filled by zeros -- ALGLIB -- Copyright 17.03.2015 by Bochkanov Sergey *************************************************************************/ void spdmatrixsolvemfast(const real_2d_array &a, const ae_int_t n, const bool isupper, const real_2d_array &b, const ae_int_t m, ae_int_t &info) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::spdmatrixsolvemfast(const_cast(a.c_ptr()), n, isupper, const_cast(b.c_ptr()), m, &info, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void smp_spdmatrixsolvemfast(const real_2d_array &a, const ae_int_t n, const bool isupper, const real_2d_array &b, const ae_int_t m, ae_int_t &info) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::_pexec_spdmatrixsolvemfast(const_cast(a.c_ptr()), n, isupper, const_cast(b.c_ptr()), m, &info, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Dense linear solver for A*x=b with N*N real symmetric positive definite matrix A, N*1 vectors x and b. "Slow-but-feature-rich" version of the solver. Algorithm features: * automatic detection of degenerate cases * condition number estimation * O(N^3) complexity * matrix is represented by its upper or lower triangle No iterative refinement is provided because such partial representation of matrix does not allow efficient calculation of extra-precise matrix-vector products for large matrices. Use RMatrixSolve or RMatrixMixedSolve if you need iterative refinement. IMPORTANT: ! this function is NOT the most efficient linear solver provided ! by ALGLIB. It estimates condition number of linear system, ! which results in significant performance penalty when ! compared with "fast" version which just performs Cholesky ! decomposition and calls triangular solver. ! ! This performance penalty is especially visible in the ! multithreaded mode, because both condition number estimation ! and iterative refinement are inherently sequential ! calculations. ! ! Thus, if you need high performance and if you are pretty sure ! that your system is well conditioned, we strongly recommend ! you to use faster solver, SPDMatrixSolveFast() function. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! 
support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that Cholesky decomposition is harder ! to parallelize than, say, matrix-matrix product - this algorithm has ! several synchronization points which can not be avoided. However, ! parallelism starts to be profitable starting from N=500. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS A - array[0..N-1,0..N-1], system matrix N - size of A IsUpper - what half of A is provided B - array[0..N-1], right part OUTPUT PARAMETERS Info - return code: * -3 matrix is very badly conditioned or non-SPD. * -1 N<=0 was passed * 1 task is solved (but matrix A may be ill-conditioned, check R1/RInf parameters for condition numbers). Rep - additional report, following fields are set: * rep.r1 condition number in 1-norm * rep.rinf condition number in inf-norm X - array[N], it contains: * info>0 => solution * info=-3 => filled by zeros -- ALGLIB -- Copyright 27.01.2010 by Bochkanov Sergey *************************************************************************/ void spdmatrixsolve(const real_2d_array &a, const ae_int_t n, const bool isupper, const real_1d_array &b, ae_int_t &info, densesolverreport &rep, real_1d_array &x) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::spdmatrixsolve(const_cast(a.c_ptr()), n, isupper, const_cast(b.c_ptr()), &info, const_cast(rep.c_ptr()), const_cast(x.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void smp_spdmatrixsolve(const real_2d_array &a, const ae_int_t n, const bool isupper, const real_1d_array &b, ae_int_t &info, densesolverreport &rep, real_1d_array &x) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::_pexec_spdmatrixsolve(const_cast(a.c_ptr()), n, isupper, const_cast(b.c_ptr()), &info, const_cast(rep.c_ptr()), const_cast(x.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Dense linear solver for A*x=b with N*N real symmetric positive definite matrix A, N*1 vectors x and b. "Fast-but-lightweight" version of the solver. Algorithm features: * O(N^3) complexity * matrix is represented by its upper or lower triangle * no additional time consuming features like condition number estimation COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! 
problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that Cholesky decomposition is harder ! to parallelize than, say, matrix-matrix product - this algorithm has ! several synchronization points which can not be avoided. However, ! parallelism starts to be profitable starting from N=500. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS A - array[0..N-1,0..N-1], system matrix N - size of A IsUpper - what half of A is provided B - array[0..N-1], right part OUTPUT PARAMETERS Info - return code: * -3 A is is exactly singular or non-SPD * -1 N<=0 was passed * 1 task was solved B - array[N], it contains: * info>0 => solution * info=-3 => filled by zeros -- ALGLIB -- Copyright 17.03.2015 by Bochkanov Sergey *************************************************************************/ void spdmatrixsolvefast(const real_2d_array &a, const ae_int_t n, const bool isupper, const real_1d_array &b, ae_int_t &info) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::spdmatrixsolvefast(const_cast(a.c_ptr()), n, isupper, const_cast(b.c_ptr()), &info, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void smp_spdmatrixsolvefast(const real_2d_array &a, const ae_int_t n, const bool isupper, const real_1d_array &b, ae_int_t &info) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::_pexec_spdmatrixsolvefast(const_cast(a.c_ptr()), n, isupper, const_cast(b.c_ptr()), &info, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Dense solver for A*X=B with N*N symmetric positive definite matrix A given by its Cholesky decomposition, and N*M vectors X and B. It is "slow-but- feature-rich" version of the solver which estimates condition number of the system. Algorithm features: * automatic detection of degenerate cases * O(M*N^2) complexity * condition number estimation * matrix is represented by its upper or lower triangle No iterative refinement is provided because such partial representation of matrix does not allow efficient calculation of extra-precise matrix-vector products for large matrices. Use RMatrixSolve or RMatrixMixedSolve if you need iterative refinement. IMPORTANT: ! this function is NOT the most efficient linear solver provided ! by ALGLIB. It estimates condition number of linear system, ! 
which results in significant performance penalty when ! compared with "fast" version which just calls triangular ! solver. Amount of overhead introduced depends on M (the ! larger - the more efficient). ! ! This performance penalty is insignificant when compared with ! cost of large LU decomposition. However, if you call this ! function many times for the same left side, this overhead ! BECOMES significant. It also becomes significant for small- ! scale problems (N<50). ! ! In such cases we strongly recommend you to use faster solver, ! SPDMatrixCholeskySolveMFast() function. INPUT PARAMETERS CHA - array[0..N-1,0..N-1], Cholesky decomposition, SPDMatrixCholesky result N - size of CHA IsUpper - what half of CHA is provided B - array[0..N-1,0..M-1], right part M - right part size OUTPUT PARAMETERS Info - return code: * -3 A is is exactly singular or badly conditioned X is filled by zeros in such cases. * -1 N<=0 was passed * 1 task was solved Rep - additional report, following fields are set: * rep.r1 condition number in 1-norm * rep.rinf condition number in inf-norm X - array[N]: * for info>0 contains solution * for info=-3 filled by zeros -- ALGLIB -- Copyright 27.01.2010 by Bochkanov Sergey *************************************************************************/ void spdmatrixcholeskysolvem(const real_2d_array &cha, const ae_int_t n, const bool isupper, const real_2d_array &b, const ae_int_t m, ae_int_t &info, densesolverreport &rep, real_2d_array &x) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::spdmatrixcholeskysolvem(const_cast(cha.c_ptr()), n, isupper, const_cast(b.c_ptr()), m, &info, const_cast(rep.c_ptr()), const_cast(x.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void smp_spdmatrixcholeskysolvem(const real_2d_array &cha, const ae_int_t n, const bool isupper, const real_2d_array &b, const ae_int_t m, ae_int_t &info, densesolverreport &rep, real_2d_array &x) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::_pexec_spdmatrixcholeskysolvem(const_cast(cha.c_ptr()), n, isupper, const_cast(b.c_ptr()), m, &info, const_cast(rep.c_ptr()), const_cast(x.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Dense solver for A*X=B with N*N symmetric positive definite matrix A given by its Cholesky decomposition, and N*M vectors X and B. It is "fast-but- lightweight" version of the solver which just solves linear system, without any additional functions. Algorithm features: * O(M*N^2) complexity * matrix is represented by its upper or lower triangle * no additional functionality INPUT PARAMETERS CHA - array[N,N], Cholesky decomposition, SPDMatrixCholesky result N - size of CHA IsUpper - what half of CHA is provided B - array[N,M], right part M - right part size OUTPUT PARAMETERS Info - return code: * -3 A is is exactly singular or badly conditioned X is filled by zeros in such cases. 
* -1 N<=0 was passed * 1 task was solved B - array[N]: * for info>0 overwritten by solution * for info=-3 filled by zeros -- ALGLIB -- Copyright 18.03.2015 by Bochkanov Sergey *************************************************************************/ void spdmatrixcholeskysolvemfast(const real_2d_array &cha, const ae_int_t n, const bool isupper, const real_2d_array &b, const ae_int_t m, ae_int_t &info) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::spdmatrixcholeskysolvemfast(const_cast(cha.c_ptr()), n, isupper, const_cast(b.c_ptr()), m, &info, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void smp_spdmatrixcholeskysolvemfast(const real_2d_array &cha, const ae_int_t n, const bool isupper, const real_2d_array &b, const ae_int_t m, ae_int_t &info) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::_pexec_spdmatrixcholeskysolvemfast(const_cast(cha.c_ptr()), n, isupper, const_cast(b.c_ptr()), m, &info, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Dense solver for A*x=b with N*N symmetric positive definite matrix A given by its Cholesky decomposition, and N*1 real vectors x and b. This is "slow- but-feature-rich" version of the solver which, in addition to the solution, performs condition number estimation. Algorithm features: * automatic detection of degenerate cases * O(N^2) complexity * condition number estimation * matrix is represented by its upper or lower triangle No iterative refinement is provided because such partial representation of matrix does not allow efficient calculation of extra-precise matrix-vector products for large matrices. Use RMatrixSolve or RMatrixMixedSolve if you need iterative refinement. IMPORTANT: ! this function is NOT the most efficient linear solver provided ! by ALGLIB. It estimates condition number of linear system, ! which results in 10-15x performance penalty when compared ! with "fast" version which just calls triangular solver. ! ! This performance penalty is insignificant when compared with ! cost of large LU decomposition. However, if you call this ! function many times for the same left side, this overhead ! BECOMES significant. It also becomes significant for small- ! scale problems (N<50). ! ! In such cases we strongly recommend you to use faster solver, ! SPDMatrixCholeskySolveFast() function. INPUT PARAMETERS CHA - array[N,N], Cholesky decomposition, SPDMatrixCholesky result N - size of A IsUpper - what half of CHA is provided B - array[N], right part OUTPUT PARAMETERS Info - return code: * -3 A is is exactly singular or ill conditioned X is filled by zeros in such cases. 
* -1 N<=0 was passed * 1 task is solved Rep - additional report, following fields are set: * rep.r1 condition number in 1-norm * rep.rinf condition number in inf-norm X - array[N]: * for info>0 - solution * for info=-3 - filled by zeros -- ALGLIB -- Copyright 27.01.2010 by Bochkanov Sergey *************************************************************************/ void spdmatrixcholeskysolve(const real_2d_array &cha, const ae_int_t n, const bool isupper, const real_1d_array &b, ae_int_t &info, densesolverreport &rep, real_1d_array &x) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::spdmatrixcholeskysolve(const_cast(cha.c_ptr()), n, isupper, const_cast(b.c_ptr()), &info, const_cast(rep.c_ptr()), const_cast(x.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Dense solver for A*x=b with N*N symmetric positive definite matrix A given by its Cholesky decomposition, and N*1 real vectors x and b. This is "fast- but-lightweight" version of the solver. Algorithm features: * O(N^2) complexity * matrix is represented by its upper or lower triangle * no additional features INPUT PARAMETERS CHA - array[N,N], Cholesky decomposition, SPDMatrixCholesky result N - size of A IsUpper - what half of CHA is provided B - array[N], right part OUTPUT PARAMETERS Info - return code: * -3 A is is exactly singular or ill conditioned X is filled by zeros in such cases. * -1 N<=0 was passed * 1 task is solved B - array[N]: * for info>0 - overwritten by solution * for info=-3 - filled by zeros -- ALGLIB -- Copyright 27.01.2010 by Bochkanov Sergey *************************************************************************/ void spdmatrixcholeskysolvefast(const real_2d_array &cha, const ae_int_t n, const bool isupper, const real_1d_array &b, ae_int_t &info) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::spdmatrixcholeskysolvefast(const_cast(cha.c_ptr()), n, isupper, const_cast(b.c_ptr()), &info, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Dense solver for A*X=B, with N*N Hermitian positive definite matrix A and N*M complex matrices X and B. "Slow-but-feature-rich" version of the solver. Algorithm features: * automatic detection of degenerate cases * condition number estimation * O(N^3+M*N^2) complexity * matrix is represented by its upper or lower triangle No iterative refinement is provided because such partial representation of matrix does not allow efficient calculation of extra-precise matrix-vector products for large matrices. Use RMatrixSolve or RMatrixMixedSolve if you need iterative refinement. IMPORTANT: ! this function is NOT the most efficient linear solver provided ! by ALGLIB. It estimates condition number of linear system, ! which results in significant performance penalty when ! compared with "fast" version which just calls triangular ! solver. ! ! This performance penalty is especially apparent when you use ! ALGLIB parallel capabilities (condition number estimation is ! inherently sequential). It also becomes significant for ! small-scale problems (N<100). ! ! 
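NOTE: for the SPD Cholesky solvers above (SPDMatrixCholeskySolve / SPDMatrixCholeskySolveFast), a typical factor-then-solve sequence is sketched below. The sample matrix and the SPDMatrixCholesky() call from the LinAlg package are assumptions of the illustration.

    #include <cstdio>
    #include "linalg.h"
    #include "solvers.h"
    using namespace alglib;

    int main()
    {
        // illustrative SPD matrix; SPDMatrixCholesky() factors the upper
        // triangle in place and returns false for non-SPD input
        real_2d_array a = "[[4,1],[1,3]]";
        if( !spdmatrixcholesky(a, 2, true) )
            return 1;
        // fast single-right-part solve: B is overwritten by the solution
        real_1d_array b = "[1,2]";
        ae_int_t info;
        spdmatrixcholeskysolvefast(a, 2, true, b, info);
        if( info>0 )
            printf("x = [%f, %f]\n", b(0), b(1));
        return 0;
    }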
In such cases we strongly recommend you to use faster solver, ! HPDMatrixSolveMFast() function. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that Cholesky decomposition is harder ! to parallelize than, say, matrix-matrix product - this algorithm has ! several synchronization points which can not be avoided. However, ! parallelism starts to be profitable starting from N=500. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS A - array[0..N-1,0..N-1], system matrix N - size of A IsUpper - what half of A is provided B - array[0..N-1,0..M-1], right part M - right part size OUTPUT PARAMETERS Info - same as in RMatrixSolve. Returns -3 for non-HPD matrices. Rep - same as in RMatrixSolve X - same as in RMatrixSolve -- ALGLIB -- Copyright 27.01.2010 by Bochkanov Sergey *************************************************************************/ void hpdmatrixsolvem(const complex_2d_array &a, const ae_int_t n, const bool isupper, const complex_2d_array &b, const ae_int_t m, ae_int_t &info, densesolverreport &rep, complex_2d_array &x) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::hpdmatrixsolvem(const_cast(a.c_ptr()), n, isupper, const_cast(b.c_ptr()), m, &info, const_cast(rep.c_ptr()), const_cast(x.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void smp_hpdmatrixsolvem(const complex_2d_array &a, const ae_int_t n, const bool isupper, const complex_2d_array &b, const ae_int_t m, ae_int_t &info, densesolverreport &rep, complex_2d_array &x) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::_pexec_hpdmatrixsolvem(const_cast(a.c_ptr()), n, isupper, const_cast(b.c_ptr()), m, &info, const_cast(rep.c_ptr()), const_cast(x.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Dense solver for A*X=B, with N*N Hermitian positive definite matrix A and N*M complex matrices X and B. 
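NOTE: an illustrative call to the feature-rich HPDMatrixSolveM() above. The Hermitian positive definite sample matrix and the header names are assumptions of this sketch.

    #include <cstdio>
    #include "solvers.h"
    using namespace alglib;

    int main()
    {
        // 2x2 Hermitian positive definite system, single right part (M=1);
        // with IsUpper=true only the upper triangle of A is referenced
        complex_2d_array a;
        a.setlength(2, 2);
        a(0,0) = complex(4,0);  a(0,1) = complex(0,1);
        a(1,0) = complex(0,-1); a(1,1) = complex(3,0);
        complex_2d_array b;
        b.setlength(2, 1);
        b(0,0) = complex(1,0);
        b(1,0) = complex(0,1);

        ae_int_t info;
        densesolverreport rep;
        complex_2d_array x;
        hpdmatrixsolvem(a, 2, true, b, 1, info, rep, x);
        if( info>0 )
            printf("x[0]=(%f,%f)\n", x(0,0).x, x(0,0).y);
        return 0;
    }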
"Fast-but-lightweight" version of the solver. Algorithm features: * O(N^3+M*N^2) complexity * matrix is represented by its upper or lower triangle * no additional time consuming features like condition number estimation COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that Cholesky decomposition is harder ! to parallelize than, say, matrix-matrix product - this algorithm has ! several synchronization points which can not be avoided. However, ! parallelism starts to be profitable starting from N=500. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS A - array[0..N-1,0..N-1], system matrix N - size of A IsUpper - what half of A is provided B - array[0..N-1,0..M-1], right part M - right part size OUTPUT PARAMETERS Info - return code: * -3 A is is exactly singular or is not positive definite. B is filled by zeros in such cases. 
* -1 N<=0 was passed * 1 task is solved B - array[0..N-1]: * overwritten by solution * zeros, if problem was not solved -- ALGLIB -- Copyright 17.03.2015 by Bochkanov Sergey *************************************************************************/ void hpdmatrixsolvemfast(const complex_2d_array &a, const ae_int_t n, const bool isupper, const complex_2d_array &b, const ae_int_t m, ae_int_t &info) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::hpdmatrixsolvemfast(const_cast(a.c_ptr()), n, isupper, const_cast(b.c_ptr()), m, &info, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void smp_hpdmatrixsolvemfast(const complex_2d_array &a, const ae_int_t n, const bool isupper, const complex_2d_array &b, const ae_int_t m, ae_int_t &info) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::_pexec_hpdmatrixsolvemfast(const_cast(a.c_ptr()), n, isupper, const_cast(b.c_ptr()), m, &info, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Dense solver for A*x=b, with N*N Hermitian positive definite matrix A, and N*1 complex vectors x and b. "Slow-but-feature-rich" version of the solver. Algorithm features: * automatic detection of degenerate cases * condition number estimation * O(N^3) complexity * matrix is represented by its upper or lower triangle No iterative refinement is provided because such partial representation of matrix does not allow efficient calculation of extra-precise matrix-vector products for large matrices. Use RMatrixSolve or RMatrixMixedSolve if you need iterative refinement. IMPORTANT: ! this function is NOT the most efficient linear solver provided ! by ALGLIB. It estimates condition number of linear system, ! which results in significant performance penalty when ! compared with "fast" version which just performs Cholesky ! decomposition and calls triangular solver. ! ! This performance penalty is especially visible in the ! multithreaded mode, because both condition number estimation ! and iterative refinement are inherently sequential ! calculations. ! ! Thus, if you need high performance and if you are pretty sure ! that your system is well conditioned, we strongly recommend ! you to use faster solver, HPDMatrixSolveFast() function. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! 
of this function. We should note that Cholesky decomposition is harder ! to parallelize than, say, matrix-matrix product - this algorithm has ! several synchronization points which can not be avoided. However, ! parallelism starts to be profitable starting from N=500. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS A - array[0..N-1,0..N-1], system matrix N - size of A IsUpper - what half of A is provided B - array[0..N-1], right part OUTPUT PARAMETERS Info - same as in RMatrixSolve Returns -3 for non-HPD matrices. Rep - same as in RMatrixSolve X - same as in RMatrixSolve -- ALGLIB -- Copyright 27.01.2010 by Bochkanov Sergey *************************************************************************/ void hpdmatrixsolve(const complex_2d_array &a, const ae_int_t n, const bool isupper, const complex_1d_array &b, ae_int_t &info, densesolverreport &rep, complex_1d_array &x) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::hpdmatrixsolve(const_cast(a.c_ptr()), n, isupper, const_cast(b.c_ptr()), &info, const_cast(rep.c_ptr()), const_cast(x.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void smp_hpdmatrixsolve(const complex_2d_array &a, const ae_int_t n, const bool isupper, const complex_1d_array &b, ae_int_t &info, densesolverreport &rep, complex_1d_array &x) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::_pexec_hpdmatrixsolve(const_cast(a.c_ptr()), n, isupper, const_cast(b.c_ptr()), &info, const_cast(rep.c_ptr()), const_cast(x.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Dense solver for A*x=b, with N*N Hermitian positive definite matrix A, and N*1 complex vectors x and b. "Fast-but-lightweight" version of the solver without additional functions. Algorithm features: * O(N^3) complexity * matrix is represented by its upper or lower triangle * no additional time consuming functions COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! 
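NOTE: a minimal sketch of the lightweight in-place call to HPDMatrixSolveFast() described here (its wrapper follows below). The sample system and header names are illustrative assumptions of the sketch.

    #include <cstdio>
    #include "solvers.h"
    using namespace alglib;

    int main()
    {
        // 2x2 Hermitian positive definite system; no report is produced and
        // B is overwritten by the solution on success (info>0)
        complex_2d_array a;
        a.setlength(2, 2);
        a(0,0) = complex(4,0);  a(0,1) = complex(0,1);
        a(1,0) = complex(0,-1); a(1,1) = complex(3,0);
        complex_1d_array b;
        b.setlength(2);
        b(0) = complex(1,0);
        b(1) = complex(0,1);

        ae_int_t info;
        hpdmatrixsolvefast(a, 2, true, b, info);
        if( info>0 )
            printf("x = (%f,%f), (%f,%f)\n", b(0).x, b(0).y, b(1).x, b(1).y);
        return 0;
    }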
Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that Cholesky decomposition is harder ! to parallelize than, say, matrix-matrix product - this algorithm has ! several synchronization points which can not be avoided. However, ! parallelism starts to be profitable starting from N=500. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS A - array[0..N-1,0..N-1], system matrix N - size of A IsUpper - what half of A is provided B - array[0..N-1], right part OUTPUT PARAMETERS Info - return code: * -3 A is is exactly singular or not positive definite X is filled by zeros in such cases. * -1 N<=0 was passed * 1 task was solved B - array[0..N-1]: * overwritten by solution * zeros, if A is exactly singular (diagonal of its LU decomposition has exact zeros). -- ALGLIB -- Copyright 17.03.2015 by Bochkanov Sergey *************************************************************************/ void hpdmatrixsolvefast(const complex_2d_array &a, const ae_int_t n, const bool isupper, const complex_1d_array &b, ae_int_t &info) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::hpdmatrixsolvefast(const_cast(a.c_ptr()), n, isupper, const_cast(b.c_ptr()), &info, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void smp_hpdmatrixsolvefast(const complex_2d_array &a, const ae_int_t n, const bool isupper, const complex_1d_array &b, ae_int_t &info) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::_pexec_hpdmatrixsolvefast(const_cast(a.c_ptr()), n, isupper, const_cast(b.c_ptr()), &info, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Dense solver for A*X=B with N*N Hermitian positive definite matrix A given by its Cholesky decomposition and N*M complex matrices X and B. This is "slow-but-feature-rich" version of the solver which, in addition to the solution, estimates condition number of the system. Algorithm features: * automatic detection of degenerate cases * O(M*N^2) complexity * condition number estimation * matrix is represented by its upper or lower triangle No iterative refinement is provided because such partial representation of matrix does not allow efficient calculation of extra-precise matrix-vector products for large matrices. Use RMatrixSolve or RMatrixMixedSolve if you need iterative refinement. IMPORTANT: ! this function is NOT the most efficient linear solver provided ! by ALGLIB. It estimates condition number of linear system, ! which results in significant performance penalty when ! compared with "fast" version which just calls triangular ! solver. Amount of overhead introduced depends on M (the ! larger - the more efficient). ! ! This performance penalty is insignificant when compared with ! cost of large Cholesky decomposition. However, if you call ! 
this function many times for the same left side, this ! overhead BECOMES significant. It also becomes significant ! for small-scale problems (N<50). ! ! In such cases we strongly recommend you to use faster solver, ! HPDMatrixCholeskySolveMFast() function. INPUT PARAMETERS CHA - array[N,N], Cholesky decomposition, HPDMatrixCholesky result N - size of CHA IsUpper - what half of CHA is provided B - array[N,M], right part M - right part size OUTPUT PARAMETERS: Info - return code: * -3 A is singular, or VERY close to singular. X is filled by zeros in such cases. * -1 N<=0 was passed * 1 task was solved Rep - additional report, following fields are set: * rep.r1 condition number in 1-norm * rep.rinf condition number in inf-norm X - array[N]: * for info>0 contains solution * for info=-3 filled by zeros -- ALGLIB -- Copyright 27.01.2010 by Bochkanov Sergey *************************************************************************/ void hpdmatrixcholeskysolvem(const complex_2d_array &cha, const ae_int_t n, const bool isupper, const complex_2d_array &b, const ae_int_t m, ae_int_t &info, densesolverreport &rep, complex_2d_array &x) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::hpdmatrixcholeskysolvem(const_cast(cha.c_ptr()), n, isupper, const_cast(b.c_ptr()), m, &info, const_cast(rep.c_ptr()), const_cast(x.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void smp_hpdmatrixcholeskysolvem(const complex_2d_array &cha, const ae_int_t n, const bool isupper, const complex_2d_array &b, const ae_int_t m, ae_int_t &info, densesolverreport &rep, complex_2d_array &x) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::_pexec_hpdmatrixcholeskysolvem(const_cast(cha.c_ptr()), n, isupper, const_cast(b.c_ptr()), m, &info, const_cast(rep.c_ptr()), const_cast(x.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Dense solver for A*X=B with N*N Hermitian positive definite matrix A given by its Cholesky decomposition and N*M complex matrices X and B. This is "fast-but-lightweight" version of the solver. Algorithm features: * O(M*N^2) complexity * matrix is represented by its upper or lower triangle * no additional time-consuming features INPUT PARAMETERS CHA - array[N,N], Cholesky decomposition, HPDMatrixCholesky result N - size of CHA IsUpper - what half of CHA is provided B - array[N,M], right part M - right part size OUTPUT PARAMETERS: Info - return code: * -3 A is singular, or VERY close to singular. X is filled by zeros in such cases. 
* -1 N<=0 was passed * 1 task was solved B - array[N]: * for info>0 overwritten by solution * for info=-3 filled by zeros -- ALGLIB -- Copyright 18.03.2015 by Bochkanov Sergey *************************************************************************/ void hpdmatrixcholeskysolvemfast(const complex_2d_array &cha, const ae_int_t n, const bool isupper, const complex_2d_array &b, const ae_int_t m, ae_int_t &info) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::hpdmatrixcholeskysolvemfast(const_cast(cha.c_ptr()), n, isupper, const_cast(b.c_ptr()), m, &info, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void smp_hpdmatrixcholeskysolvemfast(const complex_2d_array &cha, const ae_int_t n, const bool isupper, const complex_2d_array &b, const ae_int_t m, ae_int_t &info) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::_pexec_hpdmatrixcholeskysolvemfast(const_cast(cha.c_ptr()), n, isupper, const_cast(b.c_ptr()), m, &info, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Dense solver for A*x=b with N*N Hermitian positive definite matrix A given by its Cholesky decomposition, and N*1 complex vectors x and b. This is "slow-but-feature-rich" version of the solver which estimates condition number of the system. Algorithm features: * automatic detection of degenerate cases * O(N^2) complexity * condition number estimation * matrix is represented by its upper or lower triangle No iterative refinement is provided because such partial representation of matrix does not allow efficient calculation of extra-precise matrix-vector products for large matrices. Use RMatrixSolve or RMatrixMixedSolve if you need iterative refinement. IMPORTANT: ! this function is NOT the most efficient linear solver provided ! by ALGLIB. It estimates condition number of linear system, ! which results in 10-15x performance penalty when compared ! with "fast" version which just calls triangular solver. ! ! This performance penalty is insignificant when compared with ! cost of large LU decomposition. However, if you call this ! function many times for the same left side, this overhead ! BECOMES significant. It also becomes significant for small- ! scale problems (N<50). ! ! In such cases we strongly recommend you to use faster solver, ! HPDMatrixCholeskySolveFast() function. INPUT PARAMETERS CHA - array[0..N-1,0..N-1], Cholesky decomposition, SPDMatrixCholesky result N - size of A IsUpper - what half of CHA is provided B - array[0..N-1], right part OUTPUT PARAMETERS Info - return code: * -3 A is is exactly singular or ill conditioned X is filled by zeros in such cases. 
* -1 N<=0 was passed * 1 task is solved Rep - additional report, following fields are set: * rep.r1 condition number in 1-norm * rep.rinf condition number in inf-norm X - array[N]: * for info>0 - solution * for info=-3 - filled by zeros -- ALGLIB -- Copyright 27.01.2010 by Bochkanov Sergey *************************************************************************/ void hpdmatrixcholeskysolve(const complex_2d_array &cha, const ae_int_t n, const bool isupper, const complex_1d_array &b, ae_int_t &info, densesolverreport &rep, complex_1d_array &x) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::hpdmatrixcholeskysolve(const_cast(cha.c_ptr()), n, isupper, const_cast(b.c_ptr()), &info, const_cast(rep.c_ptr()), const_cast(x.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Dense solver for A*x=b with N*N Hermitian positive definite matrix A given by its Cholesky decomposition, and N*1 complex vectors x and b. This is "fast-but-lightweight" version of the solver. Algorithm features: * O(N^2) complexity * matrix is represented by its upper or lower triangle * no additional time-consuming features INPUT PARAMETERS CHA - array[0..N-1,0..N-1], Cholesky decomposition, SPDMatrixCholesky result N - size of A IsUpper - what half of CHA is provided B - array[0..N-1], right part OUTPUT PARAMETERS Info - return code: * -3 A is is exactly singular or ill conditioned B is filled by zeros in such cases. * -1 N<=0 was passed * 1 task is solved B - array[N]: * for info>0 - overwritten by solution * for info=-3 - filled by zeros -- ALGLIB -- Copyright 18.03.2015 by Bochkanov Sergey *************************************************************************/ void hpdmatrixcholeskysolvefast(const complex_2d_array &cha, const ae_int_t n, const bool isupper, const complex_1d_array &b, ae_int_t &info) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::hpdmatrixcholeskysolvefast(const_cast(cha.c_ptr()), n, isupper, const_cast(b.c_ptr()), &info, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Dense solver. This subroutine finds solution of the linear system A*X=B with non-square, possibly degenerate A. System is solved in the least squares sense, and general least squares solution X = X0 + CX*y which minimizes |A*X-B| is returned. If A is non-degenerate, solution in the usual sense is returned. Algorithm features: * automatic detection (and correct handling!) of degenerate cases * iterative refinement * O(N^3) complexity COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes one important improvement of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Generally, commercial ALGLIB is several times faster than open-source ! generic C edition, and many times faster than open-source C# edition. ! ! 
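/*************************************************************************
A minimal sketch of the "factor once, solve many times" workflow served by
the Cholesky-based solvers above: hpdmatrixcholesky() (LinAlg package)
produces the factor, hpdmatrixcholeskysolve() then solves for several
right parts in O(N^2) each. The matrix values and the
cholesky_reuse_demo() name are illustrative; headers assume the standard
ALGLIB for C++ distribution.
*************************************************************************/
#include <cstdio>
#include "linalg.h"
#include "solvers.h"

void cholesky_reuse_demo()
{
    alglib::complex_2d_array cha;
    cha.setlength(2, 2);
    cha[0][0] = 4.0;
    cha[0][1] = alglib::complex(0.0, 1.0);
    cha[1][0] = alglib::complex(0.0, -1.0);
    cha[1][1] = 3.0;
    // in-place Cholesky factorization of the upper triangle;
    // returns false if the matrix is not positive definite
    if( !alglib::hpdmatrixcholesky(cha, 2, true) )
    {
        printf("matrix is not positive definite\n");
        return;
    }
    // reuse the factor for two different right parts without refactoring
    for(int k=0; k<2; k++)
    {
        alglib::complex_1d_array b;
        b.setlength(2);
        b[0] = (k==0 ? 1.0 : 0.0);
        b[1] = (k==0 ? 0.0 : 1.0);
        alglib::ae_int_t info;
        alglib::densesolverreport rep;
        alglib::complex_1d_array x;
        alglib::hpdmatrixcholeskysolve(cha, 2, true, b, info, rep, x);
        printf("rhs %d: info=%d\n", k, int(info));
    }
}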
Multithreaded acceleration is only partially supported (some parts are ! optimized, but most - are not). ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS A - array[0..NRows-1,0..NCols-1], system matrix NRows - vertical size of A NCols - horizontal size of A B - array[0..NCols-1], right part Threshold- a number in [0,1]. Singular values beyond Threshold are considered zero. Set it to 0.0, if you don't understand what it means, so the solver will choose good value on its own. OUTPUT PARAMETERS Info - return code: * -4 SVD subroutine failed * -1 if NRows<=0 or NCols<=0 or Threshold<0 was passed * 1 if task is solved Rep - solver report, see below for more info X - array[0..N-1,0..M-1], it contains: * solution of A*X=B (even for singular A) * zeros, if SVD subroutine failed SOLVER REPORT Subroutine sets following fields of the Rep structure: * R2 reciprocal of condition number: 1/cond(A), 2-norm. * N = NCols * K dim(Null(A)) * CX array[0..N-1,0..K-1], kernel of A. Columns of CX store such vectors that A*CX[i]=0. -- ALGLIB -- Copyright 24.08.2009 by Bochkanov Sergey *************************************************************************/ void rmatrixsolvels(const real_2d_array &a, const ae_int_t nrows, const ae_int_t ncols, const real_1d_array &b, const double threshold, ae_int_t &info, densesolverlsreport &rep, real_1d_array &x) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::rmatrixsolvels(const_cast(a.c_ptr()), nrows, ncols, const_cast(b.c_ptr()), threshold, &info, const_cast(rep.c_ptr()), const_cast(x.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void smp_rmatrixsolvels(const real_2d_array &a, const ae_int_t nrows, const ae_int_t ncols, const real_1d_array &b, const double threshold, ae_int_t &info, densesolverlsreport &rep, real_1d_array &x) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::_pexec_rmatrixsolvels(const_cast(a.c_ptr()), nrows, ncols, const_cast(b.c_ptr()), threshold, &info, const_cast(rep.c_ptr()), const_cast(x.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This object stores state of the LinLSQR method. You should use ALGLIB functions to work with this object. 
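/*************************************************************************
A minimal sketch of rmatrixsolvels() documented above: the least squares
solution of an overdetermined 3x2 system. The data and the lsfit_demo()
name are illustrative; Threshold=0.0 lets the solver pick the singular
value cutoff on its own.
*************************************************************************/
#include <cstdio>
#include "solvers.h"

void lsfit_demo()
{
    // three equations, two unknowns
    alglib::real_2d_array a = "[[1,1],[1,2],[1,3]]";
    alglib::real_1d_array b = "[6,5,7]";
    alglib::ae_int_t info;
    alglib::densesolverlsreport rep;
    alglib::real_1d_array x;
    alglib::rmatrixsolvels(a, 3, 2, b, 0.0, info, rep, x);
    if( info>0 )
        printf("x=[%.4f, %.4f], rep.r2=%.3e\n", x[0], x[1], rep.r2);
    else
        printf("solver failed, info=%d\n", int(info));
}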
*************************************************************************/ _linlsqrstate_owner::_linlsqrstate_owner() { p_struct = (alglib_impl::linlsqrstate*)alglib_impl::ae_malloc(sizeof(alglib_impl::linlsqrstate), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_linlsqrstate_init(p_struct, NULL); } _linlsqrstate_owner::_linlsqrstate_owner(const _linlsqrstate_owner &rhs) { p_struct = (alglib_impl::linlsqrstate*)alglib_impl::ae_malloc(sizeof(alglib_impl::linlsqrstate), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_linlsqrstate_init_copy(p_struct, const_cast(rhs.p_struct), NULL); } _linlsqrstate_owner& _linlsqrstate_owner::operator=(const _linlsqrstate_owner &rhs) { if( this==&rhs ) return *this; alglib_impl::_linlsqrstate_clear(p_struct); alglib_impl::_linlsqrstate_init_copy(p_struct, const_cast(rhs.p_struct), NULL); return *this; } _linlsqrstate_owner::~_linlsqrstate_owner() { alglib_impl::_linlsqrstate_clear(p_struct); ae_free(p_struct); } alglib_impl::linlsqrstate* _linlsqrstate_owner::c_ptr() { return p_struct; } alglib_impl::linlsqrstate* _linlsqrstate_owner::c_ptr() const { return const_cast(p_struct); } linlsqrstate::linlsqrstate() : _linlsqrstate_owner() { } linlsqrstate::linlsqrstate(const linlsqrstate &rhs):_linlsqrstate_owner(rhs) { } linlsqrstate& linlsqrstate::operator=(const linlsqrstate &rhs) { if( this==&rhs ) return *this; _linlsqrstate_owner::operator=(rhs); return *this; } linlsqrstate::~linlsqrstate() { } /************************************************************************* *************************************************************************/ _linlsqrreport_owner::_linlsqrreport_owner() { p_struct = (alglib_impl::linlsqrreport*)alglib_impl::ae_malloc(sizeof(alglib_impl::linlsqrreport), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_linlsqrreport_init(p_struct, NULL); } _linlsqrreport_owner::_linlsqrreport_owner(const _linlsqrreport_owner &rhs) { p_struct = (alglib_impl::linlsqrreport*)alglib_impl::ae_malloc(sizeof(alglib_impl::linlsqrreport), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_linlsqrreport_init_copy(p_struct, const_cast(rhs.p_struct), NULL); } _linlsqrreport_owner& _linlsqrreport_owner::operator=(const _linlsqrreport_owner &rhs) { if( this==&rhs ) return *this; alglib_impl::_linlsqrreport_clear(p_struct); alglib_impl::_linlsqrreport_init_copy(p_struct, const_cast(rhs.p_struct), NULL); return *this; } _linlsqrreport_owner::~_linlsqrreport_owner() { alglib_impl::_linlsqrreport_clear(p_struct); ae_free(p_struct); } alglib_impl::linlsqrreport* _linlsqrreport_owner::c_ptr() { return p_struct; } alglib_impl::linlsqrreport* _linlsqrreport_owner::c_ptr() const { return const_cast(p_struct); } linlsqrreport::linlsqrreport() : _linlsqrreport_owner() ,iterationscount(p_struct->iterationscount),nmv(p_struct->nmv),terminationtype(p_struct->terminationtype) { } linlsqrreport::linlsqrreport(const linlsqrreport &rhs):_linlsqrreport_owner(rhs) ,iterationscount(p_struct->iterationscount),nmv(p_struct->nmv),terminationtype(p_struct->terminationtype) { } linlsqrreport& linlsqrreport::operator=(const linlsqrreport &rhs) { if( this==&rhs ) return *this; _linlsqrreport_owner::operator=(rhs); return *this; } linlsqrreport::~linlsqrreport() { } /************************************************************************* This function initializes linear LSQR Solver. 
This solver is used to solve non-symmetric (and, possibly, non-square) problems. Least squares solution is returned for non-compatible systems. USAGE: 1. User initializes algorithm state with LinLSQRCreate() call 2. User tunes solver parameters with LinLSQRSetCond() and other functions 3. User calls LinLSQRSolveSparse() function which takes algorithm state and SparseMatrix object. 4. User calls LinLSQRResults() to get solution 5. Optionally, user may call LinLSQRSolveSparse() again to solve another problem with different matrix and/or right part without reinitializing LinLSQRState structure. INPUT PARAMETERS: M - number of rows in A N - number of variables, N>0 OUTPUT PARAMETERS: State - structure which stores algorithm state -- ALGLIB -- Copyright 30.11.2011 by Bochkanov Sergey *************************************************************************/ void linlsqrcreate(const ae_int_t m, const ae_int_t n, linlsqrstate &state) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::linlsqrcreate(m, n, const_cast(state.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function changes preconditioning settings of LinLSQQSolveSparse() function. By default, SolveSparse() uses diagonal preconditioner, but if you want to use solver without preconditioning, you can call this function which forces solver to use unit matrix for preconditioning. INPUT PARAMETERS: State - structure which stores algorithm state -- ALGLIB -- Copyright 19.11.2012 by Bochkanov Sergey *************************************************************************/ void linlsqrsetprecunit(const linlsqrstate &state) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::linlsqrsetprecunit(const_cast(state.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function changes preconditioning settings of LinCGSolveSparse() function. LinCGSolveSparse() will use diagonal of the system matrix as preconditioner. This preconditioning mode is active by default. INPUT PARAMETERS: State - structure which stores algorithm state -- ALGLIB -- Copyright 19.11.2012 by Bochkanov Sergey *************************************************************************/ void linlsqrsetprecdiag(const linlsqrstate &state) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::linlsqrsetprecdiag(const_cast(state.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function sets optional Tikhonov regularization coefficient. It is zero by default. 
INPUT PARAMETERS: LambdaI - regularization factor, LambdaI>=0 OUTPUT PARAMETERS: State - structure which stores algorithm state -- ALGLIB -- Copyright 30.11.2011 by Bochkanov Sergey *************************************************************************/ void linlsqrsetlambdai(const linlsqrstate &state, const double lambdai) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::linlsqrsetlambdai(const_cast(state.c_ptr()), lambdai, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Procedure for solution of A*x=b with sparse A. INPUT PARAMETERS: State - algorithm state A - sparse M*N matrix in the CRS format (you MUST convert it to CRS format by calling SparseConvertToCRS() function BEFORE you pass it to this function). B - right part, array[M] RESULT: This function returns no result. You can get the solution by calling LinLSQRResults() NOTE: this function uses lightweight preconditioning - multiplication by inverse of diag(A). If you want, you can turn preconditioning off by calling LinLSQRSetPrecUnit(). However, preconditioning cost is low and preconditioner is very important for solution of badly scaled problems. -- ALGLIB -- Copyright 30.11.2011 by Bochkanov Sergey *************************************************************************/ void linlsqrsolvesparse(const linlsqrstate &state, const sparsematrix &a, const real_1d_array &b) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::linlsqrsolvesparse(const_cast(state.c_ptr()), const_cast(a.c_ptr()), const_cast(b.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function sets stopping criteria. INPUT PARAMETERS: EpsA - algorithm will be stopped if ||A^T*Rk||/(||A||*||Rk||)<=EpsA. EpsB - algorithm will be stopped if ||Rk||<=EpsB*||B|| MaxIts - algorithm will be stopped if the number of iterations is more than MaxIts. OUTPUT PARAMETERS: State - structure which stores algorithm state NOTE: if EpsA, EpsB and MaxIts are all zero, default values are used for these variables. -- ALGLIB -- Copyright 30.11.2011 by Bochkanov Sergey *************************************************************************/ void linlsqrsetcond(const linlsqrstate &state, const double epsa, const double epsb, const ae_int_t maxits) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::linlsqrsetcond(const_cast(state.c_ptr()), epsa, epsb, maxits, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* LSQR solver: results. This function must be called after LinLSQRSolveSparse() INPUT PARAMETERS: State - algorithm state OUTPUT PARAMETERS: X - array[N], solution Rep - optimization report: * Rep.TerminationType completion code: * 1 ||Rk||<=EpsB*||B|| * 4 ||A^T*Rk||/(||A||*||Rk||)<=EpsA * 5 MaxIts steps were taken * 7 rounding errors prevent further progress, X contains best point found so far.
(sometimes returned on singular systems) * Rep.IterationsCount contains iterations count * NMV countains number of matrix-vector calculations -- ALGLIB -- Copyright 30.11.2011 by Bochkanov Sergey *************************************************************************/ void linlsqrresults(const linlsqrstate &state, real_1d_array &x, linlsqrreport &rep) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::linlsqrresults(const_cast(state.c_ptr()), const_cast(x.c_ptr()), const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function turns on/off reporting. INPUT PARAMETERS: State - structure which stores algorithm state NeedXRep- whether iteration reports are needed or not If NeedXRep is True, algorithm will call rep() callback function if it is provided to MinCGOptimize(). -- ALGLIB -- Copyright 30.11.2011 by Bochkanov Sergey *************************************************************************/ void linlsqrsetxrep(const linlsqrstate &state, const bool needxrep) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::linlsqrsetxrep(const_cast(state.c_ptr()), needxrep, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* *************************************************************************/ _polynomialsolverreport_owner::_polynomialsolverreport_owner() { p_struct = (alglib_impl::polynomialsolverreport*)alglib_impl::ae_malloc(sizeof(alglib_impl::polynomialsolverreport), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_polynomialsolverreport_init(p_struct, NULL); } _polynomialsolverreport_owner::_polynomialsolverreport_owner(const _polynomialsolverreport_owner &rhs) { p_struct = (alglib_impl::polynomialsolverreport*)alglib_impl::ae_malloc(sizeof(alglib_impl::polynomialsolverreport), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_polynomialsolverreport_init_copy(p_struct, const_cast(rhs.p_struct), NULL); } _polynomialsolverreport_owner& _polynomialsolverreport_owner::operator=(const _polynomialsolverreport_owner &rhs) { if( this==&rhs ) return *this; alglib_impl::_polynomialsolverreport_clear(p_struct); alglib_impl::_polynomialsolverreport_init_copy(p_struct, const_cast(rhs.p_struct), NULL); return *this; } _polynomialsolverreport_owner::~_polynomialsolverreport_owner() { alglib_impl::_polynomialsolverreport_clear(p_struct); ae_free(p_struct); } alglib_impl::polynomialsolverreport* _polynomialsolverreport_owner::c_ptr() { return p_struct; } alglib_impl::polynomialsolverreport* _polynomialsolverreport_owner::c_ptr() const { return const_cast(p_struct); } polynomialsolverreport::polynomialsolverreport() : _polynomialsolverreport_owner() ,maxerr(p_struct->maxerr) { } polynomialsolverreport::polynomialsolverreport(const polynomialsolverreport &rhs):_polynomialsolverreport_owner(rhs) ,maxerr(p_struct->maxerr) { } polynomialsolverreport& polynomialsolverreport::operator=(const polynomialsolverreport &rhs) { if( this==&rhs ) return *this; _polynomialsolverreport_owner::operator=(rhs); return *this; } 
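/*************************************************************************
A minimal end-to-end sketch of the LSQR usage scheme described above: the
matrix is built with the sparse matrix functions of the LinAlg package,
converted to CRS, and passed to the solver. The 3x2 system, the tolerances
and the lsqr_demo() name are illustrative.
*************************************************************************/
#include <cstdio>
#include "linalg.h"
#include "solvers.h"

void lsqr_demo()
{
    // sparse 3x2 matrix; linlsqrsolvesparse() requires the CRS format
    alglib::sparsematrix a;
    alglib::sparsecreate(3, 2, a);
    alglib::sparseset(a, 0, 0, 1.0);
    alglib::sparseset(a, 0, 1, 1.0);
    alglib::sparseset(a, 1, 0, 1.0);
    alglib::sparseset(a, 1, 1, 2.0);
    alglib::sparseset(a, 2, 0, 1.0);
    alglib::sparseset(a, 2, 1, 3.0);
    alglib::sparseconverttocrs(a);
    alglib::real_1d_array b = "[6,5,7]";
    alglib::linlsqrstate s;
    alglib::linlsqrreport rep;
    alglib::real_1d_array x;
    alglib::linlsqrcreate(3, 2, s);               // M=3 rows, N=2 variables
    alglib::linlsqrsetcond(s, 1.0E-6, 1.0E-6, 0); // EpsA, EpsB, unlimited iterations
    // optional tuning: linlsqrsetlambdai() for Tikhonov regularization,
    // linlsqrsetprecunit() to switch the diagonal preconditioner off
    alglib::linlsqrsolvesparse(s, a, b);
    alglib::linlsqrresults(s, x, rep);
    printf("terminationtype=%d, x=[%.4f, %.4f]\n", int(rep.terminationtype), x[0], x[1]);
}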
polynomialsolverreport::~polynomialsolverreport() { } /************************************************************************* Polynomial root finding. This function returns all roots of the polynomial P(x) = a0 + a1*x + a2*x^2 + ... + an*x^n Both real and complex roots are returned (see below). INPUT PARAMETERS: A - array[N+1], polynomial coefficients: * A[0] is constant term * A[N] is a coefficient of X^N N - polynomial degree OUTPUT PARAMETERS: X - array of complex roots: * for isolated real root, X[I] is strictly real: IMAGE(X[I])=0 * complex roots are always returned in pairs - roots occupy positions I and I+1, with: * X[I+1]=Conj(X[I]) * IMAGE(X[I]) > 0 * IMAGE(X[I+1]) = -IMAGE(X[I]) < 0 * multiple real roots may have non-zero imaginary part due to roundoff errors. There is no reliable way to distinguish real root of multiplicity 2 from two complex roots in the presence of roundoff errors. Rep - report, additional information, following fields are set: * Rep.MaxErr - max( |P(xi)| ) for i=0..N-1. This field allows to quickly estimate "quality" of the roots being returned. NOTE: this function uses companion matrix method to find roots. In case internal EVD solver fails do find eigenvalues, exception is generated. NOTE: roots are not "polished" and no matrix balancing is performed for them. -- ALGLIB -- Copyright 24.02.2014 by Bochkanov Sergey *************************************************************************/ void polynomialsolve(const real_1d_array &a, const ae_int_t n, complex_1d_array &x, polynomialsolverreport &rep) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::polynomialsolve(const_cast(a.c_ptr()), n, const_cast(x.c_ptr()), const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* *************************************************************************/ _nleqstate_owner::_nleqstate_owner() { p_struct = (alglib_impl::nleqstate*)alglib_impl::ae_malloc(sizeof(alglib_impl::nleqstate), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_nleqstate_init(p_struct, NULL); } _nleqstate_owner::_nleqstate_owner(const _nleqstate_owner &rhs) { p_struct = (alglib_impl::nleqstate*)alglib_impl::ae_malloc(sizeof(alglib_impl::nleqstate), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_nleqstate_init_copy(p_struct, const_cast(rhs.p_struct), NULL); } _nleqstate_owner& _nleqstate_owner::operator=(const _nleqstate_owner &rhs) { if( this==&rhs ) return *this; alglib_impl::_nleqstate_clear(p_struct); alglib_impl::_nleqstate_init_copy(p_struct, const_cast(rhs.p_struct), NULL); return *this; } _nleqstate_owner::~_nleqstate_owner() { alglib_impl::_nleqstate_clear(p_struct); ae_free(p_struct); } alglib_impl::nleqstate* _nleqstate_owner::c_ptr() { return p_struct; } alglib_impl::nleqstate* _nleqstate_owner::c_ptr() const { return const_cast(p_struct); } nleqstate::nleqstate() : _nleqstate_owner() ,needf(p_struct->needf),needfij(p_struct->needfij),xupdated(p_struct->xupdated),f(p_struct->f),fi(&p_struct->fi),j(&p_struct->j),x(&p_struct->x) { } nleqstate::nleqstate(const nleqstate &rhs):_nleqstate_owner(rhs) ,needf(p_struct->needf),needfij(p_struct->needfij),xupdated(p_struct->xupdated),f(p_struct->f),fi(&p_struct->fi),j(&p_struct->j),x(&p_struct->x) { } nleqstate& 
nleqstate::operator=(const nleqstate &rhs) { if( this==&rhs ) return *this; _nleqstate_owner::operator=(rhs); return *this; } nleqstate::~nleqstate() { } /************************************************************************* *************************************************************************/ _nleqreport_owner::_nleqreport_owner() { p_struct = (alglib_impl::nleqreport*)alglib_impl::ae_malloc(sizeof(alglib_impl::nleqreport), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_nleqreport_init(p_struct, NULL); } _nleqreport_owner::_nleqreport_owner(const _nleqreport_owner &rhs) { p_struct = (alglib_impl::nleqreport*)alglib_impl::ae_malloc(sizeof(alglib_impl::nleqreport), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_nleqreport_init_copy(p_struct, const_cast(rhs.p_struct), NULL); } _nleqreport_owner& _nleqreport_owner::operator=(const _nleqreport_owner &rhs) { if( this==&rhs ) return *this; alglib_impl::_nleqreport_clear(p_struct); alglib_impl::_nleqreport_init_copy(p_struct, const_cast(rhs.p_struct), NULL); return *this; } _nleqreport_owner::~_nleqreport_owner() { alglib_impl::_nleqreport_clear(p_struct); ae_free(p_struct); } alglib_impl::nleqreport* _nleqreport_owner::c_ptr() { return p_struct; } alglib_impl::nleqreport* _nleqreport_owner::c_ptr() const { return const_cast(p_struct); } nleqreport::nleqreport() : _nleqreport_owner() ,iterationscount(p_struct->iterationscount),nfunc(p_struct->nfunc),njac(p_struct->njac),terminationtype(p_struct->terminationtype) { } nleqreport::nleqreport(const nleqreport &rhs):_nleqreport_owner(rhs) ,iterationscount(p_struct->iterationscount),nfunc(p_struct->nfunc),njac(p_struct->njac),terminationtype(p_struct->terminationtype) { } nleqreport& nleqreport::operator=(const nleqreport &rhs) { if( this==&rhs ) return *this; _nleqreport_owner::operator=(rhs); return *this; } nleqreport::~nleqreport() { } /************************************************************************* LEVENBERG-MARQUARDT-LIKE NONLINEAR SOLVER DESCRIPTION: This algorithm solves system of nonlinear equations F[0](x[0], ..., x[n-1]) = 0 F[1](x[0], ..., x[n-1]) = 0 ... F[M-1](x[0], ..., x[n-1]) = 0 with M/N do not necessarily coincide. Algorithm converges quadratically under following conditions: * the solution set XS is nonempty * for some xs in XS there exist such neighbourhood N(xs) that: * vector function F(x) and its Jacobian J(x) are continuously differentiable on N * ||F(x)|| provides local error bound on N, i.e. there exists such c1, that ||F(x)||>c1*distance(x,XS) Note that these conditions are much more weaker than usual non-singularity conditions. For example, algorithm will converge for any affine function F (whether its Jacobian singular or not). REQUIREMENTS: Algorithm will request following information during its operation: * function vector F[] and Jacobian matrix at given point X * value of merit function f(x)=F[0]^2(x)+...+F[M-1]^2(x) at given point X USAGE: 1. User initializes algorithm state with NLEQCreateLM() call 2. User tunes solver parameters with NLEQSetCond(), NLEQSetStpMax() and other functions 3. User calls NLEQSolve() function which takes algorithm state and pointers (delegates, etc.) to callback functions which calculate merit function value and Jacobian. 4. User calls NLEQResults() to get solution 5. Optionally, user may call NLEQRestartFrom() to solve another problem with same parameters (N/M) but another starting point and/or another function vector. 
NLEQRestartFrom() allows to reuse already initialized structure. INPUT PARAMETERS: N - space dimension, N>1: * if provided, only leading N elements of X are used * if not provided, determined automatically from size of X M - system size X - starting point OUTPUT PARAMETERS: State - structure which stores algorithm state NOTES: 1. you may tune stopping conditions with NLEQSetCond() function 2. if target function contains exp() or other fast growing functions, and optimization algorithm makes too large steps which leads to overflow, use NLEQSetStpMax() function to bound algorithm's steps. 3. this algorithm is a slightly modified implementation of the method described in 'Levenberg-Marquardt method for constrained nonlinear equations with strong local convergence properties' by Christian Kanzow Nobuo Yamashita and Masao Fukushima and further developed in 'On the convergence of a New Levenberg-Marquardt Method' by Jin-yan Fan and Ya-Xiang Yuan. -- ALGLIB -- Copyright 20.08.2009 by Bochkanov Sergey *************************************************************************/ void nleqcreatelm(const ae_int_t n, const ae_int_t m, const real_1d_array &x, nleqstate &state) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::nleqcreatelm(n, m, const_cast(x.c_ptr()), const_cast(state.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* LEVENBERG-MARQUARDT-LIKE NONLINEAR SOLVER DESCRIPTION: This algorithm solves system of nonlinear equations F[0](x[0], ..., x[n-1]) = 0 F[1](x[0], ..., x[n-1]) = 0 ... F[M-1](x[0], ..., x[n-1]) = 0 with M/N do not necessarily coincide. Algorithm converges quadratically under following conditions: * the solution set XS is nonempty * for some xs in XS there exist such neighbourhood N(xs) that: * vector function F(x) and its Jacobian J(x) are continuously differentiable on N * ||F(x)|| provides local error bound on N, i.e. there exists such c1, that ||F(x)||>c1*distance(x,XS) Note that these conditions are much more weaker than usual non-singularity conditions. For example, algorithm will converge for any affine function F (whether its Jacobian singular or not). REQUIREMENTS: Algorithm will request following information during its operation: * function vector F[] and Jacobian matrix at given point X * value of merit function f(x)=F[0]^2(x)+...+F[M-1]^2(x) at given point X USAGE: 1. User initializes algorithm state with NLEQCreateLM() call 2. User tunes solver parameters with NLEQSetCond(), NLEQSetStpMax() and other functions 3. User calls NLEQSolve() function which takes algorithm state and pointers (delegates, etc.) to callback functions which calculate merit function value and Jacobian. 4. User calls NLEQResults() to get solution 5. Optionally, user may call NLEQRestartFrom() to solve another problem with same parameters (N/M) but another starting point and/or another function vector. NLEQRestartFrom() allows to reuse already initialized structure. INPUT PARAMETERS: N - space dimension, N>1: * if provided, only leading N elements of X are used * if not provided, determined automatically from size of X M - system size X - starting point OUTPUT PARAMETERS: State - structure which stores algorithm state NOTES: 1. you may tune stopping conditions with NLEQSetCond() function 2. 
if target function contains exp() or other fast growing functions, and optimization algorithm makes too large steps which leads to overflow, use NLEQSetStpMax() function to bound algorithm's steps. 3. this algorithm is a slightly modified implementation of the method described in 'Levenberg-Marquardt method for constrained nonlinear equations with strong local convergence properties' by Christian Kanzow Nobuo Yamashita and Masao Fukushima and further developed in 'On the convergence of a New Levenberg-Marquardt Method' by Jin-yan Fan and Ya-Xiang Yuan. -- ALGLIB -- Copyright 20.08.2009 by Bochkanov Sergey *************************************************************************/ void nleqcreatelm(const ae_int_t m, const real_1d_array &x, nleqstate &state) { alglib_impl::ae_state _alglib_env_state; ae_int_t n; n = x.length(); alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::nleqcreatelm(n, m, const_cast(x.c_ptr()), const_cast(state.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function sets stopping conditions for the nonlinear solver INPUT PARAMETERS: State - structure which stores algorithm state EpsF - >=0 The subroutine finishes its work if on k+1-th iteration the condition ||F||<=EpsF is satisfied MaxIts - maximum number of iterations. If MaxIts=0, the number of iterations is unlimited. Passing EpsF=0 and MaxIts=0 simultaneously will lead to automatic stopping criterion selection (small EpsF). NOTES: -- ALGLIB -- Copyright 20.08.2010 by Bochkanov Sergey *************************************************************************/ void nleqsetcond(const nleqstate &state, const double epsf, const ae_int_t maxits) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::nleqsetcond(const_cast(state.c_ptr()), epsf, maxits, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function turns on/off reporting. INPUT PARAMETERS: State - structure which stores algorithm state NeedXRep- whether iteration reports are needed or not If NeedXRep is True, algorithm will call rep() callback function if it is provided to NLEQSolve(). -- ALGLIB -- Copyright 20.08.2010 by Bochkanov Sergey *************************************************************************/ void nleqsetxrep(const nleqstate &state, const bool needxrep) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::nleqsetxrep(const_cast(state.c_ptr()), needxrep, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function sets maximum step length INPUT PARAMETERS: State - structure which stores algorithm state StpMax - maximum step length, >=0. Set StpMax to 0.0, if you don't want to limit step length. Use this subroutine when target function contains exp() or other fast growing functions, and algorithm makes too large steps which lead to overflow. 
This function allows us to reject steps that are too large (and therefore expose us to the possible overflow) without actually calculating function value at the x+stp*d. -- ALGLIB -- Copyright 20.08.2010 by Bochkanov Sergey *************************************************************************/ void nleqsetstpmax(const nleqstate &state, const double stpmax) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::nleqsetstpmax(const_cast(state.c_ptr()), stpmax, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function provides reverse communication interface Reverse communication interface is not documented or recommended to use. See below for functions which provide better documented API *************************************************************************/ bool nleqiteration(const nleqstate &state) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { ae_bool result = alglib_impl::nleqiteration(const_cast(state.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void nleqsolve(nleqstate &state, void (*func)(const real_1d_array &x, double &func, void *ptr), void (*jac)(const real_1d_array &x, real_1d_array &fi, real_2d_array &jac, void *ptr), void (*rep)(const real_1d_array &x, double func, void *ptr), void *ptr) { alglib_impl::ae_state _alglib_env_state; if( func==NULL ) throw ap_error("ALGLIB: error in 'nleqsolve()' (func is NULL)"); if( jac==NULL ) throw ap_error("ALGLIB: error in 'nleqsolve()' (jac is NULL)"); alglib_impl::ae_state_init(&_alglib_env_state); try { while( alglib_impl::nleqiteration(state.c_ptr(), &_alglib_env_state) ) { if( state.needf ) { func(state.x, state.f, ptr); continue; } if( state.needfij ) { jac(state.x, state.fi, state.j, ptr); continue; } if( state.xupdated ) { if( rep!=NULL ) rep(state.x, state.f, ptr); continue; } throw ap_error("ALGLIB: error in 'nleqsolve' (some derivatives were not provided?)"); } alglib_impl::ae_state_clear(&_alglib_env_state); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* NLEQ solver results INPUT PARAMETERS: State - algorithm state. OUTPUT PARAMETERS: X - array[0..N-1], solution Rep - optimization report: * Rep.TerminationType completetion code: * -4 ERROR: algorithm has converged to the stationary point Xf which is local minimum of f=F[0]^2+...+F[m-1]^2, but is not solution of nonlinear system. * 1 sqrt(f)<=EpsF. 
* 5 MaxIts steps was taken * 7 stopping conditions are too stringent, further improvement is impossible * Rep.IterationsCount contains iterations count * NFEV countains number of function calculations * ActiveConstraints contains number of active constraints -- ALGLIB -- Copyright 20.08.2009 by Bochkanov Sergey *************************************************************************/ void nleqresults(const nleqstate &state, real_1d_array &x, nleqreport &rep) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::nleqresults(const_cast(state.c_ptr()), const_cast(x.c_ptr()), const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* NLEQ solver results Buffered implementation of NLEQResults(), which uses pre-allocated buffer to store X[]. If buffer size is too small, it resizes buffer. It is intended to be used in the inner cycles of performance critical algorithms where array reallocation penalty is too large to be ignored. -- ALGLIB -- Copyright 20.08.2009 by Bochkanov Sergey *************************************************************************/ void nleqresultsbuf(const nleqstate &state, real_1d_array &x, nleqreport &rep) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::nleqresultsbuf(const_cast(state.c_ptr()), const_cast(x.c_ptr()), const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This subroutine restarts CG algorithm from new point. All optimization parameters are left unchanged. This function allows to solve multiple optimization problems (which must have same number of dimensions) without object reallocation penalty. INPUT PARAMETERS: State - structure used for reverse communication previously allocated with MinCGCreate call. X - new starting point. BndL - new lower bounds BndU - new upper bounds -- ALGLIB -- Copyright 30.07.2010 by Bochkanov Sergey *************************************************************************/ void nleqrestartfrom(const nleqstate &state, const real_1d_array &x) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::nleqrestartfrom(const_cast(state.c_ptr()), const_cast(x.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This object stores state of the linear CG method. You should use ALGLIB functions to work with this object. Never try to access its fields directly! 
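/*************************************************************************
A minimal self-contained sketch of the NLEQ usage scheme described above.
The two-equation system (F0=x0^2+x1^2-4, F1=x0-x1), the callback names
demo_func()/demo_jac() and nleq_demo() are illustrative only; the callback
signatures are the ones expected by nleqsolve().
*************************************************************************/
#include <cstdio>
#include "solvers.h"

// merit function f(x) = F0(x)^2 + F1(x)^2, requested when state.needf is set
static void demo_func(const alglib::real_1d_array &x, double &func, void *ptr)
{
    double f0 = x[0]*x[0] + x[1]*x[1] - 4.0;
    double f1 = x[0] - x[1];
    func = f0*f0 + f1*f1;
}

// function vector F and Jacobian J, requested when state.needfij is set
static void demo_jac(const alglib::real_1d_array &x, alglib::real_1d_array &fi, alglib::real_2d_array &jac, void *ptr)
{
    fi[0] = x[0]*x[0] + x[1]*x[1] - 4.0;
    fi[1] = x[0] - x[1];
    jac[0][0] = 2.0*x[0];
    jac[0][1] = 2.0*x[1];
    jac[1][0] = 1.0;
    jac[1][1] = -1.0;
}

void nleq_demo()
{
    alglib::real_1d_array x0 = "[1.0,0.5]";      // starting point
    alglib::nleqstate state;
    alglib::nleqreport rep;
    alglib::real_1d_array x;
    alglib::nleqcreatelm(2, 2, x0, state);       // N=2 variables, M=2 equations
    alglib::nleqsetcond(state, 1.0E-10, 0);      // stop when ||F||<=1.0E-10
    alglib::nleqsolve(state, demo_func, demo_jac, NULL, NULL);
    alglib::nleqresults(state, x, rep);
    printf("terminationtype=%d, x=[%.6f, %.6f]\n", int(rep.terminationtype), x[0], x[1]);
}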
*************************************************************************/ _lincgstate_owner::_lincgstate_owner() { p_struct = (alglib_impl::lincgstate*)alglib_impl::ae_malloc(sizeof(alglib_impl::lincgstate), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_lincgstate_init(p_struct, NULL); } _lincgstate_owner::_lincgstate_owner(const _lincgstate_owner &rhs) { p_struct = (alglib_impl::lincgstate*)alglib_impl::ae_malloc(sizeof(alglib_impl::lincgstate), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_lincgstate_init_copy(p_struct, const_cast(rhs.p_struct), NULL); } _lincgstate_owner& _lincgstate_owner::operator=(const _lincgstate_owner &rhs) { if( this==&rhs ) return *this; alglib_impl::_lincgstate_clear(p_struct); alglib_impl::_lincgstate_init_copy(p_struct, const_cast(rhs.p_struct), NULL); return *this; } _lincgstate_owner::~_lincgstate_owner() { alglib_impl::_lincgstate_clear(p_struct); ae_free(p_struct); } alglib_impl::lincgstate* _lincgstate_owner::c_ptr() { return p_struct; } alglib_impl::lincgstate* _lincgstate_owner::c_ptr() const { return const_cast(p_struct); } lincgstate::lincgstate() : _lincgstate_owner() { } lincgstate::lincgstate(const lincgstate &rhs):_lincgstate_owner(rhs) { } lincgstate& lincgstate::operator=(const lincgstate &rhs) { if( this==&rhs ) return *this; _lincgstate_owner::operator=(rhs); return *this; } lincgstate::~lincgstate() { } /************************************************************************* *************************************************************************/ _lincgreport_owner::_lincgreport_owner() { p_struct = (alglib_impl::lincgreport*)alglib_impl::ae_malloc(sizeof(alglib_impl::lincgreport), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_lincgreport_init(p_struct, NULL); } _lincgreport_owner::_lincgreport_owner(const _lincgreport_owner &rhs) { p_struct = (alglib_impl::lincgreport*)alglib_impl::ae_malloc(sizeof(alglib_impl::lincgreport), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_lincgreport_init_copy(p_struct, const_cast(rhs.p_struct), NULL); } _lincgreport_owner& _lincgreport_owner::operator=(const _lincgreport_owner &rhs) { if( this==&rhs ) return *this; alglib_impl::_lincgreport_clear(p_struct); alglib_impl::_lincgreport_init_copy(p_struct, const_cast(rhs.p_struct), NULL); return *this; } _lincgreport_owner::~_lincgreport_owner() { alglib_impl::_lincgreport_clear(p_struct); ae_free(p_struct); } alglib_impl::lincgreport* _lincgreport_owner::c_ptr() { return p_struct; } alglib_impl::lincgreport* _lincgreport_owner::c_ptr() const { return const_cast(p_struct); } lincgreport::lincgreport() : _lincgreport_owner() ,iterationscount(p_struct->iterationscount),nmv(p_struct->nmv),terminationtype(p_struct->terminationtype),r2(p_struct->r2) { } lincgreport::lincgreport(const lincgreport &rhs):_lincgreport_owner(rhs) ,iterationscount(p_struct->iterationscount),nmv(p_struct->nmv),terminationtype(p_struct->terminationtype),r2(p_struct->r2) { } lincgreport& lincgreport::operator=(const lincgreport &rhs) { if( this==&rhs ) return *this; _lincgreport_owner::operator=(rhs); return *this; } lincgreport::~lincgreport() { } /************************************************************************* This function initializes linear CG Solver. This solver is used to solve symmetric positive definite problems. If you want to solve nonsymmetric (or non-positive definite) problem you may use LinLSQR solver provided by ALGLIB. 
USAGE: 1. User initializes algorithm state with LinCGCreate() call 2. User tunes solver parameters with LinCGSetCond() and other functions 3. Optionally, user sets starting point with LinCGSetStartingPoint() 4. User calls LinCGSolveSparse() function which takes algorithm state and SparseMatrix object. 5. User calls LinCGResults() to get solution 6. Optionally, user may call LinCGSolveSparse() again to solve another problem with different matrix and/or right part without reinitializing LinCGState structure. INPUT PARAMETERS: N - problem dimension, N>0 OUTPUT PARAMETERS: State - structure which stores algorithm state -- ALGLIB -- Copyright 14.11.2011 by Bochkanov Sergey *************************************************************************/ void lincgcreate(const ae_int_t n, lincgstate &state) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::lincgcreate(n, const_cast(state.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function sets starting point. By default, zero starting point is used. INPUT PARAMETERS: X - starting point, array[N] OUTPUT PARAMETERS: State - structure which stores algorithm state -- ALGLIB -- Copyright 14.11.2011 by Bochkanov Sergey *************************************************************************/ void lincgsetstartingpoint(const lincgstate &state, const real_1d_array &x) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::lincgsetstartingpoint(const_cast(state.c_ptr()), const_cast(x.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function changes preconditioning settings of LinCGSolveSparse() function. By default, SolveSparse() uses diagonal preconditioner, but if you want to use solver without preconditioning, you can call this function which forces solver to use unit matrix for preconditioning. INPUT PARAMETERS: State - structure which stores algorithm state -- ALGLIB -- Copyright 19.11.2012 by Bochkanov Sergey *************************************************************************/ void lincgsetprecunit(const lincgstate &state) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::lincgsetprecunit(const_cast(state.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function changes preconditioning settings of LinCGSolveSparse() function. LinCGSolveSparse() will use diagonal of the system matrix as preconditioner. This preconditioning mode is active by default. 
INPUT PARAMETERS: State - structure which stores algorithm state -- ALGLIB -- Copyright 19.11.2012 by Bochkanov Sergey *************************************************************************/ void lincgsetprecdiag(const lincgstate &state) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::lincgsetprecdiag(const_cast(state.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function sets stopping criteria. INPUT PARAMETERS: EpsF - algorithm will be stopped if norm of residual is less than EpsF*||b||. MaxIts - algorithm will be stopped if number of iterations is more than MaxIts. OUTPUT PARAMETERS: State - structure which stores algorithm state NOTES: If both EpsF and MaxIts are zero then small EpsF will be set to small value. -- ALGLIB -- Copyright 14.11.2011 by Bochkanov Sergey *************************************************************************/ void lincgsetcond(const lincgstate &state, const double epsf, const ae_int_t maxits) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::lincgsetcond(const_cast(state.c_ptr()), epsf, maxits, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Procedure for solution of A*x=b with sparse A. INPUT PARAMETERS: State - algorithm state A - sparse matrix in the CRS format (you MUST contvert it to CRS format by calling SparseConvertToCRS() function). IsUpper - whether upper or lower triangle of A is used: * IsUpper=True => only upper triangle is used and lower triangle is not referenced at all * IsUpper=False => only lower triangle is used and upper triangle is not referenced at all B - right part, array[N] RESULT: This function returns no result. You can get solution by calling LinCGResults() NOTE: this function uses lightweight preconditioning - multiplication by inverse of diag(A). If you want, you can turn preconditioning off by calling LinCGSetPrecUnit(). However, preconditioning cost is low and preconditioner is very important for solution of badly scaled problems. -- ALGLIB -- Copyright 14.11.2011 by Bochkanov Sergey *************************************************************************/ void lincgsolvesparse(const lincgstate &state, const sparsematrix &a, const bool isupper, const real_1d_array &b) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::lincgsolvesparse(const_cast(state.c_ptr()), const_cast(a.c_ptr()), isupper, const_cast(b.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* CG-solver: results. 
This function must be called after LinCGSolve INPUT PARAMETERS: State - algorithm state OUTPUT PARAMETERS: X - array[N], solution Rep - optimization report: * Rep.TerminationType completetion code: * -5 input matrix is either not positive definite, too large or too small * -4 overflow/underflow during solution (ill conditioned problem) * 1 ||residual||<=EpsF*||b|| * 5 MaxIts steps was taken * 7 rounding errors prevent further progress, best point found is returned * Rep.IterationsCount contains iterations count * NMV countains number of matrix-vector calculations -- ALGLIB -- Copyright 14.11.2011 by Bochkanov Sergey *************************************************************************/ void lincgresults(const lincgstate &state, real_1d_array &x, lincgreport &rep) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::lincgresults(const_cast(state.c_ptr()), const_cast(x.c_ptr()), const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function sets restart frequency. By default, algorithm is restarted after N subsequent iterations. -- ALGLIB -- Copyright 14.11.2011 by Bochkanov Sergey *************************************************************************/ void lincgsetrestartfreq(const lincgstate &state, const ae_int_t srf) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::lincgsetrestartfreq(const_cast(state.c_ptr()), srf, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function sets frequency of residual recalculations. Algorithm updates residual r_k using iterative formula, but recalculates it from scratch after each 10 iterations. It is done to avoid accumulation of numerical errors and to stop algorithm when r_k starts to grow. Such low update frequence (1/10) gives very little overhead, but makes algorithm a bit more robust against numerical errors. However, you may change it INPUT PARAMETERS: Freq - desired update frequency, Freq>=0. Zero value means that no updates will be done. -- ALGLIB -- Copyright 14.11.2011 by Bochkanov Sergey *************************************************************************/ void lincgsetrupdatefreq(const lincgstate &state, const ae_int_t freq) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::lincgsetrupdatefreq(const_cast(state.c_ptr()), freq, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function turns on/off reporting. INPUT PARAMETERS: State - structure which stores algorithm state NeedXRep- whether iteration reports are needed or not If NeedXRep is True, algorithm will call rep() callback function if it is provided to MinCGOptimize(). 
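A hedged continuation of the LinCG sketch shown earlier, illustrating the tuning calls documented in this group (restart and residual-recalculation frequencies) and step 6 of the USAGE scheme: the same lincgstate is reused for a second right part without reinitialization. The fragment assumes the variables s and a from the previous sketch, and report field names are assumed to follow the usual lower-case C++ convention.

// continues the sketch above: s and a (in CRS format) were set up earlier
lincgsetrestartfreq(s, 3);            // restart CG every 3 iterations (default is N)
lincgsetrupdatefreq(s, 5);            // recompute residual from scratch every 5 iterations
real_1d_array b2 = "[4,6,8]", x2;
lincgreport rep2;
lincgsolvesparse(s, a, true, b2);     // same state, same matrix, new right part
lincgresults(s, x2, rep2);
bool converged = rep2.terminationtype>0;   // completion codes are listed above;
                                           // rep2.iterationscount holds the iteration count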
-- ALGLIB -- Copyright 14.11.2011 by Bochkanov Sergey *************************************************************************/ void lincgsetxrep(const lincgstate &state, const bool needxrep) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::lincgsetxrep(const_cast(state.c_ptr()), needxrep, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } } ///////////////////////////////////////////////////////////////////////// // // THIS SECTION CONTAINS IMPLEMENTATION OF COMPUTATIONAL CORE // ///////////////////////////////////////////////////////////////////////// namespace alglib_impl { static void densesolver_rmatrixlusolveinternal(/* Real */ ae_matrix* lua, /* Integer */ ae_vector* p, ae_int_t n, /* Real */ ae_matrix* a, ae_bool havea, /* Real */ ae_matrix* b, ae_int_t m, ae_int_t* info, densesolverreport* rep, /* Real */ ae_matrix* x, ae_state *_state); static void densesolver_spdmatrixcholeskysolveinternal(/* Real */ ae_matrix* cha, ae_int_t n, ae_bool isupper, /* Real */ ae_matrix* a, ae_bool havea, /* Real */ ae_matrix* b, ae_int_t m, ae_int_t* info, densesolverreport* rep, /* Real */ ae_matrix* x, ae_state *_state); static void densesolver_cmatrixlusolveinternal(/* Complex */ ae_matrix* lua, /* Integer */ ae_vector* p, ae_int_t n, /* Complex */ ae_matrix* a, ae_bool havea, /* Complex */ ae_matrix* b, ae_int_t m, ae_int_t* info, densesolverreport* rep, /* Complex */ ae_matrix* x, ae_state *_state); static void densesolver_hpdmatrixcholeskysolveinternal(/* Complex */ ae_matrix* cha, ae_int_t n, ae_bool isupper, /* Complex */ ae_matrix* a, ae_bool havea, /* Complex */ ae_matrix* b, ae_int_t m, ae_int_t* info, densesolverreport* rep, /* Complex */ ae_matrix* x, ae_state *_state); static ae_int_t densesolver_densesolverrfsmax(ae_int_t n, double r1, double rinf, ae_state *_state); static ae_int_t densesolver_densesolverrfsmaxv2(ae_int_t n, double r2, ae_state *_state); static void densesolver_rbasiclusolve(/* Real */ ae_matrix* lua, /* Integer */ ae_vector* p, ae_int_t n, /* Real */ ae_vector* xb, ae_state *_state); static void densesolver_spdbasiccholeskysolve(/* Real */ ae_matrix* cha, ae_int_t n, ae_bool isupper, /* Real */ ae_vector* xb, ae_state *_state); static void densesolver_cbasiclusolve(/* Complex */ ae_matrix* lua, /* Integer */ ae_vector* p, ae_int_t n, /* Complex */ ae_vector* xb, ae_state *_state); static void densesolver_hpdbasiccholeskysolve(/* Complex */ ae_matrix* cha, ae_int_t n, ae_bool isupper, /* Complex */ ae_vector* xb, ae_state *_state); static double linlsqr_atol = 1.0E-6; static double linlsqr_btol = 1.0E-6; static void linlsqr_clearrfields(linlsqrstate* state, ae_state *_state); static void nleq_clearrequestfields(nleqstate* state, ae_state *_state); static ae_bool nleq_increaselambda(double* lambdav, double* nu, double lambdaup, ae_state *_state); static void nleq_decreaselambda(double* lambdav, double* nu, double lambdadown, ae_state *_state); static double lincg_defaultprecision = 1.0E-6; static void lincg_clearrfields(lincgstate* state, ae_state *_state); static void lincg_updateitersdata(lincgstate* state, ae_state *_state); /************************************************************************* Dense solver for A*x=b with N*N real matrix A and N*1 real vectorx x and b. This is "slow-but-feature rich" version of the linear solver. Faster version is RMatrixSolveFast() function. 
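For orientation, a hedged sketch of how the two variants just mentioned are typically invoked through the public C++ wrappers; the real_2d_array/real_1d_array based signatures are assumed to mirror the computational-core functions listed below, and the 2x2 system is purely illustrative.

#include "solvers.h"
using namespace alglib;

int main()
{
    real_2d_array a = "[[2,1],[1,3]]";
    real_1d_array b = "[3,4]";
    ae_int_t info;

    // feature-rich variant: condition number estimation and iterative refinement
    densesolverreport rep;
    real_1d_array x;
    rmatrixsolve(a, 2, b, info, rep, x);      // info==1, x close to [1,1], rep.r1/rep.rinf hold condition numbers

    // fast variant: no extra checks, right part is overwritten by the solution
    real_1d_array b2 = "[3,4]";
    rmatrixsolvefast(a, 2, b2, info);
    return 0;
}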
Algorithm features: * automatic detection of degenerate cases * condition number estimation * iterative refinement * O(N^3) complexity IMPORTANT: ! this function is NOT the most efficient linear solver provided ! by ALGLIB. It estimates condition number of linear system ! and performs iterative refinement, which results in ! significant performance penalty when compared with "fast" ! version which just performs LU decomposition and calls ! triangular solver. ! ! This performance penalty is especially visible in the ! multithreaded mode, because both condition number estimation ! and iterative refinement are inherently sequential ! calculations. It also very significant on small matrices. ! ! Thus, if you need high performance and if you are pretty sure ! that your system is well conditioned, we strongly recommend ! you to use faster solver, RMatrixSolveFast() function. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that LU decomposition is harder to ! parallelize than, say, matrix-matrix product - this algorithm has ! many internal synchronization points which can not be avoided. However ! parallelism starts to be profitable starting from N=1024, achieving ! near-linear speedup for N=4096 or higher. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS A - array[0..N-1,0..N-1], system matrix N - size of A B - array[0..N-1], right part OUTPUT PARAMETERS Info - return code: * -3 matrix is very badly conditioned or exactly singular. * -1 N<=0 was passed * 1 task is solved (but matrix A may be ill-conditioned, check R1/RInf parameters for condition numbers). 
Rep - additional report, following fields are set: * rep.r1 condition number in 1-norm * rep.rinf condition number in inf-norm X - array[N], it contains: * info>0 => solution * info=-3 => filled by zeros -- ALGLIB -- Copyright 27.01.2010 by Bochkanov Sergey *************************************************************************/ void rmatrixsolve(/* Real */ ae_matrix* a, ae_int_t n, /* Real */ ae_vector* b, ae_int_t* info, densesolverreport* rep, /* Real */ ae_vector* x, ae_state *_state) { ae_frame _frame_block; ae_matrix bm; ae_matrix xm; ae_frame_make(_state, &_frame_block); *info = 0; _densesolverreport_clear(rep); ae_vector_clear(x); ae_matrix_init(&bm, 0, 0, DT_REAL, _state); ae_matrix_init(&xm, 0, 0, DT_REAL, _state); if( n<=0 ) { *info = -1; ae_frame_leave(_state); return; } ae_matrix_set_length(&bm, n, 1, _state); ae_v_move(&bm.ptr.pp_double[0][0], bm.stride, &b->ptr.p_double[0], 1, ae_v_len(0,n-1)); rmatrixsolvem(a, n, &bm, 1, ae_true, info, rep, &xm, _state); ae_vector_set_length(x, n, _state); ae_v_move(&x->ptr.p_double[0], 1, &xm.ptr.pp_double[0][0], xm.stride, ae_v_len(0,n-1)); ae_frame_leave(_state); } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. *************************************************************************/ void _pexec_rmatrixsolve(/* Real */ ae_matrix* a, ae_int_t n, /* Real */ ae_vector* b, ae_int_t* info, densesolverreport* rep, /* Real */ ae_vector* x, ae_state *_state) { rmatrixsolve(a,n,b,info,rep,x, _state); } /************************************************************************* Dense solver. This subroutine solves a system A*x=b, where A is NxN non-denegerate real matrix, x and b are vectors. This is a "fast" version of linear solver which does NOT provide any additional functions like condition number estimation or iterative refinement. Algorithm features: * efficient algorithm O(N^3) complexity * no performance overhead from additional functionality If you need condition number estimation or iterative refinement, use more feature-rich version - RMatrixSolve(). COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that LU decomposition is harder to ! parallelize than, say, matrix-matrix product - this algorithm has ! many internal synchronization points which can not be avoided. However ! parallelism starts to be profitable starting from N=1024, achieving ! near-linear speedup for N=4096 or higher. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! 
* call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS A - array[0..N-1,0..N-1], system matrix N - size of A B - array[0..N-1], right part OUTPUT PARAMETERS Info - return code: * -3 matrix is exactly singular (ill conditioned matrices are not recognized). * -1 N<=0 was passed * 1 task is solved B - array[N]: * info>0 => overwritten by solution * info=-3 => filled by zeros -- ALGLIB -- Copyright 16.03.2015 by Bochkanov Sergey *************************************************************************/ void rmatrixsolvefast(/* Real */ ae_matrix* a, ae_int_t n, /* Real */ ae_vector* b, ae_int_t* info, ae_state *_state) { ae_frame _frame_block; ae_matrix _a; ae_int_t i; ae_int_t j; ae_vector p; ae_frame_make(_state, &_frame_block); ae_matrix_init_copy(&_a, a, _state); a = &_a; *info = 0; ae_vector_init(&p, 0, DT_INT, _state); if( n<=0 ) { *info = -1; ae_frame_leave(_state); return; } rmatrixlu(a, n, n, &p, _state); for(i=0; i<=n-1; i++) { if( ae_fp_eq(a->ptr.pp_double[i][i],(double)(0)) ) { for(j=0; j<=n-1; j++) { b->ptr.p_double[j] = 0.0; } *info = -3; ae_frame_leave(_state); return; } } densesolver_rbasiclusolve(a, &p, n, b, _state); *info = 1; ae_frame_leave(_state); } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. *************************************************************************/ void _pexec_rmatrixsolvefast(/* Real */ ae_matrix* a, ae_int_t n, /* Real */ ae_vector* b, ae_int_t* info, ae_state *_state) { rmatrixsolvefast(a,n,b,info, _state); } /************************************************************************* Dense solver. Similar to RMatrixSolve() but solves task with multiple right parts (where b and x are NxM matrices). This is "slow-but-robust" version of linear solver with additional functionality like condition number estimation. There also exists faster version - RMatrixSolveMFast(). Algorithm features: * automatic detection of degenerate cases * condition number estimation * optional iterative refinement * O(N^3+M*N^2) complexity IMPORTANT: ! this function is NOT the most efficient linear solver provided ! by ALGLIB. It estimates condition number of linear system ! and performs iterative refinement, which results in ! significant performance penalty when compared with "fast" ! version which just performs LU decomposition and calls ! triangular solver. ! ! This performance penalty is especially visible in the ! multithreaded mode, because both condition number estimation ! and iterative refinement are inherently sequential ! calculations. It also very significant on small matrices. ! ! Thus, if you need high performance and if you are pretty sure ! that your system is well conditioned, we strongly recommend ! you to use faster solver, RMatrixSolveMFast() function. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! 
problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that LU decomposition is harder to ! parallelize than, say, matrix-matrix product - this algorithm has ! many internal synchronization points which can not be avoided. However ! parallelism starts to be profitable starting from N=1024, achieving ! near-linear speedup for N=4096 or higher. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS A - array[0..N-1,0..N-1], system matrix N - size of A B - array[0..N-1,0..M-1], right part M - right part size RFS - iterative refinement switch: * True - refinement is used. Less performance, more precision. * False - refinement is not used. More performance, less precision. OUTPUT PARAMETERS Info - return code: * -3 A is ill conditioned or singular. X is filled by zeros in such cases. * -1 N<=0 was passed * 1 task is solved (but matrix A may be ill-conditioned, check R1/RInf parameters for condition numbers). Rep - additional report, following fields are set: * rep.r1 condition number in 1-norm * rep.rinf condition number in inf-norm X - array[N], it contains: * info>0 => solution * info=-3 => filled by zeros -- ALGLIB -- Copyright 27.01.2010 by Bochkanov Sergey *************************************************************************/ void rmatrixsolvem(/* Real */ ae_matrix* a, ae_int_t n, /* Real */ ae_matrix* b, ae_int_t m, ae_bool rfs, ae_int_t* info, densesolverreport* rep, /* Real */ ae_matrix* x, ae_state *_state) { ae_frame _frame_block; ae_matrix da; ae_matrix emptya; ae_vector p; ae_int_t i; ae_frame_make(_state, &_frame_block); *info = 0; _densesolverreport_clear(rep); ae_matrix_clear(x); ae_matrix_init(&da, 0, 0, DT_REAL, _state); ae_matrix_init(&emptya, 0, 0, DT_REAL, _state); ae_vector_init(&p, 0, DT_INT, _state); /* * prepare: check inputs, allocate space... */ if( n<=0||m<=0 ) { *info = -1; ae_frame_leave(_state); return; } ae_matrix_set_length(&da, n, n, _state); /* * 1. factorize matrix * 3. solve */ for(i=0; i<=n-1; i++) { ae_v_move(&da.ptr.pp_double[i][0], 1, &a->ptr.pp_double[i][0], 1, ae_v_len(0,n-1)); } rmatrixlu(&da, n, n, &p, _state); if( rfs ) { densesolver_rmatrixlusolveinternal(&da, &p, n, a, ae_true, b, m, info, rep, x, _state); } else { densesolver_rmatrixlusolveinternal(&da, &p, n, &emptya, ae_false, b, m, info, rep, x, _state); } ae_frame_leave(_state); } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. 
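A hedged usage sketch of the multiple-right-part solver through the public C++ wrapper (signature assumed to mirror rmatrixsolvem above); in the commercial edition the smp_-prefixed counterpart described above can be substituted to enable multithreading.

#include "solvers.h"
using namespace alglib;

int main()
{
    real_2d_array a = "[[2,1],[1,3]]";
    real_2d_array b = "[[3,1],[4,0]]";       // two right parts stored column-wise (N x M, M=2)
    real_2d_array x;
    densesolverreport rep;
    ae_int_t info;
    rmatrixsolvem(a, 2, b, 2, true, info, rep, x);    // rfs=true enables iterative refinement
    // commercial edition only: smp_rmatrixsolvem(a, 2, b, 2, true, info, rep, x);
    return 0;
}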
*************************************************************************/ void _pexec_rmatrixsolvem(/* Real */ ae_matrix* a, ae_int_t n, /* Real */ ae_matrix* b, ae_int_t m, ae_bool rfs, ae_int_t* info, densesolverreport* rep, /* Real */ ae_matrix* x, ae_state *_state) { rmatrixsolvem(a,n,b,m,rfs,info,rep,x, _state); } /************************************************************************* Dense solver. Similar to RMatrixSolve() but solves task with multiple right parts (where b and x are NxM matrices). This is "fast" version of linear solver which does NOT offer additional functions like condition number estimation or iterative refinement. Algorithm features: * O(N^3+M*N^2) complexity * no additional functionality, highest performance COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that LU decomposition is harder to ! parallelize than, say, matrix-matrix product - this algorithm has ! many internal synchronization points which can not be avoided. However ! parallelism starts to be profitable starting from N=1024, achieving ! near-linear speedup for N=4096 or higher. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS A - array[0..N-1,0..N-1], system matrix N - size of A B - array[0..N-1,0..M-1], right part M - right part size RFS - iterative refinement switch: * True - refinement is used. Less performance, more precision. * False - refinement is not used. More performance, less precision. OUTPUT PARAMETERS Info - return code: * -3 matrix is exactly singular (ill conditioned matrices are not recognized). X is filled by zeros in such cases. 
* -1 N<=0 was passed * 1 task is solved Rep - additional report, following fields are set: * rep.r1 condition number in 1-norm * rep.rinf condition number in inf-norm B - array[N]: * info>0 => overwritten by solution * info=-3 => filled by zeros -- ALGLIB -- Copyright 27.01.2010 by Bochkanov Sergey *************************************************************************/ void rmatrixsolvemfast(/* Real */ ae_matrix* a, ae_int_t n, /* Real */ ae_matrix* b, ae_int_t m, ae_int_t* info, ae_state *_state) { ae_frame _frame_block; ae_matrix _a; double v; ae_int_t i; ae_int_t j; ae_int_t k; ae_vector p; ae_frame_make(_state, &_frame_block); ae_matrix_init_copy(&_a, a, _state); a = &_a; *info = 0; ae_vector_init(&p, 0, DT_INT, _state); /* * Check for exact degeneracy */ if( n<=0||m<=0 ) { *info = -1; ae_frame_leave(_state); return; } rmatrixlu(a, n, n, &p, _state); for(i=0; i<=n-1; i++) { if( ae_fp_eq(a->ptr.pp_double[i][i],(double)(0)) ) { for(j=0; j<=n-1; j++) { for(k=0; k<=m-1; k++) { b->ptr.pp_double[j][k] = 0.0; } } *info = -3; ae_frame_leave(_state); return; } } /* * Solve with TRSM() */ for(i=0; i<=n-1; i++) { if( p.ptr.p_int[i]!=i ) { for(j=0; j<=m-1; j++) { v = b->ptr.pp_double[i][j]; b->ptr.pp_double[i][j] = b->ptr.pp_double[p.ptr.p_int[i]][j]; b->ptr.pp_double[p.ptr.p_int[i]][j] = v; } } } rmatrixlefttrsm(n, m, a, 0, 0, ae_false, ae_true, 0, b, 0, 0, _state); rmatrixlefttrsm(n, m, a, 0, 0, ae_true, ae_false, 0, b, 0, 0, _state); *info = 1; ae_frame_leave(_state); } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. *************************************************************************/ void _pexec_rmatrixsolvemfast(/* Real */ ae_matrix* a, ae_int_t n, /* Real */ ae_matrix* b, ae_int_t m, ae_int_t* info, ae_state *_state) { rmatrixsolvemfast(a,n,b,m,info, _state); } /************************************************************************* Dense solver. This subroutine solves a system A*x=b, where A is NxN non-denegerate real matrix given by its LU decomposition, x and b are real vectors. This is "slow-but-robust" version of the linear LU-based solver. Faster version is RMatrixLUSolveFast() function. Algorithm features: * automatic detection of degenerate cases * O(N^2) complexity * condition number estimation No iterative refinement is provided because exact form of original matrix is not known to subroutine. Use RMatrixSolve or RMatrixMixedSolve if you need iterative refinement. IMPORTANT: ! this function is NOT the most efficient linear solver provided ! by ALGLIB. It estimates condition number of linear system, ! which results in 10-15x performance penalty when compared ! with "fast" version which just calls triangular solver. ! ! This performance penalty is insignificant when compared with ! cost of large LU decomposition. However, if you call this ! function many times for the same left side, this overhead ! BECOMES significant. It also becomes significant for small- ! scale problems. ! ! In such cases we strongly recommend you to use faster solver, ! RMatrixLUSolveFast() function. INPUT PARAMETERS LUA - array[N,N], LU decomposition, RMatrixLU result P - array[N], pivots array, RMatrixLU result N - size of A B - array[N], right part OUTPUT PARAMETERS Info - return code: * -3 matrix is very badly conditioned or exactly singular. * -1 N<=0 was passed * 1 task is solved (but matrix A may be ill-conditioned, check R1/RInf parameters for condition numbers). 
Rep - additional report, following fields are set: * rep.r1 condition number in 1-norm * rep.rinf condition number in inf-norm X - array[N], it contains: * info>0 => solution * info=-3 => filled by zeros -- ALGLIB -- Copyright 27.01.2010 by Bochkanov Sergey *************************************************************************/ void rmatrixlusolve(/* Real */ ae_matrix* lua, /* Integer */ ae_vector* p, ae_int_t n, /* Real */ ae_vector* b, ae_int_t* info, densesolverreport* rep, /* Real */ ae_vector* x, ae_state *_state) { ae_frame _frame_block; ae_matrix bm; ae_matrix xm; ae_frame_make(_state, &_frame_block); *info = 0; _densesolverreport_clear(rep); ae_vector_clear(x); ae_matrix_init(&bm, 0, 0, DT_REAL, _state); ae_matrix_init(&xm, 0, 0, DT_REAL, _state); if( n<=0 ) { *info = -1; ae_frame_leave(_state); return; } ae_matrix_set_length(&bm, n, 1, _state); ae_v_move(&bm.ptr.pp_double[0][0], bm.stride, &b->ptr.p_double[0], 1, ae_v_len(0,n-1)); rmatrixlusolvem(lua, p, n, &bm, 1, info, rep, &xm, _state); ae_vector_set_length(x, n, _state); ae_v_move(&x->ptr.p_double[0], 1, &xm.ptr.pp_double[0][0], xm.stride, ae_v_len(0,n-1)); ae_frame_leave(_state); } /************************************************************************* Dense solver. This subroutine solves a system A*x=b, where A is NxN non-denegerate real matrix given by its LU decomposition, x and b are real vectors. This is "fast-without-any-checks" version of the linear LU-based solver. Slower but more robust version is RMatrixLUSolve() function. Algorithm features: * O(N^2) complexity * fast algorithm without ANY additional checks, just triangular solver INPUT PARAMETERS LUA - array[0..N-1,0..N-1], LU decomposition, RMatrixLU result P - array[0..N-1], pivots array, RMatrixLU result N - size of A B - array[0..N-1], right part OUTPUT PARAMETERS Info - return code: * -3 matrix is exactly singular (ill conditioned matrices are not recognized). X is filled by zeros in such cases. * -1 N<=0 was passed * 1 task is solved B - array[N]: * info>0 => overwritten by solution * info=-3 => filled by zeros -- ALGLIB -- Copyright 18.03.2015 by Bochkanov Sergey *************************************************************************/ void rmatrixlusolvefast(/* Real */ ae_matrix* lua, /* Integer */ ae_vector* p, ae_int_t n, /* Real */ ae_vector* b, ae_int_t* info, ae_state *_state) { ae_int_t i; ae_int_t j; *info = 0; if( n<=0 ) { *info = -1; return; } for(i=0; i<=n-1; i++) { if( ae_fp_eq(lua->ptr.pp_double[i][i],(double)(0)) ) { for(j=0; j<=n-1; j++) { b->ptr.p_double[j] = 0.0; } *info = -3; return; } } densesolver_rbasiclusolve(lua, p, n, b, _state); *info = 1; } /************************************************************************* Dense solver. Similar to RMatrixLUSolve() but solves task with multiple right parts (where b and x are NxM matrices). This is "robust-but-slow" version of LU-based solver which performs additional checks for non-degeneracy of inputs (condition number estimation). If you need best performance, use "fast-without-any-checks" version, RMatrixLUSolveMFast(). Algorithm features: * automatic detection of degenerate cases * O(M*N^2) complexity * condition number estimation No iterative refinement is provided because exact form of original matrix is not known to subroutine. Use RMatrixSolve or RMatrixMixedSolve if you need iterative refinement. IMPORTANT: ! this function is NOT the most efficient linear solver provided ! by ALGLIB. It estimates condition number of linear system, ! 
which results in significant performance penalty when ! compared with "fast" version which just calls triangular ! solver. ! ! This performance penalty is especially apparent when you use ! ALGLIB parallel capabilities (condition number estimation is ! inherently sequential). It also becomes significant for ! small-scale problems. ! ! In such cases we strongly recommend you to use faster solver, ! RMatrixLUSolveMFast() function. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. Triangular solver is relatively easy to parallelize. ! However, parallelization will be efficient only for large number of ! right parts M. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS LUA - array[N,N], LU decomposition, RMatrixLU result P - array[N], pivots array, RMatrixLU result N - size of A B - array[0..N-1,0..M-1], right part M - right part size OUTPUT PARAMETERS Info - return code: * -3 matrix is very badly conditioned or exactly singular. X is filled by zeros in such cases. * -1 N<=0 was passed * 1 task is solved (but matrix A may be ill-conditioned, check R1/RInf parameters for condition numbers). Rep - additional report, following fields are set: * rep.r1 condition number in 1-norm * rep.rinf condition number in inf-norm X - array[N,M], it contains: * info>0 => solution * info=-3 => filled by zeros -- ALGLIB -- Copyright 27.01.2010 by Bochkanov Sergey *************************************************************************/ void rmatrixlusolvem(/* Real */ ae_matrix* lua, /* Integer */ ae_vector* p, ae_int_t n, /* Real */ ae_matrix* b, ae_int_t m, ae_int_t* info, densesolverreport* rep, /* Real */ ae_matrix* x, ae_state *_state) { ae_frame _frame_block; ae_matrix emptya; ae_frame_make(_state, &_frame_block); *info = 0; _densesolverreport_clear(rep); ae_matrix_clear(x); ae_matrix_init(&emptya, 0, 0, DT_REAL, _state); /* * prepare: check inputs, allocate space... */ if( n<=0||m<=0 ) { *info = -1; ae_frame_leave(_state); return; } /* * solve */ densesolver_rmatrixlusolveinternal(lua, p, n, &emptya, ae_false, b, m, info, rep, x, _state); ae_frame_leave(_state); } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. 
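A hedged sketch of the factor-once / solve-many pattern that the LU-based solvers above are designed for, written against the public C++ wrappers; rmatrixlu() is assumed from the LinAlg package, and the wrapper signatures are assumed to mirror the computational-core functions documented here.

#include "linalg.h"
#include "solvers.h"
using namespace alglib;

int main()
{
    // Factor A once, then reuse the LU decomposition for several right parts.
    real_2d_array lua = "[[2,1],[1,3]]";   // overwritten in-place by the LU factors
    integer_1d_array p;
    rmatrixlu(lua, 2, 2, p);               // rmatrixlu() is assumed from the LinAlg package

    ae_int_t info;
    densesolverreport rep;
    real_1d_array x;
    real_1d_array b1 = "[3,4]", b2 = "[1,0]";
    rmatrixlusolve(lua, p, 2, b1, info, rep, x);   // robust O(N^2) solve, condition numbers in rep
    rmatrixlusolve(lua, p, 2, b2, info, rep, x);   // same factorization, different right part
    return 0;
}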
*************************************************************************/ void _pexec_rmatrixlusolvem(/* Real */ ae_matrix* lua, /* Integer */ ae_vector* p, ae_int_t n, /* Real */ ae_matrix* b, ae_int_t m, ae_int_t* info, densesolverreport* rep, /* Real */ ae_matrix* x, ae_state *_state) { rmatrixlusolvem(lua,p,n,b,m,info,rep,x, _state); } /************************************************************************* Dense solver. Similar to RMatrixLUSolve() but solves task with multiple right parts, where b and x are NxM matrices. This is "fast-without-any-checks" version of LU-based solver. It does not estimate condition number of a system, so it is extremely fast. If you need better detection of near-degenerate cases, use RMatrixLUSolveM() function. Algorithm features: * O(M*N^2) complexity * fast algorithm without ANY additional checks, just triangular solver COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. Triangular solver is relatively easy to parallelize. ! However, parallelization will be efficient only for large number of ! right parts M. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS: LUA - array[0..N-1,0..N-1], LU decomposition, RMatrixLU result P - array[0..N-1], pivots array, RMatrixLU result N - size of A B - array[0..N-1,0..M-1], right part M - right part size OUTPUT PARAMETERS: Info - return code: * -3 matrix is exactly singular (ill conditioned matrices are not recognized). 
* -1 N<=0 was passed * 1 task is solved B - array[N,M]: * info>0 => overwritten by solution * info=-3 => filled by zeros -- ALGLIB -- Copyright 18.03.2015 by Bochkanov Sergey *************************************************************************/ void rmatrixlusolvemfast(/* Real */ ae_matrix* lua, /* Integer */ ae_vector* p, ae_int_t n, /* Real */ ae_matrix* b, ae_int_t m, ae_int_t* info, ae_state *_state) { double v; ae_int_t i; ae_int_t j; ae_int_t k; *info = 0; /* * Check for exact degeneracy */ if( n<=0||m<=0 ) { *info = -1; return; } for(i=0; i<=n-1; i++) { if( ae_fp_eq(lua->ptr.pp_double[i][i],(double)(0)) ) { for(j=0; j<=n-1; j++) { for(k=0; k<=m-1; k++) { b->ptr.pp_double[j][k] = 0.0; } } *info = -3; return; } } /* * Solve with TRSM() */ for(i=0; i<=n-1; i++) { if( p->ptr.p_int[i]!=i ) { for(j=0; j<=m-1; j++) { v = b->ptr.pp_double[i][j]; b->ptr.pp_double[i][j] = b->ptr.pp_double[p->ptr.p_int[i]][j]; b->ptr.pp_double[p->ptr.p_int[i]][j] = v; } } } rmatrixlefttrsm(n, m, lua, 0, 0, ae_false, ae_true, 0, b, 0, 0, _state); rmatrixlefttrsm(n, m, lua, 0, 0, ae_true, ae_false, 0, b, 0, 0, _state); *info = 1; } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. *************************************************************************/ void _pexec_rmatrixlusolvemfast(/* Real */ ae_matrix* lua, /* Integer */ ae_vector* p, ae_int_t n, /* Real */ ae_matrix* b, ae_int_t m, ae_int_t* info, ae_state *_state) { rmatrixlusolvemfast(lua,p,n,b,m,info, _state); } /************************************************************************* Dense solver. This subroutine solves a system A*x=b, where BOTH ORIGINAL A AND ITS LU DECOMPOSITION ARE KNOWN. You can use it if for some reasons you have both A and its LU decomposition. Algorithm features: * automatic detection of degenerate cases * condition number estimation * iterative refinement * O(N^2) complexity INPUT PARAMETERS A - array[0..N-1,0..N-1], system matrix LUA - array[0..N-1,0..N-1], LU decomposition, RMatrixLU result P - array[0..N-1], pivots array, RMatrixLU result N - size of A B - array[0..N-1], right part OUTPUT PARAMETERS Info - return code: * -3 matrix is very badly conditioned or exactly singular. * -1 N<=0 was passed * 1 task is solved (but matrix A may be ill-conditioned, check R1/RInf parameters for condition numbers). 
Rep - additional report, following fields are set: * rep.r1 condition number in 1-norm * rep.rinf condition number in inf-norm X - array[N], it contains: * info>0 => solution * info=-3 => filled by zeros -- ALGLIB -- Copyright 27.01.2010 by Bochkanov Sergey *************************************************************************/ void rmatrixmixedsolve(/* Real */ ae_matrix* a, /* Real */ ae_matrix* lua, /* Integer */ ae_vector* p, ae_int_t n, /* Real */ ae_vector* b, ae_int_t* info, densesolverreport* rep, /* Real */ ae_vector* x, ae_state *_state) { ae_frame _frame_block; ae_matrix bm; ae_matrix xm; ae_frame_make(_state, &_frame_block); *info = 0; _densesolverreport_clear(rep); ae_vector_clear(x); ae_matrix_init(&bm, 0, 0, DT_REAL, _state); ae_matrix_init(&xm, 0, 0, DT_REAL, _state); if( n<=0 ) { *info = -1; ae_frame_leave(_state); return; } ae_matrix_set_length(&bm, n, 1, _state); ae_v_move(&bm.ptr.pp_double[0][0], bm.stride, &b->ptr.p_double[0], 1, ae_v_len(0,n-1)); rmatrixmixedsolvem(a, lua, p, n, &bm, 1, info, rep, &xm, _state); ae_vector_set_length(x, n, _state); ae_v_move(&x->ptr.p_double[0], 1, &xm.ptr.pp_double[0][0], xm.stride, ae_v_len(0,n-1)); ae_frame_leave(_state); } /************************************************************************* Dense solver. Similar to RMatrixMixedSolve() but solves task with multiple right parts (where b and x are NxM matrices). Algorithm features: * automatic detection of degenerate cases * condition number estimation * iterative refinement * O(M*N^2) complexity INPUT PARAMETERS A - array[0..N-1,0..N-1], system matrix LUA - array[0..N-1,0..N-1], LU decomposition, RMatrixLU result P - array[0..N-1], pivots array, RMatrixLU result N - size of A B - array[0..N-1,0..M-1], right part M - right part size OUTPUT PARAMETERS Info - return code: * -3 matrix is very badly conditioned or exactly singular. * -1 N<=0 was passed * 1 task is solved (but matrix A may be ill-conditioned, check R1/RInf parameters for condition numbers). Rep - additional report, following fields are set: * rep.r1 condition number in 1-norm * rep.rinf condition number in inf-norm X - array[N,M], it contains: * info>0 => solution * info=-3 => filled by zeros -- ALGLIB -- Copyright 27.01.2010 by Bochkanov Sergey *************************************************************************/ void rmatrixmixedsolvem(/* Real */ ae_matrix* a, /* Real */ ae_matrix* lua, /* Integer */ ae_vector* p, ae_int_t n, /* Real */ ae_matrix* b, ae_int_t m, ae_int_t* info, densesolverreport* rep, /* Real */ ae_matrix* x, ae_state *_state) { *info = 0; _densesolverreport_clear(rep); ae_matrix_clear(x); /* * prepare: check inputs, allocate space... */ if( n<=0||m<=0 ) { *info = -1; return; } /* * solve */ densesolver_rmatrixlusolveinternal(lua, p, n, a, ae_true, b, m, info, rep, x, _state); } /************************************************************************* Complex dense solver for A*X=B with N*N complex matrix A, N*M complex matrices X and B. "Slow-but-feature-rich" version which provides additional functions, at the cost of slower performance. Faster version may be invoked with CMatrixSolveMFast() function. Algorithm features: * automatic detection of degenerate cases * condition number estimation * iterative refinement * O(N^3+M*N^2) complexity IMPORTANT: ! this function is NOT the most efficient linear solver provided ! by ALGLIB. It estimates condition number of linear system ! and performs iterative refinement, which results in ! significant performance penalty when compared with "fast" ! 
version which just performs LU decomposition and calls ! triangular solver. ! ! This performance penalty is especially visible in the ! multithreaded mode, because both condition number estimation ! and iterative refinement are inherently sequential ! calculations. ! ! Thus, if you need high performance and if you are pretty sure ! that your system is well conditioned, we strongly recommend ! you to use faster solver, CMatrixSolveMFast() function. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that LU decomposition is harder to ! parallelize than, say, matrix-matrix product - this algorithm has ! many internal synchronization points which can not be avoided. However ! parallelism starts to be profitable starting from N=1024, achieving ! near-linear speedup for N=4096 or higher. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS A - array[0..N-1,0..N-1], system matrix N - size of A B - array[0..N-1,0..M-1], right part M - right part size RFS - iterative refinement switch: * True - refinement is used. Less performance, more precision. * False - refinement is not used. More performance, less precision. OUTPUT PARAMETERS Info - return code: * -3 matrix is very badly conditioned or exactly singular. X is filled by zeros in such cases. * -1 N<=0 was passed * 1 task is solved (but matrix A may be ill-conditioned, check R1/RInf parameters for condition numbers). 
Rep - additional report, following fields are set: * rep.r1 condition number in 1-norm * rep.rinf condition number in inf-norm X - array[N,M], it contains: * info>0 => solution * info=-3 => filled by zeros -- ALGLIB -- Copyright 27.01.2010 by Bochkanov Sergey *************************************************************************/ void cmatrixsolvem(/* Complex */ ae_matrix* a, ae_int_t n, /* Complex */ ae_matrix* b, ae_int_t m, ae_bool rfs, ae_int_t* info, densesolverreport* rep, /* Complex */ ae_matrix* x, ae_state *_state) { ae_frame _frame_block; ae_matrix da; ae_matrix emptya; ae_vector p; ae_int_t i; ae_frame_make(_state, &_frame_block); *info = 0; _densesolverreport_clear(rep); ae_matrix_clear(x); ae_matrix_init(&da, 0, 0, DT_COMPLEX, _state); ae_matrix_init(&emptya, 0, 0, DT_COMPLEX, _state); ae_vector_init(&p, 0, DT_INT, _state); /* * prepare: check inputs, allocate space... */ if( n<=0||m<=0 ) { *info = -1; ae_frame_leave(_state); return; } ae_matrix_set_length(&da, n, n, _state); /* * factorize, solve */ for(i=0; i<=n-1; i++) { ae_v_cmove(&da.ptr.pp_complex[i][0], 1, &a->ptr.pp_complex[i][0], 1, "N", ae_v_len(0,n-1)); } cmatrixlu(&da, n, n, &p, _state); if( rfs ) { densesolver_cmatrixlusolveinternal(&da, &p, n, a, ae_true, b, m, info, rep, x, _state); } else { densesolver_cmatrixlusolveinternal(&da, &p, n, &emptya, ae_false, b, m, info, rep, x, _state); } ae_frame_leave(_state); } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. *************************************************************************/ void _pexec_cmatrixsolvem(/* Complex */ ae_matrix* a, ae_int_t n, /* Complex */ ae_matrix* b, ae_int_t m, ae_bool rfs, ae_int_t* info, densesolverreport* rep, /* Complex */ ae_matrix* x, ae_state *_state) { cmatrixsolvem(a,n,b,m,rfs,info,rep,x, _state); } /************************************************************************* Complex dense solver for A*X=B with N*N complex matrix A, N*M complex matrices X and B. "Fast-but-lightweight" version which provides just triangular solver - and no additional functions like iterative refinement or condition number estimation. Algorithm features: * O(N^3+M*N^2) complexity * no additional time consuming functions COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that LU decomposition is harder to ! parallelize than, say, matrix-matrix product - this algorithm has ! many internal synchronization points which can not be avoided. However ! parallelism starts to be profitable starting from N=1024, achieving ! 
near-linear speedup for N=4096 or higher. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS A - array[0..N-1,0..N-1], system matrix N - size of A B - array[0..N-1,0..M-1], right part M - right part size OUTPUT PARAMETERS: Info - return code: * -3 matrix is exactly singular (ill conditioned matrices are not recognized). * -1 N<=0 was passed * 1 task is solved B - array[N,M]: * info>0 => overwritten by solution * info=-3 => filled by zeros -- ALGLIB -- Copyright 16.03.2015 by Bochkanov Sergey *************************************************************************/ void cmatrixsolvemfast(/* Complex */ ae_matrix* a, ae_int_t n, /* Complex */ ae_matrix* b, ae_int_t m, ae_int_t* info, ae_state *_state) { ae_frame _frame_block; ae_matrix _a; ae_complex v; ae_int_t i; ae_int_t j; ae_int_t k; ae_vector p; ae_frame_make(_state, &_frame_block); ae_matrix_init_copy(&_a, a, _state); a = &_a; *info = 0; ae_vector_init(&p, 0, DT_INT, _state); /* * Check for exact degeneracy */ if( n<=0||m<=0 ) { *info = -1; ae_frame_leave(_state); return; } cmatrixlu(a, n, n, &p, _state); for(i=0; i<=n-1; i++) { if( ae_c_eq_d(a->ptr.pp_complex[i][i],(double)(0)) ) { for(j=0; j<=n-1; j++) { for(k=0; k<=m-1; k++) { b->ptr.pp_complex[j][k] = ae_complex_from_d(0.0); } } *info = -3; ae_frame_leave(_state); return; } } /* * Solve with TRSM() */ for(i=0; i<=n-1; i++) { if( p.ptr.p_int[i]!=i ) { for(j=0; j<=m-1; j++) { v = b->ptr.pp_complex[i][j]; b->ptr.pp_complex[i][j] = b->ptr.pp_complex[p.ptr.p_int[i]][j]; b->ptr.pp_complex[p.ptr.p_int[i]][j] = v; } } } cmatrixlefttrsm(n, m, a, 0, 0, ae_false, ae_true, 0, b, 0, 0, _state); cmatrixlefttrsm(n, m, a, 0, 0, ae_true, ae_false, 0, b, 0, 0, _state); *info = 1; ae_frame_leave(_state); } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. *************************************************************************/ void _pexec_cmatrixsolvemfast(/* Complex */ ae_matrix* a, ae_int_t n, /* Complex */ ae_matrix* b, ae_int_t m, ae_int_t* info, ae_state *_state) { cmatrixsolvemfast(a,n,b,m,info, _state); } /************************************************************************* Complex dense solver for A*x=B with N*N complex matrix A and N*1 complex vectors x and b. "Slow-but-feature-rich" version of the solver. Algorithm features: * automatic detection of degenerate cases * condition number estimation * iterative refinement * O(N^3) complexity IMPORTANT: ! this function is NOT the most efficient linear solver provided ! by ALGLIB. It estimates condition number of linear system ! and performs iterative refinement, which results in ! significant performance penalty when compared with "fast" ! version which just performs LU decomposition and calls ! triangular solver. ! ! This performance penalty is especially visible in the ! multithreaded mode, because both condition number estimation ! and iterative refinement are inherently sequential ! calculations. ! ! Thus, if you need high performance and if you are pretty sure ! that your system is well conditioned, we strongly recommend ! 
you to use faster solver, CMatrixSolveFast() function. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that LU decomposition is harder to ! parallelize than, say, matrix-matrix product - this algorithm has ! many internal synchronization points which can not be avoided. However ! parallelism starts to be profitable starting from N=1024, achieving ! near-linear speedup for N=4096 or higher. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS A - array[0..N-1,0..N-1], system matrix N - size of A B - array[0..N-1], right part OUTPUT PARAMETERS Info - return code: * -3 matrix is very badly conditioned or exactly singular. * -1 N<=0 was passed * 1 task is solved (but matrix A may be ill-conditioned, check R1/RInf parameters for condition numbers). Rep - additional report, following fields are set: * rep.r1 condition number in 1-norm * rep.rinf condition number in inf-norm X - array[N], it contains: * info>0 => solution * info=-3 => filled by zeros -- ALGLIB -- Copyright 27.01.2010 by Bochkanov Sergey *************************************************************************/ void cmatrixsolve(/* Complex */ ae_matrix* a, ae_int_t n, /* Complex */ ae_vector* b, ae_int_t* info, densesolverreport* rep, /* Complex */ ae_vector* x, ae_state *_state) { ae_frame _frame_block; ae_matrix bm; ae_matrix xm; ae_frame_make(_state, &_frame_block); *info = 0; _densesolverreport_clear(rep); ae_vector_clear(x); ae_matrix_init(&bm, 0, 0, DT_COMPLEX, _state); ae_matrix_init(&xm, 0, 0, DT_COMPLEX, _state); if( n<=0 ) { *info = -1; ae_frame_leave(_state); return; } ae_matrix_set_length(&bm, n, 1, _state); ae_v_cmove(&bm.ptr.pp_complex[0][0], bm.stride, &b->ptr.p_complex[0], 1, "N", ae_v_len(0,n-1)); cmatrixsolvem(a, n, &bm, 1, ae_true, info, rep, &xm, _state); ae_vector_set_length(x, n, _state); ae_v_cmove(&x->ptr.p_complex[0], 1, &xm.ptr.pp_complex[0][0], xm.stride, "N", ae_v_len(0,n-1)); ae_frame_leave(_state); } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. 
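A hedged sketch of the complex dense solver through the public C++ wrapper; the complex_2d_array/complex_1d_array setup calls (setlength(), element access, alglib::complex) are assumed from the ap subpackage, and the 2x2 Hermitian system is purely illustrative.

#include "solvers.h"
using namespace alglib;

int main()
{
    // Illustrative system A*x=b with A = [[2, i],[-i, 3]], b = [2+i, 3-i], exact solution x = (1, 1).
    complex_2d_array a;
    a.setlength(2, 2);
    a[0][0] = alglib::complex(2, 0);   a[0][1] = alglib::complex(0, 1);
    a[1][0] = alglib::complex(0, -1);  a[1][1] = alglib::complex(3, 0);
    complex_1d_array b;
    b.setlength(2);
    b[0] = alglib::complex(2, 1);
    b[1] = alglib::complex(3, -1);

    ae_int_t info;
    densesolverreport rep;
    complex_1d_array x;
    cmatrixsolve(a, 2, b, info, rep, x);   // info==1 on success; rep.r1/rep.rinf as for the real solver
    return 0;
}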
*************************************************************************/ void _pexec_cmatrixsolve(/* Complex */ ae_matrix* a, ae_int_t n, /* Complex */ ae_vector* b, ae_int_t* info, densesolverreport* rep, /* Complex */ ae_vector* x, ae_state *_state) { cmatrixsolve(a,n,b,info,rep,x, _state); } /************************************************************************* Complex dense solver for A*x=B with N*N complex matrix A and N*1 complex vectors x and b. "Fast-but-lightweight" version of the solver. Algorithm features: * O(N^3) complexity * no additional time consuming features, just triangular solver COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that LU decomposition is harder to ! parallelize than, say, matrix-matrix product - this algorithm has ! many internal synchronization points which can not be avoided. However ! parallelism starts to be profitable starting from N=1024, achieving ! near-linear speedup for N=4096 or higher. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS: A - array[0..N-1,0..N-1], system matrix N - size of A B - array[0..N-1], right part OUTPUT PARAMETERS: Info - return code: * -3 matrix is exactly singular (ill conditioned matrices are not recognized). 
* -1 N<=0 was passed * 1 task is solved B - array[N]: * info>0 => overwritten by solution * info=-3 => filled by zeros -- ALGLIB -- Copyright 27.01.2010 by Bochkanov Sergey *************************************************************************/ void cmatrixsolvefast(/* Complex */ ae_matrix* a, ae_int_t n, /* Complex */ ae_vector* b, ae_int_t* info, ae_state *_state) { ae_frame _frame_block; ae_matrix _a; ae_int_t i; ae_int_t j; ae_vector p; ae_frame_make(_state, &_frame_block); ae_matrix_init_copy(&_a, a, _state); a = &_a; *info = 0; ae_vector_init(&p, 0, DT_INT, _state); if( n<=0 ) { *info = -1; ae_frame_leave(_state); return; } cmatrixlu(a, n, n, &p, _state); for(i=0; i<=n-1; i++) { if( ae_c_eq_d(a->ptr.pp_complex[i][i],(double)(0)) ) { for(j=0; j<=n-1; j++) { b->ptr.p_complex[j] = ae_complex_from_d(0.0); } *info = -3; ae_frame_leave(_state); return; } } densesolver_cbasiclusolve(a, &p, n, b, _state); *info = 1; ae_frame_leave(_state); } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. *************************************************************************/ void _pexec_cmatrixsolvefast(/* Complex */ ae_matrix* a, ae_int_t n, /* Complex */ ae_vector* b, ae_int_t* info, ae_state *_state) { cmatrixsolvefast(a,n,b,info, _state); } /************************************************************************* Dense solver for A*X=B with N*N complex A given by its LU decomposition, and N*M matrices X and B (multiple right sides). "Slow-but-feature-rich" version of the solver. Algorithm features: * automatic detection of degenerate cases * O(M*N^2) complexity * condition number estimation No iterative refinement is provided because exact form of original matrix is not known to subroutine. Use CMatrixSolve or CMatrixMixedSolve if you need iterative refinement. IMPORTANT: ! this function is NOT the most efficient linear solver provided ! by ALGLIB. It estimates condition number of linear system, ! which results in significant performance penalty when ! compared with "fast" version which just calls triangular ! solver. ! ! This performance penalty is especially apparent when you use ! ALGLIB parallel capabilities (condition number estimation is ! inherently sequential). It also becomes significant for ! small-scale problems. ! ! In such cases we strongly recommend you to use faster solver, ! CMatrixLUSolveMFast() function. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. Triangular solver is relatively easy to parallelize. ! However, parallelization will be efficient only for large number of ! right parts M. 
! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS LUA - array[0..N-1,0..N-1], LU decomposition, RMatrixLU result P - array[0..N-1], pivots array, RMatrixLU result N - size of A B - array[0..N-1,0..M-1], right part M - right part size OUTPUT PARAMETERS Info - return code: * -3 matrix is very badly conditioned or exactly singular. * -1 N<=0 was passed * 1 task is solved (but matrix A may be ill-conditioned, check R1/RInf parameters for condition numbers). Rep - additional report, following fields are set: * rep.r1 condition number in 1-norm * rep.rinf condition number in inf-norm X - array[N,M], it contains: * info>0 => solution * info=-3 => filled by zeros -- ALGLIB -- Copyright 27.01.2010 by Bochkanov Sergey *************************************************************************/ void cmatrixlusolvem(/* Complex */ ae_matrix* lua, /* Integer */ ae_vector* p, ae_int_t n, /* Complex */ ae_matrix* b, ae_int_t m, ae_int_t* info, densesolverreport* rep, /* Complex */ ae_matrix* x, ae_state *_state) { ae_frame _frame_block; ae_matrix emptya; ae_frame_make(_state, &_frame_block); *info = 0; _densesolverreport_clear(rep); ae_matrix_clear(x); ae_matrix_init(&emptya, 0, 0, DT_COMPLEX, _state); /* * prepare: check inputs, allocate space... */ if( n<=0||m<=0 ) { *info = -1; ae_frame_leave(_state); return; } /* * solve */ densesolver_cmatrixlusolveinternal(lua, p, n, &emptya, ae_false, b, m, info, rep, x, _state); ae_frame_leave(_state); } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. *************************************************************************/ void _pexec_cmatrixlusolvem(/* Complex */ ae_matrix* lua, /* Integer */ ae_vector* p, ae_int_t n, /* Complex */ ae_matrix* b, ae_int_t m, ae_int_t* info, densesolverreport* rep, /* Complex */ ae_matrix* x, ae_state *_state) { cmatrixlusolvem(lua,p,n,b,m,info,rep,x, _state); } /************************************************************************* Dense solver for A*X=B with N*N complex A given by its LU decomposition, and N*M matrices X and B (multiple right sides). "Fast-but-lightweight" version of the solver. Algorithm features: * O(M*N^2) complexity * no additional time-consuming features COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! 
Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. Triangular solver is relatively easy to parallelize. ! However, parallelization will be efficient only for large number of ! right parts M. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS LUA - array[0..N-1,0..N-1], LU decomposition, RMatrixLU result P - array[0..N-1], pivots array, RMatrixLU result N - size of A B - array[0..N-1,0..M-1], right part M - right part size OUTPUT PARAMETERS Info - return code: * -3 matrix is exactly singular (ill conditioned matrices are not recognized). * -1 N<=0 was passed * 1 task is solved B - array[N,M]: * info>0 => overwritten by solution * info=-3 => filled by zeros -- ALGLIB -- Copyright 27.01.2010 by Bochkanov Sergey *************************************************************************/ void cmatrixlusolvemfast(/* Complex */ ae_matrix* lua, /* Integer */ ae_vector* p, ae_int_t n, /* Complex */ ae_matrix* b, ae_int_t m, ae_int_t* info, ae_state *_state) { ae_complex v; ae_int_t i; ae_int_t j; ae_int_t k; *info = 0; /* * Check for exact degeneracy */ if( n<=0||m<=0 ) { *info = -1; return; } for(i=0; i<=n-1; i++) { if( ae_c_eq_d(lua->ptr.pp_complex[i][i],(double)(0)) ) { for(j=0; j<=n-1; j++) { for(k=0; k<=m-1; k++) { b->ptr.pp_complex[j][k] = ae_complex_from_d(0.0); } } *info = -3; return; } } /* * Solve with TRSM() */ for(i=0; i<=n-1; i++) { if( p->ptr.p_int[i]!=i ) { for(j=0; j<=m-1; j++) { v = b->ptr.pp_complex[i][j]; b->ptr.pp_complex[i][j] = b->ptr.pp_complex[p->ptr.p_int[i]][j]; b->ptr.pp_complex[p->ptr.p_int[i]][j] = v; } } } cmatrixlefttrsm(n, m, lua, 0, 0, ae_false, ae_true, 0, b, 0, 0, _state); cmatrixlefttrsm(n, m, lua, 0, 0, ae_true, ae_false, 0, b, 0, 0, _state); *info = 1; } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. *************************************************************************/ void _pexec_cmatrixlusolvemfast(/* Complex */ ae_matrix* lua, /* Integer */ ae_vector* p, ae_int_t n, /* Complex */ ae_matrix* b, ae_int_t m, ae_int_t* info, ae_state *_state) { cmatrixlusolvemfast(lua,p,n,b,m,info, _state); } /************************************************************************* Complex dense linear solver for A*x=b with complex N*N A given by its LU decomposition and N*1 vectors x and b. This is "slow-but-robust" version of the complex linear solver with additional features which add significant performance overhead. Faster version is CMatrixLUSolveFast() function. Algorithm features: * automatic detection of degenerate cases * O(N^2) complexity * condition number estimation No iterative refinement is provided because exact form of original matrix is not known to subroutine. Use CMatrixSolve or CMatrixMixedSolve if you need iterative refinement. IMPORTANT: ! this function is NOT the most efficient linear solver provided ! by ALGLIB. It estimates condition number of linear system, ! which results in 10-15x performance penalty when compared ! with "fast" version which just calls triangular solver. ! ! This performance penalty is insignificant when compared with ! 
cost of large LU decomposition. However, if you call this ! function many times for the same left side, this overhead ! BECOMES significant. It also becomes significant for small- ! scale problems. ! ! In such cases we strongly recommend you to use faster solver, ! CMatrixLUSolveFast() function. INPUT PARAMETERS LUA - array[0..N-1,0..N-1], LU decomposition, CMatrixLU result P - array[0..N-1], pivots array, CMatrixLU result N - size of A B - array[0..N-1], right part OUTPUT PARAMETERS Info - return code: * -3 matrix is very badly conditioned or exactly singular. * -1 N<=0 was passed * 1 task is solved (but matrix A may be ill-conditioned, check R1/RInf parameters for condition numbers). Rep - additional report, following fields are set: * rep.r1 condition number in 1-norm * rep.rinf condition number in inf-norm X - array[N], it contains: * info>0 => solution * info=-3 => filled by zeros -- ALGLIB -- Copyright 27.01.2010 by Bochkanov Sergey *************************************************************************/ void cmatrixlusolve(/* Complex */ ae_matrix* lua, /* Integer */ ae_vector* p, ae_int_t n, /* Complex */ ae_vector* b, ae_int_t* info, densesolverreport* rep, /* Complex */ ae_vector* x, ae_state *_state) { ae_frame _frame_block; ae_matrix bm; ae_matrix xm; ae_frame_make(_state, &_frame_block); *info = 0; _densesolverreport_clear(rep); ae_vector_clear(x); ae_matrix_init(&bm, 0, 0, DT_COMPLEX, _state); ae_matrix_init(&xm, 0, 0, DT_COMPLEX, _state); if( n<=0 ) { *info = -1; ae_frame_leave(_state); return; } ae_matrix_set_length(&bm, n, 1, _state); ae_v_cmove(&bm.ptr.pp_complex[0][0], bm.stride, &b->ptr.p_complex[0], 1, "N", ae_v_len(0,n-1)); cmatrixlusolvem(lua, p, n, &bm, 1, info, rep, &xm, _state); ae_vector_set_length(x, n, _state); ae_v_cmove(&x->ptr.p_complex[0], 1, &xm.ptr.pp_complex[0][0], xm.stride, "N", ae_v_len(0,n-1)); ae_frame_leave(_state); } /************************************************************************* Complex dense linear solver for A*x=b with N*N complex A given by its LU decomposition and N*1 vectors x and b. This is fast lightweight version of solver, which is significantly faster than CMatrixLUSolve(), but does not provide additional information (like condition numbers). Algorithm features: * O(N^2) complexity * no additional time-consuming features, just triangular solver INPUT PARAMETERS LUA - array[0..N-1,0..N-1], LU decomposition, CMatrixLU result P - array[0..N-1], pivots array, CMatrixLU result N - size of A B - array[0..N-1], right part OUTPUT PARAMETERS Info - return code: * -3 matrix is exactly singular (ill conditioned matrices are not recognized). * -1 N<=0 was passed * 1 task is solved B - array[N]: * info>0 => overwritten by solution * info=-3 => filled by zeros NOTE: unlike CMatrixLUSolve(), this function does NOT check for near-degeneracy of input matrix. It checks for EXACT degeneracy, because this check is easy to do. However, very badly conditioned matrices may went unnoticed. 
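A factor-once, solve-many sketch, assuming the usual ALGLIB C core entry points ae_state_init()/ae_state_clear() and an illustrative demo function name; cmatrixlu() is the LU factorization called elsewhere in this unit:

    void demo_cmatrixlusolvefast(void)
    {
        ae_state state;
        ae_frame frame;
        ae_matrix lua;
        ae_vector pivots, b;
        ae_int_t info, i, j, k, n = 4, nrhs = 3;

        ae_state_init(&state);
        ae_frame_make(&state, &frame);
        ae_matrix_init(&lua, 0, 0, DT_COMPLEX, &state);
        ae_vector_init(&pivots, 0, DT_INT, &state);
        ae_vector_init(&b, 0, DT_COMPLEX, &state);

        ae_matrix_set_length(&lua, n, n, &state);
        for(i=0; i<n; i++)
            for(j=0; j<n; j++)
                lua.ptr.pp_complex[i][j] = ae_complex_from_d(i==j ? 5.0 : 1.0);
        cmatrixlu(&lua, n, n, &pivots, &state);                 // factor A once; LUA/pivots are reused below

        ae_vector_set_length(&b, n, &state);
        for(k=0; k<nrhs; k++)
        {
            for(i=0; i<n; i++)
                b.ptr.p_complex[i] = ae_complex_from_d((double)(k+1));
            cmatrixlusolvefast(&lua, &pivots, n, &b, &info, &state);  // B is overwritten by the solution
            // info==1 on success; info==-3 if the LU factor has an exact zero on its diagonal
        }

        ae_frame_leave(&state);
        ae_state_clear(&state);
    }

When condition numbers are required for the same factorization, cmatrixlusolve() above returns them through the report structure at extra cost.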
-- ALGLIB -- Copyright 27.01.2010 by Bochkanov Sergey *************************************************************************/ void cmatrixlusolvefast(/* Complex */ ae_matrix* lua, /* Integer */ ae_vector* p, ae_int_t n, /* Complex */ ae_vector* b, ae_int_t* info, ae_state *_state) { ae_int_t i; ae_int_t j; *info = 0; if( n<=0 ) { *info = -1; return; } for(i=0; i<=n-1; i++) { if( ae_c_eq_d(lua->ptr.pp_complex[i][i],(double)(0)) ) { for(j=0; j<=n-1; j++) { b->ptr.p_complex[j] = ae_complex_from_d(0.0); } *info = -3; return; } } densesolver_cbasiclusolve(lua, p, n, b, _state); *info = 1; } /************************************************************************* Dense solver. Same as RMatrixMixedSolveM(), but for complex matrices. Algorithm features: * automatic detection of degenerate cases * condition number estimation * iterative refinement * O(M*N^2) complexity INPUT PARAMETERS A - array[0..N-1,0..N-1], system matrix LUA - array[0..N-1,0..N-1], LU decomposition, CMatrixLU result P - array[0..N-1], pivots array, CMatrixLU result N - size of A B - array[0..N-1,0..M-1], right part M - right part size OUTPUT PARAMETERS Info - return code: * -3 matrix is very badly conditioned or exactly singular. * -1 N<=0 was passed * 1 task is solved (but matrix A may be ill-conditioned, check R1/RInf parameters for condition numbers). Rep - additional report, following fields are set: * rep.r1 condition number in 1-norm * rep.rinf condition number in inf-norm X - array[N,M], it contains: * info>0 => solution * info=-3 => filled by zeros -- ALGLIB -- Copyright 27.01.2010 by Bochkanov Sergey *************************************************************************/ void cmatrixmixedsolvem(/* Complex */ ae_matrix* a, /* Complex */ ae_matrix* lua, /* Integer */ ae_vector* p, ae_int_t n, /* Complex */ ae_matrix* b, ae_int_t m, ae_int_t* info, densesolverreport* rep, /* Complex */ ae_matrix* x, ae_state *_state) { *info = 0; _densesolverreport_clear(rep); ae_matrix_clear(x); /* * prepare: check inputs, allocate space... */ if( n<=0||m<=0 ) { *info = -1; return; } /* * solve */ densesolver_cmatrixlusolveinternal(lua, p, n, a, ae_true, b, m, info, rep, x, _state); } /************************************************************************* Dense solver. Same as RMatrixMixedSolve(), but for complex matrices. Algorithm features: * automatic detection of degenerate cases * condition number estimation * iterative refinement * O(N^2) complexity INPUT PARAMETERS A - array[0..N-1,0..N-1], system matrix LUA - array[0..N-1,0..N-1], LU decomposition, CMatrixLU result P - array[0..N-1], pivots array, CMatrixLU result N - size of A B - array[0..N-1], right part OUTPUT PARAMETERS Info - return code: * -3 matrix is very badly conditioned or exactly singular. * -1 N<=0 was passed * 1 task is solved (but matrix A may be ill-conditioned, check R1/RInf parameters for condition numbers). 
Rep - additional report, following fields are set: * rep.r1 condition number in 1-norm * rep.rinf condition number in inf-norm X - array[N], it contains: * info>0 => solution * info=-3 => filled by zeros -- ALGLIB -- Copyright 27.01.2010 by Bochkanov Sergey *************************************************************************/ void cmatrixmixedsolve(/* Complex */ ae_matrix* a, /* Complex */ ae_matrix* lua, /* Integer */ ae_vector* p, ae_int_t n, /* Complex */ ae_vector* b, ae_int_t* info, densesolverreport* rep, /* Complex */ ae_vector* x, ae_state *_state) { ae_frame _frame_block; ae_matrix bm; ae_matrix xm; ae_frame_make(_state, &_frame_block); *info = 0; _densesolverreport_clear(rep); ae_vector_clear(x); ae_matrix_init(&bm, 0, 0, DT_COMPLEX, _state); ae_matrix_init(&xm, 0, 0, DT_COMPLEX, _state); if( n<=0 ) { *info = -1; ae_frame_leave(_state); return; } ae_matrix_set_length(&bm, n, 1, _state); ae_v_cmove(&bm.ptr.pp_complex[0][0], bm.stride, &b->ptr.p_complex[0], 1, "N", ae_v_len(0,n-1)); cmatrixmixedsolvem(a, lua, p, n, &bm, 1, info, rep, &xm, _state); ae_vector_set_length(x, n, _state); ae_v_cmove(&x->ptr.p_complex[0], 1, &xm.ptr.pp_complex[0][0], xm.stride, "N", ae_v_len(0,n-1)); ae_frame_leave(_state); } /************************************************************************* Dense solver for A*X=B with N*N symmetric positive definite matrix A, and N*M vectors X and B. It is "slow-but-feature-rich" version of the solver. Algorithm features: * automatic detection of degenerate cases * condition number estimation * O(N^3+M*N^2) complexity * matrix is represented by its upper or lower triangle No iterative refinement is provided because such partial representation of matrix does not allow efficient calculation of extra-precise matrix-vector products for large matrices. Use RMatrixSolve or RMatrixMixedSolve if you need iterative refinement. IMPORTANT: ! this function is NOT the most efficient linear solver provided ! by ALGLIB. It estimates condition number of linear system, ! which results in significant performance penalty when ! compared with "fast" version which just performs Cholesky ! decomposition and calls triangular solver. ! ! This performance penalty is especially visible in the ! multithreaded mode, because both condition number estimation ! and iterative refinement are inherently sequential ! calculations. ! ! Thus, if you need high performance and if you are pretty sure ! that your system is well conditioned, we strongly recommend ! you to use faster solver, SPDMatrixSolveMFast() function. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. 
We should note that Cholesky decomposition is harder ! to parallelize than, say, matrix-matrix product - this algorithm has ! several synchronization points which can not be avoided. However, ! parallelism starts to be profitable starting from N=500. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS A - array[0..N-1,0..N-1], system matrix N - size of A IsUpper - what half of A is provided B - array[0..N-1,0..M-1], right part M - right part size OUTPUT PARAMETERS Info - return code: * -3 matrix is very badly conditioned or non-SPD. * -1 N<=0 was passed * 1 task is solved (but matrix A may be ill-conditioned, check R1/RInf parameters for condition numbers). Rep - additional report, following fields are set: * rep.r1 condition number in 1-norm * rep.rinf condition number in inf-norm X - array[N,M], it contains: * info>0 => solution * info=-3 => filled by zeros -- ALGLIB -- Copyright 27.01.2010 by Bochkanov Sergey *************************************************************************/ void spdmatrixsolvem(/* Real */ ae_matrix* a, ae_int_t n, ae_bool isupper, /* Real */ ae_matrix* b, ae_int_t m, ae_int_t* info, densesolverreport* rep, /* Real */ ae_matrix* x, ae_state *_state) { ae_frame _frame_block; ae_matrix da; ae_int_t i; ae_int_t j; ae_int_t j1; ae_int_t j2; ae_frame_make(_state, &_frame_block); *info = 0; _densesolverreport_clear(rep); ae_matrix_clear(x); ae_matrix_init(&da, 0, 0, DT_REAL, _state); /* * prepare: check inputs, allocate space... */ if( n<=0||m<=0 ) { *info = -1; ae_frame_leave(_state); return; } ae_matrix_set_length(&da, n, n, _state); /* * factorize * solve */ for(i=0; i<=n-1; i++) { if( isupper ) { j1 = i; j2 = n-1; } else { j1 = 0; j2 = i; } ae_v_move(&da.ptr.pp_double[i][j1], 1, &a->ptr.pp_double[i][j1], 1, ae_v_len(j1,j2)); } if( !spdmatrixcholesky(&da, n, isupper, _state) ) { ae_matrix_set_length(x, n, m, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=m-1; j++) { x->ptr.pp_double[i][j] = (double)(0); } } rep->r1 = (double)(0); rep->rinf = (double)(0); *info = -3; ae_frame_leave(_state); return; } *info = 1; densesolver_spdmatrixcholeskysolveinternal(&da, n, isupper, a, ae_true, b, m, info, rep, x, _state); ae_frame_leave(_state); } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. *************************************************************************/ void _pexec_spdmatrixsolvem(/* Real */ ae_matrix* a, ae_int_t n, ae_bool isupper, /* Real */ ae_matrix* b, ae_int_t m, ae_int_t* info, densesolverreport* rep, /* Real */ ae_matrix* x, ae_state *_state) { spdmatrixsolvem(a,n,isupper,b,m,info,rep,x, _state); } /************************************************************************* Dense solver for A*X=B with N*N symmetric positive definite matrix A, and N*M vectors X and B. It is "fast-but-lightweight" version of the solver. Algorithm features: * O(N^3+M*N^2) complexity * matrix is represented by its upper or lower triangle * no additional time consuming features COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! 
this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that Cholesky decomposition is harder ! to parallelize than, say, matrix-matrix product - this algorithm has ! several synchronization points which can not be avoided. However, ! parallelism starts to be profitable starting from N=500. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS A - array[0..N-1,0..N-1], system matrix N - size of A IsUpper - what half of A is provided B - array[0..N-1,0..M-1], right part M - right part size OUTPUT PARAMETERS Info - return code: * -3 A is is exactly singular * -1 N<=0 was passed * 1 task was solved B - array[N,M], it contains: * info>0 => solution * info=-3 => filled by zeros -- ALGLIB -- Copyright 17.03.2015 by Bochkanov Sergey *************************************************************************/ void spdmatrixsolvemfast(/* Real */ ae_matrix* a, ae_int_t n, ae_bool isupper, /* Real */ ae_matrix* b, ae_int_t m, ae_int_t* info, ae_state *_state) { ae_frame _frame_block; ae_matrix _a; ae_int_t i; ae_int_t j; ae_frame_make(_state, &_frame_block); ae_matrix_init_copy(&_a, a, _state); a = &_a; *info = 0; *info = 1; if( n<=0 ) { *info = -1; ae_frame_leave(_state); return; } if( !spdmatrixcholesky(a, n, isupper, _state) ) { for(i=0; i<=n-1; i++) { for(j=0; j<=m-1; j++) { b->ptr.pp_double[i][j] = 0.0; } } *info = -3; ae_frame_leave(_state); return; } if( isupper ) { rmatrixlefttrsm(n, m, a, 0, 0, ae_true, ae_false, 1, b, 0, 0, _state); rmatrixlefttrsm(n, m, a, 0, 0, ae_true, ae_false, 0, b, 0, 0, _state); } else { rmatrixlefttrsm(n, m, a, 0, 0, ae_false, ae_false, 0, b, 0, 0, _state); rmatrixlefttrsm(n, m, a, 0, 0, ae_false, ae_false, 1, b, 0, 0, _state); } ae_frame_leave(_state); } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. *************************************************************************/ void _pexec_spdmatrixsolvemfast(/* Real */ ae_matrix* a, ae_int_t n, ae_bool isupper, /* Real */ ae_matrix* b, ae_int_t m, ae_int_t* info, ae_state *_state) { spdmatrixsolvemfast(a,n,isupper,b,m,info, _state); } /************************************************************************* Dense linear solver for A*x=b with N*N real symmetric positive definite matrix A, N*1 vectors x and b. 
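A minimal sketch of the fast multi-RHS path defined above, spdmatrixsolvemfast(), assuming the usual ALGLIB C core entry points ae_state_init()/ae_state_clear() and an illustrative demo function name:

    void demo_spdmatrixsolvemfast(void)
    {
        ae_state state;
        ae_frame frame;
        ae_matrix a, b;
        ae_int_t info, i, j, n = 4, m = 2;

        ae_state_init(&state);
        ae_frame_make(&state, &frame);
        ae_matrix_init(&a, 0, 0, DT_REAL, &state);
        ae_matrix_init(&b, 0, 0, DT_REAL, &state);

        ae_matrix_set_length(&a, n, n, &state);
        ae_matrix_set_length(&b, n, m, &state);
        for(i=0; i<n; i++)
        {
            for(j=0; j<n; j++)
                a.ptr.pp_double[i][j] = (i==j ? 4.0 : 0.5);     // symmetric, diagonally dominant, hence SPD
            for(j=0; j<m; j++)
                b.ptr.pp_double[i][j] = 1.0;
        }

        spdmatrixsolvemfast(&a, n, ae_true, &b, m, &info, &state);  // IsUpper=ae_true: upper triangle of A is used
        // info==1: B is overwritten by the solution; info==-3: A is not positive definite

        ae_frame_leave(&state);
        ae_state_clear(&state);
    }

The spdmatrixsolve() routine documented here trades some of that speed for condition-number reporting.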
"Slow-but-feature-rich" version of the solver. Algorithm features: * automatic detection of degenerate cases * condition number estimation * O(N^3) complexity * matrix is represented by its upper or lower triangle No iterative refinement is provided because such partial representation of matrix does not allow efficient calculation of extra-precise matrix-vector products for large matrices. Use RMatrixSolve or RMatrixMixedSolve if you need iterative refinement. IMPORTANT: ! this function is NOT the most efficient linear solver provided ! by ALGLIB. It estimates condition number of linear system, ! which results in significant performance penalty when ! compared with "fast" version which just performs Cholesky ! decomposition and calls triangular solver. ! ! This performance penalty is especially visible in the ! multithreaded mode, because both condition number estimation ! and iterative refinement are inherently sequential ! calculations. ! ! Thus, if you need high performance and if you are pretty sure ! that your system is well conditioned, we strongly recommend ! you to use faster solver, SPDMatrixSolveFast() function. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that Cholesky decomposition is harder ! to parallelize than, say, matrix-matrix product - this algorithm has ! several synchronization points which can not be avoided. However, ! parallelism starts to be profitable starting from N=500. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS A - array[0..N-1,0..N-1], system matrix N - size of A IsUpper - what half of A is provided B - array[0..N-1], right part OUTPUT PARAMETERS Info - return code: * -3 matrix is very badly conditioned or non-SPD. * -1 N<=0 was passed * 1 task is solved (but matrix A may be ill-conditioned, check R1/RInf parameters for condition numbers). 
Rep - additional report, following fields are set: * rep.r1 condition number in 1-norm * rep.rinf condition number in inf-norm X - array[N], it contains: * info>0 => solution * info=-3 => filled by zeros -- ALGLIB -- Copyright 27.01.2010 by Bochkanov Sergey *************************************************************************/ void spdmatrixsolve(/* Real */ ae_matrix* a, ae_int_t n, ae_bool isupper, /* Real */ ae_vector* b, ae_int_t* info, densesolverreport* rep, /* Real */ ae_vector* x, ae_state *_state) { ae_frame _frame_block; ae_matrix bm; ae_matrix xm; ae_frame_make(_state, &_frame_block); *info = 0; _densesolverreport_clear(rep); ae_vector_clear(x); ae_matrix_init(&bm, 0, 0, DT_REAL, _state); ae_matrix_init(&xm, 0, 0, DT_REAL, _state); if( n<=0 ) { *info = -1; ae_frame_leave(_state); return; } ae_matrix_set_length(&bm, n, 1, _state); ae_v_move(&bm.ptr.pp_double[0][0], bm.stride, &b->ptr.p_double[0], 1, ae_v_len(0,n-1)); spdmatrixsolvem(a, n, isupper, &bm, 1, info, rep, &xm, _state); ae_vector_set_length(x, n, _state); ae_v_move(&x->ptr.p_double[0], 1, &xm.ptr.pp_double[0][0], xm.stride, ae_v_len(0,n-1)); ae_frame_leave(_state); } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. *************************************************************************/ void _pexec_spdmatrixsolve(/* Real */ ae_matrix* a, ae_int_t n, ae_bool isupper, /* Real */ ae_vector* b, ae_int_t* info, densesolverreport* rep, /* Real */ ae_vector* x, ae_state *_state) { spdmatrixsolve(a,n,isupper,b,info,rep,x, _state); } /************************************************************************* Dense linear solver for A*x=b with N*N real symmetric positive definite matrix A, N*1 vectors x and b. "Fast-but-lightweight" version of the solver. Algorithm features: * O(N^3) complexity * matrix is represented by its upper or lower triangle * no additional time consuming features like condition number estimation COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that Cholesky decomposition is harder ! to parallelize than, say, matrix-matrix product - this algorithm has ! several synchronization points which can not be avoided. However, ! parallelism starts to be profitable starting from N=500. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! 
ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS A - array[0..N-1,0..N-1], system matrix N - size of A IsUpper - what half of A is provided B - array[0..N-1], right part OUTPUT PARAMETERS Info - return code: * -3 A is is exactly singular or non-SPD * -1 N<=0 was passed * 1 task was solved B - array[N], it contains: * info>0 => solution * info=-3 => filled by zeros -- ALGLIB -- Copyright 17.03.2015 by Bochkanov Sergey *************************************************************************/ void spdmatrixsolvefast(/* Real */ ae_matrix* a, ae_int_t n, ae_bool isupper, /* Real */ ae_vector* b, ae_int_t* info, ae_state *_state) { ae_frame _frame_block; ae_matrix _a; ae_int_t i; ae_frame_make(_state, &_frame_block); ae_matrix_init_copy(&_a, a, _state); a = &_a; *info = 0; *info = 1; if( n<=0 ) { *info = -1; ae_frame_leave(_state); return; } if( !spdmatrixcholesky(a, n, isupper, _state) ) { for(i=0; i<=n-1; i++) { b->ptr.p_double[i] = 0.0; } *info = -3; ae_frame_leave(_state); return; } densesolver_spdbasiccholeskysolve(a, n, isupper, b, _state); ae_frame_leave(_state); } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. *************************************************************************/ void _pexec_spdmatrixsolvefast(/* Real */ ae_matrix* a, ae_int_t n, ae_bool isupper, /* Real */ ae_vector* b, ae_int_t* info, ae_state *_state) { spdmatrixsolvefast(a,n,isupper,b,info, _state); } /************************************************************************* Dense solver for A*X=B with N*N symmetric positive definite matrix A given by its Cholesky decomposition, and N*M vectors X and B. It is "slow-but- feature-rich" version of the solver which estimates condition number of the system. Algorithm features: * automatic detection of degenerate cases * O(M*N^2) complexity * condition number estimation * matrix is represented by its upper or lower triangle No iterative refinement is provided because such partial representation of matrix does not allow efficient calculation of extra-precise matrix-vector products for large matrices. Use RMatrixSolve or RMatrixMixedSolve if you need iterative refinement. IMPORTANT: ! this function is NOT the most efficient linear solver provided ! by ALGLIB. It estimates condition number of linear system, ! which results in significant performance penalty when ! compared with "fast" version which just calls triangular ! solver. Amount of overhead introduced depends on M (the ! larger - the more efficient). ! ! This performance penalty is insignificant when compared with ! cost of large LU decomposition. However, if you call this ! function many times for the same left side, this overhead ! BECOMES significant. It also becomes significant for small- ! scale problems (N<50). ! ! In such cases we strongly recommend you to use faster solver, ! SPDMatrixCholeskySolveMFast() function. INPUT PARAMETERS CHA - array[0..N-1,0..N-1], Cholesky decomposition, SPDMatrixCholesky result N - size of CHA IsUpper - what half of CHA is provided B - array[0..N-1,0..M-1], right part M - right part size OUTPUT PARAMETERS Info - return code: * -3 A is is exactly singular or badly conditioned X is filled by zeros in such cases. 
* -1 N<=0 was passed * 1 task was solved Rep - additional report, following fields are set: * rep.r1 condition number in 1-norm * rep.rinf condition number in inf-norm X - array[N]: * for info>0 contains solution * for info=-3 filled by zeros -- ALGLIB -- Copyright 27.01.2010 by Bochkanov Sergey *************************************************************************/ void spdmatrixcholeskysolvem(/* Real */ ae_matrix* cha, ae_int_t n, ae_bool isupper, /* Real */ ae_matrix* b, ae_int_t m, ae_int_t* info, densesolverreport* rep, /* Real */ ae_matrix* x, ae_state *_state) { ae_frame _frame_block; ae_matrix emptya; ae_frame_make(_state, &_frame_block); *info = 0; _densesolverreport_clear(rep); ae_matrix_clear(x); ae_matrix_init(&emptya, 0, 0, DT_REAL, _state); /* * prepare: check inputs, allocate space... */ if( n<=0||m<=0 ) { *info = -1; ae_frame_leave(_state); return; } /* * solve */ densesolver_spdmatrixcholeskysolveinternal(cha, n, isupper, &emptya, ae_false, b, m, info, rep, x, _state); ae_frame_leave(_state); } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. *************************************************************************/ void _pexec_spdmatrixcholeskysolvem(/* Real */ ae_matrix* cha, ae_int_t n, ae_bool isupper, /* Real */ ae_matrix* b, ae_int_t m, ae_int_t* info, densesolverreport* rep, /* Real */ ae_matrix* x, ae_state *_state) { spdmatrixcholeskysolvem(cha,n,isupper,b,m,info,rep,x, _state); } /************************************************************************* Dense solver for A*X=B with N*N symmetric positive definite matrix A given by its Cholesky decomposition, and N*M vectors X and B. It is "fast-but- lightweight" version of the solver which just solves linear system, without any additional functions. Algorithm features: * O(M*N^2) complexity * matrix is represented by its upper or lower triangle * no additional functionality INPUT PARAMETERS CHA - array[N,N], Cholesky decomposition, SPDMatrixCholesky result N - size of CHA IsUpper - what half of CHA is provided B - array[N,M], right part M - right part size OUTPUT PARAMETERS Info - return code: * -3 A is is exactly singular or badly conditioned X is filled by zeros in such cases. * -1 N<=0 was passed * 1 task was solved B - array[N]: * for info>0 overwritten by solution * for info=-3 filled by zeros -- ALGLIB -- Copyright 18.03.2015 by Bochkanov Sergey *************************************************************************/ void spdmatrixcholeskysolvemfast(/* Real */ ae_matrix* cha, ae_int_t n, ae_bool isupper, /* Real */ ae_matrix* b, ae_int_t m, ae_int_t* info, ae_state *_state) { ae_int_t i; ae_int_t j; ae_int_t k; *info = 0; *info = 1; if( n<=0 ) { *info = -1; return; } for(k=0; k<=n-1; k++) { if( ae_fp_eq(cha->ptr.pp_double[k][k],0.0) ) { for(i=0; i<=n-1; i++) { for(j=0; j<=m-1; j++) { b->ptr.pp_double[i][j] = 0.0; } } *info = -3; return; } } if( isupper ) { rmatrixlefttrsm(n, m, cha, 0, 0, ae_true, ae_false, 1, b, 0, 0, _state); rmatrixlefttrsm(n, m, cha, 0, 0, ae_true, ae_false, 0, b, 0, 0, _state); } else { rmatrixlefttrsm(n, m, cha, 0, 0, ae_false, ae_false, 0, b, 0, 0, _state); rmatrixlefttrsm(n, m, cha, 0, 0, ae_false, ae_false, 1, b, 0, 0, _state); } } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. 
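A factorize-once, solve-many sketch for the spdmatrixcholeskysolvemfast() routine defined above, assuming the usual ALGLIB C core entry points ae_state_init()/ae_state_clear() and an illustrative demo function name; spdmatrixcholesky() is the Cholesky factorization called elsewhere in this unit:

    void demo_spdmatrixcholeskysolvemfast(void)
    {
        ae_state state;
        ae_frame frame;
        ae_matrix cha, b;
        ae_int_t info, i, j, n = 4, m = 8;

        ae_state_init(&state);
        ae_frame_make(&state, &frame);
        ae_matrix_init(&cha, 0, 0, DT_REAL, &state);
        ae_matrix_init(&b, 0, 0, DT_REAL, &state);

        ae_matrix_set_length(&cha, n, n, &state);
        for(i=0; i<n; i++)
            for(j=0; j<n; j++)
                cha.ptr.pp_double[i][j] = (i==j ? 4.0 : 0.5);   // SPD test matrix
        if( !spdmatrixcholesky(&cha, n, ae_true, &state) )      // factor once; CHA now holds the Cholesky factor
        {
            ae_frame_leave(&state);
            ae_state_clear(&state);
            return;                                             // A was not positive definite
        }

        ae_matrix_set_length(&b, n, m, &state);
        for(i=0; i<n; i++)
            for(j=0; j<m; j++)
                b.ptr.pp_double[i][j] = (double)(j+1);
        spdmatrixcholeskysolvemfast(&cha, n, ae_true, &b, m, &info, &state);  // reuse the factor for M right parts
        // info==1: B is overwritten by the solution

        ae_frame_leave(&state);
        ae_state_clear(&state);
    }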
*************************************************************************/ void _pexec_spdmatrixcholeskysolvemfast(/* Real */ ae_matrix* cha, ae_int_t n, ae_bool isupper, /* Real */ ae_matrix* b, ae_int_t m, ae_int_t* info, ae_state *_state) { spdmatrixcholeskysolvemfast(cha,n,isupper,b,m,info, _state); } /************************************************************************* Dense solver for A*x=b with N*N symmetric positive definite matrix A given by its Cholesky decomposition, and N*1 real vectors x and b. This is "slow- but-feature-rich" version of the solver which, in addition to the solution, performs condition number estimation. Algorithm features: * automatic detection of degenerate cases * O(N^2) complexity * condition number estimation * matrix is represented by its upper or lower triangle No iterative refinement is provided because such partial representation of matrix does not allow efficient calculation of extra-precise matrix-vector products for large matrices. Use RMatrixSolve or RMatrixMixedSolve if you need iterative refinement. IMPORTANT: ! this function is NOT the most efficient linear solver provided ! by ALGLIB. It estimates condition number of linear system, ! which results in 10-15x performance penalty when compared ! with "fast" version which just calls triangular solver. ! ! This performance penalty is insignificant when compared with ! cost of large LU decomposition. However, if you call this ! function many times for the same left side, this overhead ! BECOMES significant. It also becomes significant for small- ! scale problems (N<50). ! ! In such cases we strongly recommend you to use faster solver, ! SPDMatrixCholeskySolveFast() function. INPUT PARAMETERS CHA - array[N,N], Cholesky decomposition, SPDMatrixCholesky result N - size of A IsUpper - what half of CHA is provided B - array[N], right part OUTPUT PARAMETERS Info - return code: * -3 A is is exactly singular or ill conditioned X is filled by zeros in such cases. * -1 N<=0 was passed * 1 task is solved Rep - additional report, following fields are set: * rep.r1 condition number in 1-norm * rep.rinf condition number in inf-norm X - array[N]: * for info>0 - solution * for info=-3 - filled by zeros -- ALGLIB -- Copyright 27.01.2010 by Bochkanov Sergey *************************************************************************/ void spdmatrixcholeskysolve(/* Real */ ae_matrix* cha, ae_int_t n, ae_bool isupper, /* Real */ ae_vector* b, ae_int_t* info, densesolverreport* rep, /* Real */ ae_vector* x, ae_state *_state) { ae_frame _frame_block; ae_matrix bm; ae_matrix xm; ae_frame_make(_state, &_frame_block); *info = 0; _densesolverreport_clear(rep); ae_vector_clear(x); ae_matrix_init(&bm, 0, 0, DT_REAL, _state); ae_matrix_init(&xm, 0, 0, DT_REAL, _state); if( n<=0 ) { *info = -1; ae_frame_leave(_state); return; } ae_matrix_set_length(&bm, n, 1, _state); ae_v_move(&bm.ptr.pp_double[0][0], bm.stride, &b->ptr.p_double[0], 1, ae_v_len(0,n-1)); spdmatrixcholeskysolvem(cha, n, isupper, &bm, 1, info, rep, &xm, _state); ae_vector_set_length(x, n, _state); ae_v_move(&x->ptr.p_double[0], 1, &xm.ptr.pp_double[0][0], xm.stride, ae_v_len(0,n-1)); ae_frame_leave(_state); } /************************************************************************* Dense solver for A*x=b with N*N symmetric positive definite matrix A given by its Cholesky decomposition, and N*1 real vectors x and b. This is "fast- but-lightweight" version of the solver. 
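For comparison with this fast path, a sketch of the feature-rich sibling spdmatrixcholeskysolve() defined above, which additionally fills the report structure; it assumes the usual ALGLIB C core entry points ae_state_init()/ae_state_clear(), the generated initializer _densesolverreport_init() (whose exact signature may differ between core versions) and an illustrative demo function name:

    void demo_spdmatrixcholeskysolve(void)
    {
        ae_state state;
        ae_frame frame;
        ae_matrix cha;
        ae_vector b, x;
        densesolverreport rep;
        ae_int_t info, i, j, n = 4;

        ae_state_init(&state);
        ae_frame_make(&state, &frame);
        ae_matrix_init(&cha, 0, 0, DT_REAL, &state);
        ae_vector_init(&b, 0, DT_REAL, &state);
        ae_vector_init(&x, 0, DT_REAL, &state);
        _densesolverreport_init(&rep, &state);                  // assumed signature, see note above

        ae_matrix_set_length(&cha, n, n, &state);
        for(i=0; i<n; i++)
            for(j=0; j<n; j++)
                cha.ptr.pp_double[i][j] = (i==j ? 4.0 : 0.5);
        if( !spdmatrixcholesky(&cha, n, ae_true, &state) )      // A was not positive definite
        {
            ae_frame_leave(&state);
            ae_state_clear(&state);
            return;
        }

        ae_vector_set_length(&b, n, &state);
        for(i=0; i<n; i++)
            b.ptr.p_double[i] = 1.0;
        spdmatrixcholeskysolve(&cha, n, ae_true, &b, &info, &rep, &x, &state);
        // info==1: x holds the solution; rep.r1 and rep.rinf report the conditioning of the system

        ae_frame_leave(&state);
        ae_state_clear(&state);
    }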
Algorithm features: * O(N^2) complexity * matrix is represented by its upper or lower triangle * no additional features INPUT PARAMETERS CHA - array[N,N], Cholesky decomposition, SPDMatrixCholesky result N - size of A IsUpper - what half of CHA is provided B - array[N], right part OUTPUT PARAMETERS Info - return code: * -3 A is is exactly singular or ill conditioned X is filled by zeros in such cases. * -1 N<=0 was passed * 1 task is solved B - array[N]: * for info>0 - overwritten by solution * for info=-3 - filled by zeros -- ALGLIB -- Copyright 27.01.2010 by Bochkanov Sergey *************************************************************************/ void spdmatrixcholeskysolvefast(/* Real */ ae_matrix* cha, ae_int_t n, ae_bool isupper, /* Real */ ae_vector* b, ae_int_t* info, ae_state *_state) { ae_int_t i; ae_int_t k; *info = 0; *info = 1; if( n<=0 ) { *info = -1; return; } for(k=0; k<=n-1; k++) { if( ae_fp_eq(cha->ptr.pp_double[k][k],0.0) ) { for(i=0; i<=n-1; i++) { b->ptr.p_double[i] = 0.0; } *info = -3; return; } } densesolver_spdbasiccholeskysolve(cha, n, isupper, b, _state); } /************************************************************************* Dense solver for A*X=B, with N*N Hermitian positive definite matrix A and N*M complex matrices X and B. "Slow-but-feature-rich" version of the solver. Algorithm features: * automatic detection of degenerate cases * condition number estimation * O(N^3+M*N^2) complexity * matrix is represented by its upper or lower triangle No iterative refinement is provided because such partial representation of matrix does not allow efficient calculation of extra-precise matrix-vector products for large matrices. Use RMatrixSolve or RMatrixMixedSolve if you need iterative refinement. IMPORTANT: ! this function is NOT the most efficient linear solver provided ! by ALGLIB. It estimates condition number of linear system, ! which results in significant performance penalty when ! compared with "fast" version which just calls triangular ! solver. ! ! This performance penalty is especially apparent when you use ! ALGLIB parallel capabilities (condition number estimation is ! inherently sequential). It also becomes significant for ! small-scale problems (N<100). ! ! In such cases we strongly recommend you to use faster solver, ! HPDMatrixSolveMFast() function. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that Cholesky decomposition is harder ! to parallelize than, say, matrix-matrix product - this algorithm has ! several synchronization points which can not be avoided. However, ! parallelism starts to be profitable starting from N=500. ! ! 
In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS A - array[0..N-1,0..N-1], system matrix N - size of A IsUpper - what half of A is provided B - array[0..N-1,0..M-1], right part M - right part size OUTPUT PARAMETERS Info - same as in RMatrixSolve. Returns -3 for non-HPD matrices. Rep - same as in RMatrixSolve X - same as in RMatrixSolve -- ALGLIB -- Copyright 27.01.2010 by Bochkanov Sergey *************************************************************************/ void hpdmatrixsolvem(/* Complex */ ae_matrix* a, ae_int_t n, ae_bool isupper, /* Complex */ ae_matrix* b, ae_int_t m, ae_int_t* info, densesolverreport* rep, /* Complex */ ae_matrix* x, ae_state *_state) { ae_frame _frame_block; ae_matrix da; ae_int_t i; ae_int_t j; ae_int_t j1; ae_int_t j2; ae_frame_make(_state, &_frame_block); *info = 0; _densesolverreport_clear(rep); ae_matrix_clear(x); ae_matrix_init(&da, 0, 0, DT_COMPLEX, _state); /* * prepare: check inputs, allocate space... */ if( n<=0||m<=0 ) { *info = -1; ae_frame_leave(_state); return; } ae_matrix_set_length(&da, n, n, _state); /* * factorize matrix, solve */ for(i=0; i<=n-1; i++) { if( isupper ) { j1 = i; j2 = n-1; } else { j1 = 0; j2 = i; } ae_v_cmove(&da.ptr.pp_complex[i][j1], 1, &a->ptr.pp_complex[i][j1], 1, "N", ae_v_len(j1,j2)); } if( !hpdmatrixcholesky(&da, n, isupper, _state) ) { ae_matrix_set_length(x, n, m, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=m-1; j++) { x->ptr.pp_complex[i][j] = ae_complex_from_i(0); } } rep->r1 = (double)(0); rep->rinf = (double)(0); *info = -3; ae_frame_leave(_state); return; } *info = 1; densesolver_hpdmatrixcholeskysolveinternal(&da, n, isupper, a, ae_true, b, m, info, rep, x, _state); ae_frame_leave(_state); } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. *************************************************************************/ void _pexec_hpdmatrixsolvem(/* Complex */ ae_matrix* a, ae_int_t n, ae_bool isupper, /* Complex */ ae_matrix* b, ae_int_t m, ae_int_t* info, densesolverreport* rep, /* Complex */ ae_matrix* x, ae_state *_state) { hpdmatrixsolvem(a,n,isupper,b,m,info,rep,x, _state); } /************************************************************************* Dense solver for A*X=B, with N*N Hermitian positive definite matrix A and N*M complex matrices X and B. "Fast-but-lightweight" version of the solver. Algorithm features: * O(N^3+M*N^2) complexity * matrix is represented by its upper or lower triangle * no additional time consuming features like condition number estimation COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! 
* about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that Cholesky decomposition is harder ! to parallelize than, say, matrix-matrix product - this algorithm has ! several synchronization points which can not be avoided. However, ! parallelism starts to be profitable starting from N=500. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS A - array[0..N-1,0..N-1], system matrix N - size of A IsUpper - what half of A is provided B - array[0..N-1,0..M-1], right part M - right part size OUTPUT PARAMETERS Info - return code: * -3 A is is exactly singular or is not positive definite. B is filled by zeros in such cases. * -1 N<=0 was passed * 1 task is solved B - array[0..N-1]: * overwritten by solution * zeros, if problem was not solved -- ALGLIB -- Copyright 17.03.2015 by Bochkanov Sergey *************************************************************************/ void hpdmatrixsolvemfast(/* Complex */ ae_matrix* a, ae_int_t n, ae_bool isupper, /* Complex */ ae_matrix* b, ae_int_t m, ae_int_t* info, ae_state *_state) { ae_frame _frame_block; ae_matrix _a; ae_int_t i; ae_int_t j; ae_frame_make(_state, &_frame_block); ae_matrix_init_copy(&_a, a, _state); a = &_a; *info = 0; *info = 1; if( n<=0 ) { *info = -1; ae_frame_leave(_state); return; } if( !hpdmatrixcholesky(a, n, isupper, _state) ) { for(i=0; i<=n-1; i++) { for(j=0; j<=m-1; j++) { b->ptr.pp_complex[i][j] = ae_complex_from_d(0.0); } } *info = -3; ae_frame_leave(_state); return; } if( isupper ) { cmatrixlefttrsm(n, m, a, 0, 0, ae_true, ae_false, 2, b, 0, 0, _state); cmatrixlefttrsm(n, m, a, 0, 0, ae_true, ae_false, 0, b, 0, 0, _state); } else { cmatrixlefttrsm(n, m, a, 0, 0, ae_false, ae_false, 0, b, 0, 0, _state); cmatrixlefttrsm(n, m, a, 0, 0, ae_false, ae_false, 2, b, 0, 0, _state); } ae_frame_leave(_state); } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. *************************************************************************/ void _pexec_hpdmatrixsolvemfast(/* Complex */ ae_matrix* a, ae_int_t n, ae_bool isupper, /* Complex */ ae_matrix* b, ae_int_t m, ae_int_t* info, ae_state *_state) { hpdmatrixsolvemfast(a,n,isupper,b,m,info, _state); } /************************************************************************* Dense solver for A*x=b, with N*N Hermitian positive definite matrix A, and N*1 complex vectors x and b. "Slow-but-feature-rich" version of the solver. Algorithm features: * automatic detection of degenerate cases * condition number estimation * O(N^3) complexity * matrix is represented by its upper or lower triangle No iterative refinement is provided because such partial representation of matrix does not allow efficient calculation of extra-precise matrix-vector products for large matrices. 
Use RMatrixSolve or RMatrixMixedSolve if you need iterative refinement. IMPORTANT: ! this function is NOT the most efficient linear solver provided ! by ALGLIB. It estimates condition number of linear system, ! which results in significant performance penalty when ! compared with "fast" version which just performs Cholesky ! decomposition and calls triangular solver. ! ! This performance penalty is especially visible in the ! multithreaded mode, because both condition number estimation ! and iterative refinement are inherently sequential ! calculations. ! ! Thus, if you need high performance and if you are pretty sure ! that your system is well conditioned, we strongly recommend ! you to use faster solver, HPDMatrixSolveFast() function. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that Cholesky decomposition is harder ! to parallelize than, say, matrix-matrix product - this algorithm has ! several synchronization points which can not be avoided. However, ! parallelism starts to be profitable starting from N=500. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS A - array[0..N-1,0..N-1], system matrix N - size of A IsUpper - what half of A is provided B - array[0..N-1], right part OUTPUT PARAMETERS Info - same as in RMatrixSolve Returns -3 for non-HPD matrices. 
Rep - same as in RMatrixSolve X - same as in RMatrixSolve -- ALGLIB -- Copyright 27.01.2010 by Bochkanov Sergey *************************************************************************/ void hpdmatrixsolve(/* Complex */ ae_matrix* a, ae_int_t n, ae_bool isupper, /* Complex */ ae_vector* b, ae_int_t* info, densesolverreport* rep, /* Complex */ ae_vector* x, ae_state *_state) { ae_frame _frame_block; ae_matrix bm; ae_matrix xm; ae_frame_make(_state, &_frame_block); *info = 0; _densesolverreport_clear(rep); ae_vector_clear(x); ae_matrix_init(&bm, 0, 0, DT_COMPLEX, _state); ae_matrix_init(&xm, 0, 0, DT_COMPLEX, _state); if( n<=0 ) { *info = -1; ae_frame_leave(_state); return; } ae_matrix_set_length(&bm, n, 1, _state); ae_v_cmove(&bm.ptr.pp_complex[0][0], bm.stride, &b->ptr.p_complex[0], 1, "N", ae_v_len(0,n-1)); hpdmatrixsolvem(a, n, isupper, &bm, 1, info, rep, &xm, _state); ae_vector_set_length(x, n, _state); ae_v_cmove(&x->ptr.p_complex[0], 1, &xm.ptr.pp_complex[0][0], xm.stride, "N", ae_v_len(0,n-1)); ae_frame_leave(_state); } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. *************************************************************************/ void _pexec_hpdmatrixsolve(/* Complex */ ae_matrix* a, ae_int_t n, ae_bool isupper, /* Complex */ ae_vector* b, ae_int_t* info, densesolverreport* rep, /* Complex */ ae_vector* x, ae_state *_state) { hpdmatrixsolve(a,n,isupper,b,info,rep,x, _state); } /************************************************************************* Dense solver for A*x=b, with N*N Hermitian positive definite matrix A, and N*1 complex vectors x and b. "Fast-but-lightweight" version of the solver without additional functions. Algorithm features: * O(N^3) complexity * matrix is represented by its upper or lower triangle * no additional time consuming functions COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that Cholesky decomposition is harder ! to parallelize than, say, matrix-matrix product - this algorithm has ! several synchronization points which can not be avoided. However, ! parallelism starts to be profitable starting from N=500. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. 
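EXAMPLE (editor's illustrative sketch; this paragraph is not part of the original ALGLIB comment, and the ae_state object "st" is assumed to be initialized elsewhere, e.g. with ae_state_init() from ae.h): to solve the 2x2 Hermitian positive definite system A=[[2.0,0.5],[0.5,2.0]], B=[1,1] with the upper triangle stored (IsUpper=True), one may write: ae_matrix a; ae_vector b; ae_int_t info; ae_matrix_init(&a, 0, 0, DT_COMPLEX, &st); ae_vector_init(&b, 0, DT_COMPLEX, &st); ae_matrix_set_length(&a, 2, 2, &st); ae_vector_set_length(&b, 2, &st); a.ptr.pp_complex[0][0] = ae_complex_from_d(2.0); a.ptr.pp_complex[0][1] = ae_complex_from_d(0.5); a.ptr.pp_complex[1][1] = ae_complex_from_d(2.0); b.ptr.p_complex[0] = ae_complex_from_d(1.0); b.ptr.p_complex[1] = ae_complex_from_d(1.0); hpdmatrixsolvefast(&a, 2, ae_true, &b, &info, &st); after the call Info=1 and B is overwritten by the solution x=(0.4,0.4).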
INPUT PARAMETERS A - array[0..N-1,0..N-1], system matrix N - size of A IsUpper - what half of A is provided B - array[0..N-1], right part OUTPUT PARAMETERS Info - return code: * -3 A is is exactly singular or not positive definite X is filled by zeros in such cases. * -1 N<=0 was passed * 1 task was solved B - array[0..N-1]: * overwritten by solution * zeros, if A is exactly singular (diagonal of its LU decomposition has exact zeros). -- ALGLIB -- Copyright 17.03.2015 by Bochkanov Sergey *************************************************************************/ void hpdmatrixsolvefast(/* Complex */ ae_matrix* a, ae_int_t n, ae_bool isupper, /* Complex */ ae_vector* b, ae_int_t* info, ae_state *_state) { ae_frame _frame_block; ae_matrix _a; ae_int_t i; ae_frame_make(_state, &_frame_block); ae_matrix_init_copy(&_a, a, _state); a = &_a; *info = 0; *info = 1; if( n<=0 ) { *info = -1; ae_frame_leave(_state); return; } if( !hpdmatrixcholesky(a, n, isupper, _state) ) { for(i=0; i<=n-1; i++) { b->ptr.p_complex[i] = ae_complex_from_d(0.0); } *info = -3; ae_frame_leave(_state); return; } densesolver_hpdbasiccholeskysolve(a, n, isupper, b, _state); ae_frame_leave(_state); } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. *************************************************************************/ void _pexec_hpdmatrixsolvefast(/* Complex */ ae_matrix* a, ae_int_t n, ae_bool isupper, /* Complex */ ae_vector* b, ae_int_t* info, ae_state *_state) { hpdmatrixsolvefast(a,n,isupper,b,info, _state); } /************************************************************************* Dense solver for A*X=B with N*N Hermitian positive definite matrix A given by its Cholesky decomposition and N*M complex matrices X and B. This is "slow-but-feature-rich" version of the solver which, in addition to the solution, estimates condition number of the system. Algorithm features: * automatic detection of degenerate cases * O(M*N^2) complexity * condition number estimation * matrix is represented by its upper or lower triangle No iterative refinement is provided because such partial representation of matrix does not allow efficient calculation of extra-precise matrix-vector products for large matrices. Use RMatrixSolve or RMatrixMixedSolve if you need iterative refinement. IMPORTANT: ! this function is NOT the most efficient linear solver provided ! by ALGLIB. It estimates condition number of linear system, ! which results in significant performance penalty when ! compared with "fast" version which just calls triangular ! solver. Amount of overhead introduced depends on M (the ! larger - the more efficient). ! ! This performance penalty is insignificant when compared with ! cost of large Cholesky decomposition. However, if you call ! this function many times for the same left side, this ! overhead BECOMES significant. It also becomes significant ! for small-scale problems (N<50). ! ! In such cases we strongly recommend you to use faster solver, ! HPDMatrixCholeskySolveMFast() function. INPUT PARAMETERS CHA - array[N,N], Cholesky decomposition, HPDMatrixCholesky result N - size of CHA IsUpper - what half of CHA is provided B - array[N,M], right part M - right part size OUTPUT PARAMETERS: Info - return code: * -3 A is singular, or VERY close to singular. X is filled by zeros in such cases. 
* -1 N<=0 was passed * 1 task was solved Rep - additional report, following fields are set: * rep.r1 condition number in 1-norm * rep.rinf condition number in inf-norm X - array[N]: * for info>0 contains solution * for info=-3 filled by zeros -- ALGLIB -- Copyright 27.01.2010 by Bochkanov Sergey *************************************************************************/ void hpdmatrixcholeskysolvem(/* Complex */ ae_matrix* cha, ae_int_t n, ae_bool isupper, /* Complex */ ae_matrix* b, ae_int_t m, ae_int_t* info, densesolverreport* rep, /* Complex */ ae_matrix* x, ae_state *_state) { ae_frame _frame_block; ae_matrix emptya; ae_frame_make(_state, &_frame_block); *info = 0; _densesolverreport_clear(rep); ae_matrix_clear(x); ae_matrix_init(&emptya, 0, 0, DT_COMPLEX, _state); /* * prepare: check inputs, allocate space... */ if( n<=0||m<=0 ) { *info = -1; ae_frame_leave(_state); return; } /* * 1. scale matrix, max(|U[i,j]|) * 2. factorize scaled matrix * 3. solve */ densesolver_hpdmatrixcholeskysolveinternal(cha, n, isupper, &emptya, ae_false, b, m, info, rep, x, _state); ae_frame_leave(_state); } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. *************************************************************************/ void _pexec_hpdmatrixcholeskysolvem(/* Complex */ ae_matrix* cha, ae_int_t n, ae_bool isupper, /* Complex */ ae_matrix* b, ae_int_t m, ae_int_t* info, densesolverreport* rep, /* Complex */ ae_matrix* x, ae_state *_state) { hpdmatrixcholeskysolvem(cha,n,isupper,b,m,info,rep,x, _state); } /************************************************************************* Dense solver for A*X=B with N*N Hermitian positive definite matrix A given by its Cholesky decomposition and N*M complex matrices X and B. This is "fast-but-lightweight" version of the solver. Algorithm features: * O(M*N^2) complexity * matrix is represented by its upper or lower triangle * no additional time-consuming features INPUT PARAMETERS CHA - array[N,N], Cholesky decomposition, HPDMatrixCholesky result N - size of CHA IsUpper - what half of CHA is provided B - array[N,M], right part M - right part size OUTPUT PARAMETERS: Info - return code: * -3 A is singular, or VERY close to singular. X is filled by zeros in such cases. * -1 N<=0 was passed * 1 task was solved B - array[N]: * for info>0 overwritten by solution * for info=-3 filled by zeros -- ALGLIB -- Copyright 18.03.2015 by Bochkanov Sergey *************************************************************************/ void hpdmatrixcholeskysolvemfast(/* Complex */ ae_matrix* cha, ae_int_t n, ae_bool isupper, /* Complex */ ae_matrix* b, ae_int_t m, ae_int_t* info, ae_state *_state) { ae_int_t i; ae_int_t j; ae_int_t k; *info = 0; *info = 1; if( n<=0 ) { *info = -1; return; } for(k=0; k<=n-1; k++) { if( ae_fp_eq(cha->ptr.pp_complex[k][k].x,0.0)&&ae_fp_eq(cha->ptr.pp_complex[k][k].y,0.0) ) { for(i=0; i<=n-1; i++) { for(j=0; j<=m-1; j++) { b->ptr.pp_complex[i][j] = ae_complex_from_d(0.0); } } *info = -3; return; } } if( isupper ) { cmatrixlefttrsm(n, m, cha, 0, 0, ae_true, ae_false, 2, b, 0, 0, _state); cmatrixlefttrsm(n, m, cha, 0, 0, ae_true, ae_false, 0, b, 0, 0, _state); } else { cmatrixlefttrsm(n, m, cha, 0, 0, ae_false, ae_false, 0, b, 0, 0, _state); cmatrixlefttrsm(n, m, cha, 0, 0, ae_false, ae_false, 2, b, 0, 0, _state); } } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. 
*************************************************************************/ void _pexec_hpdmatrixcholeskysolvemfast(/* Complex */ ae_matrix* cha, ae_int_t n, ae_bool isupper, /* Complex */ ae_matrix* b, ae_int_t m, ae_int_t* info, ae_state *_state) { hpdmatrixcholeskysolvemfast(cha,n,isupper,b,m,info, _state); } /************************************************************************* Dense solver for A*x=b with N*N Hermitian positive definite matrix A given by its Cholesky decomposition, and N*1 complex vectors x and b. This is "slow-but-feature-rich" version of the solver which estimates condition number of the system. Algorithm features: * automatic detection of degenerate cases * O(N^2) complexity * condition number estimation * matrix is represented by its upper or lower triangle No iterative refinement is provided because such partial representation of matrix does not allow efficient calculation of extra-precise matrix-vector products for large matrices. Use RMatrixSolve or RMatrixMixedSolve if you need iterative refinement. IMPORTANT: ! this function is NOT the most efficient linear solver provided ! by ALGLIB. It estimates condition number of linear system, ! which results in 10-15x performance penalty when compared ! with "fast" version which just calls triangular solver. ! ! This performance penalty is insignificant when compared with ! cost of large LU decomposition. However, if you call this ! function many times for the same left side, this overhead ! BECOMES significant. It also becomes significant for small- ! scale problems (N<50). ! ! In such cases we strongly recommend you to use faster solver, ! HPDMatrixCholeskySolveFast() function. INPUT PARAMETERS CHA - array[0..N-1,0..N-1], Cholesky decomposition, SPDMatrixCholesky result N - size of A IsUpper - what half of CHA is provided B - array[0..N-1], right part OUTPUT PARAMETERS Info - return code: * -3 A is is exactly singular or ill conditioned X is filled by zeros in such cases. * -1 N<=0 was passed * 1 task is solved Rep - additional report, following fields are set: * rep.r1 condition number in 1-norm * rep.rinf condition number in inf-norm X - array[N]: * for info>0 - solution * for info=-3 - filled by zeros -- ALGLIB -- Copyright 27.01.2010 by Bochkanov Sergey *************************************************************************/ void hpdmatrixcholeskysolve(/* Complex */ ae_matrix* cha, ae_int_t n, ae_bool isupper, /* Complex */ ae_vector* b, ae_int_t* info, densesolverreport* rep, /* Complex */ ae_vector* x, ae_state *_state) { ae_frame _frame_block; ae_matrix bm; ae_matrix xm; ae_frame_make(_state, &_frame_block); *info = 0; _densesolverreport_clear(rep); ae_vector_clear(x); ae_matrix_init(&bm, 0, 0, DT_COMPLEX, _state); ae_matrix_init(&xm, 0, 0, DT_COMPLEX, _state); if( n<=0 ) { *info = -1; ae_frame_leave(_state); return; } ae_matrix_set_length(&bm, n, 1, _state); ae_v_cmove(&bm.ptr.pp_complex[0][0], bm.stride, &b->ptr.p_complex[0], 1, "N", ae_v_len(0,n-1)); hpdmatrixcholeskysolvem(cha, n, isupper, &bm, 1, info, rep, &xm, _state); ae_vector_set_length(x, n, _state); ae_v_cmove(&x->ptr.p_complex[0], 1, &xm.ptr.pp_complex[0][0], xm.stride, "N", ae_v_len(0,n-1)); ae_frame_leave(_state); } /************************************************************************* Dense solver for A*x=b with N*N Hermitian positive definite matrix A given by its Cholesky decomposition, and N*1 complex vectors x and b. This is "fast-but-lightweight" version of the solver. 
Algorithm features: * O(N^2) complexity * matrix is represented by its upper or lower triangle * no additional time-consuming features INPUT PARAMETERS CHA - array[0..N-1,0..N-1], Cholesky decomposition, SPDMatrixCholesky result N - size of A IsUpper - what half of CHA is provided B - array[0..N-1], right part OUTPUT PARAMETERS Info - return code: * -3 A is is exactly singular or ill conditioned B is filled by zeros in such cases. * -1 N<=0 was passed * 1 task is solved B - array[N]: * for info>0 - overwritten by solution * for info=-3 - filled by zeros -- ALGLIB -- Copyright 18.03.2015 by Bochkanov Sergey *************************************************************************/ void hpdmatrixcholeskysolvefast(/* Complex */ ae_matrix* cha, ae_int_t n, ae_bool isupper, /* Complex */ ae_vector* b, ae_int_t* info, ae_state *_state) { ae_int_t i; ae_int_t k; *info = 0; *info = 1; if( n<=0 ) { *info = -1; return; } for(k=0; k<=n-1; k++) { if( ae_fp_eq(cha->ptr.pp_complex[k][k].x,0.0)&&ae_fp_eq(cha->ptr.pp_complex[k][k].y,0.0) ) { for(i=0; i<=n-1; i++) { b->ptr.p_complex[i] = ae_complex_from_d(0.0); } *info = -3; return; } } densesolver_hpdbasiccholeskysolve(cha, n, isupper, b, _state); } /************************************************************************* Dense solver. This subroutine finds solution of the linear system A*X=B with non-square, possibly degenerate A. System is solved in the least squares sense, and general least squares solution X = X0 + CX*y which minimizes |A*X-B| is returned. If A is non-degenerate, solution in the usual sense is returned. Algorithm features: * automatic detection (and correct handling!) of degenerate cases * iterative refinement * O(N^3) complexity COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes one important improvement of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Generally, commercial ALGLIB is several times faster than open-source ! generic C edition, and many times faster than open-source C# edition. ! ! Multithreaded acceleration is only partially supported (some parts are ! optimized, but most - are not). ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS A - array[0..NRows-1,0..NCols-1], system matrix NRows - vertical size of A NCols - horizontal size of A B - array[0..NCols-1], right part Threshold- a number in [0,1]. Singular values beyond Threshold are considered zero. Set it to 0.0, if you don't understand what it means, so the solver will choose good value on its own. OUTPUT PARAMETERS Info - return code: * -4 SVD subroutine failed * -1 if NRows<=0 or NCols<=0 or Threshold<0 was passed * 1 if task is solved Rep - solver report, see below for more info X - array[0..N-1,0..M-1], it contains: * solution of A*X=B (even for singular A) * zeros, if SVD subroutine failed SOLVER REPORT Subroutine sets following fields of the Rep structure: * R2 reciprocal of condition number: 1/cond(A), 2-norm. * N = NCols * K dim(Null(A)) * CX array[0..N-1,0..K-1], kernel of A. Columns of CX store such vectors that A*CX[i]=0. 
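EXAMPLE (editor's illustrative sketch; not part of the original ALGLIB comment): for an overdetermined 3x2 system the call sequence is rmatrixsolvels(&a, 3, 2, &b, 0.0, &info, &rep, &x, &st), where A is a 3x2 ae_matrix, B is a length-3 ae_vector, Rep is a densesolverlsreport initialized with _densesolverlsreport_init(), st is an initialized ae_state, and the zero threshold tells the solver to pick a cutoff on its own. On exit Info=1, X holds the least squares solution, Rep.R2 stores 1/cond(A) in 2-norm and, if A is rank deficient, Rep.K>0 and the columns of Rep.CX span the null space of A.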
-- ALGLIB -- Copyright 24.08.2009 by Bochkanov Sergey *************************************************************************/ void rmatrixsolvels(/* Real */ ae_matrix* a, ae_int_t nrows, ae_int_t ncols, /* Real */ ae_vector* b, double threshold, ae_int_t* info, densesolverlsreport* rep, /* Real */ ae_vector* x, ae_state *_state) { ae_frame _frame_block; ae_vector sv; ae_matrix u; ae_matrix vt; ae_vector rp; ae_vector utb; ae_vector sutb; ae_vector tmp; ae_vector ta; ae_vector tx; ae_vector buf; ae_vector w; ae_int_t i; ae_int_t j; ae_int_t nsv; ae_int_t kernelidx; double v; double verr; ae_bool svdfailed; ae_bool zeroa; ae_int_t rfs; ae_int_t nrfs; ae_bool terminatenexttime; ae_bool smallerr; ae_frame_make(_state, &_frame_block); *info = 0; _densesolverlsreport_clear(rep); ae_vector_clear(x); ae_vector_init(&sv, 0, DT_REAL, _state); ae_matrix_init(&u, 0, 0, DT_REAL, _state); ae_matrix_init(&vt, 0, 0, DT_REAL, _state); ae_vector_init(&rp, 0, DT_REAL, _state); ae_vector_init(&utb, 0, DT_REAL, _state); ae_vector_init(&sutb, 0, DT_REAL, _state); ae_vector_init(&tmp, 0, DT_REAL, _state); ae_vector_init(&ta, 0, DT_REAL, _state); ae_vector_init(&tx, 0, DT_REAL, _state); ae_vector_init(&buf, 0, DT_REAL, _state); ae_vector_init(&w, 0, DT_REAL, _state); if( (nrows<=0||ncols<=0)||ae_fp_less(threshold,(double)(0)) ) { *info = -1; ae_frame_leave(_state); return; } if( ae_fp_eq(threshold,(double)(0)) ) { threshold = 1000*ae_machineepsilon; } /* * Factorize A first */ svdfailed = !rmatrixsvd(a, nrows, ncols, 1, 2, 2, &sv, &u, &vt, _state); zeroa = ae_fp_eq(sv.ptr.p_double[0],(double)(0)); if( svdfailed||zeroa ) { if( svdfailed ) { *info = -4; } else { *info = 1; } ae_vector_set_length(x, ncols, _state); for(i=0; i<=ncols-1; i++) { x->ptr.p_double[i] = (double)(0); } rep->n = ncols; rep->k = ncols; ae_matrix_set_length(&rep->cx, ncols, ncols, _state); for(i=0; i<=ncols-1; i++) { for(j=0; j<=ncols-1; j++) { if( i==j ) { rep->cx.ptr.pp_double[i][j] = (double)(1); } else { rep->cx.ptr.pp_double[i][j] = (double)(0); } } } rep->r2 = (double)(0); ae_frame_leave(_state); return; } nsv = ae_minint(ncols, nrows, _state); if( nsv==ncols ) { rep->r2 = sv.ptr.p_double[nsv-1]/sv.ptr.p_double[0]; } else { rep->r2 = (double)(0); } rep->n = ncols; *info = 1; /* * Iterative refinement of xc combined with solution: * 1. xc = 0 * 2. calculate r = bc-A*xc using extra-precise dot product * 3. solve A*y = r * 4. update x:=x+r * 5. goto 2 * * This cycle is executed until one of two things happens: * 1. maximum number of iterations reached * 2. 
last iteration decreased error to the lower limit */ ae_vector_set_length(&utb, nsv, _state); ae_vector_set_length(&sutb, nsv, _state); ae_vector_set_length(x, ncols, _state); ae_vector_set_length(&tmp, ncols, _state); ae_vector_set_length(&ta, ncols+1, _state); ae_vector_set_length(&tx, ncols+1, _state); ae_vector_set_length(&buf, ncols+1, _state); for(i=0; i<=ncols-1; i++) { x->ptr.p_double[i] = (double)(0); } kernelidx = nsv; for(i=0; i<=nsv-1; i++) { if( ae_fp_less_eq(sv.ptr.p_double[i],threshold*sv.ptr.p_double[0]) ) { kernelidx = i; break; } } rep->k = ncols-kernelidx; nrfs = densesolver_densesolverrfsmaxv2(ncols, rep->r2, _state); terminatenexttime = ae_false; ae_vector_set_length(&rp, nrows, _state); for(rfs=0; rfs<=nrfs; rfs++) { if( terminatenexttime ) { break; } /* * calculate right part */ if( rfs==0 ) { ae_v_move(&rp.ptr.p_double[0], 1, &b->ptr.p_double[0], 1, ae_v_len(0,nrows-1)); } else { smallerr = ae_true; for(i=0; i<=nrows-1; i++) { ae_v_move(&ta.ptr.p_double[0], 1, &a->ptr.pp_double[i][0], 1, ae_v_len(0,ncols-1)); ta.ptr.p_double[ncols] = (double)(-1); ae_v_move(&tx.ptr.p_double[0], 1, &x->ptr.p_double[0], 1, ae_v_len(0,ncols-1)); tx.ptr.p_double[ncols] = b->ptr.p_double[i]; xdot(&ta, &tx, ncols+1, &buf, &v, &verr, _state); rp.ptr.p_double[i] = -v; smallerr = smallerr&&ae_fp_less(ae_fabs(v, _state),4*verr); } if( smallerr ) { terminatenexttime = ae_true; } } /* * solve A*dx = rp */ for(i=0; i<=ncols-1; i++) { tmp.ptr.p_double[i] = (double)(0); } for(i=0; i<=nsv-1; i++) { utb.ptr.p_double[i] = (double)(0); } for(i=0; i<=nrows-1; i++) { v = rp.ptr.p_double[i]; ae_v_addd(&utb.ptr.p_double[0], 1, &u.ptr.pp_double[i][0], 1, ae_v_len(0,nsv-1), v); } for(i=0; i<=nsv-1; i++) { if( iptr.p_double[0], 1, &tmp.ptr.p_double[0], 1, ae_v_len(0,ncols-1)); } /* * fill CX */ if( rep->k>0 ) { ae_matrix_set_length(&rep->cx, ncols, rep->k, _state); for(i=0; i<=rep->k-1; i++) { ae_v_move(&rep->cx.ptr.pp_double[0][i], rep->cx.stride, &vt.ptr.pp_double[kernelidx+i][0], 1, ae_v_len(0,ncols-1)); } } ae_frame_leave(_state); } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. 
*************************************************************************/ void _pexec_rmatrixsolvels(/* Real */ ae_matrix* a, ae_int_t nrows, ae_int_t ncols, /* Real */ ae_vector* b, double threshold, ae_int_t* info, densesolverlsreport* rep, /* Real */ ae_vector* x, ae_state *_state) { rmatrixsolvels(a,nrows,ncols,b,threshold,info,rep,x, _state); } /************************************************************************* Internal LU solver -- ALGLIB -- Copyright 27.01.2010 by Bochkanov Sergey *************************************************************************/ static void densesolver_rmatrixlusolveinternal(/* Real */ ae_matrix* lua, /* Integer */ ae_vector* p, ae_int_t n, /* Real */ ae_matrix* a, ae_bool havea, /* Real */ ae_matrix* b, ae_int_t m, ae_int_t* info, densesolverreport* rep, /* Real */ ae_matrix* x, ae_state *_state) { ae_frame _frame_block; ae_int_t i; ae_int_t j; ae_int_t k; ae_int_t rfs; ae_int_t nrfs; ae_vector xc; ae_vector y; ae_vector bc; ae_vector xa; ae_vector xb; ae_vector tx; double v; double verr; double mxb; ae_bool smallerr; ae_bool terminatenexttime; ae_frame_make(_state, &_frame_block); *info = 0; _densesolverreport_clear(rep); ae_matrix_clear(x); ae_vector_init(&xc, 0, DT_REAL, _state); ae_vector_init(&y, 0, DT_REAL, _state); ae_vector_init(&bc, 0, DT_REAL, _state); ae_vector_init(&xa, 0, DT_REAL, _state); ae_vector_init(&xb, 0, DT_REAL, _state); ae_vector_init(&tx, 0, DT_REAL, _state); /* * prepare: check inputs, allocate space... */ if( n<=0||m<=0 ) { *info = -1; ae_frame_leave(_state); return; } for(i=0; i<=n-1; i++) { if( p->ptr.p_int[i]>n-1||p->ptr.p_int[i]r1 = rmatrixlurcond1(lua, n, _state); rep->rinf = rmatrixlurcondinf(lua, n, _state); if( ae_fp_less(rep->r1,rcondthreshold(_state))||ae_fp_less(rep->rinf,rcondthreshold(_state)) ) { for(i=0; i<=n-1; i++) { for(j=0; j<=m-1; j++) { x->ptr.pp_double[i][j] = (double)(0); } } rep->r1 = (double)(0); rep->rinf = (double)(0); *info = -3; ae_frame_leave(_state); return; } *info = 1; /* * First stage of solution: rough solution with TRSM() */ mxb = 0.0; for(i=0; i<=n-1; i++) { for(j=0; j<=m-1; j++) { v = b->ptr.pp_double[i][j]; mxb = ae_maxreal(mxb, ae_fabs(v, _state), _state); x->ptr.pp_double[i][j] = v; } } for(i=0; i<=n-1; i++) { if( p->ptr.p_int[i]!=i ) { for(j=0; j<=m-1; j++) { v = x->ptr.pp_double[i][j]; x->ptr.pp_double[i][j] = x->ptr.pp_double[p->ptr.p_int[i]][j]; x->ptr.pp_double[p->ptr.p_int[i]][j] = v; } } } rmatrixlefttrsm(n, m, lua, 0, 0, ae_false, ae_true, 0, x, 0, 0, _state); rmatrixlefttrsm(n, m, lua, 0, 0, ae_true, ae_false, 0, x, 0, 0, _state); /* * Second stage: iterative refinement */ if( havea ) { for(k=0; k<=m-1; k++) { nrfs = densesolver_densesolverrfsmax(n, rep->r1, rep->rinf, _state); terminatenexttime = ae_false; for(rfs=0; rfs<=nrfs-1; rfs++) { if( terminatenexttime ) { break; } /* * generate right part */ smallerr = ae_true; ae_v_move(&xb.ptr.p_double[0], 1, &x->ptr.pp_double[0][k], x->stride, ae_v_len(0,n-1)); for(i=0; i<=n-1; i++) { ae_v_move(&xa.ptr.p_double[0], 1, &a->ptr.pp_double[i][0], 1, ae_v_len(0,n-1)); xa.ptr.p_double[n] = (double)(-1); xb.ptr.p_double[n] = b->ptr.pp_double[i][k]; xdot(&xa, &xb, n+1, &tx, &v, &verr, _state); y.ptr.p_double[i] = -v; smallerr = smallerr&&ae_fp_less(ae_fabs(v, _state),4*verr); } if( smallerr ) { terminatenexttime = ae_true; } /* * solve and update */ densesolver_rbasiclusolve(lua, p, n, &y, _state); ae_v_add(&x->ptr.pp_double[0][k], x->stride, &y.ptr.p_double[0], 1, ae_v_len(0,n-1)); } } } ae_frame_leave(_state); } 
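/*************************************************************************
EDITOR'S ILLUSTRATIVE EXAMPLE - this static helper is NOT part of the
original ALGLIB sources. It is a minimal sketch of the "factor once, solve
many times" pattern recommended by the comments above: the HPD matrix is
factorized once with HPDMatrixCholesky() and the factor is then reused by
the O(N^2) HPDMatrixCholeskySolveFast() for several right parts. The
function name used here is hypothetical.
*************************************************************************/
static void example_hpd_factor_once_solve_many(ae_state *_state)
{
    ae_frame _frame_block;
    ae_matrix a;
    ae_vector b;
    ae_int_t info;
    ae_int_t k;

    ae_frame_make(_state, &_frame_block);
    ae_matrix_init(&a, 0, 0, DT_COMPLEX, _state);
    ae_vector_init(&b, 0, DT_COMPLEX, _state);

    /*
     * 2x2 Hermitian positive definite matrix; only the upper triangle
     * is stored because the solvers are called with IsUpper=True
     */
    ae_matrix_set_length(&a, 2, 2, _state);
    a.ptr.pp_complex[0][0] = ae_complex_from_d(2.0);
    a.ptr.pp_complex[0][1] = ae_complex_from_d(0.5);
    a.ptr.pp_complex[1][1] = ae_complex_from_d(2.0);

    /*
     * Factorize once, O(N^3); A is overwritten by its Cholesky factor
     */
    if( !hpdmatrixcholesky(&a, 2, ae_true, _state) )
    {
        ae_frame_leave(_state);
        return;
    }

    /*
     * Reuse the factor for several right parts, O(N^2) per call;
     * on success Info=1 and B is overwritten by the solution
     */
    for(k=0; k<=2; k++)
    {
        ae_vector_set_length(&b, 2, _state);
        b.ptr.p_complex[0] = ae_complex_from_d(1.0+k);
        b.ptr.p_complex[1] = ae_complex_from_d(1.0);
        hpdmatrixcholeskysolvefast(&a, 2, ae_true, &b, &info, _state);
    }
    ae_frame_leave(_state);
}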
/************************************************************************* Internal Cholesky solver -- ALGLIB -- Copyright 27.01.2010 by Bochkanov Sergey *************************************************************************/ static void densesolver_spdmatrixcholeskysolveinternal(/* Real */ ae_matrix* cha, ae_int_t n, ae_bool isupper, /* Real */ ae_matrix* a, ae_bool havea, /* Real */ ae_matrix* b, ae_int_t m, ae_int_t* info, densesolverreport* rep, /* Real */ ae_matrix* x, ae_state *_state) { ae_int_t i; ae_int_t j; *info = 0; _densesolverreport_clear(rep); ae_matrix_clear(x); /* * prepare: check inputs, allocate space... */ if( n<=0||m<=0 ) { *info = -1; return; } ae_matrix_set_length(x, n, m, _state); /* * estimate condition number, test for near singularity */ rep->r1 = spdmatrixcholeskyrcond(cha, n, isupper, _state); rep->rinf = rep->r1; if( ae_fp_less(rep->r1,rcondthreshold(_state)) ) { for(i=0; i<=n-1; i++) { for(j=0; j<=m-1; j++) { x->ptr.pp_double[i][j] = (double)(0); } } rep->r1 = (double)(0); rep->rinf = (double)(0); *info = -3; return; } *info = 1; /* * Solve with TRSM() */ for(i=0; i<=n-1; i++) { for(j=0; j<=m-1; j++) { x->ptr.pp_double[i][j] = b->ptr.pp_double[i][j]; } } if( isupper ) { rmatrixlefttrsm(n, m, cha, 0, 0, ae_true, ae_false, 1, x, 0, 0, _state); rmatrixlefttrsm(n, m, cha, 0, 0, ae_true, ae_false, 0, x, 0, 0, _state); } else { rmatrixlefttrsm(n, m, cha, 0, 0, ae_false, ae_false, 0, x, 0, 0, _state); rmatrixlefttrsm(n, m, cha, 0, 0, ae_false, ae_false, 1, x, 0, 0, _state); } } /************************************************************************* Internal LU solver -- ALGLIB -- Copyright 27.01.2010 by Bochkanov Sergey *************************************************************************/ static void densesolver_cmatrixlusolveinternal(/* Complex */ ae_matrix* lua, /* Integer */ ae_vector* p, ae_int_t n, /* Complex */ ae_matrix* a, ae_bool havea, /* Complex */ ae_matrix* b, ae_int_t m, ae_int_t* info, densesolverreport* rep, /* Complex */ ae_matrix* x, ae_state *_state) { ae_frame _frame_block; ae_int_t i; ae_int_t j; ae_int_t k; ae_int_t rfs; ae_int_t nrfs; ae_vector xc; ae_vector y; ae_vector bc; ae_vector xa; ae_vector xb; ae_vector tx; ae_vector tmpbuf; ae_complex v; double verr; ae_bool smallerr; ae_bool terminatenexttime; ae_frame_make(_state, &_frame_block); *info = 0; _densesolverreport_clear(rep); ae_matrix_clear(x); ae_vector_init(&xc, 0, DT_COMPLEX, _state); ae_vector_init(&y, 0, DT_COMPLEX, _state); ae_vector_init(&bc, 0, DT_COMPLEX, _state); ae_vector_init(&xa, 0, DT_COMPLEX, _state); ae_vector_init(&xb, 0, DT_COMPLEX, _state); ae_vector_init(&tx, 0, DT_COMPLEX, _state); ae_vector_init(&tmpbuf, 0, DT_REAL, _state); /* * prepare: check inputs, allocate space... 
*/ if( n<=0||m<=0 ) { *info = -1; ae_frame_leave(_state); return; } for(i=0; i<=n-1; i++) { if( p->ptr.p_int[i]>n-1||p->ptr.p_int[i]r1 = cmatrixlurcond1(lua, n, _state); rep->rinf = cmatrixlurcondinf(lua, n, _state); if( ae_fp_less(rep->r1,rcondthreshold(_state))||ae_fp_less(rep->rinf,rcondthreshold(_state)) ) { for(i=0; i<=n-1; i++) { for(j=0; j<=m-1; j++) { x->ptr.pp_complex[i][j] = ae_complex_from_i(0); } } rep->r1 = (double)(0); rep->rinf = (double)(0); *info = -3; ae_frame_leave(_state); return; } *info = 1; /* * First phase: solve with TRSM() */ for(i=0; i<=n-1; i++) { for(j=0; j<=m-1; j++) { x->ptr.pp_complex[i][j] = b->ptr.pp_complex[i][j]; } } for(i=0; i<=n-1; i++) { if( p->ptr.p_int[i]!=i ) { for(j=0; j<=m-1; j++) { v = x->ptr.pp_complex[i][j]; x->ptr.pp_complex[i][j] = x->ptr.pp_complex[p->ptr.p_int[i]][j]; x->ptr.pp_complex[p->ptr.p_int[i]][j] = v; } } } cmatrixlefttrsm(n, m, lua, 0, 0, ae_false, ae_true, 0, x, 0, 0, _state); cmatrixlefttrsm(n, m, lua, 0, 0, ae_true, ae_false, 0, x, 0, 0, _state); /* * solve */ for(k=0; k<=m-1; k++) { ae_v_cmove(&bc.ptr.p_complex[0], 1, &b->ptr.pp_complex[0][k], b->stride, "N", ae_v_len(0,n-1)); ae_v_cmove(&xc.ptr.p_complex[0], 1, &x->ptr.pp_complex[0][k], x->stride, "N", ae_v_len(0,n-1)); /* * Iterative refinement of xc: * * calculate r = bc-A*xc using extra-precise dot product * * solve A*y = r * * update x:=x+r * * This cycle is executed until one of two things happens: * 1. maximum number of iterations reached * 2. last iteration decreased error to the lower limit */ if( havea ) { nrfs = densesolver_densesolverrfsmax(n, rep->r1, rep->rinf, _state); terminatenexttime = ae_false; for(rfs=0; rfs<=nrfs-1; rfs++) { if( terminatenexttime ) { break; } /* * generate right part */ smallerr = ae_true; ae_v_cmove(&xb.ptr.p_complex[0], 1, &xc.ptr.p_complex[0], 1, "N", ae_v_len(0,n-1)); for(i=0; i<=n-1; i++) { ae_v_cmove(&xa.ptr.p_complex[0], 1, &a->ptr.pp_complex[i][0], 1, "N", ae_v_len(0,n-1)); xa.ptr.p_complex[n] = ae_complex_from_i(-1); xb.ptr.p_complex[n] = bc.ptr.p_complex[i]; xcdot(&xa, &xb, n+1, &tmpbuf, &v, &verr, _state); y.ptr.p_complex[i] = ae_c_neg(v); smallerr = smallerr&&ae_fp_less(ae_c_abs(v, _state),4*verr); } if( smallerr ) { terminatenexttime = ae_true; } /* * solve and update */ densesolver_cbasiclusolve(lua, p, n, &y, _state); ae_v_cadd(&xc.ptr.p_complex[0], 1, &y.ptr.p_complex[0], 1, "N", ae_v_len(0,n-1)); } } /* * Store xc. * Post-scale result. 
*/ ae_v_cmove(&x->ptr.pp_complex[0][k], x->stride, &xc.ptr.p_complex[0], 1, "N", ae_v_len(0,n-1)); } ae_frame_leave(_state); } /************************************************************************* Internal Cholesky solver -- ALGLIB -- Copyright 27.01.2010 by Bochkanov Sergey *************************************************************************/ static void densesolver_hpdmatrixcholeskysolveinternal(/* Complex */ ae_matrix* cha, ae_int_t n, ae_bool isupper, /* Complex */ ae_matrix* a, ae_bool havea, /* Complex */ ae_matrix* b, ae_int_t m, ae_int_t* info, densesolverreport* rep, /* Complex */ ae_matrix* x, ae_state *_state) { ae_frame _frame_block; ae_int_t i; ae_int_t j; ae_vector xc; ae_vector y; ae_vector bc; ae_vector xa; ae_vector xb; ae_vector tx; ae_frame_make(_state, &_frame_block); *info = 0; _densesolverreport_clear(rep); ae_matrix_clear(x); ae_vector_init(&xc, 0, DT_COMPLEX, _state); ae_vector_init(&y, 0, DT_COMPLEX, _state); ae_vector_init(&bc, 0, DT_COMPLEX, _state); ae_vector_init(&xa, 0, DT_COMPLEX, _state); ae_vector_init(&xb, 0, DT_COMPLEX, _state); ae_vector_init(&tx, 0, DT_COMPLEX, _state); /* * prepare: check inputs, allocate space... */ if( n<=0||m<=0 ) { *info = -1; ae_frame_leave(_state); return; } ae_matrix_set_length(x, n, m, _state); ae_vector_set_length(&y, n, _state); ae_vector_set_length(&xc, n, _state); ae_vector_set_length(&bc, n, _state); ae_vector_set_length(&tx, n+1, _state); ae_vector_set_length(&xa, n+1, _state); ae_vector_set_length(&xb, n+1, _state); /* * estimate condition number, test for near singularity */ rep->r1 = hpdmatrixcholeskyrcond(cha, n, isupper, _state); rep->rinf = rep->r1; if( ae_fp_less(rep->r1,rcondthreshold(_state)) ) { for(i=0; i<=n-1; i++) { for(j=0; j<=m-1; j++) { x->ptr.pp_complex[i][j] = ae_complex_from_i(0); } } rep->r1 = (double)(0); rep->rinf = (double)(0); *info = -3; ae_frame_leave(_state); return; } *info = 1; /* * solve */ for(i=0; i<=n-1; i++) { for(j=0; j<=m-1; j++) { x->ptr.pp_complex[i][j] = b->ptr.pp_complex[i][j]; } } if( isupper ) { cmatrixlefttrsm(n, m, cha, 0, 0, ae_true, ae_false, 2, x, 0, 0, _state); cmatrixlefttrsm(n, m, cha, 0, 0, ae_true, ae_false, 0, x, 0, 0, _state); } else { cmatrixlefttrsm(n, m, cha, 0, 0, ae_false, ae_false, 0, x, 0, 0, _state); cmatrixlefttrsm(n, m, cha, 0, 0, ae_false, ae_false, 2, x, 0, 0, _state); } ae_frame_leave(_state); } /************************************************************************* Internal subroutine. Returns maximum count of RFS iterations as function of: 1. machine epsilon 2. task size. 3. condition number -- ALGLIB -- Copyright 27.01.2010 by Bochkanov Sergey *************************************************************************/ static ae_int_t densesolver_densesolverrfsmax(ae_int_t n, double r1, double rinf, ae_state *_state) { ae_int_t result; result = 5; return result; } /************************************************************************* Internal subroutine. Returns maximum count of RFS iterations as function of: 1. machine epsilon 2. task size. 3. norm-2 condition number -- ALGLIB -- Copyright 27.01.2010 by Bochkanov Sergey *************************************************************************/ static ae_int_t densesolver_densesolverrfsmaxv2(ae_int_t n, double r2, ae_state *_state) { ae_int_t result; result = densesolver_densesolverrfsmax(n, (double)(0), (double)(0), _state); return result; } /************************************************************************* Basic LU solver for PLU*x = y. 
This subroutine assumes that: * A=PLU is well-conditioned, so no zero divisions or overflow may occur -- ALGLIB -- Copyright 27.01.2010 by Bochkanov Sergey *************************************************************************/ static void densesolver_rbasiclusolve(/* Real */ ae_matrix* lua, /* Integer */ ae_vector* p, ae_int_t n, /* Real */ ae_vector* xb, ae_state *_state) { ae_int_t i; double v; for(i=0; i<=n-1; i++) { if( p->ptr.p_int[i]!=i ) { v = xb->ptr.p_double[i]; xb->ptr.p_double[i] = xb->ptr.p_double[p->ptr.p_int[i]]; xb->ptr.p_double[p->ptr.p_int[i]] = v; } } for(i=1; i<=n-1; i++) { v = ae_v_dotproduct(&lua->ptr.pp_double[i][0], 1, &xb->ptr.p_double[0], 1, ae_v_len(0,i-1)); xb->ptr.p_double[i] = xb->ptr.p_double[i]-v; } xb->ptr.p_double[n-1] = xb->ptr.p_double[n-1]/lua->ptr.pp_double[n-1][n-1]; for(i=n-2; i>=0; i--) { v = ae_v_dotproduct(&lua->ptr.pp_double[i][i+1], 1, &xb->ptr.p_double[i+1], 1, ae_v_len(i+1,n-1)); xb->ptr.p_double[i] = (xb->ptr.p_double[i]-v)/lua->ptr.pp_double[i][i]; } } /************************************************************************* Basic Cholesky solver for ScaleA*Cholesky(A)'*x = y. This subroutine assumes that: * A*ScaleA is well scaled * A is well-conditioned, so no zero divisions or overflow may occur -- ALGLIB -- Copyright 27.01.2010 by Bochkanov Sergey *************************************************************************/ static void densesolver_spdbasiccholeskysolve(/* Real */ ae_matrix* cha, ae_int_t n, ae_bool isupper, /* Real */ ae_vector* xb, ae_state *_state) { ae_int_t i; double v; /* * A = L*L' or A=U'*U */ if( isupper ) { /* * Solve U'*y=b first. */ for(i=0; i<=n-1; i++) { xb->ptr.p_double[i] = xb->ptr.p_double[i]/cha->ptr.pp_double[i][i]; if( i<n-1 ) { v = xb->ptr.p_double[i]; ae_v_subd(&xb->ptr.p_double[i+1], 1, &cha->ptr.pp_double[i][i+1], 1, ae_v_len(i+1,n-1), v); } } /* * Solve U*x=y then. */ for(i=n-1; i>=0; i--) { if( i<n-1 ) { v = ae_v_dotproduct(&cha->ptr.pp_double[i][i+1], 1, &xb->ptr.p_double[i+1], 1, ae_v_len(i+1,n-1)); xb->ptr.p_double[i] = xb->ptr.p_double[i]-v; } xb->ptr.p_double[i] = xb->ptr.p_double[i]/cha->ptr.pp_double[i][i]; } } else { /* * Solve L*y=b first */ for(i=0; i<=n-1; i++) { if( i>0 ) { v = ae_v_dotproduct(&cha->ptr.pp_double[i][0], 1, &xb->ptr.p_double[0], 1, ae_v_len(0,i-1)); xb->ptr.p_double[i] = xb->ptr.p_double[i]-v; } xb->ptr.p_double[i] = xb->ptr.p_double[i]/cha->ptr.pp_double[i][i]; } /* * Solve L'*x=y then. */ for(i=n-1; i>=0; i--) { xb->ptr.p_double[i] = xb->ptr.p_double[i]/cha->ptr.pp_double[i][i]; if( i>0 ) { v = xb->ptr.p_double[i]; ae_v_subd(&xb->ptr.p_double[0], 1, &cha->ptr.pp_double[i][0], 1, ae_v_len(0,i-1), v); } } } } /************************************************************************* Basic LU solver for ScaleA*PLU*x = y. This subroutine assumes that: * L is well-scaled, and it is U which needs scaling by ScaleA.
* A=PLU is well-conditioned, so no zero divisions or overflow may occur -- ALGLIB -- Copyright 27.01.2010 by Bochkanov Sergey *************************************************************************/ static void densesolver_cbasiclusolve(/* Complex */ ae_matrix* lua, /* Integer */ ae_vector* p, ae_int_t n, /* Complex */ ae_vector* xb, ae_state *_state) { ae_int_t i; ae_complex v; for(i=0; i<=n-1; i++) { if( p->ptr.p_int[i]!=i ) { v = xb->ptr.p_complex[i]; xb->ptr.p_complex[i] = xb->ptr.p_complex[p->ptr.p_int[i]]; xb->ptr.p_complex[p->ptr.p_int[i]] = v; } } for(i=1; i<=n-1; i++) { v = ae_v_cdotproduct(&lua->ptr.pp_complex[i][0], 1, "N", &xb->ptr.p_complex[0], 1, "N", ae_v_len(0,i-1)); xb->ptr.p_complex[i] = ae_c_sub(xb->ptr.p_complex[i],v); } xb->ptr.p_complex[n-1] = ae_c_div(xb->ptr.p_complex[n-1],lua->ptr.pp_complex[n-1][n-1]); for(i=n-2; i>=0; i--) { v = ae_v_cdotproduct(&lua->ptr.pp_complex[i][i+1], 1, "N", &xb->ptr.p_complex[i+1], 1, "N", ae_v_len(i+1,n-1)); xb->ptr.p_complex[i] = ae_c_div(ae_c_sub(xb->ptr.p_complex[i],v),lua->ptr.pp_complex[i][i]); } } /************************************************************************* Basic Cholesky solver for ScaleA*Cholesky(A)'*x = y. This subroutine assumes that: * A*ScaleA is well scaled * A is well-conditioned, so no zero divisions or overflow may occur -- ALGLIB -- Copyright 27.01.2010 by Bochkanov Sergey *************************************************************************/ static void densesolver_hpdbasiccholeskysolve(/* Complex */ ae_matrix* cha, ae_int_t n, ae_bool isupper, /* Complex */ ae_vector* xb, ae_state *_state) { ae_int_t i; ae_complex v; /* * A = L*L' or A=U'*U */ if( isupper ) { /* * Solve U'*y=b first. */ for(i=0; i<=n-1; i++) { xb->ptr.p_complex[i] = ae_c_div(xb->ptr.p_complex[i],ae_c_conj(cha->ptr.pp_complex[i][i], _state)); if( iptr.p_complex[i]; ae_v_csubc(&xb->ptr.p_complex[i+1], 1, &cha->ptr.pp_complex[i][i+1], 1, "Conj", ae_v_len(i+1,n-1), v); } } /* * Solve U*x=y then. */ for(i=n-1; i>=0; i--) { if( iptr.pp_complex[i][i+1], 1, "N", &xb->ptr.p_complex[i+1], 1, "N", ae_v_len(i+1,n-1)); xb->ptr.p_complex[i] = ae_c_sub(xb->ptr.p_complex[i],v); } xb->ptr.p_complex[i] = ae_c_div(xb->ptr.p_complex[i],cha->ptr.pp_complex[i][i]); } } else { /* * Solve L*y=b first */ for(i=0; i<=n-1; i++) { if( i>0 ) { v = ae_v_cdotproduct(&cha->ptr.pp_complex[i][0], 1, "N", &xb->ptr.p_complex[0], 1, "N", ae_v_len(0,i-1)); xb->ptr.p_complex[i] = ae_c_sub(xb->ptr.p_complex[i],v); } xb->ptr.p_complex[i] = ae_c_div(xb->ptr.p_complex[i],cha->ptr.pp_complex[i][i]); } /* * Solve L'*x=y then. 
*/ for(i=n-1; i>=0; i--) { xb->ptr.p_complex[i] = ae_c_div(xb->ptr.p_complex[i],ae_c_conj(cha->ptr.pp_complex[i][i], _state)); if( i>0 ) { v = xb->ptr.p_complex[i]; ae_v_csubc(&xb->ptr.p_complex[0], 1, &cha->ptr.pp_complex[i][0], 1, "Conj", ae_v_len(0,i-1), v); } } } } void _densesolverreport_init(void* _p, ae_state *_state) { densesolverreport *p = (densesolverreport*)_p; ae_touch_ptr((void*)p); } void _densesolverreport_init_copy(void* _dst, void* _src, ae_state *_state) { densesolverreport *dst = (densesolverreport*)_dst; densesolverreport *src = (densesolverreport*)_src; dst->r1 = src->r1; dst->rinf = src->rinf; } void _densesolverreport_clear(void* _p) { densesolverreport *p = (densesolverreport*)_p; ae_touch_ptr((void*)p); } void _densesolverreport_destroy(void* _p) { densesolverreport *p = (densesolverreport*)_p; ae_touch_ptr((void*)p); } void _densesolverlsreport_init(void* _p, ae_state *_state) { densesolverlsreport *p = (densesolverlsreport*)_p; ae_touch_ptr((void*)p); ae_matrix_init(&p->cx, 0, 0, DT_REAL, _state); } void _densesolverlsreport_init_copy(void* _dst, void* _src, ae_state *_state) { densesolverlsreport *dst = (densesolverlsreport*)_dst; densesolverlsreport *src = (densesolverlsreport*)_src; dst->r2 = src->r2; ae_matrix_init_copy(&dst->cx, &src->cx, _state); dst->n = src->n; dst->k = src->k; } void _densesolverlsreport_clear(void* _p) { densesolverlsreport *p = (densesolverlsreport*)_p; ae_touch_ptr((void*)p); ae_matrix_clear(&p->cx); } void _densesolverlsreport_destroy(void* _p) { densesolverlsreport *p = (densesolverlsreport*)_p; ae_touch_ptr((void*)p); ae_matrix_destroy(&p->cx); } /************************************************************************* This function initializes linear LSQR Solver. This solver is used to solve non-symmetric (and, possibly, non-square) problems. Least squares solution is returned for non-compatible systems. USAGE: 1. User initializes algorithm state with LinLSQRCreate() call 2. User tunes solver parameters with LinLSQRSetCond() and other functions 3. User calls LinLSQRSolveSparse() function which takes algorithm state and SparseMatrix object. 4. User calls LinLSQRResults() to get solution 5. Optionally, user may call LinLSQRSolveSparse() again to solve another problem with different matrix and/or right part without reinitializing LinLSQRState structure. 
INPUT PARAMETERS: M - number of rows in A N - number of variables, N>0 OUTPUT PARAMETERS: State - structure which stores algorithm state -- ALGLIB -- Copyright 30.11.2011 by Bochkanov Sergey *************************************************************************/ void linlsqrcreate(ae_int_t m, ae_int_t n, linlsqrstate* state, ae_state *_state) { ae_int_t i; _linlsqrstate_clear(state); ae_assert(m>0, "LinLSQRCreate: M<=0", _state); ae_assert(n>0, "LinLSQRCreate: N<=0", _state); state->m = m; state->n = n; state->prectype = 0; state->epsa = linlsqr_atol; state->epsb = linlsqr_btol; state->epsc = 1/ae_sqrt(ae_machineepsilon, _state); state->maxits = 0; state->lambdai = (double)(0); state->xrep = ae_false; state->running = ae_false; /* * * allocate arrays * * set RX to NAN (just for the case user calls Results() without * calling SolveSparse() * * set B to zero */ normestimatorcreate(m, n, 2, 2, &state->nes, _state); ae_vector_set_length(&state->rx, state->n, _state); ae_vector_set_length(&state->ui, state->m+state->n, _state); ae_vector_set_length(&state->uip1, state->m+state->n, _state); ae_vector_set_length(&state->vip1, state->n, _state); ae_vector_set_length(&state->vi, state->n, _state); ae_vector_set_length(&state->omegai, state->n, _state); ae_vector_set_length(&state->omegaip1, state->n, _state); ae_vector_set_length(&state->d, state->n, _state); ae_vector_set_length(&state->x, state->m+state->n, _state); ae_vector_set_length(&state->mv, state->m+state->n, _state); ae_vector_set_length(&state->mtv, state->n, _state); ae_vector_set_length(&state->b, state->m, _state); for(i=0; i<=n-1; i++) { state->rx.ptr.p_double[i] = _state->v_nan; } for(i=0; i<=m-1; i++) { state->b.ptr.p_double[i] = (double)(0); } ae_vector_set_length(&state->rstate.ia, 1+1, _state); ae_vector_set_length(&state->rstate.ra, 0+1, _state); state->rstate.stage = -1; } /************************************************************************* This function sets right part. By default, right part is zero. INPUT PARAMETERS: B - right part, array[N]. OUTPUT PARAMETERS: State - structure which stores algorithm state -- ALGLIB -- Copyright 30.11.2011 by Bochkanov Sergey *************************************************************************/ void linlsqrsetb(linlsqrstate* state, /* Real */ ae_vector* b, ae_state *_state) { ae_int_t i; ae_assert(!state->running, "LinLSQRSetB: you can not change B when LinLSQRIteration is running", _state); ae_assert(state->m<=b->cnt, "LinLSQRSetB: Length(B)m, _state), "LinLSQRSetB: B contains infinite or NaN values", _state); state->bnorm2 = (double)(0); for(i=0; i<=state->m-1; i++) { state->b.ptr.p_double[i] = b->ptr.p_double[i]; state->bnorm2 = state->bnorm2+b->ptr.p_double[i]*b->ptr.p_double[i]; } } /************************************************************************* This function changes preconditioning settings of LinLSQQSolveSparse() function. By default, SolveSparse() uses diagonal preconditioner, but if you want to use solver without preconditioning, you can call this function which forces solver to use unit matrix for preconditioning. 
INPUT PARAMETERS: State - structure which stores algorithm state -- ALGLIB -- Copyright 19.11.2012 by Bochkanov Sergey *************************************************************************/ void linlsqrsetprecunit(linlsqrstate* state, ae_state *_state) { ae_assert(!state->running, "LinLSQRSetPrecUnit: you can not change preconditioner, because function LinLSQRIteration is running!", _state); state->prectype = -1; } /************************************************************************* This function changes preconditioning settings of LinCGSolveSparse() function. LinCGSolveSparse() will use diagonal of the system matrix as preconditioner. This preconditioning mode is active by default. INPUT PARAMETERS: State - structure which stores algorithm state -- ALGLIB -- Copyright 19.11.2012 by Bochkanov Sergey *************************************************************************/ void linlsqrsetprecdiag(linlsqrstate* state, ae_state *_state) { ae_assert(!state->running, "LinLSQRSetPrecDiag: you can not change preconditioner, because function LinCGIteration is running!", _state); state->prectype = 0; } /************************************************************************* This function sets optional Tikhonov regularization coefficient. It is zero by default. INPUT PARAMETERS: LambdaI - regularization factor, LambdaI>=0 OUTPUT PARAMETERS: State - structure which stores algorithm state -- ALGLIB -- Copyright 30.11.2011 by Bochkanov Sergey *************************************************************************/ void linlsqrsetlambdai(linlsqrstate* state, double lambdai, ae_state *_state) { ae_assert(!state->running, "LinLSQRSetLambdaI: you can not set LambdaI, because function LinLSQRIteration is running", _state); ae_assert(ae_isfinite(lambdai, _state)&&ae_fp_greater_eq(lambdai,(double)(0)), "LinLSQRSetLambdaI: LambdaI is infinite or NaN", _state); state->lambdai = lambdai; } /************************************************************************* -- ALGLIB -- Copyright 30.11.2011 by Bochkanov Sergey *************************************************************************/ ae_bool linlsqriteration(linlsqrstate* state, ae_state *_state) { ae_int_t summn; double bnorm; ae_int_t i; ae_bool result; /* * Reverse communication preparations * I know it looks ugly, but it works the same way * anywhere from C++ to Python. 
* * This code initializes locals by: * * random values determined during code * generation - on first subroutine call * * values from previous call - on subsequent calls */ if( state->rstate.stage>=0 ) { summn = state->rstate.ia.ptr.p_int[0]; i = state->rstate.ia.ptr.p_int[1]; bnorm = state->rstate.ra.ptr.p_double[0]; } else { summn = 359; i = -58; bnorm = -919; } if( state->rstate.stage==0 ) { goto lbl_0; } if( state->rstate.stage==1 ) { goto lbl_1; } if( state->rstate.stage==2 ) { goto lbl_2; } if( state->rstate.stage==3 ) { goto lbl_3; } if( state->rstate.stage==4 ) { goto lbl_4; } if( state->rstate.stage==5 ) { goto lbl_5; } if( state->rstate.stage==6 ) { goto lbl_6; } /* * Routine body */ ae_assert(state->b.cnt>0, "LinLSQRIteration: using non-allocated array B", _state); bnorm = ae_sqrt(state->bnorm2, _state); state->running = ae_true; state->repnmv = 0; linlsqr_clearrfields(state, _state); state->repiterationscount = 0; summn = state->m+state->n; state->r2 = state->bnorm2; /* *estimate for ANorm */ normestimatorrestart(&state->nes, _state); lbl_7: if( !normestimatoriteration(&state->nes, _state) ) { goto lbl_8; } if( !state->nes.needmv ) { goto lbl_9; } ae_v_move(&state->x.ptr.p_double[0], 1, &state->nes.x.ptr.p_double[0], 1, ae_v_len(0,state->n-1)); state->repnmv = state->repnmv+1; linlsqr_clearrfields(state, _state); state->needmv = ae_true; state->rstate.stage = 0; goto lbl_rcomm; lbl_0: state->needmv = ae_false; ae_v_move(&state->nes.mv.ptr.p_double[0], 1, &state->mv.ptr.p_double[0], 1, ae_v_len(0,state->m-1)); goto lbl_7; lbl_9: if( !state->nes.needmtv ) { goto lbl_11; } ae_v_move(&state->x.ptr.p_double[0], 1, &state->nes.x.ptr.p_double[0], 1, ae_v_len(0,state->m-1)); /* *matrix-vector multiplication */ state->repnmv = state->repnmv+1; linlsqr_clearrfields(state, _state); state->needmtv = ae_true; state->rstate.stage = 1; goto lbl_rcomm; lbl_1: state->needmtv = ae_false; ae_v_move(&state->nes.mtv.ptr.p_double[0], 1, &state->mtv.ptr.p_double[0], 1, ae_v_len(0,state->n-1)); goto lbl_7; lbl_11: goto lbl_7; lbl_8: normestimatorresults(&state->nes, &state->anorm, _state); /* *initialize .RX by zeros */ for(i=0; i<=state->n-1; i++) { state->rx.ptr.p_double[i] = (double)(0); } /* *output first report */ if( !state->xrep ) { goto lbl_13; } ae_v_move(&state->x.ptr.p_double[0], 1, &state->rx.ptr.p_double[0], 1, ae_v_len(0,state->n-1)); linlsqr_clearrfields(state, _state); state->xupdated = ae_true; state->rstate.stage = 2; goto lbl_rcomm; lbl_2: state->xupdated = ae_false; lbl_13: /* * LSQR, Step 0. * * Algorithm outline corresponds to one which was described at p.50 of * "LSQR - an algorithm for sparse linear equations and sparse least * squares" by C.Paige and M.Saunders with one small addition - we * explicitly extend system matrix by additional N lines in order * to handle non-zero lambda, i.e. original A is replaced by * [ A ] * A_mod = [ ] * [ lambda*I ]. * * Step 0: * x[0] = 0 * beta[1]*u[1] = b * alpha[1]*v[1] = A_mod'*u[1] * w[1] = v[1] * phiBar[1] = beta[1] * rhoBar[1] = alpha[1] * d[0] = 0 * * NOTE: * There are three criteria for stopping: * (S0) maximum number of iterations * (S1) ||Rk||<=EpsB*||B||; * (S2) ||A^T*Rk||/(||A||*||Rk||)<=EpsA. * It is very important that S2 always checked AFTER S1. It is necessary * to avoid division by zero when Rk=0. 
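*
* EDITOR'S NOTE (summary added for clarity; it merely restates what the
* code below does, in the notation of the comment above). At step i the
* bidiagonalization produces beta[i+1] and alpha[i+1]; the algorithm then
* applies a plane rotation
*     rho[i]      = sqrt(rhoBar[i]^2+beta[i+1]^2)
*     c[i]        = rhoBar[i]/rho[i],     s[i]        = beta[i+1]/rho[i]
*     theta[i+1]  = s[i]*alpha[i+1],      rhoBar[i+1] = -c[i]*alpha[i+1]
*     phi[i]      = c[i]*phiBar[i],       phiBar[i+1] = s[i]*phiBar[i]
* and updates the iterate and the search direction as
*     x[i]   = x[i-1] + (phi[i]/rho[i])*w[i]
*     w[i+1] = v[i+1] - (theta[i+1]/rho[i])*w[i]
* PhiBar[i+1] is the estimate of ||Rk|| used by stopping criterion (S1),
* and PhiBar[i+1]*Alpha[i+1]*|c[i]| is the estimate used by (S2).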
*/ state->betai = bnorm; if( ae_fp_eq(state->betai,(double)(0)) ) { /* * Zero right part */ state->running = ae_false; state->repterminationtype = 1; result = ae_false; return result; } for(i=0; i<=summn-1; i++) { if( i<state->m ) { state->ui.ptr.p_double[i] = state->b.ptr.p_double[i]/state->betai; } else { state->ui.ptr.p_double[i] = (double)(0); } state->x.ptr.p_double[i] = state->ui.ptr.p_double[i]; } state->repnmv = state->repnmv+1; linlsqr_clearrfields(state, _state); state->needmtv = ae_true; state->rstate.stage = 3; goto lbl_rcomm; lbl_3: state->needmtv = ae_false; for(i=0; i<=state->n-1; i++) { state->mtv.ptr.p_double[i] = state->mtv.ptr.p_double[i]+state->lambdai*state->ui.ptr.p_double[state->m+i]; } state->alphai = (double)(0); for(i=0; i<=state->n-1; i++) { state->alphai = state->alphai+state->mtv.ptr.p_double[i]*state->mtv.ptr.p_double[i]; } state->alphai = ae_sqrt(state->alphai, _state); if( ae_fp_eq(state->alphai,(double)(0)) ) { /* * Orthogonality stopping criterion is met */ state->running = ae_false; state->repterminationtype = 4; result = ae_false; return result; } for(i=0; i<=state->n-1; i++) { state->vi.ptr.p_double[i] = state->mtv.ptr.p_double[i]/state->alphai; state->omegai.ptr.p_double[i] = state->vi.ptr.p_double[i]; } state->phibari = state->betai; state->rhobari = state->alphai; for(i=0; i<=state->n-1; i++) { state->d.ptr.p_double[i] = (double)(0); } state->dnorm = (double)(0); /* * Steps I=1, 2, ... */ lbl_15: if( ae_false ) { goto lbl_16; } /* * At I-th step State.RepIterationsCount=I. */ state->repiterationscount = state->repiterationscount+1; /* * Bidiagonalization part: * beta[i+1]*u[i+1] = A_mod*v[i]-alpha[i]*u[i] * alpha[i+1]*v[i+1] = A_mod'*u[i+1] - beta[i+1]*v[i] * * NOTE: beta[i+1]=0 or alpha[i+1]=0 will lead to successful termination * in the end of the current iteration. In this case u/v are zero. * NOTE2: algorithm won't fail on zero alpha or beta (there will be no * division by zero because it will be stopped BEFORE division * occurs). However, near-zero alpha and beta won't stop algorithm * and, although no division by zero will happen, orthogonality * in U and V will be lost. 
*/ ae_v_move(&state->x.ptr.p_double[0], 1, &state->vi.ptr.p_double[0], 1, ae_v_len(0,state->n-1)); state->repnmv = state->repnmv+1; linlsqr_clearrfields(state, _state); state->needmv = ae_true; state->rstate.stage = 4; goto lbl_rcomm; lbl_4: state->needmv = ae_false; for(i=0; i<=state->n-1; i++) { state->mv.ptr.p_double[state->m+i] = state->lambdai*state->vi.ptr.p_double[i]; } state->betaip1 = (double)(0); for(i=0; i<=summn-1; i++) { state->uip1.ptr.p_double[i] = state->mv.ptr.p_double[i]-state->alphai*state->ui.ptr.p_double[i]; state->betaip1 = state->betaip1+state->uip1.ptr.p_double[i]*state->uip1.ptr.p_double[i]; } if( ae_fp_neq(state->betaip1,(double)(0)) ) { state->betaip1 = ae_sqrt(state->betaip1, _state); for(i=0; i<=summn-1; i++) { state->uip1.ptr.p_double[i] = state->uip1.ptr.p_double[i]/state->betaip1; } } ae_v_move(&state->x.ptr.p_double[0], 1, &state->uip1.ptr.p_double[0], 1, ae_v_len(0,state->m-1)); state->repnmv = state->repnmv+1; linlsqr_clearrfields(state, _state); state->needmtv = ae_true; state->rstate.stage = 5; goto lbl_rcomm; lbl_5: state->needmtv = ae_false; for(i=0; i<=state->n-1; i++) { state->mtv.ptr.p_double[i] = state->mtv.ptr.p_double[i]+state->lambdai*state->uip1.ptr.p_double[state->m+i]; } state->alphaip1 = (double)(0); for(i=0; i<=state->n-1; i++) { state->vip1.ptr.p_double[i] = state->mtv.ptr.p_double[i]-state->betaip1*state->vi.ptr.p_double[i]; state->alphaip1 = state->alphaip1+state->vip1.ptr.p_double[i]*state->vip1.ptr.p_double[i]; } if( ae_fp_neq(state->alphaip1,(double)(0)) ) { state->alphaip1 = ae_sqrt(state->alphaip1, _state); for(i=0; i<=state->n-1; i++) { state->vip1.ptr.p_double[i] = state->vip1.ptr.p_double[i]/state->alphaip1; } } /* * Build next orthogonal transformation */ state->rhoi = safepythag2(state->rhobari, state->betaip1, _state); state->ci = state->rhobari/state->rhoi; state->si = state->betaip1/state->rhoi; state->theta = state->si*state->alphaip1; state->rhobarip1 = -state->ci*state->alphaip1; state->phii = state->ci*state->phibari; state->phibarip1 = state->si*state->phibari; /* * Update .RNorm * * This tricky formula is necessary because simply writing * State.R2:=State.PhiBarIP1*State.PhiBarIP1 does NOT guarantees * monotonic decrease of R2. Roundoff error combined with 80-bit * precision used internally by Intel chips allows R2 to increase * slightly in some rare, but possible cases. This property is * undesirable, so we prefer to guard against R increase. */ state->r2 = ae_minreal(state->r2, state->phibarip1*state->phibarip1, _state); /* * Update d and DNorm, check condition-related stopping criteria */ for(i=0; i<=state->n-1; i++) { state->d.ptr.p_double[i] = 1/state->rhoi*(state->vi.ptr.p_double[i]-state->theta*state->d.ptr.p_double[i]); state->dnorm = state->dnorm+state->d.ptr.p_double[i]*state->d.ptr.p_double[i]; } if( ae_fp_greater_eq(ae_sqrt(state->dnorm, _state)*state->anorm,state->epsc) ) { state->running = ae_false; state->repterminationtype = 7; result = ae_false; return result; } /* * Update x, output report */ for(i=0; i<=state->n-1; i++) { state->rx.ptr.p_double[i] = state->rx.ptr.p_double[i]+state->phii/state->rhoi*state->omegai.ptr.p_double[i]; } if( !state->xrep ) { goto lbl_17; } ae_v_move(&state->x.ptr.p_double[0], 1, &state->rx.ptr.p_double[0], 1, ae_v_len(0,state->n-1)); linlsqr_clearrfields(state, _state); state->xupdated = ae_true; state->rstate.stage = 6; goto lbl_rcomm; lbl_6: state->xupdated = ae_false; lbl_17: /* * Check stopping criteria * 1. achieved required number of iterations; * 2. 
||Rk||<=EpsB*||B||; * 3. ||A^T*Rk||/(||A||*||Rk||)<=EpsA; */ if( state->maxits>0&&state->repiterationscount>=state->maxits ) { /* * Achieved required number of iterations */ state->running = ae_false; state->repterminationtype = 5; result = ae_false; return result; } if( ae_fp_less_eq(state->phibarip1,state->epsb*bnorm) ) { /* * ||Rk||<=EpsB*||B||, here ||Rk||=PhiBar */ state->running = ae_false; state->repterminationtype = 1; result = ae_false; return result; } if( ae_fp_less_eq(state->alphaip1*ae_fabs(state->ci, _state)/state->anorm,state->epsa) ) { /* * ||A^T*Rk||/(||A||*||Rk||)<=EpsA, here ||A^T*Rk||=PhiBar*Alpha[i+1]*|.C| */ state->running = ae_false; state->repterminationtype = 4; result = ae_false; return result; } /* * Update omega */ for(i=0; i<=state->n-1; i++) { state->omegaip1.ptr.p_double[i] = state->vip1.ptr.p_double[i]-state->theta/state->rhoi*state->omegai.ptr.p_double[i]; } /* * Prepare for the next iteration - rename variables: * u[i] := u[i+1] * v[i] := v[i+1] * rho[i] := rho[i+1] * ... */ ae_v_move(&state->ui.ptr.p_double[0], 1, &state->uip1.ptr.p_double[0], 1, ae_v_len(0,summn-1)); ae_v_move(&state->vi.ptr.p_double[0], 1, &state->vip1.ptr.p_double[0], 1, ae_v_len(0,state->n-1)); ae_v_move(&state->omegai.ptr.p_double[0], 1, &state->omegaip1.ptr.p_double[0], 1, ae_v_len(0,state->n-1)); state->alphai = state->alphaip1; state->betai = state->betaip1; state->phibari = state->phibarip1; state->rhobari = state->rhobarip1; goto lbl_15; lbl_16: result = ae_false; return result; /* * Saving state */ lbl_rcomm: result = ae_true; state->rstate.ia.ptr.p_int[0] = summn; state->rstate.ia.ptr.p_int[1] = i; state->rstate.ra.ptr.p_double[0] = bnorm; return result; } /************************************************************************* Procedure for solution of A*x=b with sparse A. INPUT PARAMETERS: State - algorithm state A - sparse M*N matrix in the CRS format (you MUST contvert it to CRS format by calling SparseConvertToCRS() function BEFORE you pass it to this function). B - right part, array[M] RESULT: This function returns no result. You can get solution by calling LinCGResults() NOTE: this function uses lightweight preconditioning - multiplication by inverse of diag(A). If you want, you can turn preconditioning off by calling LinLSQRSetPrecUnit(). However, preconditioning cost is low and preconditioner is very important for solution of badly scaled problems. 
-- ALGLIB -- Copyright 30.11.2011 by Bochkanov Sergey *************************************************************************/ void linlsqrsolvesparse(linlsqrstate* state, sparsematrix* a, /* Real */ ae_vector* b, ae_state *_state) { ae_int_t n; ae_int_t i; ae_int_t j; ae_int_t t0; ae_int_t t1; double v; n = state->n; ae_assert(!state->running, "LinLSQRSolveSparse: you can not call this function when LinLSQRIteration is running", _state); ae_assert(b->cnt>=state->m, "LinLSQRSolveSparse: Length(B)<M", _state); ae_assert(isfinitevector(b, state->m, _state), "LinLSQRSolveSparse: B contains infinite or NaN values", _state); /* * Allocate temporaries */ rvectorsetlengthatleast(&state->tmpd, n, _state); rvectorsetlengthatleast(&state->tmpx, n, _state); /* * Compute diagonal scaling matrix D */ if( state->prectype==0 ) { /* * Default preconditioner - inverse of column norms */ for(i=0; i<=n-1; i++) { state->tmpd.ptr.p_double[i] = (double)(0); } t0 = 0; t1 = 0; while(sparseenumerate(a, &t0, &t1, &i, &j, &v, _state)) { state->tmpd.ptr.p_double[j] = state->tmpd.ptr.p_double[j]+ae_sqr(v, _state); } for(i=0; i<=n-1; i++) { if( ae_fp_greater(state->tmpd.ptr.p_double[i],(double)(0)) ) { state->tmpd.ptr.p_double[i] = 1/ae_sqrt(state->tmpd.ptr.p_double[i], _state); } else { state->tmpd.ptr.p_double[i] = (double)(1); } } } else { /* * No diagonal scaling */ for(i=0; i<=n-1; i++) { state->tmpd.ptr.p_double[i] = (double)(1); } } /* * Solve. * * Instead of solving A*x=b we solve preconditioned system (A*D)*(inv(D)*x)=b. * Transformed A is not calculated explicitly, we just modify multiplication * by A or A'. After solution we modify State.RX so it will store untransformed * variables */ linlsqrsetb(state, b, _state); linlsqrrestart(state, _state); while(linlsqriteration(state, _state)) { if( state->needmv ) { for(i=0; i<=n-1; i++) { state->tmpx.ptr.p_double[i] = state->tmpd.ptr.p_double[i]*state->x.ptr.p_double[i]; } sparsemv(a, &state->tmpx, &state->mv, _state); } if( state->needmtv ) { sparsemtv(a, &state->x, &state->mtv, _state); for(i=0; i<=n-1; i++) { state->mtv.ptr.p_double[i] = state->tmpd.ptr.p_double[i]*state->mtv.ptr.p_double[i]; } } } for(i=0; i<=n-1; i++) { state->rx.ptr.p_double[i] = state->tmpd.ptr.p_double[i]*state->rx.ptr.p_double[i]; } } /************************************************************************* This function sets stopping criteria. INPUT PARAMETERS: EpsA - algorithm will be stopped if ||A^T*Rk||/(||A||*||Rk||)<=EpsA. EpsB - algorithm will be stopped if ||Rk||<=EpsB*||B|| MaxIts - algorithm will be stopped if number of iterations is more than MaxIts. OUTPUT PARAMETERS: State - structure which stores algorithm state NOTE: if EpsA,EpsB,EpsC and MaxIts are zero then these variables will be set to default values. 
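For example, to request a tighter relative tolerance while leaving the iteration count unlimited (a sketch in C++ wrapper style; the tolerance values are illustrative only, s is a linlsqrstate created as above):

    linlsqrsetcond(s, 1.0E-6, 1.0E-6, 0);    // EpsA=EpsB=1.0E-6, MaxIts=0 (no iteration limit)
    linlsqrsolvesparse(s, a, b);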
-- ALGLIB -- Copyright 30.11.2011 by Bochkanov Sergey *************************************************************************/ void linlsqrsetcond(linlsqrstate* state, double epsa, double epsb, ae_int_t maxits, ae_state *_state) { ae_assert(!state->running, "LinLSQRSetCond: you can not call this function when LinLSQRIteration is running", _state); ae_assert(ae_isfinite(epsa, _state)&&ae_fp_greater_eq(epsa,(double)(0)), "LinLSQRSetCond: EpsA is negative, INF or NAN", _state); ae_assert(ae_isfinite(epsb, _state)&&ae_fp_greater_eq(epsb,(double)(0)), "LinLSQRSetCond: EpsB is negative, INF or NAN", _state); ae_assert(maxits>=0, "LinLSQRSetCond: MaxIts is negative", _state); if( (ae_fp_eq(epsa,(double)(0))&&ae_fp_eq(epsb,(double)(0)))&&maxits==0 ) { state->epsa = linlsqr_atol; state->epsb = linlsqr_btol; state->maxits = state->n; } else { state->epsa = epsa; state->epsb = epsb; state->maxits = maxits; } } /************************************************************************* LSQR solver: results. This function must be called after LinLSQRSolveSparse() INPUT PARAMETERS: State - algorithm state OUTPUT PARAMETERS: X - array[N], solution Rep - optimization report: * Rep.TerminationType completion code: * 1 ||Rk||<=EpsB*||B|| * 4 ||A^T*Rk||/(||A||*||Rk||)<=EpsA * 5 MaxIts steps were taken * 7 rounding errors prevent further progress, X contains best point found so far. (sometimes returned on singular systems) * Rep.IterationsCount contains iterations count * Rep.NMV contains number of matrix-vector calculations -- ALGLIB -- Copyright 30.11.2011 by Bochkanov Sergey *************************************************************************/ void linlsqrresults(linlsqrstate* state, /* Real */ ae_vector* x, linlsqrreport* rep, ae_state *_state) { ae_vector_clear(x); _linlsqrreport_clear(rep); ae_assert(!state->running, "LinLSQRResult: you can not call this function when LinLSQRIteration is running", _state); if( x->cnt<state->n ) { ae_vector_set_length(x, state->n, _state); } ae_v_move(&x->ptr.p_double[0], 1, &state->rx.ptr.p_double[0], 1, ae_v_len(0,state->n-1)); rep->iterationscount = state->repiterationscount; rep->nmv = state->repnmv; rep->terminationtype = state->repterminationtype; } /************************************************************************* This function turns on/off reporting. INPUT PARAMETERS: State - structure which stores algorithm state NeedXRep- whether iteration reports are needed or not If NeedXRep is True, algorithm will call rep() callback function if it is provided to MinCGOptimize(). 
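When the reverse-communication core is driven directly (as LinLSQRSolveSparse() above does internally), the reports enabled here surface through the XUpdated flag. A sketch, not a complete program, using the field names from the code above:

    linlsqrsetxrep(&state, ae_true, _state);
    while( linlsqriteration(&state, _state) )
    {
        if( state.needmv )   { /* fill state.mv  with A *state.x */ }
        if( state.needmtv )  { /* fill state.mtv with A'*state.x */ }
        if( state.xupdated ) { /* state.x holds the current iterate - monitor/report it here */ }
    }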
-- ALGLIB -- Copyright 30.11.2011 by Bochkanov Sergey *************************************************************************/ void linlsqrsetxrep(linlsqrstate* state, ae_bool needxrep, ae_state *_state) { state->xrep = needxrep; } /************************************************************************* This function restarts LinLSQRIteration -- ALGLIB -- Copyright 30.11.2011 by Bochkanov Sergey *************************************************************************/ void linlsqrrestart(linlsqrstate* state, ae_state *_state) { ae_vector_set_length(&state->rstate.ia, 1+1, _state); ae_vector_set_length(&state->rstate.ra, 0+1, _state); state->rstate.stage = -1; linlsqr_clearrfields(state, _state); } /************************************************************************* Clears request fileds (to be sure that we don't forgot to clear something) *************************************************************************/ static void linlsqr_clearrfields(linlsqrstate* state, ae_state *_state) { state->xupdated = ae_false; state->needmv = ae_false; state->needmtv = ae_false; state->needmv2 = ae_false; state->needvmv = ae_false; state->needprec = ae_false; } void _linlsqrstate_init(void* _p, ae_state *_state) { linlsqrstate *p = (linlsqrstate*)_p; ae_touch_ptr((void*)p); _normestimatorstate_init(&p->nes, _state); ae_vector_init(&p->rx, 0, DT_REAL, _state); ae_vector_init(&p->b, 0, DT_REAL, _state); ae_vector_init(&p->ui, 0, DT_REAL, _state); ae_vector_init(&p->uip1, 0, DT_REAL, _state); ae_vector_init(&p->vi, 0, DT_REAL, _state); ae_vector_init(&p->vip1, 0, DT_REAL, _state); ae_vector_init(&p->omegai, 0, DT_REAL, _state); ae_vector_init(&p->omegaip1, 0, DT_REAL, _state); ae_vector_init(&p->d, 0, DT_REAL, _state); ae_vector_init(&p->x, 0, DT_REAL, _state); ae_vector_init(&p->mv, 0, DT_REAL, _state); ae_vector_init(&p->mtv, 0, DT_REAL, _state); ae_vector_init(&p->tmpd, 0, DT_REAL, _state); ae_vector_init(&p->tmpx, 0, DT_REAL, _state); _rcommstate_init(&p->rstate, _state); } void _linlsqrstate_init_copy(void* _dst, void* _src, ae_state *_state) { linlsqrstate *dst = (linlsqrstate*)_dst; linlsqrstate *src = (linlsqrstate*)_src; _normestimatorstate_init_copy(&dst->nes, &src->nes, _state); ae_vector_init_copy(&dst->rx, &src->rx, _state); ae_vector_init_copy(&dst->b, &src->b, _state); dst->n = src->n; dst->m = src->m; dst->prectype = src->prectype; ae_vector_init_copy(&dst->ui, &src->ui, _state); ae_vector_init_copy(&dst->uip1, &src->uip1, _state); ae_vector_init_copy(&dst->vi, &src->vi, _state); ae_vector_init_copy(&dst->vip1, &src->vip1, _state); ae_vector_init_copy(&dst->omegai, &src->omegai, _state); ae_vector_init_copy(&dst->omegaip1, &src->omegaip1, _state); dst->alphai = src->alphai; dst->alphaip1 = src->alphaip1; dst->betai = src->betai; dst->betaip1 = src->betaip1; dst->phibari = src->phibari; dst->phibarip1 = src->phibarip1; dst->phii = src->phii; dst->rhobari = src->rhobari; dst->rhobarip1 = src->rhobarip1; dst->rhoi = src->rhoi; dst->ci = src->ci; dst->si = src->si; dst->theta = src->theta; dst->lambdai = src->lambdai; ae_vector_init_copy(&dst->d, &src->d, _state); dst->anorm = src->anorm; dst->bnorm2 = src->bnorm2; dst->dnorm = src->dnorm; dst->r2 = src->r2; ae_vector_init_copy(&dst->x, &src->x, _state); ae_vector_init_copy(&dst->mv, &src->mv, _state); ae_vector_init_copy(&dst->mtv, &src->mtv, _state); dst->epsa = src->epsa; dst->epsb = src->epsb; dst->epsc = src->epsc; dst->maxits = src->maxits; dst->xrep = src->xrep; dst->xupdated = src->xupdated; dst->needmv = src->needmv; 
dst->needmtv = src->needmtv; dst->needmv2 = src->needmv2; dst->needvmv = src->needvmv; dst->needprec = src->needprec; dst->repiterationscount = src->repiterationscount; dst->repnmv = src->repnmv; dst->repterminationtype = src->repterminationtype; dst->running = src->running; ae_vector_init_copy(&dst->tmpd, &src->tmpd, _state); ae_vector_init_copy(&dst->tmpx, &src->tmpx, _state); _rcommstate_init_copy(&dst->rstate, &src->rstate, _state); } void _linlsqrstate_clear(void* _p) { linlsqrstate *p = (linlsqrstate*)_p; ae_touch_ptr((void*)p); _normestimatorstate_clear(&p->nes); ae_vector_clear(&p->rx); ae_vector_clear(&p->b); ae_vector_clear(&p->ui); ae_vector_clear(&p->uip1); ae_vector_clear(&p->vi); ae_vector_clear(&p->vip1); ae_vector_clear(&p->omegai); ae_vector_clear(&p->omegaip1); ae_vector_clear(&p->d); ae_vector_clear(&p->x); ae_vector_clear(&p->mv); ae_vector_clear(&p->mtv); ae_vector_clear(&p->tmpd); ae_vector_clear(&p->tmpx); _rcommstate_clear(&p->rstate); } void _linlsqrstate_destroy(void* _p) { linlsqrstate *p = (linlsqrstate*)_p; ae_touch_ptr((void*)p); _normestimatorstate_destroy(&p->nes); ae_vector_destroy(&p->rx); ae_vector_destroy(&p->b); ae_vector_destroy(&p->ui); ae_vector_destroy(&p->uip1); ae_vector_destroy(&p->vi); ae_vector_destroy(&p->vip1); ae_vector_destroy(&p->omegai); ae_vector_destroy(&p->omegaip1); ae_vector_destroy(&p->d); ae_vector_destroy(&p->x); ae_vector_destroy(&p->mv); ae_vector_destroy(&p->mtv); ae_vector_destroy(&p->tmpd); ae_vector_destroy(&p->tmpx); _rcommstate_destroy(&p->rstate); } void _linlsqrreport_init(void* _p, ae_state *_state) { linlsqrreport *p = (linlsqrreport*)_p; ae_touch_ptr((void*)p); } void _linlsqrreport_init_copy(void* _dst, void* _src, ae_state *_state) { linlsqrreport *dst = (linlsqrreport*)_dst; linlsqrreport *src = (linlsqrreport*)_src; dst->iterationscount = src->iterationscount; dst->nmv = src->nmv; dst->terminationtype = src->terminationtype; } void _linlsqrreport_clear(void* _p) { linlsqrreport *p = (linlsqrreport*)_p; ae_touch_ptr((void*)p); } void _linlsqrreport_destroy(void* _p) { linlsqrreport *p = (linlsqrreport*)_p; ae_touch_ptr((void*)p); } /************************************************************************* Polynomial root finding. This function returns all roots of the polynomial P(x) = a0 + a1*x + a2*x^2 + ... + an*x^n Both real and complex roots are returned (see below). INPUT PARAMETERS: A - array[N+1], polynomial coefficients: * A[0] is constant term * A[N] is a coefficient of X^N N - polynomial degree OUTPUT PARAMETERS: X - array of complex roots: * for isolated real root, X[I] is strictly real: IMAGE(X[I])=0 * complex roots are always returned in pairs - roots occupy positions I and I+1, with: * X[I+1]=Conj(X[I]) * IMAGE(X[I]) > 0 * IMAGE(X[I+1]) = -IMAGE(X[I]) < 0 * multiple real roots may have non-zero imaginary part due to roundoff errors. There is no reliable way to distinguish real root of multiplicity 2 from two complex roots in the presence of roundoff errors. Rep - report, additional information, following fields are set: * Rep.MaxErr - max( |P(xi)| ) for i=0..N-1. This field allows to quickly estimate "quality" of the roots being returned. NOTE: this function uses companion matrix method to find roots. In case internal EVD solver fails do find eigenvalues, exception is generated. NOTE: roots are not "polished" and no matrix balancing is performed for them. 
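For example, the three roots of P(x) = x^3 - 1 can be obtained as follows (a sketch assuming the C++ wrapper declaration of polynomialsolve and using namespace alglib):

    real_1d_array a = "[-1, 0, 0, 1]";        // A[0]=-1 is the constant term, A[3]=1 multiplies x^3
    complex_1d_array x;
    polynomialsolverreport rep;
    polynomialsolve(a, 3, x, rep);
    // x holds the real root 1 and the conjugate pair -0.5 +/- 0.866i;
    // rep.maxerr estimates max |P(x[i])| over the returned roots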
-- ALGLIB -- Copyright 24.02.2014 by Bochkanov Sergey *************************************************************************/ void polynomialsolve(/* Real */ ae_vector* a, ae_int_t n, /* Complex */ ae_vector* x, polynomialsolverreport* rep, ae_state *_state) { ae_frame _frame_block; ae_vector _a; ae_matrix c; ae_matrix vl; ae_matrix vr; ae_vector wr; ae_vector wi; ae_int_t i; ae_int_t j; ae_bool status; ae_int_t nz; ae_int_t ne; ae_complex v; ae_complex vv; ae_frame_make(_state, &_frame_block); ae_vector_init_copy(&_a, a, _state); a = &_a; ae_vector_clear(x); _polynomialsolverreport_clear(rep); ae_matrix_init(&c, 0, 0, DT_REAL, _state); ae_matrix_init(&vl, 0, 0, DT_REAL, _state); ae_matrix_init(&vr, 0, 0, DT_REAL, _state); ae_vector_init(&wr, 0, DT_REAL, _state); ae_vector_init(&wi, 0, DT_REAL, _state); ae_assert(n>0, "PolynomialSolve: N<=0", _state); ae_assert(a->cnt>=n+1, "PolynomialSolve: Length(A)<N+1", _state); ae_assert(ae_fp_neq(a->ptr.p_double[n],(double)(0)), "PolynomialSolve: A[N]=0", _state); /* * Prepare */ ae_vector_set_length(x, n, _state); /* * Normalize A: * * analytically determine NZ zero roots * * quick exit for NZ=N * * make residual NE-th degree polynomial monic * (here NE=N-NZ) */ nz = 0; while(nz<n&&ae_fp_eq(a->ptr.p_double[nz],(double)(0))) { nz = nz+1; } ne = n-nz; for(i=nz; i<=n; i++) { a->ptr.p_double[i-nz] = a->ptr.p_double[i]/a->ptr.p_double[n]; } /* * For NZ<N, find NE=N-NZ non-zero roots as eigenvalues of the NE*NE companion matrix */ if( ne>0 ) { ae_matrix_set_length(&c, ne, ne, _state); for(i=0; i<=ne-1; i++) { for(j=0; j<=ne-1; j++) { c.ptr.pp_double[i][j] = (double)(0); } } c.ptr.pp_double[0][ne-1] = -a->ptr.p_double[0]; for(i=1; i<=ne-1; i++) { c.ptr.pp_double[i][i-1] = (double)(1); c.ptr.pp_double[i][ne-1] = -a->ptr.p_double[i]; } status = rmatrixevd(&c, ne, 0, &wr, &wi, &vl, &vr, _state); ae_assert(status, "PolynomialSolve: internal error - EVD solver failed", _state); for(i=0; i<=ne-1; i++) { x->ptr.p_complex[i].x = wr.ptr.p_double[i]; x->ptr.p_complex[i].y = wi.ptr.p_double[i]; } } /* * Remaining NZ zero roots */ for(i=ne; i<=n-1; i++) { x->ptr.p_complex[i] = ae_complex_from_i(0); } /* * Rep */ rep->maxerr = (double)(0); for(i=0; i<=ne-1; i++) { v = ae_complex_from_i(0); vv = ae_complex_from_i(1); for(j=0; j<=ne; j++) { v = ae_c_add(v,ae_c_mul_d(vv,a->ptr.p_double[j])); vv = ae_c_mul(vv,x->ptr.p_complex[i]); } rep->maxerr = ae_maxreal(rep->maxerr, ae_c_abs(v, _state), _state); } ae_frame_leave(_state); } void _polynomialsolverreport_init(void* _p, ae_state *_state) { polynomialsolverreport *p = (polynomialsolverreport*)_p; ae_touch_ptr((void*)p); } void _polynomialsolverreport_init_copy(void* _dst, void* _src, ae_state *_state) { polynomialsolverreport *dst = (polynomialsolverreport*)_dst; polynomialsolverreport *src = (polynomialsolverreport*)_src; dst->maxerr = src->maxerr; } void _polynomialsolverreport_clear(void* _p) { polynomialsolverreport *p = (polynomialsolverreport*)_p; ae_touch_ptr((void*)p); } void _polynomialsolverreport_destroy(void* _p) { polynomialsolverreport *p = (polynomialsolverreport*)_p; ae_touch_ptr((void*)p); } /************************************************************************* LEVENBERG-MARQUARDT-LIKE NONLINEAR SOLVER DESCRIPTION: This algorithm solves system of nonlinear equations F[0](x[0], ..., x[n-1]) = 0 F[1](x[0], ..., x[n-1]) = 0 ... F[M-1](x[0], ..., x[n-1]) = 0 where M and N do not necessarily coincide. 
Algorithm converges quadratically under following conditions: * the solution set XS is nonempty * for some xs in XS there exists a neighbourhood N(xs) such that: * vector function F(x) and its Jacobian J(x) are continuously differentiable on N * ||F(x)|| provides local error bound on N, i.e. there exists such c1, that ||F(x)||>c1*distance(x,XS) Note that these conditions are much weaker than usual non-singularity conditions. For example, algorithm will converge for any affine function F (whether its Jacobian is singular or not). REQUIREMENTS: Algorithm will request the following information during its operation: * function vector F[] and Jacobian matrix at given point X * value of merit function f(x)=F[0]^2(x)+...+F[M-1]^2(x) at given point X USAGE: 1. User initializes algorithm state with NLEQCreateLM() call 2. User tunes solver parameters with NLEQSetCond(), NLEQSetStpMax() and other functions 3. User calls NLEQSolve() function which takes algorithm state and pointers (delegates, etc.) to callback functions which calculate merit function value and Jacobian. 4. User calls NLEQResults() to get solution 5. Optionally, user may call NLEQRestartFrom() to solve another problem with same parameters (N/M) but another starting point and/or another function vector. NLEQRestartFrom() allows reuse of an already initialized structure. INPUT PARAMETERS: N - space dimension, N>1: * if provided, only leading N elements of X are used * if not provided, determined automatically from size of X M - system size X - starting point OUTPUT PARAMETERS: State - structure which stores algorithm state NOTES: 1. you may tune stopping conditions with NLEQSetCond() function 2. if target function contains exp() or other fast growing functions, and optimization algorithm makes too large steps which lead to overflow, use NLEQSetStpMax() function to bound algorithm's steps. 3. this algorithm is a slightly modified implementation of the method described in 'Levenberg-Marquardt method for constrained nonlinear equations with strong local convergence properties' by Christian Kanzow, Nobuo Yamashita and Masao Fukushima and further developed in 'On the convergence of a New Levenberg-Marquardt Method' by Jin-yan Fan and Ya-Xiang Yuan. -- ALGLIB -- Copyright 20.08.2009 by Bochkanov Sergey *************************************************************************/ void nleqcreatelm(ae_int_t n, ae_int_t m, /* Real */ ae_vector* x, nleqstate* state, ae_state *_state) { _nleqstate_clear(state); ae_assert(n>=1, "NLEQCreateLM: N<1!", _state); ae_assert(m>=1, "NLEQCreateLM: M<1!", _state); ae_assert(x->cnt>=n, "NLEQCreateLM: Length(X)<N!", _state); ae_assert(isfinitevector(x, n, _state), "NLEQCreateLM: X contains infinite or NaN values!", _state); state->n = n; state->m = m; nleqsetcond(state, (double)(0), 0, _state); nleqsetxrep(state, ae_false, _state); nleqsetstpmax(state, (double)(0), _state); ae_vector_set_length(&state->x, n, _state); ae_vector_set_length(&state->xbase, n, _state); ae_matrix_set_length(&state->j, m, n, _state); ae_vector_set_length(&state->fi, m, _state); ae_vector_set_length(&state->rightpart, n, _state); ae_vector_set_length(&state->candstep, n, _state); nleqrestartfrom(state, x, _state); } /************************************************************************* This function sets stopping conditions for the nonlinear solver INPUT PARAMETERS: State - structure which stores algorithm state EpsF - >=0 The subroutine finishes its work if on k+1-th iteration the condition ||F||<=EpsF is satisfied MaxIts - maximum number of iterations. If MaxIts=0, the number of iterations is unlimited. 
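To make the EpsF/MaxIts settings concrete, here is a minimal end-to-end sketch of the USAGE steps listed above (it assumes the C++ wrapper interface of this subpackage; the callback names nleq_func/nleq_jac are hypothetical, solving F0=x0^2+x1^2-4=0, F1=x0-x1=0):

    void nleq_func(const real_1d_array &x, double &func, void *ptr)
    {   // merit function f(x) = F0^2 + F1^2
        double f0 = x[0]*x[0]+x[1]*x[1]-4.0, f1 = x[0]-x[1];
        func = f0*f0+f1*f1;
    }
    void nleq_jac(const real_1d_array &x, real_1d_array &fi, real_2d_array &jac, void *ptr)
    {   // function vector F and Jacobian J
        fi[0] = x[0]*x[0]+x[1]*x[1]-4.0; fi[1] = x[0]-x[1];
        jac[0][0] = 2.0*x[0]; jac[0][1] = 2.0*x[1];
        jac[1][0] = 1.0;      jac[1][1] = -1.0;
    }
    // ... in the calling code:
    real_1d_array x0 = "[10, 5]", xs;
    nleqstate s;
    nleqreport rep;
    nleqcreatelm(2, 2, x0, s);                // N=2 variables, M=2 equations
    nleqsetcond(s, 1.0E-10, 0);               // EpsF=1.0E-10, MaxIts=0 (unlimited)
    nleqsolve(s, nleq_func, nleq_jac, NULL, NULL);
    nleqresults(s, xs, rep);                  // expect xs close to [1.414, 1.414], rep.terminationtype=1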
Passing EpsF=0 and MaxIts=0 simultaneously will lead to automatic stopping criterion selection (small EpsF). NOTES: -- ALGLIB -- Copyright 20.08.2010 by Bochkanov Sergey *************************************************************************/ void nleqsetcond(nleqstate* state, double epsf, ae_int_t maxits, ae_state *_state) { ae_assert(ae_isfinite(epsf, _state), "NLEQSetCond: EpsF is not finite number!", _state); ae_assert(ae_fp_greater_eq(epsf,(double)(0)), "NLEQSetCond: negative EpsF!", _state); ae_assert(maxits>=0, "NLEQSetCond: negative MaxIts!", _state); if( ae_fp_eq(epsf,(double)(0))&&maxits==0 ) { epsf = 1.0E-6; } state->epsf = epsf; state->maxits = maxits; } /************************************************************************* This function turns on/off reporting. INPUT PARAMETERS: State - structure which stores algorithm state NeedXRep- whether iteration reports are needed or not If NeedXRep is True, algorithm will call rep() callback function if it is provided to NLEQSolve(). -- ALGLIB -- Copyright 20.08.2010 by Bochkanov Sergey *************************************************************************/ void nleqsetxrep(nleqstate* state, ae_bool needxrep, ae_state *_state) { state->xrep = needxrep; } /************************************************************************* This function sets maximum step length INPUT PARAMETERS: State - structure which stores algorithm state StpMax - maximum step length, >=0. Set StpMax to 0.0, if you don't want to limit step length. Use this subroutine when target function contains exp() or other fast growing functions, and algorithm makes too large steps which lead to overflow. This function allows us to reject steps that are too large (and therefore expose us to the possible overflow) without actually calculating function value at the x+stp*d. -- ALGLIB -- Copyright 20.08.2010 by Bochkanov Sergey *************************************************************************/ void nleqsetstpmax(nleqstate* state, double stpmax, ae_state *_state) { ae_assert(ae_isfinite(stpmax, _state), "NLEQSetStpMax: StpMax is not finite!", _state); ae_assert(ae_fp_greater_eq(stpmax,(double)(0)), "NLEQSetStpMax: StpMax<0!", _state); state->stpmax = stpmax; } /************************************************************************* -- ALGLIB -- Copyright 20.03.2009 by Bochkanov Sergey *************************************************************************/ ae_bool nleqiteration(nleqstate* state, ae_state *_state) { ae_int_t n; ae_int_t m; ae_int_t i; double lambdaup; double lambdadown; double lambdav; double rho; double mu; double stepnorm; ae_bool b; ae_bool result; /* * Reverse communication preparations * I know it looks ugly, but it works the same way * anywhere from C++ to Python. 
* * This code initializes locals by: * * random values determined during code * generation - on first subroutine call * * values from previous call - on subsequent calls */ if( state->rstate.stage>=0 ) { n = state->rstate.ia.ptr.p_int[0]; m = state->rstate.ia.ptr.p_int[1]; i = state->rstate.ia.ptr.p_int[2]; b = state->rstate.ba.ptr.p_bool[0]; lambdaup = state->rstate.ra.ptr.p_double[0]; lambdadown = state->rstate.ra.ptr.p_double[1]; lambdav = state->rstate.ra.ptr.p_double[2]; rho = state->rstate.ra.ptr.p_double[3]; mu = state->rstate.ra.ptr.p_double[4]; stepnorm = state->rstate.ra.ptr.p_double[5]; } else { n = 359; m = -58; i = -919; b = ae_true; lambdaup = 81; lambdadown = 255; lambdav = 74; rho = -788; mu = 809; stepnorm = 205; } if( state->rstate.stage==0 ) { goto lbl_0; } if( state->rstate.stage==1 ) { goto lbl_1; } if( state->rstate.stage==2 ) { goto lbl_2; } if( state->rstate.stage==3 ) { goto lbl_3; } if( state->rstate.stage==4 ) { goto lbl_4; } /* * Routine body */ /* * Prepare */ n = state->n; m = state->m; state->repterminationtype = 0; state->repiterationscount = 0; state->repnfunc = 0; state->repnjac = 0; /* * Calculate F/G, initialize algorithm */ nleq_clearrequestfields(state, _state); state->needf = ae_true; state->rstate.stage = 0; goto lbl_rcomm; lbl_0: state->needf = ae_false; state->repnfunc = state->repnfunc+1; ae_v_move(&state->xbase.ptr.p_double[0], 1, &state->x.ptr.p_double[0], 1, ae_v_len(0,n-1)); state->fbase = state->f; state->fprev = ae_maxrealnumber; if( !state->xrep ) { goto lbl_5; } /* * progress report */ nleq_clearrequestfields(state, _state); state->xupdated = ae_true; state->rstate.stage = 1; goto lbl_rcomm; lbl_1: state->xupdated = ae_false; lbl_5: if( ae_fp_less_eq(state->f,ae_sqr(state->epsf, _state)) ) { state->repterminationtype = 1; result = ae_false; return result; } /* * Main cycle */ lambdaup = (double)(10); lambdadown = 0.3; lambdav = 0.001; rho = (double)(1); lbl_7: if( ae_false ) { goto lbl_8; } /* * Get Jacobian; * before we get to this point we already have State.XBase filled * with current point and State.FBase filled with function value * at XBase */ nleq_clearrequestfields(state, _state); state->needfij = ae_true; ae_v_move(&state->x.ptr.p_double[0], 1, &state->xbase.ptr.p_double[0], 1, ae_v_len(0,n-1)); state->rstate.stage = 2; goto lbl_rcomm; lbl_2: state->needfij = ae_false; state->repnfunc = state->repnfunc+1; state->repnjac = state->repnjac+1; rmatrixmv(n, m, &state->j, 0, 0, 1, &state->fi, 0, &state->rightpart, 0, _state); ae_v_muld(&state->rightpart.ptr.p_double[0], 1, ae_v_len(0,n-1), -1); /* * Inner cycle: find good lambda */ lbl_9: if( ae_false ) { goto lbl_10; } /* * Solve (J^T*J + (Lambda+Mu)*I)*y = J^T*F * to get step d=-y where: * * Mu=||F|| - is damping parameter for nonlinear system * * Lambda - is additional Levenberg-Marquardt parameter * for better convergence when far away from minimum */ for(i=0; i<=n-1; i++) { state->candstep.ptr.p_double[i] = (double)(0); } fblssolvecgx(&state->j, m, n, lambdav, &state->rightpart, &state->candstep, &state->cgbuf, _state); /* * Normalize step (it must be no more than StpMax) */ stepnorm = (double)(0); for(i=0; i<=n-1; i++) { if( ae_fp_neq(state->candstep.ptr.p_double[i],(double)(0)) ) { stepnorm = (double)(1); break; } } linminnormalized(&state->candstep, &stepnorm, n, _state); if( ae_fp_neq(state->stpmax,(double)(0)) ) { stepnorm = ae_minreal(stepnorm, state->stpmax, _state); } /* * Test new step - is it good enough? * * if not, Lambda is increased and we try again. 
* * if step is good, we decrease Lambda and move on. * * We can break this cycle on two occasions: * * step is so small that x+step==x (in floating point arithmetics) * * lambda is so large */ ae_v_move(&state->x.ptr.p_double[0], 1, &state->xbase.ptr.p_double[0], 1, ae_v_len(0,n-1)); ae_v_addd(&state->x.ptr.p_double[0], 1, &state->candstep.ptr.p_double[0], 1, ae_v_len(0,n-1), stepnorm); b = ae_true; for(i=0; i<=n-1; i++) { if( ae_fp_neq(state->x.ptr.p_double[i],state->xbase.ptr.p_double[i]) ) { b = ae_false; break; } } if( b ) { /* * Step is too small, force zero step and break */ stepnorm = (double)(0); ae_v_move(&state->x.ptr.p_double[0], 1, &state->xbase.ptr.p_double[0], 1, ae_v_len(0,n-1)); state->f = state->fbase; goto lbl_10; } nleq_clearrequestfields(state, _state); state->needf = ae_true; state->rstate.stage = 3; goto lbl_rcomm; lbl_3: state->needf = ae_false; state->repnfunc = state->repnfunc+1; if( ae_fp_less(state->f,state->fbase) ) { /* * function value decreased, move on */ nleq_decreaselambda(&lambdav, &rho, lambdadown, _state); goto lbl_10; } if( !nleq_increaselambda(&lambdav, &rho, lambdaup, _state) ) { /* * Lambda is too large (near overflow), force zero step and break */ stepnorm = (double)(0); ae_v_move(&state->x.ptr.p_double[0], 1, &state->xbase.ptr.p_double[0], 1, ae_v_len(0,n-1)); state->f = state->fbase; goto lbl_10; } goto lbl_9; lbl_10: /* * Accept step: * * new position * * new function value */ state->fbase = state->f; ae_v_addd(&state->xbase.ptr.p_double[0], 1, &state->candstep.ptr.p_double[0], 1, ae_v_len(0,n-1), stepnorm); state->repiterationscount = state->repiterationscount+1; /* * Report new iteration */ if( !state->xrep ) { goto lbl_11; } nleq_clearrequestfields(state, _state); state->xupdated = ae_true; state->f = state->fbase; ae_v_move(&state->x.ptr.p_double[0], 1, &state->xbase.ptr.p_double[0], 1, ae_v_len(0,n-1)); state->rstate.stage = 4; goto lbl_rcomm; lbl_4: state->xupdated = ae_false; lbl_11: /* * Test stopping conditions on F, step (zero/non-zero) and MaxIts; * If one of the conditions is met, RepTerminationType is changed. */ if( ae_fp_less_eq(ae_sqrt(state->f, _state),state->epsf) ) { state->repterminationtype = 1; } if( ae_fp_eq(stepnorm,(double)(0))&&state->repterminationtype==0 ) { state->repterminationtype = -4; } if( state->repiterationscount>=state->maxits&&state->maxits>0 ) { state->repterminationtype = 5; } if( state->repterminationtype!=0 ) { goto lbl_8; } /* * Now, iteration is finally over */ goto lbl_7; lbl_8: result = ae_false; return result; /* * Saving state */ lbl_rcomm: result = ae_true; state->rstate.ia.ptr.p_int[0] = n; state->rstate.ia.ptr.p_int[1] = m; state->rstate.ia.ptr.p_int[2] = i; state->rstate.ba.ptr.p_bool[0] = b; state->rstate.ra.ptr.p_double[0] = lambdaup; state->rstate.ra.ptr.p_double[1] = lambdadown; state->rstate.ra.ptr.p_double[2] = lambdav; state->rstate.ra.ptr.p_double[3] = rho; state->rstate.ra.ptr.p_double[4] = mu; state->rstate.ra.ptr.p_double[5] = stepnorm; return result; } /************************************************************************* NLEQ solver results INPUT PARAMETERS: State - algorithm state. OUTPUT PARAMETERS: X - array[0..N-1], solution Rep - optimization report: * Rep.TerminationType completetion code: * -4 ERROR: algorithm has converged to the stationary point Xf which is local minimum of f=F[0]^2+...+F[m-1]^2, but is not solution of nonlinear system. * 1 sqrt(f)<=EpsF. 
* 5 MaxIts steps were taken * 7 stopping conditions are too stringent, further improvement is impossible * Rep.IterationsCount contains iterations count * NFEV contains number of function calculations * ActiveConstraints contains number of active constraints -- ALGLIB -- Copyright 20.08.2009 by Bochkanov Sergey *************************************************************************/ void nleqresults(nleqstate* state, /* Real */ ae_vector* x, nleqreport* rep, ae_state *_state) { ae_vector_clear(x); _nleqreport_clear(rep); nleqresultsbuf(state, x, rep, _state); } /************************************************************************* NLEQ solver results Buffered implementation of NLEQResults(), which uses pre-allocated buffer to store X[]. If buffer size is too small, it resizes buffer. It is intended to be used in the inner cycles of performance critical algorithms where array reallocation penalty is too large to be ignored. -- ALGLIB -- Copyright 20.08.2009 by Bochkanov Sergey *************************************************************************/ void nleqresultsbuf(nleqstate* state, /* Real */ ae_vector* x, nleqreport* rep, ae_state *_state) { if( x->cnt<state->n ) { ae_vector_set_length(x, state->n, _state); } ae_v_move(&x->ptr.p_double[0], 1, &state->xbase.ptr.p_double[0], 1, ae_v_len(0,state->n-1)); rep->iterationscount = state->repiterationscount; rep->nfunc = state->repnfunc; rep->njac = state->repnjac; rep->terminationtype = state->repterminationtype; } /************************************************************************* This subroutine restarts NLEQ solver from a new point. All optimization parameters are left unchanged. This function allows one to solve multiple problems (which must have same number of dimensions) without object reallocation penalty. INPUT PARAMETERS: State - structure used for reverse communication previously allocated with NLEQCreateLM call. X - new starting point. 
-- ALGLIB -- Copyright 30.07.2010 by Bochkanov Sergey *************************************************************************/ void nleqrestartfrom(nleqstate* state, /* Real */ ae_vector* x, ae_state *_state) { ae_assert(x->cnt>=state->n, "NLEQRestartFrom: Length(X)<N!", _state); ae_assert(isfinitevector(x, state->n, _state), "NLEQRestartFrom: X contains infinite or NaN values!", _state); ae_v_move(&state->x.ptr.p_double[0], 1, &x->ptr.p_double[0], 1, ae_v_len(0,state->n-1)); ae_vector_set_length(&state->rstate.ia, 2+1, _state); ae_vector_set_length(&state->rstate.ba, 0+1, _state); ae_vector_set_length(&state->rstate.ra, 5+1, _state); state->rstate.stage = -1; nleq_clearrequestfields(state, _state); } /************************************************************************* Clears request fields (to be sure that we don't forget to clear something) *************************************************************************/ static void nleq_clearrequestfields(nleqstate* state, ae_state *_state) { state->needf = ae_false; state->needfij = ae_false; state->xupdated = ae_false; } /************************************************************************* Increases lambda, returns False when there is a danger of overflow *************************************************************************/ static ae_bool nleq_increaselambda(double* lambdav, double* nu, double lambdaup, ae_state *_state) { double lnlambda; double lnnu; double lnlambdaup; double lnmax; ae_bool result; result = ae_false; lnlambda = ae_log(*lambdav, _state); lnlambdaup = ae_log(lambdaup, _state); lnnu = ae_log(*nu, _state); lnmax = 0.5*ae_log(ae_maxrealnumber, _state); if( ae_fp_greater(lnlambda+lnlambdaup+lnnu,lnmax) ) { return result; } if( ae_fp_greater(lnnu+ae_log((double)(2), _state),lnmax) ) { return result; } *lambdav = *lambdav*lambdaup*(*nu); *nu = *nu*2; result = ae_true; return result; } /************************************************************************* Decreases lambda, but leaves it unchanged when there is danger of underflow. 
*************************************************************************/ static void nleq_decreaselambda(double* lambdav, double* nu, double lambdadown, ae_state *_state) { *nu = (double)(1); if( ae_fp_less(ae_log(*lambdav, _state)+ae_log(lambdadown, _state),ae_log(ae_minrealnumber, _state)) ) { *lambdav = ae_minrealnumber; } else { *lambdav = *lambdav*lambdadown; } } void _nleqstate_init(void* _p, ae_state *_state) { nleqstate *p = (nleqstate*)_p; ae_touch_ptr((void*)p); ae_vector_init(&p->x, 0, DT_REAL, _state); ae_vector_init(&p->fi, 0, DT_REAL, _state); ae_matrix_init(&p->j, 0, 0, DT_REAL, _state); _rcommstate_init(&p->rstate, _state); ae_vector_init(&p->xbase, 0, DT_REAL, _state); ae_vector_init(&p->candstep, 0, DT_REAL, _state); ae_vector_init(&p->rightpart, 0, DT_REAL, _state); ae_vector_init(&p->cgbuf, 0, DT_REAL, _state); } void _nleqstate_init_copy(void* _dst, void* _src, ae_state *_state) { nleqstate *dst = (nleqstate*)_dst; nleqstate *src = (nleqstate*)_src; dst->n = src->n; dst->m = src->m; dst->epsf = src->epsf; dst->maxits = src->maxits; dst->xrep = src->xrep; dst->stpmax = src->stpmax; ae_vector_init_copy(&dst->x, &src->x, _state); dst->f = src->f; ae_vector_init_copy(&dst->fi, &src->fi, _state); ae_matrix_init_copy(&dst->j, &src->j, _state); dst->needf = src->needf; dst->needfij = src->needfij; dst->xupdated = src->xupdated; _rcommstate_init_copy(&dst->rstate, &src->rstate, _state); dst->repiterationscount = src->repiterationscount; dst->repnfunc = src->repnfunc; dst->repnjac = src->repnjac; dst->repterminationtype = src->repterminationtype; ae_vector_init_copy(&dst->xbase, &src->xbase, _state); dst->fbase = src->fbase; dst->fprev = src->fprev; ae_vector_init_copy(&dst->candstep, &src->candstep, _state); ae_vector_init_copy(&dst->rightpart, &src->rightpart, _state); ae_vector_init_copy(&dst->cgbuf, &src->cgbuf, _state); } void _nleqstate_clear(void* _p) { nleqstate *p = (nleqstate*)_p; ae_touch_ptr((void*)p); ae_vector_clear(&p->x); ae_vector_clear(&p->fi); ae_matrix_clear(&p->j); _rcommstate_clear(&p->rstate); ae_vector_clear(&p->xbase); ae_vector_clear(&p->candstep); ae_vector_clear(&p->rightpart); ae_vector_clear(&p->cgbuf); } void _nleqstate_destroy(void* _p) { nleqstate *p = (nleqstate*)_p; ae_touch_ptr((void*)p); ae_vector_destroy(&p->x); ae_vector_destroy(&p->fi); ae_matrix_destroy(&p->j); _rcommstate_destroy(&p->rstate); ae_vector_destroy(&p->xbase); ae_vector_destroy(&p->candstep); ae_vector_destroy(&p->rightpart); ae_vector_destroy(&p->cgbuf); } void _nleqreport_init(void* _p, ae_state *_state) { nleqreport *p = (nleqreport*)_p; ae_touch_ptr((void*)p); } void _nleqreport_init_copy(void* _dst, void* _src, ae_state *_state) { nleqreport *dst = (nleqreport*)_dst; nleqreport *src = (nleqreport*)_src; dst->iterationscount = src->iterationscount; dst->nfunc = src->nfunc; dst->njac = src->njac; dst->terminationtype = src->terminationtype; } void _nleqreport_clear(void* _p) { nleqreport *p = (nleqreport*)_p; ae_touch_ptr((void*)p); } void _nleqreport_destroy(void* _p) { nleqreport *p = (nleqreport*)_p; ae_touch_ptr((void*)p); } /************************************************************************* This function initializes linear CG Solver. This solver is used to solve symmetric positive definite problems. If you want to solve nonsymmetric (or non-positive definite) problem you may use LinLSQR solver provided by ALGLIB. USAGE: 1. User initializes algorithm state with LinCGCreate() call 2. User tunes solver parameters with LinCGSetCond() and other functions 3. 
Optionally, user sets starting point with LinCGSetStartingPoint() 4. User calls LinCGSolveSparse() function which takes algorithm state and SparseMatrix object. 5. User calls LinCGResults() to get solution 6. Optionally, user may call LinCGSolveSparse() again to solve another problem with different matrix and/or right part without reinitializing LinCGState structure. INPUT PARAMETERS: N - problem dimension, N>0 OUTPUT PARAMETERS: State - structure which stores algorithm state -- ALGLIB -- Copyright 14.11.2011 by Bochkanov Sergey *************************************************************************/ void lincgcreate(ae_int_t n, lincgstate* state, ae_state *_state) { ae_int_t i; _lincgstate_clear(state); ae_assert(n>0, "LinCGCreate: N<=0", _state); state->n = n; state->prectype = 0; state->itsbeforerestart = n; state->itsbeforerupdate = 10; state->epsf = lincg_defaultprecision; state->maxits = 0; state->xrep = ae_false; state->running = ae_false; /* * * allocate arrays * * set RX to NAN (just for the case user calls Results() without * calling SolveSparse()) * * set starting point to zero * * we do NOT initialize B here because we assume that user should * initialize it using LinCGSetB() function. In case the user forgets * to do so, exception will be thrown in the LinCGIteration(). */ ae_vector_set_length(&state->rx, state->n, _state); ae_vector_set_length(&state->startx, state->n, _state); ae_vector_set_length(&state->b, state->n, _state); for(i=0; i<=state->n-1; i++) { state->rx.ptr.p_double[i] = _state->v_nan; state->startx.ptr.p_double[i] = 0.0; state->b.ptr.p_double[i] = (double)(0); } ae_vector_set_length(&state->cx, state->n, _state); ae_vector_set_length(&state->p, state->n, _state); ae_vector_set_length(&state->r, state->n, _state); ae_vector_set_length(&state->cr, state->n, _state); ae_vector_set_length(&state->z, state->n, _state); ae_vector_set_length(&state->cz, state->n, _state); ae_vector_set_length(&state->x, state->n, _state); ae_vector_set_length(&state->mv, state->n, _state); ae_vector_set_length(&state->pv, state->n, _state); lincg_updateitersdata(state, _state); ae_vector_set_length(&state->rstate.ia, 0+1, _state); ae_vector_set_length(&state->rstate.ra, 2+1, _state); state->rstate.stage = -1; } /************************************************************************* This function sets starting point. By default, zero starting point is used. INPUT PARAMETERS: X - starting point, array[N] OUTPUT PARAMETERS: State - structure which stores algorithm state -- ALGLIB -- Copyright 14.11.2011 by Bochkanov Sergey *************************************************************************/ void lincgsetstartingpoint(lincgstate* state, /* Real */ ae_vector* x, ae_state *_state) { ae_assert(!state->running, "LinCGSetStartingPoint: you can not change starting point because LinCGIteration() function is running", _state); ae_assert(state->n<=x->cnt, "LinCGSetStartingPoint: Length(X)<N", _state); ae_assert(isfinitevector(x, state->n, _state), "LinCGSetStartingPoint: X contains infinite or NaN values!", _state); ae_v_move(&state->startx.ptr.p_double[0], 1, &x->ptr.p_double[0], 1, ae_v_len(0,state->n-1)); } /************************************************************************* This function sets right part. By default, right part is zero. INPUT PARAMETERS: B - right part, array[N]. 
OUTPUT PARAMETERS: State - structure which stores algorithm state -- ALGLIB -- Copyright 14.11.2011 by Bochkanov Sergey *************************************************************************/ void lincgsetb(lincgstate* state, /* Real */ ae_vector* b, ae_state *_state) { ae_assert(!state->running, "LinCGSetB: you can not set B, because function LinCGIteration is running!", _state); ae_assert(b->cnt>=state->n, "LinCGSetB: Length(B)<N", _state); ae_assert(isfinitevector(b, state->n, _state), "LinCGSetB: B contains infinite or NaN values!", _state); ae_v_move(&state->b.ptr.p_double[0], 1, &b->ptr.p_double[0], 1, ae_v_len(0,state->n-1)); } /************************************************************************* This function changes preconditioning settings of LinCGSolveSparse() function. By default, SolveSparse() uses diagonal preconditioner, but if you want to use solver without preconditioning, you can call this function which forces solver to use unit matrix for preconditioning. INPUT PARAMETERS: State - structure which stores algorithm state -- ALGLIB -- Copyright 19.11.2012 by Bochkanov Sergey *************************************************************************/ void lincgsetprecunit(lincgstate* state, ae_state *_state) { ae_assert(!state->running, "LinCGSetPrecUnit: you can not change preconditioner, because function LinCGIteration is running!", _state); state->prectype = -1; } /************************************************************************* This function changes preconditioning settings of LinCGSolveSparse() function. LinCGSolveSparse() will use diagonal of the system matrix as preconditioner. This preconditioning mode is active by default. INPUT PARAMETERS: State - structure which stores algorithm state -- ALGLIB -- Copyright 19.11.2012 by Bochkanov Sergey *************************************************************************/ void lincgsetprecdiag(lincgstate* state, ae_state *_state) { ae_assert(!state->running, "LinCGSetPrecDiag: you can not change preconditioner, because function LinCGIteration is running!", _state); state->prectype = 0; } /************************************************************************* This function sets stopping criteria. INPUT PARAMETERS: EpsF - algorithm will be stopped if norm of residual is less than EpsF*||b||. MaxIts - algorithm will be stopped if number of iterations is more than MaxIts. OUTPUT PARAMETERS: State - structure which stores algorithm state NOTES: If both EpsF and MaxIts are zero then EpsF will be set to a small default value. -- ALGLIB -- Copyright 14.11.2011 by Bochkanov Sergey *************************************************************************/ void lincgsetcond(lincgstate* state, double epsf, ae_int_t maxits, ae_state *_state) { ae_assert(!state->running, "LinCGSetCond: you can not change stopping criteria when LinCGIteration() is running", _state); ae_assert(ae_isfinite(epsf, _state)&&ae_fp_greater_eq(epsf,(double)(0)), "LinCGSetCond: EpsF is negative or contains infinite or NaN values", _state); ae_assert(maxits>=0, "LinCGSetCond: MaxIts is negative", _state); if( ae_fp_eq(epsf,(double)(0))&&maxits==0 ) { state->epsf = lincg_defaultprecision; state->maxits = maxits; } else { state->epsf = epsf; state->maxits = maxits; } } /************************************************************************* Reverse communication version of linear CG. 
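Before the reverse-communication details, here is how the whole calling sequence from the LinCGCreate() comments above typically looks for a sparse SPD system (a sketch assuming the C++ wrapper and the sparsematrix helpers; the data is illustrative):

    // solve [[4,1],[1,3]]*x = [1,2]; only the upper triangle is stored since IsUpper=true below
    sparsematrix a;
    sparsecreate(2, 2, 0, a);
    sparseset(a, 0, 0, 4.0); sparseset(a, 0, 1, 1.0); sparseset(a, 1, 1, 3.0);
    sparseconverttocrs(a);                    // CG requires CRS storage
    real_1d_array b = "[1, 2]", x;
    lincgstate s;
    lincgreport rep;
    lincgcreate(2, s);
    lincgsolvesparse(s, a, true, b);          // IsUpper=true: lower triangle is never referenced
    lincgresults(s, x, rep);                  // expect x close to [0.0909, 0.6364], rep.terminationtype=1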
-- ALGLIB -- Copyright 14.11.2011 by Bochkanov Sergey *************************************************************************/ ae_bool lincgiteration(lincgstate* state, ae_state *_state) { ae_int_t i; double uvar; double bnorm; double v; ae_bool result; /* * Reverse communication preparations * I know it looks ugly, but it works the same way * anywhere from C++ to Python. * * This code initializes locals by: * * random values determined during code * generation - on first subroutine call * * values from previous call - on subsequent calls */ if( state->rstate.stage>=0 ) { i = state->rstate.ia.ptr.p_int[0]; uvar = state->rstate.ra.ptr.p_double[0]; bnorm = state->rstate.ra.ptr.p_double[1]; v = state->rstate.ra.ptr.p_double[2]; } else { i = 359; uvar = -58; bnorm = -919; v = -909; } if( state->rstate.stage==0 ) { goto lbl_0; } if( state->rstate.stage==1 ) { goto lbl_1; } if( state->rstate.stage==2 ) { goto lbl_2; } if( state->rstate.stage==3 ) { goto lbl_3; } if( state->rstate.stage==4 ) { goto lbl_4; } if( state->rstate.stage==5 ) { goto lbl_5; } if( state->rstate.stage==6 ) { goto lbl_6; } if( state->rstate.stage==7 ) { goto lbl_7; } /* * Routine body */ ae_assert(state->b.cnt>0, "LinCGIteration: B is not initialized (you must initialize B by LinCGSetB() call", _state); state->running = ae_true; state->repnmv = 0; lincg_clearrfields(state, _state); lincg_updateitersdata(state, _state); /* * Start 0-th iteration */ ae_v_move(&state->rx.ptr.p_double[0], 1, &state->startx.ptr.p_double[0], 1, ae_v_len(0,state->n-1)); ae_v_move(&state->x.ptr.p_double[0], 1, &state->rx.ptr.p_double[0], 1, ae_v_len(0,state->n-1)); state->repnmv = state->repnmv+1; lincg_clearrfields(state, _state); state->needvmv = ae_true; state->rstate.stage = 0; goto lbl_rcomm; lbl_0: state->needvmv = ae_false; bnorm = (double)(0); state->r2 = (double)(0); state->meritfunction = (double)(0); for(i=0; i<=state->n-1; i++) { state->r.ptr.p_double[i] = state->b.ptr.p_double[i]-state->mv.ptr.p_double[i]; state->r2 = state->r2+state->r.ptr.p_double[i]*state->r.ptr.p_double[i]; state->meritfunction = state->meritfunction+state->mv.ptr.p_double[i]*state->rx.ptr.p_double[i]-2*state->b.ptr.p_double[i]*state->rx.ptr.p_double[i]; bnorm = bnorm+state->b.ptr.p_double[i]*state->b.ptr.p_double[i]; } bnorm = ae_sqrt(bnorm, _state); /* * Output first report */ if( !state->xrep ) { goto lbl_8; } ae_v_move(&state->x.ptr.p_double[0], 1, &state->rx.ptr.p_double[0], 1, ae_v_len(0,state->n-1)); lincg_clearrfields(state, _state); state->xupdated = ae_true; state->rstate.stage = 1; goto lbl_rcomm; lbl_1: state->xupdated = ae_false; lbl_8: /* * Is x0 a solution? 
*/ if( !ae_isfinite(state->r2, _state)||ae_fp_less_eq(ae_sqrt(state->r2, _state),state->epsf*bnorm) ) { state->running = ae_false; if( ae_isfinite(state->r2, _state) ) { state->repterminationtype = 1; } else { state->repterminationtype = -4; } result = ae_false; return result; } /* * Calculate Z and P */ ae_v_move(&state->x.ptr.p_double[0], 1, &state->r.ptr.p_double[0], 1, ae_v_len(0,state->n-1)); state->repnmv = state->repnmv+1; lincg_clearrfields(state, _state); state->needprec = ae_true; state->rstate.stage = 2; goto lbl_rcomm; lbl_2: state->needprec = ae_false; for(i=0; i<=state->n-1; i++) { state->z.ptr.p_double[i] = state->pv.ptr.p_double[i]; state->p.ptr.p_double[i] = state->z.ptr.p_double[i]; } /* * Other iterations(1..N) */ state->repiterationscount = 0; lbl_10: if( ae_false ) { goto lbl_11; } state->repiterationscount = state->repiterationscount+1; /* * Calculate Alpha */ ae_v_move(&state->x.ptr.p_double[0], 1, &state->p.ptr.p_double[0], 1, ae_v_len(0,state->n-1)); state->repnmv = state->repnmv+1; lincg_clearrfields(state, _state); state->needvmv = ae_true; state->rstate.stage = 3; goto lbl_rcomm; lbl_3: state->needvmv = ae_false; if( !ae_isfinite(state->vmv, _state)||ae_fp_less_eq(state->vmv,(double)(0)) ) { /* * a) Overflow when calculating VMV * b) non-positive VMV (non-SPD matrix) */ state->running = ae_false; if( ae_isfinite(state->vmv, _state) ) { state->repterminationtype = -5; } else { state->repterminationtype = -4; } result = ae_false; return result; } state->alpha = (double)(0); for(i=0; i<=state->n-1; i++) { state->alpha = state->alpha+state->r.ptr.p_double[i]*state->z.ptr.p_double[i]; } state->alpha = state->alpha/state->vmv; if( !ae_isfinite(state->alpha, _state) ) { /* * Overflow when calculating Alpha */ state->running = ae_false; state->repterminationtype = -4; result = ae_false; return result; } /* * Next step toward solution */ for(i=0; i<=state->n-1; i++) { state->cx.ptr.p_double[i] = state->rx.ptr.p_double[i]+state->alpha*state->p.ptr.p_double[i]; } /* * Calculate R: * * use recurrent relation to update R * * at every ItsBeforeRUpdate-th iteration recalculate it from scratch, using matrix-vector product * in case R grows instead of decreasing, algorithm is terminated with positive completion code */ if( !(state->itsbeforerupdate==0||state->repiterationscount%state->itsbeforerupdate!=0) ) { goto lbl_12; } /* * Calculate R using recurrent formula */ for(i=0; i<=state->n-1; i++) { state->cr.ptr.p_double[i] = state->r.ptr.p_double[i]-state->alpha*state->mv.ptr.p_double[i]; state->x.ptr.p_double[i] = state->cr.ptr.p_double[i]; } goto lbl_13; lbl_12: /* * Calculate R using matrix-vector multiplication */ ae_v_move(&state->x.ptr.p_double[0], 1, &state->cx.ptr.p_double[0], 1, ae_v_len(0,state->n-1)); state->repnmv = state->repnmv+1; lincg_clearrfields(state, _state); state->needmv = ae_true; state->rstate.stage = 4; goto lbl_rcomm; lbl_4: state->needmv = ae_false; for(i=0; i<=state->n-1; i++) { state->cr.ptr.p_double[i] = state->b.ptr.p_double[i]-state->mv.ptr.p_double[i]; state->x.ptr.p_double[i] = state->cr.ptr.p_double[i]; } /* * Calculating merit function * Check emergency stopping criterion */ v = (double)(0); for(i=0; i<=state->n-1; i++) { v = v+state->mv.ptr.p_double[i]*state->cx.ptr.p_double[i]-2*state->b.ptr.p_double[i]*state->cx.ptr.p_double[i]; } if( ae_fp_less(v,state->meritfunction) ) { goto lbl_14; } for(i=0; i<=state->n-1; i++) { if( !ae_isfinite(state->rx.ptr.p_double[i], _state) ) { state->running = ae_false; state->repterminationtype = -4; result = 
ae_false; return result; } } /* *output last report */ if( !state->xrep ) { goto lbl_16; } ae_v_move(&state->x.ptr.p_double[0], 1, &state->rx.ptr.p_double[0], 1, ae_v_len(0,state->n-1)); lincg_clearrfields(state, _state); state->xupdated = ae_true; state->rstate.stage = 5; goto lbl_rcomm; lbl_5: state->xupdated = ae_false; lbl_16: state->running = ae_false; state->repterminationtype = 7; result = ae_false; return result; lbl_14: state->meritfunction = v; lbl_13: ae_v_move(&state->rx.ptr.p_double[0], 1, &state->cx.ptr.p_double[0], 1, ae_v_len(0,state->n-1)); /* * calculating RNorm * * NOTE: monotonic decrease of R2 is not guaranteed by algorithm. */ state->r2 = (double)(0); for(i=0; i<=state->n-1; i++) { state->r2 = state->r2+state->cr.ptr.p_double[i]*state->cr.ptr.p_double[i]; } /* *output report */ if( !state->xrep ) { goto lbl_18; } ae_v_move(&state->x.ptr.p_double[0], 1, &state->rx.ptr.p_double[0], 1, ae_v_len(0,state->n-1)); lincg_clearrfields(state, _state); state->xupdated = ae_true; state->rstate.stage = 6; goto lbl_rcomm; lbl_6: state->xupdated = ae_false; lbl_18: /* *stopping criterion *achieved the required precision */ if( !ae_isfinite(state->r2, _state)||ae_fp_less_eq(ae_sqrt(state->r2, _state),state->epsf*bnorm) ) { state->running = ae_false; if( ae_isfinite(state->r2, _state) ) { state->repterminationtype = 1; } else { state->repterminationtype = -4; } result = ae_false; return result; } if( state->repiterationscount>=state->maxits&&state->maxits>0 ) { for(i=0; i<=state->n-1; i++) { if( !ae_isfinite(state->rx.ptr.p_double[i], _state) ) { state->running = ae_false; state->repterminationtype = -4; result = ae_false; return result; } } /* *if X is finite number */ state->running = ae_false; state->repterminationtype = 5; result = ae_false; return result; } ae_v_move(&state->x.ptr.p_double[0], 1, &state->cr.ptr.p_double[0], 1, ae_v_len(0,state->n-1)); /* *prepere of parameters for next iteration */ state->repnmv = state->repnmv+1; lincg_clearrfields(state, _state); state->needprec = ae_true; state->rstate.stage = 7; goto lbl_rcomm; lbl_7: state->needprec = ae_false; ae_v_move(&state->cz.ptr.p_double[0], 1, &state->pv.ptr.p_double[0], 1, ae_v_len(0,state->n-1)); if( state->repiterationscount%state->itsbeforerestart!=0 ) { state->beta = (double)(0); uvar = (double)(0); for(i=0; i<=state->n-1; i++) { state->beta = state->beta+state->cz.ptr.p_double[i]*state->cr.ptr.p_double[i]; uvar = uvar+state->z.ptr.p_double[i]*state->r.ptr.p_double[i]; } /* *check that UVar is't INF or is't zero */ if( !ae_isfinite(uvar, _state)||ae_fp_eq(uvar,(double)(0)) ) { state->running = ae_false; state->repterminationtype = -4; result = ae_false; return result; } /* *calculate .BETA */ state->beta = state->beta/uvar; /* *check that .BETA neither INF nor NaN */ if( !ae_isfinite(state->beta, _state) ) { state->running = ae_false; state->repterminationtype = -1; result = ae_false; return result; } for(i=0; i<=state->n-1; i++) { state->p.ptr.p_double[i] = state->cz.ptr.p_double[i]+state->beta*state->p.ptr.p_double[i]; } } else { ae_v_move(&state->p.ptr.p_double[0], 1, &state->cz.ptr.p_double[0], 1, ae_v_len(0,state->n-1)); } /* *prepere data for next iteration */ for(i=0; i<=state->n-1; i++) { /* *write (k+1)th iteration to (k )th iteration */ state->r.ptr.p_double[i] = state->cr.ptr.p_double[i]; state->z.ptr.p_double[i] = state->cz.ptr.p_double[i]; } goto lbl_10; lbl_11: result = ae_false; return result; /* * Saving state */ lbl_rcomm: result = ae_true; state->rstate.ia.ptr.p_int[0] = i; 
state->rstate.ra.ptr.p_double[0] = uvar; state->rstate.ra.ptr.p_double[1] = bnorm; state->rstate.ra.ptr.p_double[2] = v; return result; } /************************************************************************* Procedure for solution of A*x=b with sparse A. INPUT PARAMETERS: State - algorithm state A - sparse matrix in the CRS format (you MUST convert it to CRS format by calling SparseConvertToCRS() function). IsUpper - whether upper or lower triangle of A is used: * IsUpper=True => only upper triangle is used and lower triangle is not referenced at all * IsUpper=False => only lower triangle is used and upper triangle is not referenced at all B - right part, array[N] RESULT: This function returns no result. You can get solution by calling LinCGResults() NOTE: this function uses lightweight preconditioning - multiplication by inverse of diag(A). If you want, you can turn preconditioning off by calling LinCGSetPrecUnit(). However, preconditioning cost is low and preconditioner is very important for solution of badly scaled problems. -- ALGLIB -- Copyright 14.11.2011 by Bochkanov Sergey *************************************************************************/ void lincgsolvesparse(lincgstate* state, sparsematrix* a, ae_bool isupper, /* Real */ ae_vector* b, ae_state *_state) { ae_int_t n; ae_int_t i; double v; double vmv; n = state->n; ae_assert(b->cnt>=state->n, "LinCGSetB: Length(B)<N", _state); ae_assert(isfinitevector(b, state->n, _state), "LinCGSetB: B contains infinite or NaN values!", _state); /* * Allocate temporaries */ rvectorsetlengthatleast(&state->tmpd, n, _state); /* * Compute diagonal scaling matrix D */ if( state->prectype==0 ) { /* * Default preconditioner - inverse of matrix diagonal */ for(i=0; i<=n-1; i++) { v = sparsegetdiagonal(a, i, _state); if( ae_fp_greater(v,(double)(0)) ) { state->tmpd.ptr.p_double[i] = 1/ae_sqrt(v, _state); } else { state->tmpd.ptr.p_double[i] = (double)(1); } } } else { /* * No diagonal scaling */ for(i=0; i<=n-1; i++) { state->tmpd.ptr.p_double[i] = (double)(1); } } /* * Solve */ lincgrestart(state, _state); lincgsetb(state, b, _state); while(lincgiteration(state, _state)) { /* * Process different requests from optimizer */ if( state->needmv ) { sparsesmv(a, isupper, &state->x, &state->mv, _state); } if( state->needvmv ) { sparsesmv(a, isupper, &state->x, &state->mv, _state); vmv = ae_v_dotproduct(&state->x.ptr.p_double[0], 1, &state->mv.ptr.p_double[0], 1, ae_v_len(0,state->n-1)); state->vmv = vmv; } if( state->needprec ) { for(i=0; i<=n-1; i++) { state->pv.ptr.p_double[i] = state->x.ptr.p_double[i]*ae_sqr(state->tmpd.ptr.p_double[i], _state); } } } } /************************************************************************* CG-solver: results. 
This function must be called after LinCGSolveSparse() INPUT PARAMETERS: State - algorithm state OUTPUT PARAMETERS: X - array[N], solution Rep - optimization report: * Rep.TerminationType completion code: * -5 input matrix is either not positive definite, too large or too small * -4 overflow/underflow during solution (ill conditioned problem) * 1 ||residual||<=EpsF*||b|| * 5 MaxIts steps were taken * 7 rounding errors prevent further progress, best point found is returned * Rep.IterationsCount contains iterations count * Rep.NMV contains number of matrix-vector calculations -- ALGLIB -- Copyright 14.11.2011 by Bochkanov Sergey *************************************************************************/ void lincgresults(lincgstate* state, /* Real */ ae_vector* x, lincgreport* rep, ae_state *_state) { ae_vector_clear(x); _lincgreport_clear(rep); ae_assert(!state->running, "LinCGResult: you can not get result, because function LinCGIteration has been launched!", _state); if( x->cnt<state->n ) { ae_vector_set_length(x, state->n, _state); } ae_v_move(&x->ptr.p_double[0], 1, &state->rx.ptr.p_double[0], 1, ae_v_len(0,state->n-1)); rep->iterationscount = state->repiterationscount; rep->nmv = state->repnmv; rep->terminationtype = state->repterminationtype; rep->r2 = state->r2; } /************************************************************************* This function sets restart frequency. By default, algorithm is restarted after N subsequent iterations. -- ALGLIB -- Copyright 14.11.2011 by Bochkanov Sergey *************************************************************************/ void lincgsetrestartfreq(lincgstate* state, ae_int_t srf, ae_state *_state) { ae_assert(!state->running, "LinCGSetRestartFreq: you can not change restart frequency when LinCGIteration() is running", _state); ae_assert(srf>0, "LinCGSetRestartFreq: non-positive SRF", _state); state->itsbeforerestart = srf; } /************************************************************************* This function sets frequency of residual recalculations. Algorithm updates residual r_k using iterative formula, but recalculates it from scratch after every 10 iterations. It is done to avoid accumulation of numerical errors and to stop algorithm when r_k starts to grow. Such low update frequency (1/10) gives very little overhead, but makes algorithm a bit more robust against numerical errors. However, you may change it with this function. INPUT PARAMETERS: Freq - desired update frequency, Freq>=0. Zero value means that no updates will be done. -- ALGLIB -- Copyright 14.11.2011 by Bochkanov Sergey *************************************************************************/ void lincgsetrupdatefreq(lincgstate* state, ae_int_t freq, ae_state *_state) { ae_assert(!state->running, "LinCGSetRUpdateFreq: you can not change update frequency when LinCGIteration() is running", _state); ae_assert(freq>=0, "LinCGSetRUpdateFreq: non-positive Freq", _state); state->itsbeforerupdate = freq; } /************************************************************************* This function turns on/off reporting. INPUT PARAMETERS: State - structure which stores algorithm state NeedXRep- whether iteration reports are needed or not If NeedXRep is True, algorithm will call rep() callback function if it is provided to MinCGOptimize(). 
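The solver behaviour discussed above can be tuned before calling LinCGSolveSparse(); a short sketch in C++ wrapper style (the specific values are illustrative only):

    lincgsetcond(s, 1.0E-9, 0);        // stop when ||residual|| <= 1.0E-9*||b||, no iteration limit
    lincgsetrestartfreq(s, 50);        // restart CG every 50 iterations instead of every N
    lincgsetrupdatefreq(s, 10);        // recompute the residual from scratch every 10 iterations
    lincgsetxrep(s, true);             // enable progress reports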
-- ALGLIB -- Copyright 14.11.2011 by Bochkanov Sergey *************************************************************************/ void lincgsetxrep(lincgstate* state, ae_bool needxrep, ae_state *_state) { state->xrep = needxrep; } /************************************************************************* Procedure for restart function LinCGIteration -- ALGLIB -- Copyright 14.11.2011 by Bochkanov Sergey *************************************************************************/ void lincgrestart(lincgstate* state, ae_state *_state) { ae_vector_set_length(&state->rstate.ia, 0+1, _state); ae_vector_set_length(&state->rstate.ra, 2+1, _state); state->rstate.stage = -1; lincg_clearrfields(state, _state); } /************************************************************************* Clears request fields (to be sure that we don't forget to clear something) *************************************************************************/ static void lincg_clearrfields(lincgstate* state, ae_state *_state) { state->xupdated = ae_false; state->needmv = ae_false; state->needmtv = ae_false; state->needmv2 = ae_false; state->needvmv = ae_false; state->needprec = ae_false; } /************************************************************************* Resets report fields (iteration counters and termination type) *************************************************************************/ static void lincg_updateitersdata(lincgstate* state, ae_state *_state) { state->repiterationscount = 0; state->repnmv = 0; state->repterminationtype = 0; } void _lincgstate_init(void* _p, ae_state *_state) { lincgstate *p = (lincgstate*)_p; ae_touch_ptr((void*)p); ae_vector_init(&p->rx, 0, DT_REAL, _state); ae_vector_init(&p->b, 0, DT_REAL, _state); ae_vector_init(&p->cx, 0, DT_REAL, _state); ae_vector_init(&p->cr, 0, DT_REAL, _state); ae_vector_init(&p->cz, 0, DT_REAL, _state); ae_vector_init(&p->p, 0, DT_REAL, _state); ae_vector_init(&p->r, 0, DT_REAL, _state); ae_vector_init(&p->z, 0, DT_REAL, _state); ae_vector_init(&p->x, 0, DT_REAL, _state); ae_vector_init(&p->mv, 0, DT_REAL, _state); ae_vector_init(&p->pv, 0, DT_REAL, _state); ae_vector_init(&p->startx, 0, DT_REAL, _state); ae_vector_init(&p->tmpd, 0, DT_REAL, _state); _rcommstate_init(&p->rstate, _state); } void _lincgstate_init_copy(void* _dst, void* _src, ae_state *_state) { lincgstate *dst = (lincgstate*)_dst; lincgstate *src = (lincgstate*)_src; ae_vector_init_copy(&dst->rx, &src->rx, _state); ae_vector_init_copy(&dst->b, &src->b, _state); dst->n = src->n; dst->prectype = src->prectype; ae_vector_init_copy(&dst->cx, &src->cx, _state); ae_vector_init_copy(&dst->cr, &src->cr, _state); ae_vector_init_copy(&dst->cz, &src->cz, _state); ae_vector_init_copy(&dst->p, &src->p, _state); ae_vector_init_copy(&dst->r, &src->r, _state); ae_vector_init_copy(&dst->z, &src->z, _state); dst->alpha = src->alpha; dst->beta = src->beta; dst->r2 = src->r2; dst->meritfunction = src->meritfunction; ae_vector_init_copy(&dst->x, &src->x, _state); ae_vector_init_copy(&dst->mv, &src->mv, _state); ae_vector_init_copy(&dst->pv, &src->pv, _state); dst->vmv = src->vmv; ae_vector_init_copy(&dst->startx, &src->startx, _state); dst->epsf = src->epsf; dst->maxits = src->maxits; dst->itsbeforerestart = src->itsbeforerestart; dst->itsbeforerupdate = src->itsbeforerupdate; dst->xrep = src->xrep; dst->xupdated = src->xupdated; dst->needmv = src->needmv; dst->needmtv = src->needmtv; dst->needmv2 = src->needmv2; dst->needvmv = src->needvmv; dst->needprec = src->needprec; dst->repiterationscount =
src->repiterationscount; dst->repnmv = src->repnmv; dst->repterminationtype = src->repterminationtype; dst->running = src->running; ae_vector_init_copy(&dst->tmpd, &src->tmpd, _state); _rcommstate_init_copy(&dst->rstate, &src->rstate, _state); } void _lincgstate_clear(void* _p) { lincgstate *p = (lincgstate*)_p; ae_touch_ptr((void*)p); ae_vector_clear(&p->rx); ae_vector_clear(&p->b); ae_vector_clear(&p->cx); ae_vector_clear(&p->cr); ae_vector_clear(&p->cz); ae_vector_clear(&p->p); ae_vector_clear(&p->r); ae_vector_clear(&p->z); ae_vector_clear(&p->x); ae_vector_clear(&p->mv); ae_vector_clear(&p->pv); ae_vector_clear(&p->startx); ae_vector_clear(&p->tmpd); _rcommstate_clear(&p->rstate); } void _lincgstate_destroy(void* _p) { lincgstate *p = (lincgstate*)_p; ae_touch_ptr((void*)p); ae_vector_destroy(&p->rx); ae_vector_destroy(&p->b); ae_vector_destroy(&p->cx); ae_vector_destroy(&p->cr); ae_vector_destroy(&p->cz); ae_vector_destroy(&p->p); ae_vector_destroy(&p->r); ae_vector_destroy(&p->z); ae_vector_destroy(&p->x); ae_vector_destroy(&p->mv); ae_vector_destroy(&p->pv); ae_vector_destroy(&p->startx); ae_vector_destroy(&p->tmpd); _rcommstate_destroy(&p->rstate); } void _lincgreport_init(void* _p, ae_state *_state) { lincgreport *p = (lincgreport*)_p; ae_touch_ptr((void*)p); } void _lincgreport_init_copy(void* _dst, void* _src, ae_state *_state) { lincgreport *dst = (lincgreport*)_dst; lincgreport *src = (lincgreport*)_src; dst->iterationscount = src->iterationscount; dst->nmv = src->nmv; dst->terminationtype = src->terminationtype; dst->r2 = src->r2; } void _lincgreport_clear(void* _p) { lincgreport *p = (lincgreport*)_p; ae_touch_ptr((void*)p); } void _lincgreport_destroy(void* _p) { lincgreport *p = (lincgreport*)_p; ae_touch_ptr((void*)p); } } cpp/src/solvers.h0000755000175000017500000051560113105126765013674 0ustar sergeysergey/************************************************************************* ALGLIB 3.11.0 (source code generated 2017-05-11) Copyright (c) Sergey Bochkanov (ALGLIB project). >>> SOURCE LICENSE >>> This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation (www.fsf.org); either version 2 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. 
A copy of the GNU General Public License is available at http://www.fsf.org/licensing/licenses >>> END OF LICENSE >>> *************************************************************************/ #ifndef _solvers_pkg_h #define _solvers_pkg_h #include "ap.h" #include "alglibinternal.h" #include "linalg.h" #include "alglibmisc.h" ///////////////////////////////////////////////////////////////////////// // // THIS SECTION CONTAINS COMPUTATIONAL CORE DECLARATIONS (DATATYPES) // ///////////////////////////////////////////////////////////////////////// namespace alglib_impl { typedef struct { double r1; double rinf; } densesolverreport; typedef struct { double r2; ae_matrix cx; ae_int_t n; ae_int_t k; } densesolverlsreport; typedef struct { normestimatorstate nes; ae_vector rx; ae_vector b; ae_int_t n; ae_int_t m; ae_int_t prectype; ae_vector ui; ae_vector uip1; ae_vector vi; ae_vector vip1; ae_vector omegai; ae_vector omegaip1; double alphai; double alphaip1; double betai; double betaip1; double phibari; double phibarip1; double phii; double rhobari; double rhobarip1; double rhoi; double ci; double si; double theta; double lambdai; ae_vector d; double anorm; double bnorm2; double dnorm; double r2; ae_vector x; ae_vector mv; ae_vector mtv; double epsa; double epsb; double epsc; ae_int_t maxits; ae_bool xrep; ae_bool xupdated; ae_bool needmv; ae_bool needmtv; ae_bool needmv2; ae_bool needvmv; ae_bool needprec; ae_int_t repiterationscount; ae_int_t repnmv; ae_int_t repterminationtype; ae_bool running; ae_vector tmpd; ae_vector tmpx; rcommstate rstate; } linlsqrstate; typedef struct { ae_int_t iterationscount; ae_int_t nmv; ae_int_t terminationtype; } linlsqrreport; typedef struct { double maxerr; } polynomialsolverreport; typedef struct { ae_int_t n; ae_int_t m; double epsf; ae_int_t maxits; ae_bool xrep; double stpmax; ae_vector x; double f; ae_vector fi; ae_matrix j; ae_bool needf; ae_bool needfij; ae_bool xupdated; rcommstate rstate; ae_int_t repiterationscount; ae_int_t repnfunc; ae_int_t repnjac; ae_int_t repterminationtype; ae_vector xbase; double fbase; double fprev; ae_vector candstep; ae_vector rightpart; ae_vector cgbuf; } nleqstate; typedef struct { ae_int_t iterationscount; ae_int_t nfunc; ae_int_t njac; ae_int_t terminationtype; } nleqreport; typedef struct { ae_vector rx; ae_vector b; ae_int_t n; ae_int_t prectype; ae_vector cx; ae_vector cr; ae_vector cz; ae_vector p; ae_vector r; ae_vector z; double alpha; double beta; double r2; double meritfunction; ae_vector x; ae_vector mv; ae_vector pv; double vmv; ae_vector startx; double epsf; ae_int_t maxits; ae_int_t itsbeforerestart; ae_int_t itsbeforerupdate; ae_bool xrep; ae_bool xupdated; ae_bool needmv; ae_bool needmtv; ae_bool needmv2; ae_bool needvmv; ae_bool needprec; ae_int_t repiterationscount; ae_int_t repnmv; ae_int_t repterminationtype; ae_bool running; ae_vector tmpd; rcommstate rstate; } lincgstate; typedef struct { ae_int_t iterationscount; ae_int_t nmv; ae_int_t terminationtype; double r2; } lincgreport; } ///////////////////////////////////////////////////////////////////////// // // THIS SECTION CONTAINS C++ INTERFACE // ///////////////////////////////////////////////////////////////////////// namespace alglib { /************************************************************************* *************************************************************************/ class _densesolverreport_owner { public: _densesolverreport_owner(); _densesolverreport_owner(const _densesolverreport_owner &rhs); _densesolverreport_owner& 
operator=(const _densesolverreport_owner &rhs); virtual ~_densesolverreport_owner(); alglib_impl::densesolverreport* c_ptr(); alglib_impl::densesolverreport* c_ptr() const; protected: alglib_impl::densesolverreport *p_struct; }; class densesolverreport : public _densesolverreport_owner { public: densesolverreport(); densesolverreport(const densesolverreport &rhs); densesolverreport& operator=(const densesolverreport &rhs); virtual ~densesolverreport(); double &r1; double &rinf; }; /************************************************************************* *************************************************************************/ class _densesolverlsreport_owner { public: _densesolverlsreport_owner(); _densesolverlsreport_owner(const _densesolverlsreport_owner &rhs); _densesolverlsreport_owner& operator=(const _densesolverlsreport_owner &rhs); virtual ~_densesolverlsreport_owner(); alglib_impl::densesolverlsreport* c_ptr(); alglib_impl::densesolverlsreport* c_ptr() const; protected: alglib_impl::densesolverlsreport *p_struct; }; class densesolverlsreport : public _densesolverlsreport_owner { public: densesolverlsreport(); densesolverlsreport(const densesolverlsreport &rhs); densesolverlsreport& operator=(const densesolverlsreport &rhs); virtual ~densesolverlsreport(); double &r2; real_2d_array cx; ae_int_t &n; ae_int_t &k; }; /************************************************************************* This object stores state of the LinLSQR method. You should use ALGLIB functions to work with this object. *************************************************************************/ class _linlsqrstate_owner { public: _linlsqrstate_owner(); _linlsqrstate_owner(const _linlsqrstate_owner &rhs); _linlsqrstate_owner& operator=(const _linlsqrstate_owner &rhs); virtual ~_linlsqrstate_owner(); alglib_impl::linlsqrstate* c_ptr(); alglib_impl::linlsqrstate* c_ptr() const; protected: alglib_impl::linlsqrstate *p_struct; }; class linlsqrstate : public _linlsqrstate_owner { public: linlsqrstate(); linlsqrstate(const linlsqrstate &rhs); linlsqrstate& operator=(const linlsqrstate &rhs); virtual ~linlsqrstate(); }; /************************************************************************* *************************************************************************/ class _linlsqrreport_owner { public: _linlsqrreport_owner(); _linlsqrreport_owner(const _linlsqrreport_owner &rhs); _linlsqrreport_owner& operator=(const _linlsqrreport_owner &rhs); virtual ~_linlsqrreport_owner(); alglib_impl::linlsqrreport* c_ptr(); alglib_impl::linlsqrreport* c_ptr() const; protected: alglib_impl::linlsqrreport *p_struct; }; class linlsqrreport : public _linlsqrreport_owner { public: linlsqrreport(); linlsqrreport(const linlsqrreport &rhs); linlsqrreport& operator=(const linlsqrreport &rhs); virtual ~linlsqrreport(); ae_int_t &iterationscount; ae_int_t &nmv; ae_int_t &terminationtype; }; /************************************************************************* *************************************************************************/ class _polynomialsolverreport_owner { public: _polynomialsolverreport_owner(); _polynomialsolverreport_owner(const _polynomialsolverreport_owner &rhs); _polynomialsolverreport_owner& operator=(const _polynomialsolverreport_owner &rhs); virtual ~_polynomialsolverreport_owner(); alglib_impl::polynomialsolverreport* c_ptr(); alglib_impl::polynomialsolverreport* c_ptr() const; protected: alglib_impl::polynomialsolverreport *p_struct; }; class polynomialsolverreport : public _polynomialsolverreport_owner { 
public: polynomialsolverreport(); polynomialsolverreport(const polynomialsolverreport &rhs); polynomialsolverreport& operator=(const polynomialsolverreport &rhs); virtual ~polynomialsolverreport(); double &maxerr; }; /************************************************************************* *************************************************************************/ class _nleqstate_owner { public: _nleqstate_owner(); _nleqstate_owner(const _nleqstate_owner &rhs); _nleqstate_owner& operator=(const _nleqstate_owner &rhs); virtual ~_nleqstate_owner(); alglib_impl::nleqstate* c_ptr(); alglib_impl::nleqstate* c_ptr() const; protected: alglib_impl::nleqstate *p_struct; }; class nleqstate : public _nleqstate_owner { public: nleqstate(); nleqstate(const nleqstate &rhs); nleqstate& operator=(const nleqstate &rhs); virtual ~nleqstate(); ae_bool &needf; ae_bool &needfij; ae_bool &xupdated; double &f; real_1d_array fi; real_2d_array j; real_1d_array x; }; /************************************************************************* *************************************************************************/ class _nleqreport_owner { public: _nleqreport_owner(); _nleqreport_owner(const _nleqreport_owner &rhs); _nleqreport_owner& operator=(const _nleqreport_owner &rhs); virtual ~_nleqreport_owner(); alglib_impl::nleqreport* c_ptr(); alglib_impl::nleqreport* c_ptr() const; protected: alglib_impl::nleqreport *p_struct; }; class nleqreport : public _nleqreport_owner { public: nleqreport(); nleqreport(const nleqreport &rhs); nleqreport& operator=(const nleqreport &rhs); virtual ~nleqreport(); ae_int_t &iterationscount; ae_int_t &nfunc; ae_int_t &njac; ae_int_t &terminationtype; }; /************************************************************************* This object stores state of the linear CG method. You should use ALGLIB functions to work with this object. Never try to access its fields directly! *************************************************************************/ class _lincgstate_owner { public: _lincgstate_owner(); _lincgstate_owner(const _lincgstate_owner &rhs); _lincgstate_owner& operator=(const _lincgstate_owner &rhs); virtual ~_lincgstate_owner(); alglib_impl::lincgstate* c_ptr(); alglib_impl::lincgstate* c_ptr() const; protected: alglib_impl::lincgstate *p_struct; }; class lincgstate : public _lincgstate_owner { public: lincgstate(); lincgstate(const lincgstate &rhs); lincgstate& operator=(const lincgstate &rhs); virtual ~lincgstate(); }; /************************************************************************* *************************************************************************/ class _lincgreport_owner { public: _lincgreport_owner(); _lincgreport_owner(const _lincgreport_owner &rhs); _lincgreport_owner& operator=(const _lincgreport_owner &rhs); virtual ~_lincgreport_owner(); alglib_impl::lincgreport* c_ptr(); alglib_impl::lincgreport* c_ptr() const; protected: alglib_impl::lincgreport *p_struct; }; class lincgreport : public _lincgreport_owner { public: lincgreport(); lincgreport(const lincgreport &rhs); lincgreport& operator=(const lincgreport &rhs); virtual ~lincgreport(); ae_int_t &iterationscount; ae_int_t &nmv; ae_int_t &terminationtype; double &r2; }; /************************************************************************* Dense solver for A*x=b with N*N real matrix A and N*1 real vectorx x and b. This is "slow-but-feature rich" version of the linear solver. Faster version is RMatrixSolveFast() function. 
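EDITORIAL EXAMPLE (added by the editor, not part of the original comment). A minimal sketch of how this solver and its fast counterpart are typically called from the C++ interface; the 2x2 system is illustrative only. Note that the fast variant overwrites B with the solution and does not produce a condition-number report:

    real_2d_array a = "[[2,1],[1,3]]";
    real_1d_array b = "[3,5]";
    real_1d_array x;
    ae_int_t info;
    densesolverreport rep;

    rmatrixsolve(a, 2, b, info, rep, x);    // feature-rich: returns x, info and rep
    if( info>0 )
    {
        // x holds the solution; rep.r1 and rep.rinf describe the conditioning of A
    }

    real_1d_array b2 = "[3,5]";
    rmatrixsolvefast(a, 2, b2, info);       // fast: solution overwrites b2, no report
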
Algorithm features: * automatic detection of degenerate cases * condition number estimation * iterative refinement * O(N^3) complexity IMPORTANT: ! this function is NOT the most efficient linear solver provided ! by ALGLIB. It estimates condition number of linear system ! and performs iterative refinement, which results in ! significant performance penalty when compared with "fast" ! version which just performs LU decomposition and calls ! triangular solver. ! ! This performance penalty is especially visible in the ! multithreaded mode, because both condition number estimation ! and iterative refinement are inherently sequential ! calculations. It also very significant on small matrices. ! ! Thus, if you need high performance and if you are pretty sure ! that your system is well conditioned, we strongly recommend ! you to use faster solver, RMatrixSolveFast() function. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that LU decomposition is harder to ! parallelize than, say, matrix-matrix product - this algorithm has ! many internal synchronization points which can not be avoided. However ! parallelism starts to be profitable starting from N=1024, achieving ! near-linear speedup for N=4096 or higher. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS A - array[0..N-1,0..N-1], system matrix N - size of A B - array[0..N-1], right part OUTPUT PARAMETERS Info - return code: * -3 matrix is very badly conditioned or exactly singular. * -1 N<=0 was passed * 1 task is solved (but matrix A may be ill-conditioned, check R1/RInf parameters for condition numbers). 
Rep - additional report, following fields are set: * rep.r1 condition number in 1-norm * rep.rinf condition number in inf-norm X - array[N], it contains: * info>0 => solution * info=-3 => filled by zeros -- ALGLIB -- Copyright 27.01.2010 by Bochkanov Sergey *************************************************************************/ void rmatrixsolve(const real_2d_array &a, const ae_int_t n, const real_1d_array &b, ae_int_t &info, densesolverreport &rep, real_1d_array &x); void smp_rmatrixsolve(const real_2d_array &a, const ae_int_t n, const real_1d_array &b, ae_int_t &info, densesolverreport &rep, real_1d_array &x); /************************************************************************* Dense solver. This subroutine solves a system A*x=b, where A is NxN non-denegerate real matrix, x and b are vectors. This is a "fast" version of linear solver which does NOT provide any additional functions like condition number estimation or iterative refinement. Algorithm features: * efficient algorithm O(N^3) complexity * no performance overhead from additional functionality If you need condition number estimation or iterative refinement, use more feature-rich version - RMatrixSolve(). COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that LU decomposition is harder to ! parallelize than, say, matrix-matrix product - this algorithm has ! many internal synchronization points which can not be avoided. However ! parallelism starts to be profitable starting from N=1024, achieving ! near-linear speedup for N=4096 or higher. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS A - array[0..N-1,0..N-1], system matrix N - size of A B - array[0..N-1], right part OUTPUT PARAMETERS Info - return code: * -3 matrix is exactly singular (ill conditioned matrices are not recognized). 
* -1 N<=0 was passed * 1 task is solved B - array[N]: * info>0 => overwritten by solution * info=-3 => filled by zeros -- ALGLIB -- Copyright 16.03.2015 by Bochkanov Sergey *************************************************************************/ void rmatrixsolvefast(const real_2d_array &a, const ae_int_t n, const real_1d_array &b, ae_int_t &info); void smp_rmatrixsolvefast(const real_2d_array &a, const ae_int_t n, const real_1d_array &b, ae_int_t &info); /************************************************************************* Dense solver. Similar to RMatrixSolve() but solves task with multiple right parts (where b and x are NxM matrices). This is "slow-but-robust" version of linear solver with additional functionality like condition number estimation. There also exists faster version - RMatrixSolveMFast(). Algorithm features: * automatic detection of degenerate cases * condition number estimation * optional iterative refinement * O(N^3+M*N^2) complexity IMPORTANT: ! this function is NOT the most efficient linear solver provided ! by ALGLIB. It estimates condition number of linear system ! and performs iterative refinement, which results in ! significant performance penalty when compared with "fast" ! version which just performs LU decomposition and calls ! triangular solver. ! ! This performance penalty is especially visible in the ! multithreaded mode, because both condition number estimation ! and iterative refinement are inherently sequential ! calculations. It also very significant on small matrices. ! ! Thus, if you need high performance and if you are pretty sure ! that your system is well conditioned, we strongly recommend ! you to use faster solver, RMatrixSolveMFast() function. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that LU decomposition is harder to ! parallelize than, say, matrix-matrix product - this algorithm has ! many internal synchronization points which can not be avoided. However ! parallelism starts to be profitable starting from N=1024, achieving ! near-linear speedup for N=4096 or higher. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. 
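EDITORIAL EXAMPLE (added by the editor). A short sketch of a multiple-right-hand-side solve with this function; the smp_ call is shown commented out because it is meaningful only in the commercial edition. Matrix contents are illustrative:

    real_2d_array a = "[[2,1],[1,3]]";
    real_2d_array b = "[[3,1],[5,0]]";   // two right-hand-side columns (M=2)
    real_2d_array x;
    ae_int_t info;
    densesolverreport rep;

    rmatrixsolvem(a, 2, b, 2, true, info, rep, x);        // rfs=true: iterative refinement on
    // smp_rmatrixsolvem(a, 2, b, 2, true, info, rep, x); // commercial edition: may run multithreaded
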
INPUT PARAMETERS A - array[0..N-1,0..N-1], system matrix N - size of A B - array[0..N-1,0..M-1], right part M - right part size RFS - iterative refinement switch: * True - refinement is used. Less performance, more precision. * False - refinement is not used. More performance, less precision. OUTPUT PARAMETERS Info - return code: * -3 A is ill conditioned or singular. X is filled by zeros in such cases. * -1 N<=0 was passed * 1 task is solved (but matrix A may be ill-conditioned, check R1/RInf parameters for condition numbers). Rep - additional report, following fields are set: * rep.r1 condition number in 1-norm * rep.rinf condition number in inf-norm X - array[N], it contains: * info>0 => solution * info=-3 => filled by zeros -- ALGLIB -- Copyright 27.01.2010 by Bochkanov Sergey *************************************************************************/ void rmatrixsolvem(const real_2d_array &a, const ae_int_t n, const real_2d_array &b, const ae_int_t m, const bool rfs, ae_int_t &info, densesolverreport &rep, real_2d_array &x); void smp_rmatrixsolvem(const real_2d_array &a, const ae_int_t n, const real_2d_array &b, const ae_int_t m, const bool rfs, ae_int_t &info, densesolverreport &rep, real_2d_array &x); /************************************************************************* Dense solver. Similar to RMatrixSolve() but solves task with multiple right parts (where b and x are NxM matrices). This is "fast" version of linear solver which does NOT offer additional functions like condition number estimation or iterative refinement. Algorithm features: * O(N^3+M*N^2) complexity * no additional functionality, highest performance COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that LU decomposition is harder to ! parallelize than, say, matrix-matrix product - this algorithm has ! many internal synchronization points which can not be avoided. However ! parallelism starts to be profitable starting from N=1024, achieving ! near-linear speedup for N=4096 or higher. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS A - array[0..N-1,0..N-1], system matrix N - size of A B - array[0..N-1,0..M-1], right part M - right part size RFS - iterative refinement switch: * True - refinement is used. 
Less performance, more precision. * False - refinement is not used. More performance, less precision. OUTPUT PARAMETERS Info - return code: * -3 matrix is exactly singular (ill conditioned matrices are not recognized). X is filled by zeros in such cases. * -1 N<=0 was passed * 1 task is solved Rep - additional report, following fields are set: * rep.r1 condition number in 1-norm * rep.rinf condition number in inf-norm B - array[N]: * info>0 => overwritten by solution * info=-3 => filled by zeros -- ALGLIB -- Copyright 27.01.2010 by Bochkanov Sergey *************************************************************************/ void rmatrixsolvemfast(const real_2d_array &a, const ae_int_t n, const real_2d_array &b, const ae_int_t m, ae_int_t &info); void smp_rmatrixsolvemfast(const real_2d_array &a, const ae_int_t n, const real_2d_array &b, const ae_int_t m, ae_int_t &info); /************************************************************************* Dense solver. This subroutine solves a system A*x=b, where A is NxN non-denegerate real matrix given by its LU decomposition, x and b are real vectors. This is "slow-but-robust" version of the linear LU-based solver. Faster version is RMatrixLUSolveFast() function. Algorithm features: * automatic detection of degenerate cases * O(N^2) complexity * condition number estimation No iterative refinement is provided because exact form of original matrix is not known to subroutine. Use RMatrixSolve or RMatrixMixedSolve if you need iterative refinement. IMPORTANT: ! this function is NOT the most efficient linear solver provided ! by ALGLIB. It estimates condition number of linear system, ! which results in 10-15x performance penalty when compared ! with "fast" version which just calls triangular solver. ! ! This performance penalty is insignificant when compared with ! cost of large LU decomposition. However, if you call this ! function many times for the same left side, this overhead ! BECOMES significant. It also becomes significant for small- ! scale problems. ! ! In such cases we strongly recommend you to use faster solver, ! RMatrixLUSolveFast() function. INPUT PARAMETERS LUA - array[N,N], LU decomposition, RMatrixLU result P - array[N], pivots array, RMatrixLU result N - size of A B - array[N], right part OUTPUT PARAMETERS Info - return code: * -3 matrix is very badly conditioned or exactly singular. * -1 N<=0 was passed * 1 task is solved (but matrix A may be ill-conditioned, check R1/RInf parameters for condition numbers). Rep - additional report, following fields are set: * rep.r1 condition number in 1-norm * rep.rinf condition number in inf-norm X - array[N], it contains: * info>0 => solution * info=-3 => filled by zeros -- ALGLIB -- Copyright 27.01.2010 by Bochkanov Sergey *************************************************************************/ void rmatrixlusolve(const real_2d_array &lua, const integer_1d_array &p, const ae_int_t n, const real_1d_array &b, ae_int_t &info, densesolverreport &rep, real_1d_array &x); /************************************************************************* Dense solver. This subroutine solves a system A*x=b, where A is NxN non-denegerate real matrix given by its LU decomposition, x and b are real vectors. This is "fast-without-any-checks" version of the linear LU-based solver. Slower but more robust version is RMatrixLUSolve() function. 
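EDITORIAL EXAMPLE (added by the editor). The usual LU workflow is to factor A once with rmatrixlu() (assumed here from the LinAlg package) and then reuse the factors for several right parts, choosing between the robust and the fast back-substitution routines; values are illustrative:

    real_2d_array lua = "[[2,1],[1,3]]";   // overwritten by the LU factors
    integer_1d_array p;
    rmatrixlu(lua, 2, 2, p);               // LU decomposition, LinAlg package

    real_1d_array b = "[3,5]";
    real_1d_array x;
    ae_int_t info;
    densesolverreport rep;
    rmatrixlusolve(lua, p, 2, b, info, rep, x);   // robust variant: condition estimate in rep

    real_1d_array b2 = "[1,0]";
    rmatrixlusolvefast(lua, p, 2, b2, info);      // fast variant: solution overwrites b2
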
Algorithm features: * O(N^2) complexity * fast algorithm without ANY additional checks, just triangular solver INPUT PARAMETERS LUA - array[0..N-1,0..N-1], LU decomposition, RMatrixLU result P - array[0..N-1], pivots array, RMatrixLU result N - size of A B - array[0..N-1], right part OUTPUT PARAMETERS Info - return code: * -3 matrix is exactly singular (ill conditioned matrices are not recognized). X is filled by zeros in such cases. * -1 N<=0 was passed * 1 task is solved B - array[N]: * info>0 => overwritten by solution * info=-3 => filled by zeros -- ALGLIB -- Copyright 18.03.2015 by Bochkanov Sergey *************************************************************************/ void rmatrixlusolvefast(const real_2d_array &lua, const integer_1d_array &p, const ae_int_t n, const real_1d_array &b, ae_int_t &info); /************************************************************************* Dense solver. Similar to RMatrixLUSolve() but solves task with multiple right parts (where b and x are NxM matrices). This is "robust-but-slow" version of LU-based solver which performs additional checks for non-degeneracy of inputs (condition number estimation). If you need best performance, use "fast-without-any-checks" version, RMatrixLUSolveMFast(). Algorithm features: * automatic detection of degenerate cases * O(M*N^2) complexity * condition number estimation No iterative refinement is provided because exact form of original matrix is not known to subroutine. Use RMatrixSolve or RMatrixMixedSolve if you need iterative refinement. IMPORTANT: ! this function is NOT the most efficient linear solver provided ! by ALGLIB. It estimates condition number of linear system, ! which results in significant performance penalty when ! compared with "fast" version which just calls triangular ! solver. ! ! This performance penalty is especially apparent when you use ! ALGLIB parallel capabilities (condition number estimation is ! inherently sequential). It also becomes significant for ! small-scale problems. ! ! In such cases we strongly recommend you to use faster solver, ! RMatrixLUSolveMFast() function. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. Triangular solver is relatively easy to parallelize. ! However, parallelization will be efficient only for large number of ! right parts M. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! 
ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS LUA - array[N,N], LU decomposition, RMatrixLU result P - array[N], pivots array, RMatrixLU result N - size of A B - array[0..N-1,0..M-1], right part M - right part size OUTPUT PARAMETERS Info - return code: * -3 matrix is very badly conditioned or exactly singular. X is filled by zeros in such cases. * -1 N<=0 was passed * 1 task is solved (but matrix A may be ill-conditioned, check R1/RInf parameters for condition numbers). Rep - additional report, following fields are set: * rep.r1 condition number in 1-norm * rep.rinf condition number in inf-norm X - array[N,M], it contains: * info>0 => solution * info=-3 => filled by zeros -- ALGLIB -- Copyright 27.01.2010 by Bochkanov Sergey *************************************************************************/ void rmatrixlusolvem(const real_2d_array &lua, const integer_1d_array &p, const ae_int_t n, const real_2d_array &b, const ae_int_t m, ae_int_t &info, densesolverreport &rep, real_2d_array &x); void smp_rmatrixlusolvem(const real_2d_array &lua, const integer_1d_array &p, const ae_int_t n, const real_2d_array &b, const ae_int_t m, ae_int_t &info, densesolverreport &rep, real_2d_array &x); /************************************************************************* Dense solver. Similar to RMatrixLUSolve() but solves task with multiple right parts, where b and x are NxM matrices. This is "fast-without-any-checks" version of LU-based solver. It does not estimate condition number of a system, so it is extremely fast. If you need better detection of near-degenerate cases, use RMatrixLUSolveM() function. Algorithm features: * O(M*N^2) complexity * fast algorithm without ANY additional checks, just triangular solver COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. Triangular solver is relatively easy to parallelize. ! However, parallelization will be efficient only for large number of ! right parts M. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. 
INPUT PARAMETERS: LUA - array[0..N-1,0..N-1], LU decomposition, RMatrixLU result P - array[0..N-1], pivots array, RMatrixLU result N - size of A B - array[0..N-1,0..M-1], right part M - right part size OUTPUT PARAMETERS: Info - return code: * -3 matrix is exactly singular (ill conditioned matrices are not recognized). * -1 N<=0 was passed * 1 task is solved B - array[N,M]: * info>0 => overwritten by solution * info=-3 => filled by zeros -- ALGLIB -- Copyright 18.03.2015 by Bochkanov Sergey *************************************************************************/ void rmatrixlusolvemfast(const real_2d_array &lua, const integer_1d_array &p, const ae_int_t n, const real_2d_array &b, const ae_int_t m, ae_int_t &info); void smp_rmatrixlusolvemfast(const real_2d_array &lua, const integer_1d_array &p, const ae_int_t n, const real_2d_array &b, const ae_int_t m, ae_int_t &info); /************************************************************************* Dense solver. This subroutine solves a system A*x=b, where BOTH ORIGINAL A AND ITS LU DECOMPOSITION ARE KNOWN. You can use it if for some reasons you have both A and its LU decomposition. Algorithm features: * automatic detection of degenerate cases * condition number estimation * iterative refinement * O(N^2) complexity INPUT PARAMETERS A - array[0..N-1,0..N-1], system matrix LUA - array[0..N-1,0..N-1], LU decomposition, RMatrixLU result P - array[0..N-1], pivots array, RMatrixLU result N - size of A B - array[0..N-1], right part OUTPUT PARAMETERS Info - return code: * -3 matrix is very badly conditioned or exactly singular. * -1 N<=0 was passed * 1 task is solved (but matrix A may be ill-conditioned, check R1/RInf parameters for condition numbers). Rep - additional report, following fields are set: * rep.r1 condition number in 1-norm * rep.rinf condition number in inf-norm X - array[N], it contains: * info>0 => solution * info=-3 => filled by zeros -- ALGLIB -- Copyright 27.01.2010 by Bochkanov Sergey *************************************************************************/ void rmatrixmixedsolve(const real_2d_array &a, const real_2d_array &lua, const integer_1d_array &p, const ae_int_t n, const real_1d_array &b, ae_int_t &info, densesolverreport &rep, real_1d_array &x); /************************************************************************* Dense solver. Similar to RMatrixMixedSolve() but solves task with multiple right parts (where b and x are NxM matrices). Algorithm features: * automatic detection of degenerate cases * condition number estimation * iterative refinement * O(M*N^2) complexity INPUT PARAMETERS A - array[0..N-1,0..N-1], system matrix LUA - array[0..N-1,0..N-1], LU decomposition, RMatrixLU result P - array[0..N-1], pivots array, RMatrixLU result N - size of A B - array[0..N-1,0..M-1], right part M - right part size OUTPUT PARAMETERS Info - return code: * -3 matrix is very badly conditioned or exactly singular. * -1 N<=0 was passed * 1 task is solved (but matrix A may be ill-conditioned, check R1/RInf parameters for condition numbers). 
Rep - additional report, following fields are set: * rep.r1 condition number in 1-norm * rep.rinf condition number in inf-norm X - array[N,M], it contains: * info>0 => solution * info=-3 => filled by zeros -- ALGLIB -- Copyright 27.01.2010 by Bochkanov Sergey *************************************************************************/ void rmatrixmixedsolvem(const real_2d_array &a, const real_2d_array &lua, const integer_1d_array &p, const ae_int_t n, const real_2d_array &b, const ae_int_t m, ae_int_t &info, densesolverreport &rep, real_2d_array &x); /************************************************************************* Complex dense solver for A*X=B with N*N complex matrix A, N*M complex matrices X and B. "Slow-but-feature-rich" version which provides additional functions, at the cost of slower performance. Faster version may be invoked with CMatrixSolveMFast() function. Algorithm features: * automatic detection of degenerate cases * condition number estimation * iterative refinement * O(N^3+M*N^2) complexity IMPORTANT: ! this function is NOT the most efficient linear solver provided ! by ALGLIB. It estimates condition number of linear system ! and performs iterative refinement, which results in ! significant performance penalty when compared with "fast" ! version which just performs LU decomposition and calls ! triangular solver. ! ! This performance penalty is especially visible in the ! multithreaded mode, because both condition number estimation ! and iterative refinement are inherently sequential ! calculations. ! ! Thus, if you need high performance and if you are pretty sure ! that your system is well conditioned, we strongly recommend ! you to use faster solver, CMatrixSolveMFast() function. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that LU decomposition is harder to ! parallelize than, say, matrix-matrix product - this algorithm has ! many internal synchronization points which can not be avoided. However ! parallelism starts to be profitable starting from N=1024, achieving ! near-linear speedup for N=4096 or higher. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. 
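EDITORIAL EXAMPLE (added by the editor). A minimal sketch of a complex multiple-right-hand-side solve through the C++ interface; the matrix literal uses the complex array syntax accepted by complex_2d_array and is illustrative only:

    complex_2d_array a = "[[1i,-1],[1i,1]]";
    complex_2d_array b = "[[1,0],[0,1]]";    // two right-hand-side columns (M=2)
    complex_2d_array x;
    ae_int_t info;
    densesolverreport rep;

    cmatrixsolvem(a, 2, b, 2, true, info, rep, x);   // rfs=true enables iterative refinement
    // smp_cmatrixsolvem(...) is the multithreaded entry point in the commercial edition
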
INPUT PARAMETERS A - array[0..N-1,0..N-1], system matrix N - size of A B - array[0..N-1,0..M-1], right part M - right part size RFS - iterative refinement switch: * True - refinement is used. Less performance, more precision. * False - refinement is not used. More performance, less precision. OUTPUT PARAMETERS Info - return code: * -3 matrix is very badly conditioned or exactly singular. X is filled by zeros in such cases. * -1 N<=0 was passed * 1 task is solved (but matrix A may be ill-conditioned, check R1/RInf parameters for condition numbers). Rep - additional report, following fields are set: * rep.r1 condition number in 1-norm * rep.rinf condition number in inf-norm X - array[N,M], it contains: * info>0 => solution * info=-3 => filled by zeros -- ALGLIB -- Copyright 27.01.2010 by Bochkanov Sergey *************************************************************************/ void cmatrixsolvem(const complex_2d_array &a, const ae_int_t n, const complex_2d_array &b, const ae_int_t m, const bool rfs, ae_int_t &info, densesolverreport &rep, complex_2d_array &x); void smp_cmatrixsolvem(const complex_2d_array &a, const ae_int_t n, const complex_2d_array &b, const ae_int_t m, const bool rfs, ae_int_t &info, densesolverreport &rep, complex_2d_array &x); /************************************************************************* Complex dense solver for A*X=B with N*N complex matrix A, N*M complex matrices X and B. "Fast-but-lightweight" version which provides just triangular solver - and no additional functions like iterative refinement or condition number estimation. Algorithm features: * O(N^3+M*N^2) complexity * no additional time consuming functions COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that LU decomposition is harder to ! parallelize than, say, matrix-matrix product - this algorithm has ! many internal synchronization points which can not be avoided. However ! parallelism starts to be profitable starting from N=1024, achieving ! near-linear speedup for N=4096 or higher. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. 
INPUT PARAMETERS A - array[0..N-1,0..N-1], system matrix N - size of A B - array[0..N-1,0..M-1], right part M - right part size OUTPUT PARAMETERS: Info - return code: * -3 matrix is exactly singular (ill conditioned matrices are not recognized). * -1 N<=0 was passed * 1 task is solved B - array[N,M]: * info>0 => overwritten by solution * info=-3 => filled by zeros -- ALGLIB -- Copyright 16.03.2015 by Bochkanov Sergey *************************************************************************/ void cmatrixsolvemfast(const complex_2d_array &a, const ae_int_t n, const complex_2d_array &b, const ae_int_t m, ae_int_t &info); void smp_cmatrixsolvemfast(const complex_2d_array &a, const ae_int_t n, const complex_2d_array &b, const ae_int_t m, ae_int_t &info); /************************************************************************* Complex dense solver for A*x=B with N*N complex matrix A and N*1 complex vectors x and b. "Slow-but-feature-rich" version of the solver. Algorithm features: * automatic detection of degenerate cases * condition number estimation * iterative refinement * O(N^3) complexity IMPORTANT: ! this function is NOT the most efficient linear solver provided ! by ALGLIB. It estimates condition number of linear system ! and performs iterative refinement, which results in ! significant performance penalty when compared with "fast" ! version which just performs LU decomposition and calls ! triangular solver. ! ! This performance penalty is especially visible in the ! multithreaded mode, because both condition number estimation ! and iterative refinement are inherently sequential ! calculations. ! ! Thus, if you need high performance and if you are pretty sure ! that your system is well conditioned, we strongly recommend ! you to use faster solver, CMatrixSolveFast() function. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that LU decomposition is harder to ! parallelize than, say, matrix-matrix product - this algorithm has ! many internal synchronization points which can not be avoided. However ! parallelism starts to be profitable starting from N=1024, achieving ! near-linear speedup for N=4096 or higher. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. 
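EDITORIAL EXAMPLE (added by the editor). A minimal sketch showing how the return code and report of this solver are typically inspected; values are illustrative:

    complex_2d_array a = "[[1i,-1],[1i,1]]";
    complex_1d_array b = "[1,2]";
    complex_1d_array x;
    ae_int_t info;
    densesolverreport rep;

    cmatrixsolve(a, 2, b, info, rep, x);
    if( info==-3 )
    {
        // matrix reported as nearly singular: x is filled by zeros,
        // and rep.r1/rep.rinf indicate how badly conditioned the system is
    }
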
INPUT PARAMETERS A - array[0..N-1,0..N-1], system matrix N - size of A B - array[0..N-1], right part OUTPUT PARAMETERS Info - return code: * -3 matrix is very badly conditioned or exactly singular. * -1 N<=0 was passed * 1 task is solved (but matrix A may be ill-conditioned, check R1/RInf parameters for condition numbers). Rep - additional report, following fields are set: * rep.r1 condition number in 1-norm * rep.rinf condition number in inf-norm X - array[N], it contains: * info>0 => solution * info=-3 => filled by zeros -- ALGLIB -- Copyright 27.01.2010 by Bochkanov Sergey *************************************************************************/ void cmatrixsolve(const complex_2d_array &a, const ae_int_t n, const complex_1d_array &b, ae_int_t &info, densesolverreport &rep, complex_1d_array &x); void smp_cmatrixsolve(const complex_2d_array &a, const ae_int_t n, const complex_1d_array &b, ae_int_t &info, densesolverreport &rep, complex_1d_array &x); /************************************************************************* Complex dense solver for A*x=B with N*N complex matrix A and N*1 complex vectors x and b. "Fast-but-lightweight" version of the solver. Algorithm features: * O(N^3) complexity * no additional time consuming features, just triangular solver COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that LU decomposition is harder to ! parallelize than, say, matrix-matrix product - this algorithm has ! many internal synchronization points which can not be avoided. However ! parallelism starts to be profitable starting from N=1024, achieving ! near-linear speedup for N=4096 or higher. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS: A - array[0..N-1,0..N-1], system matrix N - size of A B - array[0..N-1], right part OUTPUT PARAMETERS: Info - return code: * -3 matrix is exactly singular (ill conditioned matrices are not recognized). 
* -1 N<=0 was passed * 1 task is solved B - array[N]: * info>0 => overwritten by solution * info=-3 => filled by zeros -- ALGLIB -- Copyright 27.01.2010 by Bochkanov Sergey *************************************************************************/ void cmatrixsolvefast(const complex_2d_array &a, const ae_int_t n, const complex_1d_array &b, ae_int_t &info); void smp_cmatrixsolvefast(const complex_2d_array &a, const ae_int_t n, const complex_1d_array &b, ae_int_t &info); /************************************************************************* Dense solver for A*X=B with N*N complex A given by its LU decomposition, and N*M matrices X and B (multiple right sides). "Slow-but-feature-rich" version of the solver. Algorithm features: * automatic detection of degenerate cases * O(M*N^2) complexity * condition number estimation No iterative refinement is provided because exact form of original matrix is not known to subroutine. Use CMatrixSolve or CMatrixMixedSolve if you need iterative refinement. IMPORTANT: ! this function is NOT the most efficient linear solver provided ! by ALGLIB. It estimates condition number of linear system, ! which results in significant performance penalty when ! compared with "fast" version which just calls triangular ! solver. ! ! This performance penalty is especially apparent when you use ! ALGLIB parallel capabilities (condition number estimation is ! inherently sequential). It also becomes significant for ! small-scale problems. ! ! In such cases we strongly recommend you to use faster solver, ! CMatrixLUSolveMFast() function. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. Triangular solver is relatively easy to parallelize. ! However, parallelization will be efficient only for large number of ! right parts M. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS LUA - array[0..N-1,0..N-1], LU decomposition, RMatrixLU result P - array[0..N-1], pivots array, RMatrixLU result N - size of A B - array[0..N-1,0..M-1], right part M - right part size OUTPUT PARAMETERS Info - return code: * -3 matrix is very badly conditioned or exactly singular. * -1 N<=0 was passed * 1 task is solved (but matrix A may be ill-conditioned, check R1/RInf parameters for condition numbers). 
Rep - additional report, following fields are set: * rep.r1 condition number in 1-norm * rep.rinf condition number in inf-norm X - array[N,M], it contains: * info>0 => solution * info=-3 => filled by zeros -- ALGLIB -- Copyright 27.01.2010 by Bochkanov Sergey *************************************************************************/ void cmatrixlusolvem(const complex_2d_array &lua, const integer_1d_array &p, const ae_int_t n, const complex_2d_array &b, const ae_int_t m, ae_int_t &info, densesolverreport &rep, complex_2d_array &x); void smp_cmatrixlusolvem(const complex_2d_array &lua, const integer_1d_array &p, const ae_int_t n, const complex_2d_array &b, const ae_int_t m, ae_int_t &info, densesolverreport &rep, complex_2d_array &x); /************************************************************************* Dense solver for A*X=B with N*N complex A given by its LU decomposition, and N*M matrices X and B (multiple right sides). "Fast-but-lightweight" version of the solver. Algorithm features: * O(M*N^2) complexity * no additional time-consuming features COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. Triangular solver is relatively easy to parallelize. ! However, parallelization will be efficient only for large number of ! right parts M. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS LUA - array[0..N-1,0..N-1], LU decomposition, RMatrixLU result P - array[0..N-1], pivots array, RMatrixLU result N - size of A B - array[0..N-1,0..M-1], right part M - right part size OUTPUT PARAMETERS Info - return code: * -3 matrix is exactly singular (ill conditioned matrices are not recognized). 
* -1 N<=0 was passed * 1 task is solved B - array[N,M]: * info>0 => overwritten by solution * info=-3 => filled by zeros -- ALGLIB -- Copyright 27.01.2010 by Bochkanov Sergey *************************************************************************/ void cmatrixlusolvemfast(const complex_2d_array &lua, const integer_1d_array &p, const ae_int_t n, const complex_2d_array &b, const ae_int_t m, ae_int_t &info); void smp_cmatrixlusolvemfast(const complex_2d_array &lua, const integer_1d_array &p, const ae_int_t n, const complex_2d_array &b, const ae_int_t m, ae_int_t &info); /************************************************************************* Complex dense linear solver for A*x=b with complex N*N A given by its LU decomposition and N*1 vectors x and b. This is "slow-but-robust" version of the complex linear solver with additional features which add significant performance overhead. Faster version is CMatrixLUSolveFast() function. Algorithm features: * automatic detection of degenerate cases * O(N^2) complexity * condition number estimation No iterative refinement is provided because exact form of original matrix is not known to subroutine. Use CMatrixSolve or CMatrixMixedSolve if you need iterative refinement. IMPORTANT: ! this function is NOT the most efficient linear solver provided ! by ALGLIB. It estimates condition number of linear system, ! which results in 10-15x performance penalty when compared ! with "fast" version which just calls triangular solver. ! ! This performance penalty is insignificant when compared with ! cost of large LU decomposition. However, if you call this ! function many times for the same left side, this overhead ! BECOMES significant. It also becomes significant for small- ! scale problems. ! ! In such cases we strongly recommend you to use faster solver, ! CMatrixLUSolveFast() function. INPUT PARAMETERS LUA - array[0..N-1,0..N-1], LU decomposition, CMatrixLU result P - array[0..N-1], pivots array, CMatrixLU result N - size of A B - array[0..N-1], right part OUTPUT PARAMETERS Info - return code: * -3 matrix is very badly conditioned or exactly singular. * -1 N<=0 was passed * 1 task is solved (but matrix A may be ill-conditioned, check R1/RInf parameters for condition numbers). Rep - additional report, following fields are set: * rep.r1 condition number in 1-norm * rep.rinf condition number in inf-norm X - array[N], it contains: * info>0 => solution * info=-3 => filled by zeros -- ALGLIB -- Copyright 27.01.2010 by Bochkanov Sergey *************************************************************************/ void cmatrixlusolve(const complex_2d_array &lua, const integer_1d_array &p, const ae_int_t n, const complex_1d_array &b, ae_int_t &info, densesolverreport &rep, complex_1d_array &x); /************************************************************************* Complex dense linear solver for A*x=b with N*N complex A given by its LU decomposition and N*1 vectors x and b. This is fast lightweight version of solver, which is significantly faster than CMatrixLUSolve(), but does not provide additional information (like condition numbers). Algorithm features: * O(N^2) complexity * no additional time-consuming features, just triangular solver INPUT PARAMETERS LUA - array[0..N-1,0..N-1], LU decomposition, CMatrixLU result P - array[0..N-1], pivots array, CMatrixLU result N - size of A B - array[0..N-1], right part OUTPUT PARAMETERS Info - return code: * -3 matrix is exactly singular (ill conditioned matrices are not recognized). 
* -1 N<=0 was passed * 1 task is solved B - array[N]: * info>0 => overwritten by solution * info=-3 => filled by zeros NOTE: unlike CMatrixLUSolve(), this function does NOT check for near-degeneracy of input matrix. It checks for EXACT degeneracy, because this check is easy to do. However, very badly conditioned matrices may go unnoticed. -- ALGLIB -- Copyright 27.01.2010 by Bochkanov Sergey *************************************************************************/ void cmatrixlusolvefast(const complex_2d_array &lua, const integer_1d_array &p, const ae_int_t n, const complex_1d_array &b, ae_int_t &info); /************************************************************************* Dense solver. Same as RMatrixMixedSolveM(), but for complex matrices. Algorithm features: * automatic detection of degenerate cases * condition number estimation * iterative refinement * O(M*N^2) complexity INPUT PARAMETERS A - array[0..N-1,0..N-1], system matrix LUA - array[0..N-1,0..N-1], LU decomposition, CMatrixLU result P - array[0..N-1], pivots array, CMatrixLU result N - size of A B - array[0..N-1,0..M-1], right part M - right part size OUTPUT PARAMETERS Info - return code: * -3 matrix is very badly conditioned or exactly singular. * -1 N<=0 was passed * 1 task is solved (but matrix A may be ill-conditioned, check R1/RInf parameters for condition numbers). Rep - additional report, following fields are set: * rep.r1 condition number in 1-norm * rep.rinf condition number in inf-norm X - array[N,M], it contains: * info>0 => solution * info=-3 => filled by zeros -- ALGLIB -- Copyright 27.01.2010 by Bochkanov Sergey *************************************************************************/ void cmatrixmixedsolvem(const complex_2d_array &a, const complex_2d_array &lua, const integer_1d_array &p, const ae_int_t n, const complex_2d_array &b, const ae_int_t m, ae_int_t &info, densesolverreport &rep, complex_2d_array &x); /************************************************************************* Dense solver. Same as RMatrixMixedSolve(), but for complex matrices. Algorithm features: * automatic detection of degenerate cases * condition number estimation * iterative refinement * O(N^2) complexity INPUT PARAMETERS A - array[0..N-1,0..N-1], system matrix LUA - array[0..N-1,0..N-1], LU decomposition, CMatrixLU result P - array[0..N-1], pivots array, CMatrixLU result N - size of A B - array[0..N-1], right part OUTPUT PARAMETERS Info - return code: * -3 matrix is very badly conditioned or exactly singular. * -1 N<=0 was passed * 1 task is solved (but matrix A may be ill-conditioned, check R1/RInf parameters for condition numbers). Rep - additional report, following fields are set: * rep.r1 condition number in 1-norm * rep.rinf condition number in inf-norm X - array[N], it contains: * info>0 => solution * info=-3 => filled by zeros -- ALGLIB -- Copyright 27.01.2010 by Bochkanov Sergey *************************************************************************/ void cmatrixmixedsolve(const complex_2d_array &a, const complex_2d_array &lua, const integer_1d_array &p, const ae_int_t n, const complex_1d_array &b, ae_int_t &info, densesolverreport &rep, complex_1d_array &x); /************************************************************************* Dense solver for A*X=B with N*N symmetric positive definite matrix A, and N*M vectors X and B. It is "slow-but-feature-rich" version of the solver.
Algorithm features: * automatic detection of degenerate cases * condition number estimation * O(N^3+M*N^2) complexity * matrix is represented by its upper or lower triangle No iterative refinement is provided because such partial representation of matrix does not allow efficient calculation of extra-precise matrix-vector products for large matrices. Use RMatrixSolve or RMatrixMixedSolve if you need iterative refinement. IMPORTANT: ! this function is NOT the most efficient linear solver provided ! by ALGLIB. It estimates condition number of linear system, ! which results in significant performance penalty when ! compared with "fast" version which just performs Cholesky ! decomposition and calls triangular solver. ! ! This performance penalty is especially visible in the ! multithreaded mode, because both condition number estimation ! and iterative refinement are inherently sequential ! calculations. ! ! Thus, if you need high performance and if you are pretty sure ! that your system is well conditioned, we strongly recommend ! you to use faster solver, SPDMatrixSolveMFast() function. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that Cholesky decomposition is harder ! to parallelize than, say, matrix-matrix product - this algorithm has ! several synchronization points which can not be avoided. However, ! parallelism starts to be profitable starting from N=500. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS A - array[0..N-1,0..N-1], system matrix N - size of A IsUpper - what half of A is provided B - array[0..N-1,0..M-1], right part M - right part size OUTPUT PARAMETERS Info - return code: * -3 matrix is very badly conditioned or non-SPD. * -1 N<=0 was passed * 1 task is solved (but matrix A may be ill-conditioned, check R1/RInf parameters for condition numbers). 
Rep - additional report, following fields are set: * rep.r1 condition number in 1-norm * rep.rinf condition number in inf-norm X - array[N,M], it contains: * info>0 => solution * info=-3 => filled by zeros -- ALGLIB -- Copyright 27.01.2010 by Bochkanov Sergey *************************************************************************/ void spdmatrixsolvem(const real_2d_array &a, const ae_int_t n, const bool isupper, const real_2d_array &b, const ae_int_t m, ae_int_t &info, densesolverreport &rep, real_2d_array &x); void smp_spdmatrixsolvem(const real_2d_array &a, const ae_int_t n, const bool isupper, const real_2d_array &b, const ae_int_t m, ae_int_t &info, densesolverreport &rep, real_2d_array &x); /************************************************************************* Dense solver for A*X=B with N*N symmetric positive definite matrix A, and N*M vectors X and B. It is "fast-but-lightweight" version of the solver. Algorithm features: * O(N^3+M*N^2) complexity * matrix is represented by its upper or lower triangle * no additional time consuming features COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that Cholesky decomposition is harder ! to parallelize than, say, matrix-matrix product - this algorithm has ! several synchronization points which can not be avoided. However, ! parallelism starts to be profitable starting from N=500. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. 
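EXAMPLE: the short sketch below shows one way this solver may be called for a small SPD system with two right parts (the 2x2 data values and the solvers.h header name are illustrative assumptions, not part of the function specification); see the parameter descriptions which follow.

#include <cstdio>
#include "solvers.h"
using namespace alglib;

int main()
{
    // 2x2 SPD matrix; only the upper triangle is referenced when IsUpper=true
    real_2d_array a = "[[4,1],[1,3]]";
    // two right parts stored as columns of B
    real_2d_array b = "[[1,5],[2,6]]";
    ae_int_t info;
    spdmatrixsolvemfast(a, 2, true, b, 2, info);
    if( info>0 )
        printf("%s\n", b.tostring(3).c_str());   // on success B is overwritten by the solution
    return 0;
}
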
INPUT PARAMETERS A - array[0..N-1,0..N-1], system matrix N - size of A IsUpper - what half of A is provided B - array[0..N-1,0..M-1], right part M - right part size OUTPUT PARAMETERS Info - return code: * -3 A is is exactly singular * -1 N<=0 was passed * 1 task was solved B - array[N,M], it contains: * info>0 => solution * info=-3 => filled by zeros -- ALGLIB -- Copyright 17.03.2015 by Bochkanov Sergey *************************************************************************/ void spdmatrixsolvemfast(const real_2d_array &a, const ae_int_t n, const bool isupper, const real_2d_array &b, const ae_int_t m, ae_int_t &info); void smp_spdmatrixsolvemfast(const real_2d_array &a, const ae_int_t n, const bool isupper, const real_2d_array &b, const ae_int_t m, ae_int_t &info); /************************************************************************* Dense linear solver for A*x=b with N*N real symmetric positive definite matrix A, N*1 vectors x and b. "Slow-but-feature-rich" version of the solver. Algorithm features: * automatic detection of degenerate cases * condition number estimation * O(N^3) complexity * matrix is represented by its upper or lower triangle No iterative refinement is provided because such partial representation of matrix does not allow efficient calculation of extra-precise matrix-vector products for large matrices. Use RMatrixSolve or RMatrixMixedSolve if you need iterative refinement. IMPORTANT: ! this function is NOT the most efficient linear solver provided ! by ALGLIB. It estimates condition number of linear system, ! which results in significant performance penalty when ! compared with "fast" version which just performs Cholesky ! decomposition and calls triangular solver. ! ! This performance penalty is especially visible in the ! multithreaded mode, because both condition number estimation ! and iterative refinement are inherently sequential ! calculations. ! ! Thus, if you need high performance and if you are pretty sure ! that your system is well conditioned, we strongly recommend ! you to use faster solver, SPDMatrixSolveFast() function. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that Cholesky decomposition is harder ! to parallelize than, say, matrix-matrix product - this algorithm has ! several synchronization points which can not be avoided. However, ! parallelism starts to be profitable starting from N=500. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! 
We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS A - array[0..N-1,0..N-1], system matrix N - size of A IsUpper - what half of A is provided B - array[0..N-1], right part OUTPUT PARAMETERS Info - return code: * -3 matrix is very badly conditioned or non-SPD. * -1 N<=0 was passed * 1 task is solved (but matrix A may be ill-conditioned, check R1/RInf parameters for condition numbers). Rep - additional report, following fields are set: * rep.r1 condition number in 1-norm * rep.rinf condition number in inf-norm X - array[N], it contains: * info>0 => solution * info=-3 => filled by zeros -- ALGLIB -- Copyright 27.01.2010 by Bochkanov Sergey *************************************************************************/ void spdmatrixsolve(const real_2d_array &a, const ae_int_t n, const bool isupper, const real_1d_array &b, ae_int_t &info, densesolverreport &rep, real_1d_array &x); void smp_spdmatrixsolve(const real_2d_array &a, const ae_int_t n, const bool isupper, const real_1d_array &b, ae_int_t &info, densesolverreport &rep, real_1d_array &x); /************************************************************************* Dense linear solver for A*x=b with N*N real symmetric positive definite matrix A, N*1 vectors x and b. "Fast-but-lightweight" version of the solver. Algorithm features: * O(N^3) complexity * matrix is represented by its upper or lower triangle * no additional time consuming features like condition number estimation COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that Cholesky decomposition is harder ! to parallelize than, say, matrix-matrix product - this algorithm has ! several synchronization points which can not be avoided. However, ! parallelism starts to be profitable starting from N=500. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. 
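EXAMPLE: a minimal single-right-part sketch (the data values and the solvers.h header name are illustrative assumptions); the parameter descriptions follow below. If condition numbers are needed, spdmatrixsolve() documented above additionally fills rep.r1/rep.rinf.

#include <cstdio>
#include "solvers.h"
using namespace alglib;

int main()
{
    // 2x2 SPD system A*x=b, upper triangle of A is used
    real_2d_array a = "[[4,1],[1,3]]";
    real_1d_array b = "[1,2]";
    ae_int_t info;
    spdmatrixsolvefast(a, 2, true, b, info);
    if( info>0 )
        printf("%s\n", b.tostring(3).c_str());   // B now holds the solution x
    return 0;
}
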
INPUT PARAMETERS A - array[0..N-1,0..N-1], system matrix N - size of A IsUpper - what half of A is provided B - array[0..N-1], right part OUTPUT PARAMETERS Info - return code: * -3 A is is exactly singular or non-SPD * -1 N<=0 was passed * 1 task was solved B - array[N], it contains: * info>0 => solution * info=-3 => filled by zeros -- ALGLIB -- Copyright 17.03.2015 by Bochkanov Sergey *************************************************************************/ void spdmatrixsolvefast(const real_2d_array &a, const ae_int_t n, const bool isupper, const real_1d_array &b, ae_int_t &info); void smp_spdmatrixsolvefast(const real_2d_array &a, const ae_int_t n, const bool isupper, const real_1d_array &b, ae_int_t &info); /************************************************************************* Dense solver for A*X=B with N*N symmetric positive definite matrix A given by its Cholesky decomposition, and N*M vectors X and B. It is "slow-but- feature-rich" version of the solver which estimates condition number of the system. Algorithm features: * automatic detection of degenerate cases * O(M*N^2) complexity * condition number estimation * matrix is represented by its upper or lower triangle No iterative refinement is provided because such partial representation of matrix does not allow efficient calculation of extra-precise matrix-vector products for large matrices. Use RMatrixSolve or RMatrixMixedSolve if you need iterative refinement. IMPORTANT: ! this function is NOT the most efficient linear solver provided ! by ALGLIB. It estimates condition number of linear system, ! which results in significant performance penalty when ! compared with "fast" version which just calls triangular ! solver. Amount of overhead introduced depends on M (the ! larger - the more efficient). ! ! This performance penalty is insignificant when compared with ! cost of large LU decomposition. However, if you call this ! function many times for the same left side, this overhead ! BECOMES significant. It also becomes significant for small- ! scale problems (N<50). ! ! In such cases we strongly recommend you to use faster solver, ! SPDMatrixCholeskySolveMFast() function. INPUT PARAMETERS CHA - array[0..N-1,0..N-1], Cholesky decomposition, SPDMatrixCholesky result N - size of CHA IsUpper - what half of CHA is provided B - array[0..N-1,0..M-1], right part M - right part size OUTPUT PARAMETERS Info - return code: * -3 A is is exactly singular or badly conditioned X is filled by zeros in such cases. * -1 N<=0 was passed * 1 task was solved Rep - additional report, following fields are set: * rep.r1 condition number in 1-norm * rep.rinf condition number in inf-norm X - array[N]: * for info>0 contains solution * for info=-3 filled by zeros -- ALGLIB -- Copyright 27.01.2010 by Bochkanov Sergey *************************************************************************/ void spdmatrixcholeskysolvem(const real_2d_array &cha, const ae_int_t n, const bool isupper, const real_2d_array &b, const ae_int_t m, ae_int_t &info, densesolverreport &rep, real_2d_array &x); void smp_spdmatrixcholeskysolvem(const real_2d_array &cha, const ae_int_t n, const bool isupper, const real_2d_array &b, const ae_int_t m, ae_int_t &info, densesolverreport &rep, real_2d_array &x); /************************************************************************* Dense solver for A*X=B with N*N symmetric positive definite matrix A given by its Cholesky decomposition, and N*M vectors X and B. 
It is "fast-but- lightweight" version of the solver which just solves linear system, without any additional functions. Algorithm features: * O(M*N^2) complexity * matrix is represented by its upper or lower triangle * no additional functionality INPUT PARAMETERS CHA - array[N,N], Cholesky decomposition, SPDMatrixCholesky result N - size of CHA IsUpper - what half of CHA is provided B - array[N,M], right part M - right part size OUTPUT PARAMETERS Info - return code: * -3 A is is exactly singular or badly conditioned X is filled by zeros in such cases. * -1 N<=0 was passed * 1 task was solved B - array[N]: * for info>0 overwritten by solution * for info=-3 filled by zeros -- ALGLIB -- Copyright 18.03.2015 by Bochkanov Sergey *************************************************************************/ void spdmatrixcholeskysolvemfast(const real_2d_array &cha, const ae_int_t n, const bool isupper, const real_2d_array &b, const ae_int_t m, ae_int_t &info); void smp_spdmatrixcholeskysolvemfast(const real_2d_array &cha, const ae_int_t n, const bool isupper, const real_2d_array &b, const ae_int_t m, ae_int_t &info); /************************************************************************* Dense solver for A*x=b with N*N symmetric positive definite matrix A given by its Cholesky decomposition, and N*1 real vectors x and b. This is "slow- but-feature-rich" version of the solver which, in addition to the solution, performs condition number estimation. Algorithm features: * automatic detection of degenerate cases * O(N^2) complexity * condition number estimation * matrix is represented by its upper or lower triangle No iterative refinement is provided because such partial representation of matrix does not allow efficient calculation of extra-precise matrix-vector products for large matrices. Use RMatrixSolve or RMatrixMixedSolve if you need iterative refinement. IMPORTANT: ! this function is NOT the most efficient linear solver provided ! by ALGLIB. It estimates condition number of linear system, ! which results in 10-15x performance penalty when compared ! with "fast" version which just calls triangular solver. ! ! This performance penalty is insignificant when compared with ! cost of large LU decomposition. However, if you call this ! function many times for the same left side, this overhead ! BECOMES significant. It also becomes significant for small- ! scale problems (N<50). ! ! In such cases we strongly recommend you to use faster solver, ! SPDMatrixCholeskySolveFast() function. INPUT PARAMETERS CHA - array[N,N], Cholesky decomposition, SPDMatrixCholesky result N - size of A IsUpper - what half of CHA is provided B - array[N], right part OUTPUT PARAMETERS Info - return code: * -3 A is is exactly singular or ill conditioned X is filled by zeros in such cases. * -1 N<=0 was passed * 1 task is solved Rep - additional report, following fields are set: * rep.r1 condition number in 1-norm * rep.rinf condition number in inf-norm X - array[N]: * for info>0 - solution * for info=-3 - filled by zeros -- ALGLIB -- Copyright 27.01.2010 by Bochkanov Sergey *************************************************************************/ void spdmatrixcholeskysolve(const real_2d_array &cha, const ae_int_t n, const bool isupper, const real_1d_array &b, ae_int_t &info, densesolverreport &rep, real_1d_array &x); /************************************************************************* Dense solver for A*x=b with N*N symmetric positive definite matrix A given by its Cholesky decomposition, and N*1 real vectors x and b. 
This is "fast- but-lightweight" version of the solver. Algorithm features: * O(N^2) complexity * matrix is represented by its upper or lower triangle * no additional features INPUT PARAMETERS CHA - array[N,N], Cholesky decomposition, SPDMatrixCholesky result N - size of A IsUpper - what half of CHA is provided B - array[N], right part OUTPUT PARAMETERS Info - return code: * -3 A is is exactly singular or ill conditioned X is filled by zeros in such cases. * -1 N<=0 was passed * 1 task is solved B - array[N]: * for info>0 - overwritten by solution * for info=-3 - filled by zeros -- ALGLIB -- Copyright 27.01.2010 by Bochkanov Sergey *************************************************************************/ void spdmatrixcholeskysolvefast(const real_2d_array &cha, const ae_int_t n, const bool isupper, const real_1d_array &b, ae_int_t &info); /************************************************************************* Dense solver for A*X=B, with N*N Hermitian positive definite matrix A and N*M complex matrices X and B. "Slow-but-feature-rich" version of the solver. Algorithm features: * automatic detection of degenerate cases * condition number estimation * O(N^3+M*N^2) complexity * matrix is represented by its upper or lower triangle No iterative refinement is provided because such partial representation of matrix does not allow efficient calculation of extra-precise matrix-vector products for large matrices. Use RMatrixSolve or RMatrixMixedSolve if you need iterative refinement. IMPORTANT: ! this function is NOT the most efficient linear solver provided ! by ALGLIB. It estimates condition number of linear system, ! which results in significant performance penalty when ! compared with "fast" version which just calls triangular ! solver. ! ! This performance penalty is especially apparent when you use ! ALGLIB parallel capabilities (condition number estimation is ! inherently sequential). It also becomes significant for ! small-scale problems (N<100). ! ! In such cases we strongly recommend you to use faster solver, ! HPDMatrixSolveMFast() function. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that Cholesky decomposition is harder ! to parallelize than, say, matrix-matrix product - this algorithm has ! several synchronization points which can not be avoided. However, ! parallelism starts to be profitable starting from N=500. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! 
We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS A - array[0..N-1,0..N-1], system matrix N - size of A IsUpper - what half of A is provided B - array[0..N-1,0..M-1], right part M - right part size OUTPUT PARAMETERS Info - same as in RMatrixSolve. Returns -3 for non-HPD matrices. Rep - same as in RMatrixSolve X - same as in RMatrixSolve -- ALGLIB -- Copyright 27.01.2010 by Bochkanov Sergey *************************************************************************/ void hpdmatrixsolvem(const complex_2d_array &a, const ae_int_t n, const bool isupper, const complex_2d_array &b, const ae_int_t m, ae_int_t &info, densesolverreport &rep, complex_2d_array &x); void smp_hpdmatrixsolvem(const complex_2d_array &a, const ae_int_t n, const bool isupper, const complex_2d_array &b, const ae_int_t m, ae_int_t &info, densesolverreport &rep, complex_2d_array &x); /************************************************************************* Dense solver for A*X=B, with N*N Hermitian positive definite matrix A and N*M complex matrices X and B. "Fast-but-lightweight" version of the solver. Algorithm features: * O(N^3+M*N^2) complexity * matrix is represented by its upper or lower triangle * no additional time consuming features like condition number estimation COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that Cholesky decomposition is harder ! to parallelize than, say, matrix-matrix product - this algorithm has ! several synchronization points which can not be avoided. However, ! parallelism starts to be profitable starting from N=500. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS A - array[0..N-1,0..N-1], system matrix N - size of A IsUpper - what half of A is provided B - array[0..N-1,0..M-1], right part M - right part size OUTPUT PARAMETERS Info - return code: * -3 A is is exactly singular or is not positive definite. B is filled by zeros in such cases. 
* -1 N<=0 was passed * 1 task is solved B - array[0..N-1]: * overwritten by solution * zeros, if problem was not solved -- ALGLIB -- Copyright 17.03.2015 by Bochkanov Sergey *************************************************************************/ void hpdmatrixsolvemfast(const complex_2d_array &a, const ae_int_t n, const bool isupper, const complex_2d_array &b, const ae_int_t m, ae_int_t &info); void smp_hpdmatrixsolvemfast(const complex_2d_array &a, const ae_int_t n, const bool isupper, const complex_2d_array &b, const ae_int_t m, ae_int_t &info); /************************************************************************* Dense solver for A*x=b, with N*N Hermitian positive definite matrix A, and N*1 complex vectors x and b. "Slow-but-feature-rich" version of the solver. Algorithm features: * automatic detection of degenerate cases * condition number estimation * O(N^3) complexity * matrix is represented by its upper or lower triangle No iterative refinement is provided because such partial representation of matrix does not allow efficient calculation of extra-precise matrix-vector products for large matrices. Use RMatrixSolve or RMatrixMixedSolve if you need iterative refinement. IMPORTANT: ! this function is NOT the most efficient linear solver provided ! by ALGLIB. It estimates condition number of linear system, ! which results in significant performance penalty when ! compared with "fast" version which just performs Cholesky ! decomposition and calls triangular solver. ! ! This performance penalty is especially visible in the ! multithreaded mode, because both condition number estimation ! and iterative refinement are inherently sequential ! calculations. ! ! Thus, if you need high performance and if you are pretty sure ! that your system is well conditioned, we strongly recommend ! you to use faster solver, HPDMatrixSolveFast() function. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that Cholesky decomposition is harder ! to parallelize than, say, matrix-matrix product - this algorithm has ! several synchronization points which can not be avoided. However, ! parallelism starts to be profitable starting from N=500. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. 
INPUT PARAMETERS A - array[0..N-1,0..N-1], system matrix N - size of A IsUpper - what half of A is provided B - array[0..N-1], right part OUTPUT PARAMETERS Info - same as in RMatrixSolve Returns -3 for non-HPD matrices. Rep - same as in RMatrixSolve X - same as in RMatrixSolve -- ALGLIB -- Copyright 27.01.2010 by Bochkanov Sergey *************************************************************************/ void hpdmatrixsolve(const complex_2d_array &a, const ae_int_t n, const bool isupper, const complex_1d_array &b, ae_int_t &info, densesolverreport &rep, complex_1d_array &x); void smp_hpdmatrixsolve(const complex_2d_array &a, const ae_int_t n, const bool isupper, const complex_1d_array &b, ae_int_t &info, densesolverreport &rep, complex_1d_array &x); /************************************************************************* Dense solver for A*x=b, with N*N Hermitian positive definite matrix A, and N*1 complex vectors x and b. "Fast-but-lightweight" version of the solver without additional functions. Algorithm features: * O(N^3) complexity * matrix is represented by its upper or lower triangle * no additional time consuming functions COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that Cholesky decomposition is harder ! to parallelize than, say, matrix-matrix product - this algorithm has ! several synchronization points which can not be avoided. However, ! parallelism starts to be profitable starting from N=500. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS A - array[0..N-1,0..N-1], system matrix N - size of A IsUpper - what half of A is provided B - array[0..N-1], right part OUTPUT PARAMETERS Info - return code: * -3 A is is exactly singular or not positive definite X is filled by zeros in such cases. * -1 N<=0 was passed * 1 task was solved B - array[0..N-1]: * overwritten by solution * zeros, if A is exactly singular (diagonal of its LU decomposition has exact zeros). 
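EXAMPLE: a minimal sketch of how this solver may be called for a small Hermitian positive definite system (the 2x2 data values and the solvers.h header name are illustrative assumptions).

#include <cstdio>
#include "solvers.h"
using namespace alglib;

int main()
{
    // 2x2 Hermitian positive definite matrix; IsUpper=true => upper triangle is used
    complex_2d_array a;
    a.setlength(2, 2);
    a[0][0] = 3.0;
    a[0][1] = alglib::complex(1.0, -1.0);
    a[1][0] = alglib::complex(1.0,  1.0);    // lower triangle is ignored, filled only for clarity
    a[1][1] = 4.0;
    complex_1d_array b;
    b.setlength(2);
    b[0] = 1.0;
    b[1] = alglib::complex(2.0, -1.0);
    ae_int_t info;
    hpdmatrixsolvefast(a, 2, true, b, info);
    if( info>0 )
        printf("%s\n", b.tostring(3).c_str());   // on success B is overwritten by the solution
    return 0;
}
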
-- ALGLIB -- Copyright 17.03.2015 by Bochkanov Sergey *************************************************************************/ void hpdmatrixsolvefast(const complex_2d_array &a, const ae_int_t n, const bool isupper, const complex_1d_array &b, ae_int_t &info); void smp_hpdmatrixsolvefast(const complex_2d_array &a, const ae_int_t n, const bool isupper, const complex_1d_array &b, ae_int_t &info); /************************************************************************* Dense solver for A*X=B with N*N Hermitian positive definite matrix A given by its Cholesky decomposition and N*M complex matrices X and B. This is "slow-but-feature-rich" version of the solver which, in addition to the solution, estimates condition number of the system. Algorithm features: * automatic detection of degenerate cases * O(M*N^2) complexity * condition number estimation * matrix is represented by its upper or lower triangle No iterative refinement is provided because such partial representation of matrix does not allow efficient calculation of extra-precise matrix-vector products for large matrices. Use RMatrixSolve or RMatrixMixedSolve if you need iterative refinement. IMPORTANT: ! this function is NOT the most efficient linear solver provided ! by ALGLIB. It estimates condition number of linear system, ! which results in significant performance penalty when ! compared with "fast" version which just calls triangular ! solver. Amount of overhead introduced depends on M (the ! larger - the more efficient). ! ! This performance penalty is insignificant when compared with ! cost of large Cholesky decomposition. However, if you call ! this function many times for the same left side, this ! overhead BECOMES significant. It also becomes significant ! for small-scale problems (N<50). ! ! In such cases we strongly recommend you to use faster solver, ! HPDMatrixCholeskySolveMFast() function. INPUT PARAMETERS CHA - array[N,N], Cholesky decomposition, HPDMatrixCholesky result N - size of CHA IsUpper - what half of CHA is provided B - array[N,M], right part M - right part size OUTPUT PARAMETERS: Info - return code: * -3 A is singular, or VERY close to singular. X is filled by zeros in such cases. * -1 N<=0 was passed * 1 task was solved Rep - additional report, following fields are set: * rep.r1 condition number in 1-norm * rep.rinf condition number in inf-norm X - array[N]: * for info>0 contains solution * for info=-3 filled by zeros -- ALGLIB -- Copyright 27.01.2010 by Bochkanov Sergey *************************************************************************/ void hpdmatrixcholeskysolvem(const complex_2d_array &cha, const ae_int_t n, const bool isupper, const complex_2d_array &b, const ae_int_t m, ae_int_t &info, densesolverreport &rep, complex_2d_array &x); void smp_hpdmatrixcholeskysolvem(const complex_2d_array &cha, const ae_int_t n, const bool isupper, const complex_2d_array &b, const ae_int_t m, ae_int_t &info, densesolverreport &rep, complex_2d_array &x); /************************************************************************* Dense solver for A*X=B with N*N Hermitian positive definite matrix A given by its Cholesky decomposition and N*M complex matrices X and B. This is "fast-but-lightweight" version of the solver. 
Algorithm features: * O(M*N^2) complexity * matrix is represented by its upper or lower triangle * no additional time-consuming features INPUT PARAMETERS CHA - array[N,N], Cholesky decomposition, HPDMatrixCholesky result N - size of CHA IsUpper - what half of CHA is provided B - array[N,M], right part M - right part size OUTPUT PARAMETERS: Info - return code: * -3 A is singular, or VERY close to singular. X is filled by zeros in such cases. * -1 N<=0 was passed * 1 task was solved B - array[N]: * for info>0 overwritten by solution * for info=-3 filled by zeros -- ALGLIB -- Copyright 18.03.2015 by Bochkanov Sergey *************************************************************************/ void hpdmatrixcholeskysolvemfast(const complex_2d_array &cha, const ae_int_t n, const bool isupper, const complex_2d_array &b, const ae_int_t m, ae_int_t &info); void smp_hpdmatrixcholeskysolvemfast(const complex_2d_array &cha, const ae_int_t n, const bool isupper, const complex_2d_array &b, const ae_int_t m, ae_int_t &info); /************************************************************************* Dense solver for A*x=b with N*N Hermitian positive definite matrix A given by its Cholesky decomposition, and N*1 complex vectors x and b. This is "slow-but-feature-rich" version of the solver which estimates condition number of the system. Algorithm features: * automatic detection of degenerate cases * O(N^2) complexity * condition number estimation * matrix is represented by its upper or lower triangle No iterative refinement is provided because such partial representation of matrix does not allow efficient calculation of extra-precise matrix-vector products for large matrices. Use RMatrixSolve or RMatrixMixedSolve if you need iterative refinement. IMPORTANT: ! this function is NOT the most efficient linear solver provided ! by ALGLIB. It estimates condition number of linear system, ! which results in 10-15x performance penalty when compared ! with "fast" version which just calls triangular solver. ! ! This performance penalty is insignificant when compared with ! cost of large LU decomposition. However, if you call this ! function many times for the same left side, this overhead ! BECOMES significant. It also becomes significant for small- ! scale problems (N<50). ! ! In such cases we strongly recommend you to use faster solver, ! HPDMatrixCholeskySolveFast() function. INPUT PARAMETERS CHA - array[0..N-1,0..N-1], Cholesky decomposition, SPDMatrixCholesky result N - size of A IsUpper - what half of CHA is provided B - array[0..N-1], right part OUTPUT PARAMETERS Info - return code: * -3 A is is exactly singular or ill conditioned X is filled by zeros in such cases. * -1 N<=0 was passed * 1 task is solved Rep - additional report, following fields are set: * rep.r1 condition number in 1-norm * rep.rinf condition number in inf-norm X - array[N]: * for info>0 - solution * for info=-3 - filled by zeros -- ALGLIB -- Copyright 27.01.2010 by Bochkanov Sergey *************************************************************************/ void hpdmatrixcholeskysolve(const complex_2d_array &cha, const ae_int_t n, const bool isupper, const complex_1d_array &b, ae_int_t &info, densesolverreport &rep, complex_1d_array &x); /************************************************************************* Dense solver for A*x=b with N*N Hermitian positive definite matrix A given by its Cholesky decomposition, and N*1 complex vectors x and b. This is "fast-but-lightweight" version of the solver. 
Algorithm features: * O(N^2) complexity * matrix is represented by its upper or lower triangle * no additional time-consuming features INPUT PARAMETERS CHA - array[0..N-1,0..N-1], Cholesky decomposition, SPDMatrixCholesky result N - size of A IsUpper - what half of CHA is provided B - array[0..N-1], right part OUTPUT PARAMETERS Info - return code: * -3 A is is exactly singular or ill conditioned B is filled by zeros in such cases. * -1 N<=0 was passed * 1 task is solved B - array[N]: * for info>0 - overwritten by solution * for info=-3 - filled by zeros -- ALGLIB -- Copyright 18.03.2015 by Bochkanov Sergey *************************************************************************/ void hpdmatrixcholeskysolvefast(const complex_2d_array &cha, const ae_int_t n, const bool isupper, const complex_1d_array &b, ae_int_t &info); /************************************************************************* Dense solver. This subroutine finds solution of the linear system A*X=B with non-square, possibly degenerate A. System is solved in the least squares sense, and general least squares solution X = X0 + CX*y which minimizes |A*X-B| is returned. If A is non-degenerate, solution in the usual sense is returned. Algorithm features: * automatic detection (and correct handling!) of degenerate cases * iterative refinement * O(N^3) complexity COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes one important improvement of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Generally, commercial ALGLIB is several times faster than open-source ! generic C edition, and many times faster than open-source C# edition. ! ! Multithreaded acceleration is only partially supported (some parts are ! optimized, but most - are not). ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS A - array[0..NRows-1,0..NCols-1], system matrix NRows - vertical size of A NCols - horizontal size of A B - array[0..NCols-1], right part Threshold- a number in [0,1]. Singular values beyond Threshold are considered zero. Set it to 0.0, if you don't understand what it means, so the solver will choose good value on its own. OUTPUT PARAMETERS Info - return code: * -4 SVD subroutine failed * -1 if NRows<=0 or NCols<=0 or Threshold<0 was passed * 1 if task is solved Rep - solver report, see below for more info X - array[0..N-1,0..M-1], it contains: * solution of A*X=B (even for singular A) * zeros, if SVD subroutine failed SOLVER REPORT Subroutine sets following fields of the Rep structure: * R2 reciprocal of condition number: 1/cond(A), 2-norm. * N = NCols * K dim(Null(A)) * CX array[0..N-1,0..K-1], kernel of A. Columns of CX store such vectors that A*CX[i]=0. 
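EXAMPLE: a minimal least squares sketch for an overdetermined system (the 3x2 data values and the solvers.h header name are illustrative assumptions).

#include <cstdio>
#include "solvers.h"
using namespace alglib;

int main()
{
    // overdetermined 3x2 system, solved in the least squares sense
    real_2d_array a = "[[1,1],[1,2],[1,3]]";
    real_1d_array b = "[1,2,4]";
    ae_int_t info;
    densesolverlsreport rep;
    real_1d_array x;
    rmatrixsolvels(a, 3, 2, b, 0.0, info, rep, x);   // Threshold=0.0 lets the solver pick a good value
    if( info>0 )
        printf("x = %s\n", x.tostring(4).c_str());   // rep.r2 holds 1/cond(A) in 2-norm
    return 0;
}
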
-- ALGLIB -- Copyright 24.08.2009 by Bochkanov Sergey *************************************************************************/ void rmatrixsolvels(const real_2d_array &a, const ae_int_t nrows, const ae_int_t ncols, const real_1d_array &b, const double threshold, ae_int_t &info, densesolverlsreport &rep, real_1d_array &x); void smp_rmatrixsolvels(const real_2d_array &a, const ae_int_t nrows, const ae_int_t ncols, const real_1d_array &b, const double threshold, ae_int_t &info, densesolverlsreport &rep, real_1d_array &x); /************************************************************************* This function initializes linear LSQR Solver. This solver is used to solve non-symmetric (and, possibly, non-square) problems. Least squares solution is returned for non-compatible systems. USAGE: 1. User initializes algorithm state with LinLSQRCreate() call 2. User tunes solver parameters with LinLSQRSetCond() and other functions 3. User calls LinLSQRSolveSparse() function which takes algorithm state and SparseMatrix object. 4. User calls LinLSQRResults() to get solution 5. Optionally, user may call LinLSQRSolveSparse() again to solve another problem with different matrix and/or right part without reinitializing LinLSQRState structure. INPUT PARAMETERS: M - number of rows in A N - number of variables, N>0 OUTPUT PARAMETERS: State - structure which stores algorithm state -- ALGLIB -- Copyright 30.11.2011 by Bochkanov Sergey *************************************************************************/ void linlsqrcreate(const ae_int_t m, const ae_int_t n, linlsqrstate &state); /************************************************************************* This function changes preconditioning settings of LinLSQRSolveSparse() function. By default, SolveSparse() uses diagonal preconditioner, but if you want to use solver without preconditioning, you can call this function which forces solver to use unit matrix for preconditioning. INPUT PARAMETERS: State - structure which stores algorithm state -- ALGLIB -- Copyright 19.11.2012 by Bochkanov Sergey *************************************************************************/ void linlsqrsetprecunit(const linlsqrstate &state); /************************************************************************* This function changes preconditioning settings of LinLSQRSolveSparse() function. LinLSQRSolveSparse() will use diagonal of the system matrix as preconditioner. This preconditioning mode is active by default. INPUT PARAMETERS: State - structure which stores algorithm state -- ALGLIB -- Copyright 19.11.2012 by Bochkanov Sergey *************************************************************************/ void linlsqrsetprecdiag(const linlsqrstate &state); /************************************************************************* This function sets optional Tikhonov regularization coefficient. It is zero by default. INPUT PARAMETERS: LambdaI - regularization factor, LambdaI>=0 OUTPUT PARAMETERS: State - structure which stores algorithm state -- ALGLIB -- Copyright 30.11.2011 by Bochkanov Sergey *************************************************************************/ void linlsqrsetlambdai(const linlsqrstate &state, const double lambdai); /************************************************************************* Procedure for solution of A*x=b with sparse A. INPUT PARAMETERS: State - algorithm state A - sparse M*N matrix in the CRS format (you MUST convert it to CRS format by calling SparseConvertToCRS() function BEFORE you pass it to this function).
B - right part, array[M] RESULT: This function returns no result. You can get the solution by calling LinLSQRResults() NOTE: this function uses lightweight preconditioning - multiplication by inverse of diag(A). If you want, you can turn preconditioning off by calling LinLSQRSetPrecUnit(). However, preconditioning cost is low and preconditioner is very important for solution of badly scaled problems. -- ALGLIB -- Copyright 30.11.2011 by Bochkanov Sergey *************************************************************************/ void linlsqrsolvesparse(const linlsqrstate &state, const sparsematrix &a, const real_1d_array &b); /************************************************************************* This function sets stopping criteria. INPUT PARAMETERS: EpsA - algorithm will be stopped if ||A^T*Rk||/(||A||*||Rk||)<=EpsA. EpsB - algorithm will be stopped if ||Rk||<=EpsB*||B|| MaxIts - algorithm will be stopped if number of iterations exceeds MaxIts. OUTPUT PARAMETERS: State - structure which stores algorithm state NOTE: if EpsA, EpsB and MaxIts are zero then these variables will be set to default values. -- ALGLIB -- Copyright 30.11.2011 by Bochkanov Sergey *************************************************************************/ void linlsqrsetcond(const linlsqrstate &state, const double epsa, const double epsb, const ae_int_t maxits); /************************************************************************* LSQR solver: results. This function must be called after LinLSQRSolveSparse() INPUT PARAMETERS: State - algorithm state OUTPUT PARAMETERS: X - array[N], solution Rep - optimization report: * Rep.TerminationType completion code: * 1 ||Rk||<=EpsB*||B|| * 4 ||A^T*Rk||/(||A||*||Rk||)<=EpsA * 5 MaxIts steps were taken * 7 rounding errors prevent further progress, X contains best point found so far. (sometimes returned on singular systems) * Rep.IterationsCount contains iterations count * Rep.NMV contains number of matrix-vector calculations -- ALGLIB -- Copyright 30.11.2011 by Bochkanov Sergey *************************************************************************/ void linlsqrresults(const linlsqrstate &state, real_1d_array &x, linlsqrreport &rep); /************************************************************************* This function turns on/off reporting. INPUT PARAMETERS: State - structure which stores algorithm state NeedXRep- whether iteration reports are needed or not If NeedXRep is True, algorithm will call rep() callback function if it is provided to the solver. -- ALGLIB -- Copyright 30.11.2011 by Bochkanov Sergey *************************************************************************/ void linlsqrsetxrep(const linlsqrstate &state, const bool needxrep);
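The sketch below illustrates the LinLSQRCreate/LinLSQRSolveSparse/LinLSQRResults call sequence described above, applied to a small sparse least squares problem. The 4x2 data values and the header names are illustrative assumptions; sparsecreate(), sparseset() and sparseconverttocrs() come from the LinAlg package.

#include <cstdio>
#include "linalg.h"
#include "solvers.h"
using namespace alglib;

int main()
{
    // sparse 4x2 least squares problem min |A*x-b|
    sparsematrix a;
    sparsecreate(4, 2, a);
    sparseset(a, 0, 0, 1.0);  sparseset(a, 0, 1, 1.0);
    sparseset(a, 1, 0, 1.0);  sparseset(a, 1, 1, 2.0);
    sparseset(a, 2, 0, 1.0);  sparseset(a, 2, 1, 3.0);
    sparseset(a, 3, 0, 1.0);  sparseset(a, 3, 1, 4.0);
    sparseconverttocrs(a);                     // LinLSQRSolveSparse() requires the CRS format
    real_1d_array b = "[1,2,3,5]";
    linlsqrstate s;
    linlsqrreport rep;
    real_1d_array x;
    linlsqrcreate(4, 2, s);
    linlsqrsolvesparse(s, a, b);
    linlsqrresults(s, x, rep);                 // see rep.terminationtype for the completion code
    printf("x = %s\n", x.tostring(4).c_str());
    return 0;
}
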
There is no reliable way to distinguish a real root of multiplicity 2 from two complex roots in the presence of roundoff errors. Rep - report, additional information, following fields are set: * Rep.MaxErr - max( |P(xi)| ) for i=0..N-1. This field allows you to quickly estimate "quality" of the roots being returned. NOTE: this function uses companion matrix method to find roots. In case the internal EVD solver fails to find eigenvalues, an exception is generated. NOTE: roots are not "polished" and no matrix balancing is performed for them. -- ALGLIB -- Copyright 24.02.2014 by Bochkanov Sergey *************************************************************************/ void polynomialsolve(const real_1d_array &a, const ae_int_t n, complex_1d_array &x, polynomialsolverreport &rep);
/************************************************************************* LEVENBERG-MARQUARDT-LIKE NONLINEAR SOLVER DESCRIPTION: This algorithm solves a system of nonlinear equations F[0](x[0], ..., x[n-1]) = 0 F[1](x[0], ..., x[n-1]) = 0 ... F[M-1](x[0], ..., x[n-1]) = 0 where M and N do not necessarily coincide. Algorithm converges quadratically under the following conditions: * the solution set XS is nonempty * for some xs in XS there exists a neighbourhood N(xs) such that: * vector function F(x) and its Jacobian J(x) are continuously differentiable on N * ||F(x)|| provides a local error bound on N, i.e. there exists c1 such that ||F(x)||>c1*distance(x,XS) Note that these conditions are much weaker than the usual non-singularity conditions. For example, algorithm will converge for any affine function F (whether its Jacobian is singular or not). REQUIREMENTS: Algorithm will request the following information during its operation: * function vector F[] and Jacobian matrix at given point X * value of merit function f(x)=F[0]^2(x)+...+F[M-1]^2(x) at given point X USAGE: 1. User initializes algorithm state with NLEQCreateLM() call 2. User tunes solver parameters with NLEQSetCond(), NLEQSetStpMax() and other functions 3. User calls NLEQSolve() function which takes algorithm state and pointers (delegates, etc.) to callback functions which calculate merit function value and Jacobian. 4. User calls NLEQResults() to get solution 5. Optionally, user may call NLEQRestartFrom() to solve another problem with same parameters (N/M) but another starting point and/or another function vector. NLEQRestartFrom() allows to reuse already initialized structure. INPUT PARAMETERS: N - space dimension, N>1: * if provided, only leading N elements of X are used * if not provided, determined automatically from size of X M - system size X - starting point OUTPUT PARAMETERS: State - structure which stores algorithm state NOTES: 1. you may tune stopping conditions with NLEQSetCond() function 2. if target function contains exp() or other fast growing functions, and optimization algorithm makes too large steps which lead to overflow, use NLEQSetStpMax() function to bound algorithm's steps. 3. this algorithm is a slightly modified implementation of the method described in 'Levenberg-Marquardt method for constrained nonlinear equations with strong local convergence properties' by Christian Kanzow, Nobuo Yamashita and Masao Fukushima and further developed in 'On the convergence of a New Levenberg-Marquardt Method' by Jin-yan Fan and Ya-Xiang Yuan.
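EXAMPLE (illustrative sketch, not a verbatim ALGLIB sample): the snippet below follows the usage steps above for the 2x2 system F[0]=x0^2+x1^2-4=0, F[1]=x0-x1=0. The callback signatures match the NLEQSolve() declaration documented below; the helper names nleq_func/nleq_jac are chosen for this example only.

    void nleq_func(const real_1d_array &x, double &func, void *ptr)
    {
        double f0 = x[0]*x[0]+x[1]*x[1]-4.0;
        double f1 = x[0]-x[1];
        func = f0*f0+f1*f1;                          // merit function f = F0^2+F1^2
    }
    void nleq_jac(const real_1d_array &x, real_1d_array &fi, real_2d_array &jac, void *ptr)
    {
        fi[0] = x[0]*x[0]+x[1]*x[1]-4.0;             // function vector F
        fi[1] = x[0]-x[1];
        jac[0][0] = 2.0*x[0];  jac[0][1] = 2.0*x[1]; // Jacobian dF[i]/dx[j]
        jac[1][0] = 1.0;       jac[1][1] = -1.0;
    }

    // driver code:
    real_1d_array x = "[10.0, 5.0]";                 // starting point
    nleqstate state;
    nleqreport rep;
    nleqcreatelm(2, 2, x, state);                    // N=2 variables, M=2 equations
    nleqsetcond(state, 1.0E-10, 0);                  // stop when ||F||<=EpsF
    nleqsolve(state, nleq_func, nleq_jac);
    nleqresults(state, x, rep);                      // x holds a root of the system (x0=x1=+/-sqrt(2))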
-- ALGLIB -- Copyright 20.08.2009 by Bochkanov Sergey *************************************************************************/ void nleqcreatelm(const ae_int_t n, const ae_int_t m, const real_1d_array &x, nleqstate &state); void nleqcreatelm(const ae_int_t m, const real_1d_array &x, nleqstate &state); /************************************************************************* This function sets stopping conditions for the nonlinear solver INPUT PARAMETERS: State - structure which stores algorithm state EpsF - >=0 The subroutine finishes its work if on k+1-th iteration the condition ||F||<=EpsF is satisfied MaxIts - maximum number of iterations. If MaxIts=0, the number of iterations is unlimited. Passing EpsF=0 and MaxIts=0 simultaneously will lead to automatic stopping criterion selection (small EpsF). NOTES: -- ALGLIB -- Copyright 20.08.2010 by Bochkanov Sergey *************************************************************************/ void nleqsetcond(const nleqstate &state, const double epsf, const ae_int_t maxits); /************************************************************************* This function turns on/off reporting. INPUT PARAMETERS: State - structure which stores algorithm state NeedXRep- whether iteration reports are needed or not If NeedXRep is True, algorithm will call rep() callback function if it is provided to NLEQSolve(). -- ALGLIB -- Copyright 20.08.2010 by Bochkanov Sergey *************************************************************************/ void nleqsetxrep(const nleqstate &state, const bool needxrep); /************************************************************************* This function sets maximum step length INPUT PARAMETERS: State - structure which stores algorithm state StpMax - maximum step length, >=0. Set StpMax to 0.0, if you don't want to limit step length. Use this subroutine when target function contains exp() or other fast growing functions, and algorithm makes too large steps which lead to overflow. This function allows us to reject steps that are too large (and therefore expose us to the possible overflow) without actually calculating function value at the x+stp*d. -- ALGLIB -- Copyright 20.08.2010 by Bochkanov Sergey *************************************************************************/ void nleqsetstpmax(const nleqstate &state, const double stpmax); /************************************************************************* This function provides reverse communication interface Reverse communication interface is not documented or recommended to use. 
See below for functions which provide better documented API *************************************************************************/ bool nleqiteration(const nleqstate &state);
/************************************************************************* This family of functions is used to launch iterations of the nonlinear solver. These functions accept the following parameters: state - algorithm state func - callback which calculates function (or merit function) value func at given point x jac - callback which calculates function vector fi[] and Jacobian jac at given point x rep - optional callback which is called after each iteration can be NULL ptr - optional pointer which is passed to func/grad/hess/jac/rep can be NULL -- ALGLIB -- Copyright 20.03.2009 by Bochkanov Sergey *************************************************************************/ void nleqsolve(nleqstate &state, void (*func)(const real_1d_array &x, double &func, void *ptr), void (*jac)(const real_1d_array &x, real_1d_array &fi, real_2d_array &jac, void *ptr), void (*rep)(const real_1d_array &x, double func, void *ptr) = NULL, void *ptr = NULL);
/************************************************************************* NLEQ solver results INPUT PARAMETERS: State - algorithm state. OUTPUT PARAMETERS: X - array[0..N-1], solution Rep - optimization report: * Rep.TerminationType completion code: * -4 ERROR: algorithm has converged to the stationary point Xf which is a local minimum of f=F[0]^2+...+F[m-1]^2, but is not a solution of the nonlinear system. * 1 sqrt(f)<=EpsF. * 5 MaxIts steps were taken * 7 stopping conditions are too stringent, further improvement is impossible * Rep.IterationsCount contains iterations count * NFEV contains number of function calculations * ActiveConstraints contains number of active constraints -- ALGLIB -- Copyright 20.08.2009 by Bochkanov Sergey *************************************************************************/ void nleqresults(const nleqstate &state, real_1d_array &x, nleqreport &rep);
/************************************************************************* NLEQ solver results Buffered implementation of NLEQResults(), which uses pre-allocated buffer to store X[]. If buffer size is too small, it resizes buffer. It is intended to be used in the inner cycles of performance critical algorithms where array reallocation penalty is too large to be ignored. -- ALGLIB -- Copyright 20.08.2009 by Bochkanov Sergey *************************************************************************/ void nleqresultsbuf(const nleqstate &state, real_1d_array &x, nleqreport &rep);
/************************************************************************* This subroutine restarts NLEQ solver from a new point. All solver parameters are left unchanged. This function allows to solve multiple problems (which must have same number of dimensions) without object reallocation penalty. INPUT PARAMETERS: State - structure used for reverse communication previously allocated with NLEQCreateLM() call. X - new starting point. -- ALGLIB -- Copyright 30.07.2010 by Bochkanov Sergey *************************************************************************/ void nleqrestartfrom(const nleqstate &state, const real_1d_array &x);
/************************************************************************* This function initializes linear CG Solver. This solver is used to solve symmetric positive definite problems.
If you want to solve nonsymmetric (or non-positive definite) problem you may use LinLSQR solver provided by ALGLIB. USAGE: 1. User initializes algorithm state with LinCGCreate() call 2. User tunes solver parameters with LinCGSetCond() and other functions 3. Optionally, user sets starting point with LinCGSetStartingPoint() 4. User calls LinCGSolveSparse() function which takes algorithm state and SparseMatrix object. 5. User calls LinCGResults() to get solution 6. Optionally, user may call LinCGSolveSparse() again to solve another problem with different matrix and/or right part without reinitializing LinCGState structure. INPUT PARAMETERS: N - problem dimension, N>0 OUTPUT PARAMETERS: State - structure which stores algorithm state -- ALGLIB -- Copyright 14.11.2011 by Bochkanov Sergey *************************************************************************/ void lincgcreate(const ae_int_t n, lincgstate &state); /************************************************************************* This function sets starting point. By default, zero starting point is used. INPUT PARAMETERS: X - starting point, array[N] OUTPUT PARAMETERS: State - structure which stores algorithm state -- ALGLIB -- Copyright 14.11.2011 by Bochkanov Sergey *************************************************************************/ void lincgsetstartingpoint(const lincgstate &state, const real_1d_array &x); /************************************************************************* This function changes preconditioning settings of LinCGSolveSparse() function. By default, SolveSparse() uses diagonal preconditioner, but if you want to use solver without preconditioning, you can call this function which forces solver to use unit matrix for preconditioning. INPUT PARAMETERS: State - structure which stores algorithm state -- ALGLIB -- Copyright 19.11.2012 by Bochkanov Sergey *************************************************************************/ void lincgsetprecunit(const lincgstate &state); /************************************************************************* This function changes preconditioning settings of LinCGSolveSparse() function. LinCGSolveSparse() will use diagonal of the system matrix as preconditioner. This preconditioning mode is active by default. INPUT PARAMETERS: State - structure which stores algorithm state -- ALGLIB -- Copyright 19.11.2012 by Bochkanov Sergey *************************************************************************/ void lincgsetprecdiag(const lincgstate &state); /************************************************************************* This function sets stopping criteria. INPUT PARAMETERS: EpsF - algorithm will be stopped if norm of residual is less than EpsF*||b||. MaxIts - algorithm will be stopped if number of iterations is more than MaxIts. OUTPUT PARAMETERS: State - structure which stores algorithm state NOTES: If both EpsF and MaxIts are zero then small EpsF will be set to small value. -- ALGLIB -- Copyright 14.11.2011 by Bochkanov Sergey *************************************************************************/ void lincgsetcond(const lincgstate &state, const double epsf, const ae_int_t maxits); /************************************************************************* Procedure for solution of A*x=b with sparse A. INPUT PARAMETERS: State - algorithm state A - sparse matrix in the CRS format (you MUST contvert it to CRS format by calling SparseConvertToCRS() function). 
IsUpper - whether upper or lower triangle of A is used: * IsUpper=True => only upper triangle is used and lower triangle is not referenced at all * IsUpper=False => only lower triangle is used and upper triangle is not referenced at all B - right part, array[N] RESULT: This function returns no result. You can get solution by calling LinCGResults() NOTE: this function uses lightweight preconditioning - multiplication by inverse of diag(A). If you want, you can turn preconditioning off by calling LinCGSetPrecUnit(). However, preconditioning cost is low and preconditioner is very important for solution of badly scaled problems. -- ALGLIB -- Copyright 14.11.2011 by Bochkanov Sergey *************************************************************************/ void lincgsolvesparse(const lincgstate &state, const sparsematrix &a, const bool isupper, const real_1d_array &b);
/************************************************************************* CG-solver: results. This function must be called after LinCGSolveSparse() INPUT PARAMETERS: State - algorithm state OUTPUT PARAMETERS: X - array[N], solution Rep - optimization report: * Rep.TerminationType completion code: * -5 input matrix is either not positive definite, too large or too small * -4 overflow/underflow during solution (ill conditioned problem) * 1 ||residual||<=EpsF*||b|| * 5 MaxIts steps were taken * 7 rounding errors prevent further progress, best point found is returned * Rep.IterationsCount contains iterations count * NMV contains number of matrix-vector calculations -- ALGLIB -- Copyright 14.11.2011 by Bochkanov Sergey *************************************************************************/ void lincgresults(const lincgstate &state, real_1d_array &x, lincgreport &rep);
/************************************************************************* This function sets restart frequency. By default, algorithm is restarted after N subsequent iterations. -- ALGLIB -- Copyright 14.11.2011 by Bochkanov Sergey *************************************************************************/ void lincgsetrestartfreq(const lincgstate &state, const ae_int_t srf);
/************************************************************************* This function sets frequency of residual recalculations. Algorithm updates residual r_k using iterative formula, but recalculates it from scratch after every 10 iterations. It is done to avoid accumulation of numerical errors and to stop algorithm when r_k starts to grow. Such low update frequency (1/10) gives very little overhead, but makes algorithm a bit more robust against numerical errors. However, you may change it with this function. INPUT PARAMETERS: Freq - desired update frequency, Freq>=0. Zero value means that no updates will be done. -- ALGLIB -- Copyright 14.11.2011 by Bochkanov Sergey *************************************************************************/ void lincgsetrupdatefreq(const lincgstate &state, const ae_int_t freq);
/************************************************************************* This function turns on/off reporting. INPUT PARAMETERS: State - structure which stores algorithm state NeedXRep- whether iteration reports are needed or not If NeedXRep is True, algorithm will call rep() callback function if it is provided to the solver.
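EXAMPLE (illustrative sketch, not a verbatim ALGLIB sample): the snippet below follows the LinCGCreate()/LinCGSolveSparse()/LinCGResults() workflow described above for a small symmetric positive definite system; it assumes the sparsematrix type and the sparsecreate()/sparseset()/sparseconverttocrs() helpers from the LinAlg package.

    // 3x3 SPD matrix, only the upper triangle is stored
    sparsematrix a;
    sparsecreate(3, 3, a);
    sparseset(a, 0, 0, 2.0);  sparseset(a, 0, 1, 1.0);
    sparseset(a, 1, 1, 2.0);  sparseset(a, 1, 2, 1.0);
    sparseset(a, 2, 2, 2.0);
    sparseconverttocrs(a);                  // LinCGSolveSparse() requires CRS storage

    real_1d_array b = "[1.0, 2.0, 3.0]";
    real_1d_array x;
    lincgstate s;
    lincgreport rep;
    lincgcreate(3, s);
    lincgsetcond(s, 1.0E-8, 0);             // stop when ||residual||<=EpsF*||b||
    lincgsolvesparse(s, a, true, b);        // IsUpper=true: use the upper triangle
    lincgresults(s, x, rep);                // rep.terminationtype=1 on success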
-- ALGLIB -- Copyright 14.11.2011 by Bochkanov Sergey *************************************************************************/ void lincgsetxrep(const lincgstate &state, const bool needxrep); } ///////////////////////////////////////////////////////////////////////// // // THIS SECTION CONTAINS COMPUTATIONAL CORE DECLARATIONS (FUNCTIONS) // ///////////////////////////////////////////////////////////////////////// namespace alglib_impl { void rmatrixsolve(/* Real */ ae_matrix* a, ae_int_t n, /* Real */ ae_vector* b, ae_int_t* info, densesolverreport* rep, /* Real */ ae_vector* x, ae_state *_state); void _pexec_rmatrixsolve(/* Real */ ae_matrix* a, ae_int_t n, /* Real */ ae_vector* b, ae_int_t* info, densesolverreport* rep, /* Real */ ae_vector* x, ae_state *_state); void rmatrixsolvefast(/* Real */ ae_matrix* a, ae_int_t n, /* Real */ ae_vector* b, ae_int_t* info, ae_state *_state); void _pexec_rmatrixsolvefast(/* Real */ ae_matrix* a, ae_int_t n, /* Real */ ae_vector* b, ae_int_t* info, ae_state *_state); void rmatrixsolvem(/* Real */ ae_matrix* a, ae_int_t n, /* Real */ ae_matrix* b, ae_int_t m, ae_bool rfs, ae_int_t* info, densesolverreport* rep, /* Real */ ae_matrix* x, ae_state *_state); void _pexec_rmatrixsolvem(/* Real */ ae_matrix* a, ae_int_t n, /* Real */ ae_matrix* b, ae_int_t m, ae_bool rfs, ae_int_t* info, densesolverreport* rep, /* Real */ ae_matrix* x, ae_state *_state); void rmatrixsolvemfast(/* Real */ ae_matrix* a, ae_int_t n, /* Real */ ae_matrix* b, ae_int_t m, ae_int_t* info, ae_state *_state); void _pexec_rmatrixsolvemfast(/* Real */ ae_matrix* a, ae_int_t n, /* Real */ ae_matrix* b, ae_int_t m, ae_int_t* info, ae_state *_state); void rmatrixlusolve(/* Real */ ae_matrix* lua, /* Integer */ ae_vector* p, ae_int_t n, /* Real */ ae_vector* b, ae_int_t* info, densesolverreport* rep, /* Real */ ae_vector* x, ae_state *_state); void rmatrixlusolvefast(/* Real */ ae_matrix* lua, /* Integer */ ae_vector* p, ae_int_t n, /* Real */ ae_vector* b, ae_int_t* info, ae_state *_state); void rmatrixlusolvem(/* Real */ ae_matrix* lua, /* Integer */ ae_vector* p, ae_int_t n, /* Real */ ae_matrix* b, ae_int_t m, ae_int_t* info, densesolverreport* rep, /* Real */ ae_matrix* x, ae_state *_state); void _pexec_rmatrixlusolvem(/* Real */ ae_matrix* lua, /* Integer */ ae_vector* p, ae_int_t n, /* Real */ ae_matrix* b, ae_int_t m, ae_int_t* info, densesolverreport* rep, /* Real */ ae_matrix* x, ae_state *_state); void rmatrixlusolvemfast(/* Real */ ae_matrix* lua, /* Integer */ ae_vector* p, ae_int_t n, /* Real */ ae_matrix* b, ae_int_t m, ae_int_t* info, ae_state *_state); void _pexec_rmatrixlusolvemfast(/* Real */ ae_matrix* lua, /* Integer */ ae_vector* p, ae_int_t n, /* Real */ ae_matrix* b, ae_int_t m, ae_int_t* info, ae_state *_state); void rmatrixmixedsolve(/* Real */ ae_matrix* a, /* Real */ ae_matrix* lua, /* Integer */ ae_vector* p, ae_int_t n, /* Real */ ae_vector* b, ae_int_t* info, densesolverreport* rep, /* Real */ ae_vector* x, ae_state *_state); void rmatrixmixedsolvem(/* Real */ ae_matrix* a, /* Real */ ae_matrix* lua, /* Integer */ ae_vector* p, ae_int_t n, /* Real */ ae_matrix* b, ae_int_t m, ae_int_t* info, densesolverreport* rep, /* Real */ ae_matrix* x, ae_state *_state); void cmatrixsolvem(/* Complex */ ae_matrix* a, ae_int_t n, /* Complex */ ae_matrix* b, ae_int_t m, ae_bool rfs, ae_int_t* info, densesolverreport* rep, /* Complex */ ae_matrix* x, ae_state *_state); void _pexec_cmatrixsolvem(/* Complex */ ae_matrix* a, ae_int_t n, /* Complex */ ae_matrix* b, ae_int_t m, 
ae_bool rfs, ae_int_t* info, densesolverreport* rep, /* Complex */ ae_matrix* x, ae_state *_state); void cmatrixsolvemfast(/* Complex */ ae_matrix* a, ae_int_t n, /* Complex */ ae_matrix* b, ae_int_t m, ae_int_t* info, ae_state *_state); void _pexec_cmatrixsolvemfast(/* Complex */ ae_matrix* a, ae_int_t n, /* Complex */ ae_matrix* b, ae_int_t m, ae_int_t* info, ae_state *_state); void cmatrixsolve(/* Complex */ ae_matrix* a, ae_int_t n, /* Complex */ ae_vector* b, ae_int_t* info, densesolverreport* rep, /* Complex */ ae_vector* x, ae_state *_state); void _pexec_cmatrixsolve(/* Complex */ ae_matrix* a, ae_int_t n, /* Complex */ ae_vector* b, ae_int_t* info, densesolverreport* rep, /* Complex */ ae_vector* x, ae_state *_state); void cmatrixsolvefast(/* Complex */ ae_matrix* a, ae_int_t n, /* Complex */ ae_vector* b, ae_int_t* info, ae_state *_state); void _pexec_cmatrixsolvefast(/* Complex */ ae_matrix* a, ae_int_t n, /* Complex */ ae_vector* b, ae_int_t* info, ae_state *_state); void cmatrixlusolvem(/* Complex */ ae_matrix* lua, /* Integer */ ae_vector* p, ae_int_t n, /* Complex */ ae_matrix* b, ae_int_t m, ae_int_t* info, densesolverreport* rep, /* Complex */ ae_matrix* x, ae_state *_state); void _pexec_cmatrixlusolvem(/* Complex */ ae_matrix* lua, /* Integer */ ae_vector* p, ae_int_t n, /* Complex */ ae_matrix* b, ae_int_t m, ae_int_t* info, densesolverreport* rep, /* Complex */ ae_matrix* x, ae_state *_state); void cmatrixlusolvemfast(/* Complex */ ae_matrix* lua, /* Integer */ ae_vector* p, ae_int_t n, /* Complex */ ae_matrix* b, ae_int_t m, ae_int_t* info, ae_state *_state); void _pexec_cmatrixlusolvemfast(/* Complex */ ae_matrix* lua, /* Integer */ ae_vector* p, ae_int_t n, /* Complex */ ae_matrix* b, ae_int_t m, ae_int_t* info, ae_state *_state); void cmatrixlusolve(/* Complex */ ae_matrix* lua, /* Integer */ ae_vector* p, ae_int_t n, /* Complex */ ae_vector* b, ae_int_t* info, densesolverreport* rep, /* Complex */ ae_vector* x, ae_state *_state); void cmatrixlusolvefast(/* Complex */ ae_matrix* lua, /* Integer */ ae_vector* p, ae_int_t n, /* Complex */ ae_vector* b, ae_int_t* info, ae_state *_state); void cmatrixmixedsolvem(/* Complex */ ae_matrix* a, /* Complex */ ae_matrix* lua, /* Integer */ ae_vector* p, ae_int_t n, /* Complex */ ae_matrix* b, ae_int_t m, ae_int_t* info, densesolverreport* rep, /* Complex */ ae_matrix* x, ae_state *_state); void cmatrixmixedsolve(/* Complex */ ae_matrix* a, /* Complex */ ae_matrix* lua, /* Integer */ ae_vector* p, ae_int_t n, /* Complex */ ae_vector* b, ae_int_t* info, densesolverreport* rep, /* Complex */ ae_vector* x, ae_state *_state); void spdmatrixsolvem(/* Real */ ae_matrix* a, ae_int_t n, ae_bool isupper, /* Real */ ae_matrix* b, ae_int_t m, ae_int_t* info, densesolverreport* rep, /* Real */ ae_matrix* x, ae_state *_state); void _pexec_spdmatrixsolvem(/* Real */ ae_matrix* a, ae_int_t n, ae_bool isupper, /* Real */ ae_matrix* b, ae_int_t m, ae_int_t* info, densesolverreport* rep, /* Real */ ae_matrix* x, ae_state *_state); void spdmatrixsolvemfast(/* Real */ ae_matrix* a, ae_int_t n, ae_bool isupper, /* Real */ ae_matrix* b, ae_int_t m, ae_int_t* info, ae_state *_state); void _pexec_spdmatrixsolvemfast(/* Real */ ae_matrix* a, ae_int_t n, ae_bool isupper, /* Real */ ae_matrix* b, ae_int_t m, ae_int_t* info, ae_state *_state); void spdmatrixsolve(/* Real */ ae_matrix* a, ae_int_t n, ae_bool isupper, /* Real */ ae_vector* b, ae_int_t* info, densesolverreport* rep, /* Real */ ae_vector* x, ae_state *_state); void _pexec_spdmatrixsolve(/* 
Real */ ae_matrix* a, ae_int_t n, ae_bool isupper, /* Real */ ae_vector* b, ae_int_t* info, densesolverreport* rep, /* Real */ ae_vector* x, ae_state *_state); void spdmatrixsolvefast(/* Real */ ae_matrix* a, ae_int_t n, ae_bool isupper, /* Real */ ae_vector* b, ae_int_t* info, ae_state *_state); void _pexec_spdmatrixsolvefast(/* Real */ ae_matrix* a, ae_int_t n, ae_bool isupper, /* Real */ ae_vector* b, ae_int_t* info, ae_state *_state); void spdmatrixcholeskysolvem(/* Real */ ae_matrix* cha, ae_int_t n, ae_bool isupper, /* Real */ ae_matrix* b, ae_int_t m, ae_int_t* info, densesolverreport* rep, /* Real */ ae_matrix* x, ae_state *_state); void _pexec_spdmatrixcholeskysolvem(/* Real */ ae_matrix* cha, ae_int_t n, ae_bool isupper, /* Real */ ae_matrix* b, ae_int_t m, ae_int_t* info, densesolverreport* rep, /* Real */ ae_matrix* x, ae_state *_state); void spdmatrixcholeskysolvemfast(/* Real */ ae_matrix* cha, ae_int_t n, ae_bool isupper, /* Real */ ae_matrix* b, ae_int_t m, ae_int_t* info, ae_state *_state); void _pexec_spdmatrixcholeskysolvemfast(/* Real */ ae_matrix* cha, ae_int_t n, ae_bool isupper, /* Real */ ae_matrix* b, ae_int_t m, ae_int_t* info, ae_state *_state); void spdmatrixcholeskysolve(/* Real */ ae_matrix* cha, ae_int_t n, ae_bool isupper, /* Real */ ae_vector* b, ae_int_t* info, densesolverreport* rep, /* Real */ ae_vector* x, ae_state *_state); void spdmatrixcholeskysolvefast(/* Real */ ae_matrix* cha, ae_int_t n, ae_bool isupper, /* Real */ ae_vector* b, ae_int_t* info, ae_state *_state); void hpdmatrixsolvem(/* Complex */ ae_matrix* a, ae_int_t n, ae_bool isupper, /* Complex */ ae_matrix* b, ae_int_t m, ae_int_t* info, densesolverreport* rep, /* Complex */ ae_matrix* x, ae_state *_state); void _pexec_hpdmatrixsolvem(/* Complex */ ae_matrix* a, ae_int_t n, ae_bool isupper, /* Complex */ ae_matrix* b, ae_int_t m, ae_int_t* info, densesolverreport* rep, /* Complex */ ae_matrix* x, ae_state *_state); void hpdmatrixsolvemfast(/* Complex */ ae_matrix* a, ae_int_t n, ae_bool isupper, /* Complex */ ae_matrix* b, ae_int_t m, ae_int_t* info, ae_state *_state); void _pexec_hpdmatrixsolvemfast(/* Complex */ ae_matrix* a, ae_int_t n, ae_bool isupper, /* Complex */ ae_matrix* b, ae_int_t m, ae_int_t* info, ae_state *_state); void hpdmatrixsolve(/* Complex */ ae_matrix* a, ae_int_t n, ae_bool isupper, /* Complex */ ae_vector* b, ae_int_t* info, densesolverreport* rep, /* Complex */ ae_vector* x, ae_state *_state); void _pexec_hpdmatrixsolve(/* Complex */ ae_matrix* a, ae_int_t n, ae_bool isupper, /* Complex */ ae_vector* b, ae_int_t* info, densesolverreport* rep, /* Complex */ ae_vector* x, ae_state *_state); void hpdmatrixsolvefast(/* Complex */ ae_matrix* a, ae_int_t n, ae_bool isupper, /* Complex */ ae_vector* b, ae_int_t* info, ae_state *_state); void _pexec_hpdmatrixsolvefast(/* Complex */ ae_matrix* a, ae_int_t n, ae_bool isupper, /* Complex */ ae_vector* b, ae_int_t* info, ae_state *_state); void hpdmatrixcholeskysolvem(/* Complex */ ae_matrix* cha, ae_int_t n, ae_bool isupper, /* Complex */ ae_matrix* b, ae_int_t m, ae_int_t* info, densesolverreport* rep, /* Complex */ ae_matrix* x, ae_state *_state); void _pexec_hpdmatrixcholeskysolvem(/* Complex */ ae_matrix* cha, ae_int_t n, ae_bool isupper, /* Complex */ ae_matrix* b, ae_int_t m, ae_int_t* info, densesolverreport* rep, /* Complex */ ae_matrix* x, ae_state *_state); void hpdmatrixcholeskysolvemfast(/* Complex */ ae_matrix* cha, ae_int_t n, ae_bool isupper, /* Complex */ ae_matrix* b, ae_int_t m, ae_int_t* info, ae_state 
*_state); void _pexec_hpdmatrixcholeskysolvemfast(/* Complex */ ae_matrix* cha, ae_int_t n, ae_bool isupper, /* Complex */ ae_matrix* b, ae_int_t m, ae_int_t* info, ae_state *_state); void hpdmatrixcholeskysolve(/* Complex */ ae_matrix* cha, ae_int_t n, ae_bool isupper, /* Complex */ ae_vector* b, ae_int_t* info, densesolverreport* rep, /* Complex */ ae_vector* x, ae_state *_state); void hpdmatrixcholeskysolvefast(/* Complex */ ae_matrix* cha, ae_int_t n, ae_bool isupper, /* Complex */ ae_vector* b, ae_int_t* info, ae_state *_state); void rmatrixsolvels(/* Real */ ae_matrix* a, ae_int_t nrows, ae_int_t ncols, /* Real */ ae_vector* b, double threshold, ae_int_t* info, densesolverlsreport* rep, /* Real */ ae_vector* x, ae_state *_state); void _pexec_rmatrixsolvels(/* Real */ ae_matrix* a, ae_int_t nrows, ae_int_t ncols, /* Real */ ae_vector* b, double threshold, ae_int_t* info, densesolverlsreport* rep, /* Real */ ae_vector* x, ae_state *_state); void _densesolverreport_init(void* _p, ae_state *_state); void _densesolverreport_init_copy(void* _dst, void* _src, ae_state *_state); void _densesolverreport_clear(void* _p); void _densesolverreport_destroy(void* _p); void _densesolverlsreport_init(void* _p, ae_state *_state); void _densesolverlsreport_init_copy(void* _dst, void* _src, ae_state *_state); void _densesolverlsreport_clear(void* _p); void _densesolverlsreport_destroy(void* _p); void linlsqrcreate(ae_int_t m, ae_int_t n, linlsqrstate* state, ae_state *_state); void linlsqrsetb(linlsqrstate* state, /* Real */ ae_vector* b, ae_state *_state); void linlsqrsetprecunit(linlsqrstate* state, ae_state *_state); void linlsqrsetprecdiag(linlsqrstate* state, ae_state *_state); void linlsqrsetlambdai(linlsqrstate* state, double lambdai, ae_state *_state); ae_bool linlsqriteration(linlsqrstate* state, ae_state *_state); void linlsqrsolvesparse(linlsqrstate* state, sparsematrix* a, /* Real */ ae_vector* b, ae_state *_state); void linlsqrsetcond(linlsqrstate* state, double epsa, double epsb, ae_int_t maxits, ae_state *_state); void linlsqrresults(linlsqrstate* state, /* Real */ ae_vector* x, linlsqrreport* rep, ae_state *_state); void linlsqrsetxrep(linlsqrstate* state, ae_bool needxrep, ae_state *_state); void linlsqrrestart(linlsqrstate* state, ae_state *_state); void _linlsqrstate_init(void* _p, ae_state *_state); void _linlsqrstate_init_copy(void* _dst, void* _src, ae_state *_state); void _linlsqrstate_clear(void* _p); void _linlsqrstate_destroy(void* _p); void _linlsqrreport_init(void* _p, ae_state *_state); void _linlsqrreport_init_copy(void* _dst, void* _src, ae_state *_state); void _linlsqrreport_clear(void* _p); void _linlsqrreport_destroy(void* _p); void polynomialsolve(/* Real */ ae_vector* a, ae_int_t n, /* Complex */ ae_vector* x, polynomialsolverreport* rep, ae_state *_state); void _polynomialsolverreport_init(void* _p, ae_state *_state); void _polynomialsolverreport_init_copy(void* _dst, void* _src, ae_state *_state); void _polynomialsolverreport_clear(void* _p); void _polynomialsolverreport_destroy(void* _p); void nleqcreatelm(ae_int_t n, ae_int_t m, /* Real */ ae_vector* x, nleqstate* state, ae_state *_state); void nleqsetcond(nleqstate* state, double epsf, ae_int_t maxits, ae_state *_state); void nleqsetxrep(nleqstate* state, ae_bool needxrep, ae_state *_state); void nleqsetstpmax(nleqstate* state, double stpmax, ae_state *_state); ae_bool nleqiteration(nleqstate* state, ae_state *_state); void nleqresults(nleqstate* state, /* Real */ ae_vector* x, nleqreport* rep, ae_state *_state); 
void nleqresultsbuf(nleqstate* state, /* Real */ ae_vector* x, nleqreport* rep, ae_state *_state); void nleqrestartfrom(nleqstate* state, /* Real */ ae_vector* x, ae_state *_state); void _nleqstate_init(void* _p, ae_state *_state); void _nleqstate_init_copy(void* _dst, void* _src, ae_state *_state); void _nleqstate_clear(void* _p); void _nleqstate_destroy(void* _p); void _nleqreport_init(void* _p, ae_state *_state); void _nleqreport_init_copy(void* _dst, void* _src, ae_state *_state); void _nleqreport_clear(void* _p); void _nleqreport_destroy(void* _p); void lincgcreate(ae_int_t n, lincgstate* state, ae_state *_state); void lincgsetstartingpoint(lincgstate* state, /* Real */ ae_vector* x, ae_state *_state); void lincgsetb(lincgstate* state, /* Real */ ae_vector* b, ae_state *_state); void lincgsetprecunit(lincgstate* state, ae_state *_state); void lincgsetprecdiag(lincgstate* state, ae_state *_state); void lincgsetcond(lincgstate* state, double epsf, ae_int_t maxits, ae_state *_state); ae_bool lincgiteration(lincgstate* state, ae_state *_state); void lincgsolvesparse(lincgstate* state, sparsematrix* a, ae_bool isupper, /* Real */ ae_vector* b, ae_state *_state); void lincgresults(lincgstate* state, /* Real */ ae_vector* x, lincgreport* rep, ae_state *_state); void lincgsetrestartfreq(lincgstate* state, ae_int_t srf, ae_state *_state); void lincgsetrupdatefreq(lincgstate* state, ae_int_t freq, ae_state *_state); void lincgsetxrep(lincgstate* state, ae_bool needxrep, ae_state *_state); void lincgrestart(lincgstate* state, ae_state *_state); void _lincgstate_init(void* _p, ae_state *_state); void _lincgstate_init_copy(void* _dst, void* _src, ae_state *_state); void _lincgstate_clear(void* _p); void _lincgstate_destroy(void* _p); void _lincgreport_init(void* _p, ae_state *_state); void _lincgreport_init_copy(void* _dst, void* _src, ae_state *_state); void _lincgreport_clear(void* _p); void _lincgreport_destroy(void* _p); } #endif cpp/src/stdafx.h0000755000175000017500000000000413105126766013453 0ustar sergeysergey cpp/src/diffequations.h0000755000175000017500000002313713105126765015036 0ustar sergeysergey/************************************************************************* ALGLIB 3.11.0 (source code generated 2017-05-11) Copyright (c) Sergey Bochkanov (ALGLIB project). >>> SOURCE LICENSE >>> This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation (www.fsf.org); either version 2 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. 
A copy of the GNU General Public License is available at http://www.fsf.org/licensing/licenses >>> END OF LICENSE >>> *************************************************************************/ #ifndef _diffequations_pkg_h #define _diffequations_pkg_h #include "ap.h" #include "alglibinternal.h" ///////////////////////////////////////////////////////////////////////// // // THIS SECTION CONTAINS COMPUTATIONAL CORE DECLARATIONS (DATATYPES) // ///////////////////////////////////////////////////////////////////////// namespace alglib_impl { typedef struct { ae_int_t n; ae_int_t m; double xscale; double h; double eps; ae_bool fraceps; ae_vector yc; ae_vector escale; ae_vector xg; ae_int_t solvertype; ae_bool needdy; double x; ae_vector y; ae_vector dy; ae_matrix ytbl; ae_int_t repterminationtype; ae_int_t repnfev; ae_vector yn; ae_vector yns; ae_vector rka; ae_vector rkc; ae_vector rkcs; ae_matrix rkb; ae_matrix rkk; rcommstate rstate; } odesolverstate; typedef struct { ae_int_t nfev; ae_int_t terminationtype; } odesolverreport; } ///////////////////////////////////////////////////////////////////////// // // THIS SECTION CONTAINS C++ INTERFACE // ///////////////////////////////////////////////////////////////////////// namespace alglib { /************************************************************************* *************************************************************************/ class _odesolverstate_owner { public: _odesolverstate_owner(); _odesolverstate_owner(const _odesolverstate_owner &rhs); _odesolverstate_owner& operator=(const _odesolverstate_owner &rhs); virtual ~_odesolverstate_owner(); alglib_impl::odesolverstate* c_ptr(); alglib_impl::odesolverstate* c_ptr() const; protected: alglib_impl::odesolverstate *p_struct; }; class odesolverstate : public _odesolverstate_owner { public: odesolverstate(); odesolverstate(const odesolverstate &rhs); odesolverstate& operator=(const odesolverstate &rhs); virtual ~odesolverstate(); ae_bool &needdy; real_1d_array y; real_1d_array dy; double &x; }; /************************************************************************* *************************************************************************/ class _odesolverreport_owner { public: _odesolverreport_owner(); _odesolverreport_owner(const _odesolverreport_owner &rhs); _odesolverreport_owner& operator=(const _odesolverreport_owner &rhs); virtual ~_odesolverreport_owner(); alglib_impl::odesolverreport* c_ptr(); alglib_impl::odesolverreport* c_ptr() const; protected: alglib_impl::odesolverreport *p_struct; }; class odesolverreport : public _odesolverreport_owner { public: odesolverreport(); odesolverreport(const odesolverreport &rhs); odesolverreport& operator=(const odesolverreport &rhs); virtual ~odesolverreport(); ae_int_t &nfev; ae_int_t &terminationtype; }; /************************************************************************* Cash-Karp adaptive ODE solver. This subroutine solves ODE Y'=f(Y,x) with initial conditions Y(xs)=Ys (here Y may be single variable or vector of N variables). INPUT PARAMETERS: Y - initial conditions, array[0..N-1]. contains values of Y[] at X[0] N - system size X - points at which Y should be tabulated, array[0..M-1] integrations starts at X[0], ends at X[M-1], intermediate values at X[i] are returned too. SHOULD BE ORDERED BY ASCENDING OR BY DESCENDING! 
M - number of intermediate points + first point + last point: * M>2 means that you need both Y(X[M-1]) and M-2 values at intermediate points * M=2 means that you want just to integrate from X[0] to X[1] and aren't interested in intermediate values. * M=1 means that you don't want to integrate :) it is a degenerate case, but it will be handled correctly. * M<1 means error Eps - tolerance (absolute/relative error on each step will be less than Eps). When passing: * Eps>0, it means desired ABSOLUTE error * Eps<0, it means desired RELATIVE error. Relative errors are calculated with respect to maximum values of Y seen so far. Be careful to use this criterion when starting from Y[] that are close to zero. H - initial step length, it will be adjusted automatically after the first step. If H=0, step will be selected automatically (usually it will be equal to 0.001 of min(x[i]-x[j])). OUTPUT PARAMETERS State - structure which stores algorithm state between subsequent calls of OdeSolverIteration. Used for reverse communication. This structure should be passed to the OdeSolverIteration subroutine. SEE ALSO AutoGKSmoothW, AutoGKSingular, AutoGKIteration, AutoGKResults. -- ALGLIB -- Copyright 01.09.2009 by Bochkanov Sergey *************************************************************************/ void odesolverrkck(const real_1d_array &y, const ae_int_t n, const real_1d_array &x, const ae_int_t m, const double eps, const double h, odesolverstate &state); void odesolverrkck(const real_1d_array &y, const real_1d_array &x, const double eps, const double h, odesolverstate &state);
/************************************************************************* This function provides reverse communication interface Reverse communication interface is not documented or recommended to use. See below for functions which provide better documented API *************************************************************************/ bool odesolveriteration(const odesolverstate &state);
/************************************************************************* This function is used to launch iterations of the ODE solver. It accepts the following parameters: diff - callback which calculates dy/dx for given y and x ptr - optional pointer which is passed to diff; can be NULL -- ALGLIB -- Copyright 01.09.2009 by Bochkanov Sergey *************************************************************************/ void odesolversolve(odesolverstate &state, void (*diff)(const real_1d_array &y, double x, real_1d_array &dy, void *ptr), void *ptr = NULL);
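/*************************************************************************
EXAMPLE (illustrative sketch, not a verbatim ALGLIB sample): the snippet
below integrates y' = -y with y(0)=1 over the points 0, 1, 2, 3, following
the OdeSolverRKCK()/OdeSolverSolve()/OdeSolverResults() sequence documented
in this section; the callback name ode_rhs is chosen for this example only.

    void ode_rhs(const real_1d_array &y, double x, real_1d_array &dy, void *ptr)
    {
        dy[0] = -y[0];                          // right part of y' = -y
    }

    // driver code:
    real_1d_array y = "[1.0]";                  // initial condition y(0)=1
    real_1d_array xpoints = "[0.0, 1.0, 2.0, 3.0]";
    odesolverstate s;
    ae_int_t m;
    real_1d_array xtbl;
    real_2d_array ytbl;
    odesolverreport rep;
    odesolverrkck(y, xpoints, 1.0E-6, 0.0, s);  // Eps=1.0E-6 (absolute), automatic initial step
    odesolversolve(s, ode_rhs);
    odesolverresults(s, m, xtbl, ytbl, rep);    // ytbl[i][0] is approximately exp(-xtbl[i])
*************************************************************************/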
/************************************************************************* ODE solver results Called after OdeSolverIteration returned False. INPUT PARAMETERS: State - algorithm state (used by OdeSolverIteration). OUTPUT PARAMETERS: M - number of tabulated values, M>=1 XTbl - array[0..M-1], values of X YTbl - array[0..M-1,0..N-1], values of Y in X[i] Rep - solver report: * Rep.TerminationType completion code: * -2 X is not ordered by ascending/descending or there are non-distinct X[], i.e. X[i]=X[i+1] * -1 incorrect parameters were specified * 1 task has been solved * Rep.NFEV contains number of function calculations -- ALGLIB -- Copyright 01.09.2009 by Bochkanov Sergey *************************************************************************/ void odesolverresults(const odesolverstate &state, ae_int_t &m, real_1d_array &xtbl, real_2d_array &ytbl, odesolverreport &rep); }
///////////////////////////////////////////////////////////////////////// // // THIS SECTION CONTAINS COMPUTATIONAL CORE DECLARATIONS (FUNCTIONS) // ///////////////////////////////////////////////////////////////////////// namespace alglib_impl { void odesolverrkck(/* Real */ ae_vector* y, ae_int_t n, /* Real */ ae_vector* x, ae_int_t m, double eps, double h, odesolverstate* state, ae_state *_state); ae_bool odesolveriteration(odesolverstate* state, ae_state *_state); void odesolverresults(odesolverstate* state, ae_int_t* m, /* Real */ ae_vector* xtbl, /* Real */ ae_matrix* ytbl, odesolverreport* rep, ae_state *_state); void _odesolverstate_init(void* _p, ae_state *_state); void _odesolverstate_init_copy(void* _dst, void* _src, ae_state *_state); void _odesolverstate_clear(void* _p); void _odesolverstate_destroy(void* _p); void _odesolverreport_init(void* _p, ae_state *_state); void _odesolverreport_init_copy(void* _dst, void* _src, ae_state *_state); void _odesolverreport_clear(void* _p); void _odesolverreport_destroy(void* _p); } #endif
cpp/src/optimization.cpp /************************************************************************* ALGLIB 3.11.0 (source code generated 2017-05-11) Copyright (c) Sergey Bochkanov (ALGLIB project). >>> SOURCE LICENSE >>> This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation (www.fsf.org); either version 2 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
A copy of the GNU General Public License is available at http://www.fsf.org/licensing/licenses >>> END OF LICENSE >>> *************************************************************************/ #include "stdafx.h" #include "optimization.h" // disable some irrelevant warnings #if (AE_COMPILER==AE_MSVC) #pragma warning(disable:4100) #pragma warning(disable:4127) #pragma warning(disable:4702) #pragma warning(disable:4996) #endif using namespace std; ///////////////////////////////////////////////////////////////////////// // // THIS SECTION CONTAINS IMPLEMENTATION OF C++ INTERFACE // ///////////////////////////////////////////////////////////////////////// namespace alglib { /************************************************************************* *************************************************************************/ _minlbfgsstate_owner::_minlbfgsstate_owner() { p_struct = (alglib_impl::minlbfgsstate*)alglib_impl::ae_malloc(sizeof(alglib_impl::minlbfgsstate), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_minlbfgsstate_init(p_struct, NULL); } _minlbfgsstate_owner::_minlbfgsstate_owner(const _minlbfgsstate_owner &rhs) { p_struct = (alglib_impl::minlbfgsstate*)alglib_impl::ae_malloc(sizeof(alglib_impl::minlbfgsstate), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_minlbfgsstate_init_copy(p_struct, const_cast(rhs.p_struct), NULL); } _minlbfgsstate_owner& _minlbfgsstate_owner::operator=(const _minlbfgsstate_owner &rhs) { if( this==&rhs ) return *this; alglib_impl::_minlbfgsstate_clear(p_struct); alglib_impl::_minlbfgsstate_init_copy(p_struct, const_cast(rhs.p_struct), NULL); return *this; } _minlbfgsstate_owner::~_minlbfgsstate_owner() { alglib_impl::_minlbfgsstate_clear(p_struct); ae_free(p_struct); } alglib_impl::minlbfgsstate* _minlbfgsstate_owner::c_ptr() { return p_struct; } alglib_impl::minlbfgsstate* _minlbfgsstate_owner::c_ptr() const { return const_cast(p_struct); } minlbfgsstate::minlbfgsstate() : _minlbfgsstate_owner() ,needf(p_struct->needf),needfg(p_struct->needfg),xupdated(p_struct->xupdated),f(p_struct->f),g(&p_struct->g),x(&p_struct->x) { } minlbfgsstate::minlbfgsstate(const minlbfgsstate &rhs):_minlbfgsstate_owner(rhs) ,needf(p_struct->needf),needfg(p_struct->needfg),xupdated(p_struct->xupdated),f(p_struct->f),g(&p_struct->g),x(&p_struct->x) { } minlbfgsstate& minlbfgsstate::operator=(const minlbfgsstate &rhs) { if( this==&rhs ) return *this; _minlbfgsstate_owner::operator=(rhs); return *this; } minlbfgsstate::~minlbfgsstate() { } /************************************************************************* This structure stores optimization report: * IterationsCount total number of inner iterations * NFEV number of gradient evaluations * TerminationType termination type (see below) TERMINATION CODES TerminationType field contains completion code, which can be: -8 internal integrity control detected infinite or NAN values in function/gradient. Abnormal termination signalled. -7 gradient verification failed. See MinLBFGSSetGradientCheck() for more information. 1 relative function improvement is no more than EpsF. 2 relative step is no more than EpsX. 4 gradient norm is no more than EpsG 5 MaxIts steps was taken 7 stopping conditions are too stringent, further improvement is impossible, X contains best point found so far. 8 terminated by user who called minlbfgsrequesttermination(). X contains point which was "current accepted" when termination request was submitted. 
Other fields of this structure are not documented and should not be used! *************************************************************************/ _minlbfgsreport_owner::_minlbfgsreport_owner() { p_struct = (alglib_impl::minlbfgsreport*)alglib_impl::ae_malloc(sizeof(alglib_impl::minlbfgsreport), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_minlbfgsreport_init(p_struct, NULL); } _minlbfgsreport_owner::_minlbfgsreport_owner(const _minlbfgsreport_owner &rhs) { p_struct = (alglib_impl::minlbfgsreport*)alglib_impl::ae_malloc(sizeof(alglib_impl::minlbfgsreport), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_minlbfgsreport_init_copy(p_struct, const_cast(rhs.p_struct), NULL); } _minlbfgsreport_owner& _minlbfgsreport_owner::operator=(const _minlbfgsreport_owner &rhs) { if( this==&rhs ) return *this; alglib_impl::_minlbfgsreport_clear(p_struct); alglib_impl::_minlbfgsreport_init_copy(p_struct, const_cast(rhs.p_struct), NULL); return *this; } _minlbfgsreport_owner::~_minlbfgsreport_owner() { alglib_impl::_minlbfgsreport_clear(p_struct); ae_free(p_struct); } alglib_impl::minlbfgsreport* _minlbfgsreport_owner::c_ptr() { return p_struct; } alglib_impl::minlbfgsreport* _minlbfgsreport_owner::c_ptr() const { return const_cast(p_struct); } minlbfgsreport::minlbfgsreport() : _minlbfgsreport_owner() ,iterationscount(p_struct->iterationscount),nfev(p_struct->nfev),varidx(p_struct->varidx),terminationtype(p_struct->terminationtype) { } minlbfgsreport::minlbfgsreport(const minlbfgsreport &rhs):_minlbfgsreport_owner(rhs) ,iterationscount(p_struct->iterationscount),nfev(p_struct->nfev),varidx(p_struct->varidx),terminationtype(p_struct->terminationtype) { } minlbfgsreport& minlbfgsreport::operator=(const minlbfgsreport &rhs) { if( this==&rhs ) return *this; _minlbfgsreport_owner::operator=(rhs); return *this; } minlbfgsreport::~minlbfgsreport() { } /************************************************************************* LIMITED MEMORY BFGS METHOD FOR LARGE SCALE OPTIMIZATION DESCRIPTION: The subroutine minimizes function F(x) of N arguments by using a quasi- Newton method (LBFGS scheme) which is optimized to use a minimum amount of memory. The subroutine generates the approximation of an inverse Hessian matrix by using information about the last M steps of the algorithm (instead of N). It lessens a required amount of memory from a value of order N^2 to a value of order 2*N*M. REQUIREMENTS: Algorithm will request following information during its operation: * function value F and its gradient G (simultaneously) at given point X USAGE: 1. User initializes algorithm state with MinLBFGSCreate() call 2. User tunes solver parameters with MinLBFGSSetCond() MinLBFGSSetStpMax() and other functions 3. User calls MinLBFGSOptimize() function which takes algorithm state and pointer (delegate, etc.) to callback function which calculates F/G. 4. User calls MinLBFGSResults() to get solution 5. Optionally user may call MinLBFGSRestartFrom() to solve another problem with same N/M but another starting point and/or another function. MinLBFGSRestartFrom() allows to reuse already initialized structure. INPUT PARAMETERS: N - problem dimension. N>0 M - number of corrections in the BFGS scheme of Hessian approximation update. Recommended value: 3<=M<=7. The smaller value causes worse convergence, the bigger will not cause a considerably better convergence, but will cause a fall in the performance. M<=N. X - initial solution approximation, array[0..N-1]. 
OUTPUT PARAMETERS: State - structure which stores algorithm state NOTES: 1. you may tune stopping conditions with MinLBFGSSetCond() function 2. if target function contains exp() or other fast growing functions, and optimization algorithm makes too large steps which leads to overflow, use MinLBFGSSetStpMax() function to bound algorithm's steps. However, L-BFGS rarely needs such a tuning. -- ALGLIB -- Copyright 02.04.2010 by Bochkanov Sergey *************************************************************************/ void minlbfgscreate(const ae_int_t n, const ae_int_t m, const real_1d_array &x, minlbfgsstate &state) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minlbfgscreate(n, m, const_cast(x.c_ptr()), const_cast(state.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* LIMITED MEMORY BFGS METHOD FOR LARGE SCALE OPTIMIZATION DESCRIPTION: The subroutine minimizes function F(x) of N arguments by using a quasi- Newton method (LBFGS scheme) which is optimized to use a minimum amount of memory. The subroutine generates the approximation of an inverse Hessian matrix by using information about the last M steps of the algorithm (instead of N). It lessens a required amount of memory from a value of order N^2 to a value of order 2*N*M. REQUIREMENTS: Algorithm will request following information during its operation: * function value F and its gradient G (simultaneously) at given point X USAGE: 1. User initializes algorithm state with MinLBFGSCreate() call 2. User tunes solver parameters with MinLBFGSSetCond() MinLBFGSSetStpMax() and other functions 3. User calls MinLBFGSOptimize() function which takes algorithm state and pointer (delegate, etc.) to callback function which calculates F/G. 4. User calls MinLBFGSResults() to get solution 5. Optionally user may call MinLBFGSRestartFrom() to solve another problem with same N/M but another starting point and/or another function. MinLBFGSRestartFrom() allows to reuse already initialized structure. INPUT PARAMETERS: N - problem dimension. N>0 M - number of corrections in the BFGS scheme of Hessian approximation update. Recommended value: 3<=M<=7. The smaller value causes worse convergence, the bigger will not cause a considerably better convergence, but will cause a fall in the performance. M<=N. X - initial solution approximation, array[0..N-1]. OUTPUT PARAMETERS: State - structure which stores algorithm state NOTES: 1. you may tune stopping conditions with MinLBFGSSetCond() function 2. if target function contains exp() or other fast growing functions, and optimization algorithm makes too large steps which leads to overflow, use MinLBFGSSetStpMax() function to bound algorithm's steps. However, L-BFGS rarely needs such a tuning. 
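EXAMPLE (illustrative sketch, not a verbatim ALGLIB sample): the snippet below follows the usage steps above for a simple quadratic target f(x)=(x0-3)^2+(x1+1)^2 with an analytic gradient. It assumes the MinLBFGSOptimize() and MinLBFGSResults() functions referenced in the USAGE section, with the usual ALGLIB callback signature (function value and gradient computed together); the callback name func_grad is chosen for this example only.

    void func_grad(const real_1d_array &x, double &func, real_1d_array &grad, void *ptr)
    {
        func = (x[0]-3.0)*(x[0]-3.0) + (x[1]+1.0)*(x[1]+1.0);  // f(x)
        grad[0] = 2.0*(x[0]-3.0);                              // df/dx0
        grad[1] = 2.0*(x[1]+1.0);                              // df/dx1
    }

    // driver code:
    real_1d_array x = "[0.0, 0.0]";
    minlbfgsstate state;
    minlbfgsreport rep;
    minlbfgscreate(2, 2, x, state);              // N=2 variables, M=2 corrections
    minlbfgssetcond(state, 1.0E-10, 0, 0, 0);    // stop when the gradient norm is small (EpsG)
    minlbfgsoptimize(state, func_grad);
    minlbfgsresults(state, x, rep);              // x is approximately [3,-1]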
-- ALGLIB -- Copyright 02.04.2010 by Bochkanov Sergey *************************************************************************/ void minlbfgscreate(const ae_int_t m, const real_1d_array &x, minlbfgsstate &state) { alglib_impl::ae_state _alglib_env_state; ae_int_t n; n = x.length(); alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minlbfgscreate(n, m, const_cast(x.c_ptr()), const_cast(state.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* The subroutine is finite difference variant of MinLBFGSCreate(). It uses finite differences in order to differentiate target function. Description below contains information which is specific to this function only. We recommend to read comments on MinLBFGSCreate() in order to get more information about creation of LBFGS optimizer. INPUT PARAMETERS: N - problem dimension, N>0: * if given, only leading N elements of X are used * if not given, automatically determined from size of X M - number of corrections in the BFGS scheme of Hessian approximation update. Recommended value: 3<=M<=7. The smaller value causes worse convergence, the bigger will not cause a considerably better convergence, but will cause a fall in the performance. M<=N. X - starting point, array[0..N-1]. DiffStep- differentiation step, >0 OUTPUT PARAMETERS: State - structure which stores algorithm state NOTES: 1. algorithm uses 4-point central formula for differentiation. 2. differentiation step along I-th axis is equal to DiffStep*S[I] where S[] is scaling vector which can be set by MinLBFGSSetScale() call. 3. we recommend you to use moderate values of differentiation step. Too large step will result in too large truncation errors, while too small step will result in too large numerical errors. 1.0E-6 can be good value to start with. 4. Numerical differentiation is very inefficient - one gradient calculation needs 4*N function evaluations. This function will work for any N - either small (1...10), moderate (10...100) or large (100...). However, performance penalty will be too severe for any N's except for small ones. We should also say that code which relies on numerical differentiation is less robust and precise. LBFGS needs exact gradient values. Imprecise gradient may slow down convergence, especially on highly nonlinear problems. Thus we recommend to use this function for fast prototyping on small- dimensional problems only, and to implement analytical gradient as soon as possible. -- ALGLIB -- Copyright 16.05.2011 by Bochkanov Sergey *************************************************************************/ void minlbfgscreatef(const ae_int_t n, const ae_int_t m, const real_1d_array &x, const double diffstep, minlbfgsstate &state) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minlbfgscreatef(n, m, const_cast(x.c_ptr()), diffstep, const_cast(state.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* The subroutine is finite difference variant of MinLBFGSCreate(). It uses finite differences in order to differentiate target function. Description below contains information which is specific to this function only. 
We recommend to read comments on MinLBFGSCreate() in order to get more information about creation of LBFGS optimizer. INPUT PARAMETERS: N - problem dimension, N>0: * if given, only leading N elements of X are used * if not given, automatically determined from size of X M - number of corrections in the BFGS scheme of Hessian approximation update. Recommended value: 3<=M<=7. The smaller value causes worse convergence, the bigger will not cause a considerably better convergence, but will cause a fall in the performance. M<=N. X - starting point, array[0..N-1]. DiffStep- differentiation step, >0 OUTPUT PARAMETERS: State - structure which stores algorithm state NOTES: 1. algorithm uses 4-point central formula for differentiation. 2. differentiation step along I-th axis is equal to DiffStep*S[I] where S[] is scaling vector which can be set by MinLBFGSSetScale() call. 3. we recommend you to use moderate values of differentiation step. Too large step will result in too large truncation errors, while too small step will result in too large numerical errors. 1.0E-6 can be good value to start with. 4. Numerical differentiation is very inefficient - one gradient calculation needs 4*N function evaluations. This function will work for any N - either small (1...10), moderate (10...100) or large (100...). However, performance penalty will be too severe for any N's except for small ones. We should also say that code which relies on numerical differentiation is less robust and precise. LBFGS needs exact gradient values. Imprecise gradient may slow down convergence, especially on highly nonlinear problems. Thus we recommend to use this function for fast prototyping on small- dimensional problems only, and to implement analytical gradient as soon as possible. -- ALGLIB -- Copyright 16.05.2011 by Bochkanov Sergey *************************************************************************/ void minlbfgscreatef(const ae_int_t m, const real_1d_array &x, const double diffstep, minlbfgsstate &state) { alglib_impl::ae_state _alglib_env_state; ae_int_t n; n = x.length(); alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minlbfgscreatef(n, m, const_cast(x.c_ptr()), diffstep, const_cast(state.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function sets stopping conditions for L-BFGS optimization algorithm. INPUT PARAMETERS: State - structure which stores algorithm state EpsG - >=0 The subroutine finishes its work if the condition |v|=0 The subroutine finishes its work if on k+1-th iteration the condition |F(k+1)-F(k)|<=EpsF*max{|F(k)|,|F(k+1)|,1} is satisfied. EpsX - >=0 The subroutine finishes its work if on k+1-th iteration the condition |v|<=EpsX is fulfilled, where: * |.| means Euclidian norm * v - scaled step vector, v[i]=dx[i]/s[i] * dx - ste pvector, dx=X(k+1)-X(k) * s - scaling coefficients set by MinLBFGSSetScale() MaxIts - maximum number of iterations. If MaxIts=0, the number of iterations is unlimited. Passing EpsG=0, EpsF=0, EpsX=0 and MaxIts=0 (simultaneously) will lead to automatic stopping criterion selection (small EpsX). 
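EXAMPLE (illustrative sketch, not a verbatim ALGLIB sample): the snippet below combines the finite difference variant MinLBFGSCreateF() documented above with the stopping conditions described here. Only the target function is coded; the gradient is estimated numerically with DiffStep=1.0E-6. It assumes the function-only MinLBFGSOptimize() overload used with optimizers created by MinLBFGSCreateF(); the callback name rosenbrock is chosen for this example only.

    void rosenbrock(const real_1d_array &x, double &func, void *ptr)
    {
        // Rosenbrock function: f(x) = 100*(x1-x0^2)^2 + (1-x0)^2
        func = 100.0*(x[1]-x[0]*x[0])*(x[1]-x[0]*x[0]) + (1.0-x[0])*(1.0-x[0]);
    }

    // driver code:
    real_1d_array x = "[-1.2, 1.0]";
    minlbfgsstate state;
    minlbfgsreport rep;
    minlbfgscreatef(2, 2, x, 1.0E-6, state);         // numerical differentiation, DiffStep=1.0E-6
    minlbfgssetcond(state, 0.0, 0.0, 1.0E-8, 1000);  // stop on small scaled step EpsX or on MaxIts
    minlbfgsoptimize(state, rosenbrock);
    minlbfgsresults(state, x, rep);                  // x is approximately [1,1]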
-- ALGLIB -- Copyright 02.04.2010 by Bochkanov Sergey *************************************************************************/ void minlbfgssetcond(const minlbfgsstate &state, const double epsg, const double epsf, const double epsx, const ae_int_t maxits) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minlbfgssetcond(const_cast(state.c_ptr()), epsg, epsf, epsx, maxits, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function turns on/off reporting. INPUT PARAMETERS: State - structure which stores algorithm state NeedXRep- whether iteration reports are needed or not If NeedXRep is True, algorithm will call rep() callback function if it is provided to MinLBFGSOptimize(). -- ALGLIB -- Copyright 02.04.2010 by Bochkanov Sergey *************************************************************************/ void minlbfgssetxrep(const minlbfgsstate &state, const bool needxrep) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minlbfgssetxrep(const_cast(state.c_ptr()), needxrep, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function sets maximum step length INPUT PARAMETERS: State - structure which stores algorithm state StpMax - maximum step length, >=0. Set StpMax to 0.0 (default), if you don't want to limit step length. Use this subroutine when you optimize target function which contains exp() or other fast growing functions, and optimization algorithm makes too large steps which leads to overflow. This function allows us to reject steps that are too large (and therefore expose us to the possible overflow) without actually calculating function value at the x+stp*d. -- ALGLIB -- Copyright 02.04.2010 by Bochkanov Sergey *************************************************************************/ void minlbfgssetstpmax(const minlbfgsstate &state, const double stpmax) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minlbfgssetstpmax(const_cast(state.c_ptr()), stpmax, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function sets scaling coefficients for LBFGS optimizer. ALGLIB optimizers use scaling matrices to test stopping conditions (step size and gradient are scaled before comparison with tolerances). Scale of the I-th variable is a translation invariant measure of: a) "how large" the variable is b) how large the step should be to make significant changes in the function Scaling is also used by finite difference variant of the optimizer - step along I-th axis is equal to DiffStep*S[I]. In most optimizers (and in the LBFGS too) scaling is NOT a form of preconditioning. It just affects stopping conditions. You should set preconditioner by separate call to one of the MinLBFGSSetPrec...() functions. There is special preconditioning mode, however, which uses scaling coefficients to form diagonal preconditioning matrix. 
You can turn this mode on, if you want. But you should understand that scaling is not the same thing as preconditioning - these are two different, although related forms of tuning solver. INPUT PARAMETERS: State - structure stores algorithm state S - array[N], non-zero scaling coefficients S[i] may be negative, sign doesn't matter. -- ALGLIB -- Copyright 14.01.2011 by Bochkanov Sergey *************************************************************************/ void minlbfgssetscale(const minlbfgsstate &state, const real_1d_array &s) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minlbfgssetscale(const_cast(state.c_ptr()), const_cast(s.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Modification of the preconditioner: default preconditioner (simple scaling, same for all elements of X) is used. INPUT PARAMETERS: State - structure which stores algorithm state NOTE: you can change preconditioner "on the fly", during algorithm iterations. -- ALGLIB -- Copyright 13.10.2010 by Bochkanov Sergey *************************************************************************/ void minlbfgssetprecdefault(const minlbfgsstate &state) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minlbfgssetprecdefault(const_cast(state.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Modification of the preconditioner: Cholesky factorization of approximate Hessian is used. INPUT PARAMETERS: State - structure which stores algorithm state P - triangular preconditioner, Cholesky factorization of the approximate Hessian. array[0..N-1,0..N-1], (if larger, only leading N elements are used). IsUpper - whether upper or lower triangle of P is given (other triangle is not referenced) After call to this function preconditioner is changed to P (P is copied into the internal buffer). NOTE: you can change preconditioner "on the fly", during algorithm iterations. NOTE 2: P should be nonsingular. Exception will be thrown otherwise. -- ALGLIB -- Copyright 13.10.2010 by Bochkanov Sergey *************************************************************************/ void minlbfgssetpreccholesky(const minlbfgsstate &state, const real_2d_array &p, const bool isupper) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minlbfgssetpreccholesky(const_cast(state.c_ptr()), const_cast(p.c_ptr()), isupper, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Modification of the preconditioner: diagonal of approximate Hessian is used. INPUT PARAMETERS: State - structure which stores algorithm state D - diagonal of the approximate Hessian, array[0..N-1], (if larger, only leading N elements are used). NOTE: you can change preconditioner "on the fly", during algorithm iterations. NOTE 2: D[i] should be positive. Exception will be thrown otherwise. 
NOTE 3: you should pass diagonal of approximate Hessian - NOT ITS INVERSE. -- ALGLIB -- Copyright 13.10.2010 by Bochkanov Sergey *************************************************************************/ void minlbfgssetprecdiag(const minlbfgsstate &state, const real_1d_array &d) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minlbfgssetprecdiag(const_cast(state.c_ptr()), const_cast(d.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Modification of the preconditioner: scale-based diagonal preconditioning. This preconditioning mode can be useful when you don't have approximate diagonal of Hessian, but you know that your variables are badly scaled (for example, one variable is in [1,10], and another in [1000,100000]), and most part of the ill-conditioning comes from different scales of vars. In this case simple scale-based preconditioner, with H[i] = 1/(s[i]^2), can greatly improve convergence. IMPRTANT: you should set scale of your variables with MinLBFGSSetScale() call (before or after MinLBFGSSetPrecScale() call). Without knowledge of the scale of your variables scale-based preconditioner will be just unit matrix. INPUT PARAMETERS: State - structure which stores algorithm state -- ALGLIB -- Copyright 13.10.2010 by Bochkanov Sergey *************************************************************************/ void minlbfgssetprecscale(const minlbfgsstate &state) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minlbfgssetprecscale(const_cast(state.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function provides reverse communication interface Reverse communication interface is not documented or recommended to use. 
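The scaling, reporting, step-limit and preconditioner setters documented above are usually applied once, right after the optimizer is created. A short sketch follows; the scale values, the Hessian diagonal and the helper names are illustrative assumptions:

    #include <cstdio>
    #include "optimization.h"
    using namespace alglib;

    // Pass demo_rep as the third argument of minlbfgsoptimize() to receive reports
    static void demo_rep(const real_1d_array &x, double func, void *ptr)
    {
        printf("f = %.6f\n", func);          // one line per accepted iterate
    }

    static void tune_lbfgs(minlbfgsstate &state)
    {
        real_1d_array s = "[1.0,1000.0]";    // typical magnitudes of the two variables
        real_1d_array d = "[2.0,2.0]";       // approximate Hessian diagonal, strictly positive
        minlbfgssetscale(state, s);          // scaling: stopping tests and finite-difference steps
        minlbfgssetprecdiag(state, d);       // preconditioner proper (diagonal of H, not its inverse)
        minlbfgssetxrep(state, true);        // enable iteration reports
        minlbfgssetstpmax(state, 10.0);      // optional cap on step length
    }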
See below for functions which provide better documented API *************************************************************************/ bool minlbfgsiteration(const minlbfgsstate &state) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { ae_bool result = alglib_impl::minlbfgsiteration(const_cast(state.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void minlbfgsoptimize(minlbfgsstate &state, void (*func)(const real_1d_array &x, double &func, void *ptr), void (*rep)(const real_1d_array &x, double func, void *ptr), void *ptr) { alglib_impl::ae_state _alglib_env_state; if( func==NULL ) throw ap_error("ALGLIB: error in 'minlbfgsoptimize()' (func is NULL)"); alglib_impl::ae_state_init(&_alglib_env_state); try { while( alglib_impl::minlbfgsiteration(state.c_ptr(), &_alglib_env_state) ) { if( state.needf ) { func(state.x, state.f, ptr); continue; } if( state.xupdated ) { if( rep!=NULL ) rep(state.x, state.f, ptr); continue; } throw ap_error("ALGLIB: error in 'minlbfgsoptimize' (some derivatives were not provided?)"); } alglib_impl::ae_state_clear(&_alglib_env_state); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void minlbfgsoptimize(minlbfgsstate &state, void (*grad)(const real_1d_array &x, double &func, real_1d_array &grad, void *ptr), void (*rep)(const real_1d_array &x, double func, void *ptr), void *ptr) { alglib_impl::ae_state _alglib_env_state; if( grad==NULL ) throw ap_error("ALGLIB: error in 'minlbfgsoptimize()' (grad is NULL)"); alglib_impl::ae_state_init(&_alglib_env_state); try { while( alglib_impl::minlbfgsiteration(state.c_ptr(), &_alglib_env_state) ) { if( state.needfg ) { grad(state.x, state.f, state.g, ptr); continue; } if( state.xupdated ) { if( rep!=NULL ) rep(state.x, state.f, ptr); continue; } throw ap_error("ALGLIB: error in 'minlbfgsoptimize' (some derivatives were not provided?)"); } alglib_impl::ae_state_clear(&_alglib_env_state); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* L-BFGS algorithm results INPUT PARAMETERS: State - algorithm state OUTPUT PARAMETERS: X - array[0..N-1], solution Rep - optimization report: * Rep.TerminationType completetion code: * -8 internal integrity control detected infinite or NAN values in function/gradient. Abnormal termination signalled. * -7 gradient verification failed. See MinLBFGSSetGradientCheck() for more information. * -2 rounding errors prevent further improvement. X contains best point found. * -1 incorrect parameters were specified * 1 relative function improvement is no more than EpsF. * 2 relative step is no more than EpsX. * 4 gradient norm is no more than EpsG * 5 MaxIts steps was taken * 7 stopping conditions are too stringent, further improvement is impossible * 8 terminated by user who called minlbfgsrequesttermination(). X contains point which was "current accepted" when termination request was submitted. 
* Rep.IterationsCount contains iterations count * NFEV countains number of function calculations -- ALGLIB -- Copyright 02.04.2010 by Bochkanov Sergey *************************************************************************/ void minlbfgsresults(const minlbfgsstate &state, real_1d_array &x, minlbfgsreport &rep) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minlbfgsresults(const_cast(state.c_ptr()), const_cast(x.c_ptr()), const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* L-BFGS algorithm results Buffered implementation of MinLBFGSResults which uses pre-allocated buffer to store X[]. If buffer size is too small, it resizes buffer. It is intended to be used in the inner cycles of performance critical algorithms where array reallocation penalty is too large to be ignored. -- ALGLIB -- Copyright 20.08.2010 by Bochkanov Sergey *************************************************************************/ void minlbfgsresultsbuf(const minlbfgsstate &state, real_1d_array &x, minlbfgsreport &rep) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minlbfgsresultsbuf(const_cast(state.c_ptr()), const_cast(x.c_ptr()), const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This subroutine restarts LBFGS algorithm from new point. All optimization parameters are left unchanged. This function allows to solve multiple optimization problems (which must have same number of dimensions) without object reallocation penalty. INPUT PARAMETERS: State - structure used to store algorithm state X - new starting point. -- ALGLIB -- Copyright 30.07.2010 by Bochkanov Sergey *************************************************************************/ void minlbfgsrestartfrom(const minlbfgsstate &state, const real_1d_array &x) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minlbfgsrestartfrom(const_cast(state.c_ptr()), const_cast(x.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This subroutine submits request for termination of running optimizer. It should be called from user-supplied callback when user decides that it is time to "smoothly" terminate optimization process. As result, optimizer stops at point which was "current accepted" when termination request was submitted and returns error code 8 (successful termination). INPUT PARAMETERS: State - optimizer structure NOTE: after request for termination optimizer may perform several additional calls to user-supplied callbacks. It does NOT guarantee to stop immediately - it just guarantees that these additional calls will be discarded later. NOTE: calling this function on optimizer which is NOT running will have no effect. NOTE: multiple calls to this function are possible. First call is counted, subsequent calls are silently ignored. 
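As a sketch of the results/restart cycle described above (the helper name and starting points are assumptions, and demo_grad is the illustrative callback from the earlier sketch):

    #include "optimization.h"
    using namespace alglib;

    void demo_grad(const real_1d_array &x, double &func, real_1d_array &grad, void *ptr);

    // Solve two problems of the same dimension with one state object,
    // assuming 'state' was prepared with minlbfgscreate() beforehand
    static void solve_twice(minlbfgsstate &state, minlbfgsreport &rep)
    {
        real_1d_array x1 = "[0,0]", x2 = "[5,5]";
        minlbfgsoptimize(state, demo_grad);
        minlbfgsresults(state, x1, rep);      // first solution and report
        minlbfgsrestartfrom(state, x2);       // same N, new starting point, settings kept
        minlbfgsoptimize(state, demo_grad);
        minlbfgsresultsbuf(state, x2, rep);   // buffered variant reuses x2's storage
    }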
-- ALGLIB -- Copyright 08.10.2014 by Bochkanov Sergey *************************************************************************/ void minlbfgsrequesttermination(const minlbfgsstate &state) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minlbfgsrequesttermination(const_cast(state.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This subroutine turns on verification of the user-supplied analytic gradient: * user calls this subroutine before optimization begins * MinLBFGSOptimize() is called * prior to actual optimization, for each component of parameters being optimized X[i] algorithm performs following steps: * two trial steps are made to X[i]-TestStep*S[i] and X[i]+TestStep*S[i], where X[i] is i-th component of the initial point and S[i] is a scale of i-th parameter * if needed, steps are bounded with respect to constraints on X[] * F(X) is evaluated at these trial points * we perform one more evaluation in the middle point of the interval * we build cubic model using function values and derivatives at trial points and we compare its prediction with actual value in the middle point * in case difference between prediction and actual value is higher than some predetermined threshold, algorithm stops with completion code -7; Rep.VarIdx is set to index of the parameter with incorrect derivative. * after verification is over, algorithm proceeds to the actual optimization. NOTE 1: verification needs N (parameters count) gradient evaluations. It is very costly and you should use it only for low dimensional problems, when you want to be sure that you've correctly calculated analytic derivatives. You should not use it in the production code (unless you want to check derivatives provided by some third party). NOTE 2: you should carefully choose TestStep. Value which is too large (so large that function behaviour is significantly non-cubic) will lead to false alarms. You may use different step for different parameters by means of setting scale with MinLBFGSSetScale(). NOTE 3: this function may lead to false positives. In case it reports that I-th derivative was calculated incorrectly, you may decrease test step and try one more time - maybe your function changes too sharply and your step is too large for such rapidly chanding function. INPUT PARAMETERS: State - structure used to store algorithm state TestStep - verification step: * TestStep=0 turns verification off * TestStep>0 activates verification -- ALGLIB -- Copyright 24.05.2012 by Bochkanov Sergey *************************************************************************/ void minlbfgssetgradientcheck(const minlbfgsstate &state, const double teststep) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minlbfgssetgradientcheck(const_cast(state.c_ptr()), teststep, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This object stores state of the nonlinear CG optimizer. You should use ALGLIB functions to work with this object. 
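A short sketch of how gradient verification could be wired in, assuming the illustrative demo_grad callback and TestStep value; the -7 completion code and the Rep.VarIdx field are those documented above:

    #include <cstdio>
    #include "optimization.h"
    using namespace alglib;

    void demo_grad(const real_1d_array &x, double &func, real_1d_array &grad, void *ptr);

    // Verify an analytic gradient before optimizing (TestStep value is an illustrative choice)
    static void check_gradient(minlbfgsstate &state, real_1d_array &x, minlbfgsreport &rep)
    {
        minlbfgssetgradientcheck(state, 1.0e-3);   // TestStep>0 turns verification on
        minlbfgsoptimize(state, demo_grad);
        minlbfgsresults(state, x, rep);
        if( rep.terminationtype==-7 )
            printf("suspicious derivative at variable %d\n", int(rep.varidx));
    }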
*************************************************************************/ _mincgstate_owner::_mincgstate_owner() { p_struct = (alglib_impl::mincgstate*)alglib_impl::ae_malloc(sizeof(alglib_impl::mincgstate), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_mincgstate_init(p_struct, NULL); } _mincgstate_owner::_mincgstate_owner(const _mincgstate_owner &rhs) { p_struct = (alglib_impl::mincgstate*)alglib_impl::ae_malloc(sizeof(alglib_impl::mincgstate), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_mincgstate_init_copy(p_struct, const_cast(rhs.p_struct), NULL); } _mincgstate_owner& _mincgstate_owner::operator=(const _mincgstate_owner &rhs) { if( this==&rhs ) return *this; alglib_impl::_mincgstate_clear(p_struct); alglib_impl::_mincgstate_init_copy(p_struct, const_cast(rhs.p_struct), NULL); return *this; } _mincgstate_owner::~_mincgstate_owner() { alglib_impl::_mincgstate_clear(p_struct); ae_free(p_struct); } alglib_impl::mincgstate* _mincgstate_owner::c_ptr() { return p_struct; } alglib_impl::mincgstate* _mincgstate_owner::c_ptr() const { return const_cast(p_struct); } mincgstate::mincgstate() : _mincgstate_owner() ,needf(p_struct->needf),needfg(p_struct->needfg),xupdated(p_struct->xupdated),f(p_struct->f),g(&p_struct->g),x(&p_struct->x) { } mincgstate::mincgstate(const mincgstate &rhs):_mincgstate_owner(rhs) ,needf(p_struct->needf),needfg(p_struct->needfg),xupdated(p_struct->xupdated),f(p_struct->f),g(&p_struct->g),x(&p_struct->x) { } mincgstate& mincgstate::operator=(const mincgstate &rhs) { if( this==&rhs ) return *this; _mincgstate_owner::operator=(rhs); return *this; } mincgstate::~mincgstate() { } /************************************************************************* This structure stores optimization report: * IterationsCount total number of inner iterations * NFEV number of gradient evaluations * TerminationType termination type (see below) TERMINATION CODES TerminationType field contains completion code, which can be: -8 internal integrity control detected infinite or NAN values in function/gradient. Abnormal termination signalled. -7 gradient verification failed. See MinCGSetGradientCheck() for more information. 1 relative function improvement is no more than EpsF. 2 relative step is no more than EpsX. 4 gradient norm is no more than EpsG 5 MaxIts steps was taken 7 stopping conditions are too stringent, further improvement is impossible, X contains best point found so far. 8 terminated by user who called mincgrequesttermination(). X contains point which was "current accepted" when termination request was submitted. Other fields of this structure are not documented and should not be used! 
*************************************************************************/ _mincgreport_owner::_mincgreport_owner() { p_struct = (alglib_impl::mincgreport*)alglib_impl::ae_malloc(sizeof(alglib_impl::mincgreport), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_mincgreport_init(p_struct, NULL); } _mincgreport_owner::_mincgreport_owner(const _mincgreport_owner &rhs) { p_struct = (alglib_impl::mincgreport*)alglib_impl::ae_malloc(sizeof(alglib_impl::mincgreport), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_mincgreport_init_copy(p_struct, const_cast(rhs.p_struct), NULL); } _mincgreport_owner& _mincgreport_owner::operator=(const _mincgreport_owner &rhs) { if( this==&rhs ) return *this; alglib_impl::_mincgreport_clear(p_struct); alglib_impl::_mincgreport_init_copy(p_struct, const_cast(rhs.p_struct), NULL); return *this; } _mincgreport_owner::~_mincgreport_owner() { alglib_impl::_mincgreport_clear(p_struct); ae_free(p_struct); } alglib_impl::mincgreport* _mincgreport_owner::c_ptr() { return p_struct; } alglib_impl::mincgreport* _mincgreport_owner::c_ptr() const { return const_cast(p_struct); } mincgreport::mincgreport() : _mincgreport_owner() ,iterationscount(p_struct->iterationscount),nfev(p_struct->nfev),varidx(p_struct->varidx),terminationtype(p_struct->terminationtype) { } mincgreport::mincgreport(const mincgreport &rhs):_mincgreport_owner(rhs) ,iterationscount(p_struct->iterationscount),nfev(p_struct->nfev),varidx(p_struct->varidx),terminationtype(p_struct->terminationtype) { } mincgreport& mincgreport::operator=(const mincgreport &rhs) { if( this==&rhs ) return *this; _mincgreport_owner::operator=(rhs); return *this; } mincgreport::~mincgreport() { } /************************************************************************* NONLINEAR CONJUGATE GRADIENT METHOD DESCRIPTION: The subroutine minimizes function F(x) of N arguments by using one of the nonlinear conjugate gradient methods. These CG methods are globally convergent (even on non-convex functions) as long as grad(f) is Lipschitz continuous in a some neighborhood of the L = { x : f(x)<=f(x0) }. REQUIREMENTS: Algorithm will request following information during its operation: * function value F and its gradient G (simultaneously) at given point X USAGE: 1. User initializes algorithm state with MinCGCreate() call 2. User tunes solver parameters with MinCGSetCond(), MinCGSetStpMax() and other functions 3. User calls MinCGOptimize() function which takes algorithm state and pointer (delegate, etc.) to callback function which calculates F/G. 4. User calls MinCGResults() to get solution 5. Optionally, user may call MinCGRestartFrom() to solve another problem with same N but another starting point and/or another function. MinCGRestartFrom() allows to reuse already initialized structure. INPUT PARAMETERS: N - problem dimension, N>0: * if given, only leading N elements of X are used * if not given, automatically determined from size of X X - starting point, array[0..N-1]. 
OUTPUT PARAMETERS: State - structure which stores algorithm state -- ALGLIB -- Copyright 25.03.2010 by Bochkanov Sergey *************************************************************************/ void mincgcreate(const ae_int_t n, const real_1d_array &x, mincgstate &state) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::mincgcreate(n, const_cast(x.c_ptr()), const_cast(state.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* NONLINEAR CONJUGATE GRADIENT METHOD DESCRIPTION: The subroutine minimizes function F(x) of N arguments by using one of the nonlinear conjugate gradient methods. These CG methods are globally convergent (even on non-convex functions) as long as grad(f) is Lipschitz continuous in a some neighborhood of the L = { x : f(x)<=f(x0) }. REQUIREMENTS: Algorithm will request following information during its operation: * function value F and its gradient G (simultaneously) at given point X USAGE: 1. User initializes algorithm state with MinCGCreate() call 2. User tunes solver parameters with MinCGSetCond(), MinCGSetStpMax() and other functions 3. User calls MinCGOptimize() function which takes algorithm state and pointer (delegate, etc.) to callback function which calculates F/G. 4. User calls MinCGResults() to get solution 5. Optionally, user may call MinCGRestartFrom() to solve another problem with same N but another starting point and/or another function. MinCGRestartFrom() allows to reuse already initialized structure. INPUT PARAMETERS: N - problem dimension, N>0: * if given, only leading N elements of X are used * if not given, automatically determined from size of X X - starting point, array[0..N-1]. OUTPUT PARAMETERS: State - structure which stores algorithm state -- ALGLIB -- Copyright 25.03.2010 by Bochkanov Sergey *************************************************************************/ void mincgcreate(const real_1d_array &x, mincgstate &state) { alglib_impl::ae_state _alglib_env_state; ae_int_t n; n = x.length(); alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::mincgcreate(n, const_cast(x.c_ptr()), const_cast(state.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* The subroutine is finite difference variant of MinCGCreate(). It uses finite differences in order to differentiate target function. Description below contains information which is specific to this function only. We recommend to read comments on MinCGCreate() in order to get more information about creation of CG optimizer. INPUT PARAMETERS: N - problem dimension, N>0: * if given, only leading N elements of X are used * if not given, automatically determined from size of X X - starting point, array[0..N-1]. DiffStep- differentiation step, >0 OUTPUT PARAMETERS: State - structure which stores algorithm state NOTES: 1. algorithm uses 4-point central formula for differentiation. 2. differentiation step along I-th axis is equal to DiffStep*S[I] where S[] is scaling vector which can be set by MinCGSetScale() call. 3. we recommend you to use moderate values of differentiation step. 
Too large step will result in too large truncation errors, while too small step will result in too large numerical errors. 1.0E-6 can be good value to start with. 4. Numerical differentiation is very inefficient - one gradient calculation needs 4*N function evaluations. This function will work for any N - either small (1...10), moderate (10...100) or large (100...). However, performance penalty will be too severe for any N's except for small ones. We should also say that code which relies on numerical differentiation is less robust and precise. L-BFGS needs exact gradient values. Imprecise gradient may slow down convergence, especially on highly nonlinear problems. Thus we recommend to use this function for fast prototyping on small- dimensional problems only, and to implement analytical gradient as soon as possible. -- ALGLIB -- Copyright 16.05.2011 by Bochkanov Sergey *************************************************************************/ void mincgcreatef(const ae_int_t n, const real_1d_array &x, const double diffstep, mincgstate &state) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::mincgcreatef(n, const_cast(x.c_ptr()), diffstep, const_cast(state.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* The subroutine is finite difference variant of MinCGCreate(). It uses finite differences in order to differentiate target function. Description below contains information which is specific to this function only. We recommend to read comments on MinCGCreate() in order to get more information about creation of CG optimizer. INPUT PARAMETERS: N - problem dimension, N>0: * if given, only leading N elements of X are used * if not given, automatically determined from size of X X - starting point, array[0..N-1]. DiffStep- differentiation step, >0 OUTPUT PARAMETERS: State - structure which stores algorithm state NOTES: 1. algorithm uses 4-point central formula for differentiation. 2. differentiation step along I-th axis is equal to DiffStep*S[I] where S[] is scaling vector which can be set by MinCGSetScale() call. 3. we recommend you to use moderate values of differentiation step. Too large step will result in too large truncation errors, while too small step will result in too large numerical errors. 1.0E-6 can be good value to start with. 4. Numerical differentiation is very inefficient - one gradient calculation needs 4*N function evaluations. This function will work for any N - either small (1...10), moderate (10...100) or large (100...). However, performance penalty will be too severe for any N's except for small ones. We should also say that code which relies on numerical differentiation is less robust and precise. L-BFGS needs exact gradient values. Imprecise gradient may slow down convergence, especially on highly nonlinear problems. Thus we recommend to use this function for fast prototyping on small- dimensional problems only, and to implement analytical gradient as soon as possible. 
-- ALGLIB -- Copyright 16.05.2011 by Bochkanov Sergey *************************************************************************/ void mincgcreatef(const real_1d_array &x, const double diffstep, mincgstate &state) { alglib_impl::ae_state _alglib_env_state; ae_int_t n; n = x.length(); alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::mincgcreatef(n, const_cast(x.c_ptr()), diffstep, const_cast(state.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function sets stopping conditions for CG optimization algorithm. INPUT PARAMETERS: State - structure which stores algorithm state EpsG - >=0 The subroutine finishes its work if the condition |v|=0 The subroutine finishes its work if on k+1-th iteration the condition |F(k+1)-F(k)|<=EpsF*max{|F(k)|,|F(k+1)|,1} is satisfied. EpsX - >=0 The subroutine finishes its work if on k+1-th iteration the condition |v|<=EpsX is fulfilled, where: * |.| means Euclidian norm * v - scaled step vector, v[i]=dx[i]/s[i] * dx - ste pvector, dx=X(k+1)-X(k) * s - scaling coefficients set by MinCGSetScale() MaxIts - maximum number of iterations. If MaxIts=0, the number of iterations is unlimited. Passing EpsG=0, EpsF=0, EpsX=0 and MaxIts=0 (simultaneously) will lead to automatic stopping criterion selection (small EpsX). -- ALGLIB -- Copyright 02.04.2010 by Bochkanov Sergey *************************************************************************/ void mincgsetcond(const mincgstate &state, const double epsg, const double epsf, const double epsx, const ae_int_t maxits) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::mincgsetcond(const_cast(state.c_ptr()), epsg, epsf, epsx, maxits, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function sets scaling coefficients for CG optimizer. ALGLIB optimizers use scaling matrices to test stopping conditions (step size and gradient are scaled before comparison with tolerances). Scale of the I-th variable is a translation invariant measure of: a) "how large" the variable is b) how large the step should be to make significant changes in the function Scaling is also used by finite difference variant of CG optimizer - step along I-th axis is equal to DiffStep*S[I]. In most optimizers (and in the CG too) scaling is NOT a form of preconditioning. It just affects stopping conditions. You should set preconditioner by separate call to one of the MinCGSetPrec...() functions. There is special preconditioning mode, however, which uses scaling coefficients to form diagonal preconditioning matrix. You can turn this mode on, if you want. But you should understand that scaling is not the same thing as preconditioning - these are two different, although related forms of tuning solver. INPUT PARAMETERS: State - structure stores algorithm state S - array[N], non-zero scaling coefficients S[i] may be negative, sign doesn't matter. 
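A minimal sketch of the CG workflow described above (create, set stopping conditions, optimize with an analytic gradient, read results); the badly scaled quadratic and the tolerance values are illustrative assumptions:

    #include <cstdio>
    #include "optimization.h"
    using namespace alglib;

    // Illustrative objective for the CG sketch: f(x) = (x0-1)^2 + 100*(x1+2)^2
    static void cg_grad(const real_1d_array &x, double &func, real_1d_array &grad, void *ptr)
    {
        func = (x[0]-1.0)*(x[0]-1.0) + 100.0*(x[1]+2.0)*(x[1]+2.0);
        grad[0] = 2.0*(x[0]-1.0);
        grad[1] = 200.0*(x[1]+2.0);
    }

    int main()
    {
        real_1d_array x = "[0,0]";
        mincgstate  state;
        mincgreport rep;
        mincgcreate(x, state);                   // N taken from x.length()
        mincgsetcond(state, 1.0e-10, 0, 0, 0);   // stop on small scaled gradient
        mincgoptimize(state, cg_grad);
        mincgresults(state, x, rep);
        printf("%d %s\n", int(rep.terminationtype), x.tostring(4).c_str());
        return 0;
    }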
-- ALGLIB -- Copyright 14.01.2011 by Bochkanov Sergey *************************************************************************/ void mincgsetscale(const mincgstate &state, const real_1d_array &s) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::mincgsetscale(const_cast(state.c_ptr()), const_cast(s.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function turns on/off reporting. INPUT PARAMETERS: State - structure which stores algorithm state NeedXRep- whether iteration reports are needed or not If NeedXRep is True, algorithm will call rep() callback function if it is provided to MinCGOptimize(). -- ALGLIB -- Copyright 02.04.2010 by Bochkanov Sergey *************************************************************************/ void mincgsetxrep(const mincgstate &state, const bool needxrep) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::mincgsetxrep(const_cast(state.c_ptr()), needxrep, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function sets CG algorithm. INPUT PARAMETERS: State - structure which stores algorithm state CGType - algorithm type: * -1 automatic selection of the best algorithm * 0 DY (Dai and Yuan) algorithm * 1 Hybrid DY-HS algorithm -- ALGLIB -- Copyright 02.04.2010 by Bochkanov Sergey *************************************************************************/ void mincgsetcgtype(const mincgstate &state, const ae_int_t cgtype) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::mincgsetcgtype(const_cast(state.c_ptr()), cgtype, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function sets maximum step length INPUT PARAMETERS: State - structure which stores algorithm state StpMax - maximum step length, >=0. Set StpMax to 0.0, if you don't want to limit step length. Use this subroutine when you optimize target function which contains exp() or other fast growing functions, and optimization algorithm makes too large steps which leads to overflow. This function allows us to reject steps that are too large (and therefore expose us to the possible overflow) without actually calculating function value at the x+stp*d. -- ALGLIB -- Copyright 02.04.2010 by Bochkanov Sergey *************************************************************************/ void mincgsetstpmax(const mincgstate &state, const double stpmax) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::mincgsetstpmax(const_cast(state.c_ptr()), stpmax, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function allows to suggest initial step length to the CG algorithm. 
Suggested step length is used as starting point for the line search. It can be useful when you have badly scaled problem, i.e. when ||grad|| (which is used as initial estimate for the first step) is many orders of magnitude different from the desired step. Line search may fail on such problems without good estimate of initial step length. Imagine, for example, problem with ||grad||=10^50 and desired step equal to 0.1 Line search function will use 10^50 as initial step, then it will decrease step length by 2 (up to 20 attempts) and will get 10^44, which is still too large. This function allows us to tell than line search should be started from some moderate step length, like 1.0, so algorithm will be able to detect desired step length in a several searches. Default behavior (when no step is suggested) is to use preconditioner, if it is available, to generate initial estimate of step length. This function influences only first iteration of algorithm. It should be called between MinCGCreate/MinCGRestartFrom() call and MinCGOptimize call. Suggested step is ignored if you have preconditioner. INPUT PARAMETERS: State - structure used to store algorithm state. Stp - initial estimate of the step length. Can be zero (no estimate). -- ALGLIB -- Copyright 30.07.2010 by Bochkanov Sergey *************************************************************************/ void mincgsuggeststep(const mincgstate &state, const double stp) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::mincgsuggeststep(const_cast(state.c_ptr()), stp, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Modification of the preconditioner: preconditioning is turned off. INPUT PARAMETERS: State - structure which stores algorithm state NOTE: you can change preconditioner "on the fly", during algorithm iterations. -- ALGLIB -- Copyright 13.10.2010 by Bochkanov Sergey *************************************************************************/ void mincgsetprecdefault(const mincgstate &state) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::mincgsetprecdefault(const_cast(state.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Modification of the preconditioner: diagonal of approximate Hessian is used. INPUT PARAMETERS: State - structure which stores algorithm state D - diagonal of the approximate Hessian, array[0..N-1], (if larger, only leading N elements are used). NOTE: you can change preconditioner "on the fly", during algorithm iterations. NOTE 2: D[i] should be positive. Exception will be thrown otherwise. NOTE 3: you should pass diagonal of approximate Hessian - NOT ITS INVERSE. 
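A one-call sketch of the suggested-step facility, assuming 1.0 is a reasonable moderate first step for the problem at hand:

    #include "optimization.h"
    using namespace alglib;

    // Helps the very first line search on badly scaled problems; it affects the
    // first iteration only and is ignored once a preconditioner is set.
    static void warm_start_cg(mincgstate &state)
    {
        mincgsuggeststep(state, 1.0);   // call between mincgcreate()/mincgrestartfrom() and mincgoptimize()
    }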
-- ALGLIB -- Copyright 13.10.2010 by Bochkanov Sergey *************************************************************************/ void mincgsetprecdiag(const mincgstate &state, const real_1d_array &d) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::mincgsetprecdiag(const_cast(state.c_ptr()), const_cast(d.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Modification of the preconditioner: scale-based diagonal preconditioning. This preconditioning mode can be useful when you don't have approximate diagonal of Hessian, but you know that your variables are badly scaled (for example, one variable is in [1,10], and another in [1000,100000]), and most part of the ill-conditioning comes from different scales of vars. In this case simple scale-based preconditioner, with H[i] = 1/(s[i]^2), can greatly improve convergence. IMPRTANT: you should set scale of your variables with MinCGSetScale() call (before or after MinCGSetPrecScale() call). Without knowledge of the scale of your variables scale-based preconditioner will be just unit matrix. INPUT PARAMETERS: State - structure which stores algorithm state NOTE: you can change preconditioner "on the fly", during algorithm iterations. -- ALGLIB -- Copyright 13.10.2010 by Bochkanov Sergey *************************************************************************/ void mincgsetprecscale(const mincgstate &state) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::mincgsetprecscale(const_cast(state.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function provides reverse communication interface Reverse communication interface is not documented or recommended to use. 
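A sketch of scale-based preconditioning for CG, assuming illustrative variable scales; note that MinCGSetScale() must be called for this mode to have any effect:

    #include "optimization.h"
    using namespace alglib;

    // Scale-based preconditioning builds H[i] = 1/(s[i]^2) from the scale vector
    static void scale_precondition_cg(mincgstate &state)
    {
        real_1d_array s = "[1.0,100000.0]";   // typical magnitudes of the variables
        mincgsetscale(state, s);
        mincgsetprecscale(state);             // without the scale call this is just the unit matrix
    }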
See below for functions which provide better documented API *************************************************************************/ bool mincgiteration(const mincgstate &state) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { ae_bool result = alglib_impl::mincgiteration(const_cast(state.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void mincgoptimize(mincgstate &state, void (*func)(const real_1d_array &x, double &func, void *ptr), void (*rep)(const real_1d_array &x, double func, void *ptr), void *ptr) { alglib_impl::ae_state _alglib_env_state; if( func==NULL ) throw ap_error("ALGLIB: error in 'mincgoptimize()' (func is NULL)"); alglib_impl::ae_state_init(&_alglib_env_state); try { while( alglib_impl::mincgiteration(state.c_ptr(), &_alglib_env_state) ) { if( state.needf ) { func(state.x, state.f, ptr); continue; } if( state.xupdated ) { if( rep!=NULL ) rep(state.x, state.f, ptr); continue; } throw ap_error("ALGLIB: error in 'mincgoptimize' (some derivatives were not provided?)"); } alglib_impl::ae_state_clear(&_alglib_env_state); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void mincgoptimize(mincgstate &state, void (*grad)(const real_1d_array &x, double &func, real_1d_array &grad, void *ptr), void (*rep)(const real_1d_array &x, double func, void *ptr), void *ptr) { alglib_impl::ae_state _alglib_env_state; if( grad==NULL ) throw ap_error("ALGLIB: error in 'mincgoptimize()' (grad is NULL)"); alglib_impl::ae_state_init(&_alglib_env_state); try { while( alglib_impl::mincgiteration(state.c_ptr(), &_alglib_env_state) ) { if( state.needfg ) { grad(state.x, state.f, state.g, ptr); continue; } if( state.xupdated ) { if( rep!=NULL ) rep(state.x, state.f, ptr); continue; } throw ap_error("ALGLIB: error in 'mincgoptimize' (some derivatives were not provided?)"); } alglib_impl::ae_state_clear(&_alglib_env_state); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Conjugate gradient results INPUT PARAMETERS: State - algorithm state OUTPUT PARAMETERS: X - array[0..N-1], solution Rep - optimization report: * Rep.TerminationType completetion code: * -8 internal integrity control detected infinite or NAN values in function/gradient. Abnormal termination signalled. * -7 gradient verification failed. See MinCGSetGradientCheck() for more information. * 1 relative function improvement is no more than EpsF. * 2 relative step is no more than EpsX. 
* 4 gradient norm is no more than EpsG * 5 MaxIts steps was taken * 7 stopping conditions are too stringent, further improvement is impossible, we return best X found so far * 8 terminated by user * Rep.IterationsCount contains iterations count * NFEV countains number of function calculations -- ALGLIB -- Copyright 20.04.2009 by Bochkanov Sergey *************************************************************************/ void mincgresults(const mincgstate &state, real_1d_array &x, mincgreport &rep) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::mincgresults(const_cast(state.c_ptr()), const_cast(x.c_ptr()), const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Conjugate gradient results Buffered implementation of MinCGResults(), which uses pre-allocated buffer to store X[]. If buffer size is too small, it resizes buffer. It is intended to be used in the inner cycles of performance critical algorithms where array reallocation penalty is too large to be ignored. -- ALGLIB -- Copyright 20.04.2009 by Bochkanov Sergey *************************************************************************/ void mincgresultsbuf(const mincgstate &state, real_1d_array &x, mincgreport &rep) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::mincgresultsbuf(const_cast(state.c_ptr()), const_cast(x.c_ptr()), const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This subroutine restarts CG algorithm from new point. All optimization parameters are left unchanged. This function allows to solve multiple optimization problems (which must have same number of dimensions) without object reallocation penalty. INPUT PARAMETERS: State - structure used to store algorithm state. X - new starting point. -- ALGLIB -- Copyright 30.07.2010 by Bochkanov Sergey *************************************************************************/ void mincgrestartfrom(const mincgstate &state, const real_1d_array &x) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::mincgrestartfrom(const_cast(state.c_ptr()), const_cast(x.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This subroutine submits request for termination of running optimizer. It should be called from user-supplied callback when user decides that it is time to "smoothly" terminate optimization process. As result, optimizer stops at point which was "current accepted" when termination request was submitted and returns error code 8 (successful termination). INPUT PARAMETERS: State - optimizer structure NOTE: after request for termination optimizer may perform several additional calls to user-supplied callbacks. It does NOT guarantee to stop immediately - it just guarantees that these additional calls will be discarded later. 
NOTE: calling this function on optimizer which is NOT running will have no effect. NOTE: multiple calls to this function are possible. First call is counted, subsequent calls are silently ignored. -- ALGLIB -- Copyright 08.10.2014 by Bochkanov Sergey *************************************************************************/ void mincgrequesttermination(const mincgstate &state) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::mincgrequesttermination(const_cast(state.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This subroutine turns on verification of the user-supplied analytic gradient: * user calls this subroutine before optimization begins * MinCGOptimize() is called * prior to actual optimization, for each component of parameters being optimized X[i] algorithm performs following steps: * two trial steps are made to X[i]-TestStep*S[i] and X[i]+TestStep*S[i], where X[i] is i-th component of the initial point and S[i] is a scale of i-th parameter * F(X) is evaluated at these trial points * we perform one more evaluation in the middle point of the interval * we build cubic model using function values and derivatives at trial points and we compare its prediction with actual value in the middle point * in case difference between prediction and actual value is higher than some predetermined threshold, algorithm stops with completion code -7; Rep.VarIdx is set to index of the parameter with incorrect derivative. * after verification is over, algorithm proceeds to the actual optimization. NOTE 1: verification needs N (parameters count) gradient evaluations. It is very costly and you should use it only for low dimensional problems, when you want to be sure that you've correctly calculated analytic derivatives. You should not use it in the production code (unless you want to check derivatives provided by some third party). NOTE 2: you should carefully choose TestStep. Value which is too large (so large that function behaviour is significantly non-cubic) will lead to false alarms. You may use different step for different parameters by means of setting scale with MinCGSetScale(). NOTE 3: this function may lead to false positives. In case it reports that I-th derivative was calculated incorrectly, you may decrease test step and try one more time - maybe your function changes too sharply and your step is too large for such rapidly chanding function. INPUT PARAMETERS: State - structure used to store algorithm state TestStep - verification step: * TestStep=0 turns verification off * TestStep>0 activates verification -- ALGLIB -- Copyright 31.05.2012 by Bochkanov Sergey *************************************************************************/ void mincgsetgradientcheck(const mincgstate &state, const double teststep) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::mincgsetgradientcheck(const_cast(state.c_ptr()), teststep, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This object stores nonlinear optimizer state. 
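A sketch of smooth termination requested from a user callback; the global counter, the iteration budget and the trick of passing the state through the opaque ptr argument are illustrative assumptions:

    #include "optimization.h"
    using namespace alglib;

    static int g_reports = 0;

    // Enable reports with mincgsetxrep(state,true) and call
    // mincgoptimize(state, cg_grad, cg_rep, &state) so that ptr carries the state.
    static void cg_rep(const real_1d_array &x, double func, void *ptr)
    {
        mincgstate *state = (mincgstate*)ptr;
        if( ++g_reports>=100 )
            mincgrequesttermination(*state);   // optimizer will stop with code 8
    }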
You should use functions provided by MinBLEIC subpackage to work with this object *************************************************************************/ _minbleicstate_owner::_minbleicstate_owner() { p_struct = (alglib_impl::minbleicstate*)alglib_impl::ae_malloc(sizeof(alglib_impl::minbleicstate), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_minbleicstate_init(p_struct, NULL); } _minbleicstate_owner::_minbleicstate_owner(const _minbleicstate_owner &rhs) { p_struct = (alglib_impl::minbleicstate*)alglib_impl::ae_malloc(sizeof(alglib_impl::minbleicstate), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_minbleicstate_init_copy(p_struct, const_cast(rhs.p_struct), NULL); } _minbleicstate_owner& _minbleicstate_owner::operator=(const _minbleicstate_owner &rhs) { if( this==&rhs ) return *this; alglib_impl::_minbleicstate_clear(p_struct); alglib_impl::_minbleicstate_init_copy(p_struct, const_cast(rhs.p_struct), NULL); return *this; } _minbleicstate_owner::~_minbleicstate_owner() { alglib_impl::_minbleicstate_clear(p_struct); ae_free(p_struct); } alglib_impl::minbleicstate* _minbleicstate_owner::c_ptr() { return p_struct; } alglib_impl::minbleicstate* _minbleicstate_owner::c_ptr() const { return const_cast(p_struct); } minbleicstate::minbleicstate() : _minbleicstate_owner() ,needf(p_struct->needf),needfg(p_struct->needfg),xupdated(p_struct->xupdated),f(p_struct->f),g(&p_struct->g),x(&p_struct->x) { } minbleicstate::minbleicstate(const minbleicstate &rhs):_minbleicstate_owner(rhs) ,needf(p_struct->needf),needfg(p_struct->needfg),xupdated(p_struct->xupdated),f(p_struct->f),g(&p_struct->g),x(&p_struct->x) { } minbleicstate& minbleicstate::operator=(const minbleicstate &rhs) { if( this==&rhs ) return *this; _minbleicstate_owner::operator=(rhs); return *this; } minbleicstate::~minbleicstate() { } /************************************************************************* This structure stores optimization report: * IterationsCount number of iterations * NFEV number of gradient evaluations * TerminationType termination type (see below) TERMINATION CODES TerminationType field contains completion code, which can be: -8 internal integrity control detected infinite or NAN values in function/gradient. Abnormal termination signalled. -7 gradient verification failed. See MinBLEICSetGradientCheck() for more information. -3 inconsistent constraints. Feasible point is either nonexistent or too hard to find. Try to restart optimizer with better initial approximation 1 relative function improvement is no more than EpsF. 2 relative step is no more than EpsX. 4 gradient norm is no more than EpsG 5 MaxIts steps was taken 7 stopping conditions are too stringent, further improvement is impossible, X contains best point found so far. 8 terminated by user who called minbleicrequesttermination(). X contains point which was "current accepted" when termination request was submitted. 
ADDITIONAL FIELDS There are additional fields which can be used for debugging: * DebugEqErr error in the equality constraints (2-norm) * DebugFS f, calculated at projection of initial point to the feasible set * DebugFF f, calculated at the final point * DebugDX |X_start-X_final| *************************************************************************/ _minbleicreport_owner::_minbleicreport_owner() { p_struct = (alglib_impl::minbleicreport*)alglib_impl::ae_malloc(sizeof(alglib_impl::minbleicreport), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_minbleicreport_init(p_struct, NULL); } _minbleicreport_owner::_minbleicreport_owner(const _minbleicreport_owner &rhs) { p_struct = (alglib_impl::minbleicreport*)alglib_impl::ae_malloc(sizeof(alglib_impl::minbleicreport), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_minbleicreport_init_copy(p_struct, const_cast(rhs.p_struct), NULL); } _minbleicreport_owner& _minbleicreport_owner::operator=(const _minbleicreport_owner &rhs) { if( this==&rhs ) return *this; alglib_impl::_minbleicreport_clear(p_struct); alglib_impl::_minbleicreport_init_copy(p_struct, const_cast(rhs.p_struct), NULL); return *this; } _minbleicreport_owner::~_minbleicreport_owner() { alglib_impl::_minbleicreport_clear(p_struct); ae_free(p_struct); } alglib_impl::minbleicreport* _minbleicreport_owner::c_ptr() { return p_struct; } alglib_impl::minbleicreport* _minbleicreport_owner::c_ptr() const { return const_cast(p_struct); } minbleicreport::minbleicreport() : _minbleicreport_owner() ,iterationscount(p_struct->iterationscount),nfev(p_struct->nfev),varidx(p_struct->varidx),terminationtype(p_struct->terminationtype),debugeqerr(p_struct->debugeqerr),debugfs(p_struct->debugfs),debugff(p_struct->debugff),debugdx(p_struct->debugdx),debugfeasqpits(p_struct->debugfeasqpits),debugfeasgpaits(p_struct->debugfeasgpaits),inneriterationscount(p_struct->inneriterationscount),outeriterationscount(p_struct->outeriterationscount) { } minbleicreport::minbleicreport(const minbleicreport &rhs):_minbleicreport_owner(rhs) ,iterationscount(p_struct->iterationscount),nfev(p_struct->nfev),varidx(p_struct->varidx),terminationtype(p_struct->terminationtype),debugeqerr(p_struct->debugeqerr),debugfs(p_struct->debugfs),debugff(p_struct->debugff),debugdx(p_struct->debugdx),debugfeasqpits(p_struct->debugfeasqpits),debugfeasgpaits(p_struct->debugfeasgpaits),inneriterationscount(p_struct->inneriterationscount),outeriterationscount(p_struct->outeriterationscount) { } minbleicreport& minbleicreport::operator=(const minbleicreport &rhs) { if( this==&rhs ) return *this; _minbleicreport_owner::operator=(rhs); return *this; } minbleicreport::~minbleicreport() { } /************************************************************************* BOUND CONSTRAINED OPTIMIZATION WITH ADDITIONAL LINEAR EQUALITY AND INEQUALITY CONSTRAINTS DESCRIPTION: The subroutine minimizes function F(x) of N arguments subject to any combination of: * bound constraints * linear inequality constraints * linear equality constraints REQUIREMENTS: * user must provide function value and gradient * starting point X0 must be feasible or not too far away from the feasible set * grad(f) must be Lipschitz continuous on a level set: L = { x : f(x)<=f(x0) } * function must be defined everywhere on the feasible set F USAGE: Constrained optimization if far more complex than the unconstrained one. Here we give very brief outline of the BLEIC optimizer. 
We strongly recommend you to read examples in the ALGLIB Reference Manual and to read ALGLIB User Guide on optimization, which is available at http://www.alglib.net/optimization/ 1. User initializes algorithm state with MinBLEICCreate() call 2. USer adds boundary and/or linear constraints by calling MinBLEICSetBC() and MinBLEICSetLC() functions. 3. User sets stopping conditions with MinBLEICSetCond(). 4. User calls MinBLEICOptimize() function which takes algorithm state and pointer (delegate, etc.) to callback function which calculates F/G. 5. User calls MinBLEICResults() to get solution 6. Optionally user may call MinBLEICRestartFrom() to solve another problem with same N but another starting point. MinBLEICRestartFrom() allows to reuse already initialized structure. NOTE: if you have box-only constraints (no general linear constraints), then MinBC optimizer can be better option. It uses special, faster constraint activation method, which performs better on problems with multiple constraints active at the solution. On small-scale problems performance of MinBC is similar to that of MinBLEIC, but on large-scale ones (hundreds and thousands of active constraints) it can be several times faster than MinBLEIC. INPUT PARAMETERS: N - problem dimension, N>0: * if given, only leading N elements of X are used * if not given, automatically determined from size ofX X - starting point, array[N]: * it is better to set X to a feasible point * but X can be infeasible, in which case algorithm will try to find feasible point first, using X as initial approximation. OUTPUT PARAMETERS: State - structure stores algorithm state -- ALGLIB -- Copyright 28.11.2010 by Bochkanov Sergey *************************************************************************/ void minbleiccreate(const ae_int_t n, const real_1d_array &x, minbleicstate &state) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minbleiccreate(n, const_cast(x.c_ptr()), const_cast(state.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* BOUND CONSTRAINED OPTIMIZATION WITH ADDITIONAL LINEAR EQUALITY AND INEQUALITY CONSTRAINTS DESCRIPTION: The subroutine minimizes function F(x) of N arguments subject to any combination of: * bound constraints * linear inequality constraints * linear equality constraints REQUIREMENTS: * user must provide function value and gradient * starting point X0 must be feasible or not too far away from the feasible set * grad(f) must be Lipschitz continuous on a level set: L = { x : f(x)<=f(x0) } * function must be defined everywhere on the feasible set F USAGE: Constrained optimization if far more complex than the unconstrained one. Here we give very brief outline of the BLEIC optimizer. We strongly recommend you to read examples in the ALGLIB Reference Manual and to read ALGLIB User Guide on optimization, which is available at http://www.alglib.net/optimization/ 1. User initializes algorithm state with MinBLEICCreate() call 2. USer adds boundary and/or linear constraints by calling MinBLEICSetBC() and MinBLEICSetLC() functions. 3. User sets stopping conditions with MinBLEICSetCond(). 4. User calls MinBLEICOptimize() function which takes algorithm state and pointer (delegate, etc.) to callback function which calculates F/G. 5. User calls MinBLEICResults() to get solution 6. 
Optionally user may call MinBLEICRestartFrom() to solve another problem with same N but another starting point. MinBLEICRestartFrom() allows to reuse already initialized structure. NOTE: if you have box-only constraints (no general linear constraints), then MinBC optimizer can be better option. It uses special, faster constraint activation method, which performs better on problems with multiple constraints active at the solution. On small-scale problems performance of MinBC is similar to that of MinBLEIC, but on large-scale ones (hundreds and thousands of active constraints) it can be several times faster than MinBLEIC. INPUT PARAMETERS: N - problem dimension, N>0: * if given, only leading N elements of X are used * if not given, automatically determined from size ofX X - starting point, array[N]: * it is better to set X to a feasible point * but X can be infeasible, in which case algorithm will try to find feasible point first, using X as initial approximation. OUTPUT PARAMETERS: State - structure stores algorithm state -- ALGLIB -- Copyright 28.11.2010 by Bochkanov Sergey *************************************************************************/ void minbleiccreate(const real_1d_array &x, minbleicstate &state) { alglib_impl::ae_state _alglib_env_state; ae_int_t n; n = x.length(); alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minbleiccreate(n, const_cast(x.c_ptr()), const_cast(state.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* The subroutine is finite difference variant of MinBLEICCreate(). It uses finite differences in order to differentiate target function. Description below contains information which is specific to this function only. We recommend to read comments on MinBLEICCreate() in order to get more information about creation of BLEIC optimizer. INPUT PARAMETERS: N - problem dimension, N>0: * if given, only leading N elements of X are used * if not given, automatically determined from size of X X - starting point, array[0..N-1]. DiffStep- differentiation step, >0 OUTPUT PARAMETERS: State - structure which stores algorithm state NOTES: 1. algorithm uses 4-point central formula for differentiation. 2. differentiation step along I-th axis is equal to DiffStep*S[I] where S[] is scaling vector which can be set by MinBLEICSetScale() call. 3. we recommend you to use moderate values of differentiation step. Too large step will result in too large truncation errors, while too small step will result in too large numerical errors. 1.0E-6 can be good value to start with. 4. Numerical differentiation is very inefficient - one gradient calculation needs 4*N function evaluations. This function will work for any N - either small (1...10), moderate (10...100) or large (100...). However, performance penalty will be too severe for any N's except for small ones. We should also say that code which relies on numerical differentiation is less robust and precise. CG needs exact gradient values. Imprecise gradient may slow down convergence, especially on highly nonlinear problems. Thus we recommend to use this function for fast prototyping on small- dimensional problems only, and to implement analytical gradient as soon as possible. 
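Where no analytic gradient is available, the finite-difference variant just described can be used for quick prototyping. The following is a minimal illustrative sketch (not part of the library source); the target function, differentiation step and tolerance are arbitrary placeholders:

#include <stdio.h>
#include "optimization.h"
using namespace alglib;

// function value only: f(x0,x1) = (x0-1)^2 + (x1+2)^2
void target_func(const real_1d_array &x, double &func, void *ptr)
{
    func = (x[0]-1.0)*(x[0]-1.0) + (x[1]+2.0)*(x[1]+2.0);
}

int main()
{
    real_1d_array x = "[0,0]";           // starting point
    double diffstep = 1.0e-6;            // differentiation step (see note 3 above)
    minbleicstate state;
    minbleicreport rep;

    minbleiccreatef(x, diffstep, state); // finite-difference variant, N taken from x.length()
    minbleicsetcond(state, 0, 0, 1.0e-6, 0);
    minbleicoptimize(state, target_func, NULL, NULL);
    minbleicresults(state, x, rep);
    printf("%s\n", x.tostring(4).c_str());
    return 0;
}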
-- ALGLIB -- Copyright 16.05.2011 by Bochkanov Sergey *************************************************************************/ void minbleiccreatef(const ae_int_t n, const real_1d_array &x, const double diffstep, minbleicstate &state) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minbleiccreatef(n, const_cast(x.c_ptr()), diffstep, const_cast(state.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* The subroutine is finite difference variant of MinBLEICCreate(). It uses finite differences in order to differentiate target function. Description below contains information which is specific to this function only. We recommend to read comments on MinBLEICCreate() in order to get more information about creation of BLEIC optimizer. INPUT PARAMETERS: N - problem dimension, N>0: * if given, only leading N elements of X are used * if not given, automatically determined from size of X X - starting point, array[0..N-1]. DiffStep- differentiation step, >0 OUTPUT PARAMETERS: State - structure which stores algorithm state NOTES: 1. algorithm uses 4-point central formula for differentiation. 2. differentiation step along I-th axis is equal to DiffStep*S[I] where S[] is scaling vector which can be set by MinBLEICSetScale() call. 3. we recommend you to use moderate values of differentiation step. Too large step will result in too large truncation errors, while too small step will result in too large numerical errors. 1.0E-6 can be good value to start with. 4. Numerical differentiation is very inefficient - one gradient calculation needs 4*N function evaluations. This function will work for any N - either small (1...10), moderate (10...100) or large (100...). However, performance penalty will be too severe for any N's except for small ones. We should also say that code which relies on numerical differentiation is less robust and precise. CG needs exact gradient values. Imprecise gradient may slow down convergence, especially on highly nonlinear problems. Thus we recommend to use this function for fast prototyping on small- dimensional problems only, and to implement analytical gradient as soon as possible. -- ALGLIB -- Copyright 16.05.2011 by Bochkanov Sergey *************************************************************************/ void minbleiccreatef(const real_1d_array &x, const double diffstep, minbleicstate &state) { alglib_impl::ae_state _alglib_env_state; ae_int_t n; n = x.length(); alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minbleiccreatef(n, const_cast(x.c_ptr()), diffstep, const_cast(state.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function sets boundary constraints for BLEIC optimizer. Boundary constraints are inactive by default (after initial creation). They are preserved after algorithm restart with MinBLEICRestartFrom(). NOTE: if you have box-only constraints (no general linear constraints), then MinBC optimizer can be better option. It uses special, faster constraint activation method, which performs better on problems with multiple constraints active at the solution. 
On small-scale problems performance of MinBC is similar to that of MinBLEIC, but on large-scale ones (hundreds and thousands of active constraints) it can be several times faster than MinBLEIC. INPUT PARAMETERS: State - structure stores algorithm state BndL - lower bounds, array[N]. If some (all) variables are unbounded, you may specify very small number or -INF. BndU - upper bounds, array[N]. If some (all) variables are unbounded, you may specify very large number or +INF. NOTE 1: it is possible to specify BndL[i]=BndU[i]. In this case I-th variable will be "frozen" at X[i]=BndL[i]=BndU[i]. NOTE 2: this solver has following useful properties: * bound constraints are always satisfied exactly * function is evaluated only INSIDE area specified by bound constraints, even when numerical differentiation is used (algorithm adjusts nodes according to boundary constraints) -- ALGLIB -- Copyright 28.11.2010 by Bochkanov Sergey *************************************************************************/ void minbleicsetbc(const minbleicstate &state, const real_1d_array &bndl, const real_1d_array &bndu) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minbleicsetbc(const_cast(state.c_ptr()), const_cast(bndl.c_ptr()), const_cast(bndu.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function sets linear constraints for BLEIC optimizer. Linear constraints are inactive by default (after initial creation). They are preserved after algorithm restart with MinBLEICRestartFrom(). INPUT PARAMETERS: State - structure previously allocated with MinBLEICCreate call. C - linear constraints, array[K,N+1]. Each row of C represents one constraint, either equality or inequality (see below): * first N elements correspond to coefficients, * last element corresponds to the right part. All elements of C (including right part) must be finite. CT - type of constraints, array[K]: * if CT[i]>0, then I-th constraint is C[i,*]*x >= C[i,n+1] * if CT[i]=0, then I-th constraint is C[i,*]*x = C[i,n+1] * if CT[i]<0, then I-th constraint is C[i,*]*x <= C[i,n+1] K - number of equality/inequality constraints, K>=0: * if given, only leading K elements of C/CT are used * if not given, automatically determined from sizes of C/CT NOTE 1: linear (non-bound) constraints are satisfied only approximately: * there always exists some minor violation (about Epsilon in magnitude) due to rounding errors * numerical differentiation, if used, may lead to function evaluations outside of the feasible area, because algorithm does NOT change numerical differentiation formula according to linear constraints. If you want constraints to be satisfied exactly, try to reformulate your problem in such manner that all constraints will become boundary ones (this kind of constraints is always satisfied exactly, both in the final solution and in all intermediate points). 
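To make the constraint format above concrete, here is an illustrative fragment (not library code, assumed to sit in user code after the optimizer has been created with minbleiccreate()). It constrains a 2-dimensional problem with bounds 0<=x0<=1, 0<=x1<=1 and one linear inequality x0+x1<=1:

// assumes: minbleicstate state already initialized with minbleiccreate()
real_1d_array bndl = "[0,0]";
real_1d_array bndu = "[1,1]";
minbleicsetbc(state, bndl, bndu);

// one inequality constraint: 1*x0 + 1*x1 <= 1
// each row of C is [coefficients..., right part]; CT[i]<0 means "<="
real_2d_array c = "[[1,1,1]]";
integer_1d_array ct = "[-1]";
minbleicsetlc(state, c, ct);      // K inferred from c.rows()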
-- ALGLIB -- Copyright 28.11.2010 by Bochkanov Sergey *************************************************************************/ void minbleicsetlc(const minbleicstate &state, const real_2d_array &c, const integer_1d_array &ct, const ae_int_t k) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minbleicsetlc(const_cast(state.c_ptr()), const_cast(c.c_ptr()), const_cast(ct.c_ptr()), k, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function sets linear constraints for BLEIC optimizer. Linear constraints are inactive by default (after initial creation). They are preserved after algorithm restart with MinBLEICRestartFrom(). INPUT PARAMETERS: State - structure previously allocated with MinBLEICCreate call. C - linear constraints, array[K,N+1]. Each row of C represents one constraint, either equality or inequality (see below): * first N elements correspond to coefficients, * last element corresponds to the right part. All elements of C (including right part) must be finite. CT - type of constraints, array[K]: * if CT[i]>0, then I-th constraint is C[i,*]*x >= C[i,n+1] * if CT[i]=0, then I-th constraint is C[i,*]*x = C[i,n+1] * if CT[i]<0, then I-th constraint is C[i,*]*x <= C[i,n+1] K - number of equality/inequality constraints, K>=0: * if given, only leading K elements of C/CT are used * if not given, automatically determined from sizes of C/CT NOTE 1: linear (non-bound) constraints are satisfied only approximately: * there always exists some minor violation (about Epsilon in magnitude) due to rounding errors * numerical differentiation, if used, may lead to function evaluations outside of the feasible area, because algorithm does NOT change numerical differentiation formula according to linear constraints. If you want constraints to be satisfied exactly, try to reformulate your problem in such manner that all constraints will become boundary ones (this kind of constraints is always satisfied exactly, both in the final solution and in all intermediate points). -- ALGLIB -- Copyright 28.11.2010 by Bochkanov Sergey *************************************************************************/ void minbleicsetlc(const minbleicstate &state, const real_2d_array &c, const integer_1d_array &ct) { alglib_impl::ae_state _alglib_env_state; ae_int_t k; if( (c.rows()!=ct.length())) throw ap_error("Error while calling 'minbleicsetlc': looks like one of arguments has wrong size"); k = c.rows(); alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minbleicsetlc(const_cast(state.c_ptr()), const_cast(c.c_ptr()), const_cast(ct.c_ptr()), k, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function sets stopping conditions for the optimizer. INPUT PARAMETERS: State - structure which stores algorithm state EpsG - >=0 The subroutine finishes its work if the condition |v|=0 The subroutine finishes its work if on k+1-th iteration the condition |F(k+1)-F(k)|<=EpsF*max{|F(k)|,|F(k+1)|,1} is satisfied. 
EpsX - >=0 The subroutine finishes its work if on k+1-th iteration the condition |v|<=EpsX is fulfilled, where: * |.| means Euclidian norm * v - scaled step vector, v[i]=dx[i]/s[i] * dx - step vector, dx=X(k+1)-X(k) * s - scaling coefficients set by MinBLEICSetScale() MaxIts - maximum number of iterations. If MaxIts=0, the number of iterations is unlimited. Passing EpsG=0, EpsF=0 and EpsX=0 and MaxIts=0 (simultaneously) will lead to automatic stopping criterion selection. NOTE: when SetCond() called with non-zero MaxIts, BLEIC solver may perform slightly more than MaxIts iterations. I.e., MaxIts sets non-strict limit on iterations count. -- ALGLIB -- Copyright 28.11.2010 by Bochkanov Sergey *************************************************************************/ void minbleicsetcond(const minbleicstate &state, const double epsg, const double epsf, const double epsx, const ae_int_t maxits) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minbleicsetcond(const_cast(state.c_ptr()), epsg, epsf, epsx, maxits, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function sets scaling coefficients for BLEIC optimizer. ALGLIB optimizers use scaling matrices to test stopping conditions (step size and gradient are scaled before comparison with tolerances). Scale of the I-th variable is a translation invariant measure of: a) "how large" the variable is b) how large the step should be to make significant changes in the function Scaling is also used by finite difference variant of the optimizer - step along I-th axis is equal to DiffStep*S[I]. In most optimizers (and in the BLEIC too) scaling is NOT a form of preconditioning. It just affects stopping conditions. You should set preconditioner by separate call to one of the MinBLEICSetPrec...() functions. There is a special preconditioning mode, however, which uses scaling coefficients to form diagonal preconditioning matrix. You can turn this mode on, if you want. But you should understand that scaling is not the same thing as preconditioning - these are two different, although related forms of tuning solver. INPUT PARAMETERS: State - structure stores algorithm state S - array[N], non-zero scaling coefficients S[i] may be negative, sign doesn't matter. -- ALGLIB -- Copyright 14.01.2011 by Bochkanov Sergey *************************************************************************/ void minbleicsetscale(const minbleicstate &state, const real_1d_array &s) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minbleicsetscale(const_cast(state.c_ptr()), const_cast(s.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Modification of the preconditioner: preconditioning is turned off. 
INPUT PARAMETERS: State - structure which stores algorithm state -- ALGLIB -- Copyright 13.10.2010 by Bochkanov Sergey *************************************************************************/ void minbleicsetprecdefault(const minbleicstate &state) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minbleicsetprecdefault(const_cast(state.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Modification of the preconditioner: diagonal of approximate Hessian is used. INPUT PARAMETERS: State - structure which stores algorithm state D - diagonal of the approximate Hessian, array[0..N-1], (if larger, only leading N elements are used). NOTE 1: D[i] should be positive. Exception will be thrown otherwise. NOTE 2: you should pass diagonal of approximate Hessian - NOT ITS INVERSE. -- ALGLIB -- Copyright 13.10.2010 by Bochkanov Sergey *************************************************************************/ void minbleicsetprecdiag(const minbleicstate &state, const real_1d_array &d) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minbleicsetprecdiag(const_cast(state.c_ptr()), const_cast(d.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Modification of the preconditioner: scale-based diagonal preconditioning. This preconditioning mode can be useful when you don't have approximate diagonal of Hessian, but you know that your variables are badly scaled (for example, one variable is in [1,10], and another in [1000,100000]), and most part of the ill-conditioning comes from different scales of vars. In this case simple scale-based preconditioner, with H[i] = 1/(s[i]^2), can greatly improve convergence. IMPRTANT: you should set scale of your variables with MinBLEICSetScale() call (before or after MinBLEICSetPrecScale() call). Without knowledge of the scale of your variables scale-based preconditioner will be just unit matrix. INPUT PARAMETERS: State - structure which stores algorithm state -- ALGLIB -- Copyright 13.10.2010 by Bochkanov Sergey *************************************************************************/ void minbleicsetprecscale(const minbleicstate &state) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minbleicsetprecscale(const_cast(state.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function turns on/off reporting. INPUT PARAMETERS: State - structure which stores algorithm state NeedXRep- whether iteration reports are needed or not If NeedXRep is True, algorithm will call rep() callback function if it is provided to MinBLEICOptimize(). 
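As an illustration of the reporting mechanism just described, the following fragment (not part of the library source) turns on iteration reports and passes a rep() callback to minbleicoptimize(); my_grad stands for a user-supplied gradient callback and is only a placeholder:

// prints progress on every accepted iterate (requires minbleicsetxrep(state, true))
void print_progress(const real_1d_array &x, double func, void *ptr)
{
    printf("current f = %.6f\n", func);
}

// ... after creating the optimizer and setting constraints/conditions:
minbleicsetxrep(state, true);
minbleicoptimize(state, my_grad, print_progress, NULL);  // my_grad: user gradient callback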
-- ALGLIB -- Copyright 28.11.2010 by Bochkanov Sergey *************************************************************************/ void minbleicsetxrep(const minbleicstate &state, const bool needxrep) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minbleicsetxrep(const_cast(state.c_ptr()), needxrep, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function sets maximum step length IMPORTANT: this feature is hard to combine with preconditioning. You can't set upper limit on step length, when you solve optimization problem with linear (non-boundary) constraints AND preconditioner turned on. When non-boundary constraints are present, you have to either a) use preconditioner, or b) use upper limit on step length. YOU CAN'T USE BOTH! In this case algorithm will terminate with appropriate error code. INPUT PARAMETERS: State - structure which stores algorithm state StpMax - maximum step length, >=0. Set StpMax to 0.0, if you don't want to limit step length. Use this subroutine when you optimize target function which contains exp() or other fast growing functions, and optimization algorithm makes too large steps which lead to overflow. This function allows us to reject steps that are too large (and therefore expose us to the possible overflow) without actually calculating function value at the x+stp*d. -- ALGLIB -- Copyright 02.04.2010 by Bochkanov Sergey *************************************************************************/ void minbleicsetstpmax(const minbleicstate &state, const double stpmax) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minbleicsetstpmax(const_cast(state.c_ptr()), stpmax, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function provides reverse communication interface Reverse communication interface is not documented or recommended to use. 
See below for functions which provide better documented API *************************************************************************/ bool minbleiciteration(const minbleicstate &state) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { ae_bool result = alglib_impl::minbleiciteration(const_cast(state.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void minbleicoptimize(minbleicstate &state, void (*func)(const real_1d_array &x, double &func, void *ptr), void (*rep)(const real_1d_array &x, double func, void *ptr), void *ptr) { alglib_impl::ae_state _alglib_env_state; if( func==NULL ) throw ap_error("ALGLIB: error in 'minbleicoptimize()' (func is NULL)"); alglib_impl::ae_state_init(&_alglib_env_state); try { while( alglib_impl::minbleiciteration(state.c_ptr(), &_alglib_env_state) ) { if( state.needf ) { func(state.x, state.f, ptr); continue; } if( state.xupdated ) { if( rep!=NULL ) rep(state.x, state.f, ptr); continue; } throw ap_error("ALGLIB: error in 'minbleicoptimize' (some derivatives were not provided?)"); } alglib_impl::ae_state_clear(&_alglib_env_state); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void minbleicoptimize(minbleicstate &state, void (*grad)(const real_1d_array &x, double &func, real_1d_array &grad, void *ptr), void (*rep)(const real_1d_array &x, double func, void *ptr), void *ptr) { alglib_impl::ae_state _alglib_env_state; if( grad==NULL ) throw ap_error("ALGLIB: error in 'minbleicoptimize()' (grad is NULL)"); alglib_impl::ae_state_init(&_alglib_env_state); try { while( alglib_impl::minbleiciteration(state.c_ptr(), &_alglib_env_state) ) { if( state.needfg ) { grad(state.x, state.f, state.g, ptr); continue; } if( state.xupdated ) { if( rep!=NULL ) rep(state.x, state.f, ptr); continue; } throw ap_error("ALGLIB: error in 'minbleicoptimize' (some derivatives were not provided?)"); } alglib_impl::ae_state_clear(&_alglib_env_state); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* BLEIC results INPUT PARAMETERS: State - algorithm state OUTPUT PARAMETERS: X - array[0..N-1], solution Rep - optimization report. You should check Rep.TerminationType in order to distinguish successful termination from unsuccessful one: * -8 internal integrity control detected infinite or NAN values in function/gradient. Abnormal termination signalled. * -7 gradient verification failed. See MinBLEICSetGradientCheck() for more information. * -3 inconsistent constraints. Feasible point is either nonexistent or too hard to find. Try to restart optimizer with better initial approximation * 1 relative function improvement is no more than EpsF. * 2 scaled step is no more than EpsX. * 4 scaled gradient norm is no more than EpsG. * 5 MaxIts steps was taken * 8 terminated by user who called minbleicrequesttermination(). X contains point which was "current accepted" when termination request was submitted. More information about fields of this structure can be found in the comments on MinBLEICReport datatype. 
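For reference, here is a complete illustrative sketch (not part of the library source) of the workflow documented above: create the optimizer, set box constraints and stopping conditions, run minbleicoptimize() with an analytic-gradient callback, then read the solution with minbleicresults(). The target function and tolerances are arbitrary:

#include <stdio.h>
#include <math.h>
#include "optimization.h"
using namespace alglib;

// f(x0,x1) = 100*(x0+3)^4 + (x1-3)^4, with analytic gradient
void f_grad(const real_1d_array &x, double &func, real_1d_array &grad, void *ptr)
{
    func = 100*pow(x[0]+3,4) + pow(x[1]-3,4);
    grad[0] = 400*pow(x[0]+3,3);
    grad[1] = 4*pow(x[1]-3,3);
}

int main()
{
    real_1d_array x    = "[0,0]";
    real_1d_array bndl = "[-1,-1]";
    real_1d_array bndu = "[+1,+1]";
    minbleicstate state;
    minbleicreport rep;

    minbleiccreate(x, state);
    minbleicsetbc(state, bndl, bndu);
    minbleicsetcond(state, 1.0e-6, 0, 0, 0);   // stop on small scaled gradient norm
    minbleicoptimize(state, f_grad, NULL, NULL);
    minbleicresults(state, x, rep);
    printf("termination type: %d\n", int(rep.terminationtype));
    printf("solution: %s\n", x.tostring(2).c_str());
    return 0;
}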
-- ALGLIB -- Copyright 28.11.2010 by Bochkanov Sergey *************************************************************************/ void minbleicresults(const minbleicstate &state, real_1d_array &x, minbleicreport &rep) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minbleicresults(const_cast(state.c_ptr()), const_cast(x.c_ptr()), const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* BLEIC results Buffered implementation of MinBLEICResults() which uses pre-allocated buffer to store X[]. If buffer size is too small, it resizes buffer. It is intended to be used in the inner cycles of performance critical algorithms where array reallocation penalty is too large to be ignored. -- ALGLIB -- Copyright 28.11.2010 by Bochkanov Sergey *************************************************************************/ void minbleicresultsbuf(const minbleicstate &state, real_1d_array &x, minbleicreport &rep) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minbleicresultsbuf(const_cast(state.c_ptr()), const_cast(x.c_ptr()), const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This subroutine restarts algorithm from new point. All optimization parameters (including constraints) are left unchanged. This function allows to solve multiple optimization problems (which must have same number of dimensions) without object reallocation penalty. INPUT PARAMETERS: State - structure previously allocated with MinBLEICCreate call. X - new starting point. -- ALGLIB -- Copyright 28.11.2010 by Bochkanov Sergey *************************************************************************/ void minbleicrestartfrom(const minbleicstate &state, const real_1d_array &x) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minbleicrestartfrom(const_cast(state.c_ptr()), const_cast(x.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This subroutine submits request for termination of running optimizer. It should be called from user-supplied callback when user decides that it is time to "smoothly" terminate optimization process. As result, optimizer stops at point which was "current accepted" when termination request was submitted and returns error code 8 (successful termination). INPUT PARAMETERS: State - optimizer structure NOTE: after request for termination optimizer may perform several additional calls to user-supplied callbacks. It does NOT guarantee to stop immediately - it just guarantees that these additional calls will be discarded later. NOTE: calling this function on optimizer which is NOT running will have no effect. NOTE: multiple calls to this function are possible. First call is counted, subsequent calls are silently ignored. 
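The following fragment (illustrative only, not library code) shows how minbleicrestartfrom() reuses an already configured optimizer to solve a second problem of the same dimension from a different starting point; state, f_grad, x1 and rep are assumed to exist in the surrounding user code:

// assumes: state was created with minbleiccreate() and already used once,
//          f_grad is the user's gradient callback, x1/rep are result holders
real_1d_array x1 = "[0.5,0.5]";          // new starting point, same N
minbleicrestartfrom(state, x1);          // constraints and settings are preserved
minbleicoptimize(state, f_grad, NULL, NULL);
minbleicresults(state, x1, rep);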
-- ALGLIB -- Copyright 08.10.2014 by Bochkanov Sergey *************************************************************************/ void minbleicrequesttermination(const minbleicstate &state) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minbleicrequesttermination(const_cast(state.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This subroutine turns on verification of the user-supplied analytic gradient: * user calls this subroutine before optimization begins * MinBLEICOptimize() is called * prior to actual optimization, for each component of parameters being optimized X[i] algorithm performs following steps: * two trial steps are made to X[i]-TestStep*S[i] and X[i]+TestStep*S[i], where X[i] is i-th component of the initial point and S[i] is a scale of i-th parameter * if needed, steps are bounded with respect to constraints on X[] * F(X) is evaluated at these trial points * we perform one more evaluation in the middle point of the interval * we build cubic model using function values and derivatives at trial points and we compare its prediction with actual value in the middle point * in case difference between prediction and actual value is higher than some predetermined threshold, algorithm stops with completion code -7; Rep.VarIdx is set to index of the parameter with incorrect derivative. * after verification is over, algorithm proceeds to the actual optimization. NOTE 1: verification needs N (parameters count) gradient evaluations. It is very costly and you should use it only for low dimensional problems, when you want to be sure that you've correctly calculated analytic derivatives. You should not use it in the production code (unless you want to check derivatives provided by some third party). NOTE 2: you should carefully choose TestStep. Value which is too large (so large that function behaviour is significantly non-cubic) will lead to false alarms. You may use different step for different parameters by means of setting scale with MinBLEICSetScale(). NOTE 3: this function may lead to false positives. In case it reports that I-th derivative was calculated incorrectly, you may decrease test step and try one more time - maybe your function changes too sharply and your step is too large for such rapidly chanding function. INPUT PARAMETERS: State - structure used to store algorithm state TestStep - verification step: * TestStep=0 turns verification off * TestStep>0 activates verification -- ALGLIB -- Copyright 15.06.2012 by Bochkanov Sergey *************************************************************************/ void minbleicsetgradientcheck(const minbleicstate &state, const double teststep) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minbleicsetgradientcheck(const_cast(state.c_ptr()), teststep, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This object stores nonlinear optimizer state. 
You should use functions provided by MinQP subpackage to work with this object *************************************************************************/ _minqpstate_owner::_minqpstate_owner() { p_struct = (alglib_impl::minqpstate*)alglib_impl::ae_malloc(sizeof(alglib_impl::minqpstate), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_minqpstate_init(p_struct, NULL); } _minqpstate_owner::_minqpstate_owner(const _minqpstate_owner &rhs) { p_struct = (alglib_impl::minqpstate*)alglib_impl::ae_malloc(sizeof(alglib_impl::minqpstate), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_minqpstate_init_copy(p_struct, const_cast(rhs.p_struct), NULL); } _minqpstate_owner& _minqpstate_owner::operator=(const _minqpstate_owner &rhs) { if( this==&rhs ) return *this; alglib_impl::_minqpstate_clear(p_struct); alglib_impl::_minqpstate_init_copy(p_struct, const_cast(rhs.p_struct), NULL); return *this; } _minqpstate_owner::~_minqpstate_owner() { alglib_impl::_minqpstate_clear(p_struct); ae_free(p_struct); } alglib_impl::minqpstate* _minqpstate_owner::c_ptr() { return p_struct; } alglib_impl::minqpstate* _minqpstate_owner::c_ptr() const { return const_cast(p_struct); } minqpstate::minqpstate() : _minqpstate_owner() { } minqpstate::minqpstate(const minqpstate &rhs):_minqpstate_owner(rhs) { } minqpstate& minqpstate::operator=(const minqpstate &rhs) { if( this==&rhs ) return *this; _minqpstate_owner::operator=(rhs); return *this; } minqpstate::~minqpstate() { } /************************************************************************* This structure stores optimization report: * InnerIterationsCount number of inner iterations * OuterIterationsCount number of outer iterations * NCholesky number of Cholesky decomposition * NMV number of matrix-vector products (only products calculated as part of iterative process are counted) * TerminationType completion code (see below) Completion codes: * -5 inappropriate solver was used: * QuickQP solver for problem with general linear constraints (dense/sparse) * -4 BLEIC-QP or QuickQP solver found unconstrained direction of negative curvature (function is unbounded from below even under constraints), no meaningful minimum can be found. * -3 inconsistent constraints (or, maybe, feasible point is too hard to find). If you are sure that constraints are feasible, try to restart optimizer with better initial approximation. * -1 solver error * 1..4 successful completion * 5 MaxIts steps was taken * 7 stopping conditions are too stringent, further improvement is impossible, X contains best point found so far. 
*************************************************************************/ _minqpreport_owner::_minqpreport_owner() { p_struct = (alglib_impl::minqpreport*)alglib_impl::ae_malloc(sizeof(alglib_impl::minqpreport), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_minqpreport_init(p_struct, NULL); } _minqpreport_owner::_minqpreport_owner(const _minqpreport_owner &rhs) { p_struct = (alglib_impl::minqpreport*)alglib_impl::ae_malloc(sizeof(alglib_impl::minqpreport), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_minqpreport_init_copy(p_struct, const_cast(rhs.p_struct), NULL); } _minqpreport_owner& _minqpreport_owner::operator=(const _minqpreport_owner &rhs) { if( this==&rhs ) return *this; alglib_impl::_minqpreport_clear(p_struct); alglib_impl::_minqpreport_init_copy(p_struct, const_cast(rhs.p_struct), NULL); return *this; } _minqpreport_owner::~_minqpreport_owner() { alglib_impl::_minqpreport_clear(p_struct); ae_free(p_struct); } alglib_impl::minqpreport* _minqpreport_owner::c_ptr() { return p_struct; } alglib_impl::minqpreport* _minqpreport_owner::c_ptr() const { return const_cast(p_struct); } minqpreport::minqpreport() : _minqpreport_owner() ,inneriterationscount(p_struct->inneriterationscount),outeriterationscount(p_struct->outeriterationscount),nmv(p_struct->nmv),ncholesky(p_struct->ncholesky),terminationtype(p_struct->terminationtype) { } minqpreport::minqpreport(const minqpreport &rhs):_minqpreport_owner(rhs) ,inneriterationscount(p_struct->inneriterationscount),outeriterationscount(p_struct->outeriterationscount),nmv(p_struct->nmv),ncholesky(p_struct->ncholesky),terminationtype(p_struct->terminationtype) { } minqpreport& minqpreport::operator=(const minqpreport &rhs) { if( this==&rhs ) return *this; _minqpreport_owner::operator=(rhs); return *this; } minqpreport::~minqpreport() { } /************************************************************************* CONSTRAINED QUADRATIC PROGRAMMING The subroutine creates QP optimizer. After initial creation, it contains default optimization problem with zero quadratic and linear terms and no constraints. You should set quadratic/linear terms with calls to functions provided by MinQP subpackage. You should also choose appropriate QP solver and set it and its stopping criteria by means of MinQPSetAlgo??????() function. Then, you should start solution process by means of MinQPOptimize() call. Solution itself can be obtained with MinQPResults() function. 
Following solvers are recommended: * QuickQP for dense problems with box-only constraints (or no constraints at all) * QP-BLEIC for dense/sparse problems with moderate (up to 50) number of general linear constraints * DENSE-AUL-QP for dense problems with any (small or large) number of general linear constraints INPUT PARAMETERS: N - problem size OUTPUT PARAMETERS: State - optimizer with zero quadratic/linear terms and no constraints -- ALGLIB -- Copyright 11.01.2011 by Bochkanov Sergey *************************************************************************/ void minqpcreate(const ae_int_t n, minqpstate &state) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minqpcreate(n, const_cast(state.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function sets linear term for QP solver. By default, linear term is zero. INPUT PARAMETERS: State - structure which stores algorithm state B - linear term, array[N]. -- ALGLIB -- Copyright 11.01.2011 by Bochkanov Sergey *************************************************************************/ void minqpsetlinearterm(const minqpstate &state, const real_1d_array &b) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minqpsetlinearterm(const_cast(state.c_ptr()), const_cast(b.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function sets dense quadratic term for QP solver. By default, quadratic term is zero. SUPPORT BY QP SOLVERS: Dense quadratic term can be handled by following QP solvers: * QuickQP * BLEIC-QP * Dense-AUL-QP IMPORTANT: This solver minimizes following function: f(x) = 0.5*x'*A*x + b'*x. Note that quadratic term has 0.5 before it. So if you want to minimize f(x) = x^2 + x you should rewrite your problem as follows: f(x) = 0.5*(2*x^2) + x and your matrix A will be equal to [[2.0]], not to [[1.0]] INPUT PARAMETERS: State - structure which stores algorithm state A - matrix, array[N,N] IsUpper - (optional) storage type: * if True, symmetric matrix A is given by its upper triangle, and the lower triangle isn't used * if False, symmetric matrix A is given by its lower triangle, and the upper triangle isn't used * if not given, both lower and upper triangles must be filled. -- ALGLIB -- Copyright 11.01.2011 by Bochkanov Sergey *************************************************************************/ void minqpsetquadraticterm(const minqpstate &state, const real_2d_array &a, const bool isupper) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minqpsetquadraticterm(const_cast(state.c_ptr()), const_cast(a.c_ptr()), isupper, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function sets dense quadratic term for QP solver. By default, quadratic term is zero. 
SUPPORT BY QP SOLVERS: Dense quadratic term can be handled by following QP solvers: * QuickQP * BLEIC-QP * Dense-AUL-QP IMPORTANT: This solver minimizes following function: f(x) = 0.5*x'*A*x + b'*x. Note that quadratic term has 0.5 before it. So if you want to minimize f(x) = x^2 + x you should rewrite your problem as follows: f(x) = 0.5*(2*x^2) + x and your matrix A will be equal to [[2.0]], not to [[1.0]] INPUT PARAMETERS: State - structure which stores algorithm state A - matrix, array[N,N] IsUpper - (optional) storage type: * if True, symmetric matrix A is given by its upper triangle, and the lower triangle isn't used * if False, symmetric matrix A is given by its lower triangle, and the upper triangle isn't used * if not given, both lower and upper triangles must be filled. -- ALGLIB -- Copyright 11.01.2011 by Bochkanov Sergey *************************************************************************/ void minqpsetquadraticterm(const minqpstate &state, const real_2d_array &a) { alglib_impl::ae_state _alglib_env_state; bool isupper; if( !alglib_impl::ae_is_symmetric(const_cast(a.c_ptr())) ) throw ap_error("'a' parameter is not symmetric matrix"); isupper = false; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minqpsetquadraticterm(const_cast(state.c_ptr()), const_cast(a.c_ptr()), isupper, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function sets sparse quadratic term for QP solver. By default, quadratic term is zero. This function overrides previous calls to minqpsetquadraticterm() or minqpsetquadratictermsparse(). SUPPORT BY QP SOLVERS: Sparse quadratic term can be handled by following QP solvers: * QuickQP * BLEIC-QP * Dense-AUL-QP (internally converts sparse matrix to dense format) IMPORTANT: This solver minimizes following function: f(x) = 0.5*x'*A*x + b'*x. Note that quadratic term has 0.5 before it. So if you want to minimize f(x) = x^2 + x you should rewrite your problem as follows: f(x) = 0.5*(2*x^2) + x and your matrix A will be equal to [[2.0]], not to [[1.0]] INPUT PARAMETERS: State - structure which stores algorithm state A - matrix, array[N,N] IsUpper - (optional) storage type: * if True, symmetric matrix A is given by its upper triangle, and the lower triangle isn't used * if False, symmetric matrix A is given by its lower triangle, and the upper triangle isn't used * if not given, both lower and upper triangles must be filled. -- ALGLIB -- Copyright 11.01.2011 by Bochkanov Sergey *************************************************************************/ void minqpsetquadratictermsparse(const minqpstate &state, const sparsematrix &a, const bool isupper) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minqpsetquadratictermsparse(const_cast(state.c_ptr()), const_cast(a.c_ptr()), isupper, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function sets starting point for QP solver. It is useful to have good initial approximation to the solution, because it will increase speed of convergence and identification of active constraints. 
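To make the 0.5 convention and problem setup concrete, here is an illustrative fragment (not part of the library source). It encodes f(x) = x0^2 + x1^2 - 6*x0 - 4*x1, i.e. A = [[2,0],[0,2]] (note the doubled diagonal) and b = [-6,-4], and supplies an initial approximation:

// assumes a 2-dimensional problem set up inside user code
minqpstate state;
minqpcreate(2, state);

real_2d_array a  = "[[2,0],[0,2]]";   // quadratic term: remember the implicit 0.5 factor
real_1d_array b  = "[-6,-4]";         // linear term
real_1d_array x0 = "[0,1]";           // initial approximation

minqpsetquadraticterm(state, a);      // symmetric matrix given in full
minqpsetlinearterm(state, b);
minqpsetstartingpoint(state, x0);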
INPUT PARAMETERS: State - structure which stores algorithm state X - starting point, array[N]. -- ALGLIB -- Copyright 11.01.2011 by Bochkanov Sergey *************************************************************************/ void minqpsetstartingpoint(const minqpstate &state, const real_1d_array &x) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minqpsetstartingpoint(const_cast(state.c_ptr()), const_cast(x.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function sets origin for QP solver. By default, following QP program is solved: min(0.5*x'*A*x+b'*x) This function allows to solve different problem: min(0.5*(x-x_origin)'*A*(x-x_origin)+b'*(x-x_origin)) Specification of non-zero origin affects function being minimized, but not constraints. Box and linear constraints are still calculated without origin. INPUT PARAMETERS: State - structure which stores algorithm state XOrigin - origin, array[N]. -- ALGLIB -- Copyright 11.01.2011 by Bochkanov Sergey *************************************************************************/ void minqpsetorigin(const minqpstate &state, const real_1d_array &xorigin) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minqpsetorigin(const_cast(state.c_ptr()), const_cast(xorigin.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function sets scaling coefficients. ALGLIB optimizers use scaling matrices to test stopping conditions (step size and gradient are scaled before comparison with tolerances) and as preconditioner. Scale of the I-th variable is a translation invariant measure of: a) "how large" the variable is b) how large the step should be to make significant changes in the function INPUT PARAMETERS: State - structure stores algorithm state S - array[N], non-zero scaling coefficients S[i] may be negative, sign doesn't matter. -- ALGLIB -- Copyright 14.01.2011 by Bochkanov Sergey *************************************************************************/ void minqpsetscale(const minqpstate &state, const real_1d_array &s) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minqpsetscale(const_cast(state.c_ptr()), const_cast(s.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* DEPRECATED DEPRECATED DEPRECATED DEPRECATED DEPRECATED DEPRECATED This function tells solver to use Cholesky-based algorithm. This algorithm was deprecated in ALGLIB 3.9.0 because its performance is inferior to that of BLEIC-QP or QuickQP on high-dimensional problems. Furthermore, it supports only dense convex QP problems. This solver is no longer active by default. We recommend you to switch to AUL-QP, BLEIC-QP or QuickQP solver. 
DEPRECATED DEPRECATED DEPRECATED DEPRECATED DEPRECATED DEPRECATED -- ALGLIB -- Copyright 11.01.2011 by Bochkanov Sergey *************************************************************************/ void minqpsetalgocholesky(const minqpstate &state) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minqpsetalgocholesky(const_cast(state.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function tells solver to use BLEIC-based algorithm and sets stopping criteria for the algorithm. ALGORITHM FEATURES: * supports dense and sparse QP problems * supports box and general linear equality/inequality constraints * can solve all types of problems (convex, semidefinite, nonconvex) as long as they are bounded from below under constraints. Say, it is possible to solve "min{-x^2} subject to -1<=x<=+1". Of course, global minimum is found only for positive definite and semidefinite problems. As for indefinite ones - only local minimum is found. ALGORITHM OUTLINE: * BLEIC-QP solver is just a driver function for MinBLEIC solver; it solves quadratic programming problem as general linearly constrained optimization problem, which is solved by means of BLEIC solver (part of ALGLIB, active set method). ALGORITHM LIMITATIONS: * this algorithm is fast enough for large-scale problems with small amount of general linear constraints (say, up to 50), but it is inefficient for problems with several hundreds of constraints. Iteration cost is roughly quadratic w.r.t. constraint count. Furthermore, it can not efficiently handle sparse constraints (they are converted to dense format prior to solution). Thus, if you have large and/or sparse constraint matrix and convex QP problem, Dense-AUL-QP solver may be better solution. * unlike QuickQP solver, this algorithm does not perform Newton steps and does not use Level 3 BLAS. Being general-purpose active set method, it can activate constraints only one-by-one. Thus, its performance is lower than that of QuickQP. * its precision is also a bit inferior to that of QuickQP. BLEIC-QP performs only LBFGS steps (no Newton steps), which are good at detecting neighborhood of the solution, buy needs many iterations to find solution with more than 6 digits of precision. INPUT PARAMETERS: State - structure which stores algorithm state EpsG - >=0 The subroutine finishes its work if the condition |v|=0 The subroutine finishes its work if exploratory steepest descent step on k+1-th iteration satisfies following condition: |F(k+1)-F(k)|<=EpsF*max{|F(k)|,|F(k+1)|,1} EpsX - >=0 The subroutine finishes its work if exploratory steepest descent step on k+1-th iteration satisfies following condition: * |.| means Euclidian norm * v - scaled step vector, v[i]=dx[i]/s[i] * dx - step vector, dx=X(k+1)-X(k) * s - scaling coefficients set by MinQPSetScale() MaxIts - maximum number of iterations. If MaxIts=0, the number of iterations is unlimited. NOTE: this algorithm uses LBFGS iterations, which are relatively cheap, but improve function value only a bit. So you will need many iterations to converge - from 0.1*N to 10*N, depending on problem's condition number. IT IS VERY IMPORTANT TO CALL MinQPSetScale() WHEN YOU USE THIS ALGORITHM BECAUSE ITS STOPPING CRITERIA ARE SCALE-DEPENDENT! 
Passing EpsG=0, EpsF=0 and EpsX=0 and MaxIts=0 (simultaneously) will lead to automatic stopping criterion selection (presently it is small step length, but it may change in the future versions of ALGLIB). -- ALGLIB -- Copyright 11.01.2011 by Bochkanov Sergey *************************************************************************/ void minqpsetalgobleic(const minqpstate &state, const double epsg, const double epsf, const double epsx, const ae_int_t maxits) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minqpsetalgobleic(const_cast(state.c_ptr()), epsg, epsf, epsx, maxits, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function tells QP solver to use Dense-AUL algorithm and sets stopping criteria for the algorithm. ALGORITHM FEATURES: * supports dense and sparse QP problems; although it uses dense Cholesky to build preconditioner, it still works faster for sparse problems. * supports box and dense/sparse general linear equality/inequality constraints * convergence is theoretically proved for positive-definite (convex) QP problems. Semidefinite and non-convex problems can be solved as long as they are bounded from below under constraints, although without theoretical guarantees. * this solver is better than QP-BLEIC on problems with large number of general linear constraints. ALGORITHM OUTLINE: * this algorithm is an augmented Lagrangian method with dense preconditioner (hence its name). It is similar to barrier/penalty methods, but much more precise and faster. * it performs several outer iterations in order to refine values of the Lagrange multipliers. Single outer iteration is a solution of some unconstrained optimization problem: first it performs dense Cholesky factorization of the Hessian in order to build preconditioner (adaptive regularization is applied to enforce positive definiteness), and then it uses L-BFGS optimizer to solve optimization problem. * typically you need about 5-10 outer iterations to converge to solution ALGORITHM LIMITATIONS: * because dense Cholesky driver is used, this algorithm has O(N^2) memory requirements and O(OuterIterations*N^3) minimum running time. From the practical point of view, it limits its applicability by several thousands of variables. From the other side, variables count is the most limiting factor, and dependence on constraint count is much more lower. Assuming that constraint matrix is sparse, it may handle tens of thousands of general linear constraints. * its precision is lower than that of BLEIC-QP and QuickQP. It is hard to find solution with more than 6 digits of precision. INPUT PARAMETERS: State - structure which stores algorithm state EpsX - >=0, stopping criteria for inner optimizer. Inner iterations are stopped when step length (with variable scaling being applied) is less than EpsX. See minqpsetscale() for more information on variable scaling. Rho - penalty coefficient, Rho>0: * large enough that algorithm converges with desired precision. * not TOO large to prevent ill-conditioning * recommended values are 100, 1000 or 10000 ItsCnt - number of outer iterations: * recommended values: 10-15 (although in most cases it converges within 5 iterations, you may need a few more to be sure). 
* ItsCnt=0 means that small number of outer iterations is automatically chosen (10 iterations in current version). * ItsCnt=1 means that AUL algorithm performs just as usual barrier method. * ItsCnt>1 means that AUL algorithm performs specified number of outer iterations IT IS VERY IMPORTANT TO CALL minqpsetscale() WHEN YOU USE THIS ALGORITHM BECAUSE ITS CONVERGENCE PROPERTIES AND STOPPING CRITERIA ARE SCALE-DEPENDENT! NOTE: Passing EpsX=0 will lead to automatic step length selection (specific step length chosen may change in the future versions of ALGLIB, so it is better to specify step length explicitly). -- ALGLIB -- Copyright 20.08.2016 by Bochkanov Sergey *************************************************************************/ void minqpsetalgodenseaul(const minqpstate &state, const double epsx, const double rho, const ae_int_t itscnt) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minqpsetalgodenseaul(const_cast(state.c_ptr()), epsx, rho, itscnt, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function tells solver to use QuickQP algorithm: special extra-fast algorithm for problems with box-only constrants. It may solve non-convex problems as long as they are bounded from below under constraints. ALGORITHM FEATURES: * many times (from 5x to 50x!) faster than BLEIC-based QP solver; utilizes accelerated methods for activation of constraints. * supports dense and sparse QP problems * supports ONLY box constraints; general linear constraints are NOT supported by this solver * can solve all types of problems (convex, semidefinite, nonconvex) as long as they are bounded from below under constraints. Say, it is possible to solve "min{-x^2} subject to -1<=x<=+1". In convex/semidefinite case global minimum is returned, in nonconvex case - algorithm returns one of the local minimums. ALGORITHM OUTLINE: * algorithm performs two kinds of iterations: constrained CG iterations and constrained Newton iterations * initially it performs small number of constrained CG iterations, which can efficiently activate/deactivate multiple constraints * after CG phase algorithm tries to calculate Cholesky decomposition and to perform several constrained Newton steps. If Cholesky decomposition failed (matrix is indefinite even under constraints), we perform more CG iterations until we converge to such set of constraints that system matrix becomes positive definite. Constrained Newton steps greatly increase convergence speed and precision. * algorithm interleaves CG and Newton iterations which allows to handle indefinite matrices (CG phase) and quickly converge after final set of constraints is found (Newton phase). Combination of CG and Newton phases is called "outer iteration". * it is possible to turn off Newton phase (beneficial for semidefinite problems - Cholesky decomposition will fail too often) ALGORITHM LIMITATIONS: * algorithm does not support general linear constraints; only box ones are supported * Cholesky decomposition for sparse problems is performed with Skyline Cholesky solver, which is intended for low-profile matrices. No profile- reducing reordering of variables is performed in this version of ALGLIB. * problems with near-zero negative eigenvalues (or exacty zero ones) may experience about 2-3x performance penalty. 
The reason is that Cholesky decomposition can not be performed until we identify directions of zero and negative curvature and activate corresponding boundary constraints - but we need a lot of trial and errors because these directions are hard to notice in the matrix spectrum. In this case you may turn off Newton phase of algorithm. Large negative eigenvalues are not an issue, so highly non-convex problems can be solved very efficiently. INPUT PARAMETERS: State - structure which stores algorithm state EpsG - >=0 The subroutine finishes its work if the condition |v|=0 The subroutine finishes its work if exploratory steepest descent step on k+1-th iteration satisfies following condition: |F(k+1)-F(k)|<=EpsF*max{|F(k)|,|F(k+1)|,1} EpsX - >=0 The subroutine finishes its work if exploratory steepest descent step on k+1-th iteration satisfies following condition: * |.| means Euclidian norm * v - scaled step vector, v[i]=dx[i]/s[i] * dx - step vector, dx=X(k+1)-X(k) * s - scaling coefficients set by MinQPSetScale() MaxOuterIts-maximum number of OUTER iterations. One outer iteration includes some amount of CG iterations (from 5 to ~N) and one or several (usually small amount) Newton steps. Thus, one outer iteration has high cost, but can greatly reduce funcation value. Use 0 if you do not want to limit number of outer iterations. UseNewton- use Newton phase or not: * Newton phase improves performance of positive definite dense problems (about 2 times improvement can be observed) * can result in some performance penalty on semidefinite or slightly negative definite problems - each Newton phase will bring no improvement (Cholesky failure), but still will require computational time. * if you doubt, you can turn off this phase - optimizer will retain its most of its high speed. IT IS VERY IMPORTANT TO CALL MinQPSetScale() WHEN YOU USE THIS ALGORITHM BECAUSE ITS STOPPING CRITERIA ARE SCALE-DEPENDENT! Passing EpsG=0, EpsF=0 and EpsX=0 and MaxIts=0 (simultaneously) will lead to automatic stopping criterion selection (presently it is small step length, but it may change in the future versions of ALGLIB). -- ALGLIB -- Copyright 22.05.2014 by Bochkanov Sergey *************************************************************************/ void minqpsetalgoquickqp(const minqpstate &state, const double epsg, const double epsf, const double epsx, const ae_int_t maxouterits, const bool usenewton) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minqpsetalgoquickqp(const_cast(state.c_ptr()), epsg, epsf, epsx, maxouterits, usenewton, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function sets box constraints for QP solver Box constraints are inactive by default (after initial creation). After being set, they are preserved until explicitly turned off with another SetBC() call. All QP solvers may handle box constraints. INPUT PARAMETERS: State - structure stores algorithm state BndL - lower bounds, array[N]. If some (all) variables are unbounded, you may specify very small number or -INF (latter is recommended because it will allow solver to use better algorithm). BndU - upper bounds, array[N]. If some (all) variables are unbounded, you may specify very large number or +INF (latter is recommended because it will allow solver to use better algorithm). 
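As an illustration of the BndL/BndU conventions described above, a short sketch (assuming a 2-dimensional problem whose state object was created elsewhere) which bounds x0 from below, leaves x1 completely free, and then selects the box-only QuickQP solver:

    alglib::real_1d_array bndl = "[0.0, -inf]";
    alglib::real_1d_array bndu = "[+inf, +inf]";
    alglib::minqpsetbc(state, bndl, bndu);
    alglib::minqpsetalgoquickqp(state, 0.0, 0.0, 1.0e-6, 0, true);  // QuickQP handles box-only problems well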
NOTE: it is possible to specify BndL[i]=BndU[i]. In this case I-th variable will be "frozen" at X[i]=BndL[i]=BndU[i]. -- ALGLIB -- Copyright 11.01.2011 by Bochkanov Sergey *************************************************************************/ void minqpsetbc(const minqpstate &state, const real_1d_array &bndl, const real_1d_array &bndu) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minqpsetbc(const_cast(state.c_ptr()), const_cast(bndl.c_ptr()), const_cast(bndu.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function sets dense linear constraints for QP optimizer. This function overrides results of previous calls to minqpsetlc(), minqpsetlcsparse() and minqpsetlcmixed(). After call to this function sparse constraints are dropped, and you have only those constraints which were specified in the present call. If you want to specify mixed (with dense and sparse terms) linear constraints, you should call minqpsetlcmixed(). SUPPORT BY QP SOLVERS: Following QP solvers can handle dense linear constraints: * BLEIC-QP - handles them with high precision, but may be inefficient for problems with hundreds of constraints * Dense-AUL-QP - handles them with moderate precision (approx. 10^-6), may efficiently handle thousands of constraints. Following QP solvers can NOT handle dense linear constraints: * QuickQP - can not handle general linear constraints INPUT PARAMETERS: State - structure previously allocated with MinQPCreate call. C - linear constraints, array[K,N+1]. Each row of C represents one constraint, either equality or inequality (see below): * first N elements correspond to coefficients, * last element corresponds to the right part. All elements of C (including right part) must be finite. CT - type of constraints, array[K]: * if CT[i]>0, then I-th constraint is C[i,*]*x >= C[i,n+1] * if CT[i]=0, then I-th constraint is C[i,*]*x = C[i,n+1] * if CT[i]<0, then I-th constraint is C[i,*]*x <= C[i,n+1] K - number of equality/inequality constraints, K>=0: * if given, only leading K elements of C/CT are used * if not given, automatically determined from sizes of C/CT NOTE 1: linear (non-bound) constraints are satisfied only approximately - there always exists some violation due to numerical errors and algorithmic limitations (BLEIC-QP solver is most precise, AUL-QP solver is less precise). -- ALGLIB -- Copyright 19.06.2012 by Bochkanov Sergey *************************************************************************/ void minqpsetlc(const minqpstate &state, const real_2d_array &c, const integer_1d_array &ct, const ae_int_t k) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minqpsetlc(const_cast(state.c_ptr()), const_cast(c.c_ptr()), const_cast(ct.c_ptr()), k, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function sets dense linear constraints for QP optimizer. This function overrides results of previous calls to minqpsetlc(), minqpsetlcsparse() and minqpsetlcmixed(). 
After call to this function sparse constraints are dropped, and you have only those constraints which were specified in the present call. If you want to specify mixed (with dense and sparse terms) linear constraints, you should call minqpsetlcmixed(). SUPPORT BY QP SOLVERS: Following QP solvers can handle dense linear constraints: * BLEIC-QP - handles them with high precision, but may be inefficient for problems with hundreds of constraints * Dense-AUL-QP - handles them with moderate precision (approx. 10^-6), may efficiently handle thousands of constraints. Following QP solvers can NOT handle dense linear constraints: * QuickQP - can not handle general linear constraints INPUT PARAMETERS: State - structure previously allocated with MinQPCreate call. C - linear constraints, array[K,N+1]. Each row of C represents one constraint, either equality or inequality (see below): * first N elements correspond to coefficients, * last element corresponds to the right part. All elements of C (including right part) must be finite. CT - type of constraints, array[K]: * if CT[i]>0, then I-th constraint is C[i,*]*x >= C[i,n+1] * if CT[i]=0, then I-th constraint is C[i,*]*x = C[i,n+1] * if CT[i]<0, then I-th constraint is C[i,*]*x <= C[i,n+1] K - number of equality/inequality constraints, K>=0: * if given, only leading K elements of C/CT are used * if not given, automatically determined from sizes of C/CT NOTE 1: linear (non-bound) constraints are satisfied only approximately - there always exists some violation due to numerical errors and algorithmic limitations (BLEIC-QP solver is most precise, AUL-QP solver is less precise). -- ALGLIB -- Copyright 19.06.2012 by Bochkanov Sergey *************************************************************************/ void minqpsetlc(const minqpstate &state, const real_2d_array &c, const integer_1d_array &ct) { alglib_impl::ae_state _alglib_env_state; ae_int_t k; if( (c.rows()!=ct.length())) throw ap_error("Error while calling 'minqpsetlc': looks like one of arguments has wrong size"); k = c.rows(); alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minqpsetlc(const_cast(state.c_ptr()), const_cast(c.c_ptr()), const_cast(ct.c_ptr()), k, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function sets sparse linear constraints for QP optimizer. This function overrides results of previous calls to minqpsetlc(), minqpsetlcsparse() and minqpsetlcmixed(). After call to this function dense constraints are dropped, and you have only those constraints which were specified in the present call. If you want to specify mixed (with dense and sparse terms) linear constraints, you should call minqpsetlcmixed(). SUPPORT BY QP SOLVERS: Following QP solvers can handle sparse linear constraints: * BLEIC-QP - handles them with high precision, but can not utilize their sparsity - sparse constraint matrix is silently converted to dense format. Thus, it may be inefficient for problems with hundreds of constraints. * Dense-AUL-QP - although this solver uses dense linear algebra to calculate Cholesky preconditioner, it may efficiently handle sparse constraints. It may solve problems with hundreds and thousands of constraints. The only drawback is that precision of constraint handling is typically within 1E-4... ..1E-6 range. 
Following QP solvers can NOT handle sparse linear constraints: * QuickQP - can not handle general linear constraints INPUT PARAMETERS: State - structure previously allocated with MinQPCreate call. C - linear constraints, sparse matrix with dimensions at least [K,N+1]. If matrix has larger size, only leading Kx(N+1) rectangle is used. Each row of C represents one constraint, either equality or inequality (see below): * first N elements correspond to coefficients, * last element corresponds to the right part. All elements of C (including right part) must be finite. CT - type of constraints, array[K]: * if CT[i]>0, then I-th constraint is C[i,*]*x >= C[i,n+1] * if CT[i]=0, then I-th constraint is C[i,*]*x = C[i,n+1] * if CT[i]<0, then I-th constraint is C[i,*]*x <= C[i,n+1] K - number of equality/inequality constraints, K>=0 NOTE 1: linear (non-bound) constraints are satisfied only approximately - there always exists some violation due to numerical errors and algorithmic limitations (BLEIC-QP solver is most precise, AUL-QP solver is less precise). -- ALGLIB -- Copyright 22.08.2016 by Bochkanov Sergey *************************************************************************/ void minqpsetlcsparse(const minqpstate &state, const sparsematrix &c, const integer_1d_array &ct, const ae_int_t k) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minqpsetlcsparse(const_cast(state.c_ptr()), const_cast(c.c_ptr()), const_cast(ct.c_ptr()), k, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function sets mixed linear constraints, which include a set of dense rows, and a set of sparse rows. This function overrides results of previous calls to minqpsetlc(), minqpsetlcsparse() and minqpsetlcmixed(). This function may be useful if constraint matrix includes large number of both types of rows - dense and sparse. If you have just a few sparse rows, you may represent them in dense format without loosing performance. Similarly, if you have just a few dense rows, you may store them in sparse format with almost same performance. SUPPORT BY QP SOLVERS: Following QP solvers can handle mixed dense/sparse linear constraints: * BLEIC-QP - handles them with high precision, but can not utilize their sparsity - sparse constraint matrix is silently converted to dense format. Thus, it may be inefficient for problems with hundreds of constraints. * Dense-AUL-QP - although this solver uses dense linear algebra to calculate Cholesky preconditioner, it may efficiently handle sparse constraints. It may solve problems with hundreds and thousands of constraints. The only drawback is that precision of constraint handling is typically within 1E-4... ..1E-6 range. Following QP solvers can NOT handle mixed linear constraints: * QuickQP - can not handle general linear constraints at all INPUT PARAMETERS: State - structure previously allocated with MinQPCreate call. DenseC - dense linear constraints, array[K,N+1]. Each row of DenseC represents one constraint, either equality or inequality (see below): * first N elements correspond to coefficients, * last element corresponds to the right part. All elements of DenseC (including right part) must be finite. 
DenseCT - type of constraints, array[K]: * if DenseCT[i]>0, then I-th constraint is DenseC[i,*]*x >= DenseC[i,n+1] * if DenseCT[i]=0, then I-th constraint is DenseC[i,*]*x = DenseC[i,n+1] * if DenseCT[i]<0, then I-th constraint is DenseC[i,*]*x <= DenseC[i,n+1] DenseK - number of equality/inequality constraints, DenseK>=0 SparseC - linear constraints, sparse matrix with dimensions at least [SparseK,N+1]. If matrix has larger size, only leading SPARSEKx(N+1) rectangle is used. Each row of C represents one constraint, either equality or inequality (see below): * first N elements correspond to coefficients, * last element corresponds to the right part. All elements of C (including right part) must be finite. SparseCT- type of sparse constraints, array[K]: * if SparseCT[i]>0, then I-th constraint is SparseC[i,*]*x >= SparseC[i,n+1] * if SparseCT[i]=0, then I-th constraint is SparseC[i,*]*x = SparseC[i,n+1] * if SparseCT[i]<0, then I-th constraint is SparseC[i,*]*x <= SparseC[i,n+1] SparseK - number of sparse equality/inequality constraints, K>=0 NOTE 1: linear (non-bound) constraints are satisfied only approximately - there always exists some violation due to numerical errors and algorithmic limitations (BLEIC-QP solver is most precise, AUL-QP solver is less precise). -- ALGLIB -- Copyright 22.08.2016 by Bochkanov Sergey *************************************************************************/ void minqpsetlcmixed(const minqpstate &state, const real_2d_array &densec, const integer_1d_array &densect, const ae_int_t densek, const sparsematrix &sparsec, const integer_1d_array &sparsect, const ae_int_t sparsek) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minqpsetlcmixed(const_cast(state.c_ptr()), const_cast(densec.c_ptr()), const_cast(densect.c_ptr()), densek, const_cast(sparsec.c_ptr()), const_cast(sparsect.c_ptr()), sparsek, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function solves quadratic programming problem. Prior to calling this function you should choose solver by means of one of the following functions: * minqpsetalgoquickqp() - for QuickQP solver * minqpsetalgobleic() - for BLEIC-QP solver * minqpsetalgodenseaul() - for Dense-AUL-QP solver These functions also allow you to control stopping criteria of the solver. If you did not set solver, MinQP subpackage will automatically select solver for your problem and will run it with default stopping criteria. However, it is better to set explicitly solver and its stopping criteria. INPUT PARAMETERS: State - algorithm state You should use MinQPResults() function to access results after calls to this function. -- ALGLIB -- Copyright 11.01.2011 by Bochkanov Sergey. Special thanks to Elvira Illarionova for important suggestions on the linearly constrained QP algorithm. 
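The following end-to-end sketch is not one of the official ALGLIB examples; the data is purely illustrative and the Dense-AUL solver with these settings is just one reasonable choice. It minimizes 0.5*x'*A*x + b'*x subject to x0+x1=2 and x>=0:

    alglib::real_2d_array a = "[[2,0],[0,2]]";           // quadratic term A
    alglib::real_1d_array b = "[-6,-4]";                 // linear term b
    alglib::real_1d_array bndl = "[0,0]", bndu = "[+inf,+inf]";
    alglib::real_2d_array c = "[[1,1,2]]";               // row = [coefficients..., right part]: x0+x1 = 2
    alglib::integer_1d_array ct = "[0]";                 // 0 means equality
    alglib::real_1d_array s = "[1,1]", x;
    alglib::minqpstate state;
    alglib::minqpreport rep;
    alglib::minqpcreate(2, state);
    alglib::minqpsetquadraticterm(state, a);
    alglib::minqpsetlinearterm(state, b);
    alglib::minqpsetbc(state, bndl, bndu);
    alglib::minqpsetlc(state, c, ct);
    alglib::minqpsetscale(state, s);
    alglib::minqpsetalgodenseaul(state, 1.0e-9, 1.0e+4, 0);  // Rho=10000, automatic outer iteration count
    alglib::minqpoptimize(state);
    alglib::minqpresults(state, x, rep);                 // expect x near (1.5, 0.5); check rep.terminationtype>0 first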
*************************************************************************/ void minqpoptimize(const minqpstate &state) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minqpoptimize(const_cast(state.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* QP solver results INPUT PARAMETERS: State - algorithm state OUTPUT PARAMETERS: X - array[0..N-1], solution. This array is allocated and initialized only when Rep.TerminationType parameter is positive (success). Rep - optimization report. You should check Rep.TerminationType, which contains completion code, and you may check another fields which contain another information about algorithm functioning. Failure codes returned by algorithm are: * -5 inappropriate solver was used: * QuickQP solver for problem with general linear constraints * -4 BLEIC-QP/QuickQP solver found unconstrained direction of negative curvature (function is unbounded from below even under constraints), no meaningful minimum can be found. * -3 inconsistent constraints (or maybe feasible point is too hard to find). If you are sure that constraints are feasible, try to restart optimizer with better initial approximation. Completion codes specific for Cholesky algorithm: * 4 successful completion Completion codes specific for BLEIC/QuickQP algorithms: * 1 relative function improvement is no more than EpsF. * 2 scaled step is no more than EpsX. * 4 scaled gradient norm is no more than EpsG. * 5 MaxIts steps was taken -- ALGLIB -- Copyright 11.01.2011 by Bochkanov Sergey *************************************************************************/ void minqpresults(const minqpstate &state, real_1d_array &x, minqpreport &rep) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minqpresults(const_cast(state.c_ptr()), const_cast(x.c_ptr()), const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* QP results Buffered implementation of MinQPResults() which uses pre-allocated buffer to store X[]. If buffer size is too small, it resizes buffer. It is intended to be used in the inner cycles of performance critical algorithms where array reallocation penalty is too large to be ignored. -- ALGLIB -- Copyright 11.01.2011 by Bochkanov Sergey *************************************************************************/ void minqpresultsbuf(const minqpstate &state, real_1d_array &x, minqpreport &rep) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minqpresultsbuf(const_cast(state.c_ptr()), const_cast(x.c_ptr()), const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This object stores nonlinear optimizer state. 
You should use functions provided by MinNLC subpackage to work with this object *************************************************************************/ _minnlcstate_owner::_minnlcstate_owner() { p_struct = (alglib_impl::minnlcstate*)alglib_impl::ae_malloc(sizeof(alglib_impl::minnlcstate), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_minnlcstate_init(p_struct, NULL); } _minnlcstate_owner::_minnlcstate_owner(const _minnlcstate_owner &rhs) { p_struct = (alglib_impl::minnlcstate*)alglib_impl::ae_malloc(sizeof(alglib_impl::minnlcstate), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_minnlcstate_init_copy(p_struct, const_cast(rhs.p_struct), NULL); } _minnlcstate_owner& _minnlcstate_owner::operator=(const _minnlcstate_owner &rhs) { if( this==&rhs ) return *this; alglib_impl::_minnlcstate_clear(p_struct); alglib_impl::_minnlcstate_init_copy(p_struct, const_cast(rhs.p_struct), NULL); return *this; } _minnlcstate_owner::~_minnlcstate_owner() { alglib_impl::_minnlcstate_clear(p_struct); ae_free(p_struct); } alglib_impl::minnlcstate* _minnlcstate_owner::c_ptr() { return p_struct; } alglib_impl::minnlcstate* _minnlcstate_owner::c_ptr() const { return const_cast(p_struct); } minnlcstate::minnlcstate() : _minnlcstate_owner() ,needfi(p_struct->needfi),needfij(p_struct->needfij),xupdated(p_struct->xupdated),f(p_struct->f),fi(&p_struct->fi),j(&p_struct->j),x(&p_struct->x) { } minnlcstate::minnlcstate(const minnlcstate &rhs):_minnlcstate_owner(rhs) ,needfi(p_struct->needfi),needfij(p_struct->needfij),xupdated(p_struct->xupdated),f(p_struct->f),fi(&p_struct->fi),j(&p_struct->j),x(&p_struct->x) { } minnlcstate& minnlcstate::operator=(const minnlcstate &rhs) { if( this==&rhs ) return *this; _minnlcstate_owner::operator=(rhs); return *this; } minnlcstate::~minnlcstate() { } /************************************************************************* This structure stores optimization report: * IterationsCount total number of inner iterations * NFEV number of gradient evaluations * TerminationType termination type (see below) TERMINATION CODES TerminationType field contains completion code, which can be: -8 internal integrity control detected infinite or NAN values in function/gradient. Abnormal termination signalled. -7 gradient verification failed. See MinNLCSetGradientCheck() for more information. 1 relative function improvement is no more than EpsF. 2 relative step is no more than EpsX. 4 gradient norm is no more than EpsG 5 MaxIts steps was taken 7 stopping conditions are too stringent, further improvement is impossible, X contains best point found so far. Other fields of this structure are not documented and should not be used! 
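A short sketch of checking the completion code after the solver has finished (it assumes an optimizer state prepared and run elsewhere; minnlcresults() itself is documented later in this manual):

    alglib::real_1d_array x;
    alglib::minnlcreport rep;
    alglib::minnlcresults(state, x, rep);
    if( rep.terminationtype<0 )
        printf("MinNLC failed, completion code %d\n", (int)rep.terminationtype);
    else
        printf("solution: %s\n", x.tostring(4).c_str());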
*************************************************************************/ _minnlcreport_owner::_minnlcreport_owner() { p_struct = (alglib_impl::minnlcreport*)alglib_impl::ae_malloc(sizeof(alglib_impl::minnlcreport), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_minnlcreport_init(p_struct, NULL); } _minnlcreport_owner::_minnlcreport_owner(const _minnlcreport_owner &rhs) { p_struct = (alglib_impl::minnlcreport*)alglib_impl::ae_malloc(sizeof(alglib_impl::minnlcreport), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_minnlcreport_init_copy(p_struct, const_cast(rhs.p_struct), NULL); } _minnlcreport_owner& _minnlcreport_owner::operator=(const _minnlcreport_owner &rhs) { if( this==&rhs ) return *this; alglib_impl::_minnlcreport_clear(p_struct); alglib_impl::_minnlcreport_init_copy(p_struct, const_cast(rhs.p_struct), NULL); return *this; } _minnlcreport_owner::~_minnlcreport_owner() { alglib_impl::_minnlcreport_clear(p_struct); ae_free(p_struct); } alglib_impl::minnlcreport* _minnlcreport_owner::c_ptr() { return p_struct; } alglib_impl::minnlcreport* _minnlcreport_owner::c_ptr() const { return const_cast(p_struct); } minnlcreport::minnlcreport() : _minnlcreport_owner() ,iterationscount(p_struct->iterationscount),nfev(p_struct->nfev),varidx(p_struct->varidx),funcidx(p_struct->funcidx),terminationtype(p_struct->terminationtype),dbgphase0its(p_struct->dbgphase0its) { } minnlcreport::minnlcreport(const minnlcreport &rhs):_minnlcreport_owner(rhs) ,iterationscount(p_struct->iterationscount),nfev(p_struct->nfev),varidx(p_struct->varidx),funcidx(p_struct->funcidx),terminationtype(p_struct->terminationtype),dbgphase0its(p_struct->dbgphase0its) { } minnlcreport& minnlcreport::operator=(const minnlcreport &rhs) { if( this==&rhs ) return *this; _minnlcreport_owner::operator=(rhs); return *this; } minnlcreport::~minnlcreport() { } /************************************************************************* NONLINEARLY CONSTRAINED OPTIMIZATION WITH PRECONDITIONED AUGMENTED LAGRANGIAN ALGORITHM DESCRIPTION: The subroutine minimizes function F(x) of N arguments subject to any combination of: * bound constraints * linear inequality constraints * linear equality constraints * nonlinear equality constraints Gi(x)=0 * nonlinear inequality constraints Hi(x)<=0 REQUIREMENTS: * user must provide function value and gradient for F(), H(), G() * starting point X0 must be feasible or not too far away from the feasible set * F(), G(), H() are twice continuously differentiable on the feasible set and its neighborhood * nonlinear constraints G() and H() must have non-zero gradient at G(x)=0 and at H(x)=0. Say, constraint like x^2>=1 is supported, but x^2>=0 is NOT supported. USAGE: Constrained optimization if far more complex than the unconstrained one. Nonlinearly constrained optimization is one of the most esoteric numerical procedures. Here we give very brief outline of the MinNLC optimizer. We strongly recommend you to study examples in the ALGLIB Reference Manual and to read ALGLIB User Guide on optimization, which is available at http://www.alglib.net/optimization/ 1. User initializes algorithm state with MinNLCCreate() call and chooses what NLC solver to use. There is some solver which is used by default, with default settings, but you should NOT rely on default choice. It may change in future releases of ALGLIB without notice, and no one can guarantee that new solver will be able to solve your problem with default settings. 
From the other side, if you choose solver explicitly, you can be pretty sure that it will work with new ALGLIB releases. In the current release following solvers can be used: * AUL solver (activated with MinNLCSetAlgoAUL() function) 2. User adds boundary and/or linear and/or nonlinear constraints by means of calling one of the following functions: a) MinNLCSetBC() for boundary constraints b) MinNLCSetLC() for linear constraints c) MinNLCSetNLC() for nonlinear constraints You may combine (a), (b) and (c) in one optimization problem. 3. User sets scale of the variables with MinNLCSetScale() function. It is VERY important to set scale of the variables, because nonlinearly constrained problems are hard to solve when variables are badly scaled. 4. User sets stopping conditions with MinNLCSetCond(). If NLC solver uses inner/outer iteration layout, this function sets stopping conditions for INNER iterations. 5. User chooses one of the preconditioning methods. Preconditioning is very important for efficient handling of boundary/linear/nonlinear constraints. Without preconditioning algorithm would require thousands of iterations even for simple problems. Several preconditioners can be used: a) inexact LBFGS-based, with O(N*K) evaluation time b) exact low rank one, with O(N*K^2) evaluation time c) exact robust one, with O(N^3+K*N^2) evaluation time where K is a total number of general linear and nonlinear constraints (box ones are not counted). Since version 3.11.0 ALGLIB uses exact robust preconditioner as default option, but in some cases exact low rank one may be better option. 6. Finally, user calls MinNLCOptimize() function which takes algorithm state and pointer (delegate, etc.) to callback function which calculates F/G/H. 7. User calls MinNLCResults() to get solution 8. Optionally user may call MinNLCRestartFrom() to solve another problem with same N but another starting point. MinNLCRestartFrom() allows to reuse already initialized structure. INPUT PARAMETERS: N - problem dimension, N>0: * if given, only leading N elements of X are used * if not given, automatically determined from size ofX X - starting point, array[N]: * it is better to set X to a feasible point * but X can be infeasible, in which case algorithm will try to find feasible point first, using X as initial approximation. 
OUTPUT PARAMETERS: State - structure stores algorithm state -- ALGLIB -- Copyright 06.06.2014 by Bochkanov Sergey *************************************************************************/ void minnlccreate(const ae_int_t n, const real_1d_array &x, minnlcstate &state) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minnlccreate(n, const_cast(x.c_ptr()), const_cast(state.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* NONLINEARLY CONSTRAINED OPTIMIZATION WITH PRECONDITIONED AUGMENTED LAGRANGIAN ALGORITHM DESCRIPTION: The subroutine minimizes function F(x) of N arguments subject to any combination of: * bound constraints * linear inequality constraints * linear equality constraints * nonlinear equality constraints Gi(x)=0 * nonlinear inequality constraints Hi(x)<=0 REQUIREMENTS: * user must provide function value and gradient for F(), H(), G() * starting point X0 must be feasible or not too far away from the feasible set * F(), G(), H() are twice continuously differentiable on the feasible set and its neighborhood * nonlinear constraints G() and H() must have non-zero gradient at G(x)=0 and at H(x)=0. Say, constraint like x^2>=1 is supported, but x^2>=0 is NOT supported. USAGE: Constrained optimization if far more complex than the unconstrained one. Nonlinearly constrained optimization is one of the most esoteric numerical procedures. Here we give very brief outline of the MinNLC optimizer. We strongly recommend you to study examples in the ALGLIB Reference Manual and to read ALGLIB User Guide on optimization, which is available at http://www.alglib.net/optimization/ 1. User initializes algorithm state with MinNLCCreate() call and chooses what NLC solver to use. There is some solver which is used by default, with default settings, but you should NOT rely on default choice. It may change in future releases of ALGLIB without notice, and no one can guarantee that new solver will be able to solve your problem with default settings. From the other side, if you choose solver explicitly, you can be pretty sure that it will work with new ALGLIB releases. In the current release following solvers can be used: * AUL solver (activated with MinNLCSetAlgoAUL() function) 2. User adds boundary and/or linear and/or nonlinear constraints by means of calling one of the following functions: a) MinNLCSetBC() for boundary constraints b) MinNLCSetLC() for linear constraints c) MinNLCSetNLC() for nonlinear constraints You may combine (a), (b) and (c) in one optimization problem. 3. User sets scale of the variables with MinNLCSetScale() function. It is VERY important to set scale of the variables, because nonlinearly constrained problems are hard to solve when variables are badly scaled. 4. User sets stopping conditions with MinNLCSetCond(). If NLC solver uses inner/outer iteration layout, this function sets stopping conditions for INNER iterations. 5. User chooses one of the preconditioning methods. Preconditioning is very important for efficient handling of boundary/linear/nonlinear constraints. Without preconditioning algorithm would require thousands of iterations even for simple problems. 
Several preconditioners can be used: a) inexact LBFGS-based, with O(N*K) evaluation time b) exact low rank one, with O(N*K^2) evaluation time c) exact robust one, with O(N^3+K*N^2) evaluation time where K is a total number of general linear and nonlinear constraints (box ones are not counted). Since version 3.11.0 ALGLIB uses exact robust preconditioner as default option, but in some cases exact low rank one may be better option. 6. Finally, user calls MinNLCOptimize() function which takes algorithm state and pointer (delegate, etc.) to callback function which calculates F/G/H. 7. User calls MinNLCResults() to get solution 8. Optionally user may call MinNLCRestartFrom() to solve another problem with same N but another starting point. MinNLCRestartFrom() allows to reuse already initialized structure. INPUT PARAMETERS: N - problem dimension, N>0: * if given, only leading N elements of X are used * if not given, automatically determined from size ofX X - starting point, array[N]: * it is better to set X to a feasible point * but X can be infeasible, in which case algorithm will try to find feasible point first, using X as initial approximation. OUTPUT PARAMETERS: State - structure stores algorithm state -- ALGLIB -- Copyright 06.06.2014 by Bochkanov Sergey *************************************************************************/ void minnlccreate(const real_1d_array &x, minnlcstate &state) { alglib_impl::ae_state _alglib_env_state; ae_int_t n; n = x.length(); alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minnlccreate(n, const_cast(x.c_ptr()), const_cast(state.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This subroutine is a finite difference variant of MinNLCCreate(). It uses finite differences in order to differentiate target function. Description below contains information which is specific to this function only. We recommend to read comments on MinNLCCreate() in order to get more information about creation of NLC optimizer. INPUT PARAMETERS: N - problem dimension, N>0: * if given, only leading N elements of X are used * if not given, automatically determined from size ofX X - starting point, array[N]: * it is better to set X to a feasible point * but X can be infeasible, in which case algorithm will try to find feasible point first, using X as initial approximation. DiffStep- differentiation step, >0 OUTPUT PARAMETERS: State - structure stores algorithm state NOTES: 1. algorithm uses 4-point central formula for differentiation. 2. differentiation step along I-th axis is equal to DiffStep*S[I] where S[] is scaling vector which can be set by MinNLCSetScale() call. 3. we recommend you to use moderate values of differentiation step. Too large step will result in too large TRUNCATION errors, while too small step will result in too large NUMERICAL errors. 1.0E-4 can be good value to start from. 4. Numerical differentiation is very inefficient - one gradient calculation needs 4*N function evaluations. This function will work for any N - either small (1...10), moderate (10...100) or large (100...). However, performance penalty will be too severe for any N's except for small ones. We should also say that code which relies on numerical differentiation is less robust and precise. Imprecise gradient may slow down convergence, especially on highly nonlinear problems. 
Thus we recommend to use this function for fast prototyping on small- dimensional problems only, and to implement analytical gradient as soon as possible. -- ALGLIB -- Copyright 06.06.2014 by Bochkanov Sergey *************************************************************************/ void minnlccreatef(const ae_int_t n, const real_1d_array &x, const double diffstep, minnlcstate &state) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minnlccreatef(n, const_cast(x.c_ptr()), diffstep, const_cast(state.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This subroutine is a finite difference variant of MinNLCCreate(). It uses finite differences in order to differentiate target function. Description below contains information which is specific to this function only. We recommend to read comments on MinNLCCreate() in order to get more information about creation of NLC optimizer. INPUT PARAMETERS: N - problem dimension, N>0: * if given, only leading N elements of X are used * if not given, automatically determined from size ofX X - starting point, array[N]: * it is better to set X to a feasible point * but X can be infeasible, in which case algorithm will try to find feasible point first, using X as initial approximation. DiffStep- differentiation step, >0 OUTPUT PARAMETERS: State - structure stores algorithm state NOTES: 1. algorithm uses 4-point central formula for differentiation. 2. differentiation step along I-th axis is equal to DiffStep*S[I] where S[] is scaling vector which can be set by MinNLCSetScale() call. 3. we recommend you to use moderate values of differentiation step. Too large step will result in too large TRUNCATION errors, while too small step will result in too large NUMERICAL errors. 1.0E-4 can be good value to start from. 4. Numerical differentiation is very inefficient - one gradient calculation needs 4*N function evaluations. This function will work for any N - either small (1...10), moderate (10...100) or large (100...). However, performance penalty will be too severe for any N's except for small ones. We should also say that code which relies on numerical differentiation is less robust and precise. Imprecise gradient may slow down convergence, especially on highly nonlinear problems. Thus we recommend to use this function for fast prototyping on small- dimensional problems only, and to implement analytical gradient as soon as possible. -- ALGLIB -- Copyright 06.06.2014 by Bochkanov Sergey *************************************************************************/ void minnlccreatef(const real_1d_array &x, const double diffstep, minnlcstate &state) { alglib_impl::ae_state _alglib_env_state; ae_int_t n; n = x.length(); alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minnlccreatef(n, const_cast(x.c_ptr()), diffstep, const_cast(state.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function sets boundary constraints for NLC optimizer. Boundary constraints are inactive by default (after initial creation). They are preserved after algorithm restart with MinNLCRestartFrom(). 
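A minimal sketch of the finite-difference creation path described above, with a hypothetical 2-dimensional starting point and the recommended DiffStep=1.0E-4:

    alglib::real_1d_array x0 = "[0.5, 0.5]";
    alglib::minnlcstate state;
    alglib::minnlccreatef(x0, 1.0e-4, state);    // N is inferred from x0; gradient is approximated numerically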
You may combine boundary constraints with general linear ones - and with nonlinear ones! Boundary constraints are handled more efficiently than other types. Thus, if your problem has mixed constraints, you may explicitly specify some of them as boundary and save some time/space. INPUT PARAMETERS: State - structure stores algorithm state BndL - lower bounds, array[N]. If some (all) variables are unbounded, you may specify very small number or -INF. BndU - upper bounds, array[N]. If some (all) variables are unbounded, you may specify very large number or +INF. NOTE 1: it is possible to specify BndL[i]=BndU[i]. In this case I-th variable will be "frozen" at X[i]=BndL[i]=BndU[i]. NOTE 2: when you solve your problem with augmented Lagrangian solver, boundary constraints are satisfied only approximately! It is possible that algorithm will evaluate function outside of feasible area! -- ALGLIB -- Copyright 06.06.2014 by Bochkanov Sergey *************************************************************************/ void minnlcsetbc(const minnlcstate &state, const real_1d_array &bndl, const real_1d_array &bndu) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minnlcsetbc(const_cast(state.c_ptr()), const_cast(bndl.c_ptr()), const_cast(bndu.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function sets linear constraints for MinNLC optimizer. Linear constraints are inactive by default (after initial creation). They are preserved after algorithm restart with MinNLCRestartFrom(). You may combine linear constraints with boundary ones - and with nonlinear ones! If your problem has mixed constraints, you may explicitly specify some of them as linear. It may help optimizer to handle them more efficiently. INPUT PARAMETERS: State - structure previously allocated with MinNLCCreate call. C - linear constraints, array[K,N+1]. Each row of C represents one constraint, either equality or inequality (see below): * first N elements correspond to coefficients, * last element corresponds to the right part. All elements of C (including right part) must be finite. CT - type of constraints, array[K]: * if CT[i]>0, then I-th constraint is C[i,*]*x >= C[i,n+1] * if CT[i]=0, then I-th constraint is C[i,*]*x = C[i,n+1] * if CT[i]<0, then I-th constraint is C[i,*]*x <= C[i,n+1] K - number of equality/inequality constraints, K>=0: * if given, only leading K elements of C/CT are used * if not given, automatically determined from sizes of C/CT NOTE 1: when you solve your problem with augmented Lagrangian solver, linear constraints are satisfied only approximately! It is possible that algorithm will evaluate function outside of feasible area! 
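For illustration, a sketch which assumes a 2-dimensional MinNLC problem created elsewhere and adds one linear equality and one linear inequality:

    alglib::real_2d_array c = "[[1,1,1],[1,-1,0.5]]";    // rows = [coefficients..., right part]
    alglib::integer_1d_array ct = "[0,-1]";              // 0: "=", negative: "<="
    alglib::minnlcsetlc(state, c, ct);                   // x0+x1=1 and x0-x1<=0.5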
-- ALGLIB -- Copyright 06.06.2014 by Bochkanov Sergey *************************************************************************/ void minnlcsetlc(const minnlcstate &state, const real_2d_array &c, const integer_1d_array &ct, const ae_int_t k) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minnlcsetlc(const_cast(state.c_ptr()), const_cast(c.c_ptr()), const_cast(ct.c_ptr()), k, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function sets linear constraints for MinNLC optimizer. Linear constraints are inactive by default (after initial creation). They are preserved after algorithm restart with MinNLCRestartFrom(). You may combine linear constraints with boundary ones - and with nonlinear ones! If your problem has mixed constraints, you may explicitly specify some of them as linear. It may help optimizer to handle them more efficiently. INPUT PARAMETERS: State - structure previously allocated with MinNLCCreate call. C - linear constraints, array[K,N+1]. Each row of C represents one constraint, either equality or inequality (see below): * first N elements correspond to coefficients, * last element corresponds to the right part. All elements of C (including right part) must be finite. CT - type of constraints, array[K]: * if CT[i]>0, then I-th constraint is C[i,*]*x >= C[i,n+1] * if CT[i]=0, then I-th constraint is C[i,*]*x = C[i,n+1] * if CT[i]<0, then I-th constraint is C[i,*]*x <= C[i,n+1] K - number of equality/inequality constraints, K>=0: * if given, only leading K elements of C/CT are used * if not given, automatically determined from sizes of C/CT NOTE 1: when you solve your problem with augmented Lagrangian solver, linear constraints are satisfied only approximately! It is possible that algorithm will evaluate function outside of feasible area! -- ALGLIB -- Copyright 06.06.2014 by Bochkanov Sergey *************************************************************************/ void minnlcsetlc(const minnlcstate &state, const real_2d_array &c, const integer_1d_array &ct) { alglib_impl::ae_state _alglib_env_state; ae_int_t k; if( (c.rows()!=ct.length())) throw ap_error("Error while calling 'minnlcsetlc': looks like one of arguments has wrong size"); k = c.rows(); alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minnlcsetlc(const_cast(state.c_ptr()), const_cast(c.c_ptr()), const_cast(ct.c_ptr()), k, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function sets nonlinear constraints for MinNLC optimizer. In fact, this function sets NUMBER of nonlinear constraints. Constraints itself (constraint functions) are passed to MinNLCOptimize() method. This method requires user-defined vector function F[] and its Jacobian J[], where: * first component of F[] and first row of Jacobian J[] corresponds to function being minimized * next NLEC components of F[] (and rows of J) correspond to nonlinear equality constraints G_i(x)=0 * next NLIC components of F[] (and rows of J) correspond to nonlinear inequality constraints H_i(x)<=0 NOTE: you may combine nonlinear constraints with linear/boundary ones. 
If your problem has mixed constraints, you may explicitly specify some of them as linear ones. It may help optimizer to handle them more efficiently. INPUT PARAMETERS: State - structure previously allocated with MinNLCCreate call. NLEC - number of Non-Linear Equality Constraints (NLEC), >=0 NLIC - number of Non-Linear Inquality Constraints (NLIC), >=0 NOTE 1: when you solve your problem with augmented Lagrangian solver, nonlinear constraints are satisfied only approximately! It is possible that algorithm will evaluate function outside of feasible area! NOTE 2: algorithm scales variables according to scale specified by MinNLCSetScale() function, so it can handle problems with badly scaled variables (as long as we KNOW their scales). However, there is no way to automatically scale nonlinear constraints Gi(x) and Hi(x). Inappropriate scaling of Gi/Hi may ruin convergence. Solving problem with constraint "1000*G0(x)=0" is NOT same as solving it with constraint "0.001*G0(x)=0". It means that YOU are the one who is responsible for correct scaling of nonlinear constraints Gi(x) and Hi(x). We recommend you to scale nonlinear constraints in such way that I-th component of dG/dX (or dH/dx) has approximately unit magnitude (for problems with unit scale) or has magnitude approximately equal to 1/S[i] (where S is a scale set by MinNLCSetScale() function). -- ALGLIB -- Copyright 06.06.2014 by Bochkanov Sergey *************************************************************************/ void minnlcsetnlc(const minnlcstate &state, const ae_int_t nlec, const ae_int_t nlic) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minnlcsetnlc(const_cast(state.c_ptr()), nlec, nlic, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function sets stopping conditions for inner iterations of optimizer. INPUT PARAMETERS: State - structure which stores algorithm state EpsG - >=0 The subroutine finishes its work if the condition |v|=0 The subroutine finishes its work if on k+1-th iteration the condition |F(k+1)-F(k)|<=EpsF*max{|F(k)|,|F(k+1)|,1} is satisfied. EpsX - >=0 The subroutine finishes its work if on k+1-th iteration the condition |v|<=EpsX is fulfilled, where: * |.| means Euclidian norm * v - scaled step vector, v[i]=dx[i]/s[i] * dx - step vector, dx=X(k+1)-X(k) * s - scaling coefficients set by MinNLCSetScale() MaxIts - maximum number of iterations. If MaxIts=0, the number of iterations is unlimited. Passing EpsG=0, EpsF=0 and EpsX=0 and MaxIts=0 (simultaneously) will lead to automatic stopping criterion selection. -- ALGLIB -- Copyright 06.06.2014 by Bochkanov Sergey *************************************************************************/ void minnlcsetcond(const minnlcstate &state, const double epsg, const double epsf, const double epsx, const ae_int_t maxits) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minnlcsetcond(const_cast(state.c_ptr()), epsg, epsf, epsx, maxits, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function sets scaling coefficients for NLC optimizer. 
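To make the F[]/J[] layout required by minnlcsetnlc() concrete, here is a hypothetical callback sketch for a 2-dimensional problem with one nonlinear equality and one nonlinear inequality (the target function and constraints are illustrative only; the callback is later passed to minnlcoptimize()):

    void nlcfunc_jac(const alglib::real_1d_array &x, alglib::real_1d_array &fi, alglib::real_2d_array &jac, void *ptr)
    {
        fi[0] = x[0]*x[0] + x[1]*x[1];          // row 0: objective F(x)
        jac[0][0] = 2*x[0]; jac[0][1] = 2*x[1];
        fi[1] = x[0] + x[1] - 1.0;              // row 1: equality constraint G0(x)=0
        jac[1][0] = 1.0;    jac[1][1] = 1.0;
        fi[2] = x[0] - 0.5;                     // row 2: inequality constraint H0(x)<=0
        jac[2][0] = 1.0;    jac[2][1] = 0.0;
    }

    alglib::minnlcsetnlc(state, 1, 1);          // NLEC=1, NLIC=1 - matches rows 1 and 2 above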
ALGLIB optimizers use scaling matrices to test stopping conditions (step size and gradient are scaled before comparison with tolerances). Scale of the I-th variable is a translation invariant measure of: a) "how large" the variable is b) how large the step should be to make significant changes in the function Scaling is also used by finite difference variant of the optimizer - step along I-th axis is equal to DiffStep*S[I]. INPUT PARAMETERS: State - structure stores algorithm state S - array[N], non-zero scaling coefficients S[i] may be negative, sign doesn't matter. -- ALGLIB -- Copyright 06.06.2014 by Bochkanov Sergey *************************************************************************/ void minnlcsetscale(const minnlcstate &state, const real_1d_array &s) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minnlcsetscale(const_cast(state.c_ptr()), const_cast(s.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function sets preconditioner to "inexact LBFGS-based" mode. Preconditioning is very important for convergence of Augmented Lagrangian algorithm because presence of penalty term makes problem ill-conditioned. Difference between performance of preconditioned and unpreconditioned methods can be as large as 100x! MinNLC optimizer may use following preconditioners, each with its own benefits and drawbacks: a) inexact LBFGS-based, with O(N*K) evaluation time b) exact low rank one, with O(N*K^2) evaluation time c) exact robust one, with O(N^3+K*N^2) evaluation time where K is a total number of general linear and nonlinear constraints (box ones are not counted). Inexact LBFGS-based preconditioner uses L-BFGS formula combined with orthogonality assumption to perform very fast updates. For a N-dimensional problem with K general linear or nonlinear constraints (boundary ones are not counted) it has O(N*K) cost per iteration. This preconditioner has best quality (less iterations) when general linear and nonlinear constraints are orthogonal to each other (orthogonality with respect to boundary constraints is not required). Number of iterations increases when constraints are non-orthogonal, because algorithm assumes orthogonality, but still it is better than no preconditioner at all. INPUT PARAMETERS: State - structure stores algorithm state -- ALGLIB -- Copyright 26.09.2014 by Bochkanov Sergey *************************************************************************/ void minnlcsetprecinexact(const minnlcstate &state) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minnlcsetprecinexact(const_cast(state.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function sets preconditioner to "exact low rank" mode. Preconditioning is very important for convergence of Augmented Lagrangian algorithm because presence of penalty term makes problem ill-conditioned. Difference between performance of preconditioned and unpreconditioned methods can be as large as 100x! 
MinNLC optimizer may use following preconditioners, each with its own benefits and drawbacks: a) inexact LBFGS-based, with O(N*K) evaluation time b) exact low rank one, with O(N*K^2) evaluation time c) exact robust one, with O(N^3+K*N^2) evaluation time where K is a total number of general linear and nonlinear constraints (box ones are not counted). It also provides special unpreconditioned mode of operation which can be used for test purposes. Comments below discuss low rank preconditioner. Exact low-rank preconditioner uses Woodbury matrix identity to build quadratic model of the penalized function. It has following features: * no special assumptions about orthogonality of constraints * preconditioner evaluation is optimized for K<=N. * finally, stability of the process is guaranteed only for K<=N due to degeneracy of intermediate matrices. That's why we recommend to use "exact robust" preconditioner for such cases. RECOMMENDATIONS We recommend to choose between "exact low rank" and "exact robust" preconditioners, with "low rank" version being chosen when you know in advance that total count of non-box constraints won't exceed N, and "robust" version being chosen when you need bulletproof solution. INPUT PARAMETERS: State - structure stores algorithm state UpdateFreq- update frequency. Preconditioner is rebuilt after every UpdateFreq iterations. Recommended value: 10 or higher. Zero value means that good default value will be used. -- ALGLIB -- Copyright 26.09.2014 by Bochkanov Sergey *************************************************************************/ void minnlcsetprecexactlowrank(const minnlcstate &state, const ae_int_t updatefreq) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minnlcsetprecexactlowrank(const_cast(state.c_ptr()), updatefreq, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function sets preconditioner to "exact robust" mode. Preconditioning is very important for convergence of Augmented Lagrangian algorithm because presence of penalty term makes problem ill-conditioned. Difference between performance of preconditioned and unpreconditioned methods can be as large as 100x! MinNLC optimizer may use following preconditioners, each with its own benefits and drawbacks: a) inexact LBFGS-based, with O(N*K) evaluation time b) exact low rank one, with O(N*K^2) evaluation time c) exact robust one, with O(N^3+K*N^2) evaluation time where K is a total number of general linear and nonlinear constraints (box ones are not counted). It also provides special unpreconditioned mode of operation which can be used for test purposes. Comments below discuss robust preconditioner. Exact robust preconditioner uses Cholesky decomposition to invert approximate Hessian matrix H=D+W'*C*W (where D stands for diagonal terms of Hessian, combined result of initial scaling matrix and penalty from box constraints; W stands for general linear constraints and linearization of nonlinear ones; C stands for diagonal matrix of penalty coefficients). This preconditioner has following features: * no special assumptions about constraint structure * preconditioner is optimized for stability; unlike "exact low rank" version which fails for K>=N, this one works well for any value of K. * the only drawback is that is takes O(N^3+K*N^2) time to build it. 
No economical Woodbury update is applied even when it makes sense, thus there exist situations (K<<N) when this preconditioner is computationally inefficient. RECOMMENDATIONS We recommend to choose between "exact low rank" and "exact robust" preconditioners, with "low rank" version being chosen when you know in advance that total count of non-box constraints won't exceed N, and "robust" version being chosen when you need a bulletproof solution. INPUT PARAMETERS: State - structure stores algorithm state UpdateFreq- update frequency. Preconditioner is rebuilt after every UpdateFreq iterations. Recommended value: 10 or higher. Zero value means that good default value will be used. -- ALGLIB -- Copyright 26.09.2014 by Bochkanov Sergey *************************************************************************/ void minnlcsetprecexactrobust(const minnlcstate &state, const ae_int_t updatefreq) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minnlcsetprecexactrobust(const_cast(state.c_ptr()), updatefreq, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function sets preconditioner to "turned off" mode. Preconditioning is very important for convergence of Augmented Lagrangian algorithm because presence of penalty term makes problem ill-conditioned. Difference between performance of preconditioned and unpreconditioned methods can be as large as 100x! MinNLC optimizer may utilize two preconditioners, each with its own benefits and drawbacks: a) inexact LBFGS-based, and b) exact low rank one. It also provides special unpreconditioned mode of operation which can be used for test purposes. This function activates this test mode. Do not use it in production code to solve real-life problems. INPUT PARAMETERS: State - structure stores algorithm state -- ALGLIB -- Copyright 26.09.2014 by Bochkanov Sergey *************************************************************************/ void minnlcsetprecnone(const minnlcstate &state) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minnlcsetprecnone(const_cast(state.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function sets maximum step length (after scaling of step vector with respect to variable scales specified by minnlcsetscale() call). INPUT PARAMETERS: State - structure which stores algorithm state StpMax - maximum step length, >=0. Set StpMax to 0.0 (default), if you don't want to limit step length. Use this subroutine when you optimize target function which contains exp() or other fast growing functions, and optimization algorithm makes too large steps which lead to overflow. This function allows us to reject steps that are too large (and therefore expose us to the possible overflow) without actually calculating function value at the x+stp*d. -- ALGLIB -- Copyright 02.04.2010 by Bochkanov Sergey *************************************************************************/ void minnlcsetstpmax(const minnlcstate &state, const double stpmax) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minnlcsetstpmax(const_cast(state.c_ptr()), stpmax, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function tells MinNLC unit to use Augmented Lagrangian algorithm for nonlinearly constrained optimization. This algorithm is a slight modification of one described in "A Modified Barrier-Augmented Lagrangian Method for Constrained Minimization (1999)" by D.GOLDFARB, R.POLYAK, K. SCHEINBERG, I.YUZEFOVICH. 
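The trade-off between the preconditioner modes above can be expressed as a short hedged sketch; state is an assumed existing minnlcstate, n and k are placeholders for the problem dimensionality and the count of non-box constraints, and the update frequency 10 is the value recommended in the comments (minnlcsetprecexactrobust is the robust-mode setter described above):

    ae_int_t n = 100, k = 5;                    // assumed problem sizes
    if( k<n )
        minnlcsetprecexactlowrank(state, 10);   // O(N*K^2) per rebuild, recommended while K stays below N
    else
        minnlcsetprecexactrobust(state, 10);    // O(N^3+K*N^2), remains stable for any K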
Augmented Lagrangian algorithm works by converting problem of minimizing F(x) subject to equality/inequality constraints to unconstrained problem of the form min[ f(x) + Rho*PENALTY_EQ(x) + SHIFT_EQ(x,Nu1) + Rho*PENALTY_INEQ(x) + SHIFT_INEQ(x,Nu2) ] where: * Rho is a fixed penalization coefficient * PENALTY_EQ(x) is a penalty term, which is used to APPROXIMATELY enforce equality constraints * SHIFT_EQ(x) is a special "shift" term which is used to "fine-tune" equality constraints, greatly increasing precision * PENALTY_INEQ(x) is a penalty term which is used to approximately enforce inequality constraints * SHIFT_INEQ(x) is a special "shift" term which is used to "fine-tune" inequality constraints, greatly increasing precision * Nu1/Nu2 are vectors of Lagrange coefficients which are fine-tuned during outer iterations of algorithm This version of AUL algorithm uses preconditioner, which greatly accelerates convergence. Because this algorithm is similar to penalty methods, it may perform steps into infeasible area. All kinds of constraints (boundary, linear and nonlinear ones) may be violated in intermediate points - and in the solution. However, properly configured AUL method is significantly better at handling constraints than barrier and/or penalty methods. The very basic outline of algorithm is given below: 1) first outer iteration is performed with "default" values of Lagrange multipliers Nu1/Nu2. Solution quality is low (candidate point can be too far away from true solution; large violation of constraints is possible) and is comparable with that of penalty methods. 2) subsequent outer iterations refine Lagrange multipliers and improve quality of the solution. INPUT PARAMETERS: State - structure which stores algorithm state Rho - penalty coefficient, Rho>0: * large enough that algorithm converges with desired precision. Minimum value is 10*max(S'*diag(H)*S), where S is a scale matrix (set by MinNLCSetScale) and H is a Hessian of the function being minimized. If you can not easily estimate Hessian norm, see our recommendations below. * not TOO large to prevent ill-conditioning * for unit-scale problems (variables and Hessian have unit magnitude), Rho=100 or Rho=1000 can be used. * it is important to note that Rho is internally multiplied by scaling matrix, i.e. optimum value of Rho depends on scale of variables specified by MinNLCSetScale(). ItsCnt - number of outer iterations: * ItsCnt=0 means that small number of outer iterations is automatically chosen (10 iterations in current version). * ItsCnt=1 means that AUL algorithm performs just like a usual barrier method. * ItsCnt>1 means that AUL algorithm performs specified number of outer iterations HOW TO CHOOSE PARAMETERS Nonlinear optimization is a tricky area and Augmented Lagrangian algorithm is sometimes hard to tune. Good values of Rho and ItsCnt are problem-specific. In order to help you we prepared following set of recommendations: * for unit-scale problems (variables and Hessian have unit magnitude), Rho=100 or Rho=1000 can be used. * start from some small value of Rho and solve problem with just one outer iteration (ItsCnt=1). In this case algorithm behaves like penalty method. Increase Rho in 2x or 10x steps until you see that one outer iteration returns point which is "rough approximation to solution". It is very important to have Rho so large that penalty term becomes constraining, i.e. modified function becomes highly convex in constrained directions. 
From the other side, too large Rho may prevent you from converging to the solution. You can diagnose it by studying number of inner iterations performed by algorithm: too few (5-10 on 1000-dimensional problem) or too many (orders of magnitude more than dimensionality) usually means that Rho is too large. * with just one outer iteration you usually have low-quality solution. Some constraints can be violated with very large margin, while other ones (which are NOT violated in the true solution) can push final point too far in the inner area of the feasible set. For example, if you have constraint x0>=0 and true solution x0=1, then merely a presence of "x0>=0" will introduce a bias towards larger values of x0. Say, algorithm may stop at x0=1.5 instead of 1.0. * after you found good Rho, you may increase number of outer iterations. ItsCnt=10 is a good value. Subsequent outer iteration will refine values of Lagrange multipliers. Constraints which were violated will be enforced, inactive constraints will be dropped (corresponding multipliers will be decreased). Ideally, you should see 10-1000x improvement in constraint handling (constraint violation is reduced). * if you see that algorithm converges to vicinity of solution, but additional outer iterations do not refine solution, it may mean that algorithm is unstable - it wanders around true solution, but can not approach it. Sometimes algorithm may be stabilized by increasing Rho one more time, making it 5x or 10x larger. SCALING OF CONSTRAINTS [IMPORTANT] AUL optimizer scales variables according to scale specified by MinNLCSetScale() function, so it can handle problems with badly scaled variables (as long as we KNOW their scales). However, because function being optimized is a mix of original function and constraint-dependent penalty functions, it is important to rescale both variables AND constraints. Say, if you minimize f(x)=x^2 subject to 1000000*x>=0, then you have constraint whose scale is different from that of target function (another example is 0.000001*x>=0). It is also possible to have constraints whose scales are misaligned: 1000000*x0>=0, 0.000001*x1<=0. Inappropriate scaling may ruin convergence because minimizing x^2 subject to x>=0 is NOT same as minimizing it subject to 1000000*x>=0. Because we know coefficients of boundary/linear constraints, we can automatically rescale and normalize them. However, there is no way to automatically rescale nonlinear constraints Gi(x) and Hi(x) - they are black boxes. It means that YOU are the one who is responsible for correct scaling of nonlinear constraints Gi(x) and Hi(x). We recommend you to rescale nonlinear constraints in such way that I-th component of dG/dX (or dH/dx) has magnitude approximately equal to 1/S[i] (where S is a scale set by MinNLCSetScale() function). WHAT IF IT DOES NOT CONVERGE? It is possible that AUL algorithm fails to converge to precise values of Lagrange multipliers. It stops somewhere around true solution, but candidate point is still too far from solution, and some constraints are violated. Such kind of failure is specific for Lagrangian algorithms - technically, they stop at some point, but this point is not constrained solution. 
There are several reasons why the algorithm may fail to converge: a) too loose stopping criteria for inner iteration b) degenerate, redundant constraints c) target function has unconstrained extremum exactly at the boundary of some constraint d) numerical noise in the target function In all these cases algorithm is unstable - each outer iteration results in large and almost random step which improves handling of some constraints, but violates other ones (ideally outer iterations should form a sequence of progressively decreasing steps towards solution). The first possible reason is that too loose stopping criteria for inner iteration were specified. Augmented Lagrangian algorithm solves a sequence of intermediate problems, and requires each of them to be solved with high precision. Insufficient precision results in incorrect update of Lagrange multipliers. Another reason is that you may have specified degenerate constraints: say, some constraint was repeated twice. In most cases AUL algorithm gracefully handles such situations, but sometimes it may spend too much time figuring out subtle degeneracies in constraint matrix. Third reason is tricky and hard to diagnose. Consider situation when you minimize f=x^2 subject to constraint x>=0. Unconstrained extremum is located exactly at the boundary of constrained area. In this case algorithm will tend to oscillate between negative and positive x. Each time it stops at x<0 it "reinforces" constraint x>=0, and each time it bounces back to x>0 it "relaxes" constraint (and is attracted to x<0). Such situation sometimes happens in problems with hidden symmetries. The algorithm gets caught in a loop, with Lagrange multipliers being continuously increased/decreased. Luckily, such a loop forms after at least three iterations, so this problem can be solved by DECREASING number of outer iterations down to 1-2 and increasing penalty coefficient Rho as much as possible. Final reason is numerical noise. AUL algorithm is robust against moderate noise (more robust than, say, active set methods), but large noise may destabilize algorithm. -- ALGLIB -- Copyright 06.06.2014 by Bochkanov Sergey *************************************************************************/ void minnlcsetalgoaul(const minnlcstate &state, const double rho, const ae_int_t itscnt) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minnlcsetalgoaul(const_cast(state.c_ptr()), rho, itscnt, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function turns on/off reporting. INPUT PARAMETERS: State - structure which stores algorithm state NeedXRep- whether iteration reports are needed or not If NeedXRep is True, algorithm will call rep() callback function if it is provided to MinNLCOptimize(). NOTE: algorithm passes two parameters to rep() callback - current point and penalized function value at current point. Important - function value which is returned is NOT function being minimized. It is the sum of the function being minimized and the penalty term. 
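Following the tuning advice above, a hedged configuration fragment for a roughly unit-scale problem (state is an assumed existing minnlcstate; Rho=1000 and ItsCnt=10 are the values suggested in the comments, not universal constants) could look like this:

    minnlcsetalgoaul(state, 1000.0, 10);   // penalty coefficient Rho=1000, ten outer iterations
    minnlcsetxrep(state, true);            // optional: report progress through the rep() callback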
-- ALGLIB -- Copyright 28.11.2010 by Bochkanov Sergey *************************************************************************/ void minnlcsetxrep(const minnlcstate &state, const bool needxrep) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minnlcsetxrep(const_cast(state.c_ptr()), needxrep, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function provides reverse communication interface Reverse communication interface is not documented or recommended to use. See below for functions which provide better documented API *************************************************************************/ bool minnlciteration(const minnlcstate &state) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { ae_bool result = alglib_impl::minnlciteration(const_cast(state.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void minnlcoptimize(minnlcstate &state, void (*fvec)(const real_1d_array &x, real_1d_array &fi, void *ptr), void (*rep)(const real_1d_array &x, double func, void *ptr), void *ptr) { alglib_impl::ae_state _alglib_env_state; if( fvec==NULL ) throw ap_error("ALGLIB: error in 'minnlcoptimize()' (fvec is NULL)"); alglib_impl::ae_state_init(&_alglib_env_state); try { while( alglib_impl::minnlciteration(state.c_ptr(), &_alglib_env_state) ) { if( state.needfi ) { fvec(state.x, state.fi, ptr); continue; } if( state.xupdated ) { if( rep!=NULL ) rep(state.x, state.f, ptr); continue; } throw ap_error("ALGLIB: error in 'minnlcoptimize' (some derivatives were not provided?)"); } alglib_impl::ae_state_clear(&_alglib_env_state); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void minnlcoptimize(minnlcstate &state, void (*jac)(const real_1d_array &x, real_1d_array &fi, real_2d_array &jac, void *ptr), void (*rep)(const real_1d_array &x, double func, void *ptr), void *ptr) { alglib_impl::ae_state _alglib_env_state; if( jac==NULL ) throw ap_error("ALGLIB: error in 'minnlcoptimize()' (jac is NULL)"); alglib_impl::ae_state_init(&_alglib_env_state); try { while( alglib_impl::minnlciteration(state.c_ptr(), &_alglib_env_state) ) { if( state.needfij ) { jac(state.x, state.fi, state.j, ptr); continue; } if( state.xupdated ) { if( rep!=NULL ) rep(state.x, state.f, ptr); continue; } throw ap_error("ALGLIB: error in 'minnlcoptimize' (some derivatives were not provided?)"); } alglib_impl::ae_state_clear(&_alglib_env_state); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* MinNLC results INPUT PARAMETERS: State - algorithm state OUTPUT PARAMETERS: X - array[0..N-1], solution Rep - optimization report. You should check Rep.TerminationType in order to distinguish successful termination from unsuccessful one: * -8 internal integrity control detected infinite or NAN values in function/gradient. Abnormal termination signalled. * -7 gradient verification failed. See MinNLCSetGradientCheck() for more information. * 1 relative function improvement is no more than EpsF. * 2 scaled step is no more than EpsX. 
* 4 scaled gradient norm is no more than EpsG. * 5 MaxIts steps was taken More information about fields of this structure can be found in the comments on MinNLCReport datatype. -- ALGLIB -- Copyright 06.06.2014 by Bochkanov Sergey *************************************************************************/ void minnlcresults(const minnlcstate &state, real_1d_array &x, minnlcreport &rep) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minnlcresults(const_cast(state.c_ptr()), const_cast(x.c_ptr()), const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* NLC results Buffered implementation of MinNLCResults() which uses pre-allocated buffer to store X[]. If buffer size is too small, it resizes buffer. It is intended to be used in the inner cycles of performance critical algorithms where array reallocation penalty is too large to be ignored. -- ALGLIB -- Copyright 28.11.2010 by Bochkanov Sergey *************************************************************************/ void minnlcresultsbuf(const minnlcstate &state, real_1d_array &x, minnlcreport &rep) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minnlcresultsbuf(const_cast(state.c_ptr()), const_cast(x.c_ptr()), const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This subroutine restarts algorithm from new point. All optimization parameters (including constraints) are left unchanged. This function allows to solve multiple optimization problems (which must have same number of dimensions) without object reallocation penalty. INPUT PARAMETERS: State - structure previously allocated with MinNLCCreate call. X - new starting point. 
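Putting the MinNLC calls above together, the sketch below shows the analytic-Jacobian driving loop only; it assumes the optimizer was created and configured elsewhere (creation, constraint specification and stopping conditions are outside this fragment), and that the problem was set up so that fi[0] carries the target and fi[1] a single nonlinear constraint - an assumption about the problem layout rather than something fixed by the listing above:

#include "optimization.h"
using namespace alglib;

// Illustrative target f(x)=x0^2+x1^2 with one nonlinear constraint h(x)=x0+x1-1 (assumed layout)
void nlcjac(const real_1d_array &x, real_1d_array &fi, real_2d_array &jac, void *ptr)
{
    fi[0] = x[0]*x[0] + x[1]*x[1];              // objective value
    jac[0][0] = 2.0*x[0];   jac[0][1] = 2.0*x[1];
    fi[1] = x[0] + x[1] - 1.0;                  // constraint value
    jac[1][0] = 1.0;        jac[1][1] = 1.0;
}

void run_minnlc(minnlcstate &state)             // state assumed to be fully configured by the caller
{
    real_1d_array xsol;
    minnlcreport rep;
    minnlcoptimize(state, nlcjac, NULL, NULL);  // drives the reverse-communication loop shown above
    minnlcresults(state, xsol, rep);            // rep.terminationtype>0 means successful stopping
}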
-- ALGLIB -- Copyright 28.11.2010 by Bochkanov Sergey *************************************************************************/ void minnlcrestartfrom(const minnlcstate &state, const real_1d_array &x) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minnlcrestartfrom(const_cast(state.c_ptr()), const_cast(x.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This subroutine turns on verification of the user-supplied analytic gradient: * user calls this subroutine before optimization begins * MinNLCOptimize() is called * prior to actual optimization, for each component of parameters being optimized X[i] algorithm performs following steps: * two trial steps are made to X[i]-TestStep*S[i] and X[i]+TestStep*S[i], where X[i] is i-th component of the initial point and S[i] is a scale of i-th parameter * F(X) is evaluated at these trial points * we perform one more evaluation in the middle point of the interval * we build cubic model using function values and derivatives at trial points and we compare its prediction with actual value in the middle point * in case difference between prediction and actual value is higher than some predetermined threshold, algorithm stops with completion code -7; Rep.VarIdx is set to index of the parameter with incorrect derivative, and Rep.FuncIdx is set to index of the function. * after verification is over, algorithm proceeds to the actual optimization. NOTE 1: verification needs N (parameters count) gradient evaluations. It is very costly and you should use it only for low dimensional problems, when you want to be sure that you've correctly calculated analytic derivatives. You should not use it in the production code (unless you want to check derivatives provided by some third party). NOTE 2: you should carefully choose TestStep. Value which is too large (so large that function behaviour is significantly non-cubic) will lead to false alarms. You may use different step for different parameters by means of setting scale with MinNLCSetScale(). NOTE 3: this function may lead to false positives. In case it reports that I-th derivative was calculated incorrectly, you may decrease test step and try one more time - maybe your function changes too sharply and your step is too large for such rapidly chanding function. INPUT PARAMETERS: State - structure used to store algorithm state TestStep - verification step: * TestStep=0 turns verification off * TestStep>0 activates verification -- ALGLIB -- Copyright 15.06.2014 by Bochkanov Sergey *************************************************************************/ void minnlcsetgradientcheck(const minnlcstate &state, const double teststep) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minnlcsetgradientcheck(const_cast(state.c_ptr()), teststep, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This object stores nonlinear optimizer state. 
You should use functions provided by MinBC subpackage to work with this object *************************************************************************/ _minbcstate_owner::_minbcstate_owner() { p_struct = (alglib_impl::minbcstate*)alglib_impl::ae_malloc(sizeof(alglib_impl::minbcstate), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_minbcstate_init(p_struct, NULL); } _minbcstate_owner::_minbcstate_owner(const _minbcstate_owner &rhs) { p_struct = (alglib_impl::minbcstate*)alglib_impl::ae_malloc(sizeof(alglib_impl::minbcstate), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_minbcstate_init_copy(p_struct, const_cast(rhs.p_struct), NULL); } _minbcstate_owner& _minbcstate_owner::operator=(const _minbcstate_owner &rhs) { if( this==&rhs ) return *this; alglib_impl::_minbcstate_clear(p_struct); alglib_impl::_minbcstate_init_copy(p_struct, const_cast(rhs.p_struct), NULL); return *this; } _minbcstate_owner::~_minbcstate_owner() { alglib_impl::_minbcstate_clear(p_struct); ae_free(p_struct); } alglib_impl::minbcstate* _minbcstate_owner::c_ptr() { return p_struct; } alglib_impl::minbcstate* _minbcstate_owner::c_ptr() const { return const_cast(p_struct); } minbcstate::minbcstate() : _minbcstate_owner() ,needf(p_struct->needf),needfg(p_struct->needfg),xupdated(p_struct->xupdated),f(p_struct->f),g(&p_struct->g),x(&p_struct->x) { } minbcstate::minbcstate(const minbcstate &rhs):_minbcstate_owner(rhs) ,needf(p_struct->needf),needfg(p_struct->needfg),xupdated(p_struct->xupdated),f(p_struct->f),g(&p_struct->g),x(&p_struct->x) { } minbcstate& minbcstate::operator=(const minbcstate &rhs) { if( this==&rhs ) return *this; _minbcstate_owner::operator=(rhs); return *this; } minbcstate::~minbcstate() { } /************************************************************************* This structure stores optimization report: * IterationsCount number of iterations * NFEV number of gradient evaluations * TerminationType termination type (see below) TERMINATION CODES TerminationType field contains completion code, which can be: -8 internal integrity control detected infinite or NAN values in function/gradient. Abnormal termination signalled. -7 gradient verification failed. See MinBCSetGradientCheck() for more information. -3 inconsistent constraints. 1 relative function improvement is no more than EpsF. 2 relative step is no more than EpsX. 4 gradient norm is no more than EpsG 5 MaxIts steps was taken 7 stopping conditions are too stringent, further improvement is impossible, X contains best point found so far. 8 terminated by user who called minbcrequesttermination(). X contains point which was "current accepted" when termination request was submitted. 
ADDITIONAL FIELDS There are additional fields which can be used for debugging: * DebugEqErr error in the equality constraints (2-norm) * DebugFS f, calculated at projection of initial point to the feasible set * DebugFF f, calculated at the final point * DebugDX |X_start-X_final| *************************************************************************/ _minbcreport_owner::_minbcreport_owner() { p_struct = (alglib_impl::minbcreport*)alglib_impl::ae_malloc(sizeof(alglib_impl::minbcreport), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_minbcreport_init(p_struct, NULL); } _minbcreport_owner::_minbcreport_owner(const _minbcreport_owner &rhs) { p_struct = (alglib_impl::minbcreport*)alglib_impl::ae_malloc(sizeof(alglib_impl::minbcreport), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_minbcreport_init_copy(p_struct, const_cast(rhs.p_struct), NULL); } _minbcreport_owner& _minbcreport_owner::operator=(const _minbcreport_owner &rhs) { if( this==&rhs ) return *this; alglib_impl::_minbcreport_clear(p_struct); alglib_impl::_minbcreport_init_copy(p_struct, const_cast(rhs.p_struct), NULL); return *this; } _minbcreport_owner::~_minbcreport_owner() { alglib_impl::_minbcreport_clear(p_struct); ae_free(p_struct); } alglib_impl::minbcreport* _minbcreport_owner::c_ptr() { return p_struct; } alglib_impl::minbcreport* _minbcreport_owner::c_ptr() const { return const_cast(p_struct); } minbcreport::minbcreport() : _minbcreport_owner() ,iterationscount(p_struct->iterationscount),nfev(p_struct->nfev),varidx(p_struct->varidx),terminationtype(p_struct->terminationtype) { } minbcreport::minbcreport(const minbcreport &rhs):_minbcreport_owner(rhs) ,iterationscount(p_struct->iterationscount),nfev(p_struct->nfev),varidx(p_struct->varidx),terminationtype(p_struct->terminationtype) { } minbcreport& minbcreport::operator=(const minbcreport &rhs) { if( this==&rhs ) return *this; _minbcreport_owner::operator=(rhs); return *this; } minbcreport::~minbcreport() { } /************************************************************************* BOX CONSTRAINED OPTIMIZATION WITH FAST ACTIVATION OF MULTIPLE BOX CONSTRAINTS DESCRIPTION: The subroutine minimizes function F(x) of N arguments subject to box constraints (with some of box constraints actually being equality ones). This optimizer uses algorithm similar to that of MinBLEIC (optimizer with general linear constraints), but presence of box-only constraints allows us to use faster constraint activation strategies. On large-scale problems, with multiple constraints active at the solution, this optimizer can be several times faster than BLEIC. REQUIREMENTS: * user must provide function value and gradient * starting point X0 must be feasible or not too far away from the feasible set * grad(f) must be Lipschitz continuous on a level set: L = { x : f(x)<=f(x0) } * function must be defined everywhere on the feasible set F USAGE: Constrained optimization if far more complex than the unconstrained one. Here we give very brief outline of the BC optimizer. We strongly recommend you to read examples in the ALGLIB Reference Manual and to read ALGLIB User Guide on optimization, which is available at http://www.alglib.net/optimization/ 1. User initializes algorithm state with MinBCCreate() call 2. USer adds box constraints by calling MinBCSetBC() function. 3. User sets stopping conditions with MinBCSetCond(). 4. User calls MinBCOptimize() function which takes algorithm state and pointer (delegate, etc.) 
to callback function which calculates F/G. 5. User calls MinBCResults() to get solution 6. Optionally user may call MinBCRestartFrom() to solve another problem with same N but another starting point. MinBCRestartFrom() allows to reuse already initialized structure. INPUT PARAMETERS: N - problem dimension, N>0: * if given, only leading N elements of X are used * if not given, automatically determined from size ofX X - starting point, array[N]: * it is better to set X to a feasible point * but X can be infeasible, in which case algorithm will try to find feasible point first, using X as initial approximation. OUTPUT PARAMETERS: State - structure stores algorithm state -- ALGLIB -- Copyright 28.11.2010 by Bochkanov Sergey *************************************************************************/ void minbccreate(const ae_int_t n, const real_1d_array &x, minbcstate &state) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minbccreate(n, const_cast(x.c_ptr()), const_cast(state.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* BOX CONSTRAINED OPTIMIZATION WITH FAST ACTIVATION OF MULTIPLE BOX CONSTRAINTS DESCRIPTION: The subroutine minimizes function F(x) of N arguments subject to box constraints (with some of box constraints actually being equality ones). This optimizer uses algorithm similar to that of MinBLEIC (optimizer with general linear constraints), but presence of box-only constraints allows us to use faster constraint activation strategies. On large-scale problems, with multiple constraints active at the solution, this optimizer can be several times faster than BLEIC. REQUIREMENTS: * user must provide function value and gradient * starting point X0 must be feasible or not too far away from the feasible set * grad(f) must be Lipschitz continuous on a level set: L = { x : f(x)<=f(x0) } * function must be defined everywhere on the feasible set F USAGE: Constrained optimization if far more complex than the unconstrained one. Here we give very brief outline of the BC optimizer. We strongly recommend you to read examples in the ALGLIB Reference Manual and to read ALGLIB User Guide on optimization, which is available at http://www.alglib.net/optimization/ 1. User initializes algorithm state with MinBCCreate() call 2. USer adds box constraints by calling MinBCSetBC() function. 3. User sets stopping conditions with MinBCSetCond(). 4. User calls MinBCOptimize() function which takes algorithm state and pointer (delegate, etc.) to callback function which calculates F/G. 5. User calls MinBCResults() to get solution 6. Optionally user may call MinBCRestartFrom() to solve another problem with same N but another starting point. MinBCRestartFrom() allows to reuse already initialized structure. INPUT PARAMETERS: N - problem dimension, N>0: * if given, only leading N elements of X are used * if not given, automatically determined from size ofX X - starting point, array[N]: * it is better to set X to a feasible point * but X can be infeasible, in which case algorithm will try to find feasible point first, using X as initial approximation. 
OUTPUT PARAMETERS: State - structure stores algorithm state -- ALGLIB -- Copyright 28.11.2010 by Bochkanov Sergey *************************************************************************/ void minbccreate(const real_1d_array &x, minbcstate &state) { alglib_impl::ae_state _alglib_env_state; ae_int_t n; n = x.length(); alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minbccreate(n, const_cast(x.c_ptr()), const_cast(state.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* The subroutine is finite difference variant of MinBCCreate(). It uses finite differences in order to differentiate target function. Description below contains information which is specific to this function only. We recommend to read comments on MinBCCreate() in order to get more information about creation of BC optimizer. INPUT PARAMETERS: N - problem dimension, N>0: * if given, only leading N elements of X are used * if not given, automatically determined from size of X X - starting point, array[0..N-1]. DiffStep- differentiation step, >0 OUTPUT PARAMETERS: State - structure which stores algorithm state NOTES: 1. algorithm uses 4-point central formula for differentiation. 2. differentiation step along I-th axis is equal to DiffStep*S[I] where S[] is scaling vector which can be set by MinBCSetScale() call. 3. we recommend you to use moderate values of differentiation step. Too large step will result in too large truncation errors, while too small step will result in too large numerical errors. 1.0E-6 can be good value to start with. 4. Numerical differentiation is very inefficient - one gradient calculation needs 4*N function evaluations. This function will work for any N - either small (1...10), moderate (10...100) or large (100...). However, performance penalty will be too severe for any N's except for small ones. We should also say that code which relies on numerical differentiation is less robust and precise. CG needs exact gradient values. Imprecise gradient may slow down convergence, especially on highly nonlinear problems. Thus we recommend to use this function for fast prototyping on small- dimensional problems only, and to implement analytical gradient as soon as possible. -- ALGLIB -- Copyright 16.05.2011 by Bochkanov Sergey *************************************************************************/ void minbccreatef(const ae_int_t n, const real_1d_array &x, const double diffstep, minbcstate &state) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minbccreatef(n, const_cast(x.c_ptr()), diffstep, const_cast(state.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* The subroutine is finite difference variant of MinBCCreate(). It uses finite differences in order to differentiate target function. Description below contains information which is specific to this function only. We recommend to read comments on MinBCCreate() in order to get more information about creation of BC optimizer. 
INPUT PARAMETERS: N - problem dimension, N>0: * if given, only leading N elements of X are used * if not given, automatically determined from size of X X - starting point, array[0..N-1]. DiffStep- differentiation step, >0 OUTPUT PARAMETERS: State - structure which stores algorithm state NOTES: 1. algorithm uses 4-point central formula for differentiation. 2. differentiation step along I-th axis is equal to DiffStep*S[I] where S[] is scaling vector which can be set by MinBCSetScale() call. 3. we recommend you to use moderate values of differentiation step. Too large step will result in too large truncation errors, while too small step will result in too large numerical errors. 1.0E-6 can be good value to start with. 4. Numerical differentiation is very inefficient - one gradient calculation needs 4*N function evaluations. This function will work for any N - either small (1...10), moderate (10...100) or large (100...). However, performance penalty will be too severe for any N's except for small ones. We should also say that code which relies on numerical differentiation is less robust and precise. CG needs exact gradient values. Imprecise gradient may slow down convergence, especially on highly nonlinear problems. Thus we recommend to use this function for fast prototyping on small- dimensional problems only, and to implement analytical gradient as soon as possible. -- ALGLIB -- Copyright 16.05.2011 by Bochkanov Sergey *************************************************************************/ void minbccreatef(const real_1d_array &x, const double diffstep, minbcstate &state) { alglib_impl::ae_state _alglib_env_state; ae_int_t n; n = x.length(); alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minbccreatef(n, const_cast(x.c_ptr()), diffstep, const_cast(state.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function sets boundary constraints for BC optimizer. Boundary constraints are inactive by default (after initial creation). They are preserved after algorithm restart with MinBCRestartFrom(). INPUT PARAMETERS: State - structure stores algorithm state BndL - lower bounds, array[N]. If some (all) variables are unbounded, you may specify very small number or -INF. BndU - upper bounds, array[N]. If some (all) variables are unbounded, you may specify very large number or +INF. NOTE 1: it is possible to specify BndL[i]=BndU[i]. In this case I-th variable will be "frozen" at X[i]=BndL[i]=BndU[i]. 
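As a direct illustration of NOTE 1, the hedged fragment below (state is an assumed existing minbcstate for a 3-dimensional problem) leaves variable 0 unbounded from below, boxes variable 1 into [0,1], and freezes variable 2 at 0.5 by giving it equal lower and upper bounds; very large magnitudes stand in for infinities, as the parameter description permits:

    real_1d_array bndl = "[-1.0e50, 0.0, 0.5]";
    real_1d_array bndu = "[1.0e50, 1.0, 0.5]";
    minbcsetbc(state, bndl, bndu);    // variable 2 is now "frozen" at X[2]=0.5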
NOTE 2: this solver has following useful properties: * bound constraints are always satisfied exactly * function is evaluated only INSIDE area specified by bound constraints, even when numerical differentiation is used (algorithm adjusts nodes according to boundary constraints) -- ALGLIB -- Copyright 28.11.2010 by Bochkanov Sergey *************************************************************************/ void minbcsetbc(const minbcstate &state, const real_1d_array &bndl, const real_1d_array &bndu) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minbcsetbc(const_cast(state.c_ptr()), const_cast(bndl.c_ptr()), const_cast(bndu.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function sets stopping conditions for the optimizer. INPUT PARAMETERS: State - structure which stores algorithm state EpsG - >=0 The subroutine finishes its work if the condition |v|<EpsG is satisfied, where: * |.| means Euclidean norm * v - scaled gradient vector, v[i]=g[i]*s[i] * g - gradient * s - scaling coefficients set by MinBCSetScale() EpsF - >=0 The subroutine finishes its work if on k+1-th iteration the condition |F(k+1)-F(k)|<=EpsF*max{|F(k)|,|F(k+1)|,1} is satisfied. EpsX - >=0 The subroutine finishes its work if on k+1-th iteration the condition |v|<=EpsX is fulfilled, where: * |.| means Euclidean norm * v - scaled step vector, v[i]=dx[i]/s[i] * dx - step vector, dx=X(k+1)-X(k) * s - scaling coefficients set by MinBCSetScale() MaxIts - maximum number of iterations. If MaxIts=0, the number of iterations is unlimited. Passing EpsG=0, EpsF=0 and EpsX=0 and MaxIts=0 (simultaneously) will lead to automatic stopping criterion selection. NOTE: when SetCond() is called with non-zero MaxIts, BC solver may perform slightly more than MaxIts iterations. I.e., MaxIts sets non-strict limit on iterations count. -- ALGLIB -- Copyright 28.11.2010 by Bochkanov Sergey *************************************************************************/ void minbcsetcond(const minbcstate &state, const double epsg, const double epsf, const double epsx, const ae_int_t maxits) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minbcsetcond(const_cast(state.c_ptr()), epsg, epsf, epsx, maxits, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function sets scaling coefficients for BC optimizer. ALGLIB optimizers use scaling matrices to test stopping conditions (step size and gradient are scaled before comparison with tolerances). Scale of the I-th variable is a translation invariant measure of: a) "how large" the variable is b) how large the step should be to make significant changes in the function Scaling is also used by finite difference variant of the optimizer - step along I-th axis is equal to DiffStep*S[I]. In most optimizers (and in the BC too) scaling is NOT a form of preconditioning. It just affects stopping conditions. You should set preconditioner by separate call to one of the MinBCSetPrec...() functions. There is a special preconditioning mode, however, which uses scaling coefficients to form diagonal preconditioning matrix. You can turn this mode on, if you want. 
But you should understand that scaling is not the same thing as preconditioning - these are two different, although related forms of tuning solver. INPUT PARAMETERS: State - structure stores algorithm state S - array[N], non-zero scaling coefficients S[i] may be negative, sign doesn't matter. -- ALGLIB -- Copyright 14.01.2011 by Bochkanov Sergey *************************************************************************/ void minbcsetscale(const minbcstate &state, const real_1d_array &s) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minbcsetscale(const_cast(state.c_ptr()), const_cast(s.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Modification of the preconditioner: preconditioning is turned off. INPUT PARAMETERS: State - structure which stores algorithm state -- ALGLIB -- Copyright 13.10.2010 by Bochkanov Sergey *************************************************************************/ void minbcsetprecdefault(const minbcstate &state) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minbcsetprecdefault(const_cast(state.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Modification of the preconditioner: diagonal of approximate Hessian is used. INPUT PARAMETERS: State - structure which stores algorithm state D - diagonal of the approximate Hessian, array[0..N-1], (if larger, only leading N elements are used). NOTE 1: D[i] should be positive. Exception will be thrown otherwise. NOTE 2: you should pass diagonal of approximate Hessian - NOT ITS INVERSE. -- ALGLIB -- Copyright 13.10.2010 by Bochkanov Sergey *************************************************************************/ void minbcsetprecdiag(const minbcstate &state, const real_1d_array &d) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minbcsetprecdiag(const_cast(state.c_ptr()), const_cast(d.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Modification of the preconditioner: scale-based diagonal preconditioning. This preconditioning mode can be useful when you don't have approximate diagonal of Hessian, but you know that your variables are badly scaled (for example, one variable is in [1,10], and another in [1000,100000]), and most part of the ill-conditioning comes from different scales of vars. In this case simple scale-based preconditioner, with H[i] = 1/(s[i]^2), can greatly improve convergence. IMPRTANT: you should set scale of your variables with MinBCSetScale() call (before or after MinBCSetPrecScale() call). Without knowledge of the scale of your variables scale-based preconditioner will be just unit matrix. 
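To make the distinction between scaling and preconditioning concrete, here is a hedged fragment (state is an assumed existing minbcstate for a 2-dimensional problem with badly scaled variables) which first declares the variable scales and then switches on the scale-based preconditioner described above:

    real_1d_array s = "[5.0, 50000.0]";   // assumed typical magnitudes of the two variables
    minbcsetscale(state, s);              // used for stopping tests and finite-difference steps
    minbcsetprecscale(state);             // diagonal preconditioner H[i]=1/(s[i]^2) is derived from these scales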
INPUT PARAMETERS: State - structure which stores algorithm state -- ALGLIB -- Copyright 13.10.2010 by Bochkanov Sergey *************************************************************************/ void minbcsetprecscale(const minbcstate &state) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minbcsetprecscale(const_cast(state.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function turns on/off reporting. INPUT PARAMETERS: State - structure which stores algorithm state NeedXRep- whether iteration reports are needed or not If NeedXRep is True, algorithm will call rep() callback function if it is provided to MinBCOptimize(). -- ALGLIB -- Copyright 28.11.2010 by Bochkanov Sergey *************************************************************************/ void minbcsetxrep(const minbcstate &state, const bool needxrep) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minbcsetxrep(const_cast(state.c_ptr()), needxrep, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function sets maximum step length INPUT PARAMETERS: State - structure which stores algorithm state StpMax - maximum step length, >=0. Set StpMax to 0.0, if you don't want to limit step length. Use this subroutine when you optimize target function which contains exp() or other fast growing functions, and optimization algorithm makes too large steps which lead to overflow. This function allows us to reject steps that are too large (and therefore expose us to the possible overflow) without actually calculating function value at the x+stp*d. -- ALGLIB -- Copyright 02.04.2010 by Bochkanov Sergey *************************************************************************/ void minbcsetstpmax(const minbcstate &state, const double stpmax) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minbcsetstpmax(const_cast(state.c_ptr()), stpmax, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function provides reverse communication interface Reverse communication interface is not documented or recommended to use. 
See below for functions which provide better documented API *************************************************************************/ bool minbciteration(const minbcstate &state) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { ae_bool result = alglib_impl::minbciteration(const_cast(state.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void minbcoptimize(minbcstate &state, void (*func)(const real_1d_array &x, double &func, void *ptr), void (*rep)(const real_1d_array &x, double func, void *ptr), void *ptr) { alglib_impl::ae_state _alglib_env_state; if( func==NULL ) throw ap_error("ALGLIB: error in 'minbcoptimize()' (func is NULL)"); alglib_impl::ae_state_init(&_alglib_env_state); try { while( alglib_impl::minbciteration(state.c_ptr(), &_alglib_env_state) ) { if( state.needf ) { func(state.x, state.f, ptr); continue; } if( state.xupdated ) { if( rep!=NULL ) rep(state.x, state.f, ptr); continue; } throw ap_error("ALGLIB: error in 'minbcoptimize' (some derivatives were not provided?)"); } alglib_impl::ae_state_clear(&_alglib_env_state); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void minbcoptimize(minbcstate &state, void (*grad)(const real_1d_array &x, double &func, real_1d_array &grad, void *ptr), void (*rep)(const real_1d_array &x, double func, void *ptr), void *ptr) { alglib_impl::ae_state _alglib_env_state; if( grad==NULL ) throw ap_error("ALGLIB: error in 'minbcoptimize()' (grad is NULL)"); alglib_impl::ae_state_init(&_alglib_env_state); try { while( alglib_impl::minbciteration(state.c_ptr(), &_alglib_env_state) ) { if( state.needfg ) { grad(state.x, state.f, state.g, ptr); continue; } if( state.xupdated ) { if( rep!=NULL ) rep(state.x, state.f, ptr); continue; } throw ap_error("ALGLIB: error in 'minbcoptimize' (some derivatives were not provided?)"); } alglib_impl::ae_state_clear(&_alglib_env_state); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* BC results INPUT PARAMETERS: State - algorithm state OUTPUT PARAMETERS: X - array[0..N-1], solution Rep - optimization report. You should check Rep.TerminationType in order to distinguish successful termination from unsuccessful one: * -8 internal integrity control detected infinite or NAN values in function/gradient. Abnormal termination signalled. * -7 gradient verification failed. See MinBCSetGradientCheck() for more information. * -3 inconsistent constraints. * 1 relative function improvement is no more than EpsF. * 2 scaled step is no more than EpsX. * 4 scaled gradient norm is no more than EpsG. * 5 MaxIts steps was taken * 8 terminated by user who called minbcrequesttermination(). X contains point which was "current accepted" when termination request was submitted. More information about fields of this structure can be found in the comments on MinBCReport datatype. 
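For orientation, the following self-contained hedged sketch strings the MinBC calls above into one program; the quadratic target is an illustrative choice and all parameter values are placeholders rather than recommendations:

#include "optimization.h"
using namespace alglib;

// Illustrative target f(x) = (x0-3)^2 + (x1+1)^2, minimized subject to x0>=0, x1>=0
void bcgrad(const real_1d_array &x, double &func, real_1d_array &grad, void *ptr)
{
    func = (x[0]-3.0)*(x[0]-3.0) + (x[1]+1.0)*(x[1]+1.0);
    grad[0] = 2.0*(x[0]-3.0);
    grad[1] = 2.0*(x[1]+1.0);
}

int main()
{
    real_1d_array x    = "[0.1, 0.1]";              // starting point
    real_1d_array bndl = "[0.0, 0.0]";              // non-negativity constraints
    real_1d_array bndu = "[1.0e50, 1.0e50]";        // effectively unbounded from above
    minbcstate state;
    minbcreport rep;
    minbccreate(x, state);                          // N inferred from x.length()
    minbcsetbc(state, bndl, bndu);
    minbcsetcond(state, 0.0, 0.0, 1.0e-6, 0);       // stop when the scaled step drops below 1E-6
    minbcoptimize(state, bcgrad, NULL, NULL);       // analytic-gradient driving loop
    minbcresults(state, x, rep);                    // expect x near [3,0]; check rep.terminationtype>0
    return 0;
}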
-- ALGLIB -- Copyright 28.11.2010 by Bochkanov Sergey *************************************************************************/ void minbcresults(const minbcstate &state, real_1d_array &x, minbcreport &rep) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minbcresults(const_cast(state.c_ptr()), const_cast(x.c_ptr()), const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* BC results Buffered implementation of MinBCResults() which uses pre-allocated buffer to store X[]. If buffer size is too small, it resizes buffer. It is intended to be used in the inner cycles of performance critical algorithms where array reallocation penalty is too large to be ignored. -- ALGLIB -- Copyright 28.11.2010 by Bochkanov Sergey *************************************************************************/ void minbcresultsbuf(const minbcstate &state, real_1d_array &x, minbcreport &rep) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minbcresultsbuf(const_cast(state.c_ptr()), const_cast(x.c_ptr()), const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This subroutine restarts algorithm from new point. All optimization parameters (including constraints) are left unchanged. This function allows to solve multiple optimization problems (which must have same number of dimensions) without object reallocation penalty. INPUT PARAMETERS: State - structure previously allocated with MinBCCreate call. X - new starting point. -- ALGLIB -- Copyright 28.11.2010 by Bochkanov Sergey *************************************************************************/ void minbcrestartfrom(const minbcstate &state, const real_1d_array &x) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minbcrestartfrom(const_cast(state.c_ptr()), const_cast(x.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This subroutine submits request for termination of running optimizer. It should be called from user-supplied callback when user decides that it is time to "smoothly" terminate optimization process. As result, optimizer stops at point which was "current accepted" when termination request was submitted and returns error code 8 (successful termination). INPUT PARAMETERS: State - optimizer structure NOTE: after request for termination optimizer may perform several additional calls to user-supplied callbacks. It does NOT guarantee to stop immediately - it just guarantees that these additional calls will be discarded later. NOTE: calling this function on optimizer which is NOT running will have no effect. NOTE: multiple calls to this function are possible. First call is counted, subsequent calls are silently ignored. 
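The termination semantics above can be used from a progress callback; the hedged sketch below assumes reporting was enabled with minbcsetxrep() and that the address of the minbcstate object is passed through the opaque ptr argument of minbcoptimize(), which is purely a choice of this example:

// Illustrative rep() callback which asks the optimizer to stop once the target is small enough
void bcprogress(const real_1d_array &x, double func, void *ptr)
{
    minbcstate *state = (minbcstate*)ptr;   // ptr assumed to point at the optimizer state
    if( func<1.0e-3 )
        minbcrequesttermination(*state);    // solver later stops with termination code 8
}
// ... elsewhere:  minbcsetxrep(state, true);  minbcoptimize(state, bcgrad, bcprogress, &state);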
-- ALGLIB -- Copyright 08.10.2014 by Bochkanov Sergey *************************************************************************/ void minbcrequesttermination(const minbcstate &state) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minbcrequesttermination(const_cast(state.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This subroutine turns on verification of the user-supplied analytic gradient: * user calls this subroutine before optimization begins * MinBCOptimize() is called * prior to actual optimization, for each component of parameters being optimized X[i] algorithm performs following steps: * two trial steps are made to X[i]-TestStep*S[i] and X[i]+TestStep*S[i], where X[i] is i-th component of the initial point and S[i] is a scale of i-th parameter * if needed, steps are bounded with respect to constraints on X[] * F(X) is evaluated at these trial points * we perform one more evaluation in the middle point of the interval * we build cubic model using function values and derivatives at trial points and we compare its prediction with actual value in the middle point * in case difference between prediction and actual value is higher than some predetermined threshold, algorithm stops with completion code -7; Rep.VarIdx is set to index of the parameter with incorrect derivative. * after verification is over, algorithm proceeds to the actual optimization. NOTE 1: verification needs N (parameters count) gradient evaluations. It is very costly and you should use it only for low dimensional problems, when you want to be sure that you've correctly calculated analytic derivatives. You should not use it in the production code (unless you want to check derivatives provided by some third party). NOTE 2: you should carefully choose TestStep. Value which is too large (so large that function behaviour is significantly non-cubic) will lead to false alarms. You may use different step for different parameters by means of setting scale with MinBCSetScale(). NOTE 3: this function may lead to false positives. In case it reports that I-th derivative was calculated incorrectly, you may decrease test step and try one more time - maybe your function changes too sharply and your step is too large for such rapidly chanding function. INPUT PARAMETERS: State - structure used to store algorithm state TestStep - verification step: * TestStep=0 turns verification off * TestStep>0 activates verification -- ALGLIB -- Copyright 15.06.2012 by Bochkanov Sergey *************************************************************************/ void minbcsetgradientcheck(const minbcstate &state, const double teststep) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minbcsetgradientcheck(const_cast(state.c_ptr()), teststep, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This object stores nonlinear optimizer state. 
You should use functions provided by MinNS subpackage to work with this object *************************************************************************/ _minnsstate_owner::_minnsstate_owner() { p_struct = (alglib_impl::minnsstate*)alglib_impl::ae_malloc(sizeof(alglib_impl::minnsstate), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_minnsstate_init(p_struct, NULL); } _minnsstate_owner::_minnsstate_owner(const _minnsstate_owner &rhs) { p_struct = (alglib_impl::minnsstate*)alglib_impl::ae_malloc(sizeof(alglib_impl::minnsstate), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_minnsstate_init_copy(p_struct, const_cast(rhs.p_struct), NULL); } _minnsstate_owner& _minnsstate_owner::operator=(const _minnsstate_owner &rhs) { if( this==&rhs ) return *this; alglib_impl::_minnsstate_clear(p_struct); alglib_impl::_minnsstate_init_copy(p_struct, const_cast(rhs.p_struct), NULL); return *this; } _minnsstate_owner::~_minnsstate_owner() { alglib_impl::_minnsstate_clear(p_struct); ae_free(p_struct); } alglib_impl::minnsstate* _minnsstate_owner::c_ptr() { return p_struct; } alglib_impl::minnsstate* _minnsstate_owner::c_ptr() const { return const_cast(p_struct); } minnsstate::minnsstate() : _minnsstate_owner() ,needfi(p_struct->needfi),needfij(p_struct->needfij),xupdated(p_struct->xupdated),f(p_struct->f),fi(&p_struct->fi),j(&p_struct->j),x(&p_struct->x) { } minnsstate::minnsstate(const minnsstate &rhs):_minnsstate_owner(rhs) ,needfi(p_struct->needfi),needfij(p_struct->needfij),xupdated(p_struct->xupdated),f(p_struct->f),fi(&p_struct->fi),j(&p_struct->j),x(&p_struct->x) { } minnsstate& minnsstate::operator=(const minnsstate &rhs) { if( this==&rhs ) return *this; _minnsstate_owner::operator=(rhs); return *this; } minnsstate::~minnsstate() { } /************************************************************************* This structure stores optimization report: * IterationsCount total number of inner iterations * NFEV number of gradient evaluations * TerminationType termination type (see below) * CErr maximum violation of all types of constraints * LCErr maximum violation of linear constraints * NLCErr maximum violation of nonlinear constraints TERMINATION CODES TerminationType field contains completion code, which can be: -8 internal integrity control detected infinite or NAN values in function/gradient. Abnormal termination signalled. -3 box constraints are inconsistent -1 inconsistent parameters were passed: * penalty parameter for minnssetalgoags() is zero, but we have nonlinear constraints set by minnssetnlc() 2 sampling radius decreased below epsx 5 MaxIts steps was taken 7 stopping conditions are too stringent, further improvement is impossible, X contains best point found so far. 8 User requested termination via MinNSRequestTermination() Other fields of this structure are not documented and should not be used! 
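As an illustration only (example written for this manual entry, not shipped with the library), a typical post-optimization check of the documented fields might look as follows, assuming a configured solver in the variable state and <cstdio> in scope:

    real_1d_array xsol;
    minnsreport   rep;
    minnsresults(state, xsol, rep);
    if( rep.terminationtype>0 )
        printf("done: %d iterations, max constraint violation %.2e\n",
               (int)rep.iterationscount, rep.cerr);
    else
        printf("solver reported failure, code %d\n", (int)rep.terminationtype);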
*************************************************************************/ _minnsreport_owner::_minnsreport_owner() { p_struct = (alglib_impl::minnsreport*)alglib_impl::ae_malloc(sizeof(alglib_impl::minnsreport), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_minnsreport_init(p_struct, NULL); } _minnsreport_owner::_minnsreport_owner(const _minnsreport_owner &rhs) { p_struct = (alglib_impl::minnsreport*)alglib_impl::ae_malloc(sizeof(alglib_impl::minnsreport), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_minnsreport_init_copy(p_struct, const_cast(rhs.p_struct), NULL); } _minnsreport_owner& _minnsreport_owner::operator=(const _minnsreport_owner &rhs) { if( this==&rhs ) return *this; alglib_impl::_minnsreport_clear(p_struct); alglib_impl::_minnsreport_init_copy(p_struct, const_cast(rhs.p_struct), NULL); return *this; } _minnsreport_owner::~_minnsreport_owner() { alglib_impl::_minnsreport_clear(p_struct); ae_free(p_struct); } alglib_impl::minnsreport* _minnsreport_owner::c_ptr() { return p_struct; } alglib_impl::minnsreport* _minnsreport_owner::c_ptr() const { return const_cast(p_struct); } minnsreport::minnsreport() : _minnsreport_owner() ,iterationscount(p_struct->iterationscount),nfev(p_struct->nfev),cerr(p_struct->cerr),lcerr(p_struct->lcerr),nlcerr(p_struct->nlcerr),terminationtype(p_struct->terminationtype),varidx(p_struct->varidx),funcidx(p_struct->funcidx) { } minnsreport::minnsreport(const minnsreport &rhs):_minnsreport_owner(rhs) ,iterationscount(p_struct->iterationscount),nfev(p_struct->nfev),cerr(p_struct->cerr),lcerr(p_struct->lcerr),nlcerr(p_struct->nlcerr),terminationtype(p_struct->terminationtype),varidx(p_struct->varidx),funcidx(p_struct->funcidx) { } minnsreport& minnsreport::operator=(const minnsreport &rhs) { if( this==&rhs ) return *this; _minnsreport_owner::operator=(rhs); return *this; } minnsreport::~minnsreport() { } /************************************************************************* NONSMOOTH NONCONVEX OPTIMIZATION SUBJECT TO BOX/LINEAR/NONLINEAR-NONSMOOTH CONSTRAINTS DESCRIPTION: The subroutine minimizes function F(x) of N arguments subject to any combination of: * bound constraints * linear inequality constraints * linear equality constraints * nonlinear equality constraints Gi(x)=0 * nonlinear inequality constraints Hi(x)<=0 IMPORTANT: see MinNSSetAlgoAGS for important information on performance restrictions of AGS solver. REQUIREMENTS: * starting point X0 must be feasible or not too far away from the feasible set * F(), G(), H() are continuous, locally Lipschitz and continuously (but not necessarily twice) differentiable in an open dense subset of R^N. Functions F(), G() and H() may be nonsmooth and non-convex. Informally speaking, it means that functions are composed of large differentiable "patches" with nonsmoothness having place only at the boundaries between these "patches". Most real-life nonsmooth functions satisfy these requirements. Say, anything which involves finite number of abs(), min() and max() is very likely to pass the test. Say, it is possible to optimize anything of the following: * f=abs(x0)+2*abs(x1) * f=max(x0,x1) * f=sin(max(x0,x1)+abs(x2)) * for nonlinearly constrained problems: F() must be bounded from below without nonlinear constraints (this requirement is due to the fact that, contrary to box and linear constraints, nonlinear ones require special handling). 
* user must provide function value and gradient for F(), H(), G() at all points where function/gradient can be calculated. If optimizer requires value exactly at the boundary between "patches" (say, at x=0 for f=abs(x)), where gradient is not defined, user may resolve tie arbitrarily (in our case - return +1 or -1 at its discretion). * NS solver supports numerical differentiation, i.e. it may differentiate your function for you, but it results in 2N increase of function evaluations. Not recommended unless you solve really small problems. See minnscreatef() for more information on this functionality. USAGE: 1. User initializes algorithm state with MinNSCreate() call and chooses what NLC solver to use. There is some solver which is used by default, with default settings, but you should NOT rely on default choice. It may change in future releases of ALGLIB without notice, and no one can guarantee that new solver will be able to solve your problem with default settings. From the other side, if you choose solver explicitly, you can be pretty sure that it will work with new ALGLIB releases. In the current release following solvers can be used: * AGS solver (activated with MinNSSetAlgoAGS() function) 2. User adds boundary and/or linear and/or nonlinear constraints by means of calling one of the following functions: a) MinNSSetBC() for boundary constraints b) MinNSSetLC() for linear constraints c) MinNSSetNLC() for nonlinear constraints You may combine (a), (b) and (c) in one optimization problem. 3. User sets scale of the variables with MinNSSetScale() function. It is VERY important to set scale of the variables, because nonlinearly constrained problems are hard to solve when variables are badly scaled. 4. User sets stopping conditions with MinNSSetCond(). 5. Finally, user calls MinNSOptimize() function which takes algorithm state and pointer (delegate, etc) to callback function which calculates F/G/H. 7. User calls MinNSResults() to get solution 8. Optionally user may call MinNSRestartFrom() to solve another problem with same N but another starting point. MinNSRestartFrom() allows to reuse already initialized structure. INPUT PARAMETERS: N - problem dimension, N>0: * if given, only leading N elements of X are used * if not given, automatically determined from size of X X - starting point, array[N]: * it is better to set X to a feasible point * but X can be infeasible, in which case algorithm will try to find feasible point first, using X as initial approximation. OUTPUT PARAMETERS: State - structure stores algorithm state NOTE: minnscreatef() function may be used if you do not have analytic gradient. This function creates solver which uses numerical differentiation with user-specified step. 
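EXAMPLE: the sketch below (example code written for this manual entry, not shipped with the library) walks through the USAGE steps above for the unconstrained nonsmooth problem f = |x0| + 2*|x1|, using the analytic-Jacobian form of minnsoptimize(). All identifiers (nsfunc_jac, x0, xsol, ...) are illustrative; the ALGLIB optimization header and namespace, plus <cmath> and <cstdio>, are assumed to be in scope.

    // Callback: fi[0] is the target function (no nonlinear constraints were declared),
    // jac is its 1xN Jacobian. At the kink x[i]=0 the tie is resolved arbitrarily.
    void nsfunc_jac(const real_1d_array &x, real_1d_array &fi, real_2d_array &jac, void *ptr)
    {
        fi[0]     = fabs(x[0])+2*fabs(x[1]);
        jac[0][0] = x[0]>=0 ? +1.0 : -1.0;
        jac[0][1] = x[1]>=0 ? +2.0 : -2.0;
    }

    int main()
    {
        real_1d_array x0 = "[1.0,1.0]";      // starting point
        real_1d_array s  = "[1.0,1.0]";      // variable scales
        real_1d_array xsol;
        minnsstate    state;
        minnsreport   rep;
        minnscreate(x0, state);              // step 1: create solver
        minnssetalgoags(state, 0.1, 0.0);    // step 1: choose AGS explicitly (penalty=0: no nonlinear constraints)
        minnssetscale(state, s);             // step 3: variable scales
        minnssetcond(state, 1.0e-6, 0);      // step 4: stopping conditions
        minnsoptimize(state, nsfunc_jac, NULL, NULL);   // step 5: run optimizer
        minnsresults(state, xsol, rep);      // step 7: extract solution, expected near [0,0]
        printf("%s (code %d)\n", xsol.tostring(4).c_str(), (int)rep.terminationtype);
        return 0;
    }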
-- ALGLIB -- Copyright 18.05.2015 by Bochkanov Sergey *************************************************************************/ void minnscreate(const ae_int_t n, const real_1d_array &x, minnsstate &state) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minnscreate(n, const_cast(x.c_ptr()), const_cast(state.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* NONSMOOTH NONCONVEX OPTIMIZATION SUBJECT TO BOX/LINEAR/NONLINEAR-NONSMOOTH CONSTRAINTS DESCRIPTION: The subroutine minimizes function F(x) of N arguments subject to any combination of: * bound constraints * linear inequality constraints * linear equality constraints * nonlinear equality constraints Gi(x)=0 * nonlinear inequality constraints Hi(x)<=0 IMPORTANT: see MinNSSetAlgoAGS for important information on performance restrictions of AGS solver. REQUIREMENTS: * starting point X0 must be feasible or not too far away from the feasible set * F(), G(), H() are continuous, locally Lipschitz and continuously (but not necessarily twice) differentiable in an open dense subset of R^N. Functions F(), G() and H() may be nonsmooth and non-convex. Informally speaking, it means that functions are composed of large differentiable "patches" with nonsmoothness having place only at the boundaries between these "patches". Most real-life nonsmooth functions satisfy these requirements. Say, anything which involves finite number of abs(), min() and max() is very likely to pass the test. Say, it is possible to optimize anything of the following: * f=abs(x0)+2*abs(x1) * f=max(x0,x1) * f=sin(max(x0,x1)+abs(x2)) * for nonlinearly constrained problems: F() must be bounded from below without nonlinear constraints (this requirement is due to the fact that, contrary to box and linear constraints, nonlinear ones require special handling). * user must provide function value and gradient for F(), H(), G() at all points where function/gradient can be calculated. If optimizer requires value exactly at the boundary between "patches" (say, at x=0 for f=abs(x)), where gradient is not defined, user may resolve tie arbitrarily (in our case - return +1 or -1 at its discretion). * NS solver supports numerical differentiation, i.e. it may differentiate your function for you, but it results in 2N increase of function evaluations. Not recommended unless you solve really small problems. See minnscreatef() for more information on this functionality. USAGE: 1. User initializes algorithm state with MinNSCreate() call and chooses what NLC solver to use. There is some solver which is used by default, with default settings, but you should NOT rely on default choice. It may change in future releases of ALGLIB without notice, and no one can guarantee that new solver will be able to solve your problem with default settings. From the other side, if you choose solver explicitly, you can be pretty sure that it will work with new ALGLIB releases. In the current release following solvers can be used: * AGS solver (activated with MinNSSetAlgoAGS() function) 2. 
User adds boundary and/or linear and/or nonlinear constraints by means of calling one of the following functions: a) MinNSSetBC() for boundary constraints b) MinNSSetLC() for linear constraints c) MinNSSetNLC() for nonlinear constraints You may combine (a), (b) and (c) in one optimization problem. 3. User sets scale of the variables with MinNSSetScale() function. It is VERY important to set scale of the variables, because nonlinearly constrained problems are hard to solve when variables are badly scaled. 4. User sets stopping conditions with MinNSSetCond(). 5. Finally, user calls MinNSOptimize() function which takes algorithm state and pointer (delegate, etc) to callback function which calculates F/G/H. 7. User calls MinNSResults() to get solution 8. Optionally user may call MinNSRestartFrom() to solve another problem with same N but another starting point. MinNSRestartFrom() allows to reuse already initialized structure. INPUT PARAMETERS: N - problem dimension, N>0: * if given, only leading N elements of X are used * if not given, automatically determined from size of X X - starting point, array[N]: * it is better to set X to a feasible point * but X can be infeasible, in which case algorithm will try to find feasible point first, using X as initial approximation. OUTPUT PARAMETERS: State - structure stores algorithm state NOTE: minnscreatef() function may be used if you do not have analytic gradient. This function creates solver which uses numerical differentiation with user-specified step. -- ALGLIB -- Copyright 18.05.2015 by Bochkanov Sergey *************************************************************************/ void minnscreate(const real_1d_array &x, minnsstate &state) { alglib_impl::ae_state _alglib_env_state; ae_int_t n; n = x.length(); alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minnscreate(n, const_cast(x.c_ptr()), const_cast(state.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Version of minnscreatef() which uses numerical differentiation. I.e., you do not have to calculate derivatives yourself. However, this version needs 2N times more function evaluations. 2-point differentiation formula is used, because more precise 4-point formula is unstable when used on non-smooth functions. INPUT PARAMETERS: N - problem dimension, N>0: * if given, only leading N elements of X are used * if not given, automatically determined from size of X X - starting point, array[N]: * it is better to set X to a feasible point * but X can be infeasible, in which case algorithm will try to find feasible point first, using X as initial approximation. DiffStep- differentiation step, DiffStep>0. Algorithm performs numerical differentiation with step for I-th variable being equal to DiffStep*S[I] (here S[] is a scale vector, set by minnssetscale() function). Do not use too small steps, because it may lead to catastrophic cancellation during intermediate calculations. 
OUTPUT PARAMETERS: State - structure stores algorithm state -- ALGLIB -- Copyright 18.05.2015 by Bochkanov Sergey *************************************************************************/ void minnscreatef(const ae_int_t n, const real_1d_array &x, const double diffstep, minnsstate &state) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minnscreatef(n, const_cast(x.c_ptr()), diffstep, const_cast(state.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Version of minnscreatef() which uses numerical differentiation. I.e., you do not have to calculate derivatives yourself. However, this version needs 2N times more function evaluations. 2-point differentiation formula is used, because more precise 4-point formula is unstable when used on non-smooth functions. INPUT PARAMETERS: N - problem dimension, N>0: * if given, only leading N elements of X are used * if not given, automatically determined from size of X X - starting point, array[N]: * it is better to set X to a feasible point * but X can be infeasible, in which case algorithm will try to find feasible point first, using X as initial approximation. DiffStep- differentiation step, DiffStep>0. Algorithm performs numerical differentiation with step for I-th variable being equal to DiffStep*S[I] (here S[] is a scale vector, set by minnssetscale() function). Do not use too small steps, because it may lead to catastrophic cancellation during intermediate calculations. OUTPUT PARAMETERS: State - structure stores algorithm state -- ALGLIB -- Copyright 18.05.2015 by Bochkanov Sergey *************************************************************************/ void minnscreatef(const real_1d_array &x, const double diffstep, minnsstate &state) { alglib_impl::ae_state _alglib_env_state; ae_int_t n; n = x.length(); alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minnscreatef(n, const_cast(x.c_ptr()), diffstep, const_cast(state.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function sets boundary constraints. Boundary constraints are inactive by default (after initial creation). They are preserved after algorithm restart with minnsrestartfrom(). INPUT PARAMETERS: State - structure stores algorithm state BndL - lower bounds, array[N]. If some (all) variables are unbounded, you may specify very small number or -INF. BndU - upper bounds, array[N]. If some (all) variables are unbounded, you may specify very large number or +INF. NOTE 1: it is possible to specify BndL[i]=BndU[i]. In this case I-th variable will be "frozen" at X[i]=BndL[i]=BndU[i]. 
NOTE 2: AGS solver has following useful properties: * bound constraints are always satisfied exactly * function is evaluated only INSIDE area specified by bound constraints, even when numerical differentiation is used (algorithm adjusts nodes according to boundary constraints) -- ALGLIB -- Copyright 18.05.2015 by Bochkanov Sergey *************************************************************************/ void minnssetbc(const minnsstate &state, const real_1d_array &bndl, const real_1d_array &bndu) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minnssetbc(const_cast(state.c_ptr()), const_cast(bndl.c_ptr()), const_cast(bndu.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function sets linear constraints. Linear constraints are inactive by default (after initial creation). They are preserved after algorithm restart with minnsrestartfrom(). INPUT PARAMETERS: State - structure previously allocated with minnscreate() call. C - linear constraints, array[K,N+1]. Each row of C represents one constraint, either equality or inequality (see below): * first N elements correspond to coefficients, * last element corresponds to the right part. All elements of C (including right part) must be finite. CT - type of constraints, array[K]: * if CT[i]>0, then I-th constraint is C[i,*]*x >= C[i,n+1] * if CT[i]=0, then I-th constraint is C[i,*]*x = C[i,n+1] * if CT[i]<0, then I-th constraint is C[i,*]*x <= C[i,n+1] K - number of equality/inequality constraints, K>=0: * if given, only leading K elements of C/CT are used * if not given, automatically determined from sizes of C/CT NOTE: linear (non-bound) constraints are satisfied only approximately: * there always exists some minor violation (about current sampling radius in magnitude during optimization, about EpsX in the solution) due to use of penalty method to handle constraints. * numerical differentiation, if used, may lead to function evaluations outside of the feasible area, because algorithm does NOT change numerical differentiation formula according to linear constraints. If you want constraints to be satisfied exactly, try to reformulate your problem in such manner that all constraints will become boundary ones (this kind of constraints is always satisfied exactly, both in the final solution and in all intermediate points). -- ALGLIB -- Copyright 18.05.2015 by Bochkanov Sergey *************************************************************************/ void minnssetlc(const minnsstate &state, const real_2d_array &c, const integer_1d_array &ct, const ae_int_t k) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minnssetlc(const_cast(state.c_ptr()), const_cast(c.c_ptr()), const_cast(ct.c_ptr()), k, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function sets linear constraints. Linear constraints are inactive by default (after initial creation). They are preserved after algorithm restart with minnsrestartfrom(). INPUT PARAMETERS: State - structure previously allocated with minnscreate() call. C - linear constraints, array[K,N+1]. 
Each row of C represents one constraint, either equality or inequality (see below): * first N elements correspond to coefficients, * last element corresponds to the right part. All elements of C (including right part) must be finite. CT - type of constraints, array[K]: * if CT[i]>0, then I-th constraint is C[i,*]*x >= C[i,n+1] * if CT[i]=0, then I-th constraint is C[i,*]*x = C[i,n+1] * if CT[i]<0, then I-th constraint is C[i,*]*x <= C[i,n+1] K - number of equality/inequality constraints, K>=0: * if given, only leading K elements of C/CT are used * if not given, automatically determined from sizes of C/CT NOTE: linear (non-bound) constraints are satisfied only approximately: * there always exists some minor violation (about current sampling radius in magnitude during optimization, about EpsX in the solution) due to use of penalty method to handle constraints. * numerical differentiation, if used, may lead to function evaluations outside of the feasible area, because algorithm does NOT change numerical differentiation formula according to linear constraints. If you want constraints to be satisfied exactly, try to reformulate your problem in such manner that all constraints will become boundary ones (this kind of constraints is always satisfied exactly, both in the final solution and in all intermediate points). -- ALGLIB -- Copyright 18.05.2015 by Bochkanov Sergey *************************************************************************/ void minnssetlc(const minnsstate &state, const real_2d_array &c, const integer_1d_array &ct) { alglib_impl::ae_state _alglib_env_state; ae_int_t k; if( (c.rows()!=ct.length())) throw ap_error("Error while calling 'minnssetlc': looks like one of arguments has wrong size"); k = c.rows(); alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minnssetlc(const_cast(state.c_ptr()), const_cast(c.c_ptr()), const_cast(ct.c_ptr()), k, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function sets nonlinear constraints. In fact, this function sets NUMBER of nonlinear constraints. Constraints itself (constraint functions) are passed to minnsoptimize() method. This method requires user-defined vector function F[] and its Jacobian J[], where: * first component of F[] and first row of Jacobian J[] correspond to function being minimized * next NLEC components of F[] (and rows of J) correspond to nonlinear equality constraints G_i(x)=0 * next NLIC components of F[] (and rows of J) correspond to nonlinear inequality constraints H_i(x)<=0 NOTE: you may combine nonlinear constraints with linear/boundary ones. If your problem has mixed constraints, you may explicitly specify some of them as linear ones. It may help optimizer to handle them more efficiently. INPUT PARAMETERS: State - structure previously allocated with minnscreate() call. NLEC - number of Non-Linear Equality Constraints (NLEC), >=0 NLIC - number of Non-Linear Inquality Constraints (NLIC), >=0 NOTE 1: nonlinear constraints are satisfied only approximately! It is possible that algorithm will evaluate function outside of the feasible area! NOTE 2: algorithm scales variables according to scale specified by minnssetscale() function, so it can handle problems with badly scaled variables (as long as we KNOW their scales). However, there is no way to automatically scale nonlinear constraints Gi(x) and Hi(x). 
Inappropriate scaling of Gi/Hi may ruin convergence. Solving problem with constraint "1000*G0(x)=0" is NOT same as solving it with constraint "0.001*G0(x)=0". It means that YOU are the one who is responsible for correct scaling of nonlinear constraints Gi(x) and Hi(x). We recommend you to scale nonlinear constraints in such way that I-th component of dG/dX (or dH/dx) has approximately unit magnitude (for problems with unit scale) or has magnitude approximately equal to 1/S[i] (where S is a scale set by minnssetscale() function). NOTE 3: nonlinear constraints are always hard to handle, no matter what algorithm you try to use. Even basic box/linear constraints modify function curvature by adding valleys and ridges. However, nonlinear constraints add valleys which are very hard to follow due to their "curved" nature. It means that optimization with single nonlinear constraint may be significantly slower than optimization with multiple linear ones. It is normal situation, and we recommend you to carefully choose Rho parameter of minnssetalgoags(), because too large value may slow down convergence. -- ALGLIB -- Copyright 18.05.2015 by Bochkanov Sergey *************************************************************************/ void minnssetnlc(const minnsstate &state, const ae_int_t nlec, const ae_int_t nlic) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minnssetnlc(const_cast(state.c_ptr()), nlec, nlic, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function sets stopping conditions for iterations of optimizer. INPUT PARAMETERS: State - structure which stores algorithm state EpsX - >=0 The AGS solver finishes its work if on k+1-th iteration sampling radius decreases below EpsX. MaxIts - maximum number of iterations. If MaxIts=0, the number of iterations is unlimited. Passing EpsX=0 and MaxIts=0 (simultaneously) will lead to automatic stopping criterion selection. We do not recommend you to rely on default choice in production code. -- ALGLIB -- Copyright 18.05.2015 by Bochkanov Sergey *************************************************************************/ void minnssetcond(const minnsstate &state, const double epsx, const ae_int_t maxits) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minnssetcond(const_cast(state.c_ptr()), epsx, maxits, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function sets scaling coefficients for NLC optimizer. ALGLIB optimizers use scaling matrices to test stopping conditions (step size and gradient are scaled before comparison with tolerances). Scale of the I-th variable is a translation invariant measure of: a) "how large" the variable is b) how large the step should be to make significant changes in the function Scaling is also used by finite difference variant of the optimizer - step along I-th axis is equal to DiffStep*S[I]. INPUT PARAMETERS: State - structure stores algorithm state S - array[N], non-zero scaling coefficients S[i] may be negative, sign doesn't matter. 
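For instance (illustrative values only), for a problem whose first variable varies on the order of thousands and whose second on the order of hundredths, one might set:

    real_1d_array s = "[1.0e+3, 1.0e-2]";   // rough magnitudes of x[0] and x[1]
    minnssetscale(state, s);                // these scales also multiply DiffStep and the AGS sampling radius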
-- ALGLIB -- Copyright 18.05.2015 by Bochkanov Sergey *************************************************************************/ void minnssetscale(const minnsstate &state, const real_1d_array &s) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minnssetscale(const_cast(state.c_ptr()), const_cast(s.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function tells MinNS unit to use AGS (adaptive gradient sampling) algorithm for nonsmooth constrained optimization. This algorithm is a slight modification of one described in "An Adaptive Gradient Sampling Algorithm for Nonsmooth Optimization" by Frank E. Curtisy and Xiaocun Quez. This optimizer has following benefits and drawbacks: + robustness; it can be used with nonsmooth and nonconvex functions. + relatively easy tuning; most of the metaparameters are easy to select. - it has convergence of steepest descent, slower than CG/LBFGS. - each iteration involves evaluation of ~2N gradient values and solution of 2Nx2N quadratic programming problem, which limits applicability of algorithm by small-scale problems (up to 50-100). IMPORTANT: this algorithm has convergence guarantees, i.e. it will steadily move towards some stationary point of the function. However, "stationary point" does not always mean "solution". Nonsmooth problems often have "flat spots", i.e. areas where function do not change at all. Such "flat spots" are stationary points by definition, and algorithm may be caught here. Nonsmooth CONVEX tasks are not prone to this problem. Say, if your function has form f()=MAX(f0,f1,...), and f_i are convex, then f() is convex too and you have guaranteed convergence to solution. INPUT PARAMETERS: State - structure which stores algorithm state Radius - initial sampling radius, >=0. Internally multiplied by vector of per-variable scales specified by minnssetscale()). You should select relatively large sampling radius, roughly proportional to scaled length of the first steps of the algorithm. Something close to 0.1 in magnitude should be good for most problems. AGS solver can automatically decrease radius, so too large radius is not a problem (assuming that you won't choose so large radius that algorithm will sample function in too far away points, where gradient value is irrelevant). Too small radius won't cause algorithm to fail, but it may slow down algorithm (it may have to perform too short steps). Penalty - penalty coefficient for nonlinear constraints: * for problem with nonlinear constraints should be some problem-specific positive value, large enough that penalty term changes shape of the function. Starting from some problem-specific value penalty coefficient becomes large enough to exactly enforce nonlinear constraints; larger values do not improve precision. Increasing it too much may slow down convergence, so you should choose it carefully. * can be zero for problems WITHOUT nonlinear constraints (i.e. for unconstrained ones or ones with just box or linear constraints) * if you specify zero value for problem with at least one nonlinear constraint, algorithm will terminate with error code -1. ALGORITHM OUTLINE The very basic outline of unconstrained AGS algorithm is given below: 0. If sampling radius is below EpsX or we performed more then MaxIts iterations - STOP. 1. 
sample O(N) gradient values at random locations around current point; informally speaking, this sample is an implicit piecewise linear model of the function, although algorithm formulation does not mention that explicitly 2. solve quadratic programming problem in order to find descent direction 3. if QP solver tells us that we are near solution, decrease sampling radius and move to (0) 4. perform backtracking line search 5. after moving to new point, goto (0) As for the constraints: * box constraints are handled exactly by modification of the function being minimized * linear/nonlinear constraints are handled by adding L1 penalty. Because our solver can handle nonsmoothness, we can use L1 penalty function, which is an exact one (i.e. exact solution is returned under such penalty). * penalty coefficient for linear constraints is chosen automatically; however, penalty coefficient for nonlinear constraints must be specified by user. -- ALGLIB -- Copyright 18.05.2015 by Bochkanov Sergey *************************************************************************/ void minnssetalgoags(const minnsstate &state, const double radius, const double penalty) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minnssetalgoags(const_cast(state.c_ptr()), radius, penalty, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function turns on/off reporting. INPUT PARAMETERS: State - structure which stores algorithm state NeedXRep- whether iteration reports are needed or not If NeedXRep is True, algorithm will call rep() callback function if it is provided to minnsoptimize(). -- ALGLIB -- Copyright 28.11.2010 by Bochkanov Sergey *************************************************************************/ void minnssetxrep(const minnsstate &state, const bool needxrep) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minnssetxrep(const_cast(state.c_ptr()), needxrep, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This subroutine submits request for termination of running optimizer. It should be called from user-supplied callback when user decides that it is time to "smoothly" terminate optimization process. As result, optimizer stops at point which was "current accepted" when termination request was submitted and returns error code 8 (successful termination). INPUT PARAMETERS: State - optimizer structure NOTE: after request for termination optimizer may perform several additional calls to user-supplied callbacks. It does NOT guarantee to stop immediately - it just guarantees that these additional calls will be discarded later. NOTE: calling this function on optimizer which is NOT running will have no effect. NOTE: multiple calls to this function are possible. First call is counted, subsequent calls are silently ignored. 
-- ALGLIB -- Copyright 18.05.2015 by Bochkanov Sergey *************************************************************************/ void minnsrequesttermination(const minnsstate &state) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minnsrequesttermination(const_cast(state.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function provides reverse communication interface Reverse communication interface is not documented or recommended to use. See below for functions which provide better documented API *************************************************************************/ bool minnsiteration(const minnsstate &state) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { ae_bool result = alglib_impl::minnsiteration(const_cast(state.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void minnsoptimize(minnsstate &state, void (*fvec)(const real_1d_array &x, real_1d_array &fi, void *ptr), void (*rep)(const real_1d_array &x, double func, void *ptr), void *ptr) { alglib_impl::ae_state _alglib_env_state; if( fvec==NULL ) throw ap_error("ALGLIB: error in 'minnsoptimize()' (fvec is NULL)"); alglib_impl::ae_state_init(&_alglib_env_state); try { while( alglib_impl::minnsiteration(state.c_ptr(), &_alglib_env_state) ) { if( state.needfi ) { fvec(state.x, state.fi, ptr); continue; } if( state.xupdated ) { if( rep!=NULL ) rep(state.x, state.f, ptr); continue; } throw ap_error("ALGLIB: error in 'minnsoptimize' (some derivatives were not provided?)"); } alglib_impl::ae_state_clear(&_alglib_env_state); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void minnsoptimize(minnsstate &state, void (*jac)(const real_1d_array &x, real_1d_array &fi, real_2d_array &jac, void *ptr), void (*rep)(const real_1d_array &x, double func, void *ptr), void *ptr) { alglib_impl::ae_state _alglib_env_state; if( jac==NULL ) throw ap_error("ALGLIB: error in 'minnsoptimize()' (jac is NULL)"); alglib_impl::ae_state_init(&_alglib_env_state); try { while( alglib_impl::minnsiteration(state.c_ptr(), &_alglib_env_state) ) { if( state.needfij ) { jac(state.x, state.fi, state.j, ptr); continue; } if( state.xupdated ) { if( rep!=NULL ) rep(state.x, state.f, ptr); continue; } throw ap_error("ALGLIB: error in 'minnsoptimize' (some derivatives were not provided?)"); } alglib_impl::ae_state_clear(&_alglib_env_state); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* MinNS results INPUT PARAMETERS: State - algorithm state OUTPUT PARAMETERS: X - array[0..N-1], solution Rep - optimization report. You should check Rep.TerminationType in order to distinguish successful termination from unsuccessful one: * -8 internal integrity control detected infinite or NAN values in function/gradient. Abnormal termination signalled. 
* -3 box constraints are inconsistent * -1 inconsistent parameters were passed: * penalty parameter for minnssetalgoags() is zero, but we have nonlinear constraints set by minnssetnlc() * 2 sampling radius decreased below epsx * 7 stopping conditions are too stringent, further improvement is impossible, X contains best point found so far. * 8 User requested termination via minnsrequesttermination() -- ALGLIB -- Copyright 18.05.2015 by Bochkanov Sergey *************************************************************************/ void minnsresults(const minnsstate &state, real_1d_array &x, minnsreport &rep) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minnsresults(const_cast(state.c_ptr()), const_cast(x.c_ptr()), const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Buffered implementation of minnsresults() which uses pre-allocated buffer to store X[]. If buffer size is too small, it resizes buffer. It is intended to be used in the inner cycles of performance critical algorithms where array reallocation penalty is too large to be ignored. -- ALGLIB -- Copyright 18.05.2015 by Bochkanov Sergey *************************************************************************/ void minnsresultsbuf(const minnsstate &state, real_1d_array &x, minnsreport &rep) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minnsresultsbuf(const_cast(state.c_ptr()), const_cast(x.c_ptr()), const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This subroutine restarts algorithm from new point. All optimization parameters (including constraints) are left unchanged. This function allows to solve multiple optimization problems (which must have same number of dimensions) without object reallocation penalty. INPUT PARAMETERS: State - structure previously allocated with minnscreate() call. X - new starting point. 
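A hedged sketch (illustration only) of this pattern, reusing the hypothetical nsfunc_jac callback and the state/xsol/rep variables from the minnscreate() example earlier in this section:

    const char *starts[] = { "[1,1]", "[-2,3]", "[0.5,-0.5]" };   // example starting points
    for(int i=0; i<3; i++)
    {
        real_1d_array xi = starts[i];
        minnsrestartfrom(state, xi);                  // same problem, new starting point
        minnsoptimize(state, nsfunc_jac, NULL, NULL);
        minnsresults(state, xsol, rep);
        printf("run %d: x = %s\n", i, xsol.tostring(4).c_str());
    }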
-- ALGLIB -- Copyright 18.05.2015 by Bochkanov Sergey *************************************************************************/ void minnsrestartfrom(const minnsstate &state, const real_1d_array &x) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minnsrestartfrom(const_cast(state.c_ptr()), const_cast(x.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* *************************************************************************/ _minasastate_owner::_minasastate_owner() { p_struct = (alglib_impl::minasastate*)alglib_impl::ae_malloc(sizeof(alglib_impl::minasastate), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_minasastate_init(p_struct, NULL); } _minasastate_owner::_minasastate_owner(const _minasastate_owner &rhs) { p_struct = (alglib_impl::minasastate*)alglib_impl::ae_malloc(sizeof(alglib_impl::minasastate), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_minasastate_init_copy(p_struct, const_cast(rhs.p_struct), NULL); } _minasastate_owner& _minasastate_owner::operator=(const _minasastate_owner &rhs) { if( this==&rhs ) return *this; alglib_impl::_minasastate_clear(p_struct); alglib_impl::_minasastate_init_copy(p_struct, const_cast(rhs.p_struct), NULL); return *this; } _minasastate_owner::~_minasastate_owner() { alglib_impl::_minasastate_clear(p_struct); ae_free(p_struct); } alglib_impl::minasastate* _minasastate_owner::c_ptr() { return p_struct; } alglib_impl::minasastate* _minasastate_owner::c_ptr() const { return const_cast(p_struct); } minasastate::minasastate() : _minasastate_owner() ,needfg(p_struct->needfg),xupdated(p_struct->xupdated),f(p_struct->f),g(&p_struct->g),x(&p_struct->x) { } minasastate::minasastate(const minasastate &rhs):_minasastate_owner(rhs) ,needfg(p_struct->needfg),xupdated(p_struct->xupdated),f(p_struct->f),g(&p_struct->g),x(&p_struct->x) { } minasastate& minasastate::operator=(const minasastate &rhs) { if( this==&rhs ) return *this; _minasastate_owner::operator=(rhs); return *this; } minasastate::~minasastate() { } /************************************************************************* *************************************************************************/ _minasareport_owner::_minasareport_owner() { p_struct = (alglib_impl::minasareport*)alglib_impl::ae_malloc(sizeof(alglib_impl::minasareport), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_minasareport_init(p_struct, NULL); } _minasareport_owner::_minasareport_owner(const _minasareport_owner &rhs) { p_struct = (alglib_impl::minasareport*)alglib_impl::ae_malloc(sizeof(alglib_impl::minasareport), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_minasareport_init_copy(p_struct, const_cast(rhs.p_struct), NULL); } _minasareport_owner& _minasareport_owner::operator=(const _minasareport_owner &rhs) { if( this==&rhs ) return *this; alglib_impl::_minasareport_clear(p_struct); alglib_impl::_minasareport_init_copy(p_struct, const_cast(rhs.p_struct), NULL); return *this; } _minasareport_owner::~_minasareport_owner() { alglib_impl::_minasareport_clear(p_struct); ae_free(p_struct); } alglib_impl::minasareport* _minasareport_owner::c_ptr() { return p_struct; } alglib_impl::minasareport* _minasareport_owner::c_ptr() const { return 
const_cast(p_struct); } minasareport::minasareport() : _minasareport_owner() ,iterationscount(p_struct->iterationscount),nfev(p_struct->nfev),terminationtype(p_struct->terminationtype),activeconstraints(p_struct->activeconstraints) { } minasareport::minasareport(const minasareport &rhs):_minasareport_owner(rhs) ,iterationscount(p_struct->iterationscount),nfev(p_struct->nfev),terminationtype(p_struct->terminationtype),activeconstraints(p_struct->activeconstraints) { } minasareport& minasareport::operator=(const minasareport &rhs) { if( this==&rhs ) return *this; _minasareport_owner::operator=(rhs); return *this; } minasareport::~minasareport() { } /************************************************************************* Obsolete function, use MinLBFGSSetPrecDefault() instead. -- ALGLIB -- Copyright 13.10.2010 by Bochkanov Sergey *************************************************************************/ void minlbfgssetdefaultpreconditioner(const minlbfgsstate &state) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minlbfgssetdefaultpreconditioner(const_cast(state.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Obsolete function, use MinLBFGSSetCholeskyPreconditioner() instead. -- ALGLIB -- Copyright 13.10.2010 by Bochkanov Sergey *************************************************************************/ void minlbfgssetcholeskypreconditioner(const minlbfgsstate &state, const real_2d_array &p, const bool isupper) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minlbfgssetcholeskypreconditioner(const_cast(state.c_ptr()), const_cast(p.c_ptr()), isupper, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This is obsolete function which was used by previous version of the BLEIC optimizer. It does nothing in the current version of BLEIC. -- ALGLIB -- Copyright 28.11.2010 by Bochkanov Sergey *************************************************************************/ void minbleicsetbarrierwidth(const minbleicstate &state, const double mu) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minbleicsetbarrierwidth(const_cast(state.c_ptr()), mu, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This is obsolete function which was used by previous version of the BLEIC optimizer. It does nothing in the current version of BLEIC. 
-- ALGLIB -- Copyright 28.11.2010 by Bochkanov Sergey *************************************************************************/ void minbleicsetbarrierdecay(const minbleicstate &state, const double mudecay) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minbleicsetbarrierdecay(const_cast(state.c_ptr()), mudecay, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Obsolete optimization algorithm. Was replaced by MinBLEIC subpackage. -- ALGLIB -- Copyright 25.03.2010 by Bochkanov Sergey *************************************************************************/ void minasacreate(const ae_int_t n, const real_1d_array &x, const real_1d_array &bndl, const real_1d_array &bndu, minasastate &state) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minasacreate(n, const_cast(x.c_ptr()), const_cast(bndl.c_ptr()), const_cast(bndu.c_ptr()), const_cast(state.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Obsolete optimization algorithm. Was replaced by MinBLEIC subpackage. -- ALGLIB -- Copyright 25.03.2010 by Bochkanov Sergey *************************************************************************/ void minasacreate(const real_1d_array &x, const real_1d_array &bndl, const real_1d_array &bndu, minasastate &state) { alglib_impl::ae_state _alglib_env_state; ae_int_t n; if( (x.length()!=bndl.length()) || (x.length()!=bndu.length())) throw ap_error("Error while calling 'minasacreate': looks like one of arguments has wrong size"); n = x.length(); alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minasacreate(n, const_cast(x.c_ptr()), const_cast(bndl.c_ptr()), const_cast(bndu.c_ptr()), const_cast(state.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Obsolete optimization algorithm. Was replaced by MinBLEIC subpackage. -- ALGLIB -- Copyright 02.04.2010 by Bochkanov Sergey *************************************************************************/ void minasasetcond(const minasastate &state, const double epsg, const double epsf, const double epsx, const ae_int_t maxits) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minasasetcond(const_cast(state.c_ptr()), epsg, epsf, epsx, maxits, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Obsolete optimization algorithm. Was replaced by MinBLEIC subpackage. 
-- ALGLIB -- Copyright 02.04.2010 by Bochkanov Sergey *************************************************************************/ void minasasetxrep(const minasastate &state, const bool needxrep) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minasasetxrep(const_cast(state.c_ptr()), needxrep, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Obsolete optimization algorithm. Was replaced by MinBLEIC subpackage. -- ALGLIB -- Copyright 02.04.2010 by Bochkanov Sergey *************************************************************************/ void minasasetalgorithm(const minasastate &state, const ae_int_t algotype) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minasasetalgorithm(const_cast(state.c_ptr()), algotype, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Obsolete optimization algorithm. Was replaced by MinBLEIC subpackage. -- ALGLIB -- Copyright 02.04.2010 by Bochkanov Sergey *************************************************************************/ void minasasetstpmax(const minasastate &state, const double stpmax) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minasasetstpmax(const_cast(state.c_ptr()), stpmax, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function provides reverse communication interface Reverse communication interface is not documented or recommended to use. 
See below for functions which provide better documented API *************************************************************************/ bool minasaiteration(const minasastate &state) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { ae_bool result = alglib_impl::minasaiteration(const_cast(state.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void minasaoptimize(minasastate &state, void (*grad)(const real_1d_array &x, double &func, real_1d_array &grad, void *ptr), void (*rep)(const real_1d_array &x, double func, void *ptr), void *ptr) { alglib_impl::ae_state _alglib_env_state; if( grad==NULL ) throw ap_error("ALGLIB: error in 'minasaoptimize()' (grad is NULL)"); alglib_impl::ae_state_init(&_alglib_env_state); try { while( alglib_impl::minasaiteration(state.c_ptr(), &_alglib_env_state) ) { if( state.needfg ) { grad(state.x, state.f, state.g, ptr); continue; } if( state.xupdated ) { if( rep!=NULL ) rep(state.x, state.f, ptr); continue; } throw ap_error("ALGLIB: error in 'minasaoptimize' (some derivatives were not provided?)"); } alglib_impl::ae_state_clear(&_alglib_env_state); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Obsolete optimization algorithm. Was replaced by MinBLEIC subpackage. -- ALGLIB -- Copyright 20.03.2009 by Bochkanov Sergey *************************************************************************/ void minasaresults(const minasastate &state, real_1d_array &x, minasareport &rep) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minasaresults(const_cast(state.c_ptr()), const_cast(x.c_ptr()), const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Obsolete optimization algorithm. Was replaced by MinBLEIC subpackage. -- ALGLIB -- Copyright 20.03.2009 by Bochkanov Sergey *************************************************************************/ void minasaresultsbuf(const minasastate &state, real_1d_array &x, minasareport &rep) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minasaresultsbuf(const_cast(state.c_ptr()), const_cast(x.c_ptr()), const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Obsolete optimization algorithm. Was replaced by MinBLEIC subpackage. 
-- ALGLIB -- Copyright 30.07.2010 by Bochkanov Sergey *************************************************************************/ void minasarestartfrom(const minasastate &state, const real_1d_array &x, const real_1d_array &bndl, const real_1d_array &bndu) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minasarestartfrom(const_cast(state.c_ptr()), const_cast(x.c_ptr()), const_cast(bndl.c_ptr()), const_cast(bndu.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Levenberg-Marquardt optimizer. This structure should be created using one of the MinLMCreate???() functions. You should not access its fields directly; use ALGLIB functions to work with it. *************************************************************************/ _minlmstate_owner::_minlmstate_owner() { p_struct = (alglib_impl::minlmstate*)alglib_impl::ae_malloc(sizeof(alglib_impl::minlmstate), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_minlmstate_init(p_struct, NULL); } _minlmstate_owner::_minlmstate_owner(const _minlmstate_owner &rhs) { p_struct = (alglib_impl::minlmstate*)alglib_impl::ae_malloc(sizeof(alglib_impl::minlmstate), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_minlmstate_init_copy(p_struct, const_cast(rhs.p_struct), NULL); } _minlmstate_owner& _minlmstate_owner::operator=(const _minlmstate_owner &rhs) { if( this==&rhs ) return *this; alglib_impl::_minlmstate_clear(p_struct); alglib_impl::_minlmstate_init_copy(p_struct, const_cast(rhs.p_struct), NULL); return *this; } _minlmstate_owner::~_minlmstate_owner() { alglib_impl::_minlmstate_clear(p_struct); ae_free(p_struct); } alglib_impl::minlmstate* _minlmstate_owner::c_ptr() { return p_struct; } alglib_impl::minlmstate* _minlmstate_owner::c_ptr() const { return const_cast(p_struct); } minlmstate::minlmstate() : _minlmstate_owner() ,needf(p_struct->needf),needfg(p_struct->needfg),needfgh(p_struct->needfgh),needfi(p_struct->needfi),needfij(p_struct->needfij),xupdated(p_struct->xupdated),f(p_struct->f),fi(&p_struct->fi),g(&p_struct->g),h(&p_struct->h),j(&p_struct->j),x(&p_struct->x) { } minlmstate::minlmstate(const minlmstate &rhs):_minlmstate_owner(rhs) ,needf(p_struct->needf),needfg(p_struct->needfg),needfgh(p_struct->needfgh),needfi(p_struct->needfi),needfij(p_struct->needfij),xupdated(p_struct->xupdated),f(p_struct->f),fi(&p_struct->fi),g(&p_struct->g),h(&p_struct->h),j(&p_struct->j),x(&p_struct->x) { } minlmstate& minlmstate::operator=(const minlmstate &rhs) { if( this==&rhs ) return *this; _minlmstate_owner::operator=(rhs); return *this; } minlmstate::~minlmstate() { } /************************************************************************* Optimization report, filled by MinLMResults() function FIELDS: * TerminationType, completetion code: * -8 optimizer detected NAN/INF values either in the function itself, or in its Jacobian * -7 derivative correctness check failed; see rep.funcidx, rep.varidx for more information. * -5 inappropriate solver was used: * solver created with minlmcreatefgh() used on problem with general linear constraints (set with minlmsetlc() call). * -3 constraints are inconsistent * 2 relative step is no more than EpsX. 
* 5 MaxIts steps was taken * 7 stopping conditions are too stringent, further improvement is impossible * 8 terminated by user who called MinLMRequestTermination(). X contains point which was "current accepted" when termination request was submitted. * IterationsCount, contains iterations count * NFunc, number of function calculations * NJac, number of Jacobi matrix calculations * NGrad, number of gradient calculations * NHess, number of Hessian calculations * NCholesky, number of Cholesky decomposition calculations *************************************************************************/ _minlmreport_owner::_minlmreport_owner() { p_struct = (alglib_impl::minlmreport*)alglib_impl::ae_malloc(sizeof(alglib_impl::minlmreport), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_minlmreport_init(p_struct, NULL); } _minlmreport_owner::_minlmreport_owner(const _minlmreport_owner &rhs) { p_struct = (alglib_impl::minlmreport*)alglib_impl::ae_malloc(sizeof(alglib_impl::minlmreport), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_minlmreport_init_copy(p_struct, const_cast(rhs.p_struct), NULL); } _minlmreport_owner& _minlmreport_owner::operator=(const _minlmreport_owner &rhs) { if( this==&rhs ) return *this; alglib_impl::_minlmreport_clear(p_struct); alglib_impl::_minlmreport_init_copy(p_struct, const_cast(rhs.p_struct), NULL); return *this; } _minlmreport_owner::~_minlmreport_owner() { alglib_impl::_minlmreport_clear(p_struct); ae_free(p_struct); } alglib_impl::minlmreport* _minlmreport_owner::c_ptr() { return p_struct; } alglib_impl::minlmreport* _minlmreport_owner::c_ptr() const { return const_cast(p_struct); } minlmreport::minlmreport() : _minlmreport_owner() ,iterationscount(p_struct->iterationscount),terminationtype(p_struct->terminationtype),funcidx(p_struct->funcidx),varidx(p_struct->varidx),nfunc(p_struct->nfunc),njac(p_struct->njac),ngrad(p_struct->ngrad),nhess(p_struct->nhess),ncholesky(p_struct->ncholesky) { } minlmreport::minlmreport(const minlmreport &rhs):_minlmreport_owner(rhs) ,iterationscount(p_struct->iterationscount),terminationtype(p_struct->terminationtype),funcidx(p_struct->funcidx),varidx(p_struct->varidx),nfunc(p_struct->nfunc),njac(p_struct->njac),ngrad(p_struct->ngrad),nhess(p_struct->nhess),ncholesky(p_struct->ncholesky) { } minlmreport& minlmreport::operator=(const minlmreport &rhs) { if( this==&rhs ) return *this; _minlmreport_owner::operator=(rhs); return *this; } minlmreport::~minlmreport() { } /************************************************************************* IMPROVED LEVENBERG-MARQUARDT METHOD FOR NON-LINEAR LEAST SQUARES OPTIMIZATION DESCRIPTION: This function is used to find minimum of function which is represented as sum of squares: F(x) = f[0]^2(x[0],...,x[n-1]) + ... + f[m-1]^2(x[0],...,x[n-1]) using value of function vector f[] and Jacobian of f[]. REQUIREMENTS: This algorithm will request following information during its operation: * function vector f[] at given point X * function vector f[] and Jacobian of f[] (simultaneously) at given point There are several overloaded versions of MinLMOptimize() function which correspond to different LM-like optimization algorithms provided by this unit. You should choose version which accepts fvec() and jac() callbacks. First one is used to calculate f[] at given point, second one calculates f[] and Jacobian df[i]/dx[j]. 
You can try to initialize MinLMState structure with VJ function and then use incorrect version of MinLMOptimize() (for example, version which works with general form function and does not provide Jacobian), but it will lead to exception being thrown after first attempt to calculate Jacobian. USAGE: 1. User initializes algorithm state with MinLMCreateVJ() call 2. User tunes solver parameters with MinLMSetCond(), MinLMSetStpMax() and other functions 3. User calls MinLMOptimize() function which takes algorithm state and callback functions. 4. User calls MinLMResults() to get solution 5. Optionally, user may call MinLMRestartFrom() to solve another problem with same N/M but another starting point and/or another function. MinLMRestartFrom() allows to reuse already initialized structure. INPUT PARAMETERS: N - dimension, N>1 * if given, only leading N elements of X are used * if not given, automatically determined from size of X M - number of functions f[i] X - initial solution, array[0..N-1] OUTPUT PARAMETERS: State - structure which stores algorithm state NOTES: 1. you may tune stopping conditions with MinLMSetCond() function 2. if target function contains exp() or other fast growing functions, and optimization algorithm makes too large steps which leads to overflow, use MinLMSetStpMax() function to bound algorithm's steps. -- ALGLIB -- Copyright 30.03.2009 by Bochkanov Sergey *************************************************************************/ void minlmcreatevj(const ae_int_t n, const ae_int_t m, const real_1d_array &x, minlmstate &state) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minlmcreatevj(n, m, const_cast(x.c_ptr()), const_cast(state.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* IMPROVED LEVENBERG-MARQUARDT METHOD FOR NON-LINEAR LEAST SQUARES OPTIMIZATION DESCRIPTION: This function is used to find minimum of function which is represented as sum of squares: F(x) = f[0]^2(x[0],...,x[n-1]) + ... + f[m-1]^2(x[0],...,x[n-1]) using value of function vector f[] and Jacobian of f[]. REQUIREMENTS: This algorithm will request following information during its operation: * function vector f[] at given point X * function vector f[] and Jacobian of f[] (simultaneously) at given point There are several overloaded versions of MinLMOptimize() function which correspond to different LM-like optimization algorithms provided by this unit. You should choose version which accepts fvec() and jac() callbacks. First one is used to calculate f[] at given point, second one calculates f[] and Jacobian df[i]/dx[j]. You can try to initialize MinLMState structure with VJ function and then use incorrect version of MinLMOptimize() (for example, version which works with general form function and does not provide Jacobian), but it will lead to exception being thrown after first attempt to calculate Jacobian. USAGE: 1. User initializes algorithm state with MinLMCreateVJ() call 2. User tunes solver parameters with MinLMSetCond(), MinLMSetStpMax() and other functions 3. User calls MinLMOptimize() function which takes algorithm state and callback functions. 4. User calls MinLMResults() to get solution 5. Optionally, user may call MinLMRestartFrom() to solve another problem with same N/M but another starting point and/or another function. 
MinLMRestartFrom() allows to reuse already initialized structure. INPUT PARAMETERS: N - dimension, N>1 * if given, only leading N elements of X are used * if not given, automatically determined from size of X M - number of functions f[i] X - initial solution, array[0..N-1] OUTPUT PARAMETERS: State - structure which stores algorithm state NOTES: 1. you may tune stopping conditions with MinLMSetCond() function 2. if target function contains exp() or other fast growing functions, and optimization algorithm makes too large steps which leads to overflow, use MinLMSetStpMax() function to bound algorithm's steps. -- ALGLIB -- Copyright 30.03.2009 by Bochkanov Sergey *************************************************************************/ void minlmcreatevj(const ae_int_t m, const real_1d_array &x, minlmstate &state) { alglib_impl::ae_state _alglib_env_state; ae_int_t n; n = x.length(); alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minlmcreatevj(n, m, const_cast(x.c_ptr()), const_cast(state.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* IMPROVED LEVENBERG-MARQUARDT METHOD FOR NON-LINEAR LEAST SQUARES OPTIMIZATION DESCRIPTION: This function is used to find minimum of function which is represented as sum of squares: F(x) = f[0]^2(x[0],...,x[n-1]) + ... + f[m-1]^2(x[0],...,x[n-1]) using value of function vector f[] only. Finite differences are used to calculate Jacobian. REQUIREMENTS: This algorithm will request following information during its operation: * function vector f[] at given point X There are several overloaded versions of MinLMOptimize() function which correspond to different LM-like optimization algorithms provided by this unit. You should choose version which accepts fvec() callback. You can try to initialize MinLMState structure with VJ function and then use incorrect version of MinLMOptimize() (for example, version which works with general form function and does not accept function vector), but it will lead to exception being thrown after first attempt to calculate Jacobian. USAGE: 1. User initializes algorithm state with MinLMCreateV() call 2. User tunes solver parameters with MinLMSetCond(), MinLMSetStpMax() and other functions 3. User calls MinLMOptimize() function which takes algorithm state and callback functions. 4. User calls MinLMResults() to get solution 5. Optionally, user may call MinLMRestartFrom() to solve another problem with same N/M but another starting point and/or another function. MinLMRestartFrom() allows to reuse already initialized structure. INPUT PARAMETERS: N - dimension, N>1 * if given, only leading N elements of X are used * if not given, automatically determined from size of X M - number of functions f[i] X - initial solution, array[0..N-1] DiffStep- differentiation step, >0 OUTPUT PARAMETERS: State - structure which stores algorithm state See also MinLMIteration, MinLMResults. NOTES: 1. you may tune stopping conditions with MinLMSetCond() function 2. if target function contains exp() or other fast growing functions, and optimization algorithm makes too large steps which leads to overflow, use MinLMSetStpMax() function to bound algorithm's steps. 
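To make the USAGE steps above concrete, here is a minimal sketch of the V protocol; the residuals, starting point, tolerances and the "optimization.h" include are illustrative assumptions rather than part of this unit:

#include <cstdio>
#include "optimization.h"          // assumed name of the ALGLIB C++ optimization header
using namespace alglib;

// hypothetical residuals: f0 = 10*(x0+3), f1 = x1-3, so F = f0^2+f1^2 is minimal at (-3,+3)
void sample_fvec(const real_1d_array &x, real_1d_array &fi, void *ptr)
{
    fi[0] = 10*(x[0]+3);
    fi[1] = x[1]-3;
}

int main()
{
    real_1d_array x = "[0,0]";             // initial solution, N=2
    minlmstate state;
    minlmreport rep;
    minlmcreatev(2, 2, x, 0.0001, state);  // N=2, M=2, DiffStep=0.0001 (numerical Jacobian)
    minlmsetcond(state, 1.0e-10, 0);       // EpsX stopping condition, unlimited MaxIts
    minlmoptimize(state, sample_fvec, NULL, NULL);
    minlmresults(state, x, rep);
    printf("%d\n%s\n", int(rep.terminationtype), x.tostring(2).c_str());
    return 0;
}

A positive rep.terminationtype (see the MinLMReport description above) indicates successful completion.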
-- ALGLIB -- Copyright 30.03.2009 by Bochkanov Sergey *************************************************************************/ void minlmcreatev(const ae_int_t n, const ae_int_t m, const real_1d_array &x, const double diffstep, minlmstate &state) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minlmcreatev(n, m, const_cast(x.c_ptr()), diffstep, const_cast(state.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* IMPROVED LEVENBERG-MARQUARDT METHOD FOR NON-LINEAR LEAST SQUARES OPTIMIZATION DESCRIPTION: This function is used to find minimum of function which is represented as sum of squares: F(x) = f[0]^2(x[0],...,x[n-1]) + ... + f[m-1]^2(x[0],...,x[n-1]) using value of function vector f[] only. Finite differences are used to calculate Jacobian. REQUIREMENTS: This algorithm will request following information during its operation: * function vector f[] at given point X There are several overloaded versions of MinLMOptimize() function which correspond to different LM-like optimization algorithms provided by this unit. You should choose version which accepts fvec() callback. You can try to initialize MinLMState structure with VJ function and then use incorrect version of MinLMOptimize() (for example, version which works with general form function and does not accept function vector), but it will lead to exception being thrown after first attempt to calculate Jacobian. USAGE: 1. User initializes algorithm state with MinLMCreateV() call 2. User tunes solver parameters with MinLMSetCond(), MinLMSetStpMax() and other functions 3. User calls MinLMOptimize() function which takes algorithm state and callback functions. 4. User calls MinLMResults() to get solution 5. Optionally, user may call MinLMRestartFrom() to solve another problem with same N/M but another starting point and/or another function. MinLMRestartFrom() allows to reuse already initialized structure. INPUT PARAMETERS: N - dimension, N>1 * if given, only leading N elements of X are used * if not given, automatically determined from size of X M - number of functions f[i] X - initial solution, array[0..N-1] DiffStep- differentiation step, >0 OUTPUT PARAMETERS: State - structure which stores algorithm state See also MinLMIteration, MinLMResults. NOTES: 1. you may tune stopping conditions with MinLMSetCond() function 2. if target function contains exp() or other fast growing functions, and optimization algorithm makes too large steps which leads to overflow, use MinLMSetStpMax() function to bound algorithm's steps. 
-- ALGLIB -- Copyright 30.03.2009 by Bochkanov Sergey *************************************************************************/ void minlmcreatev(const ae_int_t m, const real_1d_array &x, const double diffstep, minlmstate &state) { alglib_impl::ae_state _alglib_env_state; ae_int_t n; n = x.length(); alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minlmcreatev(n, m, const_cast(x.c_ptr()), diffstep, const_cast(state.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* LEVENBERG-MARQUARDT-LIKE METHOD FOR NON-LINEAR OPTIMIZATION DESCRIPTION: This function is used to find minimum of general form (not "sum-of- -squares") function F = F(x[0], ..., x[n-1]) using its gradient and Hessian. Levenberg-Marquardt modification with L-BFGS pre-optimization and internal pre-conditioned L-BFGS optimization after each Levenberg-Marquardt step is used. REQUIREMENTS: This algorithm will request following information during its operation: * function value F at given point X * F and gradient G (simultaneously) at given point X * F, G and Hessian H (simultaneously) at given point X There are several overloaded versions of MinLMOptimize() function which correspond to different LM-like optimization algorithms provided by this unit. You should choose version which accepts func(), grad() and hess() function pointers. First pointer is used to calculate F at given point, second one calculates F(x) and grad F(x), third one calculates F(x), grad F(x), hess F(x). You can try to initialize MinLMState structure with FGH-function and then use incorrect version of MinLMOptimize() (for example, version which does not provide Hessian matrix), but it will lead to exception being thrown after first attempt to calculate Hessian. USAGE: 1. User initializes algorithm state with MinLMCreateFGH() call 2. User tunes solver parameters with MinLMSetCond(), MinLMSetStpMax() and other functions 3. User calls MinLMOptimize() function which takes algorithm state and pointers (delegates, etc.) to callback functions. 4. User calls MinLMResults() to get solution 5. Optionally, user may call MinLMRestartFrom() to solve another problem with same N but another starting point and/or another function. MinLMRestartFrom() allows to reuse already initialized structure. INPUT PARAMETERS: N - dimension, N>1 * if given, only leading N elements of X are used * if not given, automatically determined from size of X X - initial solution, array[0..N-1] OUTPUT PARAMETERS: State - structure which stores algorithm state NOTES: 1. you may tune stopping conditions with MinLMSetCond() function 2. if target function contains exp() or other fast growing functions, and optimization algorithm makes too large steps which leads to overflow, use MinLMSetStpMax() function to bound algorithm's steps. 
-- ALGLIB -- Copyright 30.03.2009 by Bochkanov Sergey *************************************************************************/ void minlmcreatefgh(const ae_int_t n, const real_1d_array &x, minlmstate &state) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minlmcreatefgh(n, const_cast(x.c_ptr()), const_cast(state.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* LEVENBERG-MARQUARDT-LIKE METHOD FOR NON-LINEAR OPTIMIZATION DESCRIPTION: This function is used to find minimum of general form (not "sum-of- -squares") function F = F(x[0], ..., x[n-1]) using its gradient and Hessian. Levenberg-Marquardt modification with L-BFGS pre-optimization and internal pre-conditioned L-BFGS optimization after each Levenberg-Marquardt step is used. REQUIREMENTS: This algorithm will request following information during its operation: * function value F at given point X * F and gradient G (simultaneously) at given point X * F, G and Hessian H (simultaneously) at given point X There are several overloaded versions of MinLMOptimize() function which correspond to different LM-like optimization algorithms provided by this unit. You should choose version which accepts func(), grad() and hess() function pointers. First pointer is used to calculate F at given point, second one calculates F(x) and grad F(x), third one calculates F(x), grad F(x), hess F(x). You can try to initialize MinLMState structure with FGH-function and then use incorrect version of MinLMOptimize() (for example, version which does not provide Hessian matrix), but it will lead to exception being thrown after first attempt to calculate Hessian. USAGE: 1. User initializes algorithm state with MinLMCreateFGH() call 2. User tunes solver parameters with MinLMSetCond(), MinLMSetStpMax() and other functions 3. User calls MinLMOptimize() function which takes algorithm state and pointers (delegates, etc.) to callback functions. 4. User calls MinLMResults() to get solution 5. Optionally, user may call MinLMRestartFrom() to solve another problem with same N but another starting point and/or another function. MinLMRestartFrom() allows to reuse already initialized structure. INPUT PARAMETERS: N - dimension, N>1 * if given, only leading N elements of X are used * if not given, automatically determined from size of X X - initial solution, array[0..N-1] OUTPUT PARAMETERS: State - structure which stores algorithm state NOTES: 1. you may tune stopping conditions with MinLMSetCond() function 2. if target function contains exp() or other fast growing functions, and optimization algorithm makes too large steps which leads to overflow, use MinLMSetStpMax() function to bound algorithm's steps. 
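For the FGH protocol just described, a minimal sketch (the quadratic target is illustrative; the header include and using-directive are assumed to be the same as in the earlier V-protocol sketch):

#include "optimization.h"          // assumed ALGLIB header, as before
using namespace alglib;

// illustrative target F(x) = (x0-1)^2 + (x1+2)^2 with analytic gradient and Hessian
void sample_func(const real_1d_array &x, double &func, void *ptr)
{
    func = (x[0]-1)*(x[0]-1) + (x[1]+2)*(x[1]+2);
}
void sample_grad(const real_1d_array &x, double &func, real_1d_array &grad, void *ptr)
{
    sample_func(x, func, ptr);
    grad[0] = 2*(x[0]-1);
    grad[1] = 2*(x[1]+2);
}
void sample_hess(const real_1d_array &x, double &func, real_1d_array &grad, real_2d_array &hess, void *ptr)
{
    sample_grad(x, func, grad, ptr);
    hess[0][0] = 2;  hess[0][1] = 0;
    hess[1][0] = 0;  hess[1][1] = 2;
}

int main()
{
    real_1d_array x = "[0,0]";
    minlmstate state;
    minlmreport rep;
    minlmcreatefgh(x, state);              // N deduced from x.length()
    minlmsetcond(state, 1.0e-10, 0);
    minlmoptimize(state, sample_func, sample_grad, sample_hess, NULL, NULL);
    minlmresults(state, x, rep);           // solution is expected near (1,-2)
    return 0;
}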
-- ALGLIB -- Copyright 30.03.2009 by Bochkanov Sergey *************************************************************************/ void minlmcreatefgh(const real_1d_array &x, minlmstate &state) { alglib_impl::ae_state _alglib_env_state; ae_int_t n; n = x.length(); alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minlmcreatefgh(n, const_cast(x.c_ptr()), const_cast(state.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function sets stopping conditions for Levenberg-Marquardt optimization algorithm. INPUT PARAMETERS: State - structure which stores algorithm state EpsX - >=0 The subroutine finishes its work if on k+1-th iteration the condition |v|<=EpsX is fulfilled, where: * |.| means Euclidian norm * v - scaled step vector, v[i]=dx[i]/s[i] * dx - ste pvector, dx=X(k+1)-X(k) * s - scaling coefficients set by MinLMSetScale() Recommended values: 1E-9 ... 1E-12. MaxIts - maximum number of iterations. If MaxIts=0, the number of iterations is unlimited. Only Levenberg-Marquardt iterations are counted (L-BFGS/CG iterations are NOT counted because their cost is very low compared to that of LM). Passing EpsX=0 and MaxIts=0 (simultaneously) will lead to automatic stopping criterion selection (small EpsX). NOTE: it is not recommended to set large EpsX (say, 0.001). Because LM is a second-order method, it performs very precise steps anyway. -- ALGLIB -- Copyright 02.04.2010 by Bochkanov Sergey *************************************************************************/ void minlmsetcond(const minlmstate &state, const double epsx, const ae_int_t maxits) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minlmsetcond(const_cast(state.c_ptr()), epsx, maxits, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function turns on/off reporting. INPUT PARAMETERS: State - structure which stores algorithm state NeedXRep- whether iteration reports are needed or not If NeedXRep is True, algorithm will call rep() callback function if it is provided to MinLMOptimize(). Both Levenberg-Marquardt and internal L-BFGS iterations are reported. -- ALGLIB -- Copyright 02.04.2010 by Bochkanov Sergey *************************************************************************/ void minlmsetxrep(const minlmstate &state, const bool needxrep) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minlmsetxrep(const_cast(state.c_ptr()), needxrep, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function sets maximum step length INPUT PARAMETERS: State - structure which stores algorithm state StpMax - maximum step length, >=0. Set StpMax to 0.0, if you don't want to limit step length. Use this subroutine when you optimize target function which contains exp() or other fast growing functions, and optimization algorithm makes too large steps which leads to overflow. 
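For example, continuing the V-protocol sketch shown earlier (sample_fvec is the hypothetical residual callback; the numeric values below are illustrative), reporting and step-length limiting are enabled like this:

// called once per accepted step when MinLMSetXRep(state, true) is active
void sample_rep(const real_1d_array &x, double func, void *ptr)
{
    printf("F = %.6f at %s\n", func, x.tostring(4).c_str());
}

// ... inside the driver function, after minlmcreatev() and before minlmoptimize():
minlmsetcond(state, 1.0e-9, 100);   // EpsX=1e-9, at most 100 LM iterations
minlmsetxrep(state, true);          // turn on rep() callbacks
minlmsetstpmax(state, 0.5);         // do not allow steps longer than 0.5
minlmoptimize(state, sample_fvec, sample_rep, NULL);

The step-length limit installed by the last call, MinLMSetStpMax(), deserves a few additional comments.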
This function allows us to reject steps that are too large (and therefore expose us to the possible overflow) without actually calculating function value at the x+stp*d. NOTE: non-zero StpMax leads to moderate performance degradation because intermediate step of preconditioned L-BFGS optimization is incompatible with limits on step size. -- ALGLIB -- Copyright 02.04.2010 by Bochkanov Sergey *************************************************************************/ void minlmsetstpmax(const minlmstate &state, const double stpmax) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minlmsetstpmax(const_cast(state.c_ptr()), stpmax, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function sets scaling coefficients for LM optimizer. ALGLIB optimizers use scaling matrices to test stopping conditions (step size and gradient are scaled before comparison with tolerances). Scale of the I-th variable is a translation invariant measure of: a) "how large" the variable is b) how large the step should be to make significant changes in the function Generally, scale is NOT considered to be a form of preconditioner. But LM optimizer is unique in that it uses scaling matrix both in the stopping condition tests and as Marquardt damping factor. Proper scaling is very important for the algorithm performance. It is less important for the quality of results, but still has some influence (it is easier to converge when variables are properly scaled, so premature stopping is possible when very badly scalled variables are combined with relaxed stopping conditions). INPUT PARAMETERS: State - structure stores algorithm state S - array[N], non-zero scaling coefficients S[i] may be negative, sign doesn't matter. -- ALGLIB -- Copyright 14.01.2011 by Bochkanov Sergey *************************************************************************/ void minlmsetscale(const minlmstate &state, const real_1d_array &s) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minlmsetscale(const_cast(state.c_ptr()), const_cast(s.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function sets boundary constraints for LM optimizer Boundary constraints are inactive by default (after initial creation). They are preserved until explicitly turned off with another SetBC() call. INPUT PARAMETERS: State - structure stores algorithm state BndL - lower bounds, array[N]. If some (all) variables are unbounded, you may specify very small number or -INF (latter is recommended because it will allow solver to use better algorithm). BndU - upper bounds, array[N]. If some (all) variables are unbounded, you may specify very large number or +INF (latter is recommended because it will allow solver to use better algorithm). NOTE 1: it is possible to specify BndL[i]=BndU[i]. In this case I-th variable will be "frozen" at X[i]=BndL[i]=BndU[i]. 
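For instance, continuing the earlier sketch (bound values are illustrative), x0 can be restricted to [-10,+10] while x1 is frozen at 3 exactly as described in NOTE 1:

real_1d_array bndl = "[-10, 3]";
real_1d_array bndu = "[+10, 3]";
minlmsetbc(state, bndl, bndu);   // x1 is "frozen" because BndL[1]=BndU[1]=3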
NOTE 2: this solver has following useful properties: * bound constraints are always satisfied exactly * function is evaluated only INSIDE area specified by bound constraints or at its boundary -- ALGLIB -- Copyright 14.01.2011 by Bochkanov Sergey *************************************************************************/ void minlmsetbc(const minlmstate &state, const real_1d_array &bndl, const real_1d_array &bndu) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minlmsetbc(const_cast(state.c_ptr()), const_cast(bndl.c_ptr()), const_cast(bndu.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function sets general linear constraints for LM optimizer Linear constraints are inactive by default (after initial creation). They are preserved until explicitly turned off with another minlmsetlc() call. INPUT PARAMETERS: State - structure stores algorithm state C - linear constraints, array[K,N+1]. Each row of C represents one constraint, either equality or inequality (see below): * first N elements correspond to coefficients, * last element corresponds to the right part. All elements of C (including right part) must be finite. CT - type of constraints, array[K]: * if CT[i]>0, then I-th constraint is C[i,*]*x >= C[i,n+1] * if CT[i]=0, then I-th constraint is C[i,*]*x = C[i,n+1] * if CT[i]<0, then I-th constraint is C[i,*]*x <= C[i,n+1] K - number of equality/inequality constraints, K>=0: * if given, only leading K elements of C/CT are used * if not given, automatically determined from sizes of C/CT IMPORTANT: if you have linear constraints, it is strongly recommended to set scale of variables with minlmsetscale(). QP solver which is used to calculate linearly constrained steps heavily relies on good scaling of input problems. IMPORTANT: solvers created with minlmcreatefgh() do not support linear constraints. NOTE: linear (non-bound) constraints are satisfied only approximately - there always exists some violation due to numerical errors and algorithmic limitations. NOTE: general linear constraints add significant overhead to solution process. Although solver performs roughly same amount of iterations (when compared with similar box-only constrained problem), each iteration now involves solution of linearly constrained QP subproblem, which requires ~3-5 times more Cholesky decompositions. Thus, if you can reformulate your problem in such way this it has only box constraints, it may be beneficial to do so. -- ALGLIB -- Copyright 14.01.2011 by Bochkanov Sergey *************************************************************************/ void minlmsetlc(const minlmstate &state, const real_2d_array &c, const integer_1d_array &ct, const ae_int_t k) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minlmsetlc(const_cast(state.c_ptr()), const_cast(c.c_ptr()), const_cast(ct.c_ptr()), k, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function sets general linear constraints for LM optimizer Linear constraints are inactive by default (after initial creation). 
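As an illustration of the C/CT format (a hypothetical constraint x0 + x1 <= 4 for a two-variable problem, using the overload that deduces K from the number of rows):

real_2d_array c = "[[1, 1, 4]]";   // coefficients 1*x0 + 1*x1, right part 4
integer_1d_array ct = "[-1]";      // CT[0]<0, i.e. a "<=" inequality
minlmsetlc(state, c, ct);          // K deduced from c.rows()

Once set, such linear constraints stay attached to the optimizer state.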
They are preserved until explicitly turned off with another minlmsetlc() call. INPUT PARAMETERS: State - structure stores algorithm state C - linear constraints, array[K,N+1]. Each row of C represents one constraint, either equality or inequality (see below): * first N elements correspond to coefficients, * last element corresponds to the right part. All elements of C (including right part) must be finite. CT - type of constraints, array[K]: * if CT[i]>0, then I-th constraint is C[i,*]*x >= C[i,n+1] * if CT[i]=0, then I-th constraint is C[i,*]*x = C[i,n+1] * if CT[i]<0, then I-th constraint is C[i,*]*x <= C[i,n+1] K - number of equality/inequality constraints, K>=0: * if given, only leading K elements of C/CT are used * if not given, automatically determined from sizes of C/CT IMPORTANT: if you have linear constraints, it is strongly recommended to set scale of variables with minlmsetscale(). QP solver which is used to calculate linearly constrained steps heavily relies on good scaling of input problems. IMPORTANT: solvers created with minlmcreatefgh() do not support linear constraints. NOTE: linear (non-bound) constraints are satisfied only approximately - there always exists some violation due to numerical errors and algorithmic limitations. NOTE: general linear constraints add significant overhead to solution process. Although solver performs roughly same amount of iterations (when compared with similar box-only constrained problem), each iteration now involves solution of linearly constrained QP subproblem, which requires ~3-5 times more Cholesky decompositions. Thus, if you can reformulate your problem in such way this it has only box constraints, it may be beneficial to do so. -- ALGLIB -- Copyright 14.01.2011 by Bochkanov Sergey *************************************************************************/ void minlmsetlc(const minlmstate &state, const real_2d_array &c, const integer_1d_array &ct) { alglib_impl::ae_state _alglib_env_state; ae_int_t k; if( (c.rows()!=ct.length())) throw ap_error("Error while calling 'minlmsetlc': looks like one of arguments has wrong size"); k = c.rows(); alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minlmsetlc(const_cast(state.c_ptr()), const_cast(c.c_ptr()), const_cast(ct.c_ptr()), k, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function is used to change acceleration settings You can choose between three acceleration strategies: * AccType=0, no acceleration. * AccType=1, secant updates are used to update quadratic model after each iteration. After fixed number of iterations (or after model breakdown) we recalculate quadratic model using analytic Jacobian or finite differences. Number of secant-based iterations depends on optimization settings: about 3 iterations - when we have analytic Jacobian, up to 2*N iterations - when we use finite differences to calculate Jacobian. AccType=1 is recommended when Jacobian calculation cost is prohibitively high (several Mx1 function vector calculations followed by several NxN Cholesky factorizations are faster than calculation of one M*N Jacobian). It should also be used when we have no Jacobian, because finite difference approximation takes too much time to compute. 
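For example, continuing the earlier sketch, secant-based acceleration is requested with a single call made before optimization starts:

minlmsetacctype(state, 1);   // AccType=1: secant updates between Jacobian re-evaluations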
Table below list optimization protocols (XYZ protocol corresponds to MinLMCreateXYZ) and acceleration types they support (and use by default). ACCELERATION TYPES SUPPORTED BY OPTIMIZATION PROTOCOLS: protocol 0 1 comment V + + VJ + + FGH + DEFAULT VALUES: protocol 0 1 comment V x without acceleration it is so slooooooooow VJ x FGH x NOTE: this function should be called before optimization. Attempt to call it during algorithm iterations may result in unexpected behavior. NOTE: attempt to call this function with unsupported protocol/acceleration combination will result in exception being thrown. -- ALGLIB -- Copyright 14.10.2010 by Bochkanov Sergey *************************************************************************/ void minlmsetacctype(const minlmstate &state, const ae_int_t acctype) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minlmsetacctype(const_cast(state.c_ptr()), acctype, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function provides reverse communication interface Reverse communication interface is not documented or recommended to use. See below for functions which provide better documented API *************************************************************************/ bool minlmiteration(const minlmstate &state) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { ae_bool result = alglib_impl::minlmiteration(const_cast(state.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void minlmoptimize(minlmstate &state, void (*fvec)(const real_1d_array &x, real_1d_array &fi, void *ptr), void (*rep)(const real_1d_array &x, double func, void *ptr), void *ptr) { alglib_impl::ae_state _alglib_env_state; if( fvec==NULL ) throw ap_error("ALGLIB: error in 'minlmoptimize()' (fvec is NULL)"); alglib_impl::ae_state_init(&_alglib_env_state); try { while( alglib_impl::minlmiteration(state.c_ptr(), &_alglib_env_state) ) { if( state.needfi ) { fvec(state.x, state.fi, ptr); continue; } if( state.xupdated ) { if( rep!=NULL ) rep(state.x, state.f, ptr); continue; } throw ap_error("ALGLIB: error in 'minlmoptimize' (some derivatives were not provided?)"); } alglib_impl::ae_state_clear(&_alglib_env_state); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void minlmoptimize(minlmstate &state, void (*fvec)(const real_1d_array &x, real_1d_array &fi, void *ptr), void (*jac)(const real_1d_array &x, real_1d_array &fi, real_2d_array &jac, void *ptr), void (*rep)(const real_1d_array &x, double func, void *ptr), void *ptr) { alglib_impl::ae_state _alglib_env_state; if( fvec==NULL ) throw ap_error("ALGLIB: error in 'minlmoptimize()' (fvec is NULL)"); if( jac==NULL ) throw ap_error("ALGLIB: error in 'minlmoptimize()' (jac is NULL)"); alglib_impl::ae_state_init(&_alglib_env_state); try { while( alglib_impl::minlmiteration(state.c_ptr(), &_alglib_env_state) ) { if( state.needfi ) { fvec(state.x, state.fi, ptr); continue; } if( state.needfij ) { jac(state.x, state.fi, state.j, ptr); continue; } if( state.xupdated ) { if( rep!=NULL ) rep(state.x, state.f, ptr); continue; } throw ap_error("ALGLIB: error in 'minlmoptimize' (some 
derivatives were not provided?)"); } alglib_impl::ae_state_clear(&_alglib_env_state); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void minlmoptimize(minlmstate &state, void (*func)(const real_1d_array &x, double &func, void *ptr), void (*grad)(const real_1d_array &x, double &func, real_1d_array &grad, void *ptr), void (*hess)(const real_1d_array &x, double &func, real_1d_array &grad, real_2d_array &hess, void *ptr), void (*rep)(const real_1d_array &x, double func, void *ptr), void *ptr) { alglib_impl::ae_state _alglib_env_state; if( func==NULL ) throw ap_error("ALGLIB: error in 'minlmoptimize()' (func is NULL)"); if( grad==NULL ) throw ap_error("ALGLIB: error in 'minlmoptimize()' (grad is NULL)"); if( hess==NULL ) throw ap_error("ALGLIB: error in 'minlmoptimize()' (hess is NULL)"); alglib_impl::ae_state_init(&_alglib_env_state); try { while( alglib_impl::minlmiteration(state.c_ptr(), &_alglib_env_state) ) { if( state.needf ) { func(state.x, state.f, ptr); continue; } if( state.needfg ) { grad(state.x, state.f, state.g, ptr); continue; } if( state.needfgh ) { hess(state.x, state.f, state.g, state.h, ptr); continue; } if( state.xupdated ) { if( rep!=NULL ) rep(state.x, state.f, ptr); continue; } throw ap_error("ALGLIB: error in 'minlmoptimize' (some derivatives were not provided?)"); } alglib_impl::ae_state_clear(&_alglib_env_state); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void minlmoptimize(minlmstate &state, void (*func)(const real_1d_array &x, double &func, void *ptr), void (*jac)(const real_1d_array &x, real_1d_array &fi, real_2d_array &jac, void *ptr), void (*rep)(const real_1d_array &x, double func, void *ptr), void *ptr) { alglib_impl::ae_state _alglib_env_state; if( func==NULL ) throw ap_error("ALGLIB: error in 'minlmoptimize()' (func is NULL)"); if( jac==NULL ) throw ap_error("ALGLIB: error in 'minlmoptimize()' (jac is NULL)"); alglib_impl::ae_state_init(&_alglib_env_state); try { while( alglib_impl::minlmiteration(state.c_ptr(), &_alglib_env_state) ) { if( state.needf ) { func(state.x, state.f, ptr); continue; } if( state.needfij ) { jac(state.x, state.fi, state.j, ptr); continue; } if( state.xupdated ) { if( rep!=NULL ) rep(state.x, state.f, ptr); continue; } throw ap_error("ALGLIB: error in 'minlmoptimize' (some derivatives were not provided?)"); } alglib_impl::ae_state_clear(&_alglib_env_state); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void minlmoptimize(minlmstate &state, void (*func)(const real_1d_array &x, double &func, void *ptr), void (*grad)(const real_1d_array &x, double &func, real_1d_array &grad, void *ptr), void (*jac)(const real_1d_array &x, real_1d_array &fi, real_2d_array &jac, void *ptr), void (*rep)(const real_1d_array &x, double func, void *ptr), void *ptr) { alglib_impl::ae_state _alglib_env_state; if( func==NULL ) throw ap_error("ALGLIB: error in 'minlmoptimize()' (func is NULL)"); if( grad==NULL ) throw ap_error("ALGLIB: error in 'minlmoptimize()' (grad is NULL)"); if( jac==NULL ) throw ap_error("ALGLIB: error in 'minlmoptimize()' (jac is NULL)"); alglib_impl::ae_state_init(&_alglib_env_state); try { while( alglib_impl::minlmiteration(state.c_ptr(), &_alglib_env_state) ) { if( state.needf ) { func(state.x, state.f, ptr); continue; } if( state.needfg ) { grad(state.x, state.f, state.g, ptr); continue; } if( state.needfij ) { jac(state.x, state.fi, state.j, ptr); continue; } if( state.xupdated ) { if( rep!=NULL ) rep(state.x, 
state.f, ptr); continue; } throw ap_error("ALGLIB: error in 'minlmoptimize' (some derivatives were not provided?)"); } alglib_impl::ae_state_clear(&_alglib_env_state); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Levenberg-Marquardt algorithm results INPUT PARAMETERS: State - algorithm state OUTPUT PARAMETERS: X - array[0..N-1], solution Rep - optimization report; includes termination codes and additional information. Termination codes are listed below, see comments for this structure for more info. Termination code is stored in rep.terminationtype field: * -8 optimizer detected NAN/INF values either in the function itself, or in its Jacobian * -7 derivative correctness check failed; see rep.funcidx, rep.varidx for more information. * -3 constraints are inconsistent * 2 relative step is no more than EpsX. * 5 MaxIts steps was taken * 7 stopping conditions are too stringent, further improvement is impossible * 8 terminated by user who called minlmrequesttermination(). X contains point which was "current accepted" when termination request was submitted. -- ALGLIB -- Copyright 10.03.2009 by Bochkanov Sergey *************************************************************************/ void minlmresults(const minlmstate &state, real_1d_array &x, minlmreport &rep) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minlmresults(const_cast(state.c_ptr()), const_cast(x.c_ptr()), const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Levenberg-Marquardt algorithm results Buffered implementation of MinLMResults(), which uses pre-allocated buffer to store X[]. If buffer size is too small, it resizes buffer. It is intended to be used in the inner cycles of performance critical algorithms where array reallocation penalty is too large to be ignored. -- ALGLIB -- Copyright 10.03.2009 by Bochkanov Sergey *************************************************************************/ void minlmresultsbuf(const minlmstate &state, real_1d_array &x, minlmreport &rep) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minlmresultsbuf(const_cast(state.c_ptr()), const_cast(x.c_ptr()), const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This subroutine restarts LM algorithm from new point. All optimization parameters are left unchanged. This function allows to solve multiple optimization problems (which must have same number of dimensions) without object reallocation penalty. INPUT PARAMETERS: State - structure used for reverse communication previously allocated with MinLMCreateXXX call. X - new starting point. 
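For example (continuing the earlier sketch, with an illustrative second starting point), the same solver object can be reused for another problem of the same size:

real_1d_array x2 = "[5, -5]";                    // new starting point, same N
minlmrestartfrom(state, x2);
minlmoptimize(state, sample_fvec, NULL, NULL);   // second solve, no reallocation of State
minlmresults(state, x2, rep);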
-- ALGLIB -- Copyright 30.07.2010 by Bochkanov Sergey *************************************************************************/ void minlmrestartfrom(const minlmstate &state, const real_1d_array &x) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minlmrestartfrom(const_cast(state.c_ptr()), const_cast(x.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This subroutine submits request for termination of running optimizer. It should be called from user-supplied callback when user decides that it is time to "smoothly" terminate optimization process. As result, optimizer stops at point which was "current accepted" when termination request was submitted and returns error code 8 (successful termination). INPUT PARAMETERS: State - optimizer structure NOTE: after request for termination optimizer may perform several additional calls to user-supplied callbacks. It does NOT guarantee to stop immediately - it just guarantees that these additional calls will be discarded later. NOTE: calling this function on optimizer which is NOT running will have no effect. NOTE: multiple calls to this function are possible. First call is counted, subsequent calls are silently ignored. -- ALGLIB -- Copyright 08.10.2014 by Bochkanov Sergey *************************************************************************/ void minlmrequesttermination(const minlmstate &state) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minlmrequesttermination(const_cast(state.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This is obsolete function. Since ALGLIB 3.3 it is equivalent to MinLMCreateVJ(). -- ALGLIB -- Copyright 30.03.2009 by Bochkanov Sergey *************************************************************************/ void minlmcreatevgj(const ae_int_t n, const ae_int_t m, const real_1d_array &x, minlmstate &state) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minlmcreatevgj(n, m, const_cast(x.c_ptr()), const_cast(state.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This is obsolete function. Since ALGLIB 3.3 it is equivalent to MinLMCreateVJ(). -- ALGLIB -- Copyright 30.03.2009 by Bochkanov Sergey *************************************************************************/ void minlmcreatevgj(const ae_int_t m, const real_1d_array &x, minlmstate &state) { alglib_impl::ae_state _alglib_env_state; ae_int_t n; n = x.length(); alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minlmcreatevgj(n, m, const_cast(x.c_ptr()), const_cast(state.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This is obsolete function. 
Since ALGLIB 3.3 it is equivalent to MinLMCreateFJ(). -- ALGLIB -- Copyright 30.03.2009 by Bochkanov Sergey *************************************************************************/ void minlmcreatefgj(const ae_int_t n, const ae_int_t m, const real_1d_array &x, minlmstate &state) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minlmcreatefgj(n, m, const_cast(x.c_ptr()), const_cast(state.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This is obsolete function. Since ALGLIB 3.3 it is equivalent to MinLMCreateFJ(). -- ALGLIB -- Copyright 30.03.2009 by Bochkanov Sergey *************************************************************************/ void minlmcreatefgj(const ae_int_t m, const real_1d_array &x, minlmstate &state) { alglib_impl::ae_state _alglib_env_state; ae_int_t n; n = x.length(); alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minlmcreatefgj(n, m, const_cast(x.c_ptr()), const_cast(state.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function is considered obsolete since ALGLIB 3.1.0 and is present for backward compatibility only. We recommend to use MinLMCreateVJ, which provides similar, but more consistent and feature-rich interface. -- ALGLIB -- Copyright 30.03.2009 by Bochkanov Sergey *************************************************************************/ void minlmcreatefj(const ae_int_t n, const ae_int_t m, const real_1d_array &x, minlmstate &state) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minlmcreatefj(n, m, const_cast(x.c_ptr()), const_cast(state.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function is considered obsolete since ALGLIB 3.1.0 and is present for backward compatibility only. We recommend to use MinLMCreateVJ, which provides similar, but more consistent and feature-rich interface. 
-- ALGLIB -- Copyright 30.03.2009 by Bochkanov Sergey *************************************************************************/ void minlmcreatefj(const ae_int_t m, const real_1d_array &x, minlmstate &state) { alglib_impl::ae_state _alglib_env_state; ae_int_t n; n = x.length(); alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minlmcreatefj(n, m, const_cast(x.c_ptr()), const_cast(state.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This subroutine turns on verification of the user-supplied analytic gradient: * user calls this subroutine before optimization begins * MinLMOptimize() is called * prior to actual optimization, for each function Fi and each component of parameters being optimized X[j] algorithm performs following steps: * two trial steps are made to X[j]-TestStep*S[j] and X[j]+TestStep*S[j], where X[j] is j-th parameter and S[j] is a scale of j-th parameter * if needed, steps are bounded with respect to constraints on X[] * Fi(X) is evaluated at these trial points * we perform one more evaluation in the middle point of the interval * we build cubic model using function values and derivatives at trial points and we compare its prediction with actual value in the middle point * in case difference between prediction and actual value is higher than some predetermined threshold, algorithm stops with completion code -7; Rep.VarIdx is set to index of the parameter with incorrect derivative, Rep.FuncIdx is set to index of the function. * after verification is over, algorithm proceeds to the actual optimization. NOTE 1: verification needs N (parameters count) Jacobian evaluations. It is very costly and you should use it only for low dimensional problems, when you want to be sure that you've correctly calculated analytic derivatives. You should not use it in the production code (unless you want to check derivatives provided by some third party). NOTE 2: you should carefully choose TestStep. Value which is too large (so large that function behaviour is significantly non-cubic) will lead to false alarms. You may use different step for different parameters by means of setting scale with MinLMSetScale(). NOTE 3: this function may lead to false positives. In case it reports that I-th derivative was calculated incorrectly, you may decrease test step and try one more time - maybe your function changes too sharply and your step is too large for such rapidly chanding function. 
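A typical use of this check, continuing the earlier sketches (sample_fvec and sample_jac are hypothetical callbacks for a solver created with MinLMCreateVJ(); the TestStep value is illustrative, see the parameter description below):

minlmsetgradientcheck(state, 1.0e-3);                 // verify user-supplied Jacobian, TestStep=0.001
minlmoptimize(state, sample_fvec, sample_jac, NULL, NULL);
minlmresults(state, x, rep);
if( rep.terminationtype==-7 )
    printf("suspicious derivative: function %d, variable %d\n",
           int(rep.funcidx), int(rep.varidx));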
INPUT PARAMETERS: State - structure used to store algorithm state TestStep - verification step: * TestStep=0 turns verification off * TestStep>0 activates verification -- ALGLIB -- Copyright 15.06.2012 by Bochkanov Sergey *************************************************************************/ void minlmsetgradientcheck(const minlmstate &state, const double teststep) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::minlmsetgradientcheck(const_cast(state.c_ptr()), teststep, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } } ///////////////////////////////////////////////////////////////////////// // // THIS SECTION CONTAINS IMPLEMENTATION OF COMPUTATIONAL CORE // ///////////////////////////////////////////////////////////////////////// namespace alglib_impl { static ae_int_t cqmodels_newtonrefinementits = 3; static ae_bool cqmodels_cqmrebuild(convexquadraticmodel* s, ae_state *_state); static void cqmodels_cqmsolveea(convexquadraticmodel* s, /* Real */ ae_vector* x, /* Real */ ae_vector* tmp, ae_state *_state); static double optserv_feasibilityerror(/* Real */ ae_matrix* ce, /* Real */ ae_vector* x, ae_int_t nmain, ae_int_t nslack, ae_int_t k, ae_state *_state); static void snnls_funcgradu(snnlssolver* s, /* Real */ ae_vector* x, /* Real */ ae_vector* r, /* Real */ ae_vector* g, double* f, ae_state *_state); static void snnls_func(snnlssolver* s, /* Real */ ae_vector* x, double* f, ae_state *_state); static void snnls_trdprepare(snnlssolver* s, /* Real */ ae_vector* x, /* Real */ ae_vector* diag, double lambdav, /* Real */ ae_vector* trdd, /* Real */ ae_matrix* trda, /* Real */ ae_vector* tmp0, /* Real */ ae_vector* tmp1, /* Real */ ae_vector* tmp2, /* Real */ ae_matrix* tmplq, ae_state *_state); static void snnls_trdsolve(/* Real */ ae_vector* trdd, /* Real */ ae_matrix* trda, ae_int_t ns, ae_int_t nd, /* Real */ ae_vector* d, ae_state *_state); static void snnls_trdfixvariable(/* Real */ ae_vector* trdd, /* Real */ ae_matrix* trda, ae_int_t ns, ae_int_t nd, ae_int_t idx, /* Real */ ae_vector* tmp, ae_state *_state); static void sactivesets_constraineddescent(sactiveset* state, /* Real */ ae_vector* g, /* Real */ ae_vector* h, /* Real */ ae_matrix* ha, ae_bool normalize, /* Real */ ae_vector* d, ae_state *_state); static void sactivesets_reactivateconstraints(sactiveset* state, /* Real */ ae_vector* gc, /* Real */ ae_vector* h, ae_state *_state); static ae_int_t qqpsolver_quickqprestartcg = 50; static double qqpsolver_penaltyfactor = 50.0; static double qqpsolver_regz = 1.0E-9; static double qqpsolver_projectedtargetfunction(qqpbuffers* sstate, /* Real */ ae_vector* x, /* Real */ ae_vector* d, double stp, /* Real */ ae_vector* tmp0, ae_state *_state); static void qqpsolver_targetgradient(qqpbuffers* sstate, /* Real */ ae_vector* x, /* Real */ ae_vector* g, ae_state *_state); static void qqpsolver_quadraticmodel(qqpbuffers* sstate, /* Real */ ae_vector* x, /* Real */ ae_vector* d, /* Real */ ae_vector* g, double* d1, ae_int_t* d1est, double* d2, ae_int_t* d2est, ae_state *_state); static void qqpsolver_findbeststepandmove(qqpbuffers* sstate, sactiveset* sas, /* Real */ ae_vector* d, double stp, ae_bool needact, ae_int_t cidx, double cval, /* Real */ ae_vector* addsteps, ae_int_t addstepscnt, /* Boolean */ ae_vector* activated, /* Real */ ae_vector* tmp0, ae_state *_state); static ae_bool 
qqpsolver_cnewtonbuild(qqpbuffers* sstate, ae_int_t sparsesolver, ae_int_t* ncholesky, ae_state *_state); static ae_bool qqpsolver_cnewtonupdate(qqpbuffers* sstate, qqpsettings* settings, ae_int_t* ncupdates, ae_state *_state); static ae_bool qqpsolver_cnewtonstep(qqpbuffers* sstate, qqpsettings* settings, /* Real */ ae_vector* gc, ae_state *_state); static double minlbfgs_gtol = 0.4; static void minlbfgs_clearrequestfields(minlbfgsstate* state, ae_state *_state); static void qpdenseaulsolver_generateexmodel(/* Real */ ae_matrix* sclsfta, /* Real */ ae_vector* sclsftb, ae_int_t nmain, /* Real */ ae_vector* sclsftbndl, /* Boolean */ ae_vector* sclsfthasbndl, /* Real */ ae_vector* sclsftbndu, /* Boolean */ ae_vector* sclsfthasbndu, /* Real */ ae_matrix* sclsftcleic, ae_int_t sclsftnec, ae_int_t sclsftnic, /* Real */ ae_vector* nulc, double rho, /* Real */ ae_matrix* exa, /* Real */ ae_vector* exb, /* Real */ ae_vector* exbndl, /* Real */ ae_vector* exbndu, /* Real */ ae_matrix* tmp2, ae_state *_state); static void qpdenseaulsolver_generateexinitialpoint(/* Real */ ae_vector* sclsftxc, ae_int_t nmain, ae_int_t nslack, /* Real */ ae_vector* exxc, ae_state *_state); static void qpdenseaulsolver_updatelagrangemultipliers(/* Real */ ae_matrix* sclsfta, /* Real */ ae_vector* sclsftb, ae_int_t nmain, /* Real */ ae_vector* sclsftbndl, /* Boolean */ ae_vector* sclsfthasbndl, /* Real */ ae_vector* sclsftbndu, /* Boolean */ ae_vector* sclsfthasbndu, /* Real */ ae_matrix* sclsftcleic, ae_int_t sclsftnec, ae_int_t sclsftnic, /* Real */ ae_vector* exxc, /* Real */ ae_vector* nulcest, qpdenseaulbuffers* buffers, ae_state *_state); static ae_int_t qpcholeskysolver_maxlagrangeits = 10; static ae_int_t qpcholeskysolver_maxbadnewtonits = 7; static double qpcholeskysolver_penaltyfactor = 100.0; static double qpcholeskysolver_modelvalue(convexquadraticmodel* a, /* Real */ ae_vector* b, /* Real */ ae_vector* xc, ae_int_t n, /* Real */ ae_vector* tmp, ae_state *_state); static ae_int_t qpcholeskysolver_boundedstepandactivation(sactiveset* sas, /* Real */ ae_vector* xn, ae_int_t n, /* Real */ ae_vector* buf, ae_state *_state); static ae_bool qpcholeskysolver_constrainedoptimum(sactiveset* sas, convexquadraticmodel* a, double anorm, /* Real */ ae_vector* b, /* Real */ ae_vector* xn, ae_int_t n, /* Real */ ae_vector* tmp, /* Boolean */ ae_vector* tmpb, /* Real */ ae_vector* lagrangec, ae_state *_state); static ae_int_t mincg_rscountdownlen = 10; static double mincg_gtol = 0.3; static void mincg_clearrequestfields(mincgstate* state, ae_state *_state); static void mincg_preconditionedmultiply(mincgstate* state, /* Real */ ae_vector* x, /* Real */ ae_vector* work0, /* Real */ ae_vector* work1, ae_state *_state); static double mincg_preconditionedmultiply2(mincgstate* state, /* Real */ ae_vector* x, /* Real */ ae_vector* y, /* Real */ ae_vector* work0, /* Real */ ae_vector* work1, ae_state *_state); static void mincg_mincginitinternal(ae_int_t n, double diffstep, mincgstate* state, ae_state *_state); static double minbleic_gtol = 0.4; static double minbleic_maxnonmonotoniclen = 1.0E-7; static double minbleic_nmstol = 1.0E2; static double minbleic_initialdecay = 0.5; static double minbleic_mindecay = 0.1; static double minbleic_decaycorrection = 0.8; static double minbleic_penaltyfactor = 100; static void minbleic_clearrequestfields(minbleicstate* state, ae_state *_state); static void minbleic_minbleicinitinternal(ae_int_t n, /* Real */ ae_vector* x, double diffstep, minbleicstate* state, ae_state *_state); static void 
minbleic_updateestimateofgoodstep(double* estimate, double newstep, ae_state *_state); static double minbleic_feasibilityerror(/* Real */ ae_vector* x, /* Real */ ae_vector* s, ae_int_t n, /* Real */ ae_matrix* cleic, ae_int_t nec, ae_int_t nic, ae_state *_state); static double minnlc_aulmaxgrowth = 10.0; static ae_int_t minnlc_lbfgsfactor = 10; static double minnlc_hessesttol = 1.0E-6; static double minnlc_initgamma = 1.0E-6; static double minnlc_regprec = 1.0E-6; static void minnlc_clearrequestfields(minnlcstate* state, ae_state *_state); static void minnlc_minnlcinitinternal(ae_int_t n, /* Real */ ae_vector* x, double diffstep, minnlcstate* state, ae_state *_state); static void minnlc_clearpreconditioner(minlbfgsstate* auloptimizer, ae_state *_state); static void minnlc_updatepreconditioner(ae_int_t prectype, ae_int_t updatefreq, ae_int_t* preccounter, minlbfgsstate* auloptimizer, /* Real */ ae_vector* x, double rho, double gammak, /* Real */ ae_vector* bndl, /* Boolean */ ae_vector* hasbndl, /* Real */ ae_vector* bndu, /* Boolean */ ae_vector* hasbndu, /* Real */ ae_vector* nubc, /* Real */ ae_matrix* cleic, /* Real */ ae_vector* nulc, /* Real */ ae_vector* fi, /* Real */ ae_matrix* jac, /* Real */ ae_vector* nunlc, /* Real */ ae_vector* bufd, /* Real */ ae_vector* bufc, /* Real */ ae_matrix* bufw, /* Real */ ae_matrix* bufz, /* Real */ ae_vector* tmp0, ae_int_t n, ae_int_t nec, ae_int_t nic, ae_int_t ng, ae_int_t nh, ae_state *_state); static void minnlc_penaltybc(/* Real */ ae_vector* x, /* Real */ ae_vector* bndl, /* Boolean */ ae_vector* hasbndl, /* Real */ ae_vector* bndu, /* Boolean */ ae_vector* hasbndu, /* Real */ ae_vector* nubc, ae_int_t n, double rho, double stabilizingpoint, double* f, /* Real */ ae_vector* g, ae_state *_state); static void minnlc_penaltylc(/* Real */ ae_vector* x, /* Real */ ae_matrix* cleic, /* Real */ ae_vector* nulc, ae_int_t n, ae_int_t nec, ae_int_t nic, double rho, double stabilizingpoint, double* f, /* Real */ ae_vector* g, ae_state *_state); static void minnlc_penaltynlc(/* Real */ ae_vector* fi, /* Real */ ae_matrix* j, /* Real */ ae_vector* nunlc, ae_int_t n, ae_int_t ng, ae_int_t nh, double rho, double stabilizingpoint, double* f, /* Real */ ae_vector* g, ae_state *_state); static ae_bool minnlc_auliteration(minnlcstate* state, ae_state *_state); static double minbc_gtol = 0.4; static double minbc_maxnonmonotoniclen = 1.0E-5; static double minbc_initialdecay = 0.5; static double minbc_mindecay = 0.1; static double minbc_decaycorrection = 0.8; static void minbc_clearrequestfields(minbcstate* state, ae_state *_state); static void minbc_minbcinitinternal(ae_int_t n, /* Real */ ae_vector* x, double diffstep, minbcstate* state, ae_state *_state); static void minbc_updateestimateofgoodstep(double* estimate, double newstep, ae_state *_state); static void minns_clearrequestfields(minnsstate* state, ae_state *_state); static void minns_minnsinitinternal(ae_int_t n, /* Real */ ae_vector* x, double diffstep, minnsstate* state, ae_state *_state); static ae_bool minns_agsiteration(minnsstate* state, ae_state *_state); static void minns_generatemeritfunction(minnsstate* state, ae_int_t sampleidx, ae_state *_state); static void minns_unscalepointbc(minnsstate* state, /* Real */ ae_vector* x, ae_state *_state); static void minns_solveqp(/* Real */ ae_matrix* sampleg, /* Real */ ae_vector* diagh, ae_int_t nsample, ae_int_t nvars, /* Real */ ae_vector* coeffs, ae_int_t* dbgncholesky, minnsqp* state, ae_state *_state); static void minns_qpcalculategradfunc(/* Real 
*/ ae_matrix* sampleg, /* Real */ ae_vector* diagh, ae_int_t nsample, ae_int_t nvars, /* Real */ ae_vector* coeffs, /* Real */ ae_vector* g, double* f, /* Real */ ae_vector* tmp, ae_state *_state); static void minns_qpcalculatefunc(/* Real */ ae_matrix* sampleg, /* Real */ ae_vector* diagh, ae_int_t nsample, ae_int_t nvars, /* Real */ ae_vector* coeffs, double* f, /* Real */ ae_vector* tmp, ae_state *_state); static void minns_qpsolveu(/* Real */ ae_matrix* a, ae_int_t n, /* Real */ ae_vector* x, ae_state *_state); static void minns_qpsolveut(/* Real */ ae_matrix* a, ae_int_t n, /* Real */ ae_vector* x, ae_state *_state); static ae_int_t mincomp_n1 = 2; static ae_int_t mincomp_n2 = 2; static double mincomp_stpmin = 1.0E-300; static double mincomp_gtol = 0.3; static double mincomp_gpaftol = 0.0001; static double mincomp_gpadecay = 0.5; static double mincomp_asarho = 0.5; static double mincomp_asaboundedantigradnorm(minasastate* state, ae_state *_state); static double mincomp_asaginorm(minasastate* state, ae_state *_state); static double mincomp_asad1norm(minasastate* state, ae_state *_state); static ae_bool mincomp_asauisempty(minasastate* state, ae_state *_state); static void mincomp_clearrequestfields(minasastate* state, ae_state *_state); static double minlm_lambdaup = 2.0; static double minlm_lambdadown = 0.33; static double minlm_suspiciousnu = 16; static ae_int_t minlm_smallmodelage = 3; static ae_int_t minlm_additers = 5; static void minlm_lmprepare(ae_int_t n, ae_int_t m, ae_bool havegrad, minlmstate* state, ae_state *_state); static void minlm_clearrequestfields(minlmstate* state, ae_state *_state); static ae_bool minlm_increaselambda(double* lambdav, double* nu, ae_state *_state); static void minlm_decreaselambda(double* lambdav, double* nu, ae_state *_state); static ae_int_t minlm_checkdecrease(/* Real */ ae_matrix* quadraticmodel, /* Real */ ae_vector* gbase, double fbase, ae_int_t n, /* Real */ ae_vector* deltax, double fnew, double* lambdav, double* nu, ae_state *_state); static ae_bool minlm_minlmstepfinderinit(minlmstepfinder* state, ae_int_t n, ae_int_t m, ae_int_t maxmodelage, ae_bool hasfi, /* Real */ ae_vector* xbase, /* Real */ ae_vector* bndl, /* Real */ ae_vector* bndu, /* Real */ ae_matrix* cleic, ae_int_t nec, ae_int_t nic, /* Real */ ae_vector* s, double stpmax, double epsx, ae_state *_state); static void minlm_minlmstepfinderstart(minlmstepfinder* state, /* Real */ ae_matrix* quadraticmodel, /* Real */ ae_vector* gbase, double fbase, /* Real */ ae_vector* xbase, /* Real */ ae_vector* fibase, ae_int_t modelage, ae_state *_state); static ae_bool minlm_minlmstepfinderiteration(minlmstepfinder* state, double* lambdav, double* nu, /* Real */ ae_vector* xnew, /* Real */ ae_vector* deltax, ae_bool* deltaxready, /* Real */ ae_vector* deltaf, ae_bool* deltafready, ae_int_t* iflag, double* fnew, ae_int_t* ncholesky, ae_state *_state); /************************************************************************* This subroutine is used to initialize CQM. By default, empty NxN model is generated, with Alpha=Lambda=Theta=0.0 and zero b. Previously allocated buffer variables are reused as much as possible. 
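NOTE: as can be seen from CQMEval() later in this section, the model maintained by this structure has the form f(x) = 0.5*Alpha*x'*A*x + 0.5*Tau*x'*diag(D)*x + 0.5*Theta*(Q*x-r)'*(Q*x-r) + b'*x; the three quadratic terms are set by CQMSetA(), CQMSetD() and CQMSetQ(), the linear term by CQMSetB().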
-- ALGLIB -- Copyright 12.06.2012 by Bochkanov Sergey *************************************************************************/ void cqminit(ae_int_t n, convexquadraticmodel* s, ae_state *_state) { ae_int_t i; s->n = n; s->k = 0; s->nfree = n; s->ecakind = -1; s->alpha = 0.0; s->tau = 0.0; s->theta = 0.0; s->ismaintermchanged = ae_true; s->issecondarytermchanged = ae_true; s->islineartermchanged = ae_true; s->isactivesetchanged = ae_true; bvectorsetlengthatleast(&s->activeset, n, _state); rvectorsetlengthatleast(&s->xc, n, _state); rvectorsetlengthatleast(&s->eb, n, _state); rvectorsetlengthatleast(&s->tq1, n, _state); rvectorsetlengthatleast(&s->txc, n, _state); rvectorsetlengthatleast(&s->tb, n, _state); rvectorsetlengthatleast(&s->b, s->n, _state); rvectorsetlengthatleast(&s->tk1, s->n, _state); for(i=0; i<=n-1; i++) { s->activeset.ptr.p_bool[i] = ae_false; s->xc.ptr.p_double[i] = 0.0; s->b.ptr.p_double[i] = 0.0; } } /************************************************************************* This subroutine changes main quadratic term of the model. INPUT PARAMETERS: S - model A - NxN matrix, only upper or lower triangle is referenced IsUpper - True, when matrix is stored in upper triangle Alpha - multiplier; when Alpha=0, A is not referenced at all -- ALGLIB -- Copyright 12.06.2012 by Bochkanov Sergey *************************************************************************/ void cqmseta(convexquadraticmodel* s, /* Real */ ae_matrix* a, ae_bool isupper, double alpha, ae_state *_state) { ae_int_t i; ae_int_t j; double v; ae_assert(ae_isfinite(alpha, _state)&&ae_fp_greater_eq(alpha,(double)(0)), "CQMSetA: Alpha<0 or is not finite number", _state); ae_assert(ae_fp_eq(alpha,(double)(0))||isfinitertrmatrix(a, s->n, isupper, _state), "CQMSetA: A is not finite NxN matrix", _state); s->alpha = alpha; if( ae_fp_greater(alpha,(double)(0)) ) { rmatrixsetlengthatleast(&s->a, s->n, s->n, _state); rmatrixsetlengthatleast(&s->ecadense, s->n, s->n, _state); rmatrixsetlengthatleast(&s->tq2dense, s->n, s->n, _state); for(i=0; i<=s->n-1; i++) { for(j=i; j<=s->n-1; j++) { if( isupper ) { v = a->ptr.pp_double[i][j]; } else { v = a->ptr.pp_double[j][i]; } s->a.ptr.pp_double[i][j] = v; s->a.ptr.pp_double[j][i] = v; } } } s->ismaintermchanged = ae_true; } /************************************************************************* This subroutine changes main quadratic term of the model. INPUT PARAMETERS: S - model A - possibly preallocated buffer OUTPUT PARAMETERS: A - NxN matrix, full matrix is returned. Zero matrix is returned if model is empty. -- ALGLIB -- Copyright 12.06.2012 by Bochkanov Sergey *************************************************************************/ void cqmgeta(convexquadraticmodel* s, /* Real */ ae_matrix* a, ae_state *_state) { ae_int_t i; ae_int_t j; double v; ae_int_t n; n = s->n; rmatrixsetlengthatleast(a, n, n, _state); if( ae_fp_greater(s->alpha,(double)(0)) ) { v = s->alpha; for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { a->ptr.pp_double[i][j] = v*s->a.ptr.pp_double[i][j]; } } } else { for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { a->ptr.pp_double[i][j] = 0.0; } } } } /************************************************************************* This subroutine rewrites diagonal of the main quadratic term of the model (dense A) by vector Z/Alpha (current value of the Alpha coefficient is used). IMPORTANT: in case model has no dense quadratic term, this function allocates N*N dense matrix of zeros, and fills its diagonal by non-zero values. 
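NOTE: after this call the diagonal satisfies Alpha*A[i,i]=Z[i] for every i (when the model had no dense term, Alpha is first set to 1 and A to the zero matrix); off-diagonal entries of A are left unchanged.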
INPUT PARAMETERS: S - model Z - new diagonal, array[N] -- ALGLIB -- Copyright 12.06.2012 by Bochkanov Sergey *************************************************************************/ void cqmrewritedensediagonal(convexquadraticmodel* s, /* Real */ ae_vector* z, ae_state *_state) { ae_int_t n; ae_int_t i; ae_int_t j; n = s->n; if( ae_fp_eq(s->alpha,(double)(0)) ) { rmatrixsetlengthatleast(&s->a, s->n, s->n, _state); rmatrixsetlengthatleast(&s->ecadense, s->n, s->n, _state); rmatrixsetlengthatleast(&s->tq2dense, s->n, s->n, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { s->a.ptr.pp_double[i][j] = 0.0; } } s->alpha = 1.0; } for(i=0; i<=s->n-1; i++) { s->a.ptr.pp_double[i][i] = z->ptr.p_double[i]/s->alpha; } s->ismaintermchanged = ae_true; } /************************************************************************* This subroutine changes diagonal quadratic term of the model. INPUT PARAMETERS: S - model D - array[N], semidefinite diagonal matrix Tau - multiplier; when Tau=0, D is not referenced at all -- ALGLIB -- Copyright 12.06.2012 by Bochkanov Sergey *************************************************************************/ void cqmsetd(convexquadraticmodel* s, /* Real */ ae_vector* d, double tau, ae_state *_state) { ae_int_t i; ae_assert(ae_isfinite(tau, _state)&&ae_fp_greater_eq(tau,(double)(0)), "CQMSetD: Tau<0 or is not finite number", _state); ae_assert(ae_fp_eq(tau,(double)(0))||isfinitevector(d, s->n, _state), "CQMSetD: D is not finite Nx1 vector", _state); s->tau = tau; if( ae_fp_greater(tau,(double)(0)) ) { rvectorsetlengthatleast(&s->d, s->n, _state); rvectorsetlengthatleast(&s->ecadiag, s->n, _state); rvectorsetlengthatleast(&s->tq2diag, s->n, _state); for(i=0; i<=s->n-1; i++) { ae_assert(ae_fp_greater_eq(d->ptr.p_double[i],(double)(0)), "CQMSetD: D[i]<0", _state); s->d.ptr.p_double[i] = d->ptr.p_double[i]; } } s->ismaintermchanged = ae_true; } /************************************************************************* This subroutine drops main quadratic term A from the model. It is same as call to CQMSetA() with zero A, but gives better performance because algorithm knows that matrix is zero and can optimize subsequent calculations. 
INPUT PARAMETERS: S - model -- ALGLIB -- Copyright 12.06.2012 by Bochkanov Sergey *************************************************************************/ void cqmdropa(convexquadraticmodel* s, ae_state *_state) { s->alpha = 0.0; s->ismaintermchanged = ae_true; } /************************************************************************* This subroutine changes linear term of the model -- ALGLIB -- Copyright 12.06.2012 by Bochkanov Sergey *************************************************************************/ void cqmsetb(convexquadraticmodel* s, /* Real */ ae_vector* b, ae_state *_state) { ae_int_t i; ae_assert(isfinitevector(b, s->n, _state), "CQMSetB: B is not finite vector", _state); rvectorsetlengthatleast(&s->b, s->n, _state); for(i=0; i<=s->n-1; i++) { s->b.ptr.p_double[i] = b->ptr.p_double[i]; } s->islineartermchanged = ae_true; } /************************************************************************* This subroutine changes linear term of the model -- ALGLIB -- Copyright 12.06.2012 by Bochkanov Sergey *************************************************************************/ void cqmsetq(convexquadraticmodel* s, /* Real */ ae_matrix* q, /* Real */ ae_vector* r, ae_int_t k, double theta, ae_state *_state) { ae_int_t i; ae_int_t j; ae_assert(k>=0, "CQMSetQ: K<0", _state); ae_assert((k==0||ae_fp_eq(theta,(double)(0)))||apservisfinitematrix(q, k, s->n, _state), "CQMSetQ: Q is not finite matrix", _state); ae_assert((k==0||ae_fp_eq(theta,(double)(0)))||isfinitevector(r, k, _state), "CQMSetQ: R is not finite vector", _state); ae_assert(ae_isfinite(theta, _state)&&ae_fp_greater_eq(theta,(double)(0)), "CQMSetQ: Theta<0 or is not finite number", _state); /* * degenerate case: K=0 or Theta=0 */ if( k==0||ae_fp_eq(theta,(double)(0)) ) { s->k = 0; s->theta = (double)(0); s->issecondarytermchanged = ae_true; return; } /* * General case: both Theta>0 and K>0 */ s->k = k; s->theta = theta; rmatrixsetlengthatleast(&s->q, s->k, s->n, _state); rvectorsetlengthatleast(&s->r, s->k, _state); rmatrixsetlengthatleast(&s->eq, s->k, s->n, _state); rmatrixsetlengthatleast(&s->eccm, s->k, s->k, _state); rmatrixsetlengthatleast(&s->tk2, s->k, s->n, _state); for(i=0; i<=s->k-1; i++) { for(j=0; j<=s->n-1; j++) { s->q.ptr.pp_double[i][j] = q->ptr.pp_double[i][j]; } s->r.ptr.p_double[i] = r->ptr.p_double[i]; } s->issecondarytermchanged = ae_true; } /************************************************************************* This subroutine changes active set INPUT PARAMETERS S - model X - array[N], constraint values ActiveSet- array[N], active set. If ActiveSet[I]=True, then I-th variables is constrained to X[I]. 
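EXAMPLE (illustrative values): with N=2, X=[1.5,0.0] and ActiveSet=[True,False], the model is subsequently treated as a function of the single free variable x[1], with x[0] held fixed at 1.5 (see CQMConstrainedOptimum() below).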
-- ALGLIB -- Copyright 12.06.2012 by Bochkanov Sergey *************************************************************************/ void cqmsetactiveset(convexquadraticmodel* s, /* Real */ ae_vector* x, /* Boolean */ ae_vector* activeset, ae_state *_state) { ae_int_t i; ae_assert(x->cnt>=s->n, "CQMSetActiveSet: Length(X)<N", _state); ae_assert(activeset->cnt>=s->n, "CQMSetActiveSet: Length(ActiveSet)<N", _state); for(i=0; i<=s->n-1; i++) { s->isactivesetchanged = s->isactivesetchanged||(s->activeset.ptr.p_bool[i]&&!activeset->ptr.p_bool[i]); s->isactivesetchanged = s->isactivesetchanged||(activeset->ptr.p_bool[i]&&!s->activeset.ptr.p_bool[i]); s->activeset.ptr.p_bool[i] = activeset->ptr.p_bool[i]; if( activeset->ptr.p_bool[i] ) { ae_assert(ae_isfinite(x->ptr.p_double[i], _state), "CQMSetActiveSet: X[] contains infinite constraints", _state); s->isactivesetchanged = s->isactivesetchanged||ae_fp_neq(s->xc.ptr.p_double[i],x->ptr.p_double[i]); s->xc.ptr.p_double[i] = x->ptr.p_double[i]; } } } /************************************************************************* This subroutine evaluates model at X. Active constraints are ignored. -- ALGLIB -- Copyright 12.06.2012 by Bochkanov Sergey *************************************************************************/ double cqmeval(convexquadraticmodel* s, /* Real */ ae_vector* x, ae_state *_state) { ae_int_t n; ae_int_t i; ae_int_t j; double v; double result; n = s->n; ae_assert(isfinitevector(x, n, _state), "CQMEval: X is not finite vector", _state); result = 0.0; /* * main quadratic term */ if( ae_fp_greater(s->alpha,(double)(0)) ) { for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { result = result+s->alpha*0.5*x->ptr.p_double[i]*s->a.ptr.pp_double[i][j]*x->ptr.p_double[j]; } } } if( ae_fp_greater(s->tau,(double)(0)) ) { for(i=0; i<=n-1; i++) { result = result+0.5*ae_sqr(x->ptr.p_double[i], _state)*s->tau*s->d.ptr.p_double[i]; } } /* * secondary quadratic term */ if( ae_fp_greater(s->theta,(double)(0)) ) { for(i=0; i<=s->k-1; i++) { v = ae_v_dotproduct(&s->q.ptr.pp_double[i][0], 1, &x->ptr.p_double[0], 1, ae_v_len(0,n-1)); result = result+0.5*s->theta*ae_sqr(v-s->r.ptr.p_double[i], _state); } } /* * linear term */ for(i=0; i<=s->n-1; i++) { result = result+x->ptr.p_double[i]*s->b.ptr.p_double[i]; } return result; } /************************************************************************* This subroutine evaluates model at X. Active constraints are ignored. It returns: R - model value Noise- estimate of the numerical noise in data -- ALGLIB -- Copyright 12.06.2012 by Bochkanov Sergey *************************************************************************/ void cqmevalx(convexquadraticmodel* s, /* Real */ ae_vector* x, double* r, double* noise, ae_state *_state) { ae_int_t n; ae_int_t i; ae_int_t j; double v; double v2; double mxq; double eps; *r = 0; *noise = 0; n = s->n; ae_assert(isfinitevector(x, n, _state), "CQMEval: X is not finite vector", _state); *r = 0.0; *noise = 0.0; eps = 2*ae_machineepsilon; mxq = 0.0; /* * Main quadratic term. * * Noise from the main quadratic term is equal to the * maximum summand in the term.
*/ if( ae_fp_greater(s->alpha,(double)(0)) ) { for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { v = s->alpha*0.5*x->ptr.p_double[i]*s->a.ptr.pp_double[i][j]*x->ptr.p_double[j]; *r = *r+v; *noise = ae_maxreal(*noise, eps*ae_fabs(v, _state), _state); } } } if( ae_fp_greater(s->tau,(double)(0)) ) { for(i=0; i<=n-1; i++) { v = 0.5*ae_sqr(x->ptr.p_double[i], _state)*s->tau*s->d.ptr.p_double[i]; *r = *r+v; *noise = ae_maxreal(*noise, eps*ae_fabs(v, _state), _state); } } /* * secondary quadratic term * * Noise from the secondary quadratic term is estimated as follows: * * noise in qi*x-r[i] is estimated as * Eps*MXQ = Eps*max(|r[i]|, |q[i,j]*x[j]|) * * noise in (qi*x-r[i])^2 is estimated as * NOISE = (|qi*x-r[i]|+Eps*MXQ)^2-(|qi*x-r[i]|)^2 * = Eps*MXQ*(2*|qi*x-r[i]|+Eps*MXQ) */ if( ae_fp_greater(s->theta,(double)(0)) ) { for(i=0; i<=s->k-1; i++) { v = 0.0; mxq = ae_fabs(s->r.ptr.p_double[i], _state); for(j=0; j<=n-1; j++) { v2 = s->q.ptr.pp_double[i][j]*x->ptr.p_double[j]; v = v+v2; mxq = ae_maxreal(mxq, ae_fabs(v2, _state), _state); } *r = *r+0.5*s->theta*ae_sqr(v-s->r.ptr.p_double[i], _state); *noise = ae_maxreal(*noise, eps*mxq*(2*ae_fabs(v-s->r.ptr.p_double[i], _state)+eps*mxq), _state); } } /* * linear term */ for(i=0; i<=s->n-1; i++) { *r = *r+x->ptr.p_double[i]*s->b.ptr.p_double[i]; *noise = ae_maxreal(*noise, eps*ae_fabs(x->ptr.p_double[i]*s->b.ptr.p_double[i], _state), _state); } /* * Final update of the noise */ *noise = n*(*noise); } /************************************************************************* This subroutine evaluates gradient of the model; active constraints are ignored. INPUT PARAMETERS: S - convex model X - point, array[N] G - possibly preallocated buffer; resized, if too small -- ALGLIB -- Copyright 12.06.2012 by Bochkanov Sergey *************************************************************************/ void cqmgradunconstrained(convexquadraticmodel* s, /* Real */ ae_vector* x, /* Real */ ae_vector* g, ae_state *_state) { ae_int_t n; ae_int_t i; ae_int_t j; double v; n = s->n; ae_assert(isfinitevector(x, n, _state), "CQMEvalGradUnconstrained: X is not finite vector", _state); rvectorsetlengthatleast(g, n, _state); for(i=0; i<=n-1; i++) { g->ptr.p_double[i] = (double)(0); } /* * main quadratic term */ if( ae_fp_greater(s->alpha,(double)(0)) ) { for(i=0; i<=n-1; i++) { v = 0.0; for(j=0; j<=n-1; j++) { v = v+s->alpha*s->a.ptr.pp_double[i][j]*x->ptr.p_double[j]; } g->ptr.p_double[i] = g->ptr.p_double[i]+v; } } if( ae_fp_greater(s->tau,(double)(0)) ) { for(i=0; i<=n-1; i++) { g->ptr.p_double[i] = g->ptr.p_double[i]+x->ptr.p_double[i]*s->tau*s->d.ptr.p_double[i]; } } /* * secondary quadratic term */ if( ae_fp_greater(s->theta,(double)(0)) ) { for(i=0; i<=s->k-1; i++) { v = ae_v_dotproduct(&s->q.ptr.pp_double[i][0], 1, &x->ptr.p_double[0], 1, ae_v_len(0,n-1)); v = s->theta*(v-s->r.ptr.p_double[i]); ae_v_addd(&g->ptr.p_double[0], 1, &s->q.ptr.pp_double[i][0], 1, ae_v_len(0,n-1), v); } } /* * linear term */ for(i=0; i<=n-1; i++) { g->ptr.p_double[i] = g->ptr.p_double[i]+s->b.ptr.p_double[i]; } } /************************************************************************* This subroutine evaluates x'*(0.5*alpha*A+tau*D)*x -- ALGLIB -- Copyright 12.06.2012 by Bochkanov Sergey *************************************************************************/ double cqmxtadx2(convexquadraticmodel* s, /* Real */ ae_vector* x, ae_state *_state) { ae_int_t n; ae_int_t i; ae_int_t j; double result; n = s->n; ae_assert(isfinitevector(x, n, _state), "CQMEval: X is not finite vector", _state); 
result = 0.0; /* * main quadratic term */ if( ae_fp_greater(s->alpha,(double)(0)) ) { for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { result = result+s->alpha*0.5*x->ptr.p_double[i]*s->a.ptr.pp_double[i][j]*x->ptr.p_double[j]; } } } if( ae_fp_greater(s->tau,(double)(0)) ) { for(i=0; i<=n-1; i++) { result = result+0.5*ae_sqr(x->ptr.p_double[i], _state)*s->tau*s->d.ptr.p_double[i]; } } return result; } /************************************************************************* This subroutine evaluates (0.5*alpha*A+tau*D)*x Y is automatically resized if needed -- ALGLIB -- Copyright 12.06.2012 by Bochkanov Sergey *************************************************************************/ void cqmadx(convexquadraticmodel* s, /* Real */ ae_vector* x, /* Real */ ae_vector* y, ae_state *_state) { ae_int_t n; ae_int_t i; double v; n = s->n; ae_assert(isfinitevector(x, n, _state), "CQMEval: X is not finite vector", _state); rvectorsetlengthatleast(y, n, _state); /* * main quadratic term */ for(i=0; i<=n-1; i++) { y->ptr.p_double[i] = (double)(0); } if( ae_fp_greater(s->alpha,(double)(0)) ) { for(i=0; i<=n-1; i++) { v = ae_v_dotproduct(&s->a.ptr.pp_double[i][0], 1, &x->ptr.p_double[0], 1, ae_v_len(0,n-1)); y->ptr.p_double[i] = y->ptr.p_double[i]+s->alpha*v; } } if( ae_fp_greater(s->tau,(double)(0)) ) { for(i=0; i<=n-1; i++) { y->ptr.p_double[i] = y->ptr.p_double[i]+x->ptr.p_double[i]*s->tau*s->d.ptr.p_double[i]; } } } /************************************************************************* This subroutine finds optimum of the model. It returns False on failure (indefinite/semidefinite matrix). Optimum is found subject to active constraints. INPUT PARAMETERS S - model X - possibly preallocated buffer; automatically resized, if too small enough. -- ALGLIB -- Copyright 12.06.2012 by Bochkanov Sergey *************************************************************************/ ae_bool cqmconstrainedoptimum(convexquadraticmodel* s, /* Real */ ae_vector* x, ae_state *_state) { ae_int_t n; ae_int_t nfree; ae_int_t k; ae_int_t i; double v; ae_int_t cidx0; ae_int_t itidx; ae_bool result; /* * Rebuild internal structures */ if( !cqmodels_cqmrebuild(s, _state) ) { result = ae_false; return result; } n = s->n; k = s->k; nfree = s->nfree; result = ae_true; /* * Calculate initial point for the iterative refinement: * * free components are set to zero * * constrained components are set to their constrained values */ rvectorsetlengthatleast(x, n, _state); for(i=0; i<=n-1; i++) { if( s->activeset.ptr.p_bool[i] ) { x->ptr.p_double[i] = s->xc.ptr.p_double[i]; } else { x->ptr.p_double[i] = (double)(0); } } /* * Iterative refinement. * * In an ideal world without numerical errors it would be enough * to make just one Newton step from initial point: * x_new = -H^(-1)*grad(x=0) * However, roundoff errors can significantly deteriorate quality * of the solution. So we have to recalculate gradient and to * perform Newton steps several times. * * Below we perform fixed number of Newton iterations. */ for(itidx=0; itidx<=cqmodels_newtonrefinementits-1; itidx++) { /* * Calculate gradient at the current point. * Move free components of the gradient in the beginning. */ cqmgradunconstrained(s, x, &s->tmpg, _state); cidx0 = 0; for(i=0; i<=n-1; i++) { if( !s->activeset.ptr.p_bool[i] ) { s->tmpg.ptr.p_double[cidx0] = s->tmpg.ptr.p_double[i]; cidx0 = cidx0+1; } } /* * Free components of the extrema are calculated in the first NFree elements of TXC. 
* * First, we have to calculate original Newton step, without rank-K perturbations */ ae_v_moveneg(&s->txc.ptr.p_double[0], 1, &s->tmpg.ptr.p_double[0], 1, ae_v_len(0,nfree-1)); cqmodels_cqmsolveea(s, &s->txc, &s->tmp0, _state); /* * Then, we account for rank-K correction. * Woodbury matrix identity is used. */ if( s->k>0&&ae_fp_greater(s->theta,(double)(0)) ) { rvectorsetlengthatleast(&s->tmp0, ae_maxint(nfree, k, _state), _state); rvectorsetlengthatleast(&s->tmp1, ae_maxint(nfree, k, _state), _state); ae_v_moveneg(&s->tmp1.ptr.p_double[0], 1, &s->tmpg.ptr.p_double[0], 1, ae_v_len(0,nfree-1)); cqmodels_cqmsolveea(s, &s->tmp1, &s->tmp0, _state); for(i=0; i<=k-1; i++) { v = ae_v_dotproduct(&s->eq.ptr.pp_double[i][0], 1, &s->tmp1.ptr.p_double[0], 1, ae_v_len(0,nfree-1)); s->tmp0.ptr.p_double[i] = v; } fblscholeskysolve(&s->eccm, 1.0, k, ae_true, &s->tmp0, &s->tmp1, _state); for(i=0; i<=nfree-1; i++) { s->tmp1.ptr.p_double[i] = 0.0; } for(i=0; i<=k-1; i++) { v = s->tmp0.ptr.p_double[i]; ae_v_addd(&s->tmp1.ptr.p_double[0], 1, &s->eq.ptr.pp_double[i][0], 1, ae_v_len(0,nfree-1), v); } cqmodels_cqmsolveea(s, &s->tmp1, &s->tmp0, _state); ae_v_sub(&s->txc.ptr.p_double[0], 1, &s->tmp1.ptr.p_double[0], 1, ae_v_len(0,nfree-1)); } /* * Unpack components from TXC into X. We pass through all * free components of X and add our step. */ cidx0 = 0; for(i=0; i<=n-1; i++) { if( !s->activeset.ptr.p_bool[i] ) { x->ptr.p_double[i] = x->ptr.p_double[i]+s->txc.ptr.p_double[cidx0]; cidx0 = cidx0+1; } } } return result; } /************************************************************************* This function scales vector by multiplying it by inverse of the diagonal of the Hessian matrix. It should be used to accelerate steepest descent phase of the QP solver. Although it is called "scale-grad", it can be called for any vector, whether it is gradient, anti-gradient, or just some vector. This function does NOT takes into account current set of constraints, it just performs matrix-vector multiplication without taking into account constraints. INPUT PARAMETERS: S - model X - vector to scale OUTPUT PARAMETERS: X - scaled vector NOTE: when called for non-SPD matrices, it silently skips components of X which correspond to zero or negative diagonal elements. NOTE: this function uses diagonals of A and D; it ignores Q - rank-K term of the quadratic model. -- ALGLIB -- Copyright 12.06.2012 by Bochkanov Sergey *************************************************************************/ void cqmscalevector(convexquadraticmodel* s, /* Real */ ae_vector* x, ae_state *_state) { ae_int_t n; ae_int_t i; double v; n = s->n; for(i=0; i<=n-1; i++) { v = 0.0; if( ae_fp_greater(s->alpha,(double)(0)) ) { v = v+s->a.ptr.pp_double[i][i]; } if( ae_fp_greater(s->tau,(double)(0)) ) { v = v+s->d.ptr.p_double[i]; } if( ae_fp_greater(v,(double)(0)) ) { x->ptr.p_double[i] = x->ptr.p_double[i]/v; } } } /************************************************************************* This subroutine calls CQMRebuild() and evaluates model at X subject to active constraints. It is intended for debug purposes only, because it evaluates model by means of temporaries, which were calculated by CQMRebuild(). The only purpose of this function is to check correctness of CQMRebuild() by comparing results of this function with ones obtained by CQMEval(), which is used as reference point. The idea is that significant deviation in results of these two functions is evidence of some error in the CQMRebuild(). 
NOTE: suffix T denotes that temporaries marked by T-prefix are used. There is one more variant of this function, which uses "effective" model built by CQMRebuild(). NOTE2: in case CQMRebuild() fails (due to model non-convexity), this function returns NAN. -- ALGLIB -- Copyright 12.06.2012 by Bochkanov Sergey *************************************************************************/ double cqmdebugconstrainedevalt(convexquadraticmodel* s, /* Real */ ae_vector* x, ae_state *_state) { ae_int_t n; ae_int_t nfree; ae_int_t i; ae_int_t j; double v; double result; n = s->n; ae_assert(isfinitevector(x, n, _state), "CQMDebugConstrainedEvalT: X is not finite vector", _state); if( !cqmodels_cqmrebuild(s, _state) ) { result = _state->v_nan; return result; } result = 0.0; nfree = s->nfree; /* * Reorder variables */ j = 0; for(i=0; i<=n-1; i++) { if( !s->activeset.ptr.p_bool[i] ) { ae_assert(j<nfree, "CQMDebugConstrainedEvalT: internal error", _state); s->txc.ptr.p_double[j] = x->ptr.p_double[i]; j = j+1; } } /* * TQ2, TQ1, TQ0 * */ if( ae_fp_greater(s->alpha,(double)(0)) ) { /* * Dense TQ2 */ for(i=0; i<=nfree-1; i++) { for(j=0; j<=nfree-1; j++) { result = result+0.5*s->txc.ptr.p_double[i]*s->tq2dense.ptr.pp_double[i][j]*s->txc.ptr.p_double[j]; } } } else { /* * Diagonal TQ2 */ for(i=0; i<=nfree-1; i++) { result = result+0.5*s->tq2diag.ptr.p_double[i]*ae_sqr(s->txc.ptr.p_double[i], _state); } } for(i=0; i<=nfree-1; i++) { result = result+s->tq1.ptr.p_double[i]*s->txc.ptr.p_double[i]; } result = result+s->tq0; /* * TK2, TK1, TK0 */ if( s->k>0&&ae_fp_greater(s->theta,(double)(0)) ) { for(i=0; i<=s->k-1; i++) { v = (double)(0); for(j=0; j<=nfree-1; j++) { v = v+s->tk2.ptr.pp_double[i][j]*s->txc.ptr.p_double[j]; } result = result+0.5*ae_sqr(v, _state); } for(i=0; i<=nfree-1; i++) { result = result+s->tk1.ptr.p_double[i]*s->txc.ptr.p_double[i]; } result = result+s->tk0; } /* * TB (Bf and Bc parts) */ for(i=0; i<=n-1; i++) { result = result+s->tb.ptr.p_double[i]*s->txc.ptr.p_double[i]; } return result; } /************************************************************************* This subroutine calls CQMRebuild() and evaluates model at X subject to active constraints. It is intended for debug purposes only, because it evaluates model by means of "effective" matrices built by CQMRebuild(). The only purpose of this function is to check correctness of CQMRebuild() by comparing results of this function with ones obtained by CQMEval(), which is used as reference point. The idea is that significant deviation in results of these two functions is evidence of some error in the CQMRebuild(). NOTE: suffix E denotes that effective matrices. There is one more variant of this function, which uses temporary matrices built by CQMRebuild(). NOTE2: in case CQMRebuild() fails (due to model non-convexity), this function returns NAN.
-- ALGLIB -- Copyright 12.06.2012 by Bochkanov Sergey *************************************************************************/ double cqmdebugconstrainedevale(convexquadraticmodel* s, /* Real */ ae_vector* x, ae_state *_state) { ae_int_t n; ae_int_t nfree; ae_int_t i; ae_int_t j; double v; double result; n = s->n; ae_assert(isfinitevector(x, n, _state), "CQMDebugConstrainedEvalE: X is not finite vector", _state); if( !cqmodels_cqmrebuild(s, _state) ) { result = _state->v_nan; return result; } result = 0.0; nfree = s->nfree; /* * Reorder variables */ j = 0; for(i=0; i<=n-1; i++) { if( !s->activeset.ptr.p_bool[i] ) { ae_assert(j<nfree, "CQMDebugConstrainedEvalE: internal error", _state); s->txc.ptr.p_double[j] = x->ptr.p_double[i]; j = j+1; } } /* * ECA */ ae_assert((s->ecakind==0||s->ecakind==1)||(s->ecakind==-1&&nfree==0), "CQMDebugConstrainedEvalE: unexpected ECAKind", _state); if( s->ecakind==0 ) { /* * Dense ECA */ for(i=0; i<=nfree-1; i++) { v = 0.0; for(j=i; j<=nfree-1; j++) { v = v+s->ecadense.ptr.pp_double[i][j]*s->txc.ptr.p_double[j]; } result = result+0.5*ae_sqr(v, _state); } } if( s->ecakind==1 ) { /* * Diagonal ECA */ for(i=0; i<=nfree-1; i++) { result = result+0.5*ae_sqr(s->ecadiag.ptr.p_double[i]*s->txc.ptr.p_double[i], _state); } } /* * EQ */ for(i=0; i<=s->k-1; i++) { v = 0.0; for(j=0; j<=nfree-1; j++) { v = v+s->eq.ptr.pp_double[i][j]*s->txc.ptr.p_double[j]; } result = result+0.5*ae_sqr(v, _state); } /* * EB */ for(i=0; i<=nfree-1; i++) { result = result+s->eb.ptr.p_double[i]*s->txc.ptr.p_double[i]; } /* * EC */ result = result+s->ec; return result; } /************************************************************************* Internal function, rebuilds "effective" model subject to constraints. Returns False on failure (non-SPD main quadratic term) -- ALGLIB -- Copyright 10.05.2011 by Bochkanov Sergey *************************************************************************/ static ae_bool cqmodels_cqmrebuild(convexquadraticmodel* s, ae_state *_state) { ae_int_t n; ae_int_t nfree; ae_int_t k; ae_int_t i; ae_int_t j; ae_int_t ridx0; ae_int_t ridx1; ae_int_t cidx0; ae_int_t cidx1; double v; ae_bool result; if( ae_fp_eq(s->alpha,(double)(0))&&ae_fp_eq(s->tau,(double)(0)) ) { /* * Non-SPD model, quick exit */ result = ae_false; return result; } result = ae_true; n = s->n; k = s->k; /* * Determine number of free variables. * Fill TXC - array whose last N-NFree elements store constraints. */ if( s->isactivesetchanged ) { s->nfree = 0; for(i=0; i<=n-1; i++) { if( !s->activeset.ptr.p_bool[i] ) { s->nfree = s->nfree+1; } } j = s->nfree; for(i=0; i<=n-1; i++) { if( s->activeset.ptr.p_bool[i] ) { s->txc.ptr.p_double[j] = s->xc.ptr.p_double[i]; j = j+1; } } } nfree = s->nfree; /* * Re-evaluate TQ2/TQ1/TQ0, if needed */ if( s->isactivesetchanged||s->ismaintermchanged ) { /* * Handle cases Alpha>0 and Alpha=0 separately: * * in the first case we have dense matrix * * in the second one we have diagonal matrix, which can be * handled more efficiently */ if( ae_fp_greater(s->alpha,(double)(0)) ) { /* * Alpha>0, dense QP * * Split variables into two groups - free (F) and constrained (C). Reorder * variables in such way that free vars come first, constrained are last: * x = [xf, xc].
* * Main quadratic term x'*(alpha*A+tau*D)*x now splits into quadratic part, * linear part and constant part: * ( alpha*Aff+tau*Df alpha*Afc ) ( xf ) * 0.5*( xf' xc' )*( )*( ) = * ( alpha*Acf alpha*Acc+tau*Dc ) ( xc ) * * = 0.5*xf'*(alpha*Aff+tau*Df)*xf + (alpha*Afc*xc)'*xf + 0.5*xc'(alpha*Acc+tau*Dc)*xc * * We store these parts into temporary variables: * * alpha*Aff+tau*Df, alpha*Afc, alpha*Acc+tau*Dc are stored into upper * triangle of TQ2 * * alpha*Afc*xc is stored into TQ1 * * 0.5*xc'(alpha*Acc+tau*Dc)*xc is stored into TQ0 * * Below comes first part of the work - generation of TQ2: * * we pass through rows of A and copy I-th row into upper block (Aff/Afc) or * lower one (Acf/Acc) of TQ2, depending on presence of X[i] in the active set. * RIdx0 variable contains current position for insertion into upper block, * RIdx1 contains current position for insertion into lower one. * * within each row, we copy J-th element into left half (Aff/Acf) or right * one (Afc/Acc), depending on presence of X[j] in the active set. CIdx0 * contains current position for insertion into left block, CIdx1 contains * position for insertion into right one. * * during copying, we multiply elements by alpha and add diagonal matrix D. */ ridx0 = 0; ridx1 = s->nfree; for(i=0; i<=n-1; i++) { cidx0 = 0; cidx1 = s->nfree; for(j=0; j<=n-1; j++) { if( !s->activeset.ptr.p_bool[i]&&!s->activeset.ptr.p_bool[j] ) { /* * Element belongs to Aff */ v = s->alpha*s->a.ptr.pp_double[i][j]; if( i==j&&ae_fp_greater(s->tau,(double)(0)) ) { v = v+s->tau*s->d.ptr.p_double[i]; } s->tq2dense.ptr.pp_double[ridx0][cidx0] = v; } if( !s->activeset.ptr.p_bool[i]&&s->activeset.ptr.p_bool[j] ) { /* * Element belongs to Afc */ s->tq2dense.ptr.pp_double[ridx0][cidx1] = s->alpha*s->a.ptr.pp_double[i][j]; } if( s->activeset.ptr.p_bool[i]&&!s->activeset.ptr.p_bool[j] ) { /* * Element belongs to Acf */ s->tq2dense.ptr.pp_double[ridx1][cidx0] = s->alpha*s->a.ptr.pp_double[i][j]; } if( s->activeset.ptr.p_bool[i]&&s->activeset.ptr.p_bool[j] ) { /* * Element belongs to Acc */ v = s->alpha*s->a.ptr.pp_double[i][j]; if( i==j&&ae_fp_greater(s->tau,(double)(0)) ) { v = v+s->tau*s->d.ptr.p_double[i]; } s->tq2dense.ptr.pp_double[ridx1][cidx1] = v; } if( s->activeset.ptr.p_bool[j] ) { cidx1 = cidx1+1; } else { cidx0 = cidx0+1; } } if( s->activeset.ptr.p_bool[i] ) { ridx1 = ridx1+1; } else { ridx0 = ridx0+1; } } /* * Now we have TQ2, and we can evaluate TQ1. * In the special case when we have Alpha=0, NFree=0 or NFree=N, * TQ1 is filled by zeros. */ for(i=0; i<=n-1; i++) { s->tq1.ptr.p_double[i] = 0.0; } if( s->nfree>0&&s->nfree<n ) { rmatrixmv(s->nfree, n-s->nfree, &s->tq2dense, 0, s->nfree, 0, &s->txc, s->nfree, &s->tq1, 0, _state); } /* * And finally, we evaluate TQ0. */ v = 0.0; for(i=s->nfree; i<=n-1; i++) { for(j=s->nfree; j<=n-1; j++) { v = v+0.5*s->txc.ptr.p_double[i]*s->tq2dense.ptr.pp_double[i][j]*s->txc.ptr.p_double[j]; } } s->tq0 = v; } else { /* * Alpha=0, diagonal QP * * Split variables into two groups - free (F) and constrained (C). Reorder * variables in such way that free vars come first, constrained are last: * x = [xf, xc].
* * Main quadratic term x'*(tau*D)*x now splits into quadratic and constant * parts: * ( tau*Df ) ( xf ) * 0.5*( xf' xc' )*( )*( ) = * ( tau*Dc ) ( xc ) * * = 0.5*xf'*(tau*Df)*xf + 0.5*xc'(tau*Dc)*xc * * We store these parts into temporary variables: * * tau*Df is stored in TQ2Diag * * 0.5*xc'(tau*Dc)*xc is stored into TQ0 */ s->tq0 = 0.0; ridx0 = 0; for(i=0; i<=n-1; i++) { if( !s->activeset.ptr.p_bool[i] ) { s->tq2diag.ptr.p_double[ridx0] = s->tau*s->d.ptr.p_double[i]; ridx0 = ridx0+1; } else { s->tq0 = s->tq0+0.5*s->tau*s->d.ptr.p_double[i]*ae_sqr(s->xc.ptr.p_double[i], _state); } } for(i=0; i<=n-1; i++) { s->tq1.ptr.p_double[i] = 0.0; } } } /* * Re-evaluate TK2/TK1/TK0, if needed */ if( s->isactivesetchanged||s->issecondarytermchanged ) { /* * Split variables into two groups - free (F) and constrained (C). Reorder * variables in such way that free vars come first, constrained are last: * x = [xf, xc]. * * Secondary term theta*(Q*x-r)'*(Q*x-r) now splits into quadratic part, * linear part and constant part: * ( ( xf ) )' ( ( xf ) ) * 0.5*theta*( (Qf Qc)'*( ) - r ) * ( (Qf Qc)'*( ) - r ) = * ( ( xc ) ) ( ( xc ) ) * * = 0.5*theta*xf'*(Qf'*Qf)*xf + theta*((Qc*xc-r)'*Qf)*xf + * + theta*(-r'*(Qc*xc-r)-0.5*r'*r+0.5*xc'*Qc'*Qc*xc) * * We store these parts into temporary variables: * * sqrt(theta)*Qf is stored into TK2 * * theta*((Qc*xc-r)'*Qf) is stored into TK1 * * theta*(-r'*(Qc*xc-r)-0.5*r'*r+0.5*xc'*Qc'*Qc*xc) is stored into TK0 * * We use several other temporaries to store intermediate results: * * Tmp0 - to store Qc*xc-r * * Tmp1 - to store Qc*xc * * Generation of TK2/TK1/TK0 is performed as follows: * * we fill TK2/TK1/TK0 (to handle K=0 or Theta=0) * * other steps are performed only for K>0 and Theta>0 * * we pass through columns of Q and copy I-th column into left block (Qf) or * right one (Qc) of TK2, depending on presence of X[i] in the active set. * CIdx0 variable contains current position for insertion into upper block, * CIdx1 contains current position for insertion into lower one. * * we calculate Qc*xc-r and store it into Tmp0 * * we calculate TK0 and TK1 * * we multiply leading part of TK2 which stores Qf by sqrt(theta) * it is important to perform this step AFTER calculation of TK0 and TK1, * because we need original (non-modified) Qf to calculate TK0 and TK1. 
*/ for(j=0; j<=n-1; j++) { for(i=0; i<=k-1; i++) { s->tk2.ptr.pp_double[i][j] = 0.0; } s->tk1.ptr.p_double[j] = 0.0; } s->tk0 = 0.0; if( s->k>0&&ae_fp_greater(s->theta,(double)(0)) ) { /* * Split Q into Qf and Qc * Calculate Qc*xc-r, store in Tmp0 */ rvectorsetlengthatleast(&s->tmp0, k, _state); rvectorsetlengthatleast(&s->tmp1, k, _state); cidx0 = 0; cidx1 = nfree; for(i=0; i<=k-1; i++) { s->tmp1.ptr.p_double[i] = 0.0; } for(j=0; j<=n-1; j++) { if( s->activeset.ptr.p_bool[j] ) { for(i=0; i<=k-1; i++) { s->tk2.ptr.pp_double[i][cidx1] = s->q.ptr.pp_double[i][j]; s->tmp1.ptr.p_double[i] = s->tmp1.ptr.p_double[i]+s->q.ptr.pp_double[i][j]*s->txc.ptr.p_double[cidx1]; } cidx1 = cidx1+1; } else { for(i=0; i<=k-1; i++) { s->tk2.ptr.pp_double[i][cidx0] = s->q.ptr.pp_double[i][j]; } cidx0 = cidx0+1; } } for(i=0; i<=k-1; i++) { s->tmp0.ptr.p_double[i] = s->tmp1.ptr.p_double[i]-s->r.ptr.p_double[i]; } /* * Calculate TK0 */ v = 0.0; for(i=0; i<=k-1; i++) { v = v+s->theta*(0.5*ae_sqr(s->tmp1.ptr.p_double[i], _state)-s->r.ptr.p_double[i]*s->tmp0.ptr.p_double[i]-0.5*ae_sqr(s->r.ptr.p_double[i], _state)); } s->tk0 = v; /* * Calculate TK1 */ if( nfree>0 ) { for(i=0; i<=k-1; i++) { v = s->theta*s->tmp0.ptr.p_double[i]; ae_v_addd(&s->tk1.ptr.p_double[0], 1, &s->tk2.ptr.pp_double[i][0], 1, ae_v_len(0,nfree-1), v); } } /* * Calculate TK2 */ if( nfree>0 ) { v = ae_sqrt(s->theta, _state); for(i=0; i<=k-1; i++) { ae_v_muld(&s->tk2.ptr.pp_double[i][0], 1, ae_v_len(0,nfree-1), v); } } } } /* * Re-evaluate TB */ if( s->isactivesetchanged||s->islineartermchanged ) { ridx0 = 0; ridx1 = nfree; for(i=0; i<=n-1; i++) { if( s->activeset.ptr.p_bool[i] ) { s->tb.ptr.p_double[ridx1] = s->b.ptr.p_double[i]; ridx1 = ridx1+1; } else { s->tb.ptr.p_double[ridx0] = s->b.ptr.p_double[i]; ridx0 = ridx0+1; } } } /* * Compose ECA: either dense ECA or diagonal ECA */ if( (s->isactivesetchanged||s->ismaintermchanged)&&nfree>0 ) { if( ae_fp_greater(s->alpha,(double)(0)) ) { /* * Dense ECA */ s->ecakind = 0; for(i=0; i<=nfree-1; i++) { for(j=i; j<=nfree-1; j++) { s->ecadense.ptr.pp_double[i][j] = s->tq2dense.ptr.pp_double[i][j]; } } if( !spdmatrixcholeskyrec(&s->ecadense, 0, nfree, ae_true, &s->tmp0, _state) ) { result = ae_false; return result; } } else { /* * Diagonal ECA */ s->ecakind = 1; for(i=0; i<=nfree-1; i++) { if( ae_fp_less(s->tq2diag.ptr.p_double[i],(double)(0)) ) { result = ae_false; return result; } s->ecadiag.ptr.p_double[i] = ae_sqrt(s->tq2diag.ptr.p_double[i], _state); } } } /* * Compose EQ */ if( s->isactivesetchanged||s->issecondarytermchanged ) { for(i=0; i<=k-1; i++) { for(j=0; j<=nfree-1; j++) { s->eq.ptr.pp_double[i][j] = s->tk2.ptr.pp_double[i][j]; } } } /* * Calculate ECCM */ if( ((((s->isactivesetchanged||s->ismaintermchanged)||s->issecondarytermchanged)&&s->k>0)&&ae_fp_greater(s->theta,(double)(0)))&&nfree>0 ) { /* * Calculate ECCM - Cholesky factor of the "effective" capacitance * matrix CM = I + EQ*inv(EffectiveA)*EQ'. * * We calculate CM as follows: * CM = I + EQ*inv(EffectiveA)*EQ' * = I + EQ*ECA^(-1)*ECA^(-T)*EQ' * = I + (EQ*ECA^(-1))*(EQ*ECA^(-1))' * * Then we perform Cholesky decomposition of CM. 
*/ rmatrixsetlengthatleast(&s->tmp2, k, n, _state); rmatrixcopy(k, nfree, &s->eq, 0, 0, &s->tmp2, 0, 0, _state); ae_assert(s->ecakind==0||s->ecakind==1, "CQMRebuild: unexpected ECAKind", _state); if( s->ecakind==0 ) { rmatrixrighttrsm(k, nfree, &s->ecadense, 0, 0, ae_true, ae_false, 0, &s->tmp2, 0, 0, _state); } if( s->ecakind==1 ) { for(i=0; i<=k-1; i++) { for(j=0; j<=nfree-1; j++) { s->tmp2.ptr.pp_double[i][j] = s->tmp2.ptr.pp_double[i][j]/s->ecadiag.ptr.p_double[j]; } } } for(i=0; i<=k-1; i++) { for(j=0; j<=k-1; j++) { s->eccm.ptr.pp_double[i][j] = 0.0; } s->eccm.ptr.pp_double[i][i] = 1.0; } rmatrixsyrk(k, nfree, 1.0, &s->tmp2, 0, 0, 0, 1.0, &s->eccm, 0, 0, ae_true, _state); if( !spdmatrixcholeskyrec(&s->eccm, 0, k, ae_true, &s->tmp0, _state) ) { result = ae_false; return result; } } /* * Compose EB and EC * * NOTE: because these quantities are cheap to compute, we do not * use caching here. */ for(i=0; i<=nfree-1; i++) { s->eb.ptr.p_double[i] = s->tq1.ptr.p_double[i]+s->tk1.ptr.p_double[i]+s->tb.ptr.p_double[i]; } s->ec = s->tq0+s->tk0; for(i=nfree; i<=n-1; i++) { s->ec = s->ec+s->tb.ptr.p_double[i]*s->txc.ptr.p_double[i]; } /* * Change cache status - everything is cached */ s->ismaintermchanged = ae_false; s->issecondarytermchanged = ae_false; s->islineartermchanged = ae_false; s->isactivesetchanged = ae_false; return result; } /************************************************************************* Internal function, solves system Effective_A*x = b. It should be called after successful completion of CQMRebuild(). INPUT PARAMETERS: S - quadratic model, after call to CQMRebuild() X - right part B, array[S.NFree] Tmp - temporary array, automatically reallocated if needed OUTPUT PARAMETERS: X - solution, array[S.NFree] NOTE: when called with zero S.NFree, returns silently NOTE: this function assumes that EA is non-degenerate -- ALGLIB -- Copyright 10.05.2011 by Bochkanov Sergey *************************************************************************/ static void cqmodels_cqmsolveea(convexquadraticmodel* s, /* Real */ ae_vector* x, /* Real */ ae_vector* tmp, ae_state *_state) { ae_int_t i; ae_assert((s->ecakind==0||s->ecakind==1)||(s->ecakind==-1&&s->nfree==0), "CQMSolveEA: unexpected ECAKind", _state); if( s->ecakind==0 ) { /* * Dense ECA, use FBLSCholeskySolve() dense solver. 
*/ fblscholeskysolve(&s->ecadense, 1.0, s->nfree, ae_true, x, tmp, _state); } if( s->ecakind==1 ) { /* * Diagonal ECA */ for(i=0; i<=s->nfree-1; i++) { x->ptr.p_double[i] = x->ptr.p_double[i]/ae_sqr(s->ecadiag.ptr.p_double[i], _state); } } } void _convexquadraticmodel_init(void* _p, ae_state *_state) { convexquadraticmodel *p = (convexquadraticmodel*)_p; ae_touch_ptr((void*)p); ae_matrix_init(&p->a, 0, 0, DT_REAL, _state); ae_matrix_init(&p->q, 0, 0, DT_REAL, _state); ae_vector_init(&p->b, 0, DT_REAL, _state); ae_vector_init(&p->r, 0, DT_REAL, _state); ae_vector_init(&p->xc, 0, DT_REAL, _state); ae_vector_init(&p->d, 0, DT_REAL, _state); ae_vector_init(&p->activeset, 0, DT_BOOL, _state); ae_matrix_init(&p->tq2dense, 0, 0, DT_REAL, _state); ae_matrix_init(&p->tk2, 0, 0, DT_REAL, _state); ae_vector_init(&p->tq2diag, 0, DT_REAL, _state); ae_vector_init(&p->tq1, 0, DT_REAL, _state); ae_vector_init(&p->tk1, 0, DT_REAL, _state); ae_vector_init(&p->txc, 0, DT_REAL, _state); ae_vector_init(&p->tb, 0, DT_REAL, _state); ae_matrix_init(&p->ecadense, 0, 0, DT_REAL, _state); ae_matrix_init(&p->eq, 0, 0, DT_REAL, _state); ae_matrix_init(&p->eccm, 0, 0, DT_REAL, _state); ae_vector_init(&p->ecadiag, 0, DT_REAL, _state); ae_vector_init(&p->eb, 0, DT_REAL, _state); ae_vector_init(&p->tmp0, 0, DT_REAL, _state); ae_vector_init(&p->tmp1, 0, DT_REAL, _state); ae_vector_init(&p->tmpg, 0, DT_REAL, _state); ae_matrix_init(&p->tmp2, 0, 0, DT_REAL, _state); } void _convexquadraticmodel_init_copy(void* _dst, void* _src, ae_state *_state) { convexquadraticmodel *dst = (convexquadraticmodel*)_dst; convexquadraticmodel *src = (convexquadraticmodel*)_src; dst->n = src->n; dst->k = src->k; dst->alpha = src->alpha; dst->tau = src->tau; dst->theta = src->theta; ae_matrix_init_copy(&dst->a, &src->a, _state); ae_matrix_init_copy(&dst->q, &src->q, _state); ae_vector_init_copy(&dst->b, &src->b, _state); ae_vector_init_copy(&dst->r, &src->r, _state); ae_vector_init_copy(&dst->xc, &src->xc, _state); ae_vector_init_copy(&dst->d, &src->d, _state); ae_vector_init_copy(&dst->activeset, &src->activeset, _state); ae_matrix_init_copy(&dst->tq2dense, &src->tq2dense, _state); ae_matrix_init_copy(&dst->tk2, &src->tk2, _state); ae_vector_init_copy(&dst->tq2diag, &src->tq2diag, _state); ae_vector_init_copy(&dst->tq1, &src->tq1, _state); ae_vector_init_copy(&dst->tk1, &src->tk1, _state); dst->tq0 = src->tq0; dst->tk0 = src->tk0; ae_vector_init_copy(&dst->txc, &src->txc, _state); ae_vector_init_copy(&dst->tb, &src->tb, _state); dst->nfree = src->nfree; dst->ecakind = src->ecakind; ae_matrix_init_copy(&dst->ecadense, &src->ecadense, _state); ae_matrix_init_copy(&dst->eq, &src->eq, _state); ae_matrix_init_copy(&dst->eccm, &src->eccm, _state); ae_vector_init_copy(&dst->ecadiag, &src->ecadiag, _state); ae_vector_init_copy(&dst->eb, &src->eb, _state); dst->ec = src->ec; ae_vector_init_copy(&dst->tmp0, &src->tmp0, _state); ae_vector_init_copy(&dst->tmp1, &src->tmp1, _state); ae_vector_init_copy(&dst->tmpg, &src->tmpg, _state); ae_matrix_init_copy(&dst->tmp2, &src->tmp2, _state); dst->ismaintermchanged = src->ismaintermchanged; dst->issecondarytermchanged = src->issecondarytermchanged; dst->islineartermchanged = src->islineartermchanged; dst->isactivesetchanged = src->isactivesetchanged; } void _convexquadraticmodel_clear(void* _p) { convexquadraticmodel *p = (convexquadraticmodel*)_p; ae_touch_ptr((void*)p); ae_matrix_clear(&p->a); ae_matrix_clear(&p->q); ae_vector_clear(&p->b); ae_vector_clear(&p->r); ae_vector_clear(&p->xc); 
ae_vector_clear(&p->d); ae_vector_clear(&p->activeset); ae_matrix_clear(&p->tq2dense); ae_matrix_clear(&p->tk2); ae_vector_clear(&p->tq2diag); ae_vector_clear(&p->tq1); ae_vector_clear(&p->tk1); ae_vector_clear(&p->txc); ae_vector_clear(&p->tb); ae_matrix_clear(&p->ecadense); ae_matrix_clear(&p->eq); ae_matrix_clear(&p->eccm); ae_vector_clear(&p->ecadiag); ae_vector_clear(&p->eb); ae_vector_clear(&p->tmp0); ae_vector_clear(&p->tmp1); ae_vector_clear(&p->tmpg); ae_matrix_clear(&p->tmp2); } void _convexquadraticmodel_destroy(void* _p) { convexquadraticmodel *p = (convexquadraticmodel*)_p; ae_touch_ptr((void*)p); ae_matrix_destroy(&p->a); ae_matrix_destroy(&p->q); ae_vector_destroy(&p->b); ae_vector_destroy(&p->r); ae_vector_destroy(&p->xc); ae_vector_destroy(&p->d); ae_vector_destroy(&p->activeset); ae_matrix_destroy(&p->tq2dense); ae_matrix_destroy(&p->tk2); ae_vector_destroy(&p->tq2diag); ae_vector_destroy(&p->tq1); ae_vector_destroy(&p->tk1); ae_vector_destroy(&p->txc); ae_vector_destroy(&p->tb); ae_matrix_destroy(&p->ecadense); ae_matrix_destroy(&p->eq); ae_matrix_destroy(&p->eccm); ae_vector_destroy(&p->ecadiag); ae_vector_destroy(&p->eb); ae_vector_destroy(&p->tmp0); ae_vector_destroy(&p->tmp1); ae_vector_destroy(&p->tmpg); ae_matrix_destroy(&p->tmp2); } /************************************************************************* This subroutine is used to prepare threshold value which will be used for trimming of the target function (see comments on TrimFunction() for more information). This function accepts only one parameter: function value at the starting point. It returns threshold which will be used for trimming. -- ALGLIB -- Copyright 10.05.2011 by Bochkanov Sergey *************************************************************************/ void trimprepare(double f, double* threshold, ae_state *_state) { *threshold = 0; *threshold = 10*(ae_fabs(f, _state)+1); } /************************************************************************* This subroutine is used to "trim" target function, i.e. to do following transformation: {F_tr, G_tr} = {F, G} if F<Threshold, {F_tr, G_tr} = {Threshold, 0} if F>=Threshold. Such transformation allows us to solve problems with singularities by redefining function in such way that it becomes bounded from above. -- ALGLIB -- Copyright 10.05.2011 by Bochkanov Sergey *************************************************************************/ void trimfunction(double* f, /* Real */ ae_vector* g, ae_int_t n, double threshold, ae_state *_state) { ae_int_t i; if( ae_fp_greater_eq(*f,threshold) ) { *f = threshold; for(i=0; i<=n-1; i++) { g->ptr.p_double[i] = 0.0; } } } /************************************************************************* This function enforces boundary constraints in the X. This function correctly (although a bit inefficient) handles BL[i] which are -INF and BU[i] which are +INF. We have NMain+NSlack dimensional X, with first NMain components bounded by BL/BU, and next NSlack ones bounded by non-negativity constraints. INPUT PARAMETERS X - array[NMain+NSlack], point BL - array[NMain], lower bounds (may contain -INF, when bound is not present) HaveBL - array[NMain], if HaveBL[i] is False, then i-th bound is not present BU - array[NMain], upper bounds (may contain +INF, when bound is not present) HaveBU - array[NMain], if HaveBU[i] is False, then i-th bound is not present OUTPUT PARAMETERS X - X with all constraints being enforced It returns True when constraints are consistent, False - when constraints are inconsistent.
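EXAMPLE (illustrative values): for NMain=1, NSlack=1, both bounds present, BL=[0.0], BU=[2.0], the point X=[-1.0,-3.0] is clipped to X=[0.0,0.0] and True is returned; for inconsistent bounds BL=[3.0], BU=[2.0] the function returns False.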
-- ALGLIB -- Copyright 10.01.2012 by Bochkanov Sergey *************************************************************************/ ae_bool enforceboundaryconstraints(/* Real */ ae_vector* x, /* Real */ ae_vector* bl, /* Boolean */ ae_vector* havebl, /* Real */ ae_vector* bu, /* Boolean */ ae_vector* havebu, ae_int_t nmain, ae_int_t nslack, ae_state *_state) { ae_int_t i; ae_bool result; result = ae_false; for(i=0; i<=nmain-1; i++) { if( (havebl->ptr.p_bool[i]&&havebu->ptr.p_bool[i])&&ae_fp_greater(bl->ptr.p_double[i],bu->ptr.p_double[i]) ) { return result; } if( havebl->ptr.p_bool[i]&&ae_fp_less(x->ptr.p_double[i],bl->ptr.p_double[i]) ) { x->ptr.p_double[i] = bl->ptr.p_double[i]; } if( havebu->ptr.p_bool[i]&&ae_fp_greater(x->ptr.p_double[i],bu->ptr.p_double[i]) ) { x->ptr.p_double[i] = bu->ptr.p_double[i]; } } for(i=0; i<=nslack-1; i++) { if( ae_fp_less(x->ptr.p_double[nmain+i],(double)(0)) ) { x->ptr.p_double[nmain+i] = (double)(0); } } result = ae_true; return result; } /************************************************************************* This function projects gradient into feasible area of boundary constrained optimization problem. X can be infeasible with respect to boundary constraints. We have NMain+NSlack dimensional X, with first NMain components bounded by BL/BU, and next NSlack ones bounded by non-negativity constraints. INPUT PARAMETERS X - array[NMain+NSlack], point G - array[NMain+NSlack], gradient BL - lower bounds (may contain -INF, when bound is not present) HaveBL - if HaveBL[i] is False, then i-th bound is not present BU - upper bounds (may contain +INF, when bound is not present) HaveBU - if HaveBU[i] is False, then i-th bound is not present OUTPUT PARAMETERS G - projection of G. Components of G which satisfy one of the following (1) (X[I]<=BndL[I]) and (G[I]>0), OR (2) (X[I]>=BndU[I]) and (G[I]<0) are replaced by zeros. NOTE 1: this function assumes that constraints are feasible. It throws exception otherwise. NOTE 2: in fact, projection of ANTI-gradient is calculated, because this function trims components of -G which points outside of the feasible area. However, working with -G is considered confusing, because all optimization source work with G. 
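EXAMPLE: if X[0]<=BndL[0] (lower bound active or violated) and G[0]>0, then the anti-gradient component -G[0] points below the lower bound, so G[0] is replaced by zero; components whose anti-gradient points into the feasible area are left unchanged.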
-- ALGLIB -- Copyright 10.01.2012 by Bochkanov Sergey *************************************************************************/ void projectgradientintobc(/* Real */ ae_vector* x, /* Real */ ae_vector* g, /* Real */ ae_vector* bl, /* Boolean */ ae_vector* havebl, /* Real */ ae_vector* bu, /* Boolean */ ae_vector* havebu, ae_int_t nmain, ae_int_t nslack, ae_state *_state) { ae_int_t i; for(i=0; i<=nmain-1; i++) { ae_assert((!havebl->ptr.p_bool[i]||!havebu->ptr.p_bool[i])||ae_fp_less_eq(bl->ptr.p_double[i],bu->ptr.p_double[i]), "ProjectGradientIntoBC: internal error (infeasible constraints)", _state); if( (havebl->ptr.p_bool[i]&&ae_fp_less_eq(x->ptr.p_double[i],bl->ptr.p_double[i]))&&ae_fp_greater(g->ptr.p_double[i],(double)(0)) ) { g->ptr.p_double[i] = (double)(0); } if( (havebu->ptr.p_bool[i]&&ae_fp_greater_eq(x->ptr.p_double[i],bu->ptr.p_double[i]))&&ae_fp_less(g->ptr.p_double[i],(double)(0)) ) { g->ptr.p_double[i] = (double)(0); } } for(i=0; i<=nslack-1; i++) { if( ae_fp_less_eq(x->ptr.p_double[nmain+i],(double)(0))&&ae_fp_greater(g->ptr.p_double[nmain+i],(double)(0)) ) { g->ptr.p_double[nmain+i] = (double)(0); } } } /************************************************************************* Given a) initial point X0[NMain+NSlack] (feasible with respect to bound constraints) b) step vector alpha*D[NMain+NSlack] c) boundary constraints BndL[NMain], BndU[NMain] d) implicit non-negativity constraints for slack variables this function calculates bound on the step length subject to boundary constraints. It returns: * MaxStepLen - such step length that X0+MaxStepLen*alpha*D is exactly at the boundary given by constraints * VariableToFreeze - index of the constraint to be activated, 0 <= VariableToFreeze < NMain+NSlack * ValueToFreeze - value of the corresponding constraint. Notes: * it is possible that several constraints can be activated by the step at once. In such cases only one constraint is returned. It is caller responsibility to check other constraints. This function makes sure that we activate at least one constraint, and everything else is the responsibility of the caller. * steps smaller than MaxStepLen still can activate constraints due to numerical errors. Thus purpose of this function is not to guard against accidental activation of the constraints - quite the reverse, its purpose is to activate at least constraint upon performing step which is too long. * in case there is no constraints to activate, we return negative VariableToFreeze and zero MaxStepLen and ValueToFreeze. * this function assumes that constraints are consistent; it throws exception otherwise. INPUT PARAMETERS X - array[NMain+NSlack], point. Must be feasible with respect to bound constraints (exception will be thrown otherwise) D - array[NMain+NSlack], step direction alpha - scalar multiplier before D, alpha<>0 BndL - lower bounds, array[NMain] (may contain -INF, when bound is not present) HaveBndL - array[NMain], if HaveBndL[i] is False, then i-th bound is not present BndU - array[NMain], upper bounds (may contain +INF, when bound is not present) HaveBndU - array[NMain], if HaveBndU[i] is False, then i-th bound is not present NMain - number of main variables NSlack - number of slack variables OUTPUT PARAMETERS VariableToFreeze: * negative value = step is unbounded, ValueToFreeze=0, MaxStepLen=0. * non-negative value = at least one constraint, given by this parameter, will be activated upon performing maximum step. 
ValueToFreeze- value of the variable which will be constrained MaxStepLen - maximum length of the step. Can be zero when step vector looks outside of the feasible area. -- ALGLIB -- Copyright 10.01.2012 by Bochkanov Sergey *************************************************************************/ void calculatestepbound(/* Real */ ae_vector* x, /* Real */ ae_vector* d, double alpha, /* Real */ ae_vector* bndl, /* Boolean */ ae_vector* havebndl, /* Real */ ae_vector* bndu, /* Boolean */ ae_vector* havebndu, ae_int_t nmain, ae_int_t nslack, ae_int_t* variabletofreeze, double* valuetofreeze, double* maxsteplen, ae_state *_state) { ae_int_t i; double prevmax; double initval; *variabletofreeze = 0; *valuetofreeze = 0; *maxsteplen = 0; ae_assert(ae_fp_neq(alpha,(double)(0)), "CalculateStepBound: zero alpha", _state); *variabletofreeze = -1; initval = ae_maxrealnumber; *maxsteplen = initval; for(i=0; i<=nmain-1; i++) { if( havebndl->ptr.p_bool[i]&&ae_fp_less(alpha*d->ptr.p_double[i],(double)(0)) ) { ae_assert(ae_fp_greater_eq(x->ptr.p_double[i],bndl->ptr.p_double[i]), "CalculateStepBound: infeasible X", _state); prevmax = *maxsteplen; *maxsteplen = safeminposrv(x->ptr.p_double[i]-bndl->ptr.p_double[i], -alpha*d->ptr.p_double[i], *maxsteplen, _state); if( ae_fp_less(*maxsteplen,prevmax) ) { *variabletofreeze = i; *valuetofreeze = bndl->ptr.p_double[i]; } } if( havebndu->ptr.p_bool[i]&&ae_fp_greater(alpha*d->ptr.p_double[i],(double)(0)) ) { ae_assert(ae_fp_less_eq(x->ptr.p_double[i],bndu->ptr.p_double[i]), "CalculateStepBound: infeasible X", _state); prevmax = *maxsteplen; *maxsteplen = safeminposrv(bndu->ptr.p_double[i]-x->ptr.p_double[i], alpha*d->ptr.p_double[i], *maxsteplen, _state); if( ae_fp_less(*maxsteplen,prevmax) ) { *variabletofreeze = i; *valuetofreeze = bndu->ptr.p_double[i]; } } } for(i=0; i<=nslack-1; i++) { if( ae_fp_less(alpha*d->ptr.p_double[nmain+i],(double)(0)) ) { ae_assert(ae_fp_greater_eq(x->ptr.p_double[nmain+i],(double)(0)), "CalculateStepBound: infeasible X", _state); prevmax = *maxsteplen; *maxsteplen = safeminposrv(x->ptr.p_double[nmain+i], -alpha*d->ptr.p_double[nmain+i], *maxsteplen, _state); if( ae_fp_less(*maxsteplen,prevmax) ) { *variabletofreeze = nmain+i; *valuetofreeze = (double)(0); } } } if( ae_fp_eq(*maxsteplen,initval) ) { *valuetofreeze = (double)(0); *maxsteplen = (double)(0); } } /************************************************************************* This function postprocesses bounded step by: * analysing step length (whether it is equal to MaxStepLen) and activating constraint given by VariableToFreeze if needed * checking for additional bound constraints to activate This function uses final point of the step, quantities calculated by the CalculateStepBound() function. As result, it returns point which is exactly feasible with respect to boundary constraints. 
NOTE 1: this function does NOT handle and check linear equality constraints NOTE 2: when StepTaken=MaxStepLen we always activate at least one constraint INPUT PARAMETERS X - array[NMain+NSlack], final point to postprocess XPrev - array[NMain+NSlack], initial point BndL - lower bounds, array[NMain] (may contain -INF, when bound is not present) HaveBndL - array[NMain], if HaveBndL[i] is False, then i-th bound is not present BndU - array[NMain], upper bounds (may contain +INF, when bound is not present) HaveBndU - array[NMain], if HaveBndU[i] is False, then i-th bound is not present NMain - number of main variables NSlack - number of slack variables VariableToFreeze-result of CalculateStepBound() ValueToFreeze- result of CalculateStepBound() StepTaken - actual step length (actual step is equal to the possibly non-unit step direction vector times this parameter). StepTaken<=MaxStepLen. MaxStepLen - result of CalculateStepBound() OUTPUT PARAMETERS X - point bounded with respect to constraints. components corresponding to active constraints are exactly equal to the boundary values. RESULT: number of constraints activated in addition to previously active ones. Constraints which were DEACTIVATED are ignored (do not influence function value). -- ALGLIB -- Copyright 10.01.2012 by Bochkanov Sergey *************************************************************************/ ae_int_t postprocessboundedstep(/* Real */ ae_vector* x, /* Real */ ae_vector* xprev, /* Real */ ae_vector* bndl, /* Boolean */ ae_vector* havebndl, /* Real */ ae_vector* bndu, /* Boolean */ ae_vector* havebndu, ae_int_t nmain, ae_int_t nslack, ae_int_t variabletofreeze, double valuetofreeze, double steptaken, double maxsteplen, ae_state *_state) { ae_int_t i; ae_bool wasactivated; ae_int_t result; ae_assert(variabletofreeze<0||ae_fp_less_eq(steptaken,maxsteplen), "Assertion failed", _state); /* * Activate constraints */ if( variabletofreeze>=0&&ae_fp_eq(steptaken,maxsteplen) ) { x->ptr.p_double[variabletofreeze] = valuetofreeze; } for(i=0; i<=nmain-1; i++) { if( havebndl->ptr.p_bool[i]&&ae_fp_less(x->ptr.p_double[i],bndl->ptr.p_double[i]) ) { x->ptr.p_double[i] = bndl->ptr.p_double[i]; } if( havebndu->ptr.p_bool[i]&&ae_fp_greater(x->ptr.p_double[i],bndu->ptr.p_double[i]) ) { x->ptr.p_double[i] = bndu->ptr.p_double[i]; } } for(i=0; i<=nslack-1; i++) { if( ae_fp_less_eq(x->ptr.p_double[nmain+i],(double)(0)) ) { x->ptr.p_double[nmain+i] = (double)(0); } } /* * Calculate number of constraints being activated */ result = 0; for(i=0; i<=nmain-1; i++) { wasactivated = ae_fp_neq(x->ptr.p_double[i],xprev->ptr.p_double[i])&&((havebndl->ptr.p_bool[i]&&ae_fp_eq(x->ptr.p_double[i],bndl->ptr.p_double[i]))||(havebndu->ptr.p_bool[i]&&ae_fp_eq(x->ptr.p_double[i],bndu->ptr.p_double[i]))); wasactivated = wasactivated||variabletofreeze==i; if( wasactivated ) { result = result+1; } } for(i=0; i<=nslack-1; i++) { wasactivated = ae_fp_neq(x->ptr.p_double[nmain+i],xprev->ptr.p_double[nmain+i])&&ae_fp_eq(x->ptr.p_double[nmain+i],0.0); wasactivated = wasactivated||variabletofreeze==nmain+i; if( wasactivated ) { result = result+1; } } return result; } /************************************************************************* The purpose of this function is to prevent algorithm from "unsticking" from the active bound constraints because of numerical noise in the gradient or Hessian. It is done by zeroing some components of the search direction D. 
D[i] is zeroed when both (a) and (b) are true: a) corresponding X[i] is exactly at the boundary b) |D[i]*S[i]| <= DropTol*Sqrt(SUM(D[i]^2*S[I]^2)) D can be step direction , antigradient, gradient, or anything similar. Sign of D does not matter, nor matters step length. NOTE 1: boundary constraints are expected to be consistent, as well as X is expected to be feasible. Exception will be thrown otherwise. INPUT PARAMETERS D - array[NMain+NSlack], direction X - array[NMain+NSlack], current point BndL - lower bounds, array[NMain] (may contain -INF, when bound is not present) HaveBndL - array[NMain], if HaveBndL[i] is False, then i-th bound is not present BndU - array[NMain], upper bounds (may contain +INF, when bound is not present) HaveBndU - array[NMain], if HaveBndU[i] is False, then i-th bound is not present S - array[NMain+NSlack], scaling of the variables NMain - number of main variables NSlack - number of slack variables DropTol - drop tolerance, >=0 OUTPUT PARAMETERS X - point bounded with respect to constraints. components corresponding to active constraints are exactly equal to the boundary values. -- ALGLIB -- Copyright 10.01.2012 by Bochkanov Sergey *************************************************************************/ void filterdirection(/* Real */ ae_vector* d, /* Real */ ae_vector* x, /* Real */ ae_vector* bndl, /* Boolean */ ae_vector* havebndl, /* Real */ ae_vector* bndu, /* Boolean */ ae_vector* havebndu, /* Real */ ae_vector* s, ae_int_t nmain, ae_int_t nslack, double droptol, ae_state *_state) { ae_int_t i; double scalednorm; ae_bool isactive; scalednorm = 0.0; for(i=0; i<=nmain+nslack-1; i++) { scalednorm = scalednorm+ae_sqr(d->ptr.p_double[i]*s->ptr.p_double[i], _state); } scalednorm = ae_sqrt(scalednorm, _state); for(i=0; i<=nmain-1; i++) { ae_assert(!havebndl->ptr.p_bool[i]||ae_fp_greater_eq(x->ptr.p_double[i],bndl->ptr.p_double[i]), "FilterDirection: infeasible point", _state); ae_assert(!havebndu->ptr.p_bool[i]||ae_fp_less_eq(x->ptr.p_double[i],bndu->ptr.p_double[i]), "FilterDirection: infeasible point", _state); isactive = (havebndl->ptr.p_bool[i]&&ae_fp_eq(x->ptr.p_double[i],bndl->ptr.p_double[i]))||(havebndu->ptr.p_bool[i]&&ae_fp_eq(x->ptr.p_double[i],bndu->ptr.p_double[i])); if( isactive&&ae_fp_less_eq(ae_fabs(d->ptr.p_double[i]*s->ptr.p_double[i], _state),droptol*scalednorm) ) { d->ptr.p_double[i] = 0.0; } } for(i=0; i<=nslack-1; i++) { ae_assert(ae_fp_greater_eq(x->ptr.p_double[nmain+i],(double)(0)), "FilterDirection: infeasible point", _state); if( ae_fp_eq(x->ptr.p_double[nmain+i],(double)(0))&&ae_fp_less_eq(ae_fabs(d->ptr.p_double[nmain+i]*s->ptr.p_double[nmain+i], _state),droptol*scalednorm) ) { d->ptr.p_double[nmain+i] = 0.0; } } } /************************************************************************* This function returns number of bound constraints whose state was changed (either activated or deactivated) when making step from XPrev to X. Constraints are considered: * active - when we are exactly at the boundary * inactive - when we are not at the boundary You should note that antigradient direction is NOT taken into account when we make decions on the constraint status. INPUT PARAMETERS X - array[NMain+NSlack], final point. Must be feasible with respect to bound constraints. XPrev - array[NMain+NSlack], initial point. Must be feasible with respect to bound constraints. 
BndL - lower bounds, array[NMain] (may contain -INF, when bound is not present) HaveBndL - array[NMain], if HaveBndL[i] is False, then i-th bound is not present BndU - array[NMain], upper bounds (may contain +INF, when bound is not present) HaveBndU - array[NMain], if HaveBndU[i] is False, then i-th bound is not present NMain - number of main variables NSlack - number of slack variables RESULT: number of constraints whose state was changed. -- ALGLIB -- Copyright 10.01.2012 by Bochkanov Sergey *************************************************************************/ ae_int_t numberofchangedconstraints(/* Real */ ae_vector* x, /* Real */ ae_vector* xprev, /* Real */ ae_vector* bndl, /* Boolean */ ae_vector* havebndl, /* Real */ ae_vector* bndu, /* Boolean */ ae_vector* havebndu, ae_int_t nmain, ae_int_t nslack, ae_state *_state) { ae_int_t i; ae_bool statuschanged; ae_int_t result; result = 0; for(i=0; i<=nmain-1; i++) { if( ae_fp_neq(x->ptr.p_double[i],xprev->ptr.p_double[i]) ) { statuschanged = ae_false; if( havebndl->ptr.p_bool[i]&&(ae_fp_eq(x->ptr.p_double[i],bndl->ptr.p_double[i])||ae_fp_eq(xprev->ptr.p_double[i],bndl->ptr.p_double[i])) ) { statuschanged = ae_true; } if( havebndu->ptr.p_bool[i]&&(ae_fp_eq(x->ptr.p_double[i],bndu->ptr.p_double[i])||ae_fp_eq(xprev->ptr.p_double[i],bndu->ptr.p_double[i])) ) { statuschanged = ae_true; } if( statuschanged ) { result = result+1; } } } for(i=0; i<=nslack-1; i++) { if( ae_fp_neq(x->ptr.p_double[nmain+i],xprev->ptr.p_double[nmain+i])&&(ae_fp_eq(x->ptr.p_double[nmain+i],(double)(0))||ae_fp_eq(xprev->ptr.p_double[nmain+i],(double)(0))) ) { result = result+1; } } return result; } /************************************************************************* This function finds feasible point of (NMain+NSlack)-dimensional problem subject to NMain explicit boundary constraints (some constraints can be omitted), NSlack implicit non-negativity constraints, K linear equality constraints. INPUT PARAMETERS X - array[NMain+NSlack], initial point. BndL - lower bounds, array[NMain] (may contain -INF, when bound is not present) HaveBndL - array[NMain], if HaveBndL[i] is False, then i-th bound is not present BndU - array[NMain], upper bounds (may contain +INF, when bound is not present) HaveBndU - array[NMain], if HaveBndU[i] is False, then i-th bound is not present NMain - number of main variables NSlack - number of slack variables CE - array[K,NMain+NSlack+1], equality constraints CE*x=b. Rows contain constraints, first NMain+NSlack columns contain coefficients before X[], last column contain right part. K - number of linear constraints EpsI - infeasibility (error in the right part) allowed in the solution OUTPUT PARAMETERS: X - feasible point or best infeasible point found before algorithm termination QPIts - number of QP iterations (for debug purposes) GPAIts - number of GPA iterations (for debug purposes) RESULT: True in case X is feasible, False - if it is infeasible. 
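The quantity that drives every stage of this algorithm is the feasibility error Sqrt(SUM((CE[i,:]*x-b[i])^2)) together with the gradient of F(x)=0.5*||CE*x-b||^2, which equals CE'*(CE*x-b). A minimal standalone sketch of both on plain C arrays (row-major ce with nvars+1 columns, the last column holding the right part, mirroring the layout described above; all names are illustrative):

#include <math.h>
#include <stddef.h>

/* Returns Sqrt(SUM((CE[i,:]*x-b[i])^2)) and stores the gradient of
   F(x)=0.5*||CE*x-b||^2, i.e. G = CE'*(CE*x-b), into g[0..nvars-1].
   Each row of ce has nvars+1 entries; the last one is the right part b[i]. */
static double feasibility_error_and_grad(const double *ce, size_t k, size_t nvars,
                                         const double *x, double *g)
{
    size_t i, j;
    double err2 = 0.0;
    for(j=0; j<nvars; j++)
        g[j] = 0.0;
    for(i=0; i<k; i++)
    {
        const double *row = ce + i*(nvars+1);
        double v = -row[nvars];
        for(j=0; j<nvars; j++)
            v += row[j]*x[j];
        err2 += v*v;
        for(j=0; j<nvars; j++)
            g[j] += v*row[j];
    }
    return sqrt(err2);
}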
-- ALGLIB -- Copyright 20.01.2012 by Bochkanov Sergey *************************************************************************/ ae_bool findfeasiblepoint(/* Real */ ae_vector* x, /* Real */ ae_vector* bndl, /* Boolean */ ae_vector* havebndl, /* Real */ ae_vector* bndu, /* Boolean */ ae_vector* havebndu, ae_int_t nmain, ae_int_t nslack, /* Real */ ae_matrix* ce, ae_int_t k, double epsi, ae_int_t* qpits, ae_int_t* gpaits, ae_state *_state) { ae_frame _frame_block; ae_matrix _ce; ae_int_t i; ae_int_t j; ae_int_t idx0; ae_int_t idx1; ae_vector permx; ae_vector xn; ae_vector xa; ae_vector newtonstep; ae_vector g; ae_vector pg; ae_matrix a; double armijostep; double armijobeststep; double armijobestfeas; double v; double mx; double feaserr; double feaserr0; double feaserr1; double feasold; double feasnew; double pgnorm; double vn; double vd; double stp; ae_int_t vartofreeze; double valtofreeze; double maxsteplen; ae_bool werechangesinconstraints; ae_bool stage1isover; ae_bool converged; ae_vector activeconstraints; ae_vector tmpk; ae_vector colnorms; ae_int_t nactive; ae_int_t nfree; ae_int_t nsvd; ae_vector p1; ae_vector p2; apbuffers buf; ae_vector w; ae_vector s; ae_matrix u; ae_matrix vt; ae_int_t itscount; ae_int_t itswithintolerance; ae_int_t maxitswithintolerance; ae_int_t badits; ae_int_t maxbadits; ae_int_t gparuns; ae_int_t maxarmijoruns; ae_bool result; ae_frame_make(_state, &_frame_block); ae_matrix_init_copy(&_ce, ce, _state); ce = &_ce; *qpits = 0; *gpaits = 0; ae_vector_init(&permx, 0, DT_REAL, _state); ae_vector_init(&xn, 0, DT_REAL, _state); ae_vector_init(&xa, 0, DT_REAL, _state); ae_vector_init(&newtonstep, 0, DT_REAL, _state); ae_vector_init(&g, 0, DT_REAL, _state); ae_vector_init(&pg, 0, DT_REAL, _state); ae_matrix_init(&a, 0, 0, DT_REAL, _state); ae_vector_init(&activeconstraints, 0, DT_REAL, _state); ae_vector_init(&tmpk, 0, DT_REAL, _state); ae_vector_init(&colnorms, 0, DT_REAL, _state); ae_vector_init(&p1, 0, DT_INT, _state); ae_vector_init(&p2, 0, DT_INT, _state); _apbuffers_init(&buf, _state); ae_vector_init(&w, 0, DT_REAL, _state); ae_vector_init(&s, 0, DT_REAL, _state); ae_matrix_init(&u, 0, 0, DT_REAL, _state); ae_matrix_init(&vt, 0, 0, DT_REAL, _state); maxitswithintolerance = 3; maxbadits = 3; maxarmijoruns = 5; *qpits = 0; *gpaits = 0; /* * Initial enforcement of the feasibility with respect to boundary constraints * NOTE: after this block we assume that boundary constraints are consistent. 
*/ if( !enforceboundaryconstraints(x, bndl, havebndl, bndu, havebndu, nmain, nslack, _state) ) { result = ae_false; ae_frame_leave(_state); return result; } if( k==0 ) { /* * No linear constraints, we can exit right now */ result = ae_true; ae_frame_leave(_state); return result; } /* * Scale rows of CE in such way that max(CE[i,0..nmain+nslack-1])=1 for any i=0..k-1 */ for(i=0; i<=k-1; i++) { v = 0.0; for(j=0; j<=nmain+nslack-1; j++) { v = ae_maxreal(v, ae_fabs(ce->ptr.pp_double[i][j], _state), _state); } if( ae_fp_neq(v,(double)(0)) ) { v = 1/v; ae_v_muld(&ce->ptr.pp_double[i][0], 1, ae_v_len(0,nmain+nslack), v); } } /* * Allocate temporaries */ ae_vector_set_length(&xn, nmain+nslack, _state); ae_vector_set_length(&xa, nmain+nslack, _state); ae_vector_set_length(&permx, nmain+nslack, _state); ae_vector_set_length(&g, nmain+nslack, _state); ae_vector_set_length(&pg, nmain+nslack, _state); ae_vector_set_length(&tmpk, k, _state); ae_matrix_set_length(&a, k, nmain+nslack, _state); ae_vector_set_length(&activeconstraints, nmain+nslack, _state); ae_vector_set_length(&newtonstep, nmain+nslack, _state); ae_vector_set_length(&s, nmain+nslack, _state); ae_vector_set_length(&colnorms, nmain+nslack, _state); for(i=0; i<=nmain+nslack-1; i++) { s.ptr.p_double[i] = 1.0; colnorms.ptr.p_double[i] = 0.0; for(j=0; j<=k-1; j++) { colnorms.ptr.p_double[i] = colnorms.ptr.p_double[i]+ae_sqr(ce->ptr.pp_double[j][i], _state); } } /* * K>0, we have linear equality constraints combined with bound constraints. * * Try to find feasible point as minimizer of the quadratic function * F(x) = 0.5*||CE*x-b||^2 = 0.5*x'*(CE'*CE)*x - (b'*CE)*x + 0.5*b'*b * subject to boundary constraints given by BL, BU and non-negativity of * the slack variables. BTW, we drop constant term because it does not * actually influences on the solution. * * Below we will assume that K>0. */ itswithintolerance = 0; badits = 0; itscount = 0; for(;;) { /* * Stage 0: check for exact convergence */ converged = ae_true; feaserr = optserv_feasibilityerror(ce, x, nmain, nslack, k, _state); for(i=0; i<=k-1; i++) { /* * Calculate MX - maximum term in the left part * * Terminate if error in the right part is not greater than 100*Eps*MX. * * IMPORTANT: we must perform check for non-strict inequality, i.e. to use <= instead of <. * it will allow us to easily handle situations with zero rows of CE. * * NOTE: it is important to calculate feasibility error with dedicated * function. Once we had a situation when usage of "inline" code * resulted in different numerical values calculated at different * parts of program for exactly same X. However, this value is * essential for algorithm's ability to terminate before entering * infinite loop, so reproducibility of numerical results is very * important. 
*/ mx = (double)(0); v = -ce->ptr.pp_double[i][nmain+nslack]; for(j=0; j<=nmain+nslack-1; j++) { mx = ae_maxreal(mx, ae_fabs(ce->ptr.pp_double[i][j]*x->ptr.p_double[j], _state), _state); v = v+ce->ptr.pp_double[i][j]*x->ptr.p_double[j]; } converged = converged&&ae_fp_less_eq(ae_fabs(v, _state),100*ae_machineepsilon*mx); } feaserr0 = feaserr; if( converged ) { result = ae_fp_less_eq(feaserr,epsi); ae_frame_leave(_state); return result; } /* * Stage 1: equality constrained quadratic programming * * * treat active bound constraints as equality ones (constraint is considered * active when we are at the boundary, independently of the antigradient direction) * * calculate unrestricted Newton step to point XM (which may be infeasible) * calculate MaxStepLen = largest step in direction of XM which retains feasibility. * * perform bounded step from X to XN: * a) XN=XM (if XM is feasible) * b) XN=X-MaxStepLen*(XM-X) (otherwise) * * X := XN * * if XM (Newton step subject to currently active constraints) was feasible, goto Stage 2 * * repeat Stage 1 * * NOTE 1: in order to solve constrained qudratic subproblem we will have to reorder * variables in such way that ones corresponding to inactive constraints will * be first, and active ones will be last in the list. CE and X are now * [ xi ] * separated into two parts: CE = [CEi CEa], x = [ ], where CEi/Xi correspond * [ xa ] * to INACTIVE constraints, and CEa/Xa correspond to the ACTIVE ones. * * Now, instead of F=0.5*x'*(CE'*CE)*x - (b'*CE)*x + 0.5*b'*b, we have * F(xi) = 0.5*(CEi*xi,CEi*xi) + (CEa*xa-b,CEi*xi) + (0.5*CEa*xa-b,CEa*xa). * Here xa is considered constant, i.e. we optimize with respect to xi, leaving xa fixed. * * We can solve it by performing SVD of CEi and calculating pseudoinverse of the * Hessian matrix. Of course, we do NOT calculate pseudoinverse explicitly - we * just use singular vectors to perform implicit multiplication by it. * */ for(;;) { /* * Calculate G - gradient subject to equality constraints, * multiply it by inverse of the Hessian diagonal to obtain initial * step vector. * * Bound step subject to constraints which can be activated, * run Armijo search with increasing step size. * Search is terminated when feasibility error stops to decrease. * * NOTE: it is important to test for "stops to decrease" instead * of "starts to increase" in order to correctly handle cases with * zero CE. 
*/ armijobeststep = 0.0; armijobestfeas = 0.0; for(i=0; i<=nmain+nslack-1; i++) { g.ptr.p_double[i] = (double)(0); } for(i=0; i<=k-1; i++) { v = ae_v_dotproduct(&ce->ptr.pp_double[i][0], 1, &x->ptr.p_double[0], 1, ae_v_len(0,nmain+nslack-1)); v = v-ce->ptr.pp_double[i][nmain+nslack]; armijobestfeas = armijobestfeas+ae_sqr(v, _state); ae_v_addd(&g.ptr.p_double[0], 1, &ce->ptr.pp_double[i][0], 1, ae_v_len(0,nmain+nslack-1), v); } armijobestfeas = ae_sqrt(armijobestfeas, _state); for(i=0; i<=nmain-1; i++) { if( havebndl->ptr.p_bool[i]&&ae_fp_eq(x->ptr.p_double[i],bndl->ptr.p_double[i]) ) { g.ptr.p_double[i] = 0.0; } if( havebndu->ptr.p_bool[i]&&ae_fp_eq(x->ptr.p_double[i],bndu->ptr.p_double[i]) ) { g.ptr.p_double[i] = 0.0; } } for(i=0; i<=nslack-1; i++) { if( ae_fp_eq(x->ptr.p_double[nmain+i],0.0) ) { g.ptr.p_double[nmain+i] = 0.0; } } v = 0.0; for(i=0; i<=nmain+nslack-1; i++) { if( ae_fp_neq(ae_sqr(colnorms.ptr.p_double[i], _state),(double)(0)) ) { newtonstep.ptr.p_double[i] = -g.ptr.p_double[i]/ae_sqr(colnorms.ptr.p_double[i], _state); } else { newtonstep.ptr.p_double[i] = 0.0; } v = v+ae_sqr(newtonstep.ptr.p_double[i], _state); } if( ae_fp_eq(v,(double)(0)) ) { /* * Constrained gradient is zero, QP iterations are over */ break; } calculatestepbound(x, &newtonstep, 1.0, bndl, havebndl, bndu, havebndu, nmain, nslack, &vartofreeze, &valtofreeze, &maxsteplen, _state); if( vartofreeze>=0&&ae_fp_eq(maxsteplen,(double)(0)) ) { /* * Can not perform step, QP iterations are over */ break; } if( vartofreeze>=0 ) { armijostep = ae_minreal(1.0, maxsteplen, _state); } else { armijostep = (double)(1); } for(;;) { ae_v_move(&xa.ptr.p_double[0], 1, &x->ptr.p_double[0], 1, ae_v_len(0,nmain+nslack-1)); ae_v_addd(&xa.ptr.p_double[0], 1, &newtonstep.ptr.p_double[0], 1, ae_v_len(0,nmain+nslack-1), armijostep); enforceboundaryconstraints(&xa, bndl, havebndl, bndu, havebndu, nmain, nslack, _state); feaserr = 0.0; for(i=0; i<=k-1; i++) { v = ae_v_dotproduct(&ce->ptr.pp_double[i][0], 1, &xa.ptr.p_double[0], 1, ae_v_len(0,nmain+nslack-1)); v = v-ce->ptr.pp_double[i][nmain+nslack]; feaserr = feaserr+ae_sqr(v, _state); } feaserr = ae_sqrt(feaserr, _state); if( ae_fp_greater_eq(feaserr,armijobestfeas) ) { break; } armijobestfeas = feaserr; armijobeststep = armijostep; armijostep = 2.0*armijostep; } ae_v_addd(&x->ptr.p_double[0], 1, &newtonstep.ptr.p_double[0], 1, ae_v_len(0,nmain+nslack-1), armijobeststep); enforceboundaryconstraints(x, bndl, havebndl, bndu, havebndu, nmain, nslack, _state); /* * Determine number of active and free constraints */ nactive = 0; for(i=0; i<=nmain-1; i++) { activeconstraints.ptr.p_double[i] = (double)(0); if( havebndl->ptr.p_bool[i]&&ae_fp_eq(x->ptr.p_double[i],bndl->ptr.p_double[i]) ) { activeconstraints.ptr.p_double[i] = (double)(1); } if( havebndu->ptr.p_bool[i]&&ae_fp_eq(x->ptr.p_double[i],bndu->ptr.p_double[i]) ) { activeconstraints.ptr.p_double[i] = (double)(1); } if( ae_fp_greater(activeconstraints.ptr.p_double[i],(double)(0)) ) { nactive = nactive+1; } } for(i=0; i<=nslack-1; i++) { activeconstraints.ptr.p_double[nmain+i] = (double)(0); if( ae_fp_eq(x->ptr.p_double[nmain+i],0.0) ) { activeconstraints.ptr.p_double[nmain+i] = (double)(1); } if( ae_fp_greater(activeconstraints.ptr.p_double[nmain+i],(double)(0)) ) { nactive = nactive+1; } } nfree = nmain+nslack-nactive; if( nfree==0 ) { break; } *qpits = *qpits+1; /* * Reorder variables */ tagsortbuf(&activeconstraints, nmain+nslack, &p1, &p2, &buf, _state); for(i=0; i<=k-1; i++) { for(j=0; j<=nmain+nslack-1; j++) { 
a.ptr.pp_double[i][j] = ce->ptr.pp_double[i][j]; } } for(j=0; j<=nmain+nslack-1; j++) { permx.ptr.p_double[j] = x->ptr.p_double[j]; } for(j=0; j<=nmain+nslack-1; j++) { if( p2.ptr.p_int[j]!=j ) { idx0 = p2.ptr.p_int[j]; idx1 = j; for(i=0; i<=k-1; i++) { v = a.ptr.pp_double[i][idx0]; a.ptr.pp_double[i][idx0] = a.ptr.pp_double[i][idx1]; a.ptr.pp_double[i][idx1] = v; } v = permx.ptr.p_double[idx0]; permx.ptr.p_double[idx0] = permx.ptr.p_double[idx1]; permx.ptr.p_double[idx1] = v; } } /* * Calculate (unprojected) gradient: * G(xi) = CEi'*(CEi*xi + CEa*xa - b) */ for(i=0; i<=nfree-1; i++) { g.ptr.p_double[i] = (double)(0); } for(i=0; i<=k-1; i++) { v = ae_v_dotproduct(&a.ptr.pp_double[i][0], 1, &permx.ptr.p_double[0], 1, ae_v_len(0,nmain+nslack-1)); tmpk.ptr.p_double[i] = v-ce->ptr.pp_double[i][nmain+nslack]; } for(i=0; i<=k-1; i++) { v = tmpk.ptr.p_double[i]; ae_v_addd(&g.ptr.p_double[0], 1, &a.ptr.pp_double[i][0], 1, ae_v_len(0,nfree-1), v); } /* * Calculate Newton step using SVD of CEi: * F(xi) = 0.5*xi'*H*xi + g'*xi (Taylor decomposition) * XN = -H^(-1)*g (new point, solution of the QP subproblem) * H = CEi'*CEi * CEi = U*W*V' (SVD of CEi) * H = V*W^2*V' * H^(-1) = V*W^(-2)*V' * step = -V*W^(-2)*V'*g (it is better to perform multiplication from right to left) * * NOTE 1: we do NOT need left singular vectors to perform Newton step. */ nsvd = ae_minint(k, nfree, _state); if( !rmatrixsvd(&a, k, nfree, 0, 1, 2, &w, &u, &vt, _state) ) { result = ae_false; ae_frame_leave(_state); return result; } for(i=0; i<=nsvd-1; i++) { v = ae_v_dotproduct(&vt.ptr.pp_double[i][0], 1, &g.ptr.p_double[0], 1, ae_v_len(0,nfree-1)); tmpk.ptr.p_double[i] = v; } for(i=0; i<=nsvd-1; i++) { /* * It is important to have strict ">" in order to correctly * handle zero singular values. */ if( ae_fp_greater(ae_sqr(w.ptr.p_double[i], _state),ae_sqr(w.ptr.p_double[0], _state)*(nmain+nslack)*ae_machineepsilon) ) { tmpk.ptr.p_double[i] = tmpk.ptr.p_double[i]/ae_sqr(w.ptr.p_double[i], _state); } else { tmpk.ptr.p_double[i] = (double)(0); } } for(i=0; i<=nmain+nslack-1; i++) { newtonstep.ptr.p_double[i] = (double)(0); } for(i=0; i<=nsvd-1; i++) { v = tmpk.ptr.p_double[i]; ae_v_subd(&newtonstep.ptr.p_double[0], 1, &vt.ptr.pp_double[i][0], 1, ae_v_len(0,nfree-1), v); } for(j=nmain+nslack-1; j>=0; j--) { if( p2.ptr.p_int[j]!=j ) { idx0 = p2.ptr.p_int[j]; idx1 = j; v = newtonstep.ptr.p_double[idx0]; newtonstep.ptr.p_double[idx0] = newtonstep.ptr.p_double[idx1]; newtonstep.ptr.p_double[idx1] = v; } } /* * NewtonStep contains Newton step subject to active bound constraints. * * Such step leads us to the minimizer of the equality constrained F, * but such minimizer may be infeasible because some constraints which * are inactive at the initial point can be violated at the solution. * * Thus, we perform optimization in two stages: * a) perform bounded Newton step, i.e. step in the Newton direction * until activation of the first constraint * b) in case (MaxStepLen>0)and(MaxStepLen<1), perform additional iteration * of the Armijo line search in the rest of the Newton direction. 
*/ calculatestepbound(x, &newtonstep, 1.0, bndl, havebndl, bndu, havebndu, nmain, nslack, &vartofreeze, &valtofreeze, &maxsteplen, _state); if( vartofreeze>=0&&ae_fp_eq(maxsteplen,(double)(0)) ) { /* * Activation of the constraints prevent us from performing step, * QP iterations are over */ break; } if( vartofreeze>=0 ) { v = ae_minreal(1.0, maxsteplen, _state); } else { v = 1.0; } ae_v_moved(&xn.ptr.p_double[0], 1, &newtonstep.ptr.p_double[0], 1, ae_v_len(0,nmain+nslack-1), v); ae_v_add(&xn.ptr.p_double[0], 1, &x->ptr.p_double[0], 1, ae_v_len(0,nmain+nslack-1)); postprocessboundedstep(&xn, x, bndl, havebndl, bndu, havebndu, nmain, nslack, vartofreeze, valtofreeze, v, maxsteplen, _state); if( ae_fp_greater(maxsteplen,(double)(0))&&ae_fp_less(maxsteplen,(double)(1)) ) { /* * Newton step was restricted by activation of the constraints, * perform Armijo iteration. * * Initial estimate for best step is zero step. We try different * step sizes, from the 1-MaxStepLen (residual of the full Newton * step) to progressively smaller and smaller steps. */ armijobeststep = 0.0; armijobestfeas = 0.0; for(i=0; i<=k-1; i++) { v = ae_v_dotproduct(&ce->ptr.pp_double[i][0], 1, &xn.ptr.p_double[0], 1, ae_v_len(0,nmain+nslack-1)); v = v-ce->ptr.pp_double[i][nmain+nslack]; armijobestfeas = armijobestfeas+ae_sqr(v, _state); } armijobestfeas = ae_sqrt(armijobestfeas, _state); armijostep = 1-maxsteplen; for(j=0; j<=maxarmijoruns-1; j++) { ae_v_move(&xa.ptr.p_double[0], 1, &xn.ptr.p_double[0], 1, ae_v_len(0,nmain+nslack-1)); ae_v_addd(&xa.ptr.p_double[0], 1, &newtonstep.ptr.p_double[0], 1, ae_v_len(0,nmain+nslack-1), armijostep); enforceboundaryconstraints(&xa, bndl, havebndl, bndu, havebndu, nmain, nslack, _state); feaserr = 0.0; for(i=0; i<=k-1; i++) { v = ae_v_dotproduct(&ce->ptr.pp_double[i][0], 1, &xa.ptr.p_double[0], 1, ae_v_len(0,nmain+nslack-1)); v = v-ce->ptr.pp_double[i][nmain+nslack]; feaserr = feaserr+ae_sqr(v, _state); } feaserr = ae_sqrt(feaserr, _state); if( ae_fp_less(feaserr,armijobestfeas) ) { armijobestfeas = feaserr; armijobeststep = armijostep; } armijostep = 0.5*armijostep; } ae_v_move(&xa.ptr.p_double[0], 1, &xn.ptr.p_double[0], 1, ae_v_len(0,nmain+nslack-1)); ae_v_addd(&xa.ptr.p_double[0], 1, &newtonstep.ptr.p_double[0], 1, ae_v_len(0,nmain+nslack-1), armijobeststep); enforceboundaryconstraints(&xa, bndl, havebndl, bndu, havebndu, nmain, nslack, _state); } else { /* * Armijo iteration is not performed */ ae_v_move(&xa.ptr.p_double[0], 1, &xn.ptr.p_double[0], 1, ae_v_len(0,nmain+nslack-1)); } stage1isover = ae_fp_greater_eq(maxsteplen,(double)(1))||ae_fp_eq(maxsteplen,(double)(0)); /* * Calculate feasibility errors for old and new X. * These quantinies are used for debugging purposes only. * However, we can leave them in release code because performance impact is insignificant. * * Update X. Exit if needed. 
*/ feasold = (double)(0); feasnew = (double)(0); for(i=0; i<=k-1; i++) { v = ae_v_dotproduct(&ce->ptr.pp_double[i][0], 1, &x->ptr.p_double[0], 1, ae_v_len(0,nmain+nslack-1)); feasold = feasold+ae_sqr(v-ce->ptr.pp_double[i][nmain+nslack], _state); v = ae_v_dotproduct(&ce->ptr.pp_double[i][0], 1, &xa.ptr.p_double[0], 1, ae_v_len(0,nmain+nslack-1)); feasnew = feasnew+ae_sqr(v-ce->ptr.pp_double[i][nmain+nslack], _state); } feasold = ae_sqrt(feasold, _state); feasnew = ae_sqrt(feasnew, _state); if( ae_fp_greater_eq(feasnew,feasold) ) { break; } ae_v_move(&x->ptr.p_double[0], 1, &xa.ptr.p_double[0], 1, ae_v_len(0,nmain+nslack-1)); if( stage1isover ) { break; } } /* * Stage 2: gradient projection algorithm (GPA) * * * calculate feasibility error (with respect to linear equality constraints) * * calculate gradient G of F, project it into feasible area (G => PG) * * exit if norm(PG) is exactly zero or feasibility error is smaller than EpsC * * let XM be exact minimum of F along -PG (XM may be infeasible). * calculate MaxStepLen = largest step in direction of -PG which retains feasibility. * * perform bounded step from X to XN: * a) XN=XM (if XM is feasible) * b) XN=X-MaxStepLen*PG (otherwise) * * X := XN * * stop after specified number of iterations or when no new constraints was activated * * NOTES: * * grad(F) = (CE'*CE)*x - (b'*CE)^T * * CE[i] denotes I-th row of CE * * XM = X+stp*(-PG) where stp=(grad(F(X)),PG)/(CE*PG,CE*PG). * Here PG is a projected gradient, but in fact it can be arbitrary non-zero * direction vector - formula for minimum of F along PG still will be correct. */ werechangesinconstraints = ae_false; for(gparuns=1; gparuns<=k; gparuns++) { /* * calculate feasibility error and G */ feaserr = (double)(0); for(i=0; i<=nmain+nslack-1; i++) { g.ptr.p_double[i] = (double)(0); } for(i=0; i<=k-1; i++) { /* * G += CE[i]^T * (CE[i]*x-b[i]) */ v = ae_v_dotproduct(&ce->ptr.pp_double[i][0], 1, &x->ptr.p_double[0], 1, ae_v_len(0,nmain+nslack-1)); v = v-ce->ptr.pp_double[i][nmain+nslack]; feaserr = feaserr+ae_sqr(v, _state); ae_v_addd(&g.ptr.p_double[0], 1, &ce->ptr.pp_double[i][0], 1, ae_v_len(0,nmain+nslack-1), v); } /* * project G, filter it (strip numerical noise) */ ae_v_move(&pg.ptr.p_double[0], 1, &g.ptr.p_double[0], 1, ae_v_len(0,nmain+nslack-1)); projectgradientintobc(x, &pg, bndl, havebndl, bndu, havebndu, nmain, nslack, _state); filterdirection(&pg, x, bndl, havebndl, bndu, havebndu, &s, nmain, nslack, 1.0E-9, _state); for(i=0; i<=nmain+nslack-1; i++) { if( ae_fp_neq(ae_sqr(colnorms.ptr.p_double[i], _state),(double)(0)) ) { pg.ptr.p_double[i] = pg.ptr.p_double[i]/ae_sqr(colnorms.ptr.p_double[i], _state); } else { pg.ptr.p_double[i] = 0.0; } } /* * Check GNorm and feasibility. * Exit when GNorm is exactly zero. */ pgnorm = ae_v_dotproduct(&pg.ptr.p_double[0], 1, &pg.ptr.p_double[0], 1, ae_v_len(0,nmain+nslack-1)); feaserr = ae_sqrt(feaserr, _state); pgnorm = ae_sqrt(pgnorm, _state); if( ae_fp_eq(pgnorm,(double)(0)) ) { result = ae_fp_less_eq(feaserr,epsi); ae_frame_leave(_state); return result; } /* * calculate planned step length */ vn = ae_v_dotproduct(&g.ptr.p_double[0], 1, &pg.ptr.p_double[0], 1, ae_v_len(0,nmain+nslack-1)); vd = (double)(0); for(i=0; i<=k-1; i++) { v = ae_v_dotproduct(&ce->ptr.pp_double[i][0], 1, &pg.ptr.p_double[0], 1, ae_v_len(0,nmain+nslack-1)); vd = vd+ae_sqr(v, _state); } stp = vn/vd; /* * Calculate step bound. 
* Perform bounded step and post-process it */ calculatestepbound(x, &pg, -1.0, bndl, havebndl, bndu, havebndu, nmain, nslack, &vartofreeze, &valtofreeze, &maxsteplen, _state); if( vartofreeze>=0&&ae_fp_eq(maxsteplen,(double)(0)) ) { result = ae_false; ae_frame_leave(_state); return result; } if( vartofreeze>=0 ) { v = ae_minreal(stp, maxsteplen, _state); } else { v = stp; } ae_v_move(&xn.ptr.p_double[0], 1, &x->ptr.p_double[0], 1, ae_v_len(0,nmain+nslack-1)); ae_v_subd(&xn.ptr.p_double[0], 1, &pg.ptr.p_double[0], 1, ae_v_len(0,nmain+nslack-1), v); postprocessboundedstep(&xn, x, bndl, havebndl, bndu, havebndu, nmain, nslack, vartofreeze, valtofreeze, v, maxsteplen, _state); /* * update X * check stopping criteria */ werechangesinconstraints = werechangesinconstraints||numberofchangedconstraints(&xn, x, bndl, havebndl, bndu, havebndu, nmain, nslack, _state)>0; ae_v_move(&x->ptr.p_double[0], 1, &xn.ptr.p_double[0], 1, ae_v_len(0,nmain+nslack-1)); *gpaits = *gpaits+1; if( !werechangesinconstraints ) { break; } } /* * Stage 3: decide to stop algorithm or not to stop * * 1. we can stop when last GPA run did NOT changed constraints status. * It means that we've found final set of the active constraints even * before GPA made its run. And it means that Newton step moved us to * the minimum subject to the present constraints. * Depending on feasibility error, True or False is returned. */ feaserr = optserv_feasibilityerror(ce, x, nmain, nslack, k, _state); feaserr1 = feaserr; if( ae_fp_greater_eq(feaserr1,feaserr0*(1-1000*ae_machineepsilon)) ) { inc(&badits, _state); } else { badits = 0; } if( ae_fp_less_eq(feaserr,epsi) ) { inc(&itswithintolerance, _state); } else { itswithintolerance = 0; } if( (!werechangesinconstraints||itswithintolerance>=maxitswithintolerance)||badits>=maxbadits ) { result = ae_fp_less_eq(feaserr,epsi); ae_frame_leave(_state); return result; } itscount = itscount+1; } ae_frame_leave(_state); return result; } /************************************************************************* This function checks that input derivatives are right. First it scales parameters DF0 and DF1 from segment [A;B] to [0;1]. Then it builds Hermite spline and derivative of it in 0.5. Search scale as Max(DF0,DF1, |F0-F1|). Right derivative has to satisfy condition: |H-F|/S<=0,01, |H'-F'|/S<=0,01. INPUT PARAMETERS: F0 - function's value in X-TestStep point; DF0 - derivative's value in X-TestStep point; F1 - function's value in X+TestStep point; DF1 - derivative's value in X+TestStep point; F - testing function's value; DF - testing derivative's value; Width- width of verification segment. RESULT: If input derivatives is right then function returns true, else function returns false. 
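A worked instance of this test (a standalone sketch, not part of the ALGLIB sources): for f(u)=u^3 verified on the segment [0,1], the endpoint data are F0=0, DF0=0, F1=1, DF1=3 and the midpoint data are F=0.125, DF=0.75. With the derivatives rescaled by Width=1, the Hermite value and derivative at the midpoint, H = 0.5*F0+0.125*DF0+0.5*F1-0.125*DF1 and H' = -1.5*F0-0.25*DF0+1.5*F1-0.25*DF1, reproduce F and Width*DF exactly (a cubic is represented exactly by its Hermite interpolant), so the check passes:

#include <assert.h>
#include <math.h>

int main(void)
{
    double width = 1.0;
    double f0 = 0.0,   df0 = 0.0;    /* f, f' at the left end,  u=0   */
    double f1 = 1.0,   df1 = 3.0;    /* f, f' at the right end, u=1   */
    double f  = 0.125, df  = 0.75;   /* f, f' at the midpoint,  u=0.5 */
    double s, h, dh;

    df  = width*df;                  /* rescale derivatives to the    */
    df0 = width*df0;                 /* unit parametrization          */
    df1 = width*df1;
    s  = fmax(fmax(fabs(df0), fabs(df1)), fabs(f1-f0));
    h  =  0.5*f0 + 0.125*df0 + 0.5*f1 - 0.125*df1;  /* Hermite value at 0.5      */
    dh = -1.5*f0 - 0.25*df0  + 1.5*f1 - 0.25*df1;   /* Hermite derivative at 0.5 */
    assert(fabs(h-f)/s   <= 0.001);                 /* both residuals are exactly zero here */
    assert(fabs(dh-df)/s <= 0.001);
    return 0;
}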
-- ALGLIB -- Copyright 29.05.2012 by Bochkanov Sergey *************************************************************************/ ae_bool derivativecheck(double f0, double df0, double f1, double df1, double f, double df, double width, ae_state *_state) { double s; double h; double dh; ae_bool result; df = width*df; df0 = width*df0; df1 = width*df1; s = ae_maxreal(ae_maxreal(ae_fabs(df0, _state), ae_fabs(df1, _state), _state), ae_fabs(f1-f0, _state), _state); h = 0.5*f0+0.125*df0+0.5*f1-0.125*df1; dh = -1.5*f0-0.25*df0+1.5*f1-0.25*df1; if( ae_fp_neq(s,(double)(0)) ) { if( ae_fp_greater(ae_fabs(h-f, _state)/s,0.001)||ae_fp_greater(ae_fabs(dh-df, _state)/s,0.001) ) { result = ae_false; return result; } } else { if( ae_fp_neq(h-f,0.0)||ae_fp_neq(dh-df,0.0) ) { result = ae_false; return result; } } result = ae_true; return result; } /************************************************************************* Having quadratic target function f(x) = 0.5*x'*A*x + b'*x + penaltyfactor*0.5*(C*x-b)'*(C*x-b) and its parabolic model along direction D F(x0+alpha*D) = D2*alpha^2 + D1*alpha this function estimates numerical errors in the coefficients of the model. It is important that this function does NOT calculate D1/D2 - it only estimates numerical errors introduced during evaluation and compares their magnitudes against magnitudes of numerical errors. As result, one of three outcomes is returned for each coefficient: * "true" coefficient is almost surely positive * "true" coefficient is almost surely negative * numerical errors in coefficient are so large that it can not be reliably distinguished from zero INPUT PARAMETERS: AbsASum - SUM(|A[i,j]|) AbsASum2- SUM(A[i,j]^2) MB - max(|B|) MX - max(|X|) MD - max(|D|) D1 - linear coefficient D2 - quadratic coefficient OUTPUT PARAMETERS: D1Est - estimate of D1 sign, accounting for possible numerical errors: * >0 means "almost surely positive" (D1>0 and large) * <0 means "almost surely negative" (D1<0 and large) * =0 means "pessimistic estimate of numerical errors in D1 is larger than magnitude of D1 itself; it is impossible to reliably distinguish D1 from zero". D2Est - estimate of D2 sign, accounting for possible numerical errors: * >0 means "almost surely positive" (D2>0 and large) * <0 means "almost surely negative" (D2<0 and large) * =0 means "pessimistic estimate of numerical errors in D2 is larger than magnitude of D2 itself; it is impossible to reliably distinguish D2 from zero". -- ALGLIB -- Copyright 14.05.2014 by Bochkanov Sergey *************************************************************************/ void estimateparabolicmodel(double absasum, double absasum2, double mx, double mb, double md, double d1, double d2, ae_int_t* d1est, ae_int_t* d2est, ae_state *_state) { double d1esterror; double d2esterror; double eps; double e1; double e2; *d1est = 0; *d2est = 0; /* * Error estimates: * * * error in D1=d'*(A*x+b) is estimated as * ED1 = eps*MAX_ABS(D)*(MAX_ABS(X)*ENORM(A)+MAX_ABS(B)) * * error in D2=0.5*d'*A*d is estimated as * ED2 = eps*MAX_ABS(D)^2*ENORM(A) * * Here ENORM(A) is some pseudo-norm which reflects the way numerical * error accumulates during addition. Two ways of accumulation are * possible - worst case (errors always increase) and mean-case (errors * may cancel each other). 
We calculate geometrical average of both: * * ENORM_WORST(A) = SUM(|A[i,j]|) error in N-term sum grows as O(N) * * ENORM_MEAN(A) = SQRT(SUM(A[i,j]^2)) error in N-term sum grows as O(sqrt(N)) * * ENORM(A) = SQRT(ENORM_WORST(A)*ENORM_MEAN(A)) */ eps = 4*ae_machineepsilon; e1 = eps*md*(mx*absasum+mb); e2 = eps*md*(mx*ae_sqrt(absasum2, _state)+mb); d1esterror = ae_sqrt(e1*e2, _state); if( ae_fp_less_eq(ae_fabs(d1, _state),d1esterror) ) { *d1est = 0; } else { *d1est = ae_sign(d1, _state); } e1 = eps*md*md*absasum; e2 = eps*md*md*ae_sqrt(absasum2, _state); d2esterror = ae_sqrt(e1*e2, _state); if( ae_fp_less_eq(ae_fabs(d2, _state),d2esterror) ) { *d2est = 0; } else { *d2est = ae_sign(d2, _state); } } /************************************************************************* This function calculates inexact rank-K preconditioner for Hessian matrix H=D+W'*C*W, where: * H is a Hessian matrix, which is approximated by D/W/C * D is a diagonal matrix with positive entries * W is a rank-K correction * C is a diagonal factor of rank-K correction This preconditioner is inexact but fast - it requires O(N*K) time to be applied. Its main purpose is to be used in barrier/penalty/AUL methods, where ill-conditioning is created by combination of two factors: * simple bounds on variables => ill-conditioned D * general barrier/penalty => correction W with large coefficient C (makes problem ill-conditioned) but W itself is well conditioned. Preconditioner P is calculated by artificially constructing a set of BFGS updates which tries to reproduce behavior of H: * Sk = Wk (k-th row of W) * Yk = (D+Wk'*Ck*Wk)*Sk * Yk/Sk are reordered in ascending order of C[k]*norm(Wk)^2 Here we assume that rows of Wk are orthogonal or nearly orthogonal, which allows us to have O(N*K+K^2) update instead of O(N*K^2) one. Reordering of updates is essential for having good performance on non-orthogonal problems (updates which do not add much curvature are added first, and updates which add very large eigenvalues are added last and override effect of the first updates). On input this function takes direction S and components of H. On output it returns inv(H)*S. -- ALGLIB -- Copyright 30.06.2014 by Bochkanov Sergey *************************************************************************/ void inexactlbfgspreconditioner(/* Real */ ae_vector* s, ae_int_t n, /* Real */ ae_vector* d, /* Real */ ae_vector* c, /* Real */ ae_matrix* w, ae_int_t k, precbuflbfgs* buf, ae_state *_state) { ae_int_t idx; ae_int_t i; ae_int_t j; double v; double v0; double v1; double vx; double vy; rvectorsetlengthatleast(&buf->norms, k, _state); rvectorsetlengthatleast(&buf->alpha, k, _state); rvectorsetlengthatleast(&buf->rho, k, _state); rmatrixsetlengthatleast(&buf->yk, k, n, _state); ivectorsetlengthatleast(&buf->idx, k, _state); /* * Check inputs */ for(i=0; i<=n-1; i++) { ae_assert(ae_fp_greater(d->ptr.p_double[i],(double)(0)), "InexactLBFGSPreconditioner: D[]<=0", _state); } for(i=0; i<=k-1; i++) { ae_assert(ae_fp_greater_eq(c->ptr.p_double[i],(double)(0)), "InexactLBFGSPreconditioner: C[]<0", _state); } /* * Reorder linear terms according to increase of second derivative. * Fill Norms[] array. 
*/ for(idx=0; idx<=k-1; idx++) { v = ae_v_dotproduct(&w->ptr.pp_double[idx][0], 1, &w->ptr.pp_double[idx][0], 1, ae_v_len(0,n-1)); buf->norms.ptr.p_double[idx] = v*c->ptr.p_double[idx]; buf->idx.ptr.p_int[idx] = idx; } tagsortfasti(&buf->norms, &buf->idx, &buf->bufa, &buf->bufb, k, _state); /* * Apply updates */ for(idx=0; idx<=k-1; idx++) { /* * Select update to perform (ordered by ascending of second derivative) */ i = buf->idx.ptr.p_int[idx]; /* * Calculate YK and Rho */ v = ae_v_dotproduct(&w->ptr.pp_double[i][0], 1, &w->ptr.pp_double[i][0], 1, ae_v_len(0,n-1)); v = v*c->ptr.p_double[i]; for(j=0; j<=n-1; j++) { buf->yk.ptr.pp_double[i][j] = (d->ptr.p_double[j]+v)*w->ptr.pp_double[i][j]; } v = 0.0; v0 = 0.0; v1 = 0.0; for(j=0; j<=n-1; j++) { vx = w->ptr.pp_double[i][j]; vy = buf->yk.ptr.pp_double[i][j]; v = v+vx*vy; v0 = v0+vx*vx; v1 = v1+vy*vy; } if( (ae_fp_greater(v,(double)(0))&&ae_fp_greater(v0*v1,(double)(0)))&&ae_fp_greater(v/ae_sqrt(v0*v1, _state),n*10*ae_machineepsilon) ) { buf->rho.ptr.p_double[i] = 1/v; } else { buf->rho.ptr.p_double[i] = 0.0; } } for(idx=k-1; idx>=0; idx--) { /* * Select update to perform (ordered by ascending of second derivative) */ i = buf->idx.ptr.p_int[idx]; /* * Calculate Alpha[] according to L-BFGS algorithm * and update S[] */ v = ae_v_dotproduct(&w->ptr.pp_double[i][0], 1, &s->ptr.p_double[0], 1, ae_v_len(0,n-1)); v = buf->rho.ptr.p_double[i]*v; buf->alpha.ptr.p_double[i] = v; ae_v_subd(&s->ptr.p_double[0], 1, &buf->yk.ptr.pp_double[i][0], 1, ae_v_len(0,n-1), v); } for(j=0; j<=n-1; j++) { s->ptr.p_double[j] = s->ptr.p_double[j]/d->ptr.p_double[j]; } for(idx=0; idx<=k-1; idx++) { /* * Select update to perform (ordered by ascending of second derivative) */ i = buf->idx.ptr.p_int[idx]; /* * Calculate Beta according to L-BFGS algorithm * and update S[] */ v = ae_v_dotproduct(&buf->yk.ptr.pp_double[i][0], 1, &s->ptr.p_double[0], 1, ae_v_len(0,n-1)); v = buf->alpha.ptr.p_double[i]-buf->rho.ptr.p_double[i]*v; ae_v_addd(&s->ptr.p_double[0], 1, &w->ptr.pp_double[i][0], 1, ae_v_len(0,n-1), v); } } /************************************************************************* This function prepares exact low-rank preconditioner for Hessian matrix H=D+W'*C*W, where: * H is a Hessian matrix, which is approximated by D/W/C * D is a diagonal matrix with positive entries * W is a rank-K correction * C is a diagonal factor of rank-K correction, positive semidefinite This preconditioner is exact but relatively slow - it requires O(N*K^2) time to be prepared and O(N*K) time to be applied. It is calculated with the help of Woodbury matrix identity. It should be used as follows: * PrepareLowRankPreconditioner() call PREPARES data structure * subsequent calls to ApplyLowRankPreconditioner() APPLY preconditioner to user-specified search direction. 
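For K=1 the Woodbury identity specializes to the Sherman-Morrison formula, inv(D+c*w*w') = inv(D) - c*(inv(D)*w)*(w'*inv(D))/(1+c*w'*inv(D)*w), which already shows the mechanics exploited here. A minimal standalone sketch of applying inv(H) to a vector for this rank-1 case (plain C arrays; all names are illustrative):

#include <stddef.h>

/* Overwrites s with inv(H)*s for H = diag(d) + c*w*w', using the
   Sherman-Morrison formula (the K=1 case of the Woodbury identity). */
static void apply_rank1_inverse(double *s, const double *d, const double *w,
                                double c, size_t n)
{
    size_t i;
    double wds = 0.0;   /* w' * inv(D) * s */
    double wdw = 0.0;   /* w' * inv(D) * w */
    for(i=0; i<n; i++)
    {
        wds += w[i]*s[i]/d[i];
        wdw += w[i]*w[i]/d[i];
    }
    for(i=0; i<n; i++)
        s[i] = s[i]/d[i] - c*wds/(1.0+c*wdw)*(w[i]/d[i]);
}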
-- ALGLIB -- Copyright 30.06.2014 by Bochkanov Sergey *************************************************************************/ void preparelowrankpreconditioner(/* Real */ ae_vector* d, /* Real */ ae_vector* c, /* Real */ ae_matrix* w, ae_int_t n, ae_int_t k, precbuflowrank* buf, ae_state *_state) { ae_int_t i; ae_int_t j; double v; ae_bool b; /* * Check inputs */ ae_assert(n>0, "PrepareLowRankPreconditioner: N<=0", _state); ae_assert(k>=0, "PrepareLowRankPreconditioner: N<=0", _state); for(i=0; i<=n-1; i++) { ae_assert(ae_fp_greater(d->ptr.p_double[i],(double)(0)), "PrepareLowRankPreconditioner: D[]<=0", _state); } for(i=0; i<=k-1; i++) { ae_assert(ae_fp_greater_eq(c->ptr.p_double[i],(double)(0)), "PrepareLowRankPreconditioner: C[]<0", _state); } /* * Prepare buffer structure; skip zero entries of update. */ rvectorsetlengthatleast(&buf->d, n, _state); rmatrixsetlengthatleast(&buf->v, k, n, _state); rvectorsetlengthatleast(&buf->bufc, k, _state); rmatrixsetlengthatleast(&buf->bufw, k+1, n, _state); buf->n = n; buf->k = 0; for(i=0; i<=k-1; i++) { /* * Estimate magnitude of update row; skip zero rows (either W or C are zero) */ v = 0.0; for(j=0; j<=n-1; j++) { v = v+w->ptr.pp_double[i][j]*w->ptr.pp_double[i][j]; } v = v*c->ptr.p_double[i]; if( ae_fp_eq(v,(double)(0)) ) { continue; } ae_assert(ae_fp_greater(v,(double)(0)), "PrepareLowRankPreconditioner: internal error", _state); /* * Copy non-zero update to buffer */ buf->bufc.ptr.p_double[buf->k] = c->ptr.p_double[i]; for(j=0; j<=n-1; j++) { buf->v.ptr.pp_double[buf->k][j] = w->ptr.pp_double[i][j]; buf->bufw.ptr.pp_double[buf->k][j] = w->ptr.pp_double[i][j]; } inc(&buf->k, _state); } /* * Reset K (for convenience) */ k = buf->k; /* * Prepare diagonal factor; quick exit for K=0 */ for(i=0; i<=n-1; i++) { buf->d.ptr.p_double[i] = 1/d->ptr.p_double[i]; } if( k==0 ) { return; } /* * Use Woodbury matrix identity */ rmatrixsetlengthatleast(&buf->bufz, k, k, _state); for(i=0; i<=k-1; i++) { for(j=0; j<=k-1; j++) { buf->bufz.ptr.pp_double[i][j] = 0.0; } } for(i=0; i<=k-1; i++) { buf->bufz.ptr.pp_double[i][i] = 1/buf->bufc.ptr.p_double[i]; } for(j=0; j<=n-1; j++) { buf->bufw.ptr.pp_double[k][j] = 1/ae_sqrt(d->ptr.p_double[j], _state); } for(i=0; i<=k-1; i++) { for(j=0; j<=n-1; j++) { buf->bufw.ptr.pp_double[i][j] = buf->bufw.ptr.pp_double[i][j]*buf->bufw.ptr.pp_double[k][j]; } } rmatrixgemm(k, k, n, 1.0, &buf->bufw, 0, 0, 0, &buf->bufw, 0, 0, 1, 1.0, &buf->bufz, 0, 0, _state); b = spdmatrixcholeskyrec(&buf->bufz, 0, k, ae_true, &buf->tmp, _state); ae_assert(b, "PrepareLowRankPreconditioner: internal error (Cholesky failure)", _state); rmatrixlefttrsm(k, n, &buf->bufz, 0, 0, ae_true, ae_false, 1, &buf->v, 0, 0, _state); for(i=0; i<=k-1; i++) { for(j=0; j<=n-1; j++) { buf->v.ptr.pp_double[i][j] = buf->v.ptr.pp_double[i][j]*buf->d.ptr.p_double[j]; } } } /************************************************************************* This function apply exact low-rank preconditioner prepared by PrepareLowRankPreconditioner function (see its comments for more information). 
-- ALGLIB -- Copyright 30.06.2014 by Bochkanov Sergey *************************************************************************/ void applylowrankpreconditioner(/* Real */ ae_vector* s, precbuflowrank* buf, ae_state *_state) { ae_int_t n; ae_int_t k; ae_int_t i; ae_int_t j; double v; n = buf->n; k = buf->k; rvectorsetlengthatleast(&buf->tmp, n, _state); for(j=0; j<=n-1; j++) { buf->tmp.ptr.p_double[j] = buf->d.ptr.p_double[j]*s->ptr.p_double[j]; } for(i=0; i<=k-1; i++) { v = 0.0; for(j=0; j<=n-1; j++) { v = v+buf->v.ptr.pp_double[i][j]*s->ptr.p_double[j]; } for(j=0; j<=n-1; j++) { buf->tmp.ptr.p_double[j] = buf->tmp.ptr.p_double[j]-v*buf->v.ptr.pp_double[i][j]; } } for(i=0; i<=n-1; i++) { s->ptr.p_double[i] = buf->tmp.ptr.p_double[i]; } } /************************************************************************* This function calculates feasibility error (square root of sum of squared errors) for a Kx(NMain+NSlack) system of linear equalities. INPUT PARAMETERS: CE - set of K equality constraints, array[K,NMain+NSlack+1] X - candidate point, array [NMain+NSlack] NMain - number of primary variables NSlack - number of slack variables K - number of constraints RESULT: Sqrt(SUM(Err^2)) -- ALGLIB -- Copyright 17.09.2015 by Bochkanov Sergey *************************************************************************/ static double optserv_feasibilityerror(/* Real */ ae_matrix* ce, /* Real */ ae_vector* x, ae_int_t nmain, ae_int_t nslack, ae_int_t k, ae_state *_state) { ae_int_t i; double v; double result; result = 0.0; for(i=0; i<=k-1; i++) { v = ae_v_dotproduct(&ce->ptr.pp_double[i][0], 1, &x->ptr.p_double[0], 1, ae_v_len(0,nmain+nslack-1)); v = v-ce->ptr.pp_double[i][nmain+nslack]; result = result+v*v; } result = ae_sqrt(result, _state); return result; } void _precbuflbfgs_init(void* _p, ae_state *_state) { precbuflbfgs *p = (precbuflbfgs*)_p; ae_touch_ptr((void*)p); ae_vector_init(&p->norms, 0, DT_REAL, _state); ae_vector_init(&p->alpha, 0, DT_REAL, _state); ae_vector_init(&p->rho, 0, DT_REAL, _state); ae_matrix_init(&p->yk, 0, 0, DT_REAL, _state); ae_vector_init(&p->idx, 0, DT_INT, _state); ae_vector_init(&p->bufa, 0, DT_REAL, _state); ae_vector_init(&p->bufb, 0, DT_INT, _state); } void _precbuflbfgs_init_copy(void* _dst, void* _src, ae_state *_state) { precbuflbfgs *dst = (precbuflbfgs*)_dst; precbuflbfgs *src = (precbuflbfgs*)_src; ae_vector_init_copy(&dst->norms, &src->norms, _state); ae_vector_init_copy(&dst->alpha, &src->alpha, _state); ae_vector_init_copy(&dst->rho, &src->rho, _state); ae_matrix_init_copy(&dst->yk, &src->yk, _state); ae_vector_init_copy(&dst->idx, &src->idx, _state); ae_vector_init_copy(&dst->bufa, &src->bufa, _state); ae_vector_init_copy(&dst->bufb, &src->bufb, _state); } void _precbuflbfgs_clear(void* _p) { precbuflbfgs *p = (precbuflbfgs*)_p; ae_touch_ptr((void*)p); ae_vector_clear(&p->norms); ae_vector_clear(&p->alpha); ae_vector_clear(&p->rho); ae_matrix_clear(&p->yk); ae_vector_clear(&p->idx); ae_vector_clear(&p->bufa); ae_vector_clear(&p->bufb); } void _precbuflbfgs_destroy(void* _p) { precbuflbfgs *p = (precbuflbfgs*)_p; ae_touch_ptr((void*)p); ae_vector_destroy(&p->norms); ae_vector_destroy(&p->alpha); ae_vector_destroy(&p->rho); ae_matrix_destroy(&p->yk); ae_vector_destroy(&p->idx); ae_vector_destroy(&p->bufa); ae_vector_destroy(&p->bufb); } void _precbuflowrank_init(void* _p, ae_state *_state) { precbuflowrank *p = (precbuflowrank*)_p; ae_touch_ptr((void*)p); ae_vector_init(&p->d, 0, DT_REAL, _state); ae_matrix_init(&p->v, 0, 0, DT_REAL, _state); 
ae_vector_init(&p->bufc, 0, DT_REAL, _state); ae_matrix_init(&p->bufz, 0, 0, DT_REAL, _state); ae_matrix_init(&p->bufw, 0, 0, DT_REAL, _state); ae_vector_init(&p->tmp, 0, DT_REAL, _state); } void _precbuflowrank_init_copy(void* _dst, void* _src, ae_state *_state) { precbuflowrank *dst = (precbuflowrank*)_dst; precbuflowrank *src = (precbuflowrank*)_src; dst->n = src->n; dst->k = src->k; ae_vector_init_copy(&dst->d, &src->d, _state); ae_matrix_init_copy(&dst->v, &src->v, _state); ae_vector_init_copy(&dst->bufc, &src->bufc, _state); ae_matrix_init_copy(&dst->bufz, &src->bufz, _state); ae_matrix_init_copy(&dst->bufw, &src->bufw, _state); ae_vector_init_copy(&dst->tmp, &src->tmp, _state); } void _precbuflowrank_clear(void* _p) { precbuflowrank *p = (precbuflowrank*)_p; ae_touch_ptr((void*)p); ae_vector_clear(&p->d); ae_matrix_clear(&p->v); ae_vector_clear(&p->bufc); ae_matrix_clear(&p->bufz); ae_matrix_clear(&p->bufw); ae_vector_clear(&p->tmp); } void _precbuflowrank_destroy(void* _p) { precbuflowrank *p = (precbuflowrank*)_p; ae_touch_ptr((void*)p); ae_vector_destroy(&p->d); ae_matrix_destroy(&p->v); ae_vector_destroy(&p->bufc); ae_matrix_destroy(&p->bufz); ae_matrix_destroy(&p->bufw); ae_vector_destroy(&p->tmp); } /************************************************************************* This subroutine is used to initialize SNNLS solver. By default, empty NNLS problem is produced, but we allocated enough space to store problems with NSMax+NDMax columns and NRMax rows. It is good place to provide algorithm with initial estimate of the space requirements, although you may underestimate problem size or even pass zero estimates - in this case buffer variables will be resized automatically when you set NNLS problem. Previously allocated buffer variables are reused as much as possible. This function does not clear structure completely, it tries to preserve as much dynamically allocated memory as possible. -- ALGLIB -- Copyright 10.10.2012 by Bochkanov Sergey *************************************************************************/ void snnlsinit(ae_int_t nsmax, ae_int_t ndmax, ae_int_t nrmax, snnlssolver* s, ae_state *_state) { s->ns = 0; s->nd = 0; s->nr = 0; rmatrixsetlengthatleast(&s->densea, nrmax, ndmax, _state); rmatrixsetlengthatleast(&s->tmpca, nrmax, ndmax, _state); rvectorsetlengthatleast(&s->b, nrmax, _state); bvectorsetlengthatleast(&s->nnc, nsmax+ndmax, _state); s->debugflops = 0.0; s->debugmaxinnerits = 0; } /************************************************************************* This subroutine is used to set NNLS problem: ( [ 1 | ] [ ] [ ] )^2 ( [ 1 | ] [ ] [ ] ) min ( [ 1 | Ad ] * [ x ] - [ b ] ) s.t. x>=0 ( [ | ] [ ] [ ] ) ( [ | ] [ ] [ ] ) where: * identity matrix has NS*NS size (NS<=NR, NS can be zero) * dense matrix Ad has NR*ND size * b is NR*1 vector * x is (NS+ND)*1 vector * all elements of x are non-negative (this constraint can be removed later by calling SNNLSDropNNC() function) Previously allocated buffer variables are reused as much as possible. After you set problem, you can solve it with SNNLSSolve(). INPUT PARAMETERS: S - SNNLS solver, must be initialized with SNNLSInit() call A - array[NR,ND], dense part of the system B - array[NR], right part NS - size of the sparse part of the system, 0<=NS<=NR ND - size of the dense part of the system, ND>=0 NR - rows count, NR>0 NOTE: 1. You can have NS+ND=0, solver will correctly accept such combination and return empty array as problem solution. 
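In the mixed sparse/dense layout described above the i-th residual of the system is r[i] = (x[i] if i&lt;NS, else 0) + Ad[i,:]*x[NS..NS+ND-1] - b[i]; the identity block contributes one term per row and only for the first NS rows. A minimal standalone sketch computing this residual on plain C arrays (row-major ad; names are illustrative):

#include <stddef.h>

/* Residual r = [I|Ad]*x - b for a system whose first NS columns form an
   NS*NS identity block and whose remaining ND columns are the dense
   matrix Ad (row-major, NR rows).                                       */
static void nnls_residual(const double *ad, const double *b, const double *x,
                          size_t ns, size_t nd, size_t nr, double *r)
{
    size_t i, j;
    for(i=0; i<nr; i++)
    {
        double v = (i<ns ? x[i] : 0.0) - b[i];
        for(j=0; j<nd; j++)
            v += ad[i*nd+j]*x[ns+j];
        r[i] = v;
    }
}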
-- ALGLIB -- Copyright 10.10.2012 by Bochkanov Sergey *************************************************************************/ void snnlssetproblem(snnlssolver* s, /* Real */ ae_matrix* a, /* Real */ ae_vector* b, ae_int_t ns, ae_int_t nd, ae_int_t nr, ae_state *_state) { ae_int_t i; ae_assert(nd>=0, "SNNLSSetProblem: ND<0", _state); ae_assert(ns>=0, "SNNLSSetProblem: NS<0", _state); ae_assert(nr>0, "SNNLSSetProblem: NR<=0", _state); ae_assert(ns<=nr, "SNNLSSetProblem: NS>NR", _state); ae_assert(a->rows>=nr||nd==0, "SNNLSSetProblem: rows(A)<NR", _state); ae_assert(a->cols>=nd, "SNNLSSetProblem: cols(A)<ND", _state); ae_assert(b->cnt>=nr, "SNNLSSetProblem: length(B)<NR", _state); s->ns = ns; s->nd = nd; s->nr = nr; if( nd>0 ) { rmatrixsetlengthatleast(&s->densea, nr, nd, _state); for(i=0; i<=nr-1; i++) { ae_v_move(&s->densea.ptr.pp_double[i][0], 1, &a->ptr.pp_double[i][0], 1, ae_v_len(0,nd-1)); } } rvectorsetlengthatleast(&s->b, nr, _state); ae_v_move(&s->b.ptr.p_double[0], 1, &b->ptr.p_double[0], 1, ae_v_len(0,nr-1)); bvectorsetlengthatleast(&s->nnc, ns+nd, _state); for(i=0; i<=ns+nd-1; i++) { s->nnc.ptr.p_bool[i] = ae_true; } } /************************************************************************* This subroutine drops non-negativity constraint from the problem set by SNNLSSetProblem() call. This function must be called AFTER problem is set, because each SetProblem() call resets constraints to their default state (all constraints are present). INPUT PARAMETERS: S - SNNLS solver, must be initialized with SNNLSInit() call, problem must be set with SNNLSSetProblem() call. Idx - constraint index, 0<=IDX<NS+ND -- ALGLIB -- Copyright 10.10.2012 by Bochkanov Sergey *************************************************************************/ void snnlsdropnnc(snnlssolver* s, ae_int_t idx, ae_state *_state) { ae_assert(idx>=0, "SNNLSDropNNC: Idx<0", _state); ae_assert(idx<s->ns+s->nd, "SNNLSDropNNC: Idx>=NS+ND", _state); s->nnc.ptr.p_bool[idx] = ae_false; } /************************************************************************* This subroutine is used to solve NNLS problem. INPUT PARAMETERS: S - SNNLS solver, must be initialized with SNNLSInit() call and problem must be set up with SNNLSSetProblem() call. X - possibly preallocated buffer, automatically resized if needed OUTPUT PARAMETERS: X - array[NS+ND], solution NOTE: 1. You can have NS+ND=0, solver will correctly accept such combination and return empty array as problem solution. 2. Internal field S.DebugFLOPS contains rough estimate of FLOPs used to solve problem. It can be used for debugging purposes. This field is real-valued. -- ALGLIB -- Copyright 10.10.2012 by Bochkanov Sergey *************************************************************************/ void snnlssolve(snnlssolver* s, /* Real */ ae_vector* x, ae_state *_state) { ae_int_t i; ae_int_t ns; ae_int_t nd; ae_int_t nr; ae_bool wasactivation; double lambdav; double v0; double v1; double v; ae_int_t outerits; ae_int_t innerits; ae_int_t maxouterits; double xtol; double kicklength; ae_bool kickneeded; double f0; double f1; double dnrm; ae_int_t actidx; double stp; double stpmax; /* * Prepare */ ns = s->ns; nd = s->nd; nr = s->nr; s->debugflops = 0.0; /* * Handle special cases: * * NS+ND=0 * * ND=0 */ if( ns+nd==0 ) { return; } if( nd==0 ) { rvectorsetlengthatleast(x, ns, _state); for(i=0; i<=ns-1; i++) { x->ptr.p_double[i] = s->b.ptr.p_double[i]; if( s->nnc.ptr.p_bool[i] ) { x->ptr.p_double[i] = ae_maxreal(x->ptr.p_double[i], 0.0, _state); } } return; } /* * Main cycle of BLEIC-SNNLS algorithm. * Below we assume that ND>0. 
*/ rvectorsetlengthatleast(x, ns+nd, _state); rvectorsetlengthatleast(&s->xn, ns+nd, _state); rvectorsetlengthatleast(&s->xp, ns+nd, _state); rvectorsetlengthatleast(&s->g, ns+nd, _state); rvectorsetlengthatleast(&s->d, ns+nd, _state); rvectorsetlengthatleast(&s->r, nr, _state); rvectorsetlengthatleast(&s->diagaa, nd, _state); rvectorsetlengthatleast(&s->regdiag, ns+nd, _state); rvectorsetlengthatleast(&s->dx, ns+nd, _state); for(i=0; i<=ns+nd-1; i++) { x->ptr.p_double[i] = 0.0; s->regdiag.ptr.p_double[i] = 1.0; } lambdav = 1.0E6*ae_machineepsilon; maxouterits = 10; outerits = 0; innerits = 0; xtol = 1.0E3*ae_machineepsilon; kicklength = ae_sqrt(ae_minrealnumber, _state); for(;;) { /* * Initial check for correctness of X */ for(i=0; i<=ns+nd-1; i++) { ae_assert(!s->nnc.ptr.p_bool[i]||ae_fp_greater_eq(x->ptr.p_double[i],(double)(0)), "SNNLS: integrity check failed", _state); } /* * Calculate gradient G and constrained descent direction D */ snnls_funcgradu(s, x, &s->r, &s->g, &f0, _state); for(i=0; i<=ns+nd-1; i++) { if( (s->nnc.ptr.p_bool[i]&&ae_fp_eq(x->ptr.p_double[i],(double)(0)))&&ae_fp_greater(s->g.ptr.p_double[i],(double)(0)) ) { s->d.ptr.p_double[i] = 0.0; } else { s->d.ptr.p_double[i] = -s->g.ptr.p_double[i]; } } /* * Decide whether we need "kick" stage: special stage * that moves us away from boundary constraints which are * not strictly active (i.e. such constraints that x[i]=0.0 and d[i]>0). * * If we need kick stage, we make a kick - and restart iteration. * If not, after this block we can rely on the fact that * for all x[i]=0.0 we have d[i]=0.0 * * NOTE: we do not increase outer iterations counter here */ kickneeded = ae_false; for(i=0; i<=ns+nd-1; i++) { if( (s->nnc.ptr.p_bool[i]&&ae_fp_eq(x->ptr.p_double[i],0.0))&&ae_fp_greater(s->d.ptr.p_double[i],0.0) ) { kickneeded = ae_true; } } if( kickneeded ) { /* * Perform kick. * Restart. * Do not increase iterations counter. */ for(i=0; i<=ns+nd-1; i++) { if( ae_fp_eq(x->ptr.p_double[i],0.0)&&ae_fp_greater(s->d.ptr.p_double[i],0.0) ) { x->ptr.p_double[i] = x->ptr.p_double[i]+kicklength; } } continue; } /* * Newton phase * Reduce problem to constrained triangular form and perform Newton * steps with quick activation of constrants (triangular form is * updated in order to handle changed constraints). */ for(i=0; i<=ns+nd-1; i++) { s->xp.ptr.p_double[i] = x->ptr.p_double[i]; } snnls_trdprepare(s, x, &s->regdiag, lambdav, &s->trdd, &s->trda, &s->tmp0, &s->tmp1, &s->tmp2, &s->tmplq, _state); for(;;) { /* * Skip if debug limit on inner iterations count is turned on. */ if( s->debugmaxinnerits>0&&innerits>=s->debugmaxinnerits ) { break; } /* * Prepare step vector. */ snnls_funcgradu(s, x, &s->r, &s->g, &f0, _state); for(i=0; i<=ns+nd-1; i++) { s->d.ptr.p_double[i] = -s->g.ptr.p_double[i]; if( s->nnc.ptr.p_bool[i]&&ae_fp_eq(x->ptr.p_double[i],0.0) ) { s->d.ptr.p_double[i] = 0.0; } } snnls_trdsolve(&s->trdd, &s->trda, ns, nd, &s->d, _state); /* * Perform unconstrained trial step and compare function values. */ for(i=0; i<=ns+nd-1; i++) { s->xn.ptr.p_double[i] = x->ptr.p_double[i]+s->d.ptr.p_double[i]; } snnls_func(s, &s->xn, &f1, _state); if( ae_fp_greater_eq(f1,f0) ) { break; } /* * Calculate length of D, maximum step and component which is * activated by this step. Break if D is exactly zero. 
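     *
     * The step bound StpMax is the largest T such that x[i]+T*d[i]>=0 holds
     * for every constrained variable with d[i]<0, i.e. the minimum over such
     * i of x[i]/(-d[i]). It is evaluated with SafeMinPosRV() to avoid
     * overflow, and ActIdx remembers which constraint limits the step.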
*/ dnrm = 0.0; for(i=0; i<=ns+nd-1; i++) { dnrm = dnrm+ae_sqr(s->d.ptr.p_double[i], _state); } dnrm = ae_sqrt(dnrm, _state); actidx = -1; stpmax = 1.0E50; for(i=0; i<=ns+nd-1; i++) { if( s->nnc.ptr.p_bool[i]&&ae_fp_less(s->d.ptr.p_double[i],0.0) ) { v = stpmax; stpmax = safeminposrv(x->ptr.p_double[i], -s->d.ptr.p_double[i], stpmax, _state); if( ae_fp_less(stpmax,v) ) { actidx = i; } } } if( ae_fp_eq(dnrm,0.0) ) { break; } /* * Perform constrained step and update X * and triangular model. */ stp = ae_minreal(1.0, stpmax, _state); for(i=0; i<=ns+nd-1; i++) { v = x->ptr.p_double[i]+stp*s->d.ptr.p_double[i]; if( s->nnc.ptr.p_bool[i] ) { v = ae_maxreal(v, 0.0, _state); } s->xn.ptr.p_double[i] = v; } if( ae_fp_eq(stp,stpmax)&&actidx>=0 ) { s->xn.ptr.p_double[actidx] = 0.0; } wasactivation = ae_false; for(i=0; i<=ns+nd-1; i++) { if( ae_fp_eq(s->xn.ptr.p_double[i],0.0)&&ae_fp_neq(x->ptr.p_double[i],0.0) ) { wasactivation = ae_true; snnls_trdfixvariable(&s->trdd, &s->trda, ns, nd, i, &s->tmpcholesky, _state); } } for(i=0; i<=ns+nd-1; i++) { x->ptr.p_double[i] = s->xn.ptr.p_double[i]; } /* * Increment iterations counter. * Terminate if no constraint was activated. */ inc(&innerits, _state); if( !wasactivation ) { break; } } /* * Update outer iterations counter. * * Break if necessary: * * maximum number of outer iterations performed * * relative change in X is small enough */ inc(&outerits, _state); if( outerits>=maxouterits ) { break; } v = (double)(0); for(i=0; i<=ns+nd-1; i++) { v0 = ae_fabs(s->xp.ptr.p_double[i], _state); v1 = ae_fabs(x->ptr.p_double[i], _state); if( ae_fp_neq(v0,(double)(0))||ae_fp_neq(v1,(double)(0)) ) { v = ae_maxreal(v, ae_fabs(x->ptr.p_double[i]-s->xp.ptr.p_double[i], _state)/ae_maxreal(v0, v1, _state), _state); } } if( ae_fp_less_eq(v,xtol) ) { break; } } } /************************************************************************* This function calculates: * residual vector R = A*x-b * unconstrained gradient vector G * function value F = 0.5*|R|^2 R and G must have at least N elements. -- ALGLIB -- Copyright 15.07.2015 by Bochkanov Sergey *************************************************************************/ static void snnls_funcgradu(snnlssolver* s, /* Real */ ae_vector* x, /* Real */ ae_vector* r, /* Real */ ae_vector* g, double* f, ae_state *_state) { ae_int_t i; ae_int_t nr; ae_int_t nd; ae_int_t ns; double v; *f = 0; nr = s->nr; nd = s->nd; ns = s->ns; *f = 0.0; for(i=0; i<=nr-1; i++) { v = ae_v_dotproduct(&s->densea.ptr.pp_double[i][0], 1, &x->ptr.p_double[ns], 1, ae_v_len(0,nd-1)); if( iptr.p_double[i]; } v = v-s->b.ptr.p_double[i]; r->ptr.p_double[i] = v; *f = *f+0.5*v*v; } for(i=0; i<=ns-1; i++) { g->ptr.p_double[i] = r->ptr.p_double[i]; } for(i=ns; i<=ns+nd-1; i++) { g->ptr.p_double[i] = 0.0; } for(i=0; i<=nr-1; i++) { v = r->ptr.p_double[i]; ae_v_addd(&g->ptr.p_double[ns], 1, &s->densea.ptr.pp_double[i][0], 1, ae_v_len(ns,ns+nd-1), v); } } /************************************************************************* This function calculates function value F = 0.5*|R|^2 at X. 
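Here R = [ I | Ad ]*x - b is the residual of the augmented system described
in SNNLSSetProblem() comments, so F is exactly the merit function which is
minimized by SNNLSSolve().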
-- ALGLIB -- Copyright 15.07.2015 by Bochkanov Sergey *************************************************************************/ static void snnls_func(snnlssolver* s, /* Real */ ae_vector* x, double* f, ae_state *_state) { ae_int_t i; ae_int_t nr; ae_int_t nd; ae_int_t ns; double v; *f = 0; nr = s->nr; nd = s->nd; ns = s->ns; *f = 0.0; for(i=0; i<=nr-1; i++) { v = ae_v_dotproduct(&s->densea.ptr.pp_double[i][0], 1, &x->ptr.p_double[ns], 1, ae_v_len(0,nd-1)); if( iptr.p_double[i]; } v = v-s->b.ptr.p_double[i]; *f = *f+0.5*v*v; } } static void snnls_trdprepare(snnlssolver* s, /* Real */ ae_vector* x, /* Real */ ae_vector* diag, double lambdav, /* Real */ ae_vector* trdd, /* Real */ ae_matrix* trda, /* Real */ ae_vector* tmp0, /* Real */ ae_vector* tmp1, /* Real */ ae_vector* tmp2, /* Real */ ae_matrix* tmplq, ae_state *_state) { ae_int_t i; ae_int_t j; ae_int_t ns; ae_int_t nd; ae_int_t nr; double v; double cs; double sn; double r; /* * Prepare */ ns = s->ns; nd = s->nd; nr = s->nr; /* * Triangular reduction */ rvectorsetlengthatleast(trdd, ns, _state); rmatrixsetlengthatleast(trda, ns+nd, nd, _state); rmatrixsetlengthatleast(tmplq, nd, nr+nd, _state); for(i=0; i<=ns-1; i++) { /* * Apply rotation to I-th row and corresponding row of * regularizer. Here V is diagonal element of I-th row, * which is set to 1.0 or 0.0 depending on variable * status (constrained or not). */ v = 1.0; if( s->nnc.ptr.p_bool[i]&&ae_fp_eq(x->ptr.p_double[i],0.0) ) { v = 0.0; } generaterotation(v, lambdav, &cs, &sn, &r, _state); trdd->ptr.p_double[i] = cs*v+sn*lambdav; for(j=0; j<=nd-1; j++) { v = s->densea.ptr.pp_double[i][j]; trda->ptr.pp_double[i][j] = cs*v; tmplq->ptr.pp_double[j][i] = -sn*v; } } for(j=0; j<=nd-1; j++) { for(i=ns; i<=nr-1; i++) { tmplq->ptr.pp_double[j][i] = s->densea.ptr.pp_double[i][j]; } } for(j=0; j<=nd-1; j++) { if( s->nnc.ptr.p_bool[ns+j]&&ae_fp_eq(x->ptr.p_double[ns+j],0.0) ) { /* * Variable is constrained, entire row is set to zero. */ for(i=0; i<=nr-1; i++) { tmplq->ptr.pp_double[j][i] = 0.0; } for(i=0; i<=ns-1; i++) { trda->ptr.pp_double[i][j] = 0.0; } } } for(i=0; i<=nd-1; i++) { for(j=0; j<=nd-1; j++) { tmplq->ptr.pp_double[j][nr+i] = 0.0; } tmplq->ptr.pp_double[i][nr+i] = lambdav*diag->ptr.p_double[i]; } rvectorsetlengthatleast(tmp0, nr+nd+1, _state); rvectorsetlengthatleast(tmp1, nr+nd+1, _state); rvectorsetlengthatleast(tmp2, nr+nd+1, _state); rmatrixlqbasecase(tmplq, nd, nr+nd, tmp0, tmp1, tmp2, _state); for(i=0; i<=nd-1; i++) { if( ae_fp_less(tmplq->ptr.pp_double[i][i],0.0) ) { for(j=i; j<=nd-1; j++) { tmplq->ptr.pp_double[j][i] = -tmplq->ptr.pp_double[j][i]; } } } for(i=0; i<=nd-1; i++) { for(j=0; j<=i; j++) { trda->ptr.pp_double[ns+j][i] = tmplq->ptr.pp_double[i][j]; } } } static void snnls_trdsolve(/* Real */ ae_vector* trdd, /* Real */ ae_matrix* trda, ae_int_t ns, ae_int_t nd, /* Real */ ae_vector* d, ae_state *_state) { ae_int_t i; ae_int_t j; double v; /* * Solve U'*y=d first. * * This section includes two parts: * * solve diagonal part of U' * * solve dense part of U' */ for(i=0; i<=ns-1; i++) { d->ptr.p_double[i] = d->ptr.p_double[i]/trdd->ptr.p_double[i]; v = d->ptr.p_double[i]; for(j=0; j<=nd-1; j++) { d->ptr.p_double[ns+j] = d->ptr.p_double[ns+j]-v*trda->ptr.pp_double[i][j]; } } for(i=0; i<=nd-1; i++) { d->ptr.p_double[ns+i] = d->ptr.p_double[ns+i]/trda->ptr.pp_double[ns+i][i]; v = d->ptr.p_double[ns+i]; for(j=i+1; j<=nd-1; j++) { d->ptr.p_double[ns+j] = d->ptr.p_double[ns+j]-v*trda->ptr.pp_double[ns+i][j]; } } /* * Solve U*x=y then. 
* * This section includes two parts: * * solve trailing triangular part of U * * solve combination of diagonal and dense parts of U */ for(i=nd-1; i>=0; i--) { v = 0.0; for(j=i+1; j<=nd-1; j++) { v = v+trda->ptr.pp_double[ns+i][j]*d->ptr.p_double[ns+j]; } d->ptr.p_double[ns+i] = (d->ptr.p_double[ns+i]-v)/trda->ptr.pp_double[ns+i][i]; } for(i=ns-1; i>=0; i--) { v = 0.0; for(j=0; j<=nd-1; j++) { v = v+trda->ptr.pp_double[i][j]*d->ptr.p_double[ns+j]; } d->ptr.p_double[i] = (d->ptr.p_double[i]-v)/trdd->ptr.p_double[i]; } } static void snnls_trdfixvariable(/* Real */ ae_vector* trdd, /* Real */ ae_matrix* trda, ae_int_t ns, ae_int_t nd, ae_int_t idx, /* Real */ ae_vector* tmp, ae_state *_state) { ae_int_t i; ae_int_t j; ae_int_t k; double cs; double sn; double r; double v; double vv; ae_assert(ns>=0, "TRDFixVariable: integrity error", _state); ae_assert(nd>=0, "TRDFixVariable: integrity error", _state); ae_assert(ns+nd>0, "TRDFixVariable: integrity error", _state); ae_assert(idx>=0, "TRDFixVariable: integrity error", _state); ae_assert(idxptr.p_double[idx] = 1.0; return; } for(j=0; j<=nd-1; j++) { /* * Apply first rotation */ tmp->ptr.p_double[j] = trda->ptr.pp_double[idx][j]; trda->ptr.pp_double[idx][j] = 0.0; } trdd->ptr.p_double[idx] = 1.0; for(i=0; i<=nd-1; i++) { if( ae_fp_neq(tmp->ptr.p_double[i],(double)(0)) ) { /* * Apply subsequent rotations with bottom triangular part of A */ generaterotation(trda->ptr.pp_double[ns+i][i], tmp->ptr.p_double[i], &cs, &sn, &r, _state); for(j=i; j<=nd-1; j++) { v = trda->ptr.pp_double[ns+i][j]; vv = tmp->ptr.p_double[j]; trda->ptr.pp_double[ns+i][j] = v*cs+vv*sn; tmp->ptr.p_double[j] = vv*cs-v*sn; } } } } else { /* * We fix variable in the dense part of the model. It means * that prior to fixing we have: * * ( | ) * ( D | ) * ( | ) * (-----| A ) * ( |0 ) * ( |00 ) * ( |000 ) * ( |0000 ) * ( |00000) * * then we replace idx-th column by zeros: * * ( | 0 ) * ( D | 0 ) * ( | 0 ) * (-----|A 0 A) * ( | 0 ) * ( | 0 ) * ( | 0 ) * * and append row with unit element to bottom, in order to * regularize problem * * ( | 0 ) * ( D | 0 ) * ( | 0 ) * (-----|A 0 A) * ( | 0 ) * ( | 0 ) * ( | 0 ) * (00000|00100) <- appended * * and then we nullify this row by applying rotations: * * (D 0 | ) * ( 0 | ) * ( 0 D| ) * (-----| A ) * ( | ) * ( | ) <- first rotation is applied here * ( | ) <- subsequent rotations are applied to rows below * ( 0 | 0 ) <- as result, row becomes zero * * and triangular structure is preserved. 
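     *
     * In both branches the appended/replaced row is nullified by a sequence
     * of Givens rotations produced by GenerateRotation() and applied to the
     * rows of the bottom triangular part of A, so fixing one variable costs
     * O(ND^2) operations and keeps the factorization triangular.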
*/ k = idx-ns; for(i=0; i<=ns+nd-1; i++) { trda->ptr.pp_double[i][k] = 0.0; } for(j=k+1; j<=nd-1; j++) { /* * Apply first rotation */ tmp->ptr.p_double[j] = trda->ptr.pp_double[idx][j]; trda->ptr.pp_double[idx][j] = 0.0; } trda->ptr.pp_double[idx][k] = 1.0; for(i=k+1; i<=nd-1; i++) { if( ae_fp_neq(tmp->ptr.p_double[i],(double)(0)) ) { /* * Apply subsequent rotations with bottom triangular part of A */ generaterotation(trda->ptr.pp_double[ns+i][i], tmp->ptr.p_double[i], &cs, &sn, &r, _state); for(j=i; j<=nd-1; j++) { v = trda->ptr.pp_double[ns+i][j]; vv = tmp->ptr.p_double[j]; trda->ptr.pp_double[ns+i][j] = v*cs+vv*sn; tmp->ptr.p_double[j] = vv*cs-v*sn; } } } } } void _snnlssolver_init(void* _p, ae_state *_state) { snnlssolver *p = (snnlssolver*)_p; ae_touch_ptr((void*)p); ae_matrix_init(&p->densea, 0, 0, DT_REAL, _state); ae_vector_init(&p->b, 0, DT_REAL, _state); ae_vector_init(&p->nnc, 0, DT_BOOL, _state); ae_vector_init(&p->xn, 0, DT_REAL, _state); ae_vector_init(&p->xp, 0, DT_REAL, _state); ae_matrix_init(&p->tmpca, 0, 0, DT_REAL, _state); ae_matrix_init(&p->tmplq, 0, 0, DT_REAL, _state); ae_matrix_init(&p->trda, 0, 0, DT_REAL, _state); ae_vector_init(&p->trdd, 0, DT_REAL, _state); ae_vector_init(&p->crb, 0, DT_REAL, _state); ae_vector_init(&p->g, 0, DT_REAL, _state); ae_vector_init(&p->d, 0, DT_REAL, _state); ae_vector_init(&p->dx, 0, DT_REAL, _state); ae_vector_init(&p->diagaa, 0, DT_REAL, _state); ae_vector_init(&p->cb, 0, DT_REAL, _state); ae_vector_init(&p->cx, 0, DT_REAL, _state); ae_vector_init(&p->cborg, 0, DT_REAL, _state); ae_vector_init(&p->tmpcholesky, 0, DT_REAL, _state); ae_vector_init(&p->r, 0, DT_REAL, _state); ae_vector_init(&p->regdiag, 0, DT_REAL, _state); ae_vector_init(&p->tmp0, 0, DT_REAL, _state); ae_vector_init(&p->tmp1, 0, DT_REAL, _state); ae_vector_init(&p->tmp2, 0, DT_REAL, _state); ae_vector_init(&p->rdtmprowmap, 0, DT_INT, _state); } void _snnlssolver_init_copy(void* _dst, void* _src, ae_state *_state) { snnlssolver *dst = (snnlssolver*)_dst; snnlssolver *src = (snnlssolver*)_src; dst->ns = src->ns; dst->nd = src->nd; dst->nr = src->nr; ae_matrix_init_copy(&dst->densea, &src->densea, _state); ae_vector_init_copy(&dst->b, &src->b, _state); ae_vector_init_copy(&dst->nnc, &src->nnc, _state); dst->debugflops = src->debugflops; dst->debugmaxinnerits = src->debugmaxinnerits; ae_vector_init_copy(&dst->xn, &src->xn, _state); ae_vector_init_copy(&dst->xp, &src->xp, _state); ae_matrix_init_copy(&dst->tmpca, &src->tmpca, _state); ae_matrix_init_copy(&dst->tmplq, &src->tmplq, _state); ae_matrix_init_copy(&dst->trda, &src->trda, _state); ae_vector_init_copy(&dst->trdd, &src->trdd, _state); ae_vector_init_copy(&dst->crb, &src->crb, _state); ae_vector_init_copy(&dst->g, &src->g, _state); ae_vector_init_copy(&dst->d, &src->d, _state); ae_vector_init_copy(&dst->dx, &src->dx, _state); ae_vector_init_copy(&dst->diagaa, &src->diagaa, _state); ae_vector_init_copy(&dst->cb, &src->cb, _state); ae_vector_init_copy(&dst->cx, &src->cx, _state); ae_vector_init_copy(&dst->cborg, &src->cborg, _state); ae_vector_init_copy(&dst->tmpcholesky, &src->tmpcholesky, _state); ae_vector_init_copy(&dst->r, &src->r, _state); ae_vector_init_copy(&dst->regdiag, &src->regdiag, _state); ae_vector_init_copy(&dst->tmp0, &src->tmp0, _state); ae_vector_init_copy(&dst->tmp1, &src->tmp1, _state); ae_vector_init_copy(&dst->tmp2, &src->tmp2, _state); ae_vector_init_copy(&dst->rdtmprowmap, &src->rdtmprowmap, _state); } void _snnlssolver_clear(void* _p) { snnlssolver *p = (snnlssolver*)_p; 
ae_touch_ptr((void*)p); ae_matrix_clear(&p->densea); ae_vector_clear(&p->b); ae_vector_clear(&p->nnc); ae_vector_clear(&p->xn); ae_vector_clear(&p->xp); ae_matrix_clear(&p->tmpca); ae_matrix_clear(&p->tmplq); ae_matrix_clear(&p->trda); ae_vector_clear(&p->trdd); ae_vector_clear(&p->crb); ae_vector_clear(&p->g); ae_vector_clear(&p->d); ae_vector_clear(&p->dx); ae_vector_clear(&p->diagaa); ae_vector_clear(&p->cb); ae_vector_clear(&p->cx); ae_vector_clear(&p->cborg); ae_vector_clear(&p->tmpcholesky); ae_vector_clear(&p->r); ae_vector_clear(&p->regdiag); ae_vector_clear(&p->tmp0); ae_vector_clear(&p->tmp1); ae_vector_clear(&p->tmp2); ae_vector_clear(&p->rdtmprowmap); } void _snnlssolver_destroy(void* _p) { snnlssolver *p = (snnlssolver*)_p; ae_touch_ptr((void*)p); ae_matrix_destroy(&p->densea); ae_vector_destroy(&p->b); ae_vector_destroy(&p->nnc); ae_vector_destroy(&p->xn); ae_vector_destroy(&p->xp); ae_matrix_destroy(&p->tmpca); ae_matrix_destroy(&p->tmplq); ae_matrix_destroy(&p->trda); ae_vector_destroy(&p->trdd); ae_vector_destroy(&p->crb); ae_vector_destroy(&p->g); ae_vector_destroy(&p->d); ae_vector_destroy(&p->dx); ae_vector_destroy(&p->diagaa); ae_vector_destroy(&p->cb); ae_vector_destroy(&p->cx); ae_vector_destroy(&p->cborg); ae_vector_destroy(&p->tmpcholesky); ae_vector_destroy(&p->r); ae_vector_destroy(&p->regdiag); ae_vector_destroy(&p->tmp0); ae_vector_destroy(&p->tmp1); ae_vector_destroy(&p->tmp2); ae_vector_destroy(&p->rdtmprowmap); } /************************************************************************* This subroutine is used to initialize active set. By default, empty N-variable model with no constraints is generated. Previously allocated buffer variables are reused as much as possible. Two use cases for this object are described below. CASE 1 - STEEPEST DESCENT: SASInit() repeat: SASReactivateConstraints() SASDescentDirection() SASExploreDirection() SASMoveTo() until convergence CASE 1 - PRECONDITIONED STEEPEST DESCENT: SASInit() repeat: SASReactivateConstraintsPrec() SASDescentDirectionPrec() SASExploreDirection() SASMoveTo() until convergence -- ALGLIB -- Copyright 21.12.2012 by Bochkanov Sergey *************************************************************************/ void sasinit(ae_int_t n, sactiveset* s, ae_state *_state) { ae_int_t i; s->n = n; s->algostate = 0; /* * Constraints */ s->constraintschanged = ae_true; s->nec = 0; s->nic = 0; rvectorsetlengthatleast(&s->bndl, n, _state); bvectorsetlengthatleast(&s->hasbndl, n, _state); rvectorsetlengthatleast(&s->bndu, n, _state); bvectorsetlengthatleast(&s->hasbndu, n, _state); for(i=0; i<=n-1; i++) { s->bndl.ptr.p_double[i] = _state->v_neginf; s->bndu.ptr.p_double[i] = _state->v_posinf; s->hasbndl.ptr.p_bool[i] = ae_false; s->hasbndu.ptr.p_bool[i] = ae_false; } /* * current point, scale */ s->hasxc = ae_false; rvectorsetlengthatleast(&s->xc, n, _state); rvectorsetlengthatleast(&s->s, n, _state); rvectorsetlengthatleast(&s->h, n, _state); for(i=0; i<=n-1; i++) { s->xc.ptr.p_double[i] = 0.0; s->s.ptr.p_double[i] = 1.0; s->h.ptr.p_double[i] = 1.0; } /* * Other */ rvectorsetlengthatleast(&s->unitdiagonal, n, _state); for(i=0; i<=n-1; i++) { s->unitdiagonal.ptr.p_double[i] = 1.0; } } /************************************************************************* This function sets scaling coefficients for SAS object. ALGLIB optimizers use scaling matrices to test stopping conditions (step size and gradient are scaled before comparison with tolerances). 
Scale of the I-th variable is a translation invariant measure of: a) "how large" the variable is b) how large the step should be to make significant changes in the function During orthogonalization phase, scale is used to calculate drop tolerances (whether vector is significantly non-zero or not). INPUT PARAMETERS: State - structure stores algorithm state S - array[N], non-zero scaling coefficients S[i] may be negative, sign doesn't matter. -- ALGLIB -- Copyright 21.12.2012 by Bochkanov Sergey *************************************************************************/ void sassetscale(sactiveset* state, /* Real */ ae_vector* s, ae_state *_state) { ae_int_t i; ae_assert(state->algostate==0, "SASSetScale: you may change scale only in modification mode", _state); ae_assert(s->cnt>=state->n, "SASSetScale: Length(S)n-1; i++) { ae_assert(ae_isfinite(s->ptr.p_double[i], _state), "SASSetScale: S contains infinite or NAN elements", _state); ae_assert(ae_fp_neq(s->ptr.p_double[i],(double)(0)), "SASSetScale: S contains zero elements", _state); } for(i=0; i<=state->n-1; i++) { state->s.ptr.p_double[i] = ae_fabs(s->ptr.p_double[i], _state); } } /************************************************************************* Modification of the preconditioner: diagonal of approximate Hessian is used. INPUT PARAMETERS: State - structure which stores algorithm state D - diagonal of the approximate Hessian, array[0..N-1], (if larger, only leading N elements are used). NOTE 1: D[i] should be positive. Exception will be thrown otherwise. NOTE 2: you should pass diagonal of approximate Hessian - NOT ITS INVERSE. -- ALGLIB -- Copyright 21.12.2012 by Bochkanov Sergey *************************************************************************/ void sassetprecdiag(sactiveset* state, /* Real */ ae_vector* d, ae_state *_state) { ae_int_t i; ae_assert(state->algostate==0, "SASSetPrecDiag: you may change preconditioner only in modification mode", _state); ae_assert(d->cnt>=state->n, "SASSetPrecDiag: D is too short", _state); for(i=0; i<=state->n-1; i++) { ae_assert(ae_isfinite(d->ptr.p_double[i], _state), "SASSetPrecDiag: D contains infinite or NAN elements", _state); ae_assert(ae_fp_greater(d->ptr.p_double[i],(double)(0)), "SASSetPrecDiag: D contains non-positive elements", _state); } for(i=0; i<=state->n-1; i++) { state->h.ptr.p_double[i] = d->ptr.p_double[i]; } } /************************************************************************* This function sets/changes boundary constraints. INPUT PARAMETERS: State - structure stores algorithm state BndL - lower bounds, array[N]. If some (all) variables are unbounded, you may specify very small number or -INF. BndU - upper bounds, array[N]. If some (all) variables are unbounded, you may specify very large number or +INF. NOTE 1: it is possible to specify BndL[i]=BndU[i]. In this case I-th variable will be "frozen" at X[i]=BndL[i]=BndU[i]. 
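NOTE 2: the sketch below shows how a typical box-constrained setup may look;
        variable names are illustrative only, and infinities are taken from
        the ae_state object (same convention as used by SASInit() for the
        default unbounded state):

            bndl.ptr.p_double[0] = 0.0;               bndu.ptr.p_double[0] = _state->v_posinf;
            bndl.ptr.p_double[1] = _state->v_neginf;  bndu.ptr.p_double[1] = _state->v_posinf;
            bndl.ptr.p_double[2] = 2.5;               bndu.ptr.p_double[2] = 2.5;
            sassetbc(&sas, &bndl, &bndu, _state);

        (first variable is bounded below by zero, second one is unbounded,
        third one is frozen at 2.5)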
-- ALGLIB -- Copyright 21.12.2012 by Bochkanov Sergey *************************************************************************/ void sassetbc(sactiveset* state, /* Real */ ae_vector* bndl, /* Real */ ae_vector* bndu, ae_state *_state) { ae_int_t i; ae_int_t n; ae_assert(state->algostate==0, "SASSetBC: you may change constraints only in modification mode", _state); n = state->n; ae_assert(bndl->cnt>=n, "SASSetBC: Length(BndL)cnt>=n, "SASSetBC: Length(BndU)ptr.p_double[i], _state)||ae_isneginf(bndl->ptr.p_double[i], _state), "SASSetBC: BndL contains NAN or +INF", _state); ae_assert(ae_isfinite(bndu->ptr.p_double[i], _state)||ae_isposinf(bndu->ptr.p_double[i], _state), "SASSetBC: BndL contains NAN or -INF", _state); state->bndl.ptr.p_double[i] = bndl->ptr.p_double[i]; state->hasbndl.ptr.p_bool[i] = ae_isfinite(bndl->ptr.p_double[i], _state); state->bndu.ptr.p_double[i] = bndu->ptr.p_double[i]; state->hasbndu.ptr.p_bool[i] = ae_isfinite(bndu->ptr.p_double[i], _state); } state->constraintschanged = ae_true; } /************************************************************************* This function sets linear constraints for SAS object. Linear constraints are inactive by default (after initial creation). INPUT PARAMETERS: State - SAS structure C - linear constraints, array[K,N+1]. Each row of C represents one constraint, either equality or inequality (see below): * first N elements correspond to coefficients, * last element corresponds to the right part. All elements of C (including right part) must be finite. CT - type of constraints, array[K]: * if CT[i]>0, then I-th constraint is C[i,*]*x >= C[i,n+1] * if CT[i]=0, then I-th constraint is C[i,*]*x = C[i,n+1] * if CT[i]<0, then I-th constraint is C[i,*]*x <= C[i,n+1] K - number of equality/inequality constraints, K>=0 NOTE 1: linear (non-bound) constraints are satisfied only approximately: * there always exists some minor violation (about Epsilon in magnitude) due to rounding errors * numerical differentiation, if used, may lead to function evaluations outside of the feasible area, because algorithm does NOT change numerical differentiation formula according to linear constraints. If you want constraints to be satisfied exactly, try to reformulate your problem in such manner that all constraints will become boundary ones (this kind of constraints is always satisfied exactly, both in the final solution and in all intermediate points). -- ALGLIB -- Copyright 28.11.2010 by Bochkanov Sergey *************************************************************************/ void sassetlc(sactiveset* state, /* Real */ ae_matrix* c, /* Integer */ ae_vector* ct, ae_int_t k, ae_state *_state) { ae_int_t n; ae_int_t i; ae_assert(state->algostate==0, "SASSetLC: you may change constraints only in modification mode", _state); n = state->n; /* * First, check for errors in the inputs */ ae_assert(k>=0, "SASSetLC: K<0", _state); ae_assert(c->cols>=n+1||k==0, "SASSetLC: Cols(C)rows>=k, "SASSetLC: Rows(C)cnt>=k, "SASSetLC: Length(CT)nec = 0; state->nic = 0; state->constraintschanged = ae_true; return; } /* * Equality constraints are stored first, in the upper * NEC rows of State.CLEIC matrix. Inequality constraints * are stored in the next NIC rows. * * NOTE: we convert inequality constraints to the form * A*x<=b before copying them. 
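     *       In other words, rows with CT[i]>0 (constraints of the form
     *       C[i,*]*x>=C[i,N]) are negated before being stored in the
     *       inequality block of CLEIC, while rows with CT[i]<0 are copied
     *       as is.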
*/ rmatrixsetlengthatleast(&state->cleic, k, n+1, _state); state->nec = 0; state->nic = 0; for(i=0; i<=k-1; i++) { if( ct->ptr.p_int[i]==0 ) { ae_v_move(&state->cleic.ptr.pp_double[state->nec][0], 1, &c->ptr.pp_double[i][0], 1, ae_v_len(0,n)); state->nec = state->nec+1; } } for(i=0; i<=k-1; i++) { if( ct->ptr.p_int[i]!=0 ) { if( ct->ptr.p_int[i]>0 ) { ae_v_moveneg(&state->cleic.ptr.pp_double[state->nec+state->nic][0], 1, &c->ptr.pp_double[i][0], 1, ae_v_len(0,n)); } else { ae_v_move(&state->cleic.ptr.pp_double[state->nec+state->nic][0], 1, &c->ptr.pp_double[i][0], 1, ae_v_len(0,n)); } state->nic = state->nic+1; } } /* * Mark state as changed */ state->constraintschanged = ae_true; } /************************************************************************* Another variation of SASSetLC(), which accepts linear constraints using another representation. Linear constraints are inactive by default (after initial creation). INPUT PARAMETERS: State - SAS structure CLEIC - linear constraints, array[NEC+NIC,N+1]. Each row of C represents one constraint: * first N elements correspond to coefficients, * last element corresponds to the right part. First NEC rows store equality constraints, next NIC - are inequality ones. All elements of C (including right part) must be finite. NEC - number of equality constraints, NEC>=0 NIC - number of inequality constraints, NIC>=0 NOTE 1: linear (non-bound) constraints are satisfied only approximately: * there always exists some minor violation (about Epsilon in magnitude) due to rounding errors * numerical differentiation, if used, may lead to function evaluations outside of the feasible area, because algorithm does NOT change numerical differentiation formula according to linear constraints. If you want constraints to be satisfied exactly, try to reformulate your problem in such manner that all constraints will become boundary ones (this kind of constraints is always satisfied exactly, both in the final solution and in all intermediate points). -- ALGLIB -- Copyright 28.11.2010 by Bochkanov Sergey *************************************************************************/ void sassetlcx(sactiveset* state, /* Real */ ae_matrix* cleic, ae_int_t nec, ae_int_t nic, ae_state *_state) { ae_int_t n; ae_int_t i; ae_int_t j; ae_assert(state->algostate==0, "SASSetLCX: you may change constraints only in modification mode", _state); n = state->n; /* * First, check for errors in the inputs */ ae_assert(nec>=0, "SASSetLCX: NEC<0", _state); ae_assert(nic>=0, "SASSetLCX: NIC<0", _state); ae_assert(cleic->cols>=n+1||nec+nic==0, "SASSetLCX: Cols(CLEIC)rows>=nec+nic, "SASSetLCX: Rows(CLEIC)cleic, nec+nic, n+1, _state); state->nec = nec; state->nic = nic; for(i=0; i<=nec+nic-1; i++) { for(j=0; j<=n; j++) { state->cleic.ptr.pp_double[i][j] = cleic->ptr.pp_double[i][j]; } } /* * Mark state as changed */ state->constraintschanged = ae_true; } /************************************************************************* This subroutine turns on optimization mode: 1. feasibility in X is enforced (in case X=S.XC and constraints have not changed, algorithm just uses X without any modifications at all) 2. constraints are marked as "candidate" or "inactive" INPUT PARAMETERS: S - active set object X - initial point (candidate), array[N]. It is expected that X contains only finite values (we do not check it). 
OUTPUT PARAMETERS: S - state is changed X - initial point can be changed to enforce feasibility RESULT: True in case feasible point was found (mode was changed to "optimization") False in case no feasible point was found (mode was not changed) -- ALGLIB -- Copyright 21.12.2012 by Bochkanov Sergey *************************************************************************/ ae_bool sasstartoptimization(sactiveset* state, /* Real */ ae_vector* x, ae_state *_state) { ae_int_t n; ae_int_t nec; ae_int_t nic; ae_int_t i; ae_int_t j; double v; double v0; double v1; double vv; double vc; double vx; ae_bool result; ae_assert(state->algostate==0, "SASStartOptimization: already in optimization mode", _state); result = ae_false; n = state->n; nec = state->nec; nic = state->nic; /* * Enforce feasibility and calculate set of "candidate"/"active" constraints. * Always active equality constraints are marked as "active", all other constraints * are marked as "candidate". */ ivectorsetlengthatleast(&state->activeset, n+nec+nic, _state); for(i=0; i<=n-1; i++) { if( state->hasbndl.ptr.p_bool[i]&&state->hasbndu.ptr.p_bool[i] ) { if( ae_fp_greater(state->bndl.ptr.p_double[i],state->bndu.ptr.p_double[i]) ) { return result; } } } ae_v_move(&state->xc.ptr.p_double[0], 1, &x->ptr.p_double[0], 1, ae_v_len(0,n-1)); if( state->nec+state->nic>0 ) { /* * General linear constraints are present. * Try to use fast code for feasible initial point with modest * memory requirements. */ rvectorsetlengthatleast(&state->tmp0, n, _state); state->feasinitpt = ae_true; for(i=0; i<=n-1; i++) { state->tmp0.ptr.p_double[i] = x->ptr.p_double[i]; state->activeset.ptr.p_int[i] = -1; if( (state->hasbndl.ptr.p_bool[i]&&state->hasbndu.ptr.p_bool[i])&&ae_fp_eq(state->bndl.ptr.p_double[i],state->bndu.ptr.p_double[i]) ) { state->tmp0.ptr.p_double[i] = state->bndl.ptr.p_double[i]; state->activeset.ptr.p_int[i] = 1; continue; } if( state->hasbndl.ptr.p_bool[i]&&ae_fp_less_eq(state->tmp0.ptr.p_double[i],state->bndl.ptr.p_double[i]) ) { state->activeset.ptr.p_int[i] = 0; state->tmp0.ptr.p_double[i] = state->bndl.ptr.p_double[i]; } if( state->hasbndu.ptr.p_bool[i]&&ae_fp_greater_eq(state->tmp0.ptr.p_double[i],state->bndu.ptr.p_double[i]) ) { state->activeset.ptr.p_int[i] = 0; state->tmp0.ptr.p_double[i] = state->bndu.ptr.p_double[i]; } } for(i=0; i<=state->nec+state->nic-1; i++) { v = -state->cleic.ptr.pp_double[i][n]; v0 = (double)(0); v1 = (double)(0); for(j=0; j<=n-1; j++) { vx = state->tmp0.ptr.p_double[j]/state->s.ptr.p_double[j]; vc = state->cleic.ptr.pp_double[i][j]*state->s.ptr.p_double[j]; v = v+vx*vc; v0 = v0+ae_sqr(vx, _state); v1 = v1+ae_sqr(vc, _state); } vv = ae_sqrt(v0, _state)*ae_sqrt(v1, _state)*1000*ae_machineepsilon; if( inec ) { state->activeset.ptr.p_int[n+i] = 1; state->feasinitpt = state->feasinitpt&&ae_fp_less(ae_fabs(v, _state),vv); } else { state->feasinitpt = state->feasinitpt&&ae_fp_less(v,vv); if( ae_fp_less(v,-vv) ) { state->activeset.ptr.p_int[n+i] = -1; } else { state->activeset.ptr.p_int[n+i] = 0; } } } if( state->feasinitpt ) { ae_v_move(&state->xc.ptr.p_double[0], 1, &state->tmp0.ptr.p_double[0], 1, ae_v_len(0,n-1)); } /* * Fast code failed? 
Use general code with ~(N+NIC)^2 memory requirements */ if( !state->feasinitpt ) { rvectorsetlengthatleast(&state->tmp0, n, _state); rvectorsetlengthatleast(&state->tmpfeas, n+state->nic, _state); rmatrixsetlengthatleast(&state->tmpm0, state->nec+state->nic, n+state->nic+1, _state); for(i=0; i<=state->nec+state->nic-1; i++) { ae_v_move(&state->tmpm0.ptr.pp_double[i][0], 1, &state->cleic.ptr.pp_double[i][0], 1, ae_v_len(0,n-1)); for(j=n; j<=n+state->nic-1; j++) { state->tmpm0.ptr.pp_double[i][j] = (double)(0); } if( i>=state->nec ) { state->tmpm0.ptr.pp_double[i][n+i-state->nec] = 1.0; } state->tmpm0.ptr.pp_double[i][n+state->nic] = state->cleic.ptr.pp_double[i][n]; } ae_v_move(&state->tmpfeas.ptr.p_double[0], 1, &state->xc.ptr.p_double[0], 1, ae_v_len(0,n-1)); for(i=0; i<=state->nic-1; i++) { v = ae_v_dotproduct(&state->cleic.ptr.pp_double[i+state->nec][0], 1, &state->xc.ptr.p_double[0], 1, ae_v_len(0,n-1)); state->tmpfeas.ptr.p_double[i+n] = ae_maxreal(state->cleic.ptr.pp_double[i+state->nec][n]-v, 0.0, _state); } if( !findfeasiblepoint(&state->tmpfeas, &state->bndl, &state->hasbndl, &state->bndu, &state->hasbndu, n, state->nic, &state->tmpm0, state->nec+state->nic, 1.0E-6, &i, &j, _state) ) { return result; } ae_v_move(&state->xc.ptr.p_double[0], 1, &state->tmpfeas.ptr.p_double[0], 1, ae_v_len(0,n-1)); for(i=0; i<=n-1; i++) { if( (state->hasbndl.ptr.p_bool[i]&&state->hasbndu.ptr.p_bool[i])&&ae_fp_eq(state->bndl.ptr.p_double[i],state->bndu.ptr.p_double[i]) ) { state->activeset.ptr.p_int[i] = 1; continue; } if( (state->hasbndl.ptr.p_bool[i]&&ae_fp_eq(state->xc.ptr.p_double[i],state->bndl.ptr.p_double[i]))||(state->hasbndu.ptr.p_bool[i]&&ae_fp_eq(state->xc.ptr.p_double[i],state->bndu.ptr.p_double[i])) ) { state->activeset.ptr.p_int[i] = 0; continue; } state->activeset.ptr.p_int[i] = -1; } for(i=0; i<=state->nec-1; i++) { state->activeset.ptr.p_int[n+i] = 1; } for(i=0; i<=state->nic-1; i++) { if( ae_fp_eq(state->tmpfeas.ptr.p_double[n+i],(double)(0)) ) { state->activeset.ptr.p_int[n+state->nec+i] = 0; } else { state->activeset.ptr.p_int[n+state->nec+i] = -1; } } } } else { /* * Only bound constraints are present, quick code can be used */ for(i=0; i<=n-1; i++) { state->activeset.ptr.p_int[i] = -1; if( (state->hasbndl.ptr.p_bool[i]&&state->hasbndu.ptr.p_bool[i])&&ae_fp_eq(state->bndl.ptr.p_double[i],state->bndu.ptr.p_double[i]) ) { state->activeset.ptr.p_int[i] = 1; state->xc.ptr.p_double[i] = state->bndl.ptr.p_double[i]; continue; } if( state->hasbndl.ptr.p_bool[i]&&ae_fp_less_eq(state->xc.ptr.p_double[i],state->bndl.ptr.p_double[i]) ) { state->xc.ptr.p_double[i] = state->bndl.ptr.p_double[i]; state->activeset.ptr.p_int[i] = 0; continue; } if( state->hasbndu.ptr.p_bool[i]&&ae_fp_greater_eq(state->xc.ptr.p_double[i],state->bndu.ptr.p_double[i]) ) { state->xc.ptr.p_double[i] = state->bndu.ptr.p_double[i]; state->activeset.ptr.p_int[i] = 0; continue; } } state->feasinitpt = ae_true; } /* * Change state, allocate temporaries */ result = ae_true; state->algostate = 1; state->basisisready = ae_false; state->hasxc = ae_true; rmatrixsetlengthatleast(&state->pbasis, ae_minint(nec+nic, n, _state), n+1, _state); rmatrixsetlengthatleast(&state->ibasis, ae_minint(nec+nic, n, _state), n+1, _state); rmatrixsetlengthatleast(&state->sbasis, ae_minint(nec+nic, n, _state), n+1, _state); return result; } /************************************************************************* This function explores search direction and calculates bound for step as well as information for activation of constraints. 
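In other words, this function computes the largest step length T>=0 such that
XC+T*D still satisfies all currently inactive bound and general linear
constraints, and reports which constraint (if any) becomes active at that
step length.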
INPUT PARAMETERS: State - SAS structure which stores current point and all other active set related information D - descent direction to explore OUTPUT PARAMETERS: StpMax - upper limit on step length imposed by yet inactive constraints. Can be zero in case some constraints can be activated by zero step. Equal to some large value in case step is unlimited. CIdx - -1 for unlimited step, in [0,N+NEC+NIC) in case of limited step. VVal - value which is assigned to X[CIdx] during activation. For CIdx<0 or CIdx>=N some dummy value is assigned to this parameter. *************************************************************************/ void sasexploredirection(sactiveset* state, /* Real */ ae_vector* d, double* stpmax, ae_int_t* cidx, double* vval, ae_state *_state) { ae_int_t n; ae_int_t nec; ae_int_t nic; ae_int_t i; double prevmax; double vc; double vd; *stpmax = 0; *cidx = 0; *vval = 0; ae_assert(state->algostate==1, "SASExploreDirection: is not in optimization mode", _state); n = state->n; nec = state->nec; nic = state->nic; *cidx = -1; *vval = (double)(0); *stpmax = 1.0E50; for(i=0; i<=n-1; i++) { if( state->activeset.ptr.p_int[i]<=0 ) { ae_assert(!state->hasbndl.ptr.p_bool[i]||ae_fp_greater_eq(state->xc.ptr.p_double[i],state->bndl.ptr.p_double[i]), "SASExploreDirection: internal error - infeasible X", _state); ae_assert(!state->hasbndu.ptr.p_bool[i]||ae_fp_less_eq(state->xc.ptr.p_double[i],state->bndu.ptr.p_double[i]), "SASExploreDirection: internal error - infeasible X", _state); if( state->hasbndl.ptr.p_bool[i]&&ae_fp_less(d->ptr.p_double[i],(double)(0)) ) { prevmax = *stpmax; *stpmax = safeminposrv(state->xc.ptr.p_double[i]-state->bndl.ptr.p_double[i], -d->ptr.p_double[i], *stpmax, _state); if( ae_fp_less(*stpmax,prevmax) ) { *cidx = i; *vval = state->bndl.ptr.p_double[i]; } } if( state->hasbndu.ptr.p_bool[i]&&ae_fp_greater(d->ptr.p_double[i],(double)(0)) ) { prevmax = *stpmax; *stpmax = safeminposrv(state->bndu.ptr.p_double[i]-state->xc.ptr.p_double[i], d->ptr.p_double[i], *stpmax, _state); if( ae_fp_less(*stpmax,prevmax) ) { *cidx = i; *vval = state->bndu.ptr.p_double[i]; } } } } for(i=nec; i<=nec+nic-1; i++) { if( state->activeset.ptr.p_int[n+i]<=0 ) { vc = ae_v_dotproduct(&state->cleic.ptr.pp_double[i][0], 1, &state->xc.ptr.p_double[0], 1, ae_v_len(0,n-1)); vc = vc-state->cleic.ptr.pp_double[i][n]; vd = ae_v_dotproduct(&state->cleic.ptr.pp_double[i][0], 1, &d->ptr.p_double[0], 1, ae_v_len(0,n-1)); if( ae_fp_less_eq(vd,(double)(0)) ) { continue; } if( ae_fp_less(vc,(double)(0)) ) { /* * XC is strictly feasible with respect to I-th constraint, * we can perform non-zero step because there is non-zero distance * between XC and bound. */ prevmax = *stpmax; *stpmax = safeminposrv(-vc, vd, *stpmax, _state); if( ae_fp_less(*stpmax,prevmax) ) { *cidx = n+i; } } else { /* * XC is at the boundary (or slightly beyond it), and step vector * points beyond the boundary. * * The only thing we can do is to perform zero step and activate * I-th constraint. */ *stpmax = (double)(0); *cidx = n+i; } } } } /************************************************************************* This subroutine moves current point to XN, which can be: a) point in the direction previously explored with SASExploreDirection() function (in this case NeedAct/CIdx/CVal are used) b) point in arbitrary direction, not necessarily previously checked with SASExploreDirection() function. Step may activate one constraint. It is assumed than XN is approximately feasible (small error as large as several ulps is possible). 
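Typically XN=XC+Stp*D, where D is a direction which was previously checked
with SASExploreDirection() and Stp does not exceed the reported StpMax; in
this case CIdx and CVal are simply the values reported by
SASExploreDirection().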
Strict feasibility with respect to bound constraints is enforced during activation, feasibility with respect to general linear constraints is not enforced. This function activates boundary constraints, such that both is True: 1) XC[I] is not at the boundary 2) XN[I] is at the boundary or beyond it INPUT PARAMETERS: S - active set object XN - new point. NeedAct - True in case one constraint needs activation CIdx - index of constraint, in [0,N+NEC+NIC). Ignored if NeedAct is false. This value is calculated by SASExploreDirection(). CVal - for CIdx in [0,N) this field stores value which is assigned to XC[CIdx] during activation. CVal is ignored in other cases. This value is calculated by SASExploreDirection(). OUTPUT PARAMETERS: S - current point and list of active constraints are changed. RESULT: >0, in case at least one inactive non-candidate constraint was activated =0, in case only "candidate" constraints were activated <0, in case no constraints were activated by the step NOTE: in general case State.XC<>XN because activation of constraints may slightly change current point (to enforce feasibility). -- ALGLIB -- Copyright 21.12.2012 by Bochkanov Sergey *************************************************************************/ ae_int_t sasmoveto(sactiveset* state, /* Real */ ae_vector* xn, ae_bool needact, ae_int_t cidx, double cval, ae_state *_state) { ae_int_t n; ae_int_t nec; ae_int_t nic; ae_int_t i; ae_bool wasactivation; ae_int_t result; ae_assert(state->algostate==1, "SASMoveTo: is not in optimization mode", _state); n = state->n; nec = state->nec; nic = state->nic; /* * Save previous state, update current point */ rvectorsetlengthatleast(&state->mtx, n, _state); ivectorsetlengthatleast(&state->mtas, n+nec+nic, _state); for(i=0; i<=n-1; i++) { state->mtx.ptr.p_double[i] = state->xc.ptr.p_double[i]; state->xc.ptr.p_double[i] = xn->ptr.p_double[i]; } for(i=0; i<=n+nec+nic-1; i++) { state->mtas.ptr.p_int[i] = state->activeset.ptr.p_int[i]; } /* * Activate constraints */ wasactivation = ae_false; if( needact ) { /* * Activation */ ae_assert(cidx>=0&&cidxxc.ptr.p_double[cidx] = cval; } state->activeset.ptr.p_int[cidx] = 1; wasactivation = ae_true; } for(i=0; i<=n-1; i++) { /* * Post-check (some constraints may be activated because of numerical errors) */ if( (state->hasbndl.ptr.p_bool[i]&&ae_fp_less_eq(state->xc.ptr.p_double[i],state->bndl.ptr.p_double[i]))&&ae_fp_neq(state->xc.ptr.p_double[i],state->mtx.ptr.p_double[i]) ) { state->xc.ptr.p_double[i] = state->bndl.ptr.p_double[i]; state->activeset.ptr.p_int[i] = 1; wasactivation = ae_true; } if( (state->hasbndu.ptr.p_bool[i]&&ae_fp_greater_eq(state->xc.ptr.p_double[i],state->bndu.ptr.p_double[i]))&&ae_fp_neq(state->xc.ptr.p_double[i],state->mtx.ptr.p_double[i]) ) { state->xc.ptr.p_double[i] = state->bndu.ptr.p_double[i]; state->activeset.ptr.p_int[i] = 1; wasactivation = ae_true; } } /* * Determine return status: * * -1 in case no constraints were activated * * 0 in case only "candidate" constraints were activated * * +1 in case at least one "non-candidate" constraint was activated */ if( wasactivation ) { /* * Step activated one/several constraints, but sometimes it is spurious * activation - RecalculateConstraints() tells us that constraint is * inactive (negative Largrange multiplier), but step activates it * because of numerical noise. 
* * This block of code checks whether step activated truly new constraints * (ones which were not in the active set at the solution): * * * for non-boundary constraint it is enough to check that previous value * of ActiveSet[i] is negative (=far from boundary), and new one is * positive (=we are at the boundary, constraint is activated). * * * for boundary constraints previous criterion won't work. Each variable * has two constraints, and simply checking their status is not enough - * we have to correctly identify cases when we leave one boundary * (PrevActiveSet[i]=0) and move to another boundary (ActiveSet[i]>0). * Such cases can be identified if we compare previous X with new X. * * In case only "candidate" constraints were activated, result variable * is set to 0. In case at least one new constraint was activated, result * is set to 1. */ result = 0; for(i=0; i<=n-1; i++) { if( state->activeset.ptr.p_int[i]>0&&ae_fp_neq(state->xc.ptr.p_double[i],state->mtx.ptr.p_double[i]) ) { result = 1; } } for(i=n; i<=n+state->nec+state->nic-1; i++) { if( state->mtas.ptr.p_int[i]<0&&state->activeset.ptr.p_int[i]>0 ) { result = 1; } } } else { /* * No activation, return -1 */ result = -1; } /* * Invalidate basis */ state->basisisready = ae_false; return result; } /************************************************************************* This subroutine performs immediate activation of one constraint: * "immediate" means that we do not have to move to activate it * in case boundary constraint is activated, we enforce current point to be exactly at the boundary INPUT PARAMETERS: S - active set object CIdx - index of constraint, in [0,N+NEC+NIC). This value is calculated by SASExploreDirection(). CVal - for CIdx in [0,N) this field stores value which is assigned to XC[CIdx] during activation. CVal is ignored in other cases. This value is calculated by SASExploreDirection(). -- ALGLIB -- Copyright 21.12.2012 by Bochkanov Sergey *************************************************************************/ void sasimmediateactivation(sactiveset* state, ae_int_t cidx, double cval, ae_state *_state) { ae_assert(state->algostate==1, "SASMoveTo: is not in optimization mode", _state); if( cidxn ) { state->xc.ptr.p_double[cidx] = cval; } state->activeset.ptr.p_int[cidx] = 1; state->basisisready = ae_false; } /************************************************************************* This subroutine calculates descent direction subject to current active set. INPUT PARAMETERS: S - active set object G - array[N], gradient D - possibly prealocated buffer; automatically resized if needed. OUTPUT PARAMETERS: D - descent direction projected onto current active set. Components of D which correspond to active boundary constraints are forced to be exactly zero. In case D is non-zero, it is normalized to have unit norm. NOTE: in case active set has N active constraints (or more), descent direction is forced to be exactly zero. 
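NOTE: roughly speaking, D is obtained by negating G, zeroing components which
      correspond to active bound constraints, subtracting the projection onto
      the normals of active general linear constraints (the orthonormal basis
      maintained by SASRebuildBasis), and normalizing the result.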
-- ALGLIB -- Copyright 21.12.2012 by Bochkanov Sergey *************************************************************************/ void sasconstraineddescent(sactiveset* state, /* Real */ ae_vector* g, /* Real */ ae_vector* d, ae_state *_state) { ae_assert(state->algostate==1, "SASConstrainedDescent: is not in optimization mode", _state); sasrebuildbasis(state, _state); sactivesets_constraineddescent(state, g, &state->unitdiagonal, &state->ibasis, ae_true, d, _state); } /************************************************************************* This subroutine calculates preconditioned descent direction subject to current active set. INPUT PARAMETERS: S - active set object G - array[N], gradient D - possibly prealocated buffer; automatically resized if needed. OUTPUT PARAMETERS: D - descent direction projected onto current active set. Components of D which correspond to active boundary constraints are forced to be exactly zero. In case D is non-zero, it is normalized to have unit norm. NOTE: in case active set has N active constraints (or more), descent direction is forced to be exactly zero. -- ALGLIB -- Copyright 21.12.2012 by Bochkanov Sergey *************************************************************************/ void sasconstraineddescentprec(sactiveset* state, /* Real */ ae_vector* g, /* Real */ ae_vector* d, ae_state *_state) { ae_assert(state->algostate==1, "SASConstrainedDescentPrec: is not in optimization mode", _state); sasrebuildbasis(state, _state); sactivesets_constraineddescent(state, g, &state->h, &state->pbasis, ae_true, d, _state); } /************************************************************************* This subroutine calculates projection of direction vector to current active set. INPUT PARAMETERS: S - active set object D - array[N], direction OUTPUT PARAMETERS: D - direction projected onto current active set. Components of D which correspond to active boundary constraints are forced to be exactly zero. NOTE: in case active set has N active constraints (or more), descent direction is forced to be exactly zero. -- ALGLIB -- Copyright 21.12.2012 by Bochkanov Sergey *************************************************************************/ void sasconstraineddirection(sactiveset* state, /* Real */ ae_vector* d, ae_state *_state) { ae_int_t i; ae_assert(state->algostate==1, "SASConstrainedAntigradientPrec: is not in optimization mode", _state); sasrebuildbasis(state, _state); sactivesets_constraineddescent(state, d, &state->unitdiagonal, &state->ibasis, ae_false, &state->cdtmp, _state); for(i=0; i<=state->n-1; i++) { d->ptr.p_double[i] = -state->cdtmp.ptr.p_double[i]; } } /************************************************************************* This subroutine calculates product of direction vector and preconditioner multiplied subject to current active set. INPUT PARAMETERS: S - active set object D - array[N], direction OUTPUT PARAMETERS: D - preconditioned direction projected onto current active set. Components of D which correspond to active boundary constraints are forced to be exactly zero. NOTE: in case active set has N active constraints (or more), descent direction is forced to be exactly zero. 
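NOTE: roughly speaking, this is the preconditioned analogue of
      SASConstrainedDirection(): the P-orthogonal basis described in
      SASRebuildBasis() comments is used, so the projection is consistent
      with the inner product induced by the preconditioner.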
-- ALGLIB -- Copyright 21.12.2012 by Bochkanov Sergey *************************************************************************/ void sasconstraineddirectionprec(sactiveset* state, /* Real */ ae_vector* d, ae_state *_state) { ae_int_t i; ae_assert(state->algostate==1, "SASConstrainedAntigradientPrec: is not in optimization mode", _state); sasrebuildbasis(state, _state); sactivesets_constraineddescent(state, d, &state->h, &state->pbasis, ae_false, &state->cdtmp, _state); for(i=0; i<=state->n-1; i++) { d->ptr.p_double[i] = -state->cdtmp.ptr.p_double[i]; } } /************************************************************************* This subroutine performs correction of some (possibly infeasible) point with respect to a) current active set, b) all boundary constraints, both active and inactive: 0) we calculate L1 penalty term for violation of active linear constraints (one which is returned by SASActiveLCPenalty1() function). 1) first, it performs projection (orthogonal with respect to scale matrix S) of X into current active set: X -> X1. 2) next, we perform projection with respect to ALL boundary constraints which are violated at X1: X1 -> X2. 3) X is replaced by X2. The idea is that this function can preserve and enforce feasibility during optimization, and additional penalty parameter can be used to prevent algo from leaving feasible set because of rounding errors. INPUT PARAMETERS: S - active set object X - array[N], candidate point OUTPUT PARAMETERS: X - "improved" candidate point: a) feasible with respect to all boundary constraints b) feasibility with respect to active set is retained at good level. Penalty - penalty term, which can be added to function value if user wants to penalize violation of constraints (recommended). NOTE: this function is not intended to find exact projection (i.e. best approximation) of X into feasible set. It just improves situation a bit. Regular use of this function will help you to retain feasibility - if you already have something to start with and constrain your steps is such way that the only source of infeasibility are roundoff errors. -- ALGLIB -- Copyright 21.12.2012 by Bochkanov Sergey *************************************************************************/ void sascorrection(sactiveset* state, /* Real */ ae_vector* x, double* penalty, ae_state *_state) { ae_int_t i; ae_int_t j; ae_int_t n; double v; *penalty = 0; ae_assert(state->algostate==1, "SASCorrection: is not in optimization mode", _state); sasrebuildbasis(state, _state); n = state->n; rvectorsetlengthatleast(&state->corrtmp, n, _state); /* * Calculate penalty term. */ *penalty = sasactivelcpenalty1(state, x, _state); /* * Perform projection 1. * * This projecton is given by: * * x_proj = x - S*S*As'*(As*x-b) * * where x is original x before projection, S is a scale matrix, * As is a matrix of equality constraints (active set) which were * orthogonalized with respect to inner product given by S (i.e. we * have As*S*S'*As'=I), b is a right part of the orthogonalized * constraints. * * NOTE: you can verify that x_proj is strictly feasible w.r.t. * active set by multiplying it by As - you will get * As*x_proj = As*x - As*x + b = b. * * This formula for projection can be obtained by solving * following minimization problem. * * min ||inv(S)*(x_proj-x)||^2 s.t. 
As*x_proj=b * */ ae_v_move(&state->corrtmp.ptr.p_double[0], 1, &x->ptr.p_double[0], 1, ae_v_len(0,n-1)); for(i=0; i<=state->basissize-1; i++) { v = -state->sbasis.ptr.pp_double[i][n]; for(j=0; j<=n-1; j++) { v = v+state->sbasis.ptr.pp_double[i][j]*state->corrtmp.ptr.p_double[j]; } for(j=0; j<=n-1; j++) { state->corrtmp.ptr.p_double[j] = state->corrtmp.ptr.p_double[j]-v*state->sbasis.ptr.pp_double[i][j]*ae_sqr(state->s.ptr.p_double[j], _state); } } for(i=0; i<=n-1; i++) { if( state->activeset.ptr.p_int[i]>0 ) { state->corrtmp.ptr.p_double[i] = state->xc.ptr.p_double[i]; } } /* * Perform projection 2 */ for(i=0; i<=n-1; i++) { x->ptr.p_double[i] = state->corrtmp.ptr.p_double[i]; if( state->hasbndl.ptr.p_bool[i]&&ae_fp_less(x->ptr.p_double[i],state->bndl.ptr.p_double[i]) ) { x->ptr.p_double[i] = state->bndl.ptr.p_double[i]; } if( state->hasbndu.ptr.p_bool[i]&&ae_fp_greater(x->ptr.p_double[i],state->bndu.ptr.p_double[i]) ) { x->ptr.p_double[i] = state->bndu.ptr.p_double[i]; } } } /************************************************************************* This subroutine returns L1 penalty for violation of active general linear constraints (violation of boundary or inactive linear constraints is not added to penalty). Penalty term is equal to: Penalty = SUM( Abs((C_i*x-R_i)/Alpha_i) ) Here: * summation is performed for I=0...NEC+NIC-1, ActiveSet[N+I]>0 (only for rows of CLEIC which are in active set) * C_i is I-th row of CLEIC * R_i is corresponding right part * S is a scale matrix * Alpha_i = ||S*C_i|| - is a scaling coefficient which "normalizes" I-th summation term according to its scale. INPUT PARAMETERS: S - active set object X - array[N], candidate point -- ALGLIB -- Copyright 21.12.2012 by Bochkanov Sergey *************************************************************************/ double sasactivelcpenalty1(sactiveset* state, /* Real */ ae_vector* x, ae_state *_state) { ae_int_t i; ae_int_t j; ae_int_t n; ae_int_t nec; ae_int_t nic; double v; double alpha; double p; double result; ae_assert(state->algostate==1, "SASActiveLCPenalty1: is not in optimization mode", _state); sasrebuildbasis(state, _state); n = state->n; nec = state->nec; nic = state->nic; /* * Calculate penalty term. */ result = (double)(0); for(i=0; i<=nec+nic-1; i++) { if( state->activeset.ptr.p_int[n+i]>0 ) { alpha = (double)(0); p = -state->cleic.ptr.pp_double[i][n]; for(j=0; j<=n-1; j++) { v = state->cleic.ptr.pp_double[i][j]; p = p+v*x->ptr.p_double[j]; alpha = alpha+ae_sqr(v*state->s.ptr.p_double[j], _state); } alpha = ae_sqrt(alpha, _state); if( ae_fp_neq(alpha,(double)(0)) ) { result = result+ae_fabs(p/alpha, _state); } } } return result; } /************************************************************************* This subroutine calculates scaled norm of vector after projection onto subspace of active constraints. Most often this function is used to test stopping conditions. 
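The returned value is ||S*P(D)||, where P(*) denotes projection onto the
subspace of currently active constraints and S is the diagonal scaling matrix
set by SASSetScale().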
INPUT PARAMETERS: S - active set object D - vector whose norm is calculated RESULT: Vector norm (after projection and scaling) NOTE: projection is performed first, scaling is performed after projection NOTE: if we have N active constraints, zero value (exact zero) is returned -- ALGLIB -- Copyright 21.12.2012 by Bochkanov Sergey *************************************************************************/ double sasscaledconstrainednorm(sactiveset* state, /* Real */ ae_vector* d, ae_state *_state) { ae_int_t i; ae_int_t n; double v; ae_int_t nactive; double result; ae_assert(state->algostate==1, "SASMoveTo: is not in optimization mode", _state); n = state->n; rvectorsetlengthatleast(&state->scntmp, n, _state); /* * Prepare basis (if needed) */ sasrebuildbasis(state, _state); /* * Calculate descent direction */ nactive = 0; for(i=0; i<=n-1; i++) { if( state->activeset.ptr.p_int[i]>0 ) { state->scntmp.ptr.p_double[i] = (double)(0); nactive = nactive+1; } else { state->scntmp.ptr.p_double[i] = d->ptr.p_double[i]; } } if( nactive+state->basissize>=n ) { /* * Quick exit if number of active constraints is N or larger */ result = 0.0; return result; } for(i=0; i<=state->basissize-1; i++) { v = ae_v_dotproduct(&state->ibasis.ptr.pp_double[i][0], 1, &state->scntmp.ptr.p_double[0], 1, ae_v_len(0,n-1)); ae_v_subd(&state->scntmp.ptr.p_double[0], 1, &state->ibasis.ptr.pp_double[i][0], 1, ae_v_len(0,n-1), v); } v = 0.0; for(i=0; i<=n-1; i++) { v = v+ae_sqr(state->s.ptr.p_double[i]*state->scntmp.ptr.p_double[i], _state); } result = ae_sqrt(v, _state); return result; } /************************************************************************* This subroutine turns off optimization mode. INPUT PARAMETERS: S - active set object OUTPUT PARAMETERS: S - state is changed NOTE: this function can be called many times for optimizer which was already stopped. -- ALGLIB -- Copyright 21.12.2012 by Bochkanov Sergey *************************************************************************/ void sasstopoptimization(sactiveset* state, ae_state *_state) { state->algostate = 0; } /************************************************************************* This function recalculates constraints - activates and deactivates them according to gradient value at current point. Algorithm assumes that we want to make steepest descent step from current point; constraints are activated and deactivated in such way that we won't violate any constraint by steepest descent step. After call to this function active set is ready to try steepest descent step (SASDescentDirection-SASExploreDirection-SASMoveTo). Only already "active" and "candidate" elements of ActiveSet are examined; constraints which are not active are not examined. INPUT PARAMETERS: State - active set object GC - array[N], gradient at XC OUTPUT PARAMETERS: State - active set object, with new set of constraint -- ALGLIB -- Copyright 26.09.2012 by Bochkanov Sergey *************************************************************************/ void sasreactivateconstraints(sactiveset* state, /* Real */ ae_vector* gc, ae_state *_state) { ae_assert(state->algostate==1, "SASReactivateConstraints: must be in optimization mode", _state); sactivesets_reactivateconstraints(state, gc, &state->unitdiagonal, _state); } /************************************************************************* This function recalculates constraints - activates and deactivates them according to gradient value at current point. 
Algorithm assumes that we want to make Quasi-Newton step from current point with diagonal Quasi-Newton matrix H. Constraints are activated and deactivated in such way that we won't violate any constraint by step. After call to this function active set is ready to try preconditioned steepest descent step (SASDescentDirection-SASExploreDirection-SASMoveTo). Only already "active" and "candidate" elements of ActiveSet are examined; constraints which are not active are not examined. INPUT PARAMETERS: State - active set object GC - array[N], gradient at XC OUTPUT PARAMETERS: State - active set object, with new set of constraint -- ALGLIB -- Copyright 26.09.2012 by Bochkanov Sergey *************************************************************************/ void sasreactivateconstraintsprec(sactiveset* state, /* Real */ ae_vector* gc, ae_state *_state) { ae_assert(state->algostate==1, "SASReactivateConstraintsPrec: must be in optimization mode", _state); sactivesets_reactivateconstraints(state, gc, &state->h, _state); } /************************************************************************* This function builds three orthonormal basises for current active set: * P-orthogonal one, which is orthogonalized with inner product (x,y) = x'*P*y, where P=inv(H) is current preconditioner * S-orthogonal one, which is orthogonalized with inner product (x,y) = x'*S'*S*y, where S is diagonal scaling matrix * I-orthogonal one, which is orthogonalized with standard dot product NOTE: all sets of orthogonal vectors are guaranteed to have same size. P-orthogonal basis is built first, I/S-orthogonal basises are forced to have same number of vectors as P-orthogonal one (padded by zero vectors if needed). NOTE: this function tracks changes in active set; first call will result in reorthogonalization INPUT PARAMETERS: State - active set object H - diagonal preconditioner, H[i]>0 OUTPUT PARAMETERS: State - active set object with new basis -- ALGLIB -- Copyright 20.06.2012 by Bochkanov Sergey *************************************************************************/ void sasrebuildbasis(sactiveset* state, ae_state *_state) { ae_int_t n; ae_int_t nec; ae_int_t nic; ae_int_t i; ae_int_t j; ae_int_t t; ae_int_t nactivelin; ae_int_t nactivebnd; double v; double vmax; ae_int_t kmax; if( state->basisisready ) { return; } n = state->n; nec = state->nec; nic = state->nic; rmatrixsetlengthatleast(&state->tmpbasis, nec+nic, n+1, _state); state->basissize = 0; state->basisisready = ae_true; /* * Determine number of active boundary and non-boundary * constraints, move them to TmpBasis. Quick exit if no * non-boundary constraints were detected. 
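*
* Worked example (unit preconditioner H, for illustration): suppose two
* linear rows x0=1 and 2*x0=2 are active and no boundary constraints are
* active. After normalization both rows become (1,0,...|1); the first one
* is accepted into the basis, reorthogonalization of the second one
* against it leaves a zero row, and the loop below terminates with
* BasisSize=1 - the redundant copy of the constraint contributes nothing.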
*/ nactivelin = 0; nactivebnd = 0; for(i=0; i<=nec+nic-1; i++) { if( state->activeset.ptr.p_int[n+i]>0 ) { nactivelin = nactivelin+1; } } for(j=0; j<=n-1; j++) { if( state->activeset.ptr.p_int[j]>0 ) { nactivebnd = nactivebnd+1; } } if( nactivelin==0 ) { return; } /* * Orthogonalize linear constraints (inner product is given by preconditioner) * with respect to each other and boundary ones: * * normalize all constraints * * orthogonalize with respect to boundary ones * * repeat: * * if basisSize+nactivebnd=n - TERMINATE * * choose largest row from TmpBasis * * if row norm is too small - TERMINATE * * add row to basis, normalize * * remove from TmpBasis, orthogonalize other constraints with respect to this one */ nactivelin = 0; for(i=0; i<=nec+nic-1; i++) { if( state->activeset.ptr.p_int[n+i]>0 ) { ae_v_move(&state->tmpbasis.ptr.pp_double[nactivelin][0], 1, &state->cleic.ptr.pp_double[i][0], 1, ae_v_len(0,n)); nactivelin = nactivelin+1; } } for(i=0; i<=nactivelin-1; i++) { v = 0.0; for(j=0; j<=n-1; j++) { v = v+ae_sqr(state->tmpbasis.ptr.pp_double[i][j], _state)/state->h.ptr.p_double[j]; } if( ae_fp_greater(v,(double)(0)) ) { v = 1/ae_sqrt(v, _state); for(j=0; j<=n; j++) { state->tmpbasis.ptr.pp_double[i][j] = state->tmpbasis.ptr.pp_double[i][j]*v; } } } for(j=0; j<=n-1; j++) { if( state->activeset.ptr.p_int[j]>0 ) { for(i=0; i<=nactivelin-1; i++) { state->tmpbasis.ptr.pp_double[i][n] = state->tmpbasis.ptr.pp_double[i][n]-state->tmpbasis.ptr.pp_double[i][j]*state->xc.ptr.p_double[j]; state->tmpbasis.ptr.pp_double[i][j] = 0.0; } } } while(state->basissize+nactivebnd<n) { /* * Find largest vector, add to basis. */ vmax = (double)(-1); kmax = -1; for(i=0; i<=nactivelin-1; i++) { v = 0.0; for(j=0; j<=n-1; j++) { v = v+ae_sqr(state->tmpbasis.ptr.pp_double[i][j], _state)/state->h.ptr.p_double[j]; } v = ae_sqrt(v, _state); if( ae_fp_greater(v,vmax) ) { vmax = v; kmax = i; } } if( ae_fp_less(vmax,1.0E4*ae_machineepsilon) ) { break; } v = 1/vmax; ae_v_moved(&state->pbasis.ptr.pp_double[state->basissize][0], 1, &state->tmpbasis.ptr.pp_double[kmax][0], 1, ae_v_len(0,n), v); state->basissize = state->basissize+1; /* * Reorthogonalize other vectors with respect to chosen one. * Remove it from the array. */ for(i=0; i<=nactivelin-1; i++) { if( i!=kmax ) { v = (double)(0); for(j=0; j<=n-1; j++) { v = v+state->pbasis.ptr.pp_double[state->basissize-1][j]*state->tmpbasis.ptr.pp_double[i][j]/state->h.ptr.p_double[j]; } ae_v_subd(&state->tmpbasis.ptr.pp_double[i][0], 1, &state->pbasis.ptr.pp_double[state->basissize-1][0], 1, ae_v_len(0,n), v); } } for(j=0; j<=n; j++) { state->tmpbasis.ptr.pp_double[kmax][j] = (double)(0); } } /* * Orthogonalize linear constraints using traditional dot product * with respect to each other and boundary ones. * * NOTE: we force basis size to be equal to one which was computed * at the previous step, with preconditioner-based inner product.
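*
* The dot-product pass below (and the scale-based pass after it) reuses
* the BasisSize computed above with the preconditioner-based inner
* product: it runs exactly BasisSize rounds, and whenever the largest
* remaining row has zero norm the corresponding IBasis/SBasis vector is
* simply filled with zeros - this is the padding by zero vectors
* mentioned in the function comment. The inner product here is the
* ordinary SUM(x[j]*y[j]); the pass after it uses SUM(x[j]*y[j]*S[j]^2).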
*/ nactivelin = 0; for(i=0; i<=nec+nic-1; i++) { if( state->activeset.ptr.p_int[n+i]>0 ) { ae_v_move(&state->tmpbasis.ptr.pp_double[nactivelin][0], 1, &state->cleic.ptr.pp_double[i][0], 1, ae_v_len(0,n)); nactivelin = nactivelin+1; } } for(i=0; i<=nactivelin-1; i++) { v = 0.0; for(j=0; j<=n-1; j++) { v = v+ae_sqr(state->tmpbasis.ptr.pp_double[i][j], _state); } if( ae_fp_greater(v,(double)(0)) ) { v = 1/ae_sqrt(v, _state); for(j=0; j<=n; j++) { state->tmpbasis.ptr.pp_double[i][j] = state->tmpbasis.ptr.pp_double[i][j]*v; } } } for(j=0; j<=n-1; j++) { if( state->activeset.ptr.p_int[j]>0 ) { for(i=0; i<=nactivelin-1; i++) { state->tmpbasis.ptr.pp_double[i][n] = state->tmpbasis.ptr.pp_double[i][n]-state->tmpbasis.ptr.pp_double[i][j]*state->xc.ptr.p_double[j]; state->tmpbasis.ptr.pp_double[i][j] = 0.0; } } } for(t=0; t<=state->basissize-1; t++) { /* * Find largest vector, add to basis. */ vmax = (double)(-1); kmax = -1; for(i=0; i<=nactivelin-1; i++) { v = 0.0; for(j=0; j<=n-1; j++) { v = v+ae_sqr(state->tmpbasis.ptr.pp_double[i][j], _state); } v = ae_sqrt(v, _state); if( ae_fp_greater(v,vmax) ) { vmax = v; kmax = i; } } if( ae_fp_eq(vmax,(double)(0)) ) { for(j=0; j<=n; j++) { state->ibasis.ptr.pp_double[t][j] = 0.0; } continue; } v = 1/vmax; ae_v_moved(&state->ibasis.ptr.pp_double[t][0], 1, &state->tmpbasis.ptr.pp_double[kmax][0], 1, ae_v_len(0,n), v); /* * Reorthogonalize other vectors with respect to chosen one. * Remove it from the array. */ for(i=0; i<=nactivelin-1; i++) { if( i!=kmax ) { v = (double)(0); for(j=0; j<=n-1; j++) { v = v+state->ibasis.ptr.pp_double[t][j]*state->tmpbasis.ptr.pp_double[i][j]; } ae_v_subd(&state->tmpbasis.ptr.pp_double[i][0], 1, &state->ibasis.ptr.pp_double[t][0], 1, ae_v_len(0,n), v); } } for(j=0; j<=n; j++) { state->tmpbasis.ptr.pp_double[kmax][j] = (double)(0); } } /* * Orthogonalize linear constraints using inner product given by * scale matrix. * * NOTE: we force basis size to be equal to one which was computed * with preconditioner-based inner product. */ nactivelin = 0; for(i=0; i<=nec+nic-1; i++) { if( state->activeset.ptr.p_int[n+i]>0 ) { ae_v_move(&state->tmpbasis.ptr.pp_double[nactivelin][0], 1, &state->cleic.ptr.pp_double[i][0], 1, ae_v_len(0,n)); nactivelin = nactivelin+1; } } for(i=0; i<=nactivelin-1; i++) { v = 0.0; for(j=0; j<=n-1; j++) { v = v+ae_sqr(state->tmpbasis.ptr.pp_double[i][j]*state->s.ptr.p_double[j], _state); } if( ae_fp_greater(v,(double)(0)) ) { v = 1/ae_sqrt(v, _state); for(j=0; j<=n; j++) { state->tmpbasis.ptr.pp_double[i][j] = state->tmpbasis.ptr.pp_double[i][j]*v; } } } for(j=0; j<=n-1; j++) { if( state->activeset.ptr.p_int[j]>0 ) { for(i=0; i<=nactivelin-1; i++) { state->tmpbasis.ptr.pp_double[i][n] = state->tmpbasis.ptr.pp_double[i][n]-state->tmpbasis.ptr.pp_double[i][j]*state->xc.ptr.p_double[j]; state->tmpbasis.ptr.pp_double[i][j] = 0.0; } } } for(t=0; t<=state->basissize-1; t++) { /* * Find largest vector, add to basis. */ vmax = (double)(-1); kmax = -1; for(i=0; i<=nactivelin-1; i++) { v = 0.0; for(j=0; j<=n-1; j++) { v = v+ae_sqr(state->tmpbasis.ptr.pp_double[i][j]*state->s.ptr.p_double[j], _state); } v = ae_sqrt(v, _state); if( ae_fp_greater(v,vmax) ) { vmax = v; kmax = i; } } if( ae_fp_eq(vmax,(double)(0)) ) { for(j=0; j<=n; j++) { state->sbasis.ptr.pp_double[t][j] = 0.0; } continue; } v = 1/vmax; ae_v_moved(&state->sbasis.ptr.pp_double[t][0], 1, &state->tmpbasis.ptr.pp_double[kmax][0], 1, ae_v_len(0,n), v); /* * Reorthogonalize other vectors with respect to chosen one. * Remove it from the array. 
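*
* For example, with scales S=(2,1) the projection coefficient computed in
* this pass for basis vector (1,0|0) and remaining row (1,1|0) is
*     v = 1*1*2^2 + 0*1*1^2 = 4
* i.e. components are weighted by S[j]^2; the previous pass, which used
* the plain dot product, would give v=1 for the same pair of vectors.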
*/ for(i=0; i<=nactivelin-1; i++) { if( i!=kmax ) { v = (double)(0); for(j=0; j<=n-1; j++) { v = v+state->sbasis.ptr.pp_double[t][j]*state->tmpbasis.ptr.pp_double[i][j]*ae_sqr(state->s.ptr.p_double[j], _state); } ae_v_subd(&state->tmpbasis.ptr.pp_double[i][0], 1, &state->sbasis.ptr.pp_double[t][0], 1, ae_v_len(0,n), v); } } for(j=0; j<=n; j++) { state->tmpbasis.ptr.pp_double[kmax][j] = (double)(0); } } } /************************************************************************* This subroutine calculates preconditioned descent direction subject to current active set. INPUT PARAMETERS: State - active set object G - array[N], gradient H - array[N], Hessian matrix HA - active constraints orthogonalized in such way that HA*inv(H)*HA'= I. Normalize- whether we need normalized descent or not D - possibly preallocated buffer; automatically resized. OUTPUT PARAMETERS: D - descent direction projected onto current active set. Components of D which correspond to active boundary constraints are forced to be exactly zero. In case D is non-zero and Normalize is True, it is normalized to have unit norm. NOTE: if we have N active constraints, D is explicitly set to zero. -- ALGLIB -- Copyright 21.12.2012 by Bochkanov Sergey *************************************************************************/ static void sactivesets_constraineddescent(sactiveset* state, /* Real */ ae_vector* g, /* Real */ ae_vector* h, /* Real */ ae_matrix* ha, ae_bool normalize, /* Real */ ae_vector* d, ae_state *_state) { ae_int_t i; ae_int_t j; ae_int_t n; double v; ae_int_t nactive; ae_assert(state->algostate==1, "SAS: internal error in ConstrainedDescent() - not in optimization mode", _state); ae_assert(state->basisisready, "SAS: internal error in ConstrainedDescent() - no basis", _state); n = state->n; rvectorsetlengthatleast(d, n, _state); /* * Calculate preconditioned constrained descent direction: * * d := -inv(H)*( g - HA'*(HA*inv(H)*g) ) * * Formula above always gives direction which is orthogonal to rows of HA. * You can verify it by multiplication of both sides by HA[i] (I-th row), * taking into account that HA*inv(H)*HA'= I (by definition of HA - it is * orthogonal basis with inner product given by inv(H)). */ nactive = 0; for(i=0; i<=n-1; i++) { if( state->activeset.ptr.p_int[i]>0 ) { d->ptr.p_double[i] = (double)(0); nactive = nactive+1; } else { d->ptr.p_double[i] = g->ptr.p_double[i]; } } for(i=0; i<=state->basissize-1; i++) { v = 0.0; for(j=0; j<=n-1; j++) { v = v+ha->ptr.pp_double[i][j]*d->ptr.p_double[j]/h->ptr.p_double[j]; } ae_v_subd(&d->ptr.p_double[0], 1, &ha->ptr.pp_double[i][0], 1, ae_v_len(0,n-1), v); nactive = nactive+1; } v = 0.0; for(i=0; i<=n-1; i++) { if( state->activeset.ptr.p_int[i]>0 ) { d->ptr.p_double[i] = (double)(0); } else { d->ptr.p_double[i] = -d->ptr.p_double[i]/h->ptr.p_double[i]; v = v+ae_sqr(d->ptr.p_double[i], _state); } } v = ae_sqrt(v, _state); if( nactive>=n ) { v = (double)(0); for(i=0; i<=n-1; i++) { d->ptr.p_double[i] = (double)(0); } } if( normalize&&ae_fp_greater(v,(double)(0)) ) { for(i=0; i<=n-1; i++) { d->ptr.p_double[i] = d->ptr.p_double[i]/v; } } } /************************************************************************* This function recalculates constraints - activates and deactivates them according to gradient value at current point. Algorithm assumes that we want to make Quasi-Newton step from current point with diagonal Quasi-Newton matrix H. Constraints are activated and deactivated in such way that we won't violate any constraint by step. 
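NOTE (illustration of the ConstrainedDescent formula above, hypothetical values): for N=2, unit H, a single orthonormalized active row HA=(1,0) and gradient G=(3,4), we get HA*inv(H)*G=3, G-HA'*3=(0,4) and D=-(0,4); the direction is orthogonal to HA, so the step does not move the constrained variable, and with Normalize=True it is returned as (0,-1).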
Only already "active" and "candidate" elements of ActiveSet are examined; constraints which are not active are not examined. INPUT PARAMETERS: State - active set object GC - array[N], gradient at XC H - array[N], Hessian matrix OUTPUT PARAMETERS: State - active set object, with new set of constraint -- ALGLIB -- Copyright 26.09.2012 by Bochkanov Sergey *************************************************************************/ static void sactivesets_reactivateconstraints(sactiveset* state, /* Real */ ae_vector* gc, /* Real */ ae_vector* h, ae_state *_state) { ae_int_t n; ae_int_t nec; ae_int_t nic; ae_int_t i; ae_int_t j; ae_int_t idx0; ae_int_t idx1; double v; ae_int_t nactivebnd; ae_int_t nactivelin; ae_int_t nactiveconstraints; double rowscale; ae_assert(state->algostate==1, "SASReactivateConstraintsPrec: must be in optimization mode", _state); /* * Prepare */ n = state->n; nec = state->nec; nic = state->nic; state->basisisready = ae_false; /* * Handle important special case - no linear constraints, * only boundary constraints are present */ if( nec+nic==0 ) { for(i=0; i<=n-1; i++) { if( (state->hasbndl.ptr.p_bool[i]&&state->hasbndu.ptr.p_bool[i])&&ae_fp_eq(state->bndl.ptr.p_double[i],state->bndu.ptr.p_double[i]) ) { state->activeset.ptr.p_int[i] = 1; continue; } if( (state->hasbndl.ptr.p_bool[i]&&ae_fp_eq(state->xc.ptr.p_double[i],state->bndl.ptr.p_double[i]))&&ae_fp_greater_eq(gc->ptr.p_double[i],(double)(0)) ) { state->activeset.ptr.p_int[i] = 1; continue; } if( (state->hasbndu.ptr.p_bool[i]&&ae_fp_eq(state->xc.ptr.p_double[i],state->bndu.ptr.p_double[i]))&&ae_fp_less_eq(gc->ptr.p_double[i],(double)(0)) ) { state->activeset.ptr.p_int[i] = 1; continue; } state->activeset.ptr.p_int[i] = -1; } return; } /* * General case. * Allocate temporaries. */ rvectorsetlengthatleast(&state->rctmpg, n, _state); rvectorsetlengthatleast(&state->rctmprightpart, n, _state); rvectorsetlengthatleast(&state->rctmps, n, _state); rmatrixsetlengthatleast(&state->rctmpdense0, n, nec+nic, _state); rmatrixsetlengthatleast(&state->rctmpdense1, n, nec+nic, _state); bvectorsetlengthatleast(&state->rctmpisequality, n+nec+nic, _state); ivectorsetlengthatleast(&state->rctmpconstraintidx, n+nec+nic, _state); /* * Calculate descent direction */ ae_v_moveneg(&state->rctmpg.ptr.p_double[0], 1, &gc->ptr.p_double[0], 1, ae_v_len(0,n-1)); /* * Determine candidates to the active set. * * After this block constraints become either "inactive" (ActiveSet[i]<0) * or "candidates" (ActiveSet[i]=0). Previously active constraints always * become "candidates". 
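*
* The convention used here (and throughout this unit) is: ActiveSet[i]>0
* means the I-th constraint is active, ActiveSet[i]=0 means it is a
* "candidate" (the point satisfies it as an equality, activation is
* decided later), ActiveSet[i]<0 means it is inactive. For example,
* ActiveSet=(+1,0,-1) for three boundary constraints reads: first bound
* active, second merely touched, third ignored for the current step.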
*/ for(i=0; i<=n+nec+nic-1; i++) { if( state->activeset.ptr.p_int[i]>0 ) { state->activeset.ptr.p_int[i] = 0; } else { state->activeset.ptr.p_int[i] = -1; } } nactiveconstraints = 0; nactivebnd = 0; nactivelin = 0; for(i=0; i<=n-1; i++) { /* * Activate boundary constraints: * * copy constraint index to RCTmpConstraintIdx * * set corresponding element of ActiveSet[] to "candidate" * * fill RCTmpS by either +1 (lower bound) or -1 (upper bound) * * set RCTmpIsEquality to False (BndL<BndU) or True (BndL=BndU) */ if( (state->hasbndl.ptr.p_bool[i]&&state->hasbndu.ptr.p_bool[i])&&ae_fp_eq(state->bndl.ptr.p_double[i],state->bndu.ptr.p_double[i]) ) { /* * Equality constraint is activated */ state->rctmpconstraintidx.ptr.p_int[nactiveconstraints] = i; state->activeset.ptr.p_int[i] = 0; state->rctmps.ptr.p_double[i] = 1.0; state->rctmpisequality.ptr.p_bool[nactiveconstraints] = ae_true; nactiveconstraints = nactiveconstraints+1; nactivebnd = nactivebnd+1; continue; } if( state->hasbndl.ptr.p_bool[i]&&ae_fp_eq(state->xc.ptr.p_double[i],state->bndl.ptr.p_double[i]) ) { /* * Lower bound is activated */ state->rctmpconstraintidx.ptr.p_int[nactiveconstraints] = i; state->activeset.ptr.p_int[i] = 0; state->rctmps.ptr.p_double[i] = -1.0; state->rctmpisequality.ptr.p_bool[nactiveconstraints] = ae_false; nactiveconstraints = nactiveconstraints+1; nactivebnd = nactivebnd+1; continue; } if( state->hasbndu.ptr.p_bool[i]&&ae_fp_eq(state->xc.ptr.p_double[i],state->bndu.ptr.p_double[i]) ) { /* * Upper bound is activated */ state->rctmpconstraintidx.ptr.p_int[nactiveconstraints] = i; state->activeset.ptr.p_int[i] = 0; state->rctmps.ptr.p_double[i] = 1.0; state->rctmpisequality.ptr.p_bool[nactiveconstraints] = ae_false; nactiveconstraints = nactiveconstraints+1; nactivebnd = nactivebnd+1; continue; } } for(i=0; i<=nec+nic-1; i++) { if( i>=nec&&state->activeset.ptr.p_int[n+i]<0 ) { /* * Inequality constraints are skipped if both (a) constraint was * not active, and (b) we are too far away from the boundary. */ rowscale = 0.0; v = -state->cleic.ptr.pp_double[i][n]; for(j=0; j<=n-1; j++) { v = v+state->cleic.ptr.pp_double[i][j]*state->xc.ptr.p_double[j]; rowscale = ae_maxreal(rowscale, ae_fabs(state->cleic.ptr.pp_double[i][j]*state->s.ptr.p_double[j], _state), _state); } if( ae_fp_less_eq(v,-1.0E5*ae_machineepsilon*rowscale) ) { /* * NOTE: it is important to check for non-strict inequality * because we have to correctly handle zero constraint * 0*x<=0 */ continue; } } ae_v_move(&state->rctmpdense0.ptr.pp_double[0][nactivelin], state->rctmpdense0.stride, &state->cleic.ptr.pp_double[i][0], 1, ae_v_len(0,n-1)); state->rctmpconstraintidx.ptr.p_int[nactiveconstraints] = n+i; state->activeset.ptr.p_int[n+i] = 0; state->rctmpisequality.ptr.p_bool[nactiveconstraints] = i<nec; nactiveconstraints = nactiveconstraints+1; nactivelin = nactivelin+1; } if( nactivelin==0 ) { for(i=0; i<=n-1; i++) { if( (state->hasbndl.ptr.p_bool[i]&&state->hasbndu.ptr.p_bool[i])&&ae_fp_eq(state->bndl.ptr.p_double[i],state->bndu.ptr.p_double[i]) ) { state->activeset.ptr.p_int[i] = 1; continue; } if( (state->hasbndl.ptr.p_bool[i]&&ae_fp_eq(state->xc.ptr.p_double[i],state->bndl.ptr.p_double[i]))&&ae_fp_greater_eq(gc->ptr.p_double[i],(double)(0)) ) { state->activeset.ptr.p_int[i] = 1; continue; } if( (state->hasbndu.ptr.p_bool[i]&&ae_fp_eq(state->xc.ptr.p_double[i],state->bndu.ptr.p_double[i]))&&ae_fp_less_eq(gc->ptr.p_double[i],(double)(0)) ) { state->activeset.ptr.p_int[i] = 1; continue; } } return; } /* * General case. * * APPROACH TO CONSTRAINTS ACTIVATION/DEACTIVATION * * We have NActiveConstraints "candidates": NActiveBnd boundary candidates, * NActiveLin linear candidates.
Indexes of boundary constraints are stored * in RCTmpConstraintIdx[0:NActiveBnd-1], indexes of linear ones are stored * in RCTmpConstraintIdx[NActiveBnd:NActiveBnd+NActiveLin-1]. Some of the * constraints are equality ones, some are inequality - as specified by * RCTmpIsEquality[i]. * * Now we have to determine active subset of "candidates" set. In order to * do so we solve following constrained minimization problem: * ( )^2 * min ( SUM(lambda[i]*A[i]) + G ) * ( ) * Here: * * G is a gradient (column vector) * * A[i] is a column vector, linear (left) part of I-th constraint. * I=0..NActiveConstraints-1, first NActiveBnd elements of A are just * subset of identity matrix (boundary constraints), next NActiveLin * elements are subset of rows of the matrix of general linear constraints. * * lambda[i] is a Lagrange multiplier corresponding to I-th constraint * * NOTE: for preconditioned setting A is replaced by A*H^(-0.5), G is * replaced by G*H^(-0.5). We apply this scaling at the last stage, * before passing data to NNLS solver. * * Minimization is performed subject to non-negativity constraints on * lambda[i] corresponding to inequality constraints. Inequality constraints * which correspond to non-zero lambda are activated, equality constraints * are always considered active. * * Informally speaking, we "decompose" descent direction -G and represent * it as sum of constraint vectors and "residual" part (which is equal to * the actual descent direction subject to constraints). * * SOLUTION OF THE NNLS PROBLEM * * We solve this optimization problem with Non-Negative Least Squares solver, * which can efficiently solve least squares problems of the form * * ( [ I | AU ] )^2 * min ( [ | ]*x-b ) s.t. non-negativity constraints on some x[i] * ( [ 0 | AL ] ) * * In order to use this solver we have to rearrange rows of A[] and G in * such way that first NActiveBnd columns of A store identity matrix (before * sorting non-zero elements are randomly distributed in the first NActiveBnd * columns of A, during sorting we move them to first NActiveBnd rows). * * Then we create instance of NNLS solver (we reuse instance left from the * previous run of the optimization problem) and solve NNLS problem. */ idx0 = 0; idx1 = nactivebnd; for(i=0; i<=n-1; i++) { if( state->activeset.ptr.p_int[i]>=0 ) { v = 1/ae_sqrt(h->ptr.p_double[i], _state); for(j=0; j<=nactivelin-1; j++) { state->rctmpdense1.ptr.pp_double[idx0][j] = state->rctmpdense0.ptr.pp_double[i][j]/state->rctmps.ptr.p_double[i]*v; } state->rctmprightpart.ptr.p_double[idx0] = state->rctmpg.ptr.p_double[i]/state->rctmps.ptr.p_double[i]*v; idx0 = idx0+1; } else { v = 1/ae_sqrt(h->ptr.p_double[i], _state); for(j=0; j<=nactivelin-1; j++) { state->rctmpdense1.ptr.pp_double[idx1][j] = state->rctmpdense0.ptr.pp_double[i][j]*v; } state->rctmprightpart.ptr.p_double[idx1] = state->rctmpg.ptr.p_double[i]*v; idx1 = idx1+1; } } snnlsinit(n, ae_minint(nec+nic, n, _state), n, &state->solver, _state); snnlssetproblem(&state->solver, &state->rctmpdense1, &state->rctmprightpart, nactivebnd, nactiveconstraints-nactivebnd, n, _state); for(i=0; i<=nactiveconstraints-1; i++) { if( state->rctmpisequality.ptr.p_bool[i] ) { snnlsdropnnc(&state->solver, i, _state); } } snnlssolve(&state->solver, &state->rctmplambdas, _state); /* * After solution of the problem we activate equality constraints (always active) * and inequality constraints with non-zero Lagrange multipliers. Then we reorthogonalize * active constraints. 
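*
* The single-constraint version of this multiplier test can be written as
* a tiny standalone sketch (illustration only, not ALGLIB code):
*
*     #include <vector>
*
*     // closed-form solution of min |lambda*a+g|^2 s.t. lambda>=0;
*     // a strictly positive result means "activate the constraint"
*     double toy_multiplier(const std::vector<double>& a,  // constraint row
*                           const std::vector<double>& g)  // gradient
*     {
*         double ag = 0.0, aa = 0.0;
*         for(int j=0; j<(int)a.size(); j++) { ag += a[j]*g[j]; aa += a[j]*a[j]; }
*         double lambda = -ag/aa;               // unconstrained minimizer
*         return lambda>0.0 ? lambda : 0.0;     // clipped by non-negativity
*     }
*
* E.g. for a=(1,0) and g=(-3,4) it returns lambda=3, so the constraint is
* activated; for g=(3,4) the unconstrained minimizer is negative, the
* non-negativity bound clips it to zero, and the constraint is left as a
* mere candidate.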
*/ for(i=0; i<=n+nec+nic-1; i++) { state->activeset.ptr.p_int[i] = -1; } for(i=0; i<=nactiveconstraints-1; i++) { if( state->rctmpisequality.ptr.p_bool[i]||ae_fp_greater(state->rctmplambdas.ptr.p_double[i],(double)(0)) ) { state->activeset.ptr.p_int[state->rctmpconstraintidx.ptr.p_int[i]] = 1; } else { state->activeset.ptr.p_int[state->rctmpconstraintidx.ptr.p_int[i]] = 0; } } sasrebuildbasis(state, _state); } void _sactiveset_init(void* _p, ae_state *_state) { sactiveset *p = (sactiveset*)_p; ae_touch_ptr((void*)p); ae_vector_init(&p->xc, 0, DT_REAL, _state); ae_vector_init(&p->s, 0, DT_REAL, _state); ae_vector_init(&p->h, 0, DT_REAL, _state); ae_vector_init(&p->activeset, 0, DT_INT, _state); ae_matrix_init(&p->sbasis, 0, 0, DT_REAL, _state); ae_matrix_init(&p->pbasis, 0, 0, DT_REAL, _state); ae_matrix_init(&p->ibasis, 0, 0, DT_REAL, _state); ae_vector_init(&p->hasbndl, 0, DT_BOOL, _state); ae_vector_init(&p->hasbndu, 0, DT_BOOL, _state); ae_vector_init(&p->bndl, 0, DT_REAL, _state); ae_vector_init(&p->bndu, 0, DT_REAL, _state); ae_matrix_init(&p->cleic, 0, 0, DT_REAL, _state); ae_vector_init(&p->mtx, 0, DT_REAL, _state); ae_vector_init(&p->mtas, 0, DT_INT, _state); ae_vector_init(&p->cdtmp, 0, DT_REAL, _state); ae_vector_init(&p->corrtmp, 0, DT_REAL, _state); ae_vector_init(&p->unitdiagonal, 0, DT_REAL, _state); _snnlssolver_init(&p->solver, _state); ae_vector_init(&p->scntmp, 0, DT_REAL, _state); ae_vector_init(&p->tmp0, 0, DT_REAL, _state); ae_vector_init(&p->tmpfeas, 0, DT_REAL, _state); ae_matrix_init(&p->tmpm0, 0, 0, DT_REAL, _state); ae_vector_init(&p->rctmps, 0, DT_REAL, _state); ae_vector_init(&p->rctmpg, 0, DT_REAL, _state); ae_vector_init(&p->rctmprightpart, 0, DT_REAL, _state); ae_matrix_init(&p->rctmpdense0, 0, 0, DT_REAL, _state); ae_matrix_init(&p->rctmpdense1, 0, 0, DT_REAL, _state); ae_vector_init(&p->rctmpisequality, 0, DT_BOOL, _state); ae_vector_init(&p->rctmpconstraintidx, 0, DT_INT, _state); ae_vector_init(&p->rctmplambdas, 0, DT_REAL, _state); ae_matrix_init(&p->tmpbasis, 0, 0, DT_REAL, _state); } void _sactiveset_init_copy(void* _dst, void* _src, ae_state *_state) { sactiveset *dst = (sactiveset*)_dst; sactiveset *src = (sactiveset*)_src; dst->n = src->n; dst->algostate = src->algostate; ae_vector_init_copy(&dst->xc, &src->xc, _state); dst->hasxc = src->hasxc; ae_vector_init_copy(&dst->s, &src->s, _state); ae_vector_init_copy(&dst->h, &src->h, _state); ae_vector_init_copy(&dst->activeset, &src->activeset, _state); dst->basisisready = src->basisisready; ae_matrix_init_copy(&dst->sbasis, &src->sbasis, _state); ae_matrix_init_copy(&dst->pbasis, &src->pbasis, _state); ae_matrix_init_copy(&dst->ibasis, &src->ibasis, _state); dst->basissize = src->basissize; dst->feasinitpt = src->feasinitpt; dst->constraintschanged = src->constraintschanged; ae_vector_init_copy(&dst->hasbndl, &src->hasbndl, _state); ae_vector_init_copy(&dst->hasbndu, &src->hasbndu, _state); ae_vector_init_copy(&dst->bndl, &src->bndl, _state); ae_vector_init_copy(&dst->bndu, &src->bndu, _state); ae_matrix_init_copy(&dst->cleic, &src->cleic, _state); dst->nec = src->nec; dst->nic = src->nic; ae_vector_init_copy(&dst->mtx, &src->mtx, _state); ae_vector_init_copy(&dst->mtas, &src->mtas, _state); ae_vector_init_copy(&dst->cdtmp, &src->cdtmp, _state); ae_vector_init_copy(&dst->corrtmp, &src->corrtmp, _state); ae_vector_init_copy(&dst->unitdiagonal, &src->unitdiagonal, _state); _snnlssolver_init_copy(&dst->solver, &src->solver, _state); ae_vector_init_copy(&dst->scntmp, &src->scntmp, _state); 
ae_vector_init_copy(&dst->tmp0, &src->tmp0, _state); ae_vector_init_copy(&dst->tmpfeas, &src->tmpfeas, _state); ae_matrix_init_copy(&dst->tmpm0, &src->tmpm0, _state); ae_vector_init_copy(&dst->rctmps, &src->rctmps, _state); ae_vector_init_copy(&dst->rctmpg, &src->rctmpg, _state); ae_vector_init_copy(&dst->rctmprightpart, &src->rctmprightpart, _state); ae_matrix_init_copy(&dst->rctmpdense0, &src->rctmpdense0, _state); ae_matrix_init_copy(&dst->rctmpdense1, &src->rctmpdense1, _state); ae_vector_init_copy(&dst->rctmpisequality, &src->rctmpisequality, _state); ae_vector_init_copy(&dst->rctmpconstraintidx, &src->rctmpconstraintidx, _state); ae_vector_init_copy(&dst->rctmplambdas, &src->rctmplambdas, _state); ae_matrix_init_copy(&dst->tmpbasis, &src->tmpbasis, _state); } void _sactiveset_clear(void* _p) { sactiveset *p = (sactiveset*)_p; ae_touch_ptr((void*)p); ae_vector_clear(&p->xc); ae_vector_clear(&p->s); ae_vector_clear(&p->h); ae_vector_clear(&p->activeset); ae_matrix_clear(&p->sbasis); ae_matrix_clear(&p->pbasis); ae_matrix_clear(&p->ibasis); ae_vector_clear(&p->hasbndl); ae_vector_clear(&p->hasbndu); ae_vector_clear(&p->bndl); ae_vector_clear(&p->bndu); ae_matrix_clear(&p->cleic); ae_vector_clear(&p->mtx); ae_vector_clear(&p->mtas); ae_vector_clear(&p->cdtmp); ae_vector_clear(&p->corrtmp); ae_vector_clear(&p->unitdiagonal); _snnlssolver_clear(&p->solver); ae_vector_clear(&p->scntmp); ae_vector_clear(&p->tmp0); ae_vector_clear(&p->tmpfeas); ae_matrix_clear(&p->tmpm0); ae_vector_clear(&p->rctmps); ae_vector_clear(&p->rctmpg); ae_vector_clear(&p->rctmprightpart); ae_matrix_clear(&p->rctmpdense0); ae_matrix_clear(&p->rctmpdense1); ae_vector_clear(&p->rctmpisequality); ae_vector_clear(&p->rctmpconstraintidx); ae_vector_clear(&p->rctmplambdas); ae_matrix_clear(&p->tmpbasis); } void _sactiveset_destroy(void* _p) { sactiveset *p = (sactiveset*)_p; ae_touch_ptr((void*)p); ae_vector_destroy(&p->xc); ae_vector_destroy(&p->s); ae_vector_destroy(&p->h); ae_vector_destroy(&p->activeset); ae_matrix_destroy(&p->sbasis); ae_matrix_destroy(&p->pbasis); ae_matrix_destroy(&p->ibasis); ae_vector_destroy(&p->hasbndl); ae_vector_destroy(&p->hasbndu); ae_vector_destroy(&p->bndl); ae_vector_destroy(&p->bndu); ae_matrix_destroy(&p->cleic); ae_vector_destroy(&p->mtx); ae_vector_destroy(&p->mtas); ae_vector_destroy(&p->cdtmp); ae_vector_destroy(&p->corrtmp); ae_vector_destroy(&p->unitdiagonal); _snnlssolver_destroy(&p->solver); ae_vector_destroy(&p->scntmp); ae_vector_destroy(&p->tmp0); ae_vector_destroy(&p->tmpfeas); ae_matrix_destroy(&p->tmpm0); ae_vector_destroy(&p->rctmps); ae_vector_destroy(&p->rctmpg); ae_vector_destroy(&p->rctmprightpart); ae_matrix_destroy(&p->rctmpdense0); ae_matrix_destroy(&p->rctmpdense1); ae_vector_destroy(&p->rctmpisequality); ae_vector_destroy(&p->rctmpconstraintidx); ae_vector_destroy(&p->rctmplambdas); ae_matrix_destroy(&p->tmpbasis); } /************************************************************************* This function initializes QQPSettings structure with default settings. Newly created structure MUST be initialized by default settings - or by copy of the already initialized structure. 
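For example (values follow directly from the code below): after qqploaddefaults(100, &s, _state) the settings prescribe EpsX=1.0E-6 as the only nonzero stopping tolerance, no limit on outer iterations (MaxOuterIts=0), both the CG phase and the constrained Newton phase enabled, CGMinIts=5 and CGMaxIts=max(5,round(1+0.33*100))=34 bounding the number of CG iterations per outer iteration, and at most CNMaxUpdates=round(1+0.1*100)=11 constrained-Newton matrix updates.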
-- ALGLIB -- Copyright 14.05.2011 by Bochkanov Sergey *************************************************************************/ void qqploaddefaults(ae_int_t nmain, qqpsettings* s, ae_state *_state) { s->epsg = 0.0; s->epsf = 0.0; s->epsx = 1.0E-6; s->maxouterits = 0; s->cgphase = ae_true; s->cnphase = ae_true; s->cgminits = 5; s->cgmaxits = ae_maxint(s->cgminits, ae_round(1+0.33*nmain, _state), _state); s->sparsesolver = 0; s->cnmaxupdates = ae_round(1+0.1*nmain, _state); } /************************************************************************* This function initializes QQPSettings structure with copy of another, already initialized structure. -- ALGLIB -- Copyright 14.05.2011 by Bochkanov Sergey *************************************************************************/ void qqpcopysettings(qqpsettings* src, qqpsettings* dst, ae_state *_state) { dst->epsg = src->epsg; dst->epsf = src->epsf; dst->epsx = src->epsx; dst->maxouterits = src->maxouterits; dst->cgphase = src->cgphase; dst->cnphase = src->cnphase; dst->cgminits = src->cgminits; dst->cgmaxits = src->cgmaxits; dst->sparsesolver = src->sparsesolver; dst->cnmaxupdates = src->cnmaxupdates; } /************************************************************************* This function runs QQP solver; it returns after optimization process was completed. Following QP problem is solved: min(0.5*(x-x_origin)'*A*(x-x_origin)+b'*(x-x_origin)) subject to boundary constraints. IMPORTANT: UNLIKE MANY OTHER SOLVERS, THIS FUNCTION DOES NOT REQUIRE YOU TO INITIALIZE STATE OBJECT. IT CAN BE AUTOMATICALLY INITIALIZED DURING SOLUTION PROCESS. INPUT PARAMETERS: AC - for dense problems given by CQM model (AKind=0) A-term of CQM object contains system matrix. Other terms are unspecified and should not be referenced. SparseAC - for sparse problems (AKind=1) DenseAC - for traditional dense matrices (AKind=2) AKind - matrix term to use: * 0 for dense CQM (CQMAC) * 1 for sparse matrix (SparseAC) * 2 for dense matrix (DenseAC) IsUpper - which triangle of SparseAC/DenseAC stores matrix - upper or lower one (for dense matrices this parameter is not actual). BC - linear term, array[NC] BndLC - lower bound, array[NC] BndUC - upper bound, array[NC] SC - scale vector, array[NC]: * I-th element contains scale of I-th variable, * SC[I]>0 XOriginC - origin term, array[NC]. Can be zero. NC - number of variables in the original formulation (no slack variables). CLEICC - linear equality/inequality constraints. Present version of this function does NOT provide publicly available support for linear constraints. This feature will be introduced in the future versions of the function. NEC, NIC - number of equality/inequality constraints. MUST BE ZERO IN THE CURRENT VERSION!!! Settings - QQPSettings object initialized by one of the initialization functions. 
SState - object which stores temporaries: * uninitialized object is automatically initialized * previously allocated memory is reused as much as possible XS - initial point, array[NC] OUTPUT PARAMETERS: XS - last point TerminationType-termination type: * * * -- ALGLIB -- Copyright 14.05.2011 by Bochkanov Sergey *************************************************************************/ void qqpoptimize(convexquadraticmodel* cqmac, sparsematrix* sparseac, /* Real */ ae_matrix* denseac, ae_int_t akind, ae_bool isupper, /* Real */ ae_vector* bc, /* Real */ ae_vector* bndlc, /* Real */ ae_vector* bnduc, /* Real */ ae_vector* sc, /* Real */ ae_vector* xoriginc, ae_int_t nc, /* Real */ ae_matrix* cleicc, ae_int_t nec, ae_int_t nic, qqpsettings* settings, qqpbuffers* sstate, /* Real */ ae_vector* xs, ae_int_t* terminationtype, ae_state *_state) { ae_int_t n; ae_int_t nmain; ae_int_t i; ae_int_t j; ae_int_t k; double v; double vv; double d2; double d1; ae_int_t d1est; ae_int_t d2est; ae_bool needact; double reststp; double fullstp; double stpmax; double stp; ae_int_t stpcnt; ae_int_t cidx; double cval; ae_int_t cgcnt; ae_int_t cgmax; ae_int_t newtcnt; ae_int_t sparsesolver; double beta; ae_bool b; double fprev; double fcur; ae_bool problemsolved; ae_bool isconstrained; double f0; double f1; *terminationtype = 0; /* * Primary checks */ ae_assert((akind==0||akind==1)||akind==2, "QQPOptimize: incorrect AKind", _state); sstate->nmain = nc; sstate->nslack = nic; sstate->n = nc+nic; sstate->nec = nec; sstate->nic = nic; n = sstate->n; nmain = sstate->nmain; *terminationtype = 0; sstate->repinneriterationscount = 0; sstate->repouteriterationscount = 0; sstate->repncholesky = 0; sstate->repncupdates = 0; /* * Several checks * * matrix size * * scale vector * * consistency of bound constraints * * consistency of settings */ if( akind==1 ) { ae_assert(sparsegetnrows(sparseac, _state)==nmain, "QQPOptimize: rows(SparseAC)<>NMain", _state); ae_assert(sparsegetncols(sparseac, _state)==nmain, "QQPOptimize: cols(SparseAC)<>NMain", _state); } for(i=0; i<=nmain-1; i++) { ae_assert(ae_isfinite(sc->ptr.p_double[i], _state)&&ae_fp_greater(sc->ptr.p_double[i],(double)(0)), "QQPOptimize: incorrect scale", _state); } for(i=0; i<=nmain-1; i++) { if( ae_isfinite(bndlc->ptr.p_double[i], _state)&&ae_isfinite(bnduc->ptr.p_double[i], _state) ) { if( ae_fp_greater(bndlc->ptr.p_double[i],bnduc->ptr.p_double[i]) ) { *terminationtype = -3; return; } } } ae_assert(settings->cgphase||settings->cnphase, "QQPOptimize: both phases (CG and Newton) are inactive", _state); /* * Allocate data structures */ rvectorsetlengthatleast(&sstate->bndl, n, _state); rvectorsetlengthatleast(&sstate->bndu, n, _state); bvectorsetlengthatleast(&sstate->havebndl, n, _state); bvectorsetlengthatleast(&sstate->havebndu, n, _state); rvectorsetlengthatleast(&sstate->xs, n, _state); rvectorsetlengthatleast(&sstate->xf, n, _state); rvectorsetlengthatleast(&sstate->xp, n, _state); rvectorsetlengthatleast(&sstate->gc, n, _state); rvectorsetlengthatleast(&sstate->cgc, n, _state); rvectorsetlengthatleast(&sstate->cgp, n, _state); rvectorsetlengthatleast(&sstate->dc, n, _state); rvectorsetlengthatleast(&sstate->dp, n, _state); rvectorsetlengthatleast(&sstate->tmp0, n, _state); rvectorsetlengthatleast(&sstate->stpbuf, 15, _state); sasinit(n, &sstate->sas, _state); /* * Scale/shift problem coefficients: * * min { 0.5*(x-x0)'*A*(x-x0) + b'*(x-x0) } * * becomes (after transformation "x = S*y+x0") * * min { 0.5*y'*(S*A*S)*y + (S*b)'*y * * Modified A_mod=S*A*S and 
b_mod=S*(b+A*x0) are * stored into SState.DenseA and SState.B. * * NOTE: DenseA/DenseB are arrays whose lengths are * NMain, not N=NMain+NSlack! We store reduced * matrix and vector because extend parts (last * NSlack rows/columns) are exactly zero. * */ rvectorsetlengthatleast(&sstate->b, nmain, _state); for(i=0; i<=nmain-1; i++) { sstate->b.ptr.p_double[i] = sc->ptr.p_double[i]*bc->ptr.p_double[i]; } sstate->akind = -99; if( akind==0 ) { /* * Dense QP problem - just copy and scale. */ rmatrixsetlengthatleast(&sstate->densea, nmain, nmain, _state); cqmgeta(cqmac, &sstate->densea, _state); sstate->akind = 0; sstate->absamax = (double)(0); sstate->absasum = (double)(0); sstate->absasum2 = (double)(0); for(i=0; i<=nmain-1; i++) { for(j=0; j<=nmain-1; j++) { v = sc->ptr.p_double[i]*sstate->densea.ptr.pp_double[i][j]*sc->ptr.p_double[j]; sstate->densea.ptr.pp_double[i][j] = v; sstate->absamax = ae_maxreal(sstate->absamax, v, _state); sstate->absasum = sstate->absasum+v; sstate->absasum2 = sstate->absasum2+v*v; } } } if( akind==1 ) { /* * Sparse QP problem - a bit tricky. Depending on format of the * input we use different strategies for copying matrix: * * SKS matrices are copied to SKS format * * anything else is copied to CRS format */ sparsecopytosksbuf(sparseac, &sstate->sparsea, _state); if( isupper ) { sparsetransposesks(&sstate->sparsea, _state); } sstate->akind = 1; sstate->sparseupper = ae_false; sstate->absamax = (double)(0); sstate->absasum = (double)(0); sstate->absasum2 = (double)(0); for(i=0; i<=n-1; i++) { k = sstate->sparsea.ridx.ptr.p_int[i]; for(j=i-sstate->sparsea.didx.ptr.p_int[i]; j<=i; j++) { v = sc->ptr.p_double[i]*sstate->sparsea.vals.ptr.p_double[k]*sc->ptr.p_double[j]; sstate->sparsea.vals.ptr.p_double[k] = v; if( i==j ) { /* * Diagonal terms are counted only once */ sstate->absamax = ae_maxreal(sstate->absamax, v, _state); sstate->absasum = sstate->absasum+v; sstate->absasum2 = sstate->absasum2+v*v; } else { /* * Offdiagonal terms are counted twice */ sstate->absamax = ae_maxreal(sstate->absamax, v, _state); sstate->absasum = sstate->absasum+2*v; sstate->absasum2 = sstate->absasum2+2*v*v; } k = k+1; } } } if( akind==2 ) { /* * Dense QP problem - just copy and scale. */ rmatrixsetlengthatleast(&sstate->densea, nmain, nmain, _state); sstate->akind = 0; sstate->absamax = (double)(0); sstate->absasum = (double)(0); sstate->absasum2 = (double)(0); if( isupper ) { for(i=0; i<=nmain-1; i++) { for(j=i; j<=nmain-1; j++) { v = sc->ptr.p_double[i]*denseac->ptr.pp_double[i][j]*sc->ptr.p_double[j]; sstate->densea.ptr.pp_double[i][j] = v; sstate->densea.ptr.pp_double[j][i] = v; if( ae_fp_eq((double)(i),v) ) { vv = (double)(1); } else { vv = (double)(2); } sstate->absamax = ae_maxreal(sstate->absamax, v, _state); sstate->absasum = sstate->absasum+v*vv; sstate->absasum2 = sstate->absasum2+v*v*vv; } } } else { for(i=0; i<=nmain-1; i++) { for(j=0; j<=i; j++) { v = sc->ptr.p_double[i]*denseac->ptr.pp_double[i][j]*sc->ptr.p_double[j]; sstate->densea.ptr.pp_double[i][j] = v; sstate->densea.ptr.pp_double[j][i] = v; if( ae_fp_eq((double)(i),v) ) { vv = (double)(1); } else { vv = (double)(2); } sstate->absamax = ae_maxreal(sstate->absamax, v, _state); sstate->absasum = sstate->absasum+v*vv; sstate->absasum2 = sstate->absasum2+v*v*vv; } } } } ae_assert(sstate->akind>=0, "QQP: integrity check failed", _state); /* * Load box constraints into State structure. * * We apply transformation to variables: y=(x-x_origin)/s, * each of the constraints is appropriately shifted/scaled. 
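*
* For example, with S[i]=10, XOrigin[i]=5 and original bounds
* 5<=x[i]<=25 the transformed variable y[i]=(x[i]-5)/10 receives bounds
* 0<=y[i]<=2; infinite bounds remain infinite. Similarly, a linear row
* c'*x = r becomes (S*c)'*y = r-c'*x_origin before being normalized.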
*/ for(i=0; i<=nmain-1; i++) { sstate->havebndl.ptr.p_bool[i] = ae_isfinite(bndlc->ptr.p_double[i], _state); if( sstate->havebndl.ptr.p_bool[i] ) { sstate->bndl.ptr.p_double[i] = (bndlc->ptr.p_double[i]-xoriginc->ptr.p_double[i])/sc->ptr.p_double[i]; } else { ae_assert(ae_isneginf(bndlc->ptr.p_double[i], _state), "QQPOptimize: incorrect lower bound", _state); sstate->bndl.ptr.p_double[i] = _state->v_neginf; } sstate->havebndu.ptr.p_bool[i] = ae_isfinite(bnduc->ptr.p_double[i], _state); if( sstate->havebndu.ptr.p_bool[i] ) { sstate->bndu.ptr.p_double[i] = (bnduc->ptr.p_double[i]-xoriginc->ptr.p_double[i])/sc->ptr.p_double[i]; } else { ae_assert(ae_isposinf(bnduc->ptr.p_double[i], _state), "QQPOptimize: incorrect upper bound", _state); sstate->bndu.ptr.p_double[i] = _state->v_posinf; } } for(i=nmain; i<=n-1; i++) { sstate->havebndl.ptr.p_bool[i] = ae_true; sstate->bndl.ptr.p_double[i] = 0.0; sstate->havebndu.ptr.p_bool[i] = ae_false; sstate->bndu.ptr.p_double[i] = _state->v_posinf; } /* * Shift/scale linear constraints with transformation y=(x-x_origin)/s: * * constraint "c[i]'*x = b[i]" becomes "(s[i]*c[i])'*x = b[i]-c[i]'*x_origin". * * after constraint is loaded into SState.CLEIC, it is additionally normalized */ rmatrixsetlengthatleast(&sstate->cleic, nec+nic, n+1, _state); for(i=0; i<=nec+nic-1; i++) { v = (double)(0); vv = (double)(0); for(j=0; j<=nmain-1; j++) { sstate->cleic.ptr.pp_double[i][j] = cleicc->ptr.pp_double[i][j]*sc->ptr.p_double[j]; vv = vv+ae_sqr(sstate->cleic.ptr.pp_double[i][j], _state); v = v+cleicc->ptr.pp_double[i][j]*xoriginc->ptr.p_double[j]; } vv = ae_sqrt(vv, _state); for(j=nmain; j<=n-1; j++) { sstate->cleic.ptr.pp_double[i][j] = 0.0; } sstate->cleic.ptr.pp_double[i][n] = cleicc->ptr.pp_double[i][nmain]-v; if( i>=nec ) { sstate->cleic.ptr.pp_double[i][nmain+i-nec] = 1.0; } if( ae_fp_greater(vv,(double)(0)) ) { for(j=0; j<=n; j++) { sstate->cleic.ptr.pp_double[i][j] = sstate->cleic.ptr.pp_double[i][j]/vv; } } } /* * Process initial point: * * first NMain components are equal to XS-XOriginC * * last NIC components are deduced from linear constraints * * make sure that boundary constraints are preserved by transformation */ for(i=0; i<=nmain-1; i++) { sstate->xs.ptr.p_double[i] = (xs->ptr.p_double[i]-xoriginc->ptr.p_double[i])/sc->ptr.p_double[i]; if( sstate->havebndl.ptr.p_bool[i]&&ae_fp_less(sstate->xs.ptr.p_double[i],sstate->bndl.ptr.p_double[i]) ) { sstate->xs.ptr.p_double[i] = sstate->bndl.ptr.p_double[i]; } if( sstate->havebndu.ptr.p_bool[i]&&ae_fp_greater(sstate->xs.ptr.p_double[i],sstate->bndu.ptr.p_double[i]) ) { sstate->xs.ptr.p_double[i] = sstate->bndu.ptr.p_double[i]; } if( sstate->havebndl.ptr.p_bool[i]&&ae_fp_eq(xs->ptr.p_double[i],bndlc->ptr.p_double[i]) ) { sstate->xs.ptr.p_double[i] = sstate->bndl.ptr.p_double[i]; } if( sstate->havebndu.ptr.p_bool[i]&&ae_fp_eq(xs->ptr.p_double[i],bnduc->ptr.p_double[i]) ) { sstate->xs.ptr.p_double[i] = sstate->bndu.ptr.p_double[i]; } } for(i=0; i<=nic-1; i++) { v = ae_v_dotproduct(&sstate->xs.ptr.p_double[0], 1, &sstate->cleic.ptr.pp_double[nec+i][0], 1, ae_v_len(0,nmain-1)); sstate->xs.ptr.p_double[nmain+i] = ae_maxreal(sstate->cleic.ptr.pp_double[nec+i][n]-v, 0.0, _state); } /* * Select sparse direct solver */ if( akind==1 ) { sparsesolver = settings->sparsesolver; if( sparsesolver==0 ) { sparsesolver = 1; } if( sparseissks(&sstate->sparsea, _state) ) { sparsesolver = 2; } sparsesolver = 2; ae_assert(sparsesolver==1||sparsesolver==2, "QQPOptimize: incorrect SparseSolver", _state); } else { sparsesolver = 0; } /* 
* For unconstrained problems - try to use fast approach which requires * just one unregularized Cholesky decomposition for solution. If it fails, * switch to general QQP code. */ problemsolved = ae_false; isconstrained = nec+nic>0; for(i=0; i<=n-1; i++) { isconstrained = (isconstrained||sstate->havebndl.ptr.p_bool[i])||sstate->havebndu.ptr.p_bool[i]; } if( (!isconstrained&&settings->cnphase)&&akind==0 ) { ae_assert(nmain==n, "QQP: integrity check failed", _state); rmatrixsetlengthatleast(&sstate->densez, n, n, _state); rvectorsetlengthatleast(&sstate->tmpcn, n, _state); for(i=0; i<=n-1; i++) { for(j=i; j<=n-1; j++) { sstate->densez.ptr.pp_double[i][j] = sstate->densea.ptr.pp_double[i][j]; } } inc(&sstate->repncholesky, _state); if( spdmatrixcholeskyrec(&sstate->densez, 0, n, ae_true, &sstate->tmpcn, _state) ) { ae_v_move(&sstate->xf.ptr.p_double[0], 1, &sstate->xs.ptr.p_double[0], 1, ae_v_len(0,n-1)); for(i=0; i<=n-1; i++) { sstate->dc.ptr.p_double[i] = (double)(0); } f0 = qqpsolver_projectedtargetfunction(sstate, &sstate->xf, &sstate->dc, 0.0, &sstate->tmpcn, _state); for(k=0; k<=3; k++) { rmatrixmv(n, n, &sstate->densea, 0, 0, 0, &sstate->xf, 0, &sstate->gc, 0, _state); ae_v_add(&sstate->gc.ptr.p_double[0], 1, &sstate->b.ptr.p_double[0], 1, ae_v_len(0,n-1)); for(i=0; i<=n-1; i++) { sstate->dc.ptr.p_double[i] = -sstate->gc.ptr.p_double[i]; } fblscholeskysolve(&sstate->densez, 1.0, n, ae_true, &sstate->dc, &sstate->tmpcn, _state); f1 = qqpsolver_projectedtargetfunction(sstate, &sstate->xf, &sstate->dc, 1.0, &sstate->tmpcn, _state); if( ae_fp_greater_eq(f1,f0) ) { break; } ae_v_add(&sstate->xf.ptr.p_double[0], 1, &sstate->dc.ptr.p_double[0], 1, ae_v_len(0,n-1)); f0 = f1; } *terminationtype = 2; problemsolved = ae_true; } } /* * Attempt to solve problem with fast approach failed, use generic QQP */ if( !problemsolved ) { /* * Prepare "active set" structure */ sassetbc(&sstate->sas, &sstate->bndl, &sstate->bndu, _state); sassetlcx(&sstate->sas, &sstate->cleic, 0, 0, _state); if( !sasstartoptimization(&sstate->sas, &sstate->xs, _state) ) { *terminationtype = -3; return; } /* * Main loop. * * Following variables are used: * * GC stores current gradient (unconstrained) * * CGC stores current gradient (constrained) * * DC stores current search direction * * CGP stores constrained gradient at previous point * (zero on initial entry) * * DP stores previous search direction * (zero on initial entry) */ cgmax = settings->cgminits; sstate->repinneriterationscount = 0; sstate->repouteriterationscount = 0; for(;;) { if( settings->maxouterits>0&&sstate->repouteriterationscount>=settings->maxouterits ) { *terminationtype = 5; break; } if( sstate->repouteriterationscount>0 ) { /* * Check EpsF- and EpsX-based stopping criteria. * Because problem was already scaled, we do not scale step before checking its length. * NOTE: these checks are performed only after at least one outer iteration was made. */ if( ae_fp_greater(settings->epsf,(double)(0)) ) { /* * NOTE 1: here we rely on the fact that ProjectedTargetFunction() ignore D when Stp=0 * NOTE 2: code below handles situation when update increases function value instead * of decreasing it. 
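*
* For example, the EpsF test used below stops the solver as soon as
*     F(prev)-F(cur) <= EpsF*max(|F(prev)|,|F(cur)|,1.0)
* so with EpsF=1.0E-6, F(prev)=100.000001 and F(cur)=100.0 the progress of
* 1.0E-6 does not exceed 1.0E-6*100.000001 and TerminationType is set to 1;
* the EpsX test compares the length of the (already scaled) step with EpsX
* and sets TerminationType=2.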
*/ fprev = qqpsolver_projectedtargetfunction(sstate, &sstate->xp, &sstate->dc, 0.0, &sstate->tmp0, _state); fcur = qqpsolver_projectedtargetfunction(sstate, &sstate->sas.xc, &sstate->dc, 0.0, &sstate->tmp0, _state); if( ae_fp_less_eq(fprev-fcur,settings->epsf*ae_maxreal(ae_fabs(fprev, _state), ae_maxreal(ae_fabs(fcur, _state), 1.0, _state), _state)) ) { *terminationtype = 1; break; } } if( ae_fp_greater(settings->epsx,(double)(0)) ) { v = 0.0; for(i=0; i<=n-1; i++) { v = v+ae_sqr(sstate->xp.ptr.p_double[i]-sstate->sas.xc.ptr.p_double[i], _state); } if( ae_fp_less_eq(ae_sqrt(v, _state),settings->epsx) ) { *terminationtype = 2; break; } } } inc(&sstate->repouteriterationscount, _state); ae_v_move(&sstate->xp.ptr.p_double[0], 1, &sstate->sas.xc.ptr.p_double[0], 1, ae_v_len(0,n-1)); if( !settings->cgphase ) { cgmax = 0; } for(i=0; i<=n-1; i++) { sstate->cgp.ptr.p_double[i] = 0.0; sstate->dp.ptr.p_double[i] = 0.0; } for(cgcnt=0; cgcnt<=cgmax-1; cgcnt++) { /* * Calculate unconstrained gradient GC for "extended" QP problem * Determine active set, current constrained gradient CGC. * Check gradient-based stopping condition. * * NOTE: because problem was scaled, we do not have to apply scaling * to gradient before checking stopping condition. */ qqpsolver_targetgradient(sstate, &sstate->sas.xc, &sstate->gc, _state); sasreactivateconstraints(&sstate->sas, &sstate->gc, _state); ae_v_move(&sstate->cgc.ptr.p_double[0], 1, &sstate->gc.ptr.p_double[0], 1, ae_v_len(0,n-1)); sasconstraineddirection(&sstate->sas, &sstate->cgc, _state); v = ae_v_dotproduct(&sstate->cgc.ptr.p_double[0], 1, &sstate->cgc.ptr.p_double[0], 1, ae_v_len(0,n-1)); if( ae_fp_less_eq(ae_sqrt(v, _state),settings->epsg) ) { *terminationtype = 4; break; } /* * Prepare search direction DC and explore it. * * We try to use CGP/DP to prepare conjugate gradient step, * but we resort to steepest descent step (Beta=0) in case * we are at I-th boundary, but DP[I]<>0. * * Such approach allows us to ALWAYS have feasible DC, with * guaranteed compatibility with both feasible area and current * active set. * * Automatic CG reset performed every time DP is incompatible * with current active set and/or feasible area. We also * perform reset every QuickQPRestartCG iterations. */ ae_v_moveneg(&sstate->dc.ptr.p_double[0], 1, &sstate->cgc.ptr.p_double[0], 1, ae_v_len(0,n-1)); v = 0.0; vv = 0.0; b = ae_false; for(i=0; i<=n-1; i++) { v = v+sstate->cgc.ptr.p_double[i]*sstate->cgc.ptr.p_double[i]; vv = vv+sstate->cgp.ptr.p_double[i]*sstate->cgp.ptr.p_double[i]; b = b||((sstate->havebndl.ptr.p_bool[i]&&ae_fp_eq(sstate->sas.xc.ptr.p_double[i],sstate->bndl.ptr.p_double[i]))&&ae_fp_neq(sstate->dp.ptr.p_double[i],(double)(0))); b = b||((sstate->havebndu.ptr.p_bool[i]&&ae_fp_eq(sstate->sas.xc.ptr.p_double[i],sstate->bndu.ptr.p_double[i]))&&ae_fp_neq(sstate->dp.ptr.p_double[i],(double)(0))); } b = b||ae_fp_eq(vv,(double)(0)); b = b||cgcnt%qqpsolver_quickqprestartcg==0; if( !b ) { beta = v/vv; } else { beta = 0.0; } ae_v_addd(&sstate->dc.ptr.p_double[0], 1, &sstate->dp.ptr.p_double[0], 1, ae_v_len(0,n-1), beta); sasconstraineddirection(&sstate->sas, &sstate->dc, _state); sasexploredirection(&sstate->sas, &sstate->dc, &stpmax, &cidx, &cval, _state); /* * Build quadratic model of F along descent direction: * * F(xc+alpha*D) = D2*alpha^2 + D1*alpha * * Terminate algorithm if needed. 
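*
* For a convex model (D2>0) the minimizer of D2*alpha^2+D1*alpha is
*     FullStp = -D1/(2*D2)
* e.g. D1=-4, D2=1 gives a full step of 2 along DC, which is then compared
* with the maximum feasible step StpMax to decide whether a constraint has
* to be activated (NeedAct).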
* * NOTE: we do not maintain constant term D0 */ qqpsolver_quadraticmodel(sstate, &sstate->sas.xc, &sstate->dc, &sstate->gc, &d1, &d1est, &d2, &d2est, _state); if( ae_fp_eq(d1,(double)(0))&&ae_fp_eq(d2,(double)(0)) ) { /* * D1 and D2 are exactly zero, success. * After this if-then we assume that D is non-zero. */ *terminationtype = 4; break; } if( d1est>=0 ) { /* * Numerical noise is too large, it means that we are close * to minimum - and that further improvement is impossible. * * After this if-then we assume that D1 is definitely negative * (even under presence of numerical errors). */ *terminationtype = 7; break; } if( d2est<=0&&cidx<0 ) { /* * Function is unbounded from below: * * D1<0 (verified by previous block) * * D2Est<=0, which means that either D2<0 - or it can not * be reliably distinguished from zero. * * step is unconstrained * * If these conditions are true, we abnormally terminate QP * algorithm with return code -4 */ *terminationtype = -4; break; } /* * Perform step along DC. * * In this block of code we maintain two step length: * * RestStp - restricted step, maximum step length along DC which does * not violate constraints * * FullStp - step length along DC which minimizes quadratic function * without taking constraints into account. If problem is * unbounded from below without constraints, FullStp is * forced to be RestStp. * * So, if function is convex (D2>0): * * FullStp = -D1/(2*D2) * * RestStp = restricted FullStp * * 0<=RestStp<=FullStp * * If function is non-convex, but bounded from below under constraints: * * RestStp = step length subject to constraints * * FullStp = RestStp * * After RestStp and FullStp are initialized, we generate several trial * steps which are different multiples of RestStp and FullStp. */ if( d2est>0 ) { ae_assert(ae_fp_less(d1,(double)(0)), "QQPOptimize: internal error", _state); fullstp = -d1/(2*d2); needact = ae_fp_greater_eq(fullstp,stpmax); if( needact ) { ae_assert(sstate->stpbuf.cnt>=3, "QQPOptimize: StpBuf overflow", _state); reststp = stpmax; stp = reststp; sstate->stpbuf.ptr.p_double[0] = reststp*4; sstate->stpbuf.ptr.p_double[1] = fullstp; sstate->stpbuf.ptr.p_double[2] = fullstp/4; stpcnt = 3; } else { reststp = fullstp; stp = fullstp; stpcnt = 0; } } else { ae_assert(cidx>=0, "QQPOptimize: internal error", _state); ae_assert(sstate->stpbuf.cnt>=2, "QQPOptimize: StpBuf overflow", _state); reststp = stpmax; fullstp = stpmax; stp = reststp; needact = ae_true; sstate->stpbuf.ptr.p_double[0] = 4*reststp; stpcnt = 1; } qqpsolver_findbeststepandmove(sstate, &sstate->sas, &sstate->dc, stp, needact, cidx, cval, &sstate->stpbuf, stpcnt, &sstate->activated, &sstate->tmp0, _state); /* * Update CG information. */ ae_v_move(&sstate->dp.ptr.p_double[0], 1, &sstate->dc.ptr.p_double[0], 1, ae_v_len(0,n-1)); ae_v_move(&sstate->cgp.ptr.p_double[0], 1, &sstate->cgc.ptr.p_double[0], 1, ae_v_len(0,n-1)); /* * Update iterations counter */ sstate->repinneriterationscount = sstate->repinneriterationscount+1; } if( *terminationtype!=0 ) { break; } cgmax = settings->cgmaxits; /* * Generate YIdx - reordering of variables for constrained Newton phase. * Free variables come first, fixed are last ones. */ newtcnt = 0; for(;;) { /* * Skip iteration if constrained Newton is turned off. */ if( !settings->cnphase ) { break; } /* * At the first iteration - build Cholesky decomposition of Hessian. * At subsequent iterations - refine Hessian by adding new constraints. 
* * Loop is terminated in following cases: * * Hessian is not positive definite subject to current constraints * (termination during initial decomposition) * * there were no new constraints being activated * (termination during update) * * all constraints were activated during last step * (termination during update) * * CNMaxUpdates were performed on matrix * (termination during update) */ if( newtcnt==0 ) { /* * Perform initial Newton step. If Cholesky decomposition fails, * increase number of CG iterations to CGMaxIts - it should help * us to find set of constraints which will make matrix positive * definite. */ b = qqpsolver_cnewtonbuild(sstate, sparsesolver, &sstate->repncholesky, _state); if( b ) { cgmax = settings->cgminits; } } else { b = qqpsolver_cnewtonupdate(sstate, settings, &sstate->repncupdates, _state); } if( !b ) { break; } inc(&newtcnt, _state); /* * Calculate gradient GC. */ qqpsolver_targetgradient(sstate, &sstate->sas.xc, &sstate->gc, _state); /* * Bound-constrained Newton step */ for(i=0; i<=n-1; i++) { sstate->dc.ptr.p_double[i] = sstate->gc.ptr.p_double[i]; } if( !qqpsolver_cnewtonstep(sstate, settings, &sstate->dc, _state) ) { break; } qqpsolver_quadraticmodel(sstate, &sstate->sas.xc, &sstate->dc, &sstate->gc, &d1, &d1est, &d2, &d2est, _state); if( d1est>=0 ) { /* * We are close to minimum, derivative is nearly zero, break Newton iteration */ break; } if( d2est>0 ) { /* * Positive definite matrix, we can perform Newton step */ ae_assert(ae_fp_less(d1,(double)(0)), "QQPOptimize: internal error", _state); fullstp = -d1/(2*d2); sasexploredirection(&sstate->sas, &sstate->dc, &stpmax, &cidx, &cval, _state); needact = ae_fp_greater_eq(fullstp,stpmax); if( needact ) { ae_assert(sstate->stpbuf.cnt>=3, "QQPOptimize: StpBuf overflow", _state); reststp = stpmax; stp = reststp; sstate->stpbuf.ptr.p_double[0] = reststp*4; sstate->stpbuf.ptr.p_double[1] = fullstp; sstate->stpbuf.ptr.p_double[2] = fullstp/4; stpcnt = 3; } else { reststp = fullstp; stp = fullstp; stpcnt = 0; } qqpsolver_findbeststepandmove(sstate, &sstate->sas, &sstate->dc, stp, needact, cidx, cval, &sstate->stpbuf, stpcnt, &sstate->activated, &sstate->tmp0, _state); } else { /* * Matrix is semi-definite or indefinite, but regularized * Cholesky succeeded and gave us descent direction in DC. * * We will investigate it and try to perform descent step: * * first, we explore direction: * * if it is unbounded, we stop algorithm with * appropriate termination code -4. * * if StpMax=0, we break Newton phase and return to * CG phase - constraint geometry is complicated near * current point, so it is better to use simpler algo. * * second, we check that bounded step decreases function; * if not, we again skip to CG phase * * finally, we use FindBestStep...() function to choose * between bounded step and projection of full-length step * (latter may give additional decrease in */ sasexploredirection(&sstate->sas, &sstate->dc, &stpmax, &cidx, &cval, _state); if( cidx<0 ) { /* * Function is unbounded from below: * * D1<0 (verified by previous block) * * D2Est<=0, which means that either D2<0 - or it can not * be reliably distinguished from zero. * * step is unconstrained * * If these conditions are true, we abnormally terminate QP * algorithm with return code -4 */ *terminationtype = -4; break; } if( ae_fp_eq(stpmax,(double)(0)) ) { /* * Resort to CG phase. * Increase number of CG iterations. 
*/ cgmax = settings->cgmaxits; break; } ae_assert(ae_fp_greater(stpmax,(double)(0)), "QQPOptimize: internal error", _state); f0 = qqpsolver_projectedtargetfunction(sstate, &sstate->sas.xc, &sstate->dc, 0.0, &sstate->tmp0, _state); f1 = qqpsolver_projectedtargetfunction(sstate, &sstate->sas.xc, &sstate->dc, stpmax, &sstate->tmp0, _state); if( ae_fp_greater_eq(f1,f0) ) { /* * Descent direction does not actually decrease function value. * Resort to CG phase * Increase number of CG iterations. */ cgmax = settings->cgmaxits; break; } ae_assert(sstate->stpbuf.cnt>=3, "QQPOptimize: StpBuf overflow", _state); reststp = stpmax; stp = reststp; sstate->stpbuf.ptr.p_double[0] = reststp*4; sstate->stpbuf.ptr.p_double[1] = 1.00; sstate->stpbuf.ptr.p_double[2] = 0.25; stpcnt = 3; qqpsolver_findbeststepandmove(sstate, &sstate->sas, &sstate->dc, stp, ae_true, cidx, cval, &sstate->stpbuf, stpcnt, &sstate->activated, &sstate->tmp0, _state); } } if( *terminationtype!=0 ) { break; } } sasstopoptimization(&sstate->sas, _state); ae_v_move(&sstate->xf.ptr.p_double[0], 1, &sstate->sas.xc.ptr.p_double[0], 1, ae_v_len(0,n-1)); } /* * Stop optimization and unpack results. * * Add XOriginC to XS and make sure that boundary constraints are * both (a) satisfied, (b) preserved. Former means that "shifted" * point is feasible, while latter means that point which was exactly * at the boundary before shift will be exactly at the boundary * after shift. */ for(i=0; i<=nmain-1; i++) { xs->ptr.p_double[i] = sc->ptr.p_double[i]*sstate->xf.ptr.p_double[i]+xoriginc->ptr.p_double[i]; if( sstate->havebndl.ptr.p_bool[i]&&ae_fp_less(xs->ptr.p_double[i],bndlc->ptr.p_double[i]) ) { xs->ptr.p_double[i] = bndlc->ptr.p_double[i]; } if( sstate->havebndu.ptr.p_bool[i]&&ae_fp_greater(xs->ptr.p_double[i],bnduc->ptr.p_double[i]) ) { xs->ptr.p_double[i] = bnduc->ptr.p_double[i]; } if( sstate->havebndl.ptr.p_bool[i]&&ae_fp_eq(sstate->xf.ptr.p_double[i],sstate->bndl.ptr.p_double[i]) ) { xs->ptr.p_double[i] = bndlc->ptr.p_double[i]; } if( sstate->havebndu.ptr.p_bool[i]&&ae_fp_eq(sstate->xf.ptr.p_double[i],sstate->bndu.ptr.p_double[i]) ) { xs->ptr.p_double[i] = bnduc->ptr.p_double[i]; } } } /************************************************************************* Target function at point PROJ(X+Stp*D), where PROJ(.) is a projection into feasible set. NOTE: if Stp=0, D is not referenced at all. Thus, there is no need to fill it by some meaningful values for Stp=0. This subroutine uses temporary buffer Tmp, which is automatically resized if needed. 
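A minimal one-variable sketch of the same computation (illustration only, not part of ALGLIB):

    #include <algorithm>

    // value of 0.5*a*x^2 + b*x + 0.5*penalty*(c*x-r)^2 at the point
    // clipped into [bndl,bndu], mirroring the projection performed
    // by this subroutine
    double toy_projected_target(double x, double a, double b,
                                double c, double r, double penalty,
                                double bndl, double bndu)
    {
        double xp = std::min(std::max(x, bndl), bndu); // projection onto the box
        double viol = c*xp - r;                        // linear constraint residual
        return 0.5*a*xp*xp + b*xp + 0.5*penalty*viol*viol;
    }

For instance, with a=2, b=-1, a single constraint x=1 (c=1, r=1) and an upper bound of 1.5, the value requested at x=2 is evaluated at the clipped point 1.5: 0.5*2*1.5^2 - 1.5 + 0.5*penalty*0.25 = 0.75 + 0.125*penalty.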
-- ALGLIB -- Copyright 21.12.2013 by Bochkanov Sergey *************************************************************************/ static double qqpsolver_projectedtargetfunction(qqpbuffers* sstate, /* Real */ ae_vector* x, /* Real */ ae_vector* d, double stp, /* Real */ ae_vector* tmp0, ae_state *_state) { ae_int_t nec; ae_int_t nic; ae_int_t n; ae_int_t nmain; ae_int_t i; ae_int_t j; double v; double vv; double result; n = sstate->n; nmain = sstate->nmain; nec = sstate->nec; nic = sstate->nic; rvectorsetlengthatleast(tmp0, n, _state); /* * Calculate projected point */ for(i=0; i<=n-1; i++) { if( ae_fp_neq(stp,(double)(0)) ) { v = x->ptr.p_double[i]+stp*d->ptr.p_double[i]; } else { v = x->ptr.p_double[i]; } if( sstate->havebndl.ptr.p_bool[i]&&ae_fp_less(v,sstate->bndl.ptr.p_double[i]) ) { v = sstate->bndl.ptr.p_double[i]; } if( sstate->havebndu.ptr.p_bool[i]&&ae_fp_greater(v,sstate->bndu.ptr.p_double[i]) ) { v = sstate->bndu.ptr.p_double[i]; } tmp0->ptr.p_double[i] = v; } /* * Function value at the Tmp0: * * f(x) = 0.5*x'*A*x + b'*x + penaltyfactor*0.5*(C*x-b)'*(C*x-b) */ result = 0.0; for(i=0; i<=nmain-1; i++) { result = result+sstate->b.ptr.p_double[i]*tmp0->ptr.p_double[i]; } if( sstate->akind==0 ) { /* * Dense matrix A */ for(i=0; i<=nmain-1; i++) { v = tmp0->ptr.p_double[i]; result = result+0.5*v*v*sstate->densea.ptr.pp_double[i][i]; vv = 0.0; for(j=i+1; j<=nmain-1; j++) { vv = vv+sstate->densea.ptr.pp_double[i][j]*tmp0->ptr.p_double[j]; } result = result+v*vv; } } else { /* * sparse matrix A */ ae_assert(sstate->akind==1, "QQPOptimize: unexpected AKind in ProjectedTargetFunction", _state); result = result+0.5*sparsevsmv(&sstate->sparsea, sstate->sparseupper, tmp0, _state); } for(i=0; i<=nec+nic-1; i++) { v = ae_v_dotproduct(&sstate->cleic.ptr.pp_double[i][0], 1, &tmp0->ptr.p_double[0], 1, ae_v_len(0,n-1)); result = result+0.5*qqpsolver_penaltyfactor*ae_sqr(v-sstate->cleic.ptr.pp_double[i][n], _state); } return result; } /************************************************************************* Gradient of "extended" (N>=NMain variables, original + slack ones) target function: f(x) = 0.5*x'*A*x + b'*x + penaltyfactor*0.5*(C*x-b)'*(C*x-b) which is equal to grad = A*x + b + penaltyfactor*C'*(C*x-b) Here: * x is array[N] (that's why problem is called extended - N>=NMain) * A is array[NMain,NMain] * b is array[NMain] * C is array[NEC+NIC,N] INPUT PARAMETERS: SState - structure which stores function terms (not modified) X - location G - possibly preallocated buffer OUTPUT PARAMETERS: G - array[N], gradient -- ALGLIB -- Copyright 21.12.2013 by Bochkanov Sergey *************************************************************************/ static void qqpsolver_targetgradient(qqpbuffers* sstate, /* Real */ ae_vector* x, /* Real */ ae_vector* g, ae_state *_state) { ae_int_t nec; ae_int_t nic; ae_int_t n; ae_int_t nmain; ae_int_t i; double v; n = sstate->n; nmain = sstate->nmain; nec = sstate->nec; nic = sstate->nic; rvectorsetlengthatleast(g, n, _state); if( sstate->akind==0 ) { /* * Dense matrix A */ rmatrixmv(nmain, nmain, &sstate->densea, 0, 0, 0, x, 0, g, 0, _state); } else { /* * Sparse matrix A */ ae_assert(sstate->akind==1, "QQPOptimize: unexpected AKind in TargetGradient", _state); sparsesmv(&sstate->sparsea, sstate->sparseupper, x, g, _state); } ae_v_add(&g->ptr.p_double[0], 1, &sstate->b.ptr.p_double[0], 1, ae_v_len(0,nmain-1)); for(i=nmain; i<=n-1; i++) { g->ptr.p_double[i] = 0.0; } for(i=0; i<=nec+nic-1; i++) { v = ae_v_dotproduct(&sstate->cleic.ptr.pp_double[i][0], 1, 
&x->ptr.p_double[0], 1, ae_v_len(0,n-1)); v = v-sstate->cleic.ptr.pp_double[i][n]; v = v*qqpsolver_penaltyfactor; ae_v_addd(&g->ptr.p_double[0], 1, &sstate->cleic.ptr.pp_double[i][0], 1, ae_v_len(0,n-1), v); } } /************************************************************************* First and second derivatives of the "extended" target function along specified direction. Target function is called "extended" because of additional slack variables and has form: f(x) = 0.5*x'*A*x + b'*x + penaltyfactor*0.5*(C*x-b)'*(C*x-b) with gradient grad = A*x + b + penaltyfactor*C'*(C*x-b) Quadratic model has form F(x0+alpha*D) = D2*alpha^2 + D1*alpha INPUT PARAMETERS: SState - structure which is used to obtain quadratic term of the model X - current point, array[N] D - direction across which derivatives are calculated, array[N] G - gradient at current point (pre-calculated by caller), array[N] OUTPUT PARAMETERS: D1 - linear coefficient D1Est - estimate of D1 sign, accounting for possible numerical errors: * >0 means "almost surely positive" * <0 means "almost surely negative" * =0 means "pessimistic estimate of numerical errors in D1 is larger than magnitude of D1 itself; it is impossible to reliably distinguish D1 from zero". D2 - quadratic coefficient D2Est - estimate of D2 sign, accounting for possible numerical errors: * >0 means "almost surely positive" * <0 means "almost surely negative" * =0 means "pessimistic estimate of numerical errors in D2 is larger than magnitude of D2 itself; it is impossible to reliably distinguish D2 from zero". -- ALGLIB -- Copyright 14.05.2014 by Bochkanov Sergey *************************************************************************/ static void qqpsolver_quadraticmodel(qqpbuffers* sstate, /* Real */ ae_vector* x, /* Real */ ae_vector* d, /* Real */ ae_vector* g, double* d1, ae_int_t* d1est, double* d2, ae_int_t* d2est, ae_state *_state) { ae_int_t nec; ae_int_t nic; ae_int_t n; ae_int_t nmain; ae_int_t i; ae_int_t j; double v; double vv; double mx; double mb; double md; *d1 = 0; *d1est = 0; *d2 = 0; *d2est = 0; n = sstate->n; nmain = sstate->nmain; nec = sstate->nec; nic = sstate->nic; /* * Maximums */ mx = 0.0; md = 0.0; mb = 0.0; for(i=0; i<=n-1; i++) { mx = ae_maxreal(mx, ae_fabs(x->ptr.p_double[i], _state), _state); md = ae_maxreal(md, ae_fabs(d->ptr.p_double[i], _state), _state); } for(i=0; i<=nmain-1; i++) { mb = ae_maxreal(mb, ae_fabs(sstate->b.ptr.p_double[i], _state), _state); } /* * D2 */ if( sstate->akind==0 ) { /* * Dense matrix A */ *d2 = 0.0; for(i=0; i<=nmain-1; i++) { v = d->ptr.p_double[i]; vv = 0.0; *d2 = *d2+0.5*v*v*sstate->densea.ptr.pp_double[i][i]; for(j=i+1; j<=nmain-1; j++) { vv = vv+sstate->densea.ptr.pp_double[i][j]*d->ptr.p_double[j]; } *d2 = *d2+v*vv; } } else { /* * Sparse matrix A */ ae_assert(sstate->akind==1, "QQPOptimize: unexpected AKind in TargetGradient", _state); *d2 = 0.5*sparsevsmv(&sstate->sparsea, sstate->sparseupper, d, _state); } for(i=0; i<=nec+nic-1; i++) { /* * NOTE: there is no "V:=V-CLEIC[I,N]" line, and it is not an error! * We estimate curvature information here, which is not dependent * on right part. 
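*
* Hypothetical illustration: for a single constraint row C=[1,1] and
* direction D=[1,-1] the product C*D is 0, so the row adds nothing to D2;
* for D=[1,1] it adds 0.5*PenaltyFactor*2^2. The dense/sparse branches
* above already contributed 0.5*D'*A*D.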
*/ v = ae_v_dotproduct(&sstate->cleic.ptr.pp_double[i][0], 1, &d->ptr.p_double[0], 1, ae_v_len(0,n-1)); *d2 = *d2+v*v*qqpsolver_penaltyfactor*0.5; } v = ae_v_dotproduct(&d->ptr.p_double[0], 1, &g->ptr.p_double[0], 1, ae_v_len(0,n-1)); *d1 = v; /* * Error estimates */ estimateparabolicmodel(sstate->absasum, sstate->absasum2, mx, mb, md, *d1, *d2, d1est, d2est, _state); } /************************************************************************* This function accepts quadratic model of the form f(x) = 0.5*x'*A*x + b'*x + penaltyfactor*0.5*(C*x-b)'*(C*x-b) and list of possible steps along direction D. It chooses best step (one which achieves minimum value of the target function) and moves current point (given by SAS object) to the new location. Step is bounded subject to boundary constraints. Candidate steps are divided into two groups: * "default" step, which is always performed when no candidate steps LONGER THAN THE DEFAULT ONE is given. This candidate MUST reduce target function value; it is responsibility of caller to provide default candidate which reduces target function. * "additional candidates", which may be shorter or longer than the default step. Candidates which are shorter that the default step are ignored; candidates which are longer than the "default" step are tested. The idea is that we ALWAYS try "default" step, and it is responsibility of the caller to provide us with something which is worth trying. This step may activate some constraint - that's why we stopped at "default" step size. However, we may also try longer steps which may activate additional constraints and further reduce function value. INPUT PARAMETERS: SState - structure which stores model SAS - active set structure which stores current point in SAS.XC D - direction for step Stp - step length for "default" candidate NeedAct - whether default candidate activates some constraint; if NeedAct is True, constraint given by CIdc/CVal is GUARANTEED to be activated in the final point. CIdx - if NeedAct is True, stores index of the constraint to activate CVal - if NeedAct is True, stores constrained value; SAS.XC[CIdx] is forced to be equal to CVal. AddSteps- array[AddStepsCnt] of additional steps: * AddSteps[]<=Stp are ignored * AddSteps[]>Stp are tried Activated- possibly preallocated buffer; previously allocated memory will be reused. Tmp0 - possibly preallocated buffer; previously allocated memory will be reused. OUTPUT PARAMETERS: SAS - SAS.XC is set to new point; if there was a constraint specified by NeedAct/CIdx/CVal, it will be activated (other constraints may be activated too, but this one is guaranteed to be active in the final point). Activated- elements of this array are set to True, if I-th constraint as inactive at previous point, but become active in the new one. 
Situations when we deactivate xi>=0 and activate xi<=1 are considered as activation of previously inactive constraint -- ALGLIB -- Copyright 14.05.2014 by Bochkanov Sergey *************************************************************************/ static void qqpsolver_findbeststepandmove(qqpbuffers* sstate, sactiveset* sas, /* Real */ ae_vector* d, double stp, ae_bool needact, ae_int_t cidx, double cval, /* Real */ ae_vector* addsteps, ae_int_t addstepscnt, /* Boolean */ ae_vector* activated, /* Real */ ae_vector* tmp0, ae_state *_state) { ae_int_t n; ae_int_t i; ae_int_t k; double v; double stpbest; double fbest; double fcand; n = sstate->n; rvectorsetlengthatleast(tmp0, n, _state); bvectorsetlengthatleast(activated, n, _state); /* * Calculate initial step, store to Tmp0 * * NOTE: Tmp0 is guaranteed to be feasible w.r.t. boundary constraints */ for(i=0; i<=n-1; i++) { v = sas->xc.ptr.p_double[i]+stp*d->ptr.p_double[i]; if( sstate->havebndl.ptr.p_bool[i]&&ae_fp_less(v,sstate->bndl.ptr.p_double[i]) ) { v = sstate->bndl.ptr.p_double[i]; } if( sstate->havebndu.ptr.p_bool[i]&&ae_fp_greater(v,sstate->bndu.ptr.p_double[i]) ) { v = sstate->bndu.ptr.p_double[i]; } tmp0->ptr.p_double[i] = v; } if( needact ) { tmp0->ptr.p_double[cidx] = cval; } /* * Try additional steps, if AddStepsCnt>0 */ if( addstepscnt>0 ) { /* * Find best step */ stpbest = stp; fbest = qqpsolver_projectedtargetfunction(sstate, &sas->xc, d, stpbest, tmp0, _state); for(k=0; k<=addstepscnt-1; k++) { if( ae_fp_greater(addsteps->ptr.p_double[k],stp) ) { fcand = qqpsolver_projectedtargetfunction(sstate, &sas->xc, d, addsteps->ptr.p_double[k], tmp0, _state); if( ae_fp_less(fcand,fbest) ) { fbest = fcand; stpbest = addsteps->ptr.p_double[k]; } } } /* * Prepare best step * * NOTE: because only AddSteps[]>Stp were checked, * this step will activate constraint CIdx. */ for(i=0; i<=n-1; i++) { v = sas->xc.ptr.p_double[i]+stpbest*d->ptr.p_double[i]; if( sstate->havebndl.ptr.p_bool[i]&&ae_fp_less(v,sstate->bndl.ptr.p_double[i]) ) { v = sstate->bndl.ptr.p_double[i]; } if( sstate->havebndu.ptr.p_bool[i]&&ae_fp_greater(v,sstate->bndu.ptr.p_double[i]) ) { v = sstate->bndu.ptr.p_double[i]; } tmp0->ptr.p_double[i] = v; } if( needact ) { tmp0->ptr.p_double[cidx] = cval; } } /* * Fill Activated array by information about activated constraints. * Perform step */ for(i=0; i<=n-1; i++) { activated->ptr.p_bool[i] = ae_false; v = tmp0->ptr.p_double[i]; if( ae_fp_eq(v,sas->xc.ptr.p_double[i]) ) { continue; } if( sstate->havebndl.ptr.p_bool[i]&&ae_fp_eq(v,sstate->bndl.ptr.p_double[i]) ) { activated->ptr.p_bool[i] = ae_true; } if( sstate->havebndu.ptr.p_bool[i]&&ae_fp_eq(v,sstate->bndu.ptr.p_double[i]) ) { activated->ptr.p_bool[i] = ae_true; } } sasmoveto(sas, tmp0, needact, cidx, cval, _state); } /************************************************************************* This function prepares data for constrained Newton step for penalized quadratic model of the form f(x) = 0.5*x'*A*x + b'*x + penaltyfactor*0.5*(C*x-b)'*(C*x-b) where A can be dense or sparse, and model is considered subject to equality constraints specified by SState.SAS.XC object. Constraint is considered active if XC[i] is exactly BndL[i] or BndU[i], i.e. we ignore internal list of constraints monitored by SAS object. Our own set of constraints includes all constraints stored by SAS, but also may include some constraints which are inactive in SAS. "Preparation" means that Cholesky decomposition of the effective system matrix is performed, and we can perform constrained Newton step. 
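As a rough usage sketch (it mirrors the Newton phase of QQPOptimize above):
CNewtonBuild() is called once to factor the current equality-constrained
Hessian; each Newton iteration then copies the gradient into a buffer and
calls CNewtonStep() to turn it into a step; when the step activates new box
constraints, CNewtonUpdate() patches the existing Cholesky factor instead of
refactoring from scratch; a False result from any of these calls simply ends
the Newton phase.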
This function works as black box. It uses fields of SState which are marked as "Variables for constrained Newton phase", and only this function and its friends know about these variables. Everyone else should use: * CNewtonBuild() to prepare initial Cholesky decomposition for step * CNewtonStep() to perform constrained Newton step * CNewtonUpdate() to update Cholesky matrix after point was moved and constraints were updated. In some cases it is possible to efficiently re-calculate Cholesky decomposition if you know which constraints were activated. If efficient re-calculation is impossible, this function returns False. INPUT PARAMETERS: SState - structure which stores model and temporaries for CN phase; in particular, SAS.XC stores current point. SparseSolver-which sparse solver to use for sparse model; ignored for dense QP. Can be: * 2 - SKS-based Cholesky NCholesky- counter which is incremented after Cholesky (successful or failed one) OUTPUT PARAMETERS: NCholesky- possibly updated counter RESULT: True, if Cholesky decomposition was successfully performed. False, if a) matrix was semi-definite or indefinite, or b) particular combination of matrix type (sparse) and constraints (general linear) is not supported. NOTE: this function may routinely return False, for indefinite matrices or for sparse problems with general linear constraints. You should be able to handle such situations. -- ALGLIB -- Copyright 14.05.2014 by Bochkanov Sergey *************************************************************************/ static ae_bool qqpsolver_cnewtonbuild(qqpbuffers* sstate, ae_int_t sparsesolver, ae_int_t* ncholesky, ae_state *_state) { ae_int_t nec; ae_int_t nic; ae_int_t n; ae_int_t nmain; ae_int_t i; ae_int_t j; ae_int_t k; double v; ae_bool b; ae_int_t ridx0; ae_int_t ridx1; ae_int_t nfree; ae_bool result; result = ae_false; /* * Fetch often used fields */ n = sstate->n; nmain = sstate->nmain; nec = sstate->nec; nic = sstate->nic; /* * Check problem properties. * Sparse problems with general linear constraints are not supported yet. */ if( sstate->akind==1&&nec+nic>0 ) { return result; } /* * 1. Set CNModelAge to zero * 2. Generate YIdx - reordering of variables such that free variables * come first and are ordered by ascending, fixed are last ones and * have no particular ordering. * * This step is same for dense and sparse problems. 
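*
* Small hypothetical example: for N=4 with variables 0 and 3 sitting exactly
* at their bounds, the loop below produces YIdx=[1,2,3,0] (free variables 1,2
* first, in ascending order; fixed variables last) and NFree=2.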
*/ sstate->cnmodelage = 0; ivectorsetlengthatleast(&sstate->yidx, n, _state); ridx0 = 0; ridx1 = n-1; for(i=0; i<=n-1; i++) { sstate->yidx.ptr.p_int[i] = -1; } for(i=0; i<=n-1; i++) { ae_assert(!sstate->havebndl.ptr.p_bool[i]||ae_fp_greater_eq(sstate->sas.xc.ptr.p_double[i],sstate->bndl.ptr.p_double[i]), "CNewtonBuild: internal error", _state); ae_assert(!sstate->havebndu.ptr.p_bool[i]||ae_fp_less_eq(sstate->sas.xc.ptr.p_double[i],sstate->bndu.ptr.p_double[i]), "CNewtonBuild: internal error", _state); b = ae_false; b = b||(sstate->havebndl.ptr.p_bool[i]&&ae_fp_eq(sstate->sas.xc.ptr.p_double[i],sstate->bndl.ptr.p_double[i])); b = b||(sstate->havebndu.ptr.p_bool[i]&&ae_fp_eq(sstate->sas.xc.ptr.p_double[i],sstate->bndu.ptr.p_double[i])); if( b ) { sstate->yidx.ptr.p_int[ridx1] = i; ridx1 = ridx1-1; } else { sstate->yidx.ptr.p_int[ridx0] = i; ridx0 = ridx0+1; } } ae_assert(ridx0==ridx1+1, "CNewtonBuild: internal error", _state); nfree = ridx0; sstate->nfree = nfree; if( nfree==0 ) { return result; } /* * Constrained Newton matrix: dense version */ if( sstate->akind==0 ) { rmatrixsetlengthatleast(&sstate->densez, n, n, _state); rvectorsetlengthatleast(&sstate->tmpcn, n, _state); if( nec+nic>0 ) { /* * Initialize Z with C'*C, add A */ rmatrixsyrk(n, nec+nic, qqpsolver_penaltyfactor, &sstate->cleic, 0, 0, 2, 0.0, &sstate->densez, 0, 0, ae_true, _state); for(i=0; i<=nmain-1; i++) { for(j=i; j<=nmain-1; j++) { sstate->densez.ptr.pp_double[i][j] = sstate->densez.ptr.pp_double[i][j]+sstate->densea.ptr.pp_double[i][j]; } } } else { /* * No linear constraints, just set Z to A */ ae_assert(n==nmain, "CNewtonBuild: integrity check failed", _state); for(i=0; i<=nmain-1; i++) { for(j=i; j<=nmain-1; j++) { sstate->densez.ptr.pp_double[i][j] = sstate->densea.ptr.pp_double[i][j]; } } } for(i=1; i<=nfree-1; i++) { ae_assert(sstate->yidx.ptr.p_int[i]>sstate->yidx.ptr.p_int[i-1], "CNewtonBuild: integrity check failed", _state); } for(i=0; i<=nfree-1; i++) { k = sstate->yidx.ptr.p_int[i]; for(j=i; j<=nfree-1; j++) { sstate->densez.ptr.pp_double[i][j] = sstate->densez.ptr.pp_double[k][sstate->yidx.ptr.p_int[j]]; } } rvectorsetlengthatleast(&sstate->regdiag, n, _state); for(i=0; i<=nfree-1; i++) { v = 0.0; for(j=0; j<=i-1; j++) { v = v+ae_fabs(sstate->densez.ptr.pp_double[j][i], _state); } for(j=i; j<=nfree-1; j++) { v = v+ae_fabs(sstate->densez.ptr.pp_double[i][j], _state); } if( ae_fp_eq(v,(double)(0)) ) { v = 1.0; } sstate->regdiag.ptr.p_double[i] = qqpsolver_regz*v; } for(i=0; i<=nfree-1; i++) { sstate->densez.ptr.pp_double[i][i] = sstate->densez.ptr.pp_double[i][i]+sstate->regdiag.ptr.p_double[i]; } inc(ncholesky, _state); if( !spdmatrixcholeskyrec(&sstate->densez, 0, nfree, ae_true, &sstate->tmpcn, _state) ) { return result; } for(i=nfree-1; i>=0; i--) { ae_v_move(&sstate->tmpcn.ptr.p_double[i], 1, &sstate->densez.ptr.pp_double[i][i], 1, ae_v_len(i,nfree-1)); k = sstate->yidx.ptr.p_int[i]; for(j=k; j<=n-1; j++) { sstate->densez.ptr.pp_double[k][j] = (double)(0); } for(j=i; j<=nfree-1; j++) { sstate->densez.ptr.pp_double[k][sstate->yidx.ptr.p_int[j]] = sstate->tmpcn.ptr.p_double[j]; } } for(i=nfree; i<=n-1; i++) { k = sstate->yidx.ptr.p_int[i]; sstate->densez.ptr.pp_double[k][k] = 1.0; for(j=k+1; j<=n-1; j++) { sstate->densez.ptr.pp_double[k][j] = (double)(0); } } result = ae_true; return result; } /* * Constrained Newton matrix: sparse version */ if( sstate->akind==1 ) { ae_assert(nec+nic==0, "CNewtonBuild: internal error", _state); ae_assert(sparsesolver==2, "CNewtonBuild: internal error", _state); /* * Copy 
sparse A to Z and fill rows/columns corresponding to active * constraints by zeros. Diagonal elements corresponding to active * constraints are filled by unit values. */ sparsecopytosksbuf(&sstate->sparsea, &sstate->sparsecca, _state); rvectorsetlengthatleast(&sstate->tmpcn, n, _state); for(i=0; i<=n-1; i++) { sstate->tmpcn.ptr.p_double[i] = (double)(0); } for(i=nfree; i<=n-1; i++) { sstate->tmpcn.ptr.p_double[sstate->yidx.ptr.p_int[i]] = (double)(1); } for(i=0; i<=n-1; i++) { k = sstate->sparsecca.ridx.ptr.p_int[i]; for(j=i-sstate->sparsecca.didx.ptr.p_int[i]; j<=i; j++) { if( ae_fp_neq(sstate->tmpcn.ptr.p_double[i],(double)(0))||ae_fp_neq(sstate->tmpcn.ptr.p_double[j],(double)(0)) ) { /* * I-th or J-th variable is in active set (constrained) */ if( i==j ) { sstate->sparsecca.vals.ptr.p_double[k] = 1.0; } else { sstate->sparsecca.vals.ptr.p_double[k] = 0.0; } } k = k+1; } } /* * Perform sparse Cholesky */ inc(ncholesky, _state); if( !sparsecholeskyskyline(&sstate->sparsecca, nmain, sstate->sparseupper, _state) ) { return result; } result = ae_true; return result; } /* * Unexpected :) */ ae_assert(ae_false, "CNewtonBuild: internal error", _state); return result; } /************************************************************************* This function updates equality-constrained Cholesky matrix after activation of the new equality constraints. Matrix being updated is quadratic term of the function below f(x) = 0.5*x'*A*x + b'*x + penaltyfactor*0.5*(C*x-b)'*(C*x-b) where A can be dense or sparse. This function uses YIdx[] array (set by CNewtonBuild() function) to distinguish between active and inactive constraints. This function works as black box. It uses fields of SState which are marked as "Variables for constrained Newton phase", and only this function and its friends know about these variables. Everyone else should use: * CNewtonBuild() to prepare initial Cholesky decomposition for step * CNewtonStep() to perform constrained Newton step * CNewtonUpdate() to update Cholesky matrix after point was moved and constraints were updated. In some cases it is possible to efficiently re-calculate Cholesky decomposition if you know which constraints were activated. If efficient re-calculation is impossible, this function returns False. INPUT PARAMETERS: SState - structure which stores model and temporaries for CN phase; in particular, SAS.XC stores current point. Settings - QQPSettings object which was initialized by appropriate construction function. NCUpdates- counter which is incremented after each update (one update means one variable being fixed) OUTPUT PARAMETERS: NCUpdates- possibly updated counter RESULT: True, if Cholesky decomposition was successfully performed. False, if a) model age was too high, or b) particular combination of matrix type (sparse) and constraints (general linear) is not supported NOTE: this function may routinely return False. You should be able to handle such situations. 
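NOTE 2 (informal description of the calling code): when this function returns
False, the Newton loop in QQPOptimize simply stops performing Newton
iterations for the current outer iteration; no error is raised, and the
solver continues with its other phases.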
-- ALGLIB -- Copyright 14.05.2014 by Bochkanov Sergey *************************************************************************/ static ae_bool qqpsolver_cnewtonupdate(qqpbuffers* sstate, qqpsettings* settings, ae_int_t* ncupdates, ae_state *_state) { ae_int_t n; ae_int_t nfree; ae_int_t ntofix; ae_bool b; ae_int_t ridx0; ae_int_t ridx1; ae_int_t i; ae_int_t k; ae_bool result; result = ae_false; /* * Cholesky updates for sparse problems are not supported */ if( sstate->akind==1 ) { return result; } /* * Fetch often used fields */ n = sstate->n; nfree = sstate->nfree; /* * Determine variables to fix and move them to YIdx[NFree-NToFix:NFree-1] * Exit if CNModelAge increased too much. */ ivectorsetlengthatleast(&sstate->tmpcni, n, _state); ridx0 = 0; ridx1 = nfree-1; for(i=0; i<=nfree-1; i++) { sstate->tmpcni.ptr.p_int[i] = -1; } for(k=0; k<=nfree-1; k++) { i = sstate->yidx.ptr.p_int[k]; ae_assert(!sstate->havebndl.ptr.p_bool[i]||ae_fp_greater_eq(sstate->sas.xc.ptr.p_double[i],sstate->bndl.ptr.p_double[i]), "CNewtonUpdate: internal error", _state); ae_assert(!sstate->havebndu.ptr.p_bool[i]||ae_fp_less_eq(sstate->sas.xc.ptr.p_double[i],sstate->bndu.ptr.p_double[i]), "CNewtonUpdate: internal error", _state); b = ae_false; b = b||(sstate->havebndl.ptr.p_bool[i]&&ae_fp_eq(sstate->sas.xc.ptr.p_double[i],sstate->bndl.ptr.p_double[i])); b = b||(sstate->havebndu.ptr.p_bool[i]&&ae_fp_eq(sstate->sas.xc.ptr.p_double[i],sstate->bndu.ptr.p_double[i])); if( b ) { sstate->tmpcni.ptr.p_int[ridx1] = i; ridx1 = ridx1-1; } else { sstate->tmpcni.ptr.p_int[ridx0] = i; ridx0 = ridx0+1; } } ae_assert(ridx0==ridx1+1, "CNewtonUpdate: internal error", _state); ntofix = nfree-ridx0; if( ntofix==0||ntofix==nfree ) { return result; } if( sstate->cnmodelage+ntofix>settings->cnmaxupdates ) { return result; } for(i=0; i<=nfree-1; i++) { sstate->yidx.ptr.p_int[i] = sstate->tmpcni.ptr.p_int[i]; } /* * Constrained Newton matrix: dense version. */ if( sstate->akind==0 ) { /* * Update Cholesky matrix with SPDMatrixCholeskyUpdateFixBuf() */ bvectorsetlengthatleast(&sstate->tmpcnb, n, _state); for(i=0; i<=n-1; i++) { sstate->tmpcnb.ptr.p_bool[i] = ae_false; } for(i=nfree-ntofix; i<=nfree-1; i++) { sstate->tmpcnb.ptr.p_bool[sstate->yidx.ptr.p_int[i]] = ae_true; } spdmatrixcholeskyupdatefixbuf(&sstate->densez, n, ae_true, &sstate->tmpcnb, &sstate->tmpcn, _state); /* * Update information stored in State and exit */ sstate->nfree = nfree-ntofix; sstate->cnmodelage = sstate->cnmodelage+ntofix; *ncupdates = *ncupdates+ntofix; result = ae_true; return result; } /* * Unexpected :) */ ae_assert(ae_false, "CNewtonUpdate: internal error", _state); return result; } /************************************************************************* This function prepares equality-constrained Newton step using previously calculated constrained Cholesky matrix of the problem f(x) = 0.5*x'*A*x + b'*x + penaltyfactor*0.5*(C*x-b)'*(C*x-b) where A can be dense or sparse. As input, this function accepts gradient at the current location. As output, it returns step vector (replaces gradient). This function works as black box. It uses fields of SState which are marked as "Variables for constrained Newton phase", and only this function and its friends know about these variables. Everyone else should use: * CNewtonBuild() to prepare initial Cholesky decomposition for step * CNewtonStep() to perform constrained Newton step * CNewtonUpdate() to update Cholesky matrix after point was moved and constraints were updated. 
In some cases it is possible to efficiently re-calculate Cholesky decomposition if you know which constraints were activated. If efficient re-calculation is impossible, this function returns False. INPUT PARAMETERS: SState - structure which stores model and temporaries for CN phase; in particular, SAS.XC stores current point. Settings - QQPSettings object which was initialized by appropriate construction function. GC - array[NMain+NSlack], gradient of the target function OUTPUT PARAMETERS: GC - array[NMain+NSlack], step vector (on success) RESULT: True, if step was successfully calculated. False, if step calculation failed: a) gradient was exactly zero, b) gradient norm was smaller than EpsG (stopping condition) c) all variables were equality-constrained NOTE: this function may routinely return False. You should be able to handle such situations. -- ALGLIB -- Copyright 14.05.2014 by Bochkanov Sergey *************************************************************************/ static ae_bool qqpsolver_cnewtonstep(qqpbuffers* sstate, qqpsettings* settings, /* Real */ ae_vector* gc, ae_state *_state) { ae_int_t i; ae_int_t n; ae_int_t nfree; double v; ae_bool result; result = ae_false; n = sstate->n; nfree = sstate->nfree; for(i=nfree; i<=n-1; i++) { gc->ptr.p_double[sstate->yidx.ptr.p_int[i]] = 0.0; } v = ae_v_dotproduct(&gc->ptr.p_double[0], 1, &gc->ptr.p_double[0], 1, ae_v_len(0,n-1)); if( ae_fp_less_eq(ae_sqrt(v, _state),settings->epsg) ) { return result; } for(i=0; i<=n-1; i++) { gc->ptr.p_double[i] = -gc->ptr.p_double[i]; } if( sstate->akind==0 ) { /* * Dense Newton step. * Use straightforward Cholesky solver. */ fblscholeskysolve(&sstate->densez, 1.0, n, ae_true, gc, &sstate->tmpcn, _state); result = ae_true; return result; } if( sstate->akind==1 ) { /* * Sparse Newton step. * * We have T*T' = L*L' = U'*U (depending on specific triangle stored in SparseCCA). 
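*
* In other words, the (already negated) gradient stored in GC is turned into
* the Newton step by two triangular solves: with the upper factor we solve
* U'*y = GC and then U*step = y (SparseTRSV with OpType=1 followed by
* OpType=0); with the lower factor the order is L*y = GC, then L'*step = y.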
*/ if( sstate->sparseupper ) { sparsetrsv(&sstate->sparsecca, sstate->sparseupper, ae_false, 1, gc, _state); sparsetrsv(&sstate->sparsecca, sstate->sparseupper, ae_false, 0, gc, _state); } else { sparsetrsv(&sstate->sparsecca, sstate->sparseupper, ae_false, 0, gc, _state); sparsetrsv(&sstate->sparsecca, sstate->sparseupper, ae_false, 1, gc, _state); } result = ae_true; return result; } ae_assert(ae_false, "CNewtonStep: internal error", _state); return result; } void _qqpsettings_init(void* _p, ae_state *_state) { qqpsettings *p = (qqpsettings*)_p; ae_touch_ptr((void*)p); } void _qqpsettings_init_copy(void* _dst, void* _src, ae_state *_state) { qqpsettings *dst = (qqpsettings*)_dst; qqpsettings *src = (qqpsettings*)_src; dst->epsg = src->epsg; dst->epsf = src->epsf; dst->epsx = src->epsx; dst->maxouterits = src->maxouterits; dst->cgphase = src->cgphase; dst->cnphase = src->cnphase; dst->cgminits = src->cgminits; dst->cgmaxits = src->cgmaxits; dst->cnmaxupdates = src->cnmaxupdates; dst->sparsesolver = src->sparsesolver; } void _qqpsettings_clear(void* _p) { qqpsettings *p = (qqpsettings*)_p; ae_touch_ptr((void*)p); } void _qqpsettings_destroy(void* _p) { qqpsettings *p = (qqpsettings*)_p; ae_touch_ptr((void*)p); } void _qqpbuffers_init(void* _p, ae_state *_state) { qqpbuffers *p = (qqpbuffers*)_p; ae_touch_ptr((void*)p); ae_matrix_init(&p->densea, 0, 0, DT_REAL, _state); _sparsematrix_init(&p->sparsea, _state); ae_vector_init(&p->b, 0, DT_REAL, _state); ae_vector_init(&p->bndl, 0, DT_REAL, _state); ae_vector_init(&p->bndu, 0, DT_REAL, _state); ae_vector_init(&p->havebndl, 0, DT_BOOL, _state); ae_vector_init(&p->havebndu, 0, DT_BOOL, _state); ae_matrix_init(&p->cleic, 0, 0, DT_REAL, _state); ae_vector_init(&p->xs, 0, DT_REAL, _state); ae_vector_init(&p->xf, 0, DT_REAL, _state); ae_vector_init(&p->gc, 0, DT_REAL, _state); ae_vector_init(&p->xp, 0, DT_REAL, _state); ae_vector_init(&p->dc, 0, DT_REAL, _state); ae_vector_init(&p->dp, 0, DT_REAL, _state); ae_vector_init(&p->cgc, 0, DT_REAL, _state); ae_vector_init(&p->cgp, 0, DT_REAL, _state); _sactiveset_init(&p->sas, _state); ae_vector_init(&p->activated, 0, DT_BOOL, _state); ae_matrix_init(&p->densez, 0, 0, DT_REAL, _state); _sparsematrix_init(&p->sparsecca, _state); ae_vector_init(&p->yidx, 0, DT_INT, _state); ae_vector_init(&p->regdiag, 0, DT_REAL, _state); ae_vector_init(&p->regx0, 0, DT_REAL, _state); ae_vector_init(&p->tmpcn, 0, DT_REAL, _state); ae_vector_init(&p->tmpcni, 0, DT_INT, _state); ae_vector_init(&p->tmpcnb, 0, DT_BOOL, _state); ae_vector_init(&p->tmp0, 0, DT_REAL, _state); ae_vector_init(&p->stpbuf, 0, DT_REAL, _state); _sparsebuffers_init(&p->sbuf, _state); } void _qqpbuffers_init_copy(void* _dst, void* _src, ae_state *_state) { qqpbuffers *dst = (qqpbuffers*)_dst; qqpbuffers *src = (qqpbuffers*)_src; dst->n = src->n; dst->nmain = src->nmain; dst->nslack = src->nslack; dst->nec = src->nec; dst->nic = src->nic; dst->akind = src->akind; ae_matrix_init_copy(&dst->densea, &src->densea, _state); _sparsematrix_init_copy(&dst->sparsea, &src->sparsea, _state); dst->sparseupper = src->sparseupper; dst->absamax = src->absamax; dst->absasum = src->absasum; dst->absasum2 = src->absasum2; ae_vector_init_copy(&dst->b, &src->b, _state); ae_vector_init_copy(&dst->bndl, &src->bndl, _state); ae_vector_init_copy(&dst->bndu, &src->bndu, _state); ae_vector_init_copy(&dst->havebndl, &src->havebndl, _state); ae_vector_init_copy(&dst->havebndu, &src->havebndu, _state); ae_matrix_init_copy(&dst->cleic, &src->cleic, _state); 
ae_vector_init_copy(&dst->xs, &src->xs, _state); ae_vector_init_copy(&dst->xf, &src->xf, _state); ae_vector_init_copy(&dst->gc, &src->gc, _state); ae_vector_init_copy(&dst->xp, &src->xp, _state); ae_vector_init_copy(&dst->dc, &src->dc, _state); ae_vector_init_copy(&dst->dp, &src->dp, _state); ae_vector_init_copy(&dst->cgc, &src->cgc, _state); ae_vector_init_copy(&dst->cgp, &src->cgp, _state); _sactiveset_init_copy(&dst->sas, &src->sas, _state); ae_vector_init_copy(&dst->activated, &src->activated, _state); dst->nfree = src->nfree; dst->cnmodelage = src->cnmodelage; ae_matrix_init_copy(&dst->densez, &src->densez, _state); _sparsematrix_init_copy(&dst->sparsecca, &src->sparsecca, _state); ae_vector_init_copy(&dst->yidx, &src->yidx, _state); ae_vector_init_copy(&dst->regdiag, &src->regdiag, _state); ae_vector_init_copy(&dst->regx0, &src->regx0, _state); ae_vector_init_copy(&dst->tmpcn, &src->tmpcn, _state); ae_vector_init_copy(&dst->tmpcni, &src->tmpcni, _state); ae_vector_init_copy(&dst->tmpcnb, &src->tmpcnb, _state); ae_vector_init_copy(&dst->tmp0, &src->tmp0, _state); ae_vector_init_copy(&dst->stpbuf, &src->stpbuf, _state); _sparsebuffers_init_copy(&dst->sbuf, &src->sbuf, _state); dst->repinneriterationscount = src->repinneriterationscount; dst->repouteriterationscount = src->repouteriterationscount; dst->repncholesky = src->repncholesky; dst->repncupdates = src->repncupdates; } void _qqpbuffers_clear(void* _p) { qqpbuffers *p = (qqpbuffers*)_p; ae_touch_ptr((void*)p); ae_matrix_clear(&p->densea); _sparsematrix_clear(&p->sparsea); ae_vector_clear(&p->b); ae_vector_clear(&p->bndl); ae_vector_clear(&p->bndu); ae_vector_clear(&p->havebndl); ae_vector_clear(&p->havebndu); ae_matrix_clear(&p->cleic); ae_vector_clear(&p->xs); ae_vector_clear(&p->xf); ae_vector_clear(&p->gc); ae_vector_clear(&p->xp); ae_vector_clear(&p->dc); ae_vector_clear(&p->dp); ae_vector_clear(&p->cgc); ae_vector_clear(&p->cgp); _sactiveset_clear(&p->sas); ae_vector_clear(&p->activated); ae_matrix_clear(&p->densez); _sparsematrix_clear(&p->sparsecca); ae_vector_clear(&p->yidx); ae_vector_clear(&p->regdiag); ae_vector_clear(&p->regx0); ae_vector_clear(&p->tmpcn); ae_vector_clear(&p->tmpcni); ae_vector_clear(&p->tmpcnb); ae_vector_clear(&p->tmp0); ae_vector_clear(&p->stpbuf); _sparsebuffers_clear(&p->sbuf); } void _qqpbuffers_destroy(void* _p) { qqpbuffers *p = (qqpbuffers*)_p; ae_touch_ptr((void*)p); ae_matrix_destroy(&p->densea); _sparsematrix_destroy(&p->sparsea); ae_vector_destroy(&p->b); ae_vector_destroy(&p->bndl); ae_vector_destroy(&p->bndu); ae_vector_destroy(&p->havebndl); ae_vector_destroy(&p->havebndu); ae_matrix_destroy(&p->cleic); ae_vector_destroy(&p->xs); ae_vector_destroy(&p->xf); ae_vector_destroy(&p->gc); ae_vector_destroy(&p->xp); ae_vector_destroy(&p->dc); ae_vector_destroy(&p->dp); ae_vector_destroy(&p->cgc); ae_vector_destroy(&p->cgp); _sactiveset_destroy(&p->sas); ae_vector_destroy(&p->activated); ae_matrix_destroy(&p->densez); _sparsematrix_destroy(&p->sparsecca); ae_vector_destroy(&p->yidx); ae_vector_destroy(&p->regdiag); ae_vector_destroy(&p->regx0); ae_vector_destroy(&p->tmpcn); ae_vector_destroy(&p->tmpcni); ae_vector_destroy(&p->tmpcnb); ae_vector_destroy(&p->tmp0); ae_vector_destroy(&p->stpbuf); _sparsebuffers_destroy(&p->sbuf); } /************************************************************************* LIMITED MEMORY BFGS METHOD FOR LARGE SCALE OPTIMIZATION DESCRIPTION: The subroutine minimizes function F(x) of N arguments by using a quasi- Newton method (LBFGS scheme) which is optimized to use 
a minimum amount of memory. The subroutine generates the approximation of an inverse Hessian matrix by using information about the last M steps of the algorithm (instead of N). It lessens a required amount of memory from a value of order N^2 to a value of order 2*N*M. REQUIREMENTS: Algorithm will request following information during its operation: * function value F and its gradient G (simultaneously) at given point X USAGE: 1. User initializes algorithm state with MinLBFGSCreate() call 2. User tunes solver parameters with MinLBFGSSetCond() MinLBFGSSetStpMax() and other functions 3. User calls MinLBFGSOptimize() function which takes algorithm state and pointer (delegate, etc.) to callback function which calculates F/G. 4. User calls MinLBFGSResults() to get solution 5. Optionally user may call MinLBFGSRestartFrom() to solve another problem with same N/M but another starting point and/or another function. MinLBFGSRestartFrom() allows to reuse already initialized structure. INPUT PARAMETERS: N - problem dimension. N>0 M - number of corrections in the BFGS scheme of Hessian approximation update. Recommended value: 3<=M<=7. The smaller value causes worse convergence, the bigger will not cause a considerably better convergence, but will cause a fall in the performance. M<=N. X - initial solution approximation, array[0..N-1]. OUTPUT PARAMETERS: State - structure which stores algorithm state NOTES: 1. you may tune stopping conditions with MinLBFGSSetCond() function 2. if target function contains exp() or other fast growing functions, and optimization algorithm makes too large steps which leads to overflow, use MinLBFGSSetStpMax() function to bound algorithm's steps. However, L-BFGS rarely needs such a tuning. -- ALGLIB -- Copyright 02.04.2010 by Bochkanov Sergey *************************************************************************/ void minlbfgscreate(ae_int_t n, ae_int_t m, /* Real */ ae_vector* x, minlbfgsstate* state, ae_state *_state) { _minlbfgsstate_clear(state); ae_assert(n>=1, "MinLBFGSCreate: N<1!", _state); ae_assert(m>=1, "MinLBFGSCreate: M<1", _state); ae_assert(m<=n, "MinLBFGSCreate: M>N", _state); ae_assert(x->cnt>=n, "MinLBFGSCreate: Length(X)0: * if given, only leading N elements of X are used * if not given, automatically determined from size of X M - number of corrections in the BFGS scheme of Hessian approximation update. Recommended value: 3<=M<=7. The smaller value causes worse convergence, the bigger will not cause a considerably better convergence, but will cause a fall in the performance. M<=N. X - starting point, array[0..N-1]. DiffStep- differentiation step, >0 OUTPUT PARAMETERS: State - structure which stores algorithm state NOTES: 1. algorithm uses 4-point central formula for differentiation. 2. differentiation step along I-th axis is equal to DiffStep*S[I] where S[] is scaling vector which can be set by MinLBFGSSetScale() call. 3. we recommend you to use moderate values of differentiation step. Too large step will result in too large truncation errors, while too small step will result in too large numerical errors. 1.0E-6 can be good value to start with. 4. Numerical differentiation is very inefficient - one gradient calculation needs 4*N function evaluations. This function will work for any N - either small (1...10), moderate (10...100) or large (100...). However, performance penalty will be too severe for any N's except for small ones. We should also say that code which relies on numerical differentiation is less robust and precise. 
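(For reference, the central formula actually used by the code below is
G[i] = (8*(F(X+0.5*h)-F(X-0.5*h)) - (F(X+h)-F(X-h)))/(6*h), where
h = DiffStep*S[i]; it is the source of the 4*N evaluation count mentioned
above.)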
LBFGS needs exact gradient values. Imprecise gradient may slow down convergence, especially on highly nonlinear problems. Thus we recommend to use this function for fast prototyping on small- dimensional problems only, and to implement analytical gradient as soon as possible. -- ALGLIB -- Copyright 16.05.2011 by Bochkanov Sergey *************************************************************************/ void minlbfgscreatef(ae_int_t n, ae_int_t m, /* Real */ ae_vector* x, double diffstep, minlbfgsstate* state, ae_state *_state) { _minlbfgsstate_clear(state); ae_assert(n>=1, "MinLBFGSCreateF: N too small!", _state); ae_assert(m>=1, "MinLBFGSCreateF: M<1", _state); ae_assert(m<=n, "MinLBFGSCreateF: M>N", _state); ae_assert(x->cnt>=n, "MinLBFGSCreateF: Length(X)=0 The subroutine finishes its work if the condition |v|=0 The subroutine finishes its work if on k+1-th iteration the condition |F(k+1)-F(k)|<=EpsF*max{|F(k)|,|F(k+1)|,1} is satisfied. EpsX - >=0 The subroutine finishes its work if on k+1-th iteration the condition |v|<=EpsX is fulfilled, where: * |.| means Euclidian norm * v - scaled step vector, v[i]=dx[i]/s[i] * dx - ste pvector, dx=X(k+1)-X(k) * s - scaling coefficients set by MinLBFGSSetScale() MaxIts - maximum number of iterations. If MaxIts=0, the number of iterations is unlimited. Passing EpsG=0, EpsF=0, EpsX=0 and MaxIts=0 (simultaneously) will lead to automatic stopping criterion selection (small EpsX). -- ALGLIB -- Copyright 02.04.2010 by Bochkanov Sergey *************************************************************************/ void minlbfgssetcond(minlbfgsstate* state, double epsg, double epsf, double epsx, ae_int_t maxits, ae_state *_state) { ae_assert(ae_isfinite(epsg, _state), "MinLBFGSSetCond: EpsG is not finite number!", _state); ae_assert(ae_fp_greater_eq(epsg,(double)(0)), "MinLBFGSSetCond: negative EpsG!", _state); ae_assert(ae_isfinite(epsf, _state), "MinLBFGSSetCond: EpsF is not finite number!", _state); ae_assert(ae_fp_greater_eq(epsf,(double)(0)), "MinLBFGSSetCond: negative EpsF!", _state); ae_assert(ae_isfinite(epsx, _state), "MinLBFGSSetCond: EpsX is not finite number!", _state); ae_assert(ae_fp_greater_eq(epsx,(double)(0)), "MinLBFGSSetCond: negative EpsX!", _state); ae_assert(maxits>=0, "MinLBFGSSetCond: negative MaxIts!", _state); if( ((ae_fp_eq(epsg,(double)(0))&&ae_fp_eq(epsf,(double)(0)))&&ae_fp_eq(epsx,(double)(0)))&&maxits==0 ) { epsx = 1.0E-6; } state->epsg = epsg; state->epsf = epsf; state->epsx = epsx; state->maxits = maxits; } /************************************************************************* This function turns on/off reporting. INPUT PARAMETERS: State - structure which stores algorithm state NeedXRep- whether iteration reports are needed or not If NeedXRep is True, algorithm will call rep() callback function if it is provided to MinLBFGSOptimize(). -- ALGLIB -- Copyright 02.04.2010 by Bochkanov Sergey *************************************************************************/ void minlbfgssetxrep(minlbfgsstate* state, ae_bool needxrep, ae_state *_state) { state->xrep = needxrep; } /************************************************************************* This function sets maximum step length INPUT PARAMETERS: State - structure which stores algorithm state StpMax - maximum step length, >=0. Set StpMax to 0.0 (default), if you don't want to limit step length. Use this subroutine when you optimize target function which contains exp() or other fast growing functions, and optimization algorithm makes too large steps which leads to overflow. 
This function allows us to reject steps that are too large (and therefore expose us to the possible overflow) without actually calculating function value at the x+stp*d. -- ALGLIB -- Copyright 02.04.2010 by Bochkanov Sergey *************************************************************************/ void minlbfgssetstpmax(minlbfgsstate* state, double stpmax, ae_state *_state) { ae_assert(ae_isfinite(stpmax, _state), "MinLBFGSSetStpMax: StpMax is not finite!", _state); ae_assert(ae_fp_greater_eq(stpmax,(double)(0)), "MinLBFGSSetStpMax: StpMax<0!", _state); state->stpmax = stpmax; } /************************************************************************* This function sets scaling coefficients for LBFGS optimizer. ALGLIB optimizers use scaling matrices to test stopping conditions (step size and gradient are scaled before comparison with tolerances). Scale of the I-th variable is a translation invariant measure of: a) "how large" the variable is b) how large the step should be to make significant changes in the function Scaling is also used by finite difference variant of the optimizer - step along I-th axis is equal to DiffStep*S[I]. In most optimizers (and in the LBFGS too) scaling is NOT a form of preconditioning. It just affects stopping conditions. You should set preconditioner by separate call to one of the MinLBFGSSetPrec...() functions. There is special preconditioning mode, however, which uses scaling coefficients to form diagonal preconditioning matrix. You can turn this mode on, if you want. But you should understand that scaling is not the same thing as preconditioning - these are two different, although related forms of tuning solver. INPUT PARAMETERS: State - structure stores algorithm state S - array[N], non-zero scaling coefficients S[i] may be negative, sign doesn't matter. -- ALGLIB -- Copyright 14.01.2011 by Bochkanov Sergey *************************************************************************/ void minlbfgssetscale(minlbfgsstate* state, /* Real */ ae_vector* s, ae_state *_state) { ae_int_t i; ae_assert(s->cnt>=state->n, "MinLBFGSSetScale: Length(S)n-1; i++) { ae_assert(ae_isfinite(s->ptr.p_double[i], _state), "MinLBFGSSetScale: S contains infinite or NAN elements", _state); ae_assert(ae_fp_neq(s->ptr.p_double[i],(double)(0)), "MinLBFGSSetScale: S contains zero elements", _state); state->s.ptr.p_double[i] = ae_fabs(s->ptr.p_double[i], _state); } } /************************************************************************* Extended subroutine for internal use only. Accepts additional parameters: Flags - additional settings: * Flags = 0 means no additional settings * Flags = 1 "do not allocate memory". used when solving a many subsequent tasks with same N/M values. First call MUST be without this flag bit set, subsequent calls of MinLBFGS with same MinLBFGSState structure can set Flags to 1. 
DiffStep - numerical differentiation step -- ALGLIB -- Copyright 02.04.2010 by Bochkanov Sergey *************************************************************************/ void minlbfgscreatex(ae_int_t n, ae_int_t m, /* Real */ ae_vector* x, ae_int_t flags, double diffstep, minlbfgsstate* state, ae_state *_state) { ae_bool allocatemem; ae_int_t i; ae_assert(n>=1, "MinLBFGS: N too small!", _state); ae_assert(m>=1, "MinLBFGS: M too small!", _state); ae_assert(m<=n, "MinLBFGS: M too large!", _state); /* * Initialize */ state->teststep = (double)(0); state->diffstep = diffstep; state->n = n; state->m = m; allocatemem = flags%2==0; flags = flags/2; if( allocatemem ) { rvectorsetlengthatleast(&state->rho, m, _state); rvectorsetlengthatleast(&state->theta, m, _state); rmatrixsetlengthatleast(&state->yk, m, n, _state); rmatrixsetlengthatleast(&state->sk, m, n, _state); rvectorsetlengthatleast(&state->d, n, _state); rvectorsetlengthatleast(&state->xp, n, _state); rvectorsetlengthatleast(&state->x, n, _state); rvectorsetlengthatleast(&state->s, n, _state); rvectorsetlengthatleast(&state->g, n, _state); rvectorsetlengthatleast(&state->work, n, _state); } minlbfgssetcond(state, (double)(0), (double)(0), (double)(0), 0, _state); minlbfgssetxrep(state, ae_false, _state); minlbfgssetstpmax(state, (double)(0), _state); minlbfgsrestartfrom(state, x, _state); for(i=0; i<=n-1; i++) { state->s.ptr.p_double[i] = 1.0; } state->prectype = 0; } /************************************************************************* Modification of the preconditioner: default preconditioner (simple scaling, same for all elements of X) is used. INPUT PARAMETERS: State - structure which stores algorithm state NOTE: you can change preconditioner "on the fly", during algorithm iterations. -- ALGLIB -- Copyright 13.10.2010 by Bochkanov Sergey *************************************************************************/ void minlbfgssetprecdefault(minlbfgsstate* state, ae_state *_state) { state->prectype = 0; } /************************************************************************* Modification of the preconditioner: Cholesky factorization of approximate Hessian is used. INPUT PARAMETERS: State - structure which stores algorithm state P - triangular preconditioner, Cholesky factorization of the approximate Hessian. array[0..N-1,0..N-1], (if larger, only leading N elements are used). IsUpper - whether upper or lower triangle of P is given (other triangle is not referenced) After call to this function preconditioner is changed to P (P is copied into the internal buffer). NOTE: you can change preconditioner "on the fly", during algorithm iterations. NOTE 2: P should be nonsingular. Exception will be thrown otherwise. 
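NOTE 3 (hypothetical usage sketch, not taken from the library examples): one
way to obtain P is to factor an approximate Hessian H in-place with
spdmatrixcholesky(&h, n, ae_true, _state) and then call
minlbfgssetpreccholesky(&state, &h, ae_true, _state); since P is copied into
the internal buffer, H may be modified or freed afterwards.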
-- ALGLIB -- Copyright 13.10.2010 by Bochkanov Sergey *************************************************************************/ void minlbfgssetpreccholesky(minlbfgsstate* state, /* Real */ ae_matrix* p, ae_bool isupper, ae_state *_state) { ae_int_t i; double mx; ae_assert(isfinitertrmatrix(p, state->n, isupper, _state), "MinLBFGSSetPrecCholesky: P contains infinite or NAN values!", _state); mx = (double)(0); for(i=0; i<=state->n-1; i++) { mx = ae_maxreal(mx, ae_fabs(p->ptr.pp_double[i][i], _state), _state); } ae_assert(ae_fp_greater(mx,(double)(0)), "MinLBFGSSetPrecCholesky: P is strictly singular!", _state); if( state->denseh.rowsn||state->denseh.colsn ) { ae_matrix_set_length(&state->denseh, state->n, state->n, _state); } state->prectype = 1; if( isupper ) { rmatrixcopy(state->n, state->n, p, 0, 0, &state->denseh, 0, 0, _state); } else { rmatrixtranspose(state->n, state->n, p, 0, 0, &state->denseh, 0, 0, _state); } } /************************************************************************* Modification of the preconditioner: diagonal of approximate Hessian is used. INPUT PARAMETERS: State - structure which stores algorithm state D - diagonal of the approximate Hessian, array[0..N-1], (if larger, only leading N elements are used). NOTE: you can change preconditioner "on the fly", during algorithm iterations. NOTE 2: D[i] should be positive. Exception will be thrown otherwise. NOTE 3: you should pass diagonal of approximate Hessian - NOT ITS INVERSE. -- ALGLIB -- Copyright 13.10.2010 by Bochkanov Sergey *************************************************************************/ void minlbfgssetprecdiag(minlbfgsstate* state, /* Real */ ae_vector* d, ae_state *_state) { ae_int_t i; ae_assert(d->cnt>=state->n, "MinLBFGSSetPrecDiag: D is too short", _state); for(i=0; i<=state->n-1; i++) { ae_assert(ae_isfinite(d->ptr.p_double[i], _state), "MinLBFGSSetPrecDiag: D contains infinite or NAN elements", _state); ae_assert(ae_fp_greater(d->ptr.p_double[i],(double)(0)), "MinLBFGSSetPrecDiag: D contains non-positive elements", _state); } rvectorsetlengthatleast(&state->diagh, state->n, _state); state->prectype = 2; for(i=0; i<=state->n-1; i++) { state->diagh.ptr.p_double[i] = d->ptr.p_double[i]; } } /************************************************************************* Modification of the preconditioner: scale-based diagonal preconditioning. This preconditioning mode can be useful when you don't have approximate diagonal of Hessian, but you know that your variables are badly scaled (for example, one variable is in [1,10], and another in [1000,100000]), and most part of the ill-conditioning comes from different scales of vars. In this case simple scale-based preconditioner, with H[i] = 1/(s[i]^2), can greatly improve convergence. IMPRTANT: you should set scale of your variables with MinLBFGSSetScale() call (before or after MinLBFGSSetPrecScale() call). Without knowledge of the scale of your variables scale-based preconditioner will be just unit matrix. 
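For example, with S=[1,1000] this mode corresponds to the diagonal
preconditioner H=diag(1, 1.0E-6), i.e. H[i]=1/(S[i]^2).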
INPUT PARAMETERS: State - structure which stores algorithm state -- ALGLIB -- Copyright 13.10.2010 by Bochkanov Sergey *************************************************************************/ void minlbfgssetprecscale(minlbfgsstate* state, ae_state *_state) { state->prectype = 3; } /************************************************************************* This function sets low-rank preconditioner for Hessian matrix H=D+W'*C*W, where: * H is a Hessian matrix, which is approximated by D/W/C * D is a NxN diagonal positive definite matrix * W is a KxN low-rank correction * C is a KxK positive definite diagonal factor of low-rank correction This preconditioner is inexact but fast - it requires O(N*K) time to be applied. Preconditioner P is calculated by artificially constructing a set of BFGS updates which tries to reproduce behavior of H: * Sk = Wk (k-th row of W) * Yk = (D+Wk'*Ck*Wk)*Sk * Yk/Sk are reordered by ascending of C[k]*norm(Wk)^2 Here we assume that rows of Wk are orthogonal or nearly orthogonal, which allows us to have O(N*K+K^2) update instead of O(N*K^2) one. Reordering of updates is essential for having good performance on non-orthogonal problems (updates which do not add much of curvature are added first, and updates which add very large eigenvalues are added last and override effect of the first updates). In practice, this preconditioner is perfect when ortogonal correction is applied; on non-orthogonal problems sometimes it allows to achieve 5x speedup (when compared to non-preconditioned solver). -- ALGLIB -- Copyright 13.10.2010 by Bochkanov Sergey *************************************************************************/ void minlbfgssetprecrankklbfgsfast(minlbfgsstate* state, /* Real */ ae_vector* d, /* Real */ ae_vector* c, /* Real */ ae_matrix* w, ae_int_t cnt, ae_state *_state) { ae_int_t i; ae_int_t j; ae_int_t n; n = state->n; state->prectype = 4; state->preck = cnt; rvectorsetlengthatleast(&state->precc, cnt, _state); rvectorsetlengthatleast(&state->precd, n, _state); rmatrixsetlengthatleast(&state->precw, cnt, n, _state); for(i=0; i<=n-1; i++) { state->precd.ptr.p_double[i] = d->ptr.p_double[i]; } for(i=0; i<=cnt-1; i++) { state->precc.ptr.p_double[i] = c->ptr.p_double[i]; for(j=0; j<=n-1; j++) { state->precw.ptr.pp_double[i][j] = w->ptr.pp_double[i][j]; } } } /************************************************************************* This function sets exact low-rank preconditioner for Hessian matrix H=D+W'*C*W, where: * H is a Hessian matrix, which is approximated by D/W/C * D is a NxN diagonal positive definite matrix * W is a KxN low-rank correction * C is a KxK semidefinite diagonal factor of low-rank correction This preconditioner is exact but slow - it requires O(N*K^2) time to be built and O(N*K) time to be applied. Woodbury matrix identity is used to build inverse matrix. -- ALGLIB -- Copyright 13.10.2010 by Bochkanov Sergey *************************************************************************/ void minlbfgssetpreclowrankexact(minlbfgsstate* state, /* Real */ ae_vector* d, /* Real */ ae_vector* c, /* Real */ ae_matrix* w, ae_int_t cnt, ae_state *_state) { state->prectype = 5; preparelowrankpreconditioner(d, c, w, state->n, cnt, &state->lowrankbuf, _state); } /************************************************************************* NOTES: 1. 
This function has two different implementations: one which uses exact (analytical) user-supplied gradient, and one which uses function value only and numerically differentiates function in order to obtain gradient. Depending on the specific function used to create optimizer object (either MinLBFGSCreate() for analytical gradient or MinLBFGSCreateF() for numerical differentiation) you should choose appropriate variant of MinLBFGSOptimize() - one which accepts function AND gradient or one which accepts function ONLY. Be careful to choose variant of MinLBFGSOptimize() which corresponds to your optimization scheme! Table below lists different combinations of callback (function/gradient) passed to MinLBFGSOptimize() and specific function used to create optimizer. | USER PASSED TO MinLBFGSOptimize() CREATED WITH | function only | function and gradient ------------------------------------------------------------ MinLBFGSCreateF() | work FAIL MinLBFGSCreate() | FAIL work Here "FAIL" denotes inappropriate combinations of optimizer creation function and MinLBFGSOptimize() version. Attemps to use such combination (for example, to create optimizer with MinLBFGSCreateF() and to pass gradient information to MinCGOptimize()) will lead to exception being thrown. Either you did not pass gradient when it WAS needed or you passed gradient when it was NOT needed. -- ALGLIB -- Copyright 20.03.2009 by Bochkanov Sergey *************************************************************************/ ae_bool minlbfgsiteration(minlbfgsstate* state, ae_state *_state) { ae_int_t n; ae_int_t m; ae_int_t i; ae_int_t j; ae_int_t ic; ae_int_t mcinfo; double v; double vv; ae_bool result; /* * Reverse communication preparations * I know it looks ugly, but it works the same way * anywhere from C++ to Python. 
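*
* A typical driver loop (informal sketch, assuming the analytic-gradient
* mode) looks like:
*     while( minlbfgsiteration(state, _state) )
*     {
*         if( state->needfg ) { evaluate F and G at state->x; }
*         if( state->xupdated ) { report progress; }
*     }
* followed by a call to MinLBFGSResults() to extract the solution.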
* * This code initializes locals by: * * random values determined during code * generation - on first subroutine call * * values from previous call - on subsequent calls */ if( state->rstate.stage>=0 ) { n = state->rstate.ia.ptr.p_int[0]; m = state->rstate.ia.ptr.p_int[1]; i = state->rstate.ia.ptr.p_int[2]; j = state->rstate.ia.ptr.p_int[3]; ic = state->rstate.ia.ptr.p_int[4]; mcinfo = state->rstate.ia.ptr.p_int[5]; v = state->rstate.ra.ptr.p_double[0]; vv = state->rstate.ra.ptr.p_double[1]; } else { n = 359; m = -58; i = -919; j = -909; ic = 81; mcinfo = 255; v = 74; vv = -788; } if( state->rstate.stage==0 ) { goto lbl_0; } if( state->rstate.stage==1 ) { goto lbl_1; } if( state->rstate.stage==2 ) { goto lbl_2; } if( state->rstate.stage==3 ) { goto lbl_3; } if( state->rstate.stage==4 ) { goto lbl_4; } if( state->rstate.stage==5 ) { goto lbl_5; } if( state->rstate.stage==6 ) { goto lbl_6; } if( state->rstate.stage==7 ) { goto lbl_7; } if( state->rstate.stage==8 ) { goto lbl_8; } if( state->rstate.stage==9 ) { goto lbl_9; } if( state->rstate.stage==10 ) { goto lbl_10; } if( state->rstate.stage==11 ) { goto lbl_11; } if( state->rstate.stage==12 ) { goto lbl_12; } if( state->rstate.stage==13 ) { goto lbl_13; } if( state->rstate.stage==14 ) { goto lbl_14; } if( state->rstate.stage==15 ) { goto lbl_15; } if( state->rstate.stage==16 ) { goto lbl_16; } /* * Routine body */ /* * Unload frequently used variables from State structure * (just for typing convinience) */ n = state->n; m = state->m; state->userterminationneeded = ae_false; state->repterminationtype = 0; state->repiterationscount = 0; state->repvaridx = -1; state->repnfev = 0; /* * Check, that transferred derivative value is right */ minlbfgs_clearrequestfields(state, _state); if( !(ae_fp_eq(state->diffstep,(double)(0))&&ae_fp_greater(state->teststep,(double)(0))) ) { goto lbl_17; } state->needfg = ae_true; i = 0; lbl_19: if( i>n-1 ) { goto lbl_21; } v = state->x.ptr.p_double[i]; state->x.ptr.p_double[i] = v-state->teststep*state->s.ptr.p_double[i]; state->rstate.stage = 0; goto lbl_rcomm; lbl_0: state->fm1 = state->f; state->fp1 = state->g.ptr.p_double[i]; state->x.ptr.p_double[i] = v+state->teststep*state->s.ptr.p_double[i]; state->rstate.stage = 1; goto lbl_rcomm; lbl_1: state->fm2 = state->f; state->fp2 = state->g.ptr.p_double[i]; state->x.ptr.p_double[i] = v; state->rstate.stage = 2; goto lbl_rcomm; lbl_2: /* * 2*State.TestStep - scale parameter * width of segment [Xi-TestStep;Xi+TestStep] */ if( !derivativecheck(state->fm1, state->fp1, state->fm2, state->fp2, state->f, state->g.ptr.p_double[i], 2*state->teststep, _state) ) { state->repvaridx = i; state->repterminationtype = -7; result = ae_false; return result; } i = i+1; goto lbl_19; lbl_21: state->needfg = ae_false; lbl_17: /* * Calculate F/G at the initial point */ minlbfgs_clearrequestfields(state, _state); if( ae_fp_neq(state->diffstep,(double)(0)) ) { goto lbl_22; } state->needfg = ae_true; state->rstate.stage = 3; goto lbl_rcomm; lbl_3: state->needfg = ae_false; goto lbl_23; lbl_22: state->needf = ae_true; state->rstate.stage = 4; goto lbl_rcomm; lbl_4: state->fbase = state->f; i = 0; lbl_24: if( i>n-1 ) { goto lbl_26; } v = state->x.ptr.p_double[i]; state->x.ptr.p_double[i] = v-state->diffstep*state->s.ptr.p_double[i]; state->rstate.stage = 5; goto lbl_rcomm; lbl_5: state->fm2 = state->f; state->x.ptr.p_double[i] = v-0.5*state->diffstep*state->s.ptr.p_double[i]; state->rstate.stage = 6; goto lbl_rcomm; lbl_6: state->fm1 = state->f; state->x.ptr.p_double[i] = 
v+0.5*state->diffstep*state->s.ptr.p_double[i]; state->rstate.stage = 7; goto lbl_rcomm; lbl_7: state->fp1 = state->f; state->x.ptr.p_double[i] = v+state->diffstep*state->s.ptr.p_double[i]; state->rstate.stage = 8; goto lbl_rcomm; lbl_8: state->fp2 = state->f; state->x.ptr.p_double[i] = v; state->g.ptr.p_double[i] = (8*(state->fp1-state->fm1)-(state->fp2-state->fm2))/(6*state->diffstep*state->s.ptr.p_double[i]); i = i+1; goto lbl_24; lbl_26: state->f = state->fbase; state->needf = ae_false; lbl_23: trimprepare(state->f, &state->trimthreshold, _state); if( !state->xrep ) { goto lbl_27; } minlbfgs_clearrequestfields(state, _state); state->xupdated = ae_true; state->rstate.stage = 9; goto lbl_rcomm; lbl_9: state->xupdated = ae_false; lbl_27: if( state->userterminationneeded ) { /* * User requested termination */ state->repterminationtype = 8; result = ae_false; return result; } state->repnfev = 1; state->fold = state->f; v = (double)(0); for(i=0; i<=n-1; i++) { v = v+ae_sqr(state->g.ptr.p_double[i]*state->s.ptr.p_double[i], _state); } if( ae_fp_less_eq(ae_sqrt(v, _state),state->epsg) ) { state->repterminationtype = 4; result = ae_false; return result; } /* * Choose initial step and direction. * Apply preconditioner, if we have something other than default. */ ae_v_moveneg(&state->d.ptr.p_double[0], 1, &state->g.ptr.p_double[0], 1, ae_v_len(0,n-1)); if( state->prectype==0 ) { /* * Default preconditioner is used, but we can't use it before iterations will start */ v = ae_v_dotproduct(&state->g.ptr.p_double[0], 1, &state->g.ptr.p_double[0], 1, ae_v_len(0,n-1)); v = ae_sqrt(v, _state); if( ae_fp_eq(state->stpmax,(double)(0)) ) { state->stp = ae_minreal(1.0/v, (double)(1), _state); } else { state->stp = ae_minreal(1.0/v, state->stpmax, _state); } } if( state->prectype==1 ) { /* * Cholesky preconditioner is used */ fblscholeskysolve(&state->denseh, 1.0, n, ae_true, &state->d, &state->autobuf, _state); state->stp = (double)(1); } if( state->prectype==2 ) { /* * diagonal approximation is used */ for(i=0; i<=n-1; i++) { state->d.ptr.p_double[i] = state->d.ptr.p_double[i]/state->diagh.ptr.p_double[i]; } state->stp = (double)(1); } if( state->prectype==3 ) { /* * scale-based preconditioner is used */ for(i=0; i<=n-1; i++) { state->d.ptr.p_double[i] = state->d.ptr.p_double[i]*state->s.ptr.p_double[i]*state->s.ptr.p_double[i]; } state->stp = (double)(1); } if( state->prectype==4 ) { /* * rank-k BFGS-based preconditioner is used */ inexactlbfgspreconditioner(&state->d, n, &state->precd, &state->precc, &state->precw, state->preck, &state->precbuf, _state); state->stp = (double)(1); } if( state->prectype==5 ) { /* * exact low-rank preconditioner is used */ applylowrankpreconditioner(&state->d, &state->lowrankbuf, _state); state->stp = (double)(1); } /* * Main cycle */ state->k = 0; lbl_29: if( ae_false ) { goto lbl_30; } /* * Main cycle: prepare to 1-D line search */ state->p = state->k%m; state->q = ae_minint(state->k, m-1, _state); /* * Store X[k], G[k] */ ae_v_move(&state->xp.ptr.p_double[0], 1, &state->x.ptr.p_double[0], 1, ae_v_len(0,n-1)); ae_v_moveneg(&state->sk.ptr.pp_double[state->p][0], 1, &state->x.ptr.p_double[0], 1, ae_v_len(0,n-1)); ae_v_moveneg(&state->yk.ptr.pp_double[state->p][0], 1, &state->g.ptr.p_double[0], 1, ae_v_len(0,n-1)); /* * Minimize F(x+alpha*d) * Calculate S[k], Y[k] */ state->mcstage = 0; if( state->k!=0 ) { state->stp = 1.0; } linminnormalized(&state->d, &state->stp, n, _state); mcsrch(n, &state->x, &state->f, &state->g, &state->d, &state->stp, state->stpmax, minlbfgs_gtol, 
&mcinfo, &state->nfev, &state->work, &state->lstate, &state->mcstage, _state); lbl_31: if( state->mcstage==0 ) { goto lbl_32; } minlbfgs_clearrequestfields(state, _state); if( ae_fp_neq(state->diffstep,(double)(0)) ) { goto lbl_33; } state->needfg = ae_true; state->rstate.stage = 10; goto lbl_rcomm; lbl_10: state->needfg = ae_false; goto lbl_34; lbl_33: state->needf = ae_true; state->rstate.stage = 11; goto lbl_rcomm; lbl_11: state->fbase = state->f; i = 0; lbl_35: if( i>n-1 ) { goto lbl_37; } v = state->x.ptr.p_double[i]; state->x.ptr.p_double[i] = v-state->diffstep*state->s.ptr.p_double[i]; state->rstate.stage = 12; goto lbl_rcomm; lbl_12: state->fm2 = state->f; state->x.ptr.p_double[i] = v-0.5*state->diffstep*state->s.ptr.p_double[i]; state->rstate.stage = 13; goto lbl_rcomm; lbl_13: state->fm1 = state->f; state->x.ptr.p_double[i] = v+0.5*state->diffstep*state->s.ptr.p_double[i]; state->rstate.stage = 14; goto lbl_rcomm; lbl_14: state->fp1 = state->f; state->x.ptr.p_double[i] = v+state->diffstep*state->s.ptr.p_double[i]; state->rstate.stage = 15; goto lbl_rcomm; lbl_15: state->fp2 = state->f; state->x.ptr.p_double[i] = v; state->g.ptr.p_double[i] = (8*(state->fp1-state->fm1)-(state->fp2-state->fm2))/(6*state->diffstep*state->s.ptr.p_double[i]); i = i+1; goto lbl_35; lbl_37: state->f = state->fbase; state->needf = ae_false; lbl_34: trimfunction(&state->f, &state->g, n, state->trimthreshold, _state); mcsrch(n, &state->x, &state->f, &state->g, &state->d, &state->stp, state->stpmax, minlbfgs_gtol, &mcinfo, &state->nfev, &state->work, &state->lstate, &state->mcstage, _state); goto lbl_31; lbl_32: if( state->userterminationneeded ) { /* * User requested termination. * Restore previous point and return. */ ae_v_move(&state->x.ptr.p_double[0], 1, &state->xp.ptr.p_double[0], 1, ae_v_len(0,n-1)); state->repterminationtype = 8; result = ae_false; return result; } if( !state->xrep ) { goto lbl_38; } /* * report */ minlbfgs_clearrequestfields(state, _state); state->xupdated = ae_true; state->rstate.stage = 16; goto lbl_rcomm; lbl_16: state->xupdated = ae_false; lbl_38: state->repnfev = state->repnfev+state->nfev; state->repiterationscount = state->repiterationscount+1; ae_v_add(&state->sk.ptr.pp_double[state->p][0], 1, &state->x.ptr.p_double[0], 1, ae_v_len(0,n-1)); ae_v_add(&state->yk.ptr.pp_double[state->p][0], 1, &state->g.ptr.p_double[0], 1, ae_v_len(0,n-1)); /* * Stopping conditions */ v = (double)(0); for(i=0; i<=n-1; i++) { v = v+ae_sqr(state->g.ptr.p_double[i]*state->s.ptr.p_double[i], _state); } if( !ae_isfinite(v, _state)||!ae_isfinite(state->f, _state) ) { /* * Abnormal termination - infinities in function/gradient */ state->repterminationtype = -8; result = ae_false; return result; } if( state->repiterationscount>=state->maxits&&state->maxits>0 ) { /* * Too many iterations */ state->repterminationtype = 5; result = ae_false; return result; } if( ae_fp_less_eq(ae_sqrt(v, _state),state->epsg) ) { /* * Gradient is small enough */ state->repterminationtype = 4; result = ae_false; return result; } if( ae_fp_less_eq(state->fold-state->f,state->epsf*ae_maxreal(ae_fabs(state->fold, _state), ae_maxreal(ae_fabs(state->f, _state), 1.0, _state), _state)) ) { /* * F(k+1)-F(k) is small enough */ state->repterminationtype = 1; result = ae_false; return result; } v = (double)(0); for(i=0; i<=n-1; i++) { v = v+ae_sqr(state->sk.ptr.pp_double[state->p][i]/state->s.ptr.p_double[i], _state); } if( ae_fp_less_eq(ae_sqrt(v, _state),state->epsx) ) { /* * X(k+1)-X(k) is small enough */ state->repterminationtype 
= 2; result = ae_false; return result; } /* * If Wolfe conditions are satisfied, we can update * limited memory model. * * However, if conditions are not satisfied (NFEV limit is met, * function is too wild, ...), we'll skip L-BFGS update */ if( mcinfo!=1 ) { /* * Skip update. * * In such cases we'll initialize search direction by * antigradient vector, because it leads to more * transparent code with less number of special cases */ state->fold = state->f; ae_v_moveneg(&state->d.ptr.p_double[0], 1, &state->g.ptr.p_double[0], 1, ae_v_len(0,n-1)); } else { /* * Calculate Rho[k], GammaK */ v = ae_v_dotproduct(&state->yk.ptr.pp_double[state->p][0], 1, &state->sk.ptr.pp_double[state->p][0], 1, ae_v_len(0,n-1)); vv = ae_v_dotproduct(&state->yk.ptr.pp_double[state->p][0], 1, &state->yk.ptr.pp_double[state->p][0], 1, ae_v_len(0,n-1)); if( ae_fp_eq(v,(double)(0))||ae_fp_eq(vv,(double)(0)) ) { /* * Rounding errors make further iterations impossible. */ state->repterminationtype = -2; result = ae_false; return result; } state->rho.ptr.p_double[state->p] = 1/v; state->gammak = v/vv; /* * Calculate d(k+1) = -H(k+1)*g(k+1) * * for I:=K downto K-Q do * V = s(i)^T * work(iteration:I) * theta(i) = V * work(iteration:I+1) = work(iteration:I) - V*Rho(i)*y(i) * work(last iteration) = H0*work(last iteration) - preconditioner * for I:=K-Q to K do * V = y(i)^T*work(iteration:I) * work(iteration:I+1) = work(iteration:I) +(-V+theta(i))*Rho(i)*s(i) * * NOW WORK CONTAINS d(k+1) */ ae_v_move(&state->work.ptr.p_double[0], 1, &state->g.ptr.p_double[0], 1, ae_v_len(0,n-1)); for(i=state->k; i>=state->k-state->q; i--) { ic = i%m; v = ae_v_dotproduct(&state->sk.ptr.pp_double[ic][0], 1, &state->work.ptr.p_double[0], 1, ae_v_len(0,n-1)); state->theta.ptr.p_double[ic] = v; vv = v*state->rho.ptr.p_double[ic]; ae_v_subd(&state->work.ptr.p_double[0], 1, &state->yk.ptr.pp_double[ic][0], 1, ae_v_len(0,n-1), vv); } if( state->prectype==0 ) { /* * Simple preconditioner is used */ v = state->gammak; ae_v_muld(&state->work.ptr.p_double[0], 1, ae_v_len(0,n-1), v); } if( state->prectype==1 ) { /* * Cholesky preconditioner is used */ fblscholeskysolve(&state->denseh, (double)(1), n, ae_true, &state->work, &state->autobuf, _state); } if( state->prectype==2 ) { /* * diagonal approximation is used */ for(i=0; i<=n-1; i++) { state->work.ptr.p_double[i] = state->work.ptr.p_double[i]/state->diagh.ptr.p_double[i]; } } if( state->prectype==3 ) { /* * scale-based preconditioner is used */ for(i=0; i<=n-1; i++) { state->work.ptr.p_double[i] = state->work.ptr.p_double[i]*state->s.ptr.p_double[i]*state->s.ptr.p_double[i]; } } if( state->prectype==4 ) { /* * Rank-K BFGS-based preconditioner is used */ inexactlbfgspreconditioner(&state->work, n, &state->precd, &state->precc, &state->precw, state->preck, &state->precbuf, _state); } if( state->prectype==5 ) { /* * Exact low-rank preconditioner is used */ applylowrankpreconditioner(&state->work, &state->lowrankbuf, _state); } for(i=state->k-state->q; i<=state->k; i++) { ic = i%m; v = ae_v_dotproduct(&state->yk.ptr.pp_double[ic][0], 1, &state->work.ptr.p_double[0], 1, ae_v_len(0,n-1)); vv = state->rho.ptr.p_double[ic]*(-v+state->theta.ptr.p_double[ic]); ae_v_addd(&state->work.ptr.p_double[0], 1, &state->sk.ptr.pp_double[ic][0], 1, ae_v_len(0,n-1), vv); } ae_v_moveneg(&state->d.ptr.p_double[0], 1, &state->work.ptr.p_double[0], 1, ae_v_len(0,n-1)); /* * Next step */ state->fold = state->f; state->k = state->k+1; } goto lbl_29; lbl_30: result = ae_false; return result; /* * Saving state */ lbl_rcomm: result 
= ae_true; state->rstate.ia.ptr.p_int[0] = n; state->rstate.ia.ptr.p_int[1] = m; state->rstate.ia.ptr.p_int[2] = i; state->rstate.ia.ptr.p_int[3] = j; state->rstate.ia.ptr.p_int[4] = ic; state->rstate.ia.ptr.p_int[5] = mcinfo; state->rstate.ra.ptr.p_double[0] = v; state->rstate.ra.ptr.p_double[1] = vv; return result; } /************************************************************************* L-BFGS algorithm results INPUT PARAMETERS: State - algorithm state OUTPUT PARAMETERS: X - array[0..N-1], solution Rep - optimization report: * Rep.TerminationType completion code: * -8 internal integrity control detected infinite or NAN values in function/gradient. Abnormal termination signalled. * -7 gradient verification failed. See MinLBFGSSetGradientCheck() for more information. * -2 rounding errors prevent further improvement. X contains best point found. * -1 incorrect parameters were specified * 1 relative function improvement is no more than EpsF. * 2 relative step is no more than EpsX. * 4 gradient norm is no more than EpsG * 5 MaxIts steps were taken * 7 stopping conditions are too stringent, further improvement is impossible * 8 terminated by user who called minlbfgsrequesttermination(). X contains point which was "current accepted" when termination request was submitted. * Rep.IterationsCount contains iterations count * NFEV contains number of function calculations -- ALGLIB -- Copyright 02.04.2010 by Bochkanov Sergey *************************************************************************/ void minlbfgsresults(minlbfgsstate* state, /* Real */ ae_vector* x, minlbfgsreport* rep, ae_state *_state) { ae_vector_clear(x); _minlbfgsreport_clear(rep); minlbfgsresultsbuf(state, x, rep, _state); } /************************************************************************* L-BFGS algorithm results Buffered implementation of MinLBFGSResults which uses pre-allocated buffer to store X[]. If buffer size is too small, it resizes buffer. It is intended to be used in the inner cycles of performance critical algorithms where array reallocation penalty is too large to be ignored. -- ALGLIB -- Copyright 20.08.2010 by Bochkanov Sergey *************************************************************************/ void minlbfgsresultsbuf(minlbfgsstate* state, /* Real */ ae_vector* x, minlbfgsreport* rep, ae_state *_state) { if( x->cnt<state->n ) { ae_vector_set_length(x, state->n, _state); } ae_v_move(&x->ptr.p_double[0], 1, &state->x.ptr.p_double[0], 1, ae_v_len(0,state->n-1)); rep->iterationscount = state->repiterationscount; rep->nfev = state->repnfev; rep->varidx = state->repvaridx; rep->terminationtype = state->repterminationtype; } /************************************************************************* This subroutine restarts LBFGS algorithm from new point. All optimization parameters are left unchanged. This function allows solving multiple optimization problems (which must have same number of dimensions) without object reallocation penalty. INPUT PARAMETERS: State - structure used to store algorithm state X - new starting point.
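
NOTE (illustrative sketch, not part of the official interface description):
    a typical reuse pattern is to drive the optimizer to convergence, read
    the solution with minlbfgsresults()/minlbfgsresultsbuf(), and then
    restart on a second problem of the same dimensionality, e.g.

        // x2 is a hypothetical new starting point of length State.N
        minlbfgsrestartfrom(&state, &x2, &_state);
        while( minlbfgsiteration(&state, &_state) )
        {
            // serve state.needf / state.needfg requests as before
        }
        minlbfgsresultsbuf(&state, &x2, &rep, &_state);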
-- ALGLIB -- Copyright 30.07.2010 by Bochkanov Sergey *************************************************************************/ void minlbfgsrestartfrom(minlbfgsstate* state, /* Real */ ae_vector* x, ae_state *_state) { ae_assert(x->cnt>=state->n, "MinLBFGSRestartFrom: Length(X)<N!", _state); ae_assert(isfinitevector(x, state->n, _state), "MinLBFGSRestartFrom: X contains infinite or NaN values!", _state); ae_v_move(&state->x.ptr.p_double[0], 1, &x->ptr.p_double[0], 1, ae_v_len(0,state->n-1)); ae_vector_set_length(&state->rstate.ia, 5+1, _state); ae_vector_set_length(&state->rstate.ra, 1+1, _state); state->rstate.stage = -1; minlbfgs_clearrequestfields(state, _state); } /************************************************************************* This subroutine submits request for termination of running optimizer. It should be called from user-supplied callback when user decides that it is time to "smoothly" terminate optimization process. As a result, optimizer stops at point which was "current accepted" when termination request was submitted and returns error code 8 (successful termination). INPUT PARAMETERS: State - optimizer structure NOTE: after request for termination optimizer may perform several additional calls to user-supplied callbacks. It does NOT guarantee to stop immediately - it just guarantees that these additional calls will be discarded later. NOTE: calling this function on optimizer which is NOT running will have no effect. NOTE: multiple calls to this function are possible. First call is counted, subsequent calls are silently ignored. -- ALGLIB -- Copyright 08.10.2014 by Bochkanov Sergey *************************************************************************/ void minlbfgsrequesttermination(minlbfgsstate* state, ae_state *_state) { state->userterminationneeded = ae_true; } /************************************************************************* This subroutine turns on verification of the user-supplied analytic gradient: * user calls this subroutine before optimization begins * MinLBFGSOptimize() is called * prior to actual optimization, for each component of parameters being optimized X[i] algorithm performs following steps: * two trial steps are made to X[i]-TestStep*S[i] and X[i]+TestStep*S[i], where X[i] is i-th component of the initial point and S[i] is a scale of i-th parameter * if needed, steps are bounded with respect to constraints on X[] * F(X) is evaluated at these trial points * we perform one more evaluation in the middle point of the interval * we build cubic model using function values and derivatives at trial points and we compare its prediction with actual value in the middle point * in case difference between prediction and actual value is higher than some predetermined threshold, algorithm stops with completion code -7; Rep.VarIdx is set to index of the parameter with incorrect derivative. * after verification is over, algorithm proceeds to the actual optimization. NOTE 1: verification needs N (parameters count) gradient evaluations. It is very costly and you should use it only for low dimensional problems, when you want to be sure that you've correctly calculated analytic derivatives. You should not use it in the production code (unless you want to check derivatives provided by some third party). NOTE 2: you should carefully choose TestStep. Value which is too large (so large that function behaviour is significantly non-cubic) will lead to false alarms. You may use different step for different parameters by means of setting scale with MinLBFGSSetScale().
NOTE 3: this function may lead to false positives. In case it reports that I-th derivative was calculated incorrectly, you may decrease test step and try one more time - maybe your function changes too sharply and your step is too large for such rapidly chanding function. INPUT PARAMETERS: State - structure used to store algorithm state TestStep - verification step: * TestStep=0 turns verification off * TestStep>0 activates verification -- ALGLIB -- Copyright 24.05.2012 by Bochkanov Sergey *************************************************************************/ void minlbfgssetgradientcheck(minlbfgsstate* state, double teststep, ae_state *_state) { ae_assert(ae_isfinite(teststep, _state), "MinLBFGSSetGradientCheck: TestStep contains NaN or Infinite", _state); ae_assert(ae_fp_greater_eq(teststep,(double)(0)), "MinLBFGSSetGradientCheck: invalid argument TestStep(TestStep<0)", _state); state->teststep = teststep; } /************************************************************************* Clears request fileds (to be sure that we don't forgot to clear something) *************************************************************************/ static void minlbfgs_clearrequestfields(minlbfgsstate* state, ae_state *_state) { state->needf = ae_false; state->needfg = ae_false; state->xupdated = ae_false; } void _minlbfgsstate_init(void* _p, ae_state *_state) { minlbfgsstate *p = (minlbfgsstate*)_p; ae_touch_ptr((void*)p); ae_vector_init(&p->s, 0, DT_REAL, _state); ae_vector_init(&p->rho, 0, DT_REAL, _state); ae_matrix_init(&p->yk, 0, 0, DT_REAL, _state); ae_matrix_init(&p->sk, 0, 0, DT_REAL, _state); ae_vector_init(&p->xp, 0, DT_REAL, _state); ae_vector_init(&p->theta, 0, DT_REAL, _state); ae_vector_init(&p->d, 0, DT_REAL, _state); ae_vector_init(&p->work, 0, DT_REAL, _state); ae_matrix_init(&p->denseh, 0, 0, DT_REAL, _state); ae_vector_init(&p->diagh, 0, DT_REAL, _state); ae_vector_init(&p->precc, 0, DT_REAL, _state); ae_vector_init(&p->precd, 0, DT_REAL, _state); ae_matrix_init(&p->precw, 0, 0, DT_REAL, _state); _precbuflbfgs_init(&p->precbuf, _state); _precbuflowrank_init(&p->lowrankbuf, _state); ae_vector_init(&p->autobuf, 0, DT_REAL, _state); ae_vector_init(&p->x, 0, DT_REAL, _state); ae_vector_init(&p->g, 0, DT_REAL, _state); _rcommstate_init(&p->rstate, _state); _linminstate_init(&p->lstate, _state); } void _minlbfgsstate_init_copy(void* _dst, void* _src, ae_state *_state) { minlbfgsstate *dst = (minlbfgsstate*)_dst; minlbfgsstate *src = (minlbfgsstate*)_src; dst->n = src->n; dst->m = src->m; dst->epsg = src->epsg; dst->epsf = src->epsf; dst->epsx = src->epsx; dst->maxits = src->maxits; dst->xrep = src->xrep; dst->stpmax = src->stpmax; ae_vector_init_copy(&dst->s, &src->s, _state); dst->diffstep = src->diffstep; dst->nfev = src->nfev; dst->mcstage = src->mcstage; dst->k = src->k; dst->q = src->q; dst->p = src->p; ae_vector_init_copy(&dst->rho, &src->rho, _state); ae_matrix_init_copy(&dst->yk, &src->yk, _state); ae_matrix_init_copy(&dst->sk, &src->sk, _state); ae_vector_init_copy(&dst->xp, &src->xp, _state); ae_vector_init_copy(&dst->theta, &src->theta, _state); ae_vector_init_copy(&dst->d, &src->d, _state); dst->stp = src->stp; ae_vector_init_copy(&dst->work, &src->work, _state); dst->fold = src->fold; dst->trimthreshold = src->trimthreshold; dst->prectype = src->prectype; dst->gammak = src->gammak; ae_matrix_init_copy(&dst->denseh, &src->denseh, _state); ae_vector_init_copy(&dst->diagh, &src->diagh, _state); ae_vector_init_copy(&dst->precc, &src->precc, _state); 
ae_vector_init_copy(&dst->precd, &src->precd, _state); ae_matrix_init_copy(&dst->precw, &src->precw, _state); dst->preck = src->preck; _precbuflbfgs_init_copy(&dst->precbuf, &src->precbuf, _state); _precbuflowrank_init_copy(&dst->lowrankbuf, &src->lowrankbuf, _state); dst->fbase = src->fbase; dst->fm2 = src->fm2; dst->fm1 = src->fm1; dst->fp1 = src->fp1; dst->fp2 = src->fp2; ae_vector_init_copy(&dst->autobuf, &src->autobuf, _state); ae_vector_init_copy(&dst->x, &src->x, _state); dst->f = src->f; ae_vector_init_copy(&dst->g, &src->g, _state); dst->needf = src->needf; dst->needfg = src->needfg; dst->xupdated = src->xupdated; dst->userterminationneeded = src->userterminationneeded; dst->teststep = src->teststep; _rcommstate_init_copy(&dst->rstate, &src->rstate, _state); dst->repiterationscount = src->repiterationscount; dst->repnfev = src->repnfev; dst->repvaridx = src->repvaridx; dst->repterminationtype = src->repterminationtype; _linminstate_init_copy(&dst->lstate, &src->lstate, _state); } void _minlbfgsstate_clear(void* _p) { minlbfgsstate *p = (minlbfgsstate*)_p; ae_touch_ptr((void*)p); ae_vector_clear(&p->s); ae_vector_clear(&p->rho); ae_matrix_clear(&p->yk); ae_matrix_clear(&p->sk); ae_vector_clear(&p->xp); ae_vector_clear(&p->theta); ae_vector_clear(&p->d); ae_vector_clear(&p->work); ae_matrix_clear(&p->denseh); ae_vector_clear(&p->diagh); ae_vector_clear(&p->precc); ae_vector_clear(&p->precd); ae_matrix_clear(&p->precw); _precbuflbfgs_clear(&p->precbuf); _precbuflowrank_clear(&p->lowrankbuf); ae_vector_clear(&p->autobuf); ae_vector_clear(&p->x); ae_vector_clear(&p->g); _rcommstate_clear(&p->rstate); _linminstate_clear(&p->lstate); } void _minlbfgsstate_destroy(void* _p) { minlbfgsstate *p = (minlbfgsstate*)_p; ae_touch_ptr((void*)p); ae_vector_destroy(&p->s); ae_vector_destroy(&p->rho); ae_matrix_destroy(&p->yk); ae_matrix_destroy(&p->sk); ae_vector_destroy(&p->xp); ae_vector_destroy(&p->theta); ae_vector_destroy(&p->d); ae_vector_destroy(&p->work); ae_matrix_destroy(&p->denseh); ae_vector_destroy(&p->diagh); ae_vector_destroy(&p->precc); ae_vector_destroy(&p->precd); ae_matrix_destroy(&p->precw); _precbuflbfgs_destroy(&p->precbuf); _precbuflowrank_destroy(&p->lowrankbuf); ae_vector_destroy(&p->autobuf); ae_vector_destroy(&p->x); ae_vector_destroy(&p->g); _rcommstate_destroy(&p->rstate); _linminstate_destroy(&p->lstate); } void _minlbfgsreport_init(void* _p, ae_state *_state) { minlbfgsreport *p = (minlbfgsreport*)_p; ae_touch_ptr((void*)p); } void _minlbfgsreport_init_copy(void* _dst, void* _src, ae_state *_state) { minlbfgsreport *dst = (minlbfgsreport*)_dst; minlbfgsreport *src = (minlbfgsreport*)_src; dst->iterationscount = src->iterationscount; dst->nfev = src->nfev; dst->varidx = src->varidx; dst->terminationtype = src->terminationtype; } void _minlbfgsreport_clear(void* _p) { minlbfgsreport *p = (minlbfgsreport*)_p; ae_touch_ptr((void*)p); } void _minlbfgsreport_destroy(void* _p) { minlbfgsreport *p = (minlbfgsreport*)_p; ae_touch_ptr((void*)p); } /************************************************************************* This function initializes QPDENSEAULSettings structure with default settings. Newly created structure MUST be initialized by default settings - or by copy of the already initialized structure. 
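
A minimal sketch of the intended call order (the surrounding declarations,
initialization helper and error handling are assumptions made for
illustration only):

    qpdenseaulsettings settings;
    _qpdenseaulsettings_init(&settings, &_state);
    qpdenseaulloaddefaults(n, &settings, &_state);  // sets epsx=1.0E-6, outerits=5, rho=100.0
    // optionally tune settings.rho / settings.outerits before passing
    // the structure to qpdenseauloptimize()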
-- ALGLIB -- Copyright 14.05.2011 by Bochkanov Sergey *************************************************************************/ void qpdenseaulloaddefaults(ae_int_t nmain, qpdenseaulsettings* s, ae_state *_state) { s->epsx = 1.0E-6; s->outerits = 5; s->rho = 100.0; } /************************************************************************* This function runs Dense-AUL solver; it returns after optimization process was completed. Following QP problem is solved: min(0.5*(x-x_origin)'*A*(x-x_origin)+b'*(x-x_origin)) subject to combination of box and general linear dense/sparse constraints. INPUT PARAMETERS: DenseA - for dense problems (AKind=0), A-term of CQM object contains system matrix. Other terms are unspecified and should not be referenced. SparseA - for sparse problems (AKind=1), CRS format AKind - sparse matrix format: * 0 for dense matrix * 1 for sparse matrix SparseUpper - which triangle of SparseAC stores matrix - upper or lower one (for dense matrices this parameter is not actual). B - linear term, array[N] BndL - lower bound, array[N] BndU - upper bound, array[N] S - scale vector, array[NC]: * I-th element contains scale of I-th variable, * SC[I]>0 XOrigin - origin term, array[NC]. Can be zero. N - number of variables in the original formulation (no slack variables). CLEIC - dense linear equality/inequality constraints. Equality constraints come first. NEC, NIC - number of dense equality/inequality constraints. SCLEIC - sparse linear equality/inequality constraints. Equality constraints come first. SNEC, SNIC - number of sparse equality/inequality constraints. RenormLC - whether constraints should be renormalized (recommended) or used "as is". Settings - QPDENSEAULSettings object initialized by one of the initialization functions. State - object which stores temporaries XS - initial point, array[NC] OUTPUT PARAMETERS: XS - last point TerminationType-termination type: * * * -- ALGLIB -- Copyright 14.05.2011 by Bochkanov Sergey *************************************************************************/ void qpdenseauloptimize(convexquadraticmodel* a, sparsematrix* sparsea, ae_int_t akind, ae_bool sparseaupper, /* Real */ ae_vector* b, /* Real */ ae_vector* bndl, /* Real */ ae_vector* bndu, /* Real */ ae_vector* s, /* Real */ ae_vector* xorigin, ae_int_t nn, /* Real */ ae_matrix* cleic, ae_int_t dnec, ae_int_t dnic, sparsematrix* scleic, ae_int_t snec, ae_int_t snic, ae_bool renormlc, qpdenseaulsettings* settings, qpdenseaulbuffers* state, /* Real */ ae_vector* xs, ae_int_t* terminationtype, ae_state *_state) { ae_int_t i; ae_int_t j; ae_int_t j0; ae_int_t j1; ae_int_t k; double v; double vv; double rho; double epsx; ae_int_t outeridx; ae_int_t nmain; ae_int_t nslack; ae_int_t ntotal; ae_int_t ktotal; double maxrho; double feaserr; double feaserrprev; double requestedfeasdecrease; ae_int_t goodcounter; ae_int_t stagnationcounter; double anorm; double maxcac; *terminationtype = 0; nmain = nn; nslack = dnic+snic; ntotal = nmain+nslack; ktotal = dnec+dnic+snec+snic; rho = settings->rho; epsx = settings->epsx; requestedfeasdecrease = 0.33; maxrho = 1.0E12; if( ae_fp_less_eq(epsx,(double)(0)) ) { epsx = 1.0E-9; } /* * Integrity checks */ if( snec+snic>0 ) { ae_assert(scleic->matrixtype==1, "QPDENSEAULOptimize: unexpected sparse matrix format", _state); ae_assert(scleic->m==snec+snic, "QPDENSEAULOptimize: unexpected sparse matrix size", _state); ae_assert(scleic->n==nmain+1, "QPDENSEAULOptimize: unexpected sparse matrix size", _state); } /* * Prepare */ state->repinneriterationscount = 0; 
state->repouteriterationscount = 0; state->repncholesky = 0; state->repnmv = 0; *terminationtype = 0; rmatrixsetlengthatleast(&state->sclsfta, nmain, nmain, _state); rvectorsetlengthatleast(&state->sclsftb, nmain, _state); rvectorsetlengthatleast(&state->sclsftxc, nmain, _state); rvectorsetlengthatleast(&state->sclsftbndl, nmain, _state); rvectorsetlengthatleast(&state->sclsftbndu, nmain, _state); bvectorsetlengthatleast(&state->sclsfthasbndl, nmain, _state); bvectorsetlengthatleast(&state->sclsfthasbndu, nmain, _state); rmatrixsetlengthatleast(&state->sclsftcleic, ktotal, nmain+1, _state); rvectorsetlengthatleast(&state->nulc, ktotal, _state); rvectorsetlengthatleast(&state->nulcest, ktotal, _state); rmatrixsetlengthatleast(&state->exa, ntotal, ntotal, _state); rvectorsetlengthatleast(&state->exb, ntotal, _state); rvectorsetlengthatleast(&state->exxc, ntotal, _state); rvectorsetlengthatleast(&state->exxn, ntotal, _state); rvectorsetlengthatleast(&state->exxorigin, ntotal, _state); rvectorsetlengthatleast(&state->exbndl, ntotal, _state); rvectorsetlengthatleast(&state->exbndu, ntotal, _state); rvectorsetlengthatleast(&state->tmp0, ntotal, _state); /* * Prepare scaled/shifted model. */ ae_assert(akind==0||akind==1, "QPDENSEAULOptimize: unexpected AKind", _state); if( akind==0 ) { /* * Extract dense A and scale */ cqmgeta(a, &state->tmp2, _state); for(i=0; i<=nmain-1; i++) { for(j=0; j<=nmain-1; j++) { state->sclsfta.ptr.pp_double[i][j] = (double)(0); } } for(i=0; i<=nmain-1; i++) { for(j=i; j<=nmain-1; j++) { v = state->tmp2.ptr.pp_double[i][j]*s->ptr.p_double[i]*s->ptr.p_double[j]; state->sclsfta.ptr.pp_double[i][j] = v; state->sclsfta.ptr.pp_double[j][i] = v; } } } if( akind==1 ) { /* * Extract sparse A and scale */ ae_assert(sparsea->matrixtype==1, "QPDENSEAULOptimize: unexpected sparse matrix format", _state); ae_assert(sparsea->m==nmain, "QPDENSEAULOptimize: unexpected sparse matrix size", _state); ae_assert(sparsea->n==nmain, "QPDENSEAULOptimize: unexpected sparse matrix size", _state); for(i=0; i<=nmain-1; i++) { for(j=0; j<=nmain-1; j++) { state->sclsfta.ptr.pp_double[i][j] = (double)(0); } } if( sparseaupper ) { for(i=0; i<=nmain-1; i++) { if( sparsea->didx.ptr.p_int[i]!=sparsea->uidx.ptr.p_int[i] ) { state->sclsfta.ptr.pp_double[i][i] = sparsea->vals.ptr.p_double[sparsea->didx.ptr.p_int[i]]*s->ptr.p_double[i]*s->ptr.p_double[i]; } j0 = sparsea->uidx.ptr.p_int[i]; j1 = sparsea->ridx.ptr.p_int[i+1]-1; for(j=j0; j<=j1; j++) { k = sparsea->idx.ptr.p_int[j]; v = sparsea->vals.ptr.p_double[j]*s->ptr.p_double[i]*s->ptr.p_double[k]; state->sclsfta.ptr.pp_double[i][k] = v; state->sclsfta.ptr.pp_double[k][i] = v; } } } else { for(i=0; i<=nmain-1; i++) { if( sparsea->didx.ptr.p_int[i]!=sparsea->uidx.ptr.p_int[i] ) { state->sclsfta.ptr.pp_double[i][i] = sparsea->vals.ptr.p_double[sparsea->didx.ptr.p_int[i]]*s->ptr.p_double[i]*s->ptr.p_double[i]; } j0 = sparsea->ridx.ptr.p_int[i]; j1 = sparsea->didx.ptr.p_int[i]-1; for(j=j0; j<=j1; j++) { k = sparsea->idx.ptr.p_int[j]; v = sparsea->vals.ptr.p_double[j]*s->ptr.p_double[i]*s->ptr.p_double[k]; state->sclsfta.ptr.pp_double[i][k] = v; state->sclsfta.ptr.pp_double[k][i] = v; } } } } for(i=0; i<=nmain-1; i++) { state->sclsftb.ptr.p_double[i] = b->ptr.p_double[i]*s->ptr.p_double[i]; } for(i=0; i<=nmain-1; i++) { ae_assert(ae_isfinite(bndl->ptr.p_double[i], _state)||ae_isneginf(bndl->ptr.p_double[i], _state), "QPDENSEAULOptimize: integrity check failure (7)", _state); ae_assert(ae_isfinite(bndu->ptr.p_double[i], 
_state)||ae_isposinf(bndu->ptr.p_double[i], _state), "QPDENSEAULOptimize: integrity check failure (8)", _state); state->sclsftxc.ptr.p_double[i] = (xs->ptr.p_double[i]-xorigin->ptr.p_double[i])/s->ptr.p_double[i]; state->sclsfthasbndl.ptr.p_bool[i] = ae_isfinite(bndl->ptr.p_double[i], _state); state->sclsfthasbndu.ptr.p_bool[i] = ae_isfinite(bndu->ptr.p_double[i], _state); if( (state->sclsfthasbndl.ptr.p_bool[i]&&state->sclsfthasbndu.ptr.p_bool[i])&&ae_fp_eq(bndl->ptr.p_double[i],bndu->ptr.p_double[i]) ) { /* * Make sure that SclSftBndL[I]=SclSftBndU[I] bit-to-bit * even with CRAZY optimizing compiler. */ state->sclsftbndu.ptr.p_double[i] = (bndu->ptr.p_double[i]-xorigin->ptr.p_double[i])/s->ptr.p_double[i]; state->sclsftbndl.ptr.p_double[i] = state->sclsftbndu.ptr.p_double[i]; continue; } if( state->sclsfthasbndl.ptr.p_bool[i] ) { state->sclsftbndl.ptr.p_double[i] = (bndl->ptr.p_double[i]-xorigin->ptr.p_double[i])/s->ptr.p_double[i]; } if( state->sclsfthasbndu.ptr.p_bool[i] ) { state->sclsftbndu.ptr.p_double[i] = (bndu->ptr.p_double[i]-xorigin->ptr.p_double[i])/s->ptr.p_double[i]; } } for(i=0; i<=ktotal-1; i++) { for(j=0; j<=nmain; j++) { state->sclsftcleic.ptr.pp_double[i][j] = (double)(0); } } for(i=0; i<=dnec-1; i++) { for(j=0; j<=nmain-1; j++) { v = cleic->ptr.pp_double[i][j]*s->ptr.p_double[j]; state->sclsftcleic.ptr.pp_double[i][j] = v; } state->sclsftcleic.ptr.pp_double[i][nmain] = cleic->ptr.pp_double[i][nmain]; } for(i=0; i<=dnic-1; i++) { for(j=0; j<=nmain-1; j++) { v = cleic->ptr.pp_double[dnec+i][j]*s->ptr.p_double[j]; state->sclsftcleic.ptr.pp_double[dnec+snec+i][j] = v; } state->sclsftcleic.ptr.pp_double[dnec+snec+i][nmain] = cleic->ptr.pp_double[dnec+i][nmain]; } for(i=0; i<=snec-1; i++) { /* * Because constraints are sparse, everything is a bit tricky - * it is possible that N-th element of the row is zero and not * stored; it is also possible that entire row is empty. */ j0 = scleic->ridx.ptr.p_int[i]; j1 = scleic->ridx.ptr.p_int[i+1]-1; if( j1>=j0&&scleic->idx.ptr.p_int[j1]==nmain ) { state->sclsftcleic.ptr.pp_double[dnec+i][nmain] = scleic->vals.ptr.p_double[j1]; j1 = j1-1; } for(j=j0; j<=j1; j++) { k = scleic->idx.ptr.p_int[j]; v = scleic->vals.ptr.p_double[j]*s->ptr.p_double[k]; state->sclsftcleic.ptr.pp_double[dnec+i][k] = v; } } for(i=0; i<=snic-1; i++) { /* * Because constraints are sparse, everything is a bit tricky - * it is possible that N-th element of the row is zero and not * stored; it is also possible that entire row is empty. 
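 * To handle this we scan only the stored elements of CRS row SNEC+I,
 * i.e. positions [RIdx[SNEC+I], RIdx[SNEC+I+1]): if the last stored
 * element has column index NMain, it is the right-hand side and goes to
 * column NMain of SclSftCLEIC; every other stored element is scaled by
 * S[] and written to its column, while elements which are not stored
 * keep the zero initialization performed above.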
*/ j0 = scleic->ridx.ptr.p_int[snec+i]; j1 = scleic->ridx.ptr.p_int[snec+i+1]-1; if( j1>=j0&&scleic->idx.ptr.p_int[j1]==nmain ) { state->sclsftcleic.ptr.pp_double[dnec+snec+dnic+i][nmain] = scleic->vals.ptr.p_double[j1]; j1 = j1-1; } for(j=j0; j<=j1; j++) { k = scleic->idx.ptr.p_int[j]; v = scleic->vals.ptr.p_double[j]*s->ptr.p_double[k]; state->sclsftcleic.ptr.pp_double[dnec+snec+dnic+i][k] = v; } } if( renormlc&&ktotal>0 ) { /* * Normalize linear constraints in such way that they have unit norm * (after variable scaling) */ for(i=0; i<=ktotal-1; i++) { vv = 0.0; for(j=0; j<=nmain-1; j++) { v = state->sclsftcleic.ptr.pp_double[i][j]; vv = vv+v*v; } vv = ae_sqrt(vv, _state); if( ae_fp_greater(vv,(double)(0)) ) { vv = 1/vv; for(j=0; j<=nmain; j++) { state->sclsftcleic.ptr.pp_double[i][j] = state->sclsftcleic.ptr.pp_double[i][j]*vv; } } } } for(i=0; i<=ktotal-1; i++) { /* * Apply XOrigin */ v = 0.0; for(j=0; j<=nmain-1; j++) { v = v+state->sclsftcleic.ptr.pp_double[i][j]*(xorigin->ptr.p_double[j]/s->ptr.p_double[j]); } state->sclsftcleic.ptr.pp_double[i][nmain] = state->sclsftcleic.ptr.pp_double[i][nmain]-v; } /* * Normalize model in such way that norm(A)~1 (very roughly) * * We have two lower bounds for sigma_max(A): * * first estimate is provided by Frobenius norm, it is equal to ANorm/NMain * * second estimate is provided by max(CAC) * * We select largest one of these estimates, because using just one * of them is prone to different failure modes. Then, we divide A and B * by this estimate. */ anorm = (double)(0); for(i=0; i<=nmain-1; i++) { for(j=0; j<=nmain-1; j++) { anorm = anorm+ae_sqr(state->sclsfta.ptr.pp_double[i][j], _state); } } anorm = ae_sqrt(anorm, _state); if( renormlc&&ktotal>0 ) { /* * Calculate max(|diag(C*A*C')|), where C is constraint matrix */ rmatrixsetlengthatleast(&state->tmp2, ktotal, nmain, _state); rmatrixgemm(ktotal, nmain, nmain, 1.0, &state->sclsftcleic, 0, 0, 0, &state->sclsfta, 0, 0, 0, 0.0, &state->tmp2, 0, 0, _state); maxcac = 0.0; for(i=0; i<=ktotal-1; i++) { v = (double)(0); vv = (double)(0); for(j=0; j<=nmain-1; j++) { v = v+state->tmp2.ptr.pp_double[i][j]*state->sclsftcleic.ptr.pp_double[i][j]; vv = vv+ae_sqr(state->sclsftcleic.ptr.pp_double[i][j], _state); } ae_assert(ae_fp_less(ae_fabs(vv-1, _state),1.0E-9)||ae_fp_eq(vv,(double)(0)), "DENSE-AUL: integrity check failed", _state); maxcac = ae_maxreal(maxcac, ae_fabs(v, _state), _state); } } else { maxcac = (double)(0); } v = 1/coalesce(ae_maxreal(maxcac, anorm/nmain, _state), (double)(1), _state); for(i=0; i<=nmain-1; i++) { for(j=0; j<=nmain-1; j++) { state->sclsfta.ptr.pp_double[i][j] = state->sclsfta.ptr.pp_double[i][j]*v; } } for(i=0; i<=nmain-1; i++) { state->sclsftb.ptr.p_double[i] = state->sclsftb.ptr.p_double[i]*v; } /* * Perform outer iteration */ qqploaddefaults(ntotal, &state->qqpsettingsuser, _state); state->qqpsettingsuser.maxouterits = 50; state->qqpsettingsuser.epsg = 0.0; state->qqpsettingsuser.epsf = 0.0; state->qqpsettingsuser.epsx = 0.01*epsx; state->qqpsettingsuser.cnphase = ae_true; for(i=0; i<=ktotal-1; i++) { state->nulc.ptr.p_double[i] = (double)(0); } qpdenseaulsolver_generateexinitialpoint(&state->sclsftxc, nmain, nslack, &state->exxc, _state); goodcounter = 0; stagnationcounter = 0; feaserr = ae_maxrealnumber; for(outeridx=0; outeridx<=settings->outerits-1; outeridx++) { /* * Generate penalized quadratic model */ qpdenseaulsolver_generateexmodel(&state->sclsfta, &state->sclsftb, nmain, &state->sclsftbndl, &state->sclsfthasbndl, &state->sclsftbndu, &state->sclsfthasbndu, 
&state->sclsftcleic, dnec+snec, dnic+snic, &state->nulc, rho, &state->exa, &state->exb, &state->exbndl, &state->exbndu, &state->tmp2, _state); rvectorsetlengthatleast(&state->exscale, ntotal, _state); for(i=0; i<=ntotal-1; i++) { state->exscale.ptr.p_double[i] = 1.0; state->exxn.ptr.p_double[i] = state->exxc.ptr.p_double[i]; state->exxorigin.ptr.p_double[i] = (double)(0); } /* * Solve with QQP. */ qqpoptimize(&state->dummycqm, &state->dummysparse, &state->exa, 2, ae_true, &state->exb, &state->exbndl, &state->exbndu, &state->exscale, &state->exxorigin, ntotal, &state->tmp2, 0, 0, &state->qqpsettingsuser, &state->qqpbuf, &state->exxn, &k, _state); state->repncholesky = state->repncholesky+state->qqpbuf.repncholesky; /* * Estimate Lagrange multipliers using alternative algorithm */ ae_v_move(&state->nulcest.ptr.p_double[0], 1, &state->nulc.ptr.p_double[0], 1, ae_v_len(0,ktotal-1)); qpdenseaulsolver_updatelagrangemultipliers(&state->sclsfta, &state->sclsftb, nmain, &state->sclsftbndl, &state->sclsfthasbndl, &state->sclsftbndu, &state->sclsfthasbndu, &state->sclsftcleic, dnec+snec, dnic+snic, &state->exxn, &state->nulcest, state, _state); /* * Update XC and Lagrange multipliers */ feaserrprev = feaserr; feaserr = (double)(0); for(i=0; i<=ktotal-1; i++) { /* * Calculate I-th feasibility error in V using formula for distance * between point and line (here we calculate actual distance between * XN and hyperplane Ci'*XN=Bi, which is different from error Ci'*XN-Bi). */ v = (double)(0); vv = (double)(0); for(j=0; j<=nmain-1; j++) { v = v+state->sclsftcleic.ptr.pp_double[i][j]*state->exxn.ptr.p_double[j]; vv = vv+ae_sqr(state->sclsftcleic.ptr.pp_double[i][j], _state); } if( i>=dnec+snec ) { v = v+state->exxn.ptr.p_double[nmain+(i-(dnec+snec))]; vv = vv+ae_sqr((double)(1), _state); } v = v-state->sclsftcleic.ptr.pp_double[i][nmain]; vv = coalesce(vv, (double)(1), _state); v = v/ae_sqrt(vv, _state); /* * Calculate magnitude of Lagrangian update (and Lagrangian parameters themselves) */ feaserr = feaserr+ae_sqr(v, _state); state->nulc.ptr.p_double[i] = state->nulcest.ptr.p_double[i]; } feaserr = ae_sqrt(feaserr, _state); ae_v_move(&state->exxc.ptr.p_double[0], 1, &state->exxn.ptr.p_double[0], 1, ae_v_len(0,ntotal-1)); if( ae_fp_less(feaserr,epsx) ) { inc(&goodcounter, _state); } else { goodcounter = 0; } if( ae_fp_greater(feaserr,feaserrprev*requestedfeasdecrease) ) { inc(&stagnationcounter, _state); } else { stagnationcounter = 0; } if( goodcounter>=2 ) { break; } if( stagnationcounter>=2 ) { rho = ae_minreal(rho*10.0, maxrho, _state); } else { rho = ae_minreal(rho*1.41, maxrho, _state); } } /* * Unpack results. * * Add XOrigin to XC and make sure that boundary constraints are * satisfied. */ for(i=0; i<=nmain-1; i++) { /* * Unscale/unshift */ xs->ptr.p_double[i] = s->ptr.p_double[i]*state->exxc.ptr.p_double[i]+xorigin->ptr.p_double[i]; /* * Make sure that point is feasible w.r.t. box constraints. * Enforce box constraints which were active in the scaled/shifted solution. 
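 * Two separate corrections are applied below: (a) the unscaled value is
 * clipped into [BndL[i],BndU[i]] to remove small round-off violations
 * introduced by unscaling/unshifting, and (b) when the scaled solution
 * sits exactly on SclSftBndL[i]/SclSftBndU[i], the result is snapped to
 * the corresponding original bound so that active box constraints are
 * satisfied exactly.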
*/ if( state->sclsfthasbndl.ptr.p_bool[i] ) { if( ae_fp_less(xs->ptr.p_double[i],bndl->ptr.p_double[i]) ) { xs->ptr.p_double[i] = bndl->ptr.p_double[i]; } if( ae_fp_eq(state->exxc.ptr.p_double[i],state->sclsftbndl.ptr.p_double[i]) ) { xs->ptr.p_double[i] = bndl->ptr.p_double[i]; } } if( state->sclsfthasbndu.ptr.p_bool[i] ) { if( ae_fp_greater(xs->ptr.p_double[i],bndu->ptr.p_double[i]) ) { xs->ptr.p_double[i] = bndu->ptr.p_double[i]; } if( ae_fp_eq(state->exxc.ptr.p_double[i],state->sclsftbndu.ptr.p_double[i]) ) { xs->ptr.p_double[i] = bndu->ptr.p_double[i]; } } } *terminationtype = 2; } /************************************************************************* This function generates box-constrained QP problem, which is penalized and augmented formulation of original linearly constrained problem -- ALGLIB -- Copyright 23.02.2017 by Bochkanov Sergey *************************************************************************/ static void qpdenseaulsolver_generateexmodel(/* Real */ ae_matrix* sclsfta, /* Real */ ae_vector* sclsftb, ae_int_t nmain, /* Real */ ae_vector* sclsftbndl, /* Boolean */ ae_vector* sclsfthasbndl, /* Real */ ae_vector* sclsftbndu, /* Boolean */ ae_vector* sclsfthasbndu, /* Real */ ae_matrix* sclsftcleic, ae_int_t sclsftnec, ae_int_t sclsftnic, /* Real */ ae_vector* nulc, double rho, /* Real */ ae_matrix* exa, /* Real */ ae_vector* exb, /* Real */ ae_vector* exbndl, /* Real */ ae_vector* exbndu, /* Real */ ae_matrix* tmp2, ae_state *_state) { ae_int_t nslack; ae_int_t ntotal; ae_int_t i; ae_int_t j; double v; nslack = sclsftnic; ntotal = nmain+nslack; /* * Primary quadratic term */ for(i=0; i<=ntotal-1; i++) { for(j=i; j<=ntotal-1; j++) { exa->ptr.pp_double[i][j] = (double)(0); } } for(i=0; i<=nmain-1; i++) { for(j=i; j<=nmain-1; j++) { exa->ptr.pp_double[i][j] = sclsfta->ptr.pp_double[i][j]; } } /* * Primary linear term */ for(i=0; i<=ntotal-1; i++) { exb->ptr.p_double[i] = (double)(0); } for(i=0; i<=nmain-1; i++) { exb->ptr.p_double[i] = sclsftb->ptr.p_double[i]; } /* * Box constraints - move primary, add slack */ for(i=0; i<=nmain-1; i++) { if( sclsfthasbndl->ptr.p_bool[i] ) { exbndl->ptr.p_double[i] = sclsftbndl->ptr.p_double[i]; } else { exbndl->ptr.p_double[i] = _state->v_neginf; } if( sclsfthasbndu->ptr.p_bool[i] ) { exbndu->ptr.p_double[i] = sclsftbndu->ptr.p_double[i]; } else { exbndu->ptr.p_double[i] = _state->v_posinf; } } for(i=nmain; i<=ntotal-1; i++) { exbndl->ptr.p_double[i] = (double)(0); exbndu->ptr.p_double[i] = _state->v_posinf; } /* * Handle equality constraints: * * modify quadratic term * * modify linear term * * add Lagrangian term */ rmatrixsetlengthatleast(tmp2, sclsftnec+sclsftnic, ntotal, _state); for(i=0; i<=sclsftnec+sclsftnic-1; i++) { /* * Given constraint row ci and right hand side ri, * I-th quadratic constraint adds penalty term * * 0.5*Rho*(ci'*x-ri)^2 = * = 0.5*Rho*(ci'*x-ri)^T*(ci'*x-ri) = * = 0.5*Rho*(x'*ci-ri')*(ci'*x-ri) = * = 0.5*Rho*(x'*ci*ci'*x - ri'*ci'*x - x'*ci*ri + ri'*ri ) * = 0.5*Rho*(x'*(ci*ci')*x - 2*ri*(ci'*x) + ri^2 ) * * Thus, quadratic term is updated by * * 0.5*Rho*(ci*ci') * * (with actual update to ExA being performed without 0.5 * multiplier because entire matrix is post-multipliead by 0.5) * and linear term receives update * * -Rho*ri*ci * * Similaryly, lagrangian term is -NUi*(ci'*x-ri), * so linear term is updated by * * -NUi*ci * * Because our model does not take into account constant term, * we calculate just quadratic and linear terms. 
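 * In the code below Tmp2[i] holds the extended constraint row (ci with
 * an appended slack column for inequality constraints), ExB receives
 * the two rank-1 updates (-Rho*ri)*ci and (-NUi)*ci, and after the loop
 * a single rank-K update ExA := ExA + Rho*Tmp2'*Tmp2 (the rmatrixsyrk
 * call) adds the quadratic penalty terms of all constraints at once.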
*/ ae_v_move(&tmp2->ptr.pp_double[i][0], 1, &sclsftcleic->ptr.pp_double[i][0], 1, ae_v_len(0,nmain-1)); for(j=nmain; j<=ntotal-1; j++) { tmp2->ptr.pp_double[i][j] = (double)(0); } if( i>=sclsftnec ) { tmp2->ptr.pp_double[i][nmain+i-sclsftnec] = 1.0; } v = -rho*sclsftcleic->ptr.pp_double[i][nmain]; ae_v_addd(&exb->ptr.p_double[0], 1, &tmp2->ptr.pp_double[i][0], 1, ae_v_len(0,ntotal-1), v); v = -nulc->ptr.p_double[i]; ae_v_addd(&exb->ptr.p_double[0], 1, &tmp2->ptr.pp_double[i][0], 1, ae_v_len(0,ntotal-1), v); } rmatrixsyrk(ntotal, sclsftnec+sclsftnic, rho, tmp2, 0, 0, 2, 1.0, exa, 0, 0, ae_true, _state); } /************************************************************************* This function generates initial point for "extended" box-constrained QP problem. -- ALGLIB -- Copyright 23.02.2017 by Bochkanov Sergey *************************************************************************/ static void qpdenseaulsolver_generateexinitialpoint(/* Real */ ae_vector* sclsftxc, ae_int_t nmain, ae_int_t nslack, /* Real */ ae_vector* exxc, ae_state *_state) { ae_int_t ntotal; ae_int_t i; ntotal = nmain+nslack; for(i=0; i<=ntotal-1; i++) { exxc->ptr.p_double[i] = (double)(0); } for(i=0; i<=nmain-1; i++) { exxc->ptr.p_double[i] = sclsftxc->ptr.p_double[i]; } } /************************************************************************* This function estimates Lagrange multipliers for scaled-shifted QP problem (here "scaled-shifted" means that we performed variable scaling and subtracted origin) given by quadratic term A, linear term B, box constraints and linear constraint matrix. It is assumed that all linear constraints are equality ones, with first NEC ones being constraints without slack variables, and next NIC ones having slack variables. The only inequality constraints we have are box ones, with first NMain ones being "general" box constraints, and next NIC ones being non-negativity constraints (not specified explicitly). We also make use of the current point XC, which is used to determine active box constraints. Actual QP problem size is NMain+NIC, but some parameters have lower dimensionality. Parameters sizes are: * A is assumed to be array[NMain,NMain] * B is assumed to be array[NMain] * BndL, BndU are array[NMain] * CLEIC is array[NEC+NIC,NMain+1] (last item in a row containts right part) * ExXC is array[NMain+NIC], holds current point * NuLCEst is array[NEC+NIC], holds initial values of Lagrange coeffs On exit NuLCEst is updated with new estimate of Lagrange multipliers. -- ALGLIB -- Copyright 23.02.2017 by Bochkanov Sergey *************************************************************************/ static void qpdenseaulsolver_updatelagrangemultipliers(/* Real */ ae_matrix* sclsfta, /* Real */ ae_vector* sclsftb, ae_int_t nmain, /* Real */ ae_vector* sclsftbndl, /* Boolean */ ae_vector* sclsfthasbndl, /* Real */ ae_vector* sclsftbndu, /* Boolean */ ae_vector* sclsfthasbndu, /* Real */ ae_matrix* sclsftcleic, ae_int_t sclsftnec, ae_int_t sclsftnic, /* Real */ ae_vector* exxc, /* Real */ ae_vector* nulcest, qpdenseaulbuffers* buffers, ae_state *_state) { ae_int_t nslack; ae_int_t ntotal; ae_int_t ktotal; ae_int_t nqrrows; ae_int_t nqrcols; ae_int_t i; ae_int_t j; double lambdareg; double mxdiag; double v; ae_bool isactive; nslack = sclsftnic; ntotal = nmain+nslack; ktotal = sclsftnec+sclsftnic; /* * Given current point ExXC, we can determine active and inactive * constraints. 
After we drop inactive inequality constraints, we
 * have equality-only constrained QP problem, with mix of general
 * linear equality constraints and "simple" constraints Xi=Ci.
 *
 * Problem min(0.5*x'*A*x + b'*x) s.t. C*x=d (general linear
 * constraints) can be solved by explicitly writing out Lagrange
 * equations:
 *
 *     [ A  C' ]   [ X ]   [ -b ]
 *     [       ] * [   ] = [    ]
 *     [ C     ]   [ L ]   [  d ]
 *
 * or
 *
 *          [ X ]
 *     A1 * [   ] = b1
 *          [ L ]
 *
 * where X stands for solution itself, and L stands for Lagrange
 * multipliers. It can be easily solved with direct linear solver.
 * However, such formulation does not account for "simple" equality
 * constraints on variables. It is possible to include "simple"
 * constraints into "general" ones (i.e. append (0 ... 0 -1 0 ... 0)'
 * to the constraint matrix), but it will increase problem
 * size.
 *
 * Another approach is to use initial values of X and L (X0 and L0)
 * as starting point, and to solve for "offset" from (X0, L0):
 *
 *        [ X0+X1 ]
 *     A1*[       ] = b1
 *        [ L0+L1 ]
 *
 * or
 *
 *        [ X1 ]          [ X0 ]
 *     A1*[    ] = b1 - A1*[    ]
 *        [ L1 ]          [ L0 ]
 *
 * In such formulation components of X1 which correspond to active
 * constraints on variables are "frozen" at value 0 (because we have
 * equality constraint, offset from constrained value has to be zero).
 *
 * Thus, we can rewrite corresponding columns of A1 with zeros - and
 * use this space to store (0 ... 0 -1 0 ... 0)', which is used to
 * account for Lagrange multipliers for "simple" constraints.
 */
nqrcols = ntotal+ktotal; nqrrows = nqrcols; rvectorsetlengthatleast(&buffers->qrsv0, nqrcols, _state); rvectorsetlengthatleast(&buffers->qrsvx1, nqrcols, _state); for(i=0; i<=ntotal-1; i++) { buffers->qrsv0.ptr.p_double[i] = exxc->ptr.p_double[i]; } for(i=0; i<=ktotal-1; i++) { buffers->qrsv0.ptr.p_double[ntotal+i] = nulcest->ptr.p_double[i]; } rmatrixsetlengthatleast(&buffers->qrkkt, nqrcols+nqrcols, nqrcols+1, _state); rvectorsetlengthatleast(&buffers->qrrightpart, nqrcols+nqrcols, _state); lambdareg = 1.0E-8; for(;;) { /* * Initialize matrix A1 and right part b1 with zeros */ for(i=0; i<=buffers->qrkkt.rows-1; i++) { for(j=0; j<=buffers->qrkkt.cols-1; j++) { buffers->qrkkt.ptr.pp_double[i][j] = (double)(0); } buffers->qrrightpart.ptr.p_double[i] = (double)(0); } /*
 * Append quadratic term (note: we implicitly add NSlack zeros to
 * A and b).
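 * MxDiag computed below tracks the largest magnitude found on the
 * diagonal of A; it serves as a scale for the Tikhonov-style
 * regularizer LambdaReg*MxDiag which is appended to the bottom of the
 * KKT matrix before the QR factorization, and LambdaReg is increased
 * (with a retry) whenever the triangular factor turns out to be badly
 * conditioned.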
*/ mxdiag = (double)(0); for(i=0; i<=nmain-1; i++) { for(j=0; j<=nmain-1; j++) { buffers->qrkkt.ptr.pp_double[i][j] = sclsfta->ptr.pp_double[i][j]; } buffers->qrrightpart.ptr.p_double[i] = -sclsftb->ptr.p_double[i]; mxdiag = ae_maxreal(mxdiag, ae_fabs(sclsfta->ptr.pp_double[i][i], _state), _state); } mxdiag = coalesce(mxdiag, (double)(1), _state); /* * Append general linear constraints */ for(i=0; i<=ktotal-1; i++) { for(j=0; j<=nmain-1; j++) { buffers->qrkkt.ptr.pp_double[ntotal+i][j] = -sclsftcleic->ptr.pp_double[i][j]; buffers->qrkkt.ptr.pp_double[j][ntotal+i] = -sclsftcleic->ptr.pp_double[i][j]; } if( i>=sclsftnec ) { buffers->qrkkt.ptr.pp_double[ntotal+i][nmain+(i-sclsftnec)] = (double)(-1); buffers->qrkkt.ptr.pp_double[nmain+(i-sclsftnec)][ntotal+i] = (double)(-1); } buffers->qrrightpart.ptr.p_double[ntotal+i] = -sclsftcleic->ptr.pp_double[i][nmain]; } /* * Append regularizer to the bottom of the matrix * (it will be factored in during QR decomposition) */ if( ae_fp_greater(lambdareg,(double)(0)) ) { nqrrows = nqrcols+nqrcols; for(i=0; i<=nqrcols-1; i++) { buffers->qrkkt.ptr.pp_double[nqrcols+i][i] = lambdareg*mxdiag; } } /* * Subtract reference point (X0,L0) from the system */ for(i=0; i<=nqrcols-1; i++) { v = ae_v_dotproduct(&buffers->qrkkt.ptr.pp_double[i][0], 1, &buffers->qrsv0.ptr.p_double[0], 1, ae_v_len(0,nqrcols-1)); buffers->qrrightpart.ptr.p_double[i] = buffers->qrrightpart.ptr.p_double[i]-v; } /* * Handle active "simple" equality constraints */ for(i=0; i<=ntotal-1; i++) { isactive = ae_false; if( i<nmain&&((sclsfthasbndl->ptr.p_bool[i]&&ae_fp_eq(exxc->ptr.p_double[i],sclsftbndl->ptr.p_double[i]))||(sclsfthasbndu->ptr.p_bool[i]&&ae_fp_eq(exxc->ptr.p_double[i],sclsftbndu->ptr.p_double[i]))) ) { isactive = ae_true; } if( i>=nmain&&ae_fp_eq(exxc->ptr.p_double[i],0.0) ) { isactive = ae_true; } if( !isactive ) { continue; } for(j=0; j<=nqrrows-1; j++) { buffers->qrkkt.ptr.pp_double[j][i] = (double)(0); } buffers->qrkkt.ptr.pp_double[i][i] = (double)(-1); } /* * Solve via QR decomposition: * * append right part to the system matrix * * perform QR decomposition of the extended matrix (right part is implicitly * multiplied by Q during decomposition; believe me, it works!)
* * check condition number, increase regularization value if necessary and retry * * solve triangular system, break iteration */ for(i=0; i<=nqrrows-1; i++) { buffers->qrkkt.ptr.pp_double[i][nqrcols] = buffers->qrrightpart.ptr.p_double[i]; } rmatrixqr(&buffers->qrkkt, nqrrows, nqrcols+1, &buffers->qrtau, _state); if( ae_fp_less_eq(rmatrixtrrcond1(&buffers->qrkkt, nqrcols, ae_true, ae_false, _state),1000*ae_machineepsilon) ) { lambdareg = coalesce(10*lambdareg, 1.0E-13, _state); continue; } for(i=nqrcols-1; i>=0; i--) { v = buffers->qrkkt.ptr.pp_double[i][nqrcols]; for(j=i+1; j<=nqrcols-1; j++) { v = v-buffers->qrkkt.ptr.pp_double[i][j]*buffers->qrsvx1.ptr.p_double[j]; } buffers->qrsvx1.ptr.p_double[i] = v/buffers->qrkkt.ptr.pp_double[i][i]; } break; } /* * Update Lagrange coefficients */ for(i=0; i<=ktotal-1; i++) { nulcest->ptr.p_double[i] = buffers->qrsv0.ptr.p_double[ntotal+i]+buffers->qrsvx1.ptr.p_double[ntotal+i]; } } void _qpdenseaulsettings_init(void* _p, ae_state *_state) { qpdenseaulsettings *p = (qpdenseaulsettings*)_p; ae_touch_ptr((void*)p); } void _qpdenseaulsettings_init_copy(void* _dst, void* _src, ae_state *_state) { qpdenseaulsettings *dst = (qpdenseaulsettings*)_dst; qpdenseaulsettings *src = (qpdenseaulsettings*)_src; dst->epsx = src->epsx; dst->outerits = src->outerits; dst->rho = src->rho; } void _qpdenseaulsettings_clear(void* _p) { qpdenseaulsettings *p = (qpdenseaulsettings*)_p; ae_touch_ptr((void*)p); } void _qpdenseaulsettings_destroy(void* _p) { qpdenseaulsettings *p = (qpdenseaulsettings*)_p; ae_touch_ptr((void*)p); } void _qpdenseaulbuffers_init(void* _p, ae_state *_state) { qpdenseaulbuffers *p = (qpdenseaulbuffers*)_p; ae_touch_ptr((void*)p); ae_vector_init(&p->nulc, 0, DT_REAL, _state); ae_matrix_init(&p->sclsfta, 0, 0, DT_REAL, _state); ae_vector_init(&p->sclsftb, 0, DT_REAL, _state); ae_vector_init(&p->sclsfthasbndl, 0, DT_BOOL, _state); ae_vector_init(&p->sclsfthasbndu, 0, DT_BOOL, _state); ae_vector_init(&p->sclsftbndl, 0, DT_REAL, _state); ae_vector_init(&p->sclsftbndu, 0, DT_REAL, _state); ae_vector_init(&p->sclsftxc, 0, DT_REAL, _state); ae_matrix_init(&p->sclsftcleic, 0, 0, DT_REAL, _state); ae_matrix_init(&p->exa, 0, 0, DT_REAL, _state); ae_vector_init(&p->exb, 0, DT_REAL, _state); ae_vector_init(&p->exxc, 0, DT_REAL, _state); ae_vector_init(&p->exxn, 0, DT_REAL, _state); ae_vector_init(&p->exbndl, 0, DT_REAL, _state); ae_vector_init(&p->exbndu, 0, DT_REAL, _state); ae_vector_init(&p->exscale, 0, DT_REAL, _state); ae_vector_init(&p->exxorigin, 0, DT_REAL, _state); _qqpsettings_init(&p->qqpsettingsuser, _state); _qqpbuffers_init(&p->qqpbuf, _state); ae_vector_init(&p->nulcest, 0, DT_REAL, _state); ae_vector_init(&p->tmp0, 0, DT_REAL, _state); ae_matrix_init(&p->tmp2, 0, 0, DT_REAL, _state); ae_vector_init(&p->modelg, 0, DT_REAL, _state); ae_vector_init(&p->d, 0, DT_REAL, _state); ae_vector_init(&p->deltax, 0, DT_REAL, _state); _convexquadraticmodel_init(&p->dummycqm, _state); _sparsematrix_init(&p->dummysparse, _state); ae_matrix_init(&p->qrkkt, 0, 0, DT_REAL, _state); ae_vector_init(&p->qrrightpart, 0, DT_REAL, _state); ae_vector_init(&p->qrtau, 0, DT_REAL, _state); ae_vector_init(&p->qrsv0, 0, DT_REAL, _state); ae_vector_init(&p->qrsvx1, 0, DT_REAL, _state); } void _qpdenseaulbuffers_init_copy(void* _dst, void* _src, ae_state *_state) { qpdenseaulbuffers *dst = (qpdenseaulbuffers*)_dst; qpdenseaulbuffers *src = (qpdenseaulbuffers*)_src; ae_vector_init_copy(&dst->nulc, &src->nulc, _state); ae_matrix_init_copy(&dst->sclsfta, &src->sclsfta, _state); 
ae_vector_init_copy(&dst->sclsftb, &src->sclsftb, _state); ae_vector_init_copy(&dst->sclsfthasbndl, &src->sclsfthasbndl, _state); ae_vector_init_copy(&dst->sclsfthasbndu, &src->sclsfthasbndu, _state); ae_vector_init_copy(&dst->sclsftbndl, &src->sclsftbndl, _state); ae_vector_init_copy(&dst->sclsftbndu, &src->sclsftbndu, _state); ae_vector_init_copy(&dst->sclsftxc, &src->sclsftxc, _state); ae_matrix_init_copy(&dst->sclsftcleic, &src->sclsftcleic, _state); ae_matrix_init_copy(&dst->exa, &src->exa, _state); ae_vector_init_copy(&dst->exb, &src->exb, _state); ae_vector_init_copy(&dst->exxc, &src->exxc, _state); ae_vector_init_copy(&dst->exxn, &src->exxn, _state); ae_vector_init_copy(&dst->exbndl, &src->exbndl, _state); ae_vector_init_copy(&dst->exbndu, &src->exbndu, _state); ae_vector_init_copy(&dst->exscale, &src->exscale, _state); ae_vector_init_copy(&dst->exxorigin, &src->exxorigin, _state); _qqpsettings_init_copy(&dst->qqpsettingsuser, &src->qqpsettingsuser, _state); _qqpbuffers_init_copy(&dst->qqpbuf, &src->qqpbuf, _state); ae_vector_init_copy(&dst->nulcest, &src->nulcest, _state); ae_vector_init_copy(&dst->tmp0, &src->tmp0, _state); ae_matrix_init_copy(&dst->tmp2, &src->tmp2, _state); ae_vector_init_copy(&dst->modelg, &src->modelg, _state); ae_vector_init_copy(&dst->d, &src->d, _state); ae_vector_init_copy(&dst->deltax, &src->deltax, _state); _convexquadraticmodel_init_copy(&dst->dummycqm, &src->dummycqm, _state); _sparsematrix_init_copy(&dst->dummysparse, &src->dummysparse, _state); ae_matrix_init_copy(&dst->qrkkt, &src->qrkkt, _state); ae_vector_init_copy(&dst->qrrightpart, &src->qrrightpart, _state); ae_vector_init_copy(&dst->qrtau, &src->qrtau, _state); ae_vector_init_copy(&dst->qrsv0, &src->qrsv0, _state); ae_vector_init_copy(&dst->qrsvx1, &src->qrsvx1, _state); dst->repinneriterationscount = src->repinneriterationscount; dst->repouteriterationscount = src->repouteriterationscount; dst->repncholesky = src->repncholesky; dst->repnmv = src->repnmv; } void _qpdenseaulbuffers_clear(void* _p) { qpdenseaulbuffers *p = (qpdenseaulbuffers*)_p; ae_touch_ptr((void*)p); ae_vector_clear(&p->nulc); ae_matrix_clear(&p->sclsfta); ae_vector_clear(&p->sclsftb); ae_vector_clear(&p->sclsfthasbndl); ae_vector_clear(&p->sclsfthasbndu); ae_vector_clear(&p->sclsftbndl); ae_vector_clear(&p->sclsftbndu); ae_vector_clear(&p->sclsftxc); ae_matrix_clear(&p->sclsftcleic); ae_matrix_clear(&p->exa); ae_vector_clear(&p->exb); ae_vector_clear(&p->exxc); ae_vector_clear(&p->exxn); ae_vector_clear(&p->exbndl); ae_vector_clear(&p->exbndu); ae_vector_clear(&p->exscale); ae_vector_clear(&p->exxorigin); _qqpsettings_clear(&p->qqpsettingsuser); _qqpbuffers_clear(&p->qqpbuf); ae_vector_clear(&p->nulcest); ae_vector_clear(&p->tmp0); ae_matrix_clear(&p->tmp2); ae_vector_clear(&p->modelg); ae_vector_clear(&p->d); ae_vector_clear(&p->deltax); _convexquadraticmodel_clear(&p->dummycqm); _sparsematrix_clear(&p->dummysparse); ae_matrix_clear(&p->qrkkt); ae_vector_clear(&p->qrrightpart); ae_vector_clear(&p->qrtau); ae_vector_clear(&p->qrsv0); ae_vector_clear(&p->qrsvx1); } void _qpdenseaulbuffers_destroy(void* _p) { qpdenseaulbuffers *p = (qpdenseaulbuffers*)_p; ae_touch_ptr((void*)p); ae_vector_destroy(&p->nulc); ae_matrix_destroy(&p->sclsfta); ae_vector_destroy(&p->sclsftb); ae_vector_destroy(&p->sclsfthasbndl); ae_vector_destroy(&p->sclsfthasbndu); ae_vector_destroy(&p->sclsftbndl); ae_vector_destroy(&p->sclsftbndu); ae_vector_destroy(&p->sclsftxc); ae_matrix_destroy(&p->sclsftcleic); ae_matrix_destroy(&p->exa); 
ae_vector_destroy(&p->exb); ae_vector_destroy(&p->exxc); ae_vector_destroy(&p->exxn); ae_vector_destroy(&p->exbndl); ae_vector_destroy(&p->exbndu); ae_vector_destroy(&p->exscale); ae_vector_destroy(&p->exxorigin); _qqpsettings_destroy(&p->qqpsettingsuser); _qqpbuffers_destroy(&p->qqpbuf); ae_vector_destroy(&p->nulcest); ae_vector_destroy(&p->tmp0); ae_matrix_destroy(&p->tmp2); ae_vector_destroy(&p->modelg); ae_vector_destroy(&p->d); ae_vector_destroy(&p->deltax); _convexquadraticmodel_destroy(&p->dummycqm); _sparsematrix_destroy(&p->dummysparse); ae_matrix_destroy(&p->qrkkt); ae_vector_destroy(&p->qrrightpart); ae_vector_destroy(&p->qrtau); ae_vector_destroy(&p->qrsv0); ae_vector_destroy(&p->qrsvx1); } /************************************************************************* This function initializes QPCholeskySettings structure with default settings. Newly created structure MUST be initialized by default settings - or by copy of the already initialized structure. -- ALGLIB -- Copyright 14.05.2011 by Bochkanov Sergey *************************************************************************/ void qpcholeskyloaddefaults(ae_int_t nmain, qpcholeskysettings* s, ae_state *_state) { s->epsg = 0.0; s->epsf = 0.0; s->epsx = 1.0E-6; s->maxits = 0; } /************************************************************************* This function initializes QPCholeskySettings structure with copy of another, already initialized structure. -- ALGLIB -- Copyright 14.05.2011 by Bochkanov Sergey *************************************************************************/ void qpcholeskycopysettings(qpcholeskysettings* src, qpcholeskysettings* dst, ae_state *_state) { dst->epsg = src->epsg; dst->epsf = src->epsf; dst->epsx = src->epsx; dst->maxits = src->maxits; } /************************************************************************* This function runs QPCholesky solver; it returns after optimization process was completed. Following QP problem is solved: min(0.5*(x-x_origin)'*A*(x-x_origin)+b'*(x-x_origin)) subject to boundary constraints. INPUT PARAMETERS: AC - for dense problems (AKind=0) contains system matrix in the A-term of CQM object. OTHER TERMS ARE ACTIVELY USED AND MODIFIED BY THE SOLVER! SparseAC - for sparse problems (AKind=1 AKind - sparse matrix format: * 0 for dense matrix * 1 for sparse matrix SparseUpper - which triangle of SparseAC stores matrix - upper or lower one (for dense matrices this parameter is not actual). BC - linear term, array[NC] BndLC - lower bound, array[NC] BndUC - upper bound, array[NC] SC - scale vector, array[NC]: * I-th element contains scale of I-th variable, * SC[I]>0 XOriginC - origin term, array[NC]. Can be zero. NC - number of variables in the original formulation (no slack variables). CLEICC - linear equality/inequality constraints. Present version of this function does NOT provide publicly available support for linear constraints. This feature will be introduced in the future versions of the function. NEC, NIC - number of equality/inequality constraints. MUST BE ZERO IN THE CURRENT VERSION!!! Settings - QPCholeskySettings object initialized by one of the initialization functions. SState - object which stores temporaries: * if uninitialized object was passed, FirstCall parameter MUST be set to True; object will be automatically initialized by the function, and FirstCall will be set to False. * if FirstCall=False, it is assumed that this parameter was already initialized by previous call to this function with same problem dimensions (variable count N). 
XS - initial point, array[NC] OUTPUT PARAMETERS: XS - last point TerminationType-termination type: * * * -- ALGLIB -- Copyright 14.05.2011 by Bochkanov Sergey *************************************************************************/ void qpcholeskyoptimize(convexquadraticmodel* a, double anorm, /* Real */ ae_vector* b, /* Real */ ae_vector* bndl, /* Real */ ae_vector* bndu, /* Real */ ae_vector* s, /* Real */ ae_vector* xorigin, ae_int_t n, /* Real */ ae_matrix* cleic, ae_int_t nec, ae_int_t nic, qpcholeskybuffers* sstate, /* Real */ ae_vector* xsc, ae_int_t* terminationtype, ae_state *_state) { ae_int_t i; double noisetolerance; ae_bool havebc; double v; ae_int_t badnewtonits; double maxscaledgrad; double v0; double v1; ae_int_t nextaction; double fprev; double fcur; double fcand; double noiselevel; double d0; double d1; double d2; ae_int_t actstatus; *terminationtype = 0; /* * Allocate storage and prepare fields */ rvectorsetlengthatleast(&sstate->rctmpg, n, _state); rvectorsetlengthatleast(&sstate->tmp0, n, _state); rvectorsetlengthatleast(&sstate->tmp1, n, _state); rvectorsetlengthatleast(&sstate->gc, n, _state); rvectorsetlengthatleast(&sstate->pg, n, _state); rvectorsetlengthatleast(&sstate->xs, n, _state); rvectorsetlengthatleast(&sstate->xn, n, _state); rvectorsetlengthatleast(&sstate->workbndl, n, _state); rvectorsetlengthatleast(&sstate->workbndu, n, _state); bvectorsetlengthatleast(&sstate->havebndl, n, _state); bvectorsetlengthatleast(&sstate->havebndu, n, _state); sstate->repinneriterationscount = 0; sstate->repouteriterationscount = 0; sstate->repncholesky = 0; noisetolerance = (double)(10); /* * Our formulation of quadratic problem includes origin point, * i.e. we have F(x-x_origin) which is minimized subject to * constraints on x, instead of having simply F(x). * * Here we make transition from non-zero origin to zero one. * In order to make such transition we have to: * 1. subtract x_origin from x_start * 2. modify constraints * 3. solve problem * 4. add x_origin to solution * * There is alternate solution - to modify quadratic function * by expansion of multipliers containing (x-x_origin), but * we prefer to modify constraints, because it is a) more precise * and b) easier to to. * * Parts (1)-(2) are done here. After this block is over, * we have: * * XS, which stores shifted XStart (if we don't have XStart, * value of XS will be ignored later) * * WorkBndL, WorkBndU, which store modified boundary constraints. 
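 *
 * For example, with the substitution y = x-x_origin a boundary constraint
 * BndL[i] <= x[i] <= BndU[i] becomes BndL[i]-x_origin[i] <= y[i] <= BndU[i]-x_origin[i]
 * (this is what WorkBndL/WorkBndU store), and a linear row c'*x <= r becomes
 * c'*y <= r-c'*x_origin, which is exactly the correction applied to the last
 * column of WorkCLEIC below.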
*/ havebc = ae_false; for(i=0; i<=n-1; i++) { sstate->havebndl.ptr.p_bool[i] = ae_isfinite(bndl->ptr.p_double[i], _state); sstate->havebndu.ptr.p_bool[i] = ae_isfinite(bndu->ptr.p_double[i], _state); havebc = (havebc||sstate->havebndl.ptr.p_bool[i])||sstate->havebndu.ptr.p_bool[i]; if( sstate->havebndl.ptr.p_bool[i] ) { sstate->workbndl.ptr.p_double[i] = bndl->ptr.p_double[i]-xorigin->ptr.p_double[i]; } else { sstate->workbndl.ptr.p_double[i] = _state->v_neginf; } if( sstate->havebndu.ptr.p_bool[i] ) { sstate->workbndu.ptr.p_double[i] = bndu->ptr.p_double[i]-xorigin->ptr.p_double[i]; } else { sstate->workbndu.ptr.p_double[i] = _state->v_posinf; } } rmatrixsetlengthatleast(&sstate->workcleic, nec+nic, n+1, _state); for(i=0; i<=nec+nic-1; i++) { v = ae_v_dotproduct(&cleic->ptr.pp_double[i][0], 1, &xorigin->ptr.p_double[0], 1, ae_v_len(0,n-1)); ae_v_move(&sstate->workcleic.ptr.pp_double[i][0], 1, &cleic->ptr.pp_double[i][0], 1, ae_v_len(0,n-1)); sstate->workcleic.ptr.pp_double[i][n] = cleic->ptr.pp_double[i][n]-v; } /* * We have starting point in StartX, so we just have to shift and bound it */ for(i=0; i<=n-1; i++) { sstate->xs.ptr.p_double[i] = xsc->ptr.p_double[i]-xorigin->ptr.p_double[i]; if( sstate->havebndl.ptr.p_bool[i] ) { if( ae_fp_less(sstate->xs.ptr.p_double[i],sstate->workbndl.ptr.p_double[i]) ) { sstate->xs.ptr.p_double[i] = sstate->workbndl.ptr.p_double[i]; } } if( sstate->havebndu.ptr.p_bool[i] ) { if( ae_fp_greater(sstate->xs.ptr.p_double[i],sstate->workbndu.ptr.p_double[i]) ) { sstate->xs.ptr.p_double[i] = sstate->workbndu.ptr.p_double[i]; } } } /* * Handle special case - no constraints */ if( !havebc&&nec+nic==0 ) { /* * "Simple" unconstrained Cholesky */ bvectorsetlengthatleast(&sstate->tmpb, n, _state); for(i=0; i<=n-1; i++) { sstate->tmpb.ptr.p_bool[i] = ae_false; } sstate->repncholesky = sstate->repncholesky+1; cqmsetb(a, b, _state); cqmsetactiveset(a, &sstate->xs, &sstate->tmpb, _state); if( !cqmconstrainedoptimum(a, &sstate->xn, _state) ) { *terminationtype = -5; return; } ae_v_move(&xsc->ptr.p_double[0], 1, &sstate->xn.ptr.p_double[0], 1, ae_v_len(0,n-1)); ae_v_add(&xsc->ptr.p_double[0], 1, &xorigin->ptr.p_double[0], 1, ae_v_len(0,n-1)); sstate->repinneriterationscount = 1; sstate->repouteriterationscount = 1; *terminationtype = 4; return; } /* * Prepare "active set" structure */ sasinit(n, &sstate->sas, _state); sassetbc(&sstate->sas, &sstate->workbndl, &sstate->workbndu, _state); sassetlcx(&sstate->sas, &sstate->workcleic, nec, nic, _state); sassetscale(&sstate->sas, s, _state); if( !sasstartoptimization(&sstate->sas, &sstate->xs, _state) ) { *terminationtype = -3; return; } /* * Main cycle of CQP algorithm */ *terminationtype = 4; badnewtonits = 0; maxscaledgrad = 0.0; for(;;) { /* * Update iterations count */ inc(&sstate->repouteriterationscount, _state); inc(&sstate->repinneriterationscount, _state); /* * Phase 1. * * Determine active set. * Update MaxScaledGrad. */ cqmadx(a, &sstate->sas.xc, &sstate->rctmpg, _state); ae_v_add(&sstate->rctmpg.ptr.p_double[0], 1, &b->ptr.p_double[0], 1, ae_v_len(0,n-1)); sasreactivateconstraints(&sstate->sas, &sstate->rctmpg, _state); v = 0.0; for(i=0; i<=n-1; i++) { v = v+ae_sqr(sstate->rctmpg.ptr.p_double[i]*s->ptr.p_double[i], _state); } maxscaledgrad = ae_maxreal(maxscaledgrad, ae_sqrt(v, _state), _state); /* * Phase 2: perform penalized steepest descent step. 
* * NextAction control variable is set on exit from this loop: * * NextAction>0 in case we have to proceed to Phase 3 (Newton step) * * NextAction<0 in case we have to proceed to Phase 1 (recalculate active set) * * NextAction=0 in case we found solution (step along projected gradient is small enough) */ for(;;) { /* * Calculate constrained descent direction, store to PG. * Successful termination if PG is zero. */ cqmadx(a, &sstate->sas.xc, &sstate->gc, _state); ae_v_add(&sstate->gc.ptr.p_double[0], 1, &b->ptr.p_double[0], 1, ae_v_len(0,n-1)); sasconstraineddescent(&sstate->sas, &sstate->gc, &sstate->pg, _state); v0 = ae_v_dotproduct(&sstate->pg.ptr.p_double[0], 1, &sstate->pg.ptr.p_double[0], 1, ae_v_len(0,n-1)); if( ae_fp_eq(v0,(double)(0)) ) { /* * Constrained derivative is zero. * Solution found. */ nextaction = 0; break; } /* * Build quadratic model of F along descent direction: * F(xc+alpha*pg) = D2*alpha^2 + D1*alpha + D0 * Store noise level in the XC (noise level is used to classify * step as singificant or insignificant). * * In case function curvature is negative or product of descent * direction and gradient is non-negative, iterations are terminated. * * NOTE: D0 is not actually used, but we prefer to maintain it. */ fprev = qpcholeskysolver_modelvalue(a, b, &sstate->sas.xc, n, &sstate->tmp0, _state); fprev = fprev+qpcholeskysolver_penaltyfactor*maxscaledgrad*sasactivelcpenalty1(&sstate->sas, &sstate->sas.xc, _state); cqmevalx(a, &sstate->sas.xc, &v, &noiselevel, _state); v0 = cqmxtadx2(a, &sstate->pg, _state); d2 = v0; v1 = ae_v_dotproduct(&sstate->pg.ptr.p_double[0], 1, &sstate->gc.ptr.p_double[0], 1, ae_v_len(0,n-1)); d1 = v1; d0 = fprev; if( ae_fp_less_eq(d2,(double)(0)) ) { /* * Second derivative is non-positive, function is non-convex. */ *terminationtype = -5; nextaction = 0; break; } if( ae_fp_greater_eq(d1,(double)(0)) ) { /* * Second derivative is positive, first derivative is non-negative. * Solution found. */ nextaction = 0; break; } /* * Modify quadratic model - add penalty for violation of the active * constraints. * * Boundary constraints are always satisfied exactly, so we do not * add penalty term for them. General equality constraint of the * form a'*(xc+alpha*d)=b adds penalty term: * P(alpha) = (a'*(xc+alpha*d)-b)^2 * = (alpha*(a'*d) + (a'*xc-b))^2 * = alpha^2*(a'*d)^2 + alpha*2*(a'*d)*(a'*xc-b) + (a'*xc-b)^2 * Each penalty term is multiplied by 100*Anorm before adding it to * the 1-dimensional quadratic model. * * Penalization of the quadratic model improves behavior of the * algorithm in the presense of the multiple degenerate constraints. * In particular, it prevents algorithm from making large steps in * directions which violate equality constraints. */ for(i=0; i<=nec+nic-1; i++) { if( sstate->sas.activeset.ptr.p_int[n+i]>0 ) { v0 = ae_v_dotproduct(&sstate->workcleic.ptr.pp_double[i][0], 1, &sstate->pg.ptr.p_double[0], 1, ae_v_len(0,n-1)); v1 = ae_v_dotproduct(&sstate->workcleic.ptr.pp_double[i][0], 1, &sstate->sas.xc.ptr.p_double[0], 1, ae_v_len(0,n-1)); v1 = v1-sstate->workcleic.ptr.pp_double[i][n]; v = 100*anorm; d2 = d2+v*ae_sqr(v0, _state); d1 = d1+v*2*v0*v1; d0 = d0+v*ae_sqr(v1, _state); } } /* * Try unbounded step. * In case function change is dominated by noise or function actually increased * instead of decreasing, we terminate iterations. 
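 *
 * The unconstrained minimizer of the 1-dimensional model D2*alpha^2+D1*alpha+D0
 * is obtained by setting its derivative 2*D2*alpha+D1 to zero, which gives
 * alpha = -D1/(2*D2); this is the step length V computed below (D2>0 and D1<0
 * were already verified above, so the step is positive).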
*/ v = -d1/(2*d2); ae_v_move(&sstate->xn.ptr.p_double[0], 1, &sstate->sas.xc.ptr.p_double[0], 1, ae_v_len(0,n-1)); ae_v_addd(&sstate->xn.ptr.p_double[0], 1, &sstate->pg.ptr.p_double[0], 1, ae_v_len(0,n-1), v); fcand = qpcholeskysolver_modelvalue(a, b, &sstate->xn, n, &sstate->tmp0, _state); fcand = fcand+qpcholeskysolver_penaltyfactor*maxscaledgrad*sasactivelcpenalty1(&sstate->sas, &sstate->xn, _state); if( ae_fp_greater_eq(fcand,fprev-noiselevel*noisetolerance) ) { nextaction = 0; break; } /* * Save active set * Perform bounded step with (possible) activation */ actstatus = qpcholeskysolver_boundedstepandactivation(&sstate->sas, &sstate->xn, n, &sstate->tmp0, _state); fcur = qpcholeskysolver_modelvalue(a, b, &sstate->sas.xc, n, &sstate->tmp0, _state); /* * Depending on results, decide what to do: * 1. In case step was performed without activation of constraints, * we proceed to Newton method * 2. In case there was activated at least one constraint with ActiveSet[I]<0, * we proceed to Phase 1 and re-evaluate active set. * 3. Otherwise (activation of the constraints with ActiveSet[I]=0) * we try Phase 2 one more time. */ if( actstatus<0 ) { /* * Step without activation, proceed to Newton */ nextaction = 1; break; } if( actstatus==0 ) { /* * No new constraints added during last activation - only * ones which were at the boundary (ActiveSet[I]=0), but * inactive due to numerical noise. * * Now, these constraints are added to the active set, and * we try to perform steepest descent (Phase 2) one more time. */ continue; } else { /* * Last step activated at least one significantly new * constraint (ActiveSet[I]<0), we have to re-evaluate * active set (Phase 1). */ nextaction = -1; break; } } if( nextaction<0 ) { continue; } if( nextaction==0 ) { break; } /* * Phase 3: fast equality-constrained solver * * NOTE: this solver uses Augmented Lagrangian algorithm to solve * equality-constrained subproblems. This algorithm may * perform steps which increase function values instead of * decreasing it (in hard cases, like overconstrained problems). * * Such non-monononic steps may create a loop, when Augmented * Lagrangian algorithm performs uphill step, and steepest * descent algorithm (Phase 2) performs downhill step in the * opposite direction. * * In order to prevent iterations to continue forever we * count iterations when AL algorithm increased function * value instead of decreasing it. When number of such "bad" * iterations will increase beyong MaxBadNewtonIts, we will * terminate algorithm. */ fprev = qpcholeskysolver_modelvalue(a, b, &sstate->sas.xc, n, &sstate->tmp0, _state); for(;;) { /* * Calculate optimum subject to presently active constraints */ sstate->repncholesky = sstate->repncholesky+1; if( !qpcholeskysolver_constrainedoptimum(&sstate->sas, a, anorm, b, &sstate->xn, n, &sstate->tmp0, &sstate->tmpb, &sstate->tmp1, _state) ) { *terminationtype = -5; sasstopoptimization(&sstate->sas, _state); return; } /* * Add constraints. * If no constraints was added, accept candidate point XN and move to next phase. */ if( qpcholeskysolver_boundedstepandactivation(&sstate->sas, &sstate->xn, n, &sstate->tmp0, _state)<0 ) { break; } } fcur = qpcholeskysolver_modelvalue(a, b, &sstate->sas.xc, n, &sstate->tmp0, _state); if( ae_fp_greater_eq(fcur,fprev) ) { badnewtonits = badnewtonits+1; } if( badnewtonits>=qpcholeskysolver_maxbadnewtonits ) { /* * Algorithm found solution, but keeps iterating because Newton * algorithm performs uphill steps (noise in the Augmented Lagrangian * algorithm). 
We terminate algorithm; it is considered normal * termination. */ break; } } sasstopoptimization(&sstate->sas, _state); /* * Post-process: add XOrigin to XC */ for(i=0; i<=n-1; i++) { if( sstate->havebndl.ptr.p_bool[i]&&ae_fp_eq(sstate->sas.xc.ptr.p_double[i],sstate->workbndl.ptr.p_double[i]) ) { xsc->ptr.p_double[i] = bndl->ptr.p_double[i]; continue; } if( sstate->havebndu.ptr.p_bool[i]&&ae_fp_eq(sstate->sas.xc.ptr.p_double[i],sstate->workbndu.ptr.p_double[i]) ) { xsc->ptr.p_double[i] = bndu->ptr.p_double[i]; continue; } xsc->ptr.p_double[i] = boundval(sstate->sas.xc.ptr.p_double[i]+xorigin->ptr.p_double[i], bndl->ptr.p_double[i], bndu->ptr.p_double[i], _state); } } /************************************************************************* Model value: f = 0.5*x'*A*x + b'*x INPUT PARAMETERS: A - convex quadratic model; only main quadratic term is used, other parts of the model (D/Q/linear term) are ignored. This function does not modify model state. B - right part XC - evaluation point Tmp - temporary buffer, automatically resized if needed -- ALGLIB -- Copyright 20.06.2012 by Bochkanov Sergey *************************************************************************/ static double qpcholeskysolver_modelvalue(convexquadraticmodel* a, /* Real */ ae_vector* b, /* Real */ ae_vector* xc, ae_int_t n, /* Real */ ae_vector* tmp, ae_state *_state) { double v0; double v1; double result; rvectorsetlengthatleast(tmp, n, _state); cqmadx(a, xc, tmp, _state); v0 = ae_v_dotproduct(&xc->ptr.p_double[0], 1, &tmp->ptr.p_double[0], 1, ae_v_len(0,n-1)); v1 = ae_v_dotproduct(&xc->ptr.p_double[0], 1, &b->ptr.p_double[0], 1, ae_v_len(0,n-1)); result = 0.5*v0+v1; return result; } /************************************************************************* Having feasible current point XC and possibly infeasible candidate point XN, this function performs longest step from XC to XN which retains feasibility. In case XN is found to be infeasible, at least one constraint is activated. For example, if we have: XC=0.5 XN=1.2 x>=0, x<=1 then this function will move us to X=1.0 and activate constraint "x<=1". INPUT PARAMETERS: State - MinQP state. XC - current point, must be feasible with respect to all constraints XN - candidate point, can be infeasible with respect to some constraints. Must be located in the subspace of current active set, i.e. it is feasible with respect to already active constraints. 
Buf - temporary buffer, automatically resized if needed OUTPUT PARAMETERS: State - this function changes following fields of State: * State.ActiveSet * State.ActiveC - active linear constraints XC - new position RESULT: >0, in case at least one inactive non-candidate constraint was activated =0, in case only "candidate" constraints were activated <0, in case no constraints were activated by the step -- ALGLIB -- Copyright 29.02.2012 by Bochkanov Sergey *************************************************************************/ static ae_int_t qpcholeskysolver_boundedstepandactivation(sactiveset* sas, /* Real */ ae_vector* xn, ae_int_t n, /* Real */ ae_vector* buf, ae_state *_state) { double stpmax; ae_int_t cidx; double cval; ae_bool needact; double v; ae_int_t result; rvectorsetlengthatleast(buf, n, _state); ae_v_move(&buf->ptr.p_double[0], 1, &xn->ptr.p_double[0], 1, ae_v_len(0,n-1)); ae_v_sub(&buf->ptr.p_double[0], 1, &sas->xc.ptr.p_double[0], 1, ae_v_len(0,n-1)); sasexploredirection(sas, buf, &stpmax, &cidx, &cval, _state); needact = ae_fp_less_eq(stpmax,(double)(1)); v = ae_minreal(stpmax, 1.0, _state); ae_v_muld(&buf->ptr.p_double[0], 1, ae_v_len(0,n-1), v); ae_v_add(&buf->ptr.p_double[0], 1, &sas->xc.ptr.p_double[0], 1, ae_v_len(0,n-1)); result = sasmoveto(sas, buf, needact, cidx, cval, _state); return result; } /************************************************************************* Optimum of A subject to: a) active boundary constraints (given by ActiveSet[] and corresponding elements of XC) b) active linear constraints (given by C, R, LagrangeC) INPUT PARAMETERS: A - main quadratic term of the model; although structure may store linear and rank-K terms, these terms are ignored and rewritten by this function. ANorm - estimate of ||A|| (2-norm is used) B - array[N], linear term of the model XN - possibly preallocated buffer Tmp - temporary buffer (automatically resized) Tmp1 - temporary buffer (automatically resized) OUTPUT PARAMETERS: A - modified quadratic model (this function changes rank-K term and linear term of the model) LagrangeC- current estimate of the Lagrange coefficients XN - solution RESULT: True on success, False on failure (non-SPD model) -- ALGLIB -- Copyright 20.06.2012 by Bochkanov Sergey *************************************************************************/ static ae_bool qpcholeskysolver_constrainedoptimum(sactiveset* sas, convexquadraticmodel* a, double anorm, /* Real */ ae_vector* b, /* Real */ ae_vector* xn, ae_int_t n, /* Real */ ae_vector* tmp, /* Boolean */ ae_vector* tmpb, /* Real */ ae_vector* lagrangec, ae_state *_state) { ae_int_t itidx; ae_int_t i; double v; double feaserrold; double feaserrnew; double theta; ae_bool result; /* * Rebuild basis accroding to current active set. * We call SASRebuildBasis() to make sure that fields of SAS * store up to date values. */ sasrebuildbasis(sas, _state); /* * Allocate temporaries. 
*/ rvectorsetlengthatleast(tmp, ae_maxint(n, sas->basissize, _state), _state); bvectorsetlengthatleast(tmpb, n, _state); rvectorsetlengthatleast(lagrangec, sas->basissize, _state); /* * Prepare model */ for(i=0; i<=sas->basissize-1; i++) { tmp->ptr.p_double[i] = sas->pbasis.ptr.pp_double[i][n]; } theta = 100.0*anorm; for(i=0; i<=n-1; i++) { if( sas->activeset.ptr.p_int[i]>0 ) { tmpb->ptr.p_bool[i] = ae_true; } else { tmpb->ptr.p_bool[i] = ae_false; } } cqmsetactiveset(a, &sas->xc, tmpb, _state); cqmsetq(a, &sas->pbasis, tmp, sas->basissize, theta, _state); /* * Iterate until optimal values of Lagrange multipliers are found */ for(i=0; i<=sas->basissize-1; i++) { lagrangec->ptr.p_double[i] = (double)(0); } feaserrnew = ae_maxrealnumber; result = ae_true; for(itidx=1; itidx<=qpcholeskysolver_maxlagrangeits; itidx++) { /* * Generate right part B using linear term and current * estimate of the Lagrange multipliers. */ ae_v_move(&tmp->ptr.p_double[0], 1, &b->ptr.p_double[0], 1, ae_v_len(0,n-1)); for(i=0; i<=sas->basissize-1; i++) { v = lagrangec->ptr.p_double[i]; ae_v_subd(&tmp->ptr.p_double[0], 1, &sas->pbasis.ptr.pp_double[i][0], 1, ae_v_len(0,n-1), v); } cqmsetb(a, tmp, _state); /* * Solve */ result = cqmconstrainedoptimum(a, xn, _state); if( !result ) { return result; } /* * Compare feasibility errors. * Terminate if error decreased too slowly. */ feaserrold = feaserrnew; feaserrnew = (double)(0); for(i=0; i<=sas->basissize-1; i++) { v = ae_v_dotproduct(&sas->pbasis.ptr.pp_double[i][0], 1, &xn->ptr.p_double[0], 1, ae_v_len(0,n-1)); feaserrnew = feaserrnew+ae_sqr(v-sas->pbasis.ptr.pp_double[i][n], _state); } feaserrnew = ae_sqrt(feaserrnew, _state); if( ae_fp_greater_eq(feaserrnew,0.2*feaserrold) ) { break; } /* * Update Lagrange multipliers */ for(i=0; i<=sas->basissize-1; i++) { v = ae_v_dotproduct(&sas->pbasis.ptr.pp_double[i][0], 1, &xn->ptr.p_double[0], 1, ae_v_len(0,n-1)); lagrangec->ptr.p_double[i] = lagrangec->ptr.p_double[i]-theta*(v-sas->pbasis.ptr.pp_double[i][n]); } } return result; } void _qpcholeskysettings_init(void* _p, ae_state *_state) { qpcholeskysettings *p = (qpcholeskysettings*)_p; ae_touch_ptr((void*)p); } void _qpcholeskysettings_init_copy(void* _dst, void* _src, ae_state *_state) { qpcholeskysettings *dst = (qpcholeskysettings*)_dst; qpcholeskysettings *src = (qpcholeskysettings*)_src; dst->epsg = src->epsg; dst->epsf = src->epsf; dst->epsx = src->epsx; dst->maxits = src->maxits; } void _qpcholeskysettings_clear(void* _p) { qpcholeskysettings *p = (qpcholeskysettings*)_p; ae_touch_ptr((void*)p); } void _qpcholeskysettings_destroy(void* _p) { qpcholeskysettings *p = (qpcholeskysettings*)_p; ae_touch_ptr((void*)p); } void _qpcholeskybuffers_init(void* _p, ae_state *_state) { qpcholeskybuffers *p = (qpcholeskybuffers*)_p; ae_touch_ptr((void*)p); _sactiveset_init(&p->sas, _state); ae_vector_init(&p->pg, 0, DT_REAL, _state); ae_vector_init(&p->gc, 0, DT_REAL, _state); ae_vector_init(&p->xs, 0, DT_REAL, _state); ae_vector_init(&p->xn, 0, DT_REAL, _state); ae_vector_init(&p->workbndl, 0, DT_REAL, _state); ae_vector_init(&p->workbndu, 0, DT_REAL, _state); ae_vector_init(&p->havebndl, 0, DT_BOOL, _state); ae_vector_init(&p->havebndu, 0, DT_BOOL, _state); ae_matrix_init(&p->workcleic, 0, 0, DT_REAL, _state); ae_vector_init(&p->rctmpg, 0, DT_REAL, _state); ae_vector_init(&p->tmp0, 0, DT_REAL, _state); ae_vector_init(&p->tmp1, 0, DT_REAL, _state); ae_vector_init(&p->tmpb, 0, DT_BOOL, _state); } void _qpcholeskybuffers_init_copy(void* _dst, void* _src, ae_state *_state) { 
qpcholeskybuffers *dst = (qpcholeskybuffers*)_dst; qpcholeskybuffers *src = (qpcholeskybuffers*)_src; _sactiveset_init_copy(&dst->sas, &src->sas, _state); ae_vector_init_copy(&dst->pg, &src->pg, _state); ae_vector_init_copy(&dst->gc, &src->gc, _state); ae_vector_init_copy(&dst->xs, &src->xs, _state); ae_vector_init_copy(&dst->xn, &src->xn, _state); ae_vector_init_copy(&dst->workbndl, &src->workbndl, _state); ae_vector_init_copy(&dst->workbndu, &src->workbndu, _state); ae_vector_init_copy(&dst->havebndl, &src->havebndl, _state); ae_vector_init_copy(&dst->havebndu, &src->havebndu, _state); ae_matrix_init_copy(&dst->workcleic, &src->workcleic, _state); ae_vector_init_copy(&dst->rctmpg, &src->rctmpg, _state); ae_vector_init_copy(&dst->tmp0, &src->tmp0, _state); ae_vector_init_copy(&dst->tmp1, &src->tmp1, _state); ae_vector_init_copy(&dst->tmpb, &src->tmpb, _state); dst->repinneriterationscount = src->repinneriterationscount; dst->repouteriterationscount = src->repouteriterationscount; dst->repncholesky = src->repncholesky; } void _qpcholeskybuffers_clear(void* _p) { qpcholeskybuffers *p = (qpcholeskybuffers*)_p; ae_touch_ptr((void*)p); _sactiveset_clear(&p->sas); ae_vector_clear(&p->pg); ae_vector_clear(&p->gc); ae_vector_clear(&p->xs); ae_vector_clear(&p->xn); ae_vector_clear(&p->workbndl); ae_vector_clear(&p->workbndu); ae_vector_clear(&p->havebndl); ae_vector_clear(&p->havebndu); ae_matrix_clear(&p->workcleic); ae_vector_clear(&p->rctmpg); ae_vector_clear(&p->tmp0); ae_vector_clear(&p->tmp1); ae_vector_clear(&p->tmpb); } void _qpcholeskybuffers_destroy(void* _p) { qpcholeskybuffers *p = (qpcholeskybuffers*)_p; ae_touch_ptr((void*)p); _sactiveset_destroy(&p->sas); ae_vector_destroy(&p->pg); ae_vector_destroy(&p->gc); ae_vector_destroy(&p->xs); ae_vector_destroy(&p->xn); ae_vector_destroy(&p->workbndl); ae_vector_destroy(&p->workbndu); ae_vector_destroy(&p->havebndl); ae_vector_destroy(&p->havebndu); ae_matrix_destroy(&p->workcleic); ae_vector_destroy(&p->rctmpg); ae_vector_destroy(&p->tmp0); ae_vector_destroy(&p->tmp1); ae_vector_destroy(&p->tmpb); } /************************************************************************* NONLINEAR CONJUGATE GRADIENT METHOD DESCRIPTION: The subroutine minimizes function F(x) of N arguments by using one of the nonlinear conjugate gradient methods. These CG methods are globally convergent (even on non-convex functions) as long as grad(f) is Lipschitz continuous in a some neighborhood of the L = { x : f(x)<=f(x0) }. REQUIREMENTS: Algorithm will request following information during its operation: * function value F and its gradient G (simultaneously) at given point X USAGE: 1. User initializes algorithm state with MinCGCreate() call 2. User tunes solver parameters with MinCGSetCond(), MinCGSetStpMax() and other functions 3. User calls MinCGOptimize() function which takes algorithm state and pointer (delegate, etc.) to callback function which calculates F/G. 4. User calls MinCGResults() to get solution 5. Optionally, user may call MinCGRestartFrom() to solve another problem with same N but another starting point and/or another function. MinCGRestartFrom() allows to reuse already initialized structure. INPUT PARAMETERS: N - problem dimension, N>0: * if given, only leading N elements of X are used * if not given, automatically determined from size of X X - starting point, array[0..N-1]. 
OUTPUT PARAMETERS: State - structure which stores algorithm state -- ALGLIB -- Copyright 25.03.2010 by Bochkanov Sergey *************************************************************************/ void mincgcreate(ae_int_t n, /* Real */ ae_vector* x, mincgstate* state, ae_state *_state) { _mincgstate_clear(state); ae_assert(n>=1, "MinCGCreate: N too small!", _state); ae_assert(x->cnt>=n, "MinCGCreate: Length(X)<N!", _state); ae_assert(isfinitevector(x, n, _state), "MinCGCreate: X contains infinite or NaN values!", _state); mincg_mincginitinternal(n, 0.0, state, _state); mincgrestartfrom(state, x, _state); } /************************************************************************* The subroutine is finite difference variant of MinCGCreate(). It uses finite differences in order to differentiate target function. Description below contains information which is specific to this function only. We recommend to read comments on MinCGCreate() in order to get more information about creation of CG optimizer. INPUT PARAMETERS: N - problem dimension, N>0: * if given, only leading N elements of X are used * if not given, automatically determined from size of X X - starting point, array[0..N-1]. DiffStep- differentiation step, >0 OUTPUT PARAMETERS: State - structure which stores algorithm state NOTES: 1. algorithm uses 4-point central formula for differentiation. 2. differentiation step along I-th axis is equal to DiffStep*S[I] where S[] is scaling vector which can be set by MinCGSetScale() call. 3. we recommend you to use moderate values of differentiation step. Too large step will result in too large truncation errors, while too small step will result in too large numerical errors. 1.0E-6 can be good value to start with. 4. Numerical differentiation is very inefficient - one gradient calculation needs 4*N function evaluations. This function will work for any N - either small (1...10), moderate (10...100) or large (100...). However, performance penalty will be too severe for any N's except for small ones. We should also say that code which relies on numerical differentiation is less robust and precise. L-BFGS needs exact gradient values. Imprecise gradient may slow down convergence, especially on highly nonlinear problems. Thus we recommend to use this function for fast prototyping on small- dimensional problems only, and to implement analytical gradient as soon as possible. -- ALGLIB -- Copyright 16.05.2011 by Bochkanov Sergey *************************************************************************/ void mincgcreatef(ae_int_t n, /* Real */ ae_vector* x, double diffstep, mincgstate* state, ae_state *_state) { _mincgstate_clear(state); ae_assert(n>=1, "MinCGCreateF: N too small!", _state); ae_assert(x->cnt>=n, "MinCGCreateF: Length(X)<N!", _state); ae_assert(isfinitevector(x, n, _state), "MinCGCreateF: X contains infinite or NaN values!", _state); ae_assert(ae_isfinite(diffstep, _state), "MinCGCreateF: DiffStep is infinite or NaN!", _state); ae_assert(ae_fp_greater(diffstep,(double)(0)), "MinCGCreateF: DiffStep is non-positive!", _state); mincg_mincginitinternal(n, diffstep, state, _state); mincgrestartfrom(state, x, _state); } /************************************************************************* This function sets stopping conditions for CG optimization algorithm. INPUT PARAMETERS: State - structure which stores algorithm state EpsG - >=0 The subroutine finishes its work if the condition |v|<EpsG is satisfied, where: * |.| means Euclidean norm * v - scaled gradient vector, v[i]=g[i]*s[i] * g - gradient * s - scaling coefficients set by MinCGSetScale() EpsF - >=0 The subroutine finishes its work if on k+1-th iteration the condition |F(k+1)-F(k)|<=EpsF*max{|F(k)|,|F(k+1)|,1} is satisfied. EpsX - >=0 The subroutine finishes its work if on k+1-th iteration the condition |v|<=EpsX is fulfilled, where: * |.| means Euclidean norm * v - scaled step vector, v[i]=dx[i]/s[i] * dx - step vector, dx=X(k+1)-X(k) * s - scaling coefficients set by MinCGSetScale() MaxIts - maximum number of iterations. If MaxIts=0, the number of iterations is unlimited. Passing EpsG=0, EpsF=0, EpsX=0 and MaxIts=0 (simultaneously) will lead to automatic stopping criterion selection (small EpsX).
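
A minimal usage sketch (for illustration only, not part of the library source),
assuming the alglib:: C++ wrappers of these functions: stop when the scaled
step falls below 1.0E-8 or after at most 100 iterations, whichever happens
first:

    alglib::real_1d_array x = "[1.0, 1.0]";
    alglib::mincgstate state;
    alglib::mincgcreate(x, state);                       // create optimizer at starting point x
    alglib::mincgsetcond(state, 0.0, 0.0, 1.0E-8, 100);  // EpsG=EpsF=0, EpsX=1.0E-8, MaxIts=100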
-- ALGLIB -- Copyright 02.04.2010 by Bochkanov Sergey *************************************************************************/ void mincgsetcond(mincgstate* state, double epsg, double epsf, double epsx, ae_int_t maxits, ae_state *_state) { ae_assert(ae_isfinite(epsg, _state), "MinCGSetCond: EpsG is not finite number!", _state); ae_assert(ae_fp_greater_eq(epsg,(double)(0)), "MinCGSetCond: negative EpsG!", _state); ae_assert(ae_isfinite(epsf, _state), "MinCGSetCond: EpsF is not finite number!", _state); ae_assert(ae_fp_greater_eq(epsf,(double)(0)), "MinCGSetCond: negative EpsF!", _state); ae_assert(ae_isfinite(epsx, _state), "MinCGSetCond: EpsX is not finite number!", _state); ae_assert(ae_fp_greater_eq(epsx,(double)(0)), "MinCGSetCond: negative EpsX!", _state); ae_assert(maxits>=0, "MinCGSetCond: negative MaxIts!", _state); if( ((ae_fp_eq(epsg,(double)(0))&&ae_fp_eq(epsf,(double)(0)))&&ae_fp_eq(epsx,(double)(0)))&&maxits==0 ) { epsx = 1.0E-6; } state->epsg = epsg; state->epsf = epsf; state->epsx = epsx; state->maxits = maxits; } /************************************************************************* This function sets scaling coefficients for CG optimizer. ALGLIB optimizers use scaling matrices to test stopping conditions (step size and gradient are scaled before comparison with tolerances). Scale of the I-th variable is a translation invariant measure of: a) "how large" the variable is b) how large the step should be to make significant changes in the function Scaling is also used by finite difference variant of CG optimizer - step along I-th axis is equal to DiffStep*S[I]. In most optimizers (and in the CG too) scaling is NOT a form of preconditioning. It just affects stopping conditions. You should set preconditioner by separate call to one of the MinCGSetPrec...() functions. There is special preconditioning mode, however, which uses scaling coefficients to form diagonal preconditioning matrix. You can turn this mode on, if you want. But you should understand that scaling is not the same thing as preconditioning - these are two different, although related forms of tuning solver. INPUT PARAMETERS: State - structure stores algorithm state S - array[N], non-zero scaling coefficients S[i] may be negative, sign doesn't matter. -- ALGLIB -- Copyright 14.01.2011 by Bochkanov Sergey *************************************************************************/ void mincgsetscale(mincgstate* state, /* Real */ ae_vector* s, ae_state *_state) { ae_int_t i; ae_assert(s->cnt>=state->n, "MinCGSetScale: Length(S)<N", _state); for(i=0; i<=state->n-1; i++) { ae_assert(ae_isfinite(s->ptr.p_double[i], _state), "MinCGSetScale: S contains infinite or NAN elements", _state); ae_assert(ae_fp_neq(s->ptr.p_double[i],(double)(0)), "MinCGSetScale: S contains zero elements", _state); state->s.ptr.p_double[i] = ae_fabs(s->ptr.p_double[i], _state); } } /************************************************************************* This function turns on/off reporting. INPUT PARAMETERS: State - structure which stores algorithm state NeedXRep- whether iteration reports are needed or not If NeedXRep is True, algorithm will call rep() callback function if it is provided to MinCGOptimize(). -- ALGLIB -- Copyright 02.04.2010 by Bochkanov Sergey *************************************************************************/ void mincgsetxrep(mincgstate* state, ae_bool needxrep, ae_state *_state) { state->xrep = needxrep; } /************************************************************************* This function turns on/off line search reports.
These reports are described in more details in developer-only comments on MinCGState object. INPUT PARAMETERS: State - structure which stores algorithm state NeedDRep- whether line search reports are needed or not This function is intended for private use only. Turning it on artificially may cause program failure. -- ALGLIB -- Copyright 02.04.2010 by Bochkanov Sergey *************************************************************************/ void mincgsetdrep(mincgstate* state, ae_bool needdrep, ae_state *_state) { state->drep = needdrep; } /************************************************************************* This function sets CG algorithm. INPUT PARAMETERS: State - structure which stores algorithm state CGType - algorithm type: * -1 automatic selection of the best algorithm * 0 DY (Dai and Yuan) algorithm * 1 Hybrid DY-HS algorithm -- ALGLIB -- Copyright 02.04.2010 by Bochkanov Sergey *************************************************************************/ void mincgsetcgtype(mincgstate* state, ae_int_t cgtype, ae_state *_state) { ae_assert(cgtype>=-1&&cgtype<=1, "MinCGSetCGType: incorrect CGType!", _state); if( cgtype==-1 ) { cgtype = 1; } state->cgtype = cgtype; } /************************************************************************* This function sets maximum step length INPUT PARAMETERS: State - structure which stores algorithm state StpMax - maximum step length, >=0. Set StpMax to 0.0, if you don't want to limit step length. Use this subroutine when you optimize target function which contains exp() or other fast growing functions, and optimization algorithm makes too large steps which leads to overflow. This function allows us to reject steps that are too large (and therefore expose us to the possible overflow) without actually calculating function value at the x+stp*d. -- ALGLIB -- Copyright 02.04.2010 by Bochkanov Sergey *************************************************************************/ void mincgsetstpmax(mincgstate* state, double stpmax, ae_state *_state) { ae_assert(ae_isfinite(stpmax, _state), "MinCGSetStpMax: StpMax is not finite!", _state); ae_assert(ae_fp_greater_eq(stpmax,(double)(0)), "MinCGSetStpMax: StpMax<0!", _state); state->stpmax = stpmax; } /************************************************************************* This function allows to suggest initial step length to the CG algorithm. Suggested step length is used as starting point for the line search. It can be useful when you have badly scaled problem, i.e. when ||grad|| (which is used as initial estimate for the first step) is many orders of magnitude different from the desired step. Line search may fail on such problems without good estimate of initial step length. Imagine, for example, problem with ||grad||=10^50 and desired step equal to 0.1 Line search function will use 10^50 as initial step, then it will decrease step length by 2 (up to 20 attempts) and will get 10^44, which is still too large. This function allows us to tell than line search should be started from some moderate step length, like 1.0, so algorithm will be able to detect desired step length in a several searches. Default behavior (when no step is suggested) is to use preconditioner, if it is available, to generate initial estimate of step length. This function influences only first iteration of algorithm. It should be called between MinCGCreate/MinCGRestartFrom() call and MinCGOptimize call. Suggested step is ignored if you have preconditioner. INPUT PARAMETERS: State - structure used to store algorithm state. 
Stp - initial estimate of the step length. Can be zero (no estimate). -- ALGLIB -- Copyright 30.07.2010 by Bochkanov Sergey *************************************************************************/ void mincgsuggeststep(mincgstate* state, double stp, ae_state *_state) { ae_assert(ae_isfinite(stp, _state), "MinCGSuggestStep: Stp is infinite or NAN", _state); ae_assert(ae_fp_greater_eq(stp,(double)(0)), "MinCGSuggestStep: Stp<0", _state); state->suggestedstep = stp; } /************************************************************************* This developer-only function allows to retrieve unscaled (!) length of last good step (i.e. step which resulted in sufficient decrease of target function). It can be used in for solution of sequential optimization subproblems, where MinCGSuggestStep() is called with length of previous step as parameter. INPUT PARAMETERS: State - structure used to store algorithm state. RESULT: length of last good step being accepted NOTE: result of this function is undefined if you called it before -- ALGLIB -- Copyright 30.07.2010 by Bochkanov Sergey *************************************************************************/ double mincglastgoodstep(mincgstate* state, ae_state *_state) { double result; result = state->lastgoodstep; return result; } /************************************************************************* Modification of the preconditioner: preconditioning is turned off. INPUT PARAMETERS: State - structure which stores algorithm state NOTE: you can change preconditioner "on the fly", during algorithm iterations. -- ALGLIB -- Copyright 13.10.2010 by Bochkanov Sergey *************************************************************************/ void mincgsetprecdefault(mincgstate* state, ae_state *_state) { state->prectype = 0; state->innerresetneeded = ae_true; } /************************************************************************* Modification of the preconditioner: diagonal of approximate Hessian is used. INPUT PARAMETERS: State - structure which stores algorithm state D - diagonal of the approximate Hessian, array[0..N-1], (if larger, only leading N elements are used). NOTE: you can change preconditioner "on the fly", during algorithm iterations. NOTE 2: D[i] should be positive. Exception will be thrown otherwise. NOTE 3: you should pass diagonal of approximate Hessian - NOT ITS INVERSE. -- ALGLIB -- Copyright 13.10.2010 by Bochkanov Sergey *************************************************************************/ void mincgsetprecdiag(mincgstate* state, /* Real */ ae_vector* d, ae_state *_state) { ae_int_t i; ae_assert(d->cnt>=state->n, "MinCGSetPrecDiag: D is too short", _state); for(i=0; i<=state->n-1; i++) { ae_assert(ae_isfinite(d->ptr.p_double[i], _state), "MinCGSetPrecDiag: D contains infinite or NAN elements", _state); ae_assert(ae_fp_greater(d->ptr.p_double[i],(double)(0)), "MinCGSetPrecDiag: D contains non-positive elements", _state); } mincgsetprecdiagfast(state, d, _state); } /************************************************************************* Modification of the preconditioner: scale-based diagonal preconditioning. This preconditioning mode can be useful when you don't have approximate diagonal of Hessian, but you know that your variables are badly scaled (for example, one variable is in [1,10], and another in [1000,100000]), and most part of the ill-conditioning comes from different scales of vars. In this case simple scale-based preconditioner, with H[i] = 1/(s[i]^2), can greatly improve convergence. 
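
For illustration only (not part of the library source), a sketch of the usual
call sequence with the alglib:: C++ wrappers, assuming two variables of very
different magnitude: scales are set first, then scale-based preconditioning is
activated:

    alglib::real_1d_array x = "[2.0, 40000.0]";
    alglib::real_1d_array s = "[1.0, 10000.0]";      // per-variable scales
    alglib::mincgstate state;
    alglib::mincgcreate(x, state);
    alglib::mincgsetscale(state, s);                 // scales must be known...
    alglib::mincgsetprecscale(state);                // ...for H[i]=1/(s[i]^2) to be useful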
IMPORTANT: you should set scale of your variables with MinCGSetScale() call (before or after MinCGSetPrecScale() call). Without knowledge of the scale of your variables scale-based preconditioner will be just unit matrix. INPUT PARAMETERS: State - structure which stores algorithm state NOTE: you can change preconditioner "on the fly", during algorithm iterations. -- ALGLIB -- Copyright 13.10.2010 by Bochkanov Sergey *************************************************************************/ void mincgsetprecscale(mincgstate* state, ae_state *_state) { state->prectype = 3; state->innerresetneeded = ae_true; } /************************************************************************* NOTES: 1. This function has two different implementations: one which uses exact (analytical) user-supplied gradient, and one which uses function value only and numerically differentiates function in order to obtain gradient. Depending on the specific function used to create optimizer object (either MinCGCreate() for analytical gradient or MinCGCreateF() for numerical differentiation) you should choose appropriate variant of MinCGOptimize() - one which accepts function AND gradient or one which accepts function ONLY. Be careful to choose variant of MinCGOptimize() which corresponds to your optimization scheme! Table below lists different combinations of callback (function/gradient) passed to MinCGOptimize() and specific function used to create optimizer.

                     |       USER PASSED TO MinCGOptimize()
    CREATED WITH     |   function only   |   function and gradient
    ------------------------------------------------------------
    MinCGCreateF()   |       work                  FAIL
    MinCGCreate()    |       FAIL                  work

Here "FAIL" denotes inappropriate combinations of optimizer creation function and MinCGOptimize() version. Attempts to use such combination (for example, to create optimizer with MinCGCreateF() and to pass gradient information to MinCGOptimize()) will lead to exception being thrown. Either you did not pass gradient when it WAS needed or you passed gradient when it was NOT needed. -- ALGLIB -- Copyright 20.04.2009 by Bochkanov Sergey *************************************************************************/ ae_bool mincgiteration(mincgstate* state, ae_state *_state) { ae_int_t n; ae_int_t i; double betak; double v; double vv; ae_bool result; /* * Reverse communication preparations * I know it looks ugly, but it works the same way * anywhere from C++ to Python.
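 *
 * Schematically, a caller drives this reverse-communication core with a
 * loop of the following form (the C++ mincgoptimize() wrapper performs
 * essentially these steps on the user's behalf; the fragment below is an
 * illustrative sketch only, with the evaluation steps written in prose):
 *
 *     while( mincgiteration(state, _state) )
 *     {
 *         if( state->needfg )   { ...evaluate F and G at state->x, store to state->f and state->g... continue; }
 *         if( state->needf )    { ...evaluate F at state->x, store to state->f... continue; }
 *         if( state->xupdated ) { ...report current point state->x to the user... continue; }
 *     }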
* * This code initializes locals by: * * random values determined during code * generation - on first subroutine call * * values from previous call - on subsequent calls */ if( state->rstate.stage>=0 ) { n = state->rstate.ia.ptr.p_int[0]; i = state->rstate.ia.ptr.p_int[1]; betak = state->rstate.ra.ptr.p_double[0]; v = state->rstate.ra.ptr.p_double[1]; vv = state->rstate.ra.ptr.p_double[2]; } else { n = 359; i = -58; betak = -919; v = -909; vv = 81; } if( state->rstate.stage==0 ) { goto lbl_0; } if( state->rstate.stage==1 ) { goto lbl_1; } if( state->rstate.stage==2 ) { goto lbl_2; } if( state->rstate.stage==3 ) { goto lbl_3; } if( state->rstate.stage==4 ) { goto lbl_4; } if( state->rstate.stage==5 ) { goto lbl_5; } if( state->rstate.stage==6 ) { goto lbl_6; } if( state->rstate.stage==7 ) { goto lbl_7; } if( state->rstate.stage==8 ) { goto lbl_8; } if( state->rstate.stage==9 ) { goto lbl_9; } if( state->rstate.stage==10 ) { goto lbl_10; } if( state->rstate.stage==11 ) { goto lbl_11; } if( state->rstate.stage==12 ) { goto lbl_12; } if( state->rstate.stage==13 ) { goto lbl_13; } if( state->rstate.stage==14 ) { goto lbl_14; } if( state->rstate.stage==15 ) { goto lbl_15; } if( state->rstate.stage==16 ) { goto lbl_16; } if( state->rstate.stage==17 ) { goto lbl_17; } if( state->rstate.stage==18 ) { goto lbl_18; } if( state->rstate.stage==19 ) { goto lbl_19; } /* * Routine body */ /* * Prepare */ n = state->n; state->terminationneeded = ae_false; state->userterminationneeded = ae_false; state->repterminationtype = 0; state->repiterationscount = 0; state->repvaridx = -1; state->repnfev = 0; state->debugrestartscount = 0; /* * Check, that transferred derivative value is right */ mincg_clearrequestfields(state, _state); if( !(ae_fp_eq(state->diffstep,(double)(0))&&ae_fp_greater(state->teststep,(double)(0))) ) { goto lbl_20; } state->needfg = ae_true; i = 0; lbl_22: if( i>n-1 ) { goto lbl_24; } v = state->x.ptr.p_double[i]; state->x.ptr.p_double[i] = v-state->teststep*state->s.ptr.p_double[i]; state->rstate.stage = 0; goto lbl_rcomm; lbl_0: state->fm1 = state->f; state->fp1 = state->g.ptr.p_double[i]; state->x.ptr.p_double[i] = v+state->teststep*state->s.ptr.p_double[i]; state->rstate.stage = 1; goto lbl_rcomm; lbl_1: state->fm2 = state->f; state->fp2 = state->g.ptr.p_double[i]; state->x.ptr.p_double[i] = v; state->rstate.stage = 2; goto lbl_rcomm; lbl_2: /* * 2*State.TestStep - scale parameter * width of segment [Xi-TestStep;Xi+TestStep] */ if( !derivativecheck(state->fm1, state->fp1, state->fm2, state->fp2, state->f, state->g.ptr.p_double[i], 2*state->teststep, _state) ) { state->repvaridx = i; state->repterminationtype = -7; result = ae_false; return result; } i = i+1; goto lbl_22; lbl_24: state->needfg = ae_false; lbl_20: /* * Preparations continue: * * set XK * * calculate F/G * * set DK to -G * * powerup algo (it may change preconditioner) * * apply preconditioner to DK * * report update of X * * check stopping conditions for G */ ae_v_move(&state->xk.ptr.p_double[0], 1, &state->x.ptr.p_double[0], 1, ae_v_len(0,n-1)); mincg_clearrequestfields(state, _state); if( ae_fp_neq(state->diffstep,(double)(0)) ) { goto lbl_25; } state->needfg = ae_true; state->rstate.stage = 3; goto lbl_rcomm; lbl_3: state->needfg = ae_false; goto lbl_26; lbl_25: state->needf = ae_true; state->rstate.stage = 4; goto lbl_rcomm; lbl_4: state->fbase = state->f; i = 0; lbl_27: if( i>n-1 ) { goto lbl_29; } v = state->x.ptr.p_double[i]; state->x.ptr.p_double[i] = v-state->diffstep*state->s.ptr.p_double[i]; state->rstate.stage = 
5; goto lbl_rcomm; lbl_5: state->fm2 = state->f; state->x.ptr.p_double[i] = v-0.5*state->diffstep*state->s.ptr.p_double[i]; state->rstate.stage = 6; goto lbl_rcomm; lbl_6: state->fm1 = state->f; state->x.ptr.p_double[i] = v+0.5*state->diffstep*state->s.ptr.p_double[i]; state->rstate.stage = 7; goto lbl_rcomm; lbl_7: state->fp1 = state->f; state->x.ptr.p_double[i] = v+state->diffstep*state->s.ptr.p_double[i]; state->rstate.stage = 8; goto lbl_rcomm; lbl_8: state->fp2 = state->f; state->x.ptr.p_double[i] = v; state->g.ptr.p_double[i] = (8*(state->fp1-state->fm1)-(state->fp2-state->fm2))/(6*state->diffstep*state->s.ptr.p_double[i]); i = i+1; goto lbl_27; lbl_29: state->f = state->fbase; state->needf = ae_false; lbl_26: if( !state->drep ) { goto lbl_30; } /* * Report algorithm powerup (if needed) */ mincg_clearrequestfields(state, _state); state->algpowerup = ae_true; state->rstate.stage = 9; goto lbl_rcomm; lbl_9: state->algpowerup = ae_false; lbl_30: trimprepare(state->f, &state->trimthreshold, _state); ae_v_moveneg(&state->dk.ptr.p_double[0], 1, &state->g.ptr.p_double[0], 1, ae_v_len(0,n-1)); mincg_preconditionedmultiply(state, &state->dk, &state->work0, &state->work1, _state); if( !state->xrep ) { goto lbl_32; } mincg_clearrequestfields(state, _state); state->xupdated = ae_true; state->rstate.stage = 10; goto lbl_rcomm; lbl_10: state->xupdated = ae_false; lbl_32: if( state->terminationneeded||state->userterminationneeded ) { /* * Combined termination point for "internal" termination by TerminationNeeded flag * and for "user" termination by MinCGRequestTermination() (UserTerminationNeeded flag). * In this location rules for both of methods are same, thus only one exit point is needed. */ ae_v_move(&state->xn.ptr.p_double[0], 1, &state->xk.ptr.p_double[0], 1, ae_v_len(0,n-1)); state->repterminationtype = 8; result = ae_false; return result; } v = (double)(0); for(i=0; i<=n-1; i++) { v = v+ae_sqr(state->g.ptr.p_double[i]*state->s.ptr.p_double[i], _state); } if( ae_fp_less_eq(ae_sqrt(v, _state),state->epsg) ) { ae_v_move(&state->xn.ptr.p_double[0], 1, &state->xk.ptr.p_double[0], 1, ae_v_len(0,n-1)); state->repterminationtype = 4; result = ae_false; return result; } state->repnfev = 1; state->k = 0; state->fold = state->f; /* * Choose initial step. * Apply preconditioner, if we have something other than default. 
*/ if( state->prectype==2||state->prectype==3 ) { /* * because we use preconditioner, step length must be equal * to the norm of DK */ v = ae_v_dotproduct(&state->dk.ptr.p_double[0], 1, &state->dk.ptr.p_double[0], 1, ae_v_len(0,n-1)); state->lastgoodstep = ae_sqrt(v, _state); } else { /* * No preconditioner is used, we try to use suggested step */ if( ae_fp_greater(state->suggestedstep,(double)(0)) ) { state->lastgoodstep = state->suggestedstep; } else { state->lastgoodstep = 1.0; } } /* * Main cycle */ state->rstimer = mincg_rscountdownlen; lbl_34: if( ae_false ) { goto lbl_35; } /* * * clear reset flag * * clear termination flag * * store G[k] for later calculation of Y[k] * * prepare starting point and direction and step length for line search */ state->innerresetneeded = ae_false; state->terminationneeded = ae_false; ae_v_moveneg(&state->yk.ptr.p_double[0], 1, &state->g.ptr.p_double[0], 1, ae_v_len(0,n-1)); ae_v_move(&state->d.ptr.p_double[0], 1, &state->dk.ptr.p_double[0], 1, ae_v_len(0,n-1)); ae_v_move(&state->x.ptr.p_double[0], 1, &state->xk.ptr.p_double[0], 1, ae_v_len(0,n-1)); state->mcstage = 0; state->stp = 1.0; linminnormalized(&state->d, &state->stp, n, _state); if( ae_fp_neq(state->lastgoodstep,(double)(0)) ) { state->stp = state->lastgoodstep; } state->curstpmax = state->stpmax; /* * Report beginning of line search (if needed) * Terminate algorithm, if user request was detected */ if( !state->drep ) { goto lbl_36; } mincg_clearrequestfields(state, _state); state->lsstart = ae_true; state->rstate.stage = 11; goto lbl_rcomm; lbl_11: state->lsstart = ae_false; lbl_36: if( state->terminationneeded ) { ae_v_move(&state->xn.ptr.p_double[0], 1, &state->x.ptr.p_double[0], 1, ae_v_len(0,n-1)); state->repterminationtype = 8; result = ae_false; return result; } /* * Minimization along D */ mcsrch(n, &state->x, &state->f, &state->g, &state->d, &state->stp, state->curstpmax, mincg_gtol, &state->mcinfo, &state->nfev, &state->work0, &state->lstate, &state->mcstage, _state); lbl_38: if( state->mcstage==0 ) { goto lbl_39; } /* * Calculate function/gradient using either * analytical gradient supplied by user * or finite difference approximation. * * "Trim" function in order to handle near-singularity points. 
*/ mincg_clearrequestfields(state, _state); if( ae_fp_neq(state->diffstep,(double)(0)) ) { goto lbl_40; } state->needfg = ae_true; state->rstate.stage = 12; goto lbl_rcomm; lbl_12: state->needfg = ae_false; goto lbl_41; lbl_40: state->needf = ae_true; state->rstate.stage = 13; goto lbl_rcomm; lbl_13: state->fbase = state->f; i = 0; lbl_42: if( i>n-1 ) { goto lbl_44; } v = state->x.ptr.p_double[i]; state->x.ptr.p_double[i] = v-state->diffstep*state->s.ptr.p_double[i]; state->rstate.stage = 14; goto lbl_rcomm; lbl_14: state->fm2 = state->f; state->x.ptr.p_double[i] = v-0.5*state->diffstep*state->s.ptr.p_double[i]; state->rstate.stage = 15; goto lbl_rcomm; lbl_15: state->fm1 = state->f; state->x.ptr.p_double[i] = v+0.5*state->diffstep*state->s.ptr.p_double[i]; state->rstate.stage = 16; goto lbl_rcomm; lbl_16: state->fp1 = state->f; state->x.ptr.p_double[i] = v+state->diffstep*state->s.ptr.p_double[i]; state->rstate.stage = 17; goto lbl_rcomm; lbl_17: state->fp2 = state->f; state->x.ptr.p_double[i] = v; state->g.ptr.p_double[i] = (8*(state->fp1-state->fm1)-(state->fp2-state->fm2))/(6*state->diffstep*state->s.ptr.p_double[i]); i = i+1; goto lbl_42; lbl_44: state->f = state->fbase; state->needf = ae_false; lbl_41: trimfunction(&state->f, &state->g, n, state->trimthreshold, _state); /* * Call MCSRCH again */ mcsrch(n, &state->x, &state->f, &state->g, &state->d, &state->stp, state->curstpmax, mincg_gtol, &state->mcinfo, &state->nfev, &state->work0, &state->lstate, &state->mcstage, _state); goto lbl_38; lbl_39: /* * * terminate algorithm if "user" request for detected * * report end of line search * * store current point to XN * * report iteration * * terminate algorithm if "internal" request was detected */ if( state->userterminationneeded ) { ae_v_move(&state->xn.ptr.p_double[0], 1, &state->xk.ptr.p_double[0], 1, ae_v_len(0,n-1)); state->repterminationtype = 8; result = ae_false; return result; } if( !state->drep ) { goto lbl_45; } /* * Report end of line search (if needed) */ mincg_clearrequestfields(state, _state); state->lsend = ae_true; state->rstate.stage = 18; goto lbl_rcomm; lbl_18: state->lsend = ae_false; lbl_45: ae_v_move(&state->xn.ptr.p_double[0], 1, &state->x.ptr.p_double[0], 1, ae_v_len(0,n-1)); if( !state->xrep ) { goto lbl_47; } mincg_clearrequestfields(state, _state); state->xupdated = ae_true; state->rstate.stage = 19; goto lbl_rcomm; lbl_19: state->xupdated = ae_false; lbl_47: if( state->terminationneeded ) { ae_v_move(&state->xn.ptr.p_double[0], 1, &state->x.ptr.p_double[0], 1, ae_v_len(0,n-1)); state->repterminationtype = 8; result = ae_false; return result; } /* * Line search is finished. * * calculate BetaK * * calculate DN * * update timers * * calculate step length: * * LastScaledStep is ALWAYS calculated because it is used in the stopping criteria * * LastGoodStep is updated only when MCINFO is equal to 1 (Wolfe conditions hold). * See below for more explanation. 
*/ if( state->mcinfo==1&&!state->innerresetneeded ) { /* * Standard Wolfe conditions hold * Calculate Y[K] and D[K]'*Y[K] */ ae_v_add(&state->yk.ptr.p_double[0], 1, &state->g.ptr.p_double[0], 1, ae_v_len(0,n-1)); vv = ae_v_dotproduct(&state->yk.ptr.p_double[0], 1, &state->dk.ptr.p_double[0], 1, ae_v_len(0,n-1)); /* * Calculate BetaK according to DY formula */ v = mincg_preconditionedmultiply2(state, &state->g, &state->g, &state->work0, &state->work1, _state); state->betady = v/vv; /* * Calculate BetaK according to HS formula */ v = mincg_preconditionedmultiply2(state, &state->g, &state->yk, &state->work0, &state->work1, _state); state->betahs = v/vv; /* * Choose BetaK */ if( state->cgtype==0 ) { betak = state->betady; } if( state->cgtype==1 ) { betak = ae_maxreal((double)(0), ae_minreal(state->betady, state->betahs, _state), _state); } } else { /* * Something is wrong (may be function is too wild or too flat) * or we just have to restart algo. * * We'll set BetaK=0, which will restart CG algorithm. * We can stop later (during normal checks) if stopping conditions are met. */ betak = (double)(0); state->debugrestartscount = state->debugrestartscount+1; } if( state->repiterationscount>0&&state->repiterationscount%(3+n)==0 ) { /* * clear Beta every N iterations */ betak = (double)(0); } if( state->mcinfo==1||state->mcinfo==5 ) { state->rstimer = mincg_rscountdownlen; } else { state->rstimer = state->rstimer-1; } ae_v_moveneg(&state->dn.ptr.p_double[0], 1, &state->g.ptr.p_double[0], 1, ae_v_len(0,n-1)); mincg_preconditionedmultiply(state, &state->dn, &state->work0, &state->work1, _state); ae_v_addd(&state->dn.ptr.p_double[0], 1, &state->dk.ptr.p_double[0], 1, ae_v_len(0,n-1), betak); state->lastscaledstep = 0.0; for(i=0; i<=n-1; i++) { state->lastscaledstep = state->lastscaledstep+ae_sqr(state->d.ptr.p_double[i]/state->s.ptr.p_double[i], _state); } state->lastscaledstep = state->stp*ae_sqrt(state->lastscaledstep, _state); if( state->mcinfo==1 ) { /* * Step is good (Wolfe conditions hold), update LastGoodStep. * * This check for MCINFO=1 is essential because sometimes in the * constrained optimization setting we may take very short steps * (like 1E-15) because we were very close to boundary of the * feasible area. Such short step does not mean that we've converged * to the solution - it was so short because we were close to the * boundary and there was a limit on step length. * * So having such short step is quite normal situation. However, we * should NOT start next iteration from step whose initial length is * estimated as 1E-15 because it may lead to the failure of the * linear minimizer (step is too short, function does not changes, * line search stagnates). */ state->lastgoodstep = (double)(0); for(i=0; i<=n-1; i++) { state->lastgoodstep = state->lastgoodstep+ae_sqr(state->d.ptr.p_double[i], _state); } state->lastgoodstep = state->stp*ae_sqrt(state->lastgoodstep, _state); } /* * Update information. * Check stopping conditions. 
*/ v = (double)(0); for(i=0; i<=n-1; i++) { v = v+ae_sqr(state->g.ptr.p_double[i]*state->s.ptr.p_double[i], _state); } if( !ae_isfinite(v, _state)||!ae_isfinite(state->f, _state) ) { /* * Abnormal termination - infinities in function/gradient */ state->repterminationtype = -8; result = ae_false; return result; } state->repnfev = state->repnfev+state->nfev; state->repiterationscount = state->repiterationscount+1; if( state->repiterationscount>=state->maxits&&state->maxits>0 ) { /* * Too many iterations */ state->repterminationtype = 5; result = ae_false; return result; } if( ae_fp_less_eq(ae_sqrt(v, _state),state->epsg) ) { /* * Gradient is small enough */ state->repterminationtype = 4; result = ae_false; return result; } if( !state->innerresetneeded ) { /* * These conditions are checked only when no inner reset was requested by user */ if( ae_fp_less_eq(state->fold-state->f,state->epsf*ae_maxreal(ae_fabs(state->fold, _state), ae_maxreal(ae_fabs(state->f, _state), 1.0, _state), _state)) ) { /* * F(k+1)-F(k) is small enough */ state->repterminationtype = 1; result = ae_false; return result; } if( ae_fp_less_eq(state->lastscaledstep,state->epsx) ) { /* * X(k+1)-X(k) is small enough */ state->repterminationtype = 2; result = ae_false; return result; } } if( state->rstimer<=0 ) { /* * Too many subsequent restarts */ state->repterminationtype = 7; result = ae_false; return result; } /* * Shift Xk/Dk, update other information */ ae_v_move(&state->xk.ptr.p_double[0], 1, &state->xn.ptr.p_double[0], 1, ae_v_len(0,n-1)); ae_v_move(&state->dk.ptr.p_double[0], 1, &state->dn.ptr.p_double[0], 1, ae_v_len(0,n-1)); state->fold = state->f; state->k = state->k+1; goto lbl_34; lbl_35: result = ae_false; return result; /* * Saving state */ lbl_rcomm: result = ae_true; state->rstate.ia.ptr.p_int[0] = n; state->rstate.ia.ptr.p_int[1] = i; state->rstate.ra.ptr.p_double[0] = betak; state->rstate.ra.ptr.p_double[1] = v; state->rstate.ra.ptr.p_double[2] = vv; return result; } /************************************************************************* Conjugate gradient results INPUT PARAMETERS: State - algorithm state OUTPUT PARAMETERS: X - array[0..N-1], solution Rep - optimization report: * Rep.TerminationType completetion code: * -8 internal integrity control detected infinite or NAN values in function/gradient. Abnormal termination signalled. * -7 gradient verification failed. See MinCGSetGradientCheck() for more information. * 1 relative function improvement is no more than EpsF. * 2 relative step is no more than EpsX. * 4 gradient norm is no more than EpsG * 5 MaxIts steps was taken * 7 stopping conditions are too stringent, further improvement is impossible, we return best X found so far * 8 terminated by user * Rep.IterationsCount contains iterations count * NFEV countains number of function calculations -- ALGLIB -- Copyright 20.04.2009 by Bochkanov Sergey *************************************************************************/ void mincgresults(mincgstate* state, /* Real */ ae_vector* x, mincgreport* rep, ae_state *_state) { ae_vector_clear(x); _mincgreport_clear(rep); mincgresultsbuf(state, x, rep, _state); } /************************************************************************* Conjugate gradient results Buffered implementation of MinCGResults(), which uses pre-allocated buffer to store X[]. If buffer size is too small, it resizes buffer. It is intended to be used in the inner cycles of performance critical algorithms where array reallocation penalty is too large to be ignored. 
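A minimal usage sketch, assuming the C++ 'friendly' wrappers alglib::mincgresults() and alglib::mincgresultsbuf() (exact wrapper signatures are only sketched here):

    alglib::real_1d_array x;
    alglib::mincgreport rep;
    alglib::mincgresults(state, x, rep);     // allocates a fresh X on every call
    // ... inside a performance-critical loop, reuse the already allocated X instead:
    alglib::mincgresultsbuf(state, x, rep);  // resizes X only if it is too small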
-- ALGLIB -- Copyright 20.04.2009 by Bochkanov Sergey *************************************************************************/ void mincgresultsbuf(mincgstate* state, /* Real */ ae_vector* x, mincgreport* rep, ae_state *_state) { if( x->cnt<state->n ) { ae_vector_set_length(x, state->n, _state); } ae_v_move(&x->ptr.p_double[0], 1, &state->xn.ptr.p_double[0], 1, ae_v_len(0,state->n-1)); rep->iterationscount = state->repiterationscount; rep->nfev = state->repnfev; rep->varidx = state->repvaridx; rep->terminationtype = state->repterminationtype; } /************************************************************************* This subroutine restarts CG algorithm from new point. All optimization parameters are left unchanged. This function allows to solve multiple optimization problems (which must have same number of dimensions) without object reallocation penalty. INPUT PARAMETERS: State - structure used to store algorithm state. X - new starting point. -- ALGLIB -- Copyright 30.07.2010 by Bochkanov Sergey *************************************************************************/ void mincgrestartfrom(mincgstate* state, /* Real */ ae_vector* x, ae_state *_state) { ae_assert(x->cnt>=state->n, "MinCGRestartFrom: Length(X)<N!", _state); ae_assert(isfinitevector(x, state->n, _state), "MinCGCreate: X contains infinite or NaN values!", _state); ae_v_move(&state->x.ptr.p_double[0], 1, &x->ptr.p_double[0], 1, ae_v_len(0,state->n-1)); mincgsuggeststep(state, 0.0, _state); ae_vector_set_length(&state->rstate.ia, 1+1, _state); ae_vector_set_length(&state->rstate.ra, 2+1, _state); state->rstate.stage = -1; mincg_clearrequestfields(state, _state); } /************************************************************************* This subroutine submits request for termination of running optimizer. It should be called from user-supplied callback when user decides that it is time to "smoothly" terminate optimization process. As result, optimizer stops at point which was "current accepted" when termination request was submitted and returns error code 8 (successful termination). INPUT PARAMETERS: State - optimizer structure NOTE: after request for termination optimizer may perform several additional calls to user-supplied callbacks. It does NOT guarantee to stop immediately - it just guarantees that these additional calls will be discarded later. NOTE: calling this function on optimizer which is NOT running will have no effect. NOTE: multiple calls to this function are possible. First call is counted, subsequent calls are silently ignored. -- ALGLIB -- Copyright 08.10.2014 by Bochkanov Sergey *************************************************************************/ void mincgrequesttermination(mincgstate* state, ae_state *_state) { state->userterminationneeded = ae_true; } /************************************************************************* Faster version of MinCGSetPrecDiag(), for time-critical parts of code, without safety checks.
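Illustrative note: the checked MinCGSetPrecDiag() variant additionally validates D (finite, strictly positive entries, sufficient length) before installing it, in the same way MinBLEICSetPrecDiag() below does. A caller of this fast variant is expected to guarantee the same, e.g. (sketch only):

    for(i=0; i<=n-1; i++)
    {
        ae_assert(ae_isfinite(d->ptr.p_double[i], _state), "D contains infinite or NAN elements", _state);
        ae_assert(ae_fp_greater(d->ptr.p_double[i],(double)(0)), "D contains non-positive elements", _state);
    }
    mincgsetprecdiagfast(state, d, _state);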
-- ALGLIB -- Copyright 13.10.2010 by Bochkanov Sergey *************************************************************************/ void mincgsetprecdiagfast(mincgstate* state, /* Real */ ae_vector* d, ae_state *_state) { ae_int_t i; rvectorsetlengthatleast(&state->diagh, state->n, _state); rvectorsetlengthatleast(&state->diaghl2, state->n, _state); state->prectype = 2; state->vcnt = 0; state->innerresetneeded = ae_true; for(i=0; i<=state->n-1; i++) { state->diagh.ptr.p_double[i] = d->ptr.p_double[i]; state->diaghl2.ptr.p_double[i] = 0.0; } } /************************************************************************* This function sets low-rank preconditioner for Hessian matrix H=D+V'*C*V, where: * H is a Hessian matrix, which is approximated by D/V/C * D=D1+D2 is a diagonal matrix, which includes two positive definite terms: * constant term D1 (is not updated or infrequently updated) * variable term D2 (can be cheaply updated from iteration to iteration) * V is a low-rank correction * C is a diagonal factor of low-rank correction Preconditioner P is calculated using approximate Woodburry formula: P = D^(-1) - D^(-1)*V'*(C^(-1)+V*D1^(-1)*V')^(-1)*V*D^(-1) = D^(-1) - D^(-1)*VC'*VC*D^(-1), where VC = sqrt(B)*V B = (C^(-1)+V*D1^(-1)*V')^(-1) Note that B is calculated using constant term (D1) only, which allows us to update D2 without recalculation of B or VC. Such preconditioner is exact when D2 is zero. When D2 is non-zero, it is only approximation, but very good and cheap one. This function accepts D1, V, C. D2 is set to zero by default. Cost of this update is O(N*VCnt*VCnt), but D2 can be updated in just O(N) by MinCGSetPrecVarPart. -- ALGLIB -- Copyright 13.10.2010 by Bochkanov Sergey *************************************************************************/ void mincgsetpreclowrankfast(mincgstate* state, /* Real */ ae_vector* d1, /* Real */ ae_vector* c, /* Real */ ae_matrix* v, ae_int_t vcnt, ae_state *_state) { ae_frame _frame_block; ae_int_t i; ae_int_t j; ae_int_t k; ae_int_t n; double t; ae_matrix b; ae_frame_make(_state, &_frame_block); ae_matrix_init(&b, 0, 0, DT_REAL, _state); if( vcnt==0 ) { mincgsetprecdiagfast(state, d1, _state); ae_frame_leave(_state); return; } n = state->n; ae_matrix_set_length(&b, vcnt, vcnt, _state); rvectorsetlengthatleast(&state->diagh, n, _state); rvectorsetlengthatleast(&state->diaghl2, n, _state); rmatrixsetlengthatleast(&state->vcorr, vcnt, n, _state); state->prectype = 2; state->vcnt = vcnt; state->innerresetneeded = ae_true; for(i=0; i<=n-1; i++) { state->diagh.ptr.p_double[i] = d1->ptr.p_double[i]; state->diaghl2.ptr.p_double[i] = 0.0; } for(i=0; i<=vcnt-1; i++) { for(j=i; j<=vcnt-1; j++) { t = (double)(0); for(k=0; k<=n-1; k++) { t = t+v->ptr.pp_double[i][k]*v->ptr.pp_double[j][k]/d1->ptr.p_double[k]; } b.ptr.pp_double[i][j] = t; } b.ptr.pp_double[i][i] = b.ptr.pp_double[i][i]+1.0/c->ptr.p_double[i]; } if( !spdmatrixcholeskyrec(&b, 0, vcnt, ae_true, &state->work0, _state) ) { state->vcnt = 0; ae_frame_leave(_state); return; } for(i=0; i<=vcnt-1; i++) { ae_v_move(&state->vcorr.ptr.pp_double[i][0], 1, &v->ptr.pp_double[i][0], 1, ae_v_len(0,n-1)); for(j=0; j<=i-1; j++) { t = b.ptr.pp_double[j][i]; ae_v_subd(&state->vcorr.ptr.pp_double[i][0], 1, &state->vcorr.ptr.pp_double[j][0], 1, ae_v_len(0,n-1), t); } t = 1/b.ptr.pp_double[i][i]; ae_v_muld(&state->vcorr.ptr.pp_double[i][0], 1, ae_v_len(0,n-1), t); } ae_frame_leave(_state); } /************************************************************************* This function updates variable part (diagonal 
matrix D2) of low-rank preconditioner. This update is very cheap and takes just O(N) time. It has no effect with default preconditioner. -- ALGLIB -- Copyright 13.10.2010 by Bochkanov Sergey *************************************************************************/ void mincgsetprecvarpart(mincgstate* state, /* Real */ ae_vector* d2, ae_state *_state) { ae_int_t i; ae_int_t n; n = state->n; for(i=0; i<=n-1; i++) { state->diaghl2.ptr.p_double[i] = d2->ptr.p_double[i]; } } /************************************************************************* This subroutine turns on verification of the user-supplied analytic gradient: * user calls this subroutine before optimization begins * MinCGOptimize() is called * prior to actual optimization, for each component of parameters being optimized X[i] algorithm performs following steps: * two trial steps are made to X[i]-TestStep*S[i] and X[i]+TestStep*S[i], where X[i] is i-th component of the initial point and S[i] is a scale of i-th parameter * F(X) is evaluated at these trial points * we perform one more evaluation in the middle point of the interval * we build cubic model using function values and derivatives at trial points and we compare its prediction with actual value in the middle point * in case difference between prediction and actual value is higher than some predetermined threshold, algorithm stops with completion code -7; Rep.VarIdx is set to index of the parameter with incorrect derivative. * after verification is over, algorithm proceeds to the actual optimization. NOTE 1: verification needs N (parameters count) gradient evaluations. It is very costly and you should use it only for low dimensional problems, when you want to be sure that you've correctly calculated analytic derivatives. You should not use it in the production code (unless you want to check derivatives provided by some third party). NOTE 2: you should carefully choose TestStep. Value which is too large (so large that function behaviour is significantly non-cubic) will lead to false alarms. You may use different step for different parameters by means of setting scale with MinCGSetScale(). NOTE 3: this function may lead to false positives. In case it reports that I-th derivative was calculated incorrectly, you may decrease test step and try one more time - maybe your function changes too sharply and your step is too large for such rapidly chanding function. 
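A minimal sketch of enabling the check, assuming the C++ 'friendly' wrappers (wrapper signatures are only sketched here; my_grad denotes a user-supplied gradient callback):

    alglib::mincgsetgradientcheck(state, 1.0E-3);  // TestStep=1.0E-3, scaled per-variable by S[i]
    alglib::mincgoptimize(state, my_grad);         // verification runs before the actual optimization
    alglib::mincgresults(state, x, rep);
    if( rep.terminationtype==-7 )
        printf("suspicious derivative: component %d\n", (int)rep.varidx);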
INPUT PARAMETERS: State - structure used to store algorithm state TestStep - verification step: * TestStep=0 turns verification off * TestStep>0 activates verification -- ALGLIB -- Copyright 31.05.2012 by Bochkanov Sergey *************************************************************************/ void mincgsetgradientcheck(mincgstate* state, double teststep, ae_state *_state) { ae_assert(ae_isfinite(teststep, _state), "MinCGSetGradientCheck: TestStep contains NaN or Infinite", _state); ae_assert(ae_fp_greater_eq(teststep,(double)(0)), "MinCGSetGradientCheck: invalid argument TestStep(TestStep<0)", _state); state->teststep = teststep; } /************************************************************************* Clears request fileds (to be sure that we don't forgot to clear something) *************************************************************************/ static void mincg_clearrequestfields(mincgstate* state, ae_state *_state) { state->needf = ae_false; state->needfg = ae_false; state->xupdated = ae_false; state->lsstart = ae_false; state->lsend = ae_false; state->algpowerup = ae_false; } /************************************************************************* This function calculates preconditioned product H^(-1)*x and stores result back into X. Work0[] and Work1[] are used as temporaries (size must be at least N; this function doesn't allocate arrays). -- ALGLIB -- Copyright 13.10.2010 by Bochkanov Sergey *************************************************************************/ static void mincg_preconditionedmultiply(mincgstate* state, /* Real */ ae_vector* x, /* Real */ ae_vector* work0, /* Real */ ae_vector* work1, ae_state *_state) { ae_int_t i; ae_int_t n; ae_int_t vcnt; double v; n = state->n; vcnt = state->vcnt; if( state->prectype==0 ) { return; } if( state->prectype==3 ) { for(i=0; i<=n-1; i++) { x->ptr.p_double[i] = x->ptr.p_double[i]*state->s.ptr.p_double[i]*state->s.ptr.p_double[i]; } return; } ae_assert(state->prectype==2, "MinCG: internal error (unexpected PrecType)", _state); /* * handle part common for VCnt=0 and VCnt<>0 */ for(i=0; i<=n-1; i++) { x->ptr.p_double[i] = x->ptr.p_double[i]/(state->diagh.ptr.p_double[i]+state->diaghl2.ptr.p_double[i]); } /* * if VCnt>0 */ if( vcnt>0 ) { for(i=0; i<=vcnt-1; i++) { v = ae_v_dotproduct(&state->vcorr.ptr.pp_double[i][0], 1, &x->ptr.p_double[0], 1, ae_v_len(0,n-1)); work0->ptr.p_double[i] = v; } for(i=0; i<=n-1; i++) { work1->ptr.p_double[i] = (double)(0); } for(i=0; i<=vcnt-1; i++) { v = work0->ptr.p_double[i]; ae_v_addd(&state->work1.ptr.p_double[0], 1, &state->vcorr.ptr.pp_double[i][0], 1, ae_v_len(0,n-1), v); } for(i=0; i<=n-1; i++) { x->ptr.p_double[i] = x->ptr.p_double[i]-state->work1.ptr.p_double[i]/(state->diagh.ptr.p_double[i]+state->diaghl2.ptr.p_double[i]); } } } /************************************************************************* This function calculates preconditioned product x'*H^(-1)*y. Work0[] and Work1[] are used as temporaries (size must be at least N; this function doesn't allocate arrays). 
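Illustrative note on the low-rank case (PrecType=2), in the notation of MinCGSetPrecLowRankFast() above: with D = DiagH+DiagHL2 and VC[j] denoting the rows of VCorr, the returned value is

    x'*H^(-1)*y  =  sum_i x[i]*y[i]/D[i]  -  sum_j ( VC[j]'*D^(-1)*x ) * ( VC[j]'*D^(-1)*y )

i.e. the Woodbury expansion P = D^(-1) - D^(-1)*VC'*VC*D^(-1) evaluated between x and y. For PrecType=0 the result is simply x'*y, and for PrecType=3 (scale-based preconditioning) it is sum_i x[i]*S[i]^2*y[i].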
-- ALGLIB -- Copyright 13.10.2010 by Bochkanov Sergey *************************************************************************/ static double mincg_preconditionedmultiply2(mincgstate* state, /* Real */ ae_vector* x, /* Real */ ae_vector* y, /* Real */ ae_vector* work0, /* Real */ ae_vector* work1, ae_state *_state) { ae_int_t i; ae_int_t n; ae_int_t vcnt; double v0; double v1; double result; n = state->n; vcnt = state->vcnt; /* * no preconditioning */ if( state->prectype==0 ) { v0 = ae_v_dotproduct(&x->ptr.p_double[0], 1, &y->ptr.p_double[0], 1, ae_v_len(0,n-1)); result = v0; return result; } if( state->prectype==3 ) { result = (double)(0); for(i=0; i<=n-1; i++) { result = result+x->ptr.p_double[i]*state->s.ptr.p_double[i]*state->s.ptr.p_double[i]*y->ptr.p_double[i]; } return result; } ae_assert(state->prectype==2, "MinCG: internal error (unexpected PrecType)", _state); /* * low rank preconditioning */ result = 0.0; for(i=0; i<=n-1; i++) { result = result+x->ptr.p_double[i]*y->ptr.p_double[i]/(state->diagh.ptr.p_double[i]+state->diaghl2.ptr.p_double[i]); } if( vcnt>0 ) { for(i=0; i<=n-1; i++) { work0->ptr.p_double[i] = x->ptr.p_double[i]/(state->diagh.ptr.p_double[i]+state->diaghl2.ptr.p_double[i]); work1->ptr.p_double[i] = y->ptr.p_double[i]/(state->diagh.ptr.p_double[i]+state->diaghl2.ptr.p_double[i]); } for(i=0; i<=vcnt-1; i++) { v0 = ae_v_dotproduct(&work0->ptr.p_double[0], 1, &state->vcorr.ptr.pp_double[i][0], 1, ae_v_len(0,n-1)); v1 = ae_v_dotproduct(&work1->ptr.p_double[0], 1, &state->vcorr.ptr.pp_double[i][0], 1, ae_v_len(0,n-1)); result = result-v0*v1; } } return result; } /************************************************************************* Internal initialization subroutine -- ALGLIB -- Copyright 16.05.2011 by Bochkanov Sergey *************************************************************************/ static void mincg_mincginitinternal(ae_int_t n, double diffstep, mincgstate* state, ae_state *_state) { ae_int_t i; /* * Initialize */ state->teststep = (double)(0); state->n = n; state->diffstep = diffstep; state->lastgoodstep = (double)(0); mincgsetcond(state, (double)(0), (double)(0), (double)(0), 0, _state); mincgsetxrep(state, ae_false, _state); mincgsetdrep(state, ae_false, _state); mincgsetstpmax(state, (double)(0), _state); mincgsetcgtype(state, -1, _state); mincgsetprecdefault(state, _state); ae_vector_set_length(&state->xk, n, _state); ae_vector_set_length(&state->dk, n, _state); ae_vector_set_length(&state->xn, n, _state); ae_vector_set_length(&state->dn, n, _state); ae_vector_set_length(&state->x, n, _state); ae_vector_set_length(&state->d, n, _state); ae_vector_set_length(&state->g, n, _state); ae_vector_set_length(&state->work0, n, _state); ae_vector_set_length(&state->work1, n, _state); ae_vector_set_length(&state->yk, n, _state); ae_vector_set_length(&state->s, n, _state); for(i=0; i<=n-1; i++) { state->s.ptr.p_double[i] = 1.0; } } void _mincgstate_init(void* _p, ae_state *_state) { mincgstate *p = (mincgstate*)_p; ae_touch_ptr((void*)p); ae_vector_init(&p->diagh, 0, DT_REAL, _state); ae_vector_init(&p->diaghl2, 0, DT_REAL, _state); ae_matrix_init(&p->vcorr, 0, 0, DT_REAL, _state); ae_vector_init(&p->s, 0, DT_REAL, _state); ae_vector_init(&p->xk, 0, DT_REAL, _state); ae_vector_init(&p->dk, 0, DT_REAL, _state); ae_vector_init(&p->xn, 0, DT_REAL, _state); ae_vector_init(&p->dn, 0, DT_REAL, _state); ae_vector_init(&p->d, 0, DT_REAL, _state); ae_vector_init(&p->yk, 0, DT_REAL, _state); ae_vector_init(&p->x, 0, DT_REAL, _state); ae_vector_init(&p->g, 0, DT_REAL, 
_state); _rcommstate_init(&p->rstate, _state); _linminstate_init(&p->lstate, _state); ae_vector_init(&p->work0, 0, DT_REAL, _state); ae_vector_init(&p->work1, 0, DT_REAL, _state); } void _mincgstate_init_copy(void* _dst, void* _src, ae_state *_state) { mincgstate *dst = (mincgstate*)_dst; mincgstate *src = (mincgstate*)_src; dst->n = src->n; dst->epsg = src->epsg; dst->epsf = src->epsf; dst->epsx = src->epsx; dst->maxits = src->maxits; dst->stpmax = src->stpmax; dst->suggestedstep = src->suggestedstep; dst->xrep = src->xrep; dst->drep = src->drep; dst->cgtype = src->cgtype; dst->prectype = src->prectype; ae_vector_init_copy(&dst->diagh, &src->diagh, _state); ae_vector_init_copy(&dst->diaghl2, &src->diaghl2, _state); ae_matrix_init_copy(&dst->vcorr, &src->vcorr, _state); dst->vcnt = src->vcnt; ae_vector_init_copy(&dst->s, &src->s, _state); dst->diffstep = src->diffstep; dst->nfev = src->nfev; dst->mcstage = src->mcstage; dst->k = src->k; ae_vector_init_copy(&dst->xk, &src->xk, _state); ae_vector_init_copy(&dst->dk, &src->dk, _state); ae_vector_init_copy(&dst->xn, &src->xn, _state); ae_vector_init_copy(&dst->dn, &src->dn, _state); ae_vector_init_copy(&dst->d, &src->d, _state); dst->fold = src->fold; dst->stp = src->stp; dst->curstpmax = src->curstpmax; ae_vector_init_copy(&dst->yk, &src->yk, _state); dst->lastgoodstep = src->lastgoodstep; dst->lastscaledstep = src->lastscaledstep; dst->mcinfo = src->mcinfo; dst->innerresetneeded = src->innerresetneeded; dst->terminationneeded = src->terminationneeded; dst->trimthreshold = src->trimthreshold; dst->rstimer = src->rstimer; ae_vector_init_copy(&dst->x, &src->x, _state); dst->f = src->f; ae_vector_init_copy(&dst->g, &src->g, _state); dst->needf = src->needf; dst->needfg = src->needfg; dst->xupdated = src->xupdated; dst->algpowerup = src->algpowerup; dst->lsstart = src->lsstart; dst->lsend = src->lsend; dst->userterminationneeded = src->userterminationneeded; dst->teststep = src->teststep; _rcommstate_init_copy(&dst->rstate, &src->rstate, _state); dst->repiterationscount = src->repiterationscount; dst->repnfev = src->repnfev; dst->repvaridx = src->repvaridx; dst->repterminationtype = src->repterminationtype; dst->debugrestartscount = src->debugrestartscount; _linminstate_init_copy(&dst->lstate, &src->lstate, _state); dst->fbase = src->fbase; dst->fm2 = src->fm2; dst->fm1 = src->fm1; dst->fp1 = src->fp1; dst->fp2 = src->fp2; dst->betahs = src->betahs; dst->betady = src->betady; ae_vector_init_copy(&dst->work0, &src->work0, _state); ae_vector_init_copy(&dst->work1, &src->work1, _state); } void _mincgstate_clear(void* _p) { mincgstate *p = (mincgstate*)_p; ae_touch_ptr((void*)p); ae_vector_clear(&p->diagh); ae_vector_clear(&p->diaghl2); ae_matrix_clear(&p->vcorr); ae_vector_clear(&p->s); ae_vector_clear(&p->xk); ae_vector_clear(&p->dk); ae_vector_clear(&p->xn); ae_vector_clear(&p->dn); ae_vector_clear(&p->d); ae_vector_clear(&p->yk); ae_vector_clear(&p->x); ae_vector_clear(&p->g); _rcommstate_clear(&p->rstate); _linminstate_clear(&p->lstate); ae_vector_clear(&p->work0); ae_vector_clear(&p->work1); } void _mincgstate_destroy(void* _p) { mincgstate *p = (mincgstate*)_p; ae_touch_ptr((void*)p); ae_vector_destroy(&p->diagh); ae_vector_destroy(&p->diaghl2); ae_matrix_destroy(&p->vcorr); ae_vector_destroy(&p->s); ae_vector_destroy(&p->xk); ae_vector_destroy(&p->dk); ae_vector_destroy(&p->xn); ae_vector_destroy(&p->dn); ae_vector_destroy(&p->d); ae_vector_destroy(&p->yk); ae_vector_destroy(&p->x); ae_vector_destroy(&p->g); _rcommstate_destroy(&p->rstate); 
_linminstate_destroy(&p->lstate); ae_vector_destroy(&p->work0); ae_vector_destroy(&p->work1); } void _mincgreport_init(void* _p, ae_state *_state) { mincgreport *p = (mincgreport*)_p; ae_touch_ptr((void*)p); } void _mincgreport_init_copy(void* _dst, void* _src, ae_state *_state) { mincgreport *dst = (mincgreport*)_dst; mincgreport *src = (mincgreport*)_src; dst->iterationscount = src->iterationscount; dst->nfev = src->nfev; dst->varidx = src->varidx; dst->terminationtype = src->terminationtype; } void _mincgreport_clear(void* _p) { mincgreport *p = (mincgreport*)_p; ae_touch_ptr((void*)p); } void _mincgreport_destroy(void* _p) { mincgreport *p = (mincgreport*)_p; ae_touch_ptr((void*)p); } /************************************************************************* BOUND CONSTRAINED OPTIMIZATION WITH ADDITIONAL LINEAR EQUALITY AND INEQUALITY CONSTRAINTS DESCRIPTION: The subroutine minimizes function F(x) of N arguments subject to any combination of: * bound constraints * linear inequality constraints * linear equality constraints REQUIREMENTS: * user must provide function value and gradient * starting point X0 must be feasible or not too far away from the feasible set * grad(f) must be Lipschitz continuous on a level set: L = { x : f(x)<=f(x0) } * function must be defined everywhere on the feasible set F USAGE: Constrained optimization if far more complex than the unconstrained one. Here we give very brief outline of the BLEIC optimizer. We strongly recommend you to read examples in the ALGLIB Reference Manual and to read ALGLIB User Guide on optimization, which is available at http://www.alglib.net/optimization/ 1. User initializes algorithm state with MinBLEICCreate() call 2. USer adds boundary and/or linear constraints by calling MinBLEICSetBC() and MinBLEICSetLC() functions. 3. User sets stopping conditions with MinBLEICSetCond(). 4. User calls MinBLEICOptimize() function which takes algorithm state and pointer (delegate, etc.) to callback function which calculates F/G. 5. User calls MinBLEICResults() to get solution 6. Optionally user may call MinBLEICRestartFrom() to solve another problem with same N but another starting point. MinBLEICRestartFrom() allows to reuse already initialized structure. NOTE: if you have box-only constraints (no general linear constraints), then MinBC optimizer can be better option. It uses special, faster constraint activation method, which performs better on problems with multiple constraints active at the solution. On small-scale problems performance of MinBC is similar to that of MinBLEIC, but on large-scale ones (hundreds and thousands of active constraints) it can be several times faster than MinBLEIC. INPUT PARAMETERS: N - problem dimension, N>0: * if given, only leading N elements of X are used * if not given, automatically determined from size ofX X - starting point, array[N]: * it is better to set X to a feasible point * but X can be infeasible, in which case algorithm will try to find feasible point first, using X as initial approximation. 
OUTPUT PARAMETERS: State - structure stores algorithm state -- ALGLIB -- Copyright 28.11.2010 by Bochkanov Sergey *************************************************************************/ void minbleiccreate(ae_int_t n, /* Real */ ae_vector* x, minbleicstate* state, ae_state *_state) { ae_frame _frame_block; ae_matrix c; ae_vector ct; ae_frame_make(_state, &_frame_block); _minbleicstate_clear(state); ae_matrix_init(&c, 0, 0, DT_REAL, _state); ae_vector_init(&ct, 0, DT_INT, _state); ae_assert(n>=1, "MinBLEICCreate: N<1", _state); ae_assert(x->cnt>=n, "MinBLEICCreate: Length(X)0: * if given, only leading N elements of X are used * if not given, automatically determined from size of X X - starting point, array[0..N-1]. DiffStep- differentiation step, >0 OUTPUT PARAMETERS: State - structure which stores algorithm state NOTES: 1. algorithm uses 4-point central formula for differentiation. 2. differentiation step along I-th axis is equal to DiffStep*S[I] where S[] is scaling vector which can be set by MinBLEICSetScale() call. 3. we recommend you to use moderate values of differentiation step. Too large step will result in too large truncation errors, while too small step will result in too large numerical errors. 1.0E-6 can be good value to start with. 4. Numerical differentiation is very inefficient - one gradient calculation needs 4*N function evaluations. This function will work for any N - either small (1...10), moderate (10...100) or large (100...). However, performance penalty will be too severe for any N's except for small ones. We should also say that code which relies on numerical differentiation is less robust and precise. CG needs exact gradient values. Imprecise gradient may slow down convergence, especially on highly nonlinear problems. Thus we recommend to use this function for fast prototyping on small- dimensional problems only, and to implement analytical gradient as soon as possible. -- ALGLIB -- Copyright 16.05.2011 by Bochkanov Sergey *************************************************************************/ void minbleiccreatef(ae_int_t n, /* Real */ ae_vector* x, double diffstep, minbleicstate* state, ae_state *_state) { ae_frame _frame_block; ae_matrix c; ae_vector ct; ae_frame_make(_state, &_frame_block); _minbleicstate_clear(state); ae_matrix_init(&c, 0, 0, DT_REAL, _state); ae_vector_init(&ct, 0, DT_INT, _state); ae_assert(n>=1, "MinBLEICCreateF: N<1", _state); ae_assert(x->cnt>=n, "MinBLEICCreateF: Length(X)nmain; ae_assert(bndl->cnt>=n, "MinBLEICSetBC: Length(BndL)cnt>=n, "MinBLEICSetBC: Length(BndU)ptr.p_double[i], _state)||ae_isneginf(bndl->ptr.p_double[i], _state), "MinBLEICSetBC: BndL contains NAN or +INF", _state); ae_assert(ae_isfinite(bndu->ptr.p_double[i], _state)||ae_isposinf(bndu->ptr.p_double[i], _state), "MinBLEICSetBC: BndL contains NAN or -INF", _state); state->bndl.ptr.p_double[i] = bndl->ptr.p_double[i]; state->hasbndl.ptr.p_bool[i] = ae_isfinite(bndl->ptr.p_double[i], _state); state->bndu.ptr.p_double[i] = bndu->ptr.p_double[i]; state->hasbndu.ptr.p_bool[i] = ae_isfinite(bndu->ptr.p_double[i], _state); } sassetbc(&state->sas, bndl, bndu, _state); } /************************************************************************* This function sets linear constraints for BLEIC optimizer. Linear constraints are inactive by default (after initial creation). They are preserved after algorithm restart with MinBLEICRestartFrom(). INPUT PARAMETERS: State - structure previously allocated with MinBLEICCreate call. C - linear constraints, array[K,N+1]. 
Each row of C represents one constraint, either equality or inequality (see below): * first N elements correspond to coefficients, * last element corresponds to the right part. All elements of C (including right part) must be finite. CT - type of constraints, array[K]: * if CT[i]>0, then I-th constraint is C[i,*]*x >= C[i,n+1] * if CT[i]=0, then I-th constraint is C[i,*]*x = C[i,n+1] * if CT[i]<0, then I-th constraint is C[i,*]*x <= C[i,n+1] K - number of equality/inequality constraints, K>=0: * if given, only leading K elements of C/CT are used * if not given, automatically determined from sizes of C/CT NOTE 1: linear (non-bound) constraints are satisfied only approximately: * there always exists some minor violation (about Epsilon in magnitude) due to rounding errors * numerical differentiation, if used, may lead to function evaluations outside of the feasible area, because algorithm does NOT change numerical differentiation formula according to linear constraints. If you want constraints to be satisfied exactly, try to reformulate your problem in such manner that all constraints will become boundary ones (this kind of constraints is always satisfied exactly, both in the final solution and in all intermediate points). -- ALGLIB -- Copyright 28.11.2010 by Bochkanov Sergey *************************************************************************/ void minbleicsetlc(minbleicstate* state, /* Real */ ae_matrix* c, /* Integer */ ae_vector* ct, ae_int_t k, ae_state *_state) { ae_int_t n; ae_int_t i; ae_int_t j; double v; n = state->nmain; /* * First, check for errors in the inputs */ ae_assert(k>=0, "MinBLEICSetLC: K<0", _state); ae_assert(c->cols>=n+1||k==0, "MinBLEICSetLC: Cols(C)rows>=k, "MinBLEICSetLC: Rows(C)cnt>=k, "MinBLEICSetLC: Length(CT)nec = 0; state->nic = 0; return; } /* * Equality constraints are stored first, in the upper * NEC rows of State.CLEIC matrix. Inequality constraints * are stored in the next NIC rows. * * NOTE: we convert inequality constraints to the form * A*x<=b before copying them. */ rmatrixsetlengthatleast(&state->cleic, k, n+1, _state); state->nec = 0; state->nic = 0; for(i=0; i<=k-1; i++) { if( ct->ptr.p_int[i]==0 ) { ae_v_move(&state->cleic.ptr.pp_double[state->nec][0], 1, &c->ptr.pp_double[i][0], 1, ae_v_len(0,n)); state->nec = state->nec+1; } } for(i=0; i<=k-1; i++) { if( ct->ptr.p_int[i]!=0 ) { if( ct->ptr.p_int[i]>0 ) { ae_v_moveneg(&state->cleic.ptr.pp_double[state->nec+state->nic][0], 1, &c->ptr.pp_double[i][0], 1, ae_v_len(0,n)); } else { ae_v_move(&state->cleic.ptr.pp_double[state->nec+state->nic][0], 1, &c->ptr.pp_double[i][0], 1, ae_v_len(0,n)); } state->nic = state->nic+1; } } /* * Normalize rows of State.CLEIC: each row must have unit norm. * Norm is calculated using first N elements (i.e. right part is * not counted when we calculate norm). */ for(i=0; i<=k-1; i++) { v = (double)(0); for(j=0; j<=n-1; j++) { v = v+ae_sqr(state->cleic.ptr.pp_double[i][j], _state); } if( ae_fp_eq(v,(double)(0)) ) { continue; } v = 1/ae_sqrt(v, _state); ae_v_muld(&state->cleic.ptr.pp_double[i][0], 1, ae_v_len(0,n), v); } sassetlc(&state->sas, c, ct, k, _state); } /************************************************************************* This function sets stopping conditions for the optimizer. INPUT PARAMETERS: State - structure which stores algorithm state EpsG - >=0 The subroutine finishes its work if the condition |v|=0 The subroutine finishes its work if on k+1-th iteration the condition |F(k+1)-F(k)|<=EpsF*max{|F(k)|,|F(k+1)|,1} is satisfied. 
EpsX - >=0 The subroutine finishes its work if on k+1-th iteration the condition |v|<=EpsX is fulfilled, where: * |.| means Euclidian norm * v - scaled step vector, v[i]=dx[i]/s[i] * dx - step vector, dx=X(k+1)-X(k) * s - scaling coefficients set by MinBLEICSetScale() MaxIts - maximum number of iterations. If MaxIts=0, the number of iterations is unlimited. Passing EpsG=0, EpsF=0 and EpsX=0 and MaxIts=0 (simultaneously) will lead to automatic stopping criterion selection. NOTE: when SetCond() called with non-zero MaxIts, BLEIC solver may perform slightly more than MaxIts iterations. I.e., MaxIts sets non-strict limit on iterations count. -- ALGLIB -- Copyright 28.11.2010 by Bochkanov Sergey *************************************************************************/ void minbleicsetcond(minbleicstate* state, double epsg, double epsf, double epsx, ae_int_t maxits, ae_state *_state) { ae_assert(ae_isfinite(epsg, _state), "MinBLEICSetCond: EpsG is not finite number", _state); ae_assert(ae_fp_greater_eq(epsg,(double)(0)), "MinBLEICSetCond: negative EpsG", _state); ae_assert(ae_isfinite(epsf, _state), "MinBLEICSetCond: EpsF is not finite number", _state); ae_assert(ae_fp_greater_eq(epsf,(double)(0)), "MinBLEICSetCond: negative EpsF", _state); ae_assert(ae_isfinite(epsx, _state), "MinBLEICSetCond: EpsX is not finite number", _state); ae_assert(ae_fp_greater_eq(epsx,(double)(0)), "MinBLEICSetCond: negative EpsX", _state); ae_assert(maxits>=0, "MinBLEICSetCond: negative MaxIts!", _state); if( ((ae_fp_eq(epsg,(double)(0))&&ae_fp_eq(epsf,(double)(0)))&&ae_fp_eq(epsx,(double)(0)))&&maxits==0 ) { epsx = 1.0E-6; } state->epsg = epsg; state->epsf = epsf; state->epsx = epsx; state->maxits = maxits; } /************************************************************************* This function sets scaling coefficients for BLEIC optimizer. ALGLIB optimizers use scaling matrices to test stopping conditions (step size and gradient are scaled before comparison with tolerances). Scale of the I-th variable is a translation invariant measure of: a) "how large" the variable is b) how large the step should be to make significant changes in the function Scaling is also used by finite difference variant of the optimizer - step along I-th axis is equal to DiffStep*S[I]. In most optimizers (and in the BLEIC too) scaling is NOT a form of preconditioning. It just affects stopping conditions. You should set preconditioner by separate call to one of the MinBLEICSetPrec...() functions. There is a special preconditioning mode, however, which uses scaling coefficients to form diagonal preconditioning matrix. You can turn this mode on, if you want. But you should understand that scaling is not the same thing as preconditioning - these are two different, although related forms of tuning solver. INPUT PARAMETERS: State - structure stores algorithm state S - array[N], non-zero scaling coefficients S[i] may be negative, sign doesn't matter. 
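A minimal sketch, assuming the C++ 'friendly' wrappers (signatures sketched only): for a problem whose first variable is of magnitude ~1 and whose second is of magnitude ~10^4, one might set

    alglib::real_1d_array s = "[1,10000]";
    alglib::minbleicsetscale(state, s);
    alglib::minbleicsetprecscale(state);  // optional: reuse the scales as a diagonal preconditioner

Note that scaling by itself only affects stopping tests and finite-difference steps; it acts as a preconditioner only when MinBLEICSetPrecScale() is called explicitly.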
-- ALGLIB -- Copyright 14.01.2011 by Bochkanov Sergey *************************************************************************/ void minbleicsetscale(minbleicstate* state, /* Real */ ae_vector* s, ae_state *_state) { ae_int_t i; ae_assert(s->cnt>=state->nmain, "MinBLEICSetScale: Length(S)nmain-1; i++) { ae_assert(ae_isfinite(s->ptr.p_double[i], _state), "MinBLEICSetScale: S contains infinite or NAN elements", _state); ae_assert(ae_fp_neq(s->ptr.p_double[i],(double)(0)), "MinBLEICSetScale: S contains zero elements", _state); state->s.ptr.p_double[i] = ae_fabs(s->ptr.p_double[i], _state); } sassetscale(&state->sas, s, _state); } /************************************************************************* Modification of the preconditioner: preconditioning is turned off. INPUT PARAMETERS: State - structure which stores algorithm state -- ALGLIB -- Copyright 13.10.2010 by Bochkanov Sergey *************************************************************************/ void minbleicsetprecdefault(minbleicstate* state, ae_state *_state) { state->prectype = 0; } /************************************************************************* Modification of the preconditioner: diagonal of approximate Hessian is used. INPUT PARAMETERS: State - structure which stores algorithm state D - diagonal of the approximate Hessian, array[0..N-1], (if larger, only leading N elements are used). NOTE 1: D[i] should be positive. Exception will be thrown otherwise. NOTE 2: you should pass diagonal of approximate Hessian - NOT ITS INVERSE. -- ALGLIB -- Copyright 13.10.2010 by Bochkanov Sergey *************************************************************************/ void minbleicsetprecdiag(minbleicstate* state, /* Real */ ae_vector* d, ae_state *_state) { ae_int_t i; ae_assert(d->cnt>=state->nmain, "MinBLEICSetPrecDiag: D is too short", _state); for(i=0; i<=state->nmain-1; i++) { ae_assert(ae_isfinite(d->ptr.p_double[i], _state), "MinBLEICSetPrecDiag: D contains infinite or NAN elements", _state); ae_assert(ae_fp_greater(d->ptr.p_double[i],(double)(0)), "MinBLEICSetPrecDiag: D contains non-positive elements", _state); } rvectorsetlengthatleast(&state->diagh, state->nmain, _state); state->prectype = 2; for(i=0; i<=state->nmain-1; i++) { state->diagh.ptr.p_double[i] = d->ptr.p_double[i]; } } /************************************************************************* Modification of the preconditioner: scale-based diagonal preconditioning. This preconditioning mode can be useful when you don't have approximate diagonal of Hessian, but you know that your variables are badly scaled (for example, one variable is in [1,10], and another in [1000,100000]), and most part of the ill-conditioning comes from different scales of vars. In this case simple scale-based preconditioner, with H[i] = 1/(s[i]^2), can greatly improve convergence. IMPRTANT: you should set scale of your variables with MinBLEICSetScale() call (before or after MinBLEICSetPrecScale() call). Without knowledge of the scale of your variables scale-based preconditioner will be just unit matrix. INPUT PARAMETERS: State - structure which stores algorithm state -- ALGLIB -- Copyright 13.10.2010 by Bochkanov Sergey *************************************************************************/ void minbleicsetprecscale(minbleicstate* state, ae_state *_state) { state->prectype = 3; } /************************************************************************* This function turns on/off reporting. 
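A minimal sketch of consuming iteration reports, assuming the C++ 'friendly' interface where MinBLEICOptimize() accepts an optional report callback (callback name and exact wrapper signatures are sketched only):

    void my_rep(const alglib::real_1d_array &x, double func, void *ptr)
    {
        printf("accepted point, F = %.6f\n", func);  // invoked once per accepted iterate
    }
    ...
    alglib::minbleicsetxrep(state, true);
    alglib::minbleicoptimize(state, my_grad, my_rep, NULL);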
INPUT PARAMETERS: State - structure which stores algorithm state NeedXRep- whether iteration reports are needed or not If NeedXRep is True, algorithm will call rep() callback function if it is provided to MinBLEICOptimize(). -- ALGLIB -- Copyright 28.11.2010 by Bochkanov Sergey *************************************************************************/ void minbleicsetxrep(minbleicstate* state, ae_bool needxrep, ae_state *_state) { state->xrep = needxrep; } /************************************************************************* This function turns on/off line search reports. These reports are described in more details in developer-only comments on MinBLEICState object. INPUT PARAMETERS: State - structure which stores algorithm state NeedDRep- whether line search reports are needed or not This function is intended for private use only. Turning it on artificially may cause program failure. -- ALGLIB -- Copyright 02.04.2010 by Bochkanov Sergey *************************************************************************/ void minbleicsetdrep(minbleicstate* state, ae_bool needdrep, ae_state *_state) { state->drep = needdrep; } /************************************************************************* This function sets maximum step length IMPORTANT: this feature is hard to combine with preconditioning. You can't set upper limit on step length, when you solve optimization problem with linear (non-boundary) constraints AND preconditioner turned on. When non-boundary constraints are present, you have to either a) use preconditioner, or b) use upper limit on step length. YOU CAN'T USE BOTH! In this case algorithm will terminate with appropriate error code. INPUT PARAMETERS: State - structure which stores algorithm state StpMax - maximum step length, >=0. Set StpMax to 0.0, if you don't want to limit step length. Use this subroutine when you optimize target function which contains exp() or other fast growing functions, and optimization algorithm makes too large steps which lead to overflow. This function allows us to reject steps that are too large (and therefore expose us to the possible overflow) without actually calculating function value at the x+stp*d. -- ALGLIB -- Copyright 02.04.2010 by Bochkanov Sergey *************************************************************************/ void minbleicsetstpmax(minbleicstate* state, double stpmax, ae_state *_state) { ae_assert(ae_isfinite(stpmax, _state), "MinBLEICSetStpMax: StpMax is not finite!", _state); ae_assert(ae_fp_greater_eq(stpmax,(double)(0)), "MinBLEICSetStpMax: StpMax<0!", _state); state->stpmax = stpmax; } /************************************************************************* NOTES: 1. This function has two different implementations: one which uses exact (analytical) user-supplied gradient, and one which uses function value only and numerically differentiates function in order to obtain gradient. Depending on the specific function used to create optimizer object (either MinBLEICCreate() for analytical gradient or MinBLEICCreateF() for numerical differentiation) you should choose appropriate variant of MinBLEICOptimize() - one which accepts function AND gradient or one which accepts function ONLY. Be careful to choose variant of MinBLEICOptimize() which corresponds to your optimization scheme! Table below lists different combinations of callback (function/gradient) passed to MinBLEICOptimize() and specific function used to create optimizer. 
| USER PASSED TO MinBLEICOptimize() CREATED WITH | function only | function and gradient ------------------------------------------------------------ MinBLEICCreateF() | work FAIL MinBLEICCreate() | FAIL work Here "FAIL" denotes inappropriate combinations of optimizer creation function and MinBLEICOptimize() version. Attemps to use such combination (for example, to create optimizer with MinBLEICCreateF() and to pass gradient information to MinCGOptimize()) will lead to exception being thrown. Either you did not pass gradient when it WAS needed or you passed gradient when it was NOT needed. -- ALGLIB -- Copyright 28.11.2010 by Bochkanov Sergey *************************************************************************/ ae_bool minbleiciteration(minbleicstate* state, ae_state *_state) { ae_int_t n; ae_int_t m; ae_int_t i; ae_int_t j; double v; double vv; double v0; ae_bool b; ae_int_t mcinfo; ae_int_t actstatus; ae_int_t itidx; double penalty; double ginit; double gdecay; ae_bool result; /* * Reverse communication preparations * I know it looks ugly, but it works the same way * anywhere from C++ to Python. * * This code initializes locals by: * * random values determined during code * generation - on first subroutine call * * values from previous call - on subsequent calls */ if( state->rstate.stage>=0 ) { n = state->rstate.ia.ptr.p_int[0]; m = state->rstate.ia.ptr.p_int[1]; i = state->rstate.ia.ptr.p_int[2]; j = state->rstate.ia.ptr.p_int[3]; mcinfo = state->rstate.ia.ptr.p_int[4]; actstatus = state->rstate.ia.ptr.p_int[5]; itidx = state->rstate.ia.ptr.p_int[6]; b = state->rstate.ba.ptr.p_bool[0]; v = state->rstate.ra.ptr.p_double[0]; vv = state->rstate.ra.ptr.p_double[1]; v0 = state->rstate.ra.ptr.p_double[2]; penalty = state->rstate.ra.ptr.p_double[3]; ginit = state->rstate.ra.ptr.p_double[4]; gdecay = state->rstate.ra.ptr.p_double[5]; } else { n = 359; m = -58; i = -919; j = -909; mcinfo = 81; actstatus = 255; itidx = 74; b = ae_false; v = 809; vv = 205; v0 = -838; penalty = 939; ginit = -526; gdecay = 763; } if( state->rstate.stage==0 ) { goto lbl_0; } if( state->rstate.stage==1 ) { goto lbl_1; } if( state->rstate.stage==2 ) { goto lbl_2; } if( state->rstate.stage==3 ) { goto lbl_3; } if( state->rstate.stage==4 ) { goto lbl_4; } if( state->rstate.stage==5 ) { goto lbl_5; } if( state->rstate.stage==6 ) { goto lbl_6; } if( state->rstate.stage==7 ) { goto lbl_7; } if( state->rstate.stage==8 ) { goto lbl_8; } if( state->rstate.stage==9 ) { goto lbl_9; } if( state->rstate.stage==10 ) { goto lbl_10; } if( state->rstate.stage==11 ) { goto lbl_11; } if( state->rstate.stage==12 ) { goto lbl_12; } if( state->rstate.stage==13 ) { goto lbl_13; } if( state->rstate.stage==14 ) { goto lbl_14; } if( state->rstate.stage==15 ) { goto lbl_15; } if( state->rstate.stage==16 ) { goto lbl_16; } if( state->rstate.stage==17 ) { goto lbl_17; } if( state->rstate.stage==18 ) { goto lbl_18; } if( state->rstate.stage==19 ) { goto lbl_19; } if( state->rstate.stage==20 ) { goto lbl_20; } if( state->rstate.stage==21 ) { goto lbl_21; } if( state->rstate.stage==22 ) { goto lbl_22; } if( state->rstate.stage==23 ) { goto lbl_23; } /* * Routine body */ /* * Algorithm parameters: * * M number of L-BFGS corrections. * This coefficient remains fixed during iterations. * * GDecay desired decrease of constrained gradient during L-BFGS iterations. * This coefficient is decreased after each L-BFGS round until * it reaches minimum decay. 
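Illustrative note: MinBLEICIteration() is the reverse-communication core of the solver; the 'friendly' wrappers drive it with a loop of roughly the following shape (sketch only, request flags as used in this file):

    while( minbleiciteration(state, _state) )
    {
        if( state->needfg )
        {
            // evaluate F at State.X, store it to State.F and the gradient to State.G
            continue;
        }
        if( state->needf )
        {
            // evaluate F only (used in numerical differentiation mode)
            continue;
        }
        if( state->xupdated )
        {
            // optionally report the newly accepted point State.X / State.F
            continue;
        }
    }
    minbleicresults(state, &x, &rep, _state);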
*/ m = ae_minint(5, state->nmain, _state); gdecay = minbleic_initialdecay; /* * Init */ n = state->nmain; state->steepestdescentstep = ae_false; state->userterminationneeded = ae_false; state->repterminationtype = 0; state->repinneriterationscount = 0; state->repouteriterationscount = 0; state->repnfev = 0; state->repvaridx = -1; state->repdebugeqerr = 0.0; state->repdebugfs = _state->v_nan; state->repdebugff = _state->v_nan; state->repdebugdx = _state->v_nan; if( ae_fp_neq(state->stpmax,(double)(0))&&state->prectype!=0 ) { state->repterminationtype = -10; result = ae_false; return result; } rmatrixsetlengthatleast(&state->bufyk, m+1, n, _state); rmatrixsetlengthatleast(&state->bufsk, m+1, n, _state); rvectorsetlengthatleast(&state->bufrho, m, _state); rvectorsetlengthatleast(&state->buftheta, m, _state); rvectorsetlengthatleast(&state->tmp0, n, _state); /* * Fill TmpPrec with current preconditioner */ rvectorsetlengthatleast(&state->tmpprec, n, _state); for(i=0; i<=n-1; i++) { if( state->prectype==2 ) { state->tmpprec.ptr.p_double[i] = state->diagh.ptr.p_double[i]; continue; } if( state->prectype==3 ) { state->tmpprec.ptr.p_double[i] = 1/ae_sqr(state->s.ptr.p_double[i], _state); continue; } state->tmpprec.ptr.p_double[i] = (double)(1); } sassetprecdiag(&state->sas, &state->tmpprec, _state); /* * Start optimization */ if( !sasstartoptimization(&state->sas, &state->xstart, _state) ) { state->repterminationtype = -3; result = ae_false; return result; } /* * Check correctness of user-supplied gradient */ if( !(ae_fp_eq(state->diffstep,(double)(0))&&ae_fp_greater(state->teststep,(double)(0))) ) { goto lbl_24; } minbleic_clearrequestfields(state, _state); ae_v_move(&state->x.ptr.p_double[0], 1, &state->sas.xc.ptr.p_double[0], 1, ae_v_len(0,n-1)); state->needfg = ae_true; i = 0; lbl_26: if( i>n-1 ) { goto lbl_28; } ae_assert(!state->hasbndl.ptr.p_bool[i]||ae_fp_greater_eq(state->sas.xc.ptr.p_double[i],state->bndl.ptr.p_double[i]), "MinBLEICIteration: internal error(State.X is out of bounds)", _state); ae_assert(!state->hasbndu.ptr.p_bool[i]||ae_fp_less_eq(state->sas.xc.ptr.p_double[i],state->bndu.ptr.p_double[i]), "MinBLEICIteration: internal error(State.X is out of bounds)", _state); v = state->x.ptr.p_double[i]; state->x.ptr.p_double[i] = v-state->teststep*state->s.ptr.p_double[i]; if( state->hasbndl.ptr.p_bool[i] ) { state->x.ptr.p_double[i] = ae_maxreal(state->x.ptr.p_double[i], state->bndl.ptr.p_double[i], _state); } state->xm1 = state->x.ptr.p_double[i]; state->rstate.stage = 0; goto lbl_rcomm; lbl_0: state->fm1 = state->f; state->gm1 = state->g.ptr.p_double[i]; state->x.ptr.p_double[i] = v+state->teststep*state->s.ptr.p_double[i]; if( state->hasbndu.ptr.p_bool[i] ) { state->x.ptr.p_double[i] = ae_minreal(state->x.ptr.p_double[i], state->bndu.ptr.p_double[i], _state); } state->xp1 = state->x.ptr.p_double[i]; state->rstate.stage = 1; goto lbl_rcomm; lbl_1: state->fp1 = state->f; state->gp1 = state->g.ptr.p_double[i]; state->x.ptr.p_double[i] = (state->xm1+state->xp1)/2; if( state->hasbndl.ptr.p_bool[i] ) { state->x.ptr.p_double[i] = ae_maxreal(state->x.ptr.p_double[i], state->bndl.ptr.p_double[i], _state); } if( state->hasbndu.ptr.p_bool[i] ) { state->x.ptr.p_double[i] = ae_minreal(state->x.ptr.p_double[i], state->bndu.ptr.p_double[i], _state); } state->rstate.stage = 2; goto lbl_rcomm; lbl_2: state->x.ptr.p_double[i] = v; if( !derivativecheck(state->fm1, state->gm1, state->fp1, state->gp1, state->f, state->g.ptr.p_double[i], state->xp1-state->xm1, _state) ) { state->repvaridx = i; 
state->repterminationtype = -7; sasstopoptimization(&state->sas, _state); result = ae_false; return result; } i = i+1; goto lbl_26; lbl_28: state->needfg = ae_false; lbl_24: /* * Main cycle of BLEIC-PG algorithm */ state->repterminationtype = 0; state->lastgoodstep = (double)(0); state->lastscaledgoodstep = (double)(0); state->maxscaledgrad = (double)(0); state->nonmonotoniccnt = ae_round(1.5*(n+state->nic), _state)+5; ae_v_move(&state->x.ptr.p_double[0], 1, &state->sas.xc.ptr.p_double[0], 1, ae_v_len(0,n-1)); minbleic_clearrequestfields(state, _state); if( ae_fp_neq(state->diffstep,(double)(0)) ) { goto lbl_29; } state->needfg = ae_true; state->rstate.stage = 3; goto lbl_rcomm; lbl_3: state->needfg = ae_false; goto lbl_30; lbl_29: state->needf = ae_true; state->rstate.stage = 4; goto lbl_rcomm; lbl_4: state->needf = ae_false; lbl_30: state->fc = state->f; trimprepare(state->f, &state->trimthreshold, _state); state->repnfev = state->repnfev+1; if( !state->xrep ) { goto lbl_31; } /* * Report current point */ ae_v_move(&state->x.ptr.p_double[0], 1, &state->sas.xc.ptr.p_double[0], 1, ae_v_len(0,n-1)); state->f = state->fc; state->xupdated = ae_true; state->rstate.stage = 5; goto lbl_rcomm; lbl_5: state->xupdated = ae_false; lbl_31: if( state->userterminationneeded ) { /* * User requested termination */ sasstopoptimization(&state->sas, _state); state->repterminationtype = 8; result = ae_false; return result; } lbl_33: if( ae_false ) { goto lbl_34; } /* * Preparations * * (a) calculate unconstrained gradient * (b) determine initial active set * (c) update MaxScaledGrad * (d) check F/G for NAN/INF, abnormally terminate algorithm if needed */ ae_v_move(&state->x.ptr.p_double[0], 1, &state->sas.xc.ptr.p_double[0], 1, ae_v_len(0,n-1)); minbleic_clearrequestfields(state, _state); if( ae_fp_neq(state->diffstep,(double)(0)) ) { goto lbl_35; } /* * Analytic gradient */ state->needfg = ae_true; state->rstate.stage = 6; goto lbl_rcomm; lbl_6: state->needfg = ae_false; goto lbl_36; lbl_35: /* * Numerical differentiation */ state->needf = ae_true; state->rstate.stage = 7; goto lbl_rcomm; lbl_7: state->fbase = state->f; i = 0; lbl_37: if( i>n-1 ) { goto lbl_39; } v = state->x.ptr.p_double[i]; b = ae_false; if( state->hasbndl.ptr.p_bool[i] ) { b = b||ae_fp_less(v-state->diffstep*state->s.ptr.p_double[i],state->bndl.ptr.p_double[i]); } if( state->hasbndu.ptr.p_bool[i] ) { b = b||ae_fp_greater(v+state->diffstep*state->s.ptr.p_double[i],state->bndu.ptr.p_double[i]); } if( b ) { goto lbl_40; } state->x.ptr.p_double[i] = v-state->diffstep*state->s.ptr.p_double[i]; state->rstate.stage = 8; goto lbl_rcomm; lbl_8: state->fm2 = state->f; state->x.ptr.p_double[i] = v-0.5*state->diffstep*state->s.ptr.p_double[i]; state->rstate.stage = 9; goto lbl_rcomm; lbl_9: state->fm1 = state->f; state->x.ptr.p_double[i] = v+0.5*state->diffstep*state->s.ptr.p_double[i]; state->rstate.stage = 10; goto lbl_rcomm; lbl_10: state->fp1 = state->f; state->x.ptr.p_double[i] = v+state->diffstep*state->s.ptr.p_double[i]; state->rstate.stage = 11; goto lbl_rcomm; lbl_11: state->fp2 = state->f; state->g.ptr.p_double[i] = (8*(state->fp1-state->fm1)-(state->fp2-state->fm2))/(6*state->diffstep*state->s.ptr.p_double[i]); goto lbl_41; lbl_40: state->xm1 = v-state->diffstep*state->s.ptr.p_double[i]; state->xp1 = v+state->diffstep*state->s.ptr.p_double[i]; if( state->hasbndl.ptr.p_bool[i]&&ae_fp_less(state->xm1,state->bndl.ptr.p_double[i]) ) { state->xm1 = state->bndl.ptr.p_double[i]; } if( 
state->hasbndu.ptr.p_bool[i]&&ae_fp_greater(state->xp1,state->bndu.ptr.p_double[i]) ) { state->xp1 = state->bndu.ptr.p_double[i]; } state->x.ptr.p_double[i] = state->xm1; state->rstate.stage = 12; goto lbl_rcomm; lbl_12: state->fm1 = state->f; state->x.ptr.p_double[i] = state->xp1; state->rstate.stage = 13; goto lbl_rcomm; lbl_13: state->fp1 = state->f; if( ae_fp_neq(state->xm1,state->xp1) ) { state->g.ptr.p_double[i] = (state->fp1-state->fm1)/(state->xp1-state->xm1); } else { state->g.ptr.p_double[i] = (double)(0); } lbl_41: state->x.ptr.p_double[i] = v; i = i+1; goto lbl_37; lbl_39: state->f = state->fbase; state->needf = ae_false; lbl_36: state->fc = state->f; ae_v_move(&state->ugc.ptr.p_double[0], 1, &state->g.ptr.p_double[0], 1, ae_v_len(0,n-1)); ae_v_move(&state->cgc.ptr.p_double[0], 1, &state->g.ptr.p_double[0], 1, ae_v_len(0,n-1)); sasreactivateconstraintsprec(&state->sas, &state->ugc, _state); sasconstraineddirection(&state->sas, &state->cgc, _state); ginit = 0.0; for(i=0; i<=n-1; i++) { ginit = ginit+ae_sqr(state->cgc.ptr.p_double[i]*state->s.ptr.p_double[i], _state); } ginit = ae_sqrt(ginit, _state); state->maxscaledgrad = ae_maxreal(state->maxscaledgrad, ginit, _state); if( !ae_isfinite(ginit, _state)||!ae_isfinite(state->fc, _state) ) { /* * Abnormal termination - infinities in function/gradient */ sasstopoptimization(&state->sas, _state); state->repterminationtype = -8; result = ae_false; return result; } if( state->userterminationneeded ) { /* * User requested termination */ sasstopoptimization(&state->sas, _state); state->repterminationtype = 8; result = ae_false; return result; } /* * LBFGS stage: * * during LBFGS iterations we activate new constraints, but never * deactivate already active ones. * * we perform at most N iterations of LBFGS before re-evaluating * active set and restarting LBFGS. * * first iteration of LBFGS is a special - it is performed with * minimum set of active constraints, algorithm termination can * be performed only at this state. We call this iteration * "steepest descent step". * * About termination: * * LBFGS iterations can be terminated because of two reasons: * * "termination" - non-zero termination code in RepTerminationType, * which means that optimization is done * * "restart" - zero RepTerminationType, which means that we * have to re-evaluate active set and resume LBFGS stage. * * one more option is "refresh" - to continue LBFGS iterations, * but with all BFGS updates (Sk/Yk pairs) being dropped; * it happens after changes in active set */ state->bufsize = 0; state->steepestdescentstep = ae_true; itidx = -1; lbl_42: if( itidx>=n-1 ) { goto lbl_43; } /* * Increment iterations counter * * NOTE: we have strong reasons to use such complex scheme * instead of just for() loop - this counter may be * decreased at some occasions to perform "restart" * of an iteration. 
*/ itidx = itidx+1; /* * At the beginning of each iteration: * * SAS.XC stores current point * * FC stores current function value * * UGC stores current unconstrained gradient * * CGC stores current constrained gradient * * D stores constrained step direction (calculated at this block) * * * Check gradient-based stopping criteria * * This stopping condition is tested only for step which is the * first step of LBFGS (subsequent steps may accumulate active * constraints thus they should NOT be used for stopping - gradient * may be small when constrained, but these constraints may be * deactivated by the subsequent steps) */ if( state->steepestdescentstep&&ae_fp_less_eq(sasscaledconstrainednorm(&state->sas, &state->ugc, _state),state->epsg) ) { /* * Gradient is small enough. * Optimization is terminated */ state->repterminationtype = 4; goto lbl_43; } /* * 1. Calculate search direction D according to L-BFGS algorithm * using constrained preconditioner to perform inner multiplication. * 2. Evaluate scaled length of direction D; restart LBFGS if D is zero * (it may be possible that we found minimum, but it is also possible * that some constraints need deactivation) * 3. If D is non-zero, try to use previous scaled step length as initial estimate for new step. */ ae_v_move(&state->work.ptr.p_double[0], 1, &state->cgc.ptr.p_double[0], 1, ae_v_len(0,n-1)); for(i=state->bufsize-1; i>=0; i--) { v = ae_v_dotproduct(&state->bufsk.ptr.pp_double[i][0], 1, &state->work.ptr.p_double[0], 1, ae_v_len(0,n-1)); state->buftheta.ptr.p_double[i] = v; vv = v*state->bufrho.ptr.p_double[i]; ae_v_subd(&state->work.ptr.p_double[0], 1, &state->bufyk.ptr.pp_double[i][0], 1, ae_v_len(0,n-1), vv); } sasconstraineddirectionprec(&state->sas, &state->work, _state); for(i=0; i<=state->bufsize-1; i++) { v = ae_v_dotproduct(&state->bufyk.ptr.pp_double[i][0], 1, &state->work.ptr.p_double[0], 1, ae_v_len(0,n-1)); vv = state->bufrho.ptr.p_double[i]*(-v+state->buftheta.ptr.p_double[i]); ae_v_addd(&state->work.ptr.p_double[0], 1, &state->bufsk.ptr.pp_double[i][0], 1, ae_v_len(0,n-1), vv); } ae_v_moveneg(&state->d.ptr.p_double[0], 1, &state->work.ptr.p_double[0], 1, ae_v_len(0,n-1)); v = (double)(0); for(i=0; i<=n-1; i++) { v = v+ae_sqr(state->d.ptr.p_double[i]/state->s.ptr.p_double[i], _state); } v = ae_sqrt(v, _state); if( ae_fp_eq(v,(double)(0)) ) { /* * Search direction is zero. * If we perform "steepest descent step", algorithm is terminated. * Otherwise we just restart LBFGS. */ if( state->steepestdescentstep ) { state->repterminationtype = 4; } goto lbl_43; } ae_assert(ae_fp_greater(v,(double)(0)), "MinBLEIC: internal error", _state); if( ae_fp_greater(state->lastscaledgoodstep,(double)(0))&&ae_fp_greater(v,(double)(0)) ) { state->stp = state->lastscaledgoodstep/v; } else { state->stp = 1.0/v; } /* * Calculate bound on step length. * Step direction is stored */ sasexploredirection(&state->sas, &state->d, &state->curstpmax, &state->cidx, &state->cval, _state); state->activationstep = state->curstpmax; if( state->cidx>=0&&ae_fp_eq(state->activationstep,(double)(0)) ) { /* * We are exactly at the boundary, immediate activation * of constraint is required. LBFGS stage is continued * with "refreshed" model. * * ! IMPORTANT: we do not clear SteepestDescent flag here, * ! it is very important for correct stopping * ! of algorithm. * * ! IMPORTANT: we decrease iteration counter in order to * preserve computational budget for iterations. 
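Illustrative note on the search-direction block above: it is the standard two-loop L-BFGS recursion over the stored pairs (BufSK[i], BufYK[i]), with BufRho[i] the usual curvature reciprocal 1/(BufYK[i]'*BufSK[i]) and SASConstrainedDirectionPrec() playing the role of the initial inverse-Hessian multiplication in the middle:

    q = CGC
    for i = BufSize-1 downto 0:   Theta[i] = BufSK[i]'*q;   q = q - BufRho[i]*Theta[i]*BufYK[i]
    q = P^(-1)*q                  (constrained preconditioner)
    for i = 0 to BufSize-1:       q = q + BufRho[i]*(Theta[i] - BufYK[i]'*q)*BufSK[i]
    D = -q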
*/ sasimmediateactivation(&state->sas, state->cidx, state->cval, _state); state->bufsize = 0; itidx = itidx-1; goto lbl_42; } if( ae_fp_greater(state->stpmax,(double)(0)) ) { v = ae_v_dotproduct(&state->d.ptr.p_double[0], 1, &state->d.ptr.p_double[0], 1, ae_v_len(0,n-1)); v = ae_sqrt(v, _state); if( ae_fp_greater(v,(double)(0)) ) { state->curstpmax = ae_minreal(state->curstpmax, state->stpmax/v, _state); } } /* * Report beginning of line search (if requested by caller). * See description of the MinBLEICState for more information * about fields accessible to caller. * * Caller may do following: * * change State.Stp and load better initial estimate of * the step length. * Caller may not terminate algorithm. */ if( !state->drep ) { goto lbl_44; } minbleic_clearrequestfields(state, _state); state->lsstart = ae_true; state->boundedstep = state->cidx>=0; ae_v_move(&state->x.ptr.p_double[0], 1, &state->sas.xc.ptr.p_double[0], 1, ae_v_len(0,n-1)); state->rstate.stage = 14; goto lbl_rcomm; lbl_14: state->lsstart = ae_false; lbl_44: /* * Minimize F(x+alpha*d) */ ae_v_move(&state->xn.ptr.p_double[0], 1, &state->sas.xc.ptr.p_double[0], 1, ae_v_len(0,n-1)); ae_v_move(&state->cgn.ptr.p_double[0], 1, &state->cgc.ptr.p_double[0], 1, ae_v_len(0,n-1)); ae_v_move(&state->ugn.ptr.p_double[0], 1, &state->ugc.ptr.p_double[0], 1, ae_v_len(0,n-1)); state->fn = state->fc; state->mcstage = 0; mcsrch(n, &state->xn, &state->fn, &state->cgn, &state->d, &state->stp, state->curstpmax, minbleic_gtol, &mcinfo, &state->nfev, &state->work, &state->lstate, &state->mcstage, _state); lbl_46: if( state->mcstage==0 ) { goto lbl_47; } /* * Perform correction (constraints are enforced) * Copy XN to X */ sascorrection(&state->sas, &state->xn, &penalty, _state); for(i=0; i<=n-1; i++) { state->x.ptr.p_double[i] = state->xn.ptr.p_double[i]; } /* * Gradient, either user-provided or numerical differentiation */ minbleic_clearrequestfields(state, _state); if( ae_fp_neq(state->diffstep,(double)(0)) ) { goto lbl_48; } /* * Analytic gradient */ state->needfg = ae_true; state->rstate.stage = 15; goto lbl_rcomm; lbl_15: state->needfg = ae_false; state->repnfev = state->repnfev+1; goto lbl_49; lbl_48: /* * Numerical differentiation */ state->needf = ae_true; state->rstate.stage = 16; goto lbl_rcomm; lbl_16: state->fbase = state->f; i = 0; lbl_50: if( i>n-1 ) { goto lbl_52; } v = state->x.ptr.p_double[i]; b = ae_false; if( state->hasbndl.ptr.p_bool[i] ) { b = b||ae_fp_less(v-state->diffstep*state->s.ptr.p_double[i],state->bndl.ptr.p_double[i]); } if( state->hasbndu.ptr.p_bool[i] ) { b = b||ae_fp_greater(v+state->diffstep*state->s.ptr.p_double[i],state->bndu.ptr.p_double[i]); } if( b ) { goto lbl_53; } state->x.ptr.p_double[i] = v-state->diffstep*state->s.ptr.p_double[i]; state->rstate.stage = 17; goto lbl_rcomm; lbl_17: state->fm2 = state->f; state->x.ptr.p_double[i] = v-0.5*state->diffstep*state->s.ptr.p_double[i]; state->rstate.stage = 18; goto lbl_rcomm; lbl_18: state->fm1 = state->f; state->x.ptr.p_double[i] = v+0.5*state->diffstep*state->s.ptr.p_double[i]; state->rstate.stage = 19; goto lbl_rcomm; lbl_19: state->fp1 = state->f; state->x.ptr.p_double[i] = v+state->diffstep*state->s.ptr.p_double[i]; state->rstate.stage = 20; goto lbl_rcomm; lbl_20: state->fp2 = state->f; state->g.ptr.p_double[i] = (8*(state->fp1-state->fm1)-(state->fp2-state->fm2))/(6*state->diffstep*state->s.ptr.p_double[i]); state->repnfev = state->repnfev+4; goto lbl_54; lbl_53: state->xm1 = v-state->diffstep*state->s.ptr.p_double[i]; state->xp1 = 
v+state->diffstep*state->s.ptr.p_double[i]; if( state->hasbndl.ptr.p_bool[i]&&ae_fp_less(state->xm1,state->bndl.ptr.p_double[i]) ) { state->xm1 = state->bndl.ptr.p_double[i]; } if( state->hasbndu.ptr.p_bool[i]&&ae_fp_greater(state->xp1,state->bndu.ptr.p_double[i]) ) { state->xp1 = state->bndu.ptr.p_double[i]; } state->x.ptr.p_double[i] = state->xm1; state->rstate.stage = 21; goto lbl_rcomm; lbl_21: state->fm1 = state->f; state->x.ptr.p_double[i] = state->xp1; state->rstate.stage = 22; goto lbl_rcomm; lbl_22: state->fp1 = state->f; if( ae_fp_neq(state->xm1,state->xp1) ) { state->g.ptr.p_double[i] = (state->fp1-state->fm1)/(state->xp1-state->xm1); } else { state->g.ptr.p_double[i] = (double)(0); } state->repnfev = state->repnfev+2; lbl_54: state->x.ptr.p_double[i] = v; i = i+1; goto lbl_50; lbl_52: state->f = state->fbase; state->needf = ae_false; lbl_49: /* * Back to MCSRCH * * NOTE: penalty term from correction is added to FN in order * to penalize increase in infeasibility. */ state->fn = state->f+minbleic_penaltyfactor*state->maxscaledgrad*penalty; ae_v_move(&state->cgn.ptr.p_double[0], 1, &state->g.ptr.p_double[0], 1, ae_v_len(0,n-1)); ae_v_move(&state->ugn.ptr.p_double[0], 1, &state->g.ptr.p_double[0], 1, ae_v_len(0,n-1)); sasconstraineddirection(&state->sas, &state->cgn, _state); trimfunction(&state->fn, &state->cgn, n, state->trimthreshold, _state); mcsrch(n, &state->xn, &state->fn, &state->cgn, &state->d, &state->stp, state->curstpmax, minbleic_gtol, &mcinfo, &state->nfev, &state->work, &state->lstate, &state->mcstage, _state); goto lbl_46; lbl_47: ae_v_moveneg(&state->bufsk.ptr.pp_double[state->bufsize][0], 1, &state->sas.xc.ptr.p_double[0], 1, ae_v_len(0,n-1)); ae_v_moveneg(&state->bufyk.ptr.pp_double[state->bufsize][0], 1, &state->cgc.ptr.p_double[0], 1, ae_v_len(0,n-1)); ae_v_add(&state->bufsk.ptr.pp_double[state->bufsize][0], 1, &state->xn.ptr.p_double[0], 1, ae_v_len(0,n-1)); ae_v_add(&state->bufyk.ptr.pp_double[state->bufsize][0], 1, &state->cgn.ptr.p_double[0], 1, ae_v_len(0,n-1)); /* * Check for presence of NAN/INF in function/gradient */ v = state->fn; for(i=0; i<=n-1; i++) { v = 0.1*v+state->ugn.ptr.p_double[i]; } if( !ae_isfinite(v, _state) ) { /* * Abnormal termination - infinities in function/gradient */ state->repterminationtype = -8; goto lbl_43; } /* * Handle possible failure of the line search or request for termination */ if( mcinfo!=1&&mcinfo!=5 ) { /* * We can not find step which decreases function value. We have * two possibilities: * (a) numerical properties of the function do not allow us to * find good step. * (b) we are close to activation of some constraint, and it is * so close that step which activates it leads to change in * target function which is smaller than numerical noise. * * Optimization algorithm must be able to handle case (b), because * inability to handle it will cause failure when algorithm * started very close to boundary of the feasible area. * * In order to correctly handle such cases we allow limited amount * of small steps which increase function value. 
*/ v = 0.0; for(i=0; i<=n-1; i++) { v = v+ae_sqr(state->d.ptr.p_double[i]*state->curstpmax/state->s.ptr.p_double[i], _state); } v = ae_sqrt(v, _state); b = ae_false; if( (state->cidx>=0&&ae_fp_less_eq(v,minbleic_maxnonmonotoniclen))&&state->nonmonotoniccnt>0 ) { /* * We try to enforce non-monotonic step: * * Stp := CurStpMax * * MCINFO := 5 * * XN := XC+CurStpMax*D * * non-monotonic counter is decreased * * NOTE: UGN/CGN are not updated because step is so short that we assume that * GN is approximately equal to GC. * * NOTE: prior to enforcing such step we check that it does not increase infeasibility * of constraints beyond tolerable level */ v = state->curstpmax; ae_v_move(&state->tmp0.ptr.p_double[0], 1, &state->sas.xc.ptr.p_double[0], 1, ae_v_len(0,n-1)); ae_v_addd(&state->tmp0.ptr.p_double[0], 1, &state->d.ptr.p_double[0], 1, ae_v_len(0,n-1), v); if( ae_fp_less_eq(minbleic_feasibilityerror(&state->tmp0, &state->s, n, &state->cleic, state->nec, state->nic, _state),minbleic_nmstol*ae_sqrt((double)(n), _state)*ae_machineepsilon) ) { state->stp = state->curstpmax; mcinfo = 5; ae_v_move(&state->xn.ptr.p_double[0], 1, &state->tmp0.ptr.p_double[0], 1, ae_v_len(0,n-1)); state->nonmonotoniccnt = state->nonmonotoniccnt-1; b = ae_true; } } if( !b ) { /* * Numerical properties of the function do not allow * us to solve problem. Here we have two possibilities: * * if it is "steepest descent" step, we can terminate * algorithm because we are close to minimum * * if it is NOT "steepest descent" step, we should restart * LBFGS iterations. */ if( state->steepestdescentstep ) { /* * Algorithm is terminated */ state->repterminationtype = 7; goto lbl_43; } else { /* * Re-evaluate active set and restart LBFGS */ goto lbl_43; } } } if( state->userterminationneeded ) { goto lbl_43; } /* * Current point is updated: * * move XC/FC/GC to XP/FP/GP * * change current point remembered by SAS structure * * move XN/FN/GN to XC/FC/GC * * report current point and update iterations counter * * if MCINFO=1, push new pair SK/YK to LBFGS buffer */ state->fp = state->fc; ae_v_move(&state->xp.ptr.p_double[0], 1, &state->sas.xc.ptr.p_double[0], 1, ae_v_len(0,n-1)); state->fc = state->fn; ae_v_move(&state->cgc.ptr.p_double[0], 1, &state->cgn.ptr.p_double[0], 1, ae_v_len(0,n-1)); ae_v_move(&state->ugc.ptr.p_double[0], 1, &state->ugn.ptr.p_double[0], 1, ae_v_len(0,n-1)); actstatus = sasmoveto(&state->sas, &state->xn, state->cidx>=0&&ae_fp_greater_eq(state->stp,state->activationstep), state->cidx, state->cval, _state); if( !state->xrep ) { goto lbl_55; } ae_v_move(&state->x.ptr.p_double[0], 1, &state->sas.xc.ptr.p_double[0], 1, ae_v_len(0,n-1)); minbleic_clearrequestfields(state, _state); state->xupdated = ae_true; state->rstate.stage = 23; goto lbl_rcomm; lbl_23: state->xupdated = ae_false; lbl_55: state->repinneriterationscount = state->repinneriterationscount+1; if( mcinfo==1 ) { /* * Accept new LBFGS update given by Sk,Yk */ if( state->bufsize==m ) { /* * Buffer is full, shift contents by one row */ for(i=0; i<=state->bufsize-1; i++) { ae_v_move(&state->bufsk.ptr.pp_double[i][0], 1, &state->bufsk.ptr.pp_double[i+1][0], 1, ae_v_len(0,n-1)); ae_v_move(&state->bufyk.ptr.pp_double[i][0], 1, &state->bufyk.ptr.pp_double[i+1][0], 1, ae_v_len(0,n-1)); } for(i=0; i<=state->bufsize-2; i++) { state->bufrho.ptr.p_double[i] = state->bufrho.ptr.p_double[i+1]; state->buftheta.ptr.p_double[i] = state->buftheta.ptr.p_double[i+1]; } } else { /* * Buffer is not full, increase buffer size by 1 */ state->bufsize = state->bufsize+1; } v = 
ae_v_dotproduct(&state->bufyk.ptr.pp_double[state->bufsize-1][0], 1, &state->bufsk.ptr.pp_double[state->bufsize-1][0], 1, ae_v_len(0,n-1)); vv = ae_v_dotproduct(&state->bufyk.ptr.pp_double[state->bufsize-1][0], 1, &state->bufyk.ptr.pp_double[state->bufsize-1][0], 1, ae_v_len(0,n-1)); if( ae_fp_eq(v,(double)(0))||ae_fp_eq(vv,(double)(0)) ) { /* * Strange internal error in LBFGS - either YK=0 * (which should not have been) or (SK,YK)=0 (again, * unexpected). It should not take place because * MCINFO=1, which signals "good" step. But just * to be sure we have special branch of code which * restarts LBFGS */ goto lbl_43; } state->bufrho.ptr.p_double[state->bufsize-1] = 1/v; ae_assert(state->bufsize<=m, "MinBLEIC: internal error", _state); /* * Update length of the good step */ v = (double)(0); vv = (double)(0); for(i=0; i<=n-1; i++) { v = v+ae_sqr((state->sas.xc.ptr.p_double[i]-state->xp.ptr.p_double[i])/state->s.ptr.p_double[i], _state); vv = vv+ae_sqr(state->sas.xc.ptr.p_double[i]-state->xp.ptr.p_double[i], _state); } state->lastgoodstep = ae_sqrt(vv, _state); minbleic_updateestimateofgoodstep(&state->lastscaledgoodstep, ae_sqrt(v, _state), _state); } /* * Check stopping criteria * * Step size and function-based stopping criteria are tested only * for step which satisfies Wolfe conditions and is the first step of * LBFGS (subsequent steps may accumulate active constraints thus * they should NOT be used for stopping; step size or function change * may be small when constrained, but these constraints may be * deactivated by the subsequent steps). * * MaxIts-based stopping condition is checked for all kinds of steps. */ if( mcinfo==1&&state->steepestdescentstep ) { /* * Step is small enough */ v = (double)(0); for(i=0; i<=n-1; i++) { v = v+ae_sqr((state->sas.xc.ptr.p_double[i]-state->xp.ptr.p_double[i])/state->s.ptr.p_double[i], _state); } v = ae_sqrt(v, _state); if( ae_fp_less_eq(v,state->epsx) ) { state->repterminationtype = 2; goto lbl_43; } /* * Function change is small enough */ if( ae_fp_less_eq(ae_fabs(state->fp-state->fc, _state),state->epsf*ae_maxreal(ae_fabs(state->fc, _state), ae_maxreal(ae_fabs(state->fp, _state), 1.0, _state), _state)) ) { state->repterminationtype = 1; goto lbl_43; } } if( state->maxits>0&&state->repinneriterationscount>=state->maxits ) { state->repterminationtype = 5; goto lbl_43; } /* * Clear "steepest descent" flag. */ state->steepestdescentstep = ae_false; /* * Smooth reset (LBFGS memory model is refreshed) or hard restart: * * LBFGS model is refreshed, if line search was performed with activation of constraints * * algorithm is restarted if scaled gradient decreased below GDecay */ if( actstatus>=0 ) { state->bufsize = 0; goto lbl_42; } v = 0.0; for(i=0; i<=n-1; i++) { v = v+ae_sqr(state->cgc.ptr.p_double[i]*state->s.ptr.p_double[i], _state); } if( ae_fp_less(ae_sqrt(v, _state),gdecay*ginit) ) { goto lbl_43; } goto lbl_42; lbl_43: if( state->userterminationneeded ) { /* * User requested termination */ state->repterminationtype = 8; goto lbl_34; } if( state->repterminationtype!=0 ) { /* * Algorithm terminated */ goto lbl_34; } /* * Decrease decay coefficient. Subsequent L-BFGS stages will * have more stringent stopping criteria. 
*/ gdecay = ae_maxreal(gdecay*minbleic_decaycorrection, minbleic_mindecay, _state); goto lbl_33; lbl_34: sasstopoptimization(&state->sas, _state); state->repouteriterationscount = 1; result = ae_false; return result; /* * Saving state */ lbl_rcomm: result = ae_true; state->rstate.ia.ptr.p_int[0] = n; state->rstate.ia.ptr.p_int[1] = m; state->rstate.ia.ptr.p_int[2] = i; state->rstate.ia.ptr.p_int[3] = j; state->rstate.ia.ptr.p_int[4] = mcinfo; state->rstate.ia.ptr.p_int[5] = actstatus; state->rstate.ia.ptr.p_int[6] = itidx; state->rstate.ba.ptr.p_bool[0] = b; state->rstate.ra.ptr.p_double[0] = v; state->rstate.ra.ptr.p_double[1] = vv; state->rstate.ra.ptr.p_double[2] = v0; state->rstate.ra.ptr.p_double[3] = penalty; state->rstate.ra.ptr.p_double[4] = ginit; state->rstate.ra.ptr.p_double[5] = gdecay; return result; } /************************************************************************* BLEIC results INPUT PARAMETERS: State - algorithm state OUTPUT PARAMETERS: X - array[0..N-1], solution Rep - optimization report. You should check Rep.TerminationType in order to distinguish successful termination from unsuccessful one: * -8 internal integrity control detected infinite or NAN values in function/gradient. Abnormal termination signalled. * -7 gradient verification failed. See MinBLEICSetGradientCheck() for more information. * -3 inconsistent constraints. Feasible point is either nonexistent or too hard to find. Try to restart optimizer with better initial approximation * 1 relative function improvement is no more than EpsF. * 2 scaled step is no more than EpsX. * 4 scaled gradient norm is no more than EpsG. * 5 MaxIts steps was taken * 8 terminated by user who called minbleicrequesttermination(). X contains point which was "current accepted" when termination request was submitted. More information about fields of this structure can be found in the comments on MinBLEICReport datatype. -- ALGLIB -- Copyright 28.11.2010 by Bochkanov Sergey *************************************************************************/ void minbleicresults(minbleicstate* state, /* Real */ ae_vector* x, minbleicreport* rep, ae_state *_state) { ae_vector_clear(x); _minbleicreport_clear(rep); minbleicresultsbuf(state, x, rep, _state); } /************************************************************************* BLEIC results Buffered implementation of MinBLEICResults() which uses pre-allocated buffer to store X[]. If buffer size is too small, it resizes buffer. It is intended to be used in the inner cycles of performance critical algorithms where array reallocation penalty is too large to be ignored. 
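NOTE: a minimal retrieval sketch, assuming the C++ wrapper interface declared
      elsewhere in this manual (variable names are illustrative):

          alglib::real_1d_array x;          // allocated once, reused between calls
          alglib::minbleicreport rep;
          alglib::minbleicresultsbuf(state, x, rep);
          if( rep.terminationtype>0 )
          {
              // success (codes 1, 2, 4, 5, 8) - X holds the solution
          }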
-- ALGLIB -- Copyright 28.11.2010 by Bochkanov Sergey *************************************************************************/ void minbleicresultsbuf(minbleicstate* state, /* Real */ ae_vector* x, minbleicreport* rep, ae_state *_state) { ae_int_t i; if( x->cnt<state->nmain ) { ae_vector_set_length(x, state->nmain, _state); } rep->iterationscount = state->repinneriterationscount; rep->inneriterationscount = state->repinneriterationscount; rep->outeriterationscount = state->repouteriterationscount; rep->nfev = state->repnfev; rep->varidx = state->repvaridx; rep->terminationtype = state->repterminationtype; if( state->repterminationtype>0 ) { ae_v_move(&x->ptr.p_double[0], 1, &state->sas.xc.ptr.p_double[0], 1, ae_v_len(0,state->nmain-1)); } else { for(i=0; i<=state->nmain-1; i++) { x->ptr.p_double[i] = _state->v_nan; } } rep->debugeqerr = state->repdebugeqerr; rep->debugfs = state->repdebugfs; rep->debugff = state->repdebugff; rep->debugdx = state->repdebugdx; rep->debugfeasqpits = state->repdebugfeasqpits; rep->debugfeasgpaits = state->repdebugfeasgpaits; }
/************************************************************************* This subroutine restarts algorithm from new point. All optimization parameters (including constraints) are left unchanged. This function allows to solve multiple optimization problems (which must have same number of dimensions) without object reallocation penalty. INPUT PARAMETERS: State - structure previously allocated with MinBLEICCreate call. X - new starting point. -- ALGLIB -- Copyright 28.11.2010 by Bochkanov Sergey *************************************************************************/ void minbleicrestartfrom(minbleicstate* state, /* Real */ ae_vector* x, ae_state *_state) { ae_int_t n; n = state->nmain; /* * First, check for errors in the inputs */ ae_assert(x->cnt>=n, "MinBLEICRestartFrom: Length(X)<N", _state); ae_assert(isfinitevector(x, n, _state), "MinBLEICRestartFrom: X contains infinite or NaN values!", _state); ae_v_move(&state->xstart.ptr.p_double[0], 1, &x->ptr.p_double[0], 1, ae_v_len(0,n-1)); /* * prepare RComm facilities */ ae_vector_set_length(&state->rstate.ia, 6+1, _state); ae_vector_set_length(&state->rstate.ba, 0+1, _state); ae_vector_set_length(&state->rstate.ra, 5+1, _state); state->rstate.stage = -1; minbleic_clearrequestfields(state, _state); sasstopoptimization(&state->sas, _state); }
/************************************************************************* This subroutine submits request for termination of running optimizer. It should be called from user-supplied callback when user decides that it is time to "smoothly" terminate optimization process. As result, optimizer stops at point which was "current accepted" when termination request was submitted and returns error code 8 (successful termination). INPUT PARAMETERS: State - optimizer structure NOTE: after request for termination optimizer may perform several additional calls to user-supplied callbacks. It does NOT guarantee to stop immediately - it just guarantees that these additional calls will be discarded later. NOTE: calling this function on optimizer which is NOT running will have no effect. NOTE: multiple calls to this function are possible. First call is counted, subsequent calls are silently ignored.
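NOTE: a minimal sketch of how termination is usually requested (C++ wrapper
      interface from this manual; the callback name and the user-side flag are
      illustrative - here the optimizer state is passed to the callback via
      the optional PTR argument of minbleicoptimize()):

          void my_grad(const alglib::real_1d_array &x, double &func,
                       alglib::real_1d_array &grad, void *ptr)
          {
              alglib::minbleicstate *s = (alglib::minbleicstate*)ptr;
              // ... evaluate func and grad ...
              if( stop_requested_by_user )
                  alglib::minbleicrequesttermination(*s);
          }

      After such a call the optimizer stops at the last accepted point and
      reports completion code 8.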
-- ALGLIB -- Copyright 08.10.2014 by Bochkanov Sergey *************************************************************************/ void minbleicrequesttermination(minbleicstate* state, ae_state *_state) { state->userterminationneeded = ae_true; }
/************************************************************************* This subroutine finalizes internal structures after emergency termination from State.LSStart report (see comments on MinBLEICState for more information). INPUT PARAMETERS: State - structure after exit from LSStart report -- ALGLIB -- Copyright 28.11.2010 by Bochkanov Sergey *************************************************************************/ void minbleicemergencytermination(minbleicstate* state, ae_state *_state) { sasstopoptimization(&state->sas, _state); }
/************************************************************************* This subroutine turns on verification of the user-supplied analytic gradient: * user calls this subroutine before optimization begins * MinBLEICOptimize() is called * prior to actual optimization, for each component of parameters being optimized X[i] algorithm performs following steps: * two trial steps are made to X[i]-TestStep*S[i] and X[i]+TestStep*S[i], where X[i] is i-th component of the initial point and S[i] is a scale of i-th parameter * if needed, steps are bounded with respect to constraints on X[] * F(X) is evaluated at these trial points * we perform one more evaluation in the middle point of the interval * we build cubic model using function values and derivatives at trial points and we compare its prediction with actual value in the middle point * in case difference between prediction and actual value is higher than some predetermined threshold, algorithm stops with completion code -7; Rep.VarIdx is set to index of the parameter with incorrect derivative. * after verification is over, algorithm proceeds to the actual optimization. NOTE 1: verification needs N (parameters count) gradient evaluations. It is very costly and you should use it only for low dimensional problems, when you want to be sure that you've correctly calculated analytic derivatives. You should not use it in the production code (unless you want to check derivatives provided by some third party). NOTE 2: you should carefully choose TestStep. Value which is too large (so large that function behaviour is significantly non-cubic) will lead to false alarms. You may use different step for different parameters by means of setting scale with MinBLEICSetScale(). NOTE 3: this function may lead to false positives. In case it reports that I-th derivative was calculated incorrectly, you may decrease test step and try one more time - maybe your function changes too sharply and your step is too large for such rapidly changing function.
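NOTE: a short workflow sketch (C++ wrapper interface from this manual; the
      TestStep value and callback name are purely illustrative):

          alglib::minbleicsetgradientcheck(state, 1.0E-3);  // enable verification
          alglib::minbleicoptimize(state, my_grad);         // gradient checked before optimization
          alglib::minbleicresults(state, x, rep);
          if( rep.terminationtype==-7 )
              printf("suspicious derivative of variable %d\n", (int)rep.varidx);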
INPUT PARAMETERS: State - structure used to store algorithm state TestStep - verification step: * TestStep=0 turns verification off * TestStep>0 activates verification -- ALGLIB -- Copyright 15.06.2012 by Bochkanov Sergey *************************************************************************/ void minbleicsetgradientcheck(minbleicstate* state, double teststep, ae_state *_state) { ae_assert(ae_isfinite(teststep, _state), "MinBLEICSetGradientCheck: TestStep contains NaN or Infinite", _state); ae_assert(ae_fp_greater_eq(teststep,(double)(0)), "MinBLEICSetGradientCheck: invalid argument TestStep(TestStep<0)", _state); state->teststep = teststep; }
/************************************************************************* Clears request fields (to be sure that we don't forget to clear something) *************************************************************************/ static void minbleic_clearrequestfields(minbleicstate* state, ae_state *_state) { state->needf = ae_false; state->needfg = ae_false; state->xupdated = ae_false; state->lsstart = ae_false; }
/************************************************************************* Internal initialization subroutine *************************************************************************/ static void minbleic_minbleicinitinternal(ae_int_t n, /* Real */ ae_vector* x, double diffstep, minbleicstate* state, ae_state *_state) { ae_frame _frame_block; ae_int_t i; ae_matrix c; ae_vector ct; ae_frame_make(_state, &_frame_block); ae_matrix_init(&c, 0, 0, DT_REAL, _state); ae_vector_init(&ct, 0, DT_INT, _state); /* * Initialize */ state->teststep = (double)(0); state->nmain = n; state->diffstep = diffstep; sasinit(n, &state->sas, _state); ae_vector_set_length(&state->bndl, n, _state); ae_vector_set_length(&state->hasbndl, n, _state); ae_vector_set_length(&state->bndu, n, _state); ae_vector_set_length(&state->hasbndu, n, _state); ae_vector_set_length(&state->xstart, n, _state); ae_vector_set_length(&state->cgc, n, _state); ae_vector_set_length(&state->ugc, n, _state); ae_vector_set_length(&state->xn, n, _state); ae_vector_set_length(&state->cgn, n, _state); ae_vector_set_length(&state->ugn, n, _state); ae_vector_set_length(&state->xp, n, _state); ae_vector_set_length(&state->d, n, _state); ae_vector_set_length(&state->s, n, _state); ae_vector_set_length(&state->x, n, _state); ae_vector_set_length(&state->g, n, _state); ae_vector_set_length(&state->work, n, _state); for(i=0; i<=n-1; i++) { state->bndl.ptr.p_double[i] = _state->v_neginf; state->hasbndl.ptr.p_bool[i] = ae_false; state->bndu.ptr.p_double[i] = _state->v_posinf; state->hasbndu.ptr.p_bool[i] = ae_false; state->s.ptr.p_double[i] = 1.0; } minbleicsetlc(state, &c, &ct, 0, _state); minbleicsetcond(state, 0.0, 0.0, 0.0, 0, _state); minbleicsetxrep(state, ae_false, _state); minbleicsetdrep(state, ae_false, _state); minbleicsetstpmax(state, 0.0, _state); minbleicsetprecdefault(state, _state); minbleicrestartfrom(state, x, _state); ae_frame_leave(_state); }
/************************************************************************* This subroutine updates estimate of the good step length given: 1) previous estimate 2) new length of the good step It makes sure that estimate does not change too rapidly - ratio of new and old estimates will be at least 0.01, at most 100.0 In case previous estimate of good step is zero (no estimate), new estimate is used unconditionally.
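For example, with a previous estimate equal to 1.0: a new step of 500 is clamped to 100.0, a new step of 0.0001 is clamped to 0.01, and a new step of 0.5 is accepted as is.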
-- ALGLIB -- Copyright 16.01.2013 by Bochkanov Sergey *************************************************************************/ static void minbleic_updateestimateofgoodstep(double* estimate, double newstep, ae_state *_state) { if( ae_fp_eq(*estimate,(double)(0)) ) { *estimate = newstep; return; } if( ae_fp_less(newstep,*estimate*0.01) ) { *estimate = *estimate*0.01; return; } if( ae_fp_greater(newstep,*estimate*100) ) { *estimate = *estimate*100; return; } *estimate = newstep; }
/************************************************************************* This subroutine estimates relative feasibility error of the point. INPUT PARAMETERS: X - current point (unscaled) S - scale vector N - dimensionality CLEIC - array[NEC+NIC,N+1], constraint matrix, may be unnormalized NEC - number of equality constraints (come first) NIC - number of inequality constraints (come last) RESULT feasibility error, good value is ~1E-16...1E-14 -- ALGLIB -- Copyright 16.01.2013 by Bochkanov Sergey *************************************************************************/ static double minbleic_feasibilityerror(/* Real */ ae_vector* x, /* Real */ ae_vector* s, ae_int_t n, /* Real */ ae_matrix* cleic, ae_int_t nec, ae_int_t nic, ae_state *_state) { ae_int_t i; ae_int_t j; double v; double v0; double v1; double vc; double vx; double result; result = (double)(0); for(i=0; i<=nec+nic-1; i++) { v = -cleic->ptr.pp_double[i][n]; v0 = (double)(0); v1 = (double)(0); for(j=0; j<=n-1; j++) { vc = cleic->ptr.pp_double[i][j]*s->ptr.p_double[j]; vx = x->ptr.p_double[j]/s->ptr.p_double[j]; v = v+vc*vx; v0 = v0+ae_sqr(vc, _state); v1 = v1+ae_sqr(vx, _state); } v0 = coalesce(ae_sqrt(v0, _state), (double)(1), _state); v1 = coalesce(ae_sqrt(v1, _state), (double)(1), _state); if( i<nec ) { result = ae_maxreal(result, ae_fabs(v, _state)/(v0*v1), _state); } else { result = ae_maxreal(result, ae_maxreal(v, (double)(0), _state)/(v0*v1), _state); } } return result; }
void _minbleicstate_init(void* _p, ae_state *_state) { minbleicstate *p = (minbleicstate*)_p; ae_touch_ptr((void*)p); _sactiveset_init(&p->sas, _state); ae_vector_init(&p->s, 0, DT_REAL, _state); ae_vector_init(&p->diagh, 0, DT_REAL, _state); ae_vector_init(&p->x, 0, DT_REAL, _state); ae_vector_init(&p->g, 0, DT_REAL, _state); _rcommstate_init(&p->rstate, _state); ae_vector_init(&p->ugc, 0, DT_REAL, _state); ae_vector_init(&p->cgc, 0, DT_REAL, _state); ae_vector_init(&p->xn, 0, DT_REAL, _state); ae_vector_init(&p->ugn, 0, DT_REAL, _state); ae_vector_init(&p->cgn, 0, DT_REAL, _state); ae_vector_init(&p->xp, 0, DT_REAL, _state); ae_vector_init(&p->d, 0, DT_REAL, _state); ae_matrix_init(&p->cleic, 0, 0, DT_REAL, _state); ae_vector_init(&p->hasbndl, 0, DT_BOOL, _state); ae_vector_init(&p->hasbndu, 0, DT_BOOL, _state); ae_vector_init(&p->bndl, 0, DT_REAL, _state); ae_vector_init(&p->bndu, 0, DT_REAL, _state); ae_vector_init(&p->xstart, 0, DT_REAL, _state); _snnlssolver_init(&p->solver, _state); ae_vector_init(&p->tmpprec, 0, DT_REAL, _state); ae_vector_init(&p->tmp0, 0, DT_REAL, _state); ae_vector_init(&p->work, 0, DT_REAL, _state); _linminstate_init(&p->lstate, _state); ae_matrix_init(&p->bufyk, 0, 0, DT_REAL, _state); ae_matrix_init(&p->bufsk, 0, 0, DT_REAL, _state); ae_vector_init(&p->bufrho, 0, DT_REAL, _state); ae_vector_init(&p->buftheta, 0, DT_REAL, _state); }
void _minbleicstate_init_copy(void* _dst, void* _src, ae_state *_state) { minbleicstate *dst = (minbleicstate*)_dst; minbleicstate *src = (minbleicstate*)_src; dst->nmain = src->nmain; dst->nslack = src->nslack; dst->epsg = src->epsg; dst->epsf = src->epsf; dst->epsx = src->epsx; dst->maxits = src->maxits; dst->xrep = src->xrep; dst->drep = src->drep; dst->stpmax = src->stpmax; dst->diffstep = src->diffstep; _sactiveset_init_copy(&dst->sas, &src->sas, _state); ae_vector_init_copy(&dst->s, &src->s, _state); dst->prectype = src->prectype;
ae_vector_init_copy(&dst->diagh, &src->diagh, _state); ae_vector_init_copy(&dst->x, &src->x, _state); dst->f = src->f; ae_vector_init_copy(&dst->g, &src->g, _state); dst->needf = src->needf; dst->needfg = src->needfg; dst->xupdated = src->xupdated; dst->lsstart = src->lsstart; dst->steepestdescentstep = src->steepestdescentstep; dst->boundedstep = src->boundedstep; dst->userterminationneeded = src->userterminationneeded; dst->teststep = src->teststep; _rcommstate_init_copy(&dst->rstate, &src->rstate, _state); ae_vector_init_copy(&dst->ugc, &src->ugc, _state); ae_vector_init_copy(&dst->cgc, &src->cgc, _state); ae_vector_init_copy(&dst->xn, &src->xn, _state); ae_vector_init_copy(&dst->ugn, &src->ugn, _state); ae_vector_init_copy(&dst->cgn, &src->cgn, _state); ae_vector_init_copy(&dst->xp, &src->xp, _state); dst->fc = src->fc; dst->fn = src->fn; dst->fp = src->fp; ae_vector_init_copy(&dst->d, &src->d, _state); ae_matrix_init_copy(&dst->cleic, &src->cleic, _state); dst->nec = src->nec; dst->nic = src->nic; dst->lastgoodstep = src->lastgoodstep; dst->lastscaledgoodstep = src->lastscaledgoodstep; dst->maxscaledgrad = src->maxscaledgrad; ae_vector_init_copy(&dst->hasbndl, &src->hasbndl, _state); ae_vector_init_copy(&dst->hasbndu, &src->hasbndu, _state); ae_vector_init_copy(&dst->bndl, &src->bndl, _state); ae_vector_init_copy(&dst->bndu, &src->bndu, _state); dst->repinneriterationscount = src->repinneriterationscount; dst->repouteriterationscount = src->repouteriterationscount; dst->repnfev = src->repnfev; dst->repvaridx = src->repvaridx; dst->repterminationtype = src->repterminationtype; dst->repdebugeqerr = src->repdebugeqerr; dst->repdebugfs = src->repdebugfs; dst->repdebugff = src->repdebugff; dst->repdebugdx = src->repdebugdx; dst->repdebugfeasqpits = src->repdebugfeasqpits; dst->repdebugfeasgpaits = src->repdebugfeasgpaits; ae_vector_init_copy(&dst->xstart, &src->xstart, _state); _snnlssolver_init_copy(&dst->solver, &src->solver, _state); dst->fbase = src->fbase; dst->fm2 = src->fm2; dst->fm1 = src->fm1; dst->fp1 = src->fp1; dst->fp2 = src->fp2; dst->xm1 = src->xm1; dst->xp1 = src->xp1; dst->gm1 = src->gm1; dst->gp1 = src->gp1; dst->cidx = src->cidx; dst->cval = src->cval; ae_vector_init_copy(&dst->tmpprec, &src->tmpprec, _state); ae_vector_init_copy(&dst->tmp0, &src->tmp0, _state); dst->nfev = src->nfev; dst->mcstage = src->mcstage; dst->stp = src->stp; dst->curstpmax = src->curstpmax; dst->activationstep = src->activationstep; ae_vector_init_copy(&dst->work, &src->work, _state); _linminstate_init_copy(&dst->lstate, &src->lstate, _state); dst->trimthreshold = src->trimthreshold; dst->nonmonotoniccnt = src->nonmonotoniccnt; ae_matrix_init_copy(&dst->bufyk, &src->bufyk, _state); ae_matrix_init_copy(&dst->bufsk, &src->bufsk, _state); ae_vector_init_copy(&dst->bufrho, &src->bufrho, _state); ae_vector_init_copy(&dst->buftheta, &src->buftheta, _state); dst->bufsize = src->bufsize; } void _minbleicstate_clear(void* _p) { minbleicstate *p = (minbleicstate*)_p; ae_touch_ptr((void*)p); _sactiveset_clear(&p->sas); ae_vector_clear(&p->s); ae_vector_clear(&p->diagh); ae_vector_clear(&p->x); ae_vector_clear(&p->g); _rcommstate_clear(&p->rstate); ae_vector_clear(&p->ugc); ae_vector_clear(&p->cgc); ae_vector_clear(&p->xn); ae_vector_clear(&p->ugn); ae_vector_clear(&p->cgn); ae_vector_clear(&p->xp); ae_vector_clear(&p->d); ae_matrix_clear(&p->cleic); ae_vector_clear(&p->hasbndl); ae_vector_clear(&p->hasbndu); ae_vector_clear(&p->bndl); ae_vector_clear(&p->bndu); ae_vector_clear(&p->xstart); 
_snnlssolver_clear(&p->solver); ae_vector_clear(&p->tmpprec); ae_vector_clear(&p->tmp0); ae_vector_clear(&p->work); _linminstate_clear(&p->lstate); ae_matrix_clear(&p->bufyk); ae_matrix_clear(&p->bufsk); ae_vector_clear(&p->bufrho); ae_vector_clear(&p->buftheta); } void _minbleicstate_destroy(void* _p) { minbleicstate *p = (minbleicstate*)_p; ae_touch_ptr((void*)p); _sactiveset_destroy(&p->sas); ae_vector_destroy(&p->s); ae_vector_destroy(&p->diagh); ae_vector_destroy(&p->x); ae_vector_destroy(&p->g); _rcommstate_destroy(&p->rstate); ae_vector_destroy(&p->ugc); ae_vector_destroy(&p->cgc); ae_vector_destroy(&p->xn); ae_vector_destroy(&p->ugn); ae_vector_destroy(&p->cgn); ae_vector_destroy(&p->xp); ae_vector_destroy(&p->d); ae_matrix_destroy(&p->cleic); ae_vector_destroy(&p->hasbndl); ae_vector_destroy(&p->hasbndu); ae_vector_destroy(&p->bndl); ae_vector_destroy(&p->bndu); ae_vector_destroy(&p->xstart); _snnlssolver_destroy(&p->solver); ae_vector_destroy(&p->tmpprec); ae_vector_destroy(&p->tmp0); ae_vector_destroy(&p->work); _linminstate_destroy(&p->lstate); ae_matrix_destroy(&p->bufyk); ae_matrix_destroy(&p->bufsk); ae_vector_destroy(&p->bufrho); ae_vector_destroy(&p->buftheta); } void _minbleicreport_init(void* _p, ae_state *_state) { minbleicreport *p = (minbleicreport*)_p; ae_touch_ptr((void*)p); } void _minbleicreport_init_copy(void* _dst, void* _src, ae_state *_state) { minbleicreport *dst = (minbleicreport*)_dst; minbleicreport *src = (minbleicreport*)_src; dst->iterationscount = src->iterationscount; dst->nfev = src->nfev; dst->varidx = src->varidx; dst->terminationtype = src->terminationtype; dst->debugeqerr = src->debugeqerr; dst->debugfs = src->debugfs; dst->debugff = src->debugff; dst->debugdx = src->debugdx; dst->debugfeasqpits = src->debugfeasqpits; dst->debugfeasgpaits = src->debugfeasgpaits; dst->inneriterationscount = src->inneriterationscount; dst->outeriterationscount = src->outeriterationscount; } void _minbleicreport_clear(void* _p) { minbleicreport *p = (minbleicreport*)_p; ae_touch_ptr((void*)p); } void _minbleicreport_destroy(void* _p) { minbleicreport *p = (minbleicreport*)_p; ae_touch_ptr((void*)p); } /************************************************************************* This function initializes QPBLEICSettings structure with default settings. Newly created structure MUST be initialized by default settings - or by copy of the already initialized structure. -- ALGLIB -- Copyright 14.05.2011 by Bochkanov Sergey *************************************************************************/ void qpbleicloaddefaults(ae_int_t nmain, qpbleicsettings* s, ae_state *_state) { s->epsg = 0.0; s->epsf = 0.0; s->epsx = 1.0E-6; s->maxits = 0; } /************************************************************************* This function initializes QPBLEICSettings structure with copy of another, already initialized structure. -- ALGLIB -- Copyright 14.05.2011 by Bochkanov Sergey *************************************************************************/ void qpbleiccopysettings(qpbleicsettings* src, qpbleicsettings* dst, ae_state *_state) { dst->epsg = src->epsg; dst->epsf = src->epsf; dst->epsx = src->epsx; dst->maxits = src->maxits; } /************************************************************************* This function runs QPBLEIC solver; it returns after optimization process was completed. Following QP problem is solved: min(0.5*(x-x_origin)'*A*(x-x_origin)+b'*(x-x_origin)) subject to boundary constraints. 
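(For this objective the function and gradient passed to the inner BLEIC solver are F(x) = 0.5*(x-x_origin)'*A*(x-x_origin) + b'*(x-x_origin) and grad F(x) = A*(x-x_origin) + b, which is exactly what the gradient evaluation branch of this function computes.)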
INPUT PARAMETERS: AC - for dense problems (AKind=0), A-term of CQM object contains system matrix. Other terms are unspecified and should not be referenced. SparseAC - for sparse problems (AKind=1 AKind - sparse matrix format: * 0 for dense matrix * 1 for sparse matrix SparseUpper - which triangle of SparseAC stores matrix - upper or lower one (for dense matrices this parameter is not actual). AbsASum - SUM(|A[i,j]|) AbsASum2 - SUM(A[i,j]^2) BC - linear term, array[NC] BndLC - lower bound, array[NC] BndUC - upper bound, array[NC] SC - scale vector, array[NC]: * I-th element contains scale of I-th variable, * SC[I]>0 XOriginC - origin term, array[NC]. Can be zero. NC - number of variables in the original formulation (no slack variables). CLEICC - linear equality/inequality constraints. Present version of this function does NOT provide publicly available support for linear constraints. This feature will be introduced in the future versions of the function. NEC, NIC - number of equality/inequality constraints. MUST BE ZERO IN THE CURRENT VERSION!!! Settings - QPBLEICSettings object initialized by one of the initialization functions. SState - object which stores temporaries: * if uninitialized object was passed, FirstCall parameter MUST be set to True; object will be automatically initialized by the function, and FirstCall will be set to False. * if FirstCall=False, it is assumed that this parameter was already initialized by previous call to this function with same problem dimensions (variable count N). FirstCall - whether it is first call of this function for this specific instance of SState, with this number of variables N specified. XS - initial point, array[NC] OUTPUT PARAMETERS: XS - last point FirstCall - uncondtionally set to False TerminationType-termination type: * * * -- ALGLIB -- Copyright 14.05.2011 by Bochkanov Sergey *************************************************************************/ void qpbleicoptimize(convexquadraticmodel* a, sparsematrix* sparsea, ae_int_t akind, ae_bool sparseaupper, double absasum, double absasum2, /* Real */ ae_vector* b, /* Real */ ae_vector* bndl, /* Real */ ae_vector* bndu, /* Real */ ae_vector* s, /* Real */ ae_vector* xorigin, ae_int_t n, /* Real */ ae_matrix* cleic, ae_int_t nec, ae_int_t nic, qpbleicsettings* settings, qpbleicbuffers* sstate, ae_bool* firstcall, /* Real */ ae_vector* xs, ae_int_t* terminationtype, ae_state *_state) { ae_int_t i; double d2; double d1; double d0; double v; double v0; double v1; double md; double mx; double mb; ae_int_t d1est; ae_int_t d2est; *terminationtype = 0; ae_assert(akind==0||akind==1, "QPBLEICOptimize: unexpected AKind", _state); sstate->repinneriterationscount = 0; sstate->repouteriterationscount = 0; *terminationtype = 0; /* * Prepare solver object, if needed */ if( *firstcall ) { minbleiccreate(n, xs, &sstate->solver, _state); *firstcall = ae_false; } /* * Prepare max(|B|) */ mb = 0.0; for(i=0; i<=n-1; i++) { mb = ae_maxreal(mb, ae_fabs(b->ptr.p_double[i], _state), _state); } /* * Temporaries */ ivectorsetlengthatleast(&sstate->tmpi, nec+nic, _state); rvectorsetlengthatleast(&sstate->tmp0, n, _state); rvectorsetlengthatleast(&sstate->tmp1, n, _state); for(i=0; i<=nec-1; i++) { sstate->tmpi.ptr.p_int[i] = 0; } for(i=0; i<=nic-1; i++) { sstate->tmpi.ptr.p_int[nec+i] = -1; } minbleicsetlc(&sstate->solver, cleic, &sstate->tmpi, nec+nic, _state); minbleicsetbc(&sstate->solver, bndl, bndu, _state); minbleicsetdrep(&sstate->solver, ae_true, _state); minbleicsetcond(&sstate->solver, ae_minrealnumber, 0.0, 
0.0, settings->maxits, _state); minbleicsetscale(&sstate->solver, s, _state); minbleicsetprecscale(&sstate->solver, _state); minbleicrestartfrom(&sstate->solver, xs, _state); while(minbleiciteration(&sstate->solver, _state)) { /* * Line search started */ if( sstate->solver.lsstart ) { /* * Iteration counters: * * inner iterations count is increased on every line search * * outer iterations count is increased only at steepest descent line search */ inc(&sstate->repinneriterationscount, _state); if( sstate->solver.steepestdescentstep ) { inc(&sstate->repouteriterationscount, _state); } /* * Build quadratic model of F along descent direction: * * F(x+alpha*d) = D2*alpha^2 + D1*alpha + D0 * * Calculate estimates of linear and quadratic term * (term magnitude is compared with magnitude of numerical errors) */ d0 = sstate->solver.f; d1 = ae_v_dotproduct(&sstate->solver.d.ptr.p_double[0], 1, &sstate->solver.g.ptr.p_double[0], 1, ae_v_len(0,n-1)); d2 = (double)(0); if( akind==0 ) { d2 = cqmxtadx2(a, &sstate->solver.d, _state); } if( akind==1 ) { sparsesmv(sparsea, sparseaupper, &sstate->solver.d, &sstate->tmp0, _state); d2 = 0.0; for(i=0; i<=n-1; i++) { d2 = d2+sstate->solver.d.ptr.p_double[i]*sstate->tmp0.ptr.p_double[i]; } d2 = 0.5*d2; } mx = 0.0; md = 0.0; for(i=0; i<=n-1; i++) { mx = ae_maxreal(mx, ae_fabs(sstate->solver.x.ptr.p_double[i], _state), _state); md = ae_maxreal(md, ae_fabs(sstate->solver.d.ptr.p_double[i], _state), _state); } estimateparabolicmodel(absasum, absasum2, mx, mb, md, d1, d2, &d1est, &d2est, _state); /* * Tests for "normal" convergence. * * This line search may be started from steepest descent * stage (stage 2) or from L-BFGS stage (stage 3) of the * BLEIC algorithm. Depending on stage type, different * checks are performed. * * Say, L-BFGS stage is an equality-constrained refinement * stage of BLEIC. This stage refines current iterate * under "frozen" equality constraints. We can terminate * iterations at this stage only when we encounter * unconstrained direction of negative curvature. In all * other cases (say, when constrained gradient is zero) * we should not terminate algorithm because everything may * change after de-activating presently active constraints. * * Tests for convergence are performed only at "steepest descent" stage * of the BLEIC algorithm, and only when function is non-concave * (D2 is positive or approximately zero) along direction D. * * NOTE: we do not test iteration count (MaxIts) here, because * this stopping condition is tested by BLEIC itself. */ if( sstate->solver.steepestdescentstep&&d2est>=0 ) { if( d1est>=0 ) { /* * "Emergency" stopping condition: D is non-descent direction. * Sometimes it is possible because of numerical noise in the * target function. */ *terminationtype = 4; for(i=0; i<=n-1; i++) { xs->ptr.p_double[i] = sstate->solver.x.ptr.p_double[i]; } break; } if( d2est>0 ) { /* * Stopping condition #4 - gradient norm is small: * * 1. rescale State.Solver.D and State.Solver.G according to * current scaling, store results to Tmp0 and Tmp1. * 2. Normalize Tmp0 (scaled direction vector). * 3. compute directional derivative (in scaled variables), * which is equal to DOTPRODUCT(Tmp0,Tmp1). 
*/ v = (double)(0); for(i=0; i<=n-1; i++) { sstate->tmp0.ptr.p_double[i] = sstate->solver.d.ptr.p_double[i]/s->ptr.p_double[i]; sstate->tmp1.ptr.p_double[i] = sstate->solver.g.ptr.p_double[i]*s->ptr.p_double[i]; v = v+ae_sqr(sstate->tmp0.ptr.p_double[i], _state); } ae_assert(ae_fp_greater(v,(double)(0)), "QPBLEICOptimize: inernal errror (scaled direction is zero)", _state); v = 1/ae_sqrt(v, _state); ae_v_muld(&sstate->tmp0.ptr.p_double[0], 1, ae_v_len(0,n-1), v); v = ae_v_dotproduct(&sstate->tmp0.ptr.p_double[0], 1, &sstate->tmp1.ptr.p_double[0], 1, ae_v_len(0,n-1)); if( ae_fp_less_eq(ae_fabs(v, _state),settings->epsg) ) { *terminationtype = 4; for(i=0; i<=n-1; i++) { xs->ptr.p_double[i] = sstate->solver.x.ptr.p_double[i]; } break; } /* * Stopping condition #1 - relative function improvement is small: * * 1. calculate steepest descent step: V = -D1/(2*D2) * 2. calculate function change: V1= D2*V^2 + D1*V * 3. stop if function change is small enough */ v = -d1/(2*d2); v1 = d2*v*v+d1*v; if( ae_fp_less_eq(ae_fabs(v1, _state),settings->epsf*ae_maxreal(d0, 1.0, _state)) ) { *terminationtype = 1; for(i=0; i<=n-1; i++) { xs->ptr.p_double[i] = sstate->solver.x.ptr.p_double[i]; } break; } /* * Stopping condition #2 - scaled step is small: * * 1. calculate step multiplier V0 (step itself is D*V0) * 2. calculate scaled step length V * 3. stop if step is small enough */ v0 = -d1/(2*d2); v = (double)(0); for(i=0; i<=n-1; i++) { v = v+ae_sqr(v0*sstate->solver.d.ptr.p_double[i]/s->ptr.p_double[i], _state); } if( ae_fp_less_eq(ae_sqrt(v, _state),settings->epsx) ) { *terminationtype = 2; for(i=0; i<=n-1; i++) { xs->ptr.p_double[i] = sstate->solver.x.ptr.p_double[i]; } break; } } } /* * Test for unconstrained direction of negative curvature */ if( (d2est<0||(d2est==0&&d1est<0))&&!sstate->solver.boundedstep ) { /* * Function is unbounded from below: * * function will decrease along D, i.e. either: * * D2<0 * * D2=0 and D1<0 * * step is unconstrained * * If these conditions are true, we abnormally terminate QP * algorithm with return code -4 (we can do so at any stage * of BLEIC - whether it is L-BFGS or steepest descent one). */ *terminationtype = -4; for(i=0; i<=n-1; i++) { xs->ptr.p_double[i] = sstate->solver.x.ptr.p_double[i]; } break; } /* * Suggest new step (only if D1 is negative far away from zero, * D2 is positive far away from zero). */ if( d1est<0&&d2est>0 ) { sstate->solver.stp = safeminposrv(-d1, 2*d2, sstate->solver.curstpmax, _state); } } /* * Gradient evaluation */ if( sstate->solver.needfg ) { for(i=0; i<=n-1; i++) { sstate->tmp0.ptr.p_double[i] = sstate->solver.x.ptr.p_double[i]-xorigin->ptr.p_double[i]; } if( akind==0 ) { cqmadx(a, &sstate->tmp0, &sstate->tmp1, _state); } if( akind==1 ) { sparsesmv(sparsea, sparseaupper, &sstate->tmp0, &sstate->tmp1, _state); } v0 = ae_v_dotproduct(&sstate->tmp0.ptr.p_double[0], 1, &sstate->tmp1.ptr.p_double[0], 1, ae_v_len(0,n-1)); v1 = ae_v_dotproduct(&sstate->tmp0.ptr.p_double[0], 1, &b->ptr.p_double[0], 1, ae_v_len(0,n-1)); sstate->solver.f = 0.5*v0+v1; ae_v_move(&sstate->solver.g.ptr.p_double[0], 1, &sstate->tmp1.ptr.p_double[0], 1, ae_v_len(0,n-1)); ae_v_add(&sstate->solver.g.ptr.p_double[0], 1, &b->ptr.p_double[0], 1, ae_v_len(0,n-1)); } } if( *terminationtype==0 ) { /* * BLEIC optimizer was terminated by one of its inner stopping * conditions. Usually it is iteration counter (if such * stopping condition was specified by user). 
*/ minbleicresultsbuf(&sstate->solver, xs, &sstate->solverrep, _state); *terminationtype = sstate->solverrep.terminationtype; } else { /* * BLEIC optimizer was terminated in "emergency" mode by QP * solver. * * NOTE: such termination is "emergency" only when viewed from * BLEIC's position. QP solver sees such termination as * routine one, triggered by QP's stopping criteria. */ minbleicemergencytermination(&sstate->solver, _state); } } void _qpbleicsettings_init(void* _p, ae_state *_state) { qpbleicsettings *p = (qpbleicsettings*)_p; ae_touch_ptr((void*)p); } void _qpbleicsettings_init_copy(void* _dst, void* _src, ae_state *_state) { qpbleicsettings *dst = (qpbleicsettings*)_dst; qpbleicsettings *src = (qpbleicsettings*)_src; dst->epsg = src->epsg; dst->epsf = src->epsf; dst->epsx = src->epsx; dst->maxits = src->maxits; } void _qpbleicsettings_clear(void* _p) { qpbleicsettings *p = (qpbleicsettings*)_p; ae_touch_ptr((void*)p); } void _qpbleicsettings_destroy(void* _p) { qpbleicsettings *p = (qpbleicsettings*)_p; ae_touch_ptr((void*)p); } void _qpbleicbuffers_init(void* _p, ae_state *_state) { qpbleicbuffers *p = (qpbleicbuffers*)_p; ae_touch_ptr((void*)p); _minbleicstate_init(&p->solver, _state); _minbleicreport_init(&p->solverrep, _state); ae_vector_init(&p->tmp0, 0, DT_REAL, _state); ae_vector_init(&p->tmp1, 0, DT_REAL, _state); ae_vector_init(&p->tmpi, 0, DT_INT, _state); } void _qpbleicbuffers_init_copy(void* _dst, void* _src, ae_state *_state) { qpbleicbuffers *dst = (qpbleicbuffers*)_dst; qpbleicbuffers *src = (qpbleicbuffers*)_src; _minbleicstate_init_copy(&dst->solver, &src->solver, _state); _minbleicreport_init_copy(&dst->solverrep, &src->solverrep, _state); ae_vector_init_copy(&dst->tmp0, &src->tmp0, _state); ae_vector_init_copy(&dst->tmp1, &src->tmp1, _state); ae_vector_init_copy(&dst->tmpi, &src->tmpi, _state); dst->repinneriterationscount = src->repinneriterationscount; dst->repouteriterationscount = src->repouteriterationscount; } void _qpbleicbuffers_clear(void* _p) { qpbleicbuffers *p = (qpbleicbuffers*)_p; ae_touch_ptr((void*)p); _minbleicstate_clear(&p->solver); _minbleicreport_clear(&p->solverrep); ae_vector_clear(&p->tmp0); ae_vector_clear(&p->tmp1); ae_vector_clear(&p->tmpi); } void _qpbleicbuffers_destroy(void* _p) { qpbleicbuffers *p = (qpbleicbuffers*)_p; ae_touch_ptr((void*)p); _minbleicstate_destroy(&p->solver); _minbleicreport_destroy(&p->solverrep); ae_vector_destroy(&p->tmp0); ae_vector_destroy(&p->tmp1); ae_vector_destroy(&p->tmpi); } /************************************************************************* CONSTRAINED QUADRATIC PROGRAMMING The subroutine creates QP optimizer. After initial creation, it contains default optimization problem with zero quadratic and linear terms and no constraints. You should set quadratic/linear terms with calls to functions provided by MinQP subpackage. You should also choose appropriate QP solver and set it and its stopping criteria by means of MinQPSetAlgo??????() function. Then, you should start solution process by means of MinQPOptimize() call. Solution itself can be obtained with MinQPResults() function. 
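A minimal end-to-end sketch (C++ wrapper interface declared elsewhere in this
manual; the problem data below are purely illustrative):

    alglib::real_2d_array a = "[[2,0],[0,2]]";   // quadratic term: 0.5*x'*A*x = x0^2+x1^2
    alglib::real_1d_array b = "[-6,-4]";         // linear term
    alglib::real_1d_array s = "[1,1]";           // variable scales
    alglib::real_1d_array x;
    alglib::minqpstate state;
    alglib::minqpreport rep;
    alglib::minqpcreate(2, state);
    alglib::minqpsetquadraticterm(state, a);
    alglib::minqpsetlinearterm(state, b);
    alglib::minqpsetscale(state, s);
    alglib::minqpsetalgobleic(state, 0.0, 0.0, 0.0, 0);  // default stopping criteria
    alglib::minqpoptimize(state);
    alglib::minqpresults(state, x, rep);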
Following solvers are recommended: * QuickQP for dense problems with box-only constraints (or no constraints at all) * QP-BLEIC for dense/sparse problems with moderate (up to 50) number of general linear constraints * DENSE-AUL-QP for dense problems with any (small or large) number of general linear constraints INPUT PARAMETERS: N - problem size OUTPUT PARAMETERS: State - optimizer with zero quadratic/linear terms and no constraints -- ALGLIB -- Copyright 11.01.2011 by Bochkanov Sergey *************************************************************************/ void minqpcreate(ae_int_t n, minqpstate* state, ae_state *_state) { ae_int_t i; _minqpstate_clear(state); ae_assert(n>=1, "MinQPCreate: N<1", _state); /* * initialize QP solver */ state->n = n; state->nec = 0; state->nic = 0; state->snec = 0; state->snic = 0; state->repterminationtype = 0; state->absamax = (double)(1); state->absasum = (double)(1); state->absasum2 = (double)(1); state->akind = 0; state->sparseaupper = ae_false; cqminit(n, &state->a, _state); ae_vector_set_length(&state->b, n, _state); ae_vector_set_length(&state->bndl, n, _state); ae_vector_set_length(&state->bndu, n, _state); ae_vector_set_length(&state->havebndl, n, _state); ae_vector_set_length(&state->havebndu, n, _state); ae_vector_set_length(&state->s, n, _state); ae_vector_set_length(&state->startx, n, _state); ae_vector_set_length(&state->xorigin, n, _state); ae_vector_set_length(&state->xs, n, _state); for(i=0; i<=n-1; i++) { state->bndl.ptr.p_double[i] = _state->v_neginf; state->bndu.ptr.p_double[i] = _state->v_posinf; state->havebndl.ptr.p_bool[i] = ae_false; state->havebndu.ptr.p_bool[i] = ae_false; state->b.ptr.p_double[i] = 0.0; state->startx.ptr.p_double[i] = 0.0; state->xorigin.ptr.p_double[i] = 0.0; state->s.ptr.p_double[i] = 1.0; } state->havex = ae_false; minqpsetalgobleic(state, 0.0, 0.0, 0.0, 0, _state); qqploaddefaults(n, &state->qqpsettingsuser, _state); qpbleicloaddefaults(n, &state->qpbleicsettingsuser, _state); qpdenseaulloaddefaults(n, &state->qpdenseaulsettingsuser, _state); state->qpbleicfirstcall = ae_true; state->dbgskipconstraintnormalization = ae_false; } /************************************************************************* This function sets linear term for QP solver. By default, linear term is zero. INPUT PARAMETERS: State - structure which stores algorithm state B - linear term, array[N]. 
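NOTE: the QP solvers in this unit minimize f(x) = 0.5*(x-x_origin)'*A*(x-x_origin) + b'*(x-x_origin) (see the QPBLEICOptimize description above; x_origin is zero by default), so the linear term enters without any implicit scaling: for f(x) = x0^2 + x1^2 - 6*x0 - 4*x1 one passes A = [[2,0],[0,2]] and B = [-6,-4].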
-- ALGLIB -- Copyright 11.01.2011 by Bochkanov Sergey *************************************************************************/ void minqpsetlinearterm(minqpstate* state, /* Real */ ae_vector* b, ae_state *_state) { ae_int_t n; n = state->n; ae_assert(b->cnt>=n, "MinQPSetLinearTerm: Length(B)n; ae_assert(a->rows>=n, "MinQPSetQuadraticTerm: Rows(A)cols>=n, "MinQPSetQuadraticTerm: Cols(A)n; ae_assert(sparsegetnrows(a, _state)==n, "MinQPSetQuadraticTermSparse: Rows(A)<>N", _state); ae_assert(sparsegetncols(a, _state)==n, "MinQPSetQuadraticTermSparse: Cols(A)<>N", _state); sparsecopytocrsbuf(a, &state->sparsea, _state); state->sparseaupper = isupper; state->akind = 1; /* * Estimate norm of A * (it will be used later in the quadratic penalty function) */ state->absamax = (double)(0); state->absasum = (double)(0); state->absasum2 = (double)(0); t0 = 0; t1 = 0; while(sparseenumerate(a, &t0, &t1, &i, &j, &v, _state)) { if( i==j ) { /* * Diagonal terms are counted only once */ state->absamax = ae_maxreal(state->absamax, v, _state); state->absasum = state->absasum+v; state->absasum2 = state->absasum2+v*v; } if( (j>i&&isupper)||(jabsamax = ae_maxreal(state->absamax, v, _state); state->absasum = state->absasum+2*v; state->absasum2 = state->absasum2+2*v*v; } } } /************************************************************************* This function sets starting point for QP solver. It is useful to have good initial approximation to the solution, because it will increase speed of convergence and identification of active constraints. INPUT PARAMETERS: State - structure which stores algorithm state X - starting point, array[N]. -- ALGLIB -- Copyright 11.01.2011 by Bochkanov Sergey *************************************************************************/ void minqpsetstartingpoint(minqpstate* state, /* Real */ ae_vector* x, ae_state *_state) { ae_int_t n; n = state->n; ae_assert(x->cnt>=n, "MinQPSetStartingPoint: Length(B)n; ae_assert(xorigin->cnt>=n, "MinQPSetOrigin: Length(B)cnt>=state->n, "MinQPSetScale: Length(S)n-1; i++) { ae_assert(ae_isfinite(s->ptr.p_double[i], _state), "MinQPSetScale: S contains infinite or NAN elements", _state); ae_assert(ae_fp_neq(s->ptr.p_double[i],(double)(0)), "MinQPSetScale: S contains zero elements", _state); state->s.ptr.p_double[i] = ae_fabs(s->ptr.p_double[i], _state); } } /************************************************************************* DEPRECATED DEPRECATED DEPRECATED DEPRECATED DEPRECATED DEPRECATED This function tells solver to use Cholesky-based algorithm. This algorithm was deprecated in ALGLIB 3.9.0 because its performance is inferior to that of BLEIC-QP or QuickQP on high-dimensional problems. Furthermore, it supports only dense convex QP problems. This solver is no longer active by default. We recommend you to switch to AUL-QP, BLEIC-QP or QuickQP solver. DEPRECATED DEPRECATED DEPRECATED DEPRECATED DEPRECATED DEPRECATED -- ALGLIB -- Copyright 11.01.2011 by Bochkanov Sergey *************************************************************************/ void minqpsetalgocholesky(minqpstate* state, ae_state *_state) { state->algokind = 1; } /************************************************************************* This function tells solver to use BLEIC-based algorithm and sets stopping criteria for the algorithm. 
ALGORITHM FEATURES: * supports dense and sparse QP problems * supports box and general linear equality/inequality constraints * can solve all types of problems (convex, semidefinite, nonconvex) as long as they are bounded from below under constraints. Say, it is possible to solve "min{-x^2} subject to -1<=x<=+1". Of course, global minimum is found only for positive definite and semidefinite problems. As for indefinite ones - only local minimum is found. ALGORITHM OUTLINE: * BLEIC-QP solver is just a driver function for MinBLEIC solver; it solves quadratic programming problem as general linearly constrained optimization problem, which is solved by means of BLEIC solver (part of ALGLIB, active set method). ALGORITHM LIMITATIONS: * this algorithm is fast enough for large-scale problems with small amount of general linear constraints (say, up to 50), but it is inefficient for problems with several hundreds of constraints. Iteration cost is roughly quadratic w.r.t. constraint count. Furthermore, it can not efficiently handle sparse constraints (they are converted to dense format prior to solution). Thus, if you have large and/or sparse constraint matrix and convex QP problem, Dense-AUL-QP solver may be better solution. * unlike QuickQP solver, this algorithm does not perform Newton steps and does not use Level 3 BLAS. Being general-purpose active set method, it can activate constraints only one-by-one. Thus, its performance is lower than that of QuickQP. * its precision is also a bit inferior to that of QuickQP. BLEIC-QP performs only LBFGS steps (no Newton steps), which are good at detecting neighborhood of the solution, but needs many iterations to find solution with more than 6 digits of precision. INPUT PARAMETERS: State - structure which stores algorithm state EpsG - >=0 The subroutine finishes its work if the condition |v|<EpsG is satisfied, where: * |.| means Euclidian norm * v - scaled constrained gradient vector, v[i]=g[i]*s[i] * g - gradient * s - scaling coefficients set by MinQPSetScale() EpsF - >=0 The subroutine finishes its work if exploratory steepest descent step on k+1-th iteration satisfies following condition: |F(k+1)-F(k)|<=EpsF*max{|F(k)|,|F(k+1)|,1} EpsX - >=0 The subroutine finishes its work if exploratory steepest descent step on k+1-th iteration satisfies following condition: |v|<=EpsX where: * |.| means Euclidian norm * v - scaled step vector, v[i]=dx[i]/s[i] * dx - step vector, dx=X(k+1)-X(k) * s - scaling coefficients set by MinQPSetScale() MaxIts - maximum number of iterations. If MaxIts=0, the number of iterations is unlimited. NOTE: this algorithm uses LBFGS iterations, which are relatively cheap, but improve function value only a bit. So you will need many iterations to converge - from 0.1*N to 10*N, depending on problem's condition number. IT IS VERY IMPORTANT TO CALL MinQPSetScale() WHEN YOU USE THIS ALGORITHM BECAUSE ITS STOPPING CRITERIA ARE SCALE-DEPENDENT! Passing EpsG=0, EpsF=0 and EpsX=0 and MaxIts=0 (simultaneously) will lead to automatic stopping criterion selection (presently it is small step length, but it may change in the future versions of ALGLIB).
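NOTE: a typical activation sketch (C++ wrapper interface from this manual;
      the tolerance is illustrative) - set the scale first, then pick the
      solver and its stopping criteria:

          alglib::minqpsetscale(state, s);
          alglib::minqpsetalgobleic(state, 0.0, 0.0, 1.0E-6, 0);

      Here EpsG=EpsF=0 and MaxIts=0, so the only active criterion is the
      scaled step length falling below 1.0E-6.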
-- ALGLIB -- Copyright 11.01.2011 by Bochkanov Sergey *************************************************************************/ void minqpsetalgobleic(minqpstate* state, double epsg, double epsf, double epsx, ae_int_t maxits, ae_state *_state) { ae_assert(ae_isfinite(epsg, _state), "MinQPSetAlgoBLEIC: EpsG is not finite number", _state); ae_assert(ae_fp_greater_eq(epsg,(double)(0)), "MinQPSetAlgoBLEIC: negative EpsG", _state); ae_assert(ae_isfinite(epsf, _state), "MinQPSetAlgoBLEIC: EpsF is not finite number", _state); ae_assert(ae_fp_greater_eq(epsf,(double)(0)), "MinQPSetAlgoBLEIC: negative EpsF", _state); ae_assert(ae_isfinite(epsx, _state), "MinQPSetAlgoBLEIC: EpsX is not finite number", _state); ae_assert(ae_fp_greater_eq(epsx,(double)(0)), "MinQPSetAlgoBLEIC: negative EpsX", _state); ae_assert(maxits>=0, "MinQPSetAlgoBLEIC: negative MaxIts!", _state); state->algokind = 2; if( ((ae_fp_eq(epsg,(double)(0))&&ae_fp_eq(epsf,(double)(0)))&&ae_fp_eq(epsx,(double)(0)))&&maxits==0 ) { epsx = 1.0E-6; } state->qpbleicsettingsuser.epsg = epsg; state->qpbleicsettingsuser.epsf = epsf; state->qpbleicsettingsuser.epsx = epsx; state->qpbleicsettingsuser.maxits = maxits; } /************************************************************************* This function tells QP solver to use Dense-AUL algorithm and sets stopping criteria for the algorithm. ALGORITHM FEATURES: * supports dense and sparse QP problems; although it uses dense Cholesky to build preconditioner, it still works faster for sparse problems. * supports box and dense/sparse general linear equality/inequality constraints * convergence is theoretically proved for positive-definite (convex) QP problems. Semidefinite and non-convex problems can be solved as long as they are bounded from below under constraints, although without theoretical guarantees. * this solver is better than QP-BLEIC on problems with large number of general linear constraints. ALGORITHM OUTLINE: * this algorithm is an augmented Lagrangian method with dense preconditioner (hence its name). It is similar to barrier/penalty methods, but much more precise and faster. * it performs several outer iterations in order to refine values of the Lagrange multipliers. Single outer iteration is a solution of some unconstrained optimization problem: first it performs dense Cholesky factorization of the Hessian in order to build preconditioner (adaptive regularization is applied to enforce positive definiteness), and then it uses L-BFGS optimizer to solve optimization problem. * typically you need about 5-10 outer iterations to converge to solution ALGORITHM LIMITATIONS: * because dense Cholesky driver is used, this algorithm has O(N^2) memory requirements and O(OuterIterations*N^3) minimum running time. From the practical point of view, it limits its applicability by several thousands of variables. From the other side, variables count is the most limiting factor, and dependence on constraint count is much more lower. Assuming that constraint matrix is sparse, it may handle tens of thousands of general linear constraints. * its precision is lower than that of BLEIC-QP and QuickQP. It is hard to find solution with more than 6 digits of precision. INPUT PARAMETERS: State - structure which stores algorithm state EpsX - >=0, stopping criteria for inner optimizer. Inner iterations are stopped when step length (with variable scaling being applied) is less than EpsX. See minqpsetscale() for more information on variable scaling. 
Rho - penalty coefficient, Rho>0: * large enough that algorithm converges with desired precision. * not TOO large to prevent ill-conditioning * recommended values are 100, 1000 or 10000 ItsCnt - number of outer iterations: * recommended values: 10-15 (although in most cases it converges within 5 iterations, you may need a few more to be sure). * ItsCnt=0 means that small number of outer iterations is automatically chosen (10 iterations in current version). * ItsCnt=1 means that AUL algorithm performs just as usual barrier method. * ItsCnt>1 means that AUL algorithm performs specified number of outer iterations IT IS VERY IMPORTANT TO CALL minqpsetscale() WHEN YOU USE THIS ALGORITHM BECAUSE ITS CONVERGENCE PROPERTIES AND STOPPING CRITERIA ARE SCALE-DEPENDENT! NOTE: Passing EpsX=0 will lead to automatic step length selection (specific step length chosen may change in the future versions of ALGLIB, so it is better to specify step length explicitly). -- ALGLIB -- Copyright 20.08.2016 by Bochkanov Sergey *************************************************************************/ void minqpsetalgodenseaul(minqpstate* state, double epsx, double rho, ae_int_t itscnt, ae_state *_state) { ae_assert(ae_isfinite(epsx, _state), "MinQPSetAlgoDenseAUL: EpsX is not finite number", _state); ae_assert(ae_fp_greater_eq(epsx,(double)(0)), "MinQPSetAlgoDenseAUL: negative EpsX", _state); ae_assert(ae_isfinite(rho, _state), "MinQPSetAlgoDenseAUL: Rho is not finite number", _state); ae_assert(ae_fp_greater(rho,(double)(0)), "MinQPSetAlgoDenseAUL: non-positive Rho", _state); ae_assert(itscnt>=0, "MinQPSetAlgoDenseAUL: negative ItsCnt!", _state); state->algokind = 4; if( ae_fp_eq(epsx,(double)(0)) ) { epsx = 1.0E-8; } if( itscnt==0 ) { itscnt = 10; } state->qpdenseaulsettingsuser.epsx = epsx; state->qpdenseaulsettingsuser.outerits = itscnt; state->qpdenseaulsettingsuser.rho = rho; } /************************************************************************* This function tells solver to use QuickQP algorithm: special extra-fast algorithm for problems with box-only constrants. It may solve non-convex problems as long as they are bounded from below under constraints. ALGORITHM FEATURES: * many times (from 5x to 50x!) faster than BLEIC-based QP solver; utilizes accelerated methods for activation of constraints. * supports dense and sparse QP problems * supports ONLY box constraints; general linear constraints are NOT supported by this solver * can solve all types of problems (convex, semidefinite, nonconvex) as long as they are bounded from below under constraints. Say, it is possible to solve "min{-x^2} subject to -1<=x<=+1". In convex/semidefinite case global minimum is returned, in nonconvex case - algorithm returns one of the local minimums. ALGORITHM OUTLINE: * algorithm performs two kinds of iterations: constrained CG iterations and constrained Newton iterations * initially it performs small number of constrained CG iterations, which can efficiently activate/deactivate multiple constraints * after CG phase algorithm tries to calculate Cholesky decomposition and to perform several constrained Newton steps. If Cholesky decomposition failed (matrix is indefinite even under constraints), we perform more CG iterations until we converge to such set of constraints that system matrix becomes positive definite. Constrained Newton steps greatly increase convergence speed and precision. 
* algorithm interleaves CG and Newton iterations which allows to handle indefinite matrices (CG phase) and quickly converge after final set of constraints is found (Newton phase). Combination of CG and Newton phases is called "outer iteration". * it is possible to turn off Newton phase (beneficial for semidefinite problems - Cholesky decomposition will fail too often) ALGORITHM LIMITATIONS: * algorithm does not support general linear constraints; only box ones are supported * Cholesky decomposition for sparse problems is performed with Skyline Cholesky solver, which is intended for low-profile matrices. No profile- reducing reordering of variables is performed in this version of ALGLIB. * problems with near-zero negative eigenvalues (or exacty zero ones) may experience about 2-3x performance penalty. The reason is that Cholesky decomposition can not be performed until we identify directions of zero and negative curvature and activate corresponding boundary constraints - but we need a lot of trial and errors because these directions are hard to notice in the matrix spectrum. In this case you may turn off Newton phase of algorithm. Large negative eigenvalues are not an issue, so highly non-convex problems can be solved very efficiently. INPUT PARAMETERS: State - structure which stores algorithm state EpsG - >=0 The subroutine finishes its work if the condition |v|=0 The subroutine finishes its work if exploratory steepest descent step on k+1-th iteration satisfies following condition: |F(k+1)-F(k)|<=EpsF*max{|F(k)|,|F(k+1)|,1} EpsX - >=0 The subroutine finishes its work if exploratory steepest descent step on k+1-th iteration satisfies following condition: * |.| means Euclidian norm * v - scaled step vector, v[i]=dx[i]/s[i] * dx - step vector, dx=X(k+1)-X(k) * s - scaling coefficients set by MinQPSetScale() MaxOuterIts-maximum number of OUTER iterations. One outer iteration includes some amount of CG iterations (from 5 to ~N) and one or several (usually small amount) Newton steps. Thus, one outer iteration has high cost, but can greatly reduce funcation value. Use 0 if you do not want to limit number of outer iterations. UseNewton- use Newton phase or not: * Newton phase improves performance of positive definite dense problems (about 2 times improvement can be observed) * can result in some performance penalty on semidefinite or slightly negative definite problems - each Newton phase will bring no improvement (Cholesky failure), but still will require computational time. * if you doubt, you can turn off this phase - optimizer will retain its most of its high speed. IT IS VERY IMPORTANT TO CALL MinQPSetScale() WHEN YOU USE THIS ALGORITHM BECAUSE ITS STOPPING CRITERIA ARE SCALE-DEPENDENT! Passing EpsG=0, EpsF=0 and EpsX=0 and MaxIts=0 (simultaneously) will lead to automatic stopping criterion selection (presently it is small step length, but it may change in the future versions of ALGLIB). 
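
A similar sketch for the QuickQP solver (box constraints only), again assuming the C++ wrapper interface and purely illustrative data; the last argument of minqpsetalgoquickqp() controls the Newton phase discussed above:

    #include <cstdio>
    #include "optimization.h"
    using namespace alglib;

    int main()
    {
        // illustrative box-constrained problem: minimize 0.5*x'*A*x + b'*x, 0<=x[i]<=1
        real_2d_array a = "[[3,1],[1,3]]";
        real_1d_array b = "[-1,-1]";
        real_1d_array bndl = "[0,0]";
        real_1d_array bndu = "[1,1]";
        real_1d_array s = "[1,1]";
        real_1d_array x;
        minqpstate state;
        minqpreport rep;

        minqpcreate(2, state);
        minqpsetquadraticterm(state, a);
        minqpsetlinearterm(state, b);
        minqpsetbc(state, bndl, bndu);       // QuickQP supports box constraints only
        minqpsetscale(state, s);

        // zero tolerances + MaxOuterIts=0 => automatic stopping criteria;
        // UseNewton=true enables the Newton phase (useful for convex dense problems)
        minqpsetalgoquickqp(state, 0.0, 0.0, 0.0, 0, true);
        minqpoptimize(state);
        minqpresults(state, x, rep);
        printf("%d: %s\n", int(rep.terminationtype), x.tostring(2).c_str());
        return 0;
    }
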
-- ALGLIB -- Copyright 22.05.2014 by Bochkanov Sergey *************************************************************************/ void minqpsetalgoquickqp(minqpstate* state, double epsg, double epsf, double epsx, ae_int_t maxouterits, ae_bool usenewton, ae_state *_state) { ae_assert(ae_isfinite(epsg, _state), "MinQPSetAlgoQuickQP: EpsG is not finite number", _state); ae_assert(ae_fp_greater_eq(epsg,(double)(0)), "MinQPSetAlgoQuickQP: negative EpsG", _state); ae_assert(ae_isfinite(epsf, _state), "MinQPSetAlgoQuickQP: EpsF is not finite number", _state); ae_assert(ae_fp_greater_eq(epsf,(double)(0)), "MinQPSetAlgoQuickQP: negative EpsF", _state); ae_assert(ae_isfinite(epsx, _state), "MinQPSetAlgoQuickQP: EpsX is not finite number", _state); ae_assert(ae_fp_greater_eq(epsx,(double)(0)), "MinQPSetAlgoQuickQP: negative EpsX", _state); ae_assert(maxouterits>=0, "MinQPSetAlgoQuickQP: negative MaxOuterIts!", _state); state->algokind = 3; if( ((ae_fp_eq(epsg,(double)(0))&&ae_fp_eq(epsf,(double)(0)))&&ae_fp_eq(epsx,(double)(0)))&&maxouterits==0 ) { epsx = 1.0E-6; } state->qqpsettingsuser.maxouterits = maxouterits; state->qqpsettingsuser.epsg = epsg; state->qqpsettingsuser.epsf = epsf; state->qqpsettingsuser.epsx = epsx; state->qqpsettingsuser.cnphase = usenewton; } /************************************************************************* This function sets box constraints for QP solver Box constraints are inactive by default (after initial creation). After being set, they are preserved until explicitly turned off with another SetBC() call. All QP solvers may handle box constraints. INPUT PARAMETERS: State - structure stores algorithm state BndL - lower bounds, array[N]. If some (all) variables are unbounded, you may specify very small number or -INF (latter is recommended because it will allow solver to use better algorithm). BndU - upper bounds, array[N]. If some (all) variables are unbounded, you may specify very large number or +INF (latter is recommended because it will allow solver to use better algorithm). NOTE: it is possible to specify BndL[i]=BndU[i]. In this case I-th variable will be "frozen" at X[i]=BndL[i]=BndU[i]. -- ALGLIB -- Copyright 11.01.2011 by Bochkanov Sergey *************************************************************************/ void minqpsetbc(minqpstate* state, /* Real */ ae_vector* bndl, /* Real */ ae_vector* bndu, ae_state *_state) { ae_int_t i; ae_int_t n; n = state->n; ae_assert(bndl->cnt>=n, "MinQPSetBC: Length(BndL)cnt>=n, "MinQPSetBC: Length(BndU)ptr.p_double[i], _state)||ae_isneginf(bndl->ptr.p_double[i], _state), "MinQPSetBC: BndL contains NAN or +INF", _state); ae_assert(ae_isfinite(bndu->ptr.p_double[i], _state)||ae_isposinf(bndu->ptr.p_double[i], _state), "MinQPSetBC: BndU contains NAN or -INF", _state); state->bndl.ptr.p_double[i] = bndl->ptr.p_double[i]; state->havebndl.ptr.p_bool[i] = ae_isfinite(bndl->ptr.p_double[i], _state); state->bndu.ptr.p_double[i] = bndu->ptr.p_double[i]; state->havebndu.ptr.p_bool[i] = ae_isfinite(bndu->ptr.p_double[i], _state); } } /************************************************************************* This function sets dense linear constraints for QP optimizer. This function overrides results of previous calls to minqpsetlc(), minqpsetlcsparse() and minqpsetlcmixed(). After call to this function sparse constraints are dropped, and you have only those constraints which were specified in the present call. If you want to specify mixed (with dense and sparse terms) linear constraints, you should call minqpsetlcmixed(). 
SUPPORT BY QP SOLVERS: Following QP solvers can handle dense linear constraints: * BLEIC-QP - handles them with high precision, but may be inefficient for problems with hundreds of constraints * Dense-AUL-QP - handles them with moderate precision (approx. 10^-6), may efficiently handle thousands of constraints. Following QP solvers can NOT handle dense linear constraints: * QuickQP - can not handle general linear constraints INPUT PARAMETERS: State - structure previously allocated with MinQPCreate call. C - linear constraints, array[K,N+1]. Each row of C represents one constraint, either equality or inequality (see below): * first N elements correspond to coefficients, * last element corresponds to the right part. All elements of C (including right part) must be finite. CT - type of constraints, array[K]: * if CT[i]>0, then I-th constraint is C[i,*]*x >= C[i,n+1] * if CT[i]=0, then I-th constraint is C[i,*]*x = C[i,n+1] * if CT[i]<0, then I-th constraint is C[i,*]*x <= C[i,n+1] K - number of equality/inequality constraints, K>=0: * if given, only leading K elements of C/CT are used * if not given, automatically determined from sizes of C/CT NOTE 1: linear (non-bound) constraints are satisfied only approximately - there always exists some violation due to numerical errors and algorithmic limitations (BLEIC-QP solver is most precise, AUL-QP solver is less precise). -- ALGLIB -- Copyright 19.06.2012 by Bochkanov Sergey *************************************************************************/ void minqpsetlc(minqpstate* state, /* Real */ ae_matrix* c, /* Integer */ ae_vector* ct, ae_int_t k, ae_state *_state) { ae_frame _frame_block; sparsematrix dummyc; ae_vector dummyct; ae_frame_make(_state, &_frame_block); _sparsematrix_init(&dummyc, _state); ae_vector_init(&dummyct, 0, DT_INT, _state); minqpsetlcmixed(state, c, ct, k, &dummyc, &dummyct, 0, _state); ae_frame_leave(_state); } /************************************************************************* This function sets sparse linear constraints for QP optimizer. This function overrides results of previous calls to minqpsetlc(), minqpsetlcsparse() and minqpsetlcmixed(). After call to this function dense constraints are dropped, and you have only those constraints which were specified in the present call. If you want to specify mixed (with dense and sparse terms) linear constraints, you should call minqpsetlcmixed(). SUPPORT BY QP SOLVERS: Following QP solvers can handle sparse linear constraints: * BLEIC-QP - handles them with high precision, but can not utilize their sparsity - sparse constraint matrix is silently converted to dense format. Thus, it may be inefficient for problems with hundreds of constraints. * Dense-AUL-QP - although this solver uses dense linear algebra to calculate Cholesky preconditioner, it may efficiently handle sparse constraints. It may solve problems with hundreds and thousands of constraints. The only drawback is that precision of constraint handling is typically within 1E-4... ..1E-6 range. Following QP solvers can NOT handle sparse linear constraints: * QuickQP - can not handle general linear constraints INPUT PARAMETERS: State - structure previously allocated with MinQPCreate call. C - linear constraints, sparse matrix with dimensions at least [K,N+1]. If matrix has larger size, only leading Kx(N+1) rectangle is used. Each row of C represents one constraint, either equality or inequality (see below): * first N elements correspond to coefficients, * last element corresponds to the right part. 
All elements of C (including right part) must be finite. CT - type of constraints, array[K]: * if CT[i]>0, then I-th constraint is C[i,*]*x >= C[i,n+1] * if CT[i]=0, then I-th constraint is C[i,*]*x = C[i,n+1] * if CT[i]<0, then I-th constraint is C[i,*]*x <= C[i,n+1] K - number of equality/inequality constraints, K>=0 NOTE 1: linear (non-bound) constraints are satisfied only approximately - there always exists some violation due to numerical errors and algorithmic limitations (BLEIC-QP solver is most precise, AUL-QP solver is less precise). -- ALGLIB -- Copyright 22.08.2016 by Bochkanov Sergey *************************************************************************/ void minqpsetlcsparse(minqpstate* state, sparsematrix* c, /* Integer */ ae_vector* ct, ae_int_t k, ae_state *_state) { ae_frame _frame_block; ae_matrix dummyc; ae_vector dummyct; ae_frame_make(_state, &_frame_block); ae_matrix_init(&dummyc, 0, 0, DT_REAL, _state); ae_vector_init(&dummyct, 0, DT_INT, _state); minqpsetlcmixed(state, &dummyc, &dummyct, 0, c, ct, k, _state); ae_frame_leave(_state); } /************************************************************************* This function sets mixed linear constraints, which include a set of dense rows, and a set of sparse rows. This function overrides results of previous calls to minqpsetlc(), minqpsetlcsparse() and minqpsetlcmixed(). This function may be useful if constraint matrix includes large number of both types of rows - dense and sparse. If you have just a few sparse rows, you may represent them in dense format without loosing performance. Similarly, if you have just a few dense rows, you may store them in sparse format with almost same performance. SUPPORT BY QP SOLVERS: Following QP solvers can handle mixed dense/sparse linear constraints: * BLEIC-QP - handles them with high precision, but can not utilize their sparsity - sparse constraint matrix is silently converted to dense format. Thus, it may be inefficient for problems with hundreds of constraints. * Dense-AUL-QP - although this solver uses dense linear algebra to calculate Cholesky preconditioner, it may efficiently handle sparse constraints. It may solve problems with hundreds and thousands of constraints. The only drawback is that precision of constraint handling is typically within 1E-4... ..1E-6 range. Following QP solvers can NOT handle mixed linear constraints: * QuickQP - can not handle general linear constraints at all INPUT PARAMETERS: State - structure previously allocated with MinQPCreate call. DenseC - dense linear constraints, array[K,N+1]. Each row of DenseC represents one constraint, either equality or inequality (see below): * first N elements correspond to coefficients, * last element corresponds to the right part. All elements of DenseC (including right part) must be finite. DenseCT - type of constraints, array[K]: * if DenseCT[i]>0, then I-th constraint is DenseC[i,*]*x >= DenseC[i,n+1] * if DenseCT[i]=0, then I-th constraint is DenseC[i,*]*x = DenseC[i,n+1] * if DenseCT[i]<0, then I-th constraint is DenseC[i,*]*x <= DenseC[i,n+1] DenseK - number of equality/inequality constraints, DenseK>=0 SparseC - linear constraints, sparse matrix with dimensions at least [SparseK,N+1]. If matrix has larger size, only leading SPARSEKx(N+1) rectangle is used. Each row of C represents one constraint, either equality or inequality (see below): * first N elements correspond to coefficients, * last element corresponds to the right part. All elements of C (including right part) must be finite. 
SparseCT- type of sparse constraints, array[K]: * if SparseCT[i]>0, then I-th constraint is SparseC[i,*]*x >= SparseC[i,n+1] * if SparseCT[i]=0, then I-th constraint is SparseC[i,*]*x = SparseC[i,n+1] * if SparseCT[i]<0, then I-th constraint is SparseC[i,*]*x <= SparseC[i,n+1] SparseK - number of sparse equality/inequality constraints, K>=0 NOTE 1: linear (non-bound) constraints are satisfied only approximately - there always exists some violation due to numerical errors and algorithmic limitations (BLEIC-QP solver is most precise, AUL-QP solver is less precise). -- ALGLIB -- Copyright 22.08.2016 by Bochkanov Sergey *************************************************************************/ void minqpsetlcmixed(minqpstate* state, /* Real */ ae_matrix* densec, /* Integer */ ae_vector* densect, ae_int_t densek, sparsematrix* sparsec, /* Integer */ ae_vector* sparsect, ae_int_t sparsek, ae_state *_state) { ae_frame _frame_block; ae_int_t n; ae_int_t i; ae_int_t j; ae_int_t j0; double v; ae_vector srcidx; ae_vector dstidx; ae_vector s; ae_vector rs; ae_vector eoffs; ae_vector roffs; ae_vector v2; ae_vector eidx; ae_vector eval; ae_int_t t0; ae_int_t t1; ae_int_t nnz; ae_frame_make(_state, &_frame_block); ae_vector_init(&srcidx, 0, DT_INT, _state); ae_vector_init(&dstidx, 0, DT_INT, _state); ae_vector_init(&s, 0, DT_REAL, _state); ae_vector_init(&rs, 0, DT_INT, _state); ae_vector_init(&eoffs, 0, DT_INT, _state); ae_vector_init(&roffs, 0, DT_INT, _state); ae_vector_init(&v2, 0, DT_REAL, _state); ae_vector_init(&eidx, 0, DT_INT, _state); ae_vector_init(&eval, 0, DT_REAL, _state); n = state->n; /* * First, check for errors in the inputs */ ae_assert(densek>=0, "MinQPSetLCMixed: K<0", _state); ae_assert(densek==0||densec->cols>=n+1, "MinQPSetLCMixed: Cols(C)rows>=densek, "MinQPSetLCMixed: Rows(DenseC)cnt>=densek, "MinQPSetLCMixed: Length(DenseCT)=0, "MinQPSetLCMixed: SparseK<0", _state); ae_assert(sparsek==0||sparsegetncols(sparsec, _state)>=n+1, "MinQPSetLCMixed: Cols(SparseC)=sparsek, "MinQPSetLCMixed: Rows(SparseC)cnt>=sparsek, "MinQPSetLCMixed: Length(SparseCT)nec = 0; state->nic = 0; state->snec = 0; state->snic = 0; if( densek+sparsek==0 ) { ae_frame_leave(_state); return; } /* * If we have dense constraints */ if( densek>0 ) { /* * Equality constraints are stored first, in the upper * NEC rows of State.CLEIC matrix. Inequality constraints * are stored in the next NIC rows. * * NOTE: we convert inequality constraints to the form * A*x<=b before copying them. */ rmatrixsetlengthatleast(&state->cleic, densek, n+1, _state); for(i=0; i<=densek-1; i++) { if( densect->ptr.p_int[i]==0 ) { ae_v_move(&state->cleic.ptr.pp_double[state->nec][0], 1, &densec->ptr.pp_double[i][0], 1, ae_v_len(0,n)); state->nec = state->nec+1; } } for(i=0; i<=densek-1; i++) { if( densect->ptr.p_int[i]!=0 ) { if( densect->ptr.p_int[i]>0 ) { ae_v_moveneg(&state->cleic.ptr.pp_double[state->nec+state->nic][0], 1, &densec->ptr.pp_double[i][0], 1, ae_v_len(0,n)); } else { ae_v_move(&state->cleic.ptr.pp_double[state->nec+state->nic][0], 1, &densec->ptr.pp_double[i][0], 1, ae_v_len(0,n)); } state->nic = state->nic+1; } } /* * Normalize rows of State.CLEIC: each row must have unit norm. * Norm is calculated using first N elements (i.e. right part is * not counted when we calculate norm). 
*/ if( !state->dbgskipconstraintnormalization ) { for(i=0; i<=densek-1; i++) { v = (double)(0); for(j=0; j<=n-1; j++) { v = v+ae_sqr(state->cleic.ptr.pp_double[i][j], _state); } if( ae_fp_eq(v,(double)(0)) ) { continue; } v = 1/ae_sqrt(v, _state); ae_v_muld(&state->cleic.ptr.pp_double[i][0], 1, ae_v_len(0,n), v); } } } /* * If we have sparse constraints * * NOTE: we have to write our own conversion code * because standard implementation converts whole * matrix, and we need just leading rectangle. */ if( sparsek>0 ) { /* * Calculate metrics: * * row sizes for CRS storage. * * row norms for scaling * * number of non-zero elements * * in addition, we check for finiteness of matrix elements */ ae_vector_set_length(&rs, sparsek, _state); ae_vector_set_length(&v2, sparsek, _state); for(i=0; i<=sparsek-1; i++) { rs.ptr.p_int[i] = 0; v2.ptr.p_double[i] = (double)(0); } t0 = 0; t1 = 0; nnz = 0; while(sparseenumerate(sparsec, &t0, &t1, &i, &j, &v, _state)) { if( i>sparsek-1||j>n ) { continue; } ae_assert(ae_isfinite(v, _state), "MinQPSetLCSparse: C contains infinite or NAN values", _state); nnz = nnz+1; rs.ptr.p_int[i] = rs.ptr.p_int[i]+1; if( j0 ) { /* * Determine actual order in which constraints are stored. * * Equality constraints are stored first, in the upper * SNEC rows of State.CLEIC matrix. Inequality constraints * are stored in the next NIC rows. * * NOTE: we convert inequality constraints to the form * A*x<=b before copying them. */ ae_vector_set_length(&srcidx, sparsek, _state); ae_vector_set_length(&dstidx, sparsek, _state); ae_vector_set_length(&s, sparsek, _state); for(i=0; i<=sparsek-1; i++) { srcidx.ptr.p_int[i] = -1; } for(i=0; i<=sparsek-1; i++) { if( sparsect->ptr.p_int[i]==0 ) { s.ptr.p_double[i] = (double)(1); srcidx.ptr.p_int[state->snec] = i; state->snec = state->snec+1; } else { if( sparsect->ptr.p_int[i]>0 ) { s.ptr.p_double[i] = (double)(-1); } else { s.ptr.p_double[i] = (double)(1); } srcidx.ptr.p_int[sparsek-1-state->snic] = i; state->snic = state->snic+1; } } for(i=0; i<=sparsek-1; i++) { ae_assert(srcidx.ptr.p_int[i]>=0, "MinQPSetLCSparse: integrity check failed", _state); } for(i=0; i<=sparsek-1; i++) { dstidx.ptr.p_int[srcidx.ptr.p_int[i]] = i; } /* * Prepare CRS conversion. 
*/ state->scleic.m = sparsek; state->scleic.n = n+1; ivectorsetlengthatleast(&state->scleic.ridx, sparsek+1, _state); ae_vector_set_length(&eoffs, sparsek+1, _state); state->scleic.ridx.ptr.p_int[0] = 0; eoffs.ptr.p_int[0] = 0; for(i=1; i<=sparsek; i++) { state->scleic.ridx.ptr.p_int[i] = state->scleic.ridx.ptr.p_int[i-1]+rs.ptr.p_int[srcidx.ptr.p_int[i-1]]; eoffs.ptr.p_int[i] = state->scleic.ridx.ptr.p_int[i]; } for(i=0; i<=sparsek-1; i++) { if( ae_fp_greater(v2.ptr.p_double[i],(double)(0))&&!state->dbgskipconstraintnormalization ) { v2.ptr.p_double[i] = s.ptr.p_double[i]/ae_sqrt(v2.ptr.p_double[i], _state); } else { v2.ptr.p_double[i] = s.ptr.p_double[i]; } } ivectorsetlengthatleast(&state->scleic.idx, nnz, _state); rvectorsetlengthatleast(&state->scleic.vals, nnz, _state); t0 = 0; t1 = 0; while(sparseenumerate(sparsec, &t0, &t1, &i, &j, &v, _state)) { if( i>sparsek-1||j>n ) { continue; } j0 = eoffs.ptr.p_int[dstidx.ptr.p_int[i]]; state->scleic.idx.ptr.p_int[j0] = j; state->scleic.vals.ptr.p_double[j0] = v*v2.ptr.p_double[i]; eoffs.ptr.p_int[dstidx.ptr.p_int[i]] = j0+1; } for(i=0; i<=sparsek-1; i++) { ae_assert(eoffs.ptr.p_int[i]==state->scleic.ridx.ptr.p_int[i+1], "MinQPSetLCSparse: integrity check failed", _state); } sparsecreatecrsinplace(&state->scleic, _state); } } ae_frame_leave(_state); } /************************************************************************* This function solves quadratic programming problem. Prior to calling this function you should choose solver by means of one of the following functions: * minqpsetalgoquickqp() - for QuickQP solver * minqpsetalgobleic() - for BLEIC-QP solver * minqpsetalgodenseaul() - for Dense-AUL-QP solver These functions also allow you to control stopping criteria of the solver. If you did not set solver, MinQP subpackage will automatically select solver for your problem and will run it with default stopping criteria. However, it is better to set explicitly solver and its stopping criteria. INPUT PARAMETERS: State - algorithm state You should use MinQPResults() function to access results after calls to this function. -- ALGLIB -- Copyright 11.01.2011 by Bochkanov Sergey. Special thanks to Elvira Illarionova for important suggestions on the linearly constrained QP algorithm. 
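
A sketch of the complete set/optimize/report cycle with general linear constraints, using the Dense-AUL solver; as before it assumes the C++ wrapper interface, and the constraint data and parameters (Rho=100, 10 outer iterations) are illustrative only:

    #include <cstdio>
    #include "optimization.h"
    using namespace alglib;

    int main()
    {
        // illustrative problem: minimize 0.5*x'*A*x + b'*x
        // subject to x0+x1=1 (equality) and x0-x1<=0.5 (inequality)
        real_2d_array a = "[[2,0],[0,2]]";
        real_1d_array b = "[-1,-1]";
        real_2d_array c = "[[1,1,1],[1,-1,0.5]]";   // each row: N coefficients, then the right part
        integer_1d_array ct = "[0,-1]";             // 0 => '=' constraint, -1 => '<=' constraint
        real_1d_array s = "[1,1]";
        real_1d_array x;
        minqpstate state;
        minqpreport rep;

        minqpcreate(2, state);
        minqpsetquadraticterm(state, a);
        minqpsetlinearterm(state, b);
        minqpsetlc(state, c, ct);                   // dense linear constraints
        minqpsetscale(state, s);

        // Dense-AUL: EpsX=1E-8 (inner criterion), Rho=100, at most 10 outer iterations
        minqpsetalgodenseaul(state, 1.0e-8, 100.0, 10);
        minqpoptimize(state);
        minqpresults(state, x, rep);
        printf("termination type: %d\n", int(rep.terminationtype));  // positive on success
        printf("outer iterations: %d\n", int(rep.outeriterationscount));
        printf("x = %s\n", x.tostring(4).c_str());
        return 0;
    }
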
*************************************************************************/ void minqpoptimize(minqpstate* state, ae_state *_state) { ae_int_t n; ae_int_t i; ae_int_t j; ae_int_t idst; ae_int_t isrc; ae_int_t j0; ae_int_t j1; ae_int_t nbc; n = state->n; state->repterminationtype = -5; state->repinneriterationscount = 0; state->repouteriterationscount = 0; state->repncholesky = 0; state->repnmv = 0; /* * check correctness of constraints */ for(i=0; i<=n-1; i++) { if( state->havebndl.ptr.p_bool[i]&&state->havebndu.ptr.p_bool[i] ) { if( ae_fp_greater(state->bndl.ptr.p_double[i],state->bndu.ptr.p_double[i]) ) { state->repterminationtype = -3; return; } } } /* * count number of bound and linear constraints */ nbc = 0; for(i=0; i<=n-1; i++) { if( state->havebndl.ptr.p_bool[i] ) { nbc = nbc+1; } if( state->havebndu.ptr.p_bool[i] ) { nbc = nbc+1; } } /* * Initial point: * * if we have starting point in StartX, we just have to bound it * * if we do not have StartX, deduce initial point from boundary constraints */ if( state->havex ) { for(i=0; i<=n-1; i++) { state->xs.ptr.p_double[i] = state->startx.ptr.p_double[i]; if( state->havebndl.ptr.p_bool[i]&&ae_fp_less(state->xs.ptr.p_double[i],state->bndl.ptr.p_double[i]) ) { state->xs.ptr.p_double[i] = state->bndl.ptr.p_double[i]; } if( state->havebndu.ptr.p_bool[i]&&ae_fp_greater(state->xs.ptr.p_double[i],state->bndu.ptr.p_double[i]) ) { state->xs.ptr.p_double[i] = state->bndu.ptr.p_double[i]; } } } else { for(i=0; i<=n-1; i++) { if( state->havebndl.ptr.p_bool[i]&&state->havebndu.ptr.p_bool[i] ) { state->xs.ptr.p_double[i] = 0.5*(state->bndl.ptr.p_double[i]+state->bndu.ptr.p_double[i]); continue; } if( state->havebndl.ptr.p_bool[i] ) { state->xs.ptr.p_double[i] = state->bndl.ptr.p_double[i]; continue; } if( state->havebndu.ptr.p_bool[i] ) { state->xs.ptr.p_double[i] = state->bndu.ptr.p_double[i]; continue; } state->xs.ptr.p_double[i] = (double)(0); } } /* * QP-BLEIC solver */ if( state->algokind==2 ) { /* * Combine dense and sparse constraints in temporary dense storage */ rmatrixsetlengthatleast(&state->ecleic, state->nec+state->nic+state->snec+state->snic, n+1, _state); for(i=0; i<=state->nec-1; i++) { ae_v_move(&state->ecleic.ptr.pp_double[i][0], 1, &state->cleic.ptr.pp_double[i][0], 1, ae_v_len(0,n)); } for(i=0; i<=state->snec-1; i++) { ae_assert(state->scleic.matrixtype==1, "MinQPOptimize: integrity check failed", _state); idst = i+state->nec; for(j=0; j<=n; j++) { state->ecleic.ptr.pp_double[idst][j] = (double)(0); } j0 = state->scleic.ridx.ptr.p_int[i]; j1 = state->scleic.ridx.ptr.p_int[i+1]; for(j=j0; j<=j1-1; j++) { state->ecleic.ptr.pp_double[idst][state->scleic.idx.ptr.p_int[j]] = state->scleic.vals.ptr.p_double[j]; } } for(i=0; i<=state->nic-1; i++) { idst = i+state->nec+state->snec; isrc = i+state->nec; ae_v_move(&state->ecleic.ptr.pp_double[idst][0], 1, &state->cleic.ptr.pp_double[isrc][0], 1, ae_v_len(0,n)); } for(i=0; i<=state->snic-1; i++) { ae_assert(state->scleic.matrixtype==1, "MinQPOptimize: integrity check failed", _state); idst = i+state->nec+state->snec+state->nic; isrc = i+state->snec; for(j=0; j<=n; j++) { state->ecleic.ptr.pp_double[idst][j] = (double)(0); } j0 = state->scleic.ridx.ptr.p_int[isrc]; j1 = state->scleic.ridx.ptr.p_int[isrc+1]; for(j=j0; j<=j1-1; j++) { state->ecleic.ptr.pp_double[idst][state->scleic.idx.ptr.p_int[j]] = state->scleic.vals.ptr.p_double[j]; } } /* * Solve */ qpbleicoptimize(&state->a, &state->sparsea, state->akind, state->sparseaupper, state->absasum, state->absasum2, &state->b, &state->bndl, 
&state->bndu, &state->s, &state->xorigin, n, &state->ecleic, state->nec+state->snec, state->nic+state->snic, &state->qpbleicsettingsuser, &state->qpbleicbuf, &state->qpbleicfirstcall, &state->xs, &state->repterminationtype, _state); state->repinneriterationscount = state->qpbleicbuf.repinneriterationscount; state->repouteriterationscount = state->qpbleicbuf.repouteriterationscount; return; } /* * QuickQP solver */ if( state->algokind==3 ) { if( state->nec+state->nic>0 ) { state->repterminationtype = -5; return; } if( state->snec+state->snic>0 ) { state->repterminationtype = -5; return; } qqpoptimize(&state->a, &state->sparsea, &state->dummyr2, state->akind, state->sparseaupper, &state->b, &state->bndl, &state->bndu, &state->s, &state->xorigin, n, &state->cleic, state->nec, state->nic, &state->qqpsettingsuser, &state->qqpbuf, &state->xs, &state->repterminationtype, _state); state->repinneriterationscount = state->qqpbuf.repinneriterationscount; state->repouteriterationscount = state->qqpbuf.repouteriterationscount; state->repncholesky = state->qqpbuf.repncholesky; return; } /* * QP-DenseAUL solver */ if( state->algokind==4 ) { /* * Solve */ qpdenseauloptimize(&state->a, &state->sparsea, state->akind, state->sparseaupper, &state->b, &state->bndl, &state->bndu, &state->s, &state->xorigin, n, &state->cleic, state->nec, state->nic, &state->scleic, state->snec, state->snic, !state->dbgskipconstraintnormalization, &state->qpdenseaulsettingsuser, &state->qpdenseaulbuf, &state->xs, &state->repterminationtype, _state); state->repinneriterationscount = state->qpdenseaulbuf.repinneriterationscount; state->repouteriterationscount = state->qpdenseaulbuf.repouteriterationscount; state->repncholesky = state->qpdenseaulbuf.repncholesky; return; } /* * Cholesky solver. */ if( state->algokind==1 ) { /* * Check matrix type. * Cholesky solver supports only dense matrices. */ if( state->akind!=0 ) { state->repterminationtype = -5; return; } if( state->snec+state->snic>0 ) { state->repterminationtype = -5; return; } qpcholeskyoptimize(&state->a, state->absamax*n, &state->b, &state->bndl, &state->bndu, &state->s, &state->xorigin, n, &state->cleic, state->nec, state->nic, &state->qpcholeskybuf, &state->xs, &state->repterminationtype, _state); state->repinneriterationscount = state->qpcholeskybuf.repinneriterationscount; state->repouteriterationscount = state->qpcholeskybuf.repouteriterationscount; state->repncholesky = state->qpcholeskybuf.repncholesky; return; } /* * Integrity check failed - unknown solver */ ae_assert(ae_false, "MinQPOptimize: integrity check failed - unknown solver", _state); } /************************************************************************* QP solver results INPUT PARAMETERS: State - algorithm state OUTPUT PARAMETERS: X - array[0..N-1], solution. This array is allocated and initialized only when Rep.TerminationType parameter is positive (success). Rep - optimization report. You should check Rep.TerminationType, which contains completion code, and you may check another fields which contain another information about algorithm functioning. Failure codes returned by algorithm are: * -5 inappropriate solver was used: * QuickQP solver for problem with general linear constraints * -4 BLEIC-QP/QuickQP solver found unconstrained direction of negative curvature (function is unbounded from below even under constraints), no meaningful minimum can be found. * -3 inconsistent constraints (or maybe feasible point is too hard to find). 
If you are sure that constraints are feasible, try to restart optimizer with better initial approximation. Completion codes specific for Cholesky algorithm: * 4 successful completion Completion codes specific for BLEIC/QuickQP algorithms: * 1 relative function improvement is no more than EpsF. * 2 scaled step is no more than EpsX. * 4 scaled gradient norm is no more than EpsG. * 5 MaxIts steps was taken -- ALGLIB -- Copyright 11.01.2011 by Bochkanov Sergey *************************************************************************/ void minqpresults(minqpstate* state, /* Real */ ae_vector* x, minqpreport* rep, ae_state *_state) { ae_vector_clear(x); _minqpreport_clear(rep); minqpresultsbuf(state, x, rep, _state); } /************************************************************************* QP results Buffered implementation of MinQPResults() which uses pre-allocated buffer to store X[]. If buffer size is too small, it resizes buffer. It is intended to be used in the inner cycles of performance critical algorithms where array reallocation penalty is too large to be ignored. -- ALGLIB -- Copyright 11.01.2011 by Bochkanov Sergey *************************************************************************/ void minqpresultsbuf(minqpstate* state, /* Real */ ae_vector* x, minqpreport* rep, ae_state *_state) { if( x->cntn ) { ae_vector_set_length(x, state->n, _state); } ae_v_move(&x->ptr.p_double[0], 1, &state->xs.ptr.p_double[0], 1, ae_v_len(0,state->n-1)); rep->inneriterationscount = state->repinneriterationscount; rep->outeriterationscount = state->repouteriterationscount; rep->nmv = state->repnmv; rep->ncholesky = state->repncholesky; rep->terminationtype = state->repterminationtype; } /************************************************************************* Fast version of MinQPSetLinearTerm(), which doesn't check its arguments. For internal use only. -- ALGLIB -- Copyright 11.01.2011 by Bochkanov Sergey *************************************************************************/ void minqpsetlineartermfast(minqpstate* state, /* Real */ ae_vector* b, ae_state *_state) { ae_v_move(&state->b.ptr.p_double[0], 1, &b->ptr.p_double[0], 1, ae_v_len(0,state->n-1)); } /************************************************************************* Fast version of MinQPSetQuadraticTerm(), which doesn't check its arguments. It accepts additional parameter - shift S, which allows to "shift" matrix A by adding s*I to A. S must be positive (although it is not checked). For internal use only. 
-- ALGLIB -- Copyright 11.01.2011 by Bochkanov Sergey *************************************************************************/ void minqpsetquadratictermfast(minqpstate* state, /* Real */ ae_matrix* a, ae_bool isupper, double s, ae_state *_state) { ae_int_t i; ae_int_t j; ae_int_t n; double v; ae_int_t j0; ae_int_t j1; n = state->n; state->akind = 0; cqmseta(&state->a, a, isupper, 1.0, _state); if( ae_fp_greater(s,(double)(0)) ) { rvectorsetlengthatleast(&state->tmp0, n, _state); for(i=0; i<=n-1; i++) { state->tmp0.ptr.p_double[i] = a->ptr.pp_double[i][i]+s; } cqmrewritedensediagonal(&state->a, &state->tmp0, _state); } /* * Estimate norm of A * (it will be used later in the quadratic penalty function) */ state->absamax = (double)(0); state->absasum = (double)(0); state->absasum2 = (double)(0); for(i=0; i<=n-1; i++) { if( isupper ) { j0 = i; j1 = n-1; } else { j0 = 0; j1 = i; } for(j=j0; j<=j1; j++) { v = ae_fabs(a->ptr.pp_double[i][j], _state); state->absamax = ae_maxreal(state->absamax, v, _state); state->absasum = state->absasum+v; state->absasum2 = state->absasum2+v*v; } } } /************************************************************************* Internal function which allows to rewrite diagonal of quadratic term. For internal use only. This function can be used only when you have dense A and already made MinQPSetQuadraticTerm(Fast) call. -- ALGLIB -- Copyright 16.01.2011 by Bochkanov Sergey *************************************************************************/ void minqprewritediagonal(minqpstate* state, /* Real */ ae_vector* s, ae_state *_state) { cqmrewritedensediagonal(&state->a, s, _state); } /************************************************************************* Fast version of MinQPSetStartingPoint(), which doesn't check its arguments. For internal use only. -- ALGLIB -- Copyright 11.01.2011 by Bochkanov Sergey *************************************************************************/ void minqpsetstartingpointfast(minqpstate* state, /* Real */ ae_vector* x, ae_state *_state) { ae_int_t n; n = state->n; ae_v_move(&state->startx.ptr.p_double[0], 1, &x->ptr.p_double[0], 1, ae_v_len(0,n-1)); state->havex = ae_true; } /************************************************************************* Fast version of MinQPSetOrigin(), which doesn't check its arguments. For internal use only. 
-- ALGLIB -- Copyright 11.01.2011 by Bochkanov Sergey *************************************************************************/ void minqpsetoriginfast(minqpstate* state, /* Real */ ae_vector* xorigin, ae_state *_state) { ae_int_t n; n = state->n; ae_v_move(&state->xorigin.ptr.p_double[0], 1, &xorigin->ptr.p_double[0], 1, ae_v_len(0,n-1)); } void _minqpstate_init(void* _p, ae_state *_state) { minqpstate *p = (minqpstate*)_p; ae_touch_ptr((void*)p); _qqpsettings_init(&p->qqpsettingsuser, _state); _qpbleicsettings_init(&p->qpbleicsettingsuser, _state); _qpdenseaulsettings_init(&p->qpdenseaulsettingsuser, _state); _convexquadraticmodel_init(&p->a, _state); _sparsematrix_init(&p->sparsea, _state); ae_vector_init(&p->b, 0, DT_REAL, _state); ae_vector_init(&p->bndl, 0, DT_REAL, _state); ae_vector_init(&p->bndu, 0, DT_REAL, _state); ae_vector_init(&p->s, 0, DT_REAL, _state); ae_vector_init(&p->havebndl, 0, DT_BOOL, _state); ae_vector_init(&p->havebndu, 0, DT_BOOL, _state); ae_vector_init(&p->xorigin, 0, DT_REAL, _state); ae_vector_init(&p->startx, 0, DT_REAL, _state); ae_matrix_init(&p->cleic, 0, 0, DT_REAL, _state); _sparsematrix_init(&p->scleic, _state); ae_vector_init(&p->xs, 0, DT_REAL, _state); ae_vector_init(&p->tmp0, 0, DT_REAL, _state); ae_matrix_init(&p->ecleic, 0, 0, DT_REAL, _state); ae_matrix_init(&p->dummyr2, 0, 0, DT_REAL, _state); _qpbleicbuffers_init(&p->qpbleicbuf, _state); _qqpbuffers_init(&p->qqpbuf, _state); _qpdenseaulbuffers_init(&p->qpdenseaulbuf, _state); _qpcholeskybuffers_init(&p->qpcholeskybuf, _state); } void _minqpstate_init_copy(void* _dst, void* _src, ae_state *_state) { minqpstate *dst = (minqpstate*)_dst; minqpstate *src = (minqpstate*)_src; dst->n = src->n; _qqpsettings_init_copy(&dst->qqpsettingsuser, &src->qqpsettingsuser, _state); _qpbleicsettings_init_copy(&dst->qpbleicsettingsuser, &src->qpbleicsettingsuser, _state); _qpdenseaulsettings_init_copy(&dst->qpdenseaulsettingsuser, &src->qpdenseaulsettingsuser, _state); dst->dbgskipconstraintnormalization = src->dbgskipconstraintnormalization; dst->algokind = src->algokind; dst->akind = src->akind; _convexquadraticmodel_init_copy(&dst->a, &src->a, _state); _sparsematrix_init_copy(&dst->sparsea, &src->sparsea, _state); dst->sparseaupper = src->sparseaupper; dst->absamax = src->absamax; dst->absasum = src->absasum; dst->absasum2 = src->absasum2; ae_vector_init_copy(&dst->b, &src->b, _state); ae_vector_init_copy(&dst->bndl, &src->bndl, _state); ae_vector_init_copy(&dst->bndu, &src->bndu, _state); ae_vector_init_copy(&dst->s, &src->s, _state); ae_vector_init_copy(&dst->havebndl, &src->havebndl, _state); ae_vector_init_copy(&dst->havebndu, &src->havebndu, _state); ae_vector_init_copy(&dst->xorigin, &src->xorigin, _state); ae_vector_init_copy(&dst->startx, &src->startx, _state); dst->havex = src->havex; ae_matrix_init_copy(&dst->cleic, &src->cleic, _state); dst->nec = src->nec; dst->nic = src->nic; _sparsematrix_init_copy(&dst->scleic, &src->scleic, _state); dst->snec = src->snec; dst->snic = src->snic; ae_vector_init_copy(&dst->xs, &src->xs, _state); dst->repinneriterationscount = src->repinneriterationscount; dst->repouteriterationscount = src->repouteriterationscount; dst->repncholesky = src->repncholesky; dst->repnmv = src->repnmv; dst->repterminationtype = src->repterminationtype; ae_vector_init_copy(&dst->tmp0, &src->tmp0, _state); ae_matrix_init_copy(&dst->ecleic, &src->ecleic, _state); ae_matrix_init_copy(&dst->dummyr2, &src->dummyr2, _state); dst->qpbleicfirstcall = src->qpbleicfirstcall; 
_qpbleicbuffers_init_copy(&dst->qpbleicbuf, &src->qpbleicbuf, _state); _qqpbuffers_init_copy(&dst->qqpbuf, &src->qqpbuf, _state); _qpdenseaulbuffers_init_copy(&dst->qpdenseaulbuf, &src->qpdenseaulbuf, _state); _qpcholeskybuffers_init_copy(&dst->qpcholeskybuf, &src->qpcholeskybuf, _state); } void _minqpstate_clear(void* _p) { minqpstate *p = (minqpstate*)_p; ae_touch_ptr((void*)p); _qqpsettings_clear(&p->qqpsettingsuser); _qpbleicsettings_clear(&p->qpbleicsettingsuser); _qpdenseaulsettings_clear(&p->qpdenseaulsettingsuser); _convexquadraticmodel_clear(&p->a); _sparsematrix_clear(&p->sparsea); ae_vector_clear(&p->b); ae_vector_clear(&p->bndl); ae_vector_clear(&p->bndu); ae_vector_clear(&p->s); ae_vector_clear(&p->havebndl); ae_vector_clear(&p->havebndu); ae_vector_clear(&p->xorigin); ae_vector_clear(&p->startx); ae_matrix_clear(&p->cleic); _sparsematrix_clear(&p->scleic); ae_vector_clear(&p->xs); ae_vector_clear(&p->tmp0); ae_matrix_clear(&p->ecleic); ae_matrix_clear(&p->dummyr2); _qpbleicbuffers_clear(&p->qpbleicbuf); _qqpbuffers_clear(&p->qqpbuf); _qpdenseaulbuffers_clear(&p->qpdenseaulbuf); _qpcholeskybuffers_clear(&p->qpcholeskybuf); } void _minqpstate_destroy(void* _p) { minqpstate *p = (minqpstate*)_p; ae_touch_ptr((void*)p); _qqpsettings_destroy(&p->qqpsettingsuser); _qpbleicsettings_destroy(&p->qpbleicsettingsuser); _qpdenseaulsettings_destroy(&p->qpdenseaulsettingsuser); _convexquadraticmodel_destroy(&p->a); _sparsematrix_destroy(&p->sparsea); ae_vector_destroy(&p->b); ae_vector_destroy(&p->bndl); ae_vector_destroy(&p->bndu); ae_vector_destroy(&p->s); ae_vector_destroy(&p->havebndl); ae_vector_destroy(&p->havebndu); ae_vector_destroy(&p->xorigin); ae_vector_destroy(&p->startx); ae_matrix_destroy(&p->cleic); _sparsematrix_destroy(&p->scleic); ae_vector_destroy(&p->xs); ae_vector_destroy(&p->tmp0); ae_matrix_destroy(&p->ecleic); ae_matrix_destroy(&p->dummyr2); _qpbleicbuffers_destroy(&p->qpbleicbuf); _qqpbuffers_destroy(&p->qqpbuf); _qpdenseaulbuffers_destroy(&p->qpdenseaulbuf); _qpcholeskybuffers_destroy(&p->qpcholeskybuf); } void _minqpreport_init(void* _p, ae_state *_state) { minqpreport *p = (minqpreport*)_p; ae_touch_ptr((void*)p); } void _minqpreport_init_copy(void* _dst, void* _src, ae_state *_state) { minqpreport *dst = (minqpreport*)_dst; minqpreport *src = (minqpreport*)_src; dst->inneriterationscount = src->inneriterationscount; dst->outeriterationscount = src->outeriterationscount; dst->nmv = src->nmv; dst->ncholesky = src->ncholesky; dst->terminationtype = src->terminationtype; } void _minqpreport_clear(void* _p) { minqpreport *p = (minqpreport*)_p; ae_touch_ptr((void*)p); } void _minqpreport_destroy(void* _p) { minqpreport *p = (minqpreport*)_p; ae_touch_ptr((void*)p); } /************************************************************************* NONLINEARLY CONSTRAINED OPTIMIZATION WITH PRECONDITIONED AUGMENTED LAGRANGIAN ALGORITHM DESCRIPTION: The subroutine minimizes function F(x) of N arguments subject to any combination of: * bound constraints * linear inequality constraints * linear equality constraints * nonlinear equality constraints Gi(x)=0 * nonlinear inequality constraints Hi(x)<=0 REQUIREMENTS: * user must provide function value and gradient for F(), H(), G() * starting point X0 must be feasible or not too far away from the feasible set * F(), G(), H() are twice continuously differentiable on the feasible set and its neighborhood * nonlinear constraints G() and H() must have non-zero gradient at G(x)=0 and at H(x)=0. 
Say, constraint like x^2>=1 is supported, but x^2>=0 is NOT supported. USAGE: Constrained optimization if far more complex than the unconstrained one. Nonlinearly constrained optimization is one of the most esoteric numerical procedures. Here we give very brief outline of the MinNLC optimizer. We strongly recommend you to study examples in the ALGLIB Reference Manual and to read ALGLIB User Guide on optimization, which is available at http://www.alglib.net/optimization/ 1. User initializes algorithm state with MinNLCCreate() call and chooses what NLC solver to use. There is some solver which is used by default, with default settings, but you should NOT rely on default choice. It may change in future releases of ALGLIB without notice, and no one can guarantee that new solver will be able to solve your problem with default settings. From the other side, if you choose solver explicitly, you can be pretty sure that it will work with new ALGLIB releases. In the current release following solvers can be used: * AUL solver (activated with MinNLCSetAlgoAUL() function) 2. User adds boundary and/or linear and/or nonlinear constraints by means of calling one of the following functions: a) MinNLCSetBC() for boundary constraints b) MinNLCSetLC() for linear constraints c) MinNLCSetNLC() for nonlinear constraints You may combine (a), (b) and (c) in one optimization problem. 3. User sets scale of the variables with MinNLCSetScale() function. It is VERY important to set scale of the variables, because nonlinearly constrained problems are hard to solve when variables are badly scaled. 4. User sets stopping conditions with MinNLCSetCond(). If NLC solver uses inner/outer iteration layout, this function sets stopping conditions for INNER iterations. 5. User chooses one of the preconditioning methods. Preconditioning is very important for efficient handling of boundary/linear/nonlinear constraints. Without preconditioning algorithm would require thousands of iterations even for simple problems. Several preconditioners can be used: a) inexact LBFGS-based, with O(N*K) evaluation time b) exact low rank one, with O(N*K^2) evaluation time c) exact robust one, with O(N^3+K*N^2) evaluation time where K is a total number of general linear and nonlinear constraints (box ones are not counted). Since version 3.11.0 ALGLIB uses exact robust preconditioner as default option, but in some cases exact low rank one may be better option. 6. Finally, user calls MinNLCOptimize() function which takes algorithm state and pointer (delegate, etc.) to callback function which calculates F/G/H. 7. User calls MinNLCResults() to get solution 8. Optionally user may call MinNLCRestartFrom() to solve another problem with same N but another starting point. MinNLCRestartFrom() allows to reuse already initialized structure. INPUT PARAMETERS: N - problem dimension, N>0: * if given, only leading N elements of X are used * if not given, automatically determined from size ofX X - starting point, array[N]: * it is better to set X to a feasible point * but X can be infeasible, in which case algorithm will try to find feasible point first, using X as initial approximation. 
OUTPUT PARAMETERS: State - structure stores algorithm state -- ALGLIB -- Copyright 06.06.2014 by Bochkanov Sergey *************************************************************************/ void minnlccreate(ae_int_t n, /* Real */ ae_vector* x, minnlcstate* state, ae_state *_state) { _minnlcstate_clear(state); ae_assert(n>=1, "MinNLCCreate: N<1", _state); ae_assert(x->cnt>=n, "MinNLCCreate: Length(X)0: * if given, only leading N elements of X are used * if not given, automatically determined from size ofX X - starting point, array[N]: * it is better to set X to a feasible point * but X can be infeasible, in which case algorithm will try to find feasible point first, using X as initial approximation. DiffStep- differentiation step, >0 OUTPUT PARAMETERS: State - structure stores algorithm state NOTES: 1. algorithm uses 4-point central formula for differentiation. 2. differentiation step along I-th axis is equal to DiffStep*S[I] where S[] is scaling vector which can be set by MinNLCSetScale() call. 3. we recommend you to use moderate values of differentiation step. Too large step will result in too large TRUNCATION errors, while too small step will result in too large NUMERICAL errors. 1.0E-4 can be good value to start from. 4. Numerical differentiation is very inefficient - one gradient calculation needs 4*N function evaluations. This function will work for any N - either small (1...10), moderate (10...100) or large (100...). However, performance penalty will be too severe for any N's except for small ones. We should also say that code which relies on numerical differentiation is less robust and precise. Imprecise gradient may slow down convergence, especially on highly nonlinear problems. Thus we recommend to use this function for fast prototyping on small- dimensional problems only, and to implement analytical gradient as soon as possible. -- ALGLIB -- Copyright 06.06.2014 by Bochkanov Sergey *************************************************************************/ void minnlccreatef(ae_int_t n, /* Real */ ae_vector* x, double diffstep, minnlcstate* state, ae_state *_state) { _minnlcstate_clear(state); ae_assert(n>=1, "MinNLCCreateF: N<1", _state); ae_assert(x->cnt>=n, "MinNLCCreateF: Length(X)n; ae_assert(bndl->cnt>=n, "MinNLCSetBC: Length(BndL)cnt>=n, "MinNLCSetBC: Length(BndU)ptr.p_double[i], _state)||ae_isneginf(bndl->ptr.p_double[i], _state), "MinNLCSetBC: BndL contains NAN or +INF", _state); ae_assert(ae_isfinite(bndu->ptr.p_double[i], _state)||ae_isposinf(bndu->ptr.p_double[i], _state), "MinNLCSetBC: BndL contains NAN or -INF", _state); state->bndl.ptr.p_double[i] = bndl->ptr.p_double[i]; state->hasbndl.ptr.p_bool[i] = ae_isfinite(bndl->ptr.p_double[i], _state); state->bndu.ptr.p_double[i] = bndu->ptr.p_double[i]; state->hasbndu.ptr.p_bool[i] = ae_isfinite(bndu->ptr.p_double[i], _state); } } /************************************************************************* This function sets linear constraints for MinNLC optimizer. Linear constraints are inactive by default (after initial creation). They are preserved after algorithm restart with MinNLCRestartFrom(). You may combine linear constraints with boundary ones - and with nonlinear ones! If your problem has mixed constraints, you may explicitly specify some of them as linear. It may help optimizer to handle them more efficiently. INPUT PARAMETERS: State - structure previously allocated with MinNLCCreate call. C - linear constraints, array[K,N+1]. 
Each row of C represents one constraint, either equality or inequality (see below): * first N elements correspond to coefficients, * last element corresponds to the right part. All elements of C (including right part) must be finite. CT - type of constraints, array[K]: * if CT[i]>0, then I-th constraint is C[i,*]*x >= C[i,n+1] * if CT[i]=0, then I-th constraint is C[i,*]*x = C[i,n+1] * if CT[i]<0, then I-th constraint is C[i,*]*x <= C[i,n+1] K - number of equality/inequality constraints, K>=0: * if given, only leading K elements of C/CT are used * if not given, automatically determined from sizes of C/CT NOTE 1: when you solve your problem with augmented Lagrangian solver, linear constraints are satisfied only approximately! It is possible that algorithm will evaluate function outside of feasible area! -- ALGLIB -- Copyright 06.06.2014 by Bochkanov Sergey *************************************************************************/ void minnlcsetlc(minnlcstate* state, /* Real */ ae_matrix* c, /* Integer */ ae_vector* ct, ae_int_t k, ae_state *_state) { ae_int_t n; ae_int_t i; n = state->n; /* * First, check for errors in the inputs */ ae_assert(k>=0, "MinNLCSetLC: K<0", _state); ae_assert(c->cols>=n+1||k==0, "MinNLCSetLC: Cols(C)rows>=k, "MinNLCSetLC: Rows(C)cnt>=k, "MinNLCSetLC: Length(CT)nec = 0; state->nic = 0; return; } /* * Equality constraints are stored first, in the upper * NEC rows of State.CLEIC matrix. Inequality constraints * are stored in the next NIC rows. * * NOTE: we convert inequality constraints to the form * A*x<=b before copying them. */ rmatrixsetlengthatleast(&state->cleic, k, n+1, _state); state->nec = 0; state->nic = 0; for(i=0; i<=k-1; i++) { if( ct->ptr.p_int[i]==0 ) { ae_v_move(&state->cleic.ptr.pp_double[state->nec][0], 1, &c->ptr.pp_double[i][0], 1, ae_v_len(0,n)); state->nec = state->nec+1; } } for(i=0; i<=k-1; i++) { if( ct->ptr.p_int[i]!=0 ) { if( ct->ptr.p_int[i]>0 ) { ae_v_moveneg(&state->cleic.ptr.pp_double[state->nec+state->nic][0], 1, &c->ptr.pp_double[i][0], 1, ae_v_len(0,n)); } else { ae_v_move(&state->cleic.ptr.pp_double[state->nec+state->nic][0], 1, &c->ptr.pp_double[i][0], 1, ae_v_len(0,n)); } state->nic = state->nic+1; } } } /************************************************************************* This function sets nonlinear constraints for MinNLC optimizer. In fact, this function sets NUMBER of nonlinear constraints. Constraints itself (constraint functions) are passed to MinNLCOptimize() method. This method requires user-defined vector function F[] and its Jacobian J[], where: * first component of F[] and first row of Jacobian J[] corresponds to function being minimized * next NLEC components of F[] (and rows of J) correspond to nonlinear equality constraints G_i(x)=0 * next NLIC components of F[] (and rows of J) correspond to nonlinear inequality constraints H_i(x)<=0 NOTE: you may combine nonlinear constraints with linear/boundary ones. If your problem has mixed constraints, you may explicitly specify some of them as linear ones. It may help optimizer to handle them more efficiently. INPUT PARAMETERS: State - structure previously allocated with MinNLCCreate call. NLEC - number of Non-Linear Equality Constraints (NLEC), >=0 NLIC - number of Non-Linear Inquality Constraints (NLIC), >=0 NOTE 1: when you solve your problem with augmented Lagrangian solver, nonlinear constraints are satisfied only approximately! It is possible that algorithm will evaluate function outside of feasible area! 
NOTE 2: algorithm scales variables according to scale specified by MinNLCSetScale() function, so it can handle problems with badly scaled variables (as long as we KNOW their scales). However, there is no way to automatically scale nonlinear constraints Gi(x) and Hi(x). Inappropriate scaling of Gi/Hi may ruin convergence. Solving problem with constraint "1000*G0(x)=0" is NOT same as solving it with constraint "0.001*G0(x)=0". It means that YOU are the one who is responsible for correct scaling of nonlinear constraints Gi(x) and Hi(x). We recommend you to scale nonlinear constraints in such way that I-th component of dG/dX (or dH/dx) has approximately unit magnitude (for problems with unit scale) or has magnitude approximately equal to 1/S[i] (where S is a scale set by MinNLCSetScale() function). -- ALGLIB -- Copyright 06.06.2014 by Bochkanov Sergey *************************************************************************/ void minnlcsetnlc(minnlcstate* state, ae_int_t nlec, ae_int_t nlic, ae_state *_state) { ae_assert(nlec>=0, "MinNLCSetNLC: NLEC<0", _state); ae_assert(nlic>=0, "MinNLCSetNLC: NLIC<0", _state); state->ng = nlec; state->nh = nlic; ae_vector_set_length(&state->fi, 1+state->ng+state->nh, _state); ae_matrix_set_length(&state->j, 1+state->ng+state->nh, state->n, _state); } /************************************************************************* This function sets stopping conditions for inner iterations of optimizer. INPUT PARAMETERS: State - structure which stores algorithm state EpsG - >=0 The subroutine finishes its work if the condition |v|=0 The subroutine finishes its work if on k+1-th iteration the condition |F(k+1)-F(k)|<=EpsF*max{|F(k)|,|F(k+1)|,1} is satisfied. EpsX - >=0 The subroutine finishes its work if on k+1-th iteration the condition |v|<=EpsX is fulfilled, where: * |.| means Euclidian norm * v - scaled step vector, v[i]=dx[i]/s[i] * dx - step vector, dx=X(k+1)-X(k) * s - scaling coefficients set by MinNLCSetScale() MaxIts - maximum number of iterations. If MaxIts=0, the number of iterations is unlimited. Passing EpsG=0, EpsF=0 and EpsX=0 and MaxIts=0 (simultaneously) will lead to automatic stopping criterion selection. -- ALGLIB -- Copyright 06.06.2014 by Bochkanov Sergey *************************************************************************/ void minnlcsetcond(minnlcstate* state, double epsg, double epsf, double epsx, ae_int_t maxits, ae_state *_state) { ae_assert(ae_isfinite(epsg, _state), "MinNLCSetCond: EpsG is not finite number", _state); ae_assert(ae_fp_greater_eq(epsg,(double)(0)), "MinNLCSetCond: negative EpsG", _state); ae_assert(ae_isfinite(epsf, _state), "MinNLCSetCond: EpsF is not finite number", _state); ae_assert(ae_fp_greater_eq(epsf,(double)(0)), "MinNLCSetCond: negative EpsF", _state); ae_assert(ae_isfinite(epsx, _state), "MinNLCSetCond: EpsX is not finite number", _state); ae_assert(ae_fp_greater_eq(epsx,(double)(0)), "MinNLCSetCond: negative EpsX", _state); ae_assert(maxits>=0, "MinNLCSetCond: negative MaxIts!", _state); if( ((ae_fp_eq(epsg,(double)(0))&&ae_fp_eq(epsf,(double)(0)))&&ae_fp_eq(epsx,(double)(0)))&&maxits==0 ) { epsx = 1.0E-6; } state->epsg = epsg; state->epsf = epsf; state->epsx = epsx; state->maxits = maxits; } /************************************************************************* This function sets scaling coefficients for NLC optimizer. ALGLIB optimizers use scaling matrices to test stopping conditions (step size and gradient are scaled before comparison with tolerances). 
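The nonlinear constraint counts registered with MinNLCSetNLC() fix the layout of the vector F[] and Jacobian J[] which the user callback must fill: component 0 is the target, the next NLEC components are equality constraints, the next NLIC components are inequality constraints. A minimal sketch of a matching callback (hypothetical target and constraints, C++ wrapper layer):

    // minimize x0^2+x1^2  s.t.  G0(x) = x0+x1-1 = 0,  H0(x) = -x0 <= 0
    void nlcfunc_jac(const alglib::real_1d_array &x, alglib::real_1d_array &fi,
                     alglib::real_2d_array &jac, void *ptr)
    {
        fi[0] = x[0]*x[0]+x[1]*x[1]; jac[0][0] = 2*x[0]; jac[0][1] = 2*x[1]; // target
        fi[1] = x[0]+x[1]-1;         jac[1][0] = 1;      jac[1][1] = 1;      // G0(x)=0
        fi[2] = -x[0];               jac[2][0] = -1;     jac[2][1] = 0;      // H0(x)<=0
    }
    ...
    alglib::minnlcsetnlc(state, 1, 1);                  // NLEC=1, NLIC=1
    alglib::minnlcsetcond(state, 0.0, 0.0, 1.0E-7, 0);  // stop on small scaled step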
Scale of the I-th variable is a translation invariant measure of: a) "how large" the variable is b) how large the step should be to make significant changes in the function Scaling is also used by finite difference variant of the optimizer - step along I-th axis is equal to DiffStep*S[I]. INPUT PARAMETERS: State - structure stores algorithm state S - array[N], non-zero scaling coefficients S[i] may be negative, sign doesn't matter. -- ALGLIB -- Copyright 06.06.2014 by Bochkanov Sergey *************************************************************************/ void minnlcsetscale(minnlcstate* state, /* Real */ ae_vector* s, ae_state *_state) { ae_int_t i; ae_assert(s->cnt>=state->n, "MinNLCSetScale: Length(S)n-1; i++) { ae_assert(ae_isfinite(s->ptr.p_double[i], _state), "MinNLCSetScale: S contains infinite or NAN elements", _state); ae_assert(ae_fp_neq(s->ptr.p_double[i],(double)(0)), "MinNLCSetScale: S contains zero elements", _state); state->s.ptr.p_double[i] = ae_fabs(s->ptr.p_double[i], _state); } } /************************************************************************* This function sets preconditioner to "inexact LBFGS-based" mode. Preconditioning is very important for convergence of Augmented Lagrangian algorithm because presence of penalty term makes problem ill-conditioned. Difference between performance of preconditioned and unpreconditioned methods can be as large as 100x! MinNLC optimizer may use following preconditioners, each with its own benefits and drawbacks: a) inexact LBFGS-based, with O(N*K) evaluation time b) exact low rank one, with O(N*K^2) evaluation time c) exact robust one, with O(N^3+K*N^2) evaluation time where K is a total number of general linear and nonlinear constraints (box ones are not counted). Inexact LBFGS-based preconditioner uses L-BFGS formula combined with orthogonality assumption to perform very fast updates. For a N-dimensional problem with K general linear or nonlinear constraints (boundary ones are not counted) it has O(N*K) cost per iteration. This preconditioner has best quality (less iterations) when general linear and nonlinear constraints are orthogonal to each other (orthogonality with respect to boundary constraints is not required). Number of iterations increases when constraints are non-orthogonal, because algorithm assumes orthogonality, but still it is better than no preconditioner at all. INPUT PARAMETERS: State - structure stores algorithm state -- ALGLIB -- Copyright 26.09.2014 by Bochkanov Sergey *************************************************************************/ void minnlcsetprecinexact(minnlcstate* state, ae_state *_state) { state->updatefreq = 0; state->prectype = 1; } /************************************************************************* This function sets preconditioner to "exact low rank" mode. Preconditioning is very important for convergence of Augmented Lagrangian algorithm because presence of penalty term makes problem ill-conditioned. Difference between performance of preconditioned and unpreconditioned methods can be as large as 100x! MinNLC optimizer may use following preconditioners, each with its own benefits and drawbacks: a) inexact LBFGS-based, with O(N*K) evaluation time b) exact low rank one, with O(N*K^2) evaluation time c) exact robust one, with O(N^3+K*N^2) evaluation time where K is a total number of general linear and nonlinear constraints (box ones are not counted). It also provides special unpreconditioned mode of operation which can be used for test purposes. 
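As a concrete illustration of the scaling machinery: if the variables differ in magnitude by several orders, pass their typical magnitudes via MinNLCSetScale() so that stopping tests and finite-difference steps are performed in scaled units (sketch, illustrative magnitudes):

    // x0 is of order 1, x1 is of order 1.0E+6
    alglib::real_1d_array s = "[1,1.0e6]";
    alglib::minnlcsetscale(state, s);
    // with MinNLCCreateF(), the differentiation step along axis I becomes DiffStep*S[I]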
Comments below discuss low rank preconditioner. Exact low-rank preconditioner uses Woodbury matrix identity to build quadratic model of the penalized function. It has following features: * no special assumptions about orthogonality of constraints * preconditioner evaluation is optimized for K<=N. * finally, stability of the process is guaranteed only for K<=N due to degeneracy of intermediate matrices. That's why we recommend to use "exact robust" preconditioner for such cases. RECOMMENDATIONS We recommend to choose between "exact low rank" and "exact robust" preconditioners, with "low rank" version being chosen when you know in advance that total count of non-box constraints won't exceed N, and "robust" version being chosen when you need bulletproof solution. INPUT PARAMETERS: State - structure stores algorithm state UpdateFreq- update frequency. Preconditioner is rebuilt after every UpdateFreq iterations. Recommended value: 10 or higher. Zero value means that good default value will be used. -- ALGLIB -- Copyright 26.09.2014 by Bochkanov Sergey *************************************************************************/ void minnlcsetprecexactlowrank(minnlcstate* state, ae_int_t updatefreq, ae_state *_state) { ae_assert(updatefreq>=0, "MinNLCSetPrecExactLowRank: UpdateFreq<0", _state); if( updatefreq==0 ) { updatefreq = 10; } state->prectype = 2; state->updatefreq = updatefreq; } /************************************************************************* This function sets preconditioner to "exact robust" mode. Preconditioning is very important for convergence of Augmented Lagrangian algorithm because presence of penalty term makes problem ill-conditioned. Difference between performance of preconditioned and unpreconditioned methods can be as large as 100x! MinNLC optimizer may use following preconditioners, each with its own benefits and drawbacks: a) inexact LBFGS-based, with O(N*K) evaluation time b) exact low rank one, with O(N*K^2) evaluation time c) exact robust one, with O(N^3+K*N^2) evaluation time where K is a total number of general linear and nonlinear constraints (box ones are not counted). It also provides special unpreconditioned mode of operation which can be used for test purposes. Comments below discuss robust preconditioner. Exact robust preconditioner uses Cholesky decomposition to invert approximate Hessian matrix H=D+W'*C*W (where D stands for diagonal terms of Hessian, combined result of initial scaling matrix and penalty from box constraints; W stands for general linear constraints and linearization of nonlinear ones; C stands for diagonal matrix of penalty coefficients). This preconditioner has following features: * no special assumptions about constraint structure * preconditioner is optimized for stability; unlike "exact low rank" version which fails for K>=N, this one works well for any value of K. * the only drawback is that is takes O(N^3+K*N^2) time to build it. No economical Woodbury update is applied even when it makes sense, thus there are exist situations (K<=0, "MinNLCSetPrecExactLowRank: UpdateFreq<0", _state); if( updatefreq==0 ) { updatefreq = 10; } state->prectype = 3; state->updatefreq = updatefreq; } /************************************************************************* This function sets preconditioner to "turned off" mode. Preconditioning is very important for convergence of Augmented Lagrangian algorithm because presence of penalty term makes problem ill-conditioned. 
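Following the recommendations above, a sketch of selecting one of the two exact preconditioners through the C++ wrapper layer (the update frequency of 10 is the documented default):

    // total count of non-box constraints is known to stay below N: low rank version
    alglib::minnlcsetprecexactlowrank(state, 10);
    // otherwise prefer the robust O(N^3+K*N^2) version
    // alglib::minnlcsetprecexactrobust(state, 10);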
Difference between performance of preconditioned and unpreconditioned methods can be as large as 100x! MinNLC optimizer may utilize two preconditioners, each with its own benefits and drawbacks: a) inexact LBFGS-based, and b) exact low rank one. It also provides special unpreconditioned mode of operation which can be used for test purposes. This function activates this test mode. Do not use it in production code to solve real-life problems. INPUT PARAMETERS: State - structure stores algorithm state -- ALGLIB -- Copyright 26.09.2014 by Bochkanov Sergey *************************************************************************/ void minnlcsetprecnone(minnlcstate* state, ae_state *_state) { state->updatefreq = 0; state->prectype = 0; } /************************************************************************* This function sets maximum step length (after scaling of step vector with respect to variable scales specified by minnlcsetscale() call). INPUT PARAMETERS: State - structure which stores algorithm state StpMax - maximum step length, >=0. Set StpMax to 0.0 (default), if you don't want to limit step length. Use this subroutine when you optimize target function which contains exp() or other fast growing functions, and optimization algorithm makes too large steps which leads to overflow. This function allows us to reject steps that are too large (and therefore expose us to the possible overflow) without actually calculating function value at the x+stp*d. -- ALGLIB -- Copyright 02.04.2010 by Bochkanov Sergey *************************************************************************/ void minnlcsetstpmax(minnlcstate* state, double stpmax, ae_state *_state) { ae_assert(ae_isfinite(stpmax, _state), "MinNLCSetStpMax: StpMax is not finite!", _state); ae_assert(ae_fp_greater_eq(stpmax,(double)(0)), "MinNLCSetStpMax: StpMax<0!", _state); state->stpmax = stpmax; } /************************************************************************* This function tells MinNLC unit to use Augmented Lagrangian algorithm for nonlinearly constrained optimization. This algorithm is a slight modification of one described in "A Modified Barrier-Augmented Lagrangian Method for Constrained Minimization (1999)" by D.GOLDFARB, R.POLYAK, K. SCHEINBERG, I.YUZEFOVICH. Augmented Lagrangian algorithm works by converting problem of minimizing F(x) subject to equality/inequality constraints to unconstrained problem of the form min[ f(x) + + Rho*PENALTY_EQ(x) + SHIFT_EQ(x,Nu1) + + Rho*PENALTY_INEQ(x) + SHIFT_INEQ(x,Nu2) ] where: * Rho is a fixed penalization coefficient * PENALTY_EQ(x) is a penalty term, which is used to APPROXIMATELY enforce equality constraints * SHIFT_EQ(x) is a special "shift" term which is used to "fine-tune" equality constraints, greatly increasing precision * PENALTY_INEQ(x) is a penalty term which is used to approximately enforce inequality constraints * SHIFT_INEQ(x) is a special "shift" term which is used to "fine-tune" inequality constraints, greatly increasing precision * Nu1/Nu2 are vectors of Lagrange coefficients which are fine-tuned during outer iterations of algorithm This version of AUL algorithm uses preconditioner, which greatly accelerates convergence. Because this algorithm is similar to penalty methods, it may perform steps into infeasible area. All kinds of constraints (boundary, linear and nonlinear ones) may be violated in intermediate points - and in the solution. However, properly configured AUL method is significantly better at handling constraints than barrier and/or penalty methods. 
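A sketch of activating the AUL solver (Rho=1000 is the value suggested below for unit-scale problems; the step limit is optional and only needed for targets with fast-growing terms such as exp()):

    alglib::minnlcsetstpmax(state, 10.0);          // optional: cap scaled step length
    alglib::minnlcsetalgoaul(state, 1000.0, 5);    // Rho=1000, 5 outer iterations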
The very basic outline of algorithm is given below: 1) first outer iteration is performed with "default" values of Lagrange multipliers Nu1/Nu2. Solution quality is low (candidate point can be too far away from true solution; large violation of constraints is possible) and is comparable with that of penalty methods. 2) subsequent outer iterations refine Lagrange multipliers and improve quality of the solution. INPUT PARAMETERS: State - structure which stores algorithm state Rho - penalty coefficient, Rho>0: * large enough that algorithm converges with desired precision. Minimum value is 10*max(S'*diag(H)*S), where S is a scale matrix (set by MinNLCSetScale) and H is a Hessian of the function being minimized. If you can not easily estimate Hessian norm, see our recommendations below. * not TOO large to prevent ill-conditioning * for unit-scale problems (variables and Hessian have unit magnitude), Rho=100 or Rho=1000 can be used. * it is important to note that Rho is internally multiplied by scaling matrix, i.e. optimum value of Rho depends on scale of variables specified by MinNLCSetScale(). ItsCnt - number of outer iterations: * ItsCnt=0 means that small number of outer iterations is automatically chosen (10 iterations in current version). * ItsCnt=1 means that AUL algorithm performs just as usual barrier method. * ItsCnt>1 means that AUL algorithm performs specified number of outer iterations HOW TO CHOOSE PARAMETERS Nonlinear optimization is a tricky area and Augmented Lagrangian algorithm is sometimes hard to tune. Good values of Rho and ItsCnt are problem- specific. In order to help you we prepared following set of recommendations: * for unit-scale problems (variables and Hessian have unit magnitude), Rho=100 or Rho=1000 can be used. * start from some small value of Rho and solve problem with just one outer iteration (ItcCnt=1). In this case algorithm behaves like penalty method. Increase Rho in 2x or 10x steps until you see that one outer iteration returns point which is "rough approximation to solution". It is very important to have Rho so large that penalty term becomes constraining i.e. modified function becomes highly convex in constrained directions. From the other side, too large Rho may prevent you from converging to the solution. You can diagnose it by studying number of inner iterations performed by algorithm: too few (5-10 on 1000-dimensional problem) or too many (orders of magnitude more than dimensionality) usually means that Rho is too large. * with just one outer iteration you usually have low-quality solution. Some constraints can be violated with very large margin, while other ones (which are NOT violated in the true solution) can push final point too far in the inner area of the feasible set. For example, if you have constraint x0>=0 and true solution x0=1, then merely a presence of "x0>=0" will introduce a bias towards larger values of x0. Say, algorithm may stop at x0=1.5 instead of 1.0. * after you found good Rho, you may increase number of outer iterations. ItsCnt=10 is a good value. Subsequent outer iteration will refine values of Lagrange multipliers. Constraints which were violated will be enforced, inactive constraints will be dropped (corresponding multipliers will be decreased). Ideally, you should see 10-1000x improvement in constraint handling (constraint violation is reduced). 
* if you see that algorithm converges to vicinity of solution, but additional outer iterations do not refine solution, it may mean that algorithm is unstable - it wanders around true solution, but can not approach it. Sometimes algorithm may be stabilized by increasing Rho one more time, making it 5x or 10x larger. SCALING OF CONSTRAINTS [IMPORTANT] AUL optimizer scales variables according to scale specified by MinNLCSetScale() function, so it can handle problems with badly scaled variables (as long as we KNOW their scales). However, because function being optimized is a mix of original function and constraint-dependent penalty functions, it is important to rescale both variables AND constraints. Say, if you minimize f(x)=x^2 subject to 1000000*x>=0, then you have constraint whose scale is different from that of target function (another example is 0.000001*x>=0). It is also possible to have constraints whose scales are misaligned: 1000000*x0>=0, 0.000001*x1<=0. Inappropriate scaling may ruin convergence because minimizing x^2 subject to x>=0 is NOT same as minimizing it subject to 1000000*x>=0. Because we know coefficients of boundary/linear constraints, we can automatically rescale and normalize them. However, there is no way to automatically rescale nonlinear constraints Gi(x) and Hi(x) - they are black boxes. It means that YOU are the one who is responsible for correct scaling of nonlinear constraints Gi(x) and Hi(x). We recommend you to rescale nonlinear constraints in such way that I-th component of dG/dX (or dH/dx) has magnitude approximately equal to 1/S[i] (where S is a scale set by MinNLCSetScale() function). WHAT IF IT DOES NOT CONVERGE? It is possible that AUL algorithm fails to converge to precise values of Lagrange multipliers. It stops somewhere around true solution, but candidate point is still too far from solution, and some constraints are violated. Such kind of failure is specific for Lagrangian algorithms - technically, they stop at some point, but this point is not constrained solution. There are exist several reasons why algorithm may fail to converge: a) too loose stopping criteria for inner iteration b) degenerate, redundant constraints c) target function has unconstrained extremum exactly at the boundary of some constraint d) numerical noise in the target function In all these cases algorithm is unstable - each outer iteration results in large and almost random step which improves handling of some constraints, but violates other ones (ideally outer iterations should form a sequence of progressively decreasing steps towards solution). First reason possible is that too loose stopping criteria for inner iteration were specified. Augmented Lagrangian algorithm solves a sequence of intermediate problems, and requries each of them to be solved with high precision. Insufficient precision results in incorrect update of Lagrange multipliers. Another reason is that you may have specified degenerate constraints: say, some constraint was repeated twice. In most cases AUL algorithm gracefully handles such situations, but sometimes it may spend too much time figuring out subtle degeneracies in constraint matrix. Third reason is tricky and hard to diagnose. Consider situation when you minimize f=x^2 subject to constraint x>=0. Unconstrained extremum is located exactly at the boundary of constrained area. In this case algorithm will tend to oscillate between negative and positive x. 
Each time it stops at x<0 it "reinforces" constraint x>=0, and each time it is bounced to x>0 it "relaxes" constraint (and is attracted to x<0). Such situation sometimes happens in problems with hidden symetries. Algorithm is got caught in a loop with Lagrange multipliers being continuously increased/decreased. Luckily, such loop forms after at least three iterations, so this problem can be solved by DECREASING number of outer iterations down to 1-2 and increasing penalty coefficient Rho as much as possible. Final reason is numerical noise. AUL algorithm is robust against moderate noise (more robust than, say, active set methods), but large noise may destabilize algorithm. -- ALGLIB -- Copyright 06.06.2014 by Bochkanov Sergey *************************************************************************/ void minnlcsetalgoaul(minnlcstate* state, double rho, ae_int_t itscnt, ae_state *_state) { ae_assert(itscnt>=0, "MinNLCSetAlgoAUL: negative ItsCnt", _state); ae_assert(ae_isfinite(rho, _state), "MinNLCSetAlgoAUL: Rho is not finite", _state); ae_assert(ae_fp_greater(rho,(double)(0)), "MinNLCSetAlgoAUL: Rho<=0", _state); if( itscnt==0 ) { itscnt = 10; } state->aulitscnt = itscnt; state->rho = rho; state->solvertype = 0; } /************************************************************************* This function turns on/off reporting. INPUT PARAMETERS: State - structure which stores algorithm state NeedXRep- whether iteration reports are needed or not If NeedXRep is True, algorithm will call rep() callback function if it is provided to MinNLCOptimize(). NOTE: algorithm passes two parameters to rep() callback - current point and penalized function value at current point. Important - function value which is returned is NOT function being minimized. It is sum of the value of the function being minimized - and penalty term. -- ALGLIB -- Copyright 28.11.2010 by Bochkanov Sergey *************************************************************************/ void minnlcsetxrep(minnlcstate* state, ae_bool needxrep, ae_state *_state) { state->xrep = needxrep; } /************************************************************************* NOTES: 1. This function has two different implementations: one which uses exact (analytical) user-supplied Jacobian, and one which uses only function vector and numerically differentiates function in order to obtain gradient. Depending on the specific function used to create optimizer object you should choose appropriate variant of MinNLCOptimize() - one which accepts function AND Jacobian or one which accepts ONLY function. Be careful to choose variant of MinNLCOptimize() which corresponds to your optimization scheme! Table below lists different combinations of callback (function/gradient) passed to MinNLCOptimize() and specific function used to create optimizer. | USER PASSED TO MinNLCOptimize() CREATED WITH | function only | function and gradient ------------------------------------------------------------ MinNLCCreateF() | works FAILS MinNLCCreate() | FAILS works Here "FAILS" denotes inappropriate combinations of optimizer creation function and MinNLCOptimize() version. Attemps to use such combination will lead to exception. Either you did not pass gradient when it WAS needed or you passed gradient when it was NOT needed. 
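In code, the two rows of the table correspond to two call patterns (sketch; nlcfunc_jac and nlcfunc_fvec stand for hypothetical user callbacks with the signatures documented for minnlcoptimize(), and statef denotes a state created with minnlccreatef()):

    // optimizer created with minnlccreate(): callback must supply Fi[] AND Jacobian
    alglib::minnlcoptimize(state, nlcfunc_jac);

    // optimizer created with minnlccreatef(): callback supplies Fi[] only,
    // the Jacobian is obtained by 4-point numerical differentiation
    alglib::minnlcoptimize(statef, nlcfunc_fvec);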
-- ALGLIB -- Copyright 06.06.2014 by Bochkanov Sergey *************************************************************************/ ae_bool minnlciteration(minnlcstate* state, ae_state *_state) { ae_int_t i; ae_int_t k; ae_int_t n; ae_int_t ng; ae_int_t nh; ae_bool result; /* * Reverse communication preparations * I know it looks ugly, but it works the same way * anywhere from C++ to Python. * * This code initializes locals by: * * random values determined during code * generation - on first subroutine call * * values from previous call - on subsequent calls */ if( state->rstate.stage>=0 ) { i = state->rstate.ia.ptr.p_int[0]; k = state->rstate.ia.ptr.p_int[1]; n = state->rstate.ia.ptr.p_int[2]; ng = state->rstate.ia.ptr.p_int[3]; nh = state->rstate.ia.ptr.p_int[4]; } else { i = 359; k = -58; n = -919; ng = -909; nh = 81; } if( state->rstate.stage==0 ) { goto lbl_0; } if( state->rstate.stage==1 ) { goto lbl_1; } if( state->rstate.stage==2 ) { goto lbl_2; } if( state->rstate.stage==3 ) { goto lbl_3; } if( state->rstate.stage==4 ) { goto lbl_4; } if( state->rstate.stage==5 ) { goto lbl_5; } if( state->rstate.stage==6 ) { goto lbl_6; } if( state->rstate.stage==7 ) { goto lbl_7; } if( state->rstate.stage==8 ) { goto lbl_8; } /* * Routine body */ /* * Init */ state->repterminationtype = 0; state->repinneriterationscount = 0; state->repouteriterationscount = 0; state->repnfev = 0; state->repvaridx = 0; state->repfuncidx = 0; state->repdbgphase0its = 0; n = state->n; ng = state->ng; nh = state->nh; minnlc_clearrequestfields(state, _state); /* * Test gradient */ if( !(ae_fp_eq(state->diffstep,(double)(0))&&ae_fp_greater(state->teststep,(double)(0))) ) { goto lbl_9; } rvectorsetlengthatleast(&state->xbase, n, _state); rvectorsetlengthatleast(&state->fbase, 1+ng+nh, _state); rvectorsetlengthatleast(&state->dfbase, 1+ng+nh, _state); rvectorsetlengthatleast(&state->fm1, 1+ng+nh, _state); rvectorsetlengthatleast(&state->fp1, 1+ng+nh, _state); rvectorsetlengthatleast(&state->dfm1, 1+ng+nh, _state); rvectorsetlengthatleast(&state->dfp1, 1+ng+nh, _state); state->needfij = ae_true; ae_v_move(&state->xbase.ptr.p_double[0], 1, &state->xstart.ptr.p_double[0], 1, ae_v_len(0,n-1)); k = 0; lbl_11: if( k>n-1 ) { goto lbl_13; } ae_v_move(&state->x.ptr.p_double[0], 1, &state->xbase.ptr.p_double[0], 1, ae_v_len(0,n-1)); state->rstate.stage = 0; goto lbl_rcomm; lbl_0: ae_v_move(&state->fbase.ptr.p_double[0], 1, &state->fi.ptr.p_double[0], 1, ae_v_len(0,ng+nh)); ae_v_move(&state->dfbase.ptr.p_double[0], 1, &state->j.ptr.pp_double[0][k], state->j.stride, ae_v_len(0,ng+nh)); ae_v_move(&state->x.ptr.p_double[0], 1, &state->xbase.ptr.p_double[0], 1, ae_v_len(0,n-1)); state->x.ptr.p_double[k] = state->x.ptr.p_double[k]-state->s.ptr.p_double[k]*state->teststep; state->rstate.stage = 1; goto lbl_rcomm; lbl_1: ae_v_move(&state->fm1.ptr.p_double[0], 1, &state->fi.ptr.p_double[0], 1, ae_v_len(0,ng+nh)); ae_v_move(&state->dfm1.ptr.p_double[0], 1, &state->j.ptr.pp_double[0][k], state->j.stride, ae_v_len(0,ng+nh)); ae_v_move(&state->x.ptr.p_double[0], 1, &state->xbase.ptr.p_double[0], 1, ae_v_len(0,n-1)); state->x.ptr.p_double[k] = state->x.ptr.p_double[k]+state->s.ptr.p_double[k]*state->teststep; state->rstate.stage = 2; goto lbl_rcomm; lbl_2: ae_v_move(&state->fp1.ptr.p_double[0], 1, &state->fi.ptr.p_double[0], 1, ae_v_len(0,ng+nh)); ae_v_move(&state->dfp1.ptr.p_double[0], 1, &state->j.ptr.pp_double[0][k], state->j.stride, ae_v_len(0,ng+nh)); for(i=0; i<=ng+nh; i++) { if( !derivativecheck(state->fm1.ptr.p_double[i], 
state->dfm1.ptr.p_double[i], state->fp1.ptr.p_double[i], state->dfp1.ptr.p_double[i], state->fbase.ptr.p_double[i], state->dfbase.ptr.p_double[i], 2*state->s.ptr.p_double[k]*state->teststep, _state) ) { state->repfuncidx = i; state->repvaridx = k; state->repterminationtype = -7; result = ae_false; return result; } } k = k+1; goto lbl_11; lbl_13: state->needfij = ae_false; lbl_9: /* * AUL solver */ if( state->solvertype!=0 ) { goto lbl_14; } if( ae_fp_neq(state->diffstep,(double)(0)) ) { rvectorsetlengthatleast(&state->xbase, n, _state); rvectorsetlengthatleast(&state->fbase, 1+ng+nh, _state); rvectorsetlengthatleast(&state->fm2, 1+ng+nh, _state); rvectorsetlengthatleast(&state->fm1, 1+ng+nh, _state); rvectorsetlengthatleast(&state->fp1, 1+ng+nh, _state); rvectorsetlengthatleast(&state->fp2, 1+ng+nh, _state); } ae_vector_set_length(&state->rstateaul.ia, 8+1, _state); ae_vector_set_length(&state->rstateaul.ra, 7+1, _state); state->rstateaul.stage = -1; lbl_16: if( !minnlc_auliteration(state, _state) ) { goto lbl_17; } /* * Numerical differentiation (if needed) - intercept NeedFiJ * request and replace it by sequence of NeedFi requests */ if( !(ae_fp_neq(state->diffstep,(double)(0))&&state->needfij) ) { goto lbl_18; } state->needfij = ae_false; state->needfi = ae_true; ae_v_move(&state->xbase.ptr.p_double[0], 1, &state->x.ptr.p_double[0], 1, ae_v_len(0,n-1)); k = 0; lbl_20: if( k>n-1 ) { goto lbl_22; } ae_v_move(&state->x.ptr.p_double[0], 1, &state->xbase.ptr.p_double[0], 1, ae_v_len(0,n-1)); state->x.ptr.p_double[k] = state->x.ptr.p_double[k]-state->s.ptr.p_double[k]*state->diffstep; state->rstate.stage = 3; goto lbl_rcomm; lbl_3: ae_v_move(&state->fm2.ptr.p_double[0], 1, &state->fi.ptr.p_double[0], 1, ae_v_len(0,ng+nh)); ae_v_move(&state->x.ptr.p_double[0], 1, &state->xbase.ptr.p_double[0], 1, ae_v_len(0,n-1)); state->x.ptr.p_double[k] = state->x.ptr.p_double[k]-0.5*state->s.ptr.p_double[k]*state->diffstep; state->rstate.stage = 4; goto lbl_rcomm; lbl_4: ae_v_move(&state->fm1.ptr.p_double[0], 1, &state->fi.ptr.p_double[0], 1, ae_v_len(0,ng+nh)); ae_v_move(&state->x.ptr.p_double[0], 1, &state->xbase.ptr.p_double[0], 1, ae_v_len(0,n-1)); state->x.ptr.p_double[k] = state->x.ptr.p_double[k]+0.5*state->s.ptr.p_double[k]*state->diffstep; state->rstate.stage = 5; goto lbl_rcomm; lbl_5: ae_v_move(&state->fp1.ptr.p_double[0], 1, &state->fi.ptr.p_double[0], 1, ae_v_len(0,ng+nh)); ae_v_move(&state->x.ptr.p_double[0], 1, &state->xbase.ptr.p_double[0], 1, ae_v_len(0,n-1)); state->x.ptr.p_double[k] = state->x.ptr.p_double[k]+state->s.ptr.p_double[k]*state->diffstep; state->rstate.stage = 6; goto lbl_rcomm; lbl_6: ae_v_move(&state->fp2.ptr.p_double[0], 1, &state->fi.ptr.p_double[0], 1, ae_v_len(0,ng+nh)); for(i=0; i<=ng+nh; i++) { state->j.ptr.pp_double[i][k] = (8*(state->fp1.ptr.p_double[i]-state->fm1.ptr.p_double[i])-(state->fp2.ptr.p_double[i]-state->fm2.ptr.p_double[i]))/(6*state->diffstep*state->s.ptr.p_double[i]); } k = k+1; goto lbl_20; lbl_22: ae_v_move(&state->x.ptr.p_double[0], 1, &state->xbase.ptr.p_double[0], 1, ae_v_len(0,n-1)); state->rstate.stage = 7; goto lbl_rcomm; lbl_7: /* * Restore previous values of fields and continue */ state->needfi = ae_false; state->needfij = ae_true; goto lbl_16; lbl_18: /* * Forward request to caller */ state->rstate.stage = 8; goto lbl_rcomm; lbl_8: goto lbl_16; lbl_17: result = ae_false; return result; lbl_14: result = ae_false; return result; /* * Saving state */ lbl_rcomm: result = ae_true; state->rstate.ia.ptr.p_int[0] = i; state->rstate.ia.ptr.p_int[1] = 
k; state->rstate.ia.ptr.p_int[2] = n; state->rstate.ia.ptr.p_int[3] = ng; state->rstate.ia.ptr.p_int[4] = nh; return result; } /************************************************************************* MinNLC results INPUT PARAMETERS: State - algorithm state OUTPUT PARAMETERS: X - array[0..N-1], solution Rep - optimization report. You should check Rep.TerminationType in order to distinguish successful termination from unsuccessful one: * -8 internal integrity control detected infinite or NAN values in function/gradient. Abnormal termination signalled. * -7 gradient verification failed. See MinNLCSetGradientCheck() for more information. * 1 relative function improvement is no more than EpsF. * 2 scaled step is no more than EpsX. * 4 scaled gradient norm is no more than EpsG. * 5 MaxIts steps was taken More information about fields of this structure can be found in the comments on MinNLCReport datatype. -- ALGLIB -- Copyright 06.06.2014 by Bochkanov Sergey *************************************************************************/ void minnlcresults(minnlcstate* state, /* Real */ ae_vector* x, minnlcreport* rep, ae_state *_state) { ae_vector_clear(x); _minnlcreport_clear(rep); minnlcresultsbuf(state, x, rep, _state); } /************************************************************************* NLC results Buffered implementation of MinNLCResults() which uses pre-allocated buffer to store X[]. If buffer size is too small, it resizes buffer. It is intended to be used in the inner cycles of performance critical algorithms where array reallocation penalty is too large to be ignored. -- ALGLIB -- Copyright 28.11.2010 by Bochkanov Sergey *************************************************************************/ void minnlcresultsbuf(minnlcstate* state, /* Real */ ae_vector* x, minnlcreport* rep, ae_state *_state) { ae_int_t i; if( x->cntn ) { ae_vector_set_length(x, state->n, _state); } rep->iterationscount = state->repinneriterationscount; rep->nfev = state->repnfev; rep->varidx = state->repvaridx; rep->funcidx = state->repfuncidx; rep->terminationtype = state->repterminationtype; rep->dbgphase0its = state->repdbgphase0its; if( state->repterminationtype>0 ) { ae_v_move(&x->ptr.p_double[0], 1, &state->xc.ptr.p_double[0], 1, ae_v_len(0,state->n-1)); } else { for(i=0; i<=state->n-1; i++) { x->ptr.p_double[i] = _state->v_nan; } } } /************************************************************************* This subroutine restarts algorithm from new point. All optimization parameters (including constraints) are left unchanged. This function allows to solve multiple optimization problems (which must have same number of dimensions) without object reallocation penalty. INPUT PARAMETERS: State - structure previously allocated with MinNLCCreate call. X - new starting point. 
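For completeness, a sketch of retrieving the solution and interpreting the completion codes listed above (C++ wrapper layer):

    alglib::real_1d_array xsol;
    alglib::minnlcreport rep;
    alglib::minnlcresults(state, xsol, rep);
    if( rep.terminationtype>0 )
    {
        // 1/2/4 - EpsF/EpsX/EpsG-based stopping, 5 - MaxIts reached; xsol holds result
    }
    else if( rep.terminationtype==-7 )
    {
        // gradient verification failed: rep.funcidx/rep.varidx point at the offending
        // function component and variable (see MinNLCSetGradientCheck() below)
    }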
-- ALGLIB -- Copyright 28.11.2010 by Bochkanov Sergey *************************************************************************/ void minnlcrestartfrom(minnlcstate* state, /* Real */ ae_vector* x, ae_state *_state) { ae_int_t n; n = state->n; /* * First, check for errors in the inputs */ ae_assert(x->cnt>=n, "MinNLCRestartFrom: Length(X)xstart.ptr.p_double[0], 1, &x->ptr.p_double[0], 1, ae_v_len(0,n-1)); /* * prepare RComm facilities */ ae_vector_set_length(&state->rstate.ia, 4+1, _state); state->rstate.stage = -1; minnlc_clearrequestfields(state, _state); } /************************************************************************* This subroutine turns on verification of the user-supplied analytic gradient: * user calls this subroutine before optimization begins * MinNLCOptimize() is called * prior to actual optimization, for each component of parameters being optimized X[i] algorithm performs following steps: * two trial steps are made to X[i]-TestStep*S[i] and X[i]+TestStep*S[i], where X[i] is i-th component of the initial point and S[i] is a scale of i-th parameter * F(X) is evaluated at these trial points * we perform one more evaluation in the middle point of the interval * we build cubic model using function values and derivatives at trial points and we compare its prediction with actual value in the middle point * in case difference between prediction and actual value is higher than some predetermined threshold, algorithm stops with completion code -7; Rep.VarIdx is set to index of the parameter with incorrect derivative, and Rep.FuncIdx is set to index of the function. * after verification is over, algorithm proceeds to the actual optimization. NOTE 1: verification needs N (parameters count) gradient evaluations. It is very costly and you should use it only for low dimensional problems, when you want to be sure that you've correctly calculated analytic derivatives. You should not use it in the production code (unless you want to check derivatives provided by some third party). NOTE 2: you should carefully choose TestStep. Value which is too large (so large that function behaviour is significantly non-cubic) will lead to false alarms. You may use different step for different parameters by means of setting scale with MinNLCSetScale(). NOTE 3: this function may lead to false positives. In case it reports that I-th derivative was calculated incorrectly, you may decrease test step and try one more time - maybe your function changes too sharply and your step is too large for such rapidly chanding function. INPUT PARAMETERS: State - structure used to store algorithm state TestStep - verification step: * TestStep=0 turns verification off * TestStep>0 activates verification -- ALGLIB -- Copyright 15.06.2014 by Bochkanov Sergey *************************************************************************/ void minnlcsetgradientcheck(minnlcstate* state, double teststep, ae_state *_state) { ae_assert(ae_isfinite(teststep, _state), "MinNLCSetGradientCheck: TestStep contains NaN or Infinite", _state); ae_assert(ae_fp_greater_eq(teststep,(double)(0)), "MinNLCSetGradientCheck: invalid argument TestStep(TestStep<0)", _state); state->teststep = teststep; } /************************************************************************* Penalty function for equality constraints. INPUT PARAMETERS: Alpha - function argument. Penalty function becomes large when Alpha approaches -1 or +1. It is defined for Alpha<=-1 or Alpha>=+1 - in this case infinite value is returned. 
OUTPUT PARAMETERS: F - depending on Alpha: * for Alpha in (-1+eps,+1-eps), F=F(Alpha) * for Alpha outside of interval, F is some very large number DF - depending on Alpha: * for Alpha in (-1+eps,+1-eps), DF=dF(Alpha)/dAlpha, exact numerical derivative. * otherwise, it is zero D2F - second derivative -- ALGLIB -- Copyright 06.06.2014 by Bochkanov Sergey *************************************************************************/ void minnlcequalitypenaltyfunction(double alpha, double* f, double* df, double* d2f, ae_state *_state) { *f = 0; *df = 0; *d2f = 0; *f = 0.5*alpha*alpha; *df = alpha; *d2f = 1.0; } /************************************************************************* "Penalty" function for inequality constraints, which is multiplied by penalty coefficient Rho. "Penalty" function plays only supplementary role - it helps to stabilize algorithm when solving non-convex problems. Because it is multiplied by fixed and large Rho - not Lagrange multiplier Nu which may become arbitrarily small! - it enforces convexity of the problem behind the boundary of the feasible area. This function is zero at the feasible area and in the close neighborhood, it becomes non-zero only at some distance (scaling is essential!) and grows quadratically. Penalty function must enter augmented Lagrangian as Rho*PENALTY(x-lowerbound) with corresponding changes being made for upper bound or other kinds of constraints. INPUT PARAMETERS: Alpha - function argument. Typically, if we have active constraint with precise Lagrange multiplier, we have Alpha around 1. Large positive Alpha's correspond to inner area of the feasible set. Alpha<1 corresponds to outer area of the feasible set. StabilizingPoint- point where F becomes non-zero. Must be negative value, at least -1, large values (hundreds) are possible. OUTPUT PARAMETERS: F - F(Alpha) DF - DF=dF(Alpha)/dAlpha, exact derivative D2F - second derivative NOTE: it is improtant to have significantly non-zero StabilizingPoint, because when it is large, shift term does not interfere with Lagrange multipliers converging to their final values. Thus, convergence of such modified AUL algorithm is still guaranteed by same set of theorems. -- ALGLIB -- Copyright 06.06.2014 by Bochkanov Sergey *************************************************************************/ void minnlcinequalitypenaltyfunction(double alpha, double stabilizingpoint, double* f, double* df, double* d2f, ae_state *_state) { *f = 0; *df = 0; *d2f = 0; if( ae_fp_greater_eq(alpha,stabilizingpoint) ) { *f = 0.0; *df = 0.0; *d2f = 0.0; } else { alpha = alpha-stabilizingpoint; *f = 0.5*alpha*alpha; *df = alpha; *d2f = 1.0; } } /************************************************************************* "Shift" function for inequality constraints, which is multiplied by corresponding Lagrange multiplier. "Shift" function is a main factor which enforces inequality constraints. Inequality penalty function plays only supplementary role - it prevents accidental step deep into infeasible area when working with non-convex problems (read comments on corresponding function for more information). Shift function must enter augmented Lagrangian as Nu/Rho*SHIFT((x-lowerbound)*Rho+1) with corresponding changes being made for upper bound or other kinds of constraints. INPUT PARAMETERS: Alpha - function argument. Typically, if we have active constraint with precise Lagrange multiplier, we have Alpha around 1. Large positive Alpha's correspond to inner area of the feasible set. 
Alpha<1 corresponds to outer area of the feasible set. OUTPUT PARAMETERS: F - F(Alpha) DF - DF=dF(Alpha)/dAlpha, exact derivative D2F - second derivative -- ALGLIB -- Copyright 06.06.2014 by Bochkanov Sergey *************************************************************************/ void minnlcinequalityshiftfunction(double alpha, double* f, double* df, double* d2f, ae_state *_state) { *f = 0; *df = 0; *d2f = 0; if( ae_fp_greater_eq(alpha,0.5) ) { *f = -ae_log(alpha, _state); *df = -1/alpha; *d2f = 1/(alpha*alpha); } else { *f = 2*alpha*alpha-4*alpha+(ae_log((double)(2), _state)+1.5); *df = 4*alpha-4; *d2f = (double)(4); } } /************************************************************************* Clears request fileds (to be sure that we don't forget to clear something) *************************************************************************/ static void minnlc_clearrequestfields(minnlcstate* state, ae_state *_state) { state->needfi = ae_false; state->needfij = ae_false; state->xupdated = ae_false; } /************************************************************************* Internal initialization subroutine. Sets default NLC solver with default criteria. *************************************************************************/ static void minnlc_minnlcinitinternal(ae_int_t n, /* Real */ ae_vector* x, double diffstep, minnlcstate* state, ae_state *_state) { ae_frame _frame_block; ae_int_t i; ae_matrix c; ae_vector ct; ae_frame_make(_state, &_frame_block); ae_matrix_init(&c, 0, 0, DT_REAL, _state); ae_vector_init(&ct, 0, DT_INT, _state); /* * Default params */ state->stabilizingpoint = -100.0; state->initialinequalitymultiplier = 1.0; /* * Initialize other params */ state->teststep = (double)(0); state->n = n; state->diffstep = diffstep; ae_vector_set_length(&state->bndl, n, _state); ae_vector_set_length(&state->hasbndl, n, _state); ae_vector_set_length(&state->bndu, n, _state); ae_vector_set_length(&state->hasbndu, n, _state); ae_vector_set_length(&state->s, n, _state); ae_vector_set_length(&state->xstart, n, _state); ae_vector_set_length(&state->xc, n, _state); ae_vector_set_length(&state->x, n, _state); for(i=0; i<=n-1; i++) { state->bndl.ptr.p_double[i] = _state->v_neginf; state->hasbndl.ptr.p_bool[i] = ae_false; state->bndu.ptr.p_double[i] = _state->v_posinf; state->hasbndu.ptr.p_bool[i] = ae_false; state->s.ptr.p_double[i] = 1.0; state->xstart.ptr.p_double[i] = x->ptr.p_double[i]; state->xc.ptr.p_double[i] = x->ptr.p_double[i]; } minnlcsetlc(state, &c, &ct, 0, _state); minnlcsetnlc(state, 0, 0, _state); minnlcsetcond(state, 0.0, 0.0, 0.0, 0, _state); minnlcsetxrep(state, ae_false, _state); minnlcsetalgoaul(state, 1.0E-3, 0, _state); minnlcsetprecexactrobust(state, 0, _state); minnlcsetstpmax(state, 0.0, _state); minlbfgscreate(n, ae_minint(minnlc_lbfgsfactor, n, _state), x, &state->auloptimizer, _state); minnlcrestartfrom(state, x, _state); ae_frame_leave(_state); } /************************************************************************* This function clears preconditioner for L-BFGS optimizer (sets it do default state); Parameters: AULOptimizer - optimizer to tune -- ALGLIB -- Copyright 06.06.2014 by Bochkanov Sergey *************************************************************************/ static void minnlc_clearpreconditioner(minlbfgsstate* auloptimizer, ae_state *_state) { minlbfgssetprecdefault(auloptimizer, _state); } /************************************************************************* This function updates preconditioner for L-BFGS optimizer. 
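For reference, the shift function implemented a few lines above can be restated as a logarithmic barrier glued to a quadratic extension below Alpha=1/2:

    S(\alpha) =
    \begin{cases}
      -\ln\alpha,                            & \alpha \ge \tfrac12, \\
      2\alpha^{2}-4\alpha+\ln 2+\tfrac32,    & \alpha <  \tfrac12.
    \end{cases}

Both branches give S(1/2)=ln 2, S'(1/2)=-2 and S''(1/2)=4, so the function remains twice continuously differentiable at the switch point.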
Parameters: PrecType - preconditioner type: * 0 for unpreconditioned iterations * 1 for inexact LBFGS * 2 for exact low rank preconditioner update after each UpdateFreq its * 3 for exact robust preconditioner update after each UpdateFreq its UpdateFreq - update frequency PrecCounter - iterations counter, must be zero on the first call, automatically increased by this function. This counter is used to implement "update-once-in-X-iterations" scheme. AULOptimizer - optimizer to tune X - current point Rho - penalty term GammaK - current estimate of Hessian norm (used for initialization of preconditioner). Can be zero, in which case Hessian is assumed to be unit. -- ALGLIB -- Copyright 06.06.2014 by Bochkanov Sergey *************************************************************************/ static void minnlc_updatepreconditioner(ae_int_t prectype, ae_int_t updatefreq, ae_int_t* preccounter, minlbfgsstate* auloptimizer, /* Real */ ae_vector* x, double rho, double gammak, /* Real */ ae_vector* bndl, /* Boolean */ ae_vector* hasbndl, /* Real */ ae_vector* bndu, /* Boolean */ ae_vector* hasbndu, /* Real */ ae_vector* nubc, /* Real */ ae_matrix* cleic, /* Real */ ae_vector* nulc, /* Real */ ae_vector* fi, /* Real */ ae_matrix* jac, /* Real */ ae_vector* nunlc, /* Real */ ae_vector* bufd, /* Real */ ae_vector* bufc, /* Real */ ae_matrix* bufw, /* Real */ ae_matrix* bufz, /* Real */ ae_vector* tmp0, ae_int_t n, ae_int_t nec, ae_int_t nic, ae_int_t ng, ae_int_t nh, ae_state *_state) { ae_int_t i; ae_int_t j; ae_int_t k; double v; double p; double dp; double d2p; ae_bool bflag; ae_assert(ae_fp_greater(rho,(double)(0)), "MinNLC: integrity check failed", _state); rvectorsetlengthatleast(bufd, n, _state); rvectorsetlengthatleast(bufc, nec+nic+ng+nh, _state); rmatrixsetlengthatleast(bufw, nec+nic+ng+nh, n, _state); rvectorsetlengthatleast(tmp0, n, _state); /* * Preconditioner before update from barrier/penalty functions */ if( ae_fp_eq(gammak,(double)(0)) ) { gammak = (double)(1); } for(i=0; i<=n-1; i++) { bufd->ptr.p_double[i] = gammak; } /* * Update diagonal Hessian using nonlinearity from boundary constraints: * * penalty term from equality constraints * * shift term from inequality constraints * * NOTE: penalty term for inequality constraints is ignored because it * is large only in exceptional cases. 
*/ for(i=0; i<=n-1; i++) { if( (hasbndl->ptr.p_bool[i]&&hasbndu->ptr.p_bool[i])&&ae_fp_eq(bndl->ptr.p_double[i],bndu->ptr.p_double[i]) ) { minnlcequalitypenaltyfunction((x->ptr.p_double[i]-bndl->ptr.p_double[i])*rho, &p, &dp, &d2p, _state); bufd->ptr.p_double[i] = bufd->ptr.p_double[i]+d2p*rho; continue; } if( hasbndl->ptr.p_bool[i] ) { minnlcinequalityshiftfunction((x->ptr.p_double[i]-bndl->ptr.p_double[i])*rho+1, &p, &dp, &d2p, _state); bufd->ptr.p_double[i] = bufd->ptr.p_double[i]+nubc->ptr.p_double[2*i+0]*d2p*rho; } if( hasbndu->ptr.p_bool[i] ) { minnlcinequalityshiftfunction((bndu->ptr.p_double[i]-x->ptr.p_double[i])*rho+1, &p, &dp, &d2p, _state); bufd->ptr.p_double[i] = bufd->ptr.p_double[i]+nubc->ptr.p_double[2*i+1]*d2p*rho; } } /* * Process linear constraints */ for(i=0; i<=nec+nic-1; i++) { ae_v_move(&bufw->ptr.pp_double[i][0], 1, &cleic->ptr.pp_double[i][0], 1, ae_v_len(0,n-1)); v = ae_v_dotproduct(&cleic->ptr.pp_double[i][0], 1, &x->ptr.p_double[0], 1, ae_v_len(0,n-1)); v = v-cleic->ptr.pp_double[i][n]; if( iptr.p_double[i] = d2p*rho; } else { /* * Inequality constraint */ minnlcinequalityshiftfunction(-v*rho+1, &p, &dp, &d2p, _state); bufc->ptr.p_double[i] = nulc->ptr.p_double[i]*d2p*rho; } } /* * Process nonlinear constraints */ for(i=0; i<=ng+nh-1; i++) { ae_v_move(&bufw->ptr.pp_double[nec+nic+i][0], 1, &jac->ptr.pp_double[1+i][0], 1, ae_v_len(0,n-1)); v = fi->ptr.p_double[1+i]; if( iptr.p_double[nec+nic+i] = d2p*rho; } else { /* * Inequality constraint */ minnlcinequalityshiftfunction(-v*rho+1, &p, &dp, &d2p, _state); bufc->ptr.p_double[nec+nic+i] = nunlc->ptr.p_double[i]*d2p*rho; } } /* * Add regularizer (large Rho often result in nearly-degenerate matrices; * sometimes Cholesky decomposition fails without regularization). * * We use RegPrec*diag(W'*W) as preconditioner. 
*/ k = nec+nic+ng+nh; for(j=0; j<=n-1; j++) { tmp0->ptr.p_double[j] = 0.0; } for(i=0; i<=k-1; i++) { v = bufc->ptr.p_double[i]; for(j=0; j<=n-1; j++) { tmp0->ptr.p_double[j] = tmp0->ptr.p_double[j]+v*bufw->ptr.pp_double[i][j]*bufw->ptr.pp_double[i][j]; } } for(j=0; j<=n-1; j++) { bufd->ptr.p_double[j] = bufd->ptr.p_double[j]+minnlc_regprec*tmp0->ptr.p_double[j]; } /* * Apply preconditioner */ if( prectype==1 ) { minlbfgssetprecrankklbfgsfast(auloptimizer, bufd, bufc, bufw, nec+nic+ng+nh, _state); } if( prectype==2&&*preccounter%updatefreq==0 ) { minlbfgssetpreclowrankexact(auloptimizer, bufd, bufc, bufw, nec+nic+ng+nh, _state); } if( prectype==3&&*preccounter%updatefreq==0 ) { /* * Generate full NxN dense Hessian */ rmatrixsetlengthatleast(bufz, n, n, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { bufz->ptr.pp_double[i][j] = (double)(0); } bufz->ptr.pp_double[i][i] = bufd->ptr.p_double[i]; } if( nec+nic+ng+nh>0 ) { for(i=0; i<=nec+nic+ng+nh-1; i++) { ae_assert(ae_fp_greater_eq(bufc->ptr.p_double[i],(double)(0)), "MinNLC: updatepreconditioner() integrity failure", _state); v = ae_sqrt(bufc->ptr.p_double[i], _state); for(j=0; j<=n-1; j++) { bufw->ptr.pp_double[i][j] = bufw->ptr.pp_double[i][j]*v; } } rmatrixsyrk(n, nec+nic+ng+nh, 1.0, bufw, 0, 0, 2, 1.0, bufz, 0, 0, ae_true, _state); } /* * Evaluate Cholesky decomposition, set preconditioner */ bflag = spdmatrixcholeskyrec(bufz, 0, n, ae_true, bufd, _state); ae_assert(bflag, "MinNLC: updatepreconditioner() failure, Cholesky failed", _state); minlbfgssetpreccholesky(auloptimizer, bufz, ae_true, _state); } inc(preccounter, _state); } /************************************************************************* This subroutine adds penalty from boundary constraints to target function and its gradient. Penalty function is one which is used for main AUL cycle - with Lagrange multipliers and infinite at the barrier and beyond. 
Parameters: X[] - current point BndL[], BndU[] - boundary constraints HasBndL[], HasBndU[] - I-th element is True if corresponding constraint is present NuBC[] - Lagrange multipliers corresponding to constraints Rho - penalty term StabilizingPoint - branch point for inequality stabilizing term F - function value to modify G - gradient to modify -- ALGLIB -- Copyright 06.06.2014 by Bochkanov Sergey *************************************************************************/ static void minnlc_penaltybc(/* Real */ ae_vector* x, /* Real */ ae_vector* bndl, /* Boolean */ ae_vector* hasbndl, /* Real */ ae_vector* bndu, /* Boolean */ ae_vector* hasbndu, /* Real */ ae_vector* nubc, ae_int_t n, double rho, double stabilizingpoint, double* f, /* Real */ ae_vector* g, ae_state *_state) { ae_int_t i; double p; double dp; double d2p; for(i=0; i<=n-1; i++) { if( (hasbndl->ptr.p_bool[i]&&hasbndu->ptr.p_bool[i])&&ae_fp_eq(bndl->ptr.p_double[i],bndu->ptr.p_double[i]) ) { /* * I-th boundary constraint is of equality-type */ minnlcequalitypenaltyfunction((x->ptr.p_double[i]-bndl->ptr.p_double[i])*rho, &p, &dp, &d2p, _state); *f = *f+p/rho-nubc->ptr.p_double[2*i+0]*(x->ptr.p_double[i]-bndl->ptr.p_double[i]); g->ptr.p_double[i] = g->ptr.p_double[i]+dp-nubc->ptr.p_double[2*i+0]; continue; } if( hasbndl->ptr.p_bool[i] ) { /* * Handle lower bound */ minnlcinequalitypenaltyfunction(x->ptr.p_double[i]-bndl->ptr.p_double[i], stabilizingpoint, &p, &dp, &d2p, _state); *f = *f+rho*p; g->ptr.p_double[i] = g->ptr.p_double[i]+rho*dp; minnlcinequalityshiftfunction((x->ptr.p_double[i]-bndl->ptr.p_double[i])*rho+1, &p, &dp, &d2p, _state); *f = *f+p/rho*nubc->ptr.p_double[2*i+0]; g->ptr.p_double[i] = g->ptr.p_double[i]+dp*nubc->ptr.p_double[2*i+0]; } if( hasbndu->ptr.p_bool[i] ) { /* * Handle upper bound */ minnlcinequalitypenaltyfunction(bndu->ptr.p_double[i]-x->ptr.p_double[i], stabilizingpoint, &p, &dp, &d2p, _state); *f = *f+rho*p; g->ptr.p_double[i] = g->ptr.p_double[i]-rho*dp; minnlcinequalityshiftfunction((bndu->ptr.p_double[i]-x->ptr.p_double[i])*rho+1, &p, &dp, &d2p, _state); *f = *f+p/rho*nubc->ptr.p_double[2*i+1]; g->ptr.p_double[i] = g->ptr.p_double[i]-dp*nubc->ptr.p_double[2*i+1]; } } } /************************************************************************* This subroutine adds penalty from linear constraints to target function and its gradient. Penalty function is one which is used for main AUL cycle - with Lagrange multipliers and infinite at the barrier and beyond. Parameters: X[] - current point CLEIC[] - constraints matrix, first NEC rows are equality ones, next NIC rows are inequality ones. array[NEC+NIC,N+1] NuLC[] - Lagrange multipliers corresponding to constraints, array[NEC+NIC] N - dimensionalty NEC - number of equality constraints NIC - number of inequality constraints. 
Rho - penalty term StabilizingPoint - branch point for inequality stabilizing term F - function value to modify G - gradient to modify -- ALGLIB -- Copyright 06.06.2014 by Bochkanov Sergey *************************************************************************/ static void minnlc_penaltylc(/* Real */ ae_vector* x, /* Real */ ae_matrix* cleic, /* Real */ ae_vector* nulc, ae_int_t n, ae_int_t nec, ae_int_t nic, double rho, double stabilizingpoint, double* f, /* Real */ ae_vector* g, ae_state *_state) { ae_int_t i; double v; double p; double dp; double d2p; double fupd; double gupd; for(i=0; i<=nec+nic-1; i++) { v = ae_v_dotproduct(&cleic->ptr.pp_double[i][0], 1, &x->ptr.p_double[0], 1, ae_v_len(0,n-1)); v = v-cleic->ptr.pp_double[i][n]; fupd = (double)(0); gupd = (double)(0); if( iptr.p_double[i]*v; gupd = gupd-nulc->ptr.p_double[i]; } else { /* * Inequality constraint */ minnlcinequalitypenaltyfunction(-v, stabilizingpoint, &p, &dp, &d2p, _state); fupd = fupd+p*rho; gupd = gupd-dp*rho; minnlcinequalityshiftfunction(-v*rho+1, &p, &dp, &d2p, _state); fupd = fupd+p/rho*nulc->ptr.p_double[i]; gupd = gupd-dp*nulc->ptr.p_double[i]; } *f = *f+fupd; ae_v_addd(&g->ptr.p_double[0], 1, &cleic->ptr.pp_double[i][0], 1, ae_v_len(0,n-1), gupd); } } /************************************************************************* This subroutine adds penalty from nonlinear constraints to target function and its gradient. Penalty function is one which is used for main AUL cycle - with Lagrange multipliers and infinite at the barrier and beyond. Parameters: Fi[] - function vector: * 1 component for function being minimized * NG components for equality constraints G_i(x)=0 * NH components for inequality constraints H_i(x)<=0 J[] - Jacobian matrix, array[1+NG+NH,N] NuNLC[] - Lagrange multipliers corresponding to constraints, array[NG+NH] N - number of dimensions NG - number of equality constraints NH - number of inequality constraints Rho - penalty term StabilizingPoint - branch point for inequality stabilizing term F - function value to modify G - gradient to modify -- ALGLIB -- Copyright 06.06.2014 by Bochkanov Sergey *************************************************************************/ static void minnlc_penaltynlc(/* Real */ ae_vector* fi, /* Real */ ae_matrix* j, /* Real */ ae_vector* nunlc, ae_int_t n, ae_int_t ng, ae_int_t nh, double rho, double stabilizingpoint, double* f, /* Real */ ae_vector* g, ae_state *_state) { ae_int_t i; double v; double p; double dp; double d2p; double fupd; double gupd; /* * IMPORTANT: loop starts from 1, not zero! */ for(i=1; i<=ng+nh; i++) { v = fi->ptr.p_double[i]; fupd = (double)(0); gupd = (double)(0); if( i<=ng ) { /* * Equality constraint */ minnlcequalitypenaltyfunction(v*rho, &p, &dp, &d2p, _state); fupd = fupd+p/rho; gupd = gupd+dp; fupd = fupd-nunlc->ptr.p_double[i-1]*v; gupd = gupd-nunlc->ptr.p_double[i-1]; } else { /* * Inequality constraint */ minnlcinequalitypenaltyfunction(-v, stabilizingpoint, &p, &dp, &d2p, _state); fupd = fupd+p*rho; gupd = gupd-dp*rho; minnlcinequalityshiftfunction(-v*rho+1, &p, &dp, &d2p, _state); fupd = fupd+p/rho*nunlc->ptr.p_double[i-1]; gupd = gupd-dp*nunlc->ptr.p_double[i-1]; } *f = *f+fupd; ae_v_addd(&g->ptr.p_double[0], 1, &j->ptr.pp_double[i][0], 1, ae_v_len(0,n-1), gupd); } } /************************************************************************* This function performs actual processing for AUL algorith. 
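Read together, the three helpers minnlc_penaltybc(), minnlc_penaltylc() and minnlc_penaltynlc() assemble the augmented Lagrangian one constraint at a time. A restatement of the nonlinear-constraint terms added by the loop above (P is the quadratic equality penalty, Q the one-sided inequality penalty, S the shift function, \nu_i the Lagrange multiplier, \rho the penalty coefficient):

    \frac{P(\rho\,G_i(x))}{\rho} - \nu_i\,G_i(x)
        \quad\text{for an equality constraint } G_i(x)=0,

    \rho\,Q(-H_i(x)) + \frac{\nu_i}{\rho}\,S\bigl(1-\rho\,H_i(x)\bigr)
        \quad\text{for an inequality constraint } H_i(x)\le 0.

Boundary and general linear constraints contribute terms of exactly the same form, with the corresponding constraint residual taking the place of G_i/H_i.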
It expects that caller redirects its reverse communication requests NeedFiJ/XUpdated to external user who will provide analytic derivative (or handle reports about progress). In case external user does not have analytic derivative, it is responsibility of caller to intercept NeedFiJ request and replace it with appropriate numerical differentiation scheme. -- ALGLIB -- Copyright 06.06.2014 by Bochkanov Sergey *************************************************************************/ static ae_bool minnlc_auliteration(minnlcstate* state, ae_state *_state) { ae_int_t n; ae_int_t nec; ae_int_t nic; ae_int_t ng; ae_int_t nh; ae_int_t i; ae_int_t j; ae_int_t outerit; ae_int_t preccounter; double v; double vv; double p; double dp; double d2p; double v0; double v1; double v2; ae_bool result; /* * Reverse communication preparations * I know it looks ugly, but it works the same way * anywhere from C++ to Python. * * This code initializes locals by: * * random values determined during code * generation - on first subroutine call * * values from previous call - on subsequent calls */ if( state->rstateaul.stage>=0 ) { n = state->rstateaul.ia.ptr.p_int[0]; nec = state->rstateaul.ia.ptr.p_int[1]; nic = state->rstateaul.ia.ptr.p_int[2]; ng = state->rstateaul.ia.ptr.p_int[3]; nh = state->rstateaul.ia.ptr.p_int[4]; i = state->rstateaul.ia.ptr.p_int[5]; j = state->rstateaul.ia.ptr.p_int[6]; outerit = state->rstateaul.ia.ptr.p_int[7]; preccounter = state->rstateaul.ia.ptr.p_int[8]; v = state->rstateaul.ra.ptr.p_double[0]; vv = state->rstateaul.ra.ptr.p_double[1]; p = state->rstateaul.ra.ptr.p_double[2]; dp = state->rstateaul.ra.ptr.p_double[3]; d2p = state->rstateaul.ra.ptr.p_double[4]; v0 = state->rstateaul.ra.ptr.p_double[5]; v1 = state->rstateaul.ra.ptr.p_double[6]; v2 = state->rstateaul.ra.ptr.p_double[7]; } else { n = 255; nec = 74; nic = -788; ng = 809; nh = 205; i = -838; j = 939; outerit = -526; preccounter = 763; v = -541; vv = -698; p = -900; dp = -318; d2p = -940; v0 = 1016; v1 = -229; v2 = -536; } if( state->rstateaul.stage==0 ) { goto lbl_0; } if( state->rstateaul.stage==1 ) { goto lbl_1; } if( state->rstateaul.stage==2 ) { goto lbl_2; } /* * Routine body */ ae_assert(state->solvertype==0, "MinNLC: internal error", _state); n = state->n; nec = state->nec; nic = state->nic; ng = state->ng; nh = state->nh; /* * Prepare scaled problem */ rvectorsetlengthatleast(&state->scaledbndl, n, _state); rvectorsetlengthatleast(&state->scaledbndu, n, _state); rmatrixsetlengthatleast(&state->scaledcleic, nec+nic, n+1, _state); for(i=0; i<=n-1; i++) { if( state->hasbndl.ptr.p_bool[i] ) { state->scaledbndl.ptr.p_double[i] = state->bndl.ptr.p_double[i]/state->s.ptr.p_double[i]; } if( state->hasbndu.ptr.p_bool[i] ) { state->scaledbndu.ptr.p_double[i] = state->bndu.ptr.p_double[i]/state->s.ptr.p_double[i]; } state->xc.ptr.p_double[i] = state->xstart.ptr.p_double[i]/state->s.ptr.p_double[i]; } for(i=0; i<=nec+nic-1; i++) { /* * Scale and normalize linear constraints */ vv = 0.0; for(j=0; j<=n-1; j++) { v = state->cleic.ptr.pp_double[i][j]*state->s.ptr.p_double[j]; state->scaledcleic.ptr.pp_double[i][j] = v; vv = vv+v*v; } vv = ae_sqrt(vv, _state); state->scaledcleic.ptr.pp_double[i][n] = state->cleic.ptr.pp_double[i][n]; if( ae_fp_greater(vv,(double)(0)) ) { for(j=0; j<=n; j++) { state->scaledcleic.ptr.pp_double[i][j] = state->scaledcleic.ptr.pp_double[i][j]/vv; } } } /* * Prepare stopping criteria */ minlbfgssetcond(&state->auloptimizer, state->epsg, state->epsf, state->epsx, state->maxits, _state); 
minlbfgssetstpmax(&state->auloptimizer, state->stpmax, _state); /* * Main AUL cycle: * * prepare Lagrange multipliers NuNB/NuLC * * set GammaK (current estimate of Hessian norm) to InitGamma and XKPresent to False */ rvectorsetlengthatleast(&state->nubc, 2*n, _state); rvectorsetlengthatleast(&state->nulc, nec+nic, _state); rvectorsetlengthatleast(&state->nunlc, ng+nh, _state); rvectorsetlengthatleast(&state->xk, n, _state); rvectorsetlengthatleast(&state->gk, n, _state); rvectorsetlengthatleast(&state->xk1, n, _state); rvectorsetlengthatleast(&state->gk1, n, _state); for(i=0; i<=n-1; i++) { state->nubc.ptr.p_double[2*i+0] = 0.0; state->nubc.ptr.p_double[2*i+1] = 0.0; if( (state->hasbndl.ptr.p_bool[i]&&state->hasbndu.ptr.p_bool[i])&&ae_fp_eq(state->bndl.ptr.p_double[i],state->bndu.ptr.p_double[i]) ) { continue; } if( state->hasbndl.ptr.p_bool[i] ) { state->nubc.ptr.p_double[2*i+0] = state->initialinequalitymultiplier; } if( state->hasbndu.ptr.p_bool[i] ) { state->nubc.ptr.p_double[2*i+1] = state->initialinequalitymultiplier; } } for(i=0; i<=nec-1; i++) { state->nulc.ptr.p_double[i] = 0.0; } for(i=0; i<=nic-1; i++) { state->nulc.ptr.p_double[nec+i] = state->initialinequalitymultiplier; } for(i=0; i<=ng-1; i++) { state->nunlc.ptr.p_double[i] = 0.0; } for(i=0; i<=nh-1; i++) { state->nunlc.ptr.p_double[ng+i] = state->initialinequalitymultiplier; } state->gammak = minnlc_initgamma; state->xkpresent = ae_false; ae_assert(state->aulitscnt>0, "MinNLC: integrity check failed", _state); minnlc_clearpreconditioner(&state->auloptimizer, _state); outerit = 0; lbl_3: if( outerit>state->aulitscnt-1 ) { goto lbl_5; } /* * Optimize with current Lagrange multipliers * * NOTE: this code expects and checks that line search ends in the * point which is used as beginning for the next search. Such * guarantee is given by MCSRCH function. L-BFGS optimizer * does not formally guarantee it, but it follows same rule. * Below we a) rely on such property of the optimizer, and b) * assert that it is true, in order to fail loudly if it is * not true. * * NOTE: security check for NAN/INF in F/G is responsibility of * LBFGS optimizer. AUL optimizer checks for NAN/INF only * when we update Lagrange multipliers. 
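* * NOTE: as can be seen from the code which follows this comment, the inner L-BFGS pass minimizes the augmented Lagrangian target: AULOptimizer.F is seeded with the user function Fi[0], after which minnlc_penaltybc(), minnlc_penaltylc() and minnlc_penaltynlc() add the Rho-scaled penalty terms and the multiplier-weighted shift terms for box, linear and nonlinear constraints respectively.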
*/ preccounter = 0; minlbfgssetxrep(&state->auloptimizer, ae_true, _state); minlbfgsrestartfrom(&state->auloptimizer, &state->xc, _state); lbl_6: if( !minlbfgsiteration(&state->auloptimizer, _state) ) { goto lbl_7; } if( !state->auloptimizer.needfg ) { goto lbl_8; } /* * Un-scale X, evaluate F/G/H, re-scale Jacobian */ for(i=0; i<=n-1; i++) { state->x.ptr.p_double[i] = state->auloptimizer.x.ptr.p_double[i]*state->s.ptr.p_double[i]; } state->needfij = ae_true; state->rstateaul.stage = 0; goto lbl_rcomm; lbl_0: state->needfij = ae_false; for(i=0; i<=ng+nh; i++) { for(j=0; j<=n-1; j++) { state->j.ptr.pp_double[i][j] = state->j.ptr.pp_double[i][j]*state->s.ptr.p_double[j]; } } /* * Store data for estimation of Hessian norm: * * current point (re-scaled) * * gradient of the target function (re-scaled, unmodified) */ ae_v_move(&state->xk1.ptr.p_double[0], 1, &state->auloptimizer.x.ptr.p_double[0], 1, ae_v_len(0,n-1)); ae_v_move(&state->gk1.ptr.p_double[0], 1, &state->j.ptr.pp_double[0][0], 1, ae_v_len(0,n-1)); /* * Function being optimized */ state->auloptimizer.f = state->fi.ptr.p_double[0]; for(i=0; i<=n-1; i++) { state->auloptimizer.g.ptr.p_double[i] = state->j.ptr.pp_double[0][i]; } /* * Penalty for violation of boundary/linear/nonlinear constraints */ minnlc_penaltybc(&state->auloptimizer.x, &state->scaledbndl, &state->hasbndl, &state->scaledbndu, &state->hasbndu, &state->nubc, n, state->rho, state->stabilizingpoint, &state->auloptimizer.f, &state->auloptimizer.g, _state); minnlc_penaltylc(&state->auloptimizer.x, &state->scaledcleic, &state->nulc, n, nec, nic, state->rho, state->stabilizingpoint, &state->auloptimizer.f, &state->auloptimizer.g, _state); minnlc_penaltynlc(&state->fi, &state->j, &state->nunlc, n, ng, nh, state->rho, state->stabilizingpoint, &state->auloptimizer.f, &state->auloptimizer.g, _state); /* * To optimizer */ goto lbl_6; lbl_8: if( !state->auloptimizer.xupdated ) { goto lbl_10; } /* * Report current point (if needed) */ if( !state->xrep ) { goto lbl_12; } for(i=0; i<=n-1; i++) { state->x.ptr.p_double[i] = state->auloptimizer.x.ptr.p_double[i]*state->s.ptr.p_double[i]; } state->f = state->auloptimizer.f; state->xupdated = ae_true; state->rstateaul.stage = 1; goto lbl_rcomm; lbl_1: state->xupdated = ae_false; lbl_12: /* * Update GammaK */ if( state->xkpresent ) { /* * XK/GK store beginning of current line search, and XK1/GK1 * store data for the end of the line search: * * first, we Assert() that XK1 (last point where function * was evaluated) is same as AULOptimizer.X (what is * reported by RComm interface * * calculate step length V2. * * If V2>HessEstTol, then: * * calculate V0 - directional derivative at XK, * and V1 - directional derivative at XK1 * * set GammaK to Max(GammaK, |V1-V0|/V2) */ for(i=0; i<=n-1; i++) { ae_assert(ae_fp_less_eq(ae_fabs(state->auloptimizer.x.ptr.p_double[i]-state->xk1.ptr.p_double[i], _state),100*ae_machineepsilon), "MinNLC: integrity check failed, unexpected behavior of LBFGS optimizer", _state); } v2 = 0.0; for(i=0; i<=n-1; i++) { v2 = v2+ae_sqr(state->xk.ptr.p_double[i]-state->xk1.ptr.p_double[i], _state); } v2 = ae_sqrt(v2, _state); if( ae_fp_greater(v2,minnlc_hessesttol) ) { v0 = 0.0; v1 = 0.0; for(i=0; i<=n-1; i++) { v = (state->xk.ptr.p_double[i]-state->xk1.ptr.p_double[i])/v2; v0 = v0+state->gk.ptr.p_double[i]*v; v1 = v1+state->gk1.ptr.p_double[i]*v; } state->gammak = ae_maxreal(state->gammak, ae_fabs(v1-v0, _state)/v2, _state); } } else { /* * Beginning of the first line search, XK is not yet initialized. 
*/ ae_v_move(&state->xk.ptr.p_double[0], 1, &state->xk1.ptr.p_double[0], 1, ae_v_len(0,n-1)); ae_v_move(&state->gk.ptr.p_double[0], 1, &state->gk1.ptr.p_double[0], 1, ae_v_len(0,n-1)); state->xkpresent = ae_true; } /* * Update preconsitioner using current GammaK */ minnlc_updatepreconditioner(state->prectype, state->updatefreq, &preccounter, &state->auloptimizer, &state->auloptimizer.x, state->rho, state->gammak, &state->scaledbndl, &state->hasbndl, &state->scaledbndu, &state->hasbndu, &state->nubc, &state->scaledcleic, &state->nulc, &state->fi, &state->j, &state->nunlc, &state->bufd, &state->bufc, &state->bufw, &state->bufz, &state->tmp0, n, nec, nic, ng, nh, _state); goto lbl_6; lbl_10: ae_assert(ae_false, "MinNLC: integrity check failed", _state); goto lbl_6; lbl_7: minlbfgsresultsbuf(&state->auloptimizer, &state->xc, &state->aulreport, _state); state->repinneriterationscount = state->repinneriterationscount+state->aulreport.iterationscount; state->repnfev = state->repnfev+state->aulreport.nfev; state->repterminationtype = state->aulreport.terminationtype; inc(&state->repouteriterationscount, _state); if( state->repterminationtype<=0 ) { goto lbl_5; } /* * 1. Evaluate F/J * 2. Check for NAN/INF in F/J: we just calculate sum of their * components, it should be enough to reduce vector/matrix to * just one value which either "normal" (all summands were "normal") * or NAN/INF (at least one summand was NAN/INF). * 3. Update Lagrange multipliers */ for(i=0; i<=n-1; i++) { state->x.ptr.p_double[i] = state->xc.ptr.p_double[i]*state->s.ptr.p_double[i]; } state->needfij = ae_true; state->rstateaul.stage = 2; goto lbl_rcomm; lbl_2: state->needfij = ae_false; v = 0.0; for(i=0; i<=ng+nh; i++) { v = 0.1*v+state->fi.ptr.p_double[i]; for(j=0; j<=n-1; j++) { v = 0.1*v+state->j.ptr.pp_double[i][j]; } } if( !ae_isfinite(v, _state) ) { /* * Abnormal termination - infinities in function/gradient */ state->repterminationtype = -8; result = ae_false; return result; } for(i=0; i<=ng+nh; i++) { for(j=0; j<=n-1; j++) { state->j.ptr.pp_double[i][j] = state->j.ptr.pp_double[i][j]*state->s.ptr.p_double[j]; } } for(i=0; i<=n-1; i++) { /* * Process coefficients corresponding to equality-type * constraints. */ if( (state->hasbndl.ptr.p_bool[i]&&state->hasbndu.ptr.p_bool[i])&&ae_fp_eq(state->bndl.ptr.p_double[i],state->bndu.ptr.p_double[i]) ) { minnlcequalitypenaltyfunction((state->xc.ptr.p_double[i]-state->scaledbndl.ptr.p_double[i])*state->rho, &p, &dp, &d2p, _state); state->nubc.ptr.p_double[2*i+0] = state->nubc.ptr.p_double[2*i+0]-dp; continue; } /* * Process coefficients corresponding to inequality-type * constraints. These coefficients have limited growth/decay * per iteration which helps to stabilize algorithm. 
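* * Update rule implemented below: equality-type multipliers are updated additively, Nu := Nu-dP, with dP being the derivative of the equality penalty at the scaled residual; inequality-type multipliers are updated multiplicatively, Nu := Nu*|dS|, with the shift-function derivative |dS| clipped to the range [1/AULMaxGrowth, AULMaxGrowth]. The same two rules are applied to NuBC, NuLC and NuNLC.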
*/ ae_assert(ae_fp_greater(minnlc_aulmaxgrowth,1.0), "MinNLC: integrity error", _state); if( state->hasbndl.ptr.p_bool[i] ) { minnlcinequalityshiftfunction((state->xc.ptr.p_double[i]-state->scaledbndl.ptr.p_double[i])*state->rho+1, &p, &dp, &d2p, _state); v = ae_fabs(dp, _state); v = ae_minreal(v, minnlc_aulmaxgrowth, _state); v = ae_maxreal(v, 1/minnlc_aulmaxgrowth, _state); state->nubc.ptr.p_double[2*i+0] = state->nubc.ptr.p_double[2*i+0]*v; } if( state->hasbndu.ptr.p_bool[i] ) { minnlcinequalityshiftfunction((state->scaledbndu.ptr.p_double[i]-state->xc.ptr.p_double[i])*state->rho+1, &p, &dp, &d2p, _state); v = ae_fabs(dp, _state); v = ae_minreal(v, minnlc_aulmaxgrowth, _state); v = ae_maxreal(v, 1/minnlc_aulmaxgrowth, _state); state->nubc.ptr.p_double[2*i+1] = state->nubc.ptr.p_double[2*i+1]*v; } } for(i=0; i<=nec+nic-1; i++) { v = ae_v_dotproduct(&state->scaledcleic.ptr.pp_double[i][0], 1, &state->xc.ptr.p_double[0], 1, ae_v_len(0,n-1)); v = v-state->scaledcleic.ptr.pp_double[i][n]; if( i<nec ) { minnlcequalitypenaltyfunction(v*state->rho, &p, &dp, &d2p, _state); state->nulc.ptr.p_double[i] = state->nulc.ptr.p_double[i]-dp; } else { minnlcinequalityshiftfunction(-v*state->rho+1, &p, &dp, &d2p, _state); v = ae_fabs(dp, _state); v = ae_minreal(v, minnlc_aulmaxgrowth, _state); v = ae_maxreal(v, 1/minnlc_aulmaxgrowth, _state); state->nulc.ptr.p_double[i] = state->nulc.ptr.p_double[i]*v; } } for(i=1; i<=ng+nh; i++) { /* * NOTE: loop index must start from 1, not zero! */ v = state->fi.ptr.p_double[i]; if( i<=ng ) { minnlcequalitypenaltyfunction(v*state->rho, &p, &dp, &d2p, _state); state->nunlc.ptr.p_double[i-1] = state->nunlc.ptr.p_double[i-1]-dp; } else { minnlcinequalityshiftfunction(-v*state->rho+1, &p, &dp, &d2p, _state); v = ae_fabs(dp, _state); v = ae_minreal(v, minnlc_aulmaxgrowth, _state); v = ae_maxreal(v, 1/minnlc_aulmaxgrowth, _state); state->nunlc.ptr.p_double[i-1] = state->nunlc.ptr.p_double[i-1]*v; } } outerit = outerit+1; goto lbl_3; lbl_5: for(i=0; i<=n-1; i++) { state->xc.ptr.p_double[i] = state->xc.ptr.p_double[i]*state->s.ptr.p_double[i]; } result = ae_false; return result; /* * Saving state */ lbl_rcomm: result = ae_true; state->rstateaul.ia.ptr.p_int[0] = n; state->rstateaul.ia.ptr.p_int[1] = nec; state->rstateaul.ia.ptr.p_int[2] = nic; state->rstateaul.ia.ptr.p_int[3] = ng; state->rstateaul.ia.ptr.p_int[4] = nh; state->rstateaul.ia.ptr.p_int[5] = i; state->rstateaul.ia.ptr.p_int[6] = j; state->rstateaul.ia.ptr.p_int[7] = outerit; state->rstateaul.ia.ptr.p_int[8] = preccounter; state->rstateaul.ra.ptr.p_double[0] = v; state->rstateaul.ra.ptr.p_double[1] = vv; state->rstateaul.ra.ptr.p_double[2] = p; state->rstateaul.ra.ptr.p_double[3] = dp; state->rstateaul.ra.ptr.p_double[4] = d2p; state->rstateaul.ra.ptr.p_double[5] = v0; state->rstateaul.ra.ptr.p_double[6] = v1; state->rstateaul.ra.ptr.p_double[7] = v2; return result; } void _minnlcstate_init(void* _p, ae_state *_state) { minnlcstate *p = (minnlcstate*)_p; ae_touch_ptr((void*)p); ae_vector_init(&p->s, 0, DT_REAL, _state); ae_vector_init(&p->bndl, 0, DT_REAL, _state); ae_vector_init(&p->bndu, 0, DT_REAL, _state); ae_vector_init(&p->hasbndl, 0, DT_BOOL, _state); ae_vector_init(&p->hasbndu, 0, DT_BOOL, _state); ae_matrix_init(&p->cleic, 0, 0, DT_REAL, _state); ae_vector_init(&p->x, 0, DT_REAL, _state); ae_vector_init(&p->fi, 0, DT_REAL, _state); ae_matrix_init(&p->j, 0, 0, DT_REAL, _state); _rcommstate_init(&p->rstate, _state); _rcommstate_init(&p->rstateaul, _state); ae_vector_init(&p->scaledbndl, 0, DT_REAL, _state); ae_vector_init(&p->scaledbndu, 0, DT_REAL,
_state); ae_matrix_init(&p->scaledcleic, 0, 0, DT_REAL, _state); ae_vector_init(&p->xc, 0, DT_REAL, _state); ae_vector_init(&p->xstart, 0, DT_REAL, _state); ae_vector_init(&p->xbase, 0, DT_REAL, _state); ae_vector_init(&p->fbase, 0, DT_REAL, _state); ae_vector_init(&p->dfbase, 0, DT_REAL, _state); ae_vector_init(&p->fm2, 0, DT_REAL, _state); ae_vector_init(&p->fm1, 0, DT_REAL, _state); ae_vector_init(&p->fp1, 0, DT_REAL, _state); ae_vector_init(&p->fp2, 0, DT_REAL, _state); ae_vector_init(&p->dfm1, 0, DT_REAL, _state); ae_vector_init(&p->dfp1, 0, DT_REAL, _state); ae_vector_init(&p->bufd, 0, DT_REAL, _state); ae_vector_init(&p->bufc, 0, DT_REAL, _state); ae_vector_init(&p->tmp0, 0, DT_REAL, _state); ae_matrix_init(&p->bufw, 0, 0, DT_REAL, _state); ae_matrix_init(&p->bufz, 0, 0, DT_REAL, _state); ae_vector_init(&p->xk, 0, DT_REAL, _state); ae_vector_init(&p->xk1, 0, DT_REAL, _state); ae_vector_init(&p->gk, 0, DT_REAL, _state); ae_vector_init(&p->gk1, 0, DT_REAL, _state); _minlbfgsstate_init(&p->auloptimizer, _state); _minlbfgsreport_init(&p->aulreport, _state); ae_vector_init(&p->nubc, 0, DT_REAL, _state); ae_vector_init(&p->nulc, 0, DT_REAL, _state); ae_vector_init(&p->nunlc, 0, DT_REAL, _state); } void _minnlcstate_init_copy(void* _dst, void* _src, ae_state *_state) { minnlcstate *dst = (minnlcstate*)_dst; minnlcstate *src = (minnlcstate*)_src; dst->stabilizingpoint = src->stabilizingpoint; dst->initialinequalitymultiplier = src->initialinequalitymultiplier; dst->solvertype = src->solvertype; dst->prectype = src->prectype; dst->updatefreq = src->updatefreq; dst->rho = src->rho; dst->n = src->n; dst->epsg = src->epsg; dst->epsf = src->epsf; dst->epsx = src->epsx; dst->maxits = src->maxits; dst->aulitscnt = src->aulitscnt; dst->xrep = src->xrep; dst->stpmax = src->stpmax; dst->diffstep = src->diffstep; dst->teststep = src->teststep; ae_vector_init_copy(&dst->s, &src->s, _state); ae_vector_init_copy(&dst->bndl, &src->bndl, _state); ae_vector_init_copy(&dst->bndu, &src->bndu, _state); ae_vector_init_copy(&dst->hasbndl, &src->hasbndl, _state); ae_vector_init_copy(&dst->hasbndu, &src->hasbndu, _state); dst->nec = src->nec; dst->nic = src->nic; ae_matrix_init_copy(&dst->cleic, &src->cleic, _state); dst->ng = src->ng; dst->nh = src->nh; ae_vector_init_copy(&dst->x, &src->x, _state); dst->f = src->f; ae_vector_init_copy(&dst->fi, &src->fi, _state); ae_matrix_init_copy(&dst->j, &src->j, _state); dst->needfij = src->needfij; dst->needfi = src->needfi; dst->xupdated = src->xupdated; _rcommstate_init_copy(&dst->rstate, &src->rstate, _state); _rcommstate_init_copy(&dst->rstateaul, &src->rstateaul, _state); ae_vector_init_copy(&dst->scaledbndl, &src->scaledbndl, _state); ae_vector_init_copy(&dst->scaledbndu, &src->scaledbndu, _state); ae_matrix_init_copy(&dst->scaledcleic, &src->scaledcleic, _state); ae_vector_init_copy(&dst->xc, &src->xc, _state); ae_vector_init_copy(&dst->xstart, &src->xstart, _state); ae_vector_init_copy(&dst->xbase, &src->xbase, _state); ae_vector_init_copy(&dst->fbase, &src->fbase, _state); ae_vector_init_copy(&dst->dfbase, &src->dfbase, _state); ae_vector_init_copy(&dst->fm2, &src->fm2, _state); ae_vector_init_copy(&dst->fm1, &src->fm1, _state); ae_vector_init_copy(&dst->fp1, &src->fp1, _state); ae_vector_init_copy(&dst->fp2, &src->fp2, _state); ae_vector_init_copy(&dst->dfm1, &src->dfm1, _state); ae_vector_init_copy(&dst->dfp1, &src->dfp1, _state); ae_vector_init_copy(&dst->bufd, &src->bufd, _state); ae_vector_init_copy(&dst->bufc, &src->bufc, _state); 
ae_vector_init_copy(&dst->tmp0, &src->tmp0, _state); ae_matrix_init_copy(&dst->bufw, &src->bufw, _state); ae_matrix_init_copy(&dst->bufz, &src->bufz, _state); ae_vector_init_copy(&dst->xk, &src->xk, _state); ae_vector_init_copy(&dst->xk1, &src->xk1, _state); ae_vector_init_copy(&dst->gk, &src->gk, _state); ae_vector_init_copy(&dst->gk1, &src->gk1, _state); dst->gammak = src->gammak; dst->xkpresent = src->xkpresent; _minlbfgsstate_init_copy(&dst->auloptimizer, &src->auloptimizer, _state); _minlbfgsreport_init_copy(&dst->aulreport, &src->aulreport, _state); ae_vector_init_copy(&dst->nubc, &src->nubc, _state); ae_vector_init_copy(&dst->nulc, &src->nulc, _state); ae_vector_init_copy(&dst->nunlc, &src->nunlc, _state); dst->repinneriterationscount = src->repinneriterationscount; dst->repouteriterationscount = src->repouteriterationscount; dst->repnfev = src->repnfev; dst->repvaridx = src->repvaridx; dst->repfuncidx = src->repfuncidx; dst->repterminationtype = src->repterminationtype; dst->repdbgphase0its = src->repdbgphase0its; } void _minnlcstate_clear(void* _p) { minnlcstate *p = (minnlcstate*)_p; ae_touch_ptr((void*)p); ae_vector_clear(&p->s); ae_vector_clear(&p->bndl); ae_vector_clear(&p->bndu); ae_vector_clear(&p->hasbndl); ae_vector_clear(&p->hasbndu); ae_matrix_clear(&p->cleic); ae_vector_clear(&p->x); ae_vector_clear(&p->fi); ae_matrix_clear(&p->j); _rcommstate_clear(&p->rstate); _rcommstate_clear(&p->rstateaul); ae_vector_clear(&p->scaledbndl); ae_vector_clear(&p->scaledbndu); ae_matrix_clear(&p->scaledcleic); ae_vector_clear(&p->xc); ae_vector_clear(&p->xstart); ae_vector_clear(&p->xbase); ae_vector_clear(&p->fbase); ae_vector_clear(&p->dfbase); ae_vector_clear(&p->fm2); ae_vector_clear(&p->fm1); ae_vector_clear(&p->fp1); ae_vector_clear(&p->fp2); ae_vector_clear(&p->dfm1); ae_vector_clear(&p->dfp1); ae_vector_clear(&p->bufd); ae_vector_clear(&p->bufc); ae_vector_clear(&p->tmp0); ae_matrix_clear(&p->bufw); ae_matrix_clear(&p->bufz); ae_vector_clear(&p->xk); ae_vector_clear(&p->xk1); ae_vector_clear(&p->gk); ae_vector_clear(&p->gk1); _minlbfgsstate_clear(&p->auloptimizer); _minlbfgsreport_clear(&p->aulreport); ae_vector_clear(&p->nubc); ae_vector_clear(&p->nulc); ae_vector_clear(&p->nunlc); } void _minnlcstate_destroy(void* _p) { minnlcstate *p = (minnlcstate*)_p; ae_touch_ptr((void*)p); ae_vector_destroy(&p->s); ae_vector_destroy(&p->bndl); ae_vector_destroy(&p->bndu); ae_vector_destroy(&p->hasbndl); ae_vector_destroy(&p->hasbndu); ae_matrix_destroy(&p->cleic); ae_vector_destroy(&p->x); ae_vector_destroy(&p->fi); ae_matrix_destroy(&p->j); _rcommstate_destroy(&p->rstate); _rcommstate_destroy(&p->rstateaul); ae_vector_destroy(&p->scaledbndl); ae_vector_destroy(&p->scaledbndu); ae_matrix_destroy(&p->scaledcleic); ae_vector_destroy(&p->xc); ae_vector_destroy(&p->xstart); ae_vector_destroy(&p->xbase); ae_vector_destroy(&p->fbase); ae_vector_destroy(&p->dfbase); ae_vector_destroy(&p->fm2); ae_vector_destroy(&p->fm1); ae_vector_destroy(&p->fp1); ae_vector_destroy(&p->fp2); ae_vector_destroy(&p->dfm1); ae_vector_destroy(&p->dfp1); ae_vector_destroy(&p->bufd); ae_vector_destroy(&p->bufc); ae_vector_destroy(&p->tmp0); ae_matrix_destroy(&p->bufw); ae_matrix_destroy(&p->bufz); ae_vector_destroy(&p->xk); ae_vector_destroy(&p->xk1); ae_vector_destroy(&p->gk); ae_vector_destroy(&p->gk1); _minlbfgsstate_destroy(&p->auloptimizer); _minlbfgsreport_destroy(&p->aulreport); ae_vector_destroy(&p->nubc); ae_vector_destroy(&p->nulc); ae_vector_destroy(&p->nunlc); } void _minnlcreport_init(void* _p, ae_state 
*_state) { minnlcreport *p = (minnlcreport*)_p; ae_touch_ptr((void*)p); } void _minnlcreport_init_copy(void* _dst, void* _src, ae_state *_state) { minnlcreport *dst = (minnlcreport*)_dst; minnlcreport *src = (minnlcreport*)_src; dst->iterationscount = src->iterationscount; dst->nfev = src->nfev; dst->varidx = src->varidx; dst->funcidx = src->funcidx; dst->terminationtype = src->terminationtype; dst->dbgphase0its = src->dbgphase0its; } void _minnlcreport_clear(void* _p) { minnlcreport *p = (minnlcreport*)_p; ae_touch_ptr((void*)p); } void _minnlcreport_destroy(void* _p) { minnlcreport *p = (minnlcreport*)_p; ae_touch_ptr((void*)p); } /************************************************************************* BOX CONSTRAINED OPTIMIZATION WITH FAST ACTIVATION OF MULTIPLE BOX CONSTRAINTS DESCRIPTION: The subroutine minimizes function F(x) of N arguments subject to box constraints (with some of box constraints actually being equality ones). This optimizer uses algorithm similar to that of MinBLEIC (optimizer with general linear constraints), but presence of box-only constraints allows us to use faster constraint activation strategies. On large-scale problems, with multiple constraints active at the solution, this optimizer can be several times faster than BLEIC. REQUIREMENTS: * user must provide function value and gradient * starting point X0 must be feasible or not too far away from the feasible set * grad(f) must be Lipschitz continuous on a level set: L = { x : f(x)<=f(x0) } * function must be defined everywhere on the feasible set F USAGE: Constrained optimization is far more complex than the unconstrained one. Here we give very brief outline of the BC optimizer. We strongly recommend you to read examples in the ALGLIB Reference Manual and to read ALGLIB User Guide on optimization, which is available at http://www.alglib.net/optimization/ 1. User initializes algorithm state with MinBCCreate() call 2. User adds box constraints by calling MinBCSetBC() function. 3. User sets stopping conditions with MinBCSetCond(). 4. User calls MinBCOptimize() function which takes algorithm state and pointer (delegate, etc.) to callback function which calculates F/G. 5. User calls MinBCResults() to get solution 6. Optionally user may call MinBCRestartFrom() to solve another problem with same N but another starting point. MinBCRestartFrom() allows to reuse already initialized structure. INPUT PARAMETERS: N - problem dimension, N>0: * if given, only leading N elements of X are used * if not given, automatically determined from size of X X - starting point, array[N]: * it is better to set X to a feasible point * but X can be infeasible, in which case algorithm will try to find feasible point first, using X as initial approximation. OUTPUT PARAMETERS: State - structure stores algorithm state -- ALGLIB -- Copyright 28.11.2010 by Bochkanov Sergey *************************************************************************/ void minbccreate(ae_int_t n, /* Real */ ae_vector* x, minbcstate* state, ae_state *_state) { ae_frame _frame_block; ae_matrix c; ae_vector ct; ae_frame_make(_state, &_frame_block); _minbcstate_clear(state); ae_matrix_init(&c, 0, 0, DT_REAL, _state); ae_vector_init(&ct, 0, DT_INT, _state); ae_assert(n>=1, "MinBCCreate: N<1", _state); ae_assert(x->cnt>=n, "MinBCCreate: Length(X)<N", _state); ae_assert(isfinitevector(x, n, _state), "MinBCCreate: X contains infinite or NaN values!", _state); minbc_minbcinitinternal(n, x, 0.0, state, _state); ae_frame_leave(_state); } /************************************************************************* The subroutine is finite difference variant of MinBCCreate(). It uses finite differences in order to differentiate target function. Description below contains information which is specific to this function only. We recommend to read comments on MinBCCreate() in order to get more information about creation of BC optimizer. INPUT PARAMETERS: N - problem dimension, N>0: * if given, only leading N elements of X are used * if not given, automatically determined from size of X X - starting point, array[0..N-1].
DiffStep- differentiation step, >0 OUTPUT PARAMETERS: State - structure which stores algorithm state NOTES: 1. algorithm uses 4-point central formula for differentiation. 2. differentiation step along I-th axis is equal to DiffStep*S[I] where S[] is scaling vector which can be set by MinBCSetScale() call. 3. we recommend you to use moderate values of differentiation step. Too large step will result in too large truncation errors, while too small step will result in too large numerical errors. 1.0E-6 can be a good value to start with. 4. Numerical differentiation is very inefficient - one gradient calculation needs 4*N function evaluations. This function will work for any N - either small (1...10), moderate (10...100) or large (100...). However, performance penalty will be too severe for any N's except for small ones. We should also say that code which relies on numerical differentiation is less robust and precise. The optimizer needs exact gradient values. Imprecise gradient may slow down convergence, especially on highly nonlinear problems. Thus we recommend to use this function for fast prototyping on small-dimensional problems only, and to implement analytical gradient as soon as possible. -- ALGLIB -- Copyright 16.05.2011 by Bochkanov Sergey *************************************************************************/ void minbccreatef(ae_int_t n, /* Real */ ae_vector* x, double diffstep, minbcstate* state, ae_state *_state) { ae_frame _frame_block; ae_matrix c; ae_vector ct; ae_frame_make(_state, &_frame_block); _minbcstate_clear(state); ae_matrix_init(&c, 0, 0, DT_REAL, _state); ae_vector_init(&ct, 0, DT_INT, _state); ae_assert(n>=1, "MinBCCreateF: N<1", _state); ae_assert(x->cnt>=n, "MinBCCreateF: Length(X)<N", _state); ae_assert(isfinitevector(x, n, _state), "MinBCCreateF: X contains infinite or NaN values!", _state); ae_assert(ae_isfinite(diffstep, _state), "MinBCCreateF: DiffStep is infinite or NaN!", _state); ae_assert(ae_fp_greater(diffstep,(double)(0)), "MinBCCreateF: DiffStep is non-positive!", _state); minbc_minbcinitinternal(n, x, diffstep, state, _state); ae_frame_leave(_state); } /************************************************************************* This function sets boundary constraints for BC optimizer. Boundary constraints are inactive by default (after initial creation). They are preserved after algorithm restart with MinBCRestartFrom(). INPUT PARAMETERS: State - structure stores algorithm state BndL - lower bounds, array[N]. If some (all) variables are unbounded, you may specify very small number or -INF. BndU - upper bounds, array[N]. If some (all) variables are unbounded, you may specify very large number or +INF. NOTE 1: it is possible to specify BndL[i]=BndU[i]. In this case I-th variable will be "frozen" at X[i]=BndL[i]=BndU[i]. NOTE 2: this solver has following useful properties: * bound constraints are always satisfied exactly * function is evaluated only INSIDE area specified by bound constraints, even when numerical differentiation is used (algorithm adjusts nodes according to boundary constraints) -- ALGLIB -- Copyright 28.11.2010 by Bochkanov Sergey *************************************************************************/ void minbcsetbc(minbcstate* state, /* Real */ ae_vector* bndl, /* Real */ ae_vector* bndu, ae_state *_state) { ae_int_t i; ae_int_t n; n = state->nmain; ae_assert(bndl->cnt>=n, "MinBCSetBC: Length(BndL)<N", _state); ae_assert(bndu->cnt>=n, "MinBCSetBC: Length(BndU)<N", _state); for(i=0; i<=n-1; i++) { ae_assert(ae_isfinite(bndl->ptr.p_double[i], _state)||ae_isneginf(bndl->ptr.p_double[i], _state), "MinBCSetBC: BndL contains NAN or +INF", _state); ae_assert(ae_isfinite(bndu->ptr.p_double[i], _state)||ae_isposinf(bndu->ptr.p_double[i], _state), "MinBCSetBC: BndL contains NAN or -INF", _state); state->bndl.ptr.p_double[i] = bndl->ptr.p_double[i]; state->hasbndl.ptr.p_bool[i] = ae_isfinite(bndl->ptr.p_double[i], _state); state->bndu.ptr.p_double[i] = bndu->ptr.p_double[i]; state->hasbndu.ptr.p_bool[i] = ae_isfinite(bndu->ptr.p_double[i], _state); } } /************************************************************************* This function sets stopping conditions for the optimizer. INPUT PARAMETERS: State - structure which stores algorithm state EpsG - >=0 The subroutine finishes its work if the condition |v|<EpsG is satisfied, where: * |.| means Euclidian norm * v - scaled gradient vector, v[i]=g[i]*s[i] * g - gradient * s - scaling coefficients set by MinBCSetScale() EpsF - >=0 The subroutine finishes its work if on k+1-th iteration the condition |F(k+1)-F(k)|<=EpsF*max{|F(k)|,|F(k+1)|,1} is satisfied. EpsX - >=0 The subroutine finishes its work if on k+1-th iteration the condition |v|<=EpsX is fulfilled, where: * |.| means Euclidian norm * v - scaled step vector, v[i]=dx[i]/s[i] * dx - step vector, dx=X(k+1)-X(k) * s - scaling coefficients set by MinBCSetScale() MaxIts - maximum number of iterations. If MaxIts=0, the number of iterations is unlimited. Passing EpsG=0, EpsF=0 and EpsX=0 and MaxIts=0 (simultaneously) will lead to automatic stopping criterion selection. NOTE: when SetCond() is called with non-zero MaxIts, BC solver may perform slightly more than MaxIts iterations. I.e., MaxIts sets non-strict limit on iterations count.
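EXAMPLE (informal, C++ interface): the sketch below ties together the workflow outlined in the MinBCCreate() comments with the stopping conditions described here. It targets the C++ wrapper layer (alglib namespace, real_1d_array, optimization.h) rather than the C core listed in this file; the objective function and the bound values are invented for illustration, and the exact wrapper signatures should be checked against the optimization.h header shipped with your copy of ALGLIB.

    #include <cstdio>
    #include <cmath>
    #include "optimization.h"

    using namespace alglib;

    // Hypothetical target: f(x0,x1) = 100*(x0+3)^4 + (x1-3)^4, with analytic gradient.
    static void target_grad(const real_1d_array &x, double &func, real_1d_array &grad, void *ptr)
    {
        func    = 100*pow(x[0]+3, 4) + pow(x[1]-3, 4);
        grad[0] = 400*pow(x[0]+3, 3);
        grad[1] =   4*pow(x[1]-3, 3);
    }

    int main()
    {
        real_1d_array x    = "[0,0]";     // starting point
        real_1d_array bndl = "[-1,-1]";   // lower bounds
        real_1d_array bndu = "[+1,+1]";   // upper bounds
        minbcstate    state;
        minbcreport   rep;

        minbccreate(x, state);                 // 1. create optimizer state
        minbcsetbc(state, bndl, bndu);         // 2. add box constraints
        minbcsetcond(state, 1.0E-6, 0, 0, 0);  // 3. stop when scaled gradient norm drops below 1.0E-6
        minbcoptimize(state, target_grad);     // 4. run optimization with F/G callback
        minbcresults(state, x, rep);           // 5. extract solution and report
        printf("%s (termination type %d)\n", x.tostring(2).c_str(), int(rep.terminationtype));
        return 0;
    }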
-- ALGLIB -- Copyright 28.11.2010 by Bochkanov Sergey *************************************************************************/ void minbcsetcond(minbcstate* state, double epsg, double epsf, double epsx, ae_int_t maxits, ae_state *_state) { ae_assert(ae_isfinite(epsg, _state), "MinBCSetCond: EpsG is not finite number", _state); ae_assert(ae_fp_greater_eq(epsg,(double)(0)), "MinBCSetCond: negative EpsG", _state); ae_assert(ae_isfinite(epsf, _state), "MinBCSetCond: EpsF is not finite number", _state); ae_assert(ae_fp_greater_eq(epsf,(double)(0)), "MinBCSetCond: negative EpsF", _state); ae_assert(ae_isfinite(epsx, _state), "MinBCSetCond: EpsX is not finite number", _state); ae_assert(ae_fp_greater_eq(epsx,(double)(0)), "MinBCSetCond: negative EpsX", _state); ae_assert(maxits>=0, "MinBCSetCond: negative MaxIts!", _state); if( ((ae_fp_eq(epsg,(double)(0))&&ae_fp_eq(epsf,(double)(0)))&&ae_fp_eq(epsx,(double)(0)))&&maxits==0 ) { epsx = 1.0E-6; } state->epsg = epsg; state->epsf = epsf; state->epsx = epsx; state->maxits = maxits; } /************************************************************************* This function sets scaling coefficients for BC optimizer. ALGLIB optimizers use scaling matrices to test stopping conditions (step size and gradient are scaled before comparison with tolerances). Scale of the I-th variable is a translation invariant measure of: a) "how large" the variable is b) how large the step should be to make significant changes in the function Scaling is also used by finite difference variant of the optimizer - step along I-th axis is equal to DiffStep*S[I]. In most optimizers (and in the BC too) scaling is NOT a form of preconditioning. It just affects stopping conditions. You should set preconditioner by separate call to one of the MinBCSetPrec...() functions. There is a special preconditioning mode, however, which uses scaling coefficients to form diagonal preconditioning matrix. You can turn this mode on, if you want. But you should understand that scaling is not the same thing as preconditioning - these are two different, although related forms of tuning solver. INPUT PARAMETERS: State - structure stores algorithm state S - array[N], non-zero scaling coefficients S[i] may be negative, sign doesn't matter. -- ALGLIB -- Copyright 14.01.2011 by Bochkanov Sergey *************************************************************************/ void minbcsetscale(minbcstate* state, /* Real */ ae_vector* s, ae_state *_state) { ae_int_t i; ae_assert(s->cnt>=state->nmain, "MinBCSetScale: Length(S)<N", _state); for(i=0; i<=state->nmain-1; i++) { ae_assert(ae_isfinite(s->ptr.p_double[i], _state), "MinBCSetScale: S contains infinite or NAN elements", _state); ae_assert(ae_fp_neq(s->ptr.p_double[i],(double)(0)), "MinBCSetScale: S contains zero elements", _state); state->s.ptr.p_double[i] = ae_fabs(s->ptr.p_double[i], _state); } } /************************************************************************* Modification of the preconditioner: preconditioning is turned off. INPUT PARAMETERS: State - structure which stores algorithm state -- ALGLIB -- Copyright 13.10.2010 by Bochkanov Sergey *************************************************************************/ void minbcsetprecdefault(minbcstate* state, ae_state *_state) { state->prectype = 0; } /************************************************************************* Modification of the preconditioner: diagonal of approximate Hessian is used.
INPUT PARAMETERS: State - structure which stores algorithm state D - diagonal of the approximate Hessian, array[0..N-1], (if larger, only leading N elements are used). NOTE 1: D[i] should be positive. Exception will be thrown otherwise. NOTE 2: you should pass diagonal of approximate Hessian - NOT ITS INVERSE. -- ALGLIB -- Copyright 13.10.2010 by Bochkanov Sergey *************************************************************************/ void minbcsetprecdiag(minbcstate* state, /* Real */ ae_vector* d, ae_state *_state) { ae_int_t i; ae_assert(d->cnt>=state->nmain, "MinBCSetPrecDiag: D is too short", _state); for(i=0; i<=state->nmain-1; i++) { ae_assert(ae_isfinite(d->ptr.p_double[i], _state), "MinBCSetPrecDiag: D contains infinite or NAN elements", _state); ae_assert(ae_fp_greater(d->ptr.p_double[i],(double)(0)), "MinBCSetPrecDiag: D contains non-positive elements", _state); } rvectorsetlengthatleast(&state->diagh, state->nmain, _state); state->prectype = 2; for(i=0; i<=state->nmain-1; i++) { state->diagh.ptr.p_double[i] = d->ptr.p_double[i]; } } /************************************************************************* Modification of the preconditioner: scale-based diagonal preconditioning. This preconditioning mode can be useful when you don't have approximate diagonal of Hessian, but you know that your variables are badly scaled (for example, one variable is in [1,10], and another in [1000,100000]), and most part of the ill-conditioning comes from different scales of vars. In this case simple scale-based preconditioner, with H[i] = 1/(s[i]^2), can greatly improve convergence. IMPRTANT: you should set scale of your variables with MinBCSetScale() call (before or after MinBCSetPrecScale() call). Without knowledge of the scale of your variables scale-based preconditioner will be just unit matrix. INPUT PARAMETERS: State - structure which stores algorithm state -- ALGLIB -- Copyright 13.10.2010 by Bochkanov Sergey *************************************************************************/ void minbcsetprecscale(minbcstate* state, ae_state *_state) { state->prectype = 3; } /************************************************************************* This function turns on/off reporting. INPUT PARAMETERS: State - structure which stores algorithm state NeedXRep- whether iteration reports are needed or not If NeedXRep is True, algorithm will call rep() callback function if it is provided to MinBCOptimize(). -- ALGLIB -- Copyright 28.11.2010 by Bochkanov Sergey *************************************************************************/ void minbcsetxrep(minbcstate* state, ae_bool needxrep, ae_state *_state) { state->xrep = needxrep; } /************************************************************************* This function sets maximum step length INPUT PARAMETERS: State - structure which stores algorithm state StpMax - maximum step length, >=0. Set StpMax to 0.0, if you don't want to limit step length. Use this subroutine when you optimize target function which contains exp() or other fast growing functions, and optimization algorithm makes too large steps which lead to overflow. This function allows us to reject steps that are too large (and therefore expose us to the possible overflow) without actually calculating function value at the x+stp*d. 
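A short illustration of this note, using the same hypothetical C++ wrapper calls as the sketch given after the MinBCSetCond() comments (the cap of 10.0 is an arbitrary example and the wrapper signature is assumed to mirror the C core function below):

    // Cap the step length so that exp()-heavy targets are never evaluated
    // far outside the region where they stay finite; pass 0.0 to remove the limit.
    minbcsetstpmax(state, 10.0);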
-- ALGLIB -- Copyright 02.04.2010 by Bochkanov Sergey *************************************************************************/ void minbcsetstpmax(minbcstate* state, double stpmax, ae_state *_state) { ae_assert(ae_isfinite(stpmax, _state), "MinBCSetStpMax: StpMax is not finite!", _state); ae_assert(ae_fp_greater_eq(stpmax,(double)(0)), "MinBCSetStpMax: StpMax<0!", _state); state->stpmax = stpmax; } /************************************************************************* NOTES: 1. This function has two different implementations: one which uses exact (analytical) user-supplied gradient, and one which uses function value only and numerically differentiates function in order to obtain gradient. Depending on the specific function used to create optimizer object (either MinBCCreate() for analytical gradient or MinBCCreateF() for numerical differentiation) you should choose appropriate variant of MinBCOptimize() - one which accepts function AND gradient or one which accepts function ONLY. Be careful to choose variant of MinBCOptimize() which corresponds to your optimization scheme! Table below lists different combinations of callback (function/gradient) passed to MinBCOptimize() and specific function used to create optimizer. | USER PASSED TO MinBCOptimize() CREATED WITH | function only | function and gradient ------------------------------------------------------------ MinBCCreateF() | works FAILS MinBCCreate() | FAILS works Here "FAIL" denotes inappropriate combinations of optimizer creation function and MinBCOptimize() version. Attemps to use such combination (for example, to create optimizer with MinBCCreateF() and to pass gradient information to MinCGOptimize()) will lead to exception being thrown. Either you did not pass gradient when it WAS needed or you passed gradient when it was NOT needed. -- ALGLIB -- Copyright 28.11.2010 by Bochkanov Sergey *************************************************************************/ ae_bool minbciteration(minbcstate* state, ae_state *_state) { ae_int_t freezeidx; double freezeval; double scaleddnorm; ae_int_t n; ae_int_t m; ae_int_t i; ae_int_t j; double v; double vv; double v0; ae_bool b; ae_int_t mcinfo; ae_int_t itidx; double ginit; double gdecay; ae_bool activationstatus; double activationstep; ae_bool result; /* * Reverse communication preparations * I know it looks ugly, but it works the same way * anywhere from C++ to Python. 
* * This code initializes locals by: * * random values determined during code * generation - on first subroutine call * * values from previous call - on subsequent calls */ if( state->rstate.stage>=0 ) { freezeidx = state->rstate.ia.ptr.p_int[0]; n = state->rstate.ia.ptr.p_int[1]; m = state->rstate.ia.ptr.p_int[2]; i = state->rstate.ia.ptr.p_int[3]; j = state->rstate.ia.ptr.p_int[4]; mcinfo = state->rstate.ia.ptr.p_int[5]; itidx = state->rstate.ia.ptr.p_int[6]; b = state->rstate.ba.ptr.p_bool[0]; activationstatus = state->rstate.ba.ptr.p_bool[1]; freezeval = state->rstate.ra.ptr.p_double[0]; scaleddnorm = state->rstate.ra.ptr.p_double[1]; v = state->rstate.ra.ptr.p_double[2]; vv = state->rstate.ra.ptr.p_double[3]; v0 = state->rstate.ra.ptr.p_double[4]; ginit = state->rstate.ra.ptr.p_double[5]; gdecay = state->rstate.ra.ptr.p_double[6]; activationstep = state->rstate.ra.ptr.p_double[7]; } else { freezeidx = 359; n = -58; m = -919; i = -909; j = 81; mcinfo = 255; itidx = 74; b = ae_false; activationstatus = ae_true; freezeval = 205; scaleddnorm = -838; v = 939; vv = -526; v0 = 763; ginit = -541; gdecay = -698; activationstep = -900; } if( state->rstate.stage==0 ) { goto lbl_0; } if( state->rstate.stage==1 ) { goto lbl_1; } if( state->rstate.stage==2 ) { goto lbl_2; } if( state->rstate.stage==3 ) { goto lbl_3; } if( state->rstate.stage==4 ) { goto lbl_4; } if( state->rstate.stage==5 ) { goto lbl_5; } if( state->rstate.stage==6 ) { goto lbl_6; } if( state->rstate.stage==7 ) { goto lbl_7; } if( state->rstate.stage==8 ) { goto lbl_8; } if( state->rstate.stage==9 ) { goto lbl_9; } if( state->rstate.stage==10 ) { goto lbl_10; } if( state->rstate.stage==11 ) { goto lbl_11; } if( state->rstate.stage==12 ) { goto lbl_12; } if( state->rstate.stage==13 ) { goto lbl_13; } if( state->rstate.stage==14 ) { goto lbl_14; } if( state->rstate.stage==15 ) { goto lbl_15; } if( state->rstate.stage==16 ) { goto lbl_16; } if( state->rstate.stage==17 ) { goto lbl_17; } if( state->rstate.stage==18 ) { goto lbl_18; } if( state->rstate.stage==19 ) { goto lbl_19; } if( state->rstate.stage==20 ) { goto lbl_20; } if( state->rstate.stage==21 ) { goto lbl_21; } if( state->rstate.stage==22 ) { goto lbl_22; } if( state->rstate.stage==23 ) { goto lbl_23; } if( state->rstate.stage==24 ) { goto lbl_24; } if( state->rstate.stage==25 ) { goto lbl_25; } if( state->rstate.stage==26 ) { goto lbl_26; } if( state->rstate.stage==27 ) { goto lbl_27; } if( state->rstate.stage==28 ) { goto lbl_28; } if( state->rstate.stage==29 ) { goto lbl_29; } if( state->rstate.stage==30 ) { goto lbl_30; } if( state->rstate.stage==31 ) { goto lbl_31; } /* * Routine body */ /* * Algorithm parameters: * * M number of L-BFGS corrections. * This coefficient remains fixed during iterations. * * GDecay desired decrease of constrained gradient during L-BFGS iterations. * This coefficient is decreased after each L-BFGS round until * it reaches minimum decay. 
*/ m = ae_minint(5, state->nmain, _state); gdecay = minbc_initialdecay; /* * Init */ n = state->nmain; for(i=0; i<=n-1; i++) { state->xc.ptr.p_double[i] = state->xstart.ptr.p_double[i]; } if( !enforceboundaryconstraints(&state->xc, &state->bndl, &state->hasbndl, &state->bndu, &state->hasbndu, n, 0, _state) ) { /* * Inconsistent constraints */ state->repterminationtype = -3; result = ae_false; return result; } state->userterminationneeded = ae_false; state->repterminationtype = 0; state->repiterationscount = 0; state->repnfev = 0; state->repvaridx = -1; rmatrixsetlengthatleast(&state->bufyk, m+1, n, _state); rmatrixsetlengthatleast(&state->bufsk, m+1, n, _state); rvectorsetlengthatleast(&state->bufrho, m, _state); rvectorsetlengthatleast(&state->buftheta, m, _state); rvectorsetlengthatleast(&state->tmp0, n, _state); /* * Fill TmpPrec with current preconditioner */ rvectorsetlengthatleast(&state->tmpprec, n, _state); for(i=0; i<=n-1; i++) { if( state->prectype==2 ) { state->tmpprec.ptr.p_double[i] = 1/state->diagh.ptr.p_double[i]; continue; } if( state->prectype==3 ) { state->tmpprec.ptr.p_double[i] = ae_sqr(state->s.ptr.p_double[i], _state); continue; } state->tmpprec.ptr.p_double[i] = (double)(1); } /* * Check correctness of user-supplied gradient */ if( !(ae_fp_eq(state->diffstep,(double)(0))&&ae_fp_greater(state->teststep,(double)(0))) ) { goto lbl_32; } minbc_clearrequestfields(state, _state); ae_v_move(&state->x.ptr.p_double[0], 1, &state->xc.ptr.p_double[0], 1, ae_v_len(0,n-1)); state->needfg = ae_true; i = 0; lbl_34: if( i>n-1 ) { goto lbl_36; } ae_assert(!state->hasbndl.ptr.p_bool[i]||ae_fp_greater_eq(state->xc.ptr.p_double[i],state->bndl.ptr.p_double[i]), "MinBCIteration: internal error(State.X is out of bounds)", _state); ae_assert(!state->hasbndu.ptr.p_bool[i]||ae_fp_less_eq(state->xc.ptr.p_double[i],state->bndu.ptr.p_double[i]), "MinBCIteration: internal error(State.X is out of bounds)", _state); v = state->x.ptr.p_double[i]; state->x.ptr.p_double[i] = v-state->teststep*state->s.ptr.p_double[i]; if( state->hasbndl.ptr.p_bool[i] ) { state->x.ptr.p_double[i] = ae_maxreal(state->x.ptr.p_double[i], state->bndl.ptr.p_double[i], _state); } state->xm1 = state->x.ptr.p_double[i]; state->rstate.stage = 0; goto lbl_rcomm; lbl_0: state->fm1 = state->f; state->gm1 = state->g.ptr.p_double[i]; state->x.ptr.p_double[i] = v+state->teststep*state->s.ptr.p_double[i]; if( state->hasbndu.ptr.p_bool[i] ) { state->x.ptr.p_double[i] = ae_minreal(state->x.ptr.p_double[i], state->bndu.ptr.p_double[i], _state); } state->xp1 = state->x.ptr.p_double[i]; state->rstate.stage = 1; goto lbl_rcomm; lbl_1: state->fp1 = state->f; state->gp1 = state->g.ptr.p_double[i]; state->x.ptr.p_double[i] = (state->xm1+state->xp1)/2; if( state->hasbndl.ptr.p_bool[i] ) { state->x.ptr.p_double[i] = ae_maxreal(state->x.ptr.p_double[i], state->bndl.ptr.p_double[i], _state); } if( state->hasbndu.ptr.p_bool[i] ) { state->x.ptr.p_double[i] = ae_minreal(state->x.ptr.p_double[i], state->bndu.ptr.p_double[i], _state); } state->rstate.stage = 2; goto lbl_rcomm; lbl_2: state->x.ptr.p_double[i] = v; if( !derivativecheck(state->fm1, state->gm1, state->fp1, state->gp1, state->f, state->g.ptr.p_double[i], state->xp1-state->xm1, _state) ) { state->repvaridx = i; state->repterminationtype = -7; result = ae_false; return result; } i = i+1; goto lbl_34; lbl_36: state->needfg = ae_false; lbl_32: /* * Main cycle of BC-PG algorithm */ state->repterminationtype = 0; state->lastscaledgoodstep = (double)(0); state->nonmonotoniccnt = ae_round(1.5*n, 
_state)+5; ae_v_move(&state->x.ptr.p_double[0], 1, &state->xc.ptr.p_double[0], 1, ae_v_len(0,n-1)); minbc_clearrequestfields(state, _state); if( ae_fp_neq(state->diffstep,(double)(0)) ) { goto lbl_37; } state->needfg = ae_true; state->rstate.stage = 3; goto lbl_rcomm; lbl_3: state->needfg = ae_false; goto lbl_38; lbl_37: state->needf = ae_true; state->rstate.stage = 4; goto lbl_rcomm; lbl_4: state->needf = ae_false; lbl_38: state->fc = state->f; trimprepare(state->f, &state->trimthreshold, _state); state->repnfev = state->repnfev+1; if( !state->xrep ) { goto lbl_39; } /* * Report current point */ ae_v_move(&state->x.ptr.p_double[0], 1, &state->xc.ptr.p_double[0], 1, ae_v_len(0,n-1)); state->f = state->fc; state->xupdated = ae_true; state->rstate.stage = 5; goto lbl_rcomm; lbl_5: state->xupdated = ae_false; lbl_39: if( state->userterminationneeded ) { /* * User requested termination */ state->repterminationtype = 8; result = ae_false; return result; } lbl_41: if( ae_false ) { goto lbl_42; } /* * Steepest descent phase * * (a) calculate unconstrained gradient * (b) check F/G for NAN/INF, abnormally terminate algorithm if needed * (c) perform one steepest descent step, activating only those constraints * which prevent us from moving outside of box-constrained area */ ae_v_move(&state->x.ptr.p_double[0], 1, &state->xc.ptr.p_double[0], 1, ae_v_len(0,n-1)); minbc_clearrequestfields(state, _state); if( ae_fp_neq(state->diffstep,(double)(0)) ) { goto lbl_43; } /* * Analytic gradient */ state->needfg = ae_true; state->rstate.stage = 6; goto lbl_rcomm; lbl_6: state->needfg = ae_false; goto lbl_44; lbl_43: /* * Numerical differentiation */ state->needf = ae_true; state->rstate.stage = 7; goto lbl_rcomm; lbl_7: state->fbase = state->f; i = 0; lbl_45: if( i>n-1 ) { goto lbl_47; } v = state->x.ptr.p_double[i]; b = ae_false; if( state->hasbndl.ptr.p_bool[i] ) { b = b||ae_fp_less(v-state->diffstep*state->s.ptr.p_double[i],state->bndl.ptr.p_double[i]); } if( state->hasbndu.ptr.p_bool[i] ) { b = b||ae_fp_greater(v+state->diffstep*state->s.ptr.p_double[i],state->bndu.ptr.p_double[i]); } if( b ) { goto lbl_48; } state->x.ptr.p_double[i] = v-state->diffstep*state->s.ptr.p_double[i]; state->rstate.stage = 8; goto lbl_rcomm; lbl_8: state->fm2 = state->f; state->x.ptr.p_double[i] = v-0.5*state->diffstep*state->s.ptr.p_double[i]; state->rstate.stage = 9; goto lbl_rcomm; lbl_9: state->fm1 = state->f; state->x.ptr.p_double[i] = v+0.5*state->diffstep*state->s.ptr.p_double[i]; state->rstate.stage = 10; goto lbl_rcomm; lbl_10: state->fp1 = state->f; state->x.ptr.p_double[i] = v+state->diffstep*state->s.ptr.p_double[i]; state->rstate.stage = 11; goto lbl_rcomm; lbl_11: state->fp2 = state->f; state->g.ptr.p_double[i] = (8*(state->fp1-state->fm1)-(state->fp2-state->fm2))/(6*state->diffstep*state->s.ptr.p_double[i]); goto lbl_49; lbl_48: state->xm1 = v-state->diffstep*state->s.ptr.p_double[i]; state->xp1 = v+state->diffstep*state->s.ptr.p_double[i]; if( state->hasbndl.ptr.p_bool[i]&&ae_fp_less(state->xm1,state->bndl.ptr.p_double[i]) ) { state->xm1 = state->bndl.ptr.p_double[i]; } if( state->hasbndu.ptr.p_bool[i]&&ae_fp_greater(state->xp1,state->bndu.ptr.p_double[i]) ) { state->xp1 = state->bndu.ptr.p_double[i]; } state->x.ptr.p_double[i] = state->xm1; state->rstate.stage = 12; goto lbl_rcomm; lbl_12: state->fm1 = state->f; state->x.ptr.p_double[i] = state->xp1; state->rstate.stage = 13; goto lbl_rcomm; lbl_13: state->fp1 = state->f; if( ae_fp_neq(state->xm1,state->xp1) ) { state->g.ptr.p_double[i] = 
(state->fp1-state->fm1)/(state->xp1-state->xm1); } else { state->g.ptr.p_double[i] = (double)(0); } lbl_49: state->x.ptr.p_double[i] = v; i = i+1; goto lbl_45; lbl_47: state->f = state->fbase; state->needf = ae_false; lbl_44: state->fc = state->f; ae_v_move(&state->ugc.ptr.p_double[0], 1, &state->g.ptr.p_double[0], 1, ae_v_len(0,n-1)); ae_v_move(&state->cgc.ptr.p_double[0], 1, &state->g.ptr.p_double[0], 1, ae_v_len(0,n-1)); projectgradientintobc(&state->xc, &state->cgc, &state->bndl, &state->hasbndl, &state->bndu, &state->hasbndu, n, 0, _state); ginit = 0.0; for(i=0; i<=n-1; i++) { ginit = ginit+ae_sqr(state->cgc.ptr.p_double[i]*state->s.ptr.p_double[i], _state); } ginit = ae_sqrt(ginit, _state); if( !ae_isfinite(ginit, _state)||!ae_isfinite(state->fc, _state) ) { /* * Abnormal termination - infinities in function/gradient */ state->repterminationtype = -8; result = ae_false; return result; } if( state->userterminationneeded ) { /* * User requested termination */ state->repterminationtype = 8; result = ae_false; return result; } if( ae_fp_less_eq(ginit,state->epsg) ) { /* * Gradient is small enough. * Optimization is terminated */ state->repterminationtype = 4; result = ae_false; return result; } for(i=0; i<=n-1; i++) { state->d.ptr.p_double[i] = -state->tmpprec.ptr.p_double[i]*state->cgc.ptr.p_double[i]; } scaleddnorm = (double)(0); for(i=0; i<=n-1; i++) { scaleddnorm = scaleddnorm+ae_sqr(state->d.ptr.p_double[i]/state->s.ptr.p_double[i], _state); } scaleddnorm = ae_sqrt(scaleddnorm, _state); ae_assert(ae_fp_greater(scaleddnorm,(double)(0)), "MinBC: integrity check failed", _state); if( ae_fp_greater(state->lastscaledgoodstep,(double)(0)) ) { state->stp = state->lastscaledgoodstep/scaleddnorm; } else { state->stp = 1.0/scaleddnorm; } calculatestepbound(&state->xc, &state->d, 1.0, &state->bndl, &state->hasbndl, &state->bndu, &state->hasbndu, n, 0, &freezeidx, &freezeval, &state->curstpmax, _state); activationstep = state->curstpmax; if( freezeidx<0||ae_fp_greater(state->curstpmax,1.0E50) ) { state->curstpmax = 1.0E50; } if( ae_fp_greater(state->stpmax,(double)(0)) ) { state->curstpmax = ae_minreal(state->curstpmax, state->stpmax/scaleddnorm, _state); } ae_v_move(&state->xn.ptr.p_double[0], 1, &state->xc.ptr.p_double[0], 1, ae_v_len(0,n-1)); ae_v_move(&state->cgn.ptr.p_double[0], 1, &state->cgc.ptr.p_double[0], 1, ae_v_len(0,n-1)); ae_v_move(&state->ugn.ptr.p_double[0], 1, &state->ugc.ptr.p_double[0], 1, ae_v_len(0,n-1)); state->fn = state->fc; state->mcstage = 0; mcsrch(n, &state->xn, &state->fn, &state->cgn, &state->d, &state->stp, state->curstpmax, minbc_gtol, &mcinfo, &state->nfev, &state->work, &state->lstate, &state->mcstage, _state); lbl_50: if( state->mcstage==0 ) { goto lbl_51; } /* * Copy XN to X, perform on-the-fly correction w.r.t box * constraints (projection onto feasible set). 
*/ for(i=0; i<=n-1; i++) { state->x.ptr.p_double[i] = state->xn.ptr.p_double[i]; if( state->hasbndl.ptr.p_bool[i]&&ae_fp_less(state->xn.ptr.p_double[i],state->bndl.ptr.p_double[i]) ) { state->x.ptr.p_double[i] = state->bndl.ptr.p_double[i]; } if( state->hasbndu.ptr.p_bool[i]&&ae_fp_greater(state->xn.ptr.p_double[i],state->bndu.ptr.p_double[i]) ) { state->x.ptr.p_double[i] = state->bndu.ptr.p_double[i]; } } /* * Gradient, either user-provided or numerical differentiation */ minbc_clearrequestfields(state, _state); if( ae_fp_neq(state->diffstep,(double)(0)) ) { goto lbl_52; } /* * Analytic gradient */ state->needfg = ae_true; state->rstate.stage = 14; goto lbl_rcomm; lbl_14: state->needfg = ae_false; state->repnfev = state->repnfev+1; goto lbl_53; lbl_52: /* * Numerical differentiation */ state->needf = ae_true; state->rstate.stage = 15; goto lbl_rcomm; lbl_15: state->fbase = state->f; i = 0; lbl_54: if( i>n-1 ) { goto lbl_56; } v = state->x.ptr.p_double[i]; b = ae_false; if( state->hasbndl.ptr.p_bool[i] ) { b = b||ae_fp_less(v-state->diffstep*state->s.ptr.p_double[i],state->bndl.ptr.p_double[i]); } if( state->hasbndu.ptr.p_bool[i] ) { b = b||ae_fp_greater(v+state->diffstep*state->s.ptr.p_double[i],state->bndu.ptr.p_double[i]); } if( b ) { goto lbl_57; } state->x.ptr.p_double[i] = v-state->diffstep*state->s.ptr.p_double[i]; state->rstate.stage = 16; goto lbl_rcomm; lbl_16: state->fm2 = state->f; state->x.ptr.p_double[i] = v-0.5*state->diffstep*state->s.ptr.p_double[i]; state->rstate.stage = 17; goto lbl_rcomm; lbl_17: state->fm1 = state->f; state->x.ptr.p_double[i] = v+0.5*state->diffstep*state->s.ptr.p_double[i]; state->rstate.stage = 18; goto lbl_rcomm; lbl_18: state->fp1 = state->f; state->x.ptr.p_double[i] = v+state->diffstep*state->s.ptr.p_double[i]; state->rstate.stage = 19; goto lbl_rcomm; lbl_19: state->fp2 = state->f; state->g.ptr.p_double[i] = (8*(state->fp1-state->fm1)-(state->fp2-state->fm2))/(6*state->diffstep*state->s.ptr.p_double[i]); state->repnfev = state->repnfev+4; goto lbl_58; lbl_57: state->xm1 = v-state->diffstep*state->s.ptr.p_double[i]; state->xp1 = v+state->diffstep*state->s.ptr.p_double[i]; if( state->hasbndl.ptr.p_bool[i]&&ae_fp_less(state->xm1,state->bndl.ptr.p_double[i]) ) { state->xm1 = state->bndl.ptr.p_double[i]; } if( state->hasbndu.ptr.p_bool[i]&&ae_fp_greater(state->xp1,state->bndu.ptr.p_double[i]) ) { state->xp1 = state->bndu.ptr.p_double[i]; } state->x.ptr.p_double[i] = state->xm1; state->rstate.stage = 20; goto lbl_rcomm; lbl_20: state->fm1 = state->f; state->x.ptr.p_double[i] = state->xp1; state->rstate.stage = 21; goto lbl_rcomm; lbl_21: state->fp1 = state->f; if( ae_fp_neq(state->xm1,state->xp1) ) { state->g.ptr.p_double[i] = (state->fp1-state->fm1)/(state->xp1-state->xm1); } else { state->g.ptr.p_double[i] = (double)(0); } state->repnfev = state->repnfev+2; lbl_58: state->x.ptr.p_double[i] = v; i = i+1; goto lbl_54; lbl_56: state->f = state->fbase; state->needf = ae_false; lbl_53: /* * Back to MCSRCH */ trimfunction(&state->f, &state->g, n, state->trimthreshold, _state); state->fn = state->f; ae_v_move(&state->cgn.ptr.p_double[0], 1, &state->g.ptr.p_double[0], 1, ae_v_len(0,n-1)); ae_v_move(&state->ugn.ptr.p_double[0], 1, &state->g.ptr.p_double[0], 1, ae_v_len(0,n-1)); for(i=0; i<=n-1; i++) { if( ae_fp_eq(state->d.ptr.p_double[i],(double)(0)) ) { state->cgn.ptr.p_double[i] = (double)(0); } } mcsrch(n, &state->xn, &state->fn, &state->cgn, &state->d, &state->stp, state->curstpmax, minbc_gtol, &mcinfo, &state->nfev, &state->work, &state->lstate, 
&state->mcstage, _state); goto lbl_50; lbl_51: v = state->fn; for(i=0; i<=n-1; i++) { v = 0.1*v+state->ugn.ptr.p_double[i]; } if( !ae_isfinite(v, _state) ) { /* * Abnormal termination - infinities in function/gradient */ state->repterminationtype = -8; result = ae_false; return result; } if( mcinfo!=1&&mcinfo!=5 ) { /* * We can not find step which decreases function value. We have * two possibilities: * (a) numerical properties of the function do not allow us to * find good step. * (b) we are close to activation of some constraint, and it is * so close that step which activates it leads to change in * target function which is smaller than numerical noise. * * Optimization algorithm must be able to handle case (b), because * inability to handle it will cause failure when algorithm * started very close to boundary of the feasible area. * * In order to correctly handle such cases we allow limited amount * of small steps which increase function value. */ if( (freezeidx>=0&&ae_fp_less_eq(scaleddnorm*state->curstpmax,minbc_maxnonmonotoniclen))&&state->nonmonotoniccnt>0 ) { /* * We enforce non-monotonic step: * * Stp := CurStpMax * * MCINFO := 5 * * XN := XC+CurStpMax*D * * non-monotonic counter is decreased * * NOTE: UGN/CGN are not updated because step is so short that we assume that * GN is approximately equal to GC. */ state->stp = state->curstpmax; mcinfo = 5; v = state->curstpmax; ae_v_move(&state->xn.ptr.p_double[0], 1, &state->xc.ptr.p_double[0], 1, ae_v_len(0,n-1)); ae_v_addd(&state->xn.ptr.p_double[0], 1, &state->d.ptr.p_double[0], 1, ae_v_len(0,n-1), v); state->nonmonotoniccnt = state->nonmonotoniccnt-1; } else { /* * Numerical properties of the function does not allow * us to solve problem. Algorithm is terminated */ state->repterminationtype = 7; result = ae_false; return result; } } if( state->userterminationneeded ) { /* * User requested termination */ state->repterminationtype = 8; result = ae_false; return result; } ae_assert(mcinfo!=5||ae_fp_eq(state->stp,state->curstpmax), "MinBC: integrity check failed", _state); postprocessboundedstep(&state->xn, &state->xc, &state->bndl, &state->hasbndl, &state->bndu, &state->hasbndu, n, 0, freezeidx, freezeval, state->stp, activationstep, _state); state->fp = state->fc; state->fc = state->fn; ae_v_move(&state->xp.ptr.p_double[0], 1, &state->xc.ptr.p_double[0], 1, ae_v_len(0,n-1)); ae_v_move(&state->xc.ptr.p_double[0], 1, &state->xn.ptr.p_double[0], 1, ae_v_len(0,n-1)); ae_v_move(&state->cgc.ptr.p_double[0], 1, &state->cgn.ptr.p_double[0], 1, ae_v_len(0,n-1)); ae_v_move(&state->ugc.ptr.p_double[0], 1, &state->ugn.ptr.p_double[0], 1, ae_v_len(0,n-1)); if( !state->xrep ) { goto lbl_59; } ae_v_move(&state->x.ptr.p_double[0], 1, &state->xc.ptr.p_double[0], 1, ae_v_len(0,n-1)); minbc_clearrequestfields(state, _state); state->xupdated = ae_true; state->rstate.stage = 22; goto lbl_rcomm; lbl_22: state->xupdated = ae_false; lbl_59: state->repiterationscount = state->repiterationscount+1; if( mcinfo==1 ) { v = (double)(0); for(i=0; i<=n-1; i++) { v = v+ae_sqr((state->xc.ptr.p_double[i]-state->xp.ptr.p_double[i])/state->s.ptr.p_double[i], _state); } v = ae_sqrt(v, _state); if( ae_fp_less_eq(v,state->epsx) ) { /* * Step is small enough */ state->repterminationtype = 2; result = ae_false; return result; } if( ae_fp_less_eq(ae_fabs(state->fp-state->fc, _state),state->epsf*ae_maxreal(ae_fabs(state->fc, _state), ae_maxreal(ae_fabs(state->fp, _state), 1.0, _state), _state)) ) { /* * Function change is small enough */ state->repterminationtype = 1; result = 
ae_false; return result; } } if( state->maxits>0&&state->repiterationscount>=state->maxits ) { /* * Iteration counter exceeded limit */ state->repterminationtype = 5; result = ae_false; return result; } /* * LBFGS stage: * * during LBFGS iterations we activate new constraints, but never * deactivate already active ones. * * we perform at most N iterations of LBFGS before re-evaluating * active set and restarting LBFGS. * * About termination: * * LBFGS iterations can be terminated because of two reasons: * * "termination" - non-zero termination code in RepTerminationType, * which means that optimization is done * * "restart" - zero RepTerminationType, which means that we * have to re-evaluate active set and resume LBFGS stage. * * one more option is "refresh" - to continue LBFGS iterations, * but with all BFGS updates (Sk/Yk pairs) being dropped; * it happens after changes in active set */ ginit = 0.0; for(i=0; i<=n-1; i++) { state->cgc.ptr.p_double[i] = state->ugc.ptr.p_double[i]; if( state->hasbndl.ptr.p_bool[i]&&ae_fp_eq(state->xc.ptr.p_double[i],state->bndl.ptr.p_double[i]) ) { state->cgc.ptr.p_double[i] = (double)(0); } if( state->hasbndu.ptr.p_bool[i]&&ae_fp_eq(state->xc.ptr.p_double[i],state->bndu.ptr.p_double[i]) ) { state->cgc.ptr.p_double[i] = (double)(0); } ginit = ginit+ae_sqr(state->cgc.ptr.p_double[i]*state->s.ptr.p_double[i], _state); } ginit = ae_sqrt(ginit, _state); state->bufsize = 0; itidx = 0; lbl_61: if( itidx>n-1 ) { goto lbl_63; } /* * At the beginning of each iteration: * * XC stores current point * * FC stores current function value * * UGC stores current unconstrained gradient * * CGC stores current constrained gradient * * D stores constrained step direction (calculated at this block) * * 1. Calculate search direction D according to L-BFGS algorithm * using constrained preconditioner to perform inner multiplication. * 2. Evaluate scaled length of direction D; restart LBFGS if D is zero * (it may be possible that we found minimum, but it is also possible * that some constraints need deactivation) * 3. If D is non-zero, try to use previous scaled step length as initial estimate for new step. * 4. Calculate bound on step length. 
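 (For step 1, the code below uses the standard two-loop L-BFGS recursion over the stored Sk/Yk pairs kept in BufSK/BufYK: a backward pass over the pairs, application of the diagonal preconditioner TmpPrec, a forward pass, and finally negation of the result to obtain the descent direction D.)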
*/ ae_v_move(&state->work.ptr.p_double[0], 1, &state->cgc.ptr.p_double[0], 1, ae_v_len(0,n-1)); for(i=state->bufsize-1; i>=0; i--) { v = ae_v_dotproduct(&state->bufsk.ptr.pp_double[i][0], 1, &state->work.ptr.p_double[0], 1, ae_v_len(0,n-1)); state->buftheta.ptr.p_double[i] = v; vv = v*state->bufrho.ptr.p_double[i]; ae_v_subd(&state->work.ptr.p_double[0], 1, &state->bufyk.ptr.pp_double[i][0], 1, ae_v_len(0,n-1), vv); } for(i=0; i<=n-1; i++) { state->work.ptr.p_double[i] = state->tmpprec.ptr.p_double[i]*state->work.ptr.p_double[i]; } for(i=0; i<=state->bufsize-1; i++) { v = ae_v_dotproduct(&state->bufyk.ptr.pp_double[i][0], 1, &state->work.ptr.p_double[0], 1, ae_v_len(0,n-1)); vv = state->bufrho.ptr.p_double[i]*(-v+state->buftheta.ptr.p_double[i]); ae_v_addd(&state->work.ptr.p_double[0], 1, &state->bufsk.ptr.pp_double[i][0], 1, ae_v_len(0,n-1), vv); } ae_v_moveneg(&state->d.ptr.p_double[0], 1, &state->work.ptr.p_double[0], 1, ae_v_len(0,n-1)); b = ae_false; for(i=0; i<=n-1; i++) { b = b||((state->hasbndl.ptr.p_bool[i]&&ae_fp_eq(state->xc.ptr.p_double[i],state->bndl.ptr.p_double[i]))&&ae_fp_neq(state->d.ptr.p_double[i],(double)(0))); b = b||((state->hasbndu.ptr.p_bool[i]&&ae_fp_eq(state->xc.ptr.p_double[i],state->bndu.ptr.p_double[i]))&&ae_fp_neq(state->d.ptr.p_double[i],(double)(0))); } ae_assert(!b, "MinBC: integrity check failed (q)", _state); scaleddnorm = (double)(0); for(i=0; i<=n-1; i++) { scaleddnorm = scaleddnorm+ae_sqr(state->d.ptr.p_double[i]/state->s.ptr.p_double[i], _state); } scaleddnorm = ae_sqrt(scaleddnorm, _state); if( ae_fp_eq(scaleddnorm,(double)(0)) ) { /* * Search direction is zero. * Skip back to steepest descent phase. */ goto lbl_63; } if( ae_fp_greater(state->lastscaledgoodstep,(double)(0)) ) { state->stp = state->lastscaledgoodstep/scaleddnorm; } else { state->stp = 1.0/scaleddnorm; } state->curstpmax = 1.0E50; if( ae_fp_greater(state->stpmax,(double)(0)) ) { state->curstpmax = ae_minreal(state->curstpmax, state->stpmax/scaleddnorm, _state); } /* * Minimize G(t) = F(CONSTRAIN(XC + t*D)), with t being scalar, XC and D being vectors. */ ae_v_move(&state->xn.ptr.p_double[0], 1, &state->xc.ptr.p_double[0], 1, ae_v_len(0,n-1)); ae_v_move(&state->cgn.ptr.p_double[0], 1, &state->cgc.ptr.p_double[0], 1, ae_v_len(0,n-1)); ae_v_move(&state->ugn.ptr.p_double[0], 1, &state->ugc.ptr.p_double[0], 1, ae_v_len(0,n-1)); state->fn = state->fc; state->mcstage = 0; mcsrch(n, &state->xn, &state->fn, &state->cgn, &state->d, &state->stp, state->curstpmax, minbc_gtol, &mcinfo, &state->nfev, &state->work, &state->lstate, &state->mcstage, _state); lbl_64: if( state->mcstage==0 ) { goto lbl_65; } /* * Copy XN to X, perform on-the-fly correction w.r.t box * constraints (projection onto feasible set). 
*/ for(i=0; i<=n-1; i++) { state->x.ptr.p_double[i] = state->xn.ptr.p_double[i]; if( state->hasbndl.ptr.p_bool[i]&&ae_fp_less_eq(state->xn.ptr.p_double[i],state->bndl.ptr.p_double[i]) ) { state->x.ptr.p_double[i] = state->bndl.ptr.p_double[i]; } if( state->hasbndu.ptr.p_bool[i]&&ae_fp_greater_eq(state->xn.ptr.p_double[i],state->bndu.ptr.p_double[i]) ) { state->x.ptr.p_double[i] = state->bndu.ptr.p_double[i]; } } /* * Gradient, either user-provided or numerical differentiation */ minbc_clearrequestfields(state, _state); if( ae_fp_neq(state->diffstep,(double)(0)) ) { goto lbl_66; } /* * Analytic gradient */ state->needfg = ae_true; state->rstate.stage = 23; goto lbl_rcomm; lbl_23: state->needfg = ae_false; state->repnfev = state->repnfev+1; goto lbl_67; lbl_66: /* * Numerical differentiation */ state->needf = ae_true; state->rstate.stage = 24; goto lbl_rcomm; lbl_24: state->fbase = state->f; i = 0; lbl_68: if( i>n-1 ) { goto lbl_70; } v = state->x.ptr.p_double[i]; b = ae_false; if( state->hasbndl.ptr.p_bool[i] ) { b = b||ae_fp_less(v-state->diffstep*state->s.ptr.p_double[i],state->bndl.ptr.p_double[i]); } if( state->hasbndu.ptr.p_bool[i] ) { b = b||ae_fp_greater(v+state->diffstep*state->s.ptr.p_double[i],state->bndu.ptr.p_double[i]); } if( b ) { goto lbl_71; } state->x.ptr.p_double[i] = v-state->diffstep*state->s.ptr.p_double[i]; state->rstate.stage = 25; goto lbl_rcomm; lbl_25: state->fm2 = state->f; state->x.ptr.p_double[i] = v-0.5*state->diffstep*state->s.ptr.p_double[i]; state->rstate.stage = 26; goto lbl_rcomm; lbl_26: state->fm1 = state->f; state->x.ptr.p_double[i] = v+0.5*state->diffstep*state->s.ptr.p_double[i]; state->rstate.stage = 27; goto lbl_rcomm; lbl_27: state->fp1 = state->f; state->x.ptr.p_double[i] = v+state->diffstep*state->s.ptr.p_double[i]; state->rstate.stage = 28; goto lbl_rcomm; lbl_28: state->fp2 = state->f; state->g.ptr.p_double[i] = (8*(state->fp1-state->fm1)-(state->fp2-state->fm2))/(6*state->diffstep*state->s.ptr.p_double[i]); state->repnfev = state->repnfev+4; goto lbl_72; lbl_71: state->xm1 = v-state->diffstep*state->s.ptr.p_double[i]; state->xp1 = v+state->diffstep*state->s.ptr.p_double[i]; if( state->hasbndl.ptr.p_bool[i]&&ae_fp_less(state->xm1,state->bndl.ptr.p_double[i]) ) { state->xm1 = state->bndl.ptr.p_double[i]; } if( state->hasbndu.ptr.p_bool[i]&&ae_fp_greater(state->xp1,state->bndu.ptr.p_double[i]) ) { state->xp1 = state->bndu.ptr.p_double[i]; } state->x.ptr.p_double[i] = state->xm1; state->rstate.stage = 29; goto lbl_rcomm; lbl_29: state->fm1 = state->f; state->x.ptr.p_double[i] = state->xp1; state->rstate.stage = 30; goto lbl_rcomm; lbl_30: state->fp1 = state->f; if( ae_fp_neq(state->xm1,state->xp1) ) { state->g.ptr.p_double[i] = (state->fp1-state->fm1)/(state->xp1-state->xm1); } else { state->g.ptr.p_double[i] = (double)(0); } state->repnfev = state->repnfev+2; lbl_72: state->x.ptr.p_double[i] = v; i = i+1; goto lbl_68; lbl_70: state->f = state->fbase; state->needf = ae_false; lbl_67: /* * Back to MCSRCH */ trimfunction(&state->f, &state->g, n, state->trimthreshold, _state); state->fn = state->f; for(i=0; i<=n-1; i++) { state->ugn.ptr.p_double[i] = state->g.ptr.p_double[i]; state->cgn.ptr.p_double[i] = state->g.ptr.p_double[i]; if( state->hasbndl.ptr.p_bool[i]&&ae_fp_less_eq(state->xn.ptr.p_double[i],state->bndl.ptr.p_double[i]) ) { state->cgn.ptr.p_double[i] = (double)(0); } if( state->hasbndu.ptr.p_bool[i]&&ae_fp_greater_eq(state->xn.ptr.p_double[i],state->bndu.ptr.p_double[i]) ) { state->cgn.ptr.p_double[i] = (double)(0); } } mcsrch(n, 
&state->xn, &state->fn, &state->cgn, &state->d, &state->stp, state->curstpmax, minbc_gtol, &mcinfo, &state->nfev, &state->work, &state->lstate, &state->mcstage, _state); goto lbl_64; lbl_65: for(i=0; i<=n-1; i++) { if( state->hasbndl.ptr.p_bool[i]&&ae_fp_less_eq(state->xn.ptr.p_double[i],state->bndl.ptr.p_double[i]) ) { state->xn.ptr.p_double[i] = state->bndl.ptr.p_double[i]; } if( state->hasbndu.ptr.p_bool[i]&&ae_fp_greater_eq(state->xn.ptr.p_double[i],state->bndu.ptr.p_double[i]) ) { state->xn.ptr.p_double[i] = state->bndu.ptr.p_double[i]; } } ae_v_moveneg(&state->bufsk.ptr.pp_double[state->bufsize][0], 1, &state->xc.ptr.p_double[0], 1, ae_v_len(0,n-1)); ae_v_moveneg(&state->bufyk.ptr.pp_double[state->bufsize][0], 1, &state->cgc.ptr.p_double[0], 1, ae_v_len(0,n-1)); ae_v_add(&state->bufsk.ptr.pp_double[state->bufsize][0], 1, &state->xn.ptr.p_double[0], 1, ae_v_len(0,n-1)); ae_v_add(&state->bufyk.ptr.pp_double[state->bufsize][0], 1, &state->cgn.ptr.p_double[0], 1, ae_v_len(0,n-1)); /* * Handle special situations: * * check for presence of NAN/INF in function/gradient * * handle failure of line search */ v = state->fn; for(i=0; i<=n-1; i++) { v = 0.1*v+state->ugn.ptr.p_double[i]; } if( !ae_isfinite(v, _state) ) { /* * Abnormal termination - infinities in function/gradient */ state->repterminationtype = -8; result = ae_false; return result; } if( state->userterminationneeded ) { /* * User requested termination */ state->repterminationtype = 8; result = ae_false; return result; } if( mcinfo!=1 ) { /* * Terminate LBFGS phase */ goto lbl_63; } /* * Current point is updated: * * move XC/FC/GC to XP/FP/GP * * move XN/FN/GN to XC/FC/GC * * report current point and update iterations counter * * push new pair SK/YK to LBFGS buffer * * update length of the good step */ activationstatus = ae_false; for(i=0; i<=n-1; i++) { if( (state->hasbndl.ptr.p_bool[i]&&ae_fp_eq(state->xn.ptr.p_double[i],state->bndl.ptr.p_double[i]))&&ae_fp_neq(state->xn.ptr.p_double[i],state->xc.ptr.p_double[i]) ) { activationstatus = ae_true; } if( (state->hasbndu.ptr.p_bool[i]&&ae_fp_eq(state->xn.ptr.p_double[i],state->bndu.ptr.p_double[i]))&&ae_fp_neq(state->xn.ptr.p_double[i],state->xc.ptr.p_double[i]) ) { activationstatus = ae_true; } } state->fp = state->fc; state->fc = state->fn; ae_v_move(&state->xp.ptr.p_double[0], 1, &state->xc.ptr.p_double[0], 1, ae_v_len(0,n-1)); ae_v_move(&state->xc.ptr.p_double[0], 1, &state->xn.ptr.p_double[0], 1, ae_v_len(0,n-1)); ae_v_move(&state->cgc.ptr.p_double[0], 1, &state->cgn.ptr.p_double[0], 1, ae_v_len(0,n-1)); ae_v_move(&state->ugc.ptr.p_double[0], 1, &state->ugn.ptr.p_double[0], 1, ae_v_len(0,n-1)); if( !state->xrep ) { goto lbl_73; } ae_v_move(&state->x.ptr.p_double[0], 1, &state->xc.ptr.p_double[0], 1, ae_v_len(0,n-1)); minbc_clearrequestfields(state, _state); state->xupdated = ae_true; state->rstate.stage = 31; goto lbl_rcomm; lbl_31: state->xupdated = ae_false; lbl_73: state->repiterationscount = state->repiterationscount+1; if( state->bufsize==m ) { /* * Buffer is full, shift contents by one row */ for(i=0; i<=state->bufsize-1; i++) { ae_v_move(&state->bufsk.ptr.pp_double[i][0], 1, &state->bufsk.ptr.pp_double[i+1][0], 1, ae_v_len(0,n-1)); ae_v_move(&state->bufyk.ptr.pp_double[i][0], 1, &state->bufyk.ptr.pp_double[i+1][0], 1, ae_v_len(0,n-1)); } for(i=0; i<=state->bufsize-2; i++) { state->bufrho.ptr.p_double[i] = state->bufrho.ptr.p_double[i+1]; state->buftheta.ptr.p_double[i] = state->buftheta.ptr.p_double[i+1]; } } else { /* * Buffer is not full, increase buffer size by 1 */ 
state->bufsize = state->bufsize+1; } v = ae_v_dotproduct(&state->bufyk.ptr.pp_double[state->bufsize-1][0], 1, &state->bufsk.ptr.pp_double[state->bufsize-1][0], 1, ae_v_len(0,n-1)); vv = ae_v_dotproduct(&state->bufyk.ptr.pp_double[state->bufsize-1][0], 1, &state->bufyk.ptr.pp_double[state->bufsize-1][0], 1, ae_v_len(0,n-1)); if( ae_fp_eq(v,(double)(0))||ae_fp_eq(vv,(double)(0)) ) { /* * Strange internal error in LBFGS - either YK=0 * (which should not have been) or (SK,YK)=0 (again, * unexpected). It should not take place because * MCINFO=1, which signals "good" step. But just * to be sure we have special branch of code which * restarts LBFGS */ goto lbl_63; } state->bufrho.ptr.p_double[state->bufsize-1] = 1/v; ae_assert(state->bufsize<=m, "MinBC: internal error", _state); v = (double)(0); vv = (double)(0); for(i=0; i<=n-1; i++) { v = v+ae_sqr((state->xc.ptr.p_double[i]-state->xp.ptr.p_double[i])/state->s.ptr.p_double[i], _state); vv = vv+ae_sqr(state->xc.ptr.p_double[i]-state->xp.ptr.p_double[i], _state); } minbc_updateestimateofgoodstep(&state->lastscaledgoodstep, ae_sqrt(v, _state), _state); /* * Check MaxIts-based stopping condition. */ if( state->maxits>0&&state->repiterationscount>=state->maxits ) { state->repterminationtype = 5; result = ae_false; return result; } /* * Smooth reset (LBFGS memory model is refreshed) or hard restart: * * LBFGS model is refreshed, if line search was performed with activation of constraints * * algorithm is restarted if scaled gradient decreased below GDecay */ if( activationstatus ) { state->bufsize = 0; goto lbl_62; } v = 0.0; for(i=0; i<=n-1; i++) { v = v+ae_sqr(state->cgc.ptr.p_double[i]*state->s.ptr.p_double[i], _state); } if( ae_fp_less(ae_sqrt(v, _state),gdecay*ginit) ) { goto lbl_63; } lbl_62: itidx = itidx+1; goto lbl_61; lbl_63: /* * Decrease decay coefficient. Subsequent L-BFGS stages will * have more stringent stopping criteria. */ gdecay = ae_maxreal(gdecay*minbc_decaycorrection, minbc_mindecay, _state); goto lbl_41; lbl_42: result = ae_false; return result; /* * Saving state */ lbl_rcomm: result = ae_true; state->rstate.ia.ptr.p_int[0] = freezeidx; state->rstate.ia.ptr.p_int[1] = n; state->rstate.ia.ptr.p_int[2] = m; state->rstate.ia.ptr.p_int[3] = i; state->rstate.ia.ptr.p_int[4] = j; state->rstate.ia.ptr.p_int[5] = mcinfo; state->rstate.ia.ptr.p_int[6] = itidx; state->rstate.ba.ptr.p_bool[0] = b; state->rstate.ba.ptr.p_bool[1] = activationstatus; state->rstate.ra.ptr.p_double[0] = freezeval; state->rstate.ra.ptr.p_double[1] = scaleddnorm; state->rstate.ra.ptr.p_double[2] = v; state->rstate.ra.ptr.p_double[3] = vv; state->rstate.ra.ptr.p_double[4] = v0; state->rstate.ra.ptr.p_double[5] = ginit; state->rstate.ra.ptr.p_double[6] = gdecay; state->rstate.ra.ptr.p_double[7] = activationstep; return result; } /************************************************************************* BC results INPUT PARAMETERS: State - algorithm state OUTPUT PARAMETERS: X - array[0..N-1], solution Rep - optimization report. You should check Rep.TerminationType in order to distinguish successful termination from unsuccessful one: * -8 internal integrity control detected infinite or NAN values in function/gradient. Abnormal termination signalled. * -7 gradient verification failed. See MinBCSetGradientCheck() for more information. * -3 inconsistent constraints. * 1 relative function improvement is no more than EpsF. * 2 scaled step is no more than EpsX. * 4 scaled gradient norm is no more than EpsG. 
* 5 MaxIts steps were taken * 8 terminated by user who called minbcrequesttermination(). X contains point which was "current accepted" when termination request was submitted. More information about fields of this structure can be found in the comments on MinBCReport datatype. -- ALGLIB -- Copyright 28.11.2010 by Bochkanov Sergey *************************************************************************/ void minbcresults(minbcstate* state, /* Real */ ae_vector* x, minbcreport* rep, ae_state *_state) { ae_vector_clear(x); _minbcreport_clear(rep); minbcresultsbuf(state, x, rep, _state); } /************************************************************************* BC results Buffered implementation of MinBCResults() which uses pre-allocated buffer to store X[]. If buffer size is too small, it resizes buffer. It is intended to be used in the inner cycles of performance critical algorithms where array reallocation penalty is too large to be ignored. -- ALGLIB -- Copyright 28.11.2010 by Bochkanov Sergey *************************************************************************/ void minbcresultsbuf(minbcstate* state, /* Real */ ae_vector* x, minbcreport* rep, ae_state *_state) { ae_int_t i; if( x->cnt<state->nmain ) { ae_vector_set_length(x, state->nmain, _state); } rep->iterationscount = state->repiterationscount; rep->nfev = state->repnfev; rep->varidx = state->repvaridx; rep->terminationtype = state->repterminationtype; if( state->repterminationtype>0 ) { ae_v_move(&x->ptr.p_double[0], 1, &state->xc.ptr.p_double[0], 1, ae_v_len(0,state->nmain-1)); } else { for(i=0; i<=state->nmain-1; i++) { x->ptr.p_double[i] = _state->v_nan; } } } /************************************************************************* This subroutine restarts algorithm from new point. All optimization parameters (including constraints) are left unchanged. This function allows to solve multiple optimization problems (which must have same number of dimensions) without object reallocation penalty. INPUT PARAMETERS: State - structure previously allocated with MinBCCreate call. X - new starting point. -- ALGLIB -- Copyright 28.11.2010 by Bochkanov Sergey *************************************************************************/ void minbcrestartfrom(minbcstate* state, /* Real */ ae_vector* x, ae_state *_state) { ae_int_t n; n = state->nmain; /* * First, check for errors in the inputs */ ae_assert(x->cnt>=n, "MinBCRestartFrom: Length(X)<N", _state); ae_v_move(&state->xstart.ptr.p_double[0], 1, &x->ptr.p_double[0], 1, ae_v_len(0,n-1)); /* * prepare RComm facilities */ ae_vector_set_length(&state->rstate.ia, 6+1, _state); ae_vector_set_length(&state->rstate.ba, 1+1, _state); ae_vector_set_length(&state->rstate.ra, 7+1, _state); state->rstate.stage = -1; minbc_clearrequestfields(state, _state); } /************************************************************************* This subroutine submits request for termination of running optimizer. It should be called from user-supplied callback when user decides that it is time to "smoothly" terminate optimization process. As result, optimizer stops at point which was "current accepted" when termination request was submitted and returns error code 8 (successful termination). INPUT PARAMETERS: State - optimizer structure NOTE: after request for termination optimizer may perform several additional calls to user-supplied callbacks. It does NOT guarantee to stop immediately - it just guarantees that these additional calls will be discarded later. NOTE: calling this function on optimizer which is NOT running will have no effect. 
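EXAMPLE (an illustrative sketch only, not part of the library sources; it assumes the usual reverse-communication loop driven by minbciteration() from the C core, and my_deadline_reached()/my_func_grad() are hypothetical user routines):

    while( minbciteration(&state, _state) )
    {
        if( state.needfg )
        {
            my_func_grad(&state.x, &state.f, &state.g);   // user code fills F and G for point State.X
            if( my_deadline_reached() )
                minbcrequesttermination(&state, _state);  // optimizer will stop with code 8
        }
    }
    minbcresults(&state, &x, &rep, _state);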
NOTE: multiple calls to this function are possible. First call is counted, subsequent calls are silently ignored. -- ALGLIB -- Copyright 08.10.2014 by Bochkanov Sergey *************************************************************************/ void minbcrequesttermination(minbcstate* state, ae_state *_state) { state->userterminationneeded = ae_true; } /************************************************************************* This subroutine turns on verification of the user-supplied analytic gradient: * user calls this subroutine before optimization begins * MinBCOptimize() is called * prior to actual optimization, for each component of parameters being optimized X[i] algorithm performs following steps: * two trial steps are made to X[i]-TestStep*S[i] and X[i]+TestStep*S[i], where X[i] is i-th component of the initial point and S[i] is a scale of i-th parameter * if needed, steps are bounded with respect to constraints on X[] * F(X) is evaluated at these trial points * we perform one more evaluation in the middle point of the interval * we build cubic model using function values and derivatives at trial points and we compare its prediction with actual value in the middle point * in case difference between prediction and actual value is higher than some predetermined threshold, algorithm stops with completion code -7; Rep.VarIdx is set to index of the parameter with incorrect derivative. * after verification is over, algorithm proceeds to the actual optimization. NOTE 1: verification needs N (parameters count) gradient evaluations. It is very costly and you should use it only for low dimensional problems, when you want to be sure that you've correctly calculated analytic derivatives. You should not use it in the production code (unless you want to check derivatives provided by some third party). NOTE 2: you should carefully choose TestStep. Value which is too large (so large that function behaviour is significantly non-cubic) will lead to false alarms. You may use different step for different parameters by means of setting scale with MinBCSetScale(). NOTE 3: this function may lead to false positives. In case it reports that I-th derivative was calculated incorrectly, you may decrease test step and try one more time - maybe your function changes too sharply and your step is too large for such a rapidly changing function. 
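As a short illustrative sketch (not part of the library sources; names of user variables are placeholders), the check is typically enabled once before optimization and the report is inspected afterwards:

    minbcsetgradientcheck(&state, 1.0E-3, _state);   // TestStep=1.0E-3, scaled by S[i] internally
    ...run optimization as usual...
    minbcresults(&state, &x, &rep, _state);
    if( rep.terminationtype==-7 )
        printf("suspicious derivative for variable %d\n", (int)rep.varidx);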
INPUT PARAMETERS: State - structure used to store algorithm state TestStep - verification step: * TestStep=0 turns verification off * TestStep>0 activates verification -- ALGLIB -- Copyright 15.06.2012 by Bochkanov Sergey *************************************************************************/ void minbcsetgradientcheck(minbcstate* state, double teststep, ae_state *_state) { ae_assert(ae_isfinite(teststep, _state), "MinBCSetGradientCheck: TestStep contains NaN or Infinite", _state); ae_assert(ae_fp_greater_eq(teststep,(double)(0)), "MinBCSetGradientCheck: invalid argument TestStep(TestStep<0)", _state); state->teststep = teststep; } /************************************************************************* Clears request fileds (to be sure that we don't forget to clear something) *************************************************************************/ static void minbc_clearrequestfields(minbcstate* state, ae_state *_state) { state->needf = ae_false; state->needfg = ae_false; state->xupdated = ae_false; } /************************************************************************* Internal initialization subroutine *************************************************************************/ static void minbc_minbcinitinternal(ae_int_t n, /* Real */ ae_vector* x, double diffstep, minbcstate* state, ae_state *_state) { ae_frame _frame_block; ae_int_t i; ae_matrix c; ae_vector ct; ae_frame_make(_state, &_frame_block); ae_matrix_init(&c, 0, 0, DT_REAL, _state); ae_vector_init(&ct, 0, DT_INT, _state); /* * Initialize */ state->teststep = (double)(0); state->nmain = n; state->diffstep = diffstep; ae_vector_set_length(&state->bndl, n, _state); ae_vector_set_length(&state->hasbndl, n, _state); ae_vector_set_length(&state->bndu, n, _state); ae_vector_set_length(&state->hasbndu, n, _state); ae_vector_set_length(&state->xstart, n, _state); ae_vector_set_length(&state->xc, n, _state); ae_vector_set_length(&state->cgc, n, _state); ae_vector_set_length(&state->ugc, n, _state); ae_vector_set_length(&state->xn, n, _state); ae_vector_set_length(&state->cgn, n, _state); ae_vector_set_length(&state->ugn, n, _state); ae_vector_set_length(&state->xp, n, _state); ae_vector_set_length(&state->d, n, _state); ae_vector_set_length(&state->s, n, _state); ae_vector_set_length(&state->x, n, _state); ae_vector_set_length(&state->g, n, _state); ae_vector_set_length(&state->work, n, _state); for(i=0; i<=n-1; i++) { state->bndl.ptr.p_double[i] = _state->v_neginf; state->hasbndl.ptr.p_bool[i] = ae_false; state->bndu.ptr.p_double[i] = _state->v_posinf; state->hasbndu.ptr.p_bool[i] = ae_false; state->s.ptr.p_double[i] = 1.0; } minbcsetcond(state, 0.0, 0.0, 0.0, 0, _state); minbcsetxrep(state, ae_false, _state); minbcsetstpmax(state, 0.0, _state); minbcsetprecdefault(state, _state); minbcrestartfrom(state, x, _state); ae_frame_leave(_state); } /************************************************************************* This subroutine updates estimate of the good step length given: 1) previous estimate 2) new length of the good step It makes sure that estimate does not change too rapidly - ratio of new and old estimates will be at least 0.01, at most 100.0 In case previous estimate of good step is zero (no estimate), new estimate is used unconditionally. 
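For example, with a previous estimate of 1.0 a proposed new step of 0.0005 is clamped to 0.01 and a proposed new step of 500 is clamped to 100.0, while any value inside [0.01, 100.0] is accepted as the new estimate.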
-- ALGLIB -- Copyright 16.01.2013 by Bochkanov Sergey *************************************************************************/ static void minbc_updateestimateofgoodstep(double* estimate, double newstep, ae_state *_state) { if( ae_fp_eq(*estimate,(double)(0)) ) { *estimate = newstep; return; } if( ae_fp_less(newstep,*estimate*0.01) ) { *estimate = *estimate*0.01; return; } if( ae_fp_greater(newstep,*estimate*100) ) { *estimate = *estimate*100; return; } *estimate = newstep; } void _minbcstate_init(void* _p, ae_state *_state) { minbcstate *p = (minbcstate*)_p; ae_touch_ptr((void*)p); ae_vector_init(&p->s, 0, DT_REAL, _state); ae_vector_init(&p->diagh, 0, DT_REAL, _state); ae_vector_init(&p->x, 0, DT_REAL, _state); ae_vector_init(&p->g, 0, DT_REAL, _state); _rcommstate_init(&p->rstate, _state); ae_vector_init(&p->xc, 0, DT_REAL, _state); ae_vector_init(&p->ugc, 0, DT_REAL, _state); ae_vector_init(&p->cgc, 0, DT_REAL, _state); ae_vector_init(&p->xn, 0, DT_REAL, _state); ae_vector_init(&p->ugn, 0, DT_REAL, _state); ae_vector_init(&p->cgn, 0, DT_REAL, _state); ae_vector_init(&p->xp, 0, DT_REAL, _state); ae_vector_init(&p->d, 0, DT_REAL, _state); ae_vector_init(&p->hasbndl, 0, DT_BOOL, _state); ae_vector_init(&p->hasbndu, 0, DT_BOOL, _state); ae_vector_init(&p->bndl, 0, DT_REAL, _state); ae_vector_init(&p->bndu, 0, DT_REAL, _state); ae_vector_init(&p->xstart, 0, DT_REAL, _state); _snnlssolver_init(&p->solver, _state); ae_vector_init(&p->tmpprec, 0, DT_REAL, _state); ae_vector_init(&p->tmp0, 0, DT_REAL, _state); ae_vector_init(&p->work, 0, DT_REAL, _state); _linminstate_init(&p->lstate, _state); ae_matrix_init(&p->bufyk, 0, 0, DT_REAL, _state); ae_matrix_init(&p->bufsk, 0, 0, DT_REAL, _state); ae_vector_init(&p->bufrho, 0, DT_REAL, _state); ae_vector_init(&p->buftheta, 0, DT_REAL, _state); } void _minbcstate_init_copy(void* _dst, void* _src, ae_state *_state) { minbcstate *dst = (minbcstate*)_dst; minbcstate *src = (minbcstate*)_src; dst->nmain = src->nmain; dst->epsg = src->epsg; dst->epsf = src->epsf; dst->epsx = src->epsx; dst->maxits = src->maxits; dst->xrep = src->xrep; dst->stpmax = src->stpmax; dst->diffstep = src->diffstep; ae_vector_init_copy(&dst->s, &src->s, _state); dst->prectype = src->prectype; ae_vector_init_copy(&dst->diagh, &src->diagh, _state); ae_vector_init_copy(&dst->x, &src->x, _state); dst->f = src->f; ae_vector_init_copy(&dst->g, &src->g, _state); dst->needf = src->needf; dst->needfg = src->needfg; dst->xupdated = src->xupdated; dst->userterminationneeded = src->userterminationneeded; dst->teststep = src->teststep; _rcommstate_init_copy(&dst->rstate, &src->rstate, _state); ae_vector_init_copy(&dst->xc, &src->xc, _state); ae_vector_init_copy(&dst->ugc, &src->ugc, _state); ae_vector_init_copy(&dst->cgc, &src->cgc, _state); ae_vector_init_copy(&dst->xn, &src->xn, _state); ae_vector_init_copy(&dst->ugn, &src->ugn, _state); ae_vector_init_copy(&dst->cgn, &src->cgn, _state); ae_vector_init_copy(&dst->xp, &src->xp, _state); dst->fc = src->fc; dst->fn = src->fn; dst->fp = src->fp; ae_vector_init_copy(&dst->d, &src->d, _state); dst->lastscaledgoodstep = src->lastscaledgoodstep; ae_vector_init_copy(&dst->hasbndl, &src->hasbndl, _state); ae_vector_init_copy(&dst->hasbndu, &src->hasbndu, _state); ae_vector_init_copy(&dst->bndl, &src->bndl, _state); ae_vector_init_copy(&dst->bndu, &src->bndu, _state); dst->repiterationscount = src->repiterationscount; dst->repnfev = src->repnfev; dst->repvaridx = src->repvaridx; dst->repterminationtype = src->repterminationtype; 
ae_vector_init_copy(&dst->xstart, &src->xstart, _state); _snnlssolver_init_copy(&dst->solver, &src->solver, _state); dst->fbase = src->fbase; dst->fm2 = src->fm2; dst->fm1 = src->fm1; dst->fp1 = src->fp1; dst->fp2 = src->fp2; dst->xm1 = src->xm1; dst->xp1 = src->xp1; dst->gm1 = src->gm1; dst->gp1 = src->gp1; ae_vector_init_copy(&dst->tmpprec, &src->tmpprec, _state); ae_vector_init_copy(&dst->tmp0, &src->tmp0, _state); dst->nfev = src->nfev; dst->mcstage = src->mcstage; dst->stp = src->stp; dst->curstpmax = src->curstpmax; ae_vector_init_copy(&dst->work, &src->work, _state); _linminstate_init_copy(&dst->lstate, &src->lstate, _state); dst->trimthreshold = src->trimthreshold; dst->nonmonotoniccnt = src->nonmonotoniccnt; ae_matrix_init_copy(&dst->bufyk, &src->bufyk, _state); ae_matrix_init_copy(&dst->bufsk, &src->bufsk, _state); ae_vector_init_copy(&dst->bufrho, &src->bufrho, _state); ae_vector_init_copy(&dst->buftheta, &src->buftheta, _state); dst->bufsize = src->bufsize; } void _minbcstate_clear(void* _p) { minbcstate *p = (minbcstate*)_p; ae_touch_ptr((void*)p); ae_vector_clear(&p->s); ae_vector_clear(&p->diagh); ae_vector_clear(&p->x); ae_vector_clear(&p->g); _rcommstate_clear(&p->rstate); ae_vector_clear(&p->xc); ae_vector_clear(&p->ugc); ae_vector_clear(&p->cgc); ae_vector_clear(&p->xn); ae_vector_clear(&p->ugn); ae_vector_clear(&p->cgn); ae_vector_clear(&p->xp); ae_vector_clear(&p->d); ae_vector_clear(&p->hasbndl); ae_vector_clear(&p->hasbndu); ae_vector_clear(&p->bndl); ae_vector_clear(&p->bndu); ae_vector_clear(&p->xstart); _snnlssolver_clear(&p->solver); ae_vector_clear(&p->tmpprec); ae_vector_clear(&p->tmp0); ae_vector_clear(&p->work); _linminstate_clear(&p->lstate); ae_matrix_clear(&p->bufyk); ae_matrix_clear(&p->bufsk); ae_vector_clear(&p->bufrho); ae_vector_clear(&p->buftheta); } void _minbcstate_destroy(void* _p) { minbcstate *p = (minbcstate*)_p; ae_touch_ptr((void*)p); ae_vector_destroy(&p->s); ae_vector_destroy(&p->diagh); ae_vector_destroy(&p->x); ae_vector_destroy(&p->g); _rcommstate_destroy(&p->rstate); ae_vector_destroy(&p->xc); ae_vector_destroy(&p->ugc); ae_vector_destroy(&p->cgc); ae_vector_destroy(&p->xn); ae_vector_destroy(&p->ugn); ae_vector_destroy(&p->cgn); ae_vector_destroy(&p->xp); ae_vector_destroy(&p->d); ae_vector_destroy(&p->hasbndl); ae_vector_destroy(&p->hasbndu); ae_vector_destroy(&p->bndl); ae_vector_destroy(&p->bndu); ae_vector_destroy(&p->xstart); _snnlssolver_destroy(&p->solver); ae_vector_destroy(&p->tmpprec); ae_vector_destroy(&p->tmp0); ae_vector_destroy(&p->work); _linminstate_destroy(&p->lstate); ae_matrix_destroy(&p->bufyk); ae_matrix_destroy(&p->bufsk); ae_vector_destroy(&p->bufrho); ae_vector_destroy(&p->buftheta); } void _minbcreport_init(void* _p, ae_state *_state) { minbcreport *p = (minbcreport*)_p; ae_touch_ptr((void*)p); } void _minbcreport_init_copy(void* _dst, void* _src, ae_state *_state) { minbcreport *dst = (minbcreport*)_dst; minbcreport *src = (minbcreport*)_src; dst->iterationscount = src->iterationscount; dst->nfev = src->nfev; dst->varidx = src->varidx; dst->terminationtype = src->terminationtype; } void _minbcreport_clear(void* _p) { minbcreport *p = (minbcreport*)_p; ae_touch_ptr((void*)p); } void _minbcreport_destroy(void* _p) { minbcreport *p = (minbcreport*)_p; ae_touch_ptr((void*)p); } /************************************************************************* NONSMOOTH NONCONVEX OPTIMIZATION SUBJECT TO BOX/LINEAR/NONLINEAR-NONSMOOTH CONSTRAINTS DESCRIPTION: The subroutine minimizes function F(x) of N arguments subject to 
any combination of: * bound constraints * linear inequality constraints * linear equality constraints * nonlinear equality constraints Gi(x)=0 * nonlinear inequality constraints Hi(x)<=0 IMPORTANT: see MinNSSetAlgoAGS for important information on performance restrictions of AGS solver. REQUIREMENTS: * starting point X0 must be feasible or not too far away from the feasible set * F(), G(), H() are continuous, locally Lipschitz and continuously (but not necessarily twice) differentiable in an open dense subset of R^N. Functions F(), G() and H() may be nonsmooth and non-convex. Informally speaking, it means that functions are composed of large differentiable "patches" with nonsmoothness having place only at the boundaries between these "patches". Most real-life nonsmooth functions satisfy these requirements. Say, anything which involves finite number of abs(), min() and max() is very likely to pass the test. Say, it is possible to optimize anything of the following: * f=abs(x0)+2*abs(x1) * f=max(x0,x1) * f=sin(max(x0,x1)+abs(x2)) * for nonlinearly constrained problems: F() must be bounded from below without nonlinear constraints (this requirement is due to the fact that, contrary to box and linear constraints, nonlinear ones require special handling). * user must provide function value and gradient for F(), H(), G() at all points where function/gradient can be calculated. If optimizer requires value exactly at the boundary between "patches" (say, at x=0 for f=abs(x)), where gradient is not defined, user may resolve tie arbitrarily (in our case - return +1 or -1 at its discretion). * NS solver supports numerical differentiation, i.e. it may differentiate your function for you, but it results in 2N increase of function evaluations. Not recommended unless you solve really small problems. See minnscreatef() for more information on this functionality. USAGE: 1. User initializes algorithm state with MinNSCreate() call and chooses what NLC solver to use. There is some solver which is used by default, with default settings, but you should NOT rely on default choice. It may change in future releases of ALGLIB without notice, and no one can guarantee that new solver will be able to solve your problem with default settings. From the other side, if you choose solver explicitly, you can be pretty sure that it will work with new ALGLIB releases. In the current release following solvers can be used: * AGS solver (activated with MinNSSetAlgoAGS() function) 2. User adds boundary and/or linear and/or nonlinear constraints by means of calling one of the following functions: a) MinNSSetBC() for boundary constraints b) MinNSSetLC() for linear constraints c) MinNSSetNLC() for nonlinear constraints You may combine (a), (b) and (c) in one optimization problem. 3. User sets scale of the variables with MinNSSetScale() function. It is VERY important to set scale of the variables, because nonlinearly constrained problems are hard to solve when variables are badly scaled. 4. User sets stopping conditions with MinNSSetCond(). 5. Finally, user calls MinNSOptimize() function which takes algorithm state and pointer (delegate, etc) to callback function which calculates F/G/H. 7. User calls MinNSResults() to get solution 8. Optionally user may call MinNSRestartFrom() to solve another problem with same N but another starting point. MinNSRestartFrom() allows to reuse already initialized structure. 
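The following compact sketch (illustrative only; it uses the C-core entry points listed in this file rather than the language-specific minnsoptimize() wrapper, and my_fi_and_jac() is a hypothetical user routine which fills State.Fi and State.J) shows the typical call sequence:

    minnscreate(n, &x0, &state, _state);
    minnssetbc(&state, &bndl, &bndu, _state);
    minnssetscale(&state, &s, _state);
    minnssetalgoags(&state, 0.1, 0.0, _state);    // sampling radius 0.1; zero penalty because no nonlinear constraints
    minnssetcond(&state, 1.0E-6, 0, _state);
    while( minnsiteration(&state, _state) )
        if( state.needfij )
            my_fi_and_jac(&state.x, &state.fi, &state.j);
    minnsresults(&state, &xsol, &rep, _state);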
INPUT PARAMETERS: N - problem dimension, N>0: * if given, only leading N elements of X are used * if not given, automatically determined from size of X X - starting point, array[N]: * it is better to set X to a feasible point * but X can be infeasible, in which case algorithm will try to find feasible point first, using X as initial approximation. OUTPUT PARAMETERS: State - structure stores algorithm state NOTE: minnscreatef() function may be used if you do not have analytic gradient. This function creates solver which uses numerical differentiation with user-specified step. -- ALGLIB -- Copyright 18.05.2015 by Bochkanov Sergey *************************************************************************/ void minnscreate(ae_int_t n, /* Real */ ae_vector* x, minnsstate* state, ae_state *_state) { _minnsstate_clear(state); ae_assert(n>=1, "MinNSCreate: N<1", _state); ae_assert(x->cnt>=n, "MinNSCreate: Length(X)<N", _state); minns_minnsinitinternal(n, x, 0.0, state, _state); } /************************************************************************* Version of minnscreate() which uses numerical differentiation instead of user-supplied analytic gradient (see NOTE above): the solver differentiates the target function with user-specified step DiffStep. INPUT PARAMETERS: N - problem dimension, N>0: * if given, only leading N elements of X are used * if not given, automatically determined from size of X X - starting point, array[N]: * it is better to set X to a feasible point * but X can be infeasible, in which case algorithm will try to find feasible point first, using X as initial approximation. DiffStep- differentiation step, DiffStep>0. Algorithm performs numerical differentiation with step for I-th variable being equal to DiffStep*S[I] (here S[] is a scale vector, set by minnssetscale() function). Do not use too small steps, because it may lead to catastrophic cancellation during intermediate calculations. OUTPUT PARAMETERS: State - structure stores algorithm state -- ALGLIB -- Copyright 18.05.2015 by Bochkanov Sergey *************************************************************************/ void minnscreatef(ae_int_t n, /* Real */ ae_vector* x, double diffstep, minnsstate* state, ae_state *_state) { _minnsstate_clear(state); ae_assert(n>=1, "MinNSCreateF: N<1", _state); ae_assert(x->cnt>=n, "MinNSCreateF: Length(X)<N", _state); ae_assert(ae_isfinite(diffstep, _state)&&ae_fp_greater(diffstep,(double)(0)), "MinNSCreateF: DiffStep is not a finite positive number", _state); minns_minnsinitinternal(n, x, diffstep, state, _state); } /************************************************************************* This function sets boundary constraints. Boundary constraints are inactive by default (after initial creation). They are preserved after algorithm restart with minnsrestartfrom(). INPUT PARAMETERS: State - structure stores algorithm state BndL - lower bounds, array[N]; use -INF for variables which are unbounded from below BndU - upper bounds, array[N]; use +INF for variables which are unbounded from above -- ALGLIB -- Copyright 18.05.2015 by Bochkanov Sergey *************************************************************************/ void minnssetbc(minnsstate* state, /* Real */ ae_vector* bndl, /* Real */ ae_vector* bndu, ae_state *_state) { ae_int_t i; ae_int_t n; n = state->n; ae_assert(bndl->cnt>=n, "MinNSSetBC: Length(BndL)<N", _state); ae_assert(bndu->cnt>=n, "MinNSSetBC: Length(BndU)<N", _state); for(i=0; i<=n-1; i++) { ae_assert(ae_isfinite(bndl->ptr.p_double[i], _state)||ae_isneginf(bndl->ptr.p_double[i], _state), "MinNSSetBC: BndL contains NAN or +INF", _state); ae_assert(ae_isfinite(bndu->ptr.p_double[i], _state)||ae_isposinf(bndu->ptr.p_double[i], _state), "MinNSSetBC: BndU contains NAN or -INF", _state); state->bndl.ptr.p_double[i] = bndl->ptr.p_double[i]; state->hasbndl.ptr.p_bool[i] = ae_isfinite(bndl->ptr.p_double[i], _state); state->bndu.ptr.p_double[i] = bndu->ptr.p_double[i]; state->hasbndu.ptr.p_bool[i] = ae_isfinite(bndu->ptr.p_double[i], _state); } } /************************************************************************* This function sets linear constraints. Linear constraints are inactive by default (after initial creation). They are preserved after algorithm restart with minnsrestartfrom(). INPUT PARAMETERS: State - structure previously allocated with minnscreate() call. C - linear constraints, array[K,N+1]. Each row of C represents one constraint, either equality or inequality (see below): * first N elements correspond to coefficients, * last element corresponds to the right part. All elements of C (including right part) must be finite. 
CT - type of constraints, array[K]: * if CT[i]>0, then I-th constraint is C[i,*]*x >= C[i,n+1] * if CT[i]=0, then I-th constraint is C[i,*]*x = C[i,n+1] * if CT[i]<0, then I-th constraint is C[i,*]*x <= C[i,n+1] K - number of equality/inequality constraints, K>=0: * if given, only leading K elements of C/CT are used * if not given, automatically determined from sizes of C/CT NOTE: linear (non-bound) constraints are satisfied only approximately: * there always exists some minor violation (about current sampling radius in magnitude during optimization, about EpsX in the solution) due to use of penalty method to handle constraints. * numerical differentiation, if used, may lead to function evaluations outside of the feasible area, because algorithm does NOT change numerical differentiation formula according to linear constraints. If you want constraints to be satisfied exactly, try to reformulate your problem in such manner that all constraints will become boundary ones (this kind of constraint is always satisfied exactly, both in the final solution and in all intermediate points). -- ALGLIB -- Copyright 18.05.2015 by Bochkanov Sergey *************************************************************************/ void minnssetlc(minnsstate* state, /* Real */ ae_matrix* c, /* Integer */ ae_vector* ct, ae_int_t k, ae_state *_state) { ae_int_t n; ae_int_t i; n = state->n; /* * First, check for errors in the inputs */ ae_assert(k>=0, "MinNSSetLC: K<0", _state); ae_assert(c->cols>=n+1||k==0, "MinNSSetLC: Cols(C)<N+1", _state); ae_assert(c->rows>=k, "MinNSSetLC: Rows(C)<K", _state); ae_assert(ct->cnt>=k, "MinNSSetLC: Length(CT)<K", _state); /* * Handle zero K */ if( k==0 ) { state->nec = 0; state->nic = 0; return; } /* * Equality constraints are stored first, in the upper * NEC rows of State.CLEIC matrix. Inequality constraints * are stored in the next NIC rows. * * NOTE: we convert inequality constraints to the form * A*x<=b before copying them. */ rmatrixsetlengthatleast(&state->cleic, k, n+1, _state); state->nec = 0; state->nic = 0; for(i=0; i<=k-1; i++) { if( ct->ptr.p_int[i]==0 ) { ae_v_move(&state->cleic.ptr.pp_double[state->nec][0], 1, &c->ptr.pp_double[i][0], 1, ae_v_len(0,n)); state->nec = state->nec+1; } } for(i=0; i<=k-1; i++) { if( ct->ptr.p_int[i]!=0 ) { if( ct->ptr.p_int[i]>0 ) { ae_v_moveneg(&state->cleic.ptr.pp_double[state->nec+state->nic][0], 1, &c->ptr.pp_double[i][0], 1, ae_v_len(0,n)); } else { ae_v_move(&state->cleic.ptr.pp_double[state->nec+state->nic][0], 1, &c->ptr.pp_double[i][0], 1, ae_v_len(0,n)); } state->nic = state->nic+1; } } } /************************************************************************* This function sets nonlinear constraints. In fact, this function sets NUMBER of nonlinear constraints. Constraints themselves (constraint functions) are passed to minnsoptimize() method. This method requires user-defined vector function F[] and its Jacobian J[], where: * first component of F[] and first row of Jacobian J[] correspond to function being minimized * next NLEC components of F[] (and rows of J) correspond to nonlinear equality constraints G_i(x)=0 * next NLIC components of F[] (and rows of J) correspond to nonlinear inequality constraints H_i(x)<=0 NOTE: you may combine nonlinear constraints with linear/boundary ones. If your problem has mixed constraints, you may explicitly specify some of them as linear ones. It may help optimizer to handle them more efficiently. INPUT PARAMETERS: State - structure previously allocated with minnscreate() call. 
NLEC - number of Non-Linear Equality Constraints (NLEC), >=0 NLIC - number of Non-Linear Inquality Constraints (NLIC), >=0 NOTE 1: nonlinear constraints are satisfied only approximately! It is possible that algorithm will evaluate function outside of the feasible area! NOTE 2: algorithm scales variables according to scale specified by minnssetscale() function, so it can handle problems with badly scaled variables (as long as we KNOW their scales). However, there is no way to automatically scale nonlinear constraints Gi(x) and Hi(x). Inappropriate scaling of Gi/Hi may ruin convergence. Solving problem with constraint "1000*G0(x)=0" is NOT same as solving it with constraint "0.001*G0(x)=0". It means that YOU are the one who is responsible for correct scaling of nonlinear constraints Gi(x) and Hi(x). We recommend you to scale nonlinear constraints in such way that I-th component of dG/dX (or dH/dx) has approximately unit magnitude (for problems with unit scale) or has magnitude approximately equal to 1/S[i] (where S is a scale set by minnssetscale() function). NOTE 3: nonlinear constraints are always hard to handle, no matter what algorithm you try to use. Even basic box/linear constraints modify function curvature by adding valleys and ridges. However, nonlinear constraints add valleys which are very hard to follow due to their "curved" nature. It means that optimization with single nonlinear constraint may be significantly slower than optimization with multiple linear ones. It is normal situation, and we recommend you to carefully choose Rho parameter of minnssetalgoags(), because too large value may slow down convergence. -- ALGLIB -- Copyright 18.05.2015 by Bochkanov Sergey *************************************************************************/ void minnssetnlc(minnsstate* state, ae_int_t nlec, ae_int_t nlic, ae_state *_state) { ae_assert(nlec>=0, "MinNSSetNLC: NLEC<0", _state); ae_assert(nlic>=0, "MinNSSetNLC: NLIC<0", _state); state->ng = nlec; state->nh = nlic; ae_vector_set_length(&state->fi, 1+state->ng+state->nh, _state); ae_matrix_set_length(&state->j, 1+state->ng+state->nh, state->n, _state); } /************************************************************************* This function sets stopping conditions for iterations of optimizer. INPUT PARAMETERS: State - structure which stores algorithm state EpsX - >=0 The AGS solver finishes its work if on k+1-th iteration sampling radius decreases below EpsX. MaxIts - maximum number of iterations. If MaxIts=0, the number of iterations is unlimited. Passing EpsX=0 and MaxIts=0 (simultaneously) will lead to automatic stopping criterion selection. We do not recommend you to rely on default choice in production code. -- ALGLIB -- Copyright 18.05.2015 by Bochkanov Sergey *************************************************************************/ void minnssetcond(minnsstate* state, double epsx, ae_int_t maxits, ae_state *_state) { ae_assert(ae_isfinite(epsx, _state), "MinNSSetCond: EpsX is not finite number", _state); ae_assert(ae_fp_greater_eq(epsx,(double)(0)), "MinNSSetCond: negative EpsX", _state); ae_assert(maxits>=0, "MinNSSetCond: negative MaxIts!", _state); if( ae_fp_eq(epsx,(double)(0))&&maxits==0 ) { epsx = 1.0E-6; } state->epsx = epsx; state->maxits = maxits; } /************************************************************************* This function sets scaling coefficients for NLC optimizer. ALGLIB optimizers use scaling matrices to test stopping conditions (step size and gradient are scaled before comparison with tolerances). 
Scale of the I-th variable is a translation invariant measure of: a) "how large" the variable is b) how large the step should be to make significant changes in the function Scaling is also used by finite difference variant of the optimizer - step along I-th axis is equal to DiffStep*S[I]. INPUT PARAMETERS: State - structure stores algorithm state S - array[N], non-zero scaling coefficients S[i] may be negative, sign doesn't matter. -- ALGLIB -- Copyright 18.05.2015 by Bochkanov Sergey *************************************************************************/ void minnssetscale(minnsstate* state, /* Real */ ae_vector* s, ae_state *_state) { ae_int_t i; ae_assert(s->cnt>=state->n, "MinNSSetScale: Length(S)<N", _state); for(i=0; i<=state->n-1; i++) { ae_assert(ae_isfinite(s->ptr.p_double[i], _state), "MinNSSetScale: S contains infinite or NAN elements", _state); ae_assert(ae_fp_neq(s->ptr.p_double[i],(double)(0)), "MinNSSetScale: S contains zero elements", _state); state->s.ptr.p_double[i] = ae_fabs(s->ptr.p_double[i], _state); } } /************************************************************************* This function tells MinNS unit to use AGS (adaptive gradient sampling) algorithm for nonsmooth constrained optimization. This algorithm is a slight modification of one described in "An Adaptive Gradient Sampling Algorithm for Nonsmooth Optimization" by Frank E. Curtis and Xiaocun Que. This optimizer has following benefits and drawbacks: + robustness; it can be used with nonsmooth and nonconvex functions. + relatively easy tuning; most of the metaparameters are easy to select. - it has convergence of steepest descent, slower than CG/LBFGS. - each iteration involves evaluation of ~2N gradient values and solution of 2Nx2N quadratic programming problem, which limits applicability of algorithm to small-scale problems (up to 50-100). IMPORTANT: this algorithm has convergence guarantees, i.e. it will steadily move towards some stationary point of the function. However, "stationary point" does not always mean "solution". Nonsmooth problems often have "flat spots", i.e. areas where function does not change at all. Such "flat spots" are stationary points by definition, and algorithm may be caught here. Nonsmooth CONVEX tasks are not prone to this problem. Say, if your function has form f()=MAX(f0,f1,...), and f_i are convex, then f() is convex too and you have guaranteed convergence to solution. INPUT PARAMETERS: State - structure which stores algorithm state Radius - initial sampling radius, >=0. Internally multiplied by vector of per-variable scales specified by minnssetscale(). You should select relatively large sampling radius, roughly proportional to scaled length of the first steps of the algorithm. Something close to 0.1 in magnitude should be good for most problems. AGS solver can automatically decrease radius, so too large radius is not a problem (assuming that you won't choose so large radius that algorithm will sample function in too far away points, where gradient value is irrelevant). Too small radius won't cause algorithm to fail, but it may slow down algorithm (it may have to perform too short steps). Penalty - penalty coefficient for nonlinear constraints: * for problem with nonlinear constraints should be some problem-specific positive value, large enough that penalty term changes shape of the function. Starting from some problem-specific value penalty coefficient becomes large enough to exactly enforce nonlinear constraints; larger values do not improve precision. 
Increasing it too much may slow down convergence, so you should choose it carefully. * can be zero for problems WITHOUT nonlinear constraints (i.e. for unconstrained ones or ones with just box or linear constraints) * if you specify zero value for problem with at least one nonlinear constraint, algorithm will terminate with error code -1. ALGORITHM OUTLINE The very basic outline of unconstrained AGS algorithm is given below: 0. If sampling radius is below EpsX or we performed more then MaxIts iterations - STOP. 1. sample O(N) gradient values at random locations around current point; informally speaking, this sample is an implicit piecewise linear model of the function, although algorithm formulation does not mention that explicitly 2. solve quadratic programming problem in order to find descent direction 3. if QP solver tells us that we are near solution, decrease sampling radius and move to (0) 4. perform backtracking line search 5. after moving to new point, goto (0) As for the constraints: * box constraints are handled exactly by modification of the function being minimized * linear/nonlinear constraints are handled by adding L1 penalty. Because our solver can handle nonsmoothness, we can use L1 penalty function, which is an exact one (i.e. exact solution is returned under such penalty). * penalty coefficient for linear constraints is chosen automatically; however, penalty coefficient for nonlinear constraints must be specified by user. -- ALGLIB -- Copyright 18.05.2015 by Bochkanov Sergey *************************************************************************/ void minnssetalgoags(minnsstate* state, double radius, double penalty, ae_state *_state) { ae_assert(ae_isfinite(radius, _state), "MinNSSetAlgoAGS: Radius is not finite", _state); ae_assert(ae_fp_greater(radius,(double)(0)), "MinNSSetAlgoAGS: Radius<=0", _state); ae_assert(ae_isfinite(penalty, _state), "MinNSSetAlgoAGS: Penalty is not finite", _state); ae_assert(ae_fp_greater_eq(penalty,0.0), "MinNSSetAlgoAGS: Penalty<0", _state); state->agsrhononlinear = penalty; state->agsradius = radius; state->solvertype = 0; } /************************************************************************* This function turns on/off reporting. INPUT PARAMETERS: State - structure which stores algorithm state NeedXRep- whether iteration reports are needed or not If NeedXRep is True, algorithm will call rep() callback function if it is provided to minnsoptimize(). -- ALGLIB -- Copyright 28.11.2010 by Bochkanov Sergey *************************************************************************/ void minnssetxrep(minnsstate* state, ae_bool needxrep, ae_state *_state) { state->xrep = needxrep; } /************************************************************************* This subroutine submits request for termination of running optimizer. It should be called from user-supplied callback when user decides that it is time to "smoothly" terminate optimization process. As result, optimizer stops at point which was "current accepted" when termination request was submitted and returns error code 8 (successful termination). INPUT PARAMETERS: State - optimizer structure NOTE: after request for termination optimizer may perform several additional calls to user-supplied callbacks. It does NOT guarantee to stop immediately - it just guarantees that these additional calls will be discarded later. NOTE: calling this function on optimizer which is NOT running will have no effect. NOTE: multiple calls to this function are possible. 
First call is counted, subsequent calls are silently ignored. -- ALGLIB -- Copyright 18.05.2015 by Bochkanov Sergey *************************************************************************/ void minnsrequesttermination(minnsstate* state, ae_state *_state) { state->userterminationneeded = ae_true; } /************************************************************************* NOTES: 1. This function has two different implementations: one which uses exact (analytical) user-supplied Jacobian, and one which uses only function vector and numerically differentiates function in order to obtain gradient. Depending on the specific function used to create optimizer object you should choose appropriate variant of minnsoptimize() - one which accepts function AND Jacobian or one which accepts ONLY function. Be careful to choose variant of minnsoptimize() which corresponds to your optimization scheme! Table below lists different combinations of callback (function/gradient) passed to minnsoptimize() and specific function used to create optimizer. | USER PASSED TO minnsoptimize() CREATED WITH | function only | function and gradient ------------------------------------------------------------ minnscreatef() | works FAILS minnscreate() | FAILS works Here "FAILS" denotes inappropriate combinations of optimizer creation function and minnsoptimize() version. Attempts to use such combination will lead to exception. Either you did not pass gradient when it WAS needed or you passed gradient when it was NOT needed. -- ALGLIB -- Copyright 18.05.2015 by Bochkanov Sergey *************************************************************************/ ae_bool minnsiteration(minnsstate* state, ae_state *_state) { ae_int_t i; ae_int_t k; ae_int_t n; ae_int_t ng; ae_int_t nh; double v; double xp; double xm; ae_bool result; /* * Reverse communication preparations * I know it looks ugly, but it works the same way * anywhere from C++ to Python. 
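 * From the caller's side the protocol is simple: minnsiteration() is called in a loop; whenever it returns True the caller inspects the request flags - NeedFi (write target/constraint values for the point State.X into State.Fi), NeedFiJ (additionally write the Jacobian into State.J), XUpdated (State.X holds a new iterate to report) - and calls it again; when it returns False, results are read with minnsresults().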
* * This code initializes locals by: * * random values determined during code * generation - on first subroutine call * * values from previous call - on subsequent calls */ if( state->rstate.stage>=0 ) { i = state->rstate.ia.ptr.p_int[0]; k = state->rstate.ia.ptr.p_int[1]; n = state->rstate.ia.ptr.p_int[2]; ng = state->rstate.ia.ptr.p_int[3]; nh = state->rstate.ia.ptr.p_int[4]; v = state->rstate.ra.ptr.p_double[0]; xp = state->rstate.ra.ptr.p_double[1]; xm = state->rstate.ra.ptr.p_double[2]; } else { i = 359; k = -58; n = -919; ng = -909; nh = 81; v = 255; xp = 74; xm = -788; } if( state->rstate.stage==0 ) { goto lbl_0; } if( state->rstate.stage==1 ) { goto lbl_1; } if( state->rstate.stage==2 ) { goto lbl_2; } if( state->rstate.stage==3 ) { goto lbl_3; } /* * Routine body */ /* * Init */ state->replcerr = 0.0; state->repnlcerr = 0.0; state->repterminationtype = 0; state->repinneriterationscount = 0; state->repouteriterationscount = 0; state->repnfev = 0; state->repvaridx = 0; state->repfuncidx = 0; state->userterminationneeded = ae_false; state->dbgncholesky = 0; n = state->n; ng = state->ng; nh = state->nh; minns_clearrequestfields(state, _state); /* * AGS solver */ if( state->solvertype!=0 ) { goto lbl_4; } if( ae_fp_neq(state->diffstep,(double)(0)) ) { rvectorsetlengthatleast(&state->xbase, n, _state); rvectorsetlengthatleast(&state->fm, 1+ng+nh, _state); rvectorsetlengthatleast(&state->fp, 1+ng+nh, _state); } ae_vector_set_length(&state->rstateags.ia, 13+1, _state); ae_vector_set_length(&state->rstateags.ba, 3+1, _state); ae_vector_set_length(&state->rstateags.ra, 9+1, _state); state->rstateags.stage = -1; lbl_6: if( !minns_agsiteration(state, _state) ) { goto lbl_7; } /* * Numerical differentiation (if needed) - intercept NeedFiJ * request and replace it by sequence of NeedFi requests */ if( !(ae_fp_neq(state->diffstep,(double)(0))&&state->needfij) ) { goto lbl_8; } state->needfij = ae_false; state->needfi = ae_true; ae_v_move(&state->xbase.ptr.p_double[0], 1, &state->x.ptr.p_double[0], 1, ae_v_len(0,n-1)); k = 0; lbl_10: if( k>n-1 ) { goto lbl_12; } v = state->xbase.ptr.p_double[k]; xm = v-state->diffstep*state->s.ptr.p_double[k]; xp = v+state->diffstep*state->s.ptr.p_double[k]; if( state->hasbndl.ptr.p_bool[k]&&ae_fp_less(xm,state->bndl.ptr.p_double[k]) ) { xm = state->bndl.ptr.p_double[k]; } if( state->hasbndu.ptr.p_bool[k]&&ae_fp_greater(xp,state->bndu.ptr.p_double[k]) ) { xp = state->bndu.ptr.p_double[k]; } ae_assert(ae_fp_less_eq(xm,xp), "MinNS: integrity check failed", _state); if( ae_fp_eq(xm,xp) ) { goto lbl_13; } ae_v_move(&state->x.ptr.p_double[0], 1, &state->xbase.ptr.p_double[0], 1, ae_v_len(0,n-1)); state->x.ptr.p_double[k] = xm; state->rstate.stage = 0; goto lbl_rcomm; lbl_0: ae_v_move(&state->fm.ptr.p_double[0], 1, &state->fi.ptr.p_double[0], 1, ae_v_len(0,ng+nh)); ae_v_move(&state->x.ptr.p_double[0], 1, &state->xbase.ptr.p_double[0], 1, ae_v_len(0,n-1)); state->x.ptr.p_double[k] = xp; state->rstate.stage = 1; goto lbl_rcomm; lbl_1: ae_v_move(&state->fp.ptr.p_double[0], 1, &state->fi.ptr.p_double[0], 1, ae_v_len(0,ng+nh)); ae_v_move(&state->j.ptr.pp_double[0][k], state->j.stride, &state->fp.ptr.p_double[0], 1, ae_v_len(0,ng+nh)); ae_v_sub(&state->j.ptr.pp_double[0][k], state->j.stride, &state->fm.ptr.p_double[0], 1, ae_v_len(0,ng+nh)); v = 1/(xp-xm); ae_v_muld(&state->j.ptr.pp_double[0][k], state->j.stride, ae_v_len(0,ng+nh), v); state->repnfev = state->repnfev+2; goto lbl_14; lbl_13: for(i=0; i<=ng+nh; i++) { state->j.ptr.pp_double[i][k] = 0.0; } lbl_14: k = k+1; goto 
lbl_10; lbl_12: ae_v_move(&state->x.ptr.p_double[0], 1, &state->xbase.ptr.p_double[0], 1, ae_v_len(0,n-1)); state->rstate.stage = 2; goto lbl_rcomm; lbl_2: /* * Restore previous values of fields and continue */ state->needfi = ae_false; state->needfij = ae_true; goto lbl_6; lbl_8: /* * Forward request to caller */ state->rstate.stage = 3; goto lbl_rcomm; lbl_3: inc(&state->repnfev, _state); goto lbl_6; lbl_7: result = ae_false; return result; lbl_4: result = ae_false; return result; /* * Saving state */ lbl_rcomm: result = ae_true; state->rstate.ia.ptr.p_int[0] = i; state->rstate.ia.ptr.p_int[1] = k; state->rstate.ia.ptr.p_int[2] = n; state->rstate.ia.ptr.p_int[3] = ng; state->rstate.ia.ptr.p_int[4] = nh; state->rstate.ra.ptr.p_double[0] = v; state->rstate.ra.ptr.p_double[1] = xp; state->rstate.ra.ptr.p_double[2] = xm; return result; } /************************************************************************* MinNS results INPUT PARAMETERS: State - algorithm state OUTPUT PARAMETERS: X - array[0..N-1], solution Rep - optimization report. You should check Rep.TerminationType in order to distinguish successful termination from unsuccessful one: * -8 internal integrity control detected infinite or NAN values in function/gradient. Abnormal termination signalled. * -3 box constraints are inconsistent * -1 inconsistent parameters were passed: * penalty parameter for minnssetalgoags() is zero, but we have nonlinear constraints set by minnssetnlc() * 2 sampling radius decreased below epsx * 7 stopping conditions are too stringent, further improvement is impossible, X contains best point found so far. * 8 User requested termination via minnsrequesttermination() -- ALGLIB -- Copyright 18.05.2015 by Bochkanov Sergey *************************************************************************/ void minnsresults(minnsstate* state, /* Real */ ae_vector* x, minnsreport* rep, ae_state *_state) { ae_vector_clear(x); _minnsreport_clear(rep); minnsresultsbuf(state, x, rep, _state); } /************************************************************************* Buffered implementation of minnsresults() which uses pre-allocated buffer to store X[]. If buffer size is too small, it resizes buffer. It is intended to be used in the inner cycles of performance critical algorithms where array reallocation penalty is too large to be ignored. -- ALGLIB -- Copyright 18.05.2015 by Bochkanov Sergey *************************************************************************/ void minnsresultsbuf(minnsstate* state, /* Real */ ae_vector* x, minnsreport* rep, ae_state *_state) { ae_int_t i; if( x->cnt<state->n ) { ae_vector_set_length(x, state->n, _state); } rep->iterationscount = state->repinneriterationscount; rep->nfev = state->repnfev; rep->varidx = state->repvaridx; rep->funcidx = state->repfuncidx; rep->terminationtype = state->repterminationtype; rep->cerr = ae_maxreal(state->replcerr, state->repnlcerr, _state); rep->lcerr = state->replcerr; rep->nlcerr = state->repnlcerr; if( state->repterminationtype>0 ) { ae_v_move(&x->ptr.p_double[0], 1, &state->xc.ptr.p_double[0], 1, ae_v_len(0,state->n-1)); } else { for(i=0; i<=state->n-1; i++) { x->ptr.p_double[i] = _state->v_nan; } } } /************************************************************************* This subroutine restarts algorithm from new point. All optimization parameters (including constraints) are left unchanged. This function allows to solve multiple optimization problems (which must have same number of dimensions) without object reallocation penalty.
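A minimal usage sketch is given below. It assumes the C++ wrapper functions (minnscreate, minnssetalgoags, minnsoptimize, minnsresults, minnsrestartfrom) documented elsewhere in this manual; the target f(x)=|x0|+|x1| and all names appearing in the sketch are illustrative only, not part of the library.

    #include <cmath>
    #include "optimization.h"
    using namespace alglib;

    // nonsmooth target: fi[0] = |x0|+|x1|, jac = its 1x2 (sub)gradient
    static void nsfunc_jac(const real_1d_array &x, real_1d_array &fi, real_2d_array &jac, void *ptr)
    {
        fi[0] = std::fabs(x[0])+std::fabs(x[1]);
        jac[0][0] = x[0]>=0 ? 1.0 : -1.0;
        jac[0][1] = x[1]>=0 ? 1.0 : -1.0;
    }

    static void solve_from_two_points()
    {
        real_1d_array x0 = "[+1,+1]", x1 = "[-3,+2]", xsol;
        minnsstate state;
        minnsreport rep;
        minnscreate(2, x0, state);            // analytic-Jacobian mode
        minnssetalgoags(state, 0.1, 0.0);     // AGS solver, no nonlinear constraints
        minnsoptimize(state, nsfunc_jac);
        minnsresults(state, xsol, rep);       // inspect rep.terminationtype (codes listed above)
        minnsrestartfrom(state, x1);          // reuse the same state for a new starting point
        minnsoptimize(state, nsfunc_jac);
        minnsresults(state, xsol, rep);
    }
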
INPUT PARAMETERS: State - structure previously allocated with minnscreate() call. X - new starting point. -- ALGLIB -- Copyright 18.05.2015 by Bochkanov Sergey *************************************************************************/ void minnsrestartfrom(minnsstate* state, /* Real */ ae_vector* x, ae_state *_state) { ae_int_t n; n = state->n; /* * First, check for errors in the inputs */ ae_assert(x->cnt>=n, "MinNSRestartFrom: Length(X)<N!", _state); ae_assert(isfinitevector(x, n, _state), "MinNSRestartFrom: X contains infinite or NaN values!", _state); ae_v_move(&state->xstart.ptr.p_double[0], 1, &x->ptr.p_double[0], 1, ae_v_len(0,n-1)); /* * prepare RComm facilities */ ae_vector_set_length(&state->rstate.ia, 4+1, _state); ae_vector_set_length(&state->rstate.ra, 2+1, _state); state->rstate.stage = -1; minns_clearrequestfields(state, _state); } /************************************************************************* Clears request fields (to be sure that we don't forget to clear something) *************************************************************************/ static void minns_clearrequestfields(minnsstate* state, ae_state *_state) { state->needfi = ae_false; state->needfij = ae_false; state->xupdated = ae_false; } /************************************************************************* Internal initialization subroutine. Sets default NLC solver with default criteria. *************************************************************************/ static void minns_minnsinitinternal(ae_int_t n, /* Real */ ae_vector* x, double diffstep, minnsstate* state, ae_state *_state) { ae_frame _frame_block; ae_int_t i; ae_matrix c; ae_vector ct; ae_frame_make(_state, &_frame_block); ae_matrix_init(&c, 0, 0, DT_REAL, _state); ae_vector_init(&ct, 0, DT_INT, _state); state->agsinitstp = 0.2; state->agsstattold = 1.0E-10; state->agsshortstpabs = 1.0E-10; state->agsshortstprel = 0.75; state->agsshortf = 10*ae_machineepsilon; state->agsrhononlinear = 0.0; state->agsraddecay = 0.2; state->agsalphadecay = 0.5; state->agsdecrease = 0.1; state->agsmaxraddecays = 50; state->agsmaxbacktrack = 20; state->agsmaxbacktracknonfull = 8; state->agspenaltylevel = 50.0; state->agspenaltyincrease = 100.0; state->agsminupdate = ae_maxint(5, n/2, _state); state->agssamplesize = ae_maxint(2*n+1, state->agsminupdate+1, _state); state->agsshortlimit = 4+state->agssamplesize/state->agsminupdate; /* * Initialize other params */ state->n = n; state->diffstep = diffstep; ae_vector_set_length(&state->bndl, n, _state); ae_vector_set_length(&state->hasbndl, n, _state); ae_vector_set_length(&state->bndu, n, _state); ae_vector_set_length(&state->hasbndu, n, _state); ae_vector_set_length(&state->s, n, _state); ae_vector_set_length(&state->xstart, n, _state); ae_vector_set_length(&state->xc, n, _state); ae_vector_set_length(&state->xn, n, _state); ae_vector_set_length(&state->d, n, _state); ae_vector_set_length(&state->x, n, _state); for(i=0; i<=n-1; i++) { state->bndl.ptr.p_double[i] = _state->v_neginf; state->hasbndl.ptr.p_bool[i] = ae_false; state->bndu.ptr.p_double[i] = _state->v_posinf; state->hasbndu.ptr.p_bool[i] = ae_false; state->s.ptr.p_double[i] = 1.0; state->xstart.ptr.p_double[i] = x->ptr.p_double[i]; state->xc.ptr.p_double[i] = x->ptr.p_double[i]; } minnssetlc(state, &c, &ct, 0, _state); minnssetnlc(state, 0, 0, _state); minnssetcond(state, 0.0, 0, _state); minnssetxrep(state, ae_false, _state); minnssetalgoags(state, 0.1, 1000.0, _state); minnsrestartfrom(state, x, _state); ae_frame_leave(_state); } /************************************************************************* This function performs actual processing for the AGS algorithm.
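(The sketch below shows, for orientation, how the requests generated here are typically consumed one level up. It assumes that the public iteration function is named minnsiteration(), as is conventional for ALGLIB optimizers and as suggested by the body shown above; user_fvec_jac() is a hypothetical user routine which fills state->fi and state->j at state->x, so the sketch is illustrative only.)

    // schematic reverse-communication driver at the C core level,
    // analytic-Jacobian mode: only NeedFiJ and XUpdated requests are produced
    static void drive_minns(minnsstate *state, ae_state *_state)
    {
        while( minnsiteration(state, _state) )
        {
            if( state->needfij )
            {
                user_fvec_jac(state);      // hypothetical: evaluate Fi[] and J at state->x
                continue;
            }
            if( state->xupdated )
            {
                // optional: report progress using state->x and state->f
                continue;
            }
        }
        // afterwards, minnsresults() extracts the solution and the report
    }
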
It expects that caller redirects its reverse communication requests NeedFiJ/XUpdated to external user who will provide analytic derivative (or handle reports about progress). In case external user does not have analytic derivative, it is responsibility of caller to intercept NeedFiJ request and replace it with appropriate numerical differentiation scheme. -- ALGLIB -- Copyright 06.06.2015 by Bochkanov Sergey *************************************************************************/ static ae_bool minns_agsiteration(minnsstate* state, ae_state *_state) { ae_int_t n; ae_int_t nec; ae_int_t nic; ae_int_t ng; ae_int_t nh; ae_int_t i; ae_int_t j; ae_int_t k; double radius0; double radius; ae_int_t radiusdecays; double alpha; double recommendedstep; double dnrm; double dg; double v; double vv; ae_int_t maxsamplesize; ae_int_t cursamplesize; double v0; double v1; ae_bool restartneeded; ae_bool b; ae_bool alphadecreased; ae_int_t shortstepscnt; ae_int_t backtrackits; ae_int_t maxbacktrackits; ae_bool fullsample; ae_bool result; /* * Reverse communication preparations * I know it looks ugly, but it works the same way * anywhere from C++ to Python. * * This code initializes locals by: * * random values determined during code * generation - on first subroutine call * * values from previous call - on subsequent calls */ if( state->rstateags.stage>=0 ) { n = state->rstateags.ia.ptr.p_int[0]; nec = state->rstateags.ia.ptr.p_int[1]; nic = state->rstateags.ia.ptr.p_int[2]; ng = state->rstateags.ia.ptr.p_int[3]; nh = state->rstateags.ia.ptr.p_int[4]; i = state->rstateags.ia.ptr.p_int[5]; j = state->rstateags.ia.ptr.p_int[6]; k = state->rstateags.ia.ptr.p_int[7]; radiusdecays = state->rstateags.ia.ptr.p_int[8]; maxsamplesize = state->rstateags.ia.ptr.p_int[9]; cursamplesize = state->rstateags.ia.ptr.p_int[10]; shortstepscnt = state->rstateags.ia.ptr.p_int[11]; backtrackits = state->rstateags.ia.ptr.p_int[12]; maxbacktrackits = state->rstateags.ia.ptr.p_int[13]; restartneeded = state->rstateags.ba.ptr.p_bool[0]; b = state->rstateags.ba.ptr.p_bool[1]; alphadecreased = state->rstateags.ba.ptr.p_bool[2]; fullsample = state->rstateags.ba.ptr.p_bool[3]; radius0 = state->rstateags.ra.ptr.p_double[0]; radius = state->rstateags.ra.ptr.p_double[1]; alpha = state->rstateags.ra.ptr.p_double[2]; recommendedstep = state->rstateags.ra.ptr.p_double[3]; dnrm = state->rstateags.ra.ptr.p_double[4]; dg = state->rstateags.ra.ptr.p_double[5]; v = state->rstateags.ra.ptr.p_double[6]; vv = state->rstateags.ra.ptr.p_double[7]; v0 = state->rstateags.ra.ptr.p_double[8]; v1 = state->rstateags.ra.ptr.p_double[9]; } else { n = 809; nec = 205; nic = -838; ng = 939; nh = -526; i = 763; j = -541; k = -698; radiusdecays = -900; maxsamplesize = -318; cursamplesize = -940; shortstepscnt = 1016; backtrackits = -229; maxbacktrackits = -536; restartneeded = ae_true; b = ae_true; alphadecreased = ae_false; fullsample = ae_false; radius0 = -722; radius = -413; alpha = -461; recommendedstep = 927; dnrm = 201; dg = 922; v = -154; vv = 306; v0 = -1011; v1 = 951; } if( state->rstateags.stage==0 ) { goto lbl_0; } if( state->rstateags.stage==1 ) { goto lbl_1; } if( state->rstateags.stage==2 ) { goto lbl_2; } if( state->rstateags.stage==3 ) { goto lbl_3; } /* * Routine body */ ae_assert(state->solvertype==0, "MinNS: internal error", _state); n = state->n; nec = state->nec; nic = state->nic; ng = state->ng; nh = state->nh; /* * Check consistency of parameters */ if( ng+nh>0&&ae_fp_eq(state->agsrhononlinear,(double)(0)) ) { state->repterminationtype = -1; 
result = ae_false; return result; } /* * Allocate arrays. */ rvectorsetlengthatleast(&state->colmax, n, _state); rvectorsetlengthatleast(&state->diagh, n, _state); rvectorsetlengthatleast(&state->signmin, n, _state); rvectorsetlengthatleast(&state->signmax, n, _state); maxsamplesize = state->agssamplesize; rmatrixsetlengthatleast(&state->samplex, maxsamplesize+1, n, _state); rmatrixsetlengthatleast(&state->samplegm, maxsamplesize+1, n, _state); rmatrixsetlengthatleast(&state->samplegmbc, maxsamplesize+1, n, _state); rvectorsetlengthatleast(&state->samplef, maxsamplesize+1, _state); rvectorsetlengthatleast(&state->samplef0, maxsamplesize+1, _state); rvectorsetlengthatleast(&state->grs, n, _state); /* * Prepare optimizer */ rvectorsetlengthatleast(&state->tmp0, maxsamplesize, _state); rvectorsetlengthatleast(&state->tmp1, maxsamplesize, _state); ivectorsetlengthatleast(&state->tmp3, 1, _state); rmatrixsetlengthatleast(&state->tmp2, 1, maxsamplesize+1, _state); for(i=0; i<=maxsamplesize-1; i++) { state->tmp0.ptr.p_double[i] = 0.0; state->tmp1.ptr.p_double[i] = _state->v_posinf; } /* * Prepare RNG, seed it with fixed values so * that each run on same problem yeilds same results */ hqrndseed(7235, 98532, &state->agsrs, _state); /* * Prepare initial point subject to current bound constraints and * perform scaling of bound constraints, linear constraints, point itself */ rvectorsetlengthatleast(&state->scaledbndl, n, _state); rvectorsetlengthatleast(&state->scaledbndu, n, _state); for(i=0; i<=n-1; i++) { /* * Check and scale constraints */ if( (state->hasbndl.ptr.p_bool[i]&&state->hasbndu.ptr.p_bool[i])&&ae_fp_less(state->bndu.ptr.p_double[i],state->bndl.ptr.p_double[i]) ) { state->repterminationtype = -3; result = ae_false; return result; } if( state->hasbndl.ptr.p_bool[i] ) { state->scaledbndl.ptr.p_double[i] = state->bndl.ptr.p_double[i]/state->s.ptr.p_double[i]; } else { state->scaledbndl.ptr.p_double[i] = _state->v_neginf; } if( state->hasbndu.ptr.p_bool[i] ) { state->scaledbndu.ptr.p_double[i] = state->bndu.ptr.p_double[i]/state->s.ptr.p_double[i]; } else { state->scaledbndu.ptr.p_double[i] = _state->v_posinf; } if( state->hasbndl.ptr.p_bool[i]&&state->hasbndu.ptr.p_bool[i] ) { ae_assert(ae_fp_less_eq(state->scaledbndl.ptr.p_double[i],state->scaledbndu.ptr.p_double[i]), "MinNS: integrity check failed", _state); } if( (state->hasbndl.ptr.p_bool[i]&&state->hasbndu.ptr.p_bool[i])&&ae_fp_eq(state->bndl.ptr.p_double[i],state->bndu.ptr.p_double[i]) ) { ae_assert(ae_fp_eq(state->scaledbndl.ptr.p_double[i],state->scaledbndu.ptr.p_double[i]), "MinNS: integrity check failed", _state); } /* * Scale and constrain point */ state->xc.ptr.p_double[i] = state->xstart.ptr.p_double[i]; if( state->hasbndl.ptr.p_bool[i]&&ae_fp_less_eq(state->xc.ptr.p_double[i],state->bndl.ptr.p_double[i]) ) { state->xc.ptr.p_double[i] = state->scaledbndl.ptr.p_double[i]; continue; } if( state->hasbndu.ptr.p_bool[i]&&ae_fp_greater_eq(state->xc.ptr.p_double[i],state->bndu.ptr.p_double[i]) ) { state->xc.ptr.p_double[i] = state->scaledbndu.ptr.p_double[i]; continue; } state->xc.ptr.p_double[i] = state->xc.ptr.p_double[i]/state->s.ptr.p_double[i]; if( state->hasbndl.ptr.p_bool[i]&&ae_fp_less_eq(state->xc.ptr.p_double[i],state->scaledbndl.ptr.p_double[i]) ) { state->xc.ptr.p_double[i] = state->scaledbndl.ptr.p_double[i]; } if( state->hasbndu.ptr.p_bool[i]&&ae_fp_greater_eq(state->xc.ptr.p_double[i],state->scaledbndu.ptr.p_double[i]) ) { state->xc.ptr.p_double[i] = state->scaledbndu.ptr.p_double[i]; } } 
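/*
 * The loop below converts the linear constraints to the scaled variables and
 * normalizes them: for row I the coefficients are multiplied componentwise by
 * the scale vector S, the right part is copied as is, and then the entire row
 * (including the right part) is divided by the Euclidean norm of its first N
 * entries whenever that norm is nonzero. Penalty coefficients RhoLinear[] are
 * initialized to zero and increased later, during the main cycle.
 */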
rmatrixsetlengthatleast(&state->scaledcleic, nec+nic, n+1, _state); rvectorsetlengthatleast(&state->rholinear, nec+nic, _state); for(i=0; i<=nec+nic-1; i++) { /* * Initial value of penalty coefficient is zero */ state->rholinear.ptr.p_double[i] = 0.0; /* * Scale and normalize linear constraints */ vv = 0.0; for(j=0; j<=n-1; j++) { v = state->cleic.ptr.pp_double[i][j]*state->s.ptr.p_double[j]; state->scaledcleic.ptr.pp_double[i][j] = v; vv = vv+v*v; } vv = ae_sqrt(vv, _state); state->scaledcleic.ptr.pp_double[i][n] = state->cleic.ptr.pp_double[i][n]; if( ae_fp_greater(vv,(double)(0)) ) { for(j=0; j<=n; j++) { state->scaledcleic.ptr.pp_double[i][j] = state->scaledcleic.ptr.pp_double[i][j]/vv; } } } /* * Main cycle * * We maintain several variables during iteration: * * RecommendedStep- current estimate of recommended step length; * must be Radius0 on first entry * * Radius - current sampling radius * * CurSampleSize - current sample size (may change in future versions) * * FullSample - whether we have full sample, or only partial one * * RadiusDecays - total number of decreases performed for sampling radius */ radius = state->agsradius; radius0 = radius; recommendedstep = ae_minreal(radius0, state->agsinitstp, _state); cursamplesize = 1; radiusdecays = 0; shortstepscnt = 0; fullsample = ae_false; lbl_4: if( ae_false ) { goto lbl_5; } /* * First phase of iteration - central point: * * 1. evaluate function at central point - first entry in sample. * Its status is ignored, it is always recalculated. * 2. report point and check gradient/function value for NAN/INF * 3. check penalty coefficients for linear terms; increase them * if directional derivative of function being optimized (not * merit function!) is larger than derivative of penalty. * 4. update report on constraint violation */ cursamplesize = ae_maxint(cursamplesize, 1, _state); ae_v_move(&state->samplex.ptr.pp_double[0][0], 1, &state->xc.ptr.p_double[0], 1, ae_v_len(0,n-1)); ae_v_move(&state->x.ptr.p_double[0], 1, &state->xc.ptr.p_double[0], 1, ae_v_len(0,n-1)); minns_unscalepointbc(state, &state->x, _state); minns_clearrequestfields(state, _state); state->needfij = ae_true; state->rstateags.stage = 0; goto lbl_rcomm; lbl_0: state->needfij = ae_false; state->replcerr = 0.0; for(i=0; i<=nec+nic-1; i++) { v = -state->scaledcleic.ptr.pp_double[i][n]; for(j=0; j<=n-1; j++) { v = v+state->scaledcleic.ptr.pp_double[i][j]*state->xc.ptr.p_double[j]; } if( i>=nec&&ae_fp_less_eq(v,(double)(0)) ) { continue; } state->replcerr = ae_maxreal(state->replcerr, ae_fabs(v, _state), _state); } state->repnlcerr = 0.0; for(i=1; i<=ng+nh; i++) { v = state->fi.ptr.p_double[i]; if( i>ng&&ae_fp_less_eq(v,(double)(0)) ) { continue; } state->repnlcerr = ae_maxreal(state->repnlcerr, ae_fabs(v, _state), _state); } for(j=0; j<=n-1; j++) { state->grs.ptr.p_double[j] = state->j.ptr.pp_double[0][j]*state->s.ptr.p_double[j]; } minns_generatemeritfunction(state, 0, _state); if( !state->xrep ) { goto lbl_6; } ae_v_move(&state->x.ptr.p_double[0], 1, &state->xc.ptr.p_double[0], 1, ae_v_len(0,n-1)); state->f = state->samplef0.ptr.p_double[0]; minns_unscalepointbc(state, &state->x, _state); minns_clearrequestfields(state, _state); state->xupdated = ae_true; state->rstateags.stage = 1; goto lbl_rcomm; lbl_1: state->xupdated = ae_false; lbl_6: if( state->userterminationneeded ) { /* * User requested termination */ state->repterminationtype = 8; goto lbl_5; } v = (double)(0); for(i=0; i<=n-1; i++) { v = v+ae_sqr(state->samplegm.ptr.pp_double[0][i], _state); } if( 
!ae_isfinite(v, _state)||!ae_isfinite(state->samplef.ptr.p_double[0], _state) ) { /* * Abnormal termination - infinities in function/gradient */ state->repterminationtype = -8; goto lbl_5; } restartneeded = ae_false; for(i=0; i<=nec+nic-1; i++) { /* * Evaluate penalty function. * * Skip update if penalty is satisfied exactly (this check * also covers situations when I-th row is exactly zero). */ v = -state->scaledcleic.ptr.pp_double[i][n]; for(j=0; j<=n-1; j++) { v = v+state->scaledcleic.ptr.pp_double[i][j]*state->xc.ptr.p_double[j]; } if( i<nec&&ae_fp_eq(v,(double)(0)) ) { continue; } if( i>=nec&&ae_fp_less_eq(v,(double)(0)) ) { continue; } /* * Calculate directional derivative, compare it with threshold. * * NOTE: we rely on the fact that ScaledCLEIC is normalized */ ae_assert(ae_fp_greater(state->agspenaltylevel,1.0), "MinNS: integrity error", _state); ae_assert(ae_fp_greater(state->agspenaltyincrease,state->agspenaltylevel), "MinNS: integrity error", _state); v = 0.0; for(j=0; j<=n-1; j++) { v = v+state->grs.ptr.p_double[j]*state->scaledcleic.ptr.pp_double[i][j]; } v = ae_fabs(v, _state); if( ae_fp_greater(v*state->agspenaltylevel,state->rholinear.ptr.p_double[i]) ) { state->rholinear.ptr.p_double[i] = v*state->agspenaltyincrease; restartneeded = ae_true; } } if( restartneeded ) { cursamplesize = 0; goto lbl_4; } /* * Check stopping conditions. */ if( radiusdecays>=state->agsmaxraddecays ) { /* * Too many attempts to decrease radius */ state->repterminationtype = 7; goto lbl_5; } if( state->repinneriterationscount>=state->maxits&&state->maxits>0 ) { /* * Too many iterations */ state->repterminationtype = 5; goto lbl_5; } if( ae_fp_less_eq(radius,state->epsx*state->agsraddecay) ) { /* * Radius is smaller than required step tolerance multiplied by radius decay. * * Additional decay is required in order to make sure that optimization session * with radius equal to EpsX was successfully done. */ state->repterminationtype = 2; goto lbl_5; } /* * Update sample: * * 1. invalidate entries which are too far away from XC * and move all valid entries to beginning of the sample. * 2. add new entries until we have AGSSampleSize * items in our sample. We remove oldest entries from * sample until we have enough place to add at least * AGSMinUpdate items. * 3. prepare "modified" gradient sample with respect to * boundary constraints. */ ae_assert(cursamplesize>=1, "MinNS: integrity check failed", _state); k = 1; for(i=1; i<=cursamplesize-1; i++) { /* * If entry is outside of Radius-ball around XC, discard it. */ v = 0.0; for(j=0; j<=n-1; j++) { v = ae_maxreal(v, ae_fabs(state->samplex.ptr.pp_double[i][j]-state->xc.ptr.p_double[j], _state), _state); } if( ae_fp_greater(v,radius) ) { continue; } /* * If central point is exactly at boundary, and corresponding * component of entry is OUT of boundary, entry is discarded.
*/ b = ae_false; for(j=0; j<=n-1; j++) { b = b||((state->hasbndl.ptr.p_bool[j]&&ae_fp_eq(state->xc.ptr.p_double[j],state->scaledbndl.ptr.p_double[j]))&&ae_fp_neq(state->samplex.ptr.pp_double[i][j],state->scaledbndl.ptr.p_double[j])); b = b||((state->hasbndu.ptr.p_bool[j]&&ae_fp_eq(state->xc.ptr.p_double[j],state->scaledbndu.ptr.p_double[j]))&&ae_fp_neq(state->samplex.ptr.pp_double[i][j],state->scaledbndu.ptr.p_double[j])); } if( b ) { continue; } /* * Move to the beginning */ ae_v_move(&state->samplex.ptr.pp_double[k][0], 1, &state->samplex.ptr.pp_double[i][0], 1, ae_v_len(0,n-1)); ae_v_move(&state->samplegm.ptr.pp_double[k][0], 1, &state->samplegm.ptr.pp_double[i][0], 1, ae_v_len(0,n-1)); state->samplef.ptr.p_double[k] = state->samplef.ptr.p_double[i]; state->samplef0.ptr.p_double[k] = state->samplef0.ptr.p_double[i]; k = k+1; } cursamplesize = k; if( state->agssamplesize-cursamplesize<state->agsminupdate ) { /* * Remove oldest entries */ k = state->agsminupdate-(state->agssamplesize-cursamplesize); ae_assert(k<=cursamplesize-1, "MinNS: integrity check failed", _state); for(i=1; i<=cursamplesize-k-1; i++) { ae_v_move(&state->samplex.ptr.pp_double[i][0], 1, &state->samplex.ptr.pp_double[i+k][0], 1, ae_v_len(0,n-1)); ae_v_move(&state->samplegm.ptr.pp_double[i][0], 1, &state->samplegm.ptr.pp_double[i+k][0], 1, ae_v_len(0,n-1)); state->samplef.ptr.p_double[i] = state->samplef.ptr.p_double[i+k]; state->samplef0.ptr.p_double[i] = state->samplef0.ptr.p_double[i+k]; } cursamplesize = cursamplesize-k; } k = 0; i = cursamplesize; lbl_8: if( i>ae_minint(cursamplesize+state->agsminupdate, state->agssamplesize, _state)-1 ) { goto lbl_10; } for(j=0; j<=n-1; j++) { /* * Undistorted position */ state->samplex.ptr.pp_double[i][j] = state->xc.ptr.p_double[j]; /* * Do not apply distortion, if we are exactly at boundary constraint.
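(Otherwise the new component is drawn uniformly from [XC[j]-Radius,XC[j]] or from [XC[j],XC[j]+Radius], each side being chosen with probability 0.5, and clipped to the scaled bounds - see the code below.)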
*/ if( (state->hasbndl.ptr.p_bool[j]&&state->hasbndu.ptr.p_bool[j])&&ae_fp_eq(state->scaledbndl.ptr.p_double[j],state->scaledbndu.ptr.p_double[j]) ) { continue; } if( state->hasbndl.ptr.p_bool[j]&&ae_fp_eq(state->samplex.ptr.pp_double[i][j],state->scaledbndl.ptr.p_double[j]) ) { continue; } if( state->hasbndu.ptr.p_bool[j]&&ae_fp_eq(state->samplex.ptr.pp_double[i][j],state->scaledbndu.ptr.p_double[j]) ) { continue; } /* * Apply distortion */ if( ae_fp_greater_eq(hqrnduniformr(&state->agsrs, _state),0.5) ) { /* * Sample at the left side with 50% probability */ v0 = state->samplex.ptr.pp_double[i][j]-radius; v1 = state->samplex.ptr.pp_double[i][j]; if( state->hasbndl.ptr.p_bool[j] ) { v0 = ae_maxreal(state->scaledbndl.ptr.p_double[j], v0, _state); } } else { /* * Sample at the right side with 50% probability */ v0 = state->samplex.ptr.pp_double[i][j]; v1 = state->samplex.ptr.pp_double[i][j]+radius; if( state->hasbndu.ptr.p_bool[j] ) { v1 = ae_minreal(state->scaledbndu.ptr.p_double[j], v1, _state); } } ae_assert(ae_fp_greater_eq(v1,v0), "MinNS: integrity check failed", _state); state->samplex.ptr.pp_double[i][j] = boundval(v0+(v1-v0)*hqrnduniformr(&state->agsrs, _state), v0, v1, _state); } ae_v_move(&state->x.ptr.p_double[0], 1, &state->samplex.ptr.pp_double[i][0], 1, ae_v_len(0,n-1)); minns_unscalepointbc(state, &state->x, _state); minns_clearrequestfields(state, _state); state->needfij = ae_true; state->rstateags.stage = 2; goto lbl_rcomm; lbl_2: state->needfij = ae_false; minns_generatemeritfunction(state, i, _state); k = k+1; i = i+1; goto lbl_8; lbl_10: cursamplesize = cursamplesize+k; fullsample = cursamplesize==state->agssamplesize; for(j=0; j<=cursamplesize-1; j++) { /* * For J-th element in gradient sample, process all of its components * and modify them according to status of box constraints */ for(i=0; i<=n-1; i++) { ae_assert(!state->hasbndl.ptr.p_bool[i]||ae_fp_greater_eq(state->xc.ptr.p_double[i],state->scaledbndl.ptr.p_double[i]), "MinNS: integrity error", _state); ae_assert(!state->hasbndu.ptr.p_bool[i]||ae_fp_less_eq(state->xc.ptr.p_double[i],state->scaledbndu.ptr.p_double[i]), "MinNS: integrity error", _state); state->samplegmbc.ptr.pp_double[j][i] = state->samplegm.ptr.pp_double[j][i]; if( (state->hasbndl.ptr.p_bool[i]&&state->hasbndu.ptr.p_bool[i])&&ae_fp_eq(state->scaledbndl.ptr.p_double[i],state->scaledbndu.ptr.p_double[i]) ) { /* * I-th box constraint is of equality type (lower bound matches upper one). * Simplest case, always active. */ state->samplegmbc.ptr.pp_double[j][i] = 0.0; continue; } if( state->hasbndl.ptr.p_bool[i]&&ae_fp_eq(state->xc.ptr.p_double[i],state->scaledbndl.ptr.p_double[i]) ) { /* * We are at lower bound. * * A bit more complex: * * first, we have to activate/deactivate constraint depending on gradient at XC * * second, in any case, I-th column of gradient sample must be non-positive */ if( ae_fp_greater_eq(state->samplegm.ptr.pp_double[0][i],0.0) ) { state->samplegmbc.ptr.pp_double[j][i] = 0.0; } state->samplegmbc.ptr.pp_double[j][i] = ae_minreal(state->samplegmbc.ptr.pp_double[j][i], 0.0, _state); continue; } if( state->hasbndu.ptr.p_bool[i]&&ae_fp_eq(state->xc.ptr.p_double[i],state->scaledbndu.ptr.p_double[i]) ) { /* * We are at upper bound. 
* * A bit more complex: * * first, we have to activate/deactivate constraint depending on gradient at XC * * second, in any case, I-th column of gradient sample must be non-negative */ if( ae_fp_less_eq(state->samplegm.ptr.pp_double[0][i],0.0) ) { state->samplegmbc.ptr.pp_double[j][i] = 0.0; } state->samplegmbc.ptr.pp_double[j][i] = ae_maxreal(state->samplegmbc.ptr.pp_double[j][i], 0.0, _state); continue; } } } /* * Calculate diagonal Hessian. * * This Hessian serves two purposes: * * first, it improves performance of gradient descent step * * second, it improves condition number of QP subproblem * solved to determine step * * The idea is that for each variable we check whether sample * includes entries with alternating sign of gradient: * * if gradients with different signs are present, Hessian * component is set to M/R, where M is a maximum magnitude * of corresponding gradient component, R is a sampling radius. * Note that sign=0 and sign=1 are treated as different ones * * if all gradients have same sign, Hessian component is * set to M/R0, where R0 is initial sampling radius. */ for(j=0; j<=n-1; j++) { state->colmax.ptr.p_double[j] = 0.0; state->signmin.ptr.p_double[j] = (double)(1); state->signmax.ptr.p_double[j] = (double)(-1); } for(i=0; i<=cursamplesize-1; i++) { for(j=0; j<=n-1; j++) { v = state->samplegmbc.ptr.pp_double[i][j]; state->colmax.ptr.p_double[j] = ae_maxreal(state->colmax.ptr.p_double[j], ae_fabs(v, _state), _state); state->signmin.ptr.p_double[j] = ae_minreal(state->signmin.ptr.p_double[j], (double)(ae_sign(v, _state)), _state); state->signmax.ptr.p_double[j] = ae_maxreal(state->signmax.ptr.p_double[j], (double)(ae_sign(v, _state)), _state); } } for(j=0; j<=n-1; j++) { if( ae_fp_neq(state->signmin.ptr.p_double[j],state->signmax.ptr.p_double[j]) ) { /* * Alternating signs of gradient - step is proportional to current sampling radius */ ae_assert(ae_fp_neq(state->colmax.ptr.p_double[j],(double)(0)), "MinNS: integrity check failed", _state); ae_assert(ae_fp_neq(radius,(double)(0)), "MinNS: integrity check failed", _state); state->diagh.ptr.p_double[j] = state->colmax.ptr.p_double[j]/radius; continue; } if( ae_fp_neq(state->colmax.ptr.p_double[j],(double)(0)) ) { /* * Non-alternating sign of gradient, but non-zero. * Step is proportional to initial sampling radius */ ae_assert(ae_fp_neq(radius0,(double)(0)), "MinNS: integrity check failed", _state); state->diagh.ptr.p_double[j] = state->colmax.ptr.p_double[j]/radius0; continue; } state->diagh.ptr.p_double[j] = (double)(1); } /* * PROJECTION PHASE * * We project zero vector on convex hull of gradient sample. * If projection is small enough, we decrease radius and restart. * Otherwise, this phase returns search direction in State.D. * * NOTE: because we use iterative solver, it may have trouble * dealing with ill-conditioned problems. So we also employ * second, backup test for stationarity - when too many * subsequent backtracking searches resulted in short steps. 
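/*
For intuition, in the two-gradient case (and ignoring the diagonal preconditioner DiagH) the projection has a closed form. The following self-contained sketch, which is illustrative only and not part of the solver, computes the coefficient c of the minimum-norm convex combination c*g1+(1-c)*g2:

    #include <algorithm>
    #include <vector>

    // returns c in [0,1] minimizing the squared norm of c*g1 + (1-c)*g2
    static double minnorm_coeff(const std::vector<double> &g1, const std::vector<double> &g2)
    {
        double num = 0.0, den = 0.0;
        for(size_t j=0; j<g1.size(); j++)
        {
            num += g2[j]*(g2[j]-g1[j]);              // stationarity: (c*g1+(1-c)*g2) dot (g1-g2) = 0
            den += (g1[j]-g2[j])*(g1[j]-g2[j]);
        }
        if( den==0.0 )
            return 0.0;                              // g1==g2: any c gives the same point
        return std::max(0.0, std::min(1.0, num/den));
    }

If the resulting combination is (nearly) zero, the sampling radius is decreased, exactly as in the general-case stationarity test performed right below.
*/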
*/ minns_solveqp(&state->samplegmbc, &state->diagh, cursamplesize, n, &state->tmp0, &state->dbgncholesky, &state->nsqp, _state); for(j=0; j<=n-1; j++) { state->d.ptr.p_double[j] = 0.0; } for(i=0; i<=cursamplesize-1; i++) { v = state->tmp0.ptr.p_double[i]; ae_v_addd(&state->d.ptr.p_double[0], 1, &state->samplegmbc.ptr.pp_double[i][0], 1, ae_v_len(0,n-1), v); } v = 0.0; for(j=0; j<=n-1; j++) { v = ae_maxreal(v, ae_fabs(state->d.ptr.p_double[j]/coalesce(state->colmax.ptr.p_double[j], 1.0, _state), _state), _state); } if( ae_fp_less_eq(v,state->agsstattold) ) { /* * Stationarity test succeded. * Decrease radius and restart. * * NOTE: we also clear ShortStepsCnt on restart */ radius = radius*state->agsraddecay; shortstepscnt = 0; inc(&radiusdecays, _state); inc(&state->repinneriterationscount, _state); goto lbl_4; } for(i=0; i<=n-1; i++) { state->d.ptr.p_double[i] = -state->d.ptr.p_double[i]/state->diagh.ptr.p_double[i]; } /* * Perform backtracking line search. * Update initial step length depending on search results. * Here we assume that D is non-zero. * * NOTE: if AGSShortLimit subsequent line searches resulted * in steps shorter than AGSStatTolStp, we decrease radius. */ dnrm = 0.0; dg = 0.0; for(i=0; i<=n-1; i++) { dnrm = dnrm+ae_sqr(state->d.ptr.p_double[i], _state); dg = dg+state->d.ptr.p_double[i]*state->samplegmbc.ptr.pp_double[0][i]; } dnrm = ae_sqrt(dnrm, _state); ae_assert(ae_fp_greater(dnrm,(double)(0)), "MinNS: integrity error", _state); alpha = recommendedstep/dnrm; alphadecreased = ae_false; backtrackits = 0; if( fullsample ) { maxbacktrackits = state->agsmaxbacktrack; } else { maxbacktrackits = state->agsmaxbacktracknonfull; } lbl_11: if( ae_false ) { goto lbl_12; } /* * Prepare XN and evaluate merit function at XN */ ae_v_move(&state->xn.ptr.p_double[0], 1, &state->xc.ptr.p_double[0], 1, ae_v_len(0,n-1)); ae_v_addd(&state->xn.ptr.p_double[0], 1, &state->d.ptr.p_double[0], 1, ae_v_len(0,n-1), alpha); enforceboundaryconstraints(&state->xn, &state->scaledbndl, &state->hasbndl, &state->scaledbndu, &state->hasbndu, n, 0, _state); ae_v_move(&state->samplex.ptr.pp_double[maxsamplesize][0], 1, &state->xn.ptr.p_double[0], 1, ae_v_len(0,n-1)); ae_v_move(&state->x.ptr.p_double[0], 1, &state->xn.ptr.p_double[0], 1, ae_v_len(0,n-1)); minns_unscalepointbc(state, &state->x, _state); minns_clearrequestfields(state, _state); state->needfij = ae_true; state->rstateags.stage = 3; goto lbl_rcomm; lbl_3: state->needfij = ae_false; minns_generatemeritfunction(state, maxsamplesize, _state); /* * Check sufficient decrease condition */ ae_assert(ae_fp_greater(dnrm,(double)(0)), "MinNS: integrity error", _state); if( ae_fp_less_eq(state->samplef.ptr.p_double[maxsamplesize],state->samplef.ptr.p_double[0]+alpha*state->agsdecrease*dg) ) { goto lbl_12; } /* * Decrease Alpha */ alpha = alpha*state->agsalphadecay; alphadecreased = ae_true; /* * Update and check iterations counter. */ inc(&backtrackits, _state); if( backtrackits>=maxbacktrackits ) { /* * Too many backtracking searches performed without success. * Terminate iterations. 
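(Steps accepted by this search must satisfy the sufficient decrease condition SampleF(XN) <= SampleF(XC) + Alpha*AGSDecrease*DG, where DG = (D, SampleGMBC[0]) is the directional derivative computed above; this is exactly the inequality tested a few lines earlier.)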
*/ alpha = 0.0; alphadecreased = ae_true; ae_v_move(&state->xn.ptr.p_double[0], 1, &state->xc.ptr.p_double[0], 1, ae_v_len(0,n-1)); goto lbl_12; } goto lbl_11; lbl_12: if( (ae_fp_less_eq(alpha*dnrm,state->agsshortstpabs)||ae_fp_less_eq(alpha*dnrm,state->agsshortstprel*radius))||ae_fp_less_eq(ae_fabs(state->samplef.ptr.p_double[0]-state->samplef.ptr.p_double[maxsamplesize], _state),state->agsshortf) ) { inc(&shortstepscnt, _state); } else { shortstepscnt = 0; } if( shortstepscnt>=state->agsshortlimit ) { /* * Too many subsequent short steps. * * It may be possible that optimizer is unable to find out * that we have to decrease radius because of ill-conditioned * gradients. * * Decrease radius and restart. */ radius = radius*state->agsraddecay; shortstepscnt = 0; inc(&radiusdecays, _state); inc(&state->repinneriterationscount, _state); goto lbl_4; } if( !alphadecreased ) { recommendedstep = recommendedstep*2.0; } if( alphadecreased&&fullsample ) { recommendedstep = recommendedstep*0.5; } /* * Next iteration */ ae_v_move(&state->xc.ptr.p_double[0], 1, &state->xn.ptr.p_double[0], 1, ae_v_len(0,n-1)); inc(&state->repinneriterationscount, _state); goto lbl_4; lbl_5: /* * Convert back from scaled to unscaled representation */ minns_unscalepointbc(state, &state->xc, _state); result = ae_false; return result; /* * Saving state */ lbl_rcomm: result = ae_true; state->rstateags.ia.ptr.p_int[0] = n; state->rstateags.ia.ptr.p_int[1] = nec; state->rstateags.ia.ptr.p_int[2] = nic; state->rstateags.ia.ptr.p_int[3] = ng; state->rstateags.ia.ptr.p_int[4] = nh; state->rstateags.ia.ptr.p_int[5] = i; state->rstateags.ia.ptr.p_int[6] = j; state->rstateags.ia.ptr.p_int[7] = k; state->rstateags.ia.ptr.p_int[8] = radiusdecays; state->rstateags.ia.ptr.p_int[9] = maxsamplesize; state->rstateags.ia.ptr.p_int[10] = cursamplesize; state->rstateags.ia.ptr.p_int[11] = shortstepscnt; state->rstateags.ia.ptr.p_int[12] = backtrackits; state->rstateags.ia.ptr.p_int[13] = maxbacktrackits; state->rstateags.ba.ptr.p_bool[0] = restartneeded; state->rstateags.ba.ptr.p_bool[1] = b; state->rstateags.ba.ptr.p_bool[2] = alphadecreased; state->rstateags.ba.ptr.p_bool[3] = fullsample; state->rstateags.ra.ptr.p_double[0] = radius0; state->rstateags.ra.ptr.p_double[1] = radius; state->rstateags.ra.ptr.p_double[2] = alpha; state->rstateags.ra.ptr.p_double[3] = recommendedstep; state->rstateags.ra.ptr.p_double[4] = dnrm; state->rstateags.ra.ptr.p_double[5] = dg; state->rstateags.ra.ptr.p_double[6] = v; state->rstateags.ra.ptr.p_double[7] = vv; state->rstateags.ra.ptr.p_double[8] = v0; state->rstateags.ra.ptr.p_double[9] = v1; return result; } /************************************************************************* This function calculates merit function (target function + penalties for violation of non-box constraints), using State.X (unscaled), State.Fi, State.J (unscaled) and State.SampleX (scaled) as inputs. 
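The merit function computed below is an L1 exact penalty: SampleF = f(x) + SUM(RhoLinear[i]*|residual of I-th linear constraint|, taken over equality rows and violated inequality rows) + AGSRhoNonLinear*SUM(|Fi[i]|, taken over nonzero nonlinear equality residuals and violated nonlinear inequalities), while SampleF0 stores the unpenalized f(x) and SampleGM receives the gradient of SampleF in the scaled variables (a restatement of the code below).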
Results are loaded: * target function value - to State.SampleF0[SampleIdx] * merit function value - to State.SampleF[SampleIdx] * gradient of merit function - to State.SampleGM[SampleIdx] -- ALGLIB -- Copyright 02.06.2015 by Bochkanov Sergey *************************************************************************/ static void minns_generatemeritfunction(minnsstate* state, ae_int_t sampleidx, ae_state *_state) { ae_int_t n; ae_int_t i; ae_int_t j; ae_int_t nec; ae_int_t nic; ae_int_t ng; ae_int_t nh; double v; double s; n = state->n; nec = state->nec; nic = state->nic; ng = state->ng; nh = state->nh; /* * Integrity check */ for(i=0; i<=n-1; i++) { ae_assert(!state->hasbndl.ptr.p_bool[i]||ae_fp_greater_eq(state->x.ptr.p_double[i],state->bndl.ptr.p_double[i]), "MinNS: integrity error", _state); ae_assert(!state->hasbndu.ptr.p_bool[i]||ae_fp_less_eq(state->x.ptr.p_double[i],state->bndu.ptr.p_double[i]), "MinNS: integrity error", _state); } /* * Prepare "raw" function */ state->samplef.ptr.p_double[sampleidx] = state->fi.ptr.p_double[0]; state->samplef0.ptr.p_double[sampleidx] = state->fi.ptr.p_double[0]; for(j=0; j<=n-1; j++) { state->samplegm.ptr.pp_double[sampleidx][j] = state->j.ptr.pp_double[0][j]*state->s.ptr.p_double[j]; } /* * Modify merit function with linear constraints */ for(i=0; i<=nec+nic-1; i++) { v = -state->scaledcleic.ptr.pp_double[i][n]; for(j=0; j<=n-1; j++) { v = v+state->scaledcleic.ptr.pp_double[i][j]*state->samplex.ptr.pp_double[sampleidx][j]; } if( i>=nec&&ae_fp_less(v,(double)(0)) ) { continue; } state->samplef.ptr.p_double[sampleidx] = state->samplef.ptr.p_double[sampleidx]+state->rholinear.ptr.p_double[i]*ae_fabs(v, _state); s = (double)(ae_sign(v, _state)); for(j=0; j<=n-1; j++) { state->samplegm.ptr.pp_double[sampleidx][j] = state->samplegm.ptr.pp_double[sampleidx][j]+state->rholinear.ptr.p_double[i]*s*state->scaledcleic.ptr.pp_double[i][j]; } } /* * Modify merit function with nonlinear constraints */ for(i=1; i<=ng+nh; i++) { v = state->fi.ptr.p_double[i]; if( i<=ng&&ae_fp_eq(v,(double)(0)) ) { continue; } if( i>ng&&ae_fp_less_eq(v,(double)(0)) ) { continue; } state->samplef.ptr.p_double[sampleidx] = state->samplef.ptr.p_double[sampleidx]+state->agsrhononlinear*ae_fabs(v, _state); s = (double)(ae_sign(v, _state)); for(j=0; j<=n-1; j++) { state->samplegm.ptr.pp_double[sampleidx][j] = state->samplegm.ptr.pp_double[sampleidx][j]+state->agsrhononlinear*s*state->j.ptr.pp_double[i][j]*state->s.ptr.p_double[j]; } } } /************************************************************************* This function performs transformation of X from scaled coordinates to unscaled ones, paying special attention to box constraints: * points which were exactly at the boundary before scaling will be mapped to corresponding boundary after scaling * in any case, unscaled box constraints will be satisfied -- ALGLIB -- Copyright 02.06.2015 by Bochkanov Sergey *************************************************************************/ static void minns_unscalepointbc(minnsstate* state, /* Real */ ae_vector* x, ae_state *_state) { ae_int_t i; for(i=0; i<=state->n-1; i++) { if( state->hasbndl.ptr.p_bool[i]&&ae_fp_less_eq(x->ptr.p_double[i],state->scaledbndl.ptr.p_double[i]) ) { x->ptr.p_double[i] = state->bndl.ptr.p_double[i]; continue; } if( state->hasbndu.ptr.p_bool[i]&&ae_fp_greater_eq(x->ptr.p_double[i],state->scaledbndu.ptr.p_double[i]) ) { x->ptr.p_double[i] = state->bndu.ptr.p_double[i]; continue; } x->ptr.p_double[i] = x->ptr.p_double[i]*state->s.ptr.p_double[i]; if( 
state->hasbndl.ptr.p_bool[i]&&ae_fp_less_eq(x->ptr.p_double[i],state->bndl.ptr.p_double[i]) ) { x->ptr.p_double[i] = state->bndl.ptr.p_double[i]; } if( state->hasbndu.ptr.p_bool[i]&&ae_fp_greater_eq(x->ptr.p_double[i],state->bndu.ptr.p_double[i]) ) { x->ptr.p_double[i] = state->bndu.ptr.p_double[i]; } } } /************************************************************************* This function solves QP problem of the form [ ] min [ 0.5*c'*(G*inv(H)*G')*c ] s.t. c[i]>=0, SUM(c[i])=1.0 [ ] where G is stored in SampleG[] array, diagonal H is stored in DiagH[]. DbgNCholesky is incremented every time we perform Cholesky decomposition. -- ALGLIB -- Copyright 02.06.2015 by Bochkanov Sergey *************************************************************************/ static void minns_solveqp(/* Real */ ae_matrix* sampleg, /* Real */ ae_vector* diagh, ae_int_t nsample, ae_int_t nvars, /* Real */ ae_vector* coeffs, ae_int_t* dbgncholesky, minnsqp* state, ae_state *_state) { ae_int_t i; ae_int_t j; ae_int_t k; double v; double vv; ae_int_t n; ae_int_t idx0; ae_int_t idx1; ae_int_t ncandbnd; ae_int_t innerits; ae_int_t outerits; double dnrm; double stp; double stpmax; ae_int_t actidx; double dtol; ae_bool kickneeded; double kicklength; double lambdav; double maxdiag; ae_bool wasactivation; ae_bool werechanges; ae_int_t termcnt; n = nsample; /* * Allocate arrays, prepare data */ rvectorsetlengthatleast(coeffs, n, _state); rvectorsetlengthatleast(&state->xc, n, _state); rvectorsetlengthatleast(&state->xn, n, _state); rvectorsetlengthatleast(&state->x0, n, _state); rvectorsetlengthatleast(&state->gc, n, _state); rvectorsetlengthatleast(&state->d, n, _state); rmatrixsetlengthatleast(&state->uh, n, n, _state); rmatrixsetlengthatleast(&state->ch, n, n, _state); rmatrixsetlengthatleast(&state->rk, nsample, nvars, _state); rvectorsetlengthatleast(&state->invutc, n, _state); rvectorsetlengthatleast(&state->tmp0, n, _state); bvectorsetlengthatleast(&state->tmpb, n, _state); for(i=0; i<=n-1; i++) { state->xc.ptr.p_double[i] = 1.0/n; coeffs->ptr.p_double[i] = 1.0/n; } for(i=0; i<=nsample-1; i++) { for(j=0; j<=nvars-1; j++) { state->rk.ptr.pp_double[i][j] = sampleg->ptr.pp_double[i][j]/ae_sqrt(diagh->ptr.p_double[j], _state); } } rmatrixsyrk(nsample, nvars, 1.0, &state->rk, 0, 0, 0, 0.0, &state->uh, 0, 0, ae_true, _state); maxdiag = 0.0; for(i=0; i<=nsample-1; i++) { maxdiag = ae_maxreal(maxdiag, state->uh.ptr.pp_double[i][i], _state); } maxdiag = coalesce(maxdiag, 1.0, _state); /* * Main cycle: */ innerits = 0; outerits = 0; dtol = 1.0E5*ae_machineepsilon; kicklength = ae_machineepsilon; lambdav = 1.0E5*ae_machineepsilon; termcnt = 0; for(;;) { /* * Save current point to X0 */ ae_v_move(&state->x0.ptr.p_double[0], 1, &state->xc.ptr.p_double[0], 1, ae_v_len(0,n-1)); /* * Calculate gradient at initial point, solve NNLS problem * to determine descent direction D subject to constraints. * * In order to do so we solve following constrained * minimization problem: * ( )^2 * min ( SUM(lambda[i]*A[i]) + G ) * ( ) * Here: * * G is a gradient (column vector) * * A[i] is a column vector of I-th constraint * * lambda[i] is a Lagrange multiplier corresponding to I-th constraint * * NOTE: all A[i] except for last one have only one element being set, * so we rely on sparse capabilities of NNLS solver. However, * in order to use these capabilities we have to reorder variables * in such way that sparse ones come first. 
* * After finding lambda[] coefficients, we can find constrained descent * direction by subtracting lambda[i]*A[i] from D=-G. We make use of the * fact that first NCandBnd columns are just columns of identity matrix, * so we can perform exact projection by explicitly setting elements of D * to zeros. */ minns_qpcalculategradfunc(sampleg, diagh, nsample, nvars, &state->xc, &state->gc, &state->fc, &state->tmp0, _state); ivectorsetlengthatleast(&state->tmpidx, n, _state); rvectorsetlengthatleast(&state->tmpd, n, _state); rmatrixsetlengthatleast(&state->tmpc2, n, 1, _state); idx0 = 0; ncandbnd = 0; for(i=0; i<=n-1; i++) { if( ae_fp_eq(state->xc.ptr.p_double[i],0.0) ) { ncandbnd = ncandbnd+1; } } idx1 = ncandbnd; for(i=0; i<=n-1; i++) { if( ae_fp_eq(state->xc.ptr.p_double[i],0.0) ) { /* * Candidate for activation of boundary constraint, * comes first. * * NOTE: multiplication by -1 is due to the fact that * it is lower bound, and has specific direction * of constraint gradient. */ state->tmpidx.ptr.p_int[idx0] = i; state->tmpd.ptr.p_double[idx0] = (-state->gc.ptr.p_double[i])*(-1); state->tmpc2.ptr.pp_double[idx0][0] = 1.0*(-1); idx0 = idx0+1; } else { /* * We are far away from boundary. */ state->tmpidx.ptr.p_int[idx1] = i; state->tmpd.ptr.p_double[idx1] = -state->gc.ptr.p_double[i]; state->tmpc2.ptr.pp_double[idx1][0] = 1.0; idx1 = idx1+1; } } ae_assert(idx0==ncandbnd, "MinNSQP: integrity check failed", _state); ae_assert(idx1==n, "MinNSQP: integrity check failed", _state); snnlsinit(n, 1, n, &state->nnls, _state); snnlssetproblem(&state->nnls, &state->tmpc2, &state->tmpd, ncandbnd, 1, n, _state); snnlsdropnnc(&state->nnls, ncandbnd, _state); snnlssolve(&state->nnls, &state->tmplambdas, _state); for(i=0; i<=n-1; i++) { state->d.ptr.p_double[i] = -state->gc.ptr.p_double[i]-state->tmplambdas.ptr.p_double[ncandbnd]; } for(i=0; i<=ncandbnd-1; i++) { if( ae_fp_greater(state->tmplambdas.ptr.p_double[i],(double)(0)) ) { state->d.ptr.p_double[state->tmpidx.ptr.p_int[i]] = 0.0; } } /* * Additional stage to "polish" D (improve situation * with sum-to-one constraint and boundary constraints) * and to perform additional integrity check. * * After this stage we are pretty sure that: * * if x[i]=0.0, then d[i]>=0.0 * * if d[i]<0.0, then x[i]>0.0 */ v = 0.0; vv = 0.0; for(i=0; i<=n-1; i++) { if( ae_fp_eq(state->xc.ptr.p_double[i],0.0)&&ae_fp_less(state->d.ptr.p_double[i],0.0) ) { state->d.ptr.p_double[i] = 0.0; } v = v+state->d.ptr.p_double[i]; vv = ae_maxreal(vv, ae_fabs(state->gc.ptr.p_double[i], _state), _state); } ae_assert(ae_fp_less(ae_fabs(v, _state),1.0E5*ae_sqrt((double)(n), _state)*ae_machineepsilon*ae_maxreal(vv, 1.0, _state)), "MinNSQP: integrity check failed", _state); /* * Decide whether we need "kick" stage: special stage * that moves us away from boundary constraints which are * not strictly active (i.e. such constraints that x[i]=0.0 and d[i]>0). * * If we need kick stage, we make a kick - and restart iteration. * If not, after this block we can rely on the fact that * for all x[i]=0.0 we have d[i]=0.0 */ kickneeded = ae_false; for(i=0; i<=n-1; i++) { if( ae_fp_eq(state->xc.ptr.p_double[i],0.0)&&ae_fp_greater(state->d.ptr.p_double[i],0.0) ) { kickneeded = ae_true; } } if( kickneeded ) { /* * Perform kick. * Restart. * Do not increase outer iterations counter. 
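(The kick below moves every component with XC[i]=0 and D[i]>0 to KickLength and then renormalizes XC so that its components again sum to one.)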
*/ v = 0.0; for(i=0; i<=n-1; i++) { if( ae_fp_eq(state->xc.ptr.p_double[i],0.0)&&ae_fp_greater(state->d.ptr.p_double[i],0.0) ) { state->xc.ptr.p_double[i] = state->xc.ptr.p_double[i]+kicklength; } v = v+state->xc.ptr.p_double[i]; } ae_assert(ae_fp_greater(v,0.0), "MinNSQP: integrity check failed", _state); for(i=0; i<=n-1; i++) { state->xc.ptr.p_double[i] = state->xc.ptr.p_double[i]/v; } inc(&innerits, _state); continue; } /* * Calculate Cholesky decomposition of constrained Hessian * for Newton phase. */ for(;;) { for(i=0; i<=n-1; i++) { /* * Diagonal element */ if( ae_fp_greater(state->xc.ptr.p_double[i],0.0) ) { state->ch.ptr.pp_double[i][i] = state->uh.ptr.pp_double[i][i]+lambdav*maxdiag; } else { state->ch.ptr.pp_double[i][i] = 1.0; } /* * Offdiagonal elements */ for(j=i+1; j<=n-1; j++) { if( ae_fp_greater(state->xc.ptr.p_double[i],0.0)&&ae_fp_greater(state->xc.ptr.p_double[j],0.0) ) { state->ch.ptr.pp_double[i][j] = state->uh.ptr.pp_double[i][j]; } else { state->ch.ptr.pp_double[i][j] = 0.0; } } } inc(dbgncholesky, _state); if( !spdmatrixcholeskyrec(&state->ch, 0, n, ae_true, &state->tmp0, _state) ) { /* * Cholesky decomposition failed. * Increase LambdaV and repeat iteration. * Do not increase outer iterations counter. */ lambdav = lambdav*10; continue; } break; } /* * Newton phase */ for(;;) { /* * Calculate constrained (equality and sum-to-one) descent direction D. * * Here we use Sherman-Morrison update to calculate direction subject to * sum-to-one constraint. */ minns_qpcalculategradfunc(sampleg, diagh, nsample, nvars, &state->xc, &state->gc, &state->fc, &state->tmp0, _state); for(i=0; i<=n-1; i++) { if( ae_fp_greater(state->xc.ptr.p_double[i],0.0) ) { state->invutc.ptr.p_double[i] = 1.0; state->d.ptr.p_double[i] = -state->gc.ptr.p_double[i]; } else { state->invutc.ptr.p_double[i] = 0.0; state->d.ptr.p_double[i] = 0.0; } } minns_qpsolveut(&state->ch, n, &state->invutc, _state); minns_qpsolveut(&state->ch, n, &state->d, _state); v = 0.0; vv = 0.0; for(i=0; i<=n-1; i++) { vv = vv+ae_sqr(state->invutc.ptr.p_double[i], _state); v = v+state->invutc.ptr.p_double[i]*state->d.ptr.p_double[i]; } for(i=0; i<=n-1; i++) { state->d.ptr.p_double[i] = state->d.ptr.p_double[i]-v/vv*state->invutc.ptr.p_double[i]; } minns_qpsolveu(&state->ch, n, &state->d, _state); v = 0.0; k = 0; for(i=0; i<=n-1; i++) { v = v+state->d.ptr.p_double[i]; if( ae_fp_neq(state->d.ptr.p_double[i],0.0) ) { k = k+1; } } if( k>0&&ae_fp_greater(v,0.0) ) { vv = v/k; for(i=0; i<=n-1; i++) { if( ae_fp_neq(state->d.ptr.p_double[i],0.0) ) { state->d.ptr.p_double[i] = state->d.ptr.p_double[i]-vv; } } } /* * Calculate length of D, maximum step and component which is * activated by this step. * * Break if D is exactly zero. We do not break here if DNrm is * small - this check is performed later. It is important to * perform last step with nearly-zero D, it allows us to have * extra-precision in solution which is often needed for convergence * of AGS algorithm. */ dnrm = 0.0; for(i=0; i<=n-1; i++) { dnrm = dnrm+ae_sqr(state->d.ptr.p_double[i], _state); } dnrm = ae_sqrt(dnrm, _state); actidx = -1; stpmax = 1.0E50; for(i=0; i<=n-1; i++) { if( ae_fp_less(state->d.ptr.p_double[i],0.0) ) { v = stpmax; stpmax = safeminposrv(state->xc.ptr.p_double[i], -state->d.ptr.p_double[i], stpmax, _state); if( ae_fp_less(stpmax,v) ) { actidx = i; } } } if( ae_fp_eq(dnrm,0.0) ) { break; } /* * Calculate trial function value at unconstrained full step. * If trial value is greater or equal to FC, terminate iterations. 
*/ for(i=0; i<=n-1; i++) { state->xn.ptr.p_double[i] = state->xc.ptr.p_double[i]+1.0*state->d.ptr.p_double[i]; } minns_qpcalculatefunc(sampleg, diagh, nsample, nvars, &state->xn, &state->fn, &state->tmp0, _state); if( ae_fp_greater_eq(state->fn,state->fc) ) { break; } /* * Perform step * Update Hessian * Update XC * * Break if: * a) no constraint was activated * b) norm of D is small enough */ stp = ae_minreal(1.0, stpmax, _state); for(i=0; i<=n-1; i++) { state->xn.ptr.p_double[i] = ae_maxreal(state->xc.ptr.p_double[i]+stp*state->d.ptr.p_double[i], 0.0, _state); } if( ae_fp_eq(stp,stpmax)&&actidx>=0 ) { state->xn.ptr.p_double[actidx] = 0.0; } wasactivation = ae_false; for(i=0; i<=n-1; i++) { state->tmpb.ptr.p_bool[i] = ae_fp_eq(state->xn.ptr.p_double[i],0.0)&&ae_fp_neq(state->xc.ptr.p_double[i],0.0); wasactivation = wasactivation||state->tmpb.ptr.p_bool[i]; } ae_v_move(&state->xc.ptr.p_double[0], 1, &state->xn.ptr.p_double[0], 1, ae_v_len(0,n-1)); if( !wasactivation ) { break; } if( ae_fp_less_eq(dnrm,dtol) ) { break; } spdmatrixcholeskyupdatefixbuf(&state->ch, n, ae_true, &state->tmpb, &state->tmp0, _state); } /* * Compare status of boundary constraints - if nothing changed during * last outer iteration, TermCnt is increased. Otherwise it is reset * to zero. * * When TermCnt is large enough, we terminate algorithm. */ werechanges = ae_false; for(i=0; i<=n-1; i++) { werechanges = werechanges||ae_sign(state->x0.ptr.p_double[i], _state)!=ae_sign(state->xc.ptr.p_double[i], _state); } if( !werechanges ) { inc(&termcnt, _state); } else { termcnt = 0; } if( termcnt>=2 ) { break; } /* * Increase number of outer iterations. * Break if we performed too many. */ inc(&outerits, _state); if( outerits==10 ) { break; } } /* * Store result */ for(i=0; i<=n-1; i++) { coeffs->ptr.p_double[i] = state->xc.ptr.p_double[i]; } } /************************************************************************* Function/gradient calculation for QP solver. -- ALGLIB -- Copyright 02.06.2015 by Bochkanov Sergey *************************************************************************/ static void minns_qpcalculategradfunc(/* Real */ ae_matrix* sampleg, /* Real */ ae_vector* diagh, ae_int_t nsample, ae_int_t nvars, /* Real */ ae_vector* coeffs, /* Real */ ae_vector* g, double* f, /* Real */ ae_vector* tmp, ae_state *_state) { ae_int_t i; ae_int_t j; double v; *f = 0; rvectorsetlengthatleast(g, nsample, _state); rvectorsetlengthatleast(tmp, nvars, _state); /* * Calculate GS*p */ for(j=0; j<=nvars-1; j++) { tmp->ptr.p_double[j] = 0.0; } for(i=0; i<=nsample-1; i++) { v = coeffs->ptr.p_double[i]; ae_v_addd(&tmp->ptr.p_double[0], 1, &sampleg->ptr.pp_double[i][0], 1, ae_v_len(0,nvars-1), v); } /* * Calculate F */ *f = 0.0; for(i=0; i<=nvars-1; i++) { *f = *f+0.5*ae_sqr(tmp->ptr.p_double[i], _state)/diagh->ptr.p_double[i]; } /* * Multiply by inverse Hessian */ for(i=0; i<=nvars-1; i++) { tmp->ptr.p_double[i] = tmp->ptr.p_double[i]/diagh->ptr.p_double[i]; } /* * Function gradient */ for(i=0; i<=nsample-1; i++) { v = ae_v_dotproduct(&sampleg->ptr.pp_double[i][0], 1, &tmp->ptr.p_double[0], 1, ae_v_len(0,nvars-1)); g->ptr.p_double[i] = v; } } /************************************************************************* Function calculation for QP solver. 
-- ALGLIB -- Copyright 02.06.2015 by Bochkanov Sergey *************************************************************************/ static void minns_qpcalculatefunc(/* Real */ ae_matrix* sampleg, /* Real */ ae_vector* diagh, ae_int_t nsample, ae_int_t nvars, /* Real */ ae_vector* coeffs, double* f, /* Real */ ae_vector* tmp, ae_state *_state) { ae_int_t i; ae_int_t j; double v; *f = 0; rvectorsetlengthatleast(tmp, nvars, _state); /* * Calculate GS*p */ for(j=0; j<=nvars-1; j++) { tmp->ptr.p_double[j] = 0.0; } for(i=0; i<=nsample-1; i++) { v = coeffs->ptr.p_double[i]; ae_v_addd(&tmp->ptr.p_double[0], 1, &sampleg->ptr.pp_double[i][0], 1, ae_v_len(0,nvars-1), v); } /* * Calculate F */ *f = 0.0; for(i=0; i<=nvars-1; i++) { *f = *f+0.5*ae_sqr(tmp->ptr.p_double[i], _state)/diagh->ptr.p_double[i]; } } /************************************************************************* Triangular solver for QP solver. -- ALGLIB -- Copyright 02.06.2015 by Bochkanov Sergey *************************************************************************/ static void minns_qpsolveu(/* Real */ ae_matrix* a, ae_int_t n, /* Real */ ae_vector* x, ae_state *_state) { ae_int_t i; ae_int_t j; double v; /* * A^(-1)*X */ for(i=n-1; i>=0; i--) { v = x->ptr.p_double[i]; for(j=i+1; j<=n-1; j++) { v = v-a->ptr.pp_double[i][j]*x->ptr.p_double[j]; } x->ptr.p_double[i] = v/a->ptr.pp_double[i][i]; } } /************************************************************************* Triangular solver for QP solver. -- ALGLIB -- Copyright 02.06.2015 by Bochkanov Sergey *************************************************************************/ static void minns_qpsolveut(/* Real */ ae_matrix* a, ae_int_t n, /* Real */ ae_vector* x, ae_state *_state) { ae_int_t i; ae_int_t j; double v; /* * A^(-T)*X */ for(i=0; i<=n-1; i++) { x->ptr.p_double[i] = x->ptr.p_double[i]/a->ptr.pp_double[i][i]; v = x->ptr.p_double[i]; for(j=i+1; j<=n-1; j++) { x->ptr.p_double[j] = x->ptr.p_double[j]-a->ptr.pp_double[i][j]*v; } } } void _minnsqp_init(void* _p, ae_state *_state) { minnsqp *p = (minnsqp*)_p; ae_touch_ptr((void*)p); ae_vector_init(&p->xc, 0, DT_REAL, _state); ae_vector_init(&p->xn, 0, DT_REAL, _state); ae_vector_init(&p->x0, 0, DT_REAL, _state); ae_vector_init(&p->gc, 0, DT_REAL, _state); ae_vector_init(&p->d, 0, DT_REAL, _state); ae_matrix_init(&p->uh, 0, 0, DT_REAL, _state); ae_matrix_init(&p->ch, 0, 0, DT_REAL, _state); ae_matrix_init(&p->rk, 0, 0, DT_REAL, _state); ae_vector_init(&p->invutc, 0, DT_REAL, _state); ae_vector_init(&p->tmp0, 0, DT_REAL, _state); ae_vector_init(&p->tmpidx, 0, DT_INT, _state); ae_vector_init(&p->tmpd, 0, DT_REAL, _state); ae_vector_init(&p->tmpc, 0, DT_REAL, _state); ae_vector_init(&p->tmplambdas, 0, DT_REAL, _state); ae_matrix_init(&p->tmpc2, 0, 0, DT_REAL, _state); ae_vector_init(&p->tmpb, 0, DT_BOOL, _state); _snnlssolver_init(&p->nnls, _state); } void _minnsqp_init_copy(void* _dst, void* _src, ae_state *_state) { minnsqp *dst = (minnsqp*)_dst; minnsqp *src = (minnsqp*)_src; dst->fc = src->fc; dst->fn = src->fn; ae_vector_init_copy(&dst->xc, &src->xc, _state); ae_vector_init_copy(&dst->xn, &src->xn, _state); ae_vector_init_copy(&dst->x0, &src->x0, _state); ae_vector_init_copy(&dst->gc, &src->gc, _state); ae_vector_init_copy(&dst->d, &src->d, _state); ae_matrix_init_copy(&dst->uh, &src->uh, _state); ae_matrix_init_copy(&dst->ch, &src->ch, _state); ae_matrix_init_copy(&dst->rk, &src->rk, _state); ae_vector_init_copy(&dst->invutc, &src->invutc, _state); ae_vector_init_copy(&dst->tmp0, &src->tmp0, _state); 
ae_vector_init_copy(&dst->tmpidx, &src->tmpidx, _state); ae_vector_init_copy(&dst->tmpd, &src->tmpd, _state); ae_vector_init_copy(&dst->tmpc, &src->tmpc, _state); ae_vector_init_copy(&dst->tmplambdas, &src->tmplambdas, _state); ae_matrix_init_copy(&dst->tmpc2, &src->tmpc2, _state); ae_vector_init_copy(&dst->tmpb, &src->tmpb, _state); _snnlssolver_init_copy(&dst->nnls, &src->nnls, _state); } void _minnsqp_clear(void* _p) { minnsqp *p = (minnsqp*)_p; ae_touch_ptr((void*)p); ae_vector_clear(&p->xc); ae_vector_clear(&p->xn); ae_vector_clear(&p->x0); ae_vector_clear(&p->gc); ae_vector_clear(&p->d); ae_matrix_clear(&p->uh); ae_matrix_clear(&p->ch); ae_matrix_clear(&p->rk); ae_vector_clear(&p->invutc); ae_vector_clear(&p->tmp0); ae_vector_clear(&p->tmpidx); ae_vector_clear(&p->tmpd); ae_vector_clear(&p->tmpc); ae_vector_clear(&p->tmplambdas); ae_matrix_clear(&p->tmpc2); ae_vector_clear(&p->tmpb); _snnlssolver_clear(&p->nnls); } void _minnsqp_destroy(void* _p) { minnsqp *p = (minnsqp*)_p; ae_touch_ptr((void*)p); ae_vector_destroy(&p->xc); ae_vector_destroy(&p->xn); ae_vector_destroy(&p->x0); ae_vector_destroy(&p->gc); ae_vector_destroy(&p->d); ae_matrix_destroy(&p->uh); ae_matrix_destroy(&p->ch); ae_matrix_destroy(&p->rk); ae_vector_destroy(&p->invutc); ae_vector_destroy(&p->tmp0); ae_vector_destroy(&p->tmpidx); ae_vector_destroy(&p->tmpd); ae_vector_destroy(&p->tmpc); ae_vector_destroy(&p->tmplambdas); ae_matrix_destroy(&p->tmpc2); ae_vector_destroy(&p->tmpb); _snnlssolver_destroy(&p->nnls); } void _minnsstate_init(void* _p, ae_state *_state) { minnsstate *p = (minnsstate*)_p; ae_touch_ptr((void*)p); ae_vector_init(&p->s, 0, DT_REAL, _state); ae_vector_init(&p->bndl, 0, DT_REAL, _state); ae_vector_init(&p->bndu, 0, DT_REAL, _state); ae_vector_init(&p->hasbndl, 0, DT_BOOL, _state); ae_vector_init(&p->hasbndu, 0, DT_BOOL, _state); ae_matrix_init(&p->cleic, 0, 0, DT_REAL, _state); ae_vector_init(&p->x, 0, DT_REAL, _state); ae_vector_init(&p->fi, 0, DT_REAL, _state); ae_matrix_init(&p->j, 0, 0, DT_REAL, _state); _rcommstate_init(&p->rstate, _state); _rcommstate_init(&p->rstateags, _state); _hqrndstate_init(&p->agsrs, _state); ae_vector_init(&p->xstart, 0, DT_REAL, _state); ae_vector_init(&p->xc, 0, DT_REAL, _state); ae_vector_init(&p->xn, 0, DT_REAL, _state); ae_vector_init(&p->grs, 0, DT_REAL, _state); ae_vector_init(&p->d, 0, DT_REAL, _state); ae_vector_init(&p->colmax, 0, DT_REAL, _state); ae_vector_init(&p->diagh, 0, DT_REAL, _state); ae_vector_init(&p->signmin, 0, DT_REAL, _state); ae_vector_init(&p->signmax, 0, DT_REAL, _state); ae_vector_init(&p->scaledbndl, 0, DT_REAL, _state); ae_vector_init(&p->scaledbndu, 0, DT_REAL, _state); ae_matrix_init(&p->scaledcleic, 0, 0, DT_REAL, _state); ae_vector_init(&p->rholinear, 0, DT_REAL, _state); ae_matrix_init(&p->samplex, 0, 0, DT_REAL, _state); ae_matrix_init(&p->samplegm, 0, 0, DT_REAL, _state); ae_matrix_init(&p->samplegmbc, 0, 0, DT_REAL, _state); ae_vector_init(&p->samplef, 0, DT_REAL, _state); ae_vector_init(&p->samplef0, 0, DT_REAL, _state); _minnsqp_init(&p->nsqp, _state); ae_vector_init(&p->tmp0, 0, DT_REAL, _state); ae_vector_init(&p->tmp1, 0, DT_REAL, _state); ae_matrix_init(&p->tmp2, 0, 0, DT_REAL, _state); ae_vector_init(&p->tmp3, 0, DT_INT, _state); ae_vector_init(&p->xbase, 0, DT_REAL, _state); ae_vector_init(&p->fp, 0, DT_REAL, _state); ae_vector_init(&p->fm, 0, DT_REAL, _state); } void _minnsstate_init_copy(void* _dst, void* _src, ae_state *_state) { minnsstate *dst = (minnsstate*)_dst; minnsstate *src = (minnsstate*)_src; 
dst->solvertype = src->solvertype; dst->n = src->n; dst->epsx = src->epsx; dst->maxits = src->maxits; dst->xrep = src->xrep; dst->diffstep = src->diffstep; ae_vector_init_copy(&dst->s, &src->s, _state); ae_vector_init_copy(&dst->bndl, &src->bndl, _state); ae_vector_init_copy(&dst->bndu, &src->bndu, _state); ae_vector_init_copy(&dst->hasbndl, &src->hasbndl, _state); ae_vector_init_copy(&dst->hasbndu, &src->hasbndu, _state); dst->nec = src->nec; dst->nic = src->nic; ae_matrix_init_copy(&dst->cleic, &src->cleic, _state); dst->ng = src->ng; dst->nh = src->nh; ae_vector_init_copy(&dst->x, &src->x, _state); dst->f = src->f; ae_vector_init_copy(&dst->fi, &src->fi, _state); ae_matrix_init_copy(&dst->j, &src->j, _state); dst->needfij = src->needfij; dst->needfi = src->needfi; dst->xupdated = src->xupdated; _rcommstate_init_copy(&dst->rstate, &src->rstate, _state); _rcommstate_init_copy(&dst->rstateags, &src->rstateags, _state); _hqrndstate_init_copy(&dst->agsrs, &src->agsrs, _state); dst->agsradius = src->agsradius; dst->agssamplesize = src->agssamplesize; dst->agsraddecay = src->agsraddecay; dst->agsalphadecay = src->agsalphadecay; dst->agsdecrease = src->agsdecrease; dst->agsinitstp = src->agsinitstp; dst->agsstattold = src->agsstattold; dst->agsshortstpabs = src->agsshortstpabs; dst->agsshortstprel = src->agsshortstprel; dst->agsshortf = src->agsshortf; dst->agsshortlimit = src->agsshortlimit; dst->agsrhononlinear = src->agsrhononlinear; dst->agsminupdate = src->agsminupdate; dst->agsmaxraddecays = src->agsmaxraddecays; dst->agsmaxbacktrack = src->agsmaxbacktrack; dst->agsmaxbacktracknonfull = src->agsmaxbacktracknonfull; dst->agspenaltylevel = src->agspenaltylevel; dst->agspenaltyincrease = src->agspenaltyincrease; ae_vector_init_copy(&dst->xstart, &src->xstart, _state); ae_vector_init_copy(&dst->xc, &src->xc, _state); ae_vector_init_copy(&dst->xn, &src->xn, _state); ae_vector_init_copy(&dst->grs, &src->grs, _state); ae_vector_init_copy(&dst->d, &src->d, _state); ae_vector_init_copy(&dst->colmax, &src->colmax, _state); ae_vector_init_copy(&dst->diagh, &src->diagh, _state); ae_vector_init_copy(&dst->signmin, &src->signmin, _state); ae_vector_init_copy(&dst->signmax, &src->signmax, _state); dst->userterminationneeded = src->userterminationneeded; ae_vector_init_copy(&dst->scaledbndl, &src->scaledbndl, _state); ae_vector_init_copy(&dst->scaledbndu, &src->scaledbndu, _state); ae_matrix_init_copy(&dst->scaledcleic, &src->scaledcleic, _state); ae_vector_init_copy(&dst->rholinear, &src->rholinear, _state); ae_matrix_init_copy(&dst->samplex, &src->samplex, _state); ae_matrix_init_copy(&dst->samplegm, &src->samplegm, _state); ae_matrix_init_copy(&dst->samplegmbc, &src->samplegmbc, _state); ae_vector_init_copy(&dst->samplef, &src->samplef, _state); ae_vector_init_copy(&dst->samplef0, &src->samplef0, _state); _minnsqp_init_copy(&dst->nsqp, &src->nsqp, _state); ae_vector_init_copy(&dst->tmp0, &src->tmp0, _state); ae_vector_init_copy(&dst->tmp1, &src->tmp1, _state); ae_matrix_init_copy(&dst->tmp2, &src->tmp2, _state); ae_vector_init_copy(&dst->tmp3, &src->tmp3, _state); ae_vector_init_copy(&dst->xbase, &src->xbase, _state); ae_vector_init_copy(&dst->fp, &src->fp, _state); ae_vector_init_copy(&dst->fm, &src->fm, _state); dst->repinneriterationscount = src->repinneriterationscount; dst->repouteriterationscount = src->repouteriterationscount; dst->repnfev = src->repnfev; dst->repvaridx = src->repvaridx; dst->repfuncidx = src->repfuncidx; dst->repterminationtype = src->repterminationtype; dst->replcerr = 
src->replcerr; dst->repnlcerr = src->repnlcerr; dst->dbgncholesky = src->dbgncholesky; } void _minnsstate_clear(void* _p) { minnsstate *p = (minnsstate*)_p; ae_touch_ptr((void*)p); ae_vector_clear(&p->s); ae_vector_clear(&p->bndl); ae_vector_clear(&p->bndu); ae_vector_clear(&p->hasbndl); ae_vector_clear(&p->hasbndu); ae_matrix_clear(&p->cleic); ae_vector_clear(&p->x); ae_vector_clear(&p->fi); ae_matrix_clear(&p->j); _rcommstate_clear(&p->rstate); _rcommstate_clear(&p->rstateags); _hqrndstate_clear(&p->agsrs); ae_vector_clear(&p->xstart); ae_vector_clear(&p->xc); ae_vector_clear(&p->xn); ae_vector_clear(&p->grs); ae_vector_clear(&p->d); ae_vector_clear(&p->colmax); ae_vector_clear(&p->diagh); ae_vector_clear(&p->signmin); ae_vector_clear(&p->signmax); ae_vector_clear(&p->scaledbndl); ae_vector_clear(&p->scaledbndu); ae_matrix_clear(&p->scaledcleic); ae_vector_clear(&p->rholinear); ae_matrix_clear(&p->samplex); ae_matrix_clear(&p->samplegm); ae_matrix_clear(&p->samplegmbc); ae_vector_clear(&p->samplef); ae_vector_clear(&p->samplef0); _minnsqp_clear(&p->nsqp); ae_vector_clear(&p->tmp0); ae_vector_clear(&p->tmp1); ae_matrix_clear(&p->tmp2); ae_vector_clear(&p->tmp3); ae_vector_clear(&p->xbase); ae_vector_clear(&p->fp); ae_vector_clear(&p->fm); } void _minnsstate_destroy(void* _p) { minnsstate *p = (minnsstate*)_p; ae_touch_ptr((void*)p); ae_vector_destroy(&p->s); ae_vector_destroy(&p->bndl); ae_vector_destroy(&p->bndu); ae_vector_destroy(&p->hasbndl); ae_vector_destroy(&p->hasbndu); ae_matrix_destroy(&p->cleic); ae_vector_destroy(&p->x); ae_vector_destroy(&p->fi); ae_matrix_destroy(&p->j); _rcommstate_destroy(&p->rstate); _rcommstate_destroy(&p->rstateags); _hqrndstate_destroy(&p->agsrs); ae_vector_destroy(&p->xstart); ae_vector_destroy(&p->xc); ae_vector_destroy(&p->xn); ae_vector_destroy(&p->grs); ae_vector_destroy(&p->d); ae_vector_destroy(&p->colmax); ae_vector_destroy(&p->diagh); ae_vector_destroy(&p->signmin); ae_vector_destroy(&p->signmax); ae_vector_destroy(&p->scaledbndl); ae_vector_destroy(&p->scaledbndu); ae_matrix_destroy(&p->scaledcleic); ae_vector_destroy(&p->rholinear); ae_matrix_destroy(&p->samplex); ae_matrix_destroy(&p->samplegm); ae_matrix_destroy(&p->samplegmbc); ae_vector_destroy(&p->samplef); ae_vector_destroy(&p->samplef0); _minnsqp_destroy(&p->nsqp); ae_vector_destroy(&p->tmp0); ae_vector_destroy(&p->tmp1); ae_matrix_destroy(&p->tmp2); ae_vector_destroy(&p->tmp3); ae_vector_destroy(&p->xbase); ae_vector_destroy(&p->fp); ae_vector_destroy(&p->fm); } void _minnsreport_init(void* _p, ae_state *_state) { minnsreport *p = (minnsreport*)_p; ae_touch_ptr((void*)p); } void _minnsreport_init_copy(void* _dst, void* _src, ae_state *_state) { minnsreport *dst = (minnsreport*)_dst; minnsreport *src = (minnsreport*)_src; dst->iterationscount = src->iterationscount; dst->nfev = src->nfev; dst->cerr = src->cerr; dst->lcerr = src->lcerr; dst->nlcerr = src->nlcerr; dst->terminationtype = src->terminationtype; dst->varidx = src->varidx; dst->funcidx = src->funcidx; } void _minnsreport_clear(void* _p) { minnsreport *p = (minnsreport*)_p; ae_touch_ptr((void*)p); } void _minnsreport_destroy(void* _p) { minnsreport *p = (minnsreport*)_p; ae_touch_ptr((void*)p); } /************************************************************************* Obsolete function, use MinLBFGSSetPrecDefault() instead. 
-- ALGLIB -- Copyright 13.10.2010 by Bochkanov Sergey *************************************************************************/ void minlbfgssetdefaultpreconditioner(minlbfgsstate* state, ae_state *_state) { minlbfgssetprecdefault(state, _state); } /************************************************************************* Obsolete function, use MinLBFGSSetCholeskyPreconditioner() instead. -- ALGLIB -- Copyright 13.10.2010 by Bochkanov Sergey *************************************************************************/ void minlbfgssetcholeskypreconditioner(minlbfgsstate* state, /* Real */ ae_matrix* p, ae_bool isupper, ae_state *_state) { minlbfgssetpreccholesky(state, p, isupper, _state); } /************************************************************************* This is obsolete function which was used by previous version of the BLEIC optimizer. It does nothing in the current version of BLEIC. -- ALGLIB -- Copyright 28.11.2010 by Bochkanov Sergey *************************************************************************/ void minbleicsetbarrierwidth(minbleicstate* state, double mu, ae_state *_state) { } /************************************************************************* This is obsolete function which was used by previous version of the BLEIC optimizer. It does nothing in the current version of BLEIC. -- ALGLIB -- Copyright 28.11.2010 by Bochkanov Sergey *************************************************************************/ void minbleicsetbarrierdecay(minbleicstate* state, double mudecay, ae_state *_state) { } /************************************************************************* Obsolete optimization algorithm. Was replaced by MinBLEIC subpackage. -- ALGLIB -- Copyright 25.03.2010 by Bochkanov Sergey *************************************************************************/ void minasacreate(ae_int_t n, /* Real */ ae_vector* x, /* Real */ ae_vector* bndl, /* Real */ ae_vector* bndu, minasastate* state, ae_state *_state) { ae_int_t i; _minasastate_clear(state); ae_assert(n>=1, "MinASA: N too small!", _state); ae_assert(x->cnt>=n, "MinCGCreate: Length(X)cnt>=n, "MinCGCreate: Length(BndL)cnt>=n, "MinCGCreate: Length(BndU)ptr.p_double[i],bndu->ptr.p_double[i]), "MinASA: inconsistent bounds!", _state); ae_assert(ae_fp_less_eq(bndl->ptr.p_double[i],x->ptr.p_double[i]), "MinASA: infeasible X!", _state); ae_assert(ae_fp_less_eq(x->ptr.p_double[i],bndu->ptr.p_double[i]), "MinASA: infeasible X!", _state); } /* * Initialize */ state->n = n; minasasetcond(state, (double)(0), (double)(0), (double)(0), 0, _state); minasasetxrep(state, ae_false, _state); minasasetstpmax(state, (double)(0), _state); minasasetalgorithm(state, -1, _state); ae_vector_set_length(&state->bndl, n, _state); ae_vector_set_length(&state->bndu, n, _state); ae_vector_set_length(&state->ak, n, _state); ae_vector_set_length(&state->xk, n, _state); ae_vector_set_length(&state->dk, n, _state); ae_vector_set_length(&state->an, n, _state); ae_vector_set_length(&state->xn, n, _state); ae_vector_set_length(&state->dn, n, _state); ae_vector_set_length(&state->x, n, _state); ae_vector_set_length(&state->d, n, _state); ae_vector_set_length(&state->g, n, _state); ae_vector_set_length(&state->gc, n, _state); ae_vector_set_length(&state->work, n, _state); ae_vector_set_length(&state->yk, n, _state); minasarestartfrom(state, x, bndl, bndu, _state); } /************************************************************************* Obsolete optimization algorithm. Was replaced by MinBLEIC subpackage. 
-- ALGLIB -- Copyright 02.04.2010 by Bochkanov Sergey *************************************************************************/ void minasasetcond(minasastate* state, double epsg, double epsf, double epsx, ae_int_t maxits, ae_state *_state) { ae_assert(ae_isfinite(epsg, _state), "MinASASetCond: EpsG is not finite number!", _state); ae_assert(ae_fp_greater_eq(epsg,(double)(0)), "MinASASetCond: negative EpsG!", _state); ae_assert(ae_isfinite(epsf, _state), "MinASASetCond: EpsF is not finite number!", _state); ae_assert(ae_fp_greater_eq(epsf,(double)(0)), "MinASASetCond: negative EpsF!", _state); ae_assert(ae_isfinite(epsx, _state), "MinASASetCond: EpsX is not finite number!", _state); ae_assert(ae_fp_greater_eq(epsx,(double)(0)), "MinASASetCond: negative EpsX!", _state); ae_assert(maxits>=0, "MinASASetCond: negative MaxIts!", _state); if( ((ae_fp_eq(epsg,(double)(0))&&ae_fp_eq(epsf,(double)(0)))&&ae_fp_eq(epsx,(double)(0)))&&maxits==0 ) { epsx = 1.0E-6; } state->epsg = epsg; state->epsf = epsf; state->epsx = epsx; state->maxits = maxits; } /************************************************************************* Obsolete optimization algorithm. Was replaced by MinBLEIC subpackage. -- ALGLIB -- Copyright 02.04.2010 by Bochkanov Sergey *************************************************************************/ void minasasetxrep(minasastate* state, ae_bool needxrep, ae_state *_state) { state->xrep = needxrep; } /************************************************************************* Obsolete optimization algorithm. Was replaced by MinBLEIC subpackage. -- ALGLIB -- Copyright 02.04.2010 by Bochkanov Sergey *************************************************************************/ void minasasetalgorithm(minasastate* state, ae_int_t algotype, ae_state *_state) { ae_assert(algotype>=-1&&algotype<=1, "MinASASetAlgorithm: incorrect AlgoType!", _state); if( algotype==-1 ) { algotype = 1; } state->cgtype = algotype; } /************************************************************************* Obsolete optimization algorithm. Was replaced by MinBLEIC subpackage. -- ALGLIB -- Copyright 02.04.2010 by Bochkanov Sergey *************************************************************************/ void minasasetstpmax(minasastate* state, double stpmax, ae_state *_state) { ae_assert(ae_isfinite(stpmax, _state), "MinASASetStpMax: StpMax is not finite!", _state); ae_assert(ae_fp_greater_eq(stpmax,(double)(0)), "MinASASetStpMax: StpMax<0!", _state); state->stpmax = stpmax; } /************************************************************************* -- ALGLIB -- Copyright 20.03.2009 by Bochkanov Sergey *************************************************************************/ ae_bool minasaiteration(minasastate* state, ae_state *_state) { ae_int_t n; ae_int_t i; double betak; double v; double vv; ae_int_t mcinfo; ae_bool b; ae_bool stepfound; ae_int_t diffcnt; ae_bool result; /* * Reverse communication preparations * I know it looks ugly, but it works the same way * anywhere from C++ to Python. 
* * This code initializes locals by: * * random values determined during code * generation - on first subroutine call * * values from previous call - on subsequent calls */ if( state->rstate.stage>=0 ) { n = state->rstate.ia.ptr.p_int[0]; i = state->rstate.ia.ptr.p_int[1]; mcinfo = state->rstate.ia.ptr.p_int[2]; diffcnt = state->rstate.ia.ptr.p_int[3]; b = state->rstate.ba.ptr.p_bool[0]; stepfound = state->rstate.ba.ptr.p_bool[1]; betak = state->rstate.ra.ptr.p_double[0]; v = state->rstate.ra.ptr.p_double[1]; vv = state->rstate.ra.ptr.p_double[2]; } else { n = 359; i = -58; mcinfo = -919; diffcnt = -909; b = ae_true; stepfound = ae_true; betak = 74; v = -788; vv = 809; } if( state->rstate.stage==0 ) { goto lbl_0; } if( state->rstate.stage==1 ) { goto lbl_1; } if( state->rstate.stage==2 ) { goto lbl_2; } if( state->rstate.stage==3 ) { goto lbl_3; } if( state->rstate.stage==4 ) { goto lbl_4; } if( state->rstate.stage==5 ) { goto lbl_5; } if( state->rstate.stage==6 ) { goto lbl_6; } if( state->rstate.stage==7 ) { goto lbl_7; } if( state->rstate.stage==8 ) { goto lbl_8; } if( state->rstate.stage==9 ) { goto lbl_9; } if( state->rstate.stage==10 ) { goto lbl_10; } if( state->rstate.stage==11 ) { goto lbl_11; } if( state->rstate.stage==12 ) { goto lbl_12; } if( state->rstate.stage==13 ) { goto lbl_13; } if( state->rstate.stage==14 ) { goto lbl_14; } /* * Routine body */ /* * Prepare */ n = state->n; state->repterminationtype = 0; state->repiterationscount = 0; state->repnfev = 0; state->debugrestartscount = 0; state->cgtype = 1; ae_v_move(&state->xk.ptr.p_double[0], 1, &state->x.ptr.p_double[0], 1, ae_v_len(0,n-1)); for(i=0; i<=n-1; i++) { if( ae_fp_eq(state->xk.ptr.p_double[i],state->bndl.ptr.p_double[i])||ae_fp_eq(state->xk.ptr.p_double[i],state->bndu.ptr.p_double[i]) ) { state->ak.ptr.p_double[i] = (double)(0); } else { state->ak.ptr.p_double[i] = (double)(1); } } state->mu = 0.1; state->curalgo = 0; /* * Calculate F/G, initialize algorithm */ mincomp_clearrequestfields(state, _state); state->needfg = ae_true; state->rstate.stage = 0; goto lbl_rcomm; lbl_0: state->needfg = ae_false; if( !state->xrep ) { goto lbl_15; } /* * progress report */ mincomp_clearrequestfields(state, _state); state->xupdated = ae_true; state->rstate.stage = 1; goto lbl_rcomm; lbl_1: state->xupdated = ae_false; lbl_15: if( ae_fp_less_eq(mincomp_asaboundedantigradnorm(state, _state),state->epsg) ) { state->repterminationtype = 4; result = ae_false; return result; } state->repnfev = state->repnfev+1; /* * Main cycle * * At the beginning of new iteration: * * CurAlgo stores current algorithm selector * * State.XK, State.F and State.G store current X/F/G * * State.AK stores current set of active constraints */ lbl_17: if( ae_false ) { goto lbl_18; } /* * GPA algorithm */ if( state->curalgo!=0 ) { goto lbl_19; } state->k = 0; state->acount = 0; lbl_21: if( ae_false ) { goto lbl_22; } /* * Determine Dk = proj(xk - gk)-xk */ for(i=0; i<=n-1; i++) { state->d.ptr.p_double[i] = boundval(state->xk.ptr.p_double[i]-state->g.ptr.p_double[i], state->bndl.ptr.p_double[i], state->bndu.ptr.p_double[i], _state)-state->xk.ptr.p_double[i]; } /* * Armijo line search. * * exact search with alpha=1 is tried first, * 'exact' means that we evaluate f() EXACTLY at * bound(x-g,bndl,bndu), without intermediate floating * point operations. * * alpha<1 are tried if explicit search wasn't successful * Result is placed into XN. 
* * Two types of search are needed because we can't * just use second type with alpha=1 because in finite * precision arithmetics (x1-x0)+x0 may differ from x1. * So while x1 is correctly bounded (it lie EXACTLY on * boundary, if it is active), (x1-x0)+x0 may be * not bounded. */ v = ae_v_dotproduct(&state->d.ptr.p_double[0], 1, &state->g.ptr.p_double[0], 1, ae_v_len(0,n-1)); state->dginit = v; state->finit = state->f; if( !(ae_fp_less_eq(mincomp_asad1norm(state, _state),state->stpmax)||ae_fp_eq(state->stpmax,(double)(0))) ) { goto lbl_23; } /* * Try alpha=1 step first */ for(i=0; i<=n-1; i++) { state->x.ptr.p_double[i] = boundval(state->xk.ptr.p_double[i]-state->g.ptr.p_double[i], state->bndl.ptr.p_double[i], state->bndu.ptr.p_double[i], _state); } mincomp_clearrequestfields(state, _state); state->needfg = ae_true; state->rstate.stage = 2; goto lbl_rcomm; lbl_2: state->needfg = ae_false; state->repnfev = state->repnfev+1; stepfound = ae_fp_less_eq(state->f,state->finit+mincomp_gpaftol*state->dginit); goto lbl_24; lbl_23: stepfound = ae_false; lbl_24: if( !stepfound ) { goto lbl_25; } /* * we are at the boundary(ies) */ ae_v_move(&state->xn.ptr.p_double[0], 1, &state->x.ptr.p_double[0], 1, ae_v_len(0,n-1)); state->stp = (double)(1); goto lbl_26; lbl_25: /* * alpha=1 is too large, try smaller values */ state->stp = (double)(1); linminnormalized(&state->d, &state->stp, n, _state); state->dginit = state->dginit/state->stp; state->stp = mincomp_gpadecay*state->stp; if( ae_fp_greater(state->stpmax,(double)(0)) ) { state->stp = ae_minreal(state->stp, state->stpmax, _state); } lbl_27: if( ae_false ) { goto lbl_28; } v = state->stp; ae_v_move(&state->x.ptr.p_double[0], 1, &state->xk.ptr.p_double[0], 1, ae_v_len(0,n-1)); ae_v_addd(&state->x.ptr.p_double[0], 1, &state->d.ptr.p_double[0], 1, ae_v_len(0,n-1), v); mincomp_clearrequestfields(state, _state); state->needfg = ae_true; state->rstate.stage = 3; goto lbl_rcomm; lbl_3: state->needfg = ae_false; state->repnfev = state->repnfev+1; if( ae_fp_less_eq(state->stp,mincomp_stpmin) ) { goto lbl_28; } if( ae_fp_less_eq(state->f,state->finit+state->stp*mincomp_gpaftol*state->dginit) ) { goto lbl_28; } state->stp = state->stp*mincomp_gpadecay; goto lbl_27; lbl_28: ae_v_move(&state->xn.ptr.p_double[0], 1, &state->x.ptr.p_double[0], 1, ae_v_len(0,n-1)); lbl_26: state->repiterationscount = state->repiterationscount+1; if( !state->xrep ) { goto lbl_29; } /* * progress report */ mincomp_clearrequestfields(state, _state); state->xupdated = ae_true; state->rstate.stage = 4; goto lbl_rcomm; lbl_4: state->xupdated = ae_false; lbl_29: /* * Calculate new set of active constraints. * Reset counter if active set was changed. 
* Prepare for the new iteration */ for(i=0; i<=n-1; i++) { if( ae_fp_eq(state->xn.ptr.p_double[i],state->bndl.ptr.p_double[i])||ae_fp_eq(state->xn.ptr.p_double[i],state->bndu.ptr.p_double[i]) ) { state->an.ptr.p_double[i] = (double)(0); } else { state->an.ptr.p_double[i] = (double)(1); } } for(i=0; i<=n-1; i++) { if( ae_fp_neq(state->ak.ptr.p_double[i],state->an.ptr.p_double[i]) ) { state->acount = -1; break; } } state->acount = state->acount+1; ae_v_move(&state->xk.ptr.p_double[0], 1, &state->xn.ptr.p_double[0], 1, ae_v_len(0,n-1)); ae_v_move(&state->ak.ptr.p_double[0], 1, &state->an.ptr.p_double[0], 1, ae_v_len(0,n-1)); /* * Stopping conditions */ if( !(state->repiterationscount>=state->maxits&&state->maxits>0) ) { goto lbl_31; } /* * Too many iterations */ state->repterminationtype = 5; if( !state->xrep ) { goto lbl_33; } mincomp_clearrequestfields(state, _state); state->xupdated = ae_true; state->rstate.stage = 5; goto lbl_rcomm; lbl_5: state->xupdated = ae_false; lbl_33: result = ae_false; return result; lbl_31: if( ae_fp_greater(mincomp_asaboundedantigradnorm(state, _state),state->epsg) ) { goto lbl_35; } /* * Gradient is small enough */ state->repterminationtype = 4; if( !state->xrep ) { goto lbl_37; } mincomp_clearrequestfields(state, _state); state->xupdated = ae_true; state->rstate.stage = 6; goto lbl_rcomm; lbl_6: state->xupdated = ae_false; lbl_37: result = ae_false; return result; lbl_35: v = ae_v_dotproduct(&state->d.ptr.p_double[0], 1, &state->d.ptr.p_double[0], 1, ae_v_len(0,n-1)); if( ae_fp_greater(ae_sqrt(v, _state)*state->stp,state->epsx) ) { goto lbl_39; } /* * Step size is too small, no further improvement is * possible */ state->repterminationtype = 2; if( !state->xrep ) { goto lbl_41; } mincomp_clearrequestfields(state, _state); state->xupdated = ae_true; state->rstate.stage = 7; goto lbl_rcomm; lbl_7: state->xupdated = ae_false; lbl_41: result = ae_false; return result; lbl_39: if( ae_fp_greater(state->finit-state->f,state->epsf*ae_maxreal(ae_fabs(state->finit, _state), ae_maxreal(ae_fabs(state->f, _state), 1.0, _state), _state)) ) { goto lbl_43; } /* * F(k+1)-F(k) is small enough */ state->repterminationtype = 1; if( !state->xrep ) { goto lbl_45; } mincomp_clearrequestfields(state, _state); state->xupdated = ae_true; state->rstate.stage = 8; goto lbl_rcomm; lbl_8: state->xupdated = ae_false; lbl_45: result = ae_false; return result; lbl_43: /* * Decide - should we switch algorithm or not */ if( mincomp_asauisempty(state, _state) ) { if( ae_fp_greater_eq(mincomp_asaginorm(state, _state),state->mu*mincomp_asad1norm(state, _state)) ) { state->curalgo = 1; goto lbl_22; } else { state->mu = state->mu*mincomp_asarho; } } else { if( state->acount==mincomp_n1 ) { if( ae_fp_greater_eq(mincomp_asaginorm(state, _state),state->mu*mincomp_asad1norm(state, _state)) ) { state->curalgo = 1; goto lbl_22; } } } /* * Next iteration */ state->k = state->k+1; goto lbl_21; lbl_22: lbl_19: /* * CG algorithm */ if( state->curalgo!=1 ) { goto lbl_47; } /* * first, check that there are non-active constraints. 
* move to GPA algorithm, if all constraints are active */ b = ae_true; for(i=0; i<=n-1; i++) { if( ae_fp_neq(state->ak.ptr.p_double[i],(double)(0)) ) { b = ae_false; break; } } if( b ) { state->curalgo = 0; goto lbl_17; } /* * CG iterations */ state->fold = state->f; ae_v_move(&state->xk.ptr.p_double[0], 1, &state->x.ptr.p_double[0], 1, ae_v_len(0,n-1)); for(i=0; i<=n-1; i++) { state->dk.ptr.p_double[i] = -state->g.ptr.p_double[i]*state->ak.ptr.p_double[i]; state->gc.ptr.p_double[i] = state->g.ptr.p_double[i]*state->ak.ptr.p_double[i]; } lbl_49: if( ae_false ) { goto lbl_50; } /* * Store G[k] for later calculation of Y[k] */ for(i=0; i<=n-1; i++) { state->yk.ptr.p_double[i] = -state->gc.ptr.p_double[i]; } /* * Make a CG step in direction given by DK[]: * * calculate step. Step projection into feasible set * is used. It has several benefits: a) step may be * found with usual line search, b) multiple constraints * may be activated with one step, c) activated constraints * are detected in a natural way - just compare x[i] with * bounds * * update active set, set B to True, if there * were changes in the set. */ ae_v_move(&state->d.ptr.p_double[0], 1, &state->dk.ptr.p_double[0], 1, ae_v_len(0,n-1)); ae_v_move(&state->xn.ptr.p_double[0], 1, &state->xk.ptr.p_double[0], 1, ae_v_len(0,n-1)); state->mcstage = 0; state->stp = (double)(1); linminnormalized(&state->d, &state->stp, n, _state); if( ae_fp_neq(state->laststep,(double)(0)) ) { state->stp = state->laststep; } mcsrch(n, &state->xn, &state->f, &state->gc, &state->d, &state->stp, state->stpmax, mincomp_gtol, &mcinfo, &state->nfev, &state->work, &state->lstate, &state->mcstage, _state); lbl_51: if( state->mcstage==0 ) { goto lbl_52; } /* * preprocess data: bound State.XN so it belongs to the * feasible set and store it in the State.X */ for(i=0; i<=n-1; i++) { state->x.ptr.p_double[i] = boundval(state->xn.ptr.p_double[i], state->bndl.ptr.p_double[i], state->bndu.ptr.p_double[i], _state); } /* * RComm */ mincomp_clearrequestfields(state, _state); state->needfg = ae_true; state->rstate.stage = 9; goto lbl_rcomm; lbl_9: state->needfg = ae_false; /* * postprocess data: zero components of G corresponding to * the active constraints */ for(i=0; i<=n-1; i++) { if( ae_fp_eq(state->x.ptr.p_double[i],state->bndl.ptr.p_double[i])||ae_fp_eq(state->x.ptr.p_double[i],state->bndu.ptr.p_double[i]) ) { state->gc.ptr.p_double[i] = (double)(0); } else { state->gc.ptr.p_double[i] = state->g.ptr.p_double[i]; } } mcsrch(n, &state->xn, &state->f, &state->gc, &state->d, &state->stp, state->stpmax, mincomp_gtol, &mcinfo, &state->nfev, &state->work, &state->lstate, &state->mcstage, _state); goto lbl_51; lbl_52: diffcnt = 0; for(i=0; i<=n-1; i++) { /* * XN contains unprojected result, project it, * save copy to X (will be used for progress reporting) */ state->xn.ptr.p_double[i] = boundval(state->xn.ptr.p_double[i], state->bndl.ptr.p_double[i], state->bndu.ptr.p_double[i], _state); /* * update active set */ if( ae_fp_eq(state->xn.ptr.p_double[i],state->bndl.ptr.p_double[i])||ae_fp_eq(state->xn.ptr.p_double[i],state->bndu.ptr.p_double[i]) ) { state->an.ptr.p_double[i] = (double)(0); } else { state->an.ptr.p_double[i] = (double)(1); } if( ae_fp_neq(state->an.ptr.p_double[i],state->ak.ptr.p_double[i]) ) { diffcnt = diffcnt+1; } state->ak.ptr.p_double[i] = state->an.ptr.p_double[i]; } ae_v_move(&state->xk.ptr.p_double[0], 1, &state->xn.ptr.p_double[0], 1, ae_v_len(0,n-1)); state->repnfev = state->repnfev+state->nfev; state->repiterationscount = state->repiterationscount+1; 
if( !state->xrep ) { goto lbl_53; } /* * progress report */ mincomp_clearrequestfields(state, _state); state->xupdated = ae_true; state->rstate.stage = 10; goto lbl_rcomm; lbl_10: state->xupdated = ae_false; lbl_53: /* * Update info about step length */ v = ae_v_dotproduct(&state->d.ptr.p_double[0], 1, &state->d.ptr.p_double[0], 1, ae_v_len(0,n-1)); state->laststep = ae_sqrt(v, _state)*state->stp; /* * Check stopping conditions. */ if( ae_fp_greater(mincomp_asaboundedantigradnorm(state, _state),state->epsg) ) { goto lbl_55; } /* * Gradient is small enough */ state->repterminationtype = 4; if( !state->xrep ) { goto lbl_57; } mincomp_clearrequestfields(state, _state); state->xupdated = ae_true; state->rstate.stage = 11; goto lbl_rcomm; lbl_11: state->xupdated = ae_false; lbl_57: result = ae_false; return result; lbl_55: if( !(state->repiterationscount>=state->maxits&&state->maxits>0) ) { goto lbl_59; } /* * Too many iterations */ state->repterminationtype = 5; if( !state->xrep ) { goto lbl_61; } mincomp_clearrequestfields(state, _state); state->xupdated = ae_true; state->rstate.stage = 12; goto lbl_rcomm; lbl_12: state->xupdated = ae_false; lbl_61: result = ae_false; return result; lbl_59: if( !(ae_fp_greater_eq(mincomp_asaginorm(state, _state),state->mu*mincomp_asad1norm(state, _state))&&diffcnt==0) ) { goto lbl_63; } /* * These conditions (EpsF/EpsX) are explicitly or implicitly * related to the current step size and influenced * by changes in the active constraints. * * For these reasons they are checked only when we don't * want to 'unstick' at the end of the iteration and there * were no changes in the active set. * * NOTE: consition |G|>=Mu*|D1| must be exactly opposite * to the condition used to switch back to GPA. At least * one inequality must be strict, otherwise infinite cycle * may occur when |G|=Mu*|D1| (we DON'T test stopping * conditions and we DON'T switch to GPA, so we cycle * indefinitely). 
*/ if( ae_fp_greater(state->fold-state->f,state->epsf*ae_maxreal(ae_fabs(state->fold, _state), ae_maxreal(ae_fabs(state->f, _state), 1.0, _state), _state)) ) { goto lbl_65; } /* * F(k+1)-F(k) is small enough */ state->repterminationtype = 1; if( !state->xrep ) { goto lbl_67; } mincomp_clearrequestfields(state, _state); state->xupdated = ae_true; state->rstate.stage = 13; goto lbl_rcomm; lbl_13: state->xupdated = ae_false; lbl_67: result = ae_false; return result; lbl_65: if( ae_fp_greater(state->laststep,state->epsx) ) { goto lbl_69; } /* * X(k+1)-X(k) is small enough */ state->repterminationtype = 2; if( !state->xrep ) { goto lbl_71; } mincomp_clearrequestfields(state, _state); state->xupdated = ae_true; state->rstate.stage = 14; goto lbl_rcomm; lbl_14: state->xupdated = ae_false; lbl_71: result = ae_false; return result; lbl_69: lbl_63: /* * Check conditions for switching */ if( ae_fp_less(mincomp_asaginorm(state, _state),state->mu*mincomp_asad1norm(state, _state)) ) { state->curalgo = 0; goto lbl_50; } if( diffcnt>0 ) { if( mincomp_asauisempty(state, _state)||diffcnt>=mincomp_n2 ) { state->curalgo = 1; } else { state->curalgo = 0; } goto lbl_50; } /* * Calculate D(k+1) * * Line search may result in: * * maximum feasible step being taken (already processed) * * point satisfying Wolfe conditions * * some kind of error (CG is restarted by assigning 0.0 to Beta) */ if( mcinfo==1 ) { /* * Standard Wolfe conditions are satisfied: * * calculate Y[K] and BetaK */ ae_v_add(&state->yk.ptr.p_double[0], 1, &state->gc.ptr.p_double[0], 1, ae_v_len(0,n-1)); vv = ae_v_dotproduct(&state->yk.ptr.p_double[0], 1, &state->dk.ptr.p_double[0], 1, ae_v_len(0,n-1)); v = ae_v_dotproduct(&state->gc.ptr.p_double[0], 1, &state->gc.ptr.p_double[0], 1, ae_v_len(0,n-1)); state->betady = v/vv; v = ae_v_dotproduct(&state->gc.ptr.p_double[0], 1, &state->yk.ptr.p_double[0], 1, ae_v_len(0,n-1)); state->betahs = v/vv; if( state->cgtype==0 ) { betak = state->betady; } if( state->cgtype==1 ) { betak = ae_maxreal((double)(0), ae_minreal(state->betady, state->betahs, _state), _state); } } else { /* * Something is wrong (may be function is too wild or too flat). * * We'll set BetaK=0, which will restart CG algorithm. * We can stop later (during normal checks) if stopping conditions are met. */ betak = (double)(0); state->debugrestartscount = state->debugrestartscount+1; } ae_v_moveneg(&state->dn.ptr.p_double[0], 1, &state->gc.ptr.p_double[0], 1, ae_v_len(0,n-1)); ae_v_addd(&state->dn.ptr.p_double[0], 1, &state->dk.ptr.p_double[0], 1, ae_v_len(0,n-1), betak); ae_v_move(&state->dk.ptr.p_double[0], 1, &state->dn.ptr.p_double[0], 1, ae_v_len(0,n-1)); /* * update other information */ state->fold = state->f; state->k = state->k+1; goto lbl_49; lbl_50: lbl_47: goto lbl_17; lbl_18: result = ae_false; return result; /* * Saving state */ lbl_rcomm: result = ae_true; state->rstate.ia.ptr.p_int[0] = n; state->rstate.ia.ptr.p_int[1] = i; state->rstate.ia.ptr.p_int[2] = mcinfo; state->rstate.ia.ptr.p_int[3] = diffcnt; state->rstate.ba.ptr.p_bool[0] = b; state->rstate.ba.ptr.p_bool[1] = stepfound; state->rstate.ra.ptr.p_double[0] = betak; state->rstate.ra.ptr.p_double[1] = v; state->rstate.ra.ptr.p_double[2] = vv; return result; } /************************************************************************* Obsolete optimization algorithm. Was replaced by MinBLEIC subpackage. 
-- ALGLIB -- Copyright 20.03.2009 by Bochkanov Sergey *************************************************************************/ void minasaresults(minasastate* state, /* Real */ ae_vector* x, minasareport* rep, ae_state *_state) { ae_vector_clear(x); _minasareport_clear(rep); minasaresultsbuf(state, x, rep, _state); } /************************************************************************* Obsolete optimization algorithm. Was replaced by MinBLEIC subpackage. -- ALGLIB -- Copyright 20.03.2009 by Bochkanov Sergey *************************************************************************/ void minasaresultsbuf(minasastate* state, /* Real */ ae_vector* x, minasareport* rep, ae_state *_state) { ae_int_t i; if( x->cntn ) { ae_vector_set_length(x, state->n, _state); } ae_v_move(&x->ptr.p_double[0], 1, &state->x.ptr.p_double[0], 1, ae_v_len(0,state->n-1)); rep->iterationscount = state->repiterationscount; rep->nfev = state->repnfev; rep->terminationtype = state->repterminationtype; rep->activeconstraints = 0; for(i=0; i<=state->n-1; i++) { if( ae_fp_eq(state->ak.ptr.p_double[i],(double)(0)) ) { rep->activeconstraints = rep->activeconstraints+1; } } } /************************************************************************* Obsolete optimization algorithm. Was replaced by MinBLEIC subpackage. -- ALGLIB -- Copyright 30.07.2010 by Bochkanov Sergey *************************************************************************/ void minasarestartfrom(minasastate* state, /* Real */ ae_vector* x, /* Real */ ae_vector* bndl, /* Real */ ae_vector* bndu, ae_state *_state) { ae_assert(x->cnt>=state->n, "MinASARestartFrom: Length(X)n, _state), "MinASARestartFrom: X contains infinite or NaN values!", _state); ae_assert(bndl->cnt>=state->n, "MinASARestartFrom: Length(BndL)n, _state), "MinASARestartFrom: BndL contains infinite or NaN values!", _state); ae_assert(bndu->cnt>=state->n, "MinASARestartFrom: Length(BndU)n, _state), "MinASARestartFrom: BndU contains infinite or NaN values!", _state); ae_v_move(&state->x.ptr.p_double[0], 1, &x->ptr.p_double[0], 1, ae_v_len(0,state->n-1)); ae_v_move(&state->bndl.ptr.p_double[0], 1, &bndl->ptr.p_double[0], 1, ae_v_len(0,state->n-1)); ae_v_move(&state->bndu.ptr.p_double[0], 1, &bndu->ptr.p_double[0], 1, ae_v_len(0,state->n-1)); state->laststep = (double)(0); ae_vector_set_length(&state->rstate.ia, 3+1, _state); ae_vector_set_length(&state->rstate.ba, 1+1, _state); ae_vector_set_length(&state->rstate.ra, 2+1, _state); state->rstate.stage = -1; mincomp_clearrequestfields(state, _state); } /************************************************************************* Returns norm of bounded anti-gradient. Bounded antigradient is a vector obtained from anti-gradient by zeroing components which point outwards: result = norm(v) v[i]=0 if ((-g[i]<0)and(x[i]=bndl[i])) or ((-g[i]>0)and(x[i]=bndu[i])) v[i]=-g[i] otherwise This function may be used to check a stopping criterion. 
-- ALGLIB -- Copyright 20.03.2009 by Bochkanov Sergey *************************************************************************/ static double mincomp_asaboundedantigradnorm(minasastate* state, ae_state *_state) { ae_int_t i; double v; double result; result = (double)(0); for(i=0; i<=state->n-1; i++) { v = -state->g.ptr.p_double[i]; if( ae_fp_eq(state->x.ptr.p_double[i],state->bndl.ptr.p_double[i])&&ae_fp_less(-state->g.ptr.p_double[i],(double)(0)) ) { v = (double)(0); } if( ae_fp_eq(state->x.ptr.p_double[i],state->bndu.ptr.p_double[i])&&ae_fp_greater(-state->g.ptr.p_double[i],(double)(0)) ) { v = (double)(0); } result = result+ae_sqr(v, _state); } result = ae_sqrt(result, _state); return result; } /************************************************************************* Returns norm of GI(x). GI(x) is a gradient vector whose components associated with active constraints are zeroed. It differs from bounded anti-gradient because components of GI(x) are zeroed independently of sign(g[i]), and anti-gradient's components are zeroed with respect to both constraint and sign. -- ALGLIB -- Copyright 20.03.2009 by Bochkanov Sergey *************************************************************************/ static double mincomp_asaginorm(minasastate* state, ae_state *_state) { ae_int_t i; double result; result = (double)(0); for(i=0; i<=state->n-1; i++) { if( ae_fp_neq(state->x.ptr.p_double[i],state->bndl.ptr.p_double[i])&&ae_fp_neq(state->x.ptr.p_double[i],state->bndu.ptr.p_double[i]) ) { result = result+ae_sqr(state->g.ptr.p_double[i], _state); } } result = ae_sqrt(result, _state); return result; } /************************************************************************* Returns norm(D1(State.X)) For a meaning of D1 see 'NEW ACTIVE SET ALGORITHM FOR BOX CONSTRAINED OPTIMIZATION' by WILLIAM W. HAGER AND HONGCHAO ZHANG. -- ALGLIB -- Copyright 20.03.2009 by Bochkanov Sergey *************************************************************************/ static double mincomp_asad1norm(minasastate* state, ae_state *_state) { ae_int_t i; double result; result = (double)(0); for(i=0; i<=state->n-1; i++) { result = result+ae_sqr(boundval(state->x.ptr.p_double[i]-state->g.ptr.p_double[i], state->bndl.ptr.p_double[i], state->bndu.ptr.p_double[i], _state)-state->x.ptr.p_double[i], _state); } result = ae_sqrt(result, _state); return result; } /************************************************************************* Returns True, if U set is empty. * State.X is used as point, * State.G - as gradient, * D is calculated within function (because State.D may have different meaning depending on current optimization algorithm) For a meaning of U see 'NEW ACTIVE SET ALGORITHM FOR BOX CONSTRAINED OPTIMIZATION' by WILLIAM W. HAGER AND HONGCHAO ZHANG. 
-- ALGLIB -- Copyright 20.03.2009 by Bochkanov Sergey *************************************************************************/ static ae_bool mincomp_asauisempty(minasastate* state, ae_state *_state) { ae_int_t i; double d; double d2; double d32; ae_bool result; d = mincomp_asad1norm(state, _state); d2 = ae_sqrt(d, _state); d32 = d*d2; result = ae_true; for(i=0; i<=state->n-1; i++) { if( ae_fp_greater_eq(ae_fabs(state->g.ptr.p_double[i], _state),d2)&&ae_fp_greater_eq(ae_minreal(state->x.ptr.p_double[i]-state->bndl.ptr.p_double[i], state->bndu.ptr.p_double[i]-state->x.ptr.p_double[i], _state),d32) ) { result = ae_false; return result; } } return result; } /************************************************************************* Clears request fileds (to be sure that we don't forgot to clear something) *************************************************************************/ static void mincomp_clearrequestfields(minasastate* state, ae_state *_state) { state->needfg = ae_false; state->xupdated = ae_false; } void _minasastate_init(void* _p, ae_state *_state) { minasastate *p = (minasastate*)_p; ae_touch_ptr((void*)p); ae_vector_init(&p->bndl, 0, DT_REAL, _state); ae_vector_init(&p->bndu, 0, DT_REAL, _state); ae_vector_init(&p->ak, 0, DT_REAL, _state); ae_vector_init(&p->xk, 0, DT_REAL, _state); ae_vector_init(&p->dk, 0, DT_REAL, _state); ae_vector_init(&p->an, 0, DT_REAL, _state); ae_vector_init(&p->xn, 0, DT_REAL, _state); ae_vector_init(&p->dn, 0, DT_REAL, _state); ae_vector_init(&p->d, 0, DT_REAL, _state); ae_vector_init(&p->work, 0, DT_REAL, _state); ae_vector_init(&p->yk, 0, DT_REAL, _state); ae_vector_init(&p->gc, 0, DT_REAL, _state); ae_vector_init(&p->x, 0, DT_REAL, _state); ae_vector_init(&p->g, 0, DT_REAL, _state); _rcommstate_init(&p->rstate, _state); _linminstate_init(&p->lstate, _state); } void _minasastate_init_copy(void* _dst, void* _src, ae_state *_state) { minasastate *dst = (minasastate*)_dst; minasastate *src = (minasastate*)_src; dst->n = src->n; dst->epsg = src->epsg; dst->epsf = src->epsf; dst->epsx = src->epsx; dst->maxits = src->maxits; dst->xrep = src->xrep; dst->stpmax = src->stpmax; dst->cgtype = src->cgtype; dst->k = src->k; dst->nfev = src->nfev; dst->mcstage = src->mcstage; ae_vector_init_copy(&dst->bndl, &src->bndl, _state); ae_vector_init_copy(&dst->bndu, &src->bndu, _state); dst->curalgo = src->curalgo; dst->acount = src->acount; dst->mu = src->mu; dst->finit = src->finit; dst->dginit = src->dginit; ae_vector_init_copy(&dst->ak, &src->ak, _state); ae_vector_init_copy(&dst->xk, &src->xk, _state); ae_vector_init_copy(&dst->dk, &src->dk, _state); ae_vector_init_copy(&dst->an, &src->an, _state); ae_vector_init_copy(&dst->xn, &src->xn, _state); ae_vector_init_copy(&dst->dn, &src->dn, _state); ae_vector_init_copy(&dst->d, &src->d, _state); dst->fold = src->fold; dst->stp = src->stp; ae_vector_init_copy(&dst->work, &src->work, _state); ae_vector_init_copy(&dst->yk, &src->yk, _state); ae_vector_init_copy(&dst->gc, &src->gc, _state); dst->laststep = src->laststep; ae_vector_init_copy(&dst->x, &src->x, _state); dst->f = src->f; ae_vector_init_copy(&dst->g, &src->g, _state); dst->needfg = src->needfg; dst->xupdated = src->xupdated; _rcommstate_init_copy(&dst->rstate, &src->rstate, _state); dst->repiterationscount = src->repiterationscount; dst->repnfev = src->repnfev; dst->repterminationtype = src->repterminationtype; dst->debugrestartscount = src->debugrestartscount; _linminstate_init_copy(&dst->lstate, &src->lstate, _state); dst->betahs = src->betahs; dst->betady = 
src->betady; } void _minasastate_clear(void* _p) { minasastate *p = (minasastate*)_p; ae_touch_ptr((void*)p); ae_vector_clear(&p->bndl); ae_vector_clear(&p->bndu); ae_vector_clear(&p->ak); ae_vector_clear(&p->xk); ae_vector_clear(&p->dk); ae_vector_clear(&p->an); ae_vector_clear(&p->xn); ae_vector_clear(&p->dn); ae_vector_clear(&p->d); ae_vector_clear(&p->work); ae_vector_clear(&p->yk); ae_vector_clear(&p->gc); ae_vector_clear(&p->x); ae_vector_clear(&p->g); _rcommstate_clear(&p->rstate); _linminstate_clear(&p->lstate); } void _minasastate_destroy(void* _p) { minasastate *p = (minasastate*)_p; ae_touch_ptr((void*)p); ae_vector_destroy(&p->bndl); ae_vector_destroy(&p->bndu); ae_vector_destroy(&p->ak); ae_vector_destroy(&p->xk); ae_vector_destroy(&p->dk); ae_vector_destroy(&p->an); ae_vector_destroy(&p->xn); ae_vector_destroy(&p->dn); ae_vector_destroy(&p->d); ae_vector_destroy(&p->work); ae_vector_destroy(&p->yk); ae_vector_destroy(&p->gc); ae_vector_destroy(&p->x); ae_vector_destroy(&p->g); _rcommstate_destroy(&p->rstate); _linminstate_destroy(&p->lstate); } void _minasareport_init(void* _p, ae_state *_state) { minasareport *p = (minasareport*)_p; ae_touch_ptr((void*)p); } void _minasareport_init_copy(void* _dst, void* _src, ae_state *_state) { minasareport *dst = (minasareport*)_dst; minasareport *src = (minasareport*)_src; dst->iterationscount = src->iterationscount; dst->nfev = src->nfev; dst->terminationtype = src->terminationtype; dst->activeconstraints = src->activeconstraints; } void _minasareport_clear(void* _p) { minasareport *p = (minasareport*)_p; ae_touch_ptr((void*)p); } void _minasareport_destroy(void* _p) { minasareport *p = (minasareport*)_p; ae_touch_ptr((void*)p); } /************************************************************************* IMPROVED LEVENBERG-MARQUARDT METHOD FOR NON-LINEAR LEAST SQUARES OPTIMIZATION DESCRIPTION: This function is used to find minimum of function which is represented as sum of squares: F(x) = f[0]^2(x[0],...,x[n-1]) + ... + f[m-1]^2(x[0],...,x[n-1]) using value of function vector f[] and Jacobian of f[]. REQUIREMENTS: This algorithm will request following information during its operation: * function vector f[] at given point X * function vector f[] and Jacobian of f[] (simultaneously) at given point There are several overloaded versions of MinLMOptimize() function which correspond to different LM-like optimization algorithms provided by this unit. You should choose version which accepts fvec() and jac() callbacks. First one is used to calculate f[] at given point, second one calculates f[] and Jacobian df[i]/dx[j]. You can try to initialize MinLMState structure with VJ function and then use incorrect version of MinLMOptimize() (for example, version which works with general form function and does not provide Jacobian), but it will lead to exception being thrown after first attempt to calculate Jacobian. USAGE: 1. User initializes algorithm state with MinLMCreateVJ() call 2. User tunes solver parameters with MinLMSetCond(), MinLMSetStpMax() and other functions 3. User calls MinLMOptimize() function which takes algorithm state and callback functions. 4. User calls MinLMResults() to get solution 5. Optionally, user may call MinLMRestartFrom() to solve another problem with same N/M but another starting point and/or another function. MinLMRestartFrom() allows to reuse already initialized structure. 
INPUT PARAMETERS: N - dimension, N>1 * if given, only leading N elements of X are used * if not given, automatically determined from size of X M - number of functions f[i] X - initial solution, array[0..N-1] OUTPUT PARAMETERS: State - structure which stores algorithm state NOTES: 1. you may tune stopping conditions with MinLMSetCond() function 2. if target function contains exp() or other fast growing functions, and optimization algorithm makes too large steps which leads to overflow, use MinLMSetStpMax() function to bound algorithm's steps. -- ALGLIB -- Copyright 30.03.2009 by Bochkanov Sergey *************************************************************************/ void minlmcreatevj(ae_int_t n, ae_int_t m, /* Real */ ae_vector* x, minlmstate* state, ae_state *_state) { _minlmstate_clear(state); ae_assert(n>=1, "MinLMCreateVJ: N<1!", _state); ae_assert(m>=1, "MinLMCreateVJ: M<1!", _state); ae_assert(x->cnt>=n, "MinLMCreateVJ: Length(X)teststep = (double)(0); state->n = n; state->m = m; state->algomode = 1; state->hasf = ae_false; state->hasfi = ae_true; state->hasg = ae_false; /* * second stage of initialization */ minlm_lmprepare(n, m, ae_false, state, _state); minlmsetacctype(state, 0, _state); minlmsetcond(state, (double)(0), 0, _state); minlmsetxrep(state, ae_false, _state); minlmsetstpmax(state, (double)(0), _state); minlmrestartfrom(state, x, _state); } /************************************************************************* IMPROVED LEVENBERG-MARQUARDT METHOD FOR NON-LINEAR LEAST SQUARES OPTIMIZATION DESCRIPTION: This function is used to find minimum of function which is represented as sum of squares: F(x) = f[0]^2(x[0],...,x[n-1]) + ... + f[m-1]^2(x[0],...,x[n-1]) using value of function vector f[] only. Finite differences are used to calculate Jacobian. REQUIREMENTS: This algorithm will request following information during its operation: * function vector f[] at given point X There are several overloaded versions of MinLMOptimize() function which correspond to different LM-like optimization algorithms provided by this unit. You should choose version which accepts fvec() callback. You can try to initialize MinLMState structure with VJ function and then use incorrect version of MinLMOptimize() (for example, version which works with general form function and does not accept function vector), but it will lead to exception being thrown after first attempt to calculate Jacobian. USAGE: 1. User initializes algorithm state with MinLMCreateV() call 2. User tunes solver parameters with MinLMSetCond(), MinLMSetStpMax() and other functions 3. User calls MinLMOptimize() function which takes algorithm state and callback functions. 4. User calls MinLMResults() to get solution 5. Optionally, user may call MinLMRestartFrom() to solve another problem with same N/M but another starting point and/or another function. MinLMRestartFrom() allows to reuse already initialized structure. INPUT PARAMETERS: N - dimension, N>1 * if given, only leading N elements of X are used * if not given, automatically determined from size of X M - number of functions f[i] X - initial solution, array[0..N-1] DiffStep- differentiation step, >0 OUTPUT PARAMETERS: State - structure which stores algorithm state See also MinLMIteration, MinLMResults. NOTES: 1. you may tune stopping conditions with MinLMSetCond() function 2. if target function contains exp() or other fast growing functions, and optimization algorithm makes too large steps which leads to overflow, use MinLMSetStpMax() function to bound algorithm's steps. 
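EXAMPLE (editorial sketch, not part of the original ALGLIB sources):

    The fragment below illustrates the V-protocol workflow described above,
    written against the C++ wrapper layer (alglib namespace) documented
    elsewhere in this manual; the callback name function1_fvec, the header
    names and the tolerance values are placeholders chosen for illustration.

        #include <cmath>
        #include "optimization.h"

        // f0 = 10*(x0+3)^2, f1 = (x1-3)^2; F = f0^2 + f1^2 has its minimum at (-3,+3)
        void function1_fvec(const alglib::real_1d_array &x,
                            alglib::real_1d_array &fi, void *ptr)
        {
            fi[0] = 10*pow(x[0]+3,2);
            fi[1] = pow(x[1]-3,2);
        }

        alglib::real_1d_array x = "[0,0]";
        alglib::minlmstate    state;
        alglib::minlmreport   rep;
        alglib::minlmcreatev(2, 2, x, 0.0001, state); // N=2, M=2, DiffStep=0.0001
        alglib::minlmsetcond(state, 1.0e-10, 0);      // EpsX=1E-10, MaxIts unlimited
        alglib::minlmoptimize(state, function1_fvec); // only f[] is requested
        alglib::minlmresults(state, x, rep);          // x now holds the solution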
-- ALGLIB -- Copyright 30.03.2009 by Bochkanov Sergey *************************************************************************/ void minlmcreatev(ae_int_t n, ae_int_t m, /* Real */ ae_vector* x, double diffstep, minlmstate* state, ae_state *_state) { _minlmstate_clear(state); ae_assert(ae_isfinite(diffstep, _state), "MinLMCreateV: DiffStep is not finite!", _state); ae_assert(ae_fp_greater(diffstep,(double)(0)), "MinLMCreateV: DiffStep<=0!", _state); ae_assert(n>=1, "MinLMCreateV: N<1!", _state); ae_assert(m>=1, "MinLMCreateV: M<1!", _state); ae_assert(x->cnt>=n, "MinLMCreateV: Length(X)teststep = (double)(0); state->n = n; state->m = m; state->algomode = 0; state->hasf = ae_false; state->hasfi = ae_true; state->hasg = ae_false; state->diffstep = diffstep; /* * Second stage of initialization */ minlm_lmprepare(n, m, ae_false, state, _state); minlmsetacctype(state, 1, _state); minlmsetcond(state, (double)(0), 0, _state); minlmsetxrep(state, ae_false, _state); minlmsetstpmax(state, (double)(0), _state); minlmrestartfrom(state, x, _state); } /************************************************************************* LEVENBERG-MARQUARDT-LIKE METHOD FOR NON-LINEAR OPTIMIZATION DESCRIPTION: This function is used to find minimum of general form (not "sum-of- -squares") function F = F(x[0], ..., x[n-1]) using its gradient and Hessian. Levenberg-Marquardt modification with L-BFGS pre-optimization and internal pre-conditioned L-BFGS optimization after each Levenberg-Marquardt step is used. REQUIREMENTS: This algorithm will request following information during its operation: * function value F at given point X * F and gradient G (simultaneously) at given point X * F, G and Hessian H (simultaneously) at given point X There are several overloaded versions of MinLMOptimize() function which correspond to different LM-like optimization algorithms provided by this unit. You should choose version which accepts func(), grad() and hess() function pointers. First pointer is used to calculate F at given point, second one calculates F(x) and grad F(x), third one calculates F(x), grad F(x), hess F(x). You can try to initialize MinLMState structure with FGH-function and then use incorrect version of MinLMOptimize() (for example, version which does not provide Hessian matrix), but it will lead to exception being thrown after first attempt to calculate Hessian. USAGE: 1. User initializes algorithm state with MinLMCreateFGH() call 2. User tunes solver parameters with MinLMSetCond(), MinLMSetStpMax() and other functions 3. User calls MinLMOptimize() function which takes algorithm state and pointers (delegates, etc.) to callback functions. 4. User calls MinLMResults() to get solution 5. Optionally, user may call MinLMRestartFrom() to solve another problem with same N but another starting point and/or another function. MinLMRestartFrom() allows to reuse already initialized structure. INPUT PARAMETERS: N - dimension, N>1 * if given, only leading N elements of X are used * if not given, automatically determined from size of X X - initial solution, array[0..N-1] OUTPUT PARAMETERS: State - structure which stores algorithm state NOTES: 1. you may tune stopping conditions with MinLMSetCond() function 2. if target function contains exp() or other fast growing functions, and optimization algorithm makes too large steps which leads to overflow, use MinLMSetStpMax() function to bound algorithm's steps. 
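EXAMPLE (editorial sketch, not part of the original ALGLIB sources):

    A minimal illustration of the FGH protocol described above, again using
    the C++ wrapper layer; the callback names function1_func/grad/hess are
    placeholders, and the target F(x0,x1) = 100*(x0+3)^4 + (x1-3)^4 was
    chosen only to keep the callbacks short.

        void function1_func(const alglib::real_1d_array &x, double &func, void *ptr)
        {
            func = 100*pow(x[0]+3,4) + pow(x[1]-3,4);
        }
        void function1_grad(const alglib::real_1d_array &x, double &func,
                            alglib::real_1d_array &grad, void *ptr)
        {
            func    = 100*pow(x[0]+3,4) + pow(x[1]-3,4);
            grad[0] = 400*pow(x[0]+3,3);
            grad[1] = 4*pow(x[1]-3,3);
        }
        void function1_hess(const alglib::real_1d_array &x, double &func,
                            alglib::real_1d_array &grad, alglib::real_2d_array &hess, void *ptr)
        {
            func       = 100*pow(x[0]+3,4) + pow(x[1]-3,4);
            grad[0]    = 400*pow(x[0]+3,3);
            grad[1]    = 4*pow(x[1]-3,3);
            hess[0][0] = 1200*pow(x[0]+3,2);
            hess[0][1] = 0;
            hess[1][0] = 0;
            hess[1][1] = 12*pow(x[1]-3,2);
        }

        alglib::real_1d_array x = "[0,0]";
        alglib::minlmstate    state;
        alglib::minlmreport   rep;
        alglib::minlmcreatefgh(x, state);             // N deduced from size of x
        alglib::minlmsetcond(state, 1.0e-10, 0);
        alglib::minlmoptimize(state, function1_func, function1_grad, function1_hess);
        alglib::minlmresults(state, x, rep);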
-- ALGLIB -- Copyright 30.03.2009 by Bochkanov Sergey *************************************************************************/ void minlmcreatefgh(ae_int_t n, /* Real */ ae_vector* x, minlmstate* state, ae_state *_state) { _minlmstate_clear(state); ae_assert(n>=1, "MinLMCreateFGH: N<1!", _state); ae_assert(x->cnt>=n, "MinLMCreateFGH: Length(X)teststep = (double)(0); state->n = n; state->m = 0; state->algomode = 2; state->hasf = ae_true; state->hasfi = ae_false; state->hasg = ae_true; /* * init2 */ minlm_lmprepare(n, 0, ae_true, state, _state); minlmsetacctype(state, 2, _state); minlmsetcond(state, (double)(0), 0, _state); minlmsetxrep(state, ae_false, _state); minlmsetstpmax(state, (double)(0), _state); minlmrestartfrom(state, x, _state); } /************************************************************************* This function sets stopping conditions for Levenberg-Marquardt optimization algorithm. INPUT PARAMETERS: State - structure which stores algorithm state EpsX - >=0 The subroutine finishes its work if on k+1-th iteration the condition |v|<=EpsX is fulfilled, where: * |.| means Euclidian norm * v - scaled step vector, v[i]=dx[i]/s[i] * dx - ste pvector, dx=X(k+1)-X(k) * s - scaling coefficients set by MinLMSetScale() Recommended values: 1E-9 ... 1E-12. MaxIts - maximum number of iterations. If MaxIts=0, the number of iterations is unlimited. Only Levenberg-Marquardt iterations are counted (L-BFGS/CG iterations are NOT counted because their cost is very low compared to that of LM). Passing EpsX=0 and MaxIts=0 (simultaneously) will lead to automatic stopping criterion selection (small EpsX). NOTE: it is not recommended to set large EpsX (say, 0.001). Because LM is a second-order method, it performs very precise steps anyway. -- ALGLIB -- Copyright 02.04.2010 by Bochkanov Sergey *************************************************************************/ void minlmsetcond(minlmstate* state, double epsx, ae_int_t maxits, ae_state *_state) { ae_assert(ae_isfinite(epsx, _state), "MinLMSetCond: EpsX is not finite number!", _state); ae_assert(ae_fp_greater_eq(epsx,(double)(0)), "MinLMSetCond: negative EpsX!", _state); ae_assert(maxits>=0, "MinLMSetCond: negative MaxIts!", _state); if( ae_fp_eq(epsx,(double)(0))&&maxits==0 ) { epsx = 1.0E-9; } state->epsx = epsx; state->maxits = maxits; } /************************************************************************* This function turns on/off reporting. INPUT PARAMETERS: State - structure which stores algorithm state NeedXRep- whether iteration reports are needed or not If NeedXRep is True, algorithm will call rep() callback function if it is provided to MinLMOptimize(). Both Levenberg-Marquardt and internal L-BFGS iterations are reported. -- ALGLIB -- Copyright 02.04.2010 by Bochkanov Sergey *************************************************************************/ void minlmsetxrep(minlmstate* state, ae_bool needxrep, ae_state *_state) { state->xrep = needxrep; } /************************************************************************* This function sets maximum step length INPUT PARAMETERS: State - structure which stores algorithm state StpMax - maximum step length, >=0. Set StpMax to 0.0, if you don't want to limit step length. Use this subroutine when you optimize target function which contains exp() or other fast growing functions, and optimization algorithm makes too large steps which leads to overflow. 
This function allows us to reject steps that are too large (and therefore expose us to the possible overflow) without actually calculating function value at the x+stp*d. NOTE: non-zero StpMax leads to moderate performance degradation because intermediate step of preconditioned L-BFGS optimization is incompatible with limits on step size. -- ALGLIB -- Copyright 02.04.2010 by Bochkanov Sergey *************************************************************************/ void minlmsetstpmax(minlmstate* state, double stpmax, ae_state *_state) { ae_assert(ae_isfinite(stpmax, _state), "MinLMSetStpMax: StpMax is not finite!", _state); ae_assert(ae_fp_greater_eq(stpmax,(double)(0)), "MinLMSetStpMax: StpMax<0!", _state); state->stpmax = stpmax; } /************************************************************************* This function sets scaling coefficients for LM optimizer. ALGLIB optimizers use scaling matrices to test stopping conditions (step size and gradient are scaled before comparison with tolerances). Scale of the I-th variable is a translation invariant measure of: a) "how large" the variable is b) how large the step should be to make significant changes in the function Generally, scale is NOT considered to be a form of preconditioner. But LM optimizer is unique in that it uses scaling matrix both in the stopping condition tests and as Marquardt damping factor. Proper scaling is very important for the algorithm performance. It is less important for the quality of results, but still has some influence (it is easier to converge when variables are properly scaled, so premature stopping is possible when very badly scalled variables are combined with relaxed stopping conditions). INPUT PARAMETERS: State - structure stores algorithm state S - array[N], non-zero scaling coefficients S[i] may be negative, sign doesn't matter. -- ALGLIB -- Copyright 14.01.2011 by Bochkanov Sergey *************************************************************************/ void minlmsetscale(minlmstate* state, /* Real */ ae_vector* s, ae_state *_state) { ae_int_t i; ae_assert(s->cnt>=state->n, "MinLMSetScale: Length(S)n-1; i++) { ae_assert(ae_isfinite(s->ptr.p_double[i], _state), "MinLMSetScale: S contains infinite or NAN elements", _state); ae_assert(ae_fp_neq(s->ptr.p_double[i],(double)(0)), "MinLMSetScale: S contains zero elements", _state); state->s.ptr.p_double[i] = ae_fabs(s->ptr.p_double[i], _state); } } /************************************************************************* This function sets boundary constraints for LM optimizer Boundary constraints are inactive by default (after initial creation). They are preserved until explicitly turned off with another SetBC() call. INPUT PARAMETERS: State - structure stores algorithm state BndL - lower bounds, array[N]. If some (all) variables are unbounded, you may specify very small number or -INF (latter is recommended because it will allow solver to use better algorithm). BndU - upper bounds, array[N]. If some (all) variables are unbounded, you may specify very large number or +INF (latter is recommended because it will allow solver to use better algorithm). NOTE 1: it is possible to specify BndL[i]=BndU[i]. In this case I-th variable will be "frozen" at X[i]=BndL[i]=BndU[i]. 
NOTE 2: this solver has the following useful properties: * bound constraints are always satisfied exactly * function is evaluated only INSIDE area specified by bound constraints or at its boundary -- ALGLIB -- Copyright 14.01.2011 by Bochkanov Sergey *************************************************************************/ void minlmsetbc(minlmstate* state, /* Real */ ae_vector* bndl, /* Real */ ae_vector* bndu, ae_state *_state) { ae_int_t i; ae_int_t n; n = state->n; ae_assert(bndl->cnt>=n, "MinLMSetBC: Length(BndL)<N", _state); ae_assert(bndu->cnt>=n, "MinLMSetBC: Length(BndU)<N", _state); for(i=0; i<=n-1; i++) { ae_assert(ae_isfinite(bndl->ptr.p_double[i], _state)||ae_isneginf(bndl->ptr.p_double[i], _state), "MinLMSetBC: BndL contains NAN or +INF", _state); ae_assert(ae_isfinite(bndu->ptr.p_double[i], _state)||ae_isposinf(bndu->ptr.p_double[i], _state), "MinLMSetBC: BndU contains NAN or -INF", _state); state->bndl.ptr.p_double[i] = bndl->ptr.p_double[i]; state->havebndl.ptr.p_bool[i] = ae_isfinite(bndl->ptr.p_double[i], _state); state->bndu.ptr.p_double[i] = bndu->ptr.p_double[i]; state->havebndu.ptr.p_bool[i] = ae_isfinite(bndu->ptr.p_double[i], _state); } } /************************************************************************* This function sets general linear constraints for LM optimizer Linear constraints are inactive by default (after initial creation). They are preserved until explicitly turned off with another minlmsetlc() call. INPUT PARAMETERS: State - structure stores algorithm state C - linear constraints, array[K,N+1]. Each row of C represents one constraint, either equality or inequality (see below): * first N elements correspond to coefficients, * last element corresponds to the right part. All elements of C (including right part) must be finite. CT - type of constraints, array[K]: * if CT[i]>0, then I-th constraint is C[i,*]*x >= C[i,n+1] * if CT[i]=0, then I-th constraint is C[i,*]*x = C[i,n+1] * if CT[i]<0, then I-th constraint is C[i,*]*x <= C[i,n+1] K - number of equality/inequality constraints, K>=0: * if given, only leading K elements of C/CT are used * if not given, automatically determined from sizes of C/CT IMPORTANT: if you have linear constraints, it is strongly recommended to set scale of variables with minlmsetscale(). QP solver which is used to calculate linearly constrained steps heavily relies on good scaling of input problems. IMPORTANT: solvers created with minlmcreatefgh() do not support linear constraints. NOTE: linear (non-bound) constraints are satisfied only approximately - there always exists some violation due to numerical errors and algorithmic limitations. NOTE: general linear constraints add significant overhead to solution process. Although solver performs roughly the same number of iterations (when compared with similar box-only constrained problem), each iteration now involves solution of linearly constrained QP subproblem, which requires ~3-5 times more Cholesky decompositions. Thus, if you can reformulate your problem in such a way that it has only box constraints, it may be beneficial to do so. 
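A minimal usage sketch follows; it is illustrative only, assumes the C++ wrapper layer (alglib namespace) documented in other sections of this manual, and all problem dimensions and constraint values are placeholders:

    // assume: alglib::minlmstate state created with minlmcreatev()/minlmcreatevj() for N=2 variables
    alglib::real_1d_array bndl = "[0.0,0.0]", bndu = "[1.0,1.0]";
    alglib::minlmsetbc(state, bndl, bndu);              // box constraints 0<=x[i]<=1
    alglib::real_2d_array c  = "[[1,1,1],[1,-1,0.5]]";  // each row: N coefficients followed by the right part
    alglib::integer_1d_array ct = "[0,-1]";             // x0+x1=1 (equality), x0-x1<=0.5 (inequality)
    alglib::minlmsetlc(state, c, ct);
    alglib::real_1d_array s = "[1,1]";
    alglib::minlmsetscale(state, s);                    // strongly recommended when linear constraints are present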
-- ALGLIB -- Copyright 14.01.2011 by Bochkanov Sergey *************************************************************************/ void minlmsetlc(minlmstate* state, /* Real */ ae_matrix* c, /* Integer */ ae_vector* ct, ae_int_t k, ae_state *_state) { ae_int_t i; ae_int_t n; n = state->n; /* * First, check for errors in the inputs */ ae_assert(k>=0, "MinLMSetLC: K<0", _state); ae_assert(c->cols>=n+1||k==0, "MinLMSetLC: Cols(C)<N+1", _state); ae_assert(c->rows>=k, "MinLMSetLC: Rows(C)<K", _state); ae_assert(ct->cnt>=k, "MinLMSetLC: Length(CT)<K", _state); if( k==0 ) { state->nec = 0; state->nic = 0; return; } /* * Equality constraints are stored first, in the upper * NEC rows of State.CLEIC matrix. Inequality constraints * are stored in the next NIC rows. * * NOTE: we convert inequality constraints to the form * A*x<=b before copying them. */ rmatrixsetlengthatleast(&state->cleic, k, n+1, _state); state->nec = 0; state->nic = 0; for(i=0; i<=k-1; i++) { if( ct->ptr.p_int[i]==0 ) { ae_v_move(&state->cleic.ptr.pp_double[state->nec][0], 1, &c->ptr.pp_double[i][0], 1, ae_v_len(0,n)); state->nec = state->nec+1; } } for(i=0; i<=k-1; i++) { if( ct->ptr.p_int[i]!=0 ) { if( ct->ptr.p_int[i]>0 ) { ae_v_moveneg(&state->cleic.ptr.pp_double[state->nec+state->nic][0], 1, &c->ptr.pp_double[i][0], 1, ae_v_len(0,n)); } else { ae_v_move(&state->cleic.ptr.pp_double[state->nec+state->nic][0], 1, &c->ptr.pp_double[i][0], 1, ae_v_len(0,n)); } state->nic = state->nic+1; } } } /************************************************************************* This function is used to change acceleration settings You can choose between three acceleration strategies: * AccType=0, no acceleration. * AccType=1, secant updates are used to update quadratic model after each iteration. After fixed number of iterations (or after model breakdown) we recalculate quadratic model using analytic Jacobian or finite differences. Number of secant-based iterations depends on optimization settings: about 3 iterations - when we have analytic Jacobian, up to 2*N iterations - when we use finite differences to calculate Jacobian. AccType=1 is recommended when Jacobian calculation cost is prohibitively high (several Mx1 function vector calculations followed by several NxN Cholesky factorizations are faster than calculation of one M*N Jacobian). It should also be used when we have no Jacobian, because finite difference approximation takes too much time to compute. The table below lists optimization protocols (XYZ protocol corresponds to MinLMCreateXYZ) and acceleration types they support (and use by default).

ACCELERATION TYPES SUPPORTED BY OPTIMIZATION PROTOCOLS:
protocol    0   1   comment
V           +   +
VJ          +   +
FGH         +

DEFAULT VALUES:
protocol    0   1   comment
V               x   without acceleration it is so slooooooooow
VJ          x
FGH         x

NOTE: this function should be called before optimization. Attempt to call it during algorithm iterations may result in unexpected behavior. NOTE: attempt to call this function with unsupported protocol/acceleration combination will result in exception being thrown. 
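A short illustrative sketch (assuming the C++ wrapper layer, alglib namespace; dimensions, starting point and differentiation step are placeholders):

    alglib::real_1d_array x = "[0,0]";
    alglib::minlmstate state;
    alglib::minlmcreatev(2, 3, x, 0.0001, state);  // V protocol: N=2 variables, M=3 functions, numerical Jacobian
    alglib::minlmsetacctype(state, 1);             // AccType=1 (secant acceleration); also the default for V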
-- ALGLIB -- Copyright 14.10.2010 by Bochkanov Sergey *************************************************************************/ void minlmsetacctype(minlmstate* state, ae_int_t acctype, ae_state *_state) { ae_assert((acctype==0||acctype==1)||acctype==2, "MinLMSetAccType: incorrect AccType!", _state); if( acctype==2 ) { acctype = 0; } if( acctype==0 ) { state->maxmodelage = 0; state->makeadditers = ae_false; return; } if( acctype==1 ) { ae_assert(state->hasfi, "MinLMSetAccType: AccType=1 is incompatible with current protocol!", _state); if( state->algomode==0 ) { state->maxmodelage = 2*state->n; } else { state->maxmodelage = minlm_smallmodelage; } state->makeadditers = ae_false; return; } } /************************************************************************* NOTES: 1. Depending on function used to create state structure, this algorithm may accept Jacobian and/or Hessian and/or gradient. As noted above, there are several versions of this function, which accept different sets of callbacks. This flexibility opens the way to subtle errors - you may create state with MinLMCreateFGH() (optimization using Hessian), but call function which does not accept Hessian. So when algorithm requests Hessian, there will be no callback to call. In this case exception will be thrown. Be careful to avoid such errors because there is no way to find them at compile time - you can see them at runtime only. -- ALGLIB -- Copyright 10.03.2009 by Bochkanov Sergey *************************************************************************/ ae_bool minlmiteration(minlmstate* state, ae_state *_state) { ae_int_t n; ae_int_t m; ae_bool bflag; ae_int_t iflag; double v; double s; double t; double fnew; ae_int_t i; ae_int_t k; ae_bool result; /* * Reverse communication preparations * I know it looks ugly, but it works the same way * anywhere from C++ to Python. 
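* * In practical terms: every time this function returns True it leaves exactly one request flag set (NeedF, NeedFI, NeedFIJ or NeedFGH, asking for an evaluation at State.X, or XUpdated, which merely reports a new iterate); the caller fills in the requested quantity (if any), calls minlmiteration() again, and the Stage-based dispatch below resumes execution right after the point which issued the request.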
* * This code initializes locals by: * * random values determined during code * generation - on first subroutine call * * values from previous call - on subsequent calls */ if( state->rstate.stage>=0 ) { n = state->rstate.ia.ptr.p_int[0]; m = state->rstate.ia.ptr.p_int[1]; iflag = state->rstate.ia.ptr.p_int[2]; i = state->rstate.ia.ptr.p_int[3]; k = state->rstate.ia.ptr.p_int[4]; bflag = state->rstate.ba.ptr.p_bool[0]; v = state->rstate.ra.ptr.p_double[0]; s = state->rstate.ra.ptr.p_double[1]; t = state->rstate.ra.ptr.p_double[2]; fnew = state->rstate.ra.ptr.p_double[3]; } else { n = 359; m = -58; iflag = -919; i = -909; k = 81; bflag = ae_true; v = 74; s = -788; t = 809; fnew = 205; } if( state->rstate.stage==0 ) { goto lbl_0; } if( state->rstate.stage==1 ) { goto lbl_1; } if( state->rstate.stage==2 ) { goto lbl_2; } if( state->rstate.stage==3 ) { goto lbl_3; } if( state->rstate.stage==4 ) { goto lbl_4; } if( state->rstate.stage==5 ) { goto lbl_5; } if( state->rstate.stage==6 ) { goto lbl_6; } if( state->rstate.stage==7 ) { goto lbl_7; } if( state->rstate.stage==8 ) { goto lbl_8; } if( state->rstate.stage==9 ) { goto lbl_9; } if( state->rstate.stage==10 ) { goto lbl_10; } if( state->rstate.stage==11 ) { goto lbl_11; } if( state->rstate.stage==12 ) { goto lbl_12; } if( state->rstate.stage==13 ) { goto lbl_13; } if( state->rstate.stage==14 ) { goto lbl_14; } if( state->rstate.stage==15 ) { goto lbl_15; } if( state->rstate.stage==16 ) { goto lbl_16; } if( state->rstate.stage==17 ) { goto lbl_17; } if( state->rstate.stage==18 ) { goto lbl_18; } if( state->rstate.stage==19 ) { goto lbl_19; } if( state->rstate.stage==20 ) { goto lbl_20; } if( state->rstate.stage==21 ) { goto lbl_21; } if( state->rstate.stage==22 ) { goto lbl_22; } if( state->rstate.stage==23 ) { goto lbl_23; } if( state->rstate.stage==24 ) { goto lbl_24; } if( state->rstate.stage==25 ) { goto lbl_25; } if( state->rstate.stage==26 ) { goto lbl_26; } if( state->rstate.stage==27 ) { goto lbl_27; } if( state->rstate.stage==28 ) { goto lbl_28; } if( state->rstate.stage==29 ) { goto lbl_29; } /* * Routine body */ /* * prepare */ n = state->n; m = state->m; state->repiterationscount = 0; state->repterminationtype = 0; state->repfuncidx = -1; state->repvaridx = -1; state->repnfunc = 0; state->repnjac = 0; state->repngrad = 0; state->repnhess = 0; state->repncholesky = 0; state->userterminationneeded = ae_false; /* * Prepare LM step finder and enforce/check feasibility of constraints */ if( !minlm_minlmstepfinderinit(&state->finderstate, n, m, state->maxmodelage, state->hasfi, &state->xbase, &state->bndl, &state->bndu, &state->cleic, state->nec, state->nic, &state->s, state->stpmax, state->epsx, _state) ) { state->repterminationtype = -3; result = ae_false; return result; } /* * set constraints for obsolete QP solver */ minqpsetbc(&state->qpstate, &state->bndl, &state->bndu, _state); /* * Check, that transferred derivative value is right */ minlm_clearrequestfields(state, _state); if( !(state->algomode==1&&ae_fp_greater(state->teststep,(double)(0))) ) { goto lbl_30; } ae_v_move(&state->x.ptr.p_double[0], 1, &state->xbase.ptr.p_double[0], 1, ae_v_len(0,n-1)); state->needfij = ae_true; i = 0; lbl_32: if( i>n-1 ) { goto lbl_34; } ae_assert((state->havebndl.ptr.p_bool[i]&&ae_fp_less_eq(state->bndl.ptr.p_double[i],state->x.ptr.p_double[i]))||!state->havebndl.ptr.p_bool[i], "MinLM: internal error(State.X is out of bounds)", _state); 
ae_assert((state->havebndu.ptr.p_bool[i]&&ae_fp_less_eq(state->x.ptr.p_double[i],state->bndu.ptr.p_double[i]))||!state->havebndu.ptr.p_bool[i], "MinLMIteration: internal error(State.X is out of bounds)", _state); v = state->x.ptr.p_double[i]; state->x.ptr.p_double[i] = v-state->teststep*state->s.ptr.p_double[i]; if( state->havebndl.ptr.p_bool[i] ) { state->x.ptr.p_double[i] = ae_maxreal(state->x.ptr.p_double[i], state->bndl.ptr.p_double[i], _state); } state->xm1 = state->x.ptr.p_double[i]; state->rstate.stage = 0; goto lbl_rcomm; lbl_0: ae_v_move(&state->fm1.ptr.p_double[0], 1, &state->fi.ptr.p_double[0], 1, ae_v_len(0,m-1)); ae_v_move(&state->gm1.ptr.p_double[0], 1, &state->j.ptr.pp_double[0][i], state->j.stride, ae_v_len(0,m-1)); state->x.ptr.p_double[i] = v+state->teststep*state->s.ptr.p_double[i]; if( state->havebndu.ptr.p_bool[i] ) { state->x.ptr.p_double[i] = ae_minreal(state->x.ptr.p_double[i], state->bndu.ptr.p_double[i], _state); } state->xp1 = state->x.ptr.p_double[i]; state->rstate.stage = 1; goto lbl_rcomm; lbl_1: ae_v_move(&state->fp1.ptr.p_double[0], 1, &state->fi.ptr.p_double[0], 1, ae_v_len(0,m-1)); ae_v_move(&state->gp1.ptr.p_double[0], 1, &state->j.ptr.pp_double[0][i], state->j.stride, ae_v_len(0,m-1)); state->x.ptr.p_double[i] = (state->xm1+state->xp1)/2; if( state->havebndl.ptr.p_bool[i] ) { state->x.ptr.p_double[i] = ae_maxreal(state->x.ptr.p_double[i], state->bndl.ptr.p_double[i], _state); } if( state->havebndu.ptr.p_bool[i] ) { state->x.ptr.p_double[i] = ae_minreal(state->x.ptr.p_double[i], state->bndu.ptr.p_double[i], _state); } state->rstate.stage = 2; goto lbl_rcomm; lbl_2: ae_v_move(&state->fc1.ptr.p_double[0], 1, &state->fi.ptr.p_double[0], 1, ae_v_len(0,m-1)); ae_v_move(&state->gc1.ptr.p_double[0], 1, &state->j.ptr.pp_double[0][i], state->j.stride, ae_v_len(0,m-1)); state->x.ptr.p_double[i] = v; for(k=0; k<=m-1; k++) { if( !derivativecheck(state->fm1.ptr.p_double[k], state->gm1.ptr.p_double[k], state->fp1.ptr.p_double[k], state->gp1.ptr.p_double[k], state->fc1.ptr.p_double[k], state->gc1.ptr.p_double[k], state->xp1-state->xm1, _state) ) { state->repfuncidx = k; state->repvaridx = i; state->repterminationtype = -7; result = ae_false; return result; } } i = i+1; goto lbl_32; lbl_34: state->needfij = ae_false; lbl_30: /* * Initial report of current point * * Note 1: we rewrite State.X twice because * user may accidentally change it after first call. * * Note 2: we set NeedF or NeedFI depending on what * information about function we have. 
*/ if( !state->xrep ) { goto lbl_35; } ae_v_move(&state->x.ptr.p_double[0], 1, &state->xbase.ptr.p_double[0], 1, ae_v_len(0,n-1)); minlm_clearrequestfields(state, _state); if( !state->hasf ) { goto lbl_37; } state->needf = ae_true; state->rstate.stage = 3; goto lbl_rcomm; lbl_3: state->needf = ae_false; goto lbl_38; lbl_37: ae_assert(state->hasfi, "MinLM: internal error 2!", _state); state->needfi = ae_true; state->rstate.stage = 4; goto lbl_rcomm; lbl_4: state->needfi = ae_false; v = ae_v_dotproduct(&state->fi.ptr.p_double[0], 1, &state->fi.ptr.p_double[0], 1, ae_v_len(0,m-1)); state->f = v; lbl_38: state->repnfunc = state->repnfunc+1; ae_v_move(&state->x.ptr.p_double[0], 1, &state->xbase.ptr.p_double[0], 1, ae_v_len(0,n-1)); minlm_clearrequestfields(state, _state); state->xupdated = ae_true; state->rstate.stage = 5; goto lbl_rcomm; lbl_5: state->xupdated = ae_false; lbl_35: if( state->userterminationneeded ) { /* * User requested termination */ ae_v_move(&state->x.ptr.p_double[0], 1, &state->xbase.ptr.p_double[0], 1, ae_v_len(0,n-1)); state->repterminationtype = 8; result = ae_false; return result; } /* * Prepare control variables */ state->nu = (double)(1); state->lambdav = -ae_maxrealnumber; state->modelage = state->maxmodelage+1; state->deltaxready = ae_false; state->deltafready = ae_false; if( state->algomode==2 ) { goto lbl_39; } /* * Jacobian-based optimization mode * * Main cycle. * * We move through it until either: * * one of the stopping conditions is met * * we decide that stopping conditions are too stringent * and break from cycle */ lbl_41: if( ae_false ) { goto lbl_42; } /* * First, we have to prepare quadratic model for our function. * We use BFlag to ensure that model is prepared; * if it is false at the end of this block, something went wrong. * * We may either calculate brand new model or update old one. * * Before this block we have: * * State.XBase - current position. * * State.DeltaX - if DeltaXReady is True * * State.DeltaF - if DeltaFReady is True * * After this block is over, we will have: * * State.XBase - base point (unchanged) * * State.FBase - F(XBase) * * State.GBase - linear term * * State.QuadraticModel - quadratic term * * State.LambdaV - current estimate for lambda * * We also clear DeltaXReady/DeltaFReady flags * after initialization is done. */ ae_assert(state->algomode==0||state->algomode==1, "MinLM: integrity check failed", _state); if( !(state->modelage>state->maxmodelage||!(state->deltaxready&&state->deltafready)) ) { goto lbl_43; } /* * Refresh model (using either finite differences or analytic Jacobian) */ if( state->algomode!=0 ) { goto lbl_45; } /* * Optimization using F values only. * Use finite differences to estimate Jacobian. */ ae_assert(state->hasfi, "MinLMIteration: internal error when estimating Jacobian (no f[])", _state); k = 0; lbl_47: if( k>n-1 ) { goto lbl_49; } /* * We guard X[k] from leaving [BndL,BndU]. * In case BndL=BndU, we assume that derivative in this direction is zero. 
*/ ae_v_move(&state->x.ptr.p_double[0], 1, &state->xbase.ptr.p_double[0], 1, ae_v_len(0,n-1)); state->x.ptr.p_double[k] = state->x.ptr.p_double[k]-state->s.ptr.p_double[k]*state->diffstep; if( state->havebndl.ptr.p_bool[k] ) { state->x.ptr.p_double[k] = ae_maxreal(state->x.ptr.p_double[k], state->bndl.ptr.p_double[k], _state); } if( state->havebndu.ptr.p_bool[k] ) { state->x.ptr.p_double[k] = ae_minreal(state->x.ptr.p_double[k], state->bndu.ptr.p_double[k], _state); } state->xm1 = state->x.ptr.p_double[k]; minlm_clearrequestfields(state, _state); state->needfi = ae_true; state->rstate.stage = 6; goto lbl_rcomm; lbl_6: state->repnfunc = state->repnfunc+1; ae_v_move(&state->fm1.ptr.p_double[0], 1, &state->fi.ptr.p_double[0], 1, ae_v_len(0,m-1)); ae_v_move(&state->x.ptr.p_double[0], 1, &state->xbase.ptr.p_double[0], 1, ae_v_len(0,n-1)); state->x.ptr.p_double[k] = state->x.ptr.p_double[k]+state->s.ptr.p_double[k]*state->diffstep; if( state->havebndl.ptr.p_bool[k] ) { state->x.ptr.p_double[k] = ae_maxreal(state->x.ptr.p_double[k], state->bndl.ptr.p_double[k], _state); } if( state->havebndu.ptr.p_bool[k] ) { state->x.ptr.p_double[k] = ae_minreal(state->x.ptr.p_double[k], state->bndu.ptr.p_double[k], _state); } state->xp1 = state->x.ptr.p_double[k]; minlm_clearrequestfields(state, _state); state->needfi = ae_true; state->rstate.stage = 7; goto lbl_rcomm; lbl_7: state->repnfunc = state->repnfunc+1; ae_v_move(&state->fp1.ptr.p_double[0], 1, &state->fi.ptr.p_double[0], 1, ae_v_len(0,m-1)); v = state->xp1-state->xm1; if( ae_fp_neq(v,(double)(0)) ) { v = 1/v; ae_v_moved(&state->j.ptr.pp_double[0][k], state->j.stride, &state->fp1.ptr.p_double[0], 1, ae_v_len(0,m-1), v); ae_v_subd(&state->j.ptr.pp_double[0][k], state->j.stride, &state->fm1.ptr.p_double[0], 1, ae_v_len(0,m-1), v); } else { for(i=0; i<=m-1; i++) { state->j.ptr.pp_double[i][k] = (double)(0); } } k = k+1; goto lbl_47; lbl_49: /* * Calculate F(XBase) */ ae_v_move(&state->x.ptr.p_double[0], 1, &state->xbase.ptr.p_double[0], 1, ae_v_len(0,n-1)); minlm_clearrequestfields(state, _state); state->needfi = ae_true; state->rstate.stage = 8; goto lbl_rcomm; lbl_8: state->needfi = ae_false; state->repnfunc = state->repnfunc+1; state->repnjac = state->repnjac+1; /* * New model */ state->modelage = 0; goto lbl_46; lbl_45: /* * Obtain f[] and Jacobian */ ae_v_move(&state->x.ptr.p_double[0], 1, &state->xbase.ptr.p_double[0], 1, ae_v_len(0,n-1)); minlm_clearrequestfields(state, _state); state->needfij = ae_true; state->rstate.stage = 9; goto lbl_rcomm; lbl_9: state->needfij = ae_false; state->repnfunc = state->repnfunc+1; state->repnjac = state->repnjac+1; /* * New model */ state->modelage = 0; lbl_46: goto lbl_44; lbl_43: /* * State.J contains Jacobian or its current approximation; * refresh it using secant updates: * * f(x0+dx) = f(x0) + J*dx, * J_new = J_old + u*h' * h = x_new-x_old * u = (f_new - f_old - J_old*h)/(h'h) * * We can explicitly generate h and u, but it is * preferential to do in-place calculations. Only * I-th row of J_old is needed to calculate u[I], * so we can update J row by row in one pass. * * NOTE: we expect that State.XBase contains new point, * State.FBase contains old point, State.DeltaX and * State.DeltaY contain updates from last step. 
*/ ae_assert(state->deltaxready&&state->deltafready, "MinLMIteration: uninitialized DeltaX/DeltaF", _state); t = ae_v_dotproduct(&state->deltax.ptr.p_double[0], 1, &state->deltax.ptr.p_double[0], 1, ae_v_len(0,n-1)); ae_assert(ae_fp_neq(t,(double)(0)), "MinLM: internal error (T=0)", _state); for(i=0; i<=m-1; i++) { v = ae_v_dotproduct(&state->j.ptr.pp_double[i][0], 1, &state->deltax.ptr.p_double[0], 1, ae_v_len(0,n-1)); v = (state->deltaf.ptr.p_double[i]-v)/t; ae_v_addd(&state->j.ptr.pp_double[i][0], 1, &state->deltax.ptr.p_double[0], 1, ae_v_len(0,n-1), v); } ae_v_move(&state->fi.ptr.p_double[0], 1, &state->fibase.ptr.p_double[0], 1, ae_v_len(0,m-1)); ae_v_add(&state->fi.ptr.p_double[0], 1, &state->deltaf.ptr.p_double[0], 1, ae_v_len(0,m-1)); /* * Increase model age */ state->modelage = state->modelage+1; lbl_44: rmatrixgemm(n, n, m, 2.0, &state->j, 0, 0, 1, &state->j, 0, 0, 0, 0.0, &state->quadraticmodel, 0, 0, _state); rmatrixmv(n, m, &state->j, 0, 0, 1, &state->fi, 0, &state->gbase, 0, _state); ae_v_muld(&state->gbase.ptr.p_double[0], 1, ae_v_len(0,n-1), 2); v = ae_v_dotproduct(&state->fi.ptr.p_double[0], 1, &state->fi.ptr.p_double[0], 1, ae_v_len(0,m-1)); state->fbase = v; ae_v_move(&state->fibase.ptr.p_double[0], 1, &state->fi.ptr.p_double[0], 1, ae_v_len(0,m-1)); state->deltaxready = ae_false; state->deltafready = ae_false; /* * Perform integrity check (presense of NAN/INF) */ v = state->fbase; for(i=0; i<=n-1; i++) { v = 0.1*v+state->gbase.ptr.p_double[i]; } if( !ae_isfinite(v, _state) ) { /* * Break! */ state->repterminationtype = -8; result = ae_false; return result; } /* * If Lambda is not initialized, initialize it using quadratic model */ if( ae_fp_less(state->lambdav,(double)(0)) ) { state->lambdav = (double)(0); for(i=0; i<=n-1; i++) { state->lambdav = ae_maxreal(state->lambdav, ae_fabs(state->quadraticmodel.ptr.pp_double[i][i], _state)*ae_sqr(state->s.ptr.p_double[i], _state), _state); } state->lambdav = 0.001*state->lambdav; if( ae_fp_eq(state->lambdav,(double)(0)) ) { state->lambdav = (double)(1); } } /* * Find value of Levenberg-Marquardt damping parameter which: * * leads to positive definite damped model * * within bounds specified by StpMax * * generates step which decreases function value * * After this block IFlag is set to: * * -8, if internal integrity control detected NAN/INF in function values * * -3, if constraints are infeasible * * -2, if model update is needed (either Lambda growth is too large * or step is too short, but we can't rely on model and stop iterations) * * -1, if model is fresh, Lambda have grown too large, termination is needed * * 0, if everything is OK, continue iterations * * >0, successful termination, step is less than EpsX * * State.Nu can have any value on enter, but after exit it is set to 1.0 */ iflag = -99; minlm_minlmstepfinderstart(&state->finderstate, &state->quadraticmodel, &state->gbase, state->fbase, &state->xbase, &state->fibase, state->modelage, _state); lbl_50: if( !minlm_minlmstepfinderiteration(&state->finderstate, &state->lambdav, &state->nu, &state->xnew, &state->deltax, &state->deltaxready, &state->deltaf, &state->deltafready, &iflag, &fnew, &state->repncholesky, _state) ) { goto lbl_51; } ae_assert(state->hasfi||state->hasf, "MinLM: internal error 2!", _state); state->repnfunc = state->repnfunc+1; minlm_clearrequestfields(state, _state); if( !state->finderstate.needfi ) { goto lbl_52; } ae_assert(state->hasfi, "MinLM: internal error 2!", _state); ae_v_move(&state->x.ptr.p_double[0], 1, 
&state->finderstate.x.ptr.p_double[0], 1, ae_v_len(0,n-1)); state->needfi = ae_true; state->rstate.stage = 10; goto lbl_rcomm; lbl_10: state->needfi = ae_false; ae_v_move(&state->finderstate.fi.ptr.p_double[0], 1, &state->fi.ptr.p_double[0], 1, ae_v_len(0,m-1)); goto lbl_50; lbl_52: if( !state->finderstate.needf ) { goto lbl_54; } ae_assert(state->hasf, "MinLM: internal error 2!", _state); ae_v_move(&state->x.ptr.p_double[0], 1, &state->finderstate.x.ptr.p_double[0], 1, ae_v_len(0,n-1)); state->needf = ae_true; state->rstate.stage = 11; goto lbl_rcomm; lbl_11: state->needf = ae_false; state->finderstate.f = state->f; goto lbl_50; lbl_54: ae_assert(ae_false, "MinLM: internal error 2!", _state); goto lbl_50; lbl_51: if( state->userterminationneeded ) { /* * User requested termination */ ae_v_move(&state->x.ptr.p_double[0], 1, &state->xbase.ptr.p_double[0], 1, ae_v_len(0,n-1)); state->repterminationtype = 8; result = ae_false; return result; } state->nu = (double)(1); ae_assert(((iflag>=-3&&iflag<=0)||iflag==-8)||iflag>0, "MinLM: internal integrity check failed!", _state); if( iflag==-3 ) { state->repterminationtype = -3; result = ae_false; return result; } if( iflag==-2 ) { state->modelage = state->maxmodelage+1; goto lbl_41; } if( iflag!=-1 ) { goto lbl_56; } /* * Stopping conditions are too stringent */ state->repterminationtype = 7; if( !state->xrep ) { goto lbl_58; } ae_v_move(&state->x.ptr.p_double[0], 1, &state->xbase.ptr.p_double[0], 1, ae_v_len(0,n-1)); state->f = state->fbase; minlm_clearrequestfields(state, _state); state->xupdated = ae_true; state->rstate.stage = 12; goto lbl_rcomm; lbl_12: state->xupdated = ae_false; lbl_58: result = ae_false; return result; lbl_56: if( !(iflag==-8||iflag>0) ) { goto lbl_60; } /* * Either: * * Integrity check failed - infinities or NANs * * successful termination (step size is small enough) */ state->repterminationtype = iflag; if( !state->xrep ) { goto lbl_62; } ae_v_move(&state->x.ptr.p_double[0], 1, &state->xbase.ptr.p_double[0], 1, ae_v_len(0,n-1)); state->f = state->fbase; minlm_clearrequestfields(state, _state); state->xupdated = ae_true; state->rstate.stage = 13; goto lbl_rcomm; lbl_13: state->xupdated = ae_false; lbl_62: result = ae_false; return result; lbl_60: state->f = fnew; /* * Levenberg-Marquardt step is ready. * Compare predicted vs. actual decrease and decide what to do with lambda. * * NOTE: we expect that State.DeltaX contains direction of step, * State.F contains function value at new point. */ ae_assert(state->deltaxready, "MinLM: deltaX is not ready", _state); iflag = minlm_checkdecrease(&state->quadraticmodel, &state->gbase, state->fbase, n, &state->deltax, state->f, &state->lambdav, &state->nu, _state); if( iflag==0 ) { goto lbl_64; } state->repterminationtype = iflag; if( !state->xrep ) { goto lbl_66; } ae_v_move(&state->x.ptr.p_double[0], 1, &state->xbase.ptr.p_double[0], 1, ae_v_len(0,n-1)); state->f = state->fbase; minlm_clearrequestfields(state, _state); state->xupdated = ae_true; state->rstate.stage = 14; goto lbl_rcomm; lbl_14: state->xupdated = ae_false; lbl_66: result = ae_false; return result; lbl_64: /* * Accept step, report it and * test stopping conditions on iterations count and function decrease. * * NOTE: we expect that State.DeltaX contains direction of step, * State.F contains function value at new point. * * NOTE2: we should update XBase ONLY. In the beginning of the next * iteration we expect that State.FIBase is NOT updated and * contains old value of a function vector. 
*/ ae_v_move(&state->xbase.ptr.p_double[0], 1, &state->xnew.ptr.p_double[0], 1, ae_v_len(0,n-1)); if( !state->xrep ) { goto lbl_68; } ae_v_move(&state->x.ptr.p_double[0], 1, &state->xbase.ptr.p_double[0], 1, ae_v_len(0,n-1)); minlm_clearrequestfields(state, _state); state->xupdated = ae_true; state->rstate.stage = 15; goto lbl_rcomm; lbl_15: state->xupdated = ae_false; lbl_68: state->repiterationscount = state->repiterationscount+1; if( state->repiterationscount>=state->maxits&&state->maxits>0 ) { state->repterminationtype = 5; } if( state->repterminationtype<=0 ) { goto lbl_70; } if( !state->xrep ) { goto lbl_72; } /* * Report: XBase contains new point, F contains function value at new point */ ae_v_move(&state->x.ptr.p_double[0], 1, &state->xbase.ptr.p_double[0], 1, ae_v_len(0,n-1)); minlm_clearrequestfields(state, _state); state->xupdated = ae_true; state->rstate.stage = 16; goto lbl_rcomm; lbl_16: state->xupdated = ae_false; lbl_72: result = ae_false; return result; lbl_70: state->modelage = state->modelage+1; goto lbl_41; lbl_42: /* * Lambda is too large, we have to break iterations. */ state->repterminationtype = 7; if( !state->xrep ) { goto lbl_74; } ae_v_move(&state->x.ptr.p_double[0], 1, &state->xbase.ptr.p_double[0], 1, ae_v_len(0,n-1)); state->f = state->fbase; minlm_clearrequestfields(state, _state); state->xupdated = ae_true; state->rstate.stage = 17; goto lbl_rcomm; lbl_17: state->xupdated = ae_false; lbl_74: goto lbl_40; lbl_39: /* * Legacy Hessian-based mode * * Main cycle. * * We move through it until either: * * one of the stopping conditions is met * * we decide that stopping conditions are too stringent * and break from cycle * */ if( state->nec+state->nic>0 ) { /* * FGH solver does not support general linear constraints */ state->repterminationtype = -5; result = ae_false; return result; } lbl_76: if( ae_false ) { goto lbl_77; } /* * First, we have to prepare quadratic model for our function. * We use BFlag to ensure that model is prepared; * if it is false at the end of this block, something went wrong. * * We may either calculate brand new model or update old one. * * Before this block we have: * * State.XBase - current position. * * State.DeltaX - if DeltaXReady is True * * State.DeltaF - if DeltaFReady is True * * After this block is over, we will have: * * State.XBase - base point (unchanged) * * State.FBase - F(XBase) * * State.GBase - linear term * * State.QuadraticModel - quadratic term * * State.LambdaV - current estimate for lambda * * We also clear DeltaXReady/DeltaFReady flags * after initialization is done. */ bflag = ae_false; if( !(state->algomode==0||state->algomode==1) ) { goto lbl_78; } /* * Calculate f[] and Jacobian */ if( !(state->modelage>state->maxmodelage||!(state->deltaxready&&state->deltafready)) ) { goto lbl_80; } /* * Refresh model (using either finite differences or analytic Jacobian) */ if( state->algomode!=0 ) { goto lbl_82; } /* * Optimization using F values only. * Use finite differences to estimate Jacobian. */ ae_assert(state->hasfi, "MinLMIteration: internal error when estimating Jacobian (no f[])", _state); k = 0; lbl_84: if( k>n-1 ) { goto lbl_86; } /* * We guard X[k] from leaving [BndL,BndU]. * In case BndL=BndU, we assume that derivative in this direction is zero. 
*/ ae_v_move(&state->x.ptr.p_double[0], 1, &state->xbase.ptr.p_double[0], 1, ae_v_len(0,n-1)); state->x.ptr.p_double[k] = state->x.ptr.p_double[k]-state->s.ptr.p_double[k]*state->diffstep; if( state->havebndl.ptr.p_bool[k] ) { state->x.ptr.p_double[k] = ae_maxreal(state->x.ptr.p_double[k], state->bndl.ptr.p_double[k], _state); } if( state->havebndu.ptr.p_bool[k] ) { state->x.ptr.p_double[k] = ae_minreal(state->x.ptr.p_double[k], state->bndu.ptr.p_double[k], _state); } state->xm1 = state->x.ptr.p_double[k]; minlm_clearrequestfields(state, _state); state->needfi = ae_true; state->rstate.stage = 18; goto lbl_rcomm; lbl_18: state->repnfunc = state->repnfunc+1; ae_v_move(&state->fm1.ptr.p_double[0], 1, &state->fi.ptr.p_double[0], 1, ae_v_len(0,m-1)); ae_v_move(&state->x.ptr.p_double[0], 1, &state->xbase.ptr.p_double[0], 1, ae_v_len(0,n-1)); state->x.ptr.p_double[k] = state->x.ptr.p_double[k]+state->s.ptr.p_double[k]*state->diffstep; if( state->havebndl.ptr.p_bool[k] ) { state->x.ptr.p_double[k] = ae_maxreal(state->x.ptr.p_double[k], state->bndl.ptr.p_double[k], _state); } if( state->havebndu.ptr.p_bool[k] ) { state->x.ptr.p_double[k] = ae_minreal(state->x.ptr.p_double[k], state->bndu.ptr.p_double[k], _state); } state->xp1 = state->x.ptr.p_double[k]; minlm_clearrequestfields(state, _state); state->needfi = ae_true; state->rstate.stage = 19; goto lbl_rcomm; lbl_19: state->repnfunc = state->repnfunc+1; ae_v_move(&state->fp1.ptr.p_double[0], 1, &state->fi.ptr.p_double[0], 1, ae_v_len(0,m-1)); v = state->xp1-state->xm1; if( ae_fp_neq(v,(double)(0)) ) { v = 1/v; ae_v_moved(&state->j.ptr.pp_double[0][k], state->j.stride, &state->fp1.ptr.p_double[0], 1, ae_v_len(0,m-1), v); ae_v_subd(&state->j.ptr.pp_double[0][k], state->j.stride, &state->fm1.ptr.p_double[0], 1, ae_v_len(0,m-1), v); } else { for(i=0; i<=m-1; i++) { state->j.ptr.pp_double[i][k] = (double)(0); } } k = k+1; goto lbl_84; lbl_86: /* * Calculate F(XBase) */ ae_v_move(&state->x.ptr.p_double[0], 1, &state->xbase.ptr.p_double[0], 1, ae_v_len(0,n-1)); minlm_clearrequestfields(state, _state); state->needfi = ae_true; state->rstate.stage = 20; goto lbl_rcomm; lbl_20: state->needfi = ae_false; state->repnfunc = state->repnfunc+1; state->repnjac = state->repnjac+1; /* * New model */ state->modelage = 0; goto lbl_83; lbl_82: /* * Obtain f[] and Jacobian */ ae_v_move(&state->x.ptr.p_double[0], 1, &state->xbase.ptr.p_double[0], 1, ae_v_len(0,n-1)); minlm_clearrequestfields(state, _state); state->needfij = ae_true; state->rstate.stage = 21; goto lbl_rcomm; lbl_21: state->needfij = ae_false; state->repnfunc = state->repnfunc+1; state->repnjac = state->repnjac+1; /* * New model */ state->modelage = 0; lbl_83: goto lbl_81; lbl_80: /* * State.J contains Jacobian or its current approximation; * refresh it using secant updates: * * f(x0+dx) = f(x0) + J*dx, * J_new = J_old + u*h' * h = x_new-x_old * u = (f_new - f_old - J_old*h)/(h'h) * * We can explicitly generate h and u, but it is * preferential to do in-place calculations. Only * I-th row of J_old is needed to calculate u[I], * so we can update J row by row in one pass. * * NOTE: we expect that State.XBase contains new point, * State.FBase contains old point, State.DeltaX and * State.DeltaY contain updates from last step. 
*/ ae_assert(state->deltaxready&&state->deltafready, "MinLMIteration: uninitialized DeltaX/DeltaF", _state); t = ae_v_dotproduct(&state->deltax.ptr.p_double[0], 1, &state->deltax.ptr.p_double[0], 1, ae_v_len(0,n-1)); ae_assert(ae_fp_neq(t,(double)(0)), "MinLM: internal error (T=0)", _state); for(i=0; i<=m-1; i++) { v = ae_v_dotproduct(&state->j.ptr.pp_double[i][0], 1, &state->deltax.ptr.p_double[0], 1, ae_v_len(0,n-1)); v = (state->deltaf.ptr.p_double[i]-v)/t; ae_v_addd(&state->j.ptr.pp_double[i][0], 1, &state->deltax.ptr.p_double[0], 1, ae_v_len(0,n-1), v); } ae_v_move(&state->fi.ptr.p_double[0], 1, &state->fibase.ptr.p_double[0], 1, ae_v_len(0,m-1)); ae_v_add(&state->fi.ptr.p_double[0], 1, &state->deltaf.ptr.p_double[0], 1, ae_v_len(0,m-1)); /* * Increase model age */ state->modelage = state->modelage+1; lbl_81: /* * Generate quadratic model: * f(xbase+dx) = * = (f0 + J*dx)'(f0 + J*dx) * = f0^2 + dx'J'f0 + f0*J*dx + dx'J'J*dx * = f0^2 + 2*f0*J*dx + dx'J'J*dx * * Note that we calculate 2*(J'J) instead of J'J because * our quadratic model is based on Tailor decomposition, * i.e. it has 0.5 before quadratic term. */ rmatrixgemm(n, n, m, 2.0, &state->j, 0, 0, 1, &state->j, 0, 0, 0, 0.0, &state->quadraticmodel, 0, 0, _state); rmatrixmv(n, m, &state->j, 0, 0, 1, &state->fi, 0, &state->gbase, 0, _state); ae_v_muld(&state->gbase.ptr.p_double[0], 1, ae_v_len(0,n-1), 2); v = ae_v_dotproduct(&state->fi.ptr.p_double[0], 1, &state->fi.ptr.p_double[0], 1, ae_v_len(0,m-1)); state->fbase = v; ae_v_move(&state->fibase.ptr.p_double[0], 1, &state->fi.ptr.p_double[0], 1, ae_v_len(0,m-1)); /* * set control variables */ bflag = ae_true; lbl_78: if( state->algomode!=2 ) { goto lbl_87; } ae_assert(!state->hasfi, "MinLMIteration: internal error (HasFI is True in Hessian-based mode)", _state); /* * Obtain F, G, H */ ae_v_move(&state->x.ptr.p_double[0], 1, &state->xbase.ptr.p_double[0], 1, ae_v_len(0,n-1)); minlm_clearrequestfields(state, _state); state->needfgh = ae_true; state->rstate.stage = 22; goto lbl_rcomm; lbl_22: state->needfgh = ae_false; state->repnfunc = state->repnfunc+1; state->repngrad = state->repngrad+1; state->repnhess = state->repnhess+1; rmatrixcopy(n, n, &state->h, 0, 0, &state->quadraticmodel, 0, 0, _state); ae_v_move(&state->gbase.ptr.p_double[0], 1, &state->g.ptr.p_double[0], 1, ae_v_len(0,n-1)); state->fbase = state->f; /* * set control variables */ bflag = ae_true; state->modelage = 0; lbl_87: ae_assert(bflag, "MinLM: internal integrity check failed!", _state); state->deltaxready = ae_false; state->deltafready = ae_false; /* * Perform integrity check (presense of NAN/INF) */ v = state->fbase; for(i=0; i<=n-1; i++) { v = 0.1*v+state->gbase.ptr.p_double[i]; } if( !ae_isfinite(v, _state) ) { /* * Break! 
*/ state->repterminationtype = -8; result = ae_false; return result; } /* * If Lambda is not initialized, initialize it using quadratic model */ if( ae_fp_less(state->lambdav,(double)(0)) ) { state->lambdav = (double)(0); for(i=0; i<=n-1; i++) { state->lambdav = ae_maxreal(state->lambdav, ae_fabs(state->quadraticmodel.ptr.pp_double[i][i], _state)*ae_sqr(state->s.ptr.p_double[i], _state), _state); } state->lambdav = 0.001*state->lambdav; if( ae_fp_eq(state->lambdav,(double)(0)) ) { state->lambdav = (double)(1); } } /* * Find value of Levenberg-Marquardt damping parameter which: * * leads to positive definite damped model * * within bounds specified by StpMax * * generates step which decreases function value * * After this block IFlag is set to: * * -3, if constraints are infeasible * * -2, if model update is needed (either Lambda growth is too large * or step is too short, but we can't rely on model and stop iterations) * * -1, if model is fresh, Lambda have grown too large, termination is needed * * 0, if everything is OK, continue iterations * * State.Nu can have any value on enter, but after exit it is set to 1.0 */ iflag = -99; lbl_89: if( ae_false ) { goto lbl_90; } /* * Do we need model update? */ if( state->modelage>0&&ae_fp_greater_eq(state->nu,minlm_suspiciousnu) ) { iflag = -2; goto lbl_90; } /* * Setup quadratic solver and solve quadratic programming problem. * After problem is solved we'll try to bound step by StpMax * (Lambda will be increased if step size is too large). * * We use BFlag variable to indicate that we have to increase Lambda. * If it is False, we will try to increase Lambda and move to new iteration. */ bflag = ae_true; minqpsetstartingpointfast(&state->qpstate, &state->xbase, _state); minqpsetoriginfast(&state->qpstate, &state->xbase, _state); minqpsetlineartermfast(&state->qpstate, &state->gbase, _state); minqpsetquadratictermfast(&state->qpstate, &state->quadraticmodel, ae_true, 0.0, _state); for(i=0; i<=n-1; i++) { state->tmp0.ptr.p_double[i] = state->quadraticmodel.ptr.pp_double[i][i]+state->lambdav/ae_sqr(state->s.ptr.p_double[i], _state); } minqprewritediagonal(&state->qpstate, &state->tmp0, _state); minqpoptimize(&state->qpstate, _state); minqpresultsbuf(&state->qpstate, &state->xdir, &state->qprep, _state); if( state->qprep.terminationtype>0 ) { /* * successful solution of QP problem */ ae_v_sub(&state->xdir.ptr.p_double[0], 1, &state->xbase.ptr.p_double[0], 1, ae_v_len(0,n-1)); v = ae_v_dotproduct(&state->xdir.ptr.p_double[0], 1, &state->xdir.ptr.p_double[0], 1, ae_v_len(0,n-1)); if( ae_isfinite(v, _state) ) { v = ae_sqrt(v, _state); if( ae_fp_greater(state->stpmax,(double)(0))&&ae_fp_greater(v,state->stpmax) ) { bflag = ae_false; } } else { bflag = ae_false; } } else { /* * Either problem is non-convex (increase LambdaV) or constraints are inconsistent */ ae_assert(state->qprep.terminationtype==-3||state->qprep.terminationtype==-5, "MinLM: unexpected completion code from QP solver", _state); if( state->qprep.terminationtype==-3 ) { iflag = -3; goto lbl_90; } bflag = ae_false; } if( !bflag ) { /* * Solution failed: * try to increase lambda to make matrix positive definite and continue. */ if( !minlm_increaselambda(&state->lambdav, &state->nu, _state) ) { iflag = -1; goto lbl_90; } goto lbl_89; } /* * Step in State.XDir and it is bounded by StpMax. * * We should check stopping conditions on step size here. * DeltaX, which is used for secant updates, is initialized here. 
* * This code is a bit tricky because sometimes XDir<>0, but * it is so small that XDir+XBase==XBase (in finite precision * arithmetics). So we set DeltaX to XBase, then * add XDir, and then subtract XBase to get exact value of * DeltaX. * * Step length is estimated using DeltaX. * * NOTE: stopping conditions are tested * for fresh models only (ModelAge=0) */ ae_v_move(&state->deltax.ptr.p_double[0], 1, &state->xbase.ptr.p_double[0], 1, ae_v_len(0,n-1)); ae_v_add(&state->deltax.ptr.p_double[0], 1, &state->xdir.ptr.p_double[0], 1, ae_v_len(0,n-1)); ae_v_sub(&state->deltax.ptr.p_double[0], 1, &state->xbase.ptr.p_double[0], 1, ae_v_len(0,n-1)); state->deltaxready = ae_true; v = 0.0; for(i=0; i<=n-1; i++) { v = v+ae_sqr(state->deltax.ptr.p_double[i]/state->s.ptr.p_double[i], _state); } v = ae_sqrt(v, _state); if( ae_fp_greater(v,state->epsx) ) { goto lbl_91; } if( state->modelage!=0 ) { goto lbl_93; } /* * Step is too short, model is fresh and we can rely on it. * Terminating. */ state->repterminationtype = 2; if( !state->xrep ) { goto lbl_95; } ae_v_move(&state->x.ptr.p_double[0], 1, &state->xbase.ptr.p_double[0], 1, ae_v_len(0,n-1)); state->f = state->fbase; minlm_clearrequestfields(state, _state); state->xupdated = ae_true; state->rstate.stage = 23; goto lbl_rcomm; lbl_23: state->xupdated = ae_false; lbl_95: result = ae_false; return result; goto lbl_94; lbl_93: /* * Step is suspiciously short, but model is not fresh * and we can't rely on it. */ iflag = -2; goto lbl_90; lbl_94: lbl_91: /* * Let's evaluate new step: * a) if we have Fi vector, we evaluate it using rcomm, and * then we manually calculate State.F as sum of squares of Fi[] * b) if we have F value, we just evaluate it through rcomm interface * * We prefer (a) because we may need Fi vector for additional * iterations */ ae_assert(state->hasfi||state->hasf, "MinLM: internal error 2!", _state); ae_v_move(&state->x.ptr.p_double[0], 1, &state->xbase.ptr.p_double[0], 1, ae_v_len(0,n-1)); ae_v_add(&state->x.ptr.p_double[0], 1, &state->xdir.ptr.p_double[0], 1, ae_v_len(0,n-1)); minlm_clearrequestfields(state, _state); if( !state->hasfi ) { goto lbl_97; } state->needfi = ae_true; state->rstate.stage = 24; goto lbl_rcomm; lbl_24: state->needfi = ae_false; v = ae_v_dotproduct(&state->fi.ptr.p_double[0], 1, &state->fi.ptr.p_double[0], 1, ae_v_len(0,m-1)); state->f = v; ae_v_move(&state->deltaf.ptr.p_double[0], 1, &state->fi.ptr.p_double[0], 1, ae_v_len(0,m-1)); ae_v_sub(&state->deltaf.ptr.p_double[0], 1, &state->fibase.ptr.p_double[0], 1, ae_v_len(0,m-1)); state->deltafready = ae_true; goto lbl_98; lbl_97: state->needf = ae_true; state->rstate.stage = 25; goto lbl_rcomm; lbl_25: state->needf = ae_false; lbl_98: state->repnfunc = state->repnfunc+1; if( !ae_isfinite(state->f, _state) ) { /* * Integrity check failed, break! */ state->repterminationtype = -8; result = ae_false; return result; } if( ae_fp_greater_eq(state->f,state->fbase) ) { /* * Increase lambda and continue */ if( !minlm_increaselambda(&state->lambdav, &state->nu, _state) ) { iflag = -1; goto lbl_90; } goto lbl_89; } /* * We've found our step! 
*/ iflag = 0; goto lbl_90; goto lbl_89; lbl_90: if( state->userterminationneeded ) { /* * User requested termination */ ae_v_move(&state->x.ptr.p_double[0], 1, &state->xbase.ptr.p_double[0], 1, ae_v_len(0,n-1)); state->repterminationtype = 8; result = ae_false; return result; } state->nu = (double)(1); ae_assert(iflag>=-3&&iflag<=0, "MinLM: internal integrity check failed!", _state); if( iflag==-3 ) { state->repterminationtype = -3; result = ae_false; return result; } if( iflag==-2 ) { state->modelage = state->maxmodelage+1; goto lbl_76; } if( iflag==-1 ) { goto lbl_77; } /* * Levenberg-Marquardt step is ready. * Compare predicted vs. actual decrease and decide what to do with lambda. * * NOTE: we expect that State.DeltaX contains direction of step, * State.F contains function value at new point. */ ae_assert(state->deltaxready, "MinLM: deltaX is not ready", _state); t = (double)(0); for(i=0; i<=n-1; i++) { v = ae_v_dotproduct(&state->quadraticmodel.ptr.pp_double[i][0], 1, &state->deltax.ptr.p_double[0], 1, ae_v_len(0,n-1)); t = t+state->deltax.ptr.p_double[i]*state->gbase.ptr.p_double[i]+0.5*state->deltax.ptr.p_double[i]*v; } state->predicteddecrease = -t; state->actualdecrease = -(state->f-state->fbase); if( ae_fp_less_eq(state->predicteddecrease,(double)(0)) ) { goto lbl_77; } v = state->actualdecrease/state->predicteddecrease; if( ae_fp_greater_eq(v,0.1) ) { goto lbl_99; } if( minlm_increaselambda(&state->lambdav, &state->nu, _state) ) { goto lbl_101; } /* * Lambda is too large, we have to break iterations. */ state->repterminationtype = 7; if( !state->xrep ) { goto lbl_103; } ae_v_move(&state->x.ptr.p_double[0], 1, &state->xbase.ptr.p_double[0], 1, ae_v_len(0,n-1)); state->f = state->fbase; minlm_clearrequestfields(state, _state); state->xupdated = ae_true; state->rstate.stage = 26; goto lbl_rcomm; lbl_26: state->xupdated = ae_false; lbl_103: result = ae_false; return result; lbl_101: lbl_99: if( ae_fp_greater(v,0.5) ) { minlm_decreaselambda(&state->lambdav, &state->nu, _state); } /* * Accept step, report it and * test stopping conditions on iterations count and function decrease. * * NOTE: we expect that State.DeltaX contains direction of step, * State.F contains function value at new point. * * NOTE2: we should update XBase ONLY. In the beginning of the next * iteration we expect that State.FIBase is NOT updated and * contains old value of a function vector. */ ae_v_add(&state->xbase.ptr.p_double[0], 1, &state->deltax.ptr.p_double[0], 1, ae_v_len(0,n-1)); if( !state->xrep ) { goto lbl_105; } ae_v_move(&state->x.ptr.p_double[0], 1, &state->xbase.ptr.p_double[0], 1, ae_v_len(0,n-1)); minlm_clearrequestfields(state, _state); state->xupdated = ae_true; state->rstate.stage = 27; goto lbl_rcomm; lbl_27: state->xupdated = ae_false; lbl_105: state->repiterationscount = state->repiterationscount+1; if( state->repiterationscount>=state->maxits&&state->maxits>0 ) { state->repterminationtype = 5; } if( state->repterminationtype<=0 ) { goto lbl_107; } if( !state->xrep ) { goto lbl_109; } /* * Report: XBase contains new point, F contains function value at new point */ ae_v_move(&state->x.ptr.p_double[0], 1, &state->xbase.ptr.p_double[0], 1, ae_v_len(0,n-1)); minlm_clearrequestfields(state, _state); state->xupdated = ae_true; state->rstate.stage = 28; goto lbl_rcomm; lbl_28: state->xupdated = ae_false; lbl_109: result = ae_false; return result; lbl_107: state->modelage = state->modelage+1; goto lbl_76; lbl_77: /* * Lambda is too large, we have to break iterations. 
*/ state->repterminationtype = 7; if( !state->xrep ) { goto lbl_111; } ae_v_move(&state->x.ptr.p_double[0], 1, &state->xbase.ptr.p_double[0], 1, ae_v_len(0,n-1)); state->f = state->fbase; minlm_clearrequestfields(state, _state); state->xupdated = ae_true; state->rstate.stage = 29; goto lbl_rcomm; lbl_29: state->xupdated = ae_false; lbl_111: lbl_40: result = ae_false; return result; /* * Saving state */ lbl_rcomm: result = ae_true; state->rstate.ia.ptr.p_int[0] = n; state->rstate.ia.ptr.p_int[1] = m; state->rstate.ia.ptr.p_int[2] = iflag; state->rstate.ia.ptr.p_int[3] = i; state->rstate.ia.ptr.p_int[4] = k; state->rstate.ba.ptr.p_bool[0] = bflag; state->rstate.ra.ptr.p_double[0] = v; state->rstate.ra.ptr.p_double[1] = s; state->rstate.ra.ptr.p_double[2] = t; state->rstate.ra.ptr.p_double[3] = fnew; return result; } /************************************************************************* Levenberg-Marquardt algorithm results INPUT PARAMETERS: State - algorithm state OUTPUT PARAMETERS: X - array[0..N-1], solution Rep - optimization report; includes termination codes and additional information. Termination codes are listed below, see comments for this structure for more info. Termination code is stored in rep.terminationtype field: * -8 optimizer detected NAN/INF values either in the function itself, or in its Jacobian * -7 derivative correctness check failed; see rep.funcidx, rep.varidx for more information. * -3 constraints are inconsistent * 2 relative step is no more than EpsX. * 5 MaxIts steps were taken * 7 stopping conditions are too stringent, further improvement is impossible * 8 terminated by user who called minlmrequesttermination(). X contains point which was "current accepted" when termination request was submitted. -- ALGLIB -- Copyright 10.03.2009 by Bochkanov Sergey *************************************************************************/ void minlmresults(minlmstate* state, /* Real */ ae_vector* x, minlmreport* rep, ae_state *_state) { ae_vector_clear(x); _minlmreport_clear(rep); minlmresultsbuf(state, x, rep, _state); } /************************************************************************* Levenberg-Marquardt algorithm results Buffered implementation of MinLMResults(), which uses pre-allocated buffer to store X[]. If buffer size is too small, it resizes buffer. It is intended to be used in the inner cycles of performance critical algorithms where array reallocation penalty is too large to be ignored. -- ALGLIB -- Copyright 10.03.2009 by Bochkanov Sergey *************************************************************************/ void minlmresultsbuf(minlmstate* state, /* Real */ ae_vector* x, minlmreport* rep, ae_state *_state) { if( x->cnt<state->n ) { ae_vector_set_length(x, state->n, _state); } ae_v_move(&x->ptr.p_double[0], 1, &state->x.ptr.p_double[0], 1, ae_v_len(0,state->n-1)); rep->iterationscount = state->repiterationscount; rep->terminationtype = state->repterminationtype; rep->funcidx = state->repfuncidx; rep->varidx = state->repvaridx; rep->nfunc = state->repnfunc; rep->njac = state->repnjac; rep->ngrad = state->repngrad; rep->nhess = state->repnhess; rep->ncholesky = state->repncholesky; } /************************************************************************* This subroutine restarts LM algorithm from new point. All optimization parameters are left unchanged. This function allows you to solve multiple optimization problems (which must have same number of dimensions) without object reallocation penalty. 
INPUT PARAMETERS: State - structure used for reverse communication previously allocated with MinLMCreateXXX call. X - new starting point. -- ALGLIB -- Copyright 30.07.2010 by Bochkanov Sergey *************************************************************************/ void minlmrestartfrom(minlmstate* state, /* Real */ ae_vector* x, ae_state *_state) { ae_assert(x->cnt>=state->n, "MinLMRestartFrom: Length(X)<N!", _state); ae_assert(isfinitevector(x, state->n, _state), "MinLMRestartFrom: X contains infinite or NaN values!", _state); ae_v_move(&state->xbase.ptr.p_double[0], 1, &x->ptr.p_double[0], 1, ae_v_len(0,state->n-1)); ae_vector_set_length(&state->rstate.ia, 4+1, _state); ae_vector_set_length(&state->rstate.ba, 0+1, _state); ae_vector_set_length(&state->rstate.ra, 3+1, _state); state->rstate.stage = -1; minlm_clearrequestfields(state, _state); } /************************************************************************* This subroutine submits request for termination of running optimizer. It should be called from a user-supplied callback when the user decides that it is time to "smoothly" terminate optimization process. As a result, optimizer stops at the point which was "current accepted" when termination request was submitted and returns error code 8 (successful termination). INPUT PARAMETERS: State - optimizer structure NOTE: after request for termination optimizer may perform several additional calls to user-supplied callbacks. It does NOT guarantee to stop immediately - it just guarantees that these additional calls will be discarded later. NOTE: calling this function on optimizer which is NOT running will have no effect. NOTE: multiple calls to this function are possible. First call is counted, subsequent calls are silently ignored. -- ALGLIB -- Copyright 08.10.2014 by Bochkanov Sergey *************************************************************************/ void minlmrequesttermination(minlmstate* state, ae_state *_state) { state->userterminationneeded = ae_true; } /************************************************************************* This is an obsolete function. Since ALGLIB 3.3 it is equivalent to MinLMCreateVJ(). -- ALGLIB -- Copyright 30.03.2009 by Bochkanov Sergey *************************************************************************/ void minlmcreatevgj(ae_int_t n, ae_int_t m, /* Real */ ae_vector* x, minlmstate* state, ae_state *_state) { _minlmstate_clear(state); minlmcreatevj(n, m, x, state, _state); } /************************************************************************* This is an obsolete function. Since ALGLIB 3.3 it is equivalent to MinLMCreateFJ(). -- ALGLIB -- Copyright 30.03.2009 by Bochkanov Sergey *************************************************************************/ void minlmcreatefgj(ae_int_t n, ae_int_t m, /* Real */ ae_vector* x, minlmstate* state, ae_state *_state) { _minlmstate_clear(state); minlmcreatefj(n, m, x, state, _state); } /************************************************************************* This function is considered obsolete since ALGLIB 3.1.0 and is present for backward compatibility only. We recommend using MinLMCreateVJ, which provides a similar but more consistent and feature-rich interface. 
-- ALGLIB -- Copyright 30.03.2009 by Bochkanov Sergey *************************************************************************/ void minlmcreatefj(ae_int_t n, ae_int_t m, /* Real */ ae_vector* x, minlmstate* state, ae_state *_state) { _minlmstate_clear(state); ae_assert(n>=1, "MinLMCreateFJ: N<1!", _state); ae_assert(m>=1, "MinLMCreateFJ: M<1!", _state); ae_assert(x->cnt>=n, "MinLMCreateFJ: Length(X)<N!", _state); ae_assert(isfinitevector(x, n, _state), "MinLMCreateFJ: X contains infinite or NaN values!", _state); state->teststep = (double)(0); state->n = n; state->m = m; state->algomode = 1; state->hasf = ae_true; state->hasfi = ae_false; state->hasg = ae_false; /* * init 2 */ minlm_lmprepare(n, m, ae_true, state, _state); minlmsetacctype(state, 0, _state); minlmsetcond(state, (double)(0), 0, _state); minlmsetxrep(state, ae_false, _state); minlmsetstpmax(state, (double)(0), _state); minlmrestartfrom(state, x, _state); } /************************************************************************* This subroutine turns on verification of the user-supplied analytic gradient: * user calls this subroutine before optimization begins * MinLMOptimize() is called * prior to actual optimization, for each function Fi and each component of parameters being optimized X[j], algorithm performs following steps: * two trial steps are made to X[j]-TestStep*S[j] and X[j]+TestStep*S[j], where X[j] is j-th parameter and S[j] is a scale of j-th parameter * if needed, steps are bounded with respect to constraints on X[] * Fi(X) is evaluated at these trial points * we perform one more evaluation in the middle point of the interval * we build cubic model using function values and derivatives at trial points and we compare its prediction with actual value in the middle point * in case difference between prediction and actual value is higher than some predetermined threshold, algorithm stops with completion code -7; Rep.VarIdx is set to index of the parameter with incorrect derivative, Rep.FuncIdx is set to index of the function. * after verification is over, algorithm proceeds to the actual optimization. NOTE 1: verification needs N (parameters count) Jacobian evaluations. It is very costly and you should use it only for low dimensional problems, when you want to be sure that you've correctly calculated analytic derivatives. You should not use it in the production code (unless you want to check derivatives provided by some third party). NOTE 2: you should carefully choose TestStep. Value which is too large (so large that function behaviour is significantly non-cubic) will lead to false alarms. You may use different step for different parameters by means of setting scale with MinLMSetScale(). NOTE 3: this function may lead to false positives. In case it reports that I-th derivative was calculated incorrectly, you may decrease test step and try one more time - maybe your function changes too sharply and your step is too large for such rapidly changing function. 
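A minimal verification sketch (assuming the C++ wrapper layer, alglib namespace; fvec/jac denote the user's own callbacks, 1.0E-3 is just an illustrative step):

    alglib::minlmsetgradientcheck(state, 1.0E-3);   // TestStep, measured in units of MinLMSetScale() scale
    alglib::minlmoptimize(state, fvec, jac);        // a bad derivative terminates with completion code -7 (see minlmresults())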
INPUT PARAMETERS: State - structure used to store algorithm state TestStep - verification step: * TestStep=0 turns verification off * TestStep>0 activates verification -- ALGLIB -- Copyright 15.06.2012 by Bochkanov Sergey *************************************************************************/ void minlmsetgradientcheck(minlmstate* state, double teststep, ae_state *_state) { ae_assert(ae_isfinite(teststep, _state), "MinLMSetGradientCheck: TestStep contains NaN or Infinite", _state); ae_assert(ae_fp_greater_eq(teststep,(double)(0)), "MinLMSetGradientCheck: invalid argument TestStep(TestStep<0)", _state); state->teststep = teststep; } /************************************************************************* Prepare internal structures (except for RComm). Note: M must be zero for FGH mode, non-zero for V/VJ/FJ/FGJ mode. *************************************************************************/ static void minlm_lmprepare(ae_int_t n, ae_int_t m, ae_bool havegrad, minlmstate* state, ae_state *_state) { ae_int_t i; if( n<=0||m<0 ) { return; } if( havegrad ) { ae_vector_set_length(&state->g, n, _state); } if( m!=0 ) { ae_matrix_set_length(&state->j, m, n, _state); ae_vector_set_length(&state->fi, m, _state); ae_vector_set_length(&state->fibase, m, _state); ae_vector_set_length(&state->deltaf, m, _state); ae_vector_set_length(&state->fm1, m, _state); ae_vector_set_length(&state->fp1, m, _state); ae_vector_set_length(&state->fc1, m, _state); ae_vector_set_length(&state->gm1, m, _state); ae_vector_set_length(&state->gp1, m, _state); ae_vector_set_length(&state->gc1, m, _state); } else { ae_matrix_set_length(&state->h, n, n, _state); } ae_vector_set_length(&state->x, n, _state); ae_vector_set_length(&state->deltax, n, _state); ae_matrix_set_length(&state->quadraticmodel, n, n, _state); ae_vector_set_length(&state->xbase, n, _state); ae_vector_set_length(&state->gbase, n, _state); ae_vector_set_length(&state->xdir, n, _state); ae_vector_set_length(&state->tmp0, n, _state); /* * prepare internal L-BFGS */ for(i=0; i<=n-1; i++) { state->x.ptr.p_double[i] = (double)(0); } minlbfgscreate(n, ae_minint(minlm_additers, n, _state), &state->x, &state->internalstate, _state); minlbfgssetcond(&state->internalstate, 0.0, 0.0, 0.0, ae_minint(minlm_additers, n, _state), _state); /* * Prepare internal QP solver */ minqpcreate(n, &state->qpstate, _state); minqpsetalgocholesky(&state->qpstate, _state); /* * Prepare boundary constraints */ ae_vector_set_length(&state->bndl, n, _state); ae_vector_set_length(&state->bndu, n, _state); ae_vector_set_length(&state->havebndl, n, _state); ae_vector_set_length(&state->havebndu, n, _state); for(i=0; i<=n-1; i++) { state->bndl.ptr.p_double[i] = _state->v_neginf; state->havebndl.ptr.p_bool[i] = ae_false; state->bndu.ptr.p_double[i] = _state->v_posinf; state->havebndu.ptr.p_bool[i] = ae_false; } /* * Prepare scaling matrix */ ae_vector_set_length(&state->s, n, _state); for(i=0; i<=n-1; i++) { state->s.ptr.p_double[i] = 1.0; } /* * Prepare linear constraints */ state->nec = 0; state->nic = 0; } /************************************************************************* Clears request fileds (to be sure that we don't forgot to clear something) *************************************************************************/ static void minlm_clearrequestfields(minlmstate* state, ae_state *_state) { state->needf = ae_false; state->needfg = ae_false; state->needfgh = ae_false; state->needfij = ae_false; state->needfi = ae_false; state->xupdated = ae_false; } 
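/*************************************************************************
The fragment below is an illustrative usage sketch (assuming the C++
wrapper layer, alglib namespace, documented in other sections of this
manual; the callback names fvec/jac, the residuals and all numeric values
are placeholders chosen for the example). It fits N=2 parameters to M=3
residuals using the VJ protocol with analytic Jacobian:

    #include "optimization.h"
    using namespace alglib;

    // residual vector fi(x): the minimum of sum(fi^2) is at x=(1,2)
    void fvec(const real_1d_array &x, real_1d_array &fi, void *ptr)
    {
        fi[0] = x[0]-1.0;
        fi[1] = x[1]-2.0;
        fi[2] = x[0]*x[1]-2.0;
    }
    // analytic Jacobian of the residual vector
    void jac(const real_1d_array &x, real_1d_array &fi, real_2d_array &j, void *ptr)
    {
        fvec(x, fi, ptr);
        j[0][0] = 1.0;   j[0][1] = 0.0;
        j[1][0] = 0.0;   j[1][1] = 1.0;
        j[2][0] = x[1];  j[2][1] = x[0];
    }
    int main()
    {
        real_1d_array x = "[0,0]";
        minlmstate state;
        minlmreport rep;
        minlmcreatevj(2, 3, x, state);      // N=2 variables, M=3 residuals
        minlmsetcond(state, 1.0E-9, 0);     // EpsX and MaxIts (0 = unlimited)
        minlmoptimize(state, fvec, jac);    // internally drives minlmiteration()
        minlmresults(state, x, rep);        // solution and rep.terminationtype
        return 0;
    }
*************************************************************************/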
/************************************************************************* Increases lambda, returns False when there is a danger of overflow *************************************************************************/ static ae_bool minlm_increaselambda(double* lambdav, double* nu, ae_state *_state) { double lnlambda; double lnnu; double lnlambdaup; double lnmax; ae_bool result; result = ae_false; lnlambda = ae_log(*lambdav, _state); lnlambdaup = ae_log(minlm_lambdaup, _state); lnnu = ae_log(*nu, _state); lnmax = ae_log(ae_maxrealnumber, _state); if( ae_fp_greater(lnlambda+lnlambdaup+lnnu,0.25*lnmax) ) { return result; } if( ae_fp_greater(lnnu+ae_log((double)(2), _state),lnmax) ) { return result; } *lambdav = *lambdav*minlm_lambdaup*(*nu); *nu = *nu*2; result = ae_true; return result; } /************************************************************************* Decreases lambda, but leaves it unchanged when there is danger of underflow. *************************************************************************/ static void minlm_decreaselambda(double* lambdav, double* nu, ae_state *_state) { *nu = (double)(1); if( ae_fp_less(ae_log(*lambdav, _state)+ae_log(minlm_lambdadown, _state),ae_log(ae_minrealnumber, _state)) ) { *lambdav = ae_minrealnumber; } else { *lambdav = *lambdav*minlm_lambdadown; } } /************************************************************************* This function compares actual decrease vs predicted decrease and updates LambdaV/Nu accordingly. INPUT PARAMETERS: QuadraticModel - array[N,N], full Hessian matrix of quadratic model at deltaX=0 GBase - array[N], gradient at deltaX=0 FBase - F(deltaX=0) N - size DeltaX - step vector FNew - new function value LambdaV - lambda-value, updated on exit Nu - Nu-multiplier, updated on exit On exit it returns: * Result=0 - if we have to continue iterations * Result<>0 - if termination with completion code Result is requested -- ALGLIB -- Copyright 17.02.2017 by Bochkanov Sergey *************************************************************************/ static ae_int_t minlm_checkdecrease(/* Real */ ae_matrix* quadraticmodel, /* Real */ ae_vector* gbase, double fbase, ae_int_t n, /* Real */ ae_vector* deltax, double fnew, double* lambdav, double* nu, ae_state *_state) { ae_int_t i; double v; double t; double predicteddecrease; double actualdecrease; ae_int_t result; result = 0; t = (double)(0); for(i=0; i<=n-1; i++) { v = ae_v_dotproduct(&quadraticmodel->ptr.pp_double[i][0], 1, &deltax->ptr.p_double[0], 1, ae_v_len(0,n-1)); t = t+deltax->ptr.p_double[i]*gbase->ptr.p_double[i]+0.5*deltax->ptr.p_double[i]*v; } predicteddecrease = -t; actualdecrease = -(fnew-fbase); if( ae_fp_less_eq(predicteddecrease,(double)(0)) ) { result = 7; return result; } v = actualdecrease/predicteddecrease; if( ae_fp_less(v,0.1) ) { if( !minlm_increaselambda(lambdav, nu, _state) ) { /* * Lambda is too large, we have to break iterations. */ result = 7; return result; } } if( ae_fp_greater(v,0.5) ) { minlm_decreaselambda(lambdav, nu, _state); } return result; } /************************************************************************* This function initializes step finder object with problem statement; model parameters specified during this call should not (and can not) change during object lifetime (although it is possible to re-initialize object with different settings). This function reuses internally allocated objects as much as possible. In addition to initializing step finder, this function enforces feasibility in initial point X passed to this function. 
It is important that LM iteration starts from feasible point and performs feasible steps; RETURN VALUE: True for successful initialization False for inconsistent constraints; you should not use step finder if it returned False. *************************************************************************/ static ae_bool minlm_minlmstepfinderinit(minlmstepfinder* state, ae_int_t n, ae_int_t m, ae_int_t maxmodelage, ae_bool hasfi, /* Real */ ae_vector* xbase, /* Real */ ae_vector* bndl, /* Real */ ae_vector* bndu, /* Real */ ae_matrix* cleic, ae_int_t nec, ae_int_t nic, /* Real */ ae_vector* s, double stpmax, double epsx, ae_state *_state) { ae_int_t i; ae_bool result; state->n = n; state->m = m; state->maxmodelage = maxmodelage; state->hasfi = hasfi; state->stpmax = stpmax; state->epsx = epsx; /* * Allocate temporaries, create QP solver, select QP algorithm */ rvectorsetlengthatleast(&state->bndl, n, _state); rvectorsetlengthatleast(&state->bndu, n, _state); rvectorsetlengthatleast(&state->s, n, _state); bvectorsetlengthatleast(&state->havebndl, n, _state); bvectorsetlengthatleast(&state->havebndu, n, _state); rvectorsetlengthatleast(&state->x, n, _state); rvectorsetlengthatleast(&state->xbase, n, _state); rvectorsetlengthatleast(&state->tmp0, n, _state); rvectorsetlengthatleast(&state->modeldiag, n, _state); ivectorsetlengthatleast(&state->tmpct, nec+nic, _state); rvectorsetlengthatleast(&state->xdir, n, _state); if( hasfi ) { rvectorsetlengthatleast(&state->fi, m, _state); rvectorsetlengthatleast(&state->fibase, m, _state); } for(i=0; i<=n-1; i++) { ae_assert(ae_isfinite(bndl->ptr.p_double[i], _state)||ae_isneginf(bndl->ptr.p_double[i], _state), "MinLM: integrity check failed", _state); ae_assert(ae_isfinite(bndu->ptr.p_double[i], _state)||ae_isposinf(bndu->ptr.p_double[i], _state), "MinLM: integrity check failed", _state); state->bndl.ptr.p_double[i] = bndl->ptr.p_double[i]; state->havebndl.ptr.p_bool[i] = ae_isfinite(bndl->ptr.p_double[i], _state); state->bndu.ptr.p_double[i] = bndu->ptr.p_double[i]; state->havebndu.ptr.p_bool[i] = ae_isfinite(bndu->ptr.p_double[i], _state); state->s.ptr.p_double[i] = s->ptr.p_double[i]; } for(i=0; i<=nec-1; i++) { state->tmpct.ptr.p_int[i] = 0; } for(i=0; i<=nic-1; i++) { state->tmpct.ptr.p_int[nec+i] = -1; } minqpcreate(n, &state->qpstate, _state); if( nec+nic==0 ) { minqpsetalgoquickqp(&state->qpstate, 0.0, 0.0, coalesce(0.01*epsx, 1.0E-12, _state), 10, ae_true, _state); } else { minqpsetalgodenseaul(&state->qpstate, coalesce(0.01*epsx, 1.0E-12, _state), (double)(100), 10, _state); } minqpsetbc(&state->qpstate, bndl, bndu, _state); minqpsetlc(&state->qpstate, cleic, &state->tmpct, nec+nic, _state); minqpsetscale(&state->qpstate, s, _state); /* * Check feasibility of constraints: * * check/enforce box constraints (straightforward) * * prepare QP subproblem which return us a feasible point */ result = ae_true; for(i=0; i<=n-1; i++) { if( (state->havebndl.ptr.p_bool[i]&&state->havebndu.ptr.p_bool[i])&&ae_fp_greater(state->bndl.ptr.p_double[i],state->bndu.ptr.p_double[i]) ) { result = ae_false; return result; } if( state->havebndl.ptr.p_bool[i]&&ae_fp_less(xbase->ptr.p_double[i],state->bndl.ptr.p_double[i]) ) { xbase->ptr.p_double[i] = state->bndl.ptr.p_double[i]; } if( state->havebndu.ptr.p_bool[i]&&ae_fp_greater(xbase->ptr.p_double[i],state->bndu.ptr.p_double[i]) ) { xbase->ptr.p_double[i] = state->bndu.ptr.p_double[i]; } } if( nec+nic>0 ) { /* * Well, we have linear constraints... let's use heavy machinery. 
* * We will modify QP solver state below, but everything will be * restored in MinLMStepFinderStart(). */ sparsecreate(n, n, n, &state->tmpsp, _state); for(i=0; i<=n-1; i++) { sparseset(&state->tmpsp, i, i, 0.5, _state); state->tmp0.ptr.p_double[i] = (double)(0); } minqpsetstartingpointfast(&state->qpstate, xbase, _state); minqpsetoriginfast(&state->qpstate, xbase, _state); minqpsetlineartermfast(&state->qpstate, &state->tmp0, _state); minqpsetquadratictermsparse(&state->qpstate, &state->tmpsp, ae_true, _state); minqpoptimize(&state->qpstate, _state); minqpresultsbuf(&state->qpstate, xbase, &state->qprep, _state); } return result; } /************************************************************************* This function prepares LM step search session. *************************************************************************/ static void minlm_minlmstepfinderstart(minlmstepfinder* state, /* Real */ ae_matrix* quadraticmodel, /* Real */ ae_vector* gbase, double fbase, /* Real */ ae_vector* xbase, /* Real */ ae_vector* fibase, ae_int_t modelage, ae_state *_state) { ae_int_t i; ae_int_t n; n = state->n; ae_vector_set_length(&state->rstate.ia, 2+1, _state); ae_vector_set_length(&state->rstate.ba, 0+1, _state); ae_vector_set_length(&state->rstate.ra, 0+1, _state); state->rstate.stage = -1; state->modelage = modelage; state->fbase = fbase; if( state->hasfi ) { for(i=0; i<=state->m-1; i++) { state->fibase.ptr.p_double[i] = fibase->ptr.p_double[i]; } } for(i=0; i<=n-1; i++) { state->xbase.ptr.p_double[i] = xbase->ptr.p_double[i]; state->modeldiag.ptr.p_double[i] = quadraticmodel->ptr.pp_double[i][i]; } minqpsetstartingpointfast(&state->qpstate, xbase, _state); minqpsetoriginfast(&state->qpstate, xbase, _state); minqpsetlineartermfast(&state->qpstate, gbase, _state); minqpsetquadratictermfast(&state->qpstate, quadraticmodel, ae_true, 0.0, _state); } /************************************************************************* This function runs LM step search session. // // Find value of Levenberg-Marquardt damping parameter which: // * leads to positive definite damped model // * within bounds specified by StpMax // * generates step which decreases function value // // After this block IFlag is set to: // * -8, if infinities/NANs were detected in function values/gradient // * -3, if constraints are infeasible // * -2, if model update is needed (either Lambda growth is too large // or step is too short, but we can't rely on model and stop iterations) // * -1, if model is fresh, Lambda have grown too large, termination is needed // * 0, if everything is OK, continue iterations // * >0 - successful completion (step size is small enough) // // State.Nu can have any value on enter, but after exit it is set to 1.0 // *************************************************************************/ static ae_bool minlm_minlmstepfinderiteration(minlmstepfinder* state, double* lambdav, double* nu, /* Real */ ae_vector* xnew, /* Real */ ae_vector* deltax, ae_bool* deltaxready, /* Real */ ae_vector* deltaf, ae_bool* deltafready, ae_int_t* iflag, double* fnew, ae_int_t* ncholesky, ae_state *_state) { ae_int_t i; ae_bool bflag; double v; ae_int_t n; ae_int_t m; ae_bool result; /* * Reverse communication preparations * I know it looks ugly, but it works the same way * anywhere from C++ to Python. 
* * This code initializes locals by: * * random values determined during code * generation - on first subroutine call * * values from previous call - on subsequent calls */ if( state->rstate.stage>=0 ) { i = state->rstate.ia.ptr.p_int[0]; n = state->rstate.ia.ptr.p_int[1]; m = state->rstate.ia.ptr.p_int[2]; bflag = state->rstate.ba.ptr.p_bool[0]; v = state->rstate.ra.ptr.p_double[0]; } else { i = -838; n = 939; m = -526; bflag = ae_true; v = -541; } if( state->rstate.stage==0 ) { goto lbl_0; } if( state->rstate.stage==1 ) { goto lbl_1; } /* * Routine body */ *iflag = -99; n = state->n; m = state->m; lbl_2: if( ae_false ) { goto lbl_3; } *deltaxready = ae_false; *deltafready = ae_false; /* * Do we need model update? */ if( state->modelage>0&&ae_fp_greater_eq(*nu,minlm_suspiciousnu) ) { *iflag = -2; goto lbl_3; } /* * Setup quadratic solver and solve quadratic programming problem. * After problem is solved we'll try to bound step by StpMax * (Lambda will be increased if step size is too large). * * We use BFlag variable to indicate that we have to increase Lambda. * If it is False, we will try to increase Lambda and move to new iteration. */ bflag = ae_true; for(i=0; i<=n-1; i++) { state->tmp0.ptr.p_double[i] = state->modeldiag.ptr.p_double[i]+*lambdav/ae_sqr(state->s.ptr.p_double[i], _state); } minqprewritediagonal(&state->qpstate, &state->tmp0, _state); minqpoptimize(&state->qpstate, _state); minqpresultsbuf(&state->qpstate, xnew, &state->qprep, _state); *ncholesky = *ncholesky+state->qprep.ncholesky; if( state->qprep.terminationtype==-3 ) { /* * Infeasible constraints */ *iflag = -3; goto lbl_3; } if( state->qprep.terminationtype==-4||state->qprep.terminationtype==-5 ) { /* * Unconstrained direction of negative curvature was detected */ if( !minlm_increaselambda(lambdav, nu, _state) ) { *iflag = -1; goto lbl_3; } goto lbl_2; } ae_assert(state->qprep.terminationtype>0, "MinLM: unexpected completion code from QP solver", _state); ae_v_move(&state->xdir.ptr.p_double[0], 1, &xnew->ptr.p_double[0], 1, ae_v_len(0,n-1)); ae_v_sub(&state->xdir.ptr.p_double[0], 1, &state->xbase.ptr.p_double[0], 1, ae_v_len(0,n-1)); v = 0.0; for(i=0; i<=n-1; i++) { v = v+ae_sqr(state->xdir.ptr.p_double[i]/state->s.ptr.p_double[i], _state); } if( ae_isfinite(v, _state) ) { v = ae_sqrt(v, _state); if( ae_fp_greater(state->stpmax,(double)(0))&&ae_fp_greater(v,state->stpmax) ) { bflag = ae_false; } } else { bflag = ae_false; } if( !bflag ) { /* * Solution failed: * try to increase lambda to make matrix positive definite and continue. */ if( !minlm_increaselambda(lambdav, nu, _state) ) { *iflag = -1; goto lbl_3; } goto lbl_2; } /* * Step in State.XDir and it is bounded by StpMax. * * We should check stopping conditions on step size here. * DeltaX, which is used for secant updates, is initialized here. * * This code is a bit tricky because sometimes XDir<>0, but * it is so small that XDir+XBase==XBase (in finite precision * arithmetics). So we set DeltaX to XBase, then * add XDir, and then subtract XBase to get exact value of * DeltaX. * * Step length is estimated using DeltaX. 
* * NOTE: stopping conditions are tested * for fresh models only (ModelAge=0) */ ae_v_move(&deltax->ptr.p_double[0], 1, &xnew->ptr.p_double[0], 1, ae_v_len(0,n-1)); ae_v_sub(&deltax->ptr.p_double[0], 1, &state->xbase.ptr.p_double[0], 1, ae_v_len(0,n-1)); *deltaxready = ae_true; v = 0.0; for(i=0; i<=n-1; i++) { v = v+ae_sqr(deltax->ptr.p_double[i]/state->s.ptr.p_double[i], _state); } v = ae_sqrt(v, _state); if( ae_fp_less_eq(v,state->epsx) ) { if( state->modelage==0 ) { /* * Step is too short, model is fresh and we can rely on it. * Terminating. */ *iflag = 2; goto lbl_3; } else { /* * Step is suspiciously short, but model is not fresh * and we can't rely on it. */ *iflag = -2; goto lbl_3; } } /* * Let's evaluate new step: * a) if we have Fi vector, we evaluate it using rcomm, and * then we manually calculate State.F as sum of squares of Fi[] * b) if we have F value, we just evaluate it through rcomm interface * * We prefer (a) because we may need Fi vector for additional * iterations */ ae_v_move(&state->x.ptr.p_double[0], 1, &xnew->ptr.p_double[0], 1, ae_v_len(0,n-1)); state->needf = ae_false; state->needfi = ae_false; if( !state->hasfi ) { goto lbl_4; } state->needfi = ae_true; state->rstate.stage = 0; goto lbl_rcomm; lbl_0: state->needfi = ae_false; v = ae_v_dotproduct(&state->fi.ptr.p_double[0], 1, &state->fi.ptr.p_double[0], 1, ae_v_len(0,m-1)); *fnew = v; ae_v_move(&deltaf->ptr.p_double[0], 1, &state->fi.ptr.p_double[0], 1, ae_v_len(0,m-1)); ae_v_sub(&deltaf->ptr.p_double[0], 1, &state->fibase.ptr.p_double[0], 1, ae_v_len(0,m-1)); *deltafready = ae_true; goto lbl_5; lbl_4: state->needf = ae_true; state->rstate.stage = 1; goto lbl_rcomm; lbl_1: state->needf = ae_false; *fnew = state->f; lbl_5: if( !ae_isfinite(*fnew, _state) ) { /* * Integrity check failed, break! */ *iflag = -8; goto lbl_3; } if( ae_fp_greater_eq(*fnew,state->fbase) ) { /* * Increase lambda and continue */ if( !minlm_increaselambda(lambdav, nu, _state) ) { *iflag = -1; goto lbl_3; } goto lbl_2; } /* * We've found our step! 
*/ *iflag = 0; goto lbl_3; goto lbl_2; lbl_3: *nu = (double)(1); ae_assert(((*iflag>=-3&&*iflag<=0)||*iflag==-8)||*iflag>0, "MinLM: internal integrity check failed!", _state); result = ae_false; return result; /* * Saving state */ lbl_rcomm: result = ae_true; state->rstate.ia.ptr.p_int[0] = i; state->rstate.ia.ptr.p_int[1] = n; state->rstate.ia.ptr.p_int[2] = m; state->rstate.ba.ptr.p_bool[0] = bflag; state->rstate.ra.ptr.p_double[0] = v; return result; } void _minlmstepfinder_init(void* _p, ae_state *_state) { minlmstepfinder *p = (minlmstepfinder*)_p; ae_touch_ptr((void*)p); ae_vector_init(&p->x, 0, DT_REAL, _state); ae_vector_init(&p->fi, 0, DT_REAL, _state); ae_vector_init(&p->modeldiag, 0, DT_REAL, _state); ae_vector_init(&p->xbase, 0, DT_REAL, _state); ae_vector_init(&p->fibase, 0, DT_REAL, _state); ae_vector_init(&p->bndl, 0, DT_REAL, _state); ae_vector_init(&p->bndu, 0, DT_REAL, _state); ae_vector_init(&p->havebndl, 0, DT_BOOL, _state); ae_vector_init(&p->havebndu, 0, DT_BOOL, _state); ae_vector_init(&p->s, 0, DT_REAL, _state); _rcommstate_init(&p->rstate, _state); ae_vector_init(&p->xdir, 0, DT_REAL, _state); ae_vector_init(&p->choleskybuf, 0, DT_REAL, _state); ae_vector_init(&p->tmp0, 0, DT_REAL, _state); ae_vector_init(&p->tmpct, 0, DT_INT, _state); _minqpstate_init(&p->qpstate, _state); _minqpreport_init(&p->qprep, _state); _sparsematrix_init(&p->tmpsp, _state); } void _minlmstepfinder_init_copy(void* _dst, void* _src, ae_state *_state) { minlmstepfinder *dst = (minlmstepfinder*)_dst; minlmstepfinder *src = (minlmstepfinder*)_src; dst->n = src->n; dst->m = src->m; dst->stpmax = src->stpmax; dst->modelage = src->modelage; dst->maxmodelage = src->maxmodelage; dst->hasfi = src->hasfi; dst->epsx = src->epsx; ae_vector_init_copy(&dst->x, &src->x, _state); dst->f = src->f; ae_vector_init_copy(&dst->fi, &src->fi, _state); dst->needf = src->needf; dst->needfi = src->needfi; dst->fbase = src->fbase; ae_vector_init_copy(&dst->modeldiag, &src->modeldiag, _state); ae_vector_init_copy(&dst->xbase, &src->xbase, _state); ae_vector_init_copy(&dst->fibase, &src->fibase, _state); ae_vector_init_copy(&dst->bndl, &src->bndl, _state); ae_vector_init_copy(&dst->bndu, &src->bndu, _state); ae_vector_init_copy(&dst->havebndl, &src->havebndl, _state); ae_vector_init_copy(&dst->havebndu, &src->havebndu, _state); ae_vector_init_copy(&dst->s, &src->s, _state); _rcommstate_init_copy(&dst->rstate, &src->rstate, _state); ae_vector_init_copy(&dst->xdir, &src->xdir, _state); ae_vector_init_copy(&dst->choleskybuf, &src->choleskybuf, _state); ae_vector_init_copy(&dst->tmp0, &src->tmp0, _state); ae_vector_init_copy(&dst->tmpct, &src->tmpct, _state); dst->actualdecrease = src->actualdecrease; dst->predicteddecrease = src->predicteddecrease; _minqpstate_init_copy(&dst->qpstate, &src->qpstate, _state); _minqpreport_init_copy(&dst->qprep, &src->qprep, _state); _sparsematrix_init_copy(&dst->tmpsp, &src->tmpsp, _state); } void _minlmstepfinder_clear(void* _p) { minlmstepfinder *p = (minlmstepfinder*)_p; ae_touch_ptr((void*)p); ae_vector_clear(&p->x); ae_vector_clear(&p->fi); ae_vector_clear(&p->modeldiag); ae_vector_clear(&p->xbase); ae_vector_clear(&p->fibase); ae_vector_clear(&p->bndl); ae_vector_clear(&p->bndu); ae_vector_clear(&p->havebndl); ae_vector_clear(&p->havebndu); ae_vector_clear(&p->s); _rcommstate_clear(&p->rstate); ae_vector_clear(&p->xdir); ae_vector_clear(&p->choleskybuf); ae_vector_clear(&p->tmp0); ae_vector_clear(&p->tmpct); _minqpstate_clear(&p->qpstate); _minqpreport_clear(&p->qprep); 
_sparsematrix_clear(&p->tmpsp); } void _minlmstepfinder_destroy(void* _p) { minlmstepfinder *p = (minlmstepfinder*)_p; ae_touch_ptr((void*)p); ae_vector_destroy(&p->x); ae_vector_destroy(&p->fi); ae_vector_destroy(&p->modeldiag); ae_vector_destroy(&p->xbase); ae_vector_destroy(&p->fibase); ae_vector_destroy(&p->bndl); ae_vector_destroy(&p->bndu); ae_vector_destroy(&p->havebndl); ae_vector_destroy(&p->havebndu); ae_vector_destroy(&p->s); _rcommstate_destroy(&p->rstate); ae_vector_destroy(&p->xdir); ae_vector_destroy(&p->choleskybuf); ae_vector_destroy(&p->tmp0); ae_vector_destroy(&p->tmpct); _minqpstate_destroy(&p->qpstate); _minqpreport_destroy(&p->qprep); _sparsematrix_destroy(&p->tmpsp); } void _minlmstate_init(void* _p, ae_state *_state) { minlmstate *p = (minlmstate*)_p; ae_touch_ptr((void*)p); ae_vector_init(&p->x, 0, DT_REAL, _state); ae_vector_init(&p->fi, 0, DT_REAL, _state); ae_matrix_init(&p->j, 0, 0, DT_REAL, _state); ae_matrix_init(&p->h, 0, 0, DT_REAL, _state); ae_vector_init(&p->g, 0, DT_REAL, _state); ae_vector_init(&p->xbase, 0, DT_REAL, _state); ae_vector_init(&p->fibase, 0, DT_REAL, _state); ae_vector_init(&p->gbase, 0, DT_REAL, _state); ae_matrix_init(&p->quadraticmodel, 0, 0, DT_REAL, _state); ae_vector_init(&p->bndl, 0, DT_REAL, _state); ae_vector_init(&p->bndu, 0, DT_REAL, _state); ae_vector_init(&p->havebndl, 0, DT_BOOL, _state); ae_vector_init(&p->havebndu, 0, DT_BOOL, _state); ae_vector_init(&p->s, 0, DT_REAL, _state); ae_matrix_init(&p->cleic, 0, 0, DT_REAL, _state); ae_vector_init(&p->xnew, 0, DT_REAL, _state); ae_vector_init(&p->xdir, 0, DT_REAL, _state); ae_vector_init(&p->deltax, 0, DT_REAL, _state); ae_vector_init(&p->deltaf, 0, DT_REAL, _state); _rcommstate_init(&p->rstate, _state); ae_vector_init(&p->choleskybuf, 0, DT_REAL, _state); ae_vector_init(&p->tmp0, 0, DT_REAL, _state); ae_vector_init(&p->fm1, 0, DT_REAL, _state); ae_vector_init(&p->fp1, 0, DT_REAL, _state); ae_vector_init(&p->fc1, 0, DT_REAL, _state); ae_vector_init(&p->gm1, 0, DT_REAL, _state); ae_vector_init(&p->gp1, 0, DT_REAL, _state); ae_vector_init(&p->gc1, 0, DT_REAL, _state); _minlbfgsstate_init(&p->internalstate, _state); _minlbfgsreport_init(&p->internalrep, _state); _minqpstate_init(&p->qpstate, _state); _minqpreport_init(&p->qprep, _state); _minlmstepfinder_init(&p->finderstate, _state); } void _minlmstate_init_copy(void* _dst, void* _src, ae_state *_state) { minlmstate *dst = (minlmstate*)_dst; minlmstate *src = (minlmstate*)_src; dst->n = src->n; dst->m = src->m; dst->diffstep = src->diffstep; dst->epsx = src->epsx; dst->maxits = src->maxits; dst->xrep = src->xrep; dst->stpmax = src->stpmax; dst->maxmodelage = src->maxmodelage; dst->makeadditers = src->makeadditers; ae_vector_init_copy(&dst->x, &src->x, _state); dst->f = src->f; ae_vector_init_copy(&dst->fi, &src->fi, _state); ae_matrix_init_copy(&dst->j, &src->j, _state); ae_matrix_init_copy(&dst->h, &src->h, _state); ae_vector_init_copy(&dst->g, &src->g, _state); dst->needf = src->needf; dst->needfg = src->needfg; dst->needfgh = src->needfgh; dst->needfij = src->needfij; dst->needfi = src->needfi; dst->xupdated = src->xupdated; dst->userterminationneeded = src->userterminationneeded; dst->algomode = src->algomode; dst->hasf = src->hasf; dst->hasfi = src->hasfi; dst->hasg = src->hasg; ae_vector_init_copy(&dst->xbase, &src->xbase, _state); dst->fbase = src->fbase; ae_vector_init_copy(&dst->fibase, &src->fibase, _state); ae_vector_init_copy(&dst->gbase, &src->gbase, _state); ae_matrix_init_copy(&dst->quadraticmodel, 
&src->quadraticmodel, _state); ae_vector_init_copy(&dst->bndl, &src->bndl, _state); ae_vector_init_copy(&dst->bndu, &src->bndu, _state); ae_vector_init_copy(&dst->havebndl, &src->havebndl, _state); ae_vector_init_copy(&dst->havebndu, &src->havebndu, _state); ae_vector_init_copy(&dst->s, &src->s, _state); ae_matrix_init_copy(&dst->cleic, &src->cleic, _state); dst->nec = src->nec; dst->nic = src->nic; dst->lambdav = src->lambdav; dst->nu = src->nu; dst->modelage = src->modelage; ae_vector_init_copy(&dst->xnew, &src->xnew, _state); ae_vector_init_copy(&dst->xdir, &src->xdir, _state); ae_vector_init_copy(&dst->deltax, &src->deltax, _state); ae_vector_init_copy(&dst->deltaf, &src->deltaf, _state); dst->deltaxready = src->deltaxready; dst->deltafready = src->deltafready; dst->teststep = src->teststep; dst->repiterationscount = src->repiterationscount; dst->repterminationtype = src->repterminationtype; dst->repfuncidx = src->repfuncidx; dst->repvaridx = src->repvaridx; dst->repnfunc = src->repnfunc; dst->repnjac = src->repnjac; dst->repngrad = src->repngrad; dst->repnhess = src->repnhess; dst->repncholesky = src->repncholesky; _rcommstate_init_copy(&dst->rstate, &src->rstate, _state); ae_vector_init_copy(&dst->choleskybuf, &src->choleskybuf, _state); ae_vector_init_copy(&dst->tmp0, &src->tmp0, _state); dst->actualdecrease = src->actualdecrease; dst->predicteddecrease = src->predicteddecrease; dst->xm1 = src->xm1; dst->xp1 = src->xp1; ae_vector_init_copy(&dst->fm1, &src->fm1, _state); ae_vector_init_copy(&dst->fp1, &src->fp1, _state); ae_vector_init_copy(&dst->fc1, &src->fc1, _state); ae_vector_init_copy(&dst->gm1, &src->gm1, _state); ae_vector_init_copy(&dst->gp1, &src->gp1, _state); ae_vector_init_copy(&dst->gc1, &src->gc1, _state); _minlbfgsstate_init_copy(&dst->internalstate, &src->internalstate, _state); _minlbfgsreport_init_copy(&dst->internalrep, &src->internalrep, _state); _minqpstate_init_copy(&dst->qpstate, &src->qpstate, _state); _minqpreport_init_copy(&dst->qprep, &src->qprep, _state); _minlmstepfinder_init_copy(&dst->finderstate, &src->finderstate, _state); } void _minlmstate_clear(void* _p) { minlmstate *p = (minlmstate*)_p; ae_touch_ptr((void*)p); ae_vector_clear(&p->x); ae_vector_clear(&p->fi); ae_matrix_clear(&p->j); ae_matrix_clear(&p->h); ae_vector_clear(&p->g); ae_vector_clear(&p->xbase); ae_vector_clear(&p->fibase); ae_vector_clear(&p->gbase); ae_matrix_clear(&p->quadraticmodel); ae_vector_clear(&p->bndl); ae_vector_clear(&p->bndu); ae_vector_clear(&p->havebndl); ae_vector_clear(&p->havebndu); ae_vector_clear(&p->s); ae_matrix_clear(&p->cleic); ae_vector_clear(&p->xnew); ae_vector_clear(&p->xdir); ae_vector_clear(&p->deltax); ae_vector_clear(&p->deltaf); _rcommstate_clear(&p->rstate); ae_vector_clear(&p->choleskybuf); ae_vector_clear(&p->tmp0); ae_vector_clear(&p->fm1); ae_vector_clear(&p->fp1); ae_vector_clear(&p->fc1); ae_vector_clear(&p->gm1); ae_vector_clear(&p->gp1); ae_vector_clear(&p->gc1); _minlbfgsstate_clear(&p->internalstate); _minlbfgsreport_clear(&p->internalrep); _minqpstate_clear(&p->qpstate); _minqpreport_clear(&p->qprep); _minlmstepfinder_clear(&p->finderstate); } void _minlmstate_destroy(void* _p) { minlmstate *p = (minlmstate*)_p; ae_touch_ptr((void*)p); ae_vector_destroy(&p->x); ae_vector_destroy(&p->fi); ae_matrix_destroy(&p->j); ae_matrix_destroy(&p->h); ae_vector_destroy(&p->g); ae_vector_destroy(&p->xbase); ae_vector_destroy(&p->fibase); ae_vector_destroy(&p->gbase); ae_matrix_destroy(&p->quadraticmodel); ae_vector_destroy(&p->bndl); 
ae_vector_destroy(&p->bndu); ae_vector_destroy(&p->havebndl); ae_vector_destroy(&p->havebndu); ae_vector_destroy(&p->s); ae_matrix_destroy(&p->cleic); ae_vector_destroy(&p->xnew); ae_vector_destroy(&p->xdir); ae_vector_destroy(&p->deltax); ae_vector_destroy(&p->deltaf); _rcommstate_destroy(&p->rstate); ae_vector_destroy(&p->choleskybuf); ae_vector_destroy(&p->tmp0); ae_vector_destroy(&p->fm1); ae_vector_destroy(&p->fp1); ae_vector_destroy(&p->fc1); ae_vector_destroy(&p->gm1); ae_vector_destroy(&p->gp1); ae_vector_destroy(&p->gc1); _minlbfgsstate_destroy(&p->internalstate); _minlbfgsreport_destroy(&p->internalrep); _minqpstate_destroy(&p->qpstate); _minqpreport_destroy(&p->qprep); _minlmstepfinder_destroy(&p->finderstate); } void _minlmreport_init(void* _p, ae_state *_state) { minlmreport *p = (minlmreport*)_p; ae_touch_ptr((void*)p); } void _minlmreport_init_copy(void* _dst, void* _src, ae_state *_state) { minlmreport *dst = (minlmreport*)_dst; minlmreport *src = (minlmreport*)_src; dst->iterationscount = src->iterationscount; dst->terminationtype = src->terminationtype; dst->funcidx = src->funcidx; dst->varidx = src->varidx; dst->nfunc = src->nfunc; dst->njac = src->njac; dst->ngrad = src->ngrad; dst->nhess = src->nhess; dst->ncholesky = src->ncholesky; } void _minlmreport_clear(void* _p) { minlmreport *p = (minlmreport*)_p; ae_touch_ptr((void*)p); } void _minlmreport_destroy(void* _p) { minlmreport *p = (minlmreport*)_p; ae_touch_ptr((void*)p); } } cpp/src/linalg.h0000755000175000017500000127101513105126765013444 0ustar sergeysergey/************************************************************************* ALGLIB 3.11.0 (source code generated 2017-05-11) Copyright (c) Sergey Bochkanov (ALGLIB project). >>> SOURCE LICENSE >>> This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation (www.fsf.org); either version 2 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. 
A copy of the GNU General Public License is available at http://www.fsf.org/licensing/licenses >>> END OF LICENSE >>> *************************************************************************/ #ifndef _linalg_pkg_h #define _linalg_pkg_h #include "ap.h" #include "alglibinternal.h" #include "alglibmisc.h" ///////////////////////////////////////////////////////////////////////// // // THIS SECTION CONTAINS COMPUTATIONAL CORE DECLARATIONS (DATATYPES) // ///////////////////////////////////////////////////////////////////////// namespace alglib_impl { typedef struct { ae_vector vals; ae_vector idx; ae_vector ridx; ae_vector didx; ae_vector uidx; ae_int_t matrixtype; ae_int_t m; ae_int_t n; ae_int_t nfree; ae_int_t ninitialized; ae_int_t tablesize; } sparsematrix; typedef struct { ae_vector d; ae_vector u; sparsematrix s; } sparsebuffers; typedef struct { double r1; double rinf; } matinvreport; typedef struct { double e1; double e2; ae_vector x; ae_vector ax; double xax; ae_int_t n; ae_vector rk; ae_vector rk1; ae_vector xk; ae_vector xk1; ae_vector pk; ae_vector pk1; ae_vector b; rcommstate rstate; ae_vector tmp2; } fblslincgstate; typedef struct { ae_int_t n; ae_int_t m; ae_int_t nstart; ae_int_t nits; ae_int_t seedval; ae_vector x0; ae_vector x1; ae_vector t; ae_vector xbest; hqrndstate r; ae_vector x; ae_vector mv; ae_vector mtv; ae_bool needmv; ae_bool needmtv; double repnorm; rcommstate rstate; } normestimatorstate; typedef struct { ae_int_t n; ae_int_t k; ae_int_t nwork; ae_int_t maxits; double eps; ae_int_t eigenvectorsneeded; ae_int_t matrixtype; hqrndstate rs; ae_bool running; ae_vector tau; ae_matrix qcur; ae_matrix znew; ae_matrix r; ae_matrix rz; ae_matrix tz; ae_matrix rq; ae_matrix dummy; ae_vector rw; ae_vector tw; ae_vector wcur; ae_vector wprev; ae_vector wrank; apbuffers buf; ae_matrix x; ae_matrix ax; ae_int_t requesttype; ae_int_t requestsize; ae_int_t repiterationscount; rcommstate rstate; } eigsubspacestate; typedef struct { ae_int_t iterationscount; } eigsubspacereport; } ///////////////////////////////////////////////////////////////////////// // // THIS SECTION CONTAINS C++ INTERFACE // ///////////////////////////////////////////////////////////////////////// namespace alglib { /************************************************************************* Sparse matrix structure. You should use ALGLIB functions to work with sparse matrix. Never try to access its fields directly! NOTES ON THE SPARSE STORAGE FORMATS Sparse matrices can be stored using several formats: * Hash-Table representation * Compressed Row Storage (CRS) * Skyline matrix storage (SKS) Each of the formats has benefits and drawbacks: * Hash-table is good for dynamic operations (insertion of new elements), but does not support linear algebra operations * CRS is good for operations like matrix-vector or matrix-matrix products, but its initialization is less convenient - you have to tell row sizes at the initialization, and you have to fill matrix only row by row, from left to right. * SKS is a special format which is used to store triangular factors from Cholesky factorization. It does not support dynamic modification, and support for linear algebra operations is very limited. 
Tables below outline information about these two formats: OPERATIONS WITH MATRIX HASH CRS SKS creation + + + SparseGet + + + SparseRewriteExisting + + + SparseSet + SparseAdd + SparseGetRow + + SparseGetCompressedRow + + sparse-dense linear algebra + + *************************************************************************/ class _sparsematrix_owner { public: _sparsematrix_owner(); _sparsematrix_owner(const _sparsematrix_owner &rhs); _sparsematrix_owner& operator=(const _sparsematrix_owner &rhs); virtual ~_sparsematrix_owner(); alglib_impl::sparsematrix* c_ptr(); alglib_impl::sparsematrix* c_ptr() const; protected: alglib_impl::sparsematrix *p_struct; }; class sparsematrix : public _sparsematrix_owner { public: sparsematrix(); sparsematrix(const sparsematrix &rhs); sparsematrix& operator=(const sparsematrix &rhs); virtual ~sparsematrix(); }; /************************************************************************* Temporary buffers for sparse matrix operations. You should pass an instance of this structure to factorization functions. It allows to reuse memory during repeated sparse factorizations. You do not have to call some initialization function - simply passing an instance to factorization function is enough. *************************************************************************/ class _sparsebuffers_owner { public: _sparsebuffers_owner(); _sparsebuffers_owner(const _sparsebuffers_owner &rhs); _sparsebuffers_owner& operator=(const _sparsebuffers_owner &rhs); virtual ~_sparsebuffers_owner(); alglib_impl::sparsebuffers* c_ptr(); alglib_impl::sparsebuffers* c_ptr() const; protected: alglib_impl::sparsebuffers *p_struct; }; class sparsebuffers : public _sparsebuffers_owner { public: sparsebuffers(); sparsebuffers(const sparsebuffers &rhs); sparsebuffers& operator=(const sparsebuffers &rhs); virtual ~sparsebuffers(); }; /************************************************************************* Matrix inverse report: * R1 reciprocal of condition number in 1-norm * RInf reciprocal of condition number in inf-norm *************************************************************************/ class _matinvreport_owner { public: _matinvreport_owner(); _matinvreport_owner(const _matinvreport_owner &rhs); _matinvreport_owner& operator=(const _matinvreport_owner &rhs); virtual ~_matinvreport_owner(); alglib_impl::matinvreport* c_ptr(); alglib_impl::matinvreport* c_ptr() const; protected: alglib_impl::matinvreport *p_struct; }; class matinvreport : public _matinvreport_owner { public: matinvreport(); matinvreport(const matinvreport &rhs); matinvreport& operator=(const matinvreport &rhs); virtual ~matinvreport(); double &r1; double &rinf; }; /************************************************************************* This object stores state of the iterative norm estimation algorithm. You should use ALGLIB functions to work with this object. 
*************************************************************************/ class _normestimatorstate_owner { public: _normestimatorstate_owner(); _normestimatorstate_owner(const _normestimatorstate_owner &rhs); _normestimatorstate_owner& operator=(const _normestimatorstate_owner &rhs); virtual ~_normestimatorstate_owner(); alglib_impl::normestimatorstate* c_ptr(); alglib_impl::normestimatorstate* c_ptr() const; protected: alglib_impl::normestimatorstate *p_struct; }; class normestimatorstate : public _normestimatorstate_owner { public: normestimatorstate(); normestimatorstate(const normestimatorstate &rhs); normestimatorstate& operator=(const normestimatorstate &rhs); virtual ~normestimatorstate(); }; /************************************************************************* This object stores state of the subspace iteration algorithm. You should use ALGLIB functions to work with this object. *************************************************************************/ class _eigsubspacestate_owner { public: _eigsubspacestate_owner(); _eigsubspacestate_owner(const _eigsubspacestate_owner &rhs); _eigsubspacestate_owner& operator=(const _eigsubspacestate_owner &rhs); virtual ~_eigsubspacestate_owner(); alglib_impl::eigsubspacestate* c_ptr(); alglib_impl::eigsubspacestate* c_ptr() const; protected: alglib_impl::eigsubspacestate *p_struct; }; class eigsubspacestate : public _eigsubspacestate_owner { public: eigsubspacestate(); eigsubspacestate(const eigsubspacestate &rhs); eigsubspacestate& operator=(const eigsubspacestate &rhs); virtual ~eigsubspacestate(); }; /************************************************************************* This object stores state of the subspace iteration algorithm. You should use ALGLIB functions to work with this object. *************************************************************************/ class _eigsubspacereport_owner { public: _eigsubspacereport_owner(); _eigsubspacereport_owner(const _eigsubspacereport_owner &rhs); _eigsubspacereport_owner& operator=(const _eigsubspacereport_owner &rhs); virtual ~_eigsubspacereport_owner(); alglib_impl::eigsubspacereport* c_ptr(); alglib_impl::eigsubspacereport* c_ptr() const; protected: alglib_impl::eigsubspacereport *p_struct; }; class eigsubspacereport : public _eigsubspacereport_owner { public: eigsubspacereport(); eigsubspacereport(const eigsubspacereport &rhs); eigsubspacereport& operator=(const eigsubspacereport &rhs); virtual ~eigsubspacereport(); ae_int_t &iterationscount; }; /************************************************************************* This function creates sparse matrix in a Hash-Table format. This function creates Hast-Table matrix, which can be converted to CRS format after its initialization is over. Typical usage scenario for a sparse matrix is: 1. creation in a Hash-Table format 2. insertion of the matrix elements 3. conversion to the CRS representation 4. matrix is passed to some linear algebra algorithm Some information about different matrix formats can be found below, in the "NOTES" section. INPUT PARAMETERS M - number of rows in a matrix, M>=1 N - number of columns in a matrix, N>=1 K - K>=0, expected number of non-zero elements in a matrix. K can be inexact approximation, can be less than actual number of elements (table will grow when needed) or even zero). It is important to understand that although hash-table may grow automatically, it is better to provide good estimate of data size. OUTPUT PARAMETERS S - sparse M*N matrix in Hash-Table representation. 
All elements of the matrix are zero. NOTE 1 Hash-tables use memory inefficiently, and they have to keep some amount of the "spare memory" in order to have good performance. Hash table for matrix with K non-zero elements will need C*K*(8+2*sizeof(int)) bytes, where C is a small constant, about 1.5-2 in magnitude. CRS storage, from the other side, is more memory-efficient, and needs just K*(8+sizeof(int))+M*sizeof(int) bytes, where M is a number of rows in a matrix. When you convert from the Hash-Table to CRS representation, all unneeded memory will be freed. NOTE 2 Comments of SparseMatrix structure outline information about different sparse storage formats. We recommend you to read them before starting to use ALGLIB sparse matrices. NOTE 3 This function completely overwrites S with new sparse matrix. Previously allocated storage is NOT reused. If you want to reuse already allocated memory, call SparseCreateBuf function. -- ALGLIB PROJECT -- Copyright 14.10.2011 by Bochkanov Sergey *************************************************************************/ void sparsecreate(const ae_int_t m, const ae_int_t n, const ae_int_t k, sparsematrix &s); void sparsecreate(const ae_int_t m, const ae_int_t n, sparsematrix &s); /************************************************************************* This version of SparseCreate function creates sparse matrix in Hash-Table format, reusing previously allocated storage as much as possible. Read comments for SparseCreate() for more information. INPUT PARAMETERS M - number of rows in a matrix, M>=1 N - number of columns in a matrix, N>=1 K - K>=0, expected number of non-zero elements in a matrix. K can be inexact approximation, can be less than actual number of elements (table will grow when needed) or even zero). It is important to understand that although hash-table may grow automatically, it is better to provide good estimate of data size. S - SparseMatrix structure which MAY contain some already allocated storage. OUTPUT PARAMETERS S - sparse M*N matrix in Hash-Table representation. All elements of the matrix are zero. Previously allocated storage is reused, if its size is compatible with expected number of non-zeros K. -- ALGLIB PROJECT -- Copyright 14.01.2014 by Bochkanov Sergey *************************************************************************/ void sparsecreatebuf(const ae_int_t m, const ae_int_t n, const ae_int_t k, const sparsematrix &s); void sparsecreatebuf(const ae_int_t m, const ae_int_t n, const sparsematrix &s); /************************************************************************* This function creates sparse matrix in a CRS format (expert function for situations when you are running out of memory). This function creates CRS matrix. Typical usage scenario for a CRS matrix is: 1. creation (you have to tell number of non-zero elements at each row at this moment) 2. insertion of the matrix elements (row by row, from left to right) 3. matrix is passed to some linear algebra algorithm This function is a memory-efficient alternative to SparseCreate(), but it is more complex because it requires you to know in advance how large your matrix is. Some information about different matrix formats can be found in comments on SparseMatrix structure. We recommend you to read them before starting to use ALGLIB sparse matrices.. INPUT PARAMETERS M - number of rows in a matrix, M>=1 N - number of columns in a matrix, N>=1 NER - number of elements at each row, array[M], NER[I]>=0 OUTPUT PARAMETERS S - sparse M*N matrix in CRS representation. 
You have to fill ALL non-zero elements by calling SparseSet() BEFORE you try to use this matrix. NOTE: this function completely overwrites S with new sparse matrix. Previously allocated storage is NOT reused. If you want to reuse already allocated memory, call SparseCreateCRSBuf function. -- ALGLIB PROJECT -- Copyright 14.10.2011 by Bochkanov Sergey *************************************************************************/ void sparsecreatecrs(const ae_int_t m, const ae_int_t n, const integer_1d_array &ner, sparsematrix &s); /************************************************************************* This function creates sparse matrix in a CRS format (expert function for situations when you are running out of memory). This version of CRS matrix creation function may reuse memory already allocated in S. This function creates CRS matrix. Typical usage scenario for a CRS matrix is: 1. creation (you have to tell number of non-zero elements at each row at this moment) 2. insertion of the matrix elements (row by row, from left to right) 3. matrix is passed to some linear algebra algorithm This function is a memory-efficient alternative to SparseCreate(), but it is more complex because it requires you to know in advance how large your matrix is. Some information about different matrix formats can be found in comments on SparseMatrix structure. We recommend you to read them before starting to use ALGLIB sparse matrices.. INPUT PARAMETERS M - number of rows in a matrix, M>=1 N - number of columns in a matrix, N>=1 NER - number of elements at each row, array[M], NER[I]>=0 S - sparse matrix structure with possibly preallocated memory. OUTPUT PARAMETERS S - sparse M*N matrix in CRS representation. You have to fill ALL non-zero elements by calling SparseSet() BEFORE you try to use this matrix. -- ALGLIB PROJECT -- Copyright 14.10.2011 by Bochkanov Sergey *************************************************************************/ void sparsecreatecrsbuf(const ae_int_t m, const ae_int_t n, const integer_1d_array &ner, const sparsematrix &s); /************************************************************************* This function creates sparse matrix in a SKS format (skyline storage format). In most cases you do not need this function - CRS format better suits most use cases. INPUT PARAMETERS M, N - number of rows(M) and columns (N) in a matrix: * M=N (as for now, ALGLIB supports only square SKS) * N>=1 * M>=1 D - "bottom" bandwidths, array[M], D[I]>=0. I-th element stores number of non-zeros at I-th row, below the diagonal (diagonal itself is not included) U - "top" bandwidths, array[N], U[I]>=0. I-th element stores number of non-zeros at I-th row, above the diagonal (diagonal itself is not included) OUTPUT PARAMETERS S - sparse M*N matrix in SKS representation. All elements are filled by zeros. You may use SparseRewriteExisting() to change their values. NOTE: this function completely overwrites S with new sparse matrix. Previously allocated storage is NOT reused. If you want to reuse already allocated memory, call SparseCreateSKSBuf function. 
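EXAMPLE (illustrative sketch; the 3x3 skyline profile below is an assumed toy profile, not taken from the library tests)
> alglib::integer_1d_array d = "[0,1,1]";       // non-zeros below the diagonal, per row
> alglib::integer_1d_array u = "[0,1,1]";       // non-zeros above the diagonal, per row
> alglib::sparsematrix s;
> alglib::sparsecreatesks(3, 3, d, u, s);       // square 3x3 SKS matrix, all elements zero
> alglib::sparserewriteexisting(s, 1, 1, 2.0);  // overwrite an in-profile element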
-- ALGLIB PROJECT -- Copyright 13.01.2014 by Bochkanov Sergey *************************************************************************/ void sparsecreatesks(const ae_int_t m, const ae_int_t n, const integer_1d_array &d, const integer_1d_array &u, sparsematrix &s); /************************************************************************* This is "buffered" version of SparseCreateSKS() which reuses memory previously allocated in S (of course, memory is reallocated if needed). This function creates sparse matrix in a SKS format (skyline storage format). In most cases you do not need this function - CRS format better suits most use cases. INPUT PARAMETERS M, N - number of rows(M) and columns (N) in a matrix: * M=N (as for now, ALGLIB supports only square SKS) * N>=1 * M>=1 D - "bottom" bandwidths, array[M], 0<=D[I]<=I. I-th element stores number of non-zeros at I-th row, below the diagonal (diagonal itself is not included) U - "top" bandwidths, array[N], 0<=U[I]<=I. I-th element stores number of non-zeros at I-th row, above the diagonal (diagonal itself is not included) OUTPUT PARAMETERS S - sparse M*N matrix in SKS representation. All elements are filled by zeros. You may use SparseSet()/SparseAdd() to change their values. -- ALGLIB PROJECT -- Copyright 13.01.2014 by Bochkanov Sergey *************************************************************************/ void sparsecreatesksbuf(const ae_int_t m, const ae_int_t n, const integer_1d_array &d, const integer_1d_array &u, const sparsematrix &s); /************************************************************************* This function copies S0 to S1. This function completely deallocates memory owned by S1 before creating a copy of S0. If you want to reuse memory, use SparseCopyBuf. NOTE: this function does not verify its arguments, it just copies all fields of the structure. -- ALGLIB PROJECT -- Copyright 14.10.2011 by Bochkanov Sergey *************************************************************************/ void sparsecopy(const sparsematrix &s0, sparsematrix &s1); /************************************************************************* This function copies S0 to S1. Memory already allocated in S1 is reused as much as possible. NOTE: this function does not verify its arguments, it just copies all fields of the structure. -- ALGLIB PROJECT -- Copyright 14.10.2011 by Bochkanov Sergey *************************************************************************/ void sparsecopybuf(const sparsematrix &s0, const sparsematrix &s1); /************************************************************************* This function efficiently swaps contents of S0 and S1. -- ALGLIB PROJECT -- Copyright 16.01.2014 by Bochkanov Sergey *************************************************************************/ void sparseswap(const sparsematrix &s0, const sparsematrix &s1); /************************************************************************* This function adds value to S[i,j] - element of the sparse matrix. Matrix must be in a Hash-Table mode. In case S[i,j] already exists in the table, V i added to its value. In case S[i,j] is non-existent, it is inserted in the table. Table automatically grows when necessary. INPUT PARAMETERS S - sparse M*N matrix in Hash-Table representation. Exception will be thrown for CRS matrix. I - row index of the element to modify, 0<=I=i are used, and lower triangle is ignored (it can be empty - these elements are not referenced at all). * if lower triangle is given, only S[i,j] for j<=i are used, and upper triangle is ignored. 
X - array[N], input vector. For performance reasons we make only quick checks - we check that array size is at least N, but we do not check for NAN's or INF's. Y - output buffer, possibly preallocated. In case buffer size is too small to store result, this buffer is automatically resized. OUTPUT PARAMETERS Y - array[M], S*x NOTE: this function throws exception when called for non-CRS/SKS matrix. You must convert your matrix with SparseConvertToCRS/SKS() before using this function. -- ALGLIB PROJECT -- Copyright 14.10.2011 by Bochkanov Sergey *************************************************************************/ void sparsesmv(const sparsematrix &s, const bool isupper, const real_1d_array &x, real_1d_array &y); /************************************************************************* This function calculates vector-matrix-vector product x'*S*x, where S is symmetric matrix. Matrix S must be stored in CRS or SKS format (exception will be thrown otherwise). INPUT PARAMETERS S - sparse M*M matrix in CRS or SKS format. IsUpper - whether upper or lower triangle of S is given: * if upper triangle is given, only S[i,j] for j>=i are used, and lower triangle is ignored (it can be empty - these elements are not referenced at all). * if lower triangle is given, only S[i,j] for j<=i are used, and upper triangle is ignored. X - array[N], input vector. For performance reasons we make only quick checks - we check that array size is at least N, but we do not check for NAN's or INF's. RESULT x'*S*x NOTE: this function throws exception when called for non-CRS/SKS matrix. You must convert your matrix with SparseConvertToCRS/SKS() before using this function. -- ALGLIB PROJECT -- Copyright 27.01.2014 by Bochkanov Sergey *************************************************************************/ double sparsevsmv(const sparsematrix &s, const bool isupper, const real_1d_array &x); /************************************************************************* This function calculates matrix-matrix product S*A. Matrix S must be stored in CRS or SKS format (exception will be thrown otherwise). INPUT PARAMETERS S - sparse M*N matrix in CRS or SKS format. A - array[N][K], input dense matrix. For performance reasons we make only quick checks - we check that array size is at least N, but we do not check for NAN's or INF's. K - number of columns of matrix (A). B - output buffer, possibly preallocated. In case buffer size is too small to store result, this buffer is automatically resized. OUTPUT PARAMETERS B - array[M][K], S*A NOTE: this function throws exception when called for non-CRS/SKS matrix. You must convert your matrix with SparseConvertToCRS/SKS() before using this function. -- ALGLIB PROJECT -- Copyright 14.10.2011 by Bochkanov Sergey *************************************************************************/ void sparsemm(const sparsematrix &s, const real_2d_array &a, const ae_int_t k, real_2d_array &b); /************************************************************************* This function calculates matrix-matrix product S^T*A. Matrix S must be stored in CRS or SKS format (exception will be thrown otherwise). INPUT PARAMETERS S - sparse M*N matrix in CRS or SKS format. A - array[M][K], input dense matrix. For performance reasons we make only quick checks - we check that array size is at least M, but we do not check for NAN's or INF's. K - number of columns of matrix (A). B - output buffer, possibly preallocated. In case buffer size is too small to store result, this buffer is automatically resized. 
OUTPUT PARAMETERS B - array[N][K], S^T*A NOTE: this function throws exception when called for non-CRS/SKS matrix. You must convert your matrix with SparseConvertToCRS/SKS() before using this function. -- ALGLIB PROJECT -- Copyright 14.10.2011 by Bochkanov Sergey *************************************************************************/ void sparsemtm(const sparsematrix &s, const real_2d_array &a, const ae_int_t k, real_2d_array &b); /************************************************************************* This function simultaneously calculates two matrix-matrix products: S*A and S^T*A. S must be square (non-rectangular) matrix stored in CRS or SKS format (exception will be thrown otherwise). INPUT PARAMETERS S - sparse N*N matrix in CRS or SKS format. A - array[N][K], input dense matrix. For performance reasons we make only quick checks - we check that array size is at least N, but we do not check for NAN's or INF's. K - number of columns of matrix (A). B0 - output buffer, possibly preallocated. In case buffer size is too small to store result, this buffer is automatically resized. B1 - output buffer, possibly preallocated. In case buffer size is too small to store result, this buffer is automatically resized. OUTPUT PARAMETERS B0 - array[N][K], S*A B1 - array[N][K], S^T*A NOTE: this function throws exception when called for non-CRS/SKS matrix. You must convert your matrix with SparseConvertToCRS/SKS() before using this function. -- ALGLIB PROJECT -- Copyright 14.10.2011 by Bochkanov Sergey *************************************************************************/ void sparsemm2(const sparsematrix &s, const real_2d_array &a, const ae_int_t k, real_2d_array &b0, real_2d_array &b1); /************************************************************************* This function calculates matrix-matrix product S*A, when S is symmetric matrix. Matrix S must be stored in CRS or SKS format (exception will be thrown otherwise). INPUT PARAMETERS S - sparse M*M matrix in CRS or SKS format. IsUpper - whether upper or lower triangle of S is given: * if upper triangle is given, only S[i,j] for j>=i are used, and lower triangle is ignored (it can be empty - these elements are not referenced at all). * if lower triangle is given, only S[i,j] for j<=i are used, and upper triangle is ignored. A - array[N][K], input dense matrix. For performance reasons we make only quick checks - we check that array size is at least N, but we do not check for NAN's or INF's. K - number of columns of matrix (A). B - output buffer, possibly preallocated. In case buffer size is too small to store result, this buffer is automatically resized. OUTPUT PARAMETERS B - array[M][K], S*A NOTE: this function throws exception when called for non-CRS/SKS matrix. You must convert your matrix with SparseConvertToCRS/SKS() before using this function. -- ALGLIB PROJECT -- Copyright 14.10.2011 by Bochkanov Sergey *************************************************************************/ void sparsesmm(const sparsematrix &s, const bool isupper, const real_2d_array &a, const ae_int_t k, real_2d_array &b); /************************************************************************* This function calculates matrix-vector product op(S)*x, when x is vector, S is symmetric triangular matrix, op(S) is transposition or no operation. Matrix S must be stored in CRS or SKS format (exception will be thrown otherwise). INPUT PARAMETERS S - sparse square matrix in CRS or SKS format. 
IsUpper - whether upper or lower triangle of S is used: * if upper triangle is given, only S[i,j] for j>=i are used, and lower triangle is ignored (it can be empty - these elements are not referenced at all). * if lower triangle is given, only S[i,j] for j<=i are used, and upper triangle is ignored. IsUnit - unit or non-unit diagonal: * if True, diagonal elements of triangular matrix are considered equal to 1.0. Actual elements stored in S are not referenced at all. * if False, diagonal stored in S is used OpType - operation type: * if 0, S*x is calculated * if 1, (S^T)*x is calculated (transposition) X - array[N] which stores input vector. For performance reasons we make only quick checks - we check that array size is at least N, but we do not check for NAN's or INF's. Y - possibly preallocated input buffer. Automatically resized if its size is too small. OUTPUT PARAMETERS Y - array[N], op(S)*x NOTE: this function throws exception when called for non-CRS/SKS matrix. You must convert your matrix with SparseConvertToCRS/SKS() before using this function. -- ALGLIB PROJECT -- Copyright 20.01.2014 by Bochkanov Sergey *************************************************************************/ void sparsetrmv(const sparsematrix &s, const bool isupper, const bool isunit, const ae_int_t optype, const real_1d_array &x, real_1d_array &y); /************************************************************************* This function solves linear system op(S)*y=x where x is vector, S is symmetric triangular matrix, op(S) is transposition or no operation. Matrix S must be stored in CRS or SKS format (exception will be thrown otherwise). INPUT PARAMETERS S - sparse square matrix in CRS or SKS format. IsUpper - whether upper or lower triangle of S is used: * if upper triangle is given, only S[i,j] for j>=i are used, and lower triangle is ignored (it can be empty - these elements are not referenced at all). * if lower triangle is given, only S[i,j] for j<=i are used, and upper triangle is ignored. IsUnit - unit or non-unit diagonal: * if True, diagonal elements of triangular matrix are considered equal to 1.0. Actual elements stored in S are not referenced at all. * if False, diagonal stored in S is used. It is your responsibility to make sure that diagonal is non-zero. OpType - operation type: * if 0, S*x is calculated * if 1, (S^T)*x is calculated (transposition) X - array[N] which stores input vector. For performance reasons we make only quick checks - we check that array size is at least N, but we do not check for NAN's or INF's. OUTPUT PARAMETERS X - array[N], inv(op(S))*x NOTE: this function throws exception when called for non-CRS/SKS matrix. You must convert your matrix with SparseConvertToCRS/SKS() before using this function. NOTE: no assertion or tests are done during algorithm operation. It is your responsibility to provide invertible matrix to algorithm. -- ALGLIB PROJECT -- Copyright 20.01.2014 by Bochkanov Sergey *************************************************************************/ void sparsetrsv(const sparsematrix &s, const bool isupper, const bool isunit, const ae_int_t optype, const real_1d_array &x); /************************************************************************* This procedure resizes Hash-Table matrix. It can be called when you have deleted too many elements from the matrix, and you want to free unneeded memory. 
-- ALGLIB PROJECT -- Copyright 14.10.2011 by Bochkanov Sergey *************************************************************************/ void sparseresizematrix(const sparsematrix &s); /************************************************************************* This function is used to enumerate all elements of the sparse matrix. Before first call user initializes T0 and T1 counters by zero. These counters are used to remember current position in a matrix; after each call they are updated by the function. Subsequent calls to this function return non-zero elements of the sparse matrix, one by one. If you enumerate CRS matrix, matrix is traversed from left to right, from top to bottom. In case you enumerate matrix stored as Hash table, elements are returned in random order. EXAMPLE > T0=0 > T1=0 > while SparseEnumerate(S,T0,T1,I,J,V) do > ....do something with I,J,V INPUT PARAMETERS S - sparse M*N matrix in Hash-Table or CRS representation. T0 - internal counter T1 - internal counter OUTPUT PARAMETERS T0 - new value of the internal counter T1 - new value of the internal counter I - row index of non-zero element, 0<=I<M J - column index of non-zero element, 0<=J<N V - value of the element RESULT: True in case of success (next non-zero element was retrieved), False in case all non-zero elements were enumerated *************************************************************************/ bool sparseenumerate(const sparsematrix &s, ae_int_t &t0, ae_int_t &t1, ae_int_t &i, ae_int_t &j, double &v); /************************************************************************* Generation of a random Haar distributed orthogonal matrix INPUT PARAMETERS: N - matrix size, N>=1 OUTPUT PARAMETERS: A - orthogonal NxN matrix, array[0..N-1,0..N-1] NOTE: this function uses algorithm described in Stewart, G. W. (1980), "The Efficient Generation of Random Orthogonal Matrices with an Application to Condition Estimators". In short, to generate an (N+1)x(N+1) orthogonal matrix, it: * takes an NxN one * takes uniformly distributed unit vector of dimension N+1. * constructs a Householder reflection from the vector, then applies it to the smaller matrix (embedded in the larger size with a 1 at the bottom right corner). -- ALGLIB routine -- 04.12.2009 Bochkanov Sergey *************************************************************************/ void rmatrixrndorthogonal(const ae_int_t n, real_2d_array &a); /************************************************************************* Generation of random NxN matrix with given condition number and norm2(A)=1 INPUT PARAMETERS: N - matrix size C - condition number (in 2-norm) OUTPUT PARAMETERS: A - random matrix with norm2(A)=1 and cond(A)=C -- ALGLIB routine -- 04.12.2009 Bochkanov Sergey *************************************************************************/ void rmatrixrndcond(const ae_int_t n, const double c, real_2d_array &a); /************************************************************************* Generation of a random Haar distributed orthogonal complex matrix INPUT PARAMETERS: N - matrix size, N>=1 OUTPUT PARAMETERS: A - orthogonal NxN matrix, array[0..N-1,0..N-1] NOTE: this function uses algorithm described in Stewart, G. W. (1980), "The Efficient Generation of Random Orthogonal Matrices with an Application to Condition Estimators". In short, to generate an (N+1)x(N+1) orthogonal matrix, it: * takes an NxN one * takes uniformly distributed unit vector of dimension N+1. * constructs a Householder reflection from the vector, then applies it to the smaller matrix (embedded in the larger size with a 1 at the bottom right corner).
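A short, hedged sketch of how these generators are typically combined to produce test problems follows; the size, the condition number and the linalg.h include are assumptions made only for the example.

#include <cstdio>
#include "linalg.h"
using namespace alglib;

int main()
{
    ae_int_t n = 5;

    // Random Haar-distributed orthogonal matrix Q, 5x5.
    real_2d_array q;
    rmatrixrndorthogonal(n, q);

    // Random matrix with norm2(A)=1 and cond(A)=1000 - handy as a
    // reproducible ill-conditioned test case for solvers.
    real_2d_array a;
    rmatrixrndcond(n, 1.0E3, a);

    printf("Q(0,0)=%.4f  A(0,0)=%.4f\n", q(0,0), a(0,0));
    return 0;
}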
-- ALGLIB routine -- 04.12.2009 Bochkanov Sergey *************************************************************************/ void cmatrixrndorthogonal(const ae_int_t n, complex_2d_array &a); /************************************************************************* Generation of random NxN complex matrix with given condition number C and norm2(A)=1 INPUT PARAMETERS: N - matrix size C - condition number (in 2-norm) OUTPUT PARAMETERS: A - random matrix with norm2(A)=1 and cond(A)=C -- ALGLIB routine -- 04.12.2009 Bochkanov Sergey *************************************************************************/ void cmatrixrndcond(const ae_int_t n, const double c, complex_2d_array &a); /************************************************************************* Generation of random NxN symmetric matrix with given condition number and norm2(A)=1 INPUT PARAMETERS: N - matrix size C - condition number (in 2-norm) OUTPUT PARAMETERS: A - random matrix with norm2(A)=1 and cond(A)=C -- ALGLIB routine -- 04.12.2009 Bochkanov Sergey *************************************************************************/ void smatrixrndcond(const ae_int_t n, const double c, real_2d_array &a); /************************************************************************* Generation of random NxN symmetric positive definite matrix with given condition number and norm2(A)=1 INPUT PARAMETERS: N - matrix size C - condition number (in 2-norm) OUTPUT PARAMETERS: A - random SPD matrix with norm2(A)=1 and cond(A)=C -- ALGLIB routine -- 04.12.2009 Bochkanov Sergey *************************************************************************/ void spdmatrixrndcond(const ae_int_t n, const double c, real_2d_array &a); /************************************************************************* Generation of random NxN Hermitian matrix with given condition number and norm2(A)=1 INPUT PARAMETERS: N - matrix size C - condition number (in 2-norm) OUTPUT PARAMETERS: A - random matrix with norm2(A)=1 and cond(A)=C -- ALGLIB routine -- 04.12.2009 Bochkanov Sergey *************************************************************************/ void hmatrixrndcond(const ae_int_t n, const double c, complex_2d_array &a); /************************************************************************* Generation of random NxN Hermitian positive definite matrix with given condition number and norm2(A)=1 INPUT PARAMETERS: N - matrix size C - condition number (in 2-norm) OUTPUT PARAMETERS: A - random HPD matrix with norm2(A)=1 and cond(A)=C -- ALGLIB routine -- 04.12.2009 Bochkanov Sergey *************************************************************************/ void hpdmatrixrndcond(const ae_int_t n, const double c, complex_2d_array &a); /************************************************************************* Multiplication of MxN matrix by NxN random Haar distributed orthogonal matrix INPUT PARAMETERS: A - matrix, array[0..M-1, 0..N-1] M, N- matrix size OUTPUT PARAMETERS: A - A*Q, where Q is random NxN orthogonal matrix -- ALGLIB routine -- 04.12.2009 Bochkanov Sergey *************************************************************************/ void rmatrixrndorthogonalfromtheright(real_2d_array &a, const ae_int_t m, const ae_int_t n); /************************************************************************* Multiplication of MxN matrix by MxM random Haar distributed orthogonal matrix INPUT PARAMETERS: A - matrix, array[0..M-1, 0..N-1] M, N- matrix size OUTPUT PARAMETERS: A - Q*A, where Q is random MxM orthogonal matrix -- ALGLIB routine -- 04.12.2009 Bochkanov Sergey 
*************************************************************************/ void rmatrixrndorthogonalfromtheleft(real_2d_array &a, const ae_int_t m, const ae_int_t n); /************************************************************************* Multiplication of MxN complex matrix by NxN random Haar distributed complex orthogonal matrix INPUT PARAMETERS: A - matrix, array[0..M-1, 0..N-1] M, N- matrix size OUTPUT PARAMETERS: A - A*Q, where Q is random NxN orthogonal matrix -- ALGLIB routine -- 04.12.2009 Bochkanov Sergey *************************************************************************/ void cmatrixrndorthogonalfromtheright(complex_2d_array &a, const ae_int_t m, const ae_int_t n); /************************************************************************* Multiplication of MxN complex matrix by MxM random Haar distributed complex orthogonal matrix INPUT PARAMETERS: A - matrix, array[0..M-1, 0..N-1] M, N- matrix size OUTPUT PARAMETERS: A - Q*A, where Q is random MxM orthogonal matrix -- ALGLIB routine -- 04.12.2009 Bochkanov Sergey *************************************************************************/ void cmatrixrndorthogonalfromtheleft(complex_2d_array &a, const ae_int_t m, const ae_int_t n); /************************************************************************* Symmetric multiplication of NxN matrix by random Haar distributed orthogonal matrix INPUT PARAMETERS: A - matrix, array[0..N-1, 0..N-1] N - matrix size OUTPUT PARAMETERS: A - Q'*A*Q, where Q is random NxN orthogonal matrix -- ALGLIB routine -- 04.12.2009 Bochkanov Sergey *************************************************************************/ void smatrixrndmultiply(real_2d_array &a, const ae_int_t n); /************************************************************************* Hermitian multiplication of NxN matrix by random Haar distributed complex orthogonal matrix INPUT PARAMETERS: A - matrix, array[0..N-1, 0..N-1] N - matrix size OUTPUT PARAMETERS: A - Q^H*A*Q, where Q is random NxN orthogonal matrix -- ALGLIB routine -- 04.12.2009 Bochkanov Sergey *************************************************************************/ void hmatrixrndmultiply(complex_2d_array &a, const ae_int_t n); /************************************************************************* Cache-oblivous complex "copy-and-transpose" Input parameters: M - number of rows N - number of columns A - source matrix, MxN submatrix is copied and transposed IA - submatrix offset (row index) JA - submatrix offset (column index) B - destination matrix, must be large enough to store result IB - submatrix offset (row index) JB - submatrix offset (column index) *************************************************************************/ void cmatrixtranspose(const ae_int_t m, const ae_int_t n, const complex_2d_array &a, const ae_int_t ia, const ae_int_t ja, complex_2d_array &b, const ae_int_t ib, const ae_int_t jb); /************************************************************************* Cache-oblivous real "copy-and-transpose" Input parameters: M - number of rows N - number of columns A - source matrix, MxN submatrix is copied and transposed IA - submatrix offset (row index) JA - submatrix offset (column index) B - destination matrix, must be large enough to store result IB - submatrix offset (row index) JB - submatrix offset (column index) *************************************************************************/ void rmatrixtranspose(const ae_int_t m, const ae_int_t n, const real_2d_array &a, const ae_int_t ia, const ae_int_t ja, const real_2d_array &b, 
const ae_int_t ib, const ae_int_t jb); /************************************************************************* This code enforces symmetry of the matrix by copying the upper part to the lower one (or vice versa). INPUT PARAMETERS: A - matrix N - number of rows/columns IsUpper - whether we want to copy upper triangle to lower one (True) or vice versa (False). *************************************************************************/ void rmatrixenforcesymmetricity(const real_2d_array &a, const ae_int_t n, const bool isupper); /************************************************************************* Copy Input parameters: M - number of rows N - number of columns A - source matrix, MxN submatrix is copied IA - submatrix offset (row index) JA - submatrix offset (column index) B - destination matrix, must be large enough to store result IB - submatrix offset (row index) JB - submatrix offset (column index) *************************************************************************/ void cmatrixcopy(const ae_int_t m, const ae_int_t n, const complex_2d_array &a, const ae_int_t ia, const ae_int_t ja, complex_2d_array &b, const ae_int_t ib, const ae_int_t jb); /************************************************************************* Copy Input parameters: M - number of rows N - number of columns A - source matrix, MxN submatrix is copied IA - submatrix offset (row index) JA - submatrix offset (column index) B - destination matrix, must be large enough to store result IB - submatrix offset (row index) JB - submatrix offset (column index) *************************************************************************/ void rmatrixcopy(const ae_int_t m, const ae_int_t n, const real_2d_array &a, const ae_int_t ia, const ae_int_t ja, real_2d_array &b, const ae_int_t ib, const ae_int_t jb); /************************************************************************* Rank-1 correction: A := A + u*v' INPUT PARAMETERS: M - number of rows N - number of columns A - target matrix, MxN submatrix is updated IA - submatrix offset (row index) JA - submatrix offset (column index) U - vector #1 IU - subvector offset V - vector #2 IV - subvector offset *************************************************************************/ void cmatrixrank1(const ae_int_t m, const ae_int_t n, complex_2d_array &a, const ae_int_t ia, const ae_int_t ja, complex_1d_array &u, const ae_int_t iu, complex_1d_array &v, const ae_int_t iv); /************************************************************************* Rank-1 correction: A := A + u*v' INPUT PARAMETERS: M - number of rows N - number of columns A - target matrix, MxN submatrix is updated IA - submatrix offset (row index) JA - submatrix offset (column index) U - vector #1 IU - subvector offset V - vector #2 IV - subvector offset *************************************************************************/ void rmatrixrank1(const ae_int_t m, const ae_int_t n, real_2d_array &a, const ae_int_t ia, const ae_int_t ja, real_1d_array &u, const ae_int_t iu, real_1d_array &v, const ae_int_t iv); /************************************************************************* Matrix-vector product: y := op(A)*x INPUT PARAMETERS: M - number of rows of op(A) M>=0 N - number of columns of op(A) N>=0 A - target matrix IA - submatrix offset (row index) JA - submatrix offset (column index) OpA - operation type: * OpA=0 => op(A) = A * OpA=1 => op(A) = A^T * OpA=2 => op(A) = A^H X - input vector IX - subvector offset IY - subvector offset Y - preallocated output vector, must be large enough to
store result OUTPUT PARAMETERS: Y - vector which stores result if M=0, then subroutine does nothing. if N=0, Y is filled by zeros. -- ALGLIB routine -- 28.01.2010 Bochkanov Sergey *************************************************************************/ void cmatrixmv(const ae_int_t m, const ae_int_t n, const complex_2d_array &a, const ae_int_t ia, const ae_int_t ja, const ae_int_t opa, const complex_1d_array &x, const ae_int_t ix, complex_1d_array &y, const ae_int_t iy); /************************************************************************* Matrix-vector product: y := op(A)*x INPUT PARAMETERS: M - number of rows of op(A) N - number of columns of op(A) A - target matrix IA - submatrix offset (row index) JA - submatrix offset (column index) OpA - operation type: * OpA=0 => op(A) = A * OpA=1 => op(A) = A^T X - input vector IX - subvector offset IY - subvector offset Y - preallocated matrix, must be large enough to store result OUTPUT PARAMETERS: Y - vector which stores result if M=0, then subroutine does nothing. if N=0, Y is filled by zeros. -- ALGLIB routine -- 28.01.2010 Bochkanov Sergey *************************************************************************/ void rmatrixmv(const ae_int_t m, const ae_int_t n, const real_2d_array &a, const ae_int_t ia, const ae_int_t ja, const ae_int_t opa, const real_1d_array &x, const ae_int_t ix, real_1d_array &y, const ae_int_t iy); /************************************************************************* This subroutine calculates X*op(A^-1) where: * X is MxN general matrix * A is NxN upper/lower triangular/unitriangular matrix * "op" may be identity transformation, transposition, conjugate transposition Multiplication result replaces X. Cache-oblivious algorithm is used. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. Because starting/stopping worker thread always ! involves some overhead, parallelism starts to be profitable for N's ! larger than 128. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. 
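A minimal usage sketch for rmatrixmv() follows (assumed data and linalg.h include): A is stored as a 3x2 block, so with OpA=1 the operand op(A)=A^T has M=2 rows and N=3 columns, and Y must be preallocated by the caller.

#include <cstdio>
#include "linalg.h"
using namespace alglib;

int main()
{
    // A is stored as 3x2; with OpA=1 we multiply by A^T (2x3).
    real_2d_array a = "[[1,2],[3,4],[5,6]]";
    real_1d_array x = "[1,1,1]";

    // Y is not resized by rmatrixmv() - preallocate it.
    real_1d_array y;
    y.setlength(2);

    // y := A^T*x, with zero submatrix/subvector offsets.
    rmatrixmv(2, 3, a, 0, 0, 1, x, 0, y, 0);

    printf("%s\n", y.tostring(1).c_str());   // expected: [9, 12]
    return 0;
}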
INPUT PARAMETERS N - matrix size, N>=0 M - matrix size, N>=0 A - matrix, actial matrix is stored in A[I1:I1+N-1,J1:J1+N-1] I1 - submatrix offset J1 - submatrix offset IsUpper - whether matrix is upper triangular IsUnit - whether matrix is unitriangular OpType - transformation type: * 0 - no transformation * 1 - transposition * 2 - conjugate transposition X - matrix, actial matrix is stored in X[I2:I2+M-1,J2:J2+N-1] I2 - submatrix offset J2 - submatrix offset -- ALGLIB routine -- 15.12.2009 Bochkanov Sergey *************************************************************************/ void cmatrixrighttrsm(const ae_int_t m, const ae_int_t n, const complex_2d_array &a, const ae_int_t i1, const ae_int_t j1, const bool isupper, const bool isunit, const ae_int_t optype, const complex_2d_array &x, const ae_int_t i2, const ae_int_t j2); void smp_cmatrixrighttrsm(const ae_int_t m, const ae_int_t n, const complex_2d_array &a, const ae_int_t i1, const ae_int_t j1, const bool isupper, const bool isunit, const ae_int_t optype, const complex_2d_array &x, const ae_int_t i2, const ae_int_t j2); /************************************************************************* This subroutine calculates op(A^-1)*X where: * X is MxN general matrix * A is MxM upper/lower triangular/unitriangular matrix * "op" may be identity transformation, transposition, conjugate transposition Multiplication result replaces X. Cache-oblivious algorithm is used. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. Because starting/stopping worker thread always ! involves some overhead, parallelism starts to be profitable for N's ! larger than 128. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. 
INPUT PARAMETERS N - matrix size, N>=0 M - matrix size, N>=0 A - matrix, actial matrix is stored in A[I1:I1+M-1,J1:J1+M-1] I1 - submatrix offset J1 - submatrix offset IsUpper - whether matrix is upper triangular IsUnit - whether matrix is unitriangular OpType - transformation type: * 0 - no transformation * 1 - transposition * 2 - conjugate transposition X - matrix, actial matrix is stored in X[I2:I2+M-1,J2:J2+N-1] I2 - submatrix offset J2 - submatrix offset -- ALGLIB routine -- 15.12.2009 Bochkanov Sergey *************************************************************************/ void cmatrixlefttrsm(const ae_int_t m, const ae_int_t n, const complex_2d_array &a, const ae_int_t i1, const ae_int_t j1, const bool isupper, const bool isunit, const ae_int_t optype, const complex_2d_array &x, const ae_int_t i2, const ae_int_t j2); void smp_cmatrixlefttrsm(const ae_int_t m, const ae_int_t n, const complex_2d_array &a, const ae_int_t i1, const ae_int_t j1, const bool isupper, const bool isunit, const ae_int_t optype, const complex_2d_array &x, const ae_int_t i2, const ae_int_t j2); /************************************************************************* This subroutine calculates X*op(A^-1) where: * X is MxN general matrix * A is NxN upper/lower triangular/unitriangular matrix * "op" may be identity transformation, transposition Multiplication result replaces X. Cache-oblivious algorithm is used. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. Because starting/stopping worker thread always ! involves some overhead, parallelism starts to be profitable for N's ! larger than 128. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. 
INPUT PARAMETERS N - matrix size, N>=0 M - matrix size, N>=0 A - matrix, actial matrix is stored in A[I1:I1+N-1,J1:J1+N-1] I1 - submatrix offset J1 - submatrix offset IsUpper - whether matrix is upper triangular IsUnit - whether matrix is unitriangular OpType - transformation type: * 0 - no transformation * 1 - transposition X - matrix, actial matrix is stored in X[I2:I2+M-1,J2:J2+N-1] I2 - submatrix offset J2 - submatrix offset -- ALGLIB routine -- 15.12.2009 Bochkanov Sergey *************************************************************************/ void rmatrixrighttrsm(const ae_int_t m, const ae_int_t n, const real_2d_array &a, const ae_int_t i1, const ae_int_t j1, const bool isupper, const bool isunit, const ae_int_t optype, const real_2d_array &x, const ae_int_t i2, const ae_int_t j2); void smp_rmatrixrighttrsm(const ae_int_t m, const ae_int_t n, const real_2d_array &a, const ae_int_t i1, const ae_int_t j1, const bool isupper, const bool isunit, const ae_int_t optype, const real_2d_array &x, const ae_int_t i2, const ae_int_t j2); /************************************************************************* This subroutine calculates op(A^-1)*X where: * X is MxN general matrix * A is MxM upper/lower triangular/unitriangular matrix * "op" may be identity transformation, transposition Multiplication result replaces X. Cache-oblivious algorithm is used. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. Because starting/stopping worker thread always ! involves some overhead, parallelism starts to be profitable for N's ! larger than 128. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. 
INPUT PARAMETERS N - matrix size, N>=0 M - matrix size, N>=0 A - matrix, actial matrix is stored in A[I1:I1+M-1,J1:J1+M-1] I1 - submatrix offset J1 - submatrix offset IsUpper - whether matrix is upper triangular IsUnit - whether matrix is unitriangular OpType - transformation type: * 0 - no transformation * 1 - transposition X - matrix, actial matrix is stored in X[I2:I2+M-1,J2:J2+N-1] I2 - submatrix offset J2 - submatrix offset -- ALGLIB routine -- 15.12.2009 Bochkanov Sergey *************************************************************************/ void rmatrixlefttrsm(const ae_int_t m, const ae_int_t n, const real_2d_array &a, const ae_int_t i1, const ae_int_t j1, const bool isupper, const bool isunit, const ae_int_t optype, const real_2d_array &x, const ae_int_t i2, const ae_int_t j2); void smp_rmatrixlefttrsm(const ae_int_t m, const ae_int_t n, const real_2d_array &a, const ae_int_t i1, const ae_int_t j1, const bool isupper, const bool isunit, const ae_int_t optype, const real_2d_array &x, const ae_int_t i2, const ae_int_t j2); /************************************************************************* This subroutine calculates C=alpha*A*A^H+beta*C or C=alpha*A^H*A+beta*C where: * C is NxN Hermitian matrix given by its upper/lower triangle * A is NxK matrix when A*A^H is calculated, KxN matrix otherwise Additional info: * cache-oblivious algorithm is used. * multiplication result replaces C. If Beta=0, C elements are not used in calculations (not multiplied by zero - just not referenced) * if Alpha=0, A is not used (not multiplied by zero - just not referenced) * if both Beta and Alpha are zero, C is filled by zeros. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. Because starting/stopping worker thread always ! involves some overhead, parallelism starts to be profitable for N's ! larger than 128. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. 
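As an illustration (assumed data, linalg.h include), the sketch below solves an upper triangular system in place with rmatrixlefttrsm(); X enters holding the right-hand side and leaves holding inv(A)*X. Commercial-edition users can make the same call through smp_rmatrixlefttrsm().

#include <cstdio>
#include "linalg.h"
using namespace alglib;

int main()
{
    // Upper triangular 3x3 matrix A, right-hand side stored in X (3x1).
    real_2d_array a = "[[2,1,1],[0,3,1],[0,0,4]]";
    real_2d_array x = "[[5],[7],[8]]";

    // X := inv(A)*X:  M=3 (rows of X), N=1 (columns of X),
    // IsUpper=true, IsUnit=false, OpType=0 (no transposition),
    // all submatrix offsets zero.
    rmatrixlefttrsm(3, 1, a, 0, 0, true, false, 0, x, 0, 0);

    printf("%s\n", x.tostring(4).c_str());
    return 0;
}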
INPUT PARAMETERS N - matrix size, N>=0 K - matrix size, K>=0 Alpha - coefficient A - matrix IA - submatrix offset (row index) JA - submatrix offset (column index) OpTypeA - multiplication type: * 0 - A*A^H is calculated * 2 - A^H*A is calculated Beta - coefficient C - preallocated input/output matrix IC - submatrix offset (row index) JC - submatrix offset (column index) IsUpper - whether upper or lower triangle of C is updated; this function updates only one half of C, leaving other half unchanged (not referenced at all). -- ALGLIB routine -- 16.12.2009 Bochkanov Sergey *************************************************************************/ void cmatrixherk(const ae_int_t n, const ae_int_t k, const double alpha, const complex_2d_array &a, const ae_int_t ia, const ae_int_t ja, const ae_int_t optypea, const double beta, const complex_2d_array &c, const ae_int_t ic, const ae_int_t jc, const bool isupper); void smp_cmatrixherk(const ae_int_t n, const ae_int_t k, const double alpha, const complex_2d_array &a, const ae_int_t ia, const ae_int_t ja, const ae_int_t optypea, const double beta, const complex_2d_array &c, const ae_int_t ic, const ae_int_t jc, const bool isupper); /************************************************************************* This subroutine calculates C=alpha*A*A^T+beta*C or C=alpha*A^T*A+beta*C where: * C is NxN symmetric matrix given by its upper/lower triangle * A is NxK matrix when A*A^T is calculated, KxN matrix otherwise Additional info: * cache-oblivious algorithm is used. * multiplication result replaces C. If Beta=0, C elements are not used in calculations (not multiplied by zero - just not referenced) * if Alpha=0, A is not used (not multiplied by zero - just not referenced) * if both Beta and Alpha are zero, C is filled by zeros. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. Because starting/stopping worker thread always ! involves some overhead, parallelism starts to be profitable for N's ! larger than 128. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. 
INPUT PARAMETERS N - matrix size, N>=0 K - matrix size, K>=0 Alpha - coefficient A - matrix IA - submatrix offset (row index) JA - submatrix offset (column index) OpTypeA - multiplication type: * 0 - A*A^T is calculated * 2 - A^T*A is calculated Beta - coefficient C - preallocated input/output matrix IC - submatrix offset (row index) JC - submatrix offset (column index) IsUpper - whether C is upper triangular or lower triangular -- ALGLIB routine -- 16.12.2009 Bochkanov Sergey *************************************************************************/ void rmatrixsyrk(const ae_int_t n, const ae_int_t k, const double alpha, const real_2d_array &a, const ae_int_t ia, const ae_int_t ja, const ae_int_t optypea, const double beta, const real_2d_array &c, const ae_int_t ic, const ae_int_t jc, const bool isupper); void smp_rmatrixsyrk(const ae_int_t n, const ae_int_t k, const double alpha, const real_2d_array &a, const ae_int_t ia, const ae_int_t ja, const ae_int_t optypea, const double beta, const real_2d_array &c, const ae_int_t ic, const ae_int_t jc, const bool isupper); /************************************************************************* This subroutine calculates C = alpha*op1(A)*op2(B) +beta*C where: * C is MxN general matrix * op1(A) is MxK matrix * op2(B) is KxN matrix * "op" may be identity transformation, transposition, conjugate transposition Additional info: * cache-oblivious algorithm is used. * multiplication result replaces C. If Beta=0, C elements are not used in calculations (not multiplied by zero - just not referenced) * if Alpha=0, A is not used (not multiplied by zero - just not referenced) * if both Beta and Alpha are zero, C is filled by zeros. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. Because starting/stopping worker thread always ! involves some overhead, parallelism starts to be profitable for N's ! larger than 128. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. IMPORTANT: This function does NOT preallocate output matrix C, it MUST be preallocated by caller prior to calling this function. In case C does not have enough space to store result, exception will be generated. 
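A brief hedged sketch of rmatrixsyrk() follows (assumed data, linalg.h include): with OpTypeA=0 it computes C := Alpha*A*A^T + Beta*C and writes only the requested triangle of C; rmatrixenforcesymmetricity(), documented earlier in this section, can mirror that triangle if full storage is needed.

#include <cstdio>
#include "linalg.h"
using namespace alglib;

int main()
{
    // N=2, K=3: A is 2x3, C is 2x2; compute C := 1.0*A*A^T + 0.0*C.
    real_2d_array a = "[[1,2,3],[4,5,6]]";
    real_2d_array c = "[[0,0],[0,0]]";

    // Only the upper triangle of C is written (IsUpper=true).
    rmatrixsyrk(2, 3, 1.0, a, 0, 0, 0, 0.0, c, 0, 0, true);

    // Expected upper triangle: C00=14, C01=32, C11=77.
    printf("C00=%.0f C01=%.0f C11=%.0f\n", c(0,0), c(0,1), c(1,1));
    return 0;
}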
INPUT PARAMETERS M - matrix size, M>0 N - matrix size, N>0 K - matrix size, K>0 Alpha - coefficient A - matrix IA - submatrix offset JA - submatrix offset OpTypeA - transformation type: * 0 - no transformation * 1 - transposition * 2 - conjugate transposition B - matrix IB - submatrix offset JB - submatrix offset OpTypeB - transformation type: * 0 - no transformation * 1 - transposition * 2 - conjugate transposition Beta - coefficient C - matrix (PREALLOCATED, large enough to store result) IC - submatrix offset JC - submatrix offset -- ALGLIB routine -- 16.12.2009 Bochkanov Sergey *************************************************************************/ void cmatrixgemm(const ae_int_t m, const ae_int_t n, const ae_int_t k, const alglib::complex alpha, const complex_2d_array &a, const ae_int_t ia, const ae_int_t ja, const ae_int_t optypea, const complex_2d_array &b, const ae_int_t ib, const ae_int_t jb, const ae_int_t optypeb, const alglib::complex beta, const complex_2d_array &c, const ae_int_t ic, const ae_int_t jc); void smp_cmatrixgemm(const ae_int_t m, const ae_int_t n, const ae_int_t k, const alglib::complex alpha, const complex_2d_array &a, const ae_int_t ia, const ae_int_t ja, const ae_int_t optypea, const complex_2d_array &b, const ae_int_t ib, const ae_int_t jb, const ae_int_t optypeb, const alglib::complex beta, const complex_2d_array &c, const ae_int_t ic, const ae_int_t jc); /************************************************************************* This subroutine calculates C = alpha*op1(A)*op2(B) +beta*C where: * C is MxN general matrix * op1(A) is MxK matrix * op2(B) is KxN matrix * "op" may be identity transformation, transposition Additional info: * cache-oblivious algorithm is used. * multiplication result replaces C. If Beta=0, C elements are not used in calculations (not multiplied by zero - just not referenced) * if Alpha=0, A is not used (not multiplied by zero - just not referenced) * if both Beta and Alpha are zero, C is filled by zeros. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. Because starting/stopping worker thread always ! involves some overhead, parallelism starts to be profitable for N's ! larger than 128. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. IMPORTANT: This function does NOT preallocate output matrix C, it MUST be preallocated by caller prior to calling this function. 
In case C does not have enough space to store result, exception will be generated. INPUT PARAMETERS M - matrix size, M>0 N - matrix size, N>0 K - matrix size, K>0 Alpha - coefficient A - matrix IA - submatrix offset JA - submatrix offset OpTypeA - transformation type: * 0 - no transformation * 1 - transposition B - matrix IB - submatrix offset JB - submatrix offset OpTypeB - transformation type: * 0 - no transformation * 1 - transposition Beta - coefficient C - PREALLOCATED output matrix, large enough to store result IC - submatrix offset JC - submatrix offset -- ALGLIB routine -- 2009-2013 Bochkanov Sergey *************************************************************************/ void rmatrixgemm(const ae_int_t m, const ae_int_t n, const ae_int_t k, const double alpha, const real_2d_array &a, const ae_int_t ia, const ae_int_t ja, const ae_int_t optypea, const real_2d_array &b, const ae_int_t ib, const ae_int_t jb, const ae_int_t optypeb, const double beta, const real_2d_array &c, const ae_int_t ic, const ae_int_t jc); void smp_rmatrixgemm(const ae_int_t m, const ae_int_t n, const ae_int_t k, const double alpha, const real_2d_array &a, const ae_int_t ia, const ae_int_t ja, const ae_int_t optypea, const real_2d_array &b, const ae_int_t ib, const ae_int_t jb, const ae_int_t optypeb, const double beta, const real_2d_array &c, const ae_int_t ic, const ae_int_t jc); /************************************************************************* This subroutine is an older version of CMatrixHERK(), one with wrong name (it is HErmitian update, not SYmmetric). It is left here for backward compatibility. -- ALGLIB routine -- 16.12.2009 Bochkanov Sergey *************************************************************************/ void cmatrixsyrk(const ae_int_t n, const ae_int_t k, const double alpha, const complex_2d_array &a, const ae_int_t ia, const ae_int_t ja, const ae_int_t optypea, const double beta, const complex_2d_array &c, const ae_int_t ic, const ae_int_t jc, const bool isupper); void smp_cmatrixsyrk(const ae_int_t n, const ae_int_t k, const double alpha, const complex_2d_array &a, const ae_int_t ia, const ae_int_t ja, const ae_int_t optypea, const double beta, const complex_2d_array &c, const ae_int_t ic, const ae_int_t jc, const bool isupper); /************************************************************************* LU decomposition of a general real matrix with row pivoting A is represented as A = P*L*U, where: * L is lower unitriangular matrix * U is upper triangular matrix * P = P0*P1*...*PK, K=min(M,N)-1, Pi - permutation matrix for I and Pivots[I] This is cache-oblivous implementation of LU decomposition. It is optimized for square matrices. As for rectangular matrices: * best case - M>>N * worst case - N>>M, small M, large N, matrix does not fit in CPU cache COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. 
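The sketch below (assumed data, linalg.h include) shows the basic GEMM call pattern: C must be preallocated by the caller, Alpha and Beta scale the product and the original C, and commercial-edition users may substitute smp_rmatrixgemm() with identical arguments.

#include <cstdio>
#include "linalg.h"
using namespace alglib;

int main()
{
    // C := 1.0*A*B + 0.0*C with A (2x3), B (3x2), C (2x2).
    real_2d_array a = "[[1,2,3],[4,5,6]]";
    real_2d_array b = "[[7,8],[9,10],[11,12]]";

    // rmatrixgemm() does NOT allocate C - it must be preallocated.
    real_2d_array c;
    c.setlength(2, 2);

    rmatrixgemm(2, 2, 3, 1.0, a, 0, 0, 0, b, 0, 0, 0, 0.0, c, 0, 0);

    printf("%s\n", c.tostring(0).c_str());   // expected: [[58,64],[139,154]]
    return 0;
}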
Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that LU decomposition is harder to ! parallelize than, say, matrix-matrix product - this algorithm has ! many internal synchronization points which can not be avoided. However ! parallelism starts to be profitable starting from N=1024, achieving ! near-linear speedup for N=4096 or higher. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS: A - array[0..M-1, 0..N-1]. M - number of rows in matrix A. N - number of columns in matrix A. OUTPUT PARAMETERS: A - matrices L and U in compact form: * L is stored under main diagonal * U is stored on and above main diagonal Pivots - permutation matrix in compact form. array[0..Min(M-1,N-1)]. -- ALGLIB routine -- 10.01.2010 Bochkanov Sergey *************************************************************************/ void rmatrixlu(real_2d_array &a, const ae_int_t m, const ae_int_t n, integer_1d_array &pivots); void smp_rmatrixlu(real_2d_array &a, const ae_int_t m, const ae_int_t n, integer_1d_array &pivots); /************************************************************************* LU decomposition of a general complex matrix with row pivoting A is represented as A = P*L*U, where: * L is lower unitriangular matrix * U is upper triangular matrix * P = P0*P1*...*PK, K=min(M,N)-1, Pi - permutation matrix for I and Pivots[I] This is cache-oblivous implementation of LU decomposition. It is optimized for square matrices. As for rectangular matrices: * best case - M>>N * worst case - N>>M, small M, large N, matrix does not fit in CPU cache COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that LU decomposition is harder to ! parallelize than, say, matrix-matrix product - this algorithm has ! many internal synchronization points which can not be avoided. However ! parallelism starts to be profitable starting from N=1024, achieving ! near-linear speedup for N=4096 or higher. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! 
multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS: A - array[0..M-1, 0..N-1]. M - number of rows in matrix A. N - number of columns in matrix A. OUTPUT PARAMETERS: A - matrices L and U in compact form: * L is stored under main diagonal * U is stored on and above main diagonal Pivots - permutation matrix in compact form. array[0..Min(M-1,N-1)]. -- ALGLIB routine -- 10.01.2010 Bochkanov Sergey *************************************************************************/ void cmatrixlu(complex_2d_array &a, const ae_int_t m, const ae_int_t n, integer_1d_array &pivots); void smp_cmatrixlu(complex_2d_array &a, const ae_int_t m, const ae_int_t n, integer_1d_array &pivots); /************************************************************************* Cache-oblivious Cholesky decomposition The algorithm computes Cholesky decomposition of a Hermitian positive- definite matrix. The result of an algorithm is a representation of A as A=U'*U or A=L*L' (here X' detones conj(X^T)). COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that Cholesky decomposition is harder ! to parallelize than, say, matrix-matrix product - this algorithm has ! several synchronization points which can not be avoided. However, ! parallelism starts to be profitable starting from N=500. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS: A - upper or lower triangle of a factorized matrix. array with elements [0..N-1, 0..N-1]. N - size of matrix A. IsUpper - if IsUpper=True, then A contains an upper triangle of a symmetric matrix, otherwise A contains a lower one. OUTPUT PARAMETERS: A - the result of factorization. If IsUpper=True, then the upper triangle contains matrix U, so that A = U'*U, and the elements below the main diagonal are not modified. Similarly, if IsUpper = False. RESULT: If the matrix is positive-definite, the function returns True. Otherwise, the function returns False. Contents of A is not determined in such case. 
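To make the compact LU output format concrete, here is a minimal hedged sketch (assumed data, linalg.h include): after rmatrixlu() the factor U occupies the diagonal and above, the unit lower triangular factor L is stored strictly below it, and Pivots[i] records the row swapped with row i.

#include <cstdio>
#include "linalg.h"
using namespace alglib;

int main()
{
    // Factor a 3x3 matrix in place: A is overwritten by L and U
    // in compact form, pivot indices go to Pivots.
    real_2d_array a = "[[4,3,2],[2,1,3],[6,5,9]]";
    integer_1d_array pivots;

    rmatrixlu(a, 3, 3, pivots);

    printf("U(0,0)=%.4f  Pivots[0]=%d\n", a(0,0), (int)pivots[0]);
    return 0;
}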
-- ALGLIB routine -- 15.12.2009 Bochkanov Sergey *************************************************************************/ bool hpdmatrixcholesky(complex_2d_array &a, const ae_int_t n, const bool isupper); bool smp_hpdmatrixcholesky(complex_2d_array &a, const ae_int_t n, const bool isupper); /************************************************************************* Cache-oblivious Cholesky decomposition The algorithm computes Cholesky decomposition of a symmetric positive- definite matrix. The result of an algorithm is a representation of A as A=U^T*U or A=L*L^T COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that Cholesky decomposition is harder ! to parallelize than, say, matrix-matrix product - this algorithm has ! several synchronization points which can not be avoided. However, ! parallelism starts to be profitable starting from N=500. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS: A - upper or lower triangle of a factorized matrix. array with elements [0..N-1, 0..N-1]. N - size of matrix A. IsUpper - if IsUpper=True, then A contains an upper triangle of a symmetric matrix, otherwise A contains a lower one. OUTPUT PARAMETERS: A - the result of factorization. If IsUpper=True, then the upper triangle contains matrix U, so that A = U^T*U, and the elements below the main diagonal are not modified. Similarly, if IsUpper = False. RESULT: If the matrix is positive-definite, the function returns True. Otherwise, the function returns False. Contents of A is not determined in such case. -- ALGLIB routine -- 15.12.2009 Bochkanov Sergey *************************************************************************/ bool spdmatrixcholesky(real_2d_array &a, const ae_int_t n, const bool isupper); bool smp_spdmatrixcholesky(real_2d_array &a, const ae_int_t n, const bool isupper); /************************************************************************* Update of Cholesky decomposition: rank-1 update to original A. "Buffered" version which uses preallocated buffer which is saved between subsequent function calls. This function uses internally allocated buffer which is not saved between subsequent calls. 
So, if you perform a lot of subsequent updates, we recommend you to use "buffered" version of this function: SPDMatrixCholeskyUpdateAdd1Buf(). INPUT PARAMETERS: A - upper or lower Cholesky factor. array with elements [0..N-1, 0..N-1]. Exception is thrown if array size is too small. N - size of matrix A, N>0 IsUpper - if IsUpper=True, then A contains upper Cholesky factor; otherwise A contains a lower one. U - array[N], rank-1 update to A: A_mod = A + u*u' Exception is thrown if array size is too small. BufR - possibly preallocated buffer; automatically resized if needed. It is recommended to reuse this buffer if you perform a lot of subsequent decompositions. OUTPUT PARAMETERS: A - updated factorization. If IsUpper=True, then the upper triangle contains matrix U, and the elements below the main diagonal are not modified. Similarly, if IsUpper = False. NOTE: this function always succeeds, so it does not return completion code NOTE: this function checks sizes of input arrays, but it does NOT checks for presence of infinities or NAN's. -- ALGLIB -- 03.02.2014 Sergey Bochkanov *************************************************************************/ void spdmatrixcholeskyupdateadd1(const real_2d_array &a, const ae_int_t n, const bool isupper, const real_1d_array &u); /************************************************************************* Update of Cholesky decomposition: "fixing" some variables. This function uses internally allocated buffer which is not saved between subsequent calls. So, if you perform a lot of subsequent updates, we recommend you to use "buffered" version of this function: SPDMatrixCholeskyUpdateFixBuf(). "FIXING" EXPLAINED: Suppose we have N*N positive definite matrix A. "Fixing" some variable means filling corresponding row/column of A by zeros, and setting diagonal element to 1. For example, if we fix 2nd variable in 4*4 matrix A, it becomes Af: ( A00 A01 A02 A03 ) ( Af00 0 Af02 Af03 ) ( A10 A11 A12 A13 ) ( 0 1 0 0 ) ( A20 A21 A22 A23 ) => ( Af20 0 Af22 Af23 ) ( A30 A31 A32 A33 ) ( Af30 0 Af32 Af33 ) If we have Cholesky decomposition of A, it must be recalculated after variables were fixed. However, it is possible to use efficient algorithm, which needs O(K*N^2) time to "fix" K variables, given Cholesky decomposition of original, "unfixed" A. INPUT PARAMETERS: A - upper or lower Cholesky factor. array with elements [0..N-1, 0..N-1]. Exception is thrown if array size is too small. N - size of matrix A, N>0 IsUpper - if IsUpper=True, then A contains upper Cholesky factor; otherwise A contains a lower one. Fix - array[N], I-th element is True if I-th variable must be fixed. Exception is thrown if array size is too small. BufR - possibly preallocated buffer; automatically resized if needed. It is recommended to reuse this buffer if you perform a lot of subsequent decompositions. OUTPUT PARAMETERS: A - updated factorization. If IsUpper=True, then the upper triangle contains matrix U, and the elements below the main diagonal are not modified. Similarly, if IsUpper = False. NOTE: this function always succeeds, so it does not return completion code NOTE: this function checks sizes of input arrays, but it does NOT checks for presence of infinities or NAN's. NOTE: this function is efficient only for moderate amount of updated variables - say, 0.1*N or 0.3*N. For larger amount of variables it will still work, but you may get better performance with straightforward Cholesky. 
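The hedged sketch below (assumed SPD data, linalg.h include) ties these routines together: it factors a matrix with spdmatrixcholesky() and then refreshes the factor after a rank-1 modification A := A + u*u' with spdmatrixcholeskyupdateadd1(), which costs O(N^2) instead of repeating the O(N^3) factorization.

#include <cstdio>
#include "linalg.h"
using namespace alglib;

int main()
{
    // Symmetric positive definite matrix; the upper triangle is used.
    real_2d_array a = "[[4,2,1],[2,3,1],[1,1,2]]";

    if( !spdmatrixcholesky(a, 3, true) )
    {
        printf("matrix is not positive definite\n");
        return 1;
    }

    // Rank-1 update of the existing factor after A := A + u*u'.
    real_1d_array u = "[0.5, 0.0, 0.25]";
    spdmatrixcholeskyupdateadd1(a, 3, true, u);

    printf("U(0,0)=%.4f\n", a(0,0));
    return 0;
}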
-- ALGLIB -- 03.02.2014 Sergey Bochkanov *************************************************************************/ void spdmatrixcholeskyupdatefix(const real_2d_array &a, const ae_int_t n, const bool isupper, const boolean_1d_array &fix); /************************************************************************* Update of Cholesky decomposition: rank-1 update to original A. "Buffered" version which uses preallocated buffer which is saved between subsequent function calls. See comments for SPDMatrixCholeskyUpdateAdd1() for more information. INPUT PARAMETERS: A - upper or lower Cholesky factor. array with elements [0..N-1, 0..N-1]. Exception is thrown if array size is too small. N - size of matrix A, N>0 IsUpper - if IsUpper=True, then A contains upper Cholesky factor; otherwise A contains a lower one. U - array[N], rank-1 update to A: A_mod = A + u*u' Exception is thrown if array size is too small. BufR - possibly preallocated buffer; automatically resized if needed. It is recommended to reuse this buffer if you perform a lot of subsequent decompositions. OUTPUT PARAMETERS: A - updated factorization. If IsUpper=True, then the upper triangle contains matrix U, and the elements below the main diagonal are not modified. Similarly, if IsUpper = False. -- ALGLIB -- 03.02.2014 Sergey Bochkanov *************************************************************************/ void spdmatrixcholeskyupdateadd1buf(const real_2d_array &a, const ae_int_t n, const bool isupper, const real_1d_array &u, real_1d_array &bufr); /************************************************************************* Update of Cholesky decomposition: "fixing" some variables. "Buffered" version which uses preallocated buffer which is saved between subsequent function calls. See comments for SPDMatrixCholeskyUpdateFix() for more information. INPUT PARAMETERS: A - upper or lower Cholesky factor. array with elements [0..N-1, 0..N-1]. Exception is thrown if array size is too small. N - size of matrix A, N>0 IsUpper - if IsUpper=True, then A contains upper Cholesky factor; otherwise A contains a lower one. Fix - array[N], I-th element is True if I-th variable must be fixed. Exception is thrown if array size is too small. BufR - possibly preallocated buffer; automatically resized if needed. It is recommended to reuse this buffer if you perform a lot of subsequent decompositions. OUTPUT PARAMETERS: A - updated factorization. If IsUpper=True, then the upper triangle contains matrix U, and the elements below the main diagonal are not modified. Similarly, if IsUpper = False. -- ALGLIB -- 03.02.2014 Sergey Bochkanov *************************************************************************/ void spdmatrixcholeskyupdatefixbuf(const real_2d_array &a, const ae_int_t n, const bool isupper, const boolean_1d_array &fix, real_1d_array &bufr); /************************************************************************* Sparse Cholesky decomposition for skyline matrixm using in-place algorithm without allocating additional storage. The algorithm computes Cholesky decomposition of a symmetric positive- definite sparse matrix. The result of an algorithm is a representation of A as A=U^T*U or A=L*L^T This function is a more efficient alternative to general, but slower SparseCholeskyX(), because it does not create temporary copies of the target. It performs factorization in-place, which gives best performance on low-profile matrices. Its drawback, however, is that it can not perform profile-reducing permutation of input matrix. 
/************************************************************************* Estimate of a matrix condition number (1-norm) The algorithm calculates a lower bound of the condition number. In this case, the algorithm does not return a lower bound of the condition number, but an inverse number (to avoid an overflow in case of a singular matrix). Input parameters: A - matrix. Array whose indexes range within [0..N-1, 0..N-1]. N - size of matrix A. Result: 1/LowerBound(cond(A)) NOTE: if k(A) is very large, then matrix is assumed degenerate, k(A)=INF, 0.0 is returned in such cases. *************************************************************************/ double rmatrixrcond1(const real_2d_array &a, const ae_int_t n); /************************************************************************* Estimate of a matrix condition number (infinity-norm). The algorithm calculates a lower bound of the condition number. In this case, the algorithm does not return a lower bound of the condition number, but an inverse number (to avoid an overflow in case of a singular matrix). Input parameters: A - matrix. Array whose indexes range within [0..N-1, 0..N-1]. N - size of matrix A. Result: 1/LowerBound(cond(A)) NOTE: if k(A) is very large, then matrix is assumed degenerate, k(A)=INF, 0.0 is returned in such cases. *************************************************************************/ double rmatrixrcondinf(const real_2d_array &a, const ae_int_t n);
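A short illustrative sketch (arbitrary values, not an official example) showing how the reciprocal condition number estimates above are typically read: values close to 0 indicate a nearly singular matrix, values close to 1 a well-conditioned one:

    #include "linalg.h"
    #include <cstdio>
    using namespace alglib;

    int main()
    {
        real_2d_array a = "[[1,2],[3,4]]";

        // Both functions return 1/LowerBound(cond(A)), i.e. a reciprocal;
        // 0.0 is reported for matrices considered degenerate
        double rc1   = rmatrixrcond1(a, 2);
        double rcinf = rmatrixrcondinf(a, 2);
        std::printf("rcond(1-norm)=%.6f  rcond(inf-norm)=%.6f\n", rc1, rcinf);
        return 0;
    }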
/************************************************************************* Condition number estimate of a symmetric positive definite matrix. The algorithm calculates a lower bound of the condition number. In this case, the algorithm does not return a lower bound of the condition number, but an inverse number (to avoid an overflow in case of a singular matrix). It should be noted that 1-norm and inf-norm of condition numbers of symmetric matrices are equal, so the algorithm doesn't take into account the differences between these types of norms. Input parameters: A - symmetric positive definite matrix which is given by its upper or lower triangle depending on the value of IsUpper. Array with elements [0..N-1, 0..N-1]. N - size of matrix A. IsUpper - storage format. Result: 1/LowerBound(cond(A)), if matrix A is positive definite, -1, if matrix A is not positive definite, and its condition number could not be found by this algorithm. NOTE: if k(A) is very large, then matrix is assumed degenerate, k(A)=INF, 0.0 is returned in such cases. *************************************************************************/ double spdmatrixrcond(const real_2d_array &a, const ae_int_t n, const bool isupper); /************************************************************************* Triangular matrix: estimate of a condition number (1-norm) The algorithm calculates a lower bound of the condition number. In this case, the algorithm does not return a lower bound of the condition number, but an inverse number (to avoid an overflow in case of a singular matrix). Input parameters: A - matrix. Array[0..N-1, 0..N-1]. N - size of A. IsUpper - True, if the matrix is upper triangular. IsUnit - True, if the matrix has a unit diagonal. Result: 1/LowerBound(cond(A)) NOTE: if k(A) is very large, then matrix is assumed degenerate, k(A)=INF, 0.0 is returned in such cases. *************************************************************************/ double rmatrixtrrcond1(const real_2d_array &a, const ae_int_t n, const bool isupper, const bool isunit); /************************************************************************* Triangular matrix: estimate of a matrix condition number (infinity-norm). The algorithm calculates a lower bound of the condition number. In this case, the algorithm does not return a lower bound of the condition number, but an inverse number (to avoid an overflow in case of a singular matrix). Input parameters: A - matrix. Array whose indexes range within [0..N-1, 0..N-1]. N - size of matrix A. IsUpper - True, if the matrix is upper triangular. IsUnit - True, if the matrix has a unit diagonal. Result: 1/LowerBound(cond(A)) NOTE: if k(A) is very large, then matrix is assumed degenerate, k(A)=INF, 0.0 is returned in such cases. *************************************************************************/ double rmatrixtrrcondinf(const real_2d_array &a, const ae_int_t n, const bool isupper, const bool isunit); /************************************************************************* Condition number estimate of a Hermitian positive definite matrix. The algorithm calculates a lower bound of the condition number. In this case, the algorithm does not return a lower bound of the condition number, but an inverse number (to avoid an overflow in case of a singular matrix). It should be noted that 1-norm and inf-norm of condition numbers of Hermitian matrices are equal, so the algorithm doesn't take into account the differences between these types of norms. Input parameters: A - Hermitian positive definite matrix which is given by its upper or lower triangle depending on the value of IsUpper. Array with elements [0..N-1, 0..N-1]. N - size of matrix A. IsUpper - storage format. Result: 1/LowerBound(cond(A)), if matrix A is positive definite, -1, if matrix A is not positive definite, and its condition number could not be found by this algorithm. NOTE: if k(A) is very large, then matrix is assumed degenerate, k(A)=INF, 0.0 is returned in such cases.
*************************************************************************/ double hpdmatrixrcond(const complex_2d_array &a, const ae_int_t n, const bool isupper); /************************************************************************* Estimate of a matrix condition number (1-norm) The algorithm calculates a lower bound of the condition number. In this case, the algorithm does not return a lower bound of the condition number, but an inverse number (to avoid an overflow in case of a singular matrix). Input parameters: A - matrix. Array whose indexes range within [0..N-1, 0..N-1]. N - size of matrix A. Result: 1/LowerBound(cond(A)) NOTE: if k(A) is very large, then matrix is assumed degenerate, k(A)=INF, 0.0 is returned in such cases. *************************************************************************/ double cmatrixrcond1(const complex_2d_array &a, const ae_int_t n); /************************************************************************* Estimate of a matrix condition number (infinity-norm). The algorithm calculates a lower bound of the condition number. In this case, the algorithm does not return a lower bound of the condition number, but an inverse number (to avoid an overflow in case of a singular matrix). Input parameters: A - matrix. Array whose indexes range within [0..N-1, 0..N-1]. N - size of matrix A. Result: 1/LowerBound(cond(A)) NOTE: if k(A) is very large, then matrix is assumed degenerate, k(A)=INF, 0.0 is returned in such cases. *************************************************************************/ double cmatrixrcondinf(const complex_2d_array &a, const ae_int_t n); /************************************************************************* Estimate of the condition number of a matrix given by its LU decomposition (1-norm) The algorithm calculates a lower bound of the condition number. In this case, the algorithm does not return a lower bound of the condition number, but an inverse number (to avoid an overflow in case of a singular matrix). Input parameters: LUA - LU decomposition of a matrix in compact form. Output of the RMatrixLU subroutine. N - size of matrix A. Result: 1/LowerBound(cond(A)) NOTE: if k(A) is very large, then matrix is assumed degenerate, k(A)=INF, 0.0 is returned in such cases. *************************************************************************/ double rmatrixlurcond1(const real_2d_array &lua, const ae_int_t n); /************************************************************************* Estimate of the condition number of a matrix given by its LU decomposition (infinity norm). The algorithm calculates a lower bound of the condition number. In this case, the algorithm does not return a lower bound of the condition number, but an inverse number (to avoid an overflow in case of a singular matrix). Input parameters: LUA - LU decomposition of a matrix in compact form. Output of the RMatrixLU subroutine. N - size of matrix A. Result: 1/LowerBound(cond(A)) NOTE: if k(A) is very large, then matrix is assumed degenerate, k(A)=INF, 0.0 is returned in such cases. *************************************************************************/ double rmatrixlurcondinf(const real_2d_array &lua, const ae_int_t n); /************************************************************************* Condition number estimate of a symmetric positive definite matrix given by Cholesky decomposition. The algorithm calculates a lower bound of the condition number. 
In this case, the algorithm does not return a lower bound of the condition number, but an inverse number (to avoid an overflow in case of a singular matrix). It should be noted that 1-norm and inf-norm condition numbers of symmetric matrices are equal, so the algorithm doesn't take into account the differences between these types of norms. Input parameters: CD - Cholesky decomposition of matrix A, output of SMatrixCholesky subroutine. N - size of matrix A. Result: 1/LowerBound(cond(A)) NOTE: if k(A) is very large, then matrix is assumed degenerate, k(A)=INF, 0.0 is returned in such cases. *************************************************************************/ double spdmatrixcholeskyrcond(const real_2d_array &a, const ae_int_t n, const bool isupper); /************************************************************************* Condition number estimate of a Hermitian positive definite matrix given by Cholesky decomposition. The algorithm calculates a lower bound of the condition number. In this case, the algorithm does not return a lower bound of the condition number, but an inverse number (to avoid an overflow in case of a singular matrix). It should be noted that 1-norm and inf-norm condition numbers of symmetric matrices are equal, so the algorithm doesn't take into account the differences between these types of norms. Input parameters: CD - Cholesky decomposition of matrix A, output of SMatrixCholesky subroutine. N - size of matrix A. Result: 1/LowerBound(cond(A)) NOTE: if k(A) is very large, then matrix is assumed degenerate, k(A)=INF, 0.0 is returned in such cases. *************************************************************************/ double hpdmatrixcholeskyrcond(const complex_2d_array &a, const ae_int_t n, const bool isupper); /************************************************************************* Estimate of the condition number of a matrix given by its LU decomposition (1-norm) The algorithm calculates a lower bound of the condition number. In this case, the algorithm does not return a lower bound of the condition number, but an inverse number (to avoid an overflow in case of a singular matrix). Input parameters: LUA - LU decomposition of a matrix in compact form. Output of the CMatrixLU subroutine. N - size of matrix A. Result: 1/LowerBound(cond(A)) NOTE: if k(A) is very large, then matrix is assumed degenerate, k(A)=INF, 0.0 is returned in such cases. *************************************************************************/ double cmatrixlurcond1(const complex_2d_array &lua, const ae_int_t n); /************************************************************************* Estimate of the condition number of a matrix given by its LU decomposition (infinity norm). The algorithm calculates a lower bound of the condition number. In this case, the algorithm does not return a lower bound of the condition number, but an inverse number (to avoid an overflow in case of a singular matrix). Input parameters: LUA - LU decomposition of a matrix in compact form. Output of the CMatrixLU subroutine. N - size of matrix A. Result: 1/LowerBound(cond(A)) NOTE: if k(A) is very large, then matrix is assumed degenerate, k(A)=INF, 0.0 is returned in such cases. 
*************************************************************************/ double cmatrixlurcondinf(const complex_2d_array &lua, const ae_int_t n); /************************************************************************* Triangular matrix: estimate of a condition number (1-norm) The algorithm calculates a lower bound of the condition number. In this case, the algorithm does not return a lower bound of the condition number, but an inverse number (to avoid an overflow in case of a singular matrix). Input parameters: A - matrix. Array[0..N-1, 0..N-1]. N - size of A. IsUpper - True, if the matrix is upper triangular. IsUnit - True, if the matrix has a unit diagonal. Result: 1/LowerBound(cond(A)) NOTE: if k(A) is very large, then matrix is assumed degenerate, k(A)=INF, 0.0 is returned in such cases. *************************************************************************/ double cmatrixtrrcond1(const complex_2d_array &a, const ae_int_t n, const bool isupper, const bool isunit); /************************************************************************* Triangular matrix: estimate of a matrix condition number (infinity-norm). The algorithm calculates a lower bound of the condition number. In this case, the algorithm does not return a lower bound of the condition number, but an inverse number (to avoid an overflow in case of a singular matrix). Input parameters: A - matrix. Array whose indexes range within [0..N-1, 0..N-1]. N - size of matrix A. IsUpper - True, if the matrix is upper triangular. IsUnit - True, if the matrix has a unit diagonal. Result: 1/LowerBound(cond(A)) NOTE: if k(A) is very large, then matrix is assumed degenerate, k(A)=INF, 0.0 is returned in such cases. *************************************************************************/ double cmatrixtrrcondinf(const complex_2d_array &a, const ae_int_t n, const bool isupper, const bool isunit); /************************************************************************* Inversion of a matrix given by its LU decomposition. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that matrix inversion is harder to ! parallelize than, say, matrix-matrix product - this algorithm has ! many internal synchronization points which can not be avoided. However ! parallelism starts to be profitable starting from N=1024, achieving ! near-linear speedup for N=4096 or higher. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! 
We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS: A - LU decomposition of the matrix (output of RMatrixLU subroutine). Pivots - table of permutations (the output of RMatrixLU subroutine). N - size of matrix A (optional) : * if given, only principal NxN submatrix is processed and overwritten. other elements are unchanged. * if not given, size is automatically determined from matrix size (A must be square matrix) OUTPUT PARAMETERS: Info - return code: * -3 A is singular, or VERY close to singular. it is filled by zeros in such cases. * 1 task is solved (but matrix A may be ill-conditioned, check R1/RInf parameters for condition numbers). Rep - solver report, see below for more info A - inverse of matrix A. Array whose indexes range within [0..N-1, 0..N-1]. SOLVER REPORT Subroutine sets following fields of the Rep structure: * R1 reciprocal of condition number: 1/cond(A), 1-norm. * RInf reciprocal of condition number: 1/cond(A), inf-norm. -- ALGLIB routine -- 05.02.2010 Bochkanov Sergey *************************************************************************/ void rmatrixluinverse(real_2d_array &a, const integer_1d_array &pivots, const ae_int_t n, ae_int_t &info, matinvreport &rep); void smp_rmatrixluinverse(real_2d_array &a, const integer_1d_array &pivots, const ae_int_t n, ae_int_t &info, matinvreport &rep); void rmatrixluinverse(real_2d_array &a, const integer_1d_array &pivots, ae_int_t &info, matinvreport &rep); void smp_rmatrixluinverse(real_2d_array &a, const integer_1d_array &pivots, ae_int_t &info, matinvreport &rep); /************************************************************************* Inversion of a general matrix. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that matrix inversion is harder to ! parallelize than, say, matrix-matrix product - this algorithm has ! many internal synchronization points which can not be avoided. However ! parallelism starts to be profitable starting from N=1024, achieving ! near-linear speedup for N=4096 or higher. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. Input parameters: A - matrix. 
N - size of matrix A (optional) : * if given, only principal NxN submatrix is processed and overwritten. other elements are unchanged. * if not given, size is automatically determined from matrix size (A must be square matrix) Output parameters: Info - return code, same as in RMatrixLUInverse Rep - solver report, same as in RMatrixLUInverse A - inverse of matrix A, same as in RMatrixLUInverse Result: True, if the matrix is not singular. False, if the matrix is singular. -- ALGLIB -- Copyright 2005-2010 by Bochkanov Sergey *************************************************************************/ void rmatrixinverse(real_2d_array &a, const ae_int_t n, ae_int_t &info, matinvreport &rep); void smp_rmatrixinverse(real_2d_array &a, const ae_int_t n, ae_int_t &info, matinvreport &rep); void rmatrixinverse(real_2d_array &a, ae_int_t &info, matinvreport &rep); void smp_rmatrixinverse(real_2d_array &a, ae_int_t &info, matinvreport &rep); /************************************************************************* Inversion of a matrix given by its LU decomposition. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that matrix inversion is harder to ! parallelize than, say, matrix-matrix product - this algorithm has ! many internal synchronization points which can not be avoided. However ! parallelism starts to be profitable starting from N=1024, achieving ! near-linear speedup for N=4096 or higher. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS: A - LU decomposition of the matrix (output of CMatrixLU subroutine). Pivots - table of permutations (the output of CMatrixLU subroutine). N - size of matrix A (optional) : * if given, only principal NxN submatrix is processed and overwritten. other elements are unchanged. 
* if not given, size is automatically determined from matrix size (A must be square matrix) OUTPUT PARAMETERS: Info - return code, same as in RMatrixLUInverse Rep - solver report, same as in RMatrixLUInverse A - inverse of matrix A, same as in RMatrixLUInverse -- ALGLIB routine -- 05.02.2010 Bochkanov Sergey *************************************************************************/ void cmatrixluinverse(complex_2d_array &a, const integer_1d_array &pivots, const ae_int_t n, ae_int_t &info, matinvreport &rep); void smp_cmatrixluinverse(complex_2d_array &a, const integer_1d_array &pivots, const ae_int_t n, ae_int_t &info, matinvreport &rep); void cmatrixluinverse(complex_2d_array &a, const integer_1d_array &pivots, ae_int_t &info, matinvreport &rep); void smp_cmatrixluinverse(complex_2d_array &a, const integer_1d_array &pivots, ae_int_t &info, matinvreport &rep); /************************************************************************* Inversion of a general matrix. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that matrix inversion is harder to ! parallelize than, say, matrix-matrix product - this algorithm has ! many internal synchronization points which can not be avoided. However ! parallelism starts to be profitable starting from N=1024, achieving ! near-linear speedup for N=4096 or higher. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. Input parameters: A - matrix N - size of matrix A (optional) : * if given, only principal NxN submatrix is processed and overwritten. other elements are unchanged. 
* if not given, size is automatically determined from matrix size (A must be square matrix) Output parameters: Info - return code, same as in RMatrixLUInverse Rep - solver report, same as in RMatrixLUInverse A - inverse of matrix A, same as in RMatrixLUInverse -- ALGLIB -- Copyright 2005 by Bochkanov Sergey *************************************************************************/ void cmatrixinverse(complex_2d_array &a, const ae_int_t n, ae_int_t &info, matinvreport &rep); void smp_cmatrixinverse(complex_2d_array &a, const ae_int_t n, ae_int_t &info, matinvreport &rep); void cmatrixinverse(complex_2d_array &a, ae_int_t &info, matinvreport &rep); void smp_cmatrixinverse(complex_2d_array &a, ae_int_t &info, matinvreport &rep); /************************************************************************* Inversion of a symmetric positive definite matrix which is given by Cholesky decomposition. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. However, Cholesky inversion is a "difficult" ! algorithm - it has lots of internal synchronization points which ! prevents efficient parallelization of algorithm. Only very large ! problems (N=thousands) can be efficiently parallelized. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. Input parameters: A - Cholesky decomposition of the matrix to be inverted: A=U'*U or A = L*L'. Output of SPDMatrixCholesky subroutine. N - size of matrix A (optional) : * if given, only principal NxN submatrix is processed and overwritten. other elements are unchanged. * if not given, size is automatically determined from matrix size (A must be square matrix) IsUpper - storage type (optional): * if True, symmetric matrix A is given by its upper triangle, and the lower triangle isn't used/changed by function * if False, symmetric matrix A is given by its lower triangle, and the upper triangle isn't used/changed by function * if not given, lower half is used. 
Output parameters: Info - return code, same as in RMatrixLUInverse Rep - solver report, same as in RMatrixLUInverse A - inverse of matrix A, same as in RMatrixLUInverse -- ALGLIB routine -- 10.02.2010 Bochkanov Sergey *************************************************************************/ void spdmatrixcholeskyinverse(real_2d_array &a, const ae_int_t n, const bool isupper, ae_int_t &info, matinvreport &rep); void smp_spdmatrixcholeskyinverse(real_2d_array &a, const ae_int_t n, const bool isupper, ae_int_t &info, matinvreport &rep); void spdmatrixcholeskyinverse(real_2d_array &a, ae_int_t &info, matinvreport &rep); void smp_spdmatrixcholeskyinverse(real_2d_array &a, ae_int_t &info, matinvreport &rep); /************************************************************************* Inversion of a symmetric positive definite matrix. Given an upper or lower triangle of a symmetric positive definite matrix, the algorithm generates matrix A^-1 and saves the upper or lower triangle depending on the input. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. However, Cholesky inversion is a "difficult" ! algorithm - it has lots of internal synchronization points which ! prevents efficient parallelization of algorithm. Only very large ! problems (N=thousands) can be efficiently parallelized. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. Input parameters: A - matrix to be inverted (upper or lower triangle). Array with elements [0..N-1,0..N-1]. N - size of matrix A (optional) : * if given, only principal NxN submatrix is processed and overwritten. other elements are unchanged. * if not given, size is automatically determined from matrix size (A must be square matrix) IsUpper - storage type (optional): * if True, symmetric matrix A is given by its upper triangle, and the lower triangle isn't used/changed by function * if False, symmetric matrix A is given by its lower triangle, and the upper triangle isn't used/changed by function * if not given, both lower and upper triangles must be filled. 
Output parameters: Info - return code, same as in RMatrixLUInverse Rep - solver report, same as in RMatrixLUInverse A - inverse of matrix A, same as in RMatrixLUInverse -- ALGLIB routine -- 10.02.2010 Bochkanov Sergey *************************************************************************/ void spdmatrixinverse(real_2d_array &a, const ae_int_t n, const bool isupper, ae_int_t &info, matinvreport &rep); void smp_spdmatrixinverse(real_2d_array &a, const ae_int_t n, const bool isupper, ae_int_t &info, matinvreport &rep); void spdmatrixinverse(real_2d_array &a, ae_int_t &info, matinvreport &rep); void smp_spdmatrixinverse(real_2d_array &a, ae_int_t &info, matinvreport &rep); /************************************************************************* Inversion of a Hermitian positive definite matrix which is given by Cholesky decomposition. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. However, Cholesky inversion is a "difficult" ! algorithm - it has lots of internal synchronization points which ! prevents efficient parallelization of algorithm. Only very large ! problems (N=thousands) can be efficiently parallelized. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. Input parameters: A - Cholesky decomposition of the matrix to be inverted: A=U'*U or A = L*L'. Output of HPDMatrixCholesky subroutine. N - size of matrix A (optional) : * if given, only principal NxN submatrix is processed and overwritten. other elements are unchanged. * if not given, size is automatically determined from matrix size (A must be square matrix) IsUpper - storage type (optional): * if True, symmetric matrix A is given by its upper triangle, and the lower triangle isn't used/changed by function * if False, symmetric matrix A is given by its lower triangle, and the upper triangle isn't used/changed by function * if not given, lower half is used. 
Output parameters: Info - return code, same as in RMatrixLUInverse Rep - solver report, same as in RMatrixLUInverse A - inverse of matrix A, same as in RMatrixLUInverse -- ALGLIB routine -- 10.02.2010 Bochkanov Sergey *************************************************************************/ void hpdmatrixcholeskyinverse(complex_2d_array &a, const ae_int_t n, const bool isupper, ae_int_t &info, matinvreport &rep); void smp_hpdmatrixcholeskyinverse(complex_2d_array &a, const ae_int_t n, const bool isupper, ae_int_t &info, matinvreport &rep); void hpdmatrixcholeskyinverse(complex_2d_array &a, ae_int_t &info, matinvreport &rep); void smp_hpdmatrixcholeskyinverse(complex_2d_array &a, ae_int_t &info, matinvreport &rep); /************************************************************************* Inversion of a Hermitian positive definite matrix. Given an upper or lower triangle of a Hermitian positive definite matrix, the algorithm generates matrix A^-1 and saves the upper or lower triangle depending on the input. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. However, Cholesky inversion is a "difficult" ! algorithm - it has lots of internal synchronization points which ! prevents efficient parallelization of algorithm. Only very large ! problems (N=thousands) can be efficiently parallelized. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. Input parameters: A - matrix to be inverted (upper or lower triangle). Array with elements [0..N-1,0..N-1]. N - size of matrix A (optional) : * if given, only principal NxN submatrix is processed and overwritten. other elements are unchanged. * if not given, size is automatically determined from matrix size (A must be square matrix) IsUpper - storage type (optional): * if True, symmetric matrix A is given by its upper triangle, and the lower triangle isn't used/changed by function * if False, symmetric matrix A is given by its lower triangle, and the upper triangle isn't used/changed by function * if not given, both lower and upper triangles must be filled. 
Output parameters: Info - return code, same as in RMatrixLUInverse Rep - solver report, same as in RMatrixLUInverse A - inverse of matrix A, same as in RMatrixLUInverse -- ALGLIB routine -- 10.02.2010 Bochkanov Sergey *************************************************************************/ void hpdmatrixinverse(complex_2d_array &a, const ae_int_t n, const bool isupper, ae_int_t &info, matinvreport &rep); void smp_hpdmatrixinverse(complex_2d_array &a, const ae_int_t n, const bool isupper, ae_int_t &info, matinvreport &rep); void hpdmatrixinverse(complex_2d_array &a, ae_int_t &info, matinvreport &rep); void smp_hpdmatrixinverse(complex_2d_array &a, ae_int_t &info, matinvreport &rep); /************************************************************************* Triangular matrix inverse (real) The subroutine inverts the following types of matrices: * upper triangular * upper triangular with unit diagonal * lower triangular * lower triangular with unit diagonal In case of an upper (lower) triangular matrix, the inverse matrix will also be upper (lower) triangular, and after the end of the algorithm, the inverse matrix replaces the source matrix. The elements below (above) the main diagonal are not changed by the algorithm. If the matrix has a unit diagonal, the inverse matrix also has a unit diagonal, and the diagonal elements are not passed to the algorithm. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that triangular inverse is harder to ! parallelize than, say, matrix-matrix product - this algorithm has ! many internal synchronization points which can not be avoided. However ! parallelism starts to be profitable starting from N=1024, achieving ! near-linear speedup for N=4096 or higher. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. Input parameters: A - matrix, array[0..N-1, 0..N-1]. N - size of matrix A (optional) : * if given, only principal NxN submatrix is processed and overwritten. other elements are unchanged. * if not given, size is automatically determined from matrix size (A must be square matrix) IsUpper - True, if the matrix is upper triangular. 
IsUnit - diagonal type (optional): * if True, matrix has unit diagonal (a[i,i] are NOT used) * if False, matrix diagonal is arbitrary * if not given, False is assumed Output parameters: Info - same as for RMatrixLUInverse Rep - same as for RMatrixLUInverse A - same as for RMatrixLUInverse. -- ALGLIB -- Copyright 05.02.2010 by Bochkanov Sergey *************************************************************************/ void rmatrixtrinverse(real_2d_array &a, const ae_int_t n, const bool isupper, const bool isunit, ae_int_t &info, matinvreport &rep); void smp_rmatrixtrinverse(real_2d_array &a, const ae_int_t n, const bool isupper, const bool isunit, ae_int_t &info, matinvreport &rep); void rmatrixtrinverse(real_2d_array &a, const bool isupper, ae_int_t &info, matinvreport &rep); void smp_rmatrixtrinverse(real_2d_array &a, const bool isupper, ae_int_t &info, matinvreport &rep); /************************************************************************* Triangular matrix inverse (complex) The subroutine inverts the following types of matrices: * upper triangular * upper triangular with unit diagonal * lower triangular * lower triangular with unit diagonal In case of an upper (lower) triangular matrix, the inverse matrix will also be upper (lower) triangular, and after the end of the algorithm, the inverse matrix replaces the source matrix. The elements below (above) the main diagonal are not changed by the algorithm. If the matrix has a unit diagonal, the inverse matrix also has a unit diagonal, and the diagonal elements are not passed to the algorithm. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that triangular inverse is harder to ! parallelize than, say, matrix-matrix product - this algorithm has ! many internal synchronization points which can not be avoided. However ! parallelism starts to be profitable starting from N=1024, achieving ! near-linear speedup for N=4096 or higher. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. Input parameters: A - matrix, array[0..N-1, 0..N-1]. N - size of matrix A (optional) : * if given, only principal NxN submatrix is processed and overwritten. other elements are unchanged. 
* if not given, size is automatically determined from matrix size (A must be square matrix) IsUpper - True, if the matrix is upper triangular. IsUnit - diagonal type (optional): * if True, matrix has unit diagonal (a[i,i] are NOT used) * if False, matrix diagonal is arbitrary * if not given, False is assumed Output parameters: Info - same as for RMatrixLUInverse Rep - same as for RMatrixLUInverse A - same as for RMatrixLUInverse. -- ALGLIB -- Copyright 05.02.2010 by Bochkanov Sergey *************************************************************************/ void cmatrixtrinverse(complex_2d_array &a, const ae_int_t n, const bool isupper, const bool isunit, ae_int_t &info, matinvreport &rep); void smp_cmatrixtrinverse(complex_2d_array &a, const ae_int_t n, const bool isupper, const bool isunit, ae_int_t &info, matinvreport &rep); void cmatrixtrinverse(complex_2d_array &a, const bool isupper, ae_int_t &info, matinvreport &rep); void smp_cmatrixtrinverse(complex_2d_array &a, const bool isupper, ae_int_t &info, matinvreport &rep);
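The inversion routines above all share the Info/Rep calling convention described for RMatrixLUInverse. A minimal sketch (matrix values are arbitrary; this is not an official ALGLIB example) of the general and triangular real-matrix inverters:

    #include "linalg.h"
    #include <cstdio>
    using namespace alglib;

    int main()
    {
        // General inversion; A is overwritten by A^-1 on success
        real_2d_array a = "[[1,2],[3,4]]";
        ae_int_t info;
        matinvreport rep;
        rmatrixinverse(a, info, rep);
        if( info != 1 )
            return 1;                              // -3: singular or nearly singular
        std::printf("1/cond(A), 1-norm: %.6f\n", rep.r1);

        // Triangular inversion: only the selected triangle is read and overwritten
        real_2d_array t = "[[2,1],[0,4]]";
        rmatrixtrinverse(t, true, info, rep);      // upper triangular, non-unit diagonal
        return info==1 ? 0 : 1;
    }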
/************************************************************************* QR decomposition of a rectangular matrix of size MxN COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that QR decomposition is harder to ! parallelize than, say, matrix-matrix product - this algorithm has ! many internal synchronization points which can not be avoided. However ! parallelism starts to be profitable starting from N=512, achieving ! near-linear speedup for N=4096 or higher. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. Input parameters: A - matrix A whose indexes range within [0..M-1, 0..N-1]. M - number of rows in matrix A. N - number of columns in matrix A. Output parameters: A - matrices Q and R in compact form (see below). Tau - array of scalar factors which are used to form matrix Q. Array whose index ranges within [0.. Min(M-1,N-1)]. Matrix A is represented as A = QR, where Q is an orthogonal matrix of size MxM, R - upper triangular (or upper trapezoid) matrix of size M x N. The elements of matrix R are located on and above the main diagonal of matrix A. The elements which are located in Tau array and below the main diagonal of matrix A are used to form matrix Q as follows: Matrix Q is represented as a product of elementary reflections Q = H(0)*H(1)*...*H(k-1), where k = min(m,n), and each H(i) is in the form H(i) = 1 - tau * v * (v^T) where tau is a scalar stored in Tau[I]; v - real vector, so that v(0:i-1) = 0, v(i) = 1, v(i+1:m-1) stored in A(i+1:m-1,i). -- ALGLIB routine -- 17.02.2010 Bochkanov Sergey *************************************************************************/ void rmatrixqr(real_2d_array &a, const ae_int_t m, const ae_int_t n, real_1d_array &tau); void smp_rmatrixqr(real_2d_array &a, const ae_int_t m, const ae_int_t n, real_1d_array &tau);
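An illustrative sketch (arbitrary values, not an official example) of the usual QR workflow: factor in compact form, then recover the explicit Q and R with the unpacking routines RMatrixQRUnpackQ/RMatrixQRUnpackR declared further below:

    #include "linalg.h"
    using namespace alglib;

    int main()
    {
        // 3x2 matrix; rmatrixqr() overwrites it with the compact QR form
        real_2d_array a = "[[1,2],[3,4],[5,6]]";
        real_1d_array tau;
        rmatrixqr(a, 3, 2, tau);

        // Recover the explicit factors: Q is 3x3 here, R is 3x2 upper trapezoid
        real_2d_array q, r;
        rmatrixqrunpackq(a, 3, 2, tau, 3, q);
        rmatrixqrunpackr(a, 3, 2, r);
        return 0;
    }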
/************************************************************************* LQ decomposition of a rectangular matrix of size MxN COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that LQ decomposition is harder to ! parallelize than, say, matrix-matrix product - this algorithm has ! many internal synchronization points which can not be avoided. However ! parallelism starts to be profitable starting from N=512, achieving ! near-linear speedup for N=4096 or higher. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. Input parameters: A - matrix A whose indexes range within [0..M-1, 0..N-1]. M - number of rows in matrix A. N - number of columns in matrix A. Output parameters: A - matrices L and Q in compact form (see below) Tau - array of scalar factors which are used to form matrix Q. Array whose index ranges within [0..Min(M,N)-1]. Matrix A is represented as A = LQ, where Q is an orthogonal matrix of size NxN, L - lower triangular (or lower trapezoid) matrix of size M x N. The elements of matrix L are located on and below the main diagonal of matrix A. The elements which are located in Tau array and above the main diagonal of matrix A are used to form matrix Q as follows: Matrix Q is represented as a product of elementary reflections Q = H(k-1)*H(k-2)*...*H(1)*H(0), where k = min(m,n), and each H(i) is of the form H(i) = 1 - tau * v * (v^T) where tau is a scalar stored in Tau[I]; v - real vector, so that v(0:i-1)=0, v(i) = 1, v(i+1:n-1) stored in A(i,i+1:n-1). -- ALGLIB routine -- 17.02.2010 Bochkanov Sergey *************************************************************************/ void rmatrixlq(real_2d_array &a, const ae_int_t m, const ae_int_t n, real_1d_array &tau); void smp_rmatrixlq(real_2d_array &a, const ae_int_t m, const ae_int_t n, real_1d_array &tau); /************************************************************************* QR decomposition of a rectangular complex matrix of size MxN COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that QR decomposition is harder to ! parallelize than, say, matrix-matrix product - this algorithm has ! many internal synchronization points which can not be avoided. However ! parallelism starts to be profitable starting from N=512, achieving ! near-linear speedup for N=4096 or higher. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. Input parameters: A - matrix A whose indexes range within [0..M-1, 0..N-1] M - number of rows in matrix A. N - number of columns in matrix A. Output parameters: A - matrices Q and R in compact form Tau - array of scalar factors which are used to form matrix Q. Array whose indexes range within [0.. Min(M,N)-1] Matrix A is represented as A = QR, where Q is an orthogonal matrix of size MxM, R - upper triangular (or upper trapezoid) matrix of size MxN. -- LAPACK routine (version 3.0) -- Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., Courant Institute, Argonne National Lab, and Rice University September 30, 1994 *************************************************************************/ void cmatrixqr(complex_2d_array &a, const ae_int_t m, const ae_int_t n, complex_1d_array &tau); void smp_cmatrixqr(complex_2d_array &a, const ae_int_t m, const ae_int_t n, complex_1d_array &tau); /************************************************************************* LQ decomposition of a rectangular complex matrix of size MxN COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, !
problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that LQ decomposition is harder to ! parallelize than, say, matrix-matrix product - this algorithm has ! many internal synchronization points which can not be avoided. However ! parallelism starts to be profitable starting from N=512, achieving ! near-linear speedup for N=4096 or higher. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. Input parameters: A - matrix A whose indexes range within [0..M-1, 0..N-1] M - number of rows in matrix A. N - number of columns in matrix A. Output parameters: A - matrices Q and L in compact form Tau - array of scalar factors which are used to form matrix Q. Array whose indexes range within [0.. Min(M,N)-1] Matrix A is represented as A = LQ, where Q is an orthogonal matrix of size NxN, L - lower triangular (or lower trapezoid) matrix of size MxN. -- LAPACK routine (version 3.0) -- Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., Courant Institute, Argonne National Lab, and Rice University September 30, 1994 *************************************************************************/ void cmatrixlq(complex_2d_array &a, const ae_int_t m, const ae_int_t n, complex_1d_array &tau); void smp_cmatrixlq(complex_2d_array &a, const ae_int_t m, const ae_int_t n, complex_1d_array &tau); /************************************************************************* Partial unpacking of matrix Q from the QR decomposition of a matrix A COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that QR decomposition is harder to ! parallelize than, say, matrix-matrix product - this algorithm has ! many internal synchronization points which can not be avoided. However !
parallelism starts to be profitable starting from N=512, achieving ! near-linear speedup for N=4096 or higher. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. Input parameters: A - matrices Q and R in compact form. Output of RMatrixQR subroutine. M - number of rows in given matrix A. M>=0. N - number of columns in given matrix A. N>=0. Tau - scalar factors which are used to form Q. Output of the RMatrixQR subroutine. QColumns - required number of columns of matrix Q. M>=QColumns>=0. Output parameters: Q - first QColumns columns of matrix Q. Array whose indexes range within [0..M-1, 0..QColumns-1]. If QColumns=0, the array remains unchanged. -- ALGLIB routine -- 17.02.2010 Bochkanov Sergey *************************************************************************/ void rmatrixqrunpackq(const real_2d_array &a, const ae_int_t m, const ae_int_t n, const real_1d_array &tau, const ae_int_t qcolumns, real_2d_array &q); void smp_rmatrixqrunpackq(const real_2d_array &a, const ae_int_t m, const ae_int_t n, const real_1d_array &tau, const ae_int_t qcolumns, real_2d_array &q); /************************************************************************* Unpacking of matrix R from the QR decomposition of a matrix A Input parameters: A - matrices Q and R in compact form. Output of RMatrixQR subroutine. M - number of rows in given matrix A. M>=0. N - number of columns in given matrix A. N>=0. Output parameters: R - matrix R, array[0..M-1, 0..N-1]. -- ALGLIB routine -- 17.02.2010 Bochkanov Sergey *************************************************************************/ void rmatrixqrunpackr(const real_2d_array &a, const ae_int_t m, const ae_int_t n, real_2d_array &r); /************************************************************************* Partial unpacking of matrix Q from the LQ decomposition of a matrix A COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that QP decomposition is harder to ! parallelize than, say, matrix-matrix product - this algorithm has ! many internal synchronization points which can not be avoided. However ! parallelism starts to be profitable starting from N=512, achieving ! near-linear speedup for N=4096 or higher. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! 
* call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. Input parameters: A - matrices L and Q in compact form. Output of RMatrixLQ subroutine. M - number of rows in given matrix A. M>=0. N - number of columns in given matrix A. N>=0. Tau - scalar factors which are used to form Q. Output of the RMatrixLQ subroutine. QRows - required number of rows in matrix Q. N>=QRows>=0. Output parameters: Q - first QRows rows of matrix Q. Array whose indexes range within [0..QRows-1, 0..N-1]. If QRows=0, the array remains unchanged. -- ALGLIB routine -- 17.02.2010 Bochkanov Sergey *************************************************************************/ void rmatrixlqunpackq(const real_2d_array &a, const ae_int_t m, const ae_int_t n, const real_1d_array &tau, const ae_int_t qrows, real_2d_array &q); void smp_rmatrixlqunpackq(const real_2d_array &a, const ae_int_t m, const ae_int_t n, const real_1d_array &tau, const ae_int_t qrows, real_2d_array &q); /************************************************************************* Unpacking of matrix L from the LQ decomposition of a matrix A Input parameters: A - matrices Q and L in compact form. Output of RMatrixLQ subroutine. M - number of rows in given matrix A. M>=0. N - number of columns in given matrix A. N>=0. Output parameters: L - matrix L, array[0..M-1, 0..N-1]. -- ALGLIB routine -- 17.02.2010 Bochkanov Sergey *************************************************************************/ void rmatrixlqunpackl(const real_2d_array &a, const ae_int_t m, const ae_int_t n, real_2d_array &l); /************************************************************************* Partial unpacking of matrix Q from QR decomposition of a complex matrix A. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that QP decomposition is harder to ! parallelize than, say, matrix-matrix product - this algorithm has ! many internal synchronization points which can not be avoided. However ! parallelism starts to be profitable starting from N=512, achieving ! near-linear speedup for N=4096 or higher. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! 
ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. Input parameters: A - matrices Q and R in compact form. Output of CMatrixQR subroutine . M - number of rows in matrix A. M>=0. N - number of columns in matrix A. N>=0. Tau - scalar factors which are used to form Q. Output of CMatrixQR subroutine . QColumns - required number of columns in matrix Q. M>=QColumns>=0. Output parameters: Q - first QColumns columns of matrix Q. Array whose index ranges within [0..M-1, 0..QColumns-1]. If QColumns=0, array isn't changed. -- ALGLIB routine -- 17.02.2010 Bochkanov Sergey *************************************************************************/ void cmatrixqrunpackq(const complex_2d_array &a, const ae_int_t m, const ae_int_t n, const complex_1d_array &tau, const ae_int_t qcolumns, complex_2d_array &q); void smp_cmatrixqrunpackq(const complex_2d_array &a, const ae_int_t m, const ae_int_t n, const complex_1d_array &tau, const ae_int_t qcolumns, complex_2d_array &q); /************************************************************************* Unpacking of matrix R from the QR decomposition of a matrix A Input parameters: A - matrices Q and R in compact form. Output of CMatrixQR subroutine. M - number of rows in given matrix A. M>=0. N - number of columns in given matrix A. N>=0. Output parameters: R - matrix R, array[0..M-1, 0..N-1]. -- ALGLIB routine -- 17.02.2010 Bochkanov Sergey *************************************************************************/ void cmatrixqrunpackr(const complex_2d_array &a, const ae_int_t m, const ae_int_t n, complex_2d_array &r); /************************************************************************* Partial unpacking of matrix Q from LQ decomposition of a complex matrix A. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that QP decomposition is harder to ! parallelize than, say, matrix-matrix product - this algorithm has ! many internal synchronization points which can not be avoided. However ! parallelism starts to be profitable starting from N=512, achieving ! near-linear speedup for N=4096 or higher. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. Input parameters: A - matrices Q and R in compact form. 
Output of CMatrixLQ subroutine . M - number of rows in matrix A. M>=0. N - number of columns in matrix A. N>=0. Tau - scalar factors which are used to form Q. Output of CMatrixLQ subroutine . QRows - required number of rows in matrix Q. N>=QColumns>=0. Output parameters: Q - first QRows rows of matrix Q. Array whose index ranges within [0..QRows-1, 0..N-1]. If QRows=0, array isn't changed. -- ALGLIB routine -- 17.02.2010 Bochkanov Sergey *************************************************************************/ void cmatrixlqunpackq(const complex_2d_array &a, const ae_int_t m, const ae_int_t n, const complex_1d_array &tau, const ae_int_t qrows, complex_2d_array &q); void smp_cmatrixlqunpackq(const complex_2d_array &a, const ae_int_t m, const ae_int_t n, const complex_1d_array &tau, const ae_int_t qrows, complex_2d_array &q); /************************************************************************* Unpacking of matrix L from the LQ decomposition of a matrix A Input parameters: A - matrices Q and L in compact form. Output of CMatrixLQ subroutine. M - number of rows in given matrix A. M>=0. N - number of columns in given matrix A. N>=0. Output parameters: L - matrix L, array[0..M-1, 0..N-1]. -- ALGLIB routine -- 17.02.2010 Bochkanov Sergey *************************************************************************/ void cmatrixlqunpackl(const complex_2d_array &a, const ae_int_t m, const ae_int_t n, complex_2d_array &l); /************************************************************************* Reduction of a rectangular matrix to bidiagonal form The algorithm reduces the rectangular matrix A to bidiagonal form by orthogonal transformations P and Q: A = Q*B*(P^T). COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes one important improvement of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Multithreaded acceleration is NOT supported for this function because ! bidiagonal decompostion is inherently sequential in nature. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. Input parameters: A - source matrix. array[0..M-1, 0..N-1] M - number of rows in matrix A. N - number of columns in matrix A. Output parameters: A - matrices Q, B, P in compact form (see below). TauQ - scalar factors which are used to form matrix Q. TauP - scalar factors which are used to form matrix P. The main diagonal and one of the secondary diagonals of matrix A are replaced with bidiagonal matrix B. Other elements contain elementary reflections which form MxM matrix Q and NxN matrix P, respectively. If M>=N, B is the upper bidiagonal MxN matrix and is stored in the corresponding elements of matrix A. Matrix Q is represented as a product of elementary reflections Q = H(0)*H(1)*...*H(n-1), where H(i) = 1-tau*v*v'. Here tau is a scalar which is stored in TauQ[i], and vector v has the following structure: v(0:i-1)=0, v(i)=1, v(i+1:m-1) is stored in elements A(i+1:m-1,i). Matrix P is as follows: P = G(0)*G(1)*...*G(n-2), where G(i) = 1 - tau*u*u'. Tau is stored in TauP[i], u(0:i)=0, u(i+1)=1, u(i+2:n-1) is stored in elements A(i,i+2:n-1). 
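NOTE: the following is a minimal C++ usage sketch, not an authoritative recipe; the 4x3 matrix is a hypothetical example and the unpacking routines used here are documented further below:

    #include "linalg.h"
    using namespace alglib;

    void example_rmatrixbd()
    {
        // Hypothetical 4x3 matrix reduced to bidiagonal form A = Q*B*P^T.
        real_2d_array a = "[[1,2,3],[4,5,6],[7,8,10],[2,0,1]]";
        real_1d_array tauq, taup, d, e;
        rmatrixbd(a, 4, 3, tauq, taup);                      // A now stores Q, B, P in compact form
        real_2d_array q, pt;
        bool isupper;
        rmatrixbdunpackq(a, 4, 3, tauq, 4, q);               // explicit 4x4 matrix Q
        rmatrixbdunpackpt(a, 4, 3, taup, 3, pt);             // explicit 3x3 matrix P^T
        rmatrixbdunpackdiagonals(a, 4, 3, isupper, d, e);    // diagonals of B (upper bidiagonal since M>=N)
    }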
If M<N, B is the lower bidiagonal MxN matrix and is stored in the corresponding elements of matrix A. Q = H(0)*H(1)*...*H(m-2), where H(i) = 1 - tau*v*v', tau is stored in TauQ, v(0:i)=0, v(i+1)=1, v(i+2:m-1) is stored in elements A(i+2:m-1,i). P = G(0)*G(1)*...*G(m-1), G(i) = 1-tau*u*u', tau is stored in TauP, u(0:i-1)=0, u(i)=1, u(i+1:n-1) is stored in A(i,i+1:n-1). EXAMPLE:

m=6, n=5 (m > n):               m=5, n=6 (m < n):

(  d   e   u1  u1  u1 )         (  d   u1  u1  u1  u1  u1 )
(  v1  d   e   u2  u2 )         (  e   d   u2  u2  u2  u2 )
(  v1  v2  d   e   u3 )         (  v1  e   d   u3  u3  u3 )
(  v1  v2  v3  d   e  )         (  v1  v2  e   d   u4  u4 )
(  v1  v2  v3  v4  d  )         (  v1  v2  v3  e   d   u5 )
(  v1  v2  v3  v4  v5 )

Here vi and ui are vectors which form H(i) and G(i), and d and e are the diagonal and off-diagonal elements of matrix B. -- LAPACK routine (version 3.0) -- Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., Courant Institute, Argonne National Lab, and Rice University September 30, 1994. Sergey Bochkanov, ALGLIB project, translation from FORTRAN to pseudocode, 2007-2010. *************************************************************************/ void rmatrixbd(real_2d_array &a, const ae_int_t m, const ae_int_t n, real_1d_array &tauq, real_1d_array &taup); /************************************************************************* Unpacking matrix Q which reduces a matrix to bidiagonal form. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes one important improvement of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Multithreaded acceleration is NOT supported for this function. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. Input parameters: QP - matrices Q and P in compact form. Output of RMatrixBD subroutine. M - number of rows in matrix A. N - number of columns in matrix A. TAUQ - scalar factors which are used to form Q. Output of RMatrixBD subroutine. QColumns - required number of columns in matrix Q. M>=QColumns>=0. Output parameters: Q - first QColumns columns of matrix Q. Array[0..M-1, 0..QColumns-1] If QColumns=0, the array is not modified. -- ALGLIB -- 2005-2010 Bochkanov Sergey *************************************************************************/ void rmatrixbdunpackq(const real_2d_array &qp, const ae_int_t m, const ae_int_t n, const real_1d_array &tauq, const ae_int_t qcolumns, real_2d_array &q); /************************************************************************* Multiplication by matrix Q which reduces matrix A to bidiagonal form. The algorithm allows pre- or post-multiply by Q or Q'. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes one important improvement of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Multithreaded acceleration is NOT supported for this function. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. Input parameters: QP - matrices Q and P in compact form. Output of RMatrixBD subroutine. M - number of rows in matrix A. N - number of columns in matrix A. TAUQ - scalar factors which are used to form Q. Output of RMatrixBD subroutine. Z - multiplied matrix. 
array[0..ZRows-1,0..ZColumns-1] ZRows - number of rows in matrix Z. If FromTheRight=False, ZRows=M, otherwise ZRows can be arbitrary. ZColumns - number of columns in matrix Z. If FromTheRight=True, ZColumns=M, otherwise ZColumns can be arbitrary. FromTheRight - pre- or post-multiply. DoTranspose - multiply by Q or Q'. Output parameters: Z - product of Z and Q. Array[0..ZRows-1,0..ZColumns-1] If ZRows=0 or ZColumns=0, the array is not modified. -- ALGLIB -- 2005-2010 Bochkanov Sergey *************************************************************************/ void rmatrixbdmultiplybyq(const real_2d_array &qp, const ae_int_t m, const ae_int_t n, const real_1d_array &tauq, real_2d_array &z, const ae_int_t zrows, const ae_int_t zcolumns, const bool fromtheright, const bool dotranspose); /************************************************************************* Unpacking matrix P which reduces matrix A to bidiagonal form. The subroutine returns transposed matrix P. Input parameters: QP - matrices Q and P in compact form. Output of ToBidiagonal subroutine. M - number of rows in matrix A. N - number of columns in matrix A. TAUP - scalar factors which are used to form P. Output of ToBidiagonal subroutine. PTRows - required number of rows of matrix P^T. N >= PTRows >= 0. Output parameters: PT - first PTRows columns of matrix P^T Array[0..PTRows-1, 0..N-1] If PTRows=0, the array is not modified. -- ALGLIB -- 2005-2010 Bochkanov Sergey *************************************************************************/ void rmatrixbdunpackpt(const real_2d_array &qp, const ae_int_t m, const ae_int_t n, const real_1d_array &taup, const ae_int_t ptrows, real_2d_array &pt); /************************************************************************* Multiplication by matrix P which reduces matrix A to bidiagonal form. The algorithm allows pre- or post-multiply by P or P'. Input parameters: QP - matrices Q and P in compact form. Output of RMatrixBD subroutine. M - number of rows in matrix A. N - number of columns in matrix A. TAUP - scalar factors which are used to form P. Output of RMatrixBD subroutine. Z - multiplied matrix. Array whose indexes range within [0..ZRows-1,0..ZColumns-1]. ZRows - number of rows in matrix Z. If FromTheRight=False, ZRows=N, otherwise ZRows can be arbitrary. ZColumns - number of columns in matrix Z. If FromTheRight=True, ZColumns=N, otherwise ZColumns can be arbitrary. FromTheRight - pre- or post-multiply. DoTranspose - multiply by P or P'. Output parameters: Z - product of Z and P. Array whose indexes range within [0..ZRows-1,0..ZColumns-1]. If ZRows=0 or ZColumns=0, the array is not modified. -- ALGLIB -- 2005-2010 Bochkanov Sergey *************************************************************************/ void rmatrixbdmultiplybyp(const real_2d_array &qp, const ae_int_t m, const ae_int_t n, const real_1d_array &taup, real_2d_array &z, const ae_int_t zrows, const ae_int_t zcolumns, const bool fromtheright, const bool dotranspose); /************************************************************************* Unpacking of the main and secondary diagonals of bidiagonal decomposition of matrix A. Input parameters: B - output of RMatrixBD subroutine. M - number of rows in matrix B. N - number of columns in matrix B. Output parameters: IsUpper - True, if the matrix is upper bidiagonal. otherwise IsUpper is False. D - the main diagonal. Array whose index ranges within [0..Min(M,N)-1]. E - the secondary diagonal (upper or lower, depending on the value of IsUpper). 
Array index ranges within [0..Min(M,N)-1], the last element is not used. -- ALGLIB -- 2005-2010 Bochkanov Sergey *************************************************************************/ void rmatrixbdunpackdiagonals(const real_2d_array &b, const ae_int_t m, const ae_int_t n, bool &isupper, real_1d_array &d, real_1d_array &e); /************************************************************************* Reduction of a square matrix to upper Hessenberg form: Q'*A*Q = H, where Q is an orthogonal matrix, H - Hessenberg matrix. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes one important improvement of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Generally, commercial ALGLIB is several times faster than open-source ! generic C edition, and many times faster than open-source C# edition. ! ! Multithreaded acceleration is NOT supported for this function. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. Input parameters: A - matrix A with elements [0..N-1, 0..N-1] N - size of matrix A. Output parameters: A - matrices Q and P in compact form (see below). Tau - array of scalar factors which are used to form matrix Q. Array whose index ranges within [0..N-2] Matrix H is located on the main diagonal, on the lower secondary diagonal and above the main diagonal of matrix A. The elements which are used to form matrix Q are situated in array Tau and below the lower secondary diagonal of matrix A as follows: Matrix Q is represented as a product of elementary reflections Q = H(0)*H(2)*...*H(n-2), where each H(i) is given by H(i) = 1 - tau * v * (v^T) where tau is a scalar stored in Tau[I]; v - is a real vector, so that v(0:i) = 0, v(i+1) = 1, v(i+2:n-1) stored in A(i+2:n-1,i). -- LAPACK routine (version 3.0) -- Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., Courant Institute, Argonne National Lab, and Rice University October 31, 1992 *************************************************************************/ void rmatrixhessenberg(real_2d_array &a, const ae_int_t n, real_1d_array &tau); /************************************************************************* Unpacking matrix Q which reduces matrix A to upper Hessenberg form COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes one important improvement of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Generally, commercial ALGLIB is several times faster than open-source ! generic C edition, and many times faster than open-source C# edition. ! ! Multithreaded acceleration is NOT supported for this function. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. Input parameters: A - output of RMatrixHessenberg subroutine. 
N - size of matrix A. Tau - scalar factors which are used to form Q. Output of RMatrixHessenberg subroutine. Output parameters: Q - matrix Q. Array whose indexes range within [0..N-1, 0..N-1]. -- ALGLIB -- 2005-2010 Bochkanov Sergey *************************************************************************/ void rmatrixhessenbergunpackq(const real_2d_array &a, const ae_int_t n, const real_1d_array &tau, real_2d_array &q); /************************************************************************* Unpacking matrix H (the result of matrix A reduction to upper Hessenberg form) Input parameters: A - output of RMatrixHessenberg subroutine. N - size of matrix A. Output parameters: H - matrix H. Array whose indexes range within [0..N-1, 0..N-1]. -- ALGLIB -- 2005-2010 Bochkanov Sergey *************************************************************************/ void rmatrixhessenbergunpackh(const real_2d_array &a, const ae_int_t n, real_2d_array &h); /************************************************************************* Reduction of a symmetric matrix which is given by its higher or lower triangular part to a tridiagonal matrix using orthogonal similarity transformation: Q'*A*Q=T. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes one important improvement of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Generally, commercial ALGLIB is several times faster than open-source ! generic C edition, and many times faster than open-source C# edition. ! ! Multithreaded acceleration is NOT supported for this function. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. Input parameters: A - matrix to be transformed array with elements [0..N-1, 0..N-1]. N - size of matrix A. IsUpper - storage format. If IsUpper = True, then matrix A is given by its upper triangle, and the lower triangle is not used and not modified by the algorithm, and vice versa if IsUpper = False. Output parameters: A - matrices T and Q in compact form (see lower) Tau - array of factors which are forming matrices H(i) array with elements [0..N-2]. D - main diagonal of symmetric matrix T. array with elements [0..N-1]. E - secondary diagonal of symmetric matrix T. array with elements [0..N-2]. If IsUpper=True, the matrix Q is represented as a product of elementary reflectors Q = H(n-2) . . . H(2) H(0). Each H(i) has the form H(i) = I - tau * v * v' where tau is a real scalar, and v is a real vector with v(i+1:n-1) = 0, v(i) = 1, v(0:i-1) is stored on exit in A(0:i-1,i+1), and tau in TAU(i). If IsUpper=False, the matrix Q is represented as a product of elementary reflectors Q = H(0) H(2) . . . H(n-2). Each H(i) has the form H(i) = I - tau * v * v' where tau is a real scalar, and v is a real vector with v(0:i) = 0, v(i+1) = 1, v(i+2:n-1) is stored on exit in A(i+2:n-1,i), and tau in TAU(i). 
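NOTE: a minimal C++ sketch of the typical call sequence follows (the 3x3 matrix is a hypothetical example); SMatrixTDUnpackQ, documented below, recovers the explicit matrix Q:

    #include "linalg.h"
    using namespace alglib;

    void example_smatrixtd()
    {
        // Hypothetical 3x3 symmetric matrix; only the upper triangle is referenced.
        real_2d_array a = "[[4,1,0],[1,3,1],[0,1,2]]";
        real_1d_array tau, d, e;
        smatrixtd(a, 3, true, tau, d, e);        // A now stores T and Q in compact form; d, e hold T
        real_2d_array q;
        smatrixtdunpackq(a, 3, true, tau, q);    // explicit NxN matrix Q with Q'*A*Q = T
    }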
The contents of A on exit are illustrated by the following examples with n = 5: if UPLO = 'U': if UPLO = 'L': ( d e v1 v2 v3 ) ( d ) ( d e v2 v3 ) ( e d ) ( d e v3 ) ( v0 e d ) ( d e ) ( v0 v1 e d ) ( d ) ( v0 v1 v2 e d ) where d and e denote diagonal and off-diagonal elements of T, and vi denotes an element of the vector defining H(i). -- LAPACK routine (version 3.0) -- Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., Courant Institute, Argonne National Lab, and Rice University October 31, 1992 *************************************************************************/ void smatrixtd(real_2d_array &a, const ae_int_t n, const bool isupper, real_1d_array &tau, real_1d_array &d, real_1d_array &e); /************************************************************************* Unpacking matrix Q which reduces symmetric matrix to a tridiagonal form. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes one important improvement of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Generally, commercial ALGLIB is several times faster than open-source ! generic C edition, and many times faster than open-source C# edition. ! ! Multithreaded acceleration is NOT supported for this function. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. Input parameters: A - the result of a SMatrixTD subroutine N - size of matrix A. IsUpper - storage format (a parameter of SMatrixTD subroutine) Tau - the result of a SMatrixTD subroutine Output parameters: Q - transformation matrix. array with elements [0..N-1, 0..N-1]. -- ALGLIB -- Copyright 2005-2010 by Bochkanov Sergey *************************************************************************/ void smatrixtdunpackq(const real_2d_array &a, const ae_int_t n, const bool isupper, const real_1d_array &tau, real_2d_array &q); /************************************************************************* Reduction of a Hermitian matrix which is given by its higher or lower triangular part to a real tridiagonal matrix using unitary similarity transformation: Q'*A*Q = T. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes one important improvement of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Generally, commercial ALGLIB is several times faster than open-source ! generic C edition, and many times faster than open-source C# edition. ! ! Multithreaded acceleration is NOT supported for this function. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. Input parameters: A - matrix to be transformed array with elements [0..N-1, 0..N-1]. N - size of matrix A. IsUpper - storage format. 
If IsUpper = True, then matrix A is given by its upper triangle, and the lower triangle is not used and not modified by the algorithm, and vice versa if IsUpper = False. Output parameters: A - matrices T and Q in compact form (see lower) Tau - array of factors which are forming matrices H(i) array with elements [0..N-2]. D - main diagonal of real symmetric matrix T. array with elements [0..N-1]. E - secondary diagonal of real symmetric matrix T. array with elements [0..N-2]. If IsUpper=True, the matrix Q is represented as a product of elementary reflectors Q = H(n-2) . . . H(2) H(0). Each H(i) has the form H(i) = I - tau * v * v' where tau is a complex scalar, and v is a complex vector with v(i+1:n-1) = 0, v(i) = 1, v(0:i-1) is stored on exit in A(0:i-1,i+1), and tau in TAU(i). If IsUpper=False, the matrix Q is represented as a product of elementary reflectors Q = H(0) H(2) . . . H(n-2). Each H(i) has the form H(i) = I - tau * v * v' where tau is a complex scalar, and v is a complex vector with v(0:i) = 0, v(i+1) = 1, v(i+2:n-1) is stored on exit in A(i+2:n-1,i), and tau in TAU(i). The contents of A on exit are illustrated by the following examples with n = 5: if UPLO = 'U': if UPLO = 'L': ( d e v1 v2 v3 ) ( d ) ( d e v2 v3 ) ( e d ) ( d e v3 ) ( v0 e d ) ( d e ) ( v0 v1 e d ) ( d ) ( v0 v1 v2 e d ) where d and e denote diagonal and off-diagonal elements of T, and vi denotes an element of the vector defining H(i). -- LAPACK routine (version 3.0) -- Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., Courant Institute, Argonne National Lab, and Rice University October 31, 1992 *************************************************************************/ void hmatrixtd(complex_2d_array &a, const ae_int_t n, const bool isupper, complex_1d_array &tau, real_1d_array &d, real_1d_array &e); /************************************************************************* Unpacking matrix Q which reduces a Hermitian matrix to a real tridiagonal form. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes one important improvement of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Generally, commercial ALGLIB is several times faster than open-source ! generic C edition, and many times faster than open-source C# edition. ! ! Multithreaded acceleration is NOT supported for this function. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. Input parameters: A - the result of a HMatrixTD subroutine N - size of matrix A. IsUpper - storage format (a parameter of HMatrixTD subroutine) Tau - the result of a HMatrixTD subroutine Output parameters: Q - transformation matrix. array with elements [0..N-1, 0..N-1]. -- ALGLIB -- Copyright 2005-2010 by Bochkanov Sergey *************************************************************************/ void hmatrixtdunpackq(const complex_2d_array &a, const ae_int_t n, const bool isupper, const complex_1d_array &tau, complex_2d_array &q); /************************************************************************* Singular value decomposition of a bidiagonal matrix (extended algorithm) COMMERCIAL EDITION OF ALGLIB: ! 
Commercial version of ALGLIB includes one important improvement of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Generally, commercial ALGLIB is several times faster than open-source ! generic C edition, and many times faster than open-source C# edition. ! ! Multithreaded acceleration is NOT supported for this function. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. The algorithm performs the singular value decomposition of a bidiagonal matrix B (upper or lower) representing it as B = Q*S*P^T, where Q and P - orthogonal matrices, S - diagonal matrix with non-negative elements on the main diagonal, in descending order. The algorithm finds singular values. In addition, the algorithm can calculate matrices Q and P (more precisely, not the matrices, but their product with given matrices U and VT - U*Q and (P^T)*VT)). Of course, matrices U and VT can be of any type, including identity. Furthermore, the algorithm can calculate Q'*C (this product is calculated more effectively than U*Q, because this calculation operates with rows instead of matrix columns). The feature of the algorithm is its ability to find all singular values including those which are arbitrarily close to 0 with relative accuracy close to machine precision. If the parameter IsFractionalAccuracyRequired is set to True, all singular values will have high relative accuracy close to machine precision. If the parameter is set to False, only the biggest singular value will have relative accuracy close to machine precision. The absolute error of other singular values is equal to the absolute error of the biggest singular value. Input parameters: D - main diagonal of matrix B. Array whose index ranges within [0..N-1]. E - superdiagonal (or subdiagonal) of matrix B. Array whose index ranges within [0..N-2]. N - size of matrix B. IsUpper - True, if the matrix is upper bidiagonal. IsFractionalAccuracyRequired - THIS PARAMETER IS IGNORED SINCE ALGLIB 3.5.0 SINGULAR VALUES ARE ALWAYS SEARCHED WITH HIGH ACCURACY. U - matrix to be multiplied by Q. Array whose indexes range within [0..NRU-1, 0..N-1]. The matrix can be bigger, in that case only the submatrix [0..NRU-1, 0..N-1] will be multiplied by Q. NRU - number of rows in matrix U. C - matrix to be multiplied by Q'. Array whose indexes range within [0..N-1, 0..NCC-1]. The matrix can be bigger, in that case only the submatrix [0..N-1, 0..NCC-1] will be multiplied by Q'. NCC - number of columns in matrix C. VT - matrix to be multiplied by P^T. Array whose indexes range within [0..N-1, 0..NCVT-1]. The matrix can be bigger, in that case only the submatrix [0..N-1, 0..NCVT-1] will be multiplied by P^T. NCVT - number of columns in matrix VT. Output parameters: D - singular values of matrix B in descending order. U - if NRU>0, contains matrix U*Q. VT - if NCVT>0, contains matrix (P^T)*VT. C - if NCC>0, contains matrix Q'*C. Result: True, if the algorithm has converged. False, if the algorithm hasn't converged (rare case). NOTE: multiplication U*Q is performed by means of transposition to internal buffer, multiplication and backward transposition. 
It helps to avoid costly columnwise operations and speed-up algorithm. Additional information: The type of convergence is controlled by the internal parameter TOL. If the parameter is greater than 0, the singular values will have relative accuracy TOL. If TOL<0, the singular values will have absolute accuracy ABS(TOL)*norm(B). By default, |TOL| falls within the range of 10*Epsilon and 100*Epsilon, where Epsilon is the machine precision. It is not recommended to use TOL less than 10*Epsilon since this will considerably slow down the algorithm and may not lead to error decreasing. History: * 31 March, 2007. changed MAXITR from 6 to 12. -- LAPACK routine (version 3.0) -- Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., Courant Institute, Argonne National Lab, and Rice University October 31, 1999. *************************************************************************/ bool rmatrixbdsvd(real_1d_array &d, const real_1d_array &e, const ae_int_t n, const bool isupper, const bool isfractionalaccuracyrequired, real_2d_array &u, const ae_int_t nru, real_2d_array &c, const ae_int_t ncc, real_2d_array &vt, const ae_int_t ncvt); /************************************************************************* Singular value decomposition of a rectangular matrix. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes one important improvement of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Generally, commercial ALGLIB is several times faster than open-source ! generic C edition, and many times faster than open-source C# edition. ! ! Multithreaded acceleration is only partially supported (some parts are ! optimized, but most - are not). ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. The algorithm calculates the singular value decomposition of a matrix of size MxN: A = U * S * V^T The algorithm finds the singular values and, optionally, matrices U and V^T. The algorithm can find both first min(M,N) columns of matrix U and rows of matrix V^T (singular vectors), and matrices U and V^T wholly (of sizes MxM and NxN respectively). Take into account that the subroutine does not return matrix V but V^T. Input parameters: A - matrix to be decomposed. Array whose indexes range within [0..M-1, 0..N-1]. M - number of rows in matrix A. N - number of columns in matrix A. UNeeded - 0, 1 or 2. See the description of the parameter U. VTNeeded - 0, 1 or 2. See the description of the parameter VT. AdditionalMemory - If the parameter: * equals 0, the algorithm doesn't use additional memory (lower requirements, lower performance). * equals 1, the algorithm uses additional memory of size min(M,N)*min(M,N) of real numbers. It often speeds up the algorithm. * equals 2, the algorithm uses additional memory of size M*min(M,N) of real numbers. It allows to get a maximum performance. The recommended value of the parameter is 2. Output parameters: W - contains singular values in descending order. U - if UNeeded=0, U isn't changed, the left singular vectors are not calculated. if Uneeded=1, U contains left singular vectors (first min(M,N) columns of matrix U). 
Array whose indexes range within [0..M-1, 0..Min(M,N)-1]. if UNeeded=2, U contains matrix U wholly. Array whose indexes range within [0..M-1, 0..M-1]. VT - if VTNeeded=0, VT isn't changed, the right singular vectors are not calculated. if VTNeeded=1, VT contains right singular vectors (first min(M,N) rows of matrix V^T). Array whose indexes range within [0..min(M,N)-1, 0..N-1]. if VTNeeded=2, VT contains matrix V^T wholly. Array whose indexes range within [0..N-1, 0..N-1]. -- ALGLIB -- Copyright 2005 by Bochkanov Sergey *************************************************************************/ bool rmatrixsvd(const real_2d_array &a, const ae_int_t m, const ae_int_t n, const ae_int_t uneeded, const ae_int_t vtneeded, const ae_int_t additionalmemory, real_1d_array &w, real_2d_array &u, real_2d_array &vt); bool smp_rmatrixsvd(const real_2d_array &a, const ae_int_t m, const ae_int_t n, const ae_int_t uneeded, const ae_int_t vtneeded, const ae_int_t additionalmemory, real_1d_array &w, real_2d_array &u, real_2d_array &vt); /************************************************************************* This procedure initializes matrix norm estimator. USAGE: 1. User initializes algorithm state with NormEstimatorCreate() call 2. User calls NormEstimatorEstimateSparse() (or NormEstimatorIteration()) 3. User calls NormEstimatorResults() to get solution. INPUT PARAMETERS: M - number of rows in the matrix being estimated, M>0 N - number of columns in the matrix being estimated, N>0 NStart - number of random starting vectors recommended value - at least 5. NIts - number of iterations to do with best starting vector recommended value - at least 5. OUTPUT PARAMETERS: State - structure which stores algorithm state NOTE: this algorithm is effectively deterministic, i.e. it always returns same result when repeatedly called for the same matrix. In fact, algorithm uses randomized starting vectors, but internal random numbers generator always generates same sequence of the random values (it is a feature, not bug). Algorithm can be made non-deterministic with NormEstimatorSetSeed(0) call. -- ALGLIB -- Copyright 06.12.2011 by Bochkanov Sergey *************************************************************************/ void normestimatorcreate(const ae_int_t m, const ae_int_t n, const ae_int_t nstart, const ae_int_t nits, normestimatorstate &state); /************************************************************************* This function changes seed value used by algorithm. In some cases we need deterministic processing, i.e. subsequent calls must return equal results, in other cases we need non-deterministic algorithm which returns different results for the same matrix on every pass. Setting zero seed will lead to non-deterministic algorithm, while non-zero value will make our algorithm deterministic. INPUT PARAMETERS: State - norm estimator state, must be initialized with a call to NormEstimatorCreate() SeedVal - seed value, >=0. Zero value = non-deterministic algo. -- ALGLIB -- Copyright 06.12.2011 by Bochkanov Sergey *************************************************************************/ void normestimatorsetseed(const normestimatorstate &state, const ae_int_t seedval); /************************************************************************* This function estimates norm of the sparse M*N matrix A. INPUT PARAMETERS: State - norm estimator state, must be initialized with a call to NormEstimatorCreate() A - sparse M*N matrix, must be converted to CRS format prior to calling this function. 
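NOTE: a hedged end-to-end sketch of the estimator workflow is shown below; the 3x3 matrix is hypothetical, and the sparsecreate/sparseset/sparseconverttocrs calls are assumed to come from the sparse matrix subpackage of LinAlg:

    #include "linalg.h"
    using namespace alglib;

    void example_normestimator()
    {
        // Hypothetical 3x3 sparse matrix, filled in hash-table format and converted to CRS.
        sparsematrix a;
        sparsecreate(3, 3, 3, a);                 // 3 nonzero elements expected
        sparseset(a, 0, 0, 2.0);
        sparseset(a, 1, 2, -1.0);
        sparseset(a, 2, 1, 4.0);
        sparseconverttocrs(a);                    // CRS format is required by the estimator

        normestimatorstate s;
        normestimatorcreate(3, 3, 5, 5, s);       // 5 random starts, 5 refinement iterations
        normestimatorestimatesparse(s, a);
        double nrm;
        normestimatorresults(s, nrm);             // estimate of norm(A)
    }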
After this function is over you can call NormEstimatorResults() to get an estimate of norm(A). -- ALGLIB -- Copyright 06.12.2011 by Bochkanov Sergey *************************************************************************/ void normestimatorestimatesparse(const normestimatorstate &state, const sparsematrix &a); /************************************************************************* Matrix norm estimation results INPUT PARAMETERS: State - algorithm state OUTPUT PARAMETERS: Nrm - estimate of the matrix norm, Nrm>=0 -- ALGLIB -- Copyright 06.12.2011 by Bochkanov Sergey *************************************************************************/ void normestimatorresults(const normestimatorstate &state, double &nrm); /************************************************************************* This function initializes subspace iteration solver. This solver is used to solve symmetric real eigenproblems where just a few (top K) eigenvalues and corresponding eigenvectors are required. This solver can be significantly faster than complete EVD decomposition in the following cases: * when only a small fraction of top eigenpairs of dense matrix is required. When K approaches N, this solver is slower than complete dense EVD * when problem matrix is sparse (and/or is not known explicitly, i.e. only matrix-matrix product can be performed) USAGE (explicit dense/sparse matrix): 1. User initializes algorithm state with eigsubspacecreate() call 2. [optional] User tunes solver parameters by calling eigsubspacesetcond() or other functions 3. User calls eigsubspacesolvedense() or eigsubspacesolvesparse() methods, which take algorithm state and 2D array or alglib.sparsematrix object. USAGE (out-of-core mode): 1. User initializes algorithm state with eigsubspacecreate() call 2. [optional] User tunes solver parameters by calling eigsubspacesetcond() or other functions 3. User activates out-of-core mode of the solver and repeatedly calls communication functions in a loop like below: > alglib.eigsubspaceoocstart(state) > while alglib.eigsubspaceooccontinue(state) do > alglib.eigsubspaceoocgetrequestinfo(state, out RequestType, out M) > alglib.eigsubspaceoocgetrequestdata(state, out X) > [calculate Y=A*X, with X=R^NxM] > alglib.eigsubspaceoocsendresult(state, in Y) > alglib.eigsubspaceoocstop(state, out W, out Z, out Report) INPUT PARAMETERS: N - problem dimensionality, N>0 K - number of top eigenvectors to calculate, 0<K<=N. OUTPUT PARAMETERS: State - structure which stores algorithm state -- ALGLIB -- Copyright 16.01.2017 by Bochkanov Sergey *************************************************************************/ void eigsubspacecreate(const ae_int_t n, const ae_int_t k, eigsubspacestate &state); /************************************************************************* This function sets stopping criteria for the solver: * error in eigenvector/value allowed by solver * maximum number of iterations to perform INPUT PARAMETERS: State - solver structure Eps - eps>=0, with non-zero value used to tell solver that it can stop after all eigenvalues converged with error roughly proportional to eps*MAX(LAMBDA_MAX), where LAMBDA_MAX is a maximum eigenvalue. Zero value means that no check for precision is performed. MaxIts - maxits>=0, with non-zero value used to tell solver that it can stop after maxits steps (no matter how precise current estimate is) NOTE: passing eps=0 and maxits=0 results in automatic selection of moderate eps as stopping criteria (1.0E-6 in current implementation, but it may change without notice). NOTE: very small values of eps are possible (say, 1.0E-12), although the larger problem you solve (N and/or K), the harder it is to find precise eigenvectors because rounding errors tend to accumulate. NOTE: passing non-zero eps results in some performance penalty, roughly equal to 2N*(2K)^2 FLOPs per iteration. These additional computations are required in order to estimate current error in eigenvalues via Rayleigh-Ritz process. 
Most of this additional time is spent in construction of ~2Kx2K symmetric subproblem whose eigenvalues are checked with exact eigensolver. This additional time is negligible if you search for eigenvalues of the large dense matrix, but may become noticeable on highly sparse EVD problems, where cost of matrix-matrix product is low. If you set eps to exactly zero, Rayleigh-Ritz phase is completely turned off. -- ALGLIB -- Copyright 16.01.2017 by Bochkanov Sergey *************************************************************************/ void eigsubspacesetcond(const eigsubspacestate &state, const double eps, const ae_int_t maxits); /************************************************************************* This function initiates out-of-core mode of subspace eigensolver. It should be used in conjunction with other out-of-core-related functions of this subspackage in a loop like below: > alglib.eigsubspaceoocstart(state) > while alglib.eigsubspaceooccontinue(state) do > alglib.eigsubspaceoocgetrequestinfo(state, out RequestType, out M) > alglib.eigsubspaceoocgetrequestdata(state, out X) > [calculate Y=A*X, with X=R^NxM] > alglib.eigsubspaceoocsendresult(state, in Y) > alglib.eigsubspaceoocstop(state, out W, out Z, out Report) INPUT PARAMETERS: State - solver object MType - matrix type: * 0 for real symmetric matrix (solver assumes that matrix being processed is symmetric; symmetric direct eigensolver is used for smaller subproblems arising during solution of larger "full" task) Future versions of ALGLIB may introduce support for other matrix types; for now, only symmetric eigenproblems are supported. -- ALGLIB -- Copyright 16.01.2017 by Bochkanov Sergey *************************************************************************/ void eigsubspaceoocstart(const eigsubspacestate &state, const ae_int_t mtype); /************************************************************************* This function performs subspace iteration in the out-of-core mode. It should be used in conjunction with other out-of-core-related functions of this subspackage in a loop like below: > alglib.eigsubspaceoocstart(state) > while alglib.eigsubspaceooccontinue(state) do > alglib.eigsubspaceoocgetrequestinfo(state, out RequestType, out M) > alglib.eigsubspaceoocgetrequestdata(state, out X) > [calculate Y=A*X, with X=R^NxM] > alglib.eigsubspaceoocsendresult(state, in Y) > alglib.eigsubspaceoocstop(state, out W, out Z, out Report) -- ALGLIB -- Copyright 16.01.2017 by Bochkanov Sergey *************************************************************************/ bool eigsubspaceooccontinue(const eigsubspacestate &state); /************************************************************************* This function is used to retrieve information about out-of-core request sent by solver to user code: request type (current version of the solver sends only requests for matrix-matrix products) and request size (size of the matrices being multiplied). This function returns just request metrics; in order to get contents of the matrices being multiplied, use eigsubspaceoocgetrequestdata(). 
It should be used in conjunction with other out-of-core-related functions of this subspackage in a loop like below: > alglib.eigsubspaceoocstart(state) > while alglib.eigsubspaceooccontinue(state) do > alglib.eigsubspaceoocgetrequestinfo(state, out RequestType, out M) > alglib.eigsubspaceoocgetrequestdata(state, out X) > [calculate Y=A*X, with X=R^NxM] > alglib.eigsubspaceoocsendresult(state, in Y) > alglib.eigsubspaceoocstop(state, out W, out Z, out Report) INPUT PARAMETERS: State - solver running in out-of-core mode OUTPUT PARAMETERS: RequestType - type of the request to process: * 0 - for matrix-matrix product A*X, with A being NxN matrix whose eigenvalues/vectors are needed, and X being NxREQUESTSIZE one which is returned by the eigsubspaceoocgetrequestdata(). RequestSize - size of the X matrix (number of columns), usually it is several times larger than number of vectors K requested by user. -- ALGLIB -- Copyright 16.01.2017 by Bochkanov Sergey *************************************************************************/ void eigsubspaceoocgetrequestinfo(const eigsubspacestate &state, ae_int_t &requesttype, ae_int_t &requestsize); /************************************************************************* This function is used to retrieve information about out-of-core request sent by solver to user code: matrix X (array[N,RequestSize) which have to be multiplied by out-of-core matrix A in a product A*X. This function returns just request data; in order to get size of the data prior to processing requestm, use eigsubspaceoocgetrequestinfo(). It should be used in conjunction with other out-of-core-related functions of this subspackage in a loop like below: > alglib.eigsubspaceoocstart(state) > while alglib.eigsubspaceooccontinue(state) do > alglib.eigsubspaceoocgetrequestinfo(state, out RequestType, out M) > alglib.eigsubspaceoocgetrequestdata(state, out X) > [calculate Y=A*X, with X=R^NxM] > alglib.eigsubspaceoocsendresult(state, in Y) > alglib.eigsubspaceoocstop(state, out W, out Z, out Report) INPUT PARAMETERS: State - solver running in out-of-core mode X - possibly preallocated storage; reallocated if needed, left unchanged, if large enough to store request data. OUTPUT PARAMETERS: X - array[N,RequestSize] or larger, leading rectangle is filled with dense matrix X. -- ALGLIB -- Copyright 16.01.2017 by Bochkanov Sergey *************************************************************************/ void eigsubspaceoocgetrequestdata(const eigsubspacestate &state, real_2d_array &x); /************************************************************************* This function is used to send user reply to out-of-core request sent by solver. Usually it is product A*X for returned by solver matrix X. It should be used in conjunction with other out-of-core-related functions of this subspackage in a loop like below: > alglib.eigsubspaceoocstart(state) > while alglib.eigsubspaceooccontinue(state) do > alglib.eigsubspaceoocgetrequestinfo(state, out RequestType, out M) > alglib.eigsubspaceoocgetrequestdata(state, out X) > [calculate Y=A*X, with X=R^NxM] > alglib.eigsubspaceoocsendresult(state, in Y) > alglib.eigsubspaceoocstop(state, out W, out Z, out Report) INPUT PARAMETERS: State - solver running in out-of-core mode AX - array[N,RequestSize] or larger, leading rectangle is filled with product A*X. 
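NOTE: the reverse-communication loop above is written in language-neutral pseudocode; a hedged C++ rendering of the same protocol is shown below. The operator multiply_by_a() is a hypothetical stand-in for user code, here a simple diagonal matrix diag(1..n):

    #include "linalg.h"
    using namespace alglib;

    // Hypothetical linear operator A = diag(1,2,...,n), applied column by column.
    void multiply_by_a(const real_2d_array &x, ae_int_t n, ae_int_t m, real_2d_array &ax)
    {
        ax.setlength(n, m);
        for(ae_int_t i=0; i<n; i++)
            for(ae_int_t j=0; j<m; j++)
                ax[i][j] = (i+1)*x[i][j];
    }

    void example_eigsubspace_ooc()
    {
        ae_int_t n = 100, k = 5;                       // hypothetical problem size, top 5 eigenpairs
        eigsubspacestate state;
        eigsubspacereport rep;
        eigsubspacecreate(n, k, state);
        eigsubspaceoocstart(state, 0);                 // 0 = real symmetric matrix
        while( eigsubspaceooccontinue(state) )
        {
            ae_int_t requesttype, m;
            eigsubspaceoocgetrequestinfo(state, requesttype, m);
            real_2d_array x, ax;
            eigsubspaceoocgetrequestdata(state, x);    // x is array[n,m]
            multiply_by_a(x, n, m, ax);                // user computes ax = A*x
            eigsubspaceoocsendresult(state, ax);
        }
        real_1d_array w;
        real_2d_array z;
        eigsubspaceoocstop(state, w, z, rep);          // w = top eigenvalues, z = eigenvectors
    }

The diagonal operator is only a stand-in; any user code which can form A*X (sparse, matrix-free, out-of-core) fits the same loop.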
-- ALGLIB -- Copyright 16.01.2017 by Bochkanov Sergey *************************************************************************/ void eigsubspaceoocsendresult(const eigsubspacestate &state, const real_2d_array &ax); /************************************************************************* This function finalizes out-of-core mode of subspace eigensolver. It should be used in conjunction with other out-of-core-related functions of this subspackage in a loop like below: > alglib.eigsubspaceoocstart(state) > while alglib.eigsubspaceooccontinue(state) do > alglib.eigsubspaceoocgetrequestinfo(state, out RequestType, out M) > alglib.eigsubspaceoocgetrequestdata(state, out X) > [calculate Y=A*X, with X=R^NxM] > alglib.eigsubspaceoocsendresult(state, in Y) > alglib.eigsubspaceoocstop(state, out W, out Z, out Report) INPUT PARAMETERS: State - solver state OUTPUT PARAMETERS: W - array[K], depending on solver settings: * top K eigenvalues ordered by descending - if eigenvectors are returned in Z * zeros - if invariant subspace is returned in Z Z - array[N,K], depending on solver settings either: * matrix of eigenvectors found * orthogonal basis of K-dimensional invariant subspace Rep - report with additional parameters -- ALGLIB -- Copyright 16.01.2017 by Bochkanov Sergey *************************************************************************/ void eigsubspaceoocstop(const eigsubspacestate &state, real_1d_array &w, real_2d_array &z, eigsubspacereport &rep); /************************************************************************* This function runs eigensolver for dense NxN symmetric matrix A, given by upper or lower triangle. This function can not process nonsymmetric matrices. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * multithreading support ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! ! For a situation when you need just a few eigenvectors (~1-10), ! multithreading typically gives sublinear (wrt to cores count) speedup. ! For larger problems it may give you nearly linear increase in ! performance. ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. Best results are achieved for high-dimensional problems ! (NVars is at least 256). ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS: State - solver state A - array[N,N], symmetric NxN matrix given by one of its triangles IsUpper - whether upper or lower triangle of A is given (the other one is not referenced at all). OUTPUT PARAMETERS: W - array[K], top K eigenvalues ordered by descending of their absolute values Z - array[N,K], matrix of eigenvectors found Rep - report with additional parameters NOTE: internally this function allocates a copy of NxN dense A. You should take it into account when working with very large matrices occupying almost all RAM. 
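NOTE: a minimal in-core C++ usage sketch (hypothetical 4x4 matrix, top 2 eigenpairs requested; eigsubspacecreate(N, K, State) is assumed as described earlier in this section):

    #include "linalg.h"
    using namespace alglib;

    void example_eigsubspace_dense()
    {
        // Hypothetical symmetric 4x4 matrix given by its upper triangle.
        real_2d_array a = "[[5,1,0,0],[1,4,1,0],[0,1,3,1],[0,0,1,2]]";
        eigsubspacestate state;
        eigsubspacereport rep;
        real_1d_array w;
        real_2d_array z;
        eigsubspacecreate(4, 2, state);
        eigsubspacesolvedenses(state, a, true, w, z, rep);  // w: 2 eigenvalues, z: 4x2 eigenvectors
    }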
-- ALGLIB -- Copyright 16.01.2017 by Bochkanov Sergey *************************************************************************/ void eigsubspacesolvedenses(const eigsubspacestate &state, const real_2d_array &a, const bool isupper, real_1d_array &w, real_2d_array &z, eigsubspacereport &rep); void smp_eigsubspacesolvedenses(const eigsubspacestate &state, const real_2d_array &a, const bool isupper, real_1d_array &w, real_2d_array &z, eigsubspacereport &rep); /************************************************************************* This function runs eigensolver for dense NxN symmetric matrix A, given by upper or lower triangle. This function can not process nonsymmetric matrices. INPUT PARAMETERS: State - solver state A - NxN symmetric matrix given by one of its triangles IsUpper - whether upper or lower triangle of A is given (the other one is not referenced at all). OUTPUT PARAMETERS: W - array[K], top K eigenvalues ordered by descending of their absolute values Z - array[N,K], matrix of eigenvectors found Rep - report with additional parameters -- ALGLIB -- Copyright 16.01.2017 by Bochkanov Sergey *************************************************************************/ void eigsubspacesolvesparses(const eigsubspacestate &state, const sparsematrix &a, const bool isupper, real_1d_array &w, real_2d_array &z, eigsubspacereport &rep); /************************************************************************* Finding the eigenvalues and eigenvectors of a symmetric matrix The algorithm finds eigen pairs of a symmetric matrix by reducing it to tridiagonal form and using the QL/QR algorithm. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes one important improvement of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Generally, commercial ALGLIB is several times faster than open-source ! generic C edition, and many times faster than open-source C# edition. ! ! Multithreaded acceleration is NOT supported for this function. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. Input parameters: A - symmetric matrix which is given by its upper or lower triangular part. Array whose indexes range within [0..N-1, 0..N-1]. N - size of matrix A. ZNeeded - flag controlling whether the eigenvectors are needed or not. If ZNeeded is equal to: * 0, the eigenvectors are not returned; * 1, the eigenvectors are returned. IsUpper - storage format. Output parameters: D - eigenvalues in ascending order. Array whose index ranges within [0..N-1]. Z - if ZNeeded is equal to: * 0, Z hasn't changed; * 1, Z contains the eigenvectors. Array whose indexes range within [0..N-1, 0..N-1]. The eigenvectors are stored in the matrix columns. Result: True, if the algorithm has converged. False, if the algorithm hasn't converged (rare case). 
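A minimal usage sketch for smatrixevd(), declared below: full EVD of a 2x2 symmetric matrix with eigenvectors requested (ZNeeded=1).

#include <cstdio>
#include "linalg.h"
using namespace alglib;

int main()
{
    real_2d_array a = "[[2,1],[1,2]]";         // symmetric; upper triangle is used (IsUpper=true)
    real_1d_array d;
    real_2d_array z;
    if( smatrixevd(a, 2, 1, true, d, z) )      // ZNeeded=1: eigenvectors stored in columns of Z
        printf("%s\n", d.tostring(6).c_str()); // eigenvalues in ascending order: 1, 3
    else
        printf("EVD failed to converge (rare case)\n");
    return 0;
}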
-- ALGLIB -- Copyright 2005-2008 by Bochkanov Sergey *************************************************************************/ bool smatrixevd(const real_2d_array &a, const ae_int_t n, const ae_int_t zneeded, const bool isupper, real_1d_array &d, real_2d_array &z); /************************************************************************* Subroutine for finding the eigenvalues (and eigenvectors) of a symmetric matrix in a given half open interval (A, B] by using a bisection and inverse iteration Input parameters: A - symmetric matrix which is given by its upper or lower triangular part. Array [0..N-1, 0..N-1]. N - size of matrix A. ZNeeded - flag controlling whether the eigenvectors are needed or not. If ZNeeded is equal to: * 0, the eigenvectors are not returned; * 1, the eigenvectors are returned. IsUpperA - storage format of matrix A. B1, B2 - half open interval (B1, B2] to search eigenvalues in. Output parameters: M - number of eigenvalues found in a given half-interval (M>=0). W - array of the eigenvalues found. Array whose index ranges within [0..M-1]. Z - if ZNeeded is equal to: * 0, Z hasn't changed; * 1, Z contains eigenvectors. Array whose indexes range within [0..N-1, 0..M-1]. The eigenvectors are stored in the matrix columns. Result: True, if successful. M contains the number of eigenvalues in the given half-interval (could be equal to 0), W contains the eigenvalues, Z contains the eigenvectors (if needed). False, if the bisection method subroutine wasn't able to find the eigenvalues in the given interval or if the inverse iteration subroutine wasn't able to find all the corresponding eigenvectors. In that case, the eigenvalues and eigenvectors are not returned, M is equal to 0. -- ALGLIB -- Copyright 07.01.2006 by Bochkanov Sergey *************************************************************************/ bool smatrixevdr(const real_2d_array &a, const ae_int_t n, const ae_int_t zneeded, const bool isupper, const double b1, const double b2, ae_int_t &m, real_1d_array &w, real_2d_array &z); /************************************************************************* Subroutine for finding the eigenvalues and eigenvectors of a symmetric matrix with given indexes by using bisection and inverse iteration methods. Input parameters: A - symmetric matrix which is given by its upper or lower triangular part. Array whose indexes range within [0..N-1, 0..N-1]. N - size of matrix A. ZNeeded - flag controlling whether the eigenvectors are needed or not. If ZNeeded is equal to: * 0, the eigenvectors are not returned; * 1, the eigenvectors are returned. IsUpperA - storage format of matrix A. I1, I2 - index interval for searching (from I1 to I2). 0 <= I1 <= I2 <= N-1. Output parameters: W - array of the eigenvalues found. Array whose index ranges within [0..I2-I1]. Z - if ZNeeded is equal to: * 0, Z hasn't changed; * 1, Z contains eigenvectors. Array whose indexes range within [0..N-1, 0..I2-I1]. In that case, the eigenvectors are stored in the matrix columns. Result: True, if successful. W contains the eigenvalues, Z contains the eigenvectors (if needed). False, if the bisection method subroutine wasn't able to find the eigenvalues in the given interval or if the inverse iteration subroutine wasn't able to find all the corresponding eigenvectors. In that case, the eigenvalues and eigenvectors are not returned. 
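A minimal usage sketch for smatrixevdi(), declared below: eigenvalues with indexes 0 and 1 (the two smallest) of a 3x3 symmetric matrix, eigenvectors not requested.

#include <cstdio>
#include "linalg.h"
using namespace alglib;

int main()
{
    real_2d_array a = "[[4,1,0],[1,3,1],[0,1,2]]";
    real_1d_array w;
    real_2d_array z;                              // not referenced when ZNeeded=0
    if( smatrixevdi(a, 3, 0, true, 0, 1, w, z) )  // ZNeeded=0, I1=0, I2=1
        printf("%s\n", w.tostring(6).c_str());    // two eigenvalues, ascending order
    else
        printf("bisection/inverse iteration failed (rare case)\n");
    return 0;
}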
-- ALGLIB -- Copyright 07.01.2006 by Bochkanov Sergey *************************************************************************/ bool smatrixevdi(const real_2d_array &a, const ae_int_t n, const ae_int_t zneeded, const bool isupper, const ae_int_t i1, const ae_int_t i2, real_1d_array &w, real_2d_array &z); /************************************************************************* Finding the eigenvalues and eigenvectors of a Hermitian matrix The algorithm finds eigen pairs of a Hermitian matrix by reducing it to real tridiagonal form and using the QL/QR algorithm. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes one important improvement of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Generally, commercial ALGLIB is several times faster than open-source ! generic C edition, and many times faster than open-source C# edition. ! ! Multithreaded acceleration is NOT supported for this function. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. Input parameters: A - Hermitian matrix which is given by its upper or lower triangular part. Array whose indexes range within [0..N-1, 0..N-1]. N - size of matrix A. IsUpper - storage format. ZNeeded - flag controlling whether the eigenvectors are needed or not. If ZNeeded is equal to: * 0, the eigenvectors are not returned; * 1, the eigenvectors are returned. Output parameters: D - eigenvalues in ascending order. Array whose index ranges within [0..N-1]. Z - if ZNeeded is equal to: * 0, Z hasn't changed; * 1, Z contains the eigenvectors. Array whose indexes range within [0..N-1, 0..N-1]. The eigenvectors are stored in the matrix columns. Result: True, if the algorithm has converged. False, if the algorithm hasn't converged (rare case). Note: eigenvectors of Hermitian matrix are defined up to multiplication by a complex number L, such that |L|=1. -- ALGLIB -- Copyright 2005, 23 March 2007 by Bochkanov Sergey *************************************************************************/ bool hmatrixevd(const complex_2d_array &a, const ae_int_t n, const ae_int_t zneeded, const bool isupper, real_1d_array &d, complex_2d_array &z); /************************************************************************* Subroutine for finding the eigenvalues (and eigenvectors) of a Hermitian matrix in a given half-interval (A, B] by using a bisection and inverse iteration Input parameters: A - Hermitian matrix which is given by its upper or lower triangular part. Array whose indexes range within [0..N-1, 0..N-1]. N - size of matrix A. ZNeeded - flag controlling whether the eigenvectors are needed or not. If ZNeeded is equal to: * 0, the eigenvectors are not returned; * 1, the eigenvectors are returned. IsUpperA - storage format of matrix A. B1, B2 - half-interval (B1, B2] to search eigenvalues in. Output parameters: M - number of eigenvalues found in a given half-interval, M>=0 W - array of the eigenvalues found. Array whose index ranges within [0..M-1]. Z - if ZNeeded is equal to: * 0, Z hasn't changed; * 1, Z contains eigenvectors. Array whose indexes range within [0..N-1, 0..M-1]. 
The eigenvectors are stored in the matrix columns. Result: True, if successful. M contains the number of eigenvalues in the given half-interval (could be equal to 0), W contains the eigenvalues, Z contains the eigenvectors (if needed). False, if the bisection method subroutine wasn't able to find the eigenvalues in the given interval or if the inverse iteration subroutine wasn't able to find all the corresponding eigenvectors. In that case, the eigenvalues and eigenvectors are not returned, M is equal to 0. Note: eigenvectors of a Hermitian matrix are defined up to multiplication by a complex number L, such that |L|=1. -- ALGLIB -- Copyright 07.01.2006, 24.03.2007 by Bochkanov Sergey. *************************************************************************/ bool hmatrixevdr(const complex_2d_array &a, const ae_int_t n, const ae_int_t zneeded, const bool isupper, const double b1, const double b2, ae_int_t &m, real_1d_array &w, complex_2d_array &z); /************************************************************************* Subroutine for finding the eigenvalues and eigenvectors of a Hermitian matrix with given indexes by using bisection and inverse iteration methods Input parameters: A - Hermitian matrix which is given by its upper or lower triangular part. Array whose indexes range within [0..N-1, 0..N-1]. N - size of matrix A. ZNeeded - flag controlling whether the eigenvectors are needed or not. If ZNeeded is equal to: * 0, the eigenvectors are not returned; * 1, the eigenvectors are returned. IsUpperA - storage format of matrix A. I1, I2 - index interval for searching (from I1 to I2). 0 <= I1 <= I2 <= N-1. Output parameters: W - array of the eigenvalues found. Array whose index ranges within [0..I2-I1]. Z - if ZNeeded is equal to: * 0, Z hasn't changed; * 1, Z contains eigenvectors. Array whose indexes range within [0..N-1, 0..I2-I1]. In that case, the eigenvectors are stored in the matrix columns. Result: True, if successful. W contains the eigenvalues, Z contains the eigenvectors (if needed). False, if the bisection method subroutine wasn't able to find the eigenvalues in the given interval or if the inverse iteration subroutine wasn't able to find all the corresponding eigenvectors. In that case, the eigenvalues and eigenvectors are not returned. Note: eigenvectors of a Hermitian matrix are defined up to multiplication by a complex number L, such that |L|=1. -- ALGLIB -- Copyright 07.01.2006, 24.03.2007 by Bochkanov Sergey. *************************************************************************/ bool hmatrixevdi(const complex_2d_array &a, const ae_int_t n, const ae_int_t zneeded, const bool isupper, const ae_int_t i1, const ae_int_t i2, real_1d_array &w, complex_2d_array &z); /************************************************************************* Finding the eigenvalues and eigenvectors of a tridiagonal symmetric matrix The algorithm finds the eigen pairs of a tridiagonal symmetric matrix by using a QL/QR algorithm with implicit shifts. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes one important improvement of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Generally, commercial ALGLIB is several times faster than open-source !
generic C edition, and many times faster than open-source C# edition. ! ! Multithreaded acceleration is NOT supported for this function. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. Input parameters: D - the main diagonal of a tridiagonal matrix. Array whose index ranges within [0..N-1]. E - the secondary diagonal of a tridiagonal matrix. Array whose index ranges within [0..N-2]. N - size of matrix A. ZNeeded - flag controlling whether the eigenvectors are needed or not. If ZNeeded is equal to: * 0, the eigenvectors are not needed; * 1, the eigenvectors of a tridiagonal matrix are multiplied by the square matrix Z. It is used if the tridiagonal matrix is obtained by the similarity transformation of a symmetric matrix; * 2, the eigenvectors of a tridiagonal matrix replace the square matrix Z; * 3, matrix Z contains the first row of the eigenvectors matrix. Z - if ZNeeded=1, Z contains the square matrix by which the eigenvectors are multiplied. Array whose indexes range within [0..N-1, 0..N-1]. Output parameters: D - eigenvalues in ascending order. Array whose index ranges within [0..N-1]. Z - if ZNeeded is equal to: * 0, Z hasn't changed; * 1, Z contains the product of a given matrix (from the left) and the eigenvectors matrix (from the right); * 2, Z contains the eigenvectors. * 3, Z contains the first row of the eigenvectors matrix. If ZNeeded<3, Z is the array whose indexes range within [0..N-1, 0..N-1]. In that case, the eigenvectors are stored in the matrix columns. If ZNeeded=3, Z is the array whose indexes range within [0..0, 0..N-1]. Result: True, if the algorithm has converged. False, if the algorithm hasn't converged. -- LAPACK routine (version 3.0) -- Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., Courant Institute, Argonne National Lab, and Rice University September 30, 1994 *************************************************************************/ bool smatrixtdevd(real_1d_array &d, const real_1d_array &e, const ae_int_t n, const ae_int_t zneeded, real_2d_array &z); /************************************************************************* Subroutine for finding the tridiagonal matrix eigenvalues/vectors in a given half-interval (A, B] by using bisection and inverse iteration. Input parameters: D - the main diagonal of a tridiagonal matrix. Array whose index ranges within [0..N-1]. E - the secondary diagonal of a tridiagonal matrix. Array whose index ranges within [0..N-2]. N - size of matrix, N>=0. ZNeeded - flag controlling whether the eigenvectors are needed or not. If ZNeeded is equal to: * 0, the eigenvectors are not needed; * 1, the eigenvectors of a tridiagonal matrix are multiplied by the square matrix Z. It is used if the tridiagonal matrix is obtained by the similarity transformation of a symmetric matrix. * 2, the eigenvectors of a tridiagonal matrix replace matrix Z. A, B - half-interval (A, B] to search eigenvalues in. Z - if ZNeeded is equal to: * 0, Z isn't used and remains unchanged; * 1, Z contains the square matrix (array whose indexes range within [0..N-1, 0..N-1]) which reduces the given symmetric matrix to tridiagonal form; * 2, Z isn't used (but changed on the exit). Output parameters: D - array of the eigenvalues found. Array whose index ranges within [0..M-1]. M - number of eigenvalues found in the given half-interval (M>=0). 
Z - if ZNeeded is equal to: * 0, doesn't contain any information; * 1, contains the product of a given NxN matrix Z (from the left) and NxM matrix of the eigenvectors found (from the right). Array whose indexes range within [0..N-1, 0..M-1]. * 2, contains the matrix of the eigenvectors found. Array whose indexes range within [0..N-1, 0..M-1]. Result: True, if successful. In that case, M contains the number of eigenvalues in the given half-interval (could be equal to 0), D contains the eigenvalues, Z contains the eigenvectors (if needed). It should be noted that the subroutine changes the size of arrays D and Z. False, if the bisection method subroutine wasn't able to find the eigenvalues in the given interval or if the inverse iteration subroutine wasn't able to find all the corresponding eigenvectors. In that case, the eigenvalues and eigenvectors are not returned, M is equal to 0. -- ALGLIB -- Copyright 31.03.2008 by Bochkanov Sergey *************************************************************************/ bool smatrixtdevdr(real_1d_array &d, const real_1d_array &e, const ae_int_t n, const ae_int_t zneeded, const double a, const double b, ae_int_t &m, real_2d_array &z); /************************************************************************* Subroutine for finding tridiagonal matrix eigenvalues/vectors with given indexes (in ascending order) by using the bisection and inverse iteration. Input parameters: D - the main diagonal of a tridiagonal matrix. Array whose index ranges within [0..N-1]. E - the secondary diagonal of a tridiagonal matrix. Array whose index ranges within [0..N-2]. N - size of matrix. N>=0. ZNeeded - flag controlling whether the eigenvectors are needed or not. If ZNeeded is equal to: * 0, the eigenvectors are not needed; * 1, the eigenvectors of a tridiagonal matrix are multiplied by the square matrix Z. It is used if the tridiagonal matrix is obtained by the similarity transformation of a symmetric matrix. * 2, the eigenvectors of a tridiagonal matrix replace matrix Z. I1, I2 - index interval for searching (from I1 to I2). 0 <= I1 <= I2 <= N-1. Z - if ZNeeded is equal to: * 0, Z isn't used and remains unchanged; * 1, Z contains the square matrix (array whose indexes range within [0..N-1, 0..N-1]) which reduces the given symmetric matrix to tridiagonal form; * 2, Z isn't used (but changed on the exit). Output parameters: D - array of the eigenvalues found. Array whose index ranges within [0..I2-I1]. Z - if ZNeeded is equal to: * 0, doesn't contain any information; * 1, contains the product of a given NxN matrix Z (from the left) and Nx(I2-I1) matrix of the eigenvectors found (from the right). Array whose indexes range within [0..N-1, 0..I2-I1]. * 2, contains the matrix of the eigenvectors found. Array whose indexes range within [0..N-1, 0..I2-I1]. Result: True, if successful. In that case, D contains the eigenvalues, Z contains the eigenvectors (if needed). It should be noted that the subroutine changes the size of arrays D and Z. False, if the bisection method subroutine wasn't able to find the eigenvalues in the given interval or if the inverse iteration subroutine wasn't able to find all the corresponding eigenvectors. In that case, the eigenvalues and eigenvectors are not returned.
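A minimal usage sketch for smatrixtdevdi(), declared below: eigenvalues with indexes 0 and 1 of a 3x3 symmetric tridiagonal matrix given by its diagonals, eigenvectors not requested.

#include <cstdio>
#include "linalg.h"
using namespace alglib;

int main()
{
    real_1d_array d = "[2,2,2]";               // main diagonal; overwritten by the eigenvalues found
    real_1d_array e = "[1,1]";                 // secondary diagonal
    real_2d_array z;                           // not used when ZNeeded=0
    if( smatrixtdevdi(d, e, 3, 0, 0, 1, z) )   // ZNeeded=0, I1=0, I2=1
        printf("%s\n", d.tostring(6).c_str()); // two smallest eigenvalues, ascending order
    else
        printf("bisection/inverse iteration failed (rare case)\n");
    return 0;
}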
-- ALGLIB -- Copyright 25.12.2005 by Bochkanov Sergey *************************************************************************/ bool smatrixtdevdi(real_1d_array &d, const real_1d_array &e, const ae_int_t n, const ae_int_t zneeded, const ae_int_t i1, const ae_int_t i2, real_2d_array &z); /************************************************************************* Finding eigenvalues and eigenvectors of a general (unsymmetric) matrix COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes one important improvement of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. Speed-up provided by MKL for this particular problem (EVD) ! is really high, because MKL uses combination of (a) better low-level ! optimizations, and (b) better EVD algorithms. ! ! On one particular SSE-capable machine for N=1024, commercial MKL- ! -capable ALGLIB was: ! * 7-10 times faster than open source "generic C" version ! * 15-18 times faster than "pure C#" version ! ! Multithreaded acceleration is NOT supported for this function. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. The algorithm finds eigenvalues and eigenvectors of a general matrix by using the QR algorithm with multiple shifts. The algorithm can find eigenvalues and both left and right eigenvectors. The right eigenvector is a vector x such that A*x = w*x, and the left eigenvector is a vector y such that y'*A = w*y' (here y' implies a complex conjugate transposition of vector y). Input parameters: A - matrix. Array whose indexes range within [0..N-1, 0..N-1]. N - size of matrix A. VNeeded - flag controlling whether eigenvectors are needed or not. If VNeeded is equal to: * 0, eigenvectors are not returned; * 1, right eigenvectors are returned; * 2, left eigenvectors are returned; * 3, both left and right eigenvectors are returned. Output parameters: WR - real parts of eigenvalues. Array whose index ranges within [0..N-1]. WI - imaginary parts of eigenvalues. Array whose index ranges within [0..N-1]. VL, VR - arrays of left and right eigenvectors (if they are needed). If WI[i]=0, the respective eigenvalue is a real number, and it corresponds to the column number I of matrices VL/VR. If WI[i]>0, we have a pair of complex conjugate numbers with positive and negative imaginary parts: the first eigenvalue WR[i] + sqrt(-1)*WI[i]; the second eigenvalue WR[i+1] + sqrt(-1)*WI[i+1]; WI[i]>0, WI[i+1] = -WI[i] < 0 In that case, the eigenvector corresponding to the first eigenvalue is located in i and i+1 columns of matrices VL/VR (the column number i contains the real part, and the column number i+1 contains the imaginary part), and the vector corresponding to the second eigenvalue is a complex conjugate to the first vector. Arrays whose indexes range within [0..N-1, 0..N-1]. Result: True, if the algorithm has converged. False, if the algorithm has not converged. Note 1: Some users may ask the following question: what if WI[N-1]>0? WI[N] must contain an eigenvalue which is complex conjugate to the N-th eigenvalue, but the array has only size N?
The answer is as follows: such a situation cannot occur because the algorithm finds eigenvalues in pairs; therefore, if WI[i]>0, I is strictly less than N-1. Note 2: The algorithm performance depends on the value of the internal parameter NS of the InternalSchurDecomposition subroutine which defines the number of shifts in the QR algorithm (similarly to the block width in block-matrix algorithms of linear algebra). If you require maximum performance on your machine, it is recommended to adjust this parameter manually. See also the InternalTREVC subroutine. The algorithm is based on the LAPACK 3.0 library. *************************************************************************/ bool rmatrixevd(const real_2d_array &a, const ae_int_t n, const ae_int_t vneeded, real_1d_array &wr, real_1d_array &wi, real_2d_array &vl, real_2d_array &vr); /************************************************************************* Subroutine performing the Schur decomposition of a general matrix by using the QR algorithm with multiple shifts. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes one important improvement of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Multithreaded acceleration is NOT supported for this function. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. The source matrix A is represented as S'*A*S = T, where S is an orthogonal matrix (Schur vectors), T - upper quasi-triangular matrix (with blocks of sizes 1x1 and 2x2 on the main diagonal). Input parameters: A - matrix to be decomposed. Array whose indexes range within [0..N-1, 0..N-1]. N - size of A, N>=0. Output parameters: A - contains matrix T. Array whose indexes range within [0..N-1, 0..N-1]. S - contains Schur vectors. Array whose indexes range within [0..N-1, 0..N-1]. Note 1: The block structure of matrix T can be easily recognized: since all the elements below the blocks are zeros, the elements a[i+1,i] which are equal to 0 show the block border. Note 2: The algorithm performance depends on the value of the internal parameter NS of the InternalSchurDecomposition subroutine which defines the number of shifts in the QR algorithm (similarly to the block width in block-matrix algorithms in linear algebra). If you require maximum performance on your machine, it is recommended to adjust this parameter manually. Result: True, if the algorithm has converged and parameters A and S contain the result. False, if the algorithm has not converged. Algorithm implemented on the basis of the DHSEQR subroutine (LAPACK 3.0 library). *************************************************************************/ bool rmatrixschur(real_2d_array &a, const ae_int_t n, real_2d_array &s); /************************************************************************* Algorithm for solving the following generalized symmetric positive-definite eigenproblem: A*x = lambda*B*x (1) or A*B*x = lambda*x (2) or B*A*x = lambda*x (3), where A is a symmetric matrix, B - symmetric positive-definite matrix. The problem is solved by reducing it to an ordinary symmetric eigenvalue problem.
Input parameters: A - symmetric matrix which is given by its upper or lower triangular part. Array whose indexes range within [0..N-1, 0..N-1]. N - size of matrices A and B. IsUpperA - storage format of matrix A. B - symmetric positive-definite matrix which is given by its upper or lower triangular part. Array whose indexes range within [0..N-1, 0..N-1]. IsUpperB - storage format of matrix B. ZNeeded - if ZNeeded is equal to: * 0, the eigenvectors are not returned; * 1, the eigenvectors are returned. ProblemType - if ProblemType is equal to: * 1, the following problem is solved: A*x = lambda*B*x; * 2, the following problem is solved: A*B*x = lambda*x; * 3, the following problem is solved: B*A*x = lambda*x. Output parameters: D - eigenvalues in ascending order. Array whose index ranges within [0..N-1]. Z - if ZNeeded is equal to: * 0, Z hasn't changed; * 1, Z contains eigenvectors. Array whose indexes range within [0..N-1, 0..N-1]. The eigenvectors are stored in matrix columns. It should be noted that the eigenvectors in such problems do not form an orthogonal system. Result: True, if the problem was solved successfully. False, if the error occurred during the Cholesky decomposition of matrix B (the matrix isn't positive-definite) or during the work of the iterative algorithm for solving the symmetric eigenproblem. See also the GeneralizedSymmetricDefiniteEVDReduce subroutine. -- ALGLIB -- Copyright 1.28.2006 by Bochkanov Sergey *************************************************************************/ bool smatrixgevd(const real_2d_array &a, const ae_int_t n, const bool isuppera, const real_2d_array &b, const bool isupperb, const ae_int_t zneeded, const ae_int_t problemtype, real_1d_array &d, real_2d_array &z); /************************************************************************* Algorithm for reduction of the following generalized symmetric positive- definite eigenvalue problem: A*x = lambda*B*x (1) or A*B*x = lambda*x (2) or B*A*x = lambda*x (3) to the symmetric eigenvalue problem C*y = lambda*y (eigenvalues of this and the given problems are the same, and the eigenvectors of the given problem could be obtained by multiplying the obtained eigenvectors by the transformation matrix x = R*y). Here A is a symmetric matrix, B - symmetric positive-definite matrix. Input parameters: A - symmetric matrix which is given by its upper or lower triangular part. Array whose indexes range within [0..N-1, 0..N-1]. N - size of matrices A and B. IsUpperA - storage format of matrix A. B - symmetric positive-definite matrix which is given by its upper or lower triangular part. Array whose indexes range within [0..N-1, 0..N-1]. IsUpperB - storage format of matrix B. ProblemType - if ProblemType is equal to: * 1, the following problem is solved: A*x = lambda*B*x; * 2, the following problem is solved: A*B*x = lambda*x; * 3, the following problem is solved: B*A*x = lambda*x. Output parameters: A - symmetric matrix which is given by its upper or lower triangle depending on IsUpperA. Contains matrix C. Array whose indexes range within [0..N-1, 0..N-1]. R - upper triangular or lower triangular transformation matrix which is used to obtain the eigenvectors of a given problem as the product of eigenvectors of C (from the right) and matrix R (from the left). If the matrix is upper triangular, the elements below the main diagonal are equal to 0 (and vice versa). Thus, we can perform the multiplication without taking into account the internal structure (which is an easier though less effective way).
Array whose indexes range within [0..N-1, 0..N-1]. IsUpperR - type of matrix R (upper or lower triangular). Result: True, if the problem was reduced successfully. False, if the error occurred during the Cholesky decomposition of matrix B (the matrix is not positive-definite). -- ALGLIB -- Copyright 1.28.2006 by Bochkanov Sergey *************************************************************************/ bool smatrixgevdreduce(real_2d_array &a, const ae_int_t n, const bool isuppera, const real_2d_array &b, const bool isupperb, const ae_int_t problemtype, real_2d_array &r, bool &isupperr); /************************************************************************* Inverse matrix update by the Sherman-Morrison formula The algorithm updates matrix A^-1 when adding a number to an element of matrix A. Input parameters: InvA - inverse of matrix A. Array whose indexes range within [0..N-1, 0..N-1]. N - size of matrix A. UpdRow - row where the element to be updated is stored. UpdColumn - column where the element to be updated is stored. UpdVal - a number to be added to the element. Output parameters: InvA - inverse of modified matrix A. -- ALGLIB -- Copyright 2005 by Bochkanov Sergey *************************************************************************/ void rmatrixinvupdatesimple(real_2d_array &inva, const ae_int_t n, const ae_int_t updrow, const ae_int_t updcolumn, const double updval); /************************************************************************* Inverse matrix update by the Sherman-Morrison formula The algorithm updates matrix A^-1 when adding a vector to a row of matrix A. Input parameters: InvA - inverse of matrix A. Array whose indexes range within [0..N-1, 0..N-1]. N - size of matrix A. UpdRow - the row of A whose vector V was added. 0 <= Row <= N-1 V - the vector to be added to a row. Array whose index ranges within [0..N-1]. Output parameters: InvA - inverse of modified matrix A. -- ALGLIB -- Copyright 2005 by Bochkanov Sergey *************************************************************************/ void rmatrixinvupdaterow(real_2d_array &inva, const ae_int_t n, const ae_int_t updrow, const real_1d_array &v); /************************************************************************* Inverse matrix update by the Sherman-Morrison formula The algorithm updates matrix A^-1 when adding a vector to a column of matrix A. Input parameters: InvA - inverse of matrix A. Array whose indexes range within [0..N-1, 0..N-1]. N - size of matrix A. UpdColumn - the column of A whose vector U was added. 0 <= UpdColumn <= N-1 U - the vector to be added to a column. Array whose index ranges within [0..N-1]. Output parameters: InvA - inverse of modified matrix A. -- ALGLIB -- Copyright 2005 by Bochkanov Sergey *************************************************************************/ void rmatrixinvupdatecolumn(real_2d_array &inva, const ae_int_t n, const ae_int_t updcolumn, const real_1d_array &u); /************************************************************************* Inverse matrix update by the Sherman-Morrison formula The algorithm computes the inverse of matrix A+u*v' by using the given matrix A^-1 and the vectors u and v. Input parameters: InvA - inverse of matrix A. Array whose indexes range within [0..N-1, 0..N-1]. N - size of matrix A. U - the vector modifying the matrix. Array whose index ranges within [0..N-1]. V - the vector modifying the matrix. Array whose index ranges within [0..N-1]. Output parameters: InvA - inverse of matrix A + u*v'. 
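A minimal sketch for the rank-1 update above, using rmatrixinvupdateuv() as declared below: starting from A = I (so InvA = I) and adding u*v' with u=(1,0)', v=(0,1)', the updated inverse should equal the inverse of [[1,1],[0,1]], i.e. [[1,-1],[0,1]].

#include <cstdio>
#include "linalg.h"
using namespace alglib;

int main()
{
    real_2d_array inva = "[[1,0],[0,1]]";     // inverse of A = I
    real_1d_array u = "[1,0]";
    real_1d_array v = "[0,1]";
    rmatrixinvupdateuv(inva, 2, u, v);        // InvA becomes inverse of A + u*v'
    printf("%s\n", inva.tostring(3).c_str()); // expected: [[1.000,-1.000],[0.000,1.000]]
    return 0;
}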
-- ALGLIB -- Copyright 2005 by Bochkanov Sergey *************************************************************************/ void rmatrixinvupdateuv(real_2d_array &inva, const ae_int_t n, const real_1d_array &u, const real_1d_array &v); /************************************************************************* Determinant calculation of the matrix given by its LU decomposition. Input parameters: A - LU decomposition of the matrix (output of RMatrixLU subroutine). Pivots - table of permutations which were made during the LU decomposition. Output of RMatrixLU subroutine. N - (optional) size of matrix A: * if given, only principal NxN submatrix is processed and overwritten. other elements are unchanged. * if not given, automatically determined from matrix size (A must be square matrix) Result: matrix determinant. -- ALGLIB -- Copyright 2005 by Bochkanov Sergey *************************************************************************/ double rmatrixludet(const real_2d_array &a, const integer_1d_array &pivots, const ae_int_t n); double rmatrixludet(const real_2d_array &a, const integer_1d_array &pivots); /************************************************************************* Calculation of the determinant of a general matrix Input parameters: A - matrix, array[0..N-1, 0..N-1] N - (optional) size of matrix A: * if given, only principal NxN submatrix is processed and overwritten. other elements are unchanged. * if not given, automatically determined from matrix size (A must be square matrix) Result: determinant of matrix A. -- ALGLIB -- Copyright 2005 by Bochkanov Sergey *************************************************************************/ double rmatrixdet(const real_2d_array &a, const ae_int_t n); double rmatrixdet(const real_2d_array &a); /************************************************************************* Determinant calculation of the matrix given by its LU decomposition. Input parameters: A - LU decomposition of the matrix (output of RMatrixLU subroutine). Pivots - table of permutations which were made during the LU decomposition. Output of RMatrixLU subroutine. N - (optional) size of matrix A: * if given, only principal NxN submatrix is processed and overwritten. other elements are unchanged. * if not given, automatically determined from matrix size (A must be square matrix) Result: matrix determinant. -- ALGLIB -- Copyright 2005 by Bochkanov Sergey *************************************************************************/ alglib::complex cmatrixludet(const complex_2d_array &a, const integer_1d_array &pivots, const ae_int_t n); alglib::complex cmatrixludet(const complex_2d_array &a, const integer_1d_array &pivots); /************************************************************************* Calculation of the determinant of a general matrix Input parameters: A - matrix, array[0..N-1, 0..N-1] N - (optional) size of matrix A: * if given, only principal NxN submatrix is processed and overwritten. other elements are unchanged. * if not given, automatically determined from matrix size (A must be square matrix) Result: determinant of matrix A. -- ALGLIB -- Copyright 2005 by Bochkanov Sergey *************************************************************************/ alglib::complex cmatrixdet(const complex_2d_array &a, const ae_int_t n); alglib::complex cmatrixdet(const complex_2d_array &a); /************************************************************************* Determinant calculation of the matrix given by the Cholesky decomposition. 
Input parameters: A - Cholesky decomposition, output of SMatrixCholesky subroutine. N - (optional) size of matrix A: * if given, only principal NxN submatrix is processed and overwritten. other elements are unchanged. * if not given, automatically determined from matrix size (A must be square matrix) As the determinant is equal to the product of squares of diagonal elements, it's not necessary to specify which triangle - lower or upper - the matrix is stored in. Result: matrix determinant. -- ALGLIB -- Copyright 2005-2008 by Bochkanov Sergey *************************************************************************/ double spdmatrixcholeskydet(const real_2d_array &a, const ae_int_t n); double spdmatrixcholeskydet(const real_2d_array &a); /************************************************************************* Determinant calculation of the symmetric positive definite matrix. Input parameters: A - matrix. Array with elements [0..N-1, 0..N-1]. N - (optional) size of matrix A: * if given, only principal NxN submatrix is processed and overwritten. other elements are unchanged. * if not given, automatically determined from matrix size (A must be square matrix) IsUpper - (optional) storage type: * if True, symmetric matrix A is given by its upper triangle, and the lower triangle isn't used/changed by function * if False, symmetric matrix A is given by its lower triangle, and the upper triangle isn't used/changed by function * if not given, both lower and upper triangles must be filled. Result: determinant of matrix A. If matrix A is not positive definite, exception is thrown. -- ALGLIB -- Copyright 2005-2008 by Bochkanov Sergey *************************************************************************/ double spdmatrixdet(const real_2d_array &a, const ae_int_t n, const bool isupper); double spdmatrixdet(const real_2d_array &a); } ///////////////////////////////////////////////////////////////////////// // // THIS SECTION CONTAINS COMPUTATIONAL CORE DECLARATIONS (FUNCTIONS) // ///////////////////////////////////////////////////////////////////////// namespace alglib_impl { void sparsecreate(ae_int_t m, ae_int_t n, ae_int_t k, sparsematrix* s, ae_state *_state); void sparsecreatebuf(ae_int_t m, ae_int_t n, ae_int_t k, sparsematrix* s, ae_state *_state); void sparsecreatecrs(ae_int_t m, ae_int_t n, /* Integer */ ae_vector* ner, sparsematrix* s, ae_state *_state); void sparsecreatecrsbuf(ae_int_t m, ae_int_t n, /* Integer */ ae_vector* ner, sparsematrix* s, ae_state *_state); void sparsecreatesks(ae_int_t m, ae_int_t n, /* Integer */ ae_vector* d, /* Integer */ ae_vector* u, sparsematrix* s, ae_state *_state); void sparsecreatesksbuf(ae_int_t m, ae_int_t n, /* Integer */ ae_vector* d, /* Integer */ ae_vector* u, sparsematrix* s, ae_state *_state); void sparsecopy(sparsematrix* s0, sparsematrix* s1, ae_state *_state); void sparsecopybuf(sparsematrix* s0, sparsematrix* s1, ae_state *_state); void sparseswap(sparsematrix* s0, sparsematrix* s1, ae_state *_state); void sparseadd(sparsematrix* s, ae_int_t i, ae_int_t j, double v, ae_state *_state); void sparseset(sparsematrix* s, ae_int_t i, ae_int_t j, double v, ae_state *_state); double sparseget(sparsematrix* s, ae_int_t i, ae_int_t j, ae_state *_state); double sparsegetdiagonal(sparsematrix* s, ae_int_t i, ae_state *_state); void sparsemv(sparsematrix* s, /* Real */ ae_vector* x, /* Real */ ae_vector* y, ae_state *_state); void sparsemtv(sparsematrix* s, /* Real */ ae_vector* x, /* Real */ ae_vector* y, ae_state *_state); void sparsemv2(sparsematrix* 
s, /* Real */ ae_vector* x, /* Real */ ae_vector* y0, /* Real */ ae_vector* y1, ae_state *_state); void sparsesmv(sparsematrix* s, ae_bool isupper, /* Real */ ae_vector* x, /* Real */ ae_vector* y, ae_state *_state); double sparsevsmv(sparsematrix* s, ae_bool isupper, /* Real */ ae_vector* x, ae_state *_state); void sparsemm(sparsematrix* s, /* Real */ ae_matrix* a, ae_int_t k, /* Real */ ae_matrix* b, ae_state *_state); void sparsemtm(sparsematrix* s, /* Real */ ae_matrix* a, ae_int_t k, /* Real */ ae_matrix* b, ae_state *_state); void sparsemm2(sparsematrix* s, /* Real */ ae_matrix* a, ae_int_t k, /* Real */ ae_matrix* b0, /* Real */ ae_matrix* b1, ae_state *_state); void sparsesmm(sparsematrix* s, ae_bool isupper, /* Real */ ae_matrix* a, ae_int_t k, /* Real */ ae_matrix* b, ae_state *_state); void sparsetrmv(sparsematrix* s, ae_bool isupper, ae_bool isunit, ae_int_t optype, /* Real */ ae_vector* x, /* Real */ ae_vector* y, ae_state *_state); void sparsetrsv(sparsematrix* s, ae_bool isupper, ae_bool isunit, ae_int_t optype, /* Real */ ae_vector* x, ae_state *_state); void sparseresizematrix(sparsematrix* s, ae_state *_state); double sparsegetaveragelengthofchain(sparsematrix* s, ae_state *_state); ae_bool sparseenumerate(sparsematrix* s, ae_int_t* t0, ae_int_t* t1, ae_int_t* i, ae_int_t* j, double* v, ae_state *_state); ae_bool sparserewriteexisting(sparsematrix* s, ae_int_t i, ae_int_t j, double v, ae_state *_state); void sparsegetrow(sparsematrix* s, ae_int_t i, /* Real */ ae_vector* irow, ae_state *_state); void sparsegetcompressedrow(sparsematrix* s, ae_int_t i, /* Integer */ ae_vector* colidx, /* Real */ ae_vector* vals, ae_int_t* nzcnt, ae_state *_state); void sparsetransposesks(sparsematrix* s, ae_state *_state); void sparseconvertto(sparsematrix* s0, ae_int_t fmt, ae_state *_state); void sparsecopytobuf(sparsematrix* s0, ae_int_t fmt, sparsematrix* s1, ae_state *_state); void sparseconverttohash(sparsematrix* s, ae_state *_state); void sparsecopytohash(sparsematrix* s0, sparsematrix* s1, ae_state *_state); void sparsecopytohashbuf(sparsematrix* s0, sparsematrix* s1, ae_state *_state); void sparseconverttocrs(sparsematrix* s, ae_state *_state); void sparsecopytocrs(sparsematrix* s0, sparsematrix* s1, ae_state *_state); void sparsecopytocrsbuf(sparsematrix* s0, sparsematrix* s1, ae_state *_state); void sparseconverttosks(sparsematrix* s, ae_state *_state); void sparsecopytosks(sparsematrix* s0, sparsematrix* s1, ae_state *_state); void sparsecopytosksbuf(sparsematrix* s0, sparsematrix* s1, ae_state *_state); void sparsecreatecrsinplace(sparsematrix* s, ae_state *_state); ae_int_t sparsegetmatrixtype(sparsematrix* s, ae_state *_state); ae_bool sparseishash(sparsematrix* s, ae_state *_state); ae_bool sparseiscrs(sparsematrix* s, ae_state *_state); ae_bool sparseissks(sparsematrix* s, ae_state *_state); void sparsefree(sparsematrix* s, ae_state *_state); ae_int_t sparsegetnrows(sparsematrix* s, ae_state *_state); ae_int_t sparsegetncols(sparsematrix* s, ae_state *_state); ae_int_t sparsegetuppercount(sparsematrix* s, ae_state *_state); ae_int_t sparsegetlowercount(sparsematrix* s, ae_state *_state); void _sparsematrix_init(void* _p, ae_state *_state); void _sparsematrix_init_copy(void* _dst, void* _src, ae_state *_state); void _sparsematrix_clear(void* _p); void _sparsematrix_destroy(void* _p); void _sparsebuffers_init(void* _p, ae_state *_state); void _sparsebuffers_init_copy(void* _dst, void* _src, ae_state *_state); void _sparsebuffers_clear(void* _p); void 
_sparsebuffers_destroy(void* _p); void rmatrixrndorthogonal(ae_int_t n, /* Real */ ae_matrix* a, ae_state *_state); void rmatrixrndcond(ae_int_t n, double c, /* Real */ ae_matrix* a, ae_state *_state); void cmatrixrndorthogonal(ae_int_t n, /* Complex */ ae_matrix* a, ae_state *_state); void cmatrixrndcond(ae_int_t n, double c, /* Complex */ ae_matrix* a, ae_state *_state); void smatrixrndcond(ae_int_t n, double c, /* Real */ ae_matrix* a, ae_state *_state); void spdmatrixrndcond(ae_int_t n, double c, /* Real */ ae_matrix* a, ae_state *_state); void hmatrixrndcond(ae_int_t n, double c, /* Complex */ ae_matrix* a, ae_state *_state); void hpdmatrixrndcond(ae_int_t n, double c, /* Complex */ ae_matrix* a, ae_state *_state); void rmatrixrndorthogonalfromtheright(/* Real */ ae_matrix* a, ae_int_t m, ae_int_t n, ae_state *_state); void rmatrixrndorthogonalfromtheleft(/* Real */ ae_matrix* a, ae_int_t m, ae_int_t n, ae_state *_state); void cmatrixrndorthogonalfromtheright(/* Complex */ ae_matrix* a, ae_int_t m, ae_int_t n, ae_state *_state); void cmatrixrndorthogonalfromtheleft(/* Complex */ ae_matrix* a, ae_int_t m, ae_int_t n, ae_state *_state); void smatrixrndmultiply(/* Real */ ae_matrix* a, ae_int_t n, ae_state *_state); void hmatrixrndmultiply(/* Complex */ ae_matrix* a, ae_int_t n, ae_state *_state); void ablassplitlength(/* Real */ ae_matrix* a, ae_int_t n, ae_int_t* n1, ae_int_t* n2, ae_state *_state); void ablascomplexsplitlength(/* Complex */ ae_matrix* a, ae_int_t n, ae_int_t* n1, ae_int_t* n2, ae_state *_state); ae_int_t ablasblocksize(/* Real */ ae_matrix* a, ae_state *_state); ae_int_t ablascomplexblocksize(/* Complex */ ae_matrix* a, ae_state *_state); ae_int_t ablasmicroblocksize(ae_state *_state); void cmatrixtranspose(ae_int_t m, ae_int_t n, /* Complex */ ae_matrix* a, ae_int_t ia, ae_int_t ja, /* Complex */ ae_matrix* b, ae_int_t ib, ae_int_t jb, ae_state *_state); void rmatrixtranspose(ae_int_t m, ae_int_t n, /* Real */ ae_matrix* a, ae_int_t ia, ae_int_t ja, /* Real */ ae_matrix* b, ae_int_t ib, ae_int_t jb, ae_state *_state); void rmatrixenforcesymmetricity(/* Real */ ae_matrix* a, ae_int_t n, ae_bool isupper, ae_state *_state); void cmatrixcopy(ae_int_t m, ae_int_t n, /* Complex */ ae_matrix* a, ae_int_t ia, ae_int_t ja, /* Complex */ ae_matrix* b, ae_int_t ib, ae_int_t jb, ae_state *_state); void rmatrixcopy(ae_int_t m, ae_int_t n, /* Real */ ae_matrix* a, ae_int_t ia, ae_int_t ja, /* Real */ ae_matrix* b, ae_int_t ib, ae_int_t jb, ae_state *_state); void cmatrixrank1(ae_int_t m, ae_int_t n, /* Complex */ ae_matrix* a, ae_int_t ia, ae_int_t ja, /* Complex */ ae_vector* u, ae_int_t iu, /* Complex */ ae_vector* v, ae_int_t iv, ae_state *_state); void rmatrixrank1(ae_int_t m, ae_int_t n, /* Real */ ae_matrix* a, ae_int_t ia, ae_int_t ja, /* Real */ ae_vector* u, ae_int_t iu, /* Real */ ae_vector* v, ae_int_t iv, ae_state *_state); void cmatrixmv(ae_int_t m, ae_int_t n, /* Complex */ ae_matrix* a, ae_int_t ia, ae_int_t ja, ae_int_t opa, /* Complex */ ae_vector* x, ae_int_t ix, /* Complex */ ae_vector* y, ae_int_t iy, ae_state *_state); void rmatrixmv(ae_int_t m, ae_int_t n, /* Real */ ae_matrix* a, ae_int_t ia, ae_int_t ja, ae_int_t opa, /* Real */ ae_vector* x, ae_int_t ix, /* Real */ ae_vector* y, ae_int_t iy, ae_state *_state); void cmatrixrighttrsm(ae_int_t m, ae_int_t n, /* Complex */ ae_matrix* a, ae_int_t i1, ae_int_t j1, ae_bool isupper, ae_bool isunit, ae_int_t optype, /* Complex */ ae_matrix* x, ae_int_t i2, ae_int_t j2, ae_state *_state); void 
_pexec_cmatrixrighttrsm(ae_int_t m, ae_int_t n, /* Complex */ ae_matrix* a, ae_int_t i1, ae_int_t j1, ae_bool isupper, ae_bool isunit, ae_int_t optype, /* Complex */ ae_matrix* x, ae_int_t i2, ae_int_t j2, ae_state *_state); void cmatrixlefttrsm(ae_int_t m, ae_int_t n, /* Complex */ ae_matrix* a, ae_int_t i1, ae_int_t j1, ae_bool isupper, ae_bool isunit, ae_int_t optype, /* Complex */ ae_matrix* x, ae_int_t i2, ae_int_t j2, ae_state *_state); void _pexec_cmatrixlefttrsm(ae_int_t m, ae_int_t n, /* Complex */ ae_matrix* a, ae_int_t i1, ae_int_t j1, ae_bool isupper, ae_bool isunit, ae_int_t optype, /* Complex */ ae_matrix* x, ae_int_t i2, ae_int_t j2, ae_state *_state); void rmatrixrighttrsm(ae_int_t m, ae_int_t n, /* Real */ ae_matrix* a, ae_int_t i1, ae_int_t j1, ae_bool isupper, ae_bool isunit, ae_int_t optype, /* Real */ ae_matrix* x, ae_int_t i2, ae_int_t j2, ae_state *_state); void _pexec_rmatrixrighttrsm(ae_int_t m, ae_int_t n, /* Real */ ae_matrix* a, ae_int_t i1, ae_int_t j1, ae_bool isupper, ae_bool isunit, ae_int_t optype, /* Real */ ae_matrix* x, ae_int_t i2, ae_int_t j2, ae_state *_state); void rmatrixlefttrsm(ae_int_t m, ae_int_t n, /* Real */ ae_matrix* a, ae_int_t i1, ae_int_t j1, ae_bool isupper, ae_bool isunit, ae_int_t optype, /* Real */ ae_matrix* x, ae_int_t i2, ae_int_t j2, ae_state *_state); void _pexec_rmatrixlefttrsm(ae_int_t m, ae_int_t n, /* Real */ ae_matrix* a, ae_int_t i1, ae_int_t j1, ae_bool isupper, ae_bool isunit, ae_int_t optype, /* Real */ ae_matrix* x, ae_int_t i2, ae_int_t j2, ae_state *_state); void cmatrixherk(ae_int_t n, ae_int_t k, double alpha, /* Complex */ ae_matrix* a, ae_int_t ia, ae_int_t ja, ae_int_t optypea, double beta, /* Complex */ ae_matrix* c, ae_int_t ic, ae_int_t jc, ae_bool isupper, ae_state *_state); void _pexec_cmatrixherk(ae_int_t n, ae_int_t k, double alpha, /* Complex */ ae_matrix* a, ae_int_t ia, ae_int_t ja, ae_int_t optypea, double beta, /* Complex */ ae_matrix* c, ae_int_t ic, ae_int_t jc, ae_bool isupper, ae_state *_state); void rmatrixsyrk(ae_int_t n, ae_int_t k, double alpha, /* Real */ ae_matrix* a, ae_int_t ia, ae_int_t ja, ae_int_t optypea, double beta, /* Real */ ae_matrix* c, ae_int_t ic, ae_int_t jc, ae_bool isupper, ae_state *_state); void _pexec_rmatrixsyrk(ae_int_t n, ae_int_t k, double alpha, /* Real */ ae_matrix* a, ae_int_t ia, ae_int_t ja, ae_int_t optypea, double beta, /* Real */ ae_matrix* c, ae_int_t ic, ae_int_t jc, ae_bool isupper, ae_state *_state); void cmatrixgemm(ae_int_t m, ae_int_t n, ae_int_t k, ae_complex alpha, /* Complex */ ae_matrix* a, ae_int_t ia, ae_int_t ja, ae_int_t optypea, /* Complex */ ae_matrix* b, ae_int_t ib, ae_int_t jb, ae_int_t optypeb, ae_complex beta, /* Complex */ ae_matrix* c, ae_int_t ic, ae_int_t jc, ae_state *_state); void _pexec_cmatrixgemm(ae_int_t m, ae_int_t n, ae_int_t k, ae_complex alpha, /* Complex */ ae_matrix* a, ae_int_t ia, ae_int_t ja, ae_int_t optypea, /* Complex */ ae_matrix* b, ae_int_t ib, ae_int_t jb, ae_int_t optypeb, ae_complex beta, /* Complex */ ae_matrix* c, ae_int_t ic, ae_int_t jc, ae_state *_state); void rmatrixgemm(ae_int_t m, ae_int_t n, ae_int_t k, double alpha, /* Real */ ae_matrix* a, ae_int_t ia, ae_int_t ja, ae_int_t optypea, /* Real */ ae_matrix* b, ae_int_t ib, ae_int_t jb, ae_int_t optypeb, double beta, /* Real */ ae_matrix* c, ae_int_t ic, ae_int_t jc, ae_state *_state); void _pexec_rmatrixgemm(ae_int_t m, ae_int_t n, ae_int_t k, double alpha, /* Real */ ae_matrix* a, ae_int_t ia, ae_int_t ja, ae_int_t optypea, /* Real */ ae_matrix* b, 
ae_int_t ib, ae_int_t jb, ae_int_t optypeb, double beta, /* Real */ ae_matrix* c, ae_int_t ic, ae_int_t jc, ae_state *_state); void cmatrixsyrk(ae_int_t n, ae_int_t k, double alpha, /* Complex */ ae_matrix* a, ae_int_t ia, ae_int_t ja, ae_int_t optypea, double beta, /* Complex */ ae_matrix* c, ae_int_t ic, ae_int_t jc, ae_bool isupper, ae_state *_state); void _pexec_cmatrixsyrk(ae_int_t n, ae_int_t k, double alpha, /* Complex */ ae_matrix* a, ae_int_t ia, ae_int_t ja, ae_int_t optypea, double beta, /* Complex */ ae_matrix* c, ae_int_t ic, ae_int_t jc, ae_bool isupper, ae_state *_state); void rmatrixlu(/* Real */ ae_matrix* a, ae_int_t m, ae_int_t n, /* Integer */ ae_vector* pivots, ae_state *_state); void _pexec_rmatrixlu(/* Real */ ae_matrix* a, ae_int_t m, ae_int_t n, /* Integer */ ae_vector* pivots, ae_state *_state); void cmatrixlu(/* Complex */ ae_matrix* a, ae_int_t m, ae_int_t n, /* Integer */ ae_vector* pivots, ae_state *_state); void _pexec_cmatrixlu(/* Complex */ ae_matrix* a, ae_int_t m, ae_int_t n, /* Integer */ ae_vector* pivots, ae_state *_state); ae_bool hpdmatrixcholesky(/* Complex */ ae_matrix* a, ae_int_t n, ae_bool isupper, ae_state *_state); ae_bool _pexec_hpdmatrixcholesky(/* Complex */ ae_matrix* a, ae_int_t n, ae_bool isupper, ae_state *_state); ae_bool spdmatrixcholesky(/* Real */ ae_matrix* a, ae_int_t n, ae_bool isupper, ae_state *_state); ae_bool _pexec_spdmatrixcholesky(/* Real */ ae_matrix* a, ae_int_t n, ae_bool isupper, ae_state *_state); void spdmatrixcholeskyupdateadd1(/* Real */ ae_matrix* a, ae_int_t n, ae_bool isupper, /* Real */ ae_vector* u, ae_state *_state); void spdmatrixcholeskyupdatefix(/* Real */ ae_matrix* a, ae_int_t n, ae_bool isupper, /* Boolean */ ae_vector* fix, ae_state *_state); void spdmatrixcholeskyupdateadd1buf(/* Real */ ae_matrix* a, ae_int_t n, ae_bool isupper, /* Real */ ae_vector* u, /* Real */ ae_vector* bufr, ae_state *_state); void spdmatrixcholeskyupdatefixbuf(/* Real */ ae_matrix* a, ae_int_t n, ae_bool isupper, /* Boolean */ ae_vector* fix, /* Real */ ae_vector* bufr, ae_state *_state); ae_bool sparsecholeskyskyline(sparsematrix* a, ae_int_t n, ae_bool isupper, ae_state *_state); ae_bool sparsecholeskyx(sparsematrix* a, ae_int_t n, ae_bool isupper, /* Integer */ ae_vector* p0, /* Integer */ ae_vector* p1, ae_int_t ordering, ae_int_t algo, ae_int_t fmt, sparsebuffers* buf, sparsematrix* c, ae_state *_state); void rmatrixlup(/* Real */ ae_matrix* a, ae_int_t m, ae_int_t n, /* Integer */ ae_vector* pivots, ae_state *_state); void cmatrixlup(/* Complex */ ae_matrix* a, ae_int_t m, ae_int_t n, /* Integer */ ae_vector* pivots, ae_state *_state); void rmatrixplu(/* Real */ ae_matrix* a, ae_int_t m, ae_int_t n, /* Integer */ ae_vector* pivots, ae_state *_state); void cmatrixplu(/* Complex */ ae_matrix* a, ae_int_t m, ae_int_t n, /* Integer */ ae_vector* pivots, ae_state *_state); ae_bool spdmatrixcholeskyrec(/* Real */ ae_matrix* a, ae_int_t offs, ae_int_t n, ae_bool isupper, /* Real */ ae_vector* tmp, ae_state *_state); double rmatrixrcond1(/* Real */ ae_matrix* a, ae_int_t n, ae_state *_state); double rmatrixrcondinf(/* Real */ ae_matrix* a, ae_int_t n, ae_state *_state); double spdmatrixrcond(/* Real */ ae_matrix* a, ae_int_t n, ae_bool isupper, ae_state *_state); double rmatrixtrrcond1(/* Real */ ae_matrix* a, ae_int_t n, ae_bool isupper, ae_bool isunit, ae_state *_state); double rmatrixtrrcondinf(/* Real */ ae_matrix* a, ae_int_t n, ae_bool isupper, ae_bool isunit, ae_state *_state); double hpdmatrixrcond(/* Complex */ 
ae_matrix* a, ae_int_t n, ae_bool isupper, ae_state *_state); double cmatrixrcond1(/* Complex */ ae_matrix* a, ae_int_t n, ae_state *_state); double cmatrixrcondinf(/* Complex */ ae_matrix* a, ae_int_t n, ae_state *_state); double rmatrixlurcond1(/* Real */ ae_matrix* lua, ae_int_t n, ae_state *_state); double rmatrixlurcondinf(/* Real */ ae_matrix* lua, ae_int_t n, ae_state *_state); double spdmatrixcholeskyrcond(/* Real */ ae_matrix* a, ae_int_t n, ae_bool isupper, ae_state *_state); double hpdmatrixcholeskyrcond(/* Complex */ ae_matrix* a, ae_int_t n, ae_bool isupper, ae_state *_state); double cmatrixlurcond1(/* Complex */ ae_matrix* lua, ae_int_t n, ae_state *_state); double cmatrixlurcondinf(/* Complex */ ae_matrix* lua, ae_int_t n, ae_state *_state); double cmatrixtrrcond1(/* Complex */ ae_matrix* a, ae_int_t n, ae_bool isupper, ae_bool isunit, ae_state *_state); double cmatrixtrrcondinf(/* Complex */ ae_matrix* a, ae_int_t n, ae_bool isupper, ae_bool isunit, ae_state *_state); double rcondthreshold(ae_state *_state); void rmatrixluinverse(/* Real */ ae_matrix* a, /* Integer */ ae_vector* pivots, ae_int_t n, ae_int_t* info, matinvreport* rep, ae_state *_state); void _pexec_rmatrixluinverse(/* Real */ ae_matrix* a, /* Integer */ ae_vector* pivots, ae_int_t n, ae_int_t* info, matinvreport* rep, ae_state *_state); void rmatrixinverse(/* Real */ ae_matrix* a, ae_int_t n, ae_int_t* info, matinvreport* rep, ae_state *_state); void _pexec_rmatrixinverse(/* Real */ ae_matrix* a, ae_int_t n, ae_int_t* info, matinvreport* rep, ae_state *_state); void cmatrixluinverse(/* Complex */ ae_matrix* a, /* Integer */ ae_vector* pivots, ae_int_t n, ae_int_t* info, matinvreport* rep, ae_state *_state); void _pexec_cmatrixluinverse(/* Complex */ ae_matrix* a, /* Integer */ ae_vector* pivots, ae_int_t n, ae_int_t* info, matinvreport* rep, ae_state *_state); void cmatrixinverse(/* Complex */ ae_matrix* a, ae_int_t n, ae_int_t* info, matinvreport* rep, ae_state *_state); void _pexec_cmatrixinverse(/* Complex */ ae_matrix* a, ae_int_t n, ae_int_t* info, matinvreport* rep, ae_state *_state); void spdmatrixcholeskyinverse(/* Real */ ae_matrix* a, ae_int_t n, ae_bool isupper, ae_int_t* info, matinvreport* rep, ae_state *_state); void _pexec_spdmatrixcholeskyinverse(/* Real */ ae_matrix* a, ae_int_t n, ae_bool isupper, ae_int_t* info, matinvreport* rep, ae_state *_state); void spdmatrixinverse(/* Real */ ae_matrix* a, ae_int_t n, ae_bool isupper, ae_int_t* info, matinvreport* rep, ae_state *_state); void _pexec_spdmatrixinverse(/* Real */ ae_matrix* a, ae_int_t n, ae_bool isupper, ae_int_t* info, matinvreport* rep, ae_state *_state); void hpdmatrixcholeskyinverse(/* Complex */ ae_matrix* a, ae_int_t n, ae_bool isupper, ae_int_t* info, matinvreport* rep, ae_state *_state); void _pexec_hpdmatrixcholeskyinverse(/* Complex */ ae_matrix* a, ae_int_t n, ae_bool isupper, ae_int_t* info, matinvreport* rep, ae_state *_state); void hpdmatrixinverse(/* Complex */ ae_matrix* a, ae_int_t n, ae_bool isupper, ae_int_t* info, matinvreport* rep, ae_state *_state); void _pexec_hpdmatrixinverse(/* Complex */ ae_matrix* a, ae_int_t n, ae_bool isupper, ae_int_t* info, matinvreport* rep, ae_state *_state); void rmatrixtrinverse(/* Real */ ae_matrix* a, ae_int_t n, ae_bool isupper, ae_bool isunit, ae_int_t* info, matinvreport* rep, ae_state *_state); void _pexec_rmatrixtrinverse(/* Real */ ae_matrix* a, ae_int_t n, ae_bool isupper, ae_bool isunit, ae_int_t* info, matinvreport* rep, ae_state *_state); void cmatrixtrinverse(/* Complex */ 
ae_matrix* a, ae_int_t n, ae_bool isupper, ae_bool isunit, ae_int_t* info, matinvreport* rep, ae_state *_state); void _pexec_cmatrixtrinverse(/* Complex */ ae_matrix* a, ae_int_t n, ae_bool isupper, ae_bool isunit, ae_int_t* info, matinvreport* rep, ae_state *_state); void spdmatrixcholeskyinverserec(/* Real */ ae_matrix* a, ae_int_t offs, ae_int_t n, ae_bool isupper, /* Real */ ae_vector* tmp, ae_state *_state); void _matinvreport_init(void* _p, ae_state *_state); void _matinvreport_init_copy(void* _dst, void* _src, ae_state *_state); void _matinvreport_clear(void* _p); void _matinvreport_destroy(void* _p); void rmatrixqr(/* Real */ ae_matrix* a, ae_int_t m, ae_int_t n, /* Real */ ae_vector* tau, ae_state *_state); void _pexec_rmatrixqr(/* Real */ ae_matrix* a, ae_int_t m, ae_int_t n, /* Real */ ae_vector* tau, ae_state *_state); void rmatrixlq(/* Real */ ae_matrix* a, ae_int_t m, ae_int_t n, /* Real */ ae_vector* tau, ae_state *_state); void _pexec_rmatrixlq(/* Real */ ae_matrix* a, ae_int_t m, ae_int_t n, /* Real */ ae_vector* tau, ae_state *_state); void cmatrixqr(/* Complex */ ae_matrix* a, ae_int_t m, ae_int_t n, /* Complex */ ae_vector* tau, ae_state *_state); void _pexec_cmatrixqr(/* Complex */ ae_matrix* a, ae_int_t m, ae_int_t n, /* Complex */ ae_vector* tau, ae_state *_state); void cmatrixlq(/* Complex */ ae_matrix* a, ae_int_t m, ae_int_t n, /* Complex */ ae_vector* tau, ae_state *_state); void _pexec_cmatrixlq(/* Complex */ ae_matrix* a, ae_int_t m, ae_int_t n, /* Complex */ ae_vector* tau, ae_state *_state); void rmatrixqrunpackq(/* Real */ ae_matrix* a, ae_int_t m, ae_int_t n, /* Real */ ae_vector* tau, ae_int_t qcolumns, /* Real */ ae_matrix* q, ae_state *_state); void _pexec_rmatrixqrunpackq(/* Real */ ae_matrix* a, ae_int_t m, ae_int_t n, /* Real */ ae_vector* tau, ae_int_t qcolumns, /* Real */ ae_matrix* q, ae_state *_state); void rmatrixqrunpackr(/* Real */ ae_matrix* a, ae_int_t m, ae_int_t n, /* Real */ ae_matrix* r, ae_state *_state); void rmatrixlqunpackq(/* Real */ ae_matrix* a, ae_int_t m, ae_int_t n, /* Real */ ae_vector* tau, ae_int_t qrows, /* Real */ ae_matrix* q, ae_state *_state); void _pexec_rmatrixlqunpackq(/* Real */ ae_matrix* a, ae_int_t m, ae_int_t n, /* Real */ ae_vector* tau, ae_int_t qrows, /* Real */ ae_matrix* q, ae_state *_state); void rmatrixlqunpackl(/* Real */ ae_matrix* a, ae_int_t m, ae_int_t n, /* Real */ ae_matrix* l, ae_state *_state); void cmatrixqrunpackq(/* Complex */ ae_matrix* a, ae_int_t m, ae_int_t n, /* Complex */ ae_vector* tau, ae_int_t qcolumns, /* Complex */ ae_matrix* q, ae_state *_state); void _pexec_cmatrixqrunpackq(/* Complex */ ae_matrix* a, ae_int_t m, ae_int_t n, /* Complex */ ae_vector* tau, ae_int_t qcolumns, /* Complex */ ae_matrix* q, ae_state *_state); void cmatrixqrunpackr(/* Complex */ ae_matrix* a, ae_int_t m, ae_int_t n, /* Complex */ ae_matrix* r, ae_state *_state); void cmatrixlqunpackq(/* Complex */ ae_matrix* a, ae_int_t m, ae_int_t n, /* Complex */ ae_vector* tau, ae_int_t qrows, /* Complex */ ae_matrix* q, ae_state *_state); void _pexec_cmatrixlqunpackq(/* Complex */ ae_matrix* a, ae_int_t m, ae_int_t n, /* Complex */ ae_vector* tau, ae_int_t qrows, /* Complex */ ae_matrix* q, ae_state *_state); void cmatrixlqunpackl(/* Complex */ ae_matrix* a, ae_int_t m, ae_int_t n, /* Complex */ ae_matrix* l, ae_state *_state); void rmatrixqrbasecase(/* Real */ ae_matrix* a, ae_int_t m, ae_int_t n, /* Real */ ae_vector* work, /* Real */ ae_vector* t, /* Real */ ae_vector* tau, ae_state *_state); void 
rmatrixlqbasecase(/* Real */ ae_matrix* a, ae_int_t m, ae_int_t n, /* Real */ ae_vector* work, /* Real */ ae_vector* t, /* Real */ ae_vector* tau, ae_state *_state); void rmatrixbd(/* Real */ ae_matrix* a, ae_int_t m, ae_int_t n, /* Real */ ae_vector* tauq, /* Real */ ae_vector* taup, ae_state *_state); void rmatrixbdunpackq(/* Real */ ae_matrix* qp, ae_int_t m, ae_int_t n, /* Real */ ae_vector* tauq, ae_int_t qcolumns, /* Real */ ae_matrix* q, ae_state *_state); void rmatrixbdmultiplybyq(/* Real */ ae_matrix* qp, ae_int_t m, ae_int_t n, /* Real */ ae_vector* tauq, /* Real */ ae_matrix* z, ae_int_t zrows, ae_int_t zcolumns, ae_bool fromtheright, ae_bool dotranspose, ae_state *_state); void rmatrixbdunpackpt(/* Real */ ae_matrix* qp, ae_int_t m, ae_int_t n, /* Real */ ae_vector* taup, ae_int_t ptrows, /* Real */ ae_matrix* pt, ae_state *_state); void rmatrixbdmultiplybyp(/* Real */ ae_matrix* qp, ae_int_t m, ae_int_t n, /* Real */ ae_vector* taup, /* Real */ ae_matrix* z, ae_int_t zrows, ae_int_t zcolumns, ae_bool fromtheright, ae_bool dotranspose, ae_state *_state); void rmatrixbdunpackdiagonals(/* Real */ ae_matrix* b, ae_int_t m, ae_int_t n, ae_bool* isupper, /* Real */ ae_vector* d, /* Real */ ae_vector* e, ae_state *_state); void rmatrixhessenberg(/* Real */ ae_matrix* a, ae_int_t n, /* Real */ ae_vector* tau, ae_state *_state); void rmatrixhessenbergunpackq(/* Real */ ae_matrix* a, ae_int_t n, /* Real */ ae_vector* tau, /* Real */ ae_matrix* q, ae_state *_state); void rmatrixhessenbergunpackh(/* Real */ ae_matrix* a, ae_int_t n, /* Real */ ae_matrix* h, ae_state *_state); void smatrixtd(/* Real */ ae_matrix* a, ae_int_t n, ae_bool isupper, /* Real */ ae_vector* tau, /* Real */ ae_vector* d, /* Real */ ae_vector* e, ae_state *_state); void smatrixtdunpackq(/* Real */ ae_matrix* a, ae_int_t n, ae_bool isupper, /* Real */ ae_vector* tau, /* Real */ ae_matrix* q, ae_state *_state); void hmatrixtd(/* Complex */ ae_matrix* a, ae_int_t n, ae_bool isupper, /* Complex */ ae_vector* tau, /* Real */ ae_vector* d, /* Real */ ae_vector* e, ae_state *_state); void hmatrixtdunpackq(/* Complex */ ae_matrix* a, ae_int_t n, ae_bool isupper, /* Complex */ ae_vector* tau, /* Complex */ ae_matrix* q, ae_state *_state); void fblscholeskysolve(/* Real */ ae_matrix* cha, double sqrtscalea, ae_int_t n, ae_bool isupper, /* Real */ ae_vector* xb, /* Real */ ae_vector* tmp, ae_state *_state); void fblssolvecgx(/* Real */ ae_matrix* a, ae_int_t m, ae_int_t n, double alpha, /* Real */ ae_vector* b, /* Real */ ae_vector* x, /* Real */ ae_vector* buf, ae_state *_state); void fblscgcreate(/* Real */ ae_vector* x, /* Real */ ae_vector* b, ae_int_t n, fblslincgstate* state, ae_state *_state); ae_bool fblscgiteration(fblslincgstate* state, ae_state *_state); void fblssolvels(/* Real */ ae_matrix* a, /* Real */ ae_vector* b, ae_int_t m, ae_int_t n, /* Real */ ae_vector* tmp0, /* Real */ ae_vector* tmp1, /* Real */ ae_vector* tmp2, ae_state *_state); void _fblslincgstate_init(void* _p, ae_state *_state); void _fblslincgstate_init_copy(void* _dst, void* _src, ae_state *_state); void _fblslincgstate_clear(void* _p); void _fblslincgstate_destroy(void* _p); ae_bool rmatrixbdsvd(/* Real */ ae_vector* d, /* Real */ ae_vector* e, ae_int_t n, ae_bool isupper, ae_bool isfractionalaccuracyrequired, /* Real */ ae_matrix* u, ae_int_t nru, /* Real */ ae_matrix* c, ae_int_t ncc, /* Real */ ae_matrix* vt, ae_int_t ncvt, ae_state *_state); ae_bool bidiagonalsvddecomposition(/* Real */ ae_vector* d, /* Real */ ae_vector* e, ae_int_t n, 
ae_bool isupper, ae_bool isfractionalaccuracyrequired, /* Real */ ae_matrix* u, ae_int_t nru, /* Real */ ae_matrix* c, ae_int_t ncc, /* Real */ ae_matrix* vt, ae_int_t ncvt, ae_state *_state); ae_bool rmatrixsvd(/* Real */ ae_matrix* a, ae_int_t m, ae_int_t n, ae_int_t uneeded, ae_int_t vtneeded, ae_int_t additionalmemory, /* Real */ ae_vector* w, /* Real */ ae_matrix* u, /* Real */ ae_matrix* vt, ae_state *_state); ae_bool _pexec_rmatrixsvd(/* Real */ ae_matrix* a, ae_int_t m, ae_int_t n, ae_int_t uneeded, ae_int_t vtneeded, ae_int_t additionalmemory, /* Real */ ae_vector* w, /* Real */ ae_matrix* u, /* Real */ ae_matrix* vt, ae_state *_state); void normestimatorcreate(ae_int_t m, ae_int_t n, ae_int_t nstart, ae_int_t nits, normestimatorstate* state, ae_state *_state); void normestimatorsetseed(normestimatorstate* state, ae_int_t seedval, ae_state *_state); ae_bool normestimatoriteration(normestimatorstate* state, ae_state *_state); void normestimatorestimatesparse(normestimatorstate* state, sparsematrix* a, ae_state *_state); void normestimatorresults(normestimatorstate* state, double* nrm, ae_state *_state); void normestimatorrestart(normestimatorstate* state, ae_state *_state); void _normestimatorstate_init(void* _p, ae_state *_state); void _normestimatorstate_init_copy(void* _dst, void* _src, ae_state *_state); void _normestimatorstate_clear(void* _p); void _normestimatorstate_destroy(void* _p); void eigsubspacecreate(ae_int_t n, ae_int_t k, eigsubspacestate* state, ae_state *_state); void eigsubspacecreatebuf(ae_int_t n, ae_int_t k, eigsubspacestate* state, ae_state *_state); void eigsubspacesetcond(eigsubspacestate* state, double eps, ae_int_t maxits, ae_state *_state); void eigsubspaceoocstart(eigsubspacestate* state, ae_int_t mtype, ae_state *_state); ae_bool eigsubspaceooccontinue(eigsubspacestate* state, ae_state *_state); void eigsubspaceoocgetrequestinfo(eigsubspacestate* state, ae_int_t* requesttype, ae_int_t* requestsize, ae_state *_state); void eigsubspaceoocgetrequestdata(eigsubspacestate* state, /* Real */ ae_matrix* x, ae_state *_state); void eigsubspaceoocsendresult(eigsubspacestate* state, /* Real */ ae_matrix* ax, ae_state *_state); void eigsubspaceoocstop(eigsubspacestate* state, /* Real */ ae_vector* w, /* Real */ ae_matrix* z, eigsubspacereport* rep, ae_state *_state); void eigsubspacesolvedenses(eigsubspacestate* state, /* Real */ ae_matrix* a, ae_bool isupper, /* Real */ ae_vector* w, /* Real */ ae_matrix* z, eigsubspacereport* rep, ae_state *_state); void _pexec_eigsubspacesolvedenses(eigsubspacestate* state, /* Real */ ae_matrix* a, ae_bool isupper, /* Real */ ae_vector* w, /* Real */ ae_matrix* z, eigsubspacereport* rep, ae_state *_state); void eigsubspacesolvesparses(eigsubspacestate* state, sparsematrix* a, ae_bool isupper, /* Real */ ae_vector* w, /* Real */ ae_matrix* z, eigsubspacereport* rep, ae_state *_state); ae_bool eigsubspaceiteration(eigsubspacestate* state, ae_state *_state); ae_bool smatrixevd(/* Real */ ae_matrix* a, ae_int_t n, ae_int_t zneeded, ae_bool isupper, /* Real */ ae_vector* d, /* Real */ ae_matrix* z, ae_state *_state); ae_bool smatrixevdr(/* Real */ ae_matrix* a, ae_int_t n, ae_int_t zneeded, ae_bool isupper, double b1, double b2, ae_int_t* m, /* Real */ ae_vector* w, /* Real */ ae_matrix* z, ae_state *_state); ae_bool smatrixevdi(/* Real */ ae_matrix* a, ae_int_t n, ae_int_t zneeded, ae_bool isupper, ae_int_t i1, ae_int_t i2, /* Real */ ae_vector* w, /* Real */ ae_matrix* z, ae_state *_state); ae_bool hmatrixevd(/* Complex */ 
ae_matrix* a, ae_int_t n, ae_int_t zneeded, ae_bool isupper, /* Real */ ae_vector* d, /* Complex */ ae_matrix* z, ae_state *_state); ae_bool hmatrixevdr(/* Complex */ ae_matrix* a, ae_int_t n, ae_int_t zneeded, ae_bool isupper, double b1, double b2, ae_int_t* m, /* Real */ ae_vector* w, /* Complex */ ae_matrix* z, ae_state *_state); ae_bool hmatrixevdi(/* Complex */ ae_matrix* a, ae_int_t n, ae_int_t zneeded, ae_bool isupper, ae_int_t i1, ae_int_t i2, /* Real */ ae_vector* w, /* Complex */ ae_matrix* z, ae_state *_state); ae_bool smatrixtdevd(/* Real */ ae_vector* d, /* Real */ ae_vector* e, ae_int_t n, ae_int_t zneeded, /* Real */ ae_matrix* z, ae_state *_state); ae_bool smatrixtdevdr(/* Real */ ae_vector* d, /* Real */ ae_vector* e, ae_int_t n, ae_int_t zneeded, double a, double b, ae_int_t* m, /* Real */ ae_matrix* z, ae_state *_state); ae_bool smatrixtdevdi(/* Real */ ae_vector* d, /* Real */ ae_vector* e, ae_int_t n, ae_int_t zneeded, ae_int_t i1, ae_int_t i2, /* Real */ ae_matrix* z, ae_state *_state); ae_bool rmatrixevd(/* Real */ ae_matrix* a, ae_int_t n, ae_int_t vneeded, /* Real */ ae_vector* wr, /* Real */ ae_vector* wi, /* Real */ ae_matrix* vl, /* Real */ ae_matrix* vr, ae_state *_state); void _eigsubspacestate_init(void* _p, ae_state *_state); void _eigsubspacestate_init_copy(void* _dst, void* _src, ae_state *_state); void _eigsubspacestate_clear(void* _p); void _eigsubspacestate_destroy(void* _p); void _eigsubspacereport_init(void* _p, ae_state *_state); void _eigsubspacereport_init_copy(void* _dst, void* _src, ae_state *_state); void _eigsubspacereport_clear(void* _p); void _eigsubspacereport_destroy(void* _p); ae_bool rmatrixschur(/* Real */ ae_matrix* a, ae_int_t n, /* Real */ ae_matrix* s, ae_state *_state); ae_bool smatrixgevd(/* Real */ ae_matrix* a, ae_int_t n, ae_bool isuppera, /* Real */ ae_matrix* b, ae_bool isupperb, ae_int_t zneeded, ae_int_t problemtype, /* Real */ ae_vector* d, /* Real */ ae_matrix* z, ae_state *_state); ae_bool smatrixgevdreduce(/* Real */ ae_matrix* a, ae_int_t n, ae_bool isuppera, /* Real */ ae_matrix* b, ae_bool isupperb, ae_int_t problemtype, /* Real */ ae_matrix* r, ae_bool* isupperr, ae_state *_state); void rmatrixinvupdatesimple(/* Real */ ae_matrix* inva, ae_int_t n, ae_int_t updrow, ae_int_t updcolumn, double updval, ae_state *_state); void rmatrixinvupdaterow(/* Real */ ae_matrix* inva, ae_int_t n, ae_int_t updrow, /* Real */ ae_vector* v, ae_state *_state); void rmatrixinvupdatecolumn(/* Real */ ae_matrix* inva, ae_int_t n, ae_int_t updcolumn, /* Real */ ae_vector* u, ae_state *_state); void rmatrixinvupdateuv(/* Real */ ae_matrix* inva, ae_int_t n, /* Real */ ae_vector* u, /* Real */ ae_vector* v, ae_state *_state); double rmatrixludet(/* Real */ ae_matrix* a, /* Integer */ ae_vector* pivots, ae_int_t n, ae_state *_state); double rmatrixdet(/* Real */ ae_matrix* a, ae_int_t n, ae_state *_state); ae_complex cmatrixludet(/* Complex */ ae_matrix* a, /* Integer */ ae_vector* pivots, ae_int_t n, ae_state *_state); ae_complex cmatrixdet(/* Complex */ ae_matrix* a, ae_int_t n, ae_state *_state); double spdmatrixcholeskydet(/* Real */ ae_matrix* a, ae_int_t n, ae_state *_state); double spdmatrixdet(/* Real */ ae_matrix* a, ae_int_t n, ae_bool isupper, ae_state *_state); } #endif cpp/src/integration.h0000755000175000017500000007627413105126766014533 0ustar sergeysergey/************************************************************************* ALGLIB 3.11.0 (source code generated 2017-05-11) Copyright (c) Sergey Bochkanov (ALGLIB project). 
>>> SOURCE LICENSE >>> This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation (www.fsf.org); either version 2 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. A copy of the GNU General Public License is available at http://www.fsf.org/licensing/licenses >>> END OF LICENSE >>> *************************************************************************/ #ifndef _integration_pkg_h #define _integration_pkg_h #include "ap.h" #include "alglibinternal.h" #include "linalg.h" #include "alglibmisc.h" #include "specialfunctions.h" ///////////////////////////////////////////////////////////////////////// // // THIS SECTION CONTAINS COMPUTATIONAL CORE DECLARATIONS (DATATYPES) // ///////////////////////////////////////////////////////////////////////// namespace alglib_impl { typedef struct { ae_int_t terminationtype; ae_int_t nfev; ae_int_t nintervals; } autogkreport; typedef struct { double a; double b; double eps; double xwidth; double x; double f; ae_int_t info; double r; ae_matrix heap; ae_int_t heapsize; ae_int_t heapwidth; ae_int_t heapused; double sumerr; double sumabs; ae_vector qn; ae_vector wg; ae_vector wk; ae_vector wr; ae_int_t n; rcommstate rstate; } autogkinternalstate; typedef struct { double a; double b; double alpha; double beta; double xwidth; double x; double xminusa; double bminusx; ae_bool needf; double f; ae_int_t wrappermode; autogkinternalstate internalstate; rcommstate rstate; double v; ae_int_t terminationtype; ae_int_t nfev; ae_int_t nintervals; } autogkstate; } ///////////////////////////////////////////////////////////////////////// // // THIS SECTION CONTAINS C++ INTERFACE // ///////////////////////////////////////////////////////////////////////// namespace alglib { /************************************************************************* Integration report: * TerminationType = completetion code: * -5 non-convergence of Gauss-Kronrod nodes calculation subroutine. * -1 incorrect parameters were specified * 1 OK * Rep.NFEV countains number of function calculations * Rep.NIntervals contains number of intervals [a,b] was partitioned into. *************************************************************************/ class _autogkreport_owner { public: _autogkreport_owner(); _autogkreport_owner(const _autogkreport_owner &rhs); _autogkreport_owner& operator=(const _autogkreport_owner &rhs); virtual ~_autogkreport_owner(); alglib_impl::autogkreport* c_ptr(); alglib_impl::autogkreport* c_ptr() const; protected: alglib_impl::autogkreport *p_struct; }; class autogkreport : public _autogkreport_owner { public: autogkreport(); autogkreport(const autogkreport &rhs); autogkreport& operator=(const autogkreport &rhs); virtual ~autogkreport(); ae_int_t &terminationtype; ae_int_t &nfev; ae_int_t &nintervals; }; /************************************************************************* This structure stores state of the integration algorithm. Although this class has public fields, they are not intended for external use. You should use ALGLIB functions to work with this class: * autogksmooth()/AutoGKSmoothW()/... 
to create objects * autogkintegrate() to begin integration * autogkresults() to get results *************************************************************************/ class _autogkstate_owner { public: _autogkstate_owner(); _autogkstate_owner(const _autogkstate_owner &rhs); _autogkstate_owner& operator=(const _autogkstate_owner &rhs); virtual ~_autogkstate_owner(); alglib_impl::autogkstate* c_ptr(); alglib_impl::autogkstate* c_ptr() const; protected: alglib_impl::autogkstate *p_struct; }; class autogkstate : public _autogkstate_owner { public: autogkstate(); autogkstate(const autogkstate &rhs); autogkstate& operator=(const autogkstate &rhs); virtual ~autogkstate(); ae_bool &needf; double &x; double &xminusa; double &bminusx; double &f; }; /************************************************************************* Computation of nodes and weights for a Gauss quadrature formula The algorithm generates the N-point Gauss quadrature formula with weight function given by coefficients alpha and beta of a recurrence relation which generates a system of orthogonal polynomials: P-1(x) = 0 P0(x) = 1 Pn+1(x) = (x-alpha(n))*Pn(x) - beta(n)*Pn-1(x) and zeroth moment Mu0 Mu0 = integral(W(x)dx,a,b) INPUT PARAMETERS: Alpha - array[0..N-1], alpha coefficients Beta - array[0..N-1], beta coefficients Zero-indexed element is not used and may be arbitrary. Beta[I]>0. Mu0 - zeroth moment of the weight function. N - number of nodes of the quadrature formula, N>=1 OUTPUT PARAMETERS: Info - error code: * -3 internal eigenproblem solver hasn't converged * -2 Beta[i]<=0 * -1 incorrect N was passed * 1 OK X - array[0..N-1] - array of quadrature nodes, in ascending order. W - array[0..N-1] - array of quadrature weights. -- ALGLIB -- Copyright 2005-2009 by Bochkanov Sergey *************************************************************************/ void gqgeneraterec(const real_1d_array &alpha, const real_1d_array &beta, const double mu0, const ae_int_t n, ae_int_t &info, real_1d_array &x, real_1d_array &w); /************************************************************************* Computation of nodes and weights for a Gauss-Lobatto quadrature formula The algorithm generates the N-point Gauss-Lobatto quadrature formula with weight function given by coefficients alpha and beta of a recurrence which generates a system of orthogonal polynomials. P-1(x) = 0 P0(x) = 1 Pn+1(x) = (x-alpha(n))*Pn(x) - beta(n)*Pn-1(x) and zeroth moment Mu0 Mu0 = integral(W(x)dx,a,b) INPUT PARAMETERS: Alpha - array[0..N-2], alpha coefficients Beta - array[0..N-2], beta coefficients. Zero-indexed element is not used, may be arbitrary. Beta[I]>0 Mu0 - zeroth moment of the weighting function. A - left boundary of the integration interval. B - right boundary of the integration interval. N - number of nodes of the quadrature formula, N>=3 (including the left and right boundary nodes). OUTPUT PARAMETERS: Info - error code: * -3 internal eigenproblem solver hasn't converged * -2 Beta[i]<=0 * -1 incorrect N was passed * 1 OK X - array[0..N-1] - array of quadrature nodes, in ascending order. W - array[0..N-1] - array of quadrature weights. 
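Worked example (an illustrative sketch, not part of the library sources): gqgeneraterec() reproduces the classical Gauss-Legendre rule when it is fed the monic Legendre recurrence, i.e. alpha[k]=0, beta[k]=k*k/(4*k*k-1) and Mu0=2 (the integral of W(x)=1 over [-1,1]). The node count and printed output are assumptions made only for this illustration.

    // Sketch: build a 5-point Gauss rule from the monic Legendre recurrence.
    #include <cstdio>
    #include "integration.h"
    using namespace alglib;

    int main()
    {
        ae_int_t n = 5, info;
        real_1d_array alpha, beta, x, w;
        alpha.setlength(n);
        beta.setlength(n);
        for(ae_int_t k=0; k<n; k++)
        {
            alpha[k] = 0.0;                                        // Legendre recurrence: alpha(k)=0
            beta[k]  = k==0 ? 0.0 : double(k*k)/(4.0*k*k-1.0);     // beta(0) is unused, see above
        }
        gqgeneraterec(alpha, beta, 2.0, n, info, x, w);            // Mu0 = 2
        if( info>0 )
            for(ae_int_t i=0; i<n; i++)
                printf("x=%+.6f w=%.6f\n", double(x[i]), double(w[i]));
        return 0;
    }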
-- ALGLIB -- Copyright 2005-2009 by Bochkanov Sergey *************************************************************************/ void gqgenerategausslobattorec(const real_1d_array &alpha, const real_1d_array &beta, const double mu0, const double a, const double b, const ae_int_t n, ae_int_t &info, real_1d_array &x, real_1d_array &w); /************************************************************************* Computation of nodes and weights for a Gauss-Radau quadrature formula The algorithm generates the N-point Gauss-Radau quadrature formula with weight function given by the coefficients alpha and beta of a recurrence which generates a system of orthogonal polynomials. P-1(x) = 0 P0(x) = 1 Pn+1(x) = (x-alpha(n))*Pn(x) - beta(n)*Pn-1(x) and zeroth moment Mu0 Mu0 = integral(W(x)dx,a,b) INPUT PARAMETERS: Alpha - array[0..N-2], alpha coefficients. Beta - array[0..N-1], beta coefficients Zero-indexed element is not used. Beta[I]>0 Mu0 - zeroth moment of the weighting function. A - left boundary of the integration interval. N - number of nodes of the quadrature formula, N>=2 (including the left boundary node). OUTPUT PARAMETERS: Info - error code: * -3 internal eigenproblem solver hasn't converged * -2 Beta[i]<=0 * -1 incorrect N was passed * 1 OK X - array[0..N-1] - array of quadrature nodes, in ascending order. W - array[0..N-1] - array of quadrature weights. -- ALGLIB -- Copyright 2005-2009 by Bochkanov Sergey *************************************************************************/ void gqgenerategaussradaurec(const real_1d_array &alpha, const real_1d_array &beta, const double mu0, const double a, const ae_int_t n, ae_int_t &info, real_1d_array &x, real_1d_array &w); /************************************************************************* Returns nodes/weights for Gauss-Legendre quadrature on [-1,1] with N nodes. INPUT PARAMETERS: N - number of nodes, >=1 OUTPUT PARAMETERS: Info - error code: * -4 an error was detected when calculating weights/nodes. N is too large to obtain weights/nodes with high enough accuracy. Try to use multiple precision version. * -3 internal eigenproblem solver hasn't converged * -1 incorrect N was passed * +1 OK X - array[0..N-1] - array of quadrature nodes, in ascending order. W - array[0..N-1] - array of quadrature weights. -- ALGLIB -- Copyright 12.05.2009 by Bochkanov Sergey *************************************************************************/ void gqgenerategausslegendre(const ae_int_t n, ae_int_t &info, real_1d_array &x, real_1d_array &w); /************************************************************************* Returns nodes/weights for Gauss-Jacobi quadrature on [-1,1] with weight function W(x)=Power(1-x,Alpha)*Power(1+x,Beta). INPUT PARAMETERS: N - number of nodes, >=1 Alpha - power-law coefficient, Alpha>-1 Beta - power-law coefficient, Beta>-1 OUTPUT PARAMETERS: Info - error code: * -4 an error was detected when calculating weights/nodes. Alpha or Beta are too close to -1 to obtain weights/nodes with high enough accuracy, or, may be, N is too large. Try to use multiple precision version. * -3 internal eigenproblem solver hasn't converged * -1 incorrect N/Alpha/Beta was passed * +1 OK X - array[0..N-1] - array of quadrature nodes, in ascending order. W - array[0..N-1] - array of quadrature weights. 
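Worked example (a sketch under stated assumptions, not library code): an N-point rule returned by gqgenerategausslegendre() integrates a smooth function on [-1,1] as sum(w[i]*f(x[i])). The integrand cos(x) (exact integral 2*sin(1)) and the node count are chosen only for the illustration.

    // Sketch: 10-point Gauss-Legendre quadrature of cos(x) over [-1,1].
    #include <cstdio>
    #include <cmath>
    #include "integration.h"
    using namespace alglib;

    int main()
    {
        ae_int_t info;
        real_1d_array x, w;
        gqgenerategausslegendre(10, info, x, w);
        if( info<=0 )
            return 1;                              // see the Info codes described above
        double s = 0;
        for(ae_int_t i=0; i<x.length(); i++)
            s += w[i]*cos(x[i]);
        printf("quadrature=%.12f exact=%.12f\n", s, 2*sin(1.0));
        return 0;
    }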
-- ALGLIB -- Copyright 12.05.2009 by Bochkanov Sergey *************************************************************************/ void gqgenerategaussjacobi(const ae_int_t n, const double alpha, const double beta, ae_int_t &info, real_1d_array &x, real_1d_array &w); /************************************************************************* Returns nodes/weights for Gauss-Laguerre quadrature on [0,+inf) with weight function W(x)=Power(x,Alpha)*Exp(-x) INPUT PARAMETERS: N - number of nodes, >=1 Alpha - power-law coefficient, Alpha>-1 OUTPUT PARAMETERS: Info - error code: * -4 an error was detected when calculating weights/nodes. Alpha is too close to -1 to obtain weights/nodes with high enough accuracy or, may be, N is too large. Try to use multiple precision version. * -3 internal eigenproblem solver hasn't converged * -1 incorrect N/Alpha was passed * +1 OK X - array[0..N-1] - array of quadrature nodes, in ascending order. W - array[0..N-1] - array of quadrature weights. -- ALGLIB -- Copyright 12.05.2009 by Bochkanov Sergey *************************************************************************/ void gqgenerategausslaguerre(const ae_int_t n, const double alpha, ae_int_t &info, real_1d_array &x, real_1d_array &w); /************************************************************************* Returns nodes/weights for Gauss-Hermite quadrature on (-inf,+inf) with weight function W(x)=Exp(-x*x) INPUT PARAMETERS: N - number of nodes, >=1 OUTPUT PARAMETERS: Info - error code: * -4 an error was detected when calculating weights/nodes. May be, N is too large. Try to use multiple precision version. * -3 internal eigenproblem solver hasn't converged * -1 incorrect N/Alpha was passed * +1 OK X - array[0..N-1] - array of quadrature nodes, in ascending order. W - array[0..N-1] - array of quadrature weights. -- ALGLIB -- Copyright 12.05.2009 by Bochkanov Sergey *************************************************************************/ void gqgenerategausshermite(const ae_int_t n, ae_int_t &info, real_1d_array &x, real_1d_array &w); /************************************************************************* Computation of nodes and weights of a Gauss-Kronrod quadrature formula The algorithm generates the N-point Gauss-Kronrod quadrature formula with weight function given by coefficients alpha and beta of a recurrence relation which generates a system of orthogonal polynomials: P-1(x) = 0 P0(x) = 1 Pn+1(x) = (x-alpha(n))*Pn(x) - beta(n)*Pn-1(x) and zero moment Mu0 Mu0 = integral(W(x)dx,a,b) INPUT PARAMETERS: Alpha - alpha coefficients, array[0..floor(3*K/2)]. Beta - beta coefficients, array[0..ceil(3*K/2)]. Beta[0] is not used and may be arbitrary. Beta[I]>0. Mu0 - zeroth moment of the weight function. N - number of nodes of the Gauss-Kronrod quadrature formula, N >= 3, N = 2*K+1. OUTPUT PARAMETERS: Info - error code: * -5 no real and positive Gauss-Kronrod formula can be created for such a weight function with a given number of nodes. * -4 N is too large, task may be ill conditioned - x[i]=x[i+1] found. * -3 internal eigenproblem solver hasn't converged * -2 Beta[i]<=0 * -1 incorrect N was passed * +1 OK X - array[0..N-1] - array of quadrature nodes, in ascending order. WKronrod - array[0..N-1] - Kronrod weights WGauss - array[0..N-1] - Gauss weights (interleaved with zeros corresponding to extended Kronrod nodes). 
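Worked example (illustrative sketch, not part of the library sources): with the Gauss-Hermite weight W(x)=exp(-x*x), the sum sum(w[i]*x[i]*x[i]) approximates the integral of x^2*exp(-x*x) over (-inf,+inf), whose exact value is sqrt(pi)/2. The node count below is an assumption for the illustration.

    // Sketch: Gauss-Hermite rule applied to the polynomial integrand x^2.
    #include <cstdio>
    #include <cmath>
    #include "integration.h"
    using namespace alglib;

    int main()
    {
        ae_int_t info;
        real_1d_array x, w;
        gqgenerategausshermite(8, info, x, w);
        if( info!=1 )
            return 1;
        double s = 0;
        for(ae_int_t i=0; i<x.length(); i++)
            s += w[i]*x[i]*x[i];                   // exact for polynomials of degree <= 2*8-1
        printf("approx=%.12f exact=%.12f\n", s, sqrt(3.14159265358979323846)/2);
        return 0;
    }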
-- ALGLIB -- Copyright 08.05.2009 by Bochkanov Sergey *************************************************************************/ void gkqgeneraterec(const real_1d_array &alpha, const real_1d_array &beta, const double mu0, const ae_int_t n, ae_int_t &info, real_1d_array &x, real_1d_array &wkronrod, real_1d_array &wgauss); /************************************************************************* Returns Gauss and Gauss-Kronrod nodes/weights for Gauss-Legendre quadrature with N points. GKQLegendreCalc (calculation) or GKQLegendreTbl (precomputed table) is used depending on machine precision and number of nodes. INPUT PARAMETERS: N - number of Kronrod nodes, must be odd number, >=3. OUTPUT PARAMETERS: Info - error code: * -4 an error was detected when calculating weights/nodes. N is too large to obtain weights/nodes with high enough accuracy. Try to use multiple precision version. * -3 internal eigenproblem solver hasn't converged * -1 incorrect N was passed * +1 OK X - array[0..N-1] - array of quadrature nodes, ordered in ascending order. WKronrod - array[0..N-1] - Kronrod weights WGauss - array[0..N-1] - Gauss weights (interleaved with zeros corresponding to extended Kronrod nodes). -- ALGLIB -- Copyright 12.05.2009 by Bochkanov Sergey *************************************************************************/ void gkqgenerategausslegendre(const ae_int_t n, ae_int_t &info, real_1d_array &x, real_1d_array &wkronrod, real_1d_array &wgauss); /************************************************************************* Returns Gauss and Gauss-Kronrod nodes/weights for Gauss-Jacobi quadrature on [-1,1] with weight function W(x)=Power(1-x,Alpha)*Power(1+x,Beta). INPUT PARAMETERS: N - number of Kronrod nodes, must be odd number, >=3. Alpha - power-law coefficient, Alpha>-1 Beta - power-law coefficient, Beta>-1 OUTPUT PARAMETERS: Info - error code: * -5 no real and positive Gauss-Kronrod formula can be created for such a weight function with a given number of nodes. * -4 an error was detected when calculating weights/nodes. Alpha or Beta are too close to -1 to obtain weights/nodes with high enough accuracy, or, may be, N is too large. Try to use multiple precision version. * -3 internal eigenproblem solver hasn't converged * -1 incorrect N was passed * +1 OK * +2 OK, but quadrature rule have exterior nodes, x[0]<-1 or x[n-1]>+1 X - array[0..N-1] - array of quadrature nodes, ordered in ascending order. WKronrod - array[0..N-1] - Kronrod weights WGauss - array[0..N-1] - Gauss weights (interleaved with zeros corresponding to extended Kronrod nodes). -- ALGLIB -- Copyright 12.05.2009 by Bochkanov Sergey *************************************************************************/ void gkqgenerategaussjacobi(const ae_int_t n, const double alpha, const double beta, ae_int_t &info, real_1d_array &x, real_1d_array &wkronrod, real_1d_array &wgauss); /************************************************************************* Returns Gauss and Gauss-Kronrod nodes for quadrature with N points. Reduction to tridiagonal eigenproblem is used. INPUT PARAMETERS: N - number of Kronrod nodes, must be odd number, >=3. OUTPUT PARAMETERS: Info - error code: * -4 an error was detected when calculating weights/nodes. N is too large to obtain weights/nodes with high enough accuracy. Try to use multiple precision version. * -3 internal eigenproblem solver hasn't converged * -1 incorrect N was passed * +1 OK X - array[0..N-1] - array of quadrature nodes, ordered in ascending order. 
WKronrod - array[0..N-1] - Kronrod weights WGauss - array[0..N-1] - Gauss weights (interleaved with zeros corresponding to extended Kronrod nodes). -- ALGLIB -- Copyright 12.05.2009 by Bochkanov Sergey *************************************************************************/ void gkqlegendrecalc(const ae_int_t n, ae_int_t &info, real_1d_array &x, real_1d_array &wkronrod, real_1d_array &wgauss); /************************************************************************* Returns Gauss and Gauss-Kronrod nodes for quadrature with N points using pre-calculated table. Nodes/weights were computed with accuracy up to 1.0E-32 (if MPFR version of ALGLIB is used). In standard double precision accuracy reduces to something about 2.0E-16 (depending on your compiler's handling of long floating point constants). INPUT PARAMETERS: N - number of Kronrod nodes. N can be 15, 21, 31, 41, 51, 61. OUTPUT PARAMETERS: X - array[0..N-1] - array of quadrature nodes, ordered in ascending order. WKronrod - array[0..N-1] - Kronrod weights WGauss - array[0..N-1] - Gauss weights (interleaved with zeros corresponding to extended Kronrod nodes). -- ALGLIB -- Copyright 12.05.2009 by Bochkanov Sergey *************************************************************************/ void gkqlegendretbl(const ae_int_t n, real_1d_array &x, real_1d_array &wkronrod, real_1d_array &wgauss, double &eps); /************************************************************************* Integration of a smooth function F(x) on a finite interval [a,b]. Fast-convergent algorithm based on a Gauss-Kronrod formula is used. Result is calculated with accuracy close to the machine precision. Algorithm works well only with smooth integrands. It may be used with continuous non-smooth integrands, but with less performance. It should never be used with integrands which have integrable singularities at lower or upper limits - algorithm may crash. Use AutoGKSingular in such cases. INPUT PARAMETERS: A, B - interval boundaries (AB) OUTPUT PARAMETERS State - structure which stores algorithm state SEE ALSO AutoGKSmoothW, AutoGKSingular, AutoGKResults. -- ALGLIB -- Copyright 06.05.2009 by Bochkanov Sergey *************************************************************************/ void autogksmooth(const double a, const double b, autogkstate &state); /************************************************************************* Integration of a smooth function F(x) on a finite interval [a,b]. This subroutine is same as AutoGKSmooth(), but it guarantees that interval [a,b] is partitioned into subintervals which have width at most XWidth. Subroutine can be used when integrating nearly-constant function with narrow "bumps" (about XWidth wide). If "bumps" are too narrow, AutoGKSmooth subroutine can overlook them. INPUT PARAMETERS: A, B - interval boundaries (AB) OUTPUT PARAMETERS State - structure which stores algorithm state SEE ALSO AutoGKSmooth, AutoGKSingular, AutoGKResults. -- ALGLIB -- Copyright 06.05.2009 by Bochkanov Sergey *************************************************************************/ void autogksmoothw(const double a, const double b, const double xwidth, autogkstate &state); /************************************************************************* Integration on a finite interval [A,B]. Integrand have integrable singularities at A/B. F(X) must diverge as "(x-A)^alpha" at A, as "(B-x)^beta" at B, with known alpha/beta (alpha>-1, beta>-1). If alpha/beta are not known, estimates from below can be used (but these estimates should be greater than -1 too). 
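For the smooth case just described (AutoGKSmooth/AutoGKSmoothW), the typical calling sequence is: create the state, run autogkintegrate() with a callback, then read the result with autogkresults(). The sketch below illustrates this; the integrand x*sin(x) and the interval [0,pi] (exact integral pi) are assumptions made for the illustration, not part of the library.

    // Sketch: adaptive integration of f(x)=x*sin(x) on [0,pi].
    // The callback signature matches autogkintegrate() declared in this header;
    // xminusa/bminusx are intended for singular integrands and are unused here.
    #include <cstdio>
    #include <cmath>
    #include "integration.h"
    using namespace alglib;

    static void f(double x, double xminusa, double bminusx, double &y, void *ptr)
    {
        y = x*sin(x);
    }

    int main()
    {
        autogkstate state;
        autogkreport rep;
        double v;
        autogksmooth(0.0, 3.14159265358979323846, state);
        autogkintegrate(state, f);                 // ptr defaults to NULL
        autogkresults(state, v, rep);
        printf("integral=%.12f terminationtype=%d\n", v, int(rep.terminationtype));
        return 0;
    }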
One of alpha/beta variables (or even both alpha/beta) may be equal to 0, which means than function F(x) is non-singular at A/B. Anyway (singular at bounds or not), function F(x) is supposed to be continuous on (A,B). Fast-convergent algorithm based on a Gauss-Kronrod formula is used. Result is calculated with accuracy close to the machine precision. INPUT PARAMETERS: A, B - interval boundaries (AB) Alpha - power-law coefficient of the F(x) at A, Alpha>-1 Beta - power-law coefficient of the F(x) at B, Beta>-1 OUTPUT PARAMETERS State - structure which stores algorithm state SEE ALSO AutoGKSmooth, AutoGKSmoothW, AutoGKResults. -- ALGLIB -- Copyright 06.05.2009 by Bochkanov Sergey *************************************************************************/ void autogksingular(const double a, const double b, const double alpha, const double beta, autogkstate &state); /************************************************************************* This function provides reverse communication interface Reverse communication interface is not documented or recommended to use. See below for functions which provide better documented API *************************************************************************/ bool autogkiteration(const autogkstate &state); /************************************************************************* This function is used to launcn iterations of the 1-dimensional integrator It accepts following parameters: func - callback which calculates f(x) for given x ptr - optional pointer which is passed to func; can be NULL -- ALGLIB -- Copyright 07.05.2009 by Bochkanov Sergey *************************************************************************/ void autogkintegrate(autogkstate &state, void (*func)(double x, double xminusa, double bminusx, double &y, void *ptr), void *ptr = NULL); /************************************************************************* Adaptive integration results Called after AutoGKIteration returned False. Input parameters: State - algorithm state (used by AutoGKIteration). 
Output parameters: V - integral(f(x)dx,a,b) Rep - optimization report (see AutoGKReport description) -- ALGLIB -- Copyright 14.11.2007 by Bochkanov Sergey *************************************************************************/ void autogkresults(const autogkstate &state, double &v, autogkreport &rep); } ///////////////////////////////////////////////////////////////////////// // // THIS SECTION CONTAINS COMPUTATIONAL CORE DECLARATIONS (FUNCTIONS) // ///////////////////////////////////////////////////////////////////////// namespace alglib_impl { void gqgeneraterec(/* Real */ ae_vector* alpha, /* Real */ ae_vector* beta, double mu0, ae_int_t n, ae_int_t* info, /* Real */ ae_vector* x, /* Real */ ae_vector* w, ae_state *_state); void gqgenerategausslobattorec(/* Real */ ae_vector* alpha, /* Real */ ae_vector* beta, double mu0, double a, double b, ae_int_t n, ae_int_t* info, /* Real */ ae_vector* x, /* Real */ ae_vector* w, ae_state *_state); void gqgenerategaussradaurec(/* Real */ ae_vector* alpha, /* Real */ ae_vector* beta, double mu0, double a, ae_int_t n, ae_int_t* info, /* Real */ ae_vector* x, /* Real */ ae_vector* w, ae_state *_state); void gqgenerategausslegendre(ae_int_t n, ae_int_t* info, /* Real */ ae_vector* x, /* Real */ ae_vector* w, ae_state *_state); void gqgenerategaussjacobi(ae_int_t n, double alpha, double beta, ae_int_t* info, /* Real */ ae_vector* x, /* Real */ ae_vector* w, ae_state *_state); void gqgenerategausslaguerre(ae_int_t n, double alpha, ae_int_t* info, /* Real */ ae_vector* x, /* Real */ ae_vector* w, ae_state *_state); void gqgenerategausshermite(ae_int_t n, ae_int_t* info, /* Real */ ae_vector* x, /* Real */ ae_vector* w, ae_state *_state); void gkqgeneraterec(/* Real */ ae_vector* alpha, /* Real */ ae_vector* beta, double mu0, ae_int_t n, ae_int_t* info, /* Real */ ae_vector* x, /* Real */ ae_vector* wkronrod, /* Real */ ae_vector* wgauss, ae_state *_state); void gkqgenerategausslegendre(ae_int_t n, ae_int_t* info, /* Real */ ae_vector* x, /* Real */ ae_vector* wkronrod, /* Real */ ae_vector* wgauss, ae_state *_state); void gkqgenerategaussjacobi(ae_int_t n, double alpha, double beta, ae_int_t* info, /* Real */ ae_vector* x, /* Real */ ae_vector* wkronrod, /* Real */ ae_vector* wgauss, ae_state *_state); void gkqlegendrecalc(ae_int_t n, ae_int_t* info, /* Real */ ae_vector* x, /* Real */ ae_vector* wkronrod, /* Real */ ae_vector* wgauss, ae_state *_state); void gkqlegendretbl(ae_int_t n, /* Real */ ae_vector* x, /* Real */ ae_vector* wkronrod, /* Real */ ae_vector* wgauss, double* eps, ae_state *_state); void autogksmooth(double a, double b, autogkstate* state, ae_state *_state); void autogksmoothw(double a, double b, double xwidth, autogkstate* state, ae_state *_state); void autogksingular(double a, double b, double alpha, double beta, autogkstate* state, ae_state *_state); ae_bool autogkiteration(autogkstate* state, ae_state *_state); void autogkresults(autogkstate* state, double* v, autogkreport* rep, ae_state *_state); void _autogkreport_init(void* _p, ae_state *_state); void _autogkreport_init_copy(void* _dst, void* _src, ae_state *_state); void _autogkreport_clear(void* _p); void _autogkreport_destroy(void* _p); void _autogkinternalstate_init(void* _p, ae_state *_state); void _autogkinternalstate_init_copy(void* _dst, void* _src, ae_state *_state); void _autogkinternalstate_clear(void* _p); void _autogkinternalstate_destroy(void* _p); void _autogkstate_init(void* _p, ae_state *_state); void _autogkstate_init_copy(void* _dst, void* _src, ae_state 
*_state); void _autogkstate_clear(void* _p); void _autogkstate_destroy(void* _p); } #endif cpp/src/statistics.cpp0000755000175000017500000225331613105126765014730 0ustar sergeysergey/************************************************************************* ALGLIB 3.11.0 (source code generated 2017-05-11) Copyright (c) Sergey Bochkanov (ALGLIB project). >>> SOURCE LICENSE >>> This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation (www.fsf.org); either version 2 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. A copy of the GNU General Public License is available at http://www.fsf.org/licensing/licenses >>> END OF LICENSE >>> *************************************************************************/ #include "stdafx.h" #include "statistics.h" // disable some irrelevant warnings #if (AE_COMPILER==AE_MSVC) #pragma warning(disable:4100) #pragma warning(disable:4127) #pragma warning(disable:4702) #pragma warning(disable:4996) #endif using namespace std; ///////////////////////////////////////////////////////////////////////// // // THIS SECTION CONTAINS IMPLEMENTATION OF C++ INTERFACE // ///////////////////////////////////////////////////////////////////////// namespace alglib { /************************************************************************* Calculation of the distribution moments: mean, variance, skewness, kurtosis. INPUT PARAMETERS: X - sample N - N>=0, sample size: * if given, only leading N elements of X are processed * if not given, automatically determined from size of X OUTPUT PARAMETERS Mean - mean. Variance- variance. Skewness- skewness (if variance<>0; zero otherwise). Kurtosis- kurtosis (if variance<>0; zero otherwise). NOTE: variance is calculated by dividing sum of squares by N-1, not N. -- ALGLIB -- Copyright 06.09.2006 by Bochkanov Sergey *************************************************************************/ void samplemoments(const real_1d_array &x, const ae_int_t n, double &mean, double &variance, double &skewness, double &kurtosis) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::samplemoments(const_cast(x.c_ptr()), n, &mean, &variance, &skewness, &kurtosis, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Calculation of the distribution moments: mean, variance, skewness, kurtosis. INPUT PARAMETERS: X - sample N - N>=0, sample size: * if given, only leading N elements of X are processed * if not given, automatically determined from size of X OUTPUT PARAMETERS Mean - mean. Variance- variance. Skewness- skewness (if variance<>0; zero otherwise). Kurtosis- kurtosis (if variance<>0; zero otherwise). NOTE: variance is calculated by dividing sum of squares by N-1, not N. 
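Usage example (a minimal sketch with assumed sample values, not part of the library sources): when N is omitted, the whole array is processed, and variance is reported with the N-1 divisor as noted above.

    // Sketch: distribution moments of a small assumed sample.
    #include <cstdio>
    #include "statistics.h"
    using namespace alglib;

    int main()
    {
        real_1d_array x = "[2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]";
        double mean, variance, skewness, kurtosis;
        samplemoments(x, mean, variance, skewness, kurtosis);   // N taken from x.length()
        printf("mean=%.3f var=%.3f skew=%.3f kurt=%.3f\n",
               mean, variance, skewness, kurtosis);
        return 0;
    }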
-- ALGLIB -- Copyright 06.09.2006 by Bochkanov Sergey *************************************************************************/ void samplemoments(const real_1d_array &x, double &mean, double &variance, double &skewness, double &kurtosis) { alglib_impl::ae_state _alglib_env_state; ae_int_t n; n = x.length(); alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::samplemoments(const_cast(x.c_ptr()), n, &mean, &variance, &skewness, &kurtosis, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Calculation of the mean. INPUT PARAMETERS: X - sample N - N>=0, sample size: * if given, only leading N elements of X are processed * if not given, automatically determined from size of X NOTE: This function return result which calculated by 'SampleMoments' function and stored at 'Mean' variable. -- ALGLIB -- Copyright 06.09.2006 by Bochkanov Sergey *************************************************************************/ double samplemean(const real_1d_array &x, const ae_int_t n) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::samplemean(const_cast(x.c_ptr()), n, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Calculation of the mean. INPUT PARAMETERS: X - sample N - N>=0, sample size: * if given, only leading N elements of X are processed * if not given, automatically determined from size of X NOTE: This function return result which calculated by 'SampleMoments' function and stored at 'Mean' variable. -- ALGLIB -- Copyright 06.09.2006 by Bochkanov Sergey *************************************************************************/ double samplemean(const real_1d_array &x) { alglib_impl::ae_state _alglib_env_state; ae_int_t n; n = x.length(); alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::samplemean(const_cast(x.c_ptr()), n, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Calculation of the variance. INPUT PARAMETERS: X - sample N - N>=0, sample size: * if given, only leading N elements of X are processed * if not given, automatically determined from size of X NOTE: This function return result which calculated by 'SampleMoments' function and stored at 'Variance' variable. -- ALGLIB -- Copyright 06.09.2006 by Bochkanov Sergey *************************************************************************/ double samplevariance(const real_1d_array &x, const ae_int_t n) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::samplevariance(const_cast(x.c_ptr()), n, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Calculation of the variance. 
INPUT PARAMETERS: X - sample N - N>=0, sample size: * if given, only leading N elements of X are processed * if not given, automatically determined from size of X NOTE: This function return result which calculated by 'SampleMoments' function and stored at 'Variance' variable. -- ALGLIB -- Copyright 06.09.2006 by Bochkanov Sergey *************************************************************************/ double samplevariance(const real_1d_array &x) { alglib_impl::ae_state _alglib_env_state; ae_int_t n; n = x.length(); alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::samplevariance(const_cast(x.c_ptr()), n, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Calculation of the skewness. INPUT PARAMETERS: X - sample N - N>=0, sample size: * if given, only leading N elements of X are processed * if not given, automatically determined from size of X NOTE: This function return result which calculated by 'SampleMoments' function and stored at 'Skewness' variable. -- ALGLIB -- Copyright 06.09.2006 by Bochkanov Sergey *************************************************************************/ double sampleskewness(const real_1d_array &x, const ae_int_t n) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::sampleskewness(const_cast(x.c_ptr()), n, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Calculation of the skewness. INPUT PARAMETERS: X - sample N - N>=0, sample size: * if given, only leading N elements of X are processed * if not given, automatically determined from size of X NOTE: This function return result which calculated by 'SampleMoments' function and stored at 'Skewness' variable. -- ALGLIB -- Copyright 06.09.2006 by Bochkanov Sergey *************************************************************************/ double sampleskewness(const real_1d_array &x) { alglib_impl::ae_state _alglib_env_state; ae_int_t n; n = x.length(); alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::sampleskewness(const_cast(x.c_ptr()), n, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Calculation of the kurtosis. INPUT PARAMETERS: X - sample N - N>=0, sample size: * if given, only leading N elements of X are processed * if not given, automatically determined from size of X NOTE: This function return result which calculated by 'SampleMoments' function and stored at 'Kurtosis' variable. 
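All of these wrappers follow the same two-overload pattern described above: pass N explicitly to process only the leading N elements, or omit it to process the whole array. A small sketch with assumed data:

    // Sketch: explicit-N overload vs. automatic sizing.
    #include <cstdio>
    #include "statistics.h"
    using namespace alglib;

    int main()
    {
        real_1d_array x = "[1.0, 2.0, 3.0, 100.0]";
        double full = samplemean(x);        // N = x.length() = 4
        double head = samplemean(x, 3);     // only the leading 3 elements are processed
        printf("mean(all)=%.3f mean(first 3)=%.3f\n", full, head);
        return 0;
    }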
-- ALGLIB -- Copyright 06.09.2006 by Bochkanov Sergey *************************************************************************/ double samplekurtosis(const real_1d_array &x, const ae_int_t n) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::samplekurtosis(const_cast(x.c_ptr()), n, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Calculation of the kurtosis. INPUT PARAMETERS: X - sample N - N>=0, sample size: * if given, only leading N elements of X are processed * if not given, automatically determined from size of X NOTE: This function return result which calculated by 'SampleMoments' function and stored at 'Kurtosis' variable. -- ALGLIB -- Copyright 06.09.2006 by Bochkanov Sergey *************************************************************************/ double samplekurtosis(const real_1d_array &x) { alglib_impl::ae_state _alglib_env_state; ae_int_t n; n = x.length(); alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::samplekurtosis(const_cast(x.c_ptr()), n, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* ADev Input parameters: X - sample N - N>=0, sample size: * if given, only leading N elements of X are processed * if not given, automatically determined from size of X Output parameters: ADev- ADev -- ALGLIB -- Copyright 06.09.2006 by Bochkanov Sergey *************************************************************************/ void sampleadev(const real_1d_array &x, const ae_int_t n, double &adev) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::sampleadev(const_cast(x.c_ptr()), n, &adev, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* ADev Input parameters: X - sample N - N>=0, sample size: * if given, only leading N elements of X are processed * if not given, automatically determined from size of X Output parameters: ADev- ADev -- ALGLIB -- Copyright 06.09.2006 by Bochkanov Sergey *************************************************************************/ void sampleadev(const real_1d_array &x, double &adev) { alglib_impl::ae_state _alglib_env_state; ae_int_t n; n = x.length(); alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::sampleadev(const_cast(x.c_ptr()), n, &adev, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Median calculation. 
Input parameters: X - sample (array indexes: [0..N-1]) N - N>=0, sample size: * if given, only leading N elements of X are processed * if not given, automatically determined from size of X Output parameters: Median -- ALGLIB -- Copyright 06.09.2006 by Bochkanov Sergey *************************************************************************/ void samplemedian(const real_1d_array &x, const ae_int_t n, double &median) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::samplemedian(const_cast(x.c_ptr()), n, &median, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Median calculation. Input parameters: X - sample (array indexes: [0..N-1]) N - N>=0, sample size: * if given, only leading N elements of X are processed * if not given, automatically determined from size of X Output parameters: Median -- ALGLIB -- Copyright 06.09.2006 by Bochkanov Sergey *************************************************************************/ void samplemedian(const real_1d_array &x, double &median) { alglib_impl::ae_state _alglib_env_state; ae_int_t n; n = x.length(); alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::samplemedian(const_cast(x.c_ptr()), n, &median, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Percentile calculation. Input parameters: X - sample (array indexes: [0..N-1]) N - N>=0, sample size: * if given, only leading N elements of X are processed * if not given, automatically determined from size of X P - percentile (0<=P<=1) Output parameters: V - percentile -- ALGLIB -- Copyright 01.03.2008 by Bochkanov Sergey *************************************************************************/ void samplepercentile(const real_1d_array &x, const ae_int_t n, const double p, double &v) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::samplepercentile(const_cast(x.c_ptr()), n, p, &v, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Percentile calculation. 
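Usage example (an illustrative sketch with assumed data, not library code) for the median and percentile routines documented here; note that P is given on the [0,1] scale.

    // Sketch: median and 90th percentile of an assumed sample.
    #include <cstdio>
    #include "statistics.h"
    using namespace alglib;

    int main()
    {
        real_1d_array x = "[3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0]";
        double med, p90;
        samplemedian(x, med);
        samplepercentile(x, 0.90, p90);     // P = 0.90, i.e. the 90th percentile
        printf("median=%.3f p90=%.3f\n", med, p90);
        return 0;
    }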
Input parameters: X - sample (array indexes: [0..N-1]) N - N>=0, sample size: * if given, only leading N elements of X are processed * if not given, automatically determined from size of X P - percentile (0<=P<=1) Output parameters: V - percentile -- ALGLIB -- Copyright 01.03.2008 by Bochkanov Sergey *************************************************************************/ void samplepercentile(const real_1d_array &x, const double p, double &v) { alglib_impl::ae_state _alglib_env_state; ae_int_t n; n = x.length(); alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::samplepercentile(const_cast(x.c_ptr()), n, p, &v, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* 2-sample covariance Input parameters: X - sample 1 (array indexes: [0..N-1]) Y - sample 2 (array indexes: [0..N-1]) N - N>=0, sample size: * if given, only N leading elements of X/Y are processed * if not given, automatically determined from input sizes Result: covariance (zero for N=0 or N=1) -- ALGLIB -- Copyright 28.10.2010 by Bochkanov Sergey *************************************************************************/ double cov2(const real_1d_array &x, const real_1d_array &y, const ae_int_t n) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::cov2(const_cast(x.c_ptr()), const_cast(y.c_ptr()), n, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* 2-sample covariance Input parameters: X - sample 1 (array indexes: [0..N-1]) Y - sample 2 (array indexes: [0..N-1]) N - N>=0, sample size: * if given, only N leading elements of X/Y are processed * if not given, automatically determined from input sizes Result: covariance (zero for N=0 or N=1) -- ALGLIB -- Copyright 28.10.2010 by Bochkanov Sergey *************************************************************************/ double cov2(const real_1d_array &x, const real_1d_array &y) { alglib_impl::ae_state _alglib_env_state; ae_int_t n; if( (x.length()!=y.length())) throw ap_error("Error while calling 'cov2': looks like one of arguments has wrong size"); n = x.length(); alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::cov2(const_cast(x.c_ptr()), const_cast(y.c_ptr()), n, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Pearson product-moment correlation coefficient Input parameters: X - sample 1 (array indexes: [0..N-1]) Y - sample 2 (array indexes: [0..N-1]) N - N>=0, sample size: * if given, only N leading elements of X/Y are processed * if not given, automatically determined from input sizes Result: Pearson product-moment correlation coefficient (zero for N=0 or N=1) -- ALGLIB -- Copyright 28.10.2010 by Bochkanov Sergey *************************************************************************/ double pearsoncorr2(const real_1d_array &x, const real_1d_array &y, const ae_int_t n) { alglib_impl::ae_state _alglib_env_state; 
alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::pearsoncorr2(const_cast(x.c_ptr()), const_cast(y.c_ptr()), n, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Pearson product-moment correlation coefficient Input parameters: X - sample 1 (array indexes: [0..N-1]) Y - sample 2 (array indexes: [0..N-1]) N - N>=0, sample size: * if given, only N leading elements of X/Y are processed * if not given, automatically determined from input sizes Result: Pearson product-moment correlation coefficient (zero for N=0 or N=1) -- ALGLIB -- Copyright 28.10.2010 by Bochkanov Sergey *************************************************************************/ double pearsoncorr2(const real_1d_array &x, const real_1d_array &y) { alglib_impl::ae_state _alglib_env_state; ae_int_t n; if( (x.length()!=y.length())) throw ap_error("Error while calling 'pearsoncorr2': looks like one of arguments has wrong size"); n = x.length(); alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::pearsoncorr2(const_cast(x.c_ptr()), const_cast(y.c_ptr()), n, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Spearman's rank correlation coefficient Input parameters: X - sample 1 (array indexes: [0..N-1]) Y - sample 2 (array indexes: [0..N-1]) N - N>=0, sample size: * if given, only N leading elements of X/Y are processed * if not given, automatically determined from input sizes Result: Spearman's rank correlation coefficient (zero for N=0 or N=1) -- ALGLIB -- Copyright 09.04.2007 by Bochkanov Sergey *************************************************************************/ double spearmancorr2(const real_1d_array &x, const real_1d_array &y, const ae_int_t n) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::spearmancorr2(const_cast(x.c_ptr()), const_cast(y.c_ptr()), n, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Spearman's rank correlation coefficient Input parameters: X - sample 1 (array indexes: [0..N-1]) Y - sample 2 (array indexes: [0..N-1]) N - N>=0, sample size: * if given, only N leading elements of X/Y are processed * if not given, automatically determined from input sizes Result: Spearman's rank correlation coefficient (zero for N=0 or N=1) -- ALGLIB -- Copyright 09.04.2007 by Bochkanov Sergey *************************************************************************/ double spearmancorr2(const real_1d_array &x, const real_1d_array &y) { alglib_impl::ae_state _alglib_env_state; ae_int_t n; if( (x.length()!=y.length())) throw ap_error("Error while calling 'spearmancorr2': looks like one of arguments has wrong size"); n = x.length(); alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::spearmancorr2(const_cast(x.c_ptr()), const_cast(y.c_ptr()), n, &_alglib_env_state); 
alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Covariance matrix SMP EDITION OF ALGLIB: ! This function can utilize multicore capabilities of your system. In ! order to do this you have to call version with "smp_" prefix, which ! indicates that multicore code will be used. ! ! This note is given for users of SMP edition; if you use GPL edition, ! or commercial edition of ALGLIB without SMP support, you still will ! be able to call smp-version of this function, but all computations ! will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. ! ! You should remember that starting/stopping worker thread always have ! non-zero cost. Although multicore version is pretty efficient on ! large problems, we do not recommend you to use it on small problems - ! with covariance matrices smaller than 128*128. INPUT PARAMETERS: X - array[N,M], sample matrix: * J-th column corresponds to J-th variable * I-th row corresponds to I-th observation N - N>=0, number of observations: * if given, only leading N rows of X are used * if not given, automatically determined from input size M - M>0, number of variables: * if given, only leading M columns of X are used * if not given, automatically determined from input size OUTPUT PARAMETERS: C - array[M,M], covariance matrix (zero if N=0 or N=1) -- ALGLIB -- Copyright 28.10.2010 by Bochkanov Sergey *************************************************************************/ void covm(const real_2d_array &x, const ae_int_t n, const ae_int_t m, real_2d_array &c) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::covm(const_cast(x.c_ptr()), n, m, const_cast(c.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void smp_covm(const real_2d_array &x, const ae_int_t n, const ae_int_t m, real_2d_array &c) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::_pexec_covm(const_cast(x.c_ptr()), n, m, const_cast(c.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Covariance matrix SMP EDITION OF ALGLIB: ! This function can utilize multicore capabilities of your system. In ! order to do this you have to call version with "smp_" prefix, which ! indicates that multicore code will be used. ! ! This note is given for users of SMP edition; if you use GPL edition, ! or commercial edition of ALGLIB without SMP support, you still will ! be able to call smp-version of this function, but all computations ! will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. ! ! You should remember that starting/stopping worker thread always have ! non-zero cost. Although multicore version is pretty efficient on ! large problems, we do not recommend you to use it on small problems - ! with covariance matrices smaller than 128*128. 
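
A usage sketch for covm()/smp_covm() (illustrative only; the sample matrix
below is hypothetical):

    real_2d_array xy = "[[1,2],[2,4],[3,8],[4,15]]";  // 4 observations of 2 variables
    real_2d_array c;
    covm(xy, c);       // c becomes a 2x2 covariance matrix
    smp_covm(xy, c);   // same result; multicore code path in the SMP edition
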
INPUT PARAMETERS: X - array[N,M], sample matrix: * J-th column corresponds to J-th variable * I-th row corresponds to I-th observation N - N>=0, number of observations: * if given, only leading N rows of X are used * if not given, automatically determined from input size M - M>0, number of variables: * if given, only leading M columns of X are used * if not given, automatically determined from input size OUTPUT PARAMETERS: C - array[M,M], covariance matrix (zero if N=0 or N=1) -- ALGLIB -- Copyright 28.10.2010 by Bochkanov Sergey *************************************************************************/ void covm(const real_2d_array &x, real_2d_array &c) { alglib_impl::ae_state _alglib_env_state; ae_int_t n; ae_int_t m; n = x.rows(); m = x.cols(); alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::covm(const_cast(x.c_ptr()), n, m, const_cast(c.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void smp_covm(const real_2d_array &x, real_2d_array &c) { alglib_impl::ae_state _alglib_env_state; ae_int_t n; ae_int_t m; n = x.rows(); m = x.cols(); alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::_pexec_covm(const_cast(x.c_ptr()), n, m, const_cast(c.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Pearson product-moment correlation matrix SMP EDITION OF ALGLIB: ! This function can utilize multicore capabilities of your system. In ! order to do this you have to call version with "smp_" prefix, which ! indicates that multicore code will be used. ! ! This note is given for users of SMP edition; if you use GPL edition, ! or commercial edition of ALGLIB without SMP support, you still will ! be able to call smp-version of this function, but all computations ! will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. ! ! You should remember that starting/stopping worker thread always have ! non-zero cost. Although multicore version is pretty efficient on ! large problems, we do not recommend you to use it on small problems - ! with correlation matrices smaller than 128*128. 
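
A usage sketch for pearsoncorrm() (illustrative; the data are hypothetical -
the two columns below are exactly linearly related, so the off-diagonal
correlation equals -1 up to rounding):

    real_2d_array xy = "[[1,5],[2,4],[3,3],[4,2]]";
    real_2d_array c;
    pearsoncorrm(xy, c);   // 2x2 correlation matrix with unit diagonal
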
INPUT PARAMETERS: X - array[N,M], sample matrix: * J-th column corresponds to J-th variable * I-th row corresponds to I-th observation N - N>=0, number of observations: * if given, only leading N rows of X are used * if not given, automatically determined from input size M - M>0, number of variables: * if given, only leading M columns of X are used * if not given, automatically determined from input size OUTPUT PARAMETERS: C - array[M,M], correlation matrix (zero if N=0 or N=1) -- ALGLIB -- Copyright 28.10.2010 by Bochkanov Sergey *************************************************************************/ void pearsoncorrm(const real_2d_array &x, const ae_int_t n, const ae_int_t m, real_2d_array &c) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::pearsoncorrm(const_cast(x.c_ptr()), n, m, const_cast(c.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void smp_pearsoncorrm(const real_2d_array &x, const ae_int_t n, const ae_int_t m, real_2d_array &c) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::_pexec_pearsoncorrm(const_cast(x.c_ptr()), n, m, const_cast(c.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Pearson product-moment correlation matrix SMP EDITION OF ALGLIB: ! This function can utilize multicore capabilities of your system. In ! order to do this you have to call version with "smp_" prefix, which ! indicates that multicore code will be used. ! ! This note is given for users of SMP edition; if you use GPL edition, ! or commercial edition of ALGLIB without SMP support, you still will ! be able to call smp-version of this function, but all computations ! will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. ! ! You should remember that starting/stopping worker thread always have ! non-zero cost. Although multicore version is pretty efficient on ! large problems, we do not recommend you to use it on small problems - ! with correlation matrices smaller than 128*128. 
INPUT PARAMETERS: X - array[N,M], sample matrix: * J-th column corresponds to J-th variable * I-th row corresponds to I-th observation N - N>=0, number of observations: * if given, only leading N rows of X are used * if not given, automatically determined from input size M - M>0, number of variables: * if given, only leading M columns of X are used * if not given, automatically determined from input size OUTPUT PARAMETERS: C - array[M,M], correlation matrix (zero if N=0 or N=1) -- ALGLIB -- Copyright 28.10.2010 by Bochkanov Sergey *************************************************************************/ void pearsoncorrm(const real_2d_array &x, real_2d_array &c) { alglib_impl::ae_state _alglib_env_state; ae_int_t n; ae_int_t m; n = x.rows(); m = x.cols(); alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::pearsoncorrm(const_cast(x.c_ptr()), n, m, const_cast(c.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void smp_pearsoncorrm(const real_2d_array &x, real_2d_array &c) { alglib_impl::ae_state _alglib_env_state; ae_int_t n; ae_int_t m; n = x.rows(); m = x.cols(); alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::_pexec_pearsoncorrm(const_cast(x.c_ptr()), n, m, const_cast(c.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Spearman's rank correlation matrix SMP EDITION OF ALGLIB: ! This function can utilize multicore capabilities of your system. In ! order to do this you have to call version with "smp_" prefix, which ! indicates that multicore code will be used. ! ! This note is given for users of SMP edition; if you use GPL edition, ! or commercial edition of ALGLIB without SMP support, you still will ! be able to call smp-version of this function, but all computations ! will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. ! ! You should remember that starting/stopping worker thread always have ! non-zero cost. Although multicore version is pretty efficient on ! large problems, we do not recommend you to use it on small problems - ! with correlation matrices smaller than 128*128. 
INPUT PARAMETERS: X - array[N,M], sample matrix: * J-th column corresponds to J-th variable * I-th row corresponds to I-th observation N - N>=0, number of observations: * if given, only leading N rows of X are used * if not given, automatically determined from input size M - M>0, number of variables: * if given, only leading M columns of X are used * if not given, automatically determined from input size OUTPUT PARAMETERS: C - array[M,M], correlation matrix (zero if N=0 or N=1) -- ALGLIB -- Copyright 28.10.2010 by Bochkanov Sergey *************************************************************************/ void spearmancorrm(const real_2d_array &x, const ae_int_t n, const ae_int_t m, real_2d_array &c) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::spearmancorrm(const_cast(x.c_ptr()), n, m, const_cast(c.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void smp_spearmancorrm(const real_2d_array &x, const ae_int_t n, const ae_int_t m, real_2d_array &c) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::_pexec_spearmancorrm(const_cast(x.c_ptr()), n, m, const_cast(c.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Spearman's rank correlation matrix SMP EDITION OF ALGLIB: ! This function can utilize multicore capabilities of your system. In ! order to do this you have to call version with "smp_" prefix, which ! indicates that multicore code will be used. ! ! This note is given for users of SMP edition; if you use GPL edition, ! or commercial edition of ALGLIB without SMP support, you still will ! be able to call smp-version of this function, but all computations ! will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. ! ! You should remember that starting/stopping worker thread always have ! non-zero cost. Although multicore version is pretty efficient on ! large problems, we do not recommend you to use it on small problems - ! with correlation matrices smaller than 128*128. 
INPUT PARAMETERS: X - array[N,M], sample matrix: * J-th column corresponds to J-th variable * I-th row corresponds to I-th observation N - N>=0, number of observations: * if given, only leading N rows of X are used * if not given, automatically determined from input size M - M>0, number of variables: * if given, only leading M columns of X are used * if not given, automatically determined from input size OUTPUT PARAMETERS: C - array[M,M], correlation matrix (zero if N=0 or N=1) -- ALGLIB -- Copyright 28.10.2010 by Bochkanov Sergey *************************************************************************/ void spearmancorrm(const real_2d_array &x, real_2d_array &c) { alglib_impl::ae_state _alglib_env_state; ae_int_t n; ae_int_t m; n = x.rows(); m = x.cols(); alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::spearmancorrm(const_cast(x.c_ptr()), n, m, const_cast(c.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void smp_spearmancorrm(const real_2d_array &x, real_2d_array &c) { alglib_impl::ae_state _alglib_env_state; ae_int_t n; ae_int_t m; n = x.rows(); m = x.cols(); alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::_pexec_spearmancorrm(const_cast(x.c_ptr()), n, m, const_cast(c.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Cross-covariance matrix SMP EDITION OF ALGLIB: ! This function can utilize multicore capabilities of your system. In ! order to do this you have to call version with "smp_" prefix, which ! indicates that multicore code will be used. ! ! This note is given for users of SMP edition; if you use GPL edition, ! or commercial edition of ALGLIB without SMP support, you still will ! be able to call smp-version of this function, but all computations ! will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. ! ! You should remember that starting/stopping worker thread always have ! non-zero cost. Although multicore version is pretty efficient on ! large problems, we do not recommend you to use it on small problems - ! with covariance matrices smaller than 128*128. 
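
A usage sketch for the cross-covariance function covm2() (illustrative; the
matrices are hypothetical). pearsoncorrm2() and spearmancorrm2() below follow
exactly the same calling pattern:

    real_2d_array x = "[[1,2],[2,4],[3,8]]";   // 3 observations, M1=2 variables
    real_2d_array y = "[[5],[4],[3]]";         // same 3 observations, M2=1 variable
    real_2d_array c;
    covm2(x, y, c);    // c becomes a 2x1 cross-covariance matrix
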
INPUT PARAMETERS: X - array[N,M1], sample matrix: * J-th column corresponds to J-th variable * I-th row corresponds to I-th observation Y - array[N,M2], sample matrix: * J-th column corresponds to J-th variable * I-th row corresponds to I-th observation N - N>=0, number of observations: * if given, only leading N rows of X/Y are used * if not given, automatically determined from input sizes M1 - M1>0, number of variables in X: * if given, only leading M1 columns of X are used * if not given, automatically determined from input size M2 - M2>0, number of variables in Y: * if given, only leading M1 columns of X are used * if not given, automatically determined from input size OUTPUT PARAMETERS: C - array[M1,M2], cross-covariance matrix (zero if N=0 or N=1) -- ALGLIB -- Copyright 28.10.2010 by Bochkanov Sergey *************************************************************************/ void covm2(const real_2d_array &x, const real_2d_array &y, const ae_int_t n, const ae_int_t m1, const ae_int_t m2, real_2d_array &c) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::covm2(const_cast(x.c_ptr()), const_cast(y.c_ptr()), n, m1, m2, const_cast(c.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void smp_covm2(const real_2d_array &x, const real_2d_array &y, const ae_int_t n, const ae_int_t m1, const ae_int_t m2, real_2d_array &c) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::_pexec_covm2(const_cast(x.c_ptr()), const_cast(y.c_ptr()), n, m1, m2, const_cast(c.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Cross-covariance matrix SMP EDITION OF ALGLIB: ! This function can utilize multicore capabilities of your system. In ! order to do this you have to call version with "smp_" prefix, which ! indicates that multicore code will be used. ! ! This note is given for users of SMP edition; if you use GPL edition, ! or commercial edition of ALGLIB without SMP support, you still will ! be able to call smp-version of this function, but all computations ! will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. ! ! You should remember that starting/stopping worker thread always have ! non-zero cost. Although multicore version is pretty efficient on ! large problems, we do not recommend you to use it on small problems - ! with covariance matrices smaller than 128*128. 
INPUT PARAMETERS: X - array[N,M1], sample matrix: * J-th column corresponds to J-th variable * I-th row corresponds to I-th observation Y - array[N,M2], sample matrix: * J-th column corresponds to J-th variable * I-th row corresponds to I-th observation N - N>=0, number of observations: * if given, only leading N rows of X/Y are used * if not given, automatically determined from input sizes M1 - M1>0, number of variables in X: * if given, only leading M1 columns of X are used * if not given, automatically determined from input size M2 - M2>0, number of variables in Y: * if given, only leading M1 columns of X are used * if not given, automatically determined from input size OUTPUT PARAMETERS: C - array[M1,M2], cross-covariance matrix (zero if N=0 or N=1) -- ALGLIB -- Copyright 28.10.2010 by Bochkanov Sergey *************************************************************************/ void covm2(const real_2d_array &x, const real_2d_array &y, real_2d_array &c) { alglib_impl::ae_state _alglib_env_state; ae_int_t n; ae_int_t m1; ae_int_t m2; if( (x.rows()!=y.rows())) throw ap_error("Error while calling 'covm2': looks like one of arguments has wrong size"); n = x.rows(); m1 = x.cols(); m2 = y.cols(); alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::covm2(const_cast(x.c_ptr()), const_cast(y.c_ptr()), n, m1, m2, const_cast(c.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void smp_covm2(const real_2d_array &x, const real_2d_array &y, real_2d_array &c) { alglib_impl::ae_state _alglib_env_state; ae_int_t n; ae_int_t m1; ae_int_t m2; if( (x.rows()!=y.rows())) throw ap_error("Error while calling 'covm2': looks like one of arguments has wrong size"); n = x.rows(); m1 = x.cols(); m2 = y.cols(); alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::_pexec_covm2(const_cast(x.c_ptr()), const_cast(y.c_ptr()), n, m1, m2, const_cast(c.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Pearson product-moment cross-correlation matrix SMP EDITION OF ALGLIB: ! This function can utilize multicore capabilities of your system. In ! order to do this you have to call version with "smp_" prefix, which ! indicates that multicore code will be used. ! ! This note is given for users of SMP edition; if you use GPL edition, ! or commercial edition of ALGLIB without SMP support, you still will ! be able to call smp-version of this function, but all computations ! will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. ! ! You should remember that starting/stopping worker thread always have ! non-zero cost. Although multicore version is pretty efficient on ! large problems, we do not recommend you to use it on small problems - ! with correlation matrices smaller than 128*128. 
INPUT PARAMETERS: X - array[N,M1], sample matrix: * J-th column corresponds to J-th variable * I-th row corresponds to I-th observation Y - array[N,M2], sample matrix: * J-th column corresponds to J-th variable * I-th row corresponds to I-th observation N - N>=0, number of observations: * if given, only leading N rows of X/Y are used * if not given, automatically determined from input sizes M1 - M1>0, number of variables in X: * if given, only leading M1 columns of X are used * if not given, automatically determined from input size M2 - M2>0, number of variables in Y: * if given, only leading M1 columns of X are used * if not given, automatically determined from input size OUTPUT PARAMETERS: C - array[M1,M2], cross-correlation matrix (zero if N=0 or N=1) -- ALGLIB -- Copyright 28.10.2010 by Bochkanov Sergey *************************************************************************/ void pearsoncorrm2(const real_2d_array &x, const real_2d_array &y, const ae_int_t n, const ae_int_t m1, const ae_int_t m2, real_2d_array &c) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::pearsoncorrm2(const_cast(x.c_ptr()), const_cast(y.c_ptr()), n, m1, m2, const_cast(c.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void smp_pearsoncorrm2(const real_2d_array &x, const real_2d_array &y, const ae_int_t n, const ae_int_t m1, const ae_int_t m2, real_2d_array &c) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::_pexec_pearsoncorrm2(const_cast(x.c_ptr()), const_cast(y.c_ptr()), n, m1, m2, const_cast(c.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Pearson product-moment cross-correlation matrix SMP EDITION OF ALGLIB: ! This function can utilize multicore capabilities of your system. In ! order to do this you have to call version with "smp_" prefix, which ! indicates that multicore code will be used. ! ! This note is given for users of SMP edition; if you use GPL edition, ! or commercial edition of ALGLIB without SMP support, you still will ! be able to call smp-version of this function, but all computations ! will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. ! ! You should remember that starting/stopping worker thread always have ! non-zero cost. Although multicore version is pretty efficient on ! large problems, we do not recommend you to use it on small problems - ! with correlation matrices smaller than 128*128. 
INPUT PARAMETERS: X - array[N,M1], sample matrix: * J-th column corresponds to J-th variable * I-th row corresponds to I-th observation Y - array[N,M2], sample matrix: * J-th column corresponds to J-th variable * I-th row corresponds to I-th observation N - N>=0, number of observations: * if given, only leading N rows of X/Y are used * if not given, automatically determined from input sizes M1 - M1>0, number of variables in X: * if given, only leading M1 columns of X are used * if not given, automatically determined from input size M2 - M2>0, number of variables in Y: * if given, only leading M1 columns of X are used * if not given, automatically determined from input size OUTPUT PARAMETERS: C - array[M1,M2], cross-correlation matrix (zero if N=0 or N=1) -- ALGLIB -- Copyright 28.10.2010 by Bochkanov Sergey *************************************************************************/ void pearsoncorrm2(const real_2d_array &x, const real_2d_array &y, real_2d_array &c) { alglib_impl::ae_state _alglib_env_state; ae_int_t n; ae_int_t m1; ae_int_t m2; if( (x.rows()!=y.rows())) throw ap_error("Error while calling 'pearsoncorrm2': looks like one of arguments has wrong size"); n = x.rows(); m1 = x.cols(); m2 = y.cols(); alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::pearsoncorrm2(const_cast(x.c_ptr()), const_cast(y.c_ptr()), n, m1, m2, const_cast(c.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void smp_pearsoncorrm2(const real_2d_array &x, const real_2d_array &y, real_2d_array &c) { alglib_impl::ae_state _alglib_env_state; ae_int_t n; ae_int_t m1; ae_int_t m2; if( (x.rows()!=y.rows())) throw ap_error("Error while calling 'pearsoncorrm2': looks like one of arguments has wrong size"); n = x.rows(); m1 = x.cols(); m2 = y.cols(); alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::_pexec_pearsoncorrm2(const_cast(x.c_ptr()), const_cast(y.c_ptr()), n, m1, m2, const_cast(c.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Spearman's rank cross-correlation matrix SMP EDITION OF ALGLIB: ! This function can utilize multicore capabilities of your system. In ! order to do this you have to call version with "smp_" prefix, which ! indicates that multicore code will be used. ! ! This note is given for users of SMP edition; if you use GPL edition, ! or commercial edition of ALGLIB without SMP support, you still will ! be able to call smp-version of this function, but all computations ! will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. ! ! You should remember that starting/stopping worker thread always have ! non-zero cost. Although multicore version is pretty efficient on ! large problems, we do not recommend you to use it on small problems - ! with correlation matrices smaller than 128*128. 
INPUT PARAMETERS: X - array[N,M1], sample matrix: * J-th column corresponds to J-th variable * I-th row corresponds to I-th observation Y - array[N,M2], sample matrix: * J-th column corresponds to J-th variable * I-th row corresponds to I-th observation N - N>=0, number of observations: * if given, only leading N rows of X/Y are used * if not given, automatically determined from input sizes M1 - M1>0, number of variables in X: * if given, only leading M1 columns of X are used * if not given, automatically determined from input size M2 - M2>0, number of variables in Y: * if given, only leading M1 columns of X are used * if not given, automatically determined from input size OUTPUT PARAMETERS: C - array[M1,M2], cross-correlation matrix (zero if N=0 or N=1) -- ALGLIB -- Copyright 28.10.2010 by Bochkanov Sergey *************************************************************************/ void spearmancorrm2(const real_2d_array &x, const real_2d_array &y, const ae_int_t n, const ae_int_t m1, const ae_int_t m2, real_2d_array &c) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::spearmancorrm2(const_cast(x.c_ptr()), const_cast(y.c_ptr()), n, m1, m2, const_cast(c.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void smp_spearmancorrm2(const real_2d_array &x, const real_2d_array &y, const ae_int_t n, const ae_int_t m1, const ae_int_t m2, real_2d_array &c) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::_pexec_spearmancorrm2(const_cast(x.c_ptr()), const_cast(y.c_ptr()), n, m1, m2, const_cast(c.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Spearman's rank cross-correlation matrix SMP EDITION OF ALGLIB: ! This function can utilize multicore capabilities of your system. In ! order to do this you have to call version with "smp_" prefix, which ! indicates that multicore code will be used. ! ! This note is given for users of SMP edition; if you use GPL edition, ! or commercial edition of ALGLIB without SMP support, you still will ! be able to call smp-version of this function, but all computations ! will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. ! ! You should remember that starting/stopping worker thread always have ! non-zero cost. Although multicore version is pretty efficient on ! large problems, we do not recommend you to use it on small problems - ! with correlation matrices smaller than 128*128. 
INPUT PARAMETERS: X - array[N,M1], sample matrix: * J-th column corresponds to J-th variable * I-th row corresponds to I-th observation Y - array[N,M2], sample matrix: * J-th column corresponds to J-th variable * I-th row corresponds to I-th observation N - N>=0, number of observations: * if given, only leading N rows of X/Y are used * if not given, automatically determined from input sizes M1 - M1>0, number of variables in X: * if given, only leading M1 columns of X are used * if not given, automatically determined from input size M2 - M2>0, number of variables in Y: * if given, only leading M1 columns of X are used * if not given, automatically determined from input size OUTPUT PARAMETERS: C - array[M1,M2], cross-correlation matrix (zero if N=0 or N=1) -- ALGLIB -- Copyright 28.10.2010 by Bochkanov Sergey *************************************************************************/ void spearmancorrm2(const real_2d_array &x, const real_2d_array &y, real_2d_array &c) { alglib_impl::ae_state _alglib_env_state; ae_int_t n; ae_int_t m1; ae_int_t m2; if( (x.rows()!=y.rows())) throw ap_error("Error while calling 'spearmancorrm2': looks like one of arguments has wrong size"); n = x.rows(); m1 = x.cols(); m2 = y.cols(); alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::spearmancorrm2(const_cast(x.c_ptr()), const_cast(y.c_ptr()), n, m1, m2, const_cast(c.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void smp_spearmancorrm2(const real_2d_array &x, const real_2d_array &y, real_2d_array &c) { alglib_impl::ae_state _alglib_env_state; ae_int_t n; ae_int_t m1; ae_int_t m2; if( (x.rows()!=y.rows())) throw ap_error("Error while calling 'spearmancorrm2': looks like one of arguments has wrong size"); n = x.rows(); m1 = x.cols(); m2 = y.cols(); alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::_pexec_spearmancorrm2(const_cast(x.c_ptr()), const_cast(y.c_ptr()), n, m1, m2, const_cast(c.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function replaces data in XY by their ranks: * XY is processed row-by-row * rows are processed separately * tied data are correctly handled (tied ranks are calculated) * ranking starts from 0, ends at NFeatures-1 * sum of within-row values is equal to (NFeatures-1)*NFeatures/2 SMP EDITION OF ALGLIB: ! This function can utilize multicore capabilities of your system. In ! order to do this you have to call version with "smp_" prefix, which ! indicates that multicore code will be used. ! ! This note is given for users of SMP edition; if you use GPL edition, ! or commercial edition of ALGLIB without SMP support, you still will ! be able to call smp-version of this function, but all computations ! will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. ! ! You should remember that starting/stopping worker thread always have ! non-zero cost. Although multicore version is pretty efficient on ! large problems, we do not recommend you to use it on small problems - ! 
ones where expected operations count is less than 100.000 INPUT PARAMETERS: XY - array[NPoints,NFeatures], dataset NPoints - number of points NFeatures- number of features OUTPUT PARAMETERS: XY - data are replaced by their within-row ranks; ranking starts from 0, ends at NFeatures-1 -- ALGLIB -- Copyright 18.04.2013 by Bochkanov Sergey *************************************************************************/ void rankdata(const real_2d_array &xy, const ae_int_t npoints, const ae_int_t nfeatures) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::rankdata(const_cast(xy.c_ptr()), npoints, nfeatures, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void smp_rankdata(const real_2d_array &xy, const ae_int_t npoints, const ae_int_t nfeatures) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::_pexec_rankdata(const_cast(xy.c_ptr()), npoints, nfeatures, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function replaces data in XY by their ranks: * XY is processed row-by-row * rows are processed separately * tied data are correctly handled (tied ranks are calculated) * ranking starts from 0, ends at NFeatures-1 * sum of within-row values is equal to (NFeatures-1)*NFeatures/2 SMP EDITION OF ALGLIB: ! This function can utilize multicore capabilities of your system. In ! order to do this you have to call version with "smp_" prefix, which ! indicates that multicore code will be used. ! ! This note is given for users of SMP edition; if you use GPL edition, ! or commercial edition of ALGLIB without SMP support, you still will ! be able to call smp-version of this function, but all computations ! will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. ! ! You should remember that starting/stopping worker thread always have ! non-zero cost. Although multicore version is pretty efficient on ! large problems, we do not recommend you to use it on small problems - ! 
ones where expected operations count is less than 100.000 INPUT PARAMETERS: XY - array[NPoints,NFeatures], dataset NPoints - number of points NFeatures- number of features OUTPUT PARAMETERS: XY - data are replaced by their within-row ranks; ranking starts from 0, ends at NFeatures-1 -- ALGLIB -- Copyright 18.04.2013 by Bochkanov Sergey *************************************************************************/ void rankdata(real_2d_array &xy) { alglib_impl::ae_state _alglib_env_state; ae_int_t npoints; ae_int_t nfeatures; npoints = xy.rows(); nfeatures = xy.cols(); alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::rankdata(const_cast(xy.c_ptr()), npoints, nfeatures, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void smp_rankdata(real_2d_array &xy) { alglib_impl::ae_state _alglib_env_state; ae_int_t npoints; ae_int_t nfeatures; npoints = xy.rows(); nfeatures = xy.cols(); alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::_pexec_rankdata(const_cast(xy.c_ptr()), npoints, nfeatures, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function replaces data in XY by their CENTERED ranks: * XY is processed row-by-row * rows are processed separately * tied data are correctly handled (tied ranks are calculated) * centered ranks are just usual ranks, but centered in such way that sum of within-row values is equal to 0.0. * centering is performed by subtracting mean from each row, i.e it changes mean value, but does NOT change higher moments SMP EDITION OF ALGLIB: ! This function can utilize multicore capabilities of your system. In ! order to do this you have to call version with "smp_" prefix, which ! indicates that multicore code will be used. ! ! This note is given for users of SMP edition; if you use GPL edition, ! or commercial edition of ALGLIB without SMP support, you still will ! be able to call smp-version of this function, but all computations ! will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. ! ! You should remember that starting/stopping worker thread always have ! non-zero cost. Although multicore version is pretty efficient on ! large problems, we do not recommend you to use it on small problems - ! 
ones where expected operations count is less than 100.000 INPUT PARAMETERS: XY - array[NPoints,NFeatures], dataset NPoints - number of points NFeatures- number of features OUTPUT PARAMETERS: XY - data are replaced by their within-row ranks; ranking starts from 0, ends at NFeatures-1 -- ALGLIB -- Copyright 18.04.2013 by Bochkanov Sergey *************************************************************************/ void rankdatacentered(const real_2d_array &xy, const ae_int_t npoints, const ae_int_t nfeatures) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::rankdatacentered(const_cast(xy.c_ptr()), npoints, nfeatures, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void smp_rankdatacentered(const real_2d_array &xy, const ae_int_t npoints, const ae_int_t nfeatures) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::_pexec_rankdatacentered(const_cast(xy.c_ptr()), npoints, nfeatures, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function replaces data in XY by their CENTERED ranks: * XY is processed row-by-row * rows are processed separately * tied data are correctly handled (tied ranks are calculated) * centered ranks are just usual ranks, but centered in such way that sum of within-row values is equal to 0.0. * centering is performed by subtracting mean from each row, i.e it changes mean value, but does NOT change higher moments SMP EDITION OF ALGLIB: ! This function can utilize multicore capabilities of your system. In ! order to do this you have to call version with "smp_" prefix, which ! indicates that multicore code will be used. ! ! This note is given for users of SMP edition; if you use GPL edition, ! or commercial edition of ALGLIB without SMP support, you still will ! be able to call smp-version of this function, but all computations ! will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. ! ! You should remember that starting/stopping worker thread always have ! non-zero cost. Although multicore version is pretty efficient on ! large problems, we do not recommend you to use it on small problems - ! 
ones where expected operations count is less than 100.000 INPUT PARAMETERS: XY - array[NPoints,NFeatures], dataset NPoints - number of points NFeatures- number of features OUTPUT PARAMETERS: XY - data are replaced by their within-row ranks; ranking starts from 0, ends at NFeatures-1 -- ALGLIB -- Copyright 18.04.2013 by Bochkanov Sergey *************************************************************************/ void rankdatacentered(real_2d_array &xy) { alglib_impl::ae_state _alglib_env_state; ae_int_t npoints; ae_int_t nfeatures; npoints = xy.rows(); nfeatures = xy.cols(); alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::rankdatacentered(const_cast(xy.c_ptr()), npoints, nfeatures, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void smp_rankdatacentered(real_2d_array &xy) { alglib_impl::ae_state _alglib_env_state; ae_int_t npoints; ae_int_t nfeatures; npoints = xy.rows(); nfeatures = xy.cols(); alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::_pexec_rankdatacentered(const_cast(xy.c_ptr()), npoints, nfeatures, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Obsolete function, we recommend to use PearsonCorr2(). -- ALGLIB -- Copyright 09.04.2007 by Bochkanov Sergey *************************************************************************/ double pearsoncorrelation(const real_1d_array &x, const real_1d_array &y, const ae_int_t n) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::pearsoncorrelation(const_cast(x.c_ptr()), const_cast(y.c_ptr()), n, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Obsolete function, we recommend to use SpearmanCorr2(). -- ALGLIB -- Copyright 09.04.2007 by Bochkanov Sergey *************************************************************************/ double spearmanrankcorrelation(const real_1d_array &x, const real_1d_array &y, const ae_int_t n) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::spearmanrankcorrelation(const_cast(x.c_ptr()), const_cast(y.c_ptr()), n, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Wilcoxon signed-rank test This test checks three hypotheses about the median of the given sample. The following tests are performed: * two-tailed test (null hypothesis - the median is equal to the given value) * left-tailed test (null hypothesis - the median is greater than or equal to the given value) * right-tailed test (null hypothesis - the median is less than or equal to the given value) Requirements: * the scale of measurement should be ordinal, interval or ratio (i.e. the test could not be applied to nominal variables). * the distribution should be continuous and symmetric relative to its median. 
* number of distinct values in the X array should be greater than 4 The test is non-parametric and doesn't require distribution X to be normal Input parameters: X - sample. Array whose index goes from 0 to N-1. N - size of the sample. Median - assumed median value. Output parameters: BothTails - p-value for two-tailed test. If BothTails is less than the given significance level the null hypothesis is rejected. LeftTail - p-value for left-tailed test. If LeftTail is less than the given significance level, the null hypothesis is rejected. RightTail - p-value for right-tailed test. If RightTail is less than the given significance level the null hypothesis is rejected. To calculate p-values, special approximation is used. This method lets us calculate p-values with two decimal places in interval [0.0001, 1]. "Two decimal places" does not sound very impressive, but in practice the relative error of less than 1% is enough to make a decision. There is no approximation outside the [0.0001, 1] interval. Therefore, if the significance level outlies this interval, the test returns 0.0001. -- ALGLIB -- Copyright 08.09.2006 by Bochkanov Sergey *************************************************************************/ void wilcoxonsignedranktest(const real_1d_array &x, const ae_int_t n, const double e, double &bothtails, double &lefttail, double &righttail) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::wilcoxonsignedranktest(const_cast(x.c_ptr()), n, e, &bothtails, &lefttail, &righttail, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Sign test This test checks three hypotheses about the median of the given sample. The following tests are performed: * two-tailed test (null hypothesis - the median is equal to the given value) * left-tailed test (null hypothesis - the median is greater than or equal to the given value) * right-tailed test (null hypothesis - the median is less than or equal to the given value) Requirements: * the scale of measurement should be ordinal, interval or ratio (i.e. the test could not be applied to nominal variables). The test is non-parametric and doesn't require distribution X to be normal Input parameters: X - sample. Array whose index goes from 0 to N-1. N - size of the sample. Median - assumed median value. Output parameters: BothTails - p-value for two-tailed test. If BothTails is less than the given significance level the null hypothesis is rejected. LeftTail - p-value for left-tailed test. If LeftTail is less than the given significance level, the null hypothesis is rejected. RightTail - p-value for right-tailed test. If RightTail is less than the given significance level the null hypothesis is rejected. While calculating p-values high-precision binomial distribution approximation is used, so significance levels have about 15 exact digits. 
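
A usage sketch for onesamplesigntest() (illustrative; the sample values and
the tested median are hypothetical):

    real_1d_array x = "[-1.2, 0.4, 0.7, 1.1, 1.9, 2.3, 3.0]";
    double bt, lt, rt;
    onesamplesigntest(x, 7, 0.0, bt, lt, rt);  // H0: median equals 0.0
    // reject the two-tailed null hypothesis when bt is below the chosen significance level
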
-- ALGLIB -- Copyright 08.09.2006 by Bochkanov Sergey *************************************************************************/ void onesamplesigntest(const real_1d_array &x, const ae_int_t n, const double median, double &bothtails, double &lefttail, double &righttail) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::onesamplesigntest(const_cast(x.c_ptr()), n, median, &bothtails, &lefttail, &righttail, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Pearson's correlation coefficient significance test This test checks hypotheses about whether X and Y are samples of two continuous distributions having zero correlation or whether their correlation is non-zero. The following tests are performed: * two-tailed test (null hypothesis - X and Y have zero correlation) * left-tailed test (null hypothesis - the correlation coefficient is greater than or equal to 0) * right-tailed test (null hypothesis - the correlation coefficient is less than or equal to 0). Requirements: * the number of elements in each sample is not less than 5 * normality of distributions of X and Y. Input parameters: R - Pearson's correlation coefficient for X and Y N - number of elements in samples, N>=5. Output parameters: BothTails - p-value for two-tailed test. If BothTails is less than the given significance level the null hypothesis is rejected. LeftTail - p-value for left-tailed test. If LeftTail is less than the given significance level, the null hypothesis is rejected. RightTail - p-value for right-tailed test. If RightTail is less than the given significance level the null hypothesis is rejected. -- ALGLIB -- Copyright 09.04.2007 by Bochkanov Sergey *************************************************************************/ void pearsoncorrelationsignificance(const double r, const ae_int_t n, double &bothtails, double &lefttail, double &righttail) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::pearsoncorrelationsignificance(r, n, &bothtails, &lefttail, &righttail, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Spearman's rank correlation coefficient significance test This test checks hypotheses about whether X and Y are samples of two continuous distributions having zero correlation or whether their correlation is non-zero. The following tests are performed: * two-tailed test (null hypothesis - X and Y have zero correlation) * left-tailed test (null hypothesis - the correlation coefficient is greater than or equal to 0) * right-tailed test (null hypothesis - the correlation coefficient is less than or equal to 0). Requirements: * the number of elements in each sample is not less than 5. The test is non-parametric and doesn't require distributions X and Y to be normal. Input parameters: R - Spearman's rank correlation coefficient for X and Y N - number of elements in samples, N>=5. Output parameters: BothTails - p-value for two-tailed test. If BothTails is less than the given significance level the null hypothesis is rejected. LeftTail - p-value for left-tailed test. 
If LeftTail is less than the given significance level, the null hypothesis is rejected. RightTail - p-value for right-tailed test. If RightTail is less than the given significance level the null hypothesis is rejected. -- ALGLIB -- Copyright 09.04.2007 by Bochkanov Sergey *************************************************************************/ void spearmanrankcorrelationsignificance(const double r, const ae_int_t n, double &bothtails, double &lefttail, double &righttail) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::spearmanrankcorrelationsignificance(r, n, &bothtails, &lefttail, &righttail, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* One-sample t-test This test checks three hypotheses about the mean of the given sample. The following tests are performed: * two-tailed test (null hypothesis - the mean is equal to the given value) * left-tailed test (null hypothesis - the mean is greater than or equal to the given value) * right-tailed test (null hypothesis - the mean is less than or equal to the given value). The test is based on the assumption that a given sample has a normal distribution and an unknown dispersion. If the distribution sharply differs from normal, the test will work incorrectly. INPUT PARAMETERS: X - sample. Array whose index goes from 0 to N-1. N - size of sample, N>=0 Mean - assumed value of the mean. OUTPUT PARAMETERS: BothTails - p-value for two-tailed test. If BothTails is less than the given significance level the null hypothesis is rejected. LeftTail - p-value for left-tailed test. If LeftTail is less than the given significance level, the null hypothesis is rejected. RightTail - p-value for right-tailed test. If RightTail is less than the given significance level the null hypothesis is rejected. NOTE: this function correctly handles degenerate cases: * when N=0, all p-values are set to 1.0 * when variance of X[] is exactly zero, p-values are set to 1.0 or 0.0, depending on difference between sample mean and value of mean being tested. -- ALGLIB -- Copyright 08.09.2006 by Bochkanov Sergey *************************************************************************/ void studentttest1(const real_1d_array &x, const ae_int_t n, const double mean, double &bothtails, double &lefttail, double &righttail) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::studentttest1(const_cast(x.c_ptr()), n, mean, &bothtails, &lefttail, &righttail, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Two-sample pooled test This test checks three hypotheses about the mean of the given samples. The following tests are performed: * two-tailed test (null hypothesis - the means are equal) * left-tailed test (null hypothesis - the mean of the first sample is greater than or equal to the mean of the second sample) * right-tailed test (null hypothesis - the mean of the first sample is less than or equal to the mean of the second sample). Test is based on the following assumptions: * given samples have normal distributions * dispersions are equal * samples are independent. 
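
A usage sketch for studentttest2() (illustrative; the samples are hypothetical
and deliberately small):

    real_1d_array x = "[5.1, 4.9, 5.3, 5.0, 5.2]";
    real_1d_array y = "[4.6, 4.8, 4.7, 4.9]";
    double bt, lt, rt;
    studentttest2(x, 5, y, 4, bt, lt, rt);  // pooled two-sample t-test, N=5, M=4
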
Input parameters: X - sample 1. Array whose index goes from 0 to N-1. N - size of sample. Y - sample 2. Array whose index goes from 0 to M-1. M - size of sample. Output parameters: BothTails - p-value for two-tailed test. If BothTails is less than the given significance level the null hypothesis is rejected. LeftTail - p-value for left-tailed test. If LeftTail is less than the given significance level, the null hypothesis is rejected. RightTail - p-value for right-tailed test. If RightTail is less than the given significance level the null hypothesis is rejected. NOTE: this function correctly handles degenerate cases: * when N=0 or M=0, all p-values are set to 1.0 * when both samples has exactly zero variance, p-values are set to 1.0 or 0.0, depending on difference between means. -- ALGLIB -- Copyright 18.09.2006 by Bochkanov Sergey *************************************************************************/ void studentttest2(const real_1d_array &x, const ae_int_t n, const real_1d_array &y, const ae_int_t m, double &bothtails, double &lefttail, double &righttail) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::studentttest2(const_cast(x.c_ptr()), n, const_cast(y.c_ptr()), m, &bothtails, &lefttail, &righttail, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Two-sample unpooled test This test checks three hypotheses about the mean of the given samples. The following tests are performed: * two-tailed test (null hypothesis - the means are equal) * left-tailed test (null hypothesis - the mean of the first sample is greater than or equal to the mean of the second sample) * right-tailed test (null hypothesis - the mean of the first sample is less than or equal to the mean of the second sample). Test is based on the following assumptions: * given samples have normal distributions * samples are independent. Equality of variances is NOT required. Input parameters: X - sample 1. Array whose index goes from 0 to N-1. N - size of the sample. Y - sample 2. Array whose index goes from 0 to M-1. M - size of the sample. Output parameters: BothTails - p-value for two-tailed test. If BothTails is less than the given significance level the null hypothesis is rejected. LeftTail - p-value for left-tailed test. If LeftTail is less than the given significance level, the null hypothesis is rejected. RightTail - p-value for right-tailed test. If RightTail is less than the given significance level the null hypothesis is rejected. NOTE: this function correctly handles degenerate cases: * when N=0 or M=0, all p-values are set to 1.0 * when both samples has zero variance, p-values are set to 1.0 or 0.0, depending on difference between means. * when only one sample has zero variance, test reduces to 1-sample version. 
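NOTE: the following usage sketch is illustrative only and is not part of the ALGLIB sources; it assumes the statistics.h header of the ALGLIB for C++ distribution and made-up sample values:

    #include "statistics.h"
    #include <cstdio>
    int main()
    {
        // independent normal samples whose variances may differ (Welch-type test)
        alglib::real_1d_array x = "[10.2, 9.8, 10.5, 10.1, 9.9, 10.3]";
        alglib::real_1d_array y = "[11.0, 12.4, 10.1, 13.2, 9.7]";
        double bothtails, lefttail, righttail;
        alglib::unequalvariancettest(x, x.length(), y, y.length(), bothtails, lefttail, righttail);
        // H0 'means are equal' is rejected at the 5% level when bothtails<0.05
        printf("%.4f %.4f %.4f\n", bothtails, lefttail, righttail);
        return 0;
    }
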
-- ALGLIB -- Copyright 18.09.2006 by Bochkanov Sergey *************************************************************************/ void unequalvariancettest(const real_1d_array &x, const ae_int_t n, const real_1d_array &y, const ae_int_t m, double &bothtails, double &lefttail, double &righttail) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::unequalvariancettest(const_cast(x.c_ptr()), n, const_cast(y.c_ptr()), m, &bothtails, &lefttail, &righttail, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Mann-Whitney U-test This test checks hypotheses about whether X and Y are samples of two continuous distributions of the same shape and same median or whether their medians are different. The following tests are performed: * two-tailed test (null hypothesis - the medians are equal) * left-tailed test (null hypothesis - the median of the first sample is greater than or equal to the median of the second sample) * right-tailed test (null hypothesis - the median of the first sample is less than or equal to the median of the second sample). Requirements: * the samples are independent * X and Y are continuous distributions (or discrete distributions well- approximating continuous distributions) * distributions of X and Y have the same shape. The only possible difference is their position (i.e. the value of the median) * the number of elements in each sample is not less than 5 * the scale of measurement should be ordinal, interval or ratio (i.e. the test could not be applied to nominal variables). The test is non-parametric and doesn't require distributions to be normal. Input parameters: X - sample 1. Array whose index goes from 0 to N-1. N - size of the sample. N>=5 Y - sample 2. Array whose index goes from 0 to M-1. M - size of the sample. M>=5 Output parameters: BothTails - p-value for two-tailed test. If BothTails is less than the given significance level the null hypothesis is rejected. LeftTail - p-value for left-tailed test. If LeftTail is less than the given significance level, the null hypothesis is rejected. RightTail - p-value for right-tailed test. If RightTail is less than the given significance level the null hypothesis is rejected. To calculate p-values, special approximation is used. This method lets us calculate p-values with satisfactory accuracy in interval [0.0001, 1]. There is no approximation outside the [0.0001, 1] interval. Therefore, if the significance level outlies this interval, the test returns 0.0001. Relative precision of approximation of p-value: N M Max.err. Rms.err. 5..10 N..10 1.4e-02 6.0e-04 5..10 N..100 2.2e-02 5.3e-06 10..15 N..15 1.0e-02 3.2e-04 10..15 N..100 1.0e-02 2.2e-05 15..100 N..100 6.1e-03 2.7e-06 For N,M>100 accuracy checks weren't put into practice, but taking into account characteristics of asymptotic approximation used, precision should not be sharply different from the values for interval [5, 100]. NOTE: P-value approximation was optimized for 0.0001<=p<=0.2500. Thus, P's outside of this interval are enforced to these bounds. Say, you may quite often get P equal to exactly 0.25 or 0.0001. 
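A minimal usage sketch (illustrative only, not part of the ALGLIB sources; it assumes the statistics.h header of the ALGLIB for C++ distribution and made-up sample values):

    #include "statistics.h"
    #include <cstdio>
    int main()
    {
        // two independent samples, each with at least 5 elements (see requirements above)
        alglib::real_1d_array x = "[12.1, 13.4, 11.8, 14.0, 12.7, 13.1]";
        alglib::real_1d_array y = "[10.9, 11.5, 12.0, 10.2, 11.1]";
        double bothtails, lefttail, righttail;
        alglib::mannwhitneyutest(x, x.length(), y, y.length(), bothtails, lefttail, righttail);
        // p-values outside [0.0001, 0.25] are clamped to these bounds, as noted above
        printf("%.4f %.4f %.4f\n", bothtails, lefttail, righttail);
        return 0;
    }
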
-- ALGLIB -- Copyright 09.04.2007 by Bochkanov Sergey *************************************************************************/ void mannwhitneyutest(const real_1d_array &x, const ae_int_t n, const real_1d_array &y, const ae_int_t m, double &bothtails, double &lefttail, double &righttail) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::mannwhitneyutest(const_cast(x.c_ptr()), n, const_cast(y.c_ptr()), m, &bothtails, &lefttail, &righttail, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Jarque-Bera test This test checks hypotheses about the fact that a given sample X is a sample of normal random variable. Requirements: * the number of elements in the sample is not less than 5. Input parameters: X - sample. Array whose index goes from 0 to N-1. N - size of the sample. N>=5 Output parameters: P - p-value for the test Accuracy of the approximation used (5<=N<=1951): p-value relative error (5<=N<=1951) [1, 0.1] < 1% [0.1, 0.01] < 2% [0.01, 0.001] < 6% [0.001, 0] wasn't measured For N>1951 accuracy wasn't measured but it shouldn't be sharply different from table values. -- ALGLIB -- Copyright 09.04.2007 by Bochkanov Sergey *************************************************************************/ void jarqueberatest(const real_1d_array &x, const ae_int_t n, double &p) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::jarqueberatest(const_cast(x.c_ptr()), n, &p, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Two-sample F-test This test checks three hypotheses about dispersions of the given samples. The following tests are performed: * two-tailed test (null hypothesis - the dispersions are equal) * left-tailed test (null hypothesis - the dispersion of the first sample is greater than or equal to the dispersion of the second sample). * right-tailed test (null hypothesis - the dispersion of the first sample is less than or equal to the dispersion of the second sample) The test is based on the following assumptions: * the given samples have normal distributions * the samples are independent. Input parameters: X - sample 1. Array whose index goes from 0 to N-1. N - sample size. Y - sample 2. Array whose index goes from 0 to M-1. M - sample size. Output parameters: BothTails - p-value for two-tailed test. If BothTails is less than the given significance level the null hypothesis is rejected. LeftTail - p-value for left-tailed test. If LeftTail is less than the given significance level, the null hypothesis is rejected. RightTail - p-value for right-tailed test. If RightTail is less than the given significance level the null hypothesis is rejected. 
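A minimal usage sketch (illustrative only, not part of the ALGLIB sources; it assumes the statistics.h header of the ALGLIB for C++ distribution and made-up sample values):

    #include "statistics.h"
    #include <cstdio>
    int main()
    {
        // independent samples assumed to come from normal distributions
        alglib::real_1d_array x = "[3.1, 2.9, 3.4, 3.0, 3.3, 2.8]";
        alglib::real_1d_array y = "[2.7, 3.8, 2.2, 4.1, 3.5]";
        double bothtails, lefttail, righttail;
        alglib::ftest(x, x.length(), y, y.length(), bothtails, lefttail, righttail);
        // H0 'dispersions are equal' is rejected at the 5% level when bothtails<0.05
        printf("%.4f %.4f %.4f\n", bothtails, lefttail, righttail);
        return 0;
    }
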
-- ALGLIB -- Copyright 19.09.2006 by Bochkanov Sergey *************************************************************************/ void ftest(const real_1d_array &x, const ae_int_t n, const real_1d_array &y, const ae_int_t m, double &bothtails, double &lefttail, double &righttail) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::ftest(const_cast(x.c_ptr()), n, const_cast(y.c_ptr()), m, &bothtails, &lefttail, &righttail, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* One-sample chi-square test This test checks three hypotheses about the dispersion of the given sample The following tests are performed: * two-tailed test (null hypothesis - the dispersion equals the given number) * left-tailed test (null hypothesis - the dispersion is greater than or equal to the given number) * right-tailed test (null hypothesis - dispersion is less than or equal to the given number). Test is based on the following assumptions: * the given sample has a normal distribution. Input parameters: X - sample 1. Array whose index goes from 0 to N-1. N - size of the sample. Variance - dispersion value to compare with. Output parameters: BothTails - p-value for two-tailed test. If BothTails is less than the given significance level the null hypothesis is rejected. LeftTail - p-value for left-tailed test. If LeftTail is less than the given significance level, the null hypothesis is rejected. RightTail - p-value for right-tailed test. If RightTail is less than the given significance level the null hypothesis is rejected. -- ALGLIB -- Copyright 19.09.2006 by Bochkanov Sergey *************************************************************************/ void onesamplevariancetest(const real_1d_array &x, const ae_int_t n, const double variance, double &bothtails, double &lefttail, double &righttail) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::onesamplevariancetest(const_cast(x.c_ptr()), n, variance, &bothtails, &lefttail, &righttail, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } } ///////////////////////////////////////////////////////////////////////// // // THIS SECTION CONTAINS IMPLEMENTATION OF COMPUTATIONAL CORE // ///////////////////////////////////////////////////////////////////////// namespace alglib_impl { static void basestat_rankdatarec(/* Real */ ae_matrix* xy, ae_int_t i0, ae_int_t i1, ae_int_t nfeatures, ae_bool iscentered, ae_shared_pool* pool, ae_int_t basecasecost, ae_state *_state); static void basestat_rankdatabasecase(/* Real */ ae_matrix* xy, ae_int_t i0, ae_int_t i1, ae_int_t nfeatures, ae_bool iscentered, apbuffers* buf0, apbuffers* buf1, ae_state *_state); static void wsr_wcheb(double x, double c, double* tj, double* tj1, double* r, ae_state *_state); static double wsr_w5(double s, ae_state *_state); static double wsr_w6(double s, ae_state *_state); static double wsr_w7(double s, ae_state *_state); static double wsr_w8(double s, ae_state *_state); static double wsr_w9(double s, ae_state *_state); static double wsr_w10(double s, ae_state *_state); static double wsr_w11(double s, ae_state *_state); static double wsr_w12(double s, ae_state *_state); static double wsr_w13(double 
s, ae_state *_state); static double wsr_w14(double s, ae_state *_state); static double wsr_w15(double s, ae_state *_state); static double wsr_w16(double s, ae_state *_state); static double wsr_w17(double s, ae_state *_state); static double wsr_w18(double s, ae_state *_state); static double wsr_w19(double s, ae_state *_state); static double wsr_w20(double s, ae_state *_state); static double wsr_w21(double s, ae_state *_state); static double wsr_w22(double s, ae_state *_state); static double wsr_w23(double s, ae_state *_state); static double wsr_w24(double s, ae_state *_state); static double wsr_w25(double s, ae_state *_state); static double wsr_w26(double s, ae_state *_state); static double wsr_w27(double s, ae_state *_state); static double wsr_w28(double s, ae_state *_state); static double wsr_w29(double s, ae_state *_state); static double wsr_w30(double s, ae_state *_state); static double wsr_w40(double s, ae_state *_state); static double wsr_w60(double s, ae_state *_state); static double wsr_w120(double s, ae_state *_state); static double wsr_w200(double s, ae_state *_state); static double wsr_wsigma(double s, ae_int_t n, ae_state *_state); static double correlationtests_spearmantail5(double s, ae_state *_state); static double correlationtests_spearmantail6(double s, ae_state *_state); static double correlationtests_spearmantail7(double s, ae_state *_state); static double correlationtests_spearmantail8(double s, ae_state *_state); static double correlationtests_spearmantail9(double s, ae_state *_state); static double correlationtests_spearmantail(double t, ae_int_t n, ae_state *_state); static void mannwhitneyu_ucheb(double x, double c, double* tj, double* tj1, double* r, ae_state *_state); static double mannwhitneyu_uninterpolate(double p1, double p2, double p3, ae_int_t n, ae_state *_state); static double mannwhitneyu_usigma000(ae_int_t n1, ae_int_t n2, ae_state *_state); static double mannwhitneyu_usigma075(ae_int_t n1, ae_int_t n2, ae_state *_state); static double mannwhitneyu_usigma150(ae_int_t n1, ae_int_t n2, ae_state *_state); static double mannwhitneyu_usigma225(ae_int_t n1, ae_int_t n2, ae_state *_state); static double mannwhitneyu_usigma300(ae_int_t n1, ae_int_t n2, ae_state *_state); static double mannwhitneyu_usigma333(ae_int_t n1, ae_int_t n2, ae_state *_state); static double mannwhitneyu_usigma367(ae_int_t n1, ae_int_t n2, ae_state *_state); static double mannwhitneyu_usigma400(ae_int_t n1, ae_int_t n2, ae_state *_state); static double mannwhitneyu_utbln5n5(double s, ae_state *_state); static double mannwhitneyu_utbln5n6(double s, ae_state *_state); static double mannwhitneyu_utbln5n7(double s, ae_state *_state); static double mannwhitneyu_utbln5n8(double s, ae_state *_state); static double mannwhitneyu_utbln5n9(double s, ae_state *_state); static double mannwhitneyu_utbln5n10(double s, ae_state *_state); static double mannwhitneyu_utbln5n11(double s, ae_state *_state); static double mannwhitneyu_utbln5n12(double s, ae_state *_state); static double mannwhitneyu_utbln5n13(double s, ae_state *_state); static double mannwhitneyu_utbln5n14(double s, ae_state *_state); static double mannwhitneyu_utbln5n15(double s, ae_state *_state); static double mannwhitneyu_utbln5n16(double s, ae_state *_state); static double mannwhitneyu_utbln5n17(double s, ae_state *_state); static double mannwhitneyu_utbln5n18(double s, ae_state *_state); static double mannwhitneyu_utbln5n19(double s, ae_state *_state); static double mannwhitneyu_utbln5n20(double s, ae_state *_state); static double 
mannwhitneyu_utbln5n21(double s, ae_state *_state); static double mannwhitneyu_utbln5n22(double s, ae_state *_state); static double mannwhitneyu_utbln5n23(double s, ae_state *_state); static double mannwhitneyu_utbln5n24(double s, ae_state *_state); static double mannwhitneyu_utbln5n25(double s, ae_state *_state); static double mannwhitneyu_utbln5n26(double s, ae_state *_state); static double mannwhitneyu_utbln5n27(double s, ae_state *_state); static double mannwhitneyu_utbln5n28(double s, ae_state *_state); static double mannwhitneyu_utbln5n29(double s, ae_state *_state); static double mannwhitneyu_utbln5n30(double s, ae_state *_state); static double mannwhitneyu_utbln5n100(double s, ae_state *_state); static double mannwhitneyu_utbln6n6(double s, ae_state *_state); static double mannwhitneyu_utbln6n7(double s, ae_state *_state); static double mannwhitneyu_utbln6n8(double s, ae_state *_state); static double mannwhitneyu_utbln6n9(double s, ae_state *_state); static double mannwhitneyu_utbln6n10(double s, ae_state *_state); static double mannwhitneyu_utbln6n11(double s, ae_state *_state); static double mannwhitneyu_utbln6n12(double s, ae_state *_state); static double mannwhitneyu_utbln6n13(double s, ae_state *_state); static double mannwhitneyu_utbln6n14(double s, ae_state *_state); static double mannwhitneyu_utbln6n15(double s, ae_state *_state); static double mannwhitneyu_utbln6n30(double s, ae_state *_state); static double mannwhitneyu_utbln6n100(double s, ae_state *_state); static double mannwhitneyu_utbln7n7(double s, ae_state *_state); static double mannwhitneyu_utbln7n8(double s, ae_state *_state); static double mannwhitneyu_utbln7n9(double s, ae_state *_state); static double mannwhitneyu_utbln7n10(double s, ae_state *_state); static double mannwhitneyu_utbln7n11(double s, ae_state *_state); static double mannwhitneyu_utbln7n12(double s, ae_state *_state); static double mannwhitneyu_utbln7n13(double s, ae_state *_state); static double mannwhitneyu_utbln7n14(double s, ae_state *_state); static double mannwhitneyu_utbln7n15(double s, ae_state *_state); static double mannwhitneyu_utbln7n30(double s, ae_state *_state); static double mannwhitneyu_utbln7n100(double s, ae_state *_state); static double mannwhitneyu_utbln8n8(double s, ae_state *_state); static double mannwhitneyu_utbln8n9(double s, ae_state *_state); static double mannwhitneyu_utbln8n10(double s, ae_state *_state); static double mannwhitneyu_utbln8n11(double s, ae_state *_state); static double mannwhitneyu_utbln8n12(double s, ae_state *_state); static double mannwhitneyu_utbln8n13(double s, ae_state *_state); static double mannwhitneyu_utbln8n14(double s, ae_state *_state); static double mannwhitneyu_utbln8n15(double s, ae_state *_state); static double mannwhitneyu_utbln8n30(double s, ae_state *_state); static double mannwhitneyu_utbln8n100(double s, ae_state *_state); static double mannwhitneyu_utbln9n9(double s, ae_state *_state); static double mannwhitneyu_utbln9n10(double s, ae_state *_state); static double mannwhitneyu_utbln9n11(double s, ae_state *_state); static double mannwhitneyu_utbln9n12(double s, ae_state *_state); static double mannwhitneyu_utbln9n13(double s, ae_state *_state); static double mannwhitneyu_utbln9n14(double s, ae_state *_state); static double mannwhitneyu_utbln9n15(double s, ae_state *_state); static double mannwhitneyu_utbln9n30(double s, ae_state *_state); static double mannwhitneyu_utbln9n100(double s, ae_state *_state); static double mannwhitneyu_utbln10n10(double s, ae_state *_state); static 
double mannwhitneyu_utbln10n11(double s, ae_state *_state); static double mannwhitneyu_utbln10n12(double s, ae_state *_state); static double mannwhitneyu_utbln10n13(double s, ae_state *_state); static double mannwhitneyu_utbln10n14(double s, ae_state *_state); static double mannwhitneyu_utbln10n15(double s, ae_state *_state); static double mannwhitneyu_utbln10n30(double s, ae_state *_state); static double mannwhitneyu_utbln10n100(double s, ae_state *_state); static double mannwhitneyu_utbln11n11(double s, ae_state *_state); static double mannwhitneyu_utbln11n12(double s, ae_state *_state); static double mannwhitneyu_utbln11n13(double s, ae_state *_state); static double mannwhitneyu_utbln11n14(double s, ae_state *_state); static double mannwhitneyu_utbln11n15(double s, ae_state *_state); static double mannwhitneyu_utbln11n30(double s, ae_state *_state); static double mannwhitneyu_utbln11n100(double s, ae_state *_state); static double mannwhitneyu_utbln12n12(double s, ae_state *_state); static double mannwhitneyu_utbln12n13(double s, ae_state *_state); static double mannwhitneyu_utbln12n14(double s, ae_state *_state); static double mannwhitneyu_utbln12n15(double s, ae_state *_state); static double mannwhitneyu_utbln12n30(double s, ae_state *_state); static double mannwhitneyu_utbln12n100(double s, ae_state *_state); static double mannwhitneyu_utbln13n13(double s, ae_state *_state); static double mannwhitneyu_utbln13n14(double s, ae_state *_state); static double mannwhitneyu_utbln13n15(double s, ae_state *_state); static double mannwhitneyu_utbln13n30(double s, ae_state *_state); static double mannwhitneyu_utbln13n100(double s, ae_state *_state); static double mannwhitneyu_utbln14n14(double s, ae_state *_state); static double mannwhitneyu_utbln14n15(double s, ae_state *_state); static double mannwhitneyu_utbln14n30(double s, ae_state *_state); static double mannwhitneyu_utbln14n100(double s, ae_state *_state); static double mannwhitneyu_usigma(double s, ae_int_t n1, ae_int_t n2, ae_state *_state); static void jarquebera_jarqueberastatistic(/* Real */ ae_vector* x, ae_int_t n, double* s, ae_state *_state); static double jarquebera_jarqueberaapprox(ae_int_t n, double s, ae_state *_state); static double jarquebera_jbtbl5(double s, ae_state *_state); static double jarquebera_jbtbl6(double s, ae_state *_state); static double jarquebera_jbtbl7(double s, ae_state *_state); static double jarquebera_jbtbl8(double s, ae_state *_state); static double jarquebera_jbtbl9(double s, ae_state *_state); static double jarquebera_jbtbl10(double s, ae_state *_state); static double jarquebera_jbtbl11(double s, ae_state *_state); static double jarquebera_jbtbl12(double s, ae_state *_state); static double jarquebera_jbtbl13(double s, ae_state *_state); static double jarquebera_jbtbl14(double s, ae_state *_state); static double jarquebera_jbtbl15(double s, ae_state *_state); static double jarquebera_jbtbl16(double s, ae_state *_state); static double jarquebera_jbtbl17(double s, ae_state *_state); static double jarquebera_jbtbl18(double s, ae_state *_state); static double jarquebera_jbtbl19(double s, ae_state *_state); static double jarquebera_jbtbl20(double s, ae_state *_state); static double jarquebera_jbtbl30(double s, ae_state *_state); static double jarquebera_jbtbl50(double s, ae_state *_state); static double jarquebera_jbtbl65(double s, ae_state *_state); static double jarquebera_jbtbl100(double s, ae_state *_state); static double jarquebera_jbtbl130(double s, ae_state *_state); static double 
jarquebera_jbtbl200(double s, ae_state *_state); static double jarquebera_jbtbl301(double s, ae_state *_state); static double jarquebera_jbtbl501(double s, ae_state *_state); static double jarquebera_jbtbl701(double s, ae_state *_state); static double jarquebera_jbtbl1401(double s, ae_state *_state); static void jarquebera_jbcheb(double x, double c, double* tj, double* tj1, double* r, ae_state *_state); /************************************************************************* Calculation of the distribution moments: mean, variance, skewness, kurtosis. INPUT PARAMETERS: X - sample N - N>=0, sample size: * if given, only leading N elements of X are processed * if not given, automatically determined from size of X OUTPUT PARAMETERS Mean - mean. Variance- variance. Skewness- skewness (if variance<>0; zero otherwise). Kurtosis- kurtosis (if variance<>0; zero otherwise). NOTE: variance is calculated by dividing sum of squares by N-1, not N. -- ALGLIB -- Copyright 06.09.2006 by Bochkanov Sergey *************************************************************************/ void samplemoments(/* Real */ ae_vector* x, ae_int_t n, double* mean, double* variance, double* skewness, double* kurtosis, ae_state *_state) { ae_int_t i; double v; double v1; double v2; double stddev; *mean = 0; *variance = 0; *skewness = 0; *kurtosis = 0; ae_assert(n>=0, "SampleMoments: N<0", _state); ae_assert(x->cnt>=n, "SampleMoments: Length(X)ptr.p_double[i]; } *mean = *mean/n; /* * Variance (using corrected two-pass algorithm) */ if( n!=1 ) { v1 = (double)(0); for(i=0; i<=n-1; i++) { v1 = v1+ae_sqr(x->ptr.p_double[i]-(*mean), _state); } v2 = (double)(0); for(i=0; i<=n-1; i++) { v2 = v2+(x->ptr.p_double[i]-(*mean)); } v2 = ae_sqr(v2, _state)/n; *variance = (v1-v2)/(n-1); if( ae_fp_less(*variance,(double)(0)) ) { *variance = (double)(0); } stddev = ae_sqrt(*variance, _state); } /* * Skewness and kurtosis */ if( ae_fp_neq(stddev,(double)(0)) ) { for(i=0; i<=n-1; i++) { v = (x->ptr.p_double[i]-(*mean))/stddev; v2 = ae_sqr(v, _state); *skewness = *skewness+v2*v; *kurtosis = *kurtosis+ae_sqr(v2, _state); } *skewness = *skewness/n; *kurtosis = *kurtosis/n-3; } } /************************************************************************* Calculation of the mean. INPUT PARAMETERS: X - sample N - N>=0, sample size: * if given, only leading N elements of X are processed * if not given, automatically determined from size of X NOTE: This function return result which calculated by 'SampleMoments' function and stored at 'Mean' variable. -- ALGLIB -- Copyright 06.09.2006 by Bochkanov Sergey *************************************************************************/ double samplemean(/* Real */ ae_vector* x, ae_int_t n, ae_state *_state) { double mean; double tmp0; double tmp1; double tmp2; double result; samplemoments(x, n, &mean, &tmp0, &tmp1, &tmp2, _state); result = mean; return result; } /************************************************************************* Calculation of the variance. INPUT PARAMETERS: X - sample N - N>=0, sample size: * if given, only leading N elements of X are processed * if not given, automatically determined from size of X NOTE: This function return result which calculated by 'SampleMoments' function and stored at 'Variance' variable. 
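A minimal usage sketch (illustrative only, not part of the ALGLIB sources; it assumes the public C++ wrappers samplemoments(x, n, mean, variance, skewness, kurtosis) and samplevariance(x, n) declared in statistics.h, and made-up data):

    #include "statistics.h"
    #include <cstdio>
    int main()
    {
        alglib::real_1d_array x = "[2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]";
        double mean, variance, skewness, kurtosis;
        alglib::samplemoments(x, x.length(), mean, variance, skewness, kurtosis);
        // samplevariance(x, n) alone would return the same 'variance' value
        printf("mean=%.3f var=%.3f skew=%.3f kurt=%.3f\n", mean, variance, skewness, kurtosis);
        return 0;
    }
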
-- ALGLIB -- Copyright 06.09.2006 by Bochkanov Sergey *************************************************************************/ double samplevariance(/* Real */ ae_vector* x, ae_int_t n, ae_state *_state) { double variance; double tmp0; double tmp1; double tmp2; double result; samplemoments(x, n, &tmp0, &variance, &tmp1, &tmp2, _state); result = variance; return result; } /************************************************************************* Calculation of the skewness. INPUT PARAMETERS: X - sample N - N>=0, sample size: * if given, only leading N elements of X are processed * if not given, automatically determined from size of X NOTE: This function return result which calculated by 'SampleMoments' function and stored at 'Skewness' variable. -- ALGLIB -- Copyright 06.09.2006 by Bochkanov Sergey *************************************************************************/ double sampleskewness(/* Real */ ae_vector* x, ae_int_t n, ae_state *_state) { double skewness; double tmp0; double tmp1; double tmp2; double result; samplemoments(x, n, &tmp0, &tmp1, &skewness, &tmp2, _state); result = skewness; return result; } /************************************************************************* Calculation of the kurtosis. INPUT PARAMETERS: X - sample N - N>=0, sample size: * if given, only leading N elements of X are processed * if not given, automatically determined from size of X NOTE: This function return result which calculated by 'SampleMoments' function and stored at 'Kurtosis' variable. -- ALGLIB -- Copyright 06.09.2006 by Bochkanov Sergey *************************************************************************/ double samplekurtosis(/* Real */ ae_vector* x, ae_int_t n, ae_state *_state) { double kurtosis; double tmp0; double tmp1; double tmp2; double result; samplemoments(x, n, &tmp0, &tmp1, &tmp2, &kurtosis, _state); result = kurtosis; return result; } /************************************************************************* ADev Input parameters: X - sample N - N>=0, sample size: * if given, only leading N elements of X are processed * if not given, automatically determined from size of X Output parameters: ADev- ADev -- ALGLIB -- Copyright 06.09.2006 by Bochkanov Sergey *************************************************************************/ void sampleadev(/* Real */ ae_vector* x, ae_int_t n, double* adev, ae_state *_state) { ae_int_t i; double mean; *adev = 0; ae_assert(n>=0, "SampleADev: N<0", _state); ae_assert(x->cnt>=n, "SampleADev: Length(X)ptr.p_double[i]; } mean = mean/n; /* * ADev */ for(i=0; i<=n-1; i++) { *adev = *adev+ae_fabs(x->ptr.p_double[i]-mean, _state); } *adev = *adev/n; } /************************************************************************* Median calculation. 
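A minimal usage sketch (illustrative only, not part of the ALGLIB sources; it assumes the public C++ wrapper samplemedian(x, n, median) declared in statistics.h and made-up data):

    #include "statistics.h"
    #include <cstdio>
    int main()
    {
        alglib::real_1d_array x = "[7.0, 1.0, 5.0, 3.0, 9.0, 4.0]";
        double median;
        alglib::samplemedian(x, x.length(), median);
        printf("median=%.2f\n", median); // 4.5: mean of the two middle values, 4 and 5
        return 0;
    }
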
Input parameters: X - sample (array indexes: [0..N-1]) N - N>=0, sample size: * if given, only leading N elements of X are processed * if not given, automatically determined from size of X Output parameters: Median -- ALGLIB -- Copyright 06.09.2006 by Bochkanov Sergey *************************************************************************/ void samplemedian(/* Real */ ae_vector* x, ae_int_t n, double* median, ae_state *_state) { ae_frame _frame_block; ae_vector _x; ae_int_t i; ae_int_t ir; ae_int_t j; ae_int_t l; ae_int_t midp; ae_int_t k; double a; double tval; ae_frame_make(_state, &_frame_block); ae_vector_init_copy(&_x, x, _state); x = &_x; *median = 0; ae_assert(n>=0, "SampleMedian: N<0", _state); ae_assert(x->cnt>=n, "SampleMedian: Length(X)ptr.p_double[0]; ae_frame_leave(_state); return; } if( n==2 ) { *median = 0.5*(x->ptr.p_double[0]+x->ptr.p_double[1]); ae_frame_leave(_state); return; } /* * Common case, N>=3. * Choose X[(N-1)/2] */ l = 0; ir = n-1; k = (n-1)/2; for(;;) { if( ir<=l+1 ) { /* * 1 or 2 elements in partition */ if( ir==l+1&&ae_fp_less(x->ptr.p_double[ir],x->ptr.p_double[l]) ) { tval = x->ptr.p_double[l]; x->ptr.p_double[l] = x->ptr.p_double[ir]; x->ptr.p_double[ir] = tval; } break; } else { midp = (l+ir)/2; tval = x->ptr.p_double[midp]; x->ptr.p_double[midp] = x->ptr.p_double[l+1]; x->ptr.p_double[l+1] = tval; if( ae_fp_greater(x->ptr.p_double[l],x->ptr.p_double[ir]) ) { tval = x->ptr.p_double[l]; x->ptr.p_double[l] = x->ptr.p_double[ir]; x->ptr.p_double[ir] = tval; } if( ae_fp_greater(x->ptr.p_double[l+1],x->ptr.p_double[ir]) ) { tval = x->ptr.p_double[l+1]; x->ptr.p_double[l+1] = x->ptr.p_double[ir]; x->ptr.p_double[ir] = tval; } if( ae_fp_greater(x->ptr.p_double[l],x->ptr.p_double[l+1]) ) { tval = x->ptr.p_double[l]; x->ptr.p_double[l] = x->ptr.p_double[l+1]; x->ptr.p_double[l+1] = tval; } i = l+1; j = ir; a = x->ptr.p_double[l+1]; for(;;) { do { i = i+1; } while(ae_fp_less(x->ptr.p_double[i],a)); do { j = j-1; } while(ae_fp_greater(x->ptr.p_double[j],a)); if( jptr.p_double[i]; x->ptr.p_double[i] = x->ptr.p_double[j]; x->ptr.p_double[j] = tval; } x->ptr.p_double[l+1] = x->ptr.p_double[j]; x->ptr.p_double[j] = a; if( j>=k ) { ir = j-1; } if( j<=k ) { l = i; } } } /* * If N is odd, return result */ if( n%2==1 ) { *median = x->ptr.p_double[k]; ae_frame_leave(_state); return; } a = x->ptr.p_double[n-1]; for(i=k+1; i<=n-1; i++) { if( ae_fp_less(x->ptr.p_double[i],a) ) { a = x->ptr.p_double[i]; } } *median = 0.5*(x->ptr.p_double[k]+a); ae_frame_leave(_state); } /************************************************************************* Percentile calculation. 
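A minimal usage sketch (illustrative only, not part of the ALGLIB sources; it assumes the public C++ wrapper samplepercentile(x, n, p, v) declared in statistics.h and made-up data):

    #include "statistics.h"
    #include <cstdio>
    int main()
    {
        alglib::real_1d_array x = "[3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0]";
        double v;
        alglib::samplepercentile(x, x.length(), 0.90, v); // P must lie in [0,1]
        printf("90th percentile: %.3f\n", v);
        return 0;
    }
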
Input parameters: X - sample (array indexes: [0..N-1]) N - N>=0, sample size: * if given, only leading N elements of X are processed * if not given, automatically determined from size of X P - percentile (0<=P<=1) Output parameters: V - percentile -- ALGLIB -- Copyright 01.03.2008 by Bochkanov Sergey *************************************************************************/ void samplepercentile(/* Real */ ae_vector* x, ae_int_t n, double p, double* v, ae_state *_state) { ae_frame _frame_block; ae_vector _x; ae_int_t i1; double t; ae_vector rbuf; ae_frame_make(_state, &_frame_block); ae_vector_init_copy(&_x, x, _state); x = &_x; *v = 0; ae_vector_init(&rbuf, 0, DT_REAL, _state); ae_assert(n>=0, "SamplePercentile: N<0", _state); ae_assert(x->cnt>=n, "SamplePercentile: Length(X)ptr.p_double[0]; ae_frame_leave(_state); return; } if( ae_fp_eq(p,(double)(1)) ) { *v = x->ptr.p_double[n-1]; ae_frame_leave(_state); return; } t = p*(n-1); i1 = ae_ifloor(t, _state); t = t-ae_ifloor(t, _state); *v = x->ptr.p_double[i1]*(1-t)+x->ptr.p_double[i1+1]*t; ae_frame_leave(_state); } /************************************************************************* 2-sample covariance Input parameters: X - sample 1 (array indexes: [0..N-1]) Y - sample 2 (array indexes: [0..N-1]) N - N>=0, sample size: * if given, only N leading elements of X/Y are processed * if not given, automatically determined from input sizes Result: covariance (zero for N=0 or N=1) -- ALGLIB -- Copyright 28.10.2010 by Bochkanov Sergey *************************************************************************/ double cov2(/* Real */ ae_vector* x, /* Real */ ae_vector* y, ae_int_t n, ae_state *_state) { ae_int_t i; double xmean; double ymean; double v; double x0; double y0; double s; ae_bool samex; ae_bool samey; double result; ae_assert(n>=0, "Cov2: N<0", _state); ae_assert(x->cnt>=n, "Cov2: Length(X)cnt>=n, "Cov2: Length(Y)ptr.p_double[0]; y0 = y->ptr.p_double[0]; v = (double)1/(double)n; for(i=0; i<=n-1; i++) { s = x->ptr.p_double[i]; samex = samex&&ae_fp_eq(s,x0); xmean = xmean+s*v; s = y->ptr.p_double[i]; samey = samey&&ae_fp_eq(s,y0); ymean = ymean+s*v; } if( samex||samey ) { result = (double)(0); return result; } /* * covariance */ v = (double)1/(double)(n-1); result = (double)(0); for(i=0; i<=n-1; i++) { result = result+v*(x->ptr.p_double[i]-xmean)*(y->ptr.p_double[i]-ymean); } return result; } /************************************************************************* Pearson product-moment correlation coefficient Input parameters: X - sample 1 (array indexes: [0..N-1]) Y - sample 2 (array indexes: [0..N-1]) N - N>=0, sample size: * if given, only N leading elements of X/Y are processed * if not given, automatically determined from input sizes Result: Pearson product-moment correlation coefficient (zero for N=0 or N=1) -- ALGLIB -- Copyright 28.10.2010 by Bochkanov Sergey *************************************************************************/ double pearsoncorr2(/* Real */ ae_vector* x, /* Real */ ae_vector* y, ae_int_t n, ae_state *_state) { ae_int_t i; double xmean; double ymean; double v; double x0; double y0; double s; ae_bool samex; ae_bool samey; double xv; double yv; double t1; double t2; double result; ae_assert(n>=0, "PearsonCorr2: N<0", _state); ae_assert(x->cnt>=n, "PearsonCorr2: Length(X)cnt>=n, "PearsonCorr2: Length(Y)ptr.p_double[0]; y0 = y->ptr.p_double[0]; v = (double)1/(double)n; for(i=0; i<=n-1; i++) { s = x->ptr.p_double[i]; samex = samex&&ae_fp_eq(s,x0); xmean = xmean+s*v; s = y->ptr.p_double[i]; samey = 
samey&&ae_fp_eq(s,y0); ymean = ymean+s*v; } if( samex||samey ) { result = (double)(0); return result; } /* * numerator and denominator */ s = (double)(0); xv = (double)(0); yv = (double)(0); for(i=0; i<=n-1; i++) { t1 = x->ptr.p_double[i]-xmean; t2 = y->ptr.p_double[i]-ymean; xv = xv+ae_sqr(t1, _state); yv = yv+ae_sqr(t2, _state); s = s+t1*t2; } if( ae_fp_eq(xv,(double)(0))||ae_fp_eq(yv,(double)(0)) ) { result = (double)(0); } else { result = s/(ae_sqrt(xv, _state)*ae_sqrt(yv, _state)); } return result; } /************************************************************************* Spearman's rank correlation coefficient Input parameters: X - sample 1 (array indexes: [0..N-1]) Y - sample 2 (array indexes: [0..N-1]) N - N>=0, sample size: * if given, only N leading elements of X/Y are processed * if not given, automatically determined from input sizes Result: Spearman's rank correlation coefficient (zero for N=0 or N=1) -- ALGLIB -- Copyright 09.04.2007 by Bochkanov Sergey *************************************************************************/ double spearmancorr2(/* Real */ ae_vector* x, /* Real */ ae_vector* y, ae_int_t n, ae_state *_state) { ae_frame _frame_block; ae_vector _x; ae_vector _y; apbuffers buf; double result; ae_frame_make(_state, &_frame_block); ae_vector_init_copy(&_x, x, _state); x = &_x; ae_vector_init_copy(&_y, y, _state); y = &_y; _apbuffers_init(&buf, _state); ae_assert(n>=0, "SpearmanCorr2: N<0", _state); ae_assert(x->cnt>=n, "SpearmanCorr2: Length(X)cnt>=n, "SpearmanCorr2: Length(Y)=0, number of observations: * if given, only leading N rows of X are used * if not given, automatically determined from input size M - M>0, number of variables: * if given, only leading M columns of X are used * if not given, automatically determined from input size OUTPUT PARAMETERS: C - array[M,M], covariance matrix (zero if N=0 or N=1) -- ALGLIB -- Copyright 28.10.2010 by Bochkanov Sergey *************************************************************************/ void covm(/* Real */ ae_matrix* x, ae_int_t n, ae_int_t m, /* Real */ ae_matrix* c, ae_state *_state) { ae_frame _frame_block; ae_matrix _x; ae_int_t i; ae_int_t j; double v; ae_vector t; ae_vector x0; ae_vector same; ae_frame_make(_state, &_frame_block); ae_matrix_init_copy(&_x, x, _state); x = &_x; ae_matrix_clear(c); ae_vector_init(&t, 0, DT_REAL, _state); ae_vector_init(&x0, 0, DT_REAL, _state); ae_vector_init(&same, 0, DT_BOOL, _state); ae_assert(n>=0, "CovM: N<0", _state); ae_assert(m>=1, "CovM: M<1", _state); ae_assert(x->rows>=n, "CovM: Rows(X)cols>=m||n==0, "CovM: Cols(X)ptr.pp_double[i][j] = (double)(0); } } ae_frame_leave(_state); return; } /* * Calculate means, * check for constant columns */ ae_vector_set_length(&t, m, _state); ae_vector_set_length(&x0, m, _state); ae_vector_set_length(&same, m, _state); ae_matrix_set_length(c, m, m, _state); for(i=0; i<=m-1; i++) { t.ptr.p_double[i] = (double)(0); same.ptr.p_bool[i] = ae_true; } ae_v_move(&x0.ptr.p_double[0], 1, &x->ptr.pp_double[0][0], 1, ae_v_len(0,m-1)); v = (double)1/(double)n; for(i=0; i<=n-1; i++) { ae_v_addd(&t.ptr.p_double[0], 1, &x->ptr.pp_double[i][0], 1, ae_v_len(0,m-1), v); for(j=0; j<=m-1; j++) { same.ptr.p_bool[j] = same.ptr.p_bool[j]&&ae_fp_eq(x->ptr.pp_double[i][j],x0.ptr.p_double[j]); } } /* * * center variables; * * if we have constant columns, these columns are * artificially zeroed (they must be zero in exact arithmetics, * but unfortunately floating point ops are not exact). 
* * calculate upper half of symmetric covariance matrix */ for(i=0; i<=n-1; i++) { ae_v_sub(&x->ptr.pp_double[i][0], 1, &t.ptr.p_double[0], 1, ae_v_len(0,m-1)); for(j=0; j<=m-1; j++) { if( same.ptr.p_bool[j] ) { x->ptr.pp_double[i][j] = (double)(0); } } } rmatrixsyrk(m, n, (double)1/(double)(n-1), x, 0, 0, 1, 0.0, c, 0, 0, ae_true, _state); rmatrixenforcesymmetricity(c, m, ae_true, _state); ae_frame_leave(_state); } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. *************************************************************************/ void _pexec_covm(/* Real */ ae_matrix* x, ae_int_t n, ae_int_t m, /* Real */ ae_matrix* c, ae_state *_state) { covm(x,n,m,c, _state); } /************************************************************************* Pearson product-moment correlation matrix SMP EDITION OF ALGLIB: ! This function can utilize multicore capabilities of your system. In ! order to do this you have to call version with "smp_" prefix, which ! indicates that multicore code will be used. ! ! This note is given for users of SMP edition; if you use GPL edition, ! or commercial edition of ALGLIB without SMP support, you still will ! be able to call smp-version of this function, but all computations ! will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. ! ! You should remember that starting/stopping worker thread always have ! non-zero cost. Although multicore version is pretty efficient on ! large problems, we do not recommend you to use it on small problems - ! with correlation matrices smaller than 128*128. INPUT PARAMETERS: X - array[N,M], sample matrix: * J-th column corresponds to J-th variable * I-th row corresponds to I-th observation N - N>=0, number of observations: * if given, only leading N rows of X are used * if not given, automatically determined from input size M - M>0, number of variables: * if given, only leading M columns of X are used * if not given, automatically determined from input size OUTPUT PARAMETERS: C - array[M,M], correlation matrix (zero if N=0 or N=1) -- ALGLIB -- Copyright 28.10.2010 by Bochkanov Sergey *************************************************************************/ void pearsoncorrm(/* Real */ ae_matrix* x, ae_int_t n, ae_int_t m, /* Real */ ae_matrix* c, ae_state *_state) { ae_frame _frame_block; ae_vector t; ae_int_t i; ae_int_t j; double v; ae_frame_make(_state, &_frame_block); ae_matrix_clear(c); ae_vector_init(&t, 0, DT_REAL, _state); ae_assert(n>=0, "PearsonCorrM: N<0", _state); ae_assert(m>=1, "PearsonCorrM: M<1", _state); ae_assert(x->rows>=n, "PearsonCorrM: Rows(X)cols>=m||n==0, "PearsonCorrM: Cols(X)ptr.pp_double[i][i],(double)(0)) ) { t.ptr.p_double[i] = 1/ae_sqrt(c->ptr.pp_double[i][i], _state); } else { t.ptr.p_double[i] = 0.0; } } for(i=0; i<=m-1; i++) { v = t.ptr.p_double[i]; for(j=0; j<=m-1; j++) { c->ptr.pp_double[i][j] = c->ptr.pp_double[i][j]*v*t.ptr.p_double[j]; } } ae_frame_leave(_state); } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. 
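For reference, a minimal serial usage sketch of the Pearson correlation matrix function defined above (illustrative only, not part of the ALGLIB sources; it assumes the public C++ wrapper pearsoncorrm(x, n, m, c) declared in statistics.h and made-up data):

    #include "statistics.h"
    #include <cstdio>
    int main()
    {
        // 4 observations (rows) of 3 variables (columns)
        alglib::real_2d_array x = "[[1.0,2.0,0.5],[2.0,1.9,0.7],[3.0,4.1,0.4],[4.0,3.8,0.9]]";
        alglib::real_2d_array c;
        alglib::pearsoncorrm(x, x.rows(), x.cols(), c);
        printf("corr(var0,var1)=%.3f\n", c[0][1]); // c is 3x3 and symmetric
        return 0;
    }
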
*************************************************************************/ void _pexec_pearsoncorrm(/* Real */ ae_matrix* x, ae_int_t n, ae_int_t m, /* Real */ ae_matrix* c, ae_state *_state) { pearsoncorrm(x,n,m,c, _state); } /************************************************************************* Spearman's rank correlation matrix SMP EDITION OF ALGLIB: ! This function can utilize multicore capabilities of your system. In ! order to do this you have to call version with "smp_" prefix, which ! indicates that multicore code will be used. ! ! This note is given for users of SMP edition; if you use GPL edition, ! or commercial edition of ALGLIB without SMP support, you still will ! be able to call smp-version of this function, but all computations ! will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. ! ! You should remember that starting/stopping worker thread always have ! non-zero cost. Although multicore version is pretty efficient on ! large problems, we do not recommend you to use it on small problems - ! with correlation matrices smaller than 128*128. INPUT PARAMETERS: X - array[N,M], sample matrix: * J-th column corresponds to J-th variable * I-th row corresponds to I-th observation N - N>=0, number of observations: * if given, only leading N rows of X are used * if not given, automatically determined from input size M - M>0, number of variables: * if given, only leading M columns of X are used * if not given, automatically determined from input size OUTPUT PARAMETERS: C - array[M,M], correlation matrix (zero if N=0 or N=1) -- ALGLIB -- Copyright 28.10.2010 by Bochkanov Sergey *************************************************************************/ void spearmancorrm(/* Real */ ae_matrix* x, ae_int_t n, ae_int_t m, /* Real */ ae_matrix* c, ae_state *_state) { ae_frame _frame_block; ae_int_t i; ae_int_t j; apbuffers buf; ae_matrix xc; ae_vector t; double v; double vv; double x0; ae_bool b; ae_frame_make(_state, &_frame_block); ae_matrix_clear(c); _apbuffers_init(&buf, _state); ae_matrix_init(&xc, 0, 0, DT_REAL, _state); ae_vector_init(&t, 0, DT_REAL, _state); ae_assert(n>=0, "SpearmanCorrM: N<0", _state); ae_assert(m>=1, "SpearmanCorrM: M<1", _state); ae_assert(x->rows>=n, "SpearmanCorrM: Rows(X)cols>=m||n==0, "SpearmanCorrM: Cols(X)ptr.pp_double[i][j] = (double)(0); } } ae_frame_leave(_state); return; } /* * Allocate */ ae_vector_set_length(&t, ae_maxint(n, m, _state), _state); ae_matrix_set_length(c, m, m, _state); /* * Replace data with ranks */ ae_matrix_set_length(&xc, m, n, _state); rmatrixtranspose(n, m, x, 0, 0, &xc, 0, 0, _state); rankdata(&xc, m, n, _state); /* * 1. Calculate means, check for constant columns * 2. Center variables, constant columns are * artificialy zeroed (they must be zero in exact arithmetics, * but unfortunately floating point is not exact). 
*/ for(i=0; i<=m-1; i++) { /* * Calculate: * * V - mean value of I-th variable * * B - True in case all variable values are same */ v = (double)(0); b = ae_true; x0 = xc.ptr.pp_double[i][0]; for(j=0; j<=n-1; j++) { vv = xc.ptr.pp_double[i][j]; v = v+vv; b = b&&ae_fp_eq(vv,x0); } v = v/n; /* * Center/zero I-th variable */ if( b ) { /* * Zero */ for(j=0; j<=n-1; j++) { xc.ptr.pp_double[i][j] = 0.0; } } else { /* * Center */ for(j=0; j<=n-1; j++) { xc.ptr.pp_double[i][j] = xc.ptr.pp_double[i][j]-v; } } } /* * Calculate upper half of symmetric covariance matrix */ rmatrixsyrk(m, n, (double)1/(double)(n-1), &xc, 0, 0, 0, 0.0, c, 0, 0, ae_true, _state); /* * Calculate Pearson coefficients (upper triangle) */ for(i=0; i<=m-1; i++) { if( ae_fp_greater(c->ptr.pp_double[i][i],(double)(0)) ) { t.ptr.p_double[i] = 1/ae_sqrt(c->ptr.pp_double[i][i], _state); } else { t.ptr.p_double[i] = 0.0; } } for(i=0; i<=m-1; i++) { v = t.ptr.p_double[i]; for(j=i; j<=m-1; j++) { c->ptr.pp_double[i][j] = c->ptr.pp_double[i][j]*v*t.ptr.p_double[j]; } } /* * force symmetricity */ rmatrixenforcesymmetricity(c, m, ae_true, _state); ae_frame_leave(_state); } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. *************************************************************************/ void _pexec_spearmancorrm(/* Real */ ae_matrix* x, ae_int_t n, ae_int_t m, /* Real */ ae_matrix* c, ae_state *_state) { spearmancorrm(x,n,m,c, _state); } /************************************************************************* Cross-covariance matrix SMP EDITION OF ALGLIB: ! This function can utilize multicore capabilities of your system. In ! order to do this you have to call version with "smp_" prefix, which ! indicates that multicore code will be used. ! ! This note is given for users of SMP edition; if you use GPL edition, ! or commercial edition of ALGLIB without SMP support, you still will ! be able to call smp-version of this function, but all computations ! will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. ! ! You should remember that starting/stopping worker thread always have ! non-zero cost. Although multicore version is pretty efficient on ! large problems, we do not recommend you to use it on small problems - ! with covariance matrices smaller than 128*128. 
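As an illustration (not part of the ALGLIB sources; the sketch assumes the public C++ wrapper covm2(x, y, n, m1, m2, c) declared in statistics.h and made-up data), a serial call looks like:

    #include "statistics.h"
    #include <cstdio>
    int main()
    {
        // X: 5 observations of 2 variables, Y: 5 observations of 3 variables
        alglib::real_2d_array x = "[[1,2],[2,1],[3,3],[4,5],[5,4]]";
        alglib::real_2d_array y = "[[0.1,1,7],[0.2,2,6],[0.3,2,5],[0.4,3,4],[0.5,3,3]]";
        alglib::real_2d_array c;
        alglib::covm2(x, y, x.rows(), x.cols(), y.cols(), c);
        printf("cov(X0,Y0)=%.3f\n", c[0][0]); // c is 2x3: c[i][j] = cov(X col i, Y col j)
        return 0;
    }
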
INPUT PARAMETERS: X - array[N,M1], sample matrix: * J-th column corresponds to J-th variable * I-th row corresponds to I-th observation Y - array[N,M2], sample matrix: * J-th column corresponds to J-th variable * I-th row corresponds to I-th observation N - N>=0, number of observations: * if given, only leading N rows of X/Y are used * if not given, automatically determined from input sizes M1 - M1>0, number of variables in X: * if given, only leading M1 columns of X are used * if not given, automatically determined from input size M2 - M2>0, number of variables in Y: * if given, only leading M1 columns of X are used * if not given, automatically determined from input size OUTPUT PARAMETERS: C - array[M1,M2], cross-covariance matrix (zero if N=0 or N=1) -- ALGLIB -- Copyright 28.10.2010 by Bochkanov Sergey *************************************************************************/ void covm2(/* Real */ ae_matrix* x, /* Real */ ae_matrix* y, ae_int_t n, ae_int_t m1, ae_int_t m2, /* Real */ ae_matrix* c, ae_state *_state) { ae_frame _frame_block; ae_matrix _x; ae_matrix _y; ae_int_t i; ae_int_t j; double v; ae_vector t; ae_vector x0; ae_vector y0; ae_vector samex; ae_vector samey; ae_frame_make(_state, &_frame_block); ae_matrix_init_copy(&_x, x, _state); x = &_x; ae_matrix_init_copy(&_y, y, _state); y = &_y; ae_matrix_clear(c); ae_vector_init(&t, 0, DT_REAL, _state); ae_vector_init(&x0, 0, DT_REAL, _state); ae_vector_init(&y0, 0, DT_REAL, _state); ae_vector_init(&samex, 0, DT_BOOL, _state); ae_vector_init(&samey, 0, DT_BOOL, _state); ae_assert(n>=0, "CovM2: N<0", _state); ae_assert(m1>=1, "CovM2: M1<1", _state); ae_assert(m2>=1, "CovM2: M2<1", _state); ae_assert(x->rows>=n, "CovM2: Rows(X)cols>=m1||n==0, "CovM2: Cols(X)rows>=n, "CovM2: Rows(Y)cols>=m2||n==0, "CovM2: Cols(Y)ptr.pp_double[i][j] = (double)(0); } } ae_frame_leave(_state); return; } /* * Allocate */ ae_vector_set_length(&t, ae_maxint(m1, m2, _state), _state); ae_vector_set_length(&x0, m1, _state); ae_vector_set_length(&y0, m2, _state); ae_vector_set_length(&samex, m1, _state); ae_vector_set_length(&samey, m2, _state); ae_matrix_set_length(c, m1, m2, _state); /* * * calculate means of X * * center X * * if we have constant columns, these columns are * artificially zeroed (they must be zero in exact arithmetics, * but unfortunately floating point ops are not exact). 
*/ for(i=0; i<=m1-1; i++) { t.ptr.p_double[i] = (double)(0); samex.ptr.p_bool[i] = ae_true; } ae_v_move(&x0.ptr.p_double[0], 1, &x->ptr.pp_double[0][0], 1, ae_v_len(0,m1-1)); v = (double)1/(double)n; for(i=0; i<=n-1; i++) { ae_v_addd(&t.ptr.p_double[0], 1, &x->ptr.pp_double[i][0], 1, ae_v_len(0,m1-1), v); for(j=0; j<=m1-1; j++) { samex.ptr.p_bool[j] = samex.ptr.p_bool[j]&&ae_fp_eq(x->ptr.pp_double[i][j],x0.ptr.p_double[j]); } } for(i=0; i<=n-1; i++) { ae_v_sub(&x->ptr.pp_double[i][0], 1, &t.ptr.p_double[0], 1, ae_v_len(0,m1-1)); for(j=0; j<=m1-1; j++) { if( samex.ptr.p_bool[j] ) { x->ptr.pp_double[i][j] = (double)(0); } } } /* * Repeat same steps for Y */ for(i=0; i<=m2-1; i++) { t.ptr.p_double[i] = (double)(0); samey.ptr.p_bool[i] = ae_true; } ae_v_move(&y0.ptr.p_double[0], 1, &y->ptr.pp_double[0][0], 1, ae_v_len(0,m2-1)); v = (double)1/(double)n; for(i=0; i<=n-1; i++) { ae_v_addd(&t.ptr.p_double[0], 1, &y->ptr.pp_double[i][0], 1, ae_v_len(0,m2-1), v); for(j=0; j<=m2-1; j++) { samey.ptr.p_bool[j] = samey.ptr.p_bool[j]&&ae_fp_eq(y->ptr.pp_double[i][j],y0.ptr.p_double[j]); } } for(i=0; i<=n-1; i++) { ae_v_sub(&y->ptr.pp_double[i][0], 1, &t.ptr.p_double[0], 1, ae_v_len(0,m2-1)); for(j=0; j<=m2-1; j++) { if( samey.ptr.p_bool[j] ) { y->ptr.pp_double[i][j] = (double)(0); } } } /* * calculate cross-covariance matrix */ rmatrixgemm(m1, m2, n, (double)1/(double)(n-1), x, 0, 0, 1, y, 0, 0, 0, 0.0, c, 0, 0, _state); ae_frame_leave(_state); } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. *************************************************************************/ void _pexec_covm2(/* Real */ ae_matrix* x, /* Real */ ae_matrix* y, ae_int_t n, ae_int_t m1, ae_int_t m2, /* Real */ ae_matrix* c, ae_state *_state) { covm2(x,y,n,m1,m2,c, _state); } /************************************************************************* Pearson product-moment cross-correlation matrix SMP EDITION OF ALGLIB: ! This function can utilize multicore capabilities of your system. In ! order to do this you have to call version with "smp_" prefix, which ! indicates that multicore code will be used. ! ! This note is given for users of SMP edition; if you use GPL edition, ! or commercial edition of ALGLIB without SMP support, you still will ! be able to call smp-version of this function, but all computations ! will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. ! ! You should remember that starting/stopping worker thread always have ! non-zero cost. Although multicore version is pretty efficient on ! large problems, we do not recommend you to use it on small problems - ! with correlation matrices smaller than 128*128. 
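As an illustration (not part of the ALGLIB sources; the sketch assumes the public C++ wrapper pearsoncorrm2(x, y, n, m1, m2, c) declared in statistics.h and made-up data), a serial call looks like:

    #include "statistics.h"
    #include <cstdio>
    int main()
    {
        // X: 5 observations of 2 variables, Y: 5 observations of 1 variable
        alglib::real_2d_array x = "[[1,10],[2,11],[3,14],[4,13],[5,16]]";
        alglib::real_2d_array y = "[[2.1],[1.9],[3.2],[2.8],[3.9]]";
        alglib::real_2d_array c;
        alglib::pearsoncorrm2(x, y, x.rows(), x.cols(), y.cols(), c);
        printf("%.3f %.3f\n", c[0][0], c[1][0]); // c is 2x1: corr(X col i, Y col 0)
        return 0;
    }
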
INPUT PARAMETERS: X - array[N,M1], sample matrix: * J-th column corresponds to J-th variable * I-th row corresponds to I-th observation Y - array[N,M2], sample matrix: * J-th column corresponds to J-th variable * I-th row corresponds to I-th observation N - N>=0, number of observations: * if given, only leading N rows of X/Y are used * if not given, automatically determined from input sizes M1 - M1>0, number of variables in X: * if given, only leading M1 columns of X are used * if not given, automatically determined from input size M2 - M2>0, number of variables in Y: * if given, only leading M1 columns of X are used * if not given, automatically determined from input size OUTPUT PARAMETERS: C - array[M1,M2], cross-correlation matrix (zero if N=0 or N=1) -- ALGLIB -- Copyright 28.10.2010 by Bochkanov Sergey *************************************************************************/ void pearsoncorrm2(/* Real */ ae_matrix* x, /* Real */ ae_matrix* y, ae_int_t n, ae_int_t m1, ae_int_t m2, /* Real */ ae_matrix* c, ae_state *_state) { ae_frame _frame_block; ae_matrix _x; ae_matrix _y; ae_int_t i; ae_int_t j; double v; ae_vector t; ae_vector x0; ae_vector y0; ae_vector sx; ae_vector sy; ae_vector samex; ae_vector samey; ae_frame_make(_state, &_frame_block); ae_matrix_init_copy(&_x, x, _state); x = &_x; ae_matrix_init_copy(&_y, y, _state); y = &_y; ae_matrix_clear(c); ae_vector_init(&t, 0, DT_REAL, _state); ae_vector_init(&x0, 0, DT_REAL, _state); ae_vector_init(&y0, 0, DT_REAL, _state); ae_vector_init(&sx, 0, DT_REAL, _state); ae_vector_init(&sy, 0, DT_REAL, _state); ae_vector_init(&samex, 0, DT_BOOL, _state); ae_vector_init(&samey, 0, DT_BOOL, _state); ae_assert(n>=0, "PearsonCorrM2: N<0", _state); ae_assert(m1>=1, "PearsonCorrM2: M1<1", _state); ae_assert(m2>=1, "PearsonCorrM2: M2<1", _state); ae_assert(x->rows>=n, "PearsonCorrM2: Rows(X)cols>=m1||n==0, "PearsonCorrM2: Cols(X)rows>=n, "PearsonCorrM2: Rows(Y)cols>=m2||n==0, "PearsonCorrM2: Cols(Y)ptr.pp_double[i][j] = (double)(0); } } ae_frame_leave(_state); return; } /* * Allocate */ ae_vector_set_length(&t, ae_maxint(m1, m2, _state), _state); ae_vector_set_length(&x0, m1, _state); ae_vector_set_length(&y0, m2, _state); ae_vector_set_length(&sx, m1, _state); ae_vector_set_length(&sy, m2, _state); ae_vector_set_length(&samex, m1, _state); ae_vector_set_length(&samey, m2, _state); ae_matrix_set_length(c, m1, m2, _state); /* * * calculate means of X * * center X * * if we have constant columns, these columns are * artificially zeroed (they must be zero in exact arithmetics, * but unfortunately floating point ops are not exact). 
* * calculate column variances */ for(i=0; i<=m1-1; i++) { t.ptr.p_double[i] = (double)(0); samex.ptr.p_bool[i] = ae_true; sx.ptr.p_double[i] = (double)(0); } ae_v_move(&x0.ptr.p_double[0], 1, &x->ptr.pp_double[0][0], 1, ae_v_len(0,m1-1)); v = (double)1/(double)n; for(i=0; i<=n-1; i++) { ae_v_addd(&t.ptr.p_double[0], 1, &x->ptr.pp_double[i][0], 1, ae_v_len(0,m1-1), v); for(j=0; j<=m1-1; j++) { samex.ptr.p_bool[j] = samex.ptr.p_bool[j]&&ae_fp_eq(x->ptr.pp_double[i][j],x0.ptr.p_double[j]); } } for(i=0; i<=n-1; i++) { ae_v_sub(&x->ptr.pp_double[i][0], 1, &t.ptr.p_double[0], 1, ae_v_len(0,m1-1)); for(j=0; j<=m1-1; j++) { if( samex.ptr.p_bool[j] ) { x->ptr.pp_double[i][j] = (double)(0); } sx.ptr.p_double[j] = sx.ptr.p_double[j]+x->ptr.pp_double[i][j]*x->ptr.pp_double[i][j]; } } for(j=0; j<=m1-1; j++) { sx.ptr.p_double[j] = ae_sqrt(sx.ptr.p_double[j]/(n-1), _state); } /* * Repeat same steps for Y */ for(i=0; i<=m2-1; i++) { t.ptr.p_double[i] = (double)(0); samey.ptr.p_bool[i] = ae_true; sy.ptr.p_double[i] = (double)(0); } ae_v_move(&y0.ptr.p_double[0], 1, &y->ptr.pp_double[0][0], 1, ae_v_len(0,m2-1)); v = (double)1/(double)n; for(i=0; i<=n-1; i++) { ae_v_addd(&t.ptr.p_double[0], 1, &y->ptr.pp_double[i][0], 1, ae_v_len(0,m2-1), v); for(j=0; j<=m2-1; j++) { samey.ptr.p_bool[j] = samey.ptr.p_bool[j]&&ae_fp_eq(y->ptr.pp_double[i][j],y0.ptr.p_double[j]); } } for(i=0; i<=n-1; i++) { ae_v_sub(&y->ptr.pp_double[i][0], 1, &t.ptr.p_double[0], 1, ae_v_len(0,m2-1)); for(j=0; j<=m2-1; j++) { if( samey.ptr.p_bool[j] ) { y->ptr.pp_double[i][j] = (double)(0); } sy.ptr.p_double[j] = sy.ptr.p_double[j]+y->ptr.pp_double[i][j]*y->ptr.pp_double[i][j]; } } for(j=0; j<=m2-1; j++) { sy.ptr.p_double[j] = ae_sqrt(sy.ptr.p_double[j]/(n-1), _state); } /* * calculate cross-covariance matrix */ rmatrixgemm(m1, m2, n, (double)1/(double)(n-1), x, 0, 0, 1, y, 0, 0, 0, 0.0, c, 0, 0, _state); /* * Divide by standard deviations */ for(i=0; i<=m1-1; i++) { if( ae_fp_neq(sx.ptr.p_double[i],(double)(0)) ) { sx.ptr.p_double[i] = 1/sx.ptr.p_double[i]; } else { sx.ptr.p_double[i] = 0.0; } } for(i=0; i<=m2-1; i++) { if( ae_fp_neq(sy.ptr.p_double[i],(double)(0)) ) { sy.ptr.p_double[i] = 1/sy.ptr.p_double[i]; } else { sy.ptr.p_double[i] = 0.0; } } for(i=0; i<=m1-1; i++) { v = sx.ptr.p_double[i]; for(j=0; j<=m2-1; j++) { c->ptr.pp_double[i][j] = c->ptr.pp_double[i][j]*v*sy.ptr.p_double[j]; } } ae_frame_leave(_state); } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. *************************************************************************/ void _pexec_pearsoncorrm2(/* Real */ ae_matrix* x, /* Real */ ae_matrix* y, ae_int_t n, ae_int_t m1, ae_int_t m2, /* Real */ ae_matrix* c, ae_state *_state) { pearsoncorrm2(x,y,n,m1,m2,c, _state); } /************************************************************************* Spearman's rank cross-correlation matrix SMP EDITION OF ALGLIB: ! This function can utilize multicore capabilities of your system. In ! order to do this you have to call version with "smp_" prefix, which ! indicates that multicore code will be used. ! ! This note is given for users of SMP edition; if you use GPL edition, ! or commercial edition of ALGLIB without SMP support, you still will ! be able to call smp-version of this function, but all computations ! will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. ! ! 
You should remember that starting/stopping worker threads always has a
! non-zero cost. Although the multicore version is pretty efficient on
! large problems, we do not recommend you to use it on small problems -
! with correlation matrices smaller than 128*128.

INPUT PARAMETERS:
    X   -   array[N,M1], sample matrix:
            * J-th column corresponds to J-th variable
            * I-th row corresponds to I-th observation
    Y   -   array[N,M2], sample matrix:
            * J-th column corresponds to J-th variable
            * I-th row corresponds to I-th observation
    N   -   N>=0, number of observations:
            * if given, only leading N rows of X/Y are used
            * if not given, automatically determined from input sizes
    M1  -   M1>0, number of variables in X:
            * if given, only leading M1 columns of X are used
            * if not given, automatically determined from input size
    M2  -   M2>0, number of variables in Y:
            * if given, only leading M2 columns of Y are used
            * if not given, automatically determined from input size

OUTPUT PARAMETERS:
    C   -   array[M1,M2], cross-correlation matrix (zero if N=0 or N=1)

  -- ALGLIB --
     Copyright 28.10.2010 by Bochkanov Sergey
*************************************************************************/
void spearmancorrm2(/* Real */ ae_matrix* x,
     /* Real */ ae_matrix* y,
     ae_int_t n,
     ae_int_t m1,
     ae_int_t m2,
     /* Real */ ae_matrix* c,
     ae_state *_state)
{
    ae_frame _frame_block;
    ae_int_t i;
    ae_int_t j;
    double v;
    double v2;
    double vv;
    ae_bool b;
    ae_vector t;
    double x0;
    double y0;
    ae_vector sx;
    ae_vector sy;
    ae_matrix xc;
    ae_matrix yc;
    apbuffers buf;

    ae_frame_make(_state, &_frame_block);
    ae_matrix_clear(c);
    ae_vector_init(&t, 0, DT_REAL, _state);
    ae_vector_init(&sx, 0, DT_REAL, _state);
    ae_vector_init(&sy, 0, DT_REAL, _state);
    ae_matrix_init(&xc, 0, 0, DT_REAL, _state);
    ae_matrix_init(&yc, 0, 0, DT_REAL, _state);
    _apbuffers_init(&buf, _state);

    ae_assert(n>=0, "SpearmanCorrM2: N<0", _state);
    ae_assert(m1>=1, "SpearmanCorrM2: M1<1", _state);
    ae_assert(m2>=1, "SpearmanCorrM2: M2<1", _state);
    ae_assert(x->rows>=n, "SpearmanCorrM2: Rows(X)<N", _state);
    ae_assert(x->cols>=m1||n==0, "SpearmanCorrM2: Cols(X)<M1", _state);
    ae_assert(y->rows>=n, "SpearmanCorrM2: Rows(Y)<N", _state);
    ae_assert(y->cols>=m2||n==0, "SpearmanCorrM2: Cols(Y)<M2", _state);
    if( n<=1 )
    {
        ae_matrix_set_length(c, m1, m2, _state);
        for(i=0; i<=m1-1; i++)
        {
            for(j=0; j<=m2-1; j++)
            {
                c->ptr.pp_double[i][j] = (double)(0);
            }
        }
        ae_frame_leave(_state);
        return;
    }

    /*
     * Allocate
     */
    ae_vector_set_length(&t, ae_maxint(ae_maxint(m1, m2, _state), n, _state), _state);
    ae_vector_set_length(&sx, m1, _state);
    ae_vector_set_length(&sy, m2, _state);
    ae_matrix_set_length(c, m1, m2, _state);

    /*
     * Replace data with ranks
     */
    ae_matrix_set_length(&xc, m1, n, _state);
    ae_matrix_set_length(&yc, m2, n, _state);
    rmatrixtranspose(n, m1, x, 0, 0, &xc, 0, 0, _state);
    rmatrixtranspose(n, m2, y, 0, 0, &yc, 0, 0, _state);
    rankdata(&xc, m1, n, _state);
    rankdata(&yc, m2, n, _state);

    /*
     * 1. Calculate means, variances, check for constant columns
     * 2. Center variables, constant columns are
     *    artificially zeroed (they must be zero in exact arithmetics,
     *    but unfortunately floating point is not exact).
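     *
     *    Illustrative example (values chosen here only for exposition):
     *    rankdata() above has already replaced every row of XC/YC by
     *    within-row ranks which start at 0 and average over ties, e.g. the
     *    row
     *        5.0, 1.0, 5.0, 3.0
     *    becomes
     *        2.5, 0.0, 2.5, 1.0
     *    so the code below is simply PearsonCorrM2 applied to the
     *    rank-transformed data.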
* * Description of variables: * * V - mean value of I-th variable * * V2- variance * * VV-temporary * * B - True in case all variable values are same */ for(i=0; i<=m1-1; i++) { v = (double)(0); v2 = 0.0; b = ae_true; x0 = xc.ptr.pp_double[i][0]; for(j=0; j<=n-1; j++) { vv = xc.ptr.pp_double[i][j]; v = v+vv; b = b&&ae_fp_eq(vv,x0); } v = v/n; if( b ) { for(j=0; j<=n-1; j++) { xc.ptr.pp_double[i][j] = 0.0; } } else { for(j=0; j<=n-1; j++) { vv = xc.ptr.pp_double[i][j]; xc.ptr.pp_double[i][j] = vv-v; v2 = v2+(vv-v)*(vv-v); } } sx.ptr.p_double[i] = ae_sqrt(v2/(n-1), _state); } for(i=0; i<=m2-1; i++) { v = (double)(0); v2 = 0.0; b = ae_true; y0 = yc.ptr.pp_double[i][0]; for(j=0; j<=n-1; j++) { vv = yc.ptr.pp_double[i][j]; v = v+vv; b = b&&ae_fp_eq(vv,y0); } v = v/n; if( b ) { for(j=0; j<=n-1; j++) { yc.ptr.pp_double[i][j] = 0.0; } } else { for(j=0; j<=n-1; j++) { vv = yc.ptr.pp_double[i][j]; yc.ptr.pp_double[i][j] = vv-v; v2 = v2+(vv-v)*(vv-v); } } sy.ptr.p_double[i] = ae_sqrt(v2/(n-1), _state); } /* * calculate cross-covariance matrix */ rmatrixgemm(m1, m2, n, (double)1/(double)(n-1), &xc, 0, 0, 0, &yc, 0, 0, 1, 0.0, c, 0, 0, _state); /* * Divide by standard deviations */ for(i=0; i<=m1-1; i++) { if( ae_fp_neq(sx.ptr.p_double[i],(double)(0)) ) { sx.ptr.p_double[i] = 1/sx.ptr.p_double[i]; } else { sx.ptr.p_double[i] = 0.0; } } for(i=0; i<=m2-1; i++) { if( ae_fp_neq(sy.ptr.p_double[i],(double)(0)) ) { sy.ptr.p_double[i] = 1/sy.ptr.p_double[i]; } else { sy.ptr.p_double[i] = 0.0; } } for(i=0; i<=m1-1; i++) { v = sx.ptr.p_double[i]; for(j=0; j<=m2-1; j++) { c->ptr.pp_double[i][j] = c->ptr.pp_double[i][j]*v*sy.ptr.p_double[j]; } } ae_frame_leave(_state); } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. *************************************************************************/ void _pexec_spearmancorrm2(/* Real */ ae_matrix* x, /* Real */ ae_matrix* y, ae_int_t n, ae_int_t m1, ae_int_t m2, /* Real */ ae_matrix* c, ae_state *_state) { spearmancorrm2(x,y,n,m1,m2,c, _state); } /************************************************************************* This function replaces data in XY by their ranks: * XY is processed row-by-row * rows are processed separately * tied data are correctly handled (tied ranks are calculated) * ranking starts from 0, ends at NFeatures-1 * sum of within-row values is equal to (NFeatures-1)*NFeatures/2 SMP EDITION OF ALGLIB: ! This function can utilize multicore capabilities of your system. In ! order to do this you have to call version with "smp_" prefix, which ! indicates that multicore code will be used. ! ! This note is given for users of SMP edition; if you use GPL edition, ! or commercial edition of ALGLIB without SMP support, you still will ! be able to call smp-version of this function, but all computations ! will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. ! ! You should remember that starting/stopping worker thread always have ! non-zero cost. Although multicore version is pretty efficient on ! large problems, we do not recommend you to use it on small problems - ! 
ones where expected operations count is less than 100.000 INPUT PARAMETERS: XY - array[NPoints,NFeatures], dataset NPoints - number of points NFeatures- number of features OUTPUT PARAMETERS: XY - data are replaced by their within-row ranks; ranking starts from 0, ends at NFeatures-1 -- ALGLIB -- Copyright 18.04.2013 by Bochkanov Sergey *************************************************************************/ void rankdata(/* Real */ ae_matrix* xy, ae_int_t npoints, ae_int_t nfeatures, ae_state *_state) { ae_frame _frame_block; apbuffers buf0; apbuffers buf1; ae_int_t basecasecost; ae_shared_pool pool; ae_frame_make(_state, &_frame_block); _apbuffers_init(&buf0, _state); _apbuffers_init(&buf1, _state); ae_shared_pool_init(&pool, _state); ae_assert(npoints>=0, "RankData: NPoints<0", _state); ae_assert(nfeatures>=1, "RankData: NFeatures<1", _state); ae_assert(xy->rows>=npoints, "RankData: Rows(XY)cols>=nfeatures||npoints==0, "RankData: Cols(XY)=0, "RankData: NPoints<0", _state); ae_assert(nfeatures>=1, "RankData: NFeatures<1", _state); ae_assert(xy->rows>=npoints, "RankData: Rows(XY)cols>=nfeatures||npoints==0, "RankData: Cols(XY)=i0, "RankDataRec: internal error", _state); /* * Recursively split problem, if it is too large */ problemcost = inttoreal(i1-i0, _state)*inttoreal(nfeatures, _state)*logbase2((double)(nfeatures), _state); if( i1-i0>=2&&ae_fp_greater(problemcost,(double)(basecasecost)) ) { im = (i1+i0)/2; basestat_rankdatarec(xy, i0, im, nfeatures, iscentered, pool, basecasecost, _state); basestat_rankdatarec(xy, im, i1, nfeatures, iscentered, pool, basecasecost, _state); ae_frame_leave(_state); return; } /* * Retrieve buffers from pool, call serial code, return buffers to pool */ ae_shared_pool_retrieve(pool, &_buf0, _state); ae_shared_pool_retrieve(pool, &_buf1, _state); basestat_rankdatabasecase(xy, i0, i1, nfeatures, iscentered, buf0, buf1, _state); ae_shared_pool_recycle(pool, &_buf0, _state); ae_shared_pool_recycle(pool, &_buf1, _state); ae_frame_leave(_state); } /************************************************************************* Basecase code for RankData(), performs actual work on subset of data using temporary buffer passed as parameter. INPUT PARAMETERS: XY - array[NPoints,NFeatures], dataset I0 - index of first row to process I1 - index of past-the-last row to process; this function processes half-interval [I0,I1). NFeatures- number of features IsCentered- whether ranks are centered or not: * True - ranks are centered in such way that their within-row sum is zero * False - ranks are not centered Buf0 - temporary buffers, may be empty (this function automatically allocates/reuses buffers). Buf1 - temporary buffers, may be empty (this function automatically allocates/reuses buffers). 
OUTPUT PARAMETERS:
    XY  -   data in [I0,I1) are replaced by their within-row ranks;
            ranking starts from 0, ends at NFeatures-1

  -- ALGLIB --
     Copyright 18.04.2013 by Bochkanov Sergey
*************************************************************************/
static void basestat_rankdatabasecase(/* Real */ ae_matrix* xy,
     ae_int_t i0,
     ae_int_t i1,
     ae_int_t nfeatures,
     ae_bool iscentered,
     apbuffers* buf0,
     apbuffers* buf1,
     ae_state *_state)
{
    ae_int_t i;

    ae_assert(i1>=i0, "RankDataBasecase: internal error", _state);
    if( buf1->ra0.cnt<nfeatures )
    {
        ae_vector_set_length(&buf1->ra0, nfeatures, _state);
    }
    for(i=i0; i<=i1-1; i++)
    {
        ae_v_move(&buf1->ra0.ptr.p_double[0], 1, &xy->ptr.pp_double[i][0], 1, ae_v_len(0,nfeatures-1));
        rankx(&buf1->ra0, nfeatures, iscentered, buf0, _state);
        ae_v_move(&xy->ptr.pp_double[i][0], 1, &buf1->ra0.ptr.p_double[0], 1, ae_v_len(0,nfeatures-1));
    }
}


/*************************************************************************
Wilcoxon signed-rank test

This test checks three hypotheses about the median of the given sample.
The following tests are performed:
    * two-tailed test (null hypothesis - the median is equal to the
      given value)
    * left-tailed test (null hypothesis - the median is greater than or
      equal to the given value)
    * right-tailed test (null hypothesis - the median is less than or
      equal to the given value)

Requirements:
    * the scale of measurement should be ordinal, interval or ratio
      (i.e. the test cannot be applied to nominal variables).
    * the distribution should be continuous and symmetric relative to its
      median.
    * number of distinct values in the X array should be greater than 4

The test is non-parametric and does not require the distribution of X to
be normal.

Input parameters:
    X       -   sample. Array whose index goes from 0 to N-1.
    N       -   size of the sample.
    Median  -   assumed median value.

Output parameters:
    BothTails   -   p-value for two-tailed test.
                    If BothTails is less than the given significance level
                    the null hypothesis is rejected.
    LeftTail    -   p-value for left-tailed test.
                    If LeftTail is less than the given significance level,
                    the null hypothesis is rejected.
    RightTail   -   p-value for right-tailed test.
                    If RightTail is less than the given significance level
                    the null hypothesis is rejected.

To calculate p-values, a special approximation is used. This method lets
us calculate p-values with two decimal places in the interval
[0.0001, 1]. "Two decimal places" does not sound very impressive, but in
practice the relative error of less than 1% is enough to make a decision.

There is no approximation outside the [0.0001, 1] interval. Therefore, if
the significance level lies outside this interval, the test returns 0.0001.
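For orientation, here is a minimal sketch of the classical normal
approximation behind the statistic used below; the helper name
wsr_normal_approx_pvalue is hypothetical, and this plain approximation is
not what ALGLIB returns, since the implementation refines the tail with the
fitted tables that follow. With NS nonzero differences X[i]-Median, the sum
of positive-difference ranks W+ is standardized by

    Mu    = NS*(NS+1)/4
    Sigma = sqrt(NS*(NS+1)*(2*NS+1)/24)

which matches the Mu/Sigma computed in the code below.

    #include <cmath>

    // Rough two-sided p-value for the signed-rank statistic wplus obtained
    // from ns nonzero differences, using the plain normal approximation.
    // For small ns this is noticeably less accurate than the fitted tables
    // used by wilcoxonsignedranktest().
    double wsr_normal_approx_pvalue(double wplus, int ns)
    {
        double mu    = ns*(ns+1.0)/4.0;
        double sigma = std::sqrt(ns*(ns+1.0)*(2.0*ns+1.0)/24.0);
        double s     = (wplus-mu)/sigma;
        return std::erfc(std::fabs(s)/std::sqrt(2.0));  // P(|Z|>=|s|), Z standard normal
    }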
-- ALGLIB -- Copyright 08.09.2006 by Bochkanov Sergey *************************************************************************/ void wilcoxonsignedranktest(/* Real */ ae_vector* x, ae_int_t n, double e, double* bothtails, double* lefttail, double* righttail, ae_state *_state) { ae_frame _frame_block; ae_vector _x; ae_int_t i; ae_int_t j; ae_int_t k; ae_int_t t; double tmp; ae_int_t tmpi; ae_int_t ns; ae_vector r; ae_vector c; double w; double p; double mp; double s; double sigma; double mu; ae_frame_make(_state, &_frame_block); ae_vector_init_copy(&_x, x, _state); x = &_x; *bothtails = 0; *lefttail = 0; *righttail = 0; ae_vector_init(&r, 0, DT_REAL, _state); ae_vector_init(&c, 0, DT_INT, _state); /* * Prepare */ if( n<5 ) { *bothtails = 1.0; *lefttail = 1.0; *righttail = 1.0; ae_frame_leave(_state); return; } ns = 0; for(i=0; i<=n-1; i++) { if( ae_fp_eq(x->ptr.p_double[i],e) ) { continue; } x->ptr.p_double[ns] = x->ptr.p_double[i]; ns = ns+1; } if( ns<5 ) { *bothtails = 1.0; *lefttail = 1.0; *righttail = 1.0; ae_frame_leave(_state); return; } ae_vector_set_length(&r, ns-1+1, _state); ae_vector_set_length(&c, ns-1+1, _state); for(i=0; i<=ns-1; i++) { r.ptr.p_double[i] = ae_fabs(x->ptr.p_double[i]-e, _state); c.ptr.p_int[i] = i; } /* * sort {R, C} */ if( ns!=1 ) { i = 2; do { t = i; while(t!=1) { k = t/2; if( ae_fp_greater_eq(r.ptr.p_double[k-1],r.ptr.p_double[t-1]) ) { t = 1; } else { tmp = r.ptr.p_double[k-1]; r.ptr.p_double[k-1] = r.ptr.p_double[t-1]; r.ptr.p_double[t-1] = tmp; tmpi = c.ptr.p_int[k-1]; c.ptr.p_int[k-1] = c.ptr.p_int[t-1]; c.ptr.p_int[t-1] = tmpi; t = k; } } i = i+1; } while(i<=ns); i = ns-1; do { tmp = r.ptr.p_double[i]; r.ptr.p_double[i] = r.ptr.p_double[0]; r.ptr.p_double[0] = tmp; tmpi = c.ptr.p_int[i]; c.ptr.p_int[i] = c.ptr.p_int[0]; c.ptr.p_int[0] = tmpi; t = 1; while(t!=0) { k = 2*t; if( k>i ) { t = 0; } else { if( k=1); } /* * compute tied ranks */ i = 0; while(i<=ns-1) { j = i+1; while(j<=ns-1) { if( ae_fp_neq(r.ptr.p_double[j],r.ptr.p_double[i]) ) { break; } j = j+1; } for(k=i; k<=j-1; k++) { r.ptr.p_double[k] = 1+(double)(i+j-1)/(double)2; } i = j; } /* * Compute W+ */ w = (double)(0); for(i=0; i<=ns-1; i++) { if( ae_fp_greater(x->ptr.p_double[c.ptr.p_int[i]],e) ) { w = w+r.ptr.p_double[i]; } } /* * Result */ mu = rmul2((double)(ns), (double)(ns+1), _state)/4; sigma = ae_sqrt(mu*(2*ns+1)/6, _state); s = (w-mu)/sigma; if( ae_fp_less_eq(s,(double)(0)) ) { p = ae_exp(wsr_wsigma(-(w-mu)/sigma, ns, _state), _state); mp = 1-ae_exp(wsr_wsigma(-(w-1-mu)/sigma, ns, _state), _state); } else { mp = ae_exp(wsr_wsigma((w-mu)/sigma, ns, _state), _state); p = 1-ae_exp(wsr_wsigma((w+1-mu)/sigma, ns, _state), _state); } *lefttail = ae_maxreal(p, 1.0E-4, _state); *righttail = ae_maxreal(mp, 1.0E-4, _state); *bothtails = 2*ae_minreal(*lefttail, *righttail, _state); ae_frame_leave(_state); } /************************************************************************* Sequential Chebyshev interpolation. 
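(Descriptive note: the helper below accumulates one term of a Chebyshev
series. Assuming the caller has initialized *TJ=1 and *TJ1=X, each call adds
C*T(k,X) to the running sum *R and advances (*TJ,*TJ1) by one step of the
recurrence T(k+1,X) = 2*X*T(k,X) - T(k-1,X), so a sequence of calls with
successive coefficients evaluates the fitted series term by term.)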
*************************************************************************/ static void wsr_wcheb(double x, double c, double* tj, double* tj1, double* r, ae_state *_state) { double t; *r = *r+c*(*tj); t = 2*x*(*tj1)-(*tj); *tj = *tj1; *tj1 = t; } /************************************************************************* Tail(S, 5) *************************************************************************/ static double wsr_w5(double s, ae_state *_state) { ae_int_t w; double r; double result; r = (double)(0); w = ae_round(-3.708099e+00*s+7.500000e+00, _state); if( w>=7 ) { r = -6.931e-01; } if( w==6 ) { r = -9.008e-01; } if( w==5 ) { r = -1.163e+00; } if( w==4 ) { r = -1.520e+00; } if( w==3 ) { r = -1.856e+00; } if( w==2 ) { r = -2.367e+00; } if( w==1 ) { r = -2.773e+00; } if( w<=0 ) { r = -3.466e+00; } result = r; return result; } /************************************************************************* Tail(S, 6) *************************************************************************/ static double wsr_w6(double s, ae_state *_state) { ae_int_t w; double r; double result; r = (double)(0); w = ae_round(-4.769696e+00*s+1.050000e+01, _state); if( w>=10 ) { r = -6.931e-01; } if( w==9 ) { r = -8.630e-01; } if( w==8 ) { r = -1.068e+00; } if( w==7 ) { r = -1.269e+00; } if( w==6 ) { r = -1.520e+00; } if( w==5 ) { r = -1.856e+00; } if( w==4 ) { r = -2.213e+00; } if( w==3 ) { r = -2.549e+00; } if( w==2 ) { r = -3.060e+00; } if( w==1 ) { r = -3.466e+00; } if( w<=0 ) { r = -4.159e+00; } result = r; return result; } /************************************************************************* Tail(S, 7) *************************************************************************/ static double wsr_w7(double s, ae_state *_state) { ae_int_t w; double r; double result; r = (double)(0); w = ae_round(-5.916080e+00*s+1.400000e+01, _state); if( w>=14 ) { r = -6.325e-01; } if( w==13 ) { r = -7.577e-01; } if( w==12 ) { r = -9.008e-01; } if( w==11 ) { r = -1.068e+00; } if( w==10 ) { r = -1.241e+00; } if( w==9 ) { r = -1.451e+00; } if( w==8 ) { r = -1.674e+00; } if( w==7 ) { r = -1.908e+00; } if( w==6 ) { r = -2.213e+00; } if( w==5 ) { r = -2.549e+00; } if( w==4 ) { r = -2.906e+00; } if( w==3 ) { r = -3.243e+00; } if( w==2 ) { r = -3.753e+00; } if( w==1 ) { r = -4.159e+00; } if( w<=0 ) { r = -4.852e+00; } result = r; return result; } /************************************************************************* Tail(S, 8) *************************************************************************/ static double wsr_w8(double s, ae_state *_state) { ae_int_t w; double r; double result; r = (double)(0); w = ae_round(-7.141428e+00*s+1.800000e+01, _state); if( w>=18 ) { r = -6.399e-01; } if( w==17 ) { r = -7.494e-01; } if( w==16 ) { r = -8.630e-01; } if( w==15 ) { r = -9.913e-01; } if( w==14 ) { r = -1.138e+00; } if( w==13 ) { r = -1.297e+00; } if( w==12 ) { r = -1.468e+00; } if( w==11 ) { r = -1.653e+00; } if( w==10 ) { r = -1.856e+00; } if( w==9 ) { r = -2.079e+00; } if( w==8 ) { r = -2.326e+00; } if( w==7 ) { r = -2.601e+00; } if( w==6 ) { r = -2.906e+00; } if( w==5 ) { r = -3.243e+00; } if( w==4 ) { r = -3.599e+00; } if( w==3 ) { r = -3.936e+00; } if( w==2 ) { r = -4.447e+00; } if( w==1 ) { r = -4.852e+00; } if( w<=0 ) { r = -5.545e+00; } result = r; return result; } /************************************************************************* Tail(S, 9) *************************************************************************/ static double wsr_w9(double s, ae_state *_state) { ae_int_t w; double r; double result; r = 
(double)(0); w = ae_round(-8.440972e+00*s+2.250000e+01, _state); if( w>=22 ) { r = -6.931e-01; } if( w==21 ) { r = -7.873e-01; } if( w==20 ) { r = -8.912e-01; } if( w==19 ) { r = -1.002e+00; } if( w==18 ) { r = -1.120e+00; } if( w==17 ) { r = -1.255e+00; } if( w==16 ) { r = -1.394e+00; } if( w==15 ) { r = -1.547e+00; } if( w==14 ) { r = -1.717e+00; } if( w==13 ) { r = -1.895e+00; } if( w==12 ) { r = -2.079e+00; } if( w==11 ) { r = -2.287e+00; } if( w==10 ) { r = -2.501e+00; } if( w==9 ) { r = -2.742e+00; } if( w==8 ) { r = -3.019e+00; } if( w==7 ) { r = -3.294e+00; } if( w==6 ) { r = -3.599e+00; } if( w==5 ) { r = -3.936e+00; } if( w==4 ) { r = -4.292e+00; } if( w==3 ) { r = -4.629e+00; } if( w==2 ) { r = -5.140e+00; } if( w==1 ) { r = -5.545e+00; } if( w<=0 ) { r = -6.238e+00; } result = r; return result; } /************************************************************************* Tail(S, 10) *************************************************************************/ static double wsr_w10(double s, ae_state *_state) { ae_int_t w; double r; double result; r = (double)(0); w = ae_round(-9.810708e+00*s+2.750000e+01, _state); if( w>=27 ) { r = -6.931e-01; } if( w==26 ) { r = -7.745e-01; } if( w==25 ) { r = -8.607e-01; } if( w==24 ) { r = -9.551e-01; } if( w==23 ) { r = -1.057e+00; } if( w==22 ) { r = -1.163e+00; } if( w==21 ) { r = -1.279e+00; } if( w==20 ) { r = -1.402e+00; } if( w==19 ) { r = -1.533e+00; } if( w==18 ) { r = -1.674e+00; } if( w==17 ) { r = -1.826e+00; } if( w==16 ) { r = -1.983e+00; } if( w==15 ) { r = -2.152e+00; } if( w==14 ) { r = -2.336e+00; } if( w==13 ) { r = -2.525e+00; } if( w==12 ) { r = -2.727e+00; } if( w==11 ) { r = -2.942e+00; } if( w==10 ) { r = -3.170e+00; } if( w==9 ) { r = -3.435e+00; } if( w==8 ) { r = -3.713e+00; } if( w==7 ) { r = -3.987e+00; } if( w==6 ) { r = -4.292e+00; } if( w==5 ) { r = -4.629e+00; } if( w==4 ) { r = -4.986e+00; } if( w==3 ) { r = -5.322e+00; } if( w==2 ) { r = -5.833e+00; } if( w==1 ) { r = -6.238e+00; } if( w<=0 ) { r = -6.931e+00; } result = r; return result; } /************************************************************************* Tail(S, 11) *************************************************************************/ static double wsr_w11(double s, ae_state *_state) { ae_int_t w; double r; double result; r = (double)(0); w = ae_round(-1.124722e+01*s+3.300000e+01, _state); if( w>=33 ) { r = -6.595e-01; } if( w==32 ) { r = -7.279e-01; } if( w==31 ) { r = -8.002e-01; } if( w==30 ) { r = -8.782e-01; } if( w==29 ) { r = -9.615e-01; } if( w==28 ) { r = -1.050e+00; } if( w==27 ) { r = -1.143e+00; } if( w==26 ) { r = -1.243e+00; } if( w==25 ) { r = -1.348e+00; } if( w==24 ) { r = -1.459e+00; } if( w==23 ) { r = -1.577e+00; } if( w==22 ) { r = -1.700e+00; } if( w==21 ) { r = -1.832e+00; } if( w==20 ) { r = -1.972e+00; } if( w==19 ) { r = -2.119e+00; } if( w==18 ) { r = -2.273e+00; } if( w==17 ) { r = -2.437e+00; } if( w==16 ) { r = -2.607e+00; } if( w==15 ) { r = -2.788e+00; } if( w==14 ) { r = -2.980e+00; } if( w==13 ) { r = -3.182e+00; } if( w==12 ) { r = -3.391e+00; } if( w==11 ) { r = -3.617e+00; } if( w==10 ) { r = -3.863e+00; } if( w==9 ) { r = -4.128e+00; } if( w==8 ) { r = -4.406e+00; } if( w==7 ) { r = -4.680e+00; } if( w==6 ) { r = -4.986e+00; } if( w==5 ) { r = -5.322e+00; } if( w==4 ) { r = -5.679e+00; } if( w==3 ) { r = -6.015e+00; } if( w==2 ) { r = -6.526e+00; } if( w==1 ) { r = -6.931e+00; } if( w<=0 ) { r = -7.625e+00; } result = r; return result; } 
/************************************************************************* Tail(S, 12) *************************************************************************/ static double wsr_w12(double s, ae_state *_state) { ae_int_t w; double r; double result; r = (double)(0); w = ae_round(-1.274755e+01*s+3.900000e+01, _state); if( w>=39 ) { r = -6.633e-01; } if( w==38 ) { r = -7.239e-01; } if( w==37 ) { r = -7.878e-01; } if( w==36 ) { r = -8.556e-01; } if( w==35 ) { r = -9.276e-01; } if( w==34 ) { r = -1.003e+00; } if( w==33 ) { r = -1.083e+00; } if( w==32 ) { r = -1.168e+00; } if( w==31 ) { r = -1.256e+00; } if( w==30 ) { r = -1.350e+00; } if( w==29 ) { r = -1.449e+00; } if( w==28 ) { r = -1.552e+00; } if( w==27 ) { r = -1.660e+00; } if( w==26 ) { r = -1.774e+00; } if( w==25 ) { r = -1.893e+00; } if( w==24 ) { r = -2.017e+00; } if( w==23 ) { r = -2.148e+00; } if( w==22 ) { r = -2.285e+00; } if( w==21 ) { r = -2.429e+00; } if( w==20 ) { r = -2.581e+00; } if( w==19 ) { r = -2.738e+00; } if( w==18 ) { r = -2.902e+00; } if( w==17 ) { r = -3.076e+00; } if( w==16 ) { r = -3.255e+00; } if( w==15 ) { r = -3.443e+00; } if( w==14 ) { r = -3.645e+00; } if( w==13 ) { r = -3.852e+00; } if( w==12 ) { r = -4.069e+00; } if( w==11 ) { r = -4.310e+00; } if( w==10 ) { r = -4.557e+00; } if( w==9 ) { r = -4.821e+00; } if( w==8 ) { r = -5.099e+00; } if( w==7 ) { r = -5.373e+00; } if( w==6 ) { r = -5.679e+00; } if( w==5 ) { r = -6.015e+00; } if( w==4 ) { r = -6.372e+00; } if( w==3 ) { r = -6.708e+00; } if( w==2 ) { r = -7.219e+00; } if( w==1 ) { r = -7.625e+00; } if( w<=0 ) { r = -8.318e+00; } result = r; return result; } /************************************************************************* Tail(S, 13) *************************************************************************/ static double wsr_w13(double s, ae_state *_state) { ae_int_t w; double r; double result; r = (double)(0); w = ae_round(-1.430909e+01*s+4.550000e+01, _state); if( w>=45 ) { r = -6.931e-01; } if( w==44 ) { r = -7.486e-01; } if( w==43 ) { r = -8.068e-01; } if( w==42 ) { r = -8.683e-01; } if( w==41 ) { r = -9.328e-01; } if( w==40 ) { r = -1.001e+00; } if( w==39 ) { r = -1.072e+00; } if( w==38 ) { r = -1.146e+00; } if( w==37 ) { r = -1.224e+00; } if( w==36 ) { r = -1.306e+00; } if( w==35 ) { r = -1.392e+00; } if( w==34 ) { r = -1.481e+00; } if( w==33 ) { r = -1.574e+00; } if( w==32 ) { r = -1.672e+00; } if( w==31 ) { r = -1.773e+00; } if( w==30 ) { r = -1.879e+00; } if( w==29 ) { r = -1.990e+00; } if( w==28 ) { r = -2.104e+00; } if( w==27 ) { r = -2.224e+00; } if( w==26 ) { r = -2.349e+00; } if( w==25 ) { r = -2.479e+00; } if( w==24 ) { r = -2.614e+00; } if( w==23 ) { r = -2.755e+00; } if( w==22 ) { r = -2.902e+00; } if( w==21 ) { r = -3.055e+00; } if( w==20 ) { r = -3.215e+00; } if( w==19 ) { r = -3.380e+00; } if( w==18 ) { r = -3.551e+00; } if( w==17 ) { r = -3.733e+00; } if( w==16 ) { r = -3.917e+00; } if( w==15 ) { r = -4.113e+00; } if( w==14 ) { r = -4.320e+00; } if( w==13 ) { r = -4.534e+00; } if( w==12 ) { r = -4.762e+00; } if( w==11 ) { r = -5.004e+00; } if( w==10 ) { r = -5.250e+00; } if( w==9 ) { r = -5.514e+00; } if( w==8 ) { r = -5.792e+00; } if( w==7 ) { r = -6.066e+00; } if( w==6 ) { r = -6.372e+00; } if( w==5 ) { r = -6.708e+00; } if( w==4 ) { r = -7.065e+00; } if( w==3 ) { r = -7.401e+00; } if( w==2 ) { r = -7.912e+00; } if( w==1 ) { r = -8.318e+00; } if( w<=0 ) { r = -9.011e+00; } result = r; return result; } /************************************************************************* Tail(S, 14) 
*************************************************************************/ static double wsr_w14(double s, ae_state *_state) { ae_int_t w; double r; double result; r = (double)(0); w = ae_round(-1.592953e+01*s+5.250000e+01, _state); if( w>=52 ) { r = -6.931e-01; } if( w==51 ) { r = -7.428e-01; } if( w==50 ) { r = -7.950e-01; } if( w==49 ) { r = -8.495e-01; } if( w==48 ) { r = -9.067e-01; } if( w==47 ) { r = -9.664e-01; } if( w==46 ) { r = -1.029e+00; } if( w==45 ) { r = -1.094e+00; } if( w==44 ) { r = -1.162e+00; } if( w==43 ) { r = -1.233e+00; } if( w==42 ) { r = -1.306e+00; } if( w==41 ) { r = -1.383e+00; } if( w==40 ) { r = -1.463e+00; } if( w==39 ) { r = -1.546e+00; } if( w==38 ) { r = -1.632e+00; } if( w==37 ) { r = -1.722e+00; } if( w==36 ) { r = -1.815e+00; } if( w==35 ) { r = -1.911e+00; } if( w==34 ) { r = -2.011e+00; } if( w==33 ) { r = -2.115e+00; } if( w==32 ) { r = -2.223e+00; } if( w==31 ) { r = -2.334e+00; } if( w==30 ) { r = -2.450e+00; } if( w==29 ) { r = -2.570e+00; } if( w==28 ) { r = -2.694e+00; } if( w==27 ) { r = -2.823e+00; } if( w==26 ) { r = -2.956e+00; } if( w==25 ) { r = -3.095e+00; } if( w==24 ) { r = -3.238e+00; } if( w==23 ) { r = -3.387e+00; } if( w==22 ) { r = -3.541e+00; } if( w==21 ) { r = -3.700e+00; } if( w==20 ) { r = -3.866e+00; } if( w==19 ) { r = -4.038e+00; } if( w==18 ) { r = -4.215e+00; } if( w==17 ) { r = -4.401e+00; } if( w==16 ) { r = -4.592e+00; } if( w==15 ) { r = -4.791e+00; } if( w==14 ) { r = -5.004e+00; } if( w==13 ) { r = -5.227e+00; } if( w==12 ) { r = -5.456e+00; } if( w==11 ) { r = -5.697e+00; } if( w==10 ) { r = -5.943e+00; } if( w==9 ) { r = -6.208e+00; } if( w==8 ) { r = -6.485e+00; } if( w==7 ) { r = -6.760e+00; } if( w==6 ) { r = -7.065e+00; } if( w==5 ) { r = -7.401e+00; } if( w==4 ) { r = -7.758e+00; } if( w==3 ) { r = -8.095e+00; } if( w==2 ) { r = -8.605e+00; } if( w==1 ) { r = -9.011e+00; } if( w<=0 ) { r = -9.704e+00; } result = r; return result; } /************************************************************************* Tail(S, 15) *************************************************************************/ static double wsr_w15(double s, ae_state *_state) { ae_int_t w; double r; double result; r = (double)(0); w = ae_round(-1.760682e+01*s+6.000000e+01, _state); if( w>=60 ) { r = -6.714e-01; } if( w==59 ) { r = -7.154e-01; } if( w==58 ) { r = -7.613e-01; } if( w==57 ) { r = -8.093e-01; } if( w==56 ) { r = -8.593e-01; } if( w==55 ) { r = -9.114e-01; } if( w==54 ) { r = -9.656e-01; } if( w==53 ) { r = -1.022e+00; } if( w==52 ) { r = -1.081e+00; } if( w==51 ) { r = -1.142e+00; } if( w==50 ) { r = -1.205e+00; } if( w==49 ) { r = -1.270e+00; } if( w==48 ) { r = -1.339e+00; } if( w==47 ) { r = -1.409e+00; } if( w==46 ) { r = -1.482e+00; } if( w==45 ) { r = -1.558e+00; } if( w==44 ) { r = -1.636e+00; } if( w==43 ) { r = -1.717e+00; } if( w==42 ) { r = -1.801e+00; } if( w==41 ) { r = -1.888e+00; } if( w==40 ) { r = -1.977e+00; } if( w==39 ) { r = -2.070e+00; } if( w==38 ) { r = -2.166e+00; } if( w==37 ) { r = -2.265e+00; } if( w==36 ) { r = -2.366e+00; } if( w==35 ) { r = -2.472e+00; } if( w==34 ) { r = -2.581e+00; } if( w==33 ) { r = -2.693e+00; } if( w==32 ) { r = -2.809e+00; } if( w==31 ) { r = -2.928e+00; } if( w==30 ) { r = -3.051e+00; } if( w==29 ) { r = -3.179e+00; } if( w==28 ) { r = -3.310e+00; } if( w==27 ) { r = -3.446e+00; } if( w==26 ) { r = -3.587e+00; } if( w==25 ) { r = -3.732e+00; } if( w==24 ) { r = -3.881e+00; } if( w==23 ) { r = -4.036e+00; } if( w==22 ) { r = -4.195e+00; } if( w==21 ) { r = -4.359e+00; } if( 
w==20 ) { r = -4.531e+00; } if( w==19 ) { r = -4.707e+00; } if( w==18 ) { r = -4.888e+00; } if( w==17 ) { r = -5.079e+00; } if( w==16 ) { r = -5.273e+00; } if( w==15 ) { r = -5.477e+00; } if( w==14 ) { r = -5.697e+00; } if( w==13 ) { r = -5.920e+00; } if( w==12 ) { r = -6.149e+00; } if( w==11 ) { r = -6.390e+00; } if( w==10 ) { r = -6.636e+00; } if( w==9 ) { r = -6.901e+00; } if( w==8 ) { r = -7.178e+00; } if( w==7 ) { r = -7.453e+00; } if( w==6 ) { r = -7.758e+00; } if( w==5 ) { r = -8.095e+00; } if( w==4 ) { r = -8.451e+00; } if( w==3 ) { r = -8.788e+00; } if( w==2 ) { r = -9.299e+00; } if( w==1 ) { r = -9.704e+00; } if( w<=0 ) { r = -1.040e+01; } result = r; return result; } /************************************************************************* Tail(S, 16) *************************************************************************/ static double wsr_w16(double s, ae_state *_state) { ae_int_t w; double r; double result; r = (double)(0); w = ae_round(-1.933908e+01*s+6.800000e+01, _state); if( w>=68 ) { r = -6.733e-01; } if( w==67 ) { r = -7.134e-01; } if( w==66 ) { r = -7.551e-01; } if( w==65 ) { r = -7.986e-01; } if( w==64 ) { r = -8.437e-01; } if( w==63 ) { r = -8.905e-01; } if( w==62 ) { r = -9.391e-01; } if( w==61 ) { r = -9.895e-01; } if( w==60 ) { r = -1.042e+00; } if( w==59 ) { r = -1.096e+00; } if( w==58 ) { r = -1.152e+00; } if( w==57 ) { r = -1.210e+00; } if( w==56 ) { r = -1.270e+00; } if( w==55 ) { r = -1.331e+00; } if( w==54 ) { r = -1.395e+00; } if( w==53 ) { r = -1.462e+00; } if( w==52 ) { r = -1.530e+00; } if( w==51 ) { r = -1.600e+00; } if( w==50 ) { r = -1.673e+00; } if( w==49 ) { r = -1.748e+00; } if( w==48 ) { r = -1.825e+00; } if( w==47 ) { r = -1.904e+00; } if( w==46 ) { r = -1.986e+00; } if( w==45 ) { r = -2.071e+00; } if( w==44 ) { r = -2.158e+00; } if( w==43 ) { r = -2.247e+00; } if( w==42 ) { r = -2.339e+00; } if( w==41 ) { r = -2.434e+00; } if( w==40 ) { r = -2.532e+00; } if( w==39 ) { r = -2.632e+00; } if( w==38 ) { r = -2.735e+00; } if( w==37 ) { r = -2.842e+00; } if( w==36 ) { r = -2.951e+00; } if( w==35 ) { r = -3.064e+00; } if( w==34 ) { r = -3.179e+00; } if( w==33 ) { r = -3.298e+00; } if( w==32 ) { r = -3.420e+00; } if( w==31 ) { r = -3.546e+00; } if( w==30 ) { r = -3.676e+00; } if( w==29 ) { r = -3.810e+00; } if( w==28 ) { r = -3.947e+00; } if( w==27 ) { r = -4.088e+00; } if( w==26 ) { r = -4.234e+00; } if( w==25 ) { r = -4.383e+00; } if( w==24 ) { r = -4.538e+00; } if( w==23 ) { r = -4.697e+00; } if( w==22 ) { r = -4.860e+00; } if( w==21 ) { r = -5.029e+00; } if( w==20 ) { r = -5.204e+00; } if( w==19 ) { r = -5.383e+00; } if( w==18 ) { r = -5.569e+00; } if( w==17 ) { r = -5.762e+00; } if( w==16 ) { r = -5.960e+00; } if( w==15 ) { r = -6.170e+00; } if( w==14 ) { r = -6.390e+00; } if( w==13 ) { r = -6.613e+00; } if( w==12 ) { r = -6.842e+00; } if( w==11 ) { r = -7.083e+00; } if( w==10 ) { r = -7.329e+00; } if( w==9 ) { r = -7.594e+00; } if( w==8 ) { r = -7.871e+00; } if( w==7 ) { r = -8.146e+00; } if( w==6 ) { r = -8.451e+00; } if( w==5 ) { r = -8.788e+00; } if( w==4 ) { r = -9.144e+00; } if( w==3 ) { r = -9.481e+00; } if( w==2 ) { r = -9.992e+00; } if( w==1 ) { r = -1.040e+01; } if( w<=0 ) { r = -1.109e+01; } result = r; return result; } /************************************************************************* Tail(S, 17) *************************************************************************/ static double wsr_w17(double s, ae_state *_state) { ae_int_t w; double r; double result; r = (double)(0); w = ae_round(-2.112463e+01*s+7.650000e+01, _state); 
if( w>=76 ) { r = -6.931e-01; } if( w==75 ) { r = -7.306e-01; } if( w==74 ) { r = -7.695e-01; } if( w==73 ) { r = -8.097e-01; } if( w==72 ) { r = -8.514e-01; } if( w==71 ) { r = -8.946e-01; } if( w==70 ) { r = -9.392e-01; } if( w==69 ) { r = -9.853e-01; } if( w==68 ) { r = -1.033e+00; } if( w==67 ) { r = -1.082e+00; } if( w==66 ) { r = -1.133e+00; } if( w==65 ) { r = -1.185e+00; } if( w==64 ) { r = -1.240e+00; } if( w==63 ) { r = -1.295e+00; } if( w==62 ) { r = -1.353e+00; } if( w==61 ) { r = -1.412e+00; } if( w==60 ) { r = -1.473e+00; } if( w==59 ) { r = -1.536e+00; } if( w==58 ) { r = -1.600e+00; } if( w==57 ) { r = -1.666e+00; } if( w==56 ) { r = -1.735e+00; } if( w==55 ) { r = -1.805e+00; } if( w==54 ) { r = -1.877e+00; } if( w==53 ) { r = -1.951e+00; } if( w==52 ) { r = -2.028e+00; } if( w==51 ) { r = -2.106e+00; } if( w==50 ) { r = -2.186e+00; } if( w==49 ) { r = -2.269e+00; } if( w==48 ) { r = -2.353e+00; } if( w==47 ) { r = -2.440e+00; } if( w==46 ) { r = -2.530e+00; } if( w==45 ) { r = -2.621e+00; } if( w==44 ) { r = -2.715e+00; } if( w==43 ) { r = -2.812e+00; } if( w==42 ) { r = -2.911e+00; } if( w==41 ) { r = -3.012e+00; } if( w==40 ) { r = -3.116e+00; } if( w==39 ) { r = -3.223e+00; } if( w==38 ) { r = -3.332e+00; } if( w==37 ) { r = -3.445e+00; } if( w==36 ) { r = -3.560e+00; } if( w==35 ) { r = -3.678e+00; } if( w==34 ) { r = -3.799e+00; } if( w==33 ) { r = -3.924e+00; } if( w==32 ) { r = -4.052e+00; } if( w==31 ) { r = -4.183e+00; } if( w==30 ) { r = -4.317e+00; } if( w==29 ) { r = -4.456e+00; } if( w==28 ) { r = -4.597e+00; } if( w==27 ) { r = -4.743e+00; } if( w==26 ) { r = -4.893e+00; } if( w==25 ) { r = -5.047e+00; } if( w==24 ) { r = -5.204e+00; } if( w==23 ) { r = -5.367e+00; } if( w==22 ) { r = -5.534e+00; } if( w==21 ) { r = -5.706e+00; } if( w==20 ) { r = -5.884e+00; } if( w==19 ) { r = -6.066e+00; } if( w==18 ) { r = -6.254e+00; } if( w==17 ) { r = -6.451e+00; } if( w==16 ) { r = -6.654e+00; } if( w==15 ) { r = -6.864e+00; } if( w==14 ) { r = -7.083e+00; } if( w==13 ) { r = -7.306e+00; } if( w==12 ) { r = -7.535e+00; } if( w==11 ) { r = -7.776e+00; } if( w==10 ) { r = -8.022e+00; } if( w==9 ) { r = -8.287e+00; } if( w==8 ) { r = -8.565e+00; } if( w==7 ) { r = -8.839e+00; } if( w==6 ) { r = -9.144e+00; } if( w==5 ) { r = -9.481e+00; } if( w==4 ) { r = -9.838e+00; } if( w==3 ) { r = -1.017e+01; } if( w==2 ) { r = -1.068e+01; } if( w==1 ) { r = -1.109e+01; } if( w<=0 ) { r = -1.178e+01; } result = r; return result; } /************************************************************************* Tail(S, 18) *************************************************************************/ static double wsr_w18(double s, ae_state *_state) { ae_int_t w; double r; double result; r = (double)(0); w = ae_round(-2.296193e+01*s+8.550000e+01, _state); if( w>=85 ) { r = -6.931e-01; } if( w==84 ) { r = -7.276e-01; } if( w==83 ) { r = -7.633e-01; } if( w==82 ) { r = -8.001e-01; } if( w==81 ) { r = -8.381e-01; } if( w==80 ) { r = -8.774e-01; } if( w==79 ) { r = -9.179e-01; } if( w==78 ) { r = -9.597e-01; } if( w==77 ) { r = -1.003e+00; } if( w==76 ) { r = -1.047e+00; } if( w==75 ) { r = -1.093e+00; } if( w==74 ) { r = -1.140e+00; } if( w==73 ) { r = -1.188e+00; } if( w==72 ) { r = -1.238e+00; } if( w==71 ) { r = -1.289e+00; } if( w==70 ) { r = -1.342e+00; } if( w==69 ) { r = -1.396e+00; } if( w==68 ) { r = -1.452e+00; } if( w==67 ) { r = -1.509e+00; } if( w==66 ) { r = -1.568e+00; } if( w==65 ) { r = -1.628e+00; } if( w==64 ) { r = -1.690e+00; } if( w==63 ) { r = -1.753e+00; } if( w==62 ) { r 
= -1.818e+00; } if( w==61 ) { r = -1.885e+00; } if( w==60 ) { r = -1.953e+00; } if( w==59 ) { r = -2.023e+00; } if( w==58 ) { r = -2.095e+00; } if( w==57 ) { r = -2.168e+00; } if( w==56 ) { r = -2.244e+00; } if( w==55 ) { r = -2.321e+00; } if( w==54 ) { r = -2.400e+00; } if( w==53 ) { r = -2.481e+00; } if( w==52 ) { r = -2.564e+00; } if( w==51 ) { r = -2.648e+00; } if( w==50 ) { r = -2.735e+00; } if( w==49 ) { r = -2.824e+00; } if( w==48 ) { r = -2.915e+00; } if( w==47 ) { r = -3.008e+00; } if( w==46 ) { r = -3.104e+00; } if( w==45 ) { r = -3.201e+00; } if( w==44 ) { r = -3.301e+00; } if( w==43 ) { r = -3.403e+00; } if( w==42 ) { r = -3.508e+00; } if( w==41 ) { r = -3.615e+00; } if( w==40 ) { r = -3.724e+00; } if( w==39 ) { r = -3.836e+00; } if( w==38 ) { r = -3.950e+00; } if( w==37 ) { r = -4.068e+00; } if( w==36 ) { r = -4.188e+00; } if( w==35 ) { r = -4.311e+00; } if( w==34 ) { r = -4.437e+00; } if( w==33 ) { r = -4.565e+00; } if( w==32 ) { r = -4.698e+00; } if( w==31 ) { r = -4.833e+00; } if( w==30 ) { r = -4.971e+00; } if( w==29 ) { r = -5.113e+00; } if( w==28 ) { r = -5.258e+00; } if( w==27 ) { r = -5.408e+00; } if( w==26 ) { r = -5.561e+00; } if( w==25 ) { r = -5.717e+00; } if( w==24 ) { r = -5.878e+00; } if( w==23 ) { r = -6.044e+00; } if( w==22 ) { r = -6.213e+00; } if( w==21 ) { r = -6.388e+00; } if( w==20 ) { r = -6.569e+00; } if( w==19 ) { r = -6.753e+00; } if( w==18 ) { r = -6.943e+00; } if( w==17 ) { r = -7.144e+00; } if( w==16 ) { r = -7.347e+00; } if( w==15 ) { r = -7.557e+00; } if( w==14 ) { r = -7.776e+00; } if( w==13 ) { r = -7.999e+00; } if( w==12 ) { r = -8.228e+00; } if( w==11 ) { r = -8.469e+00; } if( w==10 ) { r = -8.715e+00; } if( w==9 ) { r = -8.980e+00; } if( w==8 ) { r = -9.258e+00; } if( w==7 ) { r = -9.532e+00; } if( w==6 ) { r = -9.838e+00; } if( w==5 ) { r = -1.017e+01; } if( w==4 ) { r = -1.053e+01; } if( w==3 ) { r = -1.087e+01; } if( w==2 ) { r = -1.138e+01; } if( w==1 ) { r = -1.178e+01; } if( w<=0 ) { r = -1.248e+01; } result = r; return result; } /************************************************************************* Tail(S, 19) *************************************************************************/ static double wsr_w19(double s, ae_state *_state) { ae_int_t w; double r; double result; r = (double)(0); w = ae_round(-2.484955e+01*s+9.500000e+01, _state); if( w>=95 ) { r = -6.776e-01; } if( w==94 ) { r = -7.089e-01; } if( w==93 ) { r = -7.413e-01; } if( w==92 ) { r = -7.747e-01; } if( w==91 ) { r = -8.090e-01; } if( w==90 ) { r = -8.445e-01; } if( w==89 ) { r = -8.809e-01; } if( w==88 ) { r = -9.185e-01; } if( w==87 ) { r = -9.571e-01; } if( w==86 ) { r = -9.968e-01; } if( w==85 ) { r = -1.038e+00; } if( w==84 ) { r = -1.080e+00; } if( w==83 ) { r = -1.123e+00; } if( w==82 ) { r = -1.167e+00; } if( w==81 ) { r = -1.213e+00; } if( w==80 ) { r = -1.259e+00; } if( w==79 ) { r = -1.307e+00; } if( w==78 ) { r = -1.356e+00; } if( w==77 ) { r = -1.407e+00; } if( w==76 ) { r = -1.458e+00; } if( w==75 ) { r = -1.511e+00; } if( w==74 ) { r = -1.565e+00; } if( w==73 ) { r = -1.621e+00; } if( w==72 ) { r = -1.678e+00; } if( w==71 ) { r = -1.736e+00; } if( w==70 ) { r = -1.796e+00; } if( w==69 ) { r = -1.857e+00; } if( w==68 ) { r = -1.919e+00; } if( w==67 ) { r = -1.983e+00; } if( w==66 ) { r = -2.048e+00; } if( w==65 ) { r = -2.115e+00; } if( w==64 ) { r = -2.183e+00; } if( w==63 ) { r = -2.253e+00; } if( w==62 ) { r = -2.325e+00; } if( w==61 ) { r = -2.398e+00; } if( w==60 ) { r = -2.472e+00; } if( w==59 ) { r = -2.548e+00; } if( w==58 ) { r = -2.626e+00; } 
if( w==57 ) { r = -2.706e+00; } if( w==56 ) { r = -2.787e+00; } if( w==55 ) { r = -2.870e+00; } if( w==54 ) { r = -2.955e+00; } if( w==53 ) { r = -3.042e+00; } if( w==52 ) { r = -3.130e+00; } if( w==51 ) { r = -3.220e+00; } if( w==50 ) { r = -3.313e+00; } if( w==49 ) { r = -3.407e+00; } if( w==48 ) { r = -3.503e+00; } if( w==47 ) { r = -3.601e+00; } if( w==46 ) { r = -3.702e+00; } if( w==45 ) { r = -3.804e+00; } if( w==44 ) { r = -3.909e+00; } if( w==43 ) { r = -4.015e+00; } if( w==42 ) { r = -4.125e+00; } if( w==41 ) { r = -4.236e+00; } if( w==40 ) { r = -4.350e+00; } if( w==39 ) { r = -4.466e+00; } if( w==38 ) { r = -4.585e+00; } if( w==37 ) { r = -4.706e+00; } if( w==36 ) { r = -4.830e+00; } if( w==35 ) { r = -4.957e+00; } if( w==34 ) { r = -5.086e+00; } if( w==33 ) { r = -5.219e+00; } if( w==32 ) { r = -5.355e+00; } if( w==31 ) { r = -5.493e+00; } if( w==30 ) { r = -5.634e+00; } if( w==29 ) { r = -5.780e+00; } if( w==28 ) { r = -5.928e+00; } if( w==27 ) { r = -6.080e+00; } if( w==26 ) { r = -6.235e+00; } if( w==25 ) { r = -6.394e+00; } if( w==24 ) { r = -6.558e+00; } if( w==23 ) { r = -6.726e+00; } if( w==22 ) { r = -6.897e+00; } if( w==21 ) { r = -7.074e+00; } if( w==20 ) { r = -7.256e+00; } if( w==19 ) { r = -7.443e+00; } if( w==18 ) { r = -7.636e+00; } if( w==17 ) { r = -7.837e+00; } if( w==16 ) { r = -8.040e+00; } if( w==15 ) { r = -8.250e+00; } if( w==14 ) { r = -8.469e+00; } if( w==13 ) { r = -8.692e+00; } if( w==12 ) { r = -8.921e+00; } if( w==11 ) { r = -9.162e+00; } if( w==10 ) { r = -9.409e+00; } if( w==9 ) { r = -9.673e+00; } if( w==8 ) { r = -9.951e+00; } if( w==7 ) { r = -1.023e+01; } if( w==6 ) { r = -1.053e+01; } if( w==5 ) { r = -1.087e+01; } if( w==4 ) { r = -1.122e+01; } if( w==3 ) { r = -1.156e+01; } if( w==2 ) { r = -1.207e+01; } if( w==1 ) { r = -1.248e+01; } if( w<=0 ) { r = -1.317e+01; } result = r; return result; } /************************************************************************* Tail(S, 20) *************************************************************************/ static double wsr_w20(double s, ae_state *_state) { ae_int_t w; double r; double result; r = (double)(0); w = ae_round(-2.678619e+01*s+1.050000e+02, _state); if( w>=105 ) { r = -6.787e-01; } if( w==104 ) { r = -7.078e-01; } if( w==103 ) { r = -7.378e-01; } if( w==102 ) { r = -7.686e-01; } if( w==101 ) { r = -8.004e-01; } if( w==100 ) { r = -8.330e-01; } if( w==99 ) { r = -8.665e-01; } if( w==98 ) { r = -9.010e-01; } if( w==97 ) { r = -9.363e-01; } if( w==96 ) { r = -9.726e-01; } if( w==95 ) { r = -1.010e+00; } if( w==94 ) { r = -1.048e+00; } if( w==93 ) { r = -1.087e+00; } if( w==92 ) { r = -1.128e+00; } if( w==91 ) { r = -1.169e+00; } if( w==90 ) { r = -1.211e+00; } if( w==89 ) { r = -1.254e+00; } if( w==88 ) { r = -1.299e+00; } if( w==87 ) { r = -1.344e+00; } if( w==86 ) { r = -1.390e+00; } if( w==85 ) { r = -1.438e+00; } if( w==84 ) { r = -1.486e+00; } if( w==83 ) { r = -1.536e+00; } if( w==82 ) { r = -1.587e+00; } if( w==81 ) { r = -1.639e+00; } if( w==80 ) { r = -1.692e+00; } if( w==79 ) { r = -1.746e+00; } if( w==78 ) { r = -1.802e+00; } if( w==77 ) { r = -1.859e+00; } if( w==76 ) { r = -1.916e+00; } if( w==75 ) { r = -1.976e+00; } if( w==74 ) { r = -2.036e+00; } if( w==73 ) { r = -2.098e+00; } if( w==72 ) { r = -2.161e+00; } if( w==71 ) { r = -2.225e+00; } if( w==70 ) { r = -2.290e+00; } if( w==69 ) { r = -2.357e+00; } if( w==68 ) { r = -2.426e+00; } if( w==67 ) { r = -2.495e+00; } if( w==66 ) { r = -2.566e+00; } if( w==65 ) { r = -2.639e+00; } if( w==64 ) { r = -2.713e+00; } if( w==63 
) { r = -2.788e+00; } if( w==62 ) { r = -2.865e+00; } if( w==61 ) { r = -2.943e+00; } if( w==60 ) { r = -3.023e+00; } if( w==59 ) { r = -3.104e+00; } if( w==58 ) { r = -3.187e+00; } if( w==57 ) { r = -3.272e+00; } if( w==56 ) { r = -3.358e+00; } if( w==55 ) { r = -3.446e+00; } if( w==54 ) { r = -3.536e+00; } if( w==53 ) { r = -3.627e+00; } if( w==52 ) { r = -3.721e+00; } if( w==51 ) { r = -3.815e+00; } if( w==50 ) { r = -3.912e+00; } if( w==49 ) { r = -4.011e+00; } if( w==48 ) { r = -4.111e+00; } if( w==47 ) { r = -4.214e+00; } if( w==46 ) { r = -4.318e+00; } if( w==45 ) { r = -4.425e+00; } if( w==44 ) { r = -4.534e+00; } if( w==43 ) { r = -4.644e+00; } if( w==42 ) { r = -4.757e+00; } if( w==41 ) { r = -4.872e+00; } if( w==40 ) { r = -4.990e+00; } if( w==39 ) { r = -5.109e+00; } if( w==38 ) { r = -5.232e+00; } if( w==37 ) { r = -5.356e+00; } if( w==36 ) { r = -5.484e+00; } if( w==35 ) { r = -5.614e+00; } if( w==34 ) { r = -5.746e+00; } if( w==33 ) { r = -5.882e+00; } if( w==32 ) { r = -6.020e+00; } if( w==31 ) { r = -6.161e+00; } if( w==30 ) { r = -6.305e+00; } if( w==29 ) { r = -6.453e+00; } if( w==28 ) { r = -6.603e+00; } if( w==27 ) { r = -6.757e+00; } if( w==26 ) { r = -6.915e+00; } if( w==25 ) { r = -7.076e+00; } if( w==24 ) { r = -7.242e+00; } if( w==23 ) { r = -7.411e+00; } if( w==22 ) { r = -7.584e+00; } if( w==21 ) { r = -7.763e+00; } if( w==20 ) { r = -7.947e+00; } if( w==19 ) { r = -8.136e+00; } if( w==18 ) { r = -8.330e+00; } if( w==17 ) { r = -8.530e+00; } if( w==16 ) { r = -8.733e+00; } if( w==15 ) { r = -8.943e+00; } if( w==14 ) { r = -9.162e+00; } if( w==13 ) { r = -9.386e+00; } if( w==12 ) { r = -9.614e+00; } if( w==11 ) { r = -9.856e+00; } if( w==10 ) { r = -1.010e+01; } if( w==9 ) { r = -1.037e+01; } if( w==8 ) { r = -1.064e+01; } if( w==7 ) { r = -1.092e+01; } if( w==6 ) { r = -1.122e+01; } if( w==5 ) { r = -1.156e+01; } if( w==4 ) { r = -1.192e+01; } if( w==3 ) { r = -1.225e+01; } if( w==2 ) { r = -1.276e+01; } if( w==1 ) { r = -1.317e+01; } if( w<=0 ) { r = -1.386e+01; } result = r; return result; } /************************************************************************* Tail(S, 21) *************************************************************************/ static double wsr_w21(double s, ae_state *_state) { ae_int_t w; double r; double result; r = (double)(0); w = ae_round(-2.877064e+01*s+1.155000e+02, _state); if( w>=115 ) { r = -6.931e-01; } if( w==114 ) { r = -7.207e-01; } if( w==113 ) { r = -7.489e-01; } if( w==112 ) { r = -7.779e-01; } if( w==111 ) { r = -8.077e-01; } if( w==110 ) { r = -8.383e-01; } if( w==109 ) { r = -8.697e-01; } if( w==108 ) { r = -9.018e-01; } if( w==107 ) { r = -9.348e-01; } if( w==106 ) { r = -9.685e-01; } if( w==105 ) { r = -1.003e+00; } if( w==104 ) { r = -1.039e+00; } if( w==103 ) { r = -1.075e+00; } if( w==102 ) { r = -1.112e+00; } if( w==101 ) { r = -1.150e+00; } if( w==100 ) { r = -1.189e+00; } if( w==99 ) { r = -1.229e+00; } if( w==98 ) { r = -1.269e+00; } if( w==97 ) { r = -1.311e+00; } if( w==96 ) { r = -1.353e+00; } if( w==95 ) { r = -1.397e+00; } if( w==94 ) { r = -1.441e+00; } if( w==93 ) { r = -1.486e+00; } if( w==92 ) { r = -1.533e+00; } if( w==91 ) { r = -1.580e+00; } if( w==90 ) { r = -1.628e+00; } if( w==89 ) { r = -1.677e+00; } if( w==88 ) { r = -1.728e+00; } if( w==87 ) { r = -1.779e+00; } if( w==86 ) { r = -1.831e+00; } if( w==85 ) { r = -1.884e+00; } if( w==84 ) { r = -1.939e+00; } if( w==83 ) { r = -1.994e+00; } if( w==82 ) { r = -2.051e+00; } if( w==81 ) { r = -2.108e+00; } if( w==80 ) { r = -2.167e+00; } if( w==79 
) { r = -2.227e+00; } if( w==78 ) { r = -2.288e+00; } if( w==77 ) { r = -2.350e+00; } if( w==76 ) { r = -2.414e+00; } if( w==75 ) { r = -2.478e+00; } if( w==74 ) { r = -2.544e+00; } if( w==73 ) { r = -2.611e+00; } if( w==72 ) { r = -2.679e+00; } if( w==71 ) { r = -2.748e+00; } if( w==70 ) { r = -2.819e+00; } if( w==69 ) { r = -2.891e+00; } if( w==68 ) { r = -2.964e+00; } if( w==67 ) { r = -3.039e+00; } if( w==66 ) { r = -3.115e+00; } if( w==65 ) { r = -3.192e+00; } if( w==64 ) { r = -3.270e+00; } if( w==63 ) { r = -3.350e+00; } if( w==62 ) { r = -3.432e+00; } if( w==61 ) { r = -3.515e+00; } if( w==60 ) { r = -3.599e+00; } if( w==59 ) { r = -3.685e+00; } if( w==58 ) { r = -3.772e+00; } if( w==57 ) { r = -3.861e+00; } if( w==56 ) { r = -3.952e+00; } if( w==55 ) { r = -4.044e+00; } if( w==54 ) { r = -4.138e+00; } if( w==53 ) { r = -4.233e+00; } if( w==52 ) { r = -4.330e+00; } if( w==51 ) { r = -4.429e+00; } if( w==50 ) { r = -4.530e+00; } if( w==49 ) { r = -4.632e+00; } if( w==48 ) { r = -4.736e+00; } if( w==47 ) { r = -4.842e+00; } if( w==46 ) { r = -4.950e+00; } if( w==45 ) { r = -5.060e+00; } if( w==44 ) { r = -5.172e+00; } if( w==43 ) { r = -5.286e+00; } if( w==42 ) { r = -5.402e+00; } if( w==41 ) { r = -5.520e+00; } if( w==40 ) { r = -5.641e+00; } if( w==39 ) { r = -5.763e+00; } if( w==38 ) { r = -5.889e+00; } if( w==37 ) { r = -6.016e+00; } if( w==36 ) { r = -6.146e+00; } if( w==35 ) { r = -6.278e+00; } if( w==34 ) { r = -6.413e+00; } if( w==33 ) { r = -6.551e+00; } if( w==32 ) { r = -6.692e+00; } if( w==31 ) { r = -6.835e+00; } if( w==30 ) { r = -6.981e+00; } if( w==29 ) { r = -7.131e+00; } if( w==28 ) { r = -7.283e+00; } if( w==27 ) { r = -7.439e+00; } if( w==26 ) { r = -7.599e+00; } if( w==25 ) { r = -7.762e+00; } if( w==24 ) { r = -7.928e+00; } if( w==23 ) { r = -8.099e+00; } if( w==22 ) { r = -8.274e+00; } if( w==21 ) { r = -8.454e+00; } if( w==20 ) { r = -8.640e+00; } if( w==19 ) { r = -8.829e+00; } if( w==18 ) { r = -9.023e+00; } if( w==17 ) { r = -9.223e+00; } if( w==16 ) { r = -9.426e+00; } if( w==15 ) { r = -9.636e+00; } if( w==14 ) { r = -9.856e+00; } if( w==13 ) { r = -1.008e+01; } if( w==12 ) { r = -1.031e+01; } if( w==11 ) { r = -1.055e+01; } if( w==10 ) { r = -1.079e+01; } if( w==9 ) { r = -1.106e+01; } if( w==8 ) { r = -1.134e+01; } if( w==7 ) { r = -1.161e+01; } if( w==6 ) { r = -1.192e+01; } if( w==5 ) { r = -1.225e+01; } if( w==4 ) { r = -1.261e+01; } if( w==3 ) { r = -1.295e+01; } if( w==2 ) { r = -1.346e+01; } if( w==1 ) { r = -1.386e+01; } if( w<=0 ) { r = -1.456e+01; } result = r; return result; } /************************************************************************* Tail(S, 22) *************************************************************************/ static double wsr_w22(double s, ae_state *_state) { ae_int_t w; double r; double result; r = (double)(0); w = ae_round(-3.080179e+01*s+1.265000e+02, _state); if( w>=126 ) { r = -6.931e-01; } if( w==125 ) { r = -7.189e-01; } if( w==124 ) { r = -7.452e-01; } if( w==123 ) { r = -7.722e-01; } if( w==122 ) { r = -7.999e-01; } if( w==121 ) { r = -8.283e-01; } if( w==120 ) { r = -8.573e-01; } if( w==119 ) { r = -8.871e-01; } if( w==118 ) { r = -9.175e-01; } if( w==117 ) { r = -9.486e-01; } if( w==116 ) { r = -9.805e-01; } if( w==115 ) { r = -1.013e+00; } if( w==114 ) { r = -1.046e+00; } if( w==113 ) { r = -1.080e+00; } if( w==112 ) { r = -1.115e+00; } if( w==111 ) { r = -1.151e+00; } if( w==110 ) { r = -1.187e+00; } if( w==109 ) { r = -1.224e+00; } if( w==108 ) { r = -1.262e+00; } if( w==107 ) { r = -1.301e+00; } if( 
w==106 ) { r = -1.340e+00; } if( w==105 ) { r = -1.381e+00; } if( w==104 ) { r = -1.422e+00; } if( w==103 ) { r = -1.464e+00; } if( w==102 ) { r = -1.506e+00; } if( w==101 ) { r = -1.550e+00; } if( w==100 ) { r = -1.594e+00; } if( w==99 ) { r = -1.640e+00; } if( w==98 ) { r = -1.686e+00; } if( w==97 ) { r = -1.733e+00; } if( w==96 ) { r = -1.781e+00; } if( w==95 ) { r = -1.830e+00; } if( w==94 ) { r = -1.880e+00; } if( w==93 ) { r = -1.930e+00; } if( w==92 ) { r = -1.982e+00; } if( w==91 ) { r = -2.034e+00; } if( w==90 ) { r = -2.088e+00; } if( w==89 ) { r = -2.142e+00; } if( w==88 ) { r = -2.198e+00; } if( w==87 ) { r = -2.254e+00; } if( w==86 ) { r = -2.312e+00; } if( w==85 ) { r = -2.370e+00; } if( w==84 ) { r = -2.429e+00; } if( w==83 ) { r = -2.490e+00; } if( w==82 ) { r = -2.551e+00; } if( w==81 ) { r = -2.614e+00; } if( w==80 ) { r = -2.677e+00; } if( w==79 ) { r = -2.742e+00; } if( w==78 ) { r = -2.808e+00; } if( w==77 ) { r = -2.875e+00; } if( w==76 ) { r = -2.943e+00; } if( w==75 ) { r = -3.012e+00; } if( w==74 ) { r = -3.082e+00; } if( w==73 ) { r = -3.153e+00; } if( w==72 ) { r = -3.226e+00; } if( w==71 ) { r = -3.300e+00; } if( w==70 ) { r = -3.375e+00; } if( w==69 ) { r = -3.451e+00; } if( w==68 ) { r = -3.529e+00; } if( w==67 ) { r = -3.607e+00; } if( w==66 ) { r = -3.687e+00; } if( w==65 ) { r = -3.769e+00; } if( w==64 ) { r = -3.851e+00; } if( w==63 ) { r = -3.935e+00; } if( w==62 ) { r = -4.021e+00; } if( w==61 ) { r = -4.108e+00; } if( w==60 ) { r = -4.196e+00; } if( w==59 ) { r = -4.285e+00; } if( w==58 ) { r = -4.376e+00; } if( w==57 ) { r = -4.469e+00; } if( w==56 ) { r = -4.563e+00; } if( w==55 ) { r = -4.659e+00; } if( w==54 ) { r = -4.756e+00; } if( w==53 ) { r = -4.855e+00; } if( w==52 ) { r = -4.955e+00; } if( w==51 ) { r = -5.057e+00; } if( w==50 ) { r = -5.161e+00; } if( w==49 ) { r = -5.266e+00; } if( w==48 ) { r = -5.374e+00; } if( w==47 ) { r = -5.483e+00; } if( w==46 ) { r = -5.594e+00; } if( w==45 ) { r = -5.706e+00; } if( w==44 ) { r = -5.821e+00; } if( w==43 ) { r = -5.938e+00; } if( w==42 ) { r = -6.057e+00; } if( w==41 ) { r = -6.177e+00; } if( w==40 ) { r = -6.300e+00; } if( w==39 ) { r = -6.426e+00; } if( w==38 ) { r = -6.553e+00; } if( w==37 ) { r = -6.683e+00; } if( w==36 ) { r = -6.815e+00; } if( w==35 ) { r = -6.949e+00; } if( w==34 ) { r = -7.086e+00; } if( w==33 ) { r = -7.226e+00; } if( w==32 ) { r = -7.368e+00; } if( w==31 ) { r = -7.513e+00; } if( w==30 ) { r = -7.661e+00; } if( w==29 ) { r = -7.813e+00; } if( w==28 ) { r = -7.966e+00; } if( w==27 ) { r = -8.124e+00; } if( w==26 ) { r = -8.285e+00; } if( w==25 ) { r = -8.449e+00; } if( w==24 ) { r = -8.617e+00; } if( w==23 ) { r = -8.789e+00; } if( w==22 ) { r = -8.965e+00; } if( w==21 ) { r = -9.147e+00; } if( w==20 ) { r = -9.333e+00; } if( w==19 ) { r = -9.522e+00; } if( w==18 ) { r = -9.716e+00; } if( w==17 ) { r = -9.917e+00; } if( w==16 ) { r = -1.012e+01; } if( w==15 ) { r = -1.033e+01; } if( w==14 ) { r = -1.055e+01; } if( w==13 ) { r = -1.077e+01; } if( w==12 ) { r = -1.100e+01; } if( w==11 ) { r = -1.124e+01; } if( w==10 ) { r = -1.149e+01; } if( w==9 ) { r = -1.175e+01; } if( w==8 ) { r = -1.203e+01; } if( w==7 ) { r = -1.230e+01; } if( w==6 ) { r = -1.261e+01; } if( w==5 ) { r = -1.295e+01; } if( w==4 ) { r = -1.330e+01; } if( w==3 ) { r = -1.364e+01; } if( w==2 ) { r = -1.415e+01; } if( w==1 ) { r = -1.456e+01; } if( w<=0 ) { r = -1.525e+01; } result = r; return result; } /************************************************************************* Tail(S, 23) 
*************************************************************************/ static double wsr_w23(double s, ae_state *_state) { ae_int_t w; double r; double result; r = (double)(0); w = ae_round(-3.287856e+01*s+1.380000e+02, _state); if( w>=138 ) { r = -6.813e-01; } if( w==137 ) { r = -7.051e-01; } if( w==136 ) { r = -7.295e-01; } if( w==135 ) { r = -7.544e-01; } if( w==134 ) { r = -7.800e-01; } if( w==133 ) { r = -8.061e-01; } if( w==132 ) { r = -8.328e-01; } if( w==131 ) { r = -8.601e-01; } if( w==130 ) { r = -8.880e-01; } if( w==129 ) { r = -9.166e-01; } if( w==128 ) { r = -9.457e-01; } if( w==127 ) { r = -9.755e-01; } if( w==126 ) { r = -1.006e+00; } if( w==125 ) { r = -1.037e+00; } if( w==124 ) { r = -1.069e+00; } if( w==123 ) { r = -1.101e+00; } if( w==122 ) { r = -1.134e+00; } if( w==121 ) { r = -1.168e+00; } if( w==120 ) { r = -1.202e+00; } if( w==119 ) { r = -1.237e+00; } if( w==118 ) { r = -1.273e+00; } if( w==117 ) { r = -1.309e+00; } if( w==116 ) { r = -1.347e+00; } if( w==115 ) { r = -1.384e+00; } if( w==114 ) { r = -1.423e+00; } if( w==113 ) { r = -1.462e+00; } if( w==112 ) { r = -1.502e+00; } if( w==111 ) { r = -1.543e+00; } if( w==110 ) { r = -1.585e+00; } if( w==109 ) { r = -1.627e+00; } if( w==108 ) { r = -1.670e+00; } if( w==107 ) { r = -1.714e+00; } if( w==106 ) { r = -1.758e+00; } if( w==105 ) { r = -1.804e+00; } if( w==104 ) { r = -1.850e+00; } if( w==103 ) { r = -1.897e+00; } if( w==102 ) { r = -1.944e+00; } if( w==101 ) { r = -1.993e+00; } if( w==100 ) { r = -2.042e+00; } if( w==99 ) { r = -2.093e+00; } if( w==98 ) { r = -2.144e+00; } if( w==97 ) { r = -2.195e+00; } if( w==96 ) { r = -2.248e+00; } if( w==95 ) { r = -2.302e+00; } if( w==94 ) { r = -2.356e+00; } if( w==93 ) { r = -2.412e+00; } if( w==92 ) { r = -2.468e+00; } if( w==91 ) { r = -2.525e+00; } if( w==90 ) { r = -2.583e+00; } if( w==89 ) { r = -2.642e+00; } if( w==88 ) { r = -2.702e+00; } if( w==87 ) { r = -2.763e+00; } if( w==86 ) { r = -2.825e+00; } if( w==85 ) { r = -2.888e+00; } if( w==84 ) { r = -2.951e+00; } if( w==83 ) { r = -3.016e+00; } if( w==82 ) { r = -3.082e+00; } if( w==81 ) { r = -3.149e+00; } if( w==80 ) { r = -3.216e+00; } if( w==79 ) { r = -3.285e+00; } if( w==78 ) { r = -3.355e+00; } if( w==77 ) { r = -3.426e+00; } if( w==76 ) { r = -3.498e+00; } if( w==75 ) { r = -3.571e+00; } if( w==74 ) { r = -3.645e+00; } if( w==73 ) { r = -3.721e+00; } if( w==72 ) { r = -3.797e+00; } if( w==71 ) { r = -3.875e+00; } if( w==70 ) { r = -3.953e+00; } if( w==69 ) { r = -4.033e+00; } if( w==68 ) { r = -4.114e+00; } if( w==67 ) { r = -4.197e+00; } if( w==66 ) { r = -4.280e+00; } if( w==65 ) { r = -4.365e+00; } if( w==64 ) { r = -4.451e+00; } if( w==63 ) { r = -4.539e+00; } if( w==62 ) { r = -4.628e+00; } if( w==61 ) { r = -4.718e+00; } if( w==60 ) { r = -4.809e+00; } if( w==59 ) { r = -4.902e+00; } if( w==58 ) { r = -4.996e+00; } if( w==57 ) { r = -5.092e+00; } if( w==56 ) { r = -5.189e+00; } if( w==55 ) { r = -5.287e+00; } if( w==54 ) { r = -5.388e+00; } if( w==53 ) { r = -5.489e+00; } if( w==52 ) { r = -5.592e+00; } if( w==51 ) { r = -5.697e+00; } if( w==50 ) { r = -5.804e+00; } if( w==49 ) { r = -5.912e+00; } if( w==48 ) { r = -6.022e+00; } if( w==47 ) { r = -6.133e+00; } if( w==46 ) { r = -6.247e+00; } if( w==45 ) { r = -6.362e+00; } if( w==44 ) { r = -6.479e+00; } if( w==43 ) { r = -6.598e+00; } if( w==42 ) { r = -6.719e+00; } if( w==41 ) { r = -6.842e+00; } if( w==40 ) { r = -6.967e+00; } if( w==39 ) { r = -7.094e+00; } if( w==38 ) { r = -7.224e+00; } if( w==37 ) { r = -7.355e+00; } if( w==36 ) { r = 
-7.489e+00; } if( w==35 ) { r = -7.625e+00; } if( w==34 ) { r = -7.764e+00; } if( w==33 ) { r = -7.905e+00; } if( w==32 ) { r = -8.049e+00; } if( w==31 ) { r = -8.196e+00; } if( w==30 ) { r = -8.345e+00; } if( w==29 ) { r = -8.498e+00; } if( w==28 ) { r = -8.653e+00; } if( w==27 ) { r = -8.811e+00; } if( w==26 ) { r = -8.974e+00; } if( w==25 ) { r = -9.139e+00; } if( w==24 ) { r = -9.308e+00; } if( w==23 ) { r = -9.481e+00; } if( w==22 ) { r = -9.658e+00; } if( w==21 ) { r = -9.840e+00; } if( w==20 ) { r = -1.003e+01; } if( w==19 ) { r = -1.022e+01; } if( w==18 ) { r = -1.041e+01; } if( w==17 ) { r = -1.061e+01; } if( w==16 ) { r = -1.081e+01; } if( w==15 ) { r = -1.102e+01; } if( w==14 ) { r = -1.124e+01; } if( w==13 ) { r = -1.147e+01; } if( w==12 ) { r = -1.169e+01; } if( w==11 ) { r = -1.194e+01; } if( w==10 ) { r = -1.218e+01; } if( w==9 ) { r = -1.245e+01; } if( w==8 ) { r = -1.272e+01; } if( w==7 ) { r = -1.300e+01; } if( w==6 ) { r = -1.330e+01; } if( w==5 ) { r = -1.364e+01; } if( w==4 ) { r = -1.400e+01; } if( w==3 ) { r = -1.433e+01; } if( w==2 ) { r = -1.484e+01; } if( w==1 ) { r = -1.525e+01; } if( w<=0 ) { r = -1.594e+01; } result = r; return result; } /************************************************************************* Tail(S, 24) *************************************************************************/ static double wsr_w24(double s, ae_state *_state) { ae_int_t w; double r; double result; r = (double)(0); w = ae_round(-3.500000e+01*s+1.500000e+02, _state); if( w>=150 ) { r = -6.820e-01; } if( w==149 ) { r = -7.044e-01; } if( w==148 ) { r = -7.273e-01; } if( w==147 ) { r = -7.507e-01; } if( w==146 ) { r = -7.746e-01; } if( w==145 ) { r = -7.990e-01; } if( w==144 ) { r = -8.239e-01; } if( w==143 ) { r = -8.494e-01; } if( w==142 ) { r = -8.754e-01; } if( w==141 ) { r = -9.020e-01; } if( w==140 ) { r = -9.291e-01; } if( w==139 ) { r = -9.567e-01; } if( w==138 ) { r = -9.849e-01; } if( w==137 ) { r = -1.014e+00; } if( w==136 ) { r = -1.043e+00; } if( w==135 ) { r = -1.073e+00; } if( w==134 ) { r = -1.103e+00; } if( w==133 ) { r = -1.135e+00; } if( w==132 ) { r = -1.166e+00; } if( w==131 ) { r = -1.198e+00; } if( w==130 ) { r = -1.231e+00; } if( w==129 ) { r = -1.265e+00; } if( w==128 ) { r = -1.299e+00; } if( w==127 ) { r = -1.334e+00; } if( w==126 ) { r = -1.369e+00; } if( w==125 ) { r = -1.405e+00; } if( w==124 ) { r = -1.441e+00; } if( w==123 ) { r = -1.479e+00; } if( w==122 ) { r = -1.517e+00; } if( w==121 ) { r = -1.555e+00; } if( w==120 ) { r = -1.594e+00; } if( w==119 ) { r = -1.634e+00; } if( w==118 ) { r = -1.675e+00; } if( w==117 ) { r = -1.716e+00; } if( w==116 ) { r = -1.758e+00; } if( w==115 ) { r = -1.800e+00; } if( w==114 ) { r = -1.844e+00; } if( w==113 ) { r = -1.888e+00; } if( w==112 ) { r = -1.932e+00; } if( w==111 ) { r = -1.978e+00; } if( w==110 ) { r = -2.024e+00; } if( w==109 ) { r = -2.070e+00; } if( w==108 ) { r = -2.118e+00; } if( w==107 ) { r = -2.166e+00; } if( w==106 ) { r = -2.215e+00; } if( w==105 ) { r = -2.265e+00; } if( w==104 ) { r = -2.316e+00; } if( w==103 ) { r = -2.367e+00; } if( w==102 ) { r = -2.419e+00; } if( w==101 ) { r = -2.472e+00; } if( w==100 ) { r = -2.526e+00; } if( w==99 ) { r = -2.580e+00; } if( w==98 ) { r = -2.636e+00; } if( w==97 ) { r = -2.692e+00; } if( w==96 ) { r = -2.749e+00; } if( w==95 ) { r = -2.806e+00; } if( w==94 ) { r = -2.865e+00; } if( w==93 ) { r = -2.925e+00; } if( w==92 ) { r = -2.985e+00; } if( w==91 ) { r = -3.046e+00; } if( w==90 ) { r = -3.108e+00; } if( w==89 ) { r = -3.171e+00; } if( w==88 ) { 
r = -3.235e+00; } if( w==87 ) { r = -3.300e+00; } if( w==86 ) { r = -3.365e+00; } if( w==85 ) { r = -3.432e+00; } if( w==84 ) { r = -3.499e+00; } if( w==83 ) { r = -3.568e+00; } if( w==82 ) { r = -3.637e+00; } if( w==81 ) { r = -3.708e+00; } if( w==80 ) { r = -3.779e+00; } if( w==79 ) { r = -3.852e+00; } if( w==78 ) { r = -3.925e+00; } if( w==77 ) { r = -4.000e+00; } if( w==76 ) { r = -4.075e+00; } if( w==75 ) { r = -4.151e+00; } if( w==74 ) { r = -4.229e+00; } if( w==73 ) { r = -4.308e+00; } if( w==72 ) { r = -4.387e+00; } if( w==71 ) { r = -4.468e+00; } if( w==70 ) { r = -4.550e+00; } if( w==69 ) { r = -4.633e+00; } if( w==68 ) { r = -4.718e+00; } if( w==67 ) { r = -4.803e+00; } if( w==66 ) { r = -4.890e+00; } if( w==65 ) { r = -4.978e+00; } if( w==64 ) { r = -5.067e+00; } if( w==63 ) { r = -5.157e+00; } if( w==62 ) { r = -5.249e+00; } if( w==61 ) { r = -5.342e+00; } if( w==60 ) { r = -5.436e+00; } if( w==59 ) { r = -5.531e+00; } if( w==58 ) { r = -5.628e+00; } if( w==57 ) { r = -5.727e+00; } if( w==56 ) { r = -5.826e+00; } if( w==55 ) { r = -5.927e+00; } if( w==54 ) { r = -6.030e+00; } if( w==53 ) { r = -6.134e+00; } if( w==52 ) { r = -6.240e+00; } if( w==51 ) { r = -6.347e+00; } if( w==50 ) { r = -6.456e+00; } if( w==49 ) { r = -6.566e+00; } if( w==48 ) { r = -6.678e+00; } if( w==47 ) { r = -6.792e+00; } if( w==46 ) { r = -6.907e+00; } if( w==45 ) { r = -7.025e+00; } if( w==44 ) { r = -7.144e+00; } if( w==43 ) { r = -7.265e+00; } if( w==42 ) { r = -7.387e+00; } if( w==41 ) { r = -7.512e+00; } if( w==40 ) { r = -7.639e+00; } if( w==39 ) { r = -7.768e+00; } if( w==38 ) { r = -7.899e+00; } if( w==37 ) { r = -8.032e+00; } if( w==36 ) { r = -8.167e+00; } if( w==35 ) { r = -8.305e+00; } if( w==34 ) { r = -8.445e+00; } if( w==33 ) { r = -8.588e+00; } if( w==32 ) { r = -8.733e+00; } if( w==31 ) { r = -8.881e+00; } if( w==30 ) { r = -9.031e+00; } if( w==29 ) { r = -9.185e+00; } if( w==28 ) { r = -9.341e+00; } if( w==27 ) { r = -9.501e+00; } if( w==26 ) { r = -9.664e+00; } if( w==25 ) { r = -9.830e+00; } if( w==24 ) { r = -1.000e+01; } if( w==23 ) { r = -1.017e+01; } if( w==22 ) { r = -1.035e+01; } if( w==21 ) { r = -1.053e+01; } if( w==20 ) { r = -1.072e+01; } if( w==19 ) { r = -1.091e+01; } if( w==18 ) { r = -1.110e+01; } if( w==17 ) { r = -1.130e+01; } if( w==16 ) { r = -1.151e+01; } if( w==15 ) { r = -1.172e+01; } if( w==14 ) { r = -1.194e+01; } if( w==13 ) { r = -1.216e+01; } if( w==12 ) { r = -1.239e+01; } if( w==11 ) { r = -1.263e+01; } if( w==10 ) { r = -1.287e+01; } if( w==9 ) { r = -1.314e+01; } if( w==8 ) { r = -1.342e+01; } if( w==7 ) { r = -1.369e+01; } if( w==6 ) { r = -1.400e+01; } if( w==5 ) { r = -1.433e+01; } if( w==4 ) { r = -1.469e+01; } if( w==3 ) { r = -1.503e+01; } if( w==2 ) { r = -1.554e+01; } if( w==1 ) { r = -1.594e+01; } if( w<=0 ) { r = -1.664e+01; } result = r; return result; } /************************************************************************* Tail(S, 25) *************************************************************************/ static double wsr_w25(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); x = ae_minreal(2*(s-0.000000e+00)/4.000000e+00-1, 1.0, _state); tj = (double)(1); tj1 = x; wsr_wcheb(x, -5.150509e+00, &tj, &tj1, &result, _state); wsr_wcheb(x, -5.695528e+00, &tj, &tj1, &result, _state); wsr_wcheb(x, -1.437637e+00, &tj, &tj1, &result, _state); wsr_wcheb(x, -2.611906e-01, &tj, &tj1, &result, _state); wsr_wcheb(x, -7.625722e-02, &tj, &tj1, &result, _state); wsr_wcheb(x, -2.579892e-02, &tj, &tj1, 
&result, _state); wsr_wcheb(x, -1.086876e-02, &tj, &tj1, &result, _state); wsr_wcheb(x, -2.906543e-03, &tj, &tj1, &result, _state); wsr_wcheb(x, -2.354881e-03, &tj, &tj1, &result, _state); wsr_wcheb(x, 1.007195e-04, &tj, &tj1, &result, _state); wsr_wcheb(x, -8.437327e-04, &tj, &tj1, &result, _state); return result; } /************************************************************************* Tail(S, 26) *************************************************************************/ static double wsr_w26(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); x = ae_minreal(2*(s-0.000000e+00)/4.000000e+00-1, 1.0, _state); tj = (double)(1); tj1 = x; wsr_wcheb(x, -5.117622e+00, &tj, &tj1, &result, _state); wsr_wcheb(x, -5.635159e+00, &tj, &tj1, &result, _state); wsr_wcheb(x, -1.395167e+00, &tj, &tj1, &result, _state); wsr_wcheb(x, -2.382823e-01, &tj, &tj1, &result, _state); wsr_wcheb(x, -6.531987e-02, &tj, &tj1, &result, _state); wsr_wcheb(x, -2.060112e-02, &tj, &tj1, &result, _state); wsr_wcheb(x, -8.203697e-03, &tj, &tj1, &result, _state); wsr_wcheb(x, -1.516523e-03, &tj, &tj1, &result, _state); wsr_wcheb(x, -1.431364e-03, &tj, &tj1, &result, _state); wsr_wcheb(x, 6.384553e-04, &tj, &tj1, &result, _state); wsr_wcheb(x, -3.238369e-04, &tj, &tj1, &result, _state); return result; } /************************************************************************* Tail(S, 27) *************************************************************************/ static double wsr_w27(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); x = ae_minreal(2*(s-0.000000e+00)/4.000000e+00-1, 1.0, _state); tj = (double)(1); tj1 = x; wsr_wcheb(x, -5.089731e+00, &tj, &tj1, &result, _state); wsr_wcheb(x, -5.584248e+00, &tj, &tj1, &result, _state); wsr_wcheb(x, -1.359966e+00, &tj, &tj1, &result, _state); wsr_wcheb(x, -2.203696e-01, &tj, &tj1, &result, _state); wsr_wcheb(x, -5.753344e-02, &tj, &tj1, &result, _state); wsr_wcheb(x, -1.761891e-02, &tj, &tj1, &result, _state); wsr_wcheb(x, -7.096897e-03, &tj, &tj1, &result, _state); wsr_wcheb(x, -1.419108e-03, &tj, &tj1, &result, _state); wsr_wcheb(x, -1.581214e-03, &tj, &tj1, &result, _state); wsr_wcheb(x, 3.033766e-04, &tj, &tj1, &result, _state); wsr_wcheb(x, -5.901441e-04, &tj, &tj1, &result, _state); return result; } /************************************************************************* Tail(S, 28) *************************************************************************/ static double wsr_w28(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); x = ae_minreal(2*(s-0.000000e+00)/4.000000e+00-1, 1.0, _state); tj = (double)(1); tj1 = x; wsr_wcheb(x, -5.065046e+00, &tj, &tj1, &result, _state); wsr_wcheb(x, -5.539163e+00, &tj, &tj1, &result, _state); wsr_wcheb(x, -1.328939e+00, &tj, &tj1, &result, _state); wsr_wcheb(x, -2.046376e-01, &tj, &tj1, &result, _state); wsr_wcheb(x, -5.061515e-02, &tj, &tj1, &result, _state); wsr_wcheb(x, -1.469271e-02, &tj, &tj1, &result, _state); wsr_wcheb(x, -5.711578e-03, &tj, &tj1, &result, _state); wsr_wcheb(x, -8.389153e-04, &tj, &tj1, &result, _state); wsr_wcheb(x, -1.250575e-03, &tj, &tj1, &result, _state); wsr_wcheb(x, 4.047245e-04, &tj, &tj1, &result, _state); wsr_wcheb(x, -5.128555e-04, &tj, &tj1, &result, _state); return result; } /************************************************************************* Tail(S, 29) *************************************************************************/ static double 
wsr_w29(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); x = ae_minreal(2*(s-0.000000e+00)/4.000000e+00-1, 1.0, _state); tj = (double)(1); tj1 = x; wsr_wcheb(x, -5.043413e+00, &tj, &tj1, &result, _state); wsr_wcheb(x, -5.499756e+00, &tj, &tj1, &result, _state); wsr_wcheb(x, -1.302137e+00, &tj, &tj1, &result, _state); wsr_wcheb(x, -1.915129e-01, &tj, &tj1, &result, _state); wsr_wcheb(x, -4.516329e-02, &tj, &tj1, &result, _state); wsr_wcheb(x, -1.260064e-02, &tj, &tj1, &result, _state); wsr_wcheb(x, -4.817269e-03, &tj, &tj1, &result, _state); wsr_wcheb(x, -5.478130e-04, &tj, &tj1, &result, _state); wsr_wcheb(x, -1.111668e-03, &tj, &tj1, &result, _state); wsr_wcheb(x, 4.093451e-04, &tj, &tj1, &result, _state); wsr_wcheb(x, -5.135860e-04, &tj, &tj1, &result, _state); return result; } /************************************************************************* Tail(S, 30) *************************************************************************/ static double wsr_w30(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); x = ae_minreal(2*(s-0.000000e+00)/4.000000e+00-1, 1.0, _state); tj = (double)(1); tj1 = x; wsr_wcheb(x, -5.024071e+00, &tj, &tj1, &result, _state); wsr_wcheb(x, -5.464515e+00, &tj, &tj1, &result, _state); wsr_wcheb(x, -1.278342e+00, &tj, &tj1, &result, _state); wsr_wcheb(x, -1.800030e-01, &tj, &tj1, &result, _state); wsr_wcheb(x, -4.046294e-02, &tj, &tj1, &result, _state); wsr_wcheb(x, -1.076162e-02, &tj, &tj1, &result, _state); wsr_wcheb(x, -3.968677e-03, &tj, &tj1, &result, _state); wsr_wcheb(x, -1.911679e-04, &tj, &tj1, &result, _state); wsr_wcheb(x, -8.619185e-04, &tj, &tj1, &result, _state); wsr_wcheb(x, 5.125362e-04, &tj, &tj1, &result, _state); wsr_wcheb(x, -3.984370e-04, &tj, &tj1, &result, _state); return result; } /************************************************************************* Tail(S, 40) *************************************************************************/ static double wsr_w40(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); x = ae_minreal(2*(s-0.000000e+00)/4.000000e+00-1, 1.0, _state); tj = (double)(1); tj1 = x; wsr_wcheb(x, -4.904809e+00, &tj, &tj1, &result, _state); wsr_wcheb(x, -5.248327e+00, &tj, &tj1, &result, _state); wsr_wcheb(x, -1.136698e+00, &tj, &tj1, &result, _state); wsr_wcheb(x, -1.170982e-01, &tj, &tj1, &result, _state); wsr_wcheb(x, -1.824427e-02, &tj, &tj1, &result, _state); wsr_wcheb(x, -3.888648e-03, &tj, &tj1, &result, _state); wsr_wcheb(x, -1.344929e-03, &tj, &tj1, &result, _state); wsr_wcheb(x, 2.790407e-04, &tj, &tj1, &result, _state); wsr_wcheb(x, -4.619858e-04, &tj, &tj1, &result, _state); wsr_wcheb(x, 3.359121e-04, &tj, &tj1, &result, _state); wsr_wcheb(x, -2.883026e-04, &tj, &tj1, &result, _state); return result; } /************************************************************************* Tail(S, 60) *************************************************************************/ static double wsr_w60(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); x = ae_minreal(2*(s-0.000000e+00)/4.000000e+00-1, 1.0, _state); tj = (double)(1); tj1 = x; wsr_wcheb(x, -4.809656e+00, &tj, &tj1, &result, _state); wsr_wcheb(x, -5.077191e+00, &tj, &tj1, &result, _state); wsr_wcheb(x, -1.029402e+00, &tj, &tj1, &result, _state); wsr_wcheb(x, -7.507931e-02, &tj, &tj1, &result, _state); wsr_wcheb(x, -6.506226e-03, &tj, &tj1, &result, _state); wsr_wcheb(x, 
-1.391278e-03, &tj, &tj1, &result, _state); wsr_wcheb(x, -4.263635e-04, &tj, &tj1, &result, _state); wsr_wcheb(x, 2.302271e-04, &tj, &tj1, &result, _state); wsr_wcheb(x, -2.384348e-04, &tj, &tj1, &result, _state); wsr_wcheb(x, 1.865587e-04, &tj, &tj1, &result, _state); wsr_wcheb(x, -1.622355e-04, &tj, &tj1, &result, _state); return result; } /************************************************************************* Tail(S, 120) *************************************************************************/ static double wsr_w120(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); x = ae_minreal(2*(s-0.000000e+00)/4.000000e+00-1, 1.0, _state); tj = (double)(1); tj1 = x; wsr_wcheb(x, -4.729426e+00, &tj, &tj1, &result, _state); wsr_wcheb(x, -4.934426e+00, &tj, &tj1, &result, _state); wsr_wcheb(x, -9.433231e-01, &tj, &tj1, &result, _state); wsr_wcheb(x, -4.492504e-02, &tj, &tj1, &result, _state); wsr_wcheb(x, 1.673948e-05, &tj, &tj1, &result, _state); wsr_wcheb(x, -6.077014e-04, &tj, &tj1, &result, _state); wsr_wcheb(x, -7.215768e-05, &tj, &tj1, &result, _state); wsr_wcheb(x, 9.086734e-05, &tj, &tj1, &result, _state); wsr_wcheb(x, -8.447980e-05, &tj, &tj1, &result, _state); wsr_wcheb(x, 6.705028e-05, &tj, &tj1, &result, _state); wsr_wcheb(x, -5.828507e-05, &tj, &tj1, &result, _state); return result; } /************************************************************************* Tail(S, 200) *************************************************************************/ static double wsr_w200(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); x = ae_minreal(2*(s-0.000000e+00)/4.000000e+00-1, 1.0, _state); tj = (double)(1); tj1 = x; wsr_wcheb(x, -4.700240e+00, &tj, &tj1, &result, _state); wsr_wcheb(x, -4.883080e+00, &tj, &tj1, &result, _state); wsr_wcheb(x, -9.132168e-01, &tj, &tj1, &result, _state); wsr_wcheb(x, -3.512684e-02, &tj, &tj1, &result, _state); wsr_wcheb(x, 1.726342e-03, &tj, &tj1, &result, _state); wsr_wcheb(x, -5.189796e-04, &tj, &tj1, &result, _state); wsr_wcheb(x, -1.628659e-06, &tj, &tj1, &result, _state); wsr_wcheb(x, 4.261786e-05, &tj, &tj1, &result, _state); wsr_wcheb(x, -4.002498e-05, &tj, &tj1, &result, _state); wsr_wcheb(x, 3.146287e-05, &tj, &tj1, &result, _state); wsr_wcheb(x, -2.727576e-05, &tj, &tj1, &result, _state); return result; } /************************************************************************* Tail(S,N), S>=0 *************************************************************************/ static double wsr_wsigma(double s, ae_int_t n, ae_state *_state) { double f0; double f1; double f2; double f3; double f4; double x0; double x1; double x2; double x3; double x4; double x; double result; result = (double)(0); if( n==5 ) { result = wsr_w5(s, _state); } if( n==6 ) { result = wsr_w6(s, _state); } if( n==7 ) { result = wsr_w7(s, _state); } if( n==8 ) { result = wsr_w8(s, _state); } if( n==9 ) { result = wsr_w9(s, _state); } if( n==10 ) { result = wsr_w10(s, _state); } if( n==11 ) { result = wsr_w11(s, _state); } if( n==12 ) { result = wsr_w12(s, _state); } if( n==13 ) { result = wsr_w13(s, _state); } if( n==14 ) { result = wsr_w14(s, _state); } if( n==15 ) { result = wsr_w15(s, _state); } if( n==16 ) { result = wsr_w16(s, _state); } if( n==17 ) { result = wsr_w17(s, _state); } if( n==18 ) { result = wsr_w18(s, _state); } if( n==19 ) { result = wsr_w19(s, _state); } if( n==20 ) { result = wsr_w20(s, _state); } if( n==21 ) { result = wsr_w21(s, _state); } if( n==22 ) { result = 
wsr_w22(s, _state); } if( n==23 ) { result = wsr_w23(s, _state); } if( n==24 ) { result = wsr_w24(s, _state); } if( n==25 ) { result = wsr_w25(s, _state); } if( n==26 ) { result = wsr_w26(s, _state); } if( n==27 ) { result = wsr_w27(s, _state); } if( n==28 ) { result = wsr_w28(s, _state); } if( n==29 ) { result = wsr_w29(s, _state); } if( n==30 ) { result = wsr_w30(s, _state); } if( n>30 ) { x = 1.0/n; x0 = 1.0/30; f0 = wsr_w30(s, _state); x1 = 1.0/40; f1 = wsr_w40(s, _state); x2 = 1.0/60; f2 = wsr_w60(s, _state); x3 = 1.0/120; f3 = wsr_w120(s, _state); x4 = 1.0/200; f4 = wsr_w200(s, _state); f1 = ((x-x0)*f1-(x-x1)*f0)/(x1-x0); f2 = ((x-x0)*f2-(x-x2)*f0)/(x2-x0); f3 = ((x-x0)*f3-(x-x3)*f0)/(x3-x0); f4 = ((x-x0)*f4-(x-x4)*f0)/(x4-x0); f2 = ((x-x1)*f2-(x-x2)*f1)/(x2-x1); f3 = ((x-x1)*f3-(x-x3)*f1)/(x3-x1); f4 = ((x-x1)*f4-(x-x4)*f1)/(x4-x1); f3 = ((x-x2)*f3-(x-x3)*f2)/(x3-x2); f4 = ((x-x2)*f4-(x-x4)*f2)/(x4-x2); f4 = ((x-x3)*f4-(x-x4)*f3)/(x4-x3); result = f4; } return result; } /************************************************************************* Sign test This test checks three hypotheses about the median of the given sample. The following tests are performed: * two-tailed test (null hypothesis - the median is equal to the given value) * left-tailed test (null hypothesis - the median is greater than or equal to the given value) * right-tailed test (null hypothesis - the median is less than or equal to the given value) Requirements: * the scale of measurement should be ordinal, interval or ratio (i.e. the test could not be applied to nominal variables). The test is non-parametric and doesn't require distribution X to be normal Input parameters: X - sample. Array whose index goes from 0 to N-1. N - size of the sample. Median - assumed median value. Output parameters: BothTails - p-value for two-tailed test. If BothTails is less than the given significance level the null hypothesis is rejected. LeftTail - p-value for left-tailed test. If LeftTail is less than the given significance level, the null hypothesis is rejected. RightTail - p-value for right-tailed test. If RightTail is less than the given significance level the null hypothesis is rejected. While calculating p-values high-precision binomial distribution approximation is used, so significance levels have about 15 exact digits. -- ALGLIB -- Copyright 08.09.2006 by Bochkanov Sergey *************************************************************************/ void onesamplesigntest(/* Real */ ae_vector* x, ae_int_t n, double median, double* bothtails, double* lefttail, double* righttail, ae_state *_state) { ae_int_t i; ae_int_t gtcnt; ae_int_t necnt; *bothtails = 0; *lefttail = 0; *righttail = 0; if( n<=1 ) { *bothtails = 1.0; *lefttail = 1.0; *righttail = 1.0; return; } /* * Calculate: * GTCnt - count of x[i]>Median * NECnt - count of x[i]<>Median */ gtcnt = 0; necnt = 0; for(i=0; i<=n-1; i++) { if( ae_fp_greater(x->ptr.p_double[i],median) ) { gtcnt = gtcnt+1; } if( ae_fp_neq(x->ptr.p_double[i],median) ) { necnt = necnt+1; } } if( necnt==0 ) { /* * all x[i] are equal to Median. 
* So we can conclude that Median is a true median :) */ *bothtails = 1.0; *lefttail = 1.0; *righttail = 1.0; return; } *bothtails = ae_minreal(2*binomialdistribution(ae_minint(gtcnt, necnt-gtcnt, _state), necnt, 0.5, _state), 1.0, _state); *lefttail = binomialdistribution(gtcnt, necnt, 0.5, _state); *righttail = binomialcdistribution(gtcnt-1, necnt, 0.5, _state); } /************************************************************************* Pearson's correlation coefficient significance test This test checks hypotheses about whether X and Y are samples of two continuous distributions having zero correlation or whether their correlation is non-zero. The following tests are performed: * two-tailed test (null hypothesis - X and Y have zero correlation) * left-tailed test (null hypothesis - the correlation coefficient is greater than or equal to 0) * right-tailed test (null hypothesis - the correlation coefficient is less than or equal to 0). Requirements: * the number of elements in each sample is not less than 5 * normality of distributions of X and Y. Input parameters: R - Pearson's correlation coefficient for X and Y N - number of elements in samples, N>=5. Output parameters: BothTails - p-value for two-tailed test. If BothTails is less than the given significance level the null hypothesis is rejected. LeftTail - p-value for left-tailed test. If LeftTail is less than the given significance level, the null hypothesis is rejected. RightTail - p-value for right-tailed test. If RightTail is less than the given significance level the null hypothesis is rejected. -- ALGLIB -- Copyright 09.04.2007 by Bochkanov Sergey *************************************************************************/ void pearsoncorrelationsignificance(double r, ae_int_t n, double* bothtails, double* lefttail, double* righttail, ae_state *_state) { double t; double p; *bothtails = 0; *lefttail = 0; *righttail = 0; /* * Some special cases */ if( ae_fp_greater_eq(r,(double)(1)) ) { *bothtails = 0.0; *lefttail = 1.0; *righttail = 0.0; return; } if( ae_fp_less_eq(r,(double)(-1)) ) { *bothtails = 0.0; *lefttail = 0.0; *righttail = 1.0; return; } if( n<5 ) { *bothtails = 1.0; *lefttail = 1.0; *righttail = 1.0; return; } /* * General case */ t = r*ae_sqrt((n-2)/(1-ae_sqr(r, _state)), _state); p = studenttdistribution(n-2, t, _state); *bothtails = 2*ae_minreal(p, 1-p, _state); *lefttail = p; *righttail = 1-p; } /************************************************************************* Spearman's rank correlation coefficient significance test This test checks hypotheses about whether X and Y are samples of two continuous distributions having zero correlation or whether their correlation is non-zero. The following tests are performed: * two-tailed test (null hypothesis - X and Y have zero correlation) * left-tailed test (null hypothesis - the correlation coefficient is greater than or equal to 0) * right-tailed test (null hypothesis - the correlation coefficient is less than or equal to 0). Requirements: * the number of elements in each sample is not less than 5. The test is non-parametric and doesn't require distributions X and Y to be normal. Input parameters: R - Spearman's rank correlation coefficient for X and Y N - number of elements in samples, N>=5. Output parameters: BothTails - p-value for two-tailed test. If BothTails is less than the given significance level the null hypothesis is rejected. LeftTail - p-value for left-tailed test. If LeftTail is less than the given significance level, the null hypothesis is rejected. 
RightTail - p-value for right-tailed test. If RightTail is less than the given significance level the null hypothesis is rejected. -- ALGLIB -- Copyright 09.04.2007 by Bochkanov Sergey *************************************************************************/ void spearmanrankcorrelationsignificance(double r, ae_int_t n, double* bothtails, double* lefttail, double* righttail, ae_state *_state) { double t; double p; *bothtails = 0; *lefttail = 0; *righttail = 0; /* * Special case */ if( n<5 ) { *bothtails = 1.0; *lefttail = 1.0; *righttail = 1.0; return; } /* * General case */ if( ae_fp_greater_eq(r,(double)(1)) ) { t = 1.0E10; } else { if( ae_fp_less_eq(r,(double)(-1)) ) { t = -1.0E10; } else { t = r*ae_sqrt((n-2)/(1-ae_sqr(r, _state)), _state); } } if( ae_fp_less(t,(double)(0)) ) { p = correlationtests_spearmantail(t, n, _state); *bothtails = 2*p; *lefttail = p; *righttail = 1-p; } else { p = correlationtests_spearmantail(-t, n, _state); *bothtails = 2*p; *lefttail = 1-p; *righttail = p; } } /************************************************************************* Tail(S, 5) *************************************************************************/ static double correlationtests_spearmantail5(double s, ae_state *_state) { double result; if( ae_fp_less(s,0.000e+00) ) { result = studenttdistribution(3, -s, _state); return result; } if( ae_fp_greater_eq(s,3.580e+00) ) { result = 8.304e-03; return result; } if( ae_fp_greater_eq(s,2.322e+00) ) { result = 4.163e-02; return result; } if( ae_fp_greater_eq(s,1.704e+00) ) { result = 6.641e-02; return result; } if( ae_fp_greater_eq(s,1.303e+00) ) { result = 1.164e-01; return result; } if( ae_fp_greater_eq(s,1.003e+00) ) { result = 1.748e-01; return result; } if( ae_fp_greater_eq(s,7.584e-01) ) { result = 2.249e-01; return result; } if( ae_fp_greater_eq(s,5.468e-01) ) { result = 2.581e-01; return result; } if( ae_fp_greater_eq(s,3.555e-01) ) { result = 3.413e-01; return result; } if( ae_fp_greater_eq(s,1.759e-01) ) { result = 3.911e-01; return result; } if( ae_fp_greater_eq(s,1.741e-03) ) { result = 4.747e-01; return result; } if( ae_fp_greater_eq(s,0.000e+00) ) { result = 5.248e-01; return result; } result = (double)(0); return result; } /************************************************************************* Tail(S, 6) *************************************************************************/ static double correlationtests_spearmantail6(double s, ae_state *_state) { double result; if( ae_fp_less(s,1.001e+00) ) { result = studenttdistribution(4, -s, _state); return result; } if( ae_fp_greater_eq(s,5.663e+00) ) { result = 1.366e-03; return result; } if( ae_fp_greater_eq(s,3.834e+00) ) { result = 8.350e-03; return result; } if( ae_fp_greater_eq(s,2.968e+00) ) { result = 1.668e-02; return result; } if( ae_fp_greater_eq(s,2.430e+00) ) { result = 2.921e-02; return result; } if( ae_fp_greater_eq(s,2.045e+00) ) { result = 5.144e-02; return result; } if( ae_fp_greater_eq(s,1.747e+00) ) { result = 6.797e-02; return result; } if( ae_fp_greater_eq(s,1.502e+00) ) { result = 8.752e-02; return result; } if( ae_fp_greater_eq(s,1.295e+00) ) { result = 1.210e-01; return result; } if( ae_fp_greater_eq(s,1.113e+00) ) { result = 1.487e-01; return result; } if( ae_fp_greater_eq(s,1.001e+00) ) { result = 1.780e-01; return result; } result = (double)(0); return result; } /************************************************************************* Tail(S, 7) *************************************************************************/ static double 
correlationtests_spearmantail7(double s, ae_state *_state) { double result; if( ae_fp_less(s,1.001e+00) ) { result = studenttdistribution(5, -s, _state); return result; } if( ae_fp_greater_eq(s,8.159e+00) ) { result = 2.081e-04; return result; } if( ae_fp_greater_eq(s,5.620e+00) ) { result = 1.393e-03; return result; } if( ae_fp_greater_eq(s,4.445e+00) ) { result = 3.398e-03; return result; } if( ae_fp_greater_eq(s,3.728e+00) ) { result = 6.187e-03; return result; } if( ae_fp_greater_eq(s,3.226e+00) ) { result = 1.200e-02; return result; } if( ae_fp_greater_eq(s,2.844e+00) ) { result = 1.712e-02; return result; } if( ae_fp_greater_eq(s,2.539e+00) ) { result = 2.408e-02; return result; } if( ae_fp_greater_eq(s,2.285e+00) ) { result = 3.320e-02; return result; } if( ae_fp_greater_eq(s,2.068e+00) ) { result = 4.406e-02; return result; } if( ae_fp_greater_eq(s,1.879e+00) ) { result = 5.478e-02; return result; } if( ae_fp_greater_eq(s,1.710e+00) ) { result = 6.946e-02; return result; } if( ae_fp_greater_eq(s,1.559e+00) ) { result = 8.331e-02; return result; } if( ae_fp_greater_eq(s,1.420e+00) ) { result = 1.001e-01; return result; } if( ae_fp_greater_eq(s,1.292e+00) ) { result = 1.180e-01; return result; } if( ae_fp_greater_eq(s,1.173e+00) ) { result = 1.335e-01; return result; } if( ae_fp_greater_eq(s,1.062e+00) ) { result = 1.513e-01; return result; } if( ae_fp_greater_eq(s,1.001e+00) ) { result = 1.770e-01; return result; } result = (double)(0); return result; } /************************************************************************* Tail(S, 8) *************************************************************************/ static double correlationtests_spearmantail8(double s, ae_state *_state) { double result; if( ae_fp_less(s,2.001e+00) ) { result = studenttdistribution(6, -s, _state); return result; } if( ae_fp_greater_eq(s,1.103e+01) ) { result = 2.194e-05; return result; } if( ae_fp_greater_eq(s,7.685e+00) ) { result = 2.008e-04; return result; } if( ae_fp_greater_eq(s,6.143e+00) ) { result = 5.686e-04; return result; } if( ae_fp_greater_eq(s,5.213e+00) ) { result = 1.138e-03; return result; } if( ae_fp_greater_eq(s,4.567e+00) ) { result = 2.310e-03; return result; } if( ae_fp_greater_eq(s,4.081e+00) ) { result = 3.634e-03; return result; } if( ae_fp_greater_eq(s,3.697e+00) ) { result = 5.369e-03; return result; } if( ae_fp_greater_eq(s,3.381e+00) ) { result = 7.708e-03; return result; } if( ae_fp_greater_eq(s,3.114e+00) ) { result = 1.087e-02; return result; } if( ae_fp_greater_eq(s,2.884e+00) ) { result = 1.397e-02; return result; } if( ae_fp_greater_eq(s,2.682e+00) ) { result = 1.838e-02; return result; } if( ae_fp_greater_eq(s,2.502e+00) ) { result = 2.288e-02; return result; } if( ae_fp_greater_eq(s,2.340e+00) ) { result = 2.883e-02; return result; } if( ae_fp_greater_eq(s,2.192e+00) ) { result = 3.469e-02; return result; } if( ae_fp_greater_eq(s,2.057e+00) ) { result = 4.144e-02; return result; } if( ae_fp_greater_eq(s,2.001e+00) ) { result = 4.804e-02; return result; } result = (double)(0); return result; } /************************************************************************* Tail(S, 9) *************************************************************************/ static double correlationtests_spearmantail9(double s, ae_state *_state) { double result; if( ae_fp_less(s,2.001e+00) ) { result = studenttdistribution(7, -s, _state); return result; } if( ae_fp_greater_eq(s,9.989e+00) ) { result = 2.306e-05; return result; } if( ae_fp_greater_eq(s,8.069e+00) ) { result = 8.167e-05; 
return result; } if( ae_fp_greater_eq(s,6.890e+00) ) { result = 1.744e-04; return result; } if( ae_fp_greater_eq(s,6.077e+00) ) { result = 3.625e-04; return result; } if( ae_fp_greater_eq(s,5.469e+00) ) { result = 6.450e-04; return result; } if( ae_fp_greater_eq(s,4.991e+00) ) { result = 1.001e-03; return result; } if( ae_fp_greater_eq(s,4.600e+00) ) { result = 1.514e-03; return result; } if( ae_fp_greater_eq(s,4.272e+00) ) { result = 2.213e-03; return result; } if( ae_fp_greater_eq(s,3.991e+00) ) { result = 2.990e-03; return result; } if( ae_fp_greater_eq(s,3.746e+00) ) { result = 4.101e-03; return result; } if( ae_fp_greater_eq(s,3.530e+00) ) { result = 5.355e-03; return result; } if( ae_fp_greater_eq(s,3.336e+00) ) { result = 6.887e-03; return result; } if( ae_fp_greater_eq(s,3.161e+00) ) { result = 8.598e-03; return result; } if( ae_fp_greater_eq(s,3.002e+00) ) { result = 1.065e-02; return result; } if( ae_fp_greater_eq(s,2.855e+00) ) { result = 1.268e-02; return result; } if( ae_fp_greater_eq(s,2.720e+00) ) { result = 1.552e-02; return result; } if( ae_fp_greater_eq(s,2.595e+00) ) { result = 1.836e-02; return result; } if( ae_fp_greater_eq(s,2.477e+00) ) { result = 2.158e-02; return result; } if( ae_fp_greater_eq(s,2.368e+00) ) { result = 2.512e-02; return result; } if( ae_fp_greater_eq(s,2.264e+00) ) { result = 2.942e-02; return result; } if( ae_fp_greater_eq(s,2.166e+00) ) { result = 3.325e-02; return result; } if( ae_fp_greater_eq(s,2.073e+00) ) { result = 3.800e-02; return result; } if( ae_fp_greater_eq(s,2.001e+00) ) { result = 4.285e-02; return result; } result = (double)(0); return result; } /************************************************************************* Tail(T,N), accepts T<0 *************************************************************************/ static double correlationtests_spearmantail(double t, ae_int_t n, ae_state *_state) { double result; if( n==5 ) { result = correlationtests_spearmantail5(-t, _state); return result; } if( n==6 ) { result = correlationtests_spearmantail6(-t, _state); return result; } if( n==7 ) { result = correlationtests_spearmantail7(-t, _state); return result; } if( n==8 ) { result = correlationtests_spearmantail8(-t, _state); return result; } if( n==9 ) { result = correlationtests_spearmantail9(-t, _state); return result; } result = studenttdistribution(n-2, t, _state); return result; } /************************************************************************* One-sample t-test This test checks three hypotheses about the mean of the given sample. The following tests are performed: * two-tailed test (null hypothesis - the mean is equal to the given value) * left-tailed test (null hypothesis - the mean is greater than or equal to the given value) * right-tailed test (null hypothesis - the mean is less than or equal to the given value). The test is based on the assumption that a given sample has a normal distribution and an unknown dispersion. If the distribution sharply differs from normal, the test will work incorrectly. INPUT PARAMETERS: X - sample. Array whose index goes from 0 to N-1. N - size of sample, N>=0 Mean - assumed value of the mean. OUTPUT PARAMETERS: BothTails - p-value for two-tailed test. If BothTails is less than the given significance level the null hypothesis is rejected. LeftTail - p-value for left-tailed test. If LeftTail is less than the given significance level, the null hypothesis is rejected. RightTail - p-value for right-tailed test. 
If RightTail is less than the given significance level the null hypothesis is rejected. NOTE: this function correctly handles degenerate cases: * when N=0, all p-values are set to 1.0 * when variance of X[] is exactly zero, p-values are set to 1.0 or 0.0, depending on difference between sample mean and value of mean being tested. -- ALGLIB -- Copyright 08.09.2006 by Bochkanov Sergey *************************************************************************/ void studentttest1(/* Real */ ae_vector* x, ae_int_t n, double mean, double* bothtails, double* lefttail, double* righttail, ae_state *_state) { ae_int_t i; double xmean; double x0; double v; ae_bool samex; double xvariance; double xstddev; double v1; double v2; double stat; double s; *bothtails = 0; *lefttail = 0; *righttail = 0; if( n<=0 ) { *bothtails = 1.0; *lefttail = 1.0; *righttail = 1.0; return; } /* * Mean */ xmean = (double)(0); x0 = x->ptr.p_double[0]; samex = ae_true; for(i=0; i<=n-1; i++) { v = x->ptr.p_double[i]; xmean = xmean+v; samex = samex&&ae_fp_eq(v,x0); } if( samex ) { xmean = x0; } else { xmean = xmean/n; } /* * Variance (using corrected two-pass algorithm) */ xvariance = (double)(0); xstddev = (double)(0); if( n!=1&&!samex ) { v1 = (double)(0); for(i=0; i<=n-1; i++) { v1 = v1+ae_sqr(x->ptr.p_double[i]-xmean, _state); } v2 = (double)(0); for(i=0; i<=n-1; i++) { v2 = v2+(x->ptr.p_double[i]-xmean); } v2 = ae_sqr(v2, _state)/n; xvariance = (v1-v2)/(n-1); if( ae_fp_less(xvariance,(double)(0)) ) { xvariance = (double)(0); } xstddev = ae_sqrt(xvariance, _state); } if( ae_fp_eq(xstddev,(double)(0)) ) { if( ae_fp_eq(xmean,mean) ) { *bothtails = 1.0; } else { *bothtails = 0.0; } if( ae_fp_greater_eq(xmean,mean) ) { *lefttail = 1.0; } else { *lefttail = 0.0; } if( ae_fp_less_eq(xmean,mean) ) { *righttail = 1.0; } else { *righttail = 0.0; } return; } /* * Statistic */ stat = (xmean-mean)/(xstddev/ae_sqrt((double)(n), _state)); s = studenttdistribution(n-1, stat, _state); *bothtails = 2*ae_minreal(s, 1-s, _state); *lefttail = s; *righttail = 1-s; } /************************************************************************* Two-sample pooled test This test checks three hypotheses about the mean of the given samples. The following tests are performed: * two-tailed test (null hypothesis - the means are equal) * left-tailed test (null hypothesis - the mean of the first sample is greater than or equal to the mean of the second sample) * right-tailed test (null hypothesis - the mean of the first sample is less than or equal to the mean of the second sample). Test is based on the following assumptions: * given samples have normal distributions * dispersions are equal * samples are independent. Input parameters: X - sample 1. Array whose index goes from 0 to N-1. N - size of sample. Y - sample 2. Array whose index goes from 0 to M-1. M - size of sample. Output parameters: BothTails - p-value for two-tailed test. If BothTails is less than the given significance level the null hypothesis is rejected. LeftTail - p-value for left-tailed test. If LeftTail is less than the given significance level, the null hypothesis is rejected. RightTail - p-value for right-tailed test. If RightTail is less than the given significance level the null hypothesis is rejected. NOTE: this function correctly handles degenerate cases: * when N=0 or M=0, all p-values are set to 1.0 * when both samples has exactly zero variance, p-values are set to 1.0 or 0.0, depending on difference between means. 
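A minimal usage sketch (hypothetical data; assumes the ALGLIB C++ interface - statistics.h,
alglib::real_1d_array and the studentttest2() wrapper - described elsewhere in this manual):

    #include <cstdio>
    #include "statistics.h"
    using namespace alglib;

    int main()
    {
        // two small samples, equal-variance (pooled) assumption
        real_1d_array x = "[1.1, 0.9, 1.3, 0.8, 1.2]";
        real_1d_array y = "[1.6, 1.4, 1.5, 1.7, 1.3]";
        double bothtails, lefttail, righttail;
        studentttest2(x, x.length(), y, y.length(), bothtails, lefttail, righttail);
        // reject H0 of equal means when bothtails is below the chosen level, e.g. 0.05
        printf("%.4f %.4f %.4f\n", bothtails, lefttail, righttail);
        // the one-sample test documented above follows the same calling pattern:
        // studentttest1(x, x.length(), 1.0, bothtails, lefttail, righttail);
        return 0;
    }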
-- ALGLIB -- Copyright 18.09.2006 by Bochkanov Sergey *************************************************************************/ void studentttest2(/* Real */ ae_vector* x, ae_int_t n, /* Real */ ae_vector* y, ae_int_t m, double* bothtails, double* lefttail, double* righttail, ae_state *_state) { ae_int_t i; ae_bool samex; ae_bool samey; double x0; double y0; double xmean; double ymean; double v; double stat; double s; double p; *bothtails = 0; *lefttail = 0; *righttail = 0; if( n<=0||m<=0 ) { *bothtails = 1.0; *lefttail = 1.0; *righttail = 1.0; return; } /* * Mean */ xmean = (double)(0); x0 = x->ptr.p_double[0]; samex = ae_true; for(i=0; i<=n-1; i++) { v = x->ptr.p_double[i]; xmean = xmean+v; samex = samex&&ae_fp_eq(v,x0); } if( samex ) { xmean = x0; } else { xmean = xmean/n; } ymean = (double)(0); y0 = y->ptr.p_double[0]; samey = ae_true; for(i=0; i<=m-1; i++) { v = y->ptr.p_double[i]; ymean = ymean+v; samey = samey&&ae_fp_eq(v,y0); } if( samey ) { ymean = y0; } else { ymean = ymean/m; } /* * S */ s = (double)(0); if( n+m>2 ) { for(i=0; i<=n-1; i++) { s = s+ae_sqr(x->ptr.p_double[i]-xmean, _state); } for(i=0; i<=m-1; i++) { s = s+ae_sqr(y->ptr.p_double[i]-ymean, _state); } s = ae_sqrt(s*((double)1/(double)n+(double)1/(double)m)/(n+m-2), _state); } if( ae_fp_eq(s,(double)(0)) ) { if( ae_fp_eq(xmean,ymean) ) { *bothtails = 1.0; } else { *bothtails = 0.0; } if( ae_fp_greater_eq(xmean,ymean) ) { *lefttail = 1.0; } else { *lefttail = 0.0; } if( ae_fp_less_eq(xmean,ymean) ) { *righttail = 1.0; } else { *righttail = 0.0; } return; } /* * Statistic */ stat = (xmean-ymean)/s; p = studenttdistribution(n+m-2, stat, _state); *bothtails = 2*ae_minreal(p, 1-p, _state); *lefttail = p; *righttail = 1-p; } /************************************************************************* Two-sample unpooled test This test checks three hypotheses about the mean of the given samples. The following tests are performed: * two-tailed test (null hypothesis - the means are equal) * left-tailed test (null hypothesis - the mean of the first sample is greater than or equal to the mean of the second sample) * right-tailed test (null hypothesis - the mean of the first sample is less than or equal to the mean of the second sample). Test is based on the following assumptions: * given samples have normal distributions * samples are independent. Equality of variances is NOT required. Input parameters: X - sample 1. Array whose index goes from 0 to N-1. N - size of the sample. Y - sample 2. Array whose index goes from 0 to M-1. M - size of the sample. Output parameters: BothTails - p-value for two-tailed test. If BothTails is less than the given significance level the null hypothesis is rejected. LeftTail - p-value for left-tailed test. If LeftTail is less than the given significance level, the null hypothesis is rejected. RightTail - p-value for right-tailed test. If RightTail is less than the given significance level the null hypothesis is rejected. NOTE: this function correctly handles degenerate cases: * when N=0 or M=0, all p-values are set to 1.0 * when both samples has zero variance, p-values are set to 1.0 or 0.0, depending on difference between means. * when only one sample has zero variance, test reduces to 1-sample version. 
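A minimal usage sketch, under the same assumptions as the studentttest2() example above
(the snippet belongs inside main(), after the same includes):

    // unpooled (Welch-type) test: the two samples may have different variances
    alglib::real_1d_array x = "[10.2, 9.8, 10.5, 10.1, 9.9, 10.3]";
    alglib::real_1d_array y = "[12.0, 8.1, 11.5, 7.9, 12.2, 8.4]";
    double bt, lt, rt;
    alglib::unequalvariancettest(x, x.length(), y, y.length(), bt, lt, rt);
    // bt < 0.05 would reject the null hypothesis of equal means at the 5% level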
-- ALGLIB -- Copyright 18.09.2006 by Bochkanov Sergey *************************************************************************/ void unequalvariancettest(/* Real */ ae_vector* x, ae_int_t n, /* Real */ ae_vector* y, ae_int_t m, double* bothtails, double* lefttail, double* righttail, ae_state *_state) { ae_int_t i; ae_bool samex; ae_bool samey; double x0; double y0; double xmean; double ymean; double xvar; double yvar; double v; double df; double p; double stat; double c; *bothtails = 0; *lefttail = 0; *righttail = 0; if( n<=0||m<=0 ) { *bothtails = 1.0; *lefttail = 1.0; *righttail = 1.0; return; } /* * Mean */ xmean = (double)(0); x0 = x->ptr.p_double[0]; samex = ae_true; for(i=0; i<=n-1; i++) { v = x->ptr.p_double[i]; xmean = xmean+v; samex = samex&&ae_fp_eq(v,x0); } if( samex ) { xmean = x0; } else { xmean = xmean/n; } ymean = (double)(0); y0 = y->ptr.p_double[0]; samey = ae_true; for(i=0; i<=m-1; i++) { v = y->ptr.p_double[i]; ymean = ymean+v; samey = samey&&ae_fp_eq(v,y0); } if( samey ) { ymean = y0; } else { ymean = ymean/m; } /* * Variance (using corrected two-pass algorithm) */ xvar = (double)(0); if( n>=2&&!samex ) { for(i=0; i<=n-1; i++) { xvar = xvar+ae_sqr(x->ptr.p_double[i]-xmean, _state); } xvar = xvar/(n-1); } yvar = (double)(0); if( m>=2&&!samey ) { for(i=0; i<=m-1; i++) { yvar = yvar+ae_sqr(y->ptr.p_double[i]-ymean, _state); } yvar = yvar/(m-1); } /* * Handle different special cases * (one or both variances are zero). */ if( ae_fp_eq(xvar,(double)(0))&&ae_fp_eq(yvar,(double)(0)) ) { if( ae_fp_eq(xmean,ymean) ) { *bothtails = 1.0; } else { *bothtails = 0.0; } if( ae_fp_greater_eq(xmean,ymean) ) { *lefttail = 1.0; } else { *lefttail = 0.0; } if( ae_fp_less_eq(xmean,ymean) ) { *righttail = 1.0; } else { *righttail = 0.0; } return; } if( ae_fp_eq(xvar,(double)(0)) ) { /* * X is constant, unpooled 2-sample test reduces to 1-sample test. * * NOTE: right-tail and left-tail must be passed to 1-sample * t-test in reverse order because we reverse order of * of samples. */ studentttest1(y, m, xmean, bothtails, righttail, lefttail, _state); return; } if( ae_fp_eq(yvar,(double)(0)) ) { /* * Y is constant, unpooled 2-sample test reduces to 1-sample test. */ studentttest1(x, n, ymean, bothtails, lefttail, righttail, _state); return; } /* * Statistic */ stat = (xmean-ymean)/ae_sqrt(xvar/n+yvar/m, _state); c = xvar/n/(xvar/n+yvar/m); df = rmul2((double)(n-1), (double)(m-1), _state)/((m-1)*ae_sqr(c, _state)+(n-1)*ae_sqr(1-c, _state)); if( ae_fp_greater(stat,(double)(0)) ) { p = 1-0.5*incompletebeta(df/2, 0.5, df/(df+ae_sqr(stat, _state)), _state); } else { p = 0.5*incompletebeta(df/2, 0.5, df/(df+ae_sqr(stat, _state)), _state); } *bothtails = 2*ae_minreal(p, 1-p, _state); *lefttail = p; *righttail = 1-p; } /************************************************************************* Mann-Whitney U-test This test checks hypotheses about whether X and Y are samples of two continuous distributions of the same shape and same median or whether their medians are different. The following tests are performed: * two-tailed test (null hypothesis - the medians are equal) * left-tailed test (null hypothesis - the median of the first sample is greater than or equal to the median of the second sample) * right-tailed test (null hypothesis - the median of the first sample is less than or equal to the median of the second sample). 
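The statistic used is the Mann-Whitney U computed from the pooled-sample ranks of the first
sample (see the implementation below): U = N*M + N*(N+1)/2 - R1, where R1 is the sum of the
tie-adjusted ranks of X in the pooled sample. U is then standardized by its null-hypothesis
mean N*M/2 and a tie-corrected variance before the tabulated tail approximation is applied.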
Requirements:
    * the samples are independent
    * X and Y are continuous distributions (or discrete distributions well-
      approximating continuous distributions)
    * distributions of X and Y have the same shape. The only possible
      difference is their position (i.e. the value of the median)
    * the number of elements in each sample is not less than 5
    * the scale of measurement should be ordinal, interval or ratio (i.e.
      the test could not be applied to nominal variables).

The test is non-parametric and doesn't require distributions to be normal.

Input parameters:
    X   -   sample 1. Array whose index goes from 0 to N-1.
    N   -   size of the sample. N>=5
    Y   -   sample 2. Array whose index goes from 0 to M-1.
    M   -   size of the sample. M>=5

Output parameters:
    BothTails   -   p-value for two-tailed test.
                    If BothTails is less than the given significance level
                    the null hypothesis is rejected.
    LeftTail    -   p-value for left-tailed test.
                    If LeftTail is less than the given significance level,
                    the null hypothesis is rejected.
    RightTail   -   p-value for right-tailed test.
                    If RightTail is less than the given significance level
                    the null hypothesis is rejected.

To calculate p-values, special approximation is used. This method lets us
calculate p-values with satisfactory accuracy in the interval [0.0001, 1].
There is no approximation outside the [0.0001, 1] interval. Therefore, if
the significance level lies outside this interval, the test returns 0.0001.

Relative precision of approximation of p-value:

    N          M          Max.err.   Rms.err.
    5..10      N..10      1.4e-02    6.0e-04
    5..10      N..100     2.2e-02    5.3e-06
    10..15     N..15      1.0e-02    3.2e-04
    10..15     N..100     1.0e-02    2.2e-05
    15..100    N..100     6.1e-03    2.7e-06

For N,M>100 accuracy checks weren't put into practice, but taking into
account characteristics of the asymptotic approximation used, precision
should not be sharply different from the values for the interval [5, 100].

NOTE: P-value approximation was optimized for 0.0001<=p<=0.2500. Thus, P's
outside of this interval are enforced to these bounds. Say, you may quite
often get P equal to exactly 0.25 or 0.0001.
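A minimal usage sketch (hypothetical data; same assumptions as the t-test examples earlier
in this file: statistics.h, alglib::real_1d_array, code placed inside main()):

    alglib::real_1d_array x = "[3.1, 2.7, 3.4, 2.9, 3.0, 3.3]";
    alglib::real_1d_array y = "[2.1, 2.4, 1.9, 2.2, 2.6]";
    double bt, lt, rt;
    alglib::mannwhitneyutest(x, x.length(), y, y.length(), bt, lt, rt);
    // one-sided p-values are clamped to [0.0001, 0.25] (see NOTE above), so bt never
    // exceeds 0.5, and lt or rt equal to 0.25 simply means "not significant"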
-- ALGLIB -- Copyright 09.04.2007 by Bochkanov Sergey *************************************************************************/ void mannwhitneyutest(/* Real */ ae_vector* x, ae_int_t n, /* Real */ ae_vector* y, ae_int_t m, double* bothtails, double* lefttail, double* righttail, ae_state *_state) { ae_frame _frame_block; ae_int_t i; ae_int_t j; ae_int_t k; ae_int_t t; double tmp; ae_int_t tmpi; ae_int_t ns; ae_vector r; ae_vector c; double u; double p; double mp; double s; double sigma; double mu; ae_int_t tiecount; ae_vector tiesize; ae_frame_make(_state, &_frame_block); *bothtails = 0; *lefttail = 0; *righttail = 0; ae_vector_init(&r, 0, DT_REAL, _state); ae_vector_init(&c, 0, DT_INT, _state); ae_vector_init(&tiesize, 0, DT_INT, _state); /* * Prepare */ if( n<=4||m<=4 ) { *bothtails = 1.0; *lefttail = 1.0; *righttail = 1.0; ae_frame_leave(_state); return; } ns = n+m; ae_vector_set_length(&r, ns-1+1, _state); ae_vector_set_length(&c, ns-1+1, _state); for(i=0; i<=n-1; i++) { r.ptr.p_double[i] = x->ptr.p_double[i]; c.ptr.p_int[i] = 0; } for(i=0; i<=m-1; i++) { r.ptr.p_double[n+i] = y->ptr.p_double[i]; c.ptr.p_int[n+i] = 1; } /* * sort {R, C} */ if( ns!=1 ) { i = 2; do { t = i; while(t!=1) { k = t/2; if( ae_fp_greater_eq(r.ptr.p_double[k-1],r.ptr.p_double[t-1]) ) { t = 1; } else { tmp = r.ptr.p_double[k-1]; r.ptr.p_double[k-1] = r.ptr.p_double[t-1]; r.ptr.p_double[t-1] = tmp; tmpi = c.ptr.p_int[k-1]; c.ptr.p_int[k-1] = c.ptr.p_int[t-1]; c.ptr.p_int[t-1] = tmpi; t = k; } } i = i+1; } while(i<=ns); i = ns-1; do { tmp = r.ptr.p_double[i]; r.ptr.p_double[i] = r.ptr.p_double[0]; r.ptr.p_double[0] = tmp; tmpi = c.ptr.p_int[i]; c.ptr.p_int[i] = c.ptr.p_int[0]; c.ptr.p_int[0] = tmpi; t = 1; while(t!=0) { k = 2*t; if( k>i ) { t = 0; } else { if( k<i ) { if( ae_fp_greater(r.ptr.p_double[k],r.ptr.p_double[k-1]) ) { k = k+1; } } if( ae_fp_greater_eq(r.ptr.p_double[t-1],r.ptr.p_double[k-1]) ) { t = 0; } else { tmp = r.ptr.p_double[k-1]; r.ptr.p_double[k-1] = r.ptr.p_double[t-1]; r.ptr.p_double[t-1] = tmp; tmpi = c.ptr.p_int[k-1]; c.ptr.p_int[k-1] = c.ptr.p_int[t-1]; c.ptr.p_int[t-1] = tmpi; t = k; } } } i = i-1; } while(i>=1); } /* * compute tied ranks */ i = 0; tiecount = 0; ae_vector_set_length(&tiesize, ns-1+1, _state); while(i<=ns-1) { j = i+1; while(j<=ns-1) { if( ae_fp_neq(r.ptr.p_double[j],r.ptr.p_double[i]) ) { break; } j = j+1; } for(k=i; k<=j-1; k++) { r.ptr.p_double[k] = 1+(double)(i+j-1)/(double)2; } tiesize.ptr.p_int[tiecount] = j-i; tiecount = tiecount+1; i = j; } /* * Compute U */ u = (double)(0); for(i=0; i<=ns-1; i++) { if( c.ptr.p_int[i]==0 ) { u = u+r.ptr.p_double[i]; } } u = rmul2((double)(n), (double)(m), _state)+rmul2((double)(n), (double)(n+1), _state)*0.5-u; /* * Result */ mu = rmul2((double)(n), (double)(m), _state)/2; tmp = ns*(ae_sqr((double)(ns), _state)-1)/12; for(i=0; i<=tiecount-1; i++) { tmp = tmp-tiesize.ptr.p_int[i]*(ae_sqr((double)(tiesize.ptr.p_int[i]), _state)-1)/12; } sigma = ae_sqrt(rmul2((double)(n), (double)(m), _state)/ns/(ns-1)*tmp, _state); s = (u-mu)/sigma; if( ae_fp_less_eq(s,(double)(0)) ) { p = ae_exp(mannwhitneyu_usigma(-(u-mu)/sigma, n, m, _state), _state); mp = 1-ae_exp(mannwhitneyu_usigma(-(u-1-mu)/sigma, n, m, _state), _state); } else { mp = ae_exp(mannwhitneyu_usigma((u-mu)/sigma, n, m, _state), _state); p = 1-ae_exp(mannwhitneyu_usigma((u+1-mu)/sigma, n, m, _state), _state); } *lefttail = boundval(ae_maxreal(mp, 1.0E-4, _state), 0.0001, 0.2500, _state); *righttail = boundval(ae_maxreal(p, 1.0E-4, _state), 0.0001, 0.2500, _state); *bothtails = 2*ae_minreal(*lefttail, *righttail, _state); ae_frame_leave(_state); } /************************************************************************* Sequential Chebyshev interpolation.
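The helper below adds one term of a Chebyshev series to the running sum: it accumulates
c*T(j,x) into R and then advances the basis pair using the recurrence
T(j+1,x) = 2*x*T(j,x) - T(j-1,x). Callers initialize Tj=1 and Tj1=x (i.e. T0 and T1) and
invoke the helper once per coefficient, so the tabulated usigma/utbln functions in this
file evaluate sums of the form c0*T0(x)+c1*T1(x)+...+ck*Tk(x) for x scaled into [-1,1].
The same scheme is used by wsr_wcheb earlier in this file.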
*************************************************************************/ static void mannwhitneyu_ucheb(double x, double c, double* tj, double* tj1, double* r, ae_state *_state) { double t; *r = *r+c*(*tj); t = 2*x*(*tj1)-(*tj); *tj = *tj1; *tj1 = t; } /************************************************************************* Three-point polynomial interpolation. *************************************************************************/ static double mannwhitneyu_uninterpolate(double p1, double p2, double p3, ae_int_t n, ae_state *_state) { double t1; double t2; double t3; double t; double p12; double p23; double result; t1 = 1.0/15.0; t2 = 1.0/30.0; t3 = 1.0/100.0; t = 1.0/n; p12 = ((t-t2)*p1+(t1-t)*p2)/(t1-t2); p23 = ((t-t3)*p2+(t2-t)*p3)/(t2-t3); result = ((t-t3)*p12+(t1-t)*p23)/(t1-t3); return result; } /************************************************************************* Tail(0, N1, N2) *************************************************************************/ static double mannwhitneyu_usigma000(ae_int_t n1, ae_int_t n2, ae_state *_state) { double p1; double p2; double p3; double result; p1 = mannwhitneyu_uninterpolate(-6.76984e-01, -6.83700e-01, -6.89873e-01, n2, _state); p2 = mannwhitneyu_uninterpolate(-6.83700e-01, -6.87311e-01, -6.90957e-01, n2, _state); p3 = mannwhitneyu_uninterpolate(-6.89873e-01, -6.90957e-01, -6.92175e-01, n2, _state); result = mannwhitneyu_uninterpolate(p1, p2, p3, n1, _state); return result; } /************************************************************************* Tail(0.75, N1, N2) *************************************************************************/ static double mannwhitneyu_usigma075(ae_int_t n1, ae_int_t n2, ae_state *_state) { double p1; double p2; double p3; double result; p1 = mannwhitneyu_uninterpolate(-1.44500e+00, -1.45906e+00, -1.47063e+00, n2, _state); p2 = mannwhitneyu_uninterpolate(-1.45906e+00, -1.46856e+00, -1.47644e+00, n2, _state); p3 = mannwhitneyu_uninterpolate(-1.47063e+00, -1.47644e+00, -1.48100e+00, n2, _state); result = mannwhitneyu_uninterpolate(p1, p2, p3, n1, _state); return result; } /************************************************************************* Tail(1.5, N1, N2) *************************************************************************/ static double mannwhitneyu_usigma150(ae_int_t n1, ae_int_t n2, ae_state *_state) { double p1; double p2; double p3; double result; p1 = mannwhitneyu_uninterpolate(-2.65380e+00, -2.67352e+00, -2.69011e+00, n2, _state); p2 = mannwhitneyu_uninterpolate(-2.67352e+00, -2.68591e+00, -2.69659e+00, n2, _state); p3 = mannwhitneyu_uninterpolate(-2.69011e+00, -2.69659e+00, -2.70192e+00, n2, _state); result = mannwhitneyu_uninterpolate(p1, p2, p3, n1, _state); return result; } /************************************************************************* Tail(2.25, N1, N2) *************************************************************************/ static double mannwhitneyu_usigma225(ae_int_t n1, ae_int_t n2, ae_state *_state) { double p1; double p2; double p3; double result; p1 = mannwhitneyu_uninterpolate(-4.41465e+00, -4.42260e+00, -4.43702e+00, n2, _state); p2 = mannwhitneyu_uninterpolate(-4.42260e+00, -4.41639e+00, -4.41928e+00, n2, _state); p3 = mannwhitneyu_uninterpolate(-4.43702e+00, -4.41928e+00, -4.41030e+00, n2, _state); result = mannwhitneyu_uninterpolate(p1, p2, p3, n1, _state); return result; } /************************************************************************* Tail(3.0, N1, N2) *************************************************************************/ static double 
mannwhitneyu_usigma300(ae_int_t n1, ae_int_t n2, ae_state *_state) { double p1; double p2; double p3; double result; p1 = mannwhitneyu_uninterpolate(-6.89839e+00, -6.83477e+00, -6.82340e+00, n2, _state); p2 = mannwhitneyu_uninterpolate(-6.83477e+00, -6.74559e+00, -6.71117e+00, n2, _state); p3 = mannwhitneyu_uninterpolate(-6.82340e+00, -6.71117e+00, -6.64929e+00, n2, _state); result = mannwhitneyu_uninterpolate(p1, p2, p3, n1, _state); return result; } /************************************************************************* Tail(3.33, N1, N2) *************************************************************************/ static double mannwhitneyu_usigma333(ae_int_t n1, ae_int_t n2, ae_state *_state) { double p1; double p2; double p3; double result; p1 = mannwhitneyu_uninterpolate(-8.31272e+00, -8.17096e+00, -8.13125e+00, n2, _state); p2 = mannwhitneyu_uninterpolate(-8.17096e+00, -8.00156e+00, -7.93245e+00, n2, _state); p3 = mannwhitneyu_uninterpolate(-8.13125e+00, -7.93245e+00, -7.82502e+00, n2, _state); result = mannwhitneyu_uninterpolate(p1, p2, p3, n1, _state); return result; } /************************************************************************* Tail(3.66, N1, N2) *************************************************************************/ static double mannwhitneyu_usigma367(ae_int_t n1, ae_int_t n2, ae_state *_state) { double p1; double p2; double p3; double result; p1 = mannwhitneyu_uninterpolate(-9.98837e+00, -9.70844e+00, -9.62087e+00, n2, _state); p2 = mannwhitneyu_uninterpolate(-9.70844e+00, -9.41156e+00, -9.28998e+00, n2, _state); p3 = mannwhitneyu_uninterpolate(-9.62087e+00, -9.28998e+00, -9.11686e+00, n2, _state); result = mannwhitneyu_uninterpolate(p1, p2, p3, n1, _state); return result; } /************************************************************************* Tail(4.0, N1, N2) *************************************************************************/ static double mannwhitneyu_usigma400(ae_int_t n1, ae_int_t n2, ae_state *_state) { double p1; double p2; double p3; double result; p1 = mannwhitneyu_uninterpolate(-1.20250e+01, -1.14911e+01, -1.13231e+01, n2, _state); p2 = mannwhitneyu_uninterpolate(-1.14911e+01, -1.09927e+01, -1.07937e+01, n2, _state); p3 = mannwhitneyu_uninterpolate(-1.13231e+01, -1.07937e+01, -1.05285e+01, n2, _state); result = mannwhitneyu_uninterpolate(p1, p2, p3, n1, _state); return result; } /************************************************************************* Tail(S, 5, 5) *************************************************************************/ static double mannwhitneyu_utbln5n5(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); x = ae_minreal(2*(s-0.000000e+00)/2.611165e+00-1, 1.0, _state); tj = (double)(1); tj1 = x; mannwhitneyu_ucheb(x, -2.596264e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.412086e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.858542e-01, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -5.614282e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 3.372686e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 8.524731e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 4.435331e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 1.284665e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 4.184141e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 5.298360e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 7.447272e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -3.938769e-03, &tj, &tj1, &result, _state); 
mannwhitneyu_ucheb(x, -4.276205e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.138481e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 8.684625e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 1.558104e-03, &tj, &tj1, &result, _state); return result; } /************************************************************************* Tail(S, 5, 6) *************************************************************************/ static double mannwhitneyu_utbln5n6(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); x = ae_minreal(2*(s-0.000000e+00)/2.738613e+00-1, 1.0, _state); tj = (double)(1); tj1 = x; mannwhitneyu_ucheb(x, -2.810459e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.684429e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -5.712858e-01, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -8.009324e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -6.644391e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 6.034173e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 4.953498e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 3.279293e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 3.563485e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 4.971952e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 3.506309e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.541406e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -3.283205e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -3.016347e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.221626e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.286752e-03, &tj, &tj1, &result, _state); return result; } /************************************************************************* Tail(S, 5, 7) *************************************************************************/ static double mannwhitneyu_utbln5n7(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); x = ae_minreal(2*(s-0.000000e+00)/2.841993e+00-1, 1.0, _state); tj = (double)(1); tj1 = x; mannwhitneyu_ucheb(x, -2.994677e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.923264e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -6.506190e-01, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.054280e-01, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.794587e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 1.726290e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 4.534180e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 4.517845e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 3.904428e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 3.882443e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 3.482988e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 2.114875e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.515082e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.996056e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.293581e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.349444e-03, &tj, &tj1, &result, _state); return result; } /************************************************************************* Tail(S, 5, 8) *************************************************************************/ static double mannwhitneyu_utbln5n8(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); x = ae_minreal(2*(s-0.000000e+00)/2.927700e+00-1, 1.0, 
_state); tj = (double)(1); tj1 = x; mannwhitneyu_ucheb(x, -3.155727e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -3.135078e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -7.247203e-01, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.309697e-01, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.993725e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -3.567219e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 3.383704e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 5.002188e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 4.487322e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 3.443899e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 2.688270e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 2.600339e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 1.874948e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 1.811593e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.072353e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.659457e-03, &tj, &tj1, &result, _state); return result; } /************************************************************************* Tail(S, 5, 9) *************************************************************************/ static double mannwhitneyu_utbln5n9(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); x = ae_minreal(2*(s-0.000000e+00)/3.000000e+00-1, 1.0, _state); tj = (double)(1); tj1 = x; mannwhitneyu_ucheb(x, -3.298162e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -3.325016e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -7.939852e-01, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.563029e-01, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.222652e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -9.195200e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 1.445665e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 5.204792e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 4.775217e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 3.527781e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 2.221948e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 2.242968e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 2.607959e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 1.771285e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 6.694026e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.481190e-03, &tj, &tj1, &result, _state); return result; } /************************************************************************* Tail(S, 5, 10) *************************************************************************/ static double mannwhitneyu_utbln5n10(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); x = ae_minreal(2*(s-0.000000e+00)/3.061862e+00-1, 1.0, _state); tj = (double)(1); tj1 = x; mannwhitneyu_ucheb(x, -3.425360e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -3.496710e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -8.587658e-01, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.812005e-01, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -5.427637e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.515702e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -5.406867e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 4.796295e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 5.237591e-03, &tj, &tj1, &result, _state); 
mannwhitneyu_ucheb(x, 3.654249e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 2.181165e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 2.011665e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 2.417927e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 2.534880e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 1.791255e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 1.871512e-05, &tj, &tj1, &result, _state); return result; } /************************************************************************* Tail(S, 5, 11) *************************************************************************/ static double mannwhitneyu_utbln5n11(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); x = ae_minreal(2*(s-0.000000e+00)/3.115427e+00-1, 1.0, _state); tj = (double)(1); tj1 = x; mannwhitneyu_ucheb(x, -3.539959e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -3.652998e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -9.196503e-01, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.054363e-01, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -6.618848e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.109411e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.786668e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 4.215648e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 5.484220e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 3.935991e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 2.396191e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 1.894177e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 2.206979e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 2.519055e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 2.210326e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 1.189679e-03, &tj, &tj1, &result, _state); return result; } /************************************************************************* Tail(S, 5, 12) *************************************************************************/ static double mannwhitneyu_utbln5n12(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); x = ae_minreal(2*(s-0.000000e+00)/3.162278e+00-1, 1.0, _state); tj = (double)(1); tj1 = x; mannwhitneyu_ucheb(x, -3.644007e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -3.796173e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -9.771177e-01, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.290043e-01, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -7.794686e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.702110e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -5.185959e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 3.416259e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 5.592056e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 4.201530e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 2.754365e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 1.978945e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 2.012032e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 2.304579e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 2.100378e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 1.728269e-03, &tj, &tj1, &result, _state); return result; } /************************************************************************* Tail(S, 5, 13) *************************************************************************/ 
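/*
 * Every mannwhitneyu_utblnKnM() routine in this family tabulates
 * Tail(S, K, M) for sample sizes N1=K, N2=M as a 16-term Chebyshev
 * expansion in x = min(2*S/Smax - 1, 1), where Smax is the constant in
 * the ae_minreal() call (presumably the upper end of the range over which
 * the expansion was fitted).  For S above Smax the argument is clamped at
 * x=1, so the approximation saturates there.
 */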
static double mannwhitneyu_utbln5n13(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); x = ae_minreal(2*(s-0.000000e+00)/3.203616e+00-1, 1.0, _state); tj = (double)(1); tj1 = x; mannwhitneyu_ucheb(x, -3.739120e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -3.928117e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.031605e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.519403e-01, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -8.962648e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -3.292183e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -7.809293e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 2.465156e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 5.456278e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 4.446055e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 3.109490e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 2.218256e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 1.941479e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 2.058603e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 1.824402e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 1.830947e-03, &tj, &tj1, &result, _state); return result; } /************************************************************************* Tail(S, 5, 14) *************************************************************************/ static double mannwhitneyu_utbln5n14(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); x = ae_minreal(2*(s-0.000000e+00)/3.240370e+00-1, 1.0, _state); tj = (double)(1); tj1 = x; mannwhitneyu_ucheb(x, -3.826559e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.050370e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.083408e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.743164e-01, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.012030e-01, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -3.884686e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.059656e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 1.327521e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 5.134026e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 4.584201e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 3.440618e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 2.524133e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 1.990007e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 1.887334e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 1.534977e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 1.705395e-03, &tj, &tj1, &result, _state); return result; } /************************************************************************* Tail(S, 5, 15) *************************************************************************/ static double mannwhitneyu_utbln5n15(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); x = ae_minreal(2*(s-0.000000e+00)/3.250000e+00-1, 1.0, _state); tj = (double)(1); tj1 = x; mannwhitneyu_ucheb(x, -3.851572e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.082033e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.095983e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.814595e-01, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.073148e-01, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.420213e-02, &tj, &tj1, &result, _state); 
mannwhitneyu_ucheb(x, -1.517175e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.344180e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 2.371393e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 2.711443e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 2.228569e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 1.683483e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 1.267112e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 1.156044e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 9.131316e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 1.301023e-03, &tj, &tj1, &result, _state); return result; } /************************************************************************* Tail(S, 5, 16) *************************************************************************/ static double mannwhitneyu_utbln5n16(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); x = ae_minreal(2*(s-0.000000e+00)/3.250000e+00-1, 1.0, _state); tj = (double)(1); tj1 = x; mannwhitneyu_ucheb(x, -3.852210e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.077482e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.091186e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.797282e-01, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.084994e-01, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.667054e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.843909e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -5.456732e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -5.039830e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 4.723508e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 3.940608e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 1.478285e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.649144e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.237703e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.707410e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.874293e-04, &tj, &tj1, &result, _state); return result; } /************************************************************************* Tail(S, 5, 17) *************************************************************************/ static double mannwhitneyu_utbln5n17(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); x = ae_minreal(2*(s-0.000000e+00)/3.250000e+00-1, 1.0, _state); tj = (double)(1); tj1 = x; mannwhitneyu_ucheb(x, -3.851752e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.071259e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.084700e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.758898e-01, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.073846e-01, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.684838e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.964936e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -6.782442e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.956362e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -5.984727e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -5.196936e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -5.558262e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -8.690746e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.364855e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.401006e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.546748e-03, 
&tj, &tj1, &result, _state); return result; } /************************************************************************* Tail(S, 5, 18) *************************************************************************/ static double mannwhitneyu_utbln5n18(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); x = ae_minreal(2*(s-0.000000e+00)/3.250000e+00-1, 1.0, _state); tj = (double)(1); tj1 = x; mannwhitneyu_ucheb(x, -3.850840e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.064799e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.077651e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.712659e-01, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.049217e-01, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.571333e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.929809e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -6.752044e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.949464e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -3.896101e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.614460e-05, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 1.384357e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -6.489113e-05, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -6.445725e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -8.945636e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.424653e-03, &tj, &tj1, &result, _state); return result; } /************************************************************************* Tail(S, 5, 19) *************************************************************************/ static double mannwhitneyu_utbln5n19(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); x = ae_minreal(2*(s-0.000000e+00)/3.250000e+00-1, 1.0, _state); tj = (double)(1); tj1 = x; mannwhitneyu_ucheb(x, -3.850027e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.059159e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.071106e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.669960e-01, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.022780e-01, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.442555e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.851335e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -6.433865e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.514465e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 1.332989e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 8.606099e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 1.341945e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 1.402164e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 1.039761e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 5.512831e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -3.284427e-05, &tj, &tj1, &result, _state); return result; } /************************************************************************* Tail(S, 5, 20) *************************************************************************/ static double mannwhitneyu_utbln5n20(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); x = ae_minreal(2*(s-0.000000e+00)/3.250000e+00-1, 1.0, _state); tj = (double)(1); tj1 = x; mannwhitneyu_ucheb(x, -3.849651e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.054729e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.065747e+00, &tj, &tj1, &result, 
_state); mannwhitneyu_ucheb(x, -2.636243e-01, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.003234e-01, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.372789e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.831551e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -6.763090e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.830626e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.122384e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 8.108328e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 1.557983e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 1.945666e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 1.965696e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 1.493236e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 1.162591e-03, &tj, &tj1, &result, _state); return result; } /************************************************************************* Tail(S, 5, 21) *************************************************************************/ static double mannwhitneyu_utbln5n21(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); x = ae_minreal(2*(s-0.000000e+00)/3.250000e+00-1, 1.0, _state); tj = (double)(1); tj1 = x; mannwhitneyu_ucheb(x, -3.849649e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.051155e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.061430e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.608869e-01, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -9.902788e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.346562e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.874709e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -7.682887e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -3.026206e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.534551e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.990575e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 3.713334e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 9.737011e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 1.304571e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 1.133110e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 1.123457e-03, &tj, &tj1, &result, _state); return result; } /************************************************************************* Tail(S, 5, 22) *************************************************************************/ static double mannwhitneyu_utbln5n22(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); x = ae_minreal(2*(s-0.000000e+00)/3.250000e+00-1, 1.0, _state); tj = (double)(1); tj1 = x; mannwhitneyu_ucheb(x, -3.849598e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.047605e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.057264e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.579513e-01, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -9.749602e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.275137e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.881768e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -8.177374e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -3.981056e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.696290e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.886803e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.085378e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 
-4.675242e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -5.426367e-05, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 1.039613e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 2.662378e-04, &tj, &tj1, &result, _state); return result; } /************************************************************************* Tail(S, 5, 23) *************************************************************************/ static double mannwhitneyu_utbln5n23(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); x = ae_minreal(2*(s-0.000000e+00)/3.250000e+00-1, 1.0, _state); tj = (double)(1); tj1 = x; mannwhitneyu_ucheb(x, -3.849269e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.043761e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.052735e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.544683e-01, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -9.517503e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.112082e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.782070e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -7.549483e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -3.747329e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.694263e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.147141e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.526209e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.039173e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -7.235615e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.656546e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -3.014423e-04, &tj, &tj1, &result, _state); return result; } /************************************************************************* Tail(S, 5, 24) *************************************************************************/ static double mannwhitneyu_utbln5n24(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); x = ae_minreal(2*(s-0.000000e+00)/3.250000e+00-1, 1.0, _state); tj = (double)(1); tj1 = x; mannwhitneyu_ucheb(x, -3.848925e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.040178e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.048355e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.510198e-01, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -9.261134e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -3.915864e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.627423e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -6.307345e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.732992e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.869652e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.494176e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.047533e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -7.178439e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -5.424171e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -3.829195e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.840810e-04, &tj, &tj1, &result, _state); return result; } /************************************************************************* Tail(S, 5, 25) *************************************************************************/ static double mannwhitneyu_utbln5n25(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); x = ae_minreal(2*(s-0.000000e+00)/3.250000e+00-1, 1.0, _state); 
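/* For N1=5 the scaling constant has saturated: utbln5n15 through
   utbln5n30 and utbln5n100 all use the same Smax=3.25, i.e. the fitted
   range no longer grows with N2. */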
tj = (double)(1); tj1 = x; mannwhitneyu_ucheb(x, -3.848937e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.037512e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.044866e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.483269e-01, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -9.063682e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -3.767778e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.508540e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -5.332756e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.881511e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.124041e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -8.368456e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.930499e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.779630e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.029528e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.658678e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.289695e-04, &tj, &tj1, &result, _state); return result; } /************************************************************************* Tail(S, 5, 26) *************************************************************************/ static double mannwhitneyu_utbln5n26(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); x = ae_minreal(2*(s-0.000000e+00)/3.250000e+00-1, 1.0, _state); tj = (double)(1); tj1 = x; mannwhitneyu_ucheb(x, -3.849416e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.035915e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.042493e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.466021e-01, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -8.956432e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -3.698914e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.465689e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -5.035254e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.674614e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -9.492734e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -7.014021e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -3.944953e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.255750e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.075841e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.989330e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.134862e-04, &tj, &tj1, &result, _state); return result; } /************************************************************************* Tail(S, 5, 27) *************************************************************************/ static double mannwhitneyu_utbln5n27(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); x = ae_minreal(2*(s-0.000000e+00)/3.250000e+00-1, 1.0, _state); tj = (double)(1); tj1 = x; mannwhitneyu_ucheb(x, -3.850070e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.034815e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.040650e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.453117e-01, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -8.886426e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -3.661702e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.452346e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -5.002476e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.720126e-03, &tj, &tj1, &result, _state); 
mannwhitneyu_ucheb(x, -1.001400e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -7.729826e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.740640e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -3.206333e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -3.366093e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -3.193471e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -3.804091e-04, &tj, &tj1, &result, _state); return result; } /************************************************************************* Tail(S, 5, 28) *************************************************************************/ static double mannwhitneyu_utbln5n28(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); x = ae_minreal(2*(s-0.000000e+00)/3.250000e+00-1, 1.0, _state); tj = (double)(1); tj1 = x; mannwhitneyu_ucheb(x, -3.850668e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.033786e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.038853e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.440281e-01, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -8.806020e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -3.612883e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.420436e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.787982e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.535230e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -8.263121e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -5.849609e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.863967e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.391610e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.720294e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.952273e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.901413e-04, &tj, &tj1, &result, _state); return result; } /************************************************************************* Tail(S, 5, 29) *************************************************************************/ static double mannwhitneyu_utbln5n29(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); x = ae_minreal(2*(s-0.000000e+00)/3.250000e+00-1, 1.0, _state); tj = (double)(1); tj1 = x; mannwhitneyu_ucheb(x, -3.851217e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.032834e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.037113e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.427762e-01, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -8.719146e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -3.557172e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.375498e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.452033e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.187516e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.916936e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.065533e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 1.067301e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 2.615824e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 2.432244e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 1.417795e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 4.710038e-05, &tj, &tj1, &result, _state); return result; } /************************************************************************* Tail(S, 5, 30) 
*************************************************************************/ static double mannwhitneyu_utbln5n30(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); x = ae_minreal(2*(s-0.000000e+00)/3.250000e+00-1, 1.0, _state); tj = (double)(1); tj1 = x; mannwhitneyu_ucheb(x, -3.851845e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.032148e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.035679e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.417758e-01, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -8.655330e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -3.522132e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.352106e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.326911e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.064969e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -3.813321e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -5.683881e-05, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 2.813346e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 4.627085e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 4.832107e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 3.519336e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 2.888530e-04, &tj, &tj1, &result, _state); return result; } /************************************************************************* Tail(S, 5, 100) *************************************************************************/ static double mannwhitneyu_utbln5n100(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); x = ae_minreal(2*(s-0.000000e+00)/3.250000e+00-1, 1.0, _state); tj = (double)(1); tj1 = x; mannwhitneyu_ucheb(x, -3.877940e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.039324e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.022243e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.305825e-01, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -7.960119e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -3.112000e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.138868e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -3.418164e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -9.174520e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -5.489617e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -3.878301e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.302233e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 1.054113e-05, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 2.458862e-05, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.186591e-06, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.623412e-05, &tj, &tj1, &result, _state); return result; } /************************************************************************* Tail(S, 6, 6) *************************************************************************/ static double mannwhitneyu_utbln6n6(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); x = ae_minreal(2*(s-0.000000e+00)/2.882307e+00-1, 1.0, _state); tj = (double)(1); tj1 = x; mannwhitneyu_ucheb(x, -3.054075e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.998804e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -6.681518e-01, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.067578e-01, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.709435e-02, &tj, &tj1, &result, 
_state); mannwhitneyu_ucheb(x, 9.952661e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 3.641700e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 2.304572e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 3.336275e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 4.770385e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 5.401891e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 2.246148e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.442663e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.502866e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.105855e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.739371e-04, &tj, &tj1, &result, _state); return result; } /************************************************************************* Tail(S, 6, 7) *************************************************************************/ static double mannwhitneyu_utbln6n7(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); x = ae_minreal(2*(s-0.000000e+00)/3.000000e+00-1, 1.0, _state); tj = (double)(1); tj1 = x; mannwhitneyu_ucheb(x, -3.265287e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -3.274613e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -7.582352e-01, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.334293e-01, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.915502e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.108091e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 1.546701e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 2.298827e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 2.891501e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 4.313717e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 4.989501e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 3.914594e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 1.062372e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.158841e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.596443e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.185662e-03, &tj, &tj1, &result, _state); return result; } /************************************************************************* Tail(S, 6, 8) *************************************************************************/ static double mannwhitneyu_utbln6n8(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); x = ae_minreal(2*(s-0.000000e+00)/3.098387e+00-1, 1.0, _state); tj = (double)(1); tj1 = x; mannwhitneyu_ucheb(x, -3.450954e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -3.520462e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -8.420299e-01, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.604853e-01, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.165840e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.008756e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -6.723402e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 1.843521e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 2.883405e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 3.720980e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 4.301709e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 3.948034e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 2.776243e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 8.623736e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -3.742068e-04, &tj, 
&tj1, &result, _state); mannwhitneyu_ucheb(x, -9.796927e-04, &tj, &tj1, &result, _state); return result; } /************************************************************************* Tail(S, 6, 9) *************************************************************************/ static double mannwhitneyu_utbln6n9(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); x = ae_minreal(2*(s-0.000000e+00)/3.181981e+00-1, 1.0, _state); tj = (double)(1); tj1 = x; mannwhitneyu_ucheb(x, -3.616113e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -3.741650e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -9.204487e-01, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.873068e-01, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -5.446794e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.632286e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -3.266481e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 1.280067e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 2.780687e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 3.480242e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 3.592200e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 3.581019e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 3.264231e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 2.347174e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 1.167535e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.092185e-04, &tj, &tj1, &result, _state); return result; } /************************************************************************* Tail(S, 6, 10) *************************************************************************/ static double mannwhitneyu_utbln6n10(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); x = ae_minreal(2*(s-0.000000e+00)/3.253957e+00-1, 1.0, _state); tj = (double)(1); tj1 = x; mannwhitneyu_ucheb(x, -3.764382e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -3.942366e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -9.939896e-01, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.137812e-01, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -6.720270e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.281070e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -5.901060e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 3.824937e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 2.802812e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 3.258132e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 3.233536e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 3.085530e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 3.212151e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 3.001329e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 2.226048e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 1.035298e-03, &tj, &tj1, &result, _state); return result; } /************************************************************************* Tail(S, 6, 11) *************************************************************************/ static double mannwhitneyu_utbln6n11(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); x = ae_minreal(2*(s-0.000000e+00)/3.316625e+00-1, 1.0, _state); tj = (double)(1); tj1 = x; mannwhitneyu_ucheb(x, -3.898597e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.125710e+00, &tj, &tj1, &result, _state); 
mannwhitneyu_ucheb(x, -1.063297e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.396852e-01, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -7.990126e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.927977e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -8.726500e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -5.858745e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 2.654590e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 3.217736e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 2.989770e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 2.768493e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 2.924364e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 3.140215e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 2.647914e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 1.924802e-03, &tj, &tj1, &result, _state); return result; } /************************************************************************* Tail(S, 6, 12) *************************************************************************/ static double mannwhitneyu_utbln6n12(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); x = ae_minreal(2*(s-0.000000e+00)/3.371709e+00-1, 1.0, _state); tj = (double)(1); tj1 = x; mannwhitneyu_ucheb(x, -4.020941e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.294250e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.128842e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.650389e-01, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -9.248611e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -3.578510e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.162852e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.746982e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 2.454209e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 3.128042e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 2.936650e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 2.530794e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 2.665192e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 2.994144e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 2.662249e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 2.368541e-03, &tj, &tj1, &result, _state); return result; } /************************************************************************* Tail(S, 6, 13) *************************************************************************/ static double mannwhitneyu_utbln6n13(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); x = ae_minreal(2*(s-0.000000e+00)/3.420526e+00-1, 1.0, _state); tj = (double)(1); tj1 = x; mannwhitneyu_ucheb(x, -4.133167e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.450016e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.191088e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.898220e-01, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.050249e-01, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.226901e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.471113e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -3.007470e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 2.049420e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 3.059074e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 2.881249e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 2.452780e-03, &tj, &tj1, 
&result, _state); mannwhitneyu_ucheb(x, 2.441805e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 2.787493e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 2.483957e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 2.481590e-03, &tj, &tj1, &result, _state); return result; } /************************************************************************* Tail(S, 6, 14) *************************************************************************/ static double mannwhitneyu_utbln6n14(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); x = ae_minreal(2*(s-0.000000e+00)/3.450000e+00-1, 1.0, _state); tj = (double)(1); tj1 = x; mannwhitneyu_ucheb(x, -4.201268e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.542568e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.226965e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -3.046029e-01, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.136657e-01, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.786757e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.843748e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -5.588022e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 2.253029e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 1.667188e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 1.788330e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 1.474545e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 1.540494e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 1.951188e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 1.863323e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 2.220904e-03, &tj, &tj1, &result, _state); return result; } /************************************************************************* Tail(S, 6, 15) *************************************************************************/ static double mannwhitneyu_utbln6n15(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); x = ae_minreal(2*(s-0.000000e+00)/3.450000e+00-1, 1.0, _state); tj = (double)(1); tj1 = x; mannwhitneyu_ucheb(x, -4.195689e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.526567e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.213617e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.975035e-01, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.118480e-01, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.859142e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.083312e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -8.298720e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.766708e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.026356e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -9.093113e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.135168e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.136376e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -8.190870e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.435972e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 1.413129e-04, &tj, &tj1, &result, _state); return result; } /************************************************************************* Tail(S, 6, 30) *************************************************************************/ static double mannwhitneyu_utbln6n30(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); x = 
ae_minreal(2*(s-0.000000e+00)/3.450000e+00-1, 1.0, _state); tj = (double)(1); tj1 = x; mannwhitneyu_ucheb(x, -4.166269e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.427399e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.118239e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.360847e-01, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -7.745885e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -3.025041e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.187179e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.432089e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.408451e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.388774e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.795560e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.304136e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.258516e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.180236e-05, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.388679e-06, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 4.836027e-06, &tj, &tj1, &result, _state); return result; } /************************************************************************* Tail(S, 6, 100) *************************************************************************/ static double mannwhitneyu_utbln6n100(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); x = ae_minreal(2*(s-0.000000e+00)/3.450000e+00-1, 1.0, _state); tj = (double)(1); tj1 = x; mannwhitneyu_ucheb(x, -4.181350e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.417919e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.094201e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.195883e-01, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -6.818937e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.514202e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -9.125047e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -3.022148e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -7.284181e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.157766e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.023752e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.127985e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -5.221690e-05, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -3.516179e-06, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 9.501398e-06, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 9.380220e-06, &tj, &tj1, &result, _state); return result; } /************************************************************************* Tail(S, 7, 7) *************************************************************************/ static double mannwhitneyu_utbln7n7(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); x = ae_minreal(2*(s-0.000000e+00)/3.130495e+00-1, 1.0, _state); tj = (double)(1); tj1 = x; mannwhitneyu_ucheb(x, -3.501264e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -3.584790e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -8.577311e-01, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.617002e-01, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.145186e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.023462e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.408251e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 8.626515e-04, &tj, &tj1, &result, _state); 
mannwhitneyu_ucheb(x, 2.072492e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 3.722926e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 5.095445e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 4.842602e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 2.751427e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 2.008927e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -9.892431e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -8.772386e-04, &tj, &tj1, &result, _state); return result; } /************************************************************************* Tail(S, 7, 8) *************************************************************************/ static double mannwhitneyu_utbln7n8(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); x = ae_minreal(2*(s-0.000000e+00)/3.240370e+00-1, 1.0, _state); tj = (double)(1); tj1 = x; mannwhitneyu_ucheb(x, -3.709965e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -3.862154e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -9.504541e-01, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.900195e-01, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -5.439995e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.678028e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.485540e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.437047e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 1.440092e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 3.114227e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 4.516569e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 4.829457e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 3.787550e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 1.761866e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 1.991911e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.533481e-04, &tj, &tj1, &result, _state); return result; } /************************************************************************* Tail(S, 7, 9) *************************************************************************/ static double mannwhitneyu_utbln7n9(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); x = ae_minreal(2*(s-0.000000e+00)/3.334314e+00-1, 1.0, _state); tj = (double)(1); tj1 = x; mannwhitneyu_ucheb(x, -3.896550e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.112671e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.037277e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.181695e-01, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -6.765190e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.360116e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -7.695960e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.780578e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 8.963843e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 2.616148e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 3.852104e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 4.390744e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 4.014041e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 2.888101e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 1.467474e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 4.004611e-04, &tj, &tj1, &result, _state); return result; } /************************************************************************* Tail(S, 7, 10) 
*************************************************************************/ static double mannwhitneyu_utbln7n10(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); x = ae_minreal(2*(s-0.000000e+00)/3.415650e+00-1, 1.0, _state); tj = (double)(1); tj1 = x; mannwhitneyu_ucheb(x, -4.064844e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.340749e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.118888e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.459730e-01, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -8.097781e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -3.057688e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.097406e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -3.209262e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 4.065641e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 2.196677e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 3.313994e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 3.827157e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 3.822284e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 3.389090e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 2.340850e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 1.395172e-03, &tj, &tj1, &result, _state); return result; } /************************************************************************* Tail(S, 7, 11) *************************************************************************/ static double mannwhitneyu_utbln7n11(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); x = ae_minreal(2*(s-0.000000e+00)/3.486817e+00-1, 1.0, _state); tj = (double)(1); tj1 = x; mannwhitneyu_ucheb(x, -4.217795e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.549783e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.195905e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.733093e-01, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -9.428447e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -3.760093e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.431676e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.717152e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.032199e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 1.832423e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 2.905979e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 3.302799e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 3.464371e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 3.456211e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 2.736244e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 2.140712e-03, &tj, &tj1, &result, _state); return result; } /************************************************************************* Tail(S, 7, 12) *************************************************************************/ static double mannwhitneyu_utbln7n12(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); x = ae_minreal(2*(s-0.000000e+00)/3.500000e+00-1, 1.0, _state); tj = (double)(1); tj1 = x; mannwhitneyu_ucheb(x, -4.235822e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.564100e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.190813e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.686546e-01, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -9.395083e-02, &tj, &tj1, &result, _state); 
mannwhitneyu_ucheb(x, -3.967359e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.747096e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -8.304144e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -3.903198e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.134906e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.175035e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -7.266224e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.892931e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 5.604706e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 9.070459e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 1.427010e-03, &tj, &tj1, &result, _state); return result; } /************************************************************************* Tail(S, 7, 13) *************************************************************************/ static double mannwhitneyu_utbln7n13(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); x = ae_minreal(2*(s-0.000000e+00)/3.500000e+00-1, 1.0, _state); tj = (double)(1); tj1 = x; mannwhitneyu_ucheb(x, -4.222204e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.532300e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.164642e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.523768e-01, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -8.531984e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -3.467857e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.483804e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -6.524136e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -3.077740e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.745218e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.602085e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.828831e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.994070e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.873879e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.341937e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -8.706444e-04, &tj, &tj1, &result, _state); return result; } /************************************************************************* Tail(S, 7, 14) *************************************************************************/ static double mannwhitneyu_utbln7n14(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); x = ae_minreal(2*(s-0.000000e+00)/3.500000e+00-1, 1.0, _state); tj = (double)(1); tj1 = x; mannwhitneyu_ucheb(x, -4.211763e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.507542e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.143640e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.395755e-01, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -7.808020e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -3.044259e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.182308e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.057325e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -5.724255e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 8.303900e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 1.113148e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 8.102514e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 3.559442e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 4.634986e-05, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -8.776476e-05, 
&tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 1.054489e-05, &tj, &tj1, &result, _state); return result; } /************************************************************************* Tail(S, 7, 15) *************************************************************************/ static double mannwhitneyu_utbln7n15(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); x = ae_minreal(2*(s-0.000000e+00)/3.500000e+00-1, 1.0, _state); tj = (double)(1); tj1 = x; mannwhitneyu_ucheb(x, -4.204898e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.489960e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.129172e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.316741e-01, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -7.506107e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.983676e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.258013e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -5.262515e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.984156e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -3.912108e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 8.974023e-05, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 6.056195e-05, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.090842e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -5.232620e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -5.816339e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -7.020421e-04, &tj, &tj1, &result, _state); return result; } /************************************************************************* Tail(S, 7, 30) *************************************************************************/ static double mannwhitneyu_utbln7n30(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); x = ae_minreal(2*(s-0.000000e+00)/3.500000e+00-1, 1.0, _state); tj = (double)(1); tj1 = x; mannwhitneyu_ucheb(x, -4.176536e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.398705e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.045481e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.821982e-01, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.962304e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.698132e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -6.062667e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.282353e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -8.014836e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.035683e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.004137e-05, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 3.801453e-06, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.920705e-05, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.518735e-05, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.821501e-05, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.801008e-05, &tj, &tj1, &result, _state); return result; } /************************************************************************* Tail(S, 7, 100) *************************************************************************/ static double mannwhitneyu_utbln7n100(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); x = ae_minreal(2*(s-0.000000e+00)/3.500000e+00-1, 1.0, _state); tj = (double)(1); tj1 = x; mannwhitneyu_ucheb(x, -4.188337e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.386949e+00, &tj, &tj1, 
&result, _state); mannwhitneyu_ucheb(x, -1.022834e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.686517e-01, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.323516e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.399392e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.644333e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.617044e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -5.031396e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -8.792066e-05, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 2.675457e-05, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 1.673416e-05, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -6.258552e-06, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -8.174214e-06, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -3.073644e-06, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.349958e-06, &tj, &tj1, &result, _state); return result; } /************************************************************************* Tail(S, 8, 8) *************************************************************************/ static double mannwhitneyu_utbln8n8(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); x = ae_minreal(2*(s-0.000000e+00)/3.360672e+00-1, 1.0, _state); tj = (double)(1); tj1 = x; mannwhitneyu_ucheb(x, -3.940217e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.168913e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.051485e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.195325e-01, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -6.775196e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.385506e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -8.244902e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.525632e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 2.771275e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 2.332874e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 4.079599e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 4.882551e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 4.407944e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 2.769844e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 1.062433e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 5.872535e-05, &tj, &tj1, &result, _state); return result; } /************************************************************************* Tail(S, 8, 9) *************************************************************************/ static double mannwhitneyu_utbln8n9(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); x = ae_minreal(2*(s-0.000000e+00)/3.464102e+00-1, 1.0, _state); tj = (double)(1); tj1 = x; mannwhitneyu_ucheb(x, -4.147004e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.446939e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.146155e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.488561e-01, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -8.144561e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -3.116917e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.205667e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.515661e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -7.618616e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 1.599011e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 3.457324e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 
4.482917e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 4.488267e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 3.469823e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 1.957591e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 8.058326e-04, &tj, &tj1, &result, _state); return result; } /************************************************************************* Tail(S, 8, 10) *************************************************************************/ static double mannwhitneyu_utbln8n10(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); x = ae_minreal(2*(s-0.000000e+00)/3.554093e+00-1, 1.0, _state); tj = (double)(1); tj1 = x; mannwhitneyu_ucheb(x, -4.334282e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.700860e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.235253e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.778489e-01, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -9.527324e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -3.862885e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.589781e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -6.507355e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.717526e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 9.215726e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 2.848696e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 3.918854e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 4.219614e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 3.753761e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 2.573688e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 1.602177e-03, &tj, &tj1, &result, _state); return result; } /************************************************************************* Tail(S, 8, 11) *************************************************************************/ static double mannwhitneyu_utbln8n11(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); x = ae_minreal(2*(s-0.000000e+00)/3.600000e+00-1, 1.0, _state); tj = (double)(1); tj1 = x; mannwhitneyu_ucheb(x, -4.421882e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.812457e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.266153e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.849344e-01, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -9.971527e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.258944e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.944820e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -9.894685e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -5.031836e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.514330e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -6.351660e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 6.206748e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 1.492600e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 2.005338e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 1.780099e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 1.673599e-03, &tj, &tj1, &result, _state); return result; } /************************************************************************* Tail(S, 8, 12) *************************************************************************/ static double mannwhitneyu_utbln8n12(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); x = 
ae_minreal(2*(s-0.000000e+00)/3.600000e+00-1, 1.0, _state); tj = (double)(1); tj1 = x; mannwhitneyu_ucheb(x, -4.398211e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.762214e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.226296e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.603837e-01, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -8.643223e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -3.502438e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.544574e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -7.647734e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.442259e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -3.011484e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.384758e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.998259e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.659985e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.331046e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -8.638478e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -6.056785e-04, &tj, &tj1, &result, _state); return result; } /************************************************************************* Tail(S, 8, 13) *************************************************************************/ static double mannwhitneyu_utbln8n13(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); x = ae_minreal(2*(s-0.000000e+00)/3.600000e+00-1, 1.0, _state); tj = (double)(1); tj1 = x; mannwhitneyu_ucheb(x, -4.380670e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.724511e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.195851e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.420511e-01, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -7.609928e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.893999e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.115919e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.291410e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.339664e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.801548e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 2.534710e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 2.793250e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 1.806718e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 1.384624e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 1.120582e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 2.936453e-04, &tj, &tj1, &result, _state); return result; } /************************************************************************* Tail(S, 8, 14) *************************************************************************/ static double mannwhitneyu_utbln8n14(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); x = ae_minreal(2*(s-0.000000e+00)/3.600000e+00-1, 1.0, _state); tj = (double)(1); tj1 = x; mannwhitneyu_ucheb(x, -4.368494e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.697171e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.174440e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.300621e-01, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -7.087393e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.685826e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.085254e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.525658e-03, &tj, &tj1, &result, _state); 
mannwhitneyu_ucheb(x, -1.966647e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -7.453388e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -3.826066e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -3.501958e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -5.336297e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -8.251972e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -8.118456e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -9.415959e-04, &tj, &tj1, &result, _state); return result; } /************************************************************************* Tail(S, 8, 15) *************************************************************************/ static double mannwhitneyu_utbln8n15(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); x = ae_minreal(2*(s-0.000000e+00)/3.600000e+00-1, 1.0, _state); tj = (double)(1); tj1 = x; mannwhitneyu_ucheb(x, -4.358397e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.674485e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.155941e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.195780e-01, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -6.544830e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.426183e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -9.309902e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -3.650956e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.068874e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 1.538544e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 8.192525e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 1.073905e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 1.079673e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 9.423572e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 6.579647e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 4.765904e-04, &tj, &tj1, &result, _state); return result; } /************************************************************************* Tail(S, 8, 30) *************************************************************************/ static double mannwhitneyu_utbln8n30(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); x = ae_minreal(2*(s-0.000000e+00)/3.600000e+00-1, 1.0, _state); tj = (double)(1); tj1 = x; mannwhitneyu_ucheb(x, -4.318823e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.567159e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.064864e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.688413e-01, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.153712e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.309389e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.226861e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.523815e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -5.780987e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.166866e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -6.922431e-05, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.466397e-05, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -5.690036e-06, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.008185e-05, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -9.271903e-06, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -7.534751e-06, &tj, &tj1, &result, _state); return result; } /************************************************************************* Tail(S, 
8, 100) *************************************************************************/ static double mannwhitneyu_utbln8n100(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); x = ae_minreal(2*(s-0.000000e+00)/3.600000e+00-1, 1.0, _state); tj = (double)(1); tj1 = x; mannwhitneyu_ucheb(x, -4.324531e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.547071e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.038129e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.541549e-01, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -3.525605e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.044992e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -3.085713e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.017871e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -3.459226e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.092064e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.024349e-05, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 7.366347e-06, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 6.385637e-06, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 8.321722e-08, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.439286e-06, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -3.058079e-07, &tj, &tj1, &result, _state); return result; } /************************************************************************* Tail(S, 9, 9) *************************************************************************/ static double mannwhitneyu_utbln9n9(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); x = ae_minreal(2*(s-0.000000e+00)/3.576237e+00-1, 1.0, _state); tj = (double)(1); tj1 = x; mannwhitneyu_ucheb(x, -4.372857e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.750859e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.248233e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.792868e-01, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -9.559372e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -3.894941e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.643256e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -7.091370e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.285034e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 6.112997e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 2.806229e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 4.150741e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 4.509825e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 3.891051e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 2.485013e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 1.343653e-03, &tj, &tj1, &result, _state); return result; } /************************************************************************* Tail(S, 9, 10) *************************************************************************/ static double mannwhitneyu_utbln9n10(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); x = ae_minreal(2*(s-0.000000e+00)/3.650000e+00-1, 1.0, _state); tj = (double)(1); tj1 = x; mannwhitneyu_ucheb(x, -4.516726e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.939333e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.305046e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.935326e-01, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.029141e-01, &tj, &tj1, &result, 
_state); mannwhitneyu_ucheb(x, -4.420592e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.053140e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.065930e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -5.523581e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.544888e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.813741e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 1.510631e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 2.536057e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 2.833815e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 2.189692e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 1.615050e-03, &tj, &tj1, &result, _state); return result; } /************************************************************************* Tail(S, 9, 11) *************************************************************************/ static double mannwhitneyu_utbln9n11(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); x = ae_minreal(2*(s-0.000000e+00)/3.650000e+00-1, 1.0, _state); tj = (double)(1); tj1 = x; mannwhitneyu_ucheb(x, -4.481308e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.867483e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.249072e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.591790e-01, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -8.400128e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -3.341992e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.463680e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -7.487211e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.671196e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -3.343472e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.544146e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.802335e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.117084e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -6.217443e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.858766e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -3.193687e-04, &tj, &tj1, &result, _state); return result; } /************************************************************************* Tail(S, 9, 12) *************************************************************************/ static double mannwhitneyu_utbln9n12(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); x = ae_minreal(2*(s-0.000000e+00)/3.650000e+00-1, 1.0, _state); tj = (double)(1); tj1 = x; mannwhitneyu_ucheb(x, -4.456776e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.817037e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.209788e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.362108e-01, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -7.171356e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.661557e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.026141e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.361908e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.093885e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.298389e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -9.663603e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -7.768522e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -5.579015e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.868677e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 
-7.440652e-05, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 1.523037e-04, &tj, &tj1, &result, _state); return result; } /************************************************************************* Tail(S, 9, 13) *************************************************************************/ static double mannwhitneyu_utbln9n13(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); x = ae_minreal(2*(s-0.000000e+00)/3.650000e+00-1, 1.0, _state); tj = (double)(1); tj1 = x; mannwhitneyu_ucheb(x, -4.438840e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.779308e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.180614e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.196489e-01, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -6.346621e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.234857e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -7.796211e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.575715e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -5.525647e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 1.964651e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 4.275235e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 4.299124e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 3.397416e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 2.295781e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 1.237619e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 7.269692e-05, &tj, &tj1, &result, _state); return result; } /************************************************************************* Tail(S, 9, 14) *************************************************************************/ static double mannwhitneyu_utbln9n14(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); x = ae_minreal(2*(s-0.000000e+00)/3.650000e+00-1, 1.0, _state); tj = (double)(1); tj1 = x; mannwhitneyu_ucheb(x, -4.425981e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.751545e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.159543e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.086570e-01, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -5.917446e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.120112e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -8.175519e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -3.515473e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.727772e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -9.070629e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -5.677569e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -3.876953e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -3.233502e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -3.508182e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -3.120389e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -3.847212e-04, &tj, &tj1, &result, _state); return result; } /************************************************************************* Tail(S, 9, 15) *************************************************************************/ static double mannwhitneyu_utbln9n15(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); x = ae_minreal(2*(s-0.000000e+00)/3.650000e+00-1, 1.0, _state); tj = (double)(1); tj1 = x; mannwhitneyu_ucheb(x, -4.414952e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.727612e+00, &tj, 
&tj1, &result, _state); mannwhitneyu_ucheb(x, -1.140634e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.981231e-01, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -5.382635e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.853575e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -6.571051e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.567625e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -9.214197e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.448700e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 1.712669e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 4.015050e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 5.438610e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 6.301363e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 5.309386e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 5.164772e-04, &tj, &tj1, &result, _state); return result; } /************************************************************************* Tail(S, 9, 30) *************************************************************************/ static double mannwhitneyu_utbln9n30(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); x = ae_minreal(2*(s-0.000000e+00)/3.650000e+00-1, 1.0, _state); tj = (double)(1); tj1 = x; mannwhitneyu_ucheb(x, -4.370720e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.615712e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.050023e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.504775e-01, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -3.318265e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -9.646826e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.741492e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -8.735360e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.966911e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.100738e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.348991e-05, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.527687e-05, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.917286e-06, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 3.397466e-07, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.360175e-07, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -9.892252e-07, &tj, &tj1, &result, _state); return result; } /************************************************************************* Tail(S, 9, 100) *************************************************************************/ static double mannwhitneyu_utbln9n100(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); x = ae_minreal(2*(s-0.000000e+00)/3.650000e+00-1, 1.0, _state); tj = (double)(1); tj1 = x; mannwhitneyu_ucheb(x, -4.372506e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.590966e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.021758e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.359849e-01, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.755519e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -7.533166e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.936659e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -5.634913e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.730053e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -5.791845e-05, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.030682e-05, &tj, &tj1, &result, _state); 
mannwhitneyu_ucheb(x, -5.228663e-06, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 8.631175e-07, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 1.636749e-06, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 4.404599e-07, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.789872e-07, &tj, &tj1, &result, _state); return result; } /************************************************************************* Tail(S, 10, 10) *************************************************************************/ static double mannwhitneyu_utbln10n10(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); x = ae_minreal(2*(s-0.000000e+00)/3.650000e+00-1, 1.0, _state); tj = (double)(1); tj1 = x; mannwhitneyu_ucheb(x, -4.468831e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.844398e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.231728e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.486073e-01, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -7.781321e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.971425e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.215371e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -5.828451e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -3.419872e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.430165e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.740363e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.049211e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -3.269371e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 2.211393e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 4.232314e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 3.016081e-04, &tj, &tj1, &result, _state); return result; } /************************************************************************* Tail(S, 10, 11) *************************************************************************/ static double mannwhitneyu_utbln10n11(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); x = ae_minreal(2*(s-0.000000e+00)/3.650000e+00-1, 1.0, _state); tj = (double)(1); tj1 = x; mannwhitneyu_ucheb(x, -4.437998e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.782296e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.184732e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.219585e-01, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -6.457012e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.296008e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -8.481501e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -3.527940e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.953426e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.563840e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.574403e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.535775e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.338037e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.002654e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -5.852676e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -3.318132e-04, &tj, &tj1, &result, _state); return result; } /************************************************************************* Tail(S, 10, 12) *************************************************************************/ static double mannwhitneyu_utbln10n12(double s, ae_state *_state) { double x; double tj; double tj1; double 
result; result = (double)(0); x = ae_minreal(2*(s-0.000000e+00)/3.650000e+00-1, 1.0, _state); tj = (double)(1); tj1 = x; mannwhitneyu_ucheb(x, -4.416082e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.737458e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.150952e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.036884e-01, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -5.609030e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.908684e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -6.439666e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.162647e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -6.451601e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.148757e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.803981e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.731621e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -3.346903e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -3.013151e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.956148e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.438381e-05, &tj, &tj1, &result, _state); return result; } /************************************************************************* Tail(S, 10, 13) *************************************************************************/ static double mannwhitneyu_utbln10n13(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); x = ae_minreal(2*(s-0.000000e+00)/3.650000e+00-1, 1.0, _state); tj = (double)(1); tj1 = x; mannwhitneyu_ucheb(x, -4.399480e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.702863e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.124829e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.897428e-01, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.979802e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.634368e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -5.180461e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.484926e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -7.864376e-05, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 4.186576e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 5.886925e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 5.836828e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 5.074756e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 4.209547e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 2.883266e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 2.380143e-04, &tj, &tj1, &result, _state); return result; } /************************************************************************* Tail(S, 10, 14) *************************************************************************/ static double mannwhitneyu_utbln10n14(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); x = ae_minreal(2*(s-0.000000e+00)/3.650000e+00-1, 1.0, _state); tj = (double)(1); tj1 = x; mannwhitneyu_ucheb(x, -4.386924e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.676124e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.104740e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.793826e-01, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.558886e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.492462e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -5.052903e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.917782e-03, &tj, 
&tj1, &result, _state); mannwhitneyu_ucheb(x, -7.878696e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -3.576046e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.764551e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -9.288778e-05, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.757658e-05, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.299101e-05, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -9.265197e-06, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.384503e-07, &tj, &tj1, &result, _state); return result; } /************************************************************************* Tail(S, 10, 15) *************************************************************************/ static double mannwhitneyu_utbln10n15(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); x = ae_minreal(2*(s-0.000000e+00)/3.650000e+00-1, 1.0, _state); tj = (double)(1); tj1 = x; mannwhitneyu_ucheb(x, -4.376846e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.654247e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.088083e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.705945e-01, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.169677e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.317213e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.264836e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.548024e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -6.633910e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -3.505621e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.658588e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.320254e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.175277e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.122317e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.675688e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.661363e-04, &tj, &tj1, &result, _state); return result; } /************************************************************************* Tail(S, 10, 30) *************************************************************************/ static double mannwhitneyu_utbln10n30(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); x = ae_minreal(2*(s-0.000000e+00)/3.650000e+00-1, 1.0, _state); tj = (double)(1); tj1 = x; mannwhitneyu_ucheb(x, -4.333977e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.548099e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.004444e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.291014e-01, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.523674e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -6.828211e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.716917e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.894256e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.433371e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.522675e-05, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.764192e-05, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -9.140235e-06, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -5.629230e-06, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -3.541895e-06, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.944946e-06, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.726360e-06, &tj, &tj1, &result, _state); return result; } 
/************************************************************************* Tail(S, 10, 100) *************************************************************************/ static double mannwhitneyu_utbln10n100(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); x = ae_minreal(2*(s-0.000000e+00)/3.650000e+00-1, 1.0, _state); tj = (double)(1); tj1 = x; mannwhitneyu_ucheb(x, -4.334008e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.522316e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -9.769627e-01, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.158110e-01, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.053650e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -5.242235e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.173571e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -3.033661e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -7.824732e-05, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.084420e-05, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -6.610036e-06, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.728155e-06, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.217130e-06, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.340966e-07, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 2.001235e-07, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 1.694052e-07, &tj, &tj1, &result, _state); return result; } /************************************************************************* Tail(S, 11, 11) *************************************************************************/ static double mannwhitneyu_utbln11n11(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); x = ae_minreal(2*(s-0.000000e+00)/3.700000e+00-1, 1.0, _state); tj = (double)(1); tj1 = x; mannwhitneyu_ucheb(x, -4.519760e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.880694e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.200698e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.174092e-01, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -6.072304e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.054773e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -6.506613e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.813942e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.223644e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 2.417416e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 2.499166e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 1.194332e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 7.369096e-05, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 1.968590e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 2.630532e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 5.061000e-04, &tj, &tj1, &result, _state); return result; } /************************************************************************* Tail(S, 11, 12) *************************************************************************/ static double mannwhitneyu_utbln11n12(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); x = ae_minreal(2*(s-0.000000e+00)/3.700000e+00-1, 1.0, _state); tj = (double)(1); tj1 = x; mannwhitneyu_ucheb(x, -4.495790e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.832622e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.165420e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 
-1.987306e-01, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -5.265621e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.723537e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -5.347406e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.353464e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 6.613369e-05, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 5.102522e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 5.237709e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 3.665652e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 1.626903e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.167518e-05, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -8.564455e-05, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.047320e-04, &tj, &tj1, &result, _state); return result; } /************************************************************************* Tail(S, 11, 13) *************************************************************************/ static double mannwhitneyu_utbln11n13(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); x = ae_minreal(2*(s-0.000000e+00)/3.700000e+00-1, 1.0, _state); tj = (double)(1); tj1 = x; mannwhitneyu_ucheb(x, -4.477880e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.796242e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.138769e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.851739e-01, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.722104e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.548304e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -5.176683e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.817895e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -5.842451e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -8.935870e-05, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 8.421777e-05, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 1.238831e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 8.867026e-05, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 1.458255e-05, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -3.306259e-05, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -8.961487e-05, &tj, &tj1, &result, _state); return result; } /************************************************************************* Tail(S, 11, 14) *************************************************************************/ static double mannwhitneyu_utbln11n14(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); x = ae_minreal(2*(s-0.000000e+00)/3.700000e+00-1, 1.0, _state); tj = (double)(1); tj1 = x; mannwhitneyu_ucheb(x, -4.463683e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.766969e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.117082e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.739574e-01, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.238865e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.350306e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.425871e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.640172e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -6.660633e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.879883e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.349658e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -6.271795e-05, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -3.304544e-05, &tj, &tj1, 
&result, _state); mannwhitneyu_ucheb(x, -3.024201e-05, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.816867e-05, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.596787e-05, &tj, &tj1, &result, _state); return result; } /************************************************************************* Tail(S, 11, 15) *************************************************************************/ static double mannwhitneyu_utbln11n15(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); x = ae_minreal(2*(s-0.000000e+00)/3.700000e+00-1, 1.0, _state); tj = (double)(1); tj1 = x; mannwhitneyu_ucheb(x, -4.452526e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.743570e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.099705e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.650612e-01, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -3.858285e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.187036e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -3.689241e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.294360e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -5.072623e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.278008e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.322382e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -9.131558e-05, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -7.305669e-05, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -6.825627e-05, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -5.332689e-05, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -6.120973e-05, &tj, &tj1, &result, _state); return result; } /************************************************************************* Tail(S, 11, 30) *************************************************************************/ static double mannwhitneyu_utbln11n30(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); x = ae_minreal(2*(s-0.000000e+00)/3.700000e+00-1, 1.0, _state); tj = (double)(1); tj1 = x; mannwhitneyu_ucheb(x, -4.402621e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.627440e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.011333e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.224126e-01, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.232856e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -5.859347e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.377381e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -3.756709e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.033230e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.875472e-05, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -8.608399e-06, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -3.102943e-06, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.740693e-06, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.343139e-06, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -9.196878e-07, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -6.658062e-07, &tj, &tj1, &result, _state); return result; } /************************************************************************* Tail(S, 11, 100) *************************************************************************/ static double mannwhitneyu_utbln11n100(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); x = ae_minreal(2*(s-0.000000e+00)/3.700000e+00-1, 1.0, _state); tj = 
(double)(1); tj1 = x; mannwhitneyu_ucheb(x, -4.398795e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.596486e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -9.814761e-01, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.085187e-01, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.766529e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.379425e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -8.986351e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.214705e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -5.360075e-05, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.260869e-05, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -3.033307e-06, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -7.727087e-07, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -3.393883e-07, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.242989e-07, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.111928e-07, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 3.898823e-09, &tj, &tj1, &result, _state); return result; } /************************************************************************* Tail(S, 12, 12) *************************************************************************/ static double mannwhitneyu_utbln12n12(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); x = ae_minreal(2*(s-0.000000e+00)/3.700000e+00-1, 1.0, _state); tj = (double)(1); tj1 = x; mannwhitneyu_ucheb(x, -4.472616e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.786627e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.132099e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.817523e-01, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.570179e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.479511e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.799492e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.565350e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -3.530139e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 1.380132e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 3.242761e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 3.576269e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 3.018771e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 1.933911e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 9.002799e-05, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.022048e-06, &tj, &tj1, &result, _state); return result; } /************************************************************************* Tail(S, 12, 13) *************************************************************************/ static double mannwhitneyu_utbln12n13(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); x = ae_minreal(2*(s-0.000000e+00)/3.700000e+00-1, 1.0, _state); tj = (double)(1); tj1 = x; mannwhitneyu_ucheb(x, -4.454800e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.750794e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.105988e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.684754e-01, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.011826e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.262579e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.044492e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.478741e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -5.322165e-04, &tj, &tj1, &result, _state); 
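/*
 * Clarifying note (descriptive comment added to this listing; the helper
 * mannwhitneyu_ucheb is defined earlier in the file and is not reproduced here).
 * Each Tail(S, n1, n2) routine evaluates a truncated Chebyshev series for the
 * log-tail approximation: the argument is first scaled to x = min(2*S/Smax - 1, 1),
 * with Smax equal to 3.70 or 3.75 depending on the table, and every call
 * mannwhitneyu_ucheb(x, c, &tj, &tj1, &result, _state) is assumed to add c*T_j(x)
 * to the running result and advance the three-term recurrence
 * T_{j+1}(x) = 2*x*T_j(x) - T_{j-1}(x) through the (tj, tj1) pair.  The long
 * coefficient lists are therefore just the series coefficients c_0, c_1, ...
 * fitted for each (n1, n2) pair.
 */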
mannwhitneyu_ucheb(x, -1.621104e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 4.068753e-05, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 1.468396e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 2.056235e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 2.327375e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 1.914877e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 1.784191e-04, &tj, &tj1, &result, _state); return result; } /************************************************************************* Tail(S, 12, 14) *************************************************************************/ static double mannwhitneyu_utbln12n14(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); x = ae_minreal(2*(s-0.000000e+00)/3.700000e+00-1, 1.0, _state); tj = (double)(1); tj1 = x; mannwhitneyu_ucheb(x, -4.440910e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.722404e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.085254e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.579439e-01, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -3.563738e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.066730e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -3.129346e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.014531e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -3.129679e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -8.000909e-05, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 1.996174e-05, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 6.377924e-05, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 8.936304e-05, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 1.051098e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 9.025820e-05, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 8.730585e-05, &tj, &tj1, &result, _state); return result; } /************************************************************************* Tail(S, 12, 15) *************************************************************************/ static double mannwhitneyu_utbln12n15(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); x = ae_minreal(2*(s-0.000000e+00)/3.700000e+00-1, 1.0, _state); tj = (double)(1); tj1 = x; mannwhitneyu_ucheb(x, -4.430123e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.700008e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.068971e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.499725e-01, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -3.250897e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -9.473145e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.680008e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -8.483350e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.766992e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -9.891081e-05, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.015140e-05, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.977756e-05, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -8.707414e-06, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 1.114786e-06, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 6.238865e-06, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 1.381445e-05, &tj, &tj1, &result, _state); return result; } /************************************************************************* Tail(S, 12, 30) 
*************************************************************************/ static double mannwhitneyu_utbln12n30(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); x = ae_minreal(2*(s-0.000000e+00)/3.700000e+00-1, 1.0, _state); tj = (double)(1); tj1 = x; mannwhitneyu_ucheb(x, -4.380023e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.585782e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -9.838583e-01, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.103394e-01, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.834015e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.635212e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -9.948212e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.574169e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -6.747980e-05, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.833672e-05, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -5.722433e-06, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.181038e-06, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.206473e-06, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -9.716003e-07, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -7.476434e-07, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -7.217700e-07, &tj, &tj1, &result, _state); return result; } /************************************************************************* Tail(S, 12, 100) *************************************************************************/ static double mannwhitneyu_utbln12n100(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); x = ae_minreal(2*(s-0.000000e+00)/3.700000e+00-1, 1.0, _state); tj = (double)(1); tj1 = x; mannwhitneyu_ucheb(x, -4.374567e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.553481e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -9.541334e-01, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -9.701907e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.414757e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -3.404103e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -6.234388e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.453762e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -3.311060e-05, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -7.317501e-06, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.713888e-06, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -3.309583e-07, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.019804e-08, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 1.224829e-09, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.349019e-08, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.893302e-08, &tj, &tj1, &result, _state); return result; } /************************************************************************* Tail(S, 13, 13) *************************************************************************/ static double mannwhitneyu_utbln13n13(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); x = ae_minreal(2*(s-0.000000e+00)/3.750000e+00-1, 1.0, _state); tj = (double)(1); tj1 = x; mannwhitneyu_ucheb(x, -4.541046e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.859047e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.130164e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.689719e-01, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -3.950693e-02, &tj, &tj1, 
&result, _state); mannwhitneyu_ucheb(x, -1.231455e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -3.976550e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.538455e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -7.245603e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.142647e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.831434e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.032483e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.488405e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.156927e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -7.949279e-05, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -7.532700e-05, &tj, &tj1, &result, _state); return result; } /************************************************************************* Tail(S, 13, 14) *************************************************************************/ static double mannwhitneyu_utbln13n14(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); x = ae_minreal(2*(s-0.000000e+00)/3.750000e+00-1, 1.0, _state); tj = (double)(1); tj1 = x; mannwhitneyu_ucheb(x, -4.525655e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.828341e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.108110e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.579552e-01, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -3.488307e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.032328e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.988741e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -9.766394e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -3.388950e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.338179e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -6.133440e-05, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -3.023518e-05, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.110570e-05, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 4.202332e-06, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 1.056132e-05, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 1.536323e-05, &tj, &tj1, &result, _state); return result; } /************************************************************************* Tail(S, 13, 15) *************************************************************************/ static double mannwhitneyu_utbln13n15(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); x = ae_minreal(2*(s-0.000000e+00)/3.750000e+00-1, 1.0, _state); tj = (double)(1); tj1 = x; mannwhitneyu_ucheb(x, -4.513585e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.803952e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.090686e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.495310e-01, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -3.160314e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -9.073124e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.480313e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -7.478239e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.140914e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -5.311541e-05, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.677105e-06, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 1.115464e-05, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 1.578563e-05, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 2.044604e-05, &tj, &tj1, &result, _state); 
mannwhitneyu_ucheb(x, 1.888939e-05, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 2.395644e-05, &tj, &tj1, &result, _state); return result; } /************************************************************************* Tail(S, 13, 30) *************************************************************************/ static double mannwhitneyu_utbln13n30(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); x = ae_minreal(2*(s-0.000000e+00)/3.750000e+00-1, 1.0, _state); tj = (double)(1); tj1 = x; mannwhitneyu_ucheb(x, -4.455999e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.678434e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -9.995491e-01, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.078100e-01, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.705220e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.258739e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -8.671526e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.185458e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -5.507764e-05, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.411446e-05, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.044355e-06, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.285765e-06, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -5.345282e-07, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -3.066940e-07, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.962037e-07, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.723644e-07, &tj, &tj1, &result, _state); return result; } /************************************************************************* Tail(S, 13, 100) *************************************************************************/ static double mannwhitneyu_utbln13n100(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); x = ae_minreal(2*(s-0.000000e+00)/3.750000e+00-1, 1.0, _state); tj = (double)(1); tj1 = x; mannwhitneyu_ucheb(x, -4.446787e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.640804e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -9.671552e-01, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -9.364990e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.274444e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -3.047440e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -5.161439e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.171729e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.562171e-05, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -5.359762e-06, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.275494e-06, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.747635e-07, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -5.700292e-08, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.565559e-09, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 5.005396e-09, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 3.335794e-09, &tj, &tj1, &result, _state); return result; } /************************************************************************* Tail(S, 14, 14) *************************************************************************/ static double mannwhitneyu_utbln14n14(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); x = ae_minreal(2*(s-0.000000e+00)/3.750000e+00-1, 1.0, _state); tj = (double)(1); tj1 = x; mannwhitneyu_ucheb(x, -4.510624e+00, &tj, &tj1, &result, _state); 
mannwhitneyu_ucheb(x, -4.798584e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.087107e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.478532e-01, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -3.098050e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -8.855986e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.409083e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -7.299536e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.176177e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -6.479417e-05, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.812761e-05, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -5.225872e-06, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 4.516521e-07, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 6.730551e-06, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 9.237563e-06, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 1.611820e-05, &tj, &tj1, &result, _state); return result; } /************************************************************************* Tail(S, 14, 15) *************************************************************************/ static double mannwhitneyu_utbln14n15(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); x = ae_minreal(2*(s-0.000000e+00)/3.750000e+00-1, 1.0, _state); tj = (double)(1); tj1 = x; mannwhitneyu_ucheb(x, -4.498681e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.774668e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.070267e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.399348e-01, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.807239e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -7.845763e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.071773e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -6.261698e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.011695e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -7.305946e-05, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -3.879295e-05, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.999439e-05, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.904438e-05, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.944986e-05, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.373908e-05, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.140794e-05, &tj, &tj1, &result, _state); return result; } /************************************************************************* Tail(S, 14, 30) *************************************************************************/ static double mannwhitneyu_utbln14n30(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); x = ae_minreal(2*(s-0.000000e+00)/3.750000e+00-1, 1.0, _state); tj = (double)(1); tj1 = x; mannwhitneyu_ucheb(x, -4.440378e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.649587e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -9.807829e-01, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -9.989753e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.463646e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -3.586580e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -6.745917e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.635398e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -3.923172e-05, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -9.446699e-06, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 
-2.613892e-06, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -8.214073e-07, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -3.651683e-07, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.272777e-07, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.464988e-07, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.109803e-07, &tj, &tj1, &result, _state); return result; } /************************************************************************* Tail(S, 14, 100) *************************************************************************/ static double mannwhitneyu_utbln14n100(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); x = ae_minreal(2*(s-0.000000e+00)/3.750000e+00-1, 1.0, _state); tj = (double)(1); tj1 = x; mannwhitneyu_ucheb(x, -4.429701e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -4.610577e+00, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -9.482675e-01, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -8.605550e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.062151e-02, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.525154e-03, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -3.835983e-04, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -8.411440e-05, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.744901e-05, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -3.318850e-06, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -7.692100e-07, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -1.536270e-07, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -3.705888e-08, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -7.999599e-09, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, -2.908395e-09, &tj, &tj1, &result, _state); mannwhitneyu_ucheb(x, 1.546923e-09, &tj, &tj1, &result, _state); return result; } /************************************************************************* Tail(S, N1, N2) *************************************************************************/ static double mannwhitneyu_usigma(double s, ae_int_t n1, ae_int_t n2, ae_state *_state) { double f0; double f1; double f2; double f3; double f4; double s0; double s1; double s2; double s3; double s4; double result; result = (double)(0); /* * N1=5, N2 = 5, 6, 7, ... 
*/ if( ae_minint(n1, n2, _state)==5 ) { if( ae_maxint(n1, n2, _state)==5 ) { result = mannwhitneyu_utbln5n5(s, _state); } if( ae_maxint(n1, n2, _state)==6 ) { result = mannwhitneyu_utbln5n6(s, _state); } if( ae_maxint(n1, n2, _state)==7 ) { result = mannwhitneyu_utbln5n7(s, _state); } if( ae_maxint(n1, n2, _state)==8 ) { result = mannwhitneyu_utbln5n8(s, _state); } if( ae_maxint(n1, n2, _state)==9 ) { result = mannwhitneyu_utbln5n9(s, _state); } if( ae_maxint(n1, n2, _state)==10 ) { result = mannwhitneyu_utbln5n10(s, _state); } if( ae_maxint(n1, n2, _state)==11 ) { result = mannwhitneyu_utbln5n11(s, _state); } if( ae_maxint(n1, n2, _state)==12 ) { result = mannwhitneyu_utbln5n12(s, _state); } if( ae_maxint(n1, n2, _state)==13 ) { result = mannwhitneyu_utbln5n13(s, _state); } if( ae_maxint(n1, n2, _state)==14 ) { result = mannwhitneyu_utbln5n14(s, _state); } if( ae_maxint(n1, n2, _state)==15 ) { result = mannwhitneyu_utbln5n15(s, _state); } if( ae_maxint(n1, n2, _state)==16 ) { result = mannwhitneyu_utbln5n16(s, _state); } if( ae_maxint(n1, n2, _state)==17 ) { result = mannwhitneyu_utbln5n17(s, _state); } if( ae_maxint(n1, n2, _state)==18 ) { result = mannwhitneyu_utbln5n18(s, _state); } if( ae_maxint(n1, n2, _state)==19 ) { result = mannwhitneyu_utbln5n19(s, _state); } if( ae_maxint(n1, n2, _state)==20 ) { result = mannwhitneyu_utbln5n20(s, _state); } if( ae_maxint(n1, n2, _state)==21 ) { result = mannwhitneyu_utbln5n21(s, _state); } if( ae_maxint(n1, n2, _state)==22 ) { result = mannwhitneyu_utbln5n22(s, _state); } if( ae_maxint(n1, n2, _state)==23 ) { result = mannwhitneyu_utbln5n23(s, _state); } if( ae_maxint(n1, n2, _state)==24 ) { result = mannwhitneyu_utbln5n24(s, _state); } if( ae_maxint(n1, n2, _state)==25 ) { result = mannwhitneyu_utbln5n25(s, _state); } if( ae_maxint(n1, n2, _state)==26 ) { result = mannwhitneyu_utbln5n26(s, _state); } if( ae_maxint(n1, n2, _state)==27 ) { result = mannwhitneyu_utbln5n27(s, _state); } if( ae_maxint(n1, n2, _state)==28 ) { result = mannwhitneyu_utbln5n28(s, _state); } if( ae_maxint(n1, n2, _state)==29 ) { result = mannwhitneyu_utbln5n29(s, _state); } if( ae_maxint(n1, n2, _state)>29 ) { f0 = mannwhitneyu_utbln5n15(s, _state); f1 = mannwhitneyu_utbln5n30(s, _state); f2 = mannwhitneyu_utbln5n100(s, _state); result = mannwhitneyu_uninterpolate(f0, f1, f2, ae_maxint(n1, n2, _state), _state); } return result; } /* * N1=6, N2 = 6, 7, 8, ... 
*/ if( ae_minint(n1, n2, _state)==6 ) { if( ae_maxint(n1, n2, _state)==6 ) { result = mannwhitneyu_utbln6n6(s, _state); } if( ae_maxint(n1, n2, _state)==7 ) { result = mannwhitneyu_utbln6n7(s, _state); } if( ae_maxint(n1, n2, _state)==8 ) { result = mannwhitneyu_utbln6n8(s, _state); } if( ae_maxint(n1, n2, _state)==9 ) { result = mannwhitneyu_utbln6n9(s, _state); } if( ae_maxint(n1, n2, _state)==10 ) { result = mannwhitneyu_utbln6n10(s, _state); } if( ae_maxint(n1, n2, _state)==11 ) { result = mannwhitneyu_utbln6n11(s, _state); } if( ae_maxint(n1, n2, _state)==12 ) { result = mannwhitneyu_utbln6n12(s, _state); } if( ae_maxint(n1, n2, _state)==13 ) { result = mannwhitneyu_utbln6n13(s, _state); } if( ae_maxint(n1, n2, _state)==14 ) { result = mannwhitneyu_utbln6n14(s, _state); } if( ae_maxint(n1, n2, _state)==15 ) { result = mannwhitneyu_utbln6n15(s, _state); } if( ae_maxint(n1, n2, _state)>15 ) { f0 = mannwhitneyu_utbln6n15(s, _state); f1 = mannwhitneyu_utbln6n30(s, _state); f2 = mannwhitneyu_utbln6n100(s, _state); result = mannwhitneyu_uninterpolate(f0, f1, f2, ae_maxint(n1, n2, _state), _state); } return result; } /* * N1=7, N2 = 7, 8, ... */ if( ae_minint(n1, n2, _state)==7 ) { if( ae_maxint(n1, n2, _state)==7 ) { result = mannwhitneyu_utbln7n7(s, _state); } if( ae_maxint(n1, n2, _state)==8 ) { result = mannwhitneyu_utbln7n8(s, _state); } if( ae_maxint(n1, n2, _state)==9 ) { result = mannwhitneyu_utbln7n9(s, _state); } if( ae_maxint(n1, n2, _state)==10 ) { result = mannwhitneyu_utbln7n10(s, _state); } if( ae_maxint(n1, n2, _state)==11 ) { result = mannwhitneyu_utbln7n11(s, _state); } if( ae_maxint(n1, n2, _state)==12 ) { result = mannwhitneyu_utbln7n12(s, _state); } if( ae_maxint(n1, n2, _state)==13 ) { result = mannwhitneyu_utbln7n13(s, _state); } if( ae_maxint(n1, n2, _state)==14 ) { result = mannwhitneyu_utbln7n14(s, _state); } if( ae_maxint(n1, n2, _state)==15 ) { result = mannwhitneyu_utbln7n15(s, _state); } if( ae_maxint(n1, n2, _state)>15 ) { f0 = mannwhitneyu_utbln7n15(s, _state); f1 = mannwhitneyu_utbln7n30(s, _state); f2 = mannwhitneyu_utbln7n100(s, _state); result = mannwhitneyu_uninterpolate(f0, f1, f2, ae_maxint(n1, n2, _state), _state); } return result; } /* * N1=8, N2 = 8, 9, 10, ... */ if( ae_minint(n1, n2, _state)==8 ) { if( ae_maxint(n1, n2, _state)==8 ) { result = mannwhitneyu_utbln8n8(s, _state); } if( ae_maxint(n1, n2, _state)==9 ) { result = mannwhitneyu_utbln8n9(s, _state); } if( ae_maxint(n1, n2, _state)==10 ) { result = mannwhitneyu_utbln8n10(s, _state); } if( ae_maxint(n1, n2, _state)==11 ) { result = mannwhitneyu_utbln8n11(s, _state); } if( ae_maxint(n1, n2, _state)==12 ) { result = mannwhitneyu_utbln8n12(s, _state); } if( ae_maxint(n1, n2, _state)==13 ) { result = mannwhitneyu_utbln8n13(s, _state); } if( ae_maxint(n1, n2, _state)==14 ) { result = mannwhitneyu_utbln8n14(s, _state); } if( ae_maxint(n1, n2, _state)==15 ) { result = mannwhitneyu_utbln8n15(s, _state); } if( ae_maxint(n1, n2, _state)>15 ) { f0 = mannwhitneyu_utbln8n15(s, _state); f1 = mannwhitneyu_utbln8n30(s, _state); f2 = mannwhitneyu_utbln8n100(s, _state); result = mannwhitneyu_uninterpolate(f0, f1, f2, ae_maxint(n1, n2, _state), _state); } return result; } /* * N1=9, N2 = 9, 10, ... 
*/ if( ae_minint(n1, n2, _state)==9 ) { if( ae_maxint(n1, n2, _state)==9 ) { result = mannwhitneyu_utbln9n9(s, _state); } if( ae_maxint(n1, n2, _state)==10 ) { result = mannwhitneyu_utbln9n10(s, _state); } if( ae_maxint(n1, n2, _state)==11 ) { result = mannwhitneyu_utbln9n11(s, _state); } if( ae_maxint(n1, n2, _state)==12 ) { result = mannwhitneyu_utbln9n12(s, _state); } if( ae_maxint(n1, n2, _state)==13 ) { result = mannwhitneyu_utbln9n13(s, _state); } if( ae_maxint(n1, n2, _state)==14 ) { result = mannwhitneyu_utbln9n14(s, _state); } if( ae_maxint(n1, n2, _state)==15 ) { result = mannwhitneyu_utbln9n15(s, _state); } if( ae_maxint(n1, n2, _state)>15 ) { f0 = mannwhitneyu_utbln9n15(s, _state); f1 = mannwhitneyu_utbln9n30(s, _state); f2 = mannwhitneyu_utbln9n100(s, _state); result = mannwhitneyu_uninterpolate(f0, f1, f2, ae_maxint(n1, n2, _state), _state); } return result; } /* * N1=10, N2 = 10, 11, ... */ if( ae_minint(n1, n2, _state)==10 ) { if( ae_maxint(n1, n2, _state)==10 ) { result = mannwhitneyu_utbln10n10(s, _state); } if( ae_maxint(n1, n2, _state)==11 ) { result = mannwhitneyu_utbln10n11(s, _state); } if( ae_maxint(n1, n2, _state)==12 ) { result = mannwhitneyu_utbln10n12(s, _state); } if( ae_maxint(n1, n2, _state)==13 ) { result = mannwhitneyu_utbln10n13(s, _state); } if( ae_maxint(n1, n2, _state)==14 ) { result = mannwhitneyu_utbln10n14(s, _state); } if( ae_maxint(n1, n2, _state)==15 ) { result = mannwhitneyu_utbln10n15(s, _state); } if( ae_maxint(n1, n2, _state)>15 ) { f0 = mannwhitneyu_utbln10n15(s, _state); f1 = mannwhitneyu_utbln10n30(s, _state); f2 = mannwhitneyu_utbln10n100(s, _state); result = mannwhitneyu_uninterpolate(f0, f1, f2, ae_maxint(n1, n2, _state), _state); } return result; } /* * N1=11, N2 = 11, 12, ... */ if( ae_minint(n1, n2, _state)==11 ) { if( ae_maxint(n1, n2, _state)==11 ) { result = mannwhitneyu_utbln11n11(s, _state); } if( ae_maxint(n1, n2, _state)==12 ) { result = mannwhitneyu_utbln11n12(s, _state); } if( ae_maxint(n1, n2, _state)==13 ) { result = mannwhitneyu_utbln11n13(s, _state); } if( ae_maxint(n1, n2, _state)==14 ) { result = mannwhitneyu_utbln11n14(s, _state); } if( ae_maxint(n1, n2, _state)==15 ) { result = mannwhitneyu_utbln11n15(s, _state); } if( ae_maxint(n1, n2, _state)>15 ) { f0 = mannwhitneyu_utbln11n15(s, _state); f1 = mannwhitneyu_utbln11n30(s, _state); f2 = mannwhitneyu_utbln11n100(s, _state); result = mannwhitneyu_uninterpolate(f0, f1, f2, ae_maxint(n1, n2, _state), _state); } return result; } /* * N1=12, N2 = 12, 13, ... */ if( ae_minint(n1, n2, _state)==12 ) { if( ae_maxint(n1, n2, _state)==12 ) { result = mannwhitneyu_utbln12n12(s, _state); } if( ae_maxint(n1, n2, _state)==13 ) { result = mannwhitneyu_utbln12n13(s, _state); } if( ae_maxint(n1, n2, _state)==14 ) { result = mannwhitneyu_utbln12n14(s, _state); } if( ae_maxint(n1, n2, _state)==15 ) { result = mannwhitneyu_utbln12n15(s, _state); } if( ae_maxint(n1, n2, _state)>15 ) { f0 = mannwhitneyu_utbln12n15(s, _state); f1 = mannwhitneyu_utbln12n30(s, _state); f2 = mannwhitneyu_utbln12n100(s, _state); result = mannwhitneyu_uninterpolate(f0, f1, f2, ae_maxint(n1, n2, _state), _state); } return result; } /* * N1=13, N2 = 13, 14, ... 
*/ if( ae_minint(n1, n2, _state)==13 ) { if( ae_maxint(n1, n2, _state)==13 ) { result = mannwhitneyu_utbln13n13(s, _state); } if( ae_maxint(n1, n2, _state)==14 ) { result = mannwhitneyu_utbln13n14(s, _state); } if( ae_maxint(n1, n2, _state)==15 ) { result = mannwhitneyu_utbln13n15(s, _state); } if( ae_maxint(n1, n2, _state)>15 ) { f0 = mannwhitneyu_utbln13n15(s, _state); f1 = mannwhitneyu_utbln13n30(s, _state); f2 = mannwhitneyu_utbln13n100(s, _state); result = mannwhitneyu_uninterpolate(f0, f1, f2, ae_maxint(n1, n2, _state), _state); } return result; } /* * N1=14, N2 = 14, 15, ... */ if( ae_minint(n1, n2, _state)==14 ) { if( ae_maxint(n1, n2, _state)==14 ) { result = mannwhitneyu_utbln14n14(s, _state); } if( ae_maxint(n1, n2, _state)==15 ) { result = mannwhitneyu_utbln14n15(s, _state); } if( ae_maxint(n1, n2, _state)>15 ) { f0 = mannwhitneyu_utbln14n15(s, _state); f1 = mannwhitneyu_utbln14n30(s, _state); f2 = mannwhitneyu_utbln14n100(s, _state); result = mannwhitneyu_uninterpolate(f0, f1, f2, ae_maxint(n1, n2, _state), _state); } return result; } /* * N1 >= 15, N2 >= 15 */ if( ae_fp_greater(s,(double)(4)) ) { s = (double)(4); } if( ae_fp_less(s,(double)(3)) ) { s0 = 0.000000e+00; f0 = mannwhitneyu_usigma000(n1, n2, _state); s1 = 7.500000e-01; f1 = mannwhitneyu_usigma075(n1, n2, _state); s2 = 1.500000e+00; f2 = mannwhitneyu_usigma150(n1, n2, _state); s3 = 2.250000e+00; f3 = mannwhitneyu_usigma225(n1, n2, _state); s4 = 3.000000e+00; f4 = mannwhitneyu_usigma300(n1, n2, _state); f1 = ((s-s0)*f1-(s-s1)*f0)/(s1-s0); f2 = ((s-s0)*f2-(s-s2)*f0)/(s2-s0); f3 = ((s-s0)*f3-(s-s3)*f0)/(s3-s0); f4 = ((s-s0)*f4-(s-s4)*f0)/(s4-s0); f2 = ((s-s1)*f2-(s-s2)*f1)/(s2-s1); f3 = ((s-s1)*f3-(s-s3)*f1)/(s3-s1); f4 = ((s-s1)*f4-(s-s4)*f1)/(s4-s1); f3 = ((s-s2)*f3-(s-s3)*f2)/(s3-s2); f4 = ((s-s2)*f4-(s-s4)*f2)/(s4-s2); f4 = ((s-s3)*f4-(s-s4)*f3)/(s4-s3); result = f4; } else { s0 = 3.000000e+00; f0 = mannwhitneyu_usigma300(n1, n2, _state); s1 = 3.333333e+00; f1 = mannwhitneyu_usigma333(n1, n2, _state); s2 = 3.666667e+00; f2 = mannwhitneyu_usigma367(n1, n2, _state); s3 = 4.000000e+00; f3 = mannwhitneyu_usigma400(n1, n2, _state); f1 = ((s-s0)*f1-(s-s1)*f0)/(s1-s0); f2 = ((s-s0)*f2-(s-s2)*f0)/(s2-s0); f3 = ((s-s0)*f3-(s-s3)*f0)/(s3-s0); f2 = ((s-s1)*f2-(s-s2)*f1)/(s2-s1); f3 = ((s-s1)*f3-(s-s3)*f1)/(s3-s1); f3 = ((s-s2)*f3-(s-s3)*f2)/(s3-s2); result = f3; } return result; } /************************************************************************* Jarque-Bera test This test checks hypotheses about the fact that a given sample X is a sample of normal random variable. Requirements: * the number of elements in the sample is not less than 5. Input parameters: X - sample. Array whose index goes from 0 to N-1. N - size of the sample. N>=5 Output parameters: P - p-value for the test Accuracy of the approximation used (5<=N<=1951): p-value relative error (5<=N<=1951) [1, 0.1] < 1% [0.1, 0.01] < 2% [0.01, 0.001] < 6% [0.001, 0] wasn't measured For N>1951 accuracy wasn't measured but it shouldn't be sharply different from table values. 
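Note added to this listing for clarity: the statistic computed by
jarquebera_jarqueberastatistic() below is the classic Jarque-Bera statistic

    JB = N/6 * ( skewness^2 + (excess kurtosis)^2 / 4 )

with sample skewness and excess kurtosis taken about the sample mean, and
jarquebera_jarqueberaapprox() converts JB into a p-value via the tabulated
small-sample approximations. A minimal call sketch against the C-core signature
shown below (x is assumed to be an already initialized ae_vector of length n,
and _state a valid ae_state; error handling omitted):

    double p;
    jarqueberatest(&x, n, &p, _state);

A value of P close to 1 gives no evidence against normality; a small P rejects it.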
-- ALGLIB -- Copyright 09.04.2007 by Bochkanov Sergey *************************************************************************/ void jarqueberatest(/* Real */ ae_vector* x, ae_int_t n, double* p, ae_state *_state) { double s; *p = 0; /* * N is too small */ if( n<5 ) { *p = 1.0; return; } /* * N is large enough */ jarquebera_jarqueberastatistic(x, n, &s, _state); *p = jarquebera_jarqueberaapprox(n, s, _state); } static void jarquebera_jarqueberastatistic(/* Real */ ae_vector* x, ae_int_t n, double* s, ae_state *_state) { ae_int_t i; double v; double v1; double v2; double stddev; double mean; double variance; double skewness; double kurtosis; *s = 0; mean = (double)(0); variance = (double)(0); skewness = (double)(0); kurtosis = (double)(0); stddev = (double)(0); ae_assert(n>1, "Assertion failed", _state); /* * Mean */ for(i=0; i<=n-1; i++) { mean = mean+x->ptr.p_double[i]; } mean = mean/n; /* * Variance (using corrected two-pass algorithm) */ if( n!=1 ) { v1 = (double)(0); for(i=0; i<=n-1; i++) { v1 = v1+ae_sqr(x->ptr.p_double[i]-mean, _state); } v2 = (double)(0); for(i=0; i<=n-1; i++) { v2 = v2+(x->ptr.p_double[i]-mean); } v2 = ae_sqr(v2, _state)/n; variance = (v1-v2)/(n-1); if( ae_fp_less(variance,(double)(0)) ) { variance = (double)(0); } stddev = ae_sqrt(variance, _state); } /* * Skewness and kurtosis */ if( ae_fp_neq(stddev,(double)(0)) ) { for(i=0; i<=n-1; i++) { v = (x->ptr.p_double[i]-mean)/stddev; v2 = ae_sqr(v, _state); skewness = skewness+v2*v; kurtosis = kurtosis+ae_sqr(v2, _state); } skewness = skewness/n; kurtosis = kurtosis/n-3; } /* * Statistic */ *s = (double)n/(double)6*(ae_sqr(skewness, _state)+ae_sqr(kurtosis, _state)/4); } static double jarquebera_jarqueberaapprox(ae_int_t n, double s, ae_state *_state) { ae_frame _frame_block; ae_vector vx; ae_vector vy; ae_matrix ctbl; double t1; double t2; double t3; double t; double f1; double f2; double f3; double f12; double f23; double x; double result; ae_frame_make(_state, &_frame_block); ae_vector_init(&vx, 0, DT_REAL, _state); ae_vector_init(&vy, 0, DT_REAL, _state); ae_matrix_init(&ctbl, 0, 0, DT_REAL, _state); result = (double)(1); x = s; if( n<5 ) { ae_frame_leave(_state); return result; } /* * N = 5..20 are tabulated */ if( n>=5&&n<=20 ) { if( n==5 ) { result = ae_exp(jarquebera_jbtbl5(x, _state), _state); } if( n==6 ) { result = ae_exp(jarquebera_jbtbl6(x, _state), _state); } if( n==7 ) { result = ae_exp(jarquebera_jbtbl7(x, _state), _state); } if( n==8 ) { result = ae_exp(jarquebera_jbtbl8(x, _state), _state); } if( n==9 ) { result = ae_exp(jarquebera_jbtbl9(x, _state), _state); } if( n==10 ) { result = ae_exp(jarquebera_jbtbl10(x, _state), _state); } if( n==11 ) { result = ae_exp(jarquebera_jbtbl11(x, _state), _state); } if( n==12 ) { result = ae_exp(jarquebera_jbtbl12(x, _state), _state); } if( n==13 ) { result = ae_exp(jarquebera_jbtbl13(x, _state), _state); } if( n==14 ) { result = ae_exp(jarquebera_jbtbl14(x, _state), _state); } if( n==15 ) { result = ae_exp(jarquebera_jbtbl15(x, _state), _state); } if( n==16 ) { result = ae_exp(jarquebera_jbtbl16(x, _state), _state); } if( n==17 ) { result = ae_exp(jarquebera_jbtbl17(x, _state), _state); } if( n==18 ) { result = ae_exp(jarquebera_jbtbl18(x, _state), _state); } if( n==19 ) { result = ae_exp(jarquebera_jbtbl19(x, _state), _state); } if( n==20 ) { result = ae_exp(jarquebera_jbtbl20(x, _state), _state); } ae_frame_leave(_state); return result; } /* * N = 20, 30, 50 are tabulated. 
* In-between values are interpolated * using interpolating polynomial of the second degree. */ if( n>20&&n<=50 ) { t1 = -1.0/20.0; t2 = -1.0/30.0; t3 = -1.0/50.0; t = -1.0/n; f1 = jarquebera_jbtbl20(x, _state); f2 = jarquebera_jbtbl30(x, _state); f3 = jarquebera_jbtbl50(x, _state); f12 = ((t-t2)*f1+(t1-t)*f2)/(t1-t2); f23 = ((t-t3)*f2+(t2-t)*f3)/(t2-t3); result = ((t-t3)*f12+(t1-t)*f23)/(t1-t3); if( ae_fp_greater(result,(double)(0)) ) { result = (double)(0); } result = ae_exp(result, _state); ae_frame_leave(_state); return result; } /* * N = 50, 65, 100 are tabulated. * In-between values are interpolated * using interpolating polynomial of the second degree. */ if( n>50&&n<=100 ) { t1 = -1.0/50.0; t2 = -1.0/65.0; t3 = -1.0/100.0; t = -1.0/n; f1 = jarquebera_jbtbl50(x, _state); f2 = jarquebera_jbtbl65(x, _state); f3 = jarquebera_jbtbl100(x, _state); f12 = ((t-t2)*f1+(t1-t)*f2)/(t1-t2); f23 = ((t-t3)*f2+(t2-t)*f3)/(t2-t3); result = ((t-t3)*f12+(t1-t)*f23)/(t1-t3); if( ae_fp_greater(result,(double)(0)) ) { result = (double)(0); } result = ae_exp(result, _state); ae_frame_leave(_state); return result; } /* * N = 100, 130, 200 are tabulated. * In-between values are interpolated * using interpolating polynomial of the second degree. */ if( n>100&&n<=200 ) { t1 = -1.0/100.0; t2 = -1.0/130.0; t3 = -1.0/200.0; t = -1.0/n; f1 = jarquebera_jbtbl100(x, _state); f2 = jarquebera_jbtbl130(x, _state); f3 = jarquebera_jbtbl200(x, _state); f12 = ((t-t2)*f1+(t1-t)*f2)/(t1-t2); f23 = ((t-t3)*f2+(t2-t)*f3)/(t2-t3); result = ((t-t3)*f12+(t1-t)*f23)/(t1-t3); if( ae_fp_greater(result,(double)(0)) ) { result = (double)(0); } result = ae_exp(result, _state); ae_frame_leave(_state); return result; } /* * N = 200, 301, 501 are tabulated. * In-between values are interpolated * using interpolating polynomial of the second degree. */ if( n>200&&n<=501 ) { t1 = -1.0/200.0; t2 = -1.0/301.0; t3 = -1.0/501.0; t = -1.0/n; f1 = jarquebera_jbtbl200(x, _state); f2 = jarquebera_jbtbl301(x, _state); f3 = jarquebera_jbtbl501(x, _state); f12 = ((t-t2)*f1+(t1-t)*f2)/(t1-t2); f23 = ((t-t3)*f2+(t2-t)*f3)/(t2-t3); result = ((t-t3)*f12+(t1-t)*f23)/(t1-t3); if( ae_fp_greater(result,(double)(0)) ) { result = (double)(0); } result = ae_exp(result, _state); ae_frame_leave(_state); return result; } /* * N = 501, 701, 1401 are tabulated. * In-between values are interpolated * using interpolating polynomial of the second degree. 
*/ if( n>501&&n<=1401 ) { t1 = -1.0/501.0; t2 = -1.0/701.0; t3 = -1.0/1401.0; t = -1.0/n; f1 = jarquebera_jbtbl501(x, _state); f2 = jarquebera_jbtbl701(x, _state); f3 = jarquebera_jbtbl1401(x, _state); f12 = ((t-t2)*f1+(t1-t)*f2)/(t1-t2); f23 = ((t-t3)*f2+(t2-t)*f3)/(t2-t3); result = ((t-t3)*f12+(t1-t)*f23)/(t1-t3); if( ae_fp_greater(result,(double)(0)) ) { result = (double)(0); } result = ae_exp(result, _state); ae_frame_leave(_state); return result; } /* * Asymptotic expansion */ if( n>1401 ) { result = -0.5*x+(jarquebera_jbtbl1401(x, _state)+0.5*x)*ae_sqrt((double)1401/(double)n, _state); if( ae_fp_greater(result,(double)(0)) ) { result = (double)(0); } result = ae_exp(result, _state); ae_frame_leave(_state); return result; } ae_frame_leave(_state); return result; } static double jarquebera_jbtbl5(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); if( ae_fp_less_eq(s,0.4000) ) { x = 2*(s-0.000000)/0.400000-1; tj = (double)(1); tj1 = x; jarquebera_jbcheb(x, -1.097885e-20, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -2.854501e-20, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -1.756616e-20, &tj, &tj1, &result, _state); if( ae_fp_greater(result,(double)(0)) ) { result = (double)(0); } return result; } if( ae_fp_less_eq(s,1.1000) ) { x = 2*(s-0.400000)/0.700000-1; tj = (double)(1); tj1 = x; jarquebera_jbcheb(x, -1.324545e+00, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -1.075941e+00, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -9.772272e-01, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 3.175686e-01, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -1.576162e-01, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 1.126861e-01, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -3.434425e-02, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -2.790359e-01, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 2.809178e-02, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -5.479704e-01, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 3.717040e-02, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -5.294170e-01, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 2.880632e-02, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -3.023344e-01, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 1.601531e-02, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -7.920403e-02, &tj, &tj1, &result, _state); if( ae_fp_greater(result,(double)(0)) ) { result = (double)(0); } return result; } result = -5.188419e+02*(s-1.100000e+00)-4.767297e+00; return result; } static double jarquebera_jbtbl6(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); if( ae_fp_less_eq(s,0.2500) ) { x = 2*(s-0.000000)/0.250000-1; tj = (double)(1); tj1 = x; jarquebera_jbcheb(x, -2.274707e-04, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -5.700471e-04, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -3.425764e-04, &tj, &tj1, &result, _state); if( ae_fp_greater(result,(double)(0)) ) { result = (double)(0); } return result; } if( ae_fp_less_eq(s,1.3000) ) { x = 2*(s-0.250000)/1.050000-1; tj = (double)(1); tj1 = x; jarquebera_jbcheb(x, -1.339000e+00, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -2.011104e+00, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -8.168177e-01, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -1.085666e-01, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 7.738606e-02, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 7.022876e-02, &tj, &tj1, &result, _state); 
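/*
 * Clarifying note (descriptive comment added to this listing).  Each
 * jarquebera_jbtblNN helper returns an approximation of ln p(S) for a fixed
 * sample size N: piecewise truncated Chebyshev fits on a few intervals of S
 * (accumulated by jarquebera_jbcheb, presumably with the same three-term
 * recurrence noted earlier for mannwhitneyu_ucheb), clamped to be non-positive,
 * plus a linear tail in S beyond the last fitted interval.  The caller,
 * jarquebera_jarqueberaapprox(), exponentiates the returned value; for sample
 * sizes between the tabulated ones it forms the quadratic interpolant in
 * t = -1/N through three tabulated curves, where f12 and f23 are the two linear
 * interpolants and the final expression is their Neville-style combination,
 * still on the ln p scale.
 */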
jarquebera_jbcheb(x, 3.462402e-02, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 6.908270e-03, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -8.230772e-03, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -1.006996e-02, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -5.410222e-03, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -2.893768e-03, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 8.114564e-04, &tj, &tj1, &result, _state); if( ae_fp_greater(result,(double)(0)) ) { result = (double)(0); } return result; } if( ae_fp_less_eq(s,1.8500) ) { x = 2*(s-1.300000)/0.550000-1; tj = (double)(1); tj1 = x; jarquebera_jbcheb(x, -6.794311e+00, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -3.578700e+00, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -1.394664e+00, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -7.928290e-01, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -4.813273e-01, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -3.076063e-01, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -1.835380e-01, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -1.013013e-01, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -5.058903e-02, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -1.856915e-02, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -6.710887e-03, &tj, &tj1, &result, _state); if( ae_fp_greater(result,(double)(0)) ) { result = (double)(0); } return result; } result = -1.770029e+02*(s-1.850000e+00)-1.371015e+01; return result; } static double jarquebera_jbtbl7(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); if( ae_fp_less_eq(s,1.4000) ) { x = 2*(s-0.000000)/1.400000-1; tj = (double)(1); tj1 = x; jarquebera_jbcheb(x, -1.093681e+00, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -1.695911e+00, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -7.473192e-01, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -1.203236e-01, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 6.590379e-02, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 6.291876e-02, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 3.132007e-02, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 9.411147e-03, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -1.180067e-03, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -3.487610e-03, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -2.436561e-03, &tj, &tj1, &result, _state); if( ae_fp_greater(result,(double)(0)) ) { result = (double)(0); } return result; } if( ae_fp_less_eq(s,3.0000) ) { x = 2*(s-1.400000)/1.600000-1; tj = (double)(1); tj1 = x; jarquebera_jbcheb(x, -5.947854e+00, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -2.772675e+00, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -4.707912e-01, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -1.691171e-01, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -4.132795e-02, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -1.481310e-02, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 2.867536e-03, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 8.772327e-04, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 5.033387e-03, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -1.378277e-03, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -2.497964e-03, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -3.636814e-03, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -9.581640e-04, &tj, &tj1, &result, _state); if( ae_fp_greater(result,(double)(0)) ) { result = (double)(0); } return result; } if( ae_fp_less_eq(s,3.2000) ) 
{ x = 2*(s-3.000000)/0.200000-1; tj = (double)(1); tj1 = x; jarquebera_jbcheb(x, -7.511008e+00, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -8.140472e-01, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 1.682053e+00, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -2.568561e-02, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -1.933930e+00, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -8.140472e-01, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -3.895025e+00, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -8.140472e-01, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -1.933930e+00, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -2.568561e-02, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 1.682053e+00, &tj, &tj1, &result, _state); if( ae_fp_greater(result,(double)(0)) ) { result = (double)(0); } return result; } result = -1.824116e+03*(s-3.200000e+00)-1.440330e+01; return result; } static double jarquebera_jbtbl8(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); if( ae_fp_less_eq(s,1.3000) ) { x = 2*(s-0.000000)/1.300000-1; tj = (double)(1); tj1 = x; jarquebera_jbcheb(x, -7.199015e-01, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -1.095921e+00, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -4.736828e-01, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -1.047438e-01, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -2.484320e-03, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 7.937923e-03, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 4.810470e-03, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 2.139780e-03, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 6.708443e-04, &tj, &tj1, &result, _state); if( ae_fp_greater(result,(double)(0)) ) { result = (double)(0); } return result; } if( ae_fp_less_eq(s,2.0000) ) { x = 2*(s-1.300000)/0.700000-1; tj = (double)(1); tj1 = x; jarquebera_jbcheb(x, -3.378966e+00, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -7.802461e-01, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 1.547593e-01, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -6.241042e-02, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 1.203274e-02, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 5.201990e-03, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -5.125597e-03, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 1.584426e-03, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 2.546069e-04, &tj, &tj1, &result, _state); if( ae_fp_greater(result,(double)(0)) ) { result = (double)(0); } return result; } if( ae_fp_less_eq(s,5.0000) ) { x = 2*(s-2.000000)/3.000000-1; tj = (double)(1); tj1 = x; jarquebera_jbcheb(x, -6.828366e+00, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -3.137533e+00, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -5.016671e-01, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -1.745637e-01, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -5.189801e-02, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -1.621610e-02, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -6.741122e-03, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -4.516368e-03, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 3.552085e-04, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 2.787029e-03, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 5.359774e-03, &tj, &tj1, &result, _state); if( ae_fp_greater(result,(double)(0)) ) { result = (double)(0); } return result; } result = -5.087028e+00*(s-5.000000e+00)-1.071300e+01; return result; } static double 
jarquebera_jbtbl9(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); if( ae_fp_less_eq(s,1.3000) ) { x = 2*(s-0.000000)/1.300000-1; tj = (double)(1); tj1 = x; jarquebera_jbcheb(x, -6.279320e-01, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -9.277151e-01, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -3.669339e-01, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -7.086149e-02, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -1.333816e-03, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 3.871249e-03, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 2.007048e-03, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 7.482245e-04, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 2.355615e-04, &tj, &tj1, &result, _state); if( ae_fp_greater(result,(double)(0)) ) { result = (double)(0); } return result; } if( ae_fp_less_eq(s,2.0000) ) { x = 2*(s-1.300000)/0.700000-1; tj = (double)(1); tj1 = x; jarquebera_jbcheb(x, -2.981430e+00, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -7.972248e-01, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 1.747737e-01, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -3.808530e-02, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -7.888305e-03, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 9.001302e-03, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -1.378767e-03, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -1.108510e-03, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 5.915372e-04, &tj, &tj1, &result, _state); if( ae_fp_greater(result,(double)(0)) ) { result = (double)(0); } return result; } if( ae_fp_less_eq(s,7.0000) ) { x = 2*(s-2.000000)/5.000000-1; tj = (double)(1); tj1 = x; jarquebera_jbcheb(x, -6.387463e+00, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -2.845231e+00, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -1.809956e-01, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -7.543461e-02, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -4.880397e-03, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -1.160074e-02, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -7.356527e-03, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -4.394428e-03, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 9.619892e-04, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -2.758763e-04, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 4.790977e-05, &tj, &tj1, &result, _state); if( ae_fp_greater(result,(double)(0)) ) { result = (double)(0); } return result; } result = -2.020952e+00*(s-7.000000e+00)-9.516623e+00; return result; } static double jarquebera_jbtbl10(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); if( ae_fp_less_eq(s,1.2000) ) { x = 2*(s-0.000000)/1.200000-1; tj = (double)(1); tj1 = x; jarquebera_jbcheb(x, -4.590993e-01, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -6.562730e-01, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -2.353934e-01, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -4.069933e-02, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -1.849151e-03, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 8.931406e-04, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 3.636295e-04, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 1.178340e-05, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -8.917749e-05, &tj, &tj1, &result, _state); if( ae_fp_greater(result,(double)(0)) ) { result = (double)(0); } return result; } if( ae_fp_less_eq(s,2.0000) ) { x = 2*(s-1.200000)/0.800000-1; tj = (double)(1); 
tj1 = x; jarquebera_jbcheb(x, -2.537658e+00, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -9.962401e-01, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 1.838715e-01, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 1.055792e-02, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -2.580316e-02, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 1.781701e-03, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 3.770362e-03, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -4.838983e-04, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -6.999052e-04, &tj, &tj1, &result, _state); if( ae_fp_greater(result,(double)(0)) ) { result = (double)(0); } return result; } if( ae_fp_less_eq(s,7.0000) ) { x = 2*(s-2.000000)/5.000000-1; tj = (double)(1); tj1 = x; jarquebera_jbcheb(x, -5.337524e+00, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -1.877029e+00, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 4.734650e-02, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -4.249254e-02, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 3.320250e-03, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -6.432266e-03, &tj, &tj1, &result, _state); if( ae_fp_greater(result,(double)(0)) ) { result = (double)(0); } return result; } result = -8.711035e-01*(s-7.000000e+00)-7.212811e+00; return result; } static double jarquebera_jbtbl11(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); if( ae_fp_less_eq(s,1.2000) ) { x = 2*(s-0.000000)/1.200000-1; tj = (double)(1); tj1 = x; jarquebera_jbcheb(x, -4.339517e-01, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -6.051558e-01, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -2.000992e-01, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -3.022547e-02, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -9.808401e-04, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 5.592870e-04, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 3.575081e-04, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 2.086173e-04, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 6.089011e-05, &tj, &tj1, &result, _state); if( ae_fp_greater(result,(double)(0)) ) { result = (double)(0); } return result; } if( ae_fp_less_eq(s,2.2500) ) { x = 2*(s-1.200000)/1.050000-1; tj = (double)(1); tj1 = x; jarquebera_jbcheb(x, -2.523221e+00, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -1.068388e+00, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 2.179661e-01, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -1.555524e-03, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -3.238964e-02, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 7.364320e-03, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 4.895771e-03, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -1.762774e-03, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -8.201340e-04, &tj, &tj1, &result, _state); if( ae_fp_greater(result,(double)(0)) ) { result = (double)(0); } return result; } if( ae_fp_less_eq(s,8.0000) ) { x = 2*(s-2.250000)/5.750000-1; tj = (double)(1); tj1 = x; jarquebera_jbcheb(x, -5.212179e+00, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -1.684579e+00, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 8.299519e-02, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -3.606261e-02, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 7.310869e-03, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -3.320115e-03, &tj, &tj1, &result, _state); if( ae_fp_greater(result,(double)(0)) ) { result = (double)(0); } return result; } result = 
-5.715445e-01*(s-8.000000e+00)-6.845834e+00; return result; } static double jarquebera_jbtbl12(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); if( ae_fp_less_eq(s,1.0000) ) { x = 2*(s-0.000000)/1.000000-1; tj = (double)(1); tj1 = x; jarquebera_jbcheb(x, -2.736742e-01, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -3.657836e-01, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -1.047209e-01, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -1.319599e-02, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -5.545631e-04, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 9.280445e-05, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 2.815679e-05, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -2.213519e-05, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 1.256838e-05, &tj, &tj1, &result, _state); if( ae_fp_greater(result,(double)(0)) ) { result = (double)(0); } return result; } if( ae_fp_less_eq(s,3.0000) ) { x = 2*(s-1.000000)/2.000000-1; tj = (double)(1); tj1 = x; jarquebera_jbcheb(x, -2.573947e+00, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -1.515287e+00, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 3.611880e-01, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -3.271311e-02, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -6.495815e-02, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 4.141186e-02, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 7.180886e-04, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -1.388211e-02, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 4.890761e-03, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 3.233175e-03, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -2.946156e-03, &tj, &tj1, &result, _state); if( ae_fp_greater(result,(double)(0)) ) { result = (double)(0); } return result; } if( ae_fp_less_eq(s,12.0000) ) { x = 2*(s-3.000000)/9.000000-1; tj = (double)(1); tj1 = x; jarquebera_jbcheb(x, -5.947819e+00, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -2.034157e+00, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 6.878986e-02, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -4.078603e-02, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 6.990977e-03, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -2.866215e-03, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 3.897866e-03, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 2.512252e-03, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 2.073743e-03, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 3.022621e-03, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 1.501343e-03, &tj, &tj1, &result, _state); if( ae_fp_greater(result,(double)(0)) ) { result = (double)(0); } return result; } result = -2.877243e-01*(s-1.200000e+01)-7.936839e+00; return result; } static double jarquebera_jbtbl13(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); if( ae_fp_less_eq(s,1.0000) ) { x = 2*(s-0.000000)/1.000000-1; tj = (double)(1); tj1 = x; jarquebera_jbcheb(x, -2.713276e-01, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -3.557541e-01, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -9.459092e-02, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -1.044145e-02, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -2.546132e-04, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 1.002374e-04, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 2.349456e-05, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -7.025669e-05, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 
-1.590242e-05, &tj, &tj1, &result, _state); if( ae_fp_greater(result,(double)(0)) ) { result = (double)(0); } return result; } if( ae_fp_less_eq(s,3.0000) ) { x = 2*(s-1.000000)/2.000000-1; tj = (double)(1); tj1 = x; jarquebera_jbcheb(x, -2.454383e+00, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -1.467539e+00, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 3.270774e-01, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -8.075763e-03, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -6.611647e-02, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 2.990785e-02, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 8.109212e-03, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -1.135031e-02, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 5.915919e-04, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 3.522390e-03, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -1.144701e-03, &tj, &tj1, &result, _state); if( ae_fp_greater(result,(double)(0)) ) { result = (double)(0); } return result; } if( ae_fp_less_eq(s,13.0000) ) { x = 2*(s-3.000000)/10.000000-1; tj = (double)(1); tj1 = x; jarquebera_jbcheb(x, -5.736127e+00, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -1.920809e+00, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 1.175858e-01, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -4.002049e-02, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 1.158966e-02, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -3.157781e-03, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 2.762172e-03, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 5.780347e-04, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -1.193310e-03, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -2.442421e-05, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 2.547756e-03, &tj, &tj1, &result, _state); if( ae_fp_greater(result,(double)(0)) ) { result = (double)(0); } return result; } result = -2.799944e-01*(s-1.300000e+01)-7.566269e+00; return result; } static double jarquebera_jbtbl14(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); if( ae_fp_less_eq(s,1.0000) ) { x = 2*(s-0.000000)/1.000000-1; tj = (double)(1); tj1 = x; jarquebera_jbcheb(x, -2.698527e-01, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -3.479081e-01, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -8.640733e-02, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -8.466899e-03, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -1.469485e-04, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 2.150009e-05, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 1.965975e-05, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -4.710210e-05, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -1.327808e-05, &tj, &tj1, &result, _state); if( ae_fp_greater(result,(double)(0)) ) { result = (double)(0); } return result; } if( ae_fp_less_eq(s,3.0000) ) { x = 2*(s-1.000000)/2.000000-1; tj = (double)(1); tj1 = x; jarquebera_jbcheb(x, -2.350359e+00, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -1.421365e+00, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 2.960468e-01, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 1.149167e-02, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -6.361109e-02, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 1.976022e-02, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 1.082700e-02, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -8.563328e-03, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -1.453123e-03, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 
2.917559e-03, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -1.151067e-05, &tj, &tj1, &result, _state); if( ae_fp_greater(result,(double)(0)) ) { result = (double)(0); } return result; } if( ae_fp_less_eq(s,15.0000) ) { x = 2*(s-3.000000)/12.000000-1; tj = (double)(1); tj1 = x; jarquebera_jbcheb(x, -5.746892e+00, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -2.010441e+00, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 1.566146e-01, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -5.129690e-02, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 1.929724e-02, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -2.524227e-03, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 3.192933e-03, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -4.254730e-04, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 1.620685e-03, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 7.289618e-04, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -2.112350e-03, &tj, &tj1, &result, _state); if( ae_fp_greater(result,(double)(0)) ) { result = (double)(0); } return result; } result = -2.590621e-01*(s-1.500000e+01)-7.632238e+00; return result; } static double jarquebera_jbtbl15(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); if( ae_fp_less_eq(s,2.0000) ) { x = 2*(s-0.000000)/2.000000-1; tj = (double)(1); tj1 = x; jarquebera_jbcheb(x, -1.043660e+00, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -1.361653e+00, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -3.009497e-01, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 4.951784e-02, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 4.377903e-02, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 1.003253e-02, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -1.271309e-03, &tj, &tj1, &result, _state); if( ae_fp_greater(result,(double)(0)) ) { result = (double)(0); } return result; } if( ae_fp_less_eq(s,5.0000) ) { x = 2*(s-2.000000)/3.000000-1; tj = (double)(1); tj1 = x; jarquebera_jbcheb(x, -3.582778e+00, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -8.349578e-01, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 9.476514e-02, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -2.717385e-02, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 1.222591e-02, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -6.635124e-03, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 2.815993e-03, &tj, &tj1, &result, _state); if( ae_fp_greater(result,(double)(0)) ) { result = (double)(0); } return result; } if( ae_fp_less_eq(s,17.0000) ) { x = 2*(s-5.000000)/12.000000-1; tj = (double)(1); tj1 = x; jarquebera_jbcheb(x, -6.115476e+00, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -1.655936e+00, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 8.404310e-02, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -2.663794e-02, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 8.868618e-03, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 1.381447e-03, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 9.444801e-04, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -1.581503e-04, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -9.468696e-04, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 1.728509e-03, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 1.206470e-03, &tj, &tj1, &result, _state); if( ae_fp_greater(result,(double)(0)) ) { result = (double)(0); } return result; } result = -1.927937e-01*(s-1.700000e+01)-7.700983e+00; return result; } static double jarquebera_jbtbl16(double s, ae_state *_state) { double 
x; double tj; double tj1; double result; result = (double)(0); if( ae_fp_less_eq(s,2.0000) ) { x = 2*(s-0.000000)/2.000000-1; tj = (double)(1); tj1 = x; jarquebera_jbcheb(x, -1.002570e+00, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -1.298141e+00, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -2.832803e-01, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 3.877026e-02, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 3.539436e-02, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 8.439658e-03, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -4.756911e-04, &tj, &tj1, &result, _state); if( ae_fp_greater(result,(double)(0)) ) { result = (double)(0); } return result; } if( ae_fp_less_eq(s,5.0000) ) { x = 2*(s-2.000000)/3.000000-1; tj = (double)(1); tj1 = x; jarquebera_jbcheb(x, -3.486198e+00, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -8.242944e-01, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 1.020002e-01, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -3.130531e-02, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 1.512373e-02, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -8.054876e-03, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 3.556839e-03, &tj, &tj1, &result, _state); if( ae_fp_greater(result,(double)(0)) ) { result = (double)(0); } return result; } if( ae_fp_less_eq(s,20.0000) ) { x = 2*(s-5.000000)/15.000000-1; tj = (double)(1); tj1 = x; jarquebera_jbcheb(x, -6.241608e+00, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -1.832655e+00, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 1.340545e-01, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -3.361143e-02, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 1.283219e-02, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 3.484549e-03, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 1.805968e-03, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -2.057243e-03, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -1.454439e-03, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -2.177513e-03, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -1.819209e-03, &tj, &tj1, &result, _state); if( ae_fp_greater(result,(double)(0)) ) { result = (double)(0); } return result; } result = -2.391580e-01*(s-2.000000e+01)-7.963205e+00; return result; } static double jarquebera_jbtbl17(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); if( ae_fp_less_eq(s,3.0000) ) { x = 2*(s-0.000000)/3.000000-1; tj = (double)(1); tj1 = x; jarquebera_jbcheb(x, -1.566973e+00, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -1.810330e+00, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -4.840039e-02, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 2.337294e-01, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -5.383549e-04, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -5.556515e-02, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -8.656965e-03, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 1.404569e-02, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 6.447867e-03, &tj, &tj1, &result, _state); if( ae_fp_greater(result,(double)(0)) ) { result = (double)(0); } return result; } if( ae_fp_less_eq(s,6.0000) ) { x = 2*(s-3.000000)/3.000000-1; tj = (double)(1); tj1 = x; jarquebera_jbcheb(x, -3.905684e+00, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -6.222920e-01, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 4.146667e-02, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -4.809176e-03, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 1.057028e-03, &tj, &tj1, 
&result, _state); jarquebera_jbcheb(x, -1.211838e-04, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -4.099683e-04, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 1.161105e-04, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 2.225465e-04, &tj, &tj1, &result, _state); if( ae_fp_greater(result,(double)(0)) ) { result = (double)(0); } return result; } if( ae_fp_less_eq(s,24.0000) ) { x = 2*(s-6.000000)/18.000000-1; tj = (double)(1); tj1 = x; jarquebera_jbcheb(x, -6.594282e+00, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -1.917838e+00, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 1.455980e-01, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -2.999589e-02, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 5.604263e-03, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -3.484445e-03, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -1.819937e-03, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -2.930390e-03, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 2.771761e-04, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -6.232581e-04, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -7.029083e-04, &tj, &tj1, &result, _state); if( ae_fp_greater(result,(double)(0)) ) { result = (double)(0); } return result; } result = -2.127771e-01*(s-2.400000e+01)-8.400197e+00; return result; } static double jarquebera_jbtbl18(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); if( ae_fp_less_eq(s,3.0000) ) { x = 2*(s-0.000000)/3.000000-1; tj = (double)(1); tj1 = x; jarquebera_jbcheb(x, -1.526802e+00, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -1.762373e+00, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -5.598890e-02, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 2.189437e-01, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 5.971721e-03, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -4.823067e-02, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -1.064501e-02, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 1.014932e-02, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 5.953513e-03, &tj, &tj1, &result, _state); if( ae_fp_greater(result,(double)(0)) ) { result = (double)(0); } return result; } if( ae_fp_less_eq(s,6.0000) ) { x = 2*(s-3.000000)/3.000000-1; tj = (double)(1); tj1 = x; jarquebera_jbcheb(x, -3.818669e+00, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -6.070918e-01, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 4.277196e-02, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -4.879817e-03, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 6.887357e-04, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 1.638451e-05, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 1.502800e-04, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -3.165796e-05, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 5.034960e-05, &tj, &tj1, &result, _state); if( ae_fp_greater(result,(double)(0)) ) { result = (double)(0); } return result; } if( ae_fp_less_eq(s,20.0000) ) { x = 2*(s-6.000000)/14.000000-1; tj = (double)(1); tj1 = x; jarquebera_jbcheb(x, -6.010656e+00, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -1.496296e+00, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 1.002227e-01, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -2.338250e-02, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 4.137036e-03, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -2.586202e-03, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -9.736384e-04, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 1.332251e-03, &tj, &tj1, 
&result, _state); jarquebera_jbcheb(x, 1.877982e-03, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -1.160963e-05, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -2.547247e-03, &tj, &tj1, &result, _state); if( ae_fp_greater(result,(double)(0)) ) { result = (double)(0); } return result; } result = -1.684623e-01*(s-2.000000e+01)-7.428883e+00; return result; } static double jarquebera_jbtbl19(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); if( ae_fp_less_eq(s,3.0000) ) { x = 2*(s-0.000000)/3.000000-1; tj = (double)(1); tj1 = x; jarquebera_jbcheb(x, -1.490213e+00, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -1.719633e+00, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -6.459123e-02, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 2.034878e-01, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 1.113868e-02, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -4.030922e-02, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -1.054022e-02, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 7.525623e-03, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 5.277360e-03, &tj, &tj1, &result, _state); if( ae_fp_greater(result,(double)(0)) ) { result = (double)(0); } return result; } if( ae_fp_less_eq(s,6.0000) ) { x = 2*(s-3.000000)/3.000000-1; tj = (double)(1); tj1 = x; jarquebera_jbcheb(x, -3.744750e+00, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -5.977749e-01, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 4.223716e-02, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -5.363889e-03, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 5.711774e-04, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -5.557257e-04, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 4.254794e-04, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 9.034207e-05, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 5.498107e-05, &tj, &tj1, &result, _state); if( ae_fp_greater(result,(double)(0)) ) { result = (double)(0); } return result; } if( ae_fp_less_eq(s,20.0000) ) { x = 2*(s-6.000000)/14.000000-1; tj = (double)(1); tj1 = x; jarquebera_jbcheb(x, -5.872768e+00, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -1.430689e+00, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 1.136575e-01, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -1.726627e-02, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 3.421110e-03, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -1.581510e-03, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -5.559520e-04, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -6.838208e-04, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 8.428839e-04, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -7.170682e-04, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -6.006647e-04, &tj, &tj1, &result, _state); if( ae_fp_greater(result,(double)(0)) ) { result = (double)(0); } return result; } result = -1.539373e-01*(s-2.000000e+01)-7.206941e+00; return result; } static double jarquebera_jbtbl20(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); if( ae_fp_less_eq(s,4.0000) ) { x = 2*(s-0.000000)/4.000000-1; tj = (double)(1); tj1 = x; jarquebera_jbcheb(x, -1.854794e+00, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -1.948947e+00, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 1.632184e-01, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 2.139397e-01, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -1.006237e-01, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -3.810031e-02, &tj, 
&tj1, &result, _state); jarquebera_jbcheb(x, 3.573620e-02, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 9.951242e-03, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -1.274092e-02, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -3.464196e-03, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 4.882139e-03, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 1.575144e-03, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -1.822804e-03, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -7.061348e-04, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 5.908404e-04, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 1.978353e-04, &tj, &tj1, &result, _state); if( ae_fp_greater(result,(double)(0)) ) { result = (double)(0); } return result; } if( ae_fp_less_eq(s,15.0000) ) { x = 2*(s-4.000000)/11.000000-1; tj = (double)(1); tj1 = x; jarquebera_jbcheb(x, -5.030989e+00, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -1.327151e+00, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 1.346404e-01, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -2.840051e-02, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 7.578551e-03, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -9.813886e-04, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 5.905973e-05, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -5.358489e-04, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -3.450795e-04, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -6.941157e-04, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -7.432418e-04, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -2.070537e-04, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 9.375654e-04, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 5.367378e-04, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 9.890859e-04, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 6.679782e-04, &tj, &tj1, &result, _state); if( ae_fp_greater(result,(double)(0)) ) { result = (double)(0); } return result; } if( ae_fp_less_eq(s,25.0000) ) { x = 2*(s-15.000000)/10.000000-1; tj = (double)(1); tj1 = x; jarquebera_jbcheb(x, -7.015854e+00, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -7.487737e-01, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 2.244254e-02, &tj, &tj1, &result, _state); if( ae_fp_greater(result,(double)(0)) ) { result = (double)(0); } return result; } result = -1.318007e-01*(s-2.500000e+01)-7.742185e+00; return result; } static double jarquebera_jbtbl30(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); if( ae_fp_less_eq(s,4.0000) ) { x = 2*(s-0.000000)/4.000000-1; tj = (double)(1); tj1 = x; jarquebera_jbcheb(x, -1.630822e+00, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -1.724298e+00, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 7.872756e-02, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 1.658268e-01, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -3.573597e-02, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -2.994157e-02, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 5.994825e-03, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 7.394303e-03, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -5.785029e-04, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -1.990264e-03, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -1.037838e-04, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 6.755546e-04, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 1.774473e-04, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -2.821395e-04, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 
-1.392603e-04, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 1.353313e-04, &tj, &tj1, &result, _state); if( ae_fp_greater(result,(double)(0)) ) { result = (double)(0); } return result; } if( ae_fp_less_eq(s,15.0000) ) { x = 2*(s-4.000000)/11.000000-1; tj = (double)(1); tj1 = x; jarquebera_jbcheb(x, -4.539322e+00, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -1.197018e+00, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 1.396848e-01, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -2.804293e-02, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 6.867928e-03, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -2.768758e-03, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 5.211792e-04, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 4.925799e-04, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 5.046235e-04, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -9.536469e-05, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -6.489642e-04, &tj, &tj1, &result, _state); if( ae_fp_greater(result,(double)(0)) ) { result = (double)(0); } return result; } if( ae_fp_less_eq(s,25.0000) ) { x = 2*(s-15.000000)/10.000000-1; tj = (double)(1); tj1 = x; jarquebera_jbcheb(x, -6.263462e+00, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -6.177316e-01, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 2.590637e-02, &tj, &tj1, &result, _state); if( ae_fp_greater(result,(double)(0)) ) { result = (double)(0); } return result; } result = -1.028212e-01*(s-2.500000e+01)-6.855288e+00; return result; } static double jarquebera_jbtbl50(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); if( ae_fp_less_eq(s,4.0000) ) { x = 2*(s-0.000000)/4.000000-1; tj = (double)(1); tj1 = x; jarquebera_jbcheb(x, -1.436279e+00, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -1.519711e+00, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 1.148699e-02, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 1.001204e-01, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -3.207620e-03, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -1.034778e-02, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -1.220322e-03, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 1.033260e-03, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 2.588280e-04, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -1.851653e-04, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -1.287733e-04, &tj, &tj1, &result, _state); if( ae_fp_greater(result,(double)(0)) ) { result = (double)(0); } return result; } if( ae_fp_less_eq(s,15.0000) ) { x = 2*(s-4.000000)/11.000000-1; tj = (double)(1); tj1 = x; jarquebera_jbcheb(x, -4.234645e+00, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -1.189127e+00, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 1.429738e-01, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -3.058822e-02, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 9.086776e-03, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -1.445783e-03, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 1.311671e-03, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -7.261298e-04, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 6.496987e-04, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 2.605249e-04, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 8.162282e-04, &tj, &tj1, &result, _state); if( ae_fp_greater(result,(double)(0)) ) { result = (double)(0); } return result; } if( ae_fp_less_eq(s,25.0000) ) { x = 2*(s-15.000000)/10.000000-1; tj = (double)(1); tj1 = x; jarquebera_jbcheb(x, -5.921095e+00, &tj, 
&tj1, &result, _state); jarquebera_jbcheb(x, -5.888603e-01, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 3.080113e-02, &tj, &tj1, &result, _state); if( ae_fp_greater(result,(double)(0)) ) { result = (double)(0); } return result; } result = -9.313116e-02*(s-2.500000e+01)-6.479154e+00; return result; } static double jarquebera_jbtbl65(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); if( ae_fp_less_eq(s,4.0000) ) { x = 2*(s-0.000000)/4.000000-1; tj = (double)(1); tj1 = x; jarquebera_jbcheb(x, -1.360024e+00, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -1.434631e+00, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -6.514580e-03, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 7.332038e-02, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 1.158197e-03, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -5.121233e-03, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -1.051056e-03, &tj, &tj1, &result, _state); if( ae_fp_greater(result,(double)(0)) ) { result = (double)(0); } return result; } if( ae_fp_less_eq(s,15.0000) ) { x = 2*(s-4.000000)/11.000000-1; tj = (double)(1); tj1 = x; jarquebera_jbcheb(x, -4.148601e+00, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -1.214233e+00, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 1.487977e-01, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -3.424720e-02, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 1.116715e-02, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -4.043152e-03, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 1.718149e-03, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -1.313701e-03, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 3.097305e-04, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 2.181031e-04, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 1.256975e-04, &tj, &tj1, &result, _state); if( ae_fp_greater(result,(double)(0)) ) { result = (double)(0); } return result; } if( ae_fp_less_eq(s,25.0000) ) { x = 2*(s-15.000000)/10.000000-1; tj = (double)(1); tj1 = x; jarquebera_jbcheb(x, -5.858951e+00, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -5.895179e-01, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 2.933237e-02, &tj, &tj1, &result, _state); if( ae_fp_greater(result,(double)(0)) ) { result = (double)(0); } return result; } result = -9.443768e-02*(s-2.500000e+01)-6.419137e+00; return result; } static double jarquebera_jbtbl100(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); if( ae_fp_less_eq(s,4.0000) ) { x = 2*(s-0.000000)/4.000000-1; tj = (double)(1); tj1 = x; jarquebera_jbcheb(x, -1.257021e+00, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -1.313418e+00, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -1.628931e-02, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 4.264287e-02, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 1.518487e-03, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -1.499826e-03, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -4.836044e-04, &tj, &tj1, &result, _state); if( ae_fp_greater(result,(double)(0)) ) { result = (double)(0); } return result; } if( ae_fp_less_eq(s,15.0000) ) { x = 2*(s-4.000000)/11.000000-1; tj = (double)(1); tj1 = x; jarquebera_jbcheb(x, -4.056508e+00, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -1.279690e+00, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 1.665746e-01, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -4.290012e-02, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 1.487632e-02, &tj, &tj1, 
&result, _state); jarquebera_jbcheb(x, -5.704465e-03, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 2.211669e-03, &tj, &tj1, &result, _state); if( ae_fp_greater(result,(double)(0)) ) { result = (double)(0); } return result; } if( ae_fp_less_eq(s,25.0000) ) { x = 2*(s-15.000000)/10.000000-1; tj = (double)(1); tj1 = x; jarquebera_jbcheb(x, -5.866099e+00, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -6.399767e-01, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 2.498208e-02, &tj, &tj1, &result, _state); if( ae_fp_greater(result,(double)(0)) ) { result = (double)(0); } return result; } result = -1.080097e-01*(s-2.500000e+01)-6.481094e+00; return result; } static double jarquebera_jbtbl130(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); if( ae_fp_less_eq(s,4.0000) ) { x = 2*(s-0.000000)/4.000000-1; tj = (double)(1); tj1 = x; jarquebera_jbcheb(x, -1.207999e+00, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -1.253864e+00, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -1.618032e-02, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 3.112729e-02, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 1.210546e-03, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -4.732602e-04, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -2.410527e-04, &tj, &tj1, &result, _state); if( ae_fp_greater(result,(double)(0)) ) { result = (double)(0); } return result; } if( ae_fp_less_eq(s,15.0000) ) { x = 2*(s-4.000000)/11.000000-1; tj = (double)(1); tj1 = x; jarquebera_jbcheb(x, -4.026324e+00, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -1.331990e+00, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 1.779129e-01, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -4.674749e-02, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 1.669077e-02, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -5.679136e-03, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 8.833221e-04, &tj, &tj1, &result, _state); if( ae_fp_greater(result,(double)(0)) ) { result = (double)(0); } return result; } if( ae_fp_less_eq(s,25.0000) ) { x = 2*(s-15.000000)/10.000000-1; tj = (double)(1); tj1 = x; jarquebera_jbcheb(x, -5.893951e+00, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -6.475304e-01, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 3.116734e-02, &tj, &tj1, &result, _state); if( ae_fp_greater(result,(double)(0)) ) { result = (double)(0); } return result; } result = -1.045722e-01*(s-2.500000e+01)-6.510314e+00; return result; } static double jarquebera_jbtbl200(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); if( ae_fp_less_eq(s,4.0000) ) { x = 2*(s-0.000000)/4.000000-1; tj = (double)(1); tj1 = x; jarquebera_jbcheb(x, -1.146155e+00, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -1.177398e+00, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -1.297970e-02, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 1.869745e-02, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 1.717288e-04, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -1.982108e-04, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 6.427636e-05, &tj, &tj1, &result, _state); if( ae_fp_greater(result,(double)(0)) ) { result = (double)(0); } return result; } if( ae_fp_less_eq(s,15.0000) ) { x = 2*(s-4.000000)/11.000000-1; tj = (double)(1); tj1 = x; jarquebera_jbcheb(x, -4.034235e+00, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -1.455006e+00, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 1.942996e-01, &tj, &tj1, &result, _state); 
jarquebera_jbcheb(x, -4.973795e-02, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 1.418812e-02, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -3.156778e-03, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 4.896705e-05, &tj, &tj1, &result, _state); if( ae_fp_greater(result,(double)(0)) ) { result = (double)(0); } return result; } if( ae_fp_less_eq(s,25.0000) ) { x = 2*(s-15.000000)/10.000000-1; tj = (double)(1); tj1 = x; jarquebera_jbcheb(x, -6.086071e+00, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -7.152176e-01, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 3.725393e-02, &tj, &tj1, &result, _state); if( ae_fp_greater(result,(double)(0)) ) { result = (double)(0); } return result; } result = -1.132404e-01*(s-2.500000e+01)-6.764034e+00; return result; } static double jarquebera_jbtbl301(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); if( ae_fp_less_eq(s,4.0000) ) { x = 2*(s-0.000000)/4.000000-1; tj = (double)(1); tj1 = x; jarquebera_jbcheb(x, -1.104290e+00, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -1.125800e+00, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -9.595847e-03, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 1.219666e-02, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 1.502210e-04, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -6.414543e-05, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 6.754115e-05, &tj, &tj1, &result, _state); if( ae_fp_greater(result,(double)(0)) ) { result = (double)(0); } return result; } if( ae_fp_less_eq(s,15.0000) ) { x = 2*(s-4.000000)/11.000000-1; tj = (double)(1); tj1 = x; jarquebera_jbcheb(x, -4.065955e+00, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -1.582060e+00, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 2.004472e-01, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -4.709092e-02, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 1.105779e-02, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 1.197391e-03, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -8.386780e-04, &tj, &tj1, &result, _state); if( ae_fp_greater(result,(double)(0)) ) { result = (double)(0); } return result; } if( ae_fp_less_eq(s,25.0000) ) { x = 2*(s-15.000000)/10.000000-1; tj = (double)(1); tj1 = x; jarquebera_jbcheb(x, -6.311384e+00, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -7.918763e-01, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 3.626584e-02, &tj, &tj1, &result, _state); if( ae_fp_greater(result,(double)(0)) ) { result = (double)(0); } return result; } result = -1.293626e-01*(s-2.500000e+01)-7.066995e+00; return result; } static double jarquebera_jbtbl501(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); if( ae_fp_less_eq(s,4.0000) ) { x = 2*(s-0.000000)/4.000000-1; tj = (double)(1); tj1 = x; jarquebera_jbcheb(x, -1.067426e+00, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -1.079765e+00, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -5.463005e-03, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 6.875659e-03, &tj, &tj1, &result, _state); if( ae_fp_greater(result,(double)(0)) ) { result = (double)(0); } return result; } if( ae_fp_less_eq(s,15.0000) ) { x = 2*(s-4.000000)/11.000000-1; tj = (double)(1); tj1 = x; jarquebera_jbcheb(x, -4.127574e+00, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -1.740694e+00, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 2.044502e-01, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -3.746714e-02, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 
3.810594e-04, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 1.197111e-03, &tj, &tj1, &result, _state); if( ae_fp_greater(result,(double)(0)) ) { result = (double)(0); } return result; } if( ae_fp_less_eq(s,25.0000) ) { x = 2*(s-15.000000)/10.000000-1; tj = (double)(1); tj1 = x; jarquebera_jbcheb(x, -6.628194e+00, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -8.846221e-01, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 4.386405e-02, &tj, &tj1, &result, _state); if( ae_fp_greater(result,(double)(0)) ) { result = (double)(0); } return result; } result = -1.418332e-01*(s-2.500000e+01)-7.468952e+00; return result; } static double jarquebera_jbtbl701(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); if( ae_fp_less_eq(s,4.0000) ) { x = 2*(s-0.000000)/4.000000-1; tj = (double)(1); tj1 = x; jarquebera_jbcheb(x, -1.050999e+00, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -1.059769e+00, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -3.922680e-03, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 4.847054e-03, &tj, &tj1, &result, _state); if( ae_fp_greater(result,(double)(0)) ) { result = (double)(0); } return result; } if( ae_fp_less_eq(s,15.0000) ) { x = 2*(s-4.000000)/11.000000-1; tj = (double)(1); tj1 = x; jarquebera_jbcheb(x, -4.192182e+00, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -1.860007e+00, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 1.963942e-01, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -2.838711e-02, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -2.893112e-04, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 2.159788e-03, &tj, &tj1, &result, _state); if( ae_fp_greater(result,(double)(0)) ) { result = (double)(0); } return result; } if( ae_fp_less_eq(s,25.0000) ) { x = 2*(s-15.000000)/10.000000-1; tj = (double)(1); tj1 = x; jarquebera_jbcheb(x, -6.917851e+00, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -9.817020e-01, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 5.383727e-02, &tj, &tj1, &result, _state); if( ae_fp_greater(result,(double)(0)) ) { result = (double)(0); } return result; } result = -1.532706e-01*(s-2.500000e+01)-7.845715e+00; return result; } static double jarquebera_jbtbl1401(double s, ae_state *_state) { double x; double tj; double tj1; double result; result = (double)(0); if( ae_fp_less_eq(s,4.0000) ) { x = 2*(s-0.000000)/4.000000-1; tj = (double)(1); tj1 = x; jarquebera_jbcheb(x, -1.026266e+00, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -1.030061e+00, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -1.259222e-03, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 2.536254e-03, &tj, &tj1, &result, _state); if( ae_fp_greater(result,(double)(0)) ) { result = (double)(0); } return result; } if( ae_fp_less_eq(s,15.0000) ) { x = 2*(s-4.000000)/11.000000-1; tj = (double)(1); tj1 = x; jarquebera_jbcheb(x, -4.329849e+00, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -2.095443e+00, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, 1.759363e-01, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -7.751359e-03, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -6.124368e-03, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -1.793114e-03, &tj, &tj1, &result, _state); if( ae_fp_greater(result,(double)(0)) ) { result = (double)(0); } return result; } if( ae_fp_less_eq(s,25.0000) ) { x = 2*(s-15.000000)/10.000000-1; tj = (double)(1); tj1 = x; jarquebera_jbcheb(x, -7.544330e+00, &tj, &tj1, &result, _state); jarquebera_jbcheb(x, -1.225382e+00, &tj, &tj1, &result, _state); 
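/*
 * Implementation note (summarizing the code in this section): every
 * jarquebera_jbtblNN function evaluates a truncated Chebyshev series on a
 * rescaled argument x = 2*(s-a)/(b-a)-1 and clamps the accumulated value to
 * be non-positive; outside the last tabulated range of S the value is
 * continued linearly in S.  The helper jarquebera_jbcheb() defined below
 * performs one accumulation step of the standard recurrence
 *
 *     r   = r + c*Tj          (add current term)
 *     Tj2 = 2*x*Tj1 - Tj      (advance the Chebyshev polynomials)
 *
 * so a sequence of calls with coefficients c0, c1, ... computes
 * r = sum(ck*Tk(x), k=0..K), starting from Tj=T0(x)=1 and Tj1=T1(x)=x.
 */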
jarquebera_jbcheb(x, 5.392349e-02, &tj, &tj1, &result, _state); if( ae_fp_greater(result,(double)(0)) ) { result = (double)(0); } return result; } result = -2.019375e-01*(s-2.500000e+01)-8.715788e+00; return result; } static void jarquebera_jbcheb(double x, double c, double* tj, double* tj1, double* r, ae_state *_state) { double t; *r = *r+c*(*tj); t = 2*x*(*tj1)-(*tj); *tj = *tj1; *tj1 = t; } /************************************************************************* Two-sample F-test This test checks three hypotheses about dispersions of the given samples. The following tests are performed: * two-tailed test (null hypothesis - the dispersions are equal) * left-tailed test (null hypothesis - the dispersion of the first sample is greater than or equal to the dispersion of the second sample). * right-tailed test (null hypothesis - the dispersion of the first sample is less than or equal to the dispersion of the second sample) The test is based on the following assumptions: * the given samples have normal distributions * the samples are independent. Input parameters: X - sample 1. Array whose index goes from 0 to N-1. N - sample size. Y - sample 2. Array whose index goes from 0 to M-1. M - sample size. Output parameters: BothTails - p-value for two-tailed test. If BothTails is less than the given significance level the null hypothesis is rejected. LeftTail - p-value for left-tailed test. If LeftTail is less than the given significance level, the null hypothesis is rejected. RightTail - p-value for right-tailed test. If RightTail is less than the given significance level the null hypothesis is rejected. -- ALGLIB -- Copyright 19.09.2006 by Bochkanov Sergey *************************************************************************/ void ftest(/* Real */ ae_vector* x, ae_int_t n, /* Real */ ae_vector* y, ae_int_t m, double* bothtails, double* lefttail, double* righttail, ae_state *_state) { ae_int_t i; double xmean; double ymean; double xvar; double yvar; ae_int_t df1; ae_int_t df2; double stat; *bothtails = 0; *lefttail = 0; *righttail = 0; if( n<=2||m<=2 ) { *bothtails = 1.0; *lefttail = 1.0; *righttail = 1.0; return; } /* * Mean */ xmean = (double)(0); for(i=0; i<=n-1; i++) { xmean = xmean+x->ptr.p_double[i]; } xmean = xmean/n; ymean = (double)(0); for(i=0; i<=m-1; i++) { ymean = ymean+y->ptr.p_double[i]; } ymean = ymean/m; /* * Variance (using corrected two-pass algorithm) */ xvar = (double)(0); for(i=0; i<=n-1; i++) { xvar = xvar+ae_sqr(x->ptr.p_double[i]-xmean, _state); } xvar = xvar/(n-1); yvar = (double)(0); for(i=0; i<=m-1; i++) { yvar = yvar+ae_sqr(y->ptr.p_double[i]-ymean, _state); } yvar = yvar/(m-1); if( ae_fp_eq(xvar,(double)(0))||ae_fp_eq(yvar,(double)(0)) ) { *bothtails = 1.0; *lefttail = 1.0; *righttail = 1.0; return; } /* * Statistic */ df1 = n-1; df2 = m-1; stat = ae_minreal(xvar/yvar, yvar/xvar, _state); *bothtails = 1-(fdistribution(df1, df2, 1/stat, _state)-fdistribution(df1, df2, stat, _state)); *lefttail = fdistribution(df1, df2, xvar/yvar, _state); *righttail = 1-(*lefttail); } /************************************************************************* One-sample chi-square test This test checks three hypotheses about the dispersion of the given sample The following tests are performed: * two-tailed test (null hypothesis - the dispersion equals the given number) * left-tailed test (null hypothesis - the dispersion is greater than or equal to the given number) * right-tailed test (null hypothesis - dispersion is less than or equal to the given number). 
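For reference (summarizing the implementation below): the statistic evaluated
by this function is Stat = (N-1)*SampleVariance/Variance, which under the
null hypothesis follows the chi-square distribution with N-1 degrees of
freedom. With s denoting the chi-square CDF at Stat, the reported p-values
are LeftTail = s, RightTail = 1-s and BothTails = 2*min(s, 1-s).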
Test is based on the following assumptions: * the given sample has a normal distribution. Input parameters: X - sample 1. Array whose index goes from 0 to N-1. N - size of the sample. Variance - dispersion value to compare with. Output parameters: BothTails - p-value for two-tailed test. If BothTails is less than the given significance level the null hypothesis is rejected. LeftTail - p-value for left-tailed test. If LeftTail is less than the given significance level, the null hypothesis is rejected. RightTail - p-value for right-tailed test. If RightTail is less than the given significance level the null hypothesis is rejected. -- ALGLIB -- Copyright 19.09.2006 by Bochkanov Sergey *************************************************************************/ void onesamplevariancetest(/* Real */ ae_vector* x, ae_int_t n, double variance, double* bothtails, double* lefttail, double* righttail, ae_state *_state) { ae_int_t i; double xmean; double xvar; double s; double stat; *bothtails = 0; *lefttail = 0; *righttail = 0; if( n<=1 ) { *bothtails = 1.0; *lefttail = 1.0; *righttail = 1.0; return; } /* * Mean */ xmean = (double)(0); for(i=0; i<=n-1; i++) { xmean = xmean+x->ptr.p_double[i]; } xmean = xmean/n; /* * Variance */ xvar = (double)(0); for(i=0; i<=n-1; i++) { xvar = xvar+ae_sqr(x->ptr.p_double[i]-xmean, _state); } xvar = xvar/(n-1); if( ae_fp_eq(xvar,(double)(0)) ) { *bothtails = 1.0; *lefttail = 1.0; *righttail = 1.0; return; } /* * Statistic */ stat = (n-1)*xvar/variance; s = chisquaredistribution((double)(n-1), stat, _state); *bothtails = 2*ae_minreal(s, 1-s, _state); *lefttail = s; *righttail = 1-(*lefttail); } } cpp/src/fasttransforms.cpp0000755000175000017500000034733613105126766015617 0ustar sergeysergey/************************************************************************* ALGLIB 3.11.0 (source code generated 2017-05-11) Copyright (c) Sergey Bochkanov (ALGLIB project). >>> SOURCE LICENSE >>> This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation (www.fsf.org); either version 2 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. A copy of the GNU General Public License is available at http://www.fsf.org/licensing/licenses >>> END OF LICENSE >>> *************************************************************************/ #include "stdafx.h" #include "fasttransforms.h" // disable some irrelevant warnings #if (AE_COMPILER==AE_MSVC) #pragma warning(disable:4100) #pragma warning(disable:4127) #pragma warning(disable:4702) #pragma warning(disable:4996) #endif using namespace std; ///////////////////////////////////////////////////////////////////////// // // THIS SECTION CONTAINS IMPLEMENTATION OF C++ INTERFACE // ///////////////////////////////////////////////////////////////////////// namespace alglib { /************************************************************************* 1-dimensional complex FFT. Array size N may be arbitrary number (composite or prime). Composite N's are handled with cache-oblivious variation of a Cooley-Tukey algorithm. Small prime-factors are transformed using hard coded codelets (similar to FFTW codelets, but without low-level optimization), large prime-factors are handled with Bluestein's algorithm. 
Fastests transforms are for smooth N's (prime factors are 2, 3, 5 only), most fast for powers of 2. When N have prime factors larger than these, but orders of magnitude smaller than N, computations will be about 4 times slower than for nearby highly composite N's. When N itself is prime, speed will be 6 times lower. Algorithm has O(N*logN) complexity for any N (composite or prime). INPUT PARAMETERS A - array[0..N-1] - complex function to be transformed N - problem size OUTPUT PARAMETERS A - DFT of a input array, array[0..N-1] A_out[j] = SUM(A_in[k]*exp(-2*pi*sqrt(-1)*j*k/N), k = 0..N-1) -- ALGLIB -- Copyright 29.05.2009 by Bochkanov Sergey *************************************************************************/ void fftc1d(complex_1d_array &a, const ae_int_t n) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::fftc1d(const_cast(a.c_ptr()), n, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* 1-dimensional complex FFT. Array size N may be arbitrary number (composite or prime). Composite N's are handled with cache-oblivious variation of a Cooley-Tukey algorithm. Small prime-factors are transformed using hard coded codelets (similar to FFTW codelets, but without low-level optimization), large prime-factors are handled with Bluestein's algorithm. Fastests transforms are for smooth N's (prime factors are 2, 3, 5 only), most fast for powers of 2. When N have prime factors larger than these, but orders of magnitude smaller than N, computations will be about 4 times slower than for nearby highly composite N's. When N itself is prime, speed will be 6 times lower. Algorithm has O(N*logN) complexity for any N (composite or prime). INPUT PARAMETERS A - array[0..N-1] - complex function to be transformed N - problem size OUTPUT PARAMETERS A - DFT of a input array, array[0..N-1] A_out[j] = SUM(A_in[k]*exp(-2*pi*sqrt(-1)*j*k/N), k = 0..N-1) -- ALGLIB -- Copyright 29.05.2009 by Bochkanov Sergey *************************************************************************/ void fftc1d(complex_1d_array &a) { alglib_impl::ae_state _alglib_env_state; ae_int_t n; n = a.length(); alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::fftc1d(const_cast(a.c_ptr()), n, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* 1-dimensional complex inverse FFT. Array size N may be arbitrary number (composite or prime). Algorithm has O(N*logN) complexity for any N (composite or prime). See FFTC1D() description for more information about algorithm performance. 
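A minimal round-trip sketch (illustrative values only; uses ALGLIB's string
initializers for arrays): applying FFTC1D() and then FFTC1DInv() to the same
array reproduces the original data up to rounding error.

    alglib::complex_1d_array z = "[1,2,3,4]";
    alglib::fftc1d(z);        // in-place forward DFT
    alglib::fftc1dinv(z);     // in-place inverse DFT; z is again ~[1,2,3,4]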
INPUT PARAMETERS A - array[0..N-1] - complex array to be transformed N - problem size OUTPUT PARAMETERS A - inverse DFT of a input array, array[0..N-1] A_out[j] = SUM(A_in[k]/N*exp(+2*pi*sqrt(-1)*j*k/N), k = 0..N-1) -- ALGLIB -- Copyright 29.05.2009 by Bochkanov Sergey *************************************************************************/ void fftc1dinv(complex_1d_array &a, const ae_int_t n) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::fftc1dinv(const_cast(a.c_ptr()), n, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* 1-dimensional complex inverse FFT. Array size N may be arbitrary number (composite or prime). Algorithm has O(N*logN) complexity for any N (composite or prime). See FFTC1D() description for more information about algorithm performance. INPUT PARAMETERS A - array[0..N-1] - complex array to be transformed N - problem size OUTPUT PARAMETERS A - inverse DFT of a input array, array[0..N-1] A_out[j] = SUM(A_in[k]/N*exp(+2*pi*sqrt(-1)*j*k/N), k = 0..N-1) -- ALGLIB -- Copyright 29.05.2009 by Bochkanov Sergey *************************************************************************/ void fftc1dinv(complex_1d_array &a) { alglib_impl::ae_state _alglib_env_state; ae_int_t n; n = a.length(); alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::fftc1dinv(const_cast(a.c_ptr()), n, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* 1-dimensional real FFT. Algorithm has O(N*logN) complexity for any N (composite or prime). INPUT PARAMETERS A - array[0..N-1] - real function to be transformed N - problem size OUTPUT PARAMETERS F - DFT of a input array, array[0..N-1] F[j] = SUM(A[k]*exp(-2*pi*sqrt(-1)*j*k/N), k = 0..N-1) NOTE: F[] satisfies symmetry property F[k] = conj(F[N-k]), so just one half of array is usually needed. But for convinience subroutine returns full complex array (with frequencies above N/2), so its result may be used by other FFT-related subroutines. -- ALGLIB -- Copyright 01.06.2009 by Bochkanov Sergey *************************************************************************/ void fftr1d(const real_1d_array &a, const ae_int_t n, complex_1d_array &f) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::fftr1d(const_cast(a.c_ptr()), n, const_cast(f.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* 1-dimensional real FFT. Algorithm has O(N*logN) complexity for any N (composite or prime). INPUT PARAMETERS A - array[0..N-1] - real function to be transformed N - problem size OUTPUT PARAMETERS F - DFT of a input array, array[0..N-1] F[j] = SUM(A[k]*exp(-2*pi*sqrt(-1)*j*k/N), k = 0..N-1) NOTE: F[] satisfies symmetry property F[k] = conj(F[N-k]), so just one half of array is usually needed. But for convinience subroutine returns full complex array (with frequencies above N/2), so its result may be used by other FFT-related subroutines. 
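A minimal usage sketch (illustrative values only; uses ALGLIB's string
initializers for arrays), relying on the symmetry property noted above:

    alglib::real_1d_array    x = "[1,2,3,4]";
    alglib::complex_1d_array f;
    alglib::fftr1d(x, f);     // f satisfies the symmetry property noted above
    alglib::real_1d_array    y;
    alglib::fftr1dinv(f, y);  // y is again ~[1,2,3,4]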
-- ALGLIB -- Copyright 01.06.2009 by Bochkanov Sergey *************************************************************************/ void fftr1d(const real_1d_array &a, complex_1d_array &f) { alglib_impl::ae_state _alglib_env_state; ae_int_t n; n = a.length(); alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::fftr1d(const_cast(a.c_ptr()), n, const_cast(f.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* 1-dimensional real inverse FFT. Algorithm has O(N*logN) complexity for any N (composite or prime). INPUT PARAMETERS F - array[0..floor(N/2)] - frequencies from forward real FFT N - problem size OUTPUT PARAMETERS A - inverse DFT of a input array, array[0..N-1] NOTE: F[] should satisfy symmetry property F[k] = conj(F[N-k]), so just one half of frequencies array is needed - elements from 0 to floor(N/2). F[0] is ALWAYS real. If N is even F[floor(N/2)] is real too. If N is odd, then F[floor(N/2)] has no special properties. Relying on properties noted above, FFTR1DInv subroutine uses only elements from 0th to floor(N/2)-th. It ignores imaginary part of F[0], and in case N is even it ignores imaginary part of F[floor(N/2)] too. When you call this function using full arguments list - "FFTR1DInv(F,N,A)" - you can pass either either frequencies array with N elements or reduced array with roughly N/2 elements - subroutine will successfully transform both. If you call this function using reduced arguments list - "FFTR1DInv(F,A)" - you must pass FULL array with N elements (although higher N/2 are still not used) because array size is used to automatically determine FFT length -- ALGLIB -- Copyright 01.06.2009 by Bochkanov Sergey *************************************************************************/ void fftr1dinv(const complex_1d_array &f, const ae_int_t n, real_1d_array &a) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::fftr1dinv(const_cast(f.c_ptr()), n, const_cast(a.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* 1-dimensional real inverse FFT. Algorithm has O(N*logN) complexity for any N (composite or prime). INPUT PARAMETERS F - array[0..floor(N/2)] - frequencies from forward real FFT N - problem size OUTPUT PARAMETERS A - inverse DFT of a input array, array[0..N-1] NOTE: F[] should satisfy symmetry property F[k] = conj(F[N-k]), so just one half of frequencies array is needed - elements from 0 to floor(N/2). F[0] is ALWAYS real. If N is even F[floor(N/2)] is real too. If N is odd, then F[floor(N/2)] has no special properties. Relying on properties noted above, FFTR1DInv subroutine uses only elements from 0th to floor(N/2)-th. It ignores imaginary part of F[0], and in case N is even it ignores imaginary part of F[floor(N/2)] too. When you call this function using full arguments list - "FFTR1DInv(F,N,A)" - you can pass either either frequencies array with N elements or reduced array with roughly N/2 elements - subroutine will successfully transform both. 
If you call this function using reduced arguments list - "FFTR1DInv(F,A)" - you must pass FULL array with N elements (although higher N/2 are still not used) because array size is used to automatically determine FFT length -- ALGLIB -- Copyright 01.06.2009 by Bochkanov Sergey *************************************************************************/ void fftr1dinv(const complex_1d_array &f, real_1d_array &a) { alglib_impl::ae_state _alglib_env_state; ae_int_t n; n = f.length(); alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::fftr1dinv(const_cast(f.c_ptr()), n, const_cast(a.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* 1-dimensional Fast Hartley Transform. Algorithm has O(N*logN) complexity for any N (composite or prime). INPUT PARAMETERS A - array[0..N-1] - real function to be transformed N - problem size OUTPUT PARAMETERS A - FHT of a input array, array[0..N-1], A_out[k] = sum(A_in[j]*(cos(2*pi*j*k/N)+sin(2*pi*j*k/N)), j=0..N-1) -- ALGLIB -- Copyright 04.06.2009 by Bochkanov Sergey *************************************************************************/ void fhtr1d(real_1d_array &a, const ae_int_t n) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::fhtr1d(const_cast(a.c_ptr()), n, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* 1-dimensional inverse FHT. Algorithm has O(N*logN) complexity for any N (composite or prime). INPUT PARAMETERS A - array[0..N-1] - complex array to be transformed N - problem size OUTPUT PARAMETERS A - inverse FHT of a input array, array[0..N-1] -- ALGLIB -- Copyright 29.05.2009 by Bochkanov Sergey *************************************************************************/ void fhtr1dinv(real_1d_array &a, const ae_int_t n) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::fhtr1dinv(const_cast(a.c_ptr()), n, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* 1-dimensional complex convolution. For given A/B returns conv(A,B) (non-circular). Subroutine can automatically choose between three implementations: straightforward O(M*N) formula for very small N (or M), overlap-add algorithm for cases where max(M,N) is significantly larger than min(M,N), but O(M*N) algorithm is too slow, and general FFT-based formula for cases where two previois algorithms are too slow. Algorithm has max(M,N)*log(max(M,N)) complexity for any M/N. INPUT PARAMETERS A - array[0..M-1] - complex function to be transformed M - problem size B - array[0..N-1] - complex function to be transformed N - problem size OUTPUT PARAMETERS R - convolution: A*B. array[0..N+M-2]. NOTE: It is assumed that A is zero at T<0, B is zero too. If one or both functions have non-zero values at negative T's, you can still use this subroutine - just shift its result correspondingly. 
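A minimal usage sketch of ConvC1D() as declared in this section follows; the complex sample values are assumptions made for illustration.

    #include <cstdio>
    #include "fasttransforms.h"
    using namespace alglib;

    int main()
    {
        // non-circular complex convolution: the result has M+N-1 = 3 elements
        complex_1d_array a = "[1+1i,2]";
        complex_1d_array b = "[1,1]";
        complex_1d_array r;
        convc1d(a, 2, b, 2, r);
        printf("%s\n", r.tostring(3).c_str());   // expected: [1+1i, 3+1i, 2]
        return 0;
    }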
-- ALGLIB -- Copyright 21.07.2009 by Bochkanov Sergey *************************************************************************/ void convc1d(const complex_1d_array &a, const ae_int_t m, const complex_1d_array &b, const ae_int_t n, complex_1d_array &r) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::convc1d(const_cast(a.c_ptr()), m, const_cast(b.c_ptr()), n, const_cast(r.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* 1-dimensional complex non-circular deconvolution (inverse of ConvC1D()). Algorithm has M*log(M)) complexity for any M (composite or prime). INPUT PARAMETERS A - array[0..M-1] - convolved signal, A = conv(R, B) M - convolved signal length B - array[0..N-1] - response N - response length, N<=M OUTPUT PARAMETERS R - deconvolved signal. array[0..M-N]. NOTE: deconvolution is unstable process and may result in division by zero (if your response function is degenerate, i.e. has zero Fourier coefficient). NOTE: It is assumed that A is zero at T<0, B is zero too. If one or both functions have non-zero values at negative T's, you can still use this subroutine - just shift its result correspondingly. -- ALGLIB -- Copyright 21.07.2009 by Bochkanov Sergey *************************************************************************/ void convc1dinv(const complex_1d_array &a, const ae_int_t m, const complex_1d_array &b, const ae_int_t n, complex_1d_array &r) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::convc1dinv(const_cast(a.c_ptr()), m, const_cast(b.c_ptr()), n, const_cast(r.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* 1-dimensional circular complex convolution. For given S/R returns conv(S,R) (circular). Algorithm has linearithmic complexity for any M/N. IMPORTANT: normal convolution is commutative, i.e. it is symmetric - conv(A,B)=conv(B,A). Cyclic convolution IS NOT. One function - S - is a signal, periodic function, and another - R - is a response, non-periodic function with limited length. INPUT PARAMETERS S - array[0..M-1] - complex periodic signal M - problem size B - array[0..N-1] - complex non-periodic response N - problem size OUTPUT PARAMETERS R - convolution: A*B. array[0..M-1]. NOTE: It is assumed that B is zero at T<0. If it has non-zero values at negative T's, you can still use this subroutine - just shift its result correspondingly. 
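The deconvolution routine ConvC1DInv() described above can be sanity-checked with a round trip: convolve a signal with a non-degenerate response, then deconvolve with the same response. The values below are illustrative assumptions.

    #include <cstdio>
    #include "fasttransforms.h"
    using namespace alglib;

    int main()
    {
        complex_1d_array sig  = "[1,2,3]";       // signal we want to recover
        complex_1d_array resp = "[1,0.5]";       // non-degenerate response
        complex_1d_array conv, rec;
        convc1d(sig, 3, resp, 2, conv);          // conv = conv(sig,resp), length M = 4
        convc1dinv(conv, 4, resp, 2, rec);       // rec has M-N+1 = 3 elements
        printf("%s\n", rec.tostring(3).c_str()); // approximately [1,2,3]
        return 0;
    }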
-- ALGLIB -- Copyright 21.07.2009 by Bochkanov Sergey *************************************************************************/ void convc1dcircular(const complex_1d_array &s, const ae_int_t m, const complex_1d_array &r, const ae_int_t n, complex_1d_array &c) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::convc1dcircular(const_cast(s.c_ptr()), m, const_cast(r.c_ptr()), n, const_cast(c.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* 1-dimensional circular complex deconvolution (inverse of ConvC1DCircular()). Algorithm has M*log(M)) complexity for any M (composite or prime). INPUT PARAMETERS A - array[0..M-1] - convolved periodic signal, A = conv(R, B) M - convolved signal length B - array[0..N-1] - non-periodic response N - response length OUTPUT PARAMETERS R - deconvolved signal. array[0..M-1]. NOTE: deconvolution is unstable process and may result in division by zero (if your response function is degenerate, i.e. has zero Fourier coefficient). NOTE: It is assumed that B is zero at T<0. If it has non-zero values at negative T's, you can still use this subroutine - just shift its result correspondingly. -- ALGLIB -- Copyright 21.07.2009 by Bochkanov Sergey *************************************************************************/ void convc1dcircularinv(const complex_1d_array &a, const ae_int_t m, const complex_1d_array &b, const ae_int_t n, complex_1d_array &r) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::convc1dcircularinv(const_cast(a.c_ptr()), m, const_cast(b.c_ptr()), n, const_cast(r.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* 1-dimensional real convolution. Analogous to ConvC1D(), see ConvC1D() comments for more details. INPUT PARAMETERS A - array[0..M-1] - real function to be transformed M - problem size B - array[0..N-1] - real function to be transformed N - problem size OUTPUT PARAMETERS R - convolution: A*B. array[0..N+M-2]. NOTE: It is assumed that A is zero at T<0, B is zero too. If one or both functions have non-zero values at negative T's, you can still use this subroutine - just shift its result correspondingly. -- ALGLIB -- Copyright 21.07.2009 by Bochkanov Sergey *************************************************************************/ void convr1d(const real_1d_array &a, const ae_int_t m, const real_1d_array &b, const ae_int_t n, real_1d_array &r) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::convr1d(const_cast(a.c_ptr()), m, const_cast(b.c_ptr()), n, const_cast(r.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* 1-dimensional real deconvolution (inverse of ConvC1D()). Algorithm has M*log(M)) complexity for any M (composite or prime). 
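A minimal ConvR1D() sketch with assumed sample values:

    #include <cstdio>
    #include "fasttransforms.h"
    using namespace alglib;

    int main()
    {
        // real non-circular convolution: result length is M+N-1 = 4
        real_1d_array a = "[1,2,3]";
        real_1d_array b = "[1,1]";
        real_1d_array r;
        convr1d(a, 3, b, 2, r);
        printf("%s\n", r.tostring(1).c_str());   // expected: [1,3,5,3]
        return 0;
    }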
INPUT PARAMETERS A - array[0..M-1] - convolved signal, A = conv(R, B) M - convolved signal length B - array[0..N-1] - response N - response length, N<=M OUTPUT PARAMETERS R - deconvolved signal. array[0..M-N]. NOTE: deconvolution is unstable process and may result in division by zero (if your response function is degenerate, i.e. has zero Fourier coefficient). NOTE: It is assumed that A is zero at T<0, B is zero too. If one or both functions have non-zero values at negative T's, you can still use this subroutine - just shift its result correspondingly. -- ALGLIB -- Copyright 21.07.2009 by Bochkanov Sergey *************************************************************************/ void convr1dinv(const real_1d_array &a, const ae_int_t m, const real_1d_array &b, const ae_int_t n, real_1d_array &r) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::convr1dinv(const_cast(a.c_ptr()), m, const_cast(b.c_ptr()), n, const_cast(r.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* 1-dimensional circular real convolution. Analogous to ConvC1DCircular(), see ConvC1DCircular() comments for more details. INPUT PARAMETERS S - array[0..M-1] - real signal M - problem size B - array[0..N-1] - real response N - problem size OUTPUT PARAMETERS R - convolution: A*B. array[0..M-1]. NOTE: It is assumed that B is zero at T<0. If it has non-zero values at negative T's, you can still use this subroutine - just shift its result correspondingly. -- ALGLIB -- Copyright 21.07.2009 by Bochkanov Sergey *************************************************************************/ void convr1dcircular(const real_1d_array &s, const ae_int_t m, const real_1d_array &r, const ae_int_t n, real_1d_array &c) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::convr1dcircular(const_cast(s.c_ptr()), m, const_cast(r.c_ptr()), n, const_cast(c.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* 1-dimensional complex deconvolution (inverse of ConvC1D()). Algorithm has M*log(M)) complexity for any M (composite or prime). INPUT PARAMETERS A - array[0..M-1] - convolved signal, A = conv(R, B) M - convolved signal length B - array[0..N-1] - response N - response length OUTPUT PARAMETERS R - deconvolved signal. array[0..M-N]. NOTE: deconvolution is unstable process and may result in division by zero (if your response function is degenerate, i.e. has zero Fourier coefficient). NOTE: It is assumed that B is zero at T<0. If it has non-zero values at negative T's, you can still use this subroutine - just shift its result correspondingly. 
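The circular (periodic) flavour described above can be pictured with the following sketch, again with assumed values: the part of the response that would extend past the end of the period wraps around to the head of the output.

    #include <cstdio>
    #include "fasttransforms.h"
    using namespace alglib;

    int main()
    {
        real_1d_array s = "[0,0,0,1]";           // periodic signal: single pulse at the last sample
        real_1d_array b = "[1,2,3]";             // non-periodic response
        real_1d_array c;
        convr1dcircular(s, 4, b, 3, c);
        printf("%s\n", c.tostring(1).c_str());   // expected: [2,3,0,1] (tail wrapped to the head)
        return 0;
    }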
-- ALGLIB -- Copyright 21.07.2009 by Bochkanov Sergey *************************************************************************/ void convr1dcircularinv(const real_1d_array &a, const ae_int_t m, const real_1d_array &b, const ae_int_t n, real_1d_array &r) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::convr1dcircularinv(const_cast(a.c_ptr()), m, const_cast(b.c_ptr()), n, const_cast(r.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* 1-dimensional complex cross-correlation. For given Pattern/Signal returns corr(Pattern,Signal) (non-circular). Correlation is calculated using reduction to convolution. Algorithm with max(N,N)*log(max(N,N)) complexity is used (see ConvC1D() for more info about performance). IMPORTANT: for historical reasons subroutine accepts its parameters in reversed order: CorrC1D(Signal, Pattern) = Pattern x Signal (using traditional definition of cross-correlation, denoting cross-correlation as "x"). INPUT PARAMETERS Signal - array[0..N-1] - complex function to be transformed, signal containing pattern N - problem size Pattern - array[0..M-1] - complex function to be transformed, pattern to search withing signal M - problem size OUTPUT PARAMETERS R - cross-correlation, array[0..N+M-2]: * positive lags are stored in R[0..N-1], R[i] = sum(conj(pattern[j])*signal[i+j] * negative lags are stored in R[N..N+M-2], R[N+M-1-i] = sum(conj(pattern[j])*signal[-i+j] NOTE: It is assumed that pattern domain is [0..M-1]. If Pattern is non-zero on [-K..M-1], you can still use this subroutine, just shift result by K. -- ALGLIB -- Copyright 21.07.2009 by Bochkanov Sergey *************************************************************************/ void corrc1d(const complex_1d_array &signal, const ae_int_t n, const complex_1d_array &pattern, const ae_int_t m, complex_1d_array &r) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::corrc1d(const_cast(signal.c_ptr()), n, const_cast(pattern.c_ptr()), m, const_cast(r.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* 1-dimensional circular complex cross-correlation. For given Pattern/Signal returns corr(Pattern,Signal) (circular). Algorithm has linearithmic complexity for any M/N. IMPORTANT: for historical reasons subroutine accepts its parameters in reversed order: CorrC1DCircular(Signal, Pattern) = Pattern x Signal (using traditional definition of cross-correlation, denoting cross-correlation as "x"). INPUT PARAMETERS Signal - array[0..N-1] - complex function to be transformed, periodic signal containing pattern N - problem size Pattern - array[0..M-1] - complex function to be transformed, non-periodic pattern to search withing signal M - problem size OUTPUT PARAMETERS R - convolution: A*B. array[0..M-1]. 
-- ALGLIB -- Copyright 21.07.2009 by Bochkanov Sergey *************************************************************************/ void corrc1dcircular(const complex_1d_array &signal, const ae_int_t m, const complex_1d_array &pattern, const ae_int_t n, complex_1d_array &c) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::corrc1dcircular(const_cast(signal.c_ptr()), m, const_cast(pattern.c_ptr()), n, const_cast(c.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* 1-dimensional real cross-correlation. For given Pattern/Signal returns corr(Pattern,Signal) (non-circular). Correlation is calculated using reduction to convolution. Algorithm with max(N,N)*log(max(N,N)) complexity is used (see ConvC1D() for more info about performance). IMPORTANT: for historical reasons subroutine accepts its parameters in reversed order: CorrR1D(Signal, Pattern) = Pattern x Signal (using traditional definition of cross-correlation, denoting cross-correlation as "x"). INPUT PARAMETERS Signal - array[0..N-1] - real function to be transformed, signal containing pattern N - problem size Pattern - array[0..M-1] - real function to be transformed, pattern to search withing signal M - problem size OUTPUT PARAMETERS R - cross-correlation, array[0..N+M-2]: * positive lags are stored in R[0..N-1], R[i] = sum(pattern[j]*signal[i+j] * negative lags are stored in R[N..N+M-2], R[N+M-1-i] = sum(pattern[j]*signal[-i+j] NOTE: It is assumed that pattern domain is [0..M-1]. If Pattern is non-zero on [-K..M-1], you can still use this subroutine, just shift result by K. -- ALGLIB -- Copyright 21.07.2009 by Bochkanov Sergey *************************************************************************/ void corrr1d(const real_1d_array &signal, const ae_int_t n, const real_1d_array &pattern, const ae_int_t m, real_1d_array &r) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::corrr1d(const_cast(signal.c_ptr()), n, const_cast(pattern.c_ptr()), m, const_cast(r.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* 1-dimensional circular real cross-correlation. For given Pattern/Signal returns corr(Pattern,Signal) (circular). Algorithm has linearithmic complexity for any M/N. IMPORTANT: for historical reasons subroutine accepts its parameters in reversed order: CorrR1DCircular(Signal, Pattern) = Pattern x Signal (using traditional definition of cross-correlation, denoting cross-correlation as "x"). INPUT PARAMETERS Signal - array[0..N-1] - real function to be transformed, periodic signal containing pattern N - problem size Pattern - array[0..M-1] - real function to be transformed, non-periodic pattern to search withing signal M - problem size OUTPUT PARAMETERS R - convolution: A*B. array[0..M-1]. 
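The lag layout of CorrR1D() described above can be seen in the following sketch (sample values are illustrative): positive lags occupy R[0..N-1], negative lags occupy R[N..N+M-2], and the maximum marks the position of the pattern inside the signal.

    #include <cstdio>
    #include "fasttransforms.h"
    using namespace alglib;

    int main()
    {
        // note the reversed (Signal, Pattern) argument order mentioned above
        real_1d_array signal  = "[0,1,2,0]";     // contains the pattern starting at index 1
        real_1d_array pattern = "[1,2]";
        real_1d_array r;
        corrr1d(signal, 4, pattern, 2, r);
        printf("%s\n", r.tostring(1).c_str());   // expected: [2,5,2,0,0]; peak r[1]=5 locates the pattern
        return 0;
    }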
-- ALGLIB -- Copyright 21.07.2009 by Bochkanov Sergey *************************************************************************/ void corrr1dcircular(const real_1d_array &signal, const ae_int_t m, const real_1d_array &pattern, const ae_int_t n, real_1d_array &c) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::corrr1dcircular(const_cast(signal.c_ptr()), m, const_cast(pattern.c_ptr()), n, const_cast(c.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } } ///////////////////////////////////////////////////////////////////////// // // THIS SECTION CONTAINS IMPLEMENTATION OF COMPUTATIONAL CORE // ///////////////////////////////////////////////////////////////////////// namespace alglib_impl { /************************************************************************* 1-dimensional complex FFT. Array size N may be arbitrary number (composite or prime). Composite N's are handled with cache-oblivious variation of a Cooley-Tukey algorithm. Small prime-factors are transformed using hard coded codelets (similar to FFTW codelets, but without low-level optimization), large prime-factors are handled with Bluestein's algorithm. Fastests transforms are for smooth N's (prime factors are 2, 3, 5 only), most fast for powers of 2. When N have prime factors larger than these, but orders of magnitude smaller than N, computations will be about 4 times slower than for nearby highly composite N's. When N itself is prime, speed will be 6 times lower. Algorithm has O(N*logN) complexity for any N (composite or prime). INPUT PARAMETERS A - array[0..N-1] - complex function to be transformed N - problem size OUTPUT PARAMETERS A - DFT of a input array, array[0..N-1] A_out[j] = SUM(A_in[k]*exp(-2*pi*sqrt(-1)*j*k/N), k = 0..N-1) -- ALGLIB -- Copyright 29.05.2009 by Bochkanov Sergey *************************************************************************/ void fftc1d(/* Complex */ ae_vector* a, ae_int_t n, ae_state *_state) { ae_frame _frame_block; fasttransformplan plan; ae_int_t i; ae_vector buf; ae_frame_make(_state, &_frame_block); _fasttransformplan_init(&plan, _state); ae_vector_init(&buf, 0, DT_REAL, _state); ae_assert(n>0, "FFTC1D: incorrect N!", _state); ae_assert(a->cnt>=n, "FFTC1D: Length(A)ptr.p_complex[i].x; buf.ptr.p_double[2*i+1] = a->ptr.p_complex[i].y; } /* * Generate plan and execute it. * * Plan is a combination of a successive factorizations of N and * precomputed data. It is much like a FFTW plan, but is not stored * between subroutine calls and is much simpler. */ ftcomplexfftplan(n, 1, &plan, _state); ftapplyplan(&plan, &buf, 0, 1, _state); /* * result */ for(i=0; i<=n-1; i++) { a->ptr.p_complex[i].x = buf.ptr.p_double[2*i+0]; a->ptr.p_complex[i].y = buf.ptr.p_double[2*i+1]; } ae_frame_leave(_state); } /************************************************************************* 1-dimensional complex inverse FFT. Array size N may be arbitrary number (composite or prime). Algorithm has O(N*logN) complexity for any N (composite or prime). See FFTC1D() description for more information about algorithm performance. 
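A minimal forward complex FFT call through the public C++ wrapper, with assumed values; for a constant sequence all energy lands in the zero-frequency bin.

    #include <cstdio>
    #include "fasttransforms.h"
    using namespace alglib;

    int main()
    {
        complex_1d_array z = "[1i,1i,1i,1i]";
        fftc1d(z);
        printf("%s\n", z.tostring(3).c_str());   // expected: [4i, 0, 0, 0]
        return 0;
    }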
INPUT PARAMETERS A - array[0..N-1] - complex array to be transformed N - problem size OUTPUT PARAMETERS A - inverse DFT of a input array, array[0..N-1] A_out[j] = SUM(A_in[k]/N*exp(+2*pi*sqrt(-1)*j*k/N), k = 0..N-1) -- ALGLIB -- Copyright 29.05.2009 by Bochkanov Sergey *************************************************************************/ void fftc1dinv(/* Complex */ ae_vector* a, ae_int_t n, ae_state *_state) { ae_int_t i; ae_assert(n>0, "FFTC1DInv: incorrect N!", _state); ae_assert(a->cnt>=n, "FFTC1DInv: Length(A)ptr.p_complex[i].y = -a->ptr.p_complex[i].y; } fftc1d(a, n, _state); for(i=0; i<=n-1; i++) { a->ptr.p_complex[i].x = a->ptr.p_complex[i].x/n; a->ptr.p_complex[i].y = -a->ptr.p_complex[i].y/n; } } /************************************************************************* 1-dimensional real FFT. Algorithm has O(N*logN) complexity for any N (composite or prime). INPUT PARAMETERS A - array[0..N-1] - real function to be transformed N - problem size OUTPUT PARAMETERS F - DFT of a input array, array[0..N-1] F[j] = SUM(A[k]*exp(-2*pi*sqrt(-1)*j*k/N), k = 0..N-1) NOTE: F[] satisfies symmetry property F[k] = conj(F[N-k]), so just one half of array is usually needed. But for convinience subroutine returns full complex array (with frequencies above N/2), so its result may be used by other FFT-related subroutines. -- ALGLIB -- Copyright 01.06.2009 by Bochkanov Sergey *************************************************************************/ void fftr1d(/* Real */ ae_vector* a, ae_int_t n, /* Complex */ ae_vector* f, ae_state *_state) { ae_frame _frame_block; ae_int_t i; ae_int_t n2; ae_int_t idx; ae_complex hn; ae_complex hmnc; ae_complex v; ae_vector buf; fasttransformplan plan; ae_frame_make(_state, &_frame_block); ae_vector_clear(f); ae_vector_init(&buf, 0, DT_REAL, _state); _fasttransformplan_init(&plan, _state); ae_assert(n>0, "FFTR1D: incorrect N!", _state); ae_assert(a->cnt>=n, "FFTR1D: Length(A)ptr.p_complex[0] = ae_complex_from_d(a->ptr.p_double[0]); ae_frame_leave(_state); return; } if( n==2 ) { ae_vector_set_length(f, 2, _state); f->ptr.p_complex[0].x = a->ptr.p_double[0]+a->ptr.p_double[1]; f->ptr.p_complex[0].y = (double)(0); f->ptr.p_complex[1].x = a->ptr.p_double[0]-a->ptr.p_double[1]; f->ptr.p_complex[1].y = (double)(0); ae_frame_leave(_state); return; } /* * Choose between odd-size and even-size FFTs */ if( n%2==0 ) { /* * even-size real FFT, use reduction to the complex task */ n2 = n/2; ae_vector_set_length(&buf, n, _state); ae_v_move(&buf.ptr.p_double[0], 1, &a->ptr.p_double[0], 1, ae_v_len(0,n-1)); ftcomplexfftplan(n2, 1, &plan, _state); ftapplyplan(&plan, &buf, 0, 1, _state); ae_vector_set_length(f, n, _state); for(i=0; i<=n2; i++) { idx = 2*(i%n2); hn.x = buf.ptr.p_double[idx+0]; hn.y = buf.ptr.p_double[idx+1]; idx = 2*((n2-i)%n2); hmnc.x = buf.ptr.p_double[idx+0]; hmnc.y = -buf.ptr.p_double[idx+1]; v.x = -ae_sin(-2*ae_pi*i/n, _state); v.y = ae_cos(-2*ae_pi*i/n, _state); f->ptr.p_complex[i] = ae_c_sub(ae_c_add(hn,hmnc),ae_c_mul(v,ae_c_sub(hn,hmnc))); f->ptr.p_complex[i].x = 0.5*f->ptr.p_complex[i].x; f->ptr.p_complex[i].y = 0.5*f->ptr.p_complex[i].y; } for(i=n2+1; i<=n-1; i++) { f->ptr.p_complex[i] = ae_c_conj(f->ptr.p_complex[n-i], _state); } } else { /* * use complex FFT */ ae_vector_set_length(f, n, _state); for(i=0; i<=n-1; i++) { f->ptr.p_complex[i] = ae_complex_from_d(a->ptr.p_double[i]); } fftc1d(f, n, _state); } ae_frame_leave(_state); } /************************************************************************* 1-dimensional real inverse FFT. 
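The conjugate-and-rescale trick used by the implementation above (the inverse is computed as conj(FFT(conj(x)))/N) can be checked with a simple round trip; values are assumed for illustration.

    #include <cstdio>
    #include "fasttransforms.h"
    using namespace alglib;

    int main()
    {
        // fftc1dinv() undoes fftc1d(), including the 1/N normalization
        complex_1d_array z = "[1+1i,-2,3i,0]";
        fftc1d(z);
        fftc1dinv(z);
        printf("%s\n", z.tostring(3).c_str());   // approximately [1+1i, -2, 3i, 0]
        return 0;
    }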
Algorithm has O(N*logN) complexity for any N (composite or prime). INPUT PARAMETERS F - array[0..floor(N/2)] - frequencies from forward real FFT N - problem size OUTPUT PARAMETERS A - inverse DFT of a input array, array[0..N-1] NOTE: F[] should satisfy symmetry property F[k] = conj(F[N-k]), so just one half of frequencies array is needed - elements from 0 to floor(N/2). F[0] is ALWAYS real. If N is even F[floor(N/2)] is real too. If N is odd, then F[floor(N/2)] has no special properties. Relying on properties noted above, FFTR1DInv subroutine uses only elements from 0th to floor(N/2)-th. It ignores imaginary part of F[0], and in case N is even it ignores imaginary part of F[floor(N/2)] too. When you call this function using full arguments list - "FFTR1DInv(F,N,A)" - you can pass either either frequencies array with N elements or reduced array with roughly N/2 elements - subroutine will successfully transform both. If you call this function using reduced arguments list - "FFTR1DInv(F,A)" - you must pass FULL array with N elements (although higher N/2 are still not used) because array size is used to automatically determine FFT length -- ALGLIB -- Copyright 01.06.2009 by Bochkanov Sergey *************************************************************************/ void fftr1dinv(/* Complex */ ae_vector* f, ae_int_t n, /* Real */ ae_vector* a, ae_state *_state) { ae_frame _frame_block; ae_int_t i; ae_vector h; ae_vector fh; ae_frame_make(_state, &_frame_block); ae_vector_clear(a); ae_vector_init(&h, 0, DT_REAL, _state); ae_vector_init(&fh, 0, DT_COMPLEX, _state); ae_assert(n>0, "FFTR1DInv: incorrect N!", _state); ae_assert(f->cnt>=ae_ifloor((double)n/(double)2, _state)+1, "FFTR1DInv: Length(F)ptr.p_complex[0].x, _state), "FFTR1DInv: F contains infinite or NAN values!", _state); for(i=1; i<=ae_ifloor((double)n/(double)2, _state)-1; i++) { ae_assert(ae_isfinite(f->ptr.p_complex[i].x, _state)&&ae_isfinite(f->ptr.p_complex[i].y, _state), "FFTR1DInv: F contains infinite or NAN values!", _state); } ae_assert(ae_isfinite(f->ptr.p_complex[ae_ifloor((double)n/(double)2, _state)].x, _state), "FFTR1DInv: F contains infinite or NAN values!", _state); if( n%2!=0 ) { ae_assert(ae_isfinite(f->ptr.p_complex[ae_ifloor((double)n/(double)2, _state)].y, _state), "FFTR1DInv: F contains infinite or NAN values!", _state); } /* * Special case: N=1, FFT is just identity transform. * After this block we assume that N is strictly greater than 1. */ if( n==1 ) { ae_vector_set_length(a, 1, _state); a->ptr.p_double[0] = f->ptr.p_complex[0].x; ae_frame_leave(_state); return; } /* * inverse real FFT is reduced to the inverse real FHT, * which is reduced to the forward real FHT, * which is reduced to the forward real FFT. 
* * Don't worry, it is really compact and efficient reduction :) */ ae_vector_set_length(&h, n, _state); ae_vector_set_length(a, n, _state); h.ptr.p_double[0] = f->ptr.p_complex[0].x; for(i=1; i<=ae_ifloor((double)n/(double)2, _state)-1; i++) { h.ptr.p_double[i] = f->ptr.p_complex[i].x-f->ptr.p_complex[i].y; h.ptr.p_double[n-i] = f->ptr.p_complex[i].x+f->ptr.p_complex[i].y; } if( n%2==0 ) { h.ptr.p_double[ae_ifloor((double)n/(double)2, _state)] = f->ptr.p_complex[ae_ifloor((double)n/(double)2, _state)].x; } else { h.ptr.p_double[ae_ifloor((double)n/(double)2, _state)] = f->ptr.p_complex[ae_ifloor((double)n/(double)2, _state)].x-f->ptr.p_complex[ae_ifloor((double)n/(double)2, _state)].y; h.ptr.p_double[ae_ifloor((double)n/(double)2, _state)+1] = f->ptr.p_complex[ae_ifloor((double)n/(double)2, _state)].x+f->ptr.p_complex[ae_ifloor((double)n/(double)2, _state)].y; } fftr1d(&h, n, &fh, _state); for(i=0; i<=n-1; i++) { a->ptr.p_double[i] = (fh.ptr.p_complex[i].x-fh.ptr.p_complex[i].y)/n; } ae_frame_leave(_state); } /************************************************************************* Internal subroutine. Never call it directly! -- ALGLIB -- Copyright 01.06.2009 by Bochkanov Sergey *************************************************************************/ void fftr1dinternaleven(/* Real */ ae_vector* a, ae_int_t n, /* Real */ ae_vector* buf, fasttransformplan* plan, ae_state *_state) { double x; double y; ae_int_t i; ae_int_t n2; ae_int_t idx; ae_complex hn; ae_complex hmnc; ae_complex v; ae_assert(n>0&&n%2==0, "FFTR1DEvenInplace: incorrect N!", _state); /* * Special cases: * * N=2 * * After this block we assume that N is strictly greater than 2 */ if( n==2 ) { x = a->ptr.p_double[0]+a->ptr.p_double[1]; y = a->ptr.p_double[0]-a->ptr.p_double[1]; a->ptr.p_double[0] = x; a->ptr.p_double[1] = y; return; } /* * even-size real FFT, use reduction to the complex task */ n2 = n/2; ae_v_move(&buf->ptr.p_double[0], 1, &a->ptr.p_double[0], 1, ae_v_len(0,n-1)); ftapplyplan(plan, buf, 0, 1, _state); a->ptr.p_double[0] = buf->ptr.p_double[0]+buf->ptr.p_double[1]; for(i=1; i<=n2-1; i++) { idx = 2*(i%n2); hn.x = buf->ptr.p_double[idx+0]; hn.y = buf->ptr.p_double[idx+1]; idx = 2*(n2-i); hmnc.x = buf->ptr.p_double[idx+0]; hmnc.y = -buf->ptr.p_double[idx+1]; v.x = -ae_sin(-2*ae_pi*i/n, _state); v.y = ae_cos(-2*ae_pi*i/n, _state); v = ae_c_sub(ae_c_add(hn,hmnc),ae_c_mul(v,ae_c_sub(hn,hmnc))); a->ptr.p_double[2*i+0] = 0.5*v.x; a->ptr.p_double[2*i+1] = 0.5*v.y; } a->ptr.p_double[1] = buf->ptr.p_double[0]-buf->ptr.p_double[1]; } /************************************************************************* Internal subroutine. Never call it directly! -- ALGLIB -- Copyright 01.06.2009 by Bochkanov Sergey *************************************************************************/ void fftr1dinvinternaleven(/* Real */ ae_vector* a, ae_int_t n, /* Real */ ae_vector* buf, fasttransformplan* plan, ae_state *_state) { double x; double y; double t; ae_int_t i; ae_int_t n2; ae_assert(n>0&&n%2==0, "FFTR1DInvInternalEven: incorrect N!", _state); /* * Special cases: * * N=2 * * After this block we assume that N is strictly greater than 2 */ if( n==2 ) { x = 0.5*(a->ptr.p_double[0]+a->ptr.p_double[1]); y = 0.5*(a->ptr.p_double[0]-a->ptr.p_double[1]); a->ptr.p_double[0] = x; a->ptr.p_double[1] = y; return; } /* * inverse real FFT is reduced to the inverse real FHT, * which is reduced to the forward real FHT, * which is reduced to the forward real FFT. 
* * Don't worry, it is really compact and efficient reduction :) */ n2 = n/2; buf->ptr.p_double[0] = a->ptr.p_double[0]; for(i=1; i<=n2-1; i++) { x = a->ptr.p_double[2*i+0]; y = a->ptr.p_double[2*i+1]; buf->ptr.p_double[i] = x-y; buf->ptr.p_double[n-i] = x+y; } buf->ptr.p_double[n2] = a->ptr.p_double[1]; fftr1dinternaleven(buf, n, a, plan, _state); a->ptr.p_double[0] = buf->ptr.p_double[0]/n; t = (double)1/(double)n; for(i=1; i<=n2-1; i++) { x = buf->ptr.p_double[2*i+0]; y = buf->ptr.p_double[2*i+1]; a->ptr.p_double[i] = t*(x-y); a->ptr.p_double[n-i] = t*(x+y); } a->ptr.p_double[n2] = buf->ptr.p_double[1]/n; } /************************************************************************* 1-dimensional Fast Hartley Transform. Algorithm has O(N*logN) complexity for any N (composite or prime). INPUT PARAMETERS A - array[0..N-1] - real function to be transformed N - problem size OUTPUT PARAMETERS A - FHT of a input array, array[0..N-1], A_out[k] = sum(A_in[j]*(cos(2*pi*j*k/N)+sin(2*pi*j*k/N)), j=0..N-1) -- ALGLIB -- Copyright 04.06.2009 by Bochkanov Sergey *************************************************************************/ void fhtr1d(/* Real */ ae_vector* a, ae_int_t n, ae_state *_state) { ae_frame _frame_block; ae_int_t i; ae_vector fa; ae_frame_make(_state, &_frame_block); ae_vector_init(&fa, 0, DT_COMPLEX, _state); ae_assert(n>0, "FHTR1D: incorrect N!", _state); /* * Special case: N=1, FHT is just identity transform. * After this block we assume that N is strictly greater than 1. */ if( n==1 ) { ae_frame_leave(_state); return; } /* * Reduce FHt to real FFT */ fftr1d(a, n, &fa, _state); for(i=0; i<=n-1; i++) { a->ptr.p_double[i] = fa.ptr.p_complex[i].x-fa.ptr.p_complex[i].y; } ae_frame_leave(_state); } /************************************************************************* 1-dimensional inverse FHT. Algorithm has O(N*logN) complexity for any N (composite or prime). INPUT PARAMETERS A - array[0..N-1] - complex array to be transformed N - problem size OUTPUT PARAMETERS A - inverse FHT of a input array, array[0..N-1] -- ALGLIB -- Copyright 29.05.2009 by Bochkanov Sergey *************************************************************************/ void fhtr1dinv(/* Real */ ae_vector* a, ae_int_t n, ae_state *_state) { ae_int_t i; ae_assert(n>0, "FHTR1DInv: incorrect N!", _state); /* * Special case: N=1, iFHT is just identity transform. * After this block we assume that N is strictly greater than 1. */ if( n==1 ) { return; } /* * Inverse FHT can be expressed in terms of the FHT as * * invfht(x) = fht(x)/N */ fhtr1d(a, n, _state); for(i=0; i<=n-1; i++) { a->ptr.p_double[i] = a->ptr.p_double[i]/n; } } /************************************************************************* 1-dimensional complex convolution. For given A/B returns conv(A,B) (non-circular). Subroutine can automatically choose between three implementations: straightforward O(M*N) formula for very small N (or M), overlap-add algorithm for cases where max(M,N) is significantly larger than min(M,N), but O(M*N) algorithm is too slow, and general FFT-based formula for cases where two previois algorithms are too slow. Algorithm has max(M,N)*log(max(M,N)) complexity for any M/N. INPUT PARAMETERS A - array[0..M-1] - complex function to be transformed M - problem size B - array[0..N-1] - complex function to be transformed N - problem size OUTPUT PARAMETERS R - convolution: A*B. array[0..N+M-2]. NOTE: It is assumed that A is zero at T<0, B is zero too. 
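The Hartley transform pair above, including the identity invfht(x) = fht(x)/N, can be illustrated with the following sketch (sample values are assumptions):

    #include <cstdio>
    #include "fasttransforms.h"
    using namespace alglib;

    int main()
    {
        // forward Hartley transform followed by its inverse
        real_1d_array a = "[1,2,3,4]";
        fhtr1d(a, 4);
        printf("%s\n", a.tostring(1).c_str());   // expected: [10,-4,-2,0]
        fhtr1dinv(a, 4);
        printf("%s\n", a.tostring(1).c_str());   // back to approximately [1,2,3,4]
        return 0;
    }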
If one or both functions have non-zero values at negative T's, you can still use this subroutine - just shift its result correspondingly. -- ALGLIB -- Copyright 21.07.2009 by Bochkanov Sergey *************************************************************************/ void convc1d(/* Complex */ ae_vector* a, ae_int_t m, /* Complex */ ae_vector* b, ae_int_t n, /* Complex */ ae_vector* r, ae_state *_state) { ae_vector_clear(r); ae_assert(n>0&&m>0, "ConvC1D: incorrect N or M!", _state); /* * normalize task: make M>=N, * so A will be longer that B. */ if( m0&&m>0)&&n<=m, "ConvC1DInv: incorrect N or M!", _state); p = ftbasefindsmooth(m, _state); ftcomplexfftplan(p, 1, &plan, _state); ae_vector_set_length(&buf, 2*p, _state); for(i=0; i<=m-1; i++) { buf.ptr.p_double[2*i+0] = a->ptr.p_complex[i].x; buf.ptr.p_double[2*i+1] = a->ptr.p_complex[i].y; } for(i=m; i<=p-1; i++) { buf.ptr.p_double[2*i+0] = (double)(0); buf.ptr.p_double[2*i+1] = (double)(0); } ae_vector_set_length(&buf2, 2*p, _state); for(i=0; i<=n-1; i++) { buf2.ptr.p_double[2*i+0] = b->ptr.p_complex[i].x; buf2.ptr.p_double[2*i+1] = b->ptr.p_complex[i].y; } for(i=n; i<=p-1; i++) { buf2.ptr.p_double[2*i+0] = (double)(0); buf2.ptr.p_double[2*i+1] = (double)(0); } ftapplyplan(&plan, &buf, 0, 1, _state); ftapplyplan(&plan, &buf2, 0, 1, _state); for(i=0; i<=p-1; i++) { c1.x = buf.ptr.p_double[2*i+0]; c1.y = buf.ptr.p_double[2*i+1]; c2.x = buf2.ptr.p_double[2*i+0]; c2.y = buf2.ptr.p_double[2*i+1]; c3 = ae_c_div(c1,c2); buf.ptr.p_double[2*i+0] = c3.x; buf.ptr.p_double[2*i+1] = -c3.y; } ftapplyplan(&plan, &buf, 0, 1, _state); t = (double)1/(double)p; ae_vector_set_length(r, m-n+1, _state); for(i=0; i<=m-n; i++) { r->ptr.p_complex[i].x = t*buf.ptr.p_double[2*i+0]; r->ptr.p_complex[i].y = -t*buf.ptr.p_double[2*i+1]; } ae_frame_leave(_state); } /************************************************************************* 1-dimensional circular complex convolution. For given S/R returns conv(S,R) (circular). Algorithm has linearithmic complexity for any M/N. IMPORTANT: normal convolution is commutative, i.e. it is symmetric - conv(A,B)=conv(B,A). Cyclic convolution IS NOT. One function - S - is a signal, periodic function, and another - R - is a response, non-periodic function with limited length. INPUT PARAMETERS S - array[0..M-1] - complex periodic signal M - problem size B - array[0..N-1] - complex non-periodic response N - problem size OUTPUT PARAMETERS R - convolution: A*B. array[0..M-1]. NOTE: It is assumed that B is zero at T<0. If it has non-zero values at negative T's, you can still use this subroutine - just shift its result correspondingly. -- ALGLIB -- Copyright 21.07.2009 by Bochkanov Sergey *************************************************************************/ void convc1dcircular(/* Complex */ ae_vector* s, ae_int_t m, /* Complex */ ae_vector* r, ae_int_t n, /* Complex */ ae_vector* c, ae_state *_state) { ae_frame _frame_block; ae_vector buf; ae_int_t i1; ae_int_t i2; ae_int_t j2; ae_frame_make(_state, &_frame_block); ae_vector_clear(c); ae_vector_init(&buf, 0, DT_COMPLEX, _state); ae_assert(n>0&&m>0, "ConvC1DCircular: incorrect N or M!", _state); /* * normalize task: make M>=N, * so A will be longer (at least - not shorter) that B. 
*/ if( mptr.p_complex[i1], 1, "N", ae_v_len(0,j2)); i1 = i1+m; } convc1dcircular(s, m, &buf, m, c, _state); ae_frame_leave(_state); return; } convc1dx(s, m, r, n, ae_true, -1, 0, c, _state); ae_frame_leave(_state); } /************************************************************************* 1-dimensional circular complex deconvolution (inverse of ConvC1DCircular()). Algorithm has M*log(M)) complexity for any M (composite or prime). INPUT PARAMETERS A - array[0..M-1] - convolved periodic signal, A = conv(R, B) M - convolved signal length B - array[0..N-1] - non-periodic response N - response length OUTPUT PARAMETERS R - deconvolved signal. array[0..M-1]. NOTE: deconvolution is unstable process and may result in division by zero (if your response function is degenerate, i.e. has zero Fourier coefficient). NOTE: It is assumed that B is zero at T<0. If it has non-zero values at negative T's, you can still use this subroutine - just shift its result correspondingly. -- ALGLIB -- Copyright 21.07.2009 by Bochkanov Sergey *************************************************************************/ void convc1dcircularinv(/* Complex */ ae_vector* a, ae_int_t m, /* Complex */ ae_vector* b, ae_int_t n, /* Complex */ ae_vector* r, ae_state *_state) { ae_frame _frame_block; ae_int_t i; ae_int_t i1; ae_int_t i2; ae_int_t j2; ae_vector buf; ae_vector buf2; ae_vector cbuf; fasttransformplan plan; ae_complex c1; ae_complex c2; ae_complex c3; double t; ae_frame_make(_state, &_frame_block); ae_vector_clear(r); ae_vector_init(&buf, 0, DT_REAL, _state); ae_vector_init(&buf2, 0, DT_REAL, _state); ae_vector_init(&cbuf, 0, DT_COMPLEX, _state); _fasttransformplan_init(&plan, _state); ae_assert(n>0&&m>0, "ConvC1DCircularInv: incorrect N or M!", _state); /* * normalize task: make M>=N, * so A will be longer (at least - not shorter) that B. */ if( mptr.p_complex[i1], 1, "N", ae_v_len(0,j2)); i1 = i1+m; } convc1dcircularinv(a, m, &cbuf, m, r, _state); ae_frame_leave(_state); return; } /* * Task is normalized */ ftcomplexfftplan(m, 1, &plan, _state); ae_vector_set_length(&buf, 2*m, _state); for(i=0; i<=m-1; i++) { buf.ptr.p_double[2*i+0] = a->ptr.p_complex[i].x; buf.ptr.p_double[2*i+1] = a->ptr.p_complex[i].y; } ae_vector_set_length(&buf2, 2*m, _state); for(i=0; i<=n-1; i++) { buf2.ptr.p_double[2*i+0] = b->ptr.p_complex[i].x; buf2.ptr.p_double[2*i+1] = b->ptr.p_complex[i].y; } for(i=n; i<=m-1; i++) { buf2.ptr.p_double[2*i+0] = (double)(0); buf2.ptr.p_double[2*i+1] = (double)(0); } ftapplyplan(&plan, &buf, 0, 1, _state); ftapplyplan(&plan, &buf2, 0, 1, _state); for(i=0; i<=m-1; i++) { c1.x = buf.ptr.p_double[2*i+0]; c1.y = buf.ptr.p_double[2*i+1]; c2.x = buf2.ptr.p_double[2*i+0]; c2.y = buf2.ptr.p_double[2*i+1]; c3 = ae_c_div(c1,c2); buf.ptr.p_double[2*i+0] = c3.x; buf.ptr.p_double[2*i+1] = -c3.y; } ftapplyplan(&plan, &buf, 0, 1, _state); t = (double)1/(double)m; ae_vector_set_length(r, m, _state); for(i=0; i<=m-1; i++) { r->ptr.p_complex[i].x = t*buf.ptr.p_double[2*i+0]; r->ptr.p_complex[i].y = -t*buf.ptr.p_double[2*i+1]; } ae_frame_leave(_state); } /************************************************************************* 1-dimensional real convolution. Analogous to ConvC1D(), see ConvC1D() comments for more details. INPUT PARAMETERS A - array[0..M-1] - real function to be transformed M - problem size B - array[0..N-1] - real function to be transformed N - problem size OUTPUT PARAMETERS R - convolution: A*B. array[0..N+M-2]. NOTE: It is assumed that A is zero at T<0, B is zero too. 
If one or both functions have non-zero values at negative T's, you can still use this subroutine - just shift its result correspondingly. -- ALGLIB -- Copyright 21.07.2009 by Bochkanov Sergey *************************************************************************/ void convr1d(/* Real */ ae_vector* a, ae_int_t m, /* Real */ ae_vector* b, ae_int_t n, /* Real */ ae_vector* r, ae_state *_state) { ae_vector_clear(r); ae_assert(n>0&&m>0, "ConvR1D: incorrect N or M!", _state); /* * normalize task: make M>=N, * so A will be longer that B. */ if( m0&&m>0)&&n<=m, "ConvR1DInv: incorrect N or M!", _state); p = ftbasefindsmootheven(m, _state); ae_vector_set_length(&buf, p, _state); ae_v_move(&buf.ptr.p_double[0], 1, &a->ptr.p_double[0], 1, ae_v_len(0,m-1)); for(i=m; i<=p-1; i++) { buf.ptr.p_double[i] = (double)(0); } ae_vector_set_length(&buf2, p, _state); ae_v_move(&buf2.ptr.p_double[0], 1, &b->ptr.p_double[0], 1, ae_v_len(0,n-1)); for(i=n; i<=p-1; i++) { buf2.ptr.p_double[i] = (double)(0); } ae_vector_set_length(&buf3, p, _state); ftcomplexfftplan(p/2, 1, &plan, _state); fftr1dinternaleven(&buf, p, &buf3, &plan, _state); fftr1dinternaleven(&buf2, p, &buf3, &plan, _state); buf.ptr.p_double[0] = buf.ptr.p_double[0]/buf2.ptr.p_double[0]; buf.ptr.p_double[1] = buf.ptr.p_double[1]/buf2.ptr.p_double[1]; for(i=1; i<=p/2-1; i++) { c1.x = buf.ptr.p_double[2*i+0]; c1.y = buf.ptr.p_double[2*i+1]; c2.x = buf2.ptr.p_double[2*i+0]; c2.y = buf2.ptr.p_double[2*i+1]; c3 = ae_c_div(c1,c2); buf.ptr.p_double[2*i+0] = c3.x; buf.ptr.p_double[2*i+1] = c3.y; } fftr1dinvinternaleven(&buf, p, &buf3, &plan, _state); ae_vector_set_length(r, m-n+1, _state); ae_v_move(&r->ptr.p_double[0], 1, &buf.ptr.p_double[0], 1, ae_v_len(0,m-n)); ae_frame_leave(_state); } /************************************************************************* 1-dimensional circular real convolution. Analogous to ConvC1DCircular(), see ConvC1DCircular() comments for more details. INPUT PARAMETERS S - array[0..M-1] - real signal M - problem size B - array[0..N-1] - real response N - problem size OUTPUT PARAMETERS R - convolution: A*B. array[0..M-1]. NOTE: It is assumed that B is zero at T<0. If it has non-zero values at negative T's, you can still use this subroutine - just shift its result correspondingly. -- ALGLIB -- Copyright 21.07.2009 by Bochkanov Sergey *************************************************************************/ void convr1dcircular(/* Real */ ae_vector* s, ae_int_t m, /* Real */ ae_vector* r, ae_int_t n, /* Real */ ae_vector* c, ae_state *_state) { ae_frame _frame_block; ae_vector buf; ae_int_t i1; ae_int_t i2; ae_int_t j2; ae_frame_make(_state, &_frame_block); ae_vector_clear(c); ae_vector_init(&buf, 0, DT_REAL, _state); ae_assert(n>0&&m>0, "ConvC1DCircular: incorrect N or M!", _state); /* * normalize task: make M>=N, * so A will be longer (at least - not shorter) that B. */ if( mptr.p_double[i1], 1, ae_v_len(0,j2)); i1 = i1+m; } convr1dcircular(s, m, &buf, m, c, _state); ae_frame_leave(_state); return; } /* * reduce to usual convolution */ convr1dx(s, m, r, n, ae_true, -1, 0, c, _state); ae_frame_leave(_state); } /************************************************************************* 1-dimensional complex deconvolution (inverse of ConvC1D()). Algorithm has M*log(M)) complexity for any M (composite or prime). INPUT PARAMETERS A - array[0..M-1] - convolved signal, A = conv(R, B) M - convolved signal length B - array[0..N-1] - response N - response length OUTPUT PARAMETERS R - deconvolved signal. array[0..M-N]. 
NOTE: deconvolution is unstable process and may result in division by zero (if your response function is degenerate, i.e. has zero Fourier coefficient). NOTE: It is assumed that B is zero at T<0. If it has non-zero values at negative T's, you can still use this subroutine - just shift its result correspondingly. -- ALGLIB -- Copyright 21.07.2009 by Bochkanov Sergey *************************************************************************/ void convr1dcircularinv(/* Real */ ae_vector* a, ae_int_t m, /* Real */ ae_vector* b, ae_int_t n, /* Real */ ae_vector* r, ae_state *_state) { ae_frame _frame_block; ae_int_t i; ae_int_t i1; ae_int_t i2; ae_int_t j2; ae_vector buf; ae_vector buf2; ae_vector buf3; ae_vector cbuf; ae_vector cbuf2; fasttransformplan plan; ae_complex c1; ae_complex c2; ae_complex c3; ae_frame_make(_state, &_frame_block); ae_vector_clear(r); ae_vector_init(&buf, 0, DT_REAL, _state); ae_vector_init(&buf2, 0, DT_REAL, _state); ae_vector_init(&buf3, 0, DT_REAL, _state); ae_vector_init(&cbuf, 0, DT_COMPLEX, _state); ae_vector_init(&cbuf2, 0, DT_COMPLEX, _state); _fasttransformplan_init(&plan, _state); ae_assert(n>0&&m>0, "ConvR1DCircularInv: incorrect N or M!", _state); /* * normalize task: make M>=N, * so A will be longer (at least - not shorter) that B. */ if( mptr.p_double[i1], 1, ae_v_len(0,j2)); i1 = i1+m; } convr1dcircularinv(a, m, &buf, m, r, _state); ae_frame_leave(_state); return; } /* * Task is normalized */ if( m%2==0 ) { /* * size is even, use fast even-size FFT */ ae_vector_set_length(&buf, m, _state); ae_v_move(&buf.ptr.p_double[0], 1, &a->ptr.p_double[0], 1, ae_v_len(0,m-1)); ae_vector_set_length(&buf2, m, _state); ae_v_move(&buf2.ptr.p_double[0], 1, &b->ptr.p_double[0], 1, ae_v_len(0,n-1)); for(i=n; i<=m-1; i++) { buf2.ptr.p_double[i] = (double)(0); } ae_vector_set_length(&buf3, m, _state); ftcomplexfftplan(m/2, 1, &plan, _state); fftr1dinternaleven(&buf, m, &buf3, &plan, _state); fftr1dinternaleven(&buf2, m, &buf3, &plan, _state); buf.ptr.p_double[0] = buf.ptr.p_double[0]/buf2.ptr.p_double[0]; buf.ptr.p_double[1] = buf.ptr.p_double[1]/buf2.ptr.p_double[1]; for(i=1; i<=m/2-1; i++) { c1.x = buf.ptr.p_double[2*i+0]; c1.y = buf.ptr.p_double[2*i+1]; c2.x = buf2.ptr.p_double[2*i+0]; c2.y = buf2.ptr.p_double[2*i+1]; c3 = ae_c_div(c1,c2); buf.ptr.p_double[2*i+0] = c3.x; buf.ptr.p_double[2*i+1] = c3.y; } fftr1dinvinternaleven(&buf, m, &buf3, &plan, _state); ae_vector_set_length(r, m, _state); ae_v_move(&r->ptr.p_double[0], 1, &buf.ptr.p_double[0], 1, ae_v_len(0,m-1)); } else { /* * odd-size, use general real FFT */ fftr1d(a, m, &cbuf, _state); ae_vector_set_length(&buf2, m, _state); ae_v_move(&buf2.ptr.p_double[0], 1, &b->ptr.p_double[0], 1, ae_v_len(0,n-1)); for(i=n; i<=m-1; i++) { buf2.ptr.p_double[i] = (double)(0); } fftr1d(&buf2, m, &cbuf2, _state); for(i=0; i<=ae_ifloor((double)m/(double)2, _state); i++) { cbuf.ptr.p_complex[i] = ae_c_div(cbuf.ptr.p_complex[i],cbuf2.ptr.p_complex[i]); } fftr1dinv(&cbuf, m, r, _state); } ae_frame_leave(_state); } /************************************************************************* 1-dimensional complex convolution. Extended subroutine which allows to choose convolution algorithm. Intended for internal use, ALGLIB users should call ConvC1D()/ConvC1DCircular(). 
INPUT PARAMETERS A - array[0..M-1] - complex function to be transformed M - problem size B - array[0..N-1] - complex function to be transformed N - problem size, N<=M Alg - algorithm type: *-2 auto-select Q for overlap-add *-1 auto-select algorithm and parameters * 0 straightforward formula for small N's * 1 general FFT-based code * 2 overlap-add with length Q Q - length for overlap-add OUTPUT PARAMETERS R - convolution: A*B. array[0..N+M-1]. -- ALGLIB -- Copyright 21.07.2009 by Bochkanov Sergey *************************************************************************/ void convc1dx(/* Complex */ ae_vector* a, ae_int_t m, /* Complex */ ae_vector* b, ae_int_t n, ae_bool circular, ae_int_t alg, ae_int_t q, /* Complex */ ae_vector* r, ae_state *_state) { ae_frame _frame_block; ae_int_t i; ae_int_t j; ae_int_t p; ae_int_t ptotal; ae_int_t i1; ae_int_t i2; ae_int_t j1; ae_int_t j2; ae_vector bbuf; ae_complex v; double ax; double ay; double bx; double by; double t; double tx; double ty; double flopcand; double flopbest; ae_int_t algbest; fasttransformplan plan; ae_vector buf; ae_vector buf2; ae_frame_make(_state, &_frame_block); ae_vector_clear(r); ae_vector_init(&bbuf, 0, DT_COMPLEX, _state); _fasttransformplan_init(&plan, _state); ae_vector_init(&buf, 0, DT_REAL, _state); ae_vector_init(&buf2, 0, DT_REAL, _state); ae_assert(n>0&&m>0, "ConvC1DX: incorrect N or M!", _state); ae_assert(n<=m, "ConvC1DX: Nptr.p_complex[0]; ae_v_cmovec(&r->ptr.p_complex[0], 1, &a->ptr.p_complex[0], 1, "N", ae_v_len(0,m-1), v); ae_frame_leave(_state); return; } /* * use straightforward formula */ if( circular ) { /* * circular convolution */ ae_vector_set_length(r, m, _state); v = b->ptr.p_complex[0]; ae_v_cmovec(&r->ptr.p_complex[0], 1, &a->ptr.p_complex[0], 1, "N", ae_v_len(0,m-1), v); for(i=1; i<=n-1; i++) { v = b->ptr.p_complex[i]; i1 = 0; i2 = i-1; j1 = m-i; j2 = m-1; ae_v_caddc(&r->ptr.p_complex[i1], 1, &a->ptr.p_complex[j1], 1, "N", ae_v_len(i1,i2), v); i1 = i; i2 = m-1; j1 = 0; j2 = m-i-1; ae_v_caddc(&r->ptr.p_complex[i1], 1, &a->ptr.p_complex[j1], 1, "N", ae_v_len(i1,i2), v); } } else { /* * non-circular convolution */ ae_vector_set_length(r, m+n-1, _state); for(i=0; i<=m+n-2; i++) { r->ptr.p_complex[i] = ae_complex_from_i(0); } for(i=0; i<=n-1; i++) { v = b->ptr.p_complex[i]; ae_v_caddc(&r->ptr.p_complex[i], 1, &a->ptr.p_complex[0], 1, "N", ae_v_len(i,i+m-1), v); } } ae_frame_leave(_state); return; } /* * general FFT-based code for * circular and non-circular convolutions. * * First, if convolution is circular, we test whether M is smooth or not. * If it is smooth, we just use M-length FFT to calculate convolution. * If it is not, we calculate non-circular convolution and wrap it arount. * * IF convolution is non-circular, we use zero-padding + FFT. 
*/ if( alg==1 ) { if( circular&&ftbaseissmooth(m, _state) ) { /* * special code for circular convolution with smooth M */ ftcomplexfftplan(m, 1, &plan, _state); ae_vector_set_length(&buf, 2*m, _state); for(i=0; i<=m-1; i++) { buf.ptr.p_double[2*i+0] = a->ptr.p_complex[i].x; buf.ptr.p_double[2*i+1] = a->ptr.p_complex[i].y; } ae_vector_set_length(&buf2, 2*m, _state); for(i=0; i<=n-1; i++) { buf2.ptr.p_double[2*i+0] = b->ptr.p_complex[i].x; buf2.ptr.p_double[2*i+1] = b->ptr.p_complex[i].y; } for(i=n; i<=m-1; i++) { buf2.ptr.p_double[2*i+0] = (double)(0); buf2.ptr.p_double[2*i+1] = (double)(0); } ftapplyplan(&plan, &buf, 0, 1, _state); ftapplyplan(&plan, &buf2, 0, 1, _state); for(i=0; i<=m-1; i++) { ax = buf.ptr.p_double[2*i+0]; ay = buf.ptr.p_double[2*i+1]; bx = buf2.ptr.p_double[2*i+0]; by = buf2.ptr.p_double[2*i+1]; tx = ax*bx-ay*by; ty = ax*by+ay*bx; buf.ptr.p_double[2*i+0] = tx; buf.ptr.p_double[2*i+1] = -ty; } ftapplyplan(&plan, &buf, 0, 1, _state); t = (double)1/(double)m; ae_vector_set_length(r, m, _state); for(i=0; i<=m-1; i++) { r->ptr.p_complex[i].x = t*buf.ptr.p_double[2*i+0]; r->ptr.p_complex[i].y = -t*buf.ptr.p_double[2*i+1]; } } else { /* * M is non-smooth, general code (circular/non-circular): * * first part is the same for circular and non-circular * convolutions. zero padding, FFTs, inverse FFTs * * second part differs: * * for non-circular convolution we just copy array * * for circular convolution we add array tail to its head */ p = ftbasefindsmooth(m+n-1, _state); ftcomplexfftplan(p, 1, &plan, _state); ae_vector_set_length(&buf, 2*p, _state); for(i=0; i<=m-1; i++) { buf.ptr.p_double[2*i+0] = a->ptr.p_complex[i].x; buf.ptr.p_double[2*i+1] = a->ptr.p_complex[i].y; } for(i=m; i<=p-1; i++) { buf.ptr.p_double[2*i+0] = (double)(0); buf.ptr.p_double[2*i+1] = (double)(0); } ae_vector_set_length(&buf2, 2*p, _state); for(i=0; i<=n-1; i++) { buf2.ptr.p_double[2*i+0] = b->ptr.p_complex[i].x; buf2.ptr.p_double[2*i+1] = b->ptr.p_complex[i].y; } for(i=n; i<=p-1; i++) { buf2.ptr.p_double[2*i+0] = (double)(0); buf2.ptr.p_double[2*i+1] = (double)(0); } ftapplyplan(&plan, &buf, 0, 1, _state); ftapplyplan(&plan, &buf2, 0, 1, _state); for(i=0; i<=p-1; i++) { ax = buf.ptr.p_double[2*i+0]; ay = buf.ptr.p_double[2*i+1]; bx = buf2.ptr.p_double[2*i+0]; by = buf2.ptr.p_double[2*i+1]; tx = ax*bx-ay*by; ty = ax*by+ay*bx; buf.ptr.p_double[2*i+0] = tx; buf.ptr.p_double[2*i+1] = -ty; } ftapplyplan(&plan, &buf, 0, 1, _state); t = (double)1/(double)p; if( circular ) { /* * circular, add tail to head */ ae_vector_set_length(r, m, _state); for(i=0; i<=m-1; i++) { r->ptr.p_complex[i].x = t*buf.ptr.p_double[2*i+0]; r->ptr.p_complex[i].y = -t*buf.ptr.p_double[2*i+1]; } for(i=m; i<=m+n-2; i++) { r->ptr.p_complex[i-m].x = r->ptr.p_complex[i-m].x+t*buf.ptr.p_double[2*i+0]; r->ptr.p_complex[i-m].y = r->ptr.p_complex[i-m].y-t*buf.ptr.p_double[2*i+1]; } } else { /* * non-circular, just copy */ ae_vector_set_length(r, m+n-1, _state); for(i=0; i<=m+n-2; i++) { r->ptr.p_complex[i].x = t*buf.ptr.p_double[2*i+0]; r->ptr.p_complex[i].y = -t*buf.ptr.p_double[2*i+1]; } } } ae_frame_leave(_state); return; } /* * overlap-add method for * circular and non-circular convolutions. * * First part of code (separate FFTs of input blocks) is the same * for all types of convolution. Second part (overlapping outputs) * differs for different types of convolution. We just copy output * when convolution is non-circular. We wrap it around, if it is * circular. 
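The overlap-add branch below can be pictured with this compact, self-contained sketch (plain C++, not ALGLIB code): the input A is cut into chunks of length q, each chunk is convolved with B, and the overlapping chunk outputs are accumulated at the chunk offsets. For brevity each chunk is convolved directly here, whereas the library code transforms each chunk with a (q+N-1)-point FFT as shown in the code that follows.

    #include <vector>
    #include <algorithm>
    #include <cstdio>

    std::vector<double> overlap_add(const std::vector<double>& a,
                                    const std::vector<double>& b,
                                    std::size_t q)
    {
        std::vector<double> r(a.size()+b.size()-1, 0.0);
        for(std::size_t i=0; i<a.size(); i+=q)
        {
            std::size_t p = std::min(q, a.size()-i);     // current chunk length
            for(std::size_t j=0; j<p; j++)               // direct convolution of the chunk
                for(std::size_t k=0; k<b.size(); k++)
                    r[i+j+k] += a[i+j]*b[k];             // overlapping outputs accumulate
        }
        return r;
    }

    int main()
    {
        std::vector<double> a = {1,2,3,4,5,6}, b = {1,1};
        std::vector<double> r = overlap_add(a, b, 2);    // q=2, i.e. three chunks
        for(double v: r) printf("%.1f ", v);             // 1 3 5 7 9 11 6
        printf("\n");
        return 0;
    }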
*/ if( alg==2 ) { ae_vector_set_length(&buf, 2*(q+n-1), _state); /* * prepare R */ if( circular ) { ae_vector_set_length(r, m, _state); for(i=0; i<=m-1; i++) { r->ptr.p_complex[i] = ae_complex_from_i(0); } } else { ae_vector_set_length(r, m+n-1, _state); for(i=0; i<=m+n-2; i++) { r->ptr.p_complex[i] = ae_complex_from_i(0); } } /* * pre-calculated FFT(B) */ ae_vector_set_length(&bbuf, q+n-1, _state); ae_v_cmove(&bbuf.ptr.p_complex[0], 1, &b->ptr.p_complex[0], 1, "N", ae_v_len(0,n-1)); for(j=n; j<=q+n-2; j++) { bbuf.ptr.p_complex[j] = ae_complex_from_i(0); } fftc1d(&bbuf, q+n-1, _state); /* * prepare FFT plan for chunks of A */ ftcomplexfftplan(q+n-1, 1, &plan, _state); /* * main overlap-add cycle */ i = 0; while(i<=m-1) { p = ae_minint(q, m-i, _state); for(j=0; j<=p-1; j++) { buf.ptr.p_double[2*j+0] = a->ptr.p_complex[i+j].x; buf.ptr.p_double[2*j+1] = a->ptr.p_complex[i+j].y; } for(j=p; j<=q+n-2; j++) { buf.ptr.p_double[2*j+0] = (double)(0); buf.ptr.p_double[2*j+1] = (double)(0); } ftapplyplan(&plan, &buf, 0, 1, _state); for(j=0; j<=q+n-2; j++) { ax = buf.ptr.p_double[2*j+0]; ay = buf.ptr.p_double[2*j+1]; bx = bbuf.ptr.p_complex[j].x; by = bbuf.ptr.p_complex[j].y; tx = ax*bx-ay*by; ty = ax*by+ay*bx; buf.ptr.p_double[2*j+0] = tx; buf.ptr.p_double[2*j+1] = -ty; } ftapplyplan(&plan, &buf, 0, 1, _state); t = (double)1/(double)(q+n-1); if( circular ) { j1 = ae_minint(i+p+n-2, m-1, _state)-i; j2 = j1+1; } else { j1 = p+n-2; j2 = j1+1; } for(j=0; j<=j1; j++) { r->ptr.p_complex[i+j].x = r->ptr.p_complex[i+j].x+buf.ptr.p_double[2*j+0]*t; r->ptr.p_complex[i+j].y = r->ptr.p_complex[i+j].y-buf.ptr.p_double[2*j+1]*t; } for(j=j2; j<=p+n-2; j++) { r->ptr.p_complex[j-j2].x = r->ptr.p_complex[j-j2].x+buf.ptr.p_double[2*j+0]*t; r->ptr.p_complex[j-j2].y = r->ptr.p_complex[j-j2].y-buf.ptr.p_double[2*j+1]*t; } i = i+p; } ae_frame_leave(_state); return; } ae_frame_leave(_state); } /************************************************************************* 1-dimensional real convolution. Extended subroutine which allows to choose convolution algorithm. Intended for internal use, ALGLIB users should call ConvR1D(). INPUT PARAMETERS A - array[0..M-1] - complex function to be transformed M - problem size B - array[0..N-1] - complex function to be transformed N - problem size, N<=M Alg - algorithm type: *-2 auto-select Q for overlap-add *-1 auto-select algorithm and parameters * 0 straightforward formula for small N's * 1 general FFT-based code * 2 overlap-add with length Q Q - length for overlap-add OUTPUT PARAMETERS R - convolution: A*B. array[0..N+M-1]. 
-- ALGLIB -- Copyright 21.07.2009 by Bochkanov Sergey *************************************************************************/ void convr1dx(/* Real */ ae_vector* a, ae_int_t m, /* Real */ ae_vector* b, ae_int_t n, ae_bool circular, ae_int_t alg, ae_int_t q, /* Real */ ae_vector* r, ae_state *_state) { ae_frame _frame_block; double v; ae_int_t i; ae_int_t j; ae_int_t p; ae_int_t ptotal; ae_int_t i1; ae_int_t i2; ae_int_t j1; ae_int_t j2; double ax; double ay; double bx; double by; double tx; double ty; double flopcand; double flopbest; ae_int_t algbest; fasttransformplan plan; ae_vector buf; ae_vector buf2; ae_vector buf3; ae_frame_make(_state, &_frame_block); ae_vector_clear(r); _fasttransformplan_init(&plan, _state); ae_vector_init(&buf, 0, DT_REAL, _state); ae_vector_init(&buf2, 0, DT_REAL, _state); ae_vector_init(&buf3, 0, DT_REAL, _state); ae_assert(n>0&&m>0, "ConvC1DX: incorrect N or M!", _state); ae_assert(n<=m, "ConvC1DX: Nptr.p_double[0]; ae_v_moved(&r->ptr.p_double[0], 1, &a->ptr.p_double[0], 1, ae_v_len(0,m-1), v); ae_frame_leave(_state); return; } /* * use straightforward formula */ if( circular ) { /* * circular convolution */ ae_vector_set_length(r, m, _state); v = b->ptr.p_double[0]; ae_v_moved(&r->ptr.p_double[0], 1, &a->ptr.p_double[0], 1, ae_v_len(0,m-1), v); for(i=1; i<=n-1; i++) { v = b->ptr.p_double[i]; i1 = 0; i2 = i-1; j1 = m-i; j2 = m-1; ae_v_addd(&r->ptr.p_double[i1], 1, &a->ptr.p_double[j1], 1, ae_v_len(i1,i2), v); i1 = i; i2 = m-1; j1 = 0; j2 = m-i-1; ae_v_addd(&r->ptr.p_double[i1], 1, &a->ptr.p_double[j1], 1, ae_v_len(i1,i2), v); } } else { /* * non-circular convolution */ ae_vector_set_length(r, m+n-1, _state); for(i=0; i<=m+n-2; i++) { r->ptr.p_double[i] = (double)(0); } for(i=0; i<=n-1; i++) { v = b->ptr.p_double[i]; ae_v_addd(&r->ptr.p_double[i], 1, &a->ptr.p_double[0], 1, ae_v_len(i,i+m-1), v); } } ae_frame_leave(_state); return; } /* * general FFT-based code for * circular and non-circular convolutions. * * First, if convolution is circular, we test whether M is smooth or not. * If it is smooth, we just use M-length FFT to calculate convolution. * If it is not, we calculate non-circular convolution and wrap it arount. * * If convolution is non-circular, we use zero-padding + FFT. 
* * We assume that M+N-1>2 - we should call small case code otherwise */ if( alg==1 ) { ae_assert(m+n-1>2, "ConvR1DX: internal error!", _state); if( (circular&&ftbaseissmooth(m, _state))&&m%2==0 ) { /* * special code for circular convolution with smooth even M */ ae_vector_set_length(&buf, m, _state); ae_v_move(&buf.ptr.p_double[0], 1, &a->ptr.p_double[0], 1, ae_v_len(0,m-1)); ae_vector_set_length(&buf2, m, _state); ae_v_move(&buf2.ptr.p_double[0], 1, &b->ptr.p_double[0], 1, ae_v_len(0,n-1)); for(i=n; i<=m-1; i++) { buf2.ptr.p_double[i] = (double)(0); } ae_vector_set_length(&buf3, m, _state); ftcomplexfftplan(m/2, 1, &plan, _state); fftr1dinternaleven(&buf, m, &buf3, &plan, _state); fftr1dinternaleven(&buf2, m, &buf3, &plan, _state); buf.ptr.p_double[0] = buf.ptr.p_double[0]*buf2.ptr.p_double[0]; buf.ptr.p_double[1] = buf.ptr.p_double[1]*buf2.ptr.p_double[1]; for(i=1; i<=m/2-1; i++) { ax = buf.ptr.p_double[2*i+0]; ay = buf.ptr.p_double[2*i+1]; bx = buf2.ptr.p_double[2*i+0]; by = buf2.ptr.p_double[2*i+1]; tx = ax*bx-ay*by; ty = ax*by+ay*bx; buf.ptr.p_double[2*i+0] = tx; buf.ptr.p_double[2*i+1] = ty; } fftr1dinvinternaleven(&buf, m, &buf3, &plan, _state); ae_vector_set_length(r, m, _state); ae_v_move(&r->ptr.p_double[0], 1, &buf.ptr.p_double[0], 1, ae_v_len(0,m-1)); } else { /* * M is non-smooth or non-even, general code (circular/non-circular): * * first part is the same for circular and non-circular * convolutions. zero padding, FFTs, inverse FFTs * * second part differs: * * for non-circular convolution we just copy array * * for circular convolution we add array tail to its head */ p = ftbasefindsmootheven(m+n-1, _state); ae_vector_set_length(&buf, p, _state); ae_v_move(&buf.ptr.p_double[0], 1, &a->ptr.p_double[0], 1, ae_v_len(0,m-1)); for(i=m; i<=p-1; i++) { buf.ptr.p_double[i] = (double)(0); } ae_vector_set_length(&buf2, p, _state); ae_v_move(&buf2.ptr.p_double[0], 1, &b->ptr.p_double[0], 1, ae_v_len(0,n-1)); for(i=n; i<=p-1; i++) { buf2.ptr.p_double[i] = (double)(0); } ae_vector_set_length(&buf3, p, _state); ftcomplexfftplan(p/2, 1, &plan, _state); fftr1dinternaleven(&buf, p, &buf3, &plan, _state); fftr1dinternaleven(&buf2, p, &buf3, &plan, _state); buf.ptr.p_double[0] = buf.ptr.p_double[0]*buf2.ptr.p_double[0]; buf.ptr.p_double[1] = buf.ptr.p_double[1]*buf2.ptr.p_double[1]; for(i=1; i<=p/2-1; i++) { ax = buf.ptr.p_double[2*i+0]; ay = buf.ptr.p_double[2*i+1]; bx = buf2.ptr.p_double[2*i+0]; by = buf2.ptr.p_double[2*i+1]; tx = ax*bx-ay*by; ty = ax*by+ay*bx; buf.ptr.p_double[2*i+0] = tx; buf.ptr.p_double[2*i+1] = ty; } fftr1dinvinternaleven(&buf, p, &buf3, &plan, _state); if( circular ) { /* * circular, add tail to head */ ae_vector_set_length(r, m, _state); ae_v_move(&r->ptr.p_double[0], 1, &buf.ptr.p_double[0], 1, ae_v_len(0,m-1)); if( n>=2 ) { ae_v_add(&r->ptr.p_double[0], 1, &buf.ptr.p_double[m], 1, ae_v_len(0,n-2)); } } else { /* * non-circular, just copy */ ae_vector_set_length(r, m+n-1, _state); ae_v_move(&r->ptr.p_double[0], 1, &buf.ptr.p_double[0], 1, ae_v_len(0,m+n-2)); } } ae_frame_leave(_state); return; } /* * overlap-add method */ if( alg==2 ) { ae_assert((q+n-1)%2==0, "ConvR1DX: internal error!", _state); ae_vector_set_length(&buf, q+n-1, _state); ae_vector_set_length(&buf2, q+n-1, _state); ae_vector_set_length(&buf3, q+n-1, _state); ftcomplexfftplan((q+n-1)/2, 1, &plan, _state); /* * prepare R */ if( circular ) { ae_vector_set_length(r, m, _state); for(i=0; i<=m-1; i++) { r->ptr.p_double[i] = (double)(0); } } else { ae_vector_set_length(r, m+n-1, _state); for(i=0; 
i<=m+n-2; i++) { r->ptr.p_double[i] = (double)(0); } } /* * pre-calculated FFT(B) */ ae_v_move(&buf2.ptr.p_double[0], 1, &b->ptr.p_double[0], 1, ae_v_len(0,n-1)); for(j=n; j<=q+n-2; j++) { buf2.ptr.p_double[j] = (double)(0); } fftr1dinternaleven(&buf2, q+n-1, &buf3, &plan, _state); /* * main overlap-add cycle */ i = 0; while(i<=m-1) { p = ae_minint(q, m-i, _state); ae_v_move(&buf.ptr.p_double[0], 1, &a->ptr.p_double[i], 1, ae_v_len(0,p-1)); for(j=p; j<=q+n-2; j++) { buf.ptr.p_double[j] = (double)(0); } fftr1dinternaleven(&buf, q+n-1, &buf3, &plan, _state); buf.ptr.p_double[0] = buf.ptr.p_double[0]*buf2.ptr.p_double[0]; buf.ptr.p_double[1] = buf.ptr.p_double[1]*buf2.ptr.p_double[1]; for(j=1; j<=(q+n-1)/2-1; j++) { ax = buf.ptr.p_double[2*j+0]; ay = buf.ptr.p_double[2*j+1]; bx = buf2.ptr.p_double[2*j+0]; by = buf2.ptr.p_double[2*j+1]; tx = ax*bx-ay*by; ty = ax*by+ay*bx; buf.ptr.p_double[2*j+0] = tx; buf.ptr.p_double[2*j+1] = ty; } fftr1dinvinternaleven(&buf, q+n-1, &buf3, &plan, _state); if( circular ) { j1 = ae_minint(i+p+n-2, m-1, _state)-i; j2 = j1+1; } else { j1 = p+n-2; j2 = j1+1; } ae_v_add(&r->ptr.p_double[i], 1, &buf.ptr.p_double[0], 1, ae_v_len(i,i+j1)); if( p+n-2>=j2 ) { ae_v_add(&r->ptr.p_double[0], 1, &buf.ptr.p_double[j2], 1, ae_v_len(0,p+n-2-j2)); } i = i+p; } ae_frame_leave(_state); return; } ae_frame_leave(_state); } /************************************************************************* 1-dimensional complex cross-correlation. For given Pattern/Signal returns corr(Pattern,Signal) (non-circular). Correlation is calculated using reduction to convolution. Algorithm with max(N,N)*log(max(N,N)) complexity is used (see ConvC1D() for more info about performance). IMPORTANT: for historical reasons subroutine accepts its parameters in reversed order: CorrC1D(Signal, Pattern) = Pattern x Signal (using traditional definition of cross-correlation, denoting cross-correlation as "x"). INPUT PARAMETERS Signal - array[0..N-1] - complex function to be transformed, signal containing pattern N - problem size Pattern - array[0..M-1] - complex function to be transformed, pattern to search withing signal M - problem size OUTPUT PARAMETERS R - cross-correlation, array[0..N+M-2]: * positive lags are stored in R[0..N-1], R[i] = sum(conj(pattern[j])*signal[i+j] * negative lags are stored in R[N..N+M-2], R[N+M-1-i] = sum(conj(pattern[j])*signal[-i+j] NOTE: It is assumed that pattern domain is [0..M-1]. If Pattern is non-zero on [-K..M-1], you can still use this subroutine, just shift result by K. 
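A minimal usage sketch showing where the lags land (this assumes the corrc1d() C++ wrapper declared elsewhere in this manual):

    alglib::complex_1d_array signal, pattern, r;
    signal.setlength(4);                       // N=4
    for(int i=0; i<4; i++)
        signal[i] = double(i+1);               // 1,2,3,4
    pattern.setlength(2);                      // M=2
    pattern[0] = 1; pattern[1] = 1;
    alglib::corrc1d(signal, 4, pattern, 2, r);
    // r has N+M-1=5 entries: r[0..3] hold lags 0..+3, r[4] holds lag -1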
-- ALGLIB -- Copyright 21.07.2009 by Bochkanov Sergey *************************************************************************/ void corrc1d(/* Complex */ ae_vector* signal, ae_int_t n, /* Complex */ ae_vector* pattern, ae_int_t m, /* Complex */ ae_vector* r, ae_state *_state) { ae_frame _frame_block; ae_vector p; ae_vector b; ae_int_t i; ae_frame_make(_state, &_frame_block); ae_vector_clear(r); ae_vector_init(&p, 0, DT_COMPLEX, _state); ae_vector_init(&b, 0, DT_COMPLEX, _state); ae_assert(n>0&&m>0, "CorrC1D: incorrect N or M!", _state); ae_vector_set_length(&p, m, _state); for(i=0; i<=m-1; i++) { p.ptr.p_complex[m-1-i] = ae_c_conj(pattern->ptr.p_complex[i], _state); } convc1d(&p, m, signal, n, &b, _state); ae_vector_set_length(r, m+n-1, _state); ae_v_cmove(&r->ptr.p_complex[0], 1, &b.ptr.p_complex[m-1], 1, "N", ae_v_len(0,n-1)); if( m+n-2>=n ) { ae_v_cmove(&r->ptr.p_complex[n], 1, &b.ptr.p_complex[0], 1, "N", ae_v_len(n,m+n-2)); } ae_frame_leave(_state); } /************************************************************************* 1-dimensional circular complex cross-correlation. For given Pattern/Signal returns corr(Pattern,Signal) (circular). Algorithm has linearithmic complexity for any M/N. IMPORTANT: for historical reasons subroutine accepts its parameters in reversed order: CorrC1DCircular(Signal, Pattern) = Pattern x Signal (using traditional definition of cross-correlation, denoting cross-correlation as "x"). INPUT PARAMETERS Signal - array[0..N-1] - complex function to be transformed, periodic signal containing pattern N - problem size Pattern - array[0..M-1] - complex function to be transformed, non-periodic pattern to search within signal M - problem size OUTPUT PARAMETERS R - circular cross-correlation, array[0..M-1]. -- ALGLIB -- Copyright 21.07.2009 by Bochkanov Sergey *************************************************************************/ void corrc1dcircular(/* Complex */ ae_vector* signal, ae_int_t m, /* Complex */ ae_vector* pattern, ae_int_t n, /* Complex */ ae_vector* c, ae_state *_state) { ae_frame _frame_block; ae_vector p; ae_vector b; ae_int_t i1; ae_int_t i2; ae_int_t i; ae_int_t j2; ae_frame_make(_state, &_frame_block); ae_vector_clear(c); ae_vector_init(&p, 0, DT_COMPLEX, _state); ae_vector_init(&b, 0, DT_COMPLEX, _state); ae_assert(n>0&&m>0, "CorrC1DCircular: incorrect N or M!", _state); /* * normalize task: make M>=N, * so A will be longer (at least - not shorter) than B. */ if( m<n ) { ae_vector_set_length(&b, m, _state); for(i1=0; i1<=m-1; i1++) { b.ptr.p_complex[i1] = ae_complex_from_i(0); } i1 = 0; while(i1<n) { i2 = ae_minint(i1+m-1, n-1, _state); j2 = i2-i1; ae_v_cadd(&b.ptr.p_complex[0], 1, &pattern->ptr.p_complex[i1], 1, "N", ae_v_len(0,j2)); i1 = i1+m; } corrc1dcircular(signal, m, &b, m, c, _state); ae_frame_leave(_state); return; } /* * Task is normalized */ ae_vector_set_length(&p, n, _state); for(i=0; i<=n-1; i++) { p.ptr.p_complex[n-1-i] = ae_c_conj(pattern->ptr.p_complex[i], _state); } convc1dcircular(signal, m, &p, n, &b, _state); ae_vector_set_length(c, m, _state); ae_v_cmove(&c->ptr.p_complex[0], 1, &b.ptr.p_complex[n-1], 1, "N", ae_v_len(0,m-n)); if( m-n+1<=m-1 ) { ae_v_cmove(&c->ptr.p_complex[m-n+1], 1, &b.ptr.p_complex[0], 1, "N", ae_v_len(m-n+1,m-1)); } ae_frame_leave(_state); } /************************************************************************* 1-dimensional real cross-correlation. For given Pattern/Signal returns corr(Pattern,Signal) (non-circular). Correlation is calculated using reduction to convolution. Algorithm with max(M,N)*log(max(M,N)) complexity is used (see ConvC1D() for more info about performance).
IMPORTANT: for historical reasons subroutine accepts its parameters in reversed order: CorrR1D(Signal, Pattern) = Pattern x Signal (using traditional definition of cross-correlation, denoting cross-correlation as "x"). INPUT PARAMETERS Signal - array[0..N-1] - real function to be transformed, signal containing pattern N - problem size Pattern - array[0..M-1] - real function to be transformed, pattern to search within signal M - problem size OUTPUT PARAMETERS R - cross-correlation, array[0..N+M-2]: * positive lags are stored in R[0..N-1], R[i] = sum(pattern[j]*signal[i+j]) * negative lags are stored in R[N..N+M-2], R[N+M-1-i] = sum(pattern[j]*signal[-i+j]) NOTE: It is assumed that pattern domain is [0..M-1]. If Pattern is non-zero on [-K..M-1], you can still use this subroutine, just shift result by K. -- ALGLIB -- Copyright 21.07.2009 by Bochkanov Sergey *************************************************************************/ void corrr1d(/* Real */ ae_vector* signal, ae_int_t n, /* Real */ ae_vector* pattern, ae_int_t m, /* Real */ ae_vector* r, ae_state *_state) { ae_frame _frame_block; ae_vector p; ae_vector b; ae_int_t i; ae_frame_make(_state, &_frame_block); ae_vector_clear(r); ae_vector_init(&p, 0, DT_REAL, _state); ae_vector_init(&b, 0, DT_REAL, _state); ae_assert(n>0&&m>0, "CorrR1D: incorrect N or M!", _state); ae_vector_set_length(&p, m, _state); for(i=0; i<=m-1; i++) { p.ptr.p_double[m-1-i] = pattern->ptr.p_double[i]; } convr1d(&p, m, signal, n, &b, _state); ae_vector_set_length(r, m+n-1, _state); ae_v_move(&r->ptr.p_double[0], 1, &b.ptr.p_double[m-1], 1, ae_v_len(0,n-1)); if( m+n-2>=n ) { ae_v_move(&r->ptr.p_double[n], 1, &b.ptr.p_double[0], 1, ae_v_len(n,m+n-2)); } ae_frame_leave(_state); } /************************************************************************* 1-dimensional circular real cross-correlation. For given Pattern/Signal returns corr(Pattern,Signal) (circular). Algorithm has linearithmic complexity for any M/N. IMPORTANT: for historical reasons subroutine accepts its parameters in reversed order: CorrR1DCircular(Signal, Pattern) = Pattern x Signal (using traditional definition of cross-correlation, denoting cross-correlation as "x"). INPUT PARAMETERS Signal - array[0..N-1] - real function to be transformed, periodic signal containing pattern N - problem size Pattern - array[0..M-1] - real function to be transformed, non-periodic pattern to search within signal M - problem size OUTPUT PARAMETERS R - circular cross-correlation, array[0..M-1]. -- ALGLIB -- Copyright 21.07.2009 by Bochkanov Sergey *************************************************************************/ void corrr1dcircular(/* Real */ ae_vector* signal, ae_int_t m, /* Real */ ae_vector* pattern, ae_int_t n, /* Real */ ae_vector* c, ae_state *_state) { ae_frame _frame_block; ae_vector p; ae_vector b; ae_int_t i1; ae_int_t i2; ae_int_t i; ae_int_t j2; ae_frame_make(_state, &_frame_block); ae_vector_clear(c); ae_vector_init(&p, 0, DT_REAL, _state); ae_vector_init(&b, 0, DT_REAL, _state); ae_assert(n>0&&m>0, "CorrR1DCircular: incorrect N or M!", _state); /* * normalize task: make M>=N, * so A will be longer (at least - not shorter) than B.
*/ if( m<n ) { ae_vector_set_length(&b, m, _state); for(i1=0; i1<=m-1; i1++) { b.ptr.p_double[i1] = (double)(0); } i1 = 0; while(i1<n) { i2 = ae_minint(i1+m-1, n-1, _state); j2 = i2-i1; ae_v_add(&b.ptr.p_double[0], 1, &pattern->ptr.p_double[i1], 1, ae_v_len(0,j2)); i1 = i1+m; } corrr1dcircular(signal, m, &b, m, c, _state); ae_frame_leave(_state); return; } /* * Task is normalized */ ae_vector_set_length(&p, n, _state); for(i=0; i<=n-1; i++) { p.ptr.p_double[n-1-i] = pattern->ptr.p_double[i]; } convr1dcircular(signal, m, &p, n, &b, _state); ae_vector_set_length(c, m, _state); ae_v_move(&c->ptr.p_double[0], 1, &b.ptr.p_double[n-1], 1, ae_v_len(0,m-n)); if( m-n+1<=m-1 ) { ae_v_move(&c->ptr.p_double[m-n+1], 1, &b.ptr.p_double[0], 1, ae_v_len(m-n+1,m-1)); } ae_frame_leave(_state); } }
cpp/src/dataanalysis.h
/************************************************************************* ALGLIB 3.11.0 (source code generated 2017-05-11) Copyright (c) Sergey Bochkanov (ALGLIB project). >>> SOURCE LICENSE >>> This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation (www.fsf.org); either version 2 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. A copy of the GNU General Public License is available at http://www.fsf.org/licensing/licenses >>> END OF LICENSE >>> *************************************************************************/ #ifndef _dataanalysis_pkg_h #define _dataanalysis_pkg_h #include "ap.h" #include "alglibinternal.h" #include "linalg.h" #include "alglibmisc.h" #include "statistics.h" #include "specialfunctions.h" #include "solvers.h" #include "optimization.h" ///////////////////////////////////////////////////////////////////////// // // THIS SECTION CONTAINS COMPUTATIONAL CORE DECLARATIONS (DATATYPES) // ///////////////////////////////////////////////////////////////////////// namespace alglib_impl { typedef struct { double relclserror; double avgce; double rmserror; double avgerror; double avgrelerror; } cvreport; typedef struct { double relclserror; double avgce; double rmserror; double avgerror; double avgrelerror; } modelerrors; typedef struct { double f; ae_vector g; } smlpgrad; typedef struct { ae_int_t hlnetworktype; ae_int_t hlnormtype; ae_vector hllayersizes; ae_vector hlconnections; ae_vector hlneurons; ae_vector structinfo; ae_vector weights; ae_vector columnmeans; ae_vector columnsigmas; ae_vector neurons; ae_vector dfdnet; ae_vector derror; ae_vector x; ae_vector y; ae_matrix xy; ae_vector xyrow; ae_vector nwbuf; ae_vector integerbuf; modelerrors err; ae_vector rndbuf; ae_shared_pool buf; ae_shared_pool gradbuf; ae_matrix dummydxy; sparsematrix dummysxy; ae_vector dummyidx; ae_shared_pool dummypool; } multilayerperceptron; typedef struct { ae_vector w; } linearmodel; typedef struct { ae_matrix c; double rmserror; double avgerror; double avgrelerror; double cvrmserror; double cvavgerror; double cvavgrelerror; ae_int_t ncvdefects; ae_vector cvdefects; } lrreport; typedef struct { ae_vector w; } logitmodel; typedef struct { ae_bool brackt; ae_bool stage1; ae_int_t infoc; double dg; double dgm; double dginit; double dgtest; double dgx; double dgxm; double dgy; double dgym; double finit; double ftest1; double fm; double fx; double fxm; double fy; double fym; double stx; double sty; double stmin; double stmax; double width; double width1; double xtrapf; } logitmcstate; typedef struct {
ae_int_t ngrad; ae_int_t nhess; } mnlreport; typedef struct { ae_int_t n; ae_vector states; ae_int_t npairs; ae_matrix data; ae_matrix ec; ae_matrix bndl; ae_matrix bndu; ae_matrix c; ae_vector ct; ae_int_t ccnt; ae_vector pw; ae_matrix priorp; double regterm; minbleicstate bs; ae_int_t repinneriterationscount; ae_int_t repouteriterationscount; ae_int_t repnfev; ae_int_t repterminationtype; minbleicreport br; ae_vector tmpp; ae_vector effectivew; ae_vector effectivebndl; ae_vector effectivebndu; ae_matrix effectivec; ae_vector effectivect; ae_vector h; ae_matrix p; } mcpdstate; typedef struct { ae_int_t inneriterationscount; ae_int_t outeriterationscount; ae_int_t nfev; ae_int_t terminationtype; } mcpdreport; typedef struct { ae_int_t ensemblesize; ae_vector weights; ae_vector columnmeans; ae_vector columnsigmas; multilayerperceptron network; ae_vector y; } mlpensemble; typedef struct { double relclserror; double avgce; double rmserror; double avgerror; double avgrelerror; ae_int_t ngrad; ae_int_t nhess; ae_int_t ncholesky; } mlpreport; typedef struct { double relclserror; double avgce; double rmserror; double avgerror; double avgrelerror; } mlpcvreport; typedef struct { ae_vector bestparameters; double bestrmserror; ae_bool randomizenetwork; multilayerperceptron network; minlbfgsstate optimizer; minlbfgsreport optimizerrep; ae_vector wbuf0; ae_vector wbuf1; ae_vector allminibatches; ae_vector currentminibatch; rcommstate rstate; ae_int_t algoused; ae_int_t minibatchsize; hqrndstate generator; } smlptrnsession; typedef struct { ae_vector trnsubset; ae_vector valsubset; ae_shared_pool mlpsessions; mlpreport mlprep; multilayerperceptron network; } mlpetrnsession; typedef struct { ae_int_t nin; ae_int_t nout; ae_bool rcpar; ae_int_t lbfgsfactor; double decay; double wstep; ae_int_t maxits; ae_int_t datatype; ae_int_t npoints; ae_matrix densexy; sparsematrix sparsexy; smlptrnsession session; ae_int_t ngradbatch; ae_vector subset; ae_int_t subsetsize; ae_vector valsubset; ae_int_t valsubsetsize; ae_int_t algokind; ae_int_t minibatchsize; } mlptrainer; typedef struct { multilayerperceptron network; mlpreport rep; ae_vector subset; ae_int_t subsetsize; ae_vector xyrow; ae_vector y; ae_int_t ngrad; ae_shared_pool trnpool; } mlpparallelizationcv; typedef struct { ae_matrix ct; ae_matrix ctbest; ae_vector xycbest; ae_vector xycprev; ae_vector d2; ae_vector csizes; apbuffers initbuf; ae_shared_pool updatepool; } kmeansbuffers; typedef struct { ae_int_t npoints; ae_int_t nfeatures; ae_int_t disttype; ae_matrix xy; ae_matrix d; ae_int_t ahcalgo; ae_int_t kmeansrestarts; ae_int_t kmeansmaxits; ae_int_t kmeansinitalgo; ae_bool kmeansdbgnoits; ae_matrix tmpd; apbuffers distbuf; kmeansbuffers kmeanstmp; } clusterizerstate; typedef struct { ae_int_t terminationtype; ae_int_t npoints; ae_vector p; ae_matrix z; ae_matrix pz; ae_matrix pm; ae_vector mergedist; } ahcreport; typedef struct { ae_int_t npoints; ae_int_t nfeatures; ae_int_t terminationtype; ae_int_t iterationscount; double energy; ae_int_t k; ae_matrix c; ae_vector cidx; } kmeansreport; typedef struct { ae_int_t nvars; ae_int_t nclasses; ae_int_t ntrees; ae_int_t bufsize; ae_vector trees; } decisionforest; typedef struct { double relclserror; double avgce; double rmserror; double avgerror; double avgrelerror; double oobrelclserror; double oobavgce; double oobrmserror; double oobavgerror; double oobavgrelerror; } dfreport; typedef struct { ae_vector treebuf; ae_vector idxbuf; ae_vector tmpbufr; ae_vector tmpbufr2; ae_vector tmpbufi; ae_vector 
classibuf; ae_vector sortrbuf; ae_vector sortrbuf2; ae_vector sortibuf; ae_vector varpool; ae_vector evsbin; ae_vector evssplits; } dfinternalbuffers; } ///////////////////////////////////////////////////////////////////////// // // THIS SECTION CONTAINS C++ INTERFACE // ///////////////////////////////////////////////////////////////////////// namespace alglib { /************************************************************************* Model's errors: * RelCLSError - fraction of misclassified cases. * AvgCE - acerage cross-entropy * RMSError - root-mean-square error * AvgError - average error * AvgRelError - average relative error NOTE 1: RelCLSError/AvgCE are zero on regression problems. NOTE 2: on classification problems RMSError/AvgError/AvgRelError contain errors in prediction of posterior probabilities *************************************************************************/ class _modelerrors_owner { public: _modelerrors_owner(); _modelerrors_owner(const _modelerrors_owner &rhs); _modelerrors_owner& operator=(const _modelerrors_owner &rhs); virtual ~_modelerrors_owner(); alglib_impl::modelerrors* c_ptr(); alglib_impl::modelerrors* c_ptr() const; protected: alglib_impl::modelerrors *p_struct; }; class modelerrors : public _modelerrors_owner { public: modelerrors(); modelerrors(const modelerrors &rhs); modelerrors& operator=(const modelerrors &rhs); virtual ~modelerrors(); double &relclserror; double &avgce; double &rmserror; double &avgerror; double &avgrelerror; }; /************************************************************************* *************************************************************************/ class _multilayerperceptron_owner { public: _multilayerperceptron_owner(); _multilayerperceptron_owner(const _multilayerperceptron_owner &rhs); _multilayerperceptron_owner& operator=(const _multilayerperceptron_owner &rhs); virtual ~_multilayerperceptron_owner(); alglib_impl::multilayerperceptron* c_ptr(); alglib_impl::multilayerperceptron* c_ptr() const; protected: alglib_impl::multilayerperceptron *p_struct; }; class multilayerperceptron : public _multilayerperceptron_owner { public: multilayerperceptron(); multilayerperceptron(const multilayerperceptron &rhs); multilayerperceptron& operator=(const multilayerperceptron &rhs); virtual ~multilayerperceptron(); }; /************************************************************************* *************************************************************************/ class _linearmodel_owner { public: _linearmodel_owner(); _linearmodel_owner(const _linearmodel_owner &rhs); _linearmodel_owner& operator=(const _linearmodel_owner &rhs); virtual ~_linearmodel_owner(); alglib_impl::linearmodel* c_ptr(); alglib_impl::linearmodel* c_ptr() const; protected: alglib_impl::linearmodel *p_struct; }; class linearmodel : public _linearmodel_owner { public: linearmodel(); linearmodel(const linearmodel &rhs); linearmodel& operator=(const linearmodel &rhs); virtual ~linearmodel(); }; /************************************************************************* LRReport structure contains additional information about linear model: * C - covariation matrix, array[0..NVars,0..NVars]. C[i,j] = Cov(A[i],A[j]) * RMSError - root mean square error on a training set * AvgError - average error on a training set * AvgRelError - average relative error on a training set (excluding observations with zero function value). * CVRMSError - leave-one-out cross-validation estimate of generalization error. Calculated using fast algorithm with O(NVars*NPoints) complexity. 
* CVAvgError - cross-validation estimate of average error * CVAvgRelError - cross-validation estimate of average relative error All other fields of the structure are intended for internal use and should not be used outside ALGLIB. *************************************************************************/ class _lrreport_owner { public: _lrreport_owner(); _lrreport_owner(const _lrreport_owner &rhs); _lrreport_owner& operator=(const _lrreport_owner &rhs); virtual ~_lrreport_owner(); alglib_impl::lrreport* c_ptr(); alglib_impl::lrreport* c_ptr() const; protected: alglib_impl::lrreport *p_struct; }; class lrreport : public _lrreport_owner { public: lrreport(); lrreport(const lrreport &rhs); lrreport& operator=(const lrreport &rhs); virtual ~lrreport(); real_2d_array c; double &rmserror; double &avgerror; double &avgrelerror; double &cvrmserror; double &cvavgerror; double &cvavgrelerror; ae_int_t &ncvdefects; integer_1d_array cvdefects; }; /************************************************************************* *************************************************************************/ class _logitmodel_owner { public: _logitmodel_owner(); _logitmodel_owner(const _logitmodel_owner &rhs); _logitmodel_owner& operator=(const _logitmodel_owner &rhs); virtual ~_logitmodel_owner(); alglib_impl::logitmodel* c_ptr(); alglib_impl::logitmodel* c_ptr() const; protected: alglib_impl::logitmodel *p_struct; }; class logitmodel : public _logitmodel_owner { public: logitmodel(); logitmodel(const logitmodel &rhs); logitmodel& operator=(const logitmodel &rhs); virtual ~logitmodel(); }; /************************************************************************* MNLReport structure contains information about training process: * NGrad - number of gradient calculations * NHess - number of Hessian calculations *************************************************************************/ class _mnlreport_owner { public: _mnlreport_owner(); _mnlreport_owner(const _mnlreport_owner &rhs); _mnlreport_owner& operator=(const _mnlreport_owner &rhs); virtual ~_mnlreport_owner(); alglib_impl::mnlreport* c_ptr(); alglib_impl::mnlreport* c_ptr() const; protected: alglib_impl::mnlreport *p_struct; }; class mnlreport : public _mnlreport_owner { public: mnlreport(); mnlreport(const mnlreport &rhs); mnlreport& operator=(const mnlreport &rhs); virtual ~mnlreport(); ae_int_t &ngrad; ae_int_t &nhess; }; /************************************************************************* This structure is a MCPD (Markov Chains for Population Data) solver. You should use ALGLIB functions in order to work with this object. 
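A typical workflow, sketched under the assumption that the mcpdcreate/mcpdaddtrack/mcpdsolve/mcpdresults functions of the mcpd subpackage are available (see that subpackage for the exact signatures):

    alglib::mcpdstate s;
    alglib::mcpdreport rep;
    alglib::real_2d_array track = "[[1.0,0.0],[0.95,0.05],[0.9025,0.0975]]";  // one track: population shares over time
    alglib::real_2d_array p;
    alglib::mcpdcreate(2, s);        // 2-state system
    alglib::mcpdaddtrack(s, track);  // add observed track(s)
    alglib::mcpdsolve(s);
    alglib::mcpdresults(s, p, rep);  // p is the estimated 2x2 transition matrix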
-- ALGLIB -- Copyright 23.05.2010 by Bochkanov Sergey *************************************************************************/ class _mcpdstate_owner { public: _mcpdstate_owner(); _mcpdstate_owner(const _mcpdstate_owner &rhs); _mcpdstate_owner& operator=(const _mcpdstate_owner &rhs); virtual ~_mcpdstate_owner(); alglib_impl::mcpdstate* c_ptr(); alglib_impl::mcpdstate* c_ptr() const; protected: alglib_impl::mcpdstate *p_struct; }; class mcpdstate : public _mcpdstate_owner { public: mcpdstate(); mcpdstate(const mcpdstate &rhs); mcpdstate& operator=(const mcpdstate &rhs); virtual ~mcpdstate(); }; /************************************************************************* This structure is a MCPD training report: InnerIterationsCount - number of inner iterations of the underlying optimization algorithm OuterIterationsCount - number of outer iterations of the underlying optimization algorithm NFEV - number of merit function evaluations TerminationType - termination type (same as for MinBLEIC optimizer, positive values denote success, negative ones - failure) -- ALGLIB -- Copyright 23.05.2010 by Bochkanov Sergey *************************************************************************/ class _mcpdreport_owner { public: _mcpdreport_owner(); _mcpdreport_owner(const _mcpdreport_owner &rhs); _mcpdreport_owner& operator=(const _mcpdreport_owner &rhs); virtual ~_mcpdreport_owner(); alglib_impl::mcpdreport* c_ptr(); alglib_impl::mcpdreport* c_ptr() const; protected: alglib_impl::mcpdreport *p_struct; }; class mcpdreport : public _mcpdreport_owner { public: mcpdreport(); mcpdreport(const mcpdreport &rhs); mcpdreport& operator=(const mcpdreport &rhs); virtual ~mcpdreport(); ae_int_t &inneriterationscount; ae_int_t &outeriterationscount; ae_int_t &nfev; ae_int_t &terminationtype; }; /************************************************************************* Neural networks ensemble *************************************************************************/ class _mlpensemble_owner { public: _mlpensemble_owner(); _mlpensemble_owner(const _mlpensemble_owner &rhs); _mlpensemble_owner& operator=(const _mlpensemble_owner &rhs); virtual ~_mlpensemble_owner(); alglib_impl::mlpensemble* c_ptr(); alglib_impl::mlpensemble* c_ptr() const; protected: alglib_impl::mlpensemble *p_struct; }; class mlpensemble : public _mlpensemble_owner { public: mlpensemble(); mlpensemble(const mlpensemble &rhs); mlpensemble& operator=(const mlpensemble &rhs); virtual ~mlpensemble(); }; /************************************************************************* Training report: * RelCLSError - fraction of misclassified cases. * AvgCE - acerage cross-entropy * RMSError - root-mean-square error * AvgError - average error * AvgRelError - average relative error * NGrad - number of gradient calculations * NHess - number of Hessian calculations * NCholesky - number of Cholesky decompositions NOTE 1: RelCLSError/AvgCE are zero on regression problems. 
NOTE 2: on classification problems RMSError/AvgError/AvgRelError contain errors in prediction of posterior probabilities *************************************************************************/ class _mlpreport_owner { public: _mlpreport_owner(); _mlpreport_owner(const _mlpreport_owner &rhs); _mlpreport_owner& operator=(const _mlpreport_owner &rhs); virtual ~_mlpreport_owner(); alglib_impl::mlpreport* c_ptr(); alglib_impl::mlpreport* c_ptr() const; protected: alglib_impl::mlpreport *p_struct; }; class mlpreport : public _mlpreport_owner { public: mlpreport(); mlpreport(const mlpreport &rhs); mlpreport& operator=(const mlpreport &rhs); virtual ~mlpreport(); double &relclserror; double &avgce; double &rmserror; double &avgerror; double &avgrelerror; ae_int_t &ngrad; ae_int_t &nhess; ae_int_t &ncholesky; }; /************************************************************************* Cross-validation estimates of generalization error *************************************************************************/ class _mlpcvreport_owner { public: _mlpcvreport_owner(); _mlpcvreport_owner(const _mlpcvreport_owner &rhs); _mlpcvreport_owner& operator=(const _mlpcvreport_owner &rhs); virtual ~_mlpcvreport_owner(); alglib_impl::mlpcvreport* c_ptr(); alglib_impl::mlpcvreport* c_ptr() const; protected: alglib_impl::mlpcvreport *p_struct; }; class mlpcvreport : public _mlpcvreport_owner { public: mlpcvreport(); mlpcvreport(const mlpcvreport &rhs); mlpcvreport& operator=(const mlpcvreport &rhs); virtual ~mlpcvreport(); double &relclserror; double &avgce; double &rmserror; double &avgerror; double &avgrelerror; }; /************************************************************************* Trainer object for neural network. You should not try to access fields of this object directly - use ALGLIB functions to work with this object. *************************************************************************/ class _mlptrainer_owner { public: _mlptrainer_owner(); _mlptrainer_owner(const _mlptrainer_owner &rhs); _mlptrainer_owner& operator=(const _mlptrainer_owner &rhs); virtual ~_mlptrainer_owner(); alglib_impl::mlptrainer* c_ptr(); alglib_impl::mlptrainer* c_ptr() const; protected: alglib_impl::mlptrainer *p_struct; }; class mlptrainer : public _mlptrainer_owner { public: mlptrainer(); mlptrainer(const mlptrainer &rhs); mlptrainer& operator=(const mlptrainer &rhs); virtual ~mlptrainer(); }; /************************************************************************* This structure is a clusterization engine. You should not try to access its fields directly. Use ALGLIB functions in order to work with this object. -- ALGLIB -- Copyright 10.07.2012 by Bochkanov Sergey *************************************************************************/ class _clusterizerstate_owner { public: _clusterizerstate_owner(); _clusterizerstate_owner(const _clusterizerstate_owner &rhs); _clusterizerstate_owner& operator=(const _clusterizerstate_owner &rhs); virtual ~_clusterizerstate_owner(); alglib_impl::clusterizerstate* c_ptr(); alglib_impl::clusterizerstate* c_ptr() const; protected: alglib_impl::clusterizerstate *p_struct; }; class clusterizerstate : public _clusterizerstate_owner { public: clusterizerstate(); clusterizerstate(const clusterizerstate &rhs); clusterizerstate& operator=(const clusterizerstate &rhs); virtual ~clusterizerstate(); }; /************************************************************************* This structure is used to store results of the agglomerative hierarchical clustering (AHC). 
Following information is returned: * TerminationType - completion code: * 1 for successful completion of algorithm * -5 inappropriate combination of clustering algorithm and distance function was used. As for now, it is possible only when Ward's method is called for dataset with non-Euclidean distance function. In case negative completion code is returned, other fields of report structure are invalid and should not be used. * NPoints contains number of points in the original dataset * Z contains information about merges performed (see below). Z contains indexes from the original (unsorted) dataset and it can be used when you need to know what points were merged. However, it is not convenient when you want to build a dendrograd (see below). * if you want to build dendrogram, you can use Z, but it is not good option, because Z contains indexes from unsorted dataset. Dendrogram built from such dataset is likely to have intersections. So, you have to reorder you points before building dendrogram. Permutation which reorders point is returned in P. Another representation of merges, which is more convenient for dendorgram construction, is returned in PM. * more information on format of Z, P and PM can be found below and in the examples from ALGLIB Reference Manual. FORMAL DESCRIPTION OF FIELDS: NPoints number of points Z array[NPoints-1,2], contains indexes of clusters linked in pairs to form clustering tree. I-th row corresponds to I-th merge: * Z[I,0] - index of the first cluster to merge * Z[I,1] - index of the second cluster to merge * Z[I,0]=0 NFeatures number of variables, >=1 TerminationType completion code: * -5 if distance type is anything different from Euclidean metric * -3 for degenerate dataset: a) less than K distinct points, b) K=0 for non-empty dataset. * +1 for successful completion K number of clusters C array[K,NFeatures], rows of the array store centers CIdx array[NPoints], which contains cluster indexes IterationsCount actual number of iterations performed by clusterizer. If algorithm performed more than one random restart, total number of iterations is returned. 
Energy merit function, "energy", sum of squared deviations from cluster centers -- ALGLIB -- Copyright 27.11.2012 by Bochkanov Sergey *************************************************************************/ class _kmeansreport_owner { public: _kmeansreport_owner(); _kmeansreport_owner(const _kmeansreport_owner &rhs); _kmeansreport_owner& operator=(const _kmeansreport_owner &rhs); virtual ~_kmeansreport_owner(); alglib_impl::kmeansreport* c_ptr(); alglib_impl::kmeansreport* c_ptr() const; protected: alglib_impl::kmeansreport *p_struct; }; class kmeansreport : public _kmeansreport_owner { public: kmeansreport(); kmeansreport(const kmeansreport &rhs); kmeansreport& operator=(const kmeansreport &rhs); virtual ~kmeansreport(); ae_int_t &npoints; ae_int_t &nfeatures; ae_int_t &terminationtype; ae_int_t &iterationscount; double &energy; ae_int_t &k; real_2d_array c; integer_1d_array cidx; }; /************************************************************************* *************************************************************************/ class _decisionforest_owner { public: _decisionforest_owner(); _decisionforest_owner(const _decisionforest_owner &rhs); _decisionforest_owner& operator=(const _decisionforest_owner &rhs); virtual ~_decisionforest_owner(); alglib_impl::decisionforest* c_ptr(); alglib_impl::decisionforest* c_ptr() const; protected: alglib_impl::decisionforest *p_struct; }; class decisionforest : public _decisionforest_owner { public: decisionforest(); decisionforest(const decisionforest &rhs); decisionforest& operator=(const decisionforest &rhs); virtual ~decisionforest(); }; /************************************************************************* *************************************************************************/ class _dfreport_owner { public: _dfreport_owner(); _dfreport_owner(const _dfreport_owner &rhs); _dfreport_owner& operator=(const _dfreport_owner &rhs); virtual ~_dfreport_owner(); alglib_impl::dfreport* c_ptr(); alglib_impl::dfreport* c_ptr() const; protected: alglib_impl::dfreport *p_struct; }; class dfreport : public _dfreport_owner { public: dfreport(); dfreport(const dfreport &rhs); dfreport& operator=(const dfreport &rhs); virtual ~dfreport(); double &relclserror; double &avgce; double &rmserror; double &avgerror; double &avgrelerror; double &oobrelclserror; double &oobavgce; double &oobrmserror; double &oobavgerror; double &oobavgrelerror; }; /************************************************************************* Principal components analysis This function builds orthogonal basis where first axis corresponds to direction with maximum variance, second axis maximizes variance in the subspace orthogonal to first axis and so on. This function builds FULL basis, i.e. returns N vectors corresponding to ALL directions, no matter how informative. If you need just a few (say, 10 or 50) of the most important directions, you may find it faster to use one of the reduced versions: * pcatruncatedsubspace() - for subspace iteration based method It should be noted that, unlike LDA, PCA does not use class labels. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * multithreading support ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! ! Multithreading typically gives sublinear (wrt to cores count) speedup, ! because only some parts of the algorithm can be parallelized. ! ! Intel MKL gives approximately constant (with respect to number of ! 
worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. Best results are achieved for high-dimensional problems ! (NVars is at least 256). ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS: X - dataset, array[0..NPoints-1,0..NVars-1]. matrix contains ONLY INDEPENDENT VARIABLES. NPoints - dataset size, NPoints>=0 NVars - number of independent variables, NVars>=1 OUTPUT PARAMETERS: Info - return code: * -4, if SVD subroutine haven't converged * -1, if wrong parameters has been passed (NPoints<0, NVars<1) * 1, if task is solved S2 - array[0..NVars-1]. variance values corresponding to basis vectors. V - array[0..NVars-1,0..NVars-1] matrix, whose columns store basis vectors. -- ALGLIB -- Copyright 25.08.2008 by Bochkanov Sergey *************************************************************************/ void pcabuildbasis(const real_2d_array &x, const ae_int_t npoints, const ae_int_t nvars, ae_int_t &info, real_1d_array &s2, real_2d_array &v); void smp_pcabuildbasis(const real_2d_array &x, const ae_int_t npoints, const ae_int_t nvars, ae_int_t &info, real_1d_array &s2, real_2d_array &v); /************************************************************************* Principal components analysis This function performs truncated PCA, i.e. returns just a few most important directions. Internally it uses iterative eigensolver which is very efficient when only a minor fraction of full basis is required. Thus, if you need full basis, it is better to use pcabuildbasis() function. It should be noted that, unlike LDA, PCA does not use class labels. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * multithreading support ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! ! For a situation when you need just a few eigenvectors (~1-10), ! multithreading typically gives sublinear (wrt to cores count) speedup. ! For larger problems it may give you nearly linear increase in ! performance. ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. Best results are achieved for high-dimensional problems ! (NVars is at least 256). ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS: X - dataset, array[0..NPoints-1,0..NVars-1]. matrix contains ONLY INDEPENDENT VARIABLES. NPoints - dataset size, NPoints>=0 NVars - number of independent variables, NVars>=1 NNeeded - number of requested directions, in [1,NVars] range; this function is efficient only for NNeeded<1) * -1, incorrect pararemets were passed (N<=0). * 1, OK Threshold- partiton boundary. Left part contains values which are strictly less than Threshold. Right part contains values which are greater than or equal to Threshold. 
PAL, PBL- probabilities P(0|v=Threshold) and P(1|v>=Threshold) CVE - cross-validation estimate of cross-entropy -- ALGLIB -- Copyright 22.05.2008 by Bochkanov Sergey *************************************************************************/ void dsoptimalsplit2(const real_1d_array &a, const integer_1d_array &c, const ae_int_t n, ae_int_t &info, double &threshold, double &pal, double &pbl, double &par, double &pbr, double &cve); /************************************************************************* Optimal partition, internal subroutine. Fast version. Accepts: A array[0..N-1] array of attributes array[0..N-1] C array[0..N-1] array of class labels TiesBuf array[0..N] temporaries (ties) CntBuf array[0..2*NC-1] temporaries (counts) Alpha centering factor (0<=alpha<=1, recommended value - 0.05) BufR array[0..N-1] temporaries BufI array[0..N-1] temporaries Output: Info error code (">0"=OK, "<0"=bad) RMS training set RMS error CVRMS leave-one-out RMS error Note: content of all arrays is changed by subroutine; it doesn't allocate temporaries. -- ALGLIB -- Copyright 11.12.2008 by Bochkanov Sergey *************************************************************************/ void dsoptimalsplit2fast(real_1d_array &a, integer_1d_array &c, integer_1d_array &tiesbuf, integer_1d_array &cntbuf, real_1d_array &bufr, integer_1d_array &bufi, const ae_int_t n, const ae_int_t nc, const double alpha, ae_int_t &info, double &threshold, double &rms, double &cvrms); /************************************************************************* This function serializes data structure to string. Important properties of s_out: * it contains alphanumeric characters, dots, underscores, minus signs * these symbols are grouped into words, which are separated by spaces and Windows-style (CR+LF) newlines * although serializer uses spaces and CR+LF as separators, you can replace any separator character by arbitrary combination of spaces, tabs, Windows or Unix newlines. It allows flexible reformatting of the string in case you want to include it into text or XML file. But you should not insert separators into the middle of the "words" nor you should change case of letters. * s_out can be freely moved between 32-bit and 64-bit systems, little and big endian machines, and so on. You can serialize structure on 32-bit machine and unserialize it on 64-bit one (or vice versa), or serialize it on SPARC and unserialize on x86. You can also serialize it in C++ version of ALGLIB and unserialize in C# one, and vice versa. *************************************************************************/ void mlpserialize(multilayerperceptron &obj, std::string &s_out); /************************************************************************* This function unserializes data structure from string. *************************************************************************/ void mlpunserialize(const std::string &s_in, multilayerperceptron &obj); /************************************************************************* This function serializes data structure to C++ stream. Data stream generated by this function is same as string representation generated by string version of serializer - alphanumeric characters, dots, underscores, minus signs, which are grouped into words separated by spaces and CR+LF. We recommend you to read comments on string version of serializer to find out more about serialization of AlGLIB objects. 
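A minimal round-trip sketch using the string overloads declared above (the network constructor mlpcreate1() is declared further below):

    alglib::multilayerperceptron net, net2;
    alglib::mlpcreate1(2, 5, 1, net);   // 2 inputs, 5 hidden neurons, 1 output
    std::string s;
    alglib::mlpserialize(net, s);       // serialize to portable string form
    alglib::mlpunserialize(s, net2);    // net2 now holds an identical network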
*************************************************************************/ void mlpserialize(multilayerperceptron &obj, std::ostream &s_out); /************************************************************************* This function unserializes data structure from stream. *************************************************************************/ void mlpunserialize(const std::istream &s_in, multilayerperceptron &obj); /************************************************************************* Creates neural network with NIn inputs, NOut outputs, without hidden layers, with linear output layer. Network weights are filled with small random values. -- ALGLIB -- Copyright 04.11.2007 by Bochkanov Sergey *************************************************************************/ void mlpcreate0(const ae_int_t nin, const ae_int_t nout, multilayerperceptron &network); /************************************************************************* Same as MLPCreate0, but with one hidden layer (NHid neurons) with non-linear activation function. Output layer is linear. -- ALGLIB -- Copyright 04.11.2007 by Bochkanov Sergey *************************************************************************/ void mlpcreate1(const ae_int_t nin, const ae_int_t nhid, const ae_int_t nout, multilayerperceptron &network); /************************************************************************* Same as MLPCreate0, but with two hidden layers (NHid1 and NHid2 neurons) with non-linear activation function. Output layer is linear. $ALL -- ALGLIB -- Copyright 04.11.2007 by Bochkanov Sergey *************************************************************************/ void mlpcreate2(const ae_int_t nin, const ae_int_t nhid1, const ae_int_t nhid2, const ae_int_t nout, multilayerperceptron &network); /************************************************************************* Creates neural network with NIn inputs, NOut outputs, without hidden layers with non-linear output layer. Network weights are filled with small random values. Activation function of the output layer takes values: (B, +INF), if D>=0 or (-INF, B), if D<0. -- ALGLIB -- Copyright 30.03.2008 by Bochkanov Sergey *************************************************************************/ void mlpcreateb0(const ae_int_t nin, const ae_int_t nout, const double b, const double d, multilayerperceptron &network); /************************************************************************* Same as MLPCreateB0 but with non-linear hidden layer. -- ALGLIB -- Copyright 30.03.2008 by Bochkanov Sergey *************************************************************************/ void mlpcreateb1(const ae_int_t nin, const ae_int_t nhid, const ae_int_t nout, const double b, const double d, multilayerperceptron &network); /************************************************************************* Same as MLPCreateB0 but with two non-linear hidden layers. -- ALGLIB -- Copyright 30.03.2008 by Bochkanov Sergey *************************************************************************/ void mlpcreateb2(const ae_int_t nin, const ae_int_t nhid1, const ae_int_t nhid2, const ae_int_t nout, const double b, const double d, multilayerperceptron &network); /************************************************************************* Creates neural network with NIn inputs, NOut outputs, without hidden layers with non-linear output layer. Network weights are filled with small random values. Activation function of the output layer takes values [A,B]. 
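For instance, a sketch using mlpcreater1() (declared just below) to get a single-output network whose predictions are constrained to [0,10]:

    alglib::multilayerperceptron net;
    alglib::mlpcreater1(3, 10, 1, 0.0, 10.0, net);   // 3 inputs, 10 hidden neurons, 1 output bounded to [0,10]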
-- ALGLIB -- Copyright 30.03.2008 by Bochkanov Sergey *************************************************************************/ void mlpcreater0(const ae_int_t nin, const ae_int_t nout, const double a, const double b, multilayerperceptron &network); /************************************************************************* Same as MLPCreateR0, but with non-linear hidden layer. -- ALGLIB -- Copyright 30.03.2008 by Bochkanov Sergey *************************************************************************/ void mlpcreater1(const ae_int_t nin, const ae_int_t nhid, const ae_int_t nout, const double a, const double b, multilayerperceptron &network); /************************************************************************* Same as MLPCreateR0, but with two non-linear hidden layers. -- ALGLIB -- Copyright 30.03.2008 by Bochkanov Sergey *************************************************************************/ void mlpcreater2(const ae_int_t nin, const ae_int_t nhid1, const ae_int_t nhid2, const ae_int_t nout, const double a, const double b, multilayerperceptron &network); /************************************************************************* Creates classifier network with NIn inputs and NOut possible classes. Network contains no hidden layers and linear output layer with SOFTMAX- normalization (so outputs sums up to 1.0 and converge to posterior probabilities). -- ALGLIB -- Copyright 04.11.2007 by Bochkanov Sergey *************************************************************************/ void mlpcreatec0(const ae_int_t nin, const ae_int_t nout, multilayerperceptron &network); /************************************************************************* Same as MLPCreateC0, but with one non-linear hidden layer. -- ALGLIB -- Copyright 04.11.2007 by Bochkanov Sergey *************************************************************************/ void mlpcreatec1(const ae_int_t nin, const ae_int_t nhid, const ae_int_t nout, multilayerperceptron &network); /************************************************************************* Same as MLPCreateC0, but with two non-linear hidden layers. -- ALGLIB -- Copyright 04.11.2007 by Bochkanov Sergey *************************************************************************/ void mlpcreatec2(const ae_int_t nin, const ae_int_t nhid1, const ae_int_t nhid2, const ae_int_t nout, multilayerperceptron &network); /************************************************************************* Copying of neural network INPUT PARAMETERS: Network1 - original OUTPUT PARAMETERS: Network2 - copy -- ALGLIB -- Copyright 04.11.2007 by Bochkanov Sergey *************************************************************************/ void mlpcopy(const multilayerperceptron &network1, multilayerperceptron &network2); /************************************************************************* This function copies tunable parameters (weights/means/sigmas) from one network to another with same architecture. It performs some rudimentary checks that architectures are same, and throws exception if check fails. It is intended for fast copying of states between two network which are known to have same geometry. 
INPUT PARAMETERS: Network1 - source, must be correctly initialized Network2 - target, must have same architecture OUTPUT PARAMETERS: Network2 - network state is copied from source to target -- ALGLIB -- Copyright 20.06.2013 by Bochkanov Sergey *************************************************************************/ void mlpcopytunableparameters(const multilayerperceptron &network1, const multilayerperceptron &network2); /************************************************************************* Randomization of neural network weights -- ALGLIB -- Copyright 06.11.2007 by Bochkanov Sergey *************************************************************************/ void mlprandomize(const multilayerperceptron &network); /************************************************************************* Randomization of neural network weights and standartisator -- ALGLIB -- Copyright 10.03.2008 by Bochkanov Sergey *************************************************************************/ void mlprandomizefull(const multilayerperceptron &network); /************************************************************************* Internal subroutine. -- ALGLIB -- Copyright 30.03.2008 by Bochkanov Sergey *************************************************************************/ void mlpinitpreprocessor(const multilayerperceptron &network, const real_2d_array &xy, const ae_int_t ssize); /************************************************************************* Returns information about initialized network: number of inputs, outputs, weights. -- ALGLIB -- Copyright 04.11.2007 by Bochkanov Sergey *************************************************************************/ void mlpproperties(const multilayerperceptron &network, ae_int_t &nin, ae_int_t &nout, ae_int_t &wcount); /************************************************************************* Returns number of inputs. -- ALGLIB -- Copyright 19.10.2011 by Bochkanov Sergey *************************************************************************/ ae_int_t mlpgetinputscount(const multilayerperceptron &network); /************************************************************************* Returns number of outputs. -- ALGLIB -- Copyright 19.10.2011 by Bochkanov Sergey *************************************************************************/ ae_int_t mlpgetoutputscount(const multilayerperceptron &network); /************************************************************************* Returns number of weights. -- ALGLIB -- Copyright 19.10.2011 by Bochkanov Sergey *************************************************************************/ ae_int_t mlpgetweightscount(const multilayerperceptron &network); /************************************************************************* Tells whether network is SOFTMAX-normalized (i.e. classifier) or not. -- ALGLIB -- Copyright 04.11.2007 by Bochkanov Sergey *************************************************************************/ bool mlpissoftmax(const multilayerperceptron &network); /************************************************************************* This function returns total number of layers (including input, hidden and output layers). -- ALGLIB -- Copyright 25.03.2011 by Bochkanov Sergey *************************************************************************/ ae_int_t mlpgetlayerscount(const multilayerperceptron &network); /************************************************************************* This function returns size of K-th layer. K=0 corresponds to input layer, K=CNT-1 corresponds to output layer. 
Size of the output layer is always equal to the number of outputs, although when we have softmax-normalized network, last neuron doesn't have any connections - it is just zero. -- ALGLIB -- Copyright 25.03.2011 by Bochkanov Sergey *************************************************************************/ ae_int_t mlpgetlayersize(const multilayerperceptron &network, const ae_int_t k); /************************************************************************* This function returns offset/scaling coefficients for I-th input of the network. INPUT PARAMETERS: Network - network I - input index OUTPUT PARAMETERS: Mean - mean term Sigma - sigma term, guaranteed to be nonzero. I-th input is passed through linear transformation IN[i] = (IN[i]-Mean)/Sigma before feeding to the network -- ALGLIB -- Copyright 25.03.2011 by Bochkanov Sergey *************************************************************************/ void mlpgetinputscaling(const multilayerperceptron &network, const ae_int_t i, double &mean, double &sigma); /************************************************************************* This function returns offset/scaling coefficients for I-th output of the network. INPUT PARAMETERS: Network - network I - input index OUTPUT PARAMETERS: Mean - mean term Sigma - sigma term, guaranteed to be nonzero. I-th output is passed through linear transformation OUT[i] = OUT[i]*Sigma+Mean before returning it to user. In case we have SOFTMAX-normalized network, we return (Mean,Sigma)=(0.0,1.0). -- ALGLIB -- Copyright 25.03.2011 by Bochkanov Sergey *************************************************************************/ void mlpgetoutputscaling(const multilayerperceptron &network, const ae_int_t i, double &mean, double &sigma); /************************************************************************* This function returns information about Ith neuron of Kth layer INPUT PARAMETERS: Network - network K - layer index I - neuron index (within layer) OUTPUT PARAMETERS: FKind - activation function type (used by MLPActivationFunction()) this value is zero for input or linear neurons Threshold - also called offset, bias zero for input neurons NOTE: this function throws exception if layer or neuron with given index do not exists. -- ALGLIB -- Copyright 25.03.2011 by Bochkanov Sergey *************************************************************************/ void mlpgetneuroninfo(const multilayerperceptron &network, const ae_int_t k, const ae_int_t i, ae_int_t &fkind, double &threshold); /************************************************************************* This function returns information about connection from I0-th neuron of K0-th layer to I1-th neuron of K1-th layer. INPUT PARAMETERS: Network - network K0 - layer index I0 - neuron index (within layer) K1 - layer index I1 - neuron index (within layer) RESULT: connection weight (zero for non-existent connections) This function: 1. throws exception if layer or neuron with given index do not exists. 2. returns zero if neurons exist, but there is no connection between them -- ALGLIB -- Copyright 25.03.2011 by Bochkanov Sergey *************************************************************************/ double mlpgetweight(const multilayerperceptron &network, const ae_int_t k0, const ae_int_t i0, const ae_int_t k1, const ae_int_t i1); /************************************************************************* This function sets offset/scaling coefficients for I-th input of the network. 
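/*************************************************************************
Informal sketch (illustrative only): inspecting the structure of a network
with the getters declared above. The header name and network geometry are
assumptions made for this example.
*************************************************************************/
#include "dataanalysis.h"
using namespace alglib;
void example_inspect_network()
{
    multilayerperceptron net;
    mlpcreatec1(4, 10, 3, net);                        // one hidden layer
    ae_int_t layers = mlpgetlayerscount(net);          // e.g. 3 for a single-hidden-layer network
    ae_int_t hiddensize = mlpgetlayersize(net, 1);     // size of layer 1 (the hidden layer)
    ae_int_t fkind;
    double threshold;
    mlpgetneuroninfo(net, 1, 0, fkind, threshold);     // activation type and bias of neuron 0 in layer 1
    double w = mlpgetweight(net, 0, 0, 1, 0);          // connection weight from input 0 to that neuron
}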
INPUT PARAMETERS: Network - network I - input index Mean - mean term Sigma - sigma term (if zero, will be replaced by 1.0) NOTE: I-th input is passed through linear transformation IN[i] = (IN[i]-Mean)/Sigma before feeding to the network. This function sets Mean and Sigma. -- ALGLIB -- Copyright 25.03.2011 by Bochkanov Sergey *************************************************************************/ void mlpsetinputscaling(const multilayerperceptron &network, const ae_int_t i, const double mean, const double sigma);
/************************************************************************* This function sets offset/scaling coefficients for I-th output of the network. INPUT PARAMETERS: Network - network I - output index Mean - mean term Sigma - sigma term (if zero, will be replaced by 1.0) NOTE: I-th output is passed through linear transformation OUT[i] = OUT[i]*Sigma+Mean before returning it to user. This function sets Sigma/Mean. In case we have SOFTMAX-normalized network, you cannot set (Sigma,Mean) to anything other than (0.0,1.0) - this function will throw an exception. -- ALGLIB -- Copyright 25.03.2011 by Bochkanov Sergey *************************************************************************/ void mlpsetoutputscaling(const multilayerperceptron &network, const ae_int_t i, const double mean, const double sigma);
/************************************************************************* This function modifies information about I-th neuron of K-th layer INPUT PARAMETERS: Network - network K - layer index I - neuron index (within layer) FKind - activation function type (used by MLPActivationFunction()) this value must be zero for input neurons (you cannot set activation function for input neurons) Threshold - also called offset, bias this value must be zero for input neurons (you cannot set threshold for input neurons) NOTES: 1. this function throws an exception if a layer or neuron with the given index does not exist. 2. this function also throws an exception when you try to set a non-linear activation function for input neurons (any kind of network) or for output neurons of a classifier network. 3. this function throws an exception when you try to set a non-zero threshold for input neurons (any kind of network). -- ALGLIB -- Copyright 25.03.2011 by Bochkanov Sergey *************************************************************************/ void mlpsetneuroninfo(const multilayerperceptron &network, const ae_int_t k, const ae_int_t i, const ae_int_t fkind, const double threshold);
/************************************************************************* This function modifies information about connection from I0-th neuron of K0-th layer to I1-th neuron of K1-th layer. INPUT PARAMETERS: Network - network K0 - layer index I0 - neuron index (within layer) K1 - layer index I1 - neuron index (within layer) W - connection weight (must be zero for non-existent connections) This function: 1. throws an exception if a layer or neuron with the given index does not exist. 2. 
throws exception if you try to set non-zero weight for non-existent connection -- ALGLIB -- Copyright 25.03.2011 by Bochkanov Sergey *************************************************************************/ void mlpsetweight(const multilayerperceptron &network, const ae_int_t k0, const ae_int_t i0, const ae_int_t k1, const ae_int_t i1, const double w); /************************************************************************* Neural network activation function INPUT PARAMETERS: NET - neuron input K - function index (zero for linear function) OUTPUT PARAMETERS: F - function DF - its derivative D2F - its second derivative -- ALGLIB -- Copyright 04.11.2007 by Bochkanov Sergey *************************************************************************/ void mlpactivationfunction(const double net, const ae_int_t k, double &f, double &df, double &d2f); /************************************************************************* Procesing INPUT PARAMETERS: Network - neural network X - input vector, array[0..NIn-1]. OUTPUT PARAMETERS: Y - result. Regression estimate when solving regression task, vector of posterior probabilities for classification task. See also MLPProcessI -- ALGLIB -- Copyright 04.11.2007 by Bochkanov Sergey *************************************************************************/ void mlpprocess(const multilayerperceptron &network, const real_1d_array &x, real_1d_array &y); /************************************************************************* 'interactive' variant of MLPProcess for languages like Python which support constructs like "Y = MLPProcess(NN,X)" and interactive mode of the interpreter This function allocates new array on each call, so it is significantly slower than its 'non-interactive' counterpart, but it is more convenient when you call it from command line. -- ALGLIB -- Copyright 21.09.2010 by Bochkanov Sergey *************************************************************************/ void mlpprocessi(const multilayerperceptron &network, const real_1d_array &x, real_1d_array &y); /************************************************************************* Error of the neural network on dataset. FOR USERS OF COMMERCIAL EDITION: ! Commercial version of ALGLIB includes two important improvements of ! this function: ! * multicore support (C++ and C# computational cores) ! * SSE support ! ! First improvement gives close-to-linear speedup on multicore systems. ! Second improvement gives constant speedup (2-3x, depending on your CPU) ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! In order to use SSE features you have to: ! * use commercial version of ALGLIB on Intel processors ! * use C++ computational core ! ! This note is given for users of commercial edition; if you use GPL ! edition, you still will be able to call smp-version of this function, ! but all computations will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. INPUT PARAMETERS: Network - neural network; XY - training set, see below for information on the training set format; NPoints - points count. RESULT: sum-of-squares error, SUM(sqr(y[i]-desired_y[i])/2) DATASET FORMAT: This function uses two different dataset formats - one for regression networks, another one for classification networks. 
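/*************************************************************************
Informal sketch (illustrative only): feeding an input vector through a
classifier network with MLPProcess. For a SOFTMAX-normalized network the
output vector holds posterior probabilities which sum to 1.0. The header
name and values are assumptions.
*************************************************************************/
#include "dataanalysis.h"
using namespace alglib;
void example_process()
{
    multilayerperceptron net;
    mlpcreatec1(2, 6, 3, net);          // 2 inputs, 3 classes
    real_1d_array x = "[0.5, -1.0]";
    real_1d_array y;
    mlpprocess(net, x, y);              // y[0..2] are class probabilities (network is untrained here)
}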
For regression networks with NIn inputs and NOut outputs following dataset format is used: * dataset is given by NPoints*(NIn+NOut) matrix * each row corresponds to one example * first NIn columns are inputs, next NOut columns are outputs For classification networks with NIn inputs and NClasses clases following dataset format is used: * dataset is given by NPoints*(NIn+1) matrix * each row corresponds to one example * first NIn columns are inputs, last column stores class number (from 0 to NClasses-1). -- ALGLIB -- Copyright 04.11.2007 by Bochkanov Sergey *************************************************************************/ double mlperror(const multilayerperceptron &network, const real_2d_array &xy, const ae_int_t npoints); double smp_mlperror(const multilayerperceptron &network, const real_2d_array &xy, const ae_int_t npoints); /************************************************************************* Error of the neural network on dataset given by sparse matrix. FOR USERS OF COMMERCIAL EDITION: ! Commercial version of ALGLIB includes two important improvements of ! this function: ! * multicore support (C++ and C# computational cores) ! * SSE support ! ! First improvement gives close-to-linear speedup on multicore systems. ! Second improvement gives constant speedup (2-3x, depending on your CPU) ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! In order to use SSE features you have to: ! * use commercial version of ALGLIB on Intel processors ! * use C++ computational core ! ! This note is given for users of commercial edition; if you use GPL ! edition, you still will be able to call smp-version of this function, ! but all computations will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. INPUT PARAMETERS: Network - neural network XY - training set, see below for information on the training set format. This function checks correctness of the dataset (no NANs/INFs, class numbers are correct) and throws exception when incorrect dataset is passed. Sparse matrix must use CRS format for storage. NPoints - points count, >=0 RESULT: sum-of-squares error, SUM(sqr(y[i]-desired_y[i])/2) DATASET FORMAT: This function uses two different dataset formats - one for regression networks, another one for classification networks. For regression networks with NIn inputs and NOut outputs following dataset format is used: * dataset is given by NPoints*(NIn+NOut) matrix * each row corresponds to one example * first NIn columns are inputs, next NOut columns are outputs For classification networks with NIn inputs and NClasses clases following dataset format is used: * dataset is given by NPoints*(NIn+1) matrix * each row corresponds to one example * first NIn columns are inputs, last column stores class number (from 0 to NClasses-1). -- ALGLIB -- Copyright 23.07.2012 by Bochkanov Sergey *************************************************************************/ double mlperrorsparse(const multilayerperceptron &network, const sparsematrix &xy, const ae_int_t npoints); double smp_mlperrorsparse(const multilayerperceptron &network, const sparsematrix &xy, const ae_int_t npoints); /************************************************************************* Natural error function for neural network, internal subroutine. 
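/*************************************************************************
Informal sketch (illustrative only): computing the sum-of-squares error of
a regression network on a small dense dataset laid out as described above
(first NIn columns are inputs, next NOut columns are targets). The header
name and data are assumptions.
*************************************************************************/
#include "dataanalysis.h"
using namespace alglib;
void example_error()
{
    multilayerperceptron net;
    mlpcreater1(1, 3, 1, 0.0, 1.0, net);
    real_2d_array xy = "[[0.0,0.0],[0.5,0.25],[1.0,1.0]]";   // 3 samples, NIn=1, NOut=1
    double err = mlperror(net, xy, 3);                        // SUM(sqr(y[i]-desired_y[i])/2)
    // commercial edition: smp_mlperror(net, xy, 3) computes the same value in parallel
}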
NOTE: this function is single-threaded. Unlike other error function, it receives no speed-up from being executed in SMP mode. -- ALGLIB -- Copyright 04.11.2007 by Bochkanov Sergey *************************************************************************/ double mlperrorn(const multilayerperceptron &network, const real_2d_array &xy, const ae_int_t ssize); /************************************************************************* Classification error of the neural network on dataset. FOR USERS OF COMMERCIAL EDITION: ! Commercial version of ALGLIB includes two important improvements of ! this function: ! * multicore support (C++ and C# computational cores) ! * SSE support ! ! First improvement gives close-to-linear speedup on multicore systems. ! Second improvement gives constant speedup (2-3x depending on your CPU) ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! In order to use SSE features you have to: ! * use commercial version of ALGLIB on Intel processors ! * use C++ computational core ! ! This note is given for users of commercial edition; if you use GPL ! edition, you still will be able to call smp-version of this function, ! but all computations will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. INPUT PARAMETERS: Network - neural network; XY - training set, see below for information on the training set format; NPoints - points count. RESULT: classification error (number of misclassified cases) DATASET FORMAT: This function uses two different dataset formats - one for regression networks, another one for classification networks. For regression networks with NIn inputs and NOut outputs following dataset format is used: * dataset is given by NPoints*(NIn+NOut) matrix * each row corresponds to one example * first NIn columns are inputs, next NOut columns are outputs For classification networks with NIn inputs and NClasses clases following dataset format is used: * dataset is given by NPoints*(NIn+1) matrix * each row corresponds to one example * first NIn columns are inputs, last column stores class number (from 0 to NClasses-1). -- ALGLIB -- Copyright 04.11.2007 by Bochkanov Sergey *************************************************************************/ ae_int_t mlpclserror(const multilayerperceptron &network, const real_2d_array &xy, const ae_int_t npoints); ae_int_t smp_mlpclserror(const multilayerperceptron &network, const real_2d_array &xy, const ae_int_t npoints); /************************************************************************* Relative classification error on the test set. FOR USERS OF COMMERCIAL EDITION: ! Commercial version of ALGLIB includes two important improvements of ! this function: ! * multicore support (C++ and C# computational cores) ! * SSE support ! ! First improvement gives close-to-linear speedup on multicore systems. ! Second improvement gives constant speedup (2-3x depending on your CPU) ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! In order to use SSE features you have to: ! * use commercial version of ALGLIB on Intel processors ! * use C++ computational core ! ! 
This note is given for users of commercial edition; if you use GPL ! edition, you still will be able to call smp-version of this function, ! but all computations will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. INPUT PARAMETERS: Network - neural network; XY - training set, see below for information on the training set format; NPoints - points count. RESULT: Percent of incorrectly classified cases. Works both for classifier networks and general purpose networks used as classifiers. DATASET FORMAT: This function uses two different dataset formats - one for regression networks, another one for classification networks. For regression networks with NIn inputs and NOut outputs following dataset format is used: * dataset is given by NPoints*(NIn+NOut) matrix * each row corresponds to one example * first NIn columns are inputs, next NOut columns are outputs For classification networks with NIn inputs and NClasses clases following dataset format is used: * dataset is given by NPoints*(NIn+1) matrix * each row corresponds to one example * first NIn columns are inputs, last column stores class number (from 0 to NClasses-1). -- ALGLIB -- Copyright 25.12.2008 by Bochkanov Sergey *************************************************************************/ double mlprelclserror(const multilayerperceptron &network, const real_2d_array &xy, const ae_int_t npoints); double smp_mlprelclserror(const multilayerperceptron &network, const real_2d_array &xy, const ae_int_t npoints); /************************************************************************* Relative classification error on the test set given by sparse matrix. FOR USERS OF COMMERCIAL EDITION: ! Commercial version of ALGLIB includes two important improvements of ! this function: ! * multicore support (C++ and C# computational cores) ! * SSE support ! ! First improvement gives close-to-linear speedup on multicore systems. ! Second improvement gives constant speedup (2-3x depending on your CPU) ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! In order to use SSE features you have to: ! * use commercial version of ALGLIB on Intel processors ! * use C++ computational core ! ! This note is given for users of commercial edition; if you use GPL ! edition, you still will be able to call smp-version of this function, ! but all computations will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. INPUT PARAMETERS: Network - neural network; XY - training set, see below for information on the training set format. Sparse matrix must use CRS format for storage. NPoints - points count, >=0. RESULT: Percent of incorrectly classified cases. Works both for classifier networks and general purpose networks used as classifiers. DATASET FORMAT: This function uses two different dataset formats - one for regression networks, another one for classification networks. 
For regression networks with NIn inputs and NOut outputs following dataset format is used: * dataset is given by NPoints*(NIn+NOut) matrix * each row corresponds to one example * first NIn columns are inputs, next NOut columns are outputs For classification networks with NIn inputs and NClasses clases following dataset format is used: * dataset is given by NPoints*(NIn+1) matrix * each row corresponds to one example * first NIn columns are inputs, last column stores class number (from 0 to NClasses-1). -- ALGLIB -- Copyright 09.08.2012 by Bochkanov Sergey *************************************************************************/ double mlprelclserrorsparse(const multilayerperceptron &network, const sparsematrix &xy, const ae_int_t npoints); double smp_mlprelclserrorsparse(const multilayerperceptron &network, const sparsematrix &xy, const ae_int_t npoints); /************************************************************************* Average cross-entropy (in bits per element) on the test set. FOR USERS OF COMMERCIAL EDITION: ! Commercial version of ALGLIB includes two important improvements of ! this function: ! * multicore support (C++ and C# computational cores) ! * SSE support ! ! First improvement gives close-to-linear speedup on multicore systems. ! Second improvement gives constant speedup (2-3x depending on your CPU) ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! In order to use SSE features you have to: ! * use commercial version of ALGLIB on Intel processors ! * use C++ computational core ! ! This note is given for users of commercial edition; if you use GPL ! edition, you still will be able to call smp-version of this function, ! but all computations will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. INPUT PARAMETERS: Network - neural network; XY - training set, see below for information on the training set format; NPoints - points count. RESULT: CrossEntropy/(NPoints*LN(2)). Zero if network solves regression task. DATASET FORMAT: This function uses two different dataset formats - one for regression networks, another one for classification networks. For regression networks with NIn inputs and NOut outputs following dataset format is used: * dataset is given by NPoints*(NIn+NOut) matrix * each row corresponds to one example * first NIn columns are inputs, next NOut columns are outputs For classification networks with NIn inputs and NClasses clases following dataset format is used: * dataset is given by NPoints*(NIn+1) matrix * each row corresponds to one example * first NIn columns are inputs, last column stores class number (from 0 to NClasses-1). -- ALGLIB -- Copyright 08.01.2009 by Bochkanov Sergey *************************************************************************/ double mlpavgce(const multilayerperceptron &network, const real_2d_array &xy, const ae_int_t npoints); double smp_mlpavgce(const multilayerperceptron &network, const real_2d_array &xy, const ae_int_t npoints); /************************************************************************* Average cross-entropy (in bits per element) on the test set given by sparse matrix. FOR USERS OF COMMERCIAL EDITION: ! Commercial version of ALGLIB includes two important improvements of ! this function: ! 
* multicore support (C++ and C# computational cores) ! * SSE support ! ! First improvement gives close-to-linear speedup on multicore systems. ! Second improvement gives constant speedup (2-3x depending on your CPU) ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! In order to use SSE features you have to: ! * use commercial version of ALGLIB on Intel processors ! * use C++ computational core ! ! This note is given for users of commercial edition; if you use GPL ! edition, you still will be able to call smp-version of this function, ! but all computations will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. INPUT PARAMETERS: Network - neural network; XY - training set, see below for information on the training set format. This function checks correctness of the dataset (no NANs/INFs, class numbers are correct) and throws exception when incorrect dataset is passed. Sparse matrix must use CRS format for storage. NPoints - points count, >=0. RESULT: CrossEntropy/(NPoints*LN(2)). Zero if network solves regression task. DATASET FORMAT: This function uses two different dataset formats - one for regression networks, another one for classification networks. For regression networks with NIn inputs and NOut outputs following dataset format is used: * dataset is given by NPoints*(NIn+NOut) matrix * each row corresponds to one example * first NIn columns are inputs, next NOut columns are outputs For classification networks with NIn inputs and NClasses clases following dataset format is used: * dataset is given by NPoints*(NIn+1) matrix * each row corresponds to one example * first NIn columns are inputs, last column stores class number (from 0 to NClasses-1). -- ALGLIB -- Copyright 9.08.2012 by Bochkanov Sergey *************************************************************************/ double mlpavgcesparse(const multilayerperceptron &network, const sparsematrix &xy, const ae_int_t npoints); double smp_mlpavgcesparse(const multilayerperceptron &network, const sparsematrix &xy, const ae_int_t npoints); /************************************************************************* RMS error on the test set given. FOR USERS OF COMMERCIAL EDITION: ! Commercial version of ALGLIB includes two important improvements of ! this function: ! * multicore support (C++ and C# computational cores) ! * SSE support ! ! First improvement gives close-to-linear speedup on multicore systems. ! Second improvement gives constant speedup (2-3x depending on your CPU) ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! In order to use SSE features you have to: ! * use commercial version of ALGLIB on Intel processors ! * use C++ computational core ! ! This note is given for users of commercial edition; if you use GPL ! edition, you still will be able to call smp-version of this function, ! but all computations will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. 
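/*************************************************************************
Informal sketch (illustrative only): classification quality metrics on a
dense classification dataset (last column stores the class index), using
MLPClsError, MLPRelClsError and MLPAvgCE declared above. The header name
and data are assumptions.
*************************************************************************/
#include "dataanalysis.h"
using namespace alglib;
void example_classification_metrics()
{
    multilayerperceptron net;
    mlpcreatec1(2, 5, 2, net);                                   // 2 inputs, 2 classes
    real_2d_array xy = "[[0.0,0.0,0],[1.0,1.0,1],[0.9,0.8,1]]";  // class index in last column
    ae_int_t nmis = mlpclserror(net, xy, 3);                     // number of misclassified cases
    double relerr = mlprelclserror(net, xy, 3);                  // relative classification error
    double avgce = mlpavgce(net, xy, 3);                         // average cross-entropy, bits per element
}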
INPUT PARAMETERS: Network - neural network; XY - training set, see below for information on the training set format; NPoints - points count. RESULT: Root mean square error. Its meaning for regression task is obvious. As for classification task, RMS error means error when estimating posterior probabilities. DATASET FORMAT: This function uses two different dataset formats - one for regression networks, another one for classification networks. For regression networks with NIn inputs and NOut outputs following dataset format is used: * dataset is given by NPoints*(NIn+NOut) matrix * each row corresponds to one example * first NIn columns are inputs, next NOut columns are outputs For classification networks with NIn inputs and NClasses clases following dataset format is used: * dataset is given by NPoints*(NIn+1) matrix * each row corresponds to one example * first NIn columns are inputs, last column stores class number (from 0 to NClasses-1). -- ALGLIB -- Copyright 04.11.2007 by Bochkanov Sergey *************************************************************************/ double mlprmserror(const multilayerperceptron &network, const real_2d_array &xy, const ae_int_t npoints); double smp_mlprmserror(const multilayerperceptron &network, const real_2d_array &xy, const ae_int_t npoints); /************************************************************************* RMS error on the test set given by sparse matrix. FOR USERS OF COMMERCIAL EDITION: ! Commercial version of ALGLIB includes two important improvements of ! this function: ! * multicore support (C++ and C# computational cores) ! * SSE support ! ! First improvement gives close-to-linear speedup on multicore systems. ! Second improvement gives constant speedup (2-3x depending on your CPU) ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! In order to use SSE features you have to: ! * use commercial version of ALGLIB on Intel processors ! * use C++ computational core ! ! This note is given for users of commercial edition; if you use GPL ! edition, you still will be able to call smp-version of this function, ! but all computations will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. INPUT PARAMETERS: Network - neural network; XY - training set, see below for information on the training set format. This function checks correctness of the dataset (no NANs/INFs, class numbers are correct) and throws exception when incorrect dataset is passed. Sparse matrix must use CRS format for storage. NPoints - points count, >=0. RESULT: Root mean square error. Its meaning for regression task is obvious. As for classification task, RMS error means error when estimating posterior probabilities. DATASET FORMAT: This function uses two different dataset formats - one for regression networks, another one for classification networks. 
For regression networks with NIn inputs and NOut outputs following dataset format is used: * dataset is given by NPoints*(NIn+NOut) matrix * each row corresponds to one example * first NIn columns are inputs, next NOut columns are outputs For classification networks with NIn inputs and NClasses clases following dataset format is used: * dataset is given by NPoints*(NIn+1) matrix * each row corresponds to one example * first NIn columns are inputs, last column stores class number (from 0 to NClasses-1). -- ALGLIB -- Copyright 09.08.2012 by Bochkanov Sergey *************************************************************************/ double mlprmserrorsparse(const multilayerperceptron &network, const sparsematrix &xy, const ae_int_t npoints); double smp_mlprmserrorsparse(const multilayerperceptron &network, const sparsematrix &xy, const ae_int_t npoints); /************************************************************************* Average absolute error on the test set. FOR USERS OF COMMERCIAL EDITION: ! Commercial version of ALGLIB includes two important improvements of ! this function: ! * multicore support (C++ and C# computational cores) ! * SSE support ! ! First improvement gives close-to-linear speedup on multicore systems. ! Second improvement gives constant speedup (2-3x depending on your CPU) ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! In order to use SSE features you have to: ! * use commercial version of ALGLIB on Intel processors ! * use C++ computational core ! ! This note is given for users of commercial edition; if you use GPL ! edition, you still will be able to call smp-version of this function, ! but all computations will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. INPUT PARAMETERS: Network - neural network; XY - training set, see below for information on the training set format; NPoints - points count. RESULT: Its meaning for regression task is obvious. As for classification task, it means average error when estimating posterior probabilities. DATASET FORMAT: This function uses two different dataset formats - one for regression networks, another one for classification networks. For regression networks with NIn inputs and NOut outputs following dataset format is used: * dataset is given by NPoints*(NIn+NOut) matrix * each row corresponds to one example * first NIn columns are inputs, next NOut columns are outputs For classification networks with NIn inputs and NClasses clases following dataset format is used: * dataset is given by NPoints*(NIn+1) matrix * each row corresponds to one example * first NIn columns are inputs, last column stores class number (from 0 to NClasses-1). -- ALGLIB -- Copyright 11.03.2008 by Bochkanov Sergey *************************************************************************/ double mlpavgerror(const multilayerperceptron &network, const real_2d_array &xy, const ae_int_t npoints); double smp_mlpavgerror(const multilayerperceptron &network, const real_2d_array &xy, const ae_int_t npoints); /************************************************************************* Average absolute error on the test set given by sparse matrix. FOR USERS OF COMMERCIAL EDITION: ! Commercial version of ALGLIB includes two important improvements of ! this function: ! 
* multicore support (C++ and C# computational cores) ! * SSE support ! ! First improvement gives close-to-linear speedup on multicore systems. ! Second improvement gives constant speedup (2-3x depending on your CPU) ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! In order to use SSE features you have to: ! * use commercial version of ALGLIB on Intel processors ! * use C++ computational core ! ! This note is given for users of commercial edition; if you use GPL ! edition, you still will be able to call smp-version of this function, ! but all computations will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. INPUT PARAMETERS: Network - neural network; XY - training set, see below for information on the training set format. This function checks correctness of the dataset (no NANs/INFs, class numbers are correct) and throws exception when incorrect dataset is passed. Sparse matrix must use CRS format for storage. NPoints - points count, >=0. RESULT: Its meaning for regression task is obvious. As for classification task, it means average error when estimating posterior probabilities. DATASET FORMAT: This function uses two different dataset formats - one for regression networks, another one for classification networks. For regression networks with NIn inputs and NOut outputs following dataset format is used: * dataset is given by NPoints*(NIn+NOut) matrix * each row corresponds to one example * first NIn columns are inputs, next NOut columns are outputs For classification networks with NIn inputs and NClasses clases following dataset format is used: * dataset is given by NPoints*(NIn+1) matrix * each row corresponds to one example * first NIn columns are inputs, last column stores class number (from 0 to NClasses-1). -- ALGLIB -- Copyright 09.08.2012 by Bochkanov Sergey *************************************************************************/ double mlpavgerrorsparse(const multilayerperceptron &network, const sparsematrix &xy, const ae_int_t npoints); double smp_mlpavgerrorsparse(const multilayerperceptron &network, const sparsematrix &xy, const ae_int_t npoints); /************************************************************************* Average relative error on the test set. FOR USERS OF COMMERCIAL EDITION: ! Commercial version of ALGLIB includes two important improvements of ! this function: ! * multicore support (C++ and C# computational cores) ! * SSE support ! ! First improvement gives close-to-linear speedup on multicore systems. ! Second improvement gives constant speedup (2-3x depending on your CPU) ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! In order to use SSE features you have to: ! * use commercial version of ALGLIB on Intel processors ! * use C++ computational core ! ! This note is given for users of commercial edition; if you use GPL ! edition, you still will be able to call smp-version of this function, ! but all computations will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. 
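/*************************************************************************
Informal sketch (illustrative only): regression quality metrics (RMS and
average absolute error) on a dense regression dataset, using MLPRMSError
and MLPAvgError declared above. The header name and data are assumptions.
*************************************************************************/
#include "dataanalysis.h"
using namespace alglib;
void example_regression_metrics()
{
    multilayerperceptron net;
    mlpcreater1(1, 3, 1, 0.0, 1.0, net);
    real_2d_array xy = "[[0.0,0.0],[0.5,0.5],[1.0,1.0]]";
    double rms = mlprmserror(net, xy, 3);     // root mean square error
    double avg = mlpavgerror(net, xy, 3);     // average absolute error
}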
INPUT PARAMETERS: Network - neural network; XY - training set, see below for information on the training set format; NPoints - points count. RESULT: Its meaning for regression task is obvious. As for classification task, it means average relative error when estimating posterior probability of belonging to the correct class. DATASET FORMAT: This function uses two different dataset formats - one for regression networks, another one for classification networks. For regression networks with NIn inputs and NOut outputs following dataset format is used: * dataset is given by NPoints*(NIn+NOut) matrix * each row corresponds to one example * first NIn columns are inputs, next NOut columns are outputs For classification networks with NIn inputs and NClasses clases following dataset format is used: * dataset is given by NPoints*(NIn+1) matrix * each row corresponds to one example * first NIn columns are inputs, last column stores class number (from 0 to NClasses-1). -- ALGLIB -- Copyright 11.03.2008 by Bochkanov Sergey *************************************************************************/ double mlpavgrelerror(const multilayerperceptron &network, const real_2d_array &xy, const ae_int_t npoints); double smp_mlpavgrelerror(const multilayerperceptron &network, const real_2d_array &xy, const ae_int_t npoints); /************************************************************************* Average relative error on the test set given by sparse matrix. FOR USERS OF COMMERCIAL EDITION: ! Commercial version of ALGLIB includes two important improvements of ! this function: ! * multicore support (C++ and C# computational cores) ! * SSE support ! ! First improvement gives close-to-linear speedup on multicore systems. ! Second improvement gives constant speedup (2-3x depending on your CPU) ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! In order to use SSE features you have to: ! * use commercial version of ALGLIB on Intel processors ! * use C++ computational core ! ! This note is given for users of commercial edition; if you use GPL ! edition, you still will be able to call smp-version of this function, ! but all computations will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. INPUT PARAMETERS: Network - neural network; XY - training set, see below for information on the training set format. This function checks correctness of the dataset (no NANs/INFs, class numbers are correct) and throws exception when incorrect dataset is passed. Sparse matrix must use CRS format for storage. NPoints - points count, >=0. RESULT: Its meaning for regression task is obvious. As for classification task, it means average relative error when estimating posterior probability of belonging to the correct class. DATASET FORMAT: This function uses two different dataset formats - one for regression networks, another one for classification networks. 
For regression networks with NIn inputs and NOut outputs following dataset format is used: * dataset is given by NPoints*(NIn+NOut) matrix * each row corresponds to one example * first NIn columns are inputs, next NOut columns are outputs For classification networks with NIn inputs and NClasses clases following dataset format is used: * dataset is given by NPoints*(NIn+1) matrix * each row corresponds to one example * first NIn columns are inputs, last column stores class number (from 0 to NClasses-1). -- ALGLIB -- Copyright 09.08.2012 by Bochkanov Sergey *************************************************************************/ double mlpavgrelerrorsparse(const multilayerperceptron &network, const sparsematrix &xy, const ae_int_t npoints); double smp_mlpavgrelerrorsparse(const multilayerperceptron &network, const sparsematrix &xy, const ae_int_t npoints); /************************************************************************* Gradient calculation INPUT PARAMETERS: Network - network initialized with one of the network creation funcs X - input vector, length of array must be at least NIn DesiredY- desired outputs, length of array must be at least NOut Grad - possibly preallocated array. If size of array is smaller than WCount, it will be reallocated. It is recommended to reuse previously allocated array to reduce allocation overhead. OUTPUT PARAMETERS: E - error function, SUM(sqr(y[i]-desiredy[i])/2,i) Grad - gradient of E with respect to weights of network, array[WCount] -- ALGLIB -- Copyright 04.11.2007 by Bochkanov Sergey *************************************************************************/ void mlpgrad(const multilayerperceptron &network, const real_1d_array &x, const real_1d_array &desiredy, double &e, real_1d_array &grad); /************************************************************************* Gradient calculation (natural error function is used) INPUT PARAMETERS: Network - network initialized with one of the network creation funcs X - input vector, length of array must be at least NIn DesiredY- desired outputs, length of array must be at least NOut Grad - possibly preallocated array. If size of array is smaller than WCount, it will be reallocated. It is recommended to reuse previously allocated array to reduce allocation overhead. OUTPUT PARAMETERS: E - error function, sum-of-squares for regression networks, cross-entropy for classification networks. Grad - gradient of E with respect to weights of network, array[WCount] -- ALGLIB -- Copyright 04.11.2007 by Bochkanov Sergey *************************************************************************/ void mlpgradn(const multilayerperceptron &network, const real_1d_array &x, const real_1d_array &desiredy, double &e, real_1d_array &grad); /************************************************************************* Batch gradient calculation for a set of inputs/outputs FOR USERS OF COMMERCIAL EDITION: ! Commercial version of ALGLIB includes two important improvements of ! this function: ! * multicore support (C++ and C# computational cores) ! * SSE support ! ! First improvement gives close-to-linear speedup on multicore systems. ! Second improvement gives constant speedup (2-3x depending on your CPU) ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! In order to use SSE features you have to: ! * use commercial version of ALGLIB on Intel processors ! 
* use C++ computational core ! ! This note is given for users of commercial edition; if you use GPL ! edition, you still will be able to call smp-version of this function, ! but all computations will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. INPUT PARAMETERS: Network - network initialized with one of the network creation funcs XY - original dataset in dense format; one sample = one row: * first NIn columns contain inputs, * for regression problem, next NOut columns store desired outputs. * for classification problem, next column (just one!) stores class number. SSize - number of elements in XY Grad - possibly preallocated array. If size of array is smaller than WCount, it will be reallocated. It is recommended to reuse previously allocated array to reduce allocation overhead. OUTPUT PARAMETERS: E - error function, SUM(sqr(y[i]-desiredy[i])/2,i) Grad - gradient of E with respect to weights of network, array[WCount] -- ALGLIB -- Copyright 04.11.2007 by Bochkanov Sergey *************************************************************************/ void mlpgradbatch(const multilayerperceptron &network, const real_2d_array &xy, const ae_int_t ssize, double &e, real_1d_array &grad); void smp_mlpgradbatch(const multilayerperceptron &network, const real_2d_array &xy, const ae_int_t ssize, double &e, real_1d_array &grad); /************************************************************************* Batch gradient calculation for a set of inputs/outputs given by sparse matrices FOR USERS OF COMMERCIAL EDITION: ! Commercial version of ALGLIB includes two important improvements of ! this function: ! * multicore support (C++ and C# computational cores) ! * SSE support ! ! First improvement gives close-to-linear speedup on multicore systems. ! Second improvement gives constant speedup (2-3x depending on your CPU) ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! In order to use SSE features you have to: ! * use commercial version of ALGLIB on Intel processors ! * use C++ computational core ! ! This note is given for users of commercial edition; if you use GPL ! edition, you still will be able to call smp-version of this function, ! but all computations will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. INPUT PARAMETERS: Network - network initialized with one of the network creation funcs XY - original dataset in sparse format; one sample = one row: * MATRIX MUST BE STORED IN CRS FORMAT * first NIn columns contain inputs. * for regression problem, next NOut columns store desired outputs. * for classification problem, next column (just one!) stores class number. SSize - number of elements in XY Grad - possibly preallocated array. If size of array is smaller than WCount, it will be reallocated. It is recommended to reuse previously allocated array to reduce allocation overhead. 
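/*************************************************************************
Informal sketch (illustrative only): gradient of the sum-of-squares error
for a single sample with MLPGrad. The gradient array is (re)allocated to
WCount elements if needed. The header name and values are assumptions.
*************************************************************************/
#include "dataanalysis.h"
using namespace alglib;
void example_single_sample_gradient()
{
    multilayerperceptron net;
    mlpcreater1(2, 4, 1, 0.0, 1.0, net);
    real_1d_array x = "[0.1, 0.2]";
    real_1d_array desiredy = "[0.3]";
    real_1d_array grad;
    double e;
    mlpgrad(net, x, desiredy, e, grad);       // e = error, grad = dE/dw, length mlpgetweightscount(net)
}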
OUTPUT PARAMETERS: E - error function, SUM(sqr(y[i]-desiredy[i])/2,i) Grad - gradient of E with respect to weights of network, array[WCount] -- ALGLIB -- Copyright 26.07.2012 by Bochkanov Sergey *************************************************************************/ void mlpgradbatchsparse(const multilayerperceptron &network, const sparsematrix &xy, const ae_int_t ssize, double &e, real_1d_array &grad); void smp_mlpgradbatchsparse(const multilayerperceptron &network, const sparsematrix &xy, const ae_int_t ssize, double &e, real_1d_array &grad); /************************************************************************* Batch gradient calculation for a subset of dataset FOR USERS OF COMMERCIAL EDITION: ! Commercial version of ALGLIB includes two important improvements of ! this function: ! * multicore support (C++ and C# computational cores) ! * SSE support ! ! First improvement gives close-to-linear speedup on multicore systems. ! Second improvement gives constant speedup (2-3x depending on your CPU) ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! In order to use SSE features you have to: ! * use commercial version of ALGLIB on Intel processors ! * use C++ computational core ! ! This note is given for users of commercial edition; if you use GPL ! edition, you still will be able to call smp-version of this function, ! but all computations will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. INPUT PARAMETERS: Network - network initialized with one of the network creation funcs XY - original dataset in dense format; one sample = one row: * first NIn columns contain inputs, * for regression problem, next NOut columns store desired outputs. * for classification problem, next column (just one!) stores class number. SetSize - real size of XY, SetSize>=0; Idx - subset of SubsetSize elements, array[SubsetSize]: * Idx[I] stores row index in the original dataset which is given by XY. Gradient is calculated with respect to rows whose indexes are stored in Idx[]. * Idx[] must store correct indexes; this function throws an exception in case incorrect index (less than 0 or larger than rows(XY)) is given * Idx[] may store indexes in any order and even with repetitions. SubsetSize- number of elements in Idx[] array: * positive value means that subset given by Idx[] is processed * zero value results in zero gradient * negative value means that full dataset is processed Grad - possibly preallocated array. If size of array is smaller than WCount, it will be reallocated. It is recommended to reuse previously allocated array to reduce allocation overhead. 
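/*************************************************************************
Informal sketch (illustrative only): batch gradient over a dense dataset
with MLPGradBatch. Reusing a preallocated Grad array across calls avoids
repeated allocations. The header name and data are assumptions.
*************************************************************************/
#include "dataanalysis.h"
using namespace alglib;
void example_batch_gradient()
{
    multilayerperceptron net;
    mlpcreater1(1, 3, 1, 0.0, 1.0, net);
    real_2d_array xy = "[[0.0,0.0],[0.5,0.25],[1.0,1.0]]";
    real_1d_array grad;
    double e;
    mlpgradbatch(net, xy, 3, e, grad);        // gradient accumulated over all 3 samples
    // commercial edition: smp_mlpgradbatch(net, xy, 3, e, grad) for the parallel version
}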
OUTPUT PARAMETERS: E - error function, SUM(sqr(y[i]-desiredy[i])/2,i) Grad - gradient of E with respect to weights of network, array[WCount] -- ALGLIB -- Copyright 26.07.2012 by Bochkanov Sergey *************************************************************************/ void mlpgradbatchsubset(const multilayerperceptron &network, const real_2d_array &xy, const ae_int_t setsize, const integer_1d_array &idx, const ae_int_t subsetsize, double &e, real_1d_array &grad); void smp_mlpgradbatchsubset(const multilayerperceptron &network, const real_2d_array &xy, const ae_int_t setsize, const integer_1d_array &idx, const ae_int_t subsetsize, double &e, real_1d_array &grad);
/************************************************************************* Batch gradient calculation for a set of inputs/outputs for a subset of the dataset given by a set of indexes. FOR USERS OF COMMERCIAL EDITION: ! Commercial version of ALGLIB includes two important improvements of ! this function: ! * multicore support (C++ and C# computational cores) ! * SSE support ! ! First improvement gives close-to-linear speedup on multicore systems. ! Second improvement gives constant speedup (2-3x depending on your CPU) ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! In order to use SSE features you have to: ! * use commercial version of ALGLIB on Intel processors ! * use C++ computational core ! ! This note is given for users of commercial edition; if you use GPL ! edition, you still will be able to call smp-version of this function, ! but all computations will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. INPUT PARAMETERS: Network - network initialized with one of the network creation funcs XY - original dataset in sparse format; one sample = one row: * MATRIX MUST BE STORED IN CRS FORMAT * first NIn columns contain inputs, * for regression problem, next NOut columns store desired outputs. * for classification problem, next column (just one!) stores class number. SetSize - real size of XY, SetSize>=0; Idx - subset of SubsetSize elements, array[SubsetSize]: * Idx[I] stores row index in the original dataset which is given by XY. Gradient is calculated with respect to rows whose indexes are stored in Idx[]. * Idx[] must store correct indexes; this function throws an exception in case incorrect index (less than 0 or larger than rows(XY)) is given * Idx[] may store indexes in any order and even with repetitions. SubsetSize- number of elements in Idx[] array: * positive value means that subset given by Idx[] is processed * zero value results in zero gradient * negative value means that full dataset is processed Grad - possibly preallocated array. If size of array is smaller than WCount, it will be reallocated. It is recommended to reuse previously allocated array to reduce allocation overhead. OUTPUT PARAMETERS: E - error function, SUM(sqr(y[i]-desiredy[i])/2,i) Grad - gradient of E with respect to weights of network, array[WCount] NOTE: when SubsetSize<0, the full dataset is processed by calling the MLPGradBatchSparse function. 
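/*************************************************************************
Informal sketch (illustrative only): batch gradient restricted to a subset
of rows selected through the Idx array, using MLPGradBatchSubset declared
above. The header name and data are assumptions.
*************************************************************************/
#include "dataanalysis.h"
using namespace alglib;
void example_subset_gradient()
{
    multilayerperceptron net;
    mlpcreater1(1, 3, 1, 0.0, 1.0, net);
    real_2d_array xy = "[[0.0,0.0],[0.3,0.1],[0.6,0.4],[1.0,1.0]]";   // SetSize = 4 rows
    integer_1d_array idx = "[0, 2]";                                   // only rows 0 and 2 contribute
    real_1d_array grad;
    double e;
    mlpgradbatchsubset(net, xy, 4, idx, 2, e, grad);
}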
-- ALGLIB -- Copyright 26.07.2012 by Bochkanov Sergey *************************************************************************/ void mlpgradbatchsparsesubset(const multilayerperceptron &network, const sparsematrix &xy, const ae_int_t setsize, const integer_1d_array &idx, const ae_int_t subsetsize, double &e, real_1d_array &grad); void smp_mlpgradbatchsparsesubset(const multilayerperceptron &network, const sparsematrix &xy, const ae_int_t setsize, const integer_1d_array &idx, const ae_int_t subsetsize, double &e, real_1d_array &grad); /************************************************************************* Batch gradient calculation for a set of inputs/outputs (natural error function is used) INPUT PARAMETERS: Network - network initialized with one of the network creation funcs XY - set of inputs/outputs; one sample = one row; first NIn columns contain inputs, next NOut columns - desired outputs. SSize - number of elements in XY Grad - possibly preallocated array. If size of array is smaller than WCount, it will be reallocated. It is recommended to reuse previously allocated array to reduce allocation overhead. OUTPUT PARAMETERS: E - error function, sum-of-squares for regression networks, cross-entropy for classification networks. Grad - gradient of E with respect to weights of network, array[WCount] -- ALGLIB -- Copyright 04.11.2007 by Bochkanov Sergey *************************************************************************/ void mlpgradnbatch(const multilayerperceptron &network, const real_2d_array &xy, const ae_int_t ssize, double &e, real_1d_array &grad); /************************************************************************* Batch Hessian calculation (natural error function) using R-algorithm. Internal subroutine. -- ALGLIB -- Copyright 26.01.2008 by Bochkanov Sergey. Hessian calculation based on R-algorithm described in "Fast Exact Multiplication by the Hessian", B. A. Pearlmutter, Neural Computation, 1994. *************************************************************************/ void mlphessiannbatch(const multilayerperceptron &network, const real_2d_array &xy, const ae_int_t ssize, double &e, real_1d_array &grad, real_2d_array &h); /************************************************************************* Batch Hessian calculation using R-algorithm. Internal subroutine. -- ALGLIB -- Copyright 26.01.2008 by Bochkanov Sergey. Hessian calculation based on R-algorithm described in "Fast Exact Multiplication by the Hessian", B. A. Pearlmutter, Neural Computation, 1994. *************************************************************************/ void mlphessianbatch(const multilayerperceptron &network, const real_2d_array &xy, const ae_int_t ssize, double &e, real_1d_array &grad, real_2d_array &h); /************************************************************************* Calculation of all types of errors on subset of dataset. FOR USERS OF COMMERCIAL EDITION: ! Commercial version of ALGLIB includes two important improvements of ! this function: ! * multicore support (C++ and C# computational cores) ! * SSE support ! ! First improvement gives close-to-linear speedup on multicore systems. ! Second improvement gives constant speedup (2-3x depending on your CPU) ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! In order to use SSE features you have to: ! * use commercial version of ALGLIB on Intel processors ! 
* use C++ computational core ! ! This note is given for users of commercial edition; if you use GPL ! edition, you still will be able to call smp-version of this function, ! but all computations will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. INPUT PARAMETERS: Network - network initialized with one of the network creation funcs XY - original dataset; one sample = one row; first NIn columns contain inputs, next NOut columns - desired outputs. SetSize - real size of XY, SetSize>=0; Subset - subset of SubsetSize elements, array[SubsetSize]; SubsetSize- number of elements in Subset[] array: * if SubsetSize>0, rows of XY with indices Subset[0]... ...Subset[SubsetSize-1] are processed * if SubsetSize=0, zeros are returned * if SubsetSize<0, entire dataset is processed; Subset[] array is ignored in this case. OUTPUT PARAMETERS: Rep - it contains all type of errors. -- ALGLIB -- Copyright 04.09.2012 by Bochkanov Sergey *************************************************************************/ void mlpallerrorssubset(const multilayerperceptron &network, const real_2d_array &xy, const ae_int_t setsize, const integer_1d_array &subset, const ae_int_t subsetsize, modelerrors &rep); void smp_mlpallerrorssubset(const multilayerperceptron &network, const real_2d_array &xy, const ae_int_t setsize, const integer_1d_array &subset, const ae_int_t subsetsize, modelerrors &rep); /************************************************************************* Calculation of all types of errors on subset of dataset. FOR USERS OF COMMERCIAL EDITION: ! Commercial version of ALGLIB includes two important improvements of ! this function: ! * multicore support (C++ and C# computational cores) ! * SSE support ! ! First improvement gives close-to-linear speedup on multicore systems. ! Second improvement gives constant speedup (2-3x depending on your CPU) ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! In order to use SSE features you have to: ! * use commercial version of ALGLIB on Intel processors ! * use C++ computational core ! ! This note is given for users of commercial edition; if you use GPL ! edition, you still will be able to call smp-version of this function, ! but all computations will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. INPUT PARAMETERS: Network - network initialized with one of the network creation funcs XY - original dataset given by sparse matrix; one sample = one row; first NIn columns contain inputs, next NOut columns - desired outputs. SetSize - real size of XY, SetSize>=0; Subset - subset of SubsetSize elements, array[SubsetSize]; SubsetSize- number of elements in Subset[] array: * if SubsetSize>0, rows of XY with indices Subset[0]... ...Subset[SubsetSize-1] are processed * if SubsetSize=0, zeros are returned * if SubsetSize<0, entire dataset is processed; Subset[] array is ignored in this case. OUTPUT PARAMETERS: Rep - it contains all type of errors. 
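/*************************************************************************
Informal sketch (illustrative only): computing all error metrics at once
with MLPAllErrorsSubset. Passing SubsetSize<0 processes the entire dataset
and ignores the Subset array. The header name and data are assumptions.
*************************************************************************/
#include "dataanalysis.h"
using namespace alglib;
void example_all_errors()
{
    multilayerperceptron net;
    mlpcreatec1(2, 5, 2, net);
    real_2d_array xy = "[[0.0,0.0,0],[1.0,1.0,1],[0.9,0.8,1]]";
    integer_1d_array subset = "[0]";                 // ignored because SubsetSize<0 below
    modelerrors rep;
    mlpallerrorssubset(net, xy, 3, subset, -1, rep); // rep now holds RMS, average and classification errors
}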
-- ALGLIB -- Copyright 04.09.2012 by Bochkanov Sergey *************************************************************************/ void mlpallerrorssparsesubset(const multilayerperceptron &network, const sparsematrix &xy, const ae_int_t setsize, const integer_1d_array &subset, const ae_int_t subsetsize, modelerrors &rep); void smp_mlpallerrorssparsesubset(const multilayerperceptron &network, const sparsematrix &xy, const ae_int_t setsize, const integer_1d_array &subset, const ae_int_t subsetsize, modelerrors &rep); /************************************************************************* Error of the neural network on subset of dataset. FOR USERS OF COMMERCIAL EDITION: ! Commercial version of ALGLIB includes two important improvements of ! this function: ! * multicore support (C++ and C# computational cores) ! * SSE support ! ! First improvement gives close-to-linear speedup on multicore systems. ! Second improvement gives constant speedup (2-3x depending on your CPU) ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! In order to use SSE features you have to: ! * use commercial version of ALGLIB on Intel processors ! * use C++ computational core ! ! This note is given for users of commercial edition; if you use GPL ! edition, you still will be able to call smp-version of this function, ! but all computations will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. INPUT PARAMETERS: Network - neural network; XY - training set, see below for information on the training set format; SetSize - real size of XY, SetSize>=0; Subset - subset of SubsetSize elements, array[SubsetSize]; SubsetSize- number of elements in Subset[] array: * if SubsetSize>0, rows of XY with indices Subset[0]... ...Subset[SubsetSize-1] are processed * if SubsetSize=0, zeros are returned * if SubsetSize<0, entire dataset is processed; Subset[] array is ignored in this case. RESULT: sum-of-squares error, SUM(sqr(y[i]-desired_y[i])/2) DATASET FORMAT: This function uses two different dataset formats - one for regression networks, another one for classification networks. For regression networks with NIn inputs and NOut outputs following dataset format is used: * dataset is given by NPoints*(NIn+NOut) matrix * each row corresponds to one example * first NIn columns are inputs, next NOut columns are outputs For classification networks with NIn inputs and NClasses clases following dataset format is used: * dataset is given by NPoints*(NIn+1) matrix * each row corresponds to one example * first NIn columns are inputs, last column stores class number (from 0 to NClasses-1). -- ALGLIB -- Copyright 04.09.2012 by Bochkanov Sergey *************************************************************************/ double mlperrorsubset(const multilayerperceptron &network, const real_2d_array &xy, const ae_int_t setsize, const integer_1d_array &subset, const ae_int_t subsetsize); double smp_mlperrorsubset(const multilayerperceptron &network, const real_2d_array &xy, const ae_int_t setsize, const integer_1d_array &subset, const ae_int_t subsetsize); /************************************************************************* Error of the neural network on subset of sparse dataset. FOR USERS OF COMMERCIAL EDITION: ! Commercial version of ALGLIB includes two important improvements of ! 
this function: ! * multicore support (C++ and C# computational cores) ! * SSE support ! ! First improvement gives close-to-linear speedup on multicore systems. ! Second improvement gives constant speedup (2-3x depending on your CPU) ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! In order to use SSE features you have to: ! * use commercial version of ALGLIB on Intel processors ! * use C++ computational core ! ! This note is given for users of commercial edition; if you use GPL ! edition, you still will be able to call smp-version of this function, ! but all computations will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. INPUT PARAMETERS: Network - neural network; XY - training set, see below for information on the training set format. This function checks correctness of the dataset (no NANs/INFs, class numbers are correct) and throws exception when incorrect dataset is passed. Sparse matrix must use CRS format for storage. SetSize - real size of XY, SetSize>=0; it is used when SubsetSize<0; Subset - subset of SubsetSize elements, array[SubsetSize]; SubsetSize- number of elements in Subset[] array: * if SubsetSize>0, rows of XY with indices Subset[0]... ...Subset[SubsetSize-1] are processed * if SubsetSize=0, zeros are returned * if SubsetSize<0, entire dataset is processed; Subset[] array is ignored in this case. RESULT: sum-of-squares error, SUM(sqr(y[i]-desired_y[i])/2) DATASET FORMAT: This function uses two different dataset formats - one for regression networks, another one for classification networks. For regression networks with NIn inputs and NOut outputs following dataset format is used: * dataset is given by NPoints*(NIn+NOut) matrix * each row corresponds to one example * first NIn columns are inputs, next NOut columns are outputs For classification networks with NIn inputs and NClasses clases following dataset format is used: * dataset is given by NPoints*(NIn+1) matrix * each row corresponds to one example * first NIn columns are inputs, last column stores class number (from 0 to NClasses-1). -- ALGLIB -- Copyright 04.09.2012 by Bochkanov Sergey *************************************************************************/ double mlperrorsparsesubset(const multilayerperceptron &network, const sparsematrix &xy, const ae_int_t setsize, const integer_1d_array &subset, const ae_int_t subsetsize); double smp_mlperrorsparsesubset(const multilayerperceptron &network, const sparsematrix &xy, const ae_int_t setsize, const integer_1d_array &subset, const ae_int_t subsetsize); /************************************************************************* Multiclass Fisher LDA Subroutine finds coefficients of linear combination which optimally separates training set on classes. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multithreading support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. Best results are achieved for high-dimensional problems ! (NVars is at least 256). ! ! 
Multithreading is used to accelerate initial phase of LDA, which ! includes calculation of products of large matrices. Again, for best ! efficiency problem must be high-dimensional. ! ! Generally, commercial ALGLIB is several times faster than open-source ! generic C edition, and many times faster than open-source C# edition. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS: XY - training set, array[0..NPoints-1,0..NVars]. First NVars columns store values of independent variables, next column stores number of class (from 0 to NClasses-1) which dataset element belongs to. Fractional values are rounded to nearest integer. NPoints - training set size, NPoints>=0 NVars - number of independent variables, NVars>=1 NClasses - number of classes, NClasses>=2 OUTPUT PARAMETERS: Info - return code: * -4, if internal EVD subroutine hasn't converged * -2, if there is a point with class number outside of [0..NClasses-1]. * -1, if incorrect parameters was passed (NPoints<0, NVars<1, NClasses<2) * 1, if task has been solved * 2, if there was a multicollinearity in training set, but task has been solved. W - linear combination coefficients, array[0..NVars-1] -- ALGLIB -- Copyright 31.05.2008 by Bochkanov Sergey *************************************************************************/ void fisherlda(const real_2d_array &xy, const ae_int_t npoints, const ae_int_t nvars, const ae_int_t nclasses, ae_int_t &info, real_1d_array &w); /************************************************************************* N-dimensional multiclass Fisher LDA Subroutine finds coefficients of linear combinations which optimally separates training set on classes. It returns N-dimensional basis whose vector are sorted by quality of training set separation (in descending order). COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multithreading support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. Best results are achieved for high-dimensional problems ! (NVars is at least 256). ! ! Multithreading is used to accelerate initial phase of LDA, which ! includes calculation of products of large matrices. Again, for best ! efficiency problem must be high-dimensional. ! ! Generally, commercial ALGLIB is several times faster than open-source ! generic C edition, and many times faster than open-source C# edition. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS: XY - training set, array[0..NPoints-1,0..NVars]. First NVars columns store values of independent variables, next column stores number of class (from 0 to NClasses-1) which dataset element belongs to. Fractional values are rounded to nearest integer. 
NPoints - training set size, NPoints>=0 NVars - number of independent variables, NVars>=1 NClasses - number of classes, NClasses>=2 OUTPUT PARAMETERS: Info - return code: * -4, if internal EVD subroutine hasn't converged * -2, if there is a point with class number outside of [0..NClasses-1]. * -1, if incorrect parameters was passed (NPoints<0, NVars<1, NClasses<2) * 1, if task has been solved * 2, if there was a multicollinearity in training set, but task has been solved. W - basis, array[0..NVars-1,0..NVars-1] columns of matrix stores basis vectors, sorted by quality of training set separation (in descending order) -- ALGLIB -- Copyright 31.05.2008 by Bochkanov Sergey *************************************************************************/ void fisherldan(const real_2d_array &xy, const ae_int_t npoints, const ae_int_t nvars, const ae_int_t nclasses, ae_int_t &info, real_2d_array &w); void smp_fisherldan(const real_2d_array &xy, const ae_int_t npoints, const ae_int_t nvars, const ae_int_t nclasses, ae_int_t &info, real_2d_array &w); /************************************************************************* Linear regression Subroutine builds model: Y = A(0)*X[0] + ... + A(N-1)*X[N-1] + A(N) and model found in ALGLIB format, covariation matrix, training set errors (rms, average, average relative) and leave-one-out cross-validation estimate of the generalization error. CV estimate calculated using fast algorithm with O(NPoints*NVars) complexity. When covariation matrix is calculated standard deviations of function values are assumed to be equal to RMS error on the training set. INPUT PARAMETERS: XY - training set, array [0..NPoints-1,0..NVars]: * NVars columns - independent variables * last column - dependent variable NPoints - training set size, NPoints>NVars+1 NVars - number of independent variables OUTPUT PARAMETERS: Info - return code: * -255, in case of unknown internal error * -4, if internal SVD subroutine haven't converged * -1, if incorrect parameters was passed (NPoints0. NPoints - training set size, NPoints>NVars+1 NVars - number of independent variables OUTPUT PARAMETERS: Info - return code: * -255, in case of unknown internal error * -4, if internal SVD subroutine haven't converged * -1, if incorrect parameters was passed (NPoints=0 K - K>=1 (K can be larger than N , such cases will be correctly handled). Window width. K=1 corresponds to identity transformation (nothing changes). OUTPUT PARAMETERS: X - array, whose first N elements were processed with SMA(K) NOTE 1: this function uses efficient in-place algorithm which does not allocate temporary arrays. NOTE 2: this algorithm makes only one pass through array and uses running sum to speed-up calculation of the averages. Additional measures are taken to ensure that running sum on a long sequence of zero elements will be correctly reset to zero even in the presence of round-off error. NOTE 3: this is unsymmetric version of the algorithm, which does NOT averages points after the current one. Only X[i], X[i-1], ... are used when calculating new value of X[i]. We should also note that this algorithm uses BOTH previous points and current one, i.e. new value of X[i] depends on BOTH previous point and X[i] itself. 
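The calling convention of the moving-average filters declared just below is simple enough to show in a few lines. The sketch below assumes that the declarations live in the dataanalysis.h header and that real_1d_array provides the length() and tostring() helpers used for printing; it is an illustration, not an official example from this manual.

    #include <stdio.h>
    #include "dataanalysis.h"   // assumed header for the DataAnalysis package
    using namespace alglib;

    int main()
    {
        // noisy series; SMA(3) is applied in place and uses only the current
        // and previous points, as described in the notes above
        real_1d_array x = "[5,6,7,8,100,9,10,11]";
        filtersma(x, 3);                       // window width K=3, full array
        printf("%s\n", x.tostring(2).c_str());

        // LRMA(3): same calling convention, linear-regression based smoothing
        real_1d_array y = "[5,6,7,8,100,9,10,11]";
        filterlrma(y, y.length(), 3);          // explicit N: only first N points processed
        printf("%s\n", y.tostring(2).c_str());
        return 0;
    }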
-- ALGLIB -- Copyright 25.10.2011 by Bochkanov Sergey *************************************************************************/ void filtersma(real_1d_array &x, const ae_int_t n, const ae_int_t k); void filtersma(real_1d_array &x, const ae_int_t k); /************************************************************************* Filters: exponential moving averages. This filter replaces array by results of EMA(alpha) filter. EMA(alpha) is defined as filter which replaces X[] by S[]: S[0] = X[0] S[t] = alpha*X[t] + (1-alpha)*S[t-1] INPUT PARAMETERS: X - array[N], array to process. It can be larger than N, in this case only first N points are processed. N - points count, N>=0 alpha - 0=0 K - K>=1 (K can be larger than N , such cases will be correctly handled). Window width. K=1 corresponds to identity transformation (nothing changes). OUTPUT PARAMETERS: X - array, whose first N elements were processed with SMA(K) NOTE 1: this function uses efficient in-place algorithm which does not allocate temporary arrays. NOTE 2: this algorithm makes only one pass through array and uses running sum to speed-up calculation of the averages. Additional measures are taken to ensure that running sum on a long sequence of zero elements will be correctly reset to zero even in the presence of round-off error. NOTE 3: this is unsymmetric version of the algorithm, which does NOT averages points after the current one. Only X[i], X[i-1], ... are used when calculating new value of X[i]. We should also note that this algorithm uses BOTH previous points and current one, i.e. new value of X[i] depends on BOTH previous point and X[i] itself. -- ALGLIB -- Copyright 25.10.2011 by Bochkanov Sergey *************************************************************************/ void filterlrma(real_1d_array &x, const ae_int_t n, const ae_int_t k); void filterlrma(real_1d_array &x, const ae_int_t k); /************************************************************************* This subroutine trains logit model. INPUT PARAMETERS: XY - training set, array[0..NPoints-1,0..NVars] First NVars columns store values of independent variables, next column stores number of class (from 0 to NClasses-1) which dataset element belongs to. Fractional values are rounded to nearest integer. NPoints - training set size, NPoints>=1 NVars - number of independent variables, NVars>=1 NClasses - number of classes, NClasses>=2 OUTPUT PARAMETERS: Info - return code: * -2, if there is a point with class number outside of [0..NClasses-1]. * -1, if incorrect parameters was passed (NPoints=1 OUTPUT PARAMETERS: State - structure stores algorithm state -- ALGLIB -- Copyright 23.05.2010 by Bochkanov Sergey *************************************************************************/ void mcpdcreate(const ae_int_t n, mcpdstate &s); /************************************************************************* DESCRIPTION: This function is a specialized version of MCPDCreate() function, and we recommend you to read comments for this function for general information about MCPD solver. This function creates MCPD (Markov Chains for Population Data) solver for "Entry-state" model, i.e. 
model where transition from X[i] to X[i+1] is modelled as X[i+1] = P*X[i] where X[i] and X[i+1] are N-dimensional state vectors P is a N*N transition matrix and one selected component of X[] is called "entry" state and is treated in a special way: system state always transits from "entry" state to some another state system state can not transit from any state into "entry" state Such conditions basically mean that row of P which corresponds to "entry" state is zero. Such models arise when: * there is some population of individuals * individuals can have different states * individuals can transit from one state to another * population size is NOT constant - at every moment of time there is some (unpredictable) amount of "new" individuals, which can transit into one of the states at the next turn, but still no one leaves population * you want to model transitions of individuals from one state into another * but you do NOT want to predict amount of "new" individuals because it does not depends on individuals already present (hence system can not transit INTO entry state - it can only transit FROM it). This model is discussed in more details in the ALGLIB User Guide (see http://www.alglib.net/dataanalysis/ for more data). INPUT PARAMETERS: N - problem dimension, N>=2 EntryState- index of entry state, in 0..N-1 OUTPUT PARAMETERS: State - structure stores algorithm state -- ALGLIB -- Copyright 23.05.2010 by Bochkanov Sergey *************************************************************************/ void mcpdcreateentry(const ae_int_t n, const ae_int_t entrystate, mcpdstate &s); /************************************************************************* DESCRIPTION: This function is a specialized version of MCPDCreate() function, and we recommend you to read comments for this function for general information about MCPD solver. This function creates MCPD (Markov Chains for Population Data) solver for "Exit-state" model, i.e. model where transition from X[i] to X[i+1] is modelled as X[i+1] = P*X[i] where X[i] and X[i+1] are N-dimensional state vectors P is a N*N transition matrix and one selected component of X[] is called "exit" state and is treated in a special way: system state can transit from any state into "exit" state system state can not transit from "exit" state into any other state transition operator discards "exit" state (makes it zero at each turn) Such conditions basically mean that column of P which corresponds to "exit" state is zero. Multiplication by such P may decrease sum of vector components. Such models arise when: * there is some population of individuals * individuals can have different states * individuals can transit from one state to another * population size is NOT constant - individuals can move into "exit" state and leave population at the next turn, but there are no new individuals * amount of individuals which leave population can be predicted * you want to model transitions of individuals from one state into another (including transitions into the "exit" state) This model is discussed in more details in the ALGLIB User Guide (see http://www.alglib.net/dataanalysis/ for more data). 
INPUT PARAMETERS: N - problem dimension, N>=2 ExitState- index of exit state, in 0..N-1 OUTPUT PARAMETERS: State - structure stores algorithm state -- ALGLIB -- Copyright 23.05.2010 by Bochkanov Sergey *************************************************************************/ void mcpdcreateexit(const ae_int_t n, const ae_int_t exitstate, mcpdstate &s); /************************************************************************* DESCRIPTION: This function is a specialized version of MCPDCreate() function, and we recommend you to read comments for this function for general information about MCPD solver. This function creates MCPD (Markov Chains for Population Data) solver for "Entry-Exit-states" model, i.e. model where transition from X[i] to X[i+1] is modelled as X[i+1] = P*X[i] where X[i] and X[i+1] are N-dimensional state vectors P is a N*N transition matrix one selected component of X[] is called "entry" state and is treated in a special way: system state always transits from "entry" state to some another state system state can not transit from any state into "entry" state and another one component of X[] is called "exit" state and is treated in a special way too: system state can transit from any state into "exit" state system state can not transit from "exit" state into any other state transition operator discards "exit" state (makes it zero at each turn) Such conditions basically mean that: row of P which corresponds to "entry" state is zero column of P which corresponds to "exit" state is zero Multiplication by such P may decrease sum of vector components. Such models arise when: * there is some population of individuals * individuals can have different states * individuals can transit from one state to another * population size is NOT constant * at every moment of time there is some (unpredictable) amount of "new" individuals, which can transit into one of the states at the next turn * some individuals can move (predictably) into "exit" state and leave population at the next turn * you want to model transitions of individuals from one state into another, including transitions from the "entry" state and into the "exit" state. * but you do NOT want to predict amount of "new" individuals because it does not depends on individuals already present (hence system can not transit INTO entry state - it can only transit FROM it). This model is discussed in more details in the ALGLIB User Guide (see http://www.alglib.net/dataanalysis/ for more data). INPUT PARAMETERS: N - problem dimension, N>=2 EntryState- index of entry state, in 0..N-1 ExitState- index of exit state, in 0..N-1 OUTPUT PARAMETERS: State - structure stores algorithm state -- ALGLIB -- Copyright 23.05.2010 by Bochkanov Sergey *************************************************************************/ void mcpdcreateentryexit(const ae_int_t n, const ae_int_t entrystate, const ae_int_t exitstate, mcpdstate &s); /************************************************************************* This function is used to add a track - sequence of system states at the different moments of its evolution. You may add one or several tracks to the MCPD solver. In case you have several tracks, they won't overwrite each other. For example, if you pass two tracks, A1-A2-A3 (system at t=A+1, t=A+2 and t=A+3) and B1-B2-B3, then solver will try to model transitions from t=A+1 to t=A+2, t=A+2 to t=A+3, t=B+1 to t=B+2, t=B+2 to t=B+3. But it WONT mix these two tracks - i.e. it wont try to model transition from t=A+3 to t=B+1. 
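For readers who prefer code, here is a compact sketch of the workflow around this function: create a solver, add one track, solve and read the fitted transition matrix. mcpdcreate is declared earlier in this section, while mcpdaddtrack, mcpdsolve and mcpdresults are declared later; the dataanalysis.h header name and the terminationtype field of mcpdreport are assumptions based on the general C++ interface conventions.

    #include <stdio.h>
    #include "dataanalysis.h"   // assumed header for the DataAnalysis package
    using namespace alglib;

    int main()
    {
        // 2-state Markov chain for population data
        mcpdstate s;
        mcpdcreate(2, s);

        // one track: rows are system states at t=0,1,2,3; proportional data, rows sum to 1
        real_2d_array track = "[[1.0,0.0],[0.95,0.05],[0.9025,0.0975],[0.857,0.143]]";
        mcpdaddtrack(s, track);

        // fit transition matrix P and read it back
        mcpdsolve(s);
        real_2d_array p;
        mcpdreport rep;
        mcpdresults(s, p, rep);
        if( rep.terminationtype>0 )            // field name assumed; positive = success
            printf("P = %s\n", p.tostring(3).c_str());
        return 0;
    }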
INPUT PARAMETERS: S - solver XY - track, array[K,N]: * I-th row is a state at t=I * elements of XY must be non-negative (exception will be thrown on negative elements) K - number of points in a track * if given, only leading K rows of XY are used * if not given, automatically determined from size of XY NOTES: 1. Track may contain either proportional or population data: * with proportional data all rows of XY must sum to 1.0, i.e. we have proportions instead of absolute population values * with population data rows of XY contain population counts and generally do not sum to 1.0 (although they still must be non-negative) -- ALGLIB -- Copyright 23.05.2010 by Bochkanov Sergey *************************************************************************/ void mcpdaddtrack(const mcpdstate &s, const real_2d_array &xy, const ae_int_t k); void mcpdaddtrack(const mcpdstate &s, const real_2d_array &xy); /************************************************************************* This function is used to add equality constraints on the elements of the transition matrix P. MCPD solver has four types of constraints which can be placed on P: * user-specified equality constraints (optional) * user-specified bound constraints (optional) * user-specified general linear constraints (optional) * basic constraints (always present): * non-negativity: P[i,j]>=0 * consistency: every column of P sums to 1.0 Final constraints which are passed to the underlying optimizer are calculated as intersection of all present constraints. For example, you may specify boundary constraint on P[0,0] and equality one: 0.1<=P[0,0]<=0.9 P[0,0]=0.5 Such combination of constraints will be silently reduced to their intersection, which is P[0,0]=0.5. This function can be used to place equality constraints on arbitrary subset of elements of P. Set of constraints is specified by EC, which may contain either NAN's or finite numbers from [0,1]. NAN denotes absence of constraint, finite number denotes equality constraint on specific element of P. You can also use MCPDAddEC() function which allows to ADD equality constraint for one element of P without changing constraints for other elements. These functions (MCPDSetEC and MCPDAddEC) interact as follows: * there is internal matrix of equality constraints which is stored in the MCPD solver * MCPDSetEC() replaces this matrix by another one (SET) * MCPDAddEC() modifies one element of this matrix and leaves other ones unchanged (ADD) * thus MCPDAddEC() call preserves all modifications done by previous calls, while MCPDSetEC() completely discards all changes done to the equality constraints. INPUT PARAMETERS: S - solver EC - equality constraints, array[N,N]. Elements of EC can be either NAN's or finite numbers from [0,1]. NAN denotes absence of constraints, while finite value denotes equality constraint on the corresponding element of P. NOTES: 1. infinite values of EC will lead to exception being thrown. Values less than 0.0 or greater than 1.0 will lead to error code being returned after call to MCPDSolve(). -- ALGLIB -- Copyright 23.05.2010 by Bochkanov Sergey *************************************************************************/ void mcpdsetec(const mcpdstate &s, const real_2d_array &ec); /************************************************************************* This function is used to add equality constraints on the elements of the transition matrix P. 
MCPD solver has four types of constraints which can be placed on P: * user-specified equality constraints (optional) * user-specified bound constraints (optional) * user-specified general linear constraints (optional) * basic constraints (always present): * non-negativity: P[i,j]>=0 * consistency: every column of P sums to 1.0 Final constraints which are passed to the underlying optimizer are calculated as intersection of all present constraints. For example, you may specify boundary constraint on P[0,0] and equality one: 0.1<=P[0,0]<=0.9 P[0,0]=0.5 Such combination of constraints will be silently reduced to their intersection, which is P[0,0]=0.5. This function can be used to ADD equality constraint for one element of P without changing constraints for other elements. You can also use MCPDSetEC() function which allows you to specify arbitrary set of equality constraints in one call. These functions (MCPDSetEC and MCPDAddEC) interact as follows: * there is internal matrix of equality constraints which is stored in the MCPD solver * MCPDSetEC() replaces this matrix by another one (SET) * MCPDAddEC() modifies one element of this matrix and leaves other ones unchanged (ADD) * thus MCPDAddEC() call preserves all modifications done by previous calls, while MCPDSetEC() completely discards all changes done to the equality constraints. INPUT PARAMETERS: S - solver I - row index of element being constrained J - column index of element being constrained C - value (constraint for P[I,J]). Can be either NAN (no constraint) or finite value from [0,1]. NOTES: 1. infinite values of C will lead to exception being thrown. Values less than 0.0 or greater than 1.0 will lead to error code being returned after call to MCPDSolve(). -- ALGLIB -- Copyright 23.05.2010 by Bochkanov Sergey *************************************************************************/ void mcpdaddec(const mcpdstate &s, const ae_int_t i, const ae_int_t j, const double c); /************************************************************************* This function is used to add bound constraints on the elements of the transition matrix P. MCPD solver has four types of constraints which can be placed on P: * user-specified equality constraints (optional) * user-specified bound constraints (optional) * user-specified general linear constraints (optional) * basic constraints (always present): * non-negativity: P[i,j]>=0 * consistency: every column of P sums to 1.0 Final constraints which are passed to the underlying optimizer are calculated as intersection of all present constraints. For example, you may specify boundary constraint on P[0,0] and equality one: 0.1<=P[0,0]<=0.9 P[0,0]=0.5 Such combination of constraints will be silently reduced to their intersection, which is P[0,0]=0.5. This function can be used to place bound constraints on arbitrary subset of elements of P. Set of constraints is specified by BndL/BndU matrices, which may contain arbitrary combination of finite numbers or infinities (like -INF=0 * consistency: every column of P sums to 1.0 Final constraints which are passed to the underlying optimizer are calculated as intersection of all present constraints. For example, you may specify boundary constraint on P[0,0] and equality one: 0.1<=P[0,0]<=0.9 P[0,0]=0.5 Such combination of constraints will be silently reduced to their intersection, which is P[0,0]=0.5. This function can be used to ADD bound constraint for one element of P without changing constraints for other elements. 
You can also use MCPDSetBC() function which allows to place bound constraints on arbitrary subset of elements of P. Set of constraints is specified by BndL/BndU matrices, which may contain arbitrary combination of finite numbers or infinities (like -INF=" (CT[i]>0). Your constraint may involve only some subset of P (less than N*N elements). For example it can be something like P[0,0] + P[0,1] = 0.5 In this case you still should pass matrix with N*N+1 columns, but all its elements (except for C[0,0], C[0,1] and C[0,N*N-1]) will be zero. INPUT PARAMETERS: S - solver C - array[K,N*N+1] - coefficients of constraints (see above for complete description) CT - array[K] - constraint types (see above for complete description) K - number of equality/inequality constraints, K>=0: * if given, only leading K elements of C/CT are used * if not given, automatically determined from sizes of C/CT -- ALGLIB -- Copyright 23.05.2010 by Bochkanov Sergey *************************************************************************/ void mcpdsetlc(const mcpdstate &s, const real_2d_array &c, const integer_1d_array &ct, const ae_int_t k); void mcpdsetlc(const mcpdstate &s, const real_2d_array &c, const integer_1d_array &ct); /************************************************************************* This function allows to tune amount of Tikhonov regularization being applied to your problem. By default, regularizing term is equal to r*||P-prior_P||^2, where r is a small non-zero value, P is transition matrix, prior_P is identity matrix, ||X||^2 is a sum of squared elements of X. This function allows you to change coefficient r. You can also change prior values with MCPDSetPrior() function. INPUT PARAMETERS: S - solver V - regularization coefficient, finite non-negative value. It is not recommended to specify zero value unless you are pretty sure that you want it. -- ALGLIB -- Copyright 23.05.2010 by Bochkanov Sergey *************************************************************************/ void mcpdsettikhonovregularizer(const mcpdstate &s, const double v); /************************************************************************* This function allows to set prior values used for regularization of your problem. By default, regularizing term is equal to r*||P-prior_P||^2, where r is a small non-zero value, P is transition matrix, prior_P is identity matrix, ||X||^2 is a sum of squared elements of X. This function allows you to change prior values prior_P. You can also change r with MCPDSetTikhonovRegularizer() function. INPUT PARAMETERS: S - solver PP - array[N,N], matrix of prior values: 1. elements must be real numbers from [0,1] 2. columns must sum to 1.0. First property is checked (exception is thrown otherwise), while second one is not checked/enforced. -- ALGLIB -- Copyright 23.05.2010 by Bochkanov Sergey *************************************************************************/ void mcpdsetprior(const mcpdstate &s, const real_2d_array &pp); /************************************************************************* This function is used to change prediction weights MCPD solver scales prediction errors as follows Error(P) = ||W*(y-P*x)||^2 where x is a system state at time t y is a system state at time t+1 P is a transition matrix W is a diagonal scaling matrix By default, weights are chosen in order to minimize relative prediction error instead of absolute one. 
For example, if one component of state is about 0.5 in magnitude and another one is about 0.05, then algorithm will make corresponding weights equal to 2.0 and 20.0. INPUT PARAMETERS: S - solver PW - array[N], weights: * must be non-negative values (exception will be thrown otherwise) * zero values will be replaced by automatically chosen values -- ALGLIB -- Copyright 23.05.2010 by Bochkanov Sergey *************************************************************************/ void mcpdsetpredictionweights(const mcpdstate &s, const real_1d_array &pw); /************************************************************************* This function is used to start solution of the MCPD problem. After return from this function, you can use MCPDResults() to get solution and completion code. -- ALGLIB -- Copyright 23.05.2010 by Bochkanov Sergey *************************************************************************/ void mcpdsolve(const mcpdstate &s); /************************************************************************* MCPD results INPUT PARAMETERS: State - algorithm state OUTPUT PARAMETERS: P - array[N,N], transition matrix Rep - optimization report. You should check Rep.TerminationType in order to distinguish successful termination from unsuccessful one. Speaking short, positive values denote success, negative ones are failures. More information about fields of this structure can be found in the comments on MCPDReport datatype. -- ALGLIB -- Copyright 23.05.2010 by Bochkanov Sergey *************************************************************************/ void mcpdresults(const mcpdstate &s, real_2d_array &p, mcpdreport &rep); /************************************************************************* This function serializes data structure to string. Important properties of s_out: * it contains alphanumeric characters, dots, underscores, minus signs * these symbols are grouped into words, which are separated by spaces and Windows-style (CR+LF) newlines * although serializer uses spaces and CR+LF as separators, you can replace any separator character by arbitrary combination of spaces, tabs, Windows or Unix newlines. It allows flexible reformatting of the string in case you want to include it into text or XML file. But you should not insert separators into the middle of the "words" nor you should change case of letters. * s_out can be freely moved between 32-bit and 64-bit systems, little and big endian machines, and so on. You can serialize structure on 32-bit machine and unserialize it on 64-bit one (or vice versa), or serialize it on SPARC and unserialize on x86. You can also serialize it in C++ version of ALGLIB and unserialize in C# one, and vice versa. *************************************************************************/ void mlpeserialize(mlpensemble &obj, std::string &s_out); /************************************************************************* This function unserializes data structure from string. *************************************************************************/ void mlpeunserialize(const std::string &s_in, mlpensemble &obj); /************************************************************************* This function serializes data structure to C++ stream. Data stream generated by this function is same as string representation generated by string version of serializer - alphanumeric characters, dots, underscores, minus signs, which are grouped into words separated by spaces and CR+LF. 
We recommend you to read comments on string version of serializer to find out more about serialization of AlGLIB objects. *************************************************************************/ void mlpeserialize(mlpensemble &obj, std::ostream &s_out); /************************************************************************* This function unserializes data structure from stream. *************************************************************************/ void mlpeunserialize(const std::istream &s_in, mlpensemble &obj); /************************************************************************* Like MLPCreate0, but for ensembles. -- ALGLIB -- Copyright 18.02.2009 by Bochkanov Sergey *************************************************************************/ void mlpecreate0(const ae_int_t nin, const ae_int_t nout, const ae_int_t ensemblesize, mlpensemble &ensemble); /************************************************************************* Like MLPCreate1, but for ensembles. -- ALGLIB -- Copyright 18.02.2009 by Bochkanov Sergey *************************************************************************/ void mlpecreate1(const ae_int_t nin, const ae_int_t nhid, const ae_int_t nout, const ae_int_t ensemblesize, mlpensemble &ensemble); /************************************************************************* Like MLPCreate2, but for ensembles. -- ALGLIB -- Copyright 18.02.2009 by Bochkanov Sergey *************************************************************************/ void mlpecreate2(const ae_int_t nin, const ae_int_t nhid1, const ae_int_t nhid2, const ae_int_t nout, const ae_int_t ensemblesize, mlpensemble &ensemble); /************************************************************************* Like MLPCreateB0, but for ensembles. -- ALGLIB -- Copyright 18.02.2009 by Bochkanov Sergey *************************************************************************/ void mlpecreateb0(const ae_int_t nin, const ae_int_t nout, const double b, const double d, const ae_int_t ensemblesize, mlpensemble &ensemble); /************************************************************************* Like MLPCreateB1, but for ensembles. -- ALGLIB -- Copyright 18.02.2009 by Bochkanov Sergey *************************************************************************/ void mlpecreateb1(const ae_int_t nin, const ae_int_t nhid, const ae_int_t nout, const double b, const double d, const ae_int_t ensemblesize, mlpensemble &ensemble); /************************************************************************* Like MLPCreateB2, but for ensembles. -- ALGLIB -- Copyright 18.02.2009 by Bochkanov Sergey *************************************************************************/ void mlpecreateb2(const ae_int_t nin, const ae_int_t nhid1, const ae_int_t nhid2, const ae_int_t nout, const double b, const double d, const ae_int_t ensemblesize, mlpensemble &ensemble); /************************************************************************* Like MLPCreateR0, but for ensembles. -- ALGLIB -- Copyright 18.02.2009 by Bochkanov Sergey *************************************************************************/ void mlpecreater0(const ae_int_t nin, const ae_int_t nout, const double a, const double b, const ae_int_t ensemblesize, mlpensemble &ensemble); /************************************************************************* Like MLPCreateR1, but for ensembles. 
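A short sketch combining the ensemble constructors listed in this section with the string serializer described above: an ensemble is created, serialized to a portable string and restored into a second object. The dataanalysis.h header name is an assumption, and mlpeproperties, used here to confirm the restored geometry, is declared just below.

    #include <stdio.h>
    #include <string>
    #include "dataanalysis.h"   // assumed header for the DataAnalysis package
    using namespace alglib;

    int main()
    {
        // ensemble of 5 networks, each with 2 inputs, 3 hidden neurons, 1 output
        mlpensemble e1;
        mlpecreate1(2, 3, 1, 5, e1);

        // serialize to a string and restore into a fresh object
        std::string s;
        mlpeserialize(e1, s);
        mlpensemble e2;
        mlpeunserialize(s, e2);

        // both objects now describe the same ensemble geometry
        ae_int_t nin, nout;
        mlpeproperties(e2, nin, nout);
        printf("nin=%d nout=%d\n", (int)nin, (int)nout);
        return 0;
    }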
-- ALGLIB -- Copyright 18.02.2009 by Bochkanov Sergey *************************************************************************/ void mlpecreater1(const ae_int_t nin, const ae_int_t nhid, const ae_int_t nout, const double a, const double b, const ae_int_t ensemblesize, mlpensemble &ensemble);
/************************************************************************* Like MLPCreateR2, but for ensembles. -- ALGLIB -- Copyright 18.02.2009 by Bochkanov Sergey *************************************************************************/ void mlpecreater2(const ae_int_t nin, const ae_int_t nhid1, const ae_int_t nhid2, const ae_int_t nout, const double a, const double b, const ae_int_t ensemblesize, mlpensemble &ensemble);
/************************************************************************* Like MLPCreateC0, but for ensembles. -- ALGLIB -- Copyright 18.02.2009 by Bochkanov Sergey *************************************************************************/ void mlpecreatec0(const ae_int_t nin, const ae_int_t nout, const ae_int_t ensemblesize, mlpensemble &ensemble);
/************************************************************************* Like MLPCreateC1, but for ensembles. -- ALGLIB -- Copyright 18.02.2009 by Bochkanov Sergey *************************************************************************/ void mlpecreatec1(const ae_int_t nin, const ae_int_t nhid, const ae_int_t nout, const ae_int_t ensemblesize, mlpensemble &ensemble);
/************************************************************************* Like MLPCreateC2, but for ensembles. -- ALGLIB -- Copyright 18.02.2009 by Bochkanov Sergey *************************************************************************/ void mlpecreatec2(const ae_int_t nin, const ae_int_t nhid1, const ae_int_t nhid2, const ae_int_t nout, const ae_int_t ensemblesize, mlpensemble &ensemble);
/************************************************************************* Creates ensemble from network. Only network geometry is copied. -- ALGLIB -- Copyright 17.02.2009 by Bochkanov Sergey *************************************************************************/ void mlpecreatefromnetwork(const multilayerperceptron &network, const ae_int_t ensemblesize, mlpensemble &ensemble);
/************************************************************************* Randomization of MLP ensemble -- ALGLIB -- Copyright 17.02.2009 by Bochkanov Sergey *************************************************************************/ void mlperandomize(const mlpensemble &ensemble);
/************************************************************************* Return ensemble properties (number of inputs and outputs). -- ALGLIB -- Copyright 17.02.2009 by Bochkanov Sergey *************************************************************************/ void mlpeproperties(const mlpensemble &ensemble, ae_int_t &nin, ae_int_t &nout);
/************************************************************************* Return normalization type (whether ensemble is SOFTMAX-normalized or not). -- ALGLIB -- Copyright 17.02.2009 by Bochkanov Sergey *************************************************************************/ bool mlpeissoftmax(const mlpensemble &ensemble);
/************************************************************************* Processing INPUT PARAMETERS: Ensemble- neural networks ensemble X - input vector, array[0..NIn-1]. Y - (possibly) preallocated buffer; if size of Y is less than NOut, it will be reallocated. If it is large enough, it is NOT reallocated, so we can save some time on reallocation. OUTPUT PARAMETERS: Y - result. Regression estimate when solving regression task, vector of posterior probabilities for classification task. -- ALGLIB -- Copyright 17.02.2009 by Bochkanov Sergey *************************************************************************/ void mlpeprocess(const mlpensemble &ensemble, const real_1d_array &x, real_1d_array &y);
/************************************************************************* 'interactive' variant of MLPEProcess for languages like Python which support constructs like "Y = MLPEProcess(LM,X)" and interactive mode of the interpreter. This function allocates new array on each call, so it is significantly slower than its 'non-interactive' counterpart, but it is more convenient when you call it from command line. -- ALGLIB -- Copyright 17.02.2009 by Bochkanov Sergey *************************************************************************/ void mlpeprocessi(const mlpensemble &ensemble, const real_1d_array &x, real_1d_array &y);
/************************************************************************* Relative classification error on the test set INPUT PARAMETERS: Ensemble- ensemble XY - test set NPoints - test set size RESULT: percent of incorrectly classified cases. Works both for classifier networks and for regression networks which are used as classifiers. -- ALGLIB -- Copyright 17.02.2009 by Bochkanov Sergey *************************************************************************/ double mlperelclserror(const mlpensemble &ensemble, const real_2d_array &xy, const ae_int_t npoints);
/************************************************************************* Average cross-entropy (in bits per element) on the test set INPUT PARAMETERS: Ensemble- ensemble XY - test set NPoints - test set size RESULT: CrossEntropy/(NPoints*LN(2)). Zero if ensemble solves regression task. -- ALGLIB -- Copyright 17.02.2009 by Bochkanov Sergey *************************************************************************/ double mlpeavgce(const mlpensemble &ensemble, const real_2d_array &xy, const ae_int_t npoints);
/************************************************************************* RMS error on the test set INPUT PARAMETERS: Ensemble- ensemble XY - test set NPoints - test set size RESULT: root mean square error. Its meaning for regression task is obvious. As for classification task RMS error means error when estimating posterior probabilities. -- ALGLIB -- Copyright 17.02.2009 by Bochkanov Sergey *************************************************************************/ double mlpermserror(const mlpensemble &ensemble, const real_2d_array &xy, const ae_int_t npoints);
/************************************************************************* Average error on the test set INPUT PARAMETERS: Ensemble- ensemble XY - test set NPoints - test set size RESULT: Its meaning for regression task is obvious. As for classification task it means average error when estimating posterior probabilities. -- ALGLIB -- Copyright 17.02.2009 by Bochkanov Sergey *************************************************************************/ double mlpeavgerror(const mlpensemble &ensemble, const real_2d_array &xy, const ae_int_t npoints);
/************************************************************************* Average relative error on the test set INPUT PARAMETERS: Ensemble- ensemble XY - test set NPoints - test set size RESULT: Its meaning for regression task is obvious. As for classification task it means average relative error when estimating posterior probabilities.
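The following sketch exercises the processing and error-measurement functions above on a freshly created, and therefore randomly initialized, ensemble; the numbers it prints are arbitrary, only the calling convention matters. The dataanalysis.h header name and the string constructors of real_1d_array/real_2d_array are assumed from the general C++ interface described elsewhere in this manual.

    #include <stdio.h>
    #include "dataanalysis.h"   // assumed header for the DataAnalysis package
    using namespace alglib;

    int main()
    {
        // ensemble of 3 regression networks: 2 inputs, 4 hidden neurons, 1 output
        mlpensemble ens;
        mlpecreate1(2, 4, 1, 3, ens);

        // process a single input vector; Y is (re)allocated as needed
        real_1d_array x = "[0.5,-0.5]";
        real_1d_array y;
        mlpeprocess(ens, x, y);
        printf("y[0]=%.4f\n", y[0]);

        // error metrics over a small test set (inputs in first 2 columns, target in the last)
        real_2d_array xy = "[[0,0,0],[0,1,1],[1,0,1],[1,1,0]]";
        printf("rms=%.4f avgrel=%.4f\n",
               mlpermserror(ens, xy, 4),
               mlpeavgrelerror(ens, xy, 4));
        return 0;
    }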
-- ALGLIB -- Copyright 17.02.2009 by Bochkanov Sergey *************************************************************************/ double mlpeavgrelerror(const mlpensemble &ensemble, const real_2d_array &xy, const ae_int_t npoints); /************************************************************************* Neural network training using modified Levenberg-Marquardt with exact Hessian calculation and regularization. Subroutine trains neural network with restarts from random positions. Algorithm is well suited for small and medium scale problems (hundreds of weights). INPUT PARAMETERS: Network - neural network with initialized geometry XY - training set NPoints - training set size Decay - weight decay constant, >=0.001 Decay term 'Decay*||Weights||^2' is added to error function. If you don't know what Decay to choose, use 0.001. Restarts - number of restarts from random position, >0. If you don't know what Restarts to choose, use 2. OUTPUT PARAMETERS: Network - trained neural network. Info - return code: * -9, if internal matrix inverse subroutine failed * -2, if there is a point with class number outside of [0..NOut-1]. * -1, if wrong parameters specified (NPoints<0, Restarts<1). * 2, if task has been solved. Rep - training report -- ALGLIB -- Copyright 10.03.2009 by Bochkanov Sergey *************************************************************************/ void mlptrainlm(const multilayerperceptron &network, const real_2d_array &xy, const ae_int_t npoints, const double decay, const ae_int_t restarts, ae_int_t &info, mlpreport &rep); /************************************************************************* Neural network training using L-BFGS algorithm with regularization. Subroutine trains neural network with restarts from random positions. Algorithm is well suited for problems of any dimensionality (memory requirements and step complexity are linear by weights number). INPUT PARAMETERS: Network - neural network with initialized geometry XY - training set NPoints - training set size Decay - weight decay constant, >=0.001 Decay term 'Decay*||Weights||^2' is added to error function. If you don't know what Decay to choose, use 0.001. Restarts - number of restarts from random position, >0. If you don't know what Restarts to choose, use 2. WStep - stopping criterion. Algorithm stops if step size is less than WStep. Recommended value - 0.01. Zero step size means stopping after MaxIts iterations. MaxIts - stopping criterion. Algorithm stops after MaxIts iterations (NOT gradient calculations). Zero MaxIts means stopping when step is sufficiently small. OUTPUT PARAMETERS: Network - trained neural network. Info - return code: * -8, if both WStep=0 and MaxIts=0 * -2, if there is a point with class number outside of [0..NOut-1]. * -1, if wrong parameters specified (NPoints<0, Restarts<1). * 2, if task has been solved. Rep - training report -- ALGLIB -- Copyright 09.12.2007 by Bochkanov Sergey *************************************************************************/ void mlptrainlbfgs(const multilayerperceptron &network, const real_2d_array &xy, const ae_int_t npoints, const double decay, const ae_int_t restarts, const double wstep, const ae_int_t maxits, ae_int_t &info, mlpreport &rep); /************************************************************************* Neural network training using early stopping (base algorithm - L-BFGS with regularization). 
INPUT PARAMETERS: Network - neural network with initialized geometry TrnXY - training set TrnSize - training set size, TrnSize>0 ValXY - validation set ValSize - validation set size, ValSize>0 Decay - weight decay constant, >=0.001 Decay term 'Decay*||Weights||^2' is added to error function. If you don't know what Decay to choose, use 0.001. Restarts - number of restarts, either: * strictly positive number - algorithm makes specified number of restarts from random position. * -1, in which case algorithm makes exactly one run from the initial state of the network (no randomization). If you don't know what Restarts to choose, choose one of the following: * -1 (deterministic start) * +1 (one random restart) * +5 (moderate amount of random restarts) OUTPUT PARAMETERS: Network - trained neural network. Info - return code: * -2, if there is a point with class number outside of [0..NOut-1]. * -1, if wrong parameters specified (NPoints<0, Restarts<1, ...). * 2, task has been solved, stopping criterion met - sufficiently small step size. Not expected (we use EARLY stopping) but possible and not an error. * 6, task has been solved, stopping criterion met - increasing of validation set error. Rep - training report NOTE: Algorithm stops if validation set error increases for long enough or step size is small enough (there are tasks where validation set error may decrease for eternity). In any case solution returned corresponds to the minimum of validation set error. -- ALGLIB -- Copyright 10.03.2009 by Bochkanov Sergey *************************************************************************/ void mlptraines(const multilayerperceptron &network, const real_2d_array &trnxy, const ae_int_t trnsize, const real_2d_array &valxy, const ae_int_t valsize, const double decay, const ae_int_t restarts, ae_int_t &info, mlpreport &rep);
/************************************************************************* Cross-validation estimate of generalization error. Base algorithm - L-BFGS. INPUT PARAMETERS: Network - neural network with initialized geometry. Network is not changed during cross-validation - it is used only as a representative of its architecture. XY - training set. SSize - training set size Decay - weight decay, same as in MLPTrainLBFGS Restarts - number of restarts, >0. restarts are counted for each partition separately, so total number of restarts will be Restarts*FoldsCount. WStep - stopping criterion, same as in MLPTrainLBFGS MaxIts - stopping criterion, same as in MLPTrainLBFGS FoldsCount - number of folds in k-fold cross-validation, 2<=FoldsCount<=SSize. recommended value: 10. OUTPUT PARAMETERS: Info - return code, same as in MLPTrainLBFGS Rep - report, same as in MLPTrainLM/MLPTrainLBFGS CVRep - generalization error estimates -- ALGLIB -- Copyright 09.12.2007 by Bochkanov Sergey *************************************************************************/ void mlpkfoldcvlbfgs(const multilayerperceptron &network, const real_2d_array &xy, const ae_int_t npoints, const double decay, const ae_int_t restarts, const double wstep, const ae_int_t maxits, const ae_int_t foldscount, ae_int_t &info, mlpreport &rep, mlpcvreport &cvrep);
/************************************************************************* Cross-validation estimate of generalization error. Base algorithm - Levenberg-Marquardt. INPUT PARAMETERS: Network - neural network with initialized geometry. Network is not changed during cross-validation - it is used only as a representative of its architecture. XY - training set.
SSize - training set size Decay - weight decay, same as in MLPTrainLBFGS Restarts - number of restarts, >0. restarts are counted for each partition separately, so total number of restarts will be Restarts*FoldsCount. FoldsCount - number of folds in k-fold cross-validation, 2<=FoldsCount<=SSize. recommended value: 10. OUTPUT PARAMETERS: Info - return code, same as in MLPTrainLBFGS Rep - report, same as in MLPTrainLM/MLPTrainLBFGS CVRep - generalization error estimates -- ALGLIB -- Copyright 09.12.2007 by Bochkanov Sergey *************************************************************************/ void mlpkfoldcvlm(const multilayerperceptron &network, const real_2d_array &xy, const ae_int_t npoints, const double decay, const ae_int_t restarts, const ae_int_t foldscount, ae_int_t &info, mlpreport &rep, mlpcvreport &cvrep); /************************************************************************* This function estimates generalization error using cross-validation on the current dataset with current training settings. FOR USERS OF COMMERCIAL EDITION: ! Commercial version of ALGLIB includes two important improvements of ! this function: ! * multicore support (C++ and C# computational cores) ! * SSE support (C++ computational core) ! ! Second improvement gives constant speedup (2-3X). First improvement ! gives close-to-linear speedup on multicore systems. Following ! operations can be executed in parallel: ! * FoldsCount cross-validation rounds (always) ! * NRestarts training sessions performed within each of ! cross-validation rounds (if NRestarts>1) ! * gradient calculation over large dataset (if dataset is large enough) ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! In order to use SSE features you have to: ! * use commercial version of ALGLIB on Intel processors ! * use C++ computational core ! ! This note is given for users of commercial edition; if you use GPL ! edition, you still will be able to call smp-version of this function, ! but all computations will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. INPUT PARAMETERS: S - trainer object Network - neural network. It must have same number of inputs and output/classes as was specified during creation of the trainer object. Network is not changed during cross- validation and is not trained - it is used only as representative of its architecture. I.e., we estimate generalization properties of ARCHITECTURE, not some specific network. NRestarts - number of restarts, >=0: * NRestarts>0 means that for each cross-validation round specified number of random restarts is performed, with best network being chosen after training. * NRestarts=0 is same as NRestarts=1 FoldsCount - number of folds in k-fold cross-validation: * 2<=FoldsCount<=size of dataset * recommended value: 10. * values larger than dataset size will be silently truncated down to dataset size OUTPUT PARAMETERS: Rep - structure which contains cross-validation estimates: * Rep.RelCLSError - fraction of misclassified cases. 
* Rep.AvgCE - average cross-entropy * Rep.RMSError - root-mean-square error * Rep.AvgError - average error * Rep.AvgRelError - average relative error NOTE: when no dataset was specified with MLPSetDataset/SetSparseDataset(), or subset with only one point was given, zeros are returned as estimates. NOTE: this method performs FoldsCount cross-validation rounds, each one with NRestarts random starts. Thus, FoldsCount*NRestarts networks are trained in total. NOTE: Rep.RelCLSError/Rep.AvgCE are zero on regression problems. NOTE: on classification problems Rep.RMSError/Rep.AvgError/Rep.AvgRelError contain errors in prediction of posterior probabilities. -- ALGLIB -- Copyright 23.07.2012 by Bochkanov Sergey *************************************************************************/ void mlpkfoldcv(const mlptrainer &s, const multilayerperceptron &network, const ae_int_t nrestarts, const ae_int_t foldscount, mlpreport &rep); void smp_mlpkfoldcv(const mlptrainer &s, const multilayerperceptron &network, const ae_int_t nrestarts, const ae_int_t foldscount, mlpreport &rep); /************************************************************************* Creation of the network trainer object for regression networks INPUT PARAMETERS: NIn - number of inputs, NIn>=1 NOut - number of outputs, NOut>=1 OUTPUT PARAMETERS: S - neural network trainer object. This structure can be used to train any regression network with NIn inputs and NOut outputs. -- ALGLIB -- Copyright 23.07.2012 by Bochkanov Sergey *************************************************************************/ void mlpcreatetrainer(const ae_int_t nin, const ae_int_t nout, mlptrainer &s); /************************************************************************* Creation of the network trainer object for classification networks INPUT PARAMETERS: NIn - number of inputs, NIn>=1 NClasses - number of classes, NClasses>=2 OUTPUT PARAMETERS: S - neural network trainer object. This structure can be used to train any classification network with NIn inputs and NClasses classes. -- ALGLIB -- Copyright 23.07.2012 by Bochkanov Sergey *************************************************************************/ void mlpcreatetrainercls(const ae_int_t nin, const ae_int_t nclasses, mlptrainer &s); /************************************************************************* This function sets "current dataset" of the trainer object to one passed by user. INPUT PARAMETERS: S - trainer object XY - training set, see below for information on the training set format. This function checks correctness of the dataset (no NANs/INFs, class numbers are correct) and throws exception when incorrect dataset is passed. NPoints - points count, >=0. DATASET FORMAT: This function uses two different dataset formats - one for regression networks, another one for classification networks. For regression networks with NIn inputs and NOut outputs the following dataset format is used: * dataset is given by NPoints*(NIn+NOut) matrix * each row corresponds to one example * first NIn columns are inputs, next NOut columns are outputs For classification networks with NIn inputs and NClasses classes the following dataset format is used: * dataset is given by NPoints*(NIn+1) matrix * each row corresponds to one example * first NIn columns are inputs, last column stores class number (from 0 to NClasses-1). 
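To make the classification format above concrete, here is a minimal sketch (assuming the dataanalysis.h header of the C++ interface; the data values are made up) which attaches a tiny 2-input, 2-class dataset to a classification trainer:

#include "dataanalysis.h"
using namespace alglib;

int main()
{
    // classification format: first NIn columns are inputs,
    // last column stores class number (0..NClasses-1)
    real_2d_array xy = "[[0.1,0.2,0],[0.9,0.8,1],[0.2,0.1,0],[0.8,0.9,1]]";
    mlptrainer trn;
    mlpcreatetrainercls(2, 2, trn);   // trainer for NIn=2 inputs, NClasses=2
    mlpsetdataset(trn, xy, 4);        // NPoints=4
    return 0;
}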
-- ALGLIB -- Copyright 23.07.2012 by Bochkanov Sergey *************************************************************************/ void mlpsetdataset(const mlptrainer &s, const real_2d_array &xy, const ae_int_t npoints); /************************************************************************* This function sets "current dataset" of the trainer object to one passed by user (sparse matrix is used to store dataset). INPUT PARAMETERS: S - trainer object XY - training set, see below for information on the training set format. This function checks correctness of the dataset (no NANs/INFs, class numbers are correct) and throws exception when incorrect dataset is passed. Any sparse storage format can be used: Hash-table, CRS... NPoints - points count, >=0 DATASET FORMAT: This function uses two different dataset formats - one for regression networks, another one for classification networks. For regression networks with NIn inputs and NOut outputs the following dataset format is used: * dataset is given by NPoints*(NIn+NOut) matrix * each row corresponds to one example * first NIn columns are inputs, next NOut columns are outputs For classification networks with NIn inputs and NClasses classes the following dataset format is used: * dataset is given by NPoints*(NIn+1) matrix * each row corresponds to one example * first NIn columns are inputs, last column stores class number (from 0 to NClasses-1). -- ALGLIB -- Copyright 23.07.2012 by Bochkanov Sergey *************************************************************************/ void mlpsetsparsedataset(const mlptrainer &s, const sparsematrix &xy, const ae_int_t npoints); /************************************************************************* This function sets weight decay coefficient which is used for training. INPUT PARAMETERS: S - trainer object Decay - weight decay coefficient, >=0. Weight decay term 'Decay*||Weights||^2' is added to error function. If you don't know what Decay to choose, use 1.0E-3. Weight decay can be set to zero, in this case network is trained without weight decay. NOTE: by default network uses some small nonzero value for weight decay. -- ALGLIB -- Copyright 23.07.2012 by Bochkanov Sergey *************************************************************************/ void mlpsetdecay(const mlptrainer &s, const double decay); /************************************************************************* This function sets stopping criteria for the optimizer. INPUT PARAMETERS: S - trainer object WStep - stopping criterion. Algorithm stops if step size is less than WStep. Recommended value - 0.01. Zero step size means stopping after MaxIts iterations. WStep>=0. MaxIts - stopping criterion. Algorithm stops after MaxIts epochs (full passes over entire dataset). Zero MaxIts means stopping when step is sufficiently small. MaxIts>=0. NOTE: by default, WStep=0.005 and MaxIts=0 are used. These values are also used when MLPSetCond() is called with WStep=0 and MaxIts=0. NOTE: these stopping criteria are used for all kinds of neural training - from "conventional" networks to early stopping ensembles. When used for "conventional" networks, they are used as the only stopping criteria. When combined with early stopping, they are used as ADDITIONAL stopping criteria which can terminate early stopping algorithm. 
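As a minimal sketch of configuring these settings (assuming the dataanalysis.h header of the C++ interface), weight decay and stopping criteria are typically set right after the trainer is created:

#include "dataanalysis.h"
using namespace alglib;

int main()
{
    mlptrainer trn;
    mlpcreatetrainer(2, 1, trn);   // regression trainer: 2 inputs, 1 output
    mlpsetdecay(trn, 0.001);       // adds 'Decay*||Weights||^2' to the error function
    mlpsetcond(trn, 0.01, 100);    // stop when step size < 0.01 or after 100 epochs
    return 0;
}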
-- ALGLIB -- Copyright 23.07.2012 by Bochkanov Sergey *************************************************************************/ void mlpsetcond(const mlptrainer &s, const double wstep, const ae_int_t maxits); /************************************************************************* This function sets training algorithm: batch training using L-BFGS will be used. This algorithm: * is the most robust for small-scale problems, but may be too slow for large-scale ones. * performs a full pass through the dataset before performing a step * uses conditions specified by MLPSetCond() for stopping * is the default one used by the trainer object INPUT PARAMETERS: S - trainer object -- ALGLIB -- Copyright 23.07.2012 by Bochkanov Sergey *************************************************************************/ void mlpsetalgobatch(const mlptrainer &s); /************************************************************************* This function trains neural network passed to this function, using current dataset (one which was passed to MLPSetDataset() or MLPSetSparseDataset()) and current training settings. Training from NRestarts random starting positions is performed, best network is chosen. Training is performed using current training algorithm. FOR USERS OF COMMERCIAL EDITION: ! Commercial version of ALGLIB includes two important improvements of ! this function: ! * multicore support (C++ and C# computational cores) ! * SSE support (C++ computational core) ! ! Second improvement gives constant speedup (2-3X). First improvement ! gives close-to-linear speedup on multicore systems. Following ! operations can be executed in parallel: ! * NRestarts training sessions performed within each of ! cross-validation rounds (if NRestarts>1) ! * gradient calculation over large dataset (if dataset is large enough) ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! In order to use SSE features you have to: ! * use commercial version of ALGLIB on Intel processors ! * use C++ computational core ! ! This note is given for users of commercial edition; if you use GPL ! edition, you still will be able to call smp-version of this function, ! but all computations will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. INPUT PARAMETERS: S - trainer object Network - neural network. It must have same number of inputs and outputs/classes as was specified during creation of the trainer object. NRestarts - number of restarts, >=0: * NRestarts>0 means that specified number of random restarts are performed, best network is chosen after training * NRestarts=0 means that current state of the network is used for training. OUTPUT PARAMETERS: Network - trained network NOTE: when no dataset was specified with MLPSetDataset/SetSparseDataset(), network is filled by zero values. Same behavior for functions MLPStartTraining and MLPContinueTraining. NOTE: this method uses sum-of-squares error function for training. 
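For illustration, a minimal end-to-end training sketch (assuming the dataanalysis.h header of the C++ interface and the MLPCreate1 network constructor; the data values are made up):

#include "dataanalysis.h"
using namespace alglib;

int main()
{
    // toy regression dataset: one input column, one output column
    real_2d_array xy = "[[0.0,0.0],[0.5,0.25],[1.0,1.0],[1.5,2.25],[2.0,4.0]]";
    mlptrainer trn;
    mlpcreatetrainer(1, 1, trn);           // 1 input, 1 output
    mlpsetdataset(trn, xy, 5);
    multilayerperceptron net;
    mlpcreate1(1, 5, 1, net);              // 1-5-1 network with one hidden layer
    mlpreport rep;
    mlptrainnetwork(trn, net, 3, rep);     // 3 random restarts, best network kept
    real_1d_array x = "[1.2]", y;
    y.setlength(1);
    mlpprocess(net, x, y);                 // y[0] holds the trained network's output
    return 0;
}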
-- ALGLIB -- Copyright 23.07.2012 by Bochkanov Sergey *************************************************************************/ void mlptrainnetwork(const mlptrainer &s, const multilayerperceptron &network, const ae_int_t nrestarts, mlpreport &rep); void smp_mlptrainnetwork(const mlptrainer &s, const multilayerperceptron &network, const ae_int_t nrestarts, mlpreport &rep); /************************************************************************* IMPORTANT: this is an "expert" version of the MLPTrain() function. We do not recommend you to use it unless you are pretty sure that you need the ability to monitor training progress. This function performs step-by-step training of the neural network. Here "step-by-step" means that training starts with MLPStartTraining() call, and then user subsequently calls MLPContinueTraining() to perform one more iteration of the training. After call to this function trainer object remembers network and is ready to train it. However, no training is performed until first call to MLPContinueTraining() function. Subsequent calls to MLPContinueTraining() will advance training progress one iteration further. EXAMPLE: > > ...initialize network and trainer object.... > > MLPStartTraining(Trainer, Network, True) > while MLPContinueTraining(Trainer, Network) do > ...visualize training progress... > INPUT PARAMETERS: S - trainer object Network - neural network. It must have same number of inputs and outputs/classes as was specified during creation of the trainer object. RandomStart - randomize network before training or not: * True means that network is randomized and its initial state (one which was passed to the trainer object) is lost. * False means that training is started from the current state of the network OUTPUT PARAMETERS: Network - neural network which is ready for training (weights are initialized, preprocessor is initialized using current training set) NOTE: this method uses sum-of-squares error function for training. NOTE: it is expected that trainer object settings are NOT changed during step-by-step training, i.e. no one changes stopping criteria or training set during training. It is possible and there is no defense against such actions, but algorithm behavior in such cases is undefined and can be unpredictable. -- ALGLIB -- Copyright 23.07.2012 by Bochkanov Sergey *************************************************************************/ void mlpstarttraining(const mlptrainer &s, const multilayerperceptron &network, const bool randomstart); /************************************************************************* IMPORTANT: this is an "expert" version of the MLPTrain() function. We do not recommend you to use it unless you are pretty sure that you need the ability to monitor training progress. FOR USERS OF COMMERCIAL EDITION: ! Commercial version of ALGLIB includes two important improvements of ! this function: ! * multicore support (C++ and C# computational cores) ! * SSE support (C++ computational core) ! ! Second improvement gives constant speedup (2-3X). First improvement ! gives close-to-linear speedup on multicore systems. Following ! operations can be executed in parallel: ! * gradient calculation over large dataset (if dataset is large enough) ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! In order to use SSE features you have to: ! * use commercial version of ALGLIB on Intel processors ! 
* use C++ computational core ! ! This note is given for users of commercial edition; if you use GPL ! edition, you still will be able to call smp-version of this function, ! but all computations will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. This function performs step-by-step training of the neural network. Here "step-by-step" means that training starts with MLPStartTraining() call, and then user subsequently calls MLPContinueTraining() to perform one more iteration of the training. This function performs one more iteration of the training and returns either True (training continues) or False (training stopped). In case True was returned, Network weights are updated according to the current state of the optimization progress. In case False was returned, no additional updates are performed (the previous update of the network weights moved us to the final point, and no additional update is needed). EXAMPLE: > > [initialize network and trainer object] > > MLPStartTraining(Trainer, Network, True) > while MLPContinueTraining(Trainer, Network) do > [visualize training progress] > INPUT PARAMETERS: S - trainer object Network - neural network structure, which is used to store current state of the training process. OUTPUT PARAMETERS: Network - weights of the neural network are rewritten by the current approximation. NOTE: this method uses sum-of-squares error function for training. NOTE: it is expected that trainer object settings are NOT changed during step-by-step training, i.e. no one changes stopping criteria or training set during training. It is possible and there is no defense against such actions, but algorithm behavior in such cases is undefined and can be unpredictable. NOTE: It is expected that Network is the same one which was passed to MLPStartTraining() function. However, THIS function checks only the following: * that number of network inputs is consistent with trainer object settings * that number of network outputs/classes is consistent with trainer object settings * that number of network weights is the same as number of weights in the network passed to MLPStartTraining() function Exception is thrown when these conditions are violated. It is also expected that you do not change state of the network on your own - the only party who has the right to change the network during its training is the trainer object. Any attempt to interfere with the trainer may lead to unpredictable results. -- ALGLIB -- Copyright 23.07.2012 by Bochkanov Sergey *************************************************************************/ bool mlpcontinuetraining(const mlptrainer &s, const multilayerperceptron &network); bool smp_mlpcontinuetraining(const mlptrainer &s, const multilayerperceptron &network); /************************************************************************* Training neural networks ensemble using bootstrap aggregating (bagging). Modified Levenberg-Marquardt algorithm is used as base training method. INPUT PARAMETERS: Ensemble - model with initialized geometry XY - training set NPoints - training set size Decay - weight decay coefficient, >=0.001 Restarts - restarts, >0. OUTPUT PARAMETERS: Ensemble - trained model Info - return code: * -2, if there is a point with class number outside of [0..NClasses-1]. * -1, if incorrect parameters were passed (NPoints<0, Restarts<1). * 2, if task has been solved. Rep - training report. 
OOBErrors - out-of-bag generalization error estimate -- ALGLIB -- Copyright 17.02.2009 by Bochkanov Sergey *************************************************************************/ void mlpebagginglm(const mlpensemble &ensemble, const real_2d_array &xy, const ae_int_t npoints, const double decay, const ae_int_t restarts, ae_int_t &info, mlpreport &rep, mlpcvreport &ooberrors); /************************************************************************* Training neural networks ensemble using bootstrap aggregating (bagging). L-BFGS algorithm is used as base training method. INPUT PARAMETERS: Ensemble - model with initialized geometry XY - training set NPoints - training set size Decay - weight decay coefficient, >=0.001 Restarts - restarts, >0. WStep - stopping criterion, same as in MLPTrainLBFGS MaxIts - stopping criterion, same as in MLPTrainLBFGS OUTPUT PARAMETERS: Ensemble - trained model Info - return code: * -8, if both WStep=0 and MaxIts=0 * -2, if there is a point with class number outside of [0..NClasses-1]. * -1, if incorrect parameters were passed (NPoints<0, Restarts<1). * 2, if task has been solved. Rep - training report. OOBErrors - out-of-bag generalization error estimate -- ALGLIB -- Copyright 17.02.2009 by Bochkanov Sergey *************************************************************************/ void mlpebagginglbfgs(const mlpensemble &ensemble, const real_2d_array &xy, const ae_int_t npoints, const double decay, const ae_int_t restarts, const double wstep, const ae_int_t maxits, ae_int_t &info, mlpreport &rep, mlpcvreport &ooberrors); /************************************************************************* Training neural networks ensemble using early stopping. INPUT PARAMETERS: Ensemble - model with initialized geometry XY - training set NPoints - training set size Decay - weight decay coefficient, >=0.001 Restarts - restarts, >0. OUTPUT PARAMETERS: Ensemble - trained model Info - return code: * -2, if there is a point with class number outside of [0..NClasses-1]. * -1, if incorrect parameters were passed (NPoints<0, Restarts<1). * 6, if task has been solved. Rep - training report. OOBErrors - out-of-bag generalization error estimate -- ALGLIB -- Copyright 10.03.2009 by Bochkanov Sergey *************************************************************************/ void mlpetraines(const mlpensemble &ensemble, const real_2d_array &xy, const ae_int_t npoints, const double decay, const ae_int_t restarts, ae_int_t &info, mlpreport &rep); /************************************************************************* This function trains neural network ensemble passed to this function using current dataset and early stopping training algorithm. Each early stopping round performs NRestarts random restarts (thus, EnsembleSize*NRestarts training rounds are performed in total). FOR USERS OF COMMERCIAL EDITION: ! Commercial version of ALGLIB includes two important improvements of ! this function: ! * multicore support (C++ and C# computational cores) ! * SSE support (C++ computational core) ! ! Second improvement gives constant speedup (2-3X). First improvement ! gives close-to-linear speedup on multicore systems. Following ! operations can be executed in parallel: ! * EnsembleSize training sessions performed for each of ensemble ! members (always parallelized) ! * NRestarts training sessions performed within each of training ! sessions (if NRestarts>1) ! * gradient calculation over large dataset (if dataset is large enough) ! ! In order to use multicore features you have to: ! 
* use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! In order to use SSE features you have to: ! * use commercial version of ALGLIB on Intel processors ! * use C++ computational core ! ! This note is given for users of commercial edition; if you use GPL ! edition, you still will be able to call smp-version of this function, ! but all computations will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. INPUT PARAMETERS: S - trainer object; Ensemble - neural network ensemble. It must have same number of inputs and outputs/classes as was specified during creation of the trainer object. NRestarts - number of restarts, >=0: * NRestarts>0 means that specified number of random restarts are performed during each ES round; * NRestarts=0 is silently replaced by 1. OUTPUT PARAMETERS: Ensemble - trained ensemble; Rep - it contains all types of errors. NOTE: this training method uses BOTH early stopping and weight decay! So, you should select weight decay before starting training just as you select it before training "conventional" networks. NOTE: when no dataset was specified with MLPSetDataset/SetSparseDataset(), or single-point dataset was passed, ensemble is filled by zero values. NOTE: this method uses sum-of-squares error function for training. -- ALGLIB -- Copyright 22.08.2012 by Bochkanov Sergey *************************************************************************/ void mlptrainensemblees(const mlptrainer &s, const mlpensemble &ensemble, const ae_int_t nrestarts, mlpreport &rep); void smp_mlptrainensemblees(const mlptrainer &s, const mlpensemble &ensemble, const ae_int_t nrestarts, mlpreport &rep); /************************************************************************* This function initializes clusterizer object. Newly initialized object is empty, i.e. it does not contain dataset. You should use it as follows: 1. creation 2. dataset is added with ClusterizerSetPoints() 3. additional parameters are set 4. clusterization is performed with one of the clustering functions -- ALGLIB -- Copyright 10.07.2012 by Bochkanov Sergey *************************************************************************/ void clusterizercreate(clusterizerstate &s); /************************************************************************* This function adds dataset to the clusterizer structure. This function overrides all previous calls of ClusterizerSetPoints() or ClusterizerSetDistances(). 
INPUT PARAMETERS: S - clusterizer state, initialized by ClusterizerCreate() XY - array[NPoints,NFeatures], dataset NPoints - number of points, >=0 NFeatures- number of features, >=1 DistType- distance function: * 0 Chebyshev distance (L-inf norm) * 1 city block distance (L1 norm) * 2 Euclidean distance (L2 norm), non-squared * 10 Pearson correlation: dist(a,b) = 1-corr(a,b) * 11 Absolute Pearson correlation: dist(a,b) = 1-|corr(a,b)| * 12 Uncentered Pearson correlation (cosine of the angle): dist(a,b) = a'*b/(|a|*|b|) * 13 Absolute uncentered Pearson correlation dist(a,b) = |a'*b|/(|a|*|b|) * 20 Spearman rank correlation: dist(a,b) = 1-rankcorr(a,b) * 21 Absolute Spearman rank correlation dist(a,b) = 1-|rankcorr(a,b)| NOTE 1: different distance functions have different performance penalty: * Euclidean or Pearson correlation distances are the fastest ones * Spearman correlation distance function is a bit slower * city block and Chebyshev distances are an order of magnitude slower The reason behind the difference in performance is that correlation-based distance functions are computed using optimized linear algebra kernels, while Chebyshev and city block distance functions are computed using simple nested loops with two branches at each iteration. NOTE 2: different clustering algorithms have different limitations: * agglomerative hierarchical clustering algorithms may be used with any kind of distance metric * k-means++ clustering algorithm may be used only with Euclidean distance function Thus, the list of specific clustering algorithms you may use depends on the distance function you specify when you set your dataset. -- ALGLIB -- Copyright 10.07.2012 by Bochkanov Sergey *************************************************************************/ void clusterizersetpoints(const clusterizerstate &s, const real_2d_array &xy, const ae_int_t npoints, const ae_int_t nfeatures, const ae_int_t disttype); void clusterizersetpoints(const clusterizerstate &s, const real_2d_array &xy, const ae_int_t disttype); /************************************************************************* This function adds dataset given by distance matrix to the clusterizer structure. It is important that dataset is not given explicitly - only distance matrix is given. This function overrides all previous calls of ClusterizerSetPoints() or ClusterizerSetDistances(). INPUT PARAMETERS: S - clusterizer state, initialized by ClusterizerCreate() D - array[NPoints,NPoints], distance matrix given by its upper or lower triangle (main diagonal is ignored because its entries are expected to be zero). NPoints - number of points IsUpper - whether upper or lower triangle of D is given. NOTE 1: different clustering algorithms have different limitations: * agglomerative hierarchical clustering algorithms may be used with any kind of distance metric, including one which is given by distance matrix * k-means++ clustering algorithm may be used only with Euclidean distance function and explicitly given points - it cannot be used with dataset given by distance matrix Thus, if you call this function, you will be unable to use k-means clustering algorithm to process your problem. 
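A minimal sketch of clustering a dataset given only by its distance matrix (assuming the dataanalysis.h header of the C++ interface; the distances are made up), using the AHC functions described below:

#include "dataanalysis.h"
using namespace alglib;

int main()
{
    // 3x3 symmetric distance matrix; upper triangle is used, diagonal is ignored
    real_2d_array d = "[[0,1,5],[1,0,4],[5,4,0]]";
    clusterizerstate s;
    ahcreport rep;
    clusterizercreate(s);
    clusterizersetdistances(s, d, 3, true); // NPoints=3, IsUpper=true
    clusterizersetahcalgo(s, 1);            // 1 = single linkage
    clusterizerrunahc(s, rep);              // dendrogram is returned in rep
    return 0;
}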
-- ALGLIB -- Copyright 10.07.2012 by Bochkanov Sergey *************************************************************************/ void clusterizersetdistances(const clusterizerstate &s, const real_2d_array &d, const ae_int_t npoints, const bool isupper); void clusterizersetdistances(const clusterizerstate &s, const real_2d_array &d, const bool isupper); /************************************************************************* This function sets agglomerative hierarchical clustering algorithm INPUT PARAMETERS: S - clusterizer state, initialized by ClusterizerCreate() Algo - algorithm type: * 0 complete linkage (default algorithm) * 1 single linkage * 2 unweighted average linkage * 3 weighted average linkage * 4 Ward's method NOTE: Ward's method works correctly only with Euclidean distance, that's why algorithm will return negative termination code (failure) for any other distance type. It is possible, however, to use this method with user-supplied distance matrix. It is your responsibility to pass one which was calculated with Euclidean distance function. -- ALGLIB -- Copyright 10.07.2012 by Bochkanov Sergey *************************************************************************/ void clusterizersetahcalgo(const clusterizerstate &s, const ae_int_t algo); /************************************************************************* This function sets k-means properties: number of restarts and maximum number of iterations per one run. INPUT PARAMETERS: S - clusterizer state, initialized by ClusterizerCreate() Restarts- restarts count, >=1. k-means++ algorithm performs several restarts and chooses best set of centers (one with minimum squared distance). MaxIts - maximum number of k-means iterations performed during one run. >=0, zero value means that algorithm performs unlimited number of iterations. -- ALGLIB -- Copyright 10.07.2012 by Bochkanov Sergey *************************************************************************/ void clusterizersetkmeanslimits(const clusterizerstate &s, const ae_int_t restarts, const ae_int_t maxits); /************************************************************************* This function sets k-means initialization algorithm. Several different algorithms can be chosen, including k-means++. INPUT PARAMETERS: S - clusterizer state, initialized by ClusterizerCreate() InitAlgo- initialization algorithm: * 0 automatic selection ( different versions of ALGLIB may select different algorithms) * 1 random initialization * 2 k-means++ initialization (best quality of initial centers, but long non-parallelizable initialization phase with bad cache locality) * 3 "fast-greedy" algorithm with efficient, easy to parallelize initialization. Quality of initial centers is somewhat worse than that of k-means++. This algorithm is a default one in the current version of ALGLIB. *-1 "debug" algorithm which always selects first K rows of dataset; this algorithm is used for debug purposes only. Do not use it in the industrial code! -- ALGLIB -- Copyright 21.01.2015 by Bochkanov Sergey *************************************************************************/ void clusterizersetkmeansinit(const clusterizerstate &s, const ae_int_t initalgo); /************************************************************************* This function performs agglomerative hierarchical clustering COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! 
* Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Agglomerative hierarchical clustering algorithm has two phases: ! distance matrix calculation and clustering itself. Only first phase ! (distance matrix calculation) is accelerated by Intel MKL and multi- ! threading. Thus, acceleration is significant only for medium or high- ! dimensional problems. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS: S - clusterizer state, initialized by ClusterizerCreate() OUTPUT PARAMETERS: Rep - clustering results; see description of AHCReport structure for more information. NOTE 1: hierarchical clustering algorithms require large amounts of memory. In particular, this implementation needs sizeof(double)*NPoints^2 bytes, which are used to store distance matrix. In case we work with user-supplied matrix, this amount is multiplied by 2 (we have to store original matrix and to work with its copy). For example, problem with 10000 points would require 800M of RAM, even when working in a 1-dimensional space. -- ALGLIB -- Copyright 10.07.2012 by Bochkanov Sergey *************************************************************************/ void clusterizerrunahc(const clusterizerstate &s, ahcreport &rep); void smp_clusterizerrunahc(const clusterizerstate &s, ahcreport &rep); /************************************************************************* This function performs clustering by k-means++ algorithm. You may change algorithm properties by calling: * ClusterizerSetKMeansLimits() to change number of restarts or iterations * ClusterizerSetKMeansInit() to change initialization algorithm By default, one restart and unlimited number of iterations are used. Initialization algorithm is chosen automatically. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function: ! * multicore support (can be used from C# and C++) ! * access to high-performance C++ core (actual for C# users) ! ! K-means clustering algorithm has two phases: selection of initial ! centers and clustering itself. ALGLIB parallelizes both phases. ! Parallel version is optimized for the following scenario: medium or ! high-dimensional problem (20 or more dimensions) with large number of ! points and clusters. However, some speed-up can be obtained even when ! assumptions above are violated. ! ! As for native-vs-managed comparison, working with native core brings ! 30-40% improvement in speed over pure C# version of ALGLIB. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS: S - clusterizer state, initialized by ClusterizerCreate() K - number of clusters, K>=0. K can be zero only when algorithm is called for empty dataset, in this case completion code is set to success (+1). If K=0 and dataset size is non-zero, we can not meaningfully assign points to some center (there are no centers because K=0) and return -3 as completion code (failure). OUTPUT PARAMETERS: Rep - clustering results; see description of KMeansReport structure for more information. NOTE 1: k-means clustering can be performed only for datasets with Euclidean distance function. 
Algorithm will return negative completion code in Rep.TerminationType in case dataset was added to clusterizer with DistType other than Euclidean (or dataset was specified by distance matrix instead of explicitly given points). -- ALGLIB -- Copyright 10.07.2012 by Bochkanov Sergey *************************************************************************/ void clusterizerrunkmeans(const clusterizerstate &s, const ae_int_t k, kmeansreport &rep); void smp_clusterizerrunkmeans(const clusterizerstate &s, const ae_int_t k, kmeansreport &rep); /************************************************************************* This function returns distance matrix for dataset COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Agglomerative hierarchical clustering algorithm has two phases: ! distance matrix calculation and clustering itself. Only first phase ! (distance matrix calculation) is accelerated by Intel MKL and multi- ! threading. Thus, acceleration is significant only for medium or high- ! dimensional problems. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS: XY - array[NPoints,NFeatures], dataset NPoints - number of points, >=0 NFeatures- number of features, >=1 DistType- distance function: * 0 Chebyshev distance (L-inf norm) * 1 city block distance (L1 norm) * 2 Euclidean distance (L2 norm, non-squared) * 10 Pearson correlation: dist(a,b) = 1-corr(a,b) * 11 Absolute Pearson correlation: dist(a,b) = 1-|corr(a,b)| * 12 Uncentered Pearson correlation (cosine of the angle): dist(a,b) = a'*b/(|a|*|b|) * 13 Absolute uncentered Pearson correlation dist(a,b) = |a'*b|/(|a|*|b|) * 20 Spearman rank correlation: dist(a,b) = 1-rankcorr(a,b) * 21 Absolute Spearman rank correlation dist(a,b) = 1-|rankcorr(a,b)| OUTPUT PARAMETERS: D - array[NPoints,NPoints], distance matrix (full matrix is returned, with lower and upper triangles) NOTE: different distance functions have different performance penalty: * Euclidean or Pearson correlation distances are the fastest ones * Spearman correlation distance function is a bit slower * city block and Chebyshev distances are an order of magnitude slower The reason behind the difference in performance is that correlation-based distance functions are computed using optimized linear algebra kernels, while Chebyshev and city block distance functions are computed using simple nested loops with two branches at each iteration. -- ALGLIB -- Copyright 10.07.2012 by Bochkanov Sergey *************************************************************************/ void clusterizergetdistances(const real_2d_array &xy, const ae_int_t npoints, const ae_int_t nfeatures, const ae_int_t disttype, real_2d_array &d); void smp_clusterizergetdistances(const real_2d_array &xy, const ae_int_t npoints, const ae_int_t nfeatures, const ae_int_t disttype, real_2d_array &d); /************************************************************************* This function takes as input clusterization report Rep, desired clusters count K, and builds top K clusters from hierarchical clusterization tree. It returns assignment of points to clusters (array of cluster indexes). 
INPUT PARAMETERS: Rep - report from ClusterizerRunAHC() performed on XY K - desired number of clusters, 1<=K<=NPoints. K can be zero only when NPoints=0. OUTPUT PARAMETERS: CIdx - array[NPoints], I-th element contains cluster index (from 0 to K-1) for I-th point of the dataset. CZ - array[K]. This array allows to convert cluster indexes returned by this function to indexes used by Rep.Z. J-th cluster returned by this function corresponds to CZ[J]-th cluster stored in Rep.Z/PZ/PM. It is guaranteed that CZ[I]<CZ[I+1]. OUTPUT PARAMETERS: K - number of clusters, 1<=K<=NPoints CIdx - array[NPoints], I-th element contains cluster index (from 0 to K-1) for I-th point of the dataset. CZ - array[K]. This array allows to convert cluster indexes returned by this function to indexes used by Rep.Z. J-th cluster returned by this function corresponds to CZ[J]-th cluster stored in Rep.Z/PZ/PM. It is guaranteed that CZ[I]<CZ[I+1]. NVars - number of independent variables, NVars>=1 NClasses - task type: * NClasses=1 - regression task with one dependent variable * NClasses>1 - classification task with NClasses classes. NTrees - number of trees in a forest, NTrees>=1. recommended values: 50-100. R - percent of a training set used to build individual trees. 0<R<=1. OUTPUT PARAMETERS: Info - return code: * -2, if there is a point with class number outside of [0..NClasses-1]. * -1, if incorrect parameters were passed (NPoints<1, NVars<1, NClasses<1, NTrees<1, R<=0 or R>1). * 1, if task has been solved DF - model built Rep - training report, contains error on a training set and out-of-bag estimates of generalization error. -- ALGLIB -- Copyright 19.02.2009 by Bochkanov Sergey *************************************************************************/ void dfbuildrandomdecisionforest(const real_2d_array &xy, const ae_int_t npoints, const ae_int_t nvars, const ae_int_t nclasses, const ae_int_t ntrees, const double r, ae_int_t &info, decisionforest &df, dfreport &rep); /************************************************************************* This subroutine builds random decision forest. This function gives the ability to tune the number of variables used when choosing the best split. INPUT PARAMETERS: XY - training set NPoints - training set size, NPoints>=1 NVars - number of independent variables, NVars>=1 NClasses - task type: * NClasses=1 - regression task with one dependent variable * NClasses>1 - classification task with NClasses classes. NTrees - number of trees in a forest, NTrees>=1. recommended values: 50-100. NRndVars - number of variables used when choosing best split R - percent of a training set used to build individual trees. 0<R<=1. OUTPUT PARAMETERS: Info - return code: * -2, if there is a point with class number outside of [0..NClasses-1]. * -1, if incorrect parameters were passed (NPoints<1, NVars<1, NClasses<1, NTrees<1, R<=0 or R>1). * 1, if task has been solved DF - model built Rep - training report, contains error on a training set and out-of-bag estimates of generalization error. -- ALGLIB -- Copyright 19.02.2009 by Bochkanov Sergey *************************************************************************/ void dfbuildrandomdecisionforestx1(const real_2d_array &xy, const ae_int_t npoints, const ae_int_t nvars, const ae_int_t nclasses, const ae_int_t ntrees, const ae_int_t nrndvars, const double r, ae_int_t &info, decisionforest &df, dfreport &rep); /************************************************************************* Processing INPUT PARAMETERS: DF - decision forest model X - input vector, array[0..NVars-1]. OUTPUT PARAMETERS: Y - result. Regression estimate when solving regression task, vector of posterior probabilities for classification task. See also DFProcessI. 
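A minimal sketch of building a forest and processing a new input vector (assuming the dataanalysis.h header of the C++ interface; the data values are made up):

#include "dataanalysis.h"
using namespace alglib;

int main()
{
    // tiny 2-input, 2-class training set; last column stores class number
    real_2d_array xy = "[[0.1,0.1,0],[0.2,0.1,0],[0.8,0.9,1],[0.9,0.8,1]]";
    ae_int_t info;
    decisionforest df;
    dfreport rep;
    dfbuildrandomdecisionforest(xy, 4, 2, 2, 50, 0.66, info, df, rep);
    real_1d_array x = "[0.15,0.12]", y;
    y.setlength(2);
    dfprocess(df, x, y);   // y[0], y[1] - posterior probabilities of classes 0 and 1
    return 0;
}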
-- ALGLIB -- Copyright 16.02.2009 by Bochkanov Sergey *************************************************************************/ void dfprocess(const decisionforest &df, const real_1d_array &x, real_1d_array &y); /************************************************************************* 'interactive' variant of DFProcess for languages like Python which support constructs like "Y = DFProcessI(DF,X)" and interactive mode of interpreter This function allocates new array on each call, so it is significantly slower than its 'non-interactive' counterpart, but it is more convenient when you call it from command line. -- ALGLIB -- Copyright 28.02.2010 by Bochkanov Sergey *************************************************************************/ void dfprocessi(const decisionforest &df, const real_1d_array &x, real_1d_array &y); /************************************************************************* Relative classification error on the test set INPUT PARAMETERS: DF - decision forest model XY - test set NPoints - test set size RESULT: percent of incorrectly classified cases. Zero if model solves regression task. -- ALGLIB -- Copyright 16.02.2009 by Bochkanov Sergey *************************************************************************/ double dfrelclserror(const decisionforest &df, const real_2d_array &xy, const ae_int_t npoints); /************************************************************************* Average cross-entropy (in bits per element) on the test set INPUT PARAMETERS: DF - decision forest model XY - test set NPoints - test set size RESULT: CrossEntropy/(NPoints*LN(2)). Zero if model solves regression task. -- ALGLIB -- Copyright 16.02.2009 by Bochkanov Sergey *************************************************************************/ double dfavgce(const decisionforest &df, const real_2d_array &xy, const ae_int_t npoints); /************************************************************************* RMS error on the test set INPUT PARAMETERS: DF - decision forest model XY - test set NPoints - test set size RESULT: root mean square error. Its meaning for regression task is obvious. As for classification task, RMS error means error when estimating posterior probabilities. -- ALGLIB -- Copyright 16.02.2009 by Bochkanov Sergey *************************************************************************/ double dfrmserror(const decisionforest &df, const real_2d_array &xy, const ae_int_t npoints); /************************************************************************* Average error on the test set INPUT PARAMETERS: DF - decision forest model XY - test set NPoints - test set size RESULT: Its meaning for regression task is obvious. As for classification task, it means average error when estimating posterior probabilities. -- ALGLIB -- Copyright 16.02.2009 by Bochkanov Sergey *************************************************************************/ double dfavgerror(const decisionforest &df, const real_2d_array &xy, const ae_int_t npoints); /************************************************************************* Average relative error on the test set INPUT PARAMETERS: DF - decision forest model XY - test set NPoints - test set size RESULT: Its meaning for regression task is obvious. As for classification task, it means average relative error when estimating posterior probability of belonging to the correct class. 
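To show how these test-set metrics are typically queried together, a minimal sketch (assuming the dataanalysis.h header of the C++ interface; the forest is built from the same made-up data as in the previous sketch and, purely for illustration, evaluated on it):

#include "dataanalysis.h"
#include <stdio.h>
using namespace alglib;

int main()
{
    real_2d_array xy = "[[0.1,0.1,0],[0.2,0.1,0],[0.8,0.9,1],[0.9,0.8,1]]";
    ae_int_t info;
    decisionforest df;
    dfreport rep;
    dfbuildrandomdecisionforest(xy, 4, 2, 2, 50, 0.66, info, df, rep);
    // here the training set doubles as the test set, just to show the calls
    printf("relcls=%f rms=%f avgrel=%f\n",
           dfrelclserror(df, xy, 4),   // relative classification error
           dfrmserror(df, xy, 4),      // RMS error of posterior probabilities
           dfavgrelerror(df, xy, 4));  // average relative error
    return 0;
}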
-- ALGLIB -- Copyright 16.02.2009 by Bochkanov Sergey *************************************************************************/ double dfavgrelerror(const decisionforest &df, const real_2d_array &xy, const ae_int_t npoints); /************************************************************************* k-means++ clusterization. Backward compatibility function, we recommend to use CLUSTERING subpackage as better replacement. -- ALGLIB -- Copyright 21.03.2009 by Bochkanov Sergey *************************************************************************/ void kmeansgenerate(const real_2d_array &xy, const ae_int_t npoints, const ae_int_t nvars, const ae_int_t k, const ae_int_t restarts, ae_int_t &info, real_2d_array &c, integer_1d_array &xyc); } ///////////////////////////////////////////////////////////////////////// // // THIS SECTION CONTAINS COMPUTATIONAL CORE DECLARATIONS (FUNCTIONS) // ///////////////////////////////////////////////////////////////////////// namespace alglib_impl { void pcabuildbasis(/* Real */ ae_matrix* x, ae_int_t npoints, ae_int_t nvars, ae_int_t* info, /* Real */ ae_vector* s2, /* Real */ ae_matrix* v, ae_state *_state); void _pexec_pcabuildbasis(/* Real */ ae_matrix* x, ae_int_t npoints, ae_int_t nvars, ae_int_t* info, /* Real */ ae_vector* s2, /* Real */ ae_matrix* v, ae_state *_state); void pcatruncatedsubspace(/* Real */ ae_matrix* x, ae_int_t npoints, ae_int_t nvars, ae_int_t nneeded, double eps, ae_int_t maxits, /* Real */ ae_vector* s2, /* Real */ ae_matrix* v, ae_state *_state); void _pexec_pcatruncatedsubspace(/* Real */ ae_matrix* x, ae_int_t npoints, ae_int_t nvars, ae_int_t nneeded, double eps, ae_int_t maxits, /* Real */ ae_vector* s2, /* Real */ ae_matrix* v, ae_state *_state); void dserrallocate(ae_int_t nclasses, /* Real */ ae_vector* buf, ae_state *_state); void dserraccumulate(/* Real */ ae_vector* buf, /* Real */ ae_vector* y, /* Real */ ae_vector* desiredy, ae_state *_state); void dserrfinish(/* Real */ ae_vector* buf, ae_state *_state); void dsnormalize(/* Real */ ae_matrix* xy, ae_int_t npoints, ae_int_t nvars, ae_int_t* info, /* Real */ ae_vector* means, /* Real */ ae_vector* sigmas, ae_state *_state); void dsnormalizec(/* Real */ ae_matrix* xy, ae_int_t npoints, ae_int_t nvars, ae_int_t* info, /* Real */ ae_vector* means, /* Real */ ae_vector* sigmas, ae_state *_state); double dsgetmeanmindistance(/* Real */ ae_matrix* xy, ae_int_t npoints, ae_int_t nvars, ae_state *_state); void dstie(/* Real */ ae_vector* a, ae_int_t n, /* Integer */ ae_vector* ties, ae_int_t* tiecount, /* Integer */ ae_vector* p1, /* Integer */ ae_vector* p2, ae_state *_state); void dstiefasti(/* Real */ ae_vector* a, /* Integer */ ae_vector* b, ae_int_t n, /* Integer */ ae_vector* ties, ae_int_t* tiecount, /* Real */ ae_vector* bufr, /* Integer */ ae_vector* bufi, ae_state *_state); void dsoptimalsplit2(/* Real */ ae_vector* a, /* Integer */ ae_vector* c, ae_int_t n, ae_int_t* info, double* threshold, double* pal, double* pbl, double* par, double* pbr, double* cve, ae_state *_state); void dsoptimalsplit2fast(/* Real */ ae_vector* a, /* Integer */ ae_vector* c, /* Integer */ ae_vector* tiesbuf, /* Integer */ ae_vector* cntbuf, /* Real */ ae_vector* bufr, /* Integer */ ae_vector* bufi, ae_int_t n, ae_int_t nc, double alpha, ae_int_t* info, double* threshold, double* rms, double* cvrms, ae_state *_state); void dssplitk(/* Real */ ae_vector* a, /* Integer */ ae_vector* c, ae_int_t n, ae_int_t nc, ae_int_t kmax, ae_int_t* info, /* Real */ ae_vector* thresholds, ae_int_t* ni, 
double* cve, ae_state *_state); void dsoptimalsplitk(/* Real */ ae_vector* a, /* Integer */ ae_vector* c, ae_int_t n, ae_int_t nc, ae_int_t kmax, ae_int_t* info, /* Real */ ae_vector* thresholds, ae_int_t* ni, double* cve, ae_state *_state); void _cvreport_init(void* _p, ae_state *_state); void _cvreport_init_copy(void* _dst, void* _src, ae_state *_state); void _cvreport_clear(void* _p); void _cvreport_destroy(void* _p); ae_int_t mlpgradsplitcost(ae_state *_state); ae_int_t mlpgradsplitsize(ae_state *_state); void mlpcreate0(ae_int_t nin, ae_int_t nout, multilayerperceptron* network, ae_state *_state); void mlpcreate1(ae_int_t nin, ae_int_t nhid, ae_int_t nout, multilayerperceptron* network, ae_state *_state); void mlpcreate2(ae_int_t nin, ae_int_t nhid1, ae_int_t nhid2, ae_int_t nout, multilayerperceptron* network, ae_state *_state); void mlpcreateb0(ae_int_t nin, ae_int_t nout, double b, double d, multilayerperceptron* network, ae_state *_state); void mlpcreateb1(ae_int_t nin, ae_int_t nhid, ae_int_t nout, double b, double d, multilayerperceptron* network, ae_state *_state); void mlpcreateb2(ae_int_t nin, ae_int_t nhid1, ae_int_t nhid2, ae_int_t nout, double b, double d, multilayerperceptron* network, ae_state *_state); void mlpcreater0(ae_int_t nin, ae_int_t nout, double a, double b, multilayerperceptron* network, ae_state *_state); void mlpcreater1(ae_int_t nin, ae_int_t nhid, ae_int_t nout, double a, double b, multilayerperceptron* network, ae_state *_state); void mlpcreater2(ae_int_t nin, ae_int_t nhid1, ae_int_t nhid2, ae_int_t nout, double a, double b, multilayerperceptron* network, ae_state *_state); void mlpcreatec0(ae_int_t nin, ae_int_t nout, multilayerperceptron* network, ae_state *_state); void mlpcreatec1(ae_int_t nin, ae_int_t nhid, ae_int_t nout, multilayerperceptron* network, ae_state *_state); void mlpcreatec2(ae_int_t nin, ae_int_t nhid1, ae_int_t nhid2, ae_int_t nout, multilayerperceptron* network, ae_state *_state); void mlpcopy(multilayerperceptron* network1, multilayerperceptron* network2, ae_state *_state); void mlpcopyshared(multilayerperceptron* network1, multilayerperceptron* network2, ae_state *_state); ae_bool mlpsamearchitecture(multilayerperceptron* network1, multilayerperceptron* network2, ae_state *_state); void mlpcopytunableparameters(multilayerperceptron* network1, multilayerperceptron* network2, ae_state *_state); void mlpexporttunableparameters(multilayerperceptron* network, /* Real */ ae_vector* p, ae_int_t* pcount, ae_state *_state); void mlpimporttunableparameters(multilayerperceptron* network, /* Real */ ae_vector* p, ae_state *_state); void mlpserializeold(multilayerperceptron* network, /* Real */ ae_vector* ra, ae_int_t* rlen, ae_state *_state); void mlpunserializeold(/* Real */ ae_vector* ra, multilayerperceptron* network, ae_state *_state); void mlprandomize(multilayerperceptron* network, ae_state *_state); void mlprandomizefull(multilayerperceptron* network, ae_state *_state); void mlpinitpreprocessor(multilayerperceptron* network, /* Real */ ae_matrix* xy, ae_int_t ssize, ae_state *_state); void mlpinitpreprocessorsparse(multilayerperceptron* network, sparsematrix* xy, ae_int_t ssize, ae_state *_state); void mlpinitpreprocessorsubset(multilayerperceptron* network, /* Real */ ae_matrix* xy, ae_int_t setsize, /* Integer */ ae_vector* idx, ae_int_t subsetsize, ae_state *_state); void mlpinitpreprocessorsparsesubset(multilayerperceptron* network, sparsematrix* xy, ae_int_t setsize, /* Integer */ ae_vector* idx, ae_int_t subsetsize, ae_state 
*_state); void mlpproperties(multilayerperceptron* network, ae_int_t* nin, ae_int_t* nout, ae_int_t* wcount, ae_state *_state); ae_int_t mlpntotal(multilayerperceptron* network, ae_state *_state); ae_int_t mlpgetinputscount(multilayerperceptron* network, ae_state *_state); ae_int_t mlpgetoutputscount(multilayerperceptron* network, ae_state *_state); ae_int_t mlpgetweightscount(multilayerperceptron* network, ae_state *_state); ae_bool mlpissoftmax(multilayerperceptron* network, ae_state *_state); ae_int_t mlpgetlayerscount(multilayerperceptron* network, ae_state *_state); ae_int_t mlpgetlayersize(multilayerperceptron* network, ae_int_t k, ae_state *_state); void mlpgetinputscaling(multilayerperceptron* network, ae_int_t i, double* mean, double* sigma, ae_state *_state); void mlpgetoutputscaling(multilayerperceptron* network, ae_int_t i, double* mean, double* sigma, ae_state *_state); void mlpgetneuroninfo(multilayerperceptron* network, ae_int_t k, ae_int_t i, ae_int_t* fkind, double* threshold, ae_state *_state); double mlpgetweight(multilayerperceptron* network, ae_int_t k0, ae_int_t i0, ae_int_t k1, ae_int_t i1, ae_state *_state); void mlpsetinputscaling(multilayerperceptron* network, ae_int_t i, double mean, double sigma, ae_state *_state); void mlpsetoutputscaling(multilayerperceptron* network, ae_int_t i, double mean, double sigma, ae_state *_state); void mlpsetneuroninfo(multilayerperceptron* network, ae_int_t k, ae_int_t i, ae_int_t fkind, double threshold, ae_state *_state); void mlpsetweight(multilayerperceptron* network, ae_int_t k0, ae_int_t i0, ae_int_t k1, ae_int_t i1, double w, ae_state *_state); void mlpactivationfunction(double net, ae_int_t k, double* f, double* df, double* d2f, ae_state *_state); void mlpprocess(multilayerperceptron* network, /* Real */ ae_vector* x, /* Real */ ae_vector* y, ae_state *_state); void mlpprocessi(multilayerperceptron* network, /* Real */ ae_vector* x, /* Real */ ae_vector* y, ae_state *_state); double mlperror(multilayerperceptron* network, /* Real */ ae_matrix* xy, ae_int_t npoints, ae_state *_state); double _pexec_mlperror(multilayerperceptron* network, /* Real */ ae_matrix* xy, ae_int_t npoints, ae_state *_state); double mlperrorsparse(multilayerperceptron* network, sparsematrix* xy, ae_int_t npoints, ae_state *_state); double _pexec_mlperrorsparse(multilayerperceptron* network, sparsematrix* xy, ae_int_t npoints, ae_state *_state); double mlperrorn(multilayerperceptron* network, /* Real */ ae_matrix* xy, ae_int_t ssize, ae_state *_state); ae_int_t mlpclserror(multilayerperceptron* network, /* Real */ ae_matrix* xy, ae_int_t npoints, ae_state *_state); ae_int_t _pexec_mlpclserror(multilayerperceptron* network, /* Real */ ae_matrix* xy, ae_int_t npoints, ae_state *_state); double mlprelclserror(multilayerperceptron* network, /* Real */ ae_matrix* xy, ae_int_t npoints, ae_state *_state); double _pexec_mlprelclserror(multilayerperceptron* network, /* Real */ ae_matrix* xy, ae_int_t npoints, ae_state *_state); double mlprelclserrorsparse(multilayerperceptron* network, sparsematrix* xy, ae_int_t npoints, ae_state *_state); double _pexec_mlprelclserrorsparse(multilayerperceptron* network, sparsematrix* xy, ae_int_t npoints, ae_state *_state); double mlpavgce(multilayerperceptron* network, /* Real */ ae_matrix* xy, ae_int_t npoints, ae_state *_state); double _pexec_mlpavgce(multilayerperceptron* network, /* Real */ ae_matrix* xy, ae_int_t npoints, ae_state *_state); double mlpavgcesparse(multilayerperceptron* network, sparsematrix* xy, ae_int_t 
npoints, ae_state *_state); double _pexec_mlpavgcesparse(multilayerperceptron* network, sparsematrix* xy, ae_int_t npoints, ae_state *_state); double mlprmserror(multilayerperceptron* network, /* Real */ ae_matrix* xy, ae_int_t npoints, ae_state *_state); double _pexec_mlprmserror(multilayerperceptron* network, /* Real */ ae_matrix* xy, ae_int_t npoints, ae_state *_state); double mlprmserrorsparse(multilayerperceptron* network, sparsematrix* xy, ae_int_t npoints, ae_state *_state); double _pexec_mlprmserrorsparse(multilayerperceptron* network, sparsematrix* xy, ae_int_t npoints, ae_state *_state); double mlpavgerror(multilayerperceptron* network, /* Real */ ae_matrix* xy, ae_int_t npoints, ae_state *_state); double _pexec_mlpavgerror(multilayerperceptron* network, /* Real */ ae_matrix* xy, ae_int_t npoints, ae_state *_state); double mlpavgerrorsparse(multilayerperceptron* network, sparsematrix* xy, ae_int_t npoints, ae_state *_state); double _pexec_mlpavgerrorsparse(multilayerperceptron* network, sparsematrix* xy, ae_int_t npoints, ae_state *_state); double mlpavgrelerror(multilayerperceptron* network, /* Real */ ae_matrix* xy, ae_int_t npoints, ae_state *_state); double _pexec_mlpavgrelerror(multilayerperceptron* network, /* Real */ ae_matrix* xy, ae_int_t npoints, ae_state *_state); double mlpavgrelerrorsparse(multilayerperceptron* network, sparsematrix* xy, ae_int_t npoints, ae_state *_state); double _pexec_mlpavgrelerrorsparse(multilayerperceptron* network, sparsematrix* xy, ae_int_t npoints, ae_state *_state); void mlpgrad(multilayerperceptron* network, /* Real */ ae_vector* x, /* Real */ ae_vector* desiredy, double* e, /* Real */ ae_vector* grad, ae_state *_state); void mlpgradn(multilayerperceptron* network, /* Real */ ae_vector* x, /* Real */ ae_vector* desiredy, double* e, /* Real */ ae_vector* grad, ae_state *_state); void mlpgradbatch(multilayerperceptron* network, /* Real */ ae_matrix* xy, ae_int_t ssize, double* e, /* Real */ ae_vector* grad, ae_state *_state); void _pexec_mlpgradbatch(multilayerperceptron* network, /* Real */ ae_matrix* xy, ae_int_t ssize, double* e, /* Real */ ae_vector* grad, ae_state *_state); void mlpgradbatchsparse(multilayerperceptron* network, sparsematrix* xy, ae_int_t ssize, double* e, /* Real */ ae_vector* grad, ae_state *_state); void _pexec_mlpgradbatchsparse(multilayerperceptron* network, sparsematrix* xy, ae_int_t ssize, double* e, /* Real */ ae_vector* grad, ae_state *_state); void mlpgradbatchsubset(multilayerperceptron* network, /* Real */ ae_matrix* xy, ae_int_t setsize, /* Integer */ ae_vector* idx, ae_int_t subsetsize, double* e, /* Real */ ae_vector* grad, ae_state *_state); void _pexec_mlpgradbatchsubset(multilayerperceptron* network, /* Real */ ae_matrix* xy, ae_int_t setsize, /* Integer */ ae_vector* idx, ae_int_t subsetsize, double* e, /* Real */ ae_vector* grad, ae_state *_state); void mlpgradbatchsparsesubset(multilayerperceptron* network, sparsematrix* xy, ae_int_t setsize, /* Integer */ ae_vector* idx, ae_int_t subsetsize, double* e, /* Real */ ae_vector* grad, ae_state *_state); void _pexec_mlpgradbatchsparsesubset(multilayerperceptron* network, sparsematrix* xy, ae_int_t setsize, /* Integer */ ae_vector* idx, ae_int_t subsetsize, double* e, /* Real */ ae_vector* grad, ae_state *_state); void mlpgradbatchx(multilayerperceptron* network, /* Real */ ae_matrix* densexy, sparsematrix* sparsexy, ae_int_t datasetsize, ae_int_t datasettype, /* Integer */ ae_vector* idx, ae_int_t subset0, ae_int_t subset1, ae_int_t subsettype, 
ae_shared_pool* buf, ae_shared_pool* gradbuf, ae_state *_state); void mlpgradnbatch(multilayerperceptron* network, /* Real */ ae_matrix* xy, ae_int_t ssize, double* e, /* Real */ ae_vector* grad, ae_state *_state); void mlphessiannbatch(multilayerperceptron* network, /* Real */ ae_matrix* xy, ae_int_t ssize, double* e, /* Real */ ae_vector* grad, /* Real */ ae_matrix* h, ae_state *_state); void mlphessianbatch(multilayerperceptron* network, /* Real */ ae_matrix* xy, ae_int_t ssize, double* e, /* Real */ ae_vector* grad, /* Real */ ae_matrix* h, ae_state *_state); void mlpinternalprocessvector(/* Integer */ ae_vector* structinfo, /* Real */ ae_vector* weights, /* Real */ ae_vector* columnmeans, /* Real */ ae_vector* columnsigmas, /* Real */ ae_vector* neurons, /* Real */ ae_vector* dfdnet, /* Real */ ae_vector* x, /* Real */ ae_vector* y, ae_state *_state); void mlpalloc(ae_serializer* s, multilayerperceptron* network, ae_state *_state); void mlpserialize(ae_serializer* s, multilayerperceptron* network, ae_state *_state); void mlpunserialize(ae_serializer* s, multilayerperceptron* network, ae_state *_state); void mlpallerrorssubset(multilayerperceptron* network, /* Real */ ae_matrix* xy, ae_int_t setsize, /* Integer */ ae_vector* subset, ae_int_t subsetsize, modelerrors* rep, ae_state *_state); void _pexec_mlpallerrorssubset(multilayerperceptron* network, /* Real */ ae_matrix* xy, ae_int_t setsize, /* Integer */ ae_vector* subset, ae_int_t subsetsize, modelerrors* rep, ae_state *_state); void mlpallerrorssparsesubset(multilayerperceptron* network, sparsematrix* xy, ae_int_t setsize, /* Integer */ ae_vector* subset, ae_int_t subsetsize, modelerrors* rep, ae_state *_state); void _pexec_mlpallerrorssparsesubset(multilayerperceptron* network, sparsematrix* xy, ae_int_t setsize, /* Integer */ ae_vector* subset, ae_int_t subsetsize, modelerrors* rep, ae_state *_state); double mlperrorsubset(multilayerperceptron* network, /* Real */ ae_matrix* xy, ae_int_t setsize, /* Integer */ ae_vector* subset, ae_int_t subsetsize, ae_state *_state); double _pexec_mlperrorsubset(multilayerperceptron* network, /* Real */ ae_matrix* xy, ae_int_t setsize, /* Integer */ ae_vector* subset, ae_int_t subsetsize, ae_state *_state); double mlperrorsparsesubset(multilayerperceptron* network, sparsematrix* xy, ae_int_t setsize, /* Integer */ ae_vector* subset, ae_int_t subsetsize, ae_state *_state); double _pexec_mlperrorsparsesubset(multilayerperceptron* network, sparsematrix* xy, ae_int_t setsize, /* Integer */ ae_vector* subset, ae_int_t subsetsize, ae_state *_state); void mlpallerrorsx(multilayerperceptron* network, /* Real */ ae_matrix* densexy, sparsematrix* sparsexy, ae_int_t datasetsize, ae_int_t datasettype, /* Integer */ ae_vector* idx, ae_int_t subset0, ae_int_t subset1, ae_int_t subsettype, ae_shared_pool* buf, modelerrors* rep, ae_state *_state); void _modelerrors_init(void* _p, ae_state *_state); void _modelerrors_init_copy(void* _dst, void* _src, ae_state *_state); void _modelerrors_clear(void* _p); void _modelerrors_destroy(void* _p); void _smlpgrad_init(void* _p, ae_state *_state); void _smlpgrad_init_copy(void* _dst, void* _src, ae_state *_state); void _smlpgrad_clear(void* _p); void _smlpgrad_destroy(void* _p); void _multilayerperceptron_init(void* _p, ae_state *_state); void _multilayerperceptron_init_copy(void* _dst, void* _src, ae_state *_state); void _multilayerperceptron_clear(void* _p); void _multilayerperceptron_destroy(void* _p); void fisherlda(/* Real */ ae_matrix* xy, ae_int_t npoints, 
ae_int_t nvars, ae_int_t nclasses, ae_int_t* info, /* Real */ ae_vector* w, ae_state *_state); void fisherldan(/* Real */ ae_matrix* xy, ae_int_t npoints, ae_int_t nvars, ae_int_t nclasses, ae_int_t* info, /* Real */ ae_matrix* w, ae_state *_state); void _pexec_fisherldan(/* Real */ ae_matrix* xy, ae_int_t npoints, ae_int_t nvars, ae_int_t nclasses, ae_int_t* info, /* Real */ ae_matrix* w, ae_state *_state); void lrbuild(/* Real */ ae_matrix* xy, ae_int_t npoints, ae_int_t nvars, ae_int_t* info, linearmodel* lm, lrreport* ar, ae_state *_state); void lrbuilds(/* Real */ ae_matrix* xy, /* Real */ ae_vector* s, ae_int_t npoints, ae_int_t nvars, ae_int_t* info, linearmodel* lm, lrreport* ar, ae_state *_state); void lrbuildzs(/* Real */ ae_matrix* xy, /* Real */ ae_vector* s, ae_int_t npoints, ae_int_t nvars, ae_int_t* info, linearmodel* lm, lrreport* ar, ae_state *_state); void lrbuildz(/* Real */ ae_matrix* xy, ae_int_t npoints, ae_int_t nvars, ae_int_t* info, linearmodel* lm, lrreport* ar, ae_state *_state); void lrunpack(linearmodel* lm, /* Real */ ae_vector* v, ae_int_t* nvars, ae_state *_state); void lrpack(/* Real */ ae_vector* v, ae_int_t nvars, linearmodel* lm, ae_state *_state); double lrprocess(linearmodel* lm, /* Real */ ae_vector* x, ae_state *_state); double lrrmserror(linearmodel* lm, /* Real */ ae_matrix* xy, ae_int_t npoints, ae_state *_state); double lravgerror(linearmodel* lm, /* Real */ ae_matrix* xy, ae_int_t npoints, ae_state *_state); double lravgrelerror(linearmodel* lm, /* Real */ ae_matrix* xy, ae_int_t npoints, ae_state *_state); void lrcopy(linearmodel* lm1, linearmodel* lm2, ae_state *_state); void lrlines(/* Real */ ae_matrix* xy, /* Real */ ae_vector* s, ae_int_t n, ae_int_t* info, double* a, double* b, double* vara, double* varb, double* covab, double* corrab, double* p, ae_state *_state); void lrline(/* Real */ ae_matrix* xy, ae_int_t n, ae_int_t* info, double* a, double* b, ae_state *_state); void _linearmodel_init(void* _p, ae_state *_state); void _linearmodel_init_copy(void* _dst, void* _src, ae_state *_state); void _linearmodel_clear(void* _p); void _linearmodel_destroy(void* _p); void _lrreport_init(void* _p, ae_state *_state); void _lrreport_init_copy(void* _dst, void* _src, ae_state *_state); void _lrreport_clear(void* _p); void _lrreport_destroy(void* _p); void filtersma(/* Real */ ae_vector* x, ae_int_t n, ae_int_t k, ae_state *_state); void filterema(/* Real */ ae_vector* x, ae_int_t n, double alpha, ae_state *_state); void filterlrma(/* Real */ ae_vector* x, ae_int_t n, ae_int_t k, ae_state *_state); void mnltrainh(/* Real */ ae_matrix* xy, ae_int_t npoints, ae_int_t nvars, ae_int_t nclasses, ae_int_t* info, logitmodel* lm, mnlreport* rep, ae_state *_state); void mnlprocess(logitmodel* lm, /* Real */ ae_vector* x, /* Real */ ae_vector* y, ae_state *_state); void mnlprocessi(logitmodel* lm, /* Real */ ae_vector* x, /* Real */ ae_vector* y, ae_state *_state); void mnlunpack(logitmodel* lm, /* Real */ ae_matrix* a, ae_int_t* nvars, ae_int_t* nclasses, ae_state *_state); void mnlpack(/* Real */ ae_matrix* a, ae_int_t nvars, ae_int_t nclasses, logitmodel* lm, ae_state *_state); void mnlcopy(logitmodel* lm1, logitmodel* lm2, ae_state *_state); double mnlavgce(logitmodel* lm, /* Real */ ae_matrix* xy, ae_int_t npoints, ae_state *_state); double mnlrelclserror(logitmodel* lm, /* Real */ ae_matrix* xy, ae_int_t npoints, ae_state *_state); double mnlrmserror(logitmodel* lm, /* Real */ ae_matrix* xy, ae_int_t npoints, ae_state *_state); double 
mnlavgerror(logitmodel* lm, /* Real */ ae_matrix* xy, ae_int_t npoints, ae_state *_state); double mnlavgrelerror(logitmodel* lm, /* Real */ ae_matrix* xy, ae_int_t ssize, ae_state *_state); ae_int_t mnlclserror(logitmodel* lm, /* Real */ ae_matrix* xy, ae_int_t npoints, ae_state *_state); void _logitmodel_init(void* _p, ae_state *_state); void _logitmodel_init_copy(void* _dst, void* _src, ae_state *_state); void _logitmodel_clear(void* _p); void _logitmodel_destroy(void* _p); void _logitmcstate_init(void* _p, ae_state *_state); void _logitmcstate_init_copy(void* _dst, void* _src, ae_state *_state); void _logitmcstate_clear(void* _p); void _logitmcstate_destroy(void* _p); void _mnlreport_init(void* _p, ae_state *_state); void _mnlreport_init_copy(void* _dst, void* _src, ae_state *_state); void _mnlreport_clear(void* _p); void _mnlreport_destroy(void* _p); void mcpdcreate(ae_int_t n, mcpdstate* s, ae_state *_state); void mcpdcreateentry(ae_int_t n, ae_int_t entrystate, mcpdstate* s, ae_state *_state); void mcpdcreateexit(ae_int_t n, ae_int_t exitstate, mcpdstate* s, ae_state *_state); void mcpdcreateentryexit(ae_int_t n, ae_int_t entrystate, ae_int_t exitstate, mcpdstate* s, ae_state *_state); void mcpdaddtrack(mcpdstate* s, /* Real */ ae_matrix* xy, ae_int_t k, ae_state *_state); void mcpdsetec(mcpdstate* s, /* Real */ ae_matrix* ec, ae_state *_state); void mcpdaddec(mcpdstate* s, ae_int_t i, ae_int_t j, double c, ae_state *_state); void mcpdsetbc(mcpdstate* s, /* Real */ ae_matrix* bndl, /* Real */ ae_matrix* bndu, ae_state *_state); void mcpdaddbc(mcpdstate* s, ae_int_t i, ae_int_t j, double bndl, double bndu, ae_state *_state); void mcpdsetlc(mcpdstate* s, /* Real */ ae_matrix* c, /* Integer */ ae_vector* ct, ae_int_t k, ae_state *_state); void mcpdsettikhonovregularizer(mcpdstate* s, double v, ae_state *_state); void mcpdsetprior(mcpdstate* s, /* Real */ ae_matrix* pp, ae_state *_state); void mcpdsetpredictionweights(mcpdstate* s, /* Real */ ae_vector* pw, ae_state *_state); void mcpdsolve(mcpdstate* s, ae_state *_state); void mcpdresults(mcpdstate* s, /* Real */ ae_matrix* p, mcpdreport* rep, ae_state *_state); void _mcpdstate_init(void* _p, ae_state *_state); void _mcpdstate_init_copy(void* _dst, void* _src, ae_state *_state); void _mcpdstate_clear(void* _p); void _mcpdstate_destroy(void* _p); void _mcpdreport_init(void* _p, ae_state *_state); void _mcpdreport_init_copy(void* _dst, void* _src, ae_state *_state); void _mcpdreport_clear(void* _p); void _mcpdreport_destroy(void* _p); void mlpecreate0(ae_int_t nin, ae_int_t nout, ae_int_t ensemblesize, mlpensemble* ensemble, ae_state *_state); void mlpecreate1(ae_int_t nin, ae_int_t nhid, ae_int_t nout, ae_int_t ensemblesize, mlpensemble* ensemble, ae_state *_state); void mlpecreate2(ae_int_t nin, ae_int_t nhid1, ae_int_t nhid2, ae_int_t nout, ae_int_t ensemblesize, mlpensemble* ensemble, ae_state *_state); void mlpecreateb0(ae_int_t nin, ae_int_t nout, double b, double d, ae_int_t ensemblesize, mlpensemble* ensemble, ae_state *_state); void mlpecreateb1(ae_int_t nin, ae_int_t nhid, ae_int_t nout, double b, double d, ae_int_t ensemblesize, mlpensemble* ensemble, ae_state *_state); void mlpecreateb2(ae_int_t nin, ae_int_t nhid1, ae_int_t nhid2, ae_int_t nout, double b, double d, ae_int_t ensemblesize, mlpensemble* ensemble, ae_state *_state); void mlpecreater0(ae_int_t nin, ae_int_t nout, double a, double b, ae_int_t ensemblesize, mlpensemble* ensemble, ae_state *_state); void mlpecreater1(ae_int_t nin, ae_int_t nhid, ae_int_t nout, double 
a, double b, ae_int_t ensemblesize, mlpensemble* ensemble, ae_state *_state); void mlpecreater2(ae_int_t nin, ae_int_t nhid1, ae_int_t nhid2, ae_int_t nout, double a, double b, ae_int_t ensemblesize, mlpensemble* ensemble, ae_state *_state); void mlpecreatec0(ae_int_t nin, ae_int_t nout, ae_int_t ensemblesize, mlpensemble* ensemble, ae_state *_state); void mlpecreatec1(ae_int_t nin, ae_int_t nhid, ae_int_t nout, ae_int_t ensemblesize, mlpensemble* ensemble, ae_state *_state); void mlpecreatec2(ae_int_t nin, ae_int_t nhid1, ae_int_t nhid2, ae_int_t nout, ae_int_t ensemblesize, mlpensemble* ensemble, ae_state *_state); void mlpecreatefromnetwork(multilayerperceptron* network, ae_int_t ensemblesize, mlpensemble* ensemble, ae_state *_state); void mlpecopy(mlpensemble* ensemble1, mlpensemble* ensemble2, ae_state *_state); void mlperandomize(mlpensemble* ensemble, ae_state *_state); void mlpeproperties(mlpensemble* ensemble, ae_int_t* nin, ae_int_t* nout, ae_state *_state); ae_bool mlpeissoftmax(mlpensemble* ensemble, ae_state *_state); void mlpeprocess(mlpensemble* ensemble, /* Real */ ae_vector* x, /* Real */ ae_vector* y, ae_state *_state); void mlpeprocessi(mlpensemble* ensemble, /* Real */ ae_vector* x, /* Real */ ae_vector* y, ae_state *_state); void mlpeallerrorsx(mlpensemble* ensemble, /* Real */ ae_matrix* densexy, sparsematrix* sparsexy, ae_int_t datasetsize, ae_int_t datasettype, /* Integer */ ae_vector* idx, ae_int_t subset0, ae_int_t subset1, ae_int_t subsettype, ae_shared_pool* buf, modelerrors* rep, ae_state *_state); void mlpeallerrorssparse(mlpensemble* ensemble, sparsematrix* xy, ae_int_t npoints, double* relcls, double* avgce, double* rms, double* avg, double* avgrel, ae_state *_state); double mlperelclserror(mlpensemble* ensemble, /* Real */ ae_matrix* xy, ae_int_t npoints, ae_state *_state); double mlpeavgce(mlpensemble* ensemble, /* Real */ ae_matrix* xy, ae_int_t npoints, ae_state *_state); double mlpermserror(mlpensemble* ensemble, /* Real */ ae_matrix* xy, ae_int_t npoints, ae_state *_state); double mlpeavgerror(mlpensemble* ensemble, /* Real */ ae_matrix* xy, ae_int_t npoints, ae_state *_state); double mlpeavgrelerror(mlpensemble* ensemble, /* Real */ ae_matrix* xy, ae_int_t npoints, ae_state *_state); void mlpealloc(ae_serializer* s, mlpensemble* ensemble, ae_state *_state); void mlpeserialize(ae_serializer* s, mlpensemble* ensemble, ae_state *_state); void mlpeunserialize(ae_serializer* s, mlpensemble* ensemble, ae_state *_state); void _mlpensemble_init(void* _p, ae_state *_state); void _mlpensemble_init_copy(void* _dst, void* _src, ae_state *_state); void _mlpensemble_clear(void* _p); void _mlpensemble_destroy(void* _p); void mlptrainlm(multilayerperceptron* network, /* Real */ ae_matrix* xy, ae_int_t npoints, double decay, ae_int_t restarts, ae_int_t* info, mlpreport* rep, ae_state *_state); void mlptrainlbfgs(multilayerperceptron* network, /* Real */ ae_matrix* xy, ae_int_t npoints, double decay, ae_int_t restarts, double wstep, ae_int_t maxits, ae_int_t* info, mlpreport* rep, ae_state *_state); void mlptraines(multilayerperceptron* network, /* Real */ ae_matrix* trnxy, ae_int_t trnsize, /* Real */ ae_matrix* valxy, ae_int_t valsize, double decay, ae_int_t restarts, ae_int_t* info, mlpreport* rep, ae_state *_state); void mlpkfoldcvlbfgs(multilayerperceptron* network, /* Real */ ae_matrix* xy, ae_int_t npoints, double decay, ae_int_t restarts, double wstep, ae_int_t maxits, ae_int_t foldscount, ae_int_t* info, mlpreport* rep, mlpcvreport* cvrep, ae_state *_state); 
void mlpkfoldcvlm(multilayerperceptron* network, /* Real */ ae_matrix* xy, ae_int_t npoints, double decay, ae_int_t restarts, ae_int_t foldscount, ae_int_t* info, mlpreport* rep, mlpcvreport* cvrep, ae_state *_state); void mlpkfoldcv(mlptrainer* s, multilayerperceptron* network, ae_int_t nrestarts, ae_int_t foldscount, mlpreport* rep, ae_state *_state); void _pexec_mlpkfoldcv(mlptrainer* s, multilayerperceptron* network, ae_int_t nrestarts, ae_int_t foldscount, mlpreport* rep, ae_state *_state); void mlpcreatetrainer(ae_int_t nin, ae_int_t nout, mlptrainer* s, ae_state *_state); void mlpcreatetrainercls(ae_int_t nin, ae_int_t nclasses, mlptrainer* s, ae_state *_state); void mlpsetdataset(mlptrainer* s, /* Real */ ae_matrix* xy, ae_int_t npoints, ae_state *_state); void mlpsetsparsedataset(mlptrainer* s, sparsematrix* xy, ae_int_t npoints, ae_state *_state); void mlpsetdecay(mlptrainer* s, double decay, ae_state *_state); void mlpsetcond(mlptrainer* s, double wstep, ae_int_t maxits, ae_state *_state); void mlpsetalgobatch(mlptrainer* s, ae_state *_state); void mlptrainnetwork(mlptrainer* s, multilayerperceptron* network, ae_int_t nrestarts, mlpreport* rep, ae_state *_state); void _pexec_mlptrainnetwork(mlptrainer* s, multilayerperceptron* network, ae_int_t nrestarts, mlpreport* rep, ae_state *_state); void mlpstarttraining(mlptrainer* s, multilayerperceptron* network, ae_bool randomstart, ae_state *_state); ae_bool mlpcontinuetraining(mlptrainer* s, multilayerperceptron* network, ae_state *_state); ae_bool _pexec_mlpcontinuetraining(mlptrainer* s, multilayerperceptron* network, ae_state *_state); void mlpebagginglm(mlpensemble* ensemble, /* Real */ ae_matrix* xy, ae_int_t npoints, double decay, ae_int_t restarts, ae_int_t* info, mlpreport* rep, mlpcvreport* ooberrors, ae_state *_state); void mlpebagginglbfgs(mlpensemble* ensemble, /* Real */ ae_matrix* xy, ae_int_t npoints, double decay, ae_int_t restarts, double wstep, ae_int_t maxits, ae_int_t* info, mlpreport* rep, mlpcvreport* ooberrors, ae_state *_state); void mlpetraines(mlpensemble* ensemble, /* Real */ ae_matrix* xy, ae_int_t npoints, double decay, ae_int_t restarts, ae_int_t* info, mlpreport* rep, ae_state *_state); void mlptrainensemblees(mlptrainer* s, mlpensemble* ensemble, ae_int_t nrestarts, mlpreport* rep, ae_state *_state); void _pexec_mlptrainensemblees(mlptrainer* s, mlpensemble* ensemble, ae_int_t nrestarts, mlpreport* rep, ae_state *_state); void _mlpreport_init(void* _p, ae_state *_state); void _mlpreport_init_copy(void* _dst, void* _src, ae_state *_state); void _mlpreport_clear(void* _p); void _mlpreport_destroy(void* _p); void _mlpcvreport_init(void* _p, ae_state *_state); void _mlpcvreport_init_copy(void* _dst, void* _src, ae_state *_state); void _mlpcvreport_clear(void* _p); void _mlpcvreport_destroy(void* _p); void _smlptrnsession_init(void* _p, ae_state *_state); void _smlptrnsession_init_copy(void* _dst, void* _src, ae_state *_state); void _smlptrnsession_clear(void* _p); void _smlptrnsession_destroy(void* _p); void _mlpetrnsession_init(void* _p, ae_state *_state); void _mlpetrnsession_init_copy(void* _dst, void* _src, ae_state *_state); void _mlpetrnsession_clear(void* _p); void _mlpetrnsession_destroy(void* _p); void _mlptrainer_init(void* _p, ae_state *_state); void _mlptrainer_init_copy(void* _dst, void* _src, ae_state *_state); void _mlptrainer_clear(void* _p); void _mlptrainer_destroy(void* _p); void _mlpparallelizationcv_init(void* _p, ae_state *_state); void _mlpparallelizationcv_init_copy(void* _dst, 
void* _src, ae_state *_state); void _mlpparallelizationcv_clear(void* _p); void _mlpparallelizationcv_destroy(void* _p); void clusterizercreate(clusterizerstate* s, ae_state *_state); void clusterizersetpoints(clusterizerstate* s, /* Real */ ae_matrix* xy, ae_int_t npoints, ae_int_t nfeatures, ae_int_t disttype, ae_state *_state); void clusterizersetdistances(clusterizerstate* s, /* Real */ ae_matrix* d, ae_int_t npoints, ae_bool isupper, ae_state *_state); void clusterizersetahcalgo(clusterizerstate* s, ae_int_t algo, ae_state *_state); void clusterizersetkmeanslimits(clusterizerstate* s, ae_int_t restarts, ae_int_t maxits, ae_state *_state); void clusterizersetkmeansinit(clusterizerstate* s, ae_int_t initalgo, ae_state *_state); void clusterizerrunahc(clusterizerstate* s, ahcreport* rep, ae_state *_state); void _pexec_clusterizerrunahc(clusterizerstate* s, ahcreport* rep, ae_state *_state); void clusterizerrunkmeans(clusterizerstate* s, ae_int_t k, kmeansreport* rep, ae_state *_state); void _pexec_clusterizerrunkmeans(clusterizerstate* s, ae_int_t k, kmeansreport* rep, ae_state *_state); void clusterizergetdistances(/* Real */ ae_matrix* xy, ae_int_t npoints, ae_int_t nfeatures, ae_int_t disttype, /* Real */ ae_matrix* d, ae_state *_state); void _pexec_clusterizergetdistances(/* Real */ ae_matrix* xy, ae_int_t npoints, ae_int_t nfeatures, ae_int_t disttype, /* Real */ ae_matrix* d, ae_state *_state); void clusterizergetdistancesbuf(apbuffers* buf, /* Real */ ae_matrix* xy, ae_int_t npoints, ae_int_t nfeatures, ae_int_t disttype, /* Real */ ae_matrix* d, ae_state *_state); void clusterizergetkclusters(ahcreport* rep, ae_int_t k, /* Integer */ ae_vector* cidx, /* Integer */ ae_vector* cz, ae_state *_state); void clusterizerseparatedbydist(ahcreport* rep, double r, ae_int_t* k, /* Integer */ ae_vector* cidx, /* Integer */ ae_vector* cz, ae_state *_state); void clusterizerseparatedbycorr(ahcreport* rep, double r, ae_int_t* k, /* Integer */ ae_vector* cidx, /* Integer */ ae_vector* cz, ae_state *_state); void kmeansinitbuf(kmeansbuffers* buf, ae_state *_state); void kmeansgenerateinternal(/* Real */ ae_matrix* xy, ae_int_t npoints, ae_int_t nvars, ae_int_t k, ae_int_t initalgo, ae_int_t maxits, ae_int_t restarts, ae_bool kmeansdbgnoits, ae_int_t* info, ae_int_t* iterationscount, /* Real */ ae_matrix* ccol, ae_bool needccol, /* Real */ ae_matrix* crow, ae_bool needcrow, /* Integer */ ae_vector* xyc, double* energy, kmeansbuffers* buf, ae_state *_state); void kmeansupdatedistances(/* Real */ ae_matrix* xy, ae_int_t idx0, ae_int_t idx1, ae_int_t nvars, /* Real */ ae_matrix* ct, ae_int_t cidx0, ae_int_t cidx1, /* Integer */ ae_vector* xyc, /* Real */ ae_vector* xydist2, ae_shared_pool* bufferpool, ae_state *_state); void _kmeansbuffers_init(void* _p, ae_state *_state); void _kmeansbuffers_init_copy(void* _dst, void* _src, ae_state *_state); void _kmeansbuffers_clear(void* _p); void _kmeansbuffers_destroy(void* _p); void _clusterizerstate_init(void* _p, ae_state *_state); void _clusterizerstate_init_copy(void* _dst, void* _src, ae_state *_state); void _clusterizerstate_clear(void* _p); void _clusterizerstate_destroy(void* _p); void _ahcreport_init(void* _p, ae_state *_state); void _ahcreport_init_copy(void* _dst, void* _src, ae_state *_state); void _ahcreport_clear(void* _p); void _ahcreport_destroy(void* _p); void _kmeansreport_init(void* _p, ae_state *_state); void _kmeansreport_init_copy(void* _dst, void* _src, ae_state *_state); void _kmeansreport_clear(void* _p); void 
_kmeansreport_destroy(void* _p); void dfbuildrandomdecisionforest(/* Real */ ae_matrix* xy, ae_int_t npoints, ae_int_t nvars, ae_int_t nclasses, ae_int_t ntrees, double r, ae_int_t* info, decisionforest* df, dfreport* rep, ae_state *_state); void dfbuildrandomdecisionforestx1(/* Real */ ae_matrix* xy, ae_int_t npoints, ae_int_t nvars, ae_int_t nclasses, ae_int_t ntrees, ae_int_t nrndvars, double r, ae_int_t* info, decisionforest* df, dfreport* rep, ae_state *_state); void dfbuildinternal(/* Real */ ae_matrix* xy, ae_int_t npoints, ae_int_t nvars, ae_int_t nclasses, ae_int_t ntrees, ae_int_t samplesize, ae_int_t nfeatures, ae_int_t flags, ae_int_t* info, decisionforest* df, dfreport* rep, ae_state *_state); void dfprocess(decisionforest* df, /* Real */ ae_vector* x, /* Real */ ae_vector* y, ae_state *_state); void dfprocessi(decisionforest* df, /* Real */ ae_vector* x, /* Real */ ae_vector* y, ae_state *_state); double dfrelclserror(decisionforest* df, /* Real */ ae_matrix* xy, ae_int_t npoints, ae_state *_state); double dfavgce(decisionforest* df, /* Real */ ae_matrix* xy, ae_int_t npoints, ae_state *_state); double dfrmserror(decisionforest* df, /* Real */ ae_matrix* xy, ae_int_t npoints, ae_state *_state); double dfavgerror(decisionforest* df, /* Real */ ae_matrix* xy, ae_int_t npoints, ae_state *_state); double dfavgrelerror(decisionforest* df, /* Real */ ae_matrix* xy, ae_int_t npoints, ae_state *_state); void dfcopy(decisionforest* df1, decisionforest* df2, ae_state *_state); void dfalloc(ae_serializer* s, decisionforest* forest, ae_state *_state); void dfserialize(ae_serializer* s, decisionforest* forest, ae_state *_state); void dfunserialize(ae_serializer* s, decisionforest* forest, ae_state *_state); void _decisionforest_init(void* _p, ae_state *_state); void _decisionforest_init_copy(void* _dst, void* _src, ae_state *_state); void _decisionforest_clear(void* _p); void _decisionforest_destroy(void* _p); void _dfreport_init(void* _p, ae_state *_state); void _dfreport_init_copy(void* _dst, void* _src, ae_state *_state); void _dfreport_clear(void* _p); void _dfreport_destroy(void* _p); void _dfinternalbuffers_init(void* _p, ae_state *_state); void _dfinternalbuffers_init_copy(void* _dst, void* _src, ae_state *_state); void _dfinternalbuffers_clear(void* _p); void _dfinternalbuffers_destroy(void* _p); void kmeansgenerate(/* Real */ ae_matrix* xy, ae_int_t npoints, ae_int_t nvars, ae_int_t k, ae_int_t restarts, ae_int_t* info, /* Real */ ae_matrix* c, /* Integer */ ae_vector* xyc, ae_state *_state); } #endif
cpp/src/optimization.h
/************************************************************************* ALGLIB 3.11.0 (source code generated 2017-05-11) Copyright (c) Sergey Bochkanov (ALGLIB project). >>> SOURCE LICENSE >>> This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation (www.fsf.org); either version 2 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. 
A copy of the GNU General Public License is available at http://www.fsf.org/licensing/licenses >>> END OF LICENSE >>> *************************************************************************/ #ifndef _optimization_pkg_h #define _optimization_pkg_h #include "ap.h" #include "alglibinternal.h" #include "alglibmisc.h" #include "linalg.h" #include "solvers.h" ///////////////////////////////////////////////////////////////////////// // // THIS SECTION CONTAINS COMPUTATIONAL CORE DECLARATIONS (DATATYPES) // ///////////////////////////////////////////////////////////////////////// namespace alglib_impl { typedef struct { ae_int_t n; ae_int_t k; double alpha; double tau; double theta; ae_matrix a; ae_matrix q; ae_vector b; ae_vector r; ae_vector xc; ae_vector d; ae_vector activeset; ae_matrix tq2dense; ae_matrix tk2; ae_vector tq2diag; ae_vector tq1; ae_vector tk1; double tq0; double tk0; ae_vector txc; ae_vector tb; ae_int_t nfree; ae_int_t ecakind; ae_matrix ecadense; ae_matrix eq; ae_matrix eccm; ae_vector ecadiag; ae_vector eb; double ec; ae_vector tmp0; ae_vector tmp1; ae_vector tmpg; ae_matrix tmp2; ae_bool ismaintermchanged; ae_bool issecondarytermchanged; ae_bool islineartermchanged; ae_bool isactivesetchanged; } convexquadraticmodel; typedef struct { ae_vector norms; ae_vector alpha; ae_vector rho; ae_matrix yk; ae_vector idx; ae_vector bufa; ae_vector bufb; } precbuflbfgs; typedef struct { ae_int_t n; ae_int_t k; ae_vector d; ae_matrix v; ae_vector bufc; ae_matrix bufz; ae_matrix bufw; ae_vector tmp; } precbuflowrank; typedef struct { ae_int_t ns; ae_int_t nd; ae_int_t nr; ae_matrix densea; ae_vector b; ae_vector nnc; double debugflops; ae_int_t debugmaxinnerits; ae_vector xn; ae_vector xp; ae_matrix tmpca; ae_matrix tmplq; ae_matrix trda; ae_vector trdd; ae_vector crb; ae_vector g; ae_vector d; ae_vector dx; ae_vector diagaa; ae_vector cb; ae_vector cx; ae_vector cborg; ae_vector tmpcholesky; ae_vector r; ae_vector regdiag; ae_vector tmp0; ae_vector tmp1; ae_vector tmp2; ae_vector rdtmprowmap; } snnlssolver; typedef struct { ae_int_t n; ae_int_t algostate; ae_vector xc; ae_bool hasxc; ae_vector s; ae_vector h; ae_vector activeset; ae_bool basisisready; ae_matrix sbasis; ae_matrix pbasis; ae_matrix ibasis; ae_int_t basissize; ae_bool feasinitpt; ae_bool constraintschanged; ae_vector hasbndl; ae_vector hasbndu; ae_vector bndl; ae_vector bndu; ae_matrix cleic; ae_int_t nec; ae_int_t nic; ae_vector mtx; ae_vector mtas; ae_vector cdtmp; ae_vector corrtmp; ae_vector unitdiagonal; snnlssolver solver; ae_vector scntmp; ae_vector tmp0; ae_vector tmpfeas; ae_matrix tmpm0; ae_vector rctmps; ae_vector rctmpg; ae_vector rctmprightpart; ae_matrix rctmpdense0; ae_matrix rctmpdense1; ae_vector rctmpisequality; ae_vector rctmpconstraintidx; ae_vector rctmplambdas; ae_matrix tmpbasis; } sactiveset; typedef struct { double epsg; double epsf; double epsx; ae_int_t maxouterits; ae_bool cgphase; ae_bool cnphase; ae_int_t cgminits; ae_int_t cgmaxits; ae_int_t cnmaxupdates; ae_int_t sparsesolver; } qqpsettings; typedef struct { ae_int_t n; ae_int_t nmain; ae_int_t nslack; ae_int_t nec; ae_int_t nic; ae_int_t akind; ae_matrix densea; sparsematrix sparsea; ae_bool sparseupper; double absamax; double absasum; double absasum2; ae_vector b; ae_vector bndl; ae_vector bndu; ae_vector havebndl; ae_vector havebndu; ae_matrix cleic; ae_vector xs; ae_vector xf; ae_vector gc; ae_vector xp; ae_vector dc; ae_vector dp; ae_vector cgc; ae_vector cgp; sactiveset sas; ae_vector activated; ae_int_t nfree; ae_int_t cnmodelage; 
ae_matrix densez; sparsematrix sparsecca; ae_vector yidx; ae_vector regdiag; ae_vector regx0; ae_vector tmpcn; ae_vector tmpcni; ae_vector tmpcnb; ae_vector tmp0; ae_vector stpbuf; sparsebuffers sbuf; ae_int_t repinneriterationscount; ae_int_t repouteriterationscount; ae_int_t repncholesky; ae_int_t repncupdates; } qqpbuffers; typedef struct { ae_int_t n; ae_int_t m; double epsg; double epsf; double epsx; ae_int_t maxits; ae_bool xrep; double stpmax; ae_vector s; double diffstep; ae_int_t nfev; ae_int_t mcstage; ae_int_t k; ae_int_t q; ae_int_t p; ae_vector rho; ae_matrix yk; ae_matrix sk; ae_vector xp; ae_vector theta; ae_vector d; double stp; ae_vector work; double fold; double trimthreshold; ae_int_t prectype; double gammak; ae_matrix denseh; ae_vector diagh; ae_vector precc; ae_vector precd; ae_matrix precw; ae_int_t preck; precbuflbfgs precbuf; precbuflowrank lowrankbuf; double fbase; double fm2; double fm1; double fp1; double fp2; ae_vector autobuf; ae_vector x; double f; ae_vector g; ae_bool needf; ae_bool needfg; ae_bool xupdated; ae_bool userterminationneeded; double teststep; rcommstate rstate; ae_int_t repiterationscount; ae_int_t repnfev; ae_int_t repvaridx; ae_int_t repterminationtype; linminstate lstate; } minlbfgsstate; typedef struct { ae_int_t iterationscount; ae_int_t nfev; ae_int_t varidx; ae_int_t terminationtype; } minlbfgsreport; typedef struct { double epsx; ae_int_t outerits; double rho; } qpdenseaulsettings; typedef struct { ae_vector nulc; ae_matrix sclsfta; ae_vector sclsftb; ae_vector sclsfthasbndl; ae_vector sclsfthasbndu; ae_vector sclsftbndl; ae_vector sclsftbndu; ae_vector sclsftxc; ae_matrix sclsftcleic; ae_matrix exa; ae_vector exb; ae_vector exxc; ae_vector exxn; ae_vector exbndl; ae_vector exbndu; ae_vector exscale; ae_vector exxorigin; qqpsettings qqpsettingsuser; qqpbuffers qqpbuf; ae_vector nulcest; ae_vector tmp0; ae_matrix tmp2; ae_vector modelg; ae_vector d; ae_vector deltax; convexquadraticmodel dummycqm; sparsematrix dummysparse; ae_matrix qrkkt; ae_vector qrrightpart; ae_vector qrtau; ae_vector qrsv0; ae_vector qrsvx1; ae_int_t repinneriterationscount; ae_int_t repouteriterationscount; ae_int_t repncholesky; ae_int_t repnmv; } qpdenseaulbuffers; typedef struct { double epsg; double epsf; double epsx; ae_int_t maxits; } qpcholeskysettings; typedef struct { sactiveset sas; ae_vector pg; ae_vector gc; ae_vector xs; ae_vector xn; ae_vector workbndl; ae_vector workbndu; ae_vector havebndl; ae_vector havebndu; ae_matrix workcleic; ae_vector rctmpg; ae_vector tmp0; ae_vector tmp1; ae_vector tmpb; ae_int_t repinneriterationscount; ae_int_t repouteriterationscount; ae_int_t repncholesky; } qpcholeskybuffers; typedef struct { ae_int_t n; double epsg; double epsf; double epsx; ae_int_t maxits; double stpmax; double suggestedstep; ae_bool xrep; ae_bool drep; ae_int_t cgtype; ae_int_t prectype; ae_vector diagh; ae_vector diaghl2; ae_matrix vcorr; ae_int_t vcnt; ae_vector s; double diffstep; ae_int_t nfev; ae_int_t mcstage; ae_int_t k; ae_vector xk; ae_vector dk; ae_vector xn; ae_vector dn; ae_vector d; double fold; double stp; double curstpmax; ae_vector yk; double lastgoodstep; double lastscaledstep; ae_int_t mcinfo; ae_bool innerresetneeded; ae_bool terminationneeded; double trimthreshold; ae_int_t rstimer; ae_vector x; double f; ae_vector g; ae_bool needf; ae_bool needfg; ae_bool xupdated; ae_bool algpowerup; ae_bool lsstart; ae_bool lsend; ae_bool userterminationneeded; double teststep; rcommstate rstate; ae_int_t repiterationscount; ae_int_t repnfev; 
ae_int_t repvaridx; ae_int_t repterminationtype; ae_int_t debugrestartscount; linminstate lstate; double fbase; double fm2; double fm1; double fp1; double fp2; double betahs; double betady; ae_vector work0; ae_vector work1; } mincgstate; typedef struct { ae_int_t iterationscount; ae_int_t nfev; ae_int_t varidx; ae_int_t terminationtype; } mincgreport; typedef struct { ae_int_t nmain; ae_int_t nslack; double epsg; double epsf; double epsx; ae_int_t maxits; ae_bool xrep; ae_bool drep; double stpmax; double diffstep; sactiveset sas; ae_vector s; ae_int_t prectype; ae_vector diagh; ae_vector x; double f; ae_vector g; ae_bool needf; ae_bool needfg; ae_bool xupdated; ae_bool lsstart; ae_bool steepestdescentstep; ae_bool boundedstep; ae_bool userterminationneeded; double teststep; rcommstate rstate; ae_vector ugc; ae_vector cgc; ae_vector xn; ae_vector ugn; ae_vector cgn; ae_vector xp; double fc; double fn; double fp; ae_vector d; ae_matrix cleic; ae_int_t nec; ae_int_t nic; double lastgoodstep; double lastscaledgoodstep; double maxscaledgrad; ae_vector hasbndl; ae_vector hasbndu; ae_vector bndl; ae_vector bndu; ae_int_t repinneriterationscount; ae_int_t repouteriterationscount; ae_int_t repnfev; ae_int_t repvaridx; ae_int_t repterminationtype; double repdebugeqerr; double repdebugfs; double repdebugff; double repdebugdx; ae_int_t repdebugfeasqpits; ae_int_t repdebugfeasgpaits; ae_vector xstart; snnlssolver solver; double fbase; double fm2; double fm1; double fp1; double fp2; double xm1; double xp1; double gm1; double gp1; ae_int_t cidx; double cval; ae_vector tmpprec; ae_vector tmp0; ae_int_t nfev; ae_int_t mcstage; double stp; double curstpmax; double activationstep; ae_vector work; linminstate lstate; double trimthreshold; ae_int_t nonmonotoniccnt; ae_matrix bufyk; ae_matrix bufsk; ae_vector bufrho; ae_vector buftheta; ae_int_t bufsize; } minbleicstate; typedef struct { ae_int_t iterationscount; ae_int_t nfev; ae_int_t varidx; ae_int_t terminationtype; double debugeqerr; double debugfs; double debugff; double debugdx; ae_int_t debugfeasqpits; ae_int_t debugfeasgpaits; ae_int_t inneriterationscount; ae_int_t outeriterationscount; } minbleicreport; typedef struct { double epsg; double epsf; double epsx; ae_int_t maxits; } qpbleicsettings; typedef struct { minbleicstate solver; minbleicreport solverrep; ae_vector tmp0; ae_vector tmp1; ae_vector tmpi; ae_int_t repinneriterationscount; ae_int_t repouteriterationscount; } qpbleicbuffers; typedef struct { ae_int_t n; qqpsettings qqpsettingsuser; qpbleicsettings qpbleicsettingsuser; qpdenseaulsettings qpdenseaulsettingsuser; ae_bool dbgskipconstraintnormalization; ae_int_t algokind; ae_int_t akind; convexquadraticmodel a; sparsematrix sparsea; ae_bool sparseaupper; double absamax; double absasum; double absasum2; ae_vector b; ae_vector bndl; ae_vector bndu; ae_vector s; ae_vector havebndl; ae_vector havebndu; ae_vector xorigin; ae_vector startx; ae_bool havex; ae_matrix cleic; ae_int_t nec; ae_int_t nic; sparsematrix scleic; ae_int_t snec; ae_int_t snic; ae_vector xs; ae_int_t repinneriterationscount; ae_int_t repouteriterationscount; ae_int_t repncholesky; ae_int_t repnmv; ae_int_t repterminationtype; ae_vector tmp0; ae_matrix ecleic; ae_matrix dummyr2; ae_bool qpbleicfirstcall; qpbleicbuffers qpbleicbuf; qqpbuffers qqpbuf; qpdenseaulbuffers qpdenseaulbuf; qpcholeskybuffers qpcholeskybuf; } minqpstate; typedef struct { ae_int_t inneriterationscount; ae_int_t outeriterationscount; ae_int_t nmv; ae_int_t ncholesky; ae_int_t terminationtype; } 
minqpreport; typedef struct { double stabilizingpoint; double initialinequalitymultiplier; ae_int_t solvertype; ae_int_t prectype; ae_int_t updatefreq; double rho; ae_int_t n; double epsg; double epsf; double epsx; ae_int_t maxits; ae_int_t aulitscnt; ae_bool xrep; double stpmax; double diffstep; double teststep; ae_vector s; ae_vector bndl; ae_vector bndu; ae_vector hasbndl; ae_vector hasbndu; ae_int_t nec; ae_int_t nic; ae_matrix cleic; ae_int_t ng; ae_int_t nh; ae_vector x; double f; ae_vector fi; ae_matrix j; ae_bool needfij; ae_bool needfi; ae_bool xupdated; rcommstate rstate; rcommstate rstateaul; ae_vector scaledbndl; ae_vector scaledbndu; ae_matrix scaledcleic; ae_vector xc; ae_vector xstart; ae_vector xbase; ae_vector fbase; ae_vector dfbase; ae_vector fm2; ae_vector fm1; ae_vector fp1; ae_vector fp2; ae_vector dfm1; ae_vector dfp1; ae_vector bufd; ae_vector bufc; ae_vector tmp0; ae_matrix bufw; ae_matrix bufz; ae_vector xk; ae_vector xk1; ae_vector gk; ae_vector gk1; double gammak; ae_bool xkpresent; minlbfgsstate auloptimizer; minlbfgsreport aulreport; ae_vector nubc; ae_vector nulc; ae_vector nunlc; ae_int_t repinneriterationscount; ae_int_t repouteriterationscount; ae_int_t repnfev; ae_int_t repvaridx; ae_int_t repfuncidx; ae_int_t repterminationtype; ae_int_t repdbgphase0its; } minnlcstate; typedef struct { ae_int_t iterationscount; ae_int_t nfev; ae_int_t varidx; ae_int_t funcidx; ae_int_t terminationtype; ae_int_t dbgphase0its; } minnlcreport; typedef struct { ae_int_t nmain; double epsg; double epsf; double epsx; ae_int_t maxits; ae_bool xrep; double stpmax; double diffstep; ae_vector s; ae_int_t prectype; ae_vector diagh; ae_vector x; double f; ae_vector g; ae_bool needf; ae_bool needfg; ae_bool xupdated; ae_bool userterminationneeded; double teststep; rcommstate rstate; ae_vector xc; ae_vector ugc; ae_vector cgc; ae_vector xn; ae_vector ugn; ae_vector cgn; ae_vector xp; double fc; double fn; double fp; ae_vector d; double lastscaledgoodstep; ae_vector hasbndl; ae_vector hasbndu; ae_vector bndl; ae_vector bndu; ae_int_t repiterationscount; ae_int_t repnfev; ae_int_t repvaridx; ae_int_t repterminationtype; ae_vector xstart; snnlssolver solver; double fbase; double fm2; double fm1; double fp1; double fp2; double xm1; double xp1; double gm1; double gp1; ae_vector tmpprec; ae_vector tmp0; ae_int_t nfev; ae_int_t mcstage; double stp; double curstpmax; ae_vector work; linminstate lstate; double trimthreshold; ae_int_t nonmonotoniccnt; ae_matrix bufyk; ae_matrix bufsk; ae_vector bufrho; ae_vector buftheta; ae_int_t bufsize; } minbcstate; typedef struct { ae_int_t iterationscount; ae_int_t nfev; ae_int_t varidx; ae_int_t terminationtype; } minbcreport; typedef struct { double fc; double fn; ae_vector xc; ae_vector xn; ae_vector x0; ae_vector gc; ae_vector d; ae_matrix uh; ae_matrix ch; ae_matrix rk; ae_vector invutc; ae_vector tmp0; ae_vector tmpidx; ae_vector tmpd; ae_vector tmpc; ae_vector tmplambdas; ae_matrix tmpc2; ae_vector tmpb; snnlssolver nnls; } minnsqp; typedef struct { ae_int_t solvertype; ae_int_t n; double epsx; ae_int_t maxits; ae_bool xrep; double diffstep; ae_vector s; ae_vector bndl; ae_vector bndu; ae_vector hasbndl; ae_vector hasbndu; ae_int_t nec; ae_int_t nic; ae_matrix cleic; ae_int_t ng; ae_int_t nh; ae_vector x; double f; ae_vector fi; ae_matrix j; ae_bool needfij; ae_bool needfi; ae_bool xupdated; rcommstate rstate; rcommstate rstateags; hqrndstate agsrs; double agsradius; ae_int_t agssamplesize; double agsraddecay; double agsalphadecay; double 
agsdecrease; double agsinitstp; double agsstattold; double agsshortstpabs; double agsshortstprel; double agsshortf; ae_int_t agsshortlimit; double agsrhononlinear; ae_int_t agsminupdate; ae_int_t agsmaxraddecays; ae_int_t agsmaxbacktrack; ae_int_t agsmaxbacktracknonfull; double agspenaltylevel; double agspenaltyincrease; ae_vector xstart; ae_vector xc; ae_vector xn; ae_vector grs; ae_vector d; ae_vector colmax; ae_vector diagh; ae_vector signmin; ae_vector signmax; ae_bool userterminationneeded; ae_vector scaledbndl; ae_vector scaledbndu; ae_matrix scaledcleic; ae_vector rholinear; ae_matrix samplex; ae_matrix samplegm; ae_matrix samplegmbc; ae_vector samplef; ae_vector samplef0; minnsqp nsqp; ae_vector tmp0; ae_vector tmp1; ae_matrix tmp2; ae_vector tmp3; ae_vector xbase; ae_vector fp; ae_vector fm; ae_int_t repinneriterationscount; ae_int_t repouteriterationscount; ae_int_t repnfev; ae_int_t repvaridx; ae_int_t repfuncidx; ae_int_t repterminationtype; double replcerr; double repnlcerr; ae_int_t dbgncholesky; } minnsstate; typedef struct { ae_int_t iterationscount; ae_int_t nfev; double cerr; double lcerr; double nlcerr; ae_int_t terminationtype; ae_int_t varidx; ae_int_t funcidx; } minnsreport; typedef struct { ae_int_t n; double epsg; double epsf; double epsx; ae_int_t maxits; ae_bool xrep; double stpmax; ae_int_t cgtype; ae_int_t k; ae_int_t nfev; ae_int_t mcstage; ae_vector bndl; ae_vector bndu; ae_int_t curalgo; ae_int_t acount; double mu; double finit; double dginit; ae_vector ak; ae_vector xk; ae_vector dk; ae_vector an; ae_vector xn; ae_vector dn; ae_vector d; double fold; double stp; ae_vector work; ae_vector yk; ae_vector gc; double laststep; ae_vector x; double f; ae_vector g; ae_bool needfg; ae_bool xupdated; rcommstate rstate; ae_int_t repiterationscount; ae_int_t repnfev; ae_int_t repterminationtype; ae_int_t debugrestartscount; linminstate lstate; double betahs; double betady; } minasastate; typedef struct { ae_int_t iterationscount; ae_int_t nfev; ae_int_t terminationtype; ae_int_t activeconstraints; } minasareport; typedef struct { ae_int_t n; ae_int_t m; double stpmax; ae_int_t modelage; ae_int_t maxmodelage; ae_bool hasfi; double epsx; ae_vector x; double f; ae_vector fi; ae_bool needf; ae_bool needfi; double fbase; ae_vector modeldiag; ae_vector xbase; ae_vector fibase; ae_vector bndl; ae_vector bndu; ae_vector havebndl; ae_vector havebndu; ae_vector s; rcommstate rstate; ae_vector xdir; ae_vector choleskybuf; ae_vector tmp0; ae_vector tmpct; double actualdecrease; double predicteddecrease; minqpstate qpstate; minqpreport qprep; sparsematrix tmpsp; } minlmstepfinder; typedef struct { ae_int_t n; ae_int_t m; double diffstep; double epsx; ae_int_t maxits; ae_bool xrep; double stpmax; ae_int_t maxmodelage; ae_bool makeadditers; ae_vector x; double f; ae_vector fi; ae_matrix j; ae_matrix h; ae_vector g; ae_bool needf; ae_bool needfg; ae_bool needfgh; ae_bool needfij; ae_bool needfi; ae_bool xupdated; ae_bool userterminationneeded; ae_int_t algomode; ae_bool hasf; ae_bool hasfi; ae_bool hasg; ae_vector xbase; double fbase; ae_vector fibase; ae_vector gbase; ae_matrix quadraticmodel; ae_vector bndl; ae_vector bndu; ae_vector havebndl; ae_vector havebndu; ae_vector s; ae_matrix cleic; ae_int_t nec; ae_int_t nic; double lambdav; double nu; ae_int_t modelage; ae_vector xnew; ae_vector xdir; ae_vector deltax; ae_vector deltaf; ae_bool deltaxready; ae_bool deltafready; double teststep; ae_int_t repiterationscount; ae_int_t repterminationtype; ae_int_t repfuncidx; ae_int_t 
repvaridx; ae_int_t repnfunc; ae_int_t repnjac; ae_int_t repngrad; ae_int_t repnhess; ae_int_t repncholesky; rcommstate rstate; ae_vector choleskybuf; ae_vector tmp0; double actualdecrease; double predicteddecrease; double xm1; double xp1; ae_vector fm1; ae_vector fp1; ae_vector fc1; ae_vector gm1; ae_vector gp1; ae_vector gc1; minlbfgsstate internalstate; minlbfgsreport internalrep; minqpstate qpstate; minqpreport qprep; minlmstepfinder finderstate; } minlmstate; typedef struct { ae_int_t iterationscount; ae_int_t terminationtype; ae_int_t funcidx; ae_int_t varidx; ae_int_t nfunc; ae_int_t njac; ae_int_t ngrad; ae_int_t nhess; ae_int_t ncholesky; } minlmreport; } ///////////////////////////////////////////////////////////////////////// // // THIS SECTION CONTAINS C++ INTERFACE // ///////////////////////////////////////////////////////////////////////// namespace alglib { /************************************************************************* *************************************************************************/ class _minlbfgsstate_owner { public: _minlbfgsstate_owner(); _minlbfgsstate_owner(const _minlbfgsstate_owner &rhs); _minlbfgsstate_owner& operator=(const _minlbfgsstate_owner &rhs); virtual ~_minlbfgsstate_owner(); alglib_impl::minlbfgsstate* c_ptr(); alglib_impl::minlbfgsstate* c_ptr() const; protected: alglib_impl::minlbfgsstate *p_struct; }; class minlbfgsstate : public _minlbfgsstate_owner { public: minlbfgsstate(); minlbfgsstate(const minlbfgsstate &rhs); minlbfgsstate& operator=(const minlbfgsstate &rhs); virtual ~minlbfgsstate(); ae_bool &needf; ae_bool &needfg; ae_bool &xupdated; double &f; real_1d_array g; real_1d_array x; }; /************************************************************************* This structure stores optimization report: * IterationsCount total number of inner iterations * NFEV number of gradient evaluations * TerminationType termination type (see below) TERMINATION CODES TerminationType field contains completion code, which can be: -8 internal integrity control detected infinite or NAN values in function/gradient. Abnormal termination signalled. -7 gradient verification failed. See MinLBFGSSetGradientCheck() for more information. 1 relative function improvement is no more than EpsF. 2 relative step is no more than EpsX. 4 gradient norm is no more than EpsG 5 MaxIts steps was taken 7 stopping conditions are too stringent, further improvement is impossible, X contains best point found so far. 8 terminated by user who called minlbfgsrequesttermination(). X contains point which was "current accepted" when termination request was submitted. Other fields of this structure are not documented and should not be used! 
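
EDITOR'S NOTE: a minimal illustrative sketch of how this report is
obtained and inspected through the C++ interface (the calls below -
minlbfgscreate, minlbfgssetcond, minlbfgsoptimize, minlbfgsresults -
are documented individually in the minlbfgs subpackage; verify their
signatures there before use):

    // gradient callback for f(x) = (x0-1)^2 + x1^2
    static void f_grad(const real_1d_array &x, double &func, real_1d_array &grad, void *ptr)
    {
        func = (x[0]-1)*(x[0]-1) + x[1]*x[1];
        grad[0] = 2*(x[0]-1);
        grad[1] = 2*x[1];
    }

    real_1d_array x = "[0,0]";
    minlbfgsstate state;
    minlbfgsreport rep;
    minlbfgscreate(1, x, state);
    minlbfgssetcond(state, 0, 0, 0, 100);   // at most 100 iterations
    minlbfgsoptimize(state, f_grad);
    minlbfgsresults(state, x, rep);
    // rep.terminationtype now holds one of the codes listed above,
    // e.g. 4 = gradient norm is no more than EpsG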
*************************************************************************/ class _minlbfgsreport_owner { public: _minlbfgsreport_owner(); _minlbfgsreport_owner(const _minlbfgsreport_owner &rhs); _minlbfgsreport_owner& operator=(const _minlbfgsreport_owner &rhs); virtual ~_minlbfgsreport_owner(); alglib_impl::minlbfgsreport* c_ptr(); alglib_impl::minlbfgsreport* c_ptr() const; protected: alglib_impl::minlbfgsreport *p_struct; }; class minlbfgsreport : public _minlbfgsreport_owner { public: minlbfgsreport(); minlbfgsreport(const minlbfgsreport &rhs); minlbfgsreport& operator=(const minlbfgsreport &rhs); virtual ~minlbfgsreport(); ae_int_t &iterationscount; ae_int_t &nfev; ae_int_t &varidx; ae_int_t &terminationtype; }; /************************************************************************* This object stores state of the nonlinear CG optimizer. You should use ALGLIB functions to work with this object. *************************************************************************/ class _mincgstate_owner { public: _mincgstate_owner(); _mincgstate_owner(const _mincgstate_owner &rhs); _mincgstate_owner& operator=(const _mincgstate_owner &rhs); virtual ~_mincgstate_owner(); alglib_impl::mincgstate* c_ptr(); alglib_impl::mincgstate* c_ptr() const; protected: alglib_impl::mincgstate *p_struct; }; class mincgstate : public _mincgstate_owner { public: mincgstate(); mincgstate(const mincgstate &rhs); mincgstate& operator=(const mincgstate &rhs); virtual ~mincgstate(); ae_bool &needf; ae_bool &needfg; ae_bool &xupdated; double &f; real_1d_array g; real_1d_array x; }; /************************************************************************* This structure stores optimization report: * IterationsCount total number of inner iterations * NFEV number of gradient evaluations * TerminationType termination type (see below) TERMINATION CODES TerminationType field contains completion code, which can be: -8 internal integrity control detected infinite or NAN values in function/gradient. Abnormal termination signalled. -7 gradient verification failed. See MinCGSetGradientCheck() for more information. 1 relative function improvement is no more than EpsF. 2 relative step is no more than EpsX. 4 gradient norm is no more than EpsG 5 MaxIts steps was taken 7 stopping conditions are too stringent, further improvement is impossible, X contains best point found so far. 8 terminated by user who called mincgrequesttermination(). X contains point which was "current accepted" when termination request was submitted. Other fields of this structure are not documented and should not be used! *************************************************************************/ class _mincgreport_owner { public: _mincgreport_owner(); _mincgreport_owner(const _mincgreport_owner &rhs); _mincgreport_owner& operator=(const _mincgreport_owner &rhs); virtual ~_mincgreport_owner(); alglib_impl::mincgreport* c_ptr(); alglib_impl::mincgreport* c_ptr() const; protected: alglib_impl::mincgreport *p_struct; }; class mincgreport : public _mincgreport_owner { public: mincgreport(); mincgreport(const mincgreport &rhs); mincgreport& operator=(const mincgreport &rhs); virtual ~mincgreport(); ae_int_t &iterationscount; ae_int_t &nfev; ae_int_t &varidx; ae_int_t &terminationtype; }; /************************************************************************* This object stores nonlinear optimizer state. 
You should use functions provided by MinBLEIC subpackage to work with this object *************************************************************************/ class _minbleicstate_owner { public: _minbleicstate_owner(); _minbleicstate_owner(const _minbleicstate_owner &rhs); _minbleicstate_owner& operator=(const _minbleicstate_owner &rhs); virtual ~_minbleicstate_owner(); alglib_impl::minbleicstate* c_ptr(); alglib_impl::minbleicstate* c_ptr() const; protected: alglib_impl::minbleicstate *p_struct; }; class minbleicstate : public _minbleicstate_owner { public: minbleicstate(); minbleicstate(const minbleicstate &rhs); minbleicstate& operator=(const minbleicstate &rhs); virtual ~minbleicstate(); ae_bool &needf; ae_bool &needfg; ae_bool &xupdated; double &f; real_1d_array g; real_1d_array x; }; /************************************************************************* This structure stores optimization report: * IterationsCount number of iterations * NFEV number of gradient evaluations * TerminationType termination type (see below) TERMINATION CODES TerminationType field contains completion code, which can be: -8 internal integrity control detected infinite or NAN values in function/gradient. Abnormal termination signalled. -7 gradient verification failed. See MinBLEICSetGradientCheck() for more information. -3 inconsistent constraints. Feasible point is either nonexistent or too hard to find. Try to restart optimizer with better initial approximation 1 relative function improvement is no more than EpsF. 2 relative step is no more than EpsX. 4 gradient norm is no more than EpsG 5 MaxIts steps was taken 7 stopping conditions are too stringent, further improvement is impossible, X contains best point found so far. 8 terminated by user who called minbleicrequesttermination(). X contains point which was "current accepted" when termination request was submitted. ADDITIONAL FIELDS There are additional fields which can be used for debugging: * DebugEqErr error in the equality constraints (2-norm) * DebugFS f, calculated at projection of initial point to the feasible set * DebugFF f, calculated at the final point * DebugDX |X_start-X_final| *************************************************************************/ class _minbleicreport_owner { public: _minbleicreport_owner(); _minbleicreport_owner(const _minbleicreport_owner &rhs); _minbleicreport_owner& operator=(const _minbleicreport_owner &rhs); virtual ~_minbleicreport_owner(); alglib_impl::minbleicreport* c_ptr(); alglib_impl::minbleicreport* c_ptr() const; protected: alglib_impl::minbleicreport *p_struct; }; class minbleicreport : public _minbleicreport_owner { public: minbleicreport(); minbleicreport(const minbleicreport &rhs); minbleicreport& operator=(const minbleicreport &rhs); virtual ~minbleicreport(); ae_int_t &iterationscount; ae_int_t &nfev; ae_int_t &varidx; ae_int_t &terminationtype; double &debugeqerr; double &debugfs; double &debugff; double &debugdx; ae_int_t &debugfeasqpits; ae_int_t &debugfeasgpaits; ae_int_t &inneriterationscount; ae_int_t &outeriterationscount; }; /************************************************************************* This object stores nonlinear optimizer state. 
You should use functions provided by MinQP subpackage to work with this object *************************************************************************/ class _minqpstate_owner { public: _minqpstate_owner(); _minqpstate_owner(const _minqpstate_owner &rhs); _minqpstate_owner& operator=(const _minqpstate_owner &rhs); virtual ~_minqpstate_owner(); alglib_impl::minqpstate* c_ptr(); alglib_impl::minqpstate* c_ptr() const; protected: alglib_impl::minqpstate *p_struct; }; class minqpstate : public _minqpstate_owner { public: minqpstate(); minqpstate(const minqpstate &rhs); minqpstate& operator=(const minqpstate &rhs); virtual ~minqpstate(); }; /************************************************************************* This structure stores optimization report: * InnerIterationsCount number of inner iterations * OuterIterationsCount number of outer iterations * NCholesky number of Cholesky decomposition * NMV number of matrix-vector products (only products calculated as part of iterative process are counted) * TerminationType completion code (see below) Completion codes: * -5 inappropriate solver was used: * QuickQP solver for problem with general linear constraints (dense/sparse) * -4 BLEIC-QP or QuickQP solver found unconstrained direction of negative curvature (function is unbounded from below even under constraints), no meaningful minimum can be found. * -3 inconsistent constraints (or, maybe, feasible point is too hard to find). If you are sure that constraints are feasible, try to restart optimizer with better initial approximation. * -1 solver error * 1..4 successful completion * 5 MaxIts steps was taken * 7 stopping conditions are too stringent, further improvement is impossible, X contains best point found so far. *************************************************************************/ class _minqpreport_owner { public: _minqpreport_owner(); _minqpreport_owner(const _minqpreport_owner &rhs); _minqpreport_owner& operator=(const _minqpreport_owner &rhs); virtual ~_minqpreport_owner(); alglib_impl::minqpreport* c_ptr(); alglib_impl::minqpreport* c_ptr() const; protected: alglib_impl::minqpreport *p_struct; }; class minqpreport : public _minqpreport_owner { public: minqpreport(); minqpreport(const minqpreport &rhs); minqpreport& operator=(const minqpreport &rhs); virtual ~minqpreport(); ae_int_t &inneriterationscount; ae_int_t &outeriterationscount; ae_int_t &nmv; ae_int_t &ncholesky; ae_int_t &terminationtype; }; /************************************************************************* This object stores nonlinear optimizer state. 
You should use functions provided by MinNLC subpackage to work with this object *************************************************************************/ class _minnlcstate_owner { public: _minnlcstate_owner(); _minnlcstate_owner(const _minnlcstate_owner &rhs); _minnlcstate_owner& operator=(const _minnlcstate_owner &rhs); virtual ~_minnlcstate_owner(); alglib_impl::minnlcstate* c_ptr(); alglib_impl::minnlcstate* c_ptr() const; protected: alglib_impl::minnlcstate *p_struct; }; class minnlcstate : public _minnlcstate_owner { public: minnlcstate(); minnlcstate(const minnlcstate &rhs); minnlcstate& operator=(const minnlcstate &rhs); virtual ~minnlcstate(); ae_bool &needfi; ae_bool &needfij; ae_bool &xupdated; double &f; real_1d_array fi; real_2d_array j; real_1d_array x; }; /************************************************************************* This structure stores optimization report: * IterationsCount total number of inner iterations * NFEV number of gradient evaluations * TerminationType termination type (see below) TERMINATION CODES TerminationType field contains completion code, which can be: -8 internal integrity control detected infinite or NAN values in function/gradient. Abnormal termination signalled. -7 gradient verification failed. See MinNLCSetGradientCheck() for more information. 1 relative function improvement is no more than EpsF. 2 relative step is no more than EpsX. 4 gradient norm is no more than EpsG 5 MaxIts steps was taken 7 stopping conditions are too stringent, further improvement is impossible, X contains best point found so far. Other fields of this structure are not documented and should not be used! *************************************************************************/ class _minnlcreport_owner { public: _minnlcreport_owner(); _minnlcreport_owner(const _minnlcreport_owner &rhs); _minnlcreport_owner& operator=(const _minnlcreport_owner &rhs); virtual ~_minnlcreport_owner(); alglib_impl::minnlcreport* c_ptr(); alglib_impl::minnlcreport* c_ptr() const; protected: alglib_impl::minnlcreport *p_struct; }; class minnlcreport : public _minnlcreport_owner { public: minnlcreport(); minnlcreport(const minnlcreport &rhs); minnlcreport& operator=(const minnlcreport &rhs); virtual ~minnlcreport(); ae_int_t &iterationscount; ae_int_t &nfev; ae_int_t &varidx; ae_int_t &funcidx; ae_int_t &terminationtype; ae_int_t &dbgphase0its; }; /************************************************************************* This object stores nonlinear optimizer state. 
You should use functions provided by MinBC subpackage to work with this object *************************************************************************/ class _minbcstate_owner { public: _minbcstate_owner(); _minbcstate_owner(const _minbcstate_owner &rhs); _minbcstate_owner& operator=(const _minbcstate_owner &rhs); virtual ~_minbcstate_owner(); alglib_impl::minbcstate* c_ptr(); alglib_impl::minbcstate* c_ptr() const; protected: alglib_impl::minbcstate *p_struct; }; class minbcstate : public _minbcstate_owner { public: minbcstate(); minbcstate(const minbcstate &rhs); minbcstate& operator=(const minbcstate &rhs); virtual ~minbcstate(); ae_bool &needf; ae_bool &needfg; ae_bool &xupdated; double &f; real_1d_array g; real_1d_array x; }; /************************************************************************* This structure stores optimization report: * IterationsCount number of iterations * NFEV number of gradient evaluations * TerminationType termination type (see below) TERMINATION CODES TerminationType field contains completion code, which can be: -8 internal integrity control detected infinite or NAN values in function/gradient. Abnormal termination signalled. -7 gradient verification failed. See MinBCSetGradientCheck() for more information. -3 inconsistent constraints. 1 relative function improvement is no more than EpsF. 2 relative step is no more than EpsX. 4 gradient norm is no more than EpsG 5 MaxIts steps was taken 7 stopping conditions are too stringent, further improvement is impossible, X contains best point found so far. 8 terminated by user who called minbcrequesttermination(). X contains point which was "current accepted" when termination request was submitted. ADDITIONAL FIELDS There are additional fields which can be used for debugging: * DebugEqErr error in the equality constraints (2-norm) * DebugFS f, calculated at projection of initial point to the feasible set * DebugFF f, calculated at the final point * DebugDX |X_start-X_final| *************************************************************************/ class _minbcreport_owner { public: _minbcreport_owner(); _minbcreport_owner(const _minbcreport_owner &rhs); _minbcreport_owner& operator=(const _minbcreport_owner &rhs); virtual ~_minbcreport_owner(); alglib_impl::minbcreport* c_ptr(); alglib_impl::minbcreport* c_ptr() const; protected: alglib_impl::minbcreport *p_struct; }; class minbcreport : public _minbcreport_owner { public: minbcreport(); minbcreport(const minbcreport &rhs); minbcreport& operator=(const minbcreport &rhs); virtual ~minbcreport(); ae_int_t &iterationscount; ae_int_t &nfev; ae_int_t &varidx; ae_int_t &terminationtype; }; /************************************************************************* This object stores nonlinear optimizer state. 
You should use functions provided by MinNS subpackage to work with this object *************************************************************************/ class _minnsstate_owner { public: _minnsstate_owner(); _minnsstate_owner(const _minnsstate_owner &rhs); _minnsstate_owner& operator=(const _minnsstate_owner &rhs); virtual ~_minnsstate_owner(); alglib_impl::minnsstate* c_ptr(); alglib_impl::minnsstate* c_ptr() const; protected: alglib_impl::minnsstate *p_struct; }; class minnsstate : public _minnsstate_owner { public: minnsstate(); minnsstate(const minnsstate &rhs); minnsstate& operator=(const minnsstate &rhs); virtual ~minnsstate(); ae_bool &needfi; ae_bool &needfij; ae_bool &xupdated; double &f; real_1d_array fi; real_2d_array j; real_1d_array x; }; /************************************************************************* This structure stores optimization report: * IterationsCount total number of inner iterations * NFEV number of gradient evaluations * TerminationType termination type (see below) * CErr maximum violation of all types of constraints * LCErr maximum violation of linear constraints * NLCErr maximum violation of nonlinear constraints TERMINATION CODES TerminationType field contains completion code, which can be: -8 internal integrity control detected infinite or NAN values in function/gradient. Abnormal termination signalled. -3 box constraints are inconsistent -1 inconsistent parameters were passed: * penalty parameter for minnssetalgoags() is zero, but we have nonlinear constraints set by minnssetnlc() 2 sampling radius decreased below epsx 5 MaxIts steps was taken 7 stopping conditions are too stringent, further improvement is impossible, X contains best point found so far. 8 User requested termination via MinNSRequestTermination() Other fields of this structure are not documented and should not be used! 
*************************************************************************/ class _minnsreport_owner { public: _minnsreport_owner(); _minnsreport_owner(const _minnsreport_owner &rhs); _minnsreport_owner& operator=(const _minnsreport_owner &rhs); virtual ~_minnsreport_owner(); alglib_impl::minnsreport* c_ptr(); alglib_impl::minnsreport* c_ptr() const; protected: alglib_impl::minnsreport *p_struct; }; class minnsreport : public _minnsreport_owner { public: minnsreport(); minnsreport(const minnsreport &rhs); minnsreport& operator=(const minnsreport &rhs); virtual ~minnsreport(); ae_int_t &iterationscount; ae_int_t &nfev; double &cerr; double &lcerr; double &nlcerr; ae_int_t &terminationtype; ae_int_t &varidx; ae_int_t &funcidx; }; /************************************************************************* *************************************************************************/ class _minasastate_owner { public: _minasastate_owner(); _minasastate_owner(const _minasastate_owner &rhs); _minasastate_owner& operator=(const _minasastate_owner &rhs); virtual ~_minasastate_owner(); alglib_impl::minasastate* c_ptr(); alglib_impl::minasastate* c_ptr() const; protected: alglib_impl::minasastate *p_struct; }; class minasastate : public _minasastate_owner { public: minasastate(); minasastate(const minasastate &rhs); minasastate& operator=(const minasastate &rhs); virtual ~minasastate(); ae_bool &needfg; ae_bool &xupdated; double &f; real_1d_array g; real_1d_array x; }; /************************************************************************* *************************************************************************/ class _minasareport_owner { public: _minasareport_owner(); _minasareport_owner(const _minasareport_owner &rhs); _minasareport_owner& operator=(const _minasareport_owner &rhs); virtual ~_minasareport_owner(); alglib_impl::minasareport* c_ptr(); alglib_impl::minasareport* c_ptr() const; protected: alglib_impl::minasareport *p_struct; }; class minasareport : public _minasareport_owner { public: minasareport(); minasareport(const minasareport &rhs); minasareport& operator=(const minasareport &rhs); virtual ~minasareport(); ae_int_t &iterationscount; ae_int_t &nfev; ae_int_t &terminationtype; ae_int_t &activeconstraints; }; /************************************************************************* Levenberg-Marquardt optimizer. This structure should be created using one of the MinLMCreate???() functions. You should not access its fields directly; use ALGLIB functions to work with it. 
*************************************************************************/ class _minlmstate_owner { public: _minlmstate_owner(); _minlmstate_owner(const _minlmstate_owner &rhs); _minlmstate_owner& operator=(const _minlmstate_owner &rhs); virtual ~_minlmstate_owner(); alglib_impl::minlmstate* c_ptr(); alglib_impl::minlmstate* c_ptr() const; protected: alglib_impl::minlmstate *p_struct; }; class minlmstate : public _minlmstate_owner { public: minlmstate(); minlmstate(const minlmstate &rhs); minlmstate& operator=(const minlmstate &rhs); virtual ~minlmstate(); ae_bool &needf; ae_bool &needfg; ae_bool &needfgh; ae_bool &needfi; ae_bool &needfij; ae_bool &xupdated; double &f; real_1d_array fi; real_1d_array g; real_2d_array h; real_2d_array j; real_1d_array x; }; /************************************************************************* Optimization report, filled by MinLMResults() function FIELDS: * TerminationType, completetion code: * -8 optimizer detected NAN/INF values either in the function itself, or in its Jacobian * -7 derivative correctness check failed; see rep.funcidx, rep.varidx for more information. * -5 inappropriate solver was used: * solver created with minlmcreatefgh() used on problem with general linear constraints (set with minlmsetlc() call). * -3 constraints are inconsistent * 2 relative step is no more than EpsX. * 5 MaxIts steps was taken * 7 stopping conditions are too stringent, further improvement is impossible * 8 terminated by user who called MinLMRequestTermination(). X contains point which was "current accepted" when termination request was submitted. * IterationsCount, contains iterations count * NFunc, number of function calculations * NJac, number of Jacobi matrix calculations * NGrad, number of gradient calculations * NHess, number of Hessian calculations * NCholesky, number of Cholesky decomposition calculations *************************************************************************/ class _minlmreport_owner { public: _minlmreport_owner(); _minlmreport_owner(const _minlmreport_owner &rhs); _minlmreport_owner& operator=(const _minlmreport_owner &rhs); virtual ~_minlmreport_owner(); alglib_impl::minlmreport* c_ptr(); alglib_impl::minlmreport* c_ptr() const; protected: alglib_impl::minlmreport *p_struct; }; class minlmreport : public _minlmreport_owner { public: minlmreport(); minlmreport(const minlmreport &rhs); minlmreport& operator=(const minlmreport &rhs); virtual ~minlmreport(); ae_int_t &iterationscount; ae_int_t &terminationtype; ae_int_t &funcidx; ae_int_t &varidx; ae_int_t &nfunc; ae_int_t &njac; ae_int_t &ngrad; ae_int_t &nhess; ae_int_t &ncholesky; }; /************************************************************************* LIMITED MEMORY BFGS METHOD FOR LARGE SCALE OPTIMIZATION DESCRIPTION: The subroutine minimizes function F(x) of N arguments by using a quasi- Newton method (LBFGS scheme) which is optimized to use a minimum amount of memory. The subroutine generates the approximation of an inverse Hessian matrix by using information about the last M steps of the algorithm (instead of N). It lessens a required amount of memory from a value of order N^2 to a value of order 2*N*M. REQUIREMENTS: Algorithm will request following information during its operation: * function value F and its gradient G (simultaneously) at given point X USAGE: 1. User initializes algorithm state with MinLBFGSCreate() call 2. User tunes solver parameters with MinLBFGSSetCond() MinLBFGSSetStpMax() and other functions 3. 
User calls MinLBFGSOptimize() function which takes algorithm state and pointer (delegate, etc.) to callback function which calculates F/G. 4. User calls MinLBFGSResults() to get solution 5. Optionally user may call MinLBFGSRestartFrom() to solve another problem with same N/M but another starting point and/or another function. MinLBFGSRestartFrom() allows to reuse already initialized structure. INPUT PARAMETERS: N - problem dimension. N>0 M - number of corrections in the BFGS scheme of Hessian approximation update. Recommended value: 3<=M<=7. The smaller value causes worse convergence, the bigger will not cause a considerably better convergence, but will cause a fall in the performance. M<=N. X - initial solution approximation, array[0..N-1]. OUTPUT PARAMETERS: State - structure which stores algorithm state NOTES: 1. you may tune stopping conditions with MinLBFGSSetCond() function 2. if target function contains exp() or other fast growing functions, and optimization algorithm makes too large steps which leads to overflow, use MinLBFGSSetStpMax() function to bound algorithm's steps. However, L-BFGS rarely needs such a tuning. -- ALGLIB -- Copyright 02.04.2010 by Bochkanov Sergey *************************************************************************/ void minlbfgscreate(const ae_int_t n, const ae_int_t m, const real_1d_array &x, minlbfgsstate &state); void minlbfgscreate(const ae_int_t m, const real_1d_array &x, minlbfgsstate &state); /************************************************************************* The subroutine is finite difference variant of MinLBFGSCreate(). It uses finite differences in order to differentiate target function. Description below contains information which is specific to this function only. We recommend to read comments on MinLBFGSCreate() in order to get more information about creation of LBFGS optimizer. INPUT PARAMETERS: N - problem dimension, N>0: * if given, only leading N elements of X are used * if not given, automatically determined from size of X M - number of corrections in the BFGS scheme of Hessian approximation update. Recommended value: 3<=M<=7. The smaller value causes worse convergence, the bigger will not cause a considerably better convergence, but will cause a fall in the performance. M<=N. X - starting point, array[0..N-1]. DiffStep- differentiation step, >0 OUTPUT PARAMETERS: State - structure which stores algorithm state NOTES: 1. algorithm uses 4-point central formula for differentiation. 2. differentiation step along I-th axis is equal to DiffStep*S[I] where S[] is scaling vector which can be set by MinLBFGSSetScale() call. 3. we recommend you to use moderate values of differentiation step. Too large step will result in too large truncation errors, while too small step will result in too large numerical errors. 1.0E-6 can be good value to start with. 4. Numerical differentiation is very inefficient - one gradient calculation needs 4*N function evaluations. This function will work for any N - either small (1...10), moderate (10...100) or large (100...). However, performance penalty will be too severe for any N's except for small ones. We should also say that code which relies on numerical differentiation is less robust and precise. LBFGS needs exact gradient values. Imprecise gradient may slow down convergence, especially on highly nonlinear problems. Thus we recommend to use this function for fast prototyping on small- dimensional problems only, and to implement analytical gradient as soon as possible. 
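As a quick orientation, the fragment below is a minimal sketch of the two creation routes just described (the header name, namespace, starting point and M value are illustrative assumptions, not part of the official examples):

    #include "optimization.h"
    using namespace alglib;

    // starting point and number of corrections (3<=M<=7 is recommended above)
    real_1d_array x0 = "[0,0]";
    ae_int_t m = 5;

    // route 1: an exact gradient will be supplied later via MinLBFGSOptimize()
    minlbfgsstate s1;
    minlbfgscreate(m, x0, s1);

    // route 2: gradient is obtained by 4-point finite differences;
    // 1.0E-6 is the moderate step suggested in the notes above
    minlbfgsstate s2;
    minlbfgscreatef(m, x0, 1.0e-6, s2);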
-- ALGLIB -- Copyright 16.05.2011 by Bochkanov Sergey *************************************************************************/ void minlbfgscreatef(const ae_int_t n, const ae_int_t m, const real_1d_array &x, const double diffstep, minlbfgsstate &state); void minlbfgscreatef(const ae_int_t m, const real_1d_array &x, const double diffstep, minlbfgsstate &state); /************************************************************************* This function sets stopping conditions for L-BFGS optimization algorithm. INPUT PARAMETERS: State - structure which stores algorithm state EpsG - >=0 The subroutine finishes its work if the condition |v|=0 The subroutine finishes its work if on k+1-th iteration the condition |F(k+1)-F(k)|<=EpsF*max{|F(k)|,|F(k+1)|,1} is satisfied. EpsX - >=0 The subroutine finishes its work if on k+1-th iteration the condition |v|<=EpsX is fulfilled, where: * |.| means Euclidian norm * v - scaled step vector, v[i]=dx[i]/s[i] * dx - ste pvector, dx=X(k+1)-X(k) * s - scaling coefficients set by MinLBFGSSetScale() MaxIts - maximum number of iterations. If MaxIts=0, the number of iterations is unlimited. Passing EpsG=0, EpsF=0, EpsX=0 and MaxIts=0 (simultaneously) will lead to automatic stopping criterion selection (small EpsX). -- ALGLIB -- Copyright 02.04.2010 by Bochkanov Sergey *************************************************************************/ void minlbfgssetcond(const minlbfgsstate &state, const double epsg, const double epsf, const double epsx, const ae_int_t maxits); /************************************************************************* This function turns on/off reporting. INPUT PARAMETERS: State - structure which stores algorithm state NeedXRep- whether iteration reports are needed or not If NeedXRep is True, algorithm will call rep() callback function if it is provided to MinLBFGSOptimize(). -- ALGLIB -- Copyright 02.04.2010 by Bochkanov Sergey *************************************************************************/ void minlbfgssetxrep(const minlbfgsstate &state, const bool needxrep); /************************************************************************* This function sets maximum step length INPUT PARAMETERS: State - structure which stores algorithm state StpMax - maximum step length, >=0. Set StpMax to 0.0 (default), if you don't want to limit step length. Use this subroutine when you optimize target function which contains exp() or other fast growing functions, and optimization algorithm makes too large steps which leads to overflow. This function allows us to reject steps that are too large (and therefore expose us to the possible overflow) without actually calculating function value at the x+stp*d. -- ALGLIB -- Copyright 02.04.2010 by Bochkanov Sergey *************************************************************************/ void minlbfgssetstpmax(const minlbfgsstate &state, const double stpmax); /************************************************************************* This function sets scaling coefficients for LBFGS optimizer. ALGLIB optimizers use scaling matrices to test stopping conditions (step size and gradient are scaled before comparison with tolerances). Scale of the I-th variable is a translation invariant measure of: a) "how large" the variable is b) how large the step should be to make significant changes in the function Scaling is also used by finite difference variant of the optimizer - step along I-th axis is equal to DiffStep*S[I]. In most optimizers (and in the LBFGS too) scaling is NOT a form of preconditioning. 
It just affects stopping conditions. You should set preconditioner by separate call to one of the MinLBFGSSetPrec...() functions. There is special preconditioning mode, however, which uses scaling coefficients to form diagonal preconditioning matrix. You can turn this mode on, if you want. But you should understand that scaling is not the same thing as preconditioning - these are two different, although related forms of tuning solver. INPUT PARAMETERS: State - structure stores algorithm state S - array[N], non-zero scaling coefficients S[i] may be negative, sign doesn't matter. -- ALGLIB -- Copyright 14.01.2011 by Bochkanov Sergey *************************************************************************/ void minlbfgssetscale(const minlbfgsstate &state, const real_1d_array &s); /************************************************************************* Modification of the preconditioner: default preconditioner (simple scaling, same for all elements of X) is used. INPUT PARAMETERS: State - structure which stores algorithm state NOTE: you can change preconditioner "on the fly", during algorithm iterations. -- ALGLIB -- Copyright 13.10.2010 by Bochkanov Sergey *************************************************************************/ void minlbfgssetprecdefault(const minlbfgsstate &state); /************************************************************************* Modification of the preconditioner: Cholesky factorization of approximate Hessian is used. INPUT PARAMETERS: State - structure which stores algorithm state P - triangular preconditioner, Cholesky factorization of the approximate Hessian. array[0..N-1,0..N-1], (if larger, only leading N elements are used). IsUpper - whether upper or lower triangle of P is given (other triangle is not referenced) After call to this function preconditioner is changed to P (P is copied into the internal buffer). NOTE: you can change preconditioner "on the fly", during algorithm iterations. NOTE 2: P should be nonsingular. Exception will be thrown otherwise. -- ALGLIB -- Copyright 13.10.2010 by Bochkanov Sergey *************************************************************************/ void minlbfgssetpreccholesky(const minlbfgsstate &state, const real_2d_array &p, const bool isupper); /************************************************************************* Modification of the preconditioner: diagonal of approximate Hessian is used. INPUT PARAMETERS: State - structure which stores algorithm state D - diagonal of the approximate Hessian, array[0..N-1], (if larger, only leading N elements are used). NOTE: you can change preconditioner "on the fly", during algorithm iterations. NOTE 2: D[i] should be positive. Exception will be thrown otherwise. NOTE 3: you should pass diagonal of approximate Hessian - NOT ITS INVERSE. -- ALGLIB -- Copyright 13.10.2010 by Bochkanov Sergey *************************************************************************/ void minlbfgssetprecdiag(const minlbfgsstate &state, const real_1d_array &d); /************************************************************************* Modification of the preconditioner: scale-based diagonal preconditioning. This preconditioning mode can be useful when you don't have approximate diagonal of Hessian, but you know that your variables are badly scaled (for example, one variable is in [1,10], and another in [1000,100000]), and most part of the ill-conditioning comes from different scales of vars. In this case simple scale-based preconditioner, with H[i] = 1/(s[i]^2), can greatly improve convergence. 
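To make the distinction between scaling and preconditioning concrete, here is a small sketch for an already created L-BFGS state (variable ranges, tolerances and Hessian diagonal are illustrative assumptions):

    // scales: variable 0 lives around 1, variable 1 around 1000
    real_1d_array s = "[1,1000]";
    minlbfgssetscale(state, s);              // affects stopping tests and DiffStep only

    // stopping conditions: scaled step below 1E-6 or at most 100 iterations
    minlbfgssetcond(state, 0.0, 0.0, 1.0e-6, 100);

    // optional preconditioner built from an approximate Hessian diagonal
    real_1d_array h = "[2.0,0.5]";           // must be positive, NOT inverted
    minlbfgssetprecdiag(state, h);

    // alternative: let the solver build a diagonal preconditioner from the scales
    // minlbfgssetprecscale(state);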
IMPRTANT: you should set scale of your variables with MinLBFGSSetScale() call (before or after MinLBFGSSetPrecScale() call). Without knowledge of the scale of your variables scale-based preconditioner will be just unit matrix. INPUT PARAMETERS: State - structure which stores algorithm state -- ALGLIB -- Copyright 13.10.2010 by Bochkanov Sergey *************************************************************************/ void minlbfgssetprecscale(const minlbfgsstate &state); /************************************************************************* This function provides reverse communication interface Reverse communication interface is not documented or recommended to use. See below for functions which provide better documented API *************************************************************************/ bool minlbfgsiteration(const minlbfgsstate &state); /************************************************************************* This family of functions is used to launcn iterations of nonlinear optimizer These functions accept following parameters: state - algorithm state func - callback which calculates function (or merit function) value func at given point x grad - callback which calculates function (or merit function) value func and gradient grad at given point x rep - optional callback which is called after each iteration can be NULL ptr - optional pointer which is passed to func/grad/hess/jac/rep can be NULL NOTES: 1. This function has two different implementations: one which uses exact (analytical) user-supplied gradient, and one which uses function value only and numerically differentiates function in order to obtain gradient. Depending on the specific function used to create optimizer object (either MinLBFGSCreate() for analytical gradient or MinLBFGSCreateF() for numerical differentiation) you should choose appropriate variant of MinLBFGSOptimize() - one which accepts function AND gradient or one which accepts function ONLY. Be careful to choose variant of MinLBFGSOptimize() which corresponds to your optimization scheme! Table below lists different combinations of callback (function/gradient) passed to MinLBFGSOptimize() and specific function used to create optimizer. | USER PASSED TO MinLBFGSOptimize() CREATED WITH | function only | function and gradient ------------------------------------------------------------ MinLBFGSCreateF() | work FAIL MinLBFGSCreate() | FAIL work Here "FAIL" denotes inappropriate combinations of optimizer creation function and MinLBFGSOptimize() version. Attemps to use such combination (for example, to create optimizer with MinLBFGSCreateF() and to pass gradient information to MinCGOptimize()) will lead to exception being thrown. Either you did not pass gradient when it WAS needed or you passed gradient when it was NOT needed. 
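A minimal sketch of the "function and gradient" route from the table above; the target f = 100*(x0+3)^4 + (x1-3)^4 and its gradient are assumptions used only for illustration:

    void f_grad(const real_1d_array &x, double &func, real_1d_array &grad, void *ptr)
    {
        // illustrative target: f = 100*(x0+3)^4 + (x1-3)^4
        double a = x[0]+3, b = x[1]-3;
        func = 100*a*a*a*a + b*b*b*b;
        grad[0] = 400*a*a*a;
        grad[1] = 4*b*b*b;
    }

    // state was created with minlbfgscreate(), so the gradient variant is required;
    // passing f_grad to an optimizer created with minlbfgscreatef() would throw
    minlbfgsoptimize(state, f_grad);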
-- ALGLIB -- Copyright 20.03.2009 by Bochkanov Sergey *************************************************************************/ void minlbfgsoptimize(minlbfgsstate &state, void (*func)(const real_1d_array &x, double &func, void *ptr), void (*rep)(const real_1d_array &x, double func, void *ptr) = NULL, void *ptr = NULL); void minlbfgsoptimize(minlbfgsstate &state, void (*grad)(const real_1d_array &x, double &func, real_1d_array &grad, void *ptr), void (*rep)(const real_1d_array &x, double func, void *ptr) = NULL, void *ptr = NULL); /************************************************************************* L-BFGS algorithm results INPUT PARAMETERS: State - algorithm state OUTPUT PARAMETERS: X - array[0..N-1], solution Rep - optimization report: * Rep.TerminationType completetion code: * -8 internal integrity control detected infinite or NAN values in function/gradient. Abnormal termination signalled. * -7 gradient verification failed. See MinLBFGSSetGradientCheck() for more information. * -2 rounding errors prevent further improvement. X contains best point found. * -1 incorrect parameters were specified * 1 relative function improvement is no more than EpsF. * 2 relative step is no more than EpsX. * 4 gradient norm is no more than EpsG * 5 MaxIts steps was taken * 7 stopping conditions are too stringent, further improvement is impossible * 8 terminated by user who called minlbfgsrequesttermination(). X contains point which was "current accepted" when termination request was submitted. * Rep.IterationsCount contains iterations count * NFEV countains number of function calculations -- ALGLIB -- Copyright 02.04.2010 by Bochkanov Sergey *************************************************************************/ void minlbfgsresults(const minlbfgsstate &state, real_1d_array &x, minlbfgsreport &rep); /************************************************************************* L-BFGS algorithm results Buffered implementation of MinLBFGSResults which uses pre-allocated buffer to store X[]. If buffer size is too small, it resizes buffer. It is intended to be used in the inner cycles of performance critical algorithms where array reallocation penalty is too large to be ignored. -- ALGLIB -- Copyright 20.08.2010 by Bochkanov Sergey *************************************************************************/ void minlbfgsresultsbuf(const minlbfgsstate &state, real_1d_array &x, minlbfgsreport &rep); /************************************************************************* This subroutine restarts LBFGS algorithm from new point. All optimization parameters are left unchanged. This function allows to solve multiple optimization problems (which must have same number of dimensions) without object reallocation penalty. INPUT PARAMETERS: State - structure used to store algorithm state X - new starting point. -- ALGLIB -- Copyright 30.07.2010 by Bochkanov Sergey *************************************************************************/ void minlbfgsrestartfrom(const minlbfgsstate &state, const real_1d_array &x); /************************************************************************* This subroutine submits request for termination of running optimizer. It should be called from user-supplied callback when user decides that it is time to "smoothly" terminate optimization process. As result, optimizer stops at point which was "current accepted" when termination request was submitted and returns error code 8 (successful termination). 
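One possible way to submit such a request from a reporting callback is sketched below; handing the state over through the user pointer and the stopping rule are assumptions of this sketch, not requirements of the API:

    void progress_rep(const real_1d_array &x, double func, void *ptr)
    {
        minlbfgsstate *s = (minlbfgsstate*)ptr;   // state passed via the ptr argument
        if( func<1.0e-9 )                         // illustrative stopping rule
            minlbfgsrequesttermination(*s);       // solver will stop with code 8
    }

    real_1d_array xsol;
    minlbfgsreport rep;
    minlbfgssetxrep(state, true);                 // make sure rep() is actually called
    minlbfgsoptimize(state, f_grad, progress_rep, &state);  // f_grad as in the sketch above
    minlbfgsresults(state, xsol, rep);            // rep.terminationtype==8 after the request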
INPUT PARAMETERS: State - optimizer structure NOTE: after request for termination optimizer may perform several additional calls to user-supplied callbacks. It does NOT guarantee to stop immediately - it just guarantees that these additional calls will be discarded later. NOTE: calling this function on optimizer which is NOT running will have no effect. NOTE: multiple calls to this function are possible. First call is counted, subsequent calls are silently ignored. -- ALGLIB -- Copyright 08.10.2014 by Bochkanov Sergey *************************************************************************/ void minlbfgsrequesttermination(const minlbfgsstate &state); /************************************************************************* This subroutine turns on verification of the user-supplied analytic gradient: * user calls this subroutine before optimization begins * MinLBFGSOptimize() is called * prior to actual optimization, for each component of parameters being optimized X[i] algorithm performs following steps: * two trial steps are made to X[i]-TestStep*S[i] and X[i]+TestStep*S[i], where X[i] is i-th component of the initial point and S[i] is a scale of i-th parameter * if needed, steps are bounded with respect to constraints on X[] * F(X) is evaluated at these trial points * we perform one more evaluation in the middle point of the interval * we build cubic model using function values and derivatives at trial points and we compare its prediction with actual value in the middle point * in case difference between prediction and actual value is higher than some predetermined threshold, algorithm stops with completion code -7; Rep.VarIdx is set to index of the parameter with incorrect derivative. * after verification is over, algorithm proceeds to the actual optimization. NOTE 1: verification needs N (parameters count) gradient evaluations. It is very costly and you should use it only for low dimensional problems, when you want to be sure that you've correctly calculated analytic derivatives. You should not use it in the production code (unless you want to check derivatives provided by some third party). NOTE 2: you should carefully choose TestStep. Value which is too large (so large that function behaviour is significantly non-cubic) will lead to false alarms. You may use different step for different parameters by means of setting scale with MinLBFGSSetScale(). NOTE 3: this function may lead to false positives. In case it reports that I-th derivative was calculated incorrectly, you may decrease test step and try one more time - maybe your function changes too sharply and your step is too large for such rapidly chanding function. INPUT PARAMETERS: State - structure used to store algorithm state TestStep - verification step: * TestStep=0 turns verification off * TestStep>0 activates verification -- ALGLIB -- Copyright 24.05.2012 by Bochkanov Sergey *************************************************************************/ void minlbfgssetgradientcheck(const minlbfgsstate &state, const double teststep); /************************************************************************* NONLINEAR CONJUGATE GRADIENT METHOD DESCRIPTION: The subroutine minimizes function F(x) of N arguments by using one of the nonlinear conjugate gradient methods. These CG methods are globally convergent (even on non-convex functions) as long as grad(f) is Lipschitz continuous in a some neighborhood of the L = { x : f(x)<=f(x0) }. 
REQUIREMENTS: Algorithm will request following information during its operation: * function value F and its gradient G (simultaneously) at given point X USAGE: 1. User initializes algorithm state with MinCGCreate() call 2. User tunes solver parameters with MinCGSetCond(), MinCGSetStpMax() and other functions 3. User calls MinCGOptimize() function which takes algorithm state and pointer (delegate, etc.) to callback function which calculates F/G. 4. User calls MinCGResults() to get solution 5. Optionally, user may call MinCGRestartFrom() to solve another problem with same N but another starting point and/or another function. MinCGRestartFrom() allows to reuse already initialized structure. INPUT PARAMETERS: N - problem dimension, N>0: * if given, only leading N elements of X are used * if not given, automatically determined from size of X X - starting point, array[0..N-1]. OUTPUT PARAMETERS: State - structure which stores algorithm state -- ALGLIB -- Copyright 25.03.2010 by Bochkanov Sergey *************************************************************************/ void mincgcreate(const ae_int_t n, const real_1d_array &x, mincgstate &state); void mincgcreate(const real_1d_array &x, mincgstate &state); /************************************************************************* The subroutine is finite difference variant of MinCGCreate(). It uses finite differences in order to differentiate target function. Description below contains information which is specific to this function only. We recommend to read comments on MinCGCreate() in order to get more information about creation of CG optimizer. INPUT PARAMETERS: N - problem dimension, N>0: * if given, only leading N elements of X are used * if not given, automatically determined from size of X X - starting point, array[0..N-1]. DiffStep- differentiation step, >0 OUTPUT PARAMETERS: State - structure which stores algorithm state NOTES: 1. algorithm uses 4-point central formula for differentiation. 2. differentiation step along I-th axis is equal to DiffStep*S[I] where S[] is scaling vector which can be set by MinCGSetScale() call. 3. we recommend you to use moderate values of differentiation step. Too large step will result in too large truncation errors, while too small step will result in too large numerical errors. 1.0E-6 can be good value to start with. 4. Numerical differentiation is very inefficient - one gradient calculation needs 4*N function evaluations. This function will work for any N - either small (1...10), moderate (10...100) or large (100...). However, performance penalty will be too severe for any N's except for small ones. We should also say that code which relies on numerical differentiation is less robust and precise. L-BFGS needs exact gradient values. Imprecise gradient may slow down convergence, especially on highly nonlinear problems. Thus we recommend to use this function for fast prototyping on small- dimensional problems only, and to implement analytical gradient as soon as possible. -- ALGLIB -- Copyright 16.05.2011 by Bochkanov Sergey *************************************************************************/ void mincgcreatef(const ae_int_t n, const real_1d_array &x, const double diffstep, mincgstate &state); void mincgcreatef(const real_1d_array &x, const double diffstep, mincgstate &state); /************************************************************************* This function sets stopping conditions for CG optimization algorithm. 
INPUT PARAMETERS: State - structure which stores algorithm state EpsG - >=0 The subroutine finishes its work if the condition |v|=0 The subroutine finishes its work if on k+1-th iteration the condition |F(k+1)-F(k)|<=EpsF*max{|F(k)|,|F(k+1)|,1} is satisfied. EpsX - >=0 The subroutine finishes its work if on k+1-th iteration the condition |v|<=EpsX is fulfilled, where: * |.| means Euclidian norm * v - scaled step vector, v[i]=dx[i]/s[i] * dx - ste pvector, dx=X(k+1)-X(k) * s - scaling coefficients set by MinCGSetScale() MaxIts - maximum number of iterations. If MaxIts=0, the number of iterations is unlimited. Passing EpsG=0, EpsF=0, EpsX=0 and MaxIts=0 (simultaneously) will lead to automatic stopping criterion selection (small EpsX). -- ALGLIB -- Copyright 02.04.2010 by Bochkanov Sergey *************************************************************************/ void mincgsetcond(const mincgstate &state, const double epsg, const double epsf, const double epsx, const ae_int_t maxits); /************************************************************************* This function sets scaling coefficients for CG optimizer. ALGLIB optimizers use scaling matrices to test stopping conditions (step size and gradient are scaled before comparison with tolerances). Scale of the I-th variable is a translation invariant measure of: a) "how large" the variable is b) how large the step should be to make significant changes in the function Scaling is also used by finite difference variant of CG optimizer - step along I-th axis is equal to DiffStep*S[I]. In most optimizers (and in the CG too) scaling is NOT a form of preconditioning. It just affects stopping conditions. You should set preconditioner by separate call to one of the MinCGSetPrec...() functions. There is special preconditioning mode, however, which uses scaling coefficients to form diagonal preconditioning matrix. You can turn this mode on, if you want. But you should understand that scaling is not the same thing as preconditioning - these are two different, although related forms of tuning solver. INPUT PARAMETERS: State - structure stores algorithm state S - array[N], non-zero scaling coefficients S[i] may be negative, sign doesn't matter. -- ALGLIB -- Copyright 14.01.2011 by Bochkanov Sergey *************************************************************************/ void mincgsetscale(const mincgstate &state, const real_1d_array &s); /************************************************************************* This function turns on/off reporting. INPUT PARAMETERS: State - structure which stores algorithm state NeedXRep- whether iteration reports are needed or not If NeedXRep is True, algorithm will call rep() callback function if it is provided to MinCGOptimize(). -- ALGLIB -- Copyright 02.04.2010 by Bochkanov Sergey *************************************************************************/ void mincgsetxrep(const mincgstate &state, const bool needxrep); /************************************************************************* This function sets CG algorithm. 
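Before the remaining tuning functions, here is a compact end-to-end sketch tying the CG calls above together (the convex target and tolerances are illustrative assumptions):

    void quad_grad(const real_1d_array &x, double &func, real_1d_array &grad, void *ptr)
    {
        // illustrative convex target: f = (x0-1)^2 + (x1+2)^2
        func = (x[0]-1)*(x[0]-1) + (x[1]+2)*(x[1]+2);
        grad[0] = 2*(x[0]-1);
        grad[1] = 2*(x[1]+2);
    }

    real_1d_array x = "[0,0]";
    mincgstate state;
    mincgreport rep;
    mincgcreate(x, state);                     // analytic-gradient variant
    mincgsetcond(state, 0.0, 0.0, 1.0e-8, 0);  // stop on small scaled step
    mincgoptimize(state, quad_grad);
    mincgresults(state, x, rep);               // x is close to [1,-2] on success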
INPUT PARAMETERS: State - structure which stores algorithm state CGType - algorithm type: * -1 automatic selection of the best algorithm * 0 DY (Dai and Yuan) algorithm * 1 Hybrid DY-HS algorithm -- ALGLIB -- Copyright 02.04.2010 by Bochkanov Sergey *************************************************************************/ void mincgsetcgtype(const mincgstate &state, const ae_int_t cgtype); /************************************************************************* This function sets maximum step length INPUT PARAMETERS: State - structure which stores algorithm state StpMax - maximum step length, >=0. Set StpMax to 0.0, if you don't want to limit step length. Use this subroutine when you optimize target function which contains exp() or other fast growing functions, and optimization algorithm makes too large steps which leads to overflow. This function allows us to reject steps that are too large (and therefore expose us to the possible overflow) without actually calculating function value at the x+stp*d. -- ALGLIB -- Copyright 02.04.2010 by Bochkanov Sergey *************************************************************************/ void mincgsetstpmax(const mincgstate &state, const double stpmax); /************************************************************************* This function allows to suggest initial step length to the CG algorithm. Suggested step length is used as starting point for the line search. It can be useful when you have badly scaled problem, i.e. when ||grad|| (which is used as initial estimate for the first step) is many orders of magnitude different from the desired step. Line search may fail on such problems without good estimate of initial step length. Imagine, for example, problem with ||grad||=10^50 and desired step equal to 0.1 Line search function will use 10^50 as initial step, then it will decrease step length by 2 (up to 20 attempts) and will get 10^44, which is still too large. This function allows us to tell than line search should be started from some moderate step length, like 1.0, so algorithm will be able to detect desired step length in a several searches. Default behavior (when no step is suggested) is to use preconditioner, if it is available, to generate initial estimate of step length. This function influences only first iteration of algorithm. It should be called between MinCGCreate/MinCGRestartFrom() call and MinCGOptimize call. Suggested step is ignored if you have preconditioner. INPUT PARAMETERS: State - structure used to store algorithm state. Stp - initial estimate of the step length. Can be zero (no estimate). -- ALGLIB -- Copyright 30.07.2010 by Bochkanov Sergey *************************************************************************/ void mincgsuggeststep(const mincgstate &state, const double stp); /************************************************************************* Modification of the preconditioner: preconditioning is turned off. INPUT PARAMETERS: State - structure which stores algorithm state NOTE: you can change preconditioner "on the fly", during algorithm iterations. -- ALGLIB -- Copyright 13.10.2010 by Bochkanov Sergey *************************************************************************/ void mincgsetprecdefault(const mincgstate &state); /************************************************************************* Modification of the preconditioner: diagonal of approximate Hessian is used. 
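For badly scaled problems of the kind described above, a suggested first-step length might be set as follows (the value 1.0 is only an assumption for illustration):

    // must be called between mincgcreate()/mincgrestartfrom() and mincgoptimize();
    // the suggestion affects only the first iteration and is ignored if a
    // preconditioner has been set
    mincgsuggeststep(state, 1.0);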
INPUT PARAMETERS: State - structure which stores algorithm state D - diagonal of the approximate Hessian, array[0..N-1], (if larger, only leading N elements are used). NOTE: you can change preconditioner "on the fly", during algorithm iterations. NOTE 2: D[i] should be positive. Exception will be thrown otherwise. NOTE 3: you should pass diagonal of approximate Hessian - NOT ITS INVERSE. -- ALGLIB -- Copyright 13.10.2010 by Bochkanov Sergey *************************************************************************/ void mincgsetprecdiag(const mincgstate &state, const real_1d_array &d); /************************************************************************* Modification of the preconditioner: scale-based diagonal preconditioning. This preconditioning mode can be useful when you don't have approximate diagonal of Hessian, but you know that your variables are badly scaled (for example, one variable is in [1,10], and another in [1000,100000]), and most part of the ill-conditioning comes from different scales of vars. In this case simple scale-based preconditioner, with H[i] = 1/(s[i]^2), can greatly improve convergence. IMPRTANT: you should set scale of your variables with MinCGSetScale() call (before or after MinCGSetPrecScale() call). Without knowledge of the scale of your variables scale-based preconditioner will be just unit matrix. INPUT PARAMETERS: State - structure which stores algorithm state NOTE: you can change preconditioner "on the fly", during algorithm iterations. -- ALGLIB -- Copyright 13.10.2010 by Bochkanov Sergey *************************************************************************/ void mincgsetprecscale(const mincgstate &state); /************************************************************************* This function provides reverse communication interface Reverse communication interface is not documented or recommended to use. See below for functions which provide better documented API *************************************************************************/ bool mincgiteration(const mincgstate &state); /************************************************************************* This family of functions is used to launcn iterations of nonlinear optimizer These functions accept following parameters: state - algorithm state func - callback which calculates function (or merit function) value func at given point x grad - callback which calculates function (or merit function) value func and gradient grad at given point x rep - optional callback which is called after each iteration can be NULL ptr - optional pointer which is passed to func/grad/hess/jac/rep can be NULL NOTES: 1. This function has two different implementations: one which uses exact (analytical) user-supplied gradient, and one which uses function value only and numerically differentiates function in order to obtain gradient. Depending on the specific function used to create optimizer object (either MinCGCreate() for analytical gradient or MinCGCreateF() for numerical differentiation) you should choose appropriate variant of MinCGOptimize() - one which accepts function AND gradient or one which accepts function ONLY. Be careful to choose variant of MinCGOptimize() which corresponds to your optimization scheme! Table below lists different combinations of callback (function/gradient) passed to MinCGOptimize() and specific function used to create optimizer. 
| USER PASSED TO MinCGOptimize() CREATED WITH | function only | function and gradient ------------------------------------------------------------ MinCGCreateF() | work FAIL MinCGCreate() | FAIL work Here "FAIL" denotes inappropriate combinations of optimizer creation function and MinCGOptimize() version. Attemps to use such combination (for example, to create optimizer with MinCGCreateF() and to pass gradient information to MinCGOptimize()) will lead to exception being thrown. Either you did not pass gradient when it WAS needed or you passed gradient when it was NOT needed. -- ALGLIB -- Copyright 20.04.2009 by Bochkanov Sergey *************************************************************************/ void mincgoptimize(mincgstate &state, void (*func)(const real_1d_array &x, double &func, void *ptr), void (*rep)(const real_1d_array &x, double func, void *ptr) = NULL, void *ptr = NULL); void mincgoptimize(mincgstate &state, void (*grad)(const real_1d_array &x, double &func, real_1d_array &grad, void *ptr), void (*rep)(const real_1d_array &x, double func, void *ptr) = NULL, void *ptr = NULL); /************************************************************************* Conjugate gradient results INPUT PARAMETERS: State - algorithm state OUTPUT PARAMETERS: X - array[0..N-1], solution Rep - optimization report: * Rep.TerminationType completetion code: * -8 internal integrity control detected infinite or NAN values in function/gradient. Abnormal termination signalled. * -7 gradient verification failed. See MinCGSetGradientCheck() for more information. * 1 relative function improvement is no more than EpsF. * 2 relative step is no more than EpsX. * 4 gradient norm is no more than EpsG * 5 MaxIts steps was taken * 7 stopping conditions are too stringent, further improvement is impossible, we return best X found so far * 8 terminated by user * Rep.IterationsCount contains iterations count * NFEV countains number of function calculations -- ALGLIB -- Copyright 20.04.2009 by Bochkanov Sergey *************************************************************************/ void mincgresults(const mincgstate &state, real_1d_array &x, mincgreport &rep); /************************************************************************* Conjugate gradient results Buffered implementation of MinCGResults(), which uses pre-allocated buffer to store X[]. If buffer size is too small, it resizes buffer. It is intended to be used in the inner cycles of performance critical algorithms where array reallocation penalty is too large to be ignored. -- ALGLIB -- Copyright 20.04.2009 by Bochkanov Sergey *************************************************************************/ void mincgresultsbuf(const mincgstate &state, real_1d_array &x, mincgreport &rep); /************************************************************************* This subroutine restarts CG algorithm from new point. All optimization parameters are left unchanged. This function allows to solve multiple optimization problems (which must have same number of dimensions) without object reallocation penalty. INPUT PARAMETERS: State - structure used to store algorithm state. X - new starting point. -- ALGLIB -- Copyright 30.07.2010 by Bochkanov Sergey *************************************************************************/ void mincgrestartfrom(const mincgstate &state, const real_1d_array &x); /************************************************************************* This subroutine submits request for termination of running optimizer. 
It should be called from user-supplied callback when user decides that it is time to "smoothly" terminate optimization process. As result, optimizer stops at point which was "current accepted" when termination request was submitted and returns error code 8 (successful termination). INPUT PARAMETERS: State - optimizer structure NOTE: after request for termination optimizer may perform several additional calls to user-supplied callbacks. It does NOT guarantee to stop immediately - it just guarantees that these additional calls will be discarded later. NOTE: calling this function on optimizer which is NOT running will have no effect. NOTE: multiple calls to this function are possible. First call is counted, subsequent calls are silently ignored. -- ALGLIB -- Copyright 08.10.2014 by Bochkanov Sergey *************************************************************************/ void mincgrequesttermination(const mincgstate &state); /************************************************************************* This subroutine turns on verification of the user-supplied analytic gradient: * user calls this subroutine before optimization begins * MinCGOptimize() is called * prior to actual optimization, for each component of parameters being optimized X[i] algorithm performs following steps: * two trial steps are made to X[i]-TestStep*S[i] and X[i]+TestStep*S[i], where X[i] is i-th component of the initial point and S[i] is a scale of i-th parameter * F(X) is evaluated at these trial points * we perform one more evaluation in the middle point of the interval * we build cubic model using function values and derivatives at trial points and we compare its prediction with actual value in the middle point * in case difference between prediction and actual value is higher than some predetermined threshold, algorithm stops with completion code -7; Rep.VarIdx is set to index of the parameter with incorrect derivative. * after verification is over, algorithm proceeds to the actual optimization. NOTE 1: verification needs N (parameters count) gradient evaluations. It is very costly and you should use it only for low dimensional problems, when you want to be sure that you've correctly calculated analytic derivatives. You should not use it in the production code (unless you want to check derivatives provided by some third party). NOTE 2: you should carefully choose TestStep. Value which is too large (so large that function behaviour is significantly non-cubic) will lead to false alarms. You may use different step for different parameters by means of setting scale with MinCGSetScale(). NOTE 3: this function may lead to false positives. In case it reports that I-th derivative was calculated incorrectly, you may decrease test step and try one more time - maybe your function changes too sharply and your step is too large for such rapidly chanding function. 
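A short sketch of turning verification on before optimization; the TestStep value and the callback name my_func_grad are assumptions (remember to set scales first if variables differ greatly in magnitude):

    mincgsetgradientcheck(state, 1.0e-3);   // nonzero TestStep activates verification
    mincgoptimize(state, my_func_grad);     // hypothetical analytic-gradient callback
    mincgresults(state, x, rep);
    if( rep.terminationtype==-7 )           // check failed
        printf("suspicious derivative for variable %d\n", (int)rep.varidx);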
INPUT PARAMETERS: State - structure used to store algorithm state TestStep - verification step: * TestStep=0 turns verification off * TestStep>0 activates verification -- ALGLIB -- Copyright 31.05.2012 by Bochkanov Sergey *************************************************************************/ void mincgsetgradientcheck(const mincgstate &state, const double teststep); /************************************************************************* BOUND CONSTRAINED OPTIMIZATION WITH ADDITIONAL LINEAR EQUALITY AND INEQUALITY CONSTRAINTS DESCRIPTION: The subroutine minimizes function F(x) of N arguments subject to any combination of: * bound constraints * linear inequality constraints * linear equality constraints REQUIREMENTS: * user must provide function value and gradient * starting point X0 must be feasible or not too far away from the feasible set * grad(f) must be Lipschitz continuous on a level set: L = { x : f(x)<=f(x0) } * function must be defined everywhere on the feasible set F USAGE: Constrained optimization if far more complex than the unconstrained one. Here we give very brief outline of the BLEIC optimizer. We strongly recommend you to read examples in the ALGLIB Reference Manual and to read ALGLIB User Guide on optimization, which is available at http://www.alglib.net/optimization/ 1. User initializes algorithm state with MinBLEICCreate() call 2. USer adds boundary and/or linear constraints by calling MinBLEICSetBC() and MinBLEICSetLC() functions. 3. User sets stopping conditions with MinBLEICSetCond(). 4. User calls MinBLEICOptimize() function which takes algorithm state and pointer (delegate, etc.) to callback function which calculates F/G. 5. User calls MinBLEICResults() to get solution 6. Optionally user may call MinBLEICRestartFrom() to solve another problem with same N but another starting point. MinBLEICRestartFrom() allows to reuse already initialized structure. NOTE: if you have box-only constraints (no general linear constraints), then MinBC optimizer can be better option. It uses special, faster constraint activation method, which performs better on problems with multiple constraints active at the solution. On small-scale problems performance of MinBC is similar to that of MinBLEIC, but on large-scale ones (hundreds and thousands of active constraints) it can be several times faster than MinBLEIC. INPUT PARAMETERS: N - problem dimension, N>0: * if given, only leading N elements of X are used * if not given, automatically determined from size ofX X - starting point, array[N]: * it is better to set X to a feasible point * but X can be infeasible, in which case algorithm will try to find feasible point first, using X as initial approximation. OUTPUT PARAMETERS: State - structure stores algorithm state -- ALGLIB -- Copyright 28.11.2010 by Bochkanov Sergey *************************************************************************/ void minbleiccreate(const ae_int_t n, const real_1d_array &x, minbleicstate &state); void minbleiccreate(const real_1d_array &x, minbleicstate &state); /************************************************************************* The subroutine is finite difference variant of MinBLEICCreate(). It uses finite differences in order to differentiate target function. Description below contains information which is specific to this function only. We recommend to read comments on MinBLEICCreate() in order to get more information about creation of BLEIC optimizer. 
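Following the usage outline above, here is a compact sketch of a box- and linearly-constrained run; the constraint data, tolerances and the callback name are illustrative assumptions, and the optimize/results calls follow the same pattern as the CG and L-BFGS routines above (MinBLEICSetBC/MinBLEICSetLC are documented further below):

    real_1d_array x     = "[0,0]";
    real_1d_array bndl  = "[-1,-1]";
    real_1d_array bndu  = "[+1,+1]";
    real_2d_array c     = "[[1,1,1]]";       // one linear constraint: x0+x1 >= 1
    integer_1d_array ct = "[1]";             // CT[i]>0 means '>='
    minbleicstate state;
    minbleicreport rep;

    minbleiccreate(x, state);
    minbleicsetbc(state, bndl, bndu);
    minbleicsetlc(state, c, ct);
    minbleicsetcond(state, 0.0, 0.0, 1.0e-6, 0);
    minbleicoptimize(state, my_func_grad);   // hypothetical analytic-gradient callback
    minbleicresults(state, x, rep);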
INPUT PARAMETERS: N - problem dimension, N>0: * if given, only leading N elements of X are used * if not given, automatically determined from size of X X - starting point, array[0..N-1]. DiffStep- differentiation step, >0 OUTPUT PARAMETERS: State - structure which stores algorithm state NOTES: 1. algorithm uses 4-point central formula for differentiation. 2. differentiation step along I-th axis is equal to DiffStep*S[I] where S[] is scaling vector which can be set by MinBLEICSetScale() call. 3. we recommend you to use moderate values of differentiation step. Too large step will result in too large truncation errors, while too small step will result in too large numerical errors. 1.0E-6 can be good value to start with. 4. Numerical differentiation is very inefficient - one gradient calculation needs 4*N function evaluations. This function will work for any N - either small (1...10), moderate (10...100) or large (100...). However, performance penalty will be too severe for any N's except for small ones. We should also say that code which relies on numerical differentiation is less robust and precise. CG needs exact gradient values. Imprecise gradient may slow down convergence, especially on highly nonlinear problems. Thus we recommend to use this function for fast prototyping on small- dimensional problems only, and to implement analytical gradient as soon as possible. -- ALGLIB -- Copyright 16.05.2011 by Bochkanov Sergey *************************************************************************/ void minbleiccreatef(const ae_int_t n, const real_1d_array &x, const double diffstep, minbleicstate &state); void minbleiccreatef(const real_1d_array &x, const double diffstep, minbleicstate &state); /************************************************************************* This function sets boundary constraints for BLEIC optimizer. Boundary constraints are inactive by default (after initial creation). They are preserved after algorithm restart with MinBLEICRestartFrom(). NOTE: if you have box-only constraints (no general linear constraints), then MinBC optimizer can be better option. It uses special, faster constraint activation method, which performs better on problems with multiple constraints active at the solution. On small-scale problems performance of MinBC is similar to that of MinBLEIC, but on large-scale ones (hundreds and thousands of active constraints) it can be several times faster than MinBLEIC. INPUT PARAMETERS: State - structure stores algorithm state BndL - lower bounds, array[N]. If some (all) variables are unbounded, you may specify very small number or -INF. BndU - upper bounds, array[N]. If some (all) variables are unbounded, you may specify very large number or +INF. NOTE 1: it is possible to specify BndL[i]=BndU[i]. In this case I-th variable will be "frozen" at X[i]=BndL[i]=BndU[i]. NOTE 2: this solver has following useful properties: * bound constraints are always satisfied exactly * function is evaluated only INSIDE area specified by bound constraints, even when numerical differentiation is used (algorithm adjusts nodes according to boundary constraints) -- ALGLIB -- Copyright 28.11.2010 by Bochkanov Sergey *************************************************************************/ void minbleicsetbc(const minbleicstate &state, const real_1d_array &bndl, const real_1d_array &bndu); /************************************************************************* This function sets linear constraints for BLEIC optimizer. 
Linear constraints are inactive by default (after initial creation). They are preserved after algorithm restart with MinBLEICRestartFrom(). INPUT PARAMETERS: State - structure previously allocated with MinBLEICCreate call. C - linear constraints, array[K,N+1]. Each row of C represents one constraint, either equality or inequality (see below): * first N elements correspond to coefficients, * last element corresponds to the right part. All elements of C (including right part) must be finite. CT - type of constraints, array[K]: * if CT[i]>0, then I-th constraint is C[i,*]*x >= C[i,n+1] * if CT[i]=0, then I-th constraint is C[i,*]*x = C[i,n+1] * if CT[i]<0, then I-th constraint is C[i,*]*x <= C[i,n+1] K - number of equality/inequality constraints, K>=0: * if given, only leading K elements of C/CT are used * if not given, automatically determined from sizes of C/CT NOTE 1: linear (non-bound) constraints are satisfied only approximately: * there always exists some minor violation (about Epsilon in magnitude) due to rounding errors * numerical differentiation, if used, may lead to function evaluations outside of the feasible area, because algorithm does NOT change numerical differentiation formula according to linear constraints. If you want constraints to be satisfied exactly, try to reformulate your problem in such manner that all constraints will become boundary ones (this kind of constraints is always satisfied exactly, both in the final solution and in all intermediate points). -- ALGLIB -- Copyright 28.11.2010 by Bochkanov Sergey *************************************************************************/ void minbleicsetlc(const minbleicstate &state, const real_2d_array &c, const integer_1d_array &ct, const ae_int_t k); void minbleicsetlc(const minbleicstate &state, const real_2d_array &c, const integer_1d_array &ct); /************************************************************************* This function sets stopping conditions for the optimizer. INPUT PARAMETERS: State - structure which stores algorithm state EpsG - >=0 The subroutine finishes its work if the condition |v|=0 The subroutine finishes its work if on k+1-th iteration the condition |F(k+1)-F(k)|<=EpsF*max{|F(k)|,|F(k+1)|,1} is satisfied. EpsX - >=0 The subroutine finishes its work if on k+1-th iteration the condition |v|<=EpsX is fulfilled, where: * |.| means Euclidian norm * v - scaled step vector, v[i]=dx[i]/s[i] * dx - step vector, dx=X(k+1)-X(k) * s - scaling coefficients set by MinBLEICSetScale() MaxIts - maximum number of iterations. If MaxIts=0, the number of iterations is unlimited. Passing EpsG=0, EpsF=0 and EpsX=0 and MaxIts=0 (simultaneously) will lead to automatic stopping criterion selection. NOTE: when SetCond() called with non-zero MaxIts, BLEIC solver may perform slightly more than MaxIts iterations. I.e., MaxIts sets non-strict limit on iterations count. -- ALGLIB -- Copyright 28.11.2010 by Bochkanov Sergey *************************************************************************/ void minbleicsetcond(const minbleicstate &state, const double epsg, const double epsf, const double epsx, const ae_int_t maxits); /************************************************************************* This function sets scaling coefficients for BLEIC optimizer. ALGLIB optimizers use scaling matrices to test stopping conditions (step size and gradient are scaled before comparison with tolerances). 
Scale of the I-th variable is a translation invariant measure of: a) "how large" the variable is b) how large the step should be to make significant changes in the function Scaling is also used by finite difference variant of the optimizer - step along I-th axis is equal to DiffStep*S[I]. In most optimizers (and in the BLEIC too) scaling is NOT a form of preconditioning. It just affects stopping conditions. You should set preconditioner by separate call to one of the MinBLEICSetPrec...() functions. There is a special preconditioning mode, however, which uses scaling coefficients to form diagonal preconditioning matrix. You can turn this mode on, if you want. But you should understand that scaling is not the same thing as preconditioning - these are two different, although related forms of tuning solver. INPUT PARAMETERS: State - structure stores algorithm state S - array[N], non-zero scaling coefficients S[i] may be negative, sign doesn't matter. -- ALGLIB -- Copyright 14.01.2011 by Bochkanov Sergey *************************************************************************/ void minbleicsetscale(const minbleicstate &state, const real_1d_array &s); /************************************************************************* Modification of the preconditioner: preconditioning is turned off. INPUT PARAMETERS: State - structure which stores algorithm state -- ALGLIB -- Copyright 13.10.2010 by Bochkanov Sergey *************************************************************************/ void minbleicsetprecdefault(const minbleicstate &state); /************************************************************************* Modification of the preconditioner: diagonal of approximate Hessian is used. INPUT PARAMETERS: State - structure which stores algorithm state D - diagonal of the approximate Hessian, array[0..N-1], (if larger, only leading N elements are used). NOTE 1: D[i] should be positive. Exception will be thrown otherwise. NOTE 2: you should pass diagonal of approximate Hessian - NOT ITS INVERSE. -- ALGLIB -- Copyright 13.10.2010 by Bochkanov Sergey *************************************************************************/ void minbleicsetprecdiag(const minbleicstate &state, const real_1d_array &d); /************************************************************************* Modification of the preconditioner: scale-based diagonal preconditioning. This preconditioning mode can be useful when you don't have approximate diagonal of Hessian, but you know that your variables are badly scaled (for example, one variable is in [1,10], and another in [1000,100000]), and most part of the ill-conditioning comes from different scales of vars. In this case simple scale-based preconditioner, with H[i] = 1/(s[i]^2), can greatly improve convergence. IMPORTANT: you should set scale of your variables with MinBLEICSetScale() call (before or after MinBLEICSetPrecScale() call). Without knowledge of the scale of your variables scale-based preconditioner will be just unit matrix. INPUT PARAMETERS: State - structure which stores algorithm state -- ALGLIB -- Copyright 13.10.2010 by Bochkanov Sergey *************************************************************************/ void minbleicsetprecscale(const minbleicstate &state); /************************************************************************* This function turns on/off reporting. 
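A short sketch of the scaling and preconditioning calls discussed above (the scale values are illustrative assumptions for a problem whose second variable lives near 10^3):

    // x0 has unit magnitude, x1 is of order 1000; stopping conditions and
    // finite-difference steps are interpreted relative to these scales
    real_1d_array s = "[1.0,1000.0]";
    minbleicsetscale(state, s);

    // optionally turn on the scale-based diagonal preconditioner H[i]=1/(s[i]^2)
    minbleicsetprecscale(state);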
INPUT PARAMETERS: State - structure which stores algorithm state NeedXRep- whether iteration reports are needed or not If NeedXRep is True, algorithm will call rep() callback function if it is provided to MinBLEICOptimize(). -- ALGLIB -- Copyright 28.11.2010 by Bochkanov Sergey *************************************************************************/ void minbleicsetxrep(const minbleicstate &state, const bool needxrep); /************************************************************************* This function sets maximum step length IMPORTANT: this feature is hard to combine with preconditioning. You can't set upper limit on step length, when you solve optimization problem with linear (non-boundary) constraints AND preconditioner turned on. When non-boundary constraints are present, you have to either a) use preconditioner, or b) use upper limit on step length. YOU CAN'T USE BOTH! In this case algorithm will terminate with appropriate error code. INPUT PARAMETERS: State - structure which stores algorithm state StpMax - maximum step length, >=0. Set StpMax to 0.0, if you don't want to limit step length. Use this subroutine when you optimize target function which contains exp() or other fast growing functions, and optimization algorithm makes too large steps which lead to overflow. This function allows us to reject steps that are too large (and therefore expose us to the possible overflow) without actually calculating function value at the x+stp*d. -- ALGLIB -- Copyright 02.04.2010 by Bochkanov Sergey *************************************************************************/ void minbleicsetstpmax(const minbleicstate &state, const double stpmax); /************************************************************************* This function provides reverse communication interface Reverse communication interface is not documented or recommended for use. See below for functions which provide better documented API *************************************************************************/ bool minbleiciteration(const minbleicstate &state); /************************************************************************* This family of functions is used to launch iterations of nonlinear optimizer These functions accept following parameters: state - algorithm state func - callback which calculates function (or merit function) value func at given point x grad - callback which calculates function (or merit function) value func and gradient grad at given point x rep - optional callback which is called after each iteration can be NULL ptr - optional pointer which is passed to func/grad/hess/jac/rep can be NULL NOTES: 1. This function has two different implementations: one which uses exact (analytical) user-supplied gradient, and one which uses function value only and numerically differentiates function in order to obtain gradient. Depending on the specific function used to create optimizer object (either MinBLEICCreate() for analytical gradient or MinBLEICCreateF() for numerical differentiation) you should choose appropriate variant of MinBLEICOptimize() - one which accepts function AND gradient or one which accepts function ONLY. Be careful to choose variant of MinBLEICOptimize() which corresponds to your optimization scheme! Table below lists different combinations of callback (function/gradient) passed to MinBLEICOptimize() and specific function used to create optimizer. 
                      |     USER PASSED TO MinBLEICOptimize()
    CREATED WITH      |  function only  |  function and gradient
    ------------------------------------------------------------
    MinBLEICCreateF() |      work       |          FAIL
    MinBLEICCreate()  |      FAIL       |          work
Here "FAIL" denotes inappropriate combinations of optimizer creation function and MinBLEICOptimize() version. Attempts to use such combination (for example, to create optimizer with MinBLEICCreateF() and to pass gradient information to MinBLEICOptimize()) will lead to exception being thrown. Either you did not pass gradient when it WAS needed or you passed gradient when it was NOT needed. -- ALGLIB -- Copyright 28.11.2010 by Bochkanov Sergey *************************************************************************/ void minbleicoptimize(minbleicstate &state, void (*func)(const real_1d_array &x, double &func, void *ptr), void (*rep)(const real_1d_array &x, double func, void *ptr) = NULL, void *ptr = NULL); void minbleicoptimize(minbleicstate &state, void (*grad)(const real_1d_array &x, double &func, real_1d_array &grad, void *ptr), void (*rep)(const real_1d_array &x, double func, void *ptr) = NULL, void *ptr = NULL); /************************************************************************* BLEIC results INPUT PARAMETERS: State - algorithm state OUTPUT PARAMETERS: X - array[0..N-1], solution Rep - optimization report. You should check Rep.TerminationType in order to distinguish successful termination from unsuccessful one: * -8 internal integrity control detected infinite or NAN values in function/gradient. Abnormal termination signalled. * -7 gradient verification failed. See MinBLEICSetGradientCheck() for more information. * -3 inconsistent constraints. Feasible point is either nonexistent or too hard to find. Try to restart optimizer with better initial approximation * 1 relative function improvement is no more than EpsF. * 2 scaled step is no more than EpsX. * 4 scaled gradient norm is no more than EpsG. * 5 MaxIts steps was taken * 8 terminated by user who called minbleicrequesttermination(). X contains point which was "current accepted" when termination request was submitted. More information about fields of this structure can be found in the comments on MinBLEICReport datatype. -- ALGLIB -- Copyright 28.11.2010 by Bochkanov Sergey *************************************************************************/ void minbleicresults(const minbleicstate &state, real_1d_array &x, minbleicreport &rep); /************************************************************************* BLEIC results Buffered implementation of MinBLEICResults() which uses pre-allocated buffer to store X[]. If buffer size is too small, it resizes buffer. It is intended to be used in the inner cycles of performance critical algorithms where array reallocation penalty is too large to be ignored. -- ALGLIB -- Copyright 28.11.2010 by Bochkanov Sergey *************************************************************************/ void minbleicresultsbuf(const minbleicstate &state, real_1d_array &x, minbleicreport &rep); /************************************************************************* This subroutine restarts algorithm from new point. All optimization parameters (including constraints) are left unchanged. This function allows to solve multiple optimization problems (which must have same number of dimensions) without object reallocation penalty. INPUT PARAMETERS: State - structure previously allocated with MinBLEICCreate call. X - new starting point. 
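The rule in this table can be made concrete with a hedged sketch (the target function and all names below are illustrative assumptions): an optimizer created with MinBLEICCreateF() is driven by the function-only callback, and the solution is then read back with MinBLEICResults():

    void target_func(const real_1d_array &x, double &func, void *ptr)
    {
        // illustrative target: f(x) = (x0-1)^2 + (x1+1)^2
        func = (x[0]-1.0)*(x[0]-1.0) + (x[1]+1.0)*(x[1]+1.0);
    }

    void run_bleic(minbleicstate &state)
    {
        minbleicoptimize(state, target_func);   // function-only variant, matches minbleiccreatef()
        real_1d_array xsol;
        minbleicreport rep;
        minbleicresults(state, xsol, rep);
        // rep.terminationtype>0 indicates one of the successful completion codes listed above
    }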
-- ALGLIB -- Copyright 28.11.2010 by Bochkanov Sergey *************************************************************************/ void minbleicrestartfrom(const minbleicstate &state, const real_1d_array &x); /************************************************************************* This subroutine submits request for termination of running optimizer. It should be called from user-supplied callback when user decides that it is time to "smoothly" terminate optimization process. As result, optimizer stops at point which was "current accepted" when termination request was submitted and returns error code 8 (successful termination). INPUT PARAMETERS: State - optimizer structure NOTE: after request for termination optimizer may perform several additional calls to user-supplied callbacks. It does NOT guarantee to stop immediately - it just guarantees that these additional calls will be discarded later. NOTE: calling this function on optimizer which is NOT running will have no effect. NOTE: multiple calls to this function are possible. First call is counted, subsequent calls are silently ignored. -- ALGLIB -- Copyright 08.10.2014 by Bochkanov Sergey *************************************************************************/ void minbleicrequesttermination(const minbleicstate &state); /************************************************************************* This subroutine turns on verification of the user-supplied analytic gradient: * user calls this subroutine before optimization begins * MinBLEICOptimize() is called * prior to actual optimization, for each component of parameters being optimized X[i] algorithm performs following steps: * two trial steps are made to X[i]-TestStep*S[i] and X[i]+TestStep*S[i], where X[i] is i-th component of the initial point and S[i] is a scale of i-th parameter * if needed, steps are bounded with respect to constraints on X[] * F(X) is evaluated at these trial points * we perform one more evaluation in the middle point of the interval * we build cubic model using function values and derivatives at trial points and we compare its prediction with actual value in the middle point * in case difference between prediction and actual value is higher than some predetermined threshold, algorithm stops with completion code -7; Rep.VarIdx is set to index of the parameter with incorrect derivative. * after verification is over, algorithm proceeds to the actual optimization. NOTE 1: verification needs N (parameters count) gradient evaluations. It is very costly and you should use it only for low dimensional problems, when you want to be sure that you've correctly calculated analytic derivatives. You should not use it in the production code (unless you want to check derivatives provided by some third party). NOTE 2: you should carefully choose TestStep. Value which is too large (so large that function behaviour is significantly non-cubic) will lead to false alarms. You may use different step for different parameters by means of setting scale with MinBLEICSetScale(). NOTE 3: this function may lead to false positives. In case it reports that I-th derivative was calculated incorrectly, you may decrease test step and try one more time - maybe your function changes too sharply and your step is too large for such rapidly changing function. 
INPUT PARAMETERS: State - structure used to store algorithm state TestStep - verification step: * TestStep=0 turns verification off * TestStep>0 activates verification -- ALGLIB -- Copyright 15.06.2012 by Bochkanov Sergey *************************************************************************/ void minbleicsetgradientcheck(const minbleicstate &state, const double teststep); /************************************************************************* CONSTRAINED QUADRATIC PROGRAMMING The subroutine creates QP optimizer. After initial creation, it contains default optimization problem with zero quadratic and linear terms and no constraints. You should set quadratic/linear terms with calls to functions provided by MinQP subpackage. You should also choose appropriate QP solver and set it and its stopping criteria by means of MinQPSetAlgo??????() function. Then, you should start solution process by means of MinQPOptimize() call. Solution itself can be obtained with MinQPResults() function. Following solvers are recommended: * QuickQP for dense problems with box-only constraints (or no constraints at all) * QP-BLEIC for dense/sparse problems with moderate (up to 50) number of general linear constraints * DENSE-AUL-QP for dense problems with any (small or large) number of general linear constraints INPUT PARAMETERS: N - problem size OUTPUT PARAMETERS: State - optimizer with zero quadratic/linear terms and no constraints -- ALGLIB -- Copyright 11.01.2011 by Bochkanov Sergey *************************************************************************/ void minqpcreate(const ae_int_t n, minqpstate &state); /************************************************************************* This function sets linear term for QP solver. By default, linear term is zero. INPUT PARAMETERS: State - structure which stores algorithm state B - linear term, array[N]. -- ALGLIB -- Copyright 11.01.2011 by Bochkanov Sergey *************************************************************************/ void minqpsetlinearterm(const minqpstate &state, const real_1d_array &b); /************************************************************************* This function sets dense quadratic term for QP solver. By default, quadratic term is zero. SUPPORT BY QP SOLVERS: Dense quadratic term can be handled by following QP solvers: * QuickQP * BLEIC-QP * Dense-AUL-QP IMPORTANT: This solver minimizes following function: f(x) = 0.5*x'*A*x + b'*x. Note that quadratic term has 0.5 before it. So if you want to minimize f(x) = x^2 + x you should rewrite your problem as follows: f(x) = 0.5*(2*x^2) + x and your matrix A will be equal to [[2.0]], not to [[1.0]] INPUT PARAMETERS: State - structure which stores algorithm state A - matrix, array[N,N] IsUpper - (optional) storage type: * if True, symmetric matrix A is given by its upper triangle, and the lower triangle isn't used * if False, symmetric matrix A is given by its lower triangle, and the upper triangle isn't used * if not given, both lower and upper triangles must be filled. -- ALGLIB -- Copyright 11.01.2011 by Bochkanov Sergey *************************************************************************/ void minqpsetquadraticterm(const minqpstate &state, const real_2d_array &a, const bool isupper); void minqpsetquadraticterm(const minqpstate &state, const real_2d_array &a); /************************************************************************* This function sets sparse quadratic term for QP solver. By default, quadratic term is zero. 
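The 0.5 convention above is easy to get wrong, so here is a hedged one-variable sketch (the coefficients are illustrative assumptions): to minimize f(x)=x^2+x the quadratic term must be passed as A=[[2.0]], not [[1.0]]:

    minqpstate state;
    minqpcreate(1, state);

    // f(x) = x^2 + x  rewritten as  0.5*x'*A*x + b'*x  with A=[[2]], b=[1]
    real_2d_array a = "[[2.0]]";
    real_1d_array b = "[1.0]";
    minqpsetquadraticterm(state, a);
    minqpsetlinearterm(state, b);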
This function overrides previous calls to minqpsetquadraticterm() or minqpsetquadratictermsparse(). SUPPORT BY QP SOLVERS: Sparse quadratic term can be handled by following QP solvers: * QuickQP * BLEIC-QP * Dense-AUL-QP (internally converts sparse matrix to dense format) IMPORTANT: This solver minimizes following function: f(x) = 0.5*x'*A*x + b'*x. Note that quadratic term has 0.5 before it. So if you want to minimize f(x) = x^2 + x you should rewrite your problem as follows: f(x) = 0.5*(2*x^2) + x and your matrix A will be equal to [[2.0]], not to [[1.0]] INPUT PARAMETERS: State - structure which stores algorithm state A - matrix, array[N,N] IsUpper - (optional) storage type: * if True, symmetric matrix A is given by its upper triangle, and the lower triangle isn't used * if False, symmetric matrix A is given by its lower triangle, and the upper triangle isn't used * if not given, both lower and upper triangles must be filled. -- ALGLIB -- Copyright 11.01.2011 by Bochkanov Sergey *************************************************************************/ void minqpsetquadratictermsparse(const minqpstate &state, const sparsematrix &a, const bool isupper); /************************************************************************* This function sets starting point for QP solver. It is useful to have good initial approximation to the solution, because it will increase speed of convergence and identification of active constraints. INPUT PARAMETERS: State - structure which stores algorithm state X - starting point, array[N]. -- ALGLIB -- Copyright 11.01.2011 by Bochkanov Sergey *************************************************************************/ void minqpsetstartingpoint(const minqpstate &state, const real_1d_array &x); /************************************************************************* This function sets origin for QP solver. By default, following QP program is solved: min(0.5*x'*A*x+b'*x) This function allows to solve different problem: min(0.5*(x-x_origin)'*A*(x-x_origin)+b'*(x-x_origin)) Specification of non-zero origin affects function being minimized, but not constraints. Box and linear constraints are still calculated without origin. INPUT PARAMETERS: State - structure which stores algorithm state XOrigin - origin, array[N]. -- ALGLIB -- Copyright 11.01.2011 by Bochkanov Sergey *************************************************************************/ void minqpsetorigin(const minqpstate &state, const real_1d_array &xorigin); /************************************************************************* This function sets scaling coefficients. ALGLIB optimizers use scaling matrices to test stopping conditions (step size and gradient are scaled before comparison with tolerances) and as preconditioner. Scale of the I-th variable is a translation invariant measure of: a) "how large" the variable is b) how large the step should be to make significant changes in the function INPUT PARAMETERS: State - structure stores algorithm state S - array[N], non-zero scaling coefficients S[i] may be negative, sign doesn't matter. -- ALGLIB -- Copyright 14.01.2011 by Bochkanov Sergey *************************************************************************/ void minqpsetscale(const minqpstate &state, const real_1d_array &s); /************************************************************************* DEPRECATED DEPRECATED DEPRECATED DEPRECATED DEPRECATED DEPRECATED This function tells solver to use Cholesky-based algorithm. 
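For large diagonal or otherwise sparse Hessians the same term can be passed in sparse storage. The sketch below is a hedged illustration (a 2x2 diagonal matrix built with sparsecreate()/sparseset() from the LinAlg package) which stores only the upper triangle:

    sparsematrix a;
    sparsecreate(2, 2, a);        // 2x2 sparse matrix, initially empty
    sparseset(a, 0, 0, 2.0);      // only nonzero entries are stored
    sparseset(a, 1, 1, 2.0);
    minqpsetquadratictermsparse(state, a, true);   // true = upper triangle is given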
This algorithm was deprecated in ALGLIB 3.9.0 because its performance is inferior to that of BLEIC-QP or QuickQP on high-dimensional problems. Furthermore, it supports only dense convex QP problems. This solver is no longer active by default. We recommend you to switch to AUL-QP, BLEIC-QP or QuickQP solver. DEPRECATED DEPRECATED DEPRECATED DEPRECATED DEPRECATED DEPRECATED -- ALGLIB -- Copyright 11.01.2011 by Bochkanov Sergey *************************************************************************/ void minqpsetalgocholesky(const minqpstate &state); /************************************************************************* This function tells solver to use BLEIC-based algorithm and sets stopping criteria for the algorithm. ALGORITHM FEATURES: * supports dense and sparse QP problems * supports box and general linear equality/inequality constraints * can solve all types of problems (convex, semidefinite, nonconvex) as long as they are bounded from below under constraints. Say, it is possible to solve "min{-x^2} subject to -1<=x<=+1". Of course, global minimum is found only for positive definite and semidefinite problems. As for indefinite ones - only local minimum is found. ALGORITHM OUTLINE: * BLEIC-QP solver is just a driver function for MinBLEIC solver; it solves quadratic programming problem as general linearly constrained optimization problem, which is solved by means of BLEIC solver (part of ALGLIB, active set method). ALGORITHM LIMITATIONS: * this algorithm is fast enough for large-scale problems with small amount of general linear constraints (say, up to 50), but it is inefficient for problems with several hundreds of constraints. Iteration cost is roughly quadratic w.r.t. constraint count. Furthermore, it can not efficiently handle sparse constraints (they are converted to dense format prior to solution). Thus, if you have large and/or sparse constraint matrix and convex QP problem, Dense-AUL-QP solver may be better solution. * unlike QuickQP solver, this algorithm does not perform Newton steps and does not use Level 3 BLAS. Being general-purpose active set method, it can activate constraints only one-by-one. Thus, its performance is lower than that of QuickQP. * its precision is also a bit inferior to that of QuickQP. BLEIC-QP performs only LBFGS steps (no Newton steps), which are good at detecting neighborhood of the solution, but needs many iterations to find solution with more than 6 digits of precision. INPUT PARAMETERS: State - structure which stores algorithm state EpsG - >=0 The subroutine finishes its work if the condition |v|<EpsG is satisfied, where: * |.| means Euclidian norm * v - scaled gradient vector, v[i]=g[i]*s[i] * g - gradient * s - scaling coefficients set by MinQPSetScale() EpsF - >=0 The subroutine finishes its work if exploratory steepest descent step on k+1-th iteration satisfies following condition: |F(k+1)-F(k)|<=EpsF*max{|F(k)|,|F(k+1)|,1} EpsX - >=0 The subroutine finishes its work if exploratory steepest descent step on k+1-th iteration satisfies following condition: |v|<=EpsX is fulfilled, where: * |.| means Euclidian norm * v - scaled step vector, v[i]=dx[i]/s[i] * dx - step vector, dx=X(k+1)-X(k) * s - scaling coefficients set by MinQPSetScale() MaxIts - maximum number of iterations. If MaxIts=0, the number of iterations is unlimited. NOTE: this algorithm uses LBFGS iterations, which are relatively cheap, but improve function value only a bit. So you will need many iterations to converge - from 0.1*N to 10*N, depending on problem's condition number. IT IS VERY IMPORTANT TO CALL MinQPSetScale() WHEN YOU USE THIS ALGORITHM BECAUSE ITS STOPPING CRITERIA ARE SCALE-DEPENDENT! 
Passing EpsG=0, EpsF=0 and EpsX=0 and MaxIts=0 (simultaneously) will lead to automatic stopping criterion selection (presently it is small step length, but it may change in the future versions of ALGLIB). -- ALGLIB -- Copyright 11.01.2011 by Bochkanov Sergey *************************************************************************/ void minqpsetalgobleic(const minqpstate &state, const double epsg, const double epsf, const double epsx, const ae_int_t maxits); /************************************************************************* This function tells QP solver to use Dense-AUL algorithm and sets stopping criteria for the algorithm. ALGORITHM FEATURES: * supports dense and sparse QP problems; although it uses dense Cholesky to build preconditioner, it still works faster for sparse problems. * supports box and dense/sparse general linear equality/inequality constraints * convergence is theoretically proved for positive-definite (convex) QP problems. Semidefinite and non-convex problems can be solved as long as they are bounded from below under constraints, although without theoretical guarantees. * this solver is better than QP-BLEIC on problems with large number of general linear constraints. ALGORITHM OUTLINE: * this algorithm is an augmented Lagrangian method with dense preconditioner (hence its name). It is similar to barrier/penalty methods, but much more precise and faster. * it performs several outer iterations in order to refine values of the Lagrange multipliers. Single outer iteration is a solution of some unconstrained optimization problem: first it performs dense Cholesky factorization of the Hessian in order to build preconditioner (adaptive regularization is applied to enforce positive definiteness), and then it uses L-BFGS optimizer to solve optimization problem. * typically you need about 5-10 outer iterations to converge to solution ALGORITHM LIMITATIONS: * because dense Cholesky driver is used, this algorithm has O(N^2) memory requirements and O(OuterIterations*N^3) minimum running time. From the practical point of view, it limits its applicability to several thousands of variables. From the other side, variables count is the most limiting factor, and dependence on constraint count is much weaker. Assuming that constraint matrix is sparse, it may handle tens of thousands of general linear constraints. * its precision is lower than that of BLEIC-QP and QuickQP. It is hard to find solution with more than 6 digits of precision. INPUT PARAMETERS: State - structure which stores algorithm state EpsX - >=0, stopping criteria for inner optimizer. Inner iterations are stopped when step length (with variable scaling being applied) is less than EpsX. See minqpsetscale() for more information on variable scaling. Rho - penalty coefficient, Rho>0: * large enough that algorithm converges with desired precision. * not TOO large to prevent ill-conditioning * recommended values are 100, 1000 or 10000 ItsCnt - number of outer iterations: * recommended values: 10-15 (although in most cases it converges within 5 iterations, you may need a few more to be sure). * ItsCnt=0 means that small number of outer iterations is automatically chosen (10 iterations in current version). * ItsCnt=1 means that AUL algorithm performs just as usual barrier method. * ItsCnt>1 means that AUL algorithm performs specified number of outer iterations IT IS VERY IMPORTANT TO CALL minqpsetscale() WHEN YOU USE THIS ALGORITHM BECAUSE ITS CONVERGENCE PROPERTIES AND STOPPING CRITERIA ARE SCALE-DEPENDENT! 
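As a hedged illustration of the parameter choices above (the tolerance, penalty coefficient and iteration count are assumptions picked from the recommended ranges, not prescribed values):

    // inner stopping tolerance 1.0E-9, penalty coefficient Rho=1000,
    // and 10 outer iterations (0 would select the automatic default)
    minqpsetalgodenseaul(state, 1.0e-9, 1000.0, 10);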
NOTE: Passing EpsX=0 will lead to automatic step length selection (specific step length chosen may change in the future versions of ALGLIB, so it is better to specify step length explicitly). -- ALGLIB -- Copyright 20.08.2016 by Bochkanov Sergey *************************************************************************/ void minqpsetalgodenseaul(const minqpstate &state, const double epsx, const double rho, const ae_int_t itscnt); /************************************************************************* This function tells solver to use QuickQP algorithm: special extra-fast algorithm for problems with box-only constraints. It may solve non-convex problems as long as they are bounded from below under constraints. ALGORITHM FEATURES: * many times (from 5x to 50x!) faster than BLEIC-based QP solver; utilizes accelerated methods for activation of constraints. * supports dense and sparse QP problems * supports ONLY box constraints; general linear constraints are NOT supported by this solver * can solve all types of problems (convex, semidefinite, nonconvex) as long as they are bounded from below under constraints. Say, it is possible to solve "min{-x^2} subject to -1<=x<=+1". In convex/semidefinite case global minimum is returned, in nonconvex case - algorithm returns one of the local minimums. ALGORITHM OUTLINE: * algorithm performs two kinds of iterations: constrained CG iterations and constrained Newton iterations * initially it performs small number of constrained CG iterations, which can efficiently activate/deactivate multiple constraints * after CG phase algorithm tries to calculate Cholesky decomposition and to perform several constrained Newton steps. If Cholesky decomposition failed (matrix is indefinite even under constraints), we perform more CG iterations until we converge to such set of constraints that system matrix becomes positive definite. Constrained Newton steps greatly increase convergence speed and precision. * algorithm interleaves CG and Newton iterations which allows to handle indefinite matrices (CG phase) and quickly converge after final set of constraints is found (Newton phase). Combination of CG and Newton phases is called "outer iteration". * it is possible to turn off Newton phase (beneficial for semidefinite problems - Cholesky decomposition will fail too often) ALGORITHM LIMITATIONS: * algorithm does not support general linear constraints; only box ones are supported * Cholesky decomposition for sparse problems is performed with Skyline Cholesky solver, which is intended for low-profile matrices. No profile- reducing reordering of variables is performed in this version of ALGLIB. * problems with near-zero negative eigenvalues (or exactly zero ones) may experience about 2-3x performance penalty. The reason is that Cholesky decomposition can not be performed until we identify directions of zero and negative curvature and activate corresponding boundary constraints - but we need a lot of trial and error because these directions are hard to notice in the matrix spectrum. In this case you may turn off Newton phase of algorithm. Large negative eigenvalues are not an issue, so highly non-convex problems can be solved very efficiently. 
INPUT PARAMETERS: State - structure which stores algorithm state EpsG - >=0 The subroutine finishes its work if the condition |v|<EpsG is satisfied, where: * |.| means Euclidian norm * v - scaled gradient vector, v[i]=g[i]*s[i] * g - gradient * s - scaling coefficients set by MinQPSetScale() EpsF - >=0 The subroutine finishes its work if exploratory steepest descent step on k+1-th iteration satisfies following condition: |F(k+1)-F(k)|<=EpsF*max{|F(k)|,|F(k+1)|,1} EpsX - >=0 The subroutine finishes its work if exploratory steepest descent step on k+1-th iteration satisfies following condition: |v|<=EpsX is fulfilled, where: * |.| means Euclidian norm * v - scaled step vector, v[i]=dx[i]/s[i] * dx - step vector, dx=X(k+1)-X(k) * s - scaling coefficients set by MinQPSetScale() MaxOuterIts-maximum number of OUTER iterations. One outer iteration includes some amount of CG iterations (from 5 to ~N) and one or several (usually small amount) Newton steps. Thus, one outer iteration has high cost, but can greatly reduce function value. Use 0 if you do not want to limit number of outer iterations. UseNewton- use Newton phase or not: * Newton phase improves performance of positive definite dense problems (about 2 times improvement can be observed) * can result in some performance penalty on semidefinite or slightly negative definite problems - each Newton phase will bring no improvement (Cholesky failure), but still will require computational time. * if you doubt, you can turn off this phase - optimizer will retain most of its high speed. IT IS VERY IMPORTANT TO CALL MinQPSetScale() WHEN YOU USE THIS ALGORITHM BECAUSE ITS STOPPING CRITERIA ARE SCALE-DEPENDENT! Passing EpsG=0, EpsF=0 and EpsX=0 and MaxIts=0 (simultaneously) will lead to automatic stopping criterion selection (presently it is small step length, but it may change in the future versions of ALGLIB). -- ALGLIB -- Copyright 22.05.2014 by Bochkanov Sergey *************************************************************************/ void minqpsetalgoquickqp(const minqpstate &state, const double epsg, const double epsf, const double epsx, const ae_int_t maxouterits, const bool usenewton); /************************************************************************* This function sets box constraints for QP solver Box constraints are inactive by default (after initial creation). After being set, they are preserved until explicitly turned off with another SetBC() call. All QP solvers may handle box constraints. INPUT PARAMETERS: State - structure stores algorithm state BndL - lower bounds, array[N]. If some (all) variables are unbounded, you may specify very small number or -INF (latter is recommended because it will allow solver to use better algorithm). BndU - upper bounds, array[N]. If some (all) variables are unbounded, you may specify very large number or +INF (latter is recommended because it will allow solver to use better algorithm). NOTE: it is possible to specify BndL[i]=BndU[i]. In this case I-th variable will be "frozen" at X[i]=BndL[i]=BndU[i]. -- ALGLIB -- Copyright 11.01.2011 by Bochkanov Sergey *************************************************************************/ void minqpsetbc(const minqpstate &state, const real_1d_array &bndl, const real_1d_array &bndu); /************************************************************************* This function sets dense linear constraints for QP optimizer. This function overrides results of previous calls to minqpsetlc(), minqpsetlcsparse() and minqpsetlcmixed(). After call to this function sparse constraints are dropped, and you have only those constraints which were specified in the present call. 
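As a hedged example of the box-only case discussed above (bounds, scales and tolerances are illustrative assumptions; fp_posinf marks an unbounded direction, which is the recommended way to specify "no upper bound"):

    // non-negativity constraints: 0 <= x[i] < +INF for a 2-variable problem
    real_1d_array bndl = "[0.0,0.0]";
    real_1d_array bndu;
    bndu.setlength(2);
    bndu[0] = fp_posinf;
    bndu[1] = fp_posinf;
    minqpsetbc(state, bndl, bndu);

    // QuickQP stopping criteria are scale-dependent, so scales must be set
    real_1d_array s = "[1.0,1.0]";
    minqpsetscale(state, s);

    // zeros select automatic stopping criteria; Newton phase is enabled
    minqpsetalgoquickqp(state, 0.0, 0.0, 0.0, 0, true);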
If you want to specify mixed (with dense and sparse terms) linear constraints, you should call minqpsetlcmixed(). SUPPORT BY QP SOLVERS: Following QP solvers can handle dense linear constraints: * BLEIC-QP - handles them with high precision, but may be inefficient for problems with hundreds of constraints * Dense-AUL-QP - handles them with moderate precision (approx. 10^-6), may efficiently handle thousands of constraints. Following QP solvers can NOT handle dense linear constraints: * QuickQP - can not handle general linear constraints INPUT PARAMETERS: State - structure previously allocated with MinQPCreate call. C - linear constraints, array[K,N+1]. Each row of C represents one constraint, either equality or inequality (see below): * first N elements correspond to coefficients, * last element corresponds to the right part. All elements of C (including right part) must be finite. CT - type of constraints, array[K]: * if CT[i]>0, then I-th constraint is C[i,*]*x >= C[i,n+1] * if CT[i]=0, then I-th constraint is C[i,*]*x = C[i,n+1] * if CT[i]<0, then I-th constraint is C[i,*]*x <= C[i,n+1] K - number of equality/inequality constraints, K>=0: * if given, only leading K elements of C/CT are used * if not given, automatically determined from sizes of C/CT NOTE 1: linear (non-bound) constraints are satisfied only approximately - there always exists some violation due to numerical errors and algorithmic limitations (BLEIC-QP solver is most precise, AUL-QP solver is less precise). -- ALGLIB -- Copyright 19.06.2012 by Bochkanov Sergey *************************************************************************/ void minqpsetlc(const minqpstate &state, const real_2d_array &c, const integer_1d_array &ct, const ae_int_t k); void minqpsetlc(const minqpstate &state, const real_2d_array &c, const integer_1d_array &ct); /************************************************************************* This function sets sparse linear constraints for QP optimizer. This function overrides results of previous calls to minqpsetlc(), minqpsetlcsparse() and minqpsetlcmixed(). After call to this function dense constraints are dropped, and you have only those constraints which were specified in the present call. If you want to specify mixed (with dense and sparse terms) linear constraints, you should call minqpsetlcmixed(). SUPPORT BY QP SOLVERS: Following QP solvers can handle sparse linear constraints: * BLEIC-QP - handles them with high precision, but can not utilize their sparsity - sparse constraint matrix is silently converted to dense format. Thus, it may be inefficient for problems with hundreds of constraints. * Dense-AUL-QP - although this solver uses dense linear algebra to calculate Cholesky preconditioner, it may efficiently handle sparse constraints. It may solve problems with hundreds and thousands of constraints. The only drawback is that precision of constraint handling is typically within 1E-4... ..1E-6 range. Following QP solvers can NOT handle sparse linear constraints: * QuickQP - can not handle general linear constraints INPUT PARAMETERS: State - structure previously allocated with MinQPCreate call. C - linear constraints, sparse matrix with dimensions at least [K,N+1]. If matrix has larger size, only leading Kx(N+1) rectangle is used. Each row of C represents one constraint, either equality or inequality (see below): * first N elements correspond to coefficients, * last element corresponds to the right part. All elements of C (including right part) must be finite. 
CT - type of constraints, array[K]: * if CT[i]>0, then I-th constraint is C[i,*]*x >= C[i,n+1] * if CT[i]=0, then I-th constraint is C[i,*]*x = C[i,n+1] * if CT[i]<0, then I-th constraint is C[i,*]*x <= C[i,n+1] K - number of equality/inequality constraints, K>=0 NOTE 1: linear (non-bound) constraints are satisfied only approximately - there always exists some violation due to numerical errors and algorithmic limitations (BLEIC-QP solver is most precise, AUL-QP solver is less precise). -- ALGLIB -- Copyright 22.08.2016 by Bochkanov Sergey *************************************************************************/ void minqpsetlcsparse(const minqpstate &state, const sparsematrix &c, const integer_1d_array &ct, const ae_int_t k); /************************************************************************* This function sets mixed linear constraints, which include a set of dense rows, and a set of sparse rows. This function overrides results of previous calls to minqpsetlc(), minqpsetlcsparse() and minqpsetlcmixed(). This function may be useful if constraint matrix includes large number of both types of rows - dense and sparse. If you have just a few sparse rows, you may represent them in dense format without losing performance. Similarly, if you have just a few dense rows, you may store them in sparse format with almost same performance. SUPPORT BY QP SOLVERS: Following QP solvers can handle mixed dense/sparse linear constraints: * BLEIC-QP - handles them with high precision, but can not utilize their sparsity - sparse constraint matrix is silently converted to dense format. Thus, it may be inefficient for problems with hundreds of constraints. * Dense-AUL-QP - although this solver uses dense linear algebra to calculate Cholesky preconditioner, it may efficiently handle sparse constraints. It may solve problems with hundreds and thousands of constraints. The only drawback is that precision of constraint handling is typically within 1E-4... ..1E-6 range. Following QP solvers can NOT handle mixed linear constraints: * QuickQP - can not handle general linear constraints at all INPUT PARAMETERS: State - structure previously allocated with MinQPCreate call. DenseC - dense linear constraints, array[K,N+1]. Each row of DenseC represents one constraint, either equality or inequality (see below): * first N elements correspond to coefficients, * last element corresponds to the right part. All elements of DenseC (including right part) must be finite. DenseCT - type of constraints, array[K]: * if DenseCT[i]>0, then I-th constraint is DenseC[i,*]*x >= DenseC[i,n+1] * if DenseCT[i]=0, then I-th constraint is DenseC[i,*]*x = DenseC[i,n+1] * if DenseCT[i]<0, then I-th constraint is DenseC[i,*]*x <= DenseC[i,n+1] DenseK - number of equality/inequality constraints, DenseK>=0 SparseC - linear constraints, sparse matrix with dimensions at least [SparseK,N+1]. If matrix has larger size, only leading Kx(N+1) rectangle is used. Each row of C represents one constraint, either equality or inequality (see below): * first N elements correspond to coefficients, * last element corresponds to the right part. All elements of C (including right part) must be finite. 
SparseCT- type of sparse constraints, array[K]: * if SparseCT[i]>0, then I-th constraint is SparseC[i,*]*x >= SparseC[i,n+1] * if SparseCT[i]=0, then I-th constraint is SparseC[i,*]*x = SparseC[i,n+1] * if SparseCT[i]<0, then I-th constraint is SparseC[i,*]*x <= SparseC[i,n+1] SparseK - number of sparse equality/inequality constraints, K>=0 NOTE 1: linear (non-bound) constraints are satisfied only approximately - there always exists some violation due to numerical errors and algorithmic limitations (BLEIC-QP solver is most precise, AUL-QP solver is less precise). -- ALGLIB -- Copyright 22.08.2016 by Bochkanov Sergey *************************************************************************/ void minqpsetlcmixed(const minqpstate &state, const real_2d_array &densec, const integer_1d_array &densect, const ae_int_t densek, const sparsematrix &sparsec, const integer_1d_array &sparsect, const ae_int_t sparsek); /************************************************************************* This function solves quadratic programming problem. Prior to calling this function you should choose solver by means of one of the following functions: * minqpsetalgoquickqp() - for QuickQP solver * minqpsetalgobleic() - for BLEIC-QP solver * minqpsetalgodenseaul() - for Dense-AUL-QP solver These functions also allow you to control stopping criteria of the solver. If you did not set solver, MinQP subpackage will automatically select solver for your problem and will run it with default stopping criteria. However, it is better to set explicitly solver and its stopping criteria. INPUT PARAMETERS: State - algorithm state You should use MinQPResults() function to access results after calls to this function. -- ALGLIB -- Copyright 11.01.2011 by Bochkanov Sergey. Special thanks to Elvira Illarionova for important suggestions on the linearly constrained QP algorithm. *************************************************************************/ void minqpoptimize(const minqpstate &state); /************************************************************************* QP solver results INPUT PARAMETERS: State - algorithm state OUTPUT PARAMETERS: X - array[0..N-1], solution. This array is allocated and initialized only when Rep.TerminationType parameter is positive (success). Rep - optimization report. You should check Rep.TerminationType, which contains completion code, and you may check another fields which contain another information about algorithm functioning. Failure codes returned by algorithm are: * -5 inappropriate solver was used: * QuickQP solver for problem with general linear constraints * -4 BLEIC-QP/QuickQP solver found unconstrained direction of negative curvature (function is unbounded from below even under constraints), no meaningful minimum can be found. * -3 inconsistent constraints (or maybe feasible point is too hard to find). If you are sure that constraints are feasible, try to restart optimizer with better initial approximation. Completion codes specific for Cholesky algorithm: * 4 successful completion Completion codes specific for BLEIC/QuickQP algorithms: * 1 relative function improvement is no more than EpsF. * 2 scaled step is no more than EpsX. * 4 scaled gradient norm is no more than EpsG. 
* 5 MaxIts steps was taken -- ALGLIB -- Copyright 11.01.2011 by Bochkanov Sergey *************************************************************************/ void minqpresults(const minqpstate &state, real_1d_array &x, minqpreport &rep); /************************************************************************* QP results Buffered implementation of MinQPResults() which uses pre-allocated buffer to store X[]. If buffer size is too small, it resizes buffer. It is intended to be used in the inner cycles of performance critical algorithms where array reallocation penalty is too large to be ignored. -- ALGLIB -- Copyright 11.01.2011 by Bochkanov Sergey *************************************************************************/ void minqpresultsbuf(const minqpstate &state, real_1d_array &x, minqpreport &rep); /************************************************************************* NONLINEARLY CONSTRAINED OPTIMIZATION WITH PRECONDITIONED AUGMENTED LAGRANGIAN ALGORITHM DESCRIPTION: The subroutine minimizes function F(x) of N arguments subject to any combination of: * bound constraints * linear inequality constraints * linear equality constraints * nonlinear equality constraints Gi(x)=0 * nonlinear inequality constraints Hi(x)<=0 REQUIREMENTS: * user must provide function value and gradient for F(), H(), G() * starting point X0 must be feasible or not too far away from the feasible set * F(), G(), H() are twice continuously differentiable on the feasible set and its neighborhood * nonlinear constraints G() and H() must have non-zero gradient at G(x)=0 and at H(x)=0. Say, constraint like x^2>=1 is supported, but x^2>=0 is NOT supported. USAGE: Constrained optimization is far more complex than the unconstrained one. Nonlinearly constrained optimization is one of the most esoteric numerical procedures. Here we give very brief outline of the MinNLC optimizer. We strongly recommend you to study examples in the ALGLIB Reference Manual and to read ALGLIB User Guide on optimization, which is available at http://www.alglib.net/optimization/ 1. User initializes algorithm state with MinNLCCreate() call and chooses what NLC solver to use. There is some solver which is used by default, with default settings, but you should NOT rely on default choice. It may change in future releases of ALGLIB without notice, and no one can guarantee that new solver will be able to solve your problem with default settings. From the other side, if you choose solver explicitly, you can be pretty sure that it will work with new ALGLIB releases. In the current release following solvers can be used: * AUL solver (activated with MinNLCSetAlgoAUL() function) 2. User adds boundary and/or linear and/or nonlinear constraints by means of calling one of the following functions: a) MinNLCSetBC() for boundary constraints b) MinNLCSetLC() for linear constraints c) MinNLCSetNLC() for nonlinear constraints You may combine (a), (b) and (c) in one optimization problem. 3. User sets scale of the variables with MinNLCSetScale() function. It is VERY important to set scale of the variables, because nonlinearly constrained problems are hard to solve when variables are badly scaled. 4. User sets stopping conditions with MinNLCSetCond(). If NLC solver uses inner/outer iteration layout, this function sets stopping conditions for INNER iterations. 5. User chooses one of the preconditioning methods. Preconditioning is very important for efficient handling of boundary/linear/nonlinear constraints. 
Without preconditioning algorithm would require thousands of iterations even for simple problems. Several preconditioners can be used: a) inexact LBFGS-based, with O(N*K) evaluation time b) exact low rank one, with O(N*K^2) evaluation time c) exact robust one, with O(N^3+K*N^2) evaluation time where K is a total number of general linear and nonlinear constraints (box ones are not counted). Since version 3.11.0 ALGLIB uses exact robust preconditioner as default option, but in some cases exact low rank one may be better option. 6. Finally, user calls MinNLCOptimize() function which takes algorithm state and pointer (delegate, etc.) to callback function which calculates F/G/H. 7. User calls MinNLCResults() to get solution 8. Optionally user may call MinNLCRestartFrom() to solve another problem with same N but another starting point. MinNLCRestartFrom() allows to reuse already initialized structure. INPUT PARAMETERS: N - problem dimension, N>0: * if given, only leading N elements of X are used * if not given, automatically determined from size of X X - starting point, array[N]: * it is better to set X to a feasible point * but X can be infeasible, in which case algorithm will try to find feasible point first, using X as initial approximation. OUTPUT PARAMETERS: State - structure stores algorithm state -- ALGLIB -- Copyright 06.06.2014 by Bochkanov Sergey *************************************************************************/ void minnlccreate(const ae_int_t n, const real_1d_array &x, minnlcstate &state); void minnlccreate(const real_1d_array &x, minnlcstate &state); /************************************************************************* This subroutine is a finite difference variant of MinNLCCreate(). It uses finite differences in order to differentiate target function. Description below contains information which is specific to this function only. We recommend to read comments on MinNLCCreate() in order to get more information about creation of NLC optimizer. INPUT PARAMETERS: N - problem dimension, N>0: * if given, only leading N elements of X are used * if not given, automatically determined from size of X X - starting point, array[N]: * it is better to set X to a feasible point * but X can be infeasible, in which case algorithm will try to find feasible point first, using X as initial approximation. DiffStep- differentiation step, >0 OUTPUT PARAMETERS: State - structure stores algorithm state NOTES: 1. algorithm uses 4-point central formula for differentiation. 2. differentiation step along I-th axis is equal to DiffStep*S[I] where S[] is scaling vector which can be set by MinNLCSetScale() call. 3. we recommend you to use moderate values of differentiation step. Too large step will result in too large TRUNCATION errors, while too small step will result in too large NUMERICAL errors. 1.0E-4 can be good value to start from. 4. Numerical differentiation is very inefficient - one gradient calculation needs 4*N function evaluations. This function will work for any N - either small (1...10), moderate (10...100) or large (100...). However, performance penalty will be too severe for any N's except for small ones. We should also say that code which relies on numerical differentiation is less robust and precise. Imprecise gradient may slow down convergence, especially on highly nonlinear problems. Thus we recommend to use this function for fast prototyping on small- dimensional problems only, and to implement analytical gradient as soon as possible. 
-- ALGLIB -- Copyright 06.06.2014 by Bochkanov Sergey *************************************************************************/ void minnlccreatef(const ae_int_t n, const real_1d_array &x, const double diffstep, minnlcstate &state); void minnlccreatef(const real_1d_array &x, const double diffstep, minnlcstate &state); /************************************************************************* This function sets boundary constraints for NLC optimizer. Boundary constraints are inactive by default (after initial creation). They are preserved after algorithm restart with MinNLCRestartFrom(). You may combine boundary constraints with general linear ones - and with nonlinear ones! Boundary constraints are handled more efficiently than other types. Thus, if your problem has mixed constraints, you may explicitly specify some of them as boundary and save some time/space. INPUT PARAMETERS: State - structure stores algorithm state BndL - lower bounds, array[N]. If some (all) variables are unbounded, you may specify very small number or -INF. BndU - upper bounds, array[N]. If some (all) variables are unbounded, you may specify very large number or +INF. NOTE 1: it is possible to specify BndL[i]=BndU[i]. In this case I-th variable will be "frozen" at X[i]=BndL[i]=BndU[i]. NOTE 2: when you solve your problem with augmented Lagrangian solver, boundary constraints are satisfied only approximately! It is possible that algorithm will evaluate function outside of feasible area! -- ALGLIB -- Copyright 06.06.2014 by Bochkanov Sergey *************************************************************************/ void minnlcsetbc(const minnlcstate &state, const real_1d_array &bndl, const real_1d_array &bndu); /************************************************************************* This function sets linear constraints for MinNLC optimizer. Linear constraints are inactive by default (after initial creation). They are preserved after algorithm restart with MinNLCRestartFrom(). You may combine linear constraints with boundary ones - and with nonlinear ones! If your problem has mixed constraints, you may explicitly specify some of them as linear. It may help optimizer to handle them more efficiently. INPUT PARAMETERS: State - structure previously allocated with MinNLCCreate call. C - linear constraints, array[K,N+1]. Each row of C represents one constraint, either equality or inequality (see below): * first N elements correspond to coefficients, * last element corresponds to the right part. All elements of C (including right part) must be finite. CT - type of constraints, array[K]: * if CT[i]>0, then I-th constraint is C[i,*]*x >= C[i,n+1] * if CT[i]=0, then I-th constraint is C[i,*]*x = C[i,n+1] * if CT[i]<0, then I-th constraint is C[i,*]*x <= C[i,n+1] K - number of equality/inequality constraints, K>=0: * if given, only leading K elements of C/CT are used * if not given, automatically determined from sizes of C/CT NOTE 1: when you solve your problem with augmented Lagrangian solver, linear constraints are satisfied only approximately! It is possible that algorithm will evaluate function outside of feasible area! 
-- ALGLIB -- Copyright 06.06.2014 by Bochkanov Sergey *************************************************************************/ void minnlcsetlc(const minnlcstate &state, const real_2d_array &c, const integer_1d_array &ct, const ae_int_t k); void minnlcsetlc(const minnlcstate &state, const real_2d_array &c, const integer_1d_array &ct); /************************************************************************* This function sets nonlinear constraints for MinNLC optimizer. In fact, this function sets NUMBER of nonlinear constraints. Constraints themselves (constraint functions) are passed to MinNLCOptimize() method. This method requires user-defined vector function F[] and its Jacobian J[], where: * first component of F[] and first row of Jacobian J[] corresponds to function being minimized * next NLEC components of F[] (and rows of J) correspond to nonlinear equality constraints G_i(x)=0 * next NLIC components of F[] (and rows of J) correspond to nonlinear inequality constraints H_i(x)<=0 NOTE: you may combine nonlinear constraints with linear/boundary ones. If your problem has mixed constraints, you may explicitly specify some of them as linear ones. It may help optimizer to handle them more efficiently. INPUT PARAMETERS: State - structure previously allocated with MinNLCCreate call. NLEC - number of Non-Linear Equality Constraints (NLEC), >=0 NLIC - number of Non-Linear Inequality Constraints (NLIC), >=0 NOTE 1: when you solve your problem with augmented Lagrangian solver, nonlinear constraints are satisfied only approximately! It is possible that algorithm will evaluate function outside of feasible area! NOTE 2: algorithm scales variables according to scale specified by MinNLCSetScale() function, so it can handle problems with badly scaled variables (as long as we KNOW their scales). However, there is no way to automatically scale nonlinear constraints Gi(x) and Hi(x). Inappropriate scaling of Gi/Hi may ruin convergence. Solving problem with constraint "1000*G0(x)=0" is NOT same as solving it with constraint "0.001*G0(x)=0". It means that YOU are the one who is responsible for correct scaling of nonlinear constraints Gi(x) and Hi(x). We recommend you to scale nonlinear constraints in such way that I-th component of dG/dX (or dH/dx) has approximately unit magnitude (for problems with unit scale) or has magnitude approximately equal to 1/S[i] (where S is a scale set by MinNLCSetScale() function). -- ALGLIB -- Copyright 06.06.2014 by Bochkanov Sergey *************************************************************************/ void minnlcsetnlc(const minnlcstate &state, const ae_int_t nlec, const ae_int_t nlic); /************************************************************************* This function sets stopping conditions for inner iterations of optimizer. INPUT PARAMETERS: State - structure which stores algorithm state EpsG - >=0 The subroutine finishes its work if the condition |v|<EpsG is satisfied, where: * |.| means Euclidian norm * v - scaled gradient vector, v[i]=g[i]*s[i] * g - gradient * s - scaling coefficients set by MinNLCSetScale() EpsF - >=0 The subroutine finishes its work if on k+1-th iteration the condition |F(k+1)-F(k)|<=EpsF*max{|F(k)|,|F(k+1)|,1} is satisfied. EpsX - >=0 The subroutine finishes its work if on k+1-th iteration the condition |v|<=EpsX is fulfilled, where: * |.| means Euclidian norm * v - scaled step vector, v[i]=dx[i]/s[i] * dx - step vector, dx=X(k+1)-X(k) * s - scaling coefficients set by MinNLCSetScale() MaxIts - maximum number of iterations. If MaxIts=0, the number of iterations is unlimited. Passing EpsG=0, EpsF=0 and EpsX=0 and MaxIts=0 (simultaneously) will lead to automatic stopping criterion selection. 
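The ordering of F[] and J[] described above is the part which is easiest to get wrong. Here is a hedged sketch (the target, the constraints g(x)=x0+x1-1=0 and h(x)=x0^2-x1<=0, and all values are illustrative assumptions) of a callback of the vector-function/Jacobian shape expected for a problem registered with minnlcsetnlc(state,1,1); the exact MinNLCOptimize() signature which accepts such a callback is documented in its own section:

    void nlc_jac(const real_1d_array &x, real_1d_array &fi, real_2d_array &jac, void *ptr)
    {
        // fi[0], jac row 0: target being minimized, f(x) = x0^2 + x1^2
        fi[0] = x[0]*x[0] + x[1]*x[1];
        jac[0][0] = 2*x[0];
        jac[0][1] = 2*x[1];

        // fi[1], jac row 1: nonlinear equality constraint g(x) = x0 + x1 - 1 = 0
        fi[1] = x[0] + x[1] - 1.0;
        jac[1][0] = 1.0;
        jac[1][1] = 1.0;

        // fi[2], jac row 2: nonlinear inequality constraint h(x) = x0^2 - x1 <= 0
        fi[2] = x[0]*x[0] - x[1];
        jac[2][0] = 2*x[0];
        jac[2][1] = -1.0;
    }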
-- ALGLIB -- Copyright 06.06.2014 by Bochkanov Sergey *************************************************************************/ void minnlcsetcond(const minnlcstate &state, const double epsg, const double epsf, const double epsx, const ae_int_t maxits); /************************************************************************* This function sets scaling coefficients for NLC optimizer. ALGLIB optimizers use scaling matrices to test stopping conditions (step size and gradient are scaled before comparison with tolerances). Scale of the I-th variable is a translation invariant measure of: a) "how large" the variable is b) how large the step should be to make significant changes in the function Scaling is also used by finite difference variant of the optimizer - step along I-th axis is equal to DiffStep*S[I]. INPUT PARAMETERS: State - structure stores algorithm state S - array[N], non-zero scaling coefficients S[i] may be negative, sign doesn't matter. -- ALGLIB -- Copyright 06.06.2014 by Bochkanov Sergey *************************************************************************/ void minnlcsetscale(const minnlcstate &state, const real_1d_array &s); /************************************************************************* This function sets preconditioner to "inexact LBFGS-based" mode. Preconditioning is very important for convergence of Augmented Lagrangian algorithm because presence of penalty term makes problem ill-conditioned. Difference between performance of preconditioned and unpreconditioned methods can be as large as 100x! MinNLC optimizer may use following preconditioners, each with its own benefits and drawbacks: a) inexact LBFGS-based, with O(N*K) evaluation time b) exact low rank one, with O(N*K^2) evaluation time c) exact robust one, with O(N^3+K*N^2) evaluation time where K is a total number of general linear and nonlinear constraints (box ones are not counted). Inexact LBFGS-based preconditioner uses L-BFGS formula combined with orthogonality assumption to perform very fast updates. For a N-dimensional problem with K general linear or nonlinear constraints (boundary ones are not counted) it has O(N*K) cost per iteration. This preconditioner has best quality (less iterations) when general linear and nonlinear constraints are orthogonal to each other (orthogonality with respect to boundary constraints is not required). Number of iterations increases when constraints are non-orthogonal, because algorithm assumes orthogonality, but still it is better than no preconditioner at all. INPUT PARAMETERS: State - structure stores algorithm state -- ALGLIB -- Copyright 26.09.2014 by Bochkanov Sergey *************************************************************************/ void minnlcsetprecinexact(const minnlcstate &state); /************************************************************************* This function sets preconditioner to "exact low rank" mode. Preconditioning is very important for convergence of Augmented Lagrangian algorithm because presence of penalty term makes problem ill-conditioned. Difference between performance of preconditioned and unpreconditioned methods can be as large as 100x! MinNLC optimizer may use following preconditioners, each with its own benefits and drawbacks: a) inexact LBFGS-based, with O(N*K) evaluation time b) exact low rank one, with O(N*K^2) evaluation time c) exact robust one, with O(N^3+K*N^2) evaluation time where K is a total number of general linear and nonlinear constraints (box ones are not counted). 
It also provides special unpreconditioned mode of operation which can be used for test purposes. Comments below discuss low rank preconditioner. Exact low-rank preconditioner uses Woodbury matrix identity to build quadratic model of the penalized function. It has following features: * no special assumptions about orthogonality of constraints * preconditioner evaluation is optimized for K<<N; its cost, O(N*K^2), becomes prohibitive when K is comparable to N * finally, stability of the process is guaranteed only for K<<N; it may fail for K>=N due to degeneracy of intermediate matrices. That's why we recommend to use "exact robust" preconditioner for such cases. RECOMMENDATIONS We recommend to choose between "exact low rank" and "exact robust" preconditioners, with "low rank" version being chosen when you know in advance that total count of non-box constraints won't exceed N, and "robust" version being chosen when you need bulletproof solution. INPUT PARAMETERS: State - structure stores algorithm state UpdateFreq- update frequency. Preconditioner is rebuilt after every UpdateFreq iterations. Recommended value: 10 or higher. Zero value means that good default value will be used. -- ALGLIB -- Copyright 26.09.2014 by Bochkanov Sergey *************************************************************************/ void minnlcsetprecexactlowrank(const minnlcstate &state, const ae_int_t updatefreq); /************************************************************************* This function sets preconditioner to "exact robust" mode. Preconditioning is very important for convergence of Augmented Lagrangian algorithm because presence of penalty term makes problem ill-conditioned. Difference between performance of preconditioned and unpreconditioned methods can be as large as 100x! MinNLC optimizer may use following preconditioners, each with its own benefits and drawbacks: a) inexact LBFGS-based, with O(N*K) evaluation time b) exact low rank one, with O(N*K^2) evaluation time c) exact robust one, with O(N^3+K*N^2) evaluation time where K is a total number of general linear and nonlinear constraints (box ones are not counted). It also provides special unpreconditioned mode of operation which can be used for test purposes. Comments below discuss robust preconditioner. Exact robust preconditioner uses Cholesky decomposition to invert approximate Hessian matrix H=D+W'*C*W (where D stands for diagonal terms of Hessian, combined result of initial scaling matrix and penalty from box constraints; W stands for general linear constraints and linearization of nonlinear ones; C stands for diagonal matrix of penalty coefficients). This preconditioner has following features: * no special assumptions about constraint structure * preconditioner is optimized for stability; unlike "exact low rank" version which fails for K>=N, this one works well for any value of K. * the only drawback is that it takes O(N^3+K*N^2) time to build it. No economical Woodbury update is applied even when it makes sense, thus there exist situations (K<<N) when this preconditioner is noticeably slower than the "exact low rank" one. INPUT PARAMETERS: State - structure stores algorithm state UpdateFreq- update frequency. Preconditioner is rebuilt after every UpdateFreq iterations. Recommended value: 10 or higher. Zero value means that good default value will be used. -- ALGLIB -- Copyright 26.09.2014 by Bochkanov Sergey *************************************************************************/ void minnlcsetprecexactrobust(const minnlcstate &state, const ae_int_t updatefreq); /************************************************************************* This function sets maximum step length. INPUT PARAMETERS: State - structure which stores algorithm state StpMax - maximum step length, >=0. Set StpMax to 0.0 (default), if you don't want to limit step length. Use this subroutine when you optimize target function which contains exp() or other fast growing functions, and optimization algorithm makes too large steps which lead to overflow. This function allows us to reject steps that are too large (and therefore expose us to the possible overflow) without actually calculating function value at the x+stp*d.
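The preconditioner modes, stopping conditions and step limit discussed above are all configured through one-line calls. The sketch below shows one hypothetical configuration; the K<N/2 cutoff, the tolerance, the StpMax value and the helper name configure_minnlc are arbitrary illustrations for this sketch, not library recommendations.

#include "optimization.h"
using namespace alglib;

// Illustrative MinNLC configuration helper. K is the number of non-box
// constraints, N the dimensionality; the policy below is an arbitrary sketch.
void configure_minnlc(const minnlcstate &state, ae_int_t n, ae_int_t k)
{
    // preconditioner: exact low rank while K stays well below N,
    // otherwise the inexact LBFGS-based one
    if( k<n/2 )
        minnlcsetprecexactlowrank(state, 10);   // rebuild every 10 iterations
    else
        minnlcsetprecinexact(state);            // O(N*K) cost per iteration

    // stopping conditions: small scaled step, unlimited iteration count
    minnlcsetcond(state, 0.0, 0.0, 1.0e-6, 0);

    // cap the scaled step length to guard against overflow in exp()-like
    // terms of the target; 0.0 would remove the limit
    minnlcsetstpmax(state, 10.0);
}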
-- ALGLIB -- Copyright 02.04.2010 by Bochkanov Sergey *************************************************************************/ void minnlcsetstpmax(const minnlcstate &state, const double stpmax); /************************************************************************* This function tells MinNLC unit to use Augmented Lagrangian algorithm for nonlinearly constrained optimization. This algorithm is a slight modification of one described in "A Modified Barrier-Augmented Lagrangian Method for Constrained Minimization (1999)" by D.GOLDFARB, R.POLYAK, K. SCHEINBERG, I.YUZEFOVICH. Augmented Lagrangian algorithm works by converting problem of minimizing F(x) subject to equality/inequality constraints to unconstrained problem of the form min[ f(x) + + Rho*PENALTY_EQ(x) + SHIFT_EQ(x,Nu1) + + Rho*PENALTY_INEQ(x) + SHIFT_INEQ(x,Nu2) ] where: * Rho is a fixed penalization coefficient * PENALTY_EQ(x) is a penalty term, which is used to APPROXIMATELY enforce equality constraints * SHIFT_EQ(x) is a special "shift" term which is used to "fine-tune" equality constraints, greatly increasing precision * PENALTY_INEQ(x) is a penalty term which is used to approximately enforce inequality constraints * SHIFT_INEQ(x) is a special "shift" term which is used to "fine-tune" inequality constraints, greatly increasing precision * Nu1/Nu2 are vectors of Lagrange coefficients which are fine-tuned during outer iterations of algorithm This version of AUL algorithm uses preconditioner, which greatly accelerates convergence. Because this algorithm is similar to penalty methods, it may perform steps into infeasible area. All kinds of constraints (boundary, linear and nonlinear ones) may be violated in intermediate points - and in the solution. However, properly configured AUL method is significantly better at handling constraints than barrier and/or penalty methods. The very basic outline of algorithm is given below: 1) first outer iteration is performed with "default" values of Lagrange multipliers Nu1/Nu2. Solution quality is low (candidate point can be too far away from true solution; large violation of constraints is possible) and is comparable with that of penalty methods. 2) subsequent outer iterations refine Lagrange multipliers and improve quality of the solution. INPUT PARAMETERS: State - structure which stores algorithm state Rho - penalty coefficient, Rho>0: * large enough that algorithm converges with desired precision. Minimum value is 10*max(S'*diag(H)*S), where S is a scale matrix (set by MinNLCSetScale) and H is a Hessian of the function being minimized. If you can not easily estimate Hessian norm, see our recommendations below. * not TOO large to prevent ill-conditioning * for unit-scale problems (variables and Hessian have unit magnitude), Rho=100 or Rho=1000 can be used. * it is important to note that Rho is internally multiplied by scaling matrix, i.e. optimum value of Rho depends on scale of variables specified by MinNLCSetScale(). ItsCnt - number of outer iterations: * ItsCnt=0 means that small number of outer iterations is automatically chosen (10 iterations in current version). * ItsCnt=1 means that AUL algorithm performs just as usual barrier method. * ItsCnt>1 means that AUL algorithm performs specified number of outer iterations HOW TO CHOOSE PARAMETERS Nonlinear optimization is a tricky area and Augmented Lagrangian algorithm is sometimes hard to tune. Good values of Rho and ItsCnt are problem- specific. 
In order to help you we prepared following set of recommendations: * for unit-scale problems (variables and Hessian have unit magnitude), Rho=100 or Rho=1000 can be used. * start from some small value of Rho and solve problem with just one outer iteration (ItcCnt=1). In this case algorithm behaves like penalty method. Increase Rho in 2x or 10x steps until you see that one outer iteration returns point which is "rough approximation to solution". It is very important to have Rho so large that penalty term becomes constraining i.e. modified function becomes highly convex in constrained directions. From the other side, too large Rho may prevent you from converging to the solution. You can diagnose it by studying number of inner iterations performed by algorithm: too few (5-10 on 1000-dimensional problem) or too many (orders of magnitude more than dimensionality) usually means that Rho is too large. * with just one outer iteration you usually have low-quality solution. Some constraints can be violated with very large margin, while other ones (which are NOT violated in the true solution) can push final point too far in the inner area of the feasible set. For example, if you have constraint x0>=0 and true solution x0=1, then merely a presence of "x0>=0" will introduce a bias towards larger values of x0. Say, algorithm may stop at x0=1.5 instead of 1.0. * after you found good Rho, you may increase number of outer iterations. ItsCnt=10 is a good value. Subsequent outer iteration will refine values of Lagrange multipliers. Constraints which were violated will be enforced, inactive constraints will be dropped (corresponding multipliers will be decreased). Ideally, you should see 10-1000x improvement in constraint handling (constraint violation is reduced). * if you see that algorithm converges to vicinity of solution, but additional outer iterations do not refine solution, it may mean that algorithm is unstable - it wanders around true solution, but can not approach it. Sometimes algorithm may be stabilized by increasing Rho one more time, making it 5x or 10x larger. SCALING OF CONSTRAINTS [IMPORTANT] AUL optimizer scales variables according to scale specified by MinNLCSetScale() function, so it can handle problems with badly scaled variables (as long as we KNOW their scales). However, because function being optimized is a mix of original function and constraint-dependent penalty functions, it is important to rescale both variables AND constraints. Say, if you minimize f(x)=x^2 subject to 1000000*x>=0, then you have constraint whose scale is different from that of target function (another example is 0.000001*x>=0). It is also possible to have constraints whose scales are misaligned: 1000000*x0>=0, 0.000001*x1<=0. Inappropriate scaling may ruin convergence because minimizing x^2 subject to x>=0 is NOT same as minimizing it subject to 1000000*x>=0. Because we know coefficients of boundary/linear constraints, we can automatically rescale and normalize them. However, there is no way to automatically rescale nonlinear constraints Gi(x) and Hi(x) - they are black boxes. It means that YOU are the one who is responsible for correct scaling of nonlinear constraints Gi(x) and Hi(x). We recommend you to rescale nonlinear constraints in such way that I-th component of dG/dX (or dH/dx) has magnitude approximately equal to 1/S[i] (where S is a scale set by MinNLCSetScale() function). WHAT IF IT DOES NOT CONVERGE? It is possible that AUL algorithm fails to converge to precise values of Lagrange multipliers. 
It stops somewhere around true solution, but candidate point is still too far from solution, and some constraints are violated. Such kind of failure is specific to Lagrangian algorithms - technically, they stop at some point, but this point is not constrained solution. There exist several reasons why algorithm may fail to converge: a) too loose stopping criteria for inner iteration b) degenerate, redundant constraints c) target function has unconstrained extremum exactly at the boundary of some constraint d) numerical noise in the target function In all these cases algorithm is unstable - each outer iteration results in large and almost random step which improves handling of some constraints, but violates other ones (ideally outer iterations should form a sequence of progressively decreasing steps towards solution). The first possible reason is that too loose stopping criteria for inner iteration were specified. Augmented Lagrangian algorithm solves a sequence of intermediate problems, and requires each of them to be solved with high precision. Insufficient precision results in incorrect update of Lagrange multipliers. Another reason is that you may have specified degenerate constraints: say, some constraint was repeated twice. In most cases AUL algorithm gracefully handles such situations, but sometimes it may spend too much time figuring out subtle degeneracies in constraint matrix. Third reason is tricky and hard to diagnose. Consider situation when you minimize f=x^2 subject to constraint x>=0. Unconstrained extremum is located exactly at the boundary of constrained area. In this case algorithm will tend to oscillate between negative and positive x. Each time it stops at x<0 it "reinforces" constraint x>=0, and each time it is bounced to x>0 it "relaxes" constraint (and is attracted to x<0). Such situation sometimes happens in problems with hidden symmetries. The algorithm gets caught in a loop with Lagrange multipliers being continuously increased/decreased. Luckily, such loop forms after at least three iterations, so this problem can be solved by DECREASING number of outer iterations down to 1-2 and increasing penalty coefficient Rho as much as possible. Final reason is numerical noise. AUL algorithm is robust against moderate noise (more robust than, say, active set methods), but large noise may destabilize algorithm. -- ALGLIB -- Copyright 06.06.2014 by Bochkanov Sergey *************************************************************************/ void minnlcsetalgoaul(const minnlcstate &state, const double rho, const ae_int_t itscnt); /************************************************************************* This function turns on/off reporting. INPUT PARAMETERS: State - structure which stores algorithm state NeedXRep- whether iteration reports are needed or not If NeedXRep is True, algorithm will call rep() callback function if it is provided to MinNLCOptimize(). NOTE: algorithm passes two parameters to rep() callback - current point and penalized function value at current point. Important - the function value which is reported is NOT the function being minimized. It is the sum of the function being minimized and the penalty term.
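Putting the AUL-related calls above together, the following is a minimal end-to-end sketch; the problem, the callback name and the values Rho=1000, ItsCnt=10 are illustrative choices consistent with the unit-scale recommendations above, not prescribed settings. It minimizes (x0-1)^2+(x1-1)^2 subject to the single nonlinear inequality x0^2+x1^2-1<=0, using the minnlccreate()/minnlcoptimize()/minnlcresults() calling pattern documented in this section.

#include <stdio.h>
#include "optimization.h"
using namespace alglib;

// Hypothetical target/constraint callback: fi[0] is the target, fi[1] the
// inequality constraint H(x)=x0^2+x1^2-1<=0, jac holds their gradients.
static void aul_target_jac(const real_1d_array &x, real_1d_array &fi, real_2d_array &jac, void *ptr)
{
    fi[0] = (x[0]-1)*(x[0]-1) + (x[1]-1)*(x[1]-1);
    jac[0][0] = 2*(x[0]-1);
    jac[0][1] = 2*(x[1]-1);
    fi[1] = x[0]*x[0] + x[1]*x[1] - 1.0;
    jac[1][0] = 2*x[0];
    jac[1][1] = 2*x[1];
}

int main()
{
    real_1d_array x0 = "[0.5,0.5]";                 // starting point
    real_1d_array s  = "[1,1]";                     // unit variable scales
    minnlcstate state;
    minnlcreport rep;
    real_1d_array x;

    minnlccreate(x0, state);                        // analytic-Jacobian version
    minnlcsetscale(state, s);
    minnlcsetcond(state, 0.0, 0.0, 1.0e-7, 0);      // stop on small scaled step
    minnlcsetalgoaul(state, 1000.0, 10);            // Rho=1000, 10 outer iterations
    minnlcsetnlc(state, 0, 1);                      // no equalities, one inequality
    minnlcoptimize(state, aul_target_jac);
    minnlcresults(state, x, rep);
    printf("termination type: %d\n", (int)rep.terminationtype);
    printf("solution: %s\n", x.tostring(2).c_str());
    return 0;
}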
-- ALGLIB -- Copyright 28.11.2010 by Bochkanov Sergey *************************************************************************/ void minnlcsetxrep(const minnlcstate &state, const bool needxrep); /************************************************************************* This function provides reverse communication interface Reverse communication interface is not documented or recommended to use. See below for functions which provide better documented API *************************************************************************/ bool minnlciteration(const minnlcstate &state); /************************************************************************* This family of functions is used to launcn iterations of nonlinear optimizer These functions accept following parameters: state - algorithm state fvec - callback which calculates function vector fi[] at given point x jac - callback which calculates function vector fi[] and Jacobian jac at given point x rep - optional callback which is called after each iteration can be NULL ptr - optional pointer which is passed to func/grad/hess/jac/rep can be NULL NOTES: 1. This function has two different implementations: one which uses exact (analytical) user-supplied Jacobian, and one which uses only function vector and numerically differentiates function in order to obtain gradient. Depending on the specific function used to create optimizer object you should choose appropriate variant of MinNLCOptimize() - one which accepts function AND Jacobian or one which accepts ONLY function. Be careful to choose variant of MinNLCOptimize() which corresponds to your optimization scheme! Table below lists different combinations of callback (function/gradient) passed to MinNLCOptimize() and specific function used to create optimizer. | USER PASSED TO MinNLCOptimize() CREATED WITH | function only | function and gradient ------------------------------------------------------------ MinNLCCreateF() | works FAILS MinNLCCreate() | FAILS works Here "FAILS" denotes inappropriate combinations of optimizer creation function and MinNLCOptimize() version. Attemps to use such combination will lead to exception. Either you did not pass gradient when it WAS needed or you passed gradient when it was NOT needed. -- ALGLIB -- Copyright 06.06.2014 by Bochkanov Sergey *************************************************************************/ void minnlcoptimize(minnlcstate &state, void (*fvec)(const real_1d_array &x, real_1d_array &fi, void *ptr), void (*rep)(const real_1d_array &x, double func, void *ptr) = NULL, void *ptr = NULL); void minnlcoptimize(minnlcstate &state, void (*jac)(const real_1d_array &x, real_1d_array &fi, real_2d_array &jac, void *ptr), void (*rep)(const real_1d_array &x, double func, void *ptr) = NULL, void *ptr = NULL); /************************************************************************* MinNLC results INPUT PARAMETERS: State - algorithm state OUTPUT PARAMETERS: X - array[0..N-1], solution Rep - optimization report. You should check Rep.TerminationType in order to distinguish successful termination from unsuccessful one: * -8 internal integrity control detected infinite or NAN values in function/gradient. Abnormal termination signalled. * -7 gradient verification failed. See MinNLCSetGradientCheck() for more information. * 1 relative function improvement is no more than EpsF. * 2 scaled step is no more than EpsX. * 4 scaled gradient norm is no more than EpsG. 
* 5 MaxIts steps was taken More information about fields of this structure can be found in the comments on MinNLCReport datatype. -- ALGLIB -- Copyright 06.06.2014 by Bochkanov Sergey *************************************************************************/ void minnlcresults(const minnlcstate &state, real_1d_array &x, minnlcreport &rep); /************************************************************************* NLC results Buffered implementation of MinNLCResults() which uses pre-allocated buffer to store X[]. If buffer size is too small, it resizes buffer. It is intended to be used in the inner cycles of performance critical algorithms where array reallocation penalty is too large to be ignored. -- ALGLIB -- Copyright 28.11.2010 by Bochkanov Sergey *************************************************************************/ void minnlcresultsbuf(const minnlcstate &state, real_1d_array &x, minnlcreport &rep); /************************************************************************* This subroutine restarts algorithm from new point. All optimization parameters (including constraints) are left unchanged. This function allows to solve multiple optimization problems (which must have same number of dimensions) without object reallocation penalty. INPUT PARAMETERS: State - structure previously allocated with MinNLCCreate call. X - new starting point. -- ALGLIB -- Copyright 28.11.2010 by Bochkanov Sergey *************************************************************************/ void minnlcrestartfrom(const minnlcstate &state, const real_1d_array &x); /************************************************************************* This subroutine turns on verification of the user-supplied analytic gradient: * user calls this subroutine before optimization begins * MinNLCOptimize() is called * prior to actual optimization, for each component of parameters being optimized X[i] algorithm performs following steps: * two trial steps are made to X[i]-TestStep*S[i] and X[i]+TestStep*S[i], where X[i] is i-th component of the initial point and S[i] is a scale of i-th parameter * F(X) is evaluated at these trial points * we perform one more evaluation in the middle point of the interval * we build cubic model using function values and derivatives at trial points and we compare its prediction with actual value in the middle point * in case difference between prediction and actual value is higher than some predetermined threshold, algorithm stops with completion code -7; Rep.VarIdx is set to index of the parameter with incorrect derivative, and Rep.FuncIdx is set to index of the function. * after verification is over, algorithm proceeds to the actual optimization. NOTE 1: verification needs N (parameters count) gradient evaluations. It is very costly and you should use it only for low dimensional problems, when you want to be sure that you've correctly calculated analytic derivatives. You should not use it in the production code (unless you want to check derivatives provided by some third party). NOTE 2: you should carefully choose TestStep. Value which is too large (so large that function behaviour is significantly non-cubic) will lead to false alarms. You may use different step for different parameters by means of setting scale with MinNLCSetScale(). NOTE 3: this function may lead to false positives. 
In case it reports that I-th derivative was calculated incorrectly, you may decrease test step and try one more time - maybe your function changes too sharply and your step is too large for such rapidly changing function. INPUT PARAMETERS: State - structure used to store algorithm state TestStep - verification step: * TestStep=0 turns verification off * TestStep>0 activates verification -- ALGLIB -- Copyright 15.06.2014 by Bochkanov Sergey *************************************************************************/ void minnlcsetgradientcheck(const minnlcstate &state, const double teststep); /************************************************************************* BOX CONSTRAINED OPTIMIZATION WITH FAST ACTIVATION OF MULTIPLE BOX CONSTRAINTS DESCRIPTION: The subroutine minimizes function F(x) of N arguments subject to box constraints (with some of box constraints actually being equality ones). This optimizer uses algorithm similar to that of MinBLEIC (optimizer with general linear constraints), but presence of box-only constraints allows us to use faster constraint activation strategies. On large-scale problems, with multiple constraints active at the solution, this optimizer can be several times faster than BLEIC. REQUIREMENTS: * user must provide function value and gradient * starting point X0 must be feasible or not too far away from the feasible set * grad(f) must be Lipschitz continuous on a level set: L = { x : f(x)<=f(x0) } * function must be defined everywhere on the feasible set F USAGE: Constrained optimization is far more complex than the unconstrained one. Here we give very brief outline of the BC optimizer. We strongly recommend you to read examples in the ALGLIB Reference Manual and to read ALGLIB User Guide on optimization, which is available at http://www.alglib.net/optimization/ 1. User initializes algorithm state with MinBCCreate() call 2. User adds box constraints by calling MinBCSetBC() function. 3. User sets stopping conditions with MinBCSetCond(). 4. User calls MinBCOptimize() function which takes algorithm state and pointer (delegate, etc.) to callback function which calculates F/G. 5. User calls MinBCResults() to get solution 6. Optionally user may call MinBCRestartFrom() to solve another problem with same N but another starting point. MinBCRestartFrom() allows to reuse already initialized structure. INPUT PARAMETERS: N - problem dimension, N>0: * if given, only leading N elements of X are used * if not given, automatically determined from size of X X - starting point, array[N]: * it is better to set X to a feasible point * but X can be infeasible, in which case algorithm will try to find feasible point first, using X as initial approximation. OUTPUT PARAMETERS: State - structure stores algorithm state -- ALGLIB -- Copyright 28.11.2010 by Bochkanov Sergey *************************************************************************/ void minbccreate(const ae_int_t n, const real_1d_array &x, minbcstate &state); void minbccreate(const real_1d_array &x, minbcstate &state); /************************************************************************* The subroutine is finite difference variant of MinBCCreate(). It uses finite differences in order to differentiate target function. Description below contains information which is specific to this function only. We recommend to read comments on MinBCCreate() in order to get more information about creation of BC optimizer.
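The MinBC usage outline above (steps 1-6) translates into only a few lines of code. The fragment below is an illustrative sketch with a made-up two-dimensional quadratic target whose unconstrained minimum (3,-1) lies outside the box [-1,1]x[-1,1], so the box constraints are active at the solution; the callback name and all numeric values are invented for illustration.

#include <stdio.h>
#include "optimization.h"
using namespace alglib;

// Hypothetical function/gradient callback for minbcoptimize():
// func = (x0-3)^2 + (x1+1)^2, grad = its gradient.
static void bc_func_grad(const real_1d_array &x, double &func, real_1d_array &grad, void *ptr)
{
    func = (x[0]-3)*(x[0]-3) + (x[1]+1)*(x[1]+1);
    grad[0] = 2*(x[0]-3);
    grad[1] = 2*(x[1]+1);
}

int main()
{
    real_1d_array x0   = "[0,0]";
    real_1d_array bndl = "[-1,-1]";
    real_1d_array bndu = "[1,1]";
    minbcstate state;
    minbcreport rep;
    real_1d_array x;

    minbccreate(x0, state);                       // step 1: create optimizer
    minbcsetbc(state, bndl, bndu);                // step 2: box constraints
    minbcsetcond(state, 0.0, 0.0, 1.0e-8, 0);     // step 3: stopping conditions
    minbcoptimize(state, bc_func_grad);           // step 4: run optimization
    minbcresults(state, x, rep);                  // step 5: read results
    printf("termination type: %d\n", (int)rep.terminationtype);
    printf("solution: %s\n", x.tostring(2).c_str());
    return 0;
}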
INPUT PARAMETERS: N - problem dimension, N>0: * if given, only leading N elements of X are used * if not given, automatically determined from size of X X - starting point, array[0..N-1]. DiffStep- differentiation step, >0 OUTPUT PARAMETERS: State - structure which stores algorithm state NOTES: 1. algorithm uses 4-point central formula for differentiation. 2. differentiation step along I-th axis is equal to DiffStep*S[I] where S[] is scaling vector which can be set by MinBCSetScale() call. 3. we recommend you to use moderate values of differentiation step. Too large step will result in too large truncation errors, while too small step will result in too large numerical errors. 1.0E-6 can be good value to start with. 4. Numerical differentiation is very inefficient - one gradient calculation needs 4*N function evaluations. This function will work for any N - either small (1...10), moderate (10...100) or large (100...). However, performance penalty will be too severe for any N's except for small ones. We should also say that code which relies on numerical differentiation is less robust and precise. The optimizer needs exact gradient values. Imprecise gradient may slow down convergence, especially on highly nonlinear problems. Thus we recommend to use this function for fast prototyping on small-dimensional problems only, and to implement analytical gradient as soon as possible. -- ALGLIB -- Copyright 16.05.2011 by Bochkanov Sergey *************************************************************************/ void minbccreatef(const ae_int_t n, const real_1d_array &x, const double diffstep, minbcstate &state); void minbccreatef(const real_1d_array &x, const double diffstep, minbcstate &state); /************************************************************************* This function sets boundary constraints for BC optimizer. Boundary constraints are inactive by default (after initial creation). They are preserved after algorithm restart with MinBCRestartFrom(). INPUT PARAMETERS: State - structure stores algorithm state BndL - lower bounds, array[N]. If some (all) variables are unbounded, you may specify very small number or -INF. BndU - upper bounds, array[N]. If some (all) variables are unbounded, you may specify very large number or +INF. NOTE 1: it is possible to specify BndL[i]=BndU[i]. In this case I-th variable will be "frozen" at X[i]=BndL[i]=BndU[i]. NOTE 2: this solver has following useful properties: * bound constraints are always satisfied exactly * function is evaluated only INSIDE area specified by bound constraints, even when numerical differentiation is used (algorithm adjusts nodes according to boundary constraints) -- ALGLIB -- Copyright 28.11.2010 by Bochkanov Sergey *************************************************************************/ void minbcsetbc(const minbcstate &state, const real_1d_array &bndl, const real_1d_array &bndu); /************************************************************************* This function sets stopping conditions for the optimizer. INPUT PARAMETERS: State - structure which stores algorithm state EpsG - >=0 The subroutine finishes its work if the condition |v|<=EpsG is satisfied, where: * |.| means Euclidian norm * v - scaled gradient vector, v[i]=g[i]*s[i] * g - gradient * s - scaling coefficients set by MinBCSetScale() EpsF - >=0 The subroutine finishes its work if on k+1-th iteration the condition |F(k+1)-F(k)|<=EpsF*max{|F(k)|,|F(k+1)|,1} is satisfied.
EpsX - >=0 The subroutine finishes its work if on k+1-th iteration the condition |v|<=EpsX is fulfilled, where: * |.| means Euclidian norm * v - scaled step vector, v[i]=dx[i]/s[i] * dx - step vector, dx=X(k+1)-X(k) * s - scaling coefficients set by MinBCSetScale() MaxIts - maximum number of iterations. If MaxIts=0, the number of iterations is unlimited. Passing EpsG=0, EpsF=0 and EpsX=0 and MaxIts=0 (simultaneously) will lead to automatic stopping criterion selection. NOTE: when SetCond() called with non-zero MaxIts, BC solver may perform slightly more than MaxIts iterations. I.e., MaxIts sets non-strict limit on iterations count. -- ALGLIB -- Copyright 28.11.2010 by Bochkanov Sergey *************************************************************************/ void minbcsetcond(const minbcstate &state, const double epsg, const double epsf, const double epsx, const ae_int_t maxits); /************************************************************************* This function sets scaling coefficients for BC optimizer. ALGLIB optimizers use scaling matrices to test stopping conditions (step size and gradient are scaled before comparison with tolerances). Scale of the I-th variable is a translation invariant measure of: a) "how large" the variable is b) how large the step should be to make significant changes in the function Scaling is also used by finite difference variant of the optimizer - step along I-th axis is equal to DiffStep*S[I]. In most optimizers (and in the BC too) scaling is NOT a form of preconditioning. It just affects stopping conditions. You should set preconditioner by separate call to one of the MinBCSetPrec...() functions. There is a special preconditioning mode, however, which uses scaling coefficients to form diagonal preconditioning matrix. You can turn this mode on, if you want. But you should understand that scaling is not the same thing as preconditioning - these are two different, although related forms of tuning solver. INPUT PARAMETERS: State - structure stores algorithm state S - array[N], non-zero scaling coefficients S[i] may be negative, sign doesn't matter. -- ALGLIB -- Copyright 14.01.2011 by Bochkanov Sergey *************************************************************************/ void minbcsetscale(const minbcstate &state, const real_1d_array &s); /************************************************************************* Modification of the preconditioner: preconditioning is turned off. INPUT PARAMETERS: State - structure which stores algorithm state -- ALGLIB -- Copyright 13.10.2010 by Bochkanov Sergey *************************************************************************/ void minbcsetprecdefault(const minbcstate &state); /************************************************************************* Modification of the preconditioner: diagonal of approximate Hessian is used. INPUT PARAMETERS: State - structure which stores algorithm state D - diagonal of the approximate Hessian, array[0..N-1], (if larger, only leading N elements are used). NOTE 1: D[i] should be positive. Exception will be thrown otherwise. NOTE 2: you should pass diagonal of approximate Hessian - NOT ITS INVERSE. -- ALGLIB -- Copyright 13.10.2010 by Bochkanov Sergey *************************************************************************/ void minbcsetprecdiag(const minbcstate &state, const real_1d_array &d); /************************************************************************* Modification of the preconditioner: scale-based diagonal preconditioning. 
This preconditioning mode can be useful when you don't have approximate diagonal of Hessian, but you know that your variables are badly scaled (for example, one variable is in [1,10], and another in [1000,100000]), and most part of the ill-conditioning comes from different scales of vars. In this case simple scale-based preconditioner, with H[i] = 1/(s[i]^2), can greatly improve convergence. IMPRTANT: you should set scale of your variables with MinBCSetScale() call (before or after MinBCSetPrecScale() call). Without knowledge of the scale of your variables scale-based preconditioner will be just unit matrix. INPUT PARAMETERS: State - structure which stores algorithm state -- ALGLIB -- Copyright 13.10.2010 by Bochkanov Sergey *************************************************************************/ void minbcsetprecscale(const minbcstate &state); /************************************************************************* This function turns on/off reporting. INPUT PARAMETERS: State - structure which stores algorithm state NeedXRep- whether iteration reports are needed or not If NeedXRep is True, algorithm will call rep() callback function if it is provided to MinBCOptimize(). -- ALGLIB -- Copyright 28.11.2010 by Bochkanov Sergey *************************************************************************/ void minbcsetxrep(const minbcstate &state, const bool needxrep); /************************************************************************* This function sets maximum step length INPUT PARAMETERS: State - structure which stores algorithm state StpMax - maximum step length, >=0. Set StpMax to 0.0, if you don't want to limit step length. Use this subroutine when you optimize target function which contains exp() or other fast growing functions, and optimization algorithm makes too large steps which lead to overflow. This function allows us to reject steps that are too large (and therefore expose us to the possible overflow) without actually calculating function value at the x+stp*d. -- ALGLIB -- Copyright 02.04.2010 by Bochkanov Sergey *************************************************************************/ void minbcsetstpmax(const minbcstate &state, const double stpmax); /************************************************************************* This function provides reverse communication interface Reverse communication interface is not documented or recommended to use. See below for functions which provide better documented API *************************************************************************/ bool minbciteration(const minbcstate &state); /************************************************************************* This family of functions is used to launcn iterations of nonlinear optimizer These functions accept following parameters: state - algorithm state func - callback which calculates function (or merit function) value func at given point x grad - callback which calculates function (or merit function) value func and gradient grad at given point x rep - optional callback which is called after each iteration can be NULL ptr - optional pointer which is passed to func/grad/hess/jac/rep can be NULL NOTES: 1. This function has two different implementations: one which uses exact (analytical) user-supplied gradient, and one which uses function value only and numerically differentiates function in order to obtain gradient. 
Depending on the specific function used to create optimizer object (either MinBCCreate() for analytical gradient or MinBCCreateF() for numerical differentiation) you should choose appropriate variant of MinBCOptimize() - one which accepts function AND gradient or one which accepts function ONLY. Be careful to choose variant of MinBCOptimize() which corresponds to your optimization scheme! Table below lists different combinations of callback (function/gradient) passed to MinBCOptimize() and specific function used to create optimizer. | USER PASSED TO MinBCOptimize() CREATED WITH | function only | function and gradient ------------------------------------------------------------ MinBCCreateF() | works FAILS MinBCCreate() | FAILS works Here "FAIL" denotes inappropriate combinations of optimizer creation function and MinBCOptimize() version. Attemps to use such combination (for example, to create optimizer with MinBCCreateF() and to pass gradient information to MinCGOptimize()) will lead to exception being thrown. Either you did not pass gradient when it WAS needed or you passed gradient when it was NOT needed. -- ALGLIB -- Copyright 28.11.2010 by Bochkanov Sergey *************************************************************************/ void minbcoptimize(minbcstate &state, void (*func)(const real_1d_array &x, double &func, void *ptr), void (*rep)(const real_1d_array &x, double func, void *ptr) = NULL, void *ptr = NULL); void minbcoptimize(minbcstate &state, void (*grad)(const real_1d_array &x, double &func, real_1d_array &grad, void *ptr), void (*rep)(const real_1d_array &x, double func, void *ptr) = NULL, void *ptr = NULL); /************************************************************************* BC results INPUT PARAMETERS: State - algorithm state OUTPUT PARAMETERS: X - array[0..N-1], solution Rep - optimization report. You should check Rep.TerminationType in order to distinguish successful termination from unsuccessful one: * -8 internal integrity control detected infinite or NAN values in function/gradient. Abnormal termination signalled. * -7 gradient verification failed. See MinBCSetGradientCheck() for more information. * -3 inconsistent constraints. * 1 relative function improvement is no more than EpsF. * 2 scaled step is no more than EpsX. * 4 scaled gradient norm is no more than EpsG. * 5 MaxIts steps was taken * 8 terminated by user who called minbcrequesttermination(). X contains point which was "current accepted" when termination request was submitted. More information about fields of this structure can be found in the comments on MinBCReport datatype. -- ALGLIB -- Copyright 28.11.2010 by Bochkanov Sergey *************************************************************************/ void minbcresults(const minbcstate &state, real_1d_array &x, minbcreport &rep); /************************************************************************* BC results Buffered implementation of MinBCResults() which uses pre-allocated buffer to store X[]. If buffer size is too small, it resizes buffer. It is intended to be used in the inner cycles of performance critical algorithms where array reallocation penalty is too large to be ignored. -- ALGLIB -- Copyright 28.11.2010 by Bochkanov Sergey *************************************************************************/ void minbcresultsbuf(const minbcstate &state, real_1d_array &x, minbcreport &rep); /************************************************************************* This subroutine restarts algorithm from new point. 
All optimization parameters (including constraints) are left unchanged. This function allows to solve multiple optimization problems (which must have same number of dimensions) without object reallocation penalty. INPUT PARAMETERS: State - structure previously allocated with MinBCCreate call. X - new starting point. -- ALGLIB -- Copyright 28.11.2010 by Bochkanov Sergey *************************************************************************/ void minbcrestartfrom(const minbcstate &state, const real_1d_array &x); /************************************************************************* This subroutine submits request for termination of running optimizer. It should be called from user-supplied callback when user decides that it is time to "smoothly" terminate optimization process. As result, optimizer stops at point which was "current accepted" when termination request was submitted and returns error code 8 (successful termination). INPUT PARAMETERS: State - optimizer structure NOTE: after request for termination optimizer may perform several additional calls to user-supplied callbacks. It does NOT guarantee to stop immediately - it just guarantees that these additional calls will be discarded later. NOTE: calling this function on optimizer which is NOT running will have no effect. NOTE: multiple calls to this function are possible. First call is counted, subsequent calls are silently ignored. -- ALGLIB -- Copyright 08.10.2014 by Bochkanov Sergey *************************************************************************/ void minbcrequesttermination(const minbcstate &state); /************************************************************************* This subroutine turns on verification of the user-supplied analytic gradient: * user calls this subroutine before optimization begins * MinBCOptimize() is called * prior to actual optimization, for each component of parameters being optimized X[i] algorithm performs following steps: * two trial steps are made to X[i]-TestStep*S[i] and X[i]+TestStep*S[i], where X[i] is i-th component of the initial point and S[i] is a scale of i-th parameter * if needed, steps are bounded with respect to constraints on X[] * F(X) is evaluated at these trial points * we perform one more evaluation in the middle point of the interval * we build cubic model using function values and derivatives at trial points and we compare its prediction with actual value in the middle point * in case difference between prediction and actual value is higher than some predetermined threshold, algorithm stops with completion code -7; Rep.VarIdx is set to index of the parameter with incorrect derivative. * after verification is over, algorithm proceeds to the actual optimization. NOTE 1: verification needs N (parameters count) gradient evaluations. It is very costly and you should use it only for low dimensional problems, when you want to be sure that you've correctly calculated analytic derivatives. You should not use it in the production code (unless you want to check derivatives provided by some third party). NOTE 2: you should carefully choose TestStep. Value which is too large (so large that function behaviour is significantly non-cubic) will lead to false alarms. You may use different step for different parameters by means of setting scale with MinBCSetScale(). NOTE 3: this function may lead to false positives. 
In case it reports that I-th derivative was calculated incorrectly, you may decrease test step and try one more time - maybe your function changes too sharply and your step is too large for such rapidly chanding function. INPUT PARAMETERS: State - structure used to store algorithm state TestStep - verification step: * TestStep=0 turns verification off * TestStep>0 activates verification -- ALGLIB -- Copyright 15.06.2012 by Bochkanov Sergey *************************************************************************/ void minbcsetgradientcheck(const minbcstate &state, const double teststep); /************************************************************************* NONSMOOTH NONCONVEX OPTIMIZATION SUBJECT TO BOX/LINEAR/NONLINEAR-NONSMOOTH CONSTRAINTS DESCRIPTION: The subroutine minimizes function F(x) of N arguments subject to any combination of: * bound constraints * linear inequality constraints * linear equality constraints * nonlinear equality constraints Gi(x)=0 * nonlinear inequality constraints Hi(x)<=0 IMPORTANT: see MinNSSetAlgoAGS for important information on performance restrictions of AGS solver. REQUIREMENTS: * starting point X0 must be feasible or not too far away from the feasible set * F(), G(), H() are continuous, locally Lipschitz and continuously (but not necessarily twice) differentiable in an open dense subset of R^N. Functions F(), G() and H() may be nonsmooth and non-convex. Informally speaking, it means that functions are composed of large differentiable "patches" with nonsmoothness having place only at the boundaries between these "patches". Most real-life nonsmooth functions satisfy these requirements. Say, anything which involves finite number of abs(), min() and max() is very likely to pass the test. Say, it is possible to optimize anything of the following: * f=abs(x0)+2*abs(x1) * f=max(x0,x1) * f=sin(max(x0,x1)+abs(x2)) * for nonlinearly constrained problems: F() must be bounded from below without nonlinear constraints (this requirement is due to the fact that, contrary to box and linear constraints, nonlinear ones require special handling). * user must provide function value and gradient for F(), H(), G() at all points where function/gradient can be calculated. If optimizer requires value exactly at the boundary between "patches" (say, at x=0 for f=abs(x)), where gradient is not defined, user may resolve tie arbitrarily (in our case - return +1 or -1 at its discretion). * NS solver supports numerical differentiation, i.e. it may differentiate your function for you, but it results in 2N increase of function evaluations. Not recommended unless you solve really small problems. See minnscreatef() for more information on this functionality. USAGE: 1. User initializes algorithm state with MinNSCreate() call and chooses what NLC solver to use. There is some solver which is used by default, with default settings, but you should NOT rely on default choice. It may change in future releases of ALGLIB without notice, and no one can guarantee that new solver will be able to solve your problem with default settings. From the other side, if you choose solver explicitly, you can be pretty sure that it will work with new ALGLIB releases. In the current release following solvers can be used: * AGS solver (activated with MinNSSetAlgoAGS() function) 2. 
User adds boundary and/or linear and/or nonlinear constraints by means of calling one of the following functions: a) MinNSSetBC() for boundary constraints b) MinNSSetLC() for linear constraints c) MinNSSetNLC() for nonlinear constraints You may combine (a), (b) and (c) in one optimization problem. 3. User sets scale of the variables with MinNSSetScale() function. It is VERY important to set scale of the variables, because nonlinearly constrained problems are hard to solve when variables are badly scaled. 4. User sets stopping conditions with MinNSSetCond(). 5. Finally, user calls MinNSOptimize() function which takes algorithm state and pointer (delegate, etc) to callback function which calculates F/G/H. 7. User calls MinNSResults() to get solution 8. Optionally user may call MinNSRestartFrom() to solve another problem with same N but another starting point. MinNSRestartFrom() allows to reuse already initialized structure. INPUT PARAMETERS: N - problem dimension, N>0: * if given, only leading N elements of X are used * if not given, automatically determined from size of X X - starting point, array[N]: * it is better to set X to a feasible point * but X can be infeasible, in which case algorithm will try to find feasible point first, using X as initial approximation. OUTPUT PARAMETERS: State - structure stores algorithm state NOTE: minnscreatef() function may be used if you do not have analytic gradient. This function creates solver which uses numerical differentiation with user-specified step. -- ALGLIB -- Copyright 18.05.2015 by Bochkanov Sergey *************************************************************************/ void minnscreate(const ae_int_t n, const real_1d_array &x, minnsstate &state); void minnscreate(const real_1d_array &x, minnsstate &state); /************************************************************************* Version of minnscreatef() which uses numerical differentiation. I.e., you do not have to calculate derivatives yourself. However, this version needs 2N times more function evaluations. 2-point differentiation formula is used, because more precise 4-point formula is unstable when used on non-smooth functions. INPUT PARAMETERS: N - problem dimension, N>0: * if given, only leading N elements of X are used * if not given, automatically determined from size of X X - starting point, array[N]: * it is better to set X to a feasible point * but X can be infeasible, in which case algorithm will try to find feasible point first, using X as initial approximation. DiffStep- differentiation step, DiffStep>0. Algorithm performs numerical differentiation with step for I-th variable being equal to DiffStep*S[I] (here S[] is a scale vector, set by minnssetscale() function). Do not use too small steps, because it may lead to catastrophic cancellation during intermediate calculations. OUTPUT PARAMETERS: State - structure stores algorithm state -- ALGLIB -- Copyright 18.05.2015 by Bochkanov Sergey *************************************************************************/ void minnscreatef(const ae_int_t n, const real_1d_array &x, const double diffstep, minnsstate &state); void minnscreatef(const real_1d_array &x, const double diffstep, minnsstate &state); /************************************************************************* This function sets boundary constraints. Boundary constraints are inactive by default (after initial creation). They are preserved after algorithm restart with minnsrestartfrom(). 
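For completeness, here is an illustrative sketch of the MinNS workflow outlined above, applied to the made-up nonsmooth target f(x)=|x0|+2|x1| with no constraints; it assumes that minnsoptimize() and minnsresults(), documented further in this manual, follow the same calling pattern as their MinNLC counterparts, and the radius/penalty values passed to minnssetalgoags() are arbitrary illustrations.

#include <stdio.h>
#include <math.h>
#include "optimization.h"
using namespace alglib;

// Hypothetical callback for a nonsmooth target f(x) = |x0| + 2|x1|:
// fi[0] is the target, jac[0][*] its (sub)gradient; at the kink x[i]=0
// the tie is resolved arbitrarily, as permitted by the AGS requirements above.
static void ns_target_jac(const real_1d_array &x, real_1d_array &fi, real_2d_array &jac, void *ptr)
{
    fi[0] = fabs(x[0]) + 2*fabs(x[1]);
    jac[0][0] = x[0]>=0 ? 1.0 : -1.0;
    jac[0][1] = x[1]>=0 ? 2.0 : -2.0;
}

int main()
{
    real_1d_array x0 = "[1,1]";
    real_1d_array s  = "[1,1]";
    minnsstate state;
    minnsreport rep;
    real_1d_array x;

    minnscreate(x0, state);                 // analytic-gradient version
    minnssetscale(state, s);                // scales are important for AGS
    minnssetalgoags(state, 0.1, 0.0);       // sampling radius 0.1, zero penalty (no nonlinear constraints)
    minnssetcond(state, 1.0e-5, 0);         // stop when sampling radius falls below 1e-5
    minnsoptimize(state, ns_target_jac);    // assumed to parallel minnlcoptimize()
    minnsresults(state, x, rep);            // assumed to parallel minnlcresults()
    printf("solution: %s\n", x.tostring(2).c_str());
    return 0;
}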
INPUT PARAMETERS: State - structure stores algorithm state BndL - lower bounds, array[N]. If some (all) variables are unbounded, you may specify very small number or -INF. BndU - upper bounds, array[N]. If some (all) variables are unbounded, you may specify very large number or +INF. NOTE 1: it is possible to specify BndL[i]=BndU[i]. In this case I-th variable will be "frozen" at X[i]=BndL[i]=BndU[i]. NOTE 2: AGS solver has following useful properties: * bound constraints are always satisfied exactly * function is evaluated only INSIDE area specified by bound constraints, even when numerical differentiation is used (algorithm adjusts nodes according to boundary constraints) -- ALGLIB -- Copyright 18.05.2015 by Bochkanov Sergey *************************************************************************/ void minnssetbc(const minnsstate &state, const real_1d_array &bndl, const real_1d_array &bndu); /************************************************************************* This function sets linear constraints. Linear constraints are inactive by default (after initial creation). They are preserved after algorithm restart with minnsrestartfrom(). INPUT PARAMETERS: State - structure previously allocated with minnscreate() call. C - linear constraints, array[K,N+1]. Each row of C represents one constraint, either equality or inequality (see below): * first N elements correspond to coefficients, * last element corresponds to the right part. All elements of C (including right part) must be finite. CT - type of constraints, array[K]: * if CT[i]>0, then I-th constraint is C[i,*]*x >= C[i,n+1] * if CT[i]=0, then I-th constraint is C[i,*]*x = C[i,n+1] * if CT[i]<0, then I-th constraint is C[i,*]*x <= C[i,n+1] K - number of equality/inequality constraints, K>=0: * if given, only leading K elements of C/CT are used * if not given, automatically determined from sizes of C/CT NOTE: linear (non-bound) constraints are satisfied only approximately: * there always exists some minor violation (about current sampling radius in magnitude during optimization, about EpsX in the solution) due to use of penalty method to handle constraints. * numerical differentiation, if used, may lead to function evaluations outside of the feasible area, because algorithm does NOT change numerical differentiation formula according to linear constraints. If you want constraints to be satisfied exactly, try to reformulate your problem in such manner that all constraints will become boundary ones (this kind of constraints is always satisfied exactly, both in the final solution and in all intermediate points). -- ALGLIB -- Copyright 18.05.2015 by Bochkanov Sergey *************************************************************************/ void minnssetlc(const minnsstate &state, const real_2d_array &c, const integer_1d_array &ct, const ae_int_t k); void minnssetlc(const minnsstate &state, const real_2d_array &c, const integer_1d_array &ct); /************************************************************************* This function sets nonlinear constraints. In fact, this function sets NUMBER of nonlinear constraints. Constraints itself (constraint functions) are passed to minnsoptimize() method. 
This method requires user-defined vector function F[] and its Jacobian J[], where: * first component of F[] and first row of Jacobian J[] correspond to function being minimized * next NLEC components of F[] (and rows of J) correspond to nonlinear equality constraints G_i(x)=0 * next NLIC components of F[] (and rows of J) correspond to nonlinear inequality constraints H_i(x)<=0 NOTE: you may combine nonlinear constraints with linear/boundary ones. If your problem has mixed constraints, you may explicitly specify some of them as linear ones. It may help optimizer to handle them more efficiently. INPUT PARAMETERS: State - structure previously allocated with minnscreate() call. NLEC - number of Non-Linear Equality Constraints (NLEC), >=0 NLIC - number of Non-Linear Inquality Constraints (NLIC), >=0 NOTE 1: nonlinear constraints are satisfied only approximately! It is possible that algorithm will evaluate function outside of the feasible area! NOTE 2: algorithm scales variables according to scale specified by minnssetscale() function, so it can handle problems with badly scaled variables (as long as we KNOW their scales). However, there is no way to automatically scale nonlinear constraints Gi(x) and Hi(x). Inappropriate scaling of Gi/Hi may ruin convergence. Solving problem with constraint "1000*G0(x)=0" is NOT same as solving it with constraint "0.001*G0(x)=0". It means that YOU are the one who is responsible for correct scaling of nonlinear constraints Gi(x) and Hi(x). We recommend you to scale nonlinear constraints in such way that I-th component of dG/dX (or dH/dx) has approximately unit magnitude (for problems with unit scale) or has magnitude approximately equal to 1/S[i] (where S is a scale set by minnssetscale() function). NOTE 3: nonlinear constraints are always hard to handle, no matter what algorithm you try to use. Even basic box/linear constraints modify function curvature by adding valleys and ridges. However, nonlinear constraints add valleys which are very hard to follow due to their "curved" nature. It means that optimization with single nonlinear constraint may be significantly slower than optimization with multiple linear ones. It is normal situation, and we recommend you to carefully choose Rho parameter of minnssetalgoags(), because too large value may slow down convergence. -- ALGLIB -- Copyright 18.05.2015 by Bochkanov Sergey *************************************************************************/ void minnssetnlc(const minnsstate &state, const ae_int_t nlec, const ae_int_t nlic); /************************************************************************* This function sets stopping conditions for iterations of optimizer. INPUT PARAMETERS: State - structure which stores algorithm state EpsX - >=0 The AGS solver finishes its work if on k+1-th iteration sampling radius decreases below EpsX. MaxIts - maximum number of iterations. If MaxIts=0, the number of iterations is unlimited. Passing EpsX=0 and MaxIts=0 (simultaneously) will lead to automatic stopping criterion selection. We do not recommend you to rely on default choice in production code. -- ALGLIB -- Copyright 18.05.2015 by Bochkanov Sergey *************************************************************************/ void minnssetcond(const minnsstate &state, const double epsx, const ae_int_t maxits); /************************************************************************* This function sets scaling coefficients for NLC optimizer. 
ALGLIB optimizers use scaling matrices to test stopping conditions (step size and gradient are scaled before comparison with tolerances). Scale of the I-th variable is a translation invariant measure of: a) "how large" the variable is b) how large the step should be to make significant changes in the function Scaling is also used by finite difference variant of the optimizer - step along I-th axis is equal to DiffStep*S[I]. INPUT PARAMETERS: State - structure which stores algorithm state S - array[N], non-zero scaling coefficients S[i] may be negative, sign doesn't matter. -- ALGLIB -- Copyright 18.05.2015 by Bochkanov Sergey *************************************************************************/ void minnssetscale(const minnsstate &state, const real_1d_array &s); /************************************************************************* This function tells MinNS unit to use AGS (adaptive gradient sampling) algorithm for nonsmooth constrained optimization. This algorithm is a slight modification of one described in "An Adaptive Gradient Sampling Algorithm for Nonsmooth Optimization" by Frank E. Curtis and Xiaocun Que. This optimizer has the following benefits and drawbacks: + robustness; it can be used with nonsmooth and nonconvex functions. + relatively easy tuning; most of the metaparameters are easy to select. - it has convergence of steepest descent, slower than CG/LBFGS. - each iteration involves evaluation of ~2N gradient values and solution of 2Nx2N quadratic programming problem, which limits applicability of the algorithm to small-scale problems (up to 50-100). IMPORTANT: this algorithm has convergence guarantees, i.e. it will steadily move towards some stationary point of the function. However, "stationary point" does not always mean "solution". Nonsmooth problems often have "flat spots", i.e. areas where the function does not change at all. Such "flat spots" are stationary points by definition, and the algorithm may be caught there. Nonsmooth CONVEX tasks are not prone to this problem. Say, if your function has form f()=MAX(f0,f1,...), and f_i are convex, then f() is convex too and you have guaranteed convergence to solution. INPUT PARAMETERS: State - structure which stores algorithm state Radius - initial sampling radius, >=0. Internally multiplied by vector of per-variable scales specified by minnssetscale(). You should select relatively large sampling radius, roughly proportional to scaled length of the first steps of the algorithm. Something close to 0.1 in magnitude should be good for most problems. AGS solver can automatically decrease radius, so too large radius is not a problem (assuming that you won't choose so large a radius that the algorithm will sample the function at too distant points, where the gradient value is irrelevant). Too small radius won't cause the algorithm to fail, but it may slow it down (it may have to perform too short steps). Penalty - penalty coefficient for nonlinear constraints: * for problems with nonlinear constraints it should be some problem-specific positive value, large enough that penalty term changes shape of the function. Starting from some problem-specific value penalty coefficient becomes large enough to exactly enforce nonlinear constraints; larger values do not improve precision. Increasing it too much may slow down convergence, so you should choose it carefully. * can be zero for problems WITHOUT nonlinear constraints (i.e.
for unconstrained ones or ones with just box or linear constraints) * if you specify a zero value for a problem with at least one nonlinear constraint, the algorithm will terminate with error code -1. ALGORITHM OUTLINE The very basic outline of the unconstrained AGS algorithm is given below: 0. If sampling radius is below EpsX or we performed more than MaxIts iterations - STOP. 1. sample O(N) gradient values at random locations around current point; informally speaking, this sample is an implicit piecewise linear model of the function, although algorithm formulation does not mention that explicitly 2. solve quadratic programming problem in order to find descent direction 3. if QP solver tells us that we are near solution, decrease sampling radius and move to (0) 4. perform backtracking line search 5. after moving to new point, goto (0) As for the constraints: * box constraints are handled exactly by modification of the function being minimized * linear/nonlinear constraints are handled by adding L1 penalty. Because our solver can handle nonsmoothness, we can use L1 penalty function, which is an exact one (i.e. exact solution is returned under such penalty). * penalty coefficient for linear constraints is chosen automatically; however, penalty coefficient for nonlinear constraints must be specified by the user. -- ALGLIB -- Copyright 18.05.2015 by Bochkanov Sergey *************************************************************************/ void minnssetalgoags(const minnsstate &state, const double radius, const double penalty); /************************************************************************* This function turns on/off reporting. INPUT PARAMETERS: State - structure which stores algorithm state NeedXRep- whether iteration reports are needed or not If NeedXRep is True, algorithm will call rep() callback function if it is provided to minnsoptimize(). -- ALGLIB -- Copyright 28.11.2010 by Bochkanov Sergey *************************************************************************/ void minnssetxrep(const minnsstate &state, const bool needxrep); /************************************************************************* This subroutine submits request for termination of running optimizer. It should be called from user-supplied callback when user decides that it is time to "smoothly" terminate optimization process. As a result, optimizer stops at the point which was "current accepted" when termination request was submitted and returns error code 8 (successful termination). INPUT PARAMETERS: State - optimizer structure NOTE: after request for termination optimizer may perform several additional calls to user-supplied callbacks. It does NOT guarantee to stop immediately - it just guarantees that these additional calls will be discarded later. NOTE: calling this function on optimizer which is NOT running will have no effect. NOTE: multiple calls to this function are possible. First call is counted, subsequent calls are silently ignored. -- ALGLIB -- Copyright 18.05.2015 by Bochkanov Sergey *************************************************************************/ void minnsrequesttermination(const minnsstate &state); /************************************************************************* This function provides reverse communication interface. Reverse communication interface is not documented and not recommended for use.
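As an informal illustration of the termination mechanism just described (not taken from the official examples), the sketch below passes the optimizer state through the ptr argument so that the iteration-report callback can request a smooth stop; my_report, my_jac, and the stopping threshold are assumed names and values.

#include "optimization.h"
using namespace alglib;

// Iteration report callback: asks the running optimizer to stop smoothly
// (termination code 8) once the target value is considered good enough.
void my_report(const real_1d_array &x, double func, void *ptr)
{
    minnsstate *state = static_cast<minnsstate*>(ptr); // ptr is supplied by the caller below
    if( func<1.0e-3 )
        minnsrequesttermination(*state);
}

// Usage fragment, assuming 'state' was prepared as in the earlier sketch and
// my_jac() is a user-supplied function/Jacobian callback:
//     minnssetxrep(state, true);
//     minnsoptimize(state, my_jac, my_report, &state);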
See below for functions which provide better documented API *************************************************************************/ bool minnsiteration(const minnsstate &state); /************************************************************************* This family of functions is used to launch iterations of nonlinear optimizer. These functions accept the following parameters: state - algorithm state fvec - callback which calculates function vector fi[] at given point x jac - callback which calculates function vector fi[] and Jacobian jac at given point x rep - optional callback which is called after each iteration can be NULL ptr - optional pointer which is passed to func/grad/hess/jac/rep can be NULL NOTES: 1. This function has two different implementations: one which uses exact (analytical) user-supplied Jacobian, and one which uses only function vector and numerically differentiates function in order to obtain gradient. Depending on the specific function used to create optimizer object you should choose appropriate variant of minnsoptimize() - one which accepts function AND Jacobian or one which accepts ONLY function. Be careful to choose variant of minnsoptimize() which corresponds to your optimization scheme! Table below lists different combinations of callback (function/gradient) passed to minnsoptimize() and specific function used to create optimizer.
                    |     USER PASSED TO minnsoptimize()
   CREATED WITH     |  function only  |  function and gradient
   ------------------------------------------------------------
   minnscreatef()   |     works       |        FAILS
   minnscreate()    |     FAILS       |        works
Here "FAILS" denotes inappropriate combinations of optimizer creation function and minnsoptimize() version. Attempts to use such a combination will lead to an exception. Either you did not pass gradient when it WAS needed or you passed gradient when it was NOT needed. -- ALGLIB -- Copyright 18.05.2015 by Bochkanov Sergey *************************************************************************/ void minnsoptimize(minnsstate &state, void (*fvec)(const real_1d_array &x, real_1d_array &fi, void *ptr), void (*rep)(const real_1d_array &x, double func, void *ptr) = NULL, void *ptr = NULL); void minnsoptimize(minnsstate &state, void (*jac)(const real_1d_array &x, real_1d_array &fi, real_2d_array &jac, void *ptr), void (*rep)(const real_1d_array &x, double func, void *ptr) = NULL, void *ptr = NULL); /************************************************************************* MinNS results INPUT PARAMETERS: State - algorithm state OUTPUT PARAMETERS: X - array[0..N-1], solution Rep - optimization report. You should check Rep.TerminationType in order to distinguish successful termination from unsuccessful one: * -8 internal integrity control detected infinite or NAN values in function/gradient. Abnormal termination signalled. * -3 box constraints are inconsistent * -1 inconsistent parameters were passed: * penalty parameter for minnssetalgoags() is zero, but we have nonlinear constraints set by minnssetnlc() * 2 sampling radius decreased below epsx * 7 stopping conditions are too stringent, further improvement is impossible, X contains best point found so far.
* 8 User requested termination via minnsrequesttermination() -- ALGLIB -- Copyright 18.05.2015 by Bochkanov Sergey *************************************************************************/ void minnsresults(const minnsstate &state, real_1d_array &x, minnsreport &rep); /************************************************************************* Buffered implementation of minnsresults() which uses pre-allocated buffer to store X[]. If buffer size is too small, it resizes buffer. It is intended to be used in the inner cycles of performance critical algorithms where array reallocation penalty is too large to be ignored. -- ALGLIB -- Copyright 18.05.2015 by Bochkanov Sergey *************************************************************************/ void minnsresultsbuf(const minnsstate &state, real_1d_array &x, minnsreport &rep); /************************************************************************* This subroutine restarts algorithm from new point. All optimization parameters (including constraints) are left unchanged. This function allows to solve multiple optimization problems (which must have same number of dimensions) without object reallocation penalty. INPUT PARAMETERS: State - structure previously allocated with minnscreate() call. X - new starting point. -- ALGLIB -- Copyright 18.05.2015 by Bochkanov Sergey *************************************************************************/ void minnsrestartfrom(const minnsstate &state, const real_1d_array &x); /************************************************************************* Obsolete function, use MinLBFGSSetPrecDefault() instead. -- ALGLIB -- Copyright 13.10.2010 by Bochkanov Sergey *************************************************************************/ void minlbfgssetdefaultpreconditioner(const minlbfgsstate &state); /************************************************************************* Obsolete function, use MinLBFGSSetCholeskyPreconditioner() instead. -- ALGLIB -- Copyright 13.10.2010 by Bochkanov Sergey *************************************************************************/ void minlbfgssetcholeskypreconditioner(const minlbfgsstate &state, const real_2d_array &p, const bool isupper); /************************************************************************* This is obsolete function which was used by previous version of the BLEIC optimizer. It does nothing in the current version of BLEIC. -- ALGLIB -- Copyright 28.11.2010 by Bochkanov Sergey *************************************************************************/ void minbleicsetbarrierwidth(const minbleicstate &state, const double mu); /************************************************************************* This is obsolete function which was used by previous version of the BLEIC optimizer. It does nothing in the current version of BLEIC. -- ALGLIB -- Copyright 28.11.2010 by Bochkanov Sergey *************************************************************************/ void minbleicsetbarrierdecay(const minbleicstate &state, const double mudecay); /************************************************************************* Obsolete optimization algorithm. Was replaced by MinBLEIC subpackage. 
-- ALGLIB -- Copyright 25.03.2010 by Bochkanov Sergey *************************************************************************/ void minasacreate(const ae_int_t n, const real_1d_array &x, const real_1d_array &bndl, const real_1d_array &bndu, minasastate &state); void minasacreate(const real_1d_array &x, const real_1d_array &bndl, const real_1d_array &bndu, minasastate &state); /************************************************************************* Obsolete optimization algorithm. Was replaced by MinBLEIC subpackage. -- ALGLIB -- Copyright 02.04.2010 by Bochkanov Sergey *************************************************************************/ void minasasetcond(const minasastate &state, const double epsg, const double epsf, const double epsx, const ae_int_t maxits); /************************************************************************* Obsolete optimization algorithm. Was replaced by MinBLEIC subpackage. -- ALGLIB -- Copyright 02.04.2010 by Bochkanov Sergey *************************************************************************/ void minasasetxrep(const minasastate &state, const bool needxrep); /************************************************************************* Obsolete optimization algorithm. Was replaced by MinBLEIC subpackage. -- ALGLIB -- Copyright 02.04.2010 by Bochkanov Sergey *************************************************************************/ void minasasetalgorithm(const minasastate &state, const ae_int_t algotype); /************************************************************************* Obsolete optimization algorithm. Was replaced by MinBLEIC subpackage. -- ALGLIB -- Copyright 02.04.2010 by Bochkanov Sergey *************************************************************************/ void minasasetstpmax(const minasastate &state, const double stpmax); /************************************************************************* This function provides reverse communication interface. Reverse communication interface is not documented and not recommended for use. See below for functions which provide better documented API *************************************************************************/ bool minasaiteration(const minasastate &state); /************************************************************************* This family of functions is used to launch iterations of nonlinear optimizer. These functions accept the following parameters: state - algorithm state grad - callback which calculates function (or merit function) value func and gradient grad at given point x rep - optional callback which is called after each iteration can be NULL ptr - optional pointer which is passed to func/grad/hess/jac/rep can be NULL -- ALGLIB -- Copyright 20.03.2009 by Bochkanov Sergey *************************************************************************/ void minasaoptimize(minasastate &state, void (*grad)(const real_1d_array &x, double &func, real_1d_array &grad, void *ptr), void (*rep)(const real_1d_array &x, double func, void *ptr) = NULL, void *ptr = NULL); /************************************************************************* Obsolete optimization algorithm. Was replaced by MinBLEIC subpackage. -- ALGLIB -- Copyright 20.03.2009 by Bochkanov Sergey *************************************************************************/ void minasaresults(const minasastate &state, real_1d_array &x, minasareport &rep); /************************************************************************* Obsolete optimization algorithm. Was replaced by MinBLEIC subpackage.
-- ALGLIB -- Copyright 20.03.2009 by Bochkanov Sergey *************************************************************************/ void minasaresultsbuf(const minasastate &state, real_1d_array &x, minasareport &rep); /************************************************************************* Obsolete optimization algorithm. Was replaced by MinBLEIC subpackage. -- ALGLIB -- Copyright 30.07.2010 by Bochkanov Sergey *************************************************************************/ void minasarestartfrom(const minasastate &state, const real_1d_array &x, const real_1d_array &bndl, const real_1d_array &bndu); /************************************************************************* IMPROVED LEVENBERG-MARQUARDT METHOD FOR NON-LINEAR LEAST SQUARES OPTIMIZATION DESCRIPTION: This function is used to find minimum of function which is represented as sum of squares: F(x) = f[0]^2(x[0],...,x[n-1]) + ... + f[m-1]^2(x[0],...,x[n-1]) using value of function vector f[] and Jacobian of f[]. REQUIREMENTS: This algorithm will request following information during its operation: * function vector f[] at given point X * function vector f[] and Jacobian of f[] (simultaneously) at given point There are several overloaded versions of MinLMOptimize() function which correspond to different LM-like optimization algorithms provided by this unit. You should choose version which accepts fvec() and jac() callbacks. First one is used to calculate f[] at given point, second one calculates f[] and Jacobian df[i]/dx[j]. You can try to initialize MinLMState structure with VJ function and then use incorrect version of MinLMOptimize() (for example, version which works with general form function and does not provide Jacobian), but it will lead to exception being thrown after first attempt to calculate Jacobian. USAGE: 1. User initializes algorithm state with MinLMCreateVJ() call 2. User tunes solver parameters with MinLMSetCond(), MinLMSetStpMax() and other functions 3. User calls MinLMOptimize() function which takes algorithm state and callback functions. 4. User calls MinLMResults() to get solution 5. Optionally, user may call MinLMRestartFrom() to solve another problem with same N/M but another starting point and/or another function. MinLMRestartFrom() allows to reuse already initialized structure. INPUT PARAMETERS: N - dimension, N>1 * if given, only leading N elements of X are used * if not given, automatically determined from size of X M - number of functions f[i] X - initial solution, array[0..N-1] OUTPUT PARAMETERS: State - structure which stores algorithm state NOTES: 1. you may tune stopping conditions with MinLMSetCond() function 2. if target function contains exp() or other fast growing functions, and optimization algorithm makes too large steps which leads to overflow, use MinLMSetStpMax() function to bound algorithm's steps. -- ALGLIB -- Copyright 30.03.2009 by Bochkanov Sergey *************************************************************************/ void minlmcreatevj(const ae_int_t n, const ae_int_t m, const real_1d_array &x, minlmstate &state); void minlmcreatevj(const ae_int_t m, const real_1d_array &x, minlmstate &state); /************************************************************************* IMPROVED LEVENBERG-MARQUARDT METHOD FOR NON-LINEAR LEAST SQUARES OPTIMIZATION DESCRIPTION: This function is used to find minimum of function which is represented as sum of squares: F(x) = f[0]^2(x[0],...,x[n-1]) + ... + f[m-1]^2(x[0],...,x[n-1]) using value of function vector f[] only. 
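A minimal sketch of the VJ protocol documented above (MinLMCreateVJ plus the fvec/jac callbacks); the two-residual test problem, starting point, and tolerance are invented for illustration and are not part of the manual's official example set.

#include "optimization.h"
using namespace alglib;

// Residuals for F(x) = f0^2 + f1^2 with f0 = x0 - 3 and f1 = 2*(x1 + 1).
void lm_fvec(const real_1d_array &x, real_1d_array &fi, void *ptr)
{
    fi[0] = x[0] - 3.0;
    fi[1] = 2.0*(x[1] + 1.0);
}

// Same residuals plus their analytic Jacobian, jac[i][j] = df[i]/dx[j].
void lm_jac(const real_1d_array &x, real_1d_array &fi, real_2d_array &jac, void *ptr)
{
    lm_fvec(x, fi, ptr);
    jac[0][0] = 1.0; jac[0][1] = 0.0;
    jac[1][0] = 0.0; jac[1][1] = 2.0;
}

int main()
{
    real_1d_array x = "[0,0]";
    minlmstate state;
    minlmreport rep;

    minlmcreatevj(2, 2, x, state);         // N=2 variables, M=2 residuals, analytic Jacobian
    minlmsetcond(state, 1.0e-10, 0);       // EpsX, unlimited iteration count
    minlmoptimize(state, lm_fvec, lm_jac); // overload which accepts fvec() and jac()
    minlmresults(state, x, rep);           // expected solution: x ~ [3, -1]
    return 0;
}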
Finite differences are used to calculate Jacobian. REQUIREMENTS: This algorithm will request following information during its operation: * function vector f[] at given point X There are several overloaded versions of MinLMOptimize() function which correspond to different LM-like optimization algorithms provided by this unit. You should choose version which accepts fvec() callback. You can try to initialize MinLMState structure with VJ function and then use incorrect version of MinLMOptimize() (for example, version which works with general form function and does not accept function vector), but it will lead to exception being thrown after first attempt to calculate Jacobian. USAGE: 1. User initializes algorithm state with MinLMCreateV() call 2. User tunes solver parameters with MinLMSetCond(), MinLMSetStpMax() and other functions 3. User calls MinLMOptimize() function which takes algorithm state and callback functions. 4. User calls MinLMResults() to get solution 5. Optionally, user may call MinLMRestartFrom() to solve another problem with same N/M but another starting point and/or another function. MinLMRestartFrom() allows to reuse already initialized structure. INPUT PARAMETERS: N - dimension, N>1 * if given, only leading N elements of X are used * if not given, automatically determined from size of X M - number of functions f[i] X - initial solution, array[0..N-1] DiffStep- differentiation step, >0 OUTPUT PARAMETERS: State - structure which stores algorithm state See also MinLMIteration, MinLMResults. NOTES: 1. you may tune stopping conditions with MinLMSetCond() function 2. if target function contains exp() or other fast growing functions, and optimization algorithm makes too large steps which leads to overflow, use MinLMSetStpMax() function to bound algorithm's steps. -- ALGLIB -- Copyright 30.03.2009 by Bochkanov Sergey *************************************************************************/ void minlmcreatev(const ae_int_t n, const ae_int_t m, const real_1d_array &x, const double diffstep, minlmstate &state); void minlmcreatev(const ae_int_t m, const real_1d_array &x, const double diffstep, minlmstate &state); /************************************************************************* LEVENBERG-MARQUARDT-LIKE METHOD FOR NON-LINEAR OPTIMIZATION DESCRIPTION: This function is used to find minimum of general form (not "sum-of- -squares") function F = F(x[0], ..., x[n-1]) using its gradient and Hessian. Levenberg-Marquardt modification with L-BFGS pre-optimization and internal pre-conditioned L-BFGS optimization after each Levenberg-Marquardt step is used. REQUIREMENTS: This algorithm will request following information during its operation: * function value F at given point X * F and gradient G (simultaneously) at given point X * F, G and Hessian H (simultaneously) at given point X There are several overloaded versions of MinLMOptimize() function which correspond to different LM-like optimization algorithms provided by this unit. You should choose version which accepts func(), grad() and hess() function pointers. First pointer is used to calculate F at given point, second one calculates F(x) and grad F(x), third one calculates F(x), grad F(x), hess F(x). You can try to initialize MinLMState structure with FGH-function and then use incorrect version of MinLMOptimize() (for example, version which does not provide Hessian matrix), but it will lead to exception being thrown after first attempt to calculate Hessian. USAGE: 1. User initializes algorithm state with MinLMCreateFGH() call 2. 
User tunes solver parameters with MinLMSetCond(), MinLMSetStpMax() and other functions 3. User calls MinLMOptimize() function which takes algorithm state and pointers (delegates, etc.) to callback functions. 4. User calls MinLMResults() to get solution 5. Optionally, user may call MinLMRestartFrom() to solve another problem with same N but another starting point and/or another function. MinLMRestartFrom() allows to reuse already initialized structure. INPUT PARAMETERS: N - dimension, N>1 * if given, only leading N elements of X are used * if not given, automatically determined from size of X X - initial solution, array[0..N-1] OUTPUT PARAMETERS: State - structure which stores algorithm state NOTES: 1. you may tune stopping conditions with MinLMSetCond() function 2. if target function contains exp() or other fast growing functions, and optimization algorithm makes too large steps which leads to overflow, use MinLMSetStpMax() function to bound algorithm's steps. -- ALGLIB -- Copyright 30.03.2009 by Bochkanov Sergey *************************************************************************/ void minlmcreatefgh(const ae_int_t n, const real_1d_array &x, minlmstate &state); void minlmcreatefgh(const real_1d_array &x, minlmstate &state); /************************************************************************* This function sets stopping conditions for Levenberg-Marquardt optimization algorithm. INPUT PARAMETERS: State - structure which stores algorithm state EpsX - >=0 The subroutine finishes its work if on k+1-th iteration the condition |v|<=EpsX is fulfilled, where: * |.| means Euclidean norm * v - scaled step vector, v[i]=dx[i]/s[i] * dx - step vector, dx=X(k+1)-X(k) * s - scaling coefficients set by MinLMSetScale() Recommended values: 1E-9 ... 1E-12. MaxIts - maximum number of iterations. If MaxIts=0, the number of iterations is unlimited. Only Levenberg-Marquardt iterations are counted (L-BFGS/CG iterations are NOT counted because their cost is very low compared to that of LM). Passing EpsX=0 and MaxIts=0 (simultaneously) will lead to automatic stopping criterion selection (small EpsX). NOTE: it is not recommended to set large EpsX (say, 0.001). Because LM is a second-order method, it performs very precise steps anyway. -- ALGLIB -- Copyright 02.04.2010 by Bochkanov Sergey *************************************************************************/ void minlmsetcond(const minlmstate &state, const double epsx, const ae_int_t maxits); /************************************************************************* This function turns on/off reporting. INPUT PARAMETERS: State - structure which stores algorithm state NeedXRep- whether iteration reports are needed or not If NeedXRep is True, algorithm will call rep() callback function if it is provided to MinLMOptimize(). Both Levenberg-Marquardt and internal L-BFGS iterations are reported. -- ALGLIB -- Copyright 02.04.2010 by Bochkanov Sergey *************************************************************************/ void minlmsetxrep(const minlmstate &state, const bool needxrep); /************************************************************************* This function sets maximum step length. INPUT PARAMETERS: State - structure which stores algorithm state StpMax - maximum step length, >=0. Set StpMax to 0.0, if you don't want to limit step length. Use this subroutine when you optimize target function which contains exp() or other fast growing functions, and optimization algorithm makes too large steps which leads to overflow.
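A brief, informal sketch of the tuning calls documented in this area (stopping conditions, reporting, and step-length bound), applied to an already created LM optimizer; the concrete values are arbitrary illustrations only.

#include "optimization.h"
using namespace alglib;

// Apply typical tuning to an existing Levenberg-Marquardt optimizer object.
void tune_lm(minlmstate &state)
{
    minlmsetcond(state, 1.0e-10, 50); // stop when scaled step |v|<=1E-10 or after 50 LM iterations
    minlmsetxrep(state, true);        // request rep() callbacks after each iteration
    minlmsetstpmax(state, 10.0);      // bound step length, useful for exp()-like targets
}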
This function allows us to reject steps that are too large (and therefore expose us to the possible overflow) without actually calculating function value at the x+stp*d. NOTE: non-zero StpMax leads to moderate performance degradation because intermediate step of preconditioned L-BFGS optimization is incompatible with limits on step size. -- ALGLIB -- Copyright 02.04.2010 by Bochkanov Sergey *************************************************************************/ void minlmsetstpmax(const minlmstate &state, const double stpmax); /************************************************************************* This function sets scaling coefficients for LM optimizer. ALGLIB optimizers use scaling matrices to test stopping conditions (step size and gradient are scaled before comparison with tolerances). Scale of the I-th variable is a translation invariant measure of: a) "how large" the variable is b) how large the step should be to make significant changes in the function Generally, scale is NOT considered to be a form of preconditioner. But LM optimizer is unique in that it uses scaling matrix both in the stopping condition tests and as Marquardt damping factor. Proper scaling is very important for the algorithm performance. It is less important for the quality of results, but still has some influence (it is easier to converge when variables are properly scaled, so premature stopping is possible when very badly scaled variables are combined with relaxed stopping conditions). INPUT PARAMETERS: State - structure which stores algorithm state S - array[N], non-zero scaling coefficients S[i] may be negative, sign doesn't matter. -- ALGLIB -- Copyright 14.01.2011 by Bochkanov Sergey *************************************************************************/ void minlmsetscale(const minlmstate &state, const real_1d_array &s); /************************************************************************* This function sets boundary constraints for LM optimizer. Boundary constraints are inactive by default (after initial creation). They are preserved until explicitly turned off with another SetBC() call. INPUT PARAMETERS: State - structure which stores algorithm state BndL - lower bounds, array[N]. If some (all) variables are unbounded, you may specify very small number or -INF (the latter is recommended because it will allow solver to use better algorithm). BndU - upper bounds, array[N]. If some (all) variables are unbounded, you may specify very large number or +INF (the latter is recommended because it will allow solver to use better algorithm). NOTE 1: it is possible to specify BndL[i]=BndU[i]. In this case I-th variable will be "frozen" at X[i]=BndL[i]=BndU[i]. NOTE 2: this solver has the following useful properties: * bound constraints are always satisfied exactly * function is evaluated only INSIDE area specified by bound constraints or at its boundary -- ALGLIB -- Copyright 14.01.2011 by Bochkanov Sergey *************************************************************************/ void minlmsetbc(const minlmstate &state, const real_1d_array &bndl, const real_1d_array &bndu); /************************************************************************* This function sets general linear constraints for LM optimizer. Linear constraints are inactive by default (after initial creation). They are preserved until explicitly turned off with another minlmsetlc() call. INPUT PARAMETERS: State - structure which stores algorithm state C - linear constraints, array[K,N+1].
Each row of C represents one constraint, either equality or inequality (see below): * first N elements correspond to coefficients, * last element corresponds to the right part. All elements of C (including right part) must be finite. CT - type of constraints, array[K]: * if CT[i]>0, then I-th constraint is C[i,*]*x >= C[i,n+1] * if CT[i]=0, then I-th constraint is C[i,*]*x = C[i,n+1] * if CT[i]<0, then I-th constraint is C[i,*]*x <= C[i,n+1] K - number of equality/inequality constraints, K>=0: * if given, only leading K elements of C/CT are used * if not given, automatically determined from sizes of C/CT IMPORTANT: if you have linear constraints, it is strongly recommended to set scale of variables with minlmsetscale(). QP solver which is used to calculate linearly constrained steps heavily relies on good scaling of input problems. IMPORTANT: solvers created with minlmcreatefgh() do not support linear constraints. NOTE: linear (non-bound) constraints are satisfied only approximately - there always exists some violation due to numerical errors and algorithmic limitations. NOTE: general linear constraints add significant overhead to solution process. Although solver performs roughly the same number of iterations (when compared with similar box-only constrained problem), each iteration now involves solution of linearly constrained QP subproblem, which requires ~3-5 times more Cholesky decompositions. Thus, if you can reformulate your problem in such a way that it has only box constraints, it may be beneficial to do so. -- ALGLIB -- Copyright 14.01.2011 by Bochkanov Sergey *************************************************************************/ void minlmsetlc(const minlmstate &state, const real_2d_array &c, const integer_1d_array &ct, const ae_int_t k); void minlmsetlc(const minlmstate &state, const real_2d_array &c, const integer_1d_array &ct); /************************************************************************* This function is used to change acceleration settings. You can choose between two acceleration strategies: * AccType=0, no acceleration. * AccType=1, secant updates are used to update quadratic model after each iteration. After fixed number of iterations (or after model breakdown) we recalculate quadratic model using analytic Jacobian or finite differences. Number of secant-based iterations depends on optimization settings: about 3 iterations - when we have analytic Jacobian, up to 2*N iterations - when we use finite differences to calculate Jacobian. AccType=1 is recommended when Jacobian calculation cost is prohibitively high (several Mx1 function vector calculations followed by several NxN Cholesky factorizations are faster than calculation of one M*N Jacobian). It should also be used when we have no Jacobian, because finite difference approximation takes too much time to compute. The table below lists optimization protocols (XYZ protocol corresponds to MinLMCreateXYZ) and acceleration types they support (and use by default).
ACCELERATION TYPES SUPPORTED BY OPTIMIZATION PROTOCOLS:
   protocol    0    1    comment
   V           +    +
   VJ          +    +
   FGH         +
DEFAULT VALUES:
   protocol    0    1    comment
   V                x    without acceleration it is so slooooooooow
   VJ               x
   FGH         x
NOTE: this function should be called before optimization. Attempt to call it during algorithm iterations may result in unexpected behavior. NOTE: attempt to call this function with unsupported protocol/acceleration combination will result in exception being thrown.
-- ALGLIB -- Copyright 14.10.2010 by Bochkanov Sergey *************************************************************************/ void minlmsetacctype(const minlmstate &state, const ae_int_t acctype); /************************************************************************* This function provides reverse communication interface. Reverse communication interface is not documented and not recommended for use. See below for functions which provide better documented API *************************************************************************/ bool minlmiteration(const minlmstate &state); /************************************************************************* This family of functions is used to launch iterations of nonlinear optimizer. These functions accept the following parameters: state - algorithm state func - callback which calculates function (or merit function) value func at given point x grad - callback which calculates function (or merit function) value func and gradient grad at given point x hess - callback which calculates function (or merit function) value func, gradient grad and Hessian hess at given point x fvec - callback which calculates function vector fi[] at given point x jac - callback which calculates function vector fi[] and Jacobian jac at given point x rep - optional callback which is called after each iteration can be NULL ptr - optional pointer which is passed to func/grad/hess/jac/rep can be NULL NOTES: 1. Depending on function used to create state structure, this algorithm may accept Jacobian and/or Hessian and/or gradient. As said above, there are several versions of this function, which accept different sets of callbacks. This flexibility opens the way to subtle errors - you may create state with MinLMCreateFGH() (optimization using Hessian), but call function which does not accept Hessian. So when the algorithm requests Hessian, there will be no callback to call. In this case exception will be thrown. Be careful to avoid such errors because there is no way to find them at compile time - you can see them at runtime only.
-- ALGLIB -- Copyright 10.03.2009 by Bochkanov Sergey *************************************************************************/ void minlmoptimize(minlmstate &state, void (*fvec)(const real_1d_array &x, real_1d_array &fi, void *ptr), void (*rep)(const real_1d_array &x, double func, void *ptr) = NULL, void *ptr = NULL); void minlmoptimize(minlmstate &state, void (*fvec)(const real_1d_array &x, real_1d_array &fi, void *ptr), void (*jac)(const real_1d_array &x, real_1d_array &fi, real_2d_array &jac, void *ptr), void (*rep)(const real_1d_array &x, double func, void *ptr) = NULL, void *ptr = NULL); void minlmoptimize(minlmstate &state, void (*func)(const real_1d_array &x, double &func, void *ptr), void (*grad)(const real_1d_array &x, double &func, real_1d_array &grad, void *ptr), void (*hess)(const real_1d_array &x, double &func, real_1d_array &grad, real_2d_array &hess, void *ptr), void (*rep)(const real_1d_array &x, double func, void *ptr) = NULL, void *ptr = NULL); void minlmoptimize(minlmstate &state, void (*func)(const real_1d_array &x, double &func, void *ptr), void (*jac)(const real_1d_array &x, real_1d_array &fi, real_2d_array &jac, void *ptr), void (*rep)(const real_1d_array &x, double func, void *ptr) = NULL, void *ptr = NULL); void minlmoptimize(minlmstate &state, void (*func)(const real_1d_array &x, double &func, void *ptr), void (*grad)(const real_1d_array &x, double &func, real_1d_array &grad, void *ptr), void (*jac)(const real_1d_array &x, real_1d_array &fi, real_2d_array &jac, void *ptr), void (*rep)(const real_1d_array &x, double func, void *ptr) = NULL, void *ptr = NULL); /************************************************************************* Levenberg-Marquardt algorithm results INPUT PARAMETERS: State - algorithm state OUTPUT PARAMETERS: X - array[0..N-1], solution Rep - optimization report; includes termination codes and additional information. Termination codes are listed below, see comments for this structure for more info. Termination code is stored in rep.terminationtype field: * -8 optimizer detected NAN/INF values either in the function itself, or in its Jacobian * -7 derivative correctness check failed; see rep.funcidx, rep.varidx for more information. * -3 constraints are inconsistent * 2 relative step is no more than EpsX. * 5 MaxIts steps was taken * 7 stopping conditions are too stringent, further improvement is impossible * 8 terminated by user who called minlmrequesttermination(). X contains point which was "current accepted" when termination request was submitted. -- ALGLIB -- Copyright 10.03.2009 by Bochkanov Sergey *************************************************************************/ void minlmresults(const minlmstate &state, real_1d_array &x, minlmreport &rep); /************************************************************************* Levenberg-Marquardt algorithm results Buffered implementation of MinLMResults(), which uses pre-allocated buffer to store X[]. If buffer size is too small, it resizes buffer. It is intended to be used in the inner cycles of performance critical algorithms where array reallocation penalty is too large to be ignored. -- ALGLIB -- Copyright 10.03.2009 by Bochkanov Sergey *************************************************************************/ void minlmresultsbuf(const minlmstate &state, real_1d_array &x, minlmreport &rep); /************************************************************************* This subroutine restarts LM algorithm from new point. All optimization parameters are left unchanged. 
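To make the workflow above concrete, here is a small illustrative sketch of the V protocol (numerical differentiation) followed by a restart from a different starting point; the residuals, starting points and tolerances are arbitrary choices made for demonstration only.

#include "optimization.h"
using namespace alglib;

// Residuals only; the Jacobian is obtained internally by finite differences.
void res_fvec(const real_1d_array &x, real_1d_array &fi, void *ptr)
{
    fi[0] = 10.0*(x[0] + 3.0)*(x[0] + 3.0);
    fi[1] = (x[1] - 3.0)*(x[1] - 3.0);
}

int main()
{
    real_1d_array x = "[0,0]";
    minlmstate state;
    minlmreport rep;

    minlmcreatev(2, 2, x, 0.0001, state); // N=2, M=2, DiffStep=0.0001
    minlmsetcond(state, 1.0e-12, 0);
    minlmoptimize(state, res_fvec);
    minlmresults(state, x, rep);          // rep.terminationtype>0 on success, x ~ [-3, 3]

    // Re-solve the same problem from another starting point without
    // re-allocating the optimizer object.
    real_1d_array x2 = "[10,10]";
    minlmrestartfrom(state, x2);
    minlmoptimize(state, res_fvec);
    minlmresults(state, x2, rep);
    return 0;
}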
This function allows to solve multiple optimization problems (which must have same number of dimensions) without object reallocation penalty. INPUT PARAMETERS: State - structure used for reverse communication previously allocated with MinLMCreateXXX call. X - new starting point. -- ALGLIB -- Copyright 30.07.2010 by Bochkanov Sergey *************************************************************************/ void minlmrestartfrom(const minlmstate &state, const real_1d_array &x); /************************************************************************* This subroutine submits request for termination of running optimizer. It should be called from user-supplied callback when user decides that it is time to "smoothly" terminate optimization process. As result, optimizer stops at point which was "current accepted" when termination request was submitted and returns error code 8 (successful termination). INPUT PARAMETERS: State - optimizer structure NOTE: after request for termination optimizer may perform several additional calls to user-supplied callbacks. It does NOT guarantee to stop immediately - it just guarantees that these additional calls will be discarded later. NOTE: calling this function on optimizer which is NOT running will have no effect. NOTE: multiple calls to this function are possible. First call is counted, subsequent calls are silently ignored. -- ALGLIB -- Copyright 08.10.2014 by Bochkanov Sergey *************************************************************************/ void minlmrequesttermination(const minlmstate &state); /************************************************************************* This is obsolete function. Since ALGLIB 3.3 it is equivalent to MinLMCreateVJ(). -- ALGLIB -- Copyright 30.03.2009 by Bochkanov Sergey *************************************************************************/ void minlmcreatevgj(const ae_int_t n, const ae_int_t m, const real_1d_array &x, minlmstate &state); void minlmcreatevgj(const ae_int_t m, const real_1d_array &x, minlmstate &state); /************************************************************************* This is obsolete function. Since ALGLIB 3.3 it is equivalent to MinLMCreateFJ(). -- ALGLIB -- Copyright 30.03.2009 by Bochkanov Sergey *************************************************************************/ void minlmcreatefgj(const ae_int_t n, const ae_int_t m, const real_1d_array &x, minlmstate &state); void minlmcreatefgj(const ae_int_t m, const real_1d_array &x, minlmstate &state); /************************************************************************* This function is considered obsolete since ALGLIB 3.1.0 and is present for backward compatibility only. We recommend to use MinLMCreateVJ, which provides similar, but more consistent and feature-rich interface. 
-- ALGLIB -- Copyright 30.03.2009 by Bochkanov Sergey *************************************************************************/ void minlmcreatefj(const ae_int_t n, const ae_int_t m, const real_1d_array &x, minlmstate &state); void minlmcreatefj(const ae_int_t m, const real_1d_array &x, minlmstate &state); /************************************************************************* This subroutine turns on verification of the user-supplied analytic gradient: * user calls this subroutine before optimization begins * MinLMOptimize() is called * prior to actual optimization, for each function Fi and each component of parameters being optimized X[j] algorithm performs following steps: * two trial steps are made to X[j]-TestStep*S[j] and X[j]+TestStep*S[j], where X[j] is j-th parameter and S[j] is a scale of j-th parameter * if needed, steps are bounded with respect to constraints on X[] * Fi(X) is evaluated at these trial points * we perform one more evaluation in the middle point of the interval * we build cubic model using function values and derivatives at trial points and we compare its prediction with actual value in the middle point * in case difference between prediction and actual value is higher than some predetermined threshold, algorithm stops with completion code -7; Rep.VarIdx is set to index of the parameter with incorrect derivative, Rep.FuncIdx is set to index of the function. * after verification is over, algorithm proceeds to the actual optimization. NOTE 1: verification needs N (parameters count) Jacobian evaluations. It is very costly and you should use it only for low dimensional problems, when you want to be sure that you've correctly calculated analytic derivatives. You should not use it in the production code (unless you want to check derivatives provided by some third party). NOTE 2: you should carefully choose TestStep. Value which is too large (so large that function behaviour is significantly non-cubic) will lead to false alarms. You may use different step for different parameters by means of setting scale with MinLMSetScale(). NOTE 3: this function may lead to false positives. In case it reports that I-th derivative was calculated incorrectly, you may decrease test step and try one more time - maybe your function changes too sharply and your step is too large for such a rapidly changing function.
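As an informal illustration of the verification mechanism described above (the callbacks, TestStep value and diagnostic output are assumptions for demonstration, not prescribed by the manual):

#include "optimization.h"
#include <cstdio>
using namespace alglib;

// Run an already prepared LM optimizer with derivative verification enabled,
// using user-supplied residual and Jacobian callbacks (e.g. lm_fvec/lm_jac
// from the earlier sketch).
void run_with_derivative_check(minlmstate &state,
    void (*fvec)(const real_1d_array&, real_1d_array&, void*),
    void (*jac)(const real_1d_array&, real_1d_array&, real_2d_array&, void*))
{
    real_1d_array x;
    minlmreport rep;

    minlmsetgradientcheck(state, 1.0e-4); // TestStep>0 turns verification on
    minlmoptimize(state, fvec, jac);
    minlmresults(state, x, rep);
    if( rep.terminationtype==-7 )         // derivative correctness check failed
        printf("suspicious derivative: function %d, variable %d\n",
               int(rep.funcidx), int(rep.varidx));
}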
INPUT PARAMETERS: State - structure used to store algorithm state TestStep - verification step: * TestStep=0 turns verification off * TestStep>0 activates verification -- ALGLIB -- Copyright 15.06.2012 by Bochkanov Sergey *************************************************************************/ void minlmsetgradientcheck(const minlmstate &state, const double teststep); } ///////////////////////////////////////////////////////////////////////// // // THIS SECTION CONTAINS COMPUTATIONAL CORE DECLARATIONS (FUNCTIONS) // ///////////////////////////////////////////////////////////////////////// namespace alglib_impl { void cqminit(ae_int_t n, convexquadraticmodel* s, ae_state *_state); void cqmseta(convexquadraticmodel* s, /* Real */ ae_matrix* a, ae_bool isupper, double alpha, ae_state *_state); void cqmgeta(convexquadraticmodel* s, /* Real */ ae_matrix* a, ae_state *_state); void cqmrewritedensediagonal(convexquadraticmodel* s, /* Real */ ae_vector* z, ae_state *_state); void cqmsetd(convexquadraticmodel* s, /* Real */ ae_vector* d, double tau, ae_state *_state); void cqmdropa(convexquadraticmodel* s, ae_state *_state); void cqmsetb(convexquadraticmodel* s, /* Real */ ae_vector* b, ae_state *_state); void cqmsetq(convexquadraticmodel* s, /* Real */ ae_matrix* q, /* Real */ ae_vector* r, ae_int_t k, double theta, ae_state *_state); void cqmsetactiveset(convexquadraticmodel* s, /* Real */ ae_vector* x, /* Boolean */ ae_vector* activeset, ae_state *_state); double cqmeval(convexquadraticmodel* s, /* Real */ ae_vector* x, ae_state *_state); void cqmevalx(convexquadraticmodel* s, /* Real */ ae_vector* x, double* r, double* noise, ae_state *_state); void cqmgradunconstrained(convexquadraticmodel* s, /* Real */ ae_vector* x, /* Real */ ae_vector* g, ae_state *_state); double cqmxtadx2(convexquadraticmodel* s, /* Real */ ae_vector* x, ae_state *_state); void cqmadx(convexquadraticmodel* s, /* Real */ ae_vector* x, /* Real */ ae_vector* y, ae_state *_state); ae_bool cqmconstrainedoptimum(convexquadraticmodel* s, /* Real */ ae_vector* x, ae_state *_state); void cqmscalevector(convexquadraticmodel* s, /* Real */ ae_vector* x, ae_state *_state); double cqmdebugconstrainedevalt(convexquadraticmodel* s, /* Real */ ae_vector* x, ae_state *_state); double cqmdebugconstrainedevale(convexquadraticmodel* s, /* Real */ ae_vector* x, ae_state *_state); void _convexquadraticmodel_init(void* _p, ae_state *_state); void _convexquadraticmodel_init_copy(void* _dst, void* _src, ae_state *_state); void _convexquadraticmodel_clear(void* _p); void _convexquadraticmodel_destroy(void* _p); void trimprepare(double f, double* threshold, ae_state *_state); void trimfunction(double* f, /* Real */ ae_vector* g, ae_int_t n, double threshold, ae_state *_state); ae_bool enforceboundaryconstraints(/* Real */ ae_vector* x, /* Real */ ae_vector* bl, /* Boolean */ ae_vector* havebl, /* Real */ ae_vector* bu, /* Boolean */ ae_vector* havebu, ae_int_t nmain, ae_int_t nslack, ae_state *_state); void projectgradientintobc(/* Real */ ae_vector* x, /* Real */ ae_vector* g, /* Real */ ae_vector* bl, /* Boolean */ ae_vector* havebl, /* Real */ ae_vector* bu, /* Boolean */ ae_vector* havebu, ae_int_t nmain, ae_int_t nslack, ae_state *_state); void calculatestepbound(/* Real */ ae_vector* x, /* Real */ ae_vector* d, double alpha, /* Real */ ae_vector* bndl, /* Boolean */ ae_vector* havebndl, /* Real */ ae_vector* bndu, /* Boolean */ ae_vector* havebndu, ae_int_t nmain, ae_int_t nslack, ae_int_t* variabletofreeze, double* valuetofreeze, double* 
maxsteplen, ae_state *_state); ae_int_t postprocessboundedstep(/* Real */ ae_vector* x, /* Real */ ae_vector* xprev, /* Real */ ae_vector* bndl, /* Boolean */ ae_vector* havebndl, /* Real */ ae_vector* bndu, /* Boolean */ ae_vector* havebndu, ae_int_t nmain, ae_int_t nslack, ae_int_t variabletofreeze, double valuetofreeze, double steptaken, double maxsteplen, ae_state *_state); void filterdirection(/* Real */ ae_vector* d, /* Real */ ae_vector* x, /* Real */ ae_vector* bndl, /* Boolean */ ae_vector* havebndl, /* Real */ ae_vector* bndu, /* Boolean */ ae_vector* havebndu, /* Real */ ae_vector* s, ae_int_t nmain, ae_int_t nslack, double droptol, ae_state *_state); ae_int_t numberofchangedconstraints(/* Real */ ae_vector* x, /* Real */ ae_vector* xprev, /* Real */ ae_vector* bndl, /* Boolean */ ae_vector* havebndl, /* Real */ ae_vector* bndu, /* Boolean */ ae_vector* havebndu, ae_int_t nmain, ae_int_t nslack, ae_state *_state); ae_bool findfeasiblepoint(/* Real */ ae_vector* x, /* Real */ ae_vector* bndl, /* Boolean */ ae_vector* havebndl, /* Real */ ae_vector* bndu, /* Boolean */ ae_vector* havebndu, ae_int_t nmain, ae_int_t nslack, /* Real */ ae_matrix* ce, ae_int_t k, double epsi, ae_int_t* qpits, ae_int_t* gpaits, ae_state *_state); ae_bool derivativecheck(double f0, double df0, double f1, double df1, double f, double df, double width, ae_state *_state); void estimateparabolicmodel(double absasum, double absasum2, double mx, double mb, double md, double d1, double d2, ae_int_t* d1est, ae_int_t* d2est, ae_state *_state); void inexactlbfgspreconditioner(/* Real */ ae_vector* s, ae_int_t n, /* Real */ ae_vector* d, /* Real */ ae_vector* c, /* Real */ ae_matrix* w, ae_int_t k, precbuflbfgs* buf, ae_state *_state); void preparelowrankpreconditioner(/* Real */ ae_vector* d, /* Real */ ae_vector* c, /* Real */ ae_matrix* w, ae_int_t n, ae_int_t k, precbuflowrank* buf, ae_state *_state); void applylowrankpreconditioner(/* Real */ ae_vector* s, precbuflowrank* buf, ae_state *_state); void _precbuflbfgs_init(void* _p, ae_state *_state); void _precbuflbfgs_init_copy(void* _dst, void* _src, ae_state *_state); void _precbuflbfgs_clear(void* _p); void _precbuflbfgs_destroy(void* _p); void _precbuflowrank_init(void* _p, ae_state *_state); void _precbuflowrank_init_copy(void* _dst, void* _src, ae_state *_state); void _precbuflowrank_clear(void* _p); void _precbuflowrank_destroy(void* _p); void snnlsinit(ae_int_t nsmax, ae_int_t ndmax, ae_int_t nrmax, snnlssolver* s, ae_state *_state); void snnlssetproblem(snnlssolver* s, /* Real */ ae_matrix* a, /* Real */ ae_vector* b, ae_int_t ns, ae_int_t nd, ae_int_t nr, ae_state *_state); void snnlsdropnnc(snnlssolver* s, ae_int_t idx, ae_state *_state); void snnlssolve(snnlssolver* s, /* Real */ ae_vector* x, ae_state *_state); void _snnlssolver_init(void* _p, ae_state *_state); void _snnlssolver_init_copy(void* _dst, void* _src, ae_state *_state); void _snnlssolver_clear(void* _p); void _snnlssolver_destroy(void* _p); void sasinit(ae_int_t n, sactiveset* s, ae_state *_state); void sassetscale(sactiveset* state, /* Real */ ae_vector* s, ae_state *_state); void sassetprecdiag(sactiveset* state, /* Real */ ae_vector* d, ae_state *_state); void sassetbc(sactiveset* state, /* Real */ ae_vector* bndl, /* Real */ ae_vector* bndu, ae_state *_state); void sassetlc(sactiveset* state, /* Real */ ae_matrix* c, /* Integer */ ae_vector* ct, ae_int_t k, ae_state *_state); void sassetlcx(sactiveset* state, /* Real */ ae_matrix* cleic, ae_int_t nec, ae_int_t nic, ae_state 
*_state); ae_bool sasstartoptimization(sactiveset* state, /* Real */ ae_vector* x, ae_state *_state); void sasexploredirection(sactiveset* state, /* Real */ ae_vector* d, double* stpmax, ae_int_t* cidx, double* vval, ae_state *_state); ae_int_t sasmoveto(sactiveset* state, /* Real */ ae_vector* xn, ae_bool needact, ae_int_t cidx, double cval, ae_state *_state); void sasimmediateactivation(sactiveset* state, ae_int_t cidx, double cval, ae_state *_state); void sasconstraineddescent(sactiveset* state, /* Real */ ae_vector* g, /* Real */ ae_vector* d, ae_state *_state); void sasconstraineddescentprec(sactiveset* state, /* Real */ ae_vector* g, /* Real */ ae_vector* d, ae_state *_state); void sasconstraineddirection(sactiveset* state, /* Real */ ae_vector* d, ae_state *_state); void sasconstraineddirectionprec(sactiveset* state, /* Real */ ae_vector* d, ae_state *_state); void sascorrection(sactiveset* state, /* Real */ ae_vector* x, double* penalty, ae_state *_state); double sasactivelcpenalty1(sactiveset* state, /* Real */ ae_vector* x, ae_state *_state); double sasscaledconstrainednorm(sactiveset* state, /* Real */ ae_vector* d, ae_state *_state); void sasstopoptimization(sactiveset* state, ae_state *_state); void sasreactivateconstraints(sactiveset* state, /* Real */ ae_vector* gc, ae_state *_state); void sasreactivateconstraintsprec(sactiveset* state, /* Real */ ae_vector* gc, ae_state *_state); void sasrebuildbasis(sactiveset* state, ae_state *_state); void _sactiveset_init(void* _p, ae_state *_state); void _sactiveset_init_copy(void* _dst, void* _src, ae_state *_state); void _sactiveset_clear(void* _p); void _sactiveset_destroy(void* _p); void qqploaddefaults(ae_int_t nmain, qqpsettings* s, ae_state *_state); void qqpcopysettings(qqpsettings* src, qqpsettings* dst, ae_state *_state); void qqpoptimize(convexquadraticmodel* cqmac, sparsematrix* sparseac, /* Real */ ae_matrix* denseac, ae_int_t akind, ae_bool isupper, /* Real */ ae_vector* bc, /* Real */ ae_vector* bndlc, /* Real */ ae_vector* bnduc, /* Real */ ae_vector* sc, /* Real */ ae_vector* xoriginc, ae_int_t nc, /* Real */ ae_matrix* cleicc, ae_int_t nec, ae_int_t nic, qqpsettings* settings, qqpbuffers* sstate, /* Real */ ae_vector* xs, ae_int_t* terminationtype, ae_state *_state); void _qqpsettings_init(void* _p, ae_state *_state); void _qqpsettings_init_copy(void* _dst, void* _src, ae_state *_state); void _qqpsettings_clear(void* _p); void _qqpsettings_destroy(void* _p); void _qqpbuffers_init(void* _p, ae_state *_state); void _qqpbuffers_init_copy(void* _dst, void* _src, ae_state *_state); void _qqpbuffers_clear(void* _p); void _qqpbuffers_destroy(void* _p); void minlbfgscreate(ae_int_t n, ae_int_t m, /* Real */ ae_vector* x, minlbfgsstate* state, ae_state *_state); void minlbfgscreatef(ae_int_t n, ae_int_t m, /* Real */ ae_vector* x, double diffstep, minlbfgsstate* state, ae_state *_state); void minlbfgssetcond(minlbfgsstate* state, double epsg, double epsf, double epsx, ae_int_t maxits, ae_state *_state); void minlbfgssetxrep(minlbfgsstate* state, ae_bool needxrep, ae_state *_state); void minlbfgssetstpmax(minlbfgsstate* state, double stpmax, ae_state *_state); void minlbfgssetscale(minlbfgsstate* state, /* Real */ ae_vector* s, ae_state *_state); void minlbfgscreatex(ae_int_t n, ae_int_t m, /* Real */ ae_vector* x, ae_int_t flags, double diffstep, minlbfgsstate* state, ae_state *_state); void minlbfgssetprecdefault(minlbfgsstate* state, ae_state *_state); void minlbfgssetpreccholesky(minlbfgsstate* state, /* Real */ ae_matrix* 
p, ae_bool isupper, ae_state *_state); void minlbfgssetprecdiag(minlbfgsstate* state, /* Real */ ae_vector* d, ae_state *_state); void minlbfgssetprecscale(minlbfgsstate* state, ae_state *_state); void minlbfgssetprecrankklbfgsfast(minlbfgsstate* state, /* Real */ ae_vector* d, /* Real */ ae_vector* c, /* Real */ ae_matrix* w, ae_int_t cnt, ae_state *_state); void minlbfgssetpreclowrankexact(minlbfgsstate* state, /* Real */ ae_vector* d, /* Real */ ae_vector* c, /* Real */ ae_matrix* w, ae_int_t cnt, ae_state *_state); ae_bool minlbfgsiteration(minlbfgsstate* state, ae_state *_state); void minlbfgsresults(minlbfgsstate* state, /* Real */ ae_vector* x, minlbfgsreport* rep, ae_state *_state); void minlbfgsresultsbuf(minlbfgsstate* state, /* Real */ ae_vector* x, minlbfgsreport* rep, ae_state *_state); void minlbfgsrestartfrom(minlbfgsstate* state, /* Real */ ae_vector* x, ae_state *_state); void minlbfgsrequesttermination(minlbfgsstate* state, ae_state *_state); void minlbfgssetgradientcheck(minlbfgsstate* state, double teststep, ae_state *_state); void _minlbfgsstate_init(void* _p, ae_state *_state); void _minlbfgsstate_init_copy(void* _dst, void* _src, ae_state *_state); void _minlbfgsstate_clear(void* _p); void _minlbfgsstate_destroy(void* _p); void _minlbfgsreport_init(void* _p, ae_state *_state); void _minlbfgsreport_init_copy(void* _dst, void* _src, ae_state *_state); void _minlbfgsreport_clear(void* _p); void _minlbfgsreport_destroy(void* _p); void qpdenseaulloaddefaults(ae_int_t nmain, qpdenseaulsettings* s, ae_state *_state); void qpdenseauloptimize(convexquadraticmodel* a, sparsematrix* sparsea, ae_int_t akind, ae_bool sparseaupper, /* Real */ ae_vector* b, /* Real */ ae_vector* bndl, /* Real */ ae_vector* bndu, /* Real */ ae_vector* s, /* Real */ ae_vector* xorigin, ae_int_t nn, /* Real */ ae_matrix* cleic, ae_int_t dnec, ae_int_t dnic, sparsematrix* scleic, ae_int_t snec, ae_int_t snic, ae_bool renormlc, qpdenseaulsettings* settings, qpdenseaulbuffers* state, /* Real */ ae_vector* xs, ae_int_t* terminationtype, ae_state *_state); void _qpdenseaulsettings_init(void* _p, ae_state *_state); void _qpdenseaulsettings_init_copy(void* _dst, void* _src, ae_state *_state); void _qpdenseaulsettings_clear(void* _p); void _qpdenseaulsettings_destroy(void* _p); void _qpdenseaulbuffers_init(void* _p, ae_state *_state); void _qpdenseaulbuffers_init_copy(void* _dst, void* _src, ae_state *_state); void _qpdenseaulbuffers_clear(void* _p); void _qpdenseaulbuffers_destroy(void* _p); void qpcholeskyloaddefaults(ae_int_t nmain, qpcholeskysettings* s, ae_state *_state); void qpcholeskycopysettings(qpcholeskysettings* src, qpcholeskysettings* dst, ae_state *_state); void qpcholeskyoptimize(convexquadraticmodel* a, double anorm, /* Real */ ae_vector* b, /* Real */ ae_vector* bndl, /* Real */ ae_vector* bndu, /* Real */ ae_vector* s, /* Real */ ae_vector* xorigin, ae_int_t n, /* Real */ ae_matrix* cleic, ae_int_t nec, ae_int_t nic, qpcholeskybuffers* sstate, /* Real */ ae_vector* xsc, ae_int_t* terminationtype, ae_state *_state); void _qpcholeskysettings_init(void* _p, ae_state *_state); void _qpcholeskysettings_init_copy(void* _dst, void* _src, ae_state *_state); void _qpcholeskysettings_clear(void* _p); void _qpcholeskysettings_destroy(void* _p); void _qpcholeskybuffers_init(void* _p, ae_state *_state); void _qpcholeskybuffers_init_copy(void* _dst, void* _src, ae_state *_state); void _qpcholeskybuffers_clear(void* _p); void _qpcholeskybuffers_destroy(void* _p); void mincgcreate(ae_int_t n, /* Real */ 
ae_vector* x, mincgstate* state, ae_state *_state); void mincgcreatef(ae_int_t n, /* Real */ ae_vector* x, double diffstep, mincgstate* state, ae_state *_state); void mincgsetcond(mincgstate* state, double epsg, double epsf, double epsx, ae_int_t maxits, ae_state *_state); void mincgsetscale(mincgstate* state, /* Real */ ae_vector* s, ae_state *_state); void mincgsetxrep(mincgstate* state, ae_bool needxrep, ae_state *_state); void mincgsetdrep(mincgstate* state, ae_bool needdrep, ae_state *_state); void mincgsetcgtype(mincgstate* state, ae_int_t cgtype, ae_state *_state); void mincgsetstpmax(mincgstate* state, double stpmax, ae_state *_state); void mincgsuggeststep(mincgstate* state, double stp, ae_state *_state); double mincglastgoodstep(mincgstate* state, ae_state *_state); void mincgsetprecdefault(mincgstate* state, ae_state *_state); void mincgsetprecdiag(mincgstate* state, /* Real */ ae_vector* d, ae_state *_state); void mincgsetprecscale(mincgstate* state, ae_state *_state); ae_bool mincgiteration(mincgstate* state, ae_state *_state); void mincgresults(mincgstate* state, /* Real */ ae_vector* x, mincgreport* rep, ae_state *_state); void mincgresultsbuf(mincgstate* state, /* Real */ ae_vector* x, mincgreport* rep, ae_state *_state); void mincgrestartfrom(mincgstate* state, /* Real */ ae_vector* x, ae_state *_state); void mincgrequesttermination(mincgstate* state, ae_state *_state); void mincgsetprecdiagfast(mincgstate* state, /* Real */ ae_vector* d, ae_state *_state); void mincgsetpreclowrankfast(mincgstate* state, /* Real */ ae_vector* d1, /* Real */ ae_vector* c, /* Real */ ae_matrix* v, ae_int_t vcnt, ae_state *_state); void mincgsetprecvarpart(mincgstate* state, /* Real */ ae_vector* d2, ae_state *_state); void mincgsetgradientcheck(mincgstate* state, double teststep, ae_state *_state); void _mincgstate_init(void* _p, ae_state *_state); void _mincgstate_init_copy(void* _dst, void* _src, ae_state *_state); void _mincgstate_clear(void* _p); void _mincgstate_destroy(void* _p); void _mincgreport_init(void* _p, ae_state *_state); void _mincgreport_init_copy(void* _dst, void* _src, ae_state *_state); void _mincgreport_clear(void* _p); void _mincgreport_destroy(void* _p); void minbleiccreate(ae_int_t n, /* Real */ ae_vector* x, minbleicstate* state, ae_state *_state); void minbleiccreatef(ae_int_t n, /* Real */ ae_vector* x, double diffstep, minbleicstate* state, ae_state *_state); void minbleicsetbc(minbleicstate* state, /* Real */ ae_vector* bndl, /* Real */ ae_vector* bndu, ae_state *_state); void minbleicsetlc(minbleicstate* state, /* Real */ ae_matrix* c, /* Integer */ ae_vector* ct, ae_int_t k, ae_state *_state); void minbleicsetcond(minbleicstate* state, double epsg, double epsf, double epsx, ae_int_t maxits, ae_state *_state); void minbleicsetscale(minbleicstate* state, /* Real */ ae_vector* s, ae_state *_state); void minbleicsetprecdefault(minbleicstate* state, ae_state *_state); void minbleicsetprecdiag(minbleicstate* state, /* Real */ ae_vector* d, ae_state *_state); void minbleicsetprecscale(minbleicstate* state, ae_state *_state); void minbleicsetxrep(minbleicstate* state, ae_bool needxrep, ae_state *_state); void minbleicsetdrep(minbleicstate* state, ae_bool needdrep, ae_state *_state); void minbleicsetstpmax(minbleicstate* state, double stpmax, ae_state *_state); ae_bool minbleiciteration(minbleicstate* state, ae_state *_state); void minbleicresults(minbleicstate* state, /* Real */ ae_vector* x, minbleicreport* rep, ae_state *_state); void 
minbleicresultsbuf(minbleicstate* state, /* Real */ ae_vector* x, minbleicreport* rep, ae_state *_state); void minbleicrestartfrom(minbleicstate* state, /* Real */ ae_vector* x, ae_state *_state); void minbleicrequesttermination(minbleicstate* state, ae_state *_state); void minbleicemergencytermination(minbleicstate* state, ae_state *_state); void minbleicsetgradientcheck(minbleicstate* state, double teststep, ae_state *_state); void _minbleicstate_init(void* _p, ae_state *_state); void _minbleicstate_init_copy(void* _dst, void* _src, ae_state *_state); void _minbleicstate_clear(void* _p); void _minbleicstate_destroy(void* _p); void _minbleicreport_init(void* _p, ae_state *_state); void _minbleicreport_init_copy(void* _dst, void* _src, ae_state *_state); void _minbleicreport_clear(void* _p); void _minbleicreport_destroy(void* _p); void qpbleicloaddefaults(ae_int_t nmain, qpbleicsettings* s, ae_state *_state); void qpbleiccopysettings(qpbleicsettings* src, qpbleicsettings* dst, ae_state *_state); void qpbleicoptimize(convexquadraticmodel* a, sparsematrix* sparsea, ae_int_t akind, ae_bool sparseaupper, double absasum, double absasum2, /* Real */ ae_vector* b, /* Real */ ae_vector* bndl, /* Real */ ae_vector* bndu, /* Real */ ae_vector* s, /* Real */ ae_vector* xorigin, ae_int_t n, /* Real */ ae_matrix* cleic, ae_int_t nec, ae_int_t nic, qpbleicsettings* settings, qpbleicbuffers* sstate, ae_bool* firstcall, /* Real */ ae_vector* xs, ae_int_t* terminationtype, ae_state *_state); void _qpbleicsettings_init(void* _p, ae_state *_state); void _qpbleicsettings_init_copy(void* _dst, void* _src, ae_state *_state); void _qpbleicsettings_clear(void* _p); void _qpbleicsettings_destroy(void* _p); void _qpbleicbuffers_init(void* _p, ae_state *_state); void _qpbleicbuffers_init_copy(void* _dst, void* _src, ae_state *_state); void _qpbleicbuffers_clear(void* _p); void _qpbleicbuffers_destroy(void* _p); void minqpcreate(ae_int_t n, minqpstate* state, ae_state *_state); void minqpsetlinearterm(minqpstate* state, /* Real */ ae_vector* b, ae_state *_state); void minqpsetquadraticterm(minqpstate* state, /* Real */ ae_matrix* a, ae_bool isupper, ae_state *_state); void minqpsetquadratictermsparse(minqpstate* state, sparsematrix* a, ae_bool isupper, ae_state *_state); void minqpsetstartingpoint(minqpstate* state, /* Real */ ae_vector* x, ae_state *_state); void minqpsetorigin(minqpstate* state, /* Real */ ae_vector* xorigin, ae_state *_state); void minqpsetscale(minqpstate* state, /* Real */ ae_vector* s, ae_state *_state); void minqpsetalgocholesky(minqpstate* state, ae_state *_state); void minqpsetalgobleic(minqpstate* state, double epsg, double epsf, double epsx, ae_int_t maxits, ae_state *_state); void minqpsetalgodenseaul(minqpstate* state, double epsx, double rho, ae_int_t itscnt, ae_state *_state); void minqpsetalgoquickqp(minqpstate* state, double epsg, double epsf, double epsx, ae_int_t maxouterits, ae_bool usenewton, ae_state *_state); void minqpsetbc(minqpstate* state, /* Real */ ae_vector* bndl, /* Real */ ae_vector* bndu, ae_state *_state); void minqpsetlc(minqpstate* state, /* Real */ ae_matrix* c, /* Integer */ ae_vector* ct, ae_int_t k, ae_state *_state); void minqpsetlcsparse(minqpstate* state, sparsematrix* c, /* Integer */ ae_vector* ct, ae_int_t k, ae_state *_state); void minqpsetlcmixed(minqpstate* state, /* Real */ ae_matrix* densec, /* Integer */ ae_vector* densect, ae_int_t densek, sparsematrix* sparsec, /* Integer */ ae_vector* sparsect, ae_int_t sparsek, ae_state *_state); void 
minqpoptimize(minqpstate* state, ae_state *_state); void minqpresults(minqpstate* state, /* Real */ ae_vector* x, minqpreport* rep, ae_state *_state); void minqpresultsbuf(minqpstate* state, /* Real */ ae_vector* x, minqpreport* rep, ae_state *_state); void minqpsetlineartermfast(minqpstate* state, /* Real */ ae_vector* b, ae_state *_state); void minqpsetquadratictermfast(minqpstate* state, /* Real */ ae_matrix* a, ae_bool isupper, double s, ae_state *_state); void minqprewritediagonal(minqpstate* state, /* Real */ ae_vector* s, ae_state *_state); void minqpsetstartingpointfast(minqpstate* state, /* Real */ ae_vector* x, ae_state *_state); void minqpsetoriginfast(minqpstate* state, /* Real */ ae_vector* xorigin, ae_state *_state); void _minqpstate_init(void* _p, ae_state *_state); void _minqpstate_init_copy(void* _dst, void* _src, ae_state *_state); void _minqpstate_clear(void* _p); void _minqpstate_destroy(void* _p); void _minqpreport_init(void* _p, ae_state *_state); void _minqpreport_init_copy(void* _dst, void* _src, ae_state *_state); void _minqpreport_clear(void* _p); void _minqpreport_destroy(void* _p); void minnlccreate(ae_int_t n, /* Real */ ae_vector* x, minnlcstate* state, ae_state *_state); void minnlccreatef(ae_int_t n, /* Real */ ae_vector* x, double diffstep, minnlcstate* state, ae_state *_state); void minnlcsetbc(minnlcstate* state, /* Real */ ae_vector* bndl, /* Real */ ae_vector* bndu, ae_state *_state); void minnlcsetlc(minnlcstate* state, /* Real */ ae_matrix* c, /* Integer */ ae_vector* ct, ae_int_t k, ae_state *_state); void minnlcsetnlc(minnlcstate* state, ae_int_t nlec, ae_int_t nlic, ae_state *_state); void minnlcsetcond(minnlcstate* state, double epsg, double epsf, double epsx, ae_int_t maxits, ae_state *_state); void minnlcsetscale(minnlcstate* state, /* Real */ ae_vector* s, ae_state *_state); void minnlcsetprecinexact(minnlcstate* state, ae_state *_state); void minnlcsetprecexactlowrank(minnlcstate* state, ae_int_t updatefreq, ae_state *_state); void minnlcsetprecexactrobust(minnlcstate* state, ae_int_t updatefreq, ae_state *_state); void minnlcsetprecnone(minnlcstate* state, ae_state *_state); void minnlcsetstpmax(minnlcstate* state, double stpmax, ae_state *_state); void minnlcsetalgoaul(minnlcstate* state, double rho, ae_int_t itscnt, ae_state *_state); void minnlcsetxrep(minnlcstate* state, ae_bool needxrep, ae_state *_state); ae_bool minnlciteration(minnlcstate* state, ae_state *_state); void minnlcresults(minnlcstate* state, /* Real */ ae_vector* x, minnlcreport* rep, ae_state *_state); void minnlcresultsbuf(minnlcstate* state, /* Real */ ae_vector* x, minnlcreport* rep, ae_state *_state); void minnlcrestartfrom(minnlcstate* state, /* Real */ ae_vector* x, ae_state *_state); void minnlcsetgradientcheck(minnlcstate* state, double teststep, ae_state *_state); void minnlcequalitypenaltyfunction(double alpha, double* f, double* df, double* d2f, ae_state *_state); void minnlcinequalitypenaltyfunction(double alpha, double stabilizingpoint, double* f, double* df, double* d2f, ae_state *_state); void minnlcinequalityshiftfunction(double alpha, double* f, double* df, double* d2f, ae_state *_state); void _minnlcstate_init(void* _p, ae_state *_state); void _minnlcstate_init_copy(void* _dst, void* _src, ae_state *_state); void _minnlcstate_clear(void* _p); void _minnlcstate_destroy(void* _p); void _minnlcreport_init(void* _p, ae_state *_state); void _minnlcreport_init_copy(void* _dst, void* _src, ae_state *_state); void _minnlcreport_clear(void* _p); void 
_minnlcreport_destroy(void* _p); void minbccreate(ae_int_t n, /* Real */ ae_vector* x, minbcstate* state, ae_state *_state); void minbccreatef(ae_int_t n, /* Real */ ae_vector* x, double diffstep, minbcstate* state, ae_state *_state); void minbcsetbc(minbcstate* state, /* Real */ ae_vector* bndl, /* Real */ ae_vector* bndu, ae_state *_state); void minbcsetcond(minbcstate* state, double epsg, double epsf, double epsx, ae_int_t maxits, ae_state *_state); void minbcsetscale(minbcstate* state, /* Real */ ae_vector* s, ae_state *_state); void minbcsetprecdefault(minbcstate* state, ae_state *_state); void minbcsetprecdiag(minbcstate* state, /* Real */ ae_vector* d, ae_state *_state); void minbcsetprecscale(minbcstate* state, ae_state *_state); void minbcsetxrep(minbcstate* state, ae_bool needxrep, ae_state *_state); void minbcsetstpmax(minbcstate* state, double stpmax, ae_state *_state); ae_bool minbciteration(minbcstate* state, ae_state *_state); void minbcresults(minbcstate* state, /* Real */ ae_vector* x, minbcreport* rep, ae_state *_state); void minbcresultsbuf(minbcstate* state, /* Real */ ae_vector* x, minbcreport* rep, ae_state *_state); void minbcrestartfrom(minbcstate* state, /* Real */ ae_vector* x, ae_state *_state); void minbcrequesttermination(minbcstate* state, ae_state *_state); void minbcsetgradientcheck(minbcstate* state, double teststep, ae_state *_state); void _minbcstate_init(void* _p, ae_state *_state); void _minbcstate_init_copy(void* _dst, void* _src, ae_state *_state); void _minbcstate_clear(void* _p); void _minbcstate_destroy(void* _p); void _minbcreport_init(void* _p, ae_state *_state); void _minbcreport_init_copy(void* _dst, void* _src, ae_state *_state); void _minbcreport_clear(void* _p); void _minbcreport_destroy(void* _p); void minnscreate(ae_int_t n, /* Real */ ae_vector* x, minnsstate* state, ae_state *_state); void minnscreatef(ae_int_t n, /* Real */ ae_vector* x, double diffstep, minnsstate* state, ae_state *_state); void minnssetbc(minnsstate* state, /* Real */ ae_vector* bndl, /* Real */ ae_vector* bndu, ae_state *_state); void minnssetlc(minnsstate* state, /* Real */ ae_matrix* c, /* Integer */ ae_vector* ct, ae_int_t k, ae_state *_state); void minnssetnlc(minnsstate* state, ae_int_t nlec, ae_int_t nlic, ae_state *_state); void minnssetcond(minnsstate* state, double epsx, ae_int_t maxits, ae_state *_state); void minnssetscale(minnsstate* state, /* Real */ ae_vector* s, ae_state *_state); void minnssetalgoags(minnsstate* state, double radius, double penalty, ae_state *_state); void minnssetxrep(minnsstate* state, ae_bool needxrep, ae_state *_state); void minnsrequesttermination(minnsstate* state, ae_state *_state); ae_bool minnsiteration(minnsstate* state, ae_state *_state); void minnsresults(minnsstate* state, /* Real */ ae_vector* x, minnsreport* rep, ae_state *_state); void minnsresultsbuf(minnsstate* state, /* Real */ ae_vector* x, minnsreport* rep, ae_state *_state); void minnsrestartfrom(minnsstate* state, /* Real */ ae_vector* x, ae_state *_state); void _minnsqp_init(void* _p, ae_state *_state); void _minnsqp_init_copy(void* _dst, void* _src, ae_state *_state); void _minnsqp_clear(void* _p); void _minnsqp_destroy(void* _p); void _minnsstate_init(void* _p, ae_state *_state); void _minnsstate_init_copy(void* _dst, void* _src, ae_state *_state); void _minnsstate_clear(void* _p); void _minnsstate_destroy(void* _p); void _minnsreport_init(void* _p, ae_state *_state); void _minnsreport_init_copy(void* _dst, void* _src, ae_state *_state); void 
_minnsreport_clear(void* _p); void _minnsreport_destroy(void* _p); void minlbfgssetdefaultpreconditioner(minlbfgsstate* state, ae_state *_state); void minlbfgssetcholeskypreconditioner(minlbfgsstate* state, /* Real */ ae_matrix* p, ae_bool isupper, ae_state *_state); void minbleicsetbarrierwidth(minbleicstate* state, double mu, ae_state *_state); void minbleicsetbarrierdecay(minbleicstate* state, double mudecay, ae_state *_state); void minasacreate(ae_int_t n, /* Real */ ae_vector* x, /* Real */ ae_vector* bndl, /* Real */ ae_vector* bndu, minasastate* state, ae_state *_state); void minasasetcond(minasastate* state, double epsg, double epsf, double epsx, ae_int_t maxits, ae_state *_state); void minasasetxrep(minasastate* state, ae_bool needxrep, ae_state *_state); void minasasetalgorithm(minasastate* state, ae_int_t algotype, ae_state *_state); void minasasetstpmax(minasastate* state, double stpmax, ae_state *_state); ae_bool minasaiteration(minasastate* state, ae_state *_state); void minasaresults(minasastate* state, /* Real */ ae_vector* x, minasareport* rep, ae_state *_state); void minasaresultsbuf(minasastate* state, /* Real */ ae_vector* x, minasareport* rep, ae_state *_state); void minasarestartfrom(minasastate* state, /* Real */ ae_vector* x, /* Real */ ae_vector* bndl, /* Real */ ae_vector* bndu, ae_state *_state); void _minasastate_init(void* _p, ae_state *_state); void _minasastate_init_copy(void* _dst, void* _src, ae_state *_state); void _minasastate_clear(void* _p); void _minasastate_destroy(void* _p); void _minasareport_init(void* _p, ae_state *_state); void _minasareport_init_copy(void* _dst, void* _src, ae_state *_state); void _minasareport_clear(void* _p); void _minasareport_destroy(void* _p); void minlmcreatevj(ae_int_t n, ae_int_t m, /* Real */ ae_vector* x, minlmstate* state, ae_state *_state); void minlmcreatev(ae_int_t n, ae_int_t m, /* Real */ ae_vector* x, double diffstep, minlmstate* state, ae_state *_state); void minlmcreatefgh(ae_int_t n, /* Real */ ae_vector* x, minlmstate* state, ae_state *_state); void minlmsetcond(minlmstate* state, double epsx, ae_int_t maxits, ae_state *_state); void minlmsetxrep(minlmstate* state, ae_bool needxrep, ae_state *_state); void minlmsetstpmax(minlmstate* state, double stpmax, ae_state *_state); void minlmsetscale(minlmstate* state, /* Real */ ae_vector* s, ae_state *_state); void minlmsetbc(minlmstate* state, /* Real */ ae_vector* bndl, /* Real */ ae_vector* bndu, ae_state *_state); void minlmsetlc(minlmstate* state, /* Real */ ae_matrix* c, /* Integer */ ae_vector* ct, ae_int_t k, ae_state *_state); void minlmsetacctype(minlmstate* state, ae_int_t acctype, ae_state *_state); ae_bool minlmiteration(minlmstate* state, ae_state *_state); void minlmresults(minlmstate* state, /* Real */ ae_vector* x, minlmreport* rep, ae_state *_state); void minlmresultsbuf(minlmstate* state, /* Real */ ae_vector* x, minlmreport* rep, ae_state *_state); void minlmrestartfrom(minlmstate* state, /* Real */ ae_vector* x, ae_state *_state); void minlmrequesttermination(minlmstate* state, ae_state *_state); void minlmcreatevgj(ae_int_t n, ae_int_t m, /* Real */ ae_vector* x, minlmstate* state, ae_state *_state); void minlmcreatefgj(ae_int_t n, ae_int_t m, /* Real */ ae_vector* x, minlmstate* state, ae_state *_state); void minlmcreatefj(ae_int_t n, ae_int_t m, /* Real */ ae_vector* x, minlmstate* state, ae_state *_state); void minlmsetgradientcheck(minlmstate* state, double teststep, ae_state *_state); void _minlmstepfinder_init(void* _p, ae_state 
*_state); void _minlmstepfinder_init_copy(void* _dst, void* _src, ae_state *_state); void _minlmstepfinder_clear(void* _p); void _minlmstepfinder_destroy(void* _p); void _minlmstate_init(void* _p, ae_state *_state); void _minlmstate_init_copy(void* _dst, void* _src, ae_state *_state); void _minlmstate_clear(void* _p); void _minlmstate_destroy(void* _p); void _minlmreport_init(void* _p, ae_state *_state); void _minlmreport_init_copy(void* _dst, void* _src, ae_state *_state); void _minlmreport_clear(void* _p); void _minlmreport_destroy(void* _p); } #endif
cpp/src/dataanalysis.cpp
/************************************************************************* ALGLIB 3.11.0 (source code generated 2017-05-11) Copyright (c) Sergey Bochkanov (ALGLIB project). >>> SOURCE LICENSE >>> This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation (www.fsf.org); either version 2 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. A copy of the GNU General Public License is available at http://www.fsf.org/licensing/licenses >>> END OF LICENSE >>> *************************************************************************/ #include "stdafx.h" #include "dataanalysis.h" // disable some irrelevant warnings #if (AE_COMPILER==AE_MSVC) #pragma warning(disable:4100) #pragma warning(disable:4127) #pragma warning(disable:4702) #pragma warning(disable:4996) #endif using namespace std; ///////////////////////////////////////////////////////////////////////// // // THIS SECTION CONTAINS IMPLEMENTATION OF C++ INTERFACE // ///////////////////////////////////////////////////////////////////////// namespace alglib { /************************************************************************* Principal components analysis This function builds orthogonal basis where first axis corresponds to direction with maximum variance, second axis maximizes variance in the subspace orthogonal to first axis and so on. This function builds FULL basis, i.e. returns N vectors corresponding to ALL directions, no matter how informative. If you need just a few (say, 10 or 50) of the most important directions, you may find it faster to use one of the reduced versions: * pcatruncatedsubspace() - for subspace iteration based method It should be noted that, unlike LDA, PCA does not use class labels. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * multithreading support ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! ! Multithreading typically gives sublinear (wrt to cores count) speedup, ! because only some parts of the algorithm can be parallelized. ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. Best results are achieved for high-dimensional problems ! (NVars is at least 256). ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- !
related features provided by commercial edition of ALGLIB. INPUT PARAMETERS: X - dataset, array[0..NPoints-1,0..NVars-1]. matrix contains ONLY INDEPENDENT VARIABLES. NPoints - dataset size, NPoints>=0 NVars - number of independent variables, NVars>=1 OUTPUT PARAMETERS: Info - return code: * -4, if SVD subroutine haven't converged * -1, if wrong parameters has been passed (NPoints<0, NVars<1) * 1, if task is solved S2 - array[0..NVars-1]. variance values corresponding to basis vectors. V - array[0..NVars-1,0..NVars-1] matrix, whose columns store basis vectors. -- ALGLIB -- Copyright 25.08.2008 by Bochkanov Sergey *************************************************************************/ void pcabuildbasis(const real_2d_array &x, const ae_int_t npoints, const ae_int_t nvars, ae_int_t &info, real_1d_array &s2, real_2d_array &v) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::pcabuildbasis(const_cast(x.c_ptr()), npoints, nvars, &info, const_cast(s2.c_ptr()), const_cast(v.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void smp_pcabuildbasis(const real_2d_array &x, const ae_int_t npoints, const ae_int_t nvars, ae_int_t &info, real_1d_array &s2, real_2d_array &v) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::_pexec_pcabuildbasis(const_cast(x.c_ptr()), npoints, nvars, &info, const_cast(s2.c_ptr()), const_cast(v.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Principal components analysis This function performs truncated PCA, i.e. returns just a few most important directions. Internally it uses iterative eigensolver which is very efficient when only a minor fraction of full basis is required. Thus, if you need full basis, it is better to use pcabuildbasis() function. It should be noted that, unlike LDA, PCA does not use class labels. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * multithreading support ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! ! For a situation when you need just a few eigenvectors (~1-10), ! multithreading typically gives sublinear (wrt to cores count) speedup. ! For larger problems it may give you nearly linear increase in ! performance. ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. Best results are achieved for high-dimensional problems ! (NVars is at least 256). ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS: X - dataset, array[0..NPoints-1,0..NVars-1]. matrix contains ONLY INDEPENDENT VARIABLES. 
NPoints - dataset size, NPoints>=0 NVars - number of independent variables, NVars>=1 NNeeded - number of requested directions, in [1,NVars] range; this function is efficient only for NNeeded<<NVars. *************************************************************************/ void pcatruncatedsubspace(const real_2d_array &x, const ae_int_t npoints, const ae_int_t nvars, const ae_int_t nneeded, const double eps, const ae_int_t maxits, real_1d_array &s2, real_2d_array &v) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::pcatruncatedsubspace(const_cast(x.c_ptr()), npoints, nvars, nneeded, eps, maxits, const_cast(s2.c_ptr()), const_cast(v.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void smp_pcatruncatedsubspace(const real_2d_array &x, const ae_int_t npoints, const ae_int_t nvars, const ae_int_t nneeded, const double eps, const ae_int_t maxits, real_1d_array &s2, real_2d_array &v) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::_pexec_pcatruncatedsubspace(const_cast(x.c_ptr()), npoints, nvars, nneeded, eps, maxits, const_cast(s2.c_ptr()), const_cast(v.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Optimal binary classification Algorithm finds optimal (=with minimal cross-entropy) binary partition. Internal subroutine. INPUT PARAMETERS: A - array[0..N-1], variable C - array[0..N-1], class numbers (0 or 1). N - array size OUTPUT PARAMETERS: Info - completion code: * -3, all values of A[] are same (partition is impossible) * -2, one of C[] is incorrect (<0, >1) * -1, incorrect parameters were passed (N<=0). * 1, OK Threshold- partition boundary. Left part contains values which are strictly less than Threshold. Right part contains values which are greater than or equal to Threshold. PAL, PBL- probabilities P(0|v<Threshold) and P(1|v<Threshold) PAR, PBR- probabilities P(0|v>=Threshold) and P(1|v>=Threshold) CVE - cross-validation estimate of cross-entropy -- ALGLIB -- Copyright 22.05.2008 by Bochkanov Sergey *************************************************************************/ void dsoptimalsplit2(const real_1d_array &a, const integer_1d_array &c, const ae_int_t n, ae_int_t &info, double &threshold, double &pal, double &pbl, double &par, double &pbr, double &cve) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::dsoptimalsplit2(const_cast(a.c_ptr()), const_cast(c.c_ptr()), n, &info, &threshold, &pal, &pbl, &par, &pbr, &cve, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Optimal partition, internal subroutine. Fast version. Accepts: A array[0..N-1] array of attributes array[0..N-1] C array[0..N-1] array of class labels TiesBuf array[0..N] temporaries (ties) CntBuf array[0..2*NC-1] temporaries (counts) Alpha centering factor (0<=alpha<=1, recommended value - 0.05) BufR array[0..N-1] temporaries BufI array[0..N-1] temporaries Output: Info error code (">0"=OK, "<0"=bad) RMS training set RMS error CVRMS leave-one-out RMS error Note: content of all arrays is changed by subroutine; it doesn't allocate temporaries.
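As an illustration of the friendlier dsoptimalsplit2() wrapper documented above, here is a minimal usage sketch. It is not part of the library source; the data values and variable names are invented for the example.

    #include "dataanalysis.h"
    #include <cstdio>
    using namespace alglib;

    int main()
    {
        // one continuous attribute and binary class labels (illustrative values)
        real_1d_array    a = "[0.1,0.3,0.4,0.8,0.9,1.2]";
        integer_1d_array c = "[0,0,0,1,1,1]";
        ae_int_t info;
        double threshold, pal, pbl, par, pbr, cve;

        // find the split with minimal cross-entropy; info=1 means success
        dsoptimalsplit2(a, c, a.length(), info, threshold, pal, pbl, par, pbr, cve);
        if( info==1 )
            printf("threshold=%.3f  CVE=%.3f\n", threshold, cve);
        return 0;
    }
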
-- ALGLIB -- Copyright 11.12.2008 by Bochkanov Sergey *************************************************************************/ void dsoptimalsplit2fast(real_1d_array &a, integer_1d_array &c, integer_1d_array &tiesbuf, integer_1d_array &cntbuf, real_1d_array &bufr, integer_1d_array &bufi, const ae_int_t n, const ae_int_t nc, const double alpha, ae_int_t &info, double &threshold, double &rms, double &cvrms) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::dsoptimalsplit2fast(const_cast(a.c_ptr()), const_cast(c.c_ptr()), const_cast(tiesbuf.c_ptr()), const_cast(cntbuf.c_ptr()), const_cast(bufr.c_ptr()), const_cast(bufi.c_ptr()), n, nc, alpha, &info, &threshold, &rms, &cvrms, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Model's errors: * RelCLSError - fraction of misclassified cases. * AvgCE - average cross-entropy * RMSError - root-mean-square error * AvgError - average error * AvgRelError - average relative error NOTE 1: RelCLSError/AvgCE are zero on regression problems. NOTE 2: on classification problems RMSError/AvgError/AvgRelError contain errors in prediction of posterior probabilities *************************************************************************/ _modelerrors_owner::_modelerrors_owner() { p_struct = (alglib_impl::modelerrors*)alglib_impl::ae_malloc(sizeof(alglib_impl::modelerrors), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_modelerrors_init(p_struct, NULL); } _modelerrors_owner::_modelerrors_owner(const _modelerrors_owner &rhs) { p_struct = (alglib_impl::modelerrors*)alglib_impl::ae_malloc(sizeof(alglib_impl::modelerrors), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_modelerrors_init_copy(p_struct, const_cast(rhs.p_struct), NULL); } _modelerrors_owner& _modelerrors_owner::operator=(const _modelerrors_owner &rhs) { if( this==&rhs ) return *this; alglib_impl::_modelerrors_clear(p_struct); alglib_impl::_modelerrors_init_copy(p_struct, const_cast(rhs.p_struct), NULL); return *this; } _modelerrors_owner::~_modelerrors_owner() { alglib_impl::_modelerrors_clear(p_struct); ae_free(p_struct); } alglib_impl::modelerrors* _modelerrors_owner::c_ptr() { return p_struct; } alglib_impl::modelerrors* _modelerrors_owner::c_ptr() const { return const_cast(p_struct); } modelerrors::modelerrors() : _modelerrors_owner() ,relclserror(p_struct->relclserror),avgce(p_struct->avgce),rmserror(p_struct->rmserror),avgerror(p_struct->avgerror),avgrelerror(p_struct->avgrelerror) { } modelerrors::modelerrors(const modelerrors &rhs):_modelerrors_owner(rhs) ,relclserror(p_struct->relclserror),avgce(p_struct->avgce),rmserror(p_struct->rmserror),avgerror(p_struct->avgerror),avgrelerror(p_struct->avgrelerror) { } modelerrors& modelerrors::operator=(const modelerrors &rhs) { if( this==&rhs ) return *this; _modelerrors_owner::operator=(rhs); return *this; } modelerrors::~modelerrors() { } /************************************************************************* *************************************************************************/ _multilayerperceptron_owner::_multilayerperceptron_owner() { p_struct = (alglib_impl::multilayerperceptron*)alglib_impl::ae_malloc(sizeof(alglib_impl::multilayerperceptron), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error");
alglib_impl::_multilayerperceptron_init(p_struct, NULL); } _multilayerperceptron_owner::_multilayerperceptron_owner(const _multilayerperceptron_owner &rhs) { p_struct = (alglib_impl::multilayerperceptron*)alglib_impl::ae_malloc(sizeof(alglib_impl::multilayerperceptron), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_multilayerperceptron_init_copy(p_struct, const_cast(rhs.p_struct), NULL); } _multilayerperceptron_owner& _multilayerperceptron_owner::operator=(const _multilayerperceptron_owner &rhs) { if( this==&rhs ) return *this; alglib_impl::_multilayerperceptron_clear(p_struct); alglib_impl::_multilayerperceptron_init_copy(p_struct, const_cast(rhs.p_struct), NULL); return *this; } _multilayerperceptron_owner::~_multilayerperceptron_owner() { alglib_impl::_multilayerperceptron_clear(p_struct); ae_free(p_struct); } alglib_impl::multilayerperceptron* _multilayerperceptron_owner::c_ptr() { return p_struct; } alglib_impl::multilayerperceptron* _multilayerperceptron_owner::c_ptr() const { return const_cast(p_struct); } multilayerperceptron::multilayerperceptron() : _multilayerperceptron_owner() { } multilayerperceptron::multilayerperceptron(const multilayerperceptron &rhs):_multilayerperceptron_owner(rhs) { } multilayerperceptron& multilayerperceptron::operator=(const multilayerperceptron &rhs) { if( this==&rhs ) return *this; _multilayerperceptron_owner::operator=(rhs); return *this; } multilayerperceptron::~multilayerperceptron() { } /************************************************************************* This function serializes data structure to string. Important properties of s_out: * it contains alphanumeric characters, dots, underscores, minus signs * these symbols are grouped into words, which are separated by spaces and Windows-style (CR+LF) newlines * although serializer uses spaces and CR+LF as separators, you can replace any separator character by arbitrary combination of spaces, tabs, Windows or Unix newlines. It allows flexible reformatting of the string in case you want to include it into text or XML file. But you should not insert separators into the middle of the "words" nor you should change case of letters. * s_out can be freely moved between 32-bit and 64-bit systems, little and big endian machines, and so on. You can serialize structure on 32-bit machine and unserialize it on 64-bit one (or vice versa), or serialize it on SPARC and unserialize on x86. You can also serialize it in C++ version of ALGLIB and unserialize in C# one, and vice versa. 
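A minimal round-trip sketch of the string-based serializer described above (illustrative only, not part of the library source; the 2-5-1 geometry and the names "net", "restored" and "buf" are arbitrary):

    #include "dataanalysis.h"
    #include <string>
    using namespace alglib;

    int main()
    {
        multilayerperceptron net, restored;
        mlpcreate1(2, 5, 1, net);        // 2 inputs, 5 hidden neurons, 1 output (arbitrary geometry)

        std::string buf;
        mlpserialize(net, buf);          // network -> portable text representation
        mlpunserialize(buf, restored);   // text representation -> equivalent network object
        return 0;
    }
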
*************************************************************************/ void mlpserialize(multilayerperceptron &obj, std::string &s_out) { alglib_impl::ae_state state; alglib_impl::ae_serializer serializer; alglib_impl::ae_int_t ssize; alglib_impl::ae_state_init(&state); try { alglib_impl::ae_serializer_init(&serializer); alglib_impl::ae_serializer_alloc_start(&serializer); alglib_impl::mlpalloc(&serializer, obj.c_ptr(), &state); ssize = alglib_impl::ae_serializer_get_alloc_size(&serializer); s_out.clear(); s_out.reserve((size_t)(ssize+1)); alglib_impl::ae_serializer_sstart_str(&serializer, &s_out); alglib_impl::mlpserialize(&serializer, obj.c_ptr(), &state); alglib_impl::ae_serializer_stop(&serializer, &state); if( s_out.length()>(size_t)ssize ) throw ap_error("ALGLIB: serialization integrity error"); alglib_impl::ae_serializer_clear(&serializer); alglib_impl::ae_state_clear(&state); } catch(alglib_impl::ae_error_type) { throw ap_error(state.error_msg); } } /************************************************************************* This function unserializes data structure from string. *************************************************************************/ void mlpunserialize(const std::string &s_in, multilayerperceptron &obj) { alglib_impl::ae_state state; alglib_impl::ae_serializer serializer; alglib_impl::ae_state_init(&state); try { alglib_impl::ae_serializer_init(&serializer); alglib_impl::ae_serializer_ustart_str(&serializer, &s_in); alglib_impl::mlpunserialize(&serializer, obj.c_ptr(), &state); alglib_impl::ae_serializer_stop(&serializer, &state); alglib_impl::ae_serializer_clear(&serializer); alglib_impl::ae_state_clear(&state); } catch(alglib_impl::ae_error_type) { throw ap_error(state.error_msg); } } /************************************************************************* This function serializes data structure to C++ stream. Data stream generated by this function is same as string representation generated by string version of serializer - alphanumeric characters, dots, underscores, minus signs, which are grouped into words separated by spaces and CR+LF. We recommend you to read comments on string version of serializer to find out more about serialization of AlGLIB objects. *************************************************************************/ void mlpserialize(multilayerperceptron &obj, std::ostream &s_out) { alglib_impl::ae_state state; alglib_impl::ae_serializer serializer; alglib_impl::ae_state_init(&state); try { alglib_impl::ae_serializer_init(&serializer); alglib_impl::ae_serializer_alloc_start(&serializer); alglib_impl::mlpalloc(&serializer, obj.c_ptr(), &state); alglib_impl::ae_serializer_get_alloc_size(&serializer); // not actually needed, but we have to ask alglib_impl::ae_serializer_sstart_stream(&serializer, &s_out); alglib_impl::mlpserialize(&serializer, obj.c_ptr(), &state); alglib_impl::ae_serializer_stop(&serializer, &state); alglib_impl::ae_serializer_clear(&serializer); alglib_impl::ae_state_clear(&state); } catch(alglib_impl::ae_error_type) { throw ap_error(state.error_msg); } } /************************************************************************* This function unserializes data structure from stream. 
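For example, the stream overloads can be used to keep a network in a file. The following is an illustrative sketch rather than library code; "net.dat" and the 2-5-1 geometry are arbitrary assumptions.

    #include "dataanalysis.h"
    #include <fstream>
    using namespace alglib;

    int main()
    {
        multilayerperceptron net, restored;
        mlpcreate1(2, 5, 1, net);                 // arbitrary example geometry

        {
            std::ofstream out("net.dat");
            mlpserialize(net, out);               // write textual representation to the file
        }
        {
            std::ifstream in("net.dat");
            mlpunserialize(in, restored);         // read it back into another network object
        }
        return 0;
    }
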
*************************************************************************/ void mlpunserialize(const std::istream &s_in, multilayerperceptron &obj) { alglib_impl::ae_state state; alglib_impl::ae_serializer serializer; alglib_impl::ae_state_init(&state); try { alglib_impl::ae_serializer_init(&serializer); alglib_impl::ae_serializer_ustart_stream(&serializer, &s_in); alglib_impl::mlpunserialize(&serializer, obj.c_ptr(), &state); alglib_impl::ae_serializer_stop(&serializer, &state); alglib_impl::ae_serializer_clear(&serializer); alglib_impl::ae_state_clear(&state); } catch(alglib_impl::ae_error_type) { throw ap_error(state.error_msg); } } /************************************************************************* Creates neural network with NIn inputs, NOut outputs, without hidden layers, with linear output layer. Network weights are filled with small random values. -- ALGLIB -- Copyright 04.11.2007 by Bochkanov Sergey *************************************************************************/ void mlpcreate0(const ae_int_t nin, const ae_int_t nout, multilayerperceptron &network) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::mlpcreate0(nin, nout, const_cast(network.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Same as MLPCreate0, but with one hidden layer (NHid neurons) with non-linear activation function. Output layer is linear. -- ALGLIB -- Copyright 04.11.2007 by Bochkanov Sergey *************************************************************************/ void mlpcreate1(const ae_int_t nin, const ae_int_t nhid, const ae_int_t nout, multilayerperceptron &network) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::mlpcreate1(nin, nhid, nout, const_cast(network.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Same as MLPCreate0, but with two hidden layers (NHid1 and NHid2 neurons) with non-linear activation function. Output layer is linear. $ALL -- ALGLIB -- Copyright 04.11.2007 by Bochkanov Sergey *************************************************************************/ void mlpcreate2(const ae_int_t nin, const ae_int_t nhid1, const ae_int_t nhid2, const ae_int_t nout, multilayerperceptron &network) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::mlpcreate2(nin, nhid1, nhid2, nout, const_cast(network.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Creates neural network with NIn inputs, NOut outputs, without hidden layers with non-linear output layer. Network weights are filled with small random values. Activation function of the output layer takes values: (B, +INF), if D>=0 or (-INF, B), if D<0. 
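For instance, passing B=0 and D=+1 gives a network whose single output is constrained to the interval (0,+INF). The snippet below is a hedged usage sketch, not part of the library source; the 3-input geometry is an arbitrary assumption.

    #include "dataanalysis.h"
    using namespace alglib;

    int main()
    {
        multilayerperceptron net;
        // 3 inputs, 1 output, no hidden layer; B=0.0, D=1.0 -> output activation maps to (0,+INF)
        mlpcreateb0(3, 1, 0.0, 1.0, net);
        return 0;
    }
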
-- ALGLIB -- Copyright 30.03.2008 by Bochkanov Sergey *************************************************************************/ void mlpcreateb0(const ae_int_t nin, const ae_int_t nout, const double b, const double d, multilayerperceptron &network) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::mlpcreateb0(nin, nout, b, d, const_cast(network.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Same as MLPCreateB0 but with non-linear hidden layer. -- ALGLIB -- Copyright 30.03.2008 by Bochkanov Sergey *************************************************************************/ void mlpcreateb1(const ae_int_t nin, const ae_int_t nhid, const ae_int_t nout, const double b, const double d, multilayerperceptron &network) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::mlpcreateb1(nin, nhid, nout, b, d, const_cast(network.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Same as MLPCreateB0 but with two non-linear hidden layers. -- ALGLIB -- Copyright 30.03.2008 by Bochkanov Sergey *************************************************************************/ void mlpcreateb2(const ae_int_t nin, const ae_int_t nhid1, const ae_int_t nhid2, const ae_int_t nout, const double b, const double d, multilayerperceptron &network) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::mlpcreateb2(nin, nhid1, nhid2, nout, b, d, const_cast(network.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Creates neural network with NIn inputs, NOut outputs, without hidden layers with non-linear output layer. Network weights are filled with small random values. Activation function of the output layer takes values [A,B]. -- ALGLIB -- Copyright 30.03.2008 by Bochkanov Sergey *************************************************************************/ void mlpcreater0(const ae_int_t nin, const ae_int_t nout, const double a, const double b, multilayerperceptron &network) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::mlpcreater0(nin, nout, a, b, const_cast(network.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Same as MLPCreateR0, but with non-linear hidden layer. 
-- ALGLIB -- Copyright 30.03.2008 by Bochkanov Sergey *************************************************************************/ void mlpcreater1(const ae_int_t nin, const ae_int_t nhid, const ae_int_t nout, const double a, const double b, multilayerperceptron &network) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::mlpcreater1(nin, nhid, nout, a, b, const_cast(network.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Same as MLPCreateR0, but with two non-linear hidden layers. -- ALGLIB -- Copyright 30.03.2008 by Bochkanov Sergey *************************************************************************/ void mlpcreater2(const ae_int_t nin, const ae_int_t nhid1, const ae_int_t nhid2, const ae_int_t nout, const double a, const double b, multilayerperceptron &network) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::mlpcreater2(nin, nhid1, nhid2, nout, a, b, const_cast(network.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Creates classifier network with NIn inputs and NOut possible classes. Network contains no hidden layers and linear output layer with SOFTMAX- normalization (so outputs sums up to 1.0 and converge to posterior probabilities). -- ALGLIB -- Copyright 04.11.2007 by Bochkanov Sergey *************************************************************************/ void mlpcreatec0(const ae_int_t nin, const ae_int_t nout, multilayerperceptron &network) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::mlpcreatec0(nin, nout, const_cast(network.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Same as MLPCreateC0, but with one non-linear hidden layer. -- ALGLIB -- Copyright 04.11.2007 by Bochkanov Sergey *************************************************************************/ void mlpcreatec1(const ae_int_t nin, const ae_int_t nhid, const ae_int_t nout, multilayerperceptron &network) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::mlpcreatec1(nin, nhid, nout, const_cast(network.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Same as MLPCreateC0, but with two non-linear hidden layers. 
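A small sketch of how a classifier network of this family might be constructed and inspected (illustrative only, not part of the library source; the 4-input / 3-class geometry is an arbitrary assumption):

    #include "dataanalysis.h"
    #include <cstdio>
    using namespace alglib;

    int main()
    {
        multilayerperceptron net;
        mlpcreatec1(4, 8, 3, net);       // 4 inputs, 8 hidden neurons, 3 classes (arbitrary sizes)
        // classifier networks are SOFTMAX-normalized, so their outputs sum to 1.0
        printf("softmax: %s\n", mlpissoftmax(net) ? "yes" : "no");
        return 0;
    }
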
-- ALGLIB -- Copyright 04.11.2007 by Bochkanov Sergey *************************************************************************/ void mlpcreatec2(const ae_int_t nin, const ae_int_t nhid1, const ae_int_t nhid2, const ae_int_t nout, multilayerperceptron &network) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::mlpcreatec2(nin, nhid1, nhid2, nout, const_cast(network.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Copying of neural network INPUT PARAMETERS: Network1 - original OUTPUT PARAMETERS: Network2 - copy -- ALGLIB -- Copyright 04.11.2007 by Bochkanov Sergey *************************************************************************/ void mlpcopy(const multilayerperceptron &network1, multilayerperceptron &network2) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::mlpcopy(const_cast(network1.c_ptr()), const_cast(network2.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function copies tunable parameters (weights/means/sigmas) from one network to another with same architecture. It performs some rudimentary checks that architectures are same, and throws exception if check fails. It is intended for fast copying of states between two network which are known to have same geometry. INPUT PARAMETERS: Network1 - source, must be correctly initialized Network2 - target, must have same architecture OUTPUT PARAMETERS: Network2 - network state is copied from source to target -- ALGLIB -- Copyright 20.06.2013 by Bochkanov Sergey *************************************************************************/ void mlpcopytunableparameters(const multilayerperceptron &network1, const multilayerperceptron &network2) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::mlpcopytunableparameters(const_cast(network1.c_ptr()), const_cast(network2.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Randomization of neural network weights -- ALGLIB -- Copyright 06.11.2007 by Bochkanov Sergey *************************************************************************/ void mlprandomize(const multilayerperceptron &network) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::mlprandomize(const_cast(network.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Randomization of neural network weights and standartisator -- ALGLIB -- Copyright 10.03.2008 by Bochkanov Sergey *************************************************************************/ void mlprandomizefull(const multilayerperceptron &network) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { 
alglib_impl::mlprandomizefull(const_cast(network.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Internal subroutine. -- ALGLIB -- Copyright 30.03.2008 by Bochkanov Sergey *************************************************************************/ void mlpinitpreprocessor(const multilayerperceptron &network, const real_2d_array &xy, const ae_int_t ssize) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::mlpinitpreprocessor(const_cast(network.c_ptr()), const_cast(xy.c_ptr()), ssize, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Returns information about initialized network: number of inputs, outputs, weights. -- ALGLIB -- Copyright 04.11.2007 by Bochkanov Sergey *************************************************************************/ void mlpproperties(const multilayerperceptron &network, ae_int_t &nin, ae_int_t &nout, ae_int_t &wcount) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::mlpproperties(const_cast(network.c_ptr()), &nin, &nout, &wcount, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Returns number of inputs. -- ALGLIB -- Copyright 19.10.2011 by Bochkanov Sergey *************************************************************************/ ae_int_t mlpgetinputscount(const multilayerperceptron &network) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::ae_int_t result = alglib_impl::mlpgetinputscount(const_cast(network.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Returns number of outputs. -- ALGLIB -- Copyright 19.10.2011 by Bochkanov Sergey *************************************************************************/ ae_int_t mlpgetoutputscount(const multilayerperceptron &network) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::ae_int_t result = alglib_impl::mlpgetoutputscount(const_cast(network.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Returns number of weights. 
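A short sketch of how these introspection calls might be combined (illustrative only, not part of the library source; the 4-16-8-2 geometry is an arbitrary assumption):

    #include "dataanalysis.h"
    #include <cstdio>
    using namespace alglib;

    int main()
    {
        multilayerperceptron net;
        mlpcreate2(4, 16, 8, 2, net);    // 4 inputs, two hidden layers, 2 outputs (arbitrary sizes)

        ae_int_t nin, nout, wcount;
        mlpproperties(net, nin, nout, wcount);
        printf("inputs=%d outputs=%d weights=%d\n", int(nin), int(nout), int(wcount));

        // layer-by-layer sizes, including input and output layers
        for(ae_int_t k=0; k<mlpgetlayerscount(net); k++)
            printf("layer %d: %d neurons\n", int(k), int(mlpgetlayersize(net, k)));
        return 0;
    }
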
-- ALGLIB -- Copyright 19.10.2011 by Bochkanov Sergey *************************************************************************/ ae_int_t mlpgetweightscount(const multilayerperceptron &network) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::ae_int_t result = alglib_impl::mlpgetweightscount(const_cast(network.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Tells whether network is SOFTMAX-normalized (i.e. classifier) or not. -- ALGLIB -- Copyright 04.11.2007 by Bochkanov Sergey *************************************************************************/ bool mlpissoftmax(const multilayerperceptron &network) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { ae_bool result = alglib_impl::mlpissoftmax(const_cast(network.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function returns total number of layers (including input, hidden and output layers). -- ALGLIB -- Copyright 25.03.2011 by Bochkanov Sergey *************************************************************************/ ae_int_t mlpgetlayerscount(const multilayerperceptron &network) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::ae_int_t result = alglib_impl::mlpgetlayerscount(const_cast(network.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function returns size of K-th layer. K=0 corresponds to input layer, K=CNT-1 corresponds to output layer. Size of the output layer is always equal to the number of outputs, although when we have softmax-normalized network, last neuron doesn't have any connections - it is just zero. -- ALGLIB -- Copyright 25.03.2011 by Bochkanov Sergey *************************************************************************/ ae_int_t mlpgetlayersize(const multilayerperceptron &network, const ae_int_t k) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::ae_int_t result = alglib_impl::mlpgetlayersize(const_cast(network.c_ptr()), k, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function returns offset/scaling coefficients for I-th input of the network. INPUT PARAMETERS: Network - network I - input index OUTPUT PARAMETERS: Mean - mean term Sigma - sigma term, guaranteed to be nonzero. 
I-th input is passed through linear transformation IN[i] = (IN[i]-Mean)/Sigma before feeding to the network -- ALGLIB -- Copyright 25.03.2011 by Bochkanov Sergey *************************************************************************/ void mlpgetinputscaling(const multilayerperceptron &network, const ae_int_t i, double &mean, double &sigma) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::mlpgetinputscaling(const_cast(network.c_ptr()), i, &mean, &sigma, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function returns offset/scaling coefficients for I-th output of the network. INPUT PARAMETERS: Network - network I - input index OUTPUT PARAMETERS: Mean - mean term Sigma - sigma term, guaranteed to be nonzero. I-th output is passed through linear transformation OUT[i] = OUT[i]*Sigma+Mean before returning it to user. In case we have SOFTMAX-normalized network, we return (Mean,Sigma)=(0.0,1.0). -- ALGLIB -- Copyright 25.03.2011 by Bochkanov Sergey *************************************************************************/ void mlpgetoutputscaling(const multilayerperceptron &network, const ae_int_t i, double &mean, double &sigma) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::mlpgetoutputscaling(const_cast(network.c_ptr()), i, &mean, &sigma, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function returns information about Ith neuron of Kth layer INPUT PARAMETERS: Network - network K - layer index I - neuron index (within layer) OUTPUT PARAMETERS: FKind - activation function type (used by MLPActivationFunction()) this value is zero for input or linear neurons Threshold - also called offset, bias zero for input neurons NOTE: this function throws exception if layer or neuron with given index do not exists. -- ALGLIB -- Copyright 25.03.2011 by Bochkanov Sergey *************************************************************************/ void mlpgetneuroninfo(const multilayerperceptron &network, const ae_int_t k, const ae_int_t i, ae_int_t &fkind, double &threshold) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::mlpgetneuroninfo(const_cast(network.c_ptr()), k, i, &fkind, &threshold, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function returns information about connection from I0-th neuron of K0-th layer to I1-th neuron of K1-th layer. INPUT PARAMETERS: Network - network K0 - layer index I0 - neuron index (within layer) K1 - layer index I1 - neuron index (within layer) RESULT: connection weight (zero for non-existent connections) This function: 1. throws exception if layer or neuron with given index do not exists. 2. 
returns zero if neurons exist, but there is no connection between them -- ALGLIB -- Copyright 25.03.2011 by Bochkanov Sergey *************************************************************************/ double mlpgetweight(const multilayerperceptron &network, const ae_int_t k0, const ae_int_t i0, const ae_int_t k1, const ae_int_t i1) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::mlpgetweight(const_cast(network.c_ptr()), k0, i0, k1, i1, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function sets offset/scaling coefficients for I-th input of the network. INPUT PARAMETERS: Network - network I - input index Mean - mean term Sigma - sigma term (if zero, will be replaced by 1.0) NTE: I-th input is passed through linear transformation IN[i] = (IN[i]-Mean)/Sigma before feeding to the network. This function sets Mean and Sigma. -- ALGLIB -- Copyright 25.03.2011 by Bochkanov Sergey *************************************************************************/ void mlpsetinputscaling(const multilayerperceptron &network, const ae_int_t i, const double mean, const double sigma) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::mlpsetinputscaling(const_cast(network.c_ptr()), i, mean, sigma, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function sets offset/scaling coefficients for I-th output of the network. INPUT PARAMETERS: Network - network I - input index Mean - mean term Sigma - sigma term (if zero, will be replaced by 1.0) OUTPUT PARAMETERS: NOTE: I-th output is passed through linear transformation OUT[i] = OUT[i]*Sigma+Mean before returning it to user. This function sets Sigma/Mean. In case we have SOFTMAX-normalized network, you can not set (Sigma,Mean) to anything other than(0.0,1.0) - this function will throw exception. -- ALGLIB -- Copyright 25.03.2011 by Bochkanov Sergey *************************************************************************/ void mlpsetoutputscaling(const multilayerperceptron &network, const ae_int_t i, const double mean, const double sigma) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::mlpsetoutputscaling(const_cast(network.c_ptr()), i, mean, sigma, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function modifies information about Ith neuron of Kth layer INPUT PARAMETERS: Network - network K - layer index I - neuron index (within layer) FKind - activation function type (used by MLPActivationFunction()) this value must be zero for input neurons (you can not set activation function for input neurons) Threshold - also called offset, bias this value must be zero for input neurons (you can not set threshold for input neurons) NOTES: 1. this function throws exception if layer or neuron with given index do not exists. 2. 
this function also throws exception when you try to set non-linear activation function for input neurons (any kind of network) or for output neurons of classifier network. 3. this function throws exception when you try to set non-zero threshold for input neurons (any kind of network). -- ALGLIB -- Copyright 25.03.2011 by Bochkanov Sergey *************************************************************************/ void mlpsetneuroninfo(const multilayerperceptron &network, const ae_int_t k, const ae_int_t i, const ae_int_t fkind, const double threshold) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::mlpsetneuroninfo(const_cast(network.c_ptr()), k, i, fkind, threshold, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function modifies information about connection from I0-th neuron of K0-th layer to I1-th neuron of K1-th layer. INPUT PARAMETERS: Network - network K0 - layer index I0 - neuron index (within layer) K1 - layer index I1 - neuron index (within layer) W - connection weight (must be zero for non-existent connections) This function: 1. throws exception if layer or neuron with given index do not exists. 2. throws exception if you try to set non-zero weight for non-existent connection -- ALGLIB -- Copyright 25.03.2011 by Bochkanov Sergey *************************************************************************/ void mlpsetweight(const multilayerperceptron &network, const ae_int_t k0, const ae_int_t i0, const ae_int_t k1, const ae_int_t i1, const double w) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::mlpsetweight(const_cast(network.c_ptr()), k0, i0, k1, i1, w, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Neural network activation function INPUT PARAMETERS: NET - neuron input K - function index (zero for linear function) OUTPUT PARAMETERS: F - function DF - its derivative D2F - its second derivative -- ALGLIB -- Copyright 04.11.2007 by Bochkanov Sergey *************************************************************************/ void mlpactivationfunction(const double net, const ae_int_t k, double &f, double &df, double &d2f) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::mlpactivationfunction(net, k, &f, &df, &d2f, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Procesing INPUT PARAMETERS: Network - neural network X - input vector, array[0..NIn-1]. OUTPUT PARAMETERS: Y - result. Regression estimate when solving regression task, vector of posterior probabilities for classification task. 
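USAGE SKETCH (informal; assumes a classifier built with mlpcreatec1, one of
the network construction functions documented elsewhere in this manual):

    alglib::multilayerperceptron net;
    alglib::mlpcreatec1(4, 8, 3, net);            // 4 inputs, 8 hidden neurons, 3 classes
    alglib::real_1d_array x = "[0.1, 0.2, 0.3, 0.4]";
    alglib::real_1d_array y;
    y.setlength(3);                               // one entry per class
    alglib::mlpprocess(net, x, y);                // y[0..2] are posterior probabilities, sum to 1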
See also MLPProcessI -- ALGLIB -- Copyright 04.11.2007 by Bochkanov Sergey *************************************************************************/ void mlpprocess(const multilayerperceptron &network, const real_1d_array &x, real_1d_array &y) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::mlpprocess(const_cast(network.c_ptr()), const_cast(x.c_ptr()), const_cast(y.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* 'interactive' variant of MLPProcess for languages like Python which support constructs like "Y = MLPProcess(NN,X)" and interactive mode of the interpreter This function allocates new array on each call, so it is significantly slower than its 'non-interactive' counterpart, but it is more convenient when you call it from command line. -- ALGLIB -- Copyright 21.09.2010 by Bochkanov Sergey *************************************************************************/ void mlpprocessi(const multilayerperceptron &network, const real_1d_array &x, real_1d_array &y) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::mlpprocessi(const_cast(network.c_ptr()), const_cast(x.c_ptr()), const_cast(y.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Error of the neural network on dataset. FOR USERS OF COMMERCIAL EDITION: ! Commercial version of ALGLIB includes two important improvements of ! this function: ! * multicore support (C++ and C# computational cores) ! * SSE support ! ! First improvement gives close-to-linear speedup on multicore systems. ! Second improvement gives constant speedup (2-3x, depending on your CPU) ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! In order to use SSE features you have to: ! * use commercial version of ALGLIB on Intel processors ! * use C++ computational core ! ! This note is given for users of commercial edition; if you use GPL ! edition, you still will be able to call smp-version of this function, ! but all computations will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. INPUT PARAMETERS: Network - neural network; XY - training set, see below for information on the training set format; NPoints - points count. RESULT: sum-of-squares error, SUM(sqr(y[i]-desired_y[i])/2) DATASET FORMAT: This function uses two different dataset formats - one for regression networks, another one for classification networks. 
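USAGE SKETCH (informal; assumes a regression network built with mlpcreate1,
documented elsewhere in this manual, and a small dense dataset in the
regression format described below):

    alglib::multilayerperceptron net;
    alglib::mlpcreate1(2, 5, 1, net);                               // NIn=2, NOut=1
    alglib::real_2d_array xy = "[[0,0,0],[0,1,1],[1,0,1],[1,1,0]]"; // NIn+NOut=3 columns
    double e = alglib::mlperror(net, xy, 4);                        // sum-of-squares error over 4 points
    // double esmp = alglib::smp_mlperror(net, xy, 4);              // same value; multicore in commercial edition

Both dataset formats are described next.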
For regression networks with NIn inputs and NOut outputs following dataset format is used: * dataset is given by NPoints*(NIn+NOut) matrix * each row corresponds to one example * first NIn columns are inputs, next NOut columns are outputs For classification networks with NIn inputs and NClasses clases following dataset format is used: * dataset is given by NPoints*(NIn+1) matrix * each row corresponds to one example * first NIn columns are inputs, last column stores class number (from 0 to NClasses-1). -- ALGLIB -- Copyright 04.11.2007 by Bochkanov Sergey *************************************************************************/ double mlperror(const multilayerperceptron &network, const real_2d_array &xy, const ae_int_t npoints) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::mlperror(const_cast(network.c_ptr()), const_cast(xy.c_ptr()), npoints, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } double smp_mlperror(const multilayerperceptron &network, const real_2d_array &xy, const ae_int_t npoints) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::_pexec_mlperror(const_cast(network.c_ptr()), const_cast(xy.c_ptr()), npoints, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Error of the neural network on dataset given by sparse matrix. FOR USERS OF COMMERCIAL EDITION: ! Commercial version of ALGLIB includes two important improvements of ! this function: ! * multicore support (C++ and C# computational cores) ! * SSE support ! ! First improvement gives close-to-linear speedup on multicore systems. ! Second improvement gives constant speedup (2-3x, depending on your CPU) ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! In order to use SSE features you have to: ! * use commercial version of ALGLIB on Intel processors ! * use C++ computational core ! ! This note is given for users of commercial edition; if you use GPL ! edition, you still will be able to call smp-version of this function, ! but all computations will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. INPUT PARAMETERS: Network - neural network XY - training set, see below for information on the training set format. This function checks correctness of the dataset (no NANs/INFs, class numbers are correct) and throws exception when incorrect dataset is passed. Sparse matrix must use CRS format for storage. NPoints - points count, >=0 RESULT: sum-of-squares error, SUM(sqr(y[i]-desired_y[i])/2) DATASET FORMAT: This function uses two different dataset formats - one for regression networks, another one for classification networks. 
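USAGE SKETCH (informal; the sparsecreate/sparseset/sparseconverttocrs calls
below come from the sparse matrix subpackage of the LinAlg package, and the
network is built with mlpcreate1, documented elsewhere in this manual):

    alglib::multilayerperceptron net;
    alglib::mlpcreate1(2, 5, 1, net);
    alglib::sparsematrix xy;
    alglib::sparsecreate(4, 3, xy);                // 4 samples, NIn+NOut=3 columns
    alglib::sparseset(xy, 1, 1, 1.0); alglib::sparseset(xy, 1, 2, 1.0);
    alglib::sparseset(xy, 2, 0, 1.0); alglib::sparseset(xy, 2, 2, 1.0);
    alglib::sparseset(xy, 3, 0, 1.0); alglib::sparseset(xy, 3, 1, 1.0);
    alglib::sparseconverttocrs(xy);                // this function requires CRS storage
    double e = alglib::mlperrorsparse(net, xy, 4);

Both dataset formats are described next.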
For regression networks with NIn inputs and NOut outputs following dataset format is used: * dataset is given by NPoints*(NIn+NOut) matrix * each row corresponds to one example * first NIn columns are inputs, next NOut columns are outputs For classification networks with NIn inputs and NClasses clases following dataset format is used: * dataset is given by NPoints*(NIn+1) matrix * each row corresponds to one example * first NIn columns are inputs, last column stores class number (from 0 to NClasses-1). -- ALGLIB -- Copyright 23.07.2012 by Bochkanov Sergey *************************************************************************/ double mlperrorsparse(const multilayerperceptron &network, const sparsematrix &xy, const ae_int_t npoints) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::mlperrorsparse(const_cast(network.c_ptr()), const_cast(xy.c_ptr()), npoints, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } double smp_mlperrorsparse(const multilayerperceptron &network, const sparsematrix &xy, const ae_int_t npoints) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::_pexec_mlperrorsparse(const_cast(network.c_ptr()), const_cast(xy.c_ptr()), npoints, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Natural error function for neural network, internal subroutine. NOTE: this function is single-threaded. Unlike other error function, it receives no speed-up from being executed in SMP mode. -- ALGLIB -- Copyright 04.11.2007 by Bochkanov Sergey *************************************************************************/ double mlperrorn(const multilayerperceptron &network, const real_2d_array &xy, const ae_int_t ssize) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::mlperrorn(const_cast(network.c_ptr()), const_cast(xy.c_ptr()), ssize, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Classification error of the neural network on dataset. FOR USERS OF COMMERCIAL EDITION: ! Commercial version of ALGLIB includes two important improvements of ! this function: ! * multicore support (C++ and C# computational cores) ! * SSE support ! ! First improvement gives close-to-linear speedup on multicore systems. ! Second improvement gives constant speedup (2-3x depending on your CPU) ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! In order to use SSE features you have to: ! * use commercial version of ALGLIB on Intel processors ! * use C++ computational core ! ! This note is given for users of commercial edition; if you use GPL ! edition, you still will be able to call smp-version of this function, ! but all computations will be done serially. ! ! 
We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. INPUT PARAMETERS: Network - neural network; XY - training set, see below for information on the training set format; NPoints - points count. RESULT: classification error (number of misclassified cases) DATASET FORMAT: This function uses two different dataset formats - one for regression networks, another one for classification networks. For regression networks with NIn inputs and NOut outputs following dataset format is used: * dataset is given by NPoints*(NIn+NOut) matrix * each row corresponds to one example * first NIn columns are inputs, next NOut columns are outputs For classification networks with NIn inputs and NClasses clases following dataset format is used: * dataset is given by NPoints*(NIn+1) matrix * each row corresponds to one example * first NIn columns are inputs, last column stores class number (from 0 to NClasses-1). -- ALGLIB -- Copyright 04.11.2007 by Bochkanov Sergey *************************************************************************/ ae_int_t mlpclserror(const multilayerperceptron &network, const real_2d_array &xy, const ae_int_t npoints) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::ae_int_t result = alglib_impl::mlpclserror(const_cast(network.c_ptr()), const_cast(xy.c_ptr()), npoints, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } ae_int_t smp_mlpclserror(const multilayerperceptron &network, const real_2d_array &xy, const ae_int_t npoints) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::ae_int_t result = alglib_impl::_pexec_mlpclserror(const_cast(network.c_ptr()), const_cast(xy.c_ptr()), npoints, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Relative classification error on the test set. FOR USERS OF COMMERCIAL EDITION: ! Commercial version of ALGLIB includes two important improvements of ! this function: ! * multicore support (C++ and C# computational cores) ! * SSE support ! ! First improvement gives close-to-linear speedup on multicore systems. ! Second improvement gives constant speedup (2-3x depending on your CPU) ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! In order to use SSE features you have to: ! * use commercial version of ALGLIB on Intel processors ! * use C++ computational core ! ! This note is given for users of commercial edition; if you use GPL ! edition, you still will be able to call smp-version of this function, ! but all computations will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. INPUT PARAMETERS: Network - neural network; XY - training set, see below for information on the training set format; NPoints - points count. RESULT: Percent of incorrectly classified cases. 
Works both for classifier networks and general purpose networks used as classifiers. DATASET FORMAT: This function uses two different dataset formats - one for regression networks, another one for classification networks. For regression networks with NIn inputs and NOut outputs following dataset format is used: * dataset is given by NPoints*(NIn+NOut) matrix * each row corresponds to one example * first NIn columns are inputs, next NOut columns are outputs For classification networks with NIn inputs and NClasses clases following dataset format is used: * dataset is given by NPoints*(NIn+1) matrix * each row corresponds to one example * first NIn columns are inputs, last column stores class number (from 0 to NClasses-1). -- ALGLIB -- Copyright 25.12.2008 by Bochkanov Sergey *************************************************************************/ double mlprelclserror(const multilayerperceptron &network, const real_2d_array &xy, const ae_int_t npoints) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::mlprelclserror(const_cast(network.c_ptr()), const_cast(xy.c_ptr()), npoints, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } double smp_mlprelclserror(const multilayerperceptron &network, const real_2d_array &xy, const ae_int_t npoints) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::_pexec_mlprelclserror(const_cast(network.c_ptr()), const_cast(xy.c_ptr()), npoints, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Relative classification error on the test set given by sparse matrix. FOR USERS OF COMMERCIAL EDITION: ! Commercial version of ALGLIB includes two important improvements of ! this function: ! * multicore support (C++ and C# computational cores) ! * SSE support ! ! First improvement gives close-to-linear speedup on multicore systems. ! Second improvement gives constant speedup (2-3x depending on your CPU) ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! In order to use SSE features you have to: ! * use commercial version of ALGLIB on Intel processors ! * use C++ computational core ! ! This note is given for users of commercial edition; if you use GPL ! edition, you still will be able to call smp-version of this function, ! but all computations will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. INPUT PARAMETERS: Network - neural network; XY - training set, see below for information on the training set format. Sparse matrix must use CRS format for storage. NPoints - points count, >=0. RESULT: Percent of incorrectly classified cases. Works both for classifier networks and general purpose networks used as classifiers. DATASET FORMAT: This function uses two different dataset formats - one for regression networks, another one for classification networks. 
For regression networks with NIn inputs and NOut outputs following dataset format is used: * dataset is given by NPoints*(NIn+NOut) matrix * each row corresponds to one example * first NIn columns are inputs, next NOut columns are outputs For classification networks with NIn inputs and NClasses clases following dataset format is used: * dataset is given by NPoints*(NIn+1) matrix * each row corresponds to one example * first NIn columns are inputs, last column stores class number (from 0 to NClasses-1). -- ALGLIB -- Copyright 09.08.2012 by Bochkanov Sergey *************************************************************************/ double mlprelclserrorsparse(const multilayerperceptron &network, const sparsematrix &xy, const ae_int_t npoints) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::mlprelclserrorsparse(const_cast(network.c_ptr()), const_cast(xy.c_ptr()), npoints, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } double smp_mlprelclserrorsparse(const multilayerperceptron &network, const sparsematrix &xy, const ae_int_t npoints) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::_pexec_mlprelclserrorsparse(const_cast(network.c_ptr()), const_cast(xy.c_ptr()), npoints, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Average cross-entropy (in bits per element) on the test set. FOR USERS OF COMMERCIAL EDITION: ! Commercial version of ALGLIB includes two important improvements of ! this function: ! * multicore support (C++ and C# computational cores) ! * SSE support ! ! First improvement gives close-to-linear speedup on multicore systems. ! Second improvement gives constant speedup (2-3x depending on your CPU) ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! In order to use SSE features you have to: ! * use commercial version of ALGLIB on Intel processors ! * use C++ computational core ! ! This note is given for users of commercial edition; if you use GPL ! edition, you still will be able to call smp-version of this function, ! but all computations will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. INPUT PARAMETERS: Network - neural network; XY - training set, see below for information on the training set format; NPoints - points count. RESULT: CrossEntropy/(NPoints*LN(2)). Zero if network solves regression task. DATASET FORMAT: This function uses two different dataset formats - one for regression networks, another one for classification networks. 
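USAGE SKETCH (informal; assumes a classifier built with mlpcreatec1,
documented elsewhere in this manual, and a dataset in the classification
format described below, with class indices in the last column):

    alglib::multilayerperceptron net;
    alglib::mlpcreatec1(2, 5, 3, net);                              // 3-class classifier
    alglib::real_2d_array xy = "[[0,0,0],[0,1,1],[1,0,1],[1,1,2]]"; // last column = class 0..2
    double avgce = alglib::mlpavgce(net, xy, 4);                    // average cross-entropy, bits per element
    alglib::ae_int_t nmisc = alglib::mlpclserror(net, xy, 4);       // number of misclassified cases
    double relcls = alglib::mlprelclserror(net, xy, 4);             // relative classification error

Both dataset formats are described next.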
For regression networks with NIn inputs and NOut outputs following dataset format is used: * dataset is given by NPoints*(NIn+NOut) matrix * each row corresponds to one example * first NIn columns are inputs, next NOut columns are outputs For classification networks with NIn inputs and NClasses clases following dataset format is used: * dataset is given by NPoints*(NIn+1) matrix * each row corresponds to one example * first NIn columns are inputs, last column stores class number (from 0 to NClasses-1). -- ALGLIB -- Copyright 08.01.2009 by Bochkanov Sergey *************************************************************************/ double mlpavgce(const multilayerperceptron &network, const real_2d_array &xy, const ae_int_t npoints) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::mlpavgce(const_cast(network.c_ptr()), const_cast(xy.c_ptr()), npoints, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } double smp_mlpavgce(const multilayerperceptron &network, const real_2d_array &xy, const ae_int_t npoints) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::_pexec_mlpavgce(const_cast(network.c_ptr()), const_cast(xy.c_ptr()), npoints, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Average cross-entropy (in bits per element) on the test set given by sparse matrix. FOR USERS OF COMMERCIAL EDITION: ! Commercial version of ALGLIB includes two important improvements of ! this function: ! * multicore support (C++ and C# computational cores) ! * SSE support ! ! First improvement gives close-to-linear speedup on multicore systems. ! Second improvement gives constant speedup (2-3x depending on your CPU) ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! In order to use SSE features you have to: ! * use commercial version of ALGLIB on Intel processors ! * use C++ computational core ! ! This note is given for users of commercial edition; if you use GPL ! edition, you still will be able to call smp-version of this function, ! but all computations will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. INPUT PARAMETERS: Network - neural network; XY - training set, see below for information on the training set format. This function checks correctness of the dataset (no NANs/INFs, class numbers are correct) and throws exception when incorrect dataset is passed. Sparse matrix must use CRS format for storage. NPoints - points count, >=0. RESULT: CrossEntropy/(NPoints*LN(2)). Zero if network solves regression task. DATASET FORMAT: This function uses two different dataset formats - one for regression networks, another one for classification networks. 
For regression networks with NIn inputs and NOut outputs following dataset format is used: * dataset is given by NPoints*(NIn+NOut) matrix * each row corresponds to one example * first NIn columns are inputs, next NOut columns are outputs For classification networks with NIn inputs and NClasses clases following dataset format is used: * dataset is given by NPoints*(NIn+1) matrix * each row corresponds to one example * first NIn columns are inputs, last column stores class number (from 0 to NClasses-1). -- ALGLIB -- Copyright 9.08.2012 by Bochkanov Sergey *************************************************************************/ double mlpavgcesparse(const multilayerperceptron &network, const sparsematrix &xy, const ae_int_t npoints) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::mlpavgcesparse(const_cast(network.c_ptr()), const_cast(xy.c_ptr()), npoints, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } double smp_mlpavgcesparse(const multilayerperceptron &network, const sparsematrix &xy, const ae_int_t npoints) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::_pexec_mlpavgcesparse(const_cast(network.c_ptr()), const_cast(xy.c_ptr()), npoints, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* RMS error on the test set given. FOR USERS OF COMMERCIAL EDITION: ! Commercial version of ALGLIB includes two important improvements of ! this function: ! * multicore support (C++ and C# computational cores) ! * SSE support ! ! First improvement gives close-to-linear speedup on multicore systems. ! Second improvement gives constant speedup (2-3x depending on your CPU) ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! In order to use SSE features you have to: ! * use commercial version of ALGLIB on Intel processors ! * use C++ computational core ! ! This note is given for users of commercial edition; if you use GPL ! edition, you still will be able to call smp-version of this function, ! but all computations will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. INPUT PARAMETERS: Network - neural network; XY - training set, see below for information on the training set format; NPoints - points count. RESULT: Root mean square error. Its meaning for regression task is obvious. As for classification task, RMS error means error when estimating posterior probabilities. DATASET FORMAT: This function uses two different dataset formats - one for regression networks, another one for classification networks. 
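USAGE SKETCH (informal; assumes a regression network built with mlpcreate1,
documented elsewhere in this manual; mlpavgerror and mlpavgrelerror are
described later in this section):

    alglib::multilayerperceptron net;
    alglib::mlpcreate1(1, 5, 1, net);
    alglib::real_2d_array xy = "[[0,0],[1,2],[2,4],[3,6]]";   // column 0 = input, column 1 = target
    double rms    = alglib::mlprmserror(net, xy, 4);
    double avg    = alglib::mlpavgerror(net, xy, 4);
    double avgrel = alglib::mlpavgrelerror(net, xy, 4);

Both dataset formats are described next.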
For regression networks with NIn inputs and NOut outputs following dataset format is used: * dataset is given by NPoints*(NIn+NOut) matrix * each row corresponds to one example * first NIn columns are inputs, next NOut columns are outputs For classification networks with NIn inputs and NClasses clases following dataset format is used: * dataset is given by NPoints*(NIn+1) matrix * each row corresponds to one example * first NIn columns are inputs, last column stores class number (from 0 to NClasses-1). -- ALGLIB -- Copyright 04.11.2007 by Bochkanov Sergey *************************************************************************/ double mlprmserror(const multilayerperceptron &network, const real_2d_array &xy, const ae_int_t npoints) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::mlprmserror(const_cast(network.c_ptr()), const_cast(xy.c_ptr()), npoints, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } double smp_mlprmserror(const multilayerperceptron &network, const real_2d_array &xy, const ae_int_t npoints) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::_pexec_mlprmserror(const_cast(network.c_ptr()), const_cast(xy.c_ptr()), npoints, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* RMS error on the test set given by sparse matrix. FOR USERS OF COMMERCIAL EDITION: ! Commercial version of ALGLIB includes two important improvements of ! this function: ! * multicore support (C++ and C# computational cores) ! * SSE support ! ! First improvement gives close-to-linear speedup on multicore systems. ! Second improvement gives constant speedup (2-3x depending on your CPU) ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! In order to use SSE features you have to: ! * use commercial version of ALGLIB on Intel processors ! * use C++ computational core ! ! This note is given for users of commercial edition; if you use GPL ! edition, you still will be able to call smp-version of this function, ! but all computations will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. INPUT PARAMETERS: Network - neural network; XY - training set, see below for information on the training set format. This function checks correctness of the dataset (no NANs/INFs, class numbers are correct) and throws exception when incorrect dataset is passed. Sparse matrix must use CRS format for storage. NPoints - points count, >=0. RESULT: Root mean square error. Its meaning for regression task is obvious. As for classification task, RMS error means error when estimating posterior probabilities. DATASET FORMAT: This function uses two different dataset formats - one for regression networks, another one for classification networks. 
For regression networks with NIn inputs and NOut outputs following dataset format is used: * dataset is given by NPoints*(NIn+NOut) matrix * each row corresponds to one example * first NIn columns are inputs, next NOut columns are outputs For classification networks with NIn inputs and NClasses clases following dataset format is used: * dataset is given by NPoints*(NIn+1) matrix * each row corresponds to one example * first NIn columns are inputs, last column stores class number (from 0 to NClasses-1). -- ALGLIB -- Copyright 09.08.2012 by Bochkanov Sergey *************************************************************************/ double mlprmserrorsparse(const multilayerperceptron &network, const sparsematrix &xy, const ae_int_t npoints) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::mlprmserrorsparse(const_cast(network.c_ptr()), const_cast(xy.c_ptr()), npoints, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } double smp_mlprmserrorsparse(const multilayerperceptron &network, const sparsematrix &xy, const ae_int_t npoints) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::_pexec_mlprmserrorsparse(const_cast(network.c_ptr()), const_cast(xy.c_ptr()), npoints, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Average absolute error on the test set. FOR USERS OF COMMERCIAL EDITION: ! Commercial version of ALGLIB includes two important improvements of ! this function: ! * multicore support (C++ and C# computational cores) ! * SSE support ! ! First improvement gives close-to-linear speedup on multicore systems. ! Second improvement gives constant speedup (2-3x depending on your CPU) ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! In order to use SSE features you have to: ! * use commercial version of ALGLIB on Intel processors ! * use C++ computational core ! ! This note is given for users of commercial edition; if you use GPL ! edition, you still will be able to call smp-version of this function, ! but all computations will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. INPUT PARAMETERS: Network - neural network; XY - training set, see below for information on the training set format; NPoints - points count. RESULT: Its meaning for regression task is obvious. As for classification task, it means average error when estimating posterior probabilities. DATASET FORMAT: This function uses two different dataset formats - one for regression networks, another one for classification networks. 
For regression networks with NIn inputs and NOut outputs following dataset format is used: * dataset is given by NPoints*(NIn+NOut) matrix * each row corresponds to one example * first NIn columns are inputs, next NOut columns are outputs For classification networks with NIn inputs and NClasses clases following dataset format is used: * dataset is given by NPoints*(NIn+1) matrix * each row corresponds to one example * first NIn columns are inputs, last column stores class number (from 0 to NClasses-1). -- ALGLIB -- Copyright 11.03.2008 by Bochkanov Sergey *************************************************************************/ double mlpavgerror(const multilayerperceptron &network, const real_2d_array &xy, const ae_int_t npoints) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::mlpavgerror(const_cast(network.c_ptr()), const_cast(xy.c_ptr()), npoints, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } double smp_mlpavgerror(const multilayerperceptron &network, const real_2d_array &xy, const ae_int_t npoints) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::_pexec_mlpavgerror(const_cast(network.c_ptr()), const_cast(xy.c_ptr()), npoints, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Average absolute error on the test set given by sparse matrix. FOR USERS OF COMMERCIAL EDITION: ! Commercial version of ALGLIB includes two important improvements of ! this function: ! * multicore support (C++ and C# computational cores) ! * SSE support ! ! First improvement gives close-to-linear speedup on multicore systems. ! Second improvement gives constant speedup (2-3x depending on your CPU) ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! In order to use SSE features you have to: ! * use commercial version of ALGLIB on Intel processors ! * use C++ computational core ! ! This note is given for users of commercial edition; if you use GPL ! edition, you still will be able to call smp-version of this function, ! but all computations will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. INPUT PARAMETERS: Network - neural network; XY - training set, see below for information on the training set format. This function checks correctness of the dataset (no NANs/INFs, class numbers are correct) and throws exception when incorrect dataset is passed. Sparse matrix must use CRS format for storage. NPoints - points count, >=0. RESULT: Its meaning for regression task is obvious. As for classification task, it means average error when estimating posterior probabilities. DATASET FORMAT: This function uses two different dataset formats - one for regression networks, another one for classification networks. 
For regression networks with NIn inputs and NOut outputs following dataset format is used: * dataset is given by NPoints*(NIn+NOut) matrix * each row corresponds to one example * first NIn columns are inputs, next NOut columns are outputs For classification networks with NIn inputs and NClasses clases following dataset format is used: * dataset is given by NPoints*(NIn+1) matrix * each row corresponds to one example * first NIn columns are inputs, last column stores class number (from 0 to NClasses-1). -- ALGLIB -- Copyright 09.08.2012 by Bochkanov Sergey *************************************************************************/ double mlpavgerrorsparse(const multilayerperceptron &network, const sparsematrix &xy, const ae_int_t npoints) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::mlpavgerrorsparse(const_cast(network.c_ptr()), const_cast(xy.c_ptr()), npoints, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } double smp_mlpavgerrorsparse(const multilayerperceptron &network, const sparsematrix &xy, const ae_int_t npoints) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::_pexec_mlpavgerrorsparse(const_cast(network.c_ptr()), const_cast(xy.c_ptr()), npoints, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Average relative error on the test set. FOR USERS OF COMMERCIAL EDITION: ! Commercial version of ALGLIB includes two important improvements of ! this function: ! * multicore support (C++ and C# computational cores) ! * SSE support ! ! First improvement gives close-to-linear speedup on multicore systems. ! Second improvement gives constant speedup (2-3x depending on your CPU) ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! In order to use SSE features you have to: ! * use commercial version of ALGLIB on Intel processors ! * use C++ computational core ! ! This note is given for users of commercial edition; if you use GPL ! edition, you still will be able to call smp-version of this function, ! but all computations will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. INPUT PARAMETERS: Network - neural network; XY - training set, see below for information on the training set format; NPoints - points count. RESULT: Its meaning for regression task is obvious. As for classification task, it means average relative error when estimating posterior probability of belonging to the correct class. DATASET FORMAT: This function uses two different dataset formats - one for regression networks, another one for classification networks. 
For regression networks with NIn inputs and NOut outputs following dataset format is used: * dataset is given by NPoints*(NIn+NOut) matrix * each row corresponds to one example * first NIn columns are inputs, next NOut columns are outputs For classification networks with NIn inputs and NClasses clases following dataset format is used: * dataset is given by NPoints*(NIn+1) matrix * each row corresponds to one example * first NIn columns are inputs, last column stores class number (from 0 to NClasses-1). -- ALGLIB -- Copyright 11.03.2008 by Bochkanov Sergey *************************************************************************/ double mlpavgrelerror(const multilayerperceptron &network, const real_2d_array &xy, const ae_int_t npoints) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::mlpavgrelerror(const_cast(network.c_ptr()), const_cast(xy.c_ptr()), npoints, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } double smp_mlpavgrelerror(const multilayerperceptron &network, const real_2d_array &xy, const ae_int_t npoints) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::_pexec_mlpavgrelerror(const_cast(network.c_ptr()), const_cast(xy.c_ptr()), npoints, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Average relative error on the test set given by sparse matrix. FOR USERS OF COMMERCIAL EDITION: ! Commercial version of ALGLIB includes two important improvements of ! this function: ! * multicore support (C++ and C# computational cores) ! * SSE support ! ! First improvement gives close-to-linear speedup on multicore systems. ! Second improvement gives constant speedup (2-3x depending on your CPU) ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! In order to use SSE features you have to: ! * use commercial version of ALGLIB on Intel processors ! * use C++ computational core ! ! This note is given for users of commercial edition; if you use GPL ! edition, you still will be able to call smp-version of this function, ! but all computations will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. INPUT PARAMETERS: Network - neural network; XY - training set, see below for information on the training set format. This function checks correctness of the dataset (no NANs/INFs, class numbers are correct) and throws exception when incorrect dataset is passed. Sparse matrix must use CRS format for storage. NPoints - points count, >=0. RESULT: Its meaning for regression task is obvious. As for classification task, it means average relative error when estimating posterior probability of belonging to the correct class. DATASET FORMAT: This function uses two different dataset formats - one for regression networks, another one for classification networks. 
For regression networks with NIn inputs and NOut outputs following dataset format is used: * dataset is given by NPoints*(NIn+NOut) matrix * each row corresponds to one example * first NIn columns are inputs, next NOut columns are outputs For classification networks with NIn inputs and NClasses clases following dataset format is used: * dataset is given by NPoints*(NIn+1) matrix * each row corresponds to one example * first NIn columns are inputs, last column stores class number (from 0 to NClasses-1). -- ALGLIB -- Copyright 09.08.2012 by Bochkanov Sergey *************************************************************************/ double mlpavgrelerrorsparse(const multilayerperceptron &network, const sparsematrix &xy, const ae_int_t npoints) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::mlpavgrelerrorsparse(const_cast(network.c_ptr()), const_cast(xy.c_ptr()), npoints, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } double smp_mlpavgrelerrorsparse(const multilayerperceptron &network, const sparsematrix &xy, const ae_int_t npoints) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::_pexec_mlpavgrelerrorsparse(const_cast(network.c_ptr()), const_cast(xy.c_ptr()), npoints, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Gradient calculation INPUT PARAMETERS: Network - network initialized with one of the network creation funcs X - input vector, length of array must be at least NIn DesiredY- desired outputs, length of array must be at least NOut Grad - possibly preallocated array. If size of array is smaller than WCount, it will be reallocated. It is recommended to reuse previously allocated array to reduce allocation overhead. OUTPUT PARAMETERS: E - error function, SUM(sqr(y[i]-desiredy[i])/2,i) Grad - gradient of E with respect to weights of network, array[WCount] -- ALGLIB -- Copyright 04.11.2007 by Bochkanov Sergey *************************************************************************/ void mlpgrad(const multilayerperceptron &network, const real_1d_array &x, const real_1d_array &desiredy, double &e, real_1d_array &grad) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::mlpgrad(const_cast(network.c_ptr()), const_cast(x.c_ptr()), const_cast(desiredy.c_ptr()), &e, const_cast(grad.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Gradient calculation (natural error function is used) INPUT PARAMETERS: Network - network initialized with one of the network creation funcs X - input vector, length of array must be at least NIn DesiredY- desired outputs, length of array must be at least NOut Grad - possibly preallocated array. If size of array is smaller than WCount, it will be reallocated. It is recommended to reuse previously allocated array to reduce allocation overhead. 
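USAGE SKETCH (informal; assumes a regression network built with mlpcreate1,
documented elsewhere in this manual; the same pattern works for MLPGrad
above, which uses the sum-of-squares error instead of the natural error):

    alglib::multilayerperceptron net;
    alglib::mlpcreate1(2, 5, 1, net);
    alglib::real_1d_array x  = "[0.5, 1.5]";       // length NIn
    alglib::real_1d_array dy = "[1.0]";            // desired output, length NOut
    alglib::real_1d_array grad;                    // allocated on first call, reused afterwards
    double e;
    alglib::mlpgradn(net, x, dy, e, grad);         // grad has length WCount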
OUTPUT PARAMETERS: E - error function, sum-of-squares for regression networks, cross-entropy for classification networks. Grad - gradient of E with respect to weights of network, array[WCount] -- ALGLIB -- Copyright 04.11.2007 by Bochkanov Sergey *************************************************************************/ void mlpgradn(const multilayerperceptron &network, const real_1d_array &x, const real_1d_array &desiredy, double &e, real_1d_array &grad) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::mlpgradn(const_cast(network.c_ptr()), const_cast(x.c_ptr()), const_cast(desiredy.c_ptr()), &e, const_cast(grad.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Batch gradient calculation for a set of inputs/outputs FOR USERS OF COMMERCIAL EDITION: ! Commercial version of ALGLIB includes two important improvements of ! this function: ! * multicore support (C++ and C# computational cores) ! * SSE support ! ! First improvement gives close-to-linear speedup on multicore systems. ! Second improvement gives constant speedup (2-3x depending on your CPU) ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! In order to use SSE features you have to: ! * use commercial version of ALGLIB on Intel processors ! * use C++ computational core ! ! This note is given for users of commercial edition; if you use GPL ! edition, you still will be able to call smp-version of this function, ! but all computations will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. INPUT PARAMETERS: Network - network initialized with one of the network creation funcs XY - original dataset in dense format; one sample = one row: * first NIn columns contain inputs, * for regression problem, next NOut columns store desired outputs. * for classification problem, next column (just one!) stores class number. SSize - number of elements in XY Grad - possibly preallocated array. If size of array is smaller than WCount, it will be reallocated. It is recommended to reuse previously allocated array to reduce allocation overhead. 
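USAGE SKETCH (informal; assumes a regression network built with mlpcreate1,
documented elsewhere in this manual, and a dense dataset in the format
described for the error functions above):

    alglib::multilayerperceptron net;
    alglib::mlpcreate1(2, 5, 1, net);
    alglib::real_2d_array xy = "[[0,0,0],[0,1,1],[1,0,1],[1,1,0]]";
    alglib::real_1d_array grad;                    // allocate once, reuse on every call
    double e;
    alglib::mlpgradbatch(net, xy, 4, e, grad);     // full-batch gradient over 4 samples
    // alglib::smp_mlpgradbatch(net, xy, 4, e, grad); // multicore variant, commercial edition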
OUTPUT PARAMETERS: E - error function, SUM(sqr(y[i]-desiredy[i])/2,i) Grad - gradient of E with respect to weights of network, array[WCount] -- ALGLIB -- Copyright 04.11.2007 by Bochkanov Sergey *************************************************************************/ void mlpgradbatch(const multilayerperceptron &network, const real_2d_array &xy, const ae_int_t ssize, double &e, real_1d_array &grad) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::mlpgradbatch(const_cast(network.c_ptr()), const_cast(xy.c_ptr()), ssize, &e, const_cast(grad.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void smp_mlpgradbatch(const multilayerperceptron &network, const real_2d_array &xy, const ae_int_t ssize, double &e, real_1d_array &grad) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::_pexec_mlpgradbatch(const_cast(network.c_ptr()), const_cast(xy.c_ptr()), ssize, &e, const_cast(grad.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Batch gradient calculation for a set of inputs/outputs given by sparse matrices FOR USERS OF COMMERCIAL EDITION: ! Commercial version of ALGLIB includes two important improvements of ! this function: ! * multicore support (C++ and C# computational cores) ! * SSE support ! ! First improvement gives close-to-linear speedup on multicore systems. ! Second improvement gives constant speedup (2-3x depending on your CPU) ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! In order to use SSE features you have to: ! * use commercial version of ALGLIB on Intel processors ! * use C++ computational core ! ! This note is given for users of commercial edition; if you use GPL ! edition, you still will be able to call smp-version of this function, ! but all computations will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. INPUT PARAMETERS: Network - network initialized with one of the network creation funcs XY - original dataset in sparse format; one sample = one row: * MATRIX MUST BE STORED IN CRS FORMAT * first NIn columns contain inputs. * for regression problem, next NOut columns store desired outputs. * for classification problem, next column (just one!) stores class number. SSize - number of elements in XY Grad - possibly preallocated array. If size of array is smaller than WCount, it will be reallocated. It is recommended to reuse previously allocated array to reduce allocation overhead. 
OUTPUT PARAMETERS: E - error function, SUM(sqr(y[i]-desiredy[i])/2,i) Grad - gradient of E with respect to weights of network, array[WCount] -- ALGLIB -- Copyright 26.07.2012 by Bochkanov Sergey *************************************************************************/ void mlpgradbatchsparse(const multilayerperceptron &network, const sparsematrix &xy, const ae_int_t ssize, double &e, real_1d_array &grad) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::mlpgradbatchsparse(const_cast(network.c_ptr()), const_cast(xy.c_ptr()), ssize, &e, const_cast(grad.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void smp_mlpgradbatchsparse(const multilayerperceptron &network, const sparsematrix &xy, const ae_int_t ssize, double &e, real_1d_array &grad) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::_pexec_mlpgradbatchsparse(const_cast(network.c_ptr()), const_cast(xy.c_ptr()), ssize, &e, const_cast(grad.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Batch gradient calculation for a subset of dataset FOR USERS OF COMMERCIAL EDITION: ! Commercial version of ALGLIB includes two important improvements of ! this function: ! * multicore support (C++ and C# computational cores) ! * SSE support ! ! First improvement gives close-to-linear speedup on multicore systems. ! Second improvement gives constant speedup (2-3x depending on your CPU) ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! In order to use SSE features you have to: ! * use commercial version of ALGLIB on Intel processors ! * use C++ computational core ! ! This note is given for users of commercial edition; if you use GPL ! edition, you still will be able to call smp-version of this function, ! but all computations will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. INPUT PARAMETERS: Network - network initialized with one of the network creation funcs XY - original dataset in dense format; one sample = one row: * first NIn columns contain inputs, * for regression problem, next NOut columns store desired outputs. * for classification problem, next column (just one!) stores class number. SetSize - real size of XY, SetSize>=0; Idx - subset of SubsetSize elements, array[SubsetSize]: * Idx[I] stores row index in the original dataset which is given by XY. Gradient is calculated with respect to rows whose indexes are stored in Idx[]. * Idx[] must store correct indexes; this function throws an exception in case incorrect index (less than 0 or larger than rows(XY)) is given * Idx[] may store indexes in any order and even with repetitions. SubsetSize- number of elements in Idx[] array: * positive value means that subset given by Idx[] is processed * zero value results in zero gradient * negative value means that full dataset is processed Grad - possibly preallocated array. If size of array is smaller than WCount, it will be reallocated. 
It is recommended to reuse previously allocated array to reduce allocation overhead. OUTPUT PARAMETERS: E - error function, SUM(sqr(y[i]-desiredy[i])/2,i) Grad - gradient of E with respect to weights of network, array[WCount] -- ALGLIB -- Copyright 26.07.2012 by Bochkanov Sergey *************************************************************************/ void mlpgradbatchsubset(const multilayerperceptron &network, const real_2d_array &xy, const ae_int_t setsize, const integer_1d_array &idx, const ae_int_t subsetsize, double &e, real_1d_array &grad) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::mlpgradbatchsubset(const_cast(network.c_ptr()), const_cast(xy.c_ptr()), setsize, const_cast(idx.c_ptr()), subsetsize, &e, const_cast(grad.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void smp_mlpgradbatchsubset(const multilayerperceptron &network, const real_2d_array &xy, const ae_int_t setsize, const integer_1d_array &idx, const ae_int_t subsetsize, double &e, real_1d_array &grad) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::_pexec_mlpgradbatchsubset(const_cast(network.c_ptr()), const_cast(xy.c_ptr()), setsize, const_cast(idx.c_ptr()), subsetsize, &e, const_cast(grad.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Batch gradient calculation for a set of inputs/outputs for a subset of dataset given by set of indexes. FOR USERS OF COMMERCIAL EDITION: ! Commercial version of ALGLIB includes two important improvements of ! this function: ! * multicore support (C++ and C# computational cores) ! * SSE support ! ! First improvement gives close-to-linear speedup on multicore systems. ! Second improvement gives constant speedup (2-3x depending on your CPU) ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! In order to use SSE features you have to: ! * use commercial version of ALGLIB on Intel processors ! * use C++ computational core ! ! This note is given for users of commercial edition; if you use GPL ! edition, you still will be able to call smp-version of this function, ! but all computations will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. INPUT PARAMETERS: Network - network initialized with one of the network creation funcs XY - original dataset in sparse format; one sample = one row: * MATRIX MUST BE STORED IN CRS FORMAT * first NIn columns contain inputs, * for regression problem, next NOut columns store desired outputs. * for classification problem, next column (just one!) stores class number. SetSize - real size of XY, SetSize>=0; Idx - subset of SubsetSize elements, array[SubsetSize]: * Idx[I] stores row index in the original dataset which is given by XY. Gradient is calculated with respect to rows whose indexes are stored in Idx[]. 
* Idx[] must store correct indexes; this function throws an exception in case incorrect index (less than 0 or larger than rows(XY)) is given * Idx[] may store indexes in any order and even with repetitions. SubsetSize- number of elements in Idx[] array: * positive value means that subset given by Idx[] is processed * zero value results in zero gradient * negative value means that full dataset is processed Grad - possibly preallocated array. If size of array is smaller than WCount, it will be reallocated. It is recommended to reuse previously allocated array to reduce allocation overhead. OUTPUT PARAMETERS: E - error function, SUM(sqr(y[i]-desiredy[i])/2,i) Grad - gradient of E with respect to weights of network, array[WCount] NOTE: when SubsetSize<0 is used full dataset by call MLPGradBatchSparse function. -- ALGLIB -- Copyright 26.07.2012 by Bochkanov Sergey *************************************************************************/ void mlpgradbatchsparsesubset(const multilayerperceptron &network, const sparsematrix &xy, const ae_int_t setsize, const integer_1d_array &idx, const ae_int_t subsetsize, double &e, real_1d_array &grad) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::mlpgradbatchsparsesubset(const_cast(network.c_ptr()), const_cast(xy.c_ptr()), setsize, const_cast(idx.c_ptr()), subsetsize, &e, const_cast(grad.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void smp_mlpgradbatchsparsesubset(const multilayerperceptron &network, const sparsematrix &xy, const ae_int_t setsize, const integer_1d_array &idx, const ae_int_t subsetsize, double &e, real_1d_array &grad) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::_pexec_mlpgradbatchsparsesubset(const_cast(network.c_ptr()), const_cast(xy.c_ptr()), setsize, const_cast(idx.c_ptr()), subsetsize, &e, const_cast(grad.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Batch gradient calculation for a set of inputs/outputs (natural error function is used) INPUT PARAMETERS: Network - network initialized with one of the network creation funcs XY - set of inputs/outputs; one sample = one row; first NIn columns contain inputs, next NOut columns - desired outputs. SSize - number of elements in XY Grad - possibly preallocated array. If size of array is smaller than WCount, it will be reallocated. It is recommended to reuse previously allocated array to reduce allocation overhead. OUTPUT PARAMETERS: E - error function, sum-of-squares for regression networks, cross-entropy for classification networks. 
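/*************************************************************************
EXAMPLE (illustrative sketch, not part of the original ALGLIB sources).
Batch gradient over a subset of a hypothetical dense dataset using
mlpgradbatchsubset(); Idx[] lists the rows to be processed.
*************************************************************************/
#include <cstdio>
#include "dataanalysis.h"
using namespace alglib;

int main()
{
    multilayerperceptron net;
    mlpcreate1(2, 3, 1, net);
    real_2d_array xy = "[[0,0,0],[0,1,1],[1,0,1],[1,1,0]]";
    integer_1d_array idx = "[0,2]";                  // process rows 0 and 2 only
    double e;
    real_1d_array grad;
    mlpgradbatchsubset(net, xy, 4, idx, 2, e, grad); // SetSize=4, SubsetSize=2
    // passing SubsetSize=-1 would process the full dataset and ignore idx[]
    printf("E=%0.6f\n", e);
    return 0;
}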
Grad - gradient of E with respect to weights of network, array[WCount] -- ALGLIB -- Copyright 04.11.2007 by Bochkanov Sergey *************************************************************************/ void mlpgradnbatch(const multilayerperceptron &network, const real_2d_array &xy, const ae_int_t ssize, double &e, real_1d_array &grad) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::mlpgradnbatch(const_cast(network.c_ptr()), const_cast(xy.c_ptr()), ssize, &e, const_cast(grad.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Batch Hessian calculation (natural error function) using R-algorithm. Internal subroutine. -- ALGLIB -- Copyright 26.01.2008 by Bochkanov Sergey. Hessian calculation based on R-algorithm described in "Fast Exact Multiplication by the Hessian", B. A. Pearlmutter, Neural Computation, 1994. *************************************************************************/ void mlphessiannbatch(const multilayerperceptron &network, const real_2d_array &xy, const ae_int_t ssize, double &e, real_1d_array &grad, real_2d_array &h) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::mlphessiannbatch(const_cast(network.c_ptr()), const_cast(xy.c_ptr()), ssize, &e, const_cast(grad.c_ptr()), const_cast(h.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Batch Hessian calculation using R-algorithm. Internal subroutine. -- ALGLIB -- Copyright 26.01.2008 by Bochkanov Sergey. Hessian calculation based on R-algorithm described in "Fast Exact Multiplication by the Hessian", B. A. Pearlmutter, Neural Computation, 1994. *************************************************************************/ void mlphessianbatch(const multilayerperceptron &network, const real_2d_array &xy, const ae_int_t ssize, double &e, real_1d_array &grad, real_2d_array &h) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::mlphessianbatch(const_cast(network.c_ptr()), const_cast(xy.c_ptr()), ssize, &e, const_cast(grad.c_ptr()), const_cast(h.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Calculation of all types of errors on subset of dataset. FOR USERS OF COMMERCIAL EDITION: ! Commercial version of ALGLIB includes two important improvements of ! this function: ! * multicore support (C++ and C# computational cores) ! * SSE support ! ! First improvement gives close-to-linear speedup on multicore systems. ! Second improvement gives constant speedup (2-3x depending on your CPU) ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! In order to use SSE features you have to: ! * use commercial version of ALGLIB on Intel processors ! * use C++ computational core ! ! 
This note is given for users of commercial edition; if you use GPL ! edition, you still will be able to call smp-version of this function, ! but all computations will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. INPUT PARAMETERS: Network - network initialized with one of the network creation funcs XY - original dataset; one sample = one row; first NIn columns contain inputs, next NOut columns - desired outputs. SetSize - real size of XY, SetSize>=0; Subset - subset of SubsetSize elements, array[SubsetSize]; SubsetSize- number of elements in Subset[] array: * if SubsetSize>0, rows of XY with indices Subset[0]... ...Subset[SubsetSize-1] are processed * if SubsetSize=0, zeros are returned * if SubsetSize<0, entire dataset is processed; Subset[] array is ignored in this case. OUTPUT PARAMETERS: Rep - it contains all type of errors. -- ALGLIB -- Copyright 04.09.2012 by Bochkanov Sergey *************************************************************************/ void mlpallerrorssubset(const multilayerperceptron &network, const real_2d_array &xy, const ae_int_t setsize, const integer_1d_array &subset, const ae_int_t subsetsize, modelerrors &rep) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::mlpallerrorssubset(const_cast(network.c_ptr()), const_cast(xy.c_ptr()), setsize, const_cast(subset.c_ptr()), subsetsize, const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void smp_mlpallerrorssubset(const multilayerperceptron &network, const real_2d_array &xy, const ae_int_t setsize, const integer_1d_array &subset, const ae_int_t subsetsize, modelerrors &rep) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::_pexec_mlpallerrorssubset(const_cast(network.c_ptr()), const_cast(xy.c_ptr()), setsize, const_cast(subset.c_ptr()), subsetsize, const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Calculation of all types of errors on subset of dataset. FOR USERS OF COMMERCIAL EDITION: ! Commercial version of ALGLIB includes two important improvements of ! this function: ! * multicore support (C++ and C# computational cores) ! * SSE support ! ! First improvement gives close-to-linear speedup on multicore systems. ! Second improvement gives constant speedup (2-3x depending on your CPU) ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! In order to use SSE features you have to: ! * use commercial version of ALGLIB on Intel processors ! * use C++ computational core ! ! This note is given for users of commercial edition; if you use GPL ! edition, you still will be able to call smp-version of this function, ! but all computations will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. 
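/*************************************************************************
EXAMPLE (illustrative sketch, not part of the original ALGLIB sources).
mlpallerrorssubset() on a hypothetical classification dataset; passing
SubsetSize=-1 processes the entire dataset and fills a modelerrors report.
*************************************************************************/
#include <cstdio>
#include "dataanalysis.h"
using namespace alglib;

int main()
{
    multilayerperceptron net;
    mlpcreatec0(2, 2, net);                          // 2 inputs, 2 classes, no hidden layer
    real_2d_array xy = "[[0,0,0],[0,1,1],[1,0,1],[1,1,0]]";  // last column = class number
    integer_1d_array subset = "[0]";                 // ignored because SubsetSize<0
    modelerrors rep;
    mlpallerrorssubset(net, xy, 4, subset, -1, rep);
    printf("relcls=%0.4f avgce=%0.4f rms=%0.4f\n", rep.relclserror, rep.avgce, rep.rmserror);
    return 0;
}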
INPUT PARAMETERS: Network - network initialized with one of the network creation funcs XY - original dataset given by sparse matrix; one sample = one row; first NIn columns contain inputs, next NOut columns - desired outputs. SetSize - real size of XY, SetSize>=0; Subset - subset of SubsetSize elements, array[SubsetSize]; SubsetSize- number of elements in Subset[] array: * if SubsetSize>0, rows of XY with indices Subset[0]... ...Subset[SubsetSize-1] are processed * if SubsetSize=0, zeros are returned * if SubsetSize<0, entire dataset is processed; Subset[] array is ignored in this case. OUTPUT PARAMETERS: Rep - it contains all type of errors. -- ALGLIB -- Copyright 04.09.2012 by Bochkanov Sergey *************************************************************************/ void mlpallerrorssparsesubset(const multilayerperceptron &network, const sparsematrix &xy, const ae_int_t setsize, const integer_1d_array &subset, const ae_int_t subsetsize, modelerrors &rep) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::mlpallerrorssparsesubset(const_cast(network.c_ptr()), const_cast(xy.c_ptr()), setsize, const_cast(subset.c_ptr()), subsetsize, const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void smp_mlpallerrorssparsesubset(const multilayerperceptron &network, const sparsematrix &xy, const ae_int_t setsize, const integer_1d_array &subset, const ae_int_t subsetsize, modelerrors &rep) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::_pexec_mlpallerrorssparsesubset(const_cast(network.c_ptr()), const_cast(xy.c_ptr()), setsize, const_cast(subset.c_ptr()), subsetsize, const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Error of the neural network on subset of dataset. FOR USERS OF COMMERCIAL EDITION: ! Commercial version of ALGLIB includes two important improvements of ! this function: ! * multicore support (C++ and C# computational cores) ! * SSE support ! ! First improvement gives close-to-linear speedup on multicore systems. ! Second improvement gives constant speedup (2-3x depending on your CPU) ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! In order to use SSE features you have to: ! * use commercial version of ALGLIB on Intel processors ! * use C++ computational core ! ! This note is given for users of commercial edition; if you use GPL ! edition, you still will be able to call smp-version of this function, ! but all computations will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. INPUT PARAMETERS: Network - neural network; XY - training set, see below for information on the training set format; SetSize - real size of XY, SetSize>=0; Subset - subset of SubsetSize elements, array[SubsetSize]; SubsetSize- number of elements in Subset[] array: * if SubsetSize>0, rows of XY with indices Subset[0]... 
...Subset[SubsetSize-1] are processed * if SubsetSize=0, zeros are returned * if SubsetSize<0, entire dataset is processed; Subset[] array is ignored in this case. RESULT: sum-of-squares error, SUM(sqr(y[i]-desired_y[i])/2) DATASET FORMAT: This function uses two different dataset formats - one for regression networks, another one for classification networks. For regression networks with NIn inputs and NOut outputs following dataset format is used: * dataset is given by NPoints*(NIn+NOut) matrix * each row corresponds to one example * first NIn columns are inputs, next NOut columns are outputs For classification networks with NIn inputs and NClasses clases following dataset format is used: * dataset is given by NPoints*(NIn+1) matrix * each row corresponds to one example * first NIn columns are inputs, last column stores class number (from 0 to NClasses-1). -- ALGLIB -- Copyright 04.09.2012 by Bochkanov Sergey *************************************************************************/ double mlperrorsubset(const multilayerperceptron &network, const real_2d_array &xy, const ae_int_t setsize, const integer_1d_array &subset, const ae_int_t subsetsize) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::mlperrorsubset(const_cast(network.c_ptr()), const_cast(xy.c_ptr()), setsize, const_cast(subset.c_ptr()), subsetsize, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } double smp_mlperrorsubset(const multilayerperceptron &network, const real_2d_array &xy, const ae_int_t setsize, const integer_1d_array &subset, const ae_int_t subsetsize) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::_pexec_mlperrorsubset(const_cast(network.c_ptr()), const_cast(xy.c_ptr()), setsize, const_cast(subset.c_ptr()), subsetsize, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Error of the neural network on subset of sparse dataset. FOR USERS OF COMMERCIAL EDITION: ! Commercial version of ALGLIB includes two important improvements of ! this function: ! * multicore support (C++ and C# computational cores) ! * SSE support ! ! First improvement gives close-to-linear speedup on multicore systems. ! Second improvement gives constant speedup (2-3x depending on your CPU) ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! In order to use SSE features you have to: ! * use commercial version of ALGLIB on Intel processors ! * use C++ computational core ! ! This note is given for users of commercial edition; if you use GPL ! edition, you still will be able to call smp-version of this function, ! but all computations will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. INPUT PARAMETERS: Network - neural network; XY - training set, see below for information on the training set format. 
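/*************************************************************************
EXAMPLE (illustrative sketch, not part of the original ALGLIB sources).
mlperrorsubset() with the dense regression dataset format described above:
NIn input columns followed by NOut output columns, one sample per row.
Data values and network geometry below are hypothetical.
*************************************************************************/
#include <cstdio>
#include "dataanalysis.h"
using namespace alglib;

int main()
{
    multilayerperceptron net;
    mlpcreate0(1, 1, net);                           // 1 input, 1 output, no hidden layer
    real_2d_array xy = "[[0.0,0.1],[0.5,0.6],[1.0,0.9]]";
    integer_1d_array subset = "[0,2]";
    double err = mlperrorsubset(net, xy, 3, subset, 2);   // rows 0 and 2 only
    printf("sum-of-squares error = %0.6f\n", err);
    return 0;
}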
This function checks correctness of the dataset (no NANs/INFs, class numbers are correct) and throws exception when incorrect dataset is passed. Sparse matrix must use CRS format for storage. SetSize - real size of XY, SetSize>=0; it is used when SubsetSize<0; Subset - subset of SubsetSize elements, array[SubsetSize]; SubsetSize- number of elements in Subset[] array: * if SubsetSize>0, rows of XY with indices Subset[0]... ...Subset[SubsetSize-1] are processed * if SubsetSize=0, zeros are returned * if SubsetSize<0, entire dataset is processed; Subset[] array is ignored in this case. RESULT: sum-of-squares error, SUM(sqr(y[i]-desired_y[i])/2) DATASET FORMAT: This function uses two different dataset formats - one for regression networks, another one for classification networks. For regression networks with NIn inputs and NOut outputs following dataset format is used: * dataset is given by NPoints*(NIn+NOut) matrix * each row corresponds to one example * first NIn columns are inputs, next NOut columns are outputs For classification networks with NIn inputs and NClasses clases following dataset format is used: * dataset is given by NPoints*(NIn+1) matrix * each row corresponds to one example * first NIn columns are inputs, last column stores class number (from 0 to NClasses-1). -- ALGLIB -- Copyright 04.09.2012 by Bochkanov Sergey *************************************************************************/ double mlperrorsparsesubset(const multilayerperceptron &network, const sparsematrix &xy, const ae_int_t setsize, const integer_1d_array &subset, const ae_int_t subsetsize) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::mlperrorsparsesubset(const_cast(network.c_ptr()), const_cast(xy.c_ptr()), setsize, const_cast(subset.c_ptr()), subsetsize, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } double smp_mlperrorsparsesubset(const multilayerperceptron &network, const sparsematrix &xy, const ae_int_t setsize, const integer_1d_array &subset, const ae_int_t subsetsize) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::_pexec_mlperrorsparsesubset(const_cast(network.c_ptr()), const_cast(xy.c_ptr()), setsize, const_cast(subset.c_ptr()), subsetsize, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Multiclass Fisher LDA Subroutine finds coefficients of linear combination which optimally separates training set on classes. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multithreading support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. Best results are achieved for high-dimensional problems ! (NVars is at least 256). ! ! Multithreading is used to accelerate initial phase of LDA, which ! includes calculation of products of large matrices. Again, for best ! 
efficiency problem must be high-dimensional. ! ! Generally, commercial ALGLIB is several times faster than open-source ! generic C edition, and many times faster than open-source C# edition. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS: XY - training set, array[0..NPoints-1,0..NVars]. First NVars columns store values of independent variables, next column stores number of class (from 0 to NClasses-1) which dataset element belongs to. Fractional values are rounded to nearest integer. NPoints - training set size, NPoints>=0 NVars - number of independent variables, NVars>=1 NClasses - number of classes, NClasses>=2 OUTPUT PARAMETERS: Info - return code: * -4, if internal EVD subroutine hasn't converged * -2, if there is a point with class number outside of [0..NClasses-1]. * -1, if incorrect parameters was passed (NPoints<0, NVars<1, NClasses<2) * 1, if task has been solved * 2, if there was a multicollinearity in training set, but task has been solved. W - linear combination coefficients, array[0..NVars-1] -- ALGLIB -- Copyright 31.05.2008 by Bochkanov Sergey *************************************************************************/ void fisherlda(const real_2d_array &xy, const ae_int_t npoints, const ae_int_t nvars, const ae_int_t nclasses, ae_int_t &info, real_1d_array &w) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::fisherlda(const_cast(xy.c_ptr()), npoints, nvars, nclasses, &info, const_cast(w.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* N-dimensional multiclass Fisher LDA Subroutine finds coefficients of linear combinations which optimally separates training set on classes. It returns N-dimensional basis whose vector are sorted by quality of training set separation (in descending order). COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multithreading support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. Best results are achieved for high-dimensional problems ! (NVars is at least 256). ! ! Multithreading is used to accelerate initial phase of LDA, which ! includes calculation of products of large matrices. Again, for best ! efficiency problem must be high-dimensional. ! ! Generally, commercial ALGLIB is several times faster than open-source ! generic C edition, and many times faster than open-source C# edition. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS: XY - training set, array[0..NPoints-1,0..NVars]. First NVars columns store values of independent variables, next column stores number of class (from 0 to NClasses-1) which dataset element belongs to. Fractional values are rounded to nearest integer. 
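/*************************************************************************
EXAMPLE (illustrative sketch, not part of the original ALGLIB sources).
Two-class Fisher LDA on a hypothetical 2D dataset; the last column of XY
stores the class number, W receives the separating direction.
*************************************************************************/
#include <cstdio>
#include "dataanalysis.h"
using namespace alglib;

int main()
{
    real_2d_array xy = "[[1.0,2.0,0],[1.2,1.9,0],[3.0,4.1,1],[3.1,3.9,1]]";
    ae_int_t info;
    real_1d_array w;
    fisherlda(xy, 4, 2, 2, info, w);                 // NPoints=4, NVars=2, NClasses=2
    printf("info=%d\nw=%s\n", (int)info, w.tostring(4).c_str());
    return 0;
}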
NPoints - training set size, NPoints>=0 NVars - number of independent variables, NVars>=1 NClasses - number of classes, NClasses>=2 OUTPUT PARAMETERS: Info - return code: * -4, if internal EVD subroutine hasn't converged * -2, if there is a point with class number outside of [0..NClasses-1]. * -1, if incorrect parameters was passed (NPoints<0, NVars<1, NClasses<2) * 1, if task has been solved * 2, if there was a multicollinearity in training set, but task has been solved. W - basis, array[0..NVars-1,0..NVars-1] columns of matrix stores basis vectors, sorted by quality of training set separation (in descending order) -- ALGLIB -- Copyright 31.05.2008 by Bochkanov Sergey *************************************************************************/ void fisherldan(const real_2d_array &xy, const ae_int_t npoints, const ae_int_t nvars, const ae_int_t nclasses, ae_int_t &info, real_2d_array &w) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::fisherldan(const_cast(xy.c_ptr()), npoints, nvars, nclasses, &info, const_cast(w.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void smp_fisherldan(const real_2d_array &xy, const ae_int_t npoints, const ae_int_t nvars, const ae_int_t nclasses, ae_int_t &info, real_2d_array &w) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::_pexec_fisherldan(const_cast(xy.c_ptr()), npoints, nvars, nclasses, &info, const_cast(w.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* *************************************************************************/ _linearmodel_owner::_linearmodel_owner() { p_struct = (alglib_impl::linearmodel*)alglib_impl::ae_malloc(sizeof(alglib_impl::linearmodel), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_linearmodel_init(p_struct, NULL); } _linearmodel_owner::_linearmodel_owner(const _linearmodel_owner &rhs) { p_struct = (alglib_impl::linearmodel*)alglib_impl::ae_malloc(sizeof(alglib_impl::linearmodel), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_linearmodel_init_copy(p_struct, const_cast(rhs.p_struct), NULL); } _linearmodel_owner& _linearmodel_owner::operator=(const _linearmodel_owner &rhs) { if( this==&rhs ) return *this; alglib_impl::_linearmodel_clear(p_struct); alglib_impl::_linearmodel_init_copy(p_struct, const_cast(rhs.p_struct), NULL); return *this; } _linearmodel_owner::~_linearmodel_owner() { alglib_impl::_linearmodel_clear(p_struct); ae_free(p_struct); } alglib_impl::linearmodel* _linearmodel_owner::c_ptr() { return p_struct; } alglib_impl::linearmodel* _linearmodel_owner::c_ptr() const { return const_cast(p_struct); } linearmodel::linearmodel() : _linearmodel_owner() { } linearmodel::linearmodel(const linearmodel &rhs):_linearmodel_owner(rhs) { } linearmodel& linearmodel::operator=(const linearmodel &rhs) { if( this==&rhs ) return *this; _linearmodel_owner::operator=(rhs); return *this; } linearmodel::~linearmodel() { } /************************************************************************* LRReport structure contains additional information about linear model: * C - covariation matrix, array[0..NVars,0..NVars]. 
C[i,j] = Cov(A[i],A[j]) * RMSError - root mean square error on a training set * AvgError - average error on a training set * AvgRelError - average relative error on a training set (excluding observations with zero function value). * CVRMSError - leave-one-out cross-validation estimate of generalization error. Calculated using fast algorithm with O(NVars*NPoints) complexity. * CVAvgError - cross-validation estimate of average error * CVAvgRelError - cross-validation estimate of average relative error All other fields of the structure are intended for internal use and should not be used outside ALGLIB. *************************************************************************/ _lrreport_owner::_lrreport_owner() { p_struct = (alglib_impl::lrreport*)alglib_impl::ae_malloc(sizeof(alglib_impl::lrreport), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_lrreport_init(p_struct, NULL); } _lrreport_owner::_lrreport_owner(const _lrreport_owner &rhs) { p_struct = (alglib_impl::lrreport*)alglib_impl::ae_malloc(sizeof(alglib_impl::lrreport), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_lrreport_init_copy(p_struct, const_cast(rhs.p_struct), NULL); } _lrreport_owner& _lrreport_owner::operator=(const _lrreport_owner &rhs) { if( this==&rhs ) return *this; alglib_impl::_lrreport_clear(p_struct); alglib_impl::_lrreport_init_copy(p_struct, const_cast(rhs.p_struct), NULL); return *this; } _lrreport_owner::~_lrreport_owner() { alglib_impl::_lrreport_clear(p_struct); ae_free(p_struct); } alglib_impl::lrreport* _lrreport_owner::c_ptr() { return p_struct; } alglib_impl::lrreport* _lrreport_owner::c_ptr() const { return const_cast(p_struct); } lrreport::lrreport() : _lrreport_owner() ,c(&p_struct->c),rmserror(p_struct->rmserror),avgerror(p_struct->avgerror),avgrelerror(p_struct->avgrelerror),cvrmserror(p_struct->cvrmserror),cvavgerror(p_struct->cvavgerror),cvavgrelerror(p_struct->cvavgrelerror),ncvdefects(p_struct->ncvdefects),cvdefects(&p_struct->cvdefects) { } lrreport::lrreport(const lrreport &rhs):_lrreport_owner(rhs) ,c(&p_struct->c),rmserror(p_struct->rmserror),avgerror(p_struct->avgerror),avgrelerror(p_struct->avgrelerror),cvrmserror(p_struct->cvrmserror),cvavgerror(p_struct->cvavgerror),cvavgrelerror(p_struct->cvavgrelerror),ncvdefects(p_struct->ncvdefects),cvdefects(&p_struct->cvdefects) { } lrreport& lrreport::operator=(const lrreport &rhs) { if( this==&rhs ) return *this; _lrreport_owner::operator=(rhs); return *this; } lrreport::~lrreport() { } /************************************************************************* Linear regression Subroutine builds model: Y = A(0)*X[0] + ... + A(N-1)*X[N-1] + A(N) and model found in ALGLIB format, covariation matrix, training set errors (rms, average, average relative) and leave-one-out cross-validation estimate of the generalization error. CV estimate calculated using fast algorithm with O(NPoints*NVars) complexity. When covariation matrix is calculated standard deviations of function values are assumed to be equal to RMS error on the training set. 
INPUT PARAMETERS: XY - training set, array [0..NPoints-1,0..NVars]: * NVars columns - independent variables * last column - dependent variable NPoints - training set size, NPoints>NVars+1 NVars - number of independent variables OUTPUT PARAMETERS: Info - return code: * -255, in case of unknown internal error * -4, if internal SVD subroutine hasn't converged * -1, if incorrect parameters were passed (NPoints<NVars+2, NVars<1) * 1, if subroutine successfully finished LM - linear model in ALGLIB format; use subroutines of this unit to work with the model AR - additional results *************************************************************************/ void lrbuild(const real_2d_array &xy, const ae_int_t npoints, const ae_int_t nvars, ae_int_t &info, linearmodel &lm, lrreport &ar) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::lrbuild(const_cast(xy.c_ptr()), npoints, nvars, &info, const_cast(lm.c_ptr()), const_cast(ar.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } }
/************************************************************************* Linear regression Variant of LRBuild which uses vector of standard deviations (errors in function values). INPUT PARAMETERS: XY - training set, array [0..NPoints-1,0..NVars]: * NVars columns - independent variables * last column - dependent variable S - standard deviations (errors in function values) array[0..NPoints-1], S[i]>0. NPoints - training set size, NPoints>NVars+1 NVars - number of independent variables OUTPUT PARAMETERS: Info - return code: * -255, in case of unknown internal error * -4, if internal SVD subroutine hasn't converged * -1, if incorrect parameters were passed (NPoints<NVars+2, NVars<1) * 1, if subroutine successfully finished LM - linear model in ALGLIB format AR - additional results *************************************************************************/ void lrbuilds(const real_2d_array &xy, const real_1d_array &s, const ae_int_t npoints, const ae_int_t nvars, ae_int_t &info, linearmodel &lm, lrreport &ar) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::lrbuilds(const_cast(xy.c_ptr()), const_cast(s.c_ptr()), npoints, nvars, &info, const_cast(lm.c_ptr()), const_cast(ar.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } }
/************************************************************************* Like LRBuildS, but builds model Y = A(0)*X[0] + ... + A(N-1)*X[N-1] i.e. with zero constant term. -- ALGLIB -- Copyright 30.10.2008 by Bochkanov Sergey *************************************************************************/ void lrbuildzs(const real_2d_array &xy, const real_1d_array &s, const ae_int_t npoints, const ae_int_t nvars, ae_int_t &info, linearmodel &lm, lrreport &ar) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::lrbuildzs(const_cast(xy.c_ptr()), const_cast(s.c_ptr()), npoints, nvars, &info, const_cast(lm.c_ptr()), const_cast(ar.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } }
/************************************************************************* Like LRBuild but builds model Y = A(0)*X[0] + ... + A(N-1)*X[N-1] i.e. with zero constant term. -- ALGLIB -- Copyright 30.10.2008 by Bochkanov Sergey *************************************************************************/ void lrbuildz(const real_2d_array &xy, const ae_int_t npoints, const ae_int_t nvars, ae_int_t &info, linearmodel &lm, lrreport &ar) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::lrbuildz(const_cast(xy.c_ptr()), npoints, nvars, &info, const_cast(lm.c_ptr()), const_cast(ar.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } }
/************************************************************************* Unpacks coefficients of linear model. INPUT PARAMETERS: LM - linear model in ALGLIB format OUTPUT PARAMETERS: V - coefficients, array[0..NVars] constant term (intercept) is stored in the V[NVars].
NVars - number of independent variables (one less than number of coefficients) -- ALGLIB -- Copyright 30.08.2008 by Bochkanov Sergey *************************************************************************/ void lrunpack(const linearmodel &lm, real_1d_array &v, ae_int_t &nvars) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::lrunpack(const_cast(lm.c_ptr()), const_cast(v.c_ptr()), &nvars, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* "Packs" coefficients and creates linear model in ALGLIB format (LRUnpack reversed). INPUT PARAMETERS: V - coefficients, array[0..NVars] NVars - number of independent variables OUTPUT PAREMETERS: LM - linear model. -- ALGLIB -- Copyright 30.08.2008 by Bochkanov Sergey *************************************************************************/ void lrpack(const real_1d_array &v, const ae_int_t nvars, linearmodel &lm) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::lrpack(const_cast(v.c_ptr()), nvars, const_cast(lm.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Procesing INPUT PARAMETERS: LM - linear model X - input vector, array[0..NVars-1]. Result: value of linear model regression estimate -- ALGLIB -- Copyright 03.09.2008 by Bochkanov Sergey *************************************************************************/ double lrprocess(const linearmodel &lm, const real_1d_array &x) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::lrprocess(const_cast(lm.c_ptr()), const_cast(x.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* RMS error on the test set INPUT PARAMETERS: LM - linear model XY - test set NPoints - test set size RESULT: root mean square error. -- ALGLIB -- Copyright 30.08.2008 by Bochkanov Sergey *************************************************************************/ double lrrmserror(const linearmodel &lm, const real_2d_array &xy, const ae_int_t npoints) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::lrrmserror(const_cast(lm.c_ptr()), const_cast(xy.c_ptr()), npoints, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Average error on the test set INPUT PARAMETERS: LM - linear model XY - test set NPoints - test set size RESULT: average error. 
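/*************************************************************************
EXAMPLE (illustrative sketch, not part of the original ALGLIB sources).
Building a linear model with lrbuild() on hypothetical data, then unpacking
its coefficients and evaluating it with lrprocess() and lrrmserror().
*************************************************************************/
#include <cstdio>
#include "dataanalysis.h"
using namespace alglib;

int main()
{
    // one independent variable (first column), dependent variable in the last column
    real_2d_array xy = "[[1,2.1],[2,3.9],[3,6.2],[4,7.8],[5,10.1]]";
    ae_int_t info;
    linearmodel lm;
    lrreport rep;
    lrbuild(xy, 5, 1, info, lm, rep);                // NPoints=5, NVars=1

    real_1d_array c;
    ae_int_t nvars;
    lrunpack(lm, c, nvars);                          // c[nvars] holds the intercept

    real_1d_array x = "[2.5]";
    printf("info=%d c=%s y(2.5)=%0.3f rms=%0.3f\n",
           (int)info, c.tostring(3).c_str(), lrprocess(lm, x), lrrmserror(lm, xy, 5));
    return 0;
}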
-- ALGLIB -- Copyright 30.08.2008 by Bochkanov Sergey *************************************************************************/ double lravgerror(const linearmodel &lm, const real_2d_array &xy, const ae_int_t npoints) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::lravgerror(const_cast(lm.c_ptr()), const_cast(xy.c_ptr()), npoints, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } }
/************************************************************************* Average relative error on the test set INPUT PARAMETERS: LM - linear model XY - test set NPoints - test set size RESULT: average relative error. -- ALGLIB -- Copyright 30.08.2008 by Bochkanov Sergey *************************************************************************/ double lravgrelerror(const linearmodel &lm, const real_2d_array &xy, const ae_int_t npoints) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::lravgrelerror(const_cast(lm.c_ptr()), const_cast(xy.c_ptr()), npoints, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } }
/************************************************************************* Filters: simple moving averages (unsymmetric). This filter replaces array by results of SMA(K) filter. SMA(K) is defined as filter which averages at most K previous points (previous - not points AROUND central point) - or less, in case of the first K-1 points. INPUT PARAMETERS: X - array[N], array to process. It can be larger than N, in this case only first N points are processed. N - points count, N>=0 K - K>=1 (K can be larger than N, such cases will be correctly handled). Window width. K=1 corresponds to identity transformation (nothing changes). OUTPUT PARAMETERS: X - array, whose first N elements were processed with SMA(K) NOTE 1: this function uses efficient in-place algorithm which does not allocate temporary arrays. NOTE 2: this algorithm makes only one pass through array and uses running sum to speed-up calculation of the averages. Additional measures are taken to ensure that running sum on a long sequence of zero elements will be correctly reset to zero even in the presence of round-off error. NOTE 3: this is the unsymmetric version of the algorithm, which does NOT average points after the current one. Only X[i], X[i-1], ... are used when calculating new value of X[i]. We should also note that this algorithm uses BOTH previous points and current one, i.e. new value of X[i] depends on BOTH previous point and X[i] itself. -- ALGLIB -- Copyright 25.10.2011 by Bochkanov Sergey *************************************************************************/ void filtersma(real_1d_array &x, const ae_int_t n, const ae_int_t k) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::filtersma(const_cast(x.c_ptr()), n, k, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } }
/************************************************************************* Filters: simple moving averages (unsymmetric). This filter replaces array by results of SMA(K) filter.
SMA(K) is defined as filter which averages at most K previous points (previous - not points AROUND central point) - or less, in case of the first K-1 points. INPUT PARAMETERS: X - array[N], array to process. It can be larger than N, in this case only first N points are processed. N - points count, N>=0 K - K>=1 (K can be larger than N, such cases will be correctly handled). Window width. K=1 corresponds to identity transformation (nothing changes). OUTPUT PARAMETERS: X - array, whose first N elements were processed with SMA(K) NOTE 1: this function uses efficient in-place algorithm which does not allocate temporary arrays. NOTE 2: this algorithm makes only one pass through array and uses running sum to speed-up calculation of the averages. Additional measures are taken to ensure that running sum on a long sequence of zero elements will be correctly reset to zero even in the presence of round-off error. NOTE 3: this is the unsymmetric version of the algorithm, which does NOT average points after the current one. Only X[i], X[i-1], ... are used when calculating new value of X[i]. We should also note that this algorithm uses BOTH previous points and current one, i.e. new value of X[i] depends on BOTH previous point and X[i] itself. -- ALGLIB -- Copyright 25.10.2011 by Bochkanov Sergey *************************************************************************/ void filtersma(real_1d_array &x, const ae_int_t k) { alglib_impl::ae_state _alglib_env_state; ae_int_t n; n = x.length(); alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::filtersma(const_cast(x.c_ptr()), n, k, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } }
/************************************************************************* Filters: exponential moving averages. This filter replaces array by results of EMA(alpha) filter. EMA(alpha) is defined as filter which replaces X[] by S[]: S[0] = X[0] S[t] = alpha*X[t] + (1-alpha)*S[t-1] INPUT PARAMETERS: X - array[N], array to process. It can be larger than N, in this case only first N points are processed. N - points count, N>=0 alpha - 0<alpha<=1, smoothing parameter. OUTPUT PARAMETERS: X - array, whose first N elements were processed with EMA(alpha) -- ALGLIB -- Copyright 25.10.2011 by Bochkanov Sergey *************************************************************************/ void filterema(real_1d_array &x, const ae_int_t n, const double alpha) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::filterema(const_cast(x.c_ptr()), n, alpha, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } }
/************************************************************************* Filters: exponential moving averages. This filter replaces array by results of EMA(alpha) filter. EMA(alpha) is defined as filter which replaces X[] by S[]: S[0] = X[0] S[t] = alpha*X[t] + (1-alpha)*S[t-1] INPUT PARAMETERS: X - array[N], array to process. It can be larger than N, in this case only first N points are processed. N - points count, N>=0 alpha - 0<alpha<=1, smoothing parameter. OUTPUT PARAMETERS: X - array, whose first N elements were processed with EMA(alpha) -- ALGLIB -- Copyright 25.10.2011 by Bochkanov Sergey *************************************************************************/ void filterema(real_1d_array &x, const double alpha) { alglib_impl::ae_state _alglib_env_state; ae_int_t n; n = x.length(); alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::filterema(const_cast(x.c_ptr()), n, alpha, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } }
/************************************************************************* Filters: linear regression moving averages. This filter replaces array by results of LRMA(K) filter. LRMA(K) is defined as filter which, for each data point, builds linear regression model using K previous points (point itself is included in these K points) and calculates value of this linear model at the point in question. INPUT PARAMETERS: X - array[N], array to process. It can be larger than N, in this case only first N points are processed.
N - points count, N>=0 K - K>=1 (K can be larger than N , such cases will be correctly handled). Window width. K=1 corresponds to identity transformation (nothing changes). OUTPUT PARAMETERS: X - array, whose first N elements were processed with SMA(K) NOTE 1: this function uses efficient in-place algorithm which does not allocate temporary arrays. NOTE 2: this algorithm makes only one pass through array and uses running sum to speed-up calculation of the averages. Additional measures are taken to ensure that running sum on a long sequence of zero elements will be correctly reset to zero even in the presence of round-off error. NOTE 3: this is unsymmetric version of the algorithm, which does NOT averages points after the current one. Only X[i], X[i-1], ... are used when calculating new value of X[i]. We should also note that this algorithm uses BOTH previous points and current one, i.e. new value of X[i] depends on BOTH previous point and X[i] itself. -- ALGLIB -- Copyright 25.10.2011 by Bochkanov Sergey *************************************************************************/ void filterlrma(real_1d_array &x, const ae_int_t n, const ae_int_t k) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::filterlrma(const_cast(x.c_ptr()), n, k, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Filters: linear regression moving averages. This filter replaces array by results of LRMA(K) filter. LRMA(K) is defined as filter which, for each data point, builds linear regression model using K prevous points (point itself is included in these K points) and calculates value of this linear model at the point in question. INPUT PARAMETERS: X - array[N], array to process. It can be larger than N, in this case only first N points are processed. N - points count, N>=0 K - K>=1 (K can be larger than N , such cases will be correctly handled). Window width. K=1 corresponds to identity transformation (nothing changes). OUTPUT PARAMETERS: X - array, whose first N elements were processed with SMA(K) NOTE 1: this function uses efficient in-place algorithm which does not allocate temporary arrays. NOTE 2: this algorithm makes only one pass through array and uses running sum to speed-up calculation of the averages. Additional measures are taken to ensure that running sum on a long sequence of zero elements will be correctly reset to zero even in the presence of round-off error. NOTE 3: this is unsymmetric version of the algorithm, which does NOT averages points after the current one. Only X[i], X[i-1], ... are used when calculating new value of X[i]. We should also note that this algorithm uses BOTH previous points and current one, i.e. new value of X[i] depends on BOTH previous point and X[i] itself. 
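/*************************************************************************
EXAMPLE (illustrative sketch, not part of the original ALGLIB sources).
The three moving-average filters applied in place to a short hypothetical
series, using the overloads which take N from the array length.
*************************************************************************/
#include <cstdio>
#include "dataanalysis.h"
using namespace alglib;

int main()
{
    real_1d_array a = "[5,6,7,8,9]";
    filtersma(a, 3);                                 // simple moving average, window 3
    printf("SMA(3) : %s\n", a.tostring(3).c_str());

    real_1d_array b = "[5,6,7,8,9]";
    filterema(b, 0.5);                               // exponential moving average, alpha=0.5
    printf("EMA(.5): %s\n", b.tostring(3).c_str());

    real_1d_array c = "[5,6,7,8,9]";
    filterlrma(c, 3);                                // linear regression moving average, window 3
    printf("LRMA(3): %s\n", c.tostring(3).c_str());
    return 0;
}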
-- ALGLIB -- Copyright 25.10.2011 by Bochkanov Sergey *************************************************************************/ void filterlrma(real_1d_array &x, const ae_int_t k) { alglib_impl::ae_state _alglib_env_state; ae_int_t n; n = x.length(); alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::filterlrma(const_cast(x.c_ptr()), n, k, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* *************************************************************************/ _logitmodel_owner::_logitmodel_owner() { p_struct = (alglib_impl::logitmodel*)alglib_impl::ae_malloc(sizeof(alglib_impl::logitmodel), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_logitmodel_init(p_struct, NULL); } _logitmodel_owner::_logitmodel_owner(const _logitmodel_owner &rhs) { p_struct = (alglib_impl::logitmodel*)alglib_impl::ae_malloc(sizeof(alglib_impl::logitmodel), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_logitmodel_init_copy(p_struct, const_cast(rhs.p_struct), NULL); } _logitmodel_owner& _logitmodel_owner::operator=(const _logitmodel_owner &rhs) { if( this==&rhs ) return *this; alglib_impl::_logitmodel_clear(p_struct); alglib_impl::_logitmodel_init_copy(p_struct, const_cast(rhs.p_struct), NULL); return *this; } _logitmodel_owner::~_logitmodel_owner() { alglib_impl::_logitmodel_clear(p_struct); ae_free(p_struct); } alglib_impl::logitmodel* _logitmodel_owner::c_ptr() { return p_struct; } alglib_impl::logitmodel* _logitmodel_owner::c_ptr() const { return const_cast(p_struct); } logitmodel::logitmodel() : _logitmodel_owner() { } logitmodel::logitmodel(const logitmodel &rhs):_logitmodel_owner(rhs) { } logitmodel& logitmodel::operator=(const logitmodel &rhs) { if( this==&rhs ) return *this; _logitmodel_owner::operator=(rhs); return *this; } logitmodel::~logitmodel() { } /************************************************************************* MNLReport structure contains information about training process: * NGrad - number of gradient calculations * NHess - number of Hessian calculations *************************************************************************/ _mnlreport_owner::_mnlreport_owner() { p_struct = (alglib_impl::mnlreport*)alglib_impl::ae_malloc(sizeof(alglib_impl::mnlreport), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_mnlreport_init(p_struct, NULL); } _mnlreport_owner::_mnlreport_owner(const _mnlreport_owner &rhs) { p_struct = (alglib_impl::mnlreport*)alglib_impl::ae_malloc(sizeof(alglib_impl::mnlreport), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_mnlreport_init_copy(p_struct, const_cast(rhs.p_struct), NULL); } _mnlreport_owner& _mnlreport_owner::operator=(const _mnlreport_owner &rhs) { if( this==&rhs ) return *this; alglib_impl::_mnlreport_clear(p_struct); alglib_impl::_mnlreport_init_copy(p_struct, const_cast(rhs.p_struct), NULL); return *this; } _mnlreport_owner::~_mnlreport_owner() { alglib_impl::_mnlreport_clear(p_struct); ae_free(p_struct); } alglib_impl::mnlreport* _mnlreport_owner::c_ptr() { return p_struct; } alglib_impl::mnlreport* _mnlreport_owner::c_ptr() const { return const_cast(p_struct); } mnlreport::mnlreport() : _mnlreport_owner() ,ngrad(p_struct->ngrad),nhess(p_struct->nhess) { } mnlreport::mnlreport(const mnlreport 
&rhs):_mnlreport_owner(rhs) ,ngrad(p_struct->ngrad),nhess(p_struct->nhess) { } mnlreport& mnlreport::operator=(const mnlreport &rhs) { if( this==&rhs ) return *this; _mnlreport_owner::operator=(rhs); return *this; } mnlreport::~mnlreport() { } /************************************************************************* This subroutine trains logit model. INPUT PARAMETERS: XY - training set, array[0..NPoints-1,0..NVars] First NVars columns store values of independent variables, next column stores number of class (from 0 to NClasses-1) which dataset element belongs to. Fractional values are rounded to nearest integer. NPoints - training set size, NPoints>=1 NVars - number of independent variables, NVars>=1 NClasses - number of classes, NClasses>=2 OUTPUT PARAMETERS: Info - return code: * -2, if there is a point with class number outside of [0..NClasses-1]. * -1, if incorrect parameters was passed (NPoints(xy.c_ptr()), npoints, nvars, nclasses, &info, const_cast(lm.c_ptr()), const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Procesing INPUT PARAMETERS: LM - logit model, passed by non-constant reference (some fields of structure are used as temporaries when calculating model output). X - input vector, array[0..NVars-1]. Y - (possibly) preallocated buffer; if size of Y is less than NClasses, it will be reallocated.If it is large enough, it is NOT reallocated, so we can save some time on reallocation. OUTPUT PARAMETERS: Y - result, array[0..NClasses-1] Vector of posterior probabilities for classification task. -- ALGLIB -- Copyright 10.09.2008 by Bochkanov Sergey *************************************************************************/ void mnlprocess(const logitmodel &lm, const real_1d_array &x, real_1d_array &y) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::mnlprocess(const_cast(lm.c_ptr()), const_cast(x.c_ptr()), const_cast(y.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* 'interactive' variant of MNLProcess for languages like Python which support constructs like "Y = MNLProcess(LM,X)" and interactive mode of the interpreter This function allocates new array on each call, so it is significantly slower than its 'non-interactive' counterpart, but it is more convenient when you call it from command line. -- ALGLIB -- Copyright 10.09.2008 by Bochkanov Sergey *************************************************************************/ void mnlprocessi(const logitmodel &lm, const real_1d_array &x, real_1d_array &y) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::mnlprocessi(const_cast(lm.c_ptr()), const_cast(x.c_ptr()), const_cast(y.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Unpacks coefficients of logit model. Logit model have form: P(class=i) = S(i) / (S(0) + S(1) + ... +S(M-1)) S(i) = Exp(A[i,0]*X[0] + ... 
+ A[i,N-1]*X[N-1] + A[i,N]), when i(lm.c_ptr()), const_cast(a.c_ptr()), &nvars, &nclasses, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* "Packs" coefficients and creates logit model in ALGLIB format (MNLUnpack reversed). INPUT PARAMETERS: A - model (see MNLUnpack) NVars - number of independent variables NClasses - number of classes OUTPUT PARAMETERS: LM - logit model. -- ALGLIB -- Copyright 10.09.2008 by Bochkanov Sergey *************************************************************************/ void mnlpack(const real_2d_array &a, const ae_int_t nvars, const ae_int_t nclasses, logitmodel &lm) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::mnlpack(const_cast(a.c_ptr()), nvars, nclasses, const_cast(lm.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Average cross-entropy (in bits per element) on the test set INPUT PARAMETERS: LM - logit model XY - test set NPoints - test set size RESULT: CrossEntropy/(NPoints*ln(2)). -- ALGLIB -- Copyright 10.09.2008 by Bochkanov Sergey *************************************************************************/ double mnlavgce(const logitmodel &lm, const real_2d_array &xy, const ae_int_t npoints) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::mnlavgce(const_cast(lm.c_ptr()), const_cast(xy.c_ptr()), npoints, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Relative classification error on the test set INPUT PARAMETERS: LM - logit model XY - test set NPoints - test set size RESULT: percent of incorrectly classified cases. -- ALGLIB -- Copyright 10.09.2008 by Bochkanov Sergey *************************************************************************/ double mnlrelclserror(const logitmodel &lm, const real_2d_array &xy, const ae_int_t npoints) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::mnlrelclserror(const_cast(lm.c_ptr()), const_cast(xy.c_ptr()), npoints, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* RMS error on the test set INPUT PARAMETERS: LM - logit model XY - test set NPoints - test set size RESULT: root mean square error (error when estimating posterior probabilities). 
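To make the calling sequence concrete, here is a compact sketch which trains a logit model on a tiny made-up dataset, queries it, and computes the error metrics documented above (all numbers are arbitrary):

    #include "stdafx.h"
    #include <stdio.h>
    #include "dataanalysis.h"
    using namespace alglib;

    int main()
    {
        // toy training set: one independent variable, class index (0 or 1) in the last column
        real_2d_array xy = "[[1,0],[2,0],[3,0],[3,1],[4,1],[5,1]]";
        ae_int_t info;
        logitmodel lm;
        mnlreport rep;
        mnltrainh(xy, 6, 1, 2, info, lm, rep);          // info>0 on success

        // posterior probabilities for a new point
        real_1d_array x = "[2.5]";
        real_1d_array y;
        mnlprocess(lm, x, y);

        // error metrics on the (training) set, just to show the calls
        printf("rms=%.4f avgce=%.4f\n", mnlrmserror(lm, xy, 6), mnlavgce(lm, xy, 6));
        return 0;
    }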
-- ALGLIB -- Copyright 30.08.2008 by Bochkanov Sergey *************************************************************************/ double mnlrmserror(const logitmodel &lm, const real_2d_array &xy, const ae_int_t npoints) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::mnlrmserror(const_cast(lm.c_ptr()), const_cast(xy.c_ptr()), npoints, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Average error on the test set INPUT PARAMETERS: LM - logit model XY - test set NPoints - test set size RESULT: average error (error when estimating posterior probabilities). -- ALGLIB -- Copyright 30.08.2008 by Bochkanov Sergey *************************************************************************/ double mnlavgerror(const logitmodel &lm, const real_2d_array &xy, const ae_int_t npoints) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::mnlavgerror(const_cast(lm.c_ptr()), const_cast(xy.c_ptr()), npoints, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Average relative error on the test set INPUT PARAMETERS: LM - logit model XY - test set NPoints - test set size RESULT: average relative error (error when estimating posterior probabilities). -- ALGLIB -- Copyright 30.08.2008 by Bochkanov Sergey *************************************************************************/ double mnlavgrelerror(const logitmodel &lm, const real_2d_array &xy, const ae_int_t ssize) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::mnlavgrelerror(const_cast(lm.c_ptr()), const_cast(xy.c_ptr()), ssize, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Classification error on test set = MNLRelClsError*NPoints -- ALGLIB -- Copyright 10.09.2008 by Bochkanov Sergey *************************************************************************/ ae_int_t mnlclserror(const logitmodel &lm, const real_2d_array &xy, const ae_int_t npoints) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::ae_int_t result = alglib_impl::mnlclserror(const_cast(lm.c_ptr()), const_cast(xy.c_ptr()), npoints, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This structure is a MCPD (Markov Chains for Population Data) solver. You should use ALGLIB functions in order to work with this object. 
-- ALGLIB -- Copyright 23.05.2010 by Bochkanov Sergey *************************************************************************/ _mcpdstate_owner::_mcpdstate_owner() { p_struct = (alglib_impl::mcpdstate*)alglib_impl::ae_malloc(sizeof(alglib_impl::mcpdstate), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_mcpdstate_init(p_struct, NULL); } _mcpdstate_owner::_mcpdstate_owner(const _mcpdstate_owner &rhs) { p_struct = (alglib_impl::mcpdstate*)alglib_impl::ae_malloc(sizeof(alglib_impl::mcpdstate), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_mcpdstate_init_copy(p_struct, const_cast(rhs.p_struct), NULL); } _mcpdstate_owner& _mcpdstate_owner::operator=(const _mcpdstate_owner &rhs) { if( this==&rhs ) return *this; alglib_impl::_mcpdstate_clear(p_struct); alglib_impl::_mcpdstate_init_copy(p_struct, const_cast(rhs.p_struct), NULL); return *this; } _mcpdstate_owner::~_mcpdstate_owner() { alglib_impl::_mcpdstate_clear(p_struct); ae_free(p_struct); } alglib_impl::mcpdstate* _mcpdstate_owner::c_ptr() { return p_struct; } alglib_impl::mcpdstate* _mcpdstate_owner::c_ptr() const { return const_cast(p_struct); } mcpdstate::mcpdstate() : _mcpdstate_owner() { } mcpdstate::mcpdstate(const mcpdstate &rhs):_mcpdstate_owner(rhs) { } mcpdstate& mcpdstate::operator=(const mcpdstate &rhs) { if( this==&rhs ) return *this; _mcpdstate_owner::operator=(rhs); return *this; } mcpdstate::~mcpdstate() { } /************************************************************************* This structure is a MCPD training report: InnerIterationsCount - number of inner iterations of the underlying optimization algorithm OuterIterationsCount - number of outer iterations of the underlying optimization algorithm NFEV - number of merit function evaluations TerminationType - termination type (same as for MinBLEIC optimizer, positive values denote success, negative ones - failure) -- ALGLIB -- Copyright 23.05.2010 by Bochkanov Sergey *************************************************************************/ _mcpdreport_owner::_mcpdreport_owner() { p_struct = (alglib_impl::mcpdreport*)alglib_impl::ae_malloc(sizeof(alglib_impl::mcpdreport), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_mcpdreport_init(p_struct, NULL); } _mcpdreport_owner::_mcpdreport_owner(const _mcpdreport_owner &rhs) { p_struct = (alglib_impl::mcpdreport*)alglib_impl::ae_malloc(sizeof(alglib_impl::mcpdreport), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_mcpdreport_init_copy(p_struct, const_cast(rhs.p_struct), NULL); } _mcpdreport_owner& _mcpdreport_owner::operator=(const _mcpdreport_owner &rhs) { if( this==&rhs ) return *this; alglib_impl::_mcpdreport_clear(p_struct); alglib_impl::_mcpdreport_init_copy(p_struct, const_cast(rhs.p_struct), NULL); return *this; } _mcpdreport_owner::~_mcpdreport_owner() { alglib_impl::_mcpdreport_clear(p_struct); ae_free(p_struct); } alglib_impl::mcpdreport* _mcpdreport_owner::c_ptr() { return p_struct; } alglib_impl::mcpdreport* _mcpdreport_owner::c_ptr() const { return const_cast(p_struct); } mcpdreport::mcpdreport() : _mcpdreport_owner() ,inneriterationscount(p_struct->inneriterationscount),outeriterationscount(p_struct->outeriterationscount),nfev(p_struct->nfev),terminationtype(p_struct->terminationtype) { } mcpdreport::mcpdreport(const mcpdreport &rhs):_mcpdreport_owner(rhs) 
,inneriterationscount(p_struct->inneriterationscount),outeriterationscount(p_struct->outeriterationscount),nfev(p_struct->nfev),terminationtype(p_struct->terminationtype) { } mcpdreport& mcpdreport::operator=(const mcpdreport &rhs) { if( this==&rhs ) return *this; _mcpdreport_owner::operator=(rhs); return *this; } mcpdreport::~mcpdreport() { } /************************************************************************* DESCRIPTION: This function creates MCPD (Markov Chains for Population Data) solver. This solver can be used to find transition matrix P for N-dimensional prediction problem where transition from X[i] to X[i+1] is modelled as X[i+1] = P*X[i] where X[i] and X[i+1] are N-dimensional population vectors (components of each X are non-negative), and P is a N*N transition matrix (elements of P are non-negative, each column sums to 1.0). Such models arise when when: * there is some population of individuals * individuals can have different states * individuals can transit from one state to another * population size is constant, i.e. there is no new individuals and no one leaves population * you want to model transitions of individuals from one state into another USAGE: Here we give very brief outline of the MCPD. We strongly recommend you to read examples in the ALGLIB Reference Manual and to read ALGLIB User Guide on data analysis which is available at http://www.alglib.net/dataanalysis/ 1. User initializes algorithm state with MCPDCreate() call 2. User adds one or more tracks - sequences of states which describe evolution of a system being modelled from different starting conditions 3. User may add optional boundary, equality and/or linear constraints on the coefficients of P by calling one of the following functions: * MCPDSetEC() to set equality constraints * MCPDSetBC() to set bound constraints * MCPDSetLC() to set linear constraints 4. Optionally, user may set custom weights for prediction errors (by default, algorithm assigns non-equal, automatically chosen weights for errors in the prediction of different components of X). It can be done with a call of MCPDSetPredictionWeights() function. 5. User calls MCPDSolve() function which takes algorithm state and pointer (delegate, etc.) to callback function which calculates F/G. 6. User calls MCPDResults() to get solution INPUT PARAMETERS: N - problem dimension, N>=1 OUTPUT PARAMETERS: State - structure stores algorithm state -- ALGLIB -- Copyright 23.05.2010 by Bochkanov Sergey *************************************************************************/ void mcpdcreate(const ae_int_t n, mcpdstate &s) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::mcpdcreate(n, const_cast(s.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* DESCRIPTION: This function is a specialized version of MCPDCreate() function, and we recommend you to read comments for this function for general information about MCPD solver. This function creates MCPD (Markov Chains for Population Data) solver for "Entry-state" model, i.e. 
model where transition from X[i] to X[i+1] is modelled as X[i+1] = P*X[i] where X[i] and X[i+1] are N-dimensional state vectors P is a N*N transition matrix and one selected component of X[] is called "entry" state and is treated in a special way: system state always transits from "entry" state to some another state system state can not transit from any state into "entry" state Such conditions basically mean that row of P which corresponds to "entry" state is zero. Such models arise when: * there is some population of individuals * individuals can have different states * individuals can transit from one state to another * population size is NOT constant - at every moment of time there is some (unpredictable) amount of "new" individuals, which can transit into one of the states at the next turn, but still no one leaves population * you want to model transitions of individuals from one state into another * but you do NOT want to predict amount of "new" individuals because it does not depends on individuals already present (hence system can not transit INTO entry state - it can only transit FROM it). This model is discussed in more details in the ALGLIB User Guide (see http://www.alglib.net/dataanalysis/ for more data). INPUT PARAMETERS: N - problem dimension, N>=2 EntryState- index of entry state, in 0..N-1 OUTPUT PARAMETERS: State - structure stores algorithm state -- ALGLIB -- Copyright 23.05.2010 by Bochkanov Sergey *************************************************************************/ void mcpdcreateentry(const ae_int_t n, const ae_int_t entrystate, mcpdstate &s) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::mcpdcreateentry(n, entrystate, const_cast(s.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* DESCRIPTION: This function is a specialized version of MCPDCreate() function, and we recommend you to read comments for this function for general information about MCPD solver. This function creates MCPD (Markov Chains for Population Data) solver for "Exit-state" model, i.e. model where transition from X[i] to X[i+1] is modelled as X[i+1] = P*X[i] where X[i] and X[i+1] are N-dimensional state vectors P is a N*N transition matrix and one selected component of X[] is called "exit" state and is treated in a special way: system state can transit from any state into "exit" state system state can not transit from "exit" state into any other state transition operator discards "exit" state (makes it zero at each turn) Such conditions basically mean that column of P which corresponds to "exit" state is zero. Multiplication by such P may decrease sum of vector components. Such models arise when: * there is some population of individuals * individuals can have different states * individuals can transit from one state to another * population size is NOT constant - individuals can move into "exit" state and leave population at the next turn, but there are no new individuals * amount of individuals which leave population can be predicted * you want to model transitions of individuals from one state into another (including transitions into the "exit" state) This model is discussed in more details in the ALGLIB User Guide (see http://www.alglib.net/dataanalysis/ for more data). 
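As a short illustration (assuming dataanalysis.h is included and using namespace alglib; the dimensions are arbitrary), an exit-state solver could be created as follows:

    // hypothetical 3-state system in which state #2 is the "exit" state
    mcpdstate s;
    mcpdcreateexit(3, 2, s);    // column 2 of the transition matrix P is forced to zero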
INPUT PARAMETERS: N - problem dimension, N>=2 ExitState- index of exit state, in 0..N-1 OUTPUT PARAMETERS: State - structure stores algorithm state -- ALGLIB -- Copyright 23.05.2010 by Bochkanov Sergey *************************************************************************/ void mcpdcreateexit(const ae_int_t n, const ae_int_t exitstate, mcpdstate &s) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::mcpdcreateexit(n, exitstate, const_cast(s.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* DESCRIPTION: This function is a specialized version of MCPDCreate() function, and we recommend you to read comments for this function for general information about MCPD solver. This function creates MCPD (Markov Chains for Population Data) solver for "Entry-Exit-states" model, i.e. model where transition from X[i] to X[i+1] is modelled as X[i+1] = P*X[i] where X[i] and X[i+1] are N-dimensional state vectors P is a N*N transition matrix one selected component of X[] is called "entry" state and is treated in a special way: system state always transits from "entry" state to some another state system state can not transit from any state into "entry" state and another one component of X[] is called "exit" state and is treated in a special way too: system state can transit from any state into "exit" state system state can not transit from "exit" state into any other state transition operator discards "exit" state (makes it zero at each turn) Such conditions basically mean that: row of P which corresponds to "entry" state is zero column of P which corresponds to "exit" state is zero Multiplication by such P may decrease sum of vector components. Such models arise when: * there is some population of individuals * individuals can have different states * individuals can transit from one state to another * population size is NOT constant * at every moment of time there is some (unpredictable) amount of "new" individuals, which can transit into one of the states at the next turn * some individuals can move (predictably) into "exit" state and leave population at the next turn * you want to model transitions of individuals from one state into another, including transitions from the "entry" state and into the "exit" state. * but you do NOT want to predict amount of "new" individuals because it does not depends on individuals already present (hence system can not transit INTO entry state - it can only transit FROM it). This model is discussed in more details in the ALGLIB User Guide (see http://www.alglib.net/dataanalysis/ for more data). 
INPUT PARAMETERS: N - problem dimension, N>=2 EntryState- index of entry state, in 0..N-1 ExitState- index of exit state, in 0..N-1 OUTPUT PARAMETERS: State - structure stores algorithm state -- ALGLIB -- Copyright 23.05.2010 by Bochkanov Sergey *************************************************************************/ void mcpdcreateentryexit(const ae_int_t n, const ae_int_t entrystate, const ae_int_t exitstate, mcpdstate &s) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::mcpdcreateentryexit(n, entrystate, exitstate, const_cast(s.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function is used to add a track - sequence of system states at the different moments of its evolution. You may add one or several tracks to the MCPD solver. In case you have several tracks, they won't overwrite each other. For example, if you pass two tracks, A1-A2-A3 (system at t=A+1, t=A+2 and t=A+3) and B1-B2-B3, then solver will try to model transitions from t=A+1 to t=A+2, t=A+2 to t=A+3, t=B+1 to t=B+2, t=B+2 to t=B+3. But it WONT mix these two tracks - i.e. it wont try to model transition from t=A+3 to t=B+1. INPUT PARAMETERS: S - solver XY - track, array[K,N]: * I-th row is a state at t=I * elements of XY must be non-negative (exception will be thrown on negative elements) K - number of points in a track * if given, only leading K rows of XY are used * if not given, automatically determined from size of XY NOTES: 1. Track may contain either proportional or population data: * with proportional data all rows of XY must sum to 1.0, i.e. we have proportions instead of absolute population values * with population data rows of XY contain population counts and generally do not sum to 1.0 (although they still must be non-negative) -- ALGLIB -- Copyright 23.05.2010 by Bochkanov Sergey *************************************************************************/ void mcpdaddtrack(const mcpdstate &s, const real_2d_array &xy, const ae_int_t k) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::mcpdaddtrack(const_cast(s.c_ptr()), const_cast(xy.c_ptr()), k, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function is used to add a track - sequence of system states at the different moments of its evolution. You may add one or several tracks to the MCPD solver. In case you have several tracks, they won't overwrite each other. For example, if you pass two tracks, A1-A2-A3 (system at t=A+1, t=A+2 and t=A+3) and B1-B2-B3, then solver will try to model transitions from t=A+1 to t=A+2, t=A+2 to t=A+3, t=B+1 to t=B+2, t=B+2 to t=B+3. But it WONT mix these two tracks - i.e. it wont try to model transition from t=A+3 to t=B+1. INPUT PARAMETERS: S - solver XY - track, array[K,N]: * I-th row is a state at t=I * elements of XY must be non-negative (exception will be thrown on negative elements) K - number of points in a track * if given, only leading K rows of XY are used * if not given, automatically determined from size of XY NOTES: 1. 
Track may contain either proportional or population data: * with proportional data all rows of XY must sum to 1.0, i.e. we have proportions instead of absolute population values * with population data rows of XY contain population counts and generally do not sum to 1.0 (although they still must be non-negative) -- ALGLIB -- Copyright 23.05.2010 by Bochkanov Sergey *************************************************************************/ void mcpdaddtrack(const mcpdstate &s, const real_2d_array &xy) { alglib_impl::ae_state _alglib_env_state; ae_int_t k; k = xy.rows(); alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::mcpdaddtrack(const_cast(s.c_ptr()), const_cast(xy.c_ptr()), k, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function is used to add equality constraints on the elements of the transition matrix P. MCPD solver has four types of constraints which can be placed on P: * user-specified equality constraints (optional) * user-specified bound constraints (optional) * user-specified general linear constraints (optional) * basic constraints (always present): * non-negativity: P[i,j]>=0 * consistency: every column of P sums to 1.0 Final constraints which are passed to the underlying optimizer are calculated as intersection of all present constraints. For example, you may specify boundary constraint on P[0,0] and equality one: 0.1<=P[0,0]<=0.9 P[0,0]=0.5 Such combination of constraints will be silently reduced to their intersection, which is P[0,0]=0.5. This function can be used to place equality constraints on arbitrary subset of elements of P. Set of constraints is specified by EC, which may contain either NAN's or finite numbers from [0,1]. NAN denotes absence of constraint, finite number denotes equality constraint on specific element of P. You can also use MCPDAddEC() function which allows to ADD equality constraint for one element of P without changing constraints for other elements. These functions (MCPDSetEC and MCPDAddEC) interact as follows: * there is internal matrix of equality constraints which is stored in the MCPD solver * MCPDSetEC() replaces this matrix by another one (SET) * MCPDAddEC() modifies one element of this matrix and leaves other ones unchanged (ADD) * thus MCPDAddEC() call preserves all modifications done by previous calls, while MCPDSetEC() completely discards all changes done to the equality constraints. INPUT PARAMETERS: S - solver EC - equality constraints, array[N,N]. Elements of EC can be either NAN's or finite numbers from [0,1]. NAN denotes absence of constraints, while finite value denotes equality constraint on the corresponding element of P. NOTES: 1. infinite values of EC will lead to exception being thrown. Values less than 0.0 or greater than 1.0 will lead to error code being returned after call to MCPDSolve(). 
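For example, in a hypothetical 2-state model one element of P can be pinned while the rest stay unconstrained (fp_nan is the NAN constant exposed by the ALGLIB C++ interface):

    mcpdstate s;
    mcpdcreate(2, s);
    real_2d_array ec;
    ec.setlength(2, 2);
    for(int i=0; i<2; i++)
        for(int j=0; j<2; j++)
            ec[i][j] = fp_nan;            // NAN = no equality constraint on this element
    ec[0][0] = 0.3;                       // require P[0,0]=0.3
    mcpdsetec(s, ec);
    // the same single constraint could instead be added with mcpdaddec(s, 0, 0, 0.3);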
-- ALGLIB -- Copyright 23.05.2010 by Bochkanov Sergey *************************************************************************/ void mcpdsetec(const mcpdstate &s, const real_2d_array &ec) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::mcpdsetec(const_cast(s.c_ptr()), const_cast(ec.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function is used to add equality constraints on the elements of the transition matrix P. MCPD solver has four types of constraints which can be placed on P: * user-specified equality constraints (optional) * user-specified bound constraints (optional) * user-specified general linear constraints (optional) * basic constraints (always present): * non-negativity: P[i,j]>=0 * consistency: every column of P sums to 1.0 Final constraints which are passed to the underlying optimizer are calculated as intersection of all present constraints. For example, you may specify boundary constraint on P[0,0] and equality one: 0.1<=P[0,0]<=0.9 P[0,0]=0.5 Such combination of constraints will be silently reduced to their intersection, which is P[0,0]=0.5. This function can be used to ADD equality constraint for one element of P without changing constraints for other elements. You can also use MCPDSetEC() function which allows you to specify arbitrary set of equality constraints in one call. These functions (MCPDSetEC and MCPDAddEC) interact as follows: * there is internal matrix of equality constraints which is stored in the MCPD solver * MCPDSetEC() replaces this matrix by another one (SET) * MCPDAddEC() modifies one element of this matrix and leaves other ones unchanged (ADD) * thus MCPDAddEC() call preserves all modifications done by previous calls, while MCPDSetEC() completely discards all changes done to the equality constraints. INPUT PARAMETERS: S - solver I - row index of element being constrained J - column index of element being constrained C - value (constraint for P[I,J]). Can be either NAN (no constraint) or finite value from [0,1]. NOTES: 1. infinite values of C will lead to exception being thrown. Values less than 0.0 or greater than 1.0 will lead to error code being returned after call to MCPDSolve(). -- ALGLIB -- Copyright 23.05.2010 by Bochkanov Sergey *************************************************************************/ void mcpdaddec(const mcpdstate &s, const ae_int_t i, const ae_int_t j, const double c) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::mcpdaddec(const_cast(s.c_ptr()), i, j, c, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function is used to add bound constraints on the elements of the transition matrix P. 
MCPD solver has four types of constraints which can be placed on P: * user-specified equality constraints (optional) * user-specified bound constraints (optional) * user-specified general linear constraints (optional) * basic constraints (always present): * non-negativity: P[i,j]>=0 * consistency: every column of P sums to 1.0 Final constraints which are passed to the underlying optimizer are calculated as intersection of all present constraints. For example, you may specify boundary constraint on P[0,0] and equality one: 0.1<=P[0,0]<=0.9 P[0,0]=0.5 Such combination of constraints will be silently reduced to their intersection, which is P[0,0]=0.5. This function can be used to place bound constraints on arbitrary subset of elements of P. Set of constraints is specified by BndL/BndU matrices, which may contain arbitrary combination of finite numbers or infinities (like -INF(s.c_ptr()), const_cast(bndl.c_ptr()), const_cast(bndu.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function is used to add bound constraints on the elements of the transition matrix P. MCPD solver has four types of constraints which can be placed on P: * user-specified equality constraints (optional) * user-specified bound constraints (optional) * user-specified general linear constraints (optional) * basic constraints (always present): * non-negativity: P[i,j]>=0 * consistency: every column of P sums to 1.0 Final constraints which are passed to the underlying optimizer are calculated as intersection of all present constraints. For example, you may specify boundary constraint on P[0,0] and equality one: 0.1<=P[0,0]<=0.9 P[0,0]=0.5 Such combination of constraints will be silently reduced to their intersection, which is P[0,0]=0.5. This function can be used to ADD bound constraint for one element of P without changing constraints for other elements. You can also use MCPDSetBC() function which allows to place bound constraints on arbitrary subset of elements of P. Set of constraints is specified by BndL/BndU matrices, which may contain arbitrary combination of finite numbers or infinities (like -INF(s.c_ptr()), i, j, bndl, bndu, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function is used to set linear equality/inequality constraints on the elements of the transition matrix P. This function can be used to set one or several general linear constraints on the elements of P. Two types of constraints are supported: * equality constraints * inequality constraints (both less-or-equal and greater-or-equal) Coefficients of constraints are specified by matrix C (one of the parameters). One row of C corresponds to one constraint. Because transition matrix P has N*N elements, we need N*N columns to store all coefficients (they are stored row by row), and one more column to store right part - hence C has N*N+1 columns. Constraint kind is stored in the CT array. Thus, I-th linear constraint is P[0,0]*C[I,0] + P[0,1]*C[I,1] + .. + P[0,N-1]*C[I,N-1] + + P[1,0]*C[I,N] + P[1,1]*C[I,N+1] + ... + + P[N-1,N-1]*C[I,N*N-1] ?=? C[I,N*N] where ?=? can be either "=" (CT[i]=0), "<=" (CT[i]<0) or ">=" (CT[i]>0). 
Your constraint may involve only some subset of P (less than N*N elements). For example it can be something like P[0,0] + P[0,1] = 0.5 In this case you still should pass matrix with N*N+1 columns, but all its elements (except for C[0,0], C[0,1] and C[0,N*N-1]) will be zero. INPUT PARAMETERS: S - solver C - array[K,N*N+1] - coefficients of constraints (see above for complete description) CT - array[K] - constraint types (see above for complete description) K - number of equality/inequality constraints, K>=0: * if given, only leading K elements of C/CT are used * if not given, automatically determined from sizes of C/CT -- ALGLIB -- Copyright 23.05.2010 by Bochkanov Sergey *************************************************************************/ void mcpdsetlc(const mcpdstate &s, const real_2d_array &c, const integer_1d_array &ct, const ae_int_t k) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::mcpdsetlc(const_cast(s.c_ptr()), const_cast(c.c_ptr()), const_cast(ct.c_ptr()), k, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function is used to set linear equality/inequality constraints on the elements of the transition matrix P. This function can be used to set one or several general linear constraints on the elements of P. Two types of constraints are supported: * equality constraints * inequality constraints (both less-or-equal and greater-or-equal) Coefficients of constraints are specified by matrix C (one of the parameters). One row of C corresponds to one constraint. Because transition matrix P has N*N elements, we need N*N columns to store all coefficients (they are stored row by row), and one more column to store right part - hence C has N*N+1 columns. Constraint kind is stored in the CT array. Thus, I-th linear constraint is P[0,0]*C[I,0] + P[0,1]*C[I,1] + .. + P[0,N-1]*C[I,N-1] + + P[1,0]*C[I,N] + P[1,1]*C[I,N+1] + ... + + P[N-1,N-1]*C[I,N*N-1] ?=? C[I,N*N] where ?=? can be either "=" (CT[i]=0), "<=" (CT[i]<0) or ">=" (CT[i]>0). Your constraint may involve only some subset of P (less than N*N elements). For example it can be something like P[0,0] + P[0,1] = 0.5 In this case you still should pass matrix with N*N+1 columns, but all its elements (except for C[0,0], C[0,1] and C[0,N*N-1]) will be zero. 
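For a hypothetical 2-state solver s (N=2), the constraint P[0,0]+P[0,1]=0.5 mentioned above would be passed as a single row with N*N+1=5 columns, coefficients stored row by row and the right-hand side in the last column:

    real_2d_array c = "[[1,1,0,0,0.5]]";    // coefficients for P[0,0], P[0,1], P[1,0], P[1,1], then the RHS
    integer_1d_array ct = "[0]";            // CT[0]=0 means "="
    mcpdsetlc(s, c, ct, 1);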
INPUT PARAMETERS: S - solver C - array[K,N*N+1] - coefficients of constraints (see above for complete description) CT - array[K] - constraint types (see above for complete description) K - number of equality/inequality constraints, K>=0: * if given, only leading K elements of C/CT are used * if not given, automatically determined from sizes of C/CT -- ALGLIB -- Copyright 23.05.2010 by Bochkanov Sergey *************************************************************************/ void mcpdsetlc(const mcpdstate &s, const real_2d_array &c, const integer_1d_array &ct) { alglib_impl::ae_state _alglib_env_state; ae_int_t k; if( (c.rows()!=ct.length())) throw ap_error("Error while calling 'mcpdsetlc': looks like one of arguments has wrong size"); k = c.rows(); alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::mcpdsetlc(const_cast(s.c_ptr()), const_cast(c.c_ptr()), const_cast(ct.c_ptr()), k, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function allows to tune amount of Tikhonov regularization being applied to your problem. By default, regularizing term is equal to r*||P-prior_P||^2, where r is a small non-zero value, P is transition matrix, prior_P is identity matrix, ||X||^2 is a sum of squared elements of X. This function allows you to change coefficient r. You can also change prior values with MCPDSetPrior() function. INPUT PARAMETERS: S - solver V - regularization coefficient, finite non-negative value. It is not recommended to specify zero value unless you are pretty sure that you want it. -- ALGLIB -- Copyright 23.05.2010 by Bochkanov Sergey *************************************************************************/ void mcpdsettikhonovregularizer(const mcpdstate &s, const double v) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::mcpdsettikhonovregularizer(const_cast(s.c_ptr()), v, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function allows to set prior values used for regularization of your problem. By default, regularizing term is equal to r*||P-prior_P||^2, where r is a small non-zero value, P is transition matrix, prior_P is identity matrix, ||X||^2 is a sum of squared elements of X. This function allows you to change prior values prior_P. You can also change r with MCPDSetTikhonovRegularizer() function. INPUT PARAMETERS: S - solver PP - array[N,N], matrix of prior values: 1. elements must be real numbers from [0,1] 2. columns must sum to 1.0. First property is checked (exception is thrown otherwise), while second one is not checked/enforced. 
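A small sketch for a hypothetical 2-state solver s which combines both settings (the values are arbitrary):

    real_2d_array pp = "[[0.9,0.2],[0.1,0.8]]";    // prior transition matrix, columns sum to 1.0
    mcpdsetprior(s, pp);
    mcpdsettikhonovregularizer(s, 0.001);          // regularize towards pp with r=0.001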
-- ALGLIB -- Copyright 23.05.2010 by Bochkanov Sergey *************************************************************************/ void mcpdsetprior(const mcpdstate &s, const real_2d_array &pp) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::mcpdsetprior(const_cast(s.c_ptr()), const_cast(pp.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function is used to change prediction weights MCPD solver scales prediction errors as follows Error(P) = ||W*(y-P*x)||^2 where x is a system state at time t y is a system state at time t+1 P is a transition matrix W is a diagonal scaling matrix By default, weights are chosen in order to minimize relative prediction error instead of absolute one. For example, if one component of state is about 0.5 in magnitude and another one is about 0.05, then algorithm will make corresponding weights equal to 2.0 and 20.0. INPUT PARAMETERS: S - solver PW - array[N], weights: * must be non-negative values (exception will be thrown otherwise) * zero values will be replaced by automatically chosen values -- ALGLIB -- Copyright 23.05.2010 by Bochkanov Sergey *************************************************************************/ void mcpdsetpredictionweights(const mcpdstate &s, const real_1d_array &pw) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::mcpdsetpredictionweights(const_cast(s.c_ptr()), const_cast(pw.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function is used to start solution of the MCPD problem. After return from this function, you can use MCPDResults() to get solution and completion code. -- ALGLIB -- Copyright 23.05.2010 by Bochkanov Sergey *************************************************************************/ void mcpdsolve(const mcpdstate &s) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::mcpdsolve(const_cast(s.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* MCPD results INPUT PARAMETERS: State - algorithm state OUTPUT PARAMETERS: P - array[N,N], transition matrix Rep - optimization report. You should check Rep.TerminationType in order to distinguish successful termination from unsuccessful one. Speaking short, positive values denote success, negative ones are failures. More information about fields of this structure can be found in the comments on MCPDReport datatype. 
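Putting the pieces together, a complete (if artificial) MCPD session might look like this; the track data is made up and simply describes a 2-state population observed at four consecutive moments:

    #include "stdafx.h"
    #include <stdio.h>
    #include "dataanalysis.h"
    using namespace alglib;

    int main()
    {
        mcpdstate s;
        mcpdreport rep;
        real_2d_array track = "[[1.0,0.0],[0.8,0.2],[0.64,0.36],[0.512,0.488]]";
        real_2d_array p;
        mcpdcreate(2, s);
        mcpdaddtrack(s, track);           // K is taken from the number of rows of track
        mcpdsolve(s);
        mcpdresults(s, p, rep);
        if( rep.terminationtype>0 )
            printf("P = %s\n", p.tostring(3).c_str());
        return 0;
    }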
-- ALGLIB -- Copyright 23.05.2010 by Bochkanov Sergey *************************************************************************/ void mcpdresults(const mcpdstate &s, real_2d_array &p, mcpdreport &rep) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::mcpdresults(const_cast(s.c_ptr()), const_cast(p.c_ptr()), const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Neural networks ensemble *************************************************************************/ _mlpensemble_owner::_mlpensemble_owner() { p_struct = (alglib_impl::mlpensemble*)alglib_impl::ae_malloc(sizeof(alglib_impl::mlpensemble), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_mlpensemble_init(p_struct, NULL); } _mlpensemble_owner::_mlpensemble_owner(const _mlpensemble_owner &rhs) { p_struct = (alglib_impl::mlpensemble*)alglib_impl::ae_malloc(sizeof(alglib_impl::mlpensemble), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_mlpensemble_init_copy(p_struct, const_cast(rhs.p_struct), NULL); } _mlpensemble_owner& _mlpensemble_owner::operator=(const _mlpensemble_owner &rhs) { if( this==&rhs ) return *this; alglib_impl::_mlpensemble_clear(p_struct); alglib_impl::_mlpensemble_init_copy(p_struct, const_cast(rhs.p_struct), NULL); return *this; } _mlpensemble_owner::~_mlpensemble_owner() { alglib_impl::_mlpensemble_clear(p_struct); ae_free(p_struct); } alglib_impl::mlpensemble* _mlpensemble_owner::c_ptr() { return p_struct; } alglib_impl::mlpensemble* _mlpensemble_owner::c_ptr() const { return const_cast(p_struct); } mlpensemble::mlpensemble() : _mlpensemble_owner() { } mlpensemble::mlpensemble(const mlpensemble &rhs):_mlpensemble_owner(rhs) { } mlpensemble& mlpensemble::operator=(const mlpensemble &rhs) { if( this==&rhs ) return *this; _mlpensemble_owner::operator=(rhs); return *this; } mlpensemble::~mlpensemble() { } /************************************************************************* This function serializes data structure to string. Important properties of s_out: * it contains alphanumeric characters, dots, underscores, minus signs * these symbols are grouped into words, which are separated by spaces and Windows-style (CR+LF) newlines * although serializer uses spaces and CR+LF as separators, you can replace any separator character by arbitrary combination of spaces, tabs, Windows or Unix newlines. It allows flexible reformatting of the string in case you want to include it into text or XML file. But you should not insert separators into the middle of the "words" nor you should change case of letters. * s_out can be freely moved between 32-bit and 64-bit systems, little and big endian machines, and so on. You can serialize structure on 32-bit machine and unserialize it on 64-bit one (or vice versa), or serialize it on SPARC and unserialize on x86. You can also serialize it in C++ version of ALGLIB and unserialize in C# one, and vice versa. 
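For example, a hypothetical round trip through a string looks like this (assuming dataanalysis.h is included and using namespace alglib):

    mlpensemble e1, e2;
    mlpecreate0(2, 1, 10, e1);      // ensemble of 10 networks with 2 inputs and 1 output
    std::string s;
    mlpeserialize(e1, s);           // s now holds a portable text representation of e1
    mlpeunserialize(s, e2);         // e2 is reconstructed from s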
*************************************************************************/ void mlpeserialize(mlpensemble &obj, std::string &s_out) { alglib_impl::ae_state state; alglib_impl::ae_serializer serializer; alglib_impl::ae_int_t ssize; alglib_impl::ae_state_init(&state); try { alglib_impl::ae_serializer_init(&serializer); alglib_impl::ae_serializer_alloc_start(&serializer); alglib_impl::mlpealloc(&serializer, obj.c_ptr(), &state); ssize = alglib_impl::ae_serializer_get_alloc_size(&serializer); s_out.clear(); s_out.reserve((size_t)(ssize+1)); alglib_impl::ae_serializer_sstart_str(&serializer, &s_out); alglib_impl::mlpeserialize(&serializer, obj.c_ptr(), &state); alglib_impl::ae_serializer_stop(&serializer, &state); if( s_out.length()>(size_t)ssize ) throw ap_error("ALGLIB: serialization integrity error"); alglib_impl::ae_serializer_clear(&serializer); alglib_impl::ae_state_clear(&state); } catch(alglib_impl::ae_error_type) { throw ap_error(state.error_msg); } } /************************************************************************* This function unserializes data structure from string. *************************************************************************/ void mlpeunserialize(const std::string &s_in, mlpensemble &obj) { alglib_impl::ae_state state; alglib_impl::ae_serializer serializer; alglib_impl::ae_state_init(&state); try { alglib_impl::ae_serializer_init(&serializer); alglib_impl::ae_serializer_ustart_str(&serializer, &s_in); alglib_impl::mlpeunserialize(&serializer, obj.c_ptr(), &state); alglib_impl::ae_serializer_stop(&serializer, &state); alglib_impl::ae_serializer_clear(&serializer); alglib_impl::ae_state_clear(&state); } catch(alglib_impl::ae_error_type) { throw ap_error(state.error_msg); } } /************************************************************************* This function serializes data structure to C++ stream. Data stream generated by this function is same as string representation generated by string version of serializer - alphanumeric characters, dots, underscores, minus signs, which are grouped into words separated by spaces and CR+LF. We recommend you to read comments on string version of serializer to find out more about serialization of AlGLIB objects. *************************************************************************/ void mlpeserialize(mlpensemble &obj, std::ostream &s_out) { alglib_impl::ae_state state; alglib_impl::ae_serializer serializer; alglib_impl::ae_state_init(&state); try { alglib_impl::ae_serializer_init(&serializer); alglib_impl::ae_serializer_alloc_start(&serializer); alglib_impl::mlpealloc(&serializer, obj.c_ptr(), &state); alglib_impl::ae_serializer_get_alloc_size(&serializer); // not actually needed, but we have to ask alglib_impl::ae_serializer_sstart_stream(&serializer, &s_out); alglib_impl::mlpeserialize(&serializer, obj.c_ptr(), &state); alglib_impl::ae_serializer_stop(&serializer, &state); alglib_impl::ae_serializer_clear(&serializer); alglib_impl::ae_state_clear(&state); } catch(alglib_impl::ae_error_type) { throw ap_error(state.error_msg); } } /************************************************************************* This function unserializes data structure from stream. 
*************************************************************************/ void mlpeunserialize(const std::istream &s_in, mlpensemble &obj) { alglib_impl::ae_state state; alglib_impl::ae_serializer serializer; alglib_impl::ae_state_init(&state); try { alglib_impl::ae_serializer_init(&serializer); alglib_impl::ae_serializer_ustart_stream(&serializer, &s_in); alglib_impl::mlpeunserialize(&serializer, obj.c_ptr(), &state); alglib_impl::ae_serializer_stop(&serializer, &state); alglib_impl::ae_serializer_clear(&serializer); alglib_impl::ae_state_clear(&state); } catch(alglib_impl::ae_error_type) { throw ap_error(state.error_msg); } } /************************************************************************* Like MLPCreate0, but for ensembles. -- ALGLIB -- Copyright 18.02.2009 by Bochkanov Sergey *************************************************************************/ void mlpecreate0(const ae_int_t nin, const ae_int_t nout, const ae_int_t ensemblesize, mlpensemble &ensemble) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::mlpecreate0(nin, nout, ensemblesize, const_cast(ensemble.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Like MLPCreate1, but for ensembles. -- ALGLIB -- Copyright 18.02.2009 by Bochkanov Sergey *************************************************************************/ void mlpecreate1(const ae_int_t nin, const ae_int_t nhid, const ae_int_t nout, const ae_int_t ensemblesize, mlpensemble &ensemble) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::mlpecreate1(nin, nhid, nout, ensemblesize, const_cast(ensemble.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Like MLPCreate2, but for ensembles. -- ALGLIB -- Copyright 18.02.2009 by Bochkanov Sergey *************************************************************************/ void mlpecreate2(const ae_int_t nin, const ae_int_t nhid1, const ae_int_t nhid2, const ae_int_t nout, const ae_int_t ensemblesize, mlpensemble &ensemble) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::mlpecreate2(nin, nhid1, nhid2, nout, ensemblesize, const_cast(ensemble.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Like MLPCreateB0, but for ensembles. 
-- ALGLIB -- Copyright 18.02.2009 by Bochkanov Sergey *************************************************************************/ void mlpecreateb0(const ae_int_t nin, const ae_int_t nout, const double b, const double d, const ae_int_t ensemblesize, mlpensemble &ensemble) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::mlpecreateb0(nin, nout, b, d, ensemblesize, const_cast(ensemble.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Like MLPCreateB1, but for ensembles. -- ALGLIB -- Copyright 18.02.2009 by Bochkanov Sergey *************************************************************************/ void mlpecreateb1(const ae_int_t nin, const ae_int_t nhid, const ae_int_t nout, const double b, const double d, const ae_int_t ensemblesize, mlpensemble &ensemble) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::mlpecreateb1(nin, nhid, nout, b, d, ensemblesize, const_cast(ensemble.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Like MLPCreateB2, but for ensembles. -- ALGLIB -- Copyright 18.02.2009 by Bochkanov Sergey *************************************************************************/ void mlpecreateb2(const ae_int_t nin, const ae_int_t nhid1, const ae_int_t nhid2, const ae_int_t nout, const double b, const double d, const ae_int_t ensemblesize, mlpensemble &ensemble) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::mlpecreateb2(nin, nhid1, nhid2, nout, b, d, ensemblesize, const_cast(ensemble.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Like MLPCreateR0, but for ensembles. -- ALGLIB -- Copyright 18.02.2009 by Bochkanov Sergey *************************************************************************/ void mlpecreater0(const ae_int_t nin, const ae_int_t nout, const double a, const double b, const ae_int_t ensemblesize, mlpensemble &ensemble) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::mlpecreater0(nin, nout, a, b, ensemblesize, const_cast(ensemble.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Like MLPCreateR1, but for ensembles. 
-- ALGLIB -- Copyright 18.02.2009 by Bochkanov Sergey *************************************************************************/ void mlpecreater1(const ae_int_t nin, const ae_int_t nhid, const ae_int_t nout, const double a, const double b, const ae_int_t ensemblesize, mlpensemble &ensemble) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::mlpecreater1(nin, nhid, nout, a, b, ensemblesize, const_cast(ensemble.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Like MLPCreateR2, but for ensembles. -- ALGLIB -- Copyright 18.02.2009 by Bochkanov Sergey *************************************************************************/ void mlpecreater2(const ae_int_t nin, const ae_int_t nhid1, const ae_int_t nhid2, const ae_int_t nout, const double a, const double b, const ae_int_t ensemblesize, mlpensemble &ensemble) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::mlpecreater2(nin, nhid1, nhid2, nout, a, b, ensemblesize, const_cast(ensemble.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Like MLPCreateC0, but for ensembles. -- ALGLIB -- Copyright 18.02.2009 by Bochkanov Sergey *************************************************************************/ void mlpecreatec0(const ae_int_t nin, const ae_int_t nout, const ae_int_t ensemblesize, mlpensemble &ensemble) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::mlpecreatec0(nin, nout, ensemblesize, const_cast(ensemble.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Like MLPCreateC1, but for ensembles. -- ALGLIB -- Copyright 18.02.2009 by Bochkanov Sergey *************************************************************************/ void mlpecreatec1(const ae_int_t nin, const ae_int_t nhid, const ae_int_t nout, const ae_int_t ensemblesize, mlpensemble &ensemble) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::mlpecreatec1(nin, nhid, nout, ensemblesize, const_cast(ensemble.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Like MLPCreateC2, but for ensembles. 
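As a brief illustration (the dimensions are arbitrary), a softmax-normalized classifier ensemble with two hidden layers can be created like this:

    mlpensemble ensemble;
    mlpecreatec2(4, 10, 5, 3, 20, ensemble);   // 4 inputs, hidden layers of 10 and 5 neurons, 3 classes, 20 networks
    // mlpeissoftmax(ensemble) returns true for ensembles created with mlpecreatec0/c1/c2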
-- ALGLIB -- Copyright 18.02.2009 by Bochkanov Sergey *************************************************************************/ void mlpecreatec2(const ae_int_t nin, const ae_int_t nhid1, const ae_int_t nhid2, const ae_int_t nout, const ae_int_t ensemblesize, mlpensemble &ensemble) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::mlpecreatec2(nin, nhid1, nhid2, nout, ensemblesize, const_cast(ensemble.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Creates ensemble from network. Only network geometry is copied. -- ALGLIB -- Copyright 17.02.2009 by Bochkanov Sergey *************************************************************************/ void mlpecreatefromnetwork(const multilayerperceptron &network, const ae_int_t ensemblesize, mlpensemble &ensemble) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::mlpecreatefromnetwork(const_cast(network.c_ptr()), ensemblesize, const_cast(ensemble.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Randomization of MLP ensemble -- ALGLIB -- Copyright 17.02.2009 by Bochkanov Sergey *************************************************************************/ void mlperandomize(const mlpensemble &ensemble) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::mlperandomize(const_cast(ensemble.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Return ensemble properties (number of inputs and outputs). -- ALGLIB -- Copyright 17.02.2009 by Bochkanov Sergey *************************************************************************/ void mlpeproperties(const mlpensemble &ensemble, ae_int_t &nin, ae_int_t &nout) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::mlpeproperties(const_cast(ensemble.c_ptr()), &nin, &nout, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Return normalization type (whether ensemble is SOFTMAX-normalized or not). -- ALGLIB -- Copyright 17.02.2009 by Bochkanov Sergey *************************************************************************/ bool mlpeissoftmax(const mlpensemble &ensemble) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { ae_bool result = alglib_impl::mlpeissoftmax(const_cast(ensemble.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Procesing INPUT PARAMETERS: Ensemble- neural networks ensemble X - input vector, array[0..NIn-1]. 
Y - (possibly) preallocated buffer; if size of Y is less than NOut, it will be reallocated. If it is large enough, it is NOT reallocated, so we can save some time on reallocation. OUTPUT PARAMETERS: Y - result. Regression estimate when solving regression task, vector of posterior probabilities for classification task. -- ALGLIB -- Copyright 17.02.2009 by Bochkanov Sergey *************************************************************************/ void mlpeprocess(const mlpensemble &ensemble, const real_1d_array &x, real_1d_array &y) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::mlpeprocess(const_cast(ensemble.c_ptr()), const_cast(x.c_ptr()), const_cast(y.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* 'interactive' variant of MLPEProcess for languages like Python which support constructs like "Y = MLPEProcess(LM,X)" and interactive mode of the interpreter This function allocates new array on each call, so it is significantly slower than its 'non-interactive' counterpart, but it is more convenient when you call it from command line. -- ALGLIB -- Copyright 17.02.2009 by Bochkanov Sergey *************************************************************************/ void mlpeprocessi(const mlpensemble &ensemble, const real_1d_array &x, real_1d_array &y) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::mlpeprocessi(const_cast(ensemble.c_ptr()), const_cast(x.c_ptr()), const_cast(y.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Relative classification error on the test set INPUT PARAMETERS: Ensemble- ensemble XY - test set NPoints - test set size RESULT: percent of incorrectly classified cases. Works both for classifier betwork and for regression networks which are used as classifiers. -- ALGLIB -- Copyright 17.02.2009 by Bochkanov Sergey *************************************************************************/ double mlperelclserror(const mlpensemble &ensemble, const real_2d_array &xy, const ae_int_t npoints) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::mlperelclserror(const_cast(ensemble.c_ptr()), const_cast(xy.c_ptr()), npoints, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Average cross-entropy (in bits per element) on the test set INPUT PARAMETERS: Ensemble- ensemble XY - test set NPoints - test set size RESULT: CrossEntropy/(NPoints*LN(2)). Zero if ensemble solves regression task. 
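The ensemble construction and processing functions above can be combined as in the following minimal sketch (the network geometry, ensemble size, input values and variable names are illustrative assumptions, not part of the manual):

    alglib::mlpensemble ens;
    alglib::real_1d_array x = "[0.1, 0.5]";
    alglib::real_1d_array y;
    alglib::mlpecreatec1(2, 5, 3, 10, ens);  // ensemble of 10 classifier networks, layout 2-5-3
    alglib::mlperandomize(ens);              // randomize weights of all ensemble members
    alglib::mlpeprocess(ens, x, y);          // y is (re)allocated and filled with 3 posterior probabilities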
-- ALGLIB -- Copyright 17.02.2009 by Bochkanov Sergey *************************************************************************/ double mlpeavgce(const mlpensemble &ensemble, const real_2d_array &xy, const ae_int_t npoints) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::mlpeavgce(const_cast(ensemble.c_ptr()), const_cast(xy.c_ptr()), npoints, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* RMS error on the test set INPUT PARAMETERS: Ensemble- ensemble XY - test set NPoints - test set size RESULT: root mean square error. Its meaning for regression task is obvious. As for classification task RMS error means error when estimating posterior probabilities. -- ALGLIB -- Copyright 17.02.2009 by Bochkanov Sergey *************************************************************************/ double mlpermserror(const mlpensemble &ensemble, const real_2d_array &xy, const ae_int_t npoints) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::mlpermserror(const_cast(ensemble.c_ptr()), const_cast(xy.c_ptr()), npoints, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Average error on the test set INPUT PARAMETERS: Ensemble- ensemble XY - test set NPoints - test set size RESULT: Its meaning for regression task is obvious. As for classification task it means average error when estimating posterior probabilities. -- ALGLIB -- Copyright 17.02.2009 by Bochkanov Sergey *************************************************************************/ double mlpeavgerror(const mlpensemble &ensemble, const real_2d_array &xy, const ae_int_t npoints) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::mlpeavgerror(const_cast(ensemble.c_ptr()), const_cast(xy.c_ptr()), npoints, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Average relative error on the test set INPUT PARAMETERS: Ensemble- ensemble XY - test set NPoints - test set size RESULT: Its meaning for regression task is obvious. As for classification task it means average relative error when estimating posterior probabilities. 
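As a hedged illustration of the error metrics above, the snippet below evaluates a hypothetical trained regression ensemble 'ens' with 2 inputs and 1 output on a made-up three-point test set:

    alglib::real_2d_array testset = "[[0.0,0.0,0.5],[1.0,2.0,3.0],[0.5,0.5,1.0]]";
    double rms    = alglib::mlpermserror(ens, testset, 3);   // root-mean-square error over 3 points
    double avgerr = alglib::mlpeavgerror(ens, testset, 3);   // average absolute error over 3 points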
-- ALGLIB -- Copyright 17.02.2009 by Bochkanov Sergey *************************************************************************/ double mlpeavgrelerror(const mlpensemble &ensemble, const real_2d_array &xy, const ae_int_t npoints) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::mlpeavgrelerror(const_cast(ensemble.c_ptr()), const_cast(xy.c_ptr()), npoints, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Training report: * RelCLSError - fraction of misclassified cases. * AvgCE - acerage cross-entropy * RMSError - root-mean-square error * AvgError - average error * AvgRelError - average relative error * NGrad - number of gradient calculations * NHess - number of Hessian calculations * NCholesky - number of Cholesky decompositions NOTE 1: RelCLSError/AvgCE are zero on regression problems. NOTE 2: on classification problems RMSError/AvgError/AvgRelError contain errors in prediction of posterior probabilities *************************************************************************/ _mlpreport_owner::_mlpreport_owner() { p_struct = (alglib_impl::mlpreport*)alglib_impl::ae_malloc(sizeof(alglib_impl::mlpreport), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_mlpreport_init(p_struct, NULL); } _mlpreport_owner::_mlpreport_owner(const _mlpreport_owner &rhs) { p_struct = (alglib_impl::mlpreport*)alglib_impl::ae_malloc(sizeof(alglib_impl::mlpreport), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_mlpreport_init_copy(p_struct, const_cast(rhs.p_struct), NULL); } _mlpreport_owner& _mlpreport_owner::operator=(const _mlpreport_owner &rhs) { if( this==&rhs ) return *this; alglib_impl::_mlpreport_clear(p_struct); alglib_impl::_mlpreport_init_copy(p_struct, const_cast(rhs.p_struct), NULL); return *this; } _mlpreport_owner::~_mlpreport_owner() { alglib_impl::_mlpreport_clear(p_struct); ae_free(p_struct); } alglib_impl::mlpreport* _mlpreport_owner::c_ptr() { return p_struct; } alglib_impl::mlpreport* _mlpreport_owner::c_ptr() const { return const_cast(p_struct); } mlpreport::mlpreport() : _mlpreport_owner() ,relclserror(p_struct->relclserror),avgce(p_struct->avgce),rmserror(p_struct->rmserror),avgerror(p_struct->avgerror),avgrelerror(p_struct->avgrelerror),ngrad(p_struct->ngrad),nhess(p_struct->nhess),ncholesky(p_struct->ncholesky) { } mlpreport::mlpreport(const mlpreport &rhs):_mlpreport_owner(rhs) ,relclserror(p_struct->relclserror),avgce(p_struct->avgce),rmserror(p_struct->rmserror),avgerror(p_struct->avgerror),avgrelerror(p_struct->avgrelerror),ngrad(p_struct->ngrad),nhess(p_struct->nhess),ncholesky(p_struct->ncholesky) { } mlpreport& mlpreport::operator=(const mlpreport &rhs) { if( this==&rhs ) return *this; _mlpreport_owner::operator=(rhs); return *this; } mlpreport::~mlpreport() { } /************************************************************************* Cross-validation estimates of generalization error *************************************************************************/ _mlpcvreport_owner::_mlpcvreport_owner() { p_struct = (alglib_impl::mlpcvreport*)alglib_impl::ae_malloc(sizeof(alglib_impl::mlpcvreport), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_mlpcvreport_init(p_struct, NULL); } 
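The report structure described above is filled by the training functions documented later in this section. A short hedged sketch of inspecting it (the network 'net', dataset 'xy' and the training call itself are assumed to be set up as in the later examples):

    // after a call like mlptrainlm(net, xy, npoints, 0.001, 2, info, rep):
    double fracmiss = rep.relclserror;     // fraction of misclassified cases (classification tasks only)
    double rms      = rep.rmserror;        // root-mean-square error on the training set
    alglib::ae_int_t ngrad = rep.ngrad;    // number of gradient evaluations performed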
_mlpcvreport_owner::_mlpcvreport_owner(const _mlpcvreport_owner &rhs) { p_struct = (alglib_impl::mlpcvreport*)alglib_impl::ae_malloc(sizeof(alglib_impl::mlpcvreport), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_mlpcvreport_init_copy(p_struct, const_cast(rhs.p_struct), NULL); } _mlpcvreport_owner& _mlpcvreport_owner::operator=(const _mlpcvreport_owner &rhs) { if( this==&rhs ) return *this; alglib_impl::_mlpcvreport_clear(p_struct); alglib_impl::_mlpcvreport_init_copy(p_struct, const_cast(rhs.p_struct), NULL); return *this; } _mlpcvreport_owner::~_mlpcvreport_owner() { alglib_impl::_mlpcvreport_clear(p_struct); ae_free(p_struct); } alglib_impl::mlpcvreport* _mlpcvreport_owner::c_ptr() { return p_struct; } alglib_impl::mlpcvreport* _mlpcvreport_owner::c_ptr() const { return const_cast(p_struct); } mlpcvreport::mlpcvreport() : _mlpcvreport_owner() ,relclserror(p_struct->relclserror),avgce(p_struct->avgce),rmserror(p_struct->rmserror),avgerror(p_struct->avgerror),avgrelerror(p_struct->avgrelerror) { } mlpcvreport::mlpcvreport(const mlpcvreport &rhs):_mlpcvreport_owner(rhs) ,relclserror(p_struct->relclserror),avgce(p_struct->avgce),rmserror(p_struct->rmserror),avgerror(p_struct->avgerror),avgrelerror(p_struct->avgrelerror) { } mlpcvreport& mlpcvreport::operator=(const mlpcvreport &rhs) { if( this==&rhs ) return *this; _mlpcvreport_owner::operator=(rhs); return *this; } mlpcvreport::~mlpcvreport() { } /************************************************************************* Trainer object for neural network. You should not try to access fields of this object directly - use ALGLIB functions to work with this object. *************************************************************************/ _mlptrainer_owner::_mlptrainer_owner() { p_struct = (alglib_impl::mlptrainer*)alglib_impl::ae_malloc(sizeof(alglib_impl::mlptrainer), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_mlptrainer_init(p_struct, NULL); } _mlptrainer_owner::_mlptrainer_owner(const _mlptrainer_owner &rhs) { p_struct = (alglib_impl::mlptrainer*)alglib_impl::ae_malloc(sizeof(alglib_impl::mlptrainer), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_mlptrainer_init_copy(p_struct, const_cast(rhs.p_struct), NULL); } _mlptrainer_owner& _mlptrainer_owner::operator=(const _mlptrainer_owner &rhs) { if( this==&rhs ) return *this; alglib_impl::_mlptrainer_clear(p_struct); alglib_impl::_mlptrainer_init_copy(p_struct, const_cast(rhs.p_struct), NULL); return *this; } _mlptrainer_owner::~_mlptrainer_owner() { alglib_impl::_mlptrainer_clear(p_struct); ae_free(p_struct); } alglib_impl::mlptrainer* _mlptrainer_owner::c_ptr() { return p_struct; } alglib_impl::mlptrainer* _mlptrainer_owner::c_ptr() const { return const_cast(p_struct); } mlptrainer::mlptrainer() : _mlptrainer_owner() { } mlptrainer::mlptrainer(const mlptrainer &rhs):_mlptrainer_owner(rhs) { } mlptrainer& mlptrainer::operator=(const mlptrainer &rhs) { if( this==&rhs ) return *this; _mlptrainer_owner::operator=(rhs); return *this; } mlptrainer::~mlptrainer() { } /************************************************************************* Neural network training using modified Levenberg-Marquardt with exact Hessian calculation and regularization. Subroutine trains neural network with restarts from random positions. Algorithm is well suited for small and medium scale problems (hundreds of weights). 
INPUT PARAMETERS: Network - neural network with initialized geometry XY - training set NPoints - training set size Decay - weight decay constant, >=0.001 Decay term 'Decay*||Weights||^2' is added to error function. If you don't know what Decay to choose, use 0.001. Restarts - number of restarts from random position, >0. If you don't know what Restarts to choose, use 2. OUTPUT PARAMETERS: Network - trained neural network. Info - return code: * -9, if internal matrix inverse subroutine failed * -2, if there is a point with class number outside of [0..NOut-1]. * -1, if wrong parameters specified (NPoints<0, Restarts<1). * 2, if task has been solved. Rep - training report -- ALGLIB -- Copyright 10.03.2009 by Bochkanov Sergey *************************************************************************/ void mlptrainlm(const multilayerperceptron &network, const real_2d_array &xy, const ae_int_t npoints, const double decay, const ae_int_t restarts, ae_int_t &info, mlpreport &rep) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::mlptrainlm(const_cast(network.c_ptr()), const_cast(xy.c_ptr()), npoints, decay, restarts, &info, const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Neural network training using L-BFGS algorithm with regularization. Subroutine trains neural network with restarts from random positions. Algorithm is well suited for problems of any dimensionality (memory requirements and step complexity are linear by weights number). INPUT PARAMETERS: Network - neural network with initialized geometry XY - training set NPoints - training set size Decay - weight decay constant, >=0.001 Decay term 'Decay*||Weights||^2' is added to error function. If you don't know what Decay to choose, use 0.001. Restarts - number of restarts from random position, >0. If you don't know what Restarts to choose, use 2. WStep - stopping criterion. Algorithm stops if step size is less than WStep. Recommended value - 0.01. Zero step size means stopping after MaxIts iterations. MaxIts - stopping criterion. Algorithm stops after MaxIts iterations (NOT gradient calculations). Zero MaxIts means stopping when step is sufficiently small. OUTPUT PARAMETERS: Network - trained neural network. Info - return code: * -8, if both WStep=0 and MaxIts=0 * -2, if there is a point with class number outside of [0..NOut-1]. * -1, if wrong parameters specified (NPoints<0, Restarts<1). * 2, if task has been solved. 
Rep - training report -- ALGLIB -- Copyright 09.12.2007 by Bochkanov Sergey *************************************************************************/ void mlptrainlbfgs(const multilayerperceptron &network, const real_2d_array &xy, const ae_int_t npoints, const double decay, const ae_int_t restarts, const double wstep, const ae_int_t maxits, ae_int_t &info, mlpreport &rep) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::mlptrainlbfgs(const_cast(network.c_ptr()), const_cast(xy.c_ptr()), npoints, decay, restarts, wstep, maxits, &info, const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Neural network training using early stopping (base algorithm - L-BFGS with regularization). INPUT PARAMETERS: Network - neural network with initialized geometry TrnXY - training set TrnSize - training set size, TrnSize>0 ValXY - validation set ValSize - validation set size, ValSize>0 Decay - weight decay constant, >=0.001 Decay term 'Decay*||Weights||^2' is added to error function. If you don't know what Decay to choose, use 0.001. Restarts - number of restarts, either: * strictly positive number - algorithm make specified number of restarts from random position. * -1, in which case algorithm makes exactly one run from the initial state of the network (no randomization). If you don't know what Restarts to choose, choose one one the following: * -1 (deterministic start) * +1 (one random restart) * +5 (moderate amount of random restarts) OUTPUT PARAMETERS: Network - trained neural network. Info - return code: * -2, if there is a point with class number outside of [0..NOut-1]. * -1, if wrong parameters specified (NPoints<0, Restarts<1, ...). * 2, task has been solved, stopping criterion met - sufficiently small step size. Not expected (we use EARLY stopping) but possible and not an error. * 6, task has been solved, stopping criterion met - increasing of validation set error. Rep - training report NOTE: Algorithm stops if validation set error increases for a long enough or step size is small enought (there are task where validation set may decrease for eternity). In any case solution returned corresponds to the minimum of validation set error. -- ALGLIB -- Copyright 10.03.2009 by Bochkanov Sergey *************************************************************************/ void mlptraines(const multilayerperceptron &network, const real_2d_array &trnxy, const ae_int_t trnsize, const real_2d_array &valxy, const ae_int_t valsize, const double decay, const ae_int_t restarts, ae_int_t &info, mlpreport &rep) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::mlptraines(const_cast(network.c_ptr()), const_cast(trnxy.c_ptr()), trnsize, const_cast(valxy.c_ptr()), valsize, decay, restarts, &info, const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Cross-validation estimate of generalization error. Base algorithm - L-BFGS. INPUT PARAMETERS: Network - neural network with initialized geometry. 
Network is not changed during cross-validation - it is used only as a representative of its architecture. XY - training set. SSize - training set size Decay - weight decay, same as in MLPTrainLBFGS Restarts - number of restarts, >0. restarts are counted for each partition separately, so total number of restarts will be Restarts*FoldsCount. WStep - stopping criterion, same as in MLPTrainLBFGS MaxIts - stopping criterion, same as in MLPTrainLBFGS FoldsCount - number of folds in k-fold cross-validation, 2<=FoldsCount<=SSize. recommended value: 10. OUTPUT PARAMETERS: Info - return code, same as in MLPTrainLBFGS Rep - report, same as in MLPTrainLM/MLPTrainLBFGS CVRep - generalization error estimates -- ALGLIB -- Copyright 09.12.2007 by Bochkanov Sergey *************************************************************************/ void mlpkfoldcvlbfgs(const multilayerperceptron &network, const real_2d_array &xy, const ae_int_t npoints, const double decay, const ae_int_t restarts, const double wstep, const ae_int_t maxits, const ae_int_t foldscount, ae_int_t &info, mlpreport &rep, mlpcvreport &cvrep) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::mlpkfoldcvlbfgs(const_cast(network.c_ptr()), const_cast(xy.c_ptr()), npoints, decay, restarts, wstep, maxits, foldscount, &info, const_cast(rep.c_ptr()), const_cast(cvrep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Cross-validation estimate of generalization error. Base algorithm - Levenberg-Marquardt. INPUT PARAMETERS: Network - neural network with initialized geometry. Network is not changed during cross-validation - it is used only as a representative of its architecture. XY - training set. SSize - training set size Decay - weight decay, same as in MLPTrainLBFGS Restarts - number of restarts, >0. restarts are counted for each partition separately, so total number of restarts will be Restarts*FoldsCount. FoldsCount - number of folds in k-fold cross-validation, 2<=FoldsCount<=SSize. recommended value: 10. OUTPUT PARAMETERS: Info - return code, same as in MLPTrainLBFGS Rep - report, same as in MLPTrainLM/MLPTrainLBFGS CVRep - generalization error estimates -- ALGLIB -- Copyright 09.12.2007 by Bochkanov Sergey *************************************************************************/ void mlpkfoldcvlm(const multilayerperceptron &network, const real_2d_array &xy, const ae_int_t npoints, const double decay, const ae_int_t restarts, const ae_int_t foldscount, ae_int_t &info, mlpreport &rep, mlpcvreport &cvrep) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::mlpkfoldcvlm(const_cast(network.c_ptr()), const_cast(xy.c_ptr()), npoints, decay, restarts, foldscount, &info, const_cast(rep.c_ptr()), const_cast(cvrep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function estimates generalization error using cross-validation on the current dataset with current training settings. FOR USERS OF COMMERCIAL EDITION: ! Commercial version of ALGLIB includes two important improvements of ! this function: ! 
* multicore support (C++ and C# computational cores) ! * SSE support (C++ computational core) ! ! Second improvement gives constant speedup (2-3X). First improvement ! gives close-to-linear speedup on multicore systems. Following ! operations can be executed in parallel: ! * FoldsCount cross-validation rounds (always) ! * NRestarts training sessions performed within each of ! cross-validation rounds (if NRestarts>1) ! * gradient calculation over large dataset (if dataset is large enough) ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! In order to use SSE features you have to: ! * use commercial version of ALGLIB on Intel processors ! * use C++ computational core ! ! This note is given for users of commercial edition; if you use GPL ! edition, you still will be able to call smp-version of this function, ! but all computations will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. INPUT PARAMETERS: S - trainer object Network - neural network. It must have same number of inputs and output/classes as was specified during creation of the trainer object. Network is not changed during cross- validation and is not trained - it is used only as representative of its architecture. I.e., we estimate generalization properties of ARCHITECTURE, not some specific network. NRestarts - number of restarts, >=0: * NRestarts>0 means that for each cross-validation round specified number of random restarts is performed, with best network being chosen after training. * NRestarts=0 is same as NRestarts=1 FoldsCount - number of folds in k-fold cross-validation: * 2<=FoldsCount<=size of dataset * recommended value: 10. * values larger than dataset size will be silently truncated down to dataset size OUTPUT PARAMETERS: Rep - structure which contains cross-validation estimates: * Rep.RelCLSError - fraction of misclassified cases. * Rep.AvgCE - acerage cross-entropy * Rep.RMSError - root-mean-square error * Rep.AvgError - average error * Rep.AvgRelError - average relative error NOTE: when no dataset was specified with MLPSetDataset/SetSparseDataset(), or subset with only one point was given, zeros are returned as estimates. NOTE: this method performs FoldsCount cross-validation rounds, each one with NRestarts random starts. Thus, FoldsCount*NRestarts networks are trained in total. NOTE: Rep.RelCLSError/Rep.AvgCE are zero on regression problems. NOTE: on classification problems Rep.RMSError/Rep.AvgError/Rep.AvgRelError contain errors in prediction of posterior probabilities. 
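The cross-validation workflow described above can be sketched as follows; this is an illustrative example only (the tiny dataset, network layout and restart/fold counts are assumptions), using the trainer-creation and dataset functions documented immediately below:

    alglib::mlptrainer trn;
    alglib::multilayerperceptron net;
    alglib::mlpreport rep;
    alglib::real_2d_array xy = "[[0.0,0.0,0],[0.1,0.2,0],[0.9,0.8,1],[1.0,1.0,1]]";
    alglib::mlpcreatetrainercls(2, 2, trn);   // trainer for 2 inputs, 2 classes
    alglib::mlpsetdataset(trn, xy, 4);        // last column stores the class number
    alglib::mlpcreatec1(2, 5, 2, net);        // 2-5-2 classification network (architecture only)
    alglib::mlpkfoldcv(trn, net, 2, 4, rep);  // 2 restarts per round, 4-fold cross-validation
    // rep.relclserror now holds the cross-validation estimate of the misclassification rate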
-- ALGLIB -- Copyright 23.07.2012 by Bochkanov Sergey *************************************************************************/ void mlpkfoldcv(const mlptrainer &s, const multilayerperceptron &network, const ae_int_t nrestarts, const ae_int_t foldscount, mlpreport &rep) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::mlpkfoldcv(const_cast(s.c_ptr()), const_cast(network.c_ptr()), nrestarts, foldscount, const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void smp_mlpkfoldcv(const mlptrainer &s, const multilayerperceptron &network, const ae_int_t nrestarts, const ae_int_t foldscount, mlpreport &rep) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::_pexec_mlpkfoldcv(const_cast(s.c_ptr()), const_cast(network.c_ptr()), nrestarts, foldscount, const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Creation of the network trainer object for regression networks INPUT PARAMETERS: NIn - number of inputs, NIn>=1 NOut - number of outputs, NOut>=1 OUTPUT PARAMETERS: S - neural network trainer object. This structure can be used to train any regression network with NIn inputs and NOut outputs. -- ALGLIB -- Copyright 23.07.2012 by Bochkanov Sergey *************************************************************************/ void mlpcreatetrainer(const ae_int_t nin, const ae_int_t nout, mlptrainer &s) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::mlpcreatetrainer(nin, nout, const_cast(s.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Creation of the network trainer object for classification networks INPUT PARAMETERS: NIn - number of inputs, NIn>=1 NClasses - number of classes, NClasses>=2 OUTPUT PARAMETERS: S - neural network trainer object. This structure can be used to train any classification network with NIn inputs and NOut outputs. -- ALGLIB -- Copyright 23.07.2012 by Bochkanov Sergey *************************************************************************/ void mlpcreatetrainercls(const ae_int_t nin, const ae_int_t nclasses, mlptrainer &s) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::mlpcreatetrainercls(nin, nclasses, const_cast(s.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function sets "current dataset" of the trainer object to one passed by user. INPUT PARAMETERS: S - trainer object XY - training set, see below for information on the training set format. This function checks correctness of the dataset (no NANs/INFs, class numbers are correct) and throws exception when incorrect dataset is passed. NPoints - points count, >=0. 
DATASET FORMAT: This function uses two different dataset formats - one for regression networks, another one for classification networks. For regression networks with NIn inputs and NOut outputs following dataset format is used: * dataset is given by NPoints*(NIn+NOut) matrix * each row corresponds to one example * first NIn columns are inputs, next NOut columns are outputs For classification networks with NIn inputs and NClasses clases following datasetformat is used: * dataset is given by NPoints*(NIn+1) matrix * each row corresponds to one example * first NIn columns are inputs, last column stores class number (from 0 to NClasses-1). -- ALGLIB -- Copyright 23.07.2012 by Bochkanov Sergey *************************************************************************/ void mlpsetdataset(const mlptrainer &s, const real_2d_array &xy, const ae_int_t npoints) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::mlpsetdataset(const_cast(s.c_ptr()), const_cast(xy.c_ptr()), npoints, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function sets "current dataset" of the trainer object to one passed by user (sparse matrix is used to store dataset). INPUT PARAMETERS: S - trainer object XY - training set, see below for information on the training set format. This function checks correctness of the dataset (no NANs/INFs, class numbers are correct) and throws exception when incorrect dataset is passed. Any sparse storage format can be used: Hash-table, CRS... NPoints - points count, >=0 DATASET FORMAT: This function uses two different dataset formats - one for regression networks, another one for classification networks. For regression networks with NIn inputs and NOut outputs following dataset format is used: * dataset is given by NPoints*(NIn+NOut) matrix * each row corresponds to one example * first NIn columns are inputs, next NOut columns are outputs For classification networks with NIn inputs and NClasses clases following datasetformat is used: * dataset is given by NPoints*(NIn+1) matrix * each row corresponds to one example * first NIn columns are inputs, last column stores class number (from 0 to NClasses-1). -- ALGLIB -- Copyright 23.07.2012 by Bochkanov Sergey *************************************************************************/ void mlpsetsparsedataset(const mlptrainer &s, const sparsematrix &xy, const ae_int_t npoints) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::mlpsetsparsedataset(const_cast(s.c_ptr()), const_cast(xy.c_ptr()), npoints, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function sets weight decay coefficient which is used for training. INPUT PARAMETERS: S - trainer object Decay - weight decay coefficient, >=0. Weight decay term 'Decay*||Weights||^2' is added to error function. If you don't know what Decay to choose, use 1.0E-3. Weight decay can be set to zero, in this case network is trained without weight decay. NOTE: by default network uses some small nonzero value for weight decay. 
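To make the dataset format and the decay setting above concrete, here is a minimal sketch (values are made up; 'trn' is assumed to be a classification trainer with 2 inputs and 2 classes, created as shown earlier):

    // classification format: NPoints x (NIn+1), last column holds the class number 0..NClasses-1
    alglib::real_2d_array xy = "[[0.0,0.0,0],[1.0,1.0,1]]";
    alglib::mlpsetdataset(trn, xy, 2);
    alglib::mlpsetdecay(trn, 0.001);   // the recommended default weight decay (1.0E-3)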
-- ALGLIB -- Copyright 23.07.2012 by Bochkanov Sergey *************************************************************************/ void mlpsetdecay(const mlptrainer &s, const double decay) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::mlpsetdecay(const_cast(s.c_ptr()), decay, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function sets stopping criteria for the optimizer. INPUT PARAMETERS: S - trainer object WStep - stopping criterion. Algorithm stops if step size is less than WStep. Recommended value - 0.01. Zero step size means stopping after MaxIts iterations. WStep>=0. MaxIts - stopping criterion. Algorithm stops after MaxIts epochs (full passes over entire dataset). Zero MaxIts means stopping when step is sufficiently small. MaxIts>=0. NOTE: by default, WStep=0.005 and MaxIts=0 are used. These values are also used when MLPSetCond() is called with WStep=0 and MaxIts=0. NOTE: these stopping criteria are used for all kinds of neural training - from "conventional" networks to early stopping ensembles. When used for "conventional" networks, they are used as the only stopping criteria. When combined with early stopping, they used as ADDITIONAL stopping criteria which can terminate early stopping algorithm. -- ALGLIB -- Copyright 23.07.2012 by Bochkanov Sergey *************************************************************************/ void mlpsetcond(const mlptrainer &s, const double wstep, const ae_int_t maxits) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::mlpsetcond(const_cast(s.c_ptr()), wstep, maxits, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function sets training algorithm: batch training using L-BFGS will be used. This algorithm: * the most robust for small-scale problems, but may be too slow for large scale ones. * perfoms full pass through the dataset before performing step * uses conditions specified by MLPSetCond() for stopping * is default one used by trainer object INPUT PARAMETERS: S - trainer object -- ALGLIB -- Copyright 23.07.2012 by Bochkanov Sergey *************************************************************************/ void mlpsetalgobatch(const mlptrainer &s) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::mlpsetalgobatch(const_cast(s.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function trains neural network passed to this function, using current dataset (one which was passed to MLPSetDataset() or MLPSetSparseDataset()) and current training settings. Training from NRestarts random starting positions is performed, best network is chosen. Training is performed using current training algorithm. FOR USERS OF COMMERCIAL EDITION: ! Commercial version of ALGLIB includes two important improvements of ! this function: ! * multicore support (C++ and C# computational cores) ! 
* SSE support (C++ computational core) ! ! Second improvement gives constant speedup (2-3X). First improvement ! gives close-to-linear speedup on multicore systems. Following ! operations can be executed in parallel: ! * NRestarts training sessions performed within each of ! cross-validation rounds (if NRestarts>1) ! * gradient calculation over large dataset (if dataset is large enough) ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! In order to use SSE features you have to: ! * use commercial version of ALGLIB on Intel processors ! * use C++ computational core ! ! This note is given for users of commercial edition; if you use GPL ! edition, you still will be able to call smp-version of this function, ! but all computations will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. INPUT PARAMETERS: S - trainer object Network - neural network. It must have same number of inputs and output/classes as was specified during creation of the trainer object. NRestarts - number of restarts, >=0: * NRestarts>0 means that specified number of random restarts are performed, best network is chosen after training * NRestarts=0 means that current state of the network is used for training. OUTPUT PARAMETERS: Network - trained network NOTE: when no dataset was specified with MLPSetDataset/SetSparseDataset(), network is filled by zero values. Same behavior for functions MLPStartTraining and MLPContinueTraining. NOTE: this method uses sum-of-squares error function for training. -- ALGLIB -- Copyright 23.07.2012 by Bochkanov Sergey *************************************************************************/ void mlptrainnetwork(const mlptrainer &s, const multilayerperceptron &network, const ae_int_t nrestarts, mlpreport &rep) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::mlptrainnetwork(const_cast(s.c_ptr()), const_cast(network.c_ptr()), nrestarts, const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void smp_mlptrainnetwork(const mlptrainer &s, const multilayerperceptron &network, const ae_int_t nrestarts, mlpreport &rep) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::_pexec_mlptrainnetwork(const_cast(s.c_ptr()), const_cast(network.c_ptr()), nrestarts, const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* IMPORTANT: this is an "expert" version of the MLPTrain() function. We do not recommend you to use it unless you are pretty sure that you need ability to monitor training progress. This function performs step-by-step training of the neural network. Here "step-by-step" means that training starts with MLPStartTraining() call, and then user subsequently calls MLPContinueTraining() to perform one more iteration of the training. After call to this function trainer object remembers network and is ready to train it. 
However, no training is performed until first call to MLPContinueTraining() function. Subsequent calls to MLPContinueTraining() will advance training progress one iteration further. EXAMPLE: > > ...initialize network and trainer object.... > > MLPStartTraining(Trainer, Network, True) > while MLPContinueTraining(Trainer, Network) do > ...visualize training progress... > INPUT PARAMETERS: S - trainer object Network - neural network. It must have same number of inputs and output/classes as was specified during creation of the trainer object. RandomStart - randomize network before training or not: * True means that network is randomized and its initial state (one which was passed to the trainer object) is lost. * False means that training is started from the current state of the network OUTPUT PARAMETERS: Network - neural network which is ready to training (weights are initialized, preprocessor is initialized using current training set) NOTE: this method uses sum-of-squares error function for training. NOTE: it is expected that trainer object settings are NOT changed during step-by-step training, i.e. no one changes stopping criteria or training set during training. It is possible and there is no defense against such actions, but algorithm behavior in such cases is undefined and can be unpredictable. -- ALGLIB -- Copyright 23.07.2012 by Bochkanov Sergey *************************************************************************/ void mlpstarttraining(const mlptrainer &s, const multilayerperceptron &network, const bool randomstart) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::mlpstarttraining(const_cast(s.c_ptr()), const_cast(network.c_ptr()), randomstart, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* IMPORTANT: this is an "expert" version of the MLPTrain() function. We do not recommend you to use it unless you are pretty sure that you need ability to monitor training progress. FOR USERS OF COMMERCIAL EDITION: ! Commercial version of ALGLIB includes two important improvements of ! this function: ! * multicore support (C++ and C# computational cores) ! * SSE support (C++ computational core) ! ! Second improvement gives constant speedup (2-3X). First improvement ! gives close-to-linear speedup on multicore systems. Following ! operations can be executed in parallel: ! * gradient calculation over large dataset (if dataset is large enough) ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! In order to use SSE features you have to: ! * use commercial version of ALGLIB on Intel processors ! * use C++ computational core ! ! This note is given for users of commercial edition; if you use GPL ! edition, you still will be able to call smp-version of this function, ! but all computations will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. This function performs step-by-step training of the neural network. Here "step-by-step" means that training starts with MLPStartTraining() call, and then user subsequently calls MLPContinueTraining() to perform one more iteration of the training. 
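In C++ the pseudocode example above might look like the following hedged sketch ('trn' and 'net' are assumed to be a trainer with an attached dataset and a compatible network, prepared as in the earlier examples):

    alglib::mlpstarttraining(trn, net, true);        // randomize weights and prepare for training
    while( alglib::mlpcontinuetraining(trn, net) )
    {
        // one more training iteration has been applied to net;
        // intermediate weights or errors can be inspected here
    }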
This function performs one more iteration of the training and returns either True (training continues) or False (training stopped). In case True was returned, Network weights are updated according to the current state of the optimization progress. In case False was returned, no additional updates is performed (previous update of the network weights moved us to the final point, and no additional updates is needed). EXAMPLE: > > [initialize network and trainer object] > > MLPStartTraining(Trainer, Network, True) > while MLPContinueTraining(Trainer, Network) do > [visualize training progress] > INPUT PARAMETERS: S - trainer object Network - neural network structure, which is used to store current state of the training process. OUTPUT PARAMETERS: Network - weights of the neural network are rewritten by the current approximation. NOTE: this method uses sum-of-squares error function for training. NOTE: it is expected that trainer object settings are NOT changed during step-by-step training, i.e. no one changes stopping criteria or training set during training. It is possible and there is no defense against such actions, but algorithm behavior in such cases is undefined and can be unpredictable. NOTE: It is expected that Network is the same one which was passed to MLPStartTraining() function. However, THIS function checks only following: * that number of network inputs is consistent with trainer object settings * that number of network outputs/classes is consistent with trainer object settings * that number of network weights is the same as number of weights in the network passed to MLPStartTraining() function Exception is thrown when these conditions are violated. It is also expected that you do not change state of the network on your own - the only party who has right to change network during its training is a trainer object. Any attempt to interfere with trainer may lead to unpredictable results. -- ALGLIB -- Copyright 23.07.2012 by Bochkanov Sergey *************************************************************************/ bool mlpcontinuetraining(const mlptrainer &s, const multilayerperceptron &network) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { ae_bool result = alglib_impl::mlpcontinuetraining(const_cast(s.c_ptr()), const_cast(network.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } bool smp_mlpcontinuetraining(const mlptrainer &s, const multilayerperceptron &network) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { ae_bool result = alglib_impl::_pexec_mlpcontinuetraining(const_cast(s.c_ptr()), const_cast(network.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Training neural networks ensemble using bootstrap aggregating (bagging). Modified Levenberg-Marquardt algorithm is used as base training method. INPUT PARAMETERS: Ensemble - model with initialized geometry XY - training set NPoints - training set size Decay - weight decay coefficient, >=0.001 Restarts - restarts, >0. OUTPUT PARAMETERS: Ensemble - trained model Info - return code: * -2, if there is a point with class number outside of [0..NClasses-1]. 
* -1, if incorrect parameters was passed (NPoints<0, Restarts<1). * 2, if task has been solved. Rep - training report. OOBErrors - out-of-bag generalization error estimate -- ALGLIB -- Copyright 17.02.2009 by Bochkanov Sergey *************************************************************************/ void mlpebagginglm(const mlpensemble &ensemble, const real_2d_array &xy, const ae_int_t npoints, const double decay, const ae_int_t restarts, ae_int_t &info, mlpreport &rep, mlpcvreport &ooberrors) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::mlpebagginglm(const_cast(ensemble.c_ptr()), const_cast(xy.c_ptr()), npoints, decay, restarts, &info, const_cast(rep.c_ptr()), const_cast(ooberrors.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Training neural networks ensemble using bootstrap aggregating (bagging). L-BFGS algorithm is used as base training method. INPUT PARAMETERS: Ensemble - model with initialized geometry XY - training set NPoints - training set size Decay - weight decay coefficient, >=0.001 Restarts - restarts, >0. WStep - stopping criterion, same as in MLPTrainLBFGS MaxIts - stopping criterion, same as in MLPTrainLBFGS OUTPUT PARAMETERS: Ensemble - trained model Info - return code: * -8, if both WStep=0 and MaxIts=0 * -2, if there is a point with class number outside of [0..NClasses-1]. * -1, if incorrect parameters was passed (NPoints<0, Restarts<1). * 2, if task has been solved. Rep - training report. OOBErrors - out-of-bag generalization error estimate -- ALGLIB -- Copyright 17.02.2009 by Bochkanov Sergey *************************************************************************/ void mlpebagginglbfgs(const mlpensemble &ensemble, const real_2d_array &xy, const ae_int_t npoints, const double decay, const ae_int_t restarts, const double wstep, const ae_int_t maxits, ae_int_t &info, mlpreport &rep, mlpcvreport &ooberrors) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::mlpebagginglbfgs(const_cast(ensemble.c_ptr()), const_cast(xy.c_ptr()), npoints, decay, restarts, wstep, maxits, &info, const_cast(rep.c_ptr()), const_cast(ooberrors.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Training neural networks ensemble using early stopping. INPUT PARAMETERS: Ensemble - model with initialized geometry XY - training set NPoints - training set size Decay - weight decay coefficient, >=0.001 Restarts - restarts, >0. OUTPUT PARAMETERS: Ensemble - trained model Info - return code: * -2, if there is a point with class number outside of [0..NClasses-1]. * -1, if incorrect parameters was passed (NPoints<0, Restarts<1). * 6, if task has been solved. Rep - training report. 
OOBErrors - out-of-bag generalization error estimate -- ALGLIB -- Copyright 10.03.2009 by Bochkanov Sergey *************************************************************************/ void mlpetraines(const mlpensemble &ensemble, const real_2d_array &xy, const ae_int_t npoints, const double decay, const ae_int_t restarts, ae_int_t &info, mlpreport &rep) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::mlpetraines(const_cast(ensemble.c_ptr()), const_cast(xy.c_ptr()), npoints, decay, restarts, &info, const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function trains neural network ensemble passed to this function using current dataset and early stopping training algorithm. Each early stopping round performs NRestarts random restarts (thus, EnsembleSize*NRestarts training rounds is performed in total). FOR USERS OF COMMERCIAL EDITION: ! Commercial version of ALGLIB includes two important improvements of ! this function: ! * multicore support (C++ and C# computational cores) ! * SSE support (C++ computational core) ! ! Second improvement gives constant speedup (2-3X). First improvement ! gives close-to-linear speedup on multicore systems. Following ! operations can be executed in parallel: ! * EnsembleSize training sessions performed for each of ensemble ! members (always parallelized) ! * NRestarts training sessions performed within each of training ! sessions (if NRestarts>1) ! * gradient calculation over large dataset (if dataset is large enough) ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! In order to use SSE features you have to: ! * use commercial version of ALGLIB on Intel processors ! * use C++ computational core ! ! This note is given for users of commercial edition; if you use GPL ! edition, you still will be able to call smp-version of this function, ! but all computations will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. INPUT PARAMETERS: S - trainer object; Ensemble - neural network ensemble. It must have same number of inputs and outputs/classes as was specified during creation of the trainer object. NRestarts - number of restarts, >=0: * NRestarts>0 means that specified number of random restarts are performed during each ES round; * NRestarts=0 is silently replaced by 1. OUTPUT PARAMETERS: Ensemble - trained ensemble; Rep - it contains all type of errors. NOTE: this training method uses BOTH early stopping and weight decay! So, you should select weight decay before starting training just as you select it before training "conventional" networks. NOTE: when no dataset was specified with MLPSetDataset/SetSparseDataset(), or single-point dataset was passed, ensemble is filled by zero values. NOTE: this method uses sum-of-squares error function for training. 
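A hedged sketch of this ensemble training call (the trainer 'trn' with an attached classification dataset is assumed to exist, as in the earlier examples; the ensemble size and network layout are illustrative):

    alglib::mlpensemble ens;
    alglib::mlpreport rep;
    alglib::mlpecreatec1(2, 5, 2, 20, ens);        // ensemble of 20 classifier networks, layout 2-5-2
    alglib::mlptrainensemblees(trn, ens, 3, rep);  // 3 random restarts per early-stopping round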
-- ALGLIB -- Copyright 22.08.2012 by Bochkanov Sergey *************************************************************************/ void mlptrainensemblees(const mlptrainer &s, const mlpensemble &ensemble, const ae_int_t nrestarts, mlpreport &rep) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::mlptrainensemblees(const_cast(s.c_ptr()), const_cast(ensemble.c_ptr()), nrestarts, const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void smp_mlptrainensemblees(const mlptrainer &s, const mlpensemble &ensemble, const ae_int_t nrestarts, mlpreport &rep) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::_pexec_mlptrainensemblees(const_cast(s.c_ptr()), const_cast(ensemble.c_ptr()), nrestarts, const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This structure is a clusterization engine. You should not try to access its fields directly. Use ALGLIB functions in order to work with this object. -- ALGLIB -- Copyright 10.07.2012 by Bochkanov Sergey *************************************************************************/ _clusterizerstate_owner::_clusterizerstate_owner() { p_struct = (alglib_impl::clusterizerstate*)alglib_impl::ae_malloc(sizeof(alglib_impl::clusterizerstate), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_clusterizerstate_init(p_struct, NULL); } _clusterizerstate_owner::_clusterizerstate_owner(const _clusterizerstate_owner &rhs) { p_struct = (alglib_impl::clusterizerstate*)alglib_impl::ae_malloc(sizeof(alglib_impl::clusterizerstate), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_clusterizerstate_init_copy(p_struct, const_cast(rhs.p_struct), NULL); } _clusterizerstate_owner& _clusterizerstate_owner::operator=(const _clusterizerstate_owner &rhs) { if( this==&rhs ) return *this; alglib_impl::_clusterizerstate_clear(p_struct); alglib_impl::_clusterizerstate_init_copy(p_struct, const_cast(rhs.p_struct), NULL); return *this; } _clusterizerstate_owner::~_clusterizerstate_owner() { alglib_impl::_clusterizerstate_clear(p_struct); ae_free(p_struct); } alglib_impl::clusterizerstate* _clusterizerstate_owner::c_ptr() { return p_struct; } alglib_impl::clusterizerstate* _clusterizerstate_owner::c_ptr() const { return const_cast(p_struct); } clusterizerstate::clusterizerstate() : _clusterizerstate_owner() { } clusterizerstate::clusterizerstate(const clusterizerstate &rhs):_clusterizerstate_owner(rhs) { } clusterizerstate& clusterizerstate::operator=(const clusterizerstate &rhs) { if( this==&rhs ) return *this; _clusterizerstate_owner::operator=(rhs); return *this; } clusterizerstate::~clusterizerstate() { } /************************************************************************* This structure is used to store results of the agglomerative hierarchical clustering (AHC). Following information is returned: * TerminationType - completion code: * 1 for successful completion of algorithm * -5 inappropriate combination of clustering algorithm and distance function was used. 
As for now, it is possible only when Ward's method is called for dataset with non-Euclidean distance function. In case negative completion code is returned, other fields of report structure are invalid and should not be used. * NPoints contains number of points in the original dataset * Z contains information about merges performed (see below). Z contains indexes from the original (unsorted) dataset and it can be used when you need to know what points were merged. However, it is not convenient when you want to build a dendrograd (see below). * if you want to build dendrogram, you can use Z, but it is not good option, because Z contains indexes from unsorted dataset. Dendrogram built from such dataset is likely to have intersections. So, you have to reorder you points before building dendrogram. Permutation which reorders point is returned in P. Another representation of merges, which is more convenient for dendorgram construction, is returned in PM. * more information on format of Z, P and PM can be found below and in the examples from ALGLIB Reference Manual. FORMAL DESCRIPTION OF FIELDS: NPoints number of points Z array[NPoints-1,2], contains indexes of clusters linked in pairs to form clustering tree. I-th row corresponds to I-th merge: * Z[I,0] - index of the first cluster to merge * Z[I,1] - index of the second cluster to merge * Z[I,0](rhs.p_struct), NULL); } _ahcreport_owner& _ahcreport_owner::operator=(const _ahcreport_owner &rhs) { if( this==&rhs ) return *this; alglib_impl::_ahcreport_clear(p_struct); alglib_impl::_ahcreport_init_copy(p_struct, const_cast(rhs.p_struct), NULL); return *this; } _ahcreport_owner::~_ahcreport_owner() { alglib_impl::_ahcreport_clear(p_struct); ae_free(p_struct); } alglib_impl::ahcreport* _ahcreport_owner::c_ptr() { return p_struct; } alglib_impl::ahcreport* _ahcreport_owner::c_ptr() const { return const_cast(p_struct); } ahcreport::ahcreport() : _ahcreport_owner() ,terminationtype(p_struct->terminationtype),npoints(p_struct->npoints),p(&p_struct->p),z(&p_struct->z),pz(&p_struct->pz),pm(&p_struct->pm),mergedist(&p_struct->mergedist) { } ahcreport::ahcreport(const ahcreport &rhs):_ahcreport_owner(rhs) ,terminationtype(p_struct->terminationtype),npoints(p_struct->npoints),p(&p_struct->p),z(&p_struct->z),pz(&p_struct->pz),pm(&p_struct->pm),mergedist(&p_struct->mergedist) { } ahcreport& ahcreport::operator=(const ahcreport &rhs) { if( this==&rhs ) return *this; _ahcreport_owner::operator=(rhs); return *this; } ahcreport::~ahcreport() { } /************************************************************************* This structure is used to store results of the k-means clustering algorithm. Following information is always returned: * NPoints contains number of points in the original dataset * TerminationType contains completion code, negative on failure, positive on success * K contains number of clusters For positive TerminationType we return: * NFeatures contains number of variables in the original dataset * C, which contains centers found by algorithm * CIdx, which maps points of the original dataset to clusters FORMAL DESCRIPTION OF FIELDS: NPoints number of points, >=0 NFeatures number of variables, >=1 TerminationType completion code: * -5 if distance type is anything different from Euclidean metric * -3 for degenerate dataset: a) less than K distinct points, b) K=0 for non-empty dataset. 
* +1 for successful completion K number of clusters C array[K,NFeatures], rows of the array store centers CIdx array[NPoints], which contains cluster indexes IterationsCount actual number of iterations performed by clusterizer. If algorithm performed more than one random restart, total number of iterations is returned. Energy merit function, "energy", sum of squared deviations from cluster centers -- ALGLIB -- Copyright 27.11.2012 by Bochkanov Sergey *************************************************************************/ _kmeansreport_owner::_kmeansreport_owner() { p_struct = (alglib_impl::kmeansreport*)alglib_impl::ae_malloc(sizeof(alglib_impl::kmeansreport), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_kmeansreport_init(p_struct, NULL); } _kmeansreport_owner::_kmeansreport_owner(const _kmeansreport_owner &rhs) { p_struct = (alglib_impl::kmeansreport*)alglib_impl::ae_malloc(sizeof(alglib_impl::kmeansreport), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_kmeansreport_init_copy(p_struct, const_cast(rhs.p_struct), NULL); } _kmeansreport_owner& _kmeansreport_owner::operator=(const _kmeansreport_owner &rhs) { if( this==&rhs ) return *this; alglib_impl::_kmeansreport_clear(p_struct); alglib_impl::_kmeansreport_init_copy(p_struct, const_cast(rhs.p_struct), NULL); return *this; } _kmeansreport_owner::~_kmeansreport_owner() { alglib_impl::_kmeansreport_clear(p_struct); ae_free(p_struct); } alglib_impl::kmeansreport* _kmeansreport_owner::c_ptr() { return p_struct; } alglib_impl::kmeansreport* _kmeansreport_owner::c_ptr() const { return const_cast(p_struct); } kmeansreport::kmeansreport() : _kmeansreport_owner() ,npoints(p_struct->npoints),nfeatures(p_struct->nfeatures),terminationtype(p_struct->terminationtype),iterationscount(p_struct->iterationscount),energy(p_struct->energy),k(p_struct->k),c(&p_struct->c),cidx(&p_struct->cidx) { } kmeansreport::kmeansreport(const kmeansreport &rhs):_kmeansreport_owner(rhs) ,npoints(p_struct->npoints),nfeatures(p_struct->nfeatures),terminationtype(p_struct->terminationtype),iterationscount(p_struct->iterationscount),energy(p_struct->energy),k(p_struct->k),c(&p_struct->c),cidx(&p_struct->cidx) { } kmeansreport& kmeansreport::operator=(const kmeansreport &rhs) { if( this==&rhs ) return *this; _kmeansreport_owner::operator=(rhs); return *this; } kmeansreport::~kmeansreport() { } /************************************************************************* This function initializes clusterizer object. Newly initialized object is empty, i.e. it does not contain dataset. You should use it as follows: 1. creation 2. dataset is added with ClusterizerSetPoints() 3. additional parameters are set 3. clusterization is performed with one of the clustering functions -- ALGLIB -- Copyright 10.07.2012 by Bochkanov Sergey *************************************************************************/ void clusterizercreate(clusterizerstate &s) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::clusterizercreate(const_cast(s.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function adds dataset to the clusterizer structure. This function overrides all previous calls of ClusterizerSetPoints() or ClusterizerSetDistances(). 
INPUT PARAMETERS: S - clusterizer state, initialized by ClusterizerCreate() XY - array[NPoints,NFeatures], dataset NPoints - number of points, >=0 NFeatures- number of features, >=1 DistType- distance function: * 0 Chebyshev distance (L-inf norm) * 1 city block distance (L1 norm) * 2 Euclidean distance (L2 norm), non-squared * 10 Pearson correlation: dist(a,b) = 1-corr(a,b) * 11 Absolute Pearson correlation: dist(a,b) = 1-|corr(a,b)| * 12 Uncentered Pearson correlation (cosine of the angle): dist(a,b) = a'*b/(|a|*|b|) * 13 Absolute uncentered Pearson correlation dist(a,b) = |a'*b|/(|a|*|b|) * 20 Spearman rank correlation: dist(a,b) = 1-rankcorr(a,b) * 21 Absolute Spearman rank correlation dist(a,b) = 1-|rankcorr(a,b)| NOTE 1: different distance functions have different performance penalty: * Euclidean or Pearson correlation distances are the fastest ones * Spearman correlation distance function is a bit slower * city block and Chebyshev distances are order of magnitude slower The reason behing difference in performance is that correlation-based distance functions are computed using optimized linear algebra kernels, while Chebyshev and city block distance functions are computed using simple nested loops with two branches at each iteration. NOTE 2: different clustering algorithms have different limitations: * agglomerative hierarchical clustering algorithms may be used with any kind of distance metric * k-means++ clustering algorithm may be used only with Euclidean distance function Thus, list of specific clustering algorithms you may use depends on distance function you specify when you set your dataset. -- ALGLIB -- Copyright 10.07.2012 by Bochkanov Sergey *************************************************************************/ void clusterizersetpoints(const clusterizerstate &s, const real_2d_array &xy, const ae_int_t npoints, const ae_int_t nfeatures, const ae_int_t disttype) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::clusterizersetpoints(const_cast(s.c_ptr()), const_cast(xy.c_ptr()), npoints, nfeatures, disttype, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function adds dataset to the clusterizer structure. This function overrides all previous calls of ClusterizerSetPoints() or ClusterizerSetDistances(). 
INPUT PARAMETERS: S - clusterizer state, initialized by ClusterizerCreate() XY - array[NPoints,NFeatures], dataset NPoints - number of points, >=0 NFeatures- number of features, >=1 DistType- distance function: * 0 Chebyshev distance (L-inf norm) * 1 city block distance (L1 norm) * 2 Euclidean distance (L2 norm), non-squared * 10 Pearson correlation: dist(a,b) = 1-corr(a,b) * 11 Absolute Pearson correlation: dist(a,b) = 1-|corr(a,b)| * 12 Uncentered Pearson correlation (cosine of the angle): dist(a,b) = a'*b/(|a|*|b|) * 13 Absolute uncentered Pearson correlation dist(a,b) = |a'*b|/(|a|*|b|) * 20 Spearman rank correlation: dist(a,b) = 1-rankcorr(a,b) * 21 Absolute Spearman rank correlation dist(a,b) = 1-|rankcorr(a,b)| NOTE 1: different distance functions have different performance penalty: * Euclidean or Pearson correlation distances are the fastest ones * Spearman correlation distance function is a bit slower * city block and Chebyshev distances are order of magnitude slower The reason behing difference in performance is that correlation-based distance functions are computed using optimized linear algebra kernels, while Chebyshev and city block distance functions are computed using simple nested loops with two branches at each iteration. NOTE 2: different clustering algorithms have different limitations: * agglomerative hierarchical clustering algorithms may be used with any kind of distance metric * k-means++ clustering algorithm may be used only with Euclidean distance function Thus, list of specific clustering algorithms you may use depends on distance function you specify when you set your dataset. -- ALGLIB -- Copyright 10.07.2012 by Bochkanov Sergey *************************************************************************/ void clusterizersetpoints(const clusterizerstate &s, const real_2d_array &xy, const ae_int_t disttype) { alglib_impl::ae_state _alglib_env_state; ae_int_t npoints; ae_int_t nfeatures; npoints = xy.rows(); nfeatures = xy.cols(); alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::clusterizersetpoints(const_cast(s.c_ptr()), const_cast(xy.c_ptr()), npoints, nfeatures, disttype, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function adds dataset given by distance matrix to the clusterizer structure. It is important that dataset is not given explicitly - only distance matrix is given. This function overrides all previous calls of ClusterizerSetPoints() or ClusterizerSetDistances(). INPUT PARAMETERS: S - clusterizer state, initialized by ClusterizerCreate() D - array[NPoints,NPoints], distance matrix given by its upper or lower triangle (main diagonal is ignored because its entries are expected to be zero). NPoints - number of points IsUpper - whether upper or lower triangle of D is given. NOTE 1: different clustering algorithms have different limitations: * agglomerative hierarchical clustering algorithms may be used with any kind of distance metric, including one which is given by distance matrix * k-means++ clustering algorithm may be used only with Euclidean distance function and explicitly given points - it can not be used with dataset given by distance matrix Thus, if you call this function, you will be unable to use k-means clustering algorithm to process your problem. 
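A minimal sketch of the distance-matrix workflow follows; it is not part of the original manual and the matrix values are illustrative. It uses the size-deducing clusterizersetdistances overload together with clusterizersetahcalgo and clusterizerrunahc, which are documented below; as noted above, k-means cannot be used once the dataset is given as a distance matrix:

#include "dataanalysis.h"
using namespace alglib;

int main()
{
    clusterizerstate s;
    ahcreport rep;
    // symmetric 3x3 distance matrix; only one triangle is actually read
    real_2d_array d = "[[0,1,3],[1,0,2],[3,2,0]]";
    clusterizercreate(s);
    clusterizersetdistances(s, d, true);   // use upper triangle of d, NPoints deduced from size
    clusterizersetahcalgo(s, 1);           // single linkage
    clusterizerrunahc(s, rep);
    // rep.z now lists the NPoints-1 merges of the clustering tree
    return 0;
}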
-- ALGLIB -- Copyright 10.07.2012 by Bochkanov Sergey *************************************************************************/ void clusterizersetdistances(const clusterizerstate &s, const real_2d_array &d, const ae_int_t npoints, const bool isupper) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::clusterizersetdistances(const_cast(s.c_ptr()), const_cast(d.c_ptr()), npoints, isupper, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function adds dataset given by distance matrix to the clusterizer structure. It is important that dataset is not given explicitly - only distance matrix is given. This function overrides all previous calls of ClusterizerSetPoints() or ClusterizerSetDistances(). INPUT PARAMETERS: S - clusterizer state, initialized by ClusterizerCreate() D - array[NPoints,NPoints], distance matrix given by its upper or lower triangle (main diagonal is ignored because its entries are expected to be zero). NPoints - number of points IsUpper - whether upper or lower triangle of D is given. NOTE 1: different clustering algorithms have different limitations: * agglomerative hierarchical clustering algorithms may be used with any kind of distance metric, including one which is given by distance matrix * k-means++ clustering algorithm may be used only with Euclidean distance function and explicitly given points - it can not be used with dataset given by distance matrix Thus, if you call this function, you will be unable to use k-means clustering algorithm to process your problem. -- ALGLIB -- Copyright 10.07.2012 by Bochkanov Sergey *************************************************************************/ void clusterizersetdistances(const clusterizerstate &s, const real_2d_array &d, const bool isupper) { alglib_impl::ae_state _alglib_env_state; ae_int_t npoints; if( (d.rows()!=d.cols())) throw ap_error("Error while calling 'clusterizersetdistances': looks like one of arguments has wrong size"); npoints = d.rows(); alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::clusterizersetdistances(const_cast(s.c_ptr()), const_cast(d.c_ptr()), npoints, isupper, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function sets agglomerative hierarchical clustering algorithm INPUT PARAMETERS: S - clusterizer state, initialized by ClusterizerCreate() Algo - algorithm type: * 0 complete linkage (default algorithm) * 1 single linkage * 2 unweighted average linkage * 3 weighted average linkage * 4 Ward's method NOTE: Ward's method works correctly only with Euclidean distance, that's why algorithm will return negative termination code (failure) for any other distance type. It is possible, however, to use this method with user-supplied distance matrix. It is your responsibility to pass one which was calculated with Euclidean distance function. 
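The sketch below (not part of the original manual; the points are illustrative) runs Ward's method on an explicitly given dataset, which is the safe way to satisfy the Euclidean-distance requirement, and then extracts two top clusters with clusterizergetkclusters, documented further below; the exact C++ signature of clusterizergetkclusters as written here is assumed:

#include "dataanalysis.h"
#include <cstdio>
using namespace alglib;

int main()
{
    clusterizerstate s;
    ahcreport rep;
    integer_1d_array cidx, cz;
    // illustrative 2-dimensional points
    real_2d_array xy = "[[1,1],[1,2],[4,1],[2,3],[4,1.5]]";
    clusterizercreate(s);
    clusterizersetpoints(s, xy, 2);             // DistType=2: Euclidean, sizes taken from xy
    clusterizersetahcalgo(s, 4);                // Ward's method (requires Euclidean distance)
    clusterizerrunahc(s, rep);
    clusterizergetkclusters(rep, 2, cidx, cz);  // split dendrogram into 2 top clusters
    printf("cluster of point 0: %d\n", (int)cidx[0]);
    return 0;
}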
-- ALGLIB -- Copyright 10.07.2012 by Bochkanov Sergey *************************************************************************/ void clusterizersetahcalgo(const clusterizerstate &s, const ae_int_t algo) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::clusterizersetahcalgo(const_cast(s.c_ptr()), algo, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function sets k-means properties: number of restarts and maximum number of iterations per one run. INPUT PARAMETERS: S - clusterizer state, initialized by ClusterizerCreate() Restarts- restarts count, >=1. k-means++ algorithm performs several restarts and chooses best set of centers (one with minimum squared distance). MaxIts - maximum number of k-means iterations performed during one run. >=0, zero value means that algorithm performs unlimited number of iterations. -- ALGLIB -- Copyright 10.07.2012 by Bochkanov Sergey *************************************************************************/ void clusterizersetkmeanslimits(const clusterizerstate &s, const ae_int_t restarts, const ae_int_t maxits) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::clusterizersetkmeanslimits(const_cast(s.c_ptr()), restarts, maxits, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function sets k-means initialization algorithm. Several different algorithms can be chosen, including k-means++. INPUT PARAMETERS: S - clusterizer state, initialized by ClusterizerCreate() InitAlgo- initialization algorithm: * 0 automatic selection ( different versions of ALGLIB may select different algorithms) * 1 random initialization * 2 k-means++ initialization (best quality of initial centers, but long non-parallelizable initialization phase with bad cache locality) * 3 "fast-greedy" algorithm with efficient, easy to parallelize initialization. Quality of initial centers is somewhat worse than that of k-means++. This algorithm is a default one in the current version of ALGLIB. *-1 "debug" algorithm which always selects first K rows of dataset; this algorithm is used for debug purposes only. Do not use it in the industrial code! -- ALGLIB -- Copyright 21.01.2015 by Bochkanov Sergey *************************************************************************/ void clusterizersetkmeansinit(const clusterizerstate &s, const ae_int_t initalgo) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::clusterizersetkmeansinit(const_cast(s.c_ptr()), initalgo, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function performs agglomerative hierarchical clustering COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! 
Agglomerative hierarchical clustering algorithm has two phases: ! distance matrix calculation and clustering itself. Only first phase ! (distance matrix calculation) is accelerated by Intel MKL and multi- ! threading. Thus, acceleration is significant only for medium or high- ! dimensional problems. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS: S - clusterizer state, initialized by ClusterizerCreate() OUTPUT PARAMETERS: Rep - clustering results; see description of AHCReport structure for more information. NOTE 1: hierarchical clustering algorithms require large amounts of memory. In particular, this implementation needs sizeof(double)*NPoints^2 bytes, which are used to store distance matrix. In case we work with user-supplied matrix, this amount is multiplied by 2 (we have to store original matrix and to work with its copy). For example, problem with 10000 points would require 800M of RAM, even when working in a 1-dimensional space. -- ALGLIB -- Copyright 10.07.2012 by Bochkanov Sergey *************************************************************************/ void clusterizerrunahc(const clusterizerstate &s, ahcreport &rep) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::clusterizerrunahc(const_cast(s.c_ptr()), const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void smp_clusterizerrunahc(const clusterizerstate &s, ahcreport &rep) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::_pexec_clusterizerrunahc(const_cast(s.c_ptr()), const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function performs clustering by k-means++ algorithm. You may change algorithm properties by calling: * ClusterizerSetKMeansLimits() to change number of restarts or iterations * ClusterizerSetKMeansInit() to change initialization algorithm By default, one restart and unlimited number of iterations are used. Initialization algorithm is chosen automatically. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function: ! * multicore support (can be used from C# and C++) ! * access to high-performance C++ core (actual for C# users) ! ! K-means clustering algorithm has two phases: selection of initial ! centers and clustering itself. ALGLIB parallelizes both phases. ! Parallel version is optimized for the following scenario: medium or ! high-dimensional problem (20 or more dimensions) with large number of ! points and clusters. However, some speed-up can be obtained even when ! assumptions above are violated. ! ! As for native-vs-managed comparison, working with native core brings ! 30-40% improvement in speed over pure C# version of ALGLIB. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. 
INPUT PARAMETERS: S - clusterizer state, initialized by ClusterizerCreate() K - number of clusters, K>=0. K can be zero only when algorithm is called for empty dataset, in this case completion code is set to success (+1). If K=0 and dataset size is non-zero, we can not meaningfully assign points to some center (there are no centers because K=0) and return -3 as completion code (failure). OUTPUT PARAMETERS: Rep - clustering results; see description of KMeansReport structure for more information. NOTE 1: k-means clustering can be performed only for datasets with Euclidean distance function. Algorithm will return negative completion code in Rep.TerminationType in case dataset was added to clusterizer with DistType other than Euclidean (or dataset was specified by distance matrix instead of explicitly given points). -- ALGLIB -- Copyright 10.07.2012 by Bochkanov Sergey *************************************************************************/ void clusterizerrunkmeans(const clusterizerstate &s, const ae_int_t k, kmeansreport &rep) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::clusterizerrunkmeans(const_cast(s.c_ptr()), k, const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void smp_clusterizerrunkmeans(const clusterizerstate &s, const ae_int_t k, kmeansreport &rep) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::_pexec_clusterizerrunkmeans(const_cast(s.c_ptr()), k, const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function returns distance matrix for dataset COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Agglomerative hierarchical clustering algorithm has two phases: ! distance matrix calculation and clustering itself. Only first phase ! (distance matrix calculation) is accelerated by Intel MKL and multi- ! threading. Thus, acceleration is significant only for medium or high- ! dimensional problems. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. 
INPUT PARAMETERS: XY - array[NPoints,NFeatures], dataset NPoints - number of points, >=0 NFeatures- number of features, >=1 DistType- distance function: * 0 Chebyshev distance (L-inf norm) * 1 city block distance (L1 norm) * 2 Euclidean distance (L2 norm, non-squared) * 10 Pearson correlation: dist(a,b) = 1-corr(a,b) * 11 Absolute Pearson correlation: dist(a,b) = 1-|corr(a,b)| * 12 Uncentered Pearson correlation (cosine of the angle): dist(a,b) = a'*b/(|a|*|b|) * 13 Absolute uncentered Pearson correlation dist(a,b) = |a'*b|/(|a|*|b|) * 20 Spearman rank correlation: dist(a,b) = 1-rankcorr(a,b) * 21 Absolute Spearman rank correlation dist(a,b) = 1-|rankcorr(a,b)| OUTPUT PARAMETERS: D - array[NPoints,NPoints], distance matrix (full matrix is returned, with lower and upper triangles) NOTE: different distance functions have different performance penalty: * Euclidean or Pearson correlation distances are the fastest ones * Spearman correlation distance function is a bit slower * city block and Chebyshev distances are order of magnitude slower The reason behing difference in performance is that correlation-based distance functions are computed using optimized linear algebra kernels, while Chebyshev and city block distance functions are computed using simple nested loops with two branches at each iteration. -- ALGLIB -- Copyright 10.07.2012 by Bochkanov Sergey *************************************************************************/ void clusterizergetdistances(const real_2d_array &xy, const ae_int_t npoints, const ae_int_t nfeatures, const ae_int_t disttype, real_2d_array &d) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::clusterizergetdistances(const_cast(xy.c_ptr()), npoints, nfeatures, disttype, const_cast(d.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void smp_clusterizergetdistances(const real_2d_array &xy, const ae_int_t npoints, const ae_int_t nfeatures, const ae_int_t disttype, real_2d_array &d) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::_pexec_clusterizergetdistances(const_cast(xy.c_ptr()), npoints, nfeatures, disttype, const_cast(d.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function takes as input clusterization report Rep, desired clusters count K, and builds top K clusters from hierarchical clusterization tree. It returns assignment of points to clusters (array of cluster indexes). INPUT PARAMETERS: Rep - report from ClusterizerRunAHC() performed on XY K - desired number of clusters, 1<=K<=NPoints. K can be zero only when NPoints=0. OUTPUT PARAMETERS: CIdx - array[NPoints], I-th element contains cluster index (from 0 to K-1) for I-th point of the dataset. CZ - array[K]. This array allows to convert cluster indexes returned by this function to indexes used by Rep.Z. J-th cluster returned by this function corresponds to CZ[J]-th cluster stored in Rep.Z/PZ/PM. 
It is guaranteed that CZ[I](rep.c_ptr()), k, const_cast(cidx.c_ptr()), const_cast(cz.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function accepts AHC report Rep, desired minimum intercluster distance and returns top clusters from hierarchical clusterization tree which are separated by distance R or HIGHER. It returns assignment of points to clusters (array of cluster indexes). There is one more function with similar name - ClusterizerSeparatedByCorr, which returns clusters with intercluster correlation equal to R or LOWER (note: higher for distance, lower for correlation). INPUT PARAMETERS: Rep - report from ClusterizerRunAHC() performed on XY R - desired minimum intercluster distance, R>=0 OUTPUT PARAMETERS: K - number of clusters, 1<=K<=NPoints CIdx - array[NPoints], I-th element contains cluster index (from 0 to K-1) for I-th point of the dataset. CZ - array[K]. This array allows to convert cluster indexes returned by this function to indexes used by Rep.Z. J-th cluster returned by this function corresponds to CZ[J]-th cluster stored in Rep.Z/PZ/PM. It is guaranteed that CZ[I](rep.c_ptr()), r, &k, const_cast(cidx.c_ptr()), const_cast(cz.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function accepts AHC report Rep, desired maximum intercluster correlation and returns top clusters from hierarchical clusterization tree which are separated by correlation R or LOWER. It returns assignment of points to clusters (array of cluster indexes). There is one more function with similar name - ClusterizerSeparatedByDist, which returns clusters with intercluster distance equal to R or HIGHER (note: higher for distance, lower for correlation). INPUT PARAMETERS: Rep - report from ClusterizerRunAHC() performed on XY R - desired maximum intercluster correlation, -1<=R<=+1 OUTPUT PARAMETERS: K - number of clusters, 1<=K<=NPoints CIdx - array[NPoints], I-th element contains cluster index (from 0 to K-1) for I-th point of the dataset. CZ - array[K]. This array allows to convert cluster indexes returned by this function to indexes used by Rep.Z. J-th cluster returned by this function corresponds to CZ[J]-th cluster stored in Rep.Z/PZ/PM. 
It is guaranteed that CZ[I](rep.c_ptr()), r, &k, const_cast(cidx.c_ptr()), const_cast(cz.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* *************************************************************************/ _decisionforest_owner::_decisionforest_owner() { p_struct = (alglib_impl::decisionforest*)alglib_impl::ae_malloc(sizeof(alglib_impl::decisionforest), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_decisionforest_init(p_struct, NULL); } _decisionforest_owner::_decisionforest_owner(const _decisionforest_owner &rhs) { p_struct = (alglib_impl::decisionforest*)alglib_impl::ae_malloc(sizeof(alglib_impl::decisionforest), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_decisionforest_init_copy(p_struct, const_cast(rhs.p_struct), NULL); } _decisionforest_owner& _decisionforest_owner::operator=(const _decisionforest_owner &rhs) { if( this==&rhs ) return *this; alglib_impl::_decisionforest_clear(p_struct); alglib_impl::_decisionforest_init_copy(p_struct, const_cast(rhs.p_struct), NULL); return *this; } _decisionforest_owner::~_decisionforest_owner() { alglib_impl::_decisionforest_clear(p_struct); ae_free(p_struct); } alglib_impl::decisionforest* _decisionforest_owner::c_ptr() { return p_struct; } alglib_impl::decisionforest* _decisionforest_owner::c_ptr() const { return const_cast(p_struct); } decisionforest::decisionforest() : _decisionforest_owner() { } decisionforest::decisionforest(const decisionforest &rhs):_decisionforest_owner(rhs) { } decisionforest& decisionforest::operator=(const decisionforest &rhs) { if( this==&rhs ) return *this; _decisionforest_owner::operator=(rhs); return *this; } decisionforest::~decisionforest() { } /************************************************************************* *************************************************************************/ _dfreport_owner::_dfreport_owner() { p_struct = (alglib_impl::dfreport*)alglib_impl::ae_malloc(sizeof(alglib_impl::dfreport), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_dfreport_init(p_struct, NULL); } _dfreport_owner::_dfreport_owner(const _dfreport_owner &rhs) { p_struct = (alglib_impl::dfreport*)alglib_impl::ae_malloc(sizeof(alglib_impl::dfreport), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_dfreport_init_copy(p_struct, const_cast(rhs.p_struct), NULL); } _dfreport_owner& _dfreport_owner::operator=(const _dfreport_owner &rhs) { if( this==&rhs ) return *this; alglib_impl::_dfreport_clear(p_struct); alglib_impl::_dfreport_init_copy(p_struct, const_cast(rhs.p_struct), NULL); return *this; } _dfreport_owner::~_dfreport_owner() { alglib_impl::_dfreport_clear(p_struct); ae_free(p_struct); } alglib_impl::dfreport* _dfreport_owner::c_ptr() { return p_struct; } alglib_impl::dfreport* _dfreport_owner::c_ptr() const { return const_cast(p_struct); } dfreport::dfreport() : _dfreport_owner() ,relclserror(p_struct->relclserror),avgce(p_struct->avgce),rmserror(p_struct->rmserror),avgerror(p_struct->avgerror),avgrelerror(p_struct->avgrelerror),oobrelclserror(p_struct->oobrelclserror),oobavgce(p_struct->oobavgce),oobrmserror(p_struct->oobrmserror),oobavgerror(p_struct->oobavgerror),oobavgrelerror(p_struct->oobavgrelerror) { } dfreport::dfreport(const dfreport &rhs):_dfreport_owner(rhs) 
,relclserror(p_struct->relclserror),avgce(p_struct->avgce),rmserror(p_struct->rmserror),avgerror(p_struct->avgerror),avgrelerror(p_struct->avgrelerror),oobrelclserror(p_struct->oobrelclserror),oobavgce(p_struct->oobavgce),oobrmserror(p_struct->oobrmserror),oobavgerror(p_struct->oobavgerror),oobavgrelerror(p_struct->oobavgrelerror) { } dfreport& dfreport::operator=(const dfreport &rhs) { if( this==&rhs ) return *this; _dfreport_owner::operator=(rhs); return *this; } dfreport::~dfreport() { } /************************************************************************* This function serializes data structure to string. Important properties of s_out: * it contains alphanumeric characters, dots, underscores, minus signs * these symbols are grouped into words, which are separated by spaces and Windows-style (CR+LF) newlines * although serializer uses spaces and CR+LF as separators, you can replace any separator character by arbitrary combination of spaces, tabs, Windows or Unix newlines. It allows flexible reformatting of the string in case you want to include it into text or XML file. But you should not insert separators into the middle of the "words" nor you should change case of letters. * s_out can be freely moved between 32-bit and 64-bit systems, little and big endian machines, and so on. You can serialize structure on 32-bit machine and unserialize it on 64-bit one (or vice versa), or serialize it on SPARC and unserialize on x86. You can also serialize it in C++ version of ALGLIB and unserialize in C# one, and vice versa. *************************************************************************/ void dfserialize(decisionforest &obj, std::string &s_out) { alglib_impl::ae_state state; alglib_impl::ae_serializer serializer; alglib_impl::ae_int_t ssize; alglib_impl::ae_state_init(&state); try { alglib_impl::ae_serializer_init(&serializer); alglib_impl::ae_serializer_alloc_start(&serializer); alglib_impl::dfalloc(&serializer, obj.c_ptr(), &state); ssize = alglib_impl::ae_serializer_get_alloc_size(&serializer); s_out.clear(); s_out.reserve((size_t)(ssize+1)); alglib_impl::ae_serializer_sstart_str(&serializer, &s_out); alglib_impl::dfserialize(&serializer, obj.c_ptr(), &state); alglib_impl::ae_serializer_stop(&serializer, &state); if( s_out.length()>(size_t)ssize ) throw ap_error("ALGLIB: serialization integrity error"); alglib_impl::ae_serializer_clear(&serializer); alglib_impl::ae_state_clear(&state); } catch(alglib_impl::ae_error_type) { throw ap_error(state.error_msg); } } /************************************************************************* This function unserializes data structure from string. *************************************************************************/ void dfunserialize(const std::string &s_in, decisionforest &obj) { alglib_impl::ae_state state; alglib_impl::ae_serializer serializer; alglib_impl::ae_state_init(&state); try { alglib_impl::ae_serializer_init(&serializer); alglib_impl::ae_serializer_ustart_str(&serializer, &s_in); alglib_impl::dfunserialize(&serializer, obj.c_ptr(), &state); alglib_impl::ae_serializer_stop(&serializer, &state); alglib_impl::ae_serializer_clear(&serializer); alglib_impl::ae_state_clear(&state); } catch(alglib_impl::ae_error_type) { throw ap_error(state.error_msg); } } /************************************************************************* This function serializes data structure to C++ stream. 
Data stream generated by this function is same as string representation generated by string version of serializer - alphanumeric characters, dots, underscores, minus signs, which are grouped into words separated by spaces and CR+LF. We recommend you to read comments on string version of serializer to find out more about serialization of AlGLIB objects. *************************************************************************/ void dfserialize(decisionforest &obj, std::ostream &s_out) { alglib_impl::ae_state state; alglib_impl::ae_serializer serializer; alglib_impl::ae_state_init(&state); try { alglib_impl::ae_serializer_init(&serializer); alglib_impl::ae_serializer_alloc_start(&serializer); alglib_impl::dfalloc(&serializer, obj.c_ptr(), &state); alglib_impl::ae_serializer_get_alloc_size(&serializer); // not actually needed, but we have to ask alglib_impl::ae_serializer_sstart_stream(&serializer, &s_out); alglib_impl::dfserialize(&serializer, obj.c_ptr(), &state); alglib_impl::ae_serializer_stop(&serializer, &state); alglib_impl::ae_serializer_clear(&serializer); alglib_impl::ae_state_clear(&state); } catch(alglib_impl::ae_error_type) { throw ap_error(state.error_msg); } } /************************************************************************* This function unserializes data structure from stream. *************************************************************************/ void dfunserialize(const std::istream &s_in, decisionforest &obj) { alglib_impl::ae_state state; alglib_impl::ae_serializer serializer; alglib_impl::ae_state_init(&state); try { alglib_impl::ae_serializer_init(&serializer); alglib_impl::ae_serializer_ustart_stream(&serializer, &s_in); alglib_impl::dfunserialize(&serializer, obj.c_ptr(), &state); alglib_impl::ae_serializer_stop(&serializer, &state); alglib_impl::ae_serializer_clear(&serializer); alglib_impl::ae_state_clear(&state); } catch(alglib_impl::ae_error_type) { throw ap_error(state.error_msg); } } /************************************************************************* This subroutine builds random decision forest. INPUT PARAMETERS: XY - training set NPoints - training set size, NPoints>=1 NVars - number of independent variables, NVars>=1 NClasses - task type: * NClasses=1 - regression task with one dependent variable * NClasses>1 - classification task with NClasses classes. NTrees - number of trees in a forest, NTrees>=1. recommended values: 50-100. R - percent of a training set used to build individual trees. 01). * 1, if task has been solved DF - model built Rep - training report, contains error on a training set and out-of-bag estimates of generalization error. -- ALGLIB -- Copyright 19.02.2009 by Bochkanov Sergey *************************************************************************/ void dfbuildrandomdecisionforest(const real_2d_array &xy, const ae_int_t npoints, const ae_int_t nvars, const ae_int_t nclasses, const ae_int_t ntrees, const double r, ae_int_t &info, decisionforest &df, dfreport &rep) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::dfbuildrandomdecisionforest(const_cast(xy.c_ptr()), npoints, nvars, nclasses, ntrees, r, &info, const_cast(df.c_ptr()), const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This subroutine builds random decision forest. 
This function gives ability to tune number of variables used when choosing best split. INPUT PARAMETERS: XY - training set NPoints - training set size, NPoints>=1 NVars - number of independent variables, NVars>=1 NClasses - task type: * NClasses=1 - regression task with one dependent variable * NClasses>1 - classification task with NClasses classes. NTrees - number of trees in a forest, NTrees>=1. recommended values: 50-100. NRndVars - number of variables used when choosing best split R - percent of a training set used to build individual trees. 01). * 1, if task has been solved DF - model built Rep - training report, contains error on a training set and out-of-bag estimates of generalization error. -- ALGLIB -- Copyright 19.02.2009 by Bochkanov Sergey *************************************************************************/ void dfbuildrandomdecisionforestx1(const real_2d_array &xy, const ae_int_t npoints, const ae_int_t nvars, const ae_int_t nclasses, const ae_int_t ntrees, const ae_int_t nrndvars, const double r, ae_int_t &info, decisionforest &df, dfreport &rep) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::dfbuildrandomdecisionforestx1(const_cast(xy.c_ptr()), npoints, nvars, nclasses, ntrees, nrndvars, r, &info, const_cast(df.c_ptr()), const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Procesing INPUT PARAMETERS: DF - decision forest model X - input vector, array[0..NVars-1]. OUTPUT PARAMETERS: Y - result. Regression estimate when solving regression task, vector of posterior probabilities for classification task. See also DFProcessI. -- ALGLIB -- Copyright 16.02.2009 by Bochkanov Sergey *************************************************************************/ void dfprocess(const decisionforest &df, const real_1d_array &x, real_1d_array &y) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::dfprocess(const_cast(df.c_ptr()), const_cast(x.c_ptr()), const_cast(y.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* 'interactive' variant of DFProcess for languages like Python which support constructs like "Y = DFProcessI(DF,X)" and interactive mode of interpreter This function allocates new array on each call, so it is significantly slower than its 'non-interactive' counterpart, but it is more convenient when you call it from command line. 
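A minimal sketch of building and applying a decision forest follows; it is not part of the original manual and the XOR-like dataset is purely illustrative. It combines dfbuildrandomdecisionforest and dfprocess, both documented in this group:

#include "dataanalysis.h"
#include <cstdio>
using namespace alglib;

int main()
{
    // 4 points, 2 variables, class label (0 or 1) in the last column
    real_2d_array xy = "[[0,0,0],[0,1,1],[1,0,1],[1,1,0]]";
    ae_int_t info;
    decisionforest df;
    dfreport rep;
    real_1d_array x = "[1,0]";
    real_1d_array y = "[0,0]";  // receives posterior probabilities for the 2 classes
    dfbuildrandomdecisionforest(xy, 4, 2, 2, 50, 0.66, info, df, rep); // 50 trees, 66% of set per tree
    dfprocess(df, x, y);
    printf("P(class=1) = %.3f, OOB class. error = %.3f\n", y[1], rep.oobrelclserror);
    return 0;
}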
-- ALGLIB -- Copyright 28.02.2010 by Bochkanov Sergey *************************************************************************/ void dfprocessi(const decisionforest &df, const real_1d_array &x, real_1d_array &y) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::dfprocessi(const_cast(df.c_ptr()), const_cast(x.c_ptr()), const_cast(y.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Relative classification error on the test set INPUT PARAMETERS: DF - decision forest model XY - test set NPoints - test set size RESULT: percent of incorrectly classified cases. Zero if model solves regression task. -- ALGLIB -- Copyright 16.02.2009 by Bochkanov Sergey *************************************************************************/ double dfrelclserror(const decisionforest &df, const real_2d_array &xy, const ae_int_t npoints) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::dfrelclserror(const_cast(df.c_ptr()), const_cast(xy.c_ptr()), npoints, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Average cross-entropy (in bits per element) on the test set INPUT PARAMETERS: DF - decision forest model XY - test set NPoints - test set size RESULT: CrossEntropy/(NPoints*LN(2)). Zero if model solves regression task. -- ALGLIB -- Copyright 16.02.2009 by Bochkanov Sergey *************************************************************************/ double dfavgce(const decisionforest &df, const real_2d_array &xy, const ae_int_t npoints) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::dfavgce(const_cast(df.c_ptr()), const_cast(xy.c_ptr()), npoints, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* RMS error on the test set INPUT PARAMETERS: DF - decision forest model XY - test set NPoints - test set size RESULT: root mean square error. Its meaning for regression task is obvious. As for classification task, RMS error means error when estimating posterior probabilities. 
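Continuing the sketch above (it assumes df has already been built there and is therefore a fragment rather than a complete program), the test-set error helpers documented in this group can be called as follows; xytest is illustrative labeled test data with the same column layout as the training set:

real_2d_array xytest = "[[0,0,0],[1,1,0]]";
double cls = dfrelclserror(df, xytest, 2);  // share of misclassified points
double ce  = dfavgce(df, xytest, 2);        // average cross-entropy, bits per element
double rms = dfrmserror(df, xytest, 2);     // RMS error of posterior probability estimates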
-- ALGLIB -- Copyright 16.02.2009 by Bochkanov Sergey *************************************************************************/ double dfrmserror(const decisionforest &df, const real_2d_array &xy, const ae_int_t npoints) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::dfrmserror(const_cast(df.c_ptr()), const_cast(xy.c_ptr()), npoints, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Average error on the test set INPUT PARAMETERS: DF - decision forest model XY - test set NPoints - test set size RESULT: Its meaning for regression task is obvious. As for classification task, it means average error when estimating posterior probabilities. -- ALGLIB -- Copyright 16.02.2009 by Bochkanov Sergey *************************************************************************/ double dfavgerror(const decisionforest &df, const real_2d_array &xy, const ae_int_t npoints) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::dfavgerror(const_cast(df.c_ptr()), const_cast(xy.c_ptr()), npoints, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Average relative error on the test set INPUT PARAMETERS: DF - decision forest model XY - test set NPoints - test set size RESULT: Its meaning for regression task is obvious. As for classification task, it means average relative error when estimating posterior probability of belonging to the correct class. -- ALGLIB -- Copyright 16.02.2009 by Bochkanov Sergey *************************************************************************/ double dfavgrelerror(const decisionforest &df, const real_2d_array &xy, const ae_int_t npoints) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::dfavgrelerror(const_cast(df.c_ptr()), const_cast(xy.c_ptr()), npoints, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* k-means++ clusterization. Backward compatibility function, we recommend to use CLUSTERING subpackage as better replacement. 
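Since kmeansgenerate is a legacy entry point, the sketch below (not part of the original manual; toy data) shows the recommended replacement built from the CLUSTERING subpackage functions documented earlier in this section:

#include "dataanalysis.h"
#include <cstdio>
using namespace alglib;

int main()
{
    clusterizerstate s;
    kmeansreport rep;
    real_2d_array xy = "[[1,1],[1,2],[4,1],[2,3],[4,1.5]]";
    clusterizercreate(s);
    clusterizersetpoints(s, xy, 2);        // Euclidean distance, required by k-means
    clusterizersetkmeanslimits(s, 5, 0);   // 5 restarts, unlimited iterations per run
    clusterizerrunkmeans(s, 2, rep);       // K=2 clusters
    printf("termination: %d, energy: %.3f\n", (int)rep.terminationtype, rep.energy);
    // rep.c stores the 2 centers, rep.cidx maps each point to its cluster
    return 0;
}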
-- ALGLIB -- Copyright 21.03.2009 by Bochkanov Sergey *************************************************************************/ void kmeansgenerate(const real_2d_array &xy, const ae_int_t npoints, const ae_int_t nvars, const ae_int_t k, const ae_int_t restarts, ae_int_t &info, real_2d_array &c, integer_1d_array &xyc) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::kmeansgenerate(const_cast(xy.c_ptr()), npoints, nvars, k, restarts, &info, const_cast(c.c_ptr()), const_cast(xyc.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } } ///////////////////////////////////////////////////////////////////////// // // THIS SECTION CONTAINS IMPLEMENTATION OF COMPUTATIONAL CORE // ///////////////////////////////////////////////////////////////////////// namespace alglib_impl { static double bdss_xlny(double x, double y, ae_state *_state); static double bdss_getcv(/* Integer */ ae_vector* cnt, ae_int_t nc, ae_state *_state); static void bdss_tieaddc(/* Integer */ ae_vector* c, /* Integer */ ae_vector* ties, ae_int_t ntie, ae_int_t nc, /* Integer */ ae_vector* cnt, ae_state *_state); static void bdss_tiesubc(/* Integer */ ae_vector* c, /* Integer */ ae_vector* ties, ae_int_t ntie, ae_int_t nc, /* Integer */ ae_vector* cnt, ae_state *_state); static ae_int_t mlpbase_mlpvnum = 7; static ae_int_t mlpbase_mlpfirstversion = 0; static ae_int_t mlpbase_nfieldwidth = 4; static ae_int_t mlpbase_hlconnfieldwidth = 5; static ae_int_t mlpbase_hlnfieldwidth = 4; static ae_int_t mlpbase_gradbasecasecost = 50000; static ae_int_t mlpbase_microbatchsize = 64; static void mlpbase_addinputlayer(ae_int_t ncount, /* Integer */ ae_vector* lsizes, /* Integer */ ae_vector* ltypes, /* Integer */ ae_vector* lconnfirst, /* Integer */ ae_vector* lconnlast, ae_int_t* lastproc, ae_state *_state); static void mlpbase_addbiasedsummatorlayer(ae_int_t ncount, /* Integer */ ae_vector* lsizes, /* Integer */ ae_vector* ltypes, /* Integer */ ae_vector* lconnfirst, /* Integer */ ae_vector* lconnlast, ae_int_t* lastproc, ae_state *_state); static void mlpbase_addactivationlayer(ae_int_t functype, /* Integer */ ae_vector* lsizes, /* Integer */ ae_vector* ltypes, /* Integer */ ae_vector* lconnfirst, /* Integer */ ae_vector* lconnlast, ae_int_t* lastproc, ae_state *_state); static void mlpbase_addzerolayer(/* Integer */ ae_vector* lsizes, /* Integer */ ae_vector* ltypes, /* Integer */ ae_vector* lconnfirst, /* Integer */ ae_vector* lconnlast, ae_int_t* lastproc, ae_state *_state); static void mlpbase_hladdinputlayer(multilayerperceptron* network, ae_int_t* connidx, ae_int_t* neuroidx, ae_int_t* structinfoidx, ae_int_t nin, ae_state *_state); static void mlpbase_hladdoutputlayer(multilayerperceptron* network, ae_int_t* connidx, ae_int_t* neuroidx, ae_int_t* structinfoidx, ae_int_t* weightsidx, ae_int_t k, ae_int_t nprev, ae_int_t nout, ae_bool iscls, ae_bool islinearout, ae_state *_state); static void mlpbase_hladdhiddenlayer(multilayerperceptron* network, ae_int_t* connidx, ae_int_t* neuroidx, ae_int_t* structinfoidx, ae_int_t* weightsidx, ae_int_t k, ae_int_t nprev, ae_int_t ncur, ae_state *_state); static void mlpbase_fillhighlevelinformation(multilayerperceptron* network, ae_int_t nin, ae_int_t nhid1, ae_int_t nhid2, ae_int_t nout, ae_bool iscls, ae_bool islinearout, ae_state *_state); static void mlpbase_mlpcreate(ae_int_t nin, ae_int_t nout, /* Integer */ 
ae_vector* lsizes, /* Integer */ ae_vector* ltypes, /* Integer */ ae_vector* lconnfirst, /* Integer */ ae_vector* lconnlast, ae_int_t layerscount, ae_bool isclsnet, multilayerperceptron* network, ae_state *_state); static void mlpbase_mlphessianbatchinternal(multilayerperceptron* network, /* Real */ ae_matrix* xy, ae_int_t ssize, ae_bool naturalerr, double* e, /* Real */ ae_vector* grad, /* Real */ ae_matrix* h, ae_state *_state); static void mlpbase_mlpinternalcalculategradient(multilayerperceptron* network, /* Real */ ae_vector* neurons, /* Real */ ae_vector* weights, /* Real */ ae_vector* derror, /* Real */ ae_vector* grad, ae_bool naturalerrorfunc, ae_state *_state); static void mlpbase_mlpchunkedgradient(multilayerperceptron* network, /* Real */ ae_matrix* xy, ae_int_t cstart, ae_int_t csize, /* Real */ ae_vector* batch4buf, /* Real */ ae_vector* hpcbuf, double* e, ae_bool naturalerrorfunc, ae_state *_state); static void mlpbase_mlpchunkedprocess(multilayerperceptron* network, /* Real */ ae_matrix* xy, ae_int_t cstart, ae_int_t csize, /* Real */ ae_vector* batch4buf, /* Real */ ae_vector* hpcbuf, ae_state *_state); static double mlpbase_safecrossentropy(double t, double z, ae_state *_state); static void mlpbase_randomizebackwardpass(multilayerperceptron* network, ae_int_t neuronidx, double v, ae_state *_state); static ae_int_t linreg_lrvnum = 5; static void linreg_lrinternal(/* Real */ ae_matrix* xy, /* Real */ ae_vector* s, ae_int_t npoints, ae_int_t nvars, ae_int_t* info, linearmodel* lm, lrreport* ar, ae_state *_state); static double logit_xtol = 100*ae_machineepsilon; static double logit_ftol = 0.0001; static double logit_gtol = 0.3; static ae_int_t logit_maxfev = 20; static double logit_stpmin = 1.0E-2; static double logit_stpmax = 1.0E5; static ae_int_t logit_logitvnum = 6; static void logit_mnliexp(/* Real */ ae_vector* w, /* Real */ ae_vector* x, ae_state *_state); static void logit_mnlallerrors(logitmodel* lm, /* Real */ ae_matrix* xy, ae_int_t npoints, double* relcls, double* avgce, double* rms, double* avg, double* avgrel, ae_state *_state); static void logit_mnlmcsrch(ae_int_t n, /* Real */ ae_vector* x, double* f, /* Real */ ae_vector* g, /* Real */ ae_vector* s, double* stp, ae_int_t* info, ae_int_t* nfev, /* Real */ ae_vector* wa, logitmcstate* state, ae_int_t* stage, ae_state *_state); static void logit_mnlmcstep(double* stx, double* fx, double* dx, double* sty, double* fy, double* dy, double* stp, double fp, double dp, ae_bool* brackt, double stmin, double stmax, ae_int_t* info, ae_state *_state); static double mcpd_xtol = 1.0E-8; static void mcpd_mcpdinit(ae_int_t n, ae_int_t entrystate, ae_int_t exitstate, mcpdstate* s, ae_state *_state); static ae_int_t mlpe_mlpefirstversion = 1; static double mlptrain_mindecay = 0.001; static ae_int_t mlptrain_defaultlbfgsfactor = 6; static void mlptrain_mlpkfoldcvgeneral(multilayerperceptron* n, /* Real */ ae_matrix* xy, ae_int_t npoints, double decay, ae_int_t restarts, ae_int_t foldscount, ae_bool lmalgorithm, double wstep, ae_int_t maxits, ae_int_t* info, mlpreport* rep, mlpcvreport* cvrep, ae_state *_state); static void mlptrain_mlpkfoldsplit(/* Real */ ae_matrix* xy, ae_int_t npoints, ae_int_t nclasses, ae_int_t foldscount, ae_bool stratifiedsplits, /* Integer */ ae_vector* folds, ae_state *_state); static void mlptrain_mthreadcv(mlptrainer* s, ae_int_t rowsize, ae_int_t nrestarts, /* Integer */ ae_vector* folds, ae_int_t fold, ae_int_t dfold, /* Real */ ae_matrix* cvy, ae_shared_pool* pooldatacv, ae_state *_state); static 
void mlptrain_mlptrainnetworkx(mlptrainer* s, ae_int_t nrestarts, ae_int_t algokind, /* Integer */ ae_vector* trnsubset, ae_int_t trnsubsetsize, /* Integer */ ae_vector* valsubset, ae_int_t valsubsetsize, multilayerperceptron* network, mlpreport* rep, ae_bool isrootcall, ae_shared_pool* sessions, ae_state *_state); static void mlptrain_mlptrainensemblex(mlptrainer* s, mlpensemble* ensemble, ae_int_t idx0, ae_int_t idx1, ae_int_t nrestarts, ae_int_t trainingmethod, sinteger* ngrad, ae_bool isrootcall, ae_shared_pool* esessions, ae_state *_state); static void mlptrain_mlpstarttrainingx(mlptrainer* s, ae_bool randomstart, ae_int_t algokind, /* Integer */ ae_vector* subset, ae_int_t subsetsize, smlptrnsession* session, ae_state *_state); static ae_bool mlptrain_mlpcontinuetrainingx(mlptrainer* s, /* Integer */ ae_vector* subset, ae_int_t subsetsize, ae_int_t* ngradbatch, smlptrnsession* session, ae_state *_state); static void mlptrain_mlpebagginginternal(mlpensemble* ensemble, /* Real */ ae_matrix* xy, ae_int_t npoints, double decay, ae_int_t restarts, double wstep, ae_int_t maxits, ae_bool lmalgorithm, ae_int_t* info, mlpreport* rep, mlpcvreport* ooberrors, ae_state *_state); static void mlptrain_initmlptrnsession(multilayerperceptron* networktrained, ae_bool randomizenetwork, mlptrainer* trainer, smlptrnsession* session, ae_state *_state); static void mlptrain_initmlptrnsessions(multilayerperceptron* networktrained, ae_bool randomizenetwork, mlptrainer* trainer, ae_shared_pool* sessions, ae_state *_state); static void mlptrain_initmlpetrnsession(multilayerperceptron* individualnetwork, mlptrainer* trainer, mlpetrnsession* session, ae_state *_state); static void mlptrain_initmlpetrnsessions(multilayerperceptron* individualnetwork, mlptrainer* trainer, ae_shared_pool* sessions, ae_state *_state); static double clustering_parallelcomplexity = 200000; static ae_int_t clustering_kmeansblocksize = 32; static ae_int_t clustering_kmeansparalleldim = 8; static ae_int_t clustering_kmeansparallelk = 8; static void clustering_selectinitialcenters(/* Real */ ae_matrix* xy, ae_int_t npoints, ae_int_t nvars, ae_int_t initalgo, ae_int_t k, /* Real */ ae_matrix* ct, apbuffers* initbuf, ae_shared_pool* updatepool, ae_state *_state); static ae_bool clustering_fixcenters(/* Real */ ae_matrix* xy, ae_int_t npoints, ae_int_t nvars, /* Real */ ae_matrix* ct, ae_int_t k, apbuffers* initbuf, ae_shared_pool* updatepool, ae_state *_state); static void clustering_clusterizerrunahcinternal(clusterizerstate* s, /* Real */ ae_matrix* d, ahcreport* rep, ae_state *_state); static void clustering_evaluatedistancematrixrec(/* Real */ ae_matrix* xy, ae_int_t nfeatures, ae_int_t disttype, /* Real */ ae_matrix* d, ae_int_t i0, ae_int_t i1, ae_int_t j0, ae_int_t j1, ae_state *_state); static ae_int_t dforest_innernodewidth = 3; static ae_int_t dforest_leafnodewidth = 2; static ae_int_t dforest_dfusestrongsplits = 1; static ae_int_t dforest_dfuseevs = 2; static ae_int_t dforest_dffirstversion = 0; static ae_int_t dforest_dfclserror(decisionforest* df, /* Real */ ae_matrix* xy, ae_int_t npoints, ae_state *_state); static void dforest_dfprocessinternal(decisionforest* df, ae_int_t offs, /* Real */ ae_vector* x, /* Real */ ae_vector* y, ae_state *_state); static void dforest_dfbuildtree(/* Real */ ae_matrix* xy, ae_int_t npoints, ae_int_t nvars, ae_int_t nclasses, ae_int_t nfeatures, ae_int_t nvarsinpool, ae_int_t flags, dfinternalbuffers* bufs, hqrndstate* rs, ae_state *_state); static void dforest_dfbuildtreerec(/* Real */ 
ae_matrix* xy, ae_int_t npoints, ae_int_t nvars, ae_int_t nclasses, ae_int_t nfeatures, ae_int_t nvarsinpool, ae_int_t flags, ae_int_t* numprocessed, ae_int_t idx1, ae_int_t idx2, dfinternalbuffers* bufs, hqrndstate* rs, ae_state *_state); static void dforest_dfsplitc(/* Real */ ae_vector* x, /* Integer */ ae_vector* c, /* Integer */ ae_vector* cntbuf, ae_int_t n, ae_int_t nc, ae_int_t flags, ae_int_t* info, double* threshold, double* e, /* Real */ ae_vector* sortrbuf, /* Integer */ ae_vector* sortibuf, ae_state *_state); static void dforest_dfsplitr(/* Real */ ae_vector* x, /* Real */ ae_vector* y, ae_int_t n, ae_int_t flags, ae_int_t* info, double* threshold, double* e, /* Real */ ae_vector* sortrbuf, /* Real */ ae_vector* sortrbuf2, ae_state *_state); /************************************************************************* Principal components analysis This function builds orthogonal basis where first axis corresponds to direction with maximum variance, second axis maximizes variance in the subspace orthogonal to first axis and so on. This function builds FULL basis, i.e. returns N vectors corresponding to ALL directions, no matter how informative. If you need just a few (say, 10 or 50) of the most important directions, you may find it faster to use one of the reduced versions: * pcatruncatedsubspace() - for subspace iteration based method It should be noted that, unlike LDA, PCA does not use class labels. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * multithreading support ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! ! Multithreading typically gives sublinear (wrt to cores count) speedup, ! because only some parts of the algorithm can be parallelized. ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. Best results are achieved for high-dimensional problems ! (NVars is at least 256). ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS: X - dataset, array[0..NPoints-1,0..NVars-1]. matrix contains ONLY INDEPENDENT VARIABLES. NPoints - dataset size, NPoints>=0 NVars - number of independent variables, NVars>=1 OUTPUT PARAMETERS: Info - return code: * -4, if SVD subroutine haven't converged * -1, if wrong parameters has been passed (NPoints<0, NVars<1) * 1, if task is solved S2 - array[0..NVars-1]. variance values corresponding to basis vectors. V - array[0..NVars-1,0..NVars-1] matrix, whose columns store basis vectors. 
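NOTE: a minimal usage sketch of the C++ 'friendly' interface is given below.
      It is illustrative only and assumes the alglib::pcabuildbasis wrapper
      declared in dataanalysis.h (with a signature mirroring this function):

          alglib::real_2d_array x = "[[1,1],[2,2.1],[3,2.9],[4,4.2]]";
          alglib::ae_int_t info;
          alglib::real_1d_array s2;
          alglib::real_2d_array v;
          alglib::pcabuildbasis(x, 4, 2, info, s2, v);
          // on success info=1, s2[i] holds the variance along the i-th
          // basis vector, and the columns of v store the basis vectors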
-- ALGLIB -- Copyright 25.08.2008 by Bochkanov Sergey *************************************************************************/ void pcabuildbasis(/* Real */ ae_matrix* x, ae_int_t npoints, ae_int_t nvars, ae_int_t* info, /* Real */ ae_vector* s2, /* Real */ ae_matrix* v, ae_state *_state) { ae_frame _frame_block; ae_matrix a; ae_matrix u; ae_matrix vt; ae_vector m; ae_vector t; ae_int_t i; ae_int_t j; double mean; double variance; double skewness; double kurtosis; ae_frame_make(_state, &_frame_block); *info = 0; ae_vector_clear(s2); ae_matrix_clear(v); ae_matrix_init(&a, 0, 0, DT_REAL, _state); ae_matrix_init(&u, 0, 0, DT_REAL, _state); ae_matrix_init(&vt, 0, 0, DT_REAL, _state); ae_vector_init(&m, 0, DT_REAL, _state); ae_vector_init(&t, 0, DT_REAL, _state); /* * Check input data */ if( npoints<0||nvars<1 ) { *info = -1; ae_frame_leave(_state); return; } *info = 1; /* * Special case: NPoints=0 */ if( npoints==0 ) { ae_vector_set_length(s2, nvars, _state); ae_matrix_set_length(v, nvars, nvars, _state); for(i=0; i<=nvars-1; i++) { s2->ptr.p_double[i] = (double)(0); } for(i=0; i<=nvars-1; i++) { for(j=0; j<=nvars-1; j++) { if( i==j ) { v->ptr.pp_double[i][j] = (double)(1); } else { v->ptr.pp_double[i][j] = (double)(0); } } } ae_frame_leave(_state); return; } /* * Calculate means */ ae_vector_set_length(&m, nvars, _state); ae_vector_set_length(&t, npoints, _state); for(j=0; j<=nvars-1; j++) { ae_v_move(&t.ptr.p_double[0], 1, &x->ptr.pp_double[0][j], x->stride, ae_v_len(0,npoints-1)); samplemoments(&t, npoints, &mean, &variance, &skewness, &kurtosis, _state); m.ptr.p_double[j] = mean; } /* * Center, apply SVD, prepare output */ ae_matrix_set_length(&a, ae_maxint(npoints, nvars, _state), nvars, _state); for(i=0; i<=npoints-1; i++) { ae_v_move(&a.ptr.pp_double[i][0], 1, &x->ptr.pp_double[i][0], 1, ae_v_len(0,nvars-1)); ae_v_sub(&a.ptr.pp_double[i][0], 1, &m.ptr.p_double[0], 1, ae_v_len(0,nvars-1)); } for(i=npoints; i<=nvars-1; i++) { for(j=0; j<=nvars-1; j++) { a.ptr.pp_double[i][j] = (double)(0); } } if( !rmatrixsvd(&a, ae_maxint(npoints, nvars, _state), nvars, 0, 1, 2, s2, &u, &vt, _state) ) { *info = -4; ae_frame_leave(_state); return; } if( npoints!=1 ) { for(i=0; i<=nvars-1; i++) { s2->ptr.p_double[i] = ae_sqr(s2->ptr.p_double[i], _state)/(npoints-1); } } ae_matrix_set_length(v, nvars, nvars, _state); copyandtranspose(&vt, 0, nvars-1, 0, nvars-1, v, 0, nvars-1, 0, nvars-1, _state); ae_frame_leave(_state); } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. *************************************************************************/ void _pexec_pcabuildbasis(/* Real */ ae_matrix* x, ae_int_t npoints, ae_int_t nvars, ae_int_t* info, /* Real */ ae_vector* s2, /* Real */ ae_matrix* v, ae_state *_state) { pcabuildbasis(x,npoints,nvars,info,s2,v, _state); } /************************************************************************* Principal components analysis This function performs truncated PCA, i.e. returns just a few most important directions. Internally it uses iterative eigensolver which is very efficient when only a minor fraction of full basis is required. Thus, if you need full basis, it is better to use pcabuildbasis() function. It should be noted that, unlike LDA, PCA does not use class labels. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * multithreading support ! 
* Intel MKL support (lightweight Intel MKL is shipped with ALGLIB)
!
! For a situation when you need just a few eigenvectors (~1-10),
! multithreading typically gives sublinear (wrt to cores count) speedup.
! For larger problems it may give you nearly linear increase in
! performance.
!
! Intel MKL gives approximately constant (with respect to number of
! worker threads) acceleration factor which depends on CPU being used,
! problem size and "baseline" ALGLIB edition which is used for
! comparison. Best results are achieved for high-dimensional problems
! (NVars is at least 256).
!
! We recommend you to read 'Working with commercial version' section of
! ALGLIB Reference Manual in order to find out how to use performance-
! related features provided by commercial edition of ALGLIB.

INPUT PARAMETERS:
    X           -   dataset, array[0..NPoints-1,0..NVars-1].
                    matrix contains ONLY INDEPENDENT VARIABLES.
    NPoints     -   dataset size, NPoints>=0
    NVars       -   number of independent variables, NVars>=1
    NNeeded     -   number of requested directions, in [1,NVars] range;
                    this function is efficient only for NNeeded<<NVars.
    Eps         -   stopping criterion for the underlying subspace
                    iteration solver, Eps>=0.
    MaxIts      -   upper limit on the iteration count of the underlying
                    solver, MaxIts>=0 (zero means no limit).

OUTPUT PARAMETERS:
    S2          -   array[NNeeded]. Variance values corresponding to
                    basis vectors.
    V           -   array[NVars,NNeeded] matrix, whose columns store
                    basis vectors.

  -- ALGLIB --
     Copyright 10.01.2017 by Bochkanov Sergey
*************************************************************************/
void pcatruncatedsubspace(/* Real    */ ae_matrix* x,
     ae_int_t npoints,
     ae_int_t nvars,
     ae_int_t nneeded,
     double eps,
     ae_int_t maxits,
     /* Real    */ ae_vector* s2,
     /* Real    */ ae_matrix* v,
     ae_state *_state)
{
    ae_frame _frame_block;
    ae_matrix a;
    ae_matrix b;
    ae_vector means;
    double vv;
    ae_int_t i;
    ae_int_t j;
    ae_int_t k;
    ae_int_t nwork;
    hqrndstate rs;
    eigsubspacestate solver;
    eigsubspacereport rep;

    ae_frame_make(_state, &_frame_block);
    ae_vector_clear(s2);
    ae_matrix_clear(v);
    ae_matrix_init(&a, 0, 0, DT_REAL, _state);
    ae_matrix_init(&b, 0, 0, DT_REAL, _state);
    ae_vector_init(&means, 0, DT_REAL, _state);
    _hqrndstate_init(&rs, _state);
    _eigsubspacestate_init(&solver, _state);
    _eigsubspacereport_init(&rep, _state);

    ae_assert(npoints>=0, "PCATruncatedSubspace: npoints<0", _state);
    ae_assert(nvars>=1, "PCATruncatedSubspace: nvars<1", _state);
    ae_assert(nneeded>0, "PCATruncatedSubspace: nneeded<1", _state);
    ae_assert(nneeded<=nvars, "PCATruncatedSubspace: nneeded>nvars", _state);
    ae_assert(maxits>=0, "PCATruncatedSubspace: maxits<0", _state);
    ae_assert(ae_isfinite(eps, _state)&&ae_fp_greater_eq(eps,(double)(0)), "PCATruncatedSubspace: eps<0 or is not finite", _state);

    /*
     * Initialize parameters
     */
    nwork = ae_maxint(2*nneeded, nneeded+8, _state);
    nwork = ae_minint(nwork, nvars, _state);
    hqrndseed(3463, 9854, &rs, _state);

    /*
     * Special case: NPoints=0
     */
    if( npoints==0 )
    {
        ae_vector_set_length(s2, nneeded, _state);
        ae_matrix_set_length(v, nvars, nneeded, _state);
        for(i=0; i<=nneeded-1; i++)
        {
            s2->ptr.p_double[i] = (double)(0);
        }
        for(i=0; i<=nvars-1; i++)
        {
            for(j=0; j<=nneeded-1; j++)
            {
                if( i==j )
                {
                    v->ptr.pp_double[i][j] = (double)(1);
                }
                else
                {
                    v->ptr.pp_double[i][j] = (double)(0);
                }
            }
        }
        ae_frame_leave(_state);
        return;
    }

    /*
     * Center matrix
     */
    ae_vector_set_length(&means, nvars, _state);
    for(i=0; i<=nvars-1; i++)
    {
        means.ptr.p_double[i] = (double)(0);
    }
    vv = (double)1/(double)npoints;
    for(i=0; i<=npoints-1; i++)
    {
        ae_v_addd(&means.ptr.p_double[0], 1, &x->ptr.pp_double[i][0], 1, ae_v_len(0,nvars-1), vv);
    }
    ae_matrix_set_length(&a, npoints, nvars, _state);
    for(i=0; i<=npoints-1; i++)
    {
        ae_v_move(&a.ptr.pp_double[i][0], 1, &x->ptr.pp_double[i][0], 1, ae_v_len(0,nvars-1));
        ae_v_sub(&a.ptr.pp_double[i][0], 1, &means.ptr.p_double[0], 1, ae_v_len(0,nvars-1));
    }

    /*
     * Find eigenvalues with subspace iteration solver
     */
    eigsubspacecreate(nvars, nneeded, &solver, _state);
    eigsubspacesetcond(&solver, eps, maxits, _state);
    eigsubspaceoocstart(&solver, 0, _state);
    while(eigsubspaceooccontinue(&solver, _state))
    {
        ae_assert(solver.requesttype==0, "PCATruncatedSubspace: integrity check failed", _state);
        k = solver.requestsize;
        rmatrixsetlengthatleast(&b, npoints, k, _state);
        rmatrixgemm(npoints, k, nvars, 1.0, &a, 0, 0, 0, &solver.x, 0, 0, 0, 0.0, &b, 0, 0, _state);
        rmatrixgemm(nvars, k, npoints, 1.0, &a, 0, 0, 1, &b, 0, 0, 0, 0.0, &solver.ax, 0, 0, _state);
    }
    eigsubspaceoocstop(&solver, s2, v, &rep, _state);
    if( npoints!=1 )
    {
        for(i=0; i<=nneeded-1; i++)
        {
            s2->ptr.p_double[i] = s2->ptr.p_double[i]/(npoints-1);
        }
    }
    ae_frame_leave(_state);
}


/*************************************************************************
Single-threaded stub.
HPC ALGLIB replaces it by multithreaded code. *************************************************************************/ void _pexec_pcatruncatedsubspace(/* Real */ ae_matrix* x, ae_int_t npoints, ae_int_t nvars, ae_int_t nneeded, double eps, ae_int_t maxits, /* Real */ ae_vector* s2, /* Real */ ae_matrix* v, ae_state *_state) { pcatruncatedsubspace(x,npoints,nvars,nneeded,eps,maxits,s2,v, _state); } /************************************************************************* This set of routines (DSErrAllocate, DSErrAccumulate, DSErrFinish) calculates different error functions (classification error, cross-entropy, rms, avg, avg.rel errors). 1. DSErrAllocate prepares buffer. 2. DSErrAccumulate accumulates individual errors: * Y contains predicted output (posterior probabilities for classification) * DesiredY contains desired output (class number for classification) 3. DSErrFinish outputs results: * Buf[0] contains relative classification error (zero for regression tasks) * Buf[1] contains avg. cross-entropy (zero for regression tasks) * Buf[2] contains rms error (regression, classification) * Buf[3] contains average error (regression, classification) * Buf[4] contains average relative error (regression, classification) NOTES(1): "NClasses>0" means that we have classification task. "NClasses<0" means regression task with -NClasses real outputs. NOTES(2): rms. avg, avg.rel errors for classification tasks are interpreted as errors in posterior probabilities with respect to probabilities given by training/test set. -- ALGLIB -- Copyright 11.01.2009 by Bochkanov Sergey *************************************************************************/ void dserrallocate(ae_int_t nclasses, /* Real */ ae_vector* buf, ae_state *_state) { ae_vector_clear(buf); ae_vector_set_length(buf, 7+1, _state); buf->ptr.p_double[0] = (double)(0); buf->ptr.p_double[1] = (double)(0); buf->ptr.p_double[2] = (double)(0); buf->ptr.p_double[3] = (double)(0); buf->ptr.p_double[4] = (double)(0); buf->ptr.p_double[5] = (double)(nclasses); buf->ptr.p_double[6] = (double)(0); buf->ptr.p_double[7] = (double)(0); } /************************************************************************* See DSErrAllocate for comments on this routine. 
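NOTE: the three routines are intended to be called in the following order
      (an illustrative sketch only; it assumes an already initialized
      ae_state *_state, an ae_vector buf, and DT_REAL ae_vectors y and
      desiredy sized to the number of outputs, which the caller fills for
      every sample):

          dserrallocate(nclasses, &buf, _state);        // prepare Buf[]
          for(i=0; i<npoints; i++)
          {
              // fill y[] with predicted outputs, desiredy[] with targets
              dserraccumulate(&buf, &y, &desiredy, _state);
          }
          dserrfinish(&buf, _state);                    // Buf[0..4] now hold the errors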
-- ALGLIB -- Copyright 11.01.2009 by Bochkanov Sergey *************************************************************************/ void dserraccumulate(/* Real */ ae_vector* buf, /* Real */ ae_vector* y, /* Real */ ae_vector* desiredy, ae_state *_state) { ae_int_t nclasses; ae_int_t nout; ae_int_t offs; ae_int_t mmax; ae_int_t rmax; ae_int_t j; double v; double ev; offs = 5; nclasses = ae_round(buf->ptr.p_double[offs], _state); if( nclasses>0 ) { /* * Classification */ rmax = ae_round(desiredy->ptr.p_double[0], _state); mmax = 0; for(j=1; j<=nclasses-1; j++) { if( ae_fp_greater(y->ptr.p_double[j],y->ptr.p_double[mmax]) ) { mmax = j; } } if( mmax!=rmax ) { buf->ptr.p_double[0] = buf->ptr.p_double[0]+1; } if( ae_fp_greater(y->ptr.p_double[rmax],(double)(0)) ) { buf->ptr.p_double[1] = buf->ptr.p_double[1]-ae_log(y->ptr.p_double[rmax], _state); } else { buf->ptr.p_double[1] = buf->ptr.p_double[1]+ae_log(ae_maxrealnumber, _state); } for(j=0; j<=nclasses-1; j++) { v = y->ptr.p_double[j]; if( j==rmax ) { ev = (double)(1); } else { ev = (double)(0); } buf->ptr.p_double[2] = buf->ptr.p_double[2]+ae_sqr(v-ev, _state); buf->ptr.p_double[3] = buf->ptr.p_double[3]+ae_fabs(v-ev, _state); if( ae_fp_neq(ev,(double)(0)) ) { buf->ptr.p_double[4] = buf->ptr.p_double[4]+ae_fabs((v-ev)/ev, _state); buf->ptr.p_double[offs+2] = buf->ptr.p_double[offs+2]+1; } } buf->ptr.p_double[offs+1] = buf->ptr.p_double[offs+1]+1; } else { /* * Regression */ nout = -nclasses; rmax = 0; for(j=1; j<=nout-1; j++) { if( ae_fp_greater(desiredy->ptr.p_double[j],desiredy->ptr.p_double[rmax]) ) { rmax = j; } } mmax = 0; for(j=1; j<=nout-1; j++) { if( ae_fp_greater(y->ptr.p_double[j],y->ptr.p_double[mmax]) ) { mmax = j; } } if( mmax!=rmax ) { buf->ptr.p_double[0] = buf->ptr.p_double[0]+1; } for(j=0; j<=nout-1; j++) { v = y->ptr.p_double[j]; ev = desiredy->ptr.p_double[j]; buf->ptr.p_double[2] = buf->ptr.p_double[2]+ae_sqr(v-ev, _state); buf->ptr.p_double[3] = buf->ptr.p_double[3]+ae_fabs(v-ev, _state); if( ae_fp_neq(ev,(double)(0)) ) { buf->ptr.p_double[4] = buf->ptr.p_double[4]+ae_fabs((v-ev)/ev, _state); buf->ptr.p_double[offs+2] = buf->ptr.p_double[offs+2]+1; } } buf->ptr.p_double[offs+1] = buf->ptr.p_double[offs+1]+1; } } /************************************************************************* See DSErrAllocate for comments on this routine. 
-- ALGLIB -- Copyright 11.01.2009 by Bochkanov Sergey *************************************************************************/ void dserrfinish(/* Real */ ae_vector* buf, ae_state *_state) { ae_int_t nout; ae_int_t offs; offs = 5; nout = ae_iabs(ae_round(buf->ptr.p_double[offs], _state), _state); if( ae_fp_neq(buf->ptr.p_double[offs+1],(double)(0)) ) { buf->ptr.p_double[0] = buf->ptr.p_double[0]/buf->ptr.p_double[offs+1]; buf->ptr.p_double[1] = buf->ptr.p_double[1]/buf->ptr.p_double[offs+1]; buf->ptr.p_double[2] = ae_sqrt(buf->ptr.p_double[2]/(nout*buf->ptr.p_double[offs+1]), _state); buf->ptr.p_double[3] = buf->ptr.p_double[3]/(nout*buf->ptr.p_double[offs+1]); } if( ae_fp_neq(buf->ptr.p_double[offs+2],(double)(0)) ) { buf->ptr.p_double[4] = buf->ptr.p_double[4]/buf->ptr.p_double[offs+2]; } } /************************************************************************* -- ALGLIB -- Copyright 19.05.2008 by Bochkanov Sergey *************************************************************************/ void dsnormalize(/* Real */ ae_matrix* xy, ae_int_t npoints, ae_int_t nvars, ae_int_t* info, /* Real */ ae_vector* means, /* Real */ ae_vector* sigmas, ae_state *_state) { ae_frame _frame_block; ae_int_t i; ae_int_t j; ae_vector tmp; double mean; double variance; double skewness; double kurtosis; ae_frame_make(_state, &_frame_block); *info = 0; ae_vector_clear(means); ae_vector_clear(sigmas); ae_vector_init(&tmp, 0, DT_REAL, _state); /* * Test parameters */ if( npoints<=0||nvars<1 ) { *info = -1; ae_frame_leave(_state); return; } *info = 1; /* * Standartization */ ae_vector_set_length(means, nvars-1+1, _state); ae_vector_set_length(sigmas, nvars-1+1, _state); ae_vector_set_length(&tmp, npoints-1+1, _state); for(j=0; j<=nvars-1; j++) { ae_v_move(&tmp.ptr.p_double[0], 1, &xy->ptr.pp_double[0][j], xy->stride, ae_v_len(0,npoints-1)); samplemoments(&tmp, npoints, &mean, &variance, &skewness, &kurtosis, _state); means->ptr.p_double[j] = mean; sigmas->ptr.p_double[j] = ae_sqrt(variance, _state); if( ae_fp_eq(sigmas->ptr.p_double[j],(double)(0)) ) { sigmas->ptr.p_double[j] = (double)(1); } for(i=0; i<=npoints-1; i++) { xy->ptr.pp_double[i][j] = (xy->ptr.pp_double[i][j]-means->ptr.p_double[j])/sigmas->ptr.p_double[j]; } } ae_frame_leave(_state); } /************************************************************************* -- ALGLIB -- Copyright 19.05.2008 by Bochkanov Sergey *************************************************************************/ void dsnormalizec(/* Real */ ae_matrix* xy, ae_int_t npoints, ae_int_t nvars, ae_int_t* info, /* Real */ ae_vector* means, /* Real */ ae_vector* sigmas, ae_state *_state) { ae_frame _frame_block; ae_int_t j; ae_vector tmp; double mean; double variance; double skewness; double kurtosis; ae_frame_make(_state, &_frame_block); *info = 0; ae_vector_clear(means); ae_vector_clear(sigmas); ae_vector_init(&tmp, 0, DT_REAL, _state); /* * Test parameters */ if( npoints<=0||nvars<1 ) { *info = -1; ae_frame_leave(_state); return; } *info = 1; /* * Standartization */ ae_vector_set_length(means, nvars-1+1, _state); ae_vector_set_length(sigmas, nvars-1+1, _state); ae_vector_set_length(&tmp, npoints-1+1, _state); for(j=0; j<=nvars-1; j++) { ae_v_move(&tmp.ptr.p_double[0], 1, &xy->ptr.pp_double[0][j], xy->stride, ae_v_len(0,npoints-1)); samplemoments(&tmp, npoints, &mean, &variance, &skewness, &kurtosis, _state); means->ptr.p_double[j] = mean; sigmas->ptr.p_double[j] = ae_sqrt(variance, _state); if( ae_fp_eq(sigmas->ptr.p_double[j],(double)(0)) ) { sigmas->ptr.p_double[j] = 
(double)(1); } } ae_frame_leave(_state); } /************************************************************************* -- ALGLIB -- Copyright 19.05.2008 by Bochkanov Sergey *************************************************************************/ double dsgetmeanmindistance(/* Real */ ae_matrix* xy, ae_int_t npoints, ae_int_t nvars, ae_state *_state) { ae_frame _frame_block; ae_int_t i; ae_int_t j; ae_vector tmp; ae_vector tmp2; double v; double result; ae_frame_make(_state, &_frame_block); ae_vector_init(&tmp, 0, DT_REAL, _state); ae_vector_init(&tmp2, 0, DT_REAL, _state); /* * Test parameters */ if( npoints<=0||nvars<1 ) { result = (double)(0); ae_frame_leave(_state); return result; } /* * Process */ ae_vector_set_length(&tmp, npoints-1+1, _state); for(i=0; i<=npoints-1; i++) { tmp.ptr.p_double[i] = ae_maxrealnumber; } ae_vector_set_length(&tmp2, nvars-1+1, _state); for(i=0; i<=npoints-1; i++) { for(j=i+1; j<=npoints-1; j++) { ae_v_move(&tmp2.ptr.p_double[0], 1, &xy->ptr.pp_double[i][0], 1, ae_v_len(0,nvars-1)); ae_v_sub(&tmp2.ptr.p_double[0], 1, &xy->ptr.pp_double[j][0], 1, ae_v_len(0,nvars-1)); v = ae_v_dotproduct(&tmp2.ptr.p_double[0], 1, &tmp2.ptr.p_double[0], 1, ae_v_len(0,nvars-1)); v = ae_sqrt(v, _state); tmp.ptr.p_double[i] = ae_minreal(tmp.ptr.p_double[i], v, _state); tmp.ptr.p_double[j] = ae_minreal(tmp.ptr.p_double[j], v, _state); } } result = (double)(0); for(i=0; i<=npoints-1; i++) { result = result+tmp.ptr.p_double[i]/npoints; } ae_frame_leave(_state); return result; } /************************************************************************* -- ALGLIB -- Copyright 19.05.2008 by Bochkanov Sergey *************************************************************************/ void dstie(/* Real */ ae_vector* a, ae_int_t n, /* Integer */ ae_vector* ties, ae_int_t* tiecount, /* Integer */ ae_vector* p1, /* Integer */ ae_vector* p2, ae_state *_state) { ae_frame _frame_block; ae_int_t i; ae_int_t k; ae_vector tmp; ae_frame_make(_state, &_frame_block); ae_vector_clear(ties); *tiecount = 0; ae_vector_clear(p1); ae_vector_clear(p2); ae_vector_init(&tmp, 0, DT_INT, _state); /* * Special case */ if( n<=0 ) { *tiecount = 0; ae_frame_leave(_state); return; } /* * Sort A */ tagsort(a, n, p1, p2, _state); /* * Process ties */ *tiecount = 1; for(i=1; i<=n-1; i++) { if( ae_fp_neq(a->ptr.p_double[i],a->ptr.p_double[i-1]) ) { *tiecount = *tiecount+1; } } ae_vector_set_length(ties, *tiecount+1, _state); ties->ptr.p_int[0] = 0; k = 1; for(i=1; i<=n-1; i++) { if( ae_fp_neq(a->ptr.p_double[i],a->ptr.p_double[i-1]) ) { ties->ptr.p_int[k] = i; k = k+1; } } ties->ptr.p_int[*tiecount] = n; ae_frame_leave(_state); } /************************************************************************* -- ALGLIB -- Copyright 11.12.2008 by Bochkanov Sergey *************************************************************************/ void dstiefasti(/* Real */ ae_vector* a, /* Integer */ ae_vector* b, ae_int_t n, /* Integer */ ae_vector* ties, ae_int_t* tiecount, /* Real */ ae_vector* bufr, /* Integer */ ae_vector* bufi, ae_state *_state) { ae_frame _frame_block; ae_int_t i; ae_int_t k; ae_vector tmp; ae_frame_make(_state, &_frame_block); *tiecount = 0; ae_vector_init(&tmp, 0, DT_INT, _state); /* * Special case */ if( n<=0 ) { *tiecount = 0; ae_frame_leave(_state); return; } /* * Sort A */ tagsortfasti(a, b, bufr, bufi, n, _state); /* * Process ties */ ties->ptr.p_int[0] = 0; k = 1; for(i=1; i<=n-1; i++) { if( ae_fp_neq(a->ptr.p_double[i],a->ptr.p_double[i-1]) ) { ties->ptr.p_int[k] = i; k = k+1; } } ties->ptr.p_int[k] = n; 
*tiecount = k;
    ae_frame_leave(_state);
}


/*************************************************************************
Optimal binary classification

Algorithm finds optimal (=with minimal cross-entropy) binary partition.
Internal subroutine.

INPUT PARAMETERS:
    A       -   array[0..N-1], variable
    C       -   array[0..N-1], class numbers (0 or 1).
    N       -   array size

OUTPUT PARAMETERS:
    Info    -   completion code:
                * -3, all values of A[] are same (partition is impossible)
                * -2, one of C[] is incorrect (<0, >1)
                * -1, incorrect parameters were passed (N<=0).
                *  1, OK
    Threshold-  partition boundary. Left part contains values which are
                strictly less than Threshold. Right part contains values
                which are greater than or equal to Threshold.
    PAL, PBL-   probabilities P(0|v<Threshold) and P(1|v<Threshold)
    PAR, PBR-   probabilities P(0|v>=Threshold) and P(1|v>=Threshold)
    CVE     -   cross-validation estimate of cross-entropy

  -- ALGLIB --
     Copyright 22.05.2008 by Bochkanov Sergey
*************************************************************************/
void dsoptimalsplit2(/* Real    */ ae_vector* a,
     /* Integer */ ae_vector* c,
     ae_int_t n,
     ae_int_t* info,
     double* threshold,
     double* pal,
     double* pbl,
     double* par,
     double* pbr,
     double* cve,
     ae_state *_state)
{
    ae_frame _frame_block;
    ae_vector _a;
    ae_vector _c;
    ae_int_t i;
    ae_int_t t;
    double s;
    ae_vector ties;
    ae_int_t tiecount;
    ae_vector p1;
    ae_vector p2;
    ae_int_t k;
    ae_int_t koptimal;
    double pak;
    double pbk;
    double cvoptimal;
    double cv;

    ae_frame_make(_state, &_frame_block);
    ae_vector_init_copy(&_a, a, _state);
    a = &_a;
    ae_vector_init_copy(&_c, c, _state);
    c = &_c;
    *info = 0;
    *threshold = 0;
    *pal = 0;
    *pbl = 0;
    *par = 0;
    *pbr = 0;
    *cve = 0;
    ae_vector_init(&ties, 0, DT_INT, _state);
    ae_vector_init(&p1, 0, DT_INT, _state);
    ae_vector_init(&p2, 0, DT_INT, _state);

    /*
     * Test for errors in inputs
     */
    if( n<=0 )
    {
        *info = -1;
        ae_frame_leave(_state);
        return;
    }
    for(i=0; i<=n-1; i++)
    {
        if( c->ptr.p_int[i]!=0&&c->ptr.p_int[i]!=1 )
        {
            *info = -2;
            ae_frame_leave(_state);
            return;
        }
    }
    *info = 1;

    /*
     * Tie
     */
    dstie(a, n, &ties, &tiecount, &p1, &p2, _state);
    for(i=0; i<=n-1; i++)
    {
        if( p2.ptr.p_int[i]!=i )
        {
            t = c->ptr.p_int[i];
            c->ptr.p_int[i] = c->ptr.p_int[p2.ptr.p_int[i]];
            c->ptr.p_int[p2.ptr.p_int[i]] = t;
        }
    }

    /*
     * Special case: number of ties is 1.
     *
     * NOTE: we assume that P[i,j] equals to 0 or 1,
     *       intermediate values are not allowed.
     */
    if( tiecount==1 )
    {
        *info = -3;
        ae_frame_leave(_state);
        return;
    }

    /*
     * General case, number of ties > 1
     *
     * NOTE: we assume that P[i,j] equals to 0 or 1,
     *       intermediate values are not allowed.
*/ *pal = (double)(0); *pbl = (double)(0); *par = (double)(0); *pbr = (double)(0); for(i=0; i<=n-1; i++) { if( c->ptr.p_int[i]==0 ) { *par = *par+1; } if( c->ptr.p_int[i]==1 ) { *pbr = *pbr+1; } } koptimal = -1; cvoptimal = ae_maxrealnumber; for(k=0; k<=tiecount-2; k++) { /* * first, obtain information about K-th tie which is * moved from R-part to L-part */ pak = (double)(0); pbk = (double)(0); for(i=ties.ptr.p_int[k]; i<=ties.ptr.p_int[k+1]-1; i++) { if( c->ptr.p_int[i]==0 ) { pak = pak+1; } if( c->ptr.p_int[i]==1 ) { pbk = pbk+1; } } /* * Calculate cross-validation CE */ cv = (double)(0); cv = cv-bdss_xlny(*pal+pak, (*pal+pak)/(*pal+pak+(*pbl)+pbk+1), _state); cv = cv-bdss_xlny(*pbl+pbk, (*pbl+pbk)/(*pal+pak+1+(*pbl)+pbk), _state); cv = cv-bdss_xlny(*par-pak, (*par-pak)/(*par-pak+(*pbr)-pbk+1), _state); cv = cv-bdss_xlny(*pbr-pbk, (*pbr-pbk)/(*par-pak+1+(*pbr)-pbk), _state); /* * Compare with best */ if( ae_fp_less(cv,cvoptimal) ) { cvoptimal = cv; koptimal = k; } /* * update */ *pal = *pal+pak; *pbl = *pbl+pbk; *par = *par-pak; *pbr = *pbr-pbk; } *cve = cvoptimal; *threshold = 0.5*(a->ptr.p_double[ties.ptr.p_int[koptimal]]+a->ptr.p_double[ties.ptr.p_int[koptimal+1]]); *pal = (double)(0); *pbl = (double)(0); *par = (double)(0); *pbr = (double)(0); for(i=0; i<=n-1; i++) { if( ae_fp_less(a->ptr.p_double[i],*threshold) ) { if( c->ptr.p_int[i]==0 ) { *pal = *pal+1; } else { *pbl = *pbl+1; } } else { if( c->ptr.p_int[i]==0 ) { *par = *par+1; } else { *pbr = *pbr+1; } } } s = *pal+(*pbl); *pal = *pal/s; *pbl = *pbl/s; s = *par+(*pbr); *par = *par/s; *pbr = *pbr/s; ae_frame_leave(_state); } /************************************************************************* Optimal partition, internal subroutine. Fast version. Accepts: A array[0..N-1] array of attributes array[0..N-1] C array[0..N-1] array of class labels TiesBuf array[0..N] temporaries (ties) CntBuf array[0..2*NC-1] temporaries (counts) Alpha centering factor (0<=alpha<=1, recommended value - 0.05) BufR array[0..N-1] temporaries BufI array[0..N-1] temporaries Output: Info error code (">0"=OK, "<0"=bad) RMS training set RMS error CVRMS leave-one-out RMS error Note: content of all arrays is changed by subroutine; it doesn't allocate temporaries. -- ALGLIB -- Copyright 11.12.2008 by Bochkanov Sergey *************************************************************************/ void dsoptimalsplit2fast(/* Real */ ae_vector* a, /* Integer */ ae_vector* c, /* Integer */ ae_vector* tiesbuf, /* Integer */ ae_vector* cntbuf, /* Real */ ae_vector* bufr, /* Integer */ ae_vector* bufi, ae_int_t n, ae_int_t nc, double alpha, ae_int_t* info, double* threshold, double* rms, double* cvrms, ae_state *_state) { ae_int_t i; ae_int_t k; ae_int_t cl; ae_int_t tiecount; double cbest; double cc; ae_int_t koptimal; ae_int_t sl; ae_int_t sr; double v; double w; double x; *info = 0; *threshold = 0; *rms = 0; *cvrms = 0; /* * Test for errors in inputs */ if( n<=0||nc<2 ) { *info = -1; return; } for(i=0; i<=n-1; i++) { if( c->ptr.p_int[i]<0||c->ptr.p_int[i]>=nc ) { *info = -2; return; } } *info = 1; /* * Tie */ dstiefasti(a, c, n, tiesbuf, &tiecount, bufr, bufi, _state); /* * Special case: number of ties is 1. 
*/ if( tiecount==1 ) { *info = -3; return; } /* * General case, number of ties > 1 */ for(i=0; i<=2*nc-1; i++) { cntbuf->ptr.p_int[i] = 0; } for(i=0; i<=n-1; i++) { cntbuf->ptr.p_int[nc+c->ptr.p_int[i]] = cntbuf->ptr.p_int[nc+c->ptr.p_int[i]]+1; } koptimal = -1; *threshold = a->ptr.p_double[n-1]; cbest = ae_maxrealnumber; sl = 0; sr = n; for(k=0; k<=tiecount-2; k++) { /* * first, move Kth tie from right to left */ for(i=tiesbuf->ptr.p_int[k]; i<=tiesbuf->ptr.p_int[k+1]-1; i++) { cl = c->ptr.p_int[i]; cntbuf->ptr.p_int[cl] = cntbuf->ptr.p_int[cl]+1; cntbuf->ptr.p_int[nc+cl] = cntbuf->ptr.p_int[nc+cl]-1; } sl = sl+(tiesbuf->ptr.p_int[k+1]-tiesbuf->ptr.p_int[k]); sr = sr-(tiesbuf->ptr.p_int[k+1]-tiesbuf->ptr.p_int[k]); /* * Calculate RMS error */ v = (double)(0); for(i=0; i<=nc-1; i++) { w = (double)(cntbuf->ptr.p_int[i]); v = v+w*ae_sqr(w/sl-1, _state); v = v+(sl-w)*ae_sqr(w/sl, _state); w = (double)(cntbuf->ptr.p_int[nc+i]); v = v+w*ae_sqr(w/sr-1, _state); v = v+(sr-w)*ae_sqr(w/sr, _state); } v = ae_sqrt(v/(nc*n), _state); /* * Compare with best */ x = (double)(2*sl)/(double)(sl+sr)-1; cc = v*(1-alpha+alpha*ae_sqr(x, _state)); if( ae_fp_less(cc,cbest) ) { /* * store split */ *rms = v; koptimal = k; cbest = cc; /* * calculate CVRMS error */ *cvrms = (double)(0); for(i=0; i<=nc-1; i++) { if( sl>1 ) { w = (double)(cntbuf->ptr.p_int[i]); *cvrms = *cvrms+w*ae_sqr((w-1)/(sl-1)-1, _state); *cvrms = *cvrms+(sl-w)*ae_sqr(w/(sl-1), _state); } else { w = (double)(cntbuf->ptr.p_int[i]); *cvrms = *cvrms+w*ae_sqr((double)1/(double)nc-1, _state); *cvrms = *cvrms+(sl-w)*ae_sqr((double)1/(double)nc, _state); } if( sr>1 ) { w = (double)(cntbuf->ptr.p_int[nc+i]); *cvrms = *cvrms+w*ae_sqr((w-1)/(sr-1)-1, _state); *cvrms = *cvrms+(sr-w)*ae_sqr(w/(sr-1), _state); } else { w = (double)(cntbuf->ptr.p_int[nc+i]); *cvrms = *cvrms+w*ae_sqr((double)1/(double)nc-1, _state); *cvrms = *cvrms+(sr-w)*ae_sqr((double)1/(double)nc, _state); } } *cvrms = ae_sqrt(*cvrms/(nc*n), _state); } } /* * Calculate threshold. * Code is a bit complicated because there can be such * numbers that 0.5(A+B) equals to A or B (if A-B=epsilon) */ *threshold = 0.5*(a->ptr.p_double[tiesbuf->ptr.p_int[koptimal]]+a->ptr.p_double[tiesbuf->ptr.p_int[koptimal+1]]); if( ae_fp_less_eq(*threshold,a->ptr.p_double[tiesbuf->ptr.p_int[koptimal]]) ) { *threshold = a->ptr.p_double[tiesbuf->ptr.p_int[koptimal+1]]; } } /************************************************************************* Automatic non-optimal discretization, internal subroutine. 
-- ALGLIB -- Copyright 22.05.2008 by Bochkanov Sergey *************************************************************************/ void dssplitk(/* Real */ ae_vector* a, /* Integer */ ae_vector* c, ae_int_t n, ae_int_t nc, ae_int_t kmax, ae_int_t* info, /* Real */ ae_vector* thresholds, ae_int_t* ni, double* cve, ae_state *_state) { ae_frame _frame_block; ae_vector _a; ae_vector _c; ae_int_t i; ae_int_t j; ae_int_t j1; ae_int_t k; ae_vector ties; ae_int_t tiecount; ae_vector p1; ae_vector p2; ae_vector cnt; double v2; ae_int_t bestk; double bestcve; ae_vector bestsizes; double curcve; ae_vector cursizes; ae_frame_make(_state, &_frame_block); ae_vector_init_copy(&_a, a, _state); a = &_a; ae_vector_init_copy(&_c, c, _state); c = &_c; *info = 0; ae_vector_clear(thresholds); *ni = 0; *cve = 0; ae_vector_init(&ties, 0, DT_INT, _state); ae_vector_init(&p1, 0, DT_INT, _state); ae_vector_init(&p2, 0, DT_INT, _state); ae_vector_init(&cnt, 0, DT_INT, _state); ae_vector_init(&bestsizes, 0, DT_INT, _state); ae_vector_init(&cursizes, 0, DT_INT, _state); /* * Test for errors in inputs */ if( (n<=0||nc<2)||kmax<2 ) { *info = -1; ae_frame_leave(_state); return; } for(i=0; i<=n-1; i++) { if( c->ptr.p_int[i]<0||c->ptr.p_int[i]>=nc ) { *info = -2; ae_frame_leave(_state); return; } } *info = 1; /* * Tie */ dstie(a, n, &ties, &tiecount, &p1, &p2, _state); for(i=0; i<=n-1; i++) { if( p2.ptr.p_int[i]!=i ) { k = c->ptr.p_int[i]; c->ptr.p_int[i] = c->ptr.p_int[p2.ptr.p_int[i]]; c->ptr.p_int[p2.ptr.p_int[i]] = k; } } /* * Special cases */ if( tiecount==1 ) { *info = -3; ae_frame_leave(_state); return; } /* * General case: * 0. allocate arrays */ kmax = ae_minint(kmax, tiecount, _state); ae_vector_set_length(&bestsizes, kmax-1+1, _state); ae_vector_set_length(&cursizes, kmax-1+1, _state); ae_vector_set_length(&cnt, nc-1+1, _state); /* * General case: * 1. prepare "weak" solution (two subintervals, divided at median) */ v2 = ae_maxrealnumber; j = -1; for(i=1; i<=tiecount-1; i++) { if( ae_fp_less(ae_fabs(ties.ptr.p_int[i]-0.5*(n-1), _state),v2) ) { v2 = ae_fabs(ties.ptr.p_int[i]-0.5*n, _state); j = i; } } ae_assert(j>0, "DSSplitK: internal error #1!", _state); bestk = 2; bestsizes.ptr.p_int[0] = ties.ptr.p_int[j]; bestsizes.ptr.p_int[1] = n-j; bestcve = (double)(0); for(i=0; i<=nc-1; i++) { cnt.ptr.p_int[i] = 0; } for(i=0; i<=j-1; i++) { bdss_tieaddc(c, &ties, i, nc, &cnt, _state); } bestcve = bestcve+bdss_getcv(&cnt, nc, _state); for(i=0; i<=nc-1; i++) { cnt.ptr.p_int[i] = 0; } for(i=j; i<=tiecount-1; i++) { bdss_tieaddc(c, &ties, i, nc, &cnt, _state); } bestcve = bestcve+bdss_getcv(&cnt, nc, _state); /* * General case: * 2. Use greedy algorithm to find sub-optimal split in O(KMax*N) time */ for(k=2; k<=kmax; k++) { /* * Prepare greedy K-interval split */ for(i=0; i<=k-1; i++) { cursizes.ptr.p_int[i] = 0; } i = 0; j = 0; while(j<=tiecount-1&&i<=k-1) { /* * Rule: I-th bin is empty, fill it */ if( cursizes.ptr.p_int[i]==0 ) { cursizes.ptr.p_int[i] = ties.ptr.p_int[j+1]-ties.ptr.p_int[j]; j = j+1; continue; } /* * Rule: (K-1-I) bins left, (K-1-I) ties left (1 tie per bin); next bin */ if( tiecount-j==k-1-i ) { i = i+1; continue; } /* * Rule: last bin, always place in current */ if( i==k-1 ) { cursizes.ptr.p_int[i] = cursizes.ptr.p_int[i]+ties.ptr.p_int[j+1]-ties.ptr.p_int[j]; j = j+1; continue; } /* * Place J-th tie in I-th bin, or leave for I+1-th bin. 
*/ if( ae_fp_less(ae_fabs(cursizes.ptr.p_int[i]+ties.ptr.p_int[j+1]-ties.ptr.p_int[j]-(double)n/(double)k, _state),ae_fabs(cursizes.ptr.p_int[i]-(double)n/(double)k, _state)) ) { cursizes.ptr.p_int[i] = cursizes.ptr.p_int[i]+ties.ptr.p_int[j+1]-ties.ptr.p_int[j]; j = j+1; } else { i = i+1; } } ae_assert(cursizes.ptr.p_int[k-1]!=0&&j==tiecount, "DSSplitK: internal error #1", _state); /* * Calculate CVE */ curcve = (double)(0); j = 0; for(i=0; i<=k-1; i++) { for(j1=0; j1<=nc-1; j1++) { cnt.ptr.p_int[j1] = 0; } for(j1=j; j1<=j+cursizes.ptr.p_int[i]-1; j1++) { cnt.ptr.p_int[c->ptr.p_int[j1]] = cnt.ptr.p_int[c->ptr.p_int[j1]]+1; } curcve = curcve+bdss_getcv(&cnt, nc, _state); j = j+cursizes.ptr.p_int[i]; } /* * Choose best variant */ if( ae_fp_less(curcve,bestcve) ) { for(i=0; i<=k-1; i++) { bestsizes.ptr.p_int[i] = cursizes.ptr.p_int[i]; } bestcve = curcve; bestk = k; } } /* * Transform from sizes to thresholds */ *cve = bestcve; *ni = bestk; ae_vector_set_length(thresholds, *ni-2+1, _state); j = bestsizes.ptr.p_int[0]; for(i=1; i<=bestk-1; i++) { thresholds->ptr.p_double[i-1] = 0.5*(a->ptr.p_double[j-1]+a->ptr.p_double[j]); j = j+bestsizes.ptr.p_int[i]; } ae_frame_leave(_state); } /************************************************************************* Automatic optimal discretization, internal subroutine. -- ALGLIB -- Copyright 22.05.2008 by Bochkanov Sergey *************************************************************************/ void dsoptimalsplitk(/* Real */ ae_vector* a, /* Integer */ ae_vector* c, ae_int_t n, ae_int_t nc, ae_int_t kmax, ae_int_t* info, /* Real */ ae_vector* thresholds, ae_int_t* ni, double* cve, ae_state *_state) { ae_frame _frame_block; ae_vector _a; ae_vector _c; ae_int_t i; ae_int_t j; ae_int_t s; ae_int_t jl; ae_int_t jr; double v2; ae_vector ties; ae_int_t tiecount; ae_vector p1; ae_vector p2; double cvtemp; ae_vector cnt; ae_vector cnt2; ae_matrix cv; ae_matrix splits; ae_int_t k; ae_int_t koptimal; double cvoptimal; ae_frame_make(_state, &_frame_block); ae_vector_init_copy(&_a, a, _state); a = &_a; ae_vector_init_copy(&_c, c, _state); c = &_c; *info = 0; ae_vector_clear(thresholds); *ni = 0; *cve = 0; ae_vector_init(&ties, 0, DT_INT, _state); ae_vector_init(&p1, 0, DT_INT, _state); ae_vector_init(&p2, 0, DT_INT, _state); ae_vector_init(&cnt, 0, DT_INT, _state); ae_vector_init(&cnt2, 0, DT_INT, _state); ae_matrix_init(&cv, 0, 0, DT_REAL, _state); ae_matrix_init(&splits, 0, 0, DT_INT, _state); /* * Test for errors in inputs */ if( (n<=0||nc<2)||kmax<2 ) { *info = -1; ae_frame_leave(_state); return; } for(i=0; i<=n-1; i++) { if( c->ptr.p_int[i]<0||c->ptr.p_int[i]>=nc ) { *info = -2; ae_frame_leave(_state); return; } } *info = 1; /* * Tie */ dstie(a, n, &ties, &tiecount, &p1, &p2, _state); for(i=0; i<=n-1; i++) { if( p2.ptr.p_int[i]!=i ) { k = c->ptr.p_int[i]; c->ptr.p_int[i] = c->ptr.p_int[p2.ptr.p_int[i]]; c->ptr.p_int[p2.ptr.p_int[i]] = k; } } /* * Special cases */ if( tiecount==1 ) { *info = -3; ae_frame_leave(_state); return; } /* * General case * Use dynamic programming to find best split in O(KMax*NC*TieCount^2) time */ kmax = ae_minint(kmax, tiecount, _state); ae_matrix_set_length(&cv, kmax-1+1, tiecount-1+1, _state); ae_matrix_set_length(&splits, kmax-1+1, tiecount-1+1, _state); ae_vector_set_length(&cnt, nc-1+1, _state); ae_vector_set_length(&cnt2, nc-1+1, _state); for(j=0; j<=nc-1; j++) { cnt.ptr.p_int[j] = 0; } for(j=0; j<=tiecount-1; j++) { bdss_tieaddc(c, &ties, j, nc, &cnt, _state); splits.ptr.pp_int[0][j] = 0; cv.ptr.pp_double[0][j] = 
bdss_getcv(&cnt, nc, _state); } for(k=1; k<=kmax-1; k++) { for(j=0; j<=nc-1; j++) { cnt.ptr.p_int[j] = 0; } /* * Subtask size J in [K..TieCount-1]: * optimal K-splitting on ties from 0-th to J-th. */ for(j=k; j<=tiecount-1; j++) { /* * Update Cnt - let it contain classes of ties from K-th to J-th */ bdss_tieaddc(c, &ties, j, nc, &cnt, _state); /* * Search for optimal split point S in [K..J] */ for(i=0; i<=nc-1; i++) { cnt2.ptr.p_int[i] = cnt.ptr.p_int[i]; } cv.ptr.pp_double[k][j] = cv.ptr.pp_double[k-1][j-1]+bdss_getcv(&cnt2, nc, _state); splits.ptr.pp_int[k][j] = j; for(s=k+1; s<=j; s++) { /* * Update Cnt2 - let it contain classes of ties from S-th to J-th */ bdss_tiesubc(c, &ties, s-1, nc, &cnt2, _state); /* * Calculate CVE */ cvtemp = cv.ptr.pp_double[k-1][s-1]+bdss_getcv(&cnt2, nc, _state); if( ae_fp_less(cvtemp,cv.ptr.pp_double[k][j]) ) { cv.ptr.pp_double[k][j] = cvtemp; splits.ptr.pp_int[k][j] = s; } } } } /* * Choose best partition, output result */ koptimal = -1; cvoptimal = ae_maxrealnumber; for(k=0; k<=kmax-1; k++) { if( ae_fp_less(cv.ptr.pp_double[k][tiecount-1],cvoptimal) ) { cvoptimal = cv.ptr.pp_double[k][tiecount-1]; koptimal = k; } } ae_assert(koptimal>=0, "DSOptimalSplitK: internal error #1!", _state); if( koptimal==0 ) { /* * Special case: best partition is one big interval. * Even 2-partition is not better. * This is possible when dealing with "weak" predictor variables. * * Make binary split as close to the median as possible. */ v2 = ae_maxrealnumber; j = -1; for(i=1; i<=tiecount-1; i++) { if( ae_fp_less(ae_fabs(ties.ptr.p_int[i]-0.5*(n-1), _state),v2) ) { v2 = ae_fabs(ties.ptr.p_int[i]-0.5*(n-1), _state); j = i; } } ae_assert(j>0, "DSOptimalSplitK: internal error #2!", _state); ae_vector_set_length(thresholds, 0+1, _state); thresholds->ptr.p_double[0] = 0.5*(a->ptr.p_double[ties.ptr.p_int[j-1]]+a->ptr.p_double[ties.ptr.p_int[j]]); *ni = 2; *cve = (double)(0); for(i=0; i<=nc-1; i++) { cnt.ptr.p_int[i] = 0; } for(i=0; i<=j-1; i++) { bdss_tieaddc(c, &ties, i, nc, &cnt, _state); } *cve = *cve+bdss_getcv(&cnt, nc, _state); for(i=0; i<=nc-1; i++) { cnt.ptr.p_int[i] = 0; } for(i=j; i<=tiecount-1; i++) { bdss_tieaddc(c, &ties, i, nc, &cnt, _state); } *cve = *cve+bdss_getcv(&cnt, nc, _state); } else { /* * General case: 2 or more intervals * * NOTE: we initialize both JL and JR (left and right bounds), * altough algorithm needs only JL. 
*/ ae_vector_set_length(thresholds, koptimal-1+1, _state); *ni = koptimal+1; *cve = cv.ptr.pp_double[koptimal][tiecount-1]; jl = splits.ptr.pp_int[koptimal][tiecount-1]; jr = tiecount-1; for(k=koptimal; k>=1; k--) { thresholds->ptr.p_double[k-1] = 0.5*(a->ptr.p_double[ties.ptr.p_int[jl-1]]+a->ptr.p_double[ties.ptr.p_int[jl]]); jr = jl-1; jl = splits.ptr.pp_int[k-1][jl-1]; } touchint(&jr, _state); } ae_frame_leave(_state); } /************************************************************************* Internal function *************************************************************************/ static double bdss_xlny(double x, double y, ae_state *_state) { double result; if( ae_fp_eq(x,(double)(0)) ) { result = (double)(0); } else { result = x*ae_log(y, _state); } return result; } /************************************************************************* Internal function, returns number of samples of class I in Cnt[I] *************************************************************************/ static double bdss_getcv(/* Integer */ ae_vector* cnt, ae_int_t nc, ae_state *_state) { ae_int_t i; double s; double result; s = (double)(0); for(i=0; i<=nc-1; i++) { s = s+cnt->ptr.p_int[i]; } result = (double)(0); for(i=0; i<=nc-1; i++) { result = result-bdss_xlny((double)(cnt->ptr.p_int[i]), cnt->ptr.p_int[i]/(s+nc-1), _state); } return result; } /************************************************************************* Internal function, adds number of samples of class I in tie NTie to Cnt[I] *************************************************************************/ static void bdss_tieaddc(/* Integer */ ae_vector* c, /* Integer */ ae_vector* ties, ae_int_t ntie, ae_int_t nc, /* Integer */ ae_vector* cnt, ae_state *_state) { ae_int_t i; for(i=ties->ptr.p_int[ntie]; i<=ties->ptr.p_int[ntie+1]-1; i++) { cnt->ptr.p_int[c->ptr.p_int[i]] = cnt->ptr.p_int[c->ptr.p_int[i]]+1; } } /************************************************************************* Internal function, subtracts number of samples of class I in tie NTie to Cnt[I] *************************************************************************/ static void bdss_tiesubc(/* Integer */ ae_vector* c, /* Integer */ ae_vector* ties, ae_int_t ntie, ae_int_t nc, /* Integer */ ae_vector* cnt, ae_state *_state) { ae_int_t i; for(i=ties->ptr.p_int[ntie]; i<=ties->ptr.p_int[ntie+1]-1; i++) { cnt->ptr.p_int[c->ptr.p_int[i]] = cnt->ptr.p_int[c->ptr.p_int[i]]-1; } } void _cvreport_init(void* _p, ae_state *_state) { cvreport *p = (cvreport*)_p; ae_touch_ptr((void*)p); } void _cvreport_init_copy(void* _dst, void* _src, ae_state *_state) { cvreport *dst = (cvreport*)_dst; cvreport *src = (cvreport*)_src; dst->relclserror = src->relclserror; dst->avgce = src->avgce; dst->rmserror = src->rmserror; dst->avgerror = src->avgerror; dst->avgrelerror = src->avgrelerror; } void _cvreport_clear(void* _p) { cvreport *p = (cvreport*)_p; ae_touch_ptr((void*)p); } void _cvreport_destroy(void* _p) { cvreport *p = (cvreport*)_p; ae_touch_ptr((void*)p); } /************************************************************************* This function returns number of weights updates which is required for gradient calculation problem to be splitted. 
*************************************************************************/
ae_int_t mlpgradsplitcost(ae_state *_state)
{
    ae_int_t result;

    result = mlpbase_gradbasecasecost;
    return result;
}


/*************************************************************************
This function returns number of elements in subset of dataset which is
required for gradient calculation problem to be split.
*************************************************************************/
ae_int_t mlpgradsplitsize(ae_state *_state)
{
    ae_int_t result;

    result = mlpbase_microbatchsize;
    return result;
}


/*************************************************************************
Creates neural network with NIn inputs, NOut outputs, without hidden
layers, with linear output layer. Network weights are filled with small
random values.

  -- ALGLIB --
     Copyright 04.11.2007 by Bochkanov Sergey
*************************************************************************/
void mlpcreate0(ae_int_t nin, ae_int_t nout, multilayerperceptron* network, ae_state *_state)
{
    ae_frame _frame_block;
    ae_vector lsizes;
    ae_vector ltypes;
    ae_vector lconnfirst;
    ae_vector lconnlast;
    ae_int_t layerscount;
    ae_int_t lastproc;

    ae_frame_make(_state, &_frame_block);
    _multilayerperceptron_clear(network);
    ae_vector_init(&lsizes, 0, DT_INT, _state);
    ae_vector_init(&ltypes, 0, DT_INT, _state);
    ae_vector_init(&lconnfirst, 0, DT_INT, _state);
    ae_vector_init(&lconnlast, 0, DT_INT, _state);
    layerscount = 1+3;

    /*
     * Allocate arrays
     */
    ae_vector_set_length(&lsizes, layerscount-1+1, _state);
    ae_vector_set_length(&ltypes, layerscount-1+1, _state);
    ae_vector_set_length(&lconnfirst, layerscount-1+1, _state);
    ae_vector_set_length(&lconnlast, layerscount-1+1, _state);

    /*
     * Layers
     */
    mlpbase_addinputlayer(nin, &lsizes, &ltypes, &lconnfirst, &lconnlast, &lastproc, _state);
    mlpbase_addbiasedsummatorlayer(nout, &lsizes, &ltypes, &lconnfirst, &lconnlast, &lastproc, _state);
    mlpbase_addactivationlayer(-5, &lsizes, &ltypes, &lconnfirst, &lconnlast, &lastproc, _state);

    /*
     * Create
     */
    mlpbase_mlpcreate(nin, nout, &lsizes, &ltypes, &lconnfirst, &lconnlast, layerscount, ae_false, network, _state);
    mlpbase_fillhighlevelinformation(network, nin, 0, 0, nout, ae_false, ae_true, _state);
    ae_frame_leave(_state);
}


/*************************************************************************
Same as MLPCreate0, but with one hidden layer (NHid neurons) with
non-linear activation function. Output layer is linear.
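NOTE: a short usage sketch of the corresponding C++ interface (illustrative
      only; it assumes the alglib::mlpcreate1 wrapper from dataanalysis.h):

          alglib::multilayerperceptron net;
          alglib::mlpcreate1(2, 10, 1, net);   // 2 inputs, 10 hidden neurons, 1 linear output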
-- ALGLIB -- Copyright 04.11.2007 by Bochkanov Sergey *************************************************************************/ void mlpcreate1(ae_int_t nin, ae_int_t nhid, ae_int_t nout, multilayerperceptron* network, ae_state *_state) { ae_frame _frame_block; ae_vector lsizes; ae_vector ltypes; ae_vector lconnfirst; ae_vector lconnlast; ae_int_t layerscount; ae_int_t lastproc; ae_frame_make(_state, &_frame_block); _multilayerperceptron_clear(network); ae_vector_init(&lsizes, 0, DT_INT, _state); ae_vector_init(<ypes, 0, DT_INT, _state); ae_vector_init(&lconnfirst, 0, DT_INT, _state); ae_vector_init(&lconnlast, 0, DT_INT, _state); layerscount = 1+3+3; /* * Allocate arrays */ ae_vector_set_length(&lsizes, layerscount-1+1, _state); ae_vector_set_length(<ypes, layerscount-1+1, _state); ae_vector_set_length(&lconnfirst, layerscount-1+1, _state); ae_vector_set_length(&lconnlast, layerscount-1+1, _state); /* * Layers */ mlpbase_addinputlayer(nin, &lsizes, <ypes, &lconnfirst, &lconnlast, &lastproc, _state); mlpbase_addbiasedsummatorlayer(nhid, &lsizes, <ypes, &lconnfirst, &lconnlast, &lastproc, _state); mlpbase_addactivationlayer(1, &lsizes, <ypes, &lconnfirst, &lconnlast, &lastproc, _state); mlpbase_addbiasedsummatorlayer(nout, &lsizes, <ypes, &lconnfirst, &lconnlast, &lastproc, _state); mlpbase_addactivationlayer(-5, &lsizes, <ypes, &lconnfirst, &lconnlast, &lastproc, _state); /* * Create */ mlpbase_mlpcreate(nin, nout, &lsizes, <ypes, &lconnfirst, &lconnlast, layerscount, ae_false, network, _state); mlpbase_fillhighlevelinformation(network, nin, nhid, 0, nout, ae_false, ae_true, _state); ae_frame_leave(_state); } /************************************************************************* Same as MLPCreate0, but with two hidden layers (NHid1 and NHid2 neurons) with non-linear activation function. Output layer is linear. 
  -- ALGLIB --
     Copyright 04.11.2007 by Bochkanov Sergey
*************************************************************************/
void mlpcreate2(ae_int_t nin, ae_int_t nhid1, ae_int_t nhid2, ae_int_t nout, multilayerperceptron* network, ae_state *_state)
{
    ae_frame _frame_block;
    ae_vector lsizes;
    ae_vector ltypes;
    ae_vector lconnfirst;
    ae_vector lconnlast;
    ae_int_t layerscount;
    ae_int_t lastproc;

    ae_frame_make(_state, &_frame_block);
    _multilayerperceptron_clear(network);
    ae_vector_init(&lsizes, 0, DT_INT, _state);
    ae_vector_init(&ltypes, 0, DT_INT, _state);
    ae_vector_init(&lconnfirst, 0, DT_INT, _state);
    ae_vector_init(&lconnlast, 0, DT_INT, _state);
    layerscount = 1+3+3+3;

    /*
     * Allocate arrays
     */
    ae_vector_set_length(&lsizes, layerscount-1+1, _state);
    ae_vector_set_length(&ltypes, layerscount-1+1, _state);
    ae_vector_set_length(&lconnfirst, layerscount-1+1, _state);
    ae_vector_set_length(&lconnlast, layerscount-1+1, _state);

    /*
     * Layers
     */
    mlpbase_addinputlayer(nin, &lsizes, &ltypes, &lconnfirst, &lconnlast, &lastproc, _state);
    mlpbase_addbiasedsummatorlayer(nhid1, &lsizes, &ltypes, &lconnfirst, &lconnlast, &lastproc, _state);
    mlpbase_addactivationlayer(1, &lsizes, &ltypes, &lconnfirst, &lconnlast, &lastproc, _state);
    mlpbase_addbiasedsummatorlayer(nhid2, &lsizes, &ltypes, &lconnfirst, &lconnlast, &lastproc, _state);
    mlpbase_addactivationlayer(1, &lsizes, &ltypes, &lconnfirst, &lconnlast, &lastproc, _state);
    mlpbase_addbiasedsummatorlayer(nout, &lsizes, &ltypes, &lconnfirst, &lconnlast, &lastproc, _state);
    mlpbase_addactivationlayer(-5, &lsizes, &ltypes, &lconnfirst, &lconnlast, &lastproc, _state);

    /*
     * Create
     */
    mlpbase_mlpcreate(nin, nout, &lsizes, &ltypes, &lconnfirst, &lconnlast, layerscount, ae_false, network, _state);
    mlpbase_fillhighlevelinformation(network, nin, nhid1, nhid2, nout, ae_false, ae_true, _state);
    ae_frame_leave(_state);
}


/*************************************************************************
Creates neural network with NIn inputs, NOut outputs, without hidden
layers, with non-linear output layer. Network weights are filled with
small random values.

Activation function of the output layer takes values:

    (B, +INF), if D>=0

or

    (-INF, B), if D<0.
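NOTE: an illustrative C++ call (it assumes the alglib::mlpcreateb0 wrapper
      from dataanalysis.h):

          alglib::multilayerperceptron net;
          alglib::mlpcreateb0(3, 1, 0.0, +1.0, net);   // single output bounded below by B=0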
-- ALGLIB -- Copyright 30.03.2008 by Bochkanov Sergey *************************************************************************/ void mlpcreateb0(ae_int_t nin, ae_int_t nout, double b, double d, multilayerperceptron* network, ae_state *_state) { ae_frame _frame_block; ae_vector lsizes; ae_vector ltypes; ae_vector lconnfirst; ae_vector lconnlast; ae_int_t layerscount; ae_int_t lastproc; ae_int_t i; ae_frame_make(_state, &_frame_block); _multilayerperceptron_clear(network); ae_vector_init(&lsizes, 0, DT_INT, _state); ae_vector_init(<ypes, 0, DT_INT, _state); ae_vector_init(&lconnfirst, 0, DT_INT, _state); ae_vector_init(&lconnlast, 0, DT_INT, _state); layerscount = 1+3; if( ae_fp_greater_eq(d,(double)(0)) ) { d = (double)(1); } else { d = (double)(-1); } /* * Allocate arrays */ ae_vector_set_length(&lsizes, layerscount-1+1, _state); ae_vector_set_length(<ypes, layerscount-1+1, _state); ae_vector_set_length(&lconnfirst, layerscount-1+1, _state); ae_vector_set_length(&lconnlast, layerscount-1+1, _state); /* * Layers */ mlpbase_addinputlayer(nin, &lsizes, <ypes, &lconnfirst, &lconnlast, &lastproc, _state); mlpbase_addbiasedsummatorlayer(nout, &lsizes, <ypes, &lconnfirst, &lconnlast, &lastproc, _state); mlpbase_addactivationlayer(3, &lsizes, <ypes, &lconnfirst, &lconnlast, &lastproc, _state); /* * Create */ mlpbase_mlpcreate(nin, nout, &lsizes, <ypes, &lconnfirst, &lconnlast, layerscount, ae_false, network, _state); mlpbase_fillhighlevelinformation(network, nin, 0, 0, nout, ae_false, ae_false, _state); /* * Turn on ouputs shift/scaling. */ for(i=nin; i<=nin+nout-1; i++) { network->columnmeans.ptr.p_double[i] = b; network->columnsigmas.ptr.p_double[i] = d; } ae_frame_leave(_state); } /************************************************************************* Same as MLPCreateB0 but with non-linear hidden layer. 
-- ALGLIB -- Copyright 30.03.2008 by Bochkanov Sergey *************************************************************************/ void mlpcreateb1(ae_int_t nin, ae_int_t nhid, ae_int_t nout, double b, double d, multilayerperceptron* network, ae_state *_state) { ae_frame _frame_block; ae_vector lsizes; ae_vector ltypes; ae_vector lconnfirst; ae_vector lconnlast; ae_int_t layerscount; ae_int_t lastproc; ae_int_t i; ae_frame_make(_state, &_frame_block); _multilayerperceptron_clear(network); ae_vector_init(&lsizes, 0, DT_INT, _state); ae_vector_init(<ypes, 0, DT_INT, _state); ae_vector_init(&lconnfirst, 0, DT_INT, _state); ae_vector_init(&lconnlast, 0, DT_INT, _state); layerscount = 1+3+3; if( ae_fp_greater_eq(d,(double)(0)) ) { d = (double)(1); } else { d = (double)(-1); } /* * Allocate arrays */ ae_vector_set_length(&lsizes, layerscount-1+1, _state); ae_vector_set_length(<ypes, layerscount-1+1, _state); ae_vector_set_length(&lconnfirst, layerscount-1+1, _state); ae_vector_set_length(&lconnlast, layerscount-1+1, _state); /* * Layers */ mlpbase_addinputlayer(nin, &lsizes, <ypes, &lconnfirst, &lconnlast, &lastproc, _state); mlpbase_addbiasedsummatorlayer(nhid, &lsizes, <ypes, &lconnfirst, &lconnlast, &lastproc, _state); mlpbase_addactivationlayer(1, &lsizes, <ypes, &lconnfirst, &lconnlast, &lastproc, _state); mlpbase_addbiasedsummatorlayer(nout, &lsizes, <ypes, &lconnfirst, &lconnlast, &lastproc, _state); mlpbase_addactivationlayer(3, &lsizes, <ypes, &lconnfirst, &lconnlast, &lastproc, _state); /* * Create */ mlpbase_mlpcreate(nin, nout, &lsizes, <ypes, &lconnfirst, &lconnlast, layerscount, ae_false, network, _state); mlpbase_fillhighlevelinformation(network, nin, nhid, 0, nout, ae_false, ae_false, _state); /* * Turn on ouputs shift/scaling. */ for(i=nin; i<=nin+nout-1; i++) { network->columnmeans.ptr.p_double[i] = b; network->columnsigmas.ptr.p_double[i] = d; } ae_frame_leave(_state); } /************************************************************************* Same as MLPCreateB0 but with two non-linear hidden layers. 
-- ALGLIB -- Copyright 30.03.2008 by Bochkanov Sergey *************************************************************************/ void mlpcreateb2(ae_int_t nin, ae_int_t nhid1, ae_int_t nhid2, ae_int_t nout, double b, double d, multilayerperceptron* network, ae_state *_state) { ae_frame _frame_block; ae_vector lsizes; ae_vector ltypes; ae_vector lconnfirst; ae_vector lconnlast; ae_int_t layerscount; ae_int_t lastproc; ae_int_t i; ae_frame_make(_state, &_frame_block); _multilayerperceptron_clear(network); ae_vector_init(&lsizes, 0, DT_INT, _state); ae_vector_init(<ypes, 0, DT_INT, _state); ae_vector_init(&lconnfirst, 0, DT_INT, _state); ae_vector_init(&lconnlast, 0, DT_INT, _state); layerscount = 1+3+3+3; if( ae_fp_greater_eq(d,(double)(0)) ) { d = (double)(1); } else { d = (double)(-1); } /* * Allocate arrays */ ae_vector_set_length(&lsizes, layerscount-1+1, _state); ae_vector_set_length(<ypes, layerscount-1+1, _state); ae_vector_set_length(&lconnfirst, layerscount-1+1, _state); ae_vector_set_length(&lconnlast, layerscount-1+1, _state); /* * Layers */ mlpbase_addinputlayer(nin, &lsizes, <ypes, &lconnfirst, &lconnlast, &lastproc, _state); mlpbase_addbiasedsummatorlayer(nhid1, &lsizes, <ypes, &lconnfirst, &lconnlast, &lastproc, _state); mlpbase_addactivationlayer(1, &lsizes, <ypes, &lconnfirst, &lconnlast, &lastproc, _state); mlpbase_addbiasedsummatorlayer(nhid2, &lsizes, <ypes, &lconnfirst, &lconnlast, &lastproc, _state); mlpbase_addactivationlayer(1, &lsizes, <ypes, &lconnfirst, &lconnlast, &lastproc, _state); mlpbase_addbiasedsummatorlayer(nout, &lsizes, <ypes, &lconnfirst, &lconnlast, &lastproc, _state); mlpbase_addactivationlayer(3, &lsizes, <ypes, &lconnfirst, &lconnlast, &lastproc, _state); /* * Create */ mlpbase_mlpcreate(nin, nout, &lsizes, <ypes, &lconnfirst, &lconnlast, layerscount, ae_false, network, _state); mlpbase_fillhighlevelinformation(network, nin, nhid1, nhid2, nout, ae_false, ae_false, _state); /* * Turn on ouputs shift/scaling. */ for(i=nin; i<=nin+nout-1; i++) { network->columnmeans.ptr.p_double[i] = b; network->columnsigmas.ptr.p_double[i] = d; } ae_frame_leave(_state); } /************************************************************************* Creates neural network with NIn inputs, NOut outputs, without hidden layers with non-linear output layer. Network weights are filled with small random values. Activation function of the output layer takes values [A,B]. 
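NOTE: a worked sketch of how the [A,B] range is obtained (assuming, which is
      not stated explicitly here, that activation type 1 used for the output
      neurons takes values in (-1,+1)): the constructor below sets
      ColumnMeans=0.5*(A+B) and ColumnSigmas=0.5*(A-B) for the output
      neurons, so the final output is

          y = 0.5*(A+B) + 0.5*(A-B)*t,    t in (-1,+1),

      i.e. y tends to A as t tends to +1 and to B as t tends to -1, staying
      between A and B.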
-- ALGLIB -- Copyright 30.03.2008 by Bochkanov Sergey *************************************************************************/ void mlpcreater0(ae_int_t nin, ae_int_t nout, double a, double b, multilayerperceptron* network, ae_state *_state) { ae_frame _frame_block; ae_vector lsizes; ae_vector ltypes; ae_vector lconnfirst; ae_vector lconnlast; ae_int_t layerscount; ae_int_t lastproc; ae_int_t i; ae_frame_make(_state, &_frame_block); _multilayerperceptron_clear(network); ae_vector_init(&lsizes, 0, DT_INT, _state); ae_vector_init(<ypes, 0, DT_INT, _state); ae_vector_init(&lconnfirst, 0, DT_INT, _state); ae_vector_init(&lconnlast, 0, DT_INT, _state); layerscount = 1+3; /* * Allocate arrays */ ae_vector_set_length(&lsizes, layerscount-1+1, _state); ae_vector_set_length(<ypes, layerscount-1+1, _state); ae_vector_set_length(&lconnfirst, layerscount-1+1, _state); ae_vector_set_length(&lconnlast, layerscount-1+1, _state); /* * Layers */ mlpbase_addinputlayer(nin, &lsizes, <ypes, &lconnfirst, &lconnlast, &lastproc, _state); mlpbase_addbiasedsummatorlayer(nout, &lsizes, <ypes, &lconnfirst, &lconnlast, &lastproc, _state); mlpbase_addactivationlayer(1, &lsizes, <ypes, &lconnfirst, &lconnlast, &lastproc, _state); /* * Create */ mlpbase_mlpcreate(nin, nout, &lsizes, <ypes, &lconnfirst, &lconnlast, layerscount, ae_false, network, _state); mlpbase_fillhighlevelinformation(network, nin, 0, 0, nout, ae_false, ae_false, _state); /* * Turn on outputs shift/scaling. */ for(i=nin; i<=nin+nout-1; i++) { network->columnmeans.ptr.p_double[i] = 0.5*(a+b); network->columnsigmas.ptr.p_double[i] = 0.5*(a-b); } ae_frame_leave(_state); } /************************************************************************* Same as MLPCreateR0, but with non-linear hidden layer. -- ALGLIB -- Copyright 30.03.2008 by Bochkanov Sergey *************************************************************************/ void mlpcreater1(ae_int_t nin, ae_int_t nhid, ae_int_t nout, double a, double b, multilayerperceptron* network, ae_state *_state) { ae_frame _frame_block; ae_vector lsizes; ae_vector ltypes; ae_vector lconnfirst; ae_vector lconnlast; ae_int_t layerscount; ae_int_t lastproc; ae_int_t i; ae_frame_make(_state, &_frame_block); _multilayerperceptron_clear(network); ae_vector_init(&lsizes, 0, DT_INT, _state); ae_vector_init(<ypes, 0, DT_INT, _state); ae_vector_init(&lconnfirst, 0, DT_INT, _state); ae_vector_init(&lconnlast, 0, DT_INT, _state); layerscount = 1+3+3; /* * Allocate arrays */ ae_vector_set_length(&lsizes, layerscount-1+1, _state); ae_vector_set_length(<ypes, layerscount-1+1, _state); ae_vector_set_length(&lconnfirst, layerscount-1+1, _state); ae_vector_set_length(&lconnlast, layerscount-1+1, _state); /* * Layers */ mlpbase_addinputlayer(nin, &lsizes, <ypes, &lconnfirst, &lconnlast, &lastproc, _state); mlpbase_addbiasedsummatorlayer(nhid, &lsizes, <ypes, &lconnfirst, &lconnlast, &lastproc, _state); mlpbase_addactivationlayer(1, &lsizes, <ypes, &lconnfirst, &lconnlast, &lastproc, _state); mlpbase_addbiasedsummatorlayer(nout, &lsizes, <ypes, &lconnfirst, &lconnlast, &lastproc, _state); mlpbase_addactivationlayer(1, &lsizes, <ypes, &lconnfirst, &lconnlast, &lastproc, _state); /* * Create */ mlpbase_mlpcreate(nin, nout, &lsizes, <ypes, &lconnfirst, &lconnlast, layerscount, ae_false, network, _state); mlpbase_fillhighlevelinformation(network, nin, nhid, 0, nout, ae_false, ae_false, _state); /* * Turn on outputs shift/scaling. 
*/ for(i=nin; i<=nin+nout-1; i++) { network->columnmeans.ptr.p_double[i] = 0.5*(a+b); network->columnsigmas.ptr.p_double[i] = 0.5*(a-b); } ae_frame_leave(_state); } /************************************************************************* Same as MLPCreateR0, but with two non-linear hidden layers. -- ALGLIB -- Copyright 30.03.2008 by Bochkanov Sergey *************************************************************************/ void mlpcreater2(ae_int_t nin, ae_int_t nhid1, ae_int_t nhid2, ae_int_t nout, double a, double b, multilayerperceptron* network, ae_state *_state) { ae_frame _frame_block; ae_vector lsizes; ae_vector ltypes; ae_vector lconnfirst; ae_vector lconnlast; ae_int_t layerscount; ae_int_t lastproc; ae_int_t i; ae_frame_make(_state, &_frame_block); _multilayerperceptron_clear(network); ae_vector_init(&lsizes, 0, DT_INT, _state); ae_vector_init(<ypes, 0, DT_INT, _state); ae_vector_init(&lconnfirst, 0, DT_INT, _state); ae_vector_init(&lconnlast, 0, DT_INT, _state); layerscount = 1+3+3+3; /* * Allocate arrays */ ae_vector_set_length(&lsizes, layerscount-1+1, _state); ae_vector_set_length(<ypes, layerscount-1+1, _state); ae_vector_set_length(&lconnfirst, layerscount-1+1, _state); ae_vector_set_length(&lconnlast, layerscount-1+1, _state); /* * Layers */ mlpbase_addinputlayer(nin, &lsizes, <ypes, &lconnfirst, &lconnlast, &lastproc, _state); mlpbase_addbiasedsummatorlayer(nhid1, &lsizes, <ypes, &lconnfirst, &lconnlast, &lastproc, _state); mlpbase_addactivationlayer(1, &lsizes, <ypes, &lconnfirst, &lconnlast, &lastproc, _state); mlpbase_addbiasedsummatorlayer(nhid2, &lsizes, <ypes, &lconnfirst, &lconnlast, &lastproc, _state); mlpbase_addactivationlayer(1, &lsizes, <ypes, &lconnfirst, &lconnlast, &lastproc, _state); mlpbase_addbiasedsummatorlayer(nout, &lsizes, <ypes, &lconnfirst, &lconnlast, &lastproc, _state); mlpbase_addactivationlayer(1, &lsizes, <ypes, &lconnfirst, &lconnlast, &lastproc, _state); /* * Create */ mlpbase_mlpcreate(nin, nout, &lsizes, <ypes, &lconnfirst, &lconnlast, layerscount, ae_false, network, _state); mlpbase_fillhighlevelinformation(network, nin, nhid1, nhid2, nout, ae_false, ae_false, _state); /* * Turn on outputs shift/scaling. */ for(i=nin; i<=nin+nout-1; i++) { network->columnmeans.ptr.p_double[i] = 0.5*(a+b); network->columnsigmas.ptr.p_double[i] = 0.5*(a-b); } ae_frame_leave(_state); } /************************************************************************* Creates classifier network with NIn inputs and NOut possible classes. Network contains no hidden layers and linear output layer with SOFTMAX- normalization (so outputs sums up to 1.0 and converge to posterior probabilities). 
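The MLPCreateR0/R1/R2 constructors above are the range-bounded counterpart of the classifier constructors described next; a hedged sketch using the assumed C++ wrappers (alglib::mlpcreater1, alglib::mlpprocess, header dataanalysis.h), with illustrative layer sizes and an illustrative [A,B] range:

    #include "dataanalysis.h"
    using namespace alglib;

    void create_range_network_example()
    {
        multilayerperceptron net;

        // 3 inputs, 10 hidden neurons, 2 outputs mapped into [-1,+1];
        // per the code above the output scaling is stored as
        // ColumnMeans[out]=0.5*(A+B) and ColumnSigmas[out]=0.5*(A-B).
        mlpcreater1(3, 10, 2, -1.0, +1.0, net);

        real_1d_array x = "[0.0, 0.5, 1.0]";
        real_1d_array y;
        mlpprocess(net, x, y);   // both outputs lie within the requested range
    }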
-- ALGLIB -- Copyright 04.11.2007 by Bochkanov Sergey *************************************************************************/ void mlpcreatec0(ae_int_t nin, ae_int_t nout, multilayerperceptron* network, ae_state *_state) { ae_frame _frame_block; ae_vector lsizes; ae_vector ltypes; ae_vector lconnfirst; ae_vector lconnlast; ae_int_t layerscount; ae_int_t lastproc; ae_frame_make(_state, &_frame_block); _multilayerperceptron_clear(network); ae_vector_init(&lsizes, 0, DT_INT, _state); ae_vector_init(<ypes, 0, DT_INT, _state); ae_vector_init(&lconnfirst, 0, DT_INT, _state); ae_vector_init(&lconnlast, 0, DT_INT, _state); ae_assert(nout>=2, "MLPCreateC0: NOut<2!", _state); layerscount = 1+2+1; /* * Allocate arrays */ ae_vector_set_length(&lsizes, layerscount-1+1, _state); ae_vector_set_length(<ypes, layerscount-1+1, _state); ae_vector_set_length(&lconnfirst, layerscount-1+1, _state); ae_vector_set_length(&lconnlast, layerscount-1+1, _state); /* * Layers */ mlpbase_addinputlayer(nin, &lsizes, <ypes, &lconnfirst, &lconnlast, &lastproc, _state); mlpbase_addbiasedsummatorlayer(nout-1, &lsizes, <ypes, &lconnfirst, &lconnlast, &lastproc, _state); mlpbase_addzerolayer(&lsizes, <ypes, &lconnfirst, &lconnlast, &lastproc, _state); /* * Create */ mlpbase_mlpcreate(nin, nout, &lsizes, <ypes, &lconnfirst, &lconnlast, layerscount, ae_true, network, _state); mlpbase_fillhighlevelinformation(network, nin, 0, 0, nout, ae_true, ae_true, _state); ae_frame_leave(_state); } /************************************************************************* Same as MLPCreateC0, but with one non-linear hidden layer. -- ALGLIB -- Copyright 04.11.2007 by Bochkanov Sergey *************************************************************************/ void mlpcreatec1(ae_int_t nin, ae_int_t nhid, ae_int_t nout, multilayerperceptron* network, ae_state *_state) { ae_frame _frame_block; ae_vector lsizes; ae_vector ltypes; ae_vector lconnfirst; ae_vector lconnlast; ae_int_t layerscount; ae_int_t lastproc; ae_frame_make(_state, &_frame_block); _multilayerperceptron_clear(network); ae_vector_init(&lsizes, 0, DT_INT, _state); ae_vector_init(<ypes, 0, DT_INT, _state); ae_vector_init(&lconnfirst, 0, DT_INT, _state); ae_vector_init(&lconnlast, 0, DT_INT, _state); ae_assert(nout>=2, "MLPCreateC1: NOut<2!", _state); layerscount = 1+3+2+1; /* * Allocate arrays */ ae_vector_set_length(&lsizes, layerscount-1+1, _state); ae_vector_set_length(<ypes, layerscount-1+1, _state); ae_vector_set_length(&lconnfirst, layerscount-1+1, _state); ae_vector_set_length(&lconnlast, layerscount-1+1, _state); /* * Layers */ mlpbase_addinputlayer(nin, &lsizes, <ypes, &lconnfirst, &lconnlast, &lastproc, _state); mlpbase_addbiasedsummatorlayer(nhid, &lsizes, <ypes, &lconnfirst, &lconnlast, &lastproc, _state); mlpbase_addactivationlayer(1, &lsizes, <ypes, &lconnfirst, &lconnlast, &lastproc, _state); mlpbase_addbiasedsummatorlayer(nout-1, &lsizes, <ypes, &lconnfirst, &lconnlast, &lastproc, _state); mlpbase_addzerolayer(&lsizes, <ypes, &lconnfirst, &lconnlast, &lastproc, _state); /* * Create */ mlpbase_mlpcreate(nin, nout, &lsizes, <ypes, &lconnfirst, &lconnlast, layerscount, ae_true, network, _state); mlpbase_fillhighlevelinformation(network, nin, nhid, 0, nout, ae_true, ae_true, _state); ae_frame_leave(_state); } /************************************************************************* Same as MLPCreateC0, but with two non-linear hidden layers. 
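A short sketch of the classifier constructors MLPCreateC0/MLPCreateC1 above (MLPCreateC2 follows), again through the assumed C++ wrappers from dataanalysis.h; note that these constructors assert NOut>=2, i.e. a classifier must have at least two classes:

    #include "dataanalysis.h"
    using namespace alglib;

    void create_classifier_example()
    {
        multilayerperceptron net;

        // 4 inputs, 8 hidden neurons, 3 classes; the output layer is
        // SOFTMAX-normalized, so the outputs are nonnegative and sum to 1.0.
        mlpcreatec1(4, 8, 3, net);
        bool isclassifier = mlpissoftmax(net);   // true for the C-family constructors

        real_1d_array x = "[0.1, 0.2, 0.3, 0.4]";
        real_1d_array y;
        mlpprocess(net, x, y);   // y[0]+y[1]+y[2] == 1.0 up to rounding
    }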
-- ALGLIB -- Copyright 04.11.2007 by Bochkanov Sergey *************************************************************************/ void mlpcreatec2(ae_int_t nin, ae_int_t nhid1, ae_int_t nhid2, ae_int_t nout, multilayerperceptron* network, ae_state *_state) { ae_frame _frame_block; ae_vector lsizes; ae_vector ltypes; ae_vector lconnfirst; ae_vector lconnlast; ae_int_t layerscount; ae_int_t lastproc; ae_frame_make(_state, &_frame_block); _multilayerperceptron_clear(network); ae_vector_init(&lsizes, 0, DT_INT, _state); ae_vector_init(<ypes, 0, DT_INT, _state); ae_vector_init(&lconnfirst, 0, DT_INT, _state); ae_vector_init(&lconnlast, 0, DT_INT, _state); ae_assert(nout>=2, "MLPCreateC2: NOut<2!", _state); layerscount = 1+3+3+2+1; /* * Allocate arrays */ ae_vector_set_length(&lsizes, layerscount-1+1, _state); ae_vector_set_length(<ypes, layerscount-1+1, _state); ae_vector_set_length(&lconnfirst, layerscount-1+1, _state); ae_vector_set_length(&lconnlast, layerscount-1+1, _state); /* * Layers */ mlpbase_addinputlayer(nin, &lsizes, <ypes, &lconnfirst, &lconnlast, &lastproc, _state); mlpbase_addbiasedsummatorlayer(nhid1, &lsizes, <ypes, &lconnfirst, &lconnlast, &lastproc, _state); mlpbase_addactivationlayer(1, &lsizes, <ypes, &lconnfirst, &lconnlast, &lastproc, _state); mlpbase_addbiasedsummatorlayer(nhid2, &lsizes, <ypes, &lconnfirst, &lconnlast, &lastproc, _state); mlpbase_addactivationlayer(1, &lsizes, <ypes, &lconnfirst, &lconnlast, &lastproc, _state); mlpbase_addbiasedsummatorlayer(nout-1, &lsizes, <ypes, &lconnfirst, &lconnlast, &lastproc, _state); mlpbase_addzerolayer(&lsizes, <ypes, &lconnfirst, &lconnlast, &lastproc, _state); /* * Create */ mlpbase_mlpcreate(nin, nout, &lsizes, <ypes, &lconnfirst, &lconnlast, layerscount, ae_true, network, _state); mlpbase_fillhighlevelinformation(network, nin, nhid1, nhid2, nout, ae_true, ae_true, _state); ae_frame_leave(_state); } /************************************************************************* Copying of neural network INPUT PARAMETERS: Network1 - original OUTPUT PARAMETERS: Network2 - copy -- ALGLIB -- Copyright 04.11.2007 by Bochkanov Sergey *************************************************************************/ void mlpcopy(multilayerperceptron* network1, multilayerperceptron* network2, ae_state *_state) { _multilayerperceptron_clear(network2); mlpcopyshared(network1, network2, _state); } /************************************************************************* Copying of neural network (second parameter is passed as shared object). 
INPUT PARAMETERS: Network1 - original OUTPUT PARAMETERS: Network2 - copy -- ALGLIB -- Copyright 04.11.2007 by Bochkanov Sergey *************************************************************************/ void mlpcopyshared(multilayerperceptron* network1, multilayerperceptron* network2, ae_state *_state) { ae_frame _frame_block; ae_int_t wcount; ae_int_t i; mlpbuffers buf; smlpgrad sgrad; ae_frame_make(_state, &_frame_block); _mlpbuffers_init(&buf, _state); _smlpgrad_init(&sgrad, _state); /* * Copy scalar and array fields */ network2->hlnetworktype = network1->hlnetworktype; network2->hlnormtype = network1->hlnormtype; copyintegerarray(&network1->hllayersizes, &network2->hllayersizes, _state); copyintegerarray(&network1->hlconnections, &network2->hlconnections, _state); copyintegerarray(&network1->hlneurons, &network2->hlneurons, _state); copyintegerarray(&network1->structinfo, &network2->structinfo, _state); copyrealarray(&network1->weights, &network2->weights, _state); copyrealarray(&network1->columnmeans, &network2->columnmeans, _state); copyrealarray(&network1->columnsigmas, &network2->columnsigmas, _state); copyrealarray(&network1->neurons, &network2->neurons, _state); copyrealarray(&network1->dfdnet, &network2->dfdnet, _state); copyrealarray(&network1->derror, &network2->derror, _state); copyrealarray(&network1->x, &network2->x, _state); copyrealarray(&network1->y, &network2->y, _state); copyrealarray(&network1->nwbuf, &network2->nwbuf, _state); copyintegerarray(&network1->integerbuf, &network2->integerbuf, _state); /* * copy buffers */ wcount = mlpgetweightscount(network1, _state); ae_shared_pool_set_seed(&network2->buf, &buf, sizeof(buf), _mlpbuffers_init, _mlpbuffers_init_copy, _mlpbuffers_destroy, _state); ae_vector_set_length(&sgrad.g, wcount, _state); sgrad.f = 0.0; for(i=0; i<=wcount-1; i++) { sgrad.g.ptr.p_double[i] = 0.0; } ae_shared_pool_set_seed(&network2->gradbuf, &sgrad, sizeof(sgrad), _smlpgrad_init, _smlpgrad_init_copy, _smlpgrad_destroy, _state); ae_frame_leave(_state); } /************************************************************************* This function compares architectures of neural networks. Only geometries are compared, weights and other parameters are not tested. -- ALGLIB -- Copyright 20.06.2013 by Bochkanov Sergey *************************************************************************/ ae_bool mlpsamearchitecture(multilayerperceptron* network1, multilayerperceptron* network2, ae_state *_state) { ae_int_t i; ae_int_t ninfo; ae_bool result; ae_assert(network1->structinfo.cnt>0&&network1->structinfo.cnt>=network1->structinfo.ptr.p_int[0], "MLPSameArchitecture: Network1 is uninitialized", _state); ae_assert(network2->structinfo.cnt>0&&network2->structinfo.cnt>=network2->structinfo.ptr.p_int[0], "MLPSameArchitecture: Network2 is uninitialized", _state); result = ae_false; if( network1->structinfo.ptr.p_int[0]!=network2->structinfo.ptr.p_int[0] ) { return result; } ninfo = network1->structinfo.ptr.p_int[0]; for(i=0; i<=ninfo-1; i++) { if( network1->structinfo.ptr.p_int[i]!=network2->structinfo.ptr.p_int[i] ) { return result; } } result = ae_true; return result; } /************************************************************************* This function copies tunable parameters (weights/means/sigmas) from one network to another with same architecture. It performs some rudimentary checks that architectures are same, and throws exception if check fails. It is intended for fast copying of states between two network which are known to have same geometry. 
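A sketch of duplicating a network with the assumed C++ wrapper alglib::mlpcopy (the shared-copy and architecture-comparison routines above serve the same machinery at a lower level); after the call the target holds an independent copy of the geometry, weights and preprocessing coefficients:

    #include "dataanalysis.h"
    using namespace alglib;

    void copy_network_example()
    {
        multilayerperceptron src, dst;
        mlpcreate1(2, 5, 1, src);   // any of the constructors above will do
        mlpcopy(src, dst);          // dst is now an independent duplicate of src
    }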
INPUT PARAMETERS: Network1 - source, must be correctly initialized Network2 - target, must have same architecture OUTPUT PARAMETERS: Network2 - network state is copied from source to target -- ALGLIB -- Copyright 20.06.2013 by Bochkanov Sergey *************************************************************************/ void mlpcopytunableparameters(multilayerperceptron* network1, multilayerperceptron* network2, ae_state *_state) { ae_int_t i; ae_int_t ninfo; ae_int_t nin; ae_int_t nout; ae_int_t wcount; ae_assert(network1->structinfo.cnt>0&&network1->structinfo.cnt>=network1->structinfo.ptr.p_int[0], "MLPCopyTunableParameters: Network1 is uninitialized", _state); ae_assert(network2->structinfo.cnt>0&&network2->structinfo.cnt>=network2->structinfo.ptr.p_int[0], "MLPCopyTunableParameters: Network2 is uninitialized", _state); ae_assert(network1->structinfo.ptr.p_int[0]==network2->structinfo.ptr.p_int[0], "MLPCopyTunableParameters: Network1 geometry differs from that of Network2", _state); ninfo = network1->structinfo.ptr.p_int[0]; for(i=0; i<=ninfo-1; i++) { ae_assert(network1->structinfo.ptr.p_int[i]==network2->structinfo.ptr.p_int[i], "MLPCopyTunableParameters: Network1 geometry differs from that of Network2", _state); } mlpproperties(network1, &nin, &nout, &wcount, _state); for(i=0; i<=wcount-1; i++) { network2->weights.ptr.p_double[i] = network1->weights.ptr.p_double[i]; } if( mlpissoftmax(network1, _state) ) { for(i=0; i<=nin-1; i++) { network2->columnmeans.ptr.p_double[i] = network1->columnmeans.ptr.p_double[i]; network2->columnsigmas.ptr.p_double[i] = network1->columnsigmas.ptr.p_double[i]; } } else { for(i=0; i<=nin+nout-1; i++) { network2->columnmeans.ptr.p_double[i] = network1->columnmeans.ptr.p_double[i]; network2->columnsigmas.ptr.p_double[i] = network1->columnsigmas.ptr.p_double[i]; } } } /************************************************************************* This function exports tunable parameters (weights/means/sigmas) from network to contiguous array. Nothing is guaranteed about array format, the only thing you can count for is that MLPImportTunableParameters() will be able to parse it. It is intended for fast copying of states between network and backup array INPUT PARAMETERS: Network - source, must be correctly initialized P - array to use. If its size is enough to store data, it is reused. OUTPUT PARAMETERS: P - array which stores network parameters, resized if needed PCount - number of parameters stored in array. 
-- ALGLIB -- Copyright 20.06.2013 by Bochkanov Sergey *************************************************************************/ void mlpexporttunableparameters(multilayerperceptron* network, /* Real */ ae_vector* p, ae_int_t* pcount, ae_state *_state) { ae_int_t i; ae_int_t k; ae_int_t nin; ae_int_t nout; ae_int_t wcount; *pcount = 0; ae_assert(network->structinfo.cnt>0&&network->structinfo.cnt>=network->structinfo.ptr.p_int[0], "MLPExportTunableParameters: Network is uninitialized", _state); mlpproperties(network, &nin, &nout, &wcount, _state); if( mlpissoftmax(network, _state) ) { *pcount = wcount+2*nin; rvectorsetlengthatleast(p, *pcount, _state); k = 0; for(i=0; i<=wcount-1; i++) { p->ptr.p_double[k] = network->weights.ptr.p_double[i]; k = k+1; } for(i=0; i<=nin-1; i++) { p->ptr.p_double[k] = network->columnmeans.ptr.p_double[i]; k = k+1; p->ptr.p_double[k] = network->columnsigmas.ptr.p_double[i]; k = k+1; } } else { *pcount = wcount+2*(nin+nout); rvectorsetlengthatleast(p, *pcount, _state); k = 0; for(i=0; i<=wcount-1; i++) { p->ptr.p_double[k] = network->weights.ptr.p_double[i]; k = k+1; } for(i=0; i<=nin+nout-1; i++) { p->ptr.p_double[k] = network->columnmeans.ptr.p_double[i]; k = k+1; p->ptr.p_double[k] = network->columnsigmas.ptr.p_double[i]; k = k+1; } } } /************************************************************************* This function imports tunable parameters (weights/means/sigmas) which were exported by MLPExportTunableParameters(). It is intended for fast copying of states between network and backup array INPUT PARAMETERS: Network - target: * must be correctly initialized * must have same geometry as network used to export params P - array with parameters -- ALGLIB -- Copyright 20.06.2013 by Bochkanov Sergey *************************************************************************/ void mlpimporttunableparameters(multilayerperceptron* network, /* Real */ ae_vector* p, ae_state *_state) { ae_int_t i; ae_int_t k; ae_int_t nin; ae_int_t nout; ae_int_t wcount; ae_assert(network->structinfo.cnt>0&&network->structinfo.cnt>=network->structinfo.ptr.p_int[0], "MLPImportTunableParameters: Network is uninitialized", _state); mlpproperties(network, &nin, &nout, &wcount, _state); if( mlpissoftmax(network, _state) ) { k = 0; for(i=0; i<=wcount-1; i++) { network->weights.ptr.p_double[i] = p->ptr.p_double[k]; k = k+1; } for(i=0; i<=nin-1; i++) { network->columnmeans.ptr.p_double[i] = p->ptr.p_double[k]; k = k+1; network->columnsigmas.ptr.p_double[i] = p->ptr.p_double[k]; k = k+1; } } else { k = 0; for(i=0; i<=wcount-1; i++) { network->weights.ptr.p_double[i] = p->ptr.p_double[k]; k = k+1; } for(i=0; i<=nin+nout-1; i++) { network->columnmeans.ptr.p_double[i] = p->ptr.p_double[k]; k = k+1; network->columnsigmas.ptr.p_double[i] = p->ptr.p_double[k]; k = k+1; } } } /************************************************************************* Serialization of MultiLayerPerceptron strucure INPUT PARAMETERS: Network - original OUTPUT PARAMETERS: RA - array of real numbers which stores network, array[0..RLen-1] RLen - RA lenght -- ALGLIB -- Copyright 29.03.2008 by Bochkanov Sergey *************************************************************************/ void mlpserializeold(multilayerperceptron* network, /* Real */ ae_vector* ra, ae_int_t* rlen, ae_state *_state) { ae_int_t i; ae_int_t ssize; ae_int_t nin; ae_int_t nout; ae_int_t wcount; ae_int_t sigmalen; ae_int_t offs; ae_vector_clear(ra); *rlen = 0; /* * Unload info */ ssize = network->structinfo.ptr.p_int[0]; nin = 
network->structinfo.ptr.p_int[1]; nout = network->structinfo.ptr.p_int[2]; wcount = network->structinfo.ptr.p_int[4]; if( mlpissoftmax(network, _state) ) { sigmalen = nin; } else { sigmalen = nin+nout; } /* * RA format: * LEN DESRC. * 1 RLen * 1 version (MLPVNum) * 1 StructInfo size * SSize StructInfo * WCount Weights * SigmaLen ColumnMeans * SigmaLen ColumnSigmas */ *rlen = 3+ssize+wcount+2*sigmalen; ae_vector_set_length(ra, *rlen-1+1, _state); ra->ptr.p_double[0] = (double)(*rlen); ra->ptr.p_double[1] = (double)(mlpbase_mlpvnum); ra->ptr.p_double[2] = (double)(ssize); offs = 3; for(i=0; i<=ssize-1; i++) { ra->ptr.p_double[offs+i] = (double)(network->structinfo.ptr.p_int[i]); } offs = offs+ssize; ae_v_move(&ra->ptr.p_double[offs], 1, &network->weights.ptr.p_double[0], 1, ae_v_len(offs,offs+wcount-1)); offs = offs+wcount; ae_v_move(&ra->ptr.p_double[offs], 1, &network->columnmeans.ptr.p_double[0], 1, ae_v_len(offs,offs+sigmalen-1)); offs = offs+sigmalen; ae_v_move(&ra->ptr.p_double[offs], 1, &network->columnsigmas.ptr.p_double[0], 1, ae_v_len(offs,offs+sigmalen-1)); offs = offs+sigmalen; } /************************************************************************* Unserialization of MultiLayerPerceptron strucure INPUT PARAMETERS: RA - real array which stores network OUTPUT PARAMETERS: Network - restored network -- ALGLIB -- Copyright 29.03.2008 by Bochkanov Sergey *************************************************************************/ void mlpunserializeold(/* Real */ ae_vector* ra, multilayerperceptron* network, ae_state *_state) { ae_int_t i; ae_int_t ssize; ae_int_t ntotal; ae_int_t nin; ae_int_t nout; ae_int_t wcount; ae_int_t sigmalen; ae_int_t offs; _multilayerperceptron_clear(network); ae_assert(ae_round(ra->ptr.p_double[1], _state)==mlpbase_mlpvnum, "MLPUnserialize: incorrect array!", _state); /* * Unload StructInfo from IA */ offs = 3; ssize = ae_round(ra->ptr.p_double[2], _state); ae_vector_set_length(&network->structinfo, ssize-1+1, _state); for(i=0; i<=ssize-1; i++) { network->structinfo.ptr.p_int[i] = ae_round(ra->ptr.p_double[offs+i], _state); } offs = offs+ssize; /* * Unload info from StructInfo */ ssize = network->structinfo.ptr.p_int[0]; nin = network->structinfo.ptr.p_int[1]; nout = network->structinfo.ptr.p_int[2]; ntotal = network->structinfo.ptr.p_int[3]; wcount = network->structinfo.ptr.p_int[4]; if( network->structinfo.ptr.p_int[6]==0 ) { sigmalen = nin+nout; } else { sigmalen = nin; } /* * Allocate space for other fields */ ae_vector_set_length(&network->weights, wcount-1+1, _state); ae_vector_set_length(&network->columnmeans, sigmalen-1+1, _state); ae_vector_set_length(&network->columnsigmas, sigmalen-1+1, _state); ae_vector_set_length(&network->neurons, ntotal-1+1, _state); ae_vector_set_length(&network->nwbuf, ae_maxint(wcount, 2*nout, _state)-1+1, _state); ae_vector_set_length(&network->dfdnet, ntotal-1+1, _state); ae_vector_set_length(&network->x, nin-1+1, _state); ae_vector_set_length(&network->y, nout-1+1, _state); ae_vector_set_length(&network->derror, ntotal-1+1, _state); /* * Copy parameters from RA */ ae_v_move(&network->weights.ptr.p_double[0], 1, &ra->ptr.p_double[offs], 1, ae_v_len(0,wcount-1)); offs = offs+wcount; ae_v_move(&network->columnmeans.ptr.p_double[0], 1, &ra->ptr.p_double[offs], 1, ae_v_len(0,sigmalen-1)); offs = offs+sigmalen; ae_v_move(&network->columnsigmas.ptr.p_double[0], 1, &ra->ptr.p_double[offs], 1, ae_v_len(0,sigmalen-1)); offs = offs+sigmalen; } /************************************************************************* 
Randomization of neural network weights -- ALGLIB -- Copyright 06.11.2007 by Bochkanov Sergey *************************************************************************/ void mlprandomize(multilayerperceptron* network, ae_state *_state) { ae_frame _frame_block; ae_int_t nin; ae_int_t nout; ae_int_t wcount; ae_int_t ntotal; ae_int_t istart; hqrndstate r; ae_int_t entrysize; ae_int_t entryoffs; ae_int_t neuronidx; ae_int_t neurontype; double vmean; double vvar; ae_int_t i; ae_int_t n1; ae_int_t n2; double desiredsigma; ae_int_t montecarlocnt; double ef; double ef2; double v; double wscale; ae_frame_make(_state, &_frame_block); _hqrndstate_init(&r, _state); hqrndrandomize(&r, _state); mlpproperties(network, &nin, &nout, &wcount, _state); ntotal = network->structinfo.ptr.p_int[3]; istart = network->structinfo.ptr.p_int[5]; desiredsigma = 0.5; montecarlocnt = 20; /* * Stage 1: * * Network.Weights is filled by standard deviation of weights * * default values: sigma=1 */ for(i=0; i<=wcount-1; i++) { network->weights.ptr.p_double[i] = 1.0; } /* * Stage 2: * * assume that input neurons have zero mean and unit standard deviation * * assume that constant neurons have zero standard deviation * * perform forward pass along neurons * * for each non-input non-constant neuron: * * calculate mean and standard deviation of neuron's output * assuming that we know means/deviations of neurons which feed it * and assuming that weights has unit variance and zero mean. * * for each nonlinear neuron additionally we perform backward pass: * * scale variances of weights which feed it in such way that neuron's * input has unit standard deviation * * NOTE: this algorithm assumes that each connection feeds at most one * non-linear neuron. This assumption can be incorrect in upcoming * architectures with strong neurons. However, algorithm should * work smoothly even in this case. * * During this stage we use Network.RndBuf, which is grouped into NTotal * entries, each of them having following format: * * Buf[Offset+0] mean value of neuron's output * Buf[Offset+1] standard deviation of neuron's output * * */ entrysize = 2; rvectorsetlengthatleast(&network->rndbuf, entrysize*ntotal, _state); for(neuronidx=0; neuronidx<=ntotal-1; neuronidx++) { neurontype = network->structinfo.ptr.p_int[istart+neuronidx*mlpbase_nfieldwidth+0]; entryoffs = entrysize*neuronidx; if( neurontype==-2 ) { /* * Input neuron: zero mean, unit variance. */ network->rndbuf.ptr.p_double[entryoffs+0] = 0.0; network->rndbuf.ptr.p_double[entryoffs+1] = 1.0; continue; } if( neurontype==-3 ) { /* * "-1" neuron: mean=-1, zero variance. */ network->rndbuf.ptr.p_double[entryoffs+0] = -1.0; network->rndbuf.ptr.p_double[entryoffs+1] = 0.0; continue; } if( neurontype==-4 ) { /* * "0" neuron: mean=0, zero variance. */ network->rndbuf.ptr.p_double[entryoffs+0] = 0.0; network->rndbuf.ptr.p_double[entryoffs+1] = 0.0; continue; } if( neurontype==0 ) { /* * Adaptive summator neuron: * * calculate its mean and variance. * * we assume that weights of this neuron have unit variance and zero mean. * * thus, neuron's output is always have zero mean * * as for variance, it is a bit more interesting: * * let n[i] is i-th input neuron * * let w[i] is i-th weight * * we assume that n[i] and w[i] are independently distributed * * Var(n0*w0+n1*w1+...) = Var(n0*w0)+Var(n1*w1)+... 
* * Var(X*Y) = mean(X)^2*Var(Y) + mean(Y)^2*Var(X) + Var(X)*Var(Y) * * mean(w[i])=0, var(w[i])=1 * * Var(n[i]*w[i]) = mean(n[i])^2 + Var(n[i]) */ n1 = network->structinfo.ptr.p_int[istart+neuronidx*mlpbase_nfieldwidth+2]; n2 = n1+network->structinfo.ptr.p_int[istart+neuronidx*mlpbase_nfieldwidth+1]-1; vmean = 0.0; vvar = 0.0; for(i=n1; i<=n2; i++) { vvar = vvar+ae_sqr(network->rndbuf.ptr.p_double[entrysize*i+0], _state)+ae_sqr(network->rndbuf.ptr.p_double[entrysize*i+1], _state); } network->rndbuf.ptr.p_double[entryoffs+0] = vmean; network->rndbuf.ptr.p_double[entryoffs+1] = ae_sqrt(vvar, _state); continue; } if( neurontype==-5 ) { /* * Linear activation function */ i = network->structinfo.ptr.p_int[istart+neuronidx*mlpbase_nfieldwidth+2]; vmean = network->rndbuf.ptr.p_double[entrysize*i+0]; vvar = ae_sqr(network->rndbuf.ptr.p_double[entrysize*i+1], _state); if( ae_fp_greater(vvar,(double)(0)) ) { wscale = desiredsigma/ae_sqrt(vvar, _state); } else { wscale = 1.0; } mlpbase_randomizebackwardpass(network, i, wscale, _state); network->rndbuf.ptr.p_double[entryoffs+0] = vmean*wscale; network->rndbuf.ptr.p_double[entryoffs+1] = desiredsigma; continue; } if( neurontype>0 ) { /* * Nonlinear activation function: * * scale its inputs * * estimate mean/sigma of its output using Monte-Carlo method * (we simulate different inputs with unit deviation and * sample activation function output on such inputs) */ i = network->structinfo.ptr.p_int[istart+neuronidx*mlpbase_nfieldwidth+2]; vmean = network->rndbuf.ptr.p_double[entrysize*i+0]; vvar = ae_sqr(network->rndbuf.ptr.p_double[entrysize*i+1], _state); if( ae_fp_greater(vvar,(double)(0)) ) { wscale = desiredsigma/ae_sqrt(vvar, _state); } else { wscale = 1.0; } mlpbase_randomizebackwardpass(network, i, wscale, _state); ef = 0.0; ef2 = 0.0; vmean = vmean*wscale; for(i=0; i<=montecarlocnt-1; i++) { v = vmean+desiredsigma*hqrndnormal(&r, _state); ef = ef+v; ef2 = ef2+v*v; } ef = ef/montecarlocnt; ef2 = ef2/montecarlocnt; network->rndbuf.ptr.p_double[entryoffs+0] = ef; network->rndbuf.ptr.p_double[entryoffs+1] = ae_maxreal(ef2-ef*ef, 0.0, _state); continue; } ae_assert(ae_false, "MLPRandomize: unexpected neuron type", _state); } /* * Stage 3: generate weights. 
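     *
     * Each entry of Network.Weights currently holds the standard deviation
     * assigned to it by the previous stages. To make the bookkeeping concrete:
     * a summator fed by two zero-mean, unit-sigma inputs accumulates
     * VVar = (0^2+1^2)+(0^2+1^2) = 2, i.e. output sigma Sqrt(2) ~ 1.41; when such
     * a summator feeds an activation neuron, the backward pass rescales its
     * weights so that the activation input has sigma DesiredSigma=0.5. The loop
     * below then replaces each weight by a zero-mean normal sample with the
     * stored deviation.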
*/ for(i=0; i<=wcount-1; i++) { network->weights.ptr.p_double[i] = network->weights.ptr.p_double[i]*hqrndnormal(&r, _state); } ae_frame_leave(_state); } /************************************************************************* Randomization of neural network weights and standartisator -- ALGLIB -- Copyright 10.03.2008 by Bochkanov Sergey *************************************************************************/ void mlprandomizefull(multilayerperceptron* network, ae_state *_state) { ae_int_t i; ae_int_t nin; ae_int_t nout; ae_int_t wcount; ae_int_t ntotal; ae_int_t istart; ae_int_t offs; ae_int_t ntype; mlpproperties(network, &nin, &nout, &wcount, _state); ntotal = network->structinfo.ptr.p_int[3]; istart = network->structinfo.ptr.p_int[5]; /* * Process network */ mlprandomize(network, _state); for(i=0; i<=nin-1; i++) { network->columnmeans.ptr.p_double[i] = ae_randomreal(_state)-0.5; network->columnsigmas.ptr.p_double[i] = ae_randomreal(_state)+0.5; } if( !mlpissoftmax(network, _state) ) { for(i=0; i<=nout-1; i++) { offs = istart+(ntotal-nout+i)*mlpbase_nfieldwidth; ntype = network->structinfo.ptr.p_int[offs+0]; if( ntype==0 ) { /* * Shifts are changed only for linear outputs neurons */ network->columnmeans.ptr.p_double[nin+i] = 2*ae_randomreal(_state)-1; } if( ntype==0||ntype==3 ) { /* * Scales are changed only for linear or bounded outputs neurons. * Note that scale randomization preserves sign. */ network->columnsigmas.ptr.p_double[nin+i] = ae_sign(network->columnsigmas.ptr.p_double[nin+i], _state)*(1.5*ae_randomreal(_state)+0.5); } } } } /************************************************************************* Internal subroutine. -- ALGLIB -- Copyright 30.03.2008 by Bochkanov Sergey *************************************************************************/ void mlpinitpreprocessor(multilayerperceptron* network, /* Real */ ae_matrix* xy, ae_int_t ssize, ae_state *_state) { ae_frame _frame_block; ae_int_t i; ae_int_t j; ae_int_t jmax; ae_int_t nin; ae_int_t nout; ae_int_t wcount; ae_int_t ntotal; ae_int_t istart; ae_int_t offs; ae_int_t ntype; ae_vector means; ae_vector sigmas; double s; ae_frame_make(_state, &_frame_block); ae_vector_init(&means, 0, DT_REAL, _state); ae_vector_init(&sigmas, 0, DT_REAL, _state); mlpproperties(network, &nin, &nout, &wcount, _state); ntotal = network->structinfo.ptr.p_int[3]; istart = network->structinfo.ptr.p_int[5]; /* * Means/Sigmas */ if( mlpissoftmax(network, _state) ) { jmax = nin-1; } else { jmax = nin+nout-1; } ae_vector_set_length(&means, jmax+1, _state); ae_vector_set_length(&sigmas, jmax+1, _state); for(i=0; i<=jmax; i++) { means.ptr.p_double[i] = (double)(0); sigmas.ptr.p_double[i] = (double)(0); } for(i=0; i<=ssize-1; i++) { for(j=0; j<=jmax; j++) { means.ptr.p_double[j] = means.ptr.p_double[j]+xy->ptr.pp_double[i][j]; } } for(i=0; i<=jmax; i++) { means.ptr.p_double[i] = means.ptr.p_double[i]/ssize; } for(i=0; i<=ssize-1; i++) { for(j=0; j<=jmax; j++) { sigmas.ptr.p_double[j] = sigmas.ptr.p_double[j]+ae_sqr(xy->ptr.pp_double[i][j]-means.ptr.p_double[j], _state); } } for(i=0; i<=jmax; i++) { sigmas.ptr.p_double[i] = ae_sqrt(sigmas.ptr.p_double[i]/ssize, _state); } /* * Inputs */ for(i=0; i<=nin-1; i++) { network->columnmeans.ptr.p_double[i] = means.ptr.p_double[i]; network->columnsigmas.ptr.p_double[i] = sigmas.ptr.p_double[i]; if( ae_fp_eq(network->columnsigmas.ptr.p_double[i],(double)(0)) ) { network->columnsigmas.ptr.p_double[i] = (double)(1); } } /* * Outputs */ if( !mlpissoftmax(network, _state) ) { for(i=0; i<=nout-1; i++) { offs = 
istart+(ntotal-nout+i)*mlpbase_nfieldwidth; ntype = network->structinfo.ptr.p_int[offs+0]; /* * Linear outputs */ if( ntype==0 ) { network->columnmeans.ptr.p_double[nin+i] = means.ptr.p_double[nin+i]; network->columnsigmas.ptr.p_double[nin+i] = sigmas.ptr.p_double[nin+i]; if( ae_fp_eq(network->columnsigmas.ptr.p_double[nin+i],(double)(0)) ) { network->columnsigmas.ptr.p_double[nin+i] = (double)(1); } } /* * Bounded outputs (half-interval) */ if( ntype==3 ) { s = means.ptr.p_double[nin+i]-network->columnmeans.ptr.p_double[nin+i]; if( ae_fp_eq(s,(double)(0)) ) { s = (double)(ae_sign(network->columnsigmas.ptr.p_double[nin+i], _state)); } if( ae_fp_eq(s,(double)(0)) ) { s = 1.0; } network->columnsigmas.ptr.p_double[nin+i] = ae_sign(network->columnsigmas.ptr.p_double[nin+i], _state)*ae_fabs(s, _state); if( ae_fp_eq(network->columnsigmas.ptr.p_double[nin+i],(double)(0)) ) { network->columnsigmas.ptr.p_double[nin+i] = (double)(1); } } } } ae_frame_leave(_state); } /************************************************************************* Internal subroutine. Initialization for preprocessor based on a sample. INPUT Network - initialized neural network; XY - sample, given by sparse matrix; SSize - sample size. OUTPUT Network - neural network with initialised preprocessor. -- ALGLIB -- Copyright 26.07.2012 by Bochkanov Sergey *************************************************************************/ void mlpinitpreprocessorsparse(multilayerperceptron* network, sparsematrix* xy, ae_int_t ssize, ae_state *_state) { ae_frame _frame_block; ae_int_t jmax; ae_int_t nin; ae_int_t nout; ae_int_t wcount; ae_int_t ntotal; ae_int_t istart; ae_int_t offs; ae_int_t ntype; ae_vector means; ae_vector sigmas; double s; ae_int_t i; ae_int_t j; ae_frame_make(_state, &_frame_block); ae_vector_init(&means, 0, DT_REAL, _state); ae_vector_init(&sigmas, 0, DT_REAL, _state); mlpproperties(network, &nin, &nout, &wcount, _state); ntotal = network->structinfo.ptr.p_int[3]; istart = network->structinfo.ptr.p_int[5]; /* * Means/Sigmas */ if( mlpissoftmax(network, _state) ) { jmax = nin-1; } else { jmax = nin+nout-1; } ae_vector_set_length(&means, jmax+1, _state); ae_vector_set_length(&sigmas, jmax+1, _state); for(i=0; i<=jmax; i++) { means.ptr.p_double[i] = (double)(0); sigmas.ptr.p_double[i] = (double)(0); } for(i=0; i<=ssize-1; i++) { sparsegetrow(xy, i, &network->xyrow, _state); for(j=0; j<=jmax; j++) { means.ptr.p_double[j] = means.ptr.p_double[j]+network->xyrow.ptr.p_double[j]; } } for(i=0; i<=jmax; i++) { means.ptr.p_double[i] = means.ptr.p_double[i]/ssize; } for(i=0; i<=ssize-1; i++) { sparsegetrow(xy, i, &network->xyrow, _state); for(j=0; j<=jmax; j++) { sigmas.ptr.p_double[j] = sigmas.ptr.p_double[j]+ae_sqr(network->xyrow.ptr.p_double[j]-means.ptr.p_double[j], _state); } } for(i=0; i<=jmax; i++) { sigmas.ptr.p_double[i] = ae_sqrt(sigmas.ptr.p_double[i]/ssize, _state); } /* * Inputs */ for(i=0; i<=nin-1; i++) { network->columnmeans.ptr.p_double[i] = means.ptr.p_double[i]; network->columnsigmas.ptr.p_double[i] = sigmas.ptr.p_double[i]; if( ae_fp_eq(network->columnsigmas.ptr.p_double[i],(double)(0)) ) { network->columnsigmas.ptr.p_double[i] = (double)(1); } } /* * Outputs */ if( !mlpissoftmax(network, _state) ) { for(i=0; i<=nout-1; i++) { offs = istart+(ntotal-nout+i)*mlpbase_nfieldwidth; ntype = network->structinfo.ptr.p_int[offs+0]; /* * Linear outputs */ if( ntype==0 ) { network->columnmeans.ptr.p_double[nin+i] = means.ptr.p_double[nin+i]; network->columnsigmas.ptr.p_double[nin+i] = sigmas.ptr.p_double[nin+i]; if( 
ae_fp_eq(network->columnsigmas.ptr.p_double[nin+i],(double)(0)) ) { network->columnsigmas.ptr.p_double[nin+i] = (double)(1); } } /* * Bounded outputs (half-interval) */ if( ntype==3 ) { s = means.ptr.p_double[nin+i]-network->columnmeans.ptr.p_double[nin+i]; if( ae_fp_eq(s,(double)(0)) ) { s = (double)(ae_sign(network->columnsigmas.ptr.p_double[nin+i], _state)); } if( ae_fp_eq(s,(double)(0)) ) { s = 1.0; } network->columnsigmas.ptr.p_double[nin+i] = ae_sign(network->columnsigmas.ptr.p_double[nin+i], _state)*ae_fabs(s, _state); if( ae_fp_eq(network->columnsigmas.ptr.p_double[nin+i],(double)(0)) ) { network->columnsigmas.ptr.p_double[nin+i] = (double)(1); } } } } ae_frame_leave(_state); } /************************************************************************* Internal subroutine. Initialization for preprocessor based on a subsample. INPUT PARAMETERS: Network - network initialized with one of the network creation funcs XY - original dataset; one sample = one row; first NIn columns contain inputs, next NOut columns - desired outputs. SetSize - real size of XY, SetSize>=0; Idx - subset of SubsetSize elements, array[SubsetSize]: * Idx[I] stores row index in the original dataset which is given by XY. Gradient is calculated with respect to rows whose indexes are stored in Idx[]. * Idx[] must store correct indexes; this function throws an exception in case incorrect index (less than 0 or larger than rows(XY)) is given * Idx[] may store indexes in any order and even with repetitions. SubsetSize- number of elements in Idx[] array. OUTPUT: Network - neural network with initialised preprocessor. NOTE: when SubsetSize<0 is used full dataset by call MLPInitPreprocessor function. -- ALGLIB -- Copyright 23.08.2012 by Bochkanov Sergey *************************************************************************/ void mlpinitpreprocessorsubset(multilayerperceptron* network, /* Real */ ae_matrix* xy, ae_int_t setsize, /* Integer */ ae_vector* idx, ae_int_t subsetsize, ae_state *_state) { ae_frame _frame_block; ae_int_t jmax; ae_int_t nin; ae_int_t nout; ae_int_t wcount; ae_int_t ntotal; ae_int_t istart; ae_int_t offs; ae_int_t ntype; ae_vector means; ae_vector sigmas; double s; ae_int_t npoints; ae_int_t i; ae_int_t j; ae_frame_make(_state, &_frame_block); ae_vector_init(&means, 0, DT_REAL, _state); ae_vector_init(&sigmas, 0, DT_REAL, _state); ae_assert(setsize>=0, "MLPInitPreprocessorSubset: SetSize<0", _state); if( subsetsize<0 ) { mlpinitpreprocessor(network, xy, setsize, _state); ae_frame_leave(_state); return; } ae_assert(subsetsize<=idx->cnt, "MLPInitPreprocessorSubset: SubsetSize>Length(Idx)", _state); npoints = setsize; for(i=0; i<=subsetsize-1; i++) { ae_assert(idx->ptr.p_int[i]>=0, "MLPInitPreprocessorSubset: incorrect index of XY row(Idx[I]<0)", _state); ae_assert(idx->ptr.p_int[i]<=npoints-1, "MLPInitPreprocessorSubset: incorrect index of XY row(Idx[I]>Rows(XY)-1)", _state); } mlpproperties(network, &nin, &nout, &wcount, _state); ntotal = network->structinfo.ptr.p_int[3]; istart = network->structinfo.ptr.p_int[5]; /* * Means/Sigmas */ if( mlpissoftmax(network, _state) ) { jmax = nin-1; } else { jmax = nin+nout-1; } ae_vector_set_length(&means, jmax+1, _state); ae_vector_set_length(&sigmas, jmax+1, _state); for(i=0; i<=jmax; i++) { means.ptr.p_double[i] = (double)(0); sigmas.ptr.p_double[i] = (double)(0); } for(i=0; i<=subsetsize-1; i++) { for(j=0; j<=jmax; j++) { means.ptr.p_double[j] = means.ptr.p_double[j]+xy->ptr.pp_double[idx->ptr.p_int[i]][j]; } } for(i=0; i<=jmax; i++) { means.ptr.p_double[i] 
= means.ptr.p_double[i]/subsetsize; } for(i=0; i<=subsetsize-1; i++) { for(j=0; j<=jmax; j++) { sigmas.ptr.p_double[j] = sigmas.ptr.p_double[j]+ae_sqr(xy->ptr.pp_double[idx->ptr.p_int[i]][j]-means.ptr.p_double[j], _state); } } for(i=0; i<=jmax; i++) { sigmas.ptr.p_double[i] = ae_sqrt(sigmas.ptr.p_double[i]/subsetsize, _state); } /* * Inputs */ for(i=0; i<=nin-1; i++) { network->columnmeans.ptr.p_double[i] = means.ptr.p_double[i]; network->columnsigmas.ptr.p_double[i] = sigmas.ptr.p_double[i]; if( ae_fp_eq(network->columnsigmas.ptr.p_double[i],(double)(0)) ) { network->columnsigmas.ptr.p_double[i] = (double)(1); } } /* * Outputs */ if( !mlpissoftmax(network, _state) ) { for(i=0; i<=nout-1; i++) { offs = istart+(ntotal-nout+i)*mlpbase_nfieldwidth; ntype = network->structinfo.ptr.p_int[offs+0]; /* * Linear outputs */ if( ntype==0 ) { network->columnmeans.ptr.p_double[nin+i] = means.ptr.p_double[nin+i]; network->columnsigmas.ptr.p_double[nin+i] = sigmas.ptr.p_double[nin+i]; if( ae_fp_eq(network->columnsigmas.ptr.p_double[nin+i],(double)(0)) ) { network->columnsigmas.ptr.p_double[nin+i] = (double)(1); } } /* * Bounded outputs (half-interval) */ if( ntype==3 ) { s = means.ptr.p_double[nin+i]-network->columnmeans.ptr.p_double[nin+i]; if( ae_fp_eq(s,(double)(0)) ) { s = (double)(ae_sign(network->columnsigmas.ptr.p_double[nin+i], _state)); } if( ae_fp_eq(s,(double)(0)) ) { s = 1.0; } network->columnsigmas.ptr.p_double[nin+i] = ae_sign(network->columnsigmas.ptr.p_double[nin+i], _state)*ae_fabs(s, _state); if( ae_fp_eq(network->columnsigmas.ptr.p_double[nin+i],(double)(0)) ) { network->columnsigmas.ptr.p_double[nin+i] = (double)(1); } } } } ae_frame_leave(_state); } /************************************************************************* Internal subroutine. Initialization for preprocessor based on a subsample. INPUT PARAMETERS: Network - network initialized with one of the network creation funcs XY - original dataset, given by sparse matrix; one sample = one row; first NIn columns contain inputs, next NOut columns - desired outputs. SetSize - real size of XY, SetSize>=0; Idx - subset of SubsetSize elements, array[SubsetSize]: * Idx[I] stores row index in the original dataset which is given by XY. Gradient is calculated with respect to rows whose indexes are stored in Idx[]. * Idx[] must store correct indexes; this function throws an exception in case incorrect index (less than 0 or larger than rows(XY)) is given * Idx[] may store indexes in any order and even with repetitions. SubsetSize- number of elements in Idx[] array. OUTPUT: Network - neural network with initialised preprocessor. NOTE: when SubsetSize<0 is used full dataset by call MLPInitPreprocessorSparse function. 
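For all of the preprocessor initializers in this group the per-column statistics follow the same rule as the dense version above: Mean is the plain average over the selected rows, and Sigma is the square root of the averaged squared deviations. For example, a column whose selected values are 1, 2 and 3 gets Mean = 2 and Sigma = sqrt(((1-2)^2 + (2-2)^2 + (3-2)^2)/3) = sqrt(2/3), roughly 0.816; a constant column gets Sigma = 0, which is then replaced by 1 so that the input transformation (IN-Mean)/Sigma stays well defined.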
-- ALGLIB -- Copyright 26.07.2012 by Bochkanov Sergey *************************************************************************/ void mlpinitpreprocessorsparsesubset(multilayerperceptron* network, sparsematrix* xy, ae_int_t setsize, /* Integer */ ae_vector* idx, ae_int_t subsetsize, ae_state *_state) { ae_frame _frame_block; ae_int_t jmax; ae_int_t nin; ae_int_t nout; ae_int_t wcount; ae_int_t ntotal; ae_int_t istart; ae_int_t offs; ae_int_t ntype; ae_vector means; ae_vector sigmas; double s; ae_int_t npoints; ae_int_t i; ae_int_t j; ae_frame_make(_state, &_frame_block); ae_vector_init(&means, 0, DT_REAL, _state); ae_vector_init(&sigmas, 0, DT_REAL, _state); ae_assert(setsize>=0, "MLPInitPreprocessorSparseSubset: SetSize<0", _state); if( subsetsize<0 ) { mlpinitpreprocessorsparse(network, xy, setsize, _state); ae_frame_leave(_state); return; } ae_assert(subsetsize<=idx->cnt, "MLPInitPreprocessorSparseSubset: SubsetSize>Length(Idx)", _state); npoints = setsize; for(i=0; i<=subsetsize-1; i++) { ae_assert(idx->ptr.p_int[i]>=0, "MLPInitPreprocessorSparseSubset: incorrect index of XY row(Idx[I]<0)", _state); ae_assert(idx->ptr.p_int[i]<=npoints-1, "MLPInitPreprocessorSparseSubset: incorrect index of XY row(Idx[I]>Rows(XY)-1)", _state); } mlpproperties(network, &nin, &nout, &wcount, _state); ntotal = network->structinfo.ptr.p_int[3]; istart = network->structinfo.ptr.p_int[5]; /* * Means/Sigmas */ if( mlpissoftmax(network, _state) ) { jmax = nin-1; } else { jmax = nin+nout-1; } ae_vector_set_length(&means, jmax+1, _state); ae_vector_set_length(&sigmas, jmax+1, _state); for(i=0; i<=jmax; i++) { means.ptr.p_double[i] = (double)(0); sigmas.ptr.p_double[i] = (double)(0); } for(i=0; i<=subsetsize-1; i++) { sparsegetrow(xy, idx->ptr.p_int[i], &network->xyrow, _state); for(j=0; j<=jmax; j++) { means.ptr.p_double[j] = means.ptr.p_double[j]+network->xyrow.ptr.p_double[j]; } } for(i=0; i<=jmax; i++) { means.ptr.p_double[i] = means.ptr.p_double[i]/subsetsize; } for(i=0; i<=subsetsize-1; i++) { sparsegetrow(xy, idx->ptr.p_int[i], &network->xyrow, _state); for(j=0; j<=jmax; j++) { sigmas.ptr.p_double[j] = sigmas.ptr.p_double[j]+ae_sqr(network->xyrow.ptr.p_double[j]-means.ptr.p_double[j], _state); } } for(i=0; i<=jmax; i++) { sigmas.ptr.p_double[i] = ae_sqrt(sigmas.ptr.p_double[i]/subsetsize, _state); } /* * Inputs */ for(i=0; i<=nin-1; i++) { network->columnmeans.ptr.p_double[i] = means.ptr.p_double[i]; network->columnsigmas.ptr.p_double[i] = sigmas.ptr.p_double[i]; if( ae_fp_eq(network->columnsigmas.ptr.p_double[i],(double)(0)) ) { network->columnsigmas.ptr.p_double[i] = (double)(1); } } /* * Outputs */ if( !mlpissoftmax(network, _state) ) { for(i=0; i<=nout-1; i++) { offs = istart+(ntotal-nout+i)*mlpbase_nfieldwidth; ntype = network->structinfo.ptr.p_int[offs+0]; /* * Linear outputs */ if( ntype==0 ) { network->columnmeans.ptr.p_double[nin+i] = means.ptr.p_double[nin+i]; network->columnsigmas.ptr.p_double[nin+i] = sigmas.ptr.p_double[nin+i]; if( ae_fp_eq(network->columnsigmas.ptr.p_double[nin+i],(double)(0)) ) { network->columnsigmas.ptr.p_double[nin+i] = (double)(1); } } /* * Bounded outputs (half-interval) */ if( ntype==3 ) { s = means.ptr.p_double[nin+i]-network->columnmeans.ptr.p_double[nin+i]; if( ae_fp_eq(s,(double)(0)) ) { s = (double)(ae_sign(network->columnsigmas.ptr.p_double[nin+i], _state)); } if( ae_fp_eq(s,(double)(0)) ) { s = 1.0; } network->columnsigmas.ptr.p_double[nin+i] = ae_sign(network->columnsigmas.ptr.p_double[nin+i], _state)*ae_fabs(s, _state); if( 
ae_fp_eq(network->columnsigmas.ptr.p_double[nin+i],(double)(0)) ) { network->columnsigmas.ptr.p_double[nin+i] = (double)(1); } } } } ae_frame_leave(_state); } /************************************************************************* Returns information about initialized network: number of inputs, outputs, weights. -- ALGLIB -- Copyright 04.11.2007 by Bochkanov Sergey *************************************************************************/ void mlpproperties(multilayerperceptron* network, ae_int_t* nin, ae_int_t* nout, ae_int_t* wcount, ae_state *_state) { *nin = 0; *nout = 0; *wcount = 0; *nin = network->structinfo.ptr.p_int[1]; *nout = network->structinfo.ptr.p_int[2]; *wcount = network->structinfo.ptr.p_int[4]; } /************************************************************************* Returns number of "internal", low-level neurons in the network (one which is stored in StructInfo). -- ALGLIB -- Copyright 04.11.2007 by Bochkanov Sergey *************************************************************************/ ae_int_t mlpntotal(multilayerperceptron* network, ae_state *_state) { ae_int_t result; result = network->structinfo.ptr.p_int[3]; return result; } /************************************************************************* Returns number of inputs. -- ALGLIB -- Copyright 19.10.2011 by Bochkanov Sergey *************************************************************************/ ae_int_t mlpgetinputscount(multilayerperceptron* network, ae_state *_state) { ae_int_t result; result = network->structinfo.ptr.p_int[1]; return result; } /************************************************************************* Returns number of outputs. -- ALGLIB -- Copyright 19.10.2011 by Bochkanov Sergey *************************************************************************/ ae_int_t mlpgetoutputscount(multilayerperceptron* network, ae_state *_state) { ae_int_t result; result = network->structinfo.ptr.p_int[2]; return result; } /************************************************************************* Returns number of weights. -- ALGLIB -- Copyright 19.10.2011 by Bochkanov Sergey *************************************************************************/ ae_int_t mlpgetweightscount(multilayerperceptron* network, ae_state *_state) { ae_int_t result; result = network->structinfo.ptr.p_int[4]; return result; } /************************************************************************* Tells whether network is SOFTMAX-normalized (i.e. classifier) or not. -- ALGLIB -- Copyright 04.11.2007 by Bochkanov Sergey *************************************************************************/ ae_bool mlpissoftmax(multilayerperceptron* network, ae_state *_state) { ae_bool result; result = network->structinfo.ptr.p_int[6]==1; return result; } /************************************************************************* This function returns total number of layers (including input, hidden and output layers). -- ALGLIB -- Copyright 25.03.2011 by Bochkanov Sergey *************************************************************************/ ae_int_t mlpgetlayerscount(multilayerperceptron* network, ae_state *_state) { ae_int_t result; result = network->hllayersizes.cnt; return result; } /************************************************************************* This function returns size of K-th layer. K=0 corresponds to input layer, K=CNT-1 corresponds to output layer. 
Size of the output layer is always equal to the number of outputs, although when we have softmax-normalized network, last neuron doesn't have any connections - it is just zero. -- ALGLIB -- Copyright 25.03.2011 by Bochkanov Sergey *************************************************************************/ ae_int_t mlpgetlayersize(multilayerperceptron* network, ae_int_t k, ae_state *_state) { ae_int_t result; ae_assert(k>=0&&khllayersizes.cnt, "MLPGetLayerSize: incorrect layer index", _state); result = network->hllayersizes.ptr.p_int[k]; return result; } /************************************************************************* This function returns offset/scaling coefficients for I-th input of the network. INPUT PARAMETERS: Network - network I - input index OUTPUT PARAMETERS: Mean - mean term Sigma - sigma term, guaranteed to be nonzero. I-th input is passed through linear transformation IN[i] = (IN[i]-Mean)/Sigma before feeding to the network -- ALGLIB -- Copyright 25.03.2011 by Bochkanov Sergey *************************************************************************/ void mlpgetinputscaling(multilayerperceptron* network, ae_int_t i, double* mean, double* sigma, ae_state *_state) { *mean = 0; *sigma = 0; ae_assert(i>=0&&ihllayersizes.ptr.p_int[0], "MLPGetInputScaling: incorrect (nonexistent) I", _state); *mean = network->columnmeans.ptr.p_double[i]; *sigma = network->columnsigmas.ptr.p_double[i]; if( ae_fp_eq(*sigma,(double)(0)) ) { *sigma = (double)(1); } } /************************************************************************* This function returns offset/scaling coefficients for I-th output of the network. INPUT PARAMETERS: Network - network I - input index OUTPUT PARAMETERS: Mean - mean term Sigma - sigma term, guaranteed to be nonzero. I-th output is passed through linear transformation OUT[i] = OUT[i]*Sigma+Mean before returning it to user. In case we have SOFTMAX-normalized network, we return (Mean,Sigma)=(0.0,1.0). -- ALGLIB -- Copyright 25.03.2011 by Bochkanov Sergey *************************************************************************/ void mlpgetoutputscaling(multilayerperceptron* network, ae_int_t i, double* mean, double* sigma, ae_state *_state) { *mean = 0; *sigma = 0; ae_assert(i>=0&&ihllayersizes.ptr.p_int[network->hllayersizes.cnt-1], "MLPGetOutputScaling: incorrect (nonexistent) I", _state); if( network->structinfo.ptr.p_int[6]==1 ) { *mean = (double)(0); *sigma = (double)(1); } else { *mean = network->columnmeans.ptr.p_double[network->hllayersizes.ptr.p_int[0]+i]; *sigma = network->columnsigmas.ptr.p_double[network->hllayersizes.ptr.p_int[0]+i]; } } /************************************************************************* This function returns information about Ith neuron of Kth layer INPUT PARAMETERS: Network - network K - layer index I - neuron index (within layer) OUTPUT PARAMETERS: FKind - activation function type (used by MLPActivationFunction()) this value is zero for input or linear neurons Threshold - also called offset, bias zero for input neurons NOTE: this function throws exception if layer or neuron with given index do not exists. 
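To illustrate the query functions above (global dimensions, per-layer sizes and the input/output scaling coefficients), a minimal sketch against the high-level C++ wrappers; the wrapper names, the dataanalysis.h header and the network sizes are assumptions made for illustration:

    #include "dataanalysis.h"
    #include <cstdio>
    using namespace alglib;

    void inspect_network_example()
    {
        multilayerperceptron net;
        mlpcreater1(3, 10, 2, -1.0, +1.0, net);

        // Global dimensions: inputs, outputs, number of tunable weights.
        ae_int_t nin, nout, wcount;
        mlpproperties(net, nin, nout, wcount);
        printf("in=%d out=%d w=%d softmax=%d\n",
               (int)nin, (int)nout, (int)wcount, (int)mlpissoftmax(net));

        // Per-layer sizes: layer 0 is the input layer, layer CNT-1 the output layer.
        for(ae_int_t k=0; k<mlpgetlayerscount(net); k++)
            printf("layer %d: %d neurons\n", (int)k, (int)mlpgetlayersize(net, k));

        // Scaling coefficients: inputs are mapped as (IN[i]-Mean)/Sigma,
        // outputs as OUT[i]*Sigma+Mean; Sigma is guaranteed to be nonzero.
        double mean, sigma;
        mlpgetinputscaling(net, 0, mean, sigma);
        mlpgetoutputscaling(net, 0, mean, sigma);
    }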
-- ALGLIB -- Copyright 25.03.2011 by Bochkanov Sergey *************************************************************************/ void mlpgetneuroninfo(multilayerperceptron* network, ae_int_t k, ae_int_t i, ae_int_t* fkind, double* threshold, ae_state *_state) { ae_int_t ncnt; ae_int_t istart; ae_int_t highlevelidx; ae_int_t activationoffset; *fkind = 0; *threshold = 0; ncnt = network->hlneurons.cnt/mlpbase_hlnfieldwidth; istart = network->structinfo.ptr.p_int[5]; /* * search */ network->integerbuf.ptr.p_int[0] = k; network->integerbuf.ptr.p_int[1] = i; highlevelidx = recsearch(&network->hlneurons, mlpbase_hlnfieldwidth, 2, 0, ncnt, &network->integerbuf, _state); ae_assert(highlevelidx>=0, "MLPGetNeuronInfo: incorrect (nonexistent) layer or neuron index", _state); /* * 1. find offset of the activation function record in the */ if( network->hlneurons.ptr.p_int[highlevelidx*mlpbase_hlnfieldwidth+2]>=0 ) { activationoffset = istart+network->hlneurons.ptr.p_int[highlevelidx*mlpbase_hlnfieldwidth+2]*mlpbase_nfieldwidth; *fkind = network->structinfo.ptr.p_int[activationoffset+0]; } else { *fkind = 0; } if( network->hlneurons.ptr.p_int[highlevelidx*mlpbase_hlnfieldwidth+3]>=0 ) { *threshold = network->weights.ptr.p_double[network->hlneurons.ptr.p_int[highlevelidx*mlpbase_hlnfieldwidth+3]]; } else { *threshold = (double)(0); } } /************************************************************************* This function returns information about connection from I0-th neuron of K0-th layer to I1-th neuron of K1-th layer. INPUT PARAMETERS: Network - network K0 - layer index I0 - neuron index (within layer) K1 - layer index I1 - neuron index (within layer) RESULT: connection weight (zero for non-existent connections) This function: 1. throws exception if layer or neuron with given index do not exists. 2. returns zero if neurons exist, but there is no connection between them -- ALGLIB -- Copyright 25.03.2011 by Bochkanov Sergey *************************************************************************/ double mlpgetweight(multilayerperceptron* network, ae_int_t k0, ae_int_t i0, ae_int_t k1, ae_int_t i1, ae_state *_state) { ae_int_t ccnt; ae_int_t highlevelidx; double result; ccnt = network->hlconnections.cnt/mlpbase_hlconnfieldwidth; /* * check params */ ae_assert(k0>=0&&k0hllayersizes.cnt, "MLPGetWeight: incorrect (nonexistent) K0", _state); ae_assert(i0>=0&&i0hllayersizes.ptr.p_int[k0], "MLPGetWeight: incorrect (nonexistent) I0", _state); ae_assert(k1>=0&&k1hllayersizes.cnt, "MLPGetWeight: incorrect (nonexistent) K1", _state); ae_assert(i1>=0&&i1hllayersizes.ptr.p_int[k1], "MLPGetWeight: incorrect (nonexistent) I1", _state); /* * search */ network->integerbuf.ptr.p_int[0] = k0; network->integerbuf.ptr.p_int[1] = i0; network->integerbuf.ptr.p_int[2] = k1; network->integerbuf.ptr.p_int[3] = i1; highlevelidx = recsearch(&network->hlconnections, mlpbase_hlconnfieldwidth, 4, 0, ccnt, &network->integerbuf, _state); if( highlevelidx>=0 ) { result = network->weights.ptr.p_double[network->hlconnections.ptr.p_int[highlevelidx*mlpbase_hlconnfieldwidth+4]]; } else { result = (double)(0); } return result; } /************************************************************************* This function sets offset/scaling coefficients for I-th input of the network. INPUT PARAMETERS: Network - network I - input index Mean - mean term Sigma - sigma term (if zero, will be replaced by 1.0) NTE: I-th input is passed through linear transformation IN[i] = (IN[i]-Mean)/Sigma before feeding to the network. 
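A sketch of the per-neuron and per-connection queries MLPGetNeuronInfo and MLPGetWeight above, again through the assumed C++ wrappers; layer indices use the high-level numbering in which layer 0 is the input layer:

    #include "dataanalysis.h"
    using namespace alglib;

    void query_neurons_example()
    {
        multilayerperceptron net;
        mlpcreate1(2, 5, 1, net);

        // Activation kind and threshold (bias) of the first hidden neuron.
        ae_int_t fkind;
        double threshold;
        mlpgetneuroninfo(net, 1, 0, fkind, threshold);

        // Weight of the connection from input 0 to hidden neuron 0;
        // zero is returned when the two neurons are not connected.
        double w = mlpgetweight(net, 0, 0, 1, 0);
    }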
This function sets Mean and Sigma. -- ALGLIB -- Copyright 25.03.2011 by Bochkanov Sergey *************************************************************************/ void mlpsetinputscaling(multilayerperceptron* network, ae_int_t i, double mean, double sigma, ae_state *_state) { ae_assert(i>=0&&i<network->hllayersizes.ptr.p_int[0], "MLPSetInputScaling: incorrect (nonexistent) I", _state); ae_assert(ae_isfinite(mean, _state), "MLPSetInputScaling: infinite or NAN Mean", _state); ae_assert(ae_isfinite(sigma, _state), "MLPSetInputScaling: infinite or NAN Sigma", _state); if( ae_fp_eq(sigma,(double)(0)) ) { sigma = (double)(1); } network->columnmeans.ptr.p_double[i] = mean; network->columnsigmas.ptr.p_double[i] = sigma; }
/************************************************************************* This function sets offset/scaling coefficients for I-th output of the network. INPUT PARAMETERS: Network - network I - output index Mean - mean term Sigma - sigma term (if zero, will be replaced by 1.0) NOTE: I-th output is passed through linear transformation OUT[i] = OUT[i]*Sigma+Mean before returning it to user. This function sets Sigma/Mean. In case we have SOFTMAX-normalized network, you can not set (Sigma,Mean) to anything other than (0.0,1.0) - this function will throw an exception. -- ALGLIB -- Copyright 25.03.2011 by Bochkanov Sergey *************************************************************************/ void mlpsetoutputscaling(multilayerperceptron* network, ae_int_t i, double mean, double sigma, ae_state *_state) { ae_assert(i>=0&&i<network->hllayersizes.ptr.p_int[network->hllayersizes.cnt-1], "MLPSetOutputScaling: incorrect (nonexistent) I", _state); ae_assert(ae_isfinite(mean, _state), "MLPSetOutputScaling: infinite or NAN Mean", _state); ae_assert(ae_isfinite(sigma, _state), "MLPSetOutputScaling: infinite or NAN Sigma", _state); if( network->structinfo.ptr.p_int[6]==1 ) { ae_assert(ae_fp_eq(mean,(double)(0)), "MLPSetOutputScaling: you can not set non-zero Mean term for classifier network", _state); ae_assert(ae_fp_eq(sigma,(double)(1)), "MLPSetOutputScaling: you can not set non-unit Sigma term for classifier network", _state); } else { if( ae_fp_eq(sigma,(double)(0)) ) { sigma = (double)(1); } network->columnmeans.ptr.p_double[network->hllayersizes.ptr.p_int[0]+i] = mean; network->columnsigmas.ptr.p_double[network->hllayersizes.ptr.p_int[0]+i] = sigma; } }
/************************************************************************* This function modifies information about Ith neuron of Kth layer INPUT PARAMETERS: Network - network K - layer index I - neuron index (within layer) FKind - activation function type (used by MLPActivationFunction()) this value must be zero for input neurons (you can not set activation function for input neurons) Threshold - also called offset, bias this value must be zero for input neurons (you can not set threshold for input neurons) NOTES: 1. this function throws an exception if the layer or neuron with the given index does not exist. 2. this function also throws an exception when you try to set a non-linear activation function for input neurons (any kind of network) or for output neurons of a classifier network. 3. this function throws an exception when you try to set a non-zero threshold for input neurons (any kind of network).
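These setters are typically used to push dataset statistics or a hand-designed architecture into an existing network. A hedged C++ sketch (wrapper names assumed to mirror the core functions above; 'net' is the network from the previous sketch, 'm' and 's' are statistics you computed from your own data):

    alglib::mlpsetinputscaling(net, 0, m, s);      // input 0 will be fed as (IN[0]-m)/s
    alglib::mlpsetneuroninfo(net, 1, 0, 1, 0.0);   // layer 1, neuron 0: FKind=1 (TanH, see below), zero threshold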
-- ALGLIB -- Copyright 25.03.2011 by Bochkanov Sergey *************************************************************************/ void mlpsetneuroninfo(multilayerperceptron* network, ae_int_t k, ae_int_t i, ae_int_t fkind, double threshold, ae_state *_state) { ae_int_t ncnt; ae_int_t istart; ae_int_t highlevelidx; ae_int_t activationoffset; ae_assert(ae_isfinite(threshold, _state), "MLPSetNeuronInfo: infinite or NAN Threshold", _state); /* * convenience vars */ ncnt = network->hlneurons.cnt/mlpbase_hlnfieldwidth; istart = network->structinfo.ptr.p_int[5]; /* * search */ network->integerbuf.ptr.p_int[0] = k; network->integerbuf.ptr.p_int[1] = i; highlevelidx = recsearch(&network->hlneurons, mlpbase_hlnfieldwidth, 2, 0, ncnt, &network->integerbuf, _state); ae_assert(highlevelidx>=0, "MLPSetNeuronInfo: incorrect (nonexistent) layer or neuron index", _state); /* * activation function */ if( network->hlneurons.ptr.p_int[highlevelidx*mlpbase_hlnfieldwidth+2]>=0 ) { activationoffset = istart+network->hlneurons.ptr.p_int[highlevelidx*mlpbase_hlnfieldwidth+2]*mlpbase_nfieldwidth; network->structinfo.ptr.p_int[activationoffset+0] = fkind; } else { ae_assert(fkind==0, "MLPSetNeuronInfo: you try to set activation function for neuron which can not have one", _state); } /* * Threshold */ if( network->hlneurons.ptr.p_int[highlevelidx*mlpbase_hlnfieldwidth+3]>=0 ) { network->weights.ptr.p_double[network->hlneurons.ptr.p_int[highlevelidx*mlpbase_hlnfieldwidth+3]] = threshold; } else { ae_assert(ae_fp_eq(threshold,(double)(0)), "MLPSetNeuronInfo: you try to set non-zero threshold for neuron which can not have one", _state); } } /************************************************************************* This function modifies information about connection from I0-th neuron of K0-th layer to I1-th neuron of K1-th layer. INPUT PARAMETERS: Network - network K0 - layer index I0 - neuron index (within layer) K1 - layer index I1 - neuron index (within layer) W - connection weight (must be zero for non-existent connections) This function: 1. throws exception if layer or neuron with given index do not exists. 2. 
throws an exception if you try to set a non-zero weight for a non-existent connection -- ALGLIB -- Copyright 25.03.2011 by Bochkanov Sergey *************************************************************************/ void mlpsetweight(multilayerperceptron* network, ae_int_t k0, ae_int_t i0, ae_int_t k1, ae_int_t i1, double w, ae_state *_state) { ae_int_t ccnt; ae_int_t highlevelidx; ccnt = network->hlconnections.cnt/mlpbase_hlconnfieldwidth; /* * check params */ ae_assert(k0>=0&&k0<network->hllayersizes.cnt, "MLPSetWeight: incorrect (nonexistent) K0", _state); ae_assert(i0>=0&&i0<network->hllayersizes.ptr.p_int[k0], "MLPSetWeight: incorrect (nonexistent) I0", _state); ae_assert(k1>=0&&k1<network->hllayersizes.cnt, "MLPSetWeight: incorrect (nonexistent) K1", _state); ae_assert(i1>=0&&i1<network->hllayersizes.ptr.p_int[k1], "MLPSetWeight: incorrect (nonexistent) I1", _state); ae_assert(ae_isfinite(w, _state), "MLPSetWeight: infinite or NAN weight", _state); /* * search */ network->integerbuf.ptr.p_int[0] = k0; network->integerbuf.ptr.p_int[1] = i0; network->integerbuf.ptr.p_int[2] = k1; network->integerbuf.ptr.p_int[3] = i1; highlevelidx = recsearch(&network->hlconnections, mlpbase_hlconnfieldwidth, 4, 0, ccnt, &network->integerbuf, _state); if( highlevelidx>=0 ) { network->weights.ptr.p_double[network->hlconnections.ptr.p_int[highlevelidx*mlpbase_hlconnfieldwidth+4]] = w; } else { ae_assert(ae_fp_eq(w,(double)(0)), "MLPSetWeight: you try to set non-zero weight for non-existent connection", _state); } }
/************************************************************************* Neural network activation function INPUT PARAMETERS: NET - neuron input K - function index (zero for linear function) OUTPUT PARAMETERS: F - function DF - its derivative D2F - its second derivative -- ALGLIB -- Copyright 04.11.2007 by Bochkanov Sergey *************************************************************************/ void mlpactivationfunction(double net, ae_int_t k, double* f, double* df, double* d2f, ae_state *_state) { double net2; double arg; double root; double r; *f = 0; *df = 0; *d2f = 0; if( k==0||k==-5 ) { *f = net; *df = (double)(1); *d2f = (double)(0); return; } if( k==1 ) { /* * TanH activation function */ if( ae_fp_less(ae_fabs(net, _state),(double)(100)) ) { *f = ae_tanh(net, _state); } else { *f = (double)(ae_sign(net, _state)); } *df = 1-*f*(*f); *d2f = -2*(*f)*(*df); return; } if( k==3 ) { /* * EX activation function */ if( ae_fp_greater_eq(net,(double)(0)) ) { net2 = net*net; arg = net2+1; root = ae_sqrt(arg, _state); *f = net+root; r = net/root; *df = 1+r; *d2f = (root-net*r)/arg; } else { *f = ae_exp(net, _state); *df = *f; *d2f = *f; } return; } if( k==2 ) { *f = ae_exp(-ae_sqr(net, _state), _state); *df = -2*net*(*f); *d2f = -2*(*f+*df*net); return; } *f = (double)(0); *df = (double)(0); *d2f = (double)(0); }
/************************************************************************* Processing INPUT PARAMETERS: Network - neural network X - input vector, array[0..NIn-1]. OUTPUT PARAMETERS: Y - result. Regression estimate when solving regression task, vector of posterior probabilities for classification task.
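For the TanH case (K=1) the code above evaluates F=tanh(NET), DF=1-F^2 and D2F=-2*F*DF. A small hedged check through the C++ wrapper, assuming mlpactivationfunction() is exposed there with reference output parameters:

    double f, df, d2f;
    alglib::mlpactivationfunction(0.5, 1, f, df, d2f);   // TanH at NET=0.5
    // expected: f ~ 0.4621, df = 1-f*f ~ 0.7864, d2f = -2*f*df ~ -0.7269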
See also MLPProcessI -- ALGLIB -- Copyright 04.11.2007 by Bochkanov Sergey *************************************************************************/ void mlpprocess(multilayerperceptron* network, /* Real */ ae_vector* x, /* Real */ ae_vector* y, ae_state *_state) { if( y->cnt<network->structinfo.ptr.p_int[2] ) { ae_vector_set_length(y, network->structinfo.ptr.p_int[2], _state); } mlpinternalprocessvector(&network->structinfo, &network->weights, &network->columnmeans, &network->columnsigmas, &network->neurons, &network->dfdnet, x, y, _state); }
/************************************************************************* 'interactive' variant of MLPProcess for languages like Python which support constructs like "Y = MLPProcess(NN,X)" and interactive mode of the interpreter. This function allocates a new array on each call, so it is significantly slower than its 'non-interactive' counterpart, but it is more convenient when you call it from command line. -- ALGLIB -- Copyright 21.09.2010 by Bochkanov Sergey *************************************************************************/ void mlpprocessi(multilayerperceptron* network, /* Real */ ae_vector* x, /* Real */ ae_vector* y, ae_state *_state) { ae_vector_clear(y); mlpprocess(network, x, y, _state); }
/************************************************************************* Error of the neural network on dataset. FOR USERS OF COMMERCIAL EDITION: ! Commercial version of ALGLIB includes two important improvements of ! this function: ! * multicore support (C++ and C# computational cores) ! * SSE support ! ! First improvement gives close-to-linear speedup on multicore systems. ! Second improvement gives constant speedup (2-3x, depending on your CPU) ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! In order to use SSE features you have to: ! * use commercial version of ALGLIB on Intel processors ! * use C++ computational core ! ! This note is given for users of commercial edition; if you use GPL ! edition, you still will be able to call smp-version of this function, ! but all computations will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. INPUT PARAMETERS: Network - neural network; XY - training set, see below for information on the training set format; NPoints - points count. RESULT: sum-of-squares error, SUM(sqr(y[i]-desired_y[i])/2) DATASET FORMAT: This function uses two different dataset formats - one for regression networks, another one for classification networks. For regression networks with NIn inputs and NOut outputs following dataset format is used: * dataset is given by NPoints*(NIn+NOut) matrix * each row corresponds to one example * first NIn columns are inputs, next NOut columns are outputs For classification networks with NIn inputs and NClasses classes following dataset format is used: * dataset is given by NPoints*(NIn+1) matrix * each row corresponds to one example * first NIn columns are inputs, last column stores class number (from 0 to NClasses-1).
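A hedged end-to-end sketch of the two calls documented above (MLPProcess and MLPError), using the public C++ interface; mlpcreate1() and the string constructors of real_1d_array/real_2d_array are documented in other sections of this manual:

    #include <cstdio>
    #include "dataanalysis.h"
    int main()
    {
        alglib::multilayerperceptron net;
        alglib::mlpcreate1(1, 3, 1, net);                   // 1 input, 3 hidden neurons, 1 output
        alglib::real_1d_array x = "[0.5]", y;
        alglib::mlpprocess(net, x, y);                      // y is (re)allocated to NOut elements
        alglib::real_2d_array xy = "[[0.0,0.0],[0.5,0.25],[1.0,1.0]]";
        double e = alglib::mlperror(net, xy, 3);            // SUM(sqr(y[i]-desired_y[i])/2)
        printf("y0=%g  E=%g\n", y[0], e);
        return 0;
    }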
-- ALGLIB -- Copyright 04.11.2007 by Bochkanov Sergey *************************************************************************/ double mlperror(multilayerperceptron* network, /* Real */ ae_matrix* xy, ae_int_t npoints, ae_state *_state) { double result; ae_assert(xy->rows>=npoints, "MLPError: XY has less than NPoints rows", _state); if( npoints>0 ) { if( mlpissoftmax(network, _state) ) { ae_assert(xy->cols>=mlpgetinputscount(network, _state)+1, "MLPError: XY has less than NIn+1 columns", _state); } else { ae_assert(xy->cols>=mlpgetinputscount(network, _state)+mlpgetoutputscount(network, _state), "MLPError: XY has less than NIn+NOut columns", _state); } } mlpallerrorsx(network, xy, &network->dummysxy, npoints, 0, &network->dummyidx, 0, npoints, 0, &network->buf, &network->err, _state); result = ae_sqr(network->err.rmserror, _state)*npoints*mlpgetoutputscount(network, _state)/2; return result; } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. *************************************************************************/ double _pexec_mlperror(multilayerperceptron* network, /* Real */ ae_matrix* xy, ae_int_t npoints, ae_state *_state) { return mlperror(network,xy,npoints, _state); } /************************************************************************* Error of the neural network on dataset given by sparse matrix. FOR USERS OF COMMERCIAL EDITION: ! Commercial version of ALGLIB includes two important improvements of ! this function: ! * multicore support (C++ and C# computational cores) ! * SSE support ! ! First improvement gives close-to-linear speedup on multicore systems. ! Second improvement gives constant speedup (2-3x, depending on your CPU) ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! In order to use SSE features you have to: ! * use commercial version of ALGLIB on Intel processors ! * use C++ computational core ! ! This note is given for users of commercial edition; if you use GPL ! edition, you still will be able to call smp-version of this function, ! but all computations will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. INPUT PARAMETERS: Network - neural network XY - training set, see below for information on the training set format. This function checks correctness of the dataset (no NANs/INFs, class numbers are correct) and throws exception when incorrect dataset is passed. Sparse matrix must use CRS format for storage. NPoints - points count, >=0 RESULT: sum-of-squares error, SUM(sqr(y[i]-desired_y[i])/2) DATASET FORMAT: This function uses two different dataset formats - one for regression networks, another one for classification networks. For regression networks with NIn inputs and NOut outputs following dataset format is used: * dataset is given by NPoints*(NIn+NOut) matrix * each row corresponds to one example * first NIn columns are inputs, next NOut columns are outputs For classification networks with NIn inputs and NClasses clases following dataset format is used: * dataset is given by NPoints*(NIn+1) matrix * each row corresponds to one example * first NIn columns are inputs, last column stores class number (from 0 to NClasses-1). 
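As the implementation shows, the dense and sparse versions both delegate to MLPAllErrorsX and rebuild the sum-of-squares error from the RMS error: E = rmserror^2*NPoints*NOut/2. For the sparse variant the dataset must be converted to CRS first; a hedged sketch using the sparse-matrix calls from the LinAlg package and the 'net' object from the previous sketch:

    alglib::sparsematrix xy;
    alglib::sparsecreate(3, 2, xy);                 // 3 samples, NIn+NOut = 2 columns
    alglib::sparseset(xy, 1, 0, 0.5);               // row 1: input 0.5, output 0.25
    alglib::sparseset(xy, 1, 1, 0.25);
    alglib::sparseset(xy, 2, 0, 1.0);               // row 2: input 1.0, output 1.0
    alglib::sparseset(xy, 2, 1, 1.0);               // row 0 is left all-zero
    alglib::sparseconverttocrs(xy);                 // MLPErrorSparse requires CRS storage
    double e = alglib::mlperrorsparse(net, xy, 3);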
-- ALGLIB -- Copyright 23.07.2012 by Bochkanov Sergey *************************************************************************/ double mlperrorsparse(multilayerperceptron* network, sparsematrix* xy, ae_int_t npoints, ae_state *_state) { double result; ae_assert(sparseiscrs(xy, _state), "MLPErrorSparse: XY is not in CRS format.", _state); ae_assert(sparsegetnrows(xy, _state)>=npoints, "MLPErrorSparse: XY has less than NPoints rows", _state); if( npoints>0 ) { if( mlpissoftmax(network, _state) ) { ae_assert(sparsegetncols(xy, _state)>=mlpgetinputscount(network, _state)+1, "MLPErrorSparse: XY has less than NIn+1 columns", _state); } else { ae_assert(sparsegetncols(xy, _state)>=mlpgetinputscount(network, _state)+mlpgetoutputscount(network, _state), "MLPErrorSparse: XY has less than NIn+NOut columns", _state); } } mlpallerrorsx(network, &network->dummydxy, xy, npoints, 1, &network->dummyidx, 0, npoints, 0, &network->buf, &network->err, _state); result = ae_sqr(network->err.rmserror, _state)*npoints*mlpgetoutputscount(network, _state)/2; return result; }
/************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. *************************************************************************/ double _pexec_mlperrorsparse(multilayerperceptron* network, sparsematrix* xy, ae_int_t npoints, ae_state *_state) { return mlperrorsparse(network,xy,npoints, _state); }
/************************************************************************* Natural error function for neural network, internal subroutine. NOTE: this function is single-threaded. Unlike other error functions, it receives no speed-up from being executed in SMP mode. -- ALGLIB -- Copyright 04.11.2007 by Bochkanov Sergey *************************************************************************/ double mlperrorn(multilayerperceptron* network, /* Real */ ae_matrix* xy, ae_int_t ssize, ae_state *_state) { ae_int_t i; ae_int_t k; ae_int_t nin; ae_int_t nout; ae_int_t wcount; double e; double result; mlpproperties(network, &nin, &nout, &wcount, _state); result = (double)(0); for(i=0; i<=ssize-1; i++) { /* * Process vector */ ae_v_move(&network->x.ptr.p_double[0], 1, &xy->ptr.pp_double[i][0], 1, ae_v_len(0,nin-1)); mlpprocess(network, &network->x, &network->y, _state); /* * Update error function */ if( network->structinfo.ptr.p_int[6]==0 ) { /* * Least squares error function */ ae_v_sub(&network->y.ptr.p_double[0], 1, &xy->ptr.pp_double[i][nin], 1, ae_v_len(0,nout-1)); e = ae_v_dotproduct(&network->y.ptr.p_double[0], 1, &network->y.ptr.p_double[0], 1, ae_v_len(0,nout-1)); result = result+e/2; } else { /* * Cross-entropy error function */ k = ae_round(xy->ptr.pp_double[i][nin], _state); if( k>=0&&k<nout ) { result = result-ae_log(network->y.ptr.p_double[k], _state); } } } return result; }
/************************************************************************* Classification error of the neural network on dataset. FOR USERS OF COMMERCIAL EDITION: ! Commercial version of ALGLIB includes two important improvements of ! this function: ! * multicore support (C++ and C# computational cores) ! * SSE support ! ! First improvement gives close-to-linear speedup on multicore systems. ! Second improvement gives constant speedup (2-3x depending on your CPU) ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! In order to use SSE features you have to: !
* use commercial version of ALGLIB on Intel processors ! * use C++ computational core ! ! This note is given for users of commercial edition; if you use GPL ! edition, you still will be able to call smp-version of this function, ! but all computations will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. INPUT PARAMETERS: Network - neural network; XY - training set, see below for information on the training set format; NPoints - points count. RESULT: classification error (number of misclassified cases) DATASET FORMAT: This function uses two different dataset formats - one for regression networks, another one for classification networks. For regression networks with NIn inputs and NOut outputs following dataset format is used: * dataset is given by NPoints*(NIn+NOut) matrix * each row corresponds to one example * first NIn columns are inputs, next NOut columns are outputs For classification networks with NIn inputs and NClasses clases following dataset format is used: * dataset is given by NPoints*(NIn+1) matrix * each row corresponds to one example * first NIn columns are inputs, last column stores class number (from 0 to NClasses-1). -- ALGLIB -- Copyright 04.11.2007 by Bochkanov Sergey *************************************************************************/ ae_int_t mlpclserror(multilayerperceptron* network, /* Real */ ae_matrix* xy, ae_int_t npoints, ae_state *_state) { ae_int_t result; ae_assert(xy->rows>=npoints, "MLPClsError: XY has less than NPoints rows", _state); if( npoints>0 ) { if( mlpissoftmax(network, _state) ) { ae_assert(xy->cols>=mlpgetinputscount(network, _state)+1, "MLPClsError: XY has less than NIn+1 columns", _state); } else { ae_assert(xy->cols>=mlpgetinputscount(network, _state)+mlpgetoutputscount(network, _state), "MLPClsError: XY has less than NIn+NOut columns", _state); } } mlpallerrorsx(network, xy, &network->dummysxy, npoints, 0, &network->dummyidx, 0, npoints, 0, &network->buf, &network->err, _state); result = ae_round(npoints*network->err.relclserror, _state); return result; } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. *************************************************************************/ ae_int_t _pexec_mlpclserror(multilayerperceptron* network, /* Real */ ae_matrix* xy, ae_int_t npoints, ae_state *_state) { return mlpclserror(network,xy,npoints, _state); } /************************************************************************* Relative classification error on the test set. FOR USERS OF COMMERCIAL EDITION: ! Commercial version of ALGLIB includes two important improvements of ! this function: ! * multicore support (C++ and C# computational cores) ! * SSE support ! ! First improvement gives close-to-linear speedup on multicore systems. ! Second improvement gives constant speedup (2-3x depending on your CPU) ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! In order to use SSE features you have to: ! * use commercial version of ALGLIB on Intel processors ! * use C++ computational core ! ! This note is given for users of commercial edition; if you use GPL ! edition, you still will be able to call smp-version of this function, ! but all computations will be done serially. ! ! 
We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. INPUT PARAMETERS: Network - neural network; XY - training set, see below for information on the training set format; NPoints - points count. RESULT: Percent of incorrectly classified cases. Works both for classifier networks and general purpose networks used as classifiers. DATASET FORMAT: This function uses two different dataset formats - one for regression networks, another one for classification networks. For regression networks with NIn inputs and NOut outputs following dataset format is used: * dataset is given by NPoints*(NIn+NOut) matrix * each row corresponds to one example * first NIn columns are inputs, next NOut columns are outputs For classification networks with NIn inputs and NClasses clases following dataset format is used: * dataset is given by NPoints*(NIn+1) matrix * each row corresponds to one example * first NIn columns are inputs, last column stores class number (from 0 to NClasses-1). -- ALGLIB -- Copyright 25.12.2008 by Bochkanov Sergey *************************************************************************/ double mlprelclserror(multilayerperceptron* network, /* Real */ ae_matrix* xy, ae_int_t npoints, ae_state *_state) { double result; ae_assert(xy->rows>=npoints, "MLPRelClsError: XY has less than NPoints rows", _state); if( npoints>0 ) { if( mlpissoftmax(network, _state) ) { ae_assert(xy->cols>=mlpgetinputscount(network, _state)+1, "MLPRelClsError: XY has less than NIn+1 columns", _state); } else { ae_assert(xy->cols>=mlpgetinputscount(network, _state)+mlpgetoutputscount(network, _state), "MLPRelClsError: XY has less than NIn+NOut columns", _state); } } if( npoints>0 ) { result = (double)mlpclserror(network, xy, npoints, _state)/(double)npoints; } else { result = 0.0; } return result; } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. *************************************************************************/ double _pexec_mlprelclserror(multilayerperceptron* network, /* Real */ ae_matrix* xy, ae_int_t npoints, ae_state *_state) { return mlprelclserror(network,xy,npoints, _state); } /************************************************************************* Relative classification error on the test set given by sparse matrix. FOR USERS OF COMMERCIAL EDITION: ! Commercial version of ALGLIB includes two important improvements of ! this function: ! * multicore support (C++ and C# computational cores) ! * SSE support ! ! First improvement gives close-to-linear speedup on multicore systems. ! Second improvement gives constant speedup (2-3x depending on your CPU) ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! In order to use SSE features you have to: ! * use commercial version of ALGLIB on Intel processors ! * use C++ computational core ! ! This note is given for users of commercial edition; if you use GPL ! edition, you still will be able to call smp-version of this function, ! but all computations will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. INPUT PARAMETERS: Network - neural network; XY - training set, see below for information on the training set format. 
Sparse matrix must use CRS format for storage. NPoints - points count, >=0. RESULT: Percent of incorrectly classified cases. Works both for classifier networks and general purpose networks used as classifiers. DATASET FORMAT: This function uses two different dataset formats - one for regression networks, another one for classification networks. For regression networks with NIn inputs and NOut outputs following dataset format is used: * dataset is given by NPoints*(NIn+NOut) matrix * each row corresponds to one example * first NIn columns are inputs, next NOut columns are outputs For classification networks with NIn inputs and NClasses clases following dataset format is used: * dataset is given by NPoints*(NIn+1) matrix * each row corresponds to one example * first NIn columns are inputs, last column stores class number (from 0 to NClasses-1). -- ALGLIB -- Copyright 09.08.2012 by Bochkanov Sergey *************************************************************************/ double mlprelclserrorsparse(multilayerperceptron* network, sparsematrix* xy, ae_int_t npoints, ae_state *_state) { double result; ae_assert(sparseiscrs(xy, _state), "MLPRelClsErrorSparse: sparse matrix XY is not in CRS format.", _state); ae_assert(sparsegetnrows(xy, _state)>=npoints, "MLPRelClsErrorSparse: sparse matrix XY has less than NPoints rows", _state); if( npoints>0 ) { if( mlpissoftmax(network, _state) ) { ae_assert(sparsegetncols(xy, _state)>=mlpgetinputscount(network, _state)+1, "MLPRelClsErrorSparse: sparse matrix XY has less than NIn+1 columns", _state); } else { ae_assert(sparsegetncols(xy, _state)>=mlpgetinputscount(network, _state)+mlpgetoutputscount(network, _state), "MLPRelClsErrorSparse: sparse matrix XY has less than NIn+NOut columns", _state); } } mlpallerrorsx(network, &network->dummydxy, xy, npoints, 1, &network->dummyidx, 0, npoints, 0, &network->buf, &network->err, _state); result = network->err.relclserror; return result; } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. *************************************************************************/ double _pexec_mlprelclserrorsparse(multilayerperceptron* network, sparsematrix* xy, ae_int_t npoints, ae_state *_state) { return mlprelclserrorsparse(network,xy,npoints, _state); } /************************************************************************* Average cross-entropy (in bits per element) on the test set. FOR USERS OF COMMERCIAL EDITION: ! Commercial version of ALGLIB includes two important improvements of ! this function: ! * multicore support (C++ and C# computational cores) ! * SSE support ! ! First improvement gives close-to-linear speedup on multicore systems. ! Second improvement gives constant speedup (2-3x depending on your CPU) ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! In order to use SSE features you have to: ! * use commercial version of ALGLIB on Intel processors ! * use C++ computational core ! ! This note is given for users of commercial edition; if you use GPL ! edition, you still will be able to call smp-version of this function, ! but all computations will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. 
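For classifier (softmax) networks the last column of XY stores the class index, as described in the dataset format sections above. A hedged sketch of the classification metrics (mlpcreatec1() is documented elsewhere in this subpackage):

    alglib::multilayerperceptron clsnet;
    alglib::mlpcreatec1(2, 5, 2, clsnet);                            // 2 inputs, 2 classes
    alglib::real_2d_array xy = "[[0,0,0],[0,1,1],[1,0,1],[1,1,0]]";  // last column = class number
    double relerr = alglib::mlprelclserror(clsnet, xy, 4);           // misclassification rate
    double avgce  = alglib::mlpavgce(clsnet, xy, 4);                 // cross-entropy in bits per element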
INPUT PARAMETERS: Network - neural network; XY - training set, see below for information on the training set format; NPoints - points count. RESULT: CrossEntropy/(NPoints*LN(2)). Zero if network solves regression task. DATASET FORMAT: This function uses two different dataset formats - one for regression networks, another one for classification networks. For regression networks with NIn inputs and NOut outputs following dataset format is used: * dataset is given by NPoints*(NIn+NOut) matrix * each row corresponds to one example * first NIn columns are inputs, next NOut columns are outputs For classification networks with NIn inputs and NClasses clases following dataset format is used: * dataset is given by NPoints*(NIn+1) matrix * each row corresponds to one example * first NIn columns are inputs, last column stores class number (from 0 to NClasses-1). -- ALGLIB -- Copyright 08.01.2009 by Bochkanov Sergey *************************************************************************/ double mlpavgce(multilayerperceptron* network, /* Real */ ae_matrix* xy, ae_int_t npoints, ae_state *_state) { double result; ae_assert(xy->rows>=npoints, "MLPAvgCE: XY has less than NPoints rows", _state); if( npoints>0 ) { if( mlpissoftmax(network, _state) ) { ae_assert(xy->cols>=mlpgetinputscount(network, _state)+1, "MLPAvgCE: XY has less than NIn+1 columns", _state); } else { ae_assert(xy->cols>=mlpgetinputscount(network, _state)+mlpgetoutputscount(network, _state), "MLPAvgCE: XY has less than NIn+NOut columns", _state); } } mlpallerrorsx(network, xy, &network->dummysxy, npoints, 0, &network->dummyidx, 0, npoints, 0, &network->buf, &network->err, _state); result = network->err.avgce; return result; } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. *************************************************************************/ double _pexec_mlpavgce(multilayerperceptron* network, /* Real */ ae_matrix* xy, ae_int_t npoints, ae_state *_state) { return mlpavgce(network,xy,npoints, _state); } /************************************************************************* Average cross-entropy (in bits per element) on the test set given by sparse matrix. FOR USERS OF COMMERCIAL EDITION: ! Commercial version of ALGLIB includes two important improvements of ! this function: ! * multicore support (C++ and C# computational cores) ! * SSE support ! ! First improvement gives close-to-linear speedup on multicore systems. ! Second improvement gives constant speedup (2-3x depending on your CPU) ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! In order to use SSE features you have to: ! * use commercial version of ALGLIB on Intel processors ! * use C++ computational core ! ! This note is given for users of commercial edition; if you use GPL ! edition, you still will be able to call smp-version of this function, ! but all computations will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. INPUT PARAMETERS: Network - neural network; XY - training set, see below for information on the training set format. This function checks correctness of the dataset (no NANs/INFs, class numbers are correct) and throws exception when incorrect dataset is passed. 
Sparse matrix must use CRS format for storage. NPoints - points count, >=0. RESULT: CrossEntropy/(NPoints*LN(2)). Zero if network solves regression task. DATASET FORMAT: This function uses two different dataset formats - one for regression networks, another one for classification networks. For regression networks with NIn inputs and NOut outputs following dataset format is used: * dataset is given by NPoints*(NIn+NOut) matrix * each row corresponds to one example * first NIn columns are inputs, next NOut columns are outputs For classification networks with NIn inputs and NClasses clases following dataset format is used: * dataset is given by NPoints*(NIn+1) matrix * each row corresponds to one example * first NIn columns are inputs, last column stores class number (from 0 to NClasses-1). -- ALGLIB -- Copyright 9.08.2012 by Bochkanov Sergey *************************************************************************/ double mlpavgcesparse(multilayerperceptron* network, sparsematrix* xy, ae_int_t npoints, ae_state *_state) { double result; ae_assert(sparseiscrs(xy, _state), "MLPAvgCESparse: sparse matrix XY is not in CRS format.", _state); ae_assert(sparsegetnrows(xy, _state)>=npoints, "MLPAvgCESparse: sparse matrix XY has less than NPoints rows", _state); if( npoints>0 ) { if( mlpissoftmax(network, _state) ) { ae_assert(sparsegetncols(xy, _state)>=mlpgetinputscount(network, _state)+1, "MLPAvgCESparse: sparse matrix XY has less than NIn+1 columns", _state); } else { ae_assert(sparsegetncols(xy, _state)>=mlpgetinputscount(network, _state)+mlpgetoutputscount(network, _state), "MLPAvgCESparse: sparse matrix XY has less than NIn+NOut columns", _state); } } mlpallerrorsx(network, &network->dummydxy, xy, npoints, 1, &network->dummyidx, 0, npoints, 0, &network->buf, &network->err, _state); result = network->err.avgce; return result; } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. *************************************************************************/ double _pexec_mlpavgcesparse(multilayerperceptron* network, sparsematrix* xy, ae_int_t npoints, ae_state *_state) { return mlpavgcesparse(network,xy,npoints, _state); } /************************************************************************* RMS error on the test set given. FOR USERS OF COMMERCIAL EDITION: ! Commercial version of ALGLIB includes two important improvements of ! this function: ! * multicore support (C++ and C# computational cores) ! * SSE support ! ! First improvement gives close-to-linear speedup on multicore systems. ! Second improvement gives constant speedup (2-3x depending on your CPU) ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! In order to use SSE features you have to: ! * use commercial version of ALGLIB on Intel processors ! * use C++ computational core ! ! This note is given for users of commercial edition; if you use GPL ! edition, you still will be able to call smp-version of this function, ! but all computations will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. INPUT PARAMETERS: Network - neural network; XY - training set, see below for information on the training set format; NPoints - points count. RESULT: Root mean square error. 
Its meaning for regression task is obvious. As for classification task, RMS error means error when estimating posterior probabilities. DATASET FORMAT: This function uses two different dataset formats - one for regression networks, another one for classification networks. For regression networks with NIn inputs and NOut outputs following dataset format is used: * dataset is given by NPoints*(NIn+NOut) matrix * each row corresponds to one example * first NIn columns are inputs, next NOut columns are outputs For classification networks with NIn inputs and NClasses clases following dataset format is used: * dataset is given by NPoints*(NIn+1) matrix * each row corresponds to one example * first NIn columns are inputs, last column stores class number (from 0 to NClasses-1). -- ALGLIB -- Copyright 04.11.2007 by Bochkanov Sergey *************************************************************************/ double mlprmserror(multilayerperceptron* network, /* Real */ ae_matrix* xy, ae_int_t npoints, ae_state *_state) { double result; ae_assert(xy->rows>=npoints, "MLPRMSError: XY has less than NPoints rows", _state); if( npoints>0 ) { if( mlpissoftmax(network, _state) ) { ae_assert(xy->cols>=mlpgetinputscount(network, _state)+1, "MLPRMSError: XY has less than NIn+1 columns", _state); } else { ae_assert(xy->cols>=mlpgetinputscount(network, _state)+mlpgetoutputscount(network, _state), "MLPRMSError: XY has less than NIn+NOut columns", _state); } } mlpallerrorsx(network, xy, &network->dummysxy, npoints, 0, &network->dummyidx, 0, npoints, 0, &network->buf, &network->err, _state); result = network->err.rmserror; return result; } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. *************************************************************************/ double _pexec_mlprmserror(multilayerperceptron* network, /* Real */ ae_matrix* xy, ae_int_t npoints, ae_state *_state) { return mlprmserror(network,xy,npoints, _state); } /************************************************************************* RMS error on the test set given by sparse matrix. FOR USERS OF COMMERCIAL EDITION: ! Commercial version of ALGLIB includes two important improvements of ! this function: ! * multicore support (C++ and C# computational cores) ! * SSE support ! ! First improvement gives close-to-linear speedup on multicore systems. ! Second improvement gives constant speedup (2-3x depending on your CPU) ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! In order to use SSE features you have to: ! * use commercial version of ALGLIB on Intel processors ! * use C++ computational core ! ! This note is given for users of commercial edition; if you use GPL ! edition, you still will be able to call smp-version of this function, ! but all computations will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. INPUT PARAMETERS: Network - neural network; XY - training set, see below for information on the training set format. This function checks correctness of the dataset (no NANs/INFs, class numbers are correct) and throws exception when incorrect dataset is passed. Sparse matrix must use CRS format for storage. NPoints - points count, >=0. RESULT: Root mean square error. 
Its meaning for regression task is obvious. As for classification task, RMS error means error when estimating posterior probabilities. DATASET FORMAT: This function uses two different dataset formats - one for regression networks, another one for classification networks. For regression networks with NIn inputs and NOut outputs following dataset format is used: * dataset is given by NPoints*(NIn+NOut) matrix * each row corresponds to one example * first NIn columns are inputs, next NOut columns are outputs For classification networks with NIn inputs and NClasses clases following dataset format is used: * dataset is given by NPoints*(NIn+1) matrix * each row corresponds to one example * first NIn columns are inputs, last column stores class number (from 0 to NClasses-1). -- ALGLIB -- Copyright 09.08.2012 by Bochkanov Sergey *************************************************************************/ double mlprmserrorsparse(multilayerperceptron* network, sparsematrix* xy, ae_int_t npoints, ae_state *_state) { double result; ae_assert(sparseiscrs(xy, _state), "MLPRMSErrorSparse: sparse matrix XY is not in CRS format.", _state); ae_assert(sparsegetnrows(xy, _state)>=npoints, "MLPRMSErrorSparse: sparse matrix XY has less than NPoints rows", _state); if( npoints>0 ) { if( mlpissoftmax(network, _state) ) { ae_assert(sparsegetncols(xy, _state)>=mlpgetinputscount(network, _state)+1, "MLPRMSErrorSparse: sparse matrix XY has less than NIn+1 columns", _state); } else { ae_assert(sparsegetncols(xy, _state)>=mlpgetinputscount(network, _state)+mlpgetoutputscount(network, _state), "MLPRMSErrorSparse: sparse matrix XY has less than NIn+NOut columns", _state); } } mlpallerrorsx(network, &network->dummydxy, xy, npoints, 1, &network->dummyidx, 0, npoints, 0, &network->buf, &network->err, _state); result = network->err.rmserror; return result; } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. *************************************************************************/ double _pexec_mlprmserrorsparse(multilayerperceptron* network, sparsematrix* xy, ae_int_t npoints, ae_state *_state) { return mlprmserrorsparse(network,xy,npoints, _state); } /************************************************************************* Average absolute error on the test set. FOR USERS OF COMMERCIAL EDITION: ! Commercial version of ALGLIB includes two important improvements of ! this function: ! * multicore support (C++ and C# computational cores) ! * SSE support ! ! First improvement gives close-to-linear speedup on multicore systems. ! Second improvement gives constant speedup (2-3x depending on your CPU) ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! In order to use SSE features you have to: ! * use commercial version of ALGLIB on Intel processors ! * use C++ computational core ! ! This note is given for users of commercial edition; if you use GPL ! edition, you still will be able to call smp-version of this function, ! but all computations will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. INPUT PARAMETERS: Network - neural network; XY - training set, see below for information on the training set format; NPoints - points count. 
RESULT: Its meaning for regression task is obvious. As for classification task, it means average error when estimating posterior probabilities. DATASET FORMAT: This function uses two different dataset formats - one for regression networks, another one for classification networks. For regression networks with NIn inputs and NOut outputs following dataset format is used: * dataset is given by NPoints*(NIn+NOut) matrix * each row corresponds to one example * first NIn columns are inputs, next NOut columns are outputs For classification networks with NIn inputs and NClasses clases following dataset format is used: * dataset is given by NPoints*(NIn+1) matrix * each row corresponds to one example * first NIn columns are inputs, last column stores class number (from 0 to NClasses-1). -- ALGLIB -- Copyright 11.03.2008 by Bochkanov Sergey *************************************************************************/ double mlpavgerror(multilayerperceptron* network, /* Real */ ae_matrix* xy, ae_int_t npoints, ae_state *_state) { double result; ae_assert(xy->rows>=npoints, "MLPAvgError: XY has less than NPoints rows", _state); if( npoints>0 ) { if( mlpissoftmax(network, _state) ) { ae_assert(xy->cols>=mlpgetinputscount(network, _state)+1, "MLPAvgError: XY has less than NIn+1 columns", _state); } else { ae_assert(xy->cols>=mlpgetinputscount(network, _state)+mlpgetoutputscount(network, _state), "MLPAvgError: XY has less than NIn+NOut columns", _state); } } mlpallerrorsx(network, xy, &network->dummysxy, npoints, 0, &network->dummyidx, 0, npoints, 0, &network->buf, &network->err, _state); result = network->err.avgerror; return result; } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. *************************************************************************/ double _pexec_mlpavgerror(multilayerperceptron* network, /* Real */ ae_matrix* xy, ae_int_t npoints, ae_state *_state) { return mlpavgerror(network,xy,npoints, _state); } /************************************************************************* Average absolute error on the test set given by sparse matrix. FOR USERS OF COMMERCIAL EDITION: ! Commercial version of ALGLIB includes two important improvements of ! this function: ! * multicore support (C++ and C# computational cores) ! * SSE support ! ! First improvement gives close-to-linear speedup on multicore systems. ! Second improvement gives constant speedup (2-3x depending on your CPU) ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! In order to use SSE features you have to: ! * use commercial version of ALGLIB on Intel processors ! * use C++ computational core ! ! This note is given for users of commercial edition; if you use GPL ! edition, you still will be able to call smp-version of this function, ! but all computations will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. INPUT PARAMETERS: Network - neural network; XY - training set, see below for information on the training set format. This function checks correctness of the dataset (no NANs/INFs, class numbers are correct) and throws exception when incorrect dataset is passed. Sparse matrix must use CRS format for storage. NPoints - points count, >=0. 
RESULT: Its meaning for regression task is obvious. As for classification task, it means average error when estimating posterior probabilities. DATASET FORMAT: This function uses two different dataset formats - one for regression networks, another one for classification networks. For regression networks with NIn inputs and NOut outputs following dataset format is used: * dataset is given by NPoints*(NIn+NOut) matrix * each row corresponds to one example * first NIn columns are inputs, next NOut columns are outputs For classification networks with NIn inputs and NClasses clases following dataset format is used: * dataset is given by NPoints*(NIn+1) matrix * each row corresponds to one example * first NIn columns are inputs, last column stores class number (from 0 to NClasses-1). -- ALGLIB -- Copyright 09.08.2012 by Bochkanov Sergey *************************************************************************/ double mlpavgerrorsparse(multilayerperceptron* network, sparsematrix* xy, ae_int_t npoints, ae_state *_state) { double result; ae_assert(sparseiscrs(xy, _state), "MLPAvgErrorSparse: XY is not in CRS format.", _state); ae_assert(sparsegetnrows(xy, _state)>=npoints, "MLPAvgErrorSparse: XY has less than NPoints rows", _state); if( npoints>0 ) { if( mlpissoftmax(network, _state) ) { ae_assert(sparsegetncols(xy, _state)>=mlpgetinputscount(network, _state)+1, "MLPAvgErrorSparse: XY has less than NIn+1 columns", _state); } else { ae_assert(sparsegetncols(xy, _state)>=mlpgetinputscount(network, _state)+mlpgetoutputscount(network, _state), "MLPAvgErrorSparse: XY has less than NIn+NOut columns", _state); } } mlpallerrorsx(network, &network->dummydxy, xy, npoints, 1, &network->dummyidx, 0, npoints, 0, &network->buf, &network->err, _state); result = network->err.avgerror; return result; } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. *************************************************************************/ double _pexec_mlpavgerrorsparse(multilayerperceptron* network, sparsematrix* xy, ae_int_t npoints, ae_state *_state) { return mlpavgerrorsparse(network,xy,npoints, _state); } /************************************************************************* Average relative error on the test set. FOR USERS OF COMMERCIAL EDITION: ! Commercial version of ALGLIB includes two important improvements of ! this function: ! * multicore support (C++ and C# computational cores) ! * SSE support ! ! First improvement gives close-to-linear speedup on multicore systems. ! Second improvement gives constant speedup (2-3x depending on your CPU) ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! In order to use SSE features you have to: ! * use commercial version of ALGLIB on Intel processors ! * use C++ computational core ! ! This note is given for users of commercial edition; if you use GPL ! edition, you still will be able to call smp-version of this function, ! but all computations will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. INPUT PARAMETERS: Network - neural network; XY - training set, see below for information on the training set format; NPoints - points count. RESULT: Its meaning for regression task is obvious. 
As for classification task, it means average relative error when estimating posterior probability of belonging to the correct class. DATASET FORMAT: This function uses two different dataset formats - one for regression networks, another one for classification networks. For regression networks with NIn inputs and NOut outputs following dataset format is used: * dataset is given by NPoints*(NIn+NOut) matrix * each row corresponds to one example * first NIn columns are inputs, next NOut columns are outputs For classification networks with NIn inputs and NClasses clases following dataset format is used: * dataset is given by NPoints*(NIn+1) matrix * each row corresponds to one example * first NIn columns are inputs, last column stores class number (from 0 to NClasses-1). -- ALGLIB -- Copyright 11.03.2008 by Bochkanov Sergey *************************************************************************/ double mlpavgrelerror(multilayerperceptron* network, /* Real */ ae_matrix* xy, ae_int_t npoints, ae_state *_state) { double result; ae_assert(xy->rows>=npoints, "MLPAvgRelError: XY has less than NPoints rows", _state); if( npoints>0 ) { if( mlpissoftmax(network, _state) ) { ae_assert(xy->cols>=mlpgetinputscount(network, _state)+1, "MLPAvgRelError: XY has less than NIn+1 columns", _state); } else { ae_assert(xy->cols>=mlpgetinputscount(network, _state)+mlpgetoutputscount(network, _state), "MLPAvgRelError: XY has less than NIn+NOut columns", _state); } } mlpallerrorsx(network, xy, &network->dummysxy, npoints, 0, &network->dummyidx, 0, npoints, 0, &network->buf, &network->err, _state); result = network->err.avgrelerror; return result; } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. *************************************************************************/ double _pexec_mlpavgrelerror(multilayerperceptron* network, /* Real */ ae_matrix* xy, ae_int_t npoints, ae_state *_state) { return mlpavgrelerror(network,xy,npoints, _state); } /************************************************************************* Average relative error on the test set given by sparse matrix. FOR USERS OF COMMERCIAL EDITION: ! Commercial version of ALGLIB includes two important improvements of ! this function: ! * multicore support (C++ and C# computational cores) ! * SSE support ! ! First improvement gives close-to-linear speedup on multicore systems. ! Second improvement gives constant speedup (2-3x depending on your CPU) ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! In order to use SSE features you have to: ! * use commercial version of ALGLIB on Intel processors ! * use C++ computational core ! ! This note is given for users of commercial edition; if you use GPL ! edition, you still will be able to call smp-version of this function, ! but all computations will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. INPUT PARAMETERS: Network - neural network; XY - training set, see below for information on the training set format. This function checks correctness of the dataset (no NANs/INFs, class numbers are correct) and throws exception when incorrect dataset is passed. Sparse matrix must use CRS format for storage. NPoints - points count, >=0. 
RESULT: Its meaning for regression task is obvious. As for classification task, it means average relative error when estimating posterior probability of belonging to the correct class. DATASET FORMAT: This function uses two different dataset formats - one for regression networks, another one for classification networks. For regression networks with NIn inputs and NOut outputs following dataset format is used: * dataset is given by NPoints*(NIn+NOut) matrix * each row corresponds to one example * first NIn columns are inputs, next NOut columns are outputs For classification networks with NIn inputs and NClasses clases following dataset format is used: * dataset is given by NPoints*(NIn+1) matrix * each row corresponds to one example * first NIn columns are inputs, last column stores class number (from 0 to NClasses-1). -- ALGLIB -- Copyright 09.08.2012 by Bochkanov Sergey *************************************************************************/ double mlpavgrelerrorsparse(multilayerperceptron* network, sparsematrix* xy, ae_int_t npoints, ae_state *_state) { double result; ae_assert(sparseiscrs(xy, _state), "MLPAvgRelErrorSparse: XY is not in CRS format.", _state); ae_assert(sparsegetnrows(xy, _state)>=npoints, "MLPAvgRelErrorSparse: XY has less than NPoints rows", _state); if( npoints>0 ) { if( mlpissoftmax(network, _state) ) { ae_assert(sparsegetncols(xy, _state)>=mlpgetinputscount(network, _state)+1, "MLPAvgRelErrorSparse: XY has less than NIn+1 columns", _state); } else { ae_assert(sparsegetncols(xy, _state)>=mlpgetinputscount(network, _state)+mlpgetoutputscount(network, _state), "MLPAvgRelErrorSparse: XY has less than NIn+NOut columns", _state); } } mlpallerrorsx(network, &network->dummydxy, xy, npoints, 1, &network->dummyidx, 0, npoints, 0, &network->buf, &network->err, _state); result = network->err.avgrelerror; return result; } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. *************************************************************************/ double _pexec_mlpavgrelerrorsparse(multilayerperceptron* network, sparsematrix* xy, ae_int_t npoints, ae_state *_state) { return mlpavgrelerrorsparse(network,xy,npoints, _state); } /************************************************************************* Gradient calculation INPUT PARAMETERS: Network - network initialized with one of the network creation funcs X - input vector, length of array must be at least NIn DesiredY- desired outputs, length of array must be at least NOut Grad - possibly preallocated array. If size of array is smaller than WCount, it will be reallocated. It is recommended to reuse previously allocated array to reduce allocation overhead. 
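A hedged sketch of a single-sample call to MLPGrad through the C++ interface (wrapper name and reference parameters assumed to mirror the core function documented here), reusing the regression network from the earlier sketches:

    alglib::real_1d_array x = "[0.5]", target = "[0.25]", grad;
    double e;
    alglib::mlpgrad(net, x, target, e, grad);   // E = sqr(y-target)/2, Grad has WCount elements
    // the gradient is normally consumed by the optimizers in the mlptrain subpackage
    // rather than applied to the weights by hand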
OUTPUT PARAMETERS: E - error function, SUM(sqr(y[i]-desiredy[i])/2,i) Grad - gradient of E with respect to weights of network, array[WCount] -- ALGLIB -- Copyright 04.11.2007 by Bochkanov Sergey *************************************************************************/ void mlpgrad(multilayerperceptron* network, /* Real */ ae_vector* x, /* Real */ ae_vector* desiredy, double* e, /* Real */ ae_vector* grad, ae_state *_state) { ae_int_t i; ae_int_t nout; ae_int_t ntotal; *e = 0; /* * Alloc */ rvectorsetlengthatleast(grad, network->structinfo.ptr.p_int[4], _state); /* * Prepare dError/dOut, internal structures */ mlpprocess(network, x, &network->y, _state); nout = network->structinfo.ptr.p_int[2]; ntotal = network->structinfo.ptr.p_int[3]; *e = (double)(0); for(i=0; i<=ntotal-1; i++) { network->derror.ptr.p_double[i] = (double)(0); } for(i=0; i<=nout-1; i++) { network->derror.ptr.p_double[ntotal-nout+i] = network->y.ptr.p_double[i]-desiredy->ptr.p_double[i]; *e = *e+ae_sqr(network->y.ptr.p_double[i]-desiredy->ptr.p_double[i], _state)/2; } /* * gradient */ mlpbase_mlpinternalcalculategradient(network, &network->neurons, &network->weights, &network->derror, grad, ae_false, _state); } /************************************************************************* Gradient calculation (natural error function is used) INPUT PARAMETERS: Network - network initialized with one of the network creation funcs X - input vector, length of array must be at least NIn DesiredY- desired outputs, length of array must be at least NOut Grad - possibly preallocated array. If size of array is smaller than WCount, it will be reallocated. It is recommended to reuse previously allocated array to reduce allocation overhead. OUTPUT PARAMETERS: E - error function, sum-of-squares for regression networks, cross-entropy for classification networks. 
Grad - gradient of E with respect to weights of network, array[WCount] -- ALGLIB -- Copyright 04.11.2007 by Bochkanov Sergey *************************************************************************/ void mlpgradn(multilayerperceptron* network, /* Real */ ae_vector* x, /* Real */ ae_vector* desiredy, double* e, /* Real */ ae_vector* grad, ae_state *_state) { double s; ae_int_t i; ae_int_t nout; ae_int_t ntotal; *e = 0; /* * Alloc */ rvectorsetlengthatleast(grad, network->structinfo.ptr.p_int[4], _state); /* * Prepare dError/dOut, internal structures */ mlpprocess(network, x, &network->y, _state); nout = network->structinfo.ptr.p_int[2]; ntotal = network->structinfo.ptr.p_int[3]; for(i=0; i<=ntotal-1; i++) { network->derror.ptr.p_double[i] = (double)(0); } *e = (double)(0); if( network->structinfo.ptr.p_int[6]==0 ) { /* * Regression network, least squares */ for(i=0; i<=nout-1; i++) { network->derror.ptr.p_double[ntotal-nout+i] = network->y.ptr.p_double[i]-desiredy->ptr.p_double[i]; *e = *e+ae_sqr(network->y.ptr.p_double[i]-desiredy->ptr.p_double[i], _state)/2; } } else { /* * Classification network, cross-entropy */ s = (double)(0); for(i=0; i<=nout-1; i++) { s = s+desiredy->ptr.p_double[i]; } for(i=0; i<=nout-1; i++) { network->derror.ptr.p_double[ntotal-nout+i] = s*network->y.ptr.p_double[i]-desiredy->ptr.p_double[i]; *e = *e+mlpbase_safecrossentropy(desiredy->ptr.p_double[i], network->y.ptr.p_double[i], _state); } } /* * gradient */ mlpbase_mlpinternalcalculategradient(network, &network->neurons, &network->weights, &network->derror, grad, ae_true, _state); } /************************************************************************* Batch gradient calculation for a set of inputs/outputs FOR USERS OF COMMERCIAL EDITION: ! Commercial version of ALGLIB includes two important improvements of ! this function: ! * multicore support (C++ and C# computational cores) ! * SSE support ! ! First improvement gives close-to-linear speedup on multicore systems. ! Second improvement gives constant speedup (2-3x depending on your CPU) ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! In order to use SSE features you have to: ! * use commercial version of ALGLIB on Intel processors ! * use C++ computational core ! ! This note is given for users of commercial edition; if you use GPL ! edition, you still will be able to call smp-version of this function, ! but all computations will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. INPUT PARAMETERS: Network - network initialized with one of the network creation funcs XY - original dataset in dense format; one sample = one row: * first NIn columns contain inputs, * for regression problem, next NOut columns store desired outputs. * for classification problem, next column (just one!) stores class number. SSize - number of elements in XY Grad - possibly preallocated array. If size of array is smaller than WCount, it will be reallocated. It is recommended to reuse previously allocated array to reduce allocation overhead. 
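NOTE: an illustrative layout of one XY row (added example, values are
      arbitrary). For a regression network with NIn=2, NOut=1 a row looks
      like

          [ x0  x1  y ]

      while for a classification network with NIn=2 and NClasses=3 the
      last column stores the class number (0..NClasses-1), e.g.

          [ x0  x1  2 ]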
OUTPUT PARAMETERS: E - error function, SUM(sqr(y[i]-desiredy[i])/2,i) Grad - gradient of E with respect to weights of network, array[WCount] -- ALGLIB -- Copyright 04.11.2007 by Bochkanov Sergey *************************************************************************/ void mlpgradbatch(multilayerperceptron* network, /* Real */ ae_matrix* xy, ae_int_t ssize, double* e, /* Real */ ae_vector* grad, ae_state *_state) { ae_frame _frame_block; ae_int_t i; ae_int_t nin; ae_int_t nout; ae_int_t wcount; ae_int_t subset0; ae_int_t subset1; ae_int_t subsettype; smlpgrad *sgrad; ae_smart_ptr _sgrad; ae_frame_make(_state, &_frame_block); *e = 0; ae_smart_ptr_init(&_sgrad, (void**)&sgrad, _state); ae_assert(ssize>=0, "MLPGradBatchSparse: SSize<0", _state); subset0 = 0; subset1 = ssize; subsettype = 0; mlpproperties(network, &nin, &nout, &wcount, _state); rvectorsetlengthatleast(grad, wcount, _state); ae_shared_pool_first_recycled(&network->gradbuf, &_sgrad, _state); while(sgrad!=NULL) { sgrad->f = 0.0; for(i=0; i<=wcount-1; i++) { sgrad->g.ptr.p_double[i] = 0.0; } ae_shared_pool_next_recycled(&network->gradbuf, &_sgrad, _state); } mlpgradbatchx(network, xy, &network->dummysxy, ssize, 0, &network->dummyidx, subset0, subset1, subsettype, &network->buf, &network->gradbuf, _state); *e = 0.0; for(i=0; i<=wcount-1; i++) { grad->ptr.p_double[i] = 0.0; } ae_shared_pool_first_recycled(&network->gradbuf, &_sgrad, _state); while(sgrad!=NULL) { *e = *e+sgrad->f; for(i=0; i<=wcount-1; i++) { grad->ptr.p_double[i] = grad->ptr.p_double[i]+sgrad->g.ptr.p_double[i]; } ae_shared_pool_next_recycled(&network->gradbuf, &_sgrad, _state); } ae_frame_leave(_state); } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. *************************************************************************/ void _pexec_mlpgradbatch(multilayerperceptron* network, /* Real */ ae_matrix* xy, ae_int_t ssize, double* e, /* Real */ ae_vector* grad, ae_state *_state) { mlpgradbatch(network,xy,ssize,e,grad, _state); } /************************************************************************* Batch gradient calculation for a set of inputs/outputs given by sparse matrices FOR USERS OF COMMERCIAL EDITION: ! Commercial version of ALGLIB includes two important improvements of ! this function: ! * multicore support (C++ and C# computational cores) ! * SSE support ! ! First improvement gives close-to-linear speedup on multicore systems. ! Second improvement gives constant speedup (2-3x depending on your CPU) ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! In order to use SSE features you have to: ! * use commercial version of ALGLIB on Intel processors ! * use C++ computational core ! ! This note is given for users of commercial edition; if you use GPL ! edition, you still will be able to call smp-version of this function, ! but all computations will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. INPUT PARAMETERS: Network - network initialized with one of the network creation funcs XY - original dataset in sparse format; one sample = one row: * MATRIX MUST BE STORED IN CRS FORMAT * first NIn columns contain inputs. * for regression problem, next NOut columns store desired outputs. 
* for classification problem, next column (just one!) stores class number. SSize - number of elements in XY Grad - possibly preallocated array. If size of array is smaller than WCount, it will be reallocated. It is recommended to reuse previously allocated array to reduce allocation overhead. OUTPUT PARAMETERS: E - error function, SUM(sqr(y[i]-desiredy[i])/2,i) Grad - gradient of E with respect to weights of network, array[WCount] -- ALGLIB -- Copyright 26.07.2012 by Bochkanov Sergey *************************************************************************/ void mlpgradbatchsparse(multilayerperceptron* network, sparsematrix* xy, ae_int_t ssize, double* e, /* Real */ ae_vector* grad, ae_state *_state) { ae_frame _frame_block; ae_int_t i; ae_int_t nin; ae_int_t nout; ae_int_t wcount; ae_int_t subset0; ae_int_t subset1; ae_int_t subsettype; smlpgrad *sgrad; ae_smart_ptr _sgrad; ae_frame_make(_state, &_frame_block); *e = 0; ae_smart_ptr_init(&_sgrad, (void**)&sgrad, _state); ae_assert(ssize>=0, "MLPGradBatchSparse: SSize<0", _state); ae_assert(sparseiscrs(xy, _state), "MLPGradBatchSparse: sparse matrix XY must be in CRS format.", _state); subset0 = 0; subset1 = ssize; subsettype = 0; mlpproperties(network, &nin, &nout, &wcount, _state); rvectorsetlengthatleast(grad, wcount, _state); ae_shared_pool_first_recycled(&network->gradbuf, &_sgrad, _state); while(sgrad!=NULL) { sgrad->f = 0.0; for(i=0; i<=wcount-1; i++) { sgrad->g.ptr.p_double[i] = 0.0; } ae_shared_pool_next_recycled(&network->gradbuf, &_sgrad, _state); } mlpgradbatchx(network, &network->dummydxy, xy, ssize, 1, &network->dummyidx, subset0, subset1, subsettype, &network->buf, &network->gradbuf, _state); *e = 0.0; for(i=0; i<=wcount-1; i++) { grad->ptr.p_double[i] = 0.0; } ae_shared_pool_first_recycled(&network->gradbuf, &_sgrad, _state); while(sgrad!=NULL) { *e = *e+sgrad->f; for(i=0; i<=wcount-1; i++) { grad->ptr.p_double[i] = grad->ptr.p_double[i]+sgrad->g.ptr.p_double[i]; } ae_shared_pool_next_recycled(&network->gradbuf, &_sgrad, _state); } ae_frame_leave(_state); } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. *************************************************************************/ void _pexec_mlpgradbatchsparse(multilayerperceptron* network, sparsematrix* xy, ae_int_t ssize, double* e, /* Real */ ae_vector* grad, ae_state *_state) { mlpgradbatchsparse(network,xy,ssize,e,grad, _state); } /************************************************************************* Batch gradient calculation for a subset of dataset FOR USERS OF COMMERCIAL EDITION: ! Commercial version of ALGLIB includes two important improvements of ! this function: ! * multicore support (C++ and C# computational cores) ! * SSE support ! ! First improvement gives close-to-linear speedup on multicore systems. ! Second improvement gives constant speedup (2-3x depending on your CPU) ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! In order to use SSE features you have to: ! * use commercial version of ALGLIB on Intel processors ! * use C++ computational core ! ! This note is given for users of commercial edition; if you use GPL ! edition, you still will be able to call smp-version of this function, ! but all computations will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! 
called 'SMP support', before using parallel version of this function. INPUT PARAMETERS: Network - network initialized with one of the network creation funcs XY - original dataset in dense format; one sample = one row: * first NIn columns contain inputs, * for regression problem, next NOut columns store desired outputs. * for classification problem, next column (just one!) stores class number. SetSize - real size of XY, SetSize>=0; Idx - subset of SubsetSize elements, array[SubsetSize]: * Idx[I] stores row index in the original dataset which is given by XY. Gradient is calculated with respect to rows whose indexes are stored in Idx[]. * Idx[] must store correct indexes; this function throws an exception in case incorrect index (less than 0 or larger than rows(XY)) is given * Idx[] may store indexes in any order and even with repetitions. SubsetSize- number of elements in Idx[] array: * positive value means that subset given by Idx[] is processed * zero value results in zero gradient * negative value means that full dataset is processed Grad - possibly preallocated array. If size of array is smaller than WCount, it will be reallocated. It is recommended to reuse previously allocated array to reduce allocation overhead. OUTPUT PARAMETERS: E - error function, SUM(sqr(y[i]-desiredy[i])/2,i) Grad - gradient of E with respect to weights of network, array[WCount] -- ALGLIB -- Copyright 26.07.2012 by Bochkanov Sergey *************************************************************************/ void mlpgradbatchsubset(multilayerperceptron* network, /* Real */ ae_matrix* xy, ae_int_t setsize, /* Integer */ ae_vector* idx, ae_int_t subsetsize, double* e, /* Real */ ae_vector* grad, ae_state *_state) { ae_frame _frame_block; ae_int_t i; ae_int_t nin; ae_int_t nout; ae_int_t wcount; ae_int_t npoints; ae_int_t subset0; ae_int_t subset1; ae_int_t subsettype; smlpgrad *sgrad; ae_smart_ptr _sgrad; ae_frame_make(_state, &_frame_block); *e = 0; ae_smart_ptr_init(&_sgrad, (void**)&sgrad, _state); ae_assert(setsize>=0, "MLPGradBatchSubset: SetSize<0", _state); ae_assert(subsetsize<=idx->cnt, "MLPGradBatchSubset: SubsetSize>Length(Idx)", _state); npoints = setsize; if( subsetsize<0 ) { subset0 = 0; subset1 = setsize; subsettype = 0; } else { subset0 = 0; subset1 = subsetsize; subsettype = 1; for(i=0; i<=subsetsize-1; i++) { ae_assert(idx->ptr.p_int[i]>=0, "MLPGradBatchSubset: incorrect index of XY row(Idx[I]<0)", _state); ae_assert(idx->ptr.p_int[i]<=npoints-1, "MLPGradBatchSubset: incorrect index of XY row(Idx[I]>Rows(XY)-1)", _state); } } mlpproperties(network, &nin, &nout, &wcount, _state); rvectorsetlengthatleast(grad, wcount, _state); ae_shared_pool_first_recycled(&network->gradbuf, &_sgrad, _state); while(sgrad!=NULL) { sgrad->f = 0.0; for(i=0; i<=wcount-1; i++) { sgrad->g.ptr.p_double[i] = 0.0; } ae_shared_pool_next_recycled(&network->gradbuf, &_sgrad, _state); } mlpgradbatchx(network, xy, &network->dummysxy, setsize, 0, idx, subset0, subset1, subsettype, &network->buf, &network->gradbuf, _state); *e = 0.0; for(i=0; i<=wcount-1; i++) { grad->ptr.p_double[i] = 0.0; } ae_shared_pool_first_recycled(&network->gradbuf, &_sgrad, _state); while(sgrad!=NULL) { *e = *e+sgrad->f; for(i=0; i<=wcount-1; i++) { grad->ptr.p_double[i] = grad->ptr.p_double[i]+sgrad->g.ptr.p_double[i]; } ae_shared_pool_next_recycled(&network->gradbuf, &_sgrad, _state); } ae_frame_leave(_state); } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. 
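(Added note: in the Free Edition this _pexec_ stub simply forwards to the
 serial mlpgradbatchsubset, as the code below shows; the "smp_"-prefixed
 entry point mentioned in the notes above is backed by this stub, so calling
 the parallel version is always safe - without HPC support it just runs
 serially.)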
*************************************************************************/ void _pexec_mlpgradbatchsubset(multilayerperceptron* network, /* Real */ ae_matrix* xy, ae_int_t setsize, /* Integer */ ae_vector* idx, ae_int_t subsetsize, double* e, /* Real */ ae_vector* grad, ae_state *_state) { mlpgradbatchsubset(network,xy,setsize,idx,subsetsize,e,grad, _state); } /************************************************************************* Batch gradient calculation for a set of inputs/outputs for a subset of dataset given by set of indexes. FOR USERS OF COMMERCIAL EDITION: ! Commercial version of ALGLIB includes two important improvements of ! this function: ! * multicore support (C++ and C# computational cores) ! * SSE support ! ! First improvement gives close-to-linear speedup on multicore systems. ! Second improvement gives constant speedup (2-3x depending on your CPU) ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! In order to use SSE features you have to: ! * use commercial version of ALGLIB on Intel processors ! * use C++ computational core ! ! This note is given for users of commercial edition; if you use GPL ! edition, you still will be able to call smp-version of this function, ! but all computations will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. INPUT PARAMETERS: Network - network initialized with one of the network creation funcs XY - original dataset in sparse format; one sample = one row: * MATRIX MUST BE STORED IN CRS FORMAT * first NIn columns contain inputs, * for regression problem, next NOut columns store desired outputs. * for classification problem, next column (just one!) stores class number. SetSize - real size of XY, SetSize>=0; Idx - subset of SubsetSize elements, array[SubsetSize]: * Idx[I] stores row index in the original dataset which is given by XY. Gradient is calculated with respect to rows whose indexes are stored in Idx[]. * Idx[] must store correct indexes; this function throws an exception in case incorrect index (less than 0 or larger than rows(XY)) is given * Idx[] may store indexes in any order and even with repetitions. SubsetSize- number of elements in Idx[] array: * positive value means that subset given by Idx[] is processed * zero value results in zero gradient * negative value means that full dataset is processed Grad - possibly preallocated array. If size of array is smaller than WCount, it will be reallocated. It is recommended to reuse previously allocated array to reduce allocation overhead. OUTPUT PARAMETERS: E - error function, SUM(sqr(y[i]-desiredy[i])/2,i) Grad - gradient of E with respect to weights of network, array[WCount] NOTE: when SubsetSize<0 is used full dataset by call MLPGradBatchSparse function. 
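NOTE: an illustrative sketch of subset selection (added example; index
      values are arbitrary and Idx is shown as a plain list for brevity).
      To accumulate the gradient over a mini-batch made of rows 3, 7 and 7
      of XY, pass Idx = [3, 7, 7] with SubsetSize=3 (repetitions are
      allowed, see above). SubsetSize=0 produces a zero gradient, and any
      negative SubsetSize processes the whole dataset and ignores Idx.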
-- ALGLIB -- Copyright 26.07.2012 by Bochkanov Sergey *************************************************************************/ void mlpgradbatchsparsesubset(multilayerperceptron* network, sparsematrix* xy, ae_int_t setsize, /* Integer */ ae_vector* idx, ae_int_t subsetsize, double* e, /* Real */ ae_vector* grad, ae_state *_state) { ae_frame _frame_block; ae_int_t i; ae_int_t nin; ae_int_t nout; ae_int_t wcount; ae_int_t npoints; ae_int_t subset0; ae_int_t subset1; ae_int_t subsettype; smlpgrad *sgrad; ae_smart_ptr _sgrad; ae_frame_make(_state, &_frame_block); *e = 0; ae_smart_ptr_init(&_sgrad, (void**)&sgrad, _state); ae_assert(setsize>=0, "MLPGradBatchSparseSubset: SetSize<0", _state); ae_assert(subsetsize<=idx->cnt, "MLPGradBatchSparseSubset: SubsetSize>Length(Idx)", _state); ae_assert(sparseiscrs(xy, _state), "MLPGradBatchSparseSubset: sparse matrix XY must be in CRS format.", _state); npoints = setsize; if( subsetsize<0 ) { subset0 = 0; subset1 = setsize; subsettype = 0; } else { subset0 = 0; subset1 = subsetsize; subsettype = 1; for(i=0; i<=subsetsize-1; i++) { ae_assert(idx->ptr.p_int[i]>=0, "MLPGradBatchSparseSubset: incorrect index of XY row(Idx[I]<0)", _state); ae_assert(idx->ptr.p_int[i]<=npoints-1, "MLPGradBatchSparseSubset: incorrect index of XY row(Idx[I]>Rows(XY)-1)", _state); } } mlpproperties(network, &nin, &nout, &wcount, _state); rvectorsetlengthatleast(grad, wcount, _state); ae_shared_pool_first_recycled(&network->gradbuf, &_sgrad, _state); while(sgrad!=NULL) { sgrad->f = 0.0; for(i=0; i<=wcount-1; i++) { sgrad->g.ptr.p_double[i] = 0.0; } ae_shared_pool_next_recycled(&network->gradbuf, &_sgrad, _state); } mlpgradbatchx(network, &network->dummydxy, xy, setsize, 1, idx, subset0, subset1, subsettype, &network->buf, &network->gradbuf, _state); *e = 0.0; for(i=0; i<=wcount-1; i++) { grad->ptr.p_double[i] = 0.0; } ae_shared_pool_first_recycled(&network->gradbuf, &_sgrad, _state); while(sgrad!=NULL) { *e = *e+sgrad->f; for(i=0; i<=wcount-1; i++) { grad->ptr.p_double[i] = grad->ptr.p_double[i]+sgrad->g.ptr.p_double[i]; } ae_shared_pool_next_recycled(&network->gradbuf, &_sgrad, _state); } ae_frame_leave(_state); } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. *************************************************************************/ void _pexec_mlpgradbatchsparsesubset(multilayerperceptron* network, sparsematrix* xy, ae_int_t setsize, /* Integer */ ae_vector* idx, ae_int_t subsetsize, double* e, /* Real */ ae_vector* grad, ae_state *_state) { mlpgradbatchsparsesubset(network,xy,setsize,idx,subsetsize,e,grad, _state); } /************************************************************************* Internal function which actually calculates batch gradient for a subset or full dataset, which can be represented in different formats. THIS FUNCTION IS NOT INTENDED TO BE USED BY ALGLIB USERS! 
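NOTE: calling convention, inferred from the code below (internal detail,
      may change): DatasetType=0 means that DenseXY is read, DatasetType=1
      means that SparseXY (CRS) is read; SubsetType=0 processes rows
      Subset0..Subset1-1 of the dataset directly, while SubsetType=1 takes
      row indices from Idx[Subset0..Subset1-1].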
-- ALGLIB -- Copyright 26.07.2012 by Bochkanov Sergey *************************************************************************/ void mlpgradbatchx(multilayerperceptron* network, /* Real */ ae_matrix* densexy, sparsematrix* sparsexy, ae_int_t datasetsize, ae_int_t datasettype, /* Integer */ ae_vector* idx, ae_int_t subset0, ae_int_t subset1, ae_int_t subsettype, ae_shared_pool* buf, ae_shared_pool* gradbuf, ae_state *_state) { ae_frame _frame_block; ae_int_t nin; ae_int_t nout; ae_int_t wcount; ae_int_t rowsize; ae_int_t srcidx; ae_int_t cstart; ae_int_t csize; ae_int_t j; double problemcost; mlpbuffers *buf2; ae_smart_ptr _buf2; ae_int_t len0; ae_int_t len1; mlpbuffers *pbuf; ae_smart_ptr _pbuf; smlpgrad *sgrad; ae_smart_ptr _sgrad; ae_frame_make(_state, &_frame_block); ae_smart_ptr_init(&_buf2, (void**)&buf2, _state); ae_smart_ptr_init(&_pbuf, (void**)&pbuf, _state); ae_smart_ptr_init(&_sgrad, (void**)&sgrad, _state); ae_assert(datasetsize>=0, "MLPGradBatchX: SetSize<0", _state); ae_assert(datasettype==0||datasettype==1, "MLPGradBatchX: DatasetType is incorrect", _state); ae_assert(subsettype==0||subsettype==1, "MLPGradBatchX: SubsetType is incorrect", _state); /* * Determine network and dataset properties */ mlpproperties(network, &nin, &nout, &wcount, _state); if( mlpissoftmax(network, _state) ) { rowsize = nin+1; } else { rowsize = nin+nout; } /* * Split problem. * * Splitting problem allows us to reduce effect of single-precision * arithmetics (SSE-optimized version of MLPChunkedGradient uses single * precision internally, but converts them to double precision after * results are exported from HPC buffer to network). Small batches are * calculated in single precision, results are aggregated in double * precision, and it allows us to avoid accumulation of errors when * we process very large batches (tens of thousands of items). * * NOTE: it is important to use real arithmetics for ProblemCost * because ProblemCost may be larger than MAXINT. 
*/ problemcost = (double)(subset1-subset0); problemcost = problemcost*wcount; if( subset1-subset0>=2*mlpbase_microbatchsize&&ae_fp_greater(problemcost,(double)(mlpbase_gradbasecasecost)) ) { splitlength(subset1-subset0, mlpbase_microbatchsize, &len0, &len1, _state); mlpgradbatchx(network, densexy, sparsexy, datasetsize, datasettype, idx, subset0, subset0+len0, subsettype, buf, gradbuf, _state); mlpgradbatchx(network, densexy, sparsexy, datasetsize, datasettype, idx, subset0+len0, subset1, subsettype, buf, gradbuf, _state); ae_frame_leave(_state); return; } /* * Chunked processing */ ae_shared_pool_retrieve(gradbuf, &_sgrad, _state); ae_shared_pool_retrieve(buf, &_pbuf, _state); hpcpreparechunkedgradient(&network->weights, wcount, mlpntotal(network, _state), nin, nout, pbuf, _state); cstart = subset0; while(cstart<subset1) { csize = ae_minint(subset1, cstart+pbuf->chunksize, _state)-cstart; for(j=0; j<=csize-1; j++) { srcidx = -1; if( subsettype==0 ) { srcidx = cstart+j; } if( subsettype==1 ) { srcidx = idx->ptr.p_int[cstart+j]; } ae_assert(srcidx>=0, "MLPGradBatchX: internal error", _state); if( datasettype==0 ) { ae_v_move(&pbuf->xy.ptr.pp_double[j][0], 1, &densexy->ptr.pp_double[srcidx][0], 1, ae_v_len(0,rowsize-1)); } if( datasettype==1 ) { sparsegetrow(sparsexy, srcidx, &pbuf->xyrow, _state); ae_v_move(&pbuf->xy.ptr.pp_double[j][0], 1, &pbuf->xyrow.ptr.p_double[0], 1, ae_v_len(0,rowsize-1)); } } /* * Process chunk and advance line pointer */ mlpbase_mlpchunkedgradient(network, &pbuf->xy, 0, csize, &pbuf->batch4buf, &pbuf->hpcbuf, &sgrad->f, ae_false, _state); cstart = cstart+pbuf->chunksize; } hpcfinalizechunkedgradient(pbuf, &sgrad->g, _state); ae_shared_pool_recycle(buf, &_pbuf, _state); ae_shared_pool_recycle(gradbuf, &_sgrad, _state); ae_frame_leave(_state); } /************************************************************************* Batch gradient calculation for a set of inputs/outputs (natural error function is used) INPUT PARAMETERS: Network - network initialized with one of the network creation funcs XY - set of inputs/outputs; one sample = one row; first NIn columns contain inputs, next NOut columns - desired outputs. SSize - number of elements in XY Grad - possibly preallocated array. If size of array is smaller than WCount, it will be reallocated. It is recommended to reuse previously allocated array to reduce allocation overhead. OUTPUT PARAMETERS: E - error function, sum-of-squares for regression networks, cross-entropy for classification networks.
Grad - gradient of E with respect to weights of network, array[WCount] -- ALGLIB -- Copyright 04.11.2007 by Bochkanov Sergey *************************************************************************/ void mlpgradnbatch(multilayerperceptron* network, /* Real */ ae_matrix* xy, ae_int_t ssize, double* e, /* Real */ ae_vector* grad, ae_state *_state) { ae_frame _frame_block; ae_int_t i; ae_int_t nin; ae_int_t nout; ae_int_t wcount; mlpbuffers *pbuf; ae_smart_ptr _pbuf; ae_frame_make(_state, &_frame_block); *e = 0; ae_smart_ptr_init(&_pbuf, (void**)&pbuf, _state); /* * Alloc */ mlpproperties(network, &nin, &nout, &wcount, _state); ae_shared_pool_retrieve(&network->buf, &_pbuf, _state); hpcpreparechunkedgradient(&network->weights, wcount, mlpntotal(network, _state), nin, nout, pbuf, _state); rvectorsetlengthatleast(grad, wcount, _state); for(i=0; i<=wcount-1; i++) { grad->ptr.p_double[i] = (double)(0); } *e = (double)(0); i = 0; while(i<=ssize-1) { mlpbase_mlpchunkedgradient(network, xy, i, ae_minint(ssize, i+pbuf->chunksize, _state)-i, &pbuf->batch4buf, &pbuf->hpcbuf, e, ae_true, _state); i = i+pbuf->chunksize; } hpcfinalizechunkedgradient(pbuf, grad, _state); ae_shared_pool_recycle(&network->buf, &_pbuf, _state); ae_frame_leave(_state); } /************************************************************************* Batch Hessian calculation (natural error function) using R-algorithm. Internal subroutine. -- ALGLIB -- Copyright 26.01.2008 by Bochkanov Sergey. Hessian calculation based on R-algorithm described in "Fast Exact Multiplication by the Hessian", B. A. Pearlmutter, Neural Computation, 1994. *************************************************************************/ void mlphessiannbatch(multilayerperceptron* network, /* Real */ ae_matrix* xy, ae_int_t ssize, double* e, /* Real */ ae_vector* grad, /* Real */ ae_matrix* h, ae_state *_state) { *e = 0; mlpbase_mlphessianbatchinternal(network, xy, ssize, ae_true, e, grad, h, _state); } /************************************************************************* Batch Hessian calculation using R-algorithm. Internal subroutine. -- ALGLIB -- Copyright 26.01.2008 by Bochkanov Sergey. Hessian calculation based on R-algorithm described in "Fast Exact Multiplication by the Hessian", B. A. Pearlmutter, Neural Computation, 1994. *************************************************************************/ void mlphessianbatch(multilayerperceptron* network, /* Real */ ae_matrix* xy, ae_int_t ssize, double* e, /* Real */ ae_vector* grad, /* Real */ ae_matrix* h, ae_state *_state) { *e = 0; mlpbase_mlphessianbatchinternal(network, xy, ssize, ae_false, e, grad, h, _state); } /************************************************************************* Internal subroutine, shouldn't be called by user. 
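(Added outline, inferred from the code below: the routine standardises the
 inputs as (x[i]-Mean[i])/Sigma[i], performs a forward pass over the packed
 StructInfo representation, and post-processes the outputs - classifier
 networks get a numerically stable softmax,

     y[i] = exp(y[i]-mx) / (exp(y[0]-mx)+...+exp(y[NOut-1]-mx)),
     mx   = max(y[0],...,y[NOut-1]),

 while regression networks have their outputs un-standardised as
 y[i]*Sigma[NIn+i]+Mean[NIn+i].)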
*************************************************************************/ void mlpinternalprocessvector(/* Integer */ ae_vector* structinfo, /* Real */ ae_vector* weights, /* Real */ ae_vector* columnmeans, /* Real */ ae_vector* columnsigmas, /* Real */ ae_vector* neurons, /* Real */ ae_vector* dfdnet, /* Real */ ae_vector* x, /* Real */ ae_vector* y, ae_state *_state) { ae_int_t i; ae_int_t n1; ae_int_t n2; ae_int_t w1; ae_int_t w2; ae_int_t ntotal; ae_int_t nin; ae_int_t nout; ae_int_t istart; ae_int_t offs; double net; double f; double df; double d2f; double mx; ae_bool perr; /* * Read network geometry */ nin = structinfo->ptr.p_int[1]; nout = structinfo->ptr.p_int[2]; ntotal = structinfo->ptr.p_int[3]; istart = structinfo->ptr.p_int[5]; /* * Inputs standartisation and putting in the network */ for(i=0; i<=nin-1; i++) { if( ae_fp_neq(columnsigmas->ptr.p_double[i],(double)(0)) ) { neurons->ptr.p_double[i] = (x->ptr.p_double[i]-columnmeans->ptr.p_double[i])/columnsigmas->ptr.p_double[i]; } else { neurons->ptr.p_double[i] = x->ptr.p_double[i]-columnmeans->ptr.p_double[i]; } } /* * Process network */ for(i=0; i<=ntotal-1; i++) { offs = istart+i*mlpbase_nfieldwidth; if( structinfo->ptr.p_int[offs+0]>0||structinfo->ptr.p_int[offs+0]==-5 ) { /* * Activation function */ mlpactivationfunction(neurons->ptr.p_double[structinfo->ptr.p_int[offs+2]], structinfo->ptr.p_int[offs+0], &f, &df, &d2f, _state); neurons->ptr.p_double[i] = f; dfdnet->ptr.p_double[i] = df; continue; } if( structinfo->ptr.p_int[offs+0]==0 ) { /* * Adaptive summator */ n1 = structinfo->ptr.p_int[offs+2]; n2 = n1+structinfo->ptr.p_int[offs+1]-1; w1 = structinfo->ptr.p_int[offs+3]; w2 = w1+structinfo->ptr.p_int[offs+1]-1; net = ae_v_dotproduct(&weights->ptr.p_double[w1], 1, &neurons->ptr.p_double[n1], 1, ae_v_len(w1,w2)); neurons->ptr.p_double[i] = net; dfdnet->ptr.p_double[i] = 1.0; touchint(&n2, _state); continue; } if( structinfo->ptr.p_int[offs+0]<0 ) { perr = ae_true; if( structinfo->ptr.p_int[offs+0]==-2 ) { /* * input neuron, left unchanged */ perr = ae_false; } if( structinfo->ptr.p_int[offs+0]==-3 ) { /* * "-1" neuron */ neurons->ptr.p_double[i] = (double)(-1); perr = ae_false; } if( structinfo->ptr.p_int[offs+0]==-4 ) { /* * "0" neuron */ neurons->ptr.p_double[i] = (double)(0); perr = ae_false; } ae_assert(!perr, "MLPInternalProcessVector: internal error - unknown neuron type!", _state); continue; } } /* * Extract result */ ae_v_move(&y->ptr.p_double[0], 1, &neurons->ptr.p_double[ntotal-nout], 1, ae_v_len(0,nout-1)); /* * Softmax post-processing or standardisation if needed */ ae_assert(structinfo->ptr.p_int[6]==0||structinfo->ptr.p_int[6]==1, "MLPInternalProcessVector: unknown normalization type!", _state); if( structinfo->ptr.p_int[6]==1 ) { /* * Softmax */ mx = y->ptr.p_double[0]; for(i=1; i<=nout-1; i++) { mx = ae_maxreal(mx, y->ptr.p_double[i], _state); } net = (double)(0); for(i=0; i<=nout-1; i++) { y->ptr.p_double[i] = ae_exp(y->ptr.p_double[i]-mx, _state); net = net+y->ptr.p_double[i]; } for(i=0; i<=nout-1; i++) { y->ptr.p_double[i] = y->ptr.p_double[i]/net; } } else { /* * Standardisation */ for(i=0; i<=nout-1; i++) { y->ptr.p_double[i] = y->ptr.p_double[i]*columnsigmas->ptr.p_double[nin+i]+columnmeans->ptr.p_double[nin+i]; } } } /************************************************************************* Serializer: allocation -- ALGLIB -- Copyright 14.03.2011 by Bochkanov Sergey *************************************************************************/ void mlpalloc(ae_serializer* s, multilayerperceptron* 
network, ae_state *_state) { ae_int_t i; ae_int_t j; ae_int_t k; ae_int_t fkind; double threshold; double v0; double v1; ae_int_t nin; ae_int_t nout; nin = network->hllayersizes.ptr.p_int[0]; nout = network->hllayersizes.ptr.p_int[network->hllayersizes.cnt-1]; ae_serializer_alloc_entry(s); ae_serializer_alloc_entry(s); ae_serializer_alloc_entry(s); allocintegerarray(s, &network->hllayersizes, -1, _state); for(i=1; i<=network->hllayersizes.cnt-1; i++) { for(j=0; j<=network->hllayersizes.ptr.p_int[i]-1; j++) { mlpgetneuroninfo(network, i, j, &fkind, &threshold, _state); ae_serializer_alloc_entry(s); ae_serializer_alloc_entry(s); for(k=0; k<=network->hllayersizes.ptr.p_int[i-1]-1; k++) { ae_serializer_alloc_entry(s); } } } for(j=0; j<=nin-1; j++) { mlpgetinputscaling(network, j, &v0, &v1, _state); ae_serializer_alloc_entry(s); ae_serializer_alloc_entry(s); } for(j=0; j<=nout-1; j++) { mlpgetoutputscaling(network, j, &v0, &v1, _state); ae_serializer_alloc_entry(s); ae_serializer_alloc_entry(s); } } /************************************************************************* Serializer: serialization -- ALGLIB -- Copyright 14.03.2011 by Bochkanov Sergey *************************************************************************/ void mlpserialize(ae_serializer* s, multilayerperceptron* network, ae_state *_state) { ae_int_t i; ae_int_t j; ae_int_t k; ae_int_t fkind; double threshold; double v0; double v1; ae_int_t nin; ae_int_t nout; nin = network->hllayersizes.ptr.p_int[0]; nout = network->hllayersizes.ptr.p_int[network->hllayersizes.cnt-1]; ae_serializer_serialize_int(s, getmlpserializationcode(_state), _state); ae_serializer_serialize_int(s, mlpbase_mlpfirstversion, _state); ae_serializer_serialize_bool(s, mlpissoftmax(network, _state), _state); serializeintegerarray(s, &network->hllayersizes, -1, _state); for(i=1; i<=network->hllayersizes.cnt-1; i++) { for(j=0; j<=network->hllayersizes.ptr.p_int[i]-1; j++) { mlpgetneuroninfo(network, i, j, &fkind, &threshold, _state); ae_serializer_serialize_int(s, fkind, _state); ae_serializer_serialize_double(s, threshold, _state); for(k=0; k<=network->hllayersizes.ptr.p_int[i-1]-1; k++) { ae_serializer_serialize_double(s, mlpgetweight(network, i-1, k, i, j, _state), _state); } } } for(j=0; j<=nin-1; j++) { mlpgetinputscaling(network, j, &v0, &v1, _state); ae_serializer_serialize_double(s, v0, _state); ae_serializer_serialize_double(s, v1, _state); } for(j=0; j<=nout-1; j++) { mlpgetoutputscaling(network, j, &v0, &v1, _state); ae_serializer_serialize_double(s, v0, _state); ae_serializer_serialize_double(s, v1, _state); } } /************************************************************************* Serializer: unserialization -- ALGLIB -- Copyright 14.03.2011 by Bochkanov Sergey *************************************************************************/ void mlpunserialize(ae_serializer* s, multilayerperceptron* network, ae_state *_state) { ae_frame _frame_block; ae_int_t i0; ae_int_t i1; ae_int_t i; ae_int_t j; ae_int_t k; ae_int_t fkind; double threshold; double v0; double v1; ae_int_t nin; ae_int_t nout; ae_bool issoftmax; ae_vector layersizes; ae_frame_make(_state, &_frame_block); _multilayerperceptron_clear(network); ae_vector_init(&layersizes, 0, DT_INT, _state); /* * check correctness of header */ ae_serializer_unserialize_int(s, &i0, _state); ae_assert(i0==getmlpserializationcode(_state), "MLPUnserialize: stream header corrupted", _state); ae_serializer_unserialize_int(s, &i1, _state); ae_assert(i1==mlpbase_mlpfirstversion, "MLPUnserialize: stream header 
corrupted", _state); /* * Create network */ ae_serializer_unserialize_bool(s, &issoftmax, _state); unserializeintegerarray(s, &layersizes, _state); ae_assert((layersizes.cnt==2||layersizes.cnt==3)||layersizes.cnt==4, "MLPUnserialize: too many hidden layers!", _state); nin = layersizes.ptr.p_int[0]; nout = layersizes.ptr.p_int[layersizes.cnt-1]; if( layersizes.cnt==2 ) { if( issoftmax ) { mlpcreatec0(layersizes.ptr.p_int[0], layersizes.ptr.p_int[1], network, _state); } else { mlpcreate0(layersizes.ptr.p_int[0], layersizes.ptr.p_int[1], network, _state); } } if( layersizes.cnt==3 ) { if( issoftmax ) { mlpcreatec1(layersizes.ptr.p_int[0], layersizes.ptr.p_int[1], layersizes.ptr.p_int[2], network, _state); } else { mlpcreate1(layersizes.ptr.p_int[0], layersizes.ptr.p_int[1], layersizes.ptr.p_int[2], network, _state); } } if( layersizes.cnt==4 ) { if( issoftmax ) { mlpcreatec2(layersizes.ptr.p_int[0], layersizes.ptr.p_int[1], layersizes.ptr.p_int[2], layersizes.ptr.p_int[3], network, _state); } else { mlpcreate2(layersizes.ptr.p_int[0], layersizes.ptr.p_int[1], layersizes.ptr.p_int[2], layersizes.ptr.p_int[3], network, _state); } } /* * Load neurons and weights */ for(i=1; i<=layersizes.cnt-1; i++) { for(j=0; j<=layersizes.ptr.p_int[i]-1; j++) { ae_serializer_unserialize_int(s, &fkind, _state); ae_serializer_unserialize_double(s, &threshold, _state); mlpsetneuroninfo(network, i, j, fkind, threshold, _state); for(k=0; k<=layersizes.ptr.p_int[i-1]-1; k++) { ae_serializer_unserialize_double(s, &v0, _state); mlpsetweight(network, i-1, k, i, j, v0, _state); } } } /* * Load standartizator */ for(j=0; j<=nin-1; j++) { ae_serializer_unserialize_double(s, &v0, _state); ae_serializer_unserialize_double(s, &v1, _state); mlpsetinputscaling(network, j, v0, v1, _state); } for(j=0; j<=nout-1; j++) { ae_serializer_unserialize_double(s, &v0, _state); ae_serializer_unserialize_double(s, &v1, _state); mlpsetoutputscaling(network, j, v0, v1, _state); } ae_frame_leave(_state); } /************************************************************************* Calculation of all types of errors on subset of dataset. FOR USERS OF COMMERCIAL EDITION: ! Commercial version of ALGLIB includes two important improvements of ! this function: ! * multicore support (C++ and C# computational cores) ! * SSE support ! ! First improvement gives close-to-linear speedup on multicore systems. ! Second improvement gives constant speedup (2-3x depending on your CPU) ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! In order to use SSE features you have to: ! * use commercial version of ALGLIB on Intel processors ! * use C++ computational core ! ! This note is given for users of commercial edition; if you use GPL ! edition, you still will be able to call smp-version of this function, ! but all computations will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. INPUT PARAMETERS: Network - network initialized with one of the network creation funcs XY - original dataset; one sample = one row; first NIn columns contain inputs, next NOut columns - desired outputs. SetSize - real size of XY, SetSize>=0; Subset - subset of SubsetSize elements, array[SubsetSize]; SubsetSize- number of elements in Subset[] array: * if SubsetSize>0, rows of XY with indices Subset[0]... 
...Subset[SubsetSize-1] are processed * if SubsetSize=0, zeros are returned * if SubsetSize<0, entire dataset is processed; Subset[] array is ignored in this case. OUTPUT PARAMETERS: Rep - it contains all type of errors. -- ALGLIB -- Copyright 04.09.2012 by Bochkanov Sergey *************************************************************************/ void mlpallerrorssubset(multilayerperceptron* network, /* Real */ ae_matrix* xy, ae_int_t setsize, /* Integer */ ae_vector* subset, ae_int_t subsetsize, modelerrors* rep, ae_state *_state) { ae_int_t idx0; ae_int_t idx1; ae_int_t idxtype; _modelerrors_clear(rep); ae_assert(xy->rows>=setsize, "MLPAllErrorsSubset: XY has less than SetSize rows", _state); if( setsize>0 ) { if( mlpissoftmax(network, _state) ) { ae_assert(xy->cols>=mlpgetinputscount(network, _state)+1, "MLPAllErrorsSubset: XY has less than NIn+1 columns", _state); } else { ae_assert(xy->cols>=mlpgetinputscount(network, _state)+mlpgetoutputscount(network, _state), "MLPAllErrorsSubset: XY has less than NIn+NOut columns", _state); } } if( subsetsize>=0 ) { idx0 = 0; idx1 = subsetsize; idxtype = 1; } else { idx0 = 0; idx1 = setsize; idxtype = 0; } mlpallerrorsx(network, xy, &network->dummysxy, setsize, 0, subset, idx0, idx1, idxtype, &network->buf, rep, _state); } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. *************************************************************************/ void _pexec_mlpallerrorssubset(multilayerperceptron* network, /* Real */ ae_matrix* xy, ae_int_t setsize, /* Integer */ ae_vector* subset, ae_int_t subsetsize, modelerrors* rep, ae_state *_state) { mlpallerrorssubset(network,xy,setsize,subset,subsetsize,rep, _state); } /************************************************************************* Calculation of all types of errors on subset of dataset. FOR USERS OF COMMERCIAL EDITION: ! Commercial version of ALGLIB includes two important improvements of ! this function: ! * multicore support (C++ and C# computational cores) ! * SSE support ! ! First improvement gives close-to-linear speedup on multicore systems. ! Second improvement gives constant speedup (2-3x depending on your CPU) ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! In order to use SSE features you have to: ! * use commercial version of ALGLIB on Intel processors ! * use C++ computational core ! ! This note is given for users of commercial edition; if you use GPL ! edition, you still will be able to call smp-version of this function, ! but all computations will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. INPUT PARAMETERS: Network - network initialized with one of the network creation funcs XY - original dataset given by sparse matrix; one sample = one row; first NIn columns contain inputs, next NOut columns - desired outputs. SetSize - real size of XY, SetSize>=0; Subset - subset of SubsetSize elements, array[SubsetSize]; SubsetSize- number of elements in Subset[] array: * if SubsetSize>0, rows of XY with indices Subset[0]... ...Subset[SubsetSize-1] are processed * if SubsetSize=0, zeros are returned * if SubsetSize<0, entire dataset is processed; Subset[] array is ignored in this case. 
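NOTE: a typical call sketch (added illustration; names are hypothetical and
      setup code is omitted). To evaluate all errors on a validation subset
      whose row indices are stored in ValIdx (NVal elements):

          mlpallerrorssparsesubset(&net, &xy, setsize, &validx, nval, &rep, _state);

      afterwards Rep.RelCLSError, Rep.AvgCE, Rep.RMSError, Rep.AvgError and
      Rep.AvgRelError contain the individual error measures.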
OUTPUT PARAMETERS: Rep - it contains all type of errors. -- ALGLIB -- Copyright 04.09.2012 by Bochkanov Sergey *************************************************************************/ void mlpallerrorssparsesubset(multilayerperceptron* network, sparsematrix* xy, ae_int_t setsize, /* Integer */ ae_vector* subset, ae_int_t subsetsize, modelerrors* rep, ae_state *_state) { ae_int_t idx0; ae_int_t idx1; ae_int_t idxtype; _modelerrors_clear(rep); ae_assert(sparseiscrs(xy, _state), "MLPAllErrorsSparseSubset: XY is not in CRS format.", _state); ae_assert(sparsegetnrows(xy, _state)>=setsize, "MLPAllErrorsSparseSubset: XY has less than SetSize rows", _state); if( setsize>0 ) { if( mlpissoftmax(network, _state) ) { ae_assert(sparsegetncols(xy, _state)>=mlpgetinputscount(network, _state)+1, "MLPAllErrorsSparseSubset: XY has less than NIn+1 columns", _state); } else { ae_assert(sparsegetncols(xy, _state)>=mlpgetinputscount(network, _state)+mlpgetoutputscount(network, _state), "MLPAllErrorsSparseSubset: XY has less than NIn+NOut columns", _state); } } if( subsetsize>=0 ) { idx0 = 0; idx1 = subsetsize; idxtype = 1; } else { idx0 = 0; idx1 = setsize; idxtype = 0; } mlpallerrorsx(network, &network->dummydxy, xy, setsize, 1, subset, idx0, idx1, idxtype, &network->buf, rep, _state); } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. *************************************************************************/ void _pexec_mlpallerrorssparsesubset(multilayerperceptron* network, sparsematrix* xy, ae_int_t setsize, /* Integer */ ae_vector* subset, ae_int_t subsetsize, modelerrors* rep, ae_state *_state) { mlpallerrorssparsesubset(network,xy,setsize,subset,subsetsize,rep, _state); } /************************************************************************* Error of the neural network on subset of dataset. FOR USERS OF COMMERCIAL EDITION: ! Commercial version of ALGLIB includes two important improvements of ! this function: ! * multicore support (C++ and C# computational cores) ! * SSE support ! ! First improvement gives close-to-linear speedup on multicore systems. ! Second improvement gives constant speedup (2-3x depending on your CPU) ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! In order to use SSE features you have to: ! * use commercial version of ALGLIB on Intel processors ! * use C++ computational core ! ! This note is given for users of commercial edition; if you use GPL ! edition, you still will be able to call smp-version of this function, ! but all computations will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. INPUT PARAMETERS: Network - neural network; XY - training set, see below for information on the training set format; SetSize - real size of XY, SetSize>=0; Subset - subset of SubsetSize elements, array[SubsetSize]; SubsetSize- number of elements in Subset[] array: * if SubsetSize>0, rows of XY with indices Subset[0]... ...Subset[SubsetSize-1] are processed * if SubsetSize=0, zeros are returned * if SubsetSize<0, entire dataset is processed; Subset[] array is ignored in this case. 
RESULT: sum-of-squares error, SUM(sqr(y[i]-desired_y[i])/2) DATASET FORMAT: This function uses two different dataset formats - one for regression networks, another one for classification networks. For regression networks with NIn inputs and NOut outputs following dataset format is used: * dataset is given by NPoints*(NIn+NOut) matrix * each row corresponds to one example * first NIn columns are inputs, next NOut columns are outputs For classification networks with NIn inputs and NClasses clases following dataset format is used: * dataset is given by NPoints*(NIn+1) matrix * each row corresponds to one example * first NIn columns are inputs, last column stores class number (from 0 to NClasses-1). -- ALGLIB -- Copyright 04.09.2012 by Bochkanov Sergey *************************************************************************/ double mlperrorsubset(multilayerperceptron* network, /* Real */ ae_matrix* xy, ae_int_t setsize, /* Integer */ ae_vector* subset, ae_int_t subsetsize, ae_state *_state) { ae_int_t idx0; ae_int_t idx1; ae_int_t idxtype; double result; ae_assert(xy->rows>=setsize, "MLPErrorSubset: XY has less than SetSize rows", _state); if( setsize>0 ) { if( mlpissoftmax(network, _state) ) { ae_assert(xy->cols>=mlpgetinputscount(network, _state)+1, "MLPErrorSubset: XY has less than NIn+1 columns", _state); } else { ae_assert(xy->cols>=mlpgetinputscount(network, _state)+mlpgetoutputscount(network, _state), "MLPErrorSubset: XY has less than NIn+NOut columns", _state); } } if( subsetsize>=0 ) { idx0 = 0; idx1 = subsetsize; idxtype = 1; } else { idx0 = 0; idx1 = setsize; idxtype = 0; } mlpallerrorsx(network, xy, &network->dummysxy, setsize, 0, subset, idx0, idx1, idxtype, &network->buf, &network->err, _state); result = ae_sqr(network->err.rmserror, _state)*(idx1-idx0)*mlpgetoutputscount(network, _state)/2; return result; } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. *************************************************************************/ double _pexec_mlperrorsubset(multilayerperceptron* network, /* Real */ ae_matrix* xy, ae_int_t setsize, /* Integer */ ae_vector* subset, ae_int_t subsetsize, ae_state *_state) { return mlperrorsubset(network,xy,setsize,subset,subsetsize, _state); } /************************************************************************* Error of the neural network on subset of sparse dataset. FOR USERS OF COMMERCIAL EDITION: ! Commercial version of ALGLIB includes two important improvements of ! this function: ! * multicore support (C++ and C# computational cores) ! * SSE support ! ! First improvement gives close-to-linear speedup on multicore systems. ! Second improvement gives constant speedup (2-3x depending on your CPU) ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! In order to use SSE features you have to: ! * use commercial version of ALGLIB on Intel processors ! * use C++ computational core ! ! This note is given for users of commercial edition; if you use GPL ! edition, you still will be able to call smp-version of this function, ! but all computations will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. 
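NOTE: relation to the RMS error (added clarification, follows from the
      definitions used in this unit): if N rows are processed, the value
      returned by this function equals

          RESULT = RMSError^2 * N * NOut / 2

      where RMSError is the root-mean-square error over all N*NOut output
      components, i.e. RESULT is half the sum of squared deviations.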
INPUT PARAMETERS: Network - neural network; XY - training set, see below for information on the training set format. This function checks correctness of the dataset (no NANs/INFs, class numbers are correct) and throws exception when incorrect dataset is passed. Sparse matrix must use CRS format for storage. SetSize - real size of XY, SetSize>=0; it is used when SubsetSize<0; Subset - subset of SubsetSize elements, array[SubsetSize]; SubsetSize- number of elements in Subset[] array: * if SubsetSize>0, rows of XY with indices Subset[0]... ...Subset[SubsetSize-1] are processed * if SubsetSize=0, zeros are returned * if SubsetSize<0, entire dataset is processed; Subset[] array is ignored in this case. RESULT: sum-of-squares error, SUM(sqr(y[i]-desired_y[i])/2) DATASET FORMAT: This function uses two different dataset formats - one for regression networks, another one for classification networks. For regression networks with NIn inputs and NOut outputs following dataset format is used: * dataset is given by NPoints*(NIn+NOut) matrix * each row corresponds to one example * first NIn columns are inputs, next NOut columns are outputs For classification networks with NIn inputs and NClasses clases following dataset format is used: * dataset is given by NPoints*(NIn+1) matrix * each row corresponds to one example * first NIn columns are inputs, last column stores class number (from 0 to NClasses-1). -- ALGLIB -- Copyright 04.09.2012 by Bochkanov Sergey *************************************************************************/ double mlperrorsparsesubset(multilayerperceptron* network, sparsematrix* xy, ae_int_t setsize, /* Integer */ ae_vector* subset, ae_int_t subsetsize, ae_state *_state) { ae_int_t idx0; ae_int_t idx1; ae_int_t idxtype; double result; ae_assert(sparseiscrs(xy, _state), "MLPErrorSparseSubset: XY is not in CRS format.", _state); ae_assert(sparsegetnrows(xy, _state)>=setsize, "MLPErrorSparseSubset: XY has less than SetSize rows", _state); if( setsize>0 ) { if( mlpissoftmax(network, _state) ) { ae_assert(sparsegetncols(xy, _state)>=mlpgetinputscount(network, _state)+1, "MLPErrorSparseSubset: XY has less than NIn+1 columns", _state); } else { ae_assert(sparsegetncols(xy, _state)>=mlpgetinputscount(network, _state)+mlpgetoutputscount(network, _state), "MLPErrorSparseSubset: XY has less than NIn+NOut columns", _state); } } if( subsetsize>=0 ) { idx0 = 0; idx1 = subsetsize; idxtype = 1; } else { idx0 = 0; idx1 = setsize; idxtype = 0; } mlpallerrorsx(network, &network->dummydxy, xy, setsize, 1, subset, idx0, idx1, idxtype, &network->buf, &network->err, _state); result = ae_sqr(network->err.rmserror, _state)*(idx1-idx0)*mlpgetoutputscount(network, _state)/2; return result; } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. *************************************************************************/ double _pexec_mlperrorsparsesubset(multilayerperceptron* network, sparsematrix* xy, ae_int_t setsize, /* Integer */ ae_vector* subset, ae_int_t subsetsize, ae_state *_state) { return mlperrorsparsesubset(network,xy,setsize,subset,subsetsize, _state); } /************************************************************************* Calculation of all types of errors at once for a subset or full dataset, which can be represented in different formats. THIS INTERNAL FUNCTION IS NOT INTENDED TO BE USED BY ALGLIB USERS! 
-- ALGLIB -- Copyright 26.07.2012 by Bochkanov Sergey *************************************************************************/ void mlpallerrorsx(multilayerperceptron* network, /* Real */ ae_matrix* densexy, sparsematrix* sparsexy, ae_int_t datasetsize, ae_int_t datasettype, /* Integer */ ae_vector* idx, ae_int_t subset0, ae_int_t subset1, ae_int_t subsettype, ae_shared_pool* buf, modelerrors* rep, ae_state *_state) { ae_frame _frame_block; ae_int_t nin; ae_int_t nout; ae_int_t wcount; ae_int_t rowsize; ae_bool iscls; ae_int_t srcidx; ae_int_t cstart; ae_int_t csize; ae_int_t j; mlpbuffers *pbuf; ae_smart_ptr _pbuf; ae_int_t len0; ae_int_t len1; modelerrors rep0; modelerrors rep1; ae_frame_make(_state, &_frame_block); ae_smart_ptr_init(&_pbuf, (void**)&pbuf, _state); _modelerrors_init(&rep0, _state); _modelerrors_init(&rep1, _state); ae_assert(datasetsize>=0, "MLPAllErrorsX: SetSize<0", _state); ae_assert(datasettype==0||datasettype==1, "MLPAllErrorsX: DatasetType is incorrect", _state); ae_assert(subsettype==0||subsettype==1, "MLPAllErrorsX: SubsetType is incorrect", _state); /* * Determine network properties */ mlpproperties(network, &nin, &nout, &wcount, _state); iscls = mlpissoftmax(network, _state); /* * Split problem. * * Splitting problem allows us to reduce effect of single-precision * arithmetics (SSE-optimized version of MLPChunkedProcess uses single * precision internally, but converts them to double precision after * results are exported from HPC buffer to network). Small batches are * calculated in single precision, results are aggregated in double * precision, and it allows us to avoid accumulation of errors when * we process very large batches (tens of thousands of items). * * NOTE: it is important to use real arithmetics for ProblemCost * because ProblemCost may be larger than MAXINT. 
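 When a batch is split into two parts of length Len0 and Len1, the partial
 error reports are merged as weighted means, and the RMS error is merged
 through its square:

     AvgError = (Len0*AvgError0 + Len1*AvgError1) / (Len0+Len1)
     RMSError = sqrt( (Len0*RMSError0^2 + Len1*RMSError1^2) / (Len0+Len1) )

 so the merged figures coincide with those computed over the whole batch
 (see the code below).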
*/ if( subset1-subset0>=2*mlpbase_microbatchsize&&ae_fp_greater(inttoreal(subset1-subset0, _state)*inttoreal(wcount, _state),(double)(mlpbase_gradbasecasecost)) ) { splitlength(subset1-subset0, mlpbase_microbatchsize, &len0, &len1, _state); mlpallerrorsx(network, densexy, sparsexy, datasetsize, datasettype, idx, subset0, subset0+len0, subsettype, buf, &rep0, _state); mlpallerrorsx(network, densexy, sparsexy, datasetsize, datasettype, idx, subset0+len0, subset1, subsettype, buf, &rep1, _state); rep->relclserror = (len0*rep0.relclserror+len1*rep1.relclserror)/(len0+len1); rep->avgce = (len0*rep0.avgce+len1*rep1.avgce)/(len0+len1); rep->rmserror = ae_sqrt((len0*ae_sqr(rep0.rmserror, _state)+len1*ae_sqr(rep1.rmserror, _state))/(len0+len1), _state); rep->avgerror = (len0*rep0.avgerror+len1*rep1.avgerror)/(len0+len1); rep->avgrelerror = (len0*rep0.avgrelerror+len1*rep1.avgrelerror)/(len0+len1); ae_frame_leave(_state); return; } /* * Retrieve and prepare */ ae_shared_pool_retrieve(buf, &_pbuf, _state); if( iscls ) { rowsize = nin+1; dserrallocate(nout, &pbuf->tmp0, _state); } else { rowsize = nin+nout; dserrallocate(-nout, &pbuf->tmp0, _state); } /* * Processing */ hpcpreparechunkedgradient(&network->weights, wcount, mlpntotal(network, _state), nin, nout, pbuf, _state); cstart = subset0; while(cstart<subset1) { csize = ae_minint(subset1, cstart+pbuf->chunksize, _state)-cstart; for(j=0; j<=csize-1; j++) { srcidx = -1; if( subsettype==0 ) { srcidx = cstart+j; } if( subsettype==1 ) { srcidx = idx->ptr.p_int[cstart+j]; } ae_assert(srcidx>=0, "MLPAllErrorsX: internal error", _state); if( datasettype==0 ) { ae_v_move(&pbuf->xy.ptr.pp_double[j][0], 1, &densexy->ptr.pp_double[srcidx][0], 1, ae_v_len(0,rowsize-1)); } if( datasettype==1 ) { sparsegetrow(sparsexy, srcidx, &pbuf->xyrow, _state); ae_v_move(&pbuf->xy.ptr.pp_double[j][0], 1, &pbuf->xyrow.ptr.p_double[0], 1, ae_v_len(0,rowsize-1)); } } /* * Unpack XY and process (temporary code, to be replaced by chunked processing) */ for(j=0; j<=csize-1; j++) { ae_v_move(&pbuf->xy2.ptr.pp_double[j][0], 1, &pbuf->xy.ptr.pp_double[j][0], 1, ae_v_len(0,rowsize-1)); } mlpbase_mlpchunkedprocess(network, &pbuf->xy2, 0, csize, &pbuf->batch4buf, &pbuf->hpcbuf, _state); for(j=0; j<=csize-1; j++) { ae_v_move(&pbuf->x.ptr.p_double[0], 1, &pbuf->xy2.ptr.pp_double[j][0], 1, ae_v_len(0,nin-1)); ae_v_move(&pbuf->y.ptr.p_double[0], 1, &pbuf->xy2.ptr.pp_double[j][nin], 1, ae_v_len(0,nout-1)); if( iscls ) { pbuf->desiredy.ptr.p_double[0] = pbuf->xy.ptr.pp_double[j][nin]; } else { ae_v_move(&pbuf->desiredy.ptr.p_double[0], 1, &pbuf->xy.ptr.pp_double[j][nin], 1, ae_v_len(0,nout-1)); } dserraccumulate(&pbuf->tmp0, &pbuf->y, &pbuf->desiredy, _state); } /* * Process chunk and advance line pointer */ cstart = cstart+pbuf->chunksize; } dserrfinish(&pbuf->tmp0, _state); rep->relclserror = pbuf->tmp0.ptr.p_double[0]; rep->avgce = pbuf->tmp0.ptr.p_double[1]/ae_log((double)(2), _state); rep->rmserror = pbuf->tmp0.ptr.p_double[2]; rep->avgerror = pbuf->tmp0.ptr.p_double[3]; rep->avgrelerror = pbuf->tmp0.ptr.p_double[4]; /* * Recycle */ ae_shared_pool_recycle(buf, &_pbuf, _state); ae_frame_leave(_state); } /************************************************************************* Internal subroutine: adding new input layer to network *************************************************************************/ static void mlpbase_addinputlayer(ae_int_t ncount, /* Integer */ ae_vector* lsizes, /* Integer */ ae_vector* ltypes, /* Integer */ ae_vector* lconnfirst, /* Integer */ ae_vector* lconnlast, ae_int_t* lastproc, ae_state *_state) {
lsizes->ptr.p_int[0] = ncount; ltypes->ptr.p_int[0] = -2; lconnfirst->ptr.p_int[0] = 0; lconnlast->ptr.p_int[0] = 0; *lastproc = 0; } /************************************************************************* Internal subroutine: adding new summator layer to network *************************************************************************/ static void mlpbase_addbiasedsummatorlayer(ae_int_t ncount, /* Integer */ ae_vector* lsizes, /* Integer */ ae_vector* ltypes, /* Integer */ ae_vector* lconnfirst, /* Integer */ ae_vector* lconnlast, ae_int_t* lastproc, ae_state *_state) { lsizes->ptr.p_int[*lastproc+1] = 1; ltypes->ptr.p_int[*lastproc+1] = -3; lconnfirst->ptr.p_int[*lastproc+1] = 0; lconnlast->ptr.p_int[*lastproc+1] = 0; lsizes->ptr.p_int[*lastproc+2] = ncount; ltypes->ptr.p_int[*lastproc+2] = 0; lconnfirst->ptr.p_int[*lastproc+2] = *lastproc; lconnlast->ptr.p_int[*lastproc+2] = *lastproc+1; *lastproc = *lastproc+2; } /************************************************************************* Internal subroutine: adding new summator layer to network *************************************************************************/ static void mlpbase_addactivationlayer(ae_int_t functype, /* Integer */ ae_vector* lsizes, /* Integer */ ae_vector* ltypes, /* Integer */ ae_vector* lconnfirst, /* Integer */ ae_vector* lconnlast, ae_int_t* lastproc, ae_state *_state) { ae_assert(functype>0||functype==-5, "AddActivationLayer: incorrect function type", _state); lsizes->ptr.p_int[*lastproc+1] = lsizes->ptr.p_int[*lastproc]; ltypes->ptr.p_int[*lastproc+1] = functype; lconnfirst->ptr.p_int[*lastproc+1] = *lastproc; lconnlast->ptr.p_int[*lastproc+1] = *lastproc; *lastproc = *lastproc+1; } /************************************************************************* Internal subroutine: adding new zero layer to network *************************************************************************/ static void mlpbase_addzerolayer(/* Integer */ ae_vector* lsizes, /* Integer */ ae_vector* ltypes, /* Integer */ ae_vector* lconnfirst, /* Integer */ ae_vector* lconnlast, ae_int_t* lastproc, ae_state *_state) { lsizes->ptr.p_int[*lastproc+1] = 1; ltypes->ptr.p_int[*lastproc+1] = -4; lconnfirst->ptr.p_int[*lastproc+1] = 0; lconnlast->ptr.p_int[*lastproc+1] = 0; *lastproc = *lastproc+1; } /************************************************************************* This routine adds input layer to the high-level description of the network. It modifies Network.HLConnections and Network.HLNeurons and assumes that these arrays have enough place to store data. It accepts following parameters: Network - network ConnIdx - index of the first free entry in the HLConnections NeuroIdx - index of the first free entry in the HLNeurons StructInfoIdx- index of the first entry in the low level description of the current layer (in the StructInfo array) NIn - number of inputs It modified Network and indices. 
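NOTE (added for clarity, inferred from the code below): each HLNeurons entry
occupies HLNFieldWidth consecutive integers storing, in order, the layer index,
the neuron index within its layer, an index into the low-level StructInfo
description (or -1 when not applicable) and the index of the neuron's first
weight in the Weights array (or -1 for weightless neurons).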
*************************************************************************/ static void mlpbase_hladdinputlayer(multilayerperceptron* network, ae_int_t* connidx, ae_int_t* neuroidx, ae_int_t* structinfoidx, ae_int_t nin, ae_state *_state) { ae_int_t i; ae_int_t offs; offs = mlpbase_hlnfieldwidth*(*neuroidx); for(i=0; i<=nin-1; i++) { network->hlneurons.ptr.p_int[offs+0] = 0; network->hlneurons.ptr.p_int[offs+1] = i; network->hlneurons.ptr.p_int[offs+2] = -1; network->hlneurons.ptr.p_int[offs+3] = -1; offs = offs+mlpbase_hlnfieldwidth; } *neuroidx = *neuroidx+nin; *structinfoidx = *structinfoidx+nin; } /************************************************************************* This routine adds output layer to the high-level description of the network. It modifies Network.HLConnections and Network.HLNeurons and assumes that these arrays have enough place to store data. It accepts following parameters: Network - network ConnIdx - index of the first free entry in the HLConnections NeuroIdx - index of the first free entry in the HLNeurons StructInfoIdx- index of the first entry in the low level description of the current layer (in the StructInfo array) WeightsIdx - index of the first entry in the Weights array which corresponds to the current layer K - current layer index NPrev - number of neurons in the previous layer NOut - number of outputs IsCls - is it classifier network? IsLinear - is it network with linear output? It modified Network and ConnIdx/NeuroIdx/StructInfoIdx/WeightsIdx. *************************************************************************/ static void mlpbase_hladdoutputlayer(multilayerperceptron* network, ae_int_t* connidx, ae_int_t* neuroidx, ae_int_t* structinfoidx, ae_int_t* weightsidx, ae_int_t k, ae_int_t nprev, ae_int_t nout, ae_bool iscls, ae_bool islinearout, ae_state *_state) { ae_int_t i; ae_int_t j; ae_int_t neurooffs; ae_int_t connoffs; ae_assert((iscls&&islinearout)||!iscls, "HLAddOutputLayer: internal error", _state); neurooffs = mlpbase_hlnfieldwidth*(*neuroidx); connoffs = mlpbase_hlconnfieldwidth*(*connidx); if( !iscls ) { /* * Regression network */ for(i=0; i<=nout-1; i++) { network->hlneurons.ptr.p_int[neurooffs+0] = k; network->hlneurons.ptr.p_int[neurooffs+1] = i; network->hlneurons.ptr.p_int[neurooffs+2] = *structinfoidx+1+nout+i; network->hlneurons.ptr.p_int[neurooffs+3] = *weightsidx+nprev+(nprev+1)*i; neurooffs = neurooffs+mlpbase_hlnfieldwidth; } for(i=0; i<=nprev-1; i++) { for(j=0; j<=nout-1; j++) { network->hlconnections.ptr.p_int[connoffs+0] = k-1; network->hlconnections.ptr.p_int[connoffs+1] = i; network->hlconnections.ptr.p_int[connoffs+2] = k; network->hlconnections.ptr.p_int[connoffs+3] = j; network->hlconnections.ptr.p_int[connoffs+4] = *weightsidx+i+j*(nprev+1); connoffs = connoffs+mlpbase_hlconnfieldwidth; } } *connidx = *connidx+nprev*nout; *neuroidx = *neuroidx+nout; *structinfoidx = *structinfoidx+2*nout+1; *weightsidx = *weightsidx+nout*(nprev+1); } else { /* * Classification network */ for(i=0; i<=nout-2; i++) { network->hlneurons.ptr.p_int[neurooffs+0] = k; network->hlneurons.ptr.p_int[neurooffs+1] = i; network->hlneurons.ptr.p_int[neurooffs+2] = -1; network->hlneurons.ptr.p_int[neurooffs+3] = *weightsidx+nprev+(nprev+1)*i; neurooffs = neurooffs+mlpbase_hlnfieldwidth; } network->hlneurons.ptr.p_int[neurooffs+0] = k; network->hlneurons.ptr.p_int[neurooffs+1] = i; network->hlneurons.ptr.p_int[neurooffs+2] = -1; network->hlneurons.ptr.p_int[neurooffs+3] = -1; for(i=0; i<=nprev-1; i++) { for(j=0; j<=nout-2; j++) { 
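                /*
                 * Note (added for clarity): each HLConnections record occupies
                 * HLConnFieldWidth consecutive integers - source layer, source
                 * neuron index, destination layer, destination neuron index and
                 * the index of the corresponding weight in the Weights array.
                 */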
network->hlconnections.ptr.p_int[connoffs+0] = k-1; network->hlconnections.ptr.p_int[connoffs+1] = i; network->hlconnections.ptr.p_int[connoffs+2] = k; network->hlconnections.ptr.p_int[connoffs+3] = j; network->hlconnections.ptr.p_int[connoffs+4] = *weightsidx+i+j*(nprev+1); connoffs = connoffs+mlpbase_hlconnfieldwidth; } } *connidx = *connidx+nprev*(nout-1); *neuroidx = *neuroidx+nout; *structinfoidx = *structinfoidx+nout+2; *weightsidx = *weightsidx+(nout-1)*(nprev+1); } } /************************************************************************* This routine adds hidden layer to the high-level description of the network. It modifies Network.HLConnections and Network.HLNeurons and assumes that these arrays have enough place to store data. It accepts following parameters: Network - network ConnIdx - index of the first free entry in the HLConnections NeuroIdx - index of the first free entry in the HLNeurons StructInfoIdx- index of the first entry in the low level description of the current layer (in the StructInfo array) WeightsIdx - index of the first entry in the Weights array which corresponds to the current layer K - current layer index NPrev - number of neurons in the previous layer NCur - number of neurons in the current layer It modified Network and ConnIdx/NeuroIdx/StructInfoIdx/WeightsIdx. *************************************************************************/ static void mlpbase_hladdhiddenlayer(multilayerperceptron* network, ae_int_t* connidx, ae_int_t* neuroidx, ae_int_t* structinfoidx, ae_int_t* weightsidx, ae_int_t k, ae_int_t nprev, ae_int_t ncur, ae_state *_state) { ae_int_t i; ae_int_t j; ae_int_t neurooffs; ae_int_t connoffs; neurooffs = mlpbase_hlnfieldwidth*(*neuroidx); connoffs = mlpbase_hlconnfieldwidth*(*connidx); for(i=0; i<=ncur-1; i++) { network->hlneurons.ptr.p_int[neurooffs+0] = k; network->hlneurons.ptr.p_int[neurooffs+1] = i; network->hlneurons.ptr.p_int[neurooffs+2] = *structinfoidx+1+ncur+i; network->hlneurons.ptr.p_int[neurooffs+3] = *weightsidx+nprev+(nprev+1)*i; neurooffs = neurooffs+mlpbase_hlnfieldwidth; } for(i=0; i<=nprev-1; i++) { for(j=0; j<=ncur-1; j++) { network->hlconnections.ptr.p_int[connoffs+0] = k-1; network->hlconnections.ptr.p_int[connoffs+1] = i; network->hlconnections.ptr.p_int[connoffs+2] = k; network->hlconnections.ptr.p_int[connoffs+3] = j; network->hlconnections.ptr.p_int[connoffs+4] = *weightsidx+i+j*(nprev+1); connoffs = connoffs+mlpbase_hlconnfieldwidth; } } *connidx = *connidx+nprev*ncur; *neuroidx = *neuroidx+ncur; *structinfoidx = *structinfoidx+2*ncur+1; *weightsidx = *weightsidx+ncur*(nprev+1); } /************************************************************************* This function fills high level information about network created using internal MLPCreate() function. This function does NOT examine StructInfo for low level information, it just expects that network has following structure: input neuron \ ... | input layer input neuron / "-1" neuron \ biased summator | ... | biased summator | hidden layer(s), if there are exists any activation function | ... | activation function / "-1" neuron \ biased summator | output layer: ... | biased summator | * we have NOut summators/activators for regression networks activation function | * we have only NOut-1 summators and no activators for classifiers ... 
| * we have "0" neuron only when we have classifier activation function | "0" neuron / -- ALGLIB -- Copyright 30.03.2008 by Bochkanov Sergey *************************************************************************/ static void mlpbase_fillhighlevelinformation(multilayerperceptron* network, ae_int_t nin, ae_int_t nhid1, ae_int_t nhid2, ae_int_t nout, ae_bool iscls, ae_bool islinearout, ae_state *_state) { ae_int_t idxweights; ae_int_t idxstruct; ae_int_t idxneuro; ae_int_t idxconn; ae_assert((iscls&&islinearout)||!iscls, "FillHighLevelInformation: internal error", _state); /* * Preparations common to all types of networks */ idxweights = 0; idxneuro = 0; idxstruct = 0; idxconn = 0; network->hlnetworktype = 0; /* * network without hidden layers */ if( nhid1==0 ) { ae_vector_set_length(&network->hllayersizes, 2, _state); network->hllayersizes.ptr.p_int[0] = nin; network->hllayersizes.ptr.p_int[1] = nout; if( !iscls ) { ae_vector_set_length(&network->hlconnections, mlpbase_hlconnfieldwidth*nin*nout, _state); ae_vector_set_length(&network->hlneurons, mlpbase_hlnfieldwidth*(nin+nout), _state); network->hlnormtype = 0; } else { ae_vector_set_length(&network->hlconnections, mlpbase_hlconnfieldwidth*nin*(nout-1), _state); ae_vector_set_length(&network->hlneurons, mlpbase_hlnfieldwidth*(nin+nout), _state); network->hlnormtype = 1; } mlpbase_hladdinputlayer(network, &idxconn, &idxneuro, &idxstruct, nin, _state); mlpbase_hladdoutputlayer(network, &idxconn, &idxneuro, &idxstruct, &idxweights, 1, nin, nout, iscls, islinearout, _state); return; } /* * network with one hidden layers */ if( nhid2==0 ) { ae_vector_set_length(&network->hllayersizes, 3, _state); network->hllayersizes.ptr.p_int[0] = nin; network->hllayersizes.ptr.p_int[1] = nhid1; network->hllayersizes.ptr.p_int[2] = nout; if( !iscls ) { ae_vector_set_length(&network->hlconnections, mlpbase_hlconnfieldwidth*(nin*nhid1+nhid1*nout), _state); ae_vector_set_length(&network->hlneurons, mlpbase_hlnfieldwidth*(nin+nhid1+nout), _state); network->hlnormtype = 0; } else { ae_vector_set_length(&network->hlconnections, mlpbase_hlconnfieldwidth*(nin*nhid1+nhid1*(nout-1)), _state); ae_vector_set_length(&network->hlneurons, mlpbase_hlnfieldwidth*(nin+nhid1+nout), _state); network->hlnormtype = 1; } mlpbase_hladdinputlayer(network, &idxconn, &idxneuro, &idxstruct, nin, _state); mlpbase_hladdhiddenlayer(network, &idxconn, &idxneuro, &idxstruct, &idxweights, 1, nin, nhid1, _state); mlpbase_hladdoutputlayer(network, &idxconn, &idxneuro, &idxstruct, &idxweights, 2, nhid1, nout, iscls, islinearout, _state); return; } /* * Two hidden layers */ ae_vector_set_length(&network->hllayersizes, 4, _state); network->hllayersizes.ptr.p_int[0] = nin; network->hllayersizes.ptr.p_int[1] = nhid1; network->hllayersizes.ptr.p_int[2] = nhid2; network->hllayersizes.ptr.p_int[3] = nout; if( !iscls ) { ae_vector_set_length(&network->hlconnections, mlpbase_hlconnfieldwidth*(nin*nhid1+nhid1*nhid2+nhid2*nout), _state); ae_vector_set_length(&network->hlneurons, mlpbase_hlnfieldwidth*(nin+nhid1+nhid2+nout), _state); network->hlnormtype = 0; } else { ae_vector_set_length(&network->hlconnections, mlpbase_hlconnfieldwidth*(nin*nhid1+nhid1*nhid2+nhid2*(nout-1)), _state); ae_vector_set_length(&network->hlneurons, mlpbase_hlnfieldwidth*(nin+nhid1+nhid2+nout), _state); network->hlnormtype = 1; } mlpbase_hladdinputlayer(network, &idxconn, &idxneuro, &idxstruct, nin, _state); mlpbase_hladdhiddenlayer(network, &idxconn, &idxneuro, &idxstruct, &idxweights, 1, nin, nhid1, _state); 
mlpbase_hladdhiddenlayer(network, &idxconn, &idxneuro, &idxstruct, &idxweights, 2, nhid1, nhid2, _state); mlpbase_hladdoutputlayer(network, &idxconn, &idxneuro, &idxstruct, &idxweights, 3, nhid2, nout, iscls, islinearout, _state); } /************************************************************************* Internal subroutine. -- ALGLIB -- Copyright 04.11.2007 by Bochkanov Sergey *************************************************************************/ static void mlpbase_mlpcreate(ae_int_t nin, ae_int_t nout, /* Integer */ ae_vector* lsizes, /* Integer */ ae_vector* ltypes, /* Integer */ ae_vector* lconnfirst, /* Integer */ ae_vector* lconnlast, ae_int_t layerscount, ae_bool isclsnet, multilayerperceptron* network, ae_state *_state) { ae_frame _frame_block; ae_int_t i; ae_int_t j; ae_int_t ssize; ae_int_t ntotal; ae_int_t wcount; ae_int_t offs; ae_int_t nprocessed; ae_int_t wallocated; ae_vector localtemp; ae_vector lnfirst; ae_vector lnsyn; mlpbuffers buf; smlpgrad sgrad; ae_frame_make(_state, &_frame_block); _multilayerperceptron_clear(network); ae_vector_init(&localtemp, 0, DT_INT, _state); ae_vector_init(&lnfirst, 0, DT_INT, _state); ae_vector_init(&lnsyn, 0, DT_INT, _state); _mlpbuffers_init(&buf, _state); _smlpgrad_init(&sgrad, _state); /* * Check */ ae_assert(layerscount>0, "MLPCreate: wrong parameters!", _state); ae_assert(ltypes->ptr.p_int[0]==-2, "MLPCreate: wrong LTypes[0] (must be -2)!", _state); for(i=0; i<=layerscount-1; i++) { ae_assert(lsizes->ptr.p_int[i]>0, "MLPCreate: wrong LSizes!", _state); ae_assert(lconnfirst->ptr.p_int[i]>=0&&(lconnfirst->ptr.p_int[i]ptr.p_int[i]>=lconnfirst->ptr.p_int[i]&&(lconnlast->ptr.p_int[i]ptr.p_int[i]>=0||ltypes->ptr.p_int[i]==-5 ) { lnsyn.ptr.p_int[i] = 0; for(j=lconnfirst->ptr.p_int[i]; j<=lconnlast->ptr.p_int[i]; j++) { lnsyn.ptr.p_int[i] = lnsyn.ptr.p_int[i]+lsizes->ptr.p_int[j]; } } else { if( (ltypes->ptr.p_int[i]==-2||ltypes->ptr.p_int[i]==-3)||ltypes->ptr.p_int[i]==-4 ) { lnsyn.ptr.p_int[i] = 0; } } ae_assert(lnsyn.ptr.p_int[i]>=0, "MLPCreate: internal error #0!", _state); /* * Other info */ lnfirst.ptr.p_int[i] = ntotal; ntotal = ntotal+lsizes->ptr.p_int[i]; if( ltypes->ptr.p_int[i]==0 ) { wcount = wcount+lnsyn.ptr.p_int[i]*lsizes->ptr.p_int[i]; } } ssize = 7+ntotal*mlpbase_nfieldwidth; /* * Allocate */ ae_vector_set_length(&network->structinfo, ssize-1+1, _state); ae_vector_set_length(&network->weights, wcount-1+1, _state); if( isclsnet ) { ae_vector_set_length(&network->columnmeans, nin-1+1, _state); ae_vector_set_length(&network->columnsigmas, nin-1+1, _state); } else { ae_vector_set_length(&network->columnmeans, nin+nout-1+1, _state); ae_vector_set_length(&network->columnsigmas, nin+nout-1+1, _state); } ae_vector_set_length(&network->neurons, ntotal-1+1, _state); ae_vector_set_length(&network->nwbuf, ae_maxint(wcount, 2*nout, _state)-1+1, _state); ae_vector_set_length(&network->integerbuf, 3+1, _state); ae_vector_set_length(&network->dfdnet, ntotal-1+1, _state); ae_vector_set_length(&network->x, nin-1+1, _state); ae_vector_set_length(&network->y, nout-1+1, _state); ae_vector_set_length(&network->derror, ntotal-1+1, _state); /* * Fill structure: global info */ network->structinfo.ptr.p_int[0] = ssize; network->structinfo.ptr.p_int[1] = nin; network->structinfo.ptr.p_int[2] = nout; network->structinfo.ptr.p_int[3] = ntotal; network->structinfo.ptr.p_int[4] = wcount; network->structinfo.ptr.p_int[5] = 7; if( isclsnet ) { network->structinfo.ptr.p_int[6] = 1; } else { network->structinfo.ptr.p_int[6] = 0; } /* * Fill structure: 
neuron connections */ nprocessed = 0; wallocated = 0; for(i=0; i<=layerscount-1; i++) { for(j=0; j<=lsizes->ptr.p_int[i]-1; j++) { offs = network->structinfo.ptr.p_int[5]+nprocessed*mlpbase_nfieldwidth; network->structinfo.ptr.p_int[offs+0] = ltypes->ptr.p_int[i]; if( ltypes->ptr.p_int[i]==0 ) { /* * Adaptive summator: * * connections with weights to previous neurons */ network->structinfo.ptr.p_int[offs+1] = lnsyn.ptr.p_int[i]; network->structinfo.ptr.p_int[offs+2] = lnfirst.ptr.p_int[lconnfirst->ptr.p_int[i]]; network->structinfo.ptr.p_int[offs+3] = wallocated; wallocated = wallocated+lnsyn.ptr.p_int[i]; nprocessed = nprocessed+1; } if( ltypes->ptr.p_int[i]>0||ltypes->ptr.p_int[i]==-5 ) { /* * Activation layer: * * each neuron connected to one (only one) of previous neurons. * * no weights */ network->structinfo.ptr.p_int[offs+1] = 1; network->structinfo.ptr.p_int[offs+2] = lnfirst.ptr.p_int[lconnfirst->ptr.p_int[i]]+j; network->structinfo.ptr.p_int[offs+3] = -1; nprocessed = nprocessed+1; } if( (ltypes->ptr.p_int[i]==-2||ltypes->ptr.p_int[i]==-3)||ltypes->ptr.p_int[i]==-4 ) { nprocessed = nprocessed+1; } } } ae_assert(wallocated==wcount, "MLPCreate: internal error #1!", _state); ae_assert(nprocessed==ntotal, "MLPCreate: internal error #2!", _state); /* * Fill weights by small random values * Initialize means and sigmas */ for(i=0; i<=nin-1; i++) { network->columnmeans.ptr.p_double[i] = (double)(0); network->columnsigmas.ptr.p_double[i] = (double)(1); } if( !isclsnet ) { for(i=0; i<=nout-1; i++) { network->columnmeans.ptr.p_double[nin+i] = (double)(0); network->columnsigmas.ptr.p_double[nin+i] = (double)(1); } } mlprandomize(network, _state); /* * Seed buffers */ ae_shared_pool_set_seed(&network->buf, &buf, sizeof(buf), _mlpbuffers_init, _mlpbuffers_init_copy, _mlpbuffers_destroy, _state); ae_vector_set_length(&sgrad.g, wcount, _state); sgrad.f = 0.0; for(i=0; i<=wcount-1; i++) { sgrad.g.ptr.p_double[i] = 0.0; } ae_shared_pool_set_seed(&network->gradbuf, &sgrad, sizeof(sgrad), _smlpgrad_init, _smlpgrad_init_copy, _smlpgrad_destroy, _state); ae_frame_leave(_state); } /************************************************************************* Internal subroutine for Hessian calculation. WARNING! 
Unspeakable math far beyong human capabilities :) *************************************************************************/ static void mlpbase_mlphessianbatchinternal(multilayerperceptron* network, /* Real */ ae_matrix* xy, ae_int_t ssize, ae_bool naturalerr, double* e, /* Real */ ae_vector* grad, /* Real */ ae_matrix* h, ae_state *_state) { ae_frame _frame_block; ae_int_t nin; ae_int_t nout; ae_int_t wcount; ae_int_t ntotal; ae_int_t istart; ae_int_t i; ae_int_t j; ae_int_t k; ae_int_t kl; ae_int_t offs; ae_int_t n1; ae_int_t n2; ae_int_t w1; ae_int_t w2; double s; double t; double v; double et; ae_bool bflag; double f; double df; double d2f; double deidyj; double mx; double q; double z; double s2; double expi; double expj; ae_vector x; ae_vector desiredy; ae_vector gt; ae_vector zeros; ae_matrix rx; ae_matrix ry; ae_matrix rdx; ae_matrix rdy; ae_frame_make(_state, &_frame_block); *e = 0; ae_vector_init(&x, 0, DT_REAL, _state); ae_vector_init(&desiredy, 0, DT_REAL, _state); ae_vector_init(>, 0, DT_REAL, _state); ae_vector_init(&zeros, 0, DT_REAL, _state); ae_matrix_init(&rx, 0, 0, DT_REAL, _state); ae_matrix_init(&ry, 0, 0, DT_REAL, _state); ae_matrix_init(&rdx, 0, 0, DT_REAL, _state); ae_matrix_init(&rdy, 0, 0, DT_REAL, _state); mlpproperties(network, &nin, &nout, &wcount, _state); ntotal = network->structinfo.ptr.p_int[3]; istart = network->structinfo.ptr.p_int[5]; /* * Prepare */ ae_vector_set_length(&x, nin-1+1, _state); ae_vector_set_length(&desiredy, nout-1+1, _state); ae_vector_set_length(&zeros, wcount-1+1, _state); ae_vector_set_length(>, wcount-1+1, _state); ae_matrix_set_length(&rx, ntotal+nout-1+1, wcount-1+1, _state); ae_matrix_set_length(&ry, ntotal+nout-1+1, wcount-1+1, _state); ae_matrix_set_length(&rdx, ntotal+nout-1+1, wcount-1+1, _state); ae_matrix_set_length(&rdy, ntotal+nout-1+1, wcount-1+1, _state); *e = (double)(0); for(i=0; i<=wcount-1; i++) { zeros.ptr.p_double[i] = (double)(0); } ae_v_move(&grad->ptr.p_double[0], 1, &zeros.ptr.p_double[0], 1, ae_v_len(0,wcount-1)); for(i=0; i<=wcount-1; i++) { ae_v_move(&h->ptr.pp_double[i][0], 1, &zeros.ptr.p_double[0], 1, ae_v_len(0,wcount-1)); } /* * Process */ for(k=0; k<=ssize-1; k++) { /* * Process vector with MLPGradN. * Now Neurons, DFDNET and DError contains results of the last run. */ ae_v_move(&x.ptr.p_double[0], 1, &xy->ptr.pp_double[k][0], 1, ae_v_len(0,nin-1)); if( mlpissoftmax(network, _state) ) { /* * class labels outputs */ kl = ae_round(xy->ptr.pp_double[k][nin], _state); for(i=0; i<=nout-1; i++) { if( i==kl ) { desiredy.ptr.p_double[i] = (double)(1); } else { desiredy.ptr.p_double[i] = (double)(0); } } } else { /* * real outputs */ ae_v_move(&desiredy.ptr.p_double[0], 1, &xy->ptr.pp_double[k][nin], 1, ae_v_len(0,nout-1)); } if( naturalerr ) { mlpgradn(network, &x, &desiredy, &et, >, _state); } else { mlpgrad(network, &x, &desiredy, &et, >, _state); } /* * grad, error */ *e = *e+et; ae_v_add(&grad->ptr.p_double[0], 1, >.ptr.p_double[0], 1, ae_v_len(0,wcount-1)); /* * Hessian. 
* Forward pass of the R-algorithm */ for(i=0; i<=ntotal-1; i++) { offs = istart+i*mlpbase_nfieldwidth; ae_v_move(&rx.ptr.pp_double[i][0], 1, &zeros.ptr.p_double[0], 1, ae_v_len(0,wcount-1)); ae_v_move(&ry.ptr.pp_double[i][0], 1, &zeros.ptr.p_double[0], 1, ae_v_len(0,wcount-1)); if( network->structinfo.ptr.p_int[offs+0]>0||network->structinfo.ptr.p_int[offs+0]==-5 ) { /* * Activation function */ n1 = network->structinfo.ptr.p_int[offs+2]; ae_v_move(&rx.ptr.pp_double[i][0], 1, &ry.ptr.pp_double[n1][0], 1, ae_v_len(0,wcount-1)); v = network->dfdnet.ptr.p_double[i]; ae_v_moved(&ry.ptr.pp_double[i][0], 1, &rx.ptr.pp_double[i][0], 1, ae_v_len(0,wcount-1), v); continue; } if( network->structinfo.ptr.p_int[offs+0]==0 ) { /* * Adaptive summator */ n1 = network->structinfo.ptr.p_int[offs+2]; n2 = n1+network->structinfo.ptr.p_int[offs+1]-1; w1 = network->structinfo.ptr.p_int[offs+3]; w2 = w1+network->structinfo.ptr.p_int[offs+1]-1; for(j=n1; j<=n2; j++) { v = network->weights.ptr.p_double[w1+j-n1]; ae_v_addd(&rx.ptr.pp_double[i][0], 1, &ry.ptr.pp_double[j][0], 1, ae_v_len(0,wcount-1), v); rx.ptr.pp_double[i][w1+j-n1] = rx.ptr.pp_double[i][w1+j-n1]+network->neurons.ptr.p_double[j]; } ae_v_move(&ry.ptr.pp_double[i][0], 1, &rx.ptr.pp_double[i][0], 1, ae_v_len(0,wcount-1)); continue; } if( network->structinfo.ptr.p_int[offs+0]<0 ) { bflag = ae_true; if( network->structinfo.ptr.p_int[offs+0]==-2 ) { /* * input neuron, left unchanged */ bflag = ae_false; } if( network->structinfo.ptr.p_int[offs+0]==-3 ) { /* * "-1" neuron, left unchanged */ bflag = ae_false; } if( network->structinfo.ptr.p_int[offs+0]==-4 ) { /* * "0" neuron, left unchanged */ bflag = ae_false; } ae_assert(!bflag, "MLPHessianNBatch: internal error - unknown neuron type!", _state); continue; } } /* * Hessian. Backward pass of the R-algorithm. * * Stage 1. Initialize RDY */ for(i=0; i<=ntotal+nout-1; i++) { ae_v_move(&rdy.ptr.pp_double[i][0], 1, &zeros.ptr.p_double[0], 1, ae_v_len(0,wcount-1)); } if( network->structinfo.ptr.p_int[6]==0 ) { /* * Standardisation. * * In context of the Hessian calculation standardisation * is considered as additional layer with weightless * activation function: * * F(NET) := Sigma*NET * * So we add one more layer to forward pass, and * make forward/backward pass through this layer. */ for(i=0; i<=nout-1; i++) { n1 = ntotal-nout+i; n2 = ntotal+i; /* * Forward pass from N1 to N2 */ ae_v_move(&rx.ptr.pp_double[n2][0], 1, &ry.ptr.pp_double[n1][0], 1, ae_v_len(0,wcount-1)); v = network->columnsigmas.ptr.p_double[nin+i]; ae_v_moved(&ry.ptr.pp_double[n2][0], 1, &rx.ptr.pp_double[n2][0], 1, ae_v_len(0,wcount-1), v); /* * Initialization of RDY */ ae_v_move(&rdy.ptr.pp_double[n2][0], 1, &ry.ptr.pp_double[n2][0], 1, ae_v_len(0,wcount-1)); /* * Backward pass from N2 to N1: * 1. Calculate R(dE/dX). * 2. No R(dE/dWij) is needed since weight of activation neuron * is fixed to 1. So we can update R(dE/dY) for * the connected neuron (note that Vij=0, Wij=1) */ df = network->columnsigmas.ptr.p_double[nin+i]; ae_v_moved(&rdx.ptr.pp_double[n2][0], 1, &rdy.ptr.pp_double[n2][0], 1, ae_v_len(0,wcount-1), df); ae_v_add(&rdy.ptr.pp_double[n1][0], 1, &rdx.ptr.pp_double[n2][0], 1, ae_v_len(0,wcount-1)); } } else { /* * Softmax. * * Initialize RDY using generalized expression for ei'(yi) * (see expression (9) from p. 5 of "Fast Exact Multiplication by the Hessian"). 
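 *
 * (Reference note, added for clarity: the cited paper is B. A. Pearlmutter,
 * "Fast Exact Multiplication by the Hessian", Neural Computation 6(1), 1994;
 * the forward/backward "R-algorithm" used throughout this routine is the
 * R-operator technique described there.)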
* * When we are working with softmax network, generalized * expression for ei'(yi) is used because softmax * normalization leads to ei, which depends on all y's */ if( naturalerr ) { /* * softmax + cross-entropy. * We have: * * S = sum(exp(yk)), * ei = sum(trn)*exp(yi)/S-trn_i * * j=i: d(ei)/d(yj) = T*exp(yi)*(S-exp(yi))/S^2 * j<>i: d(ei)/d(yj) = -T*exp(yi)*exp(yj)/S^2 */ t = (double)(0); for(i=0; i<=nout-1; i++) { t = t+desiredy.ptr.p_double[i]; } mx = network->neurons.ptr.p_double[ntotal-nout]; for(i=0; i<=nout-1; i++) { mx = ae_maxreal(mx, network->neurons.ptr.p_double[ntotal-nout+i], _state); } s = (double)(0); for(i=0; i<=nout-1; i++) { network->nwbuf.ptr.p_double[i] = ae_exp(network->neurons.ptr.p_double[ntotal-nout+i]-mx, _state); s = s+network->nwbuf.ptr.p_double[i]; } for(i=0; i<=nout-1; i++) { for(j=0; j<=nout-1; j++) { if( j==i ) { deidyj = t*network->nwbuf.ptr.p_double[i]*(s-network->nwbuf.ptr.p_double[i])/ae_sqr(s, _state); ae_v_addd(&rdy.ptr.pp_double[ntotal-nout+i][0], 1, &ry.ptr.pp_double[ntotal-nout+i][0], 1, ae_v_len(0,wcount-1), deidyj); } else { deidyj = -t*network->nwbuf.ptr.p_double[i]*network->nwbuf.ptr.p_double[j]/ae_sqr(s, _state); ae_v_addd(&rdy.ptr.pp_double[ntotal-nout+i][0], 1, &ry.ptr.pp_double[ntotal-nout+j][0], 1, ae_v_len(0,wcount-1), deidyj); } } } } else { /* * For a softmax + squared error we have expression * far beyond human imagination so we dont even try * to comment on it. Just enjoy the code... * * P.S. That's why "natural error" is called "natural" - * compact beatiful expressions, fast code.... */ mx = network->neurons.ptr.p_double[ntotal-nout]; for(i=0; i<=nout-1; i++) { mx = ae_maxreal(mx, network->neurons.ptr.p_double[ntotal-nout+i], _state); } s = (double)(0); s2 = (double)(0); for(i=0; i<=nout-1; i++) { network->nwbuf.ptr.p_double[i] = ae_exp(network->neurons.ptr.p_double[ntotal-nout+i]-mx, _state); s = s+network->nwbuf.ptr.p_double[i]; s2 = s2+ae_sqr(network->nwbuf.ptr.p_double[i], _state); } q = (double)(0); for(i=0; i<=nout-1; i++) { q = q+(network->y.ptr.p_double[i]-desiredy.ptr.p_double[i])*network->nwbuf.ptr.p_double[i]; } for(i=0; i<=nout-1; i++) { z = -q+(network->y.ptr.p_double[i]-desiredy.ptr.p_double[i])*s; expi = network->nwbuf.ptr.p_double[i]; for(j=0; j<=nout-1; j++) { expj = network->nwbuf.ptr.p_double[j]; if( j==i ) { deidyj = expi/ae_sqr(s, _state)*((z+expi)*(s-2*expi)/s+expi*s2/ae_sqr(s, _state)); } else { deidyj = expi*expj/ae_sqr(s, _state)*(s2/ae_sqr(s, _state)-2*z/s-(expi+expj)/s+(network->y.ptr.p_double[i]-desiredy.ptr.p_double[i])-(network->y.ptr.p_double[j]-desiredy.ptr.p_double[j])); } ae_v_addd(&rdy.ptr.pp_double[ntotal-nout+i][0], 1, &ry.ptr.pp_double[ntotal-nout+j][0], 1, ae_v_len(0,wcount-1), deidyj); } } } } /* * Hessian. Backward pass of the R-algorithm * * Stage 2. Process. */ for(i=ntotal-1; i>=0; i--) { /* * Possible variants: * 1. Activation function * 2. Adaptive summator * 3. Special neuron */ offs = istart+i*mlpbase_nfieldwidth; if( network->structinfo.ptr.p_int[offs+0]>0||network->structinfo.ptr.p_int[offs+0]==-5 ) { n1 = network->structinfo.ptr.p_int[offs+2]; /* * First, calculate R(dE/dX). 
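         *
         * (Notation note, added for clarity: for every neuron I the rows RX[I],
         * RY[I], RDX[I] and RDY[I] hold the per-weight derivatives of,
         * respectively, the neuron's pre-activation value X, its output Y,
         * dE/dX and dE/dY, as built up by the forward pass above; accumulating
         * RDX and RY into H below is what produces the Hessian.)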
*/ mlpactivationfunction(network->neurons.ptr.p_double[n1], network->structinfo.ptr.p_int[offs+0], &f, &df, &d2f, _state); v = d2f*network->derror.ptr.p_double[i]; ae_v_moved(&rdx.ptr.pp_double[i][0], 1, &rdy.ptr.pp_double[i][0], 1, ae_v_len(0,wcount-1), df); ae_v_addd(&rdx.ptr.pp_double[i][0], 1, &rx.ptr.pp_double[i][0], 1, ae_v_len(0,wcount-1), v); /* * No R(dE/dWij) is needed since weight of activation neuron * is fixed to 1. * * So we can update R(dE/dY) for the connected neuron. * (note that Vij=0, Wij=1) */ ae_v_add(&rdy.ptr.pp_double[n1][0], 1, &rdx.ptr.pp_double[i][0], 1, ae_v_len(0,wcount-1)); continue; } if( network->structinfo.ptr.p_int[offs+0]==0 ) { /* * Adaptive summator */ n1 = network->structinfo.ptr.p_int[offs+2]; n2 = n1+network->structinfo.ptr.p_int[offs+1]-1; w1 = network->structinfo.ptr.p_int[offs+3]; w2 = w1+network->structinfo.ptr.p_int[offs+1]-1; /* * First, calculate R(dE/dX). */ ae_v_move(&rdx.ptr.pp_double[i][0], 1, &rdy.ptr.pp_double[i][0], 1, ae_v_len(0,wcount-1)); /* * Then, calculate R(dE/dWij) */ for(j=w1; j<=w2; j++) { v = network->neurons.ptr.p_double[n1+j-w1]; ae_v_addd(&h->ptr.pp_double[j][0], 1, &rdx.ptr.pp_double[i][0], 1, ae_v_len(0,wcount-1), v); v = network->derror.ptr.p_double[i]; ae_v_addd(&h->ptr.pp_double[j][0], 1, &ry.ptr.pp_double[n1+j-w1][0], 1, ae_v_len(0,wcount-1), v); } /* * And finally, update R(dE/dY) for connected neurons. */ for(j=w1; j<=w2; j++) { v = network->weights.ptr.p_double[j]; ae_v_addd(&rdy.ptr.pp_double[n1+j-w1][0], 1, &rdx.ptr.pp_double[i][0], 1, ae_v_len(0,wcount-1), v); rdy.ptr.pp_double[n1+j-w1][j] = rdy.ptr.pp_double[n1+j-w1][j]+network->derror.ptr.p_double[i]; } continue; } if( network->structinfo.ptr.p_int[offs+0]<0 ) { bflag = ae_false; if( (network->structinfo.ptr.p_int[offs+0]==-2||network->structinfo.ptr.p_int[offs+0]==-3)||network->structinfo.ptr.p_int[offs+0]==-4 ) { /* * Special neuron type, no back-propagation required */ bflag = ae_true; } ae_assert(bflag, "MLPHessianNBatch: unknown neuron type!", _state); continue; } } } ae_frame_leave(_state); } /************************************************************************* Internal subroutine Network must be processed by MLPProcess on X *************************************************************************/ static void mlpbase_mlpinternalcalculategradient(multilayerperceptron* network, /* Real */ ae_vector* neurons, /* Real */ ae_vector* weights, /* Real */ ae_vector* derror, /* Real */ ae_vector* grad, ae_bool naturalerrorfunc, ae_state *_state) { ae_int_t i; ae_int_t n1; ae_int_t n2; ae_int_t w1; ae_int_t w2; ae_int_t ntotal; ae_int_t istart; ae_int_t nin; ae_int_t nout; ae_int_t offs; double dedf; double dfdnet; double v; double fown; double deown; double net; double mx; ae_bool bflag; /* * Read network geometry */ nin = network->structinfo.ptr.p_int[1]; nout = network->structinfo.ptr.p_int[2]; ntotal = network->structinfo.ptr.p_int[3]; istart = network->structinfo.ptr.p_int[5]; /* * Pre-processing of dError/dOut: * from dError/dOut(normalized) to dError/dOut(non-normalized) */ ae_assert(network->structinfo.ptr.p_int[6]==0||network->structinfo.ptr.p_int[6]==1, "MLPInternalCalculateGradient: unknown normalization type!", _state); if( network->structinfo.ptr.p_int[6]==1 ) { /* * Softmax */ if( !naturalerrorfunc ) { mx = network->neurons.ptr.p_double[ntotal-nout]; for(i=0; i<=nout-1; i++) { mx = ae_maxreal(mx, network->neurons.ptr.p_double[ntotal-nout+i], _state); } net = (double)(0); for(i=0; i<=nout-1; i++) { network->nwbuf.ptr.p_double[i] = 
ae_exp(network->neurons.ptr.p_double[ntotal-nout+i]-mx, _state); net = net+network->nwbuf.ptr.p_double[i]; } v = ae_v_dotproduct(&network->derror.ptr.p_double[ntotal-nout], 1, &network->nwbuf.ptr.p_double[0], 1, ae_v_len(ntotal-nout,ntotal-1)); for(i=0; i<=nout-1; i++) { fown = network->nwbuf.ptr.p_double[i]; deown = network->derror.ptr.p_double[ntotal-nout+i]; network->nwbuf.ptr.p_double[nout+i] = (-v+deown*fown+deown*(net-fown))*fown/ae_sqr(net, _state); } for(i=0; i<=nout-1; i++) { network->derror.ptr.p_double[ntotal-nout+i] = network->nwbuf.ptr.p_double[nout+i]; } } } else { /* * Un-standardisation */ for(i=0; i<=nout-1; i++) { network->derror.ptr.p_double[ntotal-nout+i] = network->derror.ptr.p_double[ntotal-nout+i]*network->columnsigmas.ptr.p_double[nin+i]; } } /* * Backpropagation */ for(i=ntotal-1; i>=0; i--) { /* * Extract info */ offs = istart+i*mlpbase_nfieldwidth; if( network->structinfo.ptr.p_int[offs+0]>0||network->structinfo.ptr.p_int[offs+0]==-5 ) { /* * Activation function */ dedf = network->derror.ptr.p_double[i]; dfdnet = network->dfdnet.ptr.p_double[i]; derror->ptr.p_double[network->structinfo.ptr.p_int[offs+2]] = derror->ptr.p_double[network->structinfo.ptr.p_int[offs+2]]+dedf*dfdnet; continue; } if( network->structinfo.ptr.p_int[offs+0]==0 ) { /* * Adaptive summator */ n1 = network->structinfo.ptr.p_int[offs+2]; n2 = n1+network->structinfo.ptr.p_int[offs+1]-1; w1 = network->structinfo.ptr.p_int[offs+3]; w2 = w1+network->structinfo.ptr.p_int[offs+1]-1; dedf = network->derror.ptr.p_double[i]; dfdnet = 1.0; v = dedf*dfdnet; ae_v_moved(&grad->ptr.p_double[w1], 1, &neurons->ptr.p_double[n1], 1, ae_v_len(w1,w2), v); ae_v_addd(&derror->ptr.p_double[n1], 1, &weights->ptr.p_double[w1], 1, ae_v_len(n1,n2), v); continue; } if( network->structinfo.ptr.p_int[offs+0]<0 ) { bflag = ae_false; if( (network->structinfo.ptr.p_int[offs+0]==-2||network->structinfo.ptr.p_int[offs+0]==-3)||network->structinfo.ptr.p_int[offs+0]==-4 ) { /* * Special neuron type, no back-propagation required */ bflag = ae_true; } ae_assert(bflag, "MLPInternalCalculateGradient: unknown neuron type!", _state); continue; } } } static void mlpbase_mlpchunkedgradient(multilayerperceptron* network, /* Real */ ae_matrix* xy, ae_int_t cstart, ae_int_t csize, /* Real */ ae_vector* batch4buf, /* Real */ ae_vector* hpcbuf, double* e, ae_bool naturalerrorfunc, ae_state *_state) { ae_int_t i; ae_int_t j; ae_int_t k; ae_int_t kl; ae_int_t ntotal; ae_int_t nin; ae_int_t nout; ae_int_t offs; double f; double df; double d2f; double v; double vv; double s; double fown; double deown; ae_bool bflag; ae_int_t istart; ae_int_t entrysize; ae_int_t dfoffs; ae_int_t derroroffs; ae_int_t entryoffs; ae_int_t neuronidx; ae_int_t srcentryoffs; ae_int_t srcneuronidx; ae_int_t srcweightidx; ae_int_t neurontype; ae_int_t nweights; ae_int_t offs0; ae_int_t offs1; ae_int_t offs2; double v0; double v1; double v2; double v3; double s0; double s1; double s2; double s3; ae_int_t chunksize; chunksize = 4; ae_assert(csize<=chunksize, "MLPChunkedGradient: internal error (CSize>ChunkSize)", _state); /* * Try to use HPC core, if possible */ if( hpcchunkedgradient(&network->weights, &network->structinfo, &network->columnmeans, &network->columnsigmas, xy, cstart, csize, batch4buf, hpcbuf, e, naturalerrorfunc, _state) ) { return; } /* * Read network geometry, prepare data */ nin = network->structinfo.ptr.p_int[1]; nout = network->structinfo.ptr.p_int[2]; ntotal = network->structinfo.ptr.p_int[3]; istart = network->structinfo.ptr.p_int[5]; entrysize = 12; 
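    /*
     * Note (added for clarity): samples are processed in chunks of ChunkSize=4.
     * Every neuron owns an entry of EntrySize=12 doubles in Batch4Buf:
     *   [0..3]  - neuron outputs for the four samples of the chunk
     *   [4..7]  - dF/dNET values       (offset DFOffs=4)
     *   [8..11] - dError/dF values     (offset DErrorOffs=8)
     */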
dfoffs = 4; derroroffs = 8; /* * Fill Batch4Buf by zeros. * * THIS STAGE IS VERY IMPORTANT! * * We fill all components of entry - neuron values, dF/dNET, dError/dF. * It allows us to easily handle situations when CSizeptr.p_double[i] = (double)(0); } /* * Forward pass: * 1. Load data into Batch4Buf. If CSizecolumnsigmas.ptr.p_double[i],(double)(0)) ) { batch4buf->ptr.p_double[entryoffs+j] = (xy->ptr.pp_double[cstart+j][i]-network->columnmeans.ptr.p_double[i])/network->columnsigmas.ptr.p_double[i]; } else { batch4buf->ptr.p_double[entryoffs+j] = xy->ptr.pp_double[cstart+j][i]-network->columnmeans.ptr.p_double[i]; } } } for(neuronidx=0; neuronidx<=ntotal-1; neuronidx++) { entryoffs = entrysize*neuronidx; offs = istart+neuronidx*mlpbase_nfieldwidth; neurontype = network->structinfo.ptr.p_int[offs+0]; if( neurontype>0||neurontype==-5 ) { /* * "activation function" neuron, which takes value of neuron SrcNeuronIdx * and applies activation function to it. * * This neuron has no weights and no tunable parameters. */ srcneuronidx = network->structinfo.ptr.p_int[offs+2]; srcentryoffs = entrysize*srcneuronidx; mlpactivationfunction(batch4buf->ptr.p_double[srcentryoffs+0], neurontype, &f, &df, &d2f, _state); batch4buf->ptr.p_double[entryoffs+0] = f; batch4buf->ptr.p_double[entryoffs+0+dfoffs] = df; mlpactivationfunction(batch4buf->ptr.p_double[srcentryoffs+1], neurontype, &f, &df, &d2f, _state); batch4buf->ptr.p_double[entryoffs+1] = f; batch4buf->ptr.p_double[entryoffs+1+dfoffs] = df; mlpactivationfunction(batch4buf->ptr.p_double[srcentryoffs+2], neurontype, &f, &df, &d2f, _state); batch4buf->ptr.p_double[entryoffs+2] = f; batch4buf->ptr.p_double[entryoffs+2+dfoffs] = df; mlpactivationfunction(batch4buf->ptr.p_double[srcentryoffs+3], neurontype, &f, &df, &d2f, _state); batch4buf->ptr.p_double[entryoffs+3] = f; batch4buf->ptr.p_double[entryoffs+3+dfoffs] = df; continue; } if( neurontype==0 ) { /* * "adaptive summator" neuron, whose output is a weighted sum of inputs. * It has weights, but has no activation function. 
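         *
         * (Added note: the loop below accumulates the weighted sum separately for
         * the four chunk samples V0..V3, stepping through source entries with
         * stride EntrySize; dF/dNET is set to 1 since the summator is linear.)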
*/ nweights = network->structinfo.ptr.p_int[offs+1]; srcneuronidx = network->structinfo.ptr.p_int[offs+2]; srcentryoffs = entrysize*srcneuronidx; srcweightidx = network->structinfo.ptr.p_int[offs+3]; v0 = (double)(0); v1 = (double)(0); v2 = (double)(0); v3 = (double)(0); for(j=0; j<=nweights-1; j++) { v = network->weights.ptr.p_double[srcweightidx]; srcweightidx = srcweightidx+1; v0 = v0+v*batch4buf->ptr.p_double[srcentryoffs+0]; v1 = v1+v*batch4buf->ptr.p_double[srcentryoffs+1]; v2 = v2+v*batch4buf->ptr.p_double[srcentryoffs+2]; v3 = v3+v*batch4buf->ptr.p_double[srcentryoffs+3]; srcentryoffs = srcentryoffs+entrysize; } batch4buf->ptr.p_double[entryoffs+0] = v0; batch4buf->ptr.p_double[entryoffs+1] = v1; batch4buf->ptr.p_double[entryoffs+2] = v2; batch4buf->ptr.p_double[entryoffs+3] = v3; batch4buf->ptr.p_double[entryoffs+0+dfoffs] = (double)(1); batch4buf->ptr.p_double[entryoffs+1+dfoffs] = (double)(1); batch4buf->ptr.p_double[entryoffs+2+dfoffs] = (double)(1); batch4buf->ptr.p_double[entryoffs+3+dfoffs] = (double)(1); continue; } if( neurontype<0 ) { bflag = ae_false; if( neurontype==-2 ) { /* * Input neuron, left unchanged */ bflag = ae_true; } if( neurontype==-3 ) { /* * "-1" neuron */ batch4buf->ptr.p_double[entryoffs+0] = (double)(-1); batch4buf->ptr.p_double[entryoffs+1] = (double)(-1); batch4buf->ptr.p_double[entryoffs+2] = (double)(-1); batch4buf->ptr.p_double[entryoffs+3] = (double)(-1); batch4buf->ptr.p_double[entryoffs+0+dfoffs] = (double)(0); batch4buf->ptr.p_double[entryoffs+1+dfoffs] = (double)(0); batch4buf->ptr.p_double[entryoffs+2+dfoffs] = (double)(0); batch4buf->ptr.p_double[entryoffs+3+dfoffs] = (double)(0); bflag = ae_true; } if( neurontype==-4 ) { /* * "0" neuron */ batch4buf->ptr.p_double[entryoffs+0] = (double)(0); batch4buf->ptr.p_double[entryoffs+1] = (double)(0); batch4buf->ptr.p_double[entryoffs+2] = (double)(0); batch4buf->ptr.p_double[entryoffs+3] = (double)(0); batch4buf->ptr.p_double[entryoffs+0+dfoffs] = (double)(0); batch4buf->ptr.p_double[entryoffs+1+dfoffs] = (double)(0); batch4buf->ptr.p_double[entryoffs+2+dfoffs] = (double)(0); batch4buf->ptr.p_double[entryoffs+3+dfoffs] = (double)(0); bflag = ae_true; } ae_assert(bflag, "MLPChunkedGradient: internal error - unknown neuron type!", _state); continue; } } /* * Intermediate phase between forward and backward passes. * * For regression networks: * * forward pass is completely done (no additional post-processing is * needed). * * before starting backward pass, we have to calculate dError/dOut * for output neurons. We also update error at this phase. * * For classification networks: * * in addition to forward pass we apply SOFTMAX normalization to * output neurons. * * after applying normalization, we have to calculate dError/dOut, * which is calculated in two steps: * * first, we calculate derivative of error with respect to SOFTMAX * normalized outputs (normalized dError) * * then, we calculate derivative of error with respect to values * of outputs BEFORE normalization was applied to them */ ae_assert(network->structinfo.ptr.p_int[6]==0||network->structinfo.ptr.p_int[6]==1, "MLPChunkedGradient: unknown normalization type!", _state); if( network->structinfo.ptr.p_int[6]==1 ) { /* * SOFTMAX-normalized network. * * First, calculate (V0,V1,V2,V3) - component-wise maximum * of output neurons. This vector of maximum values will be * used for normalization of outputs prior to calculating * exponentials. * * NOTE: the only purpose of this stage is to prevent overflow * during calculation of exponentials. 
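         * (Added note: subtracting the per-sample maximum does not change the
         * result, because exp(x[i]-m)/sum_k exp(x[k]-m) = exp(x[i])/sum_k exp(x[k]);
         * the common factor exp(-m) cancels.)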
With this stage * we make sure that all exponentials are calculated * with non-positive argument. If you load (0,0,0,0) to * (V0,V1,V2,V3), your program will continue working - * although with less robustness. */ entryoffs = entrysize*(ntotal-nout); v0 = batch4buf->ptr.p_double[entryoffs+0]; v1 = batch4buf->ptr.p_double[entryoffs+1]; v2 = batch4buf->ptr.p_double[entryoffs+2]; v3 = batch4buf->ptr.p_double[entryoffs+3]; entryoffs = entryoffs+entrysize; for(i=1; i<=nout-1; i++) { v = batch4buf->ptr.p_double[entryoffs+0]; if( v>v0 ) { v0 = v; } v = batch4buf->ptr.p_double[entryoffs+1]; if( v>v1 ) { v1 = v; } v = batch4buf->ptr.p_double[entryoffs+2]; if( v>v2 ) { v2 = v; } v = batch4buf->ptr.p_double[entryoffs+3]; if( v>v3 ) { v3 = v; } entryoffs = entryoffs+entrysize; } /* * Then, calculate exponentials and place them to part of the * array which is located past the last entry. We also * calculate sum of exponentials which will be stored past the * exponentials. */ entryoffs = entrysize*(ntotal-nout); offs0 = entrysize*ntotal; s0 = (double)(0); s1 = (double)(0); s2 = (double)(0); s3 = (double)(0); for(i=0; i<=nout-1; i++) { v = ae_exp(batch4buf->ptr.p_double[entryoffs+0]-v0, _state); s0 = s0+v; batch4buf->ptr.p_double[offs0+0] = v; v = ae_exp(batch4buf->ptr.p_double[entryoffs+1]-v1, _state); s1 = s1+v; batch4buf->ptr.p_double[offs0+1] = v; v = ae_exp(batch4buf->ptr.p_double[entryoffs+2]-v2, _state); s2 = s2+v; batch4buf->ptr.p_double[offs0+2] = v; v = ae_exp(batch4buf->ptr.p_double[entryoffs+3]-v3, _state); s3 = s3+v; batch4buf->ptr.p_double[offs0+3] = v; entryoffs = entryoffs+entrysize; offs0 = offs0+chunksize; } offs0 = entrysize*ntotal+2*nout*chunksize; batch4buf->ptr.p_double[offs0+0] = s0; batch4buf->ptr.p_double[offs0+1] = s1; batch4buf->ptr.p_double[offs0+2] = s2; batch4buf->ptr.p_double[offs0+3] = s3; /* * Now we have: * * Batch4Buf[0...EntrySize*NTotal-1] stores: * * NTotal*ChunkSize neuron output values (SOFTMAX normalization * was not applied to these values), * * NTotal*ChunkSize values of dF/dNET (derivative of neuron * output with respect to its input) * * NTotal*ChunkSize zeros in the elements which correspond to * dError/dOut (derivative of error with respect to neuron output). * * Batch4Buf[EntrySize*NTotal...EntrySize*NTotal+ChunkSize*NOut-1] - * stores exponentials of last NOut neurons. * * Batch4Buf[EntrySize*NTotal+ChunkSize*NOut-1...EntrySize*NTotal+ChunkSize*2*NOut-1] * - can be used for temporary calculations * * Batch4Buf[EntrySize*NTotal+ChunkSize*2*NOut...EntrySize*NTotal+ChunkSize*2*NOut+ChunkSize-1] * - stores sum-of-exponentials * * Block below calculates derivatives of error function with respect * to non-SOFTMAX-normalized output values of last NOut neurons. * * It is quite complicated; we do not describe algebra behind it, * but if you want you may check it yourself :) */ if( naturalerrorfunc ) { /* * Calculate derivative of error with respect to values of * output neurons PRIOR TO SOFTMAX NORMALIZATION. Because we * use natural error function (cross-entropy), we can do so * very easy. 
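         *
         * (Added note: for cross-entropy loss combined with softmax outputs, the
         * derivative of the error with respect to the pre-softmax value of output
         * I reduces to softmax(I) - target(I), which is exactly the VV/S - V
         * computed below.)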
*/ offs0 = entrysize*ntotal+2*nout*chunksize; for(k=0; k<=csize-1; k++) { s = batch4buf->ptr.p_double[offs0+k]; kl = ae_round(xy->ptr.pp_double[cstart+k][nin], _state); offs1 = (ntotal-nout)*entrysize+derroroffs+k; offs2 = entrysize*ntotal+k; for(i=0; i<=nout-1; i++) { if( i==kl ) { v = (double)(1); } else { v = (double)(0); } vv = batch4buf->ptr.p_double[offs2]; batch4buf->ptr.p_double[offs1] = vv/s-v; *e = *e+mlpbase_safecrossentropy(v, vv/s, _state); offs1 = offs1+entrysize; offs2 = offs2+chunksize; } } } else { /* * SOFTMAX normalization makes things very difficult. * Sorry, we do not dare to describe this esoteric math * in details. */ offs0 = entrysize*ntotal+chunksize*2*nout; for(k=0; k<=csize-1; k++) { s = batch4buf->ptr.p_double[offs0+k]; kl = ae_round(xy->ptr.pp_double[cstart+k][nin], _state); vv = (double)(0); offs1 = entrysize*ntotal+k; offs2 = entrysize*ntotal+nout*chunksize+k; for(i=0; i<=nout-1; i++) { fown = batch4buf->ptr.p_double[offs1]; if( i==kl ) { deown = fown/s-1; } else { deown = fown/s; } batch4buf->ptr.p_double[offs2] = deown; vv = vv+deown*fown; *e = *e+deown*deown/2; offs1 = offs1+chunksize; offs2 = offs2+chunksize; } offs1 = entrysize*ntotal+k; offs2 = entrysize*ntotal+nout*chunksize+k; for(i=0; i<=nout-1; i++) { fown = batch4buf->ptr.p_double[offs1]; deown = batch4buf->ptr.p_double[offs2]; batch4buf->ptr.p_double[(ntotal-nout+i)*entrysize+derroroffs+k] = (-vv+deown*fown+deown*(s-fown))*fown/ae_sqr(s, _state); offs1 = offs1+chunksize; offs2 = offs2+chunksize; } } } } else { /* * Regression network with sum-of-squares function. * * For each NOut of last neurons: * * calculate difference between actual and desired output * * calculate dError/dOut for this neuron (proportional to difference) * * store in in last 4 components of entry (these values are used * to start backpropagation) * * update error */ for(i=0; i<=nout-1; i++) { v0 = network->columnsigmas.ptr.p_double[nin+i]; v1 = network->columnmeans.ptr.p_double[nin+i]; entryoffs = entrysize*(ntotal-nout+i); offs0 = entryoffs; offs1 = entryoffs+derroroffs; for(j=0; j<=csize-1; j++) { v = batch4buf->ptr.p_double[offs0+j]*v0+v1-xy->ptr.pp_double[cstart+j][nin+i]; batch4buf->ptr.p_double[offs1+j] = v*v0; *e = *e+v*v/2; } } } /* * Backpropagation */ for(neuronidx=ntotal-1; neuronidx>=0; neuronidx--) { entryoffs = entrysize*neuronidx; offs = istart+neuronidx*mlpbase_nfieldwidth; neurontype = network->structinfo.ptr.p_int[offs+0]; if( neurontype>0||neurontype==-5 ) { /* * Activation function */ srcneuronidx = network->structinfo.ptr.p_int[offs+2]; srcentryoffs = entrysize*srcneuronidx; offs0 = srcentryoffs+derroroffs; offs1 = entryoffs+derroroffs; offs2 = entryoffs+dfoffs; batch4buf->ptr.p_double[offs0+0] = batch4buf->ptr.p_double[offs0+0]+batch4buf->ptr.p_double[offs1+0]*batch4buf->ptr.p_double[offs2+0]; batch4buf->ptr.p_double[offs0+1] = batch4buf->ptr.p_double[offs0+1]+batch4buf->ptr.p_double[offs1+1]*batch4buf->ptr.p_double[offs2+1]; batch4buf->ptr.p_double[offs0+2] = batch4buf->ptr.p_double[offs0+2]+batch4buf->ptr.p_double[offs1+2]*batch4buf->ptr.p_double[offs2+2]; batch4buf->ptr.p_double[offs0+3] = batch4buf->ptr.p_double[offs0+3]+batch4buf->ptr.p_double[offs1+3]*batch4buf->ptr.p_double[offs2+3]; continue; } if( neurontype==0 ) { /* * Adaptive summator */ nweights = network->structinfo.ptr.p_int[offs+1]; srcneuronidx = network->structinfo.ptr.p_int[offs+2]; srcentryoffs = entrysize*srcneuronidx; srcweightidx = network->structinfo.ptr.p_int[offs+3]; v0 = batch4buf->ptr.p_double[entryoffs+derroroffs+0]; v1 = 
batch4buf->ptr.p_double[entryoffs+derroroffs+1]; v2 = batch4buf->ptr.p_double[entryoffs+derroroffs+2]; v3 = batch4buf->ptr.p_double[entryoffs+derroroffs+3]; for(j=0; j<=nweights-1; j++) { offs0 = srcentryoffs; offs1 = srcentryoffs+derroroffs; v = network->weights.ptr.p_double[srcweightidx]; hpcbuf->ptr.p_double[srcweightidx] = hpcbuf->ptr.p_double[srcweightidx]+batch4buf->ptr.p_double[offs0+0]*v0+batch4buf->ptr.p_double[offs0+1]*v1+batch4buf->ptr.p_double[offs0+2]*v2+batch4buf->ptr.p_double[offs0+3]*v3; batch4buf->ptr.p_double[offs1+0] = batch4buf->ptr.p_double[offs1+0]+v*v0; batch4buf->ptr.p_double[offs1+1] = batch4buf->ptr.p_double[offs1+1]+v*v1; batch4buf->ptr.p_double[offs1+2] = batch4buf->ptr.p_double[offs1+2]+v*v2; batch4buf->ptr.p_double[offs1+3] = batch4buf->ptr.p_double[offs1+3]+v*v3; srcentryoffs = srcentryoffs+entrysize; srcweightidx = srcweightidx+1; } continue; } if( neurontype<0 ) { bflag = ae_false; if( (neurontype==-2||neurontype==-3)||neurontype==-4 ) { /* * Special neuron type, no back-propagation required */ bflag = ae_true; } ae_assert(bflag, "MLPInternalCalculateGradient: unknown neuron type!", _state); continue; } } } static void mlpbase_mlpchunkedprocess(multilayerperceptron* network, /* Real */ ae_matrix* xy, ae_int_t cstart, ae_int_t csize, /* Real */ ae_vector* batch4buf, /* Real */ ae_vector* hpcbuf, ae_state *_state) { ae_int_t i; ae_int_t j; ae_int_t ntotal; ae_int_t nin; ae_int_t nout; ae_int_t offs; double f; double df; double d2f; double v; ae_bool bflag; ae_int_t istart; ae_int_t entrysize; ae_int_t entryoffs; ae_int_t neuronidx; ae_int_t srcentryoffs; ae_int_t srcneuronidx; ae_int_t srcweightidx; ae_int_t neurontype; ae_int_t nweights; ae_int_t offs0; double v0; double v1; double v2; double v3; double s0; double s1; double s2; double s3; ae_int_t chunksize; chunksize = 4; ae_assert(csize<=chunksize, "MLPChunkedProcess: internal error (CSize>ChunkSize)", _state); /* * Try to use HPC core, if possible */ if( hpcchunkedprocess(&network->weights, &network->structinfo, &network->columnmeans, &network->columnsigmas, xy, cstart, csize, batch4buf, hpcbuf, _state) ) { return; } /* * Read network geometry, prepare data */ nin = network->structinfo.ptr.p_int[1]; nout = network->structinfo.ptr.p_int[2]; ntotal = network->structinfo.ptr.p_int[3]; istart = network->structinfo.ptr.p_int[5]; entrysize = 4; /* * Fill Batch4Buf by zeros. * * THIS STAGE IS VERY IMPORTANT! * * We fill all components of entry - neuron values, dF/dNET, dError/dF. * It allows us to easily handle situations when CSizeptr.p_double[i] = (double)(0); } /* * Forward pass: * 1. Load data into Batch4Buf. If CSizecolumnsigmas.ptr.p_double[i],(double)(0)) ) { batch4buf->ptr.p_double[entryoffs+j] = (xy->ptr.pp_double[cstart+j][i]-network->columnmeans.ptr.p_double[i])/network->columnsigmas.ptr.p_double[i]; } else { batch4buf->ptr.p_double[entryoffs+j] = xy->ptr.pp_double[cstart+j][i]-network->columnmeans.ptr.p_double[i]; } } } for(neuronidx=0; neuronidx<=ntotal-1; neuronidx++) { entryoffs = entrysize*neuronidx; offs = istart+neuronidx*mlpbase_nfieldwidth; neurontype = network->structinfo.ptr.p_int[offs+0]; if( neurontype>0||neurontype==-5 ) { /* * "activation function" neuron, which takes value of neuron SrcNeuronIdx * and applies activation function to it. * * This neuron has no weights and no tunable parameters. 
*/ srcneuronidx = network->structinfo.ptr.p_int[offs+2]; srcentryoffs = entrysize*srcneuronidx; mlpactivationfunction(batch4buf->ptr.p_double[srcentryoffs+0], neurontype, &f, &df, &d2f, _state); batch4buf->ptr.p_double[entryoffs+0] = f; mlpactivationfunction(batch4buf->ptr.p_double[srcentryoffs+1], neurontype, &f, &df, &d2f, _state); batch4buf->ptr.p_double[entryoffs+1] = f; mlpactivationfunction(batch4buf->ptr.p_double[srcentryoffs+2], neurontype, &f, &df, &d2f, _state); batch4buf->ptr.p_double[entryoffs+2] = f; mlpactivationfunction(batch4buf->ptr.p_double[srcentryoffs+3], neurontype, &f, &df, &d2f, _state); batch4buf->ptr.p_double[entryoffs+3] = f; continue; } if( neurontype==0 ) { /* * "adaptive summator" neuron, whose output is a weighted sum of inputs. * It has weights, but has no activation function. */ nweights = network->structinfo.ptr.p_int[offs+1]; srcneuronidx = network->structinfo.ptr.p_int[offs+2]; srcentryoffs = entrysize*srcneuronidx; srcweightidx = network->structinfo.ptr.p_int[offs+3]; v0 = (double)(0); v1 = (double)(0); v2 = (double)(0); v3 = (double)(0); for(j=0; j<=nweights-1; j++) { v = network->weights.ptr.p_double[srcweightidx]; srcweightidx = srcweightidx+1; v0 = v0+v*batch4buf->ptr.p_double[srcentryoffs+0]; v1 = v1+v*batch4buf->ptr.p_double[srcentryoffs+1]; v2 = v2+v*batch4buf->ptr.p_double[srcentryoffs+2]; v3 = v3+v*batch4buf->ptr.p_double[srcentryoffs+3]; srcentryoffs = srcentryoffs+entrysize; } batch4buf->ptr.p_double[entryoffs+0] = v0; batch4buf->ptr.p_double[entryoffs+1] = v1; batch4buf->ptr.p_double[entryoffs+2] = v2; batch4buf->ptr.p_double[entryoffs+3] = v3; continue; } if( neurontype<0 ) { bflag = ae_false; if( neurontype==-2 ) { /* * Input neuron, left unchanged */ bflag = ae_true; } if( neurontype==-3 ) { /* * "-1" neuron */ batch4buf->ptr.p_double[entryoffs+0] = (double)(-1); batch4buf->ptr.p_double[entryoffs+1] = (double)(-1); batch4buf->ptr.p_double[entryoffs+2] = (double)(-1); batch4buf->ptr.p_double[entryoffs+3] = (double)(-1); bflag = ae_true; } if( neurontype==-4 ) { /* * "0" neuron */ batch4buf->ptr.p_double[entryoffs+0] = (double)(0); batch4buf->ptr.p_double[entryoffs+1] = (double)(0); batch4buf->ptr.p_double[entryoffs+2] = (double)(0); batch4buf->ptr.p_double[entryoffs+3] = (double)(0); bflag = ae_true; } ae_assert(bflag, "MLPChunkedProcess: internal error - unknown neuron type!", _state); continue; } } /* * SOFTMAX normalization or scaling. */ ae_assert(network->structinfo.ptr.p_int[6]==0||network->structinfo.ptr.p_int[6]==1, "MLPChunkedProcess: unknown normalization type!", _state); if( network->structinfo.ptr.p_int[6]==1 ) { /* * SOFTMAX-normalized network. * * First, calculate (V0,V1,V2,V3) - component-wise maximum * of output neurons. This vector of maximum values will be * used for normalization of outputs prior to calculating * exponentials. * * NOTE: the only purpose of this stage is to prevent overflow * during calculation of exponentials. With this stage * we make sure that all exponentials are calculated * with non-positive argument. If you load (0,0,0,0) to * (V0,V1,V2,V3), your program will continue working - * although with less robustness. 
*/ entryoffs = entrysize*(ntotal-nout); v0 = batch4buf->ptr.p_double[entryoffs+0]; v1 = batch4buf->ptr.p_double[entryoffs+1]; v2 = batch4buf->ptr.p_double[entryoffs+2]; v3 = batch4buf->ptr.p_double[entryoffs+3]; entryoffs = entryoffs+entrysize; for(i=1; i<=nout-1; i++) { v = batch4buf->ptr.p_double[entryoffs+0]; if( v>v0 ) { v0 = v; } v = batch4buf->ptr.p_double[entryoffs+1]; if( v>v1 ) { v1 = v; } v = batch4buf->ptr.p_double[entryoffs+2]; if( v>v2 ) { v2 = v; } v = batch4buf->ptr.p_double[entryoffs+3]; if( v>v3 ) { v3 = v; } entryoffs = entryoffs+entrysize; } /* * Then, calculate exponentials and place them to part of the * array which is located past the last entry. We also * calculate sum of exponentials. */ entryoffs = entrysize*(ntotal-nout); offs0 = entrysize*ntotal; s0 = (double)(0); s1 = (double)(0); s2 = (double)(0); s3 = (double)(0); for(i=0; i<=nout-1; i++) { v = ae_exp(batch4buf->ptr.p_double[entryoffs+0]-v0, _state); s0 = s0+v; batch4buf->ptr.p_double[offs0+0] = v; v = ae_exp(batch4buf->ptr.p_double[entryoffs+1]-v1, _state); s1 = s1+v; batch4buf->ptr.p_double[offs0+1] = v; v = ae_exp(batch4buf->ptr.p_double[entryoffs+2]-v2, _state); s2 = s2+v; batch4buf->ptr.p_double[offs0+2] = v; v = ae_exp(batch4buf->ptr.p_double[entryoffs+3]-v3, _state); s3 = s3+v; batch4buf->ptr.p_double[offs0+3] = v; entryoffs = entryoffs+entrysize; offs0 = offs0+chunksize; } /* * Write SOFTMAX-normalized values to the output array. */ offs0 = entrysize*ntotal; for(i=0; i<=nout-1; i++) { if( csize>0 ) { xy->ptr.pp_double[cstart+0][nin+i] = batch4buf->ptr.p_double[offs0+0]/s0; } if( csize>1 ) { xy->ptr.pp_double[cstart+1][nin+i] = batch4buf->ptr.p_double[offs0+1]/s1; } if( csize>2 ) { xy->ptr.pp_double[cstart+2][nin+i] = batch4buf->ptr.p_double[offs0+2]/s2; } if( csize>3 ) { xy->ptr.pp_double[cstart+3][nin+i] = batch4buf->ptr.p_double[offs0+3]/s3; } offs0 = offs0+chunksize; } } else { /* * Regression network with sum-of-squares function. * * For each NOut of last neurons: * * calculate difference between actual and desired output * * calculate dError/dOut for this neuron (proportional to difference) * * store in in last 4 components of entry (these values are used * to start backpropagation) * * update error */ for(i=0; i<=nout-1; i++) { v0 = network->columnsigmas.ptr.p_double[nin+i]; v1 = network->columnmeans.ptr.p_double[nin+i]; entryoffs = entrysize*(ntotal-nout+i); for(j=0; j<=csize-1; j++) { xy->ptr.pp_double[cstart+j][nin+i] = batch4buf->ptr.p_double[entryoffs+j]*v0+v1; } } } } /************************************************************************* Returns T*Ln(T/Z), guarded against overflow/underflow. Internal subroutine. *************************************************************************/ static double mlpbase_safecrossentropy(double t, double z, ae_state *_state) { double r; double result; if( ae_fp_eq(t,(double)(0)) ) { result = (double)(0); } else { if( ae_fp_greater(ae_fabs(z, _state),(double)(1)) ) { /* * Shouldn't be the case with softmax, * but we just want to be sure. 
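         *
         * (Added note: in this branch |Z|>1, so T/Z may underflow to zero; it is
         * clamped to MinRealNumber before taking the logarithm. The branch below
         * handles the opposite case, clamping to MaxRealNumber when T/Z would
         * overflow.)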
*/ if( ae_fp_eq(t/z,(double)(0)) ) { r = ae_minrealnumber; } else { r = t/z; } } else { /* * Normal case */ if( ae_fp_eq(z,(double)(0))||ae_fp_greater_eq(ae_fabs(t, _state),ae_maxrealnumber*ae_fabs(z, _state)) ) { r = ae_maxrealnumber; } else { r = t/z; } } result = t*ae_log(r, _state); } return result; } /************************************************************************* This function performs backward pass of neural network randimization: * it assumes that Network.Weights stores standard deviation of weights (weights are not generated yet, only their deviations are present) * it sets deviations of weights which feed NeuronIdx-th neuron to specified value * it recursively passes to deeper neuron and modifies their weights * it stops after encountering nonlinear neurons, linear activation function, input neurons, "0" and "-1" neurons -- ALGLIB -- Copyright 27.06.2013 by Bochkanov Sergey *************************************************************************/ static void mlpbase_randomizebackwardpass(multilayerperceptron* network, ae_int_t neuronidx, double v, ae_state *_state) { ae_int_t istart; ae_int_t neurontype; ae_int_t n1; ae_int_t n2; ae_int_t w1; ae_int_t w2; ae_int_t offs; ae_int_t i; istart = network->structinfo.ptr.p_int[5]; neurontype = network->structinfo.ptr.p_int[istart+neuronidx*mlpbase_nfieldwidth+0]; if( neurontype==-2 ) { /* * Input neuron - stop */ return; } if( neurontype==-3 ) { /* * "-1" neuron: stop */ return; } if( neurontype==-4 ) { /* * "0" neuron: stop */ return; } if( neurontype==0 ) { /* * Adaptive summator neuron: * * modify deviations of its weights * * recursively call this function for its inputs */ offs = istart+neuronidx*mlpbase_nfieldwidth; n1 = network->structinfo.ptr.p_int[offs+2]; n2 = n1+network->structinfo.ptr.p_int[offs+1]-1; w1 = network->structinfo.ptr.p_int[offs+3]; w2 = w1+network->structinfo.ptr.p_int[offs+1]-1; for(i=w1; i<=w2; i++) { network->weights.ptr.p_double[i] = v; } for(i=n1; i<=n2; i++) { mlpbase_randomizebackwardpass(network, i, v, _state); } return; } if( neurontype==-5 ) { /* * Linear activation function: stop */ return; } if( neurontype>0 ) { /* * Nonlinear activation function: stop */ return; } ae_assert(ae_false, "RandomizeBackwardPass: unexpected neuron type", _state); } void _modelerrors_init(void* _p, ae_state *_state) { modelerrors *p = (modelerrors*)_p; ae_touch_ptr((void*)p); } void _modelerrors_init_copy(void* _dst, void* _src, ae_state *_state) { modelerrors *dst = (modelerrors*)_dst; modelerrors *src = (modelerrors*)_src; dst->relclserror = src->relclserror; dst->avgce = src->avgce; dst->rmserror = src->rmserror; dst->avgerror = src->avgerror; dst->avgrelerror = src->avgrelerror; } void _modelerrors_clear(void* _p) { modelerrors *p = (modelerrors*)_p; ae_touch_ptr((void*)p); } void _modelerrors_destroy(void* _p) { modelerrors *p = (modelerrors*)_p; ae_touch_ptr((void*)p); } void _smlpgrad_init(void* _p, ae_state *_state) { smlpgrad *p = (smlpgrad*)_p; ae_touch_ptr((void*)p); ae_vector_init(&p->g, 0, DT_REAL, _state); } void _smlpgrad_init_copy(void* _dst, void* _src, ae_state *_state) { smlpgrad *dst = (smlpgrad*)_dst; smlpgrad *src = (smlpgrad*)_src; dst->f = src->f; ae_vector_init_copy(&dst->g, &src->g, _state); } void _smlpgrad_clear(void* _p) { smlpgrad *p = (smlpgrad*)_p; ae_touch_ptr((void*)p); ae_vector_clear(&p->g); } void _smlpgrad_destroy(void* _p) { smlpgrad *p = (smlpgrad*)_p; ae_touch_ptr((void*)p); ae_vector_destroy(&p->g); } void _multilayerperceptron_init(void* _p, ae_state *_state) { 
multilayerperceptron *p = (multilayerperceptron*)_p; ae_touch_ptr((void*)p); ae_vector_init(&p->hllayersizes, 0, DT_INT, _state); ae_vector_init(&p->hlconnections, 0, DT_INT, _state); ae_vector_init(&p->hlneurons, 0, DT_INT, _state); ae_vector_init(&p->structinfo, 0, DT_INT, _state); ae_vector_init(&p->weights, 0, DT_REAL, _state); ae_vector_init(&p->columnmeans, 0, DT_REAL, _state); ae_vector_init(&p->columnsigmas, 0, DT_REAL, _state); ae_vector_init(&p->neurons, 0, DT_REAL, _state); ae_vector_init(&p->dfdnet, 0, DT_REAL, _state); ae_vector_init(&p->derror, 0, DT_REAL, _state); ae_vector_init(&p->x, 0, DT_REAL, _state); ae_vector_init(&p->y, 0, DT_REAL, _state); ae_matrix_init(&p->xy, 0, 0, DT_REAL, _state); ae_vector_init(&p->xyrow, 0, DT_REAL, _state); ae_vector_init(&p->nwbuf, 0, DT_REAL, _state); ae_vector_init(&p->integerbuf, 0, DT_INT, _state); _modelerrors_init(&p->err, _state); ae_vector_init(&p->rndbuf, 0, DT_REAL, _state); ae_shared_pool_init(&p->buf, _state); ae_shared_pool_init(&p->gradbuf, _state); ae_matrix_init(&p->dummydxy, 0, 0, DT_REAL, _state); _sparsematrix_init(&p->dummysxy, _state); ae_vector_init(&p->dummyidx, 0, DT_INT, _state); ae_shared_pool_init(&p->dummypool, _state); } void _multilayerperceptron_init_copy(void* _dst, void* _src, ae_state *_state) { multilayerperceptron *dst = (multilayerperceptron*)_dst; multilayerperceptron *src = (multilayerperceptron*)_src; dst->hlnetworktype = src->hlnetworktype; dst->hlnormtype = src->hlnormtype; ae_vector_init_copy(&dst->hllayersizes, &src->hllayersizes, _state); ae_vector_init_copy(&dst->hlconnections, &src->hlconnections, _state); ae_vector_init_copy(&dst->hlneurons, &src->hlneurons, _state); ae_vector_init_copy(&dst->structinfo, &src->structinfo, _state); ae_vector_init_copy(&dst->weights, &src->weights, _state); ae_vector_init_copy(&dst->columnmeans, &src->columnmeans, _state); ae_vector_init_copy(&dst->columnsigmas, &src->columnsigmas, _state); ae_vector_init_copy(&dst->neurons, &src->neurons, _state); ae_vector_init_copy(&dst->dfdnet, &src->dfdnet, _state); ae_vector_init_copy(&dst->derror, &src->derror, _state); ae_vector_init_copy(&dst->x, &src->x, _state); ae_vector_init_copy(&dst->y, &src->y, _state); ae_matrix_init_copy(&dst->xy, &src->xy, _state); ae_vector_init_copy(&dst->xyrow, &src->xyrow, _state); ae_vector_init_copy(&dst->nwbuf, &src->nwbuf, _state); ae_vector_init_copy(&dst->integerbuf, &src->integerbuf, _state); _modelerrors_init_copy(&dst->err, &src->err, _state); ae_vector_init_copy(&dst->rndbuf, &src->rndbuf, _state); ae_shared_pool_init_copy(&dst->buf, &src->buf, _state); ae_shared_pool_init_copy(&dst->gradbuf, &src->gradbuf, _state); ae_matrix_init_copy(&dst->dummydxy, &src->dummydxy, _state); _sparsematrix_init_copy(&dst->dummysxy, &src->dummysxy, _state); ae_vector_init_copy(&dst->dummyidx, &src->dummyidx, _state); ae_shared_pool_init_copy(&dst->dummypool, &src->dummypool, _state); } void _multilayerperceptron_clear(void* _p) { multilayerperceptron *p = (multilayerperceptron*)_p; ae_touch_ptr((void*)p); ae_vector_clear(&p->hllayersizes); ae_vector_clear(&p->hlconnections); ae_vector_clear(&p->hlneurons); ae_vector_clear(&p->structinfo); ae_vector_clear(&p->weights); ae_vector_clear(&p->columnmeans); ae_vector_clear(&p->columnsigmas); ae_vector_clear(&p->neurons); ae_vector_clear(&p->dfdnet); ae_vector_clear(&p->derror); ae_vector_clear(&p->x); ae_vector_clear(&p->y); ae_matrix_clear(&p->xy); ae_vector_clear(&p->xyrow); ae_vector_clear(&p->nwbuf); ae_vector_clear(&p->integerbuf); 
_modelerrors_clear(&p->err); ae_vector_clear(&p->rndbuf); ae_shared_pool_clear(&p->buf); ae_shared_pool_clear(&p->gradbuf); ae_matrix_clear(&p->dummydxy); _sparsematrix_clear(&p->dummysxy); ae_vector_clear(&p->dummyidx); ae_shared_pool_clear(&p->dummypool); } void _multilayerperceptron_destroy(void* _p) { multilayerperceptron *p = (multilayerperceptron*)_p; ae_touch_ptr((void*)p); ae_vector_destroy(&p->hllayersizes); ae_vector_destroy(&p->hlconnections); ae_vector_destroy(&p->hlneurons); ae_vector_destroy(&p->structinfo); ae_vector_destroy(&p->weights); ae_vector_destroy(&p->columnmeans); ae_vector_destroy(&p->columnsigmas); ae_vector_destroy(&p->neurons); ae_vector_destroy(&p->dfdnet); ae_vector_destroy(&p->derror); ae_vector_destroy(&p->x); ae_vector_destroy(&p->y); ae_matrix_destroy(&p->xy); ae_vector_destroy(&p->xyrow); ae_vector_destroy(&p->nwbuf); ae_vector_destroy(&p->integerbuf); _modelerrors_destroy(&p->err); ae_vector_destroy(&p->rndbuf); ae_shared_pool_destroy(&p->buf); ae_shared_pool_destroy(&p->gradbuf); ae_matrix_destroy(&p->dummydxy); _sparsematrix_destroy(&p->dummysxy); ae_vector_destroy(&p->dummyidx); ae_shared_pool_destroy(&p->dummypool); } /************************************************************************* Multiclass Fisher LDA Subroutine finds coefficients of linear combination which optimally separates training set on classes. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multithreading support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. Best results are achieved for high-dimensional problems ! (NVars is at least 256). ! ! Multithreading is used to accelerate initial phase of LDA, which ! includes calculation of products of large matrices. Again, for best ! efficiency problem must be high-dimensional. ! ! Generally, commercial ALGLIB is several times faster than open-source ! generic C edition, and many times faster than open-source C# edition. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS: XY - training set, array[0..NPoints-1,0..NVars]. First NVars columns store values of independent variables, next column stores number of class (from 0 to NClasses-1) which dataset element belongs to. Fractional values are rounded to nearest integer. NPoints - training set size, NPoints>=0 NVars - number of independent variables, NVars>=1 NClasses - number of classes, NClasses>=2 OUTPUT PARAMETERS: Info - return code: * -4, if internal EVD subroutine hasn't converged * -2, if there is a point with class number outside of [0..NClasses-1]. * -1, if incorrect parameters was passed (NPoints<0, NVars<1, NClasses<2) * 1, if task has been solved * 2, if there was a multicollinearity in training set, but task has been solved. 
W - linear combination coefficients, array[0..NVars-1] -- ALGLIB -- Copyright 31.05.2008 by Bochkanov Sergey *************************************************************************/ void fisherlda(/* Real */ ae_matrix* xy, ae_int_t npoints, ae_int_t nvars, ae_int_t nclasses, ae_int_t* info, /* Real */ ae_vector* w, ae_state *_state) { ae_frame _frame_block; ae_matrix w2; ae_frame_make(_state, &_frame_block); *info = 0; ae_vector_clear(w); ae_matrix_init(&w2, 0, 0, DT_REAL, _state); fisherldan(xy, npoints, nvars, nclasses, info, &w2, _state); if( *info>0 ) { ae_vector_set_length(w, nvars, _state); ae_v_move(&w->ptr.p_double[0], 1, &w2.ptr.pp_double[0][0], w2.stride, ae_v_len(0,nvars-1)); } ae_frame_leave(_state); } /************************************************************************* N-dimensional multiclass Fisher LDA Subroutine finds coefficients of linear combinations which optimally separates training set on classes. It returns N-dimensional basis whose vector are sorted by quality of training set separation (in descending order). COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multithreading support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. Best results are achieved for high-dimensional problems ! (NVars is at least 256). ! ! Multithreading is used to accelerate initial phase of LDA, which ! includes calculation of products of large matrices. Again, for best ! efficiency problem must be high-dimensional. ! ! Generally, commercial ALGLIB is several times faster than open-source ! generic C edition, and many times faster than open-source C# edition. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS: XY - training set, array[0..NPoints-1,0..NVars]. First NVars columns store values of independent variables, next column stores number of class (from 0 to NClasses-1) which dataset element belongs to. Fractional values are rounded to nearest integer. NPoints - training set size, NPoints>=0 NVars - number of independent variables, NVars>=1 NClasses - number of classes, NClasses>=2 OUTPUT PARAMETERS: Info - return code: * -4, if internal EVD subroutine hasn't converged * -2, if there is a point with class number outside of [0..NClasses-1]. * -1, if incorrect parameters was passed (NPoints<0, NVars<1, NClasses<2) * 1, if task has been solved * 2, if there was a multicollinearity in training set, but task has been solved. 
W - basis, array[0..NVars-1,0..NVars-1] columns of matrix stores basis vectors, sorted by quality of training set separation (in descending order) -- ALGLIB -- Copyright 31.05.2008 by Bochkanov Sergey *************************************************************************/ void fisherldan(/* Real */ ae_matrix* xy, ae_int_t npoints, ae_int_t nvars, ae_int_t nclasses, ae_int_t* info, /* Real */ ae_matrix* w, ae_state *_state) { ae_frame _frame_block; ae_int_t i; ae_int_t j; ae_int_t k; ae_int_t m; double v; ae_vector c; ae_vector mu; ae_matrix muc; ae_vector nc; ae_matrix sw; ae_matrix st; ae_matrix z; ae_matrix z2; ae_matrix tm; ae_matrix sbroot; ae_matrix a; ae_matrix xyc; ae_matrix xyproj; ae_matrix wproj; ae_vector tf; ae_vector d; ae_vector d2; ae_vector work; ae_frame_make(_state, &_frame_block); *info = 0; ae_matrix_clear(w); ae_vector_init(&c, 0, DT_INT, _state); ae_vector_init(&mu, 0, DT_REAL, _state); ae_matrix_init(&muc, 0, 0, DT_REAL, _state); ae_vector_init(&nc, 0, DT_INT, _state); ae_matrix_init(&sw, 0, 0, DT_REAL, _state); ae_matrix_init(&st, 0, 0, DT_REAL, _state); ae_matrix_init(&z, 0, 0, DT_REAL, _state); ae_matrix_init(&z2, 0, 0, DT_REAL, _state); ae_matrix_init(&tm, 0, 0, DT_REAL, _state); ae_matrix_init(&sbroot, 0, 0, DT_REAL, _state); ae_matrix_init(&a, 0, 0, DT_REAL, _state); ae_matrix_init(&xyc, 0, 0, DT_REAL, _state); ae_matrix_init(&xyproj, 0, 0, DT_REAL, _state); ae_matrix_init(&wproj, 0, 0, DT_REAL, _state); ae_vector_init(&tf, 0, DT_REAL, _state); ae_vector_init(&d, 0, DT_REAL, _state); ae_vector_init(&d2, 0, DT_REAL, _state); ae_vector_init(&work, 0, DT_REAL, _state); /* * Test data */ if( (npoints<0||nvars<1)||nclasses<2 ) { *info = -1; ae_frame_leave(_state); return; } for(i=0; i<=npoints-1; i++) { if( ae_round(xy->ptr.pp_double[i][nvars], _state)<0||ae_round(xy->ptr.pp_double[i][nvars], _state)>=nclasses ) { *info = -2; ae_frame_leave(_state); return; } } *info = 1; /* * Special case: NPoints<=1 * Degenerate task. 
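* With at most one sample the class separation cannot be estimated,
* so the identity matrix is returned as basis and Info is set to 2.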
*/ if( npoints<=1 ) { *info = 2; ae_matrix_set_length(w, nvars, nvars, _state); for(i=0; i<=nvars-1; i++) { for(j=0; j<=nvars-1; j++) { if( i==j ) { w->ptr.pp_double[i][j] = (double)(1); } else { w->ptr.pp_double[i][j] = (double)(0); } } } ae_frame_leave(_state); return; } /* * Prepare temporaries */ ae_vector_set_length(&tf, nvars, _state); ae_vector_set_length(&work, ae_maxint(nvars, npoints, _state)+1, _state); ae_matrix_set_length(&xyc, npoints, nvars, _state); /* * Convert class labels from reals to integers (just for convenience) */ ae_vector_set_length(&c, npoints, _state); for(i=0; i<=npoints-1; i++) { c.ptr.p_int[i] = ae_round(xy->ptr.pp_double[i][nvars], _state); } /* * Calculate class sizes, class means */ ae_vector_set_length(&mu, nvars, _state); ae_matrix_set_length(&muc, nclasses, nvars, _state); ae_vector_set_length(&nc, nclasses, _state); for(j=0; j<=nvars-1; j++) { mu.ptr.p_double[j] = (double)(0); } for(i=0; i<=nclasses-1; i++) { nc.ptr.p_int[i] = 0; for(j=0; j<=nvars-1; j++) { muc.ptr.pp_double[i][j] = (double)(0); } } for(i=0; i<=npoints-1; i++) { ae_v_add(&mu.ptr.p_double[0], 1, &xy->ptr.pp_double[i][0], 1, ae_v_len(0,nvars-1)); ae_v_add(&muc.ptr.pp_double[c.ptr.p_int[i]][0], 1, &xy->ptr.pp_double[i][0], 1, ae_v_len(0,nvars-1)); nc.ptr.p_int[c.ptr.p_int[i]] = nc.ptr.p_int[c.ptr.p_int[i]]+1; } for(i=0; i<=nclasses-1; i++) { v = (double)1/(double)nc.ptr.p_int[i]; ae_v_muld(&muc.ptr.pp_double[i][0], 1, ae_v_len(0,nvars-1), v); } v = (double)1/(double)npoints; ae_v_muld(&mu.ptr.p_double[0], 1, ae_v_len(0,nvars-1), v); /* * Create ST matrix */ ae_matrix_set_length(&st, nvars, nvars, _state); for(i=0; i<=nvars-1; i++) { for(j=0; j<=nvars-1; j++) { st.ptr.pp_double[i][j] = (double)(0); } } for(k=0; k<=npoints-1; k++) { ae_v_move(&xyc.ptr.pp_double[k][0], 1, &xy->ptr.pp_double[k][0], 1, ae_v_len(0,nvars-1)); ae_v_sub(&xyc.ptr.pp_double[k][0], 1, &mu.ptr.p_double[0], 1, ae_v_len(0,nvars-1)); } rmatrixgemm(nvars, nvars, npoints, 1.0, &xyc, 0, 0, 1, &xyc, 0, 0, 0, 0.0, &st, 0, 0, _state); /* * Create SW matrix */ ae_matrix_set_length(&sw, nvars, nvars, _state); for(i=0; i<=nvars-1; i++) { for(j=0; j<=nvars-1; j++) { sw.ptr.pp_double[i][j] = (double)(0); } } for(k=0; k<=npoints-1; k++) { ae_v_move(&xyc.ptr.pp_double[k][0], 1, &xy->ptr.pp_double[k][0], 1, ae_v_len(0,nvars-1)); ae_v_sub(&xyc.ptr.pp_double[k][0], 1, &muc.ptr.pp_double[c.ptr.p_int[k]][0], 1, ae_v_len(0,nvars-1)); } rmatrixgemm(nvars, nvars, npoints, 1.0, &xyc, 0, 0, 1, &xyc, 0, 0, 0, 0.0, &sw, 0, 0, _state); /* * Maximize ratio J=(w'*ST*w)/(w'*SW*w). * * First, make transition from w to v such that w'*ST*w becomes v'*v: * v = root(ST)*w = R*w * R = root(D)*Z' * w = (root(ST)^-1)*v = RI*v * RI = Z*inv(root(D)) * J = (v'*v)/(v'*(RI'*SW*RI)*v) * ST = Z*D*Z' * * so we have * * J = (v'*v) / (v'*(inv(root(D))*Z'*SW*Z*inv(root(D)))*v) = * = (v'*v) / (v'*A*v) */ if( !smatrixevd(&st, nvars, 1, ae_true, &d, &z, _state) ) { *info = -4; ae_frame_leave(_state); return; } ae_matrix_set_length(w, nvars, nvars, _state); if( ae_fp_less_eq(d.ptr.p_double[nvars-1],(double)(0))||ae_fp_less_eq(d.ptr.p_double[0],1000*ae_machineepsilon*d.ptr.p_double[nvars-1]) ) { /* * Special case: D[NVars-1]<=0 * Degenerate task (all variables takes the same value). 
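* Here D and Z are eigenvalues (in ascending order) and eigenvectors
* of ST computed above; eigenvalues not exceeding
* 1000*MachineEpsilon*D[NVars-1] are treated as zero.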
*/ if( ae_fp_less_eq(d.ptr.p_double[nvars-1],(double)(0)) ) { *info = 2; for(i=0; i<=nvars-1; i++) { for(j=0; j<=nvars-1; j++) { if( i==j ) { w->ptr.pp_double[i][j] = (double)(1); } else { w->ptr.pp_double[i][j] = (double)(0); } } } ae_frame_leave(_state); return; } /* * Special case: degenerate ST matrix, multicollinearity found. * Since we know ST eigenvalues/vectors we can translate task to * non-degenerate form. * * Let WG is orthogonal basis of the non zero variance subspace * of the ST and let WZ is orthogonal basis of the zero variance * subspace. * * Projection on WG allows us to use LDA on reduced M-dimensional * subspace, N-M vectors of WZ allows us to update reduced LDA * factors to full N-dimensional subspace. */ m = 0; for(k=0; k<=nvars-1; k++) { if( ae_fp_less_eq(d.ptr.p_double[k],1000*ae_machineepsilon*d.ptr.p_double[nvars-1]) ) { m = k+1; } } ae_assert(m!=0, "FisherLDAN: internal error #1", _state); ae_matrix_set_length(&xyproj, npoints, nvars-m+1, _state); rmatrixgemm(npoints, nvars-m, nvars, 1.0, xy, 0, 0, 0, &z, 0, m, 0, 0.0, &xyproj, 0, 0, _state); for(i=0; i<=npoints-1; i++) { xyproj.ptr.pp_double[i][nvars-m] = xy->ptr.pp_double[i][nvars]; } fisherldan(&xyproj, npoints, nvars-m, nclasses, info, &wproj, _state); if( *info<0 ) { ae_frame_leave(_state); return; } rmatrixgemm(nvars, nvars-m, nvars-m, 1.0, &z, 0, m, 0, &wproj, 0, 0, 0, 0.0, w, 0, 0, _state); for(k=nvars-m; k<=nvars-1; k++) { ae_v_move(&w->ptr.pp_double[0][k], w->stride, &z.ptr.pp_double[0][k-(nvars-m)], z.stride, ae_v_len(0,nvars-1)); } *info = 2; } else { /* * General case: no multicollinearity */ ae_matrix_set_length(&tm, nvars, nvars, _state); ae_matrix_set_length(&a, nvars, nvars, _state); rmatrixgemm(nvars, nvars, nvars, 1.0, &sw, 0, 0, 0, &z, 0, 0, 0, 0.0, &tm, 0, 0, _state); rmatrixgemm(nvars, nvars, nvars, 1.0, &z, 0, 0, 1, &tm, 0, 0, 0, 0.0, &a, 0, 0, _state); for(i=0; i<=nvars-1; i++) { for(j=0; j<=nvars-1; j++) { a.ptr.pp_double[i][j] = a.ptr.pp_double[i][j]/ae_sqrt(d.ptr.p_double[i]*d.ptr.p_double[j], _state); } } if( !smatrixevd(&a, nvars, 1, ae_true, &d2, &z2, _state) ) { *info = -4; ae_frame_leave(_state); return; } for(i=0; i<=nvars-1; i++) { for(k=0; k<=nvars-1; k++) { z2.ptr.pp_double[i][k] = z2.ptr.pp_double[i][k]/ae_sqrt(d.ptr.p_double[i], _state); } } rmatrixgemm(nvars, nvars, nvars, 1.0, &z, 0, 0, 0, &z2, 0, 0, 0, 0.0, w, 0, 0, _state); } /* * Post-processing: * * normalization * * converting to non-negative form, if possible */ for(k=0; k<=nvars-1; k++) { v = ae_v_dotproduct(&w->ptr.pp_double[0][k], w->stride, &w->ptr.pp_double[0][k], w->stride, ae_v_len(0,nvars-1)); v = 1/ae_sqrt(v, _state); ae_v_muld(&w->ptr.pp_double[0][k], w->stride, ae_v_len(0,nvars-1), v); v = (double)(0); for(i=0; i<=nvars-1; i++) { v = v+w->ptr.pp_double[i][k]; } if( ae_fp_less(v,(double)(0)) ) { ae_v_muld(&w->ptr.pp_double[0][k], w->stride, ae_v_len(0,nvars-1), -1); } } ae_frame_leave(_state); } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. *************************************************************************/ void _pexec_fisherldan(/* Real */ ae_matrix* xy, ae_int_t npoints, ae_int_t nvars, ae_int_t nclasses, ae_int_t* info, /* Real */ ae_matrix* w, ae_state *_state) { fisherldan(xy,npoints,nvars,nclasses,info,w, _state); } /************************************************************************* Linear regression Subroutine builds model: Y = A(0)*X[0] + ... 
+ A(N-1)*X[N-1] + A(N) and model found in ALGLIB format, covariation matrix, training set errors (rms, average, average relative) and leave-one-out cross-validation estimate of the generalization error. CV estimate calculated using fast algorithm with O(NPoints*NVars) complexity. When covariation matrix is calculated standard deviations of function values are assumed to be equal to RMS error on the training set. INPUT PARAMETERS: XY - training set, array [0..NPoints-1,0..NVars]: * NVars columns - independent variables * last column - dependent variable NPoints - training set size, NPoints>NVars+1 NVars - number of independent variables OUTPUT PARAMETERS: Info - return code: * -255, in case of unknown internal error * -4, if internal SVD subroutine haven't converged * -1, if incorrect parameters was passed (NPointsrmserror, _state)*npoints/(npoints-nvars-1); for(i=0; i<=nvars; i++) { ae_v_muld(&ar->c.ptr.pp_double[i][0], 1, ae_v_len(0,nvars), sigma2); } ae_frame_leave(_state); } /************************************************************************* Linear regression Variant of LRBuild which uses vector of standatd deviations (errors in function values). INPUT PARAMETERS: XY - training set, array [0..NPoints-1,0..NVars]: * NVars columns - independent variables * last column - dependent variable S - standard deviations (errors in function values) array[0..NPoints-1], S[i]>0. NPoints - training set size, NPoints>NVars+1 NVars - number of independent variables OUTPUT PARAMETERS: Info - return code: * -255, in case of unknown internal error * -4, if internal SVD subroutine haven't converged * -1, if incorrect parameters was passed (NPointsptr.pp_double[i][0], 1, ae_v_len(0,nvars-1)); xyi.ptr.pp_double[i][nvars] = (double)(1); xyi.ptr.pp_double[i][nvars+1] = xy->ptr.pp_double[i][nvars]; } /* * Standartization */ ae_vector_set_length(&x, npoints-1+1, _state); ae_vector_set_length(&means, nvars-1+1, _state); ae_vector_set_length(&sigmas, nvars-1+1, _state); for(j=0; j<=nvars-1; j++) { ae_v_move(&x.ptr.p_double[0], 1, &xy->ptr.pp_double[0][j], xy->stride, ae_v_len(0,npoints-1)); samplemoments(&x, npoints, &mean, &variance, &skewness, &kurtosis, _state); means.ptr.p_double[j] = mean; sigmas.ptr.p_double[j] = ae_sqrt(variance, _state); if( ae_fp_eq(sigmas.ptr.p_double[j],(double)(0)) ) { sigmas.ptr.p_double[j] = (double)(1); } for(i=0; i<=npoints-1; i++) { xyi.ptr.pp_double[i][j] = (xyi.ptr.pp_double[i][j]-means.ptr.p_double[j])/sigmas.ptr.p_double[j]; } } /* * Internal processing */ linreg_lrinternal(&xyi, s, npoints, nvars+1, info, lm, ar, _state); if( *info<0 ) { ae_frame_leave(_state); return; } /* * Un-standartization */ offs = ae_round(lm->w.ptr.p_double[3], _state); for(j=0; j<=nvars-1; j++) { /* * Constant term is updated (and its covariance too, * since it gets some variance from J-th component) */ lm->w.ptr.p_double[offs+nvars] = lm->w.ptr.p_double[offs+nvars]-lm->w.ptr.p_double[offs+j]*means.ptr.p_double[j]/sigmas.ptr.p_double[j]; v = means.ptr.p_double[j]/sigmas.ptr.p_double[j]; ae_v_subd(&ar->c.ptr.pp_double[nvars][0], 1, &ar->c.ptr.pp_double[j][0], 1, ae_v_len(0,nvars), v); ae_v_subd(&ar->c.ptr.pp_double[0][nvars], ar->c.stride, &ar->c.ptr.pp_double[0][j], ar->c.stride, ae_v_len(0,nvars), v); /* * J-th term is updated */ lm->w.ptr.p_double[offs+j] = lm->w.ptr.p_double[offs+j]/sigmas.ptr.p_double[j]; v = 1/sigmas.ptr.p_double[j]; ae_v_muld(&ar->c.ptr.pp_double[j][0], 1, ae_v_len(0,nvars), v); ae_v_muld(&ar->c.ptr.pp_double[0][j], ar->c.stride, ae_v_len(0,nvars), v); } 
ae_frame_leave(_state); } /************************************************************************* Like LRBuildS, but builds model Y = A(0)*X[0] + ... + A(N-1)*X[N-1] i.e. with zero constant term. -- ALGLIB -- Copyright 30.10.2008 by Bochkanov Sergey *************************************************************************/ void lrbuildzs(/* Real */ ae_matrix* xy, /* Real */ ae_vector* s, ae_int_t npoints, ae_int_t nvars, ae_int_t* info, linearmodel* lm, lrreport* ar, ae_state *_state) { ae_frame _frame_block; ae_matrix xyi; ae_vector x; ae_vector c; ae_int_t i; ae_int_t j; double v; ae_int_t offs; double mean; double variance; double skewness; double kurtosis; ae_frame_make(_state, &_frame_block); *info = 0; _linearmodel_clear(lm); _lrreport_clear(ar); ae_matrix_init(&xyi, 0, 0, DT_REAL, _state); ae_vector_init(&x, 0, DT_REAL, _state); ae_vector_init(&c, 0, DT_REAL, _state); /* * Test parameters */ if( npoints<=nvars+1||nvars<1 ) { *info = -1; ae_frame_leave(_state); return; } /* * Copy data, add one more column (constant term) */ ae_matrix_set_length(&xyi, npoints-1+1, nvars+1+1, _state); for(i=0; i<=npoints-1; i++) { ae_v_move(&xyi.ptr.pp_double[i][0], 1, &xy->ptr.pp_double[i][0], 1, ae_v_len(0,nvars-1)); xyi.ptr.pp_double[i][nvars] = (double)(0); xyi.ptr.pp_double[i][nvars+1] = xy->ptr.pp_double[i][nvars]; } /* * Standartization: unusual scaling */ ae_vector_set_length(&x, npoints-1+1, _state); ae_vector_set_length(&c, nvars-1+1, _state); for(j=0; j<=nvars-1; j++) { ae_v_move(&x.ptr.p_double[0], 1, &xy->ptr.pp_double[0][j], xy->stride, ae_v_len(0,npoints-1)); samplemoments(&x, npoints, &mean, &variance, &skewness, &kurtosis, _state); if( ae_fp_greater(ae_fabs(mean, _state),ae_sqrt(variance, _state)) ) { /* * variation is relatively small, it is better to * bring mean value to 1 */ c.ptr.p_double[j] = mean; } else { /* * variation is large, it is better to bring variance to 1 */ if( ae_fp_eq(variance,(double)(0)) ) { variance = (double)(1); } c.ptr.p_double[j] = ae_sqrt(variance, _state); } for(i=0; i<=npoints-1; i++) { xyi.ptr.pp_double[i][j] = xyi.ptr.pp_double[i][j]/c.ptr.p_double[j]; } } /* * Internal processing */ linreg_lrinternal(&xyi, s, npoints, nvars+1, info, lm, ar, _state); if( *info<0 ) { ae_frame_leave(_state); return; } /* * Un-standartization */ offs = ae_round(lm->w.ptr.p_double[3], _state); for(j=0; j<=nvars-1; j++) { /* * J-th term is updated */ lm->w.ptr.p_double[offs+j] = lm->w.ptr.p_double[offs+j]/c.ptr.p_double[j]; v = 1/c.ptr.p_double[j]; ae_v_muld(&ar->c.ptr.pp_double[j][0], 1, ae_v_len(0,nvars), v); ae_v_muld(&ar->c.ptr.pp_double[0][j], ar->c.stride, ae_v_len(0,nvars), v); } ae_frame_leave(_state); } /************************************************************************* Like LRBuild but builds model Y = A(0)*X[0] + ... + A(N-1)*X[N-1] i.e. with zero constant term. 
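NOTE: like LRBuild, this function internally assumes unit weights and rescales
the covariance matrix by Sigma^2 = RMSError^2*NPoints/(NPoints-NVars-1), i.e.
standard deviations of function values are assumed to be equal to the RMS
error on the training set.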
-- ALGLIB -- Copyright 30.10.2008 by Bochkanov Sergey *************************************************************************/ void lrbuildz(/* Real */ ae_matrix* xy, ae_int_t npoints, ae_int_t nvars, ae_int_t* info, linearmodel* lm, lrreport* ar, ae_state *_state) { ae_frame _frame_block; ae_vector s; ae_int_t i; double sigma2; ae_frame_make(_state, &_frame_block); *info = 0; _linearmodel_clear(lm); _lrreport_clear(ar); ae_vector_init(&s, 0, DT_REAL, _state); if( npoints<=nvars+1||nvars<1 ) { *info = -1; ae_frame_leave(_state); return; } ae_vector_set_length(&s, npoints-1+1, _state); for(i=0; i<=npoints-1; i++) { s.ptr.p_double[i] = (double)(1); } lrbuildzs(xy, &s, npoints, nvars, info, lm, ar, _state); if( *info<0 ) { ae_frame_leave(_state); return; } sigma2 = ae_sqr(ar->rmserror, _state)*npoints/(npoints-nvars-1); for(i=0; i<=nvars; i++) { ae_v_muld(&ar->c.ptr.pp_double[i][0], 1, ae_v_len(0,nvars), sigma2); } ae_frame_leave(_state); } /************************************************************************* Unpacks coefficients of linear model. INPUT PARAMETERS: LM - linear model in ALGLIB format OUTPUT PARAMETERS: V - coefficients, array[0..NVars] constant term (intercept) is stored in the V[NVars]. NVars - number of independent variables (one less than number of coefficients) -- ALGLIB -- Copyright 30.08.2008 by Bochkanov Sergey *************************************************************************/ void lrunpack(linearmodel* lm, /* Real */ ae_vector* v, ae_int_t* nvars, ae_state *_state) { ae_int_t offs; ae_vector_clear(v); *nvars = 0; ae_assert(ae_round(lm->w.ptr.p_double[1], _state)==linreg_lrvnum, "LINREG: Incorrect LINREG version!", _state); *nvars = ae_round(lm->w.ptr.p_double[2], _state); offs = ae_round(lm->w.ptr.p_double[3], _state); ae_vector_set_length(v, *nvars+1, _state); ae_v_move(&v->ptr.p_double[0], 1, &lm->w.ptr.p_double[offs], 1, ae_v_len(0,*nvars)); } /************************************************************************* "Packs" coefficients and creates linear model in ALGLIB format (LRUnpack reversed). INPUT PARAMETERS: V - coefficients, array[0..NVars] NVars - number of independent variables OUTPUT PAREMETERS: LM - linear model. -- ALGLIB -- Copyright 30.08.2008 by Bochkanov Sergey *************************************************************************/ void lrpack(/* Real */ ae_vector* v, ae_int_t nvars, linearmodel* lm, ae_state *_state) { ae_int_t offs; _linearmodel_clear(lm); ae_vector_set_length(&lm->w, 4+nvars+1, _state); offs = 4; lm->w.ptr.p_double[0] = (double)(4+nvars+1); lm->w.ptr.p_double[1] = (double)(linreg_lrvnum); lm->w.ptr.p_double[2] = (double)(nvars); lm->w.ptr.p_double[3] = (double)(offs); ae_v_move(&lm->w.ptr.p_double[offs], 1, &v->ptr.p_double[0], 1, ae_v_len(offs,offs+nvars)); } /************************************************************************* Procesing INPUT PARAMETERS: LM - linear model X - input vector, array[0..NVars-1]. 
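The estimate is calculated as W[0]*X[0]+...+W[NVars-1]*X[NVars-1]+W[NVars],
i.e. the constant term (see LRUnpack) is added to the dot product of the
coefficients with X.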
Result: value of linear model regression estimate -- ALGLIB -- Copyright 03.09.2008 by Bochkanov Sergey *************************************************************************/ double lrprocess(linearmodel* lm, /* Real */ ae_vector* x, ae_state *_state) { double v; ae_int_t offs; ae_int_t nvars; double result; ae_assert(ae_round(lm->w.ptr.p_double[1], _state)==linreg_lrvnum, "LINREG: Incorrect LINREG version!", _state); nvars = ae_round(lm->w.ptr.p_double[2], _state); offs = ae_round(lm->w.ptr.p_double[3], _state); v = ae_v_dotproduct(&x->ptr.p_double[0], 1, &lm->w.ptr.p_double[offs], 1, ae_v_len(0,nvars-1)); result = v+lm->w.ptr.p_double[offs+nvars]; return result; } /************************************************************************* RMS error on the test set INPUT PARAMETERS: LM - linear model XY - test set NPoints - test set size RESULT: root mean square error. -- ALGLIB -- Copyright 30.08.2008 by Bochkanov Sergey *************************************************************************/ double lrrmserror(linearmodel* lm, /* Real */ ae_matrix* xy, ae_int_t npoints, ae_state *_state) { ae_int_t i; double v; ae_int_t offs; ae_int_t nvars; double result; ae_assert(ae_round(lm->w.ptr.p_double[1], _state)==linreg_lrvnum, "LINREG: Incorrect LINREG version!", _state); nvars = ae_round(lm->w.ptr.p_double[2], _state); offs = ae_round(lm->w.ptr.p_double[3], _state); result = (double)(0); for(i=0; i<=npoints-1; i++) { v = ae_v_dotproduct(&xy->ptr.pp_double[i][0], 1, &lm->w.ptr.p_double[offs], 1, ae_v_len(0,nvars-1)); v = v+lm->w.ptr.p_double[offs+nvars]; result = result+ae_sqr(v-xy->ptr.pp_double[i][nvars], _state); } result = ae_sqrt(result/npoints, _state); return result; } /************************************************************************* Average error on the test set INPUT PARAMETERS: LM - linear model XY - test set NPoints - test set size RESULT: average error. -- ALGLIB -- Copyright 30.08.2008 by Bochkanov Sergey *************************************************************************/ double lravgerror(linearmodel* lm, /* Real */ ae_matrix* xy, ae_int_t npoints, ae_state *_state) { ae_int_t i; double v; ae_int_t offs; ae_int_t nvars; double result; ae_assert(ae_round(lm->w.ptr.p_double[1], _state)==linreg_lrvnum, "LINREG: Incorrect LINREG version!", _state); nvars = ae_round(lm->w.ptr.p_double[2], _state); offs = ae_round(lm->w.ptr.p_double[3], _state); result = (double)(0); for(i=0; i<=npoints-1; i++) { v = ae_v_dotproduct(&xy->ptr.pp_double[i][0], 1, &lm->w.ptr.p_double[offs], 1, ae_v_len(0,nvars-1)); v = v+lm->w.ptr.p_double[offs+nvars]; result = result+ae_fabs(v-xy->ptr.pp_double[i][nvars], _state); } result = result/npoints; return result; } /************************************************************************* RMS error on the test set INPUT PARAMETERS: LM - linear model XY - test set NPoints - test set size RESULT: average relative error. 
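NOTE: the value returned is the average relative error; only points with
non-zero target values are included in the average, elements of the test
set with Y=0 are skipped.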
-- ALGLIB -- Copyright 30.08.2008 by Bochkanov Sergey *************************************************************************/ double lravgrelerror(linearmodel* lm, /* Real */ ae_matrix* xy, ae_int_t npoints, ae_state *_state) { ae_int_t i; ae_int_t k; double v; ae_int_t offs; ae_int_t nvars; double result; ae_assert(ae_round(lm->w.ptr.p_double[1], _state)==linreg_lrvnum, "LINREG: Incorrect LINREG version!", _state); nvars = ae_round(lm->w.ptr.p_double[2], _state); offs = ae_round(lm->w.ptr.p_double[3], _state); result = (double)(0); k = 0; for(i=0; i<=npoints-1; i++) { if( ae_fp_neq(xy->ptr.pp_double[i][nvars],(double)(0)) ) { v = ae_v_dotproduct(&xy->ptr.pp_double[i][0], 1, &lm->w.ptr.p_double[offs], 1, ae_v_len(0,nvars-1)); v = v+lm->w.ptr.p_double[offs+nvars]; result = result+ae_fabs((v-xy->ptr.pp_double[i][nvars])/xy->ptr.pp_double[i][nvars], _state); k = k+1; } } if( k!=0 ) { result = result/k; } return result; } /************************************************************************* Copying of LinearModel strucure INPUT PARAMETERS: LM1 - original OUTPUT PARAMETERS: LM2 - copy -- ALGLIB -- Copyright 15.03.2009 by Bochkanov Sergey *************************************************************************/ void lrcopy(linearmodel* lm1, linearmodel* lm2, ae_state *_state) { ae_int_t k; _linearmodel_clear(lm2); k = ae_round(lm1->w.ptr.p_double[0], _state); ae_vector_set_length(&lm2->w, k-1+1, _state); ae_v_move(&lm2->w.ptr.p_double[0], 1, &lm1->w.ptr.p_double[0], 1, ae_v_len(0,k-1)); } void lrlines(/* Real */ ae_matrix* xy, /* Real */ ae_vector* s, ae_int_t n, ae_int_t* info, double* a, double* b, double* vara, double* varb, double* covab, double* corrab, double* p, ae_state *_state) { ae_int_t i; double ss; double sx; double sxx; double sy; double stt; double e1; double e2; double t; double chi2; *info = 0; *a = 0; *b = 0; *vara = 0; *varb = 0; *covab = 0; *corrab = 0; *p = 0; if( n<2 ) { *info = -1; return; } for(i=0; i<=n-1; i++) { if( ae_fp_less_eq(s->ptr.p_double[i],(double)(0)) ) { *info = -2; return; } } *info = 1; /* * Calculate S, SX, SY, SXX */ ss = (double)(0); sx = (double)(0); sy = (double)(0); sxx = (double)(0); for(i=0; i<=n-1; i++) { t = ae_sqr(s->ptr.p_double[i], _state); ss = ss+1/t; sx = sx+xy->ptr.pp_double[i][0]/t; sy = sy+xy->ptr.pp_double[i][1]/t; sxx = sxx+ae_sqr(xy->ptr.pp_double[i][0], _state)/t; } /* * Test for condition number */ t = ae_sqrt(4*ae_sqr(sx, _state)+ae_sqr(ss-sxx, _state), _state); e1 = 0.5*(ss+sxx+t); e2 = 0.5*(ss+sxx-t); if( ae_fp_less_eq(ae_minreal(e1, e2, _state),1000*ae_machineepsilon*ae_maxreal(e1, e2, _state)) ) { *info = -3; return; } /* * Calculate A, B */ *a = (double)(0); *b = (double)(0); stt = (double)(0); for(i=0; i<=n-1; i++) { t = (xy->ptr.pp_double[i][0]-sx/ss)/s->ptr.p_double[i]; *b = *b+t*xy->ptr.pp_double[i][1]/s->ptr.p_double[i]; stt = stt+ae_sqr(t, _state); } *b = *b/stt; *a = (sy-sx*(*b))/ss; /* * Calculate goodness-of-fit */ if( n>2 ) { chi2 = (double)(0); for(i=0; i<=n-1; i++) { chi2 = chi2+ae_sqr((xy->ptr.pp_double[i][1]-(*a)-*b*xy->ptr.pp_double[i][0])/s->ptr.p_double[i], _state); } *p = incompletegammac((double)(n-2)/(double)2, chi2/2, _state); } else { *p = (double)(1); } /* * Calculate other parameters */ *vara = (1+ae_sqr(sx, _state)/(ss*stt))/ss; *varb = 1/stt; *covab = -sx/(ss*stt); *corrab = *covab/ae_sqrt(*vara*(*varb), _state); } void lrline(/* Real */ ae_matrix* xy, ae_int_t n, ae_int_t* info, double* a, double* b, ae_state *_state) { ae_frame _frame_block; ae_vector s; ae_int_t i; double vara; double varb; 
double covab; double corrab; double p; ae_frame_make(_state, &_frame_block); *info = 0; *a = 0; *b = 0; ae_vector_init(&s, 0, DT_REAL, _state); if( n<2 ) { *info = -1; ae_frame_leave(_state); return; } ae_vector_set_length(&s, n-1+1, _state); for(i=0; i<=n-1; i++) { s.ptr.p_double[i] = (double)(1); } lrlines(xy, &s, n, info, a, b, &vara, &varb, &covab, &corrab, &p, _state); ae_frame_leave(_state); } /************************************************************************* Internal linear regression subroutine *************************************************************************/ static void linreg_lrinternal(/* Real */ ae_matrix* xy, /* Real */ ae_vector* s, ae_int_t npoints, ae_int_t nvars, ae_int_t* info, linearmodel* lm, lrreport* ar, ae_state *_state) { ae_frame _frame_block; ae_matrix a; ae_matrix u; ae_matrix vt; ae_matrix vm; ae_matrix xym; ae_vector b; ae_vector sv; ae_vector t; ae_vector svi; ae_vector work; ae_int_t i; ae_int_t j; ae_int_t k; ae_int_t ncv; ae_int_t na; ae_int_t nacv; double r; double p; double epstol; lrreport ar2; ae_int_t offs; linearmodel tlm; ae_frame_make(_state, &_frame_block); *info = 0; _linearmodel_clear(lm); _lrreport_clear(ar); ae_matrix_init(&a, 0, 0, DT_REAL, _state); ae_matrix_init(&u, 0, 0, DT_REAL, _state); ae_matrix_init(&vt, 0, 0, DT_REAL, _state); ae_matrix_init(&vm, 0, 0, DT_REAL, _state); ae_matrix_init(&xym, 0, 0, DT_REAL, _state); ae_vector_init(&b, 0, DT_REAL, _state); ae_vector_init(&sv, 0, DT_REAL, _state); ae_vector_init(&t, 0, DT_REAL, _state); ae_vector_init(&svi, 0, DT_REAL, _state); ae_vector_init(&work, 0, DT_REAL, _state); _lrreport_init(&ar2, _state); _linearmodel_init(&tlm, _state); epstol = (double)(1000); /* * Check for errors in data */ if( npointsptr.p_double[i],(double)(0)) ) { *info = -2; ae_frame_leave(_state); return; } } *info = 1; /* * Create design matrix */ ae_matrix_set_length(&a, npoints-1+1, nvars-1+1, _state); ae_vector_set_length(&b, npoints-1+1, _state); for(i=0; i<=npoints-1; i++) { r = 1/s->ptr.p_double[i]; ae_v_moved(&a.ptr.pp_double[i][0], 1, &xy->ptr.pp_double[i][0], 1, ae_v_len(0,nvars-1), r); b.ptr.p_double[i] = xy->ptr.pp_double[i][nvars]/s->ptr.p_double[i]; } /* * Allocate W: * W[0] array size * W[1] version number, 0 * W[2] NVars (minus 1, to be compatible with external representation) * W[3] coefficients offset */ ae_vector_set_length(&lm->w, 4+nvars-1+1, _state); offs = 4; lm->w.ptr.p_double[0] = (double)(4+nvars); lm->w.ptr.p_double[1] = (double)(linreg_lrvnum); lm->w.ptr.p_double[2] = (double)(nvars-1); lm->w.ptr.p_double[3] = (double)(offs); /* * Solve problem using SVD: * * 0. check for degeneracy (different types) * 1. A = U*diag(sv)*V' * 2. T = b'*U * 3. w = SUM((T[i]/sv[i])*V[..,i]) * 4. cov(wi,wj) = SUM(Vji*Vjk/sv[i]^2,K=1..M) * * see $15.4 of "Numerical Recipes in C" for more information */ ae_vector_set_length(&t, nvars-1+1, _state); ae_vector_set_length(&svi, nvars-1+1, _state); ae_matrix_set_length(&ar->c, nvars-1+1, nvars-1+1, _state); ae_matrix_set_length(&vm, nvars-1+1, nvars-1+1, _state); if( !rmatrixsvd(&a, npoints, nvars, 1, 1, 2, &sv, &u, &vt, _state) ) { *info = -4; ae_frame_leave(_state); return; } if( ae_fp_less_eq(sv.ptr.p_double[0],(double)(0)) ) { /* * Degenerate case: zero design matrix. 
*/ for(i=offs; i<=offs+nvars-1; i++) { lm->w.ptr.p_double[i] = (double)(0); } ar->rmserror = lrrmserror(lm, xy, npoints, _state); ar->avgerror = lravgerror(lm, xy, npoints, _state); ar->avgrelerror = lravgrelerror(lm, xy, npoints, _state); ar->cvrmserror = ar->rmserror; ar->cvavgerror = ar->avgerror; ar->cvavgrelerror = ar->avgrelerror; ar->ncvdefects = 0; ae_vector_set_length(&ar->cvdefects, nvars-1+1, _state); for(i=0; i<=nvars-1; i++) { ar->cvdefects.ptr.p_int[i] = -1; } ae_matrix_set_length(&ar->c, nvars-1+1, nvars-1+1, _state); for(i=0; i<=nvars-1; i++) { for(j=0; j<=nvars-1; j++) { ar->c.ptr.pp_double[i][j] = (double)(0); } } ae_frame_leave(_state); return; } if( ae_fp_less_eq(sv.ptr.p_double[nvars-1],epstol*ae_machineepsilon*sv.ptr.p_double[0]) ) { /* * Degenerate case, non-zero design matrix. * * We can leave it and solve task in SVD least squares fashion. * Solution and covariance matrix will be obtained correctly, * but CV error estimates - will not. It is better to reduce * it to non-degenerate task and to obtain correct CV estimates. */ for(k=nvars; k>=1; k--) { if( ae_fp_greater(sv.ptr.p_double[k-1],epstol*ae_machineepsilon*sv.ptr.p_double[0]) ) { /* * Reduce */ ae_matrix_set_length(&xym, npoints-1+1, k+1, _state); for(i=0; i<=npoints-1; i++) { for(j=0; j<=k-1; j++) { r = ae_v_dotproduct(&xy->ptr.pp_double[i][0], 1, &vt.ptr.pp_double[j][0], 1, ae_v_len(0,nvars-1)); xym.ptr.pp_double[i][j] = r; } xym.ptr.pp_double[i][k] = xy->ptr.pp_double[i][nvars]; } /* * Solve */ linreg_lrinternal(&xym, s, npoints, k, info, &tlm, &ar2, _state); if( *info!=1 ) { ae_frame_leave(_state); return; } /* * Convert back to un-reduced format */ for(j=0; j<=nvars-1; j++) { lm->w.ptr.p_double[offs+j] = (double)(0); } for(j=0; j<=k-1; j++) { r = tlm.w.ptr.p_double[offs+j]; ae_v_addd(&lm->w.ptr.p_double[offs], 1, &vt.ptr.pp_double[j][0], 1, ae_v_len(offs,offs+nvars-1), r); } ar->rmserror = ar2.rmserror; ar->avgerror = ar2.avgerror; ar->avgrelerror = ar2.avgrelerror; ar->cvrmserror = ar2.cvrmserror; ar->cvavgerror = ar2.cvavgerror; ar->cvavgrelerror = ar2.cvavgrelerror; ar->ncvdefects = ar2.ncvdefects; ae_vector_set_length(&ar->cvdefects, nvars-1+1, _state); for(j=0; j<=ar->ncvdefects-1; j++) { ar->cvdefects.ptr.p_int[j] = ar2.cvdefects.ptr.p_int[j]; } for(j=ar->ncvdefects; j<=nvars-1; j++) { ar->cvdefects.ptr.p_int[j] = -1; } ae_matrix_set_length(&ar->c, nvars-1+1, nvars-1+1, _state); ae_vector_set_length(&work, nvars+1, _state); matrixmatrixmultiply(&ar2.c, 0, k-1, 0, k-1, ae_false, &vt, 0, k-1, 0, nvars-1, ae_false, 1.0, &vm, 0, k-1, 0, nvars-1, 0.0, &work, _state); matrixmatrixmultiply(&vt, 0, k-1, 0, nvars-1, ae_true, &vm, 0, k-1, 0, nvars-1, ae_false, 1.0, &ar->c, 0, nvars-1, 0, nvars-1, 0.0, &work, _state); ae_frame_leave(_state); return; } } *info = -255; ae_frame_leave(_state); return; } for(i=0; i<=nvars-1; i++) { if( ae_fp_greater(sv.ptr.p_double[i],epstol*ae_machineepsilon*sv.ptr.p_double[0]) ) { svi.ptr.p_double[i] = 1/sv.ptr.p_double[i]; } else { svi.ptr.p_double[i] = (double)(0); } } for(i=0; i<=nvars-1; i++) { t.ptr.p_double[i] = (double)(0); } for(i=0; i<=npoints-1; i++) { r = b.ptr.p_double[i]; ae_v_addd(&t.ptr.p_double[0], 1, &u.ptr.pp_double[i][0], 1, ae_v_len(0,nvars-1), r); } for(i=0; i<=nvars-1; i++) { lm->w.ptr.p_double[offs+i] = (double)(0); } for(i=0; i<=nvars-1; i++) { r = t.ptr.p_double[i]*svi.ptr.p_double[i]; ae_v_addd(&lm->w.ptr.p_double[offs], 1, &vt.ptr.pp_double[i][0], 1, ae_v_len(offs,offs+nvars-1), r); } for(j=0; j<=nvars-1; j++) { r = svi.ptr.p_double[j]; 
ae_v_moved(&vm.ptr.pp_double[0][j], vm.stride, &vt.ptr.pp_double[j][0], 1, ae_v_len(0,nvars-1), r); } for(i=0; i<=nvars-1; i++) { for(j=i; j<=nvars-1; j++) { r = ae_v_dotproduct(&vm.ptr.pp_double[i][0], 1, &vm.ptr.pp_double[j][0], 1, ae_v_len(0,nvars-1)); ar->c.ptr.pp_double[i][j] = r; ar->c.ptr.pp_double[j][i] = r; } } /* * Leave-1-out cross-validation error. * * NOTATIONS: * A design matrix * A*x = b original linear least squares task * U*S*V' SVD of A * ai i-th row of the A * bi i-th element of the b * xf solution of the original LLS task * * Cross-validation error of i-th element from a sample is * calculated using following formula: * * ERRi = ai*xf - (ai*xf-bi*(ui*ui'))/(1-ui*ui') (1) * * This formula can be derived from normal equations of the * original task * * (A'*A)x = A'*b (2) * * by applying modification (zeroing out i-th row of A) to (2): * * (A-ai)'*(A-ai) = (A-ai)'*b * * and using Sherman-Morrison formula for updating matrix inverse * * NOTE 1: b is not zeroed out since it is much simpler and * does not influence final result. * * NOTE 2: some design matrices A have such ui that 1-ui*ui'=0. * Formula (1) can't be applied for such cases and they are skipped * from CV calculation (which distorts resulting CV estimate). * But from the properties of U we can conclude that there can * be no more than NVars such vectors. Usually * NVars << NPoints, so in a normal case it only slightly * influences result. */ ncv = 0; na = 0; nacv = 0; ar->rmserror = (double)(0); ar->avgerror = (double)(0); ar->avgrelerror = (double)(0); ar->cvrmserror = (double)(0); ar->cvavgerror = (double)(0); ar->cvavgrelerror = (double)(0); ar->ncvdefects = 0; ae_vector_set_length(&ar->cvdefects, nvars-1+1, _state); for(i=0; i<=nvars-1; i++) { ar->cvdefects.ptr.p_int[i] = -1; } for(i=0; i<=npoints-1; i++) { /* * Error on a training set */ r = ae_v_dotproduct(&xy->ptr.pp_double[i][0], 1, &lm->w.ptr.p_double[offs], 1, ae_v_len(0,nvars-1)); ar->rmserror = ar->rmserror+ae_sqr(r-xy->ptr.pp_double[i][nvars], _state); ar->avgerror = ar->avgerror+ae_fabs(r-xy->ptr.pp_double[i][nvars], _state); if( ae_fp_neq(xy->ptr.pp_double[i][nvars],(double)(0)) ) { ar->avgrelerror = ar->avgrelerror+ae_fabs((r-xy->ptr.pp_double[i][nvars])/xy->ptr.pp_double[i][nvars], _state); na = na+1; } /* * Error using fast leave-one-out cross-validation */ p = ae_v_dotproduct(&u.ptr.pp_double[i][0], 1, &u.ptr.pp_double[i][0], 1, ae_v_len(0,nvars-1)); if( ae_fp_greater(p,1-epstol*ae_machineepsilon) ) { ar->cvdefects.ptr.p_int[ar->ncvdefects] = i; ar->ncvdefects = ar->ncvdefects+1; continue; } r = s->ptr.p_double[i]*(r/s->ptr.p_double[i]-b.ptr.p_double[i]*p)/(1-p); ar->cvrmserror = ar->cvrmserror+ae_sqr(r-xy->ptr.pp_double[i][nvars], _state); ar->cvavgerror = ar->cvavgerror+ae_fabs(r-xy->ptr.pp_double[i][nvars], _state); if( ae_fp_neq(xy->ptr.pp_double[i][nvars],(double)(0)) ) { ar->cvavgrelerror = ar->cvavgrelerror+ae_fabs((r-xy->ptr.pp_double[i][nvars])/xy->ptr.pp_double[i][nvars], _state); nacv = nacv+1; } ncv = ncv+1; } if( ncv==0 ) { /* * Something strange: ALL ui are degenerate. * Unexpected... 
*/ *info = -255; ae_frame_leave(_state); return; } ar->rmserror = ae_sqrt(ar->rmserror/npoints, _state); ar->avgerror = ar->avgerror/npoints; if( na!=0 ) { ar->avgrelerror = ar->avgrelerror/na; } ar->cvrmserror = ae_sqrt(ar->cvrmserror/ncv, _state); ar->cvavgerror = ar->cvavgerror/ncv; if( nacv!=0 ) { ar->cvavgrelerror = ar->cvavgrelerror/nacv; } ae_frame_leave(_state); } void _linearmodel_init(void* _p, ae_state *_state) { linearmodel *p = (linearmodel*)_p; ae_touch_ptr((void*)p); ae_vector_init(&p->w, 0, DT_REAL, _state); } void _linearmodel_init_copy(void* _dst, void* _src, ae_state *_state) { linearmodel *dst = (linearmodel*)_dst; linearmodel *src = (linearmodel*)_src; ae_vector_init_copy(&dst->w, &src->w, _state); } void _linearmodel_clear(void* _p) { linearmodel *p = (linearmodel*)_p; ae_touch_ptr((void*)p); ae_vector_clear(&p->w); } void _linearmodel_destroy(void* _p) { linearmodel *p = (linearmodel*)_p; ae_touch_ptr((void*)p); ae_vector_destroy(&p->w); } void _lrreport_init(void* _p, ae_state *_state) { lrreport *p = (lrreport*)_p; ae_touch_ptr((void*)p); ae_matrix_init(&p->c, 0, 0, DT_REAL, _state); ae_vector_init(&p->cvdefects, 0, DT_INT, _state); } void _lrreport_init_copy(void* _dst, void* _src, ae_state *_state) { lrreport *dst = (lrreport*)_dst; lrreport *src = (lrreport*)_src; ae_matrix_init_copy(&dst->c, &src->c, _state); dst->rmserror = src->rmserror; dst->avgerror = src->avgerror; dst->avgrelerror = src->avgrelerror; dst->cvrmserror = src->cvrmserror; dst->cvavgerror = src->cvavgerror; dst->cvavgrelerror = src->cvavgrelerror; dst->ncvdefects = src->ncvdefects; ae_vector_init_copy(&dst->cvdefects, &src->cvdefects, _state); } void _lrreport_clear(void* _p) { lrreport *p = (lrreport*)_p; ae_touch_ptr((void*)p); ae_matrix_clear(&p->c); ae_vector_clear(&p->cvdefects); } void _lrreport_destroy(void* _p) { lrreport *p = (lrreport*)_p; ae_touch_ptr((void*)p); ae_matrix_destroy(&p->c); ae_vector_destroy(&p->cvdefects); } /************************************************************************* Filters: simple moving averages (unsymmetric). This filter replaces array by results of SMA(K) filter. SMA(K) is defined as filter which averages at most K previous points (previous - not points AROUND central point) - or less, in case of the first K-1 points. INPUT PARAMETERS: X - array[N], array to process. It can be larger than N, in this case only first N points are processed. N - points count, N>=0 K - K>=1 (K can be larger than N , such cases will be correctly handled). Window width. K=1 corresponds to identity transformation (nothing changes). OUTPUT PARAMETERS: X - array, whose first N elements were processed with SMA(K) NOTE 1: this function uses efficient in-place algorithm which does not allocate temporary arrays. NOTE 2: this algorithm makes only one pass through array and uses running sum to speed-up calculation of the averages. Additional measures are taken to ensure that running sum on a long sequence of zero elements will be correctly reset to zero even in the presence of round-off error. NOTE 3: this is unsymmetric version of the algorithm, which does NOT averages points after the current one. Only X[i], X[i-1], ... are used when calculating new value of X[i]. We should also note that this algorithm uses BOTH previous points and current one, i.e. new value of X[i] depends on BOTH previous point and X[i] itself. 
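For example, SMA(2) applied to X = [1, 2, 3, 4] gives [1.0, 1.5, 2.5, 3.5]:
the first element is averaged over the single point available to it, each
subsequent element over the current point and one preceding point.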
-- ALGLIB -- Copyright 25.10.2011 by Bochkanov Sergey *************************************************************************/ void filtersma(/* Real */ ae_vector* x, ae_int_t n, ae_int_t k, ae_state *_state) { ae_int_t i; double runningsum; double termsinsum; ae_int_t zeroprefix; double v; ae_assert(n>=0, "FilterSMA: N<0", _state); ae_assert(x->cnt>=n, "FilterSMA: Length(X)=1, "FilterSMA: K<1", _state); /* * Quick exit, if necessary */ if( n<=1||k==1 ) { return; } /* * Prepare variables (see below for explanation) */ runningsum = 0.0; termsinsum = (double)(0); for(i=ae_maxint(n-k, 0, _state); i<=n-1; i++) { runningsum = runningsum+x->ptr.p_double[i]; termsinsum = termsinsum+1; } i = ae_maxint(n-k, 0, _state); zeroprefix = 0; while(i<=n-1&&ae_fp_eq(x->ptr.p_double[i],(double)(0))) { zeroprefix = zeroprefix+1; i = i+1; } /* * General case: we assume that N>1 and K>1 * * Make one pass through all elements. At the beginning of * the iteration we have: * * I element being processed * * RunningSum current value of the running sum * (including I-th element) * * TermsInSum number of terms in sum, 0<=TermsInSum<=K * * ZeroPrefix length of the sequence of zero elements * which starts at X[I-K+1] and continues towards X[I]. * Equal to zero in case X[I-K+1] is non-zero. * This value is used to make RunningSum exactly zero * when it follows from the problem properties. */ for(i=n-1; i>=0; i--) { /* * Store new value of X[i], save old value in V */ v = x->ptr.p_double[i]; x->ptr.p_double[i] = runningsum/termsinsum; /* * Update RunningSum and TermsInSum */ if( i-k>=0 ) { runningsum = runningsum-v+x->ptr.p_double[i-k]; } else { runningsum = runningsum-v; termsinsum = termsinsum-1; } /* * Update ZeroPrefix. * In case we have ZeroPrefix=TermsInSum, * RunningSum is reset to zero. */ if( i-k>=0 ) { if( ae_fp_neq(x->ptr.p_double[i-k],(double)(0)) ) { zeroprefix = 0; } else { zeroprefix = ae_minint(zeroprefix+1, k, _state); } } else { zeroprefix = ae_minint(zeroprefix, i+1, _state); } if( ae_fp_eq((double)(zeroprefix),termsinsum) ) { runningsum = (double)(0); } } } /************************************************************************* Filters: exponential moving averages. This filter replaces array by results of EMA(alpha) filter. EMA(alpha) is defined as filter which replaces X[] by S[]: S[0] = X[0] S[t] = alpha*X[t] + (1-alpha)*S[t-1] INPUT PARAMETERS: X - array[N], array to process. It can be larger than N, in this case only first N points are processed. N - points count, N>=0 alpha - 0=0, "FilterEMA: N<0", _state); ae_assert(x->cnt>=n, "FilterEMA: Length(X)1", _state); /* * Quick exit, if necessary */ if( n<=1||ae_fp_eq(alpha,(double)(1)) ) { return; } /* * Process */ for(i=1; i<=n-1; i++) { x->ptr.p_double[i] = alpha*x->ptr.p_double[i]+(1-alpha)*x->ptr.p_double[i-1]; } } /************************************************************************* Filters: linear regression moving averages. This filter replaces array by results of LRMA(K) filter. LRMA(K) is defined as filter which, for each data point, builds linear regression model using K prevous points (point itself is included in these K points) and calculates value of this linear model at the point in question. INPUT PARAMETERS: X - array[N], array to process. It can be larger than N, in this case only first N points are processed. N - points count, N>=0 K - K>=1 (K can be larger than N , such cases will be correctly handled). Window width. K=1 corresponds to identity transformation (nothing changes). 
OUTPUT PARAMETERS: X - array, whose first N elements were processed with SMA(K) NOTE 1: this function uses efficient in-place algorithm which does not allocate temporary arrays. NOTE 2: this algorithm makes only one pass through array and uses running sum to speed-up calculation of the averages. Additional measures are taken to ensure that running sum on a long sequence of zero elements will be correctly reset to zero even in the presence of round-off error. NOTE 3: this is unsymmetric version of the algorithm, which does NOT averages points after the current one. Only X[i], X[i-1], ... are used when calculating new value of X[i]. We should also note that this algorithm uses BOTH previous points and current one, i.e. new value of X[i] depends on BOTH previous point and X[i] itself. -- ALGLIB -- Copyright 25.10.2011 by Bochkanov Sergey *************************************************************************/ void filterlrma(/* Real */ ae_vector* x, ae_int_t n, ae_int_t k, ae_state *_state) { ae_frame _frame_block; ae_int_t i; ae_int_t m; ae_matrix xy; ae_vector s; ae_int_t info; double a; double b; double vara; double varb; double covab; double corrab; double p; ae_frame_make(_state, &_frame_block); ae_matrix_init(&xy, 0, 0, DT_REAL, _state); ae_vector_init(&s, 0, DT_REAL, _state); ae_assert(n>=0, "FilterLRMA: N<0", _state); ae_assert(x->cnt>=n, "FilterLRMA: Length(X)=1, "FilterLRMA: K<1", _state); /* * Quick exit, if necessary: * * either N is equal to 1 (nothing to average) * * or K is 1 (only point itself is used) or 2 (model is too simple, * we will always get identity transformation) */ if( n<=1||k<=2 ) { ae_frame_leave(_state); return; } /* * General case: K>2, N>1. * We do not process points with I<2 because first two points (I=0 and I=1) will be * left unmodified by LRMA filter in any case. */ ae_matrix_set_length(&xy, k, 2, _state); ae_vector_set_length(&s, k, _state); for(i=0; i<=k-1; i++) { xy.ptr.pp_double[i][0] = (double)(i); s.ptr.p_double[i] = 1.0; } for(i=n-1; i>=2; i--) { m = ae_minint(i+1, k, _state); ae_v_move(&xy.ptr.pp_double[0][1], xy.stride, &x->ptr.p_double[i-m+1], 1, ae_v_len(0,m-1)); lrlines(&xy, &s, m, &info, &a, &b, &vara, &varb, &covab, &corrab, &p, _state); ae_assert(info==1, "FilterLRMA: internal error", _state); x->ptr.p_double[i] = a+b*(m-1); } ae_frame_leave(_state); } /************************************************************************* This subroutine trains logit model. INPUT PARAMETERS: XY - training set, array[0..NPoints-1,0..NVars] First NVars columns store values of independent variables, next column stores number of class (from 0 to NClasses-1) which dataset element belongs to. Fractional values are rounded to nearest integer. NPoints - training set size, NPoints>=1 NVars - number of independent variables, NVars>=1 NClasses - number of classes, NClasses>=2 OUTPUT PARAMETERS: Info - return code: * -2, if there is a point with class number outside of [0..NClasses-1]. 
* -1, if incorrect parameters was passed (NPointsptr.pp_double[i][nvars], _state)<0||ae_round(xy->ptr.pp_double[i][nvars], _state)>=nclasses ) { *info = -2; ae_frame_leave(_state); return; } } *info = 1; /* * Initialize data */ rep->ngrad = 0; rep->nhess = 0; /* * Allocate array */ offs = 5; ssize = 5+(nvars+1)*(nclasses-1)+nclasses; ae_vector_set_length(&lm->w, ssize-1+1, _state); lm->w.ptr.p_double[0] = (double)(ssize); lm->w.ptr.p_double[1] = (double)(logit_logitvnum); lm->w.ptr.p_double[2] = (double)(nvars); lm->w.ptr.p_double[3] = (double)(nclasses); lm->w.ptr.p_double[4] = (double)(offs); /* * Degenerate case: all outputs are equal */ allsame = ae_true; for(i=1; i<=npoints-1; i++) { if( ae_round(xy->ptr.pp_double[i][nvars], _state)!=ae_round(xy->ptr.pp_double[i-1][nvars], _state) ) { allsame = ae_false; } } if( allsame ) { for(i=0; i<=(nvars+1)*(nclasses-1)-1; i++) { lm->w.ptr.p_double[offs+i] = (double)(0); } v = -2*ae_log(ae_minrealnumber, _state); k = ae_round(xy->ptr.pp_double[0][nvars], _state); if( k==nclasses-1 ) { for(i=0; i<=nclasses-2; i++) { lm->w.ptr.p_double[offs+i*(nvars+1)+nvars] = -v; } } else { for(i=0; i<=nclasses-2; i++) { if( i==k ) { lm->w.ptr.p_double[offs+i*(nvars+1)+nvars] = v; } else { lm->w.ptr.p_double[offs+i*(nvars+1)+nvars] = (double)(0); } } } ae_frame_leave(_state); return; } /* * General case. * Prepare task and network. Allocate space. */ mlpcreatec0(nvars, nclasses, &network, _state); mlpinitpreprocessor(&network, xy, npoints, _state); mlpproperties(&network, &nin, &nout, &wcount, _state); for(i=0; i<=wcount-1; i++) { network.weights.ptr.p_double[i] = (2*ae_randomreal(_state)-1)/nvars; } ae_vector_set_length(&g, wcount-1+1, _state); ae_matrix_set_length(&h, wcount-1+1, wcount-1+1, _state); ae_vector_set_length(&wbase, wcount-1+1, _state); ae_vector_set_length(&wdir, wcount-1+1, _state); ae_vector_set_length(&work, wcount-1+1, _state); /* * First stage: optimize in gradient direction. 
*/ for(k=0; k<=wcount/3+10; k++) { /* * Calculate gradient in starting point */ mlpgradnbatch(&network, xy, npoints, &e, &g, _state); v = ae_v_dotproduct(&network.weights.ptr.p_double[0], 1, &network.weights.ptr.p_double[0], 1, ae_v_len(0,wcount-1)); e = e+0.5*decay*v; ae_v_addd(&g.ptr.p_double[0], 1, &network.weights.ptr.p_double[0], 1, ae_v_len(0,wcount-1), decay); rep->ngrad = rep->ngrad+1; /* * Setup optimization scheme */ ae_v_moveneg(&wdir.ptr.p_double[0], 1, &g.ptr.p_double[0], 1, ae_v_len(0,wcount-1)); v = ae_v_dotproduct(&wdir.ptr.p_double[0], 1, &wdir.ptr.p_double[0], 1, ae_v_len(0,wcount-1)); wstep = ae_sqrt(v, _state); v = 1/ae_sqrt(v, _state); ae_v_muld(&wdir.ptr.p_double[0], 1, ae_v_len(0,wcount-1), v); mcstage = 0; logit_mnlmcsrch(wcount, &network.weights, &e, &g, &wdir, &wstep, &mcinfo, &mcnfev, &work, &mcstate, &mcstage, _state); while(mcstage!=0) { mlpgradnbatch(&network, xy, npoints, &e, &g, _state); v = ae_v_dotproduct(&network.weights.ptr.p_double[0], 1, &network.weights.ptr.p_double[0], 1, ae_v_len(0,wcount-1)); e = e+0.5*decay*v; ae_v_addd(&g.ptr.p_double[0], 1, &network.weights.ptr.p_double[0], 1, ae_v_len(0,wcount-1), decay); rep->ngrad = rep->ngrad+1; logit_mnlmcsrch(wcount, &network.weights, &e, &g, &wdir, &wstep, &mcinfo, &mcnfev, &work, &mcstate, &mcstage, _state); } } /* * Second stage: use Hessian when we are close to the minimum */ for(;;) { /* * Calculate and update E/G/H */ mlphessiannbatch(&network, xy, npoints, &e, &g, &h, _state); v = ae_v_dotproduct(&network.weights.ptr.p_double[0], 1, &network.weights.ptr.p_double[0], 1, ae_v_len(0,wcount-1)); e = e+0.5*decay*v; ae_v_addd(&g.ptr.p_double[0], 1, &network.weights.ptr.p_double[0], 1, ae_v_len(0,wcount-1), decay); for(k=0; k<=wcount-1; k++) { h.ptr.pp_double[k][k] = h.ptr.pp_double[k][k]+decay; } rep->nhess = rep->nhess+1; /* * Select step direction * NOTE: it is important to use lower-triangle Cholesky * factorization since it is much faster than higher-triangle version. */ spd = spdmatrixcholesky(&h, wcount, ae_false, _state); spdmatrixcholeskysolve(&h, wcount, ae_false, &g, &solverinfo, &solverrep, &wdir, _state); spd = solverinfo>0; if( spd ) { /* * H is positive definite. * Step in Newton direction. */ ae_v_muld(&wdir.ptr.p_double[0], 1, ae_v_len(0,wcount-1), -1); spd = ae_true; } else { /* * H is indefinite. * Step in gradient direction. 
*/ ae_v_moveneg(&wdir.ptr.p_double[0], 1, &g.ptr.p_double[0], 1, ae_v_len(0,wcount-1)); spd = ae_false; } /* * Optimize in WDir direction */ v = ae_v_dotproduct(&wdir.ptr.p_double[0], 1, &wdir.ptr.p_double[0], 1, ae_v_len(0,wcount-1)); wstep = ae_sqrt(v, _state); v = 1/ae_sqrt(v, _state); ae_v_muld(&wdir.ptr.p_double[0], 1, ae_v_len(0,wcount-1), v); mcstage = 0; logit_mnlmcsrch(wcount, &network.weights, &e, &g, &wdir, &wstep, &mcinfo, &mcnfev, &work, &mcstate, &mcstage, _state); while(mcstage!=0) { mlpgradnbatch(&network, xy, npoints, &e, &g, _state); v = ae_v_dotproduct(&network.weights.ptr.p_double[0], 1, &network.weights.ptr.p_double[0], 1, ae_v_len(0,wcount-1)); e = e+0.5*decay*v; ae_v_addd(&g.ptr.p_double[0], 1, &network.weights.ptr.p_double[0], 1, ae_v_len(0,wcount-1), decay); rep->ngrad = rep->ngrad+1; logit_mnlmcsrch(wcount, &network.weights, &e, &g, &wdir, &wstep, &mcinfo, &mcnfev, &work, &mcstate, &mcstage, _state); } if( spd&&((mcinfo==2||mcinfo==4)||mcinfo==6) ) { break; } } /* * Convert from NN format to MNL format */ ae_v_move(&lm->w.ptr.p_double[offs], 1, &network.weights.ptr.p_double[0], 1, ae_v_len(offs,offs+wcount-1)); for(k=0; k<=nvars-1; k++) { for(i=0; i<=nclasses-2; i++) { s = network.columnsigmas.ptr.p_double[k]; if( ae_fp_eq(s,(double)(0)) ) { s = (double)(1); } j = offs+(nvars+1)*i; v = lm->w.ptr.p_double[j+k]; lm->w.ptr.p_double[j+k] = v/s; lm->w.ptr.p_double[j+nvars] = lm->w.ptr.p_double[j+nvars]+v*network.columnmeans.ptr.p_double[k]/s; } } for(k=0; k<=nclasses-2; k++) { lm->w.ptr.p_double[offs+(nvars+1)*k+nvars] = -lm->w.ptr.p_double[offs+(nvars+1)*k+nvars]; } ae_frame_leave(_state); } /************************************************************************* Procesing INPUT PARAMETERS: LM - logit model, passed by non-constant reference (some fields of structure are used as temporaries when calculating model output). X - input vector, array[0..NVars-1]. Y - (possibly) preallocated buffer; if size of Y is less than NClasses, it will be reallocated.If it is large enough, it is NOT reallocated, so we can save some time on reallocation. OUTPUT PARAMETERS: Y - result, array[0..NClasses-1] Vector of posterior probabilities for classification task. -- ALGLIB -- Copyright 10.09.2008 by Bochkanov Sergey *************************************************************************/ void mnlprocess(logitmodel* lm, /* Real */ ae_vector* x, /* Real */ ae_vector* y, ae_state *_state) { ae_int_t nvars; ae_int_t nclasses; ae_int_t offs; ae_int_t i; ae_int_t i1; double s; ae_assert(ae_fp_eq(lm->w.ptr.p_double[1],(double)(logit_logitvnum)), "MNLProcess: unexpected model version", _state); nvars = ae_round(lm->w.ptr.p_double[2], _state); nclasses = ae_round(lm->w.ptr.p_double[3], _state); offs = ae_round(lm->w.ptr.p_double[4], _state); logit_mnliexp(&lm->w, x, _state); s = (double)(0); i1 = offs+(nvars+1)*(nclasses-1); for(i=i1; i<=i1+nclasses-1; i++) { s = s+lm->w.ptr.p_double[i]; } if( y->cntptr.p_double[i] = lm->w.ptr.p_double[i1+i]/s; } } /************************************************************************* 'interactive' variant of MNLProcess for languages like Python which support constructs like "Y = MNLProcess(LM,X)" and interactive mode of the interpreter This function allocates new array on each call, so it is significantly slower than its 'non-interactive' counterpart, but it is more convenient when you call it from command line. 
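MNLProcess fills Y with posterior probabilities, so the usual next step in a classification task is to take the index of the largest component, which is also what MNLClsError does internally further below. A standalone C++ sketch of that post-processing step (illustrative only, made-up name):

#include <vector>
#include <cstddef>

// Given posterior probabilities y[0..nclasses-1] produced by a call such as
// MNLProcess, return the index of the most probable class.
std::size_t most_probable_class(const std::vector<double>& y)
{
    std::size_t best = 0;
    for (std::size_t j = 1; j < y.size(); ++j)
        if (y[j] > y[best])
            best = j;
    return best;
}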
-- ALGLIB -- Copyright 10.09.2008 by Bochkanov Sergey *************************************************************************/ void mnlprocessi(logitmodel* lm, /* Real */ ae_vector* x, /* Real */ ae_vector* y, ae_state *_state) { ae_vector_clear(y); mnlprocess(lm, x, y, _state); } /************************************************************************* Unpacks coefficients of logit model. Logit model have form: P(class=i) = S(i) / (S(0) + S(1) + ... +S(M-1)) S(i) = Exp(A[i,0]*X[0] + ... + A[i,N-1]*X[N-1] + A[i,N]), when iw.ptr.p_double[1],(double)(logit_logitvnum)), "MNLUnpack: unexpected model version", _state); *nvars = ae_round(lm->w.ptr.p_double[2], _state); *nclasses = ae_round(lm->w.ptr.p_double[3], _state); offs = ae_round(lm->w.ptr.p_double[4], _state); ae_matrix_set_length(a, *nclasses-2+1, *nvars+1, _state); for(i=0; i<=*nclasses-2; i++) { ae_v_move(&a->ptr.pp_double[i][0], 1, &lm->w.ptr.p_double[offs+i*(*nvars+1)], 1, ae_v_len(0,*nvars)); } } /************************************************************************* "Packs" coefficients and creates logit model in ALGLIB format (MNLUnpack reversed). INPUT PARAMETERS: A - model (see MNLUnpack) NVars - number of independent variables NClasses - number of classes OUTPUT PARAMETERS: LM - logit model. -- ALGLIB -- Copyright 10.09.2008 by Bochkanov Sergey *************************************************************************/ void mnlpack(/* Real */ ae_matrix* a, ae_int_t nvars, ae_int_t nclasses, logitmodel* lm, ae_state *_state) { ae_int_t offs; ae_int_t i; ae_int_t ssize; _logitmodel_clear(lm); offs = 5; ssize = 5+(nvars+1)*(nclasses-1)+nclasses; ae_vector_set_length(&lm->w, ssize-1+1, _state); lm->w.ptr.p_double[0] = (double)(ssize); lm->w.ptr.p_double[1] = (double)(logit_logitvnum); lm->w.ptr.p_double[2] = (double)(nvars); lm->w.ptr.p_double[3] = (double)(nclasses); lm->w.ptr.p_double[4] = (double)(offs); for(i=0; i<=nclasses-2; i++) { ae_v_move(&lm->w.ptr.p_double[offs+i*(nvars+1)], 1, &a->ptr.pp_double[i][0], 1, ae_v_len(offs+i*(nvars+1),offs+i*(nvars+1)+nvars)); } } /************************************************************************* Copying of LogitModel strucure INPUT PARAMETERS: LM1 - original OUTPUT PARAMETERS: LM2 - copy -- ALGLIB -- Copyright 15.03.2009 by Bochkanov Sergey *************************************************************************/ void mnlcopy(logitmodel* lm1, logitmodel* lm2, ae_state *_state) { ae_int_t k; _logitmodel_clear(lm2); k = ae_round(lm1->w.ptr.p_double[0], _state); ae_vector_set_length(&lm2->w, k-1+1, _state); ae_v_move(&lm2->w.ptr.p_double[0], 1, &lm1->w.ptr.p_double[0], 1, ae_v_len(0,k-1)); } /************************************************************************* Average cross-entropy (in bits per element) on the test set INPUT PARAMETERS: LM - logit model XY - test set NPoints - test set size RESULT: CrossEntropy/(NPoints*ln(2)). 
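The RESULT formula above converts the accumulated cross-entropy from nats to bits by dividing by ln(2) and averages it over the test set. A standalone C++ sketch of the same metric for a set of predicted posteriors and integer class labels (illustrative only; the zero-probability clamp mirrors the minrealnumber fallback visible in MNLAvgCE below):

#include <vector>
#include <cstddef>
#include <cmath>
#include <limits>

// Average cross-entropy in bits per element:
// sum of -log2(P(true class)) over all points, divided by the number of points.
double avg_cross_entropy_bits(const std::vector<std::vector<double>>& posteriors,
                              const std::vector<std::size_t>& labels)
{
    double ce = 0.0;
    for (std::size_t i = 0; i < posteriors.size(); ++i)
    {
        double p = posteriors[i][labels[i]];
        if (p <= 0.0)
            p = std::numeric_limits<double>::min();   // avoid log(0)
        ce -= std::log(p);
    }
    return ce / (static_cast<double>(posteriors.size()) * std::log(2.0));
}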
-- ALGLIB -- Copyright 10.09.2008 by Bochkanov Sergey *************************************************************************/ double mnlavgce(logitmodel* lm, /* Real */ ae_matrix* xy, ae_int_t npoints, ae_state *_state) { ae_frame _frame_block; ae_int_t nvars; ae_int_t nclasses; ae_int_t i; ae_vector workx; ae_vector worky; double result; ae_frame_make(_state, &_frame_block); ae_vector_init(&workx, 0, DT_REAL, _state); ae_vector_init(&worky, 0, DT_REAL, _state); ae_assert(ae_fp_eq(lm->w.ptr.p_double[1],(double)(logit_logitvnum)), "MNLClsError: unexpected model version", _state); nvars = ae_round(lm->w.ptr.p_double[2], _state); nclasses = ae_round(lm->w.ptr.p_double[3], _state); ae_vector_set_length(&workx, nvars-1+1, _state); ae_vector_set_length(&worky, nclasses-1+1, _state); result = (double)(0); for(i=0; i<=npoints-1; i++) { ae_assert(ae_round(xy->ptr.pp_double[i][nvars], _state)>=0&&ae_round(xy->ptr.pp_double[i][nvars], _state)ptr.pp_double[i][0], 1, ae_v_len(0,nvars-1)); mnlprocess(lm, &workx, &worky, _state); if( ae_fp_greater(worky.ptr.p_double[ae_round(xy->ptr.pp_double[i][nvars], _state)],(double)(0)) ) { result = result-ae_log(worky.ptr.p_double[ae_round(xy->ptr.pp_double[i][nvars], _state)], _state); } else { result = result-ae_log(ae_minrealnumber, _state); } } result = result/(npoints*ae_log((double)(2), _state)); ae_frame_leave(_state); return result; } /************************************************************************* Relative classification error on the test set INPUT PARAMETERS: LM - logit model XY - test set NPoints - test set size RESULT: percent of incorrectly classified cases. -- ALGLIB -- Copyright 10.09.2008 by Bochkanov Sergey *************************************************************************/ double mnlrelclserror(logitmodel* lm, /* Real */ ae_matrix* xy, ae_int_t npoints, ae_state *_state) { double result; result = (double)mnlclserror(lm, xy, npoints, _state)/(double)npoints; return result; } /************************************************************************* RMS error on the test set INPUT PARAMETERS: LM - logit model XY - test set NPoints - test set size RESULT: root mean square error (error when estimating posterior probabilities). -- ALGLIB -- Copyright 30.08.2008 by Bochkanov Sergey *************************************************************************/ double mnlrmserror(logitmodel* lm, /* Real */ ae_matrix* xy, ae_int_t npoints, ae_state *_state) { double relcls; double avgce; double rms; double avg; double avgrel; double result; ae_assert(ae_round(lm->w.ptr.p_double[1], _state)==logit_logitvnum, "MNLRMSError: Incorrect MNL version!", _state); logit_mnlallerrors(lm, xy, npoints, &relcls, &avgce, &rms, &avg, &avgrel, _state); result = rms; return result; } /************************************************************************* Average error on the test set INPUT PARAMETERS: LM - logit model XY - test set NPoints - test set size RESULT: average error (error when estimating posterior probabilities). 
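For reference, the model form documented for MNLUnpack above can be evaluated directly from the unpacked coefficient matrix A: S(i) = exp(A[i,0]*X[0] + ... + A[i,N-1]*X[N-1] + A[i,N]) for the first M-1 classes, S(M-1) = 1 for the reference class (the internal code sets the last linear output to zero), and P(class=i) = S(i)/(S(0)+...+S(M-1)). A standalone C++ sketch of that evaluation (illustrative only; production code would call MNLProcess, which also applies the anti-overflow shift described further below):

#include <vector>
#include <cstddef>
#include <cmath>

// Evaluate class probabilities from unpacked logit coefficients.
// a has nclasses-1 rows and nvars+1 columns; the last column is the intercept.
std::vector<double> logit_probabilities(const std::vector<std::vector<double>>& a,
                                        const std::vector<double>& x)
{
    std::size_t m = a.size() + 1;       // number of classes
    std::size_t n = x.size();           // number of independent variables
    std::vector<double> s(m, 1.0);      // S(M-1) = 1: last class is the reference
    for (std::size_t i = 0; i + 1 < m; ++i)
    {
        double lin = a[i][n];           // intercept A[i,N]
        for (std::size_t j = 0; j < n; ++j)
            lin += a[i][j] * x[j];
        s[i] = std::exp(lin);
    }
    double total = 0.0;
    for (std::size_t i = 0; i < m; ++i)
        total += s[i];
    for (std::size_t i = 0; i < m; ++i)
        s[i] /= total;                  // P(class=i) = S(i) / sum of S
    return s;
}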
-- ALGLIB -- Copyright 30.08.2008 by Bochkanov Sergey *************************************************************************/ double mnlavgerror(logitmodel* lm, /* Real */ ae_matrix* xy, ae_int_t npoints, ae_state *_state) { double relcls; double avgce; double rms; double avg; double avgrel; double result; ae_assert(ae_round(lm->w.ptr.p_double[1], _state)==logit_logitvnum, "MNLRMSError: Incorrect MNL version!", _state); logit_mnlallerrors(lm, xy, npoints, &relcls, &avgce, &rms, &avg, &avgrel, _state); result = avg; return result; } /************************************************************************* Average relative error on the test set INPUT PARAMETERS: LM - logit model XY - test set NPoints - test set size RESULT: average relative error (error when estimating posterior probabilities). -- ALGLIB -- Copyright 30.08.2008 by Bochkanov Sergey *************************************************************************/ double mnlavgrelerror(logitmodel* lm, /* Real */ ae_matrix* xy, ae_int_t ssize, ae_state *_state) { double relcls; double avgce; double rms; double avg; double avgrel; double result; ae_assert(ae_round(lm->w.ptr.p_double[1], _state)==logit_logitvnum, "MNLRMSError: Incorrect MNL version!", _state); logit_mnlallerrors(lm, xy, ssize, &relcls, &avgce, &rms, &avg, &avgrel, _state); result = avgrel; return result; } /************************************************************************* Classification error on test set = MNLRelClsError*NPoints -- ALGLIB -- Copyright 10.09.2008 by Bochkanov Sergey *************************************************************************/ ae_int_t mnlclserror(logitmodel* lm, /* Real */ ae_matrix* xy, ae_int_t npoints, ae_state *_state) { ae_frame _frame_block; ae_int_t nvars; ae_int_t nclasses; ae_int_t i; ae_int_t j; ae_vector workx; ae_vector worky; ae_int_t nmax; ae_int_t result; ae_frame_make(_state, &_frame_block); ae_vector_init(&workx, 0, DT_REAL, _state); ae_vector_init(&worky, 0, DT_REAL, _state); ae_assert(ae_fp_eq(lm->w.ptr.p_double[1],(double)(logit_logitvnum)), "MNLClsError: unexpected model version", _state); nvars = ae_round(lm->w.ptr.p_double[2], _state); nclasses = ae_round(lm->w.ptr.p_double[3], _state); ae_vector_set_length(&workx, nvars-1+1, _state); ae_vector_set_length(&worky, nclasses-1+1, _state); result = 0; for(i=0; i<=npoints-1; i++) { /* * Process */ ae_v_move(&workx.ptr.p_double[0], 1, &xy->ptr.pp_double[i][0], 1, ae_v_len(0,nvars-1)); mnlprocess(lm, &workx, &worky, _state); /* * Logit version of the answer */ nmax = 0; for(j=0; j<=nclasses-1; j++) { if( ae_fp_greater(worky.ptr.p_double[j],worky.ptr.p_double[nmax]) ) { nmax = j; } } /* * compare */ if( nmax!=ae_round(xy->ptr.pp_double[i][nvars], _state) ) { result = result+1; } } ae_frame_leave(_state); return result; } /************************************************************************* Internal subroutine. Places exponents of the anti-overflow shifted internal linear outputs into the service part of the W array. 
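The "anti-overflow shift" referred to here is the standard trick of subtracting the maximum linear output before exponentiation: the normalized probabilities are unchanged by a common shift, while exp() stays within range. A standalone C++ sketch of the idea (illustrative only; the actual routine, logit_mnliexp below, works directly on the packed W array):

#include <vector>
#include <cmath>
#include <algorithm>

// Exponentiate linear outputs safely: exp(v[i] - max(v)) cannot overflow, and
// the subsequent normalization by the sum is unaffected by the common shift.
void shifted_exponentials(std::vector<double>& v)
{
    if (v.empty())
        return;
    double mx = *std::max_element(v.begin(), v.end());
    for (double& vi : v)
        vi = std::exp(vi - mx);
}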
*************************************************************************/ static void logit_mnliexp(/* Real */ ae_vector* w, /* Real */ ae_vector* x, ae_state *_state) { ae_int_t nvars; ae_int_t nclasses; ae_int_t offs; ae_int_t i; ae_int_t i1; double v; double mx; ae_assert(ae_fp_eq(w->ptr.p_double[1],(double)(logit_logitvnum)), "LOGIT: unexpected model version", _state); nvars = ae_round(w->ptr.p_double[2], _state); nclasses = ae_round(w->ptr.p_double[3], _state); offs = ae_round(w->ptr.p_double[4], _state); i1 = offs+(nvars+1)*(nclasses-1); for(i=0; i<=nclasses-2; i++) { v = ae_v_dotproduct(&w->ptr.p_double[offs+i*(nvars+1)], 1, &x->ptr.p_double[0], 1, ae_v_len(offs+i*(nvars+1),offs+i*(nvars+1)+nvars-1)); w->ptr.p_double[i1+i] = v+w->ptr.p_double[offs+i*(nvars+1)+nvars]; } w->ptr.p_double[i1+nclasses-1] = (double)(0); mx = (double)(0); for(i=i1; i<=i1+nclasses-1; i++) { mx = ae_maxreal(mx, w->ptr.p_double[i], _state); } for(i=i1; i<=i1+nclasses-1; i++) { w->ptr.p_double[i] = ae_exp(w->ptr.p_double[i]-mx, _state); } } /************************************************************************* Calculation of all types of errors -- ALGLIB -- Copyright 30.08.2008 by Bochkanov Sergey *************************************************************************/ static void logit_mnlallerrors(logitmodel* lm, /* Real */ ae_matrix* xy, ae_int_t npoints, double* relcls, double* avgce, double* rms, double* avg, double* avgrel, ae_state *_state) { ae_frame _frame_block; ae_int_t nvars; ae_int_t nclasses; ae_int_t i; ae_vector buf; ae_vector workx; ae_vector y; ae_vector dy; ae_frame_make(_state, &_frame_block); *relcls = 0; *avgce = 0; *rms = 0; *avg = 0; *avgrel = 0; ae_vector_init(&buf, 0, DT_REAL, _state); ae_vector_init(&workx, 0, DT_REAL, _state); ae_vector_init(&y, 0, DT_REAL, _state); ae_vector_init(&dy, 0, DT_REAL, _state); ae_assert(ae_round(lm->w.ptr.p_double[1], _state)==logit_logitvnum, "MNL unit: Incorrect MNL version!", _state); nvars = ae_round(lm->w.ptr.p_double[2], _state); nclasses = ae_round(lm->w.ptr.p_double[3], _state); ae_vector_set_length(&workx, nvars-1+1, _state); ae_vector_set_length(&y, nclasses-1+1, _state); ae_vector_set_length(&dy, 0+1, _state); dserrallocate(nclasses, &buf, _state); for(i=0; i<=npoints-1; i++) { ae_v_move(&workx.ptr.p_double[0], 1, &xy->ptr.pp_double[i][0], 1, ae_v_len(0,nvars-1)); mnlprocess(lm, &workx, &y, _state); dy.ptr.p_double[0] = xy->ptr.pp_double[i][nvars]; dserraccumulate(&buf, &y, &dy, _state); } dserrfinish(&buf, _state); *relcls = buf.ptr.p_double[0]; *avgce = buf.ptr.p_double[1]; *rms = buf.ptr.p_double[2]; *avg = buf.ptr.p_double[3]; *avgrel = buf.ptr.p_double[4]; ae_frame_leave(_state); } /************************************************************************* THE PURPOSE OF MCSRCH IS TO FIND A STEP WHICH SATISFIES A SUFFICIENT DECREASE CONDITION AND A CURVATURE CONDITION. AT EACH STAGE THE SUBROUTINE UPDATES AN INTERVAL OF UNCERTAINTY WITH ENDPOINTS STX AND STY. THE INTERVAL OF UNCERTAINTY IS INITIALLY CHOSEN SO THAT IT CONTAINS A MINIMIZER OF THE MODIFIED FUNCTION F(X+STP*S) - F(X) - FTOL*STP*(GRADF(X)'S). IF A STEP IS OBTAINED FOR WHICH THE MODIFIED FUNCTION HAS A NONPOSITIVE FUNCTION VALUE AND NONNEGATIVE DERIVATIVE, THEN THE INTERVAL OF UNCERTAINTY IS CHOSEN SO THAT IT CONTAINS A MINIMIZER OF F(X+STP*S). THE ALGORITHM IS DESIGNED TO FIND A STEP WHICH SATISFIES THE SUFFICIENT DECREASE CONDITION F(X+STP*S) .LE. F(X) + FTOL*STP*(GRADF(X)'S), AND THE CURVATURE CONDITION ABS(GRADF(X+STP*S)'S)) .LE. GTOL*ABS(GRADF(X)'S). 
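The two conditions quoted above are the strong Wolfe conditions: sufficient decrease, F(X+STP*S) <= F(X) + FTOL*STP*(GRADF(X)'S), and curvature, |GRADF(X+STP*S)'S| <= GTOL*|GRADF(X)'S|. A standalone C++ sketch of testing a candidate step against them (illustrative only; the real MCSRCH logic, with its interval of uncertainty and safeguarded interpolation, is the code that follows):

#include <cmath>

// Strong Wolfe test for a candidate step length stp along direction s.
//   f0, dg0 - function value and directional derivative g(x)'s at the base point
//   f1, dg1 - function value and directional derivative at x + stp*s
// For a descent direction dg0 < 0, so ftol*stp*dg0 demands an actual decrease.
bool satisfies_strong_wolfe(double f0, double dg0, double f1, double dg1,
                            double stp, double ftol, double gtol)
{
    bool sufficient_decrease = (f1 <= f0 + ftol * stp * dg0);
    bool curvature           = (std::fabs(dg1) <= gtol * std::fabs(dg0));
    return sufficient_decrease && curvature;
}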
IF FTOL IS LESS THAN GTOL AND IF, FOR EXAMPLE, THE FUNCTION IS BOUNDED BELOW, THEN THERE IS ALWAYS A STEP WHICH SATISFIES BOTH CONDITIONS. IF NO STEP CAN BE FOUND WHICH SATISFIES BOTH CONDITIONS, THEN THE ALGORITHM USUALLY STOPS WHEN ROUNDING ERRORS PREVENT FURTHER PROGRESS. IN THIS CASE STP ONLY SATISFIES THE SUFFICIENT DECREASE CONDITION. PARAMETERS DESCRIPRION N IS A POSITIVE INTEGER INPUT VARIABLE SET TO THE NUMBER OF VARIABLES. X IS AN ARRAY OF LENGTH N. ON INPUT IT MUST CONTAIN THE BASE POINT FOR THE LINE SEARCH. ON OUTPUT IT CONTAINS X+STP*S. F IS A VARIABLE. ON INPUT IT MUST CONTAIN THE VALUE OF F AT X. ON OUTPUT IT CONTAINS THE VALUE OF F AT X + STP*S. G IS AN ARRAY OF LENGTH N. ON INPUT IT MUST CONTAIN THE GRADIENT OF F AT X. ON OUTPUT IT CONTAINS THE GRADIENT OF F AT X + STP*S. S IS AN INPUT ARRAY OF LENGTH N WHICH SPECIFIES THE SEARCH DIRECTION. STP IS A NONNEGATIVE VARIABLE. ON INPUT STP CONTAINS AN INITIAL ESTIMATE OF A SATISFACTORY STEP. ON OUTPUT STP CONTAINS THE FINAL ESTIMATE. FTOL AND GTOL ARE NONNEGATIVE INPUT VARIABLES. TERMINATION OCCURS WHEN THE SUFFICIENT DECREASE CONDITION AND THE DIRECTIONAL DERIVATIVE CONDITION ARE SATISFIED. XTOL IS A NONNEGATIVE INPUT VARIABLE. TERMINATION OCCURS WHEN THE RELATIVE WIDTH OF THE INTERVAL OF UNCERTAINTY IS AT MOST XTOL. STPMIN AND STPMAX ARE NONNEGATIVE INPUT VARIABLES WHICH SPECIFY LOWER AND UPPER BOUNDS FOR THE STEP. MAXFEV IS A POSITIVE INTEGER INPUT VARIABLE. TERMINATION OCCURS WHEN THE NUMBER OF CALLS TO FCN IS AT LEAST MAXFEV BY THE END OF AN ITERATION. INFO IS AN INTEGER OUTPUT VARIABLE SET AS FOLLOWS: INFO = 0 IMPROPER INPUT PARAMETERS. INFO = 1 THE SUFFICIENT DECREASE CONDITION AND THE DIRECTIONAL DERIVATIVE CONDITION HOLD. INFO = 2 RELATIVE WIDTH OF THE INTERVAL OF UNCERTAINTY IS AT MOST XTOL. INFO = 3 NUMBER OF CALLS TO FCN HAS REACHED MAXFEV. INFO = 4 THE STEP IS AT THE LOWER BOUND STPMIN. INFO = 5 THE STEP IS AT THE UPPER BOUND STPMAX. INFO = 6 ROUNDING ERRORS PREVENT FURTHER PROGRESS. THERE MAY NOT BE A STEP WHICH SATISFIES THE SUFFICIENT DECREASE AND CURVATURE CONDITIONS. TOLERANCES MAY BE TOO SMALL. NFEV IS AN INTEGER OUTPUT VARIABLE SET TO THE NUMBER OF CALLS TO FCN. WA IS A WORK ARRAY OF LENGTH N. ARGONNE NATIONAL LABORATORY. MINPACK PROJECT. JUNE 1983 JORGE J. MORE', DAVID J. THUENTE *************************************************************************/ static void logit_mnlmcsrch(ae_int_t n, /* Real */ ae_vector* x, double* f, /* Real */ ae_vector* g, /* Real */ ae_vector* s, double* stp, ae_int_t* info, ae_int_t* nfev, /* Real */ ae_vector* wa, logitmcstate* state, ae_int_t* stage, ae_state *_state) { double v; double p5; double p66; double zero; /* * init */ p5 = 0.5; p66 = 0.66; state->xtrapf = 4.0; zero = (double)(0); /* * Main cycle */ for(;;) { if( *stage==0 ) { /* * NEXT */ *stage = 2; continue; } if( *stage==2 ) { state->infoc = 1; *info = 0; /* * CHECK THE INPUT PARAMETERS FOR ERRORS. */ if( ((((((n<=0||ae_fp_less_eq(*stp,(double)(0)))||ae_fp_less(logit_ftol,(double)(0)))||ae_fp_less(logit_gtol,zero))||ae_fp_less(logit_xtol,zero))||ae_fp_less(logit_stpmin,zero))||ae_fp_less(logit_stpmax,logit_stpmin))||logit_maxfev<=0 ) { *stage = 0; return; } /* * COMPUTE THE INITIAL GRADIENT IN THE SEARCH DIRECTION * AND CHECK THAT S IS A DESCENT DIRECTION. */ v = ae_v_dotproduct(&g->ptr.p_double[0], 1, &s->ptr.p_double[0], 1, ae_v_len(0,n-1)); state->dginit = v; if( ae_fp_greater_eq(state->dginit,(double)(0)) ) { *stage = 0; return; } /* * INITIALIZE LOCAL VARIABLES. 
*/ state->brackt = ae_false; state->stage1 = ae_true; *nfev = 0; state->finit = *f; state->dgtest = logit_ftol*state->dginit; state->width = logit_stpmax-logit_stpmin; state->width1 = state->width/p5; ae_v_move(&wa->ptr.p_double[0], 1, &x->ptr.p_double[0], 1, ae_v_len(0,n-1)); /* * THE VARIABLES STX, FX, DGX CONTAIN THE VALUES OF THE STEP, * FUNCTION, AND DIRECTIONAL DERIVATIVE AT THE BEST STEP. * THE VARIABLES STY, FY, DGY CONTAIN THE VALUE OF THE STEP, * FUNCTION, AND DERIVATIVE AT THE OTHER ENDPOINT OF * THE INTERVAL OF UNCERTAINTY. * THE VARIABLES STP, F, DG CONTAIN THE VALUES OF THE STEP, * FUNCTION, AND DERIVATIVE AT THE CURRENT STEP. */ state->stx = (double)(0); state->fx = state->finit; state->dgx = state->dginit; state->sty = (double)(0); state->fy = state->finit; state->dgy = state->dginit; /* * NEXT */ *stage = 3; continue; } if( *stage==3 ) { /* * START OF ITERATION. * * SET THE MINIMUM AND MAXIMUM STEPS TO CORRESPOND * TO THE PRESENT INTERVAL OF UNCERTAINTY. */ if( state->brackt ) { if( ae_fp_less(state->stx,state->sty) ) { state->stmin = state->stx; state->stmax = state->sty; } else { state->stmin = state->sty; state->stmax = state->stx; } } else { state->stmin = state->stx; state->stmax = *stp+state->xtrapf*(*stp-state->stx); } /* * FORCE THE STEP TO BE WITHIN THE BOUNDS STPMAX AND STPMIN. */ if( ae_fp_greater(*stp,logit_stpmax) ) { *stp = logit_stpmax; } if( ae_fp_less(*stp,logit_stpmin) ) { *stp = logit_stpmin; } /* * IF AN UNUSUAL TERMINATION IS TO OCCUR THEN LET * STP BE THE LOWEST POINT OBTAINED SO FAR. */ if( (((state->brackt&&(ae_fp_less_eq(*stp,state->stmin)||ae_fp_greater_eq(*stp,state->stmax)))||*nfev>=logit_maxfev-1)||state->infoc==0)||(state->brackt&&ae_fp_less_eq(state->stmax-state->stmin,logit_xtol*state->stmax)) ) { *stp = state->stx; } /* * EVALUATE THE FUNCTION AND GRADIENT AT STP * AND COMPUTE THE DIRECTIONAL DERIVATIVE. */ ae_v_move(&x->ptr.p_double[0], 1, &wa->ptr.p_double[0], 1, ae_v_len(0,n-1)); ae_v_addd(&x->ptr.p_double[0], 1, &s->ptr.p_double[0], 1, ae_v_len(0,n-1), *stp); /* * NEXT */ *stage = 4; return; } if( *stage==4 ) { *info = 0; *nfev = *nfev+1; v = ae_v_dotproduct(&g->ptr.p_double[0], 1, &s->ptr.p_double[0], 1, ae_v_len(0,n-1)); state->dg = v; state->ftest1 = state->finit+*stp*state->dgtest; /* * TEST FOR CONVERGENCE. */ if( (state->brackt&&(ae_fp_less_eq(*stp,state->stmin)||ae_fp_greater_eq(*stp,state->stmax)))||state->infoc==0 ) { *info = 6; } if( (ae_fp_eq(*stp,logit_stpmax)&&ae_fp_less_eq(*f,state->ftest1))&&ae_fp_less_eq(state->dg,state->dgtest) ) { *info = 5; } if( ae_fp_eq(*stp,logit_stpmin)&&(ae_fp_greater(*f,state->ftest1)||ae_fp_greater_eq(state->dg,state->dgtest)) ) { *info = 4; } if( *nfev>=logit_maxfev ) { *info = 3; } if( state->brackt&&ae_fp_less_eq(state->stmax-state->stmin,logit_xtol*state->stmax) ) { *info = 2; } if( ae_fp_less_eq(*f,state->ftest1)&&ae_fp_less_eq(ae_fabs(state->dg, _state),-logit_gtol*state->dginit) ) { *info = 1; } /* * CHECK FOR TERMINATION. */ if( *info!=0 ) { *stage = 0; return; } /* * IN THE FIRST STAGE WE SEEK A STEP FOR WHICH THE MODIFIED * FUNCTION HAS A NONPOSITIVE VALUE AND NONNEGATIVE DERIVATIVE. 
*/ if( (state->stage1&&ae_fp_less_eq(*f,state->ftest1))&&ae_fp_greater_eq(state->dg,ae_minreal(logit_ftol, logit_gtol, _state)*state->dginit) ) { state->stage1 = ae_false; } /* * A MODIFIED FUNCTION IS USED TO PREDICT THE STEP ONLY IF * WE HAVE NOT OBTAINED A STEP FOR WHICH THE MODIFIED * FUNCTION HAS A NONPOSITIVE FUNCTION VALUE AND NONNEGATIVE * DERIVATIVE, AND IF A LOWER FUNCTION VALUE HAS BEEN * OBTAINED BUT THE DECREASE IS NOT SUFFICIENT. */ if( (state->stage1&&ae_fp_less_eq(*f,state->fx))&&ae_fp_greater(*f,state->ftest1) ) { /* * DEFINE THE MODIFIED FUNCTION AND DERIVATIVE VALUES. */ state->fm = *f-*stp*state->dgtest; state->fxm = state->fx-state->stx*state->dgtest; state->fym = state->fy-state->sty*state->dgtest; state->dgm = state->dg-state->dgtest; state->dgxm = state->dgx-state->dgtest; state->dgym = state->dgy-state->dgtest; /* * CALL CSTEP TO UPDATE THE INTERVAL OF UNCERTAINTY * AND TO COMPUTE THE NEW STEP. */ logit_mnlmcstep(&state->stx, &state->fxm, &state->dgxm, &state->sty, &state->fym, &state->dgym, stp, state->fm, state->dgm, &state->brackt, state->stmin, state->stmax, &state->infoc, _state); /* * RESET THE FUNCTION AND GRADIENT VALUES FOR F. */ state->fx = state->fxm+state->stx*state->dgtest; state->fy = state->fym+state->sty*state->dgtest; state->dgx = state->dgxm+state->dgtest; state->dgy = state->dgym+state->dgtest; } else { /* * CALL MCSTEP TO UPDATE THE INTERVAL OF UNCERTAINTY * AND TO COMPUTE THE NEW STEP. */ logit_mnlmcstep(&state->stx, &state->fx, &state->dgx, &state->sty, &state->fy, &state->dgy, stp, *f, state->dg, &state->brackt, state->stmin, state->stmax, &state->infoc, _state); } /* * FORCE A SUFFICIENT DECREASE IN THE SIZE OF THE * INTERVAL OF UNCERTAINTY. */ if( state->brackt ) { if( ae_fp_greater_eq(ae_fabs(state->sty-state->stx, _state),p66*state->width1) ) { *stp = state->stx+p5*(state->sty-state->stx); } state->width1 = state->width; state->width = ae_fabs(state->sty-state->stx, _state); } /* * NEXT. */ *stage = 3; continue; } } } static void logit_mnlmcstep(double* stx, double* fx, double* dx, double* sty, double* fy, double* dy, double* stp, double fp, double dp, ae_bool* brackt, double stmin, double stmax, ae_int_t* info, ae_state *_state) { ae_bool bound; double gamma; double p; double q; double r; double s; double sgnd; double stpc; double stpf; double stpq; double theta; *info = 0; /* * CHECK THE INPUT PARAMETERS FOR ERRORS. */ if( ((*brackt&&(ae_fp_less_eq(*stp,ae_minreal(*stx, *sty, _state))||ae_fp_greater_eq(*stp,ae_maxreal(*stx, *sty, _state))))||ae_fp_greater_eq(*dx*(*stp-(*stx)),(double)(0)))||ae_fp_less(stmax,stmin) ) { return; } /* * DETERMINE IF THE DERIVATIVES HAVE OPPOSITE SIGN. */ sgnd = dp*(*dx/ae_fabs(*dx, _state)); /* * FIRST CASE. A HIGHER FUNCTION VALUE. * THE MINIMUM IS BRACKETED. IF THE CUBIC STEP IS CLOSER * TO STX THAN THE QUADRATIC STEP, THE CUBIC STEP IS TAKEN, * ELSE THE AVERAGE OF THE CUBIC AND QUADRATIC STEPS IS TAKEN. 
*/ if( ae_fp_greater(fp,*fx) ) { *info = 1; bound = ae_true; theta = 3*(*fx-fp)/(*stp-(*stx))+(*dx)+dp; s = ae_maxreal(ae_fabs(theta, _state), ae_maxreal(ae_fabs(*dx, _state), ae_fabs(dp, _state), _state), _state); gamma = s*ae_sqrt(ae_sqr(theta/s, _state)-*dx/s*(dp/s), _state); if( ae_fp_less(*stp,*stx) ) { gamma = -gamma; } p = gamma-(*dx)+theta; q = gamma-(*dx)+gamma+dp; r = p/q; stpc = *stx+r*(*stp-(*stx)); stpq = *stx+*dx/((*fx-fp)/(*stp-(*stx))+(*dx))/2*(*stp-(*stx)); if( ae_fp_less(ae_fabs(stpc-(*stx), _state),ae_fabs(stpq-(*stx), _state)) ) { stpf = stpc; } else { stpf = stpc+(stpq-stpc)/2; } *brackt = ae_true; } else { if( ae_fp_less(sgnd,(double)(0)) ) { /* * SECOND CASE. A LOWER FUNCTION VALUE AND DERIVATIVES OF * OPPOSITE SIGN. THE MINIMUM IS BRACKETED. IF THE CUBIC * STEP IS CLOSER TO STX THAN THE QUADRATIC (SECANT) STEP, * THE CUBIC STEP IS TAKEN, ELSE THE QUADRATIC STEP IS TAKEN. */ *info = 2; bound = ae_false; theta = 3*(*fx-fp)/(*stp-(*stx))+(*dx)+dp; s = ae_maxreal(ae_fabs(theta, _state), ae_maxreal(ae_fabs(*dx, _state), ae_fabs(dp, _state), _state), _state); gamma = s*ae_sqrt(ae_sqr(theta/s, _state)-*dx/s*(dp/s), _state); if( ae_fp_greater(*stp,*stx) ) { gamma = -gamma; } p = gamma-dp+theta; q = gamma-dp+gamma+(*dx); r = p/q; stpc = *stp+r*(*stx-(*stp)); stpq = *stp+dp/(dp-(*dx))*(*stx-(*stp)); if( ae_fp_greater(ae_fabs(stpc-(*stp), _state),ae_fabs(stpq-(*stp), _state)) ) { stpf = stpc; } else { stpf = stpq; } *brackt = ae_true; } else { if( ae_fp_less(ae_fabs(dp, _state),ae_fabs(*dx, _state)) ) { /* * THIRD CASE. A LOWER FUNCTION VALUE, DERIVATIVES OF THE * SAME SIGN, AND THE MAGNITUDE OF THE DERIVATIVE DECREASES. * THE CUBIC STEP IS ONLY USED IF THE CUBIC TENDS TO INFINITY * IN THE DIRECTION OF THE STEP OR IF THE MINIMUM OF THE CUBIC * IS BEYOND STP. OTHERWISE THE CUBIC STEP IS DEFINED TO BE * EITHER STPMIN OR STPMAX. THE QUADRATIC (SECANT) STEP IS ALSO * COMPUTED AND IF THE MINIMUM IS BRACKETED THEN THE THE STEP * CLOSEST TO STX IS TAKEN, ELSE THE STEP FARTHEST AWAY IS TAKEN. */ *info = 3; bound = ae_true; theta = 3*(*fx-fp)/(*stp-(*stx))+(*dx)+dp; s = ae_maxreal(ae_fabs(theta, _state), ae_maxreal(ae_fabs(*dx, _state), ae_fabs(dp, _state), _state), _state); /* * THE CASE GAMMA = 0 ONLY ARISES IF THE CUBIC DOES NOT TEND * TO INFINITY IN THE DIRECTION OF THE STEP. */ gamma = s*ae_sqrt(ae_maxreal((double)(0), ae_sqr(theta/s, _state)-*dx/s*(dp/s), _state), _state); if( ae_fp_greater(*stp,*stx) ) { gamma = -gamma; } p = gamma-dp+theta; q = gamma+(*dx-dp)+gamma; r = p/q; if( ae_fp_less(r,(double)(0))&&ae_fp_neq(gamma,(double)(0)) ) { stpc = *stp+r*(*stx-(*stp)); } else { if( ae_fp_greater(*stp,*stx) ) { stpc = stmax; } else { stpc = stmin; } } stpq = *stp+dp/(dp-(*dx))*(*stx-(*stp)); if( *brackt ) { if( ae_fp_less(ae_fabs(*stp-stpc, _state),ae_fabs(*stp-stpq, _state)) ) { stpf = stpc; } else { stpf = stpq; } } else { if( ae_fp_greater(ae_fabs(*stp-stpc, _state),ae_fabs(*stp-stpq, _state)) ) { stpf = stpc; } else { stpf = stpq; } } } else { /* * FOURTH CASE. A LOWER FUNCTION VALUE, DERIVATIVES OF THE * SAME SIGN, AND THE MAGNITUDE OF THE DERIVATIVE DOES * NOT DECREASE. IF THE MINIMUM IS NOT BRACKETED, THE STEP * IS EITHER STPMIN OR STPMAX, ELSE THE CUBIC STEP IS TAKEN. 
*/ *info = 4; bound = ae_false; if( *brackt ) { theta = 3*(fp-(*fy))/(*sty-(*stp))+(*dy)+dp; s = ae_maxreal(ae_fabs(theta, _state), ae_maxreal(ae_fabs(*dy, _state), ae_fabs(dp, _state), _state), _state); gamma = s*ae_sqrt(ae_sqr(theta/s, _state)-*dy/s*(dp/s), _state); if( ae_fp_greater(*stp,*sty) ) { gamma = -gamma; } p = gamma-dp+theta; q = gamma-dp+gamma+(*dy); r = p/q; stpc = *stp+r*(*sty-(*stp)); stpf = stpc; } else { if( ae_fp_greater(*stp,*stx) ) { stpf = stmax; } else { stpf = stmin; } } } } } /* * UPDATE THE INTERVAL OF UNCERTAINTY. THIS UPDATE DOES NOT * DEPEND ON THE NEW STEP OR THE CASE ANALYSIS ABOVE. */ if( ae_fp_greater(fp,*fx) ) { *sty = *stp; *fy = fp; *dy = dp; } else { if( ae_fp_less(sgnd,0.0) ) { *sty = *stx; *fy = *fx; *dy = *dx; } *stx = *stp; *fx = fp; *dx = dp; } /* * COMPUTE THE NEW STEP AND SAFEGUARD IT. */ stpf = ae_minreal(stmax, stpf, _state); stpf = ae_maxreal(stmin, stpf, _state); *stp = stpf; if( *brackt&&bound ) { if( ae_fp_greater(*sty,*stx) ) { *stp = ae_minreal(*stx+0.66*(*sty-(*stx)), *stp, _state); } else { *stp = ae_maxreal(*stx+0.66*(*sty-(*stx)), *stp, _state); } } } void _logitmodel_init(void* _p, ae_state *_state) { logitmodel *p = (logitmodel*)_p; ae_touch_ptr((void*)p); ae_vector_init(&p->w, 0, DT_REAL, _state); } void _logitmodel_init_copy(void* _dst, void* _src, ae_state *_state) { logitmodel *dst = (logitmodel*)_dst; logitmodel *src = (logitmodel*)_src; ae_vector_init_copy(&dst->w, &src->w, _state); } void _logitmodel_clear(void* _p) { logitmodel *p = (logitmodel*)_p; ae_touch_ptr((void*)p); ae_vector_clear(&p->w); } void _logitmodel_destroy(void* _p) { logitmodel *p = (logitmodel*)_p; ae_touch_ptr((void*)p); ae_vector_destroy(&p->w); } void _logitmcstate_init(void* _p, ae_state *_state) { logitmcstate *p = (logitmcstate*)_p; ae_touch_ptr((void*)p); } void _logitmcstate_init_copy(void* _dst, void* _src, ae_state *_state) { logitmcstate *dst = (logitmcstate*)_dst; logitmcstate *src = (logitmcstate*)_src; dst->brackt = src->brackt; dst->stage1 = src->stage1; dst->infoc = src->infoc; dst->dg = src->dg; dst->dgm = src->dgm; dst->dginit = src->dginit; dst->dgtest = src->dgtest; dst->dgx = src->dgx; dst->dgxm = src->dgxm; dst->dgy = src->dgy; dst->dgym = src->dgym; dst->finit = src->finit; dst->ftest1 = src->ftest1; dst->fm = src->fm; dst->fx = src->fx; dst->fxm = src->fxm; dst->fy = src->fy; dst->fym = src->fym; dst->stx = src->stx; dst->sty = src->sty; dst->stmin = src->stmin; dst->stmax = src->stmax; dst->width = src->width; dst->width1 = src->width1; dst->xtrapf = src->xtrapf; } void _logitmcstate_clear(void* _p) { logitmcstate *p = (logitmcstate*)_p; ae_touch_ptr((void*)p); } void _logitmcstate_destroy(void* _p) { logitmcstate *p = (logitmcstate*)_p; ae_touch_ptr((void*)p); } void _mnlreport_init(void* _p, ae_state *_state) { mnlreport *p = (mnlreport*)_p; ae_touch_ptr((void*)p); } void _mnlreport_init_copy(void* _dst, void* _src, ae_state *_state) { mnlreport *dst = (mnlreport*)_dst; mnlreport *src = (mnlreport*)_src; dst->ngrad = src->ngrad; dst->nhess = src->nhess; } void _mnlreport_clear(void* _p) { mnlreport *p = (mnlreport*)_p; ae_touch_ptr((void*)p); } void _mnlreport_destroy(void* _p) { mnlreport *p = (mnlreport*)_p; ae_touch_ptr((void*)p); } /************************************************************************* DESCRIPTION: This function creates MCPD (Markov Chains for Population Data) solver. 
This solver can be used to find transition matrix P for N-dimensional prediction problem where transition from X[i] to X[i+1] is modelled as X[i+1] = P*X[i] where X[i] and X[i+1] are N-dimensional population vectors (components of each X are non-negative), and P is a N*N transition matrix (elements of P are non-negative, each column sums to 1.0). Such models arise when when: * there is some population of individuals * individuals can have different states * individuals can transit from one state to another * population size is constant, i.e. there is no new individuals and no one leaves population * you want to model transitions of individuals from one state into another USAGE: Here we give very brief outline of the MCPD. We strongly recommend you to read examples in the ALGLIB Reference Manual and to read ALGLIB User Guide on data analysis which is available at http://www.alglib.net/dataanalysis/ 1. User initializes algorithm state with MCPDCreate() call 2. User adds one or more tracks - sequences of states which describe evolution of a system being modelled from different starting conditions 3. User may add optional boundary, equality and/or linear constraints on the coefficients of P by calling one of the following functions: * MCPDSetEC() to set equality constraints * MCPDSetBC() to set bound constraints * MCPDSetLC() to set linear constraints 4. Optionally, user may set custom weights for prediction errors (by default, algorithm assigns non-equal, automatically chosen weights for errors in the prediction of different components of X). It can be done with a call of MCPDSetPredictionWeights() function. 5. User calls MCPDSolve() function which takes algorithm state and pointer (delegate, etc.) to callback function which calculates F/G. 6. User calls MCPDResults() to get solution INPUT PARAMETERS: N - problem dimension, N>=1 OUTPUT PARAMETERS: State - structure stores algorithm state -- ALGLIB -- Copyright 23.05.2010 by Bochkanov Sergey *************************************************************************/ void mcpdcreate(ae_int_t n, mcpdstate* s, ae_state *_state) { _mcpdstate_clear(s); ae_assert(n>=1, "MCPDCreate: N<1", _state); mcpd_mcpdinit(n, -1, -1, s, _state); } /************************************************************************* DESCRIPTION: This function is a specialized version of MCPDCreate() function, and we recommend you to read comments for this function for general information about MCPD solver. This function creates MCPD (Markov Chains for Population Data) solver for "Entry-state" model, i.e. model where transition from X[i] to X[i+1] is modelled as X[i+1] = P*X[i] where X[i] and X[i+1] are N-dimensional state vectors P is a N*N transition matrix and one selected component of X[] is called "entry" state and is treated in a special way: system state always transits from "entry" state to some another state system state can not transit from any state into "entry" state Such conditions basically mean that row of P which corresponds to "entry" state is zero. 
Such models arise when: * there is some population of individuals * individuals can have different states * individuals can transit from one state to another * population size is NOT constant - at every moment of time there is some (unpredictable) amount of "new" individuals, which can transit into one of the states at the next turn, but still no one leaves population * you want to model transitions of individuals from one state into another * but you do NOT want to predict amount of "new" individuals because it does not depends on individuals already present (hence system can not transit INTO entry state - it can only transit FROM it). This model is discussed in more details in the ALGLIB User Guide (see http://www.alglib.net/dataanalysis/ for more data). INPUT PARAMETERS: N - problem dimension, N>=2 EntryState- index of entry state, in 0..N-1 OUTPUT PARAMETERS: State - structure stores algorithm state -- ALGLIB -- Copyright 23.05.2010 by Bochkanov Sergey *************************************************************************/ void mcpdcreateentry(ae_int_t n, ae_int_t entrystate, mcpdstate* s, ae_state *_state) { _mcpdstate_clear(s); ae_assert(n>=2, "MCPDCreateEntry: N<2", _state); ae_assert(entrystate>=0, "MCPDCreateEntry: EntryState<0", _state); ae_assert(entrystate=N", _state); mcpd_mcpdinit(n, entrystate, -1, s, _state); } /************************************************************************* DESCRIPTION: This function is a specialized version of MCPDCreate() function, and we recommend you to read comments for this function for general information about MCPD solver. This function creates MCPD (Markov Chains for Population Data) solver for "Exit-state" model, i.e. model where transition from X[i] to X[i+1] is modelled as X[i+1] = P*X[i] where X[i] and X[i+1] are N-dimensional state vectors P is a N*N transition matrix and one selected component of X[] is called "exit" state and is treated in a special way: system state can transit from any state into "exit" state system state can not transit from "exit" state into any other state transition operator discards "exit" state (makes it zero at each turn) Such conditions basically mean that column of P which corresponds to "exit" state is zero. Multiplication by such P may decrease sum of vector components. Such models arise when: * there is some population of individuals * individuals can have different states * individuals can transit from one state to another * population size is NOT constant - individuals can move into "exit" state and leave population at the next turn, but there are no new individuals * amount of individuals which leave population can be predicted * you want to model transitions of individuals from one state into another (including transitions into the "exit" state) This model is discussed in more details in the ALGLIB User Guide (see http://www.alglib.net/dataanalysis/ for more data). 
INPUT PARAMETERS: N - problem dimension, N>=2 ExitState- index of exit state, in 0..N-1 OUTPUT PARAMETERS: State - structure stores algorithm state -- ALGLIB -- Copyright 23.05.2010 by Bochkanov Sergey *************************************************************************/ void mcpdcreateexit(ae_int_t n, ae_int_t exitstate, mcpdstate* s, ae_state *_state) { _mcpdstate_clear(s); ae_assert(n>=2, "MCPDCreateExit: N<2", _state); ae_assert(exitstate>=0, "MCPDCreateExit: ExitState<0", _state); ae_assert(exitstate=N", _state); mcpd_mcpdinit(n, -1, exitstate, s, _state); } /************************************************************************* DESCRIPTION: This function is a specialized version of MCPDCreate() function, and we recommend you to read comments for this function for general information about MCPD solver. This function creates MCPD (Markov Chains for Population Data) solver for "Entry-Exit-states" model, i.e. model where transition from X[i] to X[i+1] is modelled as X[i+1] = P*X[i] where X[i] and X[i+1] are N-dimensional state vectors P is a N*N transition matrix one selected component of X[] is called "entry" state and is treated in a special way: system state always transits from "entry" state to some another state system state can not transit from any state into "entry" state and another one component of X[] is called "exit" state and is treated in a special way too: system state can transit from any state into "exit" state system state can not transit from "exit" state into any other state transition operator discards "exit" state (makes it zero at each turn) Such conditions basically mean that: row of P which corresponds to "entry" state is zero column of P which corresponds to "exit" state is zero Multiplication by such P may decrease sum of vector components. Such models arise when: * there is some population of individuals * individuals can have different states * individuals can transit from one state to another * population size is NOT constant * at every moment of time there is some (unpredictable) amount of "new" individuals, which can transit into one of the states at the next turn * some individuals can move (predictably) into "exit" state and leave population at the next turn * you want to model transitions of individuals from one state into another, including transitions from the "entry" state and into the "exit" state. * but you do NOT want to predict amount of "new" individuals because it does not depends on individuals already present (hence system can not transit INTO entry state - it can only transit FROM it). This model is discussed in more details in the ALGLIB User Guide (see http://www.alglib.net/dataanalysis/ for more data). 
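Taken together, the four creation functions differ only in which structural zeros are imposed on the column-stochastic transition matrix P: MCPDCreate imposes none, the "entry" variant zeroes one row, the "exit" variant zeroes one column, and the entry-exit variant does both. The model itself is just X[i+1] = P*X[i]; a standalone C++ sketch of one such step (illustrative only - MCPD estimates P from tracks, you never apply it by hand inside the solver):

#include <vector>
#include <cstddef>

// One step of the population model X[i+1] = P*X[i].
// p is row-major: p[r][c] is the share of state c that moves to state r. Each
// column of P sums to 1.0, except that the column of an "exit" state is
// identically zero (its population is discarded at each turn), so the total
// population may decrease.
std::vector<double> advance_population(const std::vector<std::vector<double>>& p,
                                       const std::vector<double>& x)
{
    std::vector<double> next(x.size(), 0.0);
    for (std::size_t r = 0; r < x.size(); ++r)
        for (std::size_t c = 0; c < x.size(); ++c)
            next[r] += p[r][c] * x[c];
    return next;
}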
INPUT PARAMETERS: N - problem dimension, N>=2 EntryState- index of entry state, in 0..N-1 ExitState- index of exit state, in 0..N-1 OUTPUT PARAMETERS: State - structure stores algorithm state -- ALGLIB -- Copyright 23.05.2010 by Bochkanov Sergey *************************************************************************/ void mcpdcreateentryexit(ae_int_t n, ae_int_t entrystate, ae_int_t exitstate, mcpdstate* s, ae_state *_state) { _mcpdstate_clear(s); ae_assert(n>=2, "MCPDCreateEntryExit: N<2", _state); ae_assert(entrystate>=0, "MCPDCreateEntryExit: EntryState<0", _state); ae_assert(entrystate=N", _state); ae_assert(exitstate>=0, "MCPDCreateEntryExit: ExitState<0", _state); ae_assert(exitstate=N", _state); ae_assert(entrystate!=exitstate, "MCPDCreateEntryExit: EntryState=ExitState", _state); mcpd_mcpdinit(n, entrystate, exitstate, s, _state); } /************************************************************************* This function is used to add a track - sequence of system states at the different moments of its evolution. You may add one or several tracks to the MCPD solver. In case you have several tracks, they won't overwrite each other. For example, if you pass two tracks, A1-A2-A3 (system at t=A+1, t=A+2 and t=A+3) and B1-B2-B3, then solver will try to model transitions from t=A+1 to t=A+2, t=A+2 to t=A+3, t=B+1 to t=B+2, t=B+2 to t=B+3. But it WONT mix these two tracks - i.e. it wont try to model transition from t=A+3 to t=B+1. INPUT PARAMETERS: S - solver XY - track, array[K,N]: * I-th row is a state at t=I * elements of XY must be non-negative (exception will be thrown on negative elements) K - number of points in a track * if given, only leading K rows of XY are used * if not given, automatically determined from size of XY NOTES: 1. Track may contain either proportional or population data: * with proportional data all rows of XY must sum to 1.0, i.e. 
we have proportions instead of absolute population values * with population data rows of XY contain population counts and generally do not sum to 1.0 (although they still must be non-negative) -- ALGLIB -- Copyright 23.05.2010 by Bochkanov Sergey *************************************************************************/ void mcpdaddtrack(mcpdstate* s, /* Real */ ae_matrix* xy, ae_int_t k, ae_state *_state) { ae_int_t i; ae_int_t j; ae_int_t n; double s0; double s1; n = s->n; ae_assert(k>=0, "MCPDAddTrack: K<0", _state); ae_assert(xy->cols>=n, "MCPDAddTrack: Cols(XY)rows>=k, "MCPDAddTrack: Rows(XY)ptr.pp_double[i][j],(double)(0)), "MCPDAddTrack: XY contains negative elements", _state); } } if( k<2 ) { return; } if( s->data.rowsnpairs+k-1 ) { rmatrixresize(&s->data, ae_maxint(2*s->data.rows, s->npairs+k-1, _state), 2*n, _state); } for(i=0; i<=k-2; i++) { s0 = (double)(0); s1 = (double)(0); for(j=0; j<=n-1; j++) { if( s->states.ptr.p_int[j]>=0 ) { s0 = s0+xy->ptr.pp_double[i][j]; } if( s->states.ptr.p_int[j]<=0 ) { s1 = s1+xy->ptr.pp_double[i+1][j]; } } if( ae_fp_greater(s0,(double)(0))&&ae_fp_greater(s1,(double)(0)) ) { for(j=0; j<=n-1; j++) { if( s->states.ptr.p_int[j]>=0 ) { s->data.ptr.pp_double[s->npairs][j] = xy->ptr.pp_double[i][j]/s0; } else { s->data.ptr.pp_double[s->npairs][j] = 0.0; } if( s->states.ptr.p_int[j]<=0 ) { s->data.ptr.pp_double[s->npairs][n+j] = xy->ptr.pp_double[i+1][j]/s1; } else { s->data.ptr.pp_double[s->npairs][n+j] = 0.0; } } s->npairs = s->npairs+1; } } } /************************************************************************* This function is used to add equality constraints on the elements of the transition matrix P. MCPD solver has four types of constraints which can be placed on P: * user-specified equality constraints (optional) * user-specified bound constraints (optional) * user-specified general linear constraints (optional) * basic constraints (always present): * non-negativity: P[i,j]>=0 * consistency: every column of P sums to 1.0 Final constraints which are passed to the underlying optimizer are calculated as intersection of all present constraints. For example, you may specify boundary constraint on P[0,0] and equality one: 0.1<=P[0,0]<=0.9 P[0,0]=0.5 Such combination of constraints will be silently reduced to their intersection, which is P[0,0]=0.5. This function can be used to place equality constraints on arbitrary subset of elements of P. Set of constraints is specified by EC, which may contain either NAN's or finite numbers from [0,1]. NAN denotes absence of constraint, finite number denotes equality constraint on specific element of P. You can also use MCPDAddEC() function which allows to ADD equality constraint for one element of P without changing constraints for other elements. These functions (MCPDSetEC and MCPDAddEC) interact as follows: * there is internal matrix of equality constraints which is stored in the MCPD solver * MCPDSetEC() replaces this matrix by another one (SET) * MCPDAddEC() modifies one element of this matrix and leaves other ones unchanged (ADD) * thus MCPDAddEC() call preserves all modifications done by previous calls, while MCPDSetEC() completely discards all changes done to the equality constraints. INPUT PARAMETERS: S - solver EC - equality constraints, array[N,N]. Elements of EC can be either NAN's or finite numbers from [0,1]. NAN denotes absence of constraints, while finite value denotes equality constraint on the corresponding element of P. NOTES: 1. infinite values of EC will lead to exception being thrown. 
Values less than 0.0 or greater than 1.0 will lead to error code being returned after call to MCPDSolve(). -- ALGLIB -- Copyright 23.05.2010 by Bochkanov Sergey *************************************************************************/ void mcpdsetec(mcpdstate* s, /* Real */ ae_matrix* ec, ae_state *_state) { ae_int_t i; ae_int_t j; ae_int_t n; n = s->n; ae_assert(ec->cols>=n, "MCPDSetEC: Cols(EC)rows>=n, "MCPDSetEC: Rows(EC)ptr.pp_double[i][j], _state)||ae_isnan(ec->ptr.pp_double[i][j], _state), "MCPDSetEC: EC containts infinite elements", _state); s->ec.ptr.pp_double[i][j] = ec->ptr.pp_double[i][j]; } } } /************************************************************************* This function is used to add equality constraints on the elements of the transition matrix P. MCPD solver has four types of constraints which can be placed on P: * user-specified equality constraints (optional) * user-specified bound constraints (optional) * user-specified general linear constraints (optional) * basic constraints (always present): * non-negativity: P[i,j]>=0 * consistency: every column of P sums to 1.0 Final constraints which are passed to the underlying optimizer are calculated as intersection of all present constraints. For example, you may specify boundary constraint on P[0,0] and equality one: 0.1<=P[0,0]<=0.9 P[0,0]=0.5 Such combination of constraints will be silently reduced to their intersection, which is P[0,0]=0.5. This function can be used to ADD equality constraint for one element of P without changing constraints for other elements. You can also use MCPDSetEC() function which allows you to specify arbitrary set of equality constraints in one call. These functions (MCPDSetEC and MCPDAddEC) interact as follows: * there is internal matrix of equality constraints which is stored in the MCPD solver * MCPDSetEC() replaces this matrix by another one (SET) * MCPDAddEC() modifies one element of this matrix and leaves other ones unchanged (ADD) * thus MCPDAddEC() call preserves all modifications done by previous calls, while MCPDSetEC() completely discards all changes done to the equality constraints. INPUT PARAMETERS: S - solver I - row index of element being constrained J - column index of element being constrained C - value (constraint for P[I,J]). Can be either NAN (no constraint) or finite value from [0,1]. NOTES: 1. infinite values of C will lead to exception being thrown. Values less than 0.0 or greater than 1.0 will lead to error code being returned after call to MCPDSolve(). -- ALGLIB -- Copyright 23.05.2010 by Bochkanov Sergey *************************************************************************/ void mcpdaddec(mcpdstate* s, ae_int_t i, ae_int_t j, double c, ae_state *_state) { ae_assert(i>=0, "MCPDAddEC: I<0", _state); ae_assert(in, "MCPDAddEC: I>=N", _state); ae_assert(j>=0, "MCPDAddEC: J<0", _state); ae_assert(jn, "MCPDAddEC: J>=N", _state); ae_assert(ae_isnan(c, _state)||ae_isfinite(c, _state), "MCPDAddEC: C is not finite number or NAN", _state); s->ec.ptr.pp_double[i][j] = c; } /************************************************************************* This function is used to add bound constraints on the elements of the transition matrix P. 
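In practice the equality-constraint matrix passed to MCPDSetEC() is just an N-by-N array in which NAN marks "no constraint" and a finite value in [0,1] pins the corresponding element of P; MCPDAddEC() edits single entries of that same internal matrix. A standalone C++ sketch of preparing such a matrix (illustrative only, made-up helper name; with the C++ interface the values would be loaded into a real_2d_array before the call):

#include <vector>
#include <cstddef>
#include <limits>

// Build an N x N equality-constraint matrix: every element starts as NAN
// ("no constraint"); individual entries are then pinned to fixed values.
std::vector<std::vector<double>> make_equality_constraints(std::size_t n)
{
    double nan = std::numeric_limits<double>::quiet_NaN();
    std::vector<std::vector<double>> ec(n, std::vector<double>(n, nan));
    ec[0][0] = 0.5;   // require P[0,0] = 0.5, as in the example in the comments above
    return ec;
}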
MCPD solver has four types of constraints which can be placed on P: * user-specified equality constraints (optional) * user-specified bound constraints (optional) * user-specified general linear constraints (optional) * basic constraints (always present): * non-negativity: P[i,j]>=0 * consistency: every column of P sums to 1.0 Final constraints which are passed to the underlying optimizer are calculated as intersection of all present constraints. For example, you may specify boundary constraint on P[0,0] and equality one: 0.1<=P[0,0]<=0.9 P[0,0]=0.5 Such combination of constraints will be silently reduced to their intersection, which is P[0,0]=0.5. This function can be used to place bound constraints on arbitrary subset of elements of P. Set of constraints is specified by BndL/BndU matrices, which may contain arbitrary combination of finite numbers or infinities (like -INFn; ae_assert(bndl->cols>=n, "MCPDSetBC: Cols(BndL)rows>=n, "MCPDSetBC: Rows(BndL)cols>=n, "MCPDSetBC: Cols(BndU)rows>=n, "MCPDSetBC: Rows(BndU)ptr.pp_double[i][j], _state)||ae_isneginf(bndl->ptr.pp_double[i][j], _state), "MCPDSetBC: BndL containts NAN or +INF", _state); ae_assert(ae_isfinite(bndu->ptr.pp_double[i][j], _state)||ae_isposinf(bndu->ptr.pp_double[i][j], _state), "MCPDSetBC: BndU containts NAN or -INF", _state); s->bndl.ptr.pp_double[i][j] = bndl->ptr.pp_double[i][j]; s->bndu.ptr.pp_double[i][j] = bndu->ptr.pp_double[i][j]; } } } /************************************************************************* This function is used to add bound constraints on the elements of the transition matrix P. MCPD solver has four types of constraints which can be placed on P: * user-specified equality constraints (optional) * user-specified bound constraints (optional) * user-specified general linear constraints (optional) * basic constraints (always present): * non-negativity: P[i,j]>=0 * consistency: every column of P sums to 1.0 Final constraints which are passed to the underlying optimizer are calculated as intersection of all present constraints. For example, you may specify boundary constraint on P[0,0] and equality one: 0.1<=P[0,0]<=0.9 P[0,0]=0.5 Such combination of constraints will be silently reduced to their intersection, which is P[0,0]=0.5. This function can be used to ADD bound constraint for one element of P without changing constraints for other elements. You can also use MCPDSetBC() function which allows to place bound constraints on arbitrary subset of elements of P. Set of constraints is specified by BndL/BndU matrices, which may contain arbitrary combination of finite numbers or infinities (like -INF=0, "MCPDAddBC: I<0", _state); ae_assert(in, "MCPDAddBC: I>=N", _state); ae_assert(j>=0, "MCPDAddBC: J<0", _state); ae_assert(jn, "MCPDAddBC: J>=N", _state); ae_assert(ae_isfinite(bndl, _state)||ae_isneginf(bndl, _state), "MCPDAddBC: BndL is NAN or +INF", _state); ae_assert(ae_isfinite(bndu, _state)||ae_isposinf(bndu, _state), "MCPDAddBC: BndU is NAN or -INF", _state); s->bndl.ptr.pp_double[i][j] = bndl; s->bndu.ptr.pp_double[i][j] = bndu; } /************************************************************************* This function is used to set linear equality/inequality constraints on the elements of the transition matrix P. This function can be used to set one or several general linear constraints on the elements of P. 
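The bound constraints discussed above (MCPDSetBC/MCPDAddBC) follow the same pattern, except that two matrices are used and infinities encode one-sided or absent bounds: BndL[i,j] = -INF together with BndU[i,j] = +INF means "no constraint" on P[i,j]. A standalone C++ sketch (illustrative only, made-up helper name):

#include <vector>
#include <cstddef>
#include <limits>

// Build default bound matrices (no constraints anywhere), then bound one element:
// 0.1 <= P[0,0] <= 0.9, matching the example used in the comments above.
void make_bound_constraints(std::size_t n,
                            std::vector<std::vector<double>>& bndl,
                            std::vector<std::vector<double>>& bndu)
{
    double inf = std::numeric_limits<double>::infinity();
    bndl.assign(n, std::vector<double>(n, -inf));
    bndu.assign(n, std::vector<double>(n, +inf));
    bndl[0][0] = 0.1;
    bndu[0][0] = 0.9;
}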
Two types of constraints are supported: * equality constraints * inequality constraints (both less-or-equal and greater-or-equal) Coefficients of constraints are specified by matrix C (one of the parameters). One row of C corresponds to one constraint. Because transition matrix P has N*N elements, we need N*N columns to store all coefficients (they are stored row by row), and one more column to store right part - hence C has N*N+1 columns. Constraint kind is stored in the CT array. Thus, I-th linear constraint is P[0,0]*C[I,0] + P[0,1]*C[I,1] + .. + P[0,N-1]*C[I,N-1] + + P[1,0]*C[I,N] + P[1,1]*C[I,N+1] + ... + + P[N-1,N-1]*C[I,N*N-1] ?=? C[I,N*N] where ?=? can be either "=" (CT[i]=0), "<=" (CT[i]<0) or ">=" (CT[i]>0). Your constraint may involve only some subset of P (less than N*N elements). For example it can be something like P[0,0] + P[0,1] = 0.5 In this case you still should pass matrix with N*N+1 columns, but all its elements (except for C[0,0], C[0,1] and C[0,N*N-1]) will be zero. INPUT PARAMETERS: S - solver C - array[K,N*N+1] - coefficients of constraints (see above for complete description) CT - array[K] - constraint types (see above for complete description) K - number of equality/inequality constraints, K>=0: * if given, only leading K elements of C/CT are used * if not given, automatically determined from sizes of C/CT -- ALGLIB -- Copyright 23.05.2010 by Bochkanov Sergey *************************************************************************/ void mcpdsetlc(mcpdstate* s, /* Real */ ae_matrix* c, /* Integer */ ae_vector* ct, ae_int_t k, ae_state *_state) { ae_int_t i; ae_int_t j; ae_int_t n; n = s->n; ae_assert(c->cols>=n*n+1, "MCPDSetLC: Cols(C)rows>=k, "MCPDSetLC: Rows(C)cnt>=k, "MCPDSetLC: Len(CT)c, k, n*n+1, _state); ivectorsetlengthatleast(&s->ct, k, _state); for(i=0; i<=k-1; i++) { for(j=0; j<=n*n; j++) { s->c.ptr.pp_double[i][j] = c->ptr.pp_double[i][j]; } s->ct.ptr.p_int[i] = ct->ptr.p_int[i]; } s->ccnt = k; } /************************************************************************* This function allows to tune amount of Tikhonov regularization being applied to your problem. By default, regularizing term is equal to r*||P-prior_P||^2, where r is a small non-zero value, P is transition matrix, prior_P is identity matrix, ||X||^2 is a sum of squared elements of X. This function allows you to change coefficient r. You can also change prior values with MCPDSetPrior() function. INPUT PARAMETERS: S - solver V - regularization coefficient, finite non-negative value. It is not recommended to specify zero value unless you are pretty sure that you want it. -- ALGLIB -- Copyright 23.05.2010 by Bochkanov Sergey *************************************************************************/ void mcpdsettikhonovregularizer(mcpdstate* s, double v, ae_state *_state) { ae_assert(ae_isfinite(v, _state), "MCPDSetTikhonovRegularizer: V is infinite or NAN", _state); ae_assert(ae_fp_greater_eq(v,0.0), "MCPDSetTikhonovRegularizer: V is less than zero", _state); s->regterm = v; } /************************************************************************* This function allows to set prior values used for regularization of your problem. By default, regularizing term is equal to r*||P-prior_P||^2, where r is a small non-zero value, P is transition matrix, prior_P is identity matrix, ||X||^2 is a sum of squared elements of X. This function allows you to change prior values prior_P. You can also change r with MCPDSetTikhonovRegularizer() function. 
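Because the N*N+1 column layout of the constraint matrix is easy to get wrong, here is a minimal sketch (illustrative function name; solver S assumed already created for N states) which encodes the example constraint P[0,0]+P[0,1]=0.5 from the description above. The coefficient of P[i,j] goes to column i*n+j, the right part to column n*n, and CT[0]=0 marks an equality:

    void example_linear_constraint(mcpdstate *s, ae_int_t n, ae_state *_state)
    {
        ae_frame _frame_block;
        ae_matrix c;
        ae_vector ct;
        ae_int_t j;
        ae_frame_make(_state, &_frame_block);
        ae_matrix_init(&c, 0, 0, DT_REAL, _state);
        ae_vector_init(&ct, 0, DT_INT, _state);
        ae_matrix_set_length(&c, 1, n*n+1, _state);
        ae_vector_set_length(&ct, 1, _state);
        for(j=0; j<=n*n; j++)
        {
            c.ptr.pp_double[0][j] = 0.0;
        }
        c.ptr.pp_double[0][0*n+0] = 1.0;   /* coefficient of P[0,0] */
        c.ptr.pp_double[0][0*n+1] = 1.0;   /* coefficient of P[0,1] */
        c.ptr.pp_double[0][n*n]   = 0.5;   /* right part            */
        ct.ptr.p_int[0] = 0;               /* 0 means "="           */
        mcpdsetlc(s, &c, &ct, 1, _state);
        ae_frame_leave(_state);
    }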
INPUT PARAMETERS: S - solver PP - array[N,N], matrix of prior values: 1. elements must be real numbers from [0,1] 2. columns must sum to 1.0. First property is checked (exception is thrown otherwise), while second one is not checked/enforced. -- ALGLIB -- Copyright 23.05.2010 by Bochkanov Sergey *************************************************************************/ void mcpdsetprior(mcpdstate* s, /* Real */ ae_matrix* pp, ae_state *_state) { ae_frame _frame_block; ae_matrix _pp; ae_int_t i; ae_int_t j; ae_int_t n; ae_frame_make(_state, &_frame_block); ae_matrix_init_copy(&_pp, pp, _state); pp = &_pp; n = s->n; ae_assert(pp->cols>=n, "MCPDSetPrior: Cols(PP)rows>=n, "MCPDSetPrior: Rows(PP)ptr.pp_double[i][j], _state), "MCPDSetPrior: PP containts infinite elements", _state); ae_assert(ae_fp_greater_eq(pp->ptr.pp_double[i][j],0.0)&&ae_fp_less_eq(pp->ptr.pp_double[i][j],1.0), "MCPDSetPrior: PP[i,j] is less than 0.0 or greater than 1.0", _state); s->priorp.ptr.pp_double[i][j] = pp->ptr.pp_double[i][j]; } } ae_frame_leave(_state); } /************************************************************************* This function is used to change prediction weights MCPD solver scales prediction errors as follows Error(P) = ||W*(y-P*x)||^2 where x is a system state at time t y is a system state at time t+1 P is a transition matrix W is a diagonal scaling matrix By default, weights are chosen in order to minimize relative prediction error instead of absolute one. For example, if one component of state is about 0.5 in magnitude and another one is about 0.05, then algorithm will make corresponding weights equal to 2.0 and 20.0. INPUT PARAMETERS: S - solver PW - array[N], weights: * must be non-negative values (exception will be thrown otherwise) * zero values will be replaced by automatically chosen values -- ALGLIB -- Copyright 23.05.2010 by Bochkanov Sergey *************************************************************************/ void mcpdsetpredictionweights(mcpdstate* s, /* Real */ ae_vector* pw, ae_state *_state) { ae_int_t i; ae_int_t n; n = s->n; ae_assert(pw->cnt>=n, "MCPDSetPredictionWeights: Length(PW)ptr.p_double[i], _state), "MCPDSetPredictionWeights: PW containts infinite or NAN elements", _state); ae_assert(ae_fp_greater_eq(pw->ptr.p_double[i],(double)(0)), "MCPDSetPredictionWeights: PW containts negative elements", _state); s->pw.ptr.p_double[i] = pw->ptr.p_double[i]; } } /************************************************************************* This function is used to start solution of the MCPD problem. After return from this function, you can use MCPDResults() to get solution and completion code. 
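To make the roles of the prediction weights W and of the Tikhonov term concrete, the following standalone sketch (plain C, illustrative names, dense arrays assumed) evaluates the objective which the solver minimizes: the weighted squared prediction error summed over all (x,y) pairs plus r*||P-prior_P||^2:

    /* x[k][*] and y[k][*] are the system states at times t and t+1 for pair k */
    double mcpd_objective(double **p, double **priorp, double *w, double r,
        double **x, double **y, ae_int_t npairs, ae_int_t n)
    {
        ae_int_t i;
        ae_int_t j;
        ae_int_t k;
        double v;
        double f;
        f = 0.0;
        for(k=0; k<=npairs-1; k++)
        {
            for(i=0; i<=n-1; i++)
            {
                v = 0.0;
                for(j=0; j<=n-1; j++)
                {
                    v = v+p[i][j]*x[k][j];                   /* (P*x)[i]             */
                }
                f = f+w[i]*w[i]*(v-y[k][i])*(v-y[k][i]);     /* weighted error term  */
            }
        }
        for(i=0; i<=n-1; i++)
        {
            for(j=0; j<=n-1; j++)
            {
                f = f+r*(p[i][j]-priorp[i][j])*(p[i][j]-priorp[i][j]);   /* Tikhonov term */
            }
        }
        return f;
    }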
-- ALGLIB -- Copyright 23.05.2010 by Bochkanov Sergey *************************************************************************/ void mcpdsolve(mcpdstate* s, ae_state *_state) { ae_int_t n; ae_int_t npairs; ae_int_t ccnt; ae_int_t i; ae_int_t j; ae_int_t k; ae_int_t k2; double v; double vv; n = s->n; npairs = s->npairs; /* * init fields of S */ s->repterminationtype = 0; s->repinneriterationscount = 0; s->repouteriterationscount = 0; s->repnfev = 0; for(k=0; k<=n-1; k++) { for(k2=0; k2<=n-1; k2++) { s->p.ptr.pp_double[k][k2] = _state->v_nan; } } /* * Generate "effective" weights for prediction and calculate preconditioner */ for(i=0; i<=n-1; i++) { if( ae_fp_eq(s->pw.ptr.p_double[i],(double)(0)) ) { v = (double)(0); k = 0; for(j=0; j<=npairs-1; j++) { if( ae_fp_neq(s->data.ptr.pp_double[j][n+i],(double)(0)) ) { v = v+s->data.ptr.pp_double[j][n+i]; k = k+1; } } if( k!=0 ) { s->effectivew.ptr.p_double[i] = k/v; } else { s->effectivew.ptr.p_double[i] = 1.0; } } else { s->effectivew.ptr.p_double[i] = s->pw.ptr.p_double[i]; } } for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { s->h.ptr.p_double[i*n+j] = 2*s->regterm; } } for(k=0; k<=npairs-1; k++) { for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { s->h.ptr.p_double[i*n+j] = s->h.ptr.p_double[i*n+j]+2*ae_sqr(s->effectivew.ptr.p_double[i], _state)*ae_sqr(s->data.ptr.pp_double[k][j], _state); } } } for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { if( ae_fp_eq(s->h.ptr.p_double[i*n+j],(double)(0)) ) { s->h.ptr.p_double[i*n+j] = (double)(1); } } } /* * Generate "effective" BndL/BndU */ for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { /* * Set default boundary constraints. * Lower bound is always zero, upper bound is calculated * with respect to entry/exit states. */ s->effectivebndl.ptr.p_double[i*n+j] = 0.0; if( s->states.ptr.p_int[i]>0||s->states.ptr.p_int[j]<0 ) { s->effectivebndu.ptr.p_double[i*n+j] = 0.0; } else { s->effectivebndu.ptr.p_double[i*n+j] = 1.0; } /* * Calculate intersection of the default and user-specified bound constraints. * This code checks consistency of such combination. */ if( ae_isfinite(s->bndl.ptr.pp_double[i][j], _state)&&ae_fp_greater(s->bndl.ptr.pp_double[i][j],s->effectivebndl.ptr.p_double[i*n+j]) ) { s->effectivebndl.ptr.p_double[i*n+j] = s->bndl.ptr.pp_double[i][j]; } if( ae_isfinite(s->bndu.ptr.pp_double[i][j], _state)&&ae_fp_less(s->bndu.ptr.pp_double[i][j],s->effectivebndu.ptr.p_double[i*n+j]) ) { s->effectivebndu.ptr.p_double[i*n+j] = s->bndu.ptr.pp_double[i][j]; } if( ae_fp_greater(s->effectivebndl.ptr.p_double[i*n+j],s->effectivebndu.ptr.p_double[i*n+j]) ) { s->repterminationtype = -3; return; } /* * Calculate intersection of the effective bound constraints * and user-specified equality constraints. * This code checks consistency of such combination. 
*/ if( ae_isfinite(s->ec.ptr.pp_double[i][j], _state) ) { if( ae_fp_less(s->ec.ptr.pp_double[i][j],s->effectivebndl.ptr.p_double[i*n+j])||ae_fp_greater(s->ec.ptr.pp_double[i][j],s->effectivebndu.ptr.p_double[i*n+j]) ) { s->repterminationtype = -3; return; } s->effectivebndl.ptr.p_double[i*n+j] = s->ec.ptr.pp_double[i][j]; s->effectivebndu.ptr.p_double[i*n+j] = s->ec.ptr.pp_double[i][j]; } } } /* * Generate linear constraints: * * "default" sums-to-one constraints (not generated for "exit" states) */ rmatrixsetlengthatleast(&s->effectivec, s->ccnt+n, n*n+1, _state); ivectorsetlengthatleast(&s->effectivect, s->ccnt+n, _state); ccnt = s->ccnt; for(i=0; i<=s->ccnt-1; i++) { for(j=0; j<=n*n; j++) { s->effectivec.ptr.pp_double[i][j] = s->c.ptr.pp_double[i][j]; } s->effectivect.ptr.p_int[i] = s->ct.ptr.p_int[i]; } for(i=0; i<=n-1; i++) { if( s->states.ptr.p_int[i]>=0 ) { for(k=0; k<=n*n-1; k++) { s->effectivec.ptr.pp_double[ccnt][k] = (double)(0); } for(k=0; k<=n-1; k++) { s->effectivec.ptr.pp_double[ccnt][k*n+i] = (double)(1); } s->effectivec.ptr.pp_double[ccnt][n*n] = 1.0; s->effectivect.ptr.p_int[ccnt] = 0; ccnt = ccnt+1; } } /* * create optimizer */ for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { s->tmpp.ptr.p_double[i*n+j] = (double)1/(double)n; } } minbleicrestartfrom(&s->bs, &s->tmpp, _state); minbleicsetbc(&s->bs, &s->effectivebndl, &s->effectivebndu, _state); minbleicsetlc(&s->bs, &s->effectivec, &s->effectivect, ccnt, _state); minbleicsetcond(&s->bs, 0.0, 0.0, mcpd_xtol, 0, _state); minbleicsetprecdiag(&s->bs, &s->h, _state); /* * solve problem */ while(minbleiciteration(&s->bs, _state)) { ae_assert(s->bs.needfg, "MCPDSolve: internal error", _state); if( s->bs.needfg ) { /* * Calculate regularization term */ s->bs.f = 0.0; vv = s->regterm; for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { s->bs.f = s->bs.f+vv*ae_sqr(s->bs.x.ptr.p_double[i*n+j]-s->priorp.ptr.pp_double[i][j], _state); s->bs.g.ptr.p_double[i*n+j] = 2*vv*(s->bs.x.ptr.p_double[i*n+j]-s->priorp.ptr.pp_double[i][j]); } } /* * calculate prediction error/gradient for K-th pair */ for(k=0; k<=npairs-1; k++) { for(i=0; i<=n-1; i++) { v = ae_v_dotproduct(&s->bs.x.ptr.p_double[i*n], 1, &s->data.ptr.pp_double[k][0], 1, ae_v_len(i*n,i*n+n-1)); vv = s->effectivew.ptr.p_double[i]; s->bs.f = s->bs.f+ae_sqr(vv*(v-s->data.ptr.pp_double[k][n+i]), _state); for(j=0; j<=n-1; j++) { s->bs.g.ptr.p_double[i*n+j] = s->bs.g.ptr.p_double[i*n+j]+2*vv*vv*(v-s->data.ptr.pp_double[k][n+i])*s->data.ptr.pp_double[k][j]; } } } /* * continue */ continue; } } minbleicresultsbuf(&s->bs, &s->tmpp, &s->br, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { s->p.ptr.pp_double[i][j] = s->tmpp.ptr.p_double[i*n+j]; } } s->repterminationtype = s->br.terminationtype; s->repinneriterationscount = s->br.inneriterationscount; s->repouteriterationscount = s->br.outeriterationscount; s->repnfev = s->br.nfev; } /************************************************************************* MCPD results INPUT PARAMETERS: State - algorithm state OUTPUT PARAMETERS: P - array[N,N], transition matrix Rep - optimization report. You should check Rep.TerminationType in order to distinguish successful termination from unsuccessful one. Speaking short, positive values denote success, negative ones are failures. More information about fields of this structure can be found in the comments on MCPDReport datatype. 
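A minimal sketch of the solve-and-read sequence (illustrative function name; S assumed already configured with data and constraints): run MCPDSolve(), then fetch the transition matrix and completion code with MCPDResults() and check Rep.TerminationType before using P:

    void example_solve_and_read(mcpdstate *s, ae_state *_state)
    {
        ae_frame _frame_block;
        ae_matrix p;
        mcpdreport rep;
        ae_frame_make(_state, &_frame_block);
        ae_matrix_init(&p, 0, 0, DT_REAL, _state);
        _mcpdreport_init(&rep, _state);
        mcpdsolve(s, _state);
        mcpdresults(s, &p, &rep, _state);
        if( rep.terminationtype>0 )
        {
            /* success: p now holds the NxN transition matrix */
        }
        else
        {
            /* failure; for example, -3 is reported for inconsistent constraint sets */
        }
        ae_frame_leave(_state);
    }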
-- ALGLIB -- Copyright 23.05.2010 by Bochkanov Sergey *************************************************************************/ void mcpdresults(mcpdstate* s, /* Real */ ae_matrix* p, mcpdreport* rep, ae_state *_state) { ae_int_t i; ae_int_t j; ae_matrix_clear(p); _mcpdreport_clear(rep); ae_matrix_set_length(p, s->n, s->n, _state); for(i=0; i<=s->n-1; i++) { for(j=0; j<=s->n-1; j++) { p->ptr.pp_double[i][j] = s->p.ptr.pp_double[i][j]; } } rep->terminationtype = s->repterminationtype; rep->inneriterationscount = s->repinneriterationscount; rep->outeriterationscount = s->repouteriterationscount; rep->nfev = s->repnfev; } /************************************************************************* Internal initialization function -- ALGLIB -- Copyright 23.05.2010 by Bochkanov Sergey *************************************************************************/ static void mcpd_mcpdinit(ae_int_t n, ae_int_t entrystate, ae_int_t exitstate, mcpdstate* s, ae_state *_state) { ae_int_t i; ae_int_t j; ae_assert(n>=1, "MCPDCreate: N<1", _state); s->n = n; ae_vector_set_length(&s->states, n, _state); for(i=0; i<=n-1; i++) { s->states.ptr.p_int[i] = 0; } if( entrystate>=0 ) { s->states.ptr.p_int[entrystate] = 1; } if( exitstate>=0 ) { s->states.ptr.p_int[exitstate] = -1; } s->npairs = 0; s->regterm = 1.0E-8; s->ccnt = 0; ae_matrix_set_length(&s->p, n, n, _state); ae_matrix_set_length(&s->ec, n, n, _state); ae_matrix_set_length(&s->bndl, n, n, _state); ae_matrix_set_length(&s->bndu, n, n, _state); ae_vector_set_length(&s->pw, n, _state); ae_matrix_set_length(&s->priorp, n, n, _state); ae_vector_set_length(&s->tmpp, n*n, _state); ae_vector_set_length(&s->effectivew, n, _state); ae_vector_set_length(&s->effectivebndl, n*n, _state); ae_vector_set_length(&s->effectivebndu, n*n, _state); ae_vector_set_length(&s->h, n*n, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { s->p.ptr.pp_double[i][j] = 0.0; s->priorp.ptr.pp_double[i][j] = 0.0; s->bndl.ptr.pp_double[i][j] = _state->v_neginf; s->bndu.ptr.pp_double[i][j] = _state->v_posinf; s->ec.ptr.pp_double[i][j] = _state->v_nan; } s->pw.ptr.p_double[i] = 0.0; s->priorp.ptr.pp_double[i][i] = 1.0; } ae_matrix_set_length(&s->data, 1, 2*n, _state); for(i=0; i<=2*n-1; i++) { s->data.ptr.pp_double[0][i] = 0.0; } for(i=0; i<=n*n-1; i++) { s->tmpp.ptr.p_double[i] = 0.0; } minbleiccreate(n*n, &s->tmpp, &s->bs, _state); } void _mcpdstate_init(void* _p, ae_state *_state) { mcpdstate *p = (mcpdstate*)_p; ae_touch_ptr((void*)p); ae_vector_init(&p->states, 0, DT_INT, _state); ae_matrix_init(&p->data, 0, 0, DT_REAL, _state); ae_matrix_init(&p->ec, 0, 0, DT_REAL, _state); ae_matrix_init(&p->bndl, 0, 0, DT_REAL, _state); ae_matrix_init(&p->bndu, 0, 0, DT_REAL, _state); ae_matrix_init(&p->c, 0, 0, DT_REAL, _state); ae_vector_init(&p->ct, 0, DT_INT, _state); ae_vector_init(&p->pw, 0, DT_REAL, _state); ae_matrix_init(&p->priorp, 0, 0, DT_REAL, _state); _minbleicstate_init(&p->bs, _state); _minbleicreport_init(&p->br, _state); ae_vector_init(&p->tmpp, 0, DT_REAL, _state); ae_vector_init(&p->effectivew, 0, DT_REAL, _state); ae_vector_init(&p->effectivebndl, 0, DT_REAL, _state); ae_vector_init(&p->effectivebndu, 0, DT_REAL, _state); ae_matrix_init(&p->effectivec, 0, 0, DT_REAL, _state); ae_vector_init(&p->effectivect, 0, DT_INT, _state); ae_vector_init(&p->h, 0, DT_REAL, _state); ae_matrix_init(&p->p, 0, 0, DT_REAL, _state); } void _mcpdstate_init_copy(void* _dst, void* _src, ae_state *_state) { mcpdstate *dst = (mcpdstate*)_dst; mcpdstate *src = (mcpdstate*)_src; dst->n = src->n; 
ae_vector_init_copy(&dst->states, &src->states, _state); dst->npairs = src->npairs; ae_matrix_init_copy(&dst->data, &src->data, _state); ae_matrix_init_copy(&dst->ec, &src->ec, _state); ae_matrix_init_copy(&dst->bndl, &src->bndl, _state); ae_matrix_init_copy(&dst->bndu, &src->bndu, _state); ae_matrix_init_copy(&dst->c, &src->c, _state); ae_vector_init_copy(&dst->ct, &src->ct, _state); dst->ccnt = src->ccnt; ae_vector_init_copy(&dst->pw, &src->pw, _state); ae_matrix_init_copy(&dst->priorp, &src->priorp, _state); dst->regterm = src->regterm; _minbleicstate_init_copy(&dst->bs, &src->bs, _state); dst->repinneriterationscount = src->repinneriterationscount; dst->repouteriterationscount = src->repouteriterationscount; dst->repnfev = src->repnfev; dst->repterminationtype = src->repterminationtype; _minbleicreport_init_copy(&dst->br, &src->br, _state); ae_vector_init_copy(&dst->tmpp, &src->tmpp, _state); ae_vector_init_copy(&dst->effectivew, &src->effectivew, _state); ae_vector_init_copy(&dst->effectivebndl, &src->effectivebndl, _state); ae_vector_init_copy(&dst->effectivebndu, &src->effectivebndu, _state); ae_matrix_init_copy(&dst->effectivec, &src->effectivec, _state); ae_vector_init_copy(&dst->effectivect, &src->effectivect, _state); ae_vector_init_copy(&dst->h, &src->h, _state); ae_matrix_init_copy(&dst->p, &src->p, _state); } void _mcpdstate_clear(void* _p) { mcpdstate *p = (mcpdstate*)_p; ae_touch_ptr((void*)p); ae_vector_clear(&p->states); ae_matrix_clear(&p->data); ae_matrix_clear(&p->ec); ae_matrix_clear(&p->bndl); ae_matrix_clear(&p->bndu); ae_matrix_clear(&p->c); ae_vector_clear(&p->ct); ae_vector_clear(&p->pw); ae_matrix_clear(&p->priorp); _minbleicstate_clear(&p->bs); _minbleicreport_clear(&p->br); ae_vector_clear(&p->tmpp); ae_vector_clear(&p->effectivew); ae_vector_clear(&p->effectivebndl); ae_vector_clear(&p->effectivebndu); ae_matrix_clear(&p->effectivec); ae_vector_clear(&p->effectivect); ae_vector_clear(&p->h); ae_matrix_clear(&p->p); } void _mcpdstate_destroy(void* _p) { mcpdstate *p = (mcpdstate*)_p; ae_touch_ptr((void*)p); ae_vector_destroy(&p->states); ae_matrix_destroy(&p->data); ae_matrix_destroy(&p->ec); ae_matrix_destroy(&p->bndl); ae_matrix_destroy(&p->bndu); ae_matrix_destroy(&p->c); ae_vector_destroy(&p->ct); ae_vector_destroy(&p->pw); ae_matrix_destroy(&p->priorp); _minbleicstate_destroy(&p->bs); _minbleicreport_destroy(&p->br); ae_vector_destroy(&p->tmpp); ae_vector_destroy(&p->effectivew); ae_vector_destroy(&p->effectivebndl); ae_vector_destroy(&p->effectivebndu); ae_matrix_destroy(&p->effectivec); ae_vector_destroy(&p->effectivect); ae_vector_destroy(&p->h); ae_matrix_destroy(&p->p); } void _mcpdreport_init(void* _p, ae_state *_state) { mcpdreport *p = (mcpdreport*)_p; ae_touch_ptr((void*)p); } void _mcpdreport_init_copy(void* _dst, void* _src, ae_state *_state) { mcpdreport *dst = (mcpdreport*)_dst; mcpdreport *src = (mcpdreport*)_src; dst->inneriterationscount = src->inneriterationscount; dst->outeriterationscount = src->outeriterationscount; dst->nfev = src->nfev; dst->terminationtype = src->terminationtype; } void _mcpdreport_clear(void* _p) { mcpdreport *p = (mcpdreport*)_p; ae_touch_ptr((void*)p); } void _mcpdreport_destroy(void* _p) { mcpdreport *p = (mcpdreport*)_p; ae_touch_ptr((void*)p); } /************************************************************************* Like MLPCreate0, but for ensembles. 
-- ALGLIB -- Copyright 18.02.2009 by Bochkanov Sergey *************************************************************************/ void mlpecreate0(ae_int_t nin, ae_int_t nout, ae_int_t ensemblesize, mlpensemble* ensemble, ae_state *_state) { ae_frame _frame_block; multilayerperceptron net; ae_frame_make(_state, &_frame_block); _mlpensemble_clear(ensemble); _multilayerperceptron_init(&net, _state); mlpcreate0(nin, nout, &net, _state); mlpecreatefromnetwork(&net, ensemblesize, ensemble, _state); ae_frame_leave(_state); } /************************************************************************* Like MLPCreate1, but for ensembles. -- ALGLIB -- Copyright 18.02.2009 by Bochkanov Sergey *************************************************************************/ void mlpecreate1(ae_int_t nin, ae_int_t nhid, ae_int_t nout, ae_int_t ensemblesize, mlpensemble* ensemble, ae_state *_state) { ae_frame _frame_block; multilayerperceptron net; ae_frame_make(_state, &_frame_block); _mlpensemble_clear(ensemble); _multilayerperceptron_init(&net, _state); mlpcreate1(nin, nhid, nout, &net, _state); mlpecreatefromnetwork(&net, ensemblesize, ensemble, _state); ae_frame_leave(_state); } /************************************************************************* Like MLPCreate2, but for ensembles. -- ALGLIB -- Copyright 18.02.2009 by Bochkanov Sergey *************************************************************************/ void mlpecreate2(ae_int_t nin, ae_int_t nhid1, ae_int_t nhid2, ae_int_t nout, ae_int_t ensemblesize, mlpensemble* ensemble, ae_state *_state) { ae_frame _frame_block; multilayerperceptron net; ae_frame_make(_state, &_frame_block); _mlpensemble_clear(ensemble); _multilayerperceptron_init(&net, _state); mlpcreate2(nin, nhid1, nhid2, nout, &net, _state); mlpecreatefromnetwork(&net, ensemblesize, ensemble, _state); ae_frame_leave(_state); } /************************************************************************* Like MLPCreateB0, but for ensembles. -- ALGLIB -- Copyright 18.02.2009 by Bochkanov Sergey *************************************************************************/ void mlpecreateb0(ae_int_t nin, ae_int_t nout, double b, double d, ae_int_t ensemblesize, mlpensemble* ensemble, ae_state *_state) { ae_frame _frame_block; multilayerperceptron net; ae_frame_make(_state, &_frame_block); _mlpensemble_clear(ensemble); _multilayerperceptron_init(&net, _state); mlpcreateb0(nin, nout, b, d, &net, _state); mlpecreatefromnetwork(&net, ensemblesize, ensemble, _state); ae_frame_leave(_state); } /************************************************************************* Like MLPCreateB1, but for ensembles. -- ALGLIB -- Copyright 18.02.2009 by Bochkanov Sergey *************************************************************************/ void mlpecreateb1(ae_int_t nin, ae_int_t nhid, ae_int_t nout, double b, double d, ae_int_t ensemblesize, mlpensemble* ensemble, ae_state *_state) { ae_frame _frame_block; multilayerperceptron net; ae_frame_make(_state, &_frame_block); _mlpensemble_clear(ensemble); _multilayerperceptron_init(&net, _state); mlpcreateb1(nin, nhid, nout, b, d, &net, _state); mlpecreatefromnetwork(&net, ensemblesize, ensemble, _state); ae_frame_leave(_state); } /************************************************************************* Like MLPCreateB2, but for ensembles. 
-- ALGLIB -- Copyright 18.02.2009 by Bochkanov Sergey *************************************************************************/ void mlpecreateb2(ae_int_t nin, ae_int_t nhid1, ae_int_t nhid2, ae_int_t nout, double b, double d, ae_int_t ensemblesize, mlpensemble* ensemble, ae_state *_state) { ae_frame _frame_block; multilayerperceptron net; ae_frame_make(_state, &_frame_block); _mlpensemble_clear(ensemble); _multilayerperceptron_init(&net, _state); mlpcreateb2(nin, nhid1, nhid2, nout, b, d, &net, _state); mlpecreatefromnetwork(&net, ensemblesize, ensemble, _state); ae_frame_leave(_state); } /************************************************************************* Like MLPCreateR0, but for ensembles. -- ALGLIB -- Copyright 18.02.2009 by Bochkanov Sergey *************************************************************************/ void mlpecreater0(ae_int_t nin, ae_int_t nout, double a, double b, ae_int_t ensemblesize, mlpensemble* ensemble, ae_state *_state) { ae_frame _frame_block; multilayerperceptron net; ae_frame_make(_state, &_frame_block); _mlpensemble_clear(ensemble); _multilayerperceptron_init(&net, _state); mlpcreater0(nin, nout, a, b, &net, _state); mlpecreatefromnetwork(&net, ensemblesize, ensemble, _state); ae_frame_leave(_state); } /************************************************************************* Like MLPCreateR1, but for ensembles. -- ALGLIB -- Copyright 18.02.2009 by Bochkanov Sergey *************************************************************************/ void mlpecreater1(ae_int_t nin, ae_int_t nhid, ae_int_t nout, double a, double b, ae_int_t ensemblesize, mlpensemble* ensemble, ae_state *_state) { ae_frame _frame_block; multilayerperceptron net; ae_frame_make(_state, &_frame_block); _mlpensemble_clear(ensemble); _multilayerperceptron_init(&net, _state); mlpcreater1(nin, nhid, nout, a, b, &net, _state); mlpecreatefromnetwork(&net, ensemblesize, ensemble, _state); ae_frame_leave(_state); } /************************************************************************* Like MLPCreateR2, but for ensembles. -- ALGLIB -- Copyright 18.02.2009 by Bochkanov Sergey *************************************************************************/ void mlpecreater2(ae_int_t nin, ae_int_t nhid1, ae_int_t nhid2, ae_int_t nout, double a, double b, ae_int_t ensemblesize, mlpensemble* ensemble, ae_state *_state) { ae_frame _frame_block; multilayerperceptron net; ae_frame_make(_state, &_frame_block); _mlpensemble_clear(ensemble); _multilayerperceptron_init(&net, _state); mlpcreater2(nin, nhid1, nhid2, nout, a, b, &net, _state); mlpecreatefromnetwork(&net, ensemblesize, ensemble, _state); ae_frame_leave(_state); } /************************************************************************* Like MLPCreateC0, but for ensembles. -- ALGLIB -- Copyright 18.02.2009 by Bochkanov Sergey *************************************************************************/ void mlpecreatec0(ae_int_t nin, ae_int_t nout, ae_int_t ensemblesize, mlpensemble* ensemble, ae_state *_state) { ae_frame _frame_block; multilayerperceptron net; ae_frame_make(_state, &_frame_block); _mlpensemble_clear(ensemble); _multilayerperceptron_init(&net, _state); mlpcreatec0(nin, nout, &net, _state); mlpecreatefromnetwork(&net, ensemblesize, ensemble, _state); ae_frame_leave(_state); } /************************************************************************* Like MLPCreateC1, but for ensembles. 
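A minimal creation sketch (illustrative function name): build an ensemble of 10 networks, each with 2 inputs, one hidden layer of 5 neurons and 1 linear output, using MLPECreate1() shown above:

    void example_create_ensemble(ae_state *_state)
    {
        ae_frame _frame_block;
        mlpensemble e;
        ae_frame_make(_state, &_frame_block);
        _mlpensemble_init(&e, _state);
        mlpecreate1(2, 5, 1, 10, &e, _state);   /* NIn=2, NHid=5, NOut=1, EnsembleSize=10 */
        ae_frame_leave(_state);
    }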
-- ALGLIB -- Copyright 18.02.2009 by Bochkanov Sergey *************************************************************************/ void mlpecreatec1(ae_int_t nin, ae_int_t nhid, ae_int_t nout, ae_int_t ensemblesize, mlpensemble* ensemble, ae_state *_state) { ae_frame _frame_block; multilayerperceptron net; ae_frame_make(_state, &_frame_block); _mlpensemble_clear(ensemble); _multilayerperceptron_init(&net, _state); mlpcreatec1(nin, nhid, nout, &net, _state); mlpecreatefromnetwork(&net, ensemblesize, ensemble, _state); ae_frame_leave(_state); } /************************************************************************* Like MLPCreateC2, but for ensembles. -- ALGLIB -- Copyright 18.02.2009 by Bochkanov Sergey *************************************************************************/ void mlpecreatec2(ae_int_t nin, ae_int_t nhid1, ae_int_t nhid2, ae_int_t nout, ae_int_t ensemblesize, mlpensemble* ensemble, ae_state *_state) { ae_frame _frame_block; multilayerperceptron net; ae_frame_make(_state, &_frame_block); _mlpensemble_clear(ensemble); _multilayerperceptron_init(&net, _state); mlpcreatec2(nin, nhid1, nhid2, nout, &net, _state); mlpecreatefromnetwork(&net, ensemblesize, ensemble, _state); ae_frame_leave(_state); } /************************************************************************* Creates ensemble from network. Only network geometry is copied. -- ALGLIB -- Copyright 17.02.2009 by Bochkanov Sergey *************************************************************************/ void mlpecreatefromnetwork(multilayerperceptron* network, ae_int_t ensemblesize, mlpensemble* ensemble, ae_state *_state) { ae_int_t i; ae_int_t ccount; ae_int_t wcount; _mlpensemble_clear(ensemble); ae_assert(ensemblesize>0, "MLPECreate: incorrect ensemble size!", _state); /* * Copy network */ mlpcopy(network, &ensemble->network, _state); /* * network properties */ if( mlpissoftmax(network, _state) ) { ccount = mlpgetinputscount(&ensemble->network, _state); } else { ccount = mlpgetinputscount(&ensemble->network, _state)+mlpgetoutputscount(&ensemble->network, _state); } wcount = mlpgetweightscount(&ensemble->network, _state); ensemble->ensemblesize = ensemblesize; /* * weights, means, sigmas */ ae_vector_set_length(&ensemble->weights, ensemblesize*wcount, _state); ae_vector_set_length(&ensemble->columnmeans, ensemblesize*ccount, _state); ae_vector_set_length(&ensemble->columnsigmas, ensemblesize*ccount, _state); for(i=0; i<=ensemblesize*wcount-1; i++) { ensemble->weights.ptr.p_double[i] = ae_randomreal(_state)-0.5; } for(i=0; i<=ensemblesize-1; i++) { ae_v_move(&ensemble->columnmeans.ptr.p_double[i*ccount], 1, &network->columnmeans.ptr.p_double[0], 1, ae_v_len(i*ccount,(i+1)*ccount-1)); ae_v_move(&ensemble->columnsigmas.ptr.p_double[i*ccount], 1, &network->columnsigmas.ptr.p_double[0], 1, ae_v_len(i*ccount,(i+1)*ccount-1)); } /* * temporaries, internal buffers */ ae_vector_set_length(&ensemble->y, mlpgetoutputscount(&ensemble->network, _state), _state); } /************************************************************************* Copying of MLPEnsemble strucure INPUT PARAMETERS: Ensemble1 - original OUTPUT PARAMETERS: Ensemble2 - copy -- ALGLIB -- Copyright 17.02.2009 by Bochkanov Sergey *************************************************************************/ void mlpecopy(mlpensemble* ensemble1, mlpensemble* ensemble2, ae_state *_state) { ae_int_t ccount; ae_int_t wcount; _mlpensemble_clear(ensemble2); /* * Unload info */ if( mlpissoftmax(&ensemble1->network, _state) ) { ccount = mlpgetinputscount(&ensemble1->network, 
_state); } else { ccount = mlpgetinputscount(&ensemble1->network, _state)+mlpgetoutputscount(&ensemble1->network, _state); } wcount = mlpgetweightscount(&ensemble1->network, _state); /* * Allocate space */ ae_vector_set_length(&ensemble2->weights, ensemble1->ensemblesize*wcount, _state); ae_vector_set_length(&ensemble2->columnmeans, ensemble1->ensemblesize*ccount, _state); ae_vector_set_length(&ensemble2->columnsigmas, ensemble1->ensemblesize*ccount, _state); ae_vector_set_length(&ensemble2->y, mlpgetoutputscount(&ensemble1->network, _state), _state); /* * Copy */ ensemble2->ensemblesize = ensemble1->ensemblesize; ae_v_move(&ensemble2->weights.ptr.p_double[0], 1, &ensemble1->weights.ptr.p_double[0], 1, ae_v_len(0,ensemble1->ensemblesize*wcount-1)); ae_v_move(&ensemble2->columnmeans.ptr.p_double[0], 1, &ensemble1->columnmeans.ptr.p_double[0], 1, ae_v_len(0,ensemble1->ensemblesize*ccount-1)); ae_v_move(&ensemble2->columnsigmas.ptr.p_double[0], 1, &ensemble1->columnsigmas.ptr.p_double[0], 1, ae_v_len(0,ensemble1->ensemblesize*ccount-1)); mlpcopy(&ensemble1->network, &ensemble2->network, _state); } /************************************************************************* Randomization of MLP ensemble -- ALGLIB -- Copyright 17.02.2009 by Bochkanov Sergey *************************************************************************/ void mlperandomize(mlpensemble* ensemble, ae_state *_state) { ae_int_t i; ae_int_t wcount; wcount = mlpgetweightscount(&ensemble->network, _state); for(i=0; i<=ensemble->ensemblesize*wcount-1; i++) { ensemble->weights.ptr.p_double[i] = ae_randomreal(_state)-0.5; } } /************************************************************************* Return ensemble properties (number of inputs and outputs). -- ALGLIB -- Copyright 17.02.2009 by Bochkanov Sergey *************************************************************************/ void mlpeproperties(mlpensemble* ensemble, ae_int_t* nin, ae_int_t* nout, ae_state *_state) { *nin = 0; *nout = 0; *nin = mlpgetinputscount(&ensemble->network, _state); *nout = mlpgetoutputscount(&ensemble->network, _state); } /************************************************************************* Return normalization type (whether ensemble is SOFTMAX-normalized or not). -- ALGLIB -- Copyright 17.02.2009 by Bochkanov Sergey *************************************************************************/ ae_bool mlpeissoftmax(mlpensemble* ensemble, ae_state *_state) { ae_bool result; result = mlpissoftmax(&ensemble->network, _state); return result; } /************************************************************************* Procesing INPUT PARAMETERS: Ensemble- neural networks ensemble X - input vector, array[0..NIn-1]. Y - (possibly) preallocated buffer; if size of Y is less than NOut, it will be reallocated. If it is large enough, it is NOT reallocated, so we can save some time on reallocation. OUTPUT PARAMETERS: Y - result. Regression estimate when solving regression task, vector of posterior probabilities for classification task. 
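A minimal sketch of a single prediction with the function documented above (illustrative name; Ensemble assumed already created and trained): query the geometry with MLPEProperties(), fill the input vector, and call MLPEProcess(); Y is resized automatically if it is too small:

    void example_ensemble_process(mlpensemble *ensemble, ae_state *_state)
    {
        ae_frame _frame_block;
        ae_vector x;
        ae_vector y;
        ae_int_t nin;
        ae_int_t nout;
        ae_int_t i;
        ae_frame_make(_state, &_frame_block);
        ae_vector_init(&x, 0, DT_REAL, _state);
        ae_vector_init(&y, 0, DT_REAL, _state);
        mlpeproperties(ensemble, &nin, &nout, _state);
        ae_vector_set_length(&x, nin, _state);
        for(i=0; i<=nin-1; i++)
        {
            x.ptr.p_double[i] = 0.0;             /* fill with the actual input here      */
        }
        mlpeprocess(ensemble, &x, &y, _state);   /* Y is (re)allocated if it is too small */
        /* y.ptr.p_double[0..nout-1] now holds the averaged ensemble output */
        ae_frame_leave(_state);
    }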
-- ALGLIB -- Copyright 17.02.2009 by Bochkanov Sergey *************************************************************************/ void mlpeprocess(mlpensemble* ensemble, /* Real */ ae_vector* x, /* Real */ ae_vector* y, ae_state *_state) { ae_int_t i; ae_int_t es; ae_int_t wc; ae_int_t cc; double v; ae_int_t nout; if( y->cnt<mlpgetoutputscount(&ensemble->network, _state) ) { ae_vector_set_length(y, mlpgetoutputscount(&ensemble->network, _state), _state); } es = ensemble->ensemblesize; wc = mlpgetweightscount(&ensemble->network, _state); if( mlpissoftmax(&ensemble->network, _state) ) { cc = mlpgetinputscount(&ensemble->network, _state); } else { cc = mlpgetinputscount(&ensemble->network, _state)+mlpgetoutputscount(&ensemble->network, _state); } v = (double)1/(double)es; nout = mlpgetoutputscount(&ensemble->network, _state); for(i=0; i<=nout-1; i++) { y->ptr.p_double[i] = (double)(0); } for(i=0; i<=es-1; i++) { ae_v_move(&ensemble->network.weights.ptr.p_double[0], 1, &ensemble->weights.ptr.p_double[i*wc], 1, ae_v_len(0,wc-1)); ae_v_move(&ensemble->network.columnmeans.ptr.p_double[0], 1, &ensemble->columnmeans.ptr.p_double[i*cc], 1, ae_v_len(0,cc-1)); ae_v_move(&ensemble->network.columnsigmas.ptr.p_double[0], 1, &ensemble->columnsigmas.ptr.p_double[i*cc], 1, ae_v_len(0,cc-1)); mlpprocess(&ensemble->network, x, &ensemble->y, _state); ae_v_addd(&y->ptr.p_double[0], 1, &ensemble->y.ptr.p_double[0], 1, ae_v_len(0,nout-1), v); } } /************************************************************************* 'interactive' variant of MLPEProcess for languages like Python which support constructs like "Y = MLPEProcess(LM,X)" and interactive mode of the interpreter. This function allocates new array on each call, so it is significantly slower than its 'non-interactive' counterpart, but it is more convenient when you call it from command line. 
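As the MLPEProcess() body above shows, each member's weights, means and sigmas are loaded into the shared network, MLPProcess() is run, and the outputs are accumulated with weight 1/EnsembleSize, i.e. the result is a plain average of the member outputs. A compact standalone restatement of that accumulation (illustrative, plain C):

    /* averages 'es' member outputs of length 'nout', stored row by row in member_y */
    void ensemble_average(double **member_y, ae_int_t es, ae_int_t nout, double *y)
    {
        ae_int_t i;
        ae_int_t k;
        double v;
        v = (double)1/(double)es;
        for(i=0; i<=nout-1; i++)
        {
            y[i] = 0.0;
        }
        for(k=0; k<=es-1; k++)
        {
            for(i=0; i<=nout-1; i++)
            {
                y[i] = y[i]+v*member_y[k][i];   /* accumulate member K with weight 1/ES */
            }
        }
    }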
-- ALGLIB -- Copyright 17.02.2009 by Bochkanov Sergey *************************************************************************/ void mlpeprocessi(mlpensemble* ensemble, /* Real */ ae_vector* x, /* Real */ ae_vector* y, ae_state *_state) { ae_vector_clear(y); mlpeprocess(ensemble, x, y, _state); } /************************************************************************* Calculation of all types of errors -- ALGLIB -- Copyright 17.02.2009 by Bochkanov Sergey *************************************************************************/ void mlpeallerrorsx(mlpensemble* ensemble, /* Real */ ae_matrix* densexy, sparsematrix* sparsexy, ae_int_t datasetsize, ae_int_t datasettype, /* Integer */ ae_vector* idx, ae_int_t subset0, ae_int_t subset1, ae_int_t subsettype, ae_shared_pool* buf, modelerrors* rep, ae_state *_state) { ae_frame _frame_block; ae_int_t i; ae_int_t j; ae_int_t nin; ae_int_t nout; ae_bool iscls; ae_int_t srcidx; mlpbuffers *pbuf; ae_smart_ptr _pbuf; modelerrors rep0; modelerrors rep1; ae_frame_make(_state, &_frame_block); ae_smart_ptr_init(&_pbuf, (void**)&pbuf, _state); _modelerrors_init(&rep0, _state); _modelerrors_init(&rep1, _state); /* * Get network information */ nin = mlpgetinputscount(&ensemble->network, _state); nout = mlpgetoutputscount(&ensemble->network, _state); iscls = mlpissoftmax(&ensemble->network, _state); /* * Retrieve buffer, prepare, process data, recycle buffer */ ae_shared_pool_retrieve(buf, &_pbuf, _state); if( iscls ) { dserrallocate(nout, &pbuf->tmp0, _state); } else { dserrallocate(-nout, &pbuf->tmp0, _state); } rvectorsetlengthatleast(&pbuf->x, nin, _state); rvectorsetlengthatleast(&pbuf->y, nout, _state); rvectorsetlengthatleast(&pbuf->desiredy, nout, _state); for(i=subset0; i<=subset1-1; i++) { srcidx = -1; if( subsettype==0 ) { srcidx = i; } if( subsettype==1 ) { srcidx = idx->ptr.p_int[i]; } ae_assert(srcidx>=0, "MLPEAllErrorsX: internal error", _state); if( datasettype==0 ) { ae_v_move(&pbuf->x.ptr.p_double[0], 1, &densexy->ptr.pp_double[srcidx][0], 1, ae_v_len(0,nin-1)); } if( datasettype==1 ) { sparsegetrow(sparsexy, srcidx, &pbuf->x, _state); } mlpeprocess(ensemble, &pbuf->x, &pbuf->y, _state); if( mlpissoftmax(&ensemble->network, _state) ) { if( datasettype==0 ) { pbuf->desiredy.ptr.p_double[0] = densexy->ptr.pp_double[srcidx][nin]; } if( datasettype==1 ) { pbuf->desiredy.ptr.p_double[0] = sparseget(sparsexy, srcidx, nin, _state); } } else { if( datasettype==0 ) { ae_v_move(&pbuf->desiredy.ptr.p_double[0], 1, &densexy->ptr.pp_double[srcidx][nin], 1, ae_v_len(0,nout-1)); } if( datasettype==1 ) { for(j=0; j<=nout-1; j++) { pbuf->desiredy.ptr.p_double[j] = sparseget(sparsexy, srcidx, nin+j, _state); } } } dserraccumulate(&pbuf->tmp0, &pbuf->y, &pbuf->desiredy, _state); } dserrfinish(&pbuf->tmp0, _state); rep->relclserror = pbuf->tmp0.ptr.p_double[0]; rep->avgce = pbuf->tmp0.ptr.p_double[1]/ae_log((double)(2), _state); rep->rmserror = pbuf->tmp0.ptr.p_double[2]; rep->avgerror = pbuf->tmp0.ptr.p_double[3]; rep->avgrelerror = pbuf->tmp0.ptr.p_double[4]; ae_shared_pool_recycle(buf, &_pbuf, _state); ae_frame_leave(_state); } /************************************************************************* Calculation of all types of errors on dataset given by sparse matrix -- ALGLIB -- Copyright 10.09.2012 by Bochkanov Sergey *************************************************************************/ void mlpeallerrorssparse(mlpensemble* ensemble, sparsematrix* xy, ae_int_t npoints, double* relcls, double* avgce, double* rms, double* avg, double* avgrel, 
ae_state *_state) { ae_frame _frame_block; ae_int_t i; ae_vector buf; ae_vector workx; ae_vector y; ae_vector dy; ae_int_t nin; ae_int_t nout; ae_frame_make(_state, &_frame_block); *relcls = 0; *avgce = 0; *rms = 0; *avg = 0; *avgrel = 0; ae_vector_init(&buf, 0, DT_REAL, _state); ae_vector_init(&workx, 0, DT_REAL, _state); ae_vector_init(&y, 0, DT_REAL, _state); ae_vector_init(&dy, 0, DT_REAL, _state); nin = mlpgetinputscount(&ensemble->network, _state); nout = mlpgetoutputscount(&ensemble->network, _state); if( mlpissoftmax(&ensemble->network, _state) ) { ae_vector_set_length(&dy, 1, _state); dserrallocate(nout, &buf, _state); } else { ae_vector_set_length(&dy, nout, _state); dserrallocate(-nout, &buf, _state); } for(i=0; i<=npoints-1; i++) { sparsegetrow(xy, i, &workx, _state); mlpeprocess(ensemble, &workx, &y, _state); if( mlpissoftmax(&ensemble->network, _state) ) { dy.ptr.p_double[0] = workx.ptr.p_double[nin]; } else { ae_v_move(&dy.ptr.p_double[0], 1, &workx.ptr.p_double[nin], 1, ae_v_len(0,nout-1)); } dserraccumulate(&buf, &y, &dy, _state); } dserrfinish(&buf, _state); *relcls = buf.ptr.p_double[0]; *avgce = buf.ptr.p_double[1]; *rms = buf.ptr.p_double[2]; *avg = buf.ptr.p_double[3]; *avgrel = buf.ptr.p_double[4]; ae_frame_leave(_state); } /************************************************************************* Relative classification error on the test set INPUT PARAMETERS: Ensemble- ensemble XY - test set NPoints - test set size RESULT: percent of incorrectly classified cases. Works both for classifier betwork and for regression networks which are used as classifiers. -- ALGLIB -- Copyright 17.02.2009 by Bochkanov Sergey *************************************************************************/ double mlperelclserror(mlpensemble* ensemble, /* Real */ ae_matrix* xy, ae_int_t npoints, ae_state *_state) { ae_frame _frame_block; modelerrors rep; double result; ae_frame_make(_state, &_frame_block); _modelerrors_init(&rep, _state); mlpeallerrorsx(ensemble, xy, &ensemble->network.dummysxy, npoints, 0, &ensemble->network.dummyidx, 0, npoints, 0, &ensemble->network.buf, &rep, _state); result = rep.relclserror; ae_frame_leave(_state); return result; } /************************************************************************* Average cross-entropy (in bits per element) on the test set INPUT PARAMETERS: Ensemble- ensemble XY - test set NPoints - test set size RESULT: CrossEntropy/(NPoints*LN(2)). Zero if ensemble solves regression task. -- ALGLIB -- Copyright 17.02.2009 by Bochkanov Sergey *************************************************************************/ double mlpeavgce(mlpensemble* ensemble, /* Real */ ae_matrix* xy, ae_int_t npoints, ae_state *_state) { ae_frame _frame_block; modelerrors rep; double result; ae_frame_make(_state, &_frame_block); _modelerrors_init(&rep, _state); mlpeallerrorsx(ensemble, xy, &ensemble->network.dummysxy, npoints, 0, &ensemble->network.dummyidx, 0, npoints, 0, &ensemble->network.buf, &rep, _state); result = rep.avgce; ae_frame_leave(_state); return result; } /************************************************************************* RMS error on the test set INPUT PARAMETERS: Ensemble- ensemble XY - test set NPoints - test set size RESULT: root mean square error. Its meaning for regression task is obvious. As for classification task RMS error means error when estimating posterior probabilities. 
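A short sketch of test-set evaluation with the error functions described above (illustrative function name; XY is a dense test set in the same format as the training set, array[NPoints,NIn+NOut] for regression or array[NPoints,NIn+1] for classification):

    void example_ensemble_errors(mlpensemble *ensemble, /* Real */ ae_matrix *xy, ae_int_t npoints, ae_state *_state)
    {
        double relcls;
        double avgce;
        double rms;
        relcls = mlperelclserror(ensemble, xy, npoints, _state);   /* percent of misclassified cases      */
        avgce = mlpeavgce(ensemble, xy, npoints, _state);          /* average cross-entropy, bits/element */
        rms = mlpermserror(ensemble, xy, npoints, _state);         /* root mean square error              */
        /* use or report the metrics as needed */
    }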
-- ALGLIB -- Copyright 17.02.2009 by Bochkanov Sergey *************************************************************************/ double mlpermserror(mlpensemble* ensemble, /* Real */ ae_matrix* xy, ae_int_t npoints, ae_state *_state) { ae_frame _frame_block; modelerrors rep; double result; ae_frame_make(_state, &_frame_block); _modelerrors_init(&rep, _state); mlpeallerrorsx(ensemble, xy, &ensemble->network.dummysxy, npoints, 0, &ensemble->network.dummyidx, 0, npoints, 0, &ensemble->network.buf, &rep, _state); result = rep.rmserror; ae_frame_leave(_state); return result; } /************************************************************************* Average error on the test set INPUT PARAMETERS: Ensemble- ensemble XY - test set NPoints - test set size RESULT: Its meaning for regression task is obvious. As for classification task it means average error when estimating posterior probabilities. -- ALGLIB -- Copyright 17.02.2009 by Bochkanov Sergey *************************************************************************/ double mlpeavgerror(mlpensemble* ensemble, /* Real */ ae_matrix* xy, ae_int_t npoints, ae_state *_state) { ae_frame _frame_block; modelerrors rep; double result; ae_frame_make(_state, &_frame_block); _modelerrors_init(&rep, _state); mlpeallerrorsx(ensemble, xy, &ensemble->network.dummysxy, npoints, 0, &ensemble->network.dummyidx, 0, npoints, 0, &ensemble->network.buf, &rep, _state); result = rep.avgerror; ae_frame_leave(_state); return result; } /************************************************************************* Average relative error on the test set INPUT PARAMETERS: Ensemble- ensemble XY - test set NPoints - test set size RESULT: Its meaning for regression task is obvious. As for classification task it means average relative error when estimating posterior probabilities. 
-- ALGLIB -- Copyright 17.02.2009 by Bochkanov Sergey *************************************************************************/ double mlpeavgrelerror(mlpensemble* ensemble, /* Real */ ae_matrix* xy, ae_int_t npoints, ae_state *_state) { ae_frame _frame_block; modelerrors rep; double result; ae_frame_make(_state, &_frame_block); _modelerrors_init(&rep, _state); mlpeallerrorsx(ensemble, xy, &ensemble->network.dummysxy, npoints, 0, &ensemble->network.dummyidx, 0, npoints, 0, &ensemble->network.buf, &rep, _state); result = rep.avgrelerror; ae_frame_leave(_state); return result; } /************************************************************************* Serializer: allocation -- ALGLIB -- Copyright 19.10.2011 by Bochkanov Sergey *************************************************************************/ void mlpealloc(ae_serializer* s, mlpensemble* ensemble, ae_state *_state) { ae_serializer_alloc_entry(s); ae_serializer_alloc_entry(s); ae_serializer_alloc_entry(s); allocrealarray(s, &ensemble->weights, -1, _state); allocrealarray(s, &ensemble->columnmeans, -1, _state); allocrealarray(s, &ensemble->columnsigmas, -1, _state); mlpalloc(s, &ensemble->network, _state); } /************************************************************************* Serializer: serialization -- ALGLIB -- Copyright 14.03.2011 by Bochkanov Sergey *************************************************************************/ void mlpeserialize(ae_serializer* s, mlpensemble* ensemble, ae_state *_state) { ae_serializer_serialize_int(s, getmlpeserializationcode(_state), _state); ae_serializer_serialize_int(s, mlpe_mlpefirstversion, _state); ae_serializer_serialize_int(s, ensemble->ensemblesize, _state); serializerealarray(s, &ensemble->weights, -1, _state); serializerealarray(s, &ensemble->columnmeans, -1, _state); serializerealarray(s, &ensemble->columnsigmas, -1, _state); mlpserialize(s, &ensemble->network, _state); } /************************************************************************* Serializer: unserialization -- ALGLIB -- Copyright 14.03.2011 by Bochkanov Sergey *************************************************************************/ void mlpeunserialize(ae_serializer* s, mlpensemble* ensemble, ae_state *_state) { ae_int_t i0; ae_int_t i1; _mlpensemble_clear(ensemble); /* * check correctness of header */ ae_serializer_unserialize_int(s, &i0, _state); ae_assert(i0==getmlpeserializationcode(_state), "MLPEUnserialize: stream header corrupted", _state); ae_serializer_unserialize_int(s, &i1, _state); ae_assert(i1==mlpe_mlpefirstversion, "MLPEUnserialize: stream header corrupted", _state); /* * Create network */ ae_serializer_unserialize_int(s, &ensemble->ensemblesize, _state); unserializerealarray(s, &ensemble->weights, _state); unserializerealarray(s, &ensemble->columnmeans, _state); unserializerealarray(s, &ensemble->columnsigmas, _state); mlpunserialize(s, &ensemble->network, _state); /* * Allocate termoraries */ ae_vector_set_length(&ensemble->y, mlpgetoutputscount(&ensemble->network, _state), _state); } void _mlpensemble_init(void* _p, ae_state *_state) { mlpensemble *p = (mlpensemble*)_p; ae_touch_ptr((void*)p); ae_vector_init(&p->weights, 0, DT_REAL, _state); ae_vector_init(&p->columnmeans, 0, DT_REAL, _state); ae_vector_init(&p->columnsigmas, 0, DT_REAL, _state); _multilayerperceptron_init(&p->network, _state); ae_vector_init(&p->y, 0, DT_REAL, _state); } void _mlpensemble_init_copy(void* _dst, void* _src, ae_state *_state) { mlpensemble *dst = (mlpensemble*)_dst; mlpensemble *src = (mlpensemble*)_src; 
dst->ensemblesize = src->ensemblesize; ae_vector_init_copy(&dst->weights, &src->weights, _state); ae_vector_init_copy(&dst->columnmeans, &src->columnmeans, _state); ae_vector_init_copy(&dst->columnsigmas, &src->columnsigmas, _state); _multilayerperceptron_init_copy(&dst->network, &src->network, _state); ae_vector_init_copy(&dst->y, &src->y, _state); } void _mlpensemble_clear(void* _p) { mlpensemble *p = (mlpensemble*)_p; ae_touch_ptr((void*)p); ae_vector_clear(&p->weights); ae_vector_clear(&p->columnmeans); ae_vector_clear(&p->columnsigmas); _multilayerperceptron_clear(&p->network); ae_vector_clear(&p->y); } void _mlpensemble_destroy(void* _p) { mlpensemble *p = (mlpensemble*)_p; ae_touch_ptr((void*)p); ae_vector_destroy(&p->weights); ae_vector_destroy(&p->columnmeans); ae_vector_destroy(&p->columnsigmas); _multilayerperceptron_destroy(&p->network); ae_vector_destroy(&p->y); } /************************************************************************* Neural network training using modified Levenberg-Marquardt with exact Hessian calculation and regularization. Subroutine trains neural network with restarts from random positions. Algorithm is well suited for small and medium scale problems (hundreds of weights). INPUT PARAMETERS: Network - neural network with initialized geometry XY - training set NPoints - training set size Decay - weight decay constant, >=0.001 Decay term 'Decay*||Weights||^2' is added to error function. If you don't know what Decay to choose, use 0.001. Restarts - number of restarts from random position, >0. If you don't know what Restarts to choose, use 2. OUTPUT PARAMETERS: Network - trained neural network. Info - return code: * -9, if internal matrix inverse subroutine failed * -2, if there is a point with class number outside of [0..NOut-1]. * -1, if wrong parameters specified (NPoints<0, Restarts<1). * 2, if task has been solved. 
Rep - training report -- ALGLIB -- Copyright 10.03.2009 by Bochkanov Sergey *************************************************************************/ void mlptrainlm(multilayerperceptron* network, /* Real */ ae_matrix* xy, ae_int_t npoints, double decay, ae_int_t restarts, ae_int_t* info, mlpreport* rep, ae_state *_state) { ae_frame _frame_block; ae_int_t nin; ae_int_t nout; ae_int_t wcount; double lmsteptol; ae_int_t i; ae_int_t k; double v; double e; double enew; double xnorm2; double stepnorm; ae_vector g; ae_vector d; ae_matrix h; ae_matrix hmod; ae_matrix z; ae_bool spd; double nu; double lambdav; double lambdaup; double lambdadown; minlbfgsreport internalrep; minlbfgsstate state; ae_vector x; ae_vector y; ae_vector wbase; ae_vector wdir; ae_vector wt; ae_vector wx; ae_int_t pass; ae_vector wbest; double ebest; ae_int_t invinfo; matinvreport invrep; ae_int_t solverinfo; densesolverreport solverrep; ae_frame_make(_state, &_frame_block); *info = 0; _mlpreport_clear(rep); ae_vector_init(&g, 0, DT_REAL, _state); ae_vector_init(&d, 0, DT_REAL, _state); ae_matrix_init(&h, 0, 0, DT_REAL, _state); ae_matrix_init(&hmod, 0, 0, DT_REAL, _state); ae_matrix_init(&z, 0, 0, DT_REAL, _state); _minlbfgsreport_init(&internalrep, _state); _minlbfgsstate_init(&state, _state); ae_vector_init(&x, 0, DT_REAL, _state); ae_vector_init(&y, 0, DT_REAL, _state); ae_vector_init(&wbase, 0, DT_REAL, _state); ae_vector_init(&wdir, 0, DT_REAL, _state); ae_vector_init(&wt, 0, DT_REAL, _state); ae_vector_init(&wx, 0, DT_REAL, _state); ae_vector_init(&wbest, 0, DT_REAL, _state); _matinvreport_init(&invrep, _state); _densesolverreport_init(&solverrep, _state); mlpproperties(network, &nin, &nout, &wcount, _state); lambdaup = (double)(10); lambdadown = 0.3; lmsteptol = 0.001; /* * Test for inputs */ if( npoints<=0||restarts<1 ) { *info = -1; ae_frame_leave(_state); return; } if( mlpissoftmax(network, _state) ) { for(i=0; i<=npoints-1; i++) { if( ae_round(xy->ptr.pp_double[i][nin], _state)<0||ae_round(xy->ptr.pp_double[i][nin], _state)>=nout ) { *info = -2; ae_frame_leave(_state); return; } } } decay = ae_maxreal(decay, mlptrain_mindecay, _state); *info = 2; /* * Initialize data */ rep->ngrad = 0; rep->nhess = 0; rep->ncholesky = 0; /* * General case. * Prepare task and network. Allocate space. 
*/ mlpinitpreprocessor(network, xy, npoints, _state); ae_vector_set_length(&g, wcount-1+1, _state); ae_matrix_set_length(&h, wcount-1+1, wcount-1+1, _state); ae_matrix_set_length(&hmod, wcount-1+1, wcount-1+1, _state); ae_vector_set_length(&wbase, wcount-1+1, _state); ae_vector_set_length(&wdir, wcount-1+1, _state); ae_vector_set_length(&wbest, wcount-1+1, _state); ae_vector_set_length(&wt, wcount-1+1, _state); ae_vector_set_length(&wx, wcount-1+1, _state); ebest = ae_maxrealnumber; /* * Multiple passes */ for(pass=1; pass<=restarts; pass++) { /* * Initialize weights */ mlprandomize(network, _state); /* * First stage of the hybrid algorithm: LBFGS */ ae_v_move(&wbase.ptr.p_double[0], 1, &network->weights.ptr.p_double[0], 1, ae_v_len(0,wcount-1)); minlbfgscreate(wcount, ae_minint(wcount, 5, _state), &wbase, &state, _state); minlbfgssetcond(&state, (double)(0), (double)(0), (double)(0), ae_maxint(25, wcount, _state), _state); while(minlbfgsiteration(&state, _state)) { /* * gradient */ ae_v_move(&network->weights.ptr.p_double[0], 1, &state.x.ptr.p_double[0], 1, ae_v_len(0,wcount-1)); mlpgradbatch(network, xy, npoints, &state.f, &state.g, _state); /* * weight decay */ v = ae_v_dotproduct(&network->weights.ptr.p_double[0], 1, &network->weights.ptr.p_double[0], 1, ae_v_len(0,wcount-1)); state.f = state.f+0.5*decay*v; ae_v_addd(&state.g.ptr.p_double[0], 1, &network->weights.ptr.p_double[0], 1, ae_v_len(0,wcount-1), decay); /* * next iteration */ rep->ngrad = rep->ngrad+1; } minlbfgsresults(&state, &wbase, &internalrep, _state); ae_v_move(&network->weights.ptr.p_double[0], 1, &wbase.ptr.p_double[0], 1, ae_v_len(0,wcount-1)); /* * Second stage of the hybrid algorithm: LM * * Initialize H with identity matrix, * G with gradient, * E with regularized error. */ mlphessianbatch(network, xy, npoints, &e, &g, &h, _state); v = ae_v_dotproduct(&network->weights.ptr.p_double[0], 1, &network->weights.ptr.p_double[0], 1, ae_v_len(0,wcount-1)); e = e+0.5*decay*v; ae_v_addd(&g.ptr.p_double[0], 1, &network->weights.ptr.p_double[0], 1, ae_v_len(0,wcount-1), decay); for(k=0; k<=wcount-1; k++) { h.ptr.pp_double[k][k] = h.ptr.pp_double[k][k]+decay; } rep->nhess = rep->nhess+1; lambdav = 0.001; nu = (double)(2); for(;;) { /* * 1. HMod = H+lambda*I * 2. Try to solve (H+Lambda*I)*dx = -g. * Increase lambda if left part is not positive definite. */ for(i=0; i<=wcount-1; i++) { ae_v_move(&hmod.ptr.pp_double[i][0], 1, &h.ptr.pp_double[i][0], 1, ae_v_len(0,wcount-1)); hmod.ptr.pp_double[i][i] = hmod.ptr.pp_double[i][i]+lambdav; } spd = spdmatrixcholesky(&hmod, wcount, ae_true, _state); rep->ncholesky = rep->ncholesky+1; if( !spd ) { lambdav = lambdav*lambdaup*nu; nu = nu*2; continue; } spdmatrixcholeskysolve(&hmod, wcount, ae_true, &g, &solverinfo, &solverrep, &wdir, _state); if( solverinfo<0 ) { lambdav = lambdav*lambdaup*nu; nu = nu*2; continue; } ae_v_muld(&wdir.ptr.p_double[0], 1, ae_v_len(0,wcount-1), -1); /* * Lambda found. * 1. Save old w in WBase * 1. Test some stopping criterions * 2. 
If error(w+wdir)>error(w), increase lambda */ ae_v_add(&network->weights.ptr.p_double[0], 1, &wdir.ptr.p_double[0], 1, ae_v_len(0,wcount-1)); xnorm2 = ae_v_dotproduct(&network->weights.ptr.p_double[0], 1, &network->weights.ptr.p_double[0], 1, ae_v_len(0,wcount-1)); stepnorm = ae_v_dotproduct(&wdir.ptr.p_double[0], 1, &wdir.ptr.p_double[0], 1, ae_v_len(0,wcount-1)); stepnorm = ae_sqrt(stepnorm, _state); enew = mlperror(network, xy, npoints, _state)+0.5*decay*xnorm2; if( ae_fp_less(stepnorm,lmsteptol*(1+ae_sqrt(xnorm2, _state))) ) { break; } if( ae_fp_greater(enew,e) ) { lambdav = lambdav*lambdaup*nu; nu = nu*2; continue; } /* * Optimize using inv(cholesky(H)) as preconditioner */ rmatrixtrinverse(&hmod, wcount, ae_true, ae_false, &invinfo, &invrep, _state); if( invinfo<=0 ) { /* * if matrix can't be inverted then exit with errors * TODO: make WCount steps in direction suggested by HMod */ *info = -9; ae_frame_leave(_state); return; } ae_v_move(&wbase.ptr.p_double[0], 1, &network->weights.ptr.p_double[0], 1, ae_v_len(0,wcount-1)); for(i=0; i<=wcount-1; i++) { wt.ptr.p_double[i] = (double)(0); } minlbfgscreatex(wcount, wcount, &wt, 1, 0.0, &state, _state); minlbfgssetcond(&state, (double)(0), (double)(0), (double)(0), 5, _state); while(minlbfgsiteration(&state, _state)) { /* * gradient */ for(i=0; i<=wcount-1; i++) { v = ae_v_dotproduct(&state.x.ptr.p_double[i], 1, &hmod.ptr.pp_double[i][i], 1, ae_v_len(i,wcount-1)); network->weights.ptr.p_double[i] = wbase.ptr.p_double[i]+v; } mlpgradbatch(network, xy, npoints, &state.f, &g, _state); for(i=0; i<=wcount-1; i++) { state.g.ptr.p_double[i] = (double)(0); } for(i=0; i<=wcount-1; i++) { v = g.ptr.p_double[i]; ae_v_addd(&state.g.ptr.p_double[i], 1, &hmod.ptr.pp_double[i][i], 1, ae_v_len(i,wcount-1), v); } /* * weight decay * grad(x'*x) = A'*(x0+A*t) */ v = ae_v_dotproduct(&network->weights.ptr.p_double[0], 1, &network->weights.ptr.p_double[0], 1, ae_v_len(0,wcount-1)); state.f = state.f+0.5*decay*v; for(i=0; i<=wcount-1; i++) { v = decay*network->weights.ptr.p_double[i]; ae_v_addd(&state.g.ptr.p_double[i], 1, &hmod.ptr.pp_double[i][i], 1, ae_v_len(i,wcount-1), v); } /* * next iteration */ rep->ngrad = rep->ngrad+1; } minlbfgsresults(&state, &wt, &internalrep, _state); /* * Accept new position. 
* Calculate Hessian */ for(i=0; i<=wcount-1; i++) { v = ae_v_dotproduct(&wt.ptr.p_double[i], 1, &hmod.ptr.pp_double[i][i], 1, ae_v_len(i,wcount-1)); network->weights.ptr.p_double[i] = wbase.ptr.p_double[i]+v; } mlphessianbatch(network, xy, npoints, &e, &g, &h, _state); v = ae_v_dotproduct(&network->weights.ptr.p_double[0], 1, &network->weights.ptr.p_double[0], 1, ae_v_len(0,wcount-1)); e = e+0.5*decay*v; ae_v_addd(&g.ptr.p_double[0], 1, &network->weights.ptr.p_double[0], 1, ae_v_len(0,wcount-1), decay); for(k=0; k<=wcount-1; k++) { h.ptr.pp_double[k][k] = h.ptr.pp_double[k][k]+decay; } rep->nhess = rep->nhess+1; /* * Update lambda */ lambdav = lambdav*lambdadown; nu = (double)(2); } /* * update WBest */ v = ae_v_dotproduct(&network->weights.ptr.p_double[0], 1, &network->weights.ptr.p_double[0], 1, ae_v_len(0,wcount-1)); e = 0.5*decay*v+mlperror(network, xy, npoints, _state); if( ae_fp_less(e,ebest) ) { ebest = e; ae_v_move(&wbest.ptr.p_double[0], 1, &network->weights.ptr.p_double[0], 1, ae_v_len(0,wcount-1)); } } /* * copy WBest to output */ ae_v_move(&network->weights.ptr.p_double[0], 1, &wbest.ptr.p_double[0], 1, ae_v_len(0,wcount-1)); ae_frame_leave(_state); } /************************************************************************* Neural network training using L-BFGS algorithm with regularization. Subroutine trains neural network with restarts from random positions. Algorithm is well suited for problems of any dimensionality (memory requirements and step complexity are linear by weights number). INPUT PARAMETERS: Network - neural network with initialized geometry XY - training set NPoints - training set size Decay - weight decay constant, >=0.001 Decay term 'Decay*||Weights||^2' is added to error function. If you don't know what Decay to choose, use 0.001. Restarts - number of restarts from random position, >0. If you don't know what Restarts to choose, use 2. WStep - stopping criterion. Algorithm stops if step size is less than WStep. Recommended value - 0.01. Zero step size means stopping after MaxIts iterations. MaxIts - stopping criterion. Algorithm stops after MaxIts iterations (NOT gradient calculations). Zero MaxIts means stopping when step is sufficiently small. OUTPUT PARAMETERS: Network - trained neural network. Info - return code: * -8, if both WStep=0 and MaxIts=0 * -2, if there is a point with class number outside of [0..NOut-1]. * -1, if wrong parameters specified (NPoints<0, Restarts<1). * 2, if task has been solved. 
Rep - training report -- ALGLIB -- Copyright 09.12.2007 by Bochkanov Sergey *************************************************************************/ void mlptrainlbfgs(multilayerperceptron* network, /* Real */ ae_matrix* xy, ae_int_t npoints, double decay, ae_int_t restarts, double wstep, ae_int_t maxits, ae_int_t* info, mlpreport* rep, ae_state *_state) { ae_frame _frame_block; ae_int_t i; ae_int_t pass; ae_int_t nin; ae_int_t nout; ae_int_t wcount; ae_vector w; ae_vector wbest; double e; double v; double ebest; minlbfgsreport internalrep; minlbfgsstate state; ae_frame_make(_state, &_frame_block); *info = 0; _mlpreport_clear(rep); ae_vector_init(&w, 0, DT_REAL, _state); ae_vector_init(&wbest, 0, DT_REAL, _state); _minlbfgsreport_init(&internalrep, _state); _minlbfgsstate_init(&state, _state); /* * Test inputs, parse flags, read network geometry */ if( ae_fp_eq(wstep,(double)(0))&&maxits==0 ) { *info = -8; ae_frame_leave(_state); return; } if( ((npoints<=0||restarts<1)||ae_fp_less(wstep,(double)(0)))||maxits<0 ) { *info = -1; ae_frame_leave(_state); return; } mlpproperties(network, &nin, &nout, &wcount, _state); if( mlpissoftmax(network, _state) ) { for(i=0; i<=npoints-1; i++) { if( ae_round(xy->ptr.pp_double[i][nin], _state)<0||ae_round(xy->ptr.pp_double[i][nin], _state)>=nout ) { *info = -2; ae_frame_leave(_state); return; } } } decay = ae_maxreal(decay, mlptrain_mindecay, _state); *info = 2; /* * Prepare */ mlpinitpreprocessor(network, xy, npoints, _state); ae_vector_set_length(&w, wcount-1+1, _state); ae_vector_set_length(&wbest, wcount-1+1, _state); ebest = ae_maxrealnumber; /* * Multiple starts */ rep->ncholesky = 0; rep->nhess = 0; rep->ngrad = 0; for(pass=1; pass<=restarts; pass++) { /* * Process */ mlprandomize(network, _state); ae_v_move(&w.ptr.p_double[0], 1, &network->weights.ptr.p_double[0], 1, ae_v_len(0,wcount-1)); minlbfgscreate(wcount, ae_minint(wcount, 10, _state), &w, &state, _state); minlbfgssetcond(&state, 0.0, 0.0, wstep, maxits, _state); while(minlbfgsiteration(&state, _state)) { ae_v_move(&network->weights.ptr.p_double[0], 1, &state.x.ptr.p_double[0], 1, ae_v_len(0,wcount-1)); mlpgradnbatch(network, xy, npoints, &state.f, &state.g, _state); v = ae_v_dotproduct(&network->weights.ptr.p_double[0], 1, &network->weights.ptr.p_double[0], 1, ae_v_len(0,wcount-1)); state.f = state.f+0.5*decay*v; ae_v_addd(&state.g.ptr.p_double[0], 1, &network->weights.ptr.p_double[0], 1, ae_v_len(0,wcount-1), decay); rep->ngrad = rep->ngrad+1; } minlbfgsresults(&state, &w, &internalrep, _state); ae_v_move(&network->weights.ptr.p_double[0], 1, &w.ptr.p_double[0], 1, ae_v_len(0,wcount-1)); /* * Compare with best */ v = ae_v_dotproduct(&network->weights.ptr.p_double[0], 1, &network->weights.ptr.p_double[0], 1, ae_v_len(0,wcount-1)); e = mlperrorn(network, xy, npoints, _state)+0.5*decay*v; if( ae_fp_less(e,ebest) ) { ae_v_move(&wbest.ptr.p_double[0], 1, &network->weights.ptr.p_double[0], 1, ae_v_len(0,wcount-1)); ebest = e; } } /* * The best network */ ae_v_move(&network->weights.ptr.p_double[0], 1, &wbest.ptr.p_double[0], 1, ae_v_len(0,wcount-1)); ae_frame_leave(_state); } /************************************************************************* Neural network training using early stopping (base algorithm - L-BFGS with regularization). 
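A minimal usage sketch is given below. It assumes the ALGLIB C++ interface declared in dataanalysis.h; the dataset values, set sizes and network geometry are illustrative only and are not part of the interface description.

#include "dataanalysis.h"
using namespace alglib;

void train_with_early_stopping()
{
    // hypothetical 2-input, 1-output regression data, already split into training/validation parts
    real_2d_array trnxy = "[[0.0,0.0,0.0],[0.0,1.0,1.0],[1.0,0.0,1.0]]";
    real_2d_array valxy = "[[1.0,1.0,2.0],[0.5,0.5,1.0]]";
    multilayerperceptron net;
    mlpcreate1(2, 5, 1, net);                         // 2 inputs, 5 hidden neurons, 1 output
    ae_int_t info;
    mlpreport rep;
    // Decay=0.001, 3 random restarts; training stops when validation set error starts to grow
    mlptraines(net, trnxy, 3, valxy, 2, 0.001, 3, info, rep);
    // info=2 or info=6 indicates success (see return codes below)
}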
INPUT PARAMETERS: Network - neural network with initialized geometry TrnXY - training set TrnSize - training set size, TrnSize>0 ValXY - validation set ValSize - validation set size, ValSize>0 Decay - weight decay constant, >=0.001 Decay term 'Decay*||Weights||^2' is added to error function. If you don't know what Decay to choose, use 0.001. Restarts - number of restarts, either: * strictly positive number - algorithm make specified number of restarts from random position. * -1, in which case algorithm makes exactly one run from the initial state of the network (no randomization). If you don't know what Restarts to choose, choose one one the following: * -1 (deterministic start) * +1 (one random restart) * +5 (moderate amount of random restarts) OUTPUT PARAMETERS: Network - trained neural network. Info - return code: * -2, if there is a point with class number outside of [0..NOut-1]. * -1, if wrong parameters specified (NPoints<0, Restarts<1, ...). * 2, task has been solved, stopping criterion met - sufficiently small step size. Not expected (we use EARLY stopping) but possible and not an error. * 6, task has been solved, stopping criterion met - increasing of validation set error. Rep - training report NOTE: Algorithm stops if validation set error increases for a long enough or step size is small enought (there are task where validation set may decrease for eternity). In any case solution returned corresponds to the minimum of validation set error. -- ALGLIB -- Copyright 10.03.2009 by Bochkanov Sergey *************************************************************************/ void mlptraines(multilayerperceptron* network, /* Real */ ae_matrix* trnxy, ae_int_t trnsize, /* Real */ ae_matrix* valxy, ae_int_t valsize, double decay, ae_int_t restarts, ae_int_t* info, mlpreport* rep, ae_state *_state) { ae_frame _frame_block; ae_int_t i; ae_int_t pass; ae_int_t nin; ae_int_t nout; ae_int_t wcount; ae_vector w; ae_vector wbest; double e; double v; double ebest; ae_vector wfinal; double efinal; ae_int_t itcnt; ae_int_t itbest; minlbfgsreport internalrep; minlbfgsstate state; double wstep; ae_bool needrandomization; ae_frame_make(_state, &_frame_block); *info = 0; _mlpreport_clear(rep); ae_vector_init(&w, 0, DT_REAL, _state); ae_vector_init(&wbest, 0, DT_REAL, _state); ae_vector_init(&wfinal, 0, DT_REAL, _state); _minlbfgsreport_init(&internalrep, _state); _minlbfgsstate_init(&state, _state); wstep = 0.001; /* * Test inputs, parse flags, read network geometry */ if( ((trnsize<=0||valsize<=0)||(restarts<1&&restarts!=-1))||ae_fp_less(decay,(double)(0)) ) { *info = -1; ae_frame_leave(_state); return; } if( restarts==-1 ) { needrandomization = ae_false; restarts = 1; } else { needrandomization = ae_true; } mlpproperties(network, &nin, &nout, &wcount, _state); if( mlpissoftmax(network, _state) ) { for(i=0; i<=trnsize-1; i++) { if( ae_round(trnxy->ptr.pp_double[i][nin], _state)<0||ae_round(trnxy->ptr.pp_double[i][nin], _state)>=nout ) { *info = -2; ae_frame_leave(_state); return; } } for(i=0; i<=valsize-1; i++) { if( ae_round(valxy->ptr.pp_double[i][nin], _state)<0||ae_round(valxy->ptr.pp_double[i][nin], _state)>=nout ) { *info = -2; ae_frame_leave(_state); return; } } } *info = 2; /* * Prepare */ mlpinitpreprocessor(network, trnxy, trnsize, _state); ae_vector_set_length(&w, wcount-1+1, _state); ae_vector_set_length(&wbest, wcount-1+1, _state); ae_vector_set_length(&wfinal, wcount-1+1, _state); efinal = ae_maxrealnumber; for(i=0; i<=wcount-1; i++) { wfinal.ptr.p_double[i] = (double)(0); } /* * Multiple 
starts */ rep->ncholesky = 0; rep->nhess = 0; rep->ngrad = 0; for(pass=1; pass<=restarts; pass++) { /* * Process */ if( needrandomization ) { mlprandomize(network, _state); } ebest = mlperror(network, valxy, valsize, _state); ae_v_move(&wbest.ptr.p_double[0], 1, &network->weights.ptr.p_double[0], 1, ae_v_len(0,wcount-1)); itbest = 0; itcnt = 0; ae_v_move(&w.ptr.p_double[0], 1, &network->weights.ptr.p_double[0], 1, ae_v_len(0,wcount-1)); minlbfgscreate(wcount, ae_minint(wcount, 10, _state), &w, &state, _state); minlbfgssetcond(&state, 0.0, 0.0, wstep, 0, _state); minlbfgssetxrep(&state, ae_true, _state); while(minlbfgsiteration(&state, _state)) { /* * Calculate gradient */ if( state.needfg ) { ae_v_move(&network->weights.ptr.p_double[0], 1, &state.x.ptr.p_double[0], 1, ae_v_len(0,wcount-1)); mlpgradnbatch(network, trnxy, trnsize, &state.f, &state.g, _state); v = ae_v_dotproduct(&network->weights.ptr.p_double[0], 1, &network->weights.ptr.p_double[0], 1, ae_v_len(0,wcount-1)); state.f = state.f+0.5*decay*v; ae_v_addd(&state.g.ptr.p_double[0], 1, &network->weights.ptr.p_double[0], 1, ae_v_len(0,wcount-1), decay); rep->ngrad = rep->ngrad+1; } /* * Validation set */ if( state.xupdated ) { ae_v_move(&network->weights.ptr.p_double[0], 1, &state.x.ptr.p_double[0], 1, ae_v_len(0,wcount-1)); e = mlperror(network, valxy, valsize, _state); if( ae_fp_less(e,ebest) ) { ebest = e; ae_v_move(&wbest.ptr.p_double[0], 1, &network->weights.ptr.p_double[0], 1, ae_v_len(0,wcount-1)); itbest = itcnt; } if( itcnt>30&&ae_fp_greater((double)(itcnt),1.5*itbest) ) { *info = 6; break; } itcnt = itcnt+1; } } minlbfgsresults(&state, &w, &internalrep, _state); /* * Compare with final answer */ if( ae_fp_less(ebest,efinal) ) { ae_v_move(&wfinal.ptr.p_double[0], 1, &wbest.ptr.p_double[0], 1, ae_v_len(0,wcount-1)); efinal = ebest; } } /* * The best network */ ae_v_move(&network->weights.ptr.p_double[0], 1, &wfinal.ptr.p_double[0], 1, ae_v_len(0,wcount-1)); ae_frame_leave(_state); } /************************************************************************* Cross-validation estimate of generalization error. Base algorithm - L-BFGS. INPUT PARAMETERS: Network - neural network with initialized geometry. Network is not changed during cross-validation - it is used only as a representative of its architecture. XY - training set. SSize - training set size Decay - weight decay, same as in MLPTrainLBFGS Restarts - number of restarts, >0. restarts are counted for each partition separately, so total number of restarts will be Restarts*FoldsCount. WStep - stopping criterion, same as in MLPTrainLBFGS MaxIts - stopping criterion, same as in MLPTrainLBFGS FoldsCount - number of folds in k-fold cross-validation, 2<=FoldsCount<=SSize. recommended value: 10. 
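A minimal cross-validation sketch is shown below (it assumes the ALGLIB C++ interface from dataanalysis.h; the network architecture and parameter values are illustrative only):

#include "dataanalysis.h"
using namespace alglib;

void estimate_generalization_error(const real_2d_array &xy, ae_int_t npoints)
{
    multilayerperceptron net;
    mlpcreate1(2, 5, 1, net);       // architecture to be evaluated: 2 inputs, 5 hidden neurons, 1 output
    ae_int_t info;
    mlpreport rep;
    mlpcvreport cvrep;
    // Decay=0.001, 2 restarts per fold, WStep=0.01, MaxIts=0, FoldsCount=10
    mlpkfoldcvlbfgs(net, xy, npoints, 0.001, 2, 0.01, 0, 10, info, rep, cvrep);
    // cvrep.rmserror, cvrep.avgerror, etc. now hold the cross-validation estimates
}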
OUTPUT PARAMETERS: Info - return code, same as in MLPTrainLBFGS Rep - report, same as in MLPTrainLM/MLPTrainLBFGS CVRep - generalization error estimates -- ALGLIB -- Copyright 09.12.2007 by Bochkanov Sergey *************************************************************************/ void mlpkfoldcvlbfgs(multilayerperceptron* network, /* Real */ ae_matrix* xy, ae_int_t npoints, double decay, ae_int_t restarts, double wstep, ae_int_t maxits, ae_int_t foldscount, ae_int_t* info, mlpreport* rep, mlpcvreport* cvrep, ae_state *_state) { *info = 0; _mlpreport_clear(rep); _mlpcvreport_clear(cvrep); mlptrain_mlpkfoldcvgeneral(network, xy, npoints, decay, restarts, foldscount, ae_false, wstep, maxits, info, rep, cvrep, _state); } /************************************************************************* Cross-validation estimate of generalization error. Base algorithm - Levenberg-Marquardt. INPUT PARAMETERS: Network - neural network with initialized geometry. Network is not changed during cross-validation - it is used only as a representative of its architecture. XY - training set. SSize - training set size Decay - weight decay, same as in MLPTrainLBFGS Restarts - number of restarts, >0. restarts are counted for each partition separately, so total number of restarts will be Restarts*FoldsCount. FoldsCount - number of folds in k-fold cross-validation, 2<=FoldsCount<=SSize. recommended value: 10. OUTPUT PARAMETERS: Info - return code, same as in MLPTrainLBFGS Rep - report, same as in MLPTrainLM/MLPTrainLBFGS CVRep - generalization error estimates -- ALGLIB -- Copyright 09.12.2007 by Bochkanov Sergey *************************************************************************/ void mlpkfoldcvlm(multilayerperceptron* network, /* Real */ ae_matrix* xy, ae_int_t npoints, double decay, ae_int_t restarts, ae_int_t foldscount, ae_int_t* info, mlpreport* rep, mlpcvreport* cvrep, ae_state *_state) { *info = 0; _mlpreport_clear(rep); _mlpcvreport_clear(cvrep); mlptrain_mlpkfoldcvgeneral(network, xy, npoints, decay, restarts, foldscount, ae_true, 0.0, 0, info, rep, cvrep, _state); } /************************************************************************* This function estimates generalization error using cross-validation on the current dataset with current training settings. FOR USERS OF COMMERCIAL EDITION: ! Commercial version of ALGLIB includes two important improvements of ! this function: ! * multicore support (C++ and C# computational cores) ! * SSE support (C++ computational core) ! ! Second improvement gives constant speedup (2-3X). First improvement ! gives close-to-linear speedup on multicore systems. Following ! operations can be executed in parallel: ! * FoldsCount cross-validation rounds (always) ! * NRestarts training sessions performed within each of ! cross-validation rounds (if NRestarts>1) ! * gradient calculation over large dataset (if dataset is large enough) ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! In order to use SSE features you have to: ! * use commercial version of ALGLIB on Intel processors ! * use C++ computational core ! ! This note is given for users of commercial edition; if you use GPL ! edition, you still will be able to call smp-version of this function, ! but all computations will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! 
called 'SMP support', before using parallel version of this function. INPUT PARAMETERS: S - trainer object Network - neural network. It must have same number of inputs and output/classes as was specified during creation of the trainer object. Network is not changed during cross- validation and is not trained - it is used only as representative of its architecture. I.e., we estimate generalization properties of ARCHITECTURE, not some specific network. NRestarts - number of restarts, >=0: * NRestarts>0 means that for each cross-validation round specified number of random restarts is performed, with best network being chosen after training. * NRestarts=0 is same as NRestarts=1 FoldsCount - number of folds in k-fold cross-validation: * 2<=FoldsCount<=size of dataset * recommended value: 10. * values larger than dataset size will be silently truncated down to dataset size OUTPUT PARAMETERS: Rep - structure which contains cross-validation estimates: * Rep.RelCLSError - fraction of misclassified cases. * Rep.AvgCE - acerage cross-entropy * Rep.RMSError - root-mean-square error * Rep.AvgError - average error * Rep.AvgRelError - average relative error NOTE: when no dataset was specified with MLPSetDataset/SetSparseDataset(), or subset with only one point was given, zeros are returned as estimates. NOTE: this method performs FoldsCount cross-validation rounds, each one with NRestarts random starts. Thus, FoldsCount*NRestarts networks are trained in total. NOTE: Rep.RelCLSError/Rep.AvgCE are zero on regression problems. NOTE: on classification problems Rep.RMSError/Rep.AvgError/Rep.AvgRelError contain errors in prediction of posterior probabilities. -- ALGLIB -- Copyright 23.07.2012 by Bochkanov Sergey *************************************************************************/ void mlpkfoldcv(mlptrainer* s, multilayerperceptron* network, ae_int_t nrestarts, ae_int_t foldscount, mlpreport* rep, ae_state *_state) { ae_frame _frame_block; ae_shared_pool pooldatacv; mlpparallelizationcv datacv; mlpparallelizationcv *sdatacv; ae_smart_ptr _sdatacv; ae_matrix cvy; ae_vector folds; ae_vector buf; ae_vector dy; ae_int_t nin; ae_int_t nout; ae_int_t wcount; ae_int_t rowsize; ae_int_t ntype; ae_int_t ttype; ae_int_t i; ae_int_t j; ae_int_t k; hqrndstate rs; ae_frame_make(_state, &_frame_block); _mlpreport_clear(rep); ae_shared_pool_init(&pooldatacv, _state); _mlpparallelizationcv_init(&datacv, _state); ae_smart_ptr_init(&_sdatacv, (void**)&sdatacv, _state); ae_matrix_init(&cvy, 0, 0, DT_REAL, _state); ae_vector_init(&folds, 0, DT_INT, _state); ae_vector_init(&buf, 0, DT_REAL, _state); ae_vector_init(&dy, 0, DT_REAL, _state); _hqrndstate_init(&rs, _state); if( !mlpissoftmax(network, _state) ) { ntype = 0; } else { ntype = 1; } if( s->rcpar ) { ttype = 0; } else { ttype = 1; } ae_assert(ntype==ttype, "MLPKFoldCV: type of input network is not similar to network type in trainer object", _state); ae_assert(s->npoints>=0, "MLPKFoldCV: possible trainer S is not initialized(S.NPoints<0)", _state); mlpproperties(network, &nin, &nout, &wcount, _state); ae_assert(s->nin==nin, "MLPKFoldCV: number of inputs in trainer is not equal to number of inputs in network", _state); ae_assert(s->nout==nout, "MLPKFoldCV: number of outputs in trainer is not equal to number of outputs in network", _state); ae_assert(nrestarts>=0, "MLPKFoldCV: NRestarts<0", _state); ae_assert(foldscount>=2, "MLPKFoldCV: FoldsCount<2", _state); if( foldscount>s->npoints ) { foldscount = s->npoints; } rep->relclserror = (double)(0); rep->avgce = 
(double)(0); rep->rmserror = (double)(0); rep->avgerror = (double)(0); rep->avgrelerror = (double)(0); hqrndrandomize(&rs, _state); rep->ngrad = 0; rep->nhess = 0; rep->ncholesky = 0; if( s->npoints==0||s->npoints==1 ) { ae_frame_leave(_state); return; } /* * Read network geometry, test parameters */ if( s->rcpar ) { rowsize = nin+nout; ae_vector_set_length(&dy, nout, _state); dserrallocate(-nout, &buf, _state); } else { rowsize = nin+1; ae_vector_set_length(&dy, 1, _state); dserrallocate(nout, &buf, _state); } /* * Folds */ ae_vector_set_length(&folds, s->npoints, _state); for(i=0; i<=s->npoints-1; i++) { folds.ptr.p_int[i] = i*foldscount/s->npoints; } for(i=0; i<=s->npoints-2; i++) { j = i+hqrnduniformi(&rs, s->npoints-i, _state); if( j!=i ) { k = folds.ptr.p_int[i]; folds.ptr.p_int[i] = folds.ptr.p_int[j]; folds.ptr.p_int[j] = k; } } ae_matrix_set_length(&cvy, s->npoints, nout, _state); /* * Initialize SEED-value for shared pool */ datacv.ngrad = 0; mlpcopy(network, &datacv.network, _state); ae_vector_set_length(&datacv.subset, s->npoints, _state); ae_vector_set_length(&datacv.xyrow, rowsize, _state); ae_vector_set_length(&datacv.y, nout, _state); /* * Create shared pool */ ae_shared_pool_set_seed(&pooldatacv, &datacv, sizeof(datacv), _mlpparallelizationcv_init, _mlpparallelizationcv_init_copy, _mlpparallelizationcv_destroy, _state); /* * Parallelization */ mlptrain_mthreadcv(s, rowsize, nrestarts, &folds, 0, foldscount, &cvy, &pooldatacv, _state); /* * Calculate value for NGrad */ ae_shared_pool_first_recycled(&pooldatacv, &_sdatacv, _state); while(sdatacv!=NULL) { rep->ngrad = rep->ngrad+sdatacv->ngrad; ae_shared_pool_next_recycled(&pooldatacv, &_sdatacv, _state); } /* * Connect of results and calculate cross-validation error */ for(i=0; i<=s->npoints-1; i++) { if( s->datatype==0 ) { ae_v_move(&datacv.xyrow.ptr.p_double[0], 1, &s->densexy.ptr.pp_double[i][0], 1, ae_v_len(0,rowsize-1)); } if( s->datatype==1 ) { sparsegetrow(&s->sparsexy, i, &datacv.xyrow, _state); } ae_v_move(&datacv.y.ptr.p_double[0], 1, &cvy.ptr.pp_double[i][0], 1, ae_v_len(0,nout-1)); if( s->rcpar ) { ae_v_move(&dy.ptr.p_double[0], 1, &datacv.xyrow.ptr.p_double[nin], 1, ae_v_len(0,nout-1)); } else { dy.ptr.p_double[0] = datacv.xyrow.ptr.p_double[nin]; } dserraccumulate(&buf, &datacv.y, &dy, _state); } dserrfinish(&buf, _state); rep->relclserror = buf.ptr.p_double[0]; rep->avgce = buf.ptr.p_double[1]; rep->rmserror = buf.ptr.p_double[2]; rep->avgerror = buf.ptr.p_double[3]; rep->avgrelerror = buf.ptr.p_double[4]; ae_frame_leave(_state); } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. *************************************************************************/ void _pexec_mlpkfoldcv(mlptrainer* s, multilayerperceptron* network, ae_int_t nrestarts, ae_int_t foldscount, mlpreport* rep, ae_state *_state) { mlpkfoldcv(s,network,nrestarts,foldscount,rep, _state); } /************************************************************************* Creation of the network trainer object for regression networks INPUT PARAMETERS: NIn - number of inputs, NIn>=1 NOut - number of outputs, NOut>=1 OUTPUT PARAMETERS: S - neural network trainer object. This structure can be used to train any regression network with NIn inputs and NOut outputs. 
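A minimal end-to-end sketch is given below (it assumes the ALGLIB C++ interface from dataanalysis.h; dataset values and network geometry are illustrative only):

#include "dataanalysis.h"
using namespace alglib;

void train_regression_network()
{
    // toy 2-input, 1-output regression dataset, 4 points (illustrative values)
    real_2d_array xy = "[[0.0,0.0,0.0],[0.0,1.0,1.0],[1.0,0.0,1.0],[1.0,1.0,2.0]]";
    mlptrainer trn;
    mlpcreatetrainer(2, 1, trn);         // trainer for NIn=2, NOut=1 regression networks
    mlpsetdataset(trn, xy, 4);           // attach dense dataset
    mlpsetdecay(trn, 0.001);             // weight decay coefficient
    mlpsetcond(trn, 0.01, 0);            // stop when step size falls below 0.01
    multilayerperceptron net;
    mlpcreate1(2, 5, 1, net);            // 2-5-1 network matching the trainer geometry
    mlpreport rep;
    mlptrainnetwork(trn, net, 3, rep);   // 3 random restarts, best network is kept
}

The same trainer object can be reused to train several networks of the same geometry, because it stores only the dataset and the training settings.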
-- ALGLIB -- Copyright 23.07.2012 by Bochkanov Sergey *************************************************************************/ void mlpcreatetrainer(ae_int_t nin, ae_int_t nout, mlptrainer* s, ae_state *_state) { _mlptrainer_clear(s); ae_assert(nin>=1, "MLPCreateTrainer: NIn<1.", _state); ae_assert(nout>=1, "MLPCreateTrainer: NOut<1.", _state); s->nin = nin; s->nout = nout; s->rcpar = ae_true; s->lbfgsfactor = mlptrain_defaultlbfgsfactor; s->decay = 1.0E-6; mlpsetcond(s, (double)(0), 0, _state); s->datatype = 0; s->npoints = 0; mlpsetalgobatch(s, _state); } /************************************************************************* Creation of the network trainer object for classification networks INPUT PARAMETERS: NIn - number of inputs, NIn>=1 NClasses - number of classes, NClasses>=2 OUTPUT PARAMETERS: S - neural network trainer object. This structure can be used to train any classification network with NIn inputs and NOut outputs. -- ALGLIB -- Copyright 23.07.2012 by Bochkanov Sergey *************************************************************************/ void mlpcreatetrainercls(ae_int_t nin, ae_int_t nclasses, mlptrainer* s, ae_state *_state) { _mlptrainer_clear(s); ae_assert(nin>=1, "MLPCreateTrainerCls: NIn<1.", _state); ae_assert(nclasses>=2, "MLPCreateTrainerCls: NClasses<2.", _state); s->nin = nin; s->nout = nclasses; s->rcpar = ae_false; s->lbfgsfactor = mlptrain_defaultlbfgsfactor; s->decay = 1.0E-6; mlpsetcond(s, (double)(0), 0, _state); s->datatype = 0; s->npoints = 0; mlpsetalgobatch(s, _state); } /************************************************************************* This function sets "current dataset" of the trainer object to one passed by user. INPUT PARAMETERS: S - trainer object XY - training set, see below for information on the training set format. This function checks correctness of the dataset (no NANs/INFs, class numbers are correct) and throws exception when incorrect dataset is passed. NPoints - points count, >=0. DATASET FORMAT: This function uses two different dataset formats - one for regression networks, another one for classification networks. For regression networks with NIn inputs and NOut outputs following dataset format is used: * dataset is given by NPoints*(NIn+NOut) matrix * each row corresponds to one example * first NIn columns are inputs, next NOut columns are outputs For classification networks with NIn inputs and NClasses clases following datasetformat is used: * dataset is given by NPoints*(NIn+1) matrix * each row corresponds to one example * first NIn columns are inputs, last column stores class number (from 0 to NClasses-1). 
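For example, a classification dataset with NIn=2 inputs and NClasses=3 classes may be attached as follows (a sketch assuming the ALGLIB C++ interface from dataanalysis.h; the values are illustrative only):

#include "dataanalysis.h"
using namespace alglib;

void set_classification_dataset()
{
    // 4 points, 2 input columns plus 1 class column with labels 0..2
    real_2d_array xy = "[[0.1,0.2,0],[0.9,0.8,1],[0.5,0.5,2],[0.2,0.9,1]]";
    mlptrainer trn;
    mlpcreatetrainercls(2, 3, trn);   // NIn=2, NClasses=3
    mlpsetdataset(trn, xy, 4);        // last column stores class numbers 0..NClasses-1
}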
-- ALGLIB -- Copyright 23.07.2012 by Bochkanov Sergey *************************************************************************/ void mlpsetdataset(mlptrainer* s, /* Real */ ae_matrix* xy, ae_int_t npoints, ae_state *_state) { ae_int_t ndim; ae_int_t i; ae_int_t j; ae_assert(s->nin>=1, "MLPSetDataset: possible parameter S is not initialized or spoiled(S.NIn<=0).", _state); ae_assert(npoints>=0, "MLPSetDataset: NPoint<0", _state); ae_assert(npoints<=xy->rows, "MLPSetDataset: invalid size of matrix XY(NPoint more then rows of matrix XY)", _state); s->datatype = 0; s->npoints = npoints; if( npoints==0 ) { return; } if( s->rcpar ) { ae_assert(s->nout>=1, "MLPSetDataset: possible parameter S is not initialized or is spoiled(NOut<1 for regression).", _state); ndim = s->nin+s->nout; ae_assert(ndim<=xy->cols, "MLPSetDataset: invalid size of matrix XY(too few columns in matrix XY).", _state); ae_assert(apservisfinitematrix(xy, npoints, ndim, _state), "MLPSetDataset: parameter XY contains Infinite or NaN.", _state); } else { ae_assert(s->nout>=2, "MLPSetDataset: possible parameter S is not initialized or is spoiled(NClasses<2 for classifier).", _state); ndim = s->nin+1; ae_assert(ndim<=xy->cols, "MLPSetDataset: invalid size of matrix XY(too few columns in matrix XY).", _state); ae_assert(apservisfinitematrix(xy, npoints, ndim, _state), "MLPSetDataset: parameter XY contains Infinite or NaN.", _state); for(i=0; i<=npoints-1; i++) { ae_assert(ae_round(xy->ptr.pp_double[i][s->nin], _state)>=0&&ae_round(xy->ptr.pp_double[i][s->nin], _state)<s->nout, "MLPSetDataset: invalid parameter XY(in classifier used nonexistent class number: either XY[.,NIn]<0 or XY[.,NIn]>=NClasses).", _state); } } rmatrixsetlengthatleast(&s->densexy, npoints, ndim, _state); for(i=0; i<=npoints-1; i++) { for(j=0; j<=ndim-1; j++) { s->densexy.ptr.pp_double[i][j] = xy->ptr.pp_double[i][j]; } } } /************************************************************************* This function sets "current dataset" of the trainer object to one passed by user (sparse matrix is used to store dataset). INPUT PARAMETERS: S - trainer object XY - training set, see below for information on the training set format. This function checks correctness of the dataset (no NANs/INFs, class numbers are correct) and throws exception when incorrect dataset is passed. Any sparse storage format can be used: Hash-table, CRS... NPoints - points count, >=0 DATASET FORMAT: This function uses two different dataset formats - one for regression networks, another one for classification networks. For regression networks with NIn inputs and NOut outputs following dataset format is used: * dataset is given by NPoints*(NIn+NOut) matrix * each row corresponds to one example * first NIn columns are inputs, next NOut columns are outputs For classification networks with NIn inputs and NClasses classes following dataset format is used: * dataset is given by NPoints*(NIn+1) matrix * each row corresponds to one example * first NIn columns are inputs, last column stores class number (from 0 to NClasses-1). 
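For example, a sparse regression dataset may be attached as follows (a sketch assuming the ALGLIB C++ interface from dataanalysis.h and the sparsecreate/sparseset functions from the sparse matrix subpackage; values are illustrative only):

#include "dataanalysis.h"
using namespace alglib;

void set_sparse_dataset()
{
    mlptrainer trn;
    mlpcreatetrainer(3, 1, trn);       // regression trainer with NIn=3, NOut=1
    sparsematrix xy;
    sparsecreate(2, 4, xy);            // 2 points, NIn+NOut=4 columns
    sparseset(xy, 0, 0, 1.0);          // inputs of point 0 (entries not set are zero)
    sparseset(xy, 0, 3, 2.5);          // output of point 0
    sparseset(xy, 1, 2, 4.0);          // input of point 1
    sparseset(xy, 1, 3, 0.5);          // output of point 1
    mlpsetsparsedataset(trn, xy, 2);   // any storage format is accepted, data are copied to CRS internally
}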
-- ALGLIB -- Copyright 23.07.2012 by Bochkanov Sergey *************************************************************************/ void mlpsetsparsedataset(mlptrainer* s, sparsematrix* xy, ae_int_t npoints, ae_state *_state) { double v; ae_int_t t0; ae_int_t t1; ae_int_t i; ae_int_t j; /* * Check correctness of the data */ ae_assert(s->nin>0, "MLPSetSparseDataset: possible parameter S is not initialized or spoiled(S.NIn<=0).", _state); ae_assert(npoints>=0, "MLPSetSparseDataset: NPoint<0", _state); ae_assert(npoints<=sparsegetnrows(xy, _state), "MLPSetSparseDataset: invalid size of sparse matrix XY(NPoint more then rows of matrix XY)", _state); if( npoints>0 ) { t0 = 0; t1 = 0; if( s->rcpar ) { ae_assert(s->nout>=1, "MLPSetSparseDataset: possible parameter S is not initialized or is spoiled(NOut<1 for regression).", _state); ae_assert(s->nin+s->nout<=sparsegetncols(xy, _state), "MLPSetSparseDataset: invalid size of sparse matrix XY(too few columns in sparse matrix XY).", _state); while(sparseenumerate(xy, &t0, &t1, &i, &j, &v, _state)) { if( i<npoints&&j<s->nin+s->nout ) { ae_assert(ae_isfinite(v, _state), "MLPSetSparseDataset: sparse matrix XY contains Infinite or NaN.", _state); } } } else { ae_assert(s->nout>=2, "MLPSetSparseDataset: possible parameter S is not initialized or is spoiled(NClasses<2 for classifier).", _state); ae_assert(s->nin+1<=sparsegetncols(xy, _state), "MLPSetSparseDataset: invalid size of sparse matrix XY(too few columns in sparse matrix XY).", _state); while(sparseenumerate(xy, &t0, &t1, &i, &j, &v, _state)) { if( i<npoints&&j<=s->nin ) { if( j!=s->nin ) { ae_assert(ae_isfinite(v, _state), "MLPSetSparseDataset: sparse matrix XY contains Infinite or NaN.", _state); } else { ae_assert((ae_isfinite(v, _state)&&ae_round(v, _state)>=0)&&ae_round(v, _state)<s->nout, "MLPSetSparseDataset: invalid sparse matrix XY(in classifier used nonexistent class number: either XY[.,NIn]<0 or XY[.,NIn]>=NClasses).", _state); } } } } } /* * Set dataset */ s->datatype = 1; s->npoints = npoints; sparsecopytocrs(xy, &s->sparsexy, _state); } /************************************************************************* This function sets weight decay coefficient which is used for training. INPUT PARAMETERS: S - trainer object Decay - weight decay coefficient, >=0. Weight decay term 'Decay*||Weights||^2' is added to error function. If you don't know what Decay to choose, use 1.0E-3. Weight decay can be set to zero, in this case network is trained without weight decay. NOTE: by default network uses some small nonzero value for weight decay. -- ALGLIB -- Copyright 23.07.2012 by Bochkanov Sergey *************************************************************************/ void mlpsetdecay(mlptrainer* s, double decay, ae_state *_state) { ae_assert(ae_isfinite(decay, _state), "MLPSetDecay: parameter Decay contains Infinite or NaN.", _state); ae_assert(ae_fp_greater_eq(decay,(double)(0)), "MLPSetDecay: Decay<0.", _state); s->decay = decay; } /************************************************************************* This function sets stopping criteria for the optimizer. INPUT PARAMETERS: S - trainer object WStep - stopping criterion. Algorithm stops if step size is less than WStep. Recommended value - 0.01. Zero step size means stopping after MaxIts iterations. WStep>=0. MaxIts - stopping criterion. Algorithm stops after MaxIts epochs (full passes over entire dataset). Zero MaxIts means stopping when step is sufficiently small. MaxIts>=0. NOTE: by default, WStep=0.005 and MaxIts=0 are used. 
These values are also used when MLPSetCond() is called with WStep=0 and MaxIts=0. NOTE: these stopping criteria are used for all kinds of neural training - from "conventional" networks to early stopping ensembles. When used for "conventional" networks, they are used as the only stopping criteria. When combined with early stopping, they used as ADDITIONAL stopping criteria which can terminate early stopping algorithm. -- ALGLIB -- Copyright 23.07.2012 by Bochkanov Sergey *************************************************************************/ void mlpsetcond(mlptrainer* s, double wstep, ae_int_t maxits, ae_state *_state) { ae_assert(ae_isfinite(wstep, _state), "MLPSetCond: parameter WStep contains Infinite or NaN.", _state); ae_assert(ae_fp_greater_eq(wstep,(double)(0)), "MLPSetCond: WStep<0.", _state); ae_assert(maxits>=0, "MLPSetCond: MaxIts<0.", _state); if( ae_fp_neq(wstep,(double)(0))||maxits!=0 ) { s->wstep = wstep; s->maxits = maxits; } else { s->wstep = 0.005; s->maxits = 0; } } /************************************************************************* This function sets training algorithm: batch training using L-BFGS will be used. This algorithm: * the most robust for small-scale problems, but may be too slow for large scale ones. * perfoms full pass through the dataset before performing step * uses conditions specified by MLPSetCond() for stopping * is default one used by trainer object INPUT PARAMETERS: S - trainer object -- ALGLIB -- Copyright 23.07.2012 by Bochkanov Sergey *************************************************************************/ void mlpsetalgobatch(mlptrainer* s, ae_state *_state) { s->algokind = 0; } /************************************************************************* This function trains neural network passed to this function, using current dataset (one which was passed to MLPSetDataset() or MLPSetSparseDataset()) and current training settings. Training from NRestarts random starting positions is performed, best network is chosen. Training is performed using current training algorithm. FOR USERS OF COMMERCIAL EDITION: ! Commercial version of ALGLIB includes two important improvements of ! this function: ! * multicore support (C++ and C# computational cores) ! * SSE support (C++ computational core) ! ! Second improvement gives constant speedup (2-3X). First improvement ! gives close-to-linear speedup on multicore systems. Following ! operations can be executed in parallel: ! * NRestarts training sessions performed within each of ! cross-validation rounds (if NRestarts>1) ! * gradient calculation over large dataset (if dataset is large enough) ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! In order to use SSE features you have to: ! * use commercial version of ALGLIB on Intel processors ! * use C++ computational core ! ! This note is given for users of commercial edition; if you use GPL ! edition, you still will be able to call smp-version of this function, ! but all computations will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. INPUT PARAMETERS: S - trainer object Network - neural network. It must have same number of inputs and output/classes as was specified during creation of the trainer object. 
NRestarts - number of restarts, >=0: * NRestarts>0 means that specified number of random restarts are performed, best network is chosen after training * NRestarts=0 means that current state of the network is used for training. OUTPUT PARAMETERS: Network - trained network NOTE: when no dataset was specified with MLPSetDataset/SetSparseDataset(), network is filled by zero values. Same behavior for functions MLPStartTraining and MLPContinueTraining. NOTE: this method uses sum-of-squares error function for training. -- ALGLIB -- Copyright 23.07.2012 by Bochkanov Sergey *************************************************************************/ void mlptrainnetwork(mlptrainer* s, multilayerperceptron* network, ae_int_t nrestarts, mlpreport* rep, ae_state *_state) { ae_frame _frame_block; ae_int_t nin; ae_int_t nout; ae_int_t wcount; ae_int_t ntype; ae_int_t ttype; ae_shared_pool trnpool; ae_frame_make(_state, &_frame_block); _mlpreport_clear(rep); ae_shared_pool_init(&trnpool, _state); ae_assert(s->npoints>=0, "MLPTrainNetwork: parameter S is not initialized or is spoiled(S.NPoints<0)", _state); if( !mlpissoftmax(network, _state) ) { ntype = 0; } else { ntype = 1; } if( s->rcpar ) { ttype = 0; } else { ttype = 1; } ae_assert(ntype==ttype, "MLPTrainNetwork: type of input network is not similar to network type in trainer object", _state); mlpproperties(network, &nin, &nout, &wcount, _state); ae_assert(s->nin==nin, "MLPTrainNetwork: number of inputs in trainer is not equal to number of inputs in network", _state); ae_assert(s->nout==nout, "MLPTrainNetwork: number of outputs in trainer is not equal to number of outputs in network", _state); ae_assert(nrestarts>=0, "MLPTrainNetwork: NRestarts<0.", _state); /* * Train */ mlptrain_mlptrainnetworkx(s, nrestarts, -1, &s->subset, -1, &s->subset, 0, network, rep, ae_true, &trnpool, _state); ae_frame_leave(_state); } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. *************************************************************************/ void _pexec_mlptrainnetwork(mlptrainer* s, multilayerperceptron* network, ae_int_t nrestarts, mlpreport* rep, ae_state *_state) { mlptrainnetwork(s,network,nrestarts,rep, _state); } /************************************************************************* IMPORTANT: this is an "expert" version of the MLPTrain() function. We do not recommend you to use it unless you are pretty sure that you need ability to monitor training progress. This function performs step-by-step training of the neural network. Here "step-by-step" means that training starts with MLPStartTraining() call, and then user subsequently calls MLPContinueTraining() to perform one more iteration of the training. After call to this function trainer object remembers network and is ready to train it. However, no training is performed until first call to MLPContinueTraining() function. Subsequent calls to MLPContinueTraining() will advance training progress one iteration further. EXAMPLE: > > ...initialize network and trainer object.... > > MLPStartTraining(Trainer, Network, True) > while MLPContinueTraining(Trainer, Network) do > ...visualize training progress... > INPUT PARAMETERS: S - trainer object Network - neural network. It must have same number of inputs and output/classes as was specified during creation of the trainer object. 
RandomStart - randomize network before training or not: * True means that network is randomized and its initial state (one which was passed to the trainer object) is lost. * False means that training is started from the current state of the network OUTPUT PARAMETERS: Network - neural network which is ready to training (weights are initialized, preprocessor is initialized using current training set) NOTE: this method uses sum-of-squares error function for training. NOTE: it is expected that trainer object settings are NOT changed during step-by-step training, i.e. no one changes stopping criteria or training set during training. It is possible and there is no defense against such actions, but algorithm behavior in such cases is undefined and can be unpredictable. -- ALGLIB -- Copyright 23.07.2012 by Bochkanov Sergey *************************************************************************/ void mlpstarttraining(mlptrainer* s, multilayerperceptron* network, ae_bool randomstart, ae_state *_state) { ae_int_t nin; ae_int_t nout; ae_int_t wcount; ae_int_t ntype; ae_int_t ttype; ae_assert(s->npoints>=0, "MLPStartTraining: parameter S is not initialized or is spoiled(S.NPoints<0)", _state); if( !mlpissoftmax(network, _state) ) { ntype = 0; } else { ntype = 1; } if( s->rcpar ) { ttype = 0; } else { ttype = 1; } ae_assert(ntype==ttype, "MLPStartTraining: type of input network is not similar to network type in trainer object", _state); mlpproperties(network, &nin, &nout, &wcount, _state); ae_assert(s->nin==nin, "MLPStartTraining: number of inputs in trainer is not equal to number of inputs in the network.", _state); ae_assert(s->nout==nout, "MLPStartTraining: number of outputs in trainer is not equal to number of outputs in the network.", _state); /* * Initialize temporaries */ mlptrain_initmlptrnsession(network, randomstart, s, &s->session, _state); /* * Train network */ mlptrain_mlpstarttrainingx(s, randomstart, -1, &s->subset, -1, &s->session, _state); /* * Update network */ mlpcopytunableparameters(&s->session.network, network, _state); } /************************************************************************* IMPORTANT: this is an "expert" version of the MLPTrain() function. We do not recommend you to use it unless you are pretty sure that you need ability to monitor training progress. FOR USERS OF COMMERCIAL EDITION: ! Commercial version of ALGLIB includes two important improvements of ! this function: ! * multicore support (C++ and C# computational cores) ! * SSE support (C++ computational core) ! ! Second improvement gives constant speedup (2-3X). First improvement ! gives close-to-linear speedup on multicore systems. Following ! operations can be executed in parallel: ! * gradient calculation over large dataset (if dataset is large enough) ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! In order to use SSE features you have to: ! * use commercial version of ALGLIB on Intel processors ! * use C++ computational core ! ! This note is given for users of commercial edition; if you use GPL ! edition, you still will be able to call smp-version of this function, ! but all computations will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. This function performs step-by-step training of the neural network. 
Here "step-by-step" means that training starts with MLPStartTraining() call, and then user subsequently calls MLPContinueTraining() to perform one more iteration of the training. This function performs one more iteration of the training and returns either True (training continues) or False (training stopped). In case True was returned, Network weights are updated according to the current state of the optimization progress. In case False was returned, no additional updates is performed (previous update of the network weights moved us to the final point, and no additional updates is needed). EXAMPLE: > > [initialize network and trainer object] > > MLPStartTraining(Trainer, Network, True) > while MLPContinueTraining(Trainer, Network) do > [visualize training progress] > INPUT PARAMETERS: S - trainer object Network - neural network structure, which is used to store current state of the training process. OUTPUT PARAMETERS: Network - weights of the neural network are rewritten by the current approximation. NOTE: this method uses sum-of-squares error function for training. NOTE: it is expected that trainer object settings are NOT changed during step-by-step training, i.e. no one changes stopping criteria or training set during training. It is possible and there is no defense against such actions, but algorithm behavior in such cases is undefined and can be unpredictable. NOTE: It is expected that Network is the same one which was passed to MLPStartTraining() function. However, THIS function checks only following: * that number of network inputs is consistent with trainer object settings * that number of network outputs/classes is consistent with trainer object settings * that number of network weights is the same as number of weights in the network passed to MLPStartTraining() function Exception is thrown when these conditions are violated. It is also expected that you do not change state of the network on your own - the only party who has right to change network during its training is a trainer object. Any attempt to interfere with trainer may lead to unpredictable results. -- ALGLIB -- Copyright 23.07.2012 by Bochkanov Sergey *************************************************************************/ ae_bool mlpcontinuetraining(mlptrainer* s, multilayerperceptron* network, ae_state *_state) { ae_int_t nin; ae_int_t nout; ae_int_t wcount; ae_int_t ntype; ae_int_t ttype; ae_bool result; ae_assert(s->npoints>=0, "MLPContinueTraining: parameter S is not initialized or is spoiled(S.NPoints<0)", _state); if( s->rcpar ) { ttype = 0; } else { ttype = 1; } if( !mlpissoftmax(network, _state) ) { ntype = 0; } else { ntype = 1; } ae_assert(ntype==ttype, "MLPContinueTraining: type of input network is not similar to network type in trainer object.", _state); mlpproperties(network, &nin, &nout, &wcount, _state); ae_assert(s->nin==nin, "MLPContinueTraining: number of inputs in trainer is not equal to number of inputs in the network.", _state); ae_assert(s->nout==nout, "MLPContinueTraining: number of outputs in trainer is not equal to number of outputs in the network.", _state); result = mlptrain_mlpcontinuetrainingx(s, &s->subset, -1, &s->ngradbatch, &s->session, _state); if( result ) { ae_v_move(&network->weights.ptr.p_double[0], 1, &s->session.network.weights.ptr.p_double[0], 1, ae_v_len(0,wcount-1)); } return result; } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. 
*************************************************************************/ ae_bool _pexec_mlpcontinuetraining(mlptrainer* s, multilayerperceptron* network, ae_state *_state) { return mlpcontinuetraining(s,network, _state); } /************************************************************************* Training neural networks ensemble using bootstrap aggregating (bagging). Modified Levenberg-Marquardt algorithm is used as base training method. INPUT PARAMETERS: Ensemble - model with initialized geometry XY - training set NPoints - training set size Decay - weight decay coefficient, >=0.001 Restarts - restarts, >0. OUTPUT PARAMETERS: Ensemble - trained model Info - return code: * -2, if there is a point with class number outside of [0..NClasses-1]. * -1, if incorrect parameters was passed (NPoints<0, Restarts<1). * 2, if task has been solved. Rep - training report. OOBErrors - out-of-bag generalization error estimate -- ALGLIB -- Copyright 17.02.2009 by Bochkanov Sergey *************************************************************************/ void mlpebagginglm(mlpensemble* ensemble, /* Real */ ae_matrix* xy, ae_int_t npoints, double decay, ae_int_t restarts, ae_int_t* info, mlpreport* rep, mlpcvreport* ooberrors, ae_state *_state) { *info = 0; _mlpreport_clear(rep); _mlpcvreport_clear(ooberrors); mlptrain_mlpebagginginternal(ensemble, xy, npoints, decay, restarts, 0.0, 0, ae_true, info, rep, ooberrors, _state); } /************************************************************************* Training neural networks ensemble using bootstrap aggregating (bagging). L-BFGS algorithm is used as base training method. INPUT PARAMETERS: Ensemble - model with initialized geometry XY - training set NPoints - training set size Decay - weight decay coefficient, >=0.001 Restarts - restarts, >0. WStep - stopping criterion, same as in MLPTrainLBFGS MaxIts - stopping criterion, same as in MLPTrainLBFGS OUTPUT PARAMETERS: Ensemble - trained model Info - return code: * -8, if both WStep=0 and MaxIts=0 * -2, if there is a point with class number outside of [0..NClasses-1]. * -1, if incorrect parameters was passed (NPoints<0, Restarts<1). * 2, if task has been solved. Rep - training report. OOBErrors - out-of-bag generalization error estimate -- ALGLIB -- Copyright 17.02.2009 by Bochkanov Sergey *************************************************************************/ void mlpebagginglbfgs(mlpensemble* ensemble, /* Real */ ae_matrix* xy, ae_int_t npoints, double decay, ae_int_t restarts, double wstep, ae_int_t maxits, ae_int_t* info, mlpreport* rep, mlpcvreport* ooberrors, ae_state *_state) { *info = 0; _mlpreport_clear(rep); _mlpcvreport_clear(ooberrors); mlptrain_mlpebagginginternal(ensemble, xy, npoints, decay, restarts, wstep, maxits, ae_false, info, rep, ooberrors, _state); } /************************************************************************* Training neural networks ensemble using early stopping. INPUT PARAMETERS: Ensemble - model with initialized geometry XY - training set NPoints - training set size Decay - weight decay coefficient, >=0.001 Restarts - restarts, >0. OUTPUT PARAMETERS: Ensemble - trained model Info - return code: * -2, if there is a point with class number outside of [0..NClasses-1]. * -1, if incorrect parameters was passed (NPoints<0, Restarts<1). * 6, if task has been solved. Rep - training report. 
OOBErrors - out-of-bag generalization error estimate -- ALGLIB -- Copyright 10.03.2009 by Bochkanov Sergey *************************************************************************/ void mlpetraines(mlpensemble* ensemble, /* Real */ ae_matrix* xy, ae_int_t npoints, double decay, ae_int_t restarts, ae_int_t* info, mlpreport* rep, ae_state *_state) { ae_frame _frame_block; ae_int_t i; ae_int_t k; ae_int_t ccount; ae_int_t pcount; ae_matrix trnxy; ae_matrix valxy; ae_int_t trnsize; ae_int_t valsize; ae_int_t tmpinfo; mlpreport tmprep; modelerrors moderr; ae_int_t nin; ae_int_t nout; ae_int_t wcount; ae_frame_make(_state, &_frame_block); *info = 0; _mlpreport_clear(rep); ae_matrix_init(&trnxy, 0, 0, DT_REAL, _state); ae_matrix_init(&valxy, 0, 0, DT_REAL, _state); _mlpreport_init(&tmprep, _state); _modelerrors_init(&moderr, _state); nin = mlpgetinputscount(&ensemble->network, _state); nout = mlpgetoutputscount(&ensemble->network, _state); wcount = mlpgetweightscount(&ensemble->network, _state); if( (npoints<2||restarts<1)||ae_fp_less(decay,(double)(0)) ) { *info = -1; ae_frame_leave(_state); return; } if( mlpissoftmax(&ensemble->network, _state) ) { for(i=0; i<=npoints-1; i++) { if( ae_round(xy->ptr.pp_double[i][nin], _state)<0||ae_round(xy->ptr.pp_double[i][nin], _state)>=nout ) { *info = -2; ae_frame_leave(_state); return; } } } *info = 6; /* * allocate */ if( mlpissoftmax(&ensemble->network, _state) ) { ccount = nin+1; pcount = nin; } else { ccount = nin+nout; pcount = nin+nout; } ae_matrix_set_length(&trnxy, npoints, ccount, _state); ae_matrix_set_length(&valxy, npoints, ccount, _state); rep->ngrad = 0; rep->nhess = 0; rep->ncholesky = 0; /* * train networks */ for(k=0; k<=ensemble->ensemblesize-1; k++) { /* * Split set */ do { trnsize = 0; valsize = 0; for(i=0; i<=npoints-1; i++) { if( ae_fp_less(ae_randomreal(_state),0.66) ) { /* * Assign sample to training set */ ae_v_move(&trnxy.ptr.pp_double[trnsize][0], 1, &xy->ptr.pp_double[i][0], 1, ae_v_len(0,ccount-1)); trnsize = trnsize+1; } else { /* * Assign sample to validation set */ ae_v_move(&valxy.ptr.pp_double[valsize][0], 1, &xy->ptr.pp_double[i][0], 1, ae_v_len(0,ccount-1)); valsize = valsize+1; } } } while(!(trnsize!=0&&valsize!=0)); /* * Train */ mlptraines(&ensemble->network, &trnxy, trnsize, &valxy, valsize, decay, restarts, &tmpinfo, &tmprep, _state); if( tmpinfo<0 ) { *info = tmpinfo; ae_frame_leave(_state); return; } /* * save results */ ae_v_move(&ensemble->weights.ptr.p_double[k*wcount], 1, &ensemble->network.weights.ptr.p_double[0], 1, ae_v_len(k*wcount,(k+1)*wcount-1)); ae_v_move(&ensemble->columnmeans.ptr.p_double[k*pcount], 1, &ensemble->network.columnmeans.ptr.p_double[0], 1, ae_v_len(k*pcount,(k+1)*pcount-1)); ae_v_move(&ensemble->columnsigmas.ptr.p_double[k*pcount], 1, &ensemble->network.columnsigmas.ptr.p_double[0], 1, ae_v_len(k*pcount,(k+1)*pcount-1)); rep->ngrad = rep->ngrad+tmprep.ngrad; rep->nhess = rep->nhess+tmprep.nhess; rep->ncholesky = rep->ncholesky+tmprep.ncholesky; } mlpeallerrorsx(ensemble, xy, &ensemble->network.dummysxy, npoints, 0, &ensemble->network.dummyidx, 0, npoints, 0, &ensemble->network.buf, &moderr, _state); rep->relclserror = moderr.relclserror; rep->avgce = moderr.avgce; rep->rmserror = moderr.rmserror; rep->avgerror = moderr.avgerror; rep->avgrelerror = moderr.avgrelerror; ae_frame_leave(_state); } /************************************************************************* This function trains neural network ensemble passed to this function using current dataset and early stopping training 
algorithm. Each early stopping round performs NRestarts random restarts (thus, EnsembleSize*NRestarts training rounds is performed in total). FOR USERS OF COMMERCIAL EDITION: ! Commercial version of ALGLIB includes two important improvements of ! this function: ! * multicore support (C++ and C# computational cores) ! * SSE support (C++ computational core) ! ! Second improvement gives constant speedup (2-3X). First improvement ! gives close-to-linear speedup on multicore systems. Following ! operations can be executed in parallel: ! * EnsembleSize training sessions performed for each of ensemble ! members (always parallelized) ! * NRestarts training sessions performed within each of training ! sessions (if NRestarts>1) ! * gradient calculation over large dataset (if dataset is large enough) ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! In order to use SSE features you have to: ! * use commercial version of ALGLIB on Intel processors ! * use C++ computational core ! ! This note is given for users of commercial edition; if you use GPL ! edition, you still will be able to call smp-version of this function, ! but all computations will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. INPUT PARAMETERS: S - trainer object; Ensemble - neural network ensemble. It must have same number of inputs and outputs/classes as was specified during creation of the trainer object. NRestarts - number of restarts, >=0: * NRestarts>0 means that specified number of random restarts are performed during each ES round; * NRestarts=0 is silently replaced by 1. OUTPUT PARAMETERS: Ensemble - trained ensemble; Rep - it contains all type of errors. NOTE: this training method uses BOTH early stopping and weight decay! So, you should select weight decay before starting training just as you select it before training "conventional" networks. NOTE: when no dataset was specified with MLPSetDataset/SetSparseDataset(), or single-point dataset was passed, ensemble is filled by zero values. NOTE: this method uses sum-of-squares error function for training. 
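A minimal usage sketch is given below (it assumes the ALGLIB C++ interface from dataanalysis.h and the mlpecreate1 constructor from the ensemble subpackage; the dataset and network geometry are illustrative only):

#include "dataanalysis.h"
using namespace alglib;

void train_ensemble_es(const real_2d_array &xy, ae_int_t npoints)
{
    mlptrainer trn;
    mlpcreatetrainer(2, 1, trn);            // regression trainer with NIn=2, NOut=1
    mlpsetdataset(trn, xy, npoints);
    mlpensemble ens;
    mlpecreate1(2, 5, 1, 10, ens);          // ensemble of 10 networks with 2-5-1 geometry
    mlpreport rep;
    mlptrainensemblees(trn, ens, 3, rep);   // 3 random restarts within each ES round
}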
-- ALGLIB -- Copyright 22.08.2012 by Bochkanov Sergey *************************************************************************/ void mlptrainensemblees(mlptrainer* s, mlpensemble* ensemble, ae_int_t nrestarts, mlpreport* rep, ae_state *_state) { ae_frame _frame_block; ae_int_t nin; ae_int_t nout; ae_int_t ntype; ae_int_t ttype; ae_shared_pool esessions; sinteger sgrad; modelerrors tmprep; ae_frame_make(_state, &_frame_block); _mlpreport_clear(rep); ae_shared_pool_init(&esessions, _state); _sinteger_init(&sgrad, _state); _modelerrors_init(&tmprep, _state); ae_assert(s->npoints>=0, "MLPTrainEnsembleES: parameter S is not initialized or is spoiled(S.NPoints<0)", _state); if( !mlpeissoftmax(ensemble, _state) ) { ntype = 0; } else { ntype = 1; } if( s->rcpar ) { ttype = 0; } else { ttype = 1; } ae_assert(ntype==ttype, "MLPTrainEnsembleES: internal error - type of input network is not similar to network type in trainer object", _state); nin = mlpgetinputscount(&ensemble->network, _state); ae_assert(s->nin==nin, "MLPTrainEnsembleES: number of inputs in trainer is not equal to number of inputs in ensemble network", _state); nout = mlpgetoutputscount(&ensemble->network, _state); ae_assert(s->nout==nout, "MLPTrainEnsembleES: number of outputs in trainer is not equal to number of outputs in ensemble network", _state); ae_assert(nrestarts>=0, "MLPTrainEnsembleES: NRestarts<0.", _state); /* * Initialize parameter Rep */ rep->relclserror = (double)(0); rep->avgce = (double)(0); rep->rmserror = (double)(0); rep->avgerror = (double)(0); rep->avgrelerror = (double)(0); rep->ngrad = 0; rep->nhess = 0; rep->ncholesky = 0; /* * Allocate */ ivectorsetlengthatleast(&s->subset, s->npoints, _state); ivectorsetlengthatleast(&s->valsubset, s->npoints, _state); /* * Start training * * NOTE: ESessions is not initialized because MLPTrainEnsembleX * needs uninitialized pool. */ sgrad.val = 0; mlptrain_mlptrainensemblex(s, ensemble, 0, ensemble->ensemblesize, nrestarts, 0, &sgrad, ae_true, &esessions, _state); rep->ngrad = sgrad.val; /* * Calculate errors. */ if( s->datatype==0 ) { mlpeallerrorsx(ensemble, &s->densexy, &s->sparsexy, s->npoints, 0, &ensemble->network.dummyidx, 0, s->npoints, 0, &ensemble->network.buf, &tmprep, _state); } if( s->datatype==1 ) { mlpeallerrorsx(ensemble, &s->densexy, &s->sparsexy, s->npoints, 1, &ensemble->network.dummyidx, 0, s->npoints, 0, &ensemble->network.buf, &tmprep, _state); } rep->relclserror = tmprep.relclserror; rep->avgce = tmprep.avgce; rep->rmserror = tmprep.rmserror; rep->avgerror = tmprep.avgerror; rep->avgrelerror = tmprep.avgrelerror; ae_frame_leave(_state); } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. 
*************************************************************************/ void _pexec_mlptrainensemblees(mlptrainer* s, mlpensemble* ensemble, ae_int_t nrestarts, mlpreport* rep, ae_state *_state) { mlptrainensemblees(s,ensemble,nrestarts,rep, _state); } /************************************************************************* Internal cross-validation subroutine *************************************************************************/ static void mlptrain_mlpkfoldcvgeneral(multilayerperceptron* n, /* Real */ ae_matrix* xy, ae_int_t npoints, double decay, ae_int_t restarts, ae_int_t foldscount, ae_bool lmalgorithm, double wstep, ae_int_t maxits, ae_int_t* info, mlpreport* rep, mlpcvreport* cvrep, ae_state *_state) { ae_frame _frame_block; ae_int_t i; ae_int_t fold; ae_int_t j; ae_int_t k; multilayerperceptron network; ae_int_t nin; ae_int_t nout; ae_int_t rowlen; ae_int_t wcount; ae_int_t nclasses; ae_int_t tssize; ae_int_t cvssize; ae_matrix cvset; ae_matrix testset; ae_vector folds; ae_int_t relcnt; mlpreport internalrep; ae_vector x; ae_vector y; ae_frame_make(_state, &_frame_block); *info = 0; _mlpreport_clear(rep); _mlpcvreport_clear(cvrep); _multilayerperceptron_init(&network, _state); ae_matrix_init(&cvset, 0, 0, DT_REAL, _state); ae_matrix_init(&testset, 0, 0, DT_REAL, _state); ae_vector_init(&folds, 0, DT_INT, _state); _mlpreport_init(&internalrep, _state); ae_vector_init(&x, 0, DT_REAL, _state); ae_vector_init(&y, 0, DT_REAL, _state); /* * Read network geometry, test parameters */ mlpproperties(n, &nin, &nout, &wcount, _state); if( mlpissoftmax(n, _state) ) { nclasses = nout; rowlen = nin+1; } else { nclasses = -nout; rowlen = nin+nout; } if( (npoints<=0||foldscount<2)||foldscount>npoints ) { *info = -1; ae_frame_leave(_state); return; } mlpcopy(n, &network, _state); /* * K-fold out cross-validation. 
* First, estimate generalization error */ ae_matrix_set_length(&testset, npoints-1+1, rowlen-1+1, _state); ae_matrix_set_length(&cvset, npoints-1+1, rowlen-1+1, _state); ae_vector_set_length(&x, nin-1+1, _state); ae_vector_set_length(&y, nout-1+1, _state); mlptrain_mlpkfoldsplit(xy, npoints, nclasses, foldscount, ae_false, &folds, _state); cvrep->relclserror = (double)(0); cvrep->avgce = (double)(0); cvrep->rmserror = (double)(0); cvrep->avgerror = (double)(0); cvrep->avgrelerror = (double)(0); rep->ngrad = 0; rep->nhess = 0; rep->ncholesky = 0; relcnt = 0; for(fold=0; fold<=foldscount-1; fold++) { /* * Separate set */ tssize = 0; cvssize = 0; for(i=0; i<=npoints-1; i++) { if( folds.ptr.p_int[i]==fold ) { ae_v_move(&testset.ptr.pp_double[tssize][0], 1, &xy->ptr.pp_double[i][0], 1, ae_v_len(0,rowlen-1)); tssize = tssize+1; } else { ae_v_move(&cvset.ptr.pp_double[cvssize][0], 1, &xy->ptr.pp_double[i][0], 1, ae_v_len(0,rowlen-1)); cvssize = cvssize+1; } } /* * Train on CV training set */ if( lmalgorithm ) { mlptrainlm(&network, &cvset, cvssize, decay, restarts, info, &internalrep, _state); } else { mlptrainlbfgs(&network, &cvset, cvssize, decay, restarts, wstep, maxits, info, &internalrep, _state); } if( *info<0 ) { cvrep->relclserror = (double)(0); cvrep->avgce = (double)(0); cvrep->rmserror = (double)(0); cvrep->avgerror = (double)(0); cvrep->avgrelerror = (double)(0); ae_frame_leave(_state); return; } rep->ngrad = rep->ngrad+internalrep.ngrad; rep->nhess = rep->nhess+internalrep.nhess; rep->ncholesky = rep->ncholesky+internalrep.ncholesky; /* * Estimate error using CV test set */ if( mlpissoftmax(&network, _state) ) { /* * classification-only code */ cvrep->relclserror = cvrep->relclserror+mlpclserror(&network, &testset, tssize, _state); cvrep->avgce = cvrep->avgce+mlperrorn(&network, &testset, tssize, _state); } for(i=0; i<=tssize-1; i++) { ae_v_move(&x.ptr.p_double[0], 1, &testset.ptr.pp_double[i][0], 1, ae_v_len(0,nin-1)); mlpprocess(&network, &x, &y, _state); if( mlpissoftmax(&network, _state) ) { /* * Classification-specific code */ k = ae_round(testset.ptr.pp_double[i][nin], _state); for(j=0; j<=nout-1; j++) { if( j==k ) { cvrep->rmserror = cvrep->rmserror+ae_sqr(y.ptr.p_double[j]-1, _state); cvrep->avgerror = cvrep->avgerror+ae_fabs(y.ptr.p_double[j]-1, _state); cvrep->avgrelerror = cvrep->avgrelerror+ae_fabs(y.ptr.p_double[j]-1, _state); relcnt = relcnt+1; } else { cvrep->rmserror = cvrep->rmserror+ae_sqr(y.ptr.p_double[j], _state); cvrep->avgerror = cvrep->avgerror+ae_fabs(y.ptr.p_double[j], _state); } } } else { /* * Regression-specific code */ for(j=0; j<=nout-1; j++) { cvrep->rmserror = cvrep->rmserror+ae_sqr(y.ptr.p_double[j]-testset.ptr.pp_double[i][nin+j], _state); cvrep->avgerror = cvrep->avgerror+ae_fabs(y.ptr.p_double[j]-testset.ptr.pp_double[i][nin+j], _state); if( ae_fp_neq(testset.ptr.pp_double[i][nin+j],(double)(0)) ) { cvrep->avgrelerror = cvrep->avgrelerror+ae_fabs((y.ptr.p_double[j]-testset.ptr.pp_double[i][nin+j])/testset.ptr.pp_double[i][nin+j], _state); relcnt = relcnt+1; } } } } } if( mlpissoftmax(&network, _state) ) { cvrep->relclserror = cvrep->relclserror/npoints; cvrep->avgce = cvrep->avgce/(ae_log((double)(2), _state)*npoints); } cvrep->rmserror = ae_sqrt(cvrep->rmserror/(npoints*nout), _state); cvrep->avgerror = cvrep->avgerror/(npoints*nout); if( relcnt>0 ) { cvrep->avgrelerror = cvrep->avgrelerror/relcnt; } *info = 1; ae_frame_leave(_state); } /************************************************************************* Subroutine prepares K-fold split of 
the training set. NOTES: "NClasses>0" means that we have classification task. "NClasses<0" means regression task with -NClasses real outputs. *************************************************************************/ static void mlptrain_mlpkfoldsplit(/* Real */ ae_matrix* xy, ae_int_t npoints, ae_int_t nclasses, ae_int_t foldscount, ae_bool stratifiedsplits, /* Integer */ ae_vector* folds, ae_state *_state) { ae_frame _frame_block; ae_int_t i; ae_int_t j; ae_int_t k; hqrndstate rs; ae_frame_make(_state, &_frame_block); ae_vector_clear(folds); _hqrndstate_init(&rs, _state); /* * test parameters */ ae_assert(npoints>0, "MLPKFoldSplit: wrong NPoints!", _state); ae_assert(nclasses>1||nclasses<0, "MLPKFoldSplit: wrong NClasses!", _state); ae_assert(foldscount>=2&&foldscount<=npoints, "MLPKFoldSplit: wrong FoldsCount!", _state); ae_assert(!stratifiedsplits, "MLPKFoldSplit: stratified splits are not supported!", _state); /* * Folds */ hqrndrandomize(&rs, _state); ae_vector_set_length(folds, npoints-1+1, _state); for(i=0; i<=npoints-1; i++) { folds->ptr.p_int[i] = i*foldscount/npoints; } for(i=0; i<=npoints-2; i++) { j = i+hqrnduniformi(&rs, npoints-i, _state); if( j!=i ) { k = folds->ptr.p_int[i]; folds->ptr.p_int[i] = folds->ptr.p_int[j]; folds->ptr.p_int[j] = k; } } ae_frame_leave(_state); } /************************************************************************* Internal subroutine for parallelization function MLPFoldCV. INPUT PARAMETERS: S - trainer object; RowSize - row size(eitherNIn+NOut or NIn+1); NRestarts - number of restarts(>=0); Folds - cross-validation set; Fold - the number of first cross-validation(>=0); DFold - the number of second cross-validation(>=Fold+1); CVY - parameter which stores the result is returned by network, training on I-th cross-validation set. It has to be preallocated. PoolDataCV- parameter for parallelization. NOTE: There are no checks on the parameters correctness. 
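EXAMPLE (illustrative sketch): the fold assignment performed by MLPKFoldSplit above is simply Fold[I]=I*FoldsCount/NPoints followed by a Fisher-Yates shuffle, which produces folds whose sizes differ by at most one. A standalone C++ helper reproducing the same scheme (hypothetical code, not the library routine itself):

    #include <vector>
    #include <random>
    #include <utility>

    // Assign NPoints samples to FoldsCount folds of near-equal size, then shuffle.
    std::vector<int> kfold_split(int npoints, int foldscount, std::mt19937 &rng)
    {
        std::vector<int> folds(npoints);
        for(int i = 0; i < npoints; i++)
            folds[i] = (int)((long long)i*foldscount/npoints);  // balanced block assignment
        for(int i = 0; i < npoints-1; i++)                      // Fisher-Yates shuffle
        {
            std::uniform_int_distribution<int> d(i, npoints-1);
            std::swap(folds[i], folds[d(rng)]);
        }
        return folds;
    }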
-- ALGLIB -- Copyright 25.09.2012 by Bochkanov Sergey *************************************************************************/ static void mlptrain_mthreadcv(mlptrainer* s, ae_int_t rowsize, ae_int_t nrestarts, /* Integer */ ae_vector* folds, ae_int_t fold, ae_int_t dfold, /* Real */ ae_matrix* cvy, ae_shared_pool* pooldatacv, ae_state *_state) { ae_frame _frame_block; mlpparallelizationcv *datacv; ae_smart_ptr _datacv; ae_int_t i; ae_frame_make(_state, &_frame_block); ae_smart_ptr_init(&_datacv, (void**)&datacv, _state); if( fold==dfold-1 ) { /* * Separate set */ ae_shared_pool_retrieve(pooldatacv, &_datacv, _state); datacv->subsetsize = 0; for(i=0; i<=s->npoints-1; i++) { if( folds->ptr.p_int[i]!=fold ) { datacv->subset.ptr.p_int[datacv->subsetsize] = i; datacv->subsetsize = datacv->subsetsize+1; } } /* * Train on CV training set */ mlptrain_mlptrainnetworkx(s, nrestarts, -1, &datacv->subset, datacv->subsetsize, &datacv->subset, 0, &datacv->network, &datacv->rep, ae_true, &datacv->trnpool, _state); datacv->ngrad = datacv->ngrad+datacv->rep.ngrad; /* * Estimate error using CV test set */ for(i=0; i<=s->npoints-1; i++) { if( folds->ptr.p_int[i]==fold ) { if( s->datatype==0 ) { ae_v_move(&datacv->xyrow.ptr.p_double[0], 1, &s->densexy.ptr.pp_double[i][0], 1, ae_v_len(0,rowsize-1)); } if( s->datatype==1 ) { sparsegetrow(&s->sparsexy, i, &datacv->xyrow, _state); } mlpprocess(&datacv->network, &datacv->xyrow, &datacv->y, _state); ae_v_move(&cvy->ptr.pp_double[i][0], 1, &datacv->y.ptr.p_double[0], 1, ae_v_len(0,s->nout-1)); } } ae_shared_pool_recycle(pooldatacv, &_datacv, _state); } else { ae_assert(fold<dfold-1, "MThreadCV: internal error (Fold>DFold-1).", _state); mlptrain_mthreadcv(s, rowsize, nrestarts, folds, fold, (fold+dfold)/2, cvy, pooldatacv, _state); mlptrain_mthreadcv(s, rowsize, nrestarts, folds, (fold+dfold)/2, dfold, cvy, pooldatacv, _state); } ae_frame_leave(_state); } /************************************************************************* This function trains the neural network passed to it, using the current dataset (the one which was passed to MLPSetDataset() or MLPSetSparseDataset()) and current training settings. Training from NRestarts random starting positions is performed, and the best network is chosen. This function is intended to be used internally. It may be used in several settings: * training with ValSubsetSize=0 corresponds to "normal" training with termination criteria based on S.MaxIts (steps count) and S.WStep (step size). Training sample is given by TrnSubset/TrnSubsetSize. * training with ValSubsetSize>0 corresponds to early stopping training with additional MaxIts/WStep stopping criteria. Training sample is given by TrnSubset/TrnSubsetSize, validation sample is given by ValSubset/ValSubsetSize. 
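EXAMPLE (illustrative sketch): at user level the two training modes described above are reached through the trainer front-end. A minimal sketch, assuming the wrapper functions mlpcreatetrainer(), mlpsetdataset(), mlpsetcond(), mlpcreate1() and mlptrainnetwork() with the signatures documented in their own sections of this manual (the toy dataset is an assumption of the sketch):

    #include "dataanalysis.h"
    using namespace alglib;

    int main()
    {
        real_2d_array xy = "[[0,0,0],[0,1,1],[1,0,1],[1,1,0]]";  // 2 inputs, 1 real output
        mlptrainer trn;
        multilayerperceptron net;
        mlpreport rep;

        mlpcreatetrainer(2, 1, trn);        // trainer for 2 inputs, 1 output
        mlpsetdataset(trn, xy, 4);          // 4 dense points
        mlpsetcond(trn, 0.0, 100);          // WStep=0, MaxIts=100 stopping criteria
        mlpcreate1(2, 5, 1, net);           // one hidden layer with 5 neurons
        mlptrainnetwork(trn, net, 3, rep);  // 3 random restarts, best network is kept
        return 0;
    }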
-- ALGLIB -- Copyright 13.08.2012 by Bochkanov Sergey *************************************************************************/ static void mlptrain_mlptrainnetworkx(mlptrainer* s, ae_int_t nrestarts, ae_int_t algokind, /* Integer */ ae_vector* trnsubset, ae_int_t trnsubsetsize, /* Integer */ ae_vector* valsubset, ae_int_t valsubsetsize, multilayerperceptron* network, mlpreport* rep, ae_bool isrootcall, ae_shared_pool* sessions, ae_state *_state) { ae_frame _frame_block; modelerrors modrep; double eval; double ebest; ae_int_t ngradbatch; ae_int_t nin; ae_int_t nout; ae_int_t wcount; ae_int_t pcount; ae_int_t itbest; ae_int_t itcnt; ae_int_t ntype; ae_int_t ttype; ae_bool rndstart; ae_int_t i; ae_int_t nr0; ae_int_t nr1; mlpreport rep0; mlpreport rep1; ae_bool randomizenetwork; double bestrmserror; smlptrnsession *psession; ae_smart_ptr _psession; ae_frame_make(_state, &_frame_block); _modelerrors_init(&modrep, _state); _mlpreport_init(&rep0, _state); _mlpreport_init(&rep1, _state); ae_smart_ptr_init(&_psession, (void**)&psession, _state); mlpproperties(network, &nin, &nout, &wcount, _state); /* * Process root call */ if( isrootcall ) { /* * Check correctness of parameters */ ae_assert(algokind==0||algokind==-1, "MLPTrainNetworkX: unexpected AlgoKind", _state); ae_assert(s->npoints>=0, "MLPTrainNetworkX: internal error - parameter S is not initialized or is spoiled(S.NPoints<0)", _state); if( s->rcpar ) { ttype = 0; } else { ttype = 1; } if( !mlpissoftmax(network, _state) ) { ntype = 0; } else { ntype = 1; } ae_assert(ntype==ttype, "MLPTrainNetworkX: internal error - type of the training network is not similar to network type in trainer object", _state); ae_assert(s->nin==nin, "MLPTrainNetworkX: internal error - number of inputs in trainer is not equal to number of inputs in the training network.", _state); ae_assert(s->nout==nout, "MLPTrainNetworkX: internal error - number of outputs in trainer is not equal to number of outputs in the training network.", _state); ae_assert(nrestarts>=0, "MLPTrainNetworkX: internal error - NRestarts<0.", _state); ae_assert(trnsubset->cnt>=trnsubsetsize, "MLPTrainNetworkX: internal error - parameter TrnSubsetSize more than input subset size(Length(TrnSubset)ptr.p_int[i]>=0&&trnsubset->ptr.p_int[i]<=s->npoints-1, "MLPTrainNetworkX: internal error - parameter TrnSubset contains incorrect index(TrnSubset[I]<0 or TrnSubset[I]>S.NPoints-1)", _state); } ae_assert(valsubset->cnt>=valsubsetsize, "MLPTrainNetworkX: internal error - parameter ValSubsetSize more than input subset size(Length(ValSubset)ptr.p_int[i]>=0&&valsubset->ptr.p_int[i]<=s->npoints-1, "MLPTrainNetworkX: internal error - parameter ValSubset contains incorrect index(ValSubset[I]<0 or ValSubset[I]>S.NPoints-1)", _state); } /* * Train */ randomizenetwork = nrestarts>0; mlptrain_initmlptrnsessions(network, randomizenetwork, s, sessions, _state); mlptrain_mlptrainnetworkx(s, nrestarts, algokind, trnsubset, trnsubsetsize, valsubset, valsubsetsize, network, rep, ae_false, sessions, _state); /* * Choose best network */ bestrmserror = ae_maxrealnumber; ae_shared_pool_first_recycled(sessions, &_psession, _state); while(psession!=NULL) { if( ae_fp_less(psession->bestrmserror,bestrmserror) ) { mlpimporttunableparameters(network, &psession->bestparameters, _state); bestrmserror = psession->bestrmserror; } ae_shared_pool_next_recycled(sessions, &_psession, _state); } /* * Calculate errors */ if( s->datatype==0 ) { mlpallerrorssubset(network, &s->densexy, s->npoints, trnsubset, trnsubsetsize, &modrep, _state); } 
if( s->datatype==1 ) { mlpallerrorssparsesubset(network, &s->sparsexy, s->npoints, trnsubset, trnsubsetsize, &modrep, _state); } rep->relclserror = modrep.relclserror; rep->avgce = modrep.avgce; rep->rmserror = modrep.rmserror; rep->avgerror = modrep.avgerror; rep->avgrelerror = modrep.avgrelerror; /* * Done */ ae_frame_leave(_state); return; } /* * Split problem, if we have more than 1 restart */ if( nrestarts>=2 ) { /* * Divide problem with NRestarts into two: NR0 and NR1. */ nr0 = nrestarts/2; nr1 = nrestarts-nr0; mlptrain_mlptrainnetworkx(s, nr0, algokind, trnsubset, trnsubsetsize, valsubset, valsubsetsize, network, &rep0, ae_false, sessions, _state); mlptrain_mlptrainnetworkx(s, nr1, algokind, trnsubset, trnsubsetsize, valsubset, valsubsetsize, network, &rep1, ae_false, sessions, _state); /* * Aggregate results */ rep->ngrad = rep0.ngrad+rep1.ngrad; rep->nhess = rep0.nhess+rep1.nhess; rep->ncholesky = rep0.ncholesky+rep1.ncholesky; /* * Done :) */ ae_frame_leave(_state); return; } /* * Execution with NRestarts=1 or NRestarts=0: * * NRestarts=1 means that network is restarted from random position * * NRestarts=0 means that network is not randomized */ ae_assert(nrestarts==0||nrestarts==1, "MLPTrainNetworkX: internal error", _state); rep->ngrad = 0; rep->nhess = 0; rep->ncholesky = 0; ae_shared_pool_retrieve(sessions, &_psession, _state); if( ((s->datatype==0||s->datatype==1)&&s->npoints>0)&&trnsubsetsize!=0 ) { /* * Train network using combination of early stopping and step-size * and step-count based criteria. Network state with best value of * validation set error is stored in WBuf0. When validation set is * zero, most recent state of network is stored. */ rndstart = nrestarts!=0; ngradbatch = 0; eval = (double)(0); ebest = (double)(0); itbest = 0; itcnt = 0; mlptrain_mlpstarttrainingx(s, rndstart, algokind, trnsubset, trnsubsetsize, psession, _state); if( s->datatype==0 ) { ebest = mlperrorsubset(&psession->network, &s->densexy, s->npoints, valsubset, valsubsetsize, _state); } if( s->datatype==1 ) { ebest = mlperrorsparsesubset(&psession->network, &s->sparsexy, s->npoints, valsubset, valsubsetsize, _state); } ae_v_move(&psession->wbuf0.ptr.p_double[0], 1, &psession->network.weights.ptr.p_double[0], 1, ae_v_len(0,wcount-1)); while(mlptrain_mlpcontinuetrainingx(s, trnsubset, trnsubsetsize, &ngradbatch, psession, _state)) { if( s->datatype==0 ) { eval = mlperrorsubset(&psession->network, &s->densexy, s->npoints, valsubset, valsubsetsize, _state); } if( s->datatype==1 ) { eval = mlperrorsparsesubset(&psession->network, &s->sparsexy, s->npoints, valsubset, valsubsetsize, _state); } if( ae_fp_less_eq(eval,ebest)||valsubsetsize==0 ) { ae_v_move(&psession->wbuf0.ptr.p_double[0], 1, &psession->network.weights.ptr.p_double[0], 1, ae_v_len(0,wcount-1)); ebest = eval; itbest = itcnt; } if( itcnt>30&&ae_fp_greater((double)(itcnt),1.5*itbest) ) { break; } itcnt = itcnt+1; } ae_v_move(&psession->network.weights.ptr.p_double[0], 1, &psession->wbuf0.ptr.p_double[0], 1, ae_v_len(0,wcount-1)); rep->ngrad = ngradbatch; } else { for(i=0; i<=wcount-1; i++) { psession->network.weights.ptr.p_double[i] = (double)(0); } } /* * Evaluate network performance and update PSession.BestParameters/BestRMSError * (if needed). 
*/ if( s->datatype==0 ) { mlpallerrorssubset(&psession->network, &s->densexy, s->npoints, trnsubset, trnsubsetsize, &modrep, _state); } if( s->datatype==1 ) { mlpallerrorssparsesubset(&psession->network, &s->sparsexy, s->npoints, trnsubset, trnsubsetsize, &modrep, _state); } if( ae_fp_less(modrep.rmserror,psession->bestrmserror) ) { mlpexporttunableparameters(&psession->network, &psession->bestparameters, &pcount, _state); psession->bestrmserror = modrep.rmserror; } /* * Move session back to pool */ ae_shared_pool_recycle(sessions, &_psession, _state); ae_frame_leave(_state); } /************************************************************************* This function trains neural network ensemble passed to this function using current dataset and early stopping training algorithm. Each early stopping round performs NRestarts random restarts (thus, EnsembleSize*NRestarts training rounds is performed in total). -- ALGLIB -- Copyright 22.08.2012 by Bochkanov Sergey *************************************************************************/ static void mlptrain_mlptrainensemblex(mlptrainer* s, mlpensemble* ensemble, ae_int_t idx0, ae_int_t idx1, ae_int_t nrestarts, ae_int_t trainingmethod, sinteger* ngrad, ae_bool isrootcall, ae_shared_pool* esessions, ae_state *_state) { ae_frame _frame_block; ae_int_t pcount; ae_int_t nin; ae_int_t nout; ae_int_t wcount; ae_int_t i; ae_int_t j; ae_int_t k; ae_int_t trnsubsetsize; ae_int_t valsubsetsize; ae_int_t k0; sinteger ngrad0; sinteger ngrad1; mlpetrnsession *psession; ae_smart_ptr _psession; hqrndstate rs; ae_frame_make(_state, &_frame_block); _sinteger_init(&ngrad0, _state); _sinteger_init(&ngrad1, _state); ae_smart_ptr_init(&_psession, (void**)&psession, _state); _hqrndstate_init(&rs, _state); nin = mlpgetinputscount(&ensemble->network, _state); nout = mlpgetoutputscount(&ensemble->network, _state); wcount = mlpgetweightscount(&ensemble->network, _state); if( mlpissoftmax(&ensemble->network, _state) ) { pcount = nin; } else { pcount = nin+nout; } if( nrestarts<=0 ) { nrestarts = 1; } /* * Handle degenerate case */ if( s->npoints<2 ) { for(i=idx0; i<=idx1-1; i++) { for(j=0; j<=wcount-1; j++) { ensemble->weights.ptr.p_double[i*wcount+j] = 0.0; } for(j=0; j<=pcount-1; j++) { ensemble->columnmeans.ptr.p_double[i*pcount+j] = 0.0; ensemble->columnsigmas.ptr.p_double[i*pcount+j] = 1.0; } } ae_frame_leave(_state); return; } /* * Process root call */ if( isrootcall ) { /* * Prepare: * * prepare MLPETrnSessions * * fill ensemble by zeros (helps to detect errors) */ mlptrain_initmlpetrnsessions(&ensemble->network, s, esessions, _state); for(i=idx0; i<=idx1-1; i++) { for(j=0; j<=wcount-1; j++) { ensemble->weights.ptr.p_double[i*wcount+j] = 0.0; } for(j=0; j<=pcount-1; j++) { ensemble->columnmeans.ptr.p_double[i*pcount+j] = 0.0; ensemble->columnsigmas.ptr.p_double[i*pcount+j] = 0.0; } } /* * Train in non-root mode and exit */ mlptrain_mlptrainensemblex(s, ensemble, idx0, idx1, nrestarts, trainingmethod, ngrad, ae_false, esessions, _state); ae_frame_leave(_state); return; } /* * Split problem */ if( idx1-idx0>=2 ) { k0 = (idx1-idx0)/2; ngrad0.val = 0; ngrad1.val = 0; mlptrain_mlptrainensemblex(s, ensemble, idx0, idx0+k0, nrestarts, trainingmethod, &ngrad0, ae_false, esessions, _state); mlptrain_mlptrainensemblex(s, ensemble, idx0+k0, idx1, nrestarts, trainingmethod, &ngrad1, ae_false, esessions, _state); ngrad->val = ngrad0.val+ngrad1.val; ae_frame_leave(_state); return; } /* * Retrieve and prepare session */ ae_shared_pool_retrieve(esessions, &_psession, _state); 
/* * Train */ hqrndrandomize(&rs, _state); for(k=idx0; k<=idx1-1; k++) { /* * Split set */ trnsubsetsize = 0; valsubsetsize = 0; if( trainingmethod==0 ) { do { trnsubsetsize = 0; valsubsetsize = 0; for(i=0; i<=s->npoints-1; i++) { if( ae_fp_less(ae_randomreal(_state),0.66) ) { /* * Assign sample to training set */ psession->trnsubset.ptr.p_int[trnsubsetsize] = i; trnsubsetsize = trnsubsetsize+1; } else { /* * Assign sample to validation set */ psession->valsubset.ptr.p_int[valsubsetsize] = i; valsubsetsize = valsubsetsize+1; } } } while(!(trnsubsetsize!=0&&valsubsetsize!=0)); } if( trainingmethod==1 ) { valsubsetsize = 0; trnsubsetsize = s->npoints; for(i=0; i<=s->npoints-1; i++) { psession->trnsubset.ptr.p_int[i] = hqrnduniformi(&rs, s->npoints, _state); } } /* * Train */ mlptrain_mlptrainnetworkx(s, nrestarts, -1, &psession->trnsubset, trnsubsetsize, &psession->valsubset, valsubsetsize, &psession->network, &psession->mlprep, ae_true, &psession->mlpsessions, _state); ngrad->val = ngrad->val+psession->mlprep.ngrad; /* * Save results */ ae_v_move(&ensemble->weights.ptr.p_double[k*wcount], 1, &psession->network.weights.ptr.p_double[0], 1, ae_v_len(k*wcount,(k+1)*wcount-1)); ae_v_move(&ensemble->columnmeans.ptr.p_double[k*pcount], 1, &psession->network.columnmeans.ptr.p_double[0], 1, ae_v_len(k*pcount,(k+1)*pcount-1)); ae_v_move(&ensemble->columnsigmas.ptr.p_double[k*pcount], 1, &psession->network.columnsigmas.ptr.p_double[0], 1, ae_v_len(k*pcount,(k+1)*pcount-1)); } /* * Recycle session */ ae_shared_pool_recycle(esessions, &_psession, _state); ae_frame_leave(_state); } /************************************************************************* This function performs step-by-step training of the neural network. Here "step-by-step" means that training starts with MLPStartTrainingX call, and then user subsequently calls MLPContinueTrainingX to perform one more iteration of the training. After call to this function trainer object remembers network and is ready to train it. However, no training is performed until first call to MLPContinueTraining() function. Subsequent calls to MLPContinueTraining() will advance traing progress one iteration further. 
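EXAMPLE (illustrative sketch): the step-by-step mode described above is driven from user code by MLPStartTraining()/MLPContinueTraining(). A minimal sketch, assuming the wrapper signatures documented in their own sections of this manual (toy dataset and stopping criteria are assumptions of the sketch):

    #include "dataanalysis.h"
    using namespace alglib;

    int main()
    {
        real_2d_array xy = "[[0,0,0],[0,1,1],[1,0,1],[1,1,0]]";
        mlptrainer trn;
        multilayerperceptron net;

        mlpcreatetrainer(2, 1, trn);
        mlpsetdataset(trn, xy, 4);
        mlpsetcond(trn, 0.0, 50);               // bound the number of iterations
        mlpcreate1(2, 5, 1, net);

        mlpstarttraining(trn, net, true);       // true = restart from random weights
        while( mlpcontinuetraining(trn, net) )
        {
            // weights in net are updated after every iteration; training
            // progress can be inspected here, e.g. via mlprmserror(net, xy, 4)
        }
        return 0;
    }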
-- ALGLIB -- Copyright 13.08.2012 by Bochkanov Sergey *************************************************************************/ static void mlptrain_mlpstarttrainingx(mlptrainer* s, ae_bool randomstart, ae_int_t algokind, /* Integer */ ae_vector* subset, ae_int_t subsetsize, smlptrnsession* session, ae_state *_state) { ae_int_t nin; ae_int_t nout; ae_int_t wcount; ae_int_t ntype; ae_int_t ttype; ae_int_t i; /* * Check parameters */ ae_assert(s->npoints>=0, "MLPStartTrainingX: internal error - parameter S is not initialized or is spoiled(S.NPoints<0)", _state); ae_assert(algokind==0||algokind==-1, "MLPStartTrainingX: unexpected AlgoKind", _state); if( s->rcpar ) { ttype = 0; } else { ttype = 1; } if( !mlpissoftmax(&session->network, _state) ) { ntype = 0; } else { ntype = 1; } ae_assert(ntype==ttype, "MLPStartTrainingX: internal error - type of the resulting network is not similar to network type in trainer object", _state); mlpproperties(&session->network, &nin, &nout, &wcount, _state); ae_assert(s->nin==nin, "MLPStartTrainingX: number of inputs in trainer is not equal to number of inputs in the network.", _state); ae_assert(s->nout==nout, "MLPStartTrainingX: number of outputs in trainer is not equal to number of outputs in the network.", _state); ae_assert(subset->cnt>=subsetsize, "MLPStartTrainingX: internal error - parameter SubsetSize more than input subset size(Length(Subset)ptr.p_int[i]>=0&&subset->ptr.p_int[i]<=s->npoints-1, "MLPStartTrainingX: internal error - parameter Subset contains incorrect index(Subset[I]<0 or Subset[I]>S.NPoints-1)", _state); } /* * Prepare session */ minlbfgssetcond(&session->optimizer, 0.0, 0.0, s->wstep, s->maxits, _state); if( s->npoints>0&&subsetsize!=0 ) { if( randomstart ) { mlprandomize(&session->network, _state); } minlbfgsrestartfrom(&session->optimizer, &session->network.weights, _state); } else { for(i=0; i<=wcount-1; i++) { session->network.weights.ptr.p_double[i] = (double)(0); } } if( algokind==-1 ) { session->algoused = s->algokind; if( s->algokind==1 ) { session->minibatchsize = s->minibatchsize; } } else { session->algoused = 0; } hqrndrandomize(&session->generator, _state); ae_vector_set_length(&session->rstate.ia, 15+1, _state); ae_vector_set_length(&session->rstate.ra, 1+1, _state); session->rstate.stage = -1; } /************************************************************************* This function performs step-by-step training of the neural network. Here "step-by-step" means that training starts with MLPStartTrainingX call, and then user subsequently calls MLPContinueTrainingX to perform one more iteration of the training. This function performs one more iteration of the training and returns either True (training continues) or False (training stopped). In case True was returned, Network weights are updated according to the current state of the optimization progress. In case False was returned, no additional updates is performed (previous update of the network weights moved us to the final point, and no additional updates is needed). 
EXAMPLE: > > [initialize network and trainer object] > > MLPStartTraining(Trainer, Network, True) > while MLPContinueTraining(Trainer, Network) do > [visualize training progress] > -- ALGLIB -- Copyright 13.08.2012 by Bochkanov Sergey *************************************************************************/ static ae_bool mlptrain_mlpcontinuetrainingx(mlptrainer* s, /* Integer */ ae_vector* subset, ae_int_t subsetsize, ae_int_t* ngradbatch, smlptrnsession* session, ae_state *_state) { ae_int_t nin; ae_int_t nout; ae_int_t wcount; ae_int_t twcount; ae_int_t ntype; ae_int_t ttype; double decay; double v; ae_int_t i; ae_int_t j; ae_int_t k; ae_int_t trnsetsize; ae_int_t epoch; ae_int_t minibatchcount; ae_int_t minibatchidx; ae_int_t cursize; ae_int_t idx0; ae_int_t idx1; ae_bool result; /* * Reverse communication preparations * I know it looks ugly, but it works the same way * anywhere from C++ to Python. * * This code initializes locals by: * * random values determined during code * generation - on first subroutine call * * values from previous call - on subsequent calls */ if( session->rstate.stage>=0 ) { nin = session->rstate.ia.ptr.p_int[0]; nout = session->rstate.ia.ptr.p_int[1]; wcount = session->rstate.ia.ptr.p_int[2]; twcount = session->rstate.ia.ptr.p_int[3]; ntype = session->rstate.ia.ptr.p_int[4]; ttype = session->rstate.ia.ptr.p_int[5]; i = session->rstate.ia.ptr.p_int[6]; j = session->rstate.ia.ptr.p_int[7]; k = session->rstate.ia.ptr.p_int[8]; trnsetsize = session->rstate.ia.ptr.p_int[9]; epoch = session->rstate.ia.ptr.p_int[10]; minibatchcount = session->rstate.ia.ptr.p_int[11]; minibatchidx = session->rstate.ia.ptr.p_int[12]; cursize = session->rstate.ia.ptr.p_int[13]; idx0 = session->rstate.ia.ptr.p_int[14]; idx1 = session->rstate.ia.ptr.p_int[15]; decay = session->rstate.ra.ptr.p_double[0]; v = session->rstate.ra.ptr.p_double[1]; } else { nin = 359; nout = -58; wcount = -919; twcount = -909; ntype = 81; ttype = 255; i = 74; j = -788; k = 809; trnsetsize = 205; epoch = -838; minibatchcount = 939; minibatchidx = -526; cursize = 763; idx0 = -541; idx1 = -698; decay = -900; v = -318; } if( session->rstate.stage==0 ) { goto lbl_0; } /* * Routine body */ /* * Check correctness of inputs */ ae_assert(s->npoints>=0, "MLPContinueTrainingX: internal error - parameter S is not initialized or is spoiled(S.NPoints<0).", _state); if( s->rcpar ) { ttype = 0; } else { ttype = 1; } if( !mlpissoftmax(&session->network, _state) ) { ntype = 0; } else { ntype = 1; } ae_assert(ntype==ttype, "MLPContinueTrainingX: internal error - type of the resulting network is not similar to network type in trainer object.", _state); mlpproperties(&session->network, &nin, &nout, &wcount, _state); ae_assert(s->nin==nin, "MLPContinueTrainingX: internal error - number of inputs in trainer is not equal to number of inputs in the network.", _state); ae_assert(s->nout==nout, "MLPContinueTrainingX: internal error - number of outputs in trainer is not equal to number of outputs in the network.", _state); ae_assert(subset->cnt>=subsetsize, "MLPContinueTrainingX: internal error - parameter SubsetSize more than input subset size(Length(Subset)ptr.p_int[i]>=0&&subset->ptr.p_int[i]<=s->npoints-1, "MLPContinueTrainingX: internal error - parameter Subset contains incorrect index(Subset[I]<0 or Subset[I]>S.NPoints-1).", _state); } /* * Quick exit on empty training set */ if( s->npoints==0||subsetsize==0 ) { result = ae_false; return result; } /* * Minibatch training */ if( session->algoused==1 ) { ae_assert(ae_false, 
"MINIBATCH TRAINING IS NOT IMPLEMENTED YET", _state); } /* * Last option: full batch training */ decay = s->decay; lbl_1: if( !minlbfgsiteration(&session->optimizer, _state) ) { goto lbl_2; } if( !session->optimizer.xupdated ) { goto lbl_3; } ae_v_move(&session->network.weights.ptr.p_double[0], 1, &session->optimizer.x.ptr.p_double[0], 1, ae_v_len(0,wcount-1)); session->rstate.stage = 0; goto lbl_rcomm; lbl_0: lbl_3: ae_v_move(&session->network.weights.ptr.p_double[0], 1, &session->optimizer.x.ptr.p_double[0], 1, ae_v_len(0,wcount-1)); if( s->datatype==0 ) { mlpgradbatchsubset(&session->network, &s->densexy, s->npoints, subset, subsetsize, &session->optimizer.f, &session->optimizer.g, _state); } if( s->datatype==1 ) { mlpgradbatchsparsesubset(&session->network, &s->sparsexy, s->npoints, subset, subsetsize, &session->optimizer.f, &session->optimizer.g, _state); } /* * Increment number of operations performed on batch gradient */ *ngradbatch = *ngradbatch+1; v = ae_v_dotproduct(&session->network.weights.ptr.p_double[0], 1, &session->network.weights.ptr.p_double[0], 1, ae_v_len(0,wcount-1)); session->optimizer.f = session->optimizer.f+0.5*decay*v; ae_v_addd(&session->optimizer.g.ptr.p_double[0], 1, &session->network.weights.ptr.p_double[0], 1, ae_v_len(0,wcount-1), decay); goto lbl_1; lbl_2: minlbfgsresultsbuf(&session->optimizer, &session->network.weights, &session->optimizerrep, _state); result = ae_false; return result; /* * Saving state */ lbl_rcomm: result = ae_true; session->rstate.ia.ptr.p_int[0] = nin; session->rstate.ia.ptr.p_int[1] = nout; session->rstate.ia.ptr.p_int[2] = wcount; session->rstate.ia.ptr.p_int[3] = twcount; session->rstate.ia.ptr.p_int[4] = ntype; session->rstate.ia.ptr.p_int[5] = ttype; session->rstate.ia.ptr.p_int[6] = i; session->rstate.ia.ptr.p_int[7] = j; session->rstate.ia.ptr.p_int[8] = k; session->rstate.ia.ptr.p_int[9] = trnsetsize; session->rstate.ia.ptr.p_int[10] = epoch; session->rstate.ia.ptr.p_int[11] = minibatchcount; session->rstate.ia.ptr.p_int[12] = minibatchidx; session->rstate.ia.ptr.p_int[13] = cursize; session->rstate.ia.ptr.p_int[14] = idx0; session->rstate.ia.ptr.p_int[15] = idx1; session->rstate.ra.ptr.p_double[0] = decay; session->rstate.ra.ptr.p_double[1] = v; return result; } /************************************************************************* Internal bagging subroutine. 
-- ALGLIB -- Copyright 19.02.2009 by Bochkanov Sergey *************************************************************************/ static void mlptrain_mlpebagginginternal(mlpensemble* ensemble, /* Real */ ae_matrix* xy, ae_int_t npoints, double decay, ae_int_t restarts, double wstep, ae_int_t maxits, ae_bool lmalgorithm, ae_int_t* info, mlpreport* rep, mlpcvreport* ooberrors, ae_state *_state) { ae_frame _frame_block; ae_matrix xys; ae_vector s; ae_matrix oobbuf; ae_vector oobcntbuf; ae_vector x; ae_vector y; ae_vector dy; ae_vector dsbuf; ae_int_t ccnt; ae_int_t pcnt; ae_int_t i; ae_int_t j; ae_int_t k; double v; mlpreport tmprep; ae_int_t nin; ae_int_t nout; ae_int_t wcount; hqrndstate rs; ae_frame_make(_state, &_frame_block); *info = 0; _mlpreport_clear(rep); _mlpcvreport_clear(ooberrors); ae_matrix_init(&xys, 0, 0, DT_REAL, _state); ae_vector_init(&s, 0, DT_BOOL, _state); ae_matrix_init(&oobbuf, 0, 0, DT_REAL, _state); ae_vector_init(&oobcntbuf, 0, DT_INT, _state); ae_vector_init(&x, 0, DT_REAL, _state); ae_vector_init(&y, 0, DT_REAL, _state); ae_vector_init(&dy, 0, DT_REAL, _state); ae_vector_init(&dsbuf, 0, DT_REAL, _state); _mlpreport_init(&tmprep, _state); _hqrndstate_init(&rs, _state); nin = mlpgetinputscount(&ensemble->network, _state); nout = mlpgetoutputscount(&ensemble->network, _state); wcount = mlpgetweightscount(&ensemble->network, _state); /* * Test for inputs */ if( (!lmalgorithm&&ae_fp_eq(wstep,(double)(0)))&&maxits==0 ) { *info = -8; ae_frame_leave(_state); return; } if( ((npoints<=0||restarts<1)||ae_fp_less(wstep,(double)(0)))||maxits<0 ) { *info = -1; ae_frame_leave(_state); return; } if( mlpissoftmax(&ensemble->network, _state) ) { for(i=0; i<=npoints-1; i++) { if( ae_round(xy->ptr.pp_double[i][nin], _state)<0||ae_round(xy->ptr.pp_double[i][nin], _state)>=nout ) { *info = -2; ae_frame_leave(_state); return; } } } /* * allocate temporaries */ *info = 2; rep->ngrad = 0; rep->nhess = 0; rep->ncholesky = 0; ooberrors->relclserror = (double)(0); ooberrors->avgce = (double)(0); ooberrors->rmserror = (double)(0); ooberrors->avgerror = (double)(0); ooberrors->avgrelerror = (double)(0); if( mlpissoftmax(&ensemble->network, _state) ) { ccnt = nin+1; pcnt = nin; } else { ccnt = nin+nout; pcnt = nin+nout; } ae_matrix_set_length(&xys, npoints, ccnt, _state); ae_vector_set_length(&s, npoints, _state); ae_matrix_set_length(&oobbuf, npoints, nout, _state); ae_vector_set_length(&oobcntbuf, npoints, _state); ae_vector_set_length(&x, nin, _state); ae_vector_set_length(&y, nout, _state); if( mlpissoftmax(&ensemble->network, _state) ) { ae_vector_set_length(&dy, 1, _state); } else { ae_vector_set_length(&dy, nout, _state); } for(i=0; i<=npoints-1; i++) { for(j=0; j<=nout-1; j++) { oobbuf.ptr.pp_double[i][j] = (double)(0); } } for(i=0; i<=npoints-1; i++) { oobcntbuf.ptr.p_int[i] = 0; } /* * main bagging cycle */ hqrndrandomize(&rs, _state); for(k=0; k<=ensemble->ensemblesize-1; k++) { /* * prepare dataset */ for(i=0; i<=npoints-1; i++) { s.ptr.p_bool[i] = ae_false; } for(i=0; i<=npoints-1; i++) { j = hqrnduniformi(&rs, npoints, _state); s.ptr.p_bool[j] = ae_true; ae_v_move(&xys.ptr.pp_double[i][0], 1, &xy->ptr.pp_double[j][0], 1, ae_v_len(0,ccnt-1)); } /* * train */ if( lmalgorithm ) { mlptrainlm(&ensemble->network, &xys, npoints, decay, restarts, info, &tmprep, _state); } else { mlptrainlbfgs(&ensemble->network, &xys, npoints, decay, restarts, wstep, maxits, info, &tmprep, _state); } if( *info<0 ) { ae_frame_leave(_state); return; } /* * save results */ rep->ngrad = 
rep->ngrad+tmprep.ngrad; rep->nhess = rep->nhess+tmprep.nhess; rep->ncholesky = rep->ncholesky+tmprep.ncholesky; ae_v_move(&ensemble->weights.ptr.p_double[k*wcount], 1, &ensemble->network.weights.ptr.p_double[0], 1, ae_v_len(k*wcount,(k+1)*wcount-1)); ae_v_move(&ensemble->columnmeans.ptr.p_double[k*pcnt], 1, &ensemble->network.columnmeans.ptr.p_double[0], 1, ae_v_len(k*pcnt,(k+1)*pcnt-1)); ae_v_move(&ensemble->columnsigmas.ptr.p_double[k*pcnt], 1, &ensemble->network.columnsigmas.ptr.p_double[0], 1, ae_v_len(k*pcnt,(k+1)*pcnt-1)); /* * OOB estimates */ for(i=0; i<=npoints-1; i++) { if( !s.ptr.p_bool[i] ) { ae_v_move(&x.ptr.p_double[0], 1, &xy->ptr.pp_double[i][0], 1, ae_v_len(0,nin-1)); mlpprocess(&ensemble->network, &x, &y, _state); ae_v_add(&oobbuf.ptr.pp_double[i][0], 1, &y.ptr.p_double[0], 1, ae_v_len(0,nout-1)); oobcntbuf.ptr.p_int[i] = oobcntbuf.ptr.p_int[i]+1; } } } /* * OOB estimates */ if( mlpissoftmax(&ensemble->network, _state) ) { dserrallocate(nout, &dsbuf, _state); } else { dserrallocate(-nout, &dsbuf, _state); } for(i=0; i<=npoints-1; i++) { if( oobcntbuf.ptr.p_int[i]!=0 ) { v = (double)1/(double)oobcntbuf.ptr.p_int[i]; ae_v_moved(&y.ptr.p_double[0], 1, &oobbuf.ptr.pp_double[i][0], 1, ae_v_len(0,nout-1), v); if( mlpissoftmax(&ensemble->network, _state) ) { dy.ptr.p_double[0] = xy->ptr.pp_double[i][nin]; } else { ae_v_moved(&dy.ptr.p_double[0], 1, &xy->ptr.pp_double[i][nin], 1, ae_v_len(0,nout-1), v); } dserraccumulate(&dsbuf, &y, &dy, _state); } } dserrfinish(&dsbuf, _state); ooberrors->relclserror = dsbuf.ptr.p_double[0]; ooberrors->avgce = dsbuf.ptr.p_double[1]; ooberrors->rmserror = dsbuf.ptr.p_double[2]; ooberrors->avgerror = dsbuf.ptr.p_double[3]; ooberrors->avgrelerror = dsbuf.ptr.p_double[4]; ae_frame_leave(_state); } /************************************************************************* This function initializes temporaries needed for training session. 
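EXAMPLE (illustrative sketch): the bagging cycle above draws, for every ensemble member, NPoints row indices with replacement and keeps the rows that were never drawn as the out-of-bag (OOB) sample used for error estimation. A standalone C++ helper showing the same bookkeeping (hypothetical code, not the library routine itself):

    #include <vector>
    #include <random>

    // Draw a bootstrap sample of row indices; rows with inbag[j]==false form the OOB sample.
    void bootstrap_sample(int npoints, std::mt19937 &rng,
                          std::vector<int> &sample, std::vector<bool> &inbag)
    {
        std::uniform_int_distribution<int> d(0, npoints-1);
        sample.assign(npoints, 0);
        inbag.assign(npoints, false);
        for(int i = 0; i < npoints; i++)
        {
            int j = d(rng);
            sample[i] = j;     // row j goes into the bootstrap training set
            inbag[j] = true;
        }
    }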
-- ALGLIB -- Copyright 01.07.2013 by Bochkanov Sergey *************************************************************************/ static void mlptrain_initmlptrnsession(multilayerperceptron* networktrained, ae_bool randomizenetwork, mlptrainer* trainer, smlptrnsession* session, ae_state *_state) { ae_frame _frame_block; ae_int_t nin; ae_int_t nout; ae_int_t wcount; ae_int_t pcount; ae_vector dummysubset; ae_frame_make(_state, &_frame_block); ae_vector_init(&dummysubset, 0, DT_INT, _state); /* * Prepare network: * * copy input network to Session.Network * * re-initialize preprocessor and weights if RandomizeNetwork=True */ mlpcopy(networktrained, &session->network, _state); if( randomizenetwork ) { ae_assert(trainer->datatype==0||trainer->datatype==1, "InitTemporaries: unexpected Trainer.DataType", _state); if( trainer->datatype==0 ) { mlpinitpreprocessorsubset(&session->network, &trainer->densexy, trainer->npoints, &dummysubset, -1, _state); } if( trainer->datatype==1 ) { mlpinitpreprocessorsparsesubset(&session->network, &trainer->sparsexy, trainer->npoints, &dummysubset, -1, _state); } mlprandomize(&session->network, _state); session->randomizenetwork = ae_true; } else { session->randomizenetwork = ae_false; } /* * Determine network geometry and initialize optimizer */ mlpproperties(&session->network, &nin, &nout, &wcount, _state); minlbfgscreate(wcount, ae_minint(wcount, trainer->lbfgsfactor, _state), &session->network.weights, &session->optimizer, _state); minlbfgssetxrep(&session->optimizer, ae_true, _state); /* * Create buffers */ ae_vector_set_length(&session->wbuf0, wcount, _state); ae_vector_set_length(&session->wbuf1, wcount, _state); /* * Initialize session result */ mlpexporttunableparameters(&session->network, &session->bestparameters, &pcount, _state); session->bestrmserror = ae_maxrealnumber; ae_frame_leave(_state); } /************************************************************************* This function initializes temporaries needed for training session. *************************************************************************/ static void mlptrain_initmlptrnsessions(multilayerperceptron* networktrained, ae_bool randomizenetwork, mlptrainer* trainer, ae_shared_pool* sessions, ae_state *_state) { ae_frame _frame_block; ae_vector dummysubset; smlptrnsession t; smlptrnsession *p; ae_smart_ptr _p; ae_frame_make(_state, &_frame_block); ae_vector_init(&dummysubset, 0, DT_INT, _state); _smlptrnsession_init(&t, _state); ae_smart_ptr_init(&_p, (void**)&p, _state); if( ae_shared_pool_is_initialized(sessions) ) { /* * Pool was already initialized. * Clear sessions stored in the pool. */ ae_shared_pool_first_recycled(sessions, &_p, _state); while(p!=NULL) { ae_assert(mlpsamearchitecture(&p->network, networktrained, _state), "InitMLPTrnSessions: internal consistency error", _state); p->bestrmserror = ae_maxrealnumber; ae_shared_pool_next_recycled(sessions, &_p, _state); } } else { /* * Prepare session and seed pool */ mlptrain_initmlptrnsession(networktrained, randomizenetwork, trainer, &t, _state); ae_shared_pool_set_seed(sessions, &t, sizeof(t), _smlptrnsession_init, _smlptrnsession_init_copy, _smlptrnsession_destroy, _state); } ae_frame_leave(_state); } /************************************************************************* This function initializes temporaries needed for ensemble training. 
*************************************************************************/ static void mlptrain_initmlpetrnsession(multilayerperceptron* individualnetwork, mlptrainer* trainer, mlpetrnsession* session, ae_state *_state) { ae_frame _frame_block; ae_vector dummysubset; ae_frame_make(_state, &_frame_block); ae_vector_init(&dummysubset, 0, DT_INT, _state); /* * Prepare network: * * copy input network to Session.Network * * re-initialize preprocessor and weights if RandomizeNetwork=True */ mlpcopy(individualnetwork, &session->network, _state); mlptrain_initmlptrnsessions(individualnetwork, ae_true, trainer, &session->mlpsessions, _state); ivectorsetlengthatleast(&session->trnsubset, trainer->npoints, _state); ivectorsetlengthatleast(&session->valsubset, trainer->npoints, _state); ae_frame_leave(_state); } /************************************************************************* This function initializes temporaries needed for training session. *************************************************************************/ static void mlptrain_initmlpetrnsessions(multilayerperceptron* individualnetwork, mlptrainer* trainer, ae_shared_pool* sessions, ae_state *_state) { ae_frame _frame_block; mlpetrnsession t; ae_frame_make(_state, &_frame_block); _mlpetrnsession_init(&t, _state); if( !ae_shared_pool_is_initialized(sessions) ) { mlptrain_initmlpetrnsession(individualnetwork, trainer, &t, _state); ae_shared_pool_set_seed(sessions, &t, sizeof(t), _mlpetrnsession_init, _mlpetrnsession_init_copy, _mlpetrnsession_destroy, _state); } ae_frame_leave(_state); } void _mlpreport_init(void* _p, ae_state *_state) { mlpreport *p = (mlpreport*)_p; ae_touch_ptr((void*)p); } void _mlpreport_init_copy(void* _dst, void* _src, ae_state *_state) { mlpreport *dst = (mlpreport*)_dst; mlpreport *src = (mlpreport*)_src; dst->relclserror = src->relclserror; dst->avgce = src->avgce; dst->rmserror = src->rmserror; dst->avgerror = src->avgerror; dst->avgrelerror = src->avgrelerror; dst->ngrad = src->ngrad; dst->nhess = src->nhess; dst->ncholesky = src->ncholesky; } void _mlpreport_clear(void* _p) { mlpreport *p = (mlpreport*)_p; ae_touch_ptr((void*)p); } void _mlpreport_destroy(void* _p) { mlpreport *p = (mlpreport*)_p; ae_touch_ptr((void*)p); } void _mlpcvreport_init(void* _p, ae_state *_state) { mlpcvreport *p = (mlpcvreport*)_p; ae_touch_ptr((void*)p); } void _mlpcvreport_init_copy(void* _dst, void* _src, ae_state *_state) { mlpcvreport *dst = (mlpcvreport*)_dst; mlpcvreport *src = (mlpcvreport*)_src; dst->relclserror = src->relclserror; dst->avgce = src->avgce; dst->rmserror = src->rmserror; dst->avgerror = src->avgerror; dst->avgrelerror = src->avgrelerror; } void _mlpcvreport_clear(void* _p) { mlpcvreport *p = (mlpcvreport*)_p; ae_touch_ptr((void*)p); } void _mlpcvreport_destroy(void* _p) { mlpcvreport *p = (mlpcvreport*)_p; ae_touch_ptr((void*)p); } void _smlptrnsession_init(void* _p, ae_state *_state) { smlptrnsession *p = (smlptrnsession*)_p; ae_touch_ptr((void*)p); ae_vector_init(&p->bestparameters, 0, DT_REAL, _state); _multilayerperceptron_init(&p->network, _state); _minlbfgsstate_init(&p->optimizer, _state); _minlbfgsreport_init(&p->optimizerrep, _state); ae_vector_init(&p->wbuf0, 0, DT_REAL, _state); ae_vector_init(&p->wbuf1, 0, DT_REAL, _state); ae_vector_init(&p->allminibatches, 0, DT_INT, _state); ae_vector_init(&p->currentminibatch, 0, DT_INT, _state); _rcommstate_init(&p->rstate, _state); _hqrndstate_init(&p->generator, _state); } void _smlptrnsession_init_copy(void* _dst, void* _src, ae_state *_state) { 
smlptrnsession *dst = (smlptrnsession*)_dst; smlptrnsession *src = (smlptrnsession*)_src; ae_vector_init_copy(&dst->bestparameters, &src->bestparameters, _state); dst->bestrmserror = src->bestrmserror; dst->randomizenetwork = src->randomizenetwork; _multilayerperceptron_init_copy(&dst->network, &src->network, _state); _minlbfgsstate_init_copy(&dst->optimizer, &src->optimizer, _state); _minlbfgsreport_init_copy(&dst->optimizerrep, &src->optimizerrep, _state); ae_vector_init_copy(&dst->wbuf0, &src->wbuf0, _state); ae_vector_init_copy(&dst->wbuf1, &src->wbuf1, _state); ae_vector_init_copy(&dst->allminibatches, &src->allminibatches, _state); ae_vector_init_copy(&dst->currentminibatch, &src->currentminibatch, _state); _rcommstate_init_copy(&dst->rstate, &src->rstate, _state); dst->algoused = src->algoused; dst->minibatchsize = src->minibatchsize; _hqrndstate_init_copy(&dst->generator, &src->generator, _state); } void _smlptrnsession_clear(void* _p) { smlptrnsession *p = (smlptrnsession*)_p; ae_touch_ptr((void*)p); ae_vector_clear(&p->bestparameters); _multilayerperceptron_clear(&p->network); _minlbfgsstate_clear(&p->optimizer); _minlbfgsreport_clear(&p->optimizerrep); ae_vector_clear(&p->wbuf0); ae_vector_clear(&p->wbuf1); ae_vector_clear(&p->allminibatches); ae_vector_clear(&p->currentminibatch); _rcommstate_clear(&p->rstate); _hqrndstate_clear(&p->generator); } void _smlptrnsession_destroy(void* _p) { smlptrnsession *p = (smlptrnsession*)_p; ae_touch_ptr((void*)p); ae_vector_destroy(&p->bestparameters); _multilayerperceptron_destroy(&p->network); _minlbfgsstate_destroy(&p->optimizer); _minlbfgsreport_destroy(&p->optimizerrep); ae_vector_destroy(&p->wbuf0); ae_vector_destroy(&p->wbuf1); ae_vector_destroy(&p->allminibatches); ae_vector_destroy(&p->currentminibatch); _rcommstate_destroy(&p->rstate); _hqrndstate_destroy(&p->generator); } void _mlpetrnsession_init(void* _p, ae_state *_state) { mlpetrnsession *p = (mlpetrnsession*)_p; ae_touch_ptr((void*)p); ae_vector_init(&p->trnsubset, 0, DT_INT, _state); ae_vector_init(&p->valsubset, 0, DT_INT, _state); ae_shared_pool_init(&p->mlpsessions, _state); _mlpreport_init(&p->mlprep, _state); _multilayerperceptron_init(&p->network, _state); } void _mlpetrnsession_init_copy(void* _dst, void* _src, ae_state *_state) { mlpetrnsession *dst = (mlpetrnsession*)_dst; mlpetrnsession *src = (mlpetrnsession*)_src; ae_vector_init_copy(&dst->trnsubset, &src->trnsubset, _state); ae_vector_init_copy(&dst->valsubset, &src->valsubset, _state); ae_shared_pool_init_copy(&dst->mlpsessions, &src->mlpsessions, _state); _mlpreport_init_copy(&dst->mlprep, &src->mlprep, _state); _multilayerperceptron_init_copy(&dst->network, &src->network, _state); } void _mlpetrnsession_clear(void* _p) { mlpetrnsession *p = (mlpetrnsession*)_p; ae_touch_ptr((void*)p); ae_vector_clear(&p->trnsubset); ae_vector_clear(&p->valsubset); ae_shared_pool_clear(&p->mlpsessions); _mlpreport_clear(&p->mlprep); _multilayerperceptron_clear(&p->network); } void _mlpetrnsession_destroy(void* _p) { mlpetrnsession *p = (mlpetrnsession*)_p; ae_touch_ptr((void*)p); ae_vector_destroy(&p->trnsubset); ae_vector_destroy(&p->valsubset); ae_shared_pool_destroy(&p->mlpsessions); _mlpreport_destroy(&p->mlprep); _multilayerperceptron_destroy(&p->network); } void _mlptrainer_init(void* _p, ae_state *_state) { mlptrainer *p = (mlptrainer*)_p; ae_touch_ptr((void*)p); ae_matrix_init(&p->densexy, 0, 0, DT_REAL, _state); _sparsematrix_init(&p->sparsexy, _state); _smlptrnsession_init(&p->session, _state); 
ae_vector_init(&p->subset, 0, DT_INT, _state); ae_vector_init(&p->valsubset, 0, DT_INT, _state); } void _mlptrainer_init_copy(void* _dst, void* _src, ae_state *_state) { mlptrainer *dst = (mlptrainer*)_dst; mlptrainer *src = (mlptrainer*)_src; dst->nin = src->nin; dst->nout = src->nout; dst->rcpar = src->rcpar; dst->lbfgsfactor = src->lbfgsfactor; dst->decay = src->decay; dst->wstep = src->wstep; dst->maxits = src->maxits; dst->datatype = src->datatype; dst->npoints = src->npoints; ae_matrix_init_copy(&dst->densexy, &src->densexy, _state); _sparsematrix_init_copy(&dst->sparsexy, &src->sparsexy, _state); _smlptrnsession_init_copy(&dst->session, &src->session, _state); dst->ngradbatch = src->ngradbatch; ae_vector_init_copy(&dst->subset, &src->subset, _state); dst->subsetsize = src->subsetsize; ae_vector_init_copy(&dst->valsubset, &src->valsubset, _state); dst->valsubsetsize = src->valsubsetsize; dst->algokind = src->algokind; dst->minibatchsize = src->minibatchsize; } void _mlptrainer_clear(void* _p) { mlptrainer *p = (mlptrainer*)_p; ae_touch_ptr((void*)p); ae_matrix_clear(&p->densexy); _sparsematrix_clear(&p->sparsexy); _smlptrnsession_clear(&p->session); ae_vector_clear(&p->subset); ae_vector_clear(&p->valsubset); } void _mlptrainer_destroy(void* _p) { mlptrainer *p = (mlptrainer*)_p; ae_touch_ptr((void*)p); ae_matrix_destroy(&p->densexy); _sparsematrix_destroy(&p->sparsexy); _smlptrnsession_destroy(&p->session); ae_vector_destroy(&p->subset); ae_vector_destroy(&p->valsubset); } void _mlpparallelizationcv_init(void* _p, ae_state *_state) { mlpparallelizationcv *p = (mlpparallelizationcv*)_p; ae_touch_ptr((void*)p); _multilayerperceptron_init(&p->network, _state); _mlpreport_init(&p->rep, _state); ae_vector_init(&p->subset, 0, DT_INT, _state); ae_vector_init(&p->xyrow, 0, DT_REAL, _state); ae_vector_init(&p->y, 0, DT_REAL, _state); ae_shared_pool_init(&p->trnpool, _state); } void _mlpparallelizationcv_init_copy(void* _dst, void* _src, ae_state *_state) { mlpparallelizationcv *dst = (mlpparallelizationcv*)_dst; mlpparallelizationcv *src = (mlpparallelizationcv*)_src; _multilayerperceptron_init_copy(&dst->network, &src->network, _state); _mlpreport_init_copy(&dst->rep, &src->rep, _state); ae_vector_init_copy(&dst->subset, &src->subset, _state); dst->subsetsize = src->subsetsize; ae_vector_init_copy(&dst->xyrow, &src->xyrow, _state); ae_vector_init_copy(&dst->y, &src->y, _state); dst->ngrad = src->ngrad; ae_shared_pool_init_copy(&dst->trnpool, &src->trnpool, _state); } void _mlpparallelizationcv_clear(void* _p) { mlpparallelizationcv *p = (mlpparallelizationcv*)_p; ae_touch_ptr((void*)p); _multilayerperceptron_clear(&p->network); _mlpreport_clear(&p->rep); ae_vector_clear(&p->subset); ae_vector_clear(&p->xyrow); ae_vector_clear(&p->y); ae_shared_pool_clear(&p->trnpool); } void _mlpparallelizationcv_destroy(void* _p) { mlpparallelizationcv *p = (mlpparallelizationcv*)_p; ae_touch_ptr((void*)p); _multilayerperceptron_destroy(&p->network); _mlpreport_destroy(&p->rep); ae_vector_destroy(&p->subset); ae_vector_destroy(&p->xyrow); ae_vector_destroy(&p->y); ae_shared_pool_destroy(&p->trnpool); } /************************************************************************* This function initializes clusterizer object. Newly initialized object is empty, i.e. it does not contain dataset. You should use it as follows: 1. creation 2. dataset is added with ClusterizerSetPoints() 3. additional parameters are set 3. 
clusterization is performed with one of the clustering functions -- ALGLIB -- Copyright 10.07.2012 by Bochkanov Sergey *************************************************************************/ void clusterizercreate(clusterizerstate* s, ae_state *_state) { _clusterizerstate_clear(s); s->npoints = 0; s->nfeatures = 0; s->disttype = 2; s->ahcalgo = 0; s->kmeansrestarts = 1; s->kmeansmaxits = 0; s->kmeansinitalgo = 0; s->kmeansdbgnoits = ae_false; kmeansinitbuf(&s->kmeanstmp, _state); } /************************************************************************* This function adds dataset to the clusterizer structure. This function overrides all previous calls of ClusterizerSetPoints() or ClusterizerSetDistances(). INPUT PARAMETERS: S - clusterizer state, initialized by ClusterizerCreate() XY - array[NPoints,NFeatures], dataset NPoints - number of points, >=0 NFeatures- number of features, >=1 DistType- distance function: * 0 Chebyshev distance (L-inf norm) * 1 city block distance (L1 norm) * 2 Euclidean distance (L2 norm), non-squared * 10 Pearson correlation: dist(a,b) = 1-corr(a,b) * 11 Absolute Pearson correlation: dist(a,b) = 1-|corr(a,b)| * 12 Uncentered Pearson correlation (cosine of the angle): dist(a,b) = a'*b/(|a|*|b|) * 13 Absolute uncentered Pearson correlation dist(a,b) = |a'*b|/(|a|*|b|) * 20 Spearman rank correlation: dist(a,b) = 1-rankcorr(a,b) * 21 Absolute Spearman rank correlation dist(a,b) = 1-|rankcorr(a,b)| NOTE 1: different distance functions have different performance penalty: * Euclidean or Pearson correlation distances are the fastest ones * Spearman correlation distance function is a bit slower * city block and Chebyshev distances are order of magnitude slower The reason behing difference in performance is that correlation-based distance functions are computed using optimized linear algebra kernels, while Chebyshev and city block distance functions are computed using simple nested loops with two branches at each iteration. NOTE 2: different clustering algorithms have different limitations: * agglomerative hierarchical clustering algorithms may be used with any kind of distance metric * k-means++ clustering algorithm may be used only with Euclidean distance function Thus, list of specific clustering algorithms you may use depends on distance function you specify when you set your dataset. -- ALGLIB -- Copyright 10.07.2012 by Bochkanov Sergey *************************************************************************/ void clusterizersetpoints(clusterizerstate* s, /* Real */ ae_matrix* xy, ae_int_t npoints, ae_int_t nfeatures, ae_int_t disttype, ae_state *_state) { ae_int_t i; ae_assert((((((((disttype==0||disttype==1)||disttype==2)||disttype==10)||disttype==11)||disttype==12)||disttype==13)||disttype==20)||disttype==21, "ClusterizerSetPoints: incorrect DistType", _state); ae_assert(npoints>=0, "ClusterizerSetPoints: NPoints<0", _state); ae_assert(nfeatures>=1, "ClusterizerSetPoints: NFeatures<1", _state); ae_assert(xy->rows>=npoints, "ClusterizerSetPoints: Rows(XY)cols>=nfeatures, "ClusterizerSetPoints: Cols(XY)npoints = npoints; s->nfeatures = nfeatures; s->disttype = disttype; rmatrixsetlengthatleast(&s->xy, npoints, nfeatures, _state); for(i=0; i<=npoints-1; i++) { ae_v_move(&s->xy.ptr.pp_double[i][0], 1, &xy->ptr.pp_double[i][0], 1, ae_v_len(0,nfeatures-1)); } } /************************************************************************* This function adds dataset given by distance matrix to the clusterizer structure. 
It is important that the dataset is not given explicitly - only the distance matrix is given. This function overrides all previous calls of ClusterizerSetPoints() or ClusterizerSetDistances(). INPUT PARAMETERS: S - clusterizer state, initialized by ClusterizerCreate() D - array[NPoints,NPoints], distance matrix given by its upper or lower triangle (main diagonal is ignored because its entries are expected to be zero). NPoints - number of points IsUpper - whether upper or lower triangle of D is given. NOTE 1: different clustering algorithms have different limitations: * agglomerative hierarchical clustering algorithms may be used with any kind of distance metric, including one which is given by distance matrix * k-means++ clustering algorithm may be used only with Euclidean distance function and explicitly given points - it cannot be used with a dataset given by distance matrix Thus, if you call this function, you will be unable to use k-means clustering algorithm to process your problem. -- ALGLIB -- Copyright 10.07.2012 by Bochkanov Sergey *************************************************************************/ void clusterizersetdistances(clusterizerstate* s, /* Real */ ae_matrix* d, ae_int_t npoints, ae_bool isupper, ae_state *_state) { ae_int_t i; ae_int_t j; ae_int_t j0; ae_int_t j1; ae_assert(npoints>=0, "ClusterizerSetDistances: NPoints<0", _state); ae_assert(d->rows>=npoints, "ClusterizerSetDistances: Rows(D)<NPoints", _state); ae_assert(d->cols>=npoints, "ClusterizerSetDistances: Cols(D)<NPoints", _state); s->npoints = npoints; s->nfeatures = 0; s->disttype = -1; rmatrixsetlengthatleast(&s->d, npoints, npoints, _state); for(i=0; i<=npoints-1; i++) { if( isupper ) { j0 = i+1; j1 = npoints-1; } else { j0 = 0; j1 = i-1; } for(j=j0; j<=j1; j++) { ae_assert(ae_isfinite(d->ptr.pp_double[i][j], _state)&&ae_fp_greater_eq(d->ptr.pp_double[i][j],(double)(0)), "ClusterizerSetDistances: D contains infinite, NAN or negative elements", _state); s->d.ptr.pp_double[i][j] = d->ptr.pp_double[i][j]; s->d.ptr.pp_double[j][i] = d->ptr.pp_double[i][j]; } s->d.ptr.pp_double[i][i] = (double)(0); } } /************************************************************************* This function sets the agglomerative hierarchical clustering algorithm INPUT PARAMETERS: S - clusterizer state, initialized by ClusterizerCreate() Algo - algorithm type: * 0 complete linkage (default algorithm) * 1 single linkage * 2 unweighted average linkage * 3 weighted average linkage * 4 Ward's method NOTE: Ward's method works correctly only with Euclidean distance, that's why the algorithm will return a negative termination code (failure) for any other distance type. It is possible, however, to use this method with a user-supplied distance matrix. It is your responsibility to pass one which was calculated with Euclidean distance function. -- ALGLIB -- Copyright 10.07.2012 by Bochkanov Sergey *************************************************************************/ void clusterizersetahcalgo(clusterizerstate* s, ae_int_t algo, ae_state *_state) { ae_assert((((algo==0||algo==1)||algo==2)||algo==3)||algo==4, "ClusterizerSetHCAlgo: incorrect algorithm type", _state); s->ahcalgo = algo; } /************************************************************************* This function sets k-means properties: number of restarts and maximum number of iterations per one run. INPUT PARAMETERS: S - clusterizer state, initialized by ClusterizerCreate() Restarts- restarts count, >=1. k-means++ algorithm performs several restarts and chooses best set of centers (one with minimum squared distance). 
MaxIts - maximum number of k-means iterations performed during one run. >=0, zero value means that algorithm performs unlimited number of iterations. -- ALGLIB -- Copyright 10.07.2012 by Bochkanov Sergey *************************************************************************/ void clusterizersetkmeanslimits(clusterizerstate* s, ae_int_t restarts, ae_int_t maxits, ae_state *_state) { ae_assert(restarts>=1, "ClusterizerSetKMeansLimits: Restarts<=0", _state); ae_assert(maxits>=0, "ClusterizerSetKMeansLimits: MaxIts<0", _state); s->kmeansrestarts = restarts; s->kmeansmaxits = maxits; } /************************************************************************* This function sets k-means initialization algorithm. Several different algorithms can be chosen, including k-means++. INPUT PARAMETERS: S - clusterizer state, initialized by ClusterizerCreate() InitAlgo- initialization algorithm: * 0 automatic selection ( different versions of ALGLIB may select different algorithms) * 1 random initialization * 2 k-means++ initialization (best quality of initial centers, but long non-parallelizable initialization phase with bad cache locality) * 3 "fast-greedy" algorithm with efficient, easy to parallelize initialization. Quality of initial centers is somewhat worse than that of k-means++. This algorithm is a default one in the current version of ALGLIB. *-1 "debug" algorithm which always selects first K rows of dataset; this algorithm is used for debug purposes only. Do not use it in the industrial code! -- ALGLIB -- Copyright 21.01.2015 by Bochkanov Sergey *************************************************************************/ void clusterizersetkmeansinit(clusterizerstate* s, ae_int_t initalgo, ae_state *_state) { ae_assert(initalgo>=-1&&initalgo<=3, "ClusterizerSetKMeansInit: InitAlgo is incorrect", _state); s->kmeansinitalgo = initalgo; } /************************************************************************* This function performs agglomerative hierarchical clustering COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Agglomerative hierarchical clustering algorithm has two phases: ! distance matrix calculation and clustering itself. Only first phase ! (distance matrix calculation) is accelerated by Intel MKL and multi- ! threading. Thus, acceleration is significant only for medium or high- ! dimensional problems. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS: S - clusterizer state, initialized by ClusterizerCreate() OUTPUT PARAMETERS: Rep - clustering results; see description of AHCReport structure for more information. NOTE 1: hierarchical clustering algorithms require large amounts of memory. In particular, this implementation needs sizeof(double)*NPoints^2 bytes, which are used to store distance matrix. In case we work with user-supplied matrix, this amount is multiplied by 2 (we have to store original matrix and to work with its copy). For example, problem with 10000 points would require 800M of RAM, even when working in a 1-dimensional space. 
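EXAMPLE (illustrative sketch): a typical agglomerative clustering session through the C++ wrapper. The functions clusterizercreate(), clusterizersetpoints(), clusterizersetahcalgo(), clusterizerrunahc() and clusterizergetkclusters() are documented in their own sections of this manual; the short overload of clusterizersetpoints() and the toy dataset are assumptions of this sketch.

    #include "dataanalysis.h"
    using namespace alglib;

    int main()
    {
        clusterizerstate s;
        ahcreport rep;
        real_2d_array xy = "[[1,1],[1,2],[4,1],[2,3],[4,1.5]]";  // 5 points, 2 features

        clusterizercreate(s);
        clusterizersetpoints(s, xy, 2);   // DistType=2, Euclidean distance
        clusterizersetahcalgo(s, 1);      // 1 = single linkage
        clusterizerrunahc(s, rep);        // rep.z now holds the merge dendrogram

        integer_1d_array cidx, cz;
        clusterizergetkclusters(rep, 2, cidx, cz);  // cut the dendrogram into K=2 clusters
        return 0;
    }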
-- ALGLIB -- Copyright 10.07.2012 by Bochkanov Sergey *************************************************************************/ void clusterizerrunahc(clusterizerstate* s, ahcreport* rep, ae_state *_state) { ae_int_t npoints; ae_int_t nfeatures; _ahcreport_clear(rep); npoints = s->npoints; nfeatures = s->nfeatures; /* * Fill Rep.NPoints, quick exit when NPoints<=1 */ rep->npoints = npoints; if( npoints==0 ) { ae_vector_set_length(&rep->p, 0, _state); ae_matrix_set_length(&rep->z, 0, 0, _state); ae_matrix_set_length(&rep->pz, 0, 0, _state); ae_matrix_set_length(&rep->pm, 0, 0, _state); ae_vector_set_length(&rep->mergedist, 0, _state); rep->terminationtype = 1; return; } if( npoints==1 ) { ae_vector_set_length(&rep->p, 1, _state); ae_matrix_set_length(&rep->z, 0, 0, _state); ae_matrix_set_length(&rep->pz, 0, 0, _state); ae_matrix_set_length(&rep->pm, 0, 0, _state); ae_vector_set_length(&rep->mergedist, 0, _state); rep->p.ptr.p_int[0] = 0; rep->terminationtype = 1; return; } /* * More than one point */ if( s->disttype==-1 ) { /* * Run clusterizer with user-supplied distance matrix */ clustering_clusterizerrunahcinternal(s, &s->d, rep, _state); return; } else { /* * Check combination of AHC algo and distance type */ if( s->ahcalgo==4&&s->disttype!=2 ) { rep->terminationtype = -5; return; } /* * Build distance matrix D. */ clusterizergetdistancesbuf(&s->distbuf, &s->xy, npoints, nfeatures, s->disttype, &s->tmpd, _state); /* * Run clusterizer */ clustering_clusterizerrunahcinternal(s, &s->tmpd, rep, _state); return; } } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. *************************************************************************/ void _pexec_clusterizerrunahc(clusterizerstate* s, ahcreport* rep, ae_state *_state) { clusterizerrunahc(s,rep, _state); } /************************************************************************* This function performs clustering by k-means++ algorithm. You may change algorithm properties by calling: * ClusterizerSetKMeansLimits() to change number of restarts or iterations * ClusterizerSetKMeansInit() to change initialization algorithm By default, one restart and unlimited number of iterations are used. Initialization algorithm is chosen automatically. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function: ! * multicore support (can be used from C# and C++) ! * access to high-performance C++ core (actual for C# users) ! ! K-means clustering algorithm has two phases: selection of initial ! centers and clustering itself. ALGLIB parallelizes both phases. ! Parallel version is optimized for the following scenario: medium or ! high-dimensional problem (20 or more dimensions) with large number of ! points and clusters. However, some speed-up can be obtained even when ! assumptions above are violated. ! ! As for native-vs-managed comparison, working with native core brings ! 30-40% improvement in speed over pure C# version of ALGLIB. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS: S - clusterizer state, initialized by ClusterizerCreate() K - number of clusters, K>=0. K can be zero only when algorithm is called for empty dataset, in this case completion code is set to success (+1). 
If K=0 and dataset size is non-zero, we can not meaningfully assign points to some center (there are no centers because K=0) and return -3 as completion code (failure). OUTPUT PARAMETERS: Rep - clustering results; see description of KMeansReport structure for more information. NOTE 1: k-means clustering can be performed only for datasets with Euclidean distance function. Algorithm will return negative completion code in Rep.TerminationType in case dataset was added to clusterizer with DistType other than Euclidean (or dataset was specified by distance matrix instead of explicitly given points). -- ALGLIB -- Copyright 10.07.2012 by Bochkanov Sergey *************************************************************************/ void clusterizerrunkmeans(clusterizerstate* s, ae_int_t k, kmeansreport* rep, ae_state *_state) { ae_frame _frame_block; ae_matrix dummy; ae_frame_make(_state, &_frame_block); _kmeansreport_clear(rep); ae_matrix_init(&dummy, 0, 0, DT_REAL, _state); ae_assert(k>=0, "ClusterizerRunKMeans: K<0", _state); /* * Incorrect distance type */ if( s->disttype!=2 ) { rep->npoints = s->npoints; rep->terminationtype = -5; rep->k = k; rep->iterationscount = 0; rep->energy = 0.0; ae_frame_leave(_state); return; } /* * K>NPoints or (K=0 and NPoints>0) */ if( k>s->npoints||(k==0&&s->npoints>0) ) { rep->npoints = s->npoints; rep->terminationtype = -3; rep->k = k; rep->iterationscount = 0; rep->energy = 0.0; ae_frame_leave(_state); return; } /* * No points */ if( s->npoints==0 ) { rep->npoints = 0; rep->terminationtype = 1; rep->k = k; rep->iterationscount = 0; rep->energy = 0.0; ae_frame_leave(_state); return; } /* * Normal case: * 1<=K<=NPoints, Euclidean distance */ rep->npoints = s->npoints; rep->nfeatures = s->nfeatures; rep->k = k; rep->npoints = s->npoints; rep->nfeatures = s->nfeatures; kmeansgenerateinternal(&s->xy, s->npoints, s->nfeatures, k, s->kmeansinitalgo, s->kmeansmaxits, s->kmeansrestarts, s->kmeansdbgnoits, &rep->terminationtype, &rep->iterationscount, &dummy, ae_false, &rep->c, ae_true, &rep->cidx, &rep->energy, &s->kmeanstmp, _state); ae_frame_leave(_state); } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. *************************************************************************/ void _pexec_clusterizerrunkmeans(clusterizerstate* s, ae_int_t k, kmeansreport* rep, ae_state *_state) { clusterizerrunkmeans(s,k,rep, _state); } /************************************************************************* This function returns distance matrix for dataset COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Agglomerative hierarchical clustering algorithm has two phases: ! distance matrix calculation and clustering itself. Only first phase ! (distance matrix calculation) is accelerated by Intel MKL and multi- ! threading. Thus, acceleration is significant only for medium or high- ! dimensional problems. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. 
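The fragment below is a rough usage sketch rather than library code; it assumes
an already initialized ae_state *_state and placeholder sizes npoints/nfeatures:

    ae_matrix xy, d;
    ae_matrix_init(&xy, 0, 0, DT_REAL, _state);
    ae_matrix_init(&d, 0, 0, DT_REAL, _state);
    ae_matrix_set_length(&xy, npoints, nfeatures, _state);
    /* ... fill xy.ptr.pp_double[i][j] with the dataset ... */
    clusterizergetdistances(&xy, npoints, nfeatures, 2, &d, _state); /* DistType=2, Euclidean */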
INPUT PARAMETERS: XY - array[NPoints,NFeatures], dataset NPoints - number of points, >=0 NFeatures- number of features, >=1 DistType- distance function: * 0 Chebyshev distance (L-inf norm) * 1 city block distance (L1 norm) * 2 Euclidean distance (L2 norm, non-squared) * 10 Pearson correlation: dist(a,b) = 1-corr(a,b) * 11 Absolute Pearson correlation: dist(a,b) = 1-|corr(a,b)| * 12 Uncentered Pearson correlation (cosine of the angle): dist(a,b) = a'*b/(|a|*|b|) * 13 Absolute uncentered Pearson correlation dist(a,b) = |a'*b|/(|a|*|b|) * 20 Spearman rank correlation: dist(a,b) = 1-rankcorr(a,b) * 21 Absolute Spearman rank correlation dist(a,b) = 1-|rankcorr(a,b)| OUTPUT PARAMETERS: D - array[NPoints,NPoints], distance matrix (full matrix is returned, with lower and upper triangles) NOTE: different distance functions have different performance penalty: * Euclidean or Pearson correlation distances are the fastest ones * Spearman correlation distance function is a bit slower * city block and Chebyshev distances are order of magnitude slower The reason behing difference in performance is that correlation-based distance functions are computed using optimized linear algebra kernels, while Chebyshev and city block distance functions are computed using simple nested loops with two branches at each iteration. -- ALGLIB -- Copyright 10.07.2012 by Bochkanov Sergey *************************************************************************/ void clusterizergetdistances(/* Real */ ae_matrix* xy, ae_int_t npoints, ae_int_t nfeatures, ae_int_t disttype, /* Real */ ae_matrix* d, ae_state *_state) { ae_frame _frame_block; apbuffers buf; ae_frame_make(_state, &_frame_block); ae_matrix_clear(d); _apbuffers_init(&buf, _state); ae_assert(nfeatures>=1, "ClusterizerGetDistances: NFeatures<1", _state); ae_assert(npoints>=0, "ClusterizerGetDistances: NPoints<1", _state); ae_assert((((((((disttype==0||disttype==1)||disttype==2)||disttype==10)||disttype==11)||disttype==12)||disttype==13)||disttype==20)||disttype==21, "ClusterizerGetDistances: incorrect DistType", _state); ae_assert(xy->rows>=npoints, "ClusterizerGetDistances: Rows(XY)cols>=nfeatures, "ClusterizerGetDistances: Cols(XY)=1, "ClusterizerGetDistancesBuf: NFeatures<1", _state); ae_assert(npoints>=0, "ClusterizerGetDistancesBuf: NPoints<1", _state); ae_assert((((((((disttype==0||disttype==1)||disttype==2)||disttype==10)||disttype==11)||disttype==12)||disttype==13)||disttype==20)||disttype==21, "ClusterizerGetDistancesBuf: incorrect DistType", _state); ae_assert(xy->rows>=npoints, "ClusterizerGetDistancesBuf: Rows(XY)cols>=nfeatures, "ClusterizerGetDistancesBuf: Cols(XY)ptr.pp_double[0][0] = (double)(0); return; } /* * Build distance matrix D. 
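     *
     * The dispatch below is purely by DistType: types 0/1 (Chebyshev and
     * city-block) are handled by the recursive kernel
     * EvaluateDistanceMatrixRec(), while type 2 (Euclidean) and the
     * correlation-based types 10..13, 20, 21 are reduced to a Gram matrix
     * computed by RMatrixSYRK and then converted to the requested distances.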
*/ if( disttype==0||disttype==1 ) { /* * Chebyshev or city-block distances: * * recursively calculate upper triangle (with main diagonal) * * copy it to the bottom part of the matrix */ rmatrixsetlengthatleast(d, npoints, npoints, _state); clustering_evaluatedistancematrixrec(xy, nfeatures, disttype, d, 0, npoints, 0, npoints, _state); rmatrixenforcesymmetricity(d, npoints, ae_true, _state); return; } if( disttype==2 ) { /* * Euclidean distance * * NOTE: parallelization is done within RMatrixSYRK */ rmatrixsetlengthatleast(d, npoints, npoints, _state); rmatrixsetlengthatleast(&buf->rm0, npoints, nfeatures, _state); rvectorsetlengthatleast(&buf->ra1, nfeatures, _state); rvectorsetlengthatleast(&buf->ra0, npoints, _state); for(j=0; j<=nfeatures-1; j++) { buf->ra1.ptr.p_double[j] = 0.0; } v = (double)1/(double)npoints; for(i=0; i<=npoints-1; i++) { ae_v_addd(&buf->ra1.ptr.p_double[0], 1, &xy->ptr.pp_double[i][0], 1, ae_v_len(0,nfeatures-1), v); } for(i=0; i<=npoints-1; i++) { ae_v_move(&buf->rm0.ptr.pp_double[i][0], 1, &xy->ptr.pp_double[i][0], 1, ae_v_len(0,nfeatures-1)); ae_v_sub(&buf->rm0.ptr.pp_double[i][0], 1, &buf->ra1.ptr.p_double[0], 1, ae_v_len(0,nfeatures-1)); } rmatrixsyrk(npoints, nfeatures, 1.0, &buf->rm0, 0, 0, 0, 0.0, d, 0, 0, ae_true, _state); for(i=0; i<=npoints-1; i++) { buf->ra0.ptr.p_double[i] = d->ptr.pp_double[i][i]; } for(i=0; i<=npoints-1; i++) { d->ptr.pp_double[i][i] = 0.0; for(j=i+1; j<=npoints-1; j++) { v = ae_sqrt(ae_maxreal(buf->ra0.ptr.p_double[i]+buf->ra0.ptr.p_double[j]-2*d->ptr.pp_double[i][j], 0.0, _state), _state); d->ptr.pp_double[i][j] = v; } } rmatrixenforcesymmetricity(d, npoints, ae_true, _state); return; } if( disttype==10||disttype==11 ) { /* * Absolute/nonabsolute Pearson correlation distance * * NOTE: parallelization is done within PearsonCorrM, which calls RMatrixSYRK internally */ rmatrixsetlengthatleast(d, npoints, npoints, _state); rvectorsetlengthatleast(&buf->ra0, npoints, _state); rmatrixsetlengthatleast(&buf->rm0, npoints, nfeatures, _state); for(i=0; i<=npoints-1; i++) { v = 0.0; for(j=0; j<=nfeatures-1; j++) { v = v+xy->ptr.pp_double[i][j]; } v = v/nfeatures; for(j=0; j<=nfeatures-1; j++) { buf->rm0.ptr.pp_double[i][j] = xy->ptr.pp_double[i][j]-v; } } rmatrixsyrk(npoints, nfeatures, 1.0, &buf->rm0, 0, 0, 0, 0.0, d, 0, 0, ae_true, _state); for(i=0; i<=npoints-1; i++) { buf->ra0.ptr.p_double[i] = d->ptr.pp_double[i][i]; } for(i=0; i<=npoints-1; i++) { d->ptr.pp_double[i][i] = 0.0; for(j=i+1; j<=npoints-1; j++) { v = d->ptr.pp_double[i][j]/ae_sqrt(buf->ra0.ptr.p_double[i]*buf->ra0.ptr.p_double[j], _state); if( disttype==10 ) { v = 1-v; } else { v = 1-ae_fabs(v, _state); } v = ae_maxreal(v, 0.0, _state); d->ptr.pp_double[i][j] = v; } } rmatrixenforcesymmetricity(d, npoints, ae_true, _state); return; } if( disttype==12||disttype==13 ) { /* * Absolute/nonabsolute uncentered Pearson correlation distance * * NOTE: parallelization is done within RMatrixSYRK */ rmatrixsetlengthatleast(d, npoints, npoints, _state); rvectorsetlengthatleast(&buf->ra0, npoints, _state); rmatrixsyrk(npoints, nfeatures, 1.0, xy, 0, 0, 0, 0.0, d, 0, 0, ae_true, _state); for(i=0; i<=npoints-1; i++) { buf->ra0.ptr.p_double[i] = d->ptr.pp_double[i][i]; } for(i=0; i<=npoints-1; i++) { d->ptr.pp_double[i][i] = 0.0; for(j=i+1; j<=npoints-1; j++) { v = d->ptr.pp_double[i][j]/ae_sqrt(buf->ra0.ptr.p_double[i]*buf->ra0.ptr.p_double[j], _state); if( disttype==13 ) { v = ae_fabs(v, _state); } v = ae_minreal(v, 1.0, _state); d->ptr.pp_double[i][j] = 1-v; } } 
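        /*
         * Only the upper triangle of D was filled above; the call below
         * mirrors it into the lower triangle so that the caller receives
         * a full symmetric matrix.
         */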
rmatrixenforcesymmetricity(d, npoints, ae_true, _state); return; } if( disttype==20||disttype==21 ) { /* * Spearman rank correlation * * NOTE: parallelization of correlation matrix is done within * PearsonCorrM, which calls RMatrixSYRK internally */ rmatrixsetlengthatleast(d, npoints, npoints, _state); rvectorsetlengthatleast(&buf->ra0, npoints, _state); rmatrixsetlengthatleast(&buf->rm0, npoints, nfeatures, _state); rmatrixcopy(npoints, nfeatures, xy, 0, 0, &buf->rm0, 0, 0, _state); rankdatacentered(&buf->rm0, npoints, nfeatures, _state); rmatrixsyrk(npoints, nfeatures, 1.0, &buf->rm0, 0, 0, 0, 0.0, d, 0, 0, ae_true, _state); for(i=0; i<=npoints-1; i++) { if( ae_fp_greater(d->ptr.pp_double[i][i],(double)(0)) ) { buf->ra0.ptr.p_double[i] = 1/ae_sqrt(d->ptr.pp_double[i][i], _state); } else { buf->ra0.ptr.p_double[i] = 0.0; } } for(i=0; i<=npoints-1; i++) { v = buf->ra0.ptr.p_double[i]; d->ptr.pp_double[i][i] = 0.0; for(j=i+1; j<=npoints-1; j++) { vv = d->ptr.pp_double[i][j]*v*buf->ra0.ptr.p_double[j]; if( disttype==20 ) { vr = 1-vv; } else { vr = 1-ae_fabs(vv, _state); } if( ae_fp_less(vr,(double)(0)) ) { vr = 0.0; } d->ptr.pp_double[i][j] = vr; } } rmatrixenforcesymmetricity(d, npoints, ae_true, _state); return; } ae_assert(ae_false, "Assertion failed", _state); } /************************************************************************* This function takes as input clusterization report Rep, desired clusters count K, and builds top K clusters from hierarchical clusterization tree. It returns assignment of points to clusters (array of cluster indexes). INPUT PARAMETERS: Rep - report from ClusterizerRunAHC() performed on XY K - desired number of clusters, 1<=K<=NPoints. K can be zero only when NPoints=0. OUTPUT PARAMETERS: CIdx - array[NPoints], I-th element contains cluster index (from 0 to K-1) for I-th point of the dataset. CZ - array[K]. This array allows to convert cluster indexes returned by this function to indexes used by Rep.Z. J-th cluster returned by this function corresponds to CZ[J]-th cluster stored in Rep.Z/PZ/PM. 
It is guaranteed that CZ[I]npoints; ae_assert(npoints>=0, "ClusterizerGetKClusters: internal error in Rep integrity", _state); ae_assert(k>=0, "ClusterizerGetKClusters: K<=0", _state); ae_assert(k<=npoints, "ClusterizerGetKClusters: K>NPoints", _state); ae_assert(k>0||npoints==0, "ClusterizerGetKClusters: K<=0", _state); ae_assert(npoints==rep->npoints, "ClusterizerGetKClusters: NPoints<>Rep.NPoints", _state); /* * Quick exit */ if( npoints==0 ) { ae_frame_leave(_state); return; } if( npoints==1 ) { ae_vector_set_length(cz, 1, _state); ae_vector_set_length(cidx, 1, _state); cz->ptr.p_int[0] = 0; cidx->ptr.p_int[0] = 0; ae_frame_leave(_state); return; } /* * Replay merges, from top to bottom, * keep track of clusters being present at the moment */ ae_vector_set_length(&presentclusters, 2*npoints-1, _state); ae_vector_set_length(&tmpidx, npoints, _state); for(i=0; i<=2*npoints-3; i++) { presentclusters.ptr.p_bool[i] = ae_false; } presentclusters.ptr.p_bool[2*npoints-2] = ae_true; for(i=0; i<=npoints-1; i++) { tmpidx.ptr.p_int[i] = 2*npoints-2; } for(mergeidx=npoints-2; mergeidx>=npoints-k; mergeidx--) { /* * Update information about clusters being present at the moment */ presentclusters.ptr.p_bool[npoints+mergeidx] = ae_false; presentclusters.ptr.p_bool[rep->z.ptr.pp_int[mergeidx][0]] = ae_true; presentclusters.ptr.p_bool[rep->z.ptr.pp_int[mergeidx][1]] = ae_true; /* * Update TmpIdx according to the current state of the dataset * * NOTE: TmpIdx contains cluster indexes from [0..2*NPoints-2]; * we will convert them to [0..K-1] later. */ i0 = rep->pm.ptr.pp_int[mergeidx][0]; i1 = rep->pm.ptr.pp_int[mergeidx][1]; t = rep->z.ptr.pp_int[mergeidx][0]; for(i=i0; i<=i1; i++) { tmpidx.ptr.p_int[i] = t; } i0 = rep->pm.ptr.pp_int[mergeidx][2]; i1 = rep->pm.ptr.pp_int[mergeidx][3]; t = rep->z.ptr.pp_int[mergeidx][1]; for(i=i0; i<=i1; i++) { tmpidx.ptr.p_int[i] = t; } } /* * Fill CZ - array which allows us to convert cluster indexes * from one system to another. */ ae_vector_set_length(cz, k, _state); ae_vector_set_length(&clusterindexes, 2*npoints-1, _state); t = 0; for(i=0; i<=2*npoints-2; i++) { if( presentclusters.ptr.p_bool[i] ) { cz->ptr.p_int[t] = i; clusterindexes.ptr.p_int[i] = t; t = t+1; } } ae_assert(t==k, "ClusterizerGetKClusters: internal error", _state); /* * Convert indexes stored in CIdx */ ae_vector_set_length(cidx, npoints, _state); for(i=0; i<=npoints-1; i++) { cidx->ptr.p_int[i] = clusterindexes.ptr.p_int[tmpidx.ptr.p_int[rep->p.ptr.p_int[i]]]; } ae_frame_leave(_state); } /************************************************************************* This function accepts AHC report Rep, desired minimum intercluster distance and returns top clusters from hierarchical clusterization tree which are separated by distance R or HIGHER. It returns assignment of points to clusters (array of cluster indexes). There is one more function with similar name - ClusterizerSeparatedByCorr, which returns clusters with intercluster correlation equal to R or LOWER (note: higher for distance, lower for correlation). INPUT PARAMETERS: Rep - report from ClusterizerRunAHC() performed on XY R - desired minimum intercluster distance, R>=0 OUTPUT PARAMETERS: K - number of clusters, 1<=K<=NPoints CIdx - array[NPoints], I-th element contains cluster index (from 0 to K-1) for I-th point of the dataset. CZ - array[K]. This array allows to convert cluster indexes returned by this function to indexes used by Rep.Z. J-th cluster returned by this function corresponds to CZ[J]-th cluster stored in Rep.Z/PZ/PM. 
It is guaranteed that CZ[I]npoints&&ae_fp_greater_eq(rep->mergedist.ptr.p_double[rep->npoints-1-(*k)],r)) { *k = *k+1; } clusterizergetkclusters(rep, *k, cidx, cz, _state); } /************************************************************************* This function accepts AHC report Rep, desired maximum intercluster correlation and returns top clusters from hierarchical clusterization tree which are separated by correlation R or LOWER. It returns assignment of points to clusters (array of cluster indexes). There is one more function with similar name - ClusterizerSeparatedByDist, which returns clusters with intercluster distance equal to R or HIGHER (note: higher for distance, lower for correlation). INPUT PARAMETERS: Rep - report from ClusterizerRunAHC() performed on XY R - desired maximum intercluster correlation, -1<=R<=+1 OUTPUT PARAMETERS: K - number of clusters, 1<=K<=NPoints CIdx - array[NPoints], I-th element contains cluster index (from 0 to K-1) for I-th point of the dataset. CZ - array[K]. This array allows to convert cluster indexes returned by this function to indexes used by Rep.Z. J-th cluster returned by this function corresponds to CZ[J]-th cluster stored in Rep.Z/PZ/PM. It is guaranteed that CZ[I]npoints&&ae_fp_greater_eq(rep->mergedist.ptr.p_double[rep->npoints-1-(*k)],1-r)) { *k = *k+1; } clusterizergetkclusters(rep, *k, cidx, cz, _state); } /************************************************************************* K-means++ initialization INPUT PARAMETERS: Buf - special reusable structure which stores previously allocated memory, intended to avoid memory fragmentation when solving multiple subsequent problems. Must be initialized prior to usage. OUTPUT PARAMETERS: Buf - initialized structure -- ALGLIB -- Copyright 24.07.2015 by Bochkanov Sergey *************************************************************************/ void kmeansinitbuf(kmeansbuffers* buf, ae_state *_state) { ae_frame _frame_block; apbuffers updateseed; ae_frame_make(_state, &_frame_block); _apbuffers_init(&updateseed, _state); ae_shared_pool_set_seed(&buf->updatepool, &updateseed, sizeof(updateseed), _apbuffers_init, _apbuffers_init_copy, _apbuffers_destroy, _state); ae_frame_leave(_state); } /************************************************************************* K-means++ clusterization INPUT PARAMETERS: XY - dataset, array [0..NPoints-1,0..NVars-1]. NPoints - dataset size, NPoints>=K NVars - number of variables, NVars>=1 K - desired number of clusters, K>=1 InitAlgo - initialization algorithm: * 0 - automatic selection of best algorithm * 1 - random selection of centers * 2 - k-means++ * 3 - fast-greedy init *-1 - first K rows of dataset are used (special debug algorithm) MaxIts - iterations limit or zero for no limit Restarts - number of restarts, Restarts>=1 KMeansDbgNoIts- debug flag; if set, Lloyd's iteration is not performed, only initialization phase. Buf - special reusable structure which stores previously allocated memory, intended to avoid memory fragmentation when solving multiple subsequent problems: * MUST BE INITIALIZED WITH KMeansInitBuffers() CALL BEFORE FIRST PASS TO THIS FUNCTION! 
* subsequent passes must be made without re-initialization OUTPUT PARAMETERS: Info - return code: * -3, if task is degenerate (number of distinct points is less than K) * -1, if incorrect NPoints/NFeatures/K/Restarts was passed * 1, if subroutine finished successfully IterationsCount- actual number of iterations performed by clusterizer CCol - array[0..NVars-1,0..K-1].matrix whose columns store cluster's centers NeedCCol - True in case caller requires to store result in CCol CRow - array[0..K-1,0..NVars-1], same as CCol, but centers are stored in rows NeedCRow - True in case caller requires to store result in CCol XYC - array[NPoints], which contains cluster indexes Energy - merit function of clusterization -- ALGLIB -- Copyright 21.03.2009 by Bochkanov Sergey *************************************************************************/ void kmeansgenerateinternal(/* Real */ ae_matrix* xy, ae_int_t npoints, ae_int_t nvars, ae_int_t k, ae_int_t initalgo, ae_int_t maxits, ae_int_t restarts, ae_bool kmeansdbgnoits, ae_int_t* info, ae_int_t* iterationscount, /* Real */ ae_matrix* ccol, ae_bool needccol, /* Real */ ae_matrix* crow, ae_bool needcrow, /* Integer */ ae_vector* xyc, double* energy, kmeansbuffers* buf, ae_state *_state) { ae_frame _frame_block; ae_int_t i; ae_int_t j; ae_int_t i1; double e; double eprev; double v; double vv; ae_bool waschanges; ae_bool zerosizeclusters; ae_int_t pass; ae_int_t itcnt; hqrndstate rs; ae_frame_make(_state, &_frame_block); *info = 0; *iterationscount = 0; ae_matrix_clear(ccol); ae_matrix_clear(crow); ae_vector_clear(xyc); *energy = 0; _hqrndstate_init(&rs, _state); /* * Test parameters */ if( ((npointsct, k, nvars, _state); rmatrixsetlengthatleast(&buf->ctbest, k, nvars, _state); ivectorsetlengthatleast(&buf->xycprev, npoints, _state); ivectorsetlengthatleast(&buf->xycbest, npoints, _state); rvectorsetlengthatleast(&buf->d2, npoints, _state); ivectorsetlengthatleast(&buf->csizes, k, _state); *energy = ae_maxrealnumber; hqrndrandomize(&rs, _state); for(pass=1; pass<=restarts; pass++) { /* * Select initial centers. * * Note that for performance reasons centers are stored in ROWS of CT, not * in columns. We'll transpose CT in the end and store it in the C. * * Also note that SelectInitialCenters() may return degenerate set of centers * (some of them have no corresponding points in dataset, some are non-distinct). * Algorithm below is robust enough to deal with such set. 
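         *
         * The overall structure of one restart is: (1) select initial centers,
         * (2) run Lloyd's iterations - assignment via KMeansUpdateDistances(),
         * then recomputation of centers, (3) if some cluster becomes empty,
         * repair it with FixCenters() and continue iterating, (4) stop when
         * assignments do not change or the energy stops decreasing. The best
         * of Restarts runs (the one with minimum energy E) is returned.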
*/ clustering_selectinitialcenters(xy, npoints, nvars, initalgo, k, &buf->ct, &buf->initbuf, &buf->updatepool, _state); /* * Lloyd's iteration */ if( !kmeansdbgnoits ) { /* * Perform iteration as usual, in normal mode */ for(i=0; i<=npoints-1; i++) { xyc->ptr.p_int[i] = -1; } eprev = ae_maxrealnumber; e = ae_maxrealnumber; itcnt = 0; while(maxits==0||itcntxycprev.ptr.p_int[i] = xyc->ptr.p_int[i]; } kmeansupdatedistances(xy, 0, npoints, nvars, &buf->ct, 0, k, xyc, &buf->d2, &buf->updatepool, _state); waschanges = ae_false; for(i=0; i<=npoints-1; i++) { waschanges = waschanges||xyc->ptr.p_int[i]!=buf->xycprev.ptr.p_int[i]; } /* * Update centers */ for(j=0; j<=k-1; j++) { buf->csizes.ptr.p_int[j] = 0; } for(i=0; i<=k-1; i++) { for(j=0; j<=nvars-1; j++) { buf->ct.ptr.pp_double[i][j] = (double)(0); } } for(i=0; i<=npoints-1; i++) { buf->csizes.ptr.p_int[xyc->ptr.p_int[i]] = buf->csizes.ptr.p_int[xyc->ptr.p_int[i]]+1; ae_v_add(&buf->ct.ptr.pp_double[xyc->ptr.p_int[i]][0], 1, &xy->ptr.pp_double[i][0], 1, ae_v_len(0,nvars-1)); } zerosizeclusters = ae_false; for(j=0; j<=k-1; j++) { if( buf->csizes.ptr.p_int[j]!=0 ) { v = (double)1/(double)buf->csizes.ptr.p_int[j]; ae_v_muld(&buf->ct.ptr.pp_double[j][0], 1, ae_v_len(0,nvars-1), v); } zerosizeclusters = zerosizeclusters||buf->csizes.ptr.p_int[j]==0; } if( zerosizeclusters ) { /* * Some clusters have zero size - rare, but possible. * We'll choose new centers for such clusters using k-means++ rule * and restart algorithm */ if( !clustering_fixcenters(xy, npoints, nvars, &buf->ct, k, &buf->initbuf, &buf->updatepool, _state) ) { *info = -3; ae_frame_leave(_state); return; } continue; } /* * Stop if one of two conditions is met: * 1. nothing has changed during iteration * 2. energy function increased after recalculation on new centers */ e = (double)(0); for(i=0; i<=npoints-1; i++) { v = 0.0; i1 = xyc->ptr.p_int[i]; for(j=0; j<=nvars-1; j++) { vv = xy->ptr.pp_double[i][j]-buf->ct.ptr.pp_double[i1][j]; v = v+vv*vv; } e = e+v; } if( !waschanges||ae_fp_greater_eq(e,eprev) ) { break; } /* * Update EPrev */ eprev = e; } } else { /* * Debug mode: no Lloyd's iteration. * We just calculate potential E. */ kmeansupdatedistances(xy, 0, npoints, nvars, &buf->ct, 0, k, xyc, &buf->d2, &buf->updatepool, _state); e = (double)(0); for(i=0; i<=npoints-1; i++) { e = e+buf->d2.ptr.p_double[i]; } } /* * Compare E with best centers found so far */ if( ae_fp_less(e,*energy) ) { /* * store partition. */ *energy = e; copymatrix(&buf->ct, 0, k-1, 0, nvars-1, &buf->ctbest, 0, k-1, 0, nvars-1, _state); for(i=0; i<=npoints-1; i++) { buf->xycbest.ptr.p_int[i] = xyc->ptr.p_int[i]; } } } /* * Copy and transpose */ if( needccol ) { ae_matrix_set_length(ccol, nvars, k, _state); copyandtranspose(&buf->ctbest, 0, k-1, 0, nvars-1, ccol, 0, nvars-1, 0, k-1, _state); } if( needcrow ) { ae_matrix_set_length(crow, k, nvars, _state); rmatrixcopy(k, nvars, &buf->ctbest, 0, 0, crow, 0, 0, _state); } for(i=0; i<=npoints-1; i++) { xyc->ptr.p_int[i] = buf->xycbest.ptr.p_int[i]; } ae_frame_leave(_state); } /************************************************************************* This procedure recalculates distances from points to centers and assigns each point to closest center. INPUT PARAMETERS: XY - dataset, array [0..NPoints-1,0..NVars-1]. Idx0,Idx1 - define range of dataset [Idx0,Idx1) to process; right boundary is not included. 
NVars - number of variables, NVars>=1 CT - matrix of centers, centers are stored in rows CIdx0,CIdx1 - define range of centers [CIdx0,CIdx1) to process; right boundary is not included. XYC - preallocated output buffer, XYDist2 - preallocated output buffer Tmp - temporary buffer, automatically reallocated if needed BufferPool - shared pool seeded with instance of APBuffers structure (seed instance can be unitialized). It is recommended to use this pool only with KMeansUpdateDistances() function. OUTPUT PARAMETERS: XYC - new assignment of points to centers are stored in [Idx0,Idx1) XYDist2 - squared distances from points to their centers are stored in [Idx0,Idx1) -- ALGLIB -- Copyright 21.01.2015 by Bochkanov Sergey *************************************************************************/ void kmeansupdatedistances(/* Real */ ae_matrix* xy, ae_int_t idx0, ae_int_t idx1, ae_int_t nvars, /* Real */ ae_matrix* ct, ae_int_t cidx0, ae_int_t cidx1, /* Integer */ ae_vector* xyc, /* Real */ ae_vector* xydist2, ae_shared_pool* bufferpool, ae_state *_state) { ae_frame _frame_block; ae_int_t i; ae_int_t i0; ae_int_t i1; ae_int_t j; ae_int_t cclosest; double dclosest; double vv; apbuffers *buf; ae_smart_ptr _buf; double rcomplexity; ae_int_t task0; ae_int_t task1; ae_int_t pblkcnt; ae_int_t cblkcnt; ae_int_t vblkcnt; ae_int_t pblk; ae_int_t cblk; ae_int_t vblk; ae_int_t p0; ae_int_t p1; ae_int_t c0; ae_int_t c1; ae_int_t v0; ae_int_t v1; double v00; double v01; double v10; double v11; double vp0; double vp1; double vc0; double vc1; ae_int_t pcnt; ae_int_t pcntpadded; ae_int_t ccnt; ae_int_t ccntpadded; ae_int_t offs0; ae_int_t offs00; ae_int_t offs01; ae_int_t offs10; ae_int_t offs11; ae_int_t vcnt; ae_int_t stride; ae_frame_make(_state, &_frame_block); ae_smart_ptr_init(&_buf, (void**)&buf, _state); /* * Quick exit for special cases */ if( idx1<=idx0 ) { ae_frame_leave(_state); return; } if( cidx1<=cidx0 ) { ae_frame_leave(_state); return; } if( nvars<=0 ) { ae_frame_leave(_state); return; } /* * Try to recursively divide/process dataset * * NOTE: real arithmetics is used to avoid integer overflow on large problem sizes */ rcomplexity = (double)(idx1-idx0); rcomplexity = rcomplexity*(cidx1-cidx0); rcomplexity = rcomplexity*nvars; if( ((ae_fp_greater_eq(rcomplexity,clustering_parallelcomplexity)&&idx1-idx0>=2*clustering_kmeansblocksize)&&nvars>=clustering_kmeansparalleldim)&&cidx1-cidx0>=clustering_kmeansparallelk ) { splitlength(idx1-idx0, clustering_kmeansblocksize, &task0, &task1, _state); kmeansupdatedistances(xy, idx0, idx0+task0, nvars, ct, cidx0, cidx1, xyc, xydist2, bufferpool, _state); kmeansupdatedistances(xy, idx0+task0, idx1, nvars, ct, cidx0, cidx1, xyc, xydist2, bufferpool, _state); ae_frame_leave(_state); return; } /* * Dataset chunk is selected. 
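     *
     * (The recursive split above fires only for large enough chunks: the
     *  real-valued product (Idx1-Idx0)*(CIdx1-CIdx0)*NVars is compared
     *  against the ParallelComplexity threshold, with additional minimum-size
     *  requirements on the point/center/variable counts; the point range is
     *  then divided by SplitLength() and both halves are processed by
     *  recursive calls. For example, 10000 points, 50 centers and 20
     *  variables give a product of 1.0E+7.)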
* * Process it with blocked algorithm: * * iterate over points, process them in KMeansBlockSize-ed chunks * * for each chunk of dataset, iterate over centers, process them in KMeansBlockSize-ed chunks * * for each chunk of dataset/centerset, iterate over variables, process them in KMeansBlockSize-ed chunks */ ae_assert(clustering_kmeansblocksize%2==0, "KMeansUpdateDistances: internal error", _state); ae_shared_pool_retrieve(bufferpool, &_buf, _state); rvectorsetlengthatleast(&buf->ra0, clustering_kmeansblocksize*clustering_kmeansblocksize, _state); rvectorsetlengthatleast(&buf->ra1, clustering_kmeansblocksize*clustering_kmeansblocksize, _state); rvectorsetlengthatleast(&buf->ra2, clustering_kmeansblocksize*clustering_kmeansblocksize, _state); rvectorsetlengthatleast(&buf->ra3, clustering_kmeansblocksize, _state); ivectorsetlengthatleast(&buf->ia3, clustering_kmeansblocksize, _state); pblkcnt = chunkscount(idx1-idx0, clustering_kmeansblocksize, _state); cblkcnt = chunkscount(cidx1-cidx0, clustering_kmeansblocksize, _state); vblkcnt = chunkscount(nvars, clustering_kmeansblocksize, _state); for(pblk=0; pblk<=pblkcnt-1; pblk++) { /* * Process PBlk-th chunk of dataset. */ p0 = idx0+pblk*clustering_kmeansblocksize; p1 = ae_minint(p0+clustering_kmeansblocksize, idx1, _state); /* * Prepare RA3[]/IA3[] for storage of best distances and best cluster numbers. */ for(i=0; i<=clustering_kmeansblocksize-1; i++) { buf->ra3.ptr.p_double[i] = ae_maxrealnumber; buf->ia3.ptr.p_int[i] = -1; } /* * Iterare over chunks of centerset. */ for(cblk=0; cblk<=cblkcnt-1; cblk++) { /* * Process CBlk-th chunk of centerset */ c0 = cidx0+cblk*clustering_kmeansblocksize; c1 = ae_minint(c0+clustering_kmeansblocksize, cidx1, _state); /* * At this point we have to calculate a set of pairwise distances * between points [P0,P1) and centers [C0,C1) and select best center * for each point. It can also be done with blocked algorithm * (blocking for variables). * * Following arrays are used: * * RA0[] - matrix of distances, padded by zeros for even size, * rows are stored with stride KMeansBlockSize. * * RA1[] - matrix of points (variables corresponding to current * block are extracted), padded by zeros for even size, * rows are stored with stride KMeansBlockSize. * * RA2[] - matrix of centers (variables corresponding to current * block are extracted), padded by zeros for even size, * rows are stored with stride KMeansBlockSize. * */ pcnt = p1-p0; pcntpadded = pcnt+pcnt%2; ccnt = c1-c0; ccntpadded = ccnt+ccnt%2; stride = clustering_kmeansblocksize; ae_assert(pcntpadded<=clustering_kmeansblocksize, "KMeansUpdateDistances: integrity error", _state); ae_assert(ccntpadded<=clustering_kmeansblocksize, "KMeansUpdateDistances: integrity error", _state); for(i=0; i<=pcntpadded-1; i++) { for(j=0; j<=ccntpadded-1; j++) { buf->ra0.ptr.p_double[i*stride+j] = 0.0; } } for(vblk=0; vblk<=vblkcnt-1; vblk++) { /* * Fetch VBlk-th block of variables to arrays RA1 (points) and RA2 (centers). * Pad points and centers with zeros. 
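     * (Padding the point and center blocks to an even number of rows lets the
     *  inner loop below work on 2x2 tiles - two points against two centers -
     *  without special-casing the last odd row of a block.)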
*/ v0 = vblk*clustering_kmeansblocksize; v1 = ae_minint(v0+clustering_kmeansblocksize, nvars, _state); vcnt = v1-v0; for(i=0; i<=pcnt-1; i++) { for(j=0; j<=vcnt-1; j++) { buf->ra1.ptr.p_double[i*stride+j] = xy->ptr.pp_double[p0+i][v0+j]; } } for(i=pcnt; i<=pcntpadded-1; i++) { for(j=0; j<=vcnt-1; j++) { buf->ra1.ptr.p_double[i*stride+j] = 0.0; } } for(i=0; i<=ccnt-1; i++) { for(j=0; j<=vcnt-1; j++) { buf->ra2.ptr.p_double[i*stride+j] = ct->ptr.pp_double[c0+i][v0+j]; } } for(i=ccnt; i<=ccntpadded-1; i++) { for(j=0; j<=vcnt-1; j++) { buf->ra2.ptr.p_double[i*stride+j] = 0.0; } } /* * Update distance matrix with sums-of-squared-differences of RA1 and RA2 */ i0 = 0; while(i0ra0.ptr.p_double[offs0]; v01 = buf->ra0.ptr.p_double[offs0+1]; v10 = buf->ra0.ptr.p_double[offs0+stride]; v11 = buf->ra0.ptr.p_double[offs0+stride+1]; offs00 = i0*stride; offs01 = offs00+stride; offs10 = i1*stride; offs11 = offs10+stride; for(j=0; j<=vcnt-1; j++) { vp0 = buf->ra1.ptr.p_double[offs00+j]; vp1 = buf->ra1.ptr.p_double[offs01+j]; vc0 = buf->ra2.ptr.p_double[offs10+j]; vc1 = buf->ra2.ptr.p_double[offs11+j]; vv = vp0-vc0; v00 = v00+vv*vv; vv = vp0-vc1; v01 = v01+vv*vv; vv = vp1-vc0; v10 = v10+vv*vv; vv = vp1-vc1; v11 = v11+vv*vv; } offs0 = i0*stride+i1; buf->ra0.ptr.p_double[offs0] = v00; buf->ra0.ptr.p_double[offs0+1] = v01; buf->ra0.ptr.p_double[offs0+stride] = v10; buf->ra0.ptr.p_double[offs0+stride+1] = v11; i1 = i1+2; } i0 = i0+2; } } for(i=0; i<=pcnt-1; i++) { cclosest = buf->ia3.ptr.p_int[i]; dclosest = buf->ra3.ptr.p_double[i]; for(j=0; j<=ccnt-1; j++) { if( ae_fp_less(buf->ra0.ptr.p_double[i*stride+j],dclosest) ) { dclosest = buf->ra0.ptr.p_double[i*stride+j]; cclosest = c0+j; } } buf->ia3.ptr.p_int[i] = cclosest; buf->ra3.ptr.p_double[i] = dclosest; } } /* * Store best centers to XYC[] */ for(i=p0; i<=p1-1; i++) { xyc->ptr.p_int[i] = buf->ia3.ptr.p_int[i-p0]; xydist2->ptr.p_double[i] = buf->ra3.ptr.p_double[i-p0]; } } ae_shared_pool_recycle(bufferpool, &_buf, _state); ae_frame_leave(_state); } /************************************************************************* This function selects initial centers according to specified initialization algorithm. IMPORTANT: this function provides no guarantees regarding selection of DIFFERENT centers. Centers returned by this function may include duplicates (say, when random sampling is used). It is also possible that some centers are empty. Algorithm which uses this function must be able to deal with it. Say, you may want to use FixCenters() in order to fix empty centers. INPUT PARAMETERS: XY - dataset, array [0..NPoints-1,0..NVars-1]. NPoints - points count NVars - number of variables, NVars>=1 InitAlgo - initialization algorithm: * 0 - automatic selection of best algorithm * 1 - random selection * 2 - k-means++ * 3 - fast-greedy init *-1 - first K rows of dataset are used (debug algorithm) K - number of centers, K>=1 CT - possibly preallocated output buffer, resized if needed InitBuf - internal buffer, possibly unitialized instance of APBuffers. It is recommended to use this instance only with SelectInitialCenters() and FixCenters() functions, because these functions may allocate really large storage. UpdatePool - shared pool seeded with instance of APBuffers structure (seed instance can be unitialized). Used internally with KMeansUpdateDistances() function. It is recommended to use this pool ONLY with KMeansUpdateDistances() function. 
OUTPUT PARAMETERS: CT - set of K clusters, one per row RESULT: True on success, False on failure (impossible to create K independent clusters) -- ALGLIB -- Copyright 21.01.2015 by Bochkanov Sergey *************************************************************************/ static void clustering_selectinitialcenters(/* Real */ ae_matrix* xy, ae_int_t npoints, ae_int_t nvars, ae_int_t initalgo, ae_int_t k, /* Real */ ae_matrix* ct, apbuffers* initbuf, ae_shared_pool* updatepool, ae_state *_state) { ae_frame _frame_block; ae_int_t cidx; ae_int_t i; ae_int_t j; double v; double vv; double s; ae_int_t lastnz; ae_int_t ptidx; ae_int_t samplesize; ae_int_t samplescntnew; ae_int_t samplescntall; double samplescale; hqrndstate rs; ae_frame_make(_state, &_frame_block); _hqrndstate_init(&rs, _state); hqrndrandomize(&rs, _state); /* * Check parameters */ ae_assert(npoints>0, "SelectInitialCenters: internal error", _state); ae_assert(nvars>0, "SelectInitialCenters: internal error", _state); ae_assert(k>0, "SelectInitialCenters: internal error", _state); if( initalgo==0 ) { initalgo = 3; } rmatrixsetlengthatleast(ct, k, nvars, _state); /* * Random initialization */ if( initalgo==-1 ) { for(i=0; i<=k-1; i++) { ae_v_move(&ct->ptr.pp_double[i][0], 1, &xy->ptr.pp_double[i%npoints][0], 1, ae_v_len(0,nvars-1)); } ae_frame_leave(_state); return; } /* * Random initialization */ if( initalgo==1 ) { for(i=0; i<=k-1; i++) { j = hqrnduniformi(&rs, npoints, _state); ae_v_move(&ct->ptr.pp_double[i][0], 1, &xy->ptr.pp_double[j][0], 1, ae_v_len(0,nvars-1)); } ae_frame_leave(_state); return; } /* * k-means++ initialization */ if( initalgo==2 ) { /* * Prepare distances array. * Select initial center at random. */ rvectorsetlengthatleast(&initbuf->ra0, npoints, _state); for(i=0; i<=npoints-1; i++) { initbuf->ra0.ptr.p_double[i] = ae_maxrealnumber; } ptidx = hqrnduniformi(&rs, npoints, _state); ae_v_move(&ct->ptr.pp_double[0][0], 1, &xy->ptr.pp_double[ptidx][0], 1, ae_v_len(0,nvars-1)); /* * For each newly added center repeat: * * reevaluate distances from points to best centers * * sample points with probability dependent on distance * * add new center */ for(cidx=0; cidx<=k-2; cidx++) { /* * Reevaluate distances */ s = 0.0; for(i=0; i<=npoints-1; i++) { v = 0.0; for(j=0; j<=nvars-1; j++) { vv = xy->ptr.pp_double[i][j]-ct->ptr.pp_double[cidx][j]; v = v+vv*vv; } if( ae_fp_less(v,initbuf->ra0.ptr.p_double[i]) ) { initbuf->ra0.ptr.p_double[i] = v; } s = s+initbuf->ra0.ptr.p_double[i]; } /* * If all distances are zero, it means that we can not find enough * distinct points. In this case we just select non-distinct center * at random and continue iterations. This issue will be handled * later in the FixCenters() function. */ if( ae_fp_eq(s,0.0) ) { ptidx = hqrnduniformi(&rs, npoints, _state); ae_v_move(&ct->ptr.pp_double[cidx+1][0], 1, &xy->ptr.pp_double[ptidx][0], 1, ae_v_len(0,nvars-1)); continue; } /* * Select point as center using its distance. * We also handle situation when because of rounding errors * no point was selected - in this case, last non-zero one * will be used. 
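         *
         * In other words, point I is selected with probability proportional
         * to RA0[I]/S, where RA0[I] is its squared distance to the closest
         * center chosen so far and S is the sum of these values: a uniform
         * V in [0,1] is drawn and the loop below walks the cumulative sum VV
         * until VV/S>=V.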
*/ v = hqrnduniformr(&rs, _state); vv = 0.0; lastnz = -1; ptidx = -1; for(i=0; i<=npoints-1; i++) { if( ae_fp_eq(initbuf->ra0.ptr.p_double[i],0.0) ) { continue; } lastnz = i; vv = vv+initbuf->ra0.ptr.p_double[i]; if( ae_fp_less_eq(v,vv/s) ) { ptidx = i; break; } } ae_assert(lastnz>=0, "SelectInitialCenters: integrity error", _state); if( ptidx<0 ) { ptidx = lastnz; } ae_v_move(&ct->ptr.pp_double[cidx+1][0], 1, &xy->ptr.pp_double[ptidx][0], 1, ae_v_len(0,nvars-1)); } ae_frame_leave(_state); return; } /* * "Fast-greedy" algorithm based on "Scalable k-means++". * * We perform several rounds, within each round we sample about 0.5*K points * (not exactly 0.5*K) until we have 2*K points sampled. Before each round * we calculate distances from dataset points to closest points sampled so far. * We sample dataset points independently using distance xtimes 0.5*K divided by total * as probability (similar to k-means++, but each point is sampled independently; * after each round we have roughtly 0.5*K points added to sample). * * After sampling is done, we run "greedy" version of k-means++ on this subsample * which selects most distant point on every round. */ if( initalgo==3 ) { /* * Prepare arrays. * Select initial center at random, add it to "new" part of sample, * which is stored at the beginning of the array */ samplesize = 2*k; samplescale = 0.5*k; rmatrixsetlengthatleast(&initbuf->rm0, samplesize, nvars, _state); ptidx = hqrnduniformi(&rs, npoints, _state); ae_v_move(&initbuf->rm0.ptr.pp_double[0][0], 1, &xy->ptr.pp_double[ptidx][0], 1, ae_v_len(0,nvars-1)); samplescntnew = 1; samplescntall = 1; rvectorsetlengthatleast(&initbuf->ra0, npoints, _state); rvectorsetlengthatleast(&initbuf->ra1, npoints, _state); ivectorsetlengthatleast(&initbuf->ia1, npoints, _state); for(i=0; i<=npoints-1; i++) { initbuf->ra0.ptr.p_double[i] = ae_maxrealnumber; } /* * Repeat until samples count is 2*K */ while(samplescntallrm0, samplescntall-samplescntnew, samplescntall, &initbuf->ia1, &initbuf->ra1, updatepool, _state); samplescntnew = 0; /* * Merge new distances with old ones. * Calculate sum of distances, if sum is exactly zero - fill sample * by randomly selected points and terminate. */ s = 0.0; for(i=0; i<=npoints-1; i++) { initbuf->ra0.ptr.p_double[i] = ae_minreal(initbuf->ra0.ptr.p_double[i], initbuf->ra1.ptr.p_double[i], _state); s = s+initbuf->ra0.ptr.p_double[i]; } if( ae_fp_eq(s,0.0) ) { while(samplescntallrm0.ptr.pp_double[samplescntall][0], 1, &xy->ptr.pp_double[ptidx][0], 1, ae_v_len(0,nvars-1)); inc(&samplescntall, _state); inc(&samplescntnew, _state); } break; } /* * Sample points independently. 
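         * Each point I is kept with probability SampleScale*RA0[I]/S (with
         * SampleScale=0.5*K), so on average roughly 0.5*K new points are
         * added per round; points with zero distance to the current sample
         * are never added.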
*/ for(i=0; i<=npoints-1; i++) { if( samplescntall==samplesize ) { break; } if( ae_fp_eq(initbuf->ra0.ptr.p_double[i],0.0) ) { continue; } if( ae_fp_less_eq(hqrnduniformr(&rs, _state),samplescale*initbuf->ra0.ptr.p_double[i]/s) ) { ae_v_move(&initbuf->rm0.ptr.pp_double[samplescntall][0], 1, &xy->ptr.pp_double[i][0], 1, ae_v_len(0,nvars-1)); inc(&samplescntall, _state); inc(&samplescntnew, _state); } } } /* * Run greedy version of k-means on sampled points */ rvectorsetlengthatleast(&initbuf->ra0, samplescntall, _state); for(i=0; i<=samplescntall-1; i++) { initbuf->ra0.ptr.p_double[i] = ae_maxrealnumber; } ptidx = hqrnduniformi(&rs, samplescntall, _state); ae_v_move(&ct->ptr.pp_double[0][0], 1, &initbuf->rm0.ptr.pp_double[ptidx][0], 1, ae_v_len(0,nvars-1)); for(cidx=0; cidx<=k-2; cidx++) { /* * Reevaluate distances */ for(i=0; i<=samplescntall-1; i++) { v = 0.0; for(j=0; j<=nvars-1; j++) { vv = initbuf->rm0.ptr.pp_double[i][j]-ct->ptr.pp_double[cidx][j]; v = v+vv*vv; } if( ae_fp_less(v,initbuf->ra0.ptr.p_double[i]) ) { initbuf->ra0.ptr.p_double[i] = v; } } /* * Select point as center in greedy manner - most distant * point is selected. */ ptidx = 0; for(i=0; i<=samplescntall-1; i++) { if( ae_fp_greater(initbuf->ra0.ptr.p_double[i],initbuf->ra0.ptr.p_double[ptidx]) ) { ptidx = i; } } ae_v_move(&ct->ptr.pp_double[cidx+1][0], 1, &initbuf->rm0.ptr.pp_double[ptidx][0], 1, ae_v_len(0,nvars-1)); } ae_frame_leave(_state); return; } /* * Internal error */ ae_assert(ae_false, "SelectInitialCenters: internal error", _state); ae_frame_leave(_state); } /************************************************************************* This function "fixes" centers, i.e. replaces ones which have no neighbor points by new centers which have at least one neighbor. If it is impossible to fix centers (not enough distinct points in the dataset), this function returns False. INPUT PARAMETERS: XY - dataset, array [0..NPoints-1,0..NVars-1]. NPoints - points count, >=1 NVars - number of variables, NVars>=1 CT - centers K - number of centers, K>=1 InitBuf - internal buffer, possibly unitialized instance of APBuffers. It is recommended to use this instance only with SelectInitialCenters() and FixCenters() functions, because these functions may allocate really large storage. UpdatePool - shared pool seeded with instance of APBuffers structure (seed instance can be unitialized). Used internally with KMeansUpdateDistances() function. It is recommended to use this pool ONLY with KMeansUpdateDistances() function. 
OUTPUT PARAMETERS: CT - set of K centers, one per row RESULT: True on success, False on failure (impossible to create K independent clusters) -- ALGLIB -- Copyright 21.01.2015 by Bochkanov Sergey *************************************************************************/ static ae_bool clustering_fixcenters(/* Real */ ae_matrix* xy, ae_int_t npoints, ae_int_t nvars, /* Real */ ae_matrix* ct, ae_int_t k, apbuffers* initbuf, ae_shared_pool* updatepool, ae_state *_state) { ae_int_t fixiteration; ae_int_t centertofix; ae_int_t i; ae_int_t j; ae_int_t pdistant; double ddistant; double v; ae_bool result; ae_assert(npoints>=1, "FixCenters: internal error", _state); ae_assert(nvars>=1, "FixCenters: internal error", _state); ae_assert(k>=1, "FixCenters: internal error", _state); /* * Calculate distances from points to best centers (RA0) * and best center indexes (IA0) */ ivectorsetlengthatleast(&initbuf->ia0, npoints, _state); rvectorsetlengthatleast(&initbuf->ra0, npoints, _state); kmeansupdatedistances(xy, 0, npoints, nvars, ct, 0, k, &initbuf->ia0, &initbuf->ra0, updatepool, _state); /* * Repeat loop: * * find first center which has no corresponding point * * set it to the most distant (from the rest of the centerset) point * * recalculate distances, update IA0/RA0 * * repeat * * Loop is repeated for at most 2*K iterations. It is stopped once we have * no "empty" clusters. */ bvectorsetlengthatleast(&initbuf->ba0, k, _state); for(fixiteration=0; fixiteration<=2*k; fixiteration++) { /* * Select center to fix (one which is not mentioned in IA0), * terminate if there is no such center. * BA0[] stores True for centers which have at least one point. */ for(i=0; i<=k-1; i++) { initbuf->ba0.ptr.p_bool[i] = ae_false; } for(i=0; i<=npoints-1; i++) { initbuf->ba0.ptr.p_bool[initbuf->ia0.ptr.p_int[i]] = ae_true; } centertofix = -1; for(i=0; i<=k-1; i++) { if( !initbuf->ba0.ptr.p_bool[i] ) { centertofix = i; break; } } if( centertofix<0 ) { result = ae_true; return result; } /* * Replace center to fix by the most distant point. * Update IA0/RA0 */ pdistant = 0; ddistant = initbuf->ra0.ptr.p_double[pdistant]; for(i=0; i<=npoints-1; i++) { if( ae_fp_greater(initbuf->ra0.ptr.p_double[i],ddistant) ) { ddistant = initbuf->ra0.ptr.p_double[i]; pdistant = i; } } if( ae_fp_eq(ddistant,0.0) ) { break; } ae_v_move(&ct->ptr.pp_double[centertofix][0], 1, &xy->ptr.pp_double[pdistant][0], 1, ae_v_len(0,nvars-1)); for(i=0; i<=npoints-1; i++) { v = 0.0; for(j=0; j<=nvars-1; j++) { v = v+ae_sqr(xy->ptr.pp_double[i][j]-ct->ptr.pp_double[centertofix][j], _state); } if( ae_fp_less(v,initbuf->ra0.ptr.p_double[i]) ) { initbuf->ra0.ptr.p_double[i] = v; initbuf->ia0.ptr.p_int[i] = centertofix; } } } result = ae_false; return result; } /************************************************************************* This function performs agglomerative hierarchical clustering using precomputed distance matrix. Internal function, should not be called directly. INPUT PARAMETERS: S - clusterizer state, initialized by ClusterizerCreate() D - distance matrix, array[S.NFeatures,S.NFeatures] Contents of the matrix is destroyed during algorithm operation. OUTPUT PARAMETERS: Rep - clustering results; see description of AHCReport structure for more information. 
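NOTE: the implementation below keeps, for every active cluster, the index of
      its nearest neighbor, merges the globally closest pair at every step,
      and updates the distance matrix with Lance-Williams-style rules:
      * complete linkage:           d(i,c) = max(d(i,c0),d(i,c1))
      * single linkage:             d(i,c) = min(d(i,c0),d(i,c1))
      * unweighted average linkage: d(i,c) = (n0*d(i,c0)+n1*d(i,c1))/(n0+n1)
      * weighted average linkage:   d(i,c) = (d(i,c0)+d(i,c1))/2
      * Ward's method (applied to 0.5*squared distances):
          d(i,c) = ((n0+ni)*d(i,c0)+(n1+ni)*d(i,c1)-ni*d01)/(n0+n1+ni)
      where c is the cluster obtained by merging c0 and c1, n0/n1/ni are
      cluster sizes and d01 is the distance between c0 and c1 at merge time.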
-- ALGLIB -- Copyright 10.07.2012 by Bochkanov Sergey *************************************************************************/ static void clustering_clusterizerrunahcinternal(clusterizerstate* s, /* Real */ ae_matrix* d, ahcreport* rep, ae_state *_state) { ae_frame _frame_block; ae_int_t i; ae_int_t j; ae_int_t k; double v; ae_int_t mergeidx; ae_int_t c0; ae_int_t c1; ae_int_t s0; ae_int_t s1; ae_int_t ar; ae_int_t br; ae_int_t npoints; ae_vector cidx; ae_vector csizes; ae_vector nnidx; ae_matrix cinfo; ae_int_t n0; ae_int_t n1; ae_int_t ni; double d01; ae_frame_make(_state, &_frame_block); ae_vector_init(&cidx, 0, DT_INT, _state); ae_vector_init(&csizes, 0, DT_INT, _state); ae_vector_init(&nnidx, 0, DT_INT, _state); ae_matrix_init(&cinfo, 0, 0, DT_INT, _state); npoints = s->npoints; /* * Fill Rep.NPoints, quick exit when NPoints<=1 */ rep->npoints = npoints; if( npoints==0 ) { ae_vector_set_length(&rep->p, 0, _state); ae_matrix_set_length(&rep->z, 0, 0, _state); ae_matrix_set_length(&rep->pz, 0, 0, _state); ae_matrix_set_length(&rep->pm, 0, 0, _state); ae_vector_set_length(&rep->mergedist, 0, _state); rep->terminationtype = 1; ae_frame_leave(_state); return; } if( npoints==1 ) { ae_vector_set_length(&rep->p, 1, _state); ae_matrix_set_length(&rep->z, 0, 0, _state); ae_matrix_set_length(&rep->pz, 0, 0, _state); ae_matrix_set_length(&rep->pm, 0, 0, _state); ae_vector_set_length(&rep->mergedist, 0, _state); rep->p.ptr.p_int[0] = 0; rep->terminationtype = 1; ae_frame_leave(_state); return; } ae_matrix_set_length(&rep->z, npoints-1, 2, _state); ae_vector_set_length(&rep->mergedist, npoints-1, _state); rep->terminationtype = 1; /* * Build list of nearest neighbors */ ae_vector_set_length(&nnidx, npoints, _state); for(i=0; i<=npoints-1; i++) { /* * Calculate index of the nearest neighbor */ k = -1; v = ae_maxrealnumber; for(j=0; j<=npoints-1; j++) { if( j!=i&&ae_fp_less(d->ptr.pp_double[i][j],v) ) { k = j; v = d->ptr.pp_double[i][j]; } } ae_assert(ae_fp_less(v,ae_maxrealnumber), "ClusterizerRunAHC: internal error", _state); nnidx.ptr.p_int[i] = k; } /* * For AHCAlgo=4 (Ward's method) replace distances by their squares times 0.5 */ if( s->ahcalgo==4 ) { for(i=0; i<=npoints-1; i++) { for(j=0; j<=npoints-1; j++) { d->ptr.pp_double[i][j] = 0.5*d->ptr.pp_double[i][j]*d->ptr.pp_double[i][j]; } } } /* * Distance matrix is built, perform merges. * * NOTE 1: CIdx is array[NPoints] which maps rows/columns of the * distance matrix D to indexes of clusters. Values of CIdx * from [0,NPoints) denote single-point clusters, and values * from [NPoints,2*NPoints-1) denote ones obtained by merging * smaller clusters. Negative calues correspond to absent clusters. * * Initially it contains [0...NPoints-1], after each merge * one element of CIdx (one with index C0) is replaced by * NPoints+MergeIdx, and another one with index C1 is * rewritten by -1. * * NOTE 2: CSizes is array[NPoints] which stores sizes of clusters. 
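     *
     * NOTE 3: at every step the pair to be merged is the pair (I, NNIdx[I])
     *         with the smallest distance among all active clusters; after
     *         the merge only the affected rows/columns of D and the affected
     *         entries of NNIdx are recomputed.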
* */ ae_vector_set_length(&cidx, npoints, _state); ae_vector_set_length(&csizes, npoints, _state); for(i=0; i<=npoints-1; i++) { cidx.ptr.p_int[i] = i; csizes.ptr.p_int[i] = 1; } for(mergeidx=0; mergeidx<=npoints-2; mergeidx++) { /* * Select pair of clusters (C0,C1) with CIdx[C0]=0 ) { if( ae_fp_less(d->ptr.pp_double[i][nnidx.ptr.p_int[i]],d01) ) { c0 = i; c1 = nnidx.ptr.p_int[i]; d01 = d->ptr.pp_double[i][nnidx.ptr.p_int[i]]; } } } ae_assert(ae_fp_less(d01,ae_maxrealnumber), "ClusterizerRunAHC: internal error", _state); if( cidx.ptr.p_int[c0]>cidx.ptr.p_int[c1] ) { i = c1; c1 = c0; c0 = i; } /* * Fill one row of Rep.Z and one element of Rep.MergeDist */ rep->z.ptr.pp_int[mergeidx][0] = cidx.ptr.p_int[c0]; rep->z.ptr.pp_int[mergeidx][1] = cidx.ptr.p_int[c1]; rep->mergedist.ptr.p_double[mergeidx] = d01; /* * Update distance matrix: * * row/column C0 are updated by distances to the new cluster * * row/column C1 are considered empty (we can fill them by zeros, * but do not want to spend time - we just ignore them) * * NOTE: it is important to update distance matrix BEFORE CIdx/CSizes * are updated. */ ae_assert((((s->ahcalgo==0||s->ahcalgo==1)||s->ahcalgo==2)||s->ahcalgo==3)||s->ahcalgo==4, "ClusterizerRunAHC: internal error", _state); for(i=0; i<=npoints-1; i++) { if( i!=c0&&i!=c1 ) { n0 = csizes.ptr.p_int[c0]; n1 = csizes.ptr.p_int[c1]; ni = csizes.ptr.p_int[i]; if( s->ahcalgo==0 ) { d->ptr.pp_double[i][c0] = ae_maxreal(d->ptr.pp_double[i][c0], d->ptr.pp_double[i][c1], _state); } if( s->ahcalgo==1 ) { d->ptr.pp_double[i][c0] = ae_minreal(d->ptr.pp_double[i][c0], d->ptr.pp_double[i][c1], _state); } if( s->ahcalgo==2 ) { d->ptr.pp_double[i][c0] = (csizes.ptr.p_int[c0]*d->ptr.pp_double[i][c0]+csizes.ptr.p_int[c1]*d->ptr.pp_double[i][c1])/(csizes.ptr.p_int[c0]+csizes.ptr.p_int[c1]); } if( s->ahcalgo==3 ) { d->ptr.pp_double[i][c0] = (d->ptr.pp_double[i][c0]+d->ptr.pp_double[i][c1])/2; } if( s->ahcalgo==4 ) { d->ptr.pp_double[i][c0] = ((n0+ni)*d->ptr.pp_double[i][c0]+(n1+ni)*d->ptr.pp_double[i][c1]-ni*d01)/(n0+n1+ni); } d->ptr.pp_double[c0][i] = d->ptr.pp_double[i][c0]; } } /* * Update CIdx and CSizes */ cidx.ptr.p_int[c0] = npoints+mergeidx; cidx.ptr.p_int[c1] = -1; csizes.ptr.p_int[c0] = csizes.ptr.p_int[c0]+csizes.ptr.p_int[c1]; csizes.ptr.p_int[c1] = 0; /* * Update nearest neighbors array: * * update nearest neighbors of everything except for C0/C1 * * update neighbors of C0/C1 */ for(i=0; i<=npoints-1; i++) { if( (cidx.ptr.p_int[i]>=0&&i!=c0)&&(nnidx.ptr.p_int[i]==c0||nnidx.ptr.p_int[i]==c1) ) { /* * I-th cluster which is distinct from C0/C1 has former C0/C1 cluster as its nearest * neighbor. We handle this issue depending on specific AHC algorithm being used. */ if( s->ahcalgo==1 ) { /* * Single linkage. Merging of two clusters together * does NOT change distances between new cluster and * other clusters. * * The only thing we have to do is to update nearest neighbor index */ nnidx.ptr.p_int[i] = c0; } else { /* * Something other than single linkage. We have to re-examine * all the row to find nearest neighbor. 
*/ k = -1; v = ae_maxrealnumber; for(j=0; j<=npoints-1; j++) { if( (cidx.ptr.p_int[j]>=0&&j!=i)&&ae_fp_less(d->ptr.pp_double[i][j],v) ) { k = j; v = d->ptr.pp_double[i][j]; } } ae_assert(ae_fp_less(v,ae_maxrealnumber)||mergeidx==npoints-2, "ClusterizerRunAHC: internal error", _state); nnidx.ptr.p_int[i] = k; } } } k = -1; v = ae_maxrealnumber; for(j=0; j<=npoints-1; j++) { if( (cidx.ptr.p_int[j]>=0&&j!=c0)&&ae_fp_less(d->ptr.pp_double[c0][j],v) ) { k = j; v = d->ptr.pp_double[c0][j]; } } ae_assert(ae_fp_less(v,ae_maxrealnumber)||mergeidx==npoints-2, "ClusterizerRunAHC: internal error", _state); nnidx.ptr.p_int[c0] = k; } /* * Calculate Rep.P and Rep.PM. * * In order to do that, we fill CInfo matrix - (2*NPoints-1)*3 matrix, * with I-th row containing: * * CInfo[I,0] - size of I-th cluster * * CInfo[I,1] - beginning of I-th cluster * * CInfo[I,2] - end of I-th cluster * * CInfo[I,3] - height of I-th cluster * * We perform it as follows: * * first NPoints clusters have unit size (CInfo[I,0]=1) and zero * height (CInfo[I,3]=0) * * we replay NPoints-1 merges from first to last and fill sizes of * corresponding clusters (new size is a sum of sizes of clusters * being merged) and height (new height is max(heights)+1). * * now we ready to determine locations of clusters. Last cluster * spans entire dataset, we know it. We replay merges from last to * first, during each merge we already know location of the merge * result, and we can position first cluster to the left part of * the result, and second cluster to the right part. */ ae_vector_set_length(&rep->p, npoints, _state); ae_matrix_set_length(&rep->pm, npoints-1, 6, _state); ae_matrix_set_length(&cinfo, 2*npoints-1, 4, _state); for(i=0; i<=npoints-1; i++) { cinfo.ptr.pp_int[i][0] = 1; cinfo.ptr.pp_int[i][3] = 0; } for(i=0; i<=npoints-2; i++) { cinfo.ptr.pp_int[npoints+i][0] = cinfo.ptr.pp_int[rep->z.ptr.pp_int[i][0]][0]+cinfo.ptr.pp_int[rep->z.ptr.pp_int[i][1]][0]; cinfo.ptr.pp_int[npoints+i][3] = ae_maxint(cinfo.ptr.pp_int[rep->z.ptr.pp_int[i][0]][3], cinfo.ptr.pp_int[rep->z.ptr.pp_int[i][1]][3], _state)+1; } cinfo.ptr.pp_int[2*npoints-2][1] = 0; cinfo.ptr.pp_int[2*npoints-2][2] = npoints-1; for(i=npoints-2; i>=0; i--) { /* * We merge C0 which spans [A0,B0] and C1 (spans [A1,B1]), * with unknown A0, B0, A1, B1. However, we know that result * is CR, which spans [AR,BR] with known AR/BR, and we know * sizes of C0, C1, CR (denotes as S0, S1, SR). 
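         * Therefore C0 is placed at [AR, AR+S0-1] and C1 at [BR-S1+1, BR],
         * which is exactly what the assignments below do.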
*/ c0 = rep->z.ptr.pp_int[i][0]; c1 = rep->z.ptr.pp_int[i][1]; s0 = cinfo.ptr.pp_int[c0][0]; s1 = cinfo.ptr.pp_int[c1][0]; ar = cinfo.ptr.pp_int[npoints+i][1]; br = cinfo.ptr.pp_int[npoints+i][2]; cinfo.ptr.pp_int[c0][1] = ar; cinfo.ptr.pp_int[c0][2] = ar+s0-1; cinfo.ptr.pp_int[c1][1] = br-(s1-1); cinfo.ptr.pp_int[c1][2] = br; rep->pm.ptr.pp_int[i][0] = cinfo.ptr.pp_int[c0][1]; rep->pm.ptr.pp_int[i][1] = cinfo.ptr.pp_int[c0][2]; rep->pm.ptr.pp_int[i][2] = cinfo.ptr.pp_int[c1][1]; rep->pm.ptr.pp_int[i][3] = cinfo.ptr.pp_int[c1][2]; rep->pm.ptr.pp_int[i][4] = cinfo.ptr.pp_int[c0][3]; rep->pm.ptr.pp_int[i][5] = cinfo.ptr.pp_int[c1][3]; } for(i=0; i<=npoints-1; i++) { ae_assert(cinfo.ptr.pp_int[i][1]==cinfo.ptr.pp_int[i][2], "Assertion failed", _state); rep->p.ptr.p_int[i] = cinfo.ptr.pp_int[i][1]; } /* * Calculate Rep.PZ */ ae_matrix_set_length(&rep->pz, npoints-1, 2, _state); for(i=0; i<=npoints-2; i++) { rep->pz.ptr.pp_int[i][0] = rep->z.ptr.pp_int[i][0]; rep->pz.ptr.pp_int[i][1] = rep->z.ptr.pp_int[i][1]; if( rep->pz.ptr.pp_int[i][0]pz.ptr.pp_int[i][0] = rep->p.ptr.p_int[rep->pz.ptr.pp_int[i][0]]; } if( rep->pz.ptr.pp_int[i][1]pz.ptr.pp_int[i][1] = rep->p.ptr.p_int[rep->pz.ptr.pp_int[i][1]]; } } ae_frame_leave(_state); } /************************************************************************* This function recursively evaluates distance matrix for SOME (not all!) distance types. INPUT PARAMETERS: XY - array[?,NFeatures], dataset NFeatures- number of features, >=1 DistType- distance function: * 0 Chebyshev distance (L-inf norm) * 1 city block distance (L1 norm) D - preallocated output matrix I0,I1 - half interval of rows to calculate: [I0,I1) is processed J0,J1 - half interval of cols to calculate: [J0,J1) is processed OUTPUT PARAMETERS: D - array[NPoints,NPoints], distance matrix upper triangle and main diagonal are initialized with data. NOTE: intersection of [I0,I1) and [J0,J1) may completely lie in upper triangle, only partially intersect with it, or have zero intersection. In any case, only intersection of submatrix given by [I0,I1)*[J0,J1) with upper triangle of the matrix is evaluated. Say, for 4x4 distance matrix A: * [0,2)*[0,2) will result in evaluation of A00, A01, A11 * [2,4)*[2,4) will result in evaluation of A22, A23, A32, A33 * [2,4)*[0,2) will result in evaluation of empty set of elements -- ALGLIB -- Copyright 07.04.2013 by Bochkanov Sergey *************************************************************************/ static void clustering_evaluatedistancematrixrec(/* Real */ ae_matrix* xy, ae_int_t nfeatures, ae_int_t disttype, /* Real */ ae_matrix* d, ae_int_t i0, ae_int_t i1, ae_int_t j0, ae_int_t j1, ae_state *_state) { double rcomplexity; ae_int_t len0; ae_int_t len1; ae_int_t i; ae_int_t j; ae_int_t k; double v; double vv; ae_assert(disttype==0||disttype==1, "EvaluateDistanceMatrixRec: incorrect DistType", _state); /* * Normalize J0/J1: * * J0:=max(J0,I0) - we ignore lower triangle * * J1:=max(J1,J0) - normalize J1 */ j0 = ae_maxint(j0, i0, _state); j1 = ae_maxint(j1, j0, _state); if( j1<=j0||i1<=i0 ) { return; } /* * Try to process in parallel. Two condtions must hold in order to * activate parallel processing: * 1. I1-I0>2 or J1-J0>2 * 2. (I1-I0)*(J1-J0)*NFeatures>=ParallelComplexity * * NOTE: all quantities are converted to reals in order to avoid * integer overflow during multiplication * * NOTE: strict inequality in (1) is necessary to reduce task to 2x2 * basecases. In future versions we will be able to handle such * basecases more efficiently than 1x1 cases. 
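For clarity, the two distance functions handled by this routine, written out for a single pair of rows a[] and b[] of length NFeatures (illustrative sketch, not part of the library):

    // disttype=0: Chebyshev distance (L-inf norm), the maximum of |a[k]-b[k]|
    // disttype=1: city block distance (L1 norm),   the sum     of |a[k]-b[k]|
    double dist_pair(const double *a, const double *b, int nfeatures, int disttype)
    {
        double v = 0.0;
        for(int k=0; k<nfeatures; k++)
        {
            double vv = a[k]-b[k];
            if( vv<0 )
                vv = -vv;
            if( disttype==0 )
                v = vv>v ? vv : v;
            else
                v = v+vv;
        }
        return v;
    }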
*/ rcomplexity = (double)(i1-i0); rcomplexity = rcomplexity*(j1-j0); rcomplexity = rcomplexity*nfeatures; if( ae_fp_greater_eq(rcomplexity,clustering_parallelcomplexity)&&(i1-i0>2||j1-j0>2) ) { /* * Recursive division along largest of dimensions */ if( i1-i0>j1-j0 ) { splitlengtheven(i1-i0, &len0, &len1, _state); clustering_evaluatedistancematrixrec(xy, nfeatures, disttype, d, i0, i0+len0, j0, j1, _state); clustering_evaluatedistancematrixrec(xy, nfeatures, disttype, d, i0+len0, i1, j0, j1, _state); } else { splitlengtheven(j1-j0, &len0, &len1, _state); clustering_evaluatedistancematrixrec(xy, nfeatures, disttype, d, i0, i1, j0, j0+len0, _state); clustering_evaluatedistancematrixrec(xy, nfeatures, disttype, d, i0, i1, j0+len0, j1, _state); } return; } /* * Sequential processing */ for(i=i0; i<=i1-1; i++) { for(j=j0; j<=j1-1; j++) { if( j>=i ) { v = 0.0; if( disttype==0 ) { for(k=0; k<=nfeatures-1; k++) { vv = xy->ptr.pp_double[i][k]-xy->ptr.pp_double[j][k]; if( ae_fp_less(vv,(double)(0)) ) { vv = -vv; } if( ae_fp_greater(vv,v) ) { v = vv; } } } if( disttype==1 ) { for(k=0; k<=nfeatures-1; k++) { vv = xy->ptr.pp_double[i][k]-xy->ptr.pp_double[j][k]; if( ae_fp_less(vv,(double)(0)) ) { vv = -vv; } v = v+vv; } } d->ptr.pp_double[i][j] = v; } } } } void _kmeansbuffers_init(void* _p, ae_state *_state) { kmeansbuffers *p = (kmeansbuffers*)_p; ae_touch_ptr((void*)p); ae_matrix_init(&p->ct, 0, 0, DT_REAL, _state); ae_matrix_init(&p->ctbest, 0, 0, DT_REAL, _state); ae_vector_init(&p->xycbest, 0, DT_INT, _state); ae_vector_init(&p->xycprev, 0, DT_INT, _state); ae_vector_init(&p->d2, 0, DT_REAL, _state); ae_vector_init(&p->csizes, 0, DT_INT, _state); _apbuffers_init(&p->initbuf, _state); ae_shared_pool_init(&p->updatepool, _state); } void _kmeansbuffers_init_copy(void* _dst, void* _src, ae_state *_state) { kmeansbuffers *dst = (kmeansbuffers*)_dst; kmeansbuffers *src = (kmeansbuffers*)_src; ae_matrix_init_copy(&dst->ct, &src->ct, _state); ae_matrix_init_copy(&dst->ctbest, &src->ctbest, _state); ae_vector_init_copy(&dst->xycbest, &src->xycbest, _state); ae_vector_init_copy(&dst->xycprev, &src->xycprev, _state); ae_vector_init_copy(&dst->d2, &src->d2, _state); ae_vector_init_copy(&dst->csizes, &src->csizes, _state); _apbuffers_init_copy(&dst->initbuf, &src->initbuf, _state); ae_shared_pool_init_copy(&dst->updatepool, &src->updatepool, _state); } void _kmeansbuffers_clear(void* _p) { kmeansbuffers *p = (kmeansbuffers*)_p; ae_touch_ptr((void*)p); ae_matrix_clear(&p->ct); ae_matrix_clear(&p->ctbest); ae_vector_clear(&p->xycbest); ae_vector_clear(&p->xycprev); ae_vector_clear(&p->d2); ae_vector_clear(&p->csizes); _apbuffers_clear(&p->initbuf); ae_shared_pool_clear(&p->updatepool); } void _kmeansbuffers_destroy(void* _p) { kmeansbuffers *p = (kmeansbuffers*)_p; ae_touch_ptr((void*)p); ae_matrix_destroy(&p->ct); ae_matrix_destroy(&p->ctbest); ae_vector_destroy(&p->xycbest); ae_vector_destroy(&p->xycprev); ae_vector_destroy(&p->d2); ae_vector_destroy(&p->csizes); _apbuffers_destroy(&p->initbuf); ae_shared_pool_destroy(&p->updatepool); } void _clusterizerstate_init(void* _p, ae_state *_state) { clusterizerstate *p = (clusterizerstate*)_p; ae_touch_ptr((void*)p); ae_matrix_init(&p->xy, 0, 0, DT_REAL, _state); ae_matrix_init(&p->d, 0, 0, DT_REAL, _state); ae_matrix_init(&p->tmpd, 0, 0, DT_REAL, _state); _apbuffers_init(&p->distbuf, _state); _kmeansbuffers_init(&p->kmeanstmp, _state); } void _clusterizerstate_init_copy(void* _dst, void* _src, ae_state *_state) { clusterizerstate *dst = (clusterizerstate*)_dst; 
clusterizerstate *src = (clusterizerstate*)_src; dst->npoints = src->npoints; dst->nfeatures = src->nfeatures; dst->disttype = src->disttype; ae_matrix_init_copy(&dst->xy, &src->xy, _state); ae_matrix_init_copy(&dst->d, &src->d, _state); dst->ahcalgo = src->ahcalgo; dst->kmeansrestarts = src->kmeansrestarts; dst->kmeansmaxits = src->kmeansmaxits; dst->kmeansinitalgo = src->kmeansinitalgo; dst->kmeansdbgnoits = src->kmeansdbgnoits; ae_matrix_init_copy(&dst->tmpd, &src->tmpd, _state); _apbuffers_init_copy(&dst->distbuf, &src->distbuf, _state); _kmeansbuffers_init_copy(&dst->kmeanstmp, &src->kmeanstmp, _state); } void _clusterizerstate_clear(void* _p) { clusterizerstate *p = (clusterizerstate*)_p; ae_touch_ptr((void*)p); ae_matrix_clear(&p->xy); ae_matrix_clear(&p->d); ae_matrix_clear(&p->tmpd); _apbuffers_clear(&p->distbuf); _kmeansbuffers_clear(&p->kmeanstmp); } void _clusterizerstate_destroy(void* _p) { clusterizerstate *p = (clusterizerstate*)_p; ae_touch_ptr((void*)p); ae_matrix_destroy(&p->xy); ae_matrix_destroy(&p->d); ae_matrix_destroy(&p->tmpd); _apbuffers_destroy(&p->distbuf); _kmeansbuffers_destroy(&p->kmeanstmp); } void _ahcreport_init(void* _p, ae_state *_state) { ahcreport *p = (ahcreport*)_p; ae_touch_ptr((void*)p); ae_vector_init(&p->p, 0, DT_INT, _state); ae_matrix_init(&p->z, 0, 0, DT_INT, _state); ae_matrix_init(&p->pz, 0, 0, DT_INT, _state); ae_matrix_init(&p->pm, 0, 0, DT_INT, _state); ae_vector_init(&p->mergedist, 0, DT_REAL, _state); } void _ahcreport_init_copy(void* _dst, void* _src, ae_state *_state) { ahcreport *dst = (ahcreport*)_dst; ahcreport *src = (ahcreport*)_src; dst->terminationtype = src->terminationtype; dst->npoints = src->npoints; ae_vector_init_copy(&dst->p, &src->p, _state); ae_matrix_init_copy(&dst->z, &src->z, _state); ae_matrix_init_copy(&dst->pz, &src->pz, _state); ae_matrix_init_copy(&dst->pm, &src->pm, _state); ae_vector_init_copy(&dst->mergedist, &src->mergedist, _state); } void _ahcreport_clear(void* _p) { ahcreport *p = (ahcreport*)_p; ae_touch_ptr((void*)p); ae_vector_clear(&p->p); ae_matrix_clear(&p->z); ae_matrix_clear(&p->pz); ae_matrix_clear(&p->pm); ae_vector_clear(&p->mergedist); } void _ahcreport_destroy(void* _p) { ahcreport *p = (ahcreport*)_p; ae_touch_ptr((void*)p); ae_vector_destroy(&p->p); ae_matrix_destroy(&p->z); ae_matrix_destroy(&p->pz); ae_matrix_destroy(&p->pm); ae_vector_destroy(&p->mergedist); } void _kmeansreport_init(void* _p, ae_state *_state) { kmeansreport *p = (kmeansreport*)_p; ae_touch_ptr((void*)p); ae_matrix_init(&p->c, 0, 0, DT_REAL, _state); ae_vector_init(&p->cidx, 0, DT_INT, _state); } void _kmeansreport_init_copy(void* _dst, void* _src, ae_state *_state) { kmeansreport *dst = (kmeansreport*)_dst; kmeansreport *src = (kmeansreport*)_src; dst->npoints = src->npoints; dst->nfeatures = src->nfeatures; dst->terminationtype = src->terminationtype; dst->iterationscount = src->iterationscount; dst->energy = src->energy; dst->k = src->k; ae_matrix_init_copy(&dst->c, &src->c, _state); ae_vector_init_copy(&dst->cidx, &src->cidx, _state); } void _kmeansreport_clear(void* _p) { kmeansreport *p = (kmeansreport*)_p; ae_touch_ptr((void*)p); ae_matrix_clear(&p->c); ae_vector_clear(&p->cidx); } void _kmeansreport_destroy(void* _p) { kmeansreport *p = (kmeansreport*)_p; ae_touch_ptr((void*)p); ae_matrix_destroy(&p->c); ae_vector_destroy(&p->cidx); } /************************************************************************* This subroutine builds random decision forest. 
INPUT PARAMETERS: XY - training set NPoints - training set size, NPoints>=1 NVars - number of independent variables, NVars>=1 NClasses - task type: * NClasses=1 - regression task with one dependent variable * NClasses>1 - classification task with NClasses classes. NTrees - number of trees in a forest, NTrees>=1. recommended values: 50-100. R - percent of a training set used to build individual trees. 01). * 1, if task has been solved DF - model built Rep - training report, contains error on a training set and out-of-bag estimates of generalization error. -- ALGLIB -- Copyright 19.02.2009 by Bochkanov Sergey *************************************************************************/ void dfbuildrandomdecisionforest(/* Real */ ae_matrix* xy, ae_int_t npoints, ae_int_t nvars, ae_int_t nclasses, ae_int_t ntrees, double r, ae_int_t* info, decisionforest* df, dfreport* rep, ae_state *_state) { ae_int_t samplesize; *info = 0; _decisionforest_clear(df); _dfreport_clear(rep); if( ae_fp_less_eq(r,(double)(0))||ae_fp_greater(r,(double)(1)) ) { *info = -1; return; } samplesize = ae_maxint(ae_round(r*npoints, _state), 1, _state); dfbuildinternal(xy, npoints, nvars, nclasses, ntrees, samplesize, ae_maxint(nvars/2, 1, _state), dforest_dfusestrongsplits+dforest_dfuseevs, info, df, rep, _state); } /************************************************************************* This subroutine builds random decision forest. This function gives ability to tune number of variables used when choosing best split. INPUT PARAMETERS: XY - training set NPoints - training set size, NPoints>=1 NVars - number of independent variables, NVars>=1 NClasses - task type: * NClasses=1 - regression task with one dependent variable * NClasses>1 - classification task with NClasses classes. NTrees - number of trees in a forest, NTrees>=1. recommended values: 50-100. NRndVars - number of variables used when choosing best split R - percent of a training set used to build individual trees. 01). * 1, if task has been solved DF - model built Rep - training report, contains error on a training set and out-of-bag estimates of generalization error. 
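A minimal training sketch, assuming the standard ALGLIB 3.x C++ interface (the wrapper functions carry the same names as the C core functions documented here); the toy dataset and parameter values are made up for illustration:

    #include "dataanalysis.h"
    using namespace alglib;

    real_2d_array xy = "[[0,0,0],[0,1,0],[1,0,1],[1,1,1]]";   // last column = class label
    ae_int_t info;
    decisionforest df;
    dfreport rep;
    // basic variant: number of variables per split defaults to NVars/2
    dfbuildrandomdecisionforest(xy, 4, 2, 2, 50, 0.66, info, df, rep);
    // X1 variant: explicitly request 1 variable per split
    dfbuildrandomdecisionforestx1(xy, 4, 2, 2, 50, 1, 0.66, info, df, rep);
    // info=1 on success; rep.oobrelclserror holds the out-of-bag classification error

Smaller R values leave more points out-of-bag for every tree, so the out-of-bag error estimates in Rep are averaged over more trees per point.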
-- ALGLIB -- Copyright 19.02.2009 by Bochkanov Sergey *************************************************************************/ void dfbuildrandomdecisionforestx1(/* Real */ ae_matrix* xy, ae_int_t npoints, ae_int_t nvars, ae_int_t nclasses, ae_int_t ntrees, ae_int_t nrndvars, double r, ae_int_t* info, decisionforest* df, dfreport* rep, ae_state *_state) { ae_int_t samplesize; *info = 0; _decisionforest_clear(df); _dfreport_clear(rep); if( ae_fp_less_eq(r,(double)(0))||ae_fp_greater(r,(double)(1)) ) { *info = -1; return; } if( nrndvars<=0||nrndvars>nvars ) { *info = -1; return; } samplesize = ae_maxint(ae_round(r*npoints, _state), 1, _state); dfbuildinternal(xy, npoints, nvars, nclasses, ntrees, samplesize, nrndvars, dforest_dfusestrongsplits+dforest_dfuseevs, info, df, rep, _state); } void dfbuildinternal(/* Real */ ae_matrix* xy, ae_int_t npoints, ae_int_t nvars, ae_int_t nclasses, ae_int_t ntrees, ae_int_t samplesize, ae_int_t nfeatures, ae_int_t flags, ae_int_t* info, decisionforest* df, dfreport* rep, ae_state *_state) { ae_frame _frame_block; ae_int_t i; ae_int_t j; ae_int_t k; ae_int_t tmpi; ae_int_t lasttreeoffs; ae_int_t offs; ae_int_t ooboffs; ae_int_t treesize; ae_int_t nvarsinpool; ae_bool useevs; dfinternalbuffers bufs; ae_vector permbuf; ae_vector oobbuf; ae_vector oobcntbuf; ae_matrix xys; ae_vector x; ae_vector y; ae_int_t oobcnt; ae_int_t oobrelcnt; double v; double vmin; double vmax; ae_bool bflag; hqrndstate rs; ae_frame_make(_state, &_frame_block); *info = 0; _decisionforest_clear(df); _dfreport_clear(rep); _dfinternalbuffers_init(&bufs, _state); ae_vector_init(&permbuf, 0, DT_INT, _state); ae_vector_init(&oobbuf, 0, DT_REAL, _state); ae_vector_init(&oobcntbuf, 0, DT_INT, _state); ae_matrix_init(&xys, 0, 0, DT_REAL, _state); ae_vector_init(&x, 0, DT_REAL, _state); ae_vector_init(&y, 0, DT_REAL, _state); _hqrndstate_init(&rs, _state); /* * Test for inputs */ if( (((((npoints<1||samplesize<1)||samplesize>npoints)||nvars<1)||nclasses<1)||ntrees<1)||nfeatures<1 ) { *info = -1; ae_frame_leave(_state); return; } if( nclasses>1 ) { for(i=0; i<=npoints-1; i++) { if( ae_round(xy->ptr.pp_double[i][nvars], _state)<0||ae_round(xy->ptr.pp_double[i][nvars], _state)>=nclasses ) { *info = -2; ae_frame_leave(_state); return; } } } *info = 1; /* * Flags */ useevs = flags/dforest_dfuseevs%2!=0; /* * Allocate data, prepare header */ treesize = 1+dforest_innernodewidth*(samplesize-1)+dforest_leafnodewidth*samplesize; ae_vector_set_length(&permbuf, npoints-1+1, _state); ae_vector_set_length(&bufs.treebuf, treesize-1+1, _state); ae_vector_set_length(&bufs.idxbuf, npoints-1+1, _state); ae_vector_set_length(&bufs.tmpbufr, npoints-1+1, _state); ae_vector_set_length(&bufs.tmpbufr2, npoints-1+1, _state); ae_vector_set_length(&bufs.tmpbufi, npoints-1+1, _state); ae_vector_set_length(&bufs.sortrbuf, npoints, _state); ae_vector_set_length(&bufs.sortrbuf2, npoints, _state); ae_vector_set_length(&bufs.sortibuf, npoints, _state); ae_vector_set_length(&bufs.varpool, nvars-1+1, _state); ae_vector_set_length(&bufs.evsbin, nvars-1+1, _state); ae_vector_set_length(&bufs.evssplits, nvars-1+1, _state); ae_vector_set_length(&bufs.classibuf, 2*nclasses-1+1, _state); ae_vector_set_length(&oobbuf, nclasses*npoints-1+1, _state); ae_vector_set_length(&oobcntbuf, npoints-1+1, _state); ae_vector_set_length(&df->trees, ntrees*treesize-1+1, _state); ae_matrix_set_length(&xys, samplesize-1+1, nvars+1, _state); ae_vector_set_length(&x, nvars-1+1, _state); ae_vector_set_length(&y, nclasses-1+1, _state); for(i=0; 
i<=npoints-1; i++) { permbuf.ptr.p_int[i] = i; } for(i=0; i<=npoints*nclasses-1; i++) { oobbuf.ptr.p_double[i] = (double)(0); } for(i=0; i<=npoints-1; i++) { oobcntbuf.ptr.p_int[i] = 0; } /* * Prepare variable pool and EVS (extended variable selection/splitting) buffers * (whether EVS is turned on or not): * 1. detect binary variables and pre-calculate splits for them * 2. detect variables with non-distinct values and exclude them from pool */ for(i=0; i<=nvars-1; i++) { bufs.varpool.ptr.p_int[i] = i; } nvarsinpool = nvars; if( useevs ) { for(j=0; j<=nvars-1; j++) { vmin = xy->ptr.pp_double[0][j]; vmax = vmin; for(i=0; i<=npoints-1; i++) { v = xy->ptr.pp_double[i][j]; vmin = ae_minreal(vmin, v, _state); vmax = ae_maxreal(vmax, v, _state); } if( ae_fp_eq(vmin,vmax) ) { /* * exclude variable from pool */ bufs.varpool.ptr.p_int[j] = bufs.varpool.ptr.p_int[nvarsinpool-1]; bufs.varpool.ptr.p_int[nvarsinpool-1] = -1; nvarsinpool = nvarsinpool-1; continue; } bflag = ae_false; for(i=0; i<=npoints-1; i++) { v = xy->ptr.pp_double[i][j]; if( ae_fp_neq(v,vmin)&&ae_fp_neq(v,vmax) ) { bflag = ae_true; break; } } if( bflag ) { /* * non-binary variable */ bufs.evsbin.ptr.p_bool[j] = ae_false; } else { /* * Prepare */ bufs.evsbin.ptr.p_bool[j] = ae_true; bufs.evssplits.ptr.p_double[j] = 0.5*(vmin+vmax); if( ae_fp_less_eq(bufs.evssplits.ptr.p_double[j],vmin) ) { bufs.evssplits.ptr.p_double[j] = vmax; } } } } /* * RANDOM FOREST FORMAT * W[0] - size of array * W[1] - version number * W[2] - NVars * W[3] - NClasses (1 for regression) * W[4] - NTrees * W[5] - trees offset * * * TREE FORMAT * W[Offs] - size of sub-array * node info: * W[K+0] - variable number (-1 for leaf mode) * W[K+1] - threshold (class/value for leaf node) * W[K+2] - ">=" branch index (absent for leaf node) * */ df->nvars = nvars; df->nclasses = nclasses; df->ntrees = ntrees; /* * Build forest */ hqrndrandomize(&rs, _state); offs = 0; for(i=0; i<=ntrees-1; i++) { /* * Prepare sample */ for(k=0; k<=samplesize-1; k++) { j = k+hqrnduniformi(&rs, npoints-k, _state); tmpi = permbuf.ptr.p_int[k]; permbuf.ptr.p_int[k] = permbuf.ptr.p_int[j]; permbuf.ptr.p_int[j] = tmpi; j = permbuf.ptr.p_int[k]; ae_v_move(&xys.ptr.pp_double[k][0], 1, &xy->ptr.pp_double[j][0], 1, ae_v_len(0,nvars)); } /* * build tree, copy */ dforest_dfbuildtree(&xys, samplesize, nvars, nclasses, nfeatures, nvarsinpool, flags, &bufs, &rs, _state); j = ae_round(bufs.treebuf.ptr.p_double[0], _state); ae_v_move(&df->trees.ptr.p_double[offs], 1, &bufs.treebuf.ptr.p_double[0], 1, ae_v_len(offs,offs+j-1)); lasttreeoffs = offs; offs = offs+j; /* * OOB estimates */ for(k=samplesize; k<=npoints-1; k++) { for(j=0; j<=nclasses-1; j++) { y.ptr.p_double[j] = (double)(0); } j = permbuf.ptr.p_int[k]; ae_v_move(&x.ptr.p_double[0], 1, &xy->ptr.pp_double[j][0], 1, ae_v_len(0,nvars-1)); dforest_dfprocessinternal(df, lasttreeoffs, &x, &y, _state); ae_v_add(&oobbuf.ptr.p_double[j*nclasses], 1, &y.ptr.p_double[0], 1, ae_v_len(j*nclasses,(j+1)*nclasses-1)); oobcntbuf.ptr.p_int[j] = oobcntbuf.ptr.p_int[j]+1; } } df->bufsize = offs; /* * Normalize OOB results */ for(i=0; i<=npoints-1; i++) { if( oobcntbuf.ptr.p_int[i]!=0 ) { v = (double)1/(double)oobcntbuf.ptr.p_int[i]; ae_v_muld(&oobbuf.ptr.p_double[i*nclasses], 1, ae_v_len(i*nclasses,i*nclasses+nclasses-1), v); } } /* * Calculate training set estimates */ rep->relclserror = dfrelclserror(df, xy, npoints, _state); rep->avgce = dfavgce(df, xy, npoints, _state); rep->rmserror = dfrmserror(df, xy, npoints, _state); rep->avgerror = dfavgerror(df, xy, npoints, 
_state); rep->avgrelerror = dfavgrelerror(df, xy, npoints, _state); /* * Calculate OOB estimates. */ rep->oobrelclserror = (double)(0); rep->oobavgce = (double)(0); rep->oobrmserror = (double)(0); rep->oobavgerror = (double)(0); rep->oobavgrelerror = (double)(0); oobcnt = 0; oobrelcnt = 0; for(i=0; i<=npoints-1; i++) { if( oobcntbuf.ptr.p_int[i]!=0 ) { ooboffs = i*nclasses; if( nclasses>1 ) { /* * classification-specific code */ k = ae_round(xy->ptr.pp_double[i][nvars], _state); tmpi = 0; for(j=1; j<=nclasses-1; j++) { if( ae_fp_greater(oobbuf.ptr.p_double[ooboffs+j],oobbuf.ptr.p_double[ooboffs+tmpi]) ) { tmpi = j; } } if( tmpi!=k ) { rep->oobrelclserror = rep->oobrelclserror+1; } if( ae_fp_neq(oobbuf.ptr.p_double[ooboffs+k],(double)(0)) ) { rep->oobavgce = rep->oobavgce-ae_log(oobbuf.ptr.p_double[ooboffs+k], _state); } else { rep->oobavgce = rep->oobavgce-ae_log(ae_minrealnumber, _state); } for(j=0; j<=nclasses-1; j++) { if( j==k ) { rep->oobrmserror = rep->oobrmserror+ae_sqr(oobbuf.ptr.p_double[ooboffs+j]-1, _state); rep->oobavgerror = rep->oobavgerror+ae_fabs(oobbuf.ptr.p_double[ooboffs+j]-1, _state); rep->oobavgrelerror = rep->oobavgrelerror+ae_fabs(oobbuf.ptr.p_double[ooboffs+j]-1, _state); oobrelcnt = oobrelcnt+1; } else { rep->oobrmserror = rep->oobrmserror+ae_sqr(oobbuf.ptr.p_double[ooboffs+j], _state); rep->oobavgerror = rep->oobavgerror+ae_fabs(oobbuf.ptr.p_double[ooboffs+j], _state); } } } else { /* * regression-specific code */ rep->oobrmserror = rep->oobrmserror+ae_sqr(oobbuf.ptr.p_double[ooboffs]-xy->ptr.pp_double[i][nvars], _state); rep->oobavgerror = rep->oobavgerror+ae_fabs(oobbuf.ptr.p_double[ooboffs]-xy->ptr.pp_double[i][nvars], _state); if( ae_fp_neq(xy->ptr.pp_double[i][nvars],(double)(0)) ) { rep->oobavgrelerror = rep->oobavgrelerror+ae_fabs((oobbuf.ptr.p_double[ooboffs]-xy->ptr.pp_double[i][nvars])/xy->ptr.pp_double[i][nvars], _state); oobrelcnt = oobrelcnt+1; } } /* * update OOB estimates count. */ oobcnt = oobcnt+1; } } if( oobcnt>0 ) { rep->oobrelclserror = rep->oobrelclserror/oobcnt; rep->oobavgce = rep->oobavgce/oobcnt; rep->oobrmserror = ae_sqrt(rep->oobrmserror/(oobcnt*nclasses), _state); rep->oobavgerror = rep->oobavgerror/(oobcnt*nclasses); if( oobrelcnt>0 ) { rep->oobavgrelerror = rep->oobavgrelerror/oobrelcnt; } } ae_frame_leave(_state); } /************************************************************************* Procesing INPUT PARAMETERS: DF - decision forest model X - input vector, array[0..NVars-1]. OUTPUT PARAMETERS: Y - result. Regression estimate when solving regression task, vector of posterior probabilities for classification task. See also DFProcessI. 
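A short inference sketch, assuming the ALGLIB C++ interface; DF is a model produced by one of the DFBuild functions above and the feature values are made up:

    real_1d_array x = "[1,0]";   // one observation with NVars features (2 here)
    real_1d_array y;             // output; resized automatically if too small
    dfprocess(df, x, y);
    // Classification: y[i] is the posterior probability of class i,
    //                 i.e. the fraction of trees that voted for class i.
    // Regression:     y[0] is the average of the per-tree predictions.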
-- ALGLIB -- Copyright 16.02.2009 by Bochkanov Sergey *************************************************************************/ void dfprocess(decisionforest* df, /* Real */ ae_vector* x, /* Real */ ae_vector* y, ae_state *_state) { ae_int_t offs; ae_int_t i; double v; /* * Proceed */ if( y->cntnclasses ) { ae_vector_set_length(y, df->nclasses, _state); } offs = 0; for(i=0; i<=df->nclasses-1; i++) { y->ptr.p_double[i] = (double)(0); } for(i=0; i<=df->ntrees-1; i++) { /* * Process basic tree */ dforest_dfprocessinternal(df, offs, x, y, _state); /* * Next tree */ offs = offs+ae_round(df->trees.ptr.p_double[offs], _state); } v = (double)1/(double)df->ntrees; ae_v_muld(&y->ptr.p_double[0], 1, ae_v_len(0,df->nclasses-1), v); } /************************************************************************* 'interactive' variant of DFProcess for languages like Python which support constructs like "Y = DFProcessI(DF,X)" and interactive mode of interpreter This function allocates new array on each call, so it is significantly slower than its 'non-interactive' counterpart, but it is more convenient when you call it from command line. -- ALGLIB -- Copyright 28.02.2010 by Bochkanov Sergey *************************************************************************/ void dfprocessi(decisionforest* df, /* Real */ ae_vector* x, /* Real */ ae_vector* y, ae_state *_state) { ae_vector_clear(y); dfprocess(df, x, y, _state); } /************************************************************************* Relative classification error on the test set INPUT PARAMETERS: DF - decision forest model XY - test set NPoints - test set size RESULT: percent of incorrectly classified cases. Zero if model solves regression task. -- ALGLIB -- Copyright 16.02.2009 by Bochkanov Sergey *************************************************************************/ double dfrelclserror(decisionforest* df, /* Real */ ae_matrix* xy, ae_int_t npoints, ae_state *_state) { double result; result = (double)dforest_dfclserror(df, xy, npoints, _state)/(double)npoints; return result; } /************************************************************************* Average cross-entropy (in bits per element) on the test set INPUT PARAMETERS: DF - decision forest model XY - test set NPoints - test set size RESULT: CrossEntropy/(NPoints*LN(2)). Zero if model solves regression task. 
-- ALGLIB -- Copyright 16.02.2009 by Bochkanov Sergey *************************************************************************/ double dfavgce(decisionforest* df, /* Real */ ae_matrix* xy, ae_int_t npoints, ae_state *_state) { ae_frame _frame_block; ae_vector x; ae_vector y; ae_int_t i; ae_int_t j; ae_int_t k; ae_int_t tmpi; double result; ae_frame_make(_state, &_frame_block); ae_vector_init(&x, 0, DT_REAL, _state); ae_vector_init(&y, 0, DT_REAL, _state); ae_vector_set_length(&x, df->nvars-1+1, _state); ae_vector_set_length(&y, df->nclasses-1+1, _state); result = (double)(0); for(i=0; i<=npoints-1; i++) { ae_v_move(&x.ptr.p_double[0], 1, &xy->ptr.pp_double[i][0], 1, ae_v_len(0,df->nvars-1)); dfprocess(df, &x, &y, _state); if( df->nclasses>1 ) { /* * classification-specific code */ k = ae_round(xy->ptr.pp_double[i][df->nvars], _state); tmpi = 0; for(j=1; j<=df->nclasses-1; j++) { if( ae_fp_greater(y.ptr.p_double[j],y.ptr.p_double[tmpi]) ) { tmpi = j; } } if( ae_fp_neq(y.ptr.p_double[k],(double)(0)) ) { result = result-ae_log(y.ptr.p_double[k], _state); } else { result = result-ae_log(ae_minrealnumber, _state); } } } result = result/npoints; ae_frame_leave(_state); return result; } /************************************************************************* RMS error on the test set INPUT PARAMETERS: DF - decision forest model XY - test set NPoints - test set size RESULT: root mean square error. Its meaning for regression task is obvious. As for classification task, RMS error means error when estimating posterior probabilities. -- ALGLIB -- Copyright 16.02.2009 by Bochkanov Sergey *************************************************************************/ double dfrmserror(decisionforest* df, /* Real */ ae_matrix* xy, ae_int_t npoints, ae_state *_state) { ae_frame _frame_block; ae_vector x; ae_vector y; ae_int_t i; ae_int_t j; ae_int_t k; ae_int_t tmpi; double result; ae_frame_make(_state, &_frame_block); ae_vector_init(&x, 0, DT_REAL, _state); ae_vector_init(&y, 0, DT_REAL, _state); ae_vector_set_length(&x, df->nvars-1+1, _state); ae_vector_set_length(&y, df->nclasses-1+1, _state); result = (double)(0); for(i=0; i<=npoints-1; i++) { ae_v_move(&x.ptr.p_double[0], 1, &xy->ptr.pp_double[i][0], 1, ae_v_len(0,df->nvars-1)); dfprocess(df, &x, &y, _state); if( df->nclasses>1 ) { /* * classification-specific code */ k = ae_round(xy->ptr.pp_double[i][df->nvars], _state); tmpi = 0; for(j=1; j<=df->nclasses-1; j++) { if( ae_fp_greater(y.ptr.p_double[j],y.ptr.p_double[tmpi]) ) { tmpi = j; } } for(j=0; j<=df->nclasses-1; j++) { if( j==k ) { result = result+ae_sqr(y.ptr.p_double[j]-1, _state); } else { result = result+ae_sqr(y.ptr.p_double[j], _state); } } } else { /* * regression-specific code */ result = result+ae_sqr(y.ptr.p_double[0]-xy->ptr.pp_double[i][df->nvars], _state); } } result = ae_sqrt(result/(npoints*df->nclasses), _state); ae_frame_leave(_state); return result; } /************************************************************************* Average error on the test set INPUT PARAMETERS: DF - decision forest model XY - test set NPoints - test set size RESULT: Its meaning for regression task is obvious. As for classification task, it means average error when estimating posterior probabilities. 
-- ALGLIB -- Copyright 16.02.2009 by Bochkanov Sergey *************************************************************************/ double dfavgerror(decisionforest* df, /* Real */ ae_matrix* xy, ae_int_t npoints, ae_state *_state) { ae_frame _frame_block; ae_vector x; ae_vector y; ae_int_t i; ae_int_t j; ae_int_t k; double result; ae_frame_make(_state, &_frame_block); ae_vector_init(&x, 0, DT_REAL, _state); ae_vector_init(&y, 0, DT_REAL, _state); ae_vector_set_length(&x, df->nvars-1+1, _state); ae_vector_set_length(&y, df->nclasses-1+1, _state); result = (double)(0); for(i=0; i<=npoints-1; i++) { ae_v_move(&x.ptr.p_double[0], 1, &xy->ptr.pp_double[i][0], 1, ae_v_len(0,df->nvars-1)); dfprocess(df, &x, &y, _state); if( df->nclasses>1 ) { /* * classification-specific code */ k = ae_round(xy->ptr.pp_double[i][df->nvars], _state); for(j=0; j<=df->nclasses-1; j++) { if( j==k ) { result = result+ae_fabs(y.ptr.p_double[j]-1, _state); } else { result = result+ae_fabs(y.ptr.p_double[j], _state); } } } else { /* * regression-specific code */ result = result+ae_fabs(y.ptr.p_double[0]-xy->ptr.pp_double[i][df->nvars], _state); } } result = result/(npoints*df->nclasses); ae_frame_leave(_state); return result; } /************************************************************************* Average relative error on the test set INPUT PARAMETERS: DF - decision forest model XY - test set NPoints - test set size RESULT: Its meaning for regression task is obvious. As for classification task, it means average relative error when estimating posterior probability of belonging to the correct class. -- ALGLIB -- Copyright 16.02.2009 by Bochkanov Sergey *************************************************************************/ double dfavgrelerror(decisionforest* df, /* Real */ ae_matrix* xy, ae_int_t npoints, ae_state *_state) { ae_frame _frame_block; ae_vector x; ae_vector y; ae_int_t relcnt; ae_int_t i; ae_int_t j; ae_int_t k; double result; ae_frame_make(_state, &_frame_block); ae_vector_init(&x, 0, DT_REAL, _state); ae_vector_init(&y, 0, DT_REAL, _state); ae_vector_set_length(&x, df->nvars-1+1, _state); ae_vector_set_length(&y, df->nclasses-1+1, _state); result = (double)(0); relcnt = 0; for(i=0; i<=npoints-1; i++) { ae_v_move(&x.ptr.p_double[0], 1, &xy->ptr.pp_double[i][0], 1, ae_v_len(0,df->nvars-1)); dfprocess(df, &x, &y, _state); if( df->nclasses>1 ) { /* * classification-specific code */ k = ae_round(xy->ptr.pp_double[i][df->nvars], _state); for(j=0; j<=df->nclasses-1; j++) { if( j==k ) { result = result+ae_fabs(y.ptr.p_double[j]-1, _state); relcnt = relcnt+1; } } } else { /* * regression-specific code */ if( ae_fp_neq(xy->ptr.pp_double[i][df->nvars],(double)(0)) ) { result = result+ae_fabs((y.ptr.p_double[0]-xy->ptr.pp_double[i][df->nvars])/xy->ptr.pp_double[i][df->nvars], _state); relcnt = relcnt+1; } } } if( relcnt>0 ) { result = result/relcnt; } ae_frame_leave(_state); return result; } /************************************************************************* Copying of DecisionForest strucure INPUT PARAMETERS: DF1 - original OUTPUT PARAMETERS: DF2 - copy -- ALGLIB -- Copyright 13.02.2009 by Bochkanov Sergey *************************************************************************/ void dfcopy(decisionforest* df1, decisionforest* df2, ae_state *_state) { _decisionforest_clear(df2); df2->nvars = df1->nvars; df2->nclasses = df1->nclasses; df2->ntrees = df1->ntrees; df2->bufsize = df1->bufsize; ae_vector_set_length(&df2->trees, df1->bufsize-1+1, _state); ae_v_move(&df2->trees.ptr.p_double[0], 1, 
&df1->trees.ptr.p_double[0], 1, ae_v_len(0,df1->bufsize-1)); } /************************************************************************* Serializer: allocation -- ALGLIB -- Copyright 14.03.2011 by Bochkanov Sergey *************************************************************************/ void dfalloc(ae_serializer* s, decisionforest* forest, ae_state *_state) { ae_serializer_alloc_entry(s); ae_serializer_alloc_entry(s); ae_serializer_alloc_entry(s); ae_serializer_alloc_entry(s); ae_serializer_alloc_entry(s); ae_serializer_alloc_entry(s); allocrealarray(s, &forest->trees, forest->bufsize, _state); } /************************************************************************* Serializer: serialization -- ALGLIB -- Copyright 14.03.2011 by Bochkanov Sergey *************************************************************************/ void dfserialize(ae_serializer* s, decisionforest* forest, ae_state *_state) { ae_serializer_serialize_int(s, getrdfserializationcode(_state), _state); ae_serializer_serialize_int(s, dforest_dffirstversion, _state); ae_serializer_serialize_int(s, forest->nvars, _state); ae_serializer_serialize_int(s, forest->nclasses, _state); ae_serializer_serialize_int(s, forest->ntrees, _state); ae_serializer_serialize_int(s, forest->bufsize, _state); serializerealarray(s, &forest->trees, forest->bufsize, _state); } /************************************************************************* Serializer: unserialization -- ALGLIB -- Copyright 14.03.2011 by Bochkanov Sergey *************************************************************************/ void dfunserialize(ae_serializer* s, decisionforest* forest, ae_state *_state) { ae_int_t i0; ae_int_t i1; _decisionforest_clear(forest); /* * check correctness of header */ ae_serializer_unserialize_int(s, &i0, _state); ae_assert(i0==getrdfserializationcode(_state), "DFUnserialize: stream header corrupted", _state); ae_serializer_unserialize_int(s, &i1, _state); ae_assert(i1==dforest_dffirstversion, "DFUnserialize: stream header corrupted", _state); /* * Unserialize data */ ae_serializer_unserialize_int(s, &forest->nvars, _state); ae_serializer_unserialize_int(s, &forest->nclasses, _state); ae_serializer_unserialize_int(s, &forest->ntrees, _state); ae_serializer_unserialize_int(s, &forest->bufsize, _state); unserializerealarray(s, &forest->trees, _state); } /************************************************************************* Classification error *************************************************************************/ static ae_int_t dforest_dfclserror(decisionforest* df, /* Real */ ae_matrix* xy, ae_int_t npoints, ae_state *_state) { ae_frame _frame_block; ae_vector x; ae_vector y; ae_int_t i; ae_int_t j; ae_int_t k; ae_int_t tmpi; ae_int_t result; ae_frame_make(_state, &_frame_block); ae_vector_init(&x, 0, DT_REAL, _state); ae_vector_init(&y, 0, DT_REAL, _state); if( df->nclasses<=1 ) { result = 0; ae_frame_leave(_state); return result; } ae_vector_set_length(&x, df->nvars-1+1, _state); ae_vector_set_length(&y, df->nclasses-1+1, _state); result = 0; for(i=0; i<=npoints-1; i++) { ae_v_move(&x.ptr.p_double[0], 1, &xy->ptr.pp_double[i][0], 1, ae_v_len(0,df->nvars-1)); dfprocess(df, &x, &y, _state); k = ae_round(xy->ptr.pp_double[i][df->nvars], _state); tmpi = 0; for(j=1; j<=df->nclasses-1; j++) { if( ae_fp_greater(y.ptr.p_double[j],y.ptr.p_double[tmpi]) ) { tmpi = j; } } if( tmpi!=k ) { result = result+1; } } ae_frame_leave(_state); return result; } /************************************************************************* Internal 
subroutine for processing one decision tree starting at Offs *************************************************************************/ static void dforest_dfprocessinternal(decisionforest* df, ae_int_t offs, /* Real */ ae_vector* x, /* Real */ ae_vector* y, ae_state *_state) { ae_int_t k; ae_int_t idx; /* * Set pointer to the root */ k = offs+1; /* * Navigate through the tree */ for(;;) { if( ae_fp_eq(df->trees.ptr.p_double[k],(double)(-1)) ) { if( df->nclasses==1 ) { y->ptr.p_double[0] = y->ptr.p_double[0]+df->trees.ptr.p_double[k+1]; } else { idx = ae_round(df->trees.ptr.p_double[k+1], _state); y->ptr.p_double[idx] = y->ptr.p_double[idx]+1; } break; } if( ae_fp_less(x->ptr.p_double[ae_round(df->trees.ptr.p_double[k], _state)],df->trees.ptr.p_double[k+1]) ) { k = k+dforest_innernodewidth; } else { k = offs+ae_round(df->trees.ptr.p_double[k+2], _state); } } } /************************************************************************* Builds one decision tree. Just a wrapper for the DFBuildTreeRec. *************************************************************************/ static void dforest_dfbuildtree(/* Real */ ae_matrix* xy, ae_int_t npoints, ae_int_t nvars, ae_int_t nclasses, ae_int_t nfeatures, ae_int_t nvarsinpool, ae_int_t flags, dfinternalbuffers* bufs, hqrndstate* rs, ae_state *_state) { ae_int_t numprocessed; ae_int_t i; ae_assert(npoints>0, "Assertion failed", _state); /* * Prepare IdxBuf. It stores indices of the training set elements. * When training set is being split, contents of IdxBuf is * correspondingly reordered so we can know which elements belong * to which branch of decision tree. */ for(i=0; i<=npoints-1; i++) { bufs->idxbuf.ptr.p_int[i] = i; } /* * Recursive procedure */ numprocessed = 1; dforest_dfbuildtreerec(xy, npoints, nvars, nclasses, nfeatures, nvarsinpool, flags, &numprocessed, 0, npoints-1, bufs, rs, _state); bufs->treebuf.ptr.p_double[0] = (double)(numprocessed); } /************************************************************************* Builds one decision tree (internal recursive subroutine) Parameters: TreeBuf - large enough array, at least TreeSize IdxBuf - at least NPoints elements TmpBufR - at least NPoints TmpBufR2 - at least NPoints TmpBufI - at least NPoints TmpBufI2 - at least NPoints+1 *************************************************************************/ static void dforest_dfbuildtreerec(/* Real */ ae_matrix* xy, ae_int_t npoints, ae_int_t nvars, ae_int_t nclasses, ae_int_t nfeatures, ae_int_t nvarsinpool, ae_int_t flags, ae_int_t* numprocessed, ae_int_t idx1, ae_int_t idx2, dfinternalbuffers* bufs, hqrndstate* rs, ae_state *_state) { ae_int_t i; ae_int_t j; ae_int_t k; ae_bool bflag; ae_int_t i1; ae_int_t i2; ae_int_t info; double sl; double sr; double w; ae_int_t idxbest; double ebest; double tbest; ae_int_t varcur; double s; double v; double v1; double v2; double threshold; ae_int_t oldnp; double currms; ae_bool useevs; /* * these initializers are not really necessary, * but without them compiler complains about uninitialized locals */ tbest = (double)(0); /* * Prepare */ ae_assert(npoints>0, "Assertion failed", _state); ae_assert(idx2>=idx1, "Assertion failed", _state); useevs = flags/dforest_dfuseevs%2!=0; /* * Leaf node */ if( idx2==idx1 ) { bufs->treebuf.ptr.p_double[*numprocessed] = (double)(-1); bufs->treebuf.ptr.p_double[*numprocessed+1] = xy->ptr.pp_double[bufs->idxbuf.ptr.p_int[idx1]][nvars]; *numprocessed = *numprocessed+dforest_leafnodewidth; return; } /* * Non-leaf node. * Select random variable, prepare split: * 1. 
prepare default solution - no splitting, class at random * 2. investigate possible splits, compare with default/best */ idxbest = -1; if( nclasses>1 ) { /* * default solution for classification */ for(i=0; i<=nclasses-1; i++) { bufs->classibuf.ptr.p_int[i] = 0; } s = (double)(idx2-idx1+1); for(i=idx1; i<=idx2; i++) { j = ae_round(xy->ptr.pp_double[bufs->idxbuf.ptr.p_int[i]][nvars], _state); bufs->classibuf.ptr.p_int[j] = bufs->classibuf.ptr.p_int[j]+1; } ebest = (double)(0); for(i=0; i<=nclasses-1; i++) { ebest = ebest+bufs->classibuf.ptr.p_int[i]*ae_sqr(1-bufs->classibuf.ptr.p_int[i]/s, _state)+(s-bufs->classibuf.ptr.p_int[i])*ae_sqr(bufs->classibuf.ptr.p_int[i]/s, _state); } ebest = ae_sqrt(ebest/(nclasses*(idx2-idx1+1)), _state); } else { /* * default solution for regression */ v = (double)(0); for(i=idx1; i<=idx2; i++) { v = v+xy->ptr.pp_double[bufs->idxbuf.ptr.p_int[i]][nvars]; } v = v/(idx2-idx1+1); ebest = (double)(0); for(i=idx1; i<=idx2; i++) { ebest = ebest+ae_sqr(xy->ptr.pp_double[bufs->idxbuf.ptr.p_int[i]][nvars]-v, _state); } ebest = ae_sqrt(ebest/(idx2-idx1+1), _state); } i = 0; while(i<=ae_minint(nfeatures, nvarsinpool, _state)-1) { /* * select variables from pool */ j = i+hqrnduniformi(rs, nvarsinpool-i, _state); k = bufs->varpool.ptr.p_int[i]; bufs->varpool.ptr.p_int[i] = bufs->varpool.ptr.p_int[j]; bufs->varpool.ptr.p_int[j] = k; varcur = bufs->varpool.ptr.p_int[i]; /* * load variable values to working array * * apply EVS preprocessing: if all variable values are same, * variable is excluded from pool. * * This is necessary for binary pre-splits (see later) to work. */ for(j=idx1; j<=idx2; j++) { bufs->tmpbufr.ptr.p_double[j-idx1] = xy->ptr.pp_double[bufs->idxbuf.ptr.p_int[j]][varcur]; } if( useevs ) { bflag = ae_false; v = bufs->tmpbufr.ptr.p_double[0]; for(j=0; j<=idx2-idx1; j++) { if( ae_fp_neq(bufs->tmpbufr.ptr.p_double[j],v) ) { bflag = ae_true; break; } } if( !bflag ) { /* * exclude variable from pool, * go to the next iteration. * I is not increased. */ k = bufs->varpool.ptr.p_int[i]; bufs->varpool.ptr.p_int[i] = bufs->varpool.ptr.p_int[nvarsinpool-1]; bufs->varpool.ptr.p_int[nvarsinpool-1] = k; nvarsinpool = nvarsinpool-1; continue; } } /* * load labels to working array */ if( nclasses>1 ) { for(j=idx1; j<=idx2; j++) { bufs->tmpbufi.ptr.p_int[j-idx1] = ae_round(xy->ptr.pp_double[bufs->idxbuf.ptr.p_int[j]][nvars], _state); } } else { for(j=idx1; j<=idx2; j++) { bufs->tmpbufr2.ptr.p_double[j-idx1] = xy->ptr.pp_double[bufs->idxbuf.ptr.p_int[j]][nvars]; } } /* * calculate split */ if( useevs&&bufs->evsbin.ptr.p_bool[varcur] ) { /* * Pre-calculated splits for binary variables. 
* Threshold is already known, just calculate RMS error */ threshold = bufs->evssplits.ptr.p_double[varcur]; if( nclasses>1 ) { /* * classification-specific code */ for(j=0; j<=2*nclasses-1; j++) { bufs->classibuf.ptr.p_int[j] = 0; } sl = (double)(0); sr = (double)(0); for(j=0; j<=idx2-idx1; j++) { k = bufs->tmpbufi.ptr.p_int[j]; if( ae_fp_less(bufs->tmpbufr.ptr.p_double[j],threshold) ) { bufs->classibuf.ptr.p_int[k] = bufs->classibuf.ptr.p_int[k]+1; sl = sl+1; } else { bufs->classibuf.ptr.p_int[k+nclasses] = bufs->classibuf.ptr.p_int[k+nclasses]+1; sr = sr+1; } } ae_assert(ae_fp_neq(sl,(double)(0))&&ae_fp_neq(sr,(double)(0)), "DFBuildTreeRec: something strange!", _state); currms = (double)(0); for(j=0; j<=nclasses-1; j++) { w = (double)(bufs->classibuf.ptr.p_int[j]); currms = currms+w*ae_sqr(w/sl-1, _state); currms = currms+(sl-w)*ae_sqr(w/sl, _state); w = (double)(bufs->classibuf.ptr.p_int[nclasses+j]); currms = currms+w*ae_sqr(w/sr-1, _state); currms = currms+(sr-w)*ae_sqr(w/sr, _state); } currms = ae_sqrt(currms/(nclasses*(idx2-idx1+1)), _state); } else { /* * regression-specific code */ sl = (double)(0); sr = (double)(0); v1 = (double)(0); v2 = (double)(0); for(j=0; j<=idx2-idx1; j++) { if( ae_fp_less(bufs->tmpbufr.ptr.p_double[j],threshold) ) { v1 = v1+bufs->tmpbufr2.ptr.p_double[j]; sl = sl+1; } else { v2 = v2+bufs->tmpbufr2.ptr.p_double[j]; sr = sr+1; } } ae_assert(ae_fp_neq(sl,(double)(0))&&ae_fp_neq(sr,(double)(0)), "DFBuildTreeRec: something strange!", _state); v1 = v1/sl; v2 = v2/sr; currms = (double)(0); for(j=0; j<=idx2-idx1; j++) { if( ae_fp_less(bufs->tmpbufr.ptr.p_double[j],threshold) ) { currms = currms+ae_sqr(v1-bufs->tmpbufr2.ptr.p_double[j], _state); } else { currms = currms+ae_sqr(v2-bufs->tmpbufr2.ptr.p_double[j], _state); } } currms = ae_sqrt(currms/(idx2-idx1+1), _state); } info = 1; } else { /* * Generic splits */ if( nclasses>1 ) { dforest_dfsplitc(&bufs->tmpbufr, &bufs->tmpbufi, &bufs->classibuf, idx2-idx1+1, nclasses, dforest_dfusestrongsplits, &info, &threshold, &currms, &bufs->sortrbuf, &bufs->sortibuf, _state); } else { dforest_dfsplitr(&bufs->tmpbufr, &bufs->tmpbufr2, idx2-idx1+1, dforest_dfusestrongsplits, &info, &threshold, &currms, &bufs->sortrbuf, &bufs->sortrbuf2, _state); } } if( info>0 ) { if( ae_fp_less_eq(currms,ebest) ) { ebest = currms; idxbest = varcur; tbest = threshold; } } /* * Next iteration */ i = i+1; } /* * to split or not to split */ if( idxbest<0 ) { /* * All values are same, cannot split. */ bufs->treebuf.ptr.p_double[*numprocessed] = (double)(-1); if( nclasses>1 ) { /* * Select random class label (randomness allows us to * approximate distribution of the classes) */ bufs->treebuf.ptr.p_double[*numprocessed+1] = (double)(ae_round(xy->ptr.pp_double[bufs->idxbuf.ptr.p_int[idx1+hqrnduniformi(rs, idx2-idx1+1, _state)]][nvars], _state)); } else { /* * Select average (for regression task). 
*/ v = (double)(0); for(i=idx1; i<=idx2; i++) { v = v+xy->ptr.pp_double[bufs->idxbuf.ptr.p_int[i]][nvars]/(idx2-idx1+1); } bufs->treebuf.ptr.p_double[*numprocessed+1] = v; } *numprocessed = *numprocessed+dforest_leafnodewidth; } else { /* * we can split */ bufs->treebuf.ptr.p_double[*numprocessed] = (double)(idxbest); bufs->treebuf.ptr.p_double[*numprocessed+1] = tbest; i1 = idx1; i2 = idx2; while(i1<=i2) { /* * Reorder indices so that left partition is in [Idx1..I1-1], * and right partition is in [I2+1..Idx2] */ if( ae_fp_less(xy->ptr.pp_double[bufs->idxbuf.ptr.p_int[i1]][idxbest],tbest) ) { i1 = i1+1; continue; } if( ae_fp_greater_eq(xy->ptr.pp_double[bufs->idxbuf.ptr.p_int[i2]][idxbest],tbest) ) { i2 = i2-1; continue; } j = bufs->idxbuf.ptr.p_int[i1]; bufs->idxbuf.ptr.p_int[i1] = bufs->idxbuf.ptr.p_int[i2]; bufs->idxbuf.ptr.p_int[i2] = j; i1 = i1+1; i2 = i2-1; } oldnp = *numprocessed; *numprocessed = *numprocessed+dforest_innernodewidth; dforest_dfbuildtreerec(xy, npoints, nvars, nclasses, nfeatures, nvarsinpool, flags, numprocessed, idx1, i1-1, bufs, rs, _state); bufs->treebuf.ptr.p_double[oldnp+2] = (double)(*numprocessed); dforest_dfbuildtreerec(xy, npoints, nvars, nclasses, nfeatures, nvarsinpool, flags, numprocessed, i2+1, idx2, bufs, rs, _state); } } /************************************************************************* Makes split on attribute *************************************************************************/ static void dforest_dfsplitc(/* Real */ ae_vector* x, /* Integer */ ae_vector* c, /* Integer */ ae_vector* cntbuf, ae_int_t n, ae_int_t nc, ae_int_t flags, ae_int_t* info, double* threshold, double* e, /* Real */ ae_vector* sortrbuf, /* Integer */ ae_vector* sortibuf, ae_state *_state) { ae_int_t i; ae_int_t neq; ae_int_t nless; ae_int_t ngreater; ae_int_t q; ae_int_t qmin; ae_int_t qmax; ae_int_t qcnt; double cursplit; ae_int_t nleft; double v; double cure; double w; double sl; double sr; *info = 0; *threshold = 0; *e = 0; tagsortfasti(x, c, sortrbuf, sortibuf, n, _state); *e = ae_maxrealnumber; *threshold = 0.5*(x->ptr.p_double[0]+x->ptr.p_double[n-1]); *info = -3; if( flags/dforest_dfusestrongsplits%2==0 ) { /* * weak splits, split at half */ qcnt = 2; qmin = 1; qmax = 1; } else { /* * strong splits: choose best quartile */ qcnt = 4; qmin = 1; qmax = 3; } for(q=qmin; q<=qmax; q++) { cursplit = x->ptr.p_double[n*q/qcnt]; neq = 0; nless = 0; ngreater = 0; for(i=0; i<=n-1; i++) { if( ae_fp_less(x->ptr.p_double[i],cursplit) ) { nless = nless+1; } if( ae_fp_eq(x->ptr.p_double[i],cursplit) ) { neq = neq+1; } if( ae_fp_greater(x->ptr.p_double[i],cursplit) ) { ngreater = ngreater+1; } } ae_assert(neq!=0, "DFSplitR: NEq=0, something strange!!!", _state); if( nless!=0||ngreater!=0 ) { /* * set threshold between two partitions, with * some tweaking to avoid problems with floating point * arithmetics. * * The problem is that when you calculates C = 0.5*(A+B) there * can be no C which lies strictly between A and B (for example, * there is no floating point number which is * greater than 1 and less than 1+eps). In such situations * we choose right side as theshold (remember that * points which lie on threshold falls to the right side). 
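A concrete instance of the effect described above (illustrative only; requires math.h or cmath for nextafter, default round-to-nearest rounding assumed):

    double a   = 1.0;
    double b   = nextafter(a, 2.0);   // smallest double strictly greater than 1.0
    double mid = 0.5*(a+b);           // rounds back to exactly 1.0, i.e. mid==a
    // No double lies strictly between a and b, so the computed midpoint collapses
    // onto the left value; the code below therefore moves the threshold to the
    // right-hand sample whenever the midpoint is <= the left-hand sample.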
*/ if( nlessptr.p_double[nless+neq-1]+x->ptr.p_double[nless+neq]); nleft = nless+neq; if( ae_fp_less_eq(cursplit,x->ptr.p_double[nless+neq-1]) ) { cursplit = x->ptr.p_double[nless+neq]; } } else { cursplit = 0.5*(x->ptr.p_double[nless-1]+x->ptr.p_double[nless]); nleft = nless; if( ae_fp_less_eq(cursplit,x->ptr.p_double[nless-1]) ) { cursplit = x->ptr.p_double[nless]; } } *info = 1; cure = (double)(0); for(i=0; i<=2*nc-1; i++) { cntbuf->ptr.p_int[i] = 0; } for(i=0; i<=nleft-1; i++) { cntbuf->ptr.p_int[c->ptr.p_int[i]] = cntbuf->ptr.p_int[c->ptr.p_int[i]]+1; } for(i=nleft; i<=n-1; i++) { cntbuf->ptr.p_int[nc+c->ptr.p_int[i]] = cntbuf->ptr.p_int[nc+c->ptr.p_int[i]]+1; } sl = (double)(nleft); sr = (double)(n-nleft); v = (double)(0); for(i=0; i<=nc-1; i++) { w = (double)(cntbuf->ptr.p_int[i]); v = v+w*ae_sqr(w/sl-1, _state); v = v+(sl-w)*ae_sqr(w/sl, _state); w = (double)(cntbuf->ptr.p_int[nc+i]); v = v+w*ae_sqr(w/sr-1, _state); v = v+(sr-w)*ae_sqr(w/sr, _state); } cure = ae_sqrt(v/(nc*n), _state); if( ae_fp_less(cure,*e) ) { *threshold = cursplit; *e = cure; } } } } /************************************************************************* Makes split on attribute *************************************************************************/ static void dforest_dfsplitr(/* Real */ ae_vector* x, /* Real */ ae_vector* y, ae_int_t n, ae_int_t flags, ae_int_t* info, double* threshold, double* e, /* Real */ ae_vector* sortrbuf, /* Real */ ae_vector* sortrbuf2, ae_state *_state) { ae_int_t i; ae_int_t neq; ae_int_t nless; ae_int_t ngreater; ae_int_t q; ae_int_t qmin; ae_int_t qmax; ae_int_t qcnt; double cursplit; ae_int_t nleft; double v; double cure; *info = 0; *threshold = 0; *e = 0; tagsortfastr(x, y, sortrbuf, sortrbuf2, n, _state); *e = ae_maxrealnumber; *threshold = 0.5*(x->ptr.p_double[0]+x->ptr.p_double[n-1]); *info = -3; if( flags/dforest_dfusestrongsplits%2==0 ) { /* * weak splits, split at half */ qcnt = 2; qmin = 1; qmax = 1; } else { /* * strong splits: choose best quartile */ qcnt = 4; qmin = 1; qmax = 3; } for(q=qmin; q<=qmax; q++) { cursplit = x->ptr.p_double[n*q/qcnt]; neq = 0; nless = 0; ngreater = 0; for(i=0; i<=n-1; i++) { if( ae_fp_less(x->ptr.p_double[i],cursplit) ) { nless = nless+1; } if( ae_fp_eq(x->ptr.p_double[i],cursplit) ) { neq = neq+1; } if( ae_fp_greater(x->ptr.p_double[i],cursplit) ) { ngreater = ngreater+1; } } ae_assert(neq!=0, "DFSplitR: NEq=0, something strange!!!", _state); if( nless!=0||ngreater!=0 ) { /* * set threshold between two partitions, with * some tweaking to avoid problems with floating point * arithmetics. * * The problem is that when you calculates C = 0.5*(A+B) there * can be no C which lies strictly between A and B (for example, * there is no floating point number which is * greater than 1 and less than 1+eps). In such situations * we choose right side as theshold (remember that * points which lie on threshold falls to the right side). 
*/ if( nlessptr.p_double[nless+neq-1]+x->ptr.p_double[nless+neq]); nleft = nless+neq; if( ae_fp_less_eq(cursplit,x->ptr.p_double[nless+neq-1]) ) { cursplit = x->ptr.p_double[nless+neq]; } } else { cursplit = 0.5*(x->ptr.p_double[nless-1]+x->ptr.p_double[nless]); nleft = nless; if( ae_fp_less_eq(cursplit,x->ptr.p_double[nless-1]) ) { cursplit = x->ptr.p_double[nless]; } } *info = 1; cure = (double)(0); v = (double)(0); for(i=0; i<=nleft-1; i++) { v = v+y->ptr.p_double[i]; } v = v/nleft; for(i=0; i<=nleft-1; i++) { cure = cure+ae_sqr(y->ptr.p_double[i]-v, _state); } v = (double)(0); for(i=nleft; i<=n-1; i++) { v = v+y->ptr.p_double[i]; } v = v/(n-nleft); for(i=nleft; i<=n-1; i++) { cure = cure+ae_sqr(y->ptr.p_double[i]-v, _state); } cure = ae_sqrt(cure/n, _state); if( ae_fp_less(cure,*e) ) { *threshold = cursplit; *e = cure; } } } } void _decisionforest_init(void* _p, ae_state *_state) { decisionforest *p = (decisionforest*)_p; ae_touch_ptr((void*)p); ae_vector_init(&p->trees, 0, DT_REAL, _state); } void _decisionforest_init_copy(void* _dst, void* _src, ae_state *_state) { decisionforest *dst = (decisionforest*)_dst; decisionforest *src = (decisionforest*)_src; dst->nvars = src->nvars; dst->nclasses = src->nclasses; dst->ntrees = src->ntrees; dst->bufsize = src->bufsize; ae_vector_init_copy(&dst->trees, &src->trees, _state); } void _decisionforest_clear(void* _p) { decisionforest *p = (decisionforest*)_p; ae_touch_ptr((void*)p); ae_vector_clear(&p->trees); } void _decisionforest_destroy(void* _p) { decisionforest *p = (decisionforest*)_p; ae_touch_ptr((void*)p); ae_vector_destroy(&p->trees); } void _dfreport_init(void* _p, ae_state *_state) { dfreport *p = (dfreport*)_p; ae_touch_ptr((void*)p); } void _dfreport_init_copy(void* _dst, void* _src, ae_state *_state) { dfreport *dst = (dfreport*)_dst; dfreport *src = (dfreport*)_src; dst->relclserror = src->relclserror; dst->avgce = src->avgce; dst->rmserror = src->rmserror; dst->avgerror = src->avgerror; dst->avgrelerror = src->avgrelerror; dst->oobrelclserror = src->oobrelclserror; dst->oobavgce = src->oobavgce; dst->oobrmserror = src->oobrmserror; dst->oobavgerror = src->oobavgerror; dst->oobavgrelerror = src->oobavgrelerror; } void _dfreport_clear(void* _p) { dfreport *p = (dfreport*)_p; ae_touch_ptr((void*)p); } void _dfreport_destroy(void* _p) { dfreport *p = (dfreport*)_p; ae_touch_ptr((void*)p); } void _dfinternalbuffers_init(void* _p, ae_state *_state) { dfinternalbuffers *p = (dfinternalbuffers*)_p; ae_touch_ptr((void*)p); ae_vector_init(&p->treebuf, 0, DT_REAL, _state); ae_vector_init(&p->idxbuf, 0, DT_INT, _state); ae_vector_init(&p->tmpbufr, 0, DT_REAL, _state); ae_vector_init(&p->tmpbufr2, 0, DT_REAL, _state); ae_vector_init(&p->tmpbufi, 0, DT_INT, _state); ae_vector_init(&p->classibuf, 0, DT_INT, _state); ae_vector_init(&p->sortrbuf, 0, DT_REAL, _state); ae_vector_init(&p->sortrbuf2, 0, DT_REAL, _state); ae_vector_init(&p->sortibuf, 0, DT_INT, _state); ae_vector_init(&p->varpool, 0, DT_INT, _state); ae_vector_init(&p->evsbin, 0, DT_BOOL, _state); ae_vector_init(&p->evssplits, 0, DT_REAL, _state); } void _dfinternalbuffers_init_copy(void* _dst, void* _src, ae_state *_state) { dfinternalbuffers *dst = (dfinternalbuffers*)_dst; dfinternalbuffers *src = (dfinternalbuffers*)_src; ae_vector_init_copy(&dst->treebuf, &src->treebuf, _state); ae_vector_init_copy(&dst->idxbuf, &src->idxbuf, _state); ae_vector_init_copy(&dst->tmpbufr, &src->tmpbufr, _state); ae_vector_init_copy(&dst->tmpbufr2, &src->tmpbufr2, _state); 
ae_vector_init_copy(&dst->tmpbufi, &src->tmpbufi, _state); ae_vector_init_copy(&dst->classibuf, &src->classibuf, _state); ae_vector_init_copy(&dst->sortrbuf, &src->sortrbuf, _state); ae_vector_init_copy(&dst->sortrbuf2, &src->sortrbuf2, _state); ae_vector_init_copy(&dst->sortibuf, &src->sortibuf, _state); ae_vector_init_copy(&dst->varpool, &src->varpool, _state); ae_vector_init_copy(&dst->evsbin, &src->evsbin, _state); ae_vector_init_copy(&dst->evssplits, &src->evssplits, _state); } void _dfinternalbuffers_clear(void* _p) { dfinternalbuffers *p = (dfinternalbuffers*)_p; ae_touch_ptr((void*)p); ae_vector_clear(&p->treebuf); ae_vector_clear(&p->idxbuf); ae_vector_clear(&p->tmpbufr); ae_vector_clear(&p->tmpbufr2); ae_vector_clear(&p->tmpbufi); ae_vector_clear(&p->classibuf); ae_vector_clear(&p->sortrbuf); ae_vector_clear(&p->sortrbuf2); ae_vector_clear(&p->sortibuf); ae_vector_clear(&p->varpool); ae_vector_clear(&p->evsbin); ae_vector_clear(&p->evssplits); } void _dfinternalbuffers_destroy(void* _p) { dfinternalbuffers *p = (dfinternalbuffers*)_p; ae_touch_ptr((void*)p); ae_vector_destroy(&p->treebuf); ae_vector_destroy(&p->idxbuf); ae_vector_destroy(&p->tmpbufr); ae_vector_destroy(&p->tmpbufr2); ae_vector_destroy(&p->tmpbufi); ae_vector_destroy(&p->classibuf); ae_vector_destroy(&p->sortrbuf); ae_vector_destroy(&p->sortrbuf2); ae_vector_destroy(&p->sortibuf); ae_vector_destroy(&p->varpool); ae_vector_destroy(&p->evsbin); ae_vector_destroy(&p->evssplits); } /************************************************************************* k-means++ clusterization. Backward compatibility function, we recommend to use CLUSTERING subpackage as better replacement. -- ALGLIB -- Copyright 21.03.2009 by Bochkanov Sergey *************************************************************************/ void kmeansgenerate(/* Real */ ae_matrix* xy, ae_int_t npoints, ae_int_t nvars, ae_int_t k, ae_int_t restarts, ae_int_t* info, /* Real */ ae_matrix* c, /* Integer */ ae_vector* xyc, ae_state *_state) { ae_frame _frame_block; ae_matrix dummy; ae_int_t itscnt; double e; kmeansbuffers buf; ae_frame_make(_state, &_frame_block); *info = 0; ae_matrix_clear(c); ae_vector_clear(xyc); ae_matrix_init(&dummy, 0, 0, DT_REAL, _state); _kmeansbuffers_init(&buf, _state); kmeansinitbuf(&buf, _state); kmeansgenerateinternal(xy, npoints, nvars, k, 0, 0, restarts, ae_false, info, &itscnt, c, ae_true, &dummy, ae_false, xyc, &e, &buf, _state); ae_frame_leave(_state); } } cpp/src/integration.cpp0000755000175000017500000041307313105126766015056 0ustar sergeysergey/************************************************************************* ALGLIB 3.11.0 (source code generated 2017-05-11) Copyright (c) Sergey Bochkanov (ALGLIB project). >>> SOURCE LICENSE >>> This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation (www.fsf.org); either version 2 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. 
A copy of the GNU General Public License is available at http://www.fsf.org/licensing/licenses >>> END OF LICENSE >>> *************************************************************************/ #include "stdafx.h" #include "integration.h" // disable some irrelevant warnings #if (AE_COMPILER==AE_MSVC) #pragma warning(disable:4100) #pragma warning(disable:4127) #pragma warning(disable:4702) #pragma warning(disable:4996) #endif using namespace std; ///////////////////////////////////////////////////////////////////////// // // THIS SECTION CONTAINS IMPLEMENTATION OF C++ INTERFACE // ///////////////////////////////////////////////////////////////////////// namespace alglib { /************************************************************************* Computation of nodes and weights for a Gauss quadrature formula The algorithm generates the N-point Gauss quadrature formula with weight function given by coefficients alpha and beta of a recurrence relation which generates a system of orthogonal polynomials: P-1(x) = 0 P0(x) = 1 Pn+1(x) = (x-alpha(n))*Pn(x) - beta(n)*Pn-1(x) and zeroth moment Mu0 Mu0 = integral(W(x)dx,a,b) INPUT PARAMETERS: Alpha - array[0..N-1], alpha coefficients Beta - array[0..N-1], beta coefficients Zero-indexed element is not used and may be arbitrary. Beta[I]>0. Mu0 - zeroth moment of the weight function. N - number of nodes of the quadrature formula, N>=1 OUTPUT PARAMETERS: Info - error code: * -3 internal eigenproblem solver hasn't converged * -2 Beta[i]<=0 * -1 incorrect N was passed * 1 OK X - array[0..N-1] - array of quadrature nodes, in ascending order. W - array[0..N-1] - array of quadrature weights. -- ALGLIB -- Copyright 2005-2009 by Bochkanov Sergey *************************************************************************/ void gqgeneraterec(const real_1d_array &alpha, const real_1d_array &beta, const double mu0, const ae_int_t n, ae_int_t &info, real_1d_array &x, real_1d_array &w) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::gqgeneraterec(const_cast(alpha.c_ptr()), const_cast(beta.c_ptr()), mu0, n, &info, const_cast(x.c_ptr()), const_cast(w.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Computation of nodes and weights for a Gauss-Lobatto quadrature formula The algorithm generates the N-point Gauss-Lobatto quadrature formula with weight function given by coefficients alpha and beta of a recurrence which generates a system of orthogonal polynomials. P-1(x) = 0 P0(x) = 1 Pn+1(x) = (x-alpha(n))*Pn(x) - beta(n)*Pn-1(x) and zeroth moment Mu0 Mu0 = integral(W(x)dx,a,b) INPUT PARAMETERS: Alpha - array[0..N-2], alpha coefficients Beta - array[0..N-2], beta coefficients. Zero-indexed element is not used, may be arbitrary. Beta[I]>0 Mu0 - zeroth moment of the weighting function. A - left boundary of the integration interval. B - right boundary of the integration interval. N - number of nodes of the quadrature formula, N>=3 (including the left and right boundary nodes). OUTPUT PARAMETERS: Info - error code: * -3 internal eigenproblem solver hasn't converged * -2 Beta[i]<=0 * -1 incorrect N was passed * 1 OK X - array[0..N-1] - array of quadrature nodes, in ascending order. W - array[0..N-1] - array of quadrature weights. 
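EXAMPLE (illustrative sketch; it feeds this function the same Legendre recurrence
coefficients used by GQGenerateGaussLegendre, i.e. alpha[i]=0, beta[0]=2,
beta[i]=1/(4-1/i^2), Mu0=2):

    // 5-point Gauss-Lobatto-Legendre rule on [-1,+1]; the endpoints become fixed nodes
    alglib::real_1d_array alpha, beta, x, w;
    alglib::ae_int_t info, n = 5;
    alpha.setlength(n-1);
    beta.setlength(n-1);
    for(int i=0; i<n-1; i++)
    {
        alpha[i] = 0.0;
        beta[i] = (i==0 ? 2.0 : 1.0/(4.0-1.0/(double(i)*double(i))));
    }
    alglib::gqgenerategausslobattorec(alpha, beta, 2.0, -1.0, +1.0, n, info, x, w);
    // on success info=1, x[0]=-1 and x[n-1]=+1, and sum(w[i]) equals Mu0=2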
-- ALGLIB -- Copyright 2005-2009 by Bochkanov Sergey *************************************************************************/ void gqgenerategausslobattorec(const real_1d_array &alpha, const real_1d_array &beta, const double mu0, const double a, const double b, const ae_int_t n, ae_int_t &info, real_1d_array &x, real_1d_array &w) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::gqgenerategausslobattorec(const_cast(alpha.c_ptr()), const_cast(beta.c_ptr()), mu0, a, b, n, &info, const_cast(x.c_ptr()), const_cast(w.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Computation of nodes and weights for a Gauss-Radau quadrature formula The algorithm generates the N-point Gauss-Radau quadrature formula with weight function given by the coefficients alpha and beta of a recurrence which generates a system of orthogonal polynomials. P-1(x) = 0 P0(x) = 1 Pn+1(x) = (x-alpha(n))*Pn(x) - beta(n)*Pn-1(x) and zeroth moment Mu0 Mu0 = integral(W(x)dx,a,b) INPUT PARAMETERS: Alpha - array[0..N-2], alpha coefficients. Beta - array[0..N-1], beta coefficients Zero-indexed element is not used. Beta[I]>0 Mu0 - zeroth moment of the weighting function. A - left boundary of the integration interval. N - number of nodes of the quadrature formula, N>=2 (including the left boundary node). OUTPUT PARAMETERS: Info - error code: * -3 internal eigenproblem solver hasn't converged * -2 Beta[i]<=0 * -1 incorrect N was passed * 1 OK X - array[0..N-1] - array of quadrature nodes, in ascending order. W - array[0..N-1] - array of quadrature weights. -- ALGLIB -- Copyright 2005-2009 by Bochkanov Sergey *************************************************************************/ void gqgenerategaussradaurec(const real_1d_array &alpha, const real_1d_array &beta, const double mu0, const double a, const ae_int_t n, ae_int_t &info, real_1d_array &x, real_1d_array &w) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::gqgenerategaussradaurec(const_cast(alpha.c_ptr()), const_cast(beta.c_ptr()), mu0, a, n, &info, const_cast(x.c_ptr()), const_cast(w.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Returns nodes/weights for Gauss-Legendre quadrature on [-1,1] with N nodes. INPUT PARAMETERS: N - number of nodes, >=1 OUTPUT PARAMETERS: Info - error code: * -4 an error was detected when calculating weights/nodes. N is too large to obtain weights/nodes with high enough accuracy. Try to use multiple precision version. * -3 internal eigenproblem solver hasn't converged * -1 incorrect N was passed * +1 OK X - array[0..N-1] - array of quadrature nodes, in ascending order. W - array[0..N-1] - array of quadrature weights. 
-- ALGLIB -- Copyright 12.05.2009 by Bochkanov Sergey *************************************************************************/ void gqgenerategausslegendre(const ae_int_t n, ae_int_t &info, real_1d_array &x, real_1d_array &w) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::gqgenerategausslegendre(n, &info, const_cast(x.c_ptr()), const_cast(w.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Returns nodes/weights for Gauss-Jacobi quadrature on [-1,1] with weight function W(x)=Power(1-x,Alpha)*Power(1+x,Beta). INPUT PARAMETERS: N - number of nodes, >=1 Alpha - power-law coefficient, Alpha>-1 Beta - power-law coefficient, Beta>-1 OUTPUT PARAMETERS: Info - error code: * -4 an error was detected when calculating weights/nodes. Alpha or Beta are too close to -1 to obtain weights/nodes with high enough accuracy, or, may be, N is too large. Try to use multiple precision version. * -3 internal eigenproblem solver hasn't converged * -1 incorrect N/Alpha/Beta was passed * +1 OK X - array[0..N-1] - array of quadrature nodes, in ascending order. W - array[0..N-1] - array of quadrature weights. -- ALGLIB -- Copyright 12.05.2009 by Bochkanov Sergey *************************************************************************/ void gqgenerategaussjacobi(const ae_int_t n, const double alpha, const double beta, ae_int_t &info, real_1d_array &x, real_1d_array &w) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::gqgenerategaussjacobi(n, alpha, beta, &info, const_cast(x.c_ptr()), const_cast(w.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Returns nodes/weights for Gauss-Laguerre quadrature on [0,+inf) with weight function W(x)=Power(x,Alpha)*Exp(-x) INPUT PARAMETERS: N - number of nodes, >=1 Alpha - power-law coefficient, Alpha>-1 OUTPUT PARAMETERS: Info - error code: * -4 an error was detected when calculating weights/nodes. Alpha is too close to -1 to obtain weights/nodes with high enough accuracy or, may be, N is too large. Try to use multiple precision version. * -3 internal eigenproblem solver hasn't converged * -1 incorrect N/Alpha was passed * +1 OK X - array[0..N-1] - array of quadrature nodes, in ascending order. W - array[0..N-1] - array of quadrature weights. 
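EXAMPLE (illustrative sketch of the usage pattern shared by the GQGenerateGauss* family,
shown for GQGenerateGaussLaguerre; the integrand cos(x) is chosen only for demonstration):

    // 20-point Gauss-Laguerre rule with Alpha=0, i.e. weight exp(-x) on [0,+inf);
    // integral(exp(-x)*cos(x) dx, 0, +inf) = 0.5 is approximated by sum(w[i]*cos(x[i]))
    alglib::real_1d_array x, w;
    alglib::ae_int_t info;
    alglib::gqgenerategausslaguerre(20, 0.0, info, x, w);
    double s = 0.0;
    if( info>0 )
        for(int i=0; i<x.length(); i++)
            s = s+w[i]*cos(x[i]);
    // Info>0 indicates success, negative values are the error codes described in this section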
-- ALGLIB -- Copyright 12.05.2009 by Bochkanov Sergey *************************************************************************/ void gqgenerategausslaguerre(const ae_int_t n, const double alpha, ae_int_t &info, real_1d_array &x, real_1d_array &w) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::gqgenerategausslaguerre(n, alpha, &info, const_cast(x.c_ptr()), const_cast(w.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Returns nodes/weights for Gauss-Hermite quadrature on (-inf,+inf) with weight function W(x)=Exp(-x*x) INPUT PARAMETERS: N - number of nodes, >=1 OUTPUT PARAMETERS: Info - error code: * -4 an error was detected when calculating weights/nodes. May be, N is too large. Try to use multiple precision version. * -3 internal eigenproblem solver hasn't converged * -1 incorrect N/Alpha was passed * +1 OK X - array[0..N-1] - array of quadrature nodes, in ascending order. W - array[0..N-1] - array of quadrature weights. -- ALGLIB -- Copyright 12.05.2009 by Bochkanov Sergey *************************************************************************/ void gqgenerategausshermite(const ae_int_t n, ae_int_t &info, real_1d_array &x, real_1d_array &w) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::gqgenerategausshermite(n, &info, const_cast(x.c_ptr()), const_cast(w.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Computation of nodes and weights of a Gauss-Kronrod quadrature formula The algorithm generates the N-point Gauss-Kronrod quadrature formula with weight function given by coefficients alpha and beta of a recurrence relation which generates a system of orthogonal polynomials: P-1(x) = 0 P0(x) = 1 Pn+1(x) = (x-alpha(n))*Pn(x) - beta(n)*Pn-1(x) and zero moment Mu0 Mu0 = integral(W(x)dx,a,b) INPUT PARAMETERS: Alpha - alpha coefficients, array[0..floor(3*K/2)]. Beta - beta coefficients, array[0..ceil(3*K/2)]. Beta[0] is not used and may be arbitrary. Beta[I]>0. Mu0 - zeroth moment of the weight function. N - number of nodes of the Gauss-Kronrod quadrature formula, N >= 3, N = 2*K+1. OUTPUT PARAMETERS: Info - error code: * -5 no real and positive Gauss-Kronrod formula can be created for such a weight function with a given number of nodes. * -4 N is too large, task may be ill conditioned - x[i]=x[i+1] found. * -3 internal eigenproblem solver hasn't converged * -2 Beta[i]<=0 * -1 incorrect N was passed * +1 OK X - array[0..N-1] - array of quadrature nodes, in ascending order. WKronrod - array[0..N-1] - Kronrod weights WGauss - array[0..N-1] - Gauss weights (interleaved with zeros corresponding to extended Kronrod nodes). 
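EXAMPLE (illustrative sketch; the coefficient counts and the Legendre recurrence below
mirror what GKQLegendreCalc does internally when it calls this function):

    // 15-point Gauss-Kronrod-Legendre rule (K=7 embedded Gauss nodes, N=2*K+1).
    // Alpha needs floor(3*K/2)+1 entries, Beta needs ceil(3*K/2)+1 entries.
    alglib::ae_int_t n = 15, k = n/2;
    alglib::ae_int_t alen = (3*k)/2+1, blen = (3*k+1)/2+1;
    alglib::real_1d_array alpha, beta, x, wkronrod, wgauss;
    alglib::ae_int_t info;
    alpha.setlength(alen);
    beta.setlength(blen);
    for(int i=0; i<alen; i++)
        alpha[i] = 0.0;
    for(int i=0; i<blen; i++)
        beta[i] = (i==0 ? 2.0 : 1.0/(4.0-1.0/(double(i)*double(i))));
    alglib::gkqgeneraterec(alpha, beta, 2.0, n, info, x, wkronrod, wgauss);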
-- ALGLIB -- Copyright 08.05.2009 by Bochkanov Sergey *************************************************************************/ void gkqgeneraterec(const real_1d_array &alpha, const real_1d_array &beta, const double mu0, const ae_int_t n, ae_int_t &info, real_1d_array &x, real_1d_array &wkronrod, real_1d_array &wgauss) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::gkqgeneraterec(const_cast(alpha.c_ptr()), const_cast(beta.c_ptr()), mu0, n, &info, const_cast(x.c_ptr()), const_cast(wkronrod.c_ptr()), const_cast(wgauss.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Returns Gauss and Gauss-Kronrod nodes/weights for Gauss-Legendre quadrature with N points. GKQLegendreCalc (calculation) or GKQLegendreTbl (precomputed table) is used depending on machine precision and number of nodes. INPUT PARAMETERS: N - number of Kronrod nodes, must be odd number, >=3. OUTPUT PARAMETERS: Info - error code: * -4 an error was detected when calculating weights/nodes. N is too large to obtain weights/nodes with high enough accuracy. Try to use multiple precision version. * -3 internal eigenproblem solver hasn't converged * -1 incorrect N was passed * +1 OK X - array[0..N-1] - array of quadrature nodes, ordered in ascending order. WKronrod - array[0..N-1] - Kronrod weights WGauss - array[0..N-1] - Gauss weights (interleaved with zeros corresponding to extended Kronrod nodes). -- ALGLIB -- Copyright 12.05.2009 by Bochkanov Sergey *************************************************************************/ void gkqgenerategausslegendre(const ae_int_t n, ae_int_t &info, real_1d_array &x, real_1d_array &wkronrod, real_1d_array &wgauss) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::gkqgenerategausslegendre(n, &info, const_cast(x.c_ptr()), const_cast(wkronrod.c_ptr()), const_cast(wgauss.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Returns Gauss and Gauss-Kronrod nodes/weights for Gauss-Jacobi quadrature on [-1,1] with weight function W(x)=Power(1-x,Alpha)*Power(1+x,Beta). INPUT PARAMETERS: N - number of Kronrod nodes, must be odd number, >=3. Alpha - power-law coefficient, Alpha>-1 Beta - power-law coefficient, Beta>-1 OUTPUT PARAMETERS: Info - error code: * -5 no real and positive Gauss-Kronrod formula can be created for such a weight function with a given number of nodes. * -4 an error was detected when calculating weights/nodes. Alpha or Beta are too close to -1 to obtain weights/nodes with high enough accuracy, or, may be, N is too large. Try to use multiple precision version. * -3 internal eigenproblem solver hasn't converged * -1 incorrect N was passed * +1 OK * +2 OK, but quadrature rule have exterior nodes, x[0]<-1 or x[n-1]>+1 X - array[0..N-1] - array of quadrature nodes, ordered in ascending order. WKronrod - array[0..N-1] - Kronrod weights WGauss - array[0..N-1] - Gauss weights (interleaved with zeros corresponding to extended Kronrod nodes). 
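EXAMPLE (illustrative sketch; exp(x) is an arbitrary demonstration integrand):

    // 15-point Kronrod rule for the Jacobi weight (1-x)^0.25*(1+x)^0.25 on [-1,+1].
    // Because WGauss is interleaved with zeros at the Kronrod-only nodes, both sums may
    // run over all N nodes; their difference is an error estimate for the Gauss result.
    alglib::real_1d_array x, wkronrod, wgauss;
    alglib::ae_int_t info;
    alglib::gkqgenerategaussjacobi(15, 0.25, 0.25, info, x, wkronrod, wgauss);
    double ik = 0.0, ig = 0.0;
    if( info>0 )
        for(int i=0; i<x.length(); i++)
        {
            ik = ik+wkronrod[i]*exp(x[i]);
            ig = ig+wgauss[i]*exp(x[i]);
        }
    double errest = fabs(ik-ig);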
-- ALGLIB -- Copyright 12.05.2009 by Bochkanov Sergey *************************************************************************/ void gkqgenerategaussjacobi(const ae_int_t n, const double alpha, const double beta, ae_int_t &info, real_1d_array &x, real_1d_array &wkronrod, real_1d_array &wgauss) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::gkqgenerategaussjacobi(n, alpha, beta, &info, const_cast(x.c_ptr()), const_cast(wkronrod.c_ptr()), const_cast(wgauss.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Returns Gauss and Gauss-Kronrod nodes for quadrature with N points. Reduction to tridiagonal eigenproblem is used. INPUT PARAMETERS: N - number of Kronrod nodes, must be odd number, >=3. OUTPUT PARAMETERS: Info - error code: * -4 an error was detected when calculating weights/nodes. N is too large to obtain weights/nodes with high enough accuracy. Try to use multiple precision version. * -3 internal eigenproblem solver hasn't converged * -1 incorrect N was passed * +1 OK X - array[0..N-1] - array of quadrature nodes, ordered in ascending order. WKronrod - array[0..N-1] - Kronrod weights WGauss - array[0..N-1] - Gauss weights (interleaved with zeros corresponding to extended Kronrod nodes). -- ALGLIB -- Copyright 12.05.2009 by Bochkanov Sergey *************************************************************************/ void gkqlegendrecalc(const ae_int_t n, ae_int_t &info, real_1d_array &x, real_1d_array &wkronrod, real_1d_array &wgauss) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::gkqlegendrecalc(n, &info, const_cast(x.c_ptr()), const_cast(wkronrod.c_ptr()), const_cast(wgauss.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Returns Gauss and Gauss-Kronrod nodes for quadrature with N points using pre-calculated table. Nodes/weights were computed with accuracy up to 1.0E-32 (if MPFR version of ALGLIB is used). In standard double precision accuracy reduces to something about 2.0E-16 (depending on your compiler's handling of long floating point constants). INPUT PARAMETERS: N - number of Kronrod nodes. N can be 15, 21, 31, 41, 51, 61. OUTPUT PARAMETERS: X - array[0..N-1] - array of quadrature nodes, ordered in ascending order. WKronrod - array[0..N-1] - Kronrod weights WGauss - array[0..N-1] - Gauss weights (interleaved with zeros corresponding to extended Kronrod nodes). 
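EXAMPLE (illustrative sketch):

    // fetch the precomputed 21-point Kronrod/Gauss-Legendre table; Eps reports the
    // accuracy of the stored constants (about 2.0E-16 when compiled in double precision)
    alglib::real_1d_array x, wkronrod, wgauss;
    double eps;
    alglib::gkqlegendretbl(21, x, wkronrod, wgauss, eps);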
-- ALGLIB -- Copyright 12.05.2009 by Bochkanov Sergey *************************************************************************/ void gkqlegendretbl(const ae_int_t n, real_1d_array &x, real_1d_array &wkronrod, real_1d_array &wgauss, double &eps) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::gkqlegendretbl(n, const_cast(x.c_ptr()), const_cast(wkronrod.c_ptr()), const_cast(wgauss.c_ptr()), &eps, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Integration report: * TerminationType = completetion code: * -5 non-convergence of Gauss-Kronrod nodes calculation subroutine. * -1 incorrect parameters were specified * 1 OK * Rep.NFEV countains number of function calculations * Rep.NIntervals contains number of intervals [a,b] was partitioned into. *************************************************************************/ _autogkreport_owner::_autogkreport_owner() { p_struct = (alglib_impl::autogkreport*)alglib_impl::ae_malloc(sizeof(alglib_impl::autogkreport), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_autogkreport_init(p_struct, NULL); } _autogkreport_owner::_autogkreport_owner(const _autogkreport_owner &rhs) { p_struct = (alglib_impl::autogkreport*)alglib_impl::ae_malloc(sizeof(alglib_impl::autogkreport), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_autogkreport_init_copy(p_struct, const_cast(rhs.p_struct), NULL); } _autogkreport_owner& _autogkreport_owner::operator=(const _autogkreport_owner &rhs) { if( this==&rhs ) return *this; alglib_impl::_autogkreport_clear(p_struct); alglib_impl::_autogkreport_init_copy(p_struct, const_cast(rhs.p_struct), NULL); return *this; } _autogkreport_owner::~_autogkreport_owner() { alglib_impl::_autogkreport_clear(p_struct); ae_free(p_struct); } alglib_impl::autogkreport* _autogkreport_owner::c_ptr() { return p_struct; } alglib_impl::autogkreport* _autogkreport_owner::c_ptr() const { return const_cast(p_struct); } autogkreport::autogkreport() : _autogkreport_owner() ,terminationtype(p_struct->terminationtype),nfev(p_struct->nfev),nintervals(p_struct->nintervals) { } autogkreport::autogkreport(const autogkreport &rhs):_autogkreport_owner(rhs) ,terminationtype(p_struct->terminationtype),nfev(p_struct->nfev),nintervals(p_struct->nintervals) { } autogkreport& autogkreport::operator=(const autogkreport &rhs) { if( this==&rhs ) return *this; _autogkreport_owner::operator=(rhs); return *this; } autogkreport::~autogkreport() { } /************************************************************************* This structure stores state of the integration algorithm. Although this class has public fields, they are not intended for external use. You should use ALGLIB functions to work with this class: * autogksmooth()/AutoGKSmoothW()/... 
to create objects * autogkintegrate() to begin integration * autogkresults() to get results *************************************************************************/ _autogkstate_owner::_autogkstate_owner() { p_struct = (alglib_impl::autogkstate*)alglib_impl::ae_malloc(sizeof(alglib_impl::autogkstate), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_autogkstate_init(p_struct, NULL); } _autogkstate_owner::_autogkstate_owner(const _autogkstate_owner &rhs) { p_struct = (alglib_impl::autogkstate*)alglib_impl::ae_malloc(sizeof(alglib_impl::autogkstate), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_autogkstate_init_copy(p_struct, const_cast(rhs.p_struct), NULL); } _autogkstate_owner& _autogkstate_owner::operator=(const _autogkstate_owner &rhs) { if( this==&rhs ) return *this; alglib_impl::_autogkstate_clear(p_struct); alglib_impl::_autogkstate_init_copy(p_struct, const_cast(rhs.p_struct), NULL); return *this; } _autogkstate_owner::~_autogkstate_owner() { alglib_impl::_autogkstate_clear(p_struct); ae_free(p_struct); } alglib_impl::autogkstate* _autogkstate_owner::c_ptr() { return p_struct; } alglib_impl::autogkstate* _autogkstate_owner::c_ptr() const { return const_cast(p_struct); } autogkstate::autogkstate() : _autogkstate_owner() ,needf(p_struct->needf),x(p_struct->x),xminusa(p_struct->xminusa),bminusx(p_struct->bminusx),f(p_struct->f) { } autogkstate::autogkstate(const autogkstate &rhs):_autogkstate_owner(rhs) ,needf(p_struct->needf),x(p_struct->x),xminusa(p_struct->xminusa),bminusx(p_struct->bminusx),f(p_struct->f) { } autogkstate& autogkstate::operator=(const autogkstate &rhs) { if( this==&rhs ) return *this; _autogkstate_owner::operator=(rhs); return *this; } autogkstate::~autogkstate() { } /************************************************************************* Integration of a smooth function F(x) on a finite interval [a,b]. Fast-convergent algorithm based on a Gauss-Kronrod formula is used. Result is calculated with accuracy close to the machine precision. Algorithm works well only with smooth integrands. It may be used with continuous non-smooth integrands, but with less performance. It should never be used with integrands which have integrable singularities at lower or upper limits - algorithm may crash. Use AutoGKSingular in such cases. INPUT PARAMETERS: A, B - interval boundaries (AB) OUTPUT PARAMETERS State - structure which stores algorithm state SEE ALSO AutoGKSmoothW, AutoGKSingular, AutoGKResults. -- ALGLIB -- Copyright 06.05.2009 by Bochkanov Sergey *************************************************************************/ void autogksmooth(const double a, const double b, autogkstate &state) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::autogksmooth(a, b, const_cast(state.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Integration of a smooth function F(x) on a finite interval [a,b]. This subroutine is same as AutoGKSmooth(), but it guarantees that interval [a,b] is partitioned into subintervals which have width at most XWidth. Subroutine can be used when integrating nearly-constant function with narrow "bumps" (about XWidth wide). If "bumps" are too narrow, AutoGKSmooth subroutine can overlook them. 
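EXAMPLE (illustrative sketch of the driver sequence shared by AutoGKSmooth and
AutoGKSmoothW; the integrand exp(x) on [0,1] is chosen only for demonstration):

    void demo_integrand(double x, double xminusa, double bminusx, double &y, void *ptr)
    {
        // xminusa and bminusx support singular integrands and are not needed here
        y = exp(x);
    }

    // ... in the calling code:
    alglib::autogkstate s;
    alglib::autogkreport rep;
    double v;
    alglib::autogksmooth(0.0, 1.0, s);             // or: autogksmoothw(0.0, 1.0, 0.05, s)
    alglib::autogkintegrate(s, demo_integrand, NULL);
    alglib::autogkresults(s, v, rep);
    // v is approximately exp(1)-1, and rep.terminationtype>0 indicates success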
INPUT PARAMETERS: A, B - interval boundaries (AB) OUTPUT PARAMETERS State - structure which stores algorithm state SEE ALSO AutoGKSmooth, AutoGKSingular, AutoGKResults. -- ALGLIB -- Copyright 06.05.2009 by Bochkanov Sergey *************************************************************************/ void autogksmoothw(const double a, const double b, const double xwidth, autogkstate &state) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::autogksmoothw(a, b, xwidth, const_cast(state.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Integration on a finite interval [A,B]. Integrand have integrable singularities at A/B. F(X) must diverge as "(x-A)^alpha" at A, as "(B-x)^beta" at B, with known alpha/beta (alpha>-1, beta>-1). If alpha/beta are not known, estimates from below can be used (but these estimates should be greater than -1 too). One of alpha/beta variables (or even both alpha/beta) may be equal to 0, which means than function F(x) is non-singular at A/B. Anyway (singular at bounds or not), function F(x) is supposed to be continuous on (A,B). Fast-convergent algorithm based on a Gauss-Kronrod formula is used. Result is calculated with accuracy close to the machine precision. INPUT PARAMETERS: A, B - interval boundaries (AB) Alpha - power-law coefficient of the F(x) at A, Alpha>-1 Beta - power-law coefficient of the F(x) at B, Beta>-1 OUTPUT PARAMETERS State - structure which stores algorithm state SEE ALSO AutoGKSmooth, AutoGKSmoothW, AutoGKResults. -- ALGLIB -- Copyright 06.05.2009 by Bochkanov Sergey *************************************************************************/ void autogksingular(const double a, const double b, const double alpha, const double beta, autogkstate &state) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::autogksingular(a, b, alpha, beta, const_cast(state.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function provides reverse communication interface Reverse communication interface is not documented or recommended to use. 
See below for functions which provide better documented API *************************************************************************/ bool autogkiteration(const autogkstate &state) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { ae_bool result = alglib_impl::autogkiteration(const_cast(state.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void autogkintegrate(autogkstate &state, void (*func)(double x, double xminusa, double bminusx, double &y, void *ptr), void *ptr){ alglib_impl::ae_state _alglib_env_state; if( func==NULL ) throw ap_error("ALGLIB: error in 'autogkintegrate()' (func is NULL)"); alglib_impl::ae_state_init(&_alglib_env_state); try { while( alglib_impl::autogkiteration(state.c_ptr(), &_alglib_env_state) ) { if( state.needf ) { func(state.x, state.xminusa, state.bminusx, state.f, ptr); continue; } throw ap_error("ALGLIB: unexpected error in 'autogkintegrate()'"); } alglib_impl::ae_state_clear(&_alglib_env_state); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Adaptive integration results Called after AutoGKIteration returned False. Input parameters: State - algorithm state (used by AutoGKIteration). Output parameters: V - integral(f(x)dx,a,b) Rep - optimization report (see AutoGKReport description) -- ALGLIB -- Copyright 14.11.2007 by Bochkanov Sergey *************************************************************************/ void autogkresults(const autogkstate &state, double &v, autogkreport &rep) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::autogkresults(const_cast(state.c_ptr()), &v, const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } } ///////////////////////////////////////////////////////////////////////// // // THIS SECTION CONTAINS IMPLEMENTATION OF COMPUTATIONAL CORE // ///////////////////////////////////////////////////////////////////////// namespace alglib_impl { static ae_int_t autogk_maxsubintervals = 10000; static void autogk_autogkinternalprepare(double a, double b, double eps, double xwidth, autogkinternalstate* state, ae_state *_state); static ae_bool autogk_autogkinternaliteration(autogkinternalstate* state, ae_state *_state); static void autogk_mheappop(/* Real */ ae_matrix* heap, ae_int_t heapsize, ae_int_t heapwidth, ae_state *_state); static void autogk_mheappush(/* Real */ ae_matrix* heap, ae_int_t heapsize, ae_int_t heapwidth, ae_state *_state); static void autogk_mheapresize(/* Real */ ae_matrix* heap, ae_int_t* heapsize, ae_int_t newheapsize, ae_int_t heapwidth, ae_state *_state); /************************************************************************* Computation of nodes and weights for a Gauss quadrature formula The algorithm generates the N-point Gauss quadrature formula with weight function given by coefficients alpha and beta of a recurrence relation which generates a system of orthogonal polynomials: P-1(x) = 0 P0(x) = 1 Pn+1(x) = (x-alpha(n))*Pn(x) - beta(n)*Pn-1(x) and zeroth moment Mu0 Mu0 = integral(W(x)dx,a,b) INPUT PARAMETERS: Alpha - array[0..N-1], alpha coefficients Beta - array[0..N-1], beta coefficients Zero-indexed 
element is not used and may be arbitrary. Beta[I]>0. Mu0 - zeroth moment of the weight function. N - number of nodes of the quadrature formula, N>=1 OUTPUT PARAMETERS: Info - error code: * -3 internal eigenproblem solver hasn't converged * -2 Beta[i]<=0 * -1 incorrect N was passed * 1 OK X - array[0..N-1] - array of quadrature nodes, in ascending order. W - array[0..N-1] - array of quadrature weights. -- ALGLIB -- Copyright 2005-2009 by Bochkanov Sergey *************************************************************************/ void gqgeneraterec(/* Real */ ae_vector* alpha, /* Real */ ae_vector* beta, double mu0, ae_int_t n, ae_int_t* info, /* Real */ ae_vector* x, /* Real */ ae_vector* w, ae_state *_state) { ae_frame _frame_block; ae_int_t i; ae_vector d; ae_vector e; ae_matrix z; ae_frame_make(_state, &_frame_block); *info = 0; ae_vector_clear(x); ae_vector_clear(w); ae_vector_init(&d, 0, DT_REAL, _state); ae_vector_init(&e, 0, DT_REAL, _state); ae_matrix_init(&z, 0, 0, DT_REAL, _state); if( n<1 ) { *info = -1; ae_frame_leave(_state); return; } *info = 1; /* * Initialize */ ae_vector_set_length(&d, n, _state); ae_vector_set_length(&e, n, _state); for(i=1; i<=n-1; i++) { d.ptr.p_double[i-1] = alpha->ptr.p_double[i-1]; if( ae_fp_less_eq(beta->ptr.p_double[i],(double)(0)) ) { *info = -2; ae_frame_leave(_state); return; } e.ptr.p_double[i-1] = ae_sqrt(beta->ptr.p_double[i], _state); } d.ptr.p_double[n-1] = alpha->ptr.p_double[n-1]; /* * EVD */ if( !smatrixtdevd(&d, &e, n, 3, &z, _state) ) { *info = -3; ae_frame_leave(_state); return; } /* * Generate */ ae_vector_set_length(x, n, _state); ae_vector_set_length(w, n, _state); for(i=1; i<=n; i++) { x->ptr.p_double[i-1] = d.ptr.p_double[i-1]; w->ptr.p_double[i-1] = mu0*ae_sqr(z.ptr.pp_double[0][i-1], _state); } ae_frame_leave(_state); } /************************************************************************* Computation of nodes and weights for a Gauss-Lobatto quadrature formula The algorithm generates the N-point Gauss-Lobatto quadrature formula with weight function given by coefficients alpha and beta of a recurrence which generates a system of orthogonal polynomials. P-1(x) = 0 P0(x) = 1 Pn+1(x) = (x-alpha(n))*Pn(x) - beta(n)*Pn-1(x) and zeroth moment Mu0 Mu0 = integral(W(x)dx,a,b) INPUT PARAMETERS: Alpha - array[0..N-2], alpha coefficients Beta - array[0..N-2], beta coefficients. Zero-indexed element is not used, may be arbitrary. Beta[I]>0 Mu0 - zeroth moment of the weighting function. A - left boundary of the integration interval. B - right boundary of the integration interval. N - number of nodes of the quadrature formula, N>=3 (including the left and right boundary nodes). OUTPUT PARAMETERS: Info - error code: * -3 internal eigenproblem solver hasn't converged * -2 Beta[i]<=0 * -1 incorrect N was passed * 1 OK X - array[0..N-1] - array of quadrature nodes, in ascending order. W - array[0..N-1] - array of quadrature weights. 
-- ALGLIB -- Copyright 2005-2009 by Bochkanov Sergey *************************************************************************/ void gqgenerategausslobattorec(/* Real */ ae_vector* alpha, /* Real */ ae_vector* beta, double mu0, double a, double b, ae_int_t n, ae_int_t* info, /* Real */ ae_vector* x, /* Real */ ae_vector* w, ae_state *_state) { ae_frame _frame_block; ae_vector _alpha; ae_vector _beta; ae_int_t i; ae_vector d; ae_vector e; ae_matrix z; double pim1a; double pia; double pim1b; double pib; double t; double a11; double a12; double a21; double a22; double b1; double b2; double alph; double bet; ae_frame_make(_state, &_frame_block); ae_vector_init_copy(&_alpha, alpha, _state); alpha = &_alpha; ae_vector_init_copy(&_beta, beta, _state); beta = &_beta; *info = 0; ae_vector_clear(x); ae_vector_clear(w); ae_vector_init(&d, 0, DT_REAL, _state); ae_vector_init(&e, 0, DT_REAL, _state); ae_matrix_init(&z, 0, 0, DT_REAL, _state); if( n<=2 ) { *info = -1; ae_frame_leave(_state); return; } *info = 1; /* * Initialize, D[1:N+1], E[1:N] */ n = n-2; ae_vector_set_length(&d, n+2, _state); ae_vector_set_length(&e, n+1, _state); for(i=1; i<=n+1; i++) { d.ptr.p_double[i-1] = alpha->ptr.p_double[i-1]; } for(i=1; i<=n; i++) { if( ae_fp_less_eq(beta->ptr.p_double[i],(double)(0)) ) { *info = -2; ae_frame_leave(_state); return; } e.ptr.p_double[i-1] = ae_sqrt(beta->ptr.p_double[i], _state); } /* * Caclulate Pn(a), Pn+1(a), Pn(b), Pn+1(b) */ beta->ptr.p_double[0] = (double)(0); pim1a = (double)(0); pia = (double)(1); pim1b = (double)(0); pib = (double)(1); for(i=1; i<=n+1; i++) { /* * Pi(a) */ t = (a-alpha->ptr.p_double[i-1])*pia-beta->ptr.p_double[i-1]*pim1a; pim1a = pia; pia = t; /* * Pi(b) */ t = (b-alpha->ptr.p_double[i-1])*pib-beta->ptr.p_double[i-1]*pim1b; pim1b = pib; pib = t; } /* * Calculate alpha'(n+1), beta'(n+1) */ a11 = pia; a12 = pim1a; a21 = pib; a22 = pim1b; b1 = a*pia; b2 = b*pib; if( ae_fp_greater(ae_fabs(a11, _state),ae_fabs(a21, _state)) ) { a22 = a22-a12*a21/a11; b2 = b2-b1*a21/a11; bet = b2/a22; alph = (b1-bet*a12)/a11; } else { a12 = a12-a22*a11/a21; b1 = b1-b2*a11/a21; bet = b1/a12; alph = (b2-bet*a22)/a21; } if( ae_fp_less(bet,(double)(0)) ) { *info = -3; ae_frame_leave(_state); return; } d.ptr.p_double[n+1] = alph; e.ptr.p_double[n] = ae_sqrt(bet, _state); /* * EVD */ if( !smatrixtdevd(&d, &e, n+2, 3, &z, _state) ) { *info = -3; ae_frame_leave(_state); return; } /* * Generate */ ae_vector_set_length(x, n+2, _state); ae_vector_set_length(w, n+2, _state); for(i=1; i<=n+2; i++) { x->ptr.p_double[i-1] = d.ptr.p_double[i-1]; w->ptr.p_double[i-1] = mu0*ae_sqr(z.ptr.pp_double[0][i-1], _state); } ae_frame_leave(_state); } /************************************************************************* Computation of nodes and weights for a Gauss-Radau quadrature formula The algorithm generates the N-point Gauss-Radau quadrature formula with weight function given by the coefficients alpha and beta of a recurrence which generates a system of orthogonal polynomials. P-1(x) = 0 P0(x) = 1 Pn+1(x) = (x-alpha(n))*Pn(x) - beta(n)*Pn-1(x) and zeroth moment Mu0 Mu0 = integral(W(x)dx,a,b) INPUT PARAMETERS: Alpha - array[0..N-2], alpha coefficients. Beta - array[0..N-1], beta coefficients Zero-indexed element is not used. Beta[I]>0 Mu0 - zeroth moment of the weighting function. A - left boundary of the integration interval. N - number of nodes of the quadrature formula, N>=2 (including the left boundary node). 
OUTPUT PARAMETERS: Info - error code: * -3 internal eigenproblem solver hasn't converged * -2 Beta[i]<=0 * -1 incorrect N was passed * 1 OK X - array[0..N-1] - array of quadrature nodes, in ascending order. W - array[0..N-1] - array of quadrature weights. -- ALGLIB -- Copyright 2005-2009 by Bochkanov Sergey *************************************************************************/ void gqgenerategaussradaurec(/* Real */ ae_vector* alpha, /* Real */ ae_vector* beta, double mu0, double a, ae_int_t n, ae_int_t* info, /* Real */ ae_vector* x, /* Real */ ae_vector* w, ae_state *_state) { ae_frame _frame_block; ae_vector _alpha; ae_vector _beta; ae_int_t i; ae_vector d; ae_vector e; ae_matrix z; double polim1; double poli; double t; ae_frame_make(_state, &_frame_block); ae_vector_init_copy(&_alpha, alpha, _state); alpha = &_alpha; ae_vector_init_copy(&_beta, beta, _state); beta = &_beta; *info = 0; ae_vector_clear(x); ae_vector_clear(w); ae_vector_init(&d, 0, DT_REAL, _state); ae_vector_init(&e, 0, DT_REAL, _state); ae_matrix_init(&z, 0, 0, DT_REAL, _state); if( n<2 ) { *info = -1; ae_frame_leave(_state); return; } *info = 1; /* * Initialize, D[1:N], E[1:N] */ n = n-1; ae_vector_set_length(&d, n+1, _state); ae_vector_set_length(&e, n, _state); for(i=1; i<=n; i++) { d.ptr.p_double[i-1] = alpha->ptr.p_double[i-1]; if( ae_fp_less_eq(beta->ptr.p_double[i],(double)(0)) ) { *info = -2; ae_frame_leave(_state); return; } e.ptr.p_double[i-1] = ae_sqrt(beta->ptr.p_double[i], _state); } /* * Caclulate Pn(a), Pn-1(a), and D[N+1] */ beta->ptr.p_double[0] = (double)(0); polim1 = (double)(0); poli = (double)(1); for(i=1; i<=n; i++) { t = (a-alpha->ptr.p_double[i-1])*poli-beta->ptr.p_double[i-1]*polim1; polim1 = poli; poli = t; } d.ptr.p_double[n] = a-beta->ptr.p_double[n]*polim1/poli; /* * EVD */ if( !smatrixtdevd(&d, &e, n+1, 3, &z, _state) ) { *info = -3; ae_frame_leave(_state); return; } /* * Generate */ ae_vector_set_length(x, n+1, _state); ae_vector_set_length(w, n+1, _state); for(i=1; i<=n+1; i++) { x->ptr.p_double[i-1] = d.ptr.p_double[i-1]; w->ptr.p_double[i-1] = mu0*ae_sqr(z.ptr.pp_double[0][i-1], _state); } ae_frame_leave(_state); } /************************************************************************* Returns nodes/weights for Gauss-Legendre quadrature on [-1,1] with N nodes. INPUT PARAMETERS: N - number of nodes, >=1 OUTPUT PARAMETERS: Info - error code: * -4 an error was detected when calculating weights/nodes. N is too large to obtain weights/nodes with high enough accuracy. Try to use multiple precision version. * -3 internal eigenproblem solver hasn't converged * -1 incorrect N was passed * +1 OK X - array[0..N-1] - array of quadrature nodes, in ascending order. W - array[0..N-1] - array of quadrature weights. 
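EXAMPLE (illustrative sketch of driving the computational-core version directly through
ae_vector/ae_state; the alglib::gqgenerategausslegendre wrapper in the C++ interface
section above is the usual entry point):

    ae_state _state;
    ae_vector x, w;
    ae_int_t info;
    double s;
    ae_state_init(&_state);
    ae_vector_init(&x, 0, DT_REAL, &_state);
    ae_vector_init(&w, 0, DT_REAL, &_state);
    gqgenerategausslegendre(10, &info, &x, &w, &_state);
    s = 0.0;
    for(ae_int_t i=0; i<x.cnt; i++)
        s = s+w.ptr.p_double[i];              // weights of the 10-point rule sum to Mu0=2
    ae_vector_clear(&x);
    ae_vector_clear(&w);
    ae_state_clear(&_state);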
-- ALGLIB -- Copyright 12.05.2009 by Bochkanov Sergey *************************************************************************/ void gqgenerategausslegendre(ae_int_t n, ae_int_t* info, /* Real */ ae_vector* x, /* Real */ ae_vector* w, ae_state *_state) { ae_frame _frame_block; ae_vector alpha; ae_vector beta; ae_int_t i; ae_frame_make(_state, &_frame_block); *info = 0; ae_vector_clear(x); ae_vector_clear(w); ae_vector_init(&alpha, 0, DT_REAL, _state); ae_vector_init(&beta, 0, DT_REAL, _state); if( n<1 ) { *info = -1; ae_frame_leave(_state); return; } ae_vector_set_length(&alpha, n, _state); ae_vector_set_length(&beta, n, _state); for(i=0; i<=n-1; i++) { alpha.ptr.p_double[i] = (double)(0); } beta.ptr.p_double[0] = (double)(2); for(i=1; i<=n-1; i++) { beta.ptr.p_double[i] = 1/(4-1/ae_sqr((double)(i), _state)); } gqgeneraterec(&alpha, &beta, beta.ptr.p_double[0], n, info, x, w, _state); /* * test basic properties to detect errors */ if( *info>0 ) { if( ae_fp_less(x->ptr.p_double[0],(double)(-1))||ae_fp_greater(x->ptr.p_double[n-1],(double)(1)) ) { *info = -4; } for(i=0; i<=n-2; i++) { if( ae_fp_greater_eq(x->ptr.p_double[i],x->ptr.p_double[i+1]) ) { *info = -4; } } } ae_frame_leave(_state); } /************************************************************************* Returns nodes/weights for Gauss-Jacobi quadrature on [-1,1] with weight function W(x)=Power(1-x,Alpha)*Power(1+x,Beta). INPUT PARAMETERS: N - number of nodes, >=1 Alpha - power-law coefficient, Alpha>-1 Beta - power-law coefficient, Beta>-1 OUTPUT PARAMETERS: Info - error code: * -4 an error was detected when calculating weights/nodes. Alpha or Beta are too close to -1 to obtain weights/nodes with high enough accuracy, or, may be, N is too large. Try to use multiple precision version. * -3 internal eigenproblem solver hasn't converged * -1 incorrect N/Alpha/Beta was passed * +1 OK X - array[0..N-1] - array of quadrature nodes, in ascending order. W - array[0..N-1] - array of quadrature weights. 
-- ALGLIB -- Copyright 12.05.2009 by Bochkanov Sergey *************************************************************************/ void gqgenerategaussjacobi(ae_int_t n, double alpha, double beta, ae_int_t* info, /* Real */ ae_vector* x, /* Real */ ae_vector* w, ae_state *_state) { ae_frame _frame_block; ae_vector a; ae_vector b; double alpha2; double beta2; double apb; double t; ae_int_t i; double s; ae_frame_make(_state, &_frame_block); *info = 0; ae_vector_clear(x); ae_vector_clear(w); ae_vector_init(&a, 0, DT_REAL, _state); ae_vector_init(&b, 0, DT_REAL, _state); if( (n<1||ae_fp_less_eq(alpha,(double)(-1)))||ae_fp_less_eq(beta,(double)(-1)) ) { *info = -1; ae_frame_leave(_state); return; } ae_vector_set_length(&a, n, _state); ae_vector_set_length(&b, n, _state); apb = alpha+beta; a.ptr.p_double[0] = (beta-alpha)/(apb+2); t = (apb+1)*ae_log((double)(2), _state)+lngamma(alpha+1, &s, _state)+lngamma(beta+1, &s, _state)-lngamma(apb+2, &s, _state); if( ae_fp_greater(t,ae_log(ae_maxrealnumber, _state)) ) { *info = -4; ae_frame_leave(_state); return; } b.ptr.p_double[0] = ae_exp(t, _state); if( n>1 ) { alpha2 = ae_sqr(alpha, _state); beta2 = ae_sqr(beta, _state); a.ptr.p_double[1] = (beta2-alpha2)/((apb+2)*(apb+4)); b.ptr.p_double[1] = 4*(alpha+1)*(beta+1)/((apb+3)*ae_sqr(apb+2, _state)); for(i=2; i<=n-1; i++) { a.ptr.p_double[i] = 0.25*(beta2-alpha2)/(i*i*(1+0.5*apb/i)*(1+0.5*(apb+2)/i)); b.ptr.p_double[i] = 0.25*(1+alpha/i)*(1+beta/i)*(1+apb/i)/((1+0.5*(apb+1)/i)*(1+0.5*(apb-1)/i)*ae_sqr(1+0.5*apb/i, _state)); } } gqgeneraterec(&a, &b, b.ptr.p_double[0], n, info, x, w, _state); /* * test basic properties to detect errors */ if( *info>0 ) { if( ae_fp_less(x->ptr.p_double[0],(double)(-1))||ae_fp_greater(x->ptr.p_double[n-1],(double)(1)) ) { *info = -4; } for(i=0; i<=n-2; i++) { if( ae_fp_greater_eq(x->ptr.p_double[i],x->ptr.p_double[i+1]) ) { *info = -4; } } } ae_frame_leave(_state); } /************************************************************************* Returns nodes/weights for Gauss-Laguerre quadrature on [0,+inf) with weight function W(x)=Power(x,Alpha)*Exp(-x) INPUT PARAMETERS: N - number of nodes, >=1 Alpha - power-law coefficient, Alpha>-1 OUTPUT PARAMETERS: Info - error code: * -4 an error was detected when calculating weights/nodes. Alpha is too close to -1 to obtain weights/nodes with high enough accuracy or, may be, N is too large. Try to use multiple precision version. * -3 internal eigenproblem solver hasn't converged * -1 incorrect N/Alpha was passed * +1 OK X - array[0..N-1] - array of quadrature nodes, in ascending order. W - array[0..N-1] - array of quadrature weights. 
-- ALGLIB -- Copyright 12.05.2009 by Bochkanov Sergey *************************************************************************/ void gqgenerategausslaguerre(ae_int_t n, double alpha, ae_int_t* info, /* Real */ ae_vector* x, /* Real */ ae_vector* w, ae_state *_state) { ae_frame _frame_block; ae_vector a; ae_vector b; double t; ae_int_t i; double s; ae_frame_make(_state, &_frame_block); *info = 0; ae_vector_clear(x); ae_vector_clear(w); ae_vector_init(&a, 0, DT_REAL, _state); ae_vector_init(&b, 0, DT_REAL, _state); if( n<1||ae_fp_less_eq(alpha,(double)(-1)) ) { *info = -1; ae_frame_leave(_state); return; } ae_vector_set_length(&a, n, _state); ae_vector_set_length(&b, n, _state); a.ptr.p_double[0] = alpha+1; t = lngamma(alpha+1, &s, _state); if( ae_fp_greater_eq(t,ae_log(ae_maxrealnumber, _state)) ) { *info = -4; ae_frame_leave(_state); return; } b.ptr.p_double[0] = ae_exp(t, _state); if( n>1 ) { for(i=1; i<=n-1; i++) { a.ptr.p_double[i] = 2*i+alpha+1; b.ptr.p_double[i] = i*(i+alpha); } } gqgeneraterec(&a, &b, b.ptr.p_double[0], n, info, x, w, _state); /* * test basic properties to detect errors */ if( *info>0 ) { if( ae_fp_less(x->ptr.p_double[0],(double)(0)) ) { *info = -4; } for(i=0; i<=n-2; i++) { if( ae_fp_greater_eq(x->ptr.p_double[i],x->ptr.p_double[i+1]) ) { *info = -4; } } } ae_frame_leave(_state); } /************************************************************************* Returns nodes/weights for Gauss-Hermite quadrature on (-inf,+inf) with weight function W(x)=Exp(-x*x) INPUT PARAMETERS: N - number of nodes, >=1 OUTPUT PARAMETERS: Info - error code: * -4 an error was detected when calculating weights/nodes. May be, N is too large. Try to use multiple precision version. * -3 internal eigenproblem solver hasn't converged * -1 incorrect N/Alpha was passed * +1 OK X - array[0..N-1] - array of quadrature nodes, in ascending order. W - array[0..N-1] - array of quadrature weights. -- ALGLIB -- Copyright 12.05.2009 by Bochkanov Sergey *************************************************************************/ void gqgenerategausshermite(ae_int_t n, ae_int_t* info, /* Real */ ae_vector* x, /* Real */ ae_vector* w, ae_state *_state) { ae_frame _frame_block; ae_vector a; ae_vector b; ae_int_t i; ae_frame_make(_state, &_frame_block); *info = 0; ae_vector_clear(x); ae_vector_clear(w); ae_vector_init(&a, 0, DT_REAL, _state); ae_vector_init(&b, 0, DT_REAL, _state); if( n<1 ) { *info = -1; ae_frame_leave(_state); return; } ae_vector_set_length(&a, n, _state); ae_vector_set_length(&b, n, _state); for(i=0; i<=n-1; i++) { a.ptr.p_double[i] = (double)(0); } b.ptr.p_double[0] = ae_sqrt(4*ae_atan((double)(1), _state), _state); if( n>1 ) { for(i=1; i<=n-1; i++) { b.ptr.p_double[i] = 0.5*i; } } gqgeneraterec(&a, &b, b.ptr.p_double[0], n, info, x, w, _state); /* * test basic properties to detect errors */ if( *info>0 ) { for(i=0; i<=n-2; i++) { if( ae_fp_greater_eq(x->ptr.p_double[i],x->ptr.p_double[i+1]) ) { *info = -4; } } } ae_frame_leave(_state); } /************************************************************************* Computation of nodes and weights of a Gauss-Kronrod quadrature formula The algorithm generates the N-point Gauss-Kronrod quadrature formula with weight function given by coefficients alpha and beta of a recurrence relation which generates a system of orthogonal polynomials: P-1(x) = 0 P0(x) = 1 Pn+1(x) = (x-alpha(n))*Pn(x) - beta(n)*Pn-1(x) and zero moment Mu0 Mu0 = integral(W(x)dx,a,b) INPUT PARAMETERS: Alpha - alpha coefficients, array[0..floor(3*K/2)]. 
Beta - beta coefficients, array[0..ceil(3*K/2)]. Beta[0] is not used and may be arbitrary. Beta[I]>0. Mu0 - zeroth moment of the weight function. N - number of nodes of the Gauss-Kronrod quadrature formula, N >= 3, N = 2*K+1. OUTPUT PARAMETERS: Info - error code: * -5 no real and positive Gauss-Kronrod formula can be created for such a weight function with a given number of nodes. * -4 N is too large, task may be ill conditioned - x[i]=x[i+1] found. * -3 internal eigenproblem solver hasn't converged * -2 Beta[i]<=0 * -1 incorrect N was passed * +1 OK X - array[0..N-1] - array of quadrature nodes, in ascending order. WKronrod - array[0..N-1] - Kronrod weights WGauss - array[0..N-1] - Gauss weights (interleaved with zeros corresponding to extended Kronrod nodes). -- ALGLIB -- Copyright 08.05.2009 by Bochkanov Sergey *************************************************************************/ void gkqgeneraterec(/* Real */ ae_vector* alpha, /* Real */ ae_vector* beta, double mu0, ae_int_t n, ae_int_t* info, /* Real */ ae_vector* x, /* Real */ ae_vector* wkronrod, /* Real */ ae_vector* wgauss, ae_state *_state) { ae_frame _frame_block; ae_vector _alpha; ae_vector _beta; ae_vector ta; ae_int_t i; ae_int_t j; ae_vector t; ae_vector s; ae_int_t wlen; ae_int_t woffs; double u; ae_int_t m; ae_int_t l; ae_int_t k; ae_vector xgtmp; ae_vector wgtmp; ae_frame_make(_state, &_frame_block); ae_vector_init_copy(&_alpha, alpha, _state); alpha = &_alpha; ae_vector_init_copy(&_beta, beta, _state); beta = &_beta; *info = 0; ae_vector_clear(x); ae_vector_clear(wkronrod); ae_vector_clear(wgauss); ae_vector_init(&ta, 0, DT_REAL, _state); ae_vector_init(&t, 0, DT_REAL, _state); ae_vector_init(&s, 0, DT_REAL, _state); ae_vector_init(&xgtmp, 0, DT_REAL, _state); ae_vector_init(&wgtmp, 0, DT_REAL, _state); if( n%2!=1||n<3 ) { *info = -1; ae_frame_leave(_state); return; } for(i=0; i<=ae_iceil((double)(3*(n/2))/(double)2, _state); i++) { if( ae_fp_less_eq(beta->ptr.p_double[i],(double)(0)) ) { *info = -2; ae_frame_leave(_state); return; } } *info = 1; /* * from external conventions about N/Beta/Mu0 to internal */ n = n/2; beta->ptr.p_double[0] = mu0; /* * Calculate Gauss nodes/weights, save them for later processing */ gqgeneraterec(alpha, beta, mu0, n, info, &xgtmp, &wgtmp, _state); if( *info<0 ) { ae_frame_leave(_state); return; } /* * Resize: * * A from 0..floor(3*n/2) to 0..2*n * * B from 0..ceil(3*n/2) to 0..2*n */ ae_vector_set_length(&ta, ae_ifloor((double)(3*n)/(double)2, _state)+1, _state); ae_v_move(&ta.ptr.p_double[0], 1, &alpha->ptr.p_double[0], 1, ae_v_len(0,ae_ifloor((double)(3*n)/(double)2, _state))); ae_vector_set_length(alpha, 2*n+1, _state); ae_v_move(&alpha->ptr.p_double[0], 1, &ta.ptr.p_double[0], 1, ae_v_len(0,ae_ifloor((double)(3*n)/(double)2, _state))); for(i=ae_ifloor((double)(3*n)/(double)2, _state)+1; i<=2*n; i++) { alpha->ptr.p_double[i] = (double)(0); } ae_vector_set_length(&ta, ae_iceil((double)(3*n)/(double)2, _state)+1, _state); ae_v_move(&ta.ptr.p_double[0], 1, &beta->ptr.p_double[0], 1, ae_v_len(0,ae_iceil((double)(3*n)/(double)2, _state))); ae_vector_set_length(beta, 2*n+1, _state); ae_v_move(&beta->ptr.p_double[0], 1, &ta.ptr.p_double[0], 1, ae_v_len(0,ae_iceil((double)(3*n)/(double)2, _state))); for(i=ae_iceil((double)(3*n)/(double)2, _state)+1; i<=2*n; i++) { beta->ptr.p_double[i] = (double)(0); } /* * Initialize T, S */ wlen = 2+n/2; ae_vector_set_length(&t, wlen, _state); ae_vector_set_length(&s, wlen, _state); ae_vector_set_length(&ta, wlen, _state); woffs = 1; for(i=0; i<=wlen-1; 
i++) { t.ptr.p_double[i] = (double)(0); s.ptr.p_double[i] = (double)(0); } /* * Algorithm from Dirk P. Laurie, "Calculation of Gauss-Kronrod quadrature rules", 1997. */ t.ptr.p_double[woffs+0] = beta->ptr.p_double[n+1]; for(m=0; m<=n-2; m++) { u = (double)(0); for(k=(m+1)/2; k>=0; k--) { l = m-k; u = u+(alpha->ptr.p_double[k+n+1]-alpha->ptr.p_double[l])*t.ptr.p_double[woffs+k]+beta->ptr.p_double[k+n+1]*s.ptr.p_double[woffs+k-1]-beta->ptr.p_double[l]*s.ptr.p_double[woffs+k]; s.ptr.p_double[woffs+k] = u; } ae_v_move(&ta.ptr.p_double[0], 1, &t.ptr.p_double[0], 1, ae_v_len(0,wlen-1)); ae_v_move(&t.ptr.p_double[0], 1, &s.ptr.p_double[0], 1, ae_v_len(0,wlen-1)); ae_v_move(&s.ptr.p_double[0], 1, &ta.ptr.p_double[0], 1, ae_v_len(0,wlen-1)); } for(j=n/2; j>=0; j--) { s.ptr.p_double[woffs+j] = s.ptr.p_double[woffs+j-1]; } for(m=n-1; m<=2*n-3; m++) { u = (double)(0); for(k=m+1-n; k<=(m-1)/2; k++) { l = m-k; j = n-1-l; u = u-(alpha->ptr.p_double[k+n+1]-alpha->ptr.p_double[l])*t.ptr.p_double[woffs+j]-beta->ptr.p_double[k+n+1]*s.ptr.p_double[woffs+j]+beta->ptr.p_double[l]*s.ptr.p_double[woffs+j+1]; s.ptr.p_double[woffs+j] = u; } if( m%2==0 ) { k = m/2; alpha->ptr.p_double[k+n+1] = alpha->ptr.p_double[k]+(s.ptr.p_double[woffs+j]-beta->ptr.p_double[k+n+1]*s.ptr.p_double[woffs+j+1])/t.ptr.p_double[woffs+j+1]; } else { k = (m+1)/2; beta->ptr.p_double[k+n+1] = s.ptr.p_double[woffs+j]/s.ptr.p_double[woffs+j+1]; } ae_v_move(&ta.ptr.p_double[0], 1, &t.ptr.p_double[0], 1, ae_v_len(0,wlen-1)); ae_v_move(&t.ptr.p_double[0], 1, &s.ptr.p_double[0], 1, ae_v_len(0,wlen-1)); ae_v_move(&s.ptr.p_double[0], 1, &ta.ptr.p_double[0], 1, ae_v_len(0,wlen-1)); } alpha->ptr.p_double[2*n] = alpha->ptr.p_double[n-1]-beta->ptr.p_double[2*n]*s.ptr.p_double[woffs+0]/t.ptr.p_double[woffs+0]; /* * calculation of Kronrod nodes and weights, unpacking of Gauss weights */ gqgeneraterec(alpha, beta, mu0, 2*n+1, info, x, wkronrod, _state); if( *info==-2 ) { *info = -5; } if( *info<0 ) { ae_frame_leave(_state); return; } for(i=0; i<=2*n-1; i++) { if( ae_fp_greater_eq(x->ptr.p_double[i],x->ptr.p_double[i+1]) ) { *info = -4; } } if( *info<0 ) { ae_frame_leave(_state); return; } ae_vector_set_length(wgauss, 2*n+1, _state); for(i=0; i<=2*n; i++) { wgauss->ptr.p_double[i] = (double)(0); } for(i=0; i<=n-1; i++) { wgauss->ptr.p_double[2*i+1] = wgtmp.ptr.p_double[i]; } ae_frame_leave(_state); } /************************************************************************* Returns Gauss and Gauss-Kronrod nodes/weights for Gauss-Legendre quadrature with N points. GKQLegendreCalc (calculation) or GKQLegendreTbl (precomputed table) is used depending on machine precision and number of nodes. INPUT PARAMETERS: N - number of Kronrod nodes, must be odd number, >=3. OUTPUT PARAMETERS: Info - error code: * -4 an error was detected when calculating weights/nodes. N is too large to obtain weights/nodes with high enough accuracy. Try to use multiple precision version. * -3 internal eigenproblem solver hasn't converged * -1 incorrect N was passed * +1 OK X - array[0..N-1] - array of quadrature nodes, ordered in ascending order. WKronrod - array[0..N-1] - Kronrod weights WGauss - array[0..N-1] - Gauss weights (interleaved with zeros corresponding to extended Kronrod nodes). 
-- ALGLIB -- Copyright 12.05.2009 by Bochkanov Sergey *************************************************************************/ void gkqgenerategausslegendre(ae_int_t n, ae_int_t* info, /* Real */ ae_vector* x, /* Real */ ae_vector* wkronrod, /* Real */ ae_vector* wgauss, ae_state *_state) { double eps; *info = 0; ae_vector_clear(x); ae_vector_clear(wkronrod); ae_vector_clear(wgauss); if( ae_fp_greater(ae_machineepsilon,1.0E-32)&&(((((n==15||n==21)||n==31)||n==41)||n==51)||n==61) ) { *info = 1; gkqlegendretbl(n, x, wkronrod, wgauss, &eps, _state); } else { gkqlegendrecalc(n, info, x, wkronrod, wgauss, _state); } } /************************************************************************* Returns Gauss and Gauss-Kronrod nodes/weights for Gauss-Jacobi quadrature on [-1,1] with weight function W(x)=Power(1-x,Alpha)*Power(1+x,Beta). INPUT PARAMETERS: N - number of Kronrod nodes, must be odd number, >=3. Alpha - power-law coefficient, Alpha>-1 Beta - power-law coefficient, Beta>-1 OUTPUT PARAMETERS: Info - error code: * -5 no real and positive Gauss-Kronrod formula can be created for such a weight function with a given number of nodes. * -4 an error was detected when calculating weights/nodes. Alpha or Beta are too close to -1 to obtain weights/nodes with high enough accuracy, or, may be, N is too large. Try to use multiple precision version. * -3 internal eigenproblem solver hasn't converged * -1 incorrect N was passed * +1 OK * +2 OK, but quadrature rule have exterior nodes, x[0]<-1 or x[n-1]>+1 X - array[0..N-1] - array of quadrature nodes, ordered in ascending order. WKronrod - array[0..N-1] - Kronrod weights WGauss - array[0..N-1] - Gauss weights (interleaved with zeros corresponding to extended Kronrod nodes). -- ALGLIB -- Copyright 12.05.2009 by Bochkanov Sergey *************************************************************************/ void gkqgenerategaussjacobi(ae_int_t n, double alpha, double beta, ae_int_t* info, /* Real */ ae_vector* x, /* Real */ ae_vector* wkronrod, /* Real */ ae_vector* wgauss, ae_state *_state) { ae_frame _frame_block; ae_int_t clen; ae_vector a; ae_vector b; double alpha2; double beta2; double apb; double t; ae_int_t i; double s; ae_frame_make(_state, &_frame_block); *info = 0; ae_vector_clear(x); ae_vector_clear(wkronrod); ae_vector_clear(wgauss); ae_vector_init(&a, 0, DT_REAL, _state); ae_vector_init(&b, 0, DT_REAL, _state); if( n%2!=1||n<3 ) { *info = -1; ae_frame_leave(_state); return; } if( ae_fp_less_eq(alpha,(double)(-1))||ae_fp_less_eq(beta,(double)(-1)) ) { *info = -1; ae_frame_leave(_state); return; } clen = ae_iceil((double)(3*(n/2))/(double)2, _state)+1; ae_vector_set_length(&a, clen, _state); ae_vector_set_length(&b, clen, _state); for(i=0; i<=clen-1; i++) { a.ptr.p_double[i] = (double)(0); } apb = alpha+beta; a.ptr.p_double[0] = (beta-alpha)/(apb+2); t = (apb+1)*ae_log((double)(2), _state)+lngamma(alpha+1, &s, _state)+lngamma(beta+1, &s, _state)-lngamma(apb+2, &s, _state); if( ae_fp_greater(t,ae_log(ae_maxrealnumber, _state)) ) { *info = -4; ae_frame_leave(_state); return; } b.ptr.p_double[0] = ae_exp(t, _state); if( clen>1 ) { alpha2 = ae_sqr(alpha, _state); beta2 = ae_sqr(beta, _state); a.ptr.p_double[1] = (beta2-alpha2)/((apb+2)*(apb+4)); b.ptr.p_double[1] = 4*(alpha+1)*(beta+1)/((apb+3)*ae_sqr(apb+2, _state)); for(i=2; i<=clen-1; i++) { a.ptr.p_double[i] = 0.25*(beta2-alpha2)/(i*i*(1+0.5*apb/i)*(1+0.5*(apb+2)/i)); b.ptr.p_double[i] = 0.25*(1+alpha/i)*(1+beta/i)*(1+apb/i)/((1+0.5*(apb+1)/i)*(1+0.5*(apb-1)/i)*ae_sqr(1+0.5*apb/i, _state)); } } 
gkqgeneraterec(&a, &b, b.ptr.p_double[0], n, info, x, wkronrod, wgauss, _state); /* * test basic properties to detect errors */ if( *info>0 ) { if( ae_fp_less(x->ptr.p_double[0],(double)(-1))||ae_fp_greater(x->ptr.p_double[n-1],(double)(1)) ) { *info = 2; } for(i=0; i<=n-2; i++) { if( ae_fp_greater_eq(x->ptr.p_double[i],x->ptr.p_double[i+1]) ) { *info = -4; } } } ae_frame_leave(_state); } /************************************************************************* Returns Gauss and Gauss-Kronrod nodes for quadrature with N points. Reduction to tridiagonal eigenproblem is used. INPUT PARAMETERS: N - number of Kronrod nodes, must be odd number, >=3. OUTPUT PARAMETERS: Info - error code: * -4 an error was detected when calculating weights/nodes. N is too large to obtain weights/nodes with high enough accuracy. Try to use multiple precision version. * -3 internal eigenproblem solver hasn't converged * -1 incorrect N was passed * +1 OK X - array[0..N-1] - array of quadrature nodes, ordered in ascending order. WKronrod - array[0..N-1] - Kronrod weights WGauss - array[0..N-1] - Gauss weights (interleaved with zeros corresponding to extended Kronrod nodes). -- ALGLIB -- Copyright 12.05.2009 by Bochkanov Sergey *************************************************************************/ void gkqlegendrecalc(ae_int_t n, ae_int_t* info, /* Real */ ae_vector* x, /* Real */ ae_vector* wkronrod, /* Real */ ae_vector* wgauss, ae_state *_state) { ae_frame _frame_block; ae_vector alpha; ae_vector beta; ae_int_t alen; ae_int_t blen; double mu0; ae_int_t k; ae_int_t i; ae_frame_make(_state, &_frame_block); *info = 0; ae_vector_clear(x); ae_vector_clear(wkronrod); ae_vector_clear(wgauss); ae_vector_init(&alpha, 0, DT_REAL, _state); ae_vector_init(&beta, 0, DT_REAL, _state); if( n%2!=1||n<3 ) { *info = -1; ae_frame_leave(_state); return; } mu0 = (double)(2); alen = ae_ifloor((double)(3*(n/2))/(double)2, _state)+1; blen = ae_iceil((double)(3*(n/2))/(double)2, _state)+1; ae_vector_set_length(&alpha, alen, _state); ae_vector_set_length(&beta, blen, _state); for(k=0; k<=alen-1; k++) { alpha.ptr.p_double[k] = (double)(0); } beta.ptr.p_double[0] = (double)(2); for(k=1; k<=blen-1; k++) { beta.ptr.p_double[k] = 1/(4-1/ae_sqr((double)(k), _state)); } gkqgeneraterec(&alpha, &beta, mu0, n, info, x, wkronrod, wgauss, _state); /* * test basic properties to detect errors */ if( *info>0 ) { if( ae_fp_less(x->ptr.p_double[0],(double)(-1))||ae_fp_greater(x->ptr.p_double[n-1],(double)(1)) ) { *info = -4; } for(i=0; i<=n-2; i++) { if( ae_fp_greater_eq(x->ptr.p_double[i],x->ptr.p_double[i+1]) ) { *info = -4; } } } ae_frame_leave(_state); } /************************************************************************* Returns Gauss and Gauss-Kronrod nodes for quadrature with N points using pre-calculated table. Nodes/weights were computed with accuracy up to 1.0E-32 (if MPFR version of ALGLIB is used). In standard double precision accuracy reduces to something about 2.0E-16 (depending on your compiler's handling of long floating point constants). INPUT PARAMETERS: N - number of Kronrod nodes. N can be 15, 21, 31, 41, 51, 61. OUTPUT PARAMETERS: X - array[0..N-1] - array of quadrature nodes, ordered in ascending order. WKronrod - array[0..N-1] - Kronrod weights WGauss - array[0..N-1] - Gauss weights (interleaved with zeros corresponding to extended Kronrod nodes). 
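Since WGauss is returned on the same index set as WKronrod, with zeros at the purely-Kronrod nodes, both the full N-point Kronrod sum and the embedded (N-1)/2-point Gauss sum can be accumulated in a single pass over the nodes; their difference serves as the usual error indicator. The helper below is a minimal sketch of that bookkeeping (gk_estimate and its plain-array interface are illustrative, not part of ALGLIB); it assumes nodes and weights on the reference interval [-1,+1] as produced by the generators in this unit.

    #include <math.h>
    #include <stddef.h>

    /* Illustrative helper: given Gauss-Kronrod nodes x[], Kronrod weights
       wk[] and zero-interleaved Gauss weights wg[] of length n, return the
       Kronrod estimate of integral(f,-1,+1) and, optionally, |I_K - I_G|. */
    static double gk_estimate(double (*f)(double),
                              const double *x,
                              const double *wk,
                              const double *wg,
                              int n,
                              double *errest)
    {
        double ik = 0.0, ig = 0.0;
        int i;
        for(i=0; i<n; i++)
        {
            double v = f(x[i]);
            ik = ik+wk[i]*v;    /* full Kronrod rule over all N nodes        */
            ig = ig+wg[i]*v;    /* embedded Gauss rule: wg[i]=0 at the extra */
                                /* Kronrod nodes, so they simply drop out    */
        }
        if( errest!=NULL )
            *errest = fabs(ik-ig);
        return ik;
    }

For a general interval [a,b] the nodes are mapped to 0.5*(b-a)*x[i]+0.5*(b+a) and both sums are scaled by 0.5*(b-a), which is exactly what the AutoGK code below does with its C1/C2 variables.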
-- ALGLIB -- Copyright 12.05.2009 by Bochkanov Sergey *************************************************************************/ void gkqlegendretbl(ae_int_t n, /* Real */ ae_vector* x, /* Real */ ae_vector* wkronrod, /* Real */ ae_vector* wgauss, double* eps, ae_state *_state) { ae_frame _frame_block; ae_int_t i; ae_int_t ng; ae_vector p1; ae_vector p2; double tmp; ae_frame_make(_state, &_frame_block); ae_vector_clear(x); ae_vector_clear(wkronrod); ae_vector_clear(wgauss); *eps = 0; ae_vector_init(&p1, 0, DT_INT, _state); ae_vector_init(&p2, 0, DT_INT, _state); /* * these initializers are not really necessary, * but without them compiler complains about uninitialized locals */ ng = 0; /* * Process */ ae_assert(((((n==15||n==21)||n==31)||n==41)||n==51)||n==61, "GKQNodesTbl: incorrect N!", _state); ae_vector_set_length(x, n, _state); ae_vector_set_length(wkronrod, n, _state); ae_vector_set_length(wgauss, n, _state); for(i=0; i<=n-1; i++) { x->ptr.p_double[i] = (double)(0); wkronrod->ptr.p_double[i] = (double)(0); wgauss->ptr.p_double[i] = (double)(0); } *eps = ae_maxreal(ae_machineepsilon, 1.0E-32, _state); if( n==15 ) { ng = 4; wgauss->ptr.p_double[0] = 0.129484966168869693270611432679082; wgauss->ptr.p_double[1] = 0.279705391489276667901467771423780; wgauss->ptr.p_double[2] = 0.381830050505118944950369775488975; wgauss->ptr.p_double[3] = 0.417959183673469387755102040816327; x->ptr.p_double[0] = 0.991455371120812639206854697526329; x->ptr.p_double[1] = 0.949107912342758524526189684047851; x->ptr.p_double[2] = 0.864864423359769072789712788640926; x->ptr.p_double[3] = 0.741531185599394439863864773280788; x->ptr.p_double[4] = 0.586087235467691130294144838258730; x->ptr.p_double[5] = 0.405845151377397166906606412076961; x->ptr.p_double[6] = 0.207784955007898467600689403773245; x->ptr.p_double[7] = 0.000000000000000000000000000000000; wkronrod->ptr.p_double[0] = 0.022935322010529224963732008058970; wkronrod->ptr.p_double[1] = 0.063092092629978553290700663189204; wkronrod->ptr.p_double[2] = 0.104790010322250183839876322541518; wkronrod->ptr.p_double[3] = 0.140653259715525918745189590510238; wkronrod->ptr.p_double[4] = 0.169004726639267902826583426598550; wkronrod->ptr.p_double[5] = 0.190350578064785409913256402421014; wkronrod->ptr.p_double[6] = 0.204432940075298892414161999234649; wkronrod->ptr.p_double[7] = 0.209482141084727828012999174891714; } if( n==21 ) { ng = 5; wgauss->ptr.p_double[0] = 0.066671344308688137593568809893332; wgauss->ptr.p_double[1] = 0.149451349150580593145776339657697; wgauss->ptr.p_double[2] = 0.219086362515982043995534934228163; wgauss->ptr.p_double[3] = 0.269266719309996355091226921569469; wgauss->ptr.p_double[4] = 0.295524224714752870173892994651338; x->ptr.p_double[0] = 0.995657163025808080735527280689003; x->ptr.p_double[1] = 0.973906528517171720077964012084452; x->ptr.p_double[2] = 0.930157491355708226001207180059508; x->ptr.p_double[3] = 0.865063366688984510732096688423493; x->ptr.p_double[4] = 0.780817726586416897063717578345042; x->ptr.p_double[5] = 0.679409568299024406234327365114874; x->ptr.p_double[6] = 0.562757134668604683339000099272694; x->ptr.p_double[7] = 0.433395394129247190799265943165784; x->ptr.p_double[8] = 0.294392862701460198131126603103866; x->ptr.p_double[9] = 0.148874338981631210884826001129720; x->ptr.p_double[10] = 0.000000000000000000000000000000000; wkronrod->ptr.p_double[0] = 0.011694638867371874278064396062192; wkronrod->ptr.p_double[1] = 0.032558162307964727478818972459390; wkronrod->ptr.p_double[2] = 0.054755896574351996031381300244580; 
wkronrod->ptr.p_double[3] = 0.075039674810919952767043140916190; wkronrod->ptr.p_double[4] = 0.093125454583697605535065465083366; wkronrod->ptr.p_double[5] = 0.109387158802297641899210590325805; wkronrod->ptr.p_double[6] = 0.123491976262065851077958109831074; wkronrod->ptr.p_double[7] = 0.134709217311473325928054001771707; wkronrod->ptr.p_double[8] = 0.142775938577060080797094273138717; wkronrod->ptr.p_double[9] = 0.147739104901338491374841515972068; wkronrod->ptr.p_double[10] = 0.149445554002916905664936468389821; } if( n==31 ) { ng = 8; wgauss->ptr.p_double[0] = 0.030753241996117268354628393577204; wgauss->ptr.p_double[1] = 0.070366047488108124709267416450667; wgauss->ptr.p_double[2] = 0.107159220467171935011869546685869; wgauss->ptr.p_double[3] = 0.139570677926154314447804794511028; wgauss->ptr.p_double[4] = 0.166269205816993933553200860481209; wgauss->ptr.p_double[5] = 0.186161000015562211026800561866423; wgauss->ptr.p_double[6] = 0.198431485327111576456118326443839; wgauss->ptr.p_double[7] = 0.202578241925561272880620199967519; x->ptr.p_double[0] = 0.998002298693397060285172840152271; x->ptr.p_double[1] = 0.987992518020485428489565718586613; x->ptr.p_double[2] = 0.967739075679139134257347978784337; x->ptr.p_double[3] = 0.937273392400705904307758947710209; x->ptr.p_double[4] = 0.897264532344081900882509656454496; x->ptr.p_double[5] = 0.848206583410427216200648320774217; x->ptr.p_double[6] = 0.790418501442465932967649294817947; x->ptr.p_double[7] = 0.724417731360170047416186054613938; x->ptr.p_double[8] = 0.650996741297416970533735895313275; x->ptr.p_double[9] = 0.570972172608538847537226737253911; x->ptr.p_double[10] = 0.485081863640239680693655740232351; x->ptr.p_double[11] = 0.394151347077563369897207370981045; x->ptr.p_double[12] = 0.299180007153168812166780024266389; x->ptr.p_double[13] = 0.201194093997434522300628303394596; x->ptr.p_double[14] = 0.101142066918717499027074231447392; x->ptr.p_double[15] = 0.000000000000000000000000000000000; wkronrod->ptr.p_double[0] = 0.005377479872923348987792051430128; wkronrod->ptr.p_double[1] = 0.015007947329316122538374763075807; wkronrod->ptr.p_double[2] = 0.025460847326715320186874001019653; wkronrod->ptr.p_double[3] = 0.035346360791375846222037948478360; wkronrod->ptr.p_double[4] = 0.044589751324764876608227299373280; wkronrod->ptr.p_double[5] = 0.053481524690928087265343147239430; wkronrod->ptr.p_double[6] = 0.062009567800670640285139230960803; wkronrod->ptr.p_double[7] = 0.069854121318728258709520077099147; wkronrod->ptr.p_double[8] = 0.076849680757720378894432777482659; wkronrod->ptr.p_double[9] = 0.083080502823133021038289247286104; wkronrod->ptr.p_double[10] = 0.088564443056211770647275443693774; wkronrod->ptr.p_double[11] = 0.093126598170825321225486872747346; wkronrod->ptr.p_double[12] = 0.096642726983623678505179907627589; wkronrod->ptr.p_double[13] = 0.099173598721791959332393173484603; wkronrod->ptr.p_double[14] = 0.100769845523875595044946662617570; wkronrod->ptr.p_double[15] = 0.101330007014791549017374792767493; } if( n==41 ) { ng = 10; wgauss->ptr.p_double[0] = 0.017614007139152118311861962351853; wgauss->ptr.p_double[1] = 0.040601429800386941331039952274932; wgauss->ptr.p_double[2] = 0.062672048334109063569506535187042; wgauss->ptr.p_double[3] = 0.083276741576704748724758143222046; wgauss->ptr.p_double[4] = 0.101930119817240435036750135480350; wgauss->ptr.p_double[5] = 0.118194531961518417312377377711382; wgauss->ptr.p_double[6] = 0.131688638449176626898494499748163; wgauss->ptr.p_double[7] = 
0.142096109318382051329298325067165; wgauss->ptr.p_double[8] = 0.149172986472603746787828737001969; wgauss->ptr.p_double[9] = 0.152753387130725850698084331955098; x->ptr.p_double[0] = 0.998859031588277663838315576545863; x->ptr.p_double[1] = 0.993128599185094924786122388471320; x->ptr.p_double[2] = 0.981507877450250259193342994720217; x->ptr.p_double[3] = 0.963971927277913791267666131197277; x->ptr.p_double[4] = 0.940822633831754753519982722212443; x->ptr.p_double[5] = 0.912234428251325905867752441203298; x->ptr.p_double[6] = 0.878276811252281976077442995113078; x->ptr.p_double[7] = 0.839116971822218823394529061701521; x->ptr.p_double[8] = 0.795041428837551198350638833272788; x->ptr.p_double[9] = 0.746331906460150792614305070355642; x->ptr.p_double[10] = 0.693237656334751384805490711845932; x->ptr.p_double[11] = 0.636053680726515025452836696226286; x->ptr.p_double[12] = 0.575140446819710315342946036586425; x->ptr.p_double[13] = 0.510867001950827098004364050955251; x->ptr.p_double[14] = 0.443593175238725103199992213492640; x->ptr.p_double[15] = 0.373706088715419560672548177024927; x->ptr.p_double[16] = 0.301627868114913004320555356858592; x->ptr.p_double[17] = 0.227785851141645078080496195368575; x->ptr.p_double[18] = 0.152605465240922675505220241022678; x->ptr.p_double[19] = 0.076526521133497333754640409398838; x->ptr.p_double[20] = 0.000000000000000000000000000000000; wkronrod->ptr.p_double[0] = 0.003073583718520531501218293246031; wkronrod->ptr.p_double[1] = 0.008600269855642942198661787950102; wkronrod->ptr.p_double[2] = 0.014626169256971252983787960308868; wkronrod->ptr.p_double[3] = 0.020388373461266523598010231432755; wkronrod->ptr.p_double[4] = 0.025882133604951158834505067096153; wkronrod->ptr.p_double[5] = 0.031287306777032798958543119323801; wkronrod->ptr.p_double[6] = 0.036600169758200798030557240707211; wkronrod->ptr.p_double[7] = 0.041668873327973686263788305936895; wkronrod->ptr.p_double[8] = 0.046434821867497674720231880926108; wkronrod->ptr.p_double[9] = 0.050944573923728691932707670050345; wkronrod->ptr.p_double[10] = 0.055195105348285994744832372419777; wkronrod->ptr.p_double[11] = 0.059111400880639572374967220648594; wkronrod->ptr.p_double[12] = 0.062653237554781168025870122174255; wkronrod->ptr.p_double[13] = 0.065834597133618422111563556969398; wkronrod->ptr.p_double[14] = 0.068648672928521619345623411885368; wkronrod->ptr.p_double[15] = 0.071054423553444068305790361723210; wkronrod->ptr.p_double[16] = 0.073030690332786667495189417658913; wkronrod->ptr.p_double[17] = 0.074582875400499188986581418362488; wkronrod->ptr.p_double[18] = 0.075704497684556674659542775376617; wkronrod->ptr.p_double[19] = 0.076377867672080736705502835038061; wkronrod->ptr.p_double[20] = 0.076600711917999656445049901530102; } if( n==51 ) { ng = 13; wgauss->ptr.p_double[0] = 0.011393798501026287947902964113235; wgauss->ptr.p_double[1] = 0.026354986615032137261901815295299; wgauss->ptr.p_double[2] = 0.040939156701306312655623487711646; wgauss->ptr.p_double[3] = 0.054904695975835191925936891540473; wgauss->ptr.p_double[4] = 0.068038333812356917207187185656708; wgauss->ptr.p_double[5] = 0.080140700335001018013234959669111; wgauss->ptr.p_double[6] = 0.091028261982963649811497220702892; wgauss->ptr.p_double[7] = 0.100535949067050644202206890392686; wgauss->ptr.p_double[8] = 0.108519624474263653116093957050117; wgauss->ptr.p_double[9] = 0.114858259145711648339325545869556; wgauss->ptr.p_double[10] = 0.119455763535784772228178126512901; wgauss->ptr.p_double[11] = 0.122242442990310041688959518945852; 
wgauss->ptr.p_double[12] = 0.123176053726715451203902873079050; x->ptr.p_double[0] = 0.999262104992609834193457486540341; x->ptr.p_double[1] = 0.995556969790498097908784946893902; x->ptr.p_double[2] = 0.988035794534077247637331014577406; x->ptr.p_double[3] = 0.976663921459517511498315386479594; x->ptr.p_double[4] = 0.961614986425842512418130033660167; x->ptr.p_double[5] = 0.942974571228974339414011169658471; x->ptr.p_double[6] = 0.920747115281701561746346084546331; x->ptr.p_double[7] = 0.894991997878275368851042006782805; x->ptr.p_double[8] = 0.865847065293275595448996969588340; x->ptr.p_double[9] = 0.833442628760834001421021108693570; x->ptr.p_double[10] = 0.797873797998500059410410904994307; x->ptr.p_double[11] = 0.759259263037357630577282865204361; x->ptr.p_double[12] = 0.717766406813084388186654079773298; x->ptr.p_double[13] = 0.673566368473468364485120633247622; x->ptr.p_double[14] = 0.626810099010317412788122681624518; x->ptr.p_double[15] = 0.577662930241222967723689841612654; x->ptr.p_double[16] = 0.526325284334719182599623778158010; x->ptr.p_double[17] = 0.473002731445714960522182115009192; x->ptr.p_double[18] = 0.417885382193037748851814394594572; x->ptr.p_double[19] = 0.361172305809387837735821730127641; x->ptr.p_double[20] = 0.303089538931107830167478909980339; x->ptr.p_double[21] = 0.243866883720988432045190362797452; x->ptr.p_double[22] = 0.183718939421048892015969888759528; x->ptr.p_double[23] = 0.122864692610710396387359818808037; x->ptr.p_double[24] = 0.061544483005685078886546392366797; x->ptr.p_double[25] = 0.000000000000000000000000000000000; wkronrod->ptr.p_double[0] = 0.001987383892330315926507851882843; wkronrod->ptr.p_double[1] = 0.005561932135356713758040236901066; wkronrod->ptr.p_double[2] = 0.009473973386174151607207710523655; wkronrod->ptr.p_double[3] = 0.013236229195571674813656405846976; wkronrod->ptr.p_double[4] = 0.016847817709128298231516667536336; wkronrod->ptr.p_double[5] = 0.020435371145882835456568292235939; wkronrod->ptr.p_double[6] = 0.024009945606953216220092489164881; wkronrod->ptr.p_double[7] = 0.027475317587851737802948455517811; wkronrod->ptr.p_double[8] = 0.030792300167387488891109020215229; wkronrod->ptr.p_double[9] = 0.034002130274329337836748795229551; wkronrod->ptr.p_double[10] = 0.037116271483415543560330625367620; wkronrod->ptr.p_double[11] = 0.040083825504032382074839284467076; wkronrod->ptr.p_double[12] = 0.042872845020170049476895792439495; wkronrod->ptr.p_double[13] = 0.045502913049921788909870584752660; wkronrod->ptr.p_double[14] = 0.047982537138836713906392255756915; wkronrod->ptr.p_double[15] = 0.050277679080715671963325259433440; wkronrod->ptr.p_double[16] = 0.052362885806407475864366712137873; wkronrod->ptr.p_double[17] = 0.054251129888545490144543370459876; wkronrod->ptr.p_double[18] = 0.055950811220412317308240686382747; wkronrod->ptr.p_double[19] = 0.057437116361567832853582693939506; wkronrod->ptr.p_double[20] = 0.058689680022394207961974175856788; wkronrod->ptr.p_double[21] = 0.059720340324174059979099291932562; wkronrod->ptr.p_double[22] = 0.060539455376045862945360267517565; wkronrod->ptr.p_double[23] = 0.061128509717053048305859030416293; wkronrod->ptr.p_double[24] = 0.061471189871425316661544131965264; wkronrod->ptr.p_double[25] = 0.061580818067832935078759824240055; } if( n==61 ) { ng = 15; wgauss->ptr.p_double[0] = 0.007968192496166605615465883474674; wgauss->ptr.p_double[1] = 0.018466468311090959142302131912047; wgauss->ptr.p_double[2] = 0.028784707883323369349719179611292; wgauss->ptr.p_double[3] = 
0.038799192569627049596801936446348; wgauss->ptr.p_double[4] = 0.048402672830594052902938140422808; wgauss->ptr.p_double[5] = 0.057493156217619066481721689402056; wgauss->ptr.p_double[6] = 0.065974229882180495128128515115962; wgauss->ptr.p_double[7] = 0.073755974737705206268243850022191; wgauss->ptr.p_double[8] = 0.080755895229420215354694938460530; wgauss->ptr.p_double[9] = 0.086899787201082979802387530715126; wgauss->ptr.p_double[10] = 0.092122522237786128717632707087619; wgauss->ptr.p_double[11] = 0.096368737174644259639468626351810; wgauss->ptr.p_double[12] = 0.099593420586795267062780282103569; wgauss->ptr.p_double[13] = 0.101762389748405504596428952168554; wgauss->ptr.p_double[14] = 0.102852652893558840341285636705415; x->ptr.p_double[0] = 0.999484410050490637571325895705811; x->ptr.p_double[1] = 0.996893484074649540271630050918695; x->ptr.p_double[2] = 0.991630996870404594858628366109486; x->ptr.p_double[3] = 0.983668123279747209970032581605663; x->ptr.p_double[4] = 0.973116322501126268374693868423707; x->ptr.p_double[5] = 0.960021864968307512216871025581798; x->ptr.p_double[6] = 0.944374444748559979415831324037439; x->ptr.p_double[7] = 0.926200047429274325879324277080474; x->ptr.p_double[8] = 0.905573307699907798546522558925958; x->ptr.p_double[9] = 0.882560535792052681543116462530226; x->ptr.p_double[10] = 0.857205233546061098958658510658944; x->ptr.p_double[11] = 0.829565762382768397442898119732502; x->ptr.p_double[12] = 0.799727835821839083013668942322683; x->ptr.p_double[13] = 0.767777432104826194917977340974503; x->ptr.p_double[14] = 0.733790062453226804726171131369528; x->ptr.p_double[15] = 0.697850494793315796932292388026640; x->ptr.p_double[16] = 0.660061064126626961370053668149271; x->ptr.p_double[17] = 0.620526182989242861140477556431189; x->ptr.p_double[18] = 0.579345235826361691756024932172540; x->ptr.p_double[19] = 0.536624148142019899264169793311073; x->ptr.p_double[20] = 0.492480467861778574993693061207709; x->ptr.p_double[21] = 0.447033769538089176780609900322854; x->ptr.p_double[22] = 0.400401254830394392535476211542661; x->ptr.p_double[23] = 0.352704725530878113471037207089374; x->ptr.p_double[24] = 0.304073202273625077372677107199257; x->ptr.p_double[25] = 0.254636926167889846439805129817805; x->ptr.p_double[26] = 0.204525116682309891438957671002025; x->ptr.p_double[27] = 0.153869913608583546963794672743256; x->ptr.p_double[28] = 0.102806937966737030147096751318001; x->ptr.p_double[29] = 0.051471842555317695833025213166723; x->ptr.p_double[30] = 0.000000000000000000000000000000000; wkronrod->ptr.p_double[0] = 0.001389013698677007624551591226760; wkronrod->ptr.p_double[1] = 0.003890461127099884051267201844516; wkronrod->ptr.p_double[2] = 0.006630703915931292173319826369750; wkronrod->ptr.p_double[3] = 0.009273279659517763428441146892024; wkronrod->ptr.p_double[4] = 0.011823015253496341742232898853251; wkronrod->ptr.p_double[5] = 0.014369729507045804812451432443580; wkronrod->ptr.p_double[6] = 0.016920889189053272627572289420322; wkronrod->ptr.p_double[7] = 0.019414141193942381173408951050128; wkronrod->ptr.p_double[8] = 0.021828035821609192297167485738339; wkronrod->ptr.p_double[9] = 0.024191162078080601365686370725232; wkronrod->ptr.p_double[10] = 0.026509954882333101610601709335075; wkronrod->ptr.p_double[11] = 0.028754048765041292843978785354334; wkronrod->ptr.p_double[12] = 0.030907257562387762472884252943092; wkronrod->ptr.p_double[13] = 0.032981447057483726031814191016854; wkronrod->ptr.p_double[14] = 0.034979338028060024137499670731468; 
wkronrod->ptr.p_double[15] = 0.036882364651821229223911065617136; wkronrod->ptr.p_double[16] = 0.038678945624727592950348651532281; wkronrod->ptr.p_double[17] = 0.040374538951535959111995279752468; wkronrod->ptr.p_double[18] = 0.041969810215164246147147541285970; wkronrod->ptr.p_double[19] = 0.043452539701356069316831728117073; wkronrod->ptr.p_double[20] = 0.044814800133162663192355551616723; wkronrod->ptr.p_double[21] = 0.046059238271006988116271735559374; wkronrod->ptr.p_double[22] = 0.047185546569299153945261478181099; wkronrod->ptr.p_double[23] = 0.048185861757087129140779492298305; wkronrod->ptr.p_double[24] = 0.049055434555029778887528165367238; wkronrod->ptr.p_double[25] = 0.049795683427074206357811569379942; wkronrod->ptr.p_double[26] = 0.050405921402782346840893085653585; wkronrod->ptr.p_double[27] = 0.050881795898749606492297473049805; wkronrod->ptr.p_double[28] = 0.051221547849258772170656282604944; wkronrod->ptr.p_double[29] = 0.051426128537459025933862879215781; wkronrod->ptr.p_double[30] = 0.051494729429451567558340433647099; } /* * copy nodes */ for(i=n-1; i>=n/2; i--) { x->ptr.p_double[i] = -x->ptr.p_double[n-1-i]; } /* * copy Kronrod weights */ for(i=n-1; i>=n/2; i--) { wkronrod->ptr.p_double[i] = wkronrod->ptr.p_double[n-1-i]; } /* * copy Gauss weights */ for(i=ng-1; i>=0; i--) { wgauss->ptr.p_double[n-2-2*i] = wgauss->ptr.p_double[i]; wgauss->ptr.p_double[1+2*i] = wgauss->ptr.p_double[i]; } for(i=0; i<=n/2; i++) { wgauss->ptr.p_double[2*i] = (double)(0); } /* * reorder */ tagsort(x, n, &p1, &p2, _state); for(i=0; i<=n-1; i++) { tmp = wkronrod->ptr.p_double[i]; wkronrod->ptr.p_double[i] = wkronrod->ptr.p_double[p2.ptr.p_int[i]]; wkronrod->ptr.p_double[p2.ptr.p_int[i]] = tmp; tmp = wgauss->ptr.p_double[i]; wgauss->ptr.p_double[i] = wgauss->ptr.p_double[p2.ptr.p_int[i]]; wgauss->ptr.p_double[p2.ptr.p_int[i]] = tmp; } ae_frame_leave(_state); } /************************************************************************* Integration of a smooth function F(x) on a finite interval [a,b]. Fast-convergent algorithm based on a Gauss-Kronrod formula is used. Result is calculated with accuracy close to the machine precision. Algorithm works well only with smooth integrands. It may be used with continuous non-smooth integrands, but with less performance. It should never be used with integrands which have integrable singularities at lower or upper limits - algorithm may crash. Use AutoGKSingular in such cases. INPUT PARAMETERS: A, B - interval boundaries (AB) OUTPUT PARAMETERS State - structure which stores algorithm state SEE ALSO AutoGKSmoothW, AutoGKSingular, AutoGKResults. -- ALGLIB -- Copyright 06.05.2009 by Bochkanov Sergey *************************************************************************/ void autogksmooth(double a, double b, autogkstate* state, ae_state *_state) { _autogkstate_clear(state); ae_assert(ae_isfinite(a, _state), "AutoGKSmooth: A is not finite!", _state); ae_assert(ae_isfinite(b, _state), "AutoGKSmooth: B is not finite!", _state); autogksmoothw(a, b, 0.0, state, _state); } /************************************************************************* Integration of a smooth function F(x) on a finite interval [a,b]. This subroutine is same as AutoGKSmooth(), but it guarantees that interval [a,b] is partitioned into subintervals which have width at most XWidth. Subroutine can be used when integrating nearly-constant function with narrow "bumps" (about XWidth wide). If "bumps" are too narrow, AutoGKSmooth subroutine can overlook them. 
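A practical rule of thumb (our suggestion, not a library requirement): when the integrand is a smooth background plus a localized feature of known width w, pass an XWidth of about w or slightly smaller, so that the initial partition cannot step over the feature. For example, for a bump roughly 0.01 wide somewhere inside [0,10] one might call

    autogksmoothw(0.0, 10.0, 0.005, &state, _state);   /* illustrative values */

Passing XWidth=0 removes the width requirement, which is exactly how AutoGKSmooth above delegates to this function.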
INPUT PARAMETERS: A, B - interval boundaries (A<B, A=B or A>B) OUTPUT PARAMETERS State - structure which stores algorithm state SEE ALSO AutoGKSmooth, AutoGKSingular, AutoGKResults. -- ALGLIB -- Copyright 06.05.2009 by Bochkanov Sergey *************************************************************************/ void autogksmoothw(double a, double b, double xwidth, autogkstate* state, ae_state *_state) { _autogkstate_clear(state); ae_assert(ae_isfinite(a, _state), "AutoGKSmoothW: A is not finite!", _state); ae_assert(ae_isfinite(b, _state), "AutoGKSmoothW: B is not finite!", _state); ae_assert(ae_isfinite(xwidth, _state), "AutoGKSmoothW: XWidth is not finite!", _state); state->wrappermode = 0; state->a = a; state->b = b; state->xwidth = xwidth; state->needf = ae_false; ae_vector_set_length(&state->rstate.ra, 10+1, _state); state->rstate.stage = -1; } /************************************************************************* Integration on a finite interval [A,B]. The integrand has integrable singularities at A/B. F(X) must diverge as "(x-A)^alpha" at A, as "(B-x)^beta" at B, with known alpha/beta (alpha>-1, beta>-1). If alpha/beta are not known, estimates from below can be used (but these estimates should be greater than -1 too). One of the alpha/beta variables (or even both) may be equal to 0, which means that the function F(x) is non-singular at A/B. In any case (singular at the bounds or not), the function F(x) is supposed to be continuous on (A,B). Fast-convergent algorithm based on a Gauss-Kronrod formula is used. Result is calculated with accuracy close to the machine precision. INPUT PARAMETERS: A, B - interval boundaries (A<B, A=B or A>B) Alpha - power-law coefficient of F(x) at A, Alpha>-1 Beta - power-law coefficient of F(x) at B, Beta>-1 OUTPUT PARAMETERS State - structure which stores algorithm state SEE ALSO AutoGKSmooth, AutoGKSmoothW, AutoGKResults. -- ALGLIB -- Copyright 06.05.2009 by Bochkanov Sergey *************************************************************************/ void autogksingular(double a, double b, double alpha, double beta, autogkstate* state, ae_state *_state) { _autogkstate_clear(state); ae_assert(ae_isfinite(a, _state), "AutoGKSingular: A is not finite!", _state); ae_assert(ae_isfinite(b, _state), "AutoGKSingular: B is not finite!", _state); ae_assert(ae_isfinite(alpha, _state), "AutoGKSingular: Alpha is not finite!", _state); ae_assert(ae_isfinite(beta, _state), "AutoGKSingular: Beta is not finite!", _state); state->wrappermode = 1; state->a = a; state->b = b; state->alpha = alpha; state->beta = beta; state->xwidth = 0.0; state->needf = ae_false; ae_vector_set_length(&state->rstate.ra, 10+1, _state); state->rstate.stage = -1; } /************************************************************************* -- ALGLIB -- Copyright 07.05.2009 by Bochkanov Sergey *************************************************************************/ ae_bool autogkiteration(autogkstate* state, ae_state *_state) { double s; double tmp; double eps; double a; double b; double x; double t; double alpha; double beta; double v1; double v2; ae_bool result; /* * Reverse communication preparations * I know it looks ugly, but it works the same way * anywhere from C++ to Python.
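From the caller's side, the loop referred to in the comment above looks as follows. This is a minimal illustrative sketch: the function name integrate_example, the integrand exp(-x*x) and the env argument are ours (env is an ae_state prepared with ae_state_init, as in the GKQ sketch earlier); in typical use the higher-level wrappers documented elsewhere in this manual drive this loop automatically.

    #include <math.h>

    double integrate_example(ae_state *env)
    {
        autogkstate s;
        autogkreport rep;
        double v;
        _autogkstate_init(&s, env);
        _autogkreport_init(&rep, env);
        autogksmooth(0.0, 1.0, &s, env);        /* integrate over [0,1] */
        while( autogkiteration(&s, env) )
        {
            if( s.needf )
                s.f = exp(-s.x*s.x);            /* caller-supplied integrand */
        }
        autogkresults(&s, &v, &rep, env);       /* rep.terminationtype>0 on success */
        _autogkstate_clear(&s);
        return v;
    }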
* * This code initializes locals by: * * random values determined during code * generation - on first subroutine call * * values from previous call - on subsequent calls */ if( state->rstate.stage>=0 ) { s = state->rstate.ra.ptr.p_double[0]; tmp = state->rstate.ra.ptr.p_double[1]; eps = state->rstate.ra.ptr.p_double[2]; a = state->rstate.ra.ptr.p_double[3]; b = state->rstate.ra.ptr.p_double[4]; x = state->rstate.ra.ptr.p_double[5]; t = state->rstate.ra.ptr.p_double[6]; alpha = state->rstate.ra.ptr.p_double[7]; beta = state->rstate.ra.ptr.p_double[8]; v1 = state->rstate.ra.ptr.p_double[9]; v2 = state->rstate.ra.ptr.p_double[10]; } else { s = 359; tmp = -58; eps = -919; a = -909; b = 81; x = 255; t = 74; alpha = -788; beta = 809; v1 = 205; v2 = -838; } if( state->rstate.stage==0 ) { goto lbl_0; } if( state->rstate.stage==1 ) { goto lbl_1; } if( state->rstate.stage==2 ) { goto lbl_2; } /* * Routine body */ eps = (double)(0); a = state->a; b = state->b; alpha = state->alpha; beta = state->beta; state->terminationtype = -1; state->nfev = 0; state->nintervals = 0; /* * smooth function at a finite interval */ if( state->wrappermode!=0 ) { goto lbl_3; } /* * special case */ if( ae_fp_eq(a,b) ) { state->terminationtype = 1; state->v = (double)(0); result = ae_false; return result; } /* * general case */ autogk_autogkinternalprepare(a, b, eps, state->xwidth, &state->internalstate, _state); lbl_5: if( !autogk_autogkinternaliteration(&state->internalstate, _state) ) { goto lbl_6; } x = state->internalstate.x; state->x = x; state->xminusa = x-a; state->bminusx = b-x; state->needf = ae_true; state->rstate.stage = 0; goto lbl_rcomm; lbl_0: state->needf = ae_false; state->nfev = state->nfev+1; state->internalstate.f = state->f; goto lbl_5; lbl_6: state->v = state->internalstate.r; state->terminationtype = state->internalstate.info; state->nintervals = state->internalstate.heapused; result = ae_false; return result; lbl_3: /* * function with power-law singularities at the ends of a finite interval */ if( state->wrappermode!=1 ) { goto lbl_7; } /* * test coefficients */ if( ae_fp_less_eq(alpha,(double)(-1))||ae_fp_less_eq(beta,(double)(-1)) ) { state->terminationtype = -1; state->v = (double)(0); result = ae_false; return result; } /* * special cases */ if( ae_fp_eq(a,b) ) { state->terminationtype = 1; state->v = (double)(0); result = ae_false; return result; } /* * reduction to general form */ if( ae_fp_less(a,b) ) { s = (double)(1); } else { s = (double)(-1); tmp = a; a = b; b = tmp; tmp = alpha; alpha = beta; beta = tmp; } alpha = ae_minreal(alpha, (double)(0), _state); beta = ae_minreal(beta, (double)(0), _state); /* * first, integrate left half of [a,b]: * integral(f(x)dx, a, (b+a)/2) = * = 1/(1+alpha) * integral(t^(-alpha/(1+alpha))*f(a+t^(1/(1+alpha)))dt, 0, (0.5*(b-a))^(1+alpha)) */ autogk_autogkinternalprepare((double)(0), ae_pow(0.5*(b-a), 1+alpha, _state), eps, state->xwidth, &state->internalstate, _state); lbl_9: if( !autogk_autogkinternaliteration(&state->internalstate, _state) ) { goto lbl_10; } /* * Fill State.X, State.XMinusA, State.BMinusX. 
* Latter two are filled correctly even if B<A. */ x = state->internalstate.x; t = ae_pow(x, 1/(1+alpha), _state); state->x = a+t; if( ae_fp_greater(s,(double)(0)) ) { state->xminusa = t; state->bminusx = b-(a+t); } else { state->xminusa = a+t-b; state->bminusx = -t; } state->needf = ae_true; state->rstate.stage = 1; goto lbl_rcomm; lbl_1: state->needf = ae_false; if( ae_fp_neq(alpha,(double)(0)) ) { state->internalstate.f = state->f*ae_pow(x, -alpha/(1+alpha), _state)/(1+alpha); } else { state->internalstate.f = state->f; } state->nfev = state->nfev+1; goto lbl_9; lbl_10: v1 = state->internalstate.r; state->nintervals = state->nintervals+state->internalstate.heapused; /* * then, integrate right half of [a,b]: * integral(f(x)dx, (b+a)/2, b) = * = 1/(1+beta) * integral(t^(-beta/(1+beta))*f(b-t^(1/(1+beta)))dt, 0, (0.5*(b-a))^(1+beta)) */ autogk_autogkinternalprepare((double)(0), ae_pow(0.5*(b-a), 1+beta, _state), eps, state->xwidth, &state->internalstate, _state); lbl_11: if( !autogk_autogkinternaliteration(&state->internalstate, _state) ) { goto lbl_12; } /* * Fill State.X, State.XMinusA, State.BMinusX. * Latter two are filled correctly (X-A, B-X) even if B<A. */ x = state->internalstate.x; t = ae_pow(x, 1/(1+beta), _state); state->x = b-t; if( ae_fp_greater(s,(double)(0)) ) { state->xminusa = b-t-a; state->bminusx = t; } else { state->xminusa = -t; state->bminusx = a-(b-t); } state->needf = ae_true; state->rstate.stage = 2; goto lbl_rcomm; lbl_2: state->needf = ae_false; if( ae_fp_neq(beta,(double)(0)) ) { state->internalstate.f = state->f*ae_pow(x, -beta/(1+beta), _state)/(1+beta); } else { state->internalstate.f = state->f; } state->nfev = state->nfev+1; goto lbl_11; lbl_12: v2 = state->internalstate.r; state->nintervals = state->nintervals+state->internalstate.heapused; /* * final result */ state->v = s*(v1+v2); state->terminationtype = 1; result = ae_false; return result; lbl_7: result = ae_false; return result; /* * Saving state */ lbl_rcomm: result = ae_true; state->rstate.ra.ptr.p_double[0] = s; state->rstate.ra.ptr.p_double[1] = tmp; state->rstate.ra.ptr.p_double[2] = eps; state->rstate.ra.ptr.p_double[3] = a; state->rstate.ra.ptr.p_double[4] = b; state->rstate.ra.ptr.p_double[5] = x; state->rstate.ra.ptr.p_double[6] = t; state->rstate.ra.ptr.p_double[7] = alpha; state->rstate.ra.ptr.p_double[8] = beta; state->rstate.ra.ptr.p_double[9] = v1; state->rstate.ra.ptr.p_double[10] = v2; return result; } /************************************************************************* Adaptive integration results. Called after AutoGKIteration returned False. Input parameters: State - algorithm state (used by AutoGKIteration).
Output parameters: V - integral(f(x)dx,a,b) Rep - optimization report (see AutoGKReport description) -- ALGLIB -- Copyright 14.11.2007 by Bochkanov Sergey *************************************************************************/ void autogkresults(autogkstate* state, double* v, autogkreport* rep, ae_state *_state) { *v = 0; _autogkreport_clear(rep); *v = state->v; rep->terminationtype = state->terminationtype; rep->nfev = state->nfev; rep->nintervals = state->nintervals; } /************************************************************************* Internal AutoGK subroutine eps<0 - error eps=0 - automatic eps selection width<0 - error width=0 - no width requirements *************************************************************************/ static void autogk_autogkinternalprepare(double a, double b, double eps, double xwidth, autogkinternalstate* state, ae_state *_state) { /* * Save settings */ state->a = a; state->b = b; state->eps = eps; state->xwidth = xwidth; /* * Prepare RComm structure */ ae_vector_set_length(&state->rstate.ia, 3+1, _state); ae_vector_set_length(&state->rstate.ra, 8+1, _state); state->rstate.stage = -1; } /************************************************************************* Internal AutoGK subroutine *************************************************************************/ static ae_bool autogk_autogkinternaliteration(autogkinternalstate* state, ae_state *_state) { double c1; double c2; ae_int_t i; ae_int_t j; double intg; double intk; double inta; double v; double ta; double tb; ae_int_t ns; double qeps; ae_int_t info; ae_bool result; /* * Reverse communication preparations * I know it looks ugly, but it works the same way * anywhere from C++ to Python. * * This code initializes locals by: * * random values determined during code * generation - on first subroutine call * * values from previous call - on subsequent calls */ if( state->rstate.stage>=0 ) { i = state->rstate.ia.ptr.p_int[0]; j = state->rstate.ia.ptr.p_int[1]; ns = state->rstate.ia.ptr.p_int[2]; info = state->rstate.ia.ptr.p_int[3]; c1 = state->rstate.ra.ptr.p_double[0]; c2 = state->rstate.ra.ptr.p_double[1]; intg = state->rstate.ra.ptr.p_double[2]; intk = state->rstate.ra.ptr.p_double[3]; inta = state->rstate.ra.ptr.p_double[4]; v = state->rstate.ra.ptr.p_double[5]; ta = state->rstate.ra.ptr.p_double[6]; tb = state->rstate.ra.ptr.p_double[7]; qeps = state->rstate.ra.ptr.p_double[8]; } else { i = 939; j = -526; ns = 763; info = -541; c1 = -698; c2 = -900; intg = -318; intk = -940; inta = 1016; v = -229; ta = -536; tb = 487; qeps = -115; } if( state->rstate.stage==0 ) { goto lbl_0; } if( state->rstate.stage==1 ) { goto lbl_1; } if( state->rstate.stage==2 ) { goto lbl_2; } /* * Routine body */ /* * initialize quadratures. * use 15-point Gauss-Kronrod formula. 
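* * Overall strategy of this routine: subintervals are kept in a heap ordered
* by the per-subinterval error indicator |I_Gauss-I_Kronrod|; on every pass
* the worst subinterval is popped from the heap, bisected, both halves are
* re-estimated with the same 15-point rule and pushed back, until
* SumErr<=Eps*SumAbs (relative accuracy reached) or the MaxSubintervals
* limit is hit. The heap layout is described below.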
*/ state->n = 15; gkqgenerategausslegendre(state->n, &info, &state->qn, &state->wk, &state->wg, _state); if( info<0 ) { state->info = -5; state->r = (double)(0); result = ae_false; return result; } ae_vector_set_length(&state->wr, state->n, _state); for(i=0; i<=state->n-1; i++) { if( i==0 ) { state->wr.ptr.p_double[i] = 0.5*ae_fabs(state->qn.ptr.p_double[1]-state->qn.ptr.p_double[0], _state); continue; } if( i==state->n-1 ) { state->wr.ptr.p_double[state->n-1] = 0.5*ae_fabs(state->qn.ptr.p_double[state->n-1]-state->qn.ptr.p_double[state->n-2], _state); continue; } state->wr.ptr.p_double[i] = 0.5*ae_fabs(state->qn.ptr.p_double[i-1]-state->qn.ptr.p_double[i+1], _state); } /* * special case */ if( ae_fp_eq(state->a,state->b) ) { state->info = 1; state->r = (double)(0); result = ae_false; return result; } /* * test parameters */ if( ae_fp_less(state->eps,(double)(0))||ae_fp_less(state->xwidth,(double)(0)) ) { state->info = -1; state->r = (double)(0); result = ae_false; return result; } state->info = 1; if( ae_fp_eq(state->eps,(double)(0)) ) { state->eps = 100000*ae_machineepsilon; } /* * First, prepare heap * * column 0 - absolute error * * column 1 - integral of a F(x) (calculated using Kronrod extension nodes) * * column 2 - integral of a |F(x)| (calculated using modified rect. method) * * column 3 - left boundary of a subinterval * * column 4 - right boundary of a subinterval */ if( ae_fp_neq(state->xwidth,(double)(0)) ) { goto lbl_3; } /* * no maximum width requirements * start from one big subinterval */ state->heapwidth = 5; state->heapsize = 1; state->heapused = 1; ae_matrix_set_length(&state->heap, state->heapsize, state->heapwidth, _state); c1 = 0.5*(state->b-state->a); c2 = 0.5*(state->b+state->a); intg = (double)(0); intk = (double)(0); inta = (double)(0); i = 0; lbl_5: if( i>state->n-1 ) { goto lbl_7; } /* * obtain F */ state->x = c1*state->qn.ptr.p_double[i]+c2; state->rstate.stage = 0; goto lbl_rcomm; lbl_0: v = state->f; /* * Gauss-Kronrod formula */ intk = intk+v*state->wk.ptr.p_double[i]; if( i%2==1 ) { intg = intg+v*state->wg.ptr.p_double[i]; } /* * Integral |F(x)| * Use rectangles method */ inta = inta+ae_fabs(v, _state)*state->wr.ptr.p_double[i]; i = i+1; goto lbl_5; lbl_7: intk = intk*(state->b-state->a)*0.5; intg = intg*(state->b-state->a)*0.5; inta = inta*(state->b-state->a)*0.5; state->heap.ptr.pp_double[0][0] = ae_fabs(intg-intk, _state); state->heap.ptr.pp_double[0][1] = intk; state->heap.ptr.pp_double[0][2] = inta; state->heap.ptr.pp_double[0][3] = state->a; state->heap.ptr.pp_double[0][4] = state->b; state->sumerr = state->heap.ptr.pp_double[0][0]; state->sumabs = ae_fabs(inta, _state); goto lbl_4; lbl_3: /* * maximum subinterval should be no more than XWidth. 
* so we create Ceil((B-A)/XWidth)+1 small subintervals */ ns = ae_iceil(ae_fabs(state->b-state->a, _state)/state->xwidth, _state)+1; state->heapsize = ns; state->heapused = ns; state->heapwidth = 5; ae_matrix_set_length(&state->heap, state->heapsize, state->heapwidth, _state); state->sumerr = (double)(0); state->sumabs = (double)(0); j = 0; lbl_8: if( j>ns-1 ) { goto lbl_10; } ta = state->a+j*(state->b-state->a)/ns; tb = state->a+(j+1)*(state->b-state->a)/ns; c1 = 0.5*(tb-ta); c2 = 0.5*(tb+ta); intg = (double)(0); intk = (double)(0); inta = (double)(0); i = 0; lbl_11: if( i>state->n-1 ) { goto lbl_13; } /* * obtain F */ state->x = c1*state->qn.ptr.p_double[i]+c2; state->rstate.stage = 1; goto lbl_rcomm; lbl_1: v = state->f; /* * Gauss-Kronrod formula */ intk = intk+v*state->wk.ptr.p_double[i]; if( i%2==1 ) { intg = intg+v*state->wg.ptr.p_double[i]; } /* * Integral |F(x)| * Use rectangles method */ inta = inta+ae_fabs(v, _state)*state->wr.ptr.p_double[i]; i = i+1; goto lbl_11; lbl_13: intk = intk*(tb-ta)*0.5; intg = intg*(tb-ta)*0.5; inta = inta*(tb-ta)*0.5; state->heap.ptr.pp_double[j][0] = ae_fabs(intg-intk, _state); state->heap.ptr.pp_double[j][1] = intk; state->heap.ptr.pp_double[j][2] = inta; state->heap.ptr.pp_double[j][3] = ta; state->heap.ptr.pp_double[j][4] = tb; state->sumerr = state->sumerr+state->heap.ptr.pp_double[j][0]; state->sumabs = state->sumabs+ae_fabs(inta, _state); j = j+1; goto lbl_8; lbl_10: lbl_4: /* * method iterations */ lbl_14: if( ae_false ) { goto lbl_15; } /* * additional memory if needed */ if( state->heapused==state->heapsize ) { autogk_mheapresize(&state->heap, &state->heapsize, 4*state->heapsize, state->heapwidth, _state); } /* * TODO: every 20 iterations recalculate errors/sums */ if( ae_fp_less_eq(state->sumerr,state->eps*state->sumabs)||state->heapused>=autogk_maxsubintervals ) { state->r = (double)(0); for(j=0; j<=state->heapused-1; j++) { state->r = state->r+state->heap.ptr.pp_double[j][1]; } result = ae_false; return result; } /* * Exclude interval with maximum absolute error */ autogk_mheappop(&state->heap, state->heapused, state->heapwidth, _state); state->sumerr = state->sumerr-state->heap.ptr.pp_double[state->heapused-1][0]; state->sumabs = state->sumabs-state->heap.ptr.pp_double[state->heapused-1][2]; /* * Divide interval, create subintervals */ ta = state->heap.ptr.pp_double[state->heapused-1][3]; tb = state->heap.ptr.pp_double[state->heapused-1][4]; state->heap.ptr.pp_double[state->heapused-1][3] = ta; state->heap.ptr.pp_double[state->heapused-1][4] = 0.5*(ta+tb); state->heap.ptr.pp_double[state->heapused][3] = 0.5*(ta+tb); state->heap.ptr.pp_double[state->heapused][4] = tb; j = state->heapused-1; lbl_16: if( j>state->heapused ) { goto lbl_18; } c1 = 0.5*(state->heap.ptr.pp_double[j][4]-state->heap.ptr.pp_double[j][3]); c2 = 0.5*(state->heap.ptr.pp_double[j][4]+state->heap.ptr.pp_double[j][3]); intg = (double)(0); intk = (double)(0); inta = (double)(0); i = 0; lbl_19: if( i>state->n-1 ) { goto lbl_21; } /* * F(x) */ state->x = c1*state->qn.ptr.p_double[i]+c2; state->rstate.stage = 2; goto lbl_rcomm; lbl_2: v = state->f; /* * Gauss-Kronrod formula */ intk = intk+v*state->wk.ptr.p_double[i]; if( i%2==1 ) { intg = intg+v*state->wg.ptr.p_double[i]; } /* * Integral |F(x)| * Use rectangles method */ inta = inta+ae_fabs(v, _state)*state->wr.ptr.p_double[i]; i = i+1; goto lbl_19; lbl_21: intk = intk*(state->heap.ptr.pp_double[j][4]-state->heap.ptr.pp_double[j][3])*0.5; intg = intg*(state->heap.ptr.pp_double[j][4]-state->heap.ptr.pp_double[j][3])*0.5; 
inta = inta*(state->heap.ptr.pp_double[j][4]-state->heap.ptr.pp_double[j][3])*0.5; state->heap.ptr.pp_double[j][0] = ae_fabs(intg-intk, _state); state->heap.ptr.pp_double[j][1] = intk; state->heap.ptr.pp_double[j][2] = inta; state->sumerr = state->sumerr+state->heap.ptr.pp_double[j][0]; state->sumabs = state->sumabs+state->heap.ptr.pp_double[j][2]; j = j+1; goto lbl_16; lbl_18: autogk_mheappush(&state->heap, state->heapused-1, state->heapwidth, _state); autogk_mheappush(&state->heap, state->heapused, state->heapwidth, _state); state->heapused = state->heapused+1; goto lbl_14; lbl_15: result = ae_false; return result; /* * Saving state */ lbl_rcomm: result = ae_true; state->rstate.ia.ptr.p_int[0] = i; state->rstate.ia.ptr.p_int[1] = j; state->rstate.ia.ptr.p_int[2] = ns; state->rstate.ia.ptr.p_int[3] = info; state->rstate.ra.ptr.p_double[0] = c1; state->rstate.ra.ptr.p_double[1] = c2; state->rstate.ra.ptr.p_double[2] = intg; state->rstate.ra.ptr.p_double[3] = intk; state->rstate.ra.ptr.p_double[4] = inta; state->rstate.ra.ptr.p_double[5] = v; state->rstate.ra.ptr.p_double[6] = ta; state->rstate.ra.ptr.p_double[7] = tb; state->rstate.ra.ptr.p_double[8] = qeps; return result; } static void autogk_mheappop(/* Real */ ae_matrix* heap, ae_int_t heapsize, ae_int_t heapwidth, ae_state *_state) { ae_int_t i; ae_int_t p; double t; ae_int_t maxcp; if( heapsize==1 ) { return; } for(i=0; i<=heapwidth-1; i++) { t = heap->ptr.pp_double[heapsize-1][i]; heap->ptr.pp_double[heapsize-1][i] = heap->ptr.pp_double[0][i]; heap->ptr.pp_double[0][i] = t; } p = 0; while(2*p+1<heapsize-1) { maxcp = 2*p+1; if( 2*p+2<heapsize-1 ) { if( ae_fp_greater(heap->ptr.pp_double[2*p+2][0],heap->ptr.pp_double[2*p+1][0]) ) { maxcp = 2*p+2; } } if( ae_fp_less(heap->ptr.pp_double[p][0],heap->ptr.pp_double[maxcp][0]) ) { for(i=0; i<=heapwidth-1; i++) { t = heap->ptr.pp_double[p][i]; heap->ptr.pp_double[p][i] = heap->ptr.pp_double[maxcp][i]; heap->ptr.pp_double[maxcp][i] = t; } p = maxcp; } else { break; } } } static void autogk_mheappush(/* Real */ ae_matrix* heap, ae_int_t heapsize, ae_int_t heapwidth, ae_state *_state) { ae_int_t i; ae_int_t p; double t; ae_int_t parent; if( heapsize==0 ) { return; } p = heapsize; while(p!=0) { parent = (p-1)/2; if( ae_fp_greater(heap->ptr.pp_double[p][0],heap->ptr.pp_double[parent][0]) ) { for(i=0; i<=heapwidth-1; i++) { t = heap->ptr.pp_double[p][i]; heap->ptr.pp_double[p][i] = heap->ptr.pp_double[parent][i]; heap->ptr.pp_double[parent][i] = t; } p = parent; } else { break; } } } static void autogk_mheapresize(/* Real */ ae_matrix* heap, ae_int_t* heapsize, ae_int_t newheapsize, ae_int_t heapwidth, ae_state *_state) { ae_frame _frame_block; ae_matrix tmp; ae_int_t i; ae_frame_make(_state, &_frame_block); ae_matrix_init(&tmp, 0, 0, DT_REAL, _state); ae_matrix_set_length(&tmp, *heapsize, heapwidth, _state); for(i=0; i<=*heapsize-1; i++) { ae_v_move(&tmp.ptr.pp_double[i][0], 1, &heap->ptr.pp_double[i][0], 1, ae_v_len(0,heapwidth-1)); } ae_matrix_set_length(heap, newheapsize, heapwidth, _state); for(i=0; i<=*heapsize-1; i++) { ae_v_move(&heap->ptr.pp_double[i][0], 1, &tmp.ptr.pp_double[i][0], 1, ae_v_len(0,heapwidth-1)); } *heapsize = newheapsize; ae_frame_leave(_state); } void _autogkreport_init(void* _p, ae_state *_state) { autogkreport *p = (autogkreport*)_p; ae_touch_ptr((void*)p); } void _autogkreport_init_copy(void* _dst, void* _src, ae_state *_state) { autogkreport *dst = (autogkreport*)_dst; autogkreport *src = (autogkreport*)_src; dst->terminationtype = src->terminationtype; dst->nfev = src->nfev; dst->nintervals = src->nintervals; } void _autogkreport_clear(void* _p)
{ autogkreport *p = (autogkreport*)_p; ae_touch_ptr((void*)p); } void _autogkreport_destroy(void* _p) { autogkreport *p = (autogkreport*)_p; ae_touch_ptr((void*)p); } void _autogkinternalstate_init(void* _p, ae_state *_state) { autogkinternalstate *p = (autogkinternalstate*)_p; ae_touch_ptr((void*)p); ae_matrix_init(&p->heap, 0, 0, DT_REAL, _state); ae_vector_init(&p->qn, 0, DT_REAL, _state); ae_vector_init(&p->wg, 0, DT_REAL, _state); ae_vector_init(&p->wk, 0, DT_REAL, _state); ae_vector_init(&p->wr, 0, DT_REAL, _state); _rcommstate_init(&p->rstate, _state); } void _autogkinternalstate_init_copy(void* _dst, void* _src, ae_state *_state) { autogkinternalstate *dst = (autogkinternalstate*)_dst; autogkinternalstate *src = (autogkinternalstate*)_src; dst->a = src->a; dst->b = src->b; dst->eps = src->eps; dst->xwidth = src->xwidth; dst->x = src->x; dst->f = src->f; dst->info = src->info; dst->r = src->r; ae_matrix_init_copy(&dst->heap, &src->heap, _state); dst->heapsize = src->heapsize; dst->heapwidth = src->heapwidth; dst->heapused = src->heapused; dst->sumerr = src->sumerr; dst->sumabs = src->sumabs; ae_vector_init_copy(&dst->qn, &src->qn, _state); ae_vector_init_copy(&dst->wg, &src->wg, _state); ae_vector_init_copy(&dst->wk, &src->wk, _state); ae_vector_init_copy(&dst->wr, &src->wr, _state); dst->n = src->n; _rcommstate_init_copy(&dst->rstate, &src->rstate, _state); } void _autogkinternalstate_clear(void* _p) { autogkinternalstate *p = (autogkinternalstate*)_p; ae_touch_ptr((void*)p); ae_matrix_clear(&p->heap); ae_vector_clear(&p->qn); ae_vector_clear(&p->wg); ae_vector_clear(&p->wk); ae_vector_clear(&p->wr); _rcommstate_clear(&p->rstate); } void _autogkinternalstate_destroy(void* _p) { autogkinternalstate *p = (autogkinternalstate*)_p; ae_touch_ptr((void*)p); ae_matrix_destroy(&p->heap); ae_vector_destroy(&p->qn); ae_vector_destroy(&p->wg); ae_vector_destroy(&p->wk); ae_vector_destroy(&p->wr); _rcommstate_destroy(&p->rstate); } void _autogkstate_init(void* _p, ae_state *_state) { autogkstate *p = (autogkstate*)_p; ae_touch_ptr((void*)p); _autogkinternalstate_init(&p->internalstate, _state); _rcommstate_init(&p->rstate, _state); } void _autogkstate_init_copy(void* _dst, void* _src, ae_state *_state) { autogkstate *dst = (autogkstate*)_dst; autogkstate *src = (autogkstate*)_src; dst->a = src->a; dst->b = src->b; dst->alpha = src->alpha; dst->beta = src->beta; dst->xwidth = src->xwidth; dst->x = src->x; dst->xminusa = src->xminusa; dst->bminusx = src->bminusx; dst->needf = src->needf; dst->f = src->f; dst->wrappermode = src->wrappermode; _autogkinternalstate_init_copy(&dst->internalstate, &src->internalstate, _state); _rcommstate_init_copy(&dst->rstate, &src->rstate, _state); dst->v = src->v; dst->terminationtype = src->terminationtype; dst->nfev = src->nfev; dst->nintervals = src->nintervals; } void _autogkstate_clear(void* _p) { autogkstate *p = (autogkstate*)_p; ae_touch_ptr((void*)p); _autogkinternalstate_clear(&p->internalstate); _rcommstate_clear(&p->rstate); } void _autogkstate_destroy(void* _p) { autogkstate *p = (autogkstate*)_p; ae_touch_ptr((void*)p); _autogkinternalstate_destroy(&p->internalstate); _rcommstate_destroy(&p->rstate); } } cpp/src/linalg.cpp0000755000175000017500000615141613105126765014006 0ustar sergeysergey/************************************************************************* ALGLIB 3.11.0 (source code generated 2017-05-11) Copyright (c) Sergey Bochkanov (ALGLIB project). 
>>> SOURCE LICENSE >>> This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation (www.fsf.org); either version 2 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. A copy of the GNU General Public License is available at http://www.fsf.org/licensing/licenses >>> END OF LICENSE >>> *************************************************************************/ #include "stdafx.h" #include "linalg.h" // disable some irrelevant warnings #if (AE_COMPILER==AE_MSVC) #pragma warning(disable:4100) #pragma warning(disable:4127) #pragma warning(disable:4702) #pragma warning(disable:4996) #endif using namespace std; ///////////////////////////////////////////////////////////////////////// // // THIS SECTION CONTAINS IMPLEMENTATION OF C++ INTERFACE // ///////////////////////////////////////////////////////////////////////// namespace alglib { /************************************************************************* Sparse matrix structure. You should use ALGLIB functions to work with sparse matrix. Never try to access its fields directly! NOTES ON THE SPARSE STORAGE FORMATS Sparse matrices can be stored using several formats: * Hash-Table representation * Compressed Row Storage (CRS) * Skyline matrix storage (SKS) Each of the formats has benefits and drawbacks: * Hash-table is good for dynamic operations (insertion of new elements), but does not support linear algebra operations * CRS is good for operations like matrix-vector or matrix-matrix products, but its initialization is less convenient - you have to tell row sizes at the initialization, and you have to fill matrix only row by row, from left to right. * SKS is a special format which is used to store triangular factors from Cholesky factorization. It does not support dynamic modification, and support for linear algebra operations is very limited. 
Tables below outline information about these two formats: OPERATIONS WITH MATRIX HASH CRS SKS creation + + + SparseGet + + + SparseRewriteExisting + + + SparseSet + SparseAdd + SparseGetRow + + SparseGetCompressedRow + + sparse-dense linear algebra + + *************************************************************************/ _sparsematrix_owner::_sparsematrix_owner() { p_struct = (alglib_impl::sparsematrix*)alglib_impl::ae_malloc(sizeof(alglib_impl::sparsematrix), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_sparsematrix_init(p_struct, NULL); } _sparsematrix_owner::_sparsematrix_owner(const _sparsematrix_owner &rhs) { p_struct = (alglib_impl::sparsematrix*)alglib_impl::ae_malloc(sizeof(alglib_impl::sparsematrix), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_sparsematrix_init_copy(p_struct, const_cast(rhs.p_struct), NULL); } _sparsematrix_owner& _sparsematrix_owner::operator=(const _sparsematrix_owner &rhs) { if( this==&rhs ) return *this; alglib_impl::_sparsematrix_clear(p_struct); alglib_impl::_sparsematrix_init_copy(p_struct, const_cast(rhs.p_struct), NULL); return *this; } _sparsematrix_owner::~_sparsematrix_owner() { alglib_impl::_sparsematrix_clear(p_struct); ae_free(p_struct); } alglib_impl::sparsematrix* _sparsematrix_owner::c_ptr() { return p_struct; } alglib_impl::sparsematrix* _sparsematrix_owner::c_ptr() const { return const_cast(p_struct); } sparsematrix::sparsematrix() : _sparsematrix_owner() { } sparsematrix::sparsematrix(const sparsematrix &rhs):_sparsematrix_owner(rhs) { } sparsematrix& sparsematrix::operator=(const sparsematrix &rhs) { if( this==&rhs ) return *this; _sparsematrix_owner::operator=(rhs); return *this; } sparsematrix::~sparsematrix() { } /************************************************************************* Temporary buffers for sparse matrix operations. You should pass an instance of this structure to factorization functions. It allows to reuse memory during repeated sparse factorizations. You do not have to call some initialization function - simply passing an instance to factorization function is enough. 
*************************************************************************/ _sparsebuffers_owner::_sparsebuffers_owner() { p_struct = (alglib_impl::sparsebuffers*)alglib_impl::ae_malloc(sizeof(alglib_impl::sparsebuffers), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_sparsebuffers_init(p_struct, NULL); } _sparsebuffers_owner::_sparsebuffers_owner(const _sparsebuffers_owner &rhs) { p_struct = (alglib_impl::sparsebuffers*)alglib_impl::ae_malloc(sizeof(alglib_impl::sparsebuffers), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_sparsebuffers_init_copy(p_struct, const_cast(rhs.p_struct), NULL); } _sparsebuffers_owner& _sparsebuffers_owner::operator=(const _sparsebuffers_owner &rhs) { if( this==&rhs ) return *this; alglib_impl::_sparsebuffers_clear(p_struct); alglib_impl::_sparsebuffers_init_copy(p_struct, const_cast(rhs.p_struct), NULL); return *this; } _sparsebuffers_owner::~_sparsebuffers_owner() { alglib_impl::_sparsebuffers_clear(p_struct); ae_free(p_struct); } alglib_impl::sparsebuffers* _sparsebuffers_owner::c_ptr() { return p_struct; } alglib_impl::sparsebuffers* _sparsebuffers_owner::c_ptr() const { return const_cast(p_struct); } sparsebuffers::sparsebuffers() : _sparsebuffers_owner() { } sparsebuffers::sparsebuffers(const sparsebuffers &rhs):_sparsebuffers_owner(rhs) { } sparsebuffers& sparsebuffers::operator=(const sparsebuffers &rhs) { if( this==&rhs ) return *this; _sparsebuffers_owner::operator=(rhs); return *this; } sparsebuffers::~sparsebuffers() { } /************************************************************************* This function creates sparse matrix in a Hash-Table format. This function creates Hast-Table matrix, which can be converted to CRS format after its initialization is over. Typical usage scenario for a sparse matrix is: 1. creation in a Hash-Table format 2. insertion of the matrix elements 3. conversion to the CRS representation 4. matrix is passed to some linear algebra algorithm Some information about different matrix formats can be found below, in the "NOTES" section. INPUT PARAMETERS M - number of rows in a matrix, M>=1 N - number of columns in a matrix, N>=1 K - K>=0, expected number of non-zero elements in a matrix. K can be inexact approximation, can be less than actual number of elements (table will grow when needed) or even zero). It is important to understand that although hash-table may grow automatically, it is better to provide good estimate of data size. OUTPUT PARAMETERS S - sparse M*N matrix in Hash-Table representation. All elements of the matrix are zero. NOTE 1 Hash-tables use memory inefficiently, and they have to keep some amount of the "spare memory" in order to have good performance. Hash table for matrix with K non-zero elements will need C*K*(8+2*sizeof(int)) bytes, where C is a small constant, about 1.5-2 in magnitude. CRS storage, from the other side, is more memory-efficient, and needs just K*(8+sizeof(int))+M*sizeof(int) bytes, where M is a number of rows in a matrix. When you convert from the Hash-Table to CRS representation, all unneeded memory will be freed. NOTE 2 Comments of SparseMatrix structure outline information about different sparse storage formats. We recommend you to read them before starting to use ALGLIB sparse matrices. NOTE 3 This function completely overwrites S with new sparse matrix. Previously allocated storage is NOT reused. If you want to reuse already allocated memory, call SparseCreateBuf function. 
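The typical scenario from the list above looks like this with the C++ interface. This is a minimal sketch: sparseset, sparseconverttocrs and sparsemv are the element-insertion, conversion and matrix-vector-product routines documented elsewhere in this manual, and the matrix/vector contents are purely illustrative.

    alglib::sparsematrix s;
    alglib::real_1d_array x = "[1,2,3]", y;
    alglib::sparsecreate(3, 3, s);            // 1. Hash-Table matrix, size estimate omitted
    alglib::sparseset(s, 0, 0, 2.0);          // 2. insert elements in any order
    alglib::sparseset(s, 1, 2, -1.0);
    alglib::sparseset(s, 2, 1, 4.0);
    alglib::sparseconverttocrs(s);            // 3. convert to CRS
    alglib::sparsemv(s, x, y);                // 4. use it in linear algebra (y := s*x)

After the conversion the matrix can no longer be grown element-by-element with SparseSet (see the format table above); new elements should be inserted while the matrix is still in the Hash-Table format.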
-- ALGLIB PROJECT -- Copyright 14.10.2011 by Bochkanov Sergey *************************************************************************/ void sparsecreate(const ae_int_t m, const ae_int_t n, const ae_int_t k, sparsematrix &s) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::sparsecreate(m, n, k, const_cast(s.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function creates sparse matrix in a Hash-Table format. This function creates Hast-Table matrix, which can be converted to CRS format after its initialization is over. Typical usage scenario for a sparse matrix is: 1. creation in a Hash-Table format 2. insertion of the matrix elements 3. conversion to the CRS representation 4. matrix is passed to some linear algebra algorithm Some information about different matrix formats can be found below, in the "NOTES" section. INPUT PARAMETERS M - number of rows in a matrix, M>=1 N - number of columns in a matrix, N>=1 K - K>=0, expected number of non-zero elements in a matrix. K can be inexact approximation, can be less than actual number of elements (table will grow when needed) or even zero). It is important to understand that although hash-table may grow automatically, it is better to provide good estimate of data size. OUTPUT PARAMETERS S - sparse M*N matrix in Hash-Table representation. All elements of the matrix are zero. NOTE 1 Hash-tables use memory inefficiently, and they have to keep some amount of the "spare memory" in order to have good performance. Hash table for matrix with K non-zero elements will need C*K*(8+2*sizeof(int)) bytes, where C is a small constant, about 1.5-2 in magnitude. CRS storage, from the other side, is more memory-efficient, and needs just K*(8+sizeof(int))+M*sizeof(int) bytes, where M is a number of rows in a matrix. When you convert from the Hash-Table to CRS representation, all unneeded memory will be freed. NOTE 2 Comments of SparseMatrix structure outline information about different sparse storage formats. We recommend you to read them before starting to use ALGLIB sparse matrices. NOTE 3 This function completely overwrites S with new sparse matrix. Previously allocated storage is NOT reused. If you want to reuse already allocated memory, call SparseCreateBuf function. -- ALGLIB PROJECT -- Copyright 14.10.2011 by Bochkanov Sergey *************************************************************************/ void sparsecreate(const ae_int_t m, const ae_int_t n, sparsematrix &s) { alglib_impl::ae_state _alglib_env_state; ae_int_t k; k = 0; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::sparsecreate(m, n, k, const_cast(s.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This version of SparseCreate function creates sparse matrix in Hash-Table format, reusing previously allocated storage as much as possible. Read comments for SparseCreate() for more information. INPUT PARAMETERS M - number of rows in a matrix, M>=1 N - number of columns in a matrix, N>=1 K - K>=0, expected number of non-zero elements in a matrix. 
K can be inexact approximation, can be less than actual number of elements (table will grow when needed) or even zero). It is important to understand that although hash-table may grow automatically, it is better to provide good estimate of data size. S - SparseMatrix structure which MAY contain some already allocated storage. OUTPUT PARAMETERS S - sparse M*N matrix in Hash-Table representation. All elements of the matrix are zero. Previously allocated storage is reused, if its size is compatible with expected number of non-zeros K. -- ALGLIB PROJECT -- Copyright 14.01.2014 by Bochkanov Sergey *************************************************************************/ void sparsecreatebuf(const ae_int_t m, const ae_int_t n, const ae_int_t k, const sparsematrix &s) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::sparsecreatebuf(m, n, k, const_cast(s.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This version of SparseCreate function creates sparse matrix in Hash-Table format, reusing previously allocated storage as much as possible. Read comments for SparseCreate() for more information. INPUT PARAMETERS M - number of rows in a matrix, M>=1 N - number of columns in a matrix, N>=1 K - K>=0, expected number of non-zero elements in a matrix. K can be inexact approximation, can be less than actual number of elements (table will grow when needed) or even zero). It is important to understand that although hash-table may grow automatically, it is better to provide good estimate of data size. S - SparseMatrix structure which MAY contain some already allocated storage. OUTPUT PARAMETERS S - sparse M*N matrix in Hash-Table representation. All elements of the matrix are zero. Previously allocated storage is reused, if its size is compatible with expected number of non-zeros K. -- ALGLIB PROJECT -- Copyright 14.01.2014 by Bochkanov Sergey *************************************************************************/ void sparsecreatebuf(const ae_int_t m, const ae_int_t n, const sparsematrix &s) { alglib_impl::ae_state _alglib_env_state; ae_int_t k; k = 0; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::sparsecreatebuf(m, n, k, const_cast(s.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function creates sparse matrix in a CRS format (expert function for situations when you are running out of memory). This function creates CRS matrix. Typical usage scenario for a CRS matrix is: 1. creation (you have to tell number of non-zero elements at each row at this moment) 2. insertion of the matrix elements (row by row, from left to right) 3. matrix is passed to some linear algebra algorithm This function is a memory-efficient alternative to SparseCreate(), but it is more complex because it requires you to know in advance how large your matrix is. Some information about different matrix formats can be found in comments on SparseMatrix structure. We recommend you to read them before starting to use ALGLIB sparse matrices.. 
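A minimal sketch of this CRS creation workflow (sizes and values are
arbitrary; it assumes linalg.h is included and the alglib namespace is
used):

    sparsematrix s;
    integer_1d_array ner = "[2,1]";      // row 0 holds 2 non-zeros, row 1 holds 1
    sparsecreatecrs(2, 3, ner, s);       // 2x3 CRS matrix, storage allocated up front
    sparseset(s, 0, 0, 1.0);             // fill rows from top to bottom,
    sparseset(s, 0, 2, 2.0);             // within a row - from left to right,
    sparseset(s, 1, 1, 3.0);             // exactly NER[I] non-zero values per row
    // the matrix is now ready to be passed to CRS-capable functions such as sparsemv()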
INPUT PARAMETERS M - number of rows in a matrix, M>=1 N - number of columns in a matrix, N>=1 NER - number of elements at each row, array[M], NER[I]>=0 OUTPUT PARAMETERS S - sparse M*N matrix in CRS representation. You have to fill ALL non-zero elements by calling SparseSet() BEFORE you try to use this matrix. NOTE: this function completely overwrites S with new sparse matrix. Previously allocated storage is NOT reused. If you want to reuse already allocated memory, call SparseCreateCRSBuf function. -- ALGLIB PROJECT -- Copyright 14.10.2011 by Bochkanov Sergey *************************************************************************/ void sparsecreatecrs(const ae_int_t m, const ae_int_t n, const integer_1d_array &ner, sparsematrix &s) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::sparsecreatecrs(m, n, const_cast(ner.c_ptr()), const_cast(s.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function creates sparse matrix in a CRS format (expert function for situations when you are running out of memory). This version of CRS matrix creation function may reuse memory already allocated in S. This function creates CRS matrix. Typical usage scenario for a CRS matrix is: 1. creation (you have to tell number of non-zero elements at each row at this moment) 2. insertion of the matrix elements (row by row, from left to right) 3. matrix is passed to some linear algebra algorithm This function is a memory-efficient alternative to SparseCreate(), but it is more complex because it requires you to know in advance how large your matrix is. Some information about different matrix formats can be found in comments on SparseMatrix structure. We recommend you to read them before starting to use ALGLIB sparse matrices.. INPUT PARAMETERS M - number of rows in a matrix, M>=1 N - number of columns in a matrix, N>=1 NER - number of elements at each row, array[M], NER[I]>=0 S - sparse matrix structure with possibly preallocated memory. OUTPUT PARAMETERS S - sparse M*N matrix in CRS representation. You have to fill ALL non-zero elements by calling SparseSet() BEFORE you try to use this matrix. -- ALGLIB PROJECT -- Copyright 14.10.2011 by Bochkanov Sergey *************************************************************************/ void sparsecreatecrsbuf(const ae_int_t m, const ae_int_t n, const integer_1d_array &ner, const sparsematrix &s) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::sparsecreatecrsbuf(m, n, const_cast(ner.c_ptr()), const_cast(s.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function creates sparse matrix in a SKS format (skyline storage format). In most cases you do not need this function - CRS format better suits most use cases. INPUT PARAMETERS M, N - number of rows(M) and columns (N) in a matrix: * M=N (as for now, ALGLIB supports only square SKS) * N>=1 * M>=1 D - "bottom" bandwidths, array[M], D[I]>=0. I-th element stores number of non-zeros at I-th row, below the diagonal (diagonal itself is not included) U - "top" bandwidths, array[N], U[I]>=0. 
I-th element stores number of non-zeros at I-th row, above the diagonal (diagonal itself is not included) OUTPUT PARAMETERS S - sparse M*N matrix in SKS representation. All elements are filled by zeros. You may use SparseRewriteExisting() to change their values. NOTE: this function completely overwrites S with new sparse matrix. Previously allocated storage is NOT reused. If you want to reuse already allocated memory, call SparseCreateSKSBuf function. -- ALGLIB PROJECT -- Copyright 13.01.2014 by Bochkanov Sergey *************************************************************************/ void sparsecreatesks(const ae_int_t m, const ae_int_t n, const integer_1d_array &d, const integer_1d_array &u, sparsematrix &s) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::sparsecreatesks(m, n, const_cast(d.c_ptr()), const_cast(u.c_ptr()), const_cast(s.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This is "buffered" version of SparseCreateSKS() which reuses memory previously allocated in S (of course, memory is reallocated if needed). This function creates sparse matrix in a SKS format (skyline storage format). In most cases you do not need this function - CRS format better suits most use cases. INPUT PARAMETERS M, N - number of rows(M) and columns (N) in a matrix: * M=N (as for now, ALGLIB supports only square SKS) * N>=1 * M>=1 D - "bottom" bandwidths, array[M], 0<=D[I]<=I. I-th element stores number of non-zeros at I-th row, below the diagonal (diagonal itself is not included) U - "top" bandwidths, array[N], 0<=U[I]<=I. I-th element stores number of non-zeros at I-th row, above the diagonal (diagonal itself is not included) OUTPUT PARAMETERS S - sparse M*N matrix in SKS representation. All elements are filled by zeros. You may use SparseSet()/SparseAdd() to change their values. -- ALGLIB PROJECT -- Copyright 13.01.2014 by Bochkanov Sergey *************************************************************************/ void sparsecreatesksbuf(const ae_int_t m, const ae_int_t n, const integer_1d_array &d, const integer_1d_array &u, const sparsematrix &s) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::sparsecreatesksbuf(m, n, const_cast(d.c_ptr()), const_cast(u.c_ptr()), const_cast(s.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function copies S0 to S1. This function completely deallocates memory owned by S1 before creating a copy of S0. If you want to reuse memory, use SparseCopyBuf. NOTE: this function does not verify its arguments, it just copies all fields of the structure. 
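A small sketch illustrating that SparseCopy() produces an independent copy
(values are arbitrary; linalg.h and the alglib namespace are assumed):

    sparsematrix s0, s1;
    sparsecreate(2, 2, 0, s0);
    sparseset(s0, 0, 0, 5.0);
    sparsecopy(s0, s1);                  // S1 receives its own storage
    sparseset(s1, 0, 0, 7.0);            // modifying the copy...
    double v0 = sparseget(s0, 0, 0);     // ...leaves the original unchanged, v0 is 5.0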
-- ALGLIB PROJECT -- Copyright 14.10.2011 by Bochkanov Sergey *************************************************************************/ void sparsecopy(const sparsematrix &s0, sparsematrix &s1) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::sparsecopy(const_cast(s0.c_ptr()), const_cast(s1.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function copies S0 to S1. Memory already allocated in S1 is reused as much as possible. NOTE: this function does not verify its arguments, it just copies all fields of the structure. -- ALGLIB PROJECT -- Copyright 14.10.2011 by Bochkanov Sergey *************************************************************************/ void sparsecopybuf(const sparsematrix &s0, const sparsematrix &s1) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::sparsecopybuf(const_cast(s0.c_ptr()), const_cast(s1.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function efficiently swaps contents of S0 and S1. -- ALGLIB PROJECT -- Copyright 16.01.2014 by Bochkanov Sergey *************************************************************************/ void sparseswap(const sparsematrix &s0, const sparsematrix &s1) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::sparseswap(const_cast(s0.c_ptr()), const_cast(s1.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function adds value to S[i,j] - element of the sparse matrix. Matrix must be in a Hash-Table mode. In case S[i,j] already exists in the table, V i added to its value. In case S[i,j] is non-existent, it is inserted in the table. Table automatically grows when necessary. INPUT PARAMETERS S - sparse M*N matrix in Hash-Table representation. Exception will be thrown for CRS matrix. I - row index of the element to modify, 0<=I(s.c_ptr()), i, j, v, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function modifies S[i,j] - element of the sparse matrix. For Hash-based storage format: * this function can be called at any moment - during matrix initialization or later * new value can be zero or non-zero. In case new value of S[i,j] is zero, this element is deleted from the table. * this function has no effect when called with zero V for non-existent element. For CRS-bases storage format: * this function can be called ONLY DURING MATRIX INITIALIZATION * new value MUST be non-zero. Exception will be thrown for zero V. * elements must be initialized in correct order - from top row to bottom, within row - from left to right. For SKS storage: NOT SUPPORTED! Use SparseRewriteExisting() to work with SKS matrices. 
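A small sketch of these rules for a Hash-Table matrix (values are
arbitrary; linalg.h and the alglib namespace are assumed):

    sparsematrix s;
    sparsecreate(3, 3, 0, s);
    sparseadd(s, 1, 1, 2.0);             // element is created with value 2.0
    sparseadd(s, 1, 1, 0.5);             // value is accumulated, S[1,1] is now 2.5
    sparseset(s, 2, 0, -1.0);            // Hash-Table mode: any order, at any moment
    sparseset(s, 1, 1, 0.0);             // zero value removes the element from the table
    // for a CRS matrix, SparseSet() may only be called during initialization,
    // row by row and left to right, and only with non-zero values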
INPUT PARAMETERS S - sparse M*N matrix in Hash-Table or CRS representation. I - row index of the element to modify, 0<=I(s.c_ptr()), i, j, v, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function returns S[i,j] - element of the sparse matrix. Matrix can be in any mode (Hash-Table, CRS, SKS), but this function is less efficient for CRS matrices. Hash-Table and SKS matrices can find element in O(1) time, while CRS matrices need O(log(RS)) time, where RS is an number of non-zero elements in a row. INPUT PARAMETERS S - sparse M*N matrix in Hash-Table representation. Exception will be thrown for CRS matrix. I - row index of the element to modify, 0<=I(s.c_ptr()), i, j, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function returns I-th diagonal element of the sparse matrix. Matrix can be in any mode (Hash-Table or CRS storage), but this function is most efficient for CRS matrices - it requires less than 50 CPU cycles to extract diagonal element. For Hash-Table matrices we still have O(1) query time, but function is many times slower. INPUT PARAMETERS S - sparse M*N matrix in Hash-Table representation. Exception will be thrown for CRS matrix. I - index of the element to modify, 0<=I(s.c_ptr()), i, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function calculates matrix-vector product S*x. Matrix S must be stored in CRS or SKS format (exception will be thrown otherwise). INPUT PARAMETERS S - sparse M*N matrix in CRS or SKS format. X - array[N], input vector. For performance reasons we make only quick checks - we check that array size is at least N, but we do not check for NAN's or INF's. Y - output buffer, possibly preallocated. In case buffer size is too small to store result, this buffer is automatically resized. OUTPUT PARAMETERS Y - array[M], S*x NOTE: this function throws exception when called for non-CRS/SKS matrix. You must convert your matrix with SparseConvertToCRS/SKS() before using this function. -- ALGLIB PROJECT -- Copyright 14.10.2011 by Bochkanov Sergey *************************************************************************/ void sparsemv(const sparsematrix &s, const real_1d_array &x, real_1d_array &y) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::sparsemv(const_cast(s.c_ptr()), const_cast(x.c_ptr()), const_cast(y.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function calculates matrix-vector product S^T*x. Matrix S must be stored in CRS or SKS format (exception will be thrown otherwise). INPUT PARAMETERS S - sparse M*N matrix in CRS or SKS format. X - array[M], input vector. 
For performance reasons we make only quick checks - we check that array size is at least M, but we do not check for NAN's or INF's. Y - output buffer, possibly preallocated. In case buffer size is too small to store result, this buffer is automatically resized. OUTPUT PARAMETERS Y - array[N], S^T*x NOTE: this function throws exception when called for non-CRS/SKS matrix. You must convert your matrix with SparseConvertToCRS/SKS() before using this function. -- ALGLIB PROJECT -- Copyright 14.10.2011 by Bochkanov Sergey *************************************************************************/ void sparsemtv(const sparsematrix &s, const real_1d_array &x, real_1d_array &y) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::sparsemtv(const_cast(s.c_ptr()), const_cast(x.c_ptr()), const_cast(y.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function simultaneously calculates two matrix-vector products: S*x and S^T*x. S must be square (non-rectangular) matrix stored in CRS or SKS format (exception will be thrown otherwise). INPUT PARAMETERS S - sparse N*N matrix in CRS or SKS format. X - array[N], input vector. For performance reasons we make only quick checks - we check that array size is at least N, but we do not check for NAN's or INF's. Y0 - output buffer, possibly preallocated. In case buffer size is too small to store result, this buffer is automatically resized. Y1 - output buffer, possibly preallocated. In case buffer size is too small to store result, this buffer is automatically resized. OUTPUT PARAMETERS Y0 - array[N], S*x Y1 - array[N], S^T*x NOTE: this function throws exception when called for non-CRS/SKS matrix. You must convert your matrix with SparseConvertToCRS/SKS() before using this function. -- ALGLIB PROJECT -- Copyright 14.10.2011 by Bochkanov Sergey *************************************************************************/ void sparsemv2(const sparsematrix &s, const real_1d_array &x, real_1d_array &y0, real_1d_array &y1) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::sparsemv2(const_cast(s.c_ptr()), const_cast(x.c_ptr()), const_cast(y0.c_ptr()), const_cast(y1.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function calculates matrix-vector product S*x, when S is symmetric matrix. Matrix S must be stored in CRS or SKS format (exception will be thrown otherwise). INPUT PARAMETERS S - sparse M*M matrix in CRS or SKS format. IsUpper - whether upper or lower triangle of S is given: * if upper triangle is given, only S[i,j] for j>=i are used, and lower triangle is ignored (it can be empty - these elements are not referenced at all). * if lower triangle is given, only S[i,j] for j<=i are used, and upper triangle is ignored. X - array[N], input vector. For performance reasons we make only quick checks - we check that array size is at least N, but we do not check for NAN's or INF's. Y - output buffer, possibly preallocated. In case buffer size is too small to store result, this buffer is automatically resized. 
OUTPUT PARAMETERS Y - array[M], S*x NOTE: this function throws exception when called for non-CRS/SKS matrix. You must convert your matrix with SparseConvertToCRS/SKS() before using this function. -- ALGLIB PROJECT -- Copyright 14.10.2011 by Bochkanov Sergey *************************************************************************/ void sparsesmv(const sparsematrix &s, const bool isupper, const real_1d_array &x, real_1d_array &y) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::sparsesmv(const_cast(s.c_ptr()), isupper, const_cast(x.c_ptr()), const_cast(y.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function calculates vector-matrix-vector product x'*S*x, where S is symmetric matrix. Matrix S must be stored in CRS or SKS format (exception will be thrown otherwise). INPUT PARAMETERS S - sparse M*M matrix in CRS or SKS format. IsUpper - whether upper or lower triangle of S is given: * if upper triangle is given, only S[i,j] for j>=i are used, and lower triangle is ignored (it can be empty - these elements are not referenced at all). * if lower triangle is given, only S[i,j] for j<=i are used, and upper triangle is ignored. X - array[N], input vector. For performance reasons we make only quick checks - we check that array size is at least N, but we do not check for NAN's or INF's. RESULT x'*S*x NOTE: this function throws exception when called for non-CRS/SKS matrix. You must convert your matrix with SparseConvertToCRS/SKS() before using this function. -- ALGLIB PROJECT -- Copyright 27.01.2014 by Bochkanov Sergey *************************************************************************/ double sparsevsmv(const sparsematrix &s, const bool isupper, const real_1d_array &x) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::sparsevsmv(const_cast(s.c_ptr()), isupper, const_cast(x.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function calculates matrix-matrix product S*A. Matrix S must be stored in CRS or SKS format (exception will be thrown otherwise). INPUT PARAMETERS S - sparse M*N matrix in CRS or SKS format. A - array[N][K], input dense matrix. For performance reasons we make only quick checks - we check that array size is at least N, but we do not check for NAN's or INF's. K - number of columns of matrix (A). B - output buffer, possibly preallocated. In case buffer size is too small to store result, this buffer is automatically resized. OUTPUT PARAMETERS B - array[M][K], S*A NOTE: this function throws exception when called for non-CRS/SKS matrix. You must convert your matrix with SparseConvertToCRS/SKS() before using this function. 
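A minimal sketch of a sparse-times-dense product (sizes and values are
arbitrary; linalg.h and the alglib namespace are assumed):

    sparsematrix s;
    sparsecreate(2, 3, 0, s);
    sparseset(s, 0, 0, 1.0);
    sparseset(s, 1, 2, 4.0);
    sparseconverttocrs(s);                       // sparsemm() requires CRS or SKS storage
    real_2d_array a = "[[1,2],[3,4],[5,6]]";     // dense 3x2 matrix, so K=2
    real_2d_array b;
    sparsemm(s, a, 2, b);                        // b is 2x2, b = S*A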
-- ALGLIB PROJECT -- Copyright 14.10.2011 by Bochkanov Sergey *************************************************************************/ void sparsemm(const sparsematrix &s, const real_2d_array &a, const ae_int_t k, real_2d_array &b) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::sparsemm(const_cast(s.c_ptr()), const_cast(a.c_ptr()), k, const_cast(b.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function calculates matrix-matrix product S^T*A. Matrix S must be stored in CRS or SKS format (exception will be thrown otherwise). INPUT PARAMETERS S - sparse M*N matrix in CRS or SKS format. A - array[M][K], input dense matrix. For performance reasons we make only quick checks - we check that array size is at least M, but we do not check for NAN's or INF's. K - number of columns of matrix (A). B - output buffer, possibly preallocated. In case buffer size is too small to store result, this buffer is automatically resized. OUTPUT PARAMETERS B - array[N][K], S^T*A NOTE: this function throws exception when called for non-CRS/SKS matrix. You must convert your matrix with SparseConvertToCRS/SKS() before using this function. -- ALGLIB PROJECT -- Copyright 14.10.2011 by Bochkanov Sergey *************************************************************************/ void sparsemtm(const sparsematrix &s, const real_2d_array &a, const ae_int_t k, real_2d_array &b) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::sparsemtm(const_cast(s.c_ptr()), const_cast(a.c_ptr()), k, const_cast(b.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function simultaneously calculates two matrix-matrix products: S*A and S^T*A. S must be square (non-rectangular) matrix stored in CRS or SKS format (exception will be thrown otherwise). INPUT PARAMETERS S - sparse N*N matrix in CRS or SKS format. A - array[N][K], input dense matrix. For performance reasons we make only quick checks - we check that array size is at least N, but we do not check for NAN's or INF's. K - number of columns of matrix (A). B0 - output buffer, possibly preallocated. In case buffer size is too small to store result, this buffer is automatically resized. B1 - output buffer, possibly preallocated. In case buffer size is too small to store result, this buffer is automatically resized. OUTPUT PARAMETERS B0 - array[N][K], S*A B1 - array[N][K], S^T*A NOTE: this function throws exception when called for non-CRS/SKS matrix. You must convert your matrix with SparseConvertToCRS/SKS() before using this function. 
-- ALGLIB PROJECT -- Copyright 14.10.2011 by Bochkanov Sergey *************************************************************************/ void sparsemm2(const sparsematrix &s, const real_2d_array &a, const ae_int_t k, real_2d_array &b0, real_2d_array &b1) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::sparsemm2(const_cast(s.c_ptr()), const_cast(a.c_ptr()), k, const_cast(b0.c_ptr()), const_cast(b1.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function calculates matrix-matrix product S*A, when S is symmetric matrix. Matrix S must be stored in CRS or SKS format (exception will be thrown otherwise). INPUT PARAMETERS S - sparse M*M matrix in CRS or SKS format. IsUpper - whether upper or lower triangle of S is given: * if upper triangle is given, only S[i,j] for j>=i are used, and lower triangle is ignored (it can be empty - these elements are not referenced at all). * if lower triangle is given, only S[i,j] for j<=i are used, and upper triangle is ignored. A - array[N][K], input dense matrix. For performance reasons we make only quick checks - we check that array size is at least N, but we do not check for NAN's or INF's. K - number of columns of matrix (A). B - output buffer, possibly preallocated. In case buffer size is too small to store result, this buffer is automatically resized. OUTPUT PARAMETERS B - array[M][K], S*A NOTE: this function throws exception when called for non-CRS/SKS matrix. You must convert your matrix with SparseConvertToCRS/SKS() before using this function. -- ALGLIB PROJECT -- Copyright 14.10.2011 by Bochkanov Sergey *************************************************************************/ void sparsesmm(const sparsematrix &s, const bool isupper, const real_2d_array &a, const ae_int_t k, real_2d_array &b) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::sparsesmm(const_cast(s.c_ptr()), isupper, const_cast(a.c_ptr()), k, const_cast(b.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function calculates matrix-vector product op(S)*x, when x is vector, S is symmetric triangular matrix, op(S) is transposition or no operation. Matrix S must be stored in CRS or SKS format (exception will be thrown otherwise). INPUT PARAMETERS S - sparse square matrix in CRS or SKS format. IsUpper - whether upper or lower triangle of S is used: * if upper triangle is given, only S[i,j] for j>=i are used, and lower triangle is ignored (it can be empty - these elements are not referenced at all). * if lower triangle is given, only S[i,j] for j<=i are used, and upper triangle is ignored. IsUnit - unit or non-unit diagonal: * if True, diagonal elements of triangular matrix are considered equal to 1.0. Actual elements stored in S are not referenced at all. * if False, diagonal stored in S is used OpType - operation type: * if 0, S*x is calculated * if 1, (S^T)*x is calculated (transposition) X - array[N] which stores input vector. For performance reasons we make only quick checks - we check that array size is at least N, but we do not check for NAN's or INF's. 
Y - possibly preallocated input buffer. Automatically resized if its size is too small. OUTPUT PARAMETERS Y - array[N], op(S)*x NOTE: this function throws exception when called for non-CRS/SKS matrix. You must convert your matrix with SparseConvertToCRS/SKS() before using this function. -- ALGLIB PROJECT -- Copyright 20.01.2014 by Bochkanov Sergey *************************************************************************/ void sparsetrmv(const sparsematrix &s, const bool isupper, const bool isunit, const ae_int_t optype, const real_1d_array &x, real_1d_array &y) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::sparsetrmv(const_cast(s.c_ptr()), isupper, isunit, optype, const_cast(x.c_ptr()), const_cast(y.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function solves linear system op(S)*y=x where x is vector, S is symmetric triangular matrix, op(S) is transposition or no operation. Matrix S must be stored in CRS or SKS format (exception will be thrown otherwise). INPUT PARAMETERS S - sparse square matrix in CRS or SKS format. IsUpper - whether upper or lower triangle of S is used: * if upper triangle is given, only S[i,j] for j>=i are used, and lower triangle is ignored (it can be empty - these elements are not referenced at all). * if lower triangle is given, only S[i,j] for j<=i are used, and upper triangle is ignored. IsUnit - unit or non-unit diagonal: * if True, diagonal elements of triangular matrix are considered equal to 1.0. Actual elements stored in S are not referenced at all. * if False, diagonal stored in S is used. It is your responsibility to make sure that diagonal is non-zero. OpType - operation type: * if 0, S*x is calculated * if 1, (S^T)*x is calculated (transposition) X - array[N] which stores input vector. For performance reasons we make only quick checks - we check that array size is at least N, but we do not check for NAN's or INF's. OUTPUT PARAMETERS X - array[N], inv(op(S))*x NOTE: this function throws exception when called for non-CRS/SKS matrix. You must convert your matrix with SparseConvertToCRS/SKS() before using this function. NOTE: no assertion or tests are done during algorithm operation. It is your responsibility to provide invertible matrix to algorithm. -- ALGLIB PROJECT -- Copyright 20.01.2014 by Bochkanov Sergey *************************************************************************/ void sparsetrsv(const sparsematrix &s, const bool isupper, const bool isunit, const ae_int_t optype, const real_1d_array &x) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::sparsetrsv(const_cast(s.c_ptr()), isupper, isunit, optype, const_cast(x.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This procedure resizes Hash-Table matrix. It can be called when you have deleted too many elements from the matrix, and you want to free unneeded memory. 
-- ALGLIB PROJECT -- Copyright 14.10.2011 by Bochkanov Sergey *************************************************************************/ void sparseresizematrix(const sparsematrix &s) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::sparseresizematrix(const_cast(s.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function is used to enumerate all elements of the sparse matrix. Before first call user initializes T0 and T1 counters by zero. These counters are used to remember current position in a matrix; after each call they are updated by the function. Subsequent calls to this function return non-zero elements of the sparse matrix, one by one. If you enumerate CRS matrix, matrix is traversed from left to right, from top to bottom. In case you enumerate matrix stored as Hash table, elements are returned in random order. EXAMPLE > T0=0 > T1=0 > while SparseEnumerate(S,T0,T1,I,J,V) do > ....do something with I,J,V INPUT PARAMETERS S - sparse M*N matrix in Hash-Table or CRS representation. T0 - internal counter T1 - internal counter OUTPUT PARAMETERS T0 - new value of the internal counter T1 - new value of the internal counter I - row index of non-zero element, 0<=I(s.c_ptr()), &t0, &t1, &i, &j, &v, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function rewrites existing (non-zero) element. It returns True if element exists or False, when it is called for non-existing (zero) element. This function works with any kind of the matrix. The purpose of this function is to provide convenient thread-safe way to modify sparse matrix. Such modification (already existing element is rewritten) is guaranteed to be thread-safe without any synchronization, as long as different threads modify different elements. INPUT PARAMETERS S - sparse M*N matrix in any kind of representation (Hash, SKS, CRS). I - row index of non-zero element to modify, 0<=I(s.c_ptr()), i, j, v, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function returns I-th row of the sparse matrix. Matrix must be stored in CRS or SKS format. INPUT PARAMETERS: S - sparse M*N matrix in CRS format I - row index, 0<=I(s.c_ptr()), i, const_cast(irow.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function returns I-th row of the sparse matrix IN COMPRESSED FORMAT - only non-zero elements are returned (with their indexes). Matrix must be stored in CRS or SKS format. 
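A short sketch of scanning an already initialized CRS matrix S row by row
with this function, using the output parameters listed below (linalg.h and
the alglib namespace are assumed):

    double total = 0;
    integer_1d_array colidx;
    real_1d_array vals;
    ae_int_t nzcnt;
    for(ae_int_t i=0; i<sparsegetnrows(s); i++)
    {
        sparsegetcompressedrow(s, i, colidx, vals, nzcnt);   // only stored elements of row I
        for(ae_int_t k=0; k<nzcnt; k++)
            total += vals(k);                                // colidx(k) holds the column index
    }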
INPUT PARAMETERS: S - sparse M*N matrix in CRS format I - row index, 0<=I(s.c_ptr()), i, const_cast(colidx.c_ptr()), const_cast(vals.c_ptr()), &nzcnt, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function performs efficient in-place transpose of SKS matrix. No additional memory is allocated during transposition. This function supports only skyline storage format (SKS). INPUT PARAMETERS S - sparse matrix in SKS format. OUTPUT PARAMETERS S - sparse matrix, transposed. -- ALGLIB PROJECT -- Copyright 16.01.2014 by Bochkanov Sergey *************************************************************************/ void sparsetransposesks(const sparsematrix &s) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::sparsetransposesks(const_cast(s.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function performs in-place conversion to desired sparse storage format. INPUT PARAMETERS S0 - sparse matrix in any format. Fmt - desired storage format of the output, as returned by SparseGetMatrixType() function: * 0 for hash-based storage * 1 for CRS * 2 for SKS OUTPUT PARAMETERS S0 - sparse matrix in requested format. NOTE: in-place conversion wastes a lot of memory which is used to store temporaries. If you perform a lot of repeated conversions, we recommend to use out-of-place buffered conversion functions, like SparseCopyToBuf(), which can reuse already allocated memory. -- ALGLIB PROJECT -- Copyright 16.01.2014 by Bochkanov Sergey *************************************************************************/ void sparseconvertto(const sparsematrix &s0, const ae_int_t fmt) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::sparseconvertto(const_cast(s0.c_ptr()), fmt, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function performs out-of-place conversion to desired sparse storage format. S0 is copied to S1 and converted on-the-fly. Memory allocated in S1 is reused to maximum extent possible. INPUT PARAMETERS S0 - sparse matrix in any format. Fmt - desired storage format of the output, as returned by SparseGetMatrixType() function: * 0 for hash-based storage * 1 for CRS * 2 for SKS OUTPUT PARAMETERS S1 - sparse matrix in requested format. 
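A small sketch of the format codes in action, combining in-place and
buffered conversion (values are arbitrary; linalg.h and the alglib
namespace are assumed):

    sparsematrix s0, s1;
    sparsecreate(3, 3, 0, s0);                // s0 starts in Hash-Table format (code 0)
    sparseset(s0, 0, 1, 2.0);
    sparsecopytobuf(s0, 1, s1);               // s1 becomes a CRS (code 1) copy, s0 is untouched
    ae_int_t fmt = sparsegetmatrixtype(s1);   // fmt is now 1
    sparseconvertto(s0, 2);                   // in-place conversion of the square s0 to SKS (code 2)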
-- ALGLIB PROJECT -- Copyright 16.01.2014 by Bochkanov Sergey *************************************************************************/ void sparsecopytobuf(const sparsematrix &s0, const ae_int_t fmt, const sparsematrix &s1) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::sparsecopytobuf(const_cast(s0.c_ptr()), fmt, const_cast(s1.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function performs in-place conversion to Hash table storage. INPUT PARAMETERS S - sparse matrix in CRS format. OUTPUT PARAMETERS S - sparse matrix in Hash table format. NOTE: this function has no effect when called with matrix which is already in Hash table mode. NOTE: in-place conversion involves allocation of temporary arrays. If you perform a lot of repeated in- place conversions, it may lead to memory fragmentation. Consider using out-of-place SparseCopyToHashBuf() function in this case. -- ALGLIB PROJECT -- Copyright 20.07.2012 by Bochkanov Sergey *************************************************************************/ void sparseconverttohash(const sparsematrix &s) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::sparseconverttohash(const_cast(s.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function performs out-of-place conversion to Hash table storage format. S0 is copied to S1 and converted on-the-fly. INPUT PARAMETERS S0 - sparse matrix in any format. OUTPUT PARAMETERS S1 - sparse matrix in Hash table format. NOTE: if S0 is stored as Hash-table, it is just copied without conversion. NOTE: this function de-allocates memory occupied by S1 before starting conversion. If you perform a lot of repeated conversions, it may lead to memory fragmentation. In this case we recommend you to use SparseCopyToHashBuf() function which re-uses memory in S1 as much as possible. -- ALGLIB PROJECT -- Copyright 20.07.2012 by Bochkanov Sergey *************************************************************************/ void sparsecopytohash(const sparsematrix &s0, sparsematrix &s1) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::sparsecopytohash(const_cast(s0.c_ptr()), const_cast(s1.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function performs out-of-place conversion to Hash table storage format. S0 is copied to S1 and converted on-the-fly. Memory allocated in S1 is reused to maximum extent possible. INPUT PARAMETERS S0 - sparse matrix in any format. OUTPUT PARAMETERS S1 - sparse matrix in Hash table format. NOTE: if S0 is stored as Hash-table, it is just copied without conversion. 
-- ALGLIB PROJECT -- Copyright 20.07.2012 by Bochkanov Sergey *************************************************************************/ void sparsecopytohashbuf(const sparsematrix &s0, const sparsematrix &s1) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::sparsecopytohashbuf(const_cast(s0.c_ptr()), const_cast(s1.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function converts matrix to CRS format. Some algorithms (linear algebra ones, for example) require matrices in CRS format. This function allows to perform in-place conversion. INPUT PARAMETERS S - sparse M*N matrix in any format OUTPUT PARAMETERS S - matrix in CRS format NOTE: this function has no effect when called with matrix which is already in CRS mode. NOTE: this function allocates temporary memory to store a copy of the matrix. If you perform a lot of repeated conversions, we recommend you to use SparseCopyToCRSBuf() function, which can reuse previously allocated memory. -- ALGLIB PROJECT -- Copyright 14.10.2011 by Bochkanov Sergey *************************************************************************/ void sparseconverttocrs(const sparsematrix &s) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::sparseconverttocrs(const_cast(s.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function performs out-of-place conversion to CRS format. S0 is copied to S1 and converted on-the-fly. INPUT PARAMETERS S0 - sparse matrix in any format. OUTPUT PARAMETERS S1 - sparse matrix in CRS format. NOTE: if S0 is stored as CRS, it is just copied without conversion. NOTE: this function de-allocates memory occupied by S1 before starting CRS conversion. If you perform a lot of repeated CRS conversions, it may lead to memory fragmentation. In this case we recommend you to use SparseCopyToCRSBuf() function which re-uses memory in S1 as much as possible. -- ALGLIB PROJECT -- Copyright 20.07.2012 by Bochkanov Sergey *************************************************************************/ void sparsecopytocrs(const sparsematrix &s0, sparsematrix &s1) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::sparsecopytocrs(const_cast(s0.c_ptr()), const_cast(s1.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function performs out-of-place conversion to CRS format. S0 is copied to S1 and converted on-the-fly. Memory allocated in S1 is reused to maximum extent possible. INPUT PARAMETERS S0 - sparse matrix in any format. S1 - matrix which may contain some pre-allocated memory, or can be just uninitialized structure. OUTPUT PARAMETERS S1 - sparse matrix in CRS format. NOTE: if S0 is stored as CRS, it is just copied without conversion. 
-- ALGLIB PROJECT -- Copyright 20.07.2012 by Bochkanov Sergey *************************************************************************/ void sparsecopytocrsbuf(const sparsematrix &s0, const sparsematrix &s1) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::sparsecopytocrsbuf(const_cast(s0.c_ptr()), const_cast(s1.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function performs in-place conversion to SKS format. INPUT PARAMETERS S - sparse matrix in any format. OUTPUT PARAMETERS S - sparse matrix in SKS format. NOTE: this function has no effect when called with matrix which is already in SKS mode. NOTE: in-place conversion involves allocation of temporary arrays. If you perform a lot of repeated in- place conversions, it may lead to memory fragmentation. Consider using out-of-place SparseCopyToSKSBuf() function in this case. -- ALGLIB PROJECT -- Copyright 15.01.2014 by Bochkanov Sergey *************************************************************************/ void sparseconverttosks(const sparsematrix &s) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::sparseconverttosks(const_cast(s.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function performs out-of-place conversion to SKS storage format. S0 is copied to S1 and converted on-the-fly. INPUT PARAMETERS S0 - sparse matrix in any format. OUTPUT PARAMETERS S1 - sparse matrix in SKS format. NOTE: if S0 is stored as SKS, it is just copied without conversion. NOTE: this function de-allocates memory occupied by S1 before starting conversion. If you perform a lot of repeated conversions, it may lead to memory fragmentation. In this case we recommend you to use SparseCopyToSKSBuf() function which re-uses memory in S1 as much as possible. -- ALGLIB PROJECT -- Copyright 20.07.2012 by Bochkanov Sergey *************************************************************************/ void sparsecopytosks(const sparsematrix &s0, sparsematrix &s1) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::sparsecopytosks(const_cast(s0.c_ptr()), const_cast(s1.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function performs out-of-place conversion to SKS format. S0 is copied to S1 and converted on-the-fly. Memory allocated in S1 is reused to maximum extent possible. INPUT PARAMETERS S0 - sparse matrix in any format. OUTPUT PARAMETERS S1 - sparse matrix in SKS format. NOTE: if S0 is stored as SKS, it is just copied without conversion. 
-- ALGLIB PROJECT -- Copyright 20.07.2012 by Bochkanov Sergey *************************************************************************/ void sparsecopytosksbuf(const sparsematrix &s0, const sparsematrix &s1) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::sparsecopytosksbuf(const_cast(s0.c_ptr()), const_cast(s1.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function returns type of the matrix storage format. INPUT PARAMETERS: S - sparse matrix. RESULT: sparse storage format used by matrix: 0 - Hash-table 1 - CRS (compressed row storage) 2 - SKS (skyline) NOTE: future versions of ALGLIB may include additional sparse storage formats. -- ALGLIB PROJECT -- Copyright 20.07.2012 by Bochkanov Sergey *************************************************************************/ ae_int_t sparsegetmatrixtype(const sparsematrix &s) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::ae_int_t result = alglib_impl::sparsegetmatrixtype(const_cast(s.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function checks matrix storage format and returns True when matrix is stored using Hash table representation. INPUT PARAMETERS: S - sparse matrix. RESULT: True if matrix type is Hash table False if matrix type is not Hash table -- ALGLIB PROJECT -- Copyright 20.07.2012 by Bochkanov Sergey *************************************************************************/ bool sparseishash(const sparsematrix &s) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { ae_bool result = alglib_impl::sparseishash(const_cast(s.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function checks matrix storage format and returns True when matrix is stored using CRS representation. INPUT PARAMETERS: S - sparse matrix. RESULT: True if matrix type is CRS False if matrix type is not CRS -- ALGLIB PROJECT -- Copyright 20.07.2012 by Bochkanov Sergey *************************************************************************/ bool sparseiscrs(const sparsematrix &s) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { ae_bool result = alglib_impl::sparseiscrs(const_cast(s.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function checks matrix storage format and returns True when matrix is stored using SKS representation. INPUT PARAMETERS: S - sparse matrix. 
RESULT: True if matrix type is SKS False if matrix type is not SKS -- ALGLIB PROJECT -- Copyright 20.07.2012 by Bochkanov Sergey *************************************************************************/ bool sparseissks(const sparsematrix &s) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { ae_bool result = alglib_impl::sparseissks(const_cast(s.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* The function frees all memory occupied by sparse matrix. Sparse matrix structure becomes unusable after this call. OUTPUT PARAMETERS S - sparse matrix to delete -- ALGLIB PROJECT -- Copyright 24.07.2012 by Bochkanov Sergey *************************************************************************/ void sparsefree(sparsematrix &s) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::sparsefree(const_cast(s.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* The function returns number of rows of a sparse matrix. RESULT: number of rows of a sparse matrix. -- ALGLIB PROJECT -- Copyright 23.08.2012 by Bochkanov Sergey *************************************************************************/ ae_int_t sparsegetnrows(const sparsematrix &s) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::ae_int_t result = alglib_impl::sparsegetnrows(const_cast(s.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* The function returns number of columns of a sparse matrix. RESULT: number of columns of a sparse matrix. -- ALGLIB PROJECT -- Copyright 23.08.2012 by Bochkanov Sergey *************************************************************************/ ae_int_t sparsegetncols(const sparsematrix &s) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::ae_int_t result = alglib_impl::sparsegetncols(const_cast(s.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* The function returns number of strictly upper triangular non-zero elements in the matrix. It counts SYMBOLICALLY non-zero elements, i.e. entries in the sparse matrix data structure. If some element has zero numerical value, it is still counted. This function has different cost for different types of matrices: * for hash-based matrices it involves complete pass over entire hash-table with O(NNZ) cost, where NNZ is number of non-zero elements * for CRS and SKS matrix types cost of counting is O(N) (N - matrix size). 
RESULT: number of non-zero elements strictly above main diagonal -- ALGLIB PROJECT -- Copyright 12.02.2014 by Bochkanov Sergey *************************************************************************/ ae_int_t sparsegetuppercount(const sparsematrix &s) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::ae_int_t result = alglib_impl::sparsegetuppercount(const_cast(s.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* The function returns number of strictly lower triangular non-zero elements in the matrix. It counts SYMBOLICALLY non-zero elements, i.e. entries in the sparse matrix data structure. If some element has zero numerical value, it is still counted. This function has different cost for different types of matrices: * for hash-based matrices it involves complete pass over entire hash-table with O(NNZ) cost, where NNZ is number of non-zero elements * for CRS and SKS matrix types cost of counting is O(N) (N - matrix size). RESULT: number of non-zero elements strictly below main diagonal -- ALGLIB PROJECT -- Copyright 12.02.2014 by Bochkanov Sergey *************************************************************************/ ae_int_t sparsegetlowercount(const sparsematrix &s) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::ae_int_t result = alglib_impl::sparsegetlowercount(const_cast(s.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Generation of a random uniformly distributed (Haar) orthogonal matrix INPUT PARAMETERS: N - matrix size, N>=1 OUTPUT PARAMETERS: A - orthogonal NxN matrix, array[0..N-1,0..N-1] NOTE: this function uses algorithm described in Stewart, G. W. (1980), "The Efficient Generation of Random Orthogonal Matrices with an Application to Condition Estimators". Speaking short, to generate an (N+1)x(N+1) orthogonal matrix, it: * takes an NxN one * takes uniformly distributed unit vector of dimension N+1. * constructs a Householder reflection from the vector, then applies it to the smaller matrix (embedded in the larger size with a 1 at the bottom right corner). 
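A minimal usage sketch (assuming "using namespace alglib"; the output array is allocated by the generator itself):

    ae_int_t n = 5;
    real_2d_array q, a;
    rmatrixrndorthogonal(n, q);     // Q is a random 5x5 orthogonal matrix
    rmatrixrndcond(n, 1.0E3, a);    // see rmatrixrndcond below: norm2(A)=1, cond(A)=1000
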
-- ALGLIB routine -- 04.12.2009 Bochkanov Sergey *************************************************************************/ void rmatrixrndorthogonal(const ae_int_t n, real_2d_array &a) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::rmatrixrndorthogonal(n, const_cast(a.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Generation of random NxN matrix with given condition number and norm2(A)=1 INPUT PARAMETERS: N - matrix size C - condition number (in 2-norm) OUTPUT PARAMETERS: A - random matrix with norm2(A)=1 and cond(A)=C -- ALGLIB routine -- 04.12.2009 Bochkanov Sergey *************************************************************************/ void rmatrixrndcond(const ae_int_t n, const double c, real_2d_array &a) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::rmatrixrndcond(n, c, const_cast(a.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Generation of a random Haar distributed orthogonal complex matrix INPUT PARAMETERS: N - matrix size, N>=1 OUTPUT PARAMETERS: A - orthogonal NxN matrix, array[0..N-1,0..N-1] NOTE: this function uses algorithm described in Stewart, G. W. (1980), "The Efficient Generation of Random Orthogonal Matrices with an Application to Condition Estimators". Speaking short, to generate an (N+1)x(N+1) orthogonal matrix, it: * takes an NxN one * takes uniformly distributed unit vector of dimension N+1. * constructs a Householder reflection from the vector, then applies it to the smaller matrix (embedded in the larger size with a 1 at the bottom right corner). 
-- ALGLIB routine -- 04.12.2009 Bochkanov Sergey *************************************************************************/ void cmatrixrndorthogonal(const ae_int_t n, complex_2d_array &a) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::cmatrixrndorthogonal(n, const_cast(a.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Generation of random NxN complex matrix with given condition number C and norm2(A)=1 INPUT PARAMETERS: N - matrix size C - condition number (in 2-norm) OUTPUT PARAMETERS: A - random matrix with norm2(A)=1 and cond(A)=C -- ALGLIB routine -- 04.12.2009 Bochkanov Sergey *************************************************************************/ void cmatrixrndcond(const ae_int_t n, const double c, complex_2d_array &a) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::cmatrixrndcond(n, c, const_cast(a.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Generation of random NxN symmetric matrix with given condition number and norm2(A)=1 INPUT PARAMETERS: N - matrix size C - condition number (in 2-norm) OUTPUT PARAMETERS: A - random matrix with norm2(A)=1 and cond(A)=C -- ALGLIB routine -- 04.12.2009 Bochkanov Sergey *************************************************************************/ void smatrixrndcond(const ae_int_t n, const double c, real_2d_array &a) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::smatrixrndcond(n, c, const_cast(a.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Generation of random NxN symmetric positive definite matrix with given condition number and norm2(A)=1 INPUT PARAMETERS: N - matrix size C - condition number (in 2-norm) OUTPUT PARAMETERS: A - random SPD matrix with norm2(A)=1 and cond(A)=C -- ALGLIB routine -- 04.12.2009 Bochkanov Sergey *************************************************************************/ void spdmatrixrndcond(const ae_int_t n, const double c, real_2d_array &a) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::spdmatrixrndcond(n, c, const_cast(a.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Generation of random NxN Hermitian matrix with given condition number and norm2(A)=1 INPUT PARAMETERS: N - matrix size C - condition number (in 2-norm) OUTPUT PARAMETERS: A - random matrix with norm2(A)=1 and cond(A)=C -- ALGLIB routine -- 04.12.2009 Bochkanov Sergey *************************************************************************/ void hmatrixrndcond(const ae_int_t n, const double c, complex_2d_array &a) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try 
{ alglib_impl::hmatrixrndcond(n, c, const_cast(a.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Generation of random NxN Hermitian positive definite matrix with given condition number and norm2(A)=1 INPUT PARAMETERS: N - matrix size C - condition number (in 2-norm) OUTPUT PARAMETERS: A - random HPD matrix with norm2(A)=1 and cond(A)=C -- ALGLIB routine -- 04.12.2009 Bochkanov Sergey *************************************************************************/ void hpdmatrixrndcond(const ae_int_t n, const double c, complex_2d_array &a) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::hpdmatrixrndcond(n, c, const_cast(a.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Multiplication of MxN matrix by NxN random Haar distributed orthogonal matrix INPUT PARAMETERS: A - matrix, array[0..M-1, 0..N-1] M, N- matrix size OUTPUT PARAMETERS: A - A*Q, where Q is random NxN orthogonal matrix -- ALGLIB routine -- 04.12.2009 Bochkanov Sergey *************************************************************************/ void rmatrixrndorthogonalfromtheright(real_2d_array &a, const ae_int_t m, const ae_int_t n) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::rmatrixrndorthogonalfromtheright(const_cast(a.c_ptr()), m, n, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Multiplication of MxN matrix by MxM random Haar distributed orthogonal matrix INPUT PARAMETERS: A - matrix, array[0..M-1, 0..N-1] M, N- matrix size OUTPUT PARAMETERS: A - Q*A, where Q is random MxM orthogonal matrix -- ALGLIB routine -- 04.12.2009 Bochkanov Sergey *************************************************************************/ void rmatrixrndorthogonalfromtheleft(real_2d_array &a, const ae_int_t m, const ae_int_t n) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::rmatrixrndorthogonalfromtheleft(const_cast(a.c_ptr()), m, n, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Multiplication of MxN complex matrix by NxN random Haar distributed complex orthogonal matrix INPUT PARAMETERS: A - matrix, array[0..M-1, 0..N-1] M, N- matrix size OUTPUT PARAMETERS: A - A*Q, where Q is random NxN orthogonal matrix -- ALGLIB routine -- 04.12.2009 Bochkanov Sergey *************************************************************************/ void cmatrixrndorthogonalfromtheright(complex_2d_array &a, const ae_int_t m, const ae_int_t n) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::cmatrixrndorthogonalfromtheright(const_cast(a.c_ptr()), m, n, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } 
catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Multiplication of MxN complex matrix by MxM random Haar distributed complex orthogonal matrix INPUT PARAMETERS: A - matrix, array[0..M-1, 0..N-1] M, N- matrix size OUTPUT PARAMETERS: A - Q*A, where Q is random MxM orthogonal matrix -- ALGLIB routine -- 04.12.2009 Bochkanov Sergey *************************************************************************/ void cmatrixrndorthogonalfromtheleft(complex_2d_array &a, const ae_int_t m, const ae_int_t n) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::cmatrixrndorthogonalfromtheleft(const_cast(a.c_ptr()), m, n, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Symmetric multiplication of NxN matrix by random Haar distributed orthogonal matrix INPUT PARAMETERS: A - matrix, array[0..N-1, 0..N-1] N - matrix size OUTPUT PARAMETERS: A - Q'*A*Q, where Q is random NxN orthogonal matrix -- ALGLIB routine -- 04.12.2009 Bochkanov Sergey *************************************************************************/ void smatrixrndmultiply(real_2d_array &a, const ae_int_t n) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::smatrixrndmultiply(const_cast(a.c_ptr()), n, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Hermitian multiplication of NxN matrix by random Haar distributed complex orthogonal matrix INPUT PARAMETERS: A - matrix, array[0..N-1, 0..N-1] N - matrix size OUTPUT PARAMETERS: A - Q^H*A*Q, where Q is random NxN orthogonal matrix -- ALGLIB routine -- 04.12.2009 Bochkanov Sergey *************************************************************************/ void hmatrixrndmultiply(complex_2d_array &a, const ae_int_t n) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::hmatrixrndmultiply(const_cast(a.c_ptr()), n, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Cache-oblivous complex "copy-and-transpose" Input parameters: M - number of rows N - number of columns A - source matrix, MxN submatrix is copied and transposed IA - submatrix offset (row index) JA - submatrix offset (column index) B - destination matrix, must be large enough to store result IB - submatrix offset (row index) JB - submatrix offset (column index) *************************************************************************/ void cmatrixtranspose(const ae_int_t m, const ae_int_t n, const complex_2d_array &a, const ae_int_t ia, const ae_int_t ja, complex_2d_array &b, const ae_int_t ib, const ae_int_t jb) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::cmatrixtranspose(m, n, const_cast(a.c_ptr()), ia, ja, const_cast(b.c_ptr()), ib, jb, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } 
catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Cache-oblivious real "copy-and-transpose" Input parameters: M - number of rows N - number of columns A - source matrix, MxN submatrix is copied and transposed IA - submatrix offset (row index) JA - submatrix offset (column index) B - destination matrix, must be large enough to store result IB - submatrix offset (row index) JB - submatrix offset (column index) *************************************************************************/ void rmatrixtranspose(const ae_int_t m, const ae_int_t n, const real_2d_array &a, const ae_int_t ia, const ae_int_t ja, const real_2d_array &b, const ae_int_t ib, const ae_int_t jb) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::rmatrixtranspose(m, n, const_cast<alglib_impl::ae_matrix*>(a.c_ptr()), ia, ja, const_cast<alglib_impl::ae_matrix*>(b.c_ptr()), ib, jb, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This code enforces symmetry of the matrix by copying the upper part to the lower one (or vice versa). INPUT PARAMETERS: A - matrix N - number of rows/columns IsUpper - whether we want to copy upper triangle to lower one (True) or vice versa (False). *************************************************************************/ void rmatrixenforcesymmetricity(const real_2d_array &a, const ae_int_t n, const bool isupper) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::rmatrixenforcesymmetricity(const_cast<alglib_impl::ae_matrix*>(a.c_ptr()), n, isupper, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Copy Input parameters: M - number of rows N - number of columns A - source matrix, MxN submatrix is copied IA - submatrix offset (row index) JA - submatrix offset (column index) B - destination matrix, must be large enough to store result IB - submatrix offset (row index) JB - submatrix offset (column index) *************************************************************************/ void cmatrixcopy(const ae_int_t m, const ae_int_t n, const complex_2d_array &a, const ae_int_t ia, const ae_int_t ja, complex_2d_array &b, const ae_int_t ib, const ae_int_t jb) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::cmatrixcopy(m, n, const_cast<alglib_impl::ae_matrix*>(a.c_ptr()), ia, ja, const_cast<alglib_impl::ae_matrix*>(b.c_ptr()), ib, jb, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Copy Input parameters: M - number of rows N - number of columns A - source matrix, MxN submatrix is copied IA - submatrix offset (row index) JA - submatrix offset (column index) B - destination matrix, must be large enough to store result IB - submatrix offset (row index) JB - submatrix offset (column index) *************************************************************************/ void rmatrixcopy(const ae_int_t m, const ae_int_t n, const real_2d_array &a, const ae_int_t ia, const
ae_int_t ja, real_2d_array &b, const ae_int_t ib, const ae_int_t jb) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::rmatrixcopy(m, n, const_cast(a.c_ptr()), ia, ja, const_cast(b.c_ptr()), ib, jb, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Rank-1 correction: A := A + u*v' INPUT PARAMETERS: M - number of rows N - number of columns A - target matrix, MxN submatrix is updated IA - submatrix offset (row index) JA - submatrix offset (column index) U - vector #1 IU - subvector offset V - vector #2 IV - subvector offset *************************************************************************/ void cmatrixrank1(const ae_int_t m, const ae_int_t n, complex_2d_array &a, const ae_int_t ia, const ae_int_t ja, complex_1d_array &u, const ae_int_t iu, complex_1d_array &v, const ae_int_t iv) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::cmatrixrank1(m, n, const_cast(a.c_ptr()), ia, ja, const_cast(u.c_ptr()), iu, const_cast(v.c_ptr()), iv, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Rank-1 correction: A := A + u*v' INPUT PARAMETERS: M - number of rows N - number of columns A - target matrix, MxN submatrix is updated IA - submatrix offset (row index) JA - submatrix offset (column index) U - vector #1 IU - subvector offset V - vector #2 IV - subvector offset *************************************************************************/ void rmatrixrank1(const ae_int_t m, const ae_int_t n, real_2d_array &a, const ae_int_t ia, const ae_int_t ja, real_1d_array &u, const ae_int_t iu, real_1d_array &v, const ae_int_t iv) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::rmatrixrank1(m, n, const_cast(a.c_ptr()), ia, ja, const_cast(u.c_ptr()), iu, const_cast(v.c_ptr()), iv, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Matrix-vector product: y := op(A)*x INPUT PARAMETERS: M - number of rows of op(A) M>=0 N - number of columns of op(A) N>=0 A - target matrix IA - submatrix offset (row index) JA - submatrix offset (column index) OpA - operation type: * OpA=0 => op(A) = A * OpA=1 => op(A) = A^T * OpA=2 => op(A) = A^H X - input vector IX - subvector offset IY - subvector offset Y - preallocated matrix, must be large enough to store result OUTPUT PARAMETERS: Y - vector which stores result if M=0, then subroutine does nothing. if N=0, Y is filled by zeros. 
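A small usage sketch (assuming "using namespace alglib"; note that Y must be allocated by the caller before the call):

    ae_int_t m = 2, n = 3;
    complex_2d_array a;
    complex_1d_array x, y;
    a.setlength(m, n);
    x.setlength(n);
    y.setlength(m);
    for(ae_int_t i=0; i<m; i++)
        for(ae_int_t j=0; j<n; j++)
            a[i][j] = alglib::complex(double(i+1), double(j));  // arbitrary test values
    for(ae_int_t j=0; j<n; j++)
        x[j] = 1.0;
    cmatrixmv(m, n, a, 0, 0, 0, x, 0, y, 0);    // y := A*x (OpA=0, zero offsets)
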
-- ALGLIB routine -- 28.01.2010 Bochkanov Sergey *************************************************************************/ void cmatrixmv(const ae_int_t m, const ae_int_t n, const complex_2d_array &a, const ae_int_t ia, const ae_int_t ja, const ae_int_t opa, const complex_1d_array &x, const ae_int_t ix, complex_1d_array &y, const ae_int_t iy) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::cmatrixmv(m, n, const_cast(a.c_ptr()), ia, ja, opa, const_cast(x.c_ptr()), ix, const_cast(y.c_ptr()), iy, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Matrix-vector product: y := op(A)*x INPUT PARAMETERS: M - number of rows of op(A) N - number of columns of op(A) A - target matrix IA - submatrix offset (row index) JA - submatrix offset (column index) OpA - operation type: * OpA=0 => op(A) = A * OpA=1 => op(A) = A^T X - input vector IX - subvector offset IY - subvector offset Y - preallocated matrix, must be large enough to store result OUTPUT PARAMETERS: Y - vector which stores result if M=0, then subroutine does nothing. if N=0, Y is filled by zeros. -- ALGLIB routine -- 28.01.2010 Bochkanov Sergey *************************************************************************/ void rmatrixmv(const ae_int_t m, const ae_int_t n, const real_2d_array &a, const ae_int_t ia, const ae_int_t ja, const ae_int_t opa, const real_1d_array &x, const ae_int_t ix, real_1d_array &y, const ae_int_t iy) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::rmatrixmv(m, n, const_cast(a.c_ptr()), ia, ja, opa, const_cast(x.c_ptr()), ix, const_cast(y.c_ptr()), iy, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This subroutine calculates X*op(A^-1) where: * X is MxN general matrix * A is NxN upper/lower triangular/unitriangular matrix * "op" may be identity transformation, transposition, conjugate transposition Multiplication result replaces X. Cache-oblivious algorithm is used. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. Because starting/stopping worker thread always ! involves some overhead, parallelism starts to be profitable for N's ! larger than 128. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! 
multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS N - matrix size, N>=0 M - matrix size, M>=0 A - matrix, actual matrix is stored in A[I1:I1+N-1,J1:J1+N-1] I1 - submatrix offset J1 - submatrix offset IsUpper - whether matrix is upper triangular IsUnit - whether matrix is unitriangular OpType - transformation type: * 0 - no transformation * 1 - transposition * 2 - conjugate transposition X - matrix, actual matrix is stored in X[I2:I2+M-1,J2:J2+N-1] I2 - submatrix offset J2 - submatrix offset -- ALGLIB routine -- 15.12.2009 Bochkanov Sergey *************************************************************************/ void cmatrixrighttrsm(const ae_int_t m, const ae_int_t n, const complex_2d_array &a, const ae_int_t i1, const ae_int_t j1, const bool isupper, const bool isunit, const ae_int_t optype, const complex_2d_array &x, const ae_int_t i2, const ae_int_t j2) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::cmatrixrighttrsm(m, n, const_cast<alglib_impl::ae_matrix*>(a.c_ptr()), i1, j1, isupper, isunit, optype, const_cast<alglib_impl::ae_matrix*>(x.c_ptr()), i2, j2, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void smp_cmatrixrighttrsm(const ae_int_t m, const ae_int_t n, const complex_2d_array &a, const ae_int_t i1, const ae_int_t j1, const bool isupper, const bool isunit, const ae_int_t optype, const complex_2d_array &x, const ae_int_t i2, const ae_int_t j2) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::_pexec_cmatrixrighttrsm(m, n, const_cast<alglib_impl::ae_matrix*>(a.c_ptr()), i1, j1, isupper, isunit, optype, const_cast<alglib_impl::ae_matrix*>(x.c_ptr()), i2, j2, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This subroutine calculates op(A^-1)*X where: * X is MxN general matrix * A is MxM upper/lower triangular/unitriangular matrix * "op" may be identity transformation, transposition, conjugate transposition Multiplication result replaces X. Cache-oblivious algorithm is used. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. Because starting/stopping worker thread always ! involves some overhead, parallelism starts to be profitable for N's ! larger than 128. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB !
* call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS N - matrix size, N>=0 M - matrix size, N>=0 A - matrix, actial matrix is stored in A[I1:I1+M-1,J1:J1+M-1] I1 - submatrix offset J1 - submatrix offset IsUpper - whether matrix is upper triangular IsUnit - whether matrix is unitriangular OpType - transformation type: * 0 - no transformation * 1 - transposition * 2 - conjugate transposition X - matrix, actial matrix is stored in X[I2:I2+M-1,J2:J2+N-1] I2 - submatrix offset J2 - submatrix offset -- ALGLIB routine -- 15.12.2009 Bochkanov Sergey *************************************************************************/ void cmatrixlefttrsm(const ae_int_t m, const ae_int_t n, const complex_2d_array &a, const ae_int_t i1, const ae_int_t j1, const bool isupper, const bool isunit, const ae_int_t optype, const complex_2d_array &x, const ae_int_t i2, const ae_int_t j2) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::cmatrixlefttrsm(m, n, const_cast(a.c_ptr()), i1, j1, isupper, isunit, optype, const_cast(x.c_ptr()), i2, j2, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void smp_cmatrixlefttrsm(const ae_int_t m, const ae_int_t n, const complex_2d_array &a, const ae_int_t i1, const ae_int_t j1, const bool isupper, const bool isunit, const ae_int_t optype, const complex_2d_array &x, const ae_int_t i2, const ae_int_t j2) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::_pexec_cmatrixlefttrsm(m, n, const_cast(a.c_ptr()), i1, j1, isupper, isunit, optype, const_cast(x.c_ptr()), i2, j2, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This subroutine calculates X*op(A^-1) where: * X is MxN general matrix * A is NxN upper/lower triangular/unitriangular matrix * "op" may be identity transformation, transposition Multiplication result replaces X. Cache-oblivious algorithm is used. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. Because starting/stopping worker thread always ! involves some overhead, parallelism starts to be profitable for N's ! larger than 128. ! ! In order to use multicore features you have to: ! 
* use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS N - matrix size, N>=0 M - matrix size, N>=0 A - matrix, actial matrix is stored in A[I1:I1+N-1,J1:J1+N-1] I1 - submatrix offset J1 - submatrix offset IsUpper - whether matrix is upper triangular IsUnit - whether matrix is unitriangular OpType - transformation type: * 0 - no transformation * 1 - transposition X - matrix, actial matrix is stored in X[I2:I2+M-1,J2:J2+N-1] I2 - submatrix offset J2 - submatrix offset -- ALGLIB routine -- 15.12.2009 Bochkanov Sergey *************************************************************************/ void rmatrixrighttrsm(const ae_int_t m, const ae_int_t n, const real_2d_array &a, const ae_int_t i1, const ae_int_t j1, const bool isupper, const bool isunit, const ae_int_t optype, const real_2d_array &x, const ae_int_t i2, const ae_int_t j2) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::rmatrixrighttrsm(m, n, const_cast(a.c_ptr()), i1, j1, isupper, isunit, optype, const_cast(x.c_ptr()), i2, j2, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void smp_rmatrixrighttrsm(const ae_int_t m, const ae_int_t n, const real_2d_array &a, const ae_int_t i1, const ae_int_t j1, const bool isupper, const bool isunit, const ae_int_t optype, const real_2d_array &x, const ae_int_t i2, const ae_int_t j2) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::_pexec_rmatrixrighttrsm(m, n, const_cast(a.c_ptr()), i1, j1, isupper, isunit, optype, const_cast(x.c_ptr()), i2, j2, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This subroutine calculates op(A^-1)*X where: * X is MxN general matrix * A is MxM upper/lower triangular/unitriangular matrix * "op" may be identity transformation, transposition Multiplication result replaces X. Cache-oblivious algorithm is used. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. Because starting/stopping worker thread always ! involves some overhead, parallelism starts to be profitable for N's ! larger than 128. ! ! In order to use multicore features you have to: ! 
* use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS N - matrix size, N>=0 M - matrix size, N>=0 A - matrix, actial matrix is stored in A[I1:I1+M-1,J1:J1+M-1] I1 - submatrix offset J1 - submatrix offset IsUpper - whether matrix is upper triangular IsUnit - whether matrix is unitriangular OpType - transformation type: * 0 - no transformation * 1 - transposition X - matrix, actial matrix is stored in X[I2:I2+M-1,J2:J2+N-1] I2 - submatrix offset J2 - submatrix offset -- ALGLIB routine -- 15.12.2009 Bochkanov Sergey *************************************************************************/ void rmatrixlefttrsm(const ae_int_t m, const ae_int_t n, const real_2d_array &a, const ae_int_t i1, const ae_int_t j1, const bool isupper, const bool isunit, const ae_int_t optype, const real_2d_array &x, const ae_int_t i2, const ae_int_t j2) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::rmatrixlefttrsm(m, n, const_cast(a.c_ptr()), i1, j1, isupper, isunit, optype, const_cast(x.c_ptr()), i2, j2, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void smp_rmatrixlefttrsm(const ae_int_t m, const ae_int_t n, const real_2d_array &a, const ae_int_t i1, const ae_int_t j1, const bool isupper, const bool isunit, const ae_int_t optype, const real_2d_array &x, const ae_int_t i2, const ae_int_t j2) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::_pexec_rmatrixlefttrsm(m, n, const_cast(a.c_ptr()), i1, j1, isupper, isunit, optype, const_cast(x.c_ptr()), i2, j2, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This subroutine calculates C=alpha*A*A^H+beta*C or C=alpha*A^H*A+beta*C where: * C is NxN Hermitian matrix given by its upper/lower triangle * A is NxK matrix when A*A^H is calculated, KxN matrix otherwise Additional info: * cache-oblivious algorithm is used. * multiplication result replaces C. If Beta=0, C elements are not used in calculations (not multiplied by zero - just not referenced) * if Alpha=0, A is not used (not multiplied by zero - just not referenced) * if both Beta and Alpha are zero, C is filled by zeros. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. ! ! 
Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. Because starting/stopping worker thread always ! involves some overhead, parallelism starts to be profitable for N's ! larger than 128. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS N - matrix size, N>=0 K - matrix size, K>=0 Alpha - coefficient A - matrix IA - submatrix offset (row index) JA - submatrix offset (column index) OpTypeA - multiplication type: * 0 - A*A^H is calculated * 2 - A^H*A is calculated Beta - coefficient C - preallocated input/output matrix IC - submatrix offset (row index) JC - submatrix offset (column index) IsUpper - whether upper or lower triangle of C is updated; this function updates only one half of C, leaving other half unchanged (not referenced at all). -- ALGLIB routine -- 16.12.2009 Bochkanov Sergey *************************************************************************/ void cmatrixherk(const ae_int_t n, const ae_int_t k, const double alpha, const complex_2d_array &a, const ae_int_t ia, const ae_int_t ja, const ae_int_t optypea, const double beta, const complex_2d_array &c, const ae_int_t ic, const ae_int_t jc, const bool isupper) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::cmatrixherk(n, k, alpha, const_cast(a.c_ptr()), ia, ja, optypea, beta, const_cast(c.c_ptr()), ic, jc, isupper, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void smp_cmatrixherk(const ae_int_t n, const ae_int_t k, const double alpha, const complex_2d_array &a, const ae_int_t ia, const ae_int_t ja, const ae_int_t optypea, const double beta, const complex_2d_array &c, const ae_int_t ic, const ae_int_t jc, const bool isupper) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::_pexec_cmatrixherk(n, k, alpha, const_cast(a.c_ptr()), ia, ja, optypea, beta, const_cast(c.c_ptr()), ic, jc, isupper, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This subroutine calculates C=alpha*A*A^T+beta*C or C=alpha*A^T*A+beta*C where: * C is NxN symmetric matrix given by its upper/lower triangle * A is NxK matrix when A*A^T is calculated, KxN matrix otherwise Additional info: * cache-oblivious algorithm is used. * multiplication result replaces C. If Beta=0, C elements are not used in calculations (not multiplied by zero - just not referenced) * if Alpha=0, A is not used (not multiplied by zero - just not referenced) * if both Beta and Alpha are zero, C is filled by zeros. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! 
worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. Because starting/stopping worker thread always ! involves some overhead, parallelism starts to be profitable for N's ! larger than 128. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS N - matrix size, N>=0 K - matrix size, K>=0 Alpha - coefficient A - matrix IA - submatrix offset (row index) JA - submatrix offset (column index) OpTypeA - multiplication type: * 0 - A*A^T is calculated * 2 - A^T*A is calculated Beta - coefficient C - preallocated input/output matrix IC - submatrix offset (row index) JC - submatrix offset (column index) IsUpper - whether C is upper triangular or lower triangular -- ALGLIB routine -- 16.12.2009 Bochkanov Sergey *************************************************************************/ void rmatrixsyrk(const ae_int_t n, const ae_int_t k, const double alpha, const real_2d_array &a, const ae_int_t ia, const ae_int_t ja, const ae_int_t optypea, const double beta, const real_2d_array &c, const ae_int_t ic, const ae_int_t jc, const bool isupper) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::rmatrixsyrk(n, k, alpha, const_cast(a.c_ptr()), ia, ja, optypea, beta, const_cast(c.c_ptr()), ic, jc, isupper, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void smp_rmatrixsyrk(const ae_int_t n, const ae_int_t k, const double alpha, const real_2d_array &a, const ae_int_t ia, const ae_int_t ja, const ae_int_t optypea, const double beta, const real_2d_array &c, const ae_int_t ic, const ae_int_t jc, const bool isupper) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::_pexec_rmatrixsyrk(n, k, alpha, const_cast(a.c_ptr()), ia, ja, optypea, beta, const_cast(c.c_ptr()), ic, jc, isupper, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This subroutine calculates C = alpha*op1(A)*op2(B) +beta*C where: * C is MxN general matrix * op1(A) is MxK matrix * op2(B) is KxN matrix * "op" may be identity transformation, transposition, conjugate transposition Additional info: * cache-oblivious algorithm is used. * multiplication result replaces C. If Beta=0, C elements are not used in calculations (not multiplied by zero - just not referenced) * if Alpha=0, A is not used (not multiplied by zero - just not referenced) * if both Beta and Alpha are zero, C is filled by zeros. 
COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. Because starting/stopping worker thread always ! involves some overhead, parallelism starts to be profitable for N's ! larger than 128. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. IMPORTANT: This function does NOT preallocate output matrix C, it MUST be preallocated by caller prior to calling this function. In case C does not have enough space to store result, exception will be generated. INPUT PARAMETERS M - matrix size, M>0 N - matrix size, N>0 K - matrix size, K>0 Alpha - coefficient A - matrix IA - submatrix offset JA - submatrix offset OpTypeA - transformation type: * 0 - no transformation * 1 - transposition * 2 - conjugate transposition B - matrix IB - submatrix offset JB - submatrix offset OpTypeB - transformation type: * 0 - no transformation * 1 - transposition * 2 - conjugate transposition Beta - coefficient C - matrix (PREALLOCATED, large enough to store result) IC - submatrix offset JC - submatrix offset -- ALGLIB routine -- 16.12.2009 Bochkanov Sergey *************************************************************************/ void cmatrixgemm(const ae_int_t m, const ae_int_t n, const ae_int_t k, const alglib::complex alpha, const complex_2d_array &a, const ae_int_t ia, const ae_int_t ja, const ae_int_t optypea, const complex_2d_array &b, const ae_int_t ib, const ae_int_t jb, const ae_int_t optypeb, const alglib::complex beta, const complex_2d_array &c, const ae_int_t ic, const ae_int_t jc) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::cmatrixgemm(m, n, k, *alpha.c_ptr(), const_cast(a.c_ptr()), ia, ja, optypea, const_cast(b.c_ptr()), ib, jb, optypeb, *beta.c_ptr(), const_cast(c.c_ptr()), ic, jc, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void smp_cmatrixgemm(const ae_int_t m, const ae_int_t n, const ae_int_t k, const alglib::complex alpha, const complex_2d_array &a, const ae_int_t ia, const ae_int_t ja, const ae_int_t optypea, const complex_2d_array &b, const ae_int_t ib, const ae_int_t jb, const ae_int_t optypeb, const alglib::complex beta, const complex_2d_array &c, const ae_int_t ic, const ae_int_t jc) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { 
alglib_impl::_pexec_cmatrixgemm(m, n, k, *alpha.c_ptr(), const_cast(a.c_ptr()), ia, ja, optypea, const_cast(b.c_ptr()), ib, jb, optypeb, *beta.c_ptr(), const_cast(c.c_ptr()), ic, jc, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This subroutine calculates C = alpha*op1(A)*op2(B) +beta*C where: * C is MxN general matrix * op1(A) is MxK matrix * op2(B) is KxN matrix * "op" may be identity transformation, transposition Additional info: * cache-oblivious algorithm is used. * multiplication result replaces C. If Beta=0, C elements are not used in calculations (not multiplied by zero - just not referenced) * if Alpha=0, A is not used (not multiplied by zero - just not referenced) * if both Beta and Alpha are zero, C is filled by zeros. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. Because starting/stopping worker thread always ! involves some overhead, parallelism starts to be profitable for N's ! larger than 128. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. IMPORTANT: This function does NOT preallocate output matrix C, it MUST be preallocated by caller prior to calling this function. In case C does not have enough space to store result, exception will be generated. 
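For example (an illustrative sketch, assuming "using namespace alglib"):

    real_2d_array a("[[1,2],[3,4]]");
    real_2d_array b("[[5,6],[7,8]]");
    real_2d_array c;
    c.setlength(2, 2);                // C must be preallocated by the caller
    rmatrixgemm(2, 2, 2, 1.0, a, 0, 0, 0, b, 0, 0, 0, 0.0, c, 0, 0);
    // now C = 1.0*A*B + 0.0*C = [[19,22],[43,50]]
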
INPUT PARAMETERS M - matrix size, M>0 N - matrix size, N>0 K - matrix size, K>0 Alpha - coefficient A - matrix IA - submatrix offset JA - submatrix offset OpTypeA - transformation type: * 0 - no transformation * 1 - transposition B - matrix IB - submatrix offset JB - submatrix offset OpTypeB - transformation type: * 0 - no transformation * 1 - transposition Beta - coefficient C - PREALLOCATED output matrix, large enough to store result IC - submatrix offset JC - submatrix offset -- ALGLIB routine -- 2009-2013 Bochkanov Sergey *************************************************************************/ void rmatrixgemm(const ae_int_t m, const ae_int_t n, const ae_int_t k, const double alpha, const real_2d_array &a, const ae_int_t ia, const ae_int_t ja, const ae_int_t optypea, const real_2d_array &b, const ae_int_t ib, const ae_int_t jb, const ae_int_t optypeb, const double beta, const real_2d_array &c, const ae_int_t ic, const ae_int_t jc) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::rmatrixgemm(m, n, k, alpha, const_cast(a.c_ptr()), ia, ja, optypea, const_cast(b.c_ptr()), ib, jb, optypeb, beta, const_cast(c.c_ptr()), ic, jc, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void smp_rmatrixgemm(const ae_int_t m, const ae_int_t n, const ae_int_t k, const double alpha, const real_2d_array &a, const ae_int_t ia, const ae_int_t ja, const ae_int_t optypea, const real_2d_array &b, const ae_int_t ib, const ae_int_t jb, const ae_int_t optypeb, const double beta, const real_2d_array &c, const ae_int_t ic, const ae_int_t jc) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::_pexec_rmatrixgemm(m, n, k, alpha, const_cast(a.c_ptr()), ia, ja, optypea, const_cast(b.c_ptr()), ib, jb, optypeb, beta, const_cast(c.c_ptr()), ic, jc, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This subroutine is an older version of CMatrixHERK(), one with wrong name (it is HErmitian update, not SYmmetric). It is left here for backward compatibility. 
-- ALGLIB routine -- 16.12.2009 Bochkanov Sergey *************************************************************************/ void cmatrixsyrk(const ae_int_t n, const ae_int_t k, const double alpha, const complex_2d_array &a, const ae_int_t ia, const ae_int_t ja, const ae_int_t optypea, const double beta, const complex_2d_array &c, const ae_int_t ic, const ae_int_t jc, const bool isupper) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::cmatrixsyrk(n, k, alpha, const_cast(a.c_ptr()), ia, ja, optypea, beta, const_cast(c.c_ptr()), ic, jc, isupper, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void smp_cmatrixsyrk(const ae_int_t n, const ae_int_t k, const double alpha, const complex_2d_array &a, const ae_int_t ia, const ae_int_t ja, const ae_int_t optypea, const double beta, const complex_2d_array &c, const ae_int_t ic, const ae_int_t jc, const bool isupper) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::_pexec_cmatrixsyrk(n, k, alpha, const_cast(a.c_ptr()), ia, ja, optypea, beta, const_cast(c.c_ptr()), ic, jc, isupper, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* LU decomposition of a general real matrix with row pivoting A is represented as A = P*L*U, where: * L is lower unitriangular matrix * U is upper triangular matrix * P = P0*P1*...*PK, K=min(M,N)-1, Pi - permutation matrix for I and Pivots[I] This is cache-oblivous implementation of LU decomposition. It is optimized for square matrices. As for rectangular matrices: * best case - M>>N * worst case - N>>M, small M, large N, matrix does not fit in CPU cache COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that LU decomposition is harder to ! parallelize than, say, matrix-matrix product - this algorithm has ! many internal synchronization points which can not be avoided. However ! parallelism starts to be profitable starting from N=1024, achieving ! near-linear speedup for N=4096 or higher. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! 
ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS: A - array[0..M-1, 0..N-1]. M - number of rows in matrix A. N - number of columns in matrix A. OUTPUT PARAMETERS: A - matrices L and U in compact form: * L is stored under main diagonal * U is stored on and above main diagonal Pivots - permutation matrix in compact form. array[0..Min(M-1,N-1)]. -- ALGLIB routine -- 10.01.2010 Bochkanov Sergey *************************************************************************/ void rmatrixlu(real_2d_array &a, const ae_int_t m, const ae_int_t n, integer_1d_array &pivots) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::rmatrixlu(const_cast(a.c_ptr()), m, n, const_cast(pivots.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void smp_rmatrixlu(real_2d_array &a, const ae_int_t m, const ae_int_t n, integer_1d_array &pivots) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::_pexec_rmatrixlu(const_cast(a.c_ptr()), m, n, const_cast(pivots.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* LU decomposition of a general complex matrix with row pivoting A is represented as A = P*L*U, where: * L is lower unitriangular matrix * U is upper triangular matrix * P = P0*P1*...*PK, K=min(M,N)-1, Pi - permutation matrix for I and Pivots[I] This is cache-oblivous implementation of LU decomposition. It is optimized for square matrices. As for rectangular matrices: * best case - M>>N * worst case - N>>M, small M, large N, matrix does not fit in CPU cache COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that LU decomposition is harder to ! parallelize than, say, matrix-matrix product - this algorithm has ! many internal synchronization points which can not be avoided. However ! parallelism starts to be profitable starting from N=1024, achieving ! near-linear speedup for N=4096 or higher. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! 
ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS: A - array[0..M-1, 0..N-1]. M - number of rows in matrix A. N - number of columns in matrix A. OUTPUT PARAMETERS: A - matrices L and U in compact form: * L is stored under main diagonal * U is stored on and above main diagonal Pivots - permutation matrix in compact form. array[0..Min(M-1,N-1)]. -- ALGLIB routine -- 10.01.2010 Bochkanov Sergey *************************************************************************/ void cmatrixlu(complex_2d_array &a, const ae_int_t m, const ae_int_t n, integer_1d_array &pivots) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::cmatrixlu(const_cast(a.c_ptr()), m, n, const_cast(pivots.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void smp_cmatrixlu(complex_2d_array &a, const ae_int_t m, const ae_int_t n, integer_1d_array &pivots) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::_pexec_cmatrixlu(const_cast(a.c_ptr()), m, n, const_cast(pivots.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Cache-oblivious Cholesky decomposition The algorithm computes Cholesky decomposition of a Hermitian positive- definite matrix. The result of an algorithm is a representation of A as A=U'*U or A=L*L' (here X' detones conj(X^T)). COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that Cholesky decomposition is harder ! to parallelize than, say, matrix-matrix product - this algorithm has ! several synchronization points which can not be avoided. However, ! parallelism starts to be profitable starting from N=500. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS: A - upper or lower triangle of a factorized matrix. array with elements [0..N-1, 0..N-1]. N - size of matrix A. 
IsUpper - if IsUpper=True, then A contains an upper triangle of a symmetric matrix, otherwise A contains a lower one. OUTPUT PARAMETERS: A - the result of factorization. If IsUpper=True, then the upper triangle contains matrix U, so that A = U'*U, and the elements below the main diagonal are not modified. Similarly, if IsUpper = False. RESULT: If the matrix is positive-definite, the function returns True. Otherwise, the function returns False. Contents of A is not determined in such case. -- ALGLIB routine -- 15.12.2009 Bochkanov Sergey *************************************************************************/ bool hpdmatrixcholesky(complex_2d_array &a, const ae_int_t n, const bool isupper) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { ae_bool result = alglib_impl::hpdmatrixcholesky(const_cast(a.c_ptr()), n, isupper, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } bool smp_hpdmatrixcholesky(complex_2d_array &a, const ae_int_t n, const bool isupper) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { ae_bool result = alglib_impl::_pexec_hpdmatrixcholesky(const_cast(a.c_ptr()), n, isupper, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Cache-oblivious Cholesky decomposition The algorithm computes Cholesky decomposition of a symmetric positive- definite matrix. The result of an algorithm is a representation of A as A=U^T*U or A=L*L^T COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that Cholesky decomposition is harder ! to parallelize than, say, matrix-matrix product - this algorithm has ! several synchronization points which can not be avoided. However, ! parallelism starts to be profitable starting from N=500. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS: A - upper or lower triangle of a factorized matrix. array with elements [0..N-1, 0..N-1]. 
N - size of matrix A. IsUpper - if IsUpper=True, then A contains an upper triangle of a symmetric matrix, otherwise A contains a lower one. OUTPUT PARAMETERS: A - the result of factorization. If IsUpper=True, then the upper triangle contains matrix U, so that A = U^T*U, and the elements below the main diagonal are not modified. Similarly, if IsUpper = False. RESULT: If the matrix is positive-definite, the function returns True. Otherwise, the function returns False. Contents of A is not determined in such case. -- ALGLIB routine -- 15.12.2009 Bochkanov Sergey *************************************************************************/ bool spdmatrixcholesky(real_2d_array &a, const ae_int_t n, const bool isupper) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { ae_bool result = alglib_impl::spdmatrixcholesky(const_cast(a.c_ptr()), n, isupper, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } bool smp_spdmatrixcholesky(real_2d_array &a, const ae_int_t n, const bool isupper) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { ae_bool result = alglib_impl::_pexec_spdmatrixcholesky(const_cast(a.c_ptr()), n, isupper, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Update of Cholesky decomposition: rank-1 update to original A. "Buffered" version which uses preallocated buffer which is saved between subsequent function calls. This function uses internally allocated buffer which is not saved between subsequent calls. So, if you perform a lot of subsequent updates, we recommend you to use "buffered" version of this function: SPDMatrixCholeskyUpdateAdd1Buf(). INPUT PARAMETERS: A - upper or lower Cholesky factor. array with elements [0..N-1, 0..N-1]. Exception is thrown if array size is too small. N - size of matrix A, N>0 IsUpper - if IsUpper=True, then A contains upper Cholesky factor; otherwise A contains a lower one. U - array[N], rank-1 update to A: A_mod = A + u*u' Exception is thrown if array size is too small. BufR - possibly preallocated buffer; automatically resized if needed. It is recommended to reuse this buffer if you perform a lot of subsequent decompositions. OUTPUT PARAMETERS: A - updated factorization. If IsUpper=True, then the upper triangle contains matrix U, and the elements below the main diagonal are not modified. Similarly, if IsUpper = False. NOTE: this function always succeeds, so it does not return completion code NOTE: this function checks sizes of input arrays, but it does NOT checks for presence of infinities or NAN's. 
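The following minimal sketch (values are illustrative, header as in the earlier sketch) factorizes a 2x2 symmetric positive-definite matrix with spdmatrixcholesky(), which is documented above. On success the upper triangle of A is overwritten by U such that A = U^T*U.

#include <cstdio>
#include "linalg.h"
using namespace alglib;

int main()
{
    real_2d_array a = "[[4,2],[2,3]]";              // SPD matrix; upper triangle is used
    if( spdmatrixcholesky(a, 2, true) )
        printf("U = %s\n", a.tostring(4).c_str());  // lower triangle is left unchanged
    else
        printf("matrix is not positive definite\n");
    return 0;
}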
-- ALGLIB -- 03.02.2014 Sergey Bochkanov *************************************************************************/ void spdmatrixcholeskyupdateadd1(const real_2d_array &a, const ae_int_t n, const bool isupper, const real_1d_array &u) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::spdmatrixcholeskyupdateadd1(const_cast(a.c_ptr()), n, isupper, const_cast(u.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Update of Cholesky decomposition: "fixing" some variables. This function uses internally allocated buffer which is not saved between subsequent calls. So, if you perform a lot of subsequent updates, we recommend you to use "buffered" version of this function: SPDMatrixCholeskyUpdateFixBuf(). "FIXING" EXPLAINED: Suppose we have N*N positive definite matrix A. "Fixing" some variable means filling corresponding row/column of A by zeros, and setting diagonal element to 1. For example, if we fix 2nd variable in 4*4 matrix A, it becomes Af: ( A00 A01 A02 A03 ) ( Af00 0 Af02 Af03 ) ( A10 A11 A12 A13 ) ( 0 1 0 0 ) ( A20 A21 A22 A23 ) => ( Af20 0 Af22 Af23 ) ( A30 A31 A32 A33 ) ( Af30 0 Af32 Af33 ) If we have Cholesky decomposition of A, it must be recalculated after variables were fixed. However, it is possible to use efficient algorithm, which needs O(K*N^2) time to "fix" K variables, given Cholesky decomposition of original, "unfixed" A. INPUT PARAMETERS: A - upper or lower Cholesky factor. array with elements [0..N-1, 0..N-1]. Exception is thrown if array size is too small. N - size of matrix A, N>0 IsUpper - if IsUpper=True, then A contains upper Cholesky factor; otherwise A contains a lower one. Fix - array[N], I-th element is True if I-th variable must be fixed. Exception is thrown if array size is too small. BufR - possibly preallocated buffer; automatically resized if needed. It is recommended to reuse this buffer if you perform a lot of subsequent decompositions. OUTPUT PARAMETERS: A - updated factorization. If IsUpper=True, then the upper triangle contains matrix U, and the elements below the main diagonal are not modified. Similarly, if IsUpper = False. NOTE: this function always succeeds, so it does not return completion code NOTE: this function checks sizes of input arrays, but it does NOT checks for presence of infinities or NAN's. NOTE: this function is efficient only for moderate amount of updated variables - say, 0.1*N or 0.3*N. For larger amount of variables it will still work, but you may get better performance with straightforward Cholesky. -- ALGLIB -- 03.02.2014 Sergey Bochkanov *************************************************************************/ void spdmatrixcholeskyupdatefix(const real_2d_array &a, const ae_int_t n, const bool isupper, const boolean_1d_array &fix) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::spdmatrixcholeskyupdatefix(const_cast(a.c_ptr()), n, isupper, const_cast(fix.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Update of Cholesky decomposition: rank-1 update to original A. 
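A minimal sketch of the two update routines documented above (values are illustrative): the Cholesky factor of A is first updated for A + u*u^T with spdmatrixcholeskyupdateadd1(), and then the second variable is "fixed" with spdmatrixcholeskyupdatefix().

#include "linalg.h"
using namespace alglib;

int main()
{
    real_2d_array a = "[[4,2],[2,3]]";
    spdmatrixcholesky(a, 2, true);                  // a now holds the upper factor U

    real_1d_array u = "[1,1]";
    spdmatrixcholeskyupdateadd1(a, 2, true, u);     // factor of the original A + u*u'

    boolean_1d_array fix;
    fix.setlength(2);
    fix[0] = false;
    fix[1] = true;                                  // fix the 2nd variable
    spdmatrixcholeskyupdatefix(a, 2, true, fix);    // factor of the "fixed" matrix
    return 0;
}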
"Buffered" version which uses preallocated buffer which is saved between subsequent function calls. See comments for SPDMatrixCholeskyUpdateAdd1() for more information. INPUT PARAMETERS: A - upper or lower Cholesky factor. array with elements [0..N-1, 0..N-1]. Exception is thrown if array size is too small. N - size of matrix A, N>0 IsUpper - if IsUpper=True, then A contains upper Cholesky factor; otherwise A contains a lower one. U - array[N], rank-1 update to A: A_mod = A + u*u' Exception is thrown if array size is too small. BufR - possibly preallocated buffer; automatically resized if needed. It is recommended to reuse this buffer if you perform a lot of subsequent decompositions. OUTPUT PARAMETERS: A - updated factorization. If IsUpper=True, then the upper triangle contains matrix U, and the elements below the main diagonal are not modified. Similarly, if IsUpper = False. -- ALGLIB -- 03.02.2014 Sergey Bochkanov *************************************************************************/ void spdmatrixcholeskyupdateadd1buf(const real_2d_array &a, const ae_int_t n, const bool isupper, const real_1d_array &u, real_1d_array &bufr) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::spdmatrixcholeskyupdateadd1buf(const_cast(a.c_ptr()), n, isupper, const_cast(u.c_ptr()), const_cast(bufr.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Update of Cholesky decomposition: "fixing" some variables. "Buffered" version which uses preallocated buffer which is saved between subsequent function calls. See comments for SPDMatrixCholeskyUpdateFix() for more information. INPUT PARAMETERS: A - upper or lower Cholesky factor. array with elements [0..N-1, 0..N-1]. Exception is thrown if array size is too small. N - size of matrix A, N>0 IsUpper - if IsUpper=True, then A contains upper Cholesky factor; otherwise A contains a lower one. Fix - array[N], I-th element is True if I-th variable must be fixed. Exception is thrown if array size is too small. BufR - possibly preallocated buffer; automatically resized if needed. It is recommended to reuse this buffer if you perform a lot of subsequent decompositions. OUTPUT PARAMETERS: A - updated factorization. If IsUpper=True, then the upper triangle contains matrix U, and the elements below the main diagonal are not modified. Similarly, if IsUpper = False. -- ALGLIB -- 03.02.2014 Sergey Bochkanov *************************************************************************/ void spdmatrixcholeskyupdatefixbuf(const real_2d_array &a, const ae_int_t n, const bool isupper, const boolean_1d_array &fix, real_1d_array &bufr) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::spdmatrixcholeskyupdatefixbuf(const_cast(a.c_ptr()), n, isupper, const_cast(fix.c_ptr()), const_cast(bufr.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Sparse Cholesky decomposition for skyline matrixm using in-place algorithm without allocating additional storage. The algorithm computes Cholesky decomposition of a symmetric positive- definite sparse matrix. 
The result of an algorithm is a representation of A as A=U^T*U or A=L*L^T This function is a more efficient alternative to general, but slower SparseCholeskyX(), because it does not create temporary copies of the target. It performs factorization in-place, which gives best performance on low-profile matrices. Its drawback, however, is that it can not perform profile-reducing permutation of input matrix. INPUT PARAMETERS: A - sparse matrix in skyline storage (SKS) format. N - size of matrix A (can be smaller than actual size of A) IsUpper - if IsUpper=True, then factorization is performed on upper triangle. Another triangle is ignored (it may contant some data, but it is not changed). OUTPUT PARAMETERS: A - the result of factorization, stored in SKS. If IsUpper=True, then the upper triangle contains matrix U, such that A = U^T*U. Lower triangle is not changed. Similarly, if IsUpper = False. In this case L is returned, and we have A = L*(L^T). Note that THIS function does not perform permutation of rows to reduce bandwidth. RESULT: If the matrix is positive-definite, the function returns True. Otherwise, the function returns False. Contents of A is not determined in such case. NOTE: for performance reasons this function does NOT check that input matrix includes only finite values. It is your responsibility to make sure that there are no infinite or NAN values in the matrix. -- ALGLIB routine -- 16.01.2014 Bochkanov Sergey *************************************************************************/ bool sparsecholeskyskyline(const sparsematrix &a, const ae_int_t n, const bool isupper) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { ae_bool result = alglib_impl::sparsecholeskyskyline(const_cast(a.c_ptr()), n, isupper, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Estimate of a matrix condition number (1-norm) The algorithm calculates a lower bound of the condition number. In this case, the algorithm does not return a lower bound of the condition number, but an inverse number (to avoid an overflow in case of a singular matrix). Input parameters: A - matrix. Array whose indexes range within [0..N-1, 0..N-1]. N - size of matrix A. Result: 1/LowerBound(cond(A)) NOTE: if k(A) is very large, then matrix is assumed degenerate, k(A)=INF, 0.0 is returned in such cases. *************************************************************************/ double rmatrixrcond1(const real_2d_array &a, const ae_int_t n) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::rmatrixrcond1(const_cast(a.c_ptr()), n, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Estimate of a matrix condition number (infinity-norm). The algorithm calculates a lower bound of the condition number. In this case, the algorithm does not return a lower bound of the condition number, but an inverse number (to avoid an overflow in case of a singular matrix). Input parameters: A - matrix. Array whose indexes range within [0..N-1, 0..N-1]. N - size of matrix A. 
Result: 1/LowerBound(cond(A)) NOTE: if k(A) is very large, then matrix is assumed degenerate, k(A)=INF, 0.0 is returned in such cases. *************************************************************************/ double rmatrixrcondinf(const real_2d_array &a, const ae_int_t n) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::rmatrixrcondinf(const_cast(a.c_ptr()), n, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Condition number estimate of a symmetric positive definite matrix. The algorithm calculates a lower bound of the condition number. In this case, the algorithm does not return a lower bound of the condition number, but an inverse number (to avoid an overflow in case of a singular matrix). It should be noted that 1-norm and inf-norm of condition numbers of symmetric matrices are equal, so the algorithm doesn't take into account the differences between these types of norms. Input parameters: A - symmetric positive definite matrix which is given by its upper or lower triangle depending on the value of IsUpper. Array with elements [0..N-1, 0..N-1]. N - size of matrix A. IsUpper - storage format. Result: 1/LowerBound(cond(A)), if matrix A is positive definite, -1, if matrix A is not positive definite, and its condition number could not be found by this algorithm. NOTE: if k(A) is very large, then matrix is assumed degenerate, k(A)=INF, 0.0 is returned in such cases. *************************************************************************/ double spdmatrixrcond(const real_2d_array &a, const ae_int_t n, const bool isupper) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::spdmatrixrcond(const_cast(a.c_ptr()), n, isupper, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Triangular matrix: estimate of a condition number (1-norm) The algorithm calculates a lower bound of the condition number. In this case, the algorithm does not return a lower bound of the condition number, but an inverse number (to avoid an overflow in case of a singular matrix). Input parameters: A - matrix. Array[0..N-1, 0..N-1]. N - size of A. IsUpper - True, if the matrix is upper triangular. IsUnit - True, if the matrix has a unit diagonal. Result: 1/LowerBound(cond(A)) NOTE: if k(A) is very large, then matrix is assumed degenerate, k(A)=INF, 0.0 is returned in such cases. 
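A minimal sketch (illustrative matrix) of the general condition number estimators documented above: rmatrixrcond1() and rmatrixrcondinf() return the reciprocal of the condition number estimate, so a value near zero signals near-singularity.

#include <cstdio>
#include "linalg.h"
using namespace alglib;

int main()
{
    real_2d_array a = "[[1,0],[0,0.0001]]";
    double r1 = rmatrixrcond1(a, 2);     // ~1e-4: reciprocal of cond(A) in 1-norm
    double ri = rmatrixrcondinf(a, 2);   // same, for the infinity-norm
    printf("%.6f %.6f\n", r1, ri);
    return 0;
}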
*************************************************************************/ double rmatrixtrrcond1(const real_2d_array &a, const ae_int_t n, const bool isupper, const bool isunit) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::rmatrixtrrcond1(const_cast(a.c_ptr()), n, isupper, isunit, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Triangular matrix: estimate of a matrix condition number (infinity-norm). The algorithm calculates a lower bound of the condition number. In this case, the algorithm does not return a lower bound of the condition number, but an inverse number (to avoid an overflow in case of a singular matrix). Input parameters: A - matrix. Array whose indexes range within [0..N-1, 0..N-1]. N - size of matrix A. IsUpper - True, if the matrix is upper triangular. IsUnit - True, if the matrix has a unit diagonal. Result: 1/LowerBound(cond(A)) NOTE: if k(A) is very large, then matrix is assumed degenerate, k(A)=INF, 0.0 is returned in such cases. *************************************************************************/ double rmatrixtrrcondinf(const real_2d_array &a, const ae_int_t n, const bool isupper, const bool isunit) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::rmatrixtrrcondinf(const_cast(a.c_ptr()), n, isupper, isunit, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Condition number estimate of a Hermitian positive definite matrix. The algorithm calculates a lower bound of the condition number. In this case, the algorithm does not return a lower bound of the condition number, but an inverse number (to avoid an overflow in case of a singular matrix). It should be noted that 1-norm and inf-norm of condition numbers of symmetric matrices are equal, so the algorithm doesn't take into account the differences between these types of norms. Input parameters: A - Hermitian positive definite matrix which is given by its upper or lower triangle depending on the value of IsUpper. Array with elements [0..N-1, 0..N-1]. N - size of matrix A. IsUpper - storage format. Result: 1/LowerBound(cond(A)), if matrix A is positive definite, -1, if matrix A is not positive definite, and its condition number could not be found by this algorithm. NOTE: if k(A) is very large, then matrix is assumed degenerate, k(A)=INF, 0.0 is returned in such cases. 
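For triangular matrices, rmatrixtrrcond1() above takes explicit IsUpper/IsUnit flags instead of inspecting the matrix contents. A minimal sketch with illustrative values:

#include <cstdio>
#include "linalg.h"
using namespace alglib;

int main()
{
    real_2d_array a = "[[2,1],[0,0.5]]";              // upper triangular, non-unit diagonal
    double rc = rmatrixtrrcond1(a, 2, true, false);   // isupper=true, isunit=false
    printf("1/cond estimate: %.6f\n", rc);
    return 0;
}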
*************************************************************************/ double hpdmatrixrcond(const complex_2d_array &a, const ae_int_t n, const bool isupper) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::hpdmatrixrcond(const_cast(a.c_ptr()), n, isupper, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Estimate of a matrix condition number (1-norm) The algorithm calculates a lower bound of the condition number. In this case, the algorithm does not return a lower bound of the condition number, but an inverse number (to avoid an overflow in case of a singular matrix). Input parameters: A - matrix. Array whose indexes range within [0..N-1, 0..N-1]. N - size of matrix A. Result: 1/LowerBound(cond(A)) NOTE: if k(A) is very large, then matrix is assumed degenerate, k(A)=INF, 0.0 is returned in such cases. *************************************************************************/ double cmatrixrcond1(const complex_2d_array &a, const ae_int_t n) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::cmatrixrcond1(const_cast(a.c_ptr()), n, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Estimate of a matrix condition number (infinity-norm). The algorithm calculates a lower bound of the condition number. In this case, the algorithm does not return a lower bound of the condition number, but an inverse number (to avoid an overflow in case of a singular matrix). Input parameters: A - matrix. Array whose indexes range within [0..N-1, 0..N-1]. N - size of matrix A. Result: 1/LowerBound(cond(A)) NOTE: if k(A) is very large, then matrix is assumed degenerate, k(A)=INF, 0.0 is returned in such cases. *************************************************************************/ double cmatrixrcondinf(const complex_2d_array &a, const ae_int_t n) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::cmatrixrcondinf(const_cast(a.c_ptr()), n, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Estimate of the condition number of a matrix given by its LU decomposition (1-norm) The algorithm calculates a lower bound of the condition number. In this case, the algorithm does not return a lower bound of the condition number, but an inverse number (to avoid an overflow in case of a singular matrix). Input parameters: LUA - LU decomposition of a matrix in compact form. Output of the RMatrixLU subroutine. N - size of matrix A. Result: 1/LowerBound(cond(A)) NOTE: if k(A) is very large, then matrix is assumed degenerate, k(A)=INF, 0.0 is returned in such cases. 
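The LU-based estimator rmatrixlurcond1() described above accepts the compact factorization produced by rmatrixlu(), which is documented earlier in this section. A minimal sketch, with illustrative values:

#include <cstdio>
#include "linalg.h"
using namespace alglib;

int main()
{
    real_2d_array a = "[[2,1],[1,3]]";
    integer_1d_array pivots;
    rmatrixlu(a, 2, 2, pivots);              // a now holds L and U in compact form
    double rc = rmatrixlurcond1(a, 2);       // reciprocal 1-norm condition estimate
    printf("rcond(A) estimate: %.6f\n", rc);
    return 0;
}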
*************************************************************************/ double rmatrixlurcond1(const real_2d_array &lua, const ae_int_t n) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::rmatrixlurcond1(const_cast(lua.c_ptr()), n, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Estimate of the condition number of a matrix given by its LU decomposition (infinity norm). The algorithm calculates a lower bound of the condition number. In this case, the algorithm does not return a lower bound of the condition number, but an inverse number (to avoid an overflow in case of a singular matrix). Input parameters: LUA - LU decomposition of a matrix in compact form. Output of the RMatrixLU subroutine. N - size of matrix A. Result: 1/LowerBound(cond(A)) NOTE: if k(A) is very large, then matrix is assumed degenerate, k(A)=INF, 0.0 is returned in such cases. *************************************************************************/ double rmatrixlurcondinf(const real_2d_array &lua, const ae_int_t n) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::rmatrixlurcondinf(const_cast(lua.c_ptr()), n, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Condition number estimate of a symmetric positive definite matrix given by Cholesky decomposition. The algorithm calculates a lower bound of the condition number. In this case, the algorithm does not return a lower bound of the condition number, but an inverse number (to avoid an overflow in case of a singular matrix). It should be noted that 1-norm and inf-norm condition numbers of symmetric matrices are equal, so the algorithm doesn't take into account the differences between these types of norms. Input parameters: CD - Cholesky decomposition of matrix A, output of SMatrixCholesky subroutine. N - size of matrix A. Result: 1/LowerBound(cond(A)) NOTE: if k(A) is very large, then matrix is assumed degenerate, k(A)=INF, 0.0 is returned in such cases. *************************************************************************/ double spdmatrixcholeskyrcond(const real_2d_array &a, const ae_int_t n, const bool isupper) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::spdmatrixcholeskyrcond(const_cast(a.c_ptr()), n, isupper, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Condition number estimate of a Hermitian positive definite matrix given by Cholesky decomposition. The algorithm calculates a lower bound of the condition number. In this case, the algorithm does not return a lower bound of the condition number, but an inverse number (to avoid an overflow in case of a singular matrix). 
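When a Cholesky factor is already available, spdmatrixcholeskyrcond() above estimates the condition number without refactorizing. A minimal sketch with illustrative values:

#include <cstdio>
#include "linalg.h"
using namespace alglib;

int main()
{
    real_2d_array a = "[[4,2],[2,3]]";
    if( spdmatrixcholesky(a, 2, true) )                  // factorize once
    {
        double rc = spdmatrixcholeskyrcond(a, 2, true);  // reuse the factor for the estimate
        printf("1/cond(A) estimate: %.6f\n", rc);
    }
    return 0;
}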
It should be noted that 1-norm and inf-norm condition numbers of symmetric matrices are equal, so the algorithm doesn't take into account the differences between these types of norms. Input parameters: CD - Cholesky decomposition of matrix A, output of SMatrixCholesky subroutine. N - size of matrix A. Result: 1/LowerBound(cond(A)) NOTE: if k(A) is very large, then matrix is assumed degenerate, k(A)=INF, 0.0 is returned in such cases. *************************************************************************/ double hpdmatrixcholeskyrcond(const complex_2d_array &a, const ae_int_t n, const bool isupper) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::hpdmatrixcholeskyrcond(const_cast(a.c_ptr()), n, isupper, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Estimate of the condition number of a matrix given by its LU decomposition (1-norm) The algorithm calculates a lower bound of the condition number. In this case, the algorithm does not return a lower bound of the condition number, but an inverse number (to avoid an overflow in case of a singular matrix). Input parameters: LUA - LU decomposition of a matrix in compact form. Output of the CMatrixLU subroutine. N - size of matrix A. Result: 1/LowerBound(cond(A)) NOTE: if k(A) is very large, then matrix is assumed degenerate, k(A)=INF, 0.0 is returned in such cases. *************************************************************************/ double cmatrixlurcond1(const complex_2d_array &lua, const ae_int_t n) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::cmatrixlurcond1(const_cast(lua.c_ptr()), n, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Estimate of the condition number of a matrix given by its LU decomposition (infinity norm). The algorithm calculates a lower bound of the condition number. In this case, the algorithm does not return a lower bound of the condition number, but an inverse number (to avoid an overflow in case of a singular matrix). Input parameters: LUA - LU decomposition of a matrix in compact form. Output of the CMatrixLU subroutine. N - size of matrix A. Result: 1/LowerBound(cond(A)) NOTE: if k(A) is very large, then matrix is assumed degenerate, k(A)=INF, 0.0 is returned in such cases. *************************************************************************/ double cmatrixlurcondinf(const complex_2d_array &lua, const ae_int_t n) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::cmatrixlurcondinf(const_cast(lua.c_ptr()), n, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Triangular matrix: estimate of a condition number (1-norm) The algorithm calculates a lower bound of the condition number. 
In this case, the algorithm does not return a lower bound of the condition number, but an inverse number (to avoid an overflow in case of a singular matrix). Input parameters: A - matrix. Array[0..N-1, 0..N-1]. N - size of A. IsUpper - True, if the matrix is upper triangular. IsUnit - True, if the matrix has a unit diagonal. Result: 1/LowerBound(cond(A)) NOTE: if k(A) is very large, then matrix is assumed degenerate, k(A)=INF, 0.0 is returned in such cases. *************************************************************************/ double cmatrixtrrcond1(const complex_2d_array &a, const ae_int_t n, const bool isupper, const bool isunit) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::cmatrixtrrcond1(const_cast(a.c_ptr()), n, isupper, isunit, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Triangular matrix: estimate of a matrix condition number (infinity-norm). The algorithm calculates a lower bound of the condition number. In this case, the algorithm does not return a lower bound of the condition number, but an inverse number (to avoid an overflow in case of a singular matrix). Input parameters: A - matrix. Array whose indexes range within [0..N-1, 0..N-1]. N - size of matrix A. IsUpper - True, if the matrix is upper triangular. IsUnit - True, if the matrix has a unit diagonal. Result: 1/LowerBound(cond(A)) NOTE: if k(A) is very large, then matrix is assumed degenerate, k(A)=INF, 0.0 is returned in such cases. *************************************************************************/ double cmatrixtrrcondinf(const complex_2d_array &a, const ae_int_t n, const bool isupper, const bool isunit) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::cmatrixtrrcondinf(const_cast(a.c_ptr()), n, isupper, isunit, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Matrix inverse report: * R1 reciprocal of condition number in 1-norm * RInf reciprocal of condition number in inf-norm *************************************************************************/ _matinvreport_owner::_matinvreport_owner() { p_struct = (alglib_impl::matinvreport*)alglib_impl::ae_malloc(sizeof(alglib_impl::matinvreport), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_matinvreport_init(p_struct, NULL); } _matinvreport_owner::_matinvreport_owner(const _matinvreport_owner &rhs) { p_struct = (alglib_impl::matinvreport*)alglib_impl::ae_malloc(sizeof(alglib_impl::matinvreport), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_matinvreport_init_copy(p_struct, const_cast(rhs.p_struct), NULL); } _matinvreport_owner& _matinvreport_owner::operator=(const _matinvreport_owner &rhs) { if( this==&rhs ) return *this; alglib_impl::_matinvreport_clear(p_struct); alglib_impl::_matinvreport_init_copy(p_struct, const_cast(rhs.p_struct), NULL); return *this; } _matinvreport_owner::~_matinvreport_owner() { alglib_impl::_matinvreport_clear(p_struct); ae_free(p_struct); } alglib_impl::matinvreport* 
_matinvreport_owner::c_ptr() { return p_struct; } alglib_impl::matinvreport* _matinvreport_owner::c_ptr() const { return const_cast(p_struct); } matinvreport::matinvreport() : _matinvreport_owner() ,r1(p_struct->r1),rinf(p_struct->rinf) { } matinvreport::matinvreport(const matinvreport &rhs):_matinvreport_owner(rhs) ,r1(p_struct->r1),rinf(p_struct->rinf) { } matinvreport& matinvreport::operator=(const matinvreport &rhs) { if( this==&rhs ) return *this; _matinvreport_owner::operator=(rhs); return *this; } matinvreport::~matinvreport() { } /************************************************************************* Inversion of a matrix given by its LU decomposition. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that matrix inversion is harder to ! parallelize than, say, matrix-matrix product - this algorithm has ! many internal synchronization points which can not be avoided. However ! parallelism starts to be profitable starting from N=1024, achieving ! near-linear speedup for N=4096 or higher. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS: A - LU decomposition of the matrix (output of RMatrixLU subroutine). Pivots - table of permutations (the output of RMatrixLU subroutine). N - size of matrix A (optional) : * if given, only principal NxN submatrix is processed and overwritten. other elements are unchanged. * if not given, size is automatically determined from matrix size (A must be square matrix) OUTPUT PARAMETERS: Info - return code: * -3 A is singular, or VERY close to singular. it is filled by zeros in such cases. * 1 task is solved (but matrix A may be ill-conditioned, check R1/RInf parameters for condition numbers). Rep - solver report, see below for more info A - inverse of matrix A. Array whose indexes range within [0..N-1, 0..N-1]. SOLVER REPORT Subroutine sets following fields of the Rep structure: * R1 reciprocal of condition number: 1/cond(A), 1-norm. * RInf reciprocal of condition number: 1/cond(A), inf-norm. 
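A minimal sketch (illustrative values) of the LU-based inversion documented above: factorize with rmatrixlu(), invert in place with rmatrixluinverse(), then inspect Info and the R1/RInf fields of the report.

#include <cstdio>
#include "linalg.h"
using namespace alglib;

int main()
{
    real_2d_array a = "[[1,2],[3,4]]";
    integer_1d_array pivots;
    rmatrixlu(a, 2, 2, pivots);

    ae_int_t info;
    matinvreport rep;
    rmatrixluinverse(a, pivots, 2, info, rep);
    if( info>0 )
        printf("inv(A) = %s, r1=%.2e\n", a.tostring(4).c_str(), rep.r1);
    else
        printf("matrix is singular or nearly singular\n");
    return 0;
}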
-- ALGLIB routine -- 05.02.2010 Bochkanov Sergey *************************************************************************/ void rmatrixluinverse(real_2d_array &a, const integer_1d_array &pivots, const ae_int_t n, ae_int_t &info, matinvreport &rep) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::rmatrixluinverse(const_cast(a.c_ptr()), const_cast(pivots.c_ptr()), n, &info, const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void smp_rmatrixluinverse(real_2d_array &a, const integer_1d_array &pivots, const ae_int_t n, ae_int_t &info, matinvreport &rep) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::_pexec_rmatrixluinverse(const_cast(a.c_ptr()), const_cast(pivots.c_ptr()), n, &info, const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Inversion of a matrix given by its LU decomposition. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that matrix inversion is harder to ! parallelize than, say, matrix-matrix product - this algorithm has ! many internal synchronization points which can not be avoided. However ! parallelism starts to be profitable starting from N=1024, achieving ! near-linear speedup for N=4096 or higher. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS: A - LU decomposition of the matrix (output of RMatrixLU subroutine). Pivots - table of permutations (the output of RMatrixLU subroutine). N - size of matrix A (optional) : * if given, only principal NxN submatrix is processed and overwritten. other elements are unchanged. * if not given, size is automatically determined from matrix size (A must be square matrix) OUTPUT PARAMETERS: Info - return code: * -3 A is singular, or VERY close to singular. it is filled by zeros in such cases. 
* 1 task is solved (but matrix A may be ill-conditioned, check R1/RInf parameters for condition numbers). Rep - solver report, see below for more info A - inverse of matrix A. Array whose indexes range within [0..N-1, 0..N-1]. SOLVER REPORT Subroutine sets following fields of the Rep structure: * R1 reciprocal of condition number: 1/cond(A), 1-norm. * RInf reciprocal of condition number: 1/cond(A), inf-norm. -- ALGLIB routine -- 05.02.2010 Bochkanov Sergey *************************************************************************/ void rmatrixluinverse(real_2d_array &a, const integer_1d_array &pivots, ae_int_t &info, matinvreport &rep) { alglib_impl::ae_state _alglib_env_state; ae_int_t n; if( (a.cols()!=a.rows()) || (a.cols()!=pivots.length())) throw ap_error("Error while calling 'rmatrixluinverse': looks like one of arguments has wrong size"); n = a.cols(); alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::rmatrixluinverse(const_cast(a.c_ptr()), const_cast(pivots.c_ptr()), n, &info, const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void smp_rmatrixluinverse(real_2d_array &a, const integer_1d_array &pivots, ae_int_t &info, matinvreport &rep) { alglib_impl::ae_state _alglib_env_state; ae_int_t n; if( (a.cols()!=a.rows()) || (a.cols()!=pivots.length())) throw ap_error("Error while calling 'rmatrixluinverse': looks like one of arguments has wrong size"); n = a.cols(); alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::_pexec_rmatrixluinverse(const_cast(a.c_ptr()), const_cast(pivots.c_ptr()), n, &info, const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Inversion of a general matrix. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that matrix inversion is harder to ! parallelize than, say, matrix-matrix product - this algorithm has ! many internal synchronization points which can not be avoided. However ! parallelism starts to be profitable starting from N=1024, achieving ! near-linear speedup for N=4096 or higher. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! 
ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. Input parameters: A - matrix. N - size of matrix A (optional) : * if given, only principal NxN submatrix is processed and overwritten. other elements are unchanged. * if not given, size is automatically determined from matrix size (A must be square matrix) Output parameters: Info - return code, same as in RMatrixLUInverse Rep - solver report, same as in RMatrixLUInverse A - inverse of matrix A, same as in RMatrixLUInverse Result: True, if the matrix is not singular. False, if the matrix is singular. -- ALGLIB -- Copyright 2005-2010 by Bochkanov Sergey *************************************************************************/ void rmatrixinverse(real_2d_array &a, const ae_int_t n, ae_int_t &info, matinvreport &rep) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::rmatrixinverse(const_cast(a.c_ptr()), n, &info, const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void smp_rmatrixinverse(real_2d_array &a, const ae_int_t n, ae_int_t &info, matinvreport &rep) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::_pexec_rmatrixinverse(const_cast(a.c_ptr()), n, &info, const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Inversion of a general matrix. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that matrix inversion is harder to ! parallelize than, say, matrix-matrix product - this algorithm has ! many internal synchronization points which can not be avoided. However ! parallelism starts to be profitable starting from N=1024, achieving ! near-linear speedup for N=4096 or higher. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. Input parameters: A - matrix. 
N - size of matrix A (optional) : * if given, only principal NxN submatrix is processed and overwritten. other elements are unchanged. * if not given, size is automatically determined from matrix size (A must be square matrix) Output parameters: Info - return code, same as in RMatrixLUInverse Rep - solver report, same as in RMatrixLUInverse A - inverse of matrix A, same as in RMatrixLUInverse Result: True, if the matrix is not singular. False, if the matrix is singular. -- ALGLIB -- Copyright 2005-2010 by Bochkanov Sergey *************************************************************************/ void rmatrixinverse(real_2d_array &a, ae_int_t &info, matinvreport &rep) { alglib_impl::ae_state _alglib_env_state; ae_int_t n; if( (a.cols()!=a.rows())) throw ap_error("Error while calling 'rmatrixinverse': looks like one of arguments has wrong size"); n = a.cols(); alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::rmatrixinverse(const_cast(a.c_ptr()), n, &info, const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void smp_rmatrixinverse(real_2d_array &a, ae_int_t &info, matinvreport &rep) { alglib_impl::ae_state _alglib_env_state; ae_int_t n; if( (a.cols()!=a.rows())) throw ap_error("Error while calling 'rmatrixinverse': looks like one of arguments has wrong size"); n = a.cols(); alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::_pexec_rmatrixinverse(const_cast(a.c_ptr()), n, &info, const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Inversion of a matrix given by its LU decomposition. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that matrix inversion is harder to ! parallelize than, say, matrix-matrix product - this algorithm has ! many internal synchronization points which can not be avoided. However ! parallelism starts to be profitable starting from N=1024, achieving ! near-linear speedup for N=4096 or higher. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! 
related features provided by commercial edition of ALGLIB. INPUT PARAMETERS: A - LU decomposition of the matrix (output of CMatrixLU subroutine). Pivots - table of permutations (the output of CMatrixLU subroutine). N - size of matrix A (optional) : * if given, only principal NxN submatrix is processed and overwritten. other elements are unchanged. * if not given, size is automatically determined from matrix size (A must be square matrix) OUTPUT PARAMETERS: Info - return code, same as in RMatrixLUInverse Rep - solver report, same as in RMatrixLUInverse A - inverse of matrix A, same as in RMatrixLUInverse -- ALGLIB routine -- 05.02.2010 Bochkanov Sergey *************************************************************************/ void cmatrixluinverse(complex_2d_array &a, const integer_1d_array &pivots, const ae_int_t n, ae_int_t &info, matinvreport &rep) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::cmatrixluinverse(const_cast(a.c_ptr()), const_cast(pivots.c_ptr()), n, &info, const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void smp_cmatrixluinverse(complex_2d_array &a, const integer_1d_array &pivots, const ae_int_t n, ae_int_t &info, matinvreport &rep) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::_pexec_cmatrixluinverse(const_cast(a.c_ptr()), const_cast(pivots.c_ptr()), n, &info, const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Inversion of a matrix given by its LU decomposition. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that matrix inversion is harder to ! parallelize than, say, matrix-matrix product - this algorithm has ! many internal synchronization points which can not be avoided. However ! parallelism starts to be profitable starting from N=1024, achieving ! near-linear speedup for N=4096 or higher. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! 
related features provided by commercial edition of ALGLIB. INPUT PARAMETERS: A - LU decomposition of the matrix (output of CMatrixLU subroutine). Pivots - table of permutations (the output of CMatrixLU subroutine). N - size of matrix A (optional) : * if given, only principal NxN submatrix is processed and overwritten. other elements are unchanged. * if not given, size is automatically determined from matrix size (A must be square matrix) OUTPUT PARAMETERS: Info - return code, same as in RMatrixLUInverse Rep - solver report, same as in RMatrixLUInverse A - inverse of matrix A, same as in RMatrixLUInverse -- ALGLIB routine -- 05.02.2010 Bochkanov Sergey *************************************************************************/ void cmatrixluinverse(complex_2d_array &a, const integer_1d_array &pivots, ae_int_t &info, matinvreport &rep) { alglib_impl::ae_state _alglib_env_state; ae_int_t n; if( (a.cols()!=a.rows()) || (a.cols()!=pivots.length())) throw ap_error("Error while calling 'cmatrixluinverse': looks like one of arguments has wrong size"); n = a.cols(); alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::cmatrixluinverse(const_cast(a.c_ptr()), const_cast(pivots.c_ptr()), n, &info, const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void smp_cmatrixluinverse(complex_2d_array &a, const integer_1d_array &pivots, ae_int_t &info, matinvreport &rep) { alglib_impl::ae_state _alglib_env_state; ae_int_t n; if( (a.cols()!=a.rows()) || (a.cols()!=pivots.length())) throw ap_error("Error while calling 'cmatrixluinverse': looks like one of arguments has wrong size"); n = a.cols(); alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::_pexec_cmatrixluinverse(const_cast(a.c_ptr()), const_cast(pivots.c_ptr()), n, &info, const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Inversion of a general matrix. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that matrix inversion is harder to ! parallelize than, say, matrix-matrix product - this algorithm has ! many internal synchronization points which can not be avoided. However ! parallelism starts to be profitable starting from N=1024, achieving ! near-linear speedup for N=4096 or higher. ! ! In order to use multicore features you have to: ! 
* use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. Input parameters: A - matrix N - size of matrix A (optional) : * if given, only principal NxN submatrix is processed and overwritten. other elements are unchanged. * if not given, size is automatically determined from matrix size (A must be square matrix) Output parameters: Info - return code, same as in RMatrixLUInverse Rep - solver report, same as in RMatrixLUInverse A - inverse of matrix A, same as in RMatrixLUInverse -- ALGLIB -- Copyright 2005 by Bochkanov Sergey *************************************************************************/ void cmatrixinverse(complex_2d_array &a, const ae_int_t n, ae_int_t &info, matinvreport &rep) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::cmatrixinverse(const_cast(a.c_ptr()), n, &info, const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void smp_cmatrixinverse(complex_2d_array &a, const ae_int_t n, ae_int_t &info, matinvreport &rep) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::_pexec_cmatrixinverse(const_cast(a.c_ptr()), n, &info, const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Inversion of a general matrix. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that matrix inversion is harder to ! parallelize than, say, matrix-matrix product - this algorithm has ! many internal synchronization points which can not be avoided. However ! parallelism starts to be profitable starting from N=1024, achieving ! near-linear speedup for N=4096 or higher. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! 
related features provided by commercial edition of ALGLIB. Input parameters: A - matrix N - size of matrix A (optional) : * if given, only principal NxN submatrix is processed and overwritten. other elements are unchanged. * if not given, size is automatically determined from matrix size (A must be square matrix) Output parameters: Info - return code, same as in RMatrixLUInverse Rep - solver report, same as in RMatrixLUInverse A - inverse of matrix A, same as in RMatrixLUInverse -- ALGLIB -- Copyright 2005 by Bochkanov Sergey *************************************************************************/ void cmatrixinverse(complex_2d_array &a, ae_int_t &info, matinvreport &rep) { alglib_impl::ae_state _alglib_env_state; ae_int_t n; if( (a.cols()!=a.rows())) throw ap_error("Error while calling 'cmatrixinverse': looks like one of arguments has wrong size"); n = a.cols(); alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::cmatrixinverse(const_cast(a.c_ptr()), n, &info, const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void smp_cmatrixinverse(complex_2d_array &a, ae_int_t &info, matinvreport &rep) { alglib_impl::ae_state _alglib_env_state; ae_int_t n; if( (a.cols()!=a.rows())) throw ap_error("Error while calling 'cmatrixinverse': looks like one of arguments has wrong size"); n = a.cols(); alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::_pexec_cmatrixinverse(const_cast(a.c_ptr()), n, &info, const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Inversion of a symmetric positive definite matrix which is given by Cholesky decomposition. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. However, Cholesky inversion is a "difficult" ! algorithm - it has lots of internal synchronization points which ! prevents efficient parallelization of algorithm. Only very large ! problems (N=thousands) can be efficiently parallelized. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. Input parameters: A - Cholesky decomposition of the matrix to be inverted: A=U'*U or A = L*L'. Output of SPDMatrixCholesky subroutine. 
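As an illustration of the general inversion wrappers above, here is a minimal usage sketch (the matrix values and variable names are illustrative and not taken from the shipped examples; it assumes linalg.h from the standard ALGLIB distribution is on the include path, and that rmatrixlu from the trfac subpackage is available). It inverts a small real matrix twice: once through an explicit LU factorization followed by rmatrixluinverse, and once with the one-call rmatrixinverse. Both calls use the 'friendly' overloads, which deduce N from the matrix shape.

#include <stdio.h>
#include "linalg.h"
using namespace alglib;

int main()
{
    // illustrative 2x2 nonsingular matrix
    real_2d_array a = "[[1,2],[3,7]]";
    ae_int_t info;
    matinvreport rep;

    // route 1: LU-factorize a copy, then invert the factors in place
    real_2d_array alu = "[[1,2],[3,7]]";
    integer_1d_array pivots;
    rmatrixlu(alu, 2, 2, pivots);                 // LU decomposition (trfac subpackage)
    rmatrixluinverse(alu, pivots, info, rep);     // alu now holds inv(A)

    // route 2: one-call inversion of the original matrix
    rmatrixinverse(a, info, rep);                 // a now holds inv(A)

    // info=1 indicates success; rep.r1/rep.rinf hold reciprocal condition numbers
    printf("info=%d  r1=%.3e  rinf=%.3e\n", (int)info, rep.r1, rep.rinf);
    printf("inv(A)[0][0]=%.3f\n", a[0][0]);
    return 0;
}

The complex routines cmatrixluinverse/cmatrixinverse follow the same calling pattern with complex_2d_array arguments.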
N - size of matrix A (optional) : * if given, only principal NxN submatrix is processed and overwritten. other elements are unchanged. * if not given, size is automatically determined from matrix size (A must be square matrix) IsUpper - storage type (optional): * if True, symmetric matrix A is given by its upper triangle, and the lower triangle isn't used/changed by function * if False, symmetric matrix A is given by its lower triangle, and the upper triangle isn't used/changed by function * if not given, lower half is used. Output parameters: Info - return code, same as in RMatrixLUInverse Rep - solver report, same as in RMatrixLUInverse A - inverse of matrix A, same as in RMatrixLUInverse -- ALGLIB routine -- 10.02.2010 Bochkanov Sergey *************************************************************************/ void spdmatrixcholeskyinverse(real_2d_array &a, const ae_int_t n, const bool isupper, ae_int_t &info, matinvreport &rep) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::spdmatrixcholeskyinverse(const_cast(a.c_ptr()), n, isupper, &info, const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void smp_spdmatrixcholeskyinverse(real_2d_array &a, const ae_int_t n, const bool isupper, ae_int_t &info, matinvreport &rep) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::_pexec_spdmatrixcholeskyinverse(const_cast(a.c_ptr()), n, isupper, &info, const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Inversion of a symmetric positive definite matrix which is given by Cholesky decomposition. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. However, Cholesky inversion is a "difficult" ! algorithm - it has lots of internal synchronization points which ! prevents efficient parallelization of algorithm. Only very large ! problems (N=thousands) can be efficiently parallelized. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. Input parameters: A - Cholesky decomposition of the matrix to be inverted: A=U'*U or A = L*L'. Output of SPDMatrixCholesky subroutine. 
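For the Cholesky-based route, a short hedged sketch (matrix values illustrative; spdmatrixcholesky from the trfac subpackage is assumed to be available in the standard distribution): factor an SPD matrix, then pass the factor to spdmatrixcholeskyinverse. The same IsUpper setting must be used in both calls, and only that triangle is overwritten.

#include <stdio.h>
#include "linalg.h"
using namespace alglib;

int main()
{
    // illustrative symmetric positive definite matrix
    real_2d_array a = "[[4,1],[1,3]]";
    ae_int_t info;
    matinvreport rep;
    bool isupper = true;

    // factor A = U'*U (upper triangle); returns false if A is not positive definite
    if( !spdmatrixcholesky(a, 2, isupper) )
    {
        printf("matrix is not positive definite\n");
        return 1;
    }

    // invert using the Cholesky factor; only the chosen triangle of a is overwritten
    spdmatrixcholeskyinverse(a, 2, isupper, info, rep);
    printf("info=%d  inv(A)[0][0]=%.4f\n", (int)info, a[0][0]);
    return 0;
}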
N - size of matrix A (optional) : * if given, only principal NxN submatrix is processed and overwritten. other elements are unchanged. * if not given, size is automatically determined from matrix size (A must be square matrix) IsUpper - storage type (optional): * if True, symmetric matrix A is given by its upper triangle, and the lower triangle isn't used/changed by function * if False, symmetric matrix A is given by its lower triangle, and the upper triangle isn't used/changed by function * if not given, lower half is used. Output parameters: Info - return code, same as in RMatrixLUInverse Rep - solver report, same as in RMatrixLUInverse A - inverse of matrix A, same as in RMatrixLUInverse -- ALGLIB routine -- 10.02.2010 Bochkanov Sergey *************************************************************************/ void spdmatrixcholeskyinverse(real_2d_array &a, ae_int_t &info, matinvreport &rep) { alglib_impl::ae_state _alglib_env_state; ae_int_t n; bool isupper; if( (a.cols()!=a.rows())) throw ap_error("Error while calling 'spdmatrixcholeskyinverse': looks like one of arguments has wrong size"); n = a.cols(); isupper = false; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::spdmatrixcholeskyinverse(const_cast(a.c_ptr()), n, isupper, &info, const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void smp_spdmatrixcholeskyinverse(real_2d_array &a, ae_int_t &info, matinvreport &rep) { alglib_impl::ae_state _alglib_env_state; ae_int_t n; bool isupper; if( (a.cols()!=a.rows())) throw ap_error("Error while calling 'spdmatrixcholeskyinverse': looks like one of arguments has wrong size"); n = a.cols(); isupper = false; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::_pexec_spdmatrixcholeskyinverse(const_cast(a.c_ptr()), n, isupper, &info, const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Inversion of a symmetric positive definite matrix. Given an upper or lower triangle of a symmetric positive definite matrix, the algorithm generates matrix A^-1 and saves the upper or lower triangle depending on the input. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. However, Cholesky inversion is a "difficult" ! algorithm - it has lots of internal synchronization points which ! prevents efficient parallelization of algorithm. 
Only very large ! problems (N=thousands) can be efficiently parallelized. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. Input parameters: A - matrix to be inverted (upper or lower triangle). Array with elements [0..N-1,0..N-1]. N - size of matrix A (optional) : * if given, only principal NxN submatrix is processed and overwritten. other elements are unchanged. * if not given, size is automatically determined from matrix size (A must be square matrix) IsUpper - storage type (optional): * if True, symmetric matrix A is given by its upper triangle, and the lower triangle isn't used/changed by function * if False, symmetric matrix A is given by its lower triangle, and the upper triangle isn't used/changed by function * if not given, both lower and upper triangles must be filled. Output parameters: Info - return code, same as in RMatrixLUInverse Rep - solver report, same as in RMatrixLUInverse A - inverse of matrix A, same as in RMatrixLUInverse -- ALGLIB routine -- 10.02.2010 Bochkanov Sergey *************************************************************************/ void spdmatrixinverse(real_2d_array &a, const ae_int_t n, const bool isupper, ae_int_t &info, matinvreport &rep) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::spdmatrixinverse(const_cast(a.c_ptr()), n, isupper, &info, const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void smp_spdmatrixinverse(real_2d_array &a, const ae_int_t n, const bool isupper, ae_int_t &info, matinvreport &rep) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::_pexec_spdmatrixinverse(const_cast(a.c_ptr()), n, isupper, &info, const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Inversion of a symmetric positive definite matrix. Given an upper or lower triangle of a symmetric positive definite matrix, the algorithm generates matrix A^-1 and saves the upper or lower triangle depending on the input. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. However, Cholesky inversion is a "difficult" ! 
algorithm - it has lots of internal synchronization points which ! prevents efficient parallelization of algorithm. Only very large ! problems (N=thousands) can be efficiently parallelized. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. Input parameters: A - matrix to be inverted (upper or lower triangle). Array with elements [0..N-1,0..N-1]. N - size of matrix A (optional) : * if given, only principal NxN submatrix is processed and overwritten. other elements are unchanged. * if not given, size is automatically determined from matrix size (A must be square matrix) IsUpper - storage type (optional): * if True, symmetric matrix A is given by its upper triangle, and the lower triangle isn't used/changed by function * if False, symmetric matrix A is given by its lower triangle, and the upper triangle isn't used/changed by function * if not given, both lower and upper triangles must be filled. Output parameters: Info - return code, same as in RMatrixLUInverse Rep - solver report, same as in RMatrixLUInverse A - inverse of matrix A, same as in RMatrixLUInverse -- ALGLIB routine -- 10.02.2010 Bochkanov Sergey *************************************************************************/ void spdmatrixinverse(real_2d_array &a, ae_int_t &info, matinvreport &rep) { alglib_impl::ae_state _alglib_env_state; ae_int_t n; bool isupper; if( (a.cols()!=a.rows())) throw ap_error("Error while calling 'spdmatrixinverse': looks like one of arguments has wrong size"); if( !alglib_impl::ae_is_symmetric(const_cast(a.c_ptr())) ) throw ap_error("'a' parameter is not symmetric matrix"); n = a.cols(); isupper = false; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::spdmatrixinverse(const_cast(a.c_ptr()), n, isupper, &info, const_cast(rep.c_ptr()), &_alglib_env_state); if( !alglib_impl::ae_force_symmetric(const_cast(a.c_ptr())) ) throw ap_error("Internal error while forcing symmetricity of 'a' parameter"); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void smp_spdmatrixinverse(real_2d_array &a, ae_int_t &info, matinvreport &rep) { alglib_impl::ae_state _alglib_env_state; ae_int_t n; bool isupper; if( (a.cols()!=a.rows())) throw ap_error("Error while calling 'spdmatrixinverse': looks like one of arguments has wrong size"); if( !alglib_impl::ae_is_symmetric(const_cast(a.c_ptr())) ) throw ap_error("'a' parameter is not symmetric matrix"); n = a.cols(); isupper = false; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::_pexec_spdmatrixinverse(const_cast(a.c_ptr()), n, isupper, &info, const_cast(rep.c_ptr()), &_alglib_env_state); if( !alglib_impl::ae_force_symmetric(const_cast(a.c_ptr())) ) throw ap_error("Internal error while forcing symmetricity of 'a' parameter"); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Inversion of a Hermitian positive definite matrix which is given by Cholesky decomposition. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! 
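A corresponding sketch for spdmatrixinverse (values illustrative, linalg.h assumed on the include path): the 'expert' overload takes N and IsUpper explicitly, while the 'friendly' overload shown below expects a fully populated symmetric matrix, checks symmetry before inverting, and restores symmetry of the result.

#include <stdio.h>
#include "linalg.h"
using namespace alglib;

int main()
{
    // illustrative SPD matrix, both triangles filled (required by the friendly overload)
    real_2d_array a = "[[2,1],[1,2]]";
    ae_int_t info;
    matinvreport rep;

    spdmatrixinverse(a, info, rep);   // throws ap_error if a is not symmetric
    if( info>0 )
        printf("inv(A) = [[%.3f,%.3f],[%.3f,%.3f]]\n", a[0][0], a[0][1], a[1][0], a[1][1]);
    else
        printf("inversion failed, info=%d\n", (int)info);
    return 0;
}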
Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. However, Cholesky inversion is a "difficult" ! algorithm - it has lots of internal synchronization points which ! prevents efficient parallelization of algorithm. Only very large ! problems (N=thousands) can be efficiently parallelized. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. Input parameters: A - Cholesky decomposition of the matrix to be inverted: A=U'*U or A = L*L'. Output of HPDMatrixCholesky subroutine. N - size of matrix A (optional) : * if given, only principal NxN submatrix is processed and overwritten. other elements are unchanged. * if not given, size is automatically determined from matrix size (A must be square matrix) IsUpper - storage type (optional): * if True, symmetric matrix A is given by its upper triangle, and the lower triangle isn't used/changed by function * if False, symmetric matrix A is given by its lower triangle, and the upper triangle isn't used/changed by function * if not given, lower half is used. Output parameters: Info - return code, same as in RMatrixLUInverse Rep - solver report, same as in RMatrixLUInverse A - inverse of matrix A, same as in RMatrixLUInverse -- ALGLIB routine -- 10.02.2010 Bochkanov Sergey *************************************************************************/ void hpdmatrixcholeskyinverse(complex_2d_array &a, const ae_int_t n, const bool isupper, ae_int_t &info, matinvreport &rep) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::hpdmatrixcholeskyinverse(const_cast(a.c_ptr()), n, isupper, &info, const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void smp_hpdmatrixcholeskyinverse(complex_2d_array &a, const ae_int_t n, const bool isupper, ae_int_t &info, matinvreport &rep) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::_pexec_hpdmatrixcholeskyinverse(const_cast(a.c_ptr()), n, isupper, &info, const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Inversion of a Hermitian positive definite matrix which is given by Cholesky decomposition. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! 
Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. However, Cholesky inversion is a "difficult" ! algorithm - it has lots of internal synchronization points which ! prevents efficient parallelization of algorithm. Only very large ! problems (N=thousands) can be efficiently parallelized. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. Input parameters: A - Cholesky decomposition of the matrix to be inverted: A=U'*U or A = L*L'. Output of HPDMatrixCholesky subroutine. N - size of matrix A (optional) : * if given, only principal NxN submatrix is processed and overwritten. other elements are unchanged. * if not given, size is automatically determined from matrix size (A must be square matrix) IsUpper - storage type (optional): * if True, symmetric matrix A is given by its upper triangle, and the lower triangle isn't used/changed by function * if False, symmetric matrix A is given by its lower triangle, and the upper triangle isn't used/changed by function * if not given, lower half is used. Output parameters: Info - return code, same as in RMatrixLUInverse Rep - solver report, same as in RMatrixLUInverse A - inverse of matrix A, same as in RMatrixLUInverse -- ALGLIB routine -- 10.02.2010 Bochkanov Sergey *************************************************************************/ void hpdmatrixcholeskyinverse(complex_2d_array &a, ae_int_t &info, matinvreport &rep) { alglib_impl::ae_state _alglib_env_state; ae_int_t n; bool isupper; if( (a.cols()!=a.rows())) throw ap_error("Error while calling 'hpdmatrixcholeskyinverse': looks like one of arguments has wrong size"); n = a.cols(); isupper = false; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::hpdmatrixcholeskyinverse(const_cast(a.c_ptr()), n, isupper, &info, const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void smp_hpdmatrixcholeskyinverse(complex_2d_array &a, ae_int_t &info, matinvreport &rep) { alglib_impl::ae_state _alglib_env_state; ae_int_t n; bool isupper; if( (a.cols()!=a.rows())) throw ap_error("Error while calling 'hpdmatrixcholeskyinverse': looks like one of arguments has wrong size"); n = a.cols(); isupper = false; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::_pexec_hpdmatrixcholeskyinverse(const_cast(a.c_ptr()), n, isupper, &info, const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Inversion of a Hermitian positive definite matrix. 
Given an upper or lower triangle of a Hermitian positive definite matrix, the algorithm generates matrix A^-1 and saves the upper or lower triangle depending on the input. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. However, Cholesky inversion is a "difficult" ! algorithm - it has lots of internal synchronization points which ! prevents efficient parallelization of algorithm. Only very large ! problems (N=thousands) can be efficiently parallelized. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. Input parameters: A - matrix to be inverted (upper or lower triangle). Array with elements [0..N-1,0..N-1]. N - size of matrix A (optional) : * if given, only principal NxN submatrix is processed and overwritten. other elements are unchanged. * if not given, size is automatically determined from matrix size (A must be square matrix) IsUpper - storage type (optional): * if True, symmetric matrix A is given by its upper triangle, and the lower triangle isn't used/changed by function * if False, symmetric matrix A is given by its lower triangle, and the upper triangle isn't used/changed by function * if not given, both lower and upper triangles must be filled. 
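By analogy with the real SPD case, a minimal hedged sketch for the Hermitian routine (matrix values illustrative): the friendly overload requires both triangles of a Hermitian matrix, deduces N automatically, and throws if the input is not Hermitian. hpdmatrixcholeskyinverse is used the same way as spdmatrixcholeskyinverse above, but with a complex Cholesky factor.

#include <stdio.h>
#include "linalg.h"
using namespace alglib;

int main()
{
    // illustrative Hermitian positive definite matrix
    complex_2d_array a = "[[2,1i],[-1i,3]]";
    ae_int_t info;
    matinvreport rep;

    hpdmatrixinverse(a, info, rep);   // throws ap_error if a is not Hermitian
    printf("info=%d  inv(A)[0][0]=%.4f%+.4fi\n", (int)info, a[0][0].x, a[0][0].y);
    return 0;
}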
Output parameters: Info - return code, same as in RMatrixLUInverse Rep - solver report, same as in RMatrixLUInverse A - inverse of matrix A, same as in RMatrixLUInverse -- ALGLIB routine -- 10.02.2010 Bochkanov Sergey *************************************************************************/ void hpdmatrixinverse(complex_2d_array &a, const ae_int_t n, const bool isupper, ae_int_t &info, matinvreport &rep) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::hpdmatrixinverse(const_cast(a.c_ptr()), n, isupper, &info, const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void smp_hpdmatrixinverse(complex_2d_array &a, const ae_int_t n, const bool isupper, ae_int_t &info, matinvreport &rep) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::_pexec_hpdmatrixinverse(const_cast(a.c_ptr()), n, isupper, &info, const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Inversion of a Hermitian positive definite matrix. Given an upper or lower triangle of a Hermitian positive definite matrix, the algorithm generates matrix A^-1 and saves the upper or lower triangle depending on the input. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. However, Cholesky inversion is a "difficult" ! algorithm - it has lots of internal synchronization points which ! prevents efficient parallelization of algorithm. Only very large ! problems (N=thousands) can be efficiently parallelized. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. Input parameters: A - matrix to be inverted (upper or lower triangle). Array with elements [0..N-1,0..N-1]. N - size of matrix A (optional) : * if given, only principal NxN submatrix is processed and overwritten. other elements are unchanged. 
* if not given, size is automatically determined from matrix size (A must be square matrix) IsUpper - storage type (optional): * if True, symmetric matrix A is given by its upper triangle, and the lower triangle isn't used/changed by function * if False, symmetric matrix A is given by its lower triangle, and the upper triangle isn't used/changed by function * if not given, both lower and upper triangles must be filled. Output parameters: Info - return code, same as in RMatrixLUInverse Rep - solver report, same as in RMatrixLUInverse A - inverse of matrix A, same as in RMatrixLUInverse -- ALGLIB routine -- 10.02.2010 Bochkanov Sergey *************************************************************************/ void hpdmatrixinverse(complex_2d_array &a, ae_int_t &info, matinvreport &rep) { alglib_impl::ae_state _alglib_env_state; ae_int_t n; bool isupper; if( (a.cols()!=a.rows())) throw ap_error("Error while calling 'hpdmatrixinverse': looks like one of arguments has wrong size"); if( !alglib_impl::ae_is_hermitian(const_cast(a.c_ptr())) ) throw ap_error("'a' parameter is not Hermitian matrix"); n = a.cols(); isupper = false; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::hpdmatrixinverse(const_cast(a.c_ptr()), n, isupper, &info, const_cast(rep.c_ptr()), &_alglib_env_state); if( !alglib_impl::ae_force_hermitian(const_cast(a.c_ptr())) ) throw ap_error("Internal error while forcing Hermitian properties of 'a' parameter"); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void smp_hpdmatrixinverse(complex_2d_array &a, ae_int_t &info, matinvreport &rep) { alglib_impl::ae_state _alglib_env_state; ae_int_t n; bool isupper; if( (a.cols()!=a.rows())) throw ap_error("Error while calling 'hpdmatrixinverse': looks like one of arguments has wrong size"); if( !alglib_impl::ae_is_hermitian(const_cast(a.c_ptr())) ) throw ap_error("'a' parameter is not Hermitian matrix"); n = a.cols(); isupper = false; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::_pexec_hpdmatrixinverse(const_cast(a.c_ptr()), n, isupper, &info, const_cast(rep.c_ptr()), &_alglib_env_state); if( !alglib_impl::ae_force_hermitian(const_cast(a.c_ptr())) ) throw ap_error("Internal error while forcing Hermitian properties of 'a' parameter"); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Triangular matrix inverse (real) The subroutine inverts the following types of matrices: * upper triangular * upper triangular with unit diagonal * lower triangular * lower triangular with unit diagonal In case of an upper (lower) triangular matrix, the inverse matrix will also be upper (lower) triangular, and after the end of the algorithm, the inverse matrix replaces the source matrix. The elements below (above) the main diagonal are not changed by the algorithm. If the matrix has a unit diagonal, the inverse matrix also has a unit diagonal, and the diagonal elements are not passed to the algorithm. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! 
worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that triangular inverse is harder to ! parallelize than, say, matrix-matrix product - this algorithm has ! many internal synchronization points which can not be avoided. However ! parallelism starts to be profitable starting from N=1024, achieving ! near-linear speedup for N=4096 or higher. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. Input parameters: A - matrix, array[0..N-1, 0..N-1]. N - size of matrix A (optional) : * if given, only principal NxN submatrix is processed and overwritten. other elements are unchanged. * if not given, size is automatically determined from matrix size (A must be square matrix) IsUpper - True, if the matrix is upper triangular. IsUnit - diagonal type (optional): * if True, matrix has unit diagonal (a[i,i] are NOT used) * if False, matrix diagonal is arbitrary * if not given, False is assumed Output parameters: Info - same as for RMatrixLUInverse Rep - same as for RMatrixLUInverse A - same as for RMatrixLUInverse. 
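A short sketch for the triangular inverse (values illustrative): the expert overload below takes N, IsUpper and IsUnit explicitly; elements on the other side of the diagonal are left untouched.

#include <stdio.h>
#include "linalg.h"
using namespace alglib;

int main()
{
    // illustrative upper triangular matrix (lower part is ignored and preserved)
    real_2d_array a = "[[2,1,3],[0,4,5],[0,0,8]]";
    ae_int_t info;
    matinvreport rep;

    // invert the 3x3 upper triangular matrix with a non-unit diagonal
    rmatrixtrinverse(a, 3, true, false, info, rep);
    printf("info=%d  inv(A)[0][0]=%.3f  inv(A)[2][2]=%.3f\n", (int)info, a[0][0], a[2][2]);
    return 0;
}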
-- ALGLIB -- Copyright 05.02.2010 by Bochkanov Sergey *************************************************************************/ void rmatrixtrinverse(real_2d_array &a, const ae_int_t n, const bool isupper, const bool isunit, ae_int_t &info, matinvreport &rep) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::rmatrixtrinverse(const_cast(a.c_ptr()), n, isupper, isunit, &info, const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void smp_rmatrixtrinverse(real_2d_array &a, const ae_int_t n, const bool isupper, const bool isunit, ae_int_t &info, matinvreport &rep) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::_pexec_rmatrixtrinverse(const_cast(a.c_ptr()), n, isupper, isunit, &info, const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Triangular matrix inverse (real) The subroutine inverts the following types of matrices: * upper triangular * upper triangular with unit diagonal * lower triangular * lower triangular with unit diagonal In case of an upper (lower) triangular matrix, the inverse matrix will also be upper (lower) triangular, and after the end of the algorithm, the inverse matrix replaces the source matrix. The elements below (above) the main diagonal are not changed by the algorithm. If the matrix has a unit diagonal, the inverse matrix also has a unit diagonal, and the diagonal elements are not passed to the algorithm. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that triangular inverse is harder to ! parallelize than, say, matrix-matrix product - this algorithm has ! many internal synchronization points which can not be avoided. However ! parallelism starts to be profitable starting from N=1024, achieving ! near-linear speedup for N=4096 or higher. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. 
Input parameters: A - matrix, array[0..N-1, 0..N-1]. N - size of matrix A (optional) : * if given, only principal NxN submatrix is processed and overwritten. other elements are unchanged. * if not given, size is automatically determined from matrix size (A must be square matrix) IsUpper - True, if the matrix is upper triangular. IsUnit - diagonal type (optional): * if True, matrix has unit diagonal (a[i,i] are NOT used) * if False, matrix diagonal is arbitrary * if not given, False is assumed Output parameters: Info - same as for RMatrixLUInverse Rep - same as for RMatrixLUInverse A - same as for RMatrixLUInverse. -- ALGLIB -- Copyright 05.02.2010 by Bochkanov Sergey *************************************************************************/ void rmatrixtrinverse(real_2d_array &a, const bool isupper, ae_int_t &info, matinvreport &rep) { alglib_impl::ae_state _alglib_env_state; ae_int_t n; bool isunit; if( (a.cols()!=a.rows())) throw ap_error("Error while calling 'rmatrixtrinverse': looks like one of arguments has wrong size"); n = a.cols(); isunit = false; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::rmatrixtrinverse(const_cast(a.c_ptr()), n, isupper, isunit, &info, const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void smp_rmatrixtrinverse(real_2d_array &a, const bool isupper, ae_int_t &info, matinvreport &rep) { alglib_impl::ae_state _alglib_env_state; ae_int_t n; bool isunit; if( (a.cols()!=a.rows())) throw ap_error("Error while calling 'rmatrixtrinverse': looks like one of arguments has wrong size"); n = a.cols(); isunit = false; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::_pexec_rmatrixtrinverse(const_cast(a.c_ptr()), n, isupper, isunit, &info, const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Triangular matrix inverse (complex) The subroutine inverts the following types of matrices: * upper triangular * upper triangular with unit diagonal * lower triangular * lower triangular with unit diagonal In case of an upper (lower) triangular matrix, the inverse matrix will also be upper (lower) triangular, and after the end of the algorithm, the inverse matrix replaces the source matrix. The elements below (above) the main diagonal are not changed by the algorithm. If the matrix has a unit diagonal, the inverse matrix also has a unit diagonal, and the diagonal elements are not passed to the algorithm. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! 
problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that triangular inverse is harder to ! parallelize than, say, matrix-matrix product - this algorithm has ! many internal synchronization points which can not be avoided. However ! parallelism starts to be profitable starting from N=1024, achieving ! near-linear speedup for N=4096 or higher. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. Input parameters: A - matrix, array[0..N-1, 0..N-1]. N - size of matrix A (optional) : * if given, only principal NxN submatrix is processed and overwritten. other elements are unchanged. * if not given, size is automatically determined from matrix size (A must be square matrix) IsUpper - True, if the matrix is upper triangular. IsUnit - diagonal type (optional): * if True, matrix has unit diagonal (a[i,i] are NOT used) * if False, matrix diagonal is arbitrary * if not given, False is assumed Output parameters: Info - same as for RMatrixLUInverse Rep - same as for RMatrixLUInverse A - same as for RMatrixLUInverse. -- ALGLIB -- Copyright 05.02.2010 by Bochkanov Sergey *************************************************************************/ void cmatrixtrinverse(complex_2d_array &a, const ae_int_t n, const bool isupper, const bool isunit, ae_int_t &info, matinvreport &rep) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::cmatrixtrinverse(const_cast(a.c_ptr()), n, isupper, isunit, &info, const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void smp_cmatrixtrinverse(complex_2d_array &a, const ae_int_t n, const bool isupper, const bool isunit, ae_int_t &info, matinvreport &rep) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::_pexec_cmatrixtrinverse(const_cast(a.c_ptr()), n, isupper, isunit, &info, const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Triangular matrix inverse (complex) The subroutine inverts the following types of matrices: * upper triangular * upper triangular with unit diagonal * lower triangular * lower triangular with unit diagonal In case of an upper (lower) triangular matrix, the inverse matrix will also be upper (lower) triangular, and after the end of the algorithm, the inverse matrix replaces the source matrix. The elements below (above) the main diagonal are not changed by the algorithm. If the matrix has a unit diagonal, the inverse matrix also has a unit diagonal, and the diagonal elements are not passed to the algorithm. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! 
* Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that triangular inverse is harder to ! parallelize than, say, matrix-matrix product - this algorithm has ! many internal synchronization points which can not be avoided. However ! parallelism starts to be profitable starting from N=1024, achieving ! near-linear speedup for N=4096 or higher. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. Input parameters: A - matrix, array[0..N-1, 0..N-1]. N - size of matrix A (optional) : * if given, only principal NxN submatrix is processed and overwritten. other elements are unchanged. * if not given, size is automatically determined from matrix size (A must be square matrix) IsUpper - True, if the matrix is upper triangular. IsUnit - diagonal type (optional): * if True, matrix has unit diagonal (a[i,i] are NOT used) * if False, matrix diagonal is arbitrary * if not given, False is assumed Output parameters: Info - same as for RMatrixLUInverse Rep - same as for RMatrixLUInverse A - same as for RMatrixLUInverse. 
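/*************************************************************************
Editorial usage sketch (not part of the ALGLIB sources). It shows a call
to the 'friendly' rmatrixtrinverse overload wrapped earlier in this block;
the complex cmatrixtrinverse overloads and the smp_ variants are called
the same way. The header name linalg.h, the success convention info>0 and
the rep.r1 field are assumptions based on the standard ALGLIB C++
distribution.
*************************************************************************/
#include <cstdio>
#include "linalg.h"

int example_rmatrixtrinverse()
{
    // 3x3 upper triangular matrix; the strictly lower triangle is ignored
    // and left unchanged by the algorithm.
    alglib::real_2d_array a = "[[2,1,3],[0,4,5],[0,0,8]]";
    alglib::ae_int_t info;
    alglib::matinvreport rep;

    // In-place inversion of the upper triangular part (isupper=true); the
    // size and the non-unit diagonal are deduced automatically.
    alglib::rmatrixtrinverse(a, true, info, rep);

    if( info>0 )
        printf("inv(A)[0][0]=%0.3f, condition estimate r1=%0.3e\n",
               double(a[0][0]), double(rep.r1));
    return int(info);
}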
-- ALGLIB -- Copyright 05.02.2010 by Bochkanov Sergey *************************************************************************/ void cmatrixtrinverse(complex_2d_array &a, const bool isupper, ae_int_t &info, matinvreport &rep) { alglib_impl::ae_state _alglib_env_state; ae_int_t n; bool isunit; if( (a.cols()!=a.rows())) throw ap_error("Error while calling 'cmatrixtrinverse': looks like one of arguments has wrong size"); n = a.cols(); isunit = false; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::cmatrixtrinverse(const_cast(a.c_ptr()), n, isupper, isunit, &info, const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void smp_cmatrixtrinverse(complex_2d_array &a, const bool isupper, ae_int_t &info, matinvreport &rep) { alglib_impl::ae_state _alglib_env_state; ae_int_t n; bool isunit; if( (a.cols()!=a.rows())) throw ap_error("Error while calling 'cmatrixtrinverse': looks like one of arguments has wrong size"); n = a.cols(); isunit = false; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::_pexec_cmatrixtrinverse(const_cast(a.c_ptr()), n, isupper, isunit, &info, const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* QR decomposition of a rectangular matrix of size MxN COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that QP decomposition is harder to ! parallelize than, say, matrix-matrix product - this algorithm has ! many internal synchronization points which can not be avoided. However ! parallelism starts to be profitable starting from N=512, achieving ! near-linear speedup for N=4096 or higher. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. Input parameters: A - matrix A whose indexes range within [0..M-1, 0..N-1]. M - number of rows in matrix A. N - number of columns in matrix A. Output parameters: A - matrices Q and R in compact form (see below). Tau - array of scalar factors which are used to form matrix Q. Array whose index ranges within [0.. 
Min(M-1,N-1)]. Matrix A is represented as A = QR, where Q is an orthogonal matrix of size MxM, R - upper triangular (or upper trapezoid) matrix of size M x N. The elements of matrix R are located on and above the main diagonal of matrix A. The elements which are located in Tau array and below the main diagonal of matrix A are used to form matrix Q as follows: Matrix Q is represented as a product of elementary reflections Q = H(0)*H(2)*...*H(k-1), where k = min(m,n), and each H(i) is in the form H(i) = 1 - tau * v * (v^T) where tau is a scalar stored in Tau[I]; v - real vector, so that v(0:i-1) = 0, v(i) = 1, v(i+1:m-1) stored in A(i+1:m-1,i). -- ALGLIB routine -- 17.02.2010 Bochkanov Sergey *************************************************************************/ void rmatrixqr(real_2d_array &a, const ae_int_t m, const ae_int_t n, real_1d_array &tau) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::rmatrixqr(const_cast(a.c_ptr()), m, n, const_cast(tau.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void smp_rmatrixqr(real_2d_array &a, const ae_int_t m, const ae_int_t n, real_1d_array &tau) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::_pexec_rmatrixqr(const_cast(a.c_ptr()), m, n, const_cast(tau.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* LQ decomposition of a rectangular matrix of size MxN COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that QP decomposition is harder to ! parallelize than, say, matrix-matrix product - this algorithm has ! many internal synchronization points which can not be avoided. However ! parallelism starts to be profitable starting from N=512, achieving ! near-linear speedup for N=4096 or higher. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. Input parameters: A - matrix A whose indexes range within [0..M-1, 0..N-1]. 
M - number of rows in matrix A. N - number of columns in matrix A. Output parameters: A - matrices L and Q in compact form (see below) Tau - array of scalar factors which are used to form matrix Q. Array whose index ranges within [0..Min(M,N)-1]. Matrix A is represented as A = LQ, where Q is an orthogonal matrix of size MxM, L - lower triangular (or lower trapezoid) matrix of size M x N. The elements of matrix L are located on and below the main diagonal of matrix A. The elements which are located in Tau array and above the main diagonal of matrix A are used to form matrix Q as follows: Matrix Q is represented as a product of elementary reflections Q = H(k-1)*H(k-2)*...*H(1)*H(0), where k = min(m,n), and each H(i) is of the form H(i) = 1 - tau * v * (v^T) where tau is a scalar stored in Tau[I]; v - real vector, so that v(0:i-1)=0, v(i) = 1, v(i+1:n-1) stored in A(i,i+1:n-1). -- ALGLIB routine -- 17.02.2010 Bochkanov Sergey *************************************************************************/ void rmatrixlq(real_2d_array &a, const ae_int_t m, const ae_int_t n, real_1d_array &tau) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::rmatrixlq(const_cast(a.c_ptr()), m, n, const_cast(tau.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void smp_rmatrixlq(real_2d_array &a, const ae_int_t m, const ae_int_t n, real_1d_array &tau) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::_pexec_rmatrixlq(const_cast(a.c_ptr()), m, n, const_cast(tau.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* QR decomposition of a rectangular complex matrix of size MxN COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that QP decomposition is harder to ! parallelize than, say, matrix-matrix product - this algorithm has ! many internal synchronization points which can not be avoided. However ! parallelism starts to be profitable starting from N=512, achieving ! near-linear speedup for N=4096 or higher. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! 
We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. Input parameters: A - matrix A whose indexes range within [0..M-1, 0..N-1] M - number of rows in matrix A. N - number of columns in matrix A. Output parameters: A - matrices Q and R in compact form Tau - array of scalar factors which are used to form matrix Q. Array whose indexes range within [0.. Min(M,N)-1] Matrix A is represented as A = QR, where Q is an orthogonal matrix of size MxM, R - upper triangular (or upper trapezoid) matrix of size MxN. -- LAPACK routine (version 3.0) -- Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., Courant Institute, Argonne National Lab, and Rice University September 30, 1994 *************************************************************************/ void cmatrixqr(complex_2d_array &a, const ae_int_t m, const ae_int_t n, complex_1d_array &tau) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::cmatrixqr(const_cast(a.c_ptr()), m, n, const_cast(tau.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void smp_cmatrixqr(complex_2d_array &a, const ae_int_t m, const ae_int_t n, complex_1d_array &tau) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::_pexec_cmatrixqr(const_cast(a.c_ptr()), m, n, const_cast(tau.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* LQ decomposition of a rectangular complex matrix of size MxN COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that QP decomposition is harder to ! parallelize than, say, matrix-matrix product - this algorithm has ! many internal synchronization points which can not be avoided. However ! parallelism starts to be profitable starting from N=512, achieving ! near-linear speedup for N=4096 or higher. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! 
related features provided by commercial edition of ALGLIB. Input parameters: A - matrix A whose indexes range within [0..M-1, 0..N-1] M - number of rows in matrix A. N - number of columns in matrix A. Output parameters: A - matrices Q and L in compact form Tau - array of scalar factors which are used to form matrix Q. Array whose indexes range within [0.. Min(M,N)-1] Matrix A is represented as A = LQ, where Q is an orthogonal matrix of size MxM, L - lower triangular (or lower trapezoid) matrix of size MxN. -- LAPACK routine (version 3.0) -- Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., Courant Institute, Argonne National Lab, and Rice University September 30, 1994 *************************************************************************/ void cmatrixlq(complex_2d_array &a, const ae_int_t m, const ae_int_t n, complex_1d_array &tau) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::cmatrixlq(const_cast(a.c_ptr()), m, n, const_cast(tau.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void smp_cmatrixlq(complex_2d_array &a, const ae_int_t m, const ae_int_t n, complex_1d_array &tau) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::_pexec_cmatrixlq(const_cast(a.c_ptr()), m, n, const_cast(tau.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Partial unpacking of matrix Q from the QR decomposition of a matrix A COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that QP decomposition is harder to ! parallelize than, say, matrix-matrix product - this algorithm has ! many internal synchronization points which can not be avoided. However ! parallelism starts to be profitable starting from N=512, achieving ! near-linear speedup for N=4096 or higher. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. Input parameters: A - matrices Q and R in compact form. Output of RMatrixQR subroutine. 
M - number of rows in given matrix A. M>=0. N - number of columns in given matrix A. N>=0. Tau - scalar factors which are used to form Q. Output of the RMatrixQR subroutine. QColumns - required number of columns of matrix Q. M>=QColumns>=0. Output parameters: Q - first QColumns columns of matrix Q. Array whose indexes range within [0..M-1, 0..QColumns-1]. If QColumns=0, the array remains unchanged. -- ALGLIB routine -- 17.02.2010 Bochkanov Sergey *************************************************************************/ void rmatrixqrunpackq(const real_2d_array &a, const ae_int_t m, const ae_int_t n, const real_1d_array &tau, const ae_int_t qcolumns, real_2d_array &q) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::rmatrixqrunpackq(const_cast(a.c_ptr()), m, n, const_cast(tau.c_ptr()), qcolumns, const_cast(q.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void smp_rmatrixqrunpackq(const real_2d_array &a, const ae_int_t m, const ae_int_t n, const real_1d_array &tau, const ae_int_t qcolumns, real_2d_array &q) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::_pexec_rmatrixqrunpackq(const_cast(a.c_ptr()), m, n, const_cast(tau.c_ptr()), qcolumns, const_cast(q.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Unpacking of matrix R from the QR decomposition of a matrix A Input parameters: A - matrices Q and R in compact form. Output of RMatrixQR subroutine. M - number of rows in given matrix A. M>=0. N - number of columns in given matrix A. N>=0. Output parameters: R - matrix R, array[0..M-1, 0..N-1]. -- ALGLIB routine -- 17.02.2010 Bochkanov Sergey *************************************************************************/ void rmatrixqrunpackr(const real_2d_array &a, const ae_int_t m, const ae_int_t n, real_2d_array &r) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::rmatrixqrunpackr(const_cast(a.c_ptr()), m, n, const_cast(r.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Partial unpacking of matrix Q from the LQ decomposition of a matrix A COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! 
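/*************************************************************************
Editorial usage sketch (not part of the ALGLIB sources). It chains the
three real QR routines documented above: rmatrixqr produces the compact
Q/R representation, rmatrixqrunpackq and rmatrixqrunpackr then extract
the explicit factors. The header name linalg.h is an assumption based on
the standard ALGLIB C++ distribution.
*************************************************************************/
#include "linalg.h"

void example_rmatrixqr()
{
    // 4x3 matrix, M=4 rows, N=3 columns.
    alglib::real_2d_array a = "[[1,2,3],[4,5,6],[7,8,10],[1,0,1]]";
    alglib::real_1d_array tau;

    // Compact factorization: reflectors below the diagonal, R on/above it.
    alglib::rmatrixqr(a, 4, 3, tau);

    // Explicit factors: full 4x4 Q and 4x3 upper trapezoid R.
    alglib::real_2d_array q, r;
    alglib::rmatrixqrunpackq(a, 4, 3, tau, 4, q);
    alglib::rmatrixqrunpackr(a, 4, 3, r);
    // At this point Q*R reproduces the original matrix (up to rounding).
}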
Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that QP decomposition is harder to ! parallelize than, say, matrix-matrix product - this algorithm has ! many internal synchronization points which can not be avoided. However ! parallelism starts to be profitable starting from N=512, achieving ! near-linear speedup for N=4096 or higher. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. Input parameters: A - matrices L and Q in compact form. Output of RMatrixLQ subroutine. M - number of rows in given matrix A. M>=0. N - number of columns in given matrix A. N>=0. Tau - scalar factors which are used to form Q. Output of the RMatrixLQ subroutine. QRows - required number of rows in matrix Q. N>=QRows>=0. Output parameters: Q - first QRows rows of matrix Q. Array whose indexes range within [0..QRows-1, 0..N-1]. If QRows=0, the array remains unchanged. -- ALGLIB routine -- 17.02.2010 Bochkanov Sergey *************************************************************************/ void rmatrixlqunpackq(const real_2d_array &a, const ae_int_t m, const ae_int_t n, const real_1d_array &tau, const ae_int_t qrows, real_2d_array &q) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::rmatrixlqunpackq(const_cast(a.c_ptr()), m, n, const_cast(tau.c_ptr()), qrows, const_cast(q.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void smp_rmatrixlqunpackq(const real_2d_array &a, const ae_int_t m, const ae_int_t n, const real_1d_array &tau, const ae_int_t qrows, real_2d_array &q) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::_pexec_rmatrixlqunpackq(const_cast(a.c_ptr()), m, n, const_cast(tau.c_ptr()), qrows, const_cast(q.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Unpacking of matrix L from the LQ decomposition of a matrix A Input parameters: A - matrices Q and L in compact form. Output of RMatrixLQ subroutine. M - number of rows in given matrix A. M>=0. N - number of columns in given matrix A. N>=0. Output parameters: L - matrix L, array[0..M-1, 0..N-1]. -- ALGLIB routine -- 17.02.2010 Bochkanov Sergey *************************************************************************/ void rmatrixlqunpackl(const real_2d_array &a, const ae_int_t m, const ae_int_t n, real_2d_array &l) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::rmatrixlqunpackl(const_cast(a.c_ptr()), m, n, const_cast(l.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Partial unpacking of matrix Q from QR decomposition of a complex matrix A. 
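/*************************************************************************
Editorial usage sketch (not part of the ALGLIB sources), mirroring the QR
example for the real LQ routines documented above: rmatrixlq followed by
rmatrixlqunpackl and rmatrixlqunpackq.
*************************************************************************/
#include "linalg.h"

void example_rmatrixlq()
{
    // Wide 2x4 matrix, M=2, N=4.
    alglib::real_2d_array a = "[[1,2,3,4],[5,6,7,9]]";
    alglib::real_1d_array tau;

    alglib::rmatrixlq(a, 2, 4, tau);                // compact L and Q

    alglib::real_2d_array l, q;
    alglib::rmatrixlqunpackl(a, 2, 4, l);           // 2x4 lower trapezoid L
    alglib::rmatrixlqunpackq(a, 2, 4, tau, 4, q);   // first 4 rows of Q (full 4x4)
    // L*Q reproduces the original matrix (up to rounding).
}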
COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that QP decomposition is harder to ! parallelize than, say, matrix-matrix product - this algorithm has ! many internal synchronization points which can not be avoided. However ! parallelism starts to be profitable starting from N=512, achieving ! near-linear speedup for N=4096 or higher. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. Input parameters: A - matrices Q and R in compact form. Output of CMatrixQR subroutine . M - number of rows in matrix A. M>=0. N - number of columns in matrix A. N>=0. Tau - scalar factors which are used to form Q. Output of CMatrixQR subroutine . QColumns - required number of columns in matrix Q. M>=QColumns>=0. Output parameters: Q - first QColumns columns of matrix Q. Array whose index ranges within [0..M-1, 0..QColumns-1]. If QColumns=0, array isn't changed. -- ALGLIB routine -- 17.02.2010 Bochkanov Sergey *************************************************************************/ void cmatrixqrunpackq(const complex_2d_array &a, const ae_int_t m, const ae_int_t n, const complex_1d_array &tau, const ae_int_t qcolumns, complex_2d_array &q) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::cmatrixqrunpackq(const_cast(a.c_ptr()), m, n, const_cast(tau.c_ptr()), qcolumns, const_cast(q.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void smp_cmatrixqrunpackq(const complex_2d_array &a, const ae_int_t m, const ae_int_t n, const complex_1d_array &tau, const ae_int_t qcolumns, complex_2d_array &q) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::_pexec_cmatrixqrunpackq(const_cast(a.c_ptr()), m, n, const_cast(tau.c_ptr()), qcolumns, const_cast(q.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Unpacking of matrix R from the QR decomposition of a matrix A Input parameters: A - matrices Q and R in compact form. 
Output of CMatrixQR subroutine. M - number of rows in given matrix A. M>=0. N - number of columns in given matrix A. N>=0. Output parameters: R - matrix R, array[0..M-1, 0..N-1]. -- ALGLIB routine -- 17.02.2010 Bochkanov Sergey *************************************************************************/ void cmatrixqrunpackr(const complex_2d_array &a, const ae_int_t m, const ae_int_t n, complex_2d_array &r) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::cmatrixqrunpackr(const_cast(a.c_ptr()), m, n, const_cast(r.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Partial unpacking of matrix Q from LQ decomposition of a complex matrix A. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that LQ decomposition is harder to ! parallelize than, say, matrix-matrix product - this algorithm has ! many internal synchronization points which can not be avoided. However ! parallelism starts to be profitable starting from N=512, achieving ! near-linear speedup for N=4096 or higher. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. Input parameters: A - matrices Q and L in compact form. Output of CMatrixLQ subroutine. M - number of rows in matrix A. M>=0. N - number of columns in matrix A. N>=0. Tau - scalar factors which are used to form Q. Output of CMatrixLQ subroutine. QRows - required number of rows in matrix Q. N>=QRows>=0. Output parameters: Q - first QRows rows of matrix Q. Array whose index ranges within [0..QRows-1, 0..N-1]. If QRows=0, array isn't changed.
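/*************************************************************************
Editorial usage sketch (not part of the ALGLIB sources). The complex QR
routines documented above (cmatrixqr, cmatrixqrunpackq, cmatrixqrunpackr)
are called the same way as their real counterparts; only the array and
tau types change.
*************************************************************************/
#include "linalg.h"

void example_cmatrixqr()
{
    // 3x2 complex matrix, M=3, N=2.
    alglib::complex_2d_array a;
    a.setlength(3, 2);
    a[0][0] = alglib::complex(1, 1);  a[0][1] = 2;
    a[1][0] = 3;                      a[1][1] = alglib::complex(0, -1);
    a[2][0] = alglib::complex(2, 2);  a[2][1] = 1;

    alglib::complex_1d_array tau;
    alglib::cmatrixqr(a, 3, 2, tau);                 // compact Q and R

    alglib::complex_2d_array q, r;
    alglib::cmatrixqrunpackq(a, 3, 2, tau, 3, q);    // unitary 3x3 Q
    alglib::cmatrixqrunpackr(a, 3, 2, r);            // 3x2 upper trapezoid R
}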
-- ALGLIB routine -- 17.02.2010 Bochkanov Sergey *************************************************************************/ void cmatrixlqunpackq(const complex_2d_array &a, const ae_int_t m, const ae_int_t n, const complex_1d_array &tau, const ae_int_t qrows, complex_2d_array &q) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::cmatrixlqunpackq(const_cast(a.c_ptr()), m, n, const_cast(tau.c_ptr()), qrows, const_cast(q.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void smp_cmatrixlqunpackq(const complex_2d_array &a, const ae_int_t m, const ae_int_t n, const complex_1d_array &tau, const ae_int_t qrows, complex_2d_array &q) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::_pexec_cmatrixlqunpackq(const_cast(a.c_ptr()), m, n, const_cast(tau.c_ptr()), qrows, const_cast(q.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Unpacking of matrix L from the LQ decomposition of a matrix A Input parameters: A - matrices Q and L in compact form. Output of CMatrixLQ subroutine. M - number of rows in given matrix A. M>=0. N - number of columns in given matrix A. N>=0. Output parameters: L - matrix L, array[0..M-1, 0..N-1]. -- ALGLIB routine -- 17.02.2010 Bochkanov Sergey *************************************************************************/ void cmatrixlqunpackl(const complex_2d_array &a, const ae_int_t m, const ae_int_t n, complex_2d_array &l) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::cmatrixlqunpackl(const_cast(a.c_ptr()), m, n, const_cast(l.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Reduction of a rectangular matrix to bidiagonal form The algorithm reduces the rectangular matrix A to bidiagonal form by orthogonal transformations P and Q: A = Q*B*(P^T). COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes one important improvement of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Multithreaded acceleration is NOT supported for this function because ! bidiagonal decompostion is inherently sequential in nature. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. Input parameters: A - source matrix. array[0..M-1, 0..N-1] M - number of rows in matrix A. N - number of columns in matrix A. Output parameters: A - matrices Q, B, P in compact form (see below). TauQ - scalar factors which are used to form matrix Q. TauP - scalar factors which are used to form matrix P. 
The main diagonal and one of the secondary diagonals of matrix A are replaced with bidiagonal matrix B. Other elements contain elementary reflections which form MxM matrix Q and NxN matrix P, respectively. If M>=N, B is the upper bidiagonal MxN matrix and is stored in the corresponding elements of matrix A. Matrix Q is represented as a product of elementary reflections Q = H(0)*H(1)*...*H(n-1), where H(i) = 1-tau*v*v'. Here tau is a scalar which is stored in TauQ[i], and vector v has the following structure: v(0:i-1)=0, v(i)=1, v(i+1:m-1) is stored in elements A(i+1:m-1,i). Matrix P is as follows: P = G(0)*G(1)*...*G(n-2), where G(i) = 1 - tau*u*u'. Tau is stored in TauP[i], u(0:i)=0, u(i+1)=1, u(i+2:n-1) is stored in elements A(i,i+2:n-1). If M<N, B is the lower bidiagonal MxN matrix and is stored in the corresponding elements of matrix A. The contents of A on exit are illustrated by the following examples:
m=6, n=5 (m > n):
( d  e  u1 u1 u1 )
( v1 d  e  u2 u2 )
( v1 v2 d  e  u3 )
( v1 v2 v3 d  e  )
( v1 v2 v3 v4 d  )
( v1 v2 v3 v4 v5 )
m=5, n=6 (m < n):
( d  u1 u1 u1 u1 u1 )
( e  d  u2 u2 u2 u2 )
( v1 e  d  u3 u3 u3 )
( v1 v2 e  d  u4 u4 )
( v1 v2 v3 e  d  u5 )
Here vi and ui are vectors which form H(i) and G(i), and d and e are the diagonal and off-diagonal elements of matrix B. -- LAPACK routine (version 3.0) -- Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., Courant Institute, Argonne National Lab, and Rice University September 30, 1994. Sergey Bochkanov, ALGLIB project, translation from FORTRAN to pseudocode, 2007-2010. *************************************************************************/ void rmatrixbd(real_2d_array &a, const ae_int_t m, const ae_int_t n, real_1d_array &tauq, real_1d_array &taup) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::rmatrixbd(const_cast(a.c_ptr()), m, n, const_cast(tauq.c_ptr()), const_cast(taup.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Unpacking matrix Q which reduces a matrix to bidiagonal form. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes one important improvement of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Multithreaded acceleration is NOT supported for this function because ! bidiagonal decomposition is inherently sequential in nature. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. Input parameters: QP - matrices Q and P in compact form. Output of ToBidiagonal subroutine. M - number of rows in matrix A. N - number of columns in matrix A. TAUQ - scalar factors which are used to form Q. Output of ToBidiagonal subroutine. QColumns - required number of columns in matrix Q. M>=QColumns>=0. Output parameters: Q - first QColumns columns of matrix Q. Array[0..M-1, 0..QColumns-1] If QColumns=0, the array is not modified.
-- ALGLIB -- 2005-2010 Bochkanov Sergey *************************************************************************/ void rmatrixbdunpackq(const real_2d_array &qp, const ae_int_t m, const ae_int_t n, const real_1d_array &tauq, const ae_int_t qcolumns, real_2d_array &q) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::rmatrixbdunpackq(const_cast(qp.c_ptr()), m, n, const_cast(tauq.c_ptr()), qcolumns, const_cast(q.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Multiplication by matrix Q which reduces matrix A to bidiagonal form. The algorithm allows pre- or post-multiply by Q or Q'. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes one important improvement of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Multithreaded acceleration is NOT supported for this function. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. Input parameters: QP - matrices Q and P in compact form. Output of ToBidiagonal subroutine. M - number of rows in matrix A. N - number of columns in matrix A. TAUQ - scalar factors which are used to form Q. Output of ToBidiagonal subroutine. Z - multiplied matrix. array[0..ZRows-1,0..ZColumns-1] ZRows - number of rows in matrix Z. If FromTheRight=False, ZRows=M, otherwise ZRows can be arbitrary. ZColumns - number of columns in matrix Z. If FromTheRight=True, ZColumns=M, otherwise ZColumns can be arbitrary. FromTheRight - pre- or post-multiply. DoTranspose - multiply by Q or Q'. Output parameters: Z - product of Z and Q. Array[0..ZRows-1,0..ZColumns-1] If ZRows=0 or ZColumns=0, the array is not modified. -- ALGLIB -- 2005-2010 Bochkanov Sergey *************************************************************************/ void rmatrixbdmultiplybyq(const real_2d_array &qp, const ae_int_t m, const ae_int_t n, const real_1d_array &tauq, real_2d_array &z, const ae_int_t zrows, const ae_int_t zcolumns, const bool fromtheright, const bool dotranspose) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::rmatrixbdmultiplybyq(const_cast(qp.c_ptr()), m, n, const_cast(tauq.c_ptr()), const_cast(z.c_ptr()), zrows, zcolumns, fromtheright, dotranspose, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Unpacking matrix P which reduces matrix A to bidiagonal form. The subroutine returns transposed matrix P. Input parameters: QP - matrices Q and P in compact form. Output of ToBidiagonal subroutine. M - number of rows in matrix A. N - number of columns in matrix A. TAUP - scalar factors which are used to form P. Output of ToBidiagonal subroutine. PTRows - required number of rows of matrix P^T. N >= PTRows >= 0. 
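/*************************************************************************
Editorial usage sketch (not part of the ALGLIB sources). Instead of
forming Q explicitly with rmatrixbdunpackq, a matrix can be multiplied by
Q or Q' directly with rmatrixbdmultiplybyq (documented above), which is
cheaper when Q itself is never needed.
*************************************************************************/
#include "linalg.h"

void example_rmatrixbdmultiplybyq()
{
    // Bidiagonalize a 4x3 matrix (M=4, N=3).
    alglib::real_2d_array a = "[[1,2,3],[4,5,6],[7,8,10],[0,1,2]]";
    alglib::real_1d_array tauq, taup;
    alglib::rmatrixbd(a, 4, 3, tauq, taup);

    // Overwrite Z with Q'*Z without forming Q: pre-multiplication
    // (FromTheRight=false) requires ZRows=M=4, DoTranspose=true selects Q'.
    alglib::real_2d_array z = "[[1,0],[0,1],[1,1],[2,0]]";
    alglib::rmatrixbdmultiplybyq(a, 4, 3, tauq, z, 4, 2, false, true);
}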
Output parameters: PT - first PTRows columns of matrix P^T Array[0..PTRows-1, 0..N-1] If PTRows=0, the array is not modified. -- ALGLIB -- 2005-2010 Bochkanov Sergey *************************************************************************/ void rmatrixbdunpackpt(const real_2d_array &qp, const ae_int_t m, const ae_int_t n, const real_1d_array &taup, const ae_int_t ptrows, real_2d_array &pt) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::rmatrixbdunpackpt(const_cast(qp.c_ptr()), m, n, const_cast(taup.c_ptr()), ptrows, const_cast(pt.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Multiplication by matrix P which reduces matrix A to bidiagonal form. The algorithm allows pre- or post-multiply by P or P'. Input parameters: QP - matrices Q and P in compact form. Output of RMatrixBD subroutine. M - number of rows in matrix A. N - number of columns in matrix A. TAUP - scalar factors which are used to form P. Output of RMatrixBD subroutine. Z - multiplied matrix. Array whose indexes range within [0..ZRows-1,0..ZColumns-1]. ZRows - number of rows in matrix Z. If FromTheRight=False, ZRows=N, otherwise ZRows can be arbitrary. ZColumns - number of columns in matrix Z. If FromTheRight=True, ZColumns=N, otherwise ZColumns can be arbitrary. FromTheRight - pre- or post-multiply. DoTranspose - multiply by P or P'. Output parameters: Z - product of Z and P. Array whose indexes range within [0..ZRows-1,0..ZColumns-1]. If ZRows=0 or ZColumns=0, the array is not modified. -- ALGLIB -- 2005-2010 Bochkanov Sergey *************************************************************************/ void rmatrixbdmultiplybyp(const real_2d_array &qp, const ae_int_t m, const ae_int_t n, const real_1d_array &taup, real_2d_array &z, const ae_int_t zrows, const ae_int_t zcolumns, const bool fromtheright, const bool dotranspose) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::rmatrixbdmultiplybyp(const_cast(qp.c_ptr()), m, n, const_cast(taup.c_ptr()), const_cast(z.c_ptr()), zrows, zcolumns, fromtheright, dotranspose, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Unpacking of the main and secondary diagonals of bidiagonal decomposition of matrix A. Input parameters: B - output of RMatrixBD subroutine. M - number of rows in matrix B. N - number of columns in matrix B. Output parameters: IsUpper - True, if the matrix is upper bidiagonal. otherwise IsUpper is False. D - the main diagonal. Array whose index ranges within [0..Min(M,N)-1]. E - the secondary diagonal (upper or lower, depending on the value of IsUpper). Array index ranges within [0..Min(M,N)-1], the last element is not used. 
-- ALGLIB -- 2005-2010 Bochkanov Sergey *************************************************************************/ void rmatrixbdunpackdiagonals(const real_2d_array &b, const ae_int_t m, const ae_int_t n, bool &isupper, real_1d_array &d, real_1d_array &e) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::rmatrixbdunpackdiagonals(const_cast(b.c_ptr()), m, n, &isupper, const_cast(d.c_ptr()), const_cast(e.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Reduction of a square matrix to upper Hessenberg form: Q'*A*Q = H, where Q is an orthogonal matrix, H - Hessenberg matrix. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes one important improvement of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Generally, commercial ALGLIB is several times faster than open-source ! generic C edition, and many times faster than open-source C# edition. ! ! Multithreaded acceleration is NOT supported for this function. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. Input parameters: A - matrix A with elements [0..N-1, 0..N-1] N - size of matrix A. Output parameters: A - matrices Q and P in compact form (see below). Tau - array of scalar factors which are used to form matrix Q. Array whose index ranges within [0..N-2] Matrix H is located on the main diagonal, on the lower secondary diagonal and above the main diagonal of matrix A. The elements which are used to form matrix Q are situated in array Tau and below the lower secondary diagonal of matrix A as follows: Matrix Q is represented as a product of elementary reflections Q = H(0)*H(2)*...*H(n-2), where each H(i) is given by H(i) = 1 - tau * v * (v^T) where tau is a scalar stored in Tau[I]; v - is a real vector, so that v(0:i) = 0, v(i+1) = 1, v(i+2:n-1) stored in A(i+2:n-1,i). -- LAPACK routine (version 3.0) -- Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., Courant Institute, Argonne National Lab, and Rice University October 31, 1992 *************************************************************************/ void rmatrixhessenberg(real_2d_array &a, const ae_int_t n, real_1d_array &tau) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::rmatrixhessenberg(const_cast(a.c_ptr()), n, const_cast(tau.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Unpacking matrix Q which reduces matrix A to upper Hessenberg form COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes one important improvement of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! ! 
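/*************************************************************************
Editorial usage sketch (not part of the ALGLIB sources). It combines the
bidiagonalization routines documented above: rmatrixbd reduces A to
B = Q'*A*P in compact form, after which the Q factor, the transposed P
factor and the two diagonals of B can be unpacked.
*************************************************************************/
#include "linalg.h"

void example_rmatrixbd()
{
    // 4x3 matrix, M=4, N=3 (M>=N, so B is upper bidiagonal).
    alglib::real_2d_array a = "[[1,2,3],[4,5,6],[7,8,10],[0,1,2]]";
    alglib::real_1d_array tauq, taup;

    alglib::rmatrixbd(a, 4, 3, tauq, taup);           // compact Q, B, P

    alglib::real_2d_array q, pt;
    alglib::rmatrixbdunpackq(a, 4, 3, tauq, 4, q);    // 4x4 Q
    alglib::rmatrixbdunpackpt(a, 4, 3, taup, 3, pt);  // 3x3 P^T

    bool isupper;
    alglib::real_1d_array d, e;
    alglib::rmatrixbdunpackdiagonals(a, 4, 3, isupper, d, e);
    // Here isupper is true, d holds the main diagonal of B and e the
    // superdiagonal; A = Q*B*P^T up to rounding.
}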
Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Generally, commercial ALGLIB is several times faster than open-source ! generic C edition, and many times faster than open-source C# edition. ! ! Multithreaded acceleration is NOT supported for this function. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. Input parameters: A - output of RMatrixHessenberg subroutine. N - size of matrix A. Tau - scalar factors which are used to form Q. Output of RMatrixHessenberg subroutine. Output parameters: Q - matrix Q. Array whose indexes range within [0..N-1, 0..N-1]. -- ALGLIB -- 2005-2010 Bochkanov Sergey *************************************************************************/ void rmatrixhessenbergunpackq(const real_2d_array &a, const ae_int_t n, const real_1d_array &tau, real_2d_array &q) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::rmatrixhessenbergunpackq(const_cast(a.c_ptr()), n, const_cast(tau.c_ptr()), const_cast(q.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Unpacking matrix H (the result of matrix A reduction to upper Hessenberg form) Input parameters: A - output of RMatrixHessenberg subroutine. N - size of matrix A. Output parameters: H - matrix H. Array whose indexes range within [0..N-1, 0..N-1]. -- ALGLIB -- 2005-2010 Bochkanov Sergey *************************************************************************/ void rmatrixhessenbergunpackh(const real_2d_array &a, const ae_int_t n, real_2d_array &h) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::rmatrixhessenbergunpackh(const_cast(a.c_ptr()), n, const_cast(h.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Reduction of a symmetric matrix which is given by its higher or lower triangular part to a tridiagonal matrix using orthogonal similarity transformation: Q'*A*Q=T. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes one important improvement of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Generally, commercial ALGLIB is several times faster than open-source ! generic C edition, and many times faster than open-source C# edition. ! ! Multithreaded acceleration is NOT supported for this function. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. Input parameters: A - matrix to be transformed array with elements [0..N-1, 0..N-1]. 
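/*************************************************************************
Editorial usage sketch (not part of the ALGLIB sources). It reduces a
square matrix to upper Hessenberg form with rmatrixhessenberg and then
recovers the explicit Q and H factors with the two unpacking routines
documented above, so that A = Q*H*Q' up to rounding.
*************************************************************************/
#include "linalg.h"

void example_rmatrixhessenberg()
{
    alglib::real_2d_array a = "[[4,1,2,3],[1,3,0,1],[2,0,5,2],[3,1,2,6]]";
    alglib::real_1d_array tau;

    alglib::rmatrixhessenberg(a, 4, tau);             // compact Q and H

    alglib::real_2d_array q, h;
    alglib::rmatrixhessenbergunpackq(a, 4, tau, q);   // orthogonal 4x4 Q
    alglib::rmatrixhessenbergunpackh(a, 4, h);        // upper Hessenberg H
}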
N - size of matrix A. IsUpper - storage format. If IsUpper = True, then matrix A is given by its upper triangle, and the lower triangle is not used and not modified by the algorithm, and vice versa if IsUpper = False. Output parameters: A - matrices T and Q in compact form (see lower) Tau - array of factors which are forming matrices H(i) array with elements [0..N-2]. D - main diagonal of symmetric matrix T. array with elements [0..N-1]. E - secondary diagonal of symmetric matrix T. array with elements [0..N-2]. If IsUpper=True, the matrix Q is represented as a product of elementary reflectors Q = H(n-2) . . . H(2) H(0). Each H(i) has the form H(i) = I - tau * v * v' where tau is a real scalar, and v is a real vector with v(i+1:n-1) = 0, v(i) = 1, v(0:i-1) is stored on exit in A(0:i-1,i+1), and tau in TAU(i). If IsUpper=False, the matrix Q is represented as a product of elementary reflectors Q = H(0) H(2) . . . H(n-2). Each H(i) has the form H(i) = I - tau * v * v' where tau is a real scalar, and v is a real vector with v(0:i) = 0, v(i+1) = 1, v(i+2:n-1) is stored on exit in A(i+2:n-1,i), and tau in TAU(i). The contents of A on exit are illustrated by the following examples with n = 5: if UPLO = 'U': if UPLO = 'L': ( d e v1 v2 v3 ) ( d ) ( d e v2 v3 ) ( e d ) ( d e v3 ) ( v0 e d ) ( d e ) ( v0 v1 e d ) ( d ) ( v0 v1 v2 e d ) where d and e denote diagonal and off-diagonal elements of T, and vi denotes an element of the vector defining H(i). -- LAPACK routine (version 3.0) -- Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., Courant Institute, Argonne National Lab, and Rice University October 31, 1992 *************************************************************************/ void smatrixtd(real_2d_array &a, const ae_int_t n, const bool isupper, real_1d_array &tau, real_1d_array &d, real_1d_array &e) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::smatrixtd(const_cast(a.c_ptr()), n, isupper, const_cast(tau.c_ptr()), const_cast(d.c_ptr()), const_cast(e.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Unpacking matrix Q which reduces symmetric matrix to a tridiagonal form. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes one important improvement of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Generally, commercial ALGLIB is several times faster than open-source ! generic C edition, and many times faster than open-source C# edition. ! ! Multithreaded acceleration is NOT supported for this function. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. Input parameters: A - the result of a SMatrixTD subroutine N - size of matrix A. IsUpper - storage format (a parameter of SMatrixTD subroutine) Tau - the result of a SMatrixTD subroutine Output parameters: Q - transformation matrix. array with elements [0..N-1, 0..N-1]. 
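/*************************************************************************
Editorial usage sketch (not part of the ALGLIB sources). A symmetric
matrix given by its upper triangle is reduced to tridiagonal form T with
smatrixtd; D and E receive the diagonals of T, and smatrixtdunpackq (see
the wrapper that follows) recovers the orthogonal factor Q with
Q'*A*Q = T.
*************************************************************************/
#include "linalg.h"

void example_smatrixtd()
{
    // Symmetric 3x3 matrix; only the upper triangle is referenced.
    alglib::real_2d_array a = "[[4,1,2],[1,3,0],[2,0,5]]";
    alglib::real_1d_array tau, d, e;

    alglib::smatrixtd(a, 3, true, tau, d, e);        // isupper=true

    alglib::real_2d_array q;
    alglib::smatrixtdunpackq(a, 3, true, tau, q);
    // d has N entries (main diagonal of T), e has N-1 entries
    // (off-diagonal); Q is a 3x3 orthogonal matrix.
}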
-- ALGLIB -- Copyright 2005-2010 by Bochkanov Sergey *************************************************************************/ void smatrixtdunpackq(const real_2d_array &a, const ae_int_t n, const bool isupper, const real_1d_array &tau, real_2d_array &q) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::smatrixtdunpackq(const_cast(a.c_ptr()), n, isupper, const_cast(tau.c_ptr()), const_cast(q.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Reduction of a Hermitian matrix which is given by its higher or lower triangular part to a real tridiagonal matrix using unitary similarity transformation: Q'*A*Q = T. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes one important improvement of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Generally, commercial ALGLIB is several times faster than open-source ! generic C edition, and many times faster than open-source C# edition. ! ! Multithreaded acceleration is NOT supported for this function. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. Input parameters: A - matrix to be transformed array with elements [0..N-1, 0..N-1]. N - size of matrix A. IsUpper - storage format. If IsUpper = True, then matrix A is given by its upper triangle, and the lower triangle is not used and not modified by the algorithm, and vice versa if IsUpper = False. Output parameters: A - matrices T and Q in compact form (see lower) Tau - array of factors which are forming matrices H(i) array with elements [0..N-2]. D - main diagonal of real symmetric matrix T. array with elements [0..N-1]. E - secondary diagonal of real symmetric matrix T. array with elements [0..N-2]. If IsUpper=True, the matrix Q is represented as a product of elementary reflectors Q = H(n-2) . . . H(2) H(0). Each H(i) has the form H(i) = I - tau * v * v' where tau is a complex scalar, and v is a complex vector with v(i+1:n-1) = 0, v(i) = 1, v(0:i-1) is stored on exit in A(0:i-1,i+1), and tau in TAU(i). If IsUpper=False, the matrix Q is represented as a product of elementary reflectors Q = H(0) H(2) . . . H(n-2). Each H(i) has the form H(i) = I - tau * v * v' where tau is a complex scalar, and v is a complex vector with v(0:i) = 0, v(i+1) = 1, v(i+2:n-1) is stored on exit in A(i+2:n-1,i), and tau in TAU(i). The contents of A on exit are illustrated by the following examples with n = 5: if UPLO = 'U': if UPLO = 'L': ( d e v1 v2 v3 ) ( d ) ( d e v2 v3 ) ( e d ) ( d e v3 ) ( v0 e d ) ( d e ) ( v0 v1 e d ) ( d ) ( v0 v1 v2 e d ) where d and e denote diagonal and off-diagonal elements of T, and vi denotes an element of the vector defining H(i). -- LAPACK routine (version 3.0) -- Univ. of Tennessee, Univ. 
of California Berkeley, NAG Ltd., Courant Institute, Argonne National Lab, and Rice University October 31, 1992 *************************************************************************/ void hmatrixtd(complex_2d_array &a, const ae_int_t n, const bool isupper, complex_1d_array &tau, real_1d_array &d, real_1d_array &e) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::hmatrixtd(const_cast(a.c_ptr()), n, isupper, const_cast(tau.c_ptr()), const_cast(d.c_ptr()), const_cast(e.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Unpacking matrix Q which reduces a Hermitian matrix to a real tridiagonal form. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes one important improvement of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Generally, commercial ALGLIB is several times faster than open-source ! generic C edition, and many times faster than open-source C# edition. ! ! Multithreaded acceleration is NOT supported for this function. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. Input parameters: A - the result of a HMatrixTD subroutine N - size of matrix A. IsUpper - storage format (a parameter of HMatrixTD subroutine) Tau - the result of a HMatrixTD subroutine Output parameters: Q - transformation matrix. array with elements [0..N-1, 0..N-1]. -- ALGLIB -- Copyright 2005-2010 by Bochkanov Sergey *************************************************************************/ void hmatrixtdunpackq(const complex_2d_array &a, const ae_int_t n, const bool isupper, const complex_1d_array &tau, complex_2d_array &q) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::hmatrixtdunpackq(const_cast(a.c_ptr()), n, isupper, const_cast(tau.c_ptr()), const_cast(q.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Singular value decomposition of a bidiagonal matrix (extended algorithm) COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes one important improvement of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Generally, commercial ALGLIB is several times faster than open-source ! generic C edition, and many times faster than open-source C# edition. ! ! Multithreaded acceleration is NOT supported for this function. ! ! We recommend you to read 'Working with commercial version' section of ! 
ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. The algorithm performs the singular value decomposition of a bidiagonal matrix B (upper or lower) representing it as B = Q*S*P^T, where Q and P - orthogonal matrices, S - diagonal matrix with non-negative elements on the main diagonal, in descending order. The algorithm finds singular values. In addition, the algorithm can calculate matrices Q and P (more precisely, not the matrices, but their product with given matrices U and VT - U*Q and (P^T)*VT)). Of course, matrices U and VT can be of any type, including identity. Furthermore, the algorithm can calculate Q'*C (this product is calculated more effectively than U*Q, because this calculation operates with rows instead of matrix columns). The feature of the algorithm is its ability to find all singular values including those which are arbitrarily close to 0 with relative accuracy close to machine precision. If the parameter IsFractionalAccuracyRequired is set to True, all singular values will have high relative accuracy close to machine precision. If the parameter is set to False, only the biggest singular value will have relative accuracy close to machine precision. The absolute error of other singular values is equal to the absolute error of the biggest singular value. Input parameters: D - main diagonal of matrix B. Array whose index ranges within [0..N-1]. E - superdiagonal (or subdiagonal) of matrix B. Array whose index ranges within [0..N-2]. N - size of matrix B. IsUpper - True, if the matrix is upper bidiagonal. IsFractionalAccuracyRequired - THIS PARAMETER IS IGNORED SINCE ALGLIB 3.5.0 SINGULAR VALUES ARE ALWAYS SEARCHED WITH HIGH ACCURACY. U - matrix to be multiplied by Q. Array whose indexes range within [0..NRU-1, 0..N-1]. The matrix can be bigger, in that case only the submatrix [0..NRU-1, 0..N-1] will be multiplied by Q. NRU - number of rows in matrix U. C - matrix to be multiplied by Q'. Array whose indexes range within [0..N-1, 0..NCC-1]. The matrix can be bigger, in that case only the submatrix [0..N-1, 0..NCC-1] will be multiplied by Q'. NCC - number of columns in matrix C. VT - matrix to be multiplied by P^T. Array whose indexes range within [0..N-1, 0..NCVT-1]. The matrix can be bigger, in that case only the submatrix [0..N-1, 0..NCVT-1] will be multiplied by P^T. NCVT - number of columns in matrix VT. Output parameters: D - singular values of matrix B in descending order. U - if NRU>0, contains matrix U*Q. VT - if NCVT>0, contains matrix (P^T)*VT. C - if NCC>0, contains matrix Q'*C. Result: True, if the algorithm has converged. False, if the algorithm hasn't converged (rare case). NOTE: multiplication U*Q is performed by means of transposition to internal buffer, multiplication and backward transposition. It helps to avoid costly columnwise operations and speed-up algorithm. Additional information: The type of convergence is controlled by the internal parameter TOL. If the parameter is greater than 0, the singular values will have relative accuracy TOL. If TOL<0, the singular values will have absolute accuracy ABS(TOL)*norm(B). By default, |TOL| falls within the range of 10*Epsilon and 100*Epsilon, where Epsilon is the machine precision. It is not recommended to use TOL less than 10*Epsilon since this will considerably slow down the algorithm and may not lead to error decreasing. History: * 31 March, 2007. changed MAXITR from 6 to 12. -- LAPACK routine (version 3.0) -- Univ. 
of Tennessee, Univ. of California Berkeley, NAG Ltd., Courant Institute, Argonne National Lab, and Rice University October 31, 1999. *************************************************************************/ bool rmatrixbdsvd(real_1d_array &d, const real_1d_array &e, const ae_int_t n, const bool isupper, const bool isfractionalaccuracyrequired, real_2d_array &u, const ae_int_t nru, real_2d_array &c, const ae_int_t ncc, real_2d_array &vt, const ae_int_t ncvt) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { ae_bool result = alglib_impl::rmatrixbdsvd(const_cast(d.c_ptr()), const_cast(e.c_ptr()), n, isupper, isfractionalaccuracyrequired, const_cast(u.c_ptr()), nru, const_cast(c.c_ptr()), ncc, const_cast(vt.c_ptr()), ncvt, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Singular value decomposition of a rectangular matrix. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes one important improvement of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Generally, commercial ALGLIB is several times faster than open-source ! generic C edition, and many times faster than open-source C# edition. ! ! Multithreaded acceleration is only partially supported (some parts are ! optimized, but most - are not). ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. The algorithm calculates the singular value decomposition of a matrix of size MxN: A = U * S * V^T The algorithm finds the singular values and, optionally, matrices U and V^T. The algorithm can find both first min(M,N) columns of matrix U and rows of matrix V^T (singular vectors), and matrices U and V^T wholly (of sizes MxM and NxN respectively). Take into account that the subroutine does not return matrix V but V^T. Input parameters: A - matrix to be decomposed. Array whose indexes range within [0..M-1, 0..N-1]. M - number of rows in matrix A. N - number of columns in matrix A. UNeeded - 0, 1 or 2. See the description of the parameter U. VTNeeded - 0, 1 or 2. See the description of the parameter VT. AdditionalMemory - If the parameter: * equals 0, the algorithm doesn't use additional memory (lower requirements, lower performance). * equals 1, the algorithm uses additional memory of size min(M,N)*min(M,N) of real numbers. It often speeds up the algorithm. * equals 2, the algorithm uses additional memory of size M*min(M,N) of real numbers. It allows to get a maximum performance. The recommended value of the parameter is 2. Output parameters: W - contains singular values in descending order. U - if UNeeded=0, U isn't changed, the left singular vectors are not calculated. if Uneeded=1, U contains left singular vectors (first min(M,N) columns of matrix U). Array whose indexes range within [0..M-1, 0..Min(M,N)-1]. if UNeeded=2, U contains matrix U wholly. Array whose indexes range within [0..M-1, 0..M-1]. 
VT - if VTNeeded=0, VT isn't changed, the right singular vectors are not calculated. if VTNeeded=1, VT contains right singular vectors (first min(M,N) rows of matrix V^T). Array whose indexes range within [0..min(M,N)-1, 0..N-1]. if VTNeeded=2, VT contains matrix V^T wholly. Array whose indexes range within [0..N-1, 0..N-1]. -- ALGLIB -- Copyright 2005 by Bochkanov Sergey *************************************************************************/ bool rmatrixsvd(const real_2d_array &a, const ae_int_t m, const ae_int_t n, const ae_int_t uneeded, const ae_int_t vtneeded, const ae_int_t additionalmemory, real_1d_array &w, real_2d_array &u, real_2d_array &vt) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { ae_bool result = alglib_impl::rmatrixsvd(const_cast(a.c_ptr()), m, n, uneeded, vtneeded, additionalmemory, const_cast(w.c_ptr()), const_cast(u.c_ptr()), const_cast(vt.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } bool smp_rmatrixsvd(const real_2d_array &a, const ae_int_t m, const ae_int_t n, const ae_int_t uneeded, const ae_int_t vtneeded, const ae_int_t additionalmemory, real_1d_array &w, real_2d_array &u, real_2d_array &vt) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { ae_bool result = alglib_impl::_pexec_rmatrixsvd(const_cast(a.c_ptr()), m, n, uneeded, vtneeded, additionalmemory, const_cast(w.c_ptr()), const_cast(u.c_ptr()), const_cast(vt.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This object stores state of the iterative norm estimation algorithm. You should use ALGLIB functions to work with this object. 
*************************************************************************/ _normestimatorstate_owner::_normestimatorstate_owner() { p_struct = (alglib_impl::normestimatorstate*)alglib_impl::ae_malloc(sizeof(alglib_impl::normestimatorstate), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_normestimatorstate_init(p_struct, NULL); } _normestimatorstate_owner::_normestimatorstate_owner(const _normestimatorstate_owner &rhs) { p_struct = (alglib_impl::normestimatorstate*)alglib_impl::ae_malloc(sizeof(alglib_impl::normestimatorstate), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_normestimatorstate_init_copy(p_struct, const_cast(rhs.p_struct), NULL); } _normestimatorstate_owner& _normestimatorstate_owner::operator=(const _normestimatorstate_owner &rhs) { if( this==&rhs ) return *this; alglib_impl::_normestimatorstate_clear(p_struct); alglib_impl::_normestimatorstate_init_copy(p_struct, const_cast(rhs.p_struct), NULL); return *this; } _normestimatorstate_owner::~_normestimatorstate_owner() { alglib_impl::_normestimatorstate_clear(p_struct); ae_free(p_struct); } alglib_impl::normestimatorstate* _normestimatorstate_owner::c_ptr() { return p_struct; } alglib_impl::normestimatorstate* _normestimatorstate_owner::c_ptr() const { return const_cast(p_struct); } normestimatorstate::normestimatorstate() : _normestimatorstate_owner() { } normestimatorstate::normestimatorstate(const normestimatorstate &rhs):_normestimatorstate_owner(rhs) { } normestimatorstate& normestimatorstate::operator=(const normestimatorstate &rhs) { if( this==&rhs ) return *this; _normestimatorstate_owner::operator=(rhs); return *this; } normestimatorstate::~normestimatorstate() { } /************************************************************************* This procedure initializes matrix norm estimator. USAGE: 1. User initializes algorithm state with NormEstimatorCreate() call 2. User calls NormEstimatorEstimateSparse() (or NormEstimatorIteration()) 3. User calls NormEstimatorResults() to get solution. INPUT PARAMETERS: M - number of rows in the matrix being estimated, M>0 N - number of columns in the matrix being estimated, N>0 NStart - number of random starting vectors recommended value - at least 5. NIts - number of iterations to do with best starting vector recommended value - at least 5. OUTPUT PARAMETERS: State - structure which stores algorithm state NOTE: this algorithm is effectively deterministic, i.e. it always returns same result when repeatedly called for the same matrix. In fact, algorithm uses randomized starting vectors, but internal random numbers generator always generates same sequence of the random values (it is a feature, not bug). Algorithm can be made non-deterministic with NormEstimatorSetSeed(0) call. -- ALGLIB -- Copyright 06.12.2011 by Bochkanov Sergey *************************************************************************/ void normestimatorcreate(const ae_int_t m, const ae_int_t n, const ae_int_t nstart, const ae_int_t nits, normestimatorstate &state) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::normestimatorcreate(m, n, nstart, nits, const_cast(state.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function changes seed value used by algorithm. 
In some cases we need deterministic processing, i.e. subsequent calls must return equal results, in other cases we need non-deterministic algorithm which returns different results for the same matrix on every pass. Setting zero seed will lead to non-deterministic algorithm, while non-zero value will make our algorithm deterministic. INPUT PARAMETERS: State - norm estimator state, must be initialized with a call to NormEstimatorCreate() SeedVal - seed value, >=0. Zero value = non-deterministic algo. -- ALGLIB -- Copyright 06.12.2011 by Bochkanov Sergey *************************************************************************/ void normestimatorsetseed(const normestimatorstate &state, const ae_int_t seedval) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::normestimatorsetseed(const_cast(state.c_ptr()), seedval, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function estimates norm of the sparse M*N matrix A. INPUT PARAMETERS: State - norm estimator state, must be initialized with a call to NormEstimatorCreate() A - sparse M*N matrix, must be converted to CRS format prior to calling this function. After this function is over you can call NormEstimatorResults() to get estimate of the norm(A). -- ALGLIB -- Copyright 06.12.2011 by Bochkanov Sergey *************************************************************************/ void normestimatorestimatesparse(const normestimatorstate &state, const sparsematrix &a) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::normestimatorestimatesparse(const_cast(state.c_ptr()), const_cast(a.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Matrix norm estimation results INPUT PARAMETERS: State - algorithm state OUTPUT PARAMETERS: Nrm - estimate of the matrix norm, Nrm>=0 -- ALGLIB -- Copyright 06.12.2011 by Bochkanov Sergey *************************************************************************/ void normestimatorresults(const normestimatorstate &state, double &nrm) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::normestimatorresults(const_cast(state.c_ptr()), &nrm, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This object stores state of the subspace iteration algorithm. You should use ALGLIB functions to work with this object. 
*************************************************************************/ _eigsubspacestate_owner::_eigsubspacestate_owner() { p_struct = (alglib_impl::eigsubspacestate*)alglib_impl::ae_malloc(sizeof(alglib_impl::eigsubspacestate), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_eigsubspacestate_init(p_struct, NULL); } _eigsubspacestate_owner::_eigsubspacestate_owner(const _eigsubspacestate_owner &rhs) { p_struct = (alglib_impl::eigsubspacestate*)alglib_impl::ae_malloc(sizeof(alglib_impl::eigsubspacestate), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_eigsubspacestate_init_copy(p_struct, const_cast(rhs.p_struct), NULL); } _eigsubspacestate_owner& _eigsubspacestate_owner::operator=(const _eigsubspacestate_owner &rhs) { if( this==&rhs ) return *this; alglib_impl::_eigsubspacestate_clear(p_struct); alglib_impl::_eigsubspacestate_init_copy(p_struct, const_cast(rhs.p_struct), NULL); return *this; } _eigsubspacestate_owner::~_eigsubspacestate_owner() { alglib_impl::_eigsubspacestate_clear(p_struct); ae_free(p_struct); } alglib_impl::eigsubspacestate* _eigsubspacestate_owner::c_ptr() { return p_struct; } alglib_impl::eigsubspacestate* _eigsubspacestate_owner::c_ptr() const { return const_cast(p_struct); } eigsubspacestate::eigsubspacestate() : _eigsubspacestate_owner() { } eigsubspacestate::eigsubspacestate(const eigsubspacestate &rhs):_eigsubspacestate_owner(rhs) { } eigsubspacestate& eigsubspacestate::operator=(const eigsubspacestate &rhs) { if( this==&rhs ) return *this; _eigsubspacestate_owner::operator=(rhs); return *this; } eigsubspacestate::~eigsubspacestate() { } /************************************************************************* This object stores state of the subspace iteration algorithm. You should use ALGLIB functions to work with this object. 
*************************************************************************/ _eigsubspacereport_owner::_eigsubspacereport_owner() { p_struct = (alglib_impl::eigsubspacereport*)alglib_impl::ae_malloc(sizeof(alglib_impl::eigsubspacereport), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_eigsubspacereport_init(p_struct, NULL); } _eigsubspacereport_owner::_eigsubspacereport_owner(const _eigsubspacereport_owner &rhs) { p_struct = (alglib_impl::eigsubspacereport*)alglib_impl::ae_malloc(sizeof(alglib_impl::eigsubspacereport), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_eigsubspacereport_init_copy(p_struct, const_cast(rhs.p_struct), NULL); } _eigsubspacereport_owner& _eigsubspacereport_owner::operator=(const _eigsubspacereport_owner &rhs) { if( this==&rhs ) return *this; alglib_impl::_eigsubspacereport_clear(p_struct); alglib_impl::_eigsubspacereport_init_copy(p_struct, const_cast(rhs.p_struct), NULL); return *this; } _eigsubspacereport_owner::~_eigsubspacereport_owner() { alglib_impl::_eigsubspacereport_clear(p_struct); ae_free(p_struct); } alglib_impl::eigsubspacereport* _eigsubspacereport_owner::c_ptr() { return p_struct; } alglib_impl::eigsubspacereport* _eigsubspacereport_owner::c_ptr() const { return const_cast(p_struct); } eigsubspacereport::eigsubspacereport() : _eigsubspacereport_owner() ,iterationscount(p_struct->iterationscount) { } eigsubspacereport::eigsubspacereport(const eigsubspacereport &rhs):_eigsubspacereport_owner(rhs) ,iterationscount(p_struct->iterationscount) { } eigsubspacereport& eigsubspacereport::operator=(const eigsubspacereport &rhs) { if( this==&rhs ) return *this; _eigsubspacereport_owner::operator=(rhs); return *this; } eigsubspacereport::~eigsubspacereport() { } /************************************************************************* This function initializes subspace iteration solver. This solver is used to solve symmetric real eigenproblems where just a few (top K) eigenvalues and corresponding eigenvectors is required. This solver can be significantly faster than complete EVD decomposition in the following case: * when only just a small fraction of top eigenpairs of dense matrix is required. When K approaches N, this solver is slower than complete dense EVD * when problem matrix is sparse (and/or is not known explicitly, i.e. only matrix-matrix product can be performed) USAGE (explicit dense/sparse matrix): 1. User initializes algorithm state with eigsubspacecreate() call 2. [optional] User tunes solver parameters by calling eigsubspacesetcond() or other functions 3. User calls eigsubspacesolvedense() or eigsubspacesolvesparse() methods, which take algorithm state and 2D array or alglib.sparsematrix object. USAGE (out-of-core mode): 1. User initializes algorithm state with eigsubspacecreate() call 2. [optional] User tunes solver parameters by calling eigsubspacesetcond() or other functions 3. 
User activates out-of-core mode of the solver and repeatedly calls communication functions in a loop like below: > alglib.eigsubspaceoocstart(state) > while alglib.eigsubspaceooccontinue(state) do > alglib.eigsubspaceoocgetrequestinfo(state, out RequestType, out M) > alglib.eigsubspaceoocgetrequestdata(state, out X) > [calculate Y=A*X, with X=R^NxM] > alglib.eigsubspaceoocsendresult(state, in Y) > alglib.eigsubspaceoocstop(state, out W, out Z, out Report) INPUT PARAMETERS: N - problem dimensionality, N>0 K - number of top eigenvector to calculate, 0(state.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Buffered version of constructor which aims to reuse previously allocated memory as much as possible. -- ALGLIB -- Copyright 16.01.2017 by Bochkanov Sergey *************************************************************************/ void eigsubspacecreatebuf(const ae_int_t n, const ae_int_t k, const eigsubspacestate &state) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::eigsubspacecreatebuf(n, k, const_cast(state.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function sets stopping critera for the solver: * error in eigenvector/value allowed by solver * maximum number of iterations to perform INPUT PARAMETERS: State - solver structure Eps - eps>=0, with non-zero value used to tell solver that it can stop after all eigenvalues converged with error roughly proportional to eps*MAX(LAMBDA_MAX), where LAMBDA_MAX is a maximum eigenvalue. Zero value means that no check for precision is performed. MaxIts - maxits>=0, with non-zero value used to tell solver that it can stop after maxits steps (no matter how precise current estimate is) NOTE: passing eps=0 and maxits=0 results in automatic selection of moderate eps as stopping criteria (1.0E-6 in current implementation, but it may change without notice). NOTE: very small values of eps are possible (say, 1.0E-12), although the larger problem you solve (N and/or K), the harder it is to find precise eigenvectors because rounding errors tend to accumulate. NOTE: passing non-zero eps results in some performance penalty, roughly equal to 2N*(2K)^2 FLOPs per iteration. These additional computations are required in order to estimate current error in eigenvalues via Rayleigh-Ritz process. Most of this additional time is spent in construction of ~2Kx2K symmetric subproblem whose eigenvalues are checked with exact eigensolver. This additional time is negligible if you search for eigenvalues of the large dense matrix, but may become noticeable on highly sparse EVD problems, where cost of matrix-matrix product is low. If you set eps to exactly zero, Rayleigh-Ritz phase is completely turned off. 
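For illustration, a hedged sketch of setting the stopping criteria right after creating the solver (problem size and tolerance below are placeholders; linalg.h and the alglib namespace are assumed as elsewhere in this manual):

    eigsubspacestate state;
    eigsubspacecreate(1000, 5, state);      // 1000x1000 problem, top 5 eigenpairs requested
    eigsubspacesetcond(state, 1.0E-6, 0);   // stop once eigenvalues converge to roughly 1E-6, no iteration limit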
-- ALGLIB -- Copyright 16.01.2017 by Bochkanov Sergey *************************************************************************/ void eigsubspacesetcond(const eigsubspacestate &state, const double eps, const ae_int_t maxits) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::eigsubspacesetcond(const_cast(state.c_ptr()), eps, maxits, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function initiates out-of-core mode of subspace eigensolver. It should be used in conjunction with other out-of-core-related functions of this subspackage in a loop like below: > alglib.eigsubspaceoocstart(state) > while alglib.eigsubspaceooccontinue(state) do > alglib.eigsubspaceoocgetrequestinfo(state, out RequestType, out M) > alglib.eigsubspaceoocgetrequestdata(state, out X) > [calculate Y=A*X, with X=R^NxM] > alglib.eigsubspaceoocsendresult(state, in Y) > alglib.eigsubspaceoocstop(state, out W, out Z, out Report) INPUT PARAMETERS: State - solver object MType - matrix type: * 0 for real symmetric matrix (solver assumes that matrix being processed is symmetric; symmetric direct eigensolver is used for smaller subproblems arising during solution of larger "full" task) Future versions of ALGLIB may introduce support for other matrix types; for now, only symmetric eigenproblems are supported. -- ALGLIB -- Copyright 16.01.2017 by Bochkanov Sergey *************************************************************************/ void eigsubspaceoocstart(const eigsubspacestate &state, const ae_int_t mtype) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::eigsubspaceoocstart(const_cast(state.c_ptr()), mtype, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function performs subspace iteration in the out-of-core mode. 
It should be used in conjunction with other out-of-core-related functions of this subspackage in a loop like below: > alglib.eigsubspaceoocstart(state) > while alglib.eigsubspaceooccontinue(state) do > alglib.eigsubspaceoocgetrequestinfo(state, out RequestType, out M) > alglib.eigsubspaceoocgetrequestdata(state, out X) > [calculate Y=A*X, with X=R^NxM] > alglib.eigsubspaceoocsendresult(state, in Y) > alglib.eigsubspaceoocstop(state, out W, out Z, out Report) -- ALGLIB -- Copyright 16.01.2017 by Bochkanov Sergey *************************************************************************/ bool eigsubspaceooccontinue(const eigsubspacestate &state) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { ae_bool result = alglib_impl::eigsubspaceooccontinue(const_cast(state.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function is used to retrieve information about out-of-core request sent by solver to user code: request type (current version of the solver sends only requests for matrix-matrix products) and request size (size of the matrices being multiplied). This function returns just request metrics; in order to get contents of the matrices being multiplied, use eigsubspaceoocgetrequestdata(). It should be used in conjunction with other out-of-core-related functions of this subspackage in a loop like below: > alglib.eigsubspaceoocstart(state) > while alglib.eigsubspaceooccontinue(state) do > alglib.eigsubspaceoocgetrequestinfo(state, out RequestType, out M) > alglib.eigsubspaceoocgetrequestdata(state, out X) > [calculate Y=A*X, with X=R^NxM] > alglib.eigsubspaceoocsendresult(state, in Y) > alglib.eigsubspaceoocstop(state, out W, out Z, out Report) INPUT PARAMETERS: State - solver running in out-of-core mode OUTPUT PARAMETERS: RequestType - type of the request to process: * 0 - for matrix-matrix product A*X, with A being NxN matrix whose eigenvalues/vectors are needed, and X being NxREQUESTSIZE one which is returned by the eigsubspaceoocgetrequestdata(). RequestSize - size of the X matrix (number of columns), usually it is several times larger than number of vectors K requested by user. -- ALGLIB -- Copyright 16.01.2017 by Bochkanov Sergey *************************************************************************/ void eigsubspaceoocgetrequestinfo(const eigsubspacestate &state, ae_int_t &requesttype, ae_int_t &requestsize) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::eigsubspaceoocgetrequestinfo(const_cast(state.c_ptr()), &requesttype, &requestsize, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function is used to retrieve information about out-of-core request sent by solver to user code: matrix X (array[N,RequestSize) which have to be multiplied by out-of-core matrix A in a product A*X. This function returns just request data; in order to get size of the data prior to processing requestm, use eigsubspaceoocgetrequestinfo(). 
It should be used in conjunction with other out-of-core-related functions of this subspackage in a loop like below: > alglib.eigsubspaceoocstart(state) > while alglib.eigsubspaceooccontinue(state) do > alglib.eigsubspaceoocgetrequestinfo(state, out RequestType, out M) > alglib.eigsubspaceoocgetrequestdata(state, out X) > [calculate Y=A*X, with X=R^NxM] > alglib.eigsubspaceoocsendresult(state, in Y) > alglib.eigsubspaceoocstop(state, out W, out Z, out Report) INPUT PARAMETERS: State - solver running in out-of-core mode X - possibly preallocated storage; reallocated if needed, left unchanged, if large enough to store request data. OUTPUT PARAMETERS: X - array[N,RequestSize] or larger, leading rectangle is filled with dense matrix X. -- ALGLIB -- Copyright 16.01.2017 by Bochkanov Sergey *************************************************************************/ void eigsubspaceoocgetrequestdata(const eigsubspacestate &state, real_2d_array &x) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::eigsubspaceoocgetrequestdata(const_cast(state.c_ptr()), const_cast(x.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function is used to send user reply to out-of-core request sent by solver. Usually it is product A*X for returned by solver matrix X. It should be used in conjunction with other out-of-core-related functions of this subspackage in a loop like below: > alglib.eigsubspaceoocstart(state) > while alglib.eigsubspaceooccontinue(state) do > alglib.eigsubspaceoocgetrequestinfo(state, out RequestType, out M) > alglib.eigsubspaceoocgetrequestdata(state, out X) > [calculate Y=A*X, with X=R^NxM] > alglib.eigsubspaceoocsendresult(state, in Y) > alglib.eigsubspaceoocstop(state, out W, out Z, out Report) INPUT PARAMETERS: State - solver running in out-of-core mode AX - array[N,RequestSize] or larger, leading rectangle is filled with product A*X. -- ALGLIB -- Copyright 16.01.2017 by Bochkanov Sergey *************************************************************************/ void eigsubspaceoocsendresult(const eigsubspacestate &state, const real_2d_array &ax) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::eigsubspaceoocsendresult(const_cast(state.c_ptr()), const_cast(ax.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function finalizes out-of-core mode of subspace eigensolver. 
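The out-of-core protocol shown in pseudocode above can also be written as the following hedged C++ sketch (n, k and the matrix-matrix product are placeholders supplied by the caller; linalg.h and the alglib namespace are assumed):

    ae_int_t n = 10000, k = 10;             // problem size and number of eigenpairs (placeholders)
    eigsubspacestate state;
    eigsubspacereport rep;
    real_1d_array w;
    real_2d_array x, ax, z;
    ae_int_t requesttype, m;
    eigsubspacecreate(n, k, state);
    eigsubspaceoocstart(state, 0);          // mtype=0: real symmetric matrix
    while( eigsubspaceooccontinue(state) )
    {
        eigsubspaceoocgetrequestinfo(state, requesttype, m);
        eigsubspaceoocgetrequestdata(state, x);   // X is array[N,M]
        ax.setlength(n, m);
        // ... caller fills ax with A*x by any available means (sparse product, implicit operator, etc.) ...
        eigsubspaceoocsendresult(state, ax);
    }
    eigsubspaceoocstop(state, w, z, rep);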
It should be used in conjunction with other out-of-core-related functions of this subspackage in a loop like below: > alglib.eigsubspaceoocstart(state) > while alglib.eigsubspaceooccontinue(state) do > alglib.eigsubspaceoocgetrequestinfo(state, out RequestType, out M) > alglib.eigsubspaceoocgetrequestdata(state, out X) > [calculate Y=A*X, with X=R^NxM] > alglib.eigsubspaceoocsendresult(state, in Y) > alglib.eigsubspaceoocstop(state, out W, out Z, out Report) INPUT PARAMETERS: State - solver state OUTPUT PARAMETERS: W - array[K], depending on solver settings: * top K eigenvalues ordered by descending - if eigenvectors are returned in Z * zeros - if invariant subspace is returned in Z Z - array[N,K], depending on solver settings either: * matrix of eigenvectors found * orthogonal basis of K-dimensional invariant subspace Rep - report with additional parameters -- ALGLIB -- Copyright 16.01.2017 by Bochkanov Sergey *************************************************************************/ void eigsubspaceoocstop(const eigsubspacestate &state, real_1d_array &w, real_2d_array &z, eigsubspacereport &rep) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::eigsubspaceoocstop(const_cast(state.c_ptr()), const_cast(w.c_ptr()), const_cast(z.c_ptr()), const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function runs eigensolver for dense NxN symmetric matrix A, given by upper or lower triangle. This function can not process nonsymmetric matrices. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * multithreading support ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! ! For a situation when you need just a few eigenvectors (~1-10), ! multithreading typically gives sublinear (wrt to cores count) speedup. ! For larger problems it may give you nearly linear increase in ! performance. ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. Best results are achieved for high-dimensional problems ! (NVars is at least 256). ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS: State - solver state A - array[N,N], symmetric NxN matrix given by one of its triangles IsUpper - whether upper or lower triangle of A is given (the other one is not referenced at all). OUTPUT PARAMETERS: W - array[K], top K eigenvalues ordered by descending of their absolute values Z - array[N,K], matrix of eigenvectors found Rep - report with additional parameters NOTE: internally this function allocates a copy of NxN dense A. You should take it into account when working with very large matrices occupying almost all RAM. 
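A hedged usage sketch for the dense case (linalg.h and the alglib namespace assumed; values are illustrative):

    // top-2 eigenpairs of a 4x4 symmetric matrix, upper triangle given
    real_2d_array a = "[[5,1,0,0],[1,4,1,0],[0,1,3,1],[0,0,1,2]]";
    real_1d_array w;
    real_2d_array z;
    eigsubspacestate state;
    eigsubspacereport rep;
    eigsubspacecreate(4, 2, state);
    eigsubspacesetcond(state, 0.0, 0);                  // automatic (moderate) stopping criteria
    eigsubspacesolvedenses(state, a, true, w, z, rep);  // w: top-2 eigenvalues, z: corresponding eigenvectors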
-- ALGLIB -- Copyright 16.01.2017 by Bochkanov Sergey *************************************************************************/ void eigsubspacesolvedenses(const eigsubspacestate &state, const real_2d_array &a, const bool isupper, real_1d_array &w, real_2d_array &z, eigsubspacereport &rep) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::eigsubspacesolvedenses(const_cast(state.c_ptr()), const_cast(a.c_ptr()), isupper, const_cast(w.c_ptr()), const_cast(z.c_ptr()), const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void smp_eigsubspacesolvedenses(const eigsubspacestate &state, const real_2d_array &a, const bool isupper, real_1d_array &w, real_2d_array &z, eigsubspacereport &rep) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::_pexec_eigsubspacesolvedenses(const_cast(state.c_ptr()), const_cast(a.c_ptr()), isupper, const_cast(w.c_ptr()), const_cast(z.c_ptr()), const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function runs eigensolver for dense NxN symmetric matrix A, given by upper or lower triangle. This function can not process nonsymmetric matrices. INPUT PARAMETERS: State - solver state A - NxN symmetric matrix given by one of its triangles IsUpper - whether upper or lower triangle of A is given (the other one is not referenced at all). OUTPUT PARAMETERS: W - array[K], top K eigenvalues ordered by descending of their absolute values Z - array[N,K], matrix of eigenvectors found Rep - report with additional parameters -- ALGLIB -- Copyright 16.01.2017 by Bochkanov Sergey *************************************************************************/ void eigsubspacesolvesparses(const eigsubspacestate &state, const sparsematrix &a, const bool isupper, real_1d_array &w, real_2d_array &z, eigsubspacereport &rep) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::eigsubspacesolvesparses(const_cast(state.c_ptr()), const_cast(a.c_ptr()), isupper, const_cast(w.c_ptr()), const_cast(z.c_ptr()), const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Finding the eigenvalues and eigenvectors of a symmetric matrix The algorithm finds eigen pairs of a symmetric matrix by reducing it to tridiagonal form and using the QL/QR algorithm. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes one important improvement of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Generally, commercial ALGLIB is several times faster than open-source ! generic C edition, and many times faster than open-source C# edition. ! ! Multithreaded acceleration is NOT supported for this function. 
! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. Input parameters: A - symmetric matrix which is given by its upper or lower triangular part. Array whose indexes range within [0..N-1, 0..N-1]. N - size of matrix A. ZNeeded - flag controlling whether the eigenvectors are needed or not. If ZNeeded is equal to: * 0, the eigenvectors are not returned; * 1, the eigenvectors are returned. IsUpper - storage format. Output parameters: D - eigenvalues in ascending order. Array whose index ranges within [0..N-1]. Z - if ZNeeded is equal to: * 0, Z hasn't changed; * 1, Z contains the eigenvectors. Array whose indexes range within [0..N-1, 0..N-1]. The eigenvectors are stored in the matrix columns. Result: True, if the algorithm has converged. False, if the algorithm hasn't converged (rare case). -- ALGLIB -- Copyright 2005-2008 by Bochkanov Sergey *************************************************************************/ bool smatrixevd(const real_2d_array &a, const ae_int_t n, const ae_int_t zneeded, const bool isupper, real_1d_array &d, real_2d_array &z) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { ae_bool result = alglib_impl::smatrixevd(const_cast(a.c_ptr()), n, zneeded, isupper, const_cast(d.c_ptr()), const_cast(z.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Subroutine for finding the eigenvalues (and eigenvectors) of a symmetric matrix in a given half open interval (A, B] by using a bisection and inverse iteration Input parameters: A - symmetric matrix which is given by its upper or lower triangular part. Array [0..N-1, 0..N-1]. N - size of matrix A. ZNeeded - flag controlling whether the eigenvectors are needed or not. If ZNeeded is equal to: * 0, the eigenvectors are not returned; * 1, the eigenvectors are returned. IsUpperA - storage format of matrix A. B1, B2 - half open interval (B1, B2] to search eigenvalues in. Output parameters: M - number of eigenvalues found in a given half-interval (M>=0). W - array of the eigenvalues found. Array whose index ranges within [0..M-1]. Z - if ZNeeded is equal to: * 0, Z hasn't changed; * 1, Z contains eigenvectors. Array whose indexes range within [0..N-1, 0..M-1]. The eigenvectors are stored in the matrix columns. Result: True, if successful. M contains the number of eigenvalues in the given half-interval (could be equal to 0), W contains the eigenvalues, Z contains the eigenvectors (if needed). False, if the bisection method subroutine wasn't able to find the eigenvalues in the given interval or if the inverse iteration subroutine wasn't able to find all the corresponding eigenvectors. In that case, the eigenvalues and eigenvectors are not returned, M is equal to 0. 
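A hedged sketch showing both the full solver SMatrixEVD (documented above) and the interval variant SMatrixEVDR (linalg.h and the alglib namespace assumed; values are illustrative):

    real_2d_array a = "[[2,1,0],[1,2,1],[0,1,2]]";   // symmetric; upper triangle is used
    real_1d_array d, w;
    real_2d_array z, zr;
    ae_int_t m;
    bool ok  = smatrixevd(a, 3, 1, true, d, z);                 // all eigenvalues (ascending) and eigenvectors
    bool okr = smatrixevdr(a, 3, 1, true, 0.5, 3.0, m, w, zr);  // only eigenvalues in the half-interval (0.5, 3.0]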
-- ALGLIB -- Copyright 07.01.2006 by Bochkanov Sergey *************************************************************************/ bool smatrixevdr(const real_2d_array &a, const ae_int_t n, const ae_int_t zneeded, const bool isupper, const double b1, const double b2, ae_int_t &m, real_1d_array &w, real_2d_array &z) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { ae_bool result = alglib_impl::smatrixevdr(const_cast(a.c_ptr()), n, zneeded, isupper, b1, b2, &m, const_cast(w.c_ptr()), const_cast(z.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Subroutine for finding the eigenvalues and eigenvectors of a symmetric matrix with given indexes by using bisection and inverse iteration methods. Input parameters: A - symmetric matrix which is given by its upper or lower triangular part. Array whose indexes range within [0..N-1, 0..N-1]. N - size of matrix A. ZNeeded - flag controlling whether the eigenvectors are needed or not. If ZNeeded is equal to: * 0, the eigenvectors are not returned; * 1, the eigenvectors are returned. IsUpperA - storage format of matrix A. I1, I2 - index interval for searching (from I1 to I2). 0 <= I1 <= I2 <= N-1. Output parameters: W - array of the eigenvalues found. Array whose index ranges within [0..I2-I1]. Z - if ZNeeded is equal to: * 0, Z hasn't changed; * 1, Z contains eigenvectors. Array whose indexes range within [0..N-1, 0..I2-I1]. In that case, the eigenvectors are stored in the matrix columns. Result: True, if successful. W contains the eigenvalues, Z contains the eigenvectors (if needed). False, if the bisection method subroutine wasn't able to find the eigenvalues in the given interval or if the inverse iteration subroutine wasn't able to find all the corresponding eigenvectors. In that case, the eigenvalues and eigenvectors are not returned. -- ALGLIB -- Copyright 07.01.2006 by Bochkanov Sergey *************************************************************************/ bool smatrixevdi(const real_2d_array &a, const ae_int_t n, const ae_int_t zneeded, const bool isupper, const ae_int_t i1, const ae_int_t i2, real_1d_array &w, real_2d_array &z) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { ae_bool result = alglib_impl::smatrixevdi(const_cast(a.c_ptr()), n, zneeded, isupper, i1, i2, const_cast(w.c_ptr()), const_cast(z.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Finding the eigenvalues and eigenvectors of a Hermitian matrix The algorithm finds eigen pairs of a Hermitian matrix by reducing it to real tridiagonal form and using the QL/QR algorithm. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes one important improvement of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! 
Generally, commercial ALGLIB is several times faster than open-source ! generic C edition, and many times faster than open-source C# edition. ! ! Multithreaded acceleration is NOT supported for this function. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. Input parameters: A - Hermitian matrix which is given by its upper or lower triangular part. Array whose indexes range within [0..N-1, 0..N-1]. N - size of matrix A. IsUpper - storage format. ZNeeded - flag controlling whether the eigenvectors are needed or not. If ZNeeded is equal to: * 0, the eigenvectors are not returned; * 1, the eigenvectors are returned. Output parameters: D - eigenvalues in ascending order. Array whose index ranges within [0..N-1]. Z - if ZNeeded is equal to: * 0, Z hasn't changed; * 1, Z contains the eigenvectors. Array whose indexes range within [0..N-1, 0..N-1]. The eigenvectors are stored in the matrix columns. Result: True, if the algorithm has converged. False, if the algorithm hasn't converged (rare case). Note: eigenvectors of Hermitian matrix are defined up to multiplication by a complex number L, such that |L|=1. -- ALGLIB -- Copyright 2005, 23 March 2007 by Bochkanov Sergey *************************************************************************/ bool hmatrixevd(const complex_2d_array &a, const ae_int_t n, const ae_int_t zneeded, const bool isupper, real_1d_array &d, complex_2d_array &z) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { ae_bool result = alglib_impl::hmatrixevd(const_cast(a.c_ptr()), n, zneeded, isupper, const_cast(d.c_ptr()), const_cast(z.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Subroutine for finding the eigenvalues (and eigenvectors) of a Hermitian matrix in a given half-interval (A, B] by using a bisection and inverse iteration Input parameters: A - Hermitian matrix which is given by its upper or lower triangular part. Array whose indexes range within [0..N-1, 0..N-1]. N - size of matrix A. ZNeeded - flag controlling whether the eigenvectors are needed or not. If ZNeeded is equal to: * 0, the eigenvectors are not returned; * 1, the eigenvectors are returned. IsUpperA - storage format of matrix A. B1, B2 - half-interval (B1, B2] to search eigenvalues in. Output parameters: M - number of eigenvalues found in a given half-interval, M>=0 W - array of the eigenvalues found. Array whose index ranges within [0..M-1]. Z - if ZNeeded is equal to: * 0, Z hasn't changed; * 1, Z contains eigenvectors. Array whose indexes range within [0..N-1, 0..M-1]. The eigenvectors are stored in the matrix columns. Result: True, if successful. M contains the number of eigenvalues in the given half-interval (could be equal to 0), W contains the eigenvalues, Z contains the eigenvectors (if needed). False, if the bisection method subroutine wasn't able to find the eigenvalues in the given interval or if the inverse iteration subroutine wasn't able to find all the corresponding eigenvectors. In that case, the eigenvalues and eigenvectors are not returned, M is equal to 0. 
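A hedged sketch for the Hermitian interval solver (matrix entries set element-by-element; linalg.h and the alglib namespace assumed; values are illustrative):

    complex_2d_array a;
    a.setlength(2, 2);
    a[0][0] = 2.0;  a[0][1] = alglib::complex(0, 1);   // upper triangle of a Hermitian matrix
    a[1][0] = 0.0;  a[1][1] = 3.0;                     // lower triangle is not referenced (IsUpper=true)
    real_1d_array w;
    complex_2d_array z;
    ae_int_t m;
    bool ok = hmatrixevdr(a, 2, 1, true, 0.0, 10.0, m, w, z);   // eigenvalues in (0, 10] with eigenvectors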
Note: eigen vectors of Hermitian matrix are defined up to multiplication by a complex number L, such as |L|=1. -- ALGLIB -- Copyright 07.01.2006, 24.03.2007 by Bochkanov Sergey. *************************************************************************/ bool hmatrixevdr(const complex_2d_array &a, const ae_int_t n, const ae_int_t zneeded, const bool isupper, const double b1, const double b2, ae_int_t &m, real_1d_array &w, complex_2d_array &z) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { ae_bool result = alglib_impl::hmatrixevdr(const_cast(a.c_ptr()), n, zneeded, isupper, b1, b2, &m, const_cast(w.c_ptr()), const_cast(z.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Subroutine for finding the eigenvalues and eigenvectors of a Hermitian matrix with given indexes by using bisection and inverse iteration methods Input parameters: A - Hermitian matrix which is given by its upper or lower triangular part. Array whose indexes range within [0..N-1, 0..N-1]. N - size of matrix A. ZNeeded - flag controlling whether the eigenvectors are needed or not. If ZNeeded is equal to: * 0, the eigenvectors are not returned; * 1, the eigenvectors are returned. IsUpperA - storage format of matrix A. I1, I2 - index interval for searching (from I1 to I2). 0 <= I1 <= I2 <= N-1. Output parameters: W - array of the eigenvalues found. Array whose index ranges within [0..I2-I1]. Z - if ZNeeded is equal to: * 0, Z hasn't changed; * 1, Z contains eigenvectors. Array whose indexes range within [0..N-1, 0..I2-I1]. In that case, the eigenvectors are stored in the matrix columns. Result: True, if successful. W contains the eigenvalues, Z contains the eigenvectors (if needed). False, if the bisection method subroutine wasn't able to find the eigenvalues in the given interval or if the inverse iteration subroutine wasn't able to find all the corresponding eigenvectors. In that case, the eigenvalues and eigenvectors are not returned. Note: eigen vectors of Hermitian matrix are defined up to multiplication by a complex number L, such as |L|=1. -- ALGLIB -- Copyright 07.01.2006, 24.03.2007 by Bochkanov Sergey. *************************************************************************/ bool hmatrixevdi(const complex_2d_array &a, const ae_int_t n, const ae_int_t zneeded, const bool isupper, const ae_int_t i1, const ae_int_t i2, real_1d_array &w, complex_2d_array &z) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { ae_bool result = alglib_impl::hmatrixevdi(const_cast(a.c_ptr()), n, zneeded, isupper, i1, i2, const_cast(w.c_ptr()), const_cast(z.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Finding the eigenvalues and eigenvectors of a tridiagonal symmetric matrix The algorithm finds the eigen pairs of a tridiagonal symmetric matrix by using an QL/QR algorithm with implicit shifts. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes one important improvement of ! this function, which can be used from C++ and C#: ! 
* Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Generally, commercial ALGLIB is several times faster than open-source ! generic C edition, and many times faster than open-source C# edition. ! ! Multithreaded acceleration is NOT supported for this function. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. Input parameters: D - the main diagonal of a tridiagonal matrix. Array whose index ranges within [0..N-1]. E - the secondary diagonal of a tridiagonal matrix. Array whose index ranges within [0..N-2]. N - size of matrix A. ZNeeded - flag controlling whether the eigenvectors are needed or not. If ZNeeded is equal to: * 0, the eigenvectors are not needed; * 1, the eigenvectors of a tridiagonal matrix are multiplied by the square matrix Z. It is used if the tridiagonal matrix is obtained by the similarity transformation of a symmetric matrix; * 2, the eigenvectors of a tridiagonal matrix replace the square matrix Z; * 3, matrix Z contains the first row of the eigenvectors matrix. Z - if ZNeeded=1, Z contains the square matrix by which the eigenvectors are multiplied. Array whose indexes range within [0..N-1, 0..N-1]. Output parameters: D - eigenvalues in ascending order. Array whose index ranges within [0..N-1]. Z - if ZNeeded is equal to: * 0, Z hasn't changed; * 1, Z contains the product of a given matrix (from the left) and the eigenvectors matrix (from the right); * 2, Z contains the eigenvectors. * 3, Z contains the first row of the eigenvectors matrix. If ZNeeded<3, Z is the array whose indexes range within [0..N-1, 0..N-1]. In that case, the eigenvectors are stored in the matrix columns. If ZNeeded=3, Z is the array whose indexes range within [0..0, 0..N-1]. Result: True, if the algorithm has converged. False, if the algorithm hasn't converged. -- LAPACK routine (version 3.0) -- Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., Courant Institute, Argonne National Lab, and Rice University September 30, 1994 *************************************************************************/ bool smatrixtdevd(real_1d_array &d, const real_1d_array &e, const ae_int_t n, const ae_int_t zneeded, real_2d_array &z) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { ae_bool result = alglib_impl::smatrixtdevd(const_cast(d.c_ptr()), const_cast(e.c_ptr()), n, zneeded, const_cast(z.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Subroutine for finding the tridiagonal matrix eigenvalues/vectors in a given half-interval (A, B] by using bisection and inverse iteration. Input parameters: D - the main diagonal of a tridiagonal matrix. Array whose index ranges within [0..N-1]. E - the secondary diagonal of a tridiagonal matrix. Array whose index ranges within [0..N-2]. N - size of matrix, N>=0. ZNeeded - flag controlling whether the eigenvectors are needed or not. 
If ZNeeded is equal to: * 0, the eigenvectors are not needed; * 1, the eigenvectors of a tridiagonal matrix are multiplied by the square matrix Z. It is used if the tridiagonal matrix is obtained by the similarity transformation of a symmetric matrix. * 2, the eigenvectors of a tridiagonal matrix replace matrix Z. A, B - half-interval (A, B] to search eigenvalues in. Z - if ZNeeded is equal to: * 0, Z isn't used and remains unchanged; * 1, Z contains the square matrix (array whose indexes range within [0..N-1, 0..N-1]) which reduces the given symmetric matrix to tridiagonal form; * 2, Z isn't used (but changed on the exit). Output parameters: D - array of the eigenvalues found. Array whose index ranges within [0..M-1]. M - number of eigenvalues found in the given half-interval (M>=0). Z - if ZNeeded is equal to: * 0, doesn't contain any information; * 1, contains the product of a given NxN matrix Z (from the left) and NxM matrix of the eigenvectors found (from the right). Array whose indexes range within [0..N-1, 0..M-1]. * 2, contains the matrix of the eigenvectors found. Array whose indexes range within [0..N-1, 0..M-1]. Result: True, if successful. In that case, M contains the number of eigenvalues in the given half-interval (could be equal to 0), D contains the eigenvalues, Z contains the eigenvectors (if needed). It should be noted that the subroutine changes the size of arrays D and Z. False, if the bisection method subroutine wasn't able to find the eigenvalues in the given interval or if the inverse iteration subroutine wasn't able to find all the corresponding eigenvectors. In that case, the eigenvalues and eigenvectors are not returned, M is equal to 0. -- ALGLIB -- Copyright 31.03.2008 by Bochkanov Sergey *************************************************************************/ bool smatrixtdevdr(real_1d_array &d, const real_1d_array &e, const ae_int_t n, const ae_int_t zneeded, const double a, const double b, ae_int_t &m, real_2d_array &z) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { ae_bool result = alglib_impl::smatrixtdevdr(const_cast(d.c_ptr()), const_cast(e.c_ptr()), n, zneeded, a, b, &m, const_cast(z.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Subroutine for finding tridiagonal matrix eigenvalues/vectors with given indexes (in ascending order) by using the bisection and inverse iteraion. Input parameters: D - the main diagonal of a tridiagonal matrix. Array whose index ranges within [0..N-1]. E - the secondary diagonal of a tridiagonal matrix. Array whose index ranges within [0..N-2]. N - size of matrix. N>=0. ZNeeded - flag controlling whether the eigenvectors are needed or not. If ZNeeded is equal to: * 0, the eigenvectors are not needed; * 1, the eigenvectors of a tridiagonal matrix are multiplied by the square matrix Z. It is used if the tridiagonal matrix is obtained by the similarity transformation of a symmetric matrix. * 2, the eigenvectors of a tridiagonal matrix replace matrix Z. I1, I2 - index interval for searching (from I1 to I2). 0 <= I1 <= I2 <= N-1. 
Z - if ZNeeded is equal to: * 0, Z isn't used and remains unchanged; * 1, Z contains the square matrix (array whose indexes range within [0..N-1, 0..N-1]) which reduces the given symmetric matrix to tridiagonal form; * 2, Z isn't used (but changed on the exit). Output parameters: D - array of the eigenvalues found. Array whose index ranges within [0..I2-I1]. Z - if ZNeeded is equal to: * 0, doesn't contain any information; * 1, contains the product of a given NxN matrix Z (from the left) and Nx(I2-I1) matrix of the eigenvectors found (from the right). Array whose indexes range within [0..N-1, 0..I2-I1]. * 2, contains the matrix of the eigenvectors found. Array whose indexes range within [0..N-1, 0..I2-I1]. Result: True, if successful. In that case, D contains the eigenvalues, Z contains the eigenvectors (if needed). It should be noted that the subroutine changes the size of arrays D and Z. False, if the bisection method subroutine wasn't able to find the eigenvalues in the given interval or if the inverse iteration subroutine wasn't able to find all the corresponding eigenvectors. In that case, the eigenvalues and eigenvectors are not returned. -- ALGLIB -- Copyright 25.12.2005 by Bochkanov Sergey *************************************************************************/ bool smatrixtdevdi(real_1d_array &d, const real_1d_array &e, const ae_int_t n, const ae_int_t zneeded, const ae_int_t i1, const ae_int_t i2, real_2d_array &z) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { ae_bool result = alglib_impl::smatrixtdevdi(const_cast<alglib_impl::ae_vector*>(d.c_ptr()), const_cast<alglib_impl::ae_vector*>(e.c_ptr()), n, zneeded, i1, i2, const_cast<alglib_impl::ae_matrix*>(z.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast<bool*>(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Finding eigenvalues and eigenvectors of a general (unsymmetric) matrix COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes one important improvement of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. Speed-up provided by MKL for this particular problem (EVD) ! is really high, because MKL uses a combination of (a) better low-level ! optimizations, and (b) better EVD algorithms. ! ! On one particular SSE-capable machine for N=1024, commercial MKL- ! -capable ALGLIB was: ! * 7-10 times faster than open source "generic C" version ! * 15-18 times faster than "pure C#" version ! ! Multithreaded acceleration is NOT supported for this function. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. The algorithm finds eigenvalues and eigenvectors of a general matrix by using the QR algorithm with multiple shifts. The algorithm can find eigenvalues and both left and right eigenvectors. The right eigenvector is a vector x such that A*x = w*x, and the left eigenvector is a vector y such that y'*A = w*y' (here y' implies a complex conjugate transposition of vector y). Input parameters: A - matrix. 
Array whose indexes range within [0..N-1, 0..N-1]. N - size of matrix A. VNeeded - flag controlling whether eigenvectors are needed or not. If VNeeded is equal to: * 0, eigenvectors are not returned; * 1, right eigenvectors are returned; * 2, left eigenvectors are returned; * 3, both left and right eigenvectors are returned. Output parameters: WR - real parts of eigenvalues. Array whose index ranges within [0..N-1]. WI - imaginary parts of eigenvalues. Array whose index ranges within [0..N-1]. VL, VR - arrays of left and right eigenvectors (if they are needed). If WI[i]=0, the respective eigenvalue is a real number, and it corresponds to the column number I of matrices VL/VR. If WI[i]>0, we have a pair of complex conjugate numbers with positive and negative imaginary parts: the first eigenvalue WR[i] + sqrt(-1)*WI[i]; the second eigenvalue WR[i+1] + sqrt(-1)*WI[i+1]; WI[i]>0 WI[i+1] = -WI[i] < 0 In that case, the eigenvector corresponding to the first eigenvalue is located in columns i and i+1 of matrices VL/VR (the column number i contains the real part, and the column number i+1 contains the imaginary part), and the vector corresponding to the second eigenvalue is a complex conjugate to the first vector. Arrays whose indexes range within [0..N-1, 0..N-1]. Result: True, if the algorithm has converged. False, if the algorithm has not converged. Note 1: Some users may ask the following question: what if WI[N-1]>0? WI[N] must contain an eigenvalue which is complex conjugate to the N-th eigenvalue, but the array has only size N? The answer is as follows: such a situation cannot occur because the algorithm finds pairs of eigenvalues, therefore, if WI[i]>0, I is strictly less than N-1. Note 2: The algorithm performance depends on the value of the internal parameter NS of the InternalSchurDecomposition subroutine which defines the number of shifts in the QR algorithm (similarly to the block width in block-matrix algorithms of linear algebra). If you require maximum performance on your machine, it is recommended to adjust this parameter manually. See also the InternalTREVC subroutine. The algorithm is based on the LAPACK 3.0 library. *************************************************************************/ bool rmatrixevd(const real_2d_array &a, const ae_int_t n, const ae_int_t vneeded, real_1d_array &wr, real_1d_array &wi, real_2d_array &vl, real_2d_array &vr) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { ae_bool result = alglib_impl::rmatrixevd(const_cast<alglib_impl::ae_matrix*>(a.c_ptr()), n, vneeded, const_cast<alglib_impl::ae_vector*>(wr.c_ptr()), const_cast<alglib_impl::ae_vector*>(wi.c_ptr()), const_cast<alglib_impl::ae_matrix*>(vl.c_ptr()), const_cast<alglib_impl::ae_matrix*>(vr.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast<bool*>(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Subroutine performing the Schur decomposition of a general matrix by using the QR algorithm with multiple shifts. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes one important improvement of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! 
Multithreaded acceleration is NOT supported for this function. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. The source matrix A is represented as S'*A*S = T, where S is an orthogonal matrix (Schur vectors), T - upper quasi-triangular matrix (with blocks of sizes 1x1 and 2x2 on the main diagonal). Input parameters: A - matrix to be decomposed. Array whose indexes range within [0..N-1, 0..N-1]. N - size of A, N>=0. Output parameters: A - contains matrix T. Array whose indexes range within [0..N-1, 0..N-1]. S - contains Schur vectors. Array whose indexes range within [0..N-1, 0..N-1]. Note 1: The block structure of matrix T can be easily recognized: since all the elements below the blocks are zeros, the elements a[i+1,i] which are equal to 0 show the block border. Note 2: The algorithm performance depends on the value of the internal parameter NS of the InternalSchurDecomposition subroutine which defines the number of shifts in the QR algorithm (similarly to the block width in block-matrix algorithms in linear algebra). If you require maximum performance on your machine, it is recommended to adjust this parameter manually. Result: True, if the algorithm has converged and parameters A and S contain the result. False, if the algorithm has not converged. Algorithm implemented on the basis of the DHSEQR subroutine (LAPACK 3.0 library). *************************************************************************/ bool rmatrixschur(real_2d_array &a, const ae_int_t n, real_2d_array &s) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { ae_bool result = alglib_impl::rmatrixschur(const_cast(a.c_ptr()), n, const_cast(s.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Algorithm for solving the following generalized symmetric positive-definite eigenproblem: A*x = lambda*B*x (1) or A*B*x = lambda*x (2) or B*A*x = lambda*x (3). where A is a symmetric matrix, B - symmetric positive-definite matrix. The problem is solved by reducing it to an ordinary symmetric eigenvalue problem. Input parameters: A - symmetric matrix which is given by its upper or lower triangular part. Array whose indexes range within [0..N-1, 0..N-1]. N - size of matrices A and B. IsUpperA - storage format of matrix A. B - symmetric positive-definite matrix which is given by its upper or lower triangular part. Array whose indexes range within [0..N-1, 0..N-1]. IsUpperB - storage format of matrix B. ZNeeded - if ZNeeded is equal to: * 0, the eigenvectors are not returned; * 1, the eigenvectors are returned. ProblemType - if ProblemType is equal to: * 1, the following problem is solved: A*x = lambda*B*x; * 2, the following problem is solved: A*B*x = lambda*x; * 3, the following problem is solved: B*A*x = lambda*x. Output parameters: D - eigenvalues in ascending order. Array whose index ranges within [0..N-1]. Z - if ZNeeded is equal to: * 0, Z hasn't changed; * 1, Z contains eigenvectors. Array whose indexes range within [0..N-1, 0..N-1]. The eigenvectors are stored in matrix columns. It should be noted that the eigenvectors in such problems do not form an orthogonal system. 
Result: True, if the problem was solved successfully. False, if the error occurred during the Cholesky decomposition of matrix B (the matrix isn't positive-definite) or during the work of the iterative algorithm for solving the symmetric eigenproblem. See also the GeneralizedSymmetricDefiniteEVDReduce subroutine. -- ALGLIB -- Copyright 1.28.2006 by Bochkanov Sergey *************************************************************************/ bool smatrixgevd(const real_2d_array &a, const ae_int_t n, const bool isuppera, const real_2d_array &b, const bool isupperb, const ae_int_t zneeded, const ae_int_t problemtype, real_1d_array &d, real_2d_array &z) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { ae_bool result = alglib_impl::smatrixgevd(const_cast(a.c_ptr()), n, isuppera, const_cast(b.c_ptr()), isupperb, zneeded, problemtype, const_cast(d.c_ptr()), const_cast(z.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Algorithm for reduction of the following generalized symmetric positive- definite eigenvalue problem: A*x = lambda*B*x (1) or A*B*x = lambda*x (2) or B*A*x = lambda*x (3) to the symmetric eigenvalues problem C*y = lambda*y (eigenvalues of this and the given problems are the same, and the eigenvectors of the given problem could be obtained by multiplying the obtained eigenvectors by the transformation matrix x = R*y). Here A is a symmetric matrix, B - symmetric positive-definite matrix. Input parameters: A - symmetric matrix which is given by its upper or lower triangular part. Array whose indexes range within [0..N-1, 0..N-1]. N - size of matrices A and B. IsUpperA - storage format of matrix A. B - symmetric positive-definite matrix which is given by its upper or lower triangular part. Array whose indexes range within [0..N-1, 0..N-1]. IsUpperB - storage format of matrix B. ProblemType - if ProblemType is equal to: * 1, the following problem is solved: A*x = lambda*B*x; * 2, the following problem is solved: A*B*x = lambda*x; * 3, the following problem is solved: B*A*x = lambda*x. Output parameters: A - symmetric matrix which is given by its upper or lower triangle depending on IsUpperA. Contains matrix C. Array whose indexes range within [0..N-1, 0..N-1]. R - upper triangular or low triangular transformation matrix which is used to obtain the eigenvectors of a given problem as the product of eigenvectors of C (from the right) and matrix R (from the left). If the matrix is upper triangular, the elements below the main diagonal are equal to 0 (and vice versa). Thus, we can perform the multiplication without taking into account the internal structure (which is an easier though less effective way). Array whose indexes range within [0..N-1, 0..N-1]. IsUpperR - type of matrix R (upper or lower triangular). Result: True, if the problem was reduced successfully. False, if the error occurred during the Cholesky decomposition of matrix B (the matrix is not positive-definite). 
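NOTE: for most applications the driver function SMatrixGEVD documented above already covers the typical use case; the reduction routine below is needed only when you want to run the symmetric eigensolver on matrix C yourself. The fragment below is a minimal illustrative sketch of calling smatrixgevd() for ProblemType=1 (A*x = lambda*B*x); the 2x2 matrices and their values are purely illustrative and assume the usual string-literal constructors of ALGLIB arrays.

    alglib::real_2d_array a = "[[3,1],[1,2]]";   // symmetric A (both triangles filled)
    alglib::real_2d_array b = "[[2,0],[0,1]]";   // symmetric positive-definite B
    alglib::real_1d_array d;
    alglib::real_2d_array z;
    // ZNeeded=1 requests eigenvectors, ProblemType=1 solves A*x = lambda*B*x
    if( alglib::smatrixgevd(a, 2, true, b, true, 1, 1, d, z) )
        printf("smallest eigenvalue: %.4f\n", double(d[0]));
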
-- ALGLIB -- Copyright 1.28.2006 by Bochkanov Sergey *************************************************************************/ bool smatrixgevdreduce(real_2d_array &a, const ae_int_t n, const bool isuppera, const real_2d_array &b, const bool isupperb, const ae_int_t problemtype, real_2d_array &r, bool &isupperr) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { ae_bool result = alglib_impl::smatrixgevdreduce(const_cast(a.c_ptr()), n, isuppera, const_cast(b.c_ptr()), isupperb, problemtype, const_cast(r.c_ptr()), &isupperr, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Inverse matrix update by the Sherman-Morrison formula The algorithm updates matrix A^-1 when adding a number to an element of matrix A. Input parameters: InvA - inverse of matrix A. Array whose indexes range within [0..N-1, 0..N-1]. N - size of matrix A. UpdRow - row where the element to be updated is stored. UpdColumn - column where the element to be updated is stored. UpdVal - a number to be added to the element. Output parameters: InvA - inverse of modified matrix A. -- ALGLIB -- Copyright 2005 by Bochkanov Sergey *************************************************************************/ void rmatrixinvupdatesimple(real_2d_array &inva, const ae_int_t n, const ae_int_t updrow, const ae_int_t updcolumn, const double updval) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::rmatrixinvupdatesimple(const_cast(inva.c_ptr()), n, updrow, updcolumn, updval, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Inverse matrix update by the Sherman-Morrison formula The algorithm updates matrix A^-1 when adding a vector to a row of matrix A. Input parameters: InvA - inverse of matrix A. Array whose indexes range within [0..N-1, 0..N-1]. N - size of matrix A. UpdRow - the row of A whose vector V was added. 0 <= Row <= N-1 V - the vector to be added to a row. Array whose index ranges within [0..N-1]. Output parameters: InvA - inverse of modified matrix A. -- ALGLIB -- Copyright 2005 by Bochkanov Sergey *************************************************************************/ void rmatrixinvupdaterow(real_2d_array &inva, const ae_int_t n, const ae_int_t updrow, const real_1d_array &v) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::rmatrixinvupdaterow(const_cast(inva.c_ptr()), n, updrow, const_cast(v.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Inverse matrix update by the Sherman-Morrison formula The algorithm updates matrix A^-1 when adding a vector to a column of matrix A. Input parameters: InvA - inverse of matrix A. Array whose indexes range within [0..N-1, 0..N-1]. N - size of matrix A. UpdColumn - the column of A whose vector U was added. 0 <= UpdColumn <= N-1 U - the vector to be added to a column. Array whose index ranges within [0..N-1]. 
Output parameters: InvA - inverse of modified matrix A. -- ALGLIB -- Copyright 2005 by Bochkanov Sergey *************************************************************************/ void rmatrixinvupdatecolumn(real_2d_array &inva, const ae_int_t n, const ae_int_t updcolumn, const real_1d_array &u) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::rmatrixinvupdatecolumn(const_cast(inva.c_ptr()), n, updcolumn, const_cast(u.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Inverse matrix update by the Sherman-Morrison formula The algorithm computes the inverse of matrix A+u*v' by using the given matrix A^-1 and the vectors u and v. Input parameters: InvA - inverse of matrix A. Array whose indexes range within [0..N-1, 0..N-1]. N - size of matrix A. U - the vector modifying the matrix. Array whose index ranges within [0..N-1]. V - the vector modifying the matrix. Array whose index ranges within [0..N-1]. Output parameters: InvA - inverse of matrix A + u*v'. -- ALGLIB -- Copyright 2005 by Bochkanov Sergey *************************************************************************/ void rmatrixinvupdateuv(real_2d_array &inva, const ae_int_t n, const real_1d_array &u, const real_1d_array &v) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::rmatrixinvupdateuv(const_cast(inva.c_ptr()), n, const_cast(u.c_ptr()), const_cast(v.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Determinant calculation of the matrix given by its LU decomposition. Input parameters: A - LU decomposition of the matrix (output of RMatrixLU subroutine). Pivots - table of permutations which were made during the LU decomposition. Output of RMatrixLU subroutine. N - (optional) size of matrix A: * if given, only principal NxN submatrix is processed and overwritten. other elements are unchanged. * if not given, automatically determined from matrix size (A must be square matrix) Result: matrix determinant. -- ALGLIB -- Copyright 2005 by Bochkanov Sergey *************************************************************************/ double rmatrixludet(const real_2d_array &a, const integer_1d_array &pivots, const ae_int_t n) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::rmatrixludet(const_cast(a.c_ptr()), const_cast(pivots.c_ptr()), n, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Determinant calculation of the matrix given by its LU decomposition. Input parameters: A - LU decomposition of the matrix (output of RMatrixLU subroutine). Pivots - table of permutations which were made during the LU decomposition. Output of RMatrixLU subroutine. N - (optional) size of matrix A: * if given, only principal NxN submatrix is processed and overwritten. other elements are unchanged. 
* if not given, automatically determined from matrix size (A must be square matrix) Result: matrix determinant. -- ALGLIB -- Copyright 2005 by Bochkanov Sergey *************************************************************************/ double rmatrixludet(const real_2d_array &a, const integer_1d_array &pivots) { alglib_impl::ae_state _alglib_env_state; ae_int_t n; if( (a.rows()!=a.cols()) || (a.rows()!=pivots.length())) throw ap_error("Error while calling 'rmatrixludet': looks like one of arguments has wrong size"); n = a.rows(); alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::rmatrixludet(const_cast(a.c_ptr()), const_cast(pivots.c_ptr()), n, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Calculation of the determinant of a general matrix Input parameters: A - matrix, array[0..N-1, 0..N-1] N - (optional) size of matrix A: * if given, only principal NxN submatrix is processed and overwritten. other elements are unchanged. * if not given, automatically determined from matrix size (A must be square matrix) Result: determinant of matrix A. -- ALGLIB -- Copyright 2005 by Bochkanov Sergey *************************************************************************/ double rmatrixdet(const real_2d_array &a, const ae_int_t n) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::rmatrixdet(const_cast(a.c_ptr()), n, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Calculation of the determinant of a general matrix Input parameters: A - matrix, array[0..N-1, 0..N-1] N - (optional) size of matrix A: * if given, only principal NxN submatrix is processed and overwritten. other elements are unchanged. * if not given, automatically determined from matrix size (A must be square matrix) Result: determinant of matrix A. -- ALGLIB -- Copyright 2005 by Bochkanov Sergey *************************************************************************/ double rmatrixdet(const real_2d_array &a) { alglib_impl::ae_state _alglib_env_state; ae_int_t n; if( (a.rows()!=a.cols())) throw ap_error("Error while calling 'rmatrixdet': looks like one of arguments has wrong size"); n = a.rows(); alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::rmatrixdet(const_cast(a.c_ptr()), n, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Determinant calculation of the matrix given by its LU decomposition. Input parameters: A - LU decomposition of the matrix (output of RMatrixLU subroutine). Pivots - table of permutations which were made during the LU decomposition. Output of RMatrixLU subroutine. N - (optional) size of matrix A: * if given, only principal NxN submatrix is processed and overwritten. other elements are unchanged. * if not given, automatically determined from matrix size (A must be square matrix) Result: matrix determinant. 
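NOTE: a typical use case is to reuse an LU factorization which was computed anyway, so that the determinant comes almost for free. The fragment below is an illustrative sketch only; it assumes the cmatrixlu() factorization routine of the trfac subpackage and fills a 2x2 matrix with made-up values.

    alglib::complex_2d_array a;
    a.setlength(2, 2);
    a(0,0) = alglib::complex(1,1);   a(0,1) = 2;
    a(1,0) = 3;                      a(1,1) = alglib::complex(4,-1);
    alglib::integer_1d_array pivots;
    alglib::cmatrixlu(a, 2, 2, pivots);                        // A is overwritten by its LU decomposition
    alglib::complex det = alglib::cmatrixludet(a, pivots, 2);  // determinant from the LU factors
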
-- ALGLIB -- Copyright 2005 by Bochkanov Sergey *************************************************************************/ alglib::complex cmatrixludet(const complex_2d_array &a, const integer_1d_array &pivots, const ae_int_t n) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::ae_complex result = alglib_impl::cmatrixludet(const_cast(a.c_ptr()), const_cast(pivots.c_ptr()), n, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Determinant calculation of the matrix given by its LU decomposition. Input parameters: A - LU decomposition of the matrix (output of RMatrixLU subroutine). Pivots - table of permutations which were made during the LU decomposition. Output of RMatrixLU subroutine. N - (optional) size of matrix A: * if given, only principal NxN submatrix is processed and overwritten. other elements are unchanged. * if not given, automatically determined from matrix size (A must be square matrix) Result: matrix determinant. -- ALGLIB -- Copyright 2005 by Bochkanov Sergey *************************************************************************/ alglib::complex cmatrixludet(const complex_2d_array &a, const integer_1d_array &pivots) { alglib_impl::ae_state _alglib_env_state; ae_int_t n; if( (a.rows()!=a.cols()) || (a.rows()!=pivots.length())) throw ap_error("Error while calling 'cmatrixludet': looks like one of arguments has wrong size"); n = a.rows(); alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::ae_complex result = alglib_impl::cmatrixludet(const_cast(a.c_ptr()), const_cast(pivots.c_ptr()), n, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Calculation of the determinant of a general matrix Input parameters: A - matrix, array[0..N-1, 0..N-1] N - (optional) size of matrix A: * if given, only principal NxN submatrix is processed and overwritten. other elements are unchanged. * if not given, automatically determined from matrix size (A must be square matrix) Result: determinant of matrix A. -- ALGLIB -- Copyright 2005 by Bochkanov Sergey *************************************************************************/ alglib::complex cmatrixdet(const complex_2d_array &a, const ae_int_t n) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::ae_complex result = alglib_impl::cmatrixdet(const_cast(a.c_ptr()), n, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Calculation of the determinant of a general matrix Input parameters: A - matrix, array[0..N-1, 0..N-1] N - (optional) size of matrix A: * if given, only principal NxN submatrix is processed and overwritten. other elements are unchanged. * if not given, automatically determined from matrix size (A must be square matrix) Result: determinant of matrix A. 
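NOTE: in this overload the size is inferred from the array itself and ap_error is thrown if the matrix is not square, while the overload with explicit N processes only the principal NxN submatrix. A minimal illustrative sketch with made-up values:

    alglib::complex_2d_array a;
    a.setlength(2, 2);
    a(0,0) = alglib::complex(0,2);   a(0,1) = 1;
    a(1,0) = 1;                      a(1,1) = alglib::complex(0,2);
    alglib::complex det = alglib::cmatrixdet(a);   // N inferred from a.rows(); det = (2i)*(2i)-1 = -5
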
-- ALGLIB -- Copyright 2005 by Bochkanov Sergey *************************************************************************/ alglib::complex cmatrixdet(const complex_2d_array &a) { alglib_impl::ae_state _alglib_env_state; ae_int_t n; if( (a.rows()!=a.cols())) throw ap_error("Error while calling 'cmatrixdet': looks like one of arguments has wrong size"); n = a.rows(); alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::ae_complex result = alglib_impl::cmatrixdet(const_cast(a.c_ptr()), n, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Determinant calculation of the matrix given by the Cholesky decomposition. Input parameters: A - Cholesky decomposition, output of SMatrixCholesky subroutine. N - (optional) size of matrix A: * if given, only principal NxN submatrix is processed and overwritten. other elements are unchanged. * if not given, automatically determined from matrix size (A must be square matrix) As the determinant is equal to the product of squares of diagonal elements, it's not necessary to specify which triangle - lower or upper - the matrix is stored in. Result: matrix determinant. -- ALGLIB -- Copyright 2005-2008 by Bochkanov Sergey *************************************************************************/ double spdmatrixcholeskydet(const real_2d_array &a, const ae_int_t n) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::spdmatrixcholeskydet(const_cast(a.c_ptr()), n, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Determinant calculation of the matrix given by the Cholesky decomposition. Input parameters: A - Cholesky decomposition, output of SMatrixCholesky subroutine. N - (optional) size of matrix A: * if given, only principal NxN submatrix is processed and overwritten. other elements are unchanged. * if not given, automatically determined from matrix size (A must be square matrix) As the determinant is equal to the product of squares of diagonal elements, it's not necessary to specify which triangle - lower or upper - the matrix is stored in. Result: matrix determinant. -- ALGLIB -- Copyright 2005-2008 by Bochkanov Sergey *************************************************************************/ double spdmatrixcholeskydet(const real_2d_array &a) { alglib_impl::ae_state _alglib_env_state; ae_int_t n; if( (a.rows()!=a.cols())) throw ap_error("Error while calling 'spdmatrixcholeskydet': looks like one of arguments has wrong size"); n = a.rows(); alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::spdmatrixcholeskydet(const_cast(a.c_ptr()), n, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Determinant calculation of the symmetric positive definite matrix. Input parameters: A - matrix. Array with elements [0..N-1, 0..N-1]. 
N - (optional) size of matrix A: * if given, only principal NxN submatrix is processed and overwritten. other elements are unchanged. * if not given, automatically determined from matrix size (A must be square matrix) IsUpper - (optional) storage type: * if True, symmetric matrix A is given by its upper triangle, and the lower triangle isn't used/changed by function * if False, symmetric matrix A is given by its lower triangle, and the upper triangle isn't used/changed by function * if not given, both lower and upper triangles must be filled. Result: determinant of matrix A. If matrix A is not positive definite, exception is thrown. -- ALGLIB -- Copyright 2005-2008 by Bochkanov Sergey *************************************************************************/ double spdmatrixdet(const real_2d_array &a, const ae_int_t n, const bool isupper) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::spdmatrixdet(const_cast(a.c_ptr()), n, isupper, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Determinant calculation of the symmetric positive definite matrix. Input parameters: A - matrix. Array with elements [0..N-1, 0..N-1]. N - (optional) size of matrix A: * if given, only principal NxN submatrix is processed and overwritten. other elements are unchanged. * if not given, automatically determined from matrix size (A must be square matrix) IsUpper - (optional) storage type: * if True, symmetric matrix A is given by its upper triangle, and the lower triangle isn't used/changed by function * if False, symmetric matrix A is given by its lower triangle, and the upper triangle isn't used/changed by function * if not given, both lower and upper triangles must be filled. Result: determinant of matrix A. If matrix A is not positive definite, exception is thrown. 
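NOTE: when N and IsUpper are omitted, both triangles of A must be filled and the wrapper checks that A is symmetric (ap_error is thrown otherwise). A minimal illustrative sketch with made-up values:

    alglib::real_2d_array a = "[[4,2],[2,3]]";   // symmetric positive definite, both triangles filled
    double det = alglib::spdmatrixdet(a);        // 4*3 - 2*2 = 8
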
-- ALGLIB -- Copyright 2005-2008 by Bochkanov Sergey *************************************************************************/ double spdmatrixdet(const real_2d_array &a) { alglib_impl::ae_state _alglib_env_state; ae_int_t n; bool isupper; if( (a.rows()!=a.cols())) throw ap_error("Error while calling 'spdmatrixdet': looks like one of arguments has wrong size"); if( !alglib_impl::ae_is_symmetric(const_cast(a.c_ptr())) ) throw ap_error("'a' parameter is not symmetric matrix"); n = a.rows(); isupper = false; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::spdmatrixdet(const_cast(a.c_ptr()), n, isupper, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } } ///////////////////////////////////////////////////////////////////////// // // THIS SECTION CONTAINS IMPLEMENTATION OF COMPUTATIONAL CORE // ///////////////////////////////////////////////////////////////////////// namespace alglib_impl { static double sparse_desiredloadfactor = 0.66; static double sparse_maxloadfactor = 0.75; static double sparse_growfactor = 2.00; static ae_int_t sparse_additional = 10; static ae_int_t sparse_linalgswitch = 16; static void sparse_sparseinitduidx(sparsematrix* s, ae_state *_state); static ae_int_t sparse_hash(ae_int_t i, ae_int_t j, ae_int_t tabsize, ae_state *_state); static ae_int_t ablas_rgemmparallelsize = 64; static ae_int_t ablas_cgemmparallelsize = 64; static void ablas_ablasinternalsplitlength(ae_int_t n, ae_int_t nb, ae_int_t* n1, ae_int_t* n2, ae_state *_state); static void ablas_cmatrixrighttrsm2(ae_int_t m, ae_int_t n, /* Complex */ ae_matrix* a, ae_int_t i1, ae_int_t j1, ae_bool isupper, ae_bool isunit, ae_int_t optype, /* Complex */ ae_matrix* x, ae_int_t i2, ae_int_t j2, ae_state *_state); static void ablas_cmatrixlefttrsm2(ae_int_t m, ae_int_t n, /* Complex */ ae_matrix* a, ae_int_t i1, ae_int_t j1, ae_bool isupper, ae_bool isunit, ae_int_t optype, /* Complex */ ae_matrix* x, ae_int_t i2, ae_int_t j2, ae_state *_state); static void ablas_rmatrixrighttrsm2(ae_int_t m, ae_int_t n, /* Real */ ae_matrix* a, ae_int_t i1, ae_int_t j1, ae_bool isupper, ae_bool isunit, ae_int_t optype, /* Real */ ae_matrix* x, ae_int_t i2, ae_int_t j2, ae_state *_state); static void ablas_rmatrixlefttrsm2(ae_int_t m, ae_int_t n, /* Real */ ae_matrix* a, ae_int_t i1, ae_int_t j1, ae_bool isupper, ae_bool isunit, ae_int_t optype, /* Real */ ae_matrix* x, ae_int_t i2, ae_int_t j2, ae_state *_state); static void ablas_cmatrixherk2(ae_int_t n, ae_int_t k, double alpha, /* Complex */ ae_matrix* a, ae_int_t ia, ae_int_t ja, ae_int_t optypea, double beta, /* Complex */ ae_matrix* c, ae_int_t ic, ae_int_t jc, ae_bool isupper, ae_state *_state); static void ablas_rmatrixsyrk2(ae_int_t n, ae_int_t k, double alpha, /* Real */ ae_matrix* a, ae_int_t ia, ae_int_t ja, ae_int_t optypea, double beta, /* Real */ ae_matrix* c, ae_int_t ic, ae_int_t jc, ae_bool isupper, ae_state *_state); static void trfac_cmatrixluprec(/* Complex */ ae_matrix* a, ae_int_t offs, ae_int_t m, ae_int_t n, /* Integer */ ae_vector* pivots, /* Complex */ ae_vector* tmp, ae_state *_state); static void trfac_rmatrixluprec(/* Real */ ae_matrix* a, ae_int_t offs, ae_int_t m, ae_int_t n, /* Integer */ ae_vector* pivots, /* Real */ ae_vector* tmp, ae_state *_state); static void trfac_cmatrixplurec(/* Complex */ ae_matrix* a, ae_int_t offs, ae_int_t m, ae_int_t n, /* Integer */ ae_vector* 
pivots, /* Complex */ ae_vector* tmp, ae_state *_state); static void trfac_rmatrixplurec(/* Real */ ae_matrix* a, ae_int_t offs, ae_int_t m, ae_int_t n, /* Integer */ ae_vector* pivots, /* Real */ ae_vector* tmp, ae_state *_state); static void trfac_cmatrixlup2(/* Complex */ ae_matrix* a, ae_int_t offs, ae_int_t m, ae_int_t n, /* Integer */ ae_vector* pivots, /* Complex */ ae_vector* tmp, ae_state *_state); static void trfac_rmatrixlup2(/* Real */ ae_matrix* a, ae_int_t offs, ae_int_t m, ae_int_t n, /* Integer */ ae_vector* pivots, /* Real */ ae_vector* tmp, ae_state *_state); static void trfac_cmatrixplu2(/* Complex */ ae_matrix* a, ae_int_t offs, ae_int_t m, ae_int_t n, /* Integer */ ae_vector* pivots, /* Complex */ ae_vector* tmp, ae_state *_state); static void trfac_rmatrixplu2(/* Real */ ae_matrix* a, ae_int_t offs, ae_int_t m, ae_int_t n, /* Integer */ ae_vector* pivots, /* Real */ ae_vector* tmp, ae_state *_state); static ae_bool trfac_hpdmatrixcholeskyrec(/* Complex */ ae_matrix* a, ae_int_t offs, ae_int_t n, ae_bool isupper, /* Complex */ ae_vector* tmp, ae_state *_state); static ae_bool trfac_hpdmatrixcholesky2(/* Complex */ ae_matrix* aaa, ae_int_t offs, ae_int_t n, ae_bool isupper, /* Complex */ ae_vector* tmp, ae_state *_state); static ae_bool trfac_spdmatrixcholesky2(/* Real */ ae_matrix* aaa, ae_int_t offs, ae_int_t n, ae_bool isupper, /* Real */ ae_vector* tmp, ae_state *_state); static void rcond_rmatrixrcondtrinternal(/* Real */ ae_matrix* a, ae_int_t n, ae_bool isupper, ae_bool isunit, ae_bool onenorm, double anorm, double* rc, ae_state *_state); static void rcond_cmatrixrcondtrinternal(/* Complex */ ae_matrix* a, ae_int_t n, ae_bool isupper, ae_bool isunit, ae_bool onenorm, double anorm, double* rc, ae_state *_state); static void rcond_spdmatrixrcondcholeskyinternal(/* Real */ ae_matrix* cha, ae_int_t n, ae_bool isupper, ae_bool isnormprovided, double anorm, double* rc, ae_state *_state); static void rcond_hpdmatrixrcondcholeskyinternal(/* Complex */ ae_matrix* cha, ae_int_t n, ae_bool isupper, ae_bool isnormprovided, double anorm, double* rc, ae_state *_state); static void rcond_rmatrixrcondluinternal(/* Real */ ae_matrix* lua, ae_int_t n, ae_bool onenorm, ae_bool isanormprovided, double anorm, double* rc, ae_state *_state); static void rcond_cmatrixrcondluinternal(/* Complex */ ae_matrix* lua, ae_int_t n, ae_bool onenorm, ae_bool isanormprovided, double anorm, double* rc, ae_state *_state); static void rcond_rmatrixestimatenorm(ae_int_t n, /* Real */ ae_vector* v, /* Real */ ae_vector* x, /* Integer */ ae_vector* isgn, double* est, ae_int_t* kase, ae_state *_state); static void rcond_cmatrixestimatenorm(ae_int_t n, /* Complex */ ae_vector* v, /* Complex */ ae_vector* x, double* est, ae_int_t* kase, /* Integer */ ae_vector* isave, /* Real */ ae_vector* rsave, ae_state *_state); static double rcond_internalcomplexrcondscsum1(/* Complex */ ae_vector* x, ae_int_t n, ae_state *_state); static ae_int_t rcond_internalcomplexrcondicmax1(/* Complex */ ae_vector* x, ae_int_t n, ae_state *_state); static void rcond_internalcomplexrcondsaveall(/* Integer */ ae_vector* isave, /* Real */ ae_vector* rsave, ae_int_t* i, ae_int_t* iter, ae_int_t* j, ae_int_t* jlast, ae_int_t* jump, double* absxi, double* altsgn, double* estold, double* temp, ae_state *_state); static void rcond_internalcomplexrcondloadall(/* Integer */ ae_vector* isave, /* Real */ ae_vector* rsave, ae_int_t* i, ae_int_t* iter, ae_int_t* j, ae_int_t* jlast, ae_int_t* jump, double* absxi, double* altsgn, double* estold, 
double* temp, ae_state *_state); static ae_int_t matinv_parallelsize = 64; static void matinv_rmatrixtrinverserec(/* Real */ ae_matrix* a, ae_int_t offs, ae_int_t n, ae_bool isupper, ae_bool isunit, /* Real */ ae_vector* tmp, sinteger* info, matinvreport* rep, ae_state *_state); static void matinv_cmatrixtrinverserec(/* Complex */ ae_matrix* a, ae_int_t offs, ae_int_t n, ae_bool isupper, ae_bool isunit, /* Complex */ ae_vector* tmp, ae_int_t* info, matinvreport* rep, ae_state *_state); static void matinv_rmatrixluinverserec(/* Real */ ae_matrix* a, ae_int_t offs, ae_int_t n, /* Real */ ae_vector* work, sinteger* info, matinvreport* rep, ae_state *_state); static void matinv_cmatrixluinverserec(/* Complex */ ae_matrix* a, ae_int_t offs, ae_int_t n, /* Complex */ ae_vector* work, ae_int_t* info, matinvreport* rep, ae_state *_state); static void matinv_hpdmatrixcholeskyinverserec(/* Complex */ ae_matrix* a, ae_int_t offs, ae_int_t n, ae_bool isupper, /* Complex */ ae_vector* tmp, ae_state *_state); static void ortfac_cmatrixqrbasecase(/* Complex */ ae_matrix* a, ae_int_t m, ae_int_t n, /* Complex */ ae_vector* work, /* Complex */ ae_vector* t, /* Complex */ ae_vector* tau, ae_state *_state); static void ortfac_cmatrixlqbasecase(/* Complex */ ae_matrix* a, ae_int_t m, ae_int_t n, /* Complex */ ae_vector* work, /* Complex */ ae_vector* t, /* Complex */ ae_vector* tau, ae_state *_state); static void ortfac_rmatrixblockreflector(/* Real */ ae_matrix* a, /* Real */ ae_vector* tau, ae_bool columnwisea, ae_int_t lengtha, ae_int_t blocksize, /* Real */ ae_matrix* t, /* Real */ ae_vector* work, ae_state *_state); static void ortfac_cmatrixblockreflector(/* Complex */ ae_matrix* a, /* Complex */ ae_vector* tau, ae_bool columnwisea, ae_int_t lengtha, ae_int_t blocksize, /* Complex */ ae_matrix* t, /* Complex */ ae_vector* work, ae_state *_state); static ae_bool bdsvd_bidiagonalsvddecompositioninternal(/* Real */ ae_vector* d, /* Real */ ae_vector* e, ae_int_t n, ae_bool isupper, ae_bool isfractionalaccuracyrequired, /* Real */ ae_matrix* uu, ae_int_t ustart, ae_int_t nru, /* Real */ ae_matrix* c, ae_int_t cstart, ae_int_t ncc, /* Real */ ae_matrix* vt, ae_int_t vstart, ae_int_t ncvt, ae_state *_state); static double bdsvd_extsignbdsqr(double a, double b, ae_state *_state); static void bdsvd_svd2x2(double f, double g, double h, double* ssmin, double* ssmax, ae_state *_state); static void bdsvd_svdv2x2(double f, double g, double h, double* ssmin, double* ssmax, double* snr, double* csr, double* snl, double* csl, ae_state *_state); static ae_int_t evd_stepswithintol = 2; static void evd_clearrfields(eigsubspacestate* state, ae_state *_state); static ae_bool evd_tridiagonalevd(/* Real */ ae_vector* d, /* Real */ ae_vector* e, ae_int_t n, ae_int_t zneeded, /* Real */ ae_matrix* z, ae_state *_state); static void evd_tdevde2(double a, double b, double c, double* rt1, double* rt2, ae_state *_state); static void evd_tdevdev2(double a, double b, double c, double* rt1, double* rt2, double* cs1, double* sn1, ae_state *_state); static double evd_tdevdpythag(double a, double b, ae_state *_state); static double evd_tdevdextsign(double a, double b, ae_state *_state); static ae_bool evd_internalbisectioneigenvalues(/* Real */ ae_vector* d, /* Real */ ae_vector* e, ae_int_t n, ae_int_t irange, ae_int_t iorder, double vl, double vu, ae_int_t il, ae_int_t iu, double abstol, /* Real */ ae_vector* w, ae_int_t* m, ae_int_t* nsplit, /* Integer */ ae_vector* iblock, /* Integer */ ae_vector* isplit, ae_int_t* errorcode, ae_state 
*_state); static void evd_internaldstein(ae_int_t n, /* Real */ ae_vector* d, /* Real */ ae_vector* e, ae_int_t m, /* Real */ ae_vector* w, /* Integer */ ae_vector* iblock, /* Integer */ ae_vector* isplit, /* Real */ ae_matrix* z, /* Integer */ ae_vector* ifail, ae_int_t* info, ae_state *_state); static void evd_tdininternaldlagtf(ae_int_t n, /* Real */ ae_vector* a, double lambdav, /* Real */ ae_vector* b, /* Real */ ae_vector* c, double tol, /* Real */ ae_vector* d, /* Integer */ ae_vector* iin, ae_int_t* info, ae_state *_state); static void evd_tdininternaldlagts(ae_int_t n, /* Real */ ae_vector* a, /* Real */ ae_vector* b, /* Real */ ae_vector* c, /* Real */ ae_vector* d, /* Integer */ ae_vector* iin, /* Real */ ae_vector* y, double* tol, ae_int_t* info, ae_state *_state); static void evd_internaldlaebz(ae_int_t ijob, ae_int_t nitmax, ae_int_t n, ae_int_t mmax, ae_int_t minp, double abstol, double reltol, double pivmin, /* Real */ ae_vector* d, /* Real */ ae_vector* e, /* Real */ ae_vector* e2, /* Integer */ ae_vector* nval, /* Real */ ae_matrix* ab, /* Real */ ae_vector* c, ae_int_t* mout, /* Integer */ ae_matrix* nab, /* Real */ ae_vector* work, /* Integer */ ae_vector* iwork, ae_int_t* info, ae_state *_state); static void evd_rmatrixinternaltrevc(/* Real */ ae_matrix* t, ae_int_t n, ae_int_t side, ae_int_t howmny, /* Boolean */ ae_vector* vselect, /* Real */ ae_matrix* vl, /* Real */ ae_matrix* vr, ae_int_t* m, ae_int_t* info, ae_state *_state); static void evd_internaltrevc(/* Real */ ae_matrix* t, ae_int_t n, ae_int_t side, ae_int_t howmny, /* Boolean */ ae_vector* vselect, /* Real */ ae_matrix* vl, /* Real */ ae_matrix* vr, ae_int_t* m, ae_int_t* info, ae_state *_state); static void evd_internalhsevdlaln2(ae_bool ltrans, ae_int_t na, ae_int_t nw, double smin, double ca, /* Real */ ae_matrix* a, double d1, double d2, /* Real */ ae_matrix* b, double wr, double wi, /* Boolean */ ae_vector* rswap4, /* Boolean */ ae_vector* zswap4, /* Integer */ ae_matrix* ipivot44, /* Real */ ae_vector* civ4, /* Real */ ae_vector* crv4, /* Real */ ae_matrix* x, double* scl, double* xnorm, ae_int_t* info, ae_state *_state); static void evd_internalhsevdladiv(double a, double b, double c, double d, double* p, double* q, ae_state *_state); /************************************************************************* This function creates sparse matrix in a Hash-Table format. This function creates Hast-Table matrix, which can be converted to CRS format after its initialization is over. Typical usage scenario for a sparse matrix is: 1. creation in a Hash-Table format 2. insertion of the matrix elements 3. conversion to the CRS representation 4. matrix is passed to some linear algebra algorithm Some information about different matrix formats can be found below, in the "NOTES" section. INPUT PARAMETERS M - number of rows in a matrix, M>=1 N - number of columns in a matrix, N>=1 K - K>=0, expected number of non-zero elements in a matrix. K can be inexact approximation, can be less than actual number of elements (table will grow when needed) or even zero). It is important to understand that although hash-table may grow automatically, it is better to provide good estimate of data size. OUTPUT PARAMETERS S - sparse M*N matrix in Hash-Table representation. All elements of the matrix are zero. NOTE 1 Hash-tables use memory inefficiently, and they have to keep some amount of the "spare memory" in order to have good performance. 
Hash table for matrix with K non-zero elements will need C*K*(8+2*sizeof(int)) bytes, where C is a small constant, about 1.5-2 in magnitude. CRS storage, from the other side, is more memory-efficient, and needs just K*(8+sizeof(int))+M*sizeof(int) bytes, where M is a number of rows in a matrix. When you convert from the Hash-Table to CRS representation, all unneeded memory will be freed. NOTE 2 Comments of SparseMatrix structure outline information about different sparse storage formats. We recommend you to read them before starting to use ALGLIB sparse matrices. NOTE 3 This function completely overwrites S with new sparse matrix. Previously allocated storage is NOT reused. If you want to reuse already allocated memory, call SparseCreateBuf function. -- ALGLIB PROJECT -- Copyright 14.10.2011 by Bochkanov Sergey *************************************************************************/ void sparsecreate(ae_int_t m, ae_int_t n, ae_int_t k, sparsematrix* s, ae_state *_state) { _sparsematrix_clear(s); sparsecreatebuf(m, n, k, s, _state); } /************************************************************************* This version of SparseCreate function creates sparse matrix in Hash-Table format, reusing previously allocated storage as much as possible. Read comments for SparseCreate() for more information. INPUT PARAMETERS M - number of rows in a matrix, M>=1 N - number of columns in a matrix, N>=1 K - K>=0, expected number of non-zero elements in a matrix. K can be inexact approximation, can be less than actual number of elements (table will grow when needed) or even zero). It is important to understand that although hash-table may grow automatically, it is better to provide good estimate of data size. S - SparseMatrix structure which MAY contain some already allocated storage. OUTPUT PARAMETERS S - sparse M*N matrix in Hash-Table representation. All elements of the matrix are zero. Previously allocated storage is reused, if its size is compatible with expected number of non-zeros K. -- ALGLIB PROJECT -- Copyright 14.01.2014 by Bochkanov Sergey *************************************************************************/ void sparsecreatebuf(ae_int_t m, ae_int_t n, ae_int_t k, sparsematrix* s, ae_state *_state) { ae_int_t i; ae_assert(m>0, "SparseCreateBuf: M<=0", _state); ae_assert(n>0, "SparseCreateBuf: N<=0", _state); ae_assert(k>=0, "SparseCreateBuf: K<0", _state); /* * Hash-table size is max(existing_size,requested_size) * * NOTE: it is important to use ALL available memory for hash table * because it is impossible to efficiently reallocate table * without temporary storage. So, if we want table with up to * 1.000.000 elements, we have to create such table from the * very beginning. Otherwise, the very idea of memory reuse * will be compromised. */ s->tablesize = ae_round(k/sparse_desiredloadfactor+sparse_additional, _state); rvectorsetlengthatleast(&s->vals, s->tablesize, _state); s->tablesize = s->vals.cnt; /* * Initialize other fields */ s->matrixtype = 0; s->m = m; s->n = n; s->nfree = s->tablesize; ivectorsetlengthatleast(&s->idx, 2*s->tablesize, _state); for(i=0; i<=s->tablesize-1; i++) { s->idx.ptr.p_int[2*i] = -1; } } /************************************************************************* This function creates sparse matrix in a CRS format (expert function for situations when you are running out of memory). This function creates CRS matrix. Typical usage scenario for a CRS matrix is: 1. creation (you have to tell number of non-zero elements at each row at this moment) 2. 
insertion of the matrix elements (row by row, from left to right) 3. matrix is passed to some linear algebra algorithm This function is a memory-efficient alternative to SparseCreate(), but it is more complex because it requires you to know in advance how large your matrix is. Some information about different matrix formats can be found in comments on SparseMatrix structure. We recommend you to read them before starting to use ALGLIB sparse matrices.. INPUT PARAMETERS M - number of rows in a matrix, M>=1 N - number of columns in a matrix, N>=1 NER - number of elements at each row, array[M], NER[I]>=0 OUTPUT PARAMETERS S - sparse M*N matrix in CRS representation. You have to fill ALL non-zero elements by calling SparseSet() BEFORE you try to use this matrix. NOTE: this function completely overwrites S with new sparse matrix. Previously allocated storage is NOT reused. If you want to reuse already allocated memory, call SparseCreateCRSBuf function. -- ALGLIB PROJECT -- Copyright 14.10.2011 by Bochkanov Sergey *************************************************************************/ void sparsecreatecrs(ae_int_t m, ae_int_t n, /* Integer */ ae_vector* ner, sparsematrix* s, ae_state *_state) { ae_int_t i; _sparsematrix_clear(s); ae_assert(m>0, "SparseCreateCRS: M<=0", _state); ae_assert(n>0, "SparseCreateCRS: N<=0", _state); ae_assert(ner->cnt>=m, "SparseCreateCRS: Length(NER)ptr.p_int[i]>=0, "SparseCreateCRS: NER[] contains negative elements", _state); } sparsecreatecrsbuf(m, n, ner, s, _state); } /************************************************************************* This function creates sparse matrix in a CRS format (expert function for situations when you are running out of memory). This version of CRS matrix creation function may reuse memory already allocated in S. This function creates CRS matrix. Typical usage scenario for a CRS matrix is: 1. creation (you have to tell number of non-zero elements at each row at this moment) 2. insertion of the matrix elements (row by row, from left to right) 3. matrix is passed to some linear algebra algorithm This function is a memory-efficient alternative to SparseCreate(), but it is more complex because it requires you to know in advance how large your matrix is. Some information about different matrix formats can be found in comments on SparseMatrix structure. We recommend you to read them before starting to use ALGLIB sparse matrices.. INPUT PARAMETERS M - number of rows in a matrix, M>=1 N - number of columns in a matrix, N>=1 NER - number of elements at each row, array[M], NER[I]>=0 S - sparse matrix structure with possibly preallocated memory. OUTPUT PARAMETERS S - sparse M*N matrix in CRS representation. You have to fill ALL non-zero elements by calling SparseSet() BEFORE you try to use this matrix. 
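As an illustration of the two creation paths described above, the following sketch uses the C++ wrapper layer of ALGLIB (alglib::sparsematrix and the lowercase wrapper functions). The header name and the concrete sizes/values are assumptions made for this example only; it is a minimal sketch, not a prescribed recipe.

    #include "linalg.h"                    // assumed ALGLIB header name

    using namespace alglib;

    void creation_demo()
    {
        //
        // Path A: flexible Hash-Table creation, elements set in any order,
        // then conversion to CRS before the matrix is passed to linear algebra code.
        // K=4 is only a capacity hint; the table grows automatically.
        //
        sparsematrix a;
        sparsecreate(3, 3, 4, a);
        sparseset(a, 2, 1, -1.0);
        sparseset(a, 0, 0, 2.0);
        sparseset(a, 1, 2, 5.0);
        sparseconverttocrs(a);

        //
        // Path B: memory-efficient direct CRS creation. NER[i] promises the exact
        // number of non-zeros in row i, and elements must then be set row by row,
        // from top to bottom and from left to right within each row.
        //
        sparsematrix b;
        integer_1d_array ner = "[1,1,1]";
        sparsecreatecrs(3, 3, ner, b);
        sparseset(b, 0, 0, 2.0);
        sparseset(b, 1, 2, 5.0);
        sparseset(b, 2, 1, -1.0);
    }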
-- ALGLIB PROJECT -- Copyright 14.10.2011 by Bochkanov Sergey *************************************************************************/ void sparsecreatecrsbuf(ae_int_t m, ae_int_t n, /* Integer */ ae_vector* ner, sparsematrix* s, ae_state *_state) { ae_int_t i; ae_int_t noe; ae_assert(m>0, "SparseCreateCRSBuf: M<=0", _state); ae_assert(n>0, "SparseCreateCRSBuf: N<=0", _state); ae_assert(ner->cnt>=m, "SparseCreateCRSBuf: Length(NER)matrixtype = 1; s->ninitialized = 0; s->m = m; s->n = n; ivectorsetlengthatleast(&s->ridx, s->m+1, _state); s->ridx.ptr.p_int[0] = 0; for(i=0; i<=s->m-1; i++) { ae_assert(ner->ptr.p_int[i]>=0, "SparseCreateCRSBuf: NER[] contains negative elements", _state); noe = noe+ner->ptr.p_int[i]; s->ridx.ptr.p_int[i+1] = s->ridx.ptr.p_int[i]+ner->ptr.p_int[i]; } rvectorsetlengthatleast(&s->vals, noe, _state); ivectorsetlengthatleast(&s->idx, noe, _state); if( noe==0 ) { sparse_sparseinitduidx(s, _state); } } /************************************************************************* This function creates sparse matrix in a SKS format (skyline storage format). In most cases you do not need this function - CRS format better suits most use cases. INPUT PARAMETERS M, N - number of rows(M) and columns (N) in a matrix: * M=N (as for now, ALGLIB supports only square SKS) * N>=1 * M>=1 D - "bottom" bandwidths, array[M], D[I]>=0. I-th element stores number of non-zeros at I-th row, below the diagonal (diagonal itself is not included) U - "top" bandwidths, array[N], U[I]>=0. I-th element stores number of non-zeros at I-th row, above the diagonal (diagonal itself is not included) OUTPUT PARAMETERS S - sparse M*N matrix in SKS representation. All elements are filled by zeros. You may use SparseRewriteExisting() to change their values. NOTE: this function completely overwrites S with new sparse matrix. Previously allocated storage is NOT reused. If you want to reuse already allocated memory, call SparseCreateSKSBuf function. -- ALGLIB PROJECT -- Copyright 13.01.2014 by Bochkanov Sergey *************************************************************************/ void sparsecreatesks(ae_int_t m, ae_int_t n, /* Integer */ ae_vector* d, /* Integer */ ae_vector* u, sparsematrix* s, ae_state *_state) { ae_int_t i; _sparsematrix_clear(s); ae_assert(m>0, "SparseCreateSKS: M<=0", _state); ae_assert(n>0, "SparseCreateSKS: N<=0", _state); ae_assert(m==n, "SparseCreateSKS: M<>N", _state); ae_assert(d->cnt>=m, "SparseCreateSKS: Length(D)cnt>=n, "SparseCreateSKS: Length(U)ptr.p_int[i]>=0, "SparseCreateSKS: D[] contains negative elements", _state); ae_assert(d->ptr.p_int[i]<=i, "SparseCreateSKS: D[I]>I for some I", _state); } for(i=0; i<=n-1; i++) { ae_assert(u->ptr.p_int[i]>=0, "SparseCreateSKS: U[] contains negative elements", _state); ae_assert(u->ptr.p_int[i]<=i, "SparseCreateSKS: U[I]>I for some I", _state); } sparsecreatesksbuf(m, n, d, u, s, _state); } /************************************************************************* This is "buffered" version of SparseCreateSKS() which reuses memory previously allocated in S (of course, memory is reallocated if needed). This function creates sparse matrix in a SKS format (skyline storage format). In most cases you do not need this function - CRS format better suits most use cases. INPUT PARAMETERS M, N - number of rows(M) and columns (N) in a matrix: * M=N (as for now, ALGLIB supports only square SKS) * N>=1 * M>=1 D - "bottom" bandwidths, array[M], 0<=D[I]<=I. 
I-th element stores number of non-zeros at I-th row, below the diagonal (diagonal itself is not included) U - "top" bandwidths, array[N], 0<=U[I]<=I. I-th element stores number of non-zeros at I-th row, above the diagonal (diagonal itself is not included) OUTPUT PARAMETERS S - sparse M*N matrix in SKS representation. All elements are filled by zeros. You may use SparseSet()/SparseAdd() to change their values. -- ALGLIB PROJECT -- Copyright 13.01.2014 by Bochkanov Sergey *************************************************************************/ void sparsecreatesksbuf(ae_int_t m, ae_int_t n, /* Integer */ ae_vector* d, /* Integer */ ae_vector* u, sparsematrix* s, ae_state *_state) { ae_int_t i; ae_int_t minmn; ae_int_t nz; ae_int_t mxd; ae_int_t mxu; ae_assert(m>0, "SparseCreateSKSBuf: M<=0", _state); ae_assert(n>0, "SparseCreateSKSBuf: N<=0", _state); ae_assert(m==n, "SparseCreateSKSBuf: M<>N", _state); ae_assert(d->cnt>=m, "SparseCreateSKSBuf: Length(D)cnt>=n, "SparseCreateSKSBuf: Length(U)ptr.p_int[i]>=0, "SparseCreateSKSBuf: D[] contains negative elements", _state); ae_assert(d->ptr.p_int[i]<=i, "SparseCreateSKSBuf: D[I]>I for some I", _state); } for(i=0; i<=n-1; i++) { ae_assert(u->ptr.p_int[i]>=0, "SparseCreateSKSBuf: U[] contains negative elements", _state); ae_assert(u->ptr.p_int[i]<=i, "SparseCreateSKSBuf: U[I]>I for some I", _state); } minmn = ae_minint(m, n, _state); s->matrixtype = 2; s->ninitialized = 0; s->m = m; s->n = n; ivectorsetlengthatleast(&s->ridx, minmn+1, _state); s->ridx.ptr.p_int[0] = 0; nz = 0; for(i=0; i<=minmn-1; i++) { nz = nz+1+d->ptr.p_int[i]+u->ptr.p_int[i]; s->ridx.ptr.p_int[i+1] = s->ridx.ptr.p_int[i]+1+d->ptr.p_int[i]+u->ptr.p_int[i]; } rvectorsetlengthatleast(&s->vals, nz, _state); for(i=0; i<=nz-1; i++) { s->vals.ptr.p_double[i] = 0.0; } ivectorsetlengthatleast(&s->didx, m+1, _state); mxd = 0; for(i=0; i<=m-1; i++) { s->didx.ptr.p_int[i] = d->ptr.p_int[i]; mxd = ae_maxint(mxd, d->ptr.p_int[i], _state); } s->didx.ptr.p_int[m] = mxd; ivectorsetlengthatleast(&s->uidx, n+1, _state); mxu = 0; for(i=0; i<=n-1; i++) { s->uidx.ptr.p_int[i] = u->ptr.p_int[i]; mxu = ae_maxint(mxu, u->ptr.p_int[i], _state); } s->uidx.ptr.p_int[n] = mxu; } /************************************************************************* This function copies S0 to S1. This function completely deallocates memory owned by S1 before creating a copy of S0. If you want to reuse memory, use SparseCopyBuf. NOTE: this function does not verify its arguments, it just copies all fields of the structure. -- ALGLIB PROJECT -- Copyright 14.10.2011 by Bochkanov Sergey *************************************************************************/ void sparsecopy(sparsematrix* s0, sparsematrix* s1, ae_state *_state) { _sparsematrix_clear(s1); sparsecopybuf(s0, s1, _state); } /************************************************************************* This function copies S0 to S1. Memory already allocated in S1 is reused as much as possible. NOTE: this function does not verify its arguments, it just copies all fields of the structure. 
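Returning briefly to the SKS constructor above: the layout that SparseCreateSKSBuf() sets up is the least obvious of the three storage formats. The block owned by row I consists of D[I] values to the left of the diagonal, the diagonal value itself, and U[I] values belonging to column I above the diagonal, with RIdx[I] giving the offset of that block inside Vals[]. The helper below is a purely illustrative sketch with hypothetical names (it is not ALGLIB code); it shows how an element is located under that layout and mirrors the indexing used later by SparseGet() and SparseMV().

    #include <vector>

    // Illustrative SKS lookup: row i's block = [ d[i] subdiagonal values of row i |
    // diagonal S[i][i] | u[i] superdiagonal values of column i ], starting at ridx[i].
    double sks_get(const std::vector<double>& vals, const std::vector<int>& ridx,
                   const std::vector<int>& d, const std::vector<int>& u,
                   int i, int j)
    {
        if( i==j )
            return vals[ridx[i]+d[i]];            // diagonal element
        if( j<i && i-j<=d[i] )
            return vals[ridx[i]+d[i]-(i-j)];      // subdiagonal, stored in row i's block
        if( j>i && j-i<=u[j] )
            return vals[ridx[j+1]-(j-i)];         // superdiagonal, stored in column j's block
        return 0.0;                               // outside the skyline: structural zero
    }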
-- ALGLIB PROJECT -- Copyright 14.10.2011 by Bochkanov Sergey *************************************************************************/ void sparsecopybuf(sparsematrix* s0, sparsematrix* s1, ae_state *_state) { ae_int_t l; ae_int_t i; s1->matrixtype = s0->matrixtype; s1->m = s0->m; s1->n = s0->n; s1->nfree = s0->nfree; s1->ninitialized = s0->ninitialized; s1->tablesize = s0->tablesize; /* * Initialization for arrays */ l = s0->vals.cnt; rvectorsetlengthatleast(&s1->vals, l, _state); for(i=0; i<=l-1; i++) { s1->vals.ptr.p_double[i] = s0->vals.ptr.p_double[i]; } l = s0->ridx.cnt; ivectorsetlengthatleast(&s1->ridx, l, _state); for(i=0; i<=l-1; i++) { s1->ridx.ptr.p_int[i] = s0->ridx.ptr.p_int[i]; } l = s0->idx.cnt; ivectorsetlengthatleast(&s1->idx, l, _state); for(i=0; i<=l-1; i++) { s1->idx.ptr.p_int[i] = s0->idx.ptr.p_int[i]; } /* * Initalization for CRS-parameters */ l = s0->uidx.cnt; ivectorsetlengthatleast(&s1->uidx, l, _state); for(i=0; i<=l-1; i++) { s1->uidx.ptr.p_int[i] = s0->uidx.ptr.p_int[i]; } l = s0->didx.cnt; ivectorsetlengthatleast(&s1->didx, l, _state); for(i=0; i<=l-1; i++) { s1->didx.ptr.p_int[i] = s0->didx.ptr.p_int[i]; } } /************************************************************************* This function efficiently swaps contents of S0 and S1. -- ALGLIB PROJECT -- Copyright 16.01.2014 by Bochkanov Sergey *************************************************************************/ void sparseswap(sparsematrix* s0, sparsematrix* s1, ae_state *_state) { swapi(&s1->matrixtype, &s0->matrixtype, _state); swapi(&s1->m, &s0->m, _state); swapi(&s1->n, &s0->n, _state); swapi(&s1->nfree, &s0->nfree, _state); swapi(&s1->ninitialized, &s0->ninitialized, _state); swapi(&s1->tablesize, &s0->tablesize, _state); ae_swap_vectors(&s1->vals, &s0->vals); ae_swap_vectors(&s1->ridx, &s0->ridx); ae_swap_vectors(&s1->idx, &s0->idx); ae_swap_vectors(&s1->uidx, &s0->uidx); ae_swap_vectors(&s1->didx, &s0->didx); } /************************************************************************* This function adds value to S[i,j] - element of the sparse matrix. Matrix must be in a Hash-Table mode. In case S[i,j] already exists in the table, V i added to its value. In case S[i,j] is non-existent, it is inserted in the table. Table automatically grows when necessary. INPUT PARAMETERS S - sparse M*N matrix in Hash-Table representation. Exception will be thrown for CRS matrix. 
I - row index of the element to modify, 0<=Imatrixtype==0, "SparseAdd: matrix must be in the Hash-Table mode to do this operation", _state); ae_assert(i>=0, "SparseAdd: I<0", _state); ae_assert(im, "SparseAdd: I>=M", _state); ae_assert(j>=0, "SparseAdd: J<0", _state); ae_assert(jn, "SparseAdd: J>=N", _state); ae_assert(ae_isfinite(v, _state), "SparseAdd: V is not finite number", _state); if( ae_fp_eq(v,(double)(0)) ) { return; } tcode = -1; k = s->tablesize; if( ae_fp_greater_eq((1-sparse_maxloadfactor)*k,(double)(s->nfree)) ) { sparseresizematrix(s, _state); k = s->tablesize; } hashcode = sparse_hash(i, j, k, _state); for(;;) { if( s->idx.ptr.p_int[2*hashcode]==-1 ) { if( tcode!=-1 ) { hashcode = tcode; } s->vals.ptr.p_double[hashcode] = v; s->idx.ptr.p_int[2*hashcode] = i; s->idx.ptr.p_int[2*hashcode+1] = j; if( tcode==-1 ) { s->nfree = s->nfree-1; } return; } else { if( s->idx.ptr.p_int[2*hashcode]==i&&s->idx.ptr.p_int[2*hashcode+1]==j ) { s->vals.ptr.p_double[hashcode] = s->vals.ptr.p_double[hashcode]+v; if( ae_fp_eq(s->vals.ptr.p_double[hashcode],(double)(0)) ) { s->idx.ptr.p_int[2*hashcode] = -2; } return; } /* * Is it deleted element? */ if( tcode==-1&&s->idx.ptr.p_int[2*hashcode]==-2 ) { tcode = hashcode; } /* * Next step */ hashcode = (hashcode+1)%k; } } } /************************************************************************* This function modifies S[i,j] - element of the sparse matrix. For Hash-based storage format: * this function can be called at any moment - during matrix initialization or later * new value can be zero or non-zero. In case new value of S[i,j] is zero, this element is deleted from the table. * this function has no effect when called with zero V for non-existent element. For CRS-bases storage format: * this function can be called ONLY DURING MATRIX INITIALIZATION * new value MUST be non-zero. Exception will be thrown for zero V. * elements must be initialized in correct order - from top row to bottom, within row - from left to right. For SKS storage: NOT SUPPORTED! Use SparseRewriteExisting() to work with SKS matrices. INPUT PARAMETERS S - sparse M*N matrix in Hash-Table or CRS representation. 
I - row index of the element to modify, 0<=Imatrixtype==0||s->matrixtype==1, "SparseSet: unsupported matrix storage format", _state); ae_assert(i>=0, "SparseSet: I<0", _state); ae_assert(im, "SparseSet: I>=M", _state); ae_assert(j>=0, "SparseSet: J<0", _state); ae_assert(jn, "SparseSet: J>=N", _state); ae_assert(ae_isfinite(v, _state), "SparseSet: V is not finite number", _state); /* * Hash-table matrix */ if( s->matrixtype==0 ) { tcode = -1; k = s->tablesize; if( ae_fp_greater_eq((1-sparse_maxloadfactor)*k,(double)(s->nfree)) ) { sparseresizematrix(s, _state); k = s->tablesize; } hashcode = sparse_hash(i, j, k, _state); for(;;) { if( s->idx.ptr.p_int[2*hashcode]==-1 ) { if( ae_fp_neq(v,(double)(0)) ) { if( tcode!=-1 ) { hashcode = tcode; } s->vals.ptr.p_double[hashcode] = v; s->idx.ptr.p_int[2*hashcode] = i; s->idx.ptr.p_int[2*hashcode+1] = j; if( tcode==-1 ) { s->nfree = s->nfree-1; } } return; } else { if( s->idx.ptr.p_int[2*hashcode]==i&&s->idx.ptr.p_int[2*hashcode+1]==j ) { if( ae_fp_eq(v,(double)(0)) ) { s->idx.ptr.p_int[2*hashcode] = -2; } else { s->vals.ptr.p_double[hashcode] = v; } return; } if( tcode==-1&&s->idx.ptr.p_int[2*hashcode]==-2 ) { tcode = hashcode; } /* * Next step */ hashcode = (hashcode+1)%k; } } } /* * CRS matrix */ if( s->matrixtype==1 ) { ae_assert(s->ridx.ptr.p_int[i]<=s->ninitialized, "SparseSet: too few initialized elements at some row (you have promised more when called SparceCreateCRS)", _state); ae_assert(s->ridx.ptr.p_int[i+1]>s->ninitialized, "SparseSet: too many initialized elements at some row (you have promised less when called SparceCreateCRS)", _state); ae_assert(s->ninitialized==s->ridx.ptr.p_int[i]||s->idx.ptr.p_int[s->ninitialized-1]vals.ptr.p_double[s->ninitialized] = v; s->idx.ptr.p_int[s->ninitialized] = j; s->ninitialized = s->ninitialized+1; /* * If matrix has been created then * initiale 'S.UIdx' and 'S.DIdx' */ if( s->ninitialized==s->ridx.ptr.p_int[s->m] ) { sparse_sparseinitduidx(s, _state); } } } /************************************************************************* This function returns S[i,j] - element of the sparse matrix. Matrix can be in any mode (Hash-Table, CRS, SKS), but this function is less efficient for CRS matrices. Hash-Table and SKS matrices can find element in O(1) time, while CRS matrices need O(log(RS)) time, where RS is an number of non-zero elements in a row. INPUT PARAMETERS S - sparse M*N matrix in Hash-Table representation. Exception will be thrown for CRS matrix. 
I - row index of the element to modify, 0<=I=0, "SparseGet: I<0", _state); ae_assert(im, "SparseGet: I>=M", _state); ae_assert(j>=0, "SparseGet: J<0", _state); ae_assert(jn, "SparseGet: J>=N", _state); result = 0.0; if( s->matrixtype==0 ) { /* * Hash-based storage */ result = (double)(0); k = s->tablesize; hashcode = sparse_hash(i, j, k, _state); for(;;) { if( s->idx.ptr.p_int[2*hashcode]==-1 ) { return result; } if( s->idx.ptr.p_int[2*hashcode]==i&&s->idx.ptr.p_int[2*hashcode+1]==j ) { result = s->vals.ptr.p_double[hashcode]; return result; } hashcode = (hashcode+1)%k; } } if( s->matrixtype==1 ) { /* * CRS */ ae_assert(s->ninitialized==s->ridx.ptr.p_int[s->m], "SparseGet: some rows/elements of the CRS matrix were not initialized (you must initialize everything you promised to SparseCreateCRS)", _state); k0 = s->ridx.ptr.p_int[i]; k1 = s->ridx.ptr.p_int[i+1]-1; result = (double)(0); while(k0<=k1) { k = (k0+k1)/2; if( s->idx.ptr.p_int[k]==j ) { result = s->vals.ptr.p_double[k]; return result; } if( s->idx.ptr.p_int[k]matrixtype==2 ) { /* * SKS */ ae_assert(s->m==s->n, "SparseGet: non-square SKS matrix not supported", _state); result = (double)(0); if( i==j ) { /* * Return diagonal element */ result = s->vals.ptr.p_double[s->ridx.ptr.p_int[i]+s->didx.ptr.p_int[i]]; return result; } if( jdidx.ptr.p_int[i]; if( i-j<=k ) { result = s->vals.ptr.p_double[s->ridx.ptr.p_int[i]+k+j-i]; } } else { /* * Return superdiagonal element at J-th "skyline block" */ k = s->uidx.ptr.p_int[j]; if( j-i<=k ) { result = s->vals.ptr.p_double[s->ridx.ptr.p_int[j+1]-(j-i)]; } return result; } return result; } ae_assert(ae_false, "SparseGet: unexpected matrix type", _state); return result; } /************************************************************************* This function returns I-th diagonal element of the sparse matrix. Matrix can be in any mode (Hash-Table or CRS storage), but this function is most efficient for CRS matrices - it requires less than 50 CPU cycles to extract diagonal element. For Hash-Table matrices we still have O(1) query time, but function is many times slower. INPUT PARAMETERS S - sparse M*N matrix in Hash-Table representation. Exception will be thrown for CRS matrix. I - index of the element to modify, 0<=I=0, "SparseGetDiagonal: I<0", _state); ae_assert(im, "SparseGetDiagonal: I>=M", _state); ae_assert(in, "SparseGetDiagonal: I>=N", _state); result = (double)(0); if( s->matrixtype==0 ) { result = sparseget(s, i, i, _state); return result; } if( s->matrixtype==1 ) { if( s->didx.ptr.p_int[i]!=s->uidx.ptr.p_int[i] ) { result = s->vals.ptr.p_double[s->didx.ptr.p_int[i]]; } return result; } if( s->matrixtype==2 ) { ae_assert(s->m==s->n, "SparseGetDiagonal: non-square SKS matrix not supported", _state); result = s->vals.ptr.p_double[s->ridx.ptr.p_int[i]+s->didx.ptr.p_int[i]]; return result; } ae_assert(ae_false, "SparseGetDiagonal: unexpected matrix type", _state); return result; } /************************************************************************* This function calculates matrix-vector product S*x. Matrix S must be stored in CRS or SKS format (exception will be thrown otherwise). INPUT PARAMETERS S - sparse M*N matrix in CRS or SKS format. X - array[N], input vector. For performance reasons we make only quick checks - we check that array size is at least N, but we do not check for NAN's or INF's. Y - output buffer, possibly preallocated. In case buffer size is too small to store result, this buffer is automatically resized. 
OUTPUT PARAMETERS Y - array[M], S*x NOTE: this function throws exception when called for non-CRS/SKS matrix. You must convert your matrix with SparseConvertToCRS/SKS() before using this function. -- ALGLIB PROJECT -- Copyright 14.10.2011 by Bochkanov Sergey *************************************************************************/ void sparsemv(sparsematrix* s, /* Real */ ae_vector* x, /* Real */ ae_vector* y, ae_state *_state) { double tval; double v; double vv; ae_int_t i; ae_int_t j; ae_int_t lt; ae_int_t rt; ae_int_t lt1; ae_int_t rt1; ae_int_t n; ae_int_t m; ae_int_t d; ae_int_t u; ae_int_t ri; ae_int_t ri1; ae_assert(x->cnt>=s->n, "SparseMV: length(X)matrixtype==1||s->matrixtype==2, "SparseMV: incorrect matrix type (convert your matrix to CRS/SKS)", _state); rvectorsetlengthatleast(y, s->m, _state); n = s->n; m = s->m; if( s->matrixtype==1 ) { /* * CRS format */ ae_assert(s->ninitialized==s->ridx.ptr.p_int[s->m], "SparseMV: some rows/elements of the CRS matrix were not initialized (you must initialize everything you promised to SparseCreateCRS)", _state); for(i=0; i<=m-1; i++) { tval = (double)(0); lt = s->ridx.ptr.p_int[i]; rt = s->ridx.ptr.p_int[i+1]-1; for(j=lt; j<=rt; j++) { tval = tval+x->ptr.p_double[s->idx.ptr.p_int[j]]*s->vals.ptr.p_double[j]; } y->ptr.p_double[i] = tval; } return; } if( s->matrixtype==2 ) { /* * SKS format */ ae_assert(s->m==s->n, "SparseMV: non-square SKS matrices are not supported", _state); for(i=0; i<=n-1; i++) { ri = s->ridx.ptr.p_int[i]; ri1 = s->ridx.ptr.p_int[i+1]; d = s->didx.ptr.p_int[i]; u = s->uidx.ptr.p_int[i]; v = s->vals.ptr.p_double[ri+d]*x->ptr.p_double[i]; if( d>0 ) { lt = ri; rt = ri+d-1; lt1 = i-d; rt1 = i-1; vv = ae_v_dotproduct(&s->vals.ptr.p_double[lt], 1, &x->ptr.p_double[lt1], 1, ae_v_len(lt,rt)); v = v+vv; } y->ptr.p_double[i] = v; if( u>0 ) { lt = ri1-u; rt = ri1-1; lt1 = i-u; rt1 = i-1; v = x->ptr.p_double[i]; ae_v_addd(&y->ptr.p_double[lt1], 1, &s->vals.ptr.p_double[lt], 1, ae_v_len(lt1,rt1), v); } } return; } } /************************************************************************* This function calculates matrix-vector product S^T*x. Matrix S must be stored in CRS or SKS format (exception will be thrown otherwise). INPUT PARAMETERS S - sparse M*N matrix in CRS or SKS format. X - array[M], input vector. For performance reasons we make only quick checks - we check that array size is at least M, but we do not check for NAN's or INF's. Y - output buffer, possibly preallocated. In case buffer size is too small to store result, this buffer is automatically resized. OUTPUT PARAMETERS Y - array[N], S^T*x NOTE: this function throws exception when called for non-CRS/SKS matrix. You must convert your matrix with SparseConvertToCRS/SKS() before using this function. 
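The CRS branches of SparseMV() above and SparseMTV() below differ only in their access pattern: S*x gathers a stored row into a single output entry, while S^T*x scatters a stored row across many output entries. The following stripped-down sketch of both loops over the (RIdx, Idx, Vals) triplet is illustrative only and assumes a fully initialized CRS matrix.

    #include <vector>

    // y = S*x for an m-row CRS matrix: each output entry gathers one stored row.
    void crs_mv(int m, const std::vector<int>& ridx, const std::vector<int>& idx,
                const std::vector<double>& vals, const std::vector<double>& x,
                std::vector<double>& y)
    {
        y.assign(m, 0.0);
        for(int i=0; i<m; i++)
            for(int k=ridx[i]; k<ridx[i+1]; k++)
                y[i] += vals[k]*x[idx[k]];        // gather along row i
    }

    // y = S^T*x for an m*n CRS matrix: each stored row is scattered into the output.
    void crs_mtv(int m, int n, const std::vector<int>& ridx, const std::vector<int>& idx,
                 const std::vector<double>& vals, const std::vector<double>& x,
                 std::vector<double>& y)
    {
        y.assign(n, 0.0);
        for(int i=0; i<m; i++)
            for(int k=ridx[i]; k<ridx[i+1]; k++)
                y[idx[k]] += vals[k]*x[i];        // row i of S is column i of S^T
    }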
-- ALGLIB PROJECT -- Copyright 14.10.2011 by Bochkanov Sergey *************************************************************************/ void sparsemtv(sparsematrix* s, /* Real */ ae_vector* x, /* Real */ ae_vector* y, ae_state *_state) { ae_int_t i; ae_int_t j; ae_int_t lt; ae_int_t rt; ae_int_t ct; ae_int_t lt1; ae_int_t rt1; double v; double vv; ae_int_t n; ae_int_t m; ae_int_t ri; ae_int_t ri1; ae_int_t d; ae_int_t u; ae_assert(s->matrixtype==1||s->matrixtype==2, "SparseMTV: incorrect matrix type (convert your matrix to CRS/SKS)", _state); ae_assert(x->cnt>=s->m, "SparseMTV: Length(X)n; m = s->m; rvectorsetlengthatleast(y, n, _state); for(i=0; i<=n-1; i++) { y->ptr.p_double[i] = (double)(0); } if( s->matrixtype==1 ) { /* * CRS format */ ae_assert(s->ninitialized==s->ridx.ptr.p_int[m], "SparseMTV: some rows/elements of the CRS matrix were not initialized (you must initialize everything you promised to SparseCreateCRS)", _state); for(i=0; i<=m-1; i++) { lt = s->ridx.ptr.p_int[i]; rt = s->ridx.ptr.p_int[i+1]; v = x->ptr.p_double[i]; for(j=lt; j<=rt-1; j++) { ct = s->idx.ptr.p_int[j]; y->ptr.p_double[ct] = y->ptr.p_double[ct]+v*s->vals.ptr.p_double[j]; } } return; } if( s->matrixtype==2 ) { /* * SKS format */ ae_assert(s->m==s->n, "SparseMV: non-square SKS matrices are not supported", _state); for(i=0; i<=n-1; i++) { ri = s->ridx.ptr.p_int[i]; ri1 = s->ridx.ptr.p_int[i+1]; d = s->didx.ptr.p_int[i]; u = s->uidx.ptr.p_int[i]; if( d>0 ) { lt = ri; rt = ri+d-1; lt1 = i-d; rt1 = i-1; v = x->ptr.p_double[i]; ae_v_addd(&y->ptr.p_double[lt1], 1, &s->vals.ptr.p_double[lt], 1, ae_v_len(lt1,rt1), v); } v = s->vals.ptr.p_double[ri+d]*x->ptr.p_double[i]; if( u>0 ) { lt = ri1-u; rt = ri1-1; lt1 = i-u; rt1 = i-1; vv = ae_v_dotproduct(&s->vals.ptr.p_double[lt], 1, &x->ptr.p_double[lt1], 1, ae_v_len(lt,rt)); v = v+vv; } y->ptr.p_double[i] = v; } return; } } /************************************************************************* This function simultaneously calculates two matrix-vector products: S*x and S^T*x. S must be square (non-rectangular) matrix stored in CRS or SKS format (exception will be thrown otherwise). INPUT PARAMETERS S - sparse N*N matrix in CRS or SKS format. X - array[N], input vector. For performance reasons we make only quick checks - we check that array size is at least N, but we do not check for NAN's or INF's. Y0 - output buffer, possibly preallocated. In case buffer size is too small to store result, this buffer is automatically resized. Y1 - output buffer, possibly preallocated. In case buffer size is too small to store result, this buffer is automatically resized. OUTPUT PARAMETERS Y0 - array[N], S*x Y1 - array[N], S^T*x NOTE: this function throws exception when called for non-CRS/SKS matrix. You must convert your matrix with SparseConvertToCRS/SKS() before using this function. 
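For square matrices, SparseMV2() below fuses the gather and scatter traversals shown earlier into a single pass over the stored elements, so each element is read once instead of twice. A hedged usage sketch through the C++ wrapper layer follows; the header name is an assumption and the wrapper signature should be checked against your ALGLIB version.

    #include "linalg.h"                    // assumed ALGLIB header name

    using namespace alglib;

    void mv2_demo(const sparsematrix &s, real_1d_array &x)
    {
        real_1d_array y0, y1;
        sparsemv2(s, x, y0, y1);           // y0 = S*x and y1 = S^T*x in one sweep
        // For a symmetric S stored in full, y0 and y1 coincide.
    }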
-- ALGLIB PROJECT -- Copyright 14.10.2011 by Bochkanov Sergey *************************************************************************/ void sparsemv2(sparsematrix* s, /* Real */ ae_vector* x, /* Real */ ae_vector* y0, /* Real */ ae_vector* y1, ae_state *_state) { ae_int_t l; double tval; ae_int_t i; ae_int_t j; double vx; double vs; double v; double vv; double vd0; double vd1; ae_int_t vi; ae_int_t j0; ae_int_t j1; ae_int_t n; ae_int_t ri; ae_int_t ri1; ae_int_t d; ae_int_t u; ae_int_t lt; ae_int_t rt; ae_int_t lt1; ae_int_t rt1; ae_assert(s->matrixtype==1||s->matrixtype==2, "SparseMV2: incorrect matrix type (convert your matrix to CRS/SKS)", _state); ae_assert(s->m==s->n, "SparseMV2: matrix is non-square", _state); l = x->cnt; ae_assert(l>=s->n, "SparseMV2: Length(X)n; rvectorsetlengthatleast(y0, l, _state); rvectorsetlengthatleast(y1, l, _state); for(i=0; i<=n-1; i++) { y0->ptr.p_double[i] = (double)(0); y1->ptr.p_double[i] = (double)(0); } if( s->matrixtype==1 ) { /* * CRS format */ ae_assert(s->ninitialized==s->ridx.ptr.p_int[s->m], "SparseMV2: some rows/elements of the CRS matrix were not initialized (you must initialize everything you promised to SparseCreateCRS)", _state); for(i=0; i<=s->m-1; i++) { tval = (double)(0); vx = x->ptr.p_double[i]; j0 = s->ridx.ptr.p_int[i]; j1 = s->ridx.ptr.p_int[i+1]-1; for(j=j0; j<=j1; j++) { vi = s->idx.ptr.p_int[j]; vs = s->vals.ptr.p_double[j]; tval = tval+x->ptr.p_double[vi]*vs; y1->ptr.p_double[vi] = y1->ptr.p_double[vi]+vx*vs; } y0->ptr.p_double[i] = tval; } return; } if( s->matrixtype==2 ) { /* * SKS format */ for(i=0; i<=n-1; i++) { ri = s->ridx.ptr.p_int[i]; ri1 = s->ridx.ptr.p_int[i+1]; d = s->didx.ptr.p_int[i]; u = s->uidx.ptr.p_int[i]; vd0 = s->vals.ptr.p_double[ri+d]*x->ptr.p_double[i]; vd1 = vd0; if( d>0 ) { lt = ri; rt = ri+d-1; lt1 = i-d; rt1 = i-1; v = x->ptr.p_double[i]; ae_v_addd(&y1->ptr.p_double[lt1], 1, &s->vals.ptr.p_double[lt], 1, ae_v_len(lt1,rt1), v); vv = ae_v_dotproduct(&s->vals.ptr.p_double[lt], 1, &x->ptr.p_double[lt1], 1, ae_v_len(lt,rt)); vd0 = vd0+vv; } if( u>0 ) { lt = ri1-u; rt = ri1-1; lt1 = i-u; rt1 = i-1; v = x->ptr.p_double[i]; ae_v_addd(&y0->ptr.p_double[lt1], 1, &s->vals.ptr.p_double[lt], 1, ae_v_len(lt1,rt1), v); vv = ae_v_dotproduct(&s->vals.ptr.p_double[lt], 1, &x->ptr.p_double[lt1], 1, ae_v_len(lt,rt)); vd1 = vd1+vv; } y0->ptr.p_double[i] = vd0; y1->ptr.p_double[i] = vd1; } return; } } /************************************************************************* This function calculates matrix-vector product S*x, when S is symmetric matrix. Matrix S must be stored in CRS or SKS format (exception will be thrown otherwise). INPUT PARAMETERS S - sparse M*M matrix in CRS or SKS format. IsUpper - whether upper or lower triangle of S is given: * if upper triangle is given, only S[i,j] for j>=i are used, and lower triangle is ignored (it can be empty - these elements are not referenced at all). * if lower triangle is given, only S[i,j] for j<=i are used, and upper triangle is ignored. X - array[N], input vector. For performance reasons we make only quick checks - we check that array size is at least N, but we do not check for NAN's or INF's. Y - output buffer, possibly preallocated. In case buffer size is too small to store result, this buffer is automatically resized. OUTPUT PARAMETERS Y - array[M], S*x NOTE: this function throws exception when called for non-CRS/SKS matrix. You must convert your matrix with SparseConvertToCRS/SKS() before using this function. 
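Because only one triangle of the symmetric matrix is stored, SparseSMV() below has to let every stored off-diagonal element act twice: once as S[i,j], contributing x[j] to y[i], and once as its mirror S[j,i], contributing x[i] to y[j]; the diagonal acts once. The sketch below illustrates that idea for an upper triangle kept apart from the diagonal; all names are hypothetical and the code is not ALGLIB's.

    #include <vector>

    // y = S*x for symmetric S of which only the diagonal plus the strictly-upper
    // triangle (column index j > row index i) are stored.
    void sym_upper_mv(int n,
                      const std::vector<double>& diag,
                      const std::vector<int>& row_begin, const std::vector<int>& row_end,
                      const std::vector<int>& col, const std::vector<double>& val,
                      const std::vector<double>& x, std::vector<double>& y)
    {
        y.assign(n, 0.0);
        for(int i=0; i<n; i++)
        {
            y[i] += diag[i]*x[i];                    // diagonal contributes once
            for(int k=row_begin[i]; k<row_end[i]; k++)
            {
                int j = col[k];
                double v = val[k];
                y[i] += v*x[j];                      // element acting as S[i,j]
                y[j] += v*x[i];                      // same element acting as S[j,i]
            }
        }
    }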
-- ALGLIB PROJECT -- Copyright 14.10.2011 by Bochkanov Sergey *************************************************************************/ void sparsesmv(sparsematrix* s, ae_bool isupper, /* Real */ ae_vector* x, /* Real */ ae_vector* y, ae_state *_state) { ae_int_t n; ae_int_t i; ae_int_t j; ae_int_t id; ae_int_t lt; ae_int_t rt; double v; double vv; double vy; double vx; double vd; ae_int_t ri; ae_int_t ri1; ae_int_t d; ae_int_t u; ae_int_t lt1; ae_int_t rt1; ae_assert(s->matrixtype==1||s->matrixtype==2, "SparseSMV: incorrect matrix type (convert your matrix to CRS/SKS)", _state); ae_assert(x->cnt>=s->n, "SparseSMV: length(X)m==s->n, "SparseSMV: non-square matrix", _state); n = s->n; rvectorsetlengthatleast(y, n, _state); for(i=0; i<=n-1; i++) { y->ptr.p_double[i] = (double)(0); } if( s->matrixtype==1 ) { /* * CRS format */ ae_assert(s->ninitialized==s->ridx.ptr.p_int[s->m], "SparseSMV: some rows/elements of the CRS matrix were not initialized (you must initialize everything you promised to SparseCreateCRS)", _state); for(i=0; i<=n-1; i++) { if( s->didx.ptr.p_int[i]!=s->uidx.ptr.p_int[i] ) { y->ptr.p_double[i] = y->ptr.p_double[i]+s->vals.ptr.p_double[s->didx.ptr.p_int[i]]*x->ptr.p_double[s->idx.ptr.p_int[s->didx.ptr.p_int[i]]]; } if( isupper ) { lt = s->uidx.ptr.p_int[i]; rt = s->ridx.ptr.p_int[i+1]; vy = (double)(0); vx = x->ptr.p_double[i]; for(j=lt; j<=rt-1; j++) { id = s->idx.ptr.p_int[j]; v = s->vals.ptr.p_double[j]; vy = vy+x->ptr.p_double[id]*v; y->ptr.p_double[id] = y->ptr.p_double[id]+vx*v; } y->ptr.p_double[i] = y->ptr.p_double[i]+vy; } else { lt = s->ridx.ptr.p_int[i]; rt = s->didx.ptr.p_int[i]; vy = (double)(0); vx = x->ptr.p_double[i]; for(j=lt; j<=rt-1; j++) { id = s->idx.ptr.p_int[j]; v = s->vals.ptr.p_double[j]; vy = vy+x->ptr.p_double[id]*v; y->ptr.p_double[id] = y->ptr.p_double[id]+vx*v; } y->ptr.p_double[i] = y->ptr.p_double[i]+vy; } } return; } if( s->matrixtype==2 ) { /* * SKS format */ for(i=0; i<=n-1; i++) { ri = s->ridx.ptr.p_int[i]; ri1 = s->ridx.ptr.p_int[i+1]; d = s->didx.ptr.p_int[i]; u = s->uidx.ptr.p_int[i]; vd = s->vals.ptr.p_double[ri+d]*x->ptr.p_double[i]; if( d>0&&!isupper ) { lt = ri; rt = ri+d-1; lt1 = i-d; rt1 = i-1; v = x->ptr.p_double[i]; ae_v_addd(&y->ptr.p_double[lt1], 1, &s->vals.ptr.p_double[lt], 1, ae_v_len(lt1,rt1), v); vv = ae_v_dotproduct(&s->vals.ptr.p_double[lt], 1, &x->ptr.p_double[lt1], 1, ae_v_len(lt,rt)); vd = vd+vv; } if( u>0&&isupper ) { lt = ri1-u; rt = ri1-1; lt1 = i-u; rt1 = i-1; v = x->ptr.p_double[i]; ae_v_addd(&y->ptr.p_double[lt1], 1, &s->vals.ptr.p_double[lt], 1, ae_v_len(lt1,rt1), v); vv = ae_v_dotproduct(&s->vals.ptr.p_double[lt], 1, &x->ptr.p_double[lt1], 1, ae_v_len(lt,rt)); vd = vd+vv; } y->ptr.p_double[i] = vd; } return; } } /************************************************************************* This function calculates vector-matrix-vector product x'*S*x, where S is symmetric matrix. Matrix S must be stored in CRS or SKS format (exception will be thrown otherwise). INPUT PARAMETERS S - sparse M*M matrix in CRS or SKS format. IsUpper - whether upper or lower triangle of S is given: * if upper triangle is given, only S[i,j] for j>=i are used, and lower triangle is ignored (it can be empty - these elements are not referenced at all). * if lower triangle is given, only S[i,j] for j<=i are used, and upper triangle is ignored. X - array[N], input vector. For performance reasons we make only quick checks - we check that array size is at least N, but we do not check for NAN's or INF's. 
RESULT x'*S*x NOTE: this function throws exception when called for non-CRS/SKS matrix. You must convert your matrix with SparseConvertToCRS/SKS() before using this function. -- ALGLIB PROJECT -- Copyright 27.01.2014 by Bochkanov Sergey *************************************************************************/ double sparsevsmv(sparsematrix* s, ae_bool isupper, /* Real */ ae_vector* x, ae_state *_state) { ae_int_t n; ae_int_t i; ae_int_t j; ae_int_t k; ae_int_t id; ae_int_t lt; ae_int_t rt; double v; double v0; double v1; ae_int_t ri; ae_int_t ri1; ae_int_t d; ae_int_t u; ae_int_t lt1; double result; ae_assert(s->matrixtype==1||s->matrixtype==2, "SparseVSMV: incorrect matrix type (convert your matrix to CRS/SKS)", _state); ae_assert(x->cnt>=s->n, "SparseVSMV: length(X)m==s->n, "SparseVSMV: non-square matrix", _state); n = s->n; result = 0.0; if( s->matrixtype==1 ) { /* * CRS format */ ae_assert(s->ninitialized==s->ridx.ptr.p_int[s->m], "SparseVSMV: some rows/elements of the CRS matrix were not initialized (you must initialize everything you promised to SparseCreateCRS)", _state); for(i=0; i<=n-1; i++) { if( s->didx.ptr.p_int[i]!=s->uidx.ptr.p_int[i] ) { v = x->ptr.p_double[s->idx.ptr.p_int[s->didx.ptr.p_int[i]]]; result = result+v*s->vals.ptr.p_double[s->didx.ptr.p_int[i]]*v; } if( isupper ) { lt = s->uidx.ptr.p_int[i]; rt = s->ridx.ptr.p_int[i+1]; } else { lt = s->ridx.ptr.p_int[i]; rt = s->didx.ptr.p_int[i]; } v0 = x->ptr.p_double[i]; for(j=lt; j<=rt-1; j++) { id = s->idx.ptr.p_int[j]; v1 = x->ptr.p_double[id]; v = s->vals.ptr.p_double[j]; result = result+2*v0*v1*v; } } return result; } if( s->matrixtype==2 ) { /* * SKS format */ for(i=0; i<=n-1; i++) { ri = s->ridx.ptr.p_int[i]; ri1 = s->ridx.ptr.p_int[i+1]; d = s->didx.ptr.p_int[i]; u = s->uidx.ptr.p_int[i]; v = x->ptr.p_double[i]; result = result+v*s->vals.ptr.p_double[ri+d]*v; if( d>0&&!isupper ) { lt = ri; rt = ri+d-1; lt1 = i-d; k = d-1; v0 = x->ptr.p_double[i]; v = 0.0; for(j=0; j<=k; j++) { v = v+x->ptr.p_double[lt1+j]*s->vals.ptr.p_double[lt+j]; } result = result+2*v0*v; } if( u>0&&isupper ) { lt = ri1-u; rt = ri1-1; lt1 = i-u; k = u-1; v0 = x->ptr.p_double[i]; v = 0.0; for(j=0; j<=k; j++) { v = v+x->ptr.p_double[lt1+j]*s->vals.ptr.p_double[lt+j]; } result = result+2*v0*v; } } return result; } return result; } /************************************************************************* This function calculates matrix-matrix product S*A. Matrix S must be stored in CRS or SKS format (exception will be thrown otherwise). INPUT PARAMETERS S - sparse M*N matrix in CRS or SKS format. A - array[N][K], input dense matrix. For performance reasons we make only quick checks - we check that array size is at least N, but we do not check for NAN's or INF's. K - number of columns of matrix (A). B - output buffer, possibly preallocated. In case buffer size is too small to store result, this buffer is automatically resized. OUTPUT PARAMETERS B - array[M][K], S*A NOTE: this function throws exception when called for non-CRS/SKS matrix. You must convert your matrix with SparseConvertToCRS/SKS() before using this function. 
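One remark on SparseVSMV() above before moving on to the matrix-matrix kernels: the factor 2 in its inner loops comes from the same mirroring of the stored triangle, i.e.

    x'*S*x  =  sum_i S[i,i]*x[i]^2  +  2 * sum over stored off-diagonal (i,j) of S[i,j]*x[i]*x[j]

so each stored off-diagonal element is visited once but counted twice. For example, with S = [[2,1],[1,3]] stored as its upper triangle and x = (1,2), the result is 2*1 + 3*4 + 2*(1*1*2) = 18, which matches S*x = (4,7) and x'*(4,7) = 18.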
-- ALGLIB PROJECT -- Copyright 14.10.2011 by Bochkanov Sergey *************************************************************************/ void sparsemm(sparsematrix* s, /* Real */ ae_matrix* a, ae_int_t k, /* Real */ ae_matrix* b, ae_state *_state) { double tval; double v; ae_int_t id; ae_int_t i; ae_int_t j; ae_int_t k0; ae_int_t k1; ae_int_t lt; ae_int_t rt; ae_int_t m; ae_int_t n; ae_int_t ri; ae_int_t ri1; ae_int_t lt1; ae_int_t rt1; ae_int_t d; ae_int_t u; double vd; ae_assert(s->matrixtype==1||s->matrixtype==2, "SparseMM: incorrect matrix type (convert your matrix to CRS/SKS)", _state); ae_assert(a->rows>=s->n, "SparseMM: Rows(A)0, "SparseMM: K<=0", _state); m = s->m; n = s->n; k1 = k-1; rmatrixsetlengthatleast(b, m, k, _state); for(i=0; i<=m-1; i++) { for(j=0; j<=k-1; j++) { b->ptr.pp_double[i][j] = (double)(0); } } if( s->matrixtype==1 ) { /* * CRS format */ ae_assert(s->ninitialized==s->ridx.ptr.p_int[m], "SparseMM: some rows/elements of the CRS matrix were not initialized (you must initialize everything you promised to SparseCreateCRS)", _state); if( kridx.ptr.p_int[i]; rt = s->ridx.ptr.p_int[i+1]; for(k0=lt; k0<=rt-1; k0++) { tval = tval+s->vals.ptr.p_double[k0]*a->ptr.pp_double[s->idx.ptr.p_int[k0]][j]; } b->ptr.pp_double[i][j] = tval; } } } else { for(i=0; i<=m-1; i++) { lt = s->ridx.ptr.p_int[i]; rt = s->ridx.ptr.p_int[i+1]; for(j=lt; j<=rt-1; j++) { id = s->idx.ptr.p_int[j]; v = s->vals.ptr.p_double[j]; ae_v_addd(&b->ptr.pp_double[i][0], 1, &a->ptr.pp_double[id][0], 1, ae_v_len(0,k-1), v); } } } return; } if( s->matrixtype==2 ) { /* * SKS format */ ae_assert(m==n, "SparseMM: non-square SKS matrices are not supported", _state); for(i=0; i<=n-1; i++) { ri = s->ridx.ptr.p_int[i]; ri1 = s->ridx.ptr.p_int[i+1]; d = s->didx.ptr.p_int[i]; u = s->uidx.ptr.p_int[i]; if( d>0 ) { lt = ri; rt = ri+d-1; lt1 = i-d; rt1 = i-1; for(j=lt1; j<=rt1; j++) { v = s->vals.ptr.p_double[lt+(j-lt1)]; if( kptr.pp_double[i][k0] = b->ptr.pp_double[i][k0]+v*a->ptr.pp_double[j][k0]; } } else { /* * Use vector operation */ ae_v_addd(&b->ptr.pp_double[i][0], 1, &a->ptr.pp_double[j][0], 1, ae_v_len(0,k-1), v); } } } if( u>0 ) { lt = ri1-u; rt = ri1-1; lt1 = i-u; rt1 = i-1; for(j=lt1; j<=rt1; j++) { v = s->vals.ptr.p_double[lt+(j-lt1)]; if( kptr.pp_double[j][k0] = b->ptr.pp_double[j][k0]+v*a->ptr.pp_double[i][k0]; } } else { /* * Use vector operation */ ae_v_addd(&b->ptr.pp_double[j][0], 1, &a->ptr.pp_double[i][0], 1, ae_v_len(0,k-1), v); } } } vd = s->vals.ptr.p_double[ri+d]; ae_v_addd(&b->ptr.pp_double[i][0], 1, &a->ptr.pp_double[i][0], 1, ae_v_len(0,k-1), vd); } return; } } /************************************************************************* This function calculates matrix-matrix product S^T*A. Matrix S must be stored in CRS or SKS format (exception will be thrown otherwise). INPUT PARAMETERS S - sparse M*N matrix in CRS or SKS format. A - array[M][K], input dense matrix. For performance reasons we make only quick checks - we check that array size is at least M, but we do not check for NAN's or INF's. K - number of columns of matrix (A). B - output buffer, possibly preallocated. In case buffer size is too small to store result, this buffer is automatically resized. OUTPUT PARAMETERS B - array[N][K], S^T*A NOTE: this function throws exception when called for non-CRS/SKS matrix. You must convert your matrix with SparseConvertToCRS/SKS() before using this function. 
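The CRS branch of SparseMM() above walks S row by row and adds scaled rows of the dense factor into the result, switching to vectorized ae_v_addd() updates once K is large enough to amortize the call overhead; the core update is simply B[i,:] += S[i,j]*A[j,:]. SparseMTM() below performs the mirrored update B[j,:] += S[i,j]*A[i,:]. A minimal, purely illustrative sketch of the S*A case with dense row-major buffers:

    #include <vector>

    // B = S*A with S an m*n CRS matrix, A an n*k and B an m*k dense row-major buffer.
    void crs_mm(int m, int k,
                const std::vector<int>& ridx, const std::vector<int>& idx,
                const std::vector<double>& vals,
                const std::vector<double>& a, std::vector<double>& b)
    {
        b.assign((size_t)m*k, 0.0);
        for(int i=0; i<m; i++)
            for(int t=ridx[i]; t<ridx[i+1]; t++)
            {
                int j = idx[t];
                double v = vals[t];
                for(int c=0; c<k; c++)               // B[i,:] += S[i,j]*A[j,:]
                    b[(size_t)i*k+c] += v*a[(size_t)j*k+c];
            }
    }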
-- ALGLIB PROJECT -- Copyright 14.10.2011 by Bochkanov Sergey *************************************************************************/ void sparsemtm(sparsematrix* s, /* Real */ ae_matrix* a, ae_int_t k, /* Real */ ae_matrix* b, ae_state *_state) { ae_int_t i; ae_int_t j; ae_int_t k0; ae_int_t k1; ae_int_t lt; ae_int_t rt; ae_int_t ct; double v; ae_int_t m; ae_int_t n; ae_int_t ri; ae_int_t ri1; ae_int_t lt1; ae_int_t rt1; ae_int_t d; ae_int_t u; ae_assert(s->matrixtype==1||s->matrixtype==2, "SparseMTM: incorrect matrix type (convert your matrix to CRS/SKS)", _state); ae_assert(a->rows>=s->m, "SparseMTM: Rows(A)0, "SparseMTM: K<=0", _state); m = s->m; n = s->n; k1 = k-1; rmatrixsetlengthatleast(b, n, k, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=k-1; j++) { b->ptr.pp_double[i][j] = (double)(0); } } if( s->matrixtype==1 ) { /* * CRS format */ ae_assert(s->ninitialized==s->ridx.ptr.p_int[m], "SparseMTM: some rows/elements of the CRS matrix were not initialized (you must initialize everything you promised to SparseCreateCRS)", _state); if( kridx.ptr.p_int[i]; rt = s->ridx.ptr.p_int[i+1]; for(k0=lt; k0<=rt-1; k0++) { v = s->vals.ptr.p_double[k0]; ct = s->idx.ptr.p_int[k0]; for(j=0; j<=k-1; j++) { b->ptr.pp_double[ct][j] = b->ptr.pp_double[ct][j]+v*a->ptr.pp_double[i][j]; } } } } else { for(i=0; i<=m-1; i++) { lt = s->ridx.ptr.p_int[i]; rt = s->ridx.ptr.p_int[i+1]; for(j=lt; j<=rt-1; j++) { v = s->vals.ptr.p_double[j]; ct = s->idx.ptr.p_int[j]; ae_v_addd(&b->ptr.pp_double[ct][0], 1, &a->ptr.pp_double[i][0], 1, ae_v_len(0,k-1), v); } } } return; } if( s->matrixtype==2 ) { /* * SKS format */ ae_assert(m==n, "SparseMTM: non-square SKS matrices are not supported", _state); for(i=0; i<=n-1; i++) { ri = s->ridx.ptr.p_int[i]; ri1 = s->ridx.ptr.p_int[i+1]; d = s->didx.ptr.p_int[i]; u = s->uidx.ptr.p_int[i]; if( d>0 ) { lt = ri; rt = ri+d-1; lt1 = i-d; rt1 = i-1; for(j=lt1; j<=rt1; j++) { v = s->vals.ptr.p_double[lt+(j-lt1)]; if( kptr.pp_double[j][k0] = b->ptr.pp_double[j][k0]+v*a->ptr.pp_double[i][k0]; } } else { /* * Use vector operation */ ae_v_addd(&b->ptr.pp_double[j][0], 1, &a->ptr.pp_double[i][0], 1, ae_v_len(0,k-1), v); } } } if( u>0 ) { lt = ri1-u; rt = ri1-1; lt1 = i-u; rt1 = i-1; for(j=lt1; j<=rt1; j++) { v = s->vals.ptr.p_double[lt+(j-lt1)]; if( kptr.pp_double[i][k0] = b->ptr.pp_double[i][k0]+v*a->ptr.pp_double[j][k0]; } } else { /* * Use vector operation */ ae_v_addd(&b->ptr.pp_double[i][0], 1, &a->ptr.pp_double[j][0], 1, ae_v_len(0,k-1), v); } } } v = s->vals.ptr.p_double[ri+d]; ae_v_addd(&b->ptr.pp_double[i][0], 1, &a->ptr.pp_double[i][0], 1, ae_v_len(0,k-1), v); } return; } } /************************************************************************* This function simultaneously calculates two matrix-matrix products: S*A and S^T*A. S must be square (non-rectangular) matrix stored in CRS or SKS format (exception will be thrown otherwise). INPUT PARAMETERS S - sparse N*N matrix in CRS or SKS format. A - array[N][K], input dense matrix. For performance reasons we make only quick checks - we check that array size is at least N, but we do not check for NAN's or INF's. K - number of columns of matrix (A). B0 - output buffer, possibly preallocated. In case buffer size is too small to store result, this buffer is automatically resized. B1 - output buffer, possibly preallocated. In case buffer size is too small to store result, this buffer is automatically resized. 
OUTPUT PARAMETERS B0 - array[N][K], S*A B1 - array[N][K], S^T*A NOTE: this function throws exception when called for non-CRS/SKS matrix. You must convert your matrix with SparseConvertToCRS/SKS() before using this function. -- ALGLIB PROJECT -- Copyright 14.10.2011 by Bochkanov Sergey *************************************************************************/ void sparsemm2(sparsematrix* s, /* Real */ ae_matrix* a, ae_int_t k, /* Real */ ae_matrix* b0, /* Real */ ae_matrix* b1, ae_state *_state) { ae_int_t i; ae_int_t j; ae_int_t k0; ae_int_t lt; ae_int_t rt; ae_int_t ct; double v; double tval; ae_int_t n; ae_int_t k1; ae_int_t ri; ae_int_t ri1; ae_int_t lt1; ae_int_t rt1; ae_int_t d; ae_int_t u; ae_assert(s->matrixtype==1||s->matrixtype==2, "SparseMM2: incorrect matrix type (convert your matrix to CRS/SKS)", _state); ae_assert(s->m==s->n, "SparseMM2: matrix is non-square", _state); ae_assert(a->rows>=s->n, "SparseMM2: Rows(A)0, "SparseMM2: K<=0", _state); n = s->n; k1 = k-1; rmatrixsetlengthatleast(b0, n, k, _state); rmatrixsetlengthatleast(b1, n, k, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=k-1; j++) { b1->ptr.pp_double[i][j] = (double)(0); b0->ptr.pp_double[i][j] = (double)(0); } } if( s->matrixtype==1 ) { /* * CRS format */ ae_assert(s->ninitialized==s->ridx.ptr.p_int[s->m], "SparseMM2: some rows/elements of the CRS matrix were not initialized (you must initialize everything you promised to SparseCreateCRS)", _state); if( kridx.ptr.p_int[i]; rt = s->ridx.ptr.p_int[i+1]; v = a->ptr.pp_double[i][j]; for(k0=lt; k0<=rt-1; k0++) { ct = s->idx.ptr.p_int[k0]; b1->ptr.pp_double[ct][j] = b1->ptr.pp_double[ct][j]+s->vals.ptr.p_double[k0]*v; tval = tval+s->vals.ptr.p_double[k0]*a->ptr.pp_double[ct][j]; } b0->ptr.pp_double[i][j] = tval; } } } else { for(i=0; i<=n-1; i++) { lt = s->ridx.ptr.p_int[i]; rt = s->ridx.ptr.p_int[i+1]; for(j=lt; j<=rt-1; j++) { v = s->vals.ptr.p_double[j]; ct = s->idx.ptr.p_int[j]; ae_v_addd(&b0->ptr.pp_double[i][0], 1, &a->ptr.pp_double[ct][0], 1, ae_v_len(0,k-1), v); ae_v_addd(&b1->ptr.pp_double[ct][0], 1, &a->ptr.pp_double[i][0], 1, ae_v_len(0,k-1), v); } } } return; } if( s->matrixtype==2 ) { /* * SKS format */ ae_assert(s->m==s->n, "SparseMM2: non-square SKS matrices are not supported", _state); for(i=0; i<=n-1; i++) { ri = s->ridx.ptr.p_int[i]; ri1 = s->ridx.ptr.p_int[i+1]; d = s->didx.ptr.p_int[i]; u = s->uidx.ptr.p_int[i]; if( d>0 ) { lt = ri; rt = ri+d-1; lt1 = i-d; rt1 = i-1; for(j=lt1; j<=rt1; j++) { v = s->vals.ptr.p_double[lt+(j-lt1)]; if( kptr.pp_double[i][k0] = b0->ptr.pp_double[i][k0]+v*a->ptr.pp_double[j][k0]; b1->ptr.pp_double[j][k0] = b1->ptr.pp_double[j][k0]+v*a->ptr.pp_double[i][k0]; } } else { /* * Use vector operation */ ae_v_addd(&b0->ptr.pp_double[i][0], 1, &a->ptr.pp_double[j][0], 1, ae_v_len(0,k-1), v); ae_v_addd(&b1->ptr.pp_double[j][0], 1, &a->ptr.pp_double[i][0], 1, ae_v_len(0,k-1), v); } } } if( u>0 ) { lt = ri1-u; rt = ri1-1; lt1 = i-u; rt1 = i-1; for(j=lt1; j<=rt1; j++) { v = s->vals.ptr.p_double[lt+(j-lt1)]; if( kptr.pp_double[j][k0] = b0->ptr.pp_double[j][k0]+v*a->ptr.pp_double[i][k0]; b1->ptr.pp_double[i][k0] = b1->ptr.pp_double[i][k0]+v*a->ptr.pp_double[j][k0]; } } else { /* * Use vector operation */ ae_v_addd(&b0->ptr.pp_double[j][0], 1, &a->ptr.pp_double[i][0], 1, ae_v_len(0,k-1), v); ae_v_addd(&b1->ptr.pp_double[i][0], 1, &a->ptr.pp_double[j][0], 1, ae_v_len(0,k-1), v); } } } v = s->vals.ptr.p_double[ri+d]; ae_v_addd(&b0->ptr.pp_double[i][0], 1, &a->ptr.pp_double[i][0], 1, ae_v_len(0,k-1), v); ae_v_addd(&b1->ptr.pp_double[i][0], 
1, &a->ptr.pp_double[i][0], 1, ae_v_len(0,k-1), v); } return; } } /************************************************************************* This function calculates matrix-matrix product S*A, when S is symmetric matrix. Matrix S must be stored in CRS or SKS format (exception will be thrown otherwise). INPUT PARAMETERS S - sparse M*M matrix in CRS or SKS format. IsUpper - whether upper or lower triangle of S is given: * if upper triangle is given, only S[i,j] for j>=i are used, and lower triangle is ignored (it can be empty - these elements are not referenced at all). * if lower triangle is given, only S[i,j] for j<=i are used, and upper triangle is ignored. A - array[N][K], input dense matrix. For performance reasons we make only quick checks - we check that array size is at least N, but we do not check for NAN's or INF's. K - number of columns of matrix (A). B - output buffer, possibly preallocated. In case buffer size is too small to store result, this buffer is automatically resized. OUTPUT PARAMETERS B - array[M][K], S*A NOTE: this function throws exception when called for non-CRS/SKS matrix. You must convert your matrix with SparseConvertToCRS/SKS() before using this function. -- ALGLIB PROJECT -- Copyright 14.10.2011 by Bochkanov Sergey *************************************************************************/ void sparsesmm(sparsematrix* s, ae_bool isupper, /* Real */ ae_matrix* a, ae_int_t k, /* Real */ ae_matrix* b, ae_state *_state) { ae_int_t i; ae_int_t j; ae_int_t k0; ae_int_t id; ae_int_t k1; ae_int_t lt; ae_int_t rt; double v; double vb; double va; ae_int_t n; ae_int_t ri; ae_int_t ri1; ae_int_t lt1; ae_int_t rt1; ae_int_t d; ae_int_t u; ae_assert(s->matrixtype==1||s->matrixtype==2, "SparseSMM: incorrect matrix type (convert your matrix to CRS/SKS)", _state); ae_assert(a->rows>=s->n, "SparseSMM: Rows(X)m==s->n, "SparseSMM: matrix is non-square", _state); n = s->n; k1 = k-1; rmatrixsetlengthatleast(b, n, k, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=k-1; j++) { b->ptr.pp_double[i][j] = (double)(0); } } if( s->matrixtype==1 ) { /* * CRS format */ ae_assert(s->ninitialized==s->ridx.ptr.p_int[s->m], "SparseSMM: some rows/elements of the CRS matrix were not initialized (you must initialize everything you promised to SparseCreateCRS)", _state); if( k>sparse_linalgswitch ) { for(i=0; i<=n-1; i++) { for(j=0; j<=k-1; j++) { if( s->didx.ptr.p_int[i]!=s->uidx.ptr.p_int[i] ) { id = s->didx.ptr.p_int[i]; b->ptr.pp_double[i][j] = b->ptr.pp_double[i][j]+s->vals.ptr.p_double[id]*a->ptr.pp_double[s->idx.ptr.p_int[id]][j]; } if( isupper ) { lt = s->uidx.ptr.p_int[i]; rt = s->ridx.ptr.p_int[i+1]; vb = (double)(0); va = a->ptr.pp_double[i][j]; for(k0=lt; k0<=rt-1; k0++) { id = s->idx.ptr.p_int[k0]; v = s->vals.ptr.p_double[k0]; vb = vb+a->ptr.pp_double[id][j]*v; b->ptr.pp_double[id][j] = b->ptr.pp_double[id][j]+va*v; } b->ptr.pp_double[i][j] = b->ptr.pp_double[i][j]+vb; } else { lt = s->ridx.ptr.p_int[i]; rt = s->didx.ptr.p_int[i]; vb = (double)(0); va = a->ptr.pp_double[i][j]; for(k0=lt; k0<=rt-1; k0++) { id = s->idx.ptr.p_int[k0]; v = s->vals.ptr.p_double[k0]; vb = vb+a->ptr.pp_double[id][j]*v; b->ptr.pp_double[id][j] = b->ptr.pp_double[id][j]+va*v; } b->ptr.pp_double[i][j] = b->ptr.pp_double[i][j]+vb; } } } } else { for(i=0; i<=n-1; i++) { if( s->didx.ptr.p_int[i]!=s->uidx.ptr.p_int[i] ) { id = s->didx.ptr.p_int[i]; v = s->vals.ptr.p_double[id]; ae_v_addd(&b->ptr.pp_double[i][0], 1, &a->ptr.pp_double[s->idx.ptr.p_int[id]][0], 1, ae_v_len(0,k-1), v); } if( isupper ) { lt = 
s->uidx.ptr.p_int[i]; rt = s->ridx.ptr.p_int[i+1]; for(j=lt; j<=rt-1; j++) { id = s->idx.ptr.p_int[j]; v = s->vals.ptr.p_double[j]; ae_v_addd(&b->ptr.pp_double[i][0], 1, &a->ptr.pp_double[id][0], 1, ae_v_len(0,k-1), v); ae_v_addd(&b->ptr.pp_double[id][0], 1, &a->ptr.pp_double[i][0], 1, ae_v_len(0,k-1), v); } } else { lt = s->ridx.ptr.p_int[i]; rt = s->didx.ptr.p_int[i]; for(j=lt; j<=rt-1; j++) { id = s->idx.ptr.p_int[j]; v = s->vals.ptr.p_double[j]; ae_v_addd(&b->ptr.pp_double[i][0], 1, &a->ptr.pp_double[id][0], 1, ae_v_len(0,k-1), v); ae_v_addd(&b->ptr.pp_double[id][0], 1, &a->ptr.pp_double[i][0], 1, ae_v_len(0,k-1), v); } } } } return; } if( s->matrixtype==2 ) { /* * SKS format */ ae_assert(s->m==s->n, "SparseMM2: non-square SKS matrices are not supported", _state); for(i=0; i<=n-1; i++) { ri = s->ridx.ptr.p_int[i]; ri1 = s->ridx.ptr.p_int[i+1]; d = s->didx.ptr.p_int[i]; u = s->uidx.ptr.p_int[i]; if( d>0&&!isupper ) { lt = ri; rt = ri+d-1; lt1 = i-d; rt1 = i-1; for(j=lt1; j<=rt1; j++) { v = s->vals.ptr.p_double[lt+(j-lt1)]; if( kptr.pp_double[i][k0] = b->ptr.pp_double[i][k0]+v*a->ptr.pp_double[j][k0]; b->ptr.pp_double[j][k0] = b->ptr.pp_double[j][k0]+v*a->ptr.pp_double[i][k0]; } } else { /* * Use vector operation */ ae_v_addd(&b->ptr.pp_double[i][0], 1, &a->ptr.pp_double[j][0], 1, ae_v_len(0,k-1), v); ae_v_addd(&b->ptr.pp_double[j][0], 1, &a->ptr.pp_double[i][0], 1, ae_v_len(0,k-1), v); } } } if( u>0&&isupper ) { lt = ri1-u; rt = ri1-1; lt1 = i-u; rt1 = i-1; for(j=lt1; j<=rt1; j++) { v = s->vals.ptr.p_double[lt+(j-lt1)]; if( kptr.pp_double[j][k0] = b->ptr.pp_double[j][k0]+v*a->ptr.pp_double[i][k0]; b->ptr.pp_double[i][k0] = b->ptr.pp_double[i][k0]+v*a->ptr.pp_double[j][k0]; } } else { /* * Use vector operation */ ae_v_addd(&b->ptr.pp_double[j][0], 1, &a->ptr.pp_double[i][0], 1, ae_v_len(0,k-1), v); ae_v_addd(&b->ptr.pp_double[i][0], 1, &a->ptr.pp_double[j][0], 1, ae_v_len(0,k-1), v); } } } v = s->vals.ptr.p_double[ri+d]; ae_v_addd(&b->ptr.pp_double[i][0], 1, &a->ptr.pp_double[i][0], 1, ae_v_len(0,k-1), v); } return; } } /************************************************************************* This function calculates matrix-vector product op(S)*x, when x is vector, S is symmetric triangular matrix, op(S) is transposition or no operation. Matrix S must be stored in CRS or SKS format (exception will be thrown otherwise). INPUT PARAMETERS S - sparse square matrix in CRS or SKS format. IsUpper - whether upper or lower triangle of S is used: * if upper triangle is given, only S[i,j] for j>=i are used, and lower triangle is ignored (it can be empty - these elements are not referenced at all). * if lower triangle is given, only S[i,j] for j<=i are used, and upper triangle is ignored. IsUnit - unit or non-unit diagonal: * if True, diagonal elements of triangular matrix are considered equal to 1.0. Actual elements stored in S are not referenced at all. * if False, diagonal stored in S is used OpType - operation type: * if 0, S*x is calculated * if 1, (S^T)*x is calculated (transposition) X - array[N] which stores input vector. For performance reasons we make only quick checks - we check that array size is at least N, but we do not check for NAN's or INF's. Y - possibly preallocated input buffer. Automatically resized if its size is too small. OUTPUT PARAMETERS Y - array[N], op(S)*x NOTE: this function throws exception when called for non-CRS/SKS matrix. You must convert your matrix with SparseConvertToCRS/SKS() before using this function. 
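A detail of the triangular multiply just described and implemented below: with IsUnit=True the stored diagonal of S is never read - the result is seeded with y=x and only the strict triangle is accumulated on top - while with IsUnit=False the result is seeded with zero and the stored diagonal participates. A hedged usage sketch through the C++ wrapper layer follows; the header name, the 3*3 size of the test vector and the exact wrapper signature are assumptions to be checked against your ALGLIB version.

    #include "linalg.h"                    // assumed ALGLIB header name

    using namespace alglib;

    void trmv_demo(const sparsematrix &s)  // s assumed to be a 3*3 CRS or SKS matrix
    {
        real_1d_array x = "[1,2,3]";
        real_1d_array y;
        sparsetrmv(s, true,  false, 0, x, y);   // y = U*x, upper triangle, own diagonal
        sparsetrmv(s, false, true,  1, x, y);   // y = L^T*x, implicit unit diagonal
    }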
-- ALGLIB PROJECT -- Copyright 20.01.2014 by Bochkanov Sergey *************************************************************************/ void sparsetrmv(sparsematrix* s, ae_bool isupper, ae_bool isunit, ae_int_t optype, /* Real */ ae_vector* x, /* Real */ ae_vector* y, ae_state *_state) { ae_int_t n; ae_int_t i; ae_int_t j; ae_int_t k; ae_int_t j0; ae_int_t j1; double v; ae_int_t ri; ae_int_t ri1; ae_int_t d; ae_int_t u; ae_int_t lt; ae_int_t rt; ae_int_t lt1; ae_int_t rt1; ae_assert(s->matrixtype==1||s->matrixtype==2, "SparseTRMV: incorrect matrix type (convert your matrix to CRS/SKS)", _state); ae_assert(optype==0||optype==1, "SparseTRMV: incorrect operation type (must be 0 or 1)", _state); ae_assert(x->cnt>=s->n, "SparseTRMV: Length(X)m==s->n, "SparseTRMV: matrix is non-square", _state); n = s->n; rvectorsetlengthatleast(y, n, _state); if( isunit ) { /* * Set initial value of y to x */ for(i=0; i<=n-1; i++) { y->ptr.p_double[i] = x->ptr.p_double[i]; } } else { /* * Set initial value of y to 0 */ for(i=0; i<=n-1; i++) { y->ptr.p_double[i] = (double)(0); } } if( s->matrixtype==1 ) { /* * CRS format */ ae_assert(s->ninitialized==s->ridx.ptr.p_int[s->m], "SparseTRMV: some rows/elements of the CRS matrix were not initialized (you must initialize everything you promised to SparseCreateCRS)", _state); for(i=0; i<=n-1; i++) { /* * Depending on IsUpper/IsUnit, select range of indexes to process */ if( isupper ) { if( isunit||s->didx.ptr.p_int[i]==s->uidx.ptr.p_int[i] ) { j0 = s->uidx.ptr.p_int[i]; } else { j0 = s->didx.ptr.p_int[i]; } j1 = s->ridx.ptr.p_int[i+1]-1; } else { j0 = s->ridx.ptr.p_int[i]; if( isunit||s->didx.ptr.p_int[i]==s->uidx.ptr.p_int[i] ) { j1 = s->didx.ptr.p_int[i]-1; } else { j1 = s->didx.ptr.p_int[i]; } } /* * Depending on OpType, process subset of I-th row of input matrix */ if( optype==0 ) { v = 0.0; for(j=j0; j<=j1; j++) { v = v+s->vals.ptr.p_double[j]*x->ptr.p_double[s->idx.ptr.p_int[j]]; } y->ptr.p_double[i] = y->ptr.p_double[i]+v; } else { v = x->ptr.p_double[i]; for(j=j0; j<=j1; j++) { k = s->idx.ptr.p_int[j]; y->ptr.p_double[k] = y->ptr.p_double[k]+v*s->vals.ptr.p_double[j]; } } } return; } if( s->matrixtype==2 ) { /* * SKS format */ ae_assert(s->m==s->n, "SparseTRMV: non-square SKS matrices are not supported", _state); for(i=0; i<=n-1; i++) { ri = s->ridx.ptr.p_int[i]; ri1 = s->ridx.ptr.p_int[i+1]; d = s->didx.ptr.p_int[i]; u = s->uidx.ptr.p_int[i]; if( !isunit ) { y->ptr.p_double[i] = y->ptr.p_double[i]+s->vals.ptr.p_double[ri+d]*x->ptr.p_double[i]; } if( d>0&&!isupper ) { lt = ri; rt = ri+d-1; lt1 = i-d; rt1 = i-1; if( optype==0 ) { v = ae_v_dotproduct(&s->vals.ptr.p_double[lt], 1, &x->ptr.p_double[lt1], 1, ae_v_len(lt,rt)); y->ptr.p_double[i] = y->ptr.p_double[i]+v; } else { v = x->ptr.p_double[i]; ae_v_addd(&y->ptr.p_double[lt1], 1, &s->vals.ptr.p_double[lt], 1, ae_v_len(lt1,rt1), v); } } if( u>0&&isupper ) { lt = ri1-u; rt = ri1-1; lt1 = i-u; rt1 = i-1; if( optype==0 ) { v = x->ptr.p_double[i]; ae_v_addd(&y->ptr.p_double[lt1], 1, &s->vals.ptr.p_double[lt], 1, ae_v_len(lt1,rt1), v); } else { v = ae_v_dotproduct(&s->vals.ptr.p_double[lt], 1, &x->ptr.p_double[lt1], 1, ae_v_len(lt,rt)); y->ptr.p_double[i] = y->ptr.p_double[i]+v; } } } return; } } /************************************************************************* This function solves linear system op(S)*y=x where x is vector, S is symmetric triangular matrix, op(S) is transposition or no operation. Matrix S must be stored in CRS or SKS format (exception will be thrown otherwise). 
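In short, the solver described here runs a forward sweep (rows processed in increasing order) when op(S) is lower triangular and a backward sweep when op(S) is upper triangular, dividing by the diagonal through SafeRDiv() so that near-overflow and exact-zero pivots raise an assertion instead of producing garbage. The forward-substitution sketch below is illustrative only, with a hypothetical layout that keeps the strictly-lower elements and the diagonal apart; it is not ALGLIB code.

    #include <vector>

    // Solve L*y = x in place for a lower-triangular L with non-zero diagonal.
    // Row i keeps its strictly-lower stored elements in col/val between
    // row_begin[i] and row_end[i]; diag[i] holds L[i][i].
    void lower_solve(int n,
                     const std::vector<int>& row_begin, const std::vector<int>& row_end,
                     const std::vector<int>& col, const std::vector<double>& val,
                     const std::vector<double>& diag,
                     std::vector<double>& x)          // x on input, solution on output
    {
        for(int i=0; i<n; i++)                        // forward sweep: rows 0..n-1
        {
            double v = 0.0;
            for(int k=row_begin[i]; k<row_end[i]; k++)
                v += val[k]*x[col[k]];                // uses already-solved entries only
            x[i] = (x[i]-v)/diag[i];                  // ALGLIB guards this with SafeRDiv()
        }
    }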
INPUT PARAMETERS S - sparse square matrix in CRS or SKS format. IsUpper - whether upper or lower triangle of S is used: * if upper triangle is given, only S[i,j] for j>=i are used, and lower triangle is ignored (it can be empty - these elements are not referenced at all). * if lower triangle is given, only S[i,j] for j<=i are used, and upper triangle is ignored. IsUnit - unit or non-unit diagonal: * if True, diagonal elements of triangular matrix are considered equal to 1.0. Actual elements stored in S are not referenced at all. * if False, diagonal stored in S is used. It is your responsibility to make sure that diagonal is non-zero. OpType - operation type: * if 0, S*x is calculated * if 1, (S^T)*x is calculated (transposition) X - array[N] which stores input vector. For performance reasons we make only quick checks - we check that array size is at least N, but we do not check for NAN's or INF's. OUTPUT PARAMETERS X - array[N], inv(op(S))*x NOTE: this function throws exception when called for non-CRS/SKS matrix. You must convert your matrix with SparseConvertToCRS/SKS() before using this function. NOTE: no assertion or tests are done during algorithm operation. It is your responsibility to provide invertible matrix to algorithm. -- ALGLIB PROJECT -- Copyright 20.01.2014 by Bochkanov Sergey *************************************************************************/ void sparsetrsv(sparsematrix* s, ae_bool isupper, ae_bool isunit, ae_int_t optype, /* Real */ ae_vector* x, ae_state *_state) { ae_int_t n; ae_int_t fst; ae_int_t lst; ae_int_t stp; ae_int_t i; ae_int_t j; ae_int_t k; double v; double vd; ae_int_t j0; ae_int_t j1; ae_int_t ri; ae_int_t ri1; ae_int_t d; ae_int_t u; ae_int_t lt; ae_int_t lt1; ae_assert(s->matrixtype==1||s->matrixtype==2, "SparseTRSV: incorrect matrix type (convert your matrix to CRS/SKS)", _state); ae_assert(optype==0||optype==1, "SparseTRSV: incorrect operation type (must be 0 or 1)", _state); ae_assert(x->cnt>=s->n, "SparseTRSV: Length(X)m==s->n, "SparseTRSV: matrix is non-square", _state); n = s->n; if( s->matrixtype==1 ) { /* * CRS format. * * Several branches for different combinations of IsUpper and OpType */ ae_assert(s->ninitialized==s->ridx.ptr.p_int[s->m], "SparseTRSV: some rows/elements of the CRS matrix were not initialized (you must initialize everything you promised to SparseCreateCRS)", _state); if( optype==0 ) { /* * No transposition. * * S*x=y with upper or lower triangular S. */ if( isupper ) { fst = n-1; lst = 0; stp = -1; } else { fst = 0; lst = n-1; stp = 1; } i = fst; while((stp>0&&i<=lst)||(stp<0&&i>=lst)) { /* * Select range of indexes to process */ if( isupper ) { j0 = s->uidx.ptr.p_int[i]; j1 = s->ridx.ptr.p_int[i+1]-1; } else { j0 = s->ridx.ptr.p_int[i]; j1 = s->didx.ptr.p_int[i]-1; } /* * Calculate X[I] */ v = 0.0; for(j=j0; j<=j1; j++) { v = v+s->vals.ptr.p_double[j]*x->ptr.p_double[s->idx.ptr.p_int[j]]; } if( !isunit ) { if( s->didx.ptr.p_int[i]==s->uidx.ptr.p_int[i] ) { vd = (double)(0); } else { vd = s->vals.ptr.p_double[s->didx.ptr.p_int[i]]; } } else { vd = 1.0; } k = saferdiv(x->ptr.p_double[i]-v, vd, &v, _state); ae_assert(k<=0, "SparseTRSV: near-overflow or division by exact zero", _state); x->ptr.p_double[i] = v; /* * Next I */ i = i+stp; } return; } if( optype==1 ) { /* * Transposition. * * (S^T)*x=y with upper or lower triangular S. 
*/ if( isupper ) { fst = 0; lst = n-1; stp = 1; } else { fst = n-1; lst = 0; stp = -1; } i = fst; while((stp>0&&i<=lst)||(stp<0&&i>=lst)) { /* * X[i] already stores A[i,i]*Y[i], the only thing left * is to divide by diagonal element. */ if( !isunit ) { if( s->didx.ptr.p_int[i]==s->uidx.ptr.p_int[i] ) { vd = (double)(0); } else { vd = s->vals.ptr.p_double[s->didx.ptr.p_int[i]]; } } else { vd = 1.0; } k = saferdiv(x->ptr.p_double[i], vd, &v, _state); ae_assert(k<=0, "SparseTRSV: near-overflow or division by exact zero", _state); x->ptr.p_double[i] = v; /* * For upper triangular case: * subtract X[i]*Ai from X[i+1:N-1] * * For lower triangular case: * subtract X[i]*Ai from X[0:i-1] * * (here Ai is I-th row of original, untransposed A). */ if( isupper ) { j0 = s->uidx.ptr.p_int[i]; j1 = s->ridx.ptr.p_int[i+1]-1; } else { j0 = s->ridx.ptr.p_int[i]; j1 = s->didx.ptr.p_int[i]-1; } v = x->ptr.p_double[i]; for(j=j0; j<=j1; j++) { k = s->idx.ptr.p_int[j]; x->ptr.p_double[k] = x->ptr.p_double[k]-s->vals.ptr.p_double[j]*v; } /* * Next I */ i = i+stp; } return; } ae_assert(ae_false, "SparseTRSV: internal error", _state); } if( s->matrixtype==2 ) { /* * SKS format */ ae_assert(s->m==s->n, "SparseTRSV: non-square SKS matrices are not supported", _state); if( (optype==0&&!isupper)||(optype==1&&isupper) ) { /* * Lower triangular op(S) (matrix itself can be upper triangular). */ for(i=0; i<=n-1; i++) { /* * Select range of indexes to process */ ri = s->ridx.ptr.p_int[i]; ri1 = s->ridx.ptr.p_int[i+1]; d = s->didx.ptr.p_int[i]; u = s->uidx.ptr.p_int[i]; if( isupper ) { lt = i-u; lt1 = ri1-u; k = u-1; } else { lt = i-d; lt1 = ri; k = d-1; } /* * Calculate X[I] */ v = 0.0; for(j=0; j<=k; j++) { v = v+s->vals.ptr.p_double[lt1+j]*x->ptr.p_double[lt+j]; } if( isunit ) { vd = (double)(1); } else { vd = s->vals.ptr.p_double[ri+d]; } k = saferdiv(x->ptr.p_double[i]-v, vd, &v, _state); ae_assert(k<=0, "SparseTRSV: near-overflow or division by exact zero", _state); x->ptr.p_double[i] = v; } return; } if( (optype==1&&!isupper)||(optype==0&&isupper) ) { /* * Upper triangular op(S) (matrix itself can be lower triangular). */ for(i=n-1; i>=0; i--) { ri = s->ridx.ptr.p_int[i]; ri1 = s->ridx.ptr.p_int[i+1]; d = s->didx.ptr.p_int[i]; u = s->uidx.ptr.p_int[i]; /* * X[i] already stores A[i,i]*Y[i], the only thing left * is to divide by diagonal element. */ if( isunit ) { vd = (double)(1); } else { vd = s->vals.ptr.p_double[ri+d]; } k = saferdiv(x->ptr.p_double[i], vd, &v, _state); ae_assert(k<=0, "SparseTRSV: near-overflow or division by exact zero", _state); x->ptr.p_double[i] = v; /* * Subtract product of X[i] and I-th column of "effective" A from * unprocessed variables. */ v = x->ptr.p_double[i]; if( isupper ) { lt = i-u; lt1 = ri1-u; k = u-1; } else { lt = i-d; lt1 = ri; k = d-1; } for(j=0; j<=k; j++) { x->ptr.p_double[lt+j] = x->ptr.p_double[lt+j]-v*s->vals.ptr.p_double[lt1+j]; } } return; } ae_assert(ae_false, "SparseTRSV: internal error", _state); } ae_assert(ae_false, "SparseTRSV: internal error", _state); } /************************************************************************* This procedure resizes Hash-Table matrix. It can be called when you have deleted too many elements from the matrix, and you want to free unneeded memory. 
-- ALGLIB PROJECT -- Copyright 14.10.2011 by Bochkanov Sergey *************************************************************************/ void sparseresizematrix(sparsematrix* s, ae_state *_state) { ae_frame _frame_block; ae_int_t k; ae_int_t k1; ae_int_t i; ae_vector tvals; ae_vector tidx; ae_frame_make(_state, &_frame_block); ae_vector_init(&tvals, 0, DT_REAL, _state); ae_vector_init(&tidx, 0, DT_INT, _state); ae_assert(s->matrixtype==0, "SparseResizeMatrix: incorrect matrix type", _state); /* * Initialization for length and number of non-null elementd */ k = s->tablesize; k1 = 0; /* * Calculating number of non-null elements */ for(i=0; i<=k-1; i++) { if( s->idx.ptr.p_int[2*i]>=0 ) { k1 = k1+1; } } /* * Initialization value for free space */ s->tablesize = ae_round(k1/sparse_desiredloadfactor*sparse_growfactor+sparse_additional, _state); s->nfree = s->tablesize-k1; ae_vector_set_length(&tvals, s->tablesize, _state); ae_vector_set_length(&tidx, 2*s->tablesize, _state); ae_swap_vectors(&s->vals, &tvals); ae_swap_vectors(&s->idx, &tidx); for(i=0; i<=s->tablesize-1; i++) { s->idx.ptr.p_int[2*i] = -1; } for(i=0; i<=k-1; i++) { if( tidx.ptr.p_int[2*i]>=0 ) { sparseset(s, tidx.ptr.p_int[2*i], tidx.ptr.p_int[2*i+1], tvals.ptr.p_double[i], _state); } } ae_frame_leave(_state); } /************************************************************************* This function return average length of chain at hash-table. -- ALGLIB PROJECT -- Copyright 14.10.2011 by Bochkanov Sergey *************************************************************************/ double sparsegetaveragelengthofchain(sparsematrix* s, ae_state *_state) { ae_int_t nchains; ae_int_t talc; ae_int_t l; ae_int_t i; ae_int_t ind0; ae_int_t ind1; ae_int_t hashcode; double result; /* * If matrix represent in CRS then return zero and exit */ if( s->matrixtype!=0 ) { result = (double)(0); return result; } nchains = 0; talc = 0; l = s->tablesize; for(i=0; i<=l-1; i++) { ind0 = 2*i; if( s->idx.ptr.p_int[ind0]!=-1 ) { nchains = nchains+1; hashcode = sparse_hash(s->idx.ptr.p_int[ind0], s->idx.ptr.p_int[ind0+1], l, _state); for(;;) { talc = talc+1; ind1 = 2*hashcode; if( s->idx.ptr.p_int[ind0]==s->idx.ptr.p_int[ind1]&&s->idx.ptr.p_int[ind0+1]==s->idx.ptr.p_int[ind1+1] ) { break; } hashcode = (hashcode+1)%l; } } } if( nchains==0 ) { result = (double)(0); } else { result = (double)talc/(double)nchains; } return result; } /************************************************************************* This function is used to enumerate all elements of the sparse matrix. Before first call user initializes T0 and T1 counters by zero. These counters are used to remember current position in a matrix; after each call they are updated by the function. Subsequent calls to this function return non-zero elements of the sparse matrix, one by one. If you enumerate CRS matrix, matrix is traversed from left to right, from top to bottom. In case you enumerate matrix stored as Hash table, elements are returned in random order. EXAMPLE > T0=0 > T1=0 > while SparseEnumerate(S,T0,T1,I,J,V) do > ....do something with I,J,V INPUT PARAMETERS S - sparse M*N matrix in Hash-Table or CRS representation. 
T0 - internal counter T1 - internal counter OUTPUT PARAMETERS T0 - new value of the internal counter T1 - new value of the internal counter I - row index of non-zero element, 0<=I<M J - column index of non-zero element, 0<=J<N V - value of the element RESULT True in case of success (next non-zero element was retrieved) False in case all non-zero elements were enumerated *************************************************************************/ ae_bool sparseenumerate(sparsematrix* s, ae_int_t* t0, ae_int_t* t1, ae_int_t* i, ae_int_t* j, double* v, ae_state *_state) { ae_int_t sz; ae_int_t i0; ae_bool result; if( *t0<0||(s->matrixtype!=0&&*t1<0) ) { /* * Incorrect T0/T1, terminate enumeration */ result = ae_false; return result; } if( s->matrixtype==0 ) { /* * Hash-table matrix */ sz = s->tablesize; for(i0=*t0; i0<=sz-1; i0++) { if( s->idx.ptr.p_int[2*i0]==-1||s->idx.ptr.p_int[2*i0]==-2 ) { continue; } else { *i = s->idx.ptr.p_int[2*i0]; *j = s->idx.ptr.p_int[2*i0+1]; *v = s->vals.ptr.p_double[i0]; *t0 = i0+1; result = ae_true; return result; } } *t0 = 0; *t1 = 0; result = ae_false; return result; } if( s->matrixtype==1 ) { /* * CRS matrix */ ae_assert(s->ninitialized==s->ridx.ptr.p_int[s->m], "SparseEnumerate: some rows/elements of the CRS matrix were not initialized (you must initialize everything you promised to SparseCreateCRS)", _state); if( *t0>=s->ninitialized ) { *t0 = 0; *t1 = 0; result = ae_false; return result; } while(*t0>s->ridx.ptr.p_int[*t1+1]-1&&*t1<s->m) { *t1 = *t1+1; } *i = *t1; *j = s->idx.ptr.p_int[*t0]; *v = s->vals.ptr.p_double[*t0]; *t0 = *t0+1; result = ae_true; return result; } if( s->matrixtype==2 ) { /* * SKS matrix: * * T0 stores current offset in Vals[] array * * T1 stores index of the diagonal block */ ae_assert(s->m==s->n, "SparseEnumerate: non-square SKS matrices are not supported", _state); if( *t0>=s->ridx.ptr.p_int[s->m] ) { *t0 = 0; *t1 = 0; result = ae_false; return result; } while(*t0>s->ridx.ptr.p_int[*t1+1]-1&&*t1<s->m) { *t1 = *t1+1; } i0 = *t0-s->ridx.ptr.p_int[*t1]; if( i0<s->didx.ptr.p_int[*t1]+1 ) { /* * subdiagonal or diagonal element, row index is T1. */ *i = *t1; *j = *t1-s->didx.ptr.p_int[*t1]+i0; } else { /* * superdiagonal element, column index is T1. */ *i = *t1-(s->ridx.ptr.p_int[*t1+1]-(*t0)); *j = *t1; } *v = s->vals.ptr.p_double[*t0]; *t0 = *t0+1; result = ae_true; return result; } ae_assert(ae_false, "SparseEnumerate: unexpected matrix type", _state); return result; } /************************************************************************* This function rewrites existing (non-zero) element. It returns True if element exists or False, when it is called for non-existing (zero) element. This function works with any kind of the matrix. The purpose of this function is to provide convenient thread-safe way to modify sparse matrix. Such modification (already existing element is rewritten) is guaranteed to be thread-safe without any synchronization, as long as different threads modify different elements. INPUT PARAMETERS S - sparse M*N matrix in any kind of representation (Hash, SKS, CRS).
I - row index of non-zero element to modify, 0<=I<M J - column index of non-zero element to modify, 0<=J<N V - new value of the element RESULT True in case when element exists and was rewritten False in case when element doesn't exist *************************************************************************/ ae_bool sparserewriteexisting(sparsematrix* s, ae_int_t i, ae_int_t j, double v, ae_state *_state) { ae_int_t hashcode; ae_int_t k; ae_int_t k0; ae_int_t k1; ae_bool result; ae_assert(0<=i&&i<s->m, "SparseRewriteExisting: invalid argument I(either I<0 or I>=S.M)", _state); ae_assert(0<=j&&j<s->n, "SparseRewriteExisting: invalid argument J(either J<0 or J>=S.N)", _state); ae_assert(ae_isfinite(v, _state), "SparseRewriteExisting: invalid argument V(either V is infinite or V is NaN)", _state); result = ae_false; /* * Hash-table matrix */ if( s->matrixtype==0 ) { k = s->tablesize; hashcode = sparse_hash(i, j, k, _state); for(;;) { if( s->idx.ptr.p_int[2*hashcode]==-1 ) { return result; } if( s->idx.ptr.p_int[2*hashcode]==i&&s->idx.ptr.p_int[2*hashcode+1]==j ) { s->vals.ptr.p_double[hashcode] = v; result = ae_true; return result; } hashcode = (hashcode+1)%k; } } /* * CRS matrix */ if( s->matrixtype==1 ) { ae_assert(s->ninitialized==s->ridx.ptr.p_int[s->m], "SparseRewriteExisting: some rows/elements of the CRS matrix were not initialized (you must initialize everything you promised to SparseCreateCRS)", _state); k0 = s->ridx.ptr.p_int[i]; k1 = s->ridx.ptr.p_int[i+1]-1; while(k0<=k1) { k = (k0+k1)/2; if( s->idx.ptr.p_int[k]==j ) { s->vals.ptr.p_double[k] = v; result = ae_true; return result; } if( s->idx.ptr.p_int[k]<j ) { k0 = k+1; } else { k1 = k-1; } } return result; } /* * SKS matrix */ if( s->matrixtype==2 ) { ae_assert(s->m==s->n, "SparseRewriteExisting: non-square SKS matrix not supported", _state); if( i==j ) { /* * Rewrite diagonal element */ result = ae_true; s->vals.ptr.p_double[s->ridx.ptr.p_int[i]+s->didx.ptr.p_int[i]] = v; return result; } if( j<i ) { /* * Return subdiagonal element at I-th "skyline block" */ k = s->didx.ptr.p_int[i]; if( i-j<=k ) { s->vals.ptr.p_double[s->ridx.ptr.p_int[i]+k+j-i] = v; result = ae_true; } } else { /* * Return superdiagonal element at J-th "skyline block" */ k = s->uidx.ptr.p_int[j]; if( j-i<=k ) { s->vals.ptr.p_double[s->ridx.ptr.p_int[j+1]-(j-i)] = v; result = ae_true; } } return result; } return result; } /************************************************************************* This function returns I-th row of the sparse matrix. Matrix must be stored in CRS or SKS format. INPUT PARAMETERS: S - sparse M*N matrix in CRS format I - row index, 0<=I<M IRow - output buffer, can be preallocated OUTPUT PARAMETERS IRow - array[N], I-th row NOTE: this function throws exception when called for non-CRS/SKS matrix. You must convert your matrix with SparseConvertToCRS/SKS() before using this function. *************************************************************************/ void sparsegetrow(sparsematrix* s, ae_int_t i, /* Real */ ae_vector* irow, ae_state *_state) { ae_int_t i0; ae_int_t j0; ae_int_t j1; ae_int_t j; ae_int_t upperprofile; ae_assert(s->matrixtype==1||s->matrixtype==2, "SparseGetRow: S must be CRS/SKS-based matrix", _state); ae_assert(i>=0&&i<s->m, "SparseGetRow: I<0 or I>=M", _state); /* * Prepare output buffer */ rvectorsetlengthatleast(irow, s->n, _state); for(i0=0; i0<=s->n-1; i0++) { irow->ptr.p_double[i0] = (double)(0); } /* * Output */ if( s->matrixtype==1 ) { for(i0=s->ridx.ptr.p_int[i]; i0<=s->ridx.ptr.p_int[i+1]-1; i0++) { irow->ptr.p_double[s->idx.ptr.p_int[i0]] = s->vals.ptr.p_double[i0]; } return; } if( s->matrixtype==2 ) { /* * Copy subdiagonal and diagonal parts */ ae_assert(s->n==s->m, "SparseGetRow: non-square SKS matrices are not supported", _state); j0 = i-s->didx.ptr.p_int[i]; i0 = -j0+s->ridx.ptr.p_int[i]; for(j=j0; j<=i; j++) { irow->ptr.p_double[j] = s->vals.ptr.p_double[j+i0]; } /* * Copy superdiagonal part */ upperprofile = s->uidx.ptr.p_int[s->n]; j0 = i+1; j1 = ae_minint(s->n-1, i+upperprofile, _state); for(j=j0; j<=j1; j++) { if( j-i<=s->uidx.ptr.p_int[j] ) { irow->ptr.p_double[j] = s->vals.ptr.p_double[s->ridx.ptr.p_int[j+1]-(j-i)]; } } return; } } /************************************************************************* This function returns I-th row of the sparse matrix IN COMPRESSED FORMAT - only non-zero elements are returned (with their indexes). Matrix must be stored in CRS or SKS format.
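EXAMPLE (a minimal C++ sketch; it assumes the sparsegetcompressedrow wrapper of the ALGLIB C++ interface, with illustrative names):

    #include "linalg.h"
    using namespace alglib;

    sparsematrix s;
    sparsecreate(3, 3, s);
    sparseset(s, 1, 0, 5.0);
    sparseset(s, 1, 2, 7.0);
    sparseconverttocrs(s);

    integer_1d_array colidx;
    real_1d_array vals;
    ae_int_t nzcnt;
    sparsegetcompressedrow(s, 1, colidx, vals, nzcnt);
    // row 1 has nzcnt==2 entries: colidx = [0,2], vals = [5.0,7.0]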
INPUT PARAMETERS: S - sparse M*N matrix in CRS format I - row index, 0<=I<M ColIdx - output buffer for column indexes, can be preallocated Vals - output buffer for values, can be preallocated OUTPUT PARAMETERS ColIdx - array[NZCnt], column indexes of non-zero elements, sorted by ascending Vals - array[NZCnt], values NZCnt - number of non-zero elements returned NOTE: this function throws exception when called for non-CRS/SKS matrix. You must convert your matrix with SparseConvertToCRS/SKS() before using this function. *************************************************************************/ void sparsegetcompressedrow(sparsematrix* s, ae_int_t i, /* Integer */ ae_vector* colidx, /* Real */ ae_vector* vals, ae_int_t* nzcnt, ae_state *_state) { ae_int_t k; ae_int_t k0; ae_int_t j; ae_int_t j0; ae_int_t j1; ae_int_t i0; ae_int_t upperprofile; ae_assert(s->matrixtype==1||s->matrixtype==2, "SparseGetRow: S must be CRS/SKS-based matrix", _state); ae_assert(i>=0&&i<s->m, "SparseGetRow: I<0 or I>=M", _state); /* * Initialize NZCnt */ *nzcnt = 0; /* * CRS matrix - just copy data */ if( s->matrixtype==1 ) { *nzcnt = s->ridx.ptr.p_int[i+1]-s->ridx.ptr.p_int[i]; ivectorsetlengthatleast(colidx, *nzcnt, _state); rvectorsetlengthatleast(vals, *nzcnt, _state); k0 = s->ridx.ptr.p_int[i]; for(k=0; k<=*nzcnt-1; k++) { colidx->ptr.p_int[k] = s->idx.ptr.p_int[k0+k]; vals->ptr.p_double[k] = s->vals.ptr.p_double[k0+k]; } return; } /* * SKS matrix - a bit more complex sequence */ if( s->matrixtype==2 ) { ae_assert(s->n==s->m, "SparseGetCompressedRow: non-square SKS matrices are not supported", _state); /* * Allocate enough place for storage */ upperprofile = s->uidx.ptr.p_int[s->n]; ivectorsetlengthatleast(colidx, s->didx.ptr.p_int[i]+1+upperprofile, _state); rvectorsetlengthatleast(vals, s->didx.ptr.p_int[i]+1+upperprofile, _state); /* * Copy subdiagonal and diagonal parts */ j0 = i-s->didx.ptr.p_int[i]; i0 = -j0+s->ridx.ptr.p_int[i]; for(j=j0; j<=i; j++) { colidx->ptr.p_int[*nzcnt] = j; vals->ptr.p_double[*nzcnt] = s->vals.ptr.p_double[j+i0]; *nzcnt = *nzcnt+1; } /* * Copy superdiagonal part */ j0 = i+1; j1 = ae_minint(s->n-1, i+upperprofile, _state); for(j=j0; j<=j1; j++) { if( j-i<=s->uidx.ptr.p_int[j] ) { colidx->ptr.p_int[*nzcnt] = j; vals->ptr.p_double[*nzcnt] = s->vals.ptr.p_double[s->ridx.ptr.p_int[j+1]-(j-i)]; *nzcnt = *nzcnt+1; } } return; } } /************************************************************************* This function performs efficient in-place transpose of SKS matrix. No additional memory is allocated during transposition. This function supports only skyline storage format (SKS). INPUT PARAMETERS S - sparse matrix in SKS format. OUTPUT PARAMETERS S - sparse matrix, transposed. -- ALGLIB PROJECT -- Copyright 16.01.2014 by Bochkanov Sergey *************************************************************************/ void sparsetransposesks(sparsematrix* s, ae_state *_state) { ae_int_t n; ae_int_t d; ae_int_t u; ae_int_t i; ae_int_t k; ae_int_t t0; ae_int_t t1; double v; ae_assert(s->matrixtype==2, "SparseTransposeSKS: only SKS matrices are supported", _state); ae_assert(s->m==s->n, "SparseTransposeSKS: non-square SKS matrices are not supported", _state); n = s->n; for(i=1; i<=n-1; i++) { d = s->didx.ptr.p_int[i]; u = s->uidx.ptr.p_int[i]; k = s->uidx.ptr.p_int[i]; s->uidx.ptr.p_int[i] = s->didx.ptr.p_int[i]; s->didx.ptr.p_int[i] = k; if( d==u ) { /* * Upper skyline height equal to lower skyline height, * simple exchange is needed for transposition */ t0 = s->ridx.ptr.p_int[i]; for(k=0; k<=d-1; k++) { v = s->vals.ptr.p_double[t0+k]; s->vals.ptr.p_double[t0+k] = s->vals.ptr.p_double[t0+d+1+k]; s->vals.ptr.p_double[t0+d+1+k] = v; } } if( d>u ) { /* * Upper skyline height is less than lower skyline height. * * Transposition becomes a bit tricky: we have to rearrange * "L0 L1 D U" to "U D L0 L1", where |L0|=|U|=u, |L1|=d-u.
* * In order to do this we perform a sequence of swaps and * in-place reversals: * * swap(L0,U) => "U L1 D L0" * * reverse("L1 D L0") => "U L0~ D L1~" (where X~ is a reverse of X) * * reverse("L0~ D") => "U D L0 L1~" * * reverse("L1") => "U D L0 L1" */ t0 = s->ridx.ptr.p_int[i]; t1 = s->ridx.ptr.p_int[i]+d+1; for(k=0; k<=u-1; k++) { v = s->vals.ptr.p_double[t0+k]; s->vals.ptr.p_double[t0+k] = s->vals.ptr.p_double[t1+k]; s->vals.ptr.p_double[t1+k] = v; } t0 = s->ridx.ptr.p_int[i]+u; t1 = s->ridx.ptr.p_int[i+1]-1; while(t1>t0) { v = s->vals.ptr.p_double[t0]; s->vals.ptr.p_double[t0] = s->vals.ptr.p_double[t1]; s->vals.ptr.p_double[t1] = v; t0 = t0+1; t1 = t1-1; } t0 = s->ridx.ptr.p_int[i]+u; t1 = s->ridx.ptr.p_int[i]+u+u; while(t1>t0) { v = s->vals.ptr.p_double[t0]; s->vals.ptr.p_double[t0] = s->vals.ptr.p_double[t1]; s->vals.ptr.p_double[t1] = v; t0 = t0+1; t1 = t1-1; } t0 = s->ridx.ptr.p_int[i+1]-(d-u); t1 = s->ridx.ptr.p_int[i+1]-1; while(t1>t0) { v = s->vals.ptr.p_double[t0]; s->vals.ptr.p_double[t0] = s->vals.ptr.p_double[t1]; s->vals.ptr.p_double[t1] = v; t0 = t0+1; t1 = t1-1; } } if( d<u ) { /* * Upper skyline height is greater than lower skyline height. * * Transposition becomes a bit tricky: we have to rearrange * "L D U0 U1" to "U0 U1 D L", where |U1|=|L|=d, |U0|=u-d. * * In order to do this we perform a sequence of swaps and * in-place reversals: * * swap(L,U1) => "U1 D U0 L" * * reverse("U1 D U0") => "U0~ D U1~ L" (where X~ is a reverse of X) * * reverse("U0~") => "U0 D U1~ L" * * reverse("D U1~") => "U0 U1 D L" */ t0 = s->ridx.ptr.p_int[i]; t1 = s->ridx.ptr.p_int[i+1]-d; for(k=0; k<=d-1; k++) { v = s->vals.ptr.p_double[t0+k]; s->vals.ptr.p_double[t0+k] = s->vals.ptr.p_double[t1+k]; s->vals.ptr.p_double[t1+k] = v; } t0 = s->ridx.ptr.p_int[i]; t1 = s->ridx.ptr.p_int[i]+u; while(t1>t0) { v = s->vals.ptr.p_double[t0]; s->vals.ptr.p_double[t0] = s->vals.ptr.p_double[t1]; s->vals.ptr.p_double[t1] = v; t0 = t0+1; t1 = t1-1; } t0 = s->ridx.ptr.p_int[i]; t1 = s->ridx.ptr.p_int[i]+u-d-1; while(t1>t0) { v = s->vals.ptr.p_double[t0]; s->vals.ptr.p_double[t0] = s->vals.ptr.p_double[t1]; s->vals.ptr.p_double[t1] = v; t0 = t0+1; t1 = t1-1; } t0 = s->ridx.ptr.p_int[i]+u-d; t1 = s->ridx.ptr.p_int[i+1]-d-1; while(t1>t0) { v = s->vals.ptr.p_double[t0]; s->vals.ptr.p_double[t0] = s->vals.ptr.p_double[t1]; s->vals.ptr.p_double[t1] = v; t0 = t0+1; t1 = t1-1; } } } k = s->uidx.ptr.p_int[n]; s->uidx.ptr.p_int[n] = s->didx.ptr.p_int[n]; s->didx.ptr.p_int[n] = k; } /************************************************************************* This function performs in-place conversion to desired sparse storage format. INPUT PARAMETERS S0 - sparse matrix in any format. Fmt - desired storage format of the output, as returned by SparseGetMatrixType() function: * 0 for hash-based storage * 1 for CRS * 2 for SKS OUTPUT PARAMETERS S0 - sparse matrix in requested format. NOTE: in-place conversion wastes a lot of memory which is used to store temporaries. If you perform a lot of repeated conversions, we recommend to use out-of-place buffered conversion functions, like SparseCopyToBuf(), which can reuse already allocated memory.
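EXAMPLE (a minimal C++ sketch; it assumes the sparseconvertto/sparsegetmatrixtype wrappers of the ALGLIB C++ interface; the format codes are those listed above):

    #include "linalg.h"
    using namespace alglib;

    sparsematrix s;
    sparsecreate(4, 4, s);                 // new matrices start in hash-table format (0)
    sparseset(s, 0, 0, 1.0);
    sparseset(s, 3, 1, 2.0);

    sparseconvertto(s, 1);                 // in-place conversion to CRS
    ae_int_t fmt = sparsegetmatrixtype(s); // fmt is now 1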
-- ALGLIB PROJECT -- Copyright 16.01.2014 by Bochkanov Sergey *************************************************************************/ void sparseconvertto(sparsematrix* s0, ae_int_t fmt, ae_state *_state) { ae_assert((fmt==0||fmt==1)||fmt==2, "SparseConvertTo: invalid fmt parameter", _state); if( fmt==0 ) { sparseconverttohash(s0, _state); return; } if( fmt==1 ) { sparseconverttocrs(s0, _state); return; } if( fmt==2 ) { sparseconverttosks(s0, _state); return; } ae_assert(ae_false, "SparseConvertTo: invalid matrix type", _state); } /************************************************************************* This function performs out-of-place conversion to desired sparse storage format. S0 is copied to S1 and converted on-the-fly. Memory allocated in S1 is reused to maximum extent possible. INPUT PARAMETERS S0 - sparse matrix in any format. Fmt - desired storage format of the output, as returned by SparseGetMatrixType() function: * 0 for hash-based storage * 1 for CRS * 2 for SKS OUTPUT PARAMETERS S1 - sparse matrix in requested format. -- ALGLIB PROJECT -- Copyright 16.01.2014 by Bochkanov Sergey *************************************************************************/ void sparsecopytobuf(sparsematrix* s0, ae_int_t fmt, sparsematrix* s1, ae_state *_state) { ae_assert((fmt==0||fmt==1)||fmt==2, "SparseCopyToBuf: invalid fmt parameter", _state); if( fmt==0 ) { sparsecopytohashbuf(s0, s1, _state); return; } if( fmt==1 ) { sparsecopytocrsbuf(s0, s1, _state); return; } if( fmt==2 ) { sparsecopytosksbuf(s0, s1, _state); return; } ae_assert(ae_false, "SparseCopyToBuf: invalid matrix type", _state); } /************************************************************************* This function performs in-place conversion to Hash table storage. INPUT PARAMETERS S - sparse matrix in CRS format. OUTPUT PARAMETERS S - sparse matrix in Hash table format. NOTE: this function has no effect when called with matrix which is already in Hash table mode. NOTE: in-place conversion involves allocation of temporary arrays. If you perform a lot of repeated in- place conversions, it may lead to memory fragmentation. Consider using out-of-place SparseCopyToHashBuf() function in this case. 
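EXAMPLE (a minimal C++ sketch; it assumes the sparseconverttohash wrapper of the ALGLIB C++ interface; hash-table storage is the format which allows further editing with sparseset()):

    #include "linalg.h"
    using namespace alglib;

    sparsematrix s;
    sparsecreate(2, 2, s);
    sparseset(s, 0, 0, 1.0);
    sparseconverttocrs(s);      // CRS matrix, e.g. prepared for linear algebra

    sparseconverttohash(s);     // back to editable hash-table storage
    sparseset(s, 1, 1, 2.0);    // structural modification is possible again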
-- ALGLIB PROJECT -- Copyright 20.07.2012 by Bochkanov Sergey *************************************************************************/ void sparseconverttohash(sparsematrix* s, ae_state *_state) { ae_frame _frame_block; ae_vector tidx; ae_vector tridx; ae_vector tdidx; ae_vector tuidx; ae_vector tvals; ae_int_t n; ae_int_t m; ae_int_t offs0; ae_int_t i; ae_int_t j; ae_int_t k; ae_frame_make(_state, &_frame_block); ae_vector_init(&tidx, 0, DT_INT, _state); ae_vector_init(&tridx, 0, DT_INT, _state); ae_vector_init(&tdidx, 0, DT_INT, _state); ae_vector_init(&tuidx, 0, DT_INT, _state); ae_vector_init(&tvals, 0, DT_REAL, _state); ae_assert((s->matrixtype==0||s->matrixtype==1)||s->matrixtype==2, "SparseConvertToHash: invalid matrix type", _state); if( s->matrixtype==0 ) { /* * Already in Hash mode */ ae_frame_leave(_state); return; } if( s->matrixtype==1 ) { /* * From CRS to Hash */ s->matrixtype = 0; m = s->m; n = s->n; ae_swap_vectors(&s->idx, &tidx); ae_swap_vectors(&s->ridx, &tridx); ae_swap_vectors(&s->vals, &tvals); sparsecreatebuf(m, n, tridx.ptr.p_int[m], s, _state); for(i=0; i<=m-1; i++) { for(j=tridx.ptr.p_int[i]; j<=tridx.ptr.p_int[i+1]-1; j++) { sparseset(s, i, tidx.ptr.p_int[j], tvals.ptr.p_double[j], _state); } } ae_frame_leave(_state); return; } if( s->matrixtype==2 ) { /* * From SKS to Hash */ s->matrixtype = 0; m = s->m; n = s->n; ae_swap_vectors(&s->ridx, &tridx); ae_swap_vectors(&s->didx, &tdidx); ae_swap_vectors(&s->uidx, &tuidx); ae_swap_vectors(&s->vals, &tvals); sparsecreatebuf(m, n, tridx.ptr.p_int[m], s, _state); for(i=0; i<=m-1; i++) { /* * copy subdiagonal and diagonal parts of I-th block */ offs0 = tridx.ptr.p_int[i]; k = tdidx.ptr.p_int[i]+1; for(j=0; j<=k-1; j++) { sparseset(s, i, i-tdidx.ptr.p_int[i]+j, tvals.ptr.p_double[offs0+j], _state); } /* * Copy superdiagonal part of I-th block */ offs0 = tridx.ptr.p_int[i]+tdidx.ptr.p_int[i]+1; k = tuidx.ptr.p_int[i]; for(j=0; j<=k-1; j++) { sparseset(s, i-k+j, i, tvals.ptr.p_double[offs0+j], _state); } } ae_frame_leave(_state); return; } ae_assert(ae_false, "SparseConvertToHash: invalid matrix type", _state); ae_frame_leave(_state); } /************************************************************************* This function performs out-of-place conversion to Hash table storage format. S0 is copied to S1 and converted on-the-fly. INPUT PARAMETERS S0 - sparse matrix in any format. OUTPUT PARAMETERS S1 - sparse matrix in Hash table format. NOTE: if S0 is stored as Hash-table, it is just copied without conversion. NOTE: this function de-allocates memory occupied by S1 before starting conversion. If you perform a lot of repeated conversions, it may lead to memory fragmentation. In this case we recommend you to use SparseCopyToHashBuf() function which re-uses memory in S1 as much as possible. -- ALGLIB PROJECT -- Copyright 20.07.2012 by Bochkanov Sergey *************************************************************************/ void sparsecopytohash(sparsematrix* s0, sparsematrix* s1, ae_state *_state) { _sparsematrix_clear(s1); ae_assert((s0->matrixtype==0||s0->matrixtype==1)||s0->matrixtype==2, "SparseCopyToHash: invalid matrix type", _state); sparsecopytohashbuf(s0, s1, _state); } /************************************************************************* This function performs out-of-place conversion to Hash table storage format. S0 is copied to S1 and converted on-the-fly. Memory allocated in S1 is reused to maximum extent possible. INPUT PARAMETERS S0 - sparse matrix in any format. 
OUTPUT PARAMETERS S1 - sparse matrix in Hash table format. NOTE: if S0 is stored as Hash-table, it is just copied without conversion. -- ALGLIB PROJECT -- Copyright 20.07.2012 by Bochkanov Sergey *************************************************************************/ void sparsecopytohashbuf(sparsematrix* s0, sparsematrix* s1, ae_state *_state) { double val; ae_int_t t0; ae_int_t t1; ae_int_t i; ae_int_t j; ae_assert((s0->matrixtype==0||s0->matrixtype==1)||s0->matrixtype==2, "SparseCopyToHashBuf: invalid matrix type", _state); if( s0->matrixtype==0 ) { /* * Already hash, just copy */ sparsecopybuf(s0, s1, _state); return; } if( s0->matrixtype==1 ) { /* * CRS storage */ t0 = 0; t1 = 0; sparsecreatebuf(s0->m, s0->n, s0->ridx.ptr.p_int[s0->m], s1, _state); while(sparseenumerate(s0, &t0, &t1, &i, &j, &val, _state)) { sparseset(s1, i, j, val, _state); } return; } if( s0->matrixtype==2 ) { /* * SKS storage */ t0 = 0; t1 = 0; sparsecreatebuf(s0->m, s0->n, s0->ridx.ptr.p_int[s0->m], s1, _state); while(sparseenumerate(s0, &t0, &t1, &i, &j, &val, _state)) { sparseset(s1, i, j, val, _state); } return; } ae_assert(ae_false, "SparseCopyToHashBuf: invalid matrix type", _state); } /************************************************************************* This function converts matrix to CRS format. Some algorithms (linear algebra ones, for example) require matrices in CRS format. This function allows to perform in-place conversion. INPUT PARAMETERS S - sparse M*N matrix in any format OUTPUT PARAMETERS S - matrix in CRS format NOTE: this function has no effect when called with matrix which is already in CRS mode. NOTE: this function allocates temporary memory to store a copy of the matrix. If you perform a lot of repeated conversions, we recommend you to use SparseCopyToCRSBuf() function, which can reuse previously allocated memory. -- ALGLIB PROJECT -- Copyright 14.10.2011 by Bochkanov Sergey *************************************************************************/ void sparseconverttocrs(sparsematrix* s, ae_state *_state) { ae_frame _frame_block; ae_int_t m; ae_int_t i; ae_int_t j; ae_vector tvals; ae_vector tidx; ae_vector temp; ae_vector tridx; ae_int_t nonne; ae_int_t k; ae_int_t offs0; ae_int_t offs1; ae_frame_make(_state, &_frame_block); ae_vector_init(&tvals, 0, DT_REAL, _state); ae_vector_init(&tidx, 0, DT_INT, _state); ae_vector_init(&temp, 0, DT_INT, _state); ae_vector_init(&tridx, 0, DT_INT, _state); m = s->m; if( s->matrixtype==0 ) { /* * From Hash-table to CRS. * First, create local copy of the hash table. */ s->matrixtype = 1; k = s->tablesize; ae_swap_vectors(&s->vals, &tvals); ae_swap_vectors(&s->idx, &tidx); /* * Fill RIdx by number of elements per row: * RIdx[I+1] stores number of elements in I-th row. * * Convert RIdx from row sizes to row offsets. * Set NInitialized */ nonne = 0; ivectorsetlengthatleast(&s->ridx, s->m+1, _state); for(i=0; i<=s->m; i++) { s->ridx.ptr.p_int[i] = 0; } for(i=0; i<=k-1; i++) { if( tidx.ptr.p_int[2*i]>=0 ) { s->ridx.ptr.p_int[tidx.ptr.p_int[2*i]+1] = s->ridx.ptr.p_int[tidx.ptr.p_int[2*i]+1]+1; nonne = nonne+1; } } for(i=0; i<=s->m-1; i++) { s->ridx.ptr.p_int[i+1] = s->ridx.ptr.p_int[i+1]+s->ridx.ptr.p_int[i]; } s->ninitialized = s->ridx.ptr.p_int[s->m]; /* * Allocate memory and move elements to Vals/Idx. * Initially, elements are sorted by rows, but unsorted within row. * After initial insertion we sort elements within row. 
*/ ae_vector_set_length(&temp, s->m, _state); for(i=0; i<=s->m-1; i++) { temp.ptr.p_int[i] = 0; } rvectorsetlengthatleast(&s->vals, nonne, _state); ivectorsetlengthatleast(&s->idx, nonne, _state); for(i=0; i<=k-1; i++) { if( tidx.ptr.p_int[2*i]>=0 ) { s->vals.ptr.p_double[s->ridx.ptr.p_int[tidx.ptr.p_int[2*i]]+temp.ptr.p_int[tidx.ptr.p_int[2*i]]] = tvals.ptr.p_double[i]; s->idx.ptr.p_int[s->ridx.ptr.p_int[tidx.ptr.p_int[2*i]]+temp.ptr.p_int[tidx.ptr.p_int[2*i]]] = tidx.ptr.p_int[2*i+1]; temp.ptr.p_int[tidx.ptr.p_int[2*i]] = temp.ptr.p_int[tidx.ptr.p_int[2*i]]+1; } } for(i=0; i<=s->m-1; i++) { tagsortmiddleir(&s->idx, &s->vals, s->ridx.ptr.p_int[i], s->ridx.ptr.p_int[i+1]-s->ridx.ptr.p_int[i], _state); } /* * Initialization 'S.UIdx' and 'S.DIdx' */ sparse_sparseinitduidx(s, _state); ae_frame_leave(_state); return; } if( s->matrixtype==1 ) { /* * Already CRS */ ae_frame_leave(_state); return; } if( s->matrixtype==2 ) { ae_assert(s->m==s->n, "SparseConvertToCRS: non-square SKS matrices are not supported", _state); /* * From SKS to CRS. * * First, create local copy of the SKS matrix (Vals, * Idx, RIdx are stored; DIdx/UIdx for some time are * left in the SparseMatrix structure). */ s->matrixtype = 1; ae_swap_vectors(&s->vals, &tvals); ae_swap_vectors(&s->idx, &tidx); ae_swap_vectors(&s->ridx, &tridx); /* * Fill RIdx by number of elements per row: * RIdx[I+1] stores number of elements in I-th row. * * Convert RIdx from row sizes to row offsets. * Set NInitialized */ ivectorsetlengthatleast(&s->ridx, m+1, _state); s->ridx.ptr.p_int[0] = 0; for(i=1; i<=m; i++) { s->ridx.ptr.p_int[i] = 1; } nonne = 0; for(i=0; i<=m-1; i++) { s->ridx.ptr.p_int[i+1] = s->didx.ptr.p_int[i]+s->ridx.ptr.p_int[i+1]; for(j=i-s->uidx.ptr.p_int[i]; j<=i-1; j++) { s->ridx.ptr.p_int[j+1] = s->ridx.ptr.p_int[j+1]+1; } nonne = nonne+s->didx.ptr.p_int[i]+1+s->uidx.ptr.p_int[i]; } for(i=0; i<=s->m-1; i++) { s->ridx.ptr.p_int[i+1] = s->ridx.ptr.p_int[i+1]+s->ridx.ptr.p_int[i]; } s->ninitialized = s->ridx.ptr.p_int[s->m]; /* * Allocate memory and move elements to Vals/Idx. * Initially, elements are sorted by rows, and are sorted within row too. * No additional post-sorting is required. */ ae_vector_set_length(&temp, m, _state); for(i=0; i<=m-1; i++) { temp.ptr.p_int[i] = 0; } rvectorsetlengthatleast(&s->vals, nonne, _state); ivectorsetlengthatleast(&s->idx, nonne, _state); for(i=0; i<=m-1; i++) { /* * copy subdiagonal and diagonal parts of I-th block */ offs0 = tridx.ptr.p_int[i]; offs1 = s->ridx.ptr.p_int[i]+temp.ptr.p_int[i]; k = s->didx.ptr.p_int[i]+1; for(j=0; j<=k-1; j++) { s->vals.ptr.p_double[offs1+j] = tvals.ptr.p_double[offs0+j]; s->idx.ptr.p_int[offs1+j] = i-s->didx.ptr.p_int[i]+j; } temp.ptr.p_int[i] = temp.ptr.p_int[i]+s->didx.ptr.p_int[i]+1; /* * Copy superdiagonal part of I-th block */ offs0 = tridx.ptr.p_int[i]+s->didx.ptr.p_int[i]+1; k = s->uidx.ptr.p_int[i]; for(j=0; j<=k-1; j++) { offs1 = s->ridx.ptr.p_int[i-k+j]+temp.ptr.p_int[i-k+j]; s->vals.ptr.p_double[offs1] = tvals.ptr.p_double[offs0+j]; s->idx.ptr.p_int[offs1] = i; temp.ptr.p_int[i-k+j] = temp.ptr.p_int[i-k+j]+1; } } /* * Initialization 'S.UIdx' and 'S.DIdx' */ sparse_sparseinitduidx(s, _state); ae_frame_leave(_state); return; } ae_assert(ae_false, "SparseConvertToCRS: invalid matrix type", _state); ae_frame_leave(_state); } /************************************************************************* This function performs out-of-place conversion to CRS format. S0 is copied to S1 and converted on-the-fly. INPUT PARAMETERS S0 - sparse matrix in any format. 
OUTPUT PARAMETERS S1 - sparse matrix in CRS format. NOTE: if S0 is stored as CRS, it is just copied without conversion. NOTE: this function de-allocates memory occupied by S1 before starting CRS conversion. If you perform a lot of repeated CRS conversions, it may lead to memory fragmentation. In this case we recommend you to use SparseCopyToCRSBuf() function which re-uses memory in S1 as much as possible. -- ALGLIB PROJECT -- Copyright 20.07.2012 by Bochkanov Sergey *************************************************************************/ void sparsecopytocrs(sparsematrix* s0, sparsematrix* s1, ae_state *_state) { _sparsematrix_clear(s1); ae_assert((s0->matrixtype==0||s0->matrixtype==1)||s0->matrixtype==2, "SparseCopyToCRS: invalid matrix type", _state); sparsecopytocrsbuf(s0, s1, _state); } /************************************************************************* This function performs out-of-place conversion to CRS format. S0 is copied to S1 and converted on-the-fly. Memory allocated in S1 is reused to maximum extent possible. INPUT PARAMETERS S0 - sparse matrix in any format. S1 - matrix which may contain some pre-allocated memory, or can be just uninitialized structure. OUTPUT PARAMETERS S1 - sparse matrix in CRS format. NOTE: if S0 is stored as CRS, it is just copied without conversion. -- ALGLIB PROJECT -- Copyright 20.07.2012 by Bochkanov Sergey *************************************************************************/ void sparsecopytocrsbuf(sparsematrix* s0, sparsematrix* s1, ae_state *_state) { ae_frame _frame_block; ae_vector temp; ae_int_t nonne; ae_int_t i; ae_int_t j; ae_int_t k; ae_int_t offs0; ae_int_t offs1; ae_int_t m; ae_frame_make(_state, &_frame_block); ae_vector_init(&temp, 0, DT_INT, _state); ae_assert((s0->matrixtype==0||s0->matrixtype==1)||s0->matrixtype==2, "SparseCopyToCRSBuf: invalid matrix type", _state); m = s0->m; if( s0->matrixtype==0 ) { /* * Convert from hash-table to CRS * Done like ConvertToCRS function */ s1->matrixtype = 1; s1->m = s0->m; s1->n = s0->n; s1->nfree = s0->nfree; nonne = 0; k = s0->tablesize; ivectorsetlengthatleast(&s1->ridx, s1->m+1, _state); for(i=0; i<=s1->m; i++) { s1->ridx.ptr.p_int[i] = 0; } ae_vector_set_length(&temp, s1->m, _state); for(i=0; i<=s1->m-1; i++) { temp.ptr.p_int[i] = 0; } /* * Number of elements per row */ for(i=0; i<=k-1; i++) { if( s0->idx.ptr.p_int[2*i]>=0 ) { s1->ridx.ptr.p_int[s0->idx.ptr.p_int[2*i]+1] = s1->ridx.ptr.p_int[s0->idx.ptr.p_int[2*i]+1]+1; nonne = nonne+1; } } /* * Fill RIdx (offsets of rows) */ for(i=0; i<=s1->m-1; i++) { s1->ridx.ptr.p_int[i+1] = s1->ridx.ptr.p_int[i+1]+s1->ridx.ptr.p_int[i]; } /* * Allocate memory */ rvectorsetlengthatleast(&s1->vals, nonne, _state); ivectorsetlengthatleast(&s1->idx, nonne, _state); for(i=0; i<=k-1; i++) { if( s0->idx.ptr.p_int[2*i]>=0 ) { s1->vals.ptr.p_double[s1->ridx.ptr.p_int[s0->idx.ptr.p_int[2*i]]+temp.ptr.p_int[s0->idx.ptr.p_int[2*i]]] = s0->vals.ptr.p_double[i]; s1->idx.ptr.p_int[s1->ridx.ptr.p_int[s0->idx.ptr.p_int[2*i]]+temp.ptr.p_int[s0->idx.ptr.p_int[2*i]]] = s0->idx.ptr.p_int[2*i+1]; temp.ptr.p_int[s0->idx.ptr.p_int[2*i]] = temp.ptr.p_int[s0->idx.ptr.p_int[2*i]]+1; } } /* * Set NInitialized */ s1->ninitialized = s1->ridx.ptr.p_int[s1->m]; /* * Sorting of elements */ for(i=0; i<=s1->m-1; i++) { tagsortmiddleir(&s1->idx, &s1->vals, s1->ridx.ptr.p_int[i], s1->ridx.ptr.p_int[i+1]-s1->ridx.ptr.p_int[i], _state); } /* * Initialization 'S.UIdx' and 'S.DIdx' */ sparse_sparseinitduidx(s1, _state); ae_frame_leave(_state); return; } if( s0->matrixtype==1 ) { /* 
* Already CRS, just copy */ sparsecopybuf(s0, s1, _state); ae_frame_leave(_state); return; } if( s0->matrixtype==2 ) { ae_assert(s0->m==s0->n, "SparseCopyToCRS: non-square SKS matrices are not supported", _state); /* * From SKS to CRS. */ s1->m = s0->m; s1->n = s0->n; s1->matrixtype = 1; /* * Fill RIdx by number of elements per row: * RIdx[I+1] stores number of elements in I-th row. * * Convert RIdx from row sizes to row offsets. * Set NInitialized */ ivectorsetlengthatleast(&s1->ridx, m+1, _state); s1->ridx.ptr.p_int[0] = 0; for(i=1; i<=m; i++) { s1->ridx.ptr.p_int[i] = 1; } nonne = 0; for(i=0; i<=m-1; i++) { s1->ridx.ptr.p_int[i+1] = s0->didx.ptr.p_int[i]+s1->ridx.ptr.p_int[i+1]; for(j=i-s0->uidx.ptr.p_int[i]; j<=i-1; j++) { s1->ridx.ptr.p_int[j+1] = s1->ridx.ptr.p_int[j+1]+1; } nonne = nonne+s0->didx.ptr.p_int[i]+1+s0->uidx.ptr.p_int[i]; } for(i=0; i<=m-1; i++) { s1->ridx.ptr.p_int[i+1] = s1->ridx.ptr.p_int[i+1]+s1->ridx.ptr.p_int[i]; } s1->ninitialized = s1->ridx.ptr.p_int[m]; /* * Allocate memory and move elements to Vals/Idx. * Initially, elements are sorted by rows, and are sorted within row too. * No additional post-sorting is required. */ ae_vector_set_length(&temp, m, _state); for(i=0; i<=m-1; i++) { temp.ptr.p_int[i] = 0; } rvectorsetlengthatleast(&s1->vals, nonne, _state); ivectorsetlengthatleast(&s1->idx, nonne, _state); for(i=0; i<=m-1; i++) { /* * copy subdiagonal and diagonal parts of I-th block */ offs0 = s0->ridx.ptr.p_int[i]; offs1 = s1->ridx.ptr.p_int[i]+temp.ptr.p_int[i]; k = s0->didx.ptr.p_int[i]+1; for(j=0; j<=k-1; j++) { s1->vals.ptr.p_double[offs1+j] = s0->vals.ptr.p_double[offs0+j]; s1->idx.ptr.p_int[offs1+j] = i-s0->didx.ptr.p_int[i]+j; } temp.ptr.p_int[i] = temp.ptr.p_int[i]+s0->didx.ptr.p_int[i]+1; /* * Copy superdiagonal part of I-th block */ offs0 = s0->ridx.ptr.p_int[i]+s0->didx.ptr.p_int[i]+1; k = s0->uidx.ptr.p_int[i]; for(j=0; j<=k-1; j++) { offs1 = s1->ridx.ptr.p_int[i-k+j]+temp.ptr.p_int[i-k+j]; s1->vals.ptr.p_double[offs1] = s0->vals.ptr.p_double[offs0+j]; s1->idx.ptr.p_int[offs1] = i; temp.ptr.p_int[i-k+j] = temp.ptr.p_int[i-k+j]+1; } } /* * Initialization 'S.UIdx' and 'S.DIdx' */ sparse_sparseinitduidx(s1, _state); ae_frame_leave(_state); return; } ae_assert(ae_false, "SparseCopyToCRSBuf: unexpected matrix type", _state); ae_frame_leave(_state); } /************************************************************************* This function performs in-place conversion to SKS format. INPUT PARAMETERS S - sparse matrix in any format. OUTPUT PARAMETERS S - sparse matrix in SKS format. NOTE: this function has no effect when called with matrix which is already in SKS mode. NOTE: in-place conversion involves allocation of temporary arrays. If you perform a lot of repeated in- place conversions, it may lead to memory fragmentation. Consider using out-of-place SparseCopyToSKSBuf() function in this case. 
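EXAMPLE (a minimal C++ sketch; it assumes the sparseconverttosks wrapper of the ALGLIB C++ interface; SKS is intended for band/skyline matrices, so a small tridiagonal matrix is used):

    #include "linalg.h"
    using namespace alglib;

    sparsematrix s;
    sparsecreate(3, 3, s);
    for(int i=0; i<3; i++)
        sparseset(s, i, i, 2.0);    // diagonal
    sparseset(s, 1, 0, -1.0);       // subdiagonal
    sparseset(s, 2, 1, -1.0);
    sparseset(s, 0, 1, -1.0);       // superdiagonal
    sparseset(s, 1, 2, -1.0);

    sparseconverttosks(s);          // in-place conversion to skyline (SKS) storage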
-- ALGLIB PROJECT -- Copyright 15.01.2014 by Bochkanov Sergey *************************************************************************/ void sparseconverttosks(sparsematrix* s, ae_state *_state) { ae_frame _frame_block; ae_vector tridx; ae_vector tdidx; ae_vector tuidx; ae_vector tvals; ae_int_t n; ae_int_t t0; ae_int_t t1; ae_int_t i; ae_int_t j; ae_int_t k; double v; ae_frame_make(_state, &_frame_block); ae_vector_init(&tridx, 0, DT_INT, _state); ae_vector_init(&tdidx, 0, DT_INT, _state); ae_vector_init(&tuidx, 0, DT_INT, _state); ae_vector_init(&tvals, 0, DT_REAL, _state); ae_assert((s->matrixtype==0||s->matrixtype==1)||s->matrixtype==2, "SparseConvertToSKS: invalid matrix type", _state); ae_assert(s->m==s->n, "SparseConvertToSKS: rectangular matrices are not supported", _state); n = s->n; if( s->matrixtype==2 ) { /* * Already in SKS mode */ ae_frame_leave(_state); return; } /* * Generate internal copy of SKS matrix */ ivectorsetlengthatleast(&tdidx, n+1, _state); ivectorsetlengthatleast(&tuidx, n+1, _state); for(i=0; i<=n; i++) { tdidx.ptr.p_int[i] = 0; tuidx.ptr.p_int[i] = 0; } t0 = 0; t1 = 0; while(sparseenumerate(s, &t0, &t1, &i, &j, &v, _state)) { if( j<i ) { tdidx.ptr.p_int[i] = ae_maxint(tdidx.ptr.p_int[i], i-j, _state); } else { tuidx.ptr.p_int[j] = ae_maxint(tuidx.ptr.p_int[j], j-i, _state); } } ivectorsetlengthatleast(&tridx, n+1, _state); tridx.ptr.p_int[0] = 0; for(i=1; i<=n; i++) { tridx.ptr.p_int[i] = tridx.ptr.p_int[i-1]+tdidx.ptr.p_int[i-1]+1+tuidx.ptr.p_int[i-1]; } rvectorsetlengthatleast(&tvals, tridx.ptr.p_int[n], _state); k = tridx.ptr.p_int[n]; for(i=0; i<=k-1; i++) { tvals.ptr.p_double[i] = 0.0; } t0 = 0; t1 = 0; while(sparseenumerate(s, &t0, &t1, &i, &j, &v, _state)) { if( j<=i ) { tvals.ptr.p_double[tridx.ptr.p_int[i]+tdidx.ptr.p_int[i]-(i-j)] = v; } else { tvals.ptr.p_double[tridx.ptr.p_int[j+1]-(j-i)] = v; } } for(i=0; i<=n-1; i++) { tdidx.ptr.p_int[n] = ae_maxint(tdidx.ptr.p_int[n], tdidx.ptr.p_int[i], _state); tuidx.ptr.p_int[n] = ae_maxint(tuidx.ptr.p_int[n], tuidx.ptr.p_int[i], _state); } s->matrixtype = 2; s->ninitialized = 0; s->nfree = 0; s->m = n; s->n = n; ae_swap_vectors(&s->didx, &tdidx); ae_swap_vectors(&s->uidx, &tuidx); ae_swap_vectors(&s->ridx, &tridx); ae_swap_vectors(&s->vals, &tvals); ae_frame_leave(_state); } /************************************************************************* This function performs out-of-place conversion to SKS storage format. S0 is copied to S1 and converted on-the-fly. INPUT PARAMETERS S0 - sparse matrix in any format. OUTPUT PARAMETERS S1 - sparse matrix in SKS format. NOTE: if S0 is stored as SKS, it is just copied without conversion. NOTE: this function de-allocates memory occupied by S1 before starting conversion. If you perform a lot of repeated conversions, it may lead to memory fragmentation. In this case we recommend you to use SparseCopyToSKSBuf() function which re-uses memory in S1 as much as possible. -- ALGLIB PROJECT -- Copyright 20.07.2012 by Bochkanov Sergey *************************************************************************/ void sparsecopytosks(sparsematrix* s0, sparsematrix* s1, ae_state *_state) { _sparsematrix_clear(s1); ae_assert((s0->matrixtype==0||s0->matrixtype==1)||s0->matrixtype==2, "SparseCopyToSKS: invalid matrix type", _state); sparsecopytosksbuf(s0, s1, _state); } /************************************************************************* This function performs out-of-place conversion to SKS format. S0 is copied to S1 and converted on-the-fly. Memory allocated in S1 is reused to maximum extent possible. INPUT PARAMETERS S0 - sparse matrix in any format. OUTPUT PARAMETERS S1 - sparse matrix in SKS format. NOTE: if S0 is stored as SKS, it is just copied without conversion.
-- ALGLIB PROJECT -- Copyright 20.07.2012 by Bochkanov Sergey *************************************************************************/ void sparsecopytosksbuf(sparsematrix* s0, sparsematrix* s1, ae_state *_state) { double v; ae_int_t n; ae_int_t t0; ae_int_t t1; ae_int_t i; ae_int_t j; ae_int_t k; ae_assert((s0->matrixtype==0||s0->matrixtype==1)||s0->matrixtype==2, "SparseCopyToSKSBuf: invalid matrix type", _state); ae_assert(s0->m==s0->n, "SparseCopyToSKSBuf: rectangular matrices are not supported", _state); n = s0->n; if( s0->matrixtype==2 ) { /* * Already SKS, just copy */ sparsecopybuf(s0, s1, _state); return; } /* * Generate copy of matrix in the SKS format */ ivectorsetlengthatleast(&s1->didx, n+1, _state); ivectorsetlengthatleast(&s1->uidx, n+1, _state); for(i=0; i<=n; i++) { s1->didx.ptr.p_int[i] = 0; s1->uidx.ptr.p_int[i] = 0; } t0 = 0; t1 = 0; while(sparseenumerate(s0, &t0, &t1, &i, &j, &v, _state)) { if( j<i ) { s1->didx.ptr.p_int[i] = ae_maxint(s1->didx.ptr.p_int[i], i-j, _state); } else { s1->uidx.ptr.p_int[j] = ae_maxint(s1->uidx.ptr.p_int[j], j-i, _state); } } ivectorsetlengthatleast(&s1->ridx, n+1, _state); s1->ridx.ptr.p_int[0] = 0; for(i=1; i<=n; i++) { s1->ridx.ptr.p_int[i] = s1->ridx.ptr.p_int[i-1]+s1->didx.ptr.p_int[i-1]+1+s1->uidx.ptr.p_int[i-1]; } rvectorsetlengthatleast(&s1->vals, s1->ridx.ptr.p_int[n], _state); k = s1->ridx.ptr.p_int[n]; for(i=0; i<=k-1; i++) { s1->vals.ptr.p_double[i] = 0.0; } t0 = 0; t1 = 0; while(sparseenumerate(s0, &t0, &t1, &i, &j, &v, _state)) { if( j<=i ) { s1->vals.ptr.p_double[s1->ridx.ptr.p_int[i]+s1->didx.ptr.p_int[i]-(i-j)] = v; } else { s1->vals.ptr.p_double[s1->ridx.ptr.p_int[j+1]-(j-i)] = v; } } for(i=0; i<=n-1; i++) { s1->didx.ptr.p_int[n] = ae_maxint(s1->didx.ptr.p_int[n], s1->didx.ptr.p_int[i], _state); s1->uidx.ptr.p_int[n] = ae_maxint(s1->uidx.ptr.p_int[n], s1->uidx.ptr.p_int[i], _state); } s1->matrixtype = 2; s1->ninitialized = 0; s1->nfree = 0; s1->m = n; s1->n = n; } /************************************************************************* This function, which is not accessible to users, performs in-place creation of a CRS matrix. It is expected that: * S.M and S.N are initialized * S.RIdx, S.Idx and S.Vals are loaded with values in CRS format used by ALGLIB, with elements of S.Idx/S.Vals possibly being unsorted within each row (this constructor function may post-sort matrix, assuming that it is sorted by rows). Only 5 fields should be set by caller. Other fields will be rewritten by this constructor function. This function performs integrity check on user-specified values, with the only exception being Vals[] array: * it does not require values to be non-zero * it does not check that elements of Vals[] are finite IEEE-754 values INPUT PARAMETERS S - sparse matrix with corresponding fields set by caller OUTPUT PARAMETERS S - sparse matrix in CRS format.
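For reference, a sketch of the five caller-provided fields for the 2x2 matrix [[1,2],[0,3]] in the CRS layout expected above (internal C-core API; this is an illustrative sketch only and assumes a sparsematrix S initialized with _sparsematrix_init() and a valid ae_state* _state):

    /* RIdx = [0, 2, 3] - row I occupies Idx/Vals positions RIdx[I]..RIdx[I+1]-1 */
    /* Idx  = [0, 1, 1] - column indexes (may be unsorted within a row)          */
    /* Vals = [1, 2, 3] - values stored in the same order as Idx                 */
    s.m = 2;
    s.n = 2;
    ae_vector_set_length(&s.ridx, 3, _state);
    s.ridx.ptr.p_int[0] = 0;
    s.ridx.ptr.p_int[1] = 2;
    s.ridx.ptr.p_int[2] = 3;
    ae_vector_set_length(&s.idx, 3, _state);
    s.idx.ptr.p_int[0] = 0;
    s.idx.ptr.p_int[1] = 1;
    s.idx.ptr.p_int[2] = 1;
    ae_vector_set_length(&s.vals, 3, _state);
    s.vals.ptr.p_double[0] = 1.0;
    s.vals.ptr.p_double[1] = 2.0;
    s.vals.ptr.p_double[2] = 3.0;
    sparsecreatecrsinplace(&s, _state);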
-- ALGLIB PROJECT -- Copyright 20.08.2016 by Bochkanov Sergey *************************************************************************/ void sparsecreatecrsinplace(sparsematrix* s, ae_state *_state) { ae_int_t m; ae_int_t n; ae_int_t i; ae_int_t j; ae_int_t j0; ae_int_t j1; m = s->m; n = s->n; /* * Perform integrity check */ ae_assert(s->m>0, "SparseCreateCRSInplace: integrity check failed", _state); ae_assert(s->n>0, "SparseCreateCRSInplace: integrity check failed", _state); ae_assert(s->ridx.cnt>=m+1, "SparseCreateCRSInplace: integrity check failed", _state); for(i=0; i<=m-1; i++) { ae_assert(s->ridx.ptr.p_int[i]>=0&&s->ridx.ptr.p_int[i]<=s->ridx.ptr.p_int[i+1], "SparseCreateCRSInplace: integrity check failed", _state); } ae_assert(s->ridx.ptr.p_int[m]<=s->idx.cnt, "SparseCreateCRSInplace: integrity check failed", _state); ae_assert(s->ridx.ptr.p_int[m]<=s->vals.cnt, "SparseCreateCRSInplace: integrity check failed", _state); for(i=0; i<=m-1; i++) { j0 = s->ridx.ptr.p_int[i]; j1 = s->ridx.ptr.p_int[i+1]-1; for(j=j0; j<=j1; j++) { ae_assert(s->idx.ptr.p_int[j]>=0&&s->idx.ptr.p_int[j]<n, "SparseCreateCRSInplace: integrity check failed", _state); } } s->matrixtype = 1; s->ninitialized = s->ridx.ptr.p_int[m]; for(i=0; i<=m-1; i++) { tagsortmiddleir(&s->idx, &s->vals, s->ridx.ptr.p_int[i], s->ridx.ptr.p_int[i+1]-s->ridx.ptr.p_int[i], _state); } sparse_sparseinitduidx(s, _state); } /************************************************************************* This function returns type of the matrix storage format. INPUT PARAMETERS: S - sparse matrix. RESULT: sparse storage format used by matrix: 0 - Hash-table 1 - CRS (compressed row storage) 2 - SKS (skyline) NOTE: future versions of ALGLIB may include additional sparse storage formats. -- ALGLIB PROJECT -- Copyright 20.07.2012 by Bochkanov Sergey *************************************************************************/ ae_int_t sparsegetmatrixtype(sparsematrix* s, ae_state *_state) { ae_int_t result; ae_assert((s->matrixtype==0||s->matrixtype==1)||s->matrixtype==2, "SparseGetMatrixType: invalid matrix type", _state); result = s->matrixtype; return result; } /************************************************************************* This function checks matrix storage format and returns True when matrix is stored using Hash table representation. INPUT PARAMETERS: S - sparse matrix. RESULT: True if matrix type is Hash table False if matrix type is not Hash table -- ALGLIB PROJECT -- Copyright 20.07.2012 by Bochkanov Sergey *************************************************************************/ ae_bool sparseishash(sparsematrix* s, ae_state *_state) { ae_bool result; ae_assert((s->matrixtype==0||s->matrixtype==1)||s->matrixtype==2, "SparseIsHash: invalid matrix type", _state); result = s->matrixtype==0; return result; } /************************************************************************* This function checks matrix storage format and returns True when matrix is stored using CRS representation. INPUT PARAMETERS: S - sparse matrix.
RESULT: True if matrix type is CRS False if matrix type is not CRS -- ALGLIB PROJECT -- Copyright 20.07.2012 by Bochkanov Sergey *************************************************************************/ ae_bool sparseiscrs(sparsematrix* s, ae_state *_state) { ae_bool result; ae_assert((s->matrixtype==0||s->matrixtype==1)||s->matrixtype==2, "SparseIsCRS: invalid matrix type", _state); result = s->matrixtype==1; return result; } /************************************************************************* This function checks matrix storage format and returns True when matrix is stored using SKS representation. INPUT PARAMETERS: S - sparse matrix. RESULT: True if matrix type is SKS False if matrix type is not SKS -- ALGLIB PROJECT -- Copyright 20.07.2012 by Bochkanov Sergey *************************************************************************/ ae_bool sparseissks(sparsematrix* s, ae_state *_state) { ae_bool result; ae_assert((s->matrixtype==0||s->matrixtype==1)||s->matrixtype==2, "SparseIsSKS: invalid matrix type", _state); result = s->matrixtype==2; return result; } /************************************************************************* The function frees all memory occupied by sparse matrix. Sparse matrix structure becomes unusable after this call. OUTPUT PARAMETERS S - sparse matrix to delete -- ALGLIB PROJECT -- Copyright 24.07.2012 by Bochkanov Sergey *************************************************************************/ void sparsefree(sparsematrix* s, ae_state *_state) { _sparsematrix_clear(s); s->matrixtype = -1; s->m = 0; s->n = 0; s->nfree = 0; s->ninitialized = 0; s->tablesize = 0; } /************************************************************************* The function returns number of rows of a sparse matrix. RESULT: number of rows of a sparse matrix. -- ALGLIB PROJECT -- Copyright 23.08.2012 by Bochkanov Sergey *************************************************************************/ ae_int_t sparsegetnrows(sparsematrix* s, ae_state *_state) { ae_int_t result; result = s->m; return result; } /************************************************************************* The function returns number of columns of a sparse matrix. RESULT: number of columns of a sparse matrix. -- ALGLIB PROJECT -- Copyright 23.08.2012 by Bochkanov Sergey *************************************************************************/ ae_int_t sparsegetncols(sparsematrix* s, ae_state *_state) { ae_int_t result; result = s->n; return result; } /************************************************************************* The function returns number of strictly upper triangular non-zero elements in the matrix. It counts SYMBOLICALLY non-zero elements, i.e. entries in the sparse matrix data structure. If some element has zero numerical value, it is still counted. This function has different cost for different types of matrices: * for hash-based matrices it involves complete pass over entire hash-table with O(NNZ) cost, where NNZ is number of non-zero elements * for CRS and SKS matrix types cost of counting is O(N) (N - matrix size). 
RESULT: number of non-zero elements strictly above main diagonal -- ALGLIB PROJECT -- Copyright 12.02.2014 by Bochkanov Sergey *************************************************************************/ ae_int_t sparsegetuppercount(sparsematrix* s, ae_state *_state) { ae_int_t sz; ae_int_t i0; ae_int_t i; ae_int_t result; result = -1; if( s->matrixtype==0 ) { /* * Hash-table matrix */ result = 0; sz = s->tablesize; for(i0=0; i0<=sz-1; i0++) { i = s->idx.ptr.p_int[2*i0]; if( i>=0&&s->idx.ptr.p_int[2*i0+1]>i ) { result = result+1; } } return result; } if( s->matrixtype==1 ) { /* * CRS matrix */ ae_assert(s->ninitialized==s->ridx.ptr.p_int[s->m], "SparseGetUpperCount: some rows/elements of the CRS matrix were not initialized (you must initialize everything you promised to SparseCreateCRS)", _state); result = 0; sz = s->m; for(i=0; i<=sz-1; i++) { result = result+(s->ridx.ptr.p_int[i+1]-s->uidx.ptr.p_int[i]); } return result; } if( s->matrixtype==2 ) { /* * SKS matrix */ ae_assert(s->m==s->n, "SparseGetUpperCount: non-square SKS matrices are not supported", _state); result = 0; sz = s->m; for(i=0; i<=sz-1; i++) { result = result+s->uidx.ptr.p_int[i]; } return result; } ae_assert(ae_false, "SparseGetUpperCount: internal error", _state); return result; } /************************************************************************* The function returns number of strictly lower triangular non-zero elements in the matrix. It counts SYMBOLICALLY non-zero elements, i.e. entries in the sparse matrix data structure. If some element has zero numerical value, it is still counted. This function has different cost for different types of matrices: * for hash-based matrices it involves complete pass over entire hash-table with O(NNZ) cost, where NNZ is number of non-zero elements * for CRS and SKS matrix types cost of counting is O(N) (N - matrix size). 
RESULT: number of non-zero elements strictly below main diagonal -- ALGLIB PROJECT -- Copyright 12.02.2014 by Bochkanov Sergey *************************************************************************/ ae_int_t sparsegetlowercount(sparsematrix* s, ae_state *_state) { ae_int_t sz; ae_int_t i0; ae_int_t i; ae_int_t result; result = -1; if( s->matrixtype==0 ) { /* * Hash-table matrix */ result = 0; sz = s->tablesize; for(i0=0; i0<=sz-1; i0++) { i = s->idx.ptr.p_int[2*i0]; if( i>=0&&s->idx.ptr.p_int[2*i0+1]<i ) { result = result+1; } } return result; } if( s->matrixtype==1 ) { /* * CRS matrix */ ae_assert(s->ninitialized==s->ridx.ptr.p_int[s->m], "SparseGetUpperCount: some rows/elements of the CRS matrix were not initialized (you must initialize everything you promised to SparseCreateCRS)", _state); result = 0; sz = s->m; for(i=0; i<=sz-1; i++) { result = result+(s->didx.ptr.p_int[i]-s->ridx.ptr.p_int[i]); } return result; } if( s->matrixtype==2 ) { /* * SKS matrix */ ae_assert(s->m==s->n, "SparseGetUpperCount: non-square SKS matrices are not supported", _state); result = 0; sz = s->m; for(i=0; i<=sz-1; i++) { result = result+s->didx.ptr.p_int[i]; } return result; } ae_assert(ae_false, "SparseGetUpperCount: internal error", _state); return result; } /************************************************************************* Procedure for initialization 'S.DIdx' and 'S.UIdx' -- ALGLIB PROJECT -- Copyright 14.10.2011 by Bochkanov Sergey *************************************************************************/ static void sparse_sparseinitduidx(sparsematrix* s, ae_state *_state) { ae_int_t i; ae_int_t j; ae_int_t lt; ae_int_t rt; ae_assert(s->matrixtype==1, "SparseInitDUIdx: internal error, incorrect matrix type", _state); ivectorsetlengthatleast(&s->didx, s->m, _state); ivectorsetlengthatleast(&s->uidx, s->m, _state); for(i=0; i<=s->m-1; i++) { s->uidx.ptr.p_int[i] = -1; s->didx.ptr.p_int[i] = -1; lt = s->ridx.ptr.p_int[i]; rt = s->ridx.ptr.p_int[i+1]; for(j=lt; j<=rt-1; j++) { if( i<s->idx.ptr.p_int[j]&&s->uidx.ptr.p_int[i]==-1 ) { s->uidx.ptr.p_int[i] = j; break; } else { if( i==s->idx.ptr.p_int[j] ) { s->didx.ptr.p_int[i] = j; } } } if( s->uidx.ptr.p_int[i]==-1 ) { s->uidx.ptr.p_int[i] = s->ridx.ptr.p_int[i+1]; } if( s->didx.ptr.p_int[i]==-1 ) { s->didx.ptr.p_int[i] = s->uidx.ptr.p_int[i]; } } } /************************************************************************* This is hash function.
-- ALGLIB PROJECT -- Copyright 14.10.2011 by Bochkanov Sergey *************************************************************************/ static ae_int_t sparse_hash(ae_int_t i, ae_int_t j, ae_int_t tabsize, ae_state *_state) { ae_frame _frame_block; hqrndstate r; ae_int_t result; ae_frame_make(_state, &_frame_block); _hqrndstate_init(&r, _state); hqrndseed(i, j, &r, _state); result = hqrnduniformi(&r, tabsize, _state); ae_frame_leave(_state); return result; } void _sparsematrix_init(void* _p, ae_state *_state) { sparsematrix *p = (sparsematrix*)_p; ae_touch_ptr((void*)p); ae_vector_init(&p->vals, 0, DT_REAL, _state); ae_vector_init(&p->idx, 0, DT_INT, _state); ae_vector_init(&p->ridx, 0, DT_INT, _state); ae_vector_init(&p->didx, 0, DT_INT, _state); ae_vector_init(&p->uidx, 0, DT_INT, _state); } void _sparsematrix_init_copy(void* _dst, void* _src, ae_state *_state) { sparsematrix *dst = (sparsematrix*)_dst; sparsematrix *src = (sparsematrix*)_src; ae_vector_init_copy(&dst->vals, &src->vals, _state); ae_vector_init_copy(&dst->idx, &src->idx, _state); ae_vector_init_copy(&dst->ridx, &src->ridx, _state); ae_vector_init_copy(&dst->didx, &src->didx, _state); ae_vector_init_copy(&dst->uidx, &src->uidx, _state); dst->matrixtype = src->matrixtype; dst->m = src->m; dst->n = src->n; dst->nfree = src->nfree; dst->ninitialized = src->ninitialized; dst->tablesize = src->tablesize; } void _sparsematrix_clear(void* _p) { sparsematrix *p = (sparsematrix*)_p; ae_touch_ptr((void*)p); ae_vector_clear(&p->vals); ae_vector_clear(&p->idx); ae_vector_clear(&p->ridx); ae_vector_clear(&p->didx); ae_vector_clear(&p->uidx); } void _sparsematrix_destroy(void* _p) { sparsematrix *p = (sparsematrix*)_p; ae_touch_ptr((void*)p); ae_vector_destroy(&p->vals); ae_vector_destroy(&p->idx); ae_vector_destroy(&p->ridx); ae_vector_destroy(&p->didx); ae_vector_destroy(&p->uidx); } void _sparsebuffers_init(void* _p, ae_state *_state) { sparsebuffers *p = (sparsebuffers*)_p; ae_touch_ptr((void*)p); ae_vector_init(&p->d, 0, DT_INT, _state); ae_vector_init(&p->u, 0, DT_INT, _state); _sparsematrix_init(&p->s, _state); } void _sparsebuffers_init_copy(void* _dst, void* _src, ae_state *_state) { sparsebuffers *dst = (sparsebuffers*)_dst; sparsebuffers *src = (sparsebuffers*)_src; ae_vector_init_copy(&dst->d, &src->d, _state); ae_vector_init_copy(&dst->u, &src->u, _state); _sparsematrix_init_copy(&dst->s, &src->s, _state); } void _sparsebuffers_clear(void* _p) { sparsebuffers *p = (sparsebuffers*)_p; ae_touch_ptr((void*)p); ae_vector_clear(&p->d); ae_vector_clear(&p->u); _sparsematrix_clear(&p->s); } void _sparsebuffers_destroy(void* _p) { sparsebuffers *p = (sparsebuffers*)_p; ae_touch_ptr((void*)p); ae_vector_destroy(&p->d); ae_vector_destroy(&p->u); _sparsematrix_destroy(&p->s); } /************************************************************************* Generation of a random uniformly distributed (Haar) orthogonal matrix INPUT PARAMETERS: N - matrix size, N>=1 OUTPUT PARAMETERS: A - orthogonal NxN matrix, array[0..N-1,0..N-1] NOTE: this function uses algorithm described in Stewart, G. W. (1980), "The Efficient Generation of Random Orthogonal Matrices with an Application to Condition Estimators". Speaking short, to generate an (N+1)x(N+1) orthogonal matrix, it: * takes an NxN one * takes uniformly distributed unit vector of dimension N+1. * constructs a Householder reflection from the vector, then applies it to the smaller matrix (embedded in the larger size with a 1 at the bottom right corner). 
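EXAMPLE (a minimal C++ sketch; it assumes the rmatrixrndorthogonal wrapper of the ALGLIB C++ interface):

    #include "linalg.h"
    using namespace alglib;

    real_2d_array q;
    rmatrixrndorthogonal(4, q);   // random 4x4 orthogonal matrix, Haar distributed
    // columns of q are orthonormal, i.e. q^T*q = I up to rounding error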
-- ALGLIB routine -- 04.12.2009 Bochkanov Sergey *************************************************************************/ void rmatrixrndorthogonal(ae_int_t n, /* Real */ ae_matrix* a, ae_state *_state) { ae_int_t i; ae_int_t j; ae_matrix_clear(a); ae_assert(n>=1, "RMatrixRndOrthogonal: N<1!", _state); ae_matrix_set_length(a, n, n, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { if( i==j ) { a->ptr.pp_double[i][j] = (double)(1); } else { a->ptr.pp_double[i][j] = (double)(0); } } } rmatrixrndorthogonalfromtheright(a, n, n, _state); } /************************************************************************* Generation of random NxN matrix with given condition number and norm2(A)=1 INPUT PARAMETERS: N - matrix size C - condition number (in 2-norm) OUTPUT PARAMETERS: A - random matrix with norm2(A)=1 and cond(A)=C -- ALGLIB routine -- 04.12.2009 Bochkanov Sergey *************************************************************************/ void rmatrixrndcond(ae_int_t n, double c, /* Real */ ae_matrix* a, ae_state *_state) { ae_frame _frame_block; ae_int_t i; ae_int_t j; double l1; double l2; hqrndstate rs; ae_frame_make(_state, &_frame_block); ae_matrix_clear(a); _hqrndstate_init(&rs, _state); ae_assert(n>=1&&ae_fp_greater_eq(c,(double)(1)), "RMatrixRndCond: N<1 or C<1!", _state); ae_matrix_set_length(a, n, n, _state); if( n==1 ) { /* * special case */ a->ptr.pp_double[0][0] = (double)(2*ae_randominteger(2, _state)-1); ae_frame_leave(_state); return; } hqrndrandomize(&rs, _state); l1 = (double)(0); l2 = ae_log(1/c, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { a->ptr.pp_double[i][j] = (double)(0); } } a->ptr.pp_double[0][0] = ae_exp(l1, _state); for(i=1; i<=n-2; i++) { a->ptr.pp_double[i][i] = ae_exp(hqrnduniformr(&rs, _state)*(l2-l1)+l1, _state); } a->ptr.pp_double[n-1][n-1] = ae_exp(l2, _state); rmatrixrndorthogonalfromtheleft(a, n, n, _state); rmatrixrndorthogonalfromtheright(a, n, n, _state); ae_frame_leave(_state); } /************************************************************************* Generation of a random Haar distributed orthogonal complex matrix INPUT PARAMETERS: N - matrix size, N>=1 OUTPUT PARAMETERS: A - orthogonal NxN matrix, array[0..N-1,0..N-1] NOTE: this function uses algorithm described in Stewart, G. W. (1980), "The Efficient Generation of Random Orthogonal Matrices with an Application to Condition Estimators". Speaking short, to generate an (N+1)x(N+1) orthogonal matrix, it: * takes an NxN one * takes uniformly distributed unit vector of dimension N+1. * constructs a Householder reflection from the vector, then applies it to the smaller matrix (embedded in the larger size with a 1 at the bottom right corner). 
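NOTE: this complex version is called in the same way as RMatrixRndOrthogonal above; the result is a random unitary matrix (Q multiplied by its conjugate transpose gives the identity), with elements accessed through ptr.pp_complex instead of ptr.pp_double.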
-- ALGLIB routine -- 04.12.2009 Bochkanov Sergey *************************************************************************/ void cmatrixrndorthogonal(ae_int_t n, /* Complex */ ae_matrix* a, ae_state *_state) { ae_int_t i; ae_int_t j; ae_matrix_clear(a); ae_assert(n>=1, "CMatrixRndOrthogonal: N<1!", _state); ae_matrix_set_length(a, n, n, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { if( i==j ) { a->ptr.pp_complex[i][j] = ae_complex_from_i(1); } else { a->ptr.pp_complex[i][j] = ae_complex_from_i(0); } } } cmatrixrndorthogonalfromtheright(a, n, n, _state); } /************************************************************************* Generation of random NxN complex matrix with given condition number C and norm2(A)=1 INPUT PARAMETERS: N - matrix size C - condition number (in 2-norm) OUTPUT PARAMETERS: A - random matrix with norm2(A)=1 and cond(A)=C -- ALGLIB routine -- 04.12.2009 Bochkanov Sergey *************************************************************************/ void cmatrixrndcond(ae_int_t n, double c, /* Complex */ ae_matrix* a, ae_state *_state) { ae_frame _frame_block; ae_int_t i; ae_int_t j; double l1; double l2; hqrndstate state; ae_complex v; ae_frame_make(_state, &_frame_block); ae_matrix_clear(a); _hqrndstate_init(&state, _state); ae_assert(n>=1&&ae_fp_greater_eq(c,(double)(1)), "CMatrixRndCond: N<1 or C<1!", _state); ae_matrix_set_length(a, n, n, _state); if( n==1 ) { /* * special case */ hqrndrandomize(&state, _state); hqrndunit2(&state, &v.x, &v.y, _state); a->ptr.pp_complex[0][0] = v; ae_frame_leave(_state); return; } hqrndrandomize(&state, _state); l1 = (double)(0); l2 = ae_log(1/c, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { a->ptr.pp_complex[i][j] = ae_complex_from_i(0); } } a->ptr.pp_complex[0][0] = ae_complex_from_d(ae_exp(l1, _state)); for(i=1; i<=n-2; i++) { a->ptr.pp_complex[i][i] = ae_complex_from_d(ae_exp(hqrnduniformr(&state, _state)*(l2-l1)+l1, _state)); } a->ptr.pp_complex[n-1][n-1] = ae_complex_from_d(ae_exp(l2, _state)); cmatrixrndorthogonalfromtheleft(a, n, n, _state); cmatrixrndorthogonalfromtheright(a, n, n, _state); ae_frame_leave(_state); } /************************************************************************* Generation of random NxN symmetric matrix with given condition number and norm2(A)=1 INPUT PARAMETERS: N - matrix size C - condition number (in 2-norm) OUTPUT PARAMETERS: A - random matrix with norm2(A)=1 and cond(A)=C -- ALGLIB routine -- 04.12.2009 Bochkanov Sergey *************************************************************************/ void smatrixrndcond(ae_int_t n, double c, /* Real */ ae_matrix* a, ae_state *_state) { ae_frame _frame_block; ae_int_t i; ae_int_t j; double l1; double l2; hqrndstate rs; ae_frame_make(_state, &_frame_block); ae_matrix_clear(a); _hqrndstate_init(&rs, _state); ae_assert(n>=1&&ae_fp_greater_eq(c,(double)(1)), "SMatrixRndCond: N<1 or C<1!", _state); ae_matrix_set_length(a, n, n, _state); if( n==1 ) { /* * special case */ a->ptr.pp_double[0][0] = (double)(2*ae_randominteger(2, _state)-1); ae_frame_leave(_state); return; } /* * Prepare matrix */ hqrndrandomize(&rs, _state); l1 = (double)(0); l2 = ae_log(1/c, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { a->ptr.pp_double[i][j] = (double)(0); } } a->ptr.pp_double[0][0] = ae_exp(l1, _state); for(i=1; i<=n-2; i++) { a->ptr.pp_double[i][i] = (2*hqrnduniformi(&rs, 2, _state)-1)*ae_exp(hqrnduniformr(&rs, _state)*(l2-l1)+l1, _state); } a->ptr.pp_double[n-1][n-1] = ae_exp(l2, _state); /* * Multiply */ smatrixrndmultiply(a, n, _state); 
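/* the random orthogonal similarity above preserves the eigenvalues Exp(L1)..Exp(L2), so the 2-norm of A remains Exp(L1)=1 and cond(A) remains exactly C */ 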
ae_frame_leave(_state); } /************************************************************************* Generation of random NxN symmetric positive definite matrix with given condition number and norm2(A)=1 INPUT PARAMETERS: N - matrix size C - condition number (in 2-norm) OUTPUT PARAMETERS: A - random SPD matrix with norm2(A)=1 and cond(A)=C -- ALGLIB routine -- 04.12.2009 Bochkanov Sergey *************************************************************************/ void spdmatrixrndcond(ae_int_t n, double c, /* Real */ ae_matrix* a, ae_state *_state) { ae_frame _frame_block; ae_int_t i; ae_int_t j; double l1; double l2; hqrndstate rs; ae_frame_make(_state, &_frame_block); ae_matrix_clear(a); _hqrndstate_init(&rs, _state); /* * Special cases */ if( n<=0||ae_fp_less(c,(double)(1)) ) { ae_frame_leave(_state); return; } ae_matrix_set_length(a, n, n, _state); if( n==1 ) { a->ptr.pp_double[0][0] = (double)(1); ae_frame_leave(_state); return; } /* * Prepare matrix */ hqrndrandomize(&rs, _state); l1 = (double)(0); l2 = ae_log(1/c, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { a->ptr.pp_double[i][j] = (double)(0); } } a->ptr.pp_double[0][0] = ae_exp(l1, _state); for(i=1; i<=n-2; i++) { a->ptr.pp_double[i][i] = ae_exp(hqrnduniformr(&rs, _state)*(l2-l1)+l1, _state); } a->ptr.pp_double[n-1][n-1] = ae_exp(l2, _state); /* * Multiply */ smatrixrndmultiply(a, n, _state); ae_frame_leave(_state); } /************************************************************************* Generation of random NxN Hermitian matrix with given condition number and norm2(A)=1 INPUT PARAMETERS: N - matrix size C - condition number (in 2-norm) OUTPUT PARAMETERS: A - random matrix with norm2(A)=1 and cond(A)=C -- ALGLIB routine -- 04.12.2009 Bochkanov Sergey *************************************************************************/ void hmatrixrndcond(ae_int_t n, double c, /* Complex */ ae_matrix* a, ae_state *_state) { ae_frame _frame_block; ae_int_t i; ae_int_t j; double l1; double l2; hqrndstate rs; ae_frame_make(_state, &_frame_block); ae_matrix_clear(a); _hqrndstate_init(&rs, _state); ae_assert(n>=1&&ae_fp_greater_eq(c,(double)(1)), "HMatrixRndCond: N<1 or C<1!", _state); ae_matrix_set_length(a, n, n, _state); if( n==1 ) { /* * special case */ a->ptr.pp_complex[0][0] = ae_complex_from_i(2*ae_randominteger(2, _state)-1); ae_frame_leave(_state); return; } /* * Prepare matrix */ hqrndrandomize(&rs, _state); l1 = (double)(0); l2 = ae_log(1/c, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { a->ptr.pp_complex[i][j] = ae_complex_from_i(0); } } a->ptr.pp_complex[0][0] = ae_complex_from_d(ae_exp(l1, _state)); for(i=1; i<=n-2; i++) { a->ptr.pp_complex[i][i] = ae_complex_from_d((2*hqrnduniformi(&rs, 2, _state)-1)*ae_exp(hqrnduniformr(&rs, _state)*(l2-l1)+l1, _state)); } a->ptr.pp_complex[n-1][n-1] = ae_complex_from_d(ae_exp(l2, _state)); /* * Multiply */ hmatrixrndmultiply(a, n, _state); /* * post-process to ensure that matrix diagonal is real */ for(i=0; i<=n-1; i++) { a->ptr.pp_complex[i][i].y = (double)(0); } ae_frame_leave(_state); } /************************************************************************* Generation of random NxN Hermitian positive definite matrix with given condition number and norm2(A)=1 INPUT PARAMETERS: N - matrix size C - condition number (in 2-norm) OUTPUT PARAMETERS: A - random HPD matrix with norm2(A)=1 and cond(A)=C -- ALGLIB routine -- 04.12.2009 Bochkanov Sergey *************************************************************************/ void hpdmatrixrndcond(ae_int_t n, double c, /* 
Complex */ ae_matrix* a, ae_state *_state) { ae_frame _frame_block; ae_int_t i; ae_int_t j; double l1; double l2; hqrndstate rs; ae_frame_make(_state, &_frame_block); ae_matrix_clear(a); _hqrndstate_init(&rs, _state); /* * Special cases */ if( n<=0||ae_fp_less(c,(double)(1)) ) { ae_frame_leave(_state); return; } ae_matrix_set_length(a, n, n, _state); if( n==1 ) { a->ptr.pp_complex[0][0] = ae_complex_from_i(1); ae_frame_leave(_state); return; } /* * Prepare matrix */ hqrndrandomize(&rs, _state); l1 = (double)(0); l2 = ae_log(1/c, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { a->ptr.pp_complex[i][j] = ae_complex_from_i(0); } } a->ptr.pp_complex[0][0] = ae_complex_from_d(ae_exp(l1, _state)); for(i=1; i<=n-2; i++) { a->ptr.pp_complex[i][i] = ae_complex_from_d(ae_exp(hqrnduniformr(&rs, _state)*(l2-l1)+l1, _state)); } a->ptr.pp_complex[n-1][n-1] = ae_complex_from_d(ae_exp(l2, _state)); /* * Multiply */ hmatrixrndmultiply(a, n, _state); /* * post-process to ensure that matrix diagonal is real */ for(i=0; i<=n-1; i++) { a->ptr.pp_complex[i][i].y = (double)(0); } ae_frame_leave(_state); } /************************************************************************* Multiplication of MxN matrix by NxN random Haar distributed orthogonal matrix INPUT PARAMETERS: A - matrix, array[0..M-1, 0..N-1] M, N- matrix size OUTPUT PARAMETERS: A - A*Q, where Q is random NxN orthogonal matrix -- ALGLIB routine -- 04.12.2009 Bochkanov Sergey *************************************************************************/ void rmatrixrndorthogonalfromtheright(/* Real */ ae_matrix* a, ae_int_t m, ae_int_t n, ae_state *_state) { ae_frame _frame_block; double tau; double lambdav; ae_int_t s; ae_int_t i; double u1; double u2; ae_vector w; ae_vector v; hqrndstate state; ae_frame_make(_state, &_frame_block); ae_vector_init(&w, 0, DT_REAL, _state); ae_vector_init(&v, 0, DT_REAL, _state); _hqrndstate_init(&state, _state); ae_assert(n>=1&&m>=1, "RMatrixRndOrthogonalFromTheRight: N<1 or M<1!", _state); if( n==1 ) { /* * Special case */ tau = (double)(2*ae_randominteger(2, _state)-1); for(i=0; i<=m-1; i++) { a->ptr.pp_double[i][0] = a->ptr.pp_double[i][0]*tau; } ae_frame_leave(_state); return; } /* * General case. * First pass. */ ae_vector_set_length(&w, m, _state); ae_vector_set_length(&v, n+1, _state); hqrndrandomize(&state, _state); for(s=2; s<=n; s++) { /* * Prepare random normal v */ do { i = 1; while(i<=s) { hqrndnormal2(&state, &u1, &u2, _state); v.ptr.p_double[i] = u1; if( i+1<=s ) { v.ptr.p_double[i+1] = u2; } i = i+2; } lambdav = ae_v_dotproduct(&v.ptr.p_double[1], 1, &v.ptr.p_double[1], 1, ae_v_len(1,s)); } while(ae_fp_eq(lambdav,(double)(0))); /* * Prepare and apply reflection */ generatereflection(&v, s, &tau, _state); v.ptr.p_double[1] = (double)(1); applyreflectionfromtheright(a, tau, &v, 0, m-1, n-s, n-1, &w, _state); } /* * Second pass. 
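* Each column of the result is multiplied by a random factor of +1 or -1; this random sign scaling is needed for the reflection-based construction to produce an exactly Haar (uniform) distribution rather than merely an orthogonal matrix. 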
*/ for(i=0; i<=n-1; i++) { tau = (double)(2*hqrnduniformi(&state, 2, _state)-1); ae_v_muld(&a->ptr.pp_double[0][i], a->stride, ae_v_len(0,m-1), tau); } ae_frame_leave(_state); } /************************************************************************* Multiplication of MxN matrix by MxM random Haar distributed orthogonal matrix INPUT PARAMETERS: A - matrix, array[0..M-1, 0..N-1] M, N- matrix size OUTPUT PARAMETERS: A - Q*A, where Q is random MxM orthogonal matrix -- ALGLIB routine -- 04.12.2009 Bochkanov Sergey *************************************************************************/ void rmatrixrndorthogonalfromtheleft(/* Real */ ae_matrix* a, ae_int_t m, ae_int_t n, ae_state *_state) { ae_frame _frame_block; double tau; double lambdav; ae_int_t s; ae_int_t i; ae_int_t j; double u1; double u2; ae_vector w; ae_vector v; hqrndstate state; ae_frame_make(_state, &_frame_block); ae_vector_init(&w, 0, DT_REAL, _state); ae_vector_init(&v, 0, DT_REAL, _state); _hqrndstate_init(&state, _state); ae_assert(n>=1&&m>=1, "RMatrixRndOrthogonalFromTheRight: N<1 or M<1!", _state); if( m==1 ) { /* * special case */ tau = (double)(2*ae_randominteger(2, _state)-1); for(j=0; j<=n-1; j++) { a->ptr.pp_double[0][j] = a->ptr.pp_double[0][j]*tau; } ae_frame_leave(_state); return; } /* * General case. * First pass. */ ae_vector_set_length(&w, n, _state); ae_vector_set_length(&v, m+1, _state); hqrndrandomize(&state, _state); for(s=2; s<=m; s++) { /* * Prepare random normal v */ do { i = 1; while(i<=s) { hqrndnormal2(&state, &u1, &u2, _state); v.ptr.p_double[i] = u1; if( i+1<=s ) { v.ptr.p_double[i+1] = u2; } i = i+2; } lambdav = ae_v_dotproduct(&v.ptr.p_double[1], 1, &v.ptr.p_double[1], 1, ae_v_len(1,s)); } while(ae_fp_eq(lambdav,(double)(0))); /* * Prepare and apply reflection */ generatereflection(&v, s, &tau, _state); v.ptr.p_double[1] = (double)(1); applyreflectionfromtheleft(a, tau, &v, m-s, m-1, 0, n-1, &w, _state); } /* * Second pass. */ for(i=0; i<=m-1; i++) { tau = (double)(2*hqrnduniformi(&state, 2, _state)-1); ae_v_muld(&a->ptr.pp_double[i][0], 1, ae_v_len(0,n-1), tau); } ae_frame_leave(_state); } /************************************************************************* Multiplication of MxN complex matrix by NxN random Haar distributed complex orthogonal matrix INPUT PARAMETERS: A - matrix, array[0..M-1, 0..N-1] M, N- matrix size OUTPUT PARAMETERS: A - A*Q, where Q is random NxN orthogonal matrix -- ALGLIB routine -- 04.12.2009 Bochkanov Sergey *************************************************************************/ void cmatrixrndorthogonalfromtheright(/* Complex */ ae_matrix* a, ae_int_t m, ae_int_t n, ae_state *_state) { ae_frame _frame_block; ae_complex lambdav; ae_complex tau; ae_int_t s; ae_int_t i; ae_vector w; ae_vector v; hqrndstate state; ae_frame_make(_state, &_frame_block); ae_vector_init(&w, 0, DT_COMPLEX, _state); ae_vector_init(&v, 0, DT_COMPLEX, _state); _hqrndstate_init(&state, _state); ae_assert(n>=1&&m>=1, "CMatrixRndOrthogonalFromTheRight: N<1 or M<1!", _state); if( n==1 ) { /* * Special case */ hqrndrandomize(&state, _state); hqrndunit2(&state, &tau.x, &tau.y, _state); for(i=0; i<=m-1; i++) { a->ptr.pp_complex[i][0] = ae_c_mul(a->ptr.pp_complex[i][0],tau); } ae_frame_leave(_state); return; } /* * General case. * First pass. 
*/ ae_vector_set_length(&w, m, _state); ae_vector_set_length(&v, n+1, _state); hqrndrandomize(&state, _state); for(s=2; s<=n; s++) { /* * Prepare random normal v */ do { for(i=1; i<=s; i++) { hqrndnormal2(&state, &tau.x, &tau.y, _state); v.ptr.p_complex[i] = tau; } lambdav = ae_v_cdotproduct(&v.ptr.p_complex[1], 1, "N", &v.ptr.p_complex[1], 1, "Conj", ae_v_len(1,s)); } while(ae_c_eq_d(lambdav,(double)(0))); /* * Prepare and apply reflection */ complexgeneratereflection(&v, s, &tau, _state); v.ptr.p_complex[1] = ae_complex_from_i(1); complexapplyreflectionfromtheright(a, tau, &v, 0, m-1, n-s, n-1, &w, _state); } /* * Second pass. */ for(i=0; i<=n-1; i++) { hqrndunit2(&state, &tau.x, &tau.y, _state); ae_v_cmulc(&a->ptr.pp_complex[0][i], a->stride, ae_v_len(0,m-1), tau); } ae_frame_leave(_state); } /************************************************************************* Multiplication of MxN complex matrix by MxM random Haar distributed complex orthogonal matrix INPUT PARAMETERS: A - matrix, array[0..M-1, 0..N-1] M, N- matrix size OUTPUT PARAMETERS: A - Q*A, where Q is random MxM orthogonal matrix -- ALGLIB routine -- 04.12.2009 Bochkanov Sergey *************************************************************************/ void cmatrixrndorthogonalfromtheleft(/* Complex */ ae_matrix* a, ae_int_t m, ae_int_t n, ae_state *_state) { ae_frame _frame_block; ae_complex tau; ae_complex lambdav; ae_int_t s; ae_int_t i; ae_int_t j; ae_vector w; ae_vector v; hqrndstate state; ae_frame_make(_state, &_frame_block); ae_vector_init(&w, 0, DT_COMPLEX, _state); ae_vector_init(&v, 0, DT_COMPLEX, _state); _hqrndstate_init(&state, _state); ae_assert(n>=1&&m>=1, "CMatrixRndOrthogonalFromTheRight: N<1 or M<1!", _state); if( m==1 ) { /* * special case */ hqrndrandomize(&state, _state); hqrndunit2(&state, &tau.x, &tau.y, _state); for(j=0; j<=n-1; j++) { a->ptr.pp_complex[0][j] = ae_c_mul(a->ptr.pp_complex[0][j],tau); } ae_frame_leave(_state); return; } /* * General case. * First pass. */ ae_vector_set_length(&w, n, _state); ae_vector_set_length(&v, m+1, _state); hqrndrandomize(&state, _state); for(s=2; s<=m; s++) { /* * Prepare random normal v */ do { for(i=1; i<=s; i++) { hqrndnormal2(&state, &tau.x, &tau.y, _state); v.ptr.p_complex[i] = tau; } lambdav = ae_v_cdotproduct(&v.ptr.p_complex[1], 1, "N", &v.ptr.p_complex[1], 1, "Conj", ae_v_len(1,s)); } while(ae_c_eq_d(lambdav,(double)(0))); /* * Prepare and apply reflection */ complexgeneratereflection(&v, s, &tau, _state); v.ptr.p_complex[1] = ae_complex_from_i(1); complexapplyreflectionfromtheleft(a, tau, &v, m-s, m-1, 0, n-1, &w, _state); } /* * Second pass. 
*/ for(i=0; i<=m-1; i++) { hqrndunit2(&state, &tau.x, &tau.y, _state); ae_v_cmulc(&a->ptr.pp_complex[i][0], 1, ae_v_len(0,n-1), tau); } ae_frame_leave(_state); } /************************************************************************* Symmetric multiplication of NxN matrix by random Haar distributed orthogonal matrix INPUT PARAMETERS: A - matrix, array[0..N-1, 0..N-1] N - matrix size OUTPUT PARAMETERS: A - Q'*A*Q, where Q is random NxN orthogonal matrix -- ALGLIB routine -- 04.12.2009 Bochkanov Sergey *************************************************************************/ void smatrixrndmultiply(/* Real */ ae_matrix* a, ae_int_t n, ae_state *_state) { ae_frame _frame_block; double tau; double lambdav; ae_int_t s; ae_int_t i; double u1; double u2; ae_vector w; ae_vector v; hqrndstate state; ae_frame_make(_state, &_frame_block); ae_vector_init(&w, 0, DT_REAL, _state); ae_vector_init(&v, 0, DT_REAL, _state); _hqrndstate_init(&state, _state); /* * General case. */ ae_vector_set_length(&w, n, _state); ae_vector_set_length(&v, n+1, _state); hqrndrandomize(&state, _state); for(s=2; s<=n; s++) { /* * Prepare random normal v */ do { i = 1; while(i<=s) { hqrndnormal2(&state, &u1, &u2, _state); v.ptr.p_double[i] = u1; if( i+1<=s ) { v.ptr.p_double[i+1] = u2; } i = i+2; } lambdav = ae_v_dotproduct(&v.ptr.p_double[1], 1, &v.ptr.p_double[1], 1, ae_v_len(1,s)); } while(ae_fp_eq(lambdav,(double)(0))); /* * Prepare and apply reflection */ generatereflection(&v, s, &tau, _state); v.ptr.p_double[1] = (double)(1); applyreflectionfromtheright(a, tau, &v, 0, n-1, n-s, n-1, &w, _state); applyreflectionfromtheleft(a, tau, &v, n-s, n-1, 0, n-1, &w, _state); } /* * Second pass. */ for(i=0; i<=n-1; i++) { tau = (double)(2*hqrnduniformi(&state, 2, _state)-1); ae_v_muld(&a->ptr.pp_double[0][i], a->stride, ae_v_len(0,n-1), tau); ae_v_muld(&a->ptr.pp_double[i][0], 1, ae_v_len(0,n-1), tau); } /* * Copy upper triangle to lower */ for(i=0; i<=n-2; i++) { ae_v_move(&a->ptr.pp_double[i+1][i], a->stride, &a->ptr.pp_double[i][i+1], 1, ae_v_len(i+1,n-1)); } ae_frame_leave(_state); } /************************************************************************* Hermitian multiplication of NxN matrix by random Haar distributed complex orthogonal matrix INPUT PARAMETERS: A - matrix, array[0..N-1, 0..N-1] N - matrix size OUTPUT PARAMETERS: A - Q^H*A*Q, where Q is random NxN orthogonal matrix -- ALGLIB routine -- 04.12.2009 Bochkanov Sergey *************************************************************************/ void hmatrixrndmultiply(/* Complex */ ae_matrix* a, ae_int_t n, ae_state *_state) { ae_frame _frame_block; ae_complex tau; ae_complex lambdav; ae_int_t s; ae_int_t i; ae_vector w; ae_vector v; hqrndstate state; ae_frame_make(_state, &_frame_block); ae_vector_init(&w, 0, DT_COMPLEX, _state); ae_vector_init(&v, 0, DT_COMPLEX, _state); _hqrndstate_init(&state, _state); /* * General case. 
*/ ae_vector_set_length(&w, n, _state); ae_vector_set_length(&v, n+1, _state); hqrndrandomize(&state, _state); for(s=2; s<=n; s++) { /* * Prepare random normal v */ do { for(i=1; i<=s; i++) { hqrndnormal2(&state, &tau.x, &tau.y, _state); v.ptr.p_complex[i] = tau; } lambdav = ae_v_cdotproduct(&v.ptr.p_complex[1], 1, "N", &v.ptr.p_complex[1], 1, "Conj", ae_v_len(1,s)); } while(ae_c_eq_d(lambdav,(double)(0))); /* * Prepare and apply reflection */ complexgeneratereflection(&v, s, &tau, _state); v.ptr.p_complex[1] = ae_complex_from_i(1); complexapplyreflectionfromtheright(a, tau, &v, 0, n-1, n-s, n-1, &w, _state); complexapplyreflectionfromtheleft(a, ae_c_conj(tau, _state), &v, n-s, n-1, 0, n-1, &w, _state); } /* * Second pass. */ for(i=0; i<=n-1; i++) { hqrndunit2(&state, &tau.x, &tau.y, _state); ae_v_cmulc(&a->ptr.pp_complex[0][i], a->stride, ae_v_len(0,n-1), tau); tau = ae_c_conj(tau, _state); ae_v_cmulc(&a->ptr.pp_complex[i][0], 1, ae_v_len(0,n-1), tau); } /* * Change all values from lower triangle by complex-conjugate values * from upper one */ for(i=0; i<=n-2; i++) { ae_v_cmove(&a->ptr.pp_complex[i+1][i], a->stride, &a->ptr.pp_complex[i][i+1], 1, "N", ae_v_len(i+1,n-1)); } for(s=0; s<=n-2; s++) { for(i=s+1; i<=n-1; i++) { a->ptr.pp_complex[i][s].y = -a->ptr.pp_complex[i][s].y; } } ae_frame_leave(_state); } /************************************************************************* Splits matrix length in two parts, left part should match ABLAS block size INPUT PARAMETERS A - real matrix, is passed to ensure that we didn't split complex matrix using real splitting subroutine. matrix itself is not changed. N - length, N>0 OUTPUT PARAMETERS N1 - length N2 - length N1+N2=N, N1>=N2, N2 may be zero -- ALGLIB routine -- 15.12.2009 Bochkanov Sergey *************************************************************************/ void ablassplitlength(/* Real */ ae_matrix* a, ae_int_t n, ae_int_t* n1, ae_int_t* n2, ae_state *_state) { *n1 = 0; *n2 = 0; if( n>ablasblocksize(a, _state) ) { ablas_ablasinternalsplitlength(n, ablasblocksize(a, _state), n1, n2, _state); } else { ablas_ablasinternalsplitlength(n, ablasmicroblocksize(_state), n1, n2, _state); } } /************************************************************************* Complex ABLASSplitLength -- ALGLIB routine -- 15.12.2009 Bochkanov Sergey *************************************************************************/ void ablascomplexsplitlength(/* Complex */ ae_matrix* a, ae_int_t n, ae_int_t* n1, ae_int_t* n2, ae_state *_state) { *n1 = 0; *n2 = 0; if( n>ablascomplexblocksize(a, _state) ) { ablas_ablasinternalsplitlength(n, ablascomplexblocksize(a, _state), n1, n2, _state); } else { ablas_ablasinternalsplitlength(n, ablasmicroblocksize(_state), n1, n2, _state); } } /************************************************************************* Returns block size - subdivision size where cache-oblivious soubroutines switch to the optimized kernel. INPUT PARAMETERS A - real matrix, is passed to ensure that we didn't split complex matrix using real splitting subroutine. matrix itself is not changed. -- ALGLIB routine -- 15.12.2009 Bochkanov Sergey *************************************************************************/ ae_int_t ablasblocksize(/* Real */ ae_matrix* a, ae_state *_state) { ae_int_t result; result = 32; return result; } /************************************************************************* Block size for complex subroutines. 
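This is the complex counterpart of ABLASBlockSize (32) above; both act as the threshold at which the cache-oblivious recursions in this unit switch to their basecase kernels, while ABLASMicroBlockSize (8) below is used when splitting lengths that do not exceed the block size.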
-- ALGLIB routine -- 15.12.2009 Bochkanov Sergey *************************************************************************/ ae_int_t ablascomplexblocksize(/* Complex */ ae_matrix* a, ae_state *_state) { ae_int_t result; result = 24; return result; } /************************************************************************* Microblock size -- ALGLIB routine -- 15.12.2009 Bochkanov Sergey *************************************************************************/ ae_int_t ablasmicroblocksize(ae_state *_state) { ae_int_t result; result = 8; return result; } /************************************************************************* Cache-oblivous complex "copy-and-transpose" Input parameters: M - number of rows N - number of columns A - source matrix, MxN submatrix is copied and transposed IA - submatrix offset (row index) JA - submatrix offset (column index) B - destination matrix, must be large enough to store result IB - submatrix offset (row index) JB - submatrix offset (column index) *************************************************************************/ void cmatrixtranspose(ae_int_t m, ae_int_t n, /* Complex */ ae_matrix* a, ae_int_t ia, ae_int_t ja, /* Complex */ ae_matrix* b, ae_int_t ib, ae_int_t jb, ae_state *_state) { ae_int_t i; ae_int_t s1; ae_int_t s2; if( m<=2*ablascomplexblocksize(a, _state)&&n<=2*ablascomplexblocksize(a, _state) ) { /* * base case */ for(i=0; i<=m-1; i++) { ae_v_cmove(&b->ptr.pp_complex[ib][jb+i], b->stride, &a->ptr.pp_complex[ia+i][ja], 1, "N", ae_v_len(ib,ib+n-1)); } } else { /* * Cache-oblivious recursion */ if( m>n ) { ablascomplexsplitlength(a, m, &s1, &s2, _state); cmatrixtranspose(s1, n, a, ia, ja, b, ib, jb, _state); cmatrixtranspose(s2, n, a, ia+s1, ja, b, ib, jb+s1, _state); } else { ablascomplexsplitlength(a, n, &s1, &s2, _state); cmatrixtranspose(m, s1, a, ia, ja, b, ib, jb, _state); cmatrixtranspose(m, s2, a, ia, ja+s1, b, ib+s1, jb, _state); } } } /************************************************************************* Cache-oblivous real "copy-and-transpose" Input parameters: M - number of rows N - number of columns A - source matrix, MxN submatrix is copied and transposed IA - submatrix offset (row index) JA - submatrix offset (column index) B - destination matrix, must be large enough to store result IB - submatrix offset (row index) JB - submatrix offset (column index) *************************************************************************/ void rmatrixtranspose(ae_int_t m, ae_int_t n, /* Real */ ae_matrix* a, ae_int_t ia, ae_int_t ja, /* Real */ ae_matrix* b, ae_int_t ib, ae_int_t jb, ae_state *_state) { ae_int_t i; ae_int_t s1; ae_int_t s2; if( m<=2*ablasblocksize(a, _state)&&n<=2*ablasblocksize(a, _state) ) { /* * base case */ for(i=0; i<=m-1; i++) { ae_v_move(&b->ptr.pp_double[ib][jb+i], b->stride, &a->ptr.pp_double[ia+i][ja], 1, ae_v_len(ib,ib+n-1)); } } else { /* * Cache-oblivious recursion */ if( m>n ) { ablassplitlength(a, m, &s1, &s2, _state); rmatrixtranspose(s1, n, a, ia, ja, b, ib, jb, _state); rmatrixtranspose(s2, n, a, ia+s1, ja, b, ib, jb+s1, _state); } else { ablassplitlength(a, n, &s1, &s2, _state); rmatrixtranspose(m, s1, a, ia, ja, b, ib, jb, _state); rmatrixtranspose(m, s2, a, ia, ja+s1, b, ib+s1, jb, _state); } } } /************************************************************************* This code enforces symmetricy of the matrix by copying Upper part to lower one (or vice versa). 
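A typical usage pattern, sketched (assuming an initialized NxN ae_matrix A and ae_state ST, as in the sketches above): fill only the upper triangle and then mirror it,

    for(i=0; i<=n-1; i++)
        for(j=i; j<=n-1; j++)
            a.ptr.pp_double[i][j] = (double)(i+j);    // upper triangle only
    rmatrixenforcesymmetricity(&a, n, ae_true, &st);  // copy it into the lower triangle

after which a.ptr.pp_double[j][i]==a.ptr.pp_double[i][j] holds for all I,J.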
INPUT PARAMETERS: A - matrix N - number of rows/columns IsUpper - whether we want to copy upper triangle to lower one (True) or vice versa (False). *************************************************************************/ void rmatrixenforcesymmetricity(/* Real */ ae_matrix* a, ae_int_t n, ae_bool isupper, ae_state *_state) { ae_int_t i; ae_int_t j; if( isupper ) { for(i=0; i<=n-1; i++) { for(j=i+1; j<=n-1; j++) { a->ptr.pp_double[j][i] = a->ptr.pp_double[i][j]; } } } else { for(i=0; i<=n-1; i++) { for(j=i+1; j<=n-1; j++) { a->ptr.pp_double[i][j] = a->ptr.pp_double[j][i]; } } } } /************************************************************************* Copy Input parameters: M - number of rows N - number of columns A - source matrix, MxN submatrix is copied and transposed IA - submatrix offset (row index) JA - submatrix offset (column index) B - destination matrix, must be large enough to store result IB - submatrix offset (row index) JB - submatrix offset (column index) *************************************************************************/ void cmatrixcopy(ae_int_t m, ae_int_t n, /* Complex */ ae_matrix* a, ae_int_t ia, ae_int_t ja, /* Complex */ ae_matrix* b, ae_int_t ib, ae_int_t jb, ae_state *_state) { ae_int_t i; if( m==0||n==0 ) { return; } for(i=0; i<=m-1; i++) { ae_v_cmove(&b->ptr.pp_complex[ib+i][jb], 1, &a->ptr.pp_complex[ia+i][ja], 1, "N", ae_v_len(jb,jb+n-1)); } } /************************************************************************* Copy Input parameters: M - number of rows N - number of columns A - source matrix, MxN submatrix is copied and transposed IA - submatrix offset (row index) JA - submatrix offset (column index) B - destination matrix, must be large enough to store result IB - submatrix offset (row index) JB - submatrix offset (column index) *************************************************************************/ void rmatrixcopy(ae_int_t m, ae_int_t n, /* Real */ ae_matrix* a, ae_int_t ia, ae_int_t ja, /* Real */ ae_matrix* b, ae_int_t ib, ae_int_t jb, ae_state *_state) { ae_int_t i; if( m==0||n==0 ) { return; } for(i=0; i<=m-1; i++) { ae_v_move(&b->ptr.pp_double[ib+i][jb], 1, &a->ptr.pp_double[ia+i][ja], 1, ae_v_len(jb,jb+n-1)); } } /************************************************************************* Rank-1 correction: A := A + u*v' INPUT PARAMETERS: M - number of rows N - number of columns A - target matrix, MxN submatrix is updated IA - submatrix offset (row index) JA - submatrix offset (column index) U - vector #1 IU - subvector offset V - vector #2 IV - subvector offset *************************************************************************/ void cmatrixrank1(ae_int_t m, ae_int_t n, /* Complex */ ae_matrix* a, ae_int_t ia, ae_int_t ja, /* Complex */ ae_vector* u, ae_int_t iu, /* Complex */ ae_vector* v, ae_int_t iv, ae_state *_state) { ae_int_t i; ae_complex s; if( m==0||n==0 ) { return; } if( cmatrixrank1f(m, n, a, ia, ja, u, iu, v, iv, _state) ) { return; } for(i=0; i<=m-1; i++) { s = u->ptr.p_complex[iu+i]; ae_v_caddc(&a->ptr.pp_complex[ia+i][ja], 1, &v->ptr.p_complex[iv], 1, "N", ae_v_len(ja,ja+n-1), s); } } /************************************************************************* Rank-1 correction: A := A + u*v' INPUT PARAMETERS: M - number of rows N - number of columns A - target matrix, MxN submatrix is updated IA - submatrix offset (row index) JA - submatrix offset (column index) U - vector #1 IU - subvector offset V - vector #2 IV - subvector offset *************************************************************************/ void 
rmatrixrank1(ae_int_t m, ae_int_t n, /* Real */ ae_matrix* a, ae_int_t ia, ae_int_t ja, /* Real */ ae_vector* u, ae_int_t iu, /* Real */ ae_vector* v, ae_int_t iv, ae_state *_state) { ae_int_t i; double s; if( m==0||n==0 ) { return; } if( rmatrixrank1f(m, n, a, ia, ja, u, iu, v, iv, _state) ) { return; } for(i=0; i<=m-1; i++) { s = u->ptr.p_double[iu+i]; ae_v_addd(&a->ptr.pp_double[ia+i][ja], 1, &v->ptr.p_double[iv], 1, ae_v_len(ja,ja+n-1), s); } } /************************************************************************* Matrix-vector product: y := op(A)*x INPUT PARAMETERS: M - number of rows of op(A) M>=0 N - number of columns of op(A) N>=0 A - target matrix IA - submatrix offset (row index) JA - submatrix offset (column index) OpA - operation type: * OpA=0 => op(A) = A * OpA=1 => op(A) = A^T * OpA=2 => op(A) = A^H X - input vector IX - subvector offset IY - subvector offset Y - preallocated matrix, must be large enough to store result OUTPUT PARAMETERS: Y - vector which stores result if M=0, then subroutine does nothing. if N=0, Y is filled by zeros. -- ALGLIB routine -- 28.01.2010 Bochkanov Sergey *************************************************************************/ void cmatrixmv(ae_int_t m, ae_int_t n, /* Complex */ ae_matrix* a, ae_int_t ia, ae_int_t ja, ae_int_t opa, /* Complex */ ae_vector* x, ae_int_t ix, /* Complex */ ae_vector* y, ae_int_t iy, ae_state *_state) { ae_int_t i; ae_complex v; if( m==0 ) { return; } if( n==0 ) { for(i=0; i<=m-1; i++) { y->ptr.p_complex[iy+i] = ae_complex_from_i(0); } return; } if( cmatrixmvf(m, n, a, ia, ja, opa, x, ix, y, iy, _state) ) { return; } if( opa==0 ) { /* * y = A*x */ for(i=0; i<=m-1; i++) { v = ae_v_cdotproduct(&a->ptr.pp_complex[ia+i][ja], 1, "N", &x->ptr.p_complex[ix], 1, "N", ae_v_len(ja,ja+n-1)); y->ptr.p_complex[iy+i] = v; } return; } if( opa==1 ) { /* * y = A^T*x */ for(i=0; i<=m-1; i++) { y->ptr.p_complex[iy+i] = ae_complex_from_i(0); } for(i=0; i<=n-1; i++) { v = x->ptr.p_complex[ix+i]; ae_v_caddc(&y->ptr.p_complex[iy], 1, &a->ptr.pp_complex[ia+i][ja], 1, "N", ae_v_len(iy,iy+m-1), v); } return; } if( opa==2 ) { /* * y = A^H*x */ for(i=0; i<=m-1; i++) { y->ptr.p_complex[iy+i] = ae_complex_from_i(0); } for(i=0; i<=n-1; i++) { v = x->ptr.p_complex[ix+i]; ae_v_caddc(&y->ptr.p_complex[iy], 1, &a->ptr.pp_complex[ia+i][ja], 1, "Conj", ae_v_len(iy,iy+m-1), v); } return; } } /************************************************************************* Matrix-vector product: y := op(A)*x INPUT PARAMETERS: M - number of rows of op(A) N - number of columns of op(A) A - target matrix IA - submatrix offset (row index) JA - submatrix offset (column index) OpA - operation type: * OpA=0 => op(A) = A * OpA=1 => op(A) = A^T X - input vector IX - subvector offset IY - subvector offset Y - preallocated matrix, must be large enough to store result OUTPUT PARAMETERS: Y - vector which stores result if M=0, then subroutine does nothing. if N=0, Y is filled by zeros. 
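A minimal call sketch (A, X, Y are C-core structures initialized as in the examples above, with ae_vector_init(&x, 0, DT_REAL, &st) for the vectors): for a 2x3 matrix stored in A,

    ae_vector_set_length(&x, 3, &st);
    ae_vector_set_length(&y, 2, &st);                 // Y must be preallocated: length at least IY+M
    // ... fill a.ptr.pp_double[0..1][0..2] and x.ptr.p_double[0..2] here ...
    rmatrixmv(2, 3, &a, 0, 0, 0, &x, 0, &y, 0, &st);  // OpA=0: y := A*x

With OpA=1 the routine computes y := A^T*x; the A block is then addressed as an N x M (here 3x2) submatrix.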
-- ALGLIB routine -- 28.01.2010 Bochkanov Sergey *************************************************************************/ void rmatrixmv(ae_int_t m, ae_int_t n, /* Real */ ae_matrix* a, ae_int_t ia, ae_int_t ja, ae_int_t opa, /* Real */ ae_vector* x, ae_int_t ix, /* Real */ ae_vector* y, ae_int_t iy, ae_state *_state) { ae_int_t i; double v; if( m==0 ) { return; } if( n==0 ) { for(i=0; i<=m-1; i++) { y->ptr.p_double[iy+i] = (double)(0); } return; } if( rmatrixmvf(m, n, a, ia, ja, opa, x, ix, y, iy, _state) ) { return; } if( opa==0 ) { /* * y = A*x */ for(i=0; i<=m-1; i++) { v = ae_v_dotproduct(&a->ptr.pp_double[ia+i][ja], 1, &x->ptr.p_double[ix], 1, ae_v_len(ja,ja+n-1)); y->ptr.p_double[iy+i] = v; } return; } if( opa==1 ) { /* * y = A^T*x */ for(i=0; i<=m-1; i++) { y->ptr.p_double[iy+i] = (double)(0); } for(i=0; i<=n-1; i++) { v = x->ptr.p_double[ix+i]; ae_v_addd(&y->ptr.p_double[iy], 1, &a->ptr.pp_double[ia+i][ja], 1, ae_v_len(iy,iy+m-1), v); } return; } } /************************************************************************* This subroutine calculates X*op(A^-1) where: * X is MxN general matrix * A is NxN upper/lower triangular/unitriangular matrix * "op" may be identity transformation, transposition, conjugate transposition Multiplication result replaces X. Cache-oblivious algorithm is used. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. Because starting/stopping worker thread always ! involves some overhead, parallelism starts to be profitable for N's ! larger than 128. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. 
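NOTE: the routine overwrites X with X*op(A)^(-1); thus, to solve a matrix equation X*op(A)=B, copy B into X and then call this function.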
INPUT PARAMETERS N - matrix size, N>=0 M - matrix size, N>=0 A - matrix, actial matrix is stored in A[I1:I1+N-1,J1:J1+N-1] I1 - submatrix offset J1 - submatrix offset IsUpper - whether matrix is upper triangular IsUnit - whether matrix is unitriangular OpType - transformation type: * 0 - no transformation * 1 - transposition * 2 - conjugate transposition X - matrix, actial matrix is stored in X[I2:I2+M-1,J2:J2+N-1] I2 - submatrix offset J2 - submatrix offset -- ALGLIB routine -- 15.12.2009 Bochkanov Sergey *************************************************************************/ void cmatrixrighttrsm(ae_int_t m, ae_int_t n, /* Complex */ ae_matrix* a, ae_int_t i1, ae_int_t j1, ae_bool isupper, ae_bool isunit, ae_int_t optype, /* Complex */ ae_matrix* x, ae_int_t i2, ae_int_t j2, ae_state *_state) { ae_int_t s1; ae_int_t s2; ae_int_t bs; bs = ablascomplexblocksize(a, _state); /* * Basecase: either MKL-supported code or ALGLIB basecase code */ if( cmatrixrighttrsmmkl(m, n, a, i1, j1, isupper, isunit, optype, x, i2, j2, _state) ) { return; } if( m<=bs&&n<=bs ) { ablas_cmatrixrighttrsm2(m, n, a, i1, j1, isupper, isunit, optype, x, i2, j2, _state); return; } /* * Recursive subdivision */ if( m>=n ) { /* * Split X: X*A = (X1 X2)^T*A */ ablascomplexsplitlength(a, m, &s1, &s2, _state); cmatrixrighttrsm(s1, n, a, i1, j1, isupper, isunit, optype, x, i2, j2, _state); cmatrixrighttrsm(s2, n, a, i1, j1, isupper, isunit, optype, x, i2+s1, j2, _state); return; } else { /* * Split A: * (A1 A12) * X*op(A) = X*op( ) * ( A2) * * Different variants depending on * IsUpper/OpType combinations */ ablascomplexsplitlength(a, n, &s1, &s2, _state); if( isupper&&optype==0 ) { /* * (A1 A12)-1 * X*A^-1 = (X1 X2)*( ) * ( A2) */ cmatrixrighttrsm(m, s1, a, i1, j1, isupper, isunit, optype, x, i2, j2, _state); cmatrixgemm(m, s2, s1, ae_complex_from_d(-1.0), x, i2, j2, 0, a, i1, j1+s1, 0, ae_complex_from_d(1.0), x, i2, j2+s1, _state); cmatrixrighttrsm(m, s2, a, i1+s1, j1+s1, isupper, isunit, optype, x, i2, j2+s1, _state); return; } if( isupper&&optype!=0 ) { /* * (A1' )-1 * X*A^-1 = (X1 X2)*( ) * (A12' A2') */ cmatrixrighttrsm(m, s2, a, i1+s1, j1+s1, isupper, isunit, optype, x, i2, j2+s1, _state); cmatrixgemm(m, s1, s2, ae_complex_from_d(-1.0), x, i2, j2+s1, 0, a, i1, j1+s1, optype, ae_complex_from_d(1.0), x, i2, j2, _state); cmatrixrighttrsm(m, s1, a, i1, j1, isupper, isunit, optype, x, i2, j2, _state); return; } if( !isupper&&optype==0 ) { /* * (A1 )-1 * X*A^-1 = (X1 X2)*( ) * (A21 A2) */ cmatrixrighttrsm(m, s2, a, i1+s1, j1+s1, isupper, isunit, optype, x, i2, j2+s1, _state); cmatrixgemm(m, s1, s2, ae_complex_from_d(-1.0), x, i2, j2+s1, 0, a, i1+s1, j1, 0, ae_complex_from_d(1.0), x, i2, j2, _state); cmatrixrighttrsm(m, s1, a, i1, j1, isupper, isunit, optype, x, i2, j2, _state); return; } if( !isupper&&optype!=0 ) { /* * (A1' A21')-1 * X*A^-1 = (X1 X2)*( ) * ( A2') */ cmatrixrighttrsm(m, s1, a, i1, j1, isupper, isunit, optype, x, i2, j2, _state); cmatrixgemm(m, s2, s1, ae_complex_from_d(-1.0), x, i2, j2, 0, a, i1+s1, j1, optype, ae_complex_from_d(1.0), x, i2, j2+s1, _state); cmatrixrighttrsm(m, s2, a, i1+s1, j1+s1, isupper, isunit, optype, x, i2, j2+s1, _state); return; } } } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. 
*************************************************************************/ void _pexec_cmatrixrighttrsm(ae_int_t m, ae_int_t n, /* Complex */ ae_matrix* a, ae_int_t i1, ae_int_t j1, ae_bool isupper, ae_bool isunit, ae_int_t optype, /* Complex */ ae_matrix* x, ae_int_t i2, ae_int_t j2, ae_state *_state) { cmatrixrighttrsm(m,n,a,i1,j1,isupper,isunit,optype,x,i2,j2, _state); } /************************************************************************* This subroutine calculates op(A^-1)*X where: * X is MxN general matrix * A is MxM upper/lower triangular/unitriangular matrix * "op" may be identity transformation, transposition, conjugate transposition Multiplication result replaces X. Cache-oblivious algorithm is used. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. Because starting/stopping worker thread always ! involves some overhead, parallelism starts to be profitable for N's ! larger than 128. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. 
INPUT PARAMETERS N - matrix size, N>=0 M - matrix size, N>=0 A - matrix, actial matrix is stored in A[I1:I1+M-1,J1:J1+M-1] I1 - submatrix offset J1 - submatrix offset IsUpper - whether matrix is upper triangular IsUnit - whether matrix is unitriangular OpType - transformation type: * 0 - no transformation * 1 - transposition * 2 - conjugate transposition X - matrix, actial matrix is stored in X[I2:I2+M-1,J2:J2+N-1] I2 - submatrix offset J2 - submatrix offset -- ALGLIB routine -- 15.12.2009 Bochkanov Sergey *************************************************************************/ void cmatrixlefttrsm(ae_int_t m, ae_int_t n, /* Complex */ ae_matrix* a, ae_int_t i1, ae_int_t j1, ae_bool isupper, ae_bool isunit, ae_int_t optype, /* Complex */ ae_matrix* x, ae_int_t i2, ae_int_t j2, ae_state *_state) { ae_int_t s1; ae_int_t s2; ae_int_t bs; bs = ablascomplexblocksize(a, _state); /* * Basecase: either MKL-supported code or ALGLIB basecase code */ if( cmatrixlefttrsmmkl(m, n, a, i1, j1, isupper, isunit, optype, x, i2, j2, _state) ) { return; } if( m<=bs&&n<=bs ) { ablas_cmatrixlefttrsm2(m, n, a, i1, j1, isupper, isunit, optype, x, i2, j2, _state); return; } /* * Recursive subdivision */ if( n>=m ) { /* * Split X: op(A)^-1*X = op(A)^-1*(X1 X2) */ ablascomplexsplitlength(x, n, &s1, &s2, _state); cmatrixlefttrsm(m, s1, a, i1, j1, isupper, isunit, optype, x, i2, j2, _state); cmatrixlefttrsm(m, s2, a, i1, j1, isupper, isunit, optype, x, i2, j2+s1, _state); return; } else { /* * Split A */ ablascomplexsplitlength(a, m, &s1, &s2, _state); if( isupper&&optype==0 ) { /* * (A1 A12)-1 ( X1 ) * A^-1*X* = ( ) *( ) * ( A2) ( X2 ) */ cmatrixlefttrsm(s2, n, a, i1+s1, j1+s1, isupper, isunit, optype, x, i2+s1, j2, _state); cmatrixgemm(s1, n, s2, ae_complex_from_d(-1.0), a, i1, j1+s1, 0, x, i2+s1, j2, 0, ae_complex_from_d(1.0), x, i2, j2, _state); cmatrixlefttrsm(s1, n, a, i1, j1, isupper, isunit, optype, x, i2, j2, _state); return; } if( isupper&&optype!=0 ) { /* * (A1' )-1 ( X1 ) * A^-1*X = ( ) *( ) * (A12' A2') ( X2 ) */ cmatrixlefttrsm(s1, n, a, i1, j1, isupper, isunit, optype, x, i2, j2, _state); cmatrixgemm(s2, n, s1, ae_complex_from_d(-1.0), a, i1, j1+s1, optype, x, i2, j2, 0, ae_complex_from_d(1.0), x, i2+s1, j2, _state); cmatrixlefttrsm(s2, n, a, i1+s1, j1+s1, isupper, isunit, optype, x, i2+s1, j2, _state); return; } if( !isupper&&optype==0 ) { /* * (A1 )-1 ( X1 ) * A^-1*X = ( ) *( ) * (A21 A2) ( X2 ) */ cmatrixlefttrsm(s1, n, a, i1, j1, isupper, isunit, optype, x, i2, j2, _state); cmatrixgemm(s2, n, s1, ae_complex_from_d(-1.0), a, i1+s1, j1, 0, x, i2, j2, 0, ae_complex_from_d(1.0), x, i2+s1, j2, _state); cmatrixlefttrsm(s2, n, a, i1+s1, j1+s1, isupper, isunit, optype, x, i2+s1, j2, _state); return; } if( !isupper&&optype!=0 ) { /* * (A1' A21')-1 ( X1 ) * A^-1*X = ( ) *( ) * ( A2') ( X2 ) */ cmatrixlefttrsm(s2, n, a, i1+s1, j1+s1, isupper, isunit, optype, x, i2+s1, j2, _state); cmatrixgemm(s1, n, s2, ae_complex_from_d(-1.0), a, i1+s1, j1, optype, x, i2+s1, j2, 0, ae_complex_from_d(1.0), x, i2, j2, _state); cmatrixlefttrsm(s1, n, a, i1, j1, isupper, isunit, optype, x, i2, j2, _state); return; } } } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. 
*************************************************************************/ void _pexec_cmatrixlefttrsm(ae_int_t m, ae_int_t n, /* Complex */ ae_matrix* a, ae_int_t i1, ae_int_t j1, ae_bool isupper, ae_bool isunit, ae_int_t optype, /* Complex */ ae_matrix* x, ae_int_t i2, ae_int_t j2, ae_state *_state) { cmatrixlefttrsm(m,n,a,i1,j1,isupper,isunit,optype,x,i2,j2, _state); } /************************************************************************* This subroutine calculates X*op(A^-1) where: * X is MxN general matrix * A is NxN upper/lower triangular/unitriangular matrix * "op" may be identity transformation, transposition Multiplication result replaces X. Cache-oblivious algorithm is used. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. Because starting/stopping worker thread always ! involves some overhead, parallelism starts to be profitable for N's ! larger than 128. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. 
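A minimal call sketch for this real version (A and X are initialized ae_matrix structures, A holding an NxN upper triangular matrix in rows/columns 0..N-1 and X holding the MxN right-hand side B):

    rmatrixrighttrsm(m, n, &a, 0, 0, ae_true, ae_false, 0, &x, 0, 0, &st);
    // on exit X contains B*A^(-1), i.e. the solution of X*A=B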
INPUT PARAMETERS N - matrix size, N>=0 M - matrix size, N>=0 A - matrix, actial matrix is stored in A[I1:I1+N-1,J1:J1+N-1] I1 - submatrix offset J1 - submatrix offset IsUpper - whether matrix is upper triangular IsUnit - whether matrix is unitriangular OpType - transformation type: * 0 - no transformation * 1 - transposition X - matrix, actial matrix is stored in X[I2:I2+M-1,J2:J2+N-1] I2 - submatrix offset J2 - submatrix offset -- ALGLIB routine -- 15.12.2009 Bochkanov Sergey *************************************************************************/ void rmatrixrighttrsm(ae_int_t m, ae_int_t n, /* Real */ ae_matrix* a, ae_int_t i1, ae_int_t j1, ae_bool isupper, ae_bool isunit, ae_int_t optype, /* Real */ ae_matrix* x, ae_int_t i2, ae_int_t j2, ae_state *_state) { ae_int_t s1; ae_int_t s2; ae_int_t bs; bs = ablasblocksize(a, _state); /* * Basecase: MKL or ALGLIB code */ if( rmatrixrighttrsmmkl(m, n, a, i1, j1, isupper, isunit, optype, x, i2, j2, _state) ) { return; } if( m<=bs&&n<=bs ) { ablas_rmatrixrighttrsm2(m, n, a, i1, j1, isupper, isunit, optype, x, i2, j2, _state); return; } /* * Recursive subdivision */ if( m>=n ) { /* * Split X: X*A = (X1 X2)^T*A */ ablassplitlength(a, m, &s1, &s2, _state); rmatrixrighttrsm(s1, n, a, i1, j1, isupper, isunit, optype, x, i2, j2, _state); rmatrixrighttrsm(s2, n, a, i1, j1, isupper, isunit, optype, x, i2+s1, j2, _state); return; } else { /* * Split A: * (A1 A12) * X*op(A) = X*op( ) * ( A2) * * Different variants depending on * IsUpper/OpType combinations */ ablassplitlength(a, n, &s1, &s2, _state); if( isupper&&optype==0 ) { /* * (A1 A12)-1 * X*A^-1 = (X1 X2)*( ) * ( A2) */ rmatrixrighttrsm(m, s1, a, i1, j1, isupper, isunit, optype, x, i2, j2, _state); rmatrixgemm(m, s2, s1, -1.0, x, i2, j2, 0, a, i1, j1+s1, 0, 1.0, x, i2, j2+s1, _state); rmatrixrighttrsm(m, s2, a, i1+s1, j1+s1, isupper, isunit, optype, x, i2, j2+s1, _state); return; } if( isupper&&optype!=0 ) { /* * (A1' )-1 * X*A^-1 = (X1 X2)*( ) * (A12' A2') */ rmatrixrighttrsm(m, s2, a, i1+s1, j1+s1, isupper, isunit, optype, x, i2, j2+s1, _state); rmatrixgemm(m, s1, s2, -1.0, x, i2, j2+s1, 0, a, i1, j1+s1, optype, 1.0, x, i2, j2, _state); rmatrixrighttrsm(m, s1, a, i1, j1, isupper, isunit, optype, x, i2, j2, _state); return; } if( !isupper&&optype==0 ) { /* * (A1 )-1 * X*A^-1 = (X1 X2)*( ) * (A21 A2) */ rmatrixrighttrsm(m, s2, a, i1+s1, j1+s1, isupper, isunit, optype, x, i2, j2+s1, _state); rmatrixgemm(m, s1, s2, -1.0, x, i2, j2+s1, 0, a, i1+s1, j1, 0, 1.0, x, i2, j2, _state); rmatrixrighttrsm(m, s1, a, i1, j1, isupper, isunit, optype, x, i2, j2, _state); return; } if( !isupper&&optype!=0 ) { /* * (A1' A21')-1 * X*A^-1 = (X1 X2)*( ) * ( A2') */ rmatrixrighttrsm(m, s1, a, i1, j1, isupper, isunit, optype, x, i2, j2, _state); rmatrixgemm(m, s2, s1, -1.0, x, i2, j2, 0, a, i1+s1, j1, optype, 1.0, x, i2, j2+s1, _state); rmatrixrighttrsm(m, s2, a, i1+s1, j1+s1, isupper, isunit, optype, x, i2, j2+s1, _state); return; } } } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. 
*************************************************************************/ void _pexec_rmatrixrighttrsm(ae_int_t m, ae_int_t n, /* Real */ ae_matrix* a, ae_int_t i1, ae_int_t j1, ae_bool isupper, ae_bool isunit, ae_int_t optype, /* Real */ ae_matrix* x, ae_int_t i2, ae_int_t j2, ae_state *_state) { rmatrixrighttrsm(m,n,a,i1,j1,isupper,isunit,optype,x,i2,j2, _state); } /************************************************************************* This subroutine calculates op(A^-1)*X where: * X is MxN general matrix * A is MxM upper/lower triangular/unitriangular matrix * "op" may be identity transformation, transposition Multiplication result replaces X. Cache-oblivious algorithm is used. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. Because starting/stopping worker thread always ! involves some overhead, parallelism starts to be profitable for N's ! larger than 128. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. 
INPUT PARAMETERS N - matrix size, N>=0 M - matrix size, N>=0 A - matrix, actial matrix is stored in A[I1:I1+M-1,J1:J1+M-1] I1 - submatrix offset J1 - submatrix offset IsUpper - whether matrix is upper triangular IsUnit - whether matrix is unitriangular OpType - transformation type: * 0 - no transformation * 1 - transposition X - matrix, actial matrix is stored in X[I2:I2+M-1,J2:J2+N-1] I2 - submatrix offset J2 - submatrix offset -- ALGLIB routine -- 15.12.2009 Bochkanov Sergey *************************************************************************/ void rmatrixlefttrsm(ae_int_t m, ae_int_t n, /* Real */ ae_matrix* a, ae_int_t i1, ae_int_t j1, ae_bool isupper, ae_bool isunit, ae_int_t optype, /* Real */ ae_matrix* x, ae_int_t i2, ae_int_t j2, ae_state *_state) { ae_int_t s1; ae_int_t s2; ae_int_t bs; bs = ablasblocksize(a, _state); /* * Basecase: MKL or ALGLIB code */ if( rmatrixlefttrsmmkl(m, n, a, i1, j1, isupper, isunit, optype, x, i2, j2, _state) ) { return; } if( m<=bs&&n<=bs ) { ablas_rmatrixlefttrsm2(m, n, a, i1, j1, isupper, isunit, optype, x, i2, j2, _state); return; } /* * Recursive subdivision */ if( n>=m ) { /* * Split X: op(A)^-1*X = op(A)^-1*(X1 X2) */ ablassplitlength(x, n, &s1, &s2, _state); rmatrixlefttrsm(m, s1, a, i1, j1, isupper, isunit, optype, x, i2, j2, _state); rmatrixlefttrsm(m, s2, a, i1, j1, isupper, isunit, optype, x, i2, j2+s1, _state); } else { /* * Split A */ ablassplitlength(a, m, &s1, &s2, _state); if( isupper&&optype==0 ) { /* * (A1 A12)-1 ( X1 ) * A^-1*X* = ( ) *( ) * ( A2) ( X2 ) */ rmatrixlefttrsm(s2, n, a, i1+s1, j1+s1, isupper, isunit, optype, x, i2+s1, j2, _state); rmatrixgemm(s1, n, s2, -1.0, a, i1, j1+s1, 0, x, i2+s1, j2, 0, 1.0, x, i2, j2, _state); rmatrixlefttrsm(s1, n, a, i1, j1, isupper, isunit, optype, x, i2, j2, _state); return; } if( isupper&&optype!=0 ) { /* * (A1' )-1 ( X1 ) * A^-1*X = ( ) *( ) * (A12' A2') ( X2 ) */ rmatrixlefttrsm(s1, n, a, i1, j1, isupper, isunit, optype, x, i2, j2, _state); rmatrixgemm(s2, n, s1, -1.0, a, i1, j1+s1, optype, x, i2, j2, 0, 1.0, x, i2+s1, j2, _state); rmatrixlefttrsm(s2, n, a, i1+s1, j1+s1, isupper, isunit, optype, x, i2+s1, j2, _state); return; } if( !isupper&&optype==0 ) { /* * (A1 )-1 ( X1 ) * A^-1*X = ( ) *( ) * (A21 A2) ( X2 ) */ rmatrixlefttrsm(s1, n, a, i1, j1, isupper, isunit, optype, x, i2, j2, _state); rmatrixgemm(s2, n, s1, -1.0, a, i1+s1, j1, 0, x, i2, j2, 0, 1.0, x, i2+s1, j2, _state); rmatrixlefttrsm(s2, n, a, i1+s1, j1+s1, isupper, isunit, optype, x, i2+s1, j2, _state); return; } if( !isupper&&optype!=0 ) { /* * (A1' A21')-1 ( X1 ) * A^-1*X = ( ) *( ) * ( A2') ( X2 ) */ rmatrixlefttrsm(s2, n, a, i1+s1, j1+s1, isupper, isunit, optype, x, i2+s1, j2, _state); rmatrixgemm(s1, n, s2, -1.0, a, i1+s1, j1, optype, x, i2+s1, j2, 0, 1.0, x, i2, j2, _state); rmatrixlefttrsm(s1, n, a, i1, j1, isupper, isunit, optype, x, i2, j2, _state); return; } } } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. 
*************************************************************************/ void _pexec_rmatrixlefttrsm(ae_int_t m, ae_int_t n, /* Real */ ae_matrix* a, ae_int_t i1, ae_int_t j1, ae_bool isupper, ae_bool isunit, ae_int_t optype, /* Real */ ae_matrix* x, ae_int_t i2, ae_int_t j2, ae_state *_state) { rmatrixlefttrsm(m,n,a,i1,j1,isupper,isunit,optype,x,i2,j2, _state); } /************************************************************************* This subroutine calculates C=alpha*A*A^H+beta*C or C=alpha*A^H*A+beta*C where: * C is NxN Hermitian matrix given by its upper/lower triangle * A is NxK matrix when A*A^H is calculated, KxN matrix otherwise Additional info: * cache-oblivious algorithm is used. * multiplication result replaces C. If Beta=0, C elements are not used in calculations (not multiplied by zero - just not referenced) * if Alpha=0, A is not used (not multiplied by zero - just not referenced) * if both Beta and Alpha are zero, C is filled by zeros. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. Because starting/stopping worker thread always ! involves some overhead, parallelism starts to be profitable for N's ! larger than 128. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS N - matrix size, N>=0 K - matrix size, K>=0 Alpha - coefficient A - matrix IA - submatrix offset (row index) JA - submatrix offset (column index) OpTypeA - multiplication type: * 0 - A*A^H is calculated * 2 - A^H*A is calculated Beta - coefficient C - preallocated input/output matrix IC - submatrix offset (row index) JC - submatrix offset (column index) IsUpper - whether upper or lower triangle of C is updated; this function updates only one half of C, leaving other half unchanged (not referenced at all). 
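For example, with Alpha=1.0, Beta=0.0, OpTypeA=0 and IsUpper=True the call

    cmatrixherk(n, k, 1.0, &a, 0, 0, 0, 0.0, &c, 0, 0, ae_true, &st);

(A and C being initialized NxK and NxN complex ae_matrix structures) fills the upper triangle of C with A*A^H; the lower triangle of C is left untouched.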
-- ALGLIB routine -- 16.12.2009 Bochkanov Sergey *************************************************************************/ void cmatrixherk(ae_int_t n, ae_int_t k, double alpha, /* Complex */ ae_matrix* a, ae_int_t ia, ae_int_t ja, ae_int_t optypea, double beta, /* Complex */ ae_matrix* c, ae_int_t ic, ae_int_t jc, ae_bool isupper, ae_state *_state) { ae_int_t s1; ae_int_t s2; ae_int_t bs; bs = ablascomplexblocksize(a, _state); /* * Use MKL or ALGLIB basecase code */ if( cmatrixherkmkl(n, k, alpha, a, ia, ja, optypea, beta, c, ic, jc, isupper, _state) ) { return; } if( n<=bs&&k<=bs ) { ablas_cmatrixherk2(n, k, alpha, a, ia, ja, optypea, beta, c, ic, jc, isupper, _state); return; } /* * Recursive division of the problem */ if( k>=n ) { /* * Split K */ ablascomplexsplitlength(a, k, &s1, &s2, _state); if( optypea==0 ) { cmatrixherk(n, s1, alpha, a, ia, ja, optypea, beta, c, ic, jc, isupper, _state); cmatrixherk(n, s2, alpha, a, ia, ja+s1, optypea, 1.0, c, ic, jc, isupper, _state); } else { cmatrixherk(n, s1, alpha, a, ia, ja, optypea, beta, c, ic, jc, isupper, _state); cmatrixherk(n, s2, alpha, a, ia+s1, ja, optypea, 1.0, c, ic, jc, isupper, _state); } } else { /* * Split N */ ablascomplexsplitlength(a, n, &s1, &s2, _state); if( optypea==0&&isupper ) { cmatrixherk(s1, k, alpha, a, ia, ja, optypea, beta, c, ic, jc, isupper, _state); cmatrixgemm(s1, s2, k, ae_complex_from_d(alpha), a, ia, ja, 0, a, ia+s1, ja, 2, ae_complex_from_d(beta), c, ic, jc+s1, _state); cmatrixherk(s2, k, alpha, a, ia+s1, ja, optypea, beta, c, ic+s1, jc+s1, isupper, _state); return; } if( optypea==0&&!isupper ) { cmatrixherk(s1, k, alpha, a, ia, ja, optypea, beta, c, ic, jc, isupper, _state); cmatrixgemm(s2, s1, k, ae_complex_from_d(alpha), a, ia+s1, ja, 0, a, ia, ja, 2, ae_complex_from_d(beta), c, ic+s1, jc, _state); cmatrixherk(s2, k, alpha, a, ia+s1, ja, optypea, beta, c, ic+s1, jc+s1, isupper, _state); return; } if( optypea!=0&&isupper ) { cmatrixherk(s1, k, alpha, a, ia, ja, optypea, beta, c, ic, jc, isupper, _state); cmatrixgemm(s1, s2, k, ae_complex_from_d(alpha), a, ia, ja, 2, a, ia, ja+s1, 0, ae_complex_from_d(beta), c, ic, jc+s1, _state); cmatrixherk(s2, k, alpha, a, ia, ja+s1, optypea, beta, c, ic+s1, jc+s1, isupper, _state); return; } if( optypea!=0&&!isupper ) { cmatrixherk(s1, k, alpha, a, ia, ja, optypea, beta, c, ic, jc, isupper, _state); cmatrixgemm(s2, s1, k, ae_complex_from_d(alpha), a, ia, ja+s1, 2, a, ia, ja, 0, ae_complex_from_d(beta), c, ic+s1, jc, _state); cmatrixherk(s2, k, alpha, a, ia, ja+s1, optypea, beta, c, ic+s1, jc+s1, isupper, _state); return; } } } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. *************************************************************************/ void _pexec_cmatrixherk(ae_int_t n, ae_int_t k, double alpha, /* Complex */ ae_matrix* a, ae_int_t ia, ae_int_t ja, ae_int_t optypea, double beta, /* Complex */ ae_matrix* c, ae_int_t ic, ae_int_t jc, ae_bool isupper, ae_state *_state) { cmatrixherk(n,k,alpha,a,ia,ja,optypea,beta,c,ic,jc,isupper, _state); } /************************************************************************* This subroutine calculates C=alpha*A*A^T+beta*C or C=alpha*A^T*A+beta*C where: * C is NxN symmetric matrix given by its upper/lower triangle * A is NxK matrix when A*A^T is calculated, KxN matrix otherwise Additional info: * cache-oblivious algorithm is used. * multiplication result replaces C. 
If Beta=0, C elements are not used in calculations (not multiplied by zero - just not referenced) * if Alpha=0, A is not used (not multiplied by zero - just not referenced) * if both Beta and Alpha are zero, C is filled by zeros. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. Because starting/stopping worker thread always ! involves some overhead, parallelism starts to be profitable for N's ! larger than 128. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS N - matrix size, N>=0 K - matrix size, K>=0 Alpha - coefficient A - matrix IA - submatrix offset (row index) JA - submatrix offset (column index) OpTypeA - multiplication type: * 0 - A*A^T is calculated * 2 - A^T*A is calculated Beta - coefficient C - preallocated input/output matrix IC - submatrix offset (row index) JC - submatrix offset (column index) IsUpper - whether C is upper triangular or lower triangular -- ALGLIB routine -- 16.12.2009 Bochkanov Sergey *************************************************************************/ void rmatrixsyrk(ae_int_t n, ae_int_t k, double alpha, /* Real */ ae_matrix* a, ae_int_t ia, ae_int_t ja, ae_int_t optypea, double beta, /* Real */ ae_matrix* c, ae_int_t ic, ae_int_t jc, ae_bool isupper, ae_state *_state) { ae_int_t s1; ae_int_t s2; ae_int_t bs; bs = ablasblocksize(a, _state); /* * Use MKL or generic basecase code */ if( rmatrixsyrkmkl(n, k, alpha, a, ia, ja, optypea, beta, c, ic, jc, isupper, _state) ) { return; } if( n<=bs&&k<=bs ) { ablas_rmatrixsyrk2(n, k, alpha, a, ia, ja, optypea, beta, c, ic, jc, isupper, _state); return; } /* * Recursive subdivision of the problem */ if( k>=n ) { /* * Split K */ ablassplitlength(a, k, &s1, &s2, _state); if( optypea==0 ) { rmatrixsyrk(n, s1, alpha, a, ia, ja, optypea, beta, c, ic, jc, isupper, _state); rmatrixsyrk(n, s2, alpha, a, ia, ja+s1, optypea, 1.0, c, ic, jc, isupper, _state); } else { rmatrixsyrk(n, s1, alpha, a, ia, ja, optypea, beta, c, ic, jc, isupper, _state); rmatrixsyrk(n, s2, alpha, a, ia+s1, ja, optypea, 1.0, c, ic, jc, isupper, _state); } } else { /* * Split N */ ablassplitlength(a, n, &s1, &s2, _state); if( optypea==0&&isupper ) { rmatrixsyrk(s1, k, alpha, a, ia, ja, optypea, beta, c, ic, jc, isupper, _state); rmatrixgemm(s1, s2, k, alpha, a, ia, ja, 0, a, ia+s1, ja, 1, beta, c, ic, jc+s1, _state); rmatrixsyrk(s2, k, alpha, a, ia+s1, ja, optypea, beta, c, ic+s1, jc+s1, isupper, 
_state); return; } if( optypea==0&&!isupper ) { rmatrixsyrk(s1, k, alpha, a, ia, ja, optypea, beta, c, ic, jc, isupper, _state); rmatrixgemm(s2, s1, k, alpha, a, ia+s1, ja, 0, a, ia, ja, 1, beta, c, ic+s1, jc, _state); rmatrixsyrk(s2, k, alpha, a, ia+s1, ja, optypea, beta, c, ic+s1, jc+s1, isupper, _state); return; } if( optypea!=0&&isupper ) { rmatrixsyrk(s1, k, alpha, a, ia, ja, optypea, beta, c, ic, jc, isupper, _state); rmatrixgemm(s1, s2, k, alpha, a, ia, ja, 1, a, ia, ja+s1, 0, beta, c, ic, jc+s1, _state); rmatrixsyrk(s2, k, alpha, a, ia, ja+s1, optypea, beta, c, ic+s1, jc+s1, isupper, _state); return; } if( optypea!=0&&!isupper ) { rmatrixsyrk(s1, k, alpha, a, ia, ja, optypea, beta, c, ic, jc, isupper, _state); rmatrixgemm(s2, s1, k, alpha, a, ia, ja+s1, 1, a, ia, ja, 0, beta, c, ic+s1, jc, _state); rmatrixsyrk(s2, k, alpha, a, ia, ja+s1, optypea, beta, c, ic+s1, jc+s1, isupper, _state); return; } } } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. *************************************************************************/ void _pexec_rmatrixsyrk(ae_int_t n, ae_int_t k, double alpha, /* Real */ ae_matrix* a, ae_int_t ia, ae_int_t ja, ae_int_t optypea, double beta, /* Real */ ae_matrix* c, ae_int_t ic, ae_int_t jc, ae_bool isupper, ae_state *_state) { rmatrixsyrk(n,k,alpha,a,ia,ja,optypea,beta,c,ic,jc,isupper, _state); } /************************************************************************* This subroutine calculates C = alpha*op1(A)*op2(B) +beta*C where: * C is MxN general matrix * op1(A) is MxK matrix * op2(B) is KxN matrix * "op" may be identity transformation, transposition, conjugate transposition Additional info: * cache-oblivious algorithm is used. * multiplication result replaces C. If Beta=0, C elements are not used in calculations (not multiplied by zero - just not referenced) * if Alpha=0, A is not used (not multiplied by zero - just not referenced) * if both Beta and Alpha are zero, C is filled by zeros. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. Because starting/stopping worker thread always ! involves some overhead, parallelism starts to be profitable for N's ! larger than 128. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. 
IMPORTANT: This function does NOT preallocate output matrix C, it MUST be preallocated by caller prior to calling this function. In case C does not have enough space to store result, exception will be generated. INPUT PARAMETERS M - matrix size, M>0 N - matrix size, N>0 K - matrix size, K>0 Alpha - coefficient A - matrix IA - submatrix offset JA - submatrix offset OpTypeA - transformation type: * 0 - no transformation * 1 - transposition * 2 - conjugate transposition B - matrix IB - submatrix offset JB - submatrix offset OpTypeB - transformation type: * 0 - no transformation * 1 - transposition * 2 - conjugate transposition Beta - coefficient C - matrix (PREALLOCATED, large enough to store result) IC - submatrix offset JC - submatrix offset -- ALGLIB routine -- 16.12.2009 Bochkanov Sergey *************************************************************************/ void cmatrixgemm(ae_int_t m, ae_int_t n, ae_int_t k, ae_complex alpha, /* Complex */ ae_matrix* a, ae_int_t ia, ae_int_t ja, ae_int_t optypea, /* Complex */ ae_matrix* b, ae_int_t ib, ae_int_t jb, ae_int_t optypeb, ae_complex beta, /* Complex */ ae_matrix* c, ae_int_t ic, ae_int_t jc, ae_state *_state) { ae_int_t s1; ae_int_t s2; ae_int_t bs; bs = ablascomplexblocksize(a, _state); /* * Use MKL or ALGLIB basecase code */ if( cmatrixgemmmkl(m, n, k, alpha, a, ia, ja, optypea, b, ib, jb, optypeb, beta, c, ic, jc, _state) ) { return; } if( (m<=bs&&n<=bs)&&k<=bs ) { cmatrixgemmk(m, n, k, alpha, a, ia, ja, optypea, b, ib, jb, optypeb, beta, c, ic, jc, _state); return; } /* * SMP support is turned on when M or N are larger than some boundary value. * Magnitude of K is not taken into account because splitting on K does not * allow us to spawn child tasks. */ /* * Recursive algorithm: parallel splitting on M/N */ if( m>=n&&m>=k ) { /* * A*B = (A1 A2)^T*B */ ablascomplexsplitlength(a, m, &s1, &s2, _state); cmatrixgemm(s1, n, k, alpha, a, ia, ja, optypea, b, ib, jb, optypeb, beta, c, ic, jc, _state); if( optypea==0 ) { cmatrixgemm(s2, n, k, alpha, a, ia+s1, ja, optypea, b, ib, jb, optypeb, beta, c, ic+s1, jc, _state); } else { cmatrixgemm(s2, n, k, alpha, a, ia, ja+s1, optypea, b, ib, jb, optypeb, beta, c, ic+s1, jc, _state); } return; } if( n>=m&&n>=k ) { /* * A*B = A*(B1 B2) */ ablascomplexsplitlength(a, n, &s1, &s2, _state); if( optypeb==0 ) { cmatrixgemm(m, s1, k, alpha, a, ia, ja, optypea, b, ib, jb, optypeb, beta, c, ic, jc, _state); cmatrixgemm(m, s2, k, alpha, a, ia, ja, optypea, b, ib, jb+s1, optypeb, beta, c, ic, jc+s1, _state); } else { cmatrixgemm(m, s1, k, alpha, a, ia, ja, optypea, b, ib, jb, optypeb, beta, c, ic, jc, _state); cmatrixgemm(m, s2, k, alpha, a, ia, ja, optypea, b, ib+s1, jb, optypeb, beta, c, ic, jc+s1, _state); } return; } /* * Recursive algorithm: serial splitting on K */ /* * A*B = (A1 A2)*(B1 B2)^T */ ablascomplexsplitlength(a, k, &s1, &s2, _state); if( optypea==0&&optypeb==0 ) { cmatrixgemm(m, n, s1, alpha, a, ia, ja, optypea, b, ib, jb, optypeb, beta, c, ic, jc, _state); cmatrixgemm(m, n, s2, alpha, a, ia, ja+s1, optypea, b, ib+s1, jb, optypeb, ae_complex_from_d(1.0), c, ic, jc, _state); } if( optypea==0&&optypeb!=0 ) { cmatrixgemm(m, n, s1, alpha, a, ia, ja, optypea, b, ib, jb, optypeb, beta, c, ic, jc, _state); cmatrixgemm(m, n, s2, alpha, a, ia, ja+s1, optypea, b, ib, jb+s1, optypeb, ae_complex_from_d(1.0), c, ic, jc, _state); } if( optypea!=0&&optypeb==0 ) { cmatrixgemm(m, n, s1, alpha, a, ia, ja, optypea, b, ib, jb, optypeb, beta, c, ic, jc, _state); cmatrixgemm(m, n, s2, alpha, a, ia+s1, ja, optypea, b, 
ib+s1, jb, optypeb, ae_complex_from_d(1.0), c, ic, jc, _state); } if( optypea!=0&&optypeb!=0 ) { cmatrixgemm(m, n, s1, alpha, a, ia, ja, optypea, b, ib, jb, optypeb, beta, c, ic, jc, _state); cmatrixgemm(m, n, s2, alpha, a, ia+s1, ja, optypea, b, ib, jb+s1, optypeb, ae_complex_from_d(1.0), c, ic, jc, _state); } return; } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. *************************************************************************/ void _pexec_cmatrixgemm(ae_int_t m, ae_int_t n, ae_int_t k, ae_complex alpha, /* Complex */ ae_matrix* a, ae_int_t ia, ae_int_t ja, ae_int_t optypea, /* Complex */ ae_matrix* b, ae_int_t ib, ae_int_t jb, ae_int_t optypeb, ae_complex beta, /* Complex */ ae_matrix* c, ae_int_t ic, ae_int_t jc, ae_state *_state) { cmatrixgemm(m,n,k,alpha,a,ia,ja,optypea,b,ib,jb,optypeb,beta,c,ic,jc, _state); } /************************************************************************* This subroutine calculates C = alpha*op1(A)*op2(B) +beta*C where: * C is MxN general matrix * op1(A) is MxK matrix * op2(B) is KxN matrix * "op" may be identity transformation, transposition Additional info: * cache-oblivious algorithm is used. * multiplication result replaces C. If Beta=0, C elements are not used in calculations (not multiplied by zero - just not referenced) * if Alpha=0, A is not used (not multiplied by zero - just not referenced) * if both Beta and Alpha are zero, C is filled by zeros. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. Because starting/stopping worker thread always ! involves some overhead, parallelism starts to be profitable for N's ! larger than 128. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. IMPORTANT: This function does NOT preallocate output matrix C, it MUST be preallocated by caller prior to calling this function. In case C does not have enough space to store result, exception will be generated. 
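Written elementwise, the result is C[IC+I,JC+J] := Beta*C[IC+I,JC+J] + Alpha*sum(T=0..K-1, op1(A)[I,T]*op2(B)[T,J]) for I=0..M-1, J=0..N-1, where op(X) is X itself for OpType=0 and X^T for OpType=1; the recursive splitting performed below changes only the order in which these sums are accumulated, not the mathematical result.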
INPUT PARAMETERS M - matrix size, M>0 N - matrix size, N>0 K - matrix size, K>0 Alpha - coefficient A - matrix IA - submatrix offset JA - submatrix offset OpTypeA - transformation type: * 0 - no transformation * 1 - transposition B - matrix IB - submatrix offset JB - submatrix offset OpTypeB - transformation type: * 0 - no transformation * 1 - transposition Beta - coefficient C - PREALLOCATED output matrix, large enough to store result IC - submatrix offset JC - submatrix offset -- ALGLIB routine -- 2009-2013 Bochkanov Sergey *************************************************************************/ void rmatrixgemm(ae_int_t m, ae_int_t n, ae_int_t k, double alpha, /* Real */ ae_matrix* a, ae_int_t ia, ae_int_t ja, ae_int_t optypea, /* Real */ ae_matrix* b, ae_int_t ib, ae_int_t jb, ae_int_t optypeb, double beta, /* Real */ ae_matrix* c, ae_int_t ic, ae_int_t jc, ae_state *_state) { ae_int_t s1; ae_int_t s2; ae_int_t bs; bs = ablasblocksize(a, _state); /* * Check input sizes for correctness */ ae_assert(optypea==0||optypea==1, "RMatrixGEMM: incorrect OpTypeA (must be 0 or 1)", _state); ae_assert(optypeb==0||optypeb==1, "RMatrixGEMM: incorrect OpTypeB (must be 0 or 1)", _state); ae_assert(ic+m<=c->rows, "RMatrixGEMM: incorect size of output matrix C", _state); ae_assert(jc+n<=c->cols, "RMatrixGEMM: incorect size of output matrix C", _state); /* * Use MKL or ALGLIB basecase code */ if( rmatrixgemmmkl(m, n, k, alpha, a, ia, ja, optypea, b, ib, jb, optypeb, beta, c, ic, jc, _state) ) { return; } if( (m<=bs&&n<=bs)&&k<=bs ) { rmatrixgemmk(m, n, k, alpha, a, ia, ja, optypea, b, ib, jb, optypeb, beta, c, ic, jc, _state); return; } /* * SMP support is turned on when M or N are larger than some boundary value. * Magnitude of K is not taken into account because splitting on K does not * allow us to spawn child tasks. 
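* For reference, the internal splitter defined later in this unit (ablas_ablasinternalsplitlength) splits a length which is not a multiple of the block size into the largest block-aligned part plus the remainder (e.g. 70 -> 64+6 for block size 32), and splits an exact multiple roughly in half along a block boundary (e.g. 64 -> 32+32).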
*/ /* * Recursive algorithm: split on M or N */ if( m>=n&&m>=k ) { /* * A*B = (A1 A2)^T*B */ ablassplitlength(a, m, &s1, &s2, _state); if( optypea==0 ) { rmatrixgemm(s1, n, k, alpha, a, ia, ja, optypea, b, ib, jb, optypeb, beta, c, ic, jc, _state); rmatrixgemm(s2, n, k, alpha, a, ia+s1, ja, optypea, b, ib, jb, optypeb, beta, c, ic+s1, jc, _state); } else { rmatrixgemm(s1, n, k, alpha, a, ia, ja, optypea, b, ib, jb, optypeb, beta, c, ic, jc, _state); rmatrixgemm(s2, n, k, alpha, a, ia, ja+s1, optypea, b, ib, jb, optypeb, beta, c, ic+s1, jc, _state); } return; } if( n>=m&&n>=k ) { /* * A*B = A*(B1 B2) */ ablassplitlength(a, n, &s1, &s2, _state); if( optypeb==0 ) { rmatrixgemm(m, s1, k, alpha, a, ia, ja, optypea, b, ib, jb, optypeb, beta, c, ic, jc, _state); rmatrixgemm(m, s2, k, alpha, a, ia, ja, optypea, b, ib, jb+s1, optypeb, beta, c, ic, jc+s1, _state); } else { rmatrixgemm(m, s1, k, alpha, a, ia, ja, optypea, b, ib, jb, optypeb, beta, c, ic, jc, _state); rmatrixgemm(m, s2, k, alpha, a, ia, ja, optypea, b, ib+s1, jb, optypeb, beta, c, ic, jc+s1, _state); } return; } /* * Recursive algorithm: split on K */ /* * A*B = (A1 A2)*(B1 B2)^T */ ablassplitlength(a, k, &s1, &s2, _state); if( optypea==0&&optypeb==0 ) { rmatrixgemm(m, n, s1, alpha, a, ia, ja, optypea, b, ib, jb, optypeb, beta, c, ic, jc, _state); rmatrixgemm(m, n, s2, alpha, a, ia, ja+s1, optypea, b, ib+s1, jb, optypeb, 1.0, c, ic, jc, _state); } if( optypea==0&&optypeb!=0 ) { rmatrixgemm(m, n, s1, alpha, a, ia, ja, optypea, b, ib, jb, optypeb, beta, c, ic, jc, _state); rmatrixgemm(m, n, s2, alpha, a, ia, ja+s1, optypea, b, ib, jb+s1, optypeb, 1.0, c, ic, jc, _state); } if( optypea!=0&&optypeb==0 ) { rmatrixgemm(m, n, s1, alpha, a, ia, ja, optypea, b, ib, jb, optypeb, beta, c, ic, jc, _state); rmatrixgemm(m, n, s2, alpha, a, ia+s1, ja, optypea, b, ib+s1, jb, optypeb, 1.0, c, ic, jc, _state); } if( optypea!=0&&optypeb!=0 ) { rmatrixgemm(m, n, s1, alpha, a, ia, ja, optypea, b, ib, jb, optypeb, beta, c, ic, jc, _state); rmatrixgemm(m, n, s2, alpha, a, ia+s1, ja, optypea, b, ib, jb+s1, optypeb, 1.0, c, ic, jc, _state); } return; } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. *************************************************************************/ void _pexec_rmatrixgemm(ae_int_t m, ae_int_t n, ae_int_t k, double alpha, /* Real */ ae_matrix* a, ae_int_t ia, ae_int_t ja, ae_int_t optypea, /* Real */ ae_matrix* b, ae_int_t ib, ae_int_t jb, ae_int_t optypeb, double beta, /* Real */ ae_matrix* c, ae_int_t ic, ae_int_t jc, ae_state *_state) { rmatrixgemm(m,n,k,alpha,a,ia,ja,optypea,b,ib,jb,optypeb,beta,c,ic,jc, _state); } /************************************************************************* This subroutine is an older version of CMatrixHERK(), one with wrong name (it is HErmitian update, not SYmmetric). It is left here for backward compatibility. -- ALGLIB routine -- 16.12.2009 Bochkanov Sergey *************************************************************************/ void cmatrixsyrk(ae_int_t n, ae_int_t k, double alpha, /* Complex */ ae_matrix* a, ae_int_t ia, ae_int_t ja, ae_int_t optypea, double beta, /* Complex */ ae_matrix* c, ae_int_t ic, ae_int_t jc, ae_bool isupper, ae_state *_state) { cmatrixherk(n, k, alpha, a, ia, ja, optypea, beta, c, ic, jc, isupper, _state); } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. 
*************************************************************************/ void _pexec_cmatrixsyrk(ae_int_t n, ae_int_t k, double alpha, /* Complex */ ae_matrix* a, ae_int_t ia, ae_int_t ja, ae_int_t optypea, double beta, /* Complex */ ae_matrix* c, ae_int_t ic, ae_int_t jc, ae_bool isupper, ae_state *_state) { cmatrixsyrk(n,k,alpha,a,ia,ja,optypea,beta,c,ic,jc,isupper, _state); }
/************************************************************************* Complex ABLASSplitLength -- ALGLIB routine -- 15.12.2009 Bochkanov Sergey *************************************************************************/ static void ablas_ablasinternalsplitlength(ae_int_t n, ae_int_t nb, ae_int_t* n1, ae_int_t* n2, ae_state *_state) { ae_int_t r; *n1 = 0; *n2 = 0; if( n<=nb ) { /* * Block size, no further splitting */ *n1 = n; *n2 = 0; } else { /* * Greater than block size */ if( n%nb!=0 ) { /* * Split remainder */ *n2 = n%nb; *n1 = n-(*n2); } else { /* * Split on block boundaries */ *n2 = n/2; *n1 = n-(*n2); if( *n1%nb==0 ) { return; } r = nb-*n1%nb; *n1 = *n1+r; *n2 = *n2-r; } } }
/************************************************************************* Level 2 variant of CMatrixRightTRSM *************************************************************************/ static void ablas_cmatrixrighttrsm2(ae_int_t m, ae_int_t n, /* Complex */ ae_matrix* a, ae_int_t i1, ae_int_t j1, ae_bool isupper, ae_bool isunit, ae_int_t optype, /* Complex */ ae_matrix* x, ae_int_t i2, ae_int_t j2, ae_state *_state) { ae_int_t i; ae_int_t j; ae_complex vc; ae_complex vd; /* * Special case */ if( n*m==0 ) { return; } /* * Try to call fast TRSM */ if( cmatrixrighttrsmf(m, n, a, i1, j1, isupper, isunit, optype, x, i2, j2, _state) ) { return; } /* * General case */ if( isupper ) { /* * Upper triangular matrix */ if( optype==0 ) { /* * X*A^(-1) */ for(i=0; i<=m-1; i++) { for(j=0; j<=n-1; j++) { if( isunit ) { vd = ae_complex_from_i(1); } else { vd = a->ptr.pp_complex[i1+j][j1+j]; } x->ptr.pp_complex[i2+i][j2+j] = ae_c_div(x->ptr.pp_complex[i2+i][j2+j],vd); if( j<n-1 ) { vc = x->ptr.pp_complex[i2+i][j2+j]; ae_v_csubc(&x->ptr.pp_complex[i2+i][j2+j+1], 1, &a->ptr.pp_complex[i1+j][j1+j+1], 1, "N", ae_v_len(j2+j+1,j2+n-1), vc); } } } return; } if( optype==1 ) { /* * X*A^(-T) */ for(i=0; i<=m-1; i++) { for(j=n-1; j>=0; j--) { vc = ae_complex_from_i(0); vd = ae_complex_from_i(1); if( j<n-1 ) { vc = ae_v_cdotproduct(&x->ptr.pp_complex[i2+i][j2+j+1], 1, "N", &a->ptr.pp_complex[i1+j][j1+j+1], 1, "N", ae_v_len(j2+j+1,j2+n-1)); } if( !isunit ) { vd = a->ptr.pp_complex[i1+j][j1+j]; } x->ptr.pp_complex[i2+i][j2+j] = ae_c_div(ae_c_sub(x->ptr.pp_complex[i2+i][j2+j],vc),vd); } } return; } if( optype==2 ) { /* * X*A^(-H) */ for(i=0; i<=m-1; i++) { for(j=n-1; j>=0; j--) { vc = ae_complex_from_i(0); vd = ae_complex_from_i(1); if( j<n-1 ) { vc = ae_v_cdotproduct(&x->ptr.pp_complex[i2+i][j2+j+1], 1, "N", &a->ptr.pp_complex[i1+j][j1+j+1], 1, "Conj", ae_v_len(j2+j+1,j2+n-1)); } if( !isunit ) { vd = ae_c_conj(a->ptr.pp_complex[i1+j][j1+j], _state); } x->ptr.pp_complex[i2+i][j2+j] = ae_c_div(ae_c_sub(x->ptr.pp_complex[i2+i][j2+j],vc),vd); } } return; } } else { /* * Lower triangular matrix */ if( optype==0 ) { /* * X*A^(-1) */ for(i=0; i<=m-1; i++) { for(j=n-1; j>=0; j--) { if( isunit ) { vd = ae_complex_from_i(1); } else { vd = a->ptr.pp_complex[i1+j][j1+j]; } x->ptr.pp_complex[i2+i][j2+j] = ae_c_div(x->ptr.pp_complex[i2+i][j2+j],vd); if( j>0 ) { vc = x->ptr.pp_complex[i2+i][j2+j]; ae_v_csubc(&x->ptr.pp_complex[i2+i][j2], 1, &a->ptr.pp_complex[i1+j][j1], 1, "N", ae_v_len(j2,j2+j-1), vc); } } } return; } if( optype==1 ) { /* * X*A^(-T) */
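/* * Each row of X is processed independently by forward substitution: for J=0..N-1 the entries X[I,0..J-1] computed so far are dotted with row J of A (columns 0..J-1), the dot product is subtracted from X[I,J], and the result is divided by A[J,J] (or left as is for a unit-diagonal matrix). */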
for(i=0; i<=m-1; i++) { for(j=0; j<=n-1; j++) { vc = ae_complex_from_i(0); vd = ae_complex_from_i(1); if( j>0 ) { vc = ae_v_cdotproduct(&x->ptr.pp_complex[i2+i][j2], 1, "N", &a->ptr.pp_complex[i1+j][j1], 1, "N", ae_v_len(j2,j2+j-1)); } if( !isunit ) { vd = a->ptr.pp_complex[i1+j][j1+j]; } x->ptr.pp_complex[i2+i][j2+j] = ae_c_div(ae_c_sub(x->ptr.pp_complex[i2+i][j2+j],vc),vd); } } return; } if( optype==2 ) { /* * X*A^(-H) */ for(i=0; i<=m-1; i++) { for(j=0; j<=n-1; j++) { vc = ae_complex_from_i(0); vd = ae_complex_from_i(1); if( j>0 ) { vc = ae_v_cdotproduct(&x->ptr.pp_complex[i2+i][j2], 1, "N", &a->ptr.pp_complex[i1+j][j1], 1, "Conj", ae_v_len(j2,j2+j-1)); } if( !isunit ) { vd = ae_c_conj(a->ptr.pp_complex[i1+j][j1+j], _state); } x->ptr.pp_complex[i2+i][j2+j] = ae_c_div(ae_c_sub(x->ptr.pp_complex[i2+i][j2+j],vc),vd); } } return; } } } /************************************************************************* Level-2 subroutine *************************************************************************/ static void ablas_cmatrixlefttrsm2(ae_int_t m, ae_int_t n, /* Complex */ ae_matrix* a, ae_int_t i1, ae_int_t j1, ae_bool isupper, ae_bool isunit, ae_int_t optype, /* Complex */ ae_matrix* x, ae_int_t i2, ae_int_t j2, ae_state *_state) { ae_int_t i; ae_int_t j; ae_complex vc; ae_complex vd; /* * Special case */ if( n*m==0 ) { return; } /* * Try to call fast TRSM */ if( cmatrixlefttrsmf(m, n, a, i1, j1, isupper, isunit, optype, x, i2, j2, _state) ) { return; } /* * General case */ if( isupper ) { /* * Upper triangular matrix */ if( optype==0 ) { /* * A^(-1)*X */ for(i=m-1; i>=0; i--) { for(j=i+1; j<=m-1; j++) { vc = a->ptr.pp_complex[i1+i][j1+j]; ae_v_csubc(&x->ptr.pp_complex[i2+i][j2], 1, &x->ptr.pp_complex[i2+j][j2], 1, "N", ae_v_len(j2,j2+n-1), vc); } if( !isunit ) { vd = ae_c_d_div(1,a->ptr.pp_complex[i1+i][j1+i]); ae_v_cmulc(&x->ptr.pp_complex[i2+i][j2], 1, ae_v_len(j2,j2+n-1), vd); } } return; } if( optype==1 ) { /* * A^(-T)*X */ for(i=0; i<=m-1; i++) { if( isunit ) { vd = ae_complex_from_i(1); } else { vd = ae_c_d_div(1,a->ptr.pp_complex[i1+i][j1+i]); } ae_v_cmulc(&x->ptr.pp_complex[i2+i][j2], 1, ae_v_len(j2,j2+n-1), vd); for(j=i+1; j<=m-1; j++) { vc = a->ptr.pp_complex[i1+i][j1+j]; ae_v_csubc(&x->ptr.pp_complex[i2+j][j2], 1, &x->ptr.pp_complex[i2+i][j2], 1, "N", ae_v_len(j2,j2+n-1), vc); } } return; } if( optype==2 ) { /* * A^(-H)*X */ for(i=0; i<=m-1; i++) { if( isunit ) { vd = ae_complex_from_i(1); } else { vd = ae_c_d_div(1,ae_c_conj(a->ptr.pp_complex[i1+i][j1+i], _state)); } ae_v_cmulc(&x->ptr.pp_complex[i2+i][j2], 1, ae_v_len(j2,j2+n-1), vd); for(j=i+1; j<=m-1; j++) { vc = ae_c_conj(a->ptr.pp_complex[i1+i][j1+j], _state); ae_v_csubc(&x->ptr.pp_complex[i2+j][j2], 1, &x->ptr.pp_complex[i2+i][j2], 1, "N", ae_v_len(j2,j2+n-1), vc); } } return; } } else { /* * Lower triangular matrix */ if( optype==0 ) { /* * A^(-1)*X */ for(i=0; i<=m-1; i++) { for(j=0; j<=i-1; j++) { vc = a->ptr.pp_complex[i1+i][j1+j]; ae_v_csubc(&x->ptr.pp_complex[i2+i][j2], 1, &x->ptr.pp_complex[i2+j][j2], 1, "N", ae_v_len(j2,j2+n-1), vc); } if( isunit ) { vd = ae_complex_from_i(1); } else { vd = ae_c_d_div(1,a->ptr.pp_complex[i1+j][j1+j]); } ae_v_cmulc(&x->ptr.pp_complex[i2+i][j2], 1, ae_v_len(j2,j2+n-1), vd); } return; } if( optype==1 ) { /* * A^(-T)*X */ for(i=m-1; i>=0; i--) { if( isunit ) { vd = ae_complex_from_i(1); } else { vd = ae_c_d_div(1,a->ptr.pp_complex[i1+i][j1+i]); } ae_v_cmulc(&x->ptr.pp_complex[i2+i][j2], 1, ae_v_len(j2,j2+n-1), vd); for(j=i-1; j>=0; j--) { vc = a->ptr.pp_complex[i1+i][j1+j]; 
ae_v_csubc(&x->ptr.pp_complex[i2+j][j2], 1, &x->ptr.pp_complex[i2+i][j2], 1, "N", ae_v_len(j2,j2+n-1), vc); } } return; } if( optype==2 ) { /* * A^(-H)*X */ for(i=m-1; i>=0; i--) { if( isunit ) { vd = ae_complex_from_i(1); } else { vd = ae_c_d_div(1,ae_c_conj(a->ptr.pp_complex[i1+i][j1+i], _state)); } ae_v_cmulc(&x->ptr.pp_complex[i2+i][j2], 1, ae_v_len(j2,j2+n-1), vd); for(j=i-1; j>=0; j--) { vc = ae_c_conj(a->ptr.pp_complex[i1+i][j1+j], _state); ae_v_csubc(&x->ptr.pp_complex[i2+j][j2], 1, &x->ptr.pp_complex[i2+i][j2], 1, "N", ae_v_len(j2,j2+n-1), vc); } } return; } } }
/************************************************************************* Level 2 subroutine -- ALGLIB routine -- 15.12.2009 Bochkanov Sergey *************************************************************************/ static void ablas_rmatrixrighttrsm2(ae_int_t m, ae_int_t n, /* Real */ ae_matrix* a, ae_int_t i1, ae_int_t j1, ae_bool isupper, ae_bool isunit, ae_int_t optype, /* Real */ ae_matrix* x, ae_int_t i2, ae_int_t j2, ae_state *_state) { ae_int_t i; ae_int_t j; double vr; double vd; /* * Special case */ if( n*m==0 ) { return; } /* * Try to use "fast" code */ if( rmatrixrighttrsmf(m, n, a, i1, j1, isupper, isunit, optype, x, i2, j2, _state) ) { return; } /* * General case */ if( isupper ) { /* * Upper triangular matrix */ if( optype==0 ) { /* * X*A^(-1) */ for(i=0; i<=m-1; i++) { for(j=0; j<=n-1; j++) { if( isunit ) { vd = (double)(1); } else { vd = a->ptr.pp_double[i1+j][j1+j]; } x->ptr.pp_double[i2+i][j2+j] = x->ptr.pp_double[i2+i][j2+j]/vd; if( j<n-1 ) { vr = x->ptr.pp_double[i2+i][j2+j]; ae_v_subd(&x->ptr.pp_double[i2+i][j2+j+1], 1, &a->ptr.pp_double[i1+j][j1+j+1], 1, ae_v_len(j2+j+1,j2+n-1), vr); } } } return; } if( optype==1 ) { /* * X*A^(-T) */ for(i=0; i<=m-1; i++) { for(j=n-1; j>=0; j--) { vr = (double)(0); vd = (double)(1); if( j<n-1 ) { vr = ae_v_dotproduct(&x->ptr.pp_double[i2+i][j2+j+1], 1, &a->ptr.pp_double[i1+j][j1+j+1], 1, ae_v_len(j2+j+1,j2+n-1)); } if( !isunit ) { vd = a->ptr.pp_double[i1+j][j1+j]; } x->ptr.pp_double[i2+i][j2+j] = (x->ptr.pp_double[i2+i][j2+j]-vr)/vd; } } return; } } else { /* * Lower triangular matrix */ if( optype==0 ) { /* * X*A^(-1) */ for(i=0; i<=m-1; i++) { for(j=n-1; j>=0; j--) { if( isunit ) { vd = (double)(1); } else { vd = a->ptr.pp_double[i1+j][j1+j]; } x->ptr.pp_double[i2+i][j2+j] = x->ptr.pp_double[i2+i][j2+j]/vd; if( j>0 ) { vr = x->ptr.pp_double[i2+i][j2+j]; ae_v_subd(&x->ptr.pp_double[i2+i][j2], 1, &a->ptr.pp_double[i1+j][j1], 1, ae_v_len(j2,j2+j-1), vr); } } } return; } if( optype==1 ) { /* * X*A^(-T) */ for(i=0; i<=m-1; i++) { for(j=0; j<=n-1; j++) { vr = (double)(0); vd = (double)(1); if( j>0 ) { vr = ae_v_dotproduct(&x->ptr.pp_double[i2+i][j2], 1, &a->ptr.pp_double[i1+j][j1], 1, ae_v_len(j2,j2+j-1)); } if( !isunit ) { vd = a->ptr.pp_double[i1+j][j1+j]; } x->ptr.pp_double[i2+i][j2+j] = (x->ptr.pp_double[i2+i][j2+j]-vr)/vd; } } return; } } }
/************************************************************************* Level 2 subroutine *************************************************************************/ static void ablas_rmatrixlefttrsm2(ae_int_t m, ae_int_t n, /* Real */ ae_matrix* a, ae_int_t i1, ae_int_t j1, ae_bool isupper, ae_bool isunit, ae_int_t optype, /* Real */ ae_matrix* x, ae_int_t i2, ae_int_t j2, ae_state *_state) { ae_int_t i; ae_int_t j; double vr; double vd; /* * Special case */ if( n==0||m==0 ) { return; } /* * Try fast code */ if( rmatrixlefttrsmf(m, n, a, i1, j1, isupper, isunit, optype, x, i2, j2, _state) ) { return; } /* * General case */ if( isupper ) { /* * Upper triangular matrix */
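/* * Both branches below are plain substitution applied to all N columns of X at once by means of Level 1 vector operations: OpType=0 overwrites X with A^(-1)*X using backward substitution (rows are processed bottom-up), while OpType=1 overwrites X with A^(-T)*X using forward elimination (rows are processed top-down). */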
if( optype==0 ) { /* * A^(-1)*X */ for(i=m-1; i>=0; i--) { for(j=i+1; j<=m-1; j++) { vr = a->ptr.pp_double[i1+i][j1+j]; ae_v_subd(&x->ptr.pp_double[i2+i][j2], 1, &x->ptr.pp_double[i2+j][j2], 1, ae_v_len(j2,j2+n-1), vr); } if( !isunit ) { vd = 1/a->ptr.pp_double[i1+i][j1+i]; ae_v_muld(&x->ptr.pp_double[i2+i][j2], 1, ae_v_len(j2,j2+n-1), vd); } } return; } if( optype==1 ) { /* * A^(-T)*X */ for(i=0; i<=m-1; i++) { if( isunit ) { vd = (double)(1); } else { vd = 1/a->ptr.pp_double[i1+i][j1+i]; } ae_v_muld(&x->ptr.pp_double[i2+i][j2], 1, ae_v_len(j2,j2+n-1), vd); for(j=i+1; j<=m-1; j++) { vr = a->ptr.pp_double[i1+i][j1+j]; ae_v_subd(&x->ptr.pp_double[i2+j][j2], 1, &x->ptr.pp_double[i2+i][j2], 1, ae_v_len(j2,j2+n-1), vr); } } return; } } else { /* * Lower triangular matrix */ if( optype==0 ) { /* * A^(-1)*X */ for(i=0; i<=m-1; i++) { for(j=0; j<=i-1; j++) { vr = a->ptr.pp_double[i1+i][j1+j]; ae_v_subd(&x->ptr.pp_double[i2+i][j2], 1, &x->ptr.pp_double[i2+j][j2], 1, ae_v_len(j2,j2+n-1), vr); } if( isunit ) { vd = (double)(1); } else { vd = 1/a->ptr.pp_double[i1+j][j1+j]; } ae_v_muld(&x->ptr.pp_double[i2+i][j2], 1, ae_v_len(j2,j2+n-1), vd); } return; } if( optype==1 ) { /* * A^(-T)*X */ for(i=m-1; i>=0; i--) { if( isunit ) { vd = (double)(1); } else { vd = 1/a->ptr.pp_double[i1+i][j1+i]; } ae_v_muld(&x->ptr.pp_double[i2+i][j2], 1, ae_v_len(j2,j2+n-1), vd); for(j=i-1; j>=0; j--) { vr = a->ptr.pp_double[i1+i][j1+j]; ae_v_subd(&x->ptr.pp_double[i2+j][j2], 1, &x->ptr.pp_double[i2+i][j2], 1, ae_v_len(j2,j2+n-1), vr); } } return; } } } /************************************************************************* Level 2 subroutine *************************************************************************/ static void ablas_cmatrixherk2(ae_int_t n, ae_int_t k, double alpha, /* Complex */ ae_matrix* a, ae_int_t ia, ae_int_t ja, ae_int_t optypea, double beta, /* Complex */ ae_matrix* c, ae_int_t ic, ae_int_t jc, ae_bool isupper, ae_state *_state) { ae_int_t i; ae_int_t j; ae_int_t j1; ae_int_t j2; ae_complex v; /* * Fast exit (nothing to be done) */ if( (ae_fp_eq(alpha,(double)(0))||k==0)&&ae_fp_eq(beta,(double)(1)) ) { return; } /* * Try to call fast SYRK */ if( cmatrixherkf(n, k, alpha, a, ia, ja, optypea, beta, c, ic, jc, isupper, _state) ) { return; } /* * SYRK */ if( optypea==0 ) { /* * C=alpha*A*A^H+beta*C */ for(i=0; i<=n-1; i++) { if( isupper ) { j1 = i; j2 = n-1; } else { j1 = 0; j2 = i; } for(j=j1; j<=j2; j++) { if( ae_fp_neq(alpha,(double)(0))&&k>0 ) { v = ae_v_cdotproduct(&a->ptr.pp_complex[ia+i][ja], 1, "N", &a->ptr.pp_complex[ia+j][ja], 1, "Conj", ae_v_len(ja,ja+k-1)); } else { v = ae_complex_from_i(0); } if( ae_fp_eq(beta,(double)(0)) ) { c->ptr.pp_complex[ic+i][jc+j] = ae_c_mul_d(v,alpha); } else { c->ptr.pp_complex[ic+i][jc+j] = ae_c_add(ae_c_mul_d(c->ptr.pp_complex[ic+i][jc+j],beta),ae_c_mul_d(v,alpha)); } } } return; } else { /* * C=alpha*A^H*A+beta*C */ for(i=0; i<=n-1; i++) { if( isupper ) { j1 = i; j2 = n-1; } else { j1 = 0; j2 = i; } if( ae_fp_eq(beta,(double)(0)) ) { for(j=j1; j<=j2; j++) { c->ptr.pp_complex[ic+i][jc+j] = ae_complex_from_i(0); } } else { ae_v_cmuld(&c->ptr.pp_complex[ic+i][jc+j1], 1, ae_v_len(jc+j1,jc+j2), beta); } } if( ae_fp_neq(alpha,(double)(0))&&k>0 ) { for(i=0; i<=k-1; i++) { for(j=0; j<=n-1; j++) { if( isupper ) { j1 = j; j2 = n-1; } else { j1 = 0; j2 = j; } v = ae_c_mul_d(ae_c_conj(a->ptr.pp_complex[ia+i][ja+j], _state),alpha); ae_v_caddc(&c->ptr.pp_complex[ic+j][jc+j1], 1, &a->ptr.pp_complex[ia+i][ja+j1], 1, "N", ae_v_len(jc+j1,jc+j2), v); } } } return; } } 
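/*************************************************************************
Illustrative reference (not part of ALGLIB): the operation performed by the
rank-K update kernels above and below, written naively with plain C arrays.
The layout is an assumption made for this illustration only: A is an N x K
row-major array, C is an N x N row-major array, OpType=0 and IsUpper=True,
so only the upper triangle of C is written; C is not read when Beta is zero,
mirroring the convention documented for RMatrixSYRK/CMatrixHERK.
*************************************************************************/
static void ref_syrk_upper(int n, int k, double alpha, const double* a, double beta, double* c)
{
    int i;
    int j;
    int t;
    double v;
    for(i=0; i<n; i++)
    {
        for(j=i; j<n; j++)
        {
            /* dot product of rows i and j of A, i.e. (A*A^T)[i][j] */
            v = 0.0;
            for(t=0; t<k; t++)
            {
                v = v+a[i*k+t]*a[j*k+t];
            }
            /* update only the upper triangle; do not read C when beta is zero */
            c[i*n+j] = (beta==0.0 ? 0.0 : beta*c[i*n+j])+alpha*v;
        }
    }
}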
/************************************************************************* Level 2 subrotuine *************************************************************************/ static void ablas_rmatrixsyrk2(ae_int_t n, ae_int_t k, double alpha, /* Real */ ae_matrix* a, ae_int_t ia, ae_int_t ja, ae_int_t optypea, double beta, /* Real */ ae_matrix* c, ae_int_t ic, ae_int_t jc, ae_bool isupper, ae_state *_state) { ae_int_t i; ae_int_t j; ae_int_t j1; ae_int_t j2; double v; /* * Fast exit (nothing to be done) */ if( (ae_fp_eq(alpha,(double)(0))||k==0)&&ae_fp_eq(beta,(double)(1)) ) { return; } /* * Try to call fast SYRK */ if( rmatrixsyrkf(n, k, alpha, a, ia, ja, optypea, beta, c, ic, jc, isupper, _state) ) { return; } /* * SYRK */ if( optypea==0 ) { /* * C=alpha*A*A^H+beta*C */ for(i=0; i<=n-1; i++) { if( isupper ) { j1 = i; j2 = n-1; } else { j1 = 0; j2 = i; } for(j=j1; j<=j2; j++) { if( ae_fp_neq(alpha,(double)(0))&&k>0 ) { v = ae_v_dotproduct(&a->ptr.pp_double[ia+i][ja], 1, &a->ptr.pp_double[ia+j][ja], 1, ae_v_len(ja,ja+k-1)); } else { v = (double)(0); } if( ae_fp_eq(beta,(double)(0)) ) { c->ptr.pp_double[ic+i][jc+j] = alpha*v; } else { c->ptr.pp_double[ic+i][jc+j] = beta*c->ptr.pp_double[ic+i][jc+j]+alpha*v; } } } return; } else { /* * C=alpha*A^H*A+beta*C */ for(i=0; i<=n-1; i++) { if( isupper ) { j1 = i; j2 = n-1; } else { j1 = 0; j2 = i; } if( ae_fp_eq(beta,(double)(0)) ) { for(j=j1; j<=j2; j++) { c->ptr.pp_double[ic+i][jc+j] = (double)(0); } } else { ae_v_muld(&c->ptr.pp_double[ic+i][jc+j1], 1, ae_v_len(jc+j1,jc+j2), beta); } } if( ae_fp_neq(alpha,(double)(0))&&k>0 ) { for(i=0; i<=k-1; i++) { for(j=0; j<=n-1; j++) { if( isupper ) { j1 = j; j2 = n-1; } else { j1 = 0; j2 = j; } v = alpha*a->ptr.pp_double[ia+i][ja+j]; ae_v_addd(&c->ptr.pp_double[ic+j][jc+j1], 1, &a->ptr.pp_double[ia+i][ja+j1], 1, ae_v_len(jc+j1,jc+j2), v); } } } return; } } /************************************************************************* LU decomposition of a general real matrix with row pivoting A is represented as A = P*L*U, where: * L is lower unitriangular matrix * U is upper triangular matrix * P = P0*P1*...*PK, K=min(M,N)-1, Pi - permutation matrix for I and Pivots[I] This is cache-oblivous implementation of LU decomposition. It is optimized for square matrices. As for rectangular matrices: * best case - M>>N * worst case - N>>M, small M, large N, matrix does not fit in CPU cache COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that LU decomposition is harder to ! parallelize than, say, matrix-matrix product - this algorithm has ! many internal synchronization points which can not be avoided. However ! 
parallelism starts to be profitable starting from N=1024, achieving ! near-linear speedup for N=4096 or higher. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS: A - array[0..M-1, 0..N-1]. M - number of rows in matrix A. N - number of columns in matrix A. OUTPUT PARAMETERS: A - matrices L and U in compact form: * L is stored under main diagonal * U is stored on and above main diagonal Pivots - permutation matrix in compact form. array[0..Min(M-1,N-1)]. -- ALGLIB routine -- 10.01.2010 Bochkanov Sergey *************************************************************************/ void rmatrixlu(/* Real */ ae_matrix* a, ae_int_t m, ae_int_t n, /* Integer */ ae_vector* pivots, ae_state *_state) { ae_vector_clear(pivots); ae_assert(m>0, "RMatrixLU: incorrect M!", _state); ae_assert(n>0, "RMatrixLU: incorrect N!", _state); rmatrixplu(a, m, n, pivots, _state); } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. *************************************************************************/ void _pexec_rmatrixlu(/* Real */ ae_matrix* a, ae_int_t m, ae_int_t n, /* Integer */ ae_vector* pivots, ae_state *_state) { rmatrixlu(a,m,n,pivots, _state); } /************************************************************************* LU decomposition of a general complex matrix with row pivoting A is represented as A = P*L*U, where: * L is lower unitriangular matrix * U is upper triangular matrix * P = P0*P1*...*PK, K=min(M,N)-1, Pi - permutation matrix for I and Pivots[I] This is cache-oblivous implementation of LU decomposition. It is optimized for square matrices. As for rectangular matrices: * best case - M>>N * worst case - N>>M, small M, large N, matrix does not fit in CPU cache COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that LU decomposition is harder to ! parallelize than, say, matrix-matrix product - this algorithm has ! many internal synchronization points which can not be avoided. However ! parallelism starts to be profitable starting from N=1024, achieving ! near-linear speedup for N=4096 or higher. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! 
* call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS: A - array[0..M-1, 0..N-1]. M - number of rows in matrix A. N - number of columns in matrix A. OUTPUT PARAMETERS: A - matrices L and U in compact form: * L is stored under main diagonal * U is stored on and above main diagonal Pivots - permutation matrix in compact form. array[0..Min(M-1,N-1)]. -- ALGLIB routine -- 10.01.2010 Bochkanov Sergey *************************************************************************/ void cmatrixlu(/* Complex */ ae_matrix* a, ae_int_t m, ae_int_t n, /* Integer */ ae_vector* pivots, ae_state *_state) { ae_vector_clear(pivots); ae_assert(m>0, "CMatrixLU: incorrect M!", _state); ae_assert(n>0, "CMatrixLU: incorrect N!", _state); cmatrixplu(a, m, n, pivots, _state); } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. *************************************************************************/ void _pexec_cmatrixlu(/* Complex */ ae_matrix* a, ae_int_t m, ae_int_t n, /* Integer */ ae_vector* pivots, ae_state *_state) { cmatrixlu(a,m,n,pivots, _state); } /************************************************************************* Cache-oblivious Cholesky decomposition The algorithm computes Cholesky decomposition of a Hermitian positive- definite matrix. The result of an algorithm is a representation of A as A=U'*U or A=L*L' (here X' detones conj(X^T)). COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that Cholesky decomposition is harder ! to parallelize than, say, matrix-matrix product - this algorithm has ! several synchronization points which can not be avoided. However, ! parallelism starts to be profitable starting from N=500. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS: A - upper or lower triangle of a factorized matrix. array with elements [0..N-1, 0..N-1]. N - size of matrix A. 
IsUpper - if IsUpper=True, then A contains an upper triangle of a symmetric matrix, otherwise A contains a lower one. OUTPUT PARAMETERS: A - the result of factorization. If IsUpper=True, then the upper triangle contains matrix U, so that A = U'*U, and the elements below the main diagonal are not modified. Similarly, if IsUpper = False. RESULT: If the matrix is positive-definite, the function returns True. Otherwise, the function returns False. Contents of A is not determined in such case. -- ALGLIB routine -- 15.12.2009 Bochkanov Sergey *************************************************************************/ ae_bool hpdmatrixcholesky(/* Complex */ ae_matrix* a, ae_int_t n, ae_bool isupper, ae_state *_state) { ae_frame _frame_block; ae_vector tmp; ae_bool result; ae_frame_make(_state, &_frame_block); ae_vector_init(&tmp, 0, DT_COMPLEX, _state); if( n<1 ) { result = ae_false; ae_frame_leave(_state); return result; } result = trfac_hpdmatrixcholeskyrec(a, 0, n, isupper, &tmp, _state); ae_frame_leave(_state); return result; } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. *************************************************************************/ ae_bool _pexec_hpdmatrixcholesky(/* Complex */ ae_matrix* a, ae_int_t n, ae_bool isupper, ae_state *_state) { return hpdmatrixcholesky(a,n,isupper, _state); } /************************************************************************* Cache-oblivious Cholesky decomposition The algorithm computes Cholesky decomposition of a symmetric positive- definite matrix. The result of an algorithm is a representation of A as A=U^T*U or A=L*L^T COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that Cholesky decomposition is harder ! to parallelize than, say, matrix-matrix product - this algorithm has ! several synchronization points which can not be avoided. However, ! parallelism starts to be profitable starting from N=500. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS: A - upper or lower triangle of a factorized matrix. array with elements [0..N-1, 0..N-1]. N - size of matrix A. 
IsUpper - if IsUpper=True, then A contains an upper triangle of a symmetric matrix, otherwise A contains a lower one. OUTPUT PARAMETERS: A - the result of factorization. If IsUpper=True, then the upper triangle contains matrix U, so that A = U^T*U, and the elements below the main diagonal are not modified. Similarly, if IsUpper = False. RESULT: If the matrix is positive-definite, the function returns True. Otherwise, the function returns False. Contents of A are not determined in such case. -- ALGLIB routine -- 15.12.2009 Bochkanov Sergey *************************************************************************/ ae_bool spdmatrixcholesky(/* Real */ ae_matrix* a, ae_int_t n, ae_bool isupper, ae_state *_state) { ae_frame _frame_block; ae_vector tmp; ae_bool result; ae_frame_make(_state, &_frame_block); ae_vector_init(&tmp, 0, DT_REAL, _state); if( n<1 ) { result = ae_false; ae_frame_leave(_state); return result; } result = spdmatrixcholeskyrec(a, 0, n, isupper, &tmp, _state); ae_frame_leave(_state); return result; } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. *************************************************************************/ ae_bool _pexec_spdmatrixcholesky(/* Real */ ae_matrix* a, ae_int_t n, ae_bool isupper, ae_state *_state) { return spdmatrixcholesky(a,n,isupper, _state); }
/************************************************************************* Update of Cholesky decomposition: rank-1 update to original A. This function uses internally allocated buffer which is not saved between subsequent calls. So, if you perform a lot of subsequent updates, we recommend you to use "buffered" version of this function: SPDMatrixCholeskyUpdateAdd1Buf(). INPUT PARAMETERS: A - upper or lower Cholesky factor. array with elements [0..N-1, 0..N-1]. Exception is thrown if array size is too small. N - size of matrix A, N>0 IsUpper - if IsUpper=True, then A contains upper Cholesky factor; otherwise A contains a lower one. U - array[N], rank-1 update to A: A_mod = A + u*u' Exception is thrown if array size is too small. OUTPUT PARAMETERS: A - updated factorization. If IsUpper=True, then the upper triangle contains matrix U, and the elements below the main diagonal are not modified. Similarly, if IsUpper = False. NOTE: this function always succeeds, so it does not return a completion code NOTE: this function checks sizes of input arrays, but it does NOT check for presence of infinities or NAN's. -- ALGLIB -- 03.02.2014 Sergey Bochkanov *************************************************************************/ void spdmatrixcholeskyupdateadd1(/* Real */ ae_matrix* a, ae_int_t n, ae_bool isupper, /* Real */ ae_vector* u, ae_state *_state) { ae_frame _frame_block; ae_vector bufr; ae_frame_make(_state, &_frame_block); ae_vector_init(&bufr, 0, DT_REAL, _state); ae_assert(n>0, "SPDMatrixCholeskyUpdateAdd1: N<=0", _state); ae_assert(a->rows>=n, "SPDMatrixCholeskyUpdateAdd1: Rows(A)<N", _state); ae_assert(a->cols>=n, "SPDMatrixCholeskyUpdateAdd1: Cols(A)<N", _state); ae_assert(u->cnt>=n, "SPDMatrixCholeskyUpdateAdd1: Length(U)<N", _state); spdmatrixcholeskyupdateadd1buf(a, n, isupper, u, &bufr, _state); ae_frame_leave(_state); }
/************************************************************************* Update of Cholesky decomposition: "fixing" some variables. This function uses internally allocated buffer which is not saved between subsequent calls. So, if you perform a lot of subsequent updates, we recommend you to use "buffered" version of this function: SPDMatrixCholeskyUpdateFixBuf(). "FIXING" EXPLAINED: suppose we have N*N positive definite matrix A. "Fixing" some variable means filling the corresponding row/column of A by zeros and setting the diagonal element to 1. For example, if we fix 2nd variable in 4*4 matrix A, it becomes Af: ( A00 A01 A02 A03 ) ( Af00 0 Af02 Af03 ) ( A10 A11 A12 A13 ) ( 0 1 0 0 ) ( A20 A21 A22 A23 ) => ( Af20 0 Af22 Af23 ) ( A30 A31 A32 A33 ) ( Af30 0 Af32 Af33 ) If we have Cholesky decomposition of A, it must be recalculated after variables were fixed.
However, it is possible to use an efficient algorithm, which needs O(K*N^2) time to "fix" K variables, given Cholesky decomposition of original, "unfixed" A. INPUT PARAMETERS: A - upper or lower Cholesky factor. array with elements [0..N-1, 0..N-1]. Exception is thrown if array size is too small. N - size of matrix A, N>0 IsUpper - if IsUpper=True, then A contains upper Cholesky factor; otherwise A contains a lower one. Fix - array[N], I-th element is True if I-th variable must be fixed. Exception is thrown if array size is too small. OUTPUT PARAMETERS: A - updated factorization. If IsUpper=True, then the upper triangle contains matrix U, and the elements below the main diagonal are not modified. Similarly, if IsUpper = False. NOTE: this function always succeeds, so it does not return a completion code NOTE: this function checks sizes of input arrays, but it does NOT check for presence of infinities or NAN's. NOTE: this function is efficient only for a moderate amount of updated variables - say, 0.1*N or 0.3*N. For a larger amount of variables it will still work, but you may get better performance with straightforward Cholesky. -- ALGLIB -- 03.02.2014 Sergey Bochkanov *************************************************************************/ void spdmatrixcholeskyupdatefix(/* Real */ ae_matrix* a, ae_int_t n, ae_bool isupper, /* Boolean */ ae_vector* fix, ae_state *_state) { ae_frame _frame_block; ae_vector bufr; ae_frame_make(_state, &_frame_block); ae_vector_init(&bufr, 0, DT_REAL, _state); ae_assert(n>0, "SPDMatrixCholeskyUpdateFix: N<=0", _state); ae_assert(a->rows>=n, "SPDMatrixCholeskyUpdateFix: Rows(A)<N", _state); ae_assert(a->cols>=n, "SPDMatrixCholeskyUpdateFix: Cols(A)<N", _state); ae_assert(fix->cnt>=n, "SPDMatrixCholeskyUpdateFix: Length(Fix)<N", _state); spdmatrixcholeskyupdatefixbuf(a, n, isupper, fix, &bufr, _state); ae_frame_leave(_state); }
/************************************************************************* Update of Cholesky decomposition: rank-1 update to original A. "Buffered" version which uses preallocated buffer which is saved between subsequent function calls. See comments for SPDMatrixCholeskyUpdateAdd1() for more information. INPUT PARAMETERS: A - upper or lower Cholesky factor. array with elements [0..N-1, 0..N-1]. Exception is thrown if array size is too small. N - size of matrix A, N>0 IsUpper - if IsUpper=True, then A contains upper Cholesky factor; otherwise A contains a lower one. U - array[N], rank-1 update to A: A_mod = A + u*u' Exception is thrown if array size is too small. BufR - possibly preallocated buffer; automatically resized if needed. It is recommended to reuse this buffer if you perform a lot of subsequent decompositions. OUTPUT PARAMETERS: A - updated factorization. If IsUpper=True, then the upper triangle contains matrix U, and the elements below the main diagonal are not modified. Similarly, if IsUpper = False. -- ALGLIB -- 03.02.2014 Sergey Bochkanov *************************************************************************/ void spdmatrixcholeskyupdateadd1buf(/* Real */ ae_matrix* a, ae_int_t n, ae_bool isupper, /* Real */ ae_vector* u, /* Real */ ae_vector* bufr, ae_state *_state) { ae_int_t i; ae_int_t j; ae_int_t nz; double cs; double sn; double v; double vv; ae_assert(n>0, "SPDMatrixCholeskyUpdateAdd1Buf: N<=0", _state); ae_assert(a->rows>=n, "SPDMatrixCholeskyUpdateAdd1Buf: Rows(A)<N", _state); ae_assert(a->cols>=n, "SPDMatrixCholeskyUpdateAdd1Buf: Cols(A)<N", _state); ae_assert(u->cnt>=n, "SPDMatrixCholeskyUpdateAdd1Buf: Length(U)<N", _state); /* * Find index of first non-zero entry in U */ nz = n; for(i=0; i<=n-1; i++) { if( ae_fp_neq(u->ptr.p_double[i],(double)(0)) ) { nz = i; break; } } if( nz==n ) { /* * Nothing to update */ return; } /* * If working with upper triangular matrix */ if( isupper ) { /* * Perform a sequence of updates which fix variables one by one. * This approach is different from one which is used when we work * with lower triangular matrix.
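* In the upper triangular case the update vector U is first copied into BufR; then, for each I, generaterotation() is applied to the pair (A[I,I], BufR[I]), the rotated diagonal replaces A[I,I], and the same rotation is applied to the rest of row I of the factor together with the tail of BufR, so that after the last row BufR is annihilated and A holds the factor of the rank-1 updated matrix (A_mod = A + u*u' in the comments above).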
*/ rvectorsetlengthatleast(bufr, n, _state); for(j=nz; j<=n-1; j++) { bufr->ptr.p_double[j] = u->ptr.p_double[j]; } for(i=nz; i<=n-1; i++) { if( ae_fp_neq(bufr->ptr.p_double[i],(double)(0)) ) { generaterotation(a->ptr.pp_double[i][i], bufr->ptr.p_double[i], &cs, &sn, &v, _state); a->ptr.pp_double[i][i] = v; bufr->ptr.p_double[i] = 0.0; for(j=i+1; j<=n-1; j++) { v = a->ptr.pp_double[i][j]; vv = bufr->ptr.p_double[j]; a->ptr.pp_double[i][j] = cs*v+sn*vv; bufr->ptr.p_double[j] = -sn*v+cs*vv; } } } } else { /* * Calculate rows of modified Cholesky factor, row-by-row * (updates performed during variable fixing are applied * simultaneously to each row) */ rvectorsetlengthatleast(bufr, 3*n, _state); for(j=nz; j<=n-1; j++) { bufr->ptr.p_double[j] = u->ptr.p_double[j]; } for(i=nz; i<=n-1; i++) { /* * Update all previous updates [Idx+1...I-1] to I-th row */ vv = bufr->ptr.p_double[i]; for(j=nz; j<=i-1; j++) { cs = bufr->ptr.p_double[n+2*j+0]; sn = bufr->ptr.p_double[n+2*j+1]; v = a->ptr.pp_double[i][j]; a->ptr.pp_double[i][j] = cs*v+sn*vv; vv = -sn*v+cs*vv; } /* * generate rotation applied to I-th element of update vector */ generaterotation(a->ptr.pp_double[i][i], vv, &cs, &sn, &v, _state); a->ptr.pp_double[i][i] = v; bufr->ptr.p_double[n+2*i+0] = cs; bufr->ptr.p_double[n+2*i+1] = sn; } } } /************************************************************************* Update of Cholesky decomposition: "fixing" some variables. "Buffered" version which uses preallocated buffer which is saved between subsequent function calls. See comments for SPDMatrixCholeskyUpdateFix() for more information. INPUT PARAMETERS: A - upper or lower Cholesky factor. array with elements [0..N-1, 0..N-1]. Exception is thrown if array size is too small. N - size of matrix A, N>0 IsUpper - if IsUpper=True, then A contains upper Cholesky factor; otherwise A contains a lower one. Fix - array[N], I-th element is True if I-th variable must be fixed. Exception is thrown if array size is too small. BufR - possibly preallocated buffer; automatically resized if needed. It is recommended to reuse this buffer if you perform a lot of subsequent decompositions. OUTPUT PARAMETERS: A - updated factorization. If IsUpper=True, then the upper triangle contains matrix U, and the elements below the main diagonal are not modified. Similarly, if IsUpper = False. -- ALGLIB -- 03.02.2014 Sergey Bochkanov *************************************************************************/ void spdmatrixcholeskyupdatefixbuf(/* Real */ ae_matrix* a, ae_int_t n, ae_bool isupper, /* Boolean */ ae_vector* fix, /* Real */ ae_vector* bufr, ae_state *_state) { ae_int_t i; ae_int_t j; ae_int_t k; ae_int_t nfix; ae_int_t idx; double cs; double sn; double v; double vv; ae_assert(n>0, "SPDMatrixCholeskyUpdateFixBuf: N<=0", _state); ae_assert(a->rows>=n, "SPDMatrixCholeskyUpdateFixBuf: Rows(A)cols>=n, "SPDMatrixCholeskyUpdateFixBuf: Cols(A)cnt>=n, "SPDMatrixCholeskyUpdateFixBuf: Length(Fix)ptr.p_bool[i] ) { inc(&nfix, _state); } } if( nfix==0 ) { /* * Nothing to fix */ return; } if( nfix==n ) { /* * All variables are fixed. * Set A to identity and exit. */ if( isupper ) { for(i=0; i<=n-1; i++) { a->ptr.pp_double[i][i] = (double)(1); for(j=i+1; j<=n-1; j++) { a->ptr.pp_double[i][j] = (double)(0); } } } else { for(i=0; i<=n-1; i++) { for(j=0; j<=i-1; j++) { a->ptr.pp_double[i][j] = (double)(0); } a->ptr.pp_double[i][i] = (double)(1); } } return; } /* * If working with upper triangular matrix */ if( isupper ) { /* * Perform a sequence of updates which fix variables one by one. 
* This approach is different from one which is used when we work * with lower triangular matrix. */ rvectorsetlengthatleast(bufr, n, _state); for(k=0; k<=n-1; k++) { if( fix->ptr.p_bool[k] ) { idx = k; /* * Quick exit if it is last variable */ if( idx==n-1 ) { for(i=0; i<=idx-1; i++) { a->ptr.pp_double[i][idx] = 0.0; } a->ptr.pp_double[idx][idx] = 1.0; continue; } /* * We have Cholesky decomposition of quadratic term in A, * with upper triangle being stored as given below: * * ( U00 u01 U02 ) * U = ( u11 u12 ) * ( U22 ) * * Here u11 is diagonal element corresponding to variable K. We * want to fix this variable, and we do so by modifying U as follows: * * ( U00 0 U02 ) * U_mod = ( 1 0 ) * ( U_m ) * * with U_m = CHOLESKY [ (U22^T)*U22 + (u12^T)*u12 ] * * Of course, we can calculate U_m by calculating (U22^T)*U22 explicitly, * modifying it and performing Cholesky decomposition of modified matrix. * However, we can treat it as follows: * * we already have CHOLESKY[(U22^T)*U22], which is equal to U22 * * we have rank-1 update (u12^T)*u12 applied to (U22^T)*U22 * * thus, we can calculate updated Cholesky with O(N^2) algorithm * instead of O(N^3) one */ for(j=idx+1; j<=n-1; j++) { bufr->ptr.p_double[j] = a->ptr.pp_double[idx][j]; } for(i=0; i<=idx-1; i++) { a->ptr.pp_double[i][idx] = 0.0; } a->ptr.pp_double[idx][idx] = 1.0; for(i=idx+1; i<=n-1; i++) { a->ptr.pp_double[idx][i] = 0.0; } for(i=idx+1; i<=n-1; i++) { if( ae_fp_neq(bufr->ptr.p_double[i],(double)(0)) ) { generaterotation(a->ptr.pp_double[i][i], bufr->ptr.p_double[i], &cs, &sn, &v, _state); a->ptr.pp_double[i][i] = v; bufr->ptr.p_double[i] = 0.0; for(j=i+1; j<=n-1; j++) { v = a->ptr.pp_double[i][j]; vv = bufr->ptr.p_double[j]; a->ptr.pp_double[i][j] = cs*v+sn*vv; bufr->ptr.p_double[j] = -sn*v+cs*vv; } } } } } } else { /* * Calculate rows of modified Cholesky factor, row-by-row * (updates performed during variable fixing are applied * simultaneously to each row) */ rvectorsetlengthatleast(bufr, 3*n, _state); for(k=0; k<=n-1; k++) { if( fix->ptr.p_bool[k] ) { idx = k; /* * Quick exit if it is last variable */ if( idx==n-1 ) { for(i=0; i<=idx-1; i++) { a->ptr.pp_double[idx][i] = 0.0; } a->ptr.pp_double[idx][idx] = 1.0; continue; } /* * store column to buffer and clear row/column of A */ for(j=idx+1; j<=n-1; j++) { bufr->ptr.p_double[j] = a->ptr.pp_double[j][idx]; } for(i=0; i<=idx-1; i++) { a->ptr.pp_double[idx][i] = 0.0; } a->ptr.pp_double[idx][idx] = 1.0; for(i=idx+1; i<=n-1; i++) { a->ptr.pp_double[i][idx] = 0.0; } /* * Apply update to rows of A */ for(i=idx+1; i<=n-1; i++) { /* * Update all previous updates [Idx+1...I-1] to I-th row */ vv = bufr->ptr.p_double[i]; for(j=idx+1; j<=i-1; j++) { cs = bufr->ptr.p_double[n+2*j+0]; sn = bufr->ptr.p_double[n+2*j+1]; v = a->ptr.pp_double[i][j]; a->ptr.pp_double[i][j] = cs*v+sn*vv; vv = -sn*v+cs*vv; } /* * generate rotation applied to I-th element of update vector */ generaterotation(a->ptr.pp_double[i][i], vv, &cs, &sn, &v, _state); a->ptr.pp_double[i][i] = v; bufr->ptr.p_double[n+2*i+0] = cs; bufr->ptr.p_double[n+2*i+1] = sn; } } } } } /************************************************************************* Sparse Cholesky decomposition for skyline matrixm using in-place algorithm without allocating additional storage. The algorithm computes Cholesky decomposition of a symmetric positive- definite sparse matrix. 
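Cholesky factorization of a skyline (SKS) matrix produces no fill-in outside of the original profile, which is what makes a fully in-place algorithm possible.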
The result of an algorithm is a representation of A as A=U^T*U or A=L*L^T This function is a more efficient alternative to general, but slower SparseCholeskyX(), because it does not create temporary copies of the target. It performs factorization in-place, which gives best performance on low-profile matrices. Its drawback, however, is that it can not perform profile-reducing permutation of input matrix. INPUT PARAMETERS: A - sparse matrix in skyline storage (SKS) format. N - size of matrix A (can be smaller than actual size of A) IsUpper - if IsUpper=True, then factorization is performed on upper triangle. Another triangle is ignored (it may contant some data, but it is not changed). OUTPUT PARAMETERS: A - the result of factorization, stored in SKS. If IsUpper=True, then the upper triangle contains matrix U, such that A = U^T*U. Lower triangle is not changed. Similarly, if IsUpper = False. In this case L is returned, and we have A = L*(L^T). Note that THIS function does not perform permutation of rows to reduce bandwidth. RESULT: If the matrix is positive-definite, the function returns True. Otherwise, the function returns False. Contents of A is not determined in such case. NOTE: for performance reasons this function does NOT check that input matrix includes only finite values. It is your responsibility to make sure that there are no infinite or NAN values in the matrix. -- ALGLIB routine -- 16.01.2014 Bochkanov Sergey *************************************************************************/ ae_bool sparsecholeskyskyline(sparsematrix* a, ae_int_t n, ae_bool isupper, ae_state *_state) { ae_int_t i; ae_int_t j; ae_int_t k; ae_int_t jnz; ae_int_t jnza; ae_int_t jnzl; double v; double vv; double a12; ae_int_t nready; ae_int_t nadd; ae_int_t banda; ae_int_t offsa; ae_int_t offsl; ae_bool result; ae_assert(n>=0, "SparseCholeskySkyline: N<0", _state); ae_assert(sparsegetnrows(a, _state)>=n, "SparseCholeskySkyline: rows(A)=n, "SparseCholeskySkyline: cols(A)=BANDWIDTH(A1), I-th equation is reduced from * L[I,0]*A1[0] + L[I,1]*A1[1] + ... + L[I,I]*A1[I] = A[I] * to * L[I,JNZ]*A1[JNZ] + ... + L[I,I]*A1[I] = A[I] * where JNZ = max(NReady-BANDWIDTH(A1),I-BANDWIDTH(L[i])) * (JNZ is an index of the firts column where both A and L become * nonzero). * * NOTE: we rely on details of SparseMatrix internal storage format. * This is allowed by SparseMatrix specification. */ a12 = 0.0; if( a->didx.ptr.p_int[nready]>0 ) { banda = a->didx.ptr.p_int[nready]; for(i=nready-banda; i<=nready-1; i++) { /* * Elements of A1[0:I-1] were computed: * * A1[0:NReady-BandA-1] are zero (sparse) * * A1[NReady-BandA:I-1] replaced corresponding elements of A * * Now it is time to get I-th one. * * First, we calculate: * * JNZA - index of the first column where A become nonzero * * JNZL - index of the first column where L become nonzero * * JNZ - index of the first column where both A and L become nonzero * * OffsA - offset of A[JNZ] in A.Vals * * OffsL - offset of L[I,JNZ] in A.Vals * * Then, we solve SUM(A1[j]*L[I,j],j=JNZ..I-1) + A1[I]*L[I,I] = A[I], * with A1[JNZ..I-1] already known, and A1[I] unknown. 
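*
* In other words, the loop below performs one step of forward substitution:
* it solves L[0:NReady-1,0:NReady-1]*A1 = A[NReady,0:NReady-1]' for the new
* off-diagonal row A1 (stored back into the same positions of A.Vals), while
* A12 accumulates ||A1||^2, which is subtracted from the diagonal element
* before the square root is taken.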
*/ jnza = nready-banda; jnzl = i-a->didx.ptr.p_int[i]; jnz = ae_maxint(jnza, jnzl, _state); offsa = a->ridx.ptr.p_int[nready]+(jnz-jnza); offsl = a->ridx.ptr.p_int[i]+(jnz-jnzl); v = 0.0; k = i-1-jnz; for(j=0; j<=k; j++) { v = v+a->vals.ptr.p_double[offsa+j]*a->vals.ptr.p_double[offsl+j]; } vv = (a->vals.ptr.p_double[offsa+k+1]-v)/a->vals.ptr.p_double[offsl+k+1]; a->vals.ptr.p_double[offsa+k+1] = vv; a12 = a12+vv*vv; } } /* * Calculate CHOLESKY(B-A1*A1') */ offsa = a->ridx.ptr.p_int[nready]+a->didx.ptr.p_int[nready]; v = a->vals.ptr.p_double[offsa]; if( ae_fp_less_eq(v,a12) ) { result = ae_false; return result; } a->vals.ptr.p_double[offsa] = ae_sqrt(v-a12, _state); /* * Increase size of the updated matrix */ inc(&nready, _state); } /* * transpose if needed */ if( isupper ) { sparsetransposesks(a, _state); } result = ae_true; return result; } /************************************************************************* Sparse Cholesky decomposition: "expert" function. The algorithm computes Cholesky decomposition of a symmetric positive- definite sparse matrix. The result is representation of A as A=U^T*U or A=L*L^T Triangular factor L or U is written to separate SparseMatrix structure. If output buffer already contrains enough memory to store L/U, this memory is reused. INPUT PARAMETERS: A - upper or lower triangle of sparse matrix. Matrix can be in any sparse storage format. N - size of matrix A (can be smaller than actual size of A) IsUpper - if IsUpper=True, then A contains an upper triangle of a symmetric matrix, otherwise A contains a lower one. Another triangle is ignored. P0, P1 - integer arrays: * for Ordering=-3 - user-supplied permutation of rows/ columns, which complies to requirements stated in the "OUTPUT PARAMETERS" section. Both P0 and P1 must be initialized by user. * for other values of Ordering - possibly preallocated buffer, which is filled by internally generated permutation. Automatically resized if its size is too small to store data. Ordering- sparse matrix reordering algorithm which is used to reduce fill-in amount: * -3 use ordering supplied by user in P0/P1 * -2 use random ordering * -1 use original order * 0 use best algorithm implemented so far If input matrix is given in SKS format, factorization function ignores Ordering and uses original order of the columns. The idea is that if you already store matrix in SKS format, it is better not to perform costly reordering. Algo - type of algorithm which is used during factorization: * 0 use best algorithm (for SKS input or output matrices Algo=2 is used; otherwise Algo=1 is used) * 1 use CRS-based algorithm * 2 use skyline-based factorization algorithm. This algorithm is a fastest one for low-profile matrices, but requires too much of memory for matrices with large bandwidth. Fmt - desired storage format of the output, as returned by SparseGetMatrixType() function: * 0 for hash-based storage * 1 for CRS * 2 for SKS If you do not know what format to choose, use 1 (CRS). Buf - SparseBuffers structure which is used to store temporaries. This function may reuse previously allocated storage, so if you perform repeated factorizations it is beneficial to reuse Buf. C - SparseMatrix structure which can be just some random garbage. In case in contains enough memory to store triangular factors, this memory will be reused. Othwerwise, algorithm will automatically allocate enough memory. OUTPUT PARAMETERS: C - the result of factorization, stored in desired format. 
If IsUpper=True, then the upper triangle contains matrix U, such that (P'*A*P) = U^T*U, where P is a permutation matrix (see below). The elements below the main diagonal are zero. Similarly, if IsUpper = False. In this case L is returned, and we have (P'*A*P) = L*(L^T). P0 - permutation (according to Ordering parameter) which minimizes amount of fill-in: * P0 is array[N] * permutation is applied to A before factorization takes place, i.e. we have U'*U = L*L' = P'*A*P * P0[k]=j means that column/row j of A is moved to k-th position before starting factorization. P1 - permutation P in another format, array[N]: * P1[k]=j means that k-th column/row of A is moved to j-th position RESULT: If the matrix is positive-definite, the function returns True. Otherwise, the function returns False. Contents of C is not determined in such case. NOTE: for performance reasons this function does NOT check that input matrix includes only finite values. It is your responsibility to make sure that there are no infinite or NAN values in the matrix. -- ALGLIB routine -- 16.01.2014 Bochkanov Sergey *************************************************************************/ ae_bool sparsecholeskyx(sparsematrix* a, ae_int_t n, ae_bool isupper, /* Integer */ ae_vector* p0, /* Integer */ ae_vector* p1, ae_int_t ordering, ae_int_t algo, ae_int_t fmt, sparsebuffers* buf, sparsematrix* c, ae_state *_state) { ae_frame _frame_block; ae_int_t i; ae_int_t j; ae_int_t k; ae_int_t t0; ae_int_t t1; double v; hqrndstate rs; ae_bool result; ae_frame_make(_state, &_frame_block); _hqrndstate_init(&rs, _state); ae_assert(n>=0, "SparseMatrixCholeskyBuf: N<0", _state); ae_assert(sparsegetnrows(a, _state)>=n, "SparseMatrixCholeskyBuf: rows(A)=n, "SparseMatrixCholeskyBuf: cols(A)=-3&&ordering<=0, "SparseMatrixCholeskyBuf: invalid Ordering parameter", _state); ae_assert(algo>=0&&algo<=2, "SparseMatrixCholeskyBuf: invalid Algo parameter", _state); hqrndrandomize(&rs, _state); /* * Perform some quick checks. * Because sparse matrices are expensive data structures, these * checks are better to perform during early stages of the factorization. */ result = ae_false; if( n<1 ) { ae_frame_leave(_state); return result; } for(i=0; i<=n-1; i++) { if( ae_fp_less_eq(sparsegetdiagonal(a, i, _state),(double)(0)) ) { ae_frame_leave(_state); return result; } } /* * First, determine appropriate ordering: * * for SKS inputs, Ordering=-1 is automatically chosen (overrides user settings) */ if( ordering==0 ) { ordering = -1; } if( sparseissks(a, _state) ) { ordering = -1; } if( ordering==-3 ) { /* * User-supplied ordering. * Check its correctness. 
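*
* P0 and P1 must describe the same permutation in both directions:
* P0[k]=j means that row/column j of A is moved to position k, and P1 must
* be the inverse mapping, i.e. P1[P0[k]]=k for every k. The asserts below
* verify exactly this.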
*/ ae_assert(p0->cnt>=n, "SparseCholeskyX: user-supplied permutation is too short", _state); ae_assert(p1->cnt>=n, "SparseCholeskyX: user-supplied permutation is too short", _state); for(i=0; i<=n-1; i++) { ae_assert(p0->ptr.p_int[i]>=0&&p0->ptr.p_int[i]ptr.p_int[i]>=0&&p1->ptr.p_int[i]ptr.p_int[p0->ptr.p_int[i]]==i, "SparseCholeskyX: user-supplied permutation is inconsistent - P1 is not inverse of P0", _state); } } if( ordering==-2 ) { /* * Use random ordering */ ivectorsetlengthatleast(p0, n, _state); ivectorsetlengthatleast(p1, n, _state); for(i=0; i<=n-1; i++) { p0->ptr.p_int[i] = i; } for(i=0; i<=n-1; i++) { j = i+hqrnduniformi(&rs, n-i, _state); if( j!=i ) { k = p0->ptr.p_int[i]; p0->ptr.p_int[i] = p0->ptr.p_int[j]; p0->ptr.p_int[j] = k; } } for(i=0; i<=n-1; i++) { p1->ptr.p_int[p0->ptr.p_int[i]] = i; } } if( ordering==-1 ) { /* * Use initial ordering */ ivectorsetlengthatleast(p0, n, _state); ivectorsetlengthatleast(p1, n, _state); for(i=0; i<=n-1; i++) { p0->ptr.p_int[i] = i; p1->ptr.p_int[i] = i; } } /* * Determine algorithm to use: * * for SKS input or output - use SKS solver (overrides user settings) * * default is to use Algo=1 */ if( algo==0 ) { algo = 1; } if( sparseissks(a, _state)||fmt==2 ) { algo = 2; } algo = 2; if( algo==2 ) { /* * Skyline Cholesky with non-skyline output. * * Call CholeskyX() recursively with Buf.S as output matrix, * then perform conversion from SKS to desired format. We can * use Buf.S in reccurrent call because SKS-to-SKS CholeskyX() * does not uses this field. */ if( fmt!=2 ) { result = sparsecholeskyx(a, n, isupper, p0, p1, -3, algo, 2, buf, &buf->s, _state); if( result ) { sparsecopytobuf(&buf->s, fmt, c, _state); } ae_frame_leave(_state); return result; } /* * Skyline Cholesky with skyline output */ if( sparseissks(a, _state)&&ordering==-1 ) { /* * Non-permuted skyline matrix. * * Quickly copy matrix to output buffer without permutation. * * NOTE: Buf.D is used as dummy vector filled with zeros. */ ivectorsetlengthatleast(&buf->d, n, _state); for(i=0; i<=n-1; i++) { buf->d.ptr.p_int[i] = 0; } if( isupper ) { /* * Create strictly upper-triangular matrix, * copy upper triangle of input. */ sparsecreatesksbuf(n, n, &buf->d, &a->uidx, c, _state); for(i=0; i<=n-1; i++) { t0 = a->ridx.ptr.p_int[i+1]-a->uidx.ptr.p_int[i]-1; t1 = a->ridx.ptr.p_int[i+1]-1; k = c->ridx.ptr.p_int[i+1]-c->uidx.ptr.p_int[i]-1; for(j=t0; j<=t1; j++) { c->vals.ptr.p_double[k] = a->vals.ptr.p_double[j]; k = k+1; } } } else { /* * Create strictly lower-triangular matrix, * copy lower triangle of input. 
*/ sparsecreatesksbuf(n, n, &a->didx, &buf->d, c, _state); for(i=0; i<=n-1; i++) { t0 = a->ridx.ptr.p_int[i]; t1 = a->ridx.ptr.p_int[i]+a->didx.ptr.p_int[i]; k = c->ridx.ptr.p_int[i]; for(j=t0; j<=t1; j++) { c->vals.ptr.p_double[k] = a->vals.ptr.p_double[j]; k = k+1; } } } } else { /* * Non-identity permutations OR non-skyline input: * * investigate profile of permuted A * * create skyline matrix in output buffer * * copy input with permutation */ ivectorsetlengthatleast(&buf->d, n, _state); ivectorsetlengthatleast(&buf->u, n, _state); for(i=0; i<=n-1; i++) { buf->d.ptr.p_int[i] = 0; buf->u.ptr.p_int[i] = 0; } t0 = 0; t1 = 0; while(sparseenumerate(a, &t0, &t1, &i, &j, &v, _state)) { if( (isupper&&j>=i)||(!isupper&&j<=i) ) { i = p1->ptr.p_int[i]; j = p1->ptr.p_int[j]; if( (ji&&!isupper) ) { swapi(&i, &j, _state); } if( i>j ) { buf->d.ptr.p_int[i] = ae_maxint(buf->d.ptr.p_int[i], i-j, _state); } else { buf->u.ptr.p_int[j] = ae_maxint(buf->u.ptr.p_int[j], j-i, _state); } } } sparsecreatesksbuf(n, n, &buf->d, &buf->u, c, _state); t0 = 0; t1 = 0; while(sparseenumerate(a, &t0, &t1, &i, &j, &v, _state)) { if( (isupper&&j>=i)||(!isupper&&j<=i) ) { i = p1->ptr.p_int[i]; j = p1->ptr.p_int[j]; if( (ji&&!isupper) ) { swapi(&j, &i, _state); } sparserewriteexisting(c, i, j, v, _state); } } } result = sparsecholeskyskyline(c, n, isupper, _state); ae_frame_leave(_state); return result; } ae_assert(ae_false, "SparseCholeskyX: internal error - unexpected algorithm", _state); ae_frame_leave(_state); return result; } void rmatrixlup(/* Real */ ae_matrix* a, ae_int_t m, ae_int_t n, /* Integer */ ae_vector* pivots, ae_state *_state) { ae_frame _frame_block; ae_vector tmp; ae_int_t i; ae_int_t j; double mx; double v; ae_frame_make(_state, &_frame_block); ae_vector_clear(pivots); ae_vector_init(&tmp, 0, DT_REAL, _state); /* * Internal LU decomposition subroutine. * Never call it directly. */ ae_assert(m>0, "RMatrixLUP: incorrect M!", _state); ae_assert(n>0, "RMatrixLUP: incorrect N!", _state); /* * Scale matrix to avoid overflows, * decompose it, then scale back. */ mx = (double)(0); for(i=0; i<=m-1; i++) { for(j=0; j<=n-1; j++) { mx = ae_maxreal(mx, ae_fabs(a->ptr.pp_double[i][j], _state), _state); } } if( ae_fp_neq(mx,(double)(0)) ) { v = 1/mx; for(i=0; i<=m-1; i++) { ae_v_muld(&a->ptr.pp_double[i][0], 1, ae_v_len(0,n-1), v); } } ae_vector_set_length(pivots, ae_minint(m, n, _state), _state); ae_vector_set_length(&tmp, 2*ae_maxint(m, n, _state), _state); trfac_rmatrixluprec(a, 0, m, n, pivots, &tmp, _state); if( ae_fp_neq(mx,(double)(0)) ) { v = mx; for(i=0; i<=m-1; i++) { ae_v_muld(&a->ptr.pp_double[i][0], 1, ae_v_len(0,ae_minint(i, n-1, _state)), v); } } ae_frame_leave(_state); } void cmatrixlup(/* Complex */ ae_matrix* a, ae_int_t m, ae_int_t n, /* Integer */ ae_vector* pivots, ae_state *_state) { ae_frame _frame_block; ae_vector tmp; ae_int_t i; ae_int_t j; double mx; double v; ae_frame_make(_state, &_frame_block); ae_vector_clear(pivots); ae_vector_init(&tmp, 0, DT_COMPLEX, _state); /* * Internal LU decomposition subroutine. * Never call it directly. */ ae_assert(m>0, "CMatrixLUP: incorrect M!", _state); ae_assert(n>0, "CMatrixLUP: incorrect N!", _state); /* * Scale matrix to avoid overflows, * decompose it, then scale back. 
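*
* MX below is the largest absolute value among the entries of A; the matrix
* is multiplied by 1/MX before the recursive factorization, and the part of
* the result which carries the scale (the lower triangle, including the
* diagonal) is multiplied by MX afterwards, so the output corresponds to the
* original, unscaled matrix.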
*/ mx = (double)(0); for(i=0; i<=m-1; i++) { for(j=0; j<=n-1; j++) { mx = ae_maxreal(mx, ae_c_abs(a->ptr.pp_complex[i][j], _state), _state); } } if( ae_fp_neq(mx,(double)(0)) ) { v = 1/mx; for(i=0; i<=m-1; i++) { ae_v_cmuld(&a->ptr.pp_complex[i][0], 1, ae_v_len(0,n-1), v); } } ae_vector_set_length(pivots, ae_minint(m, n, _state), _state); ae_vector_set_length(&tmp, 2*ae_maxint(m, n, _state), _state); trfac_cmatrixluprec(a, 0, m, n, pivots, &tmp, _state); if( ae_fp_neq(mx,(double)(0)) ) { v = mx; for(i=0; i<=m-1; i++) { ae_v_cmuld(&a->ptr.pp_complex[i][0], 1, ae_v_len(0,ae_minint(i, n-1, _state)), v); } } ae_frame_leave(_state); } void rmatrixplu(/* Real */ ae_matrix* a, ae_int_t m, ae_int_t n, /* Integer */ ae_vector* pivots, ae_state *_state) { ae_frame _frame_block; ae_vector tmp; ae_int_t i; ae_int_t j; double mx; double v; ae_frame_make(_state, &_frame_block); ae_vector_clear(pivots); ae_vector_init(&tmp, 0, DT_REAL, _state); /* * Internal LU decomposition subroutine. * Never call it directly. */ ae_assert(m>0, "RMatrixPLU: incorrect M!", _state); ae_assert(n>0, "RMatrixPLU: incorrect N!", _state); ae_vector_set_length(&tmp, 2*ae_maxint(m, n, _state), _state); ae_vector_set_length(pivots, ae_minint(m, n, _state), _state); /* * Scale matrix to avoid overflows, * decompose it, then scale back. */ mx = (double)(0); for(i=0; i<=m-1; i++) { for(j=0; j<=n-1; j++) { mx = ae_maxreal(mx, ae_fabs(a->ptr.pp_double[i][j], _state), _state); } } if( ae_fp_neq(mx,(double)(0)) ) { v = 1/mx; for(i=0; i<=m-1; i++) { ae_v_muld(&a->ptr.pp_double[i][0], 1, ae_v_len(0,n-1), v); } } trfac_rmatrixplurec(a, 0, m, n, pivots, &tmp, _state); if( ae_fp_neq(mx,(double)(0)) ) { v = mx; for(i=0; i<=ae_minint(m, n, _state)-1; i++) { ae_v_muld(&a->ptr.pp_double[i][i], 1, ae_v_len(i,n-1), v); } } ae_frame_leave(_state); } void cmatrixplu(/* Complex */ ae_matrix* a, ae_int_t m, ae_int_t n, /* Integer */ ae_vector* pivots, ae_state *_state) { ae_frame _frame_block; ae_vector tmp; ae_int_t i; ae_int_t j; double mx; ae_complex v; ae_frame_make(_state, &_frame_block); ae_vector_clear(pivots); ae_vector_init(&tmp, 0, DT_COMPLEX, _state); /* * Internal LU decomposition subroutine. * Never call it directly. */ ae_assert(m>0, "CMatrixPLU: incorrect M!", _state); ae_assert(n>0, "CMatrixPLU: incorrect N!", _state); ae_vector_set_length(&tmp, 2*ae_maxint(m, n, _state), _state); ae_vector_set_length(pivots, ae_minint(m, n, _state), _state); /* * Scale matrix to avoid overflows, * decompose it, then scale back. */ mx = (double)(0); for(i=0; i<=m-1; i++) { for(j=0; j<=n-1; j++) { mx = ae_maxreal(mx, ae_c_abs(a->ptr.pp_complex[i][j], _state), _state); } } if( ae_fp_neq(mx,(double)(0)) ) { v = ae_complex_from_d(1/mx); for(i=0; i<=m-1; i++) { ae_v_cmulc(&a->ptr.pp_complex[i][0], 1, ae_v_len(0,n-1), v); } } trfac_cmatrixplurec(a, 0, m, n, pivots, &tmp, _state); if( ae_fp_neq(mx,(double)(0)) ) { v = ae_complex_from_d(mx); for(i=0; i<=ae_minint(m, n, _state)-1; i++) { ae_v_cmulc(&a->ptr.pp_complex[i][i], 1, ae_v_len(i,n-1), v); } } ae_frame_leave(_state); } /************************************************************************* Recursive computational subroutine for SPDMatrixCholesky. INPUT PARAMETERS: A - matrix given by upper or lower triangle Offs - offset of diagonal block to decompose N - diagonal block size IsUpper - what half is given Tmp - temporary array; allocated by function, if its size is too small; can be reused on subsequent calls. 
OUTPUT PARAMETERS: A - upper (or lower) triangle contains Cholesky decomposition RESULT: True, on success False, on failure -- ALGLIB routine -- 15.12.2009 Bochkanov Sergey *************************************************************************/ ae_bool spdmatrixcholeskyrec(/* Real */ ae_matrix* a, ae_int_t offs, ae_int_t n, ae_bool isupper, /* Real */ ae_vector* tmp, ae_state *_state) { ae_int_t n1; ae_int_t n2; ae_bool result; /* * check N */ if( n<1 ) { result = ae_false; return result; } /* * Prepare buffer */ if( tmp->cnt<2*n ) { ae_vector_set_length(tmp, 2*n, _state); } /* * special cases * * NOTE: we do not use MKL to accelerate Cholesky basecase * because basecase cost is negligible when compared to * the cost of entire decomposition (most time is spent * in GEMM snd SYRK). */ if( n==1 ) { if( ae_fp_greater(a->ptr.pp_double[offs][offs],(double)(0)) ) { a->ptr.pp_double[offs][offs] = ae_sqrt(a->ptr.pp_double[offs][offs], _state); result = ae_true; } else { result = ae_false; } return result; } if( n<=ablasblocksize(a, _state) ) { result = trfac_spdmatrixcholesky2(a, offs, n, isupper, tmp, _state); return result; } /* * general case: split task in cache-oblivious manner */ result = ae_true; ablassplitlength(a, n, &n1, &n2, _state); result = spdmatrixcholeskyrec(a, offs, n1, isupper, tmp, _state); if( !result ) { return result; } if( n2>0 ) { if( isupper ) { rmatrixlefttrsm(n1, n2, a, offs, offs, isupper, ae_false, 1, a, offs, offs+n1, _state); rmatrixsyrk(n2, n1, -1.0, a, offs, offs+n1, 1, 1.0, a, offs+n1, offs+n1, isupper, _state); } else { rmatrixrighttrsm(n2, n1, a, offs, offs, isupper, ae_false, 1, a, offs+n1, offs, _state); rmatrixsyrk(n2, n1, -1.0, a, offs+n1, offs, 0, 1.0, a, offs+n1, offs+n1, isupper, _state); } result = spdmatrixcholeskyrec(a, offs+n1, n2, isupper, tmp, _state); if( !result ) { return result; } } return result; } /************************************************************************* Recurrent complex LU subroutine. Never call it directly. 
-- ALGLIB routine -- 04.01.2010 Bochkanov Sergey *************************************************************************/ static void trfac_cmatrixluprec(/* Complex */ ae_matrix* a, ae_int_t offs, ae_int_t m, ae_int_t n, /* Integer */ ae_vector* pivots, /* Complex */ ae_vector* tmp, ae_state *_state) { ae_int_t i; ae_int_t m1; ae_int_t m2; /* * Kernel case */ if( ae_minint(m, n, _state)<=ablascomplexblocksize(a, _state) ) { trfac_cmatrixlup2(a, offs, m, n, pivots, tmp, _state); return; } /* * Preliminary step, make N>=M * * ( A1 ) * A = ( ), where A1 is square * ( A2 ) * * Factorize A1, update A2 */ if( m>n ) { trfac_cmatrixluprec(a, offs, n, n, pivots, tmp, _state); for(i=0; i<=n-1; i++) { ae_v_cmove(&tmp->ptr.p_complex[0], 1, &a->ptr.pp_complex[offs+n][offs+i], a->stride, "N", ae_v_len(0,m-n-1)); ae_v_cmove(&a->ptr.pp_complex[offs+n][offs+i], a->stride, &a->ptr.pp_complex[offs+n][pivots->ptr.p_int[offs+i]], a->stride, "N", ae_v_len(offs+n,offs+m-1)); ae_v_cmove(&a->ptr.pp_complex[offs+n][pivots->ptr.p_int[offs+i]], a->stride, &tmp->ptr.p_complex[0], 1, "N", ae_v_len(offs+n,offs+m-1)); } cmatrixrighttrsm(m-n, n, a, offs, offs, ae_true, ae_true, 0, a, offs+n, offs, _state); return; } /* * Non-kernel case */ ablascomplexsplitlength(a, m, &m1, &m2, _state); trfac_cmatrixluprec(a, offs, m1, n, pivots, tmp, _state); if( m2>0 ) { for(i=0; i<=m1-1; i++) { if( offs+i!=pivots->ptr.p_int[offs+i] ) { ae_v_cmove(&tmp->ptr.p_complex[0], 1, &a->ptr.pp_complex[offs+m1][offs+i], a->stride, "N", ae_v_len(0,m2-1)); ae_v_cmove(&a->ptr.pp_complex[offs+m1][offs+i], a->stride, &a->ptr.pp_complex[offs+m1][pivots->ptr.p_int[offs+i]], a->stride, "N", ae_v_len(offs+m1,offs+m-1)); ae_v_cmove(&a->ptr.pp_complex[offs+m1][pivots->ptr.p_int[offs+i]], a->stride, &tmp->ptr.p_complex[0], 1, "N", ae_v_len(offs+m1,offs+m-1)); } } cmatrixrighttrsm(m2, m1, a, offs, offs, ae_true, ae_true, 0, a, offs+m1, offs, _state); cmatrixgemm(m-m1, n-m1, m1, ae_complex_from_d(-1.0), a, offs+m1, offs, 0, a, offs, offs+m1, 0, ae_complex_from_d(1.0), a, offs+m1, offs+m1, _state); trfac_cmatrixluprec(a, offs+m1, m-m1, n-m1, pivots, tmp, _state); for(i=0; i<=m2-1; i++) { if( offs+m1+i!=pivots->ptr.p_int[offs+m1+i] ) { ae_v_cmove(&tmp->ptr.p_complex[0], 1, &a->ptr.pp_complex[offs][offs+m1+i], a->stride, "N", ae_v_len(0,m1-1)); ae_v_cmove(&a->ptr.pp_complex[offs][offs+m1+i], a->stride, &a->ptr.pp_complex[offs][pivots->ptr.p_int[offs+m1+i]], a->stride, "N", ae_v_len(offs,offs+m1-1)); ae_v_cmove(&a->ptr.pp_complex[offs][pivots->ptr.p_int[offs+m1+i]], a->stride, &tmp->ptr.p_complex[0], 1, "N", ae_v_len(offs,offs+m1-1)); } } } } /************************************************************************* Recurrent real LU subroutine. Never call it directly. 
-- ALGLIB routine -- 04.01.2010 Bochkanov Sergey *************************************************************************/ static void trfac_rmatrixluprec(/* Real */ ae_matrix* a, ae_int_t offs, ae_int_t m, ae_int_t n, /* Integer */ ae_vector* pivots, /* Real */ ae_vector* tmp, ae_state *_state) { ae_int_t i; ae_int_t m1; ae_int_t m2; /* * Kernel case */ if( ae_minint(m, n, _state)<=ablasblocksize(a, _state) ) { trfac_rmatrixlup2(a, offs, m, n, pivots, tmp, _state); return; } /* * Preliminary step, make N>=M * * ( A1 ) * A = ( ), where A1 is square * ( A2 ) * * Factorize A1, update A2 */ if( m>n ) { trfac_rmatrixluprec(a, offs, n, n, pivots, tmp, _state); for(i=0; i<=n-1; i++) { if( offs+i!=pivots->ptr.p_int[offs+i] ) { ae_v_move(&tmp->ptr.p_double[0], 1, &a->ptr.pp_double[offs+n][offs+i], a->stride, ae_v_len(0,m-n-1)); ae_v_move(&a->ptr.pp_double[offs+n][offs+i], a->stride, &a->ptr.pp_double[offs+n][pivots->ptr.p_int[offs+i]], a->stride, ae_v_len(offs+n,offs+m-1)); ae_v_move(&a->ptr.pp_double[offs+n][pivots->ptr.p_int[offs+i]], a->stride, &tmp->ptr.p_double[0], 1, ae_v_len(offs+n,offs+m-1)); } } rmatrixrighttrsm(m-n, n, a, offs, offs, ae_true, ae_true, 0, a, offs+n, offs, _state); return; } /* * Non-kernel case */ ablassplitlength(a, m, &m1, &m2, _state); trfac_rmatrixluprec(a, offs, m1, n, pivots, tmp, _state); if( m2>0 ) { for(i=0; i<=m1-1; i++) { if( offs+i!=pivots->ptr.p_int[offs+i] ) { ae_v_move(&tmp->ptr.p_double[0], 1, &a->ptr.pp_double[offs+m1][offs+i], a->stride, ae_v_len(0,m2-1)); ae_v_move(&a->ptr.pp_double[offs+m1][offs+i], a->stride, &a->ptr.pp_double[offs+m1][pivots->ptr.p_int[offs+i]], a->stride, ae_v_len(offs+m1,offs+m-1)); ae_v_move(&a->ptr.pp_double[offs+m1][pivots->ptr.p_int[offs+i]], a->stride, &tmp->ptr.p_double[0], 1, ae_v_len(offs+m1,offs+m-1)); } } rmatrixrighttrsm(m2, m1, a, offs, offs, ae_true, ae_true, 0, a, offs+m1, offs, _state); rmatrixgemm(m-m1, n-m1, m1, -1.0, a, offs+m1, offs, 0, a, offs, offs+m1, 0, 1.0, a, offs+m1, offs+m1, _state); trfac_rmatrixluprec(a, offs+m1, m-m1, n-m1, pivots, tmp, _state); for(i=0; i<=m2-1; i++) { if( offs+m1+i!=pivots->ptr.p_int[offs+m1+i] ) { ae_v_move(&tmp->ptr.p_double[0], 1, &a->ptr.pp_double[offs][offs+m1+i], a->stride, ae_v_len(0,m1-1)); ae_v_move(&a->ptr.pp_double[offs][offs+m1+i], a->stride, &a->ptr.pp_double[offs][pivots->ptr.p_int[offs+m1+i]], a->stride, ae_v_len(offs,offs+m1-1)); ae_v_move(&a->ptr.pp_double[offs][pivots->ptr.p_int[offs+m1+i]], a->stride, &tmp->ptr.p_double[0], 1, ae_v_len(offs,offs+m1-1)); } } } } /************************************************************************* Recurrent complex LU subroutine. Never call it directly. -- ALGLIB routine -- 04.01.2010 Bochkanov Sergey *************************************************************************/ static void trfac_cmatrixplurec(/* Complex */ ae_matrix* a, ae_int_t offs, ae_int_t m, ae_int_t n, /* Integer */ ae_vector* pivots, /* Complex */ ae_vector* tmp, ae_state *_state) { ae_int_t i; ae_int_t n1; ae_int_t n2; /* * Kernel case */ if( ae_minint(m, n, _state)<=ablascomplexblocksize(a, _state) ) { trfac_cmatrixplu2(a, offs, m, n, pivots, tmp, _state); return; } /* * Preliminary step, make M>=N. 
* * A = (A1 A2), where A1 is square * Factorize A1, update A2 */ if( n>m ) { trfac_cmatrixplurec(a, offs, m, m, pivots, tmp, _state); for(i=0; i<=m-1; i++) { ae_v_cmove(&tmp->ptr.p_complex[0], 1, &a->ptr.pp_complex[offs+i][offs+m], 1, "N", ae_v_len(0,n-m-1)); ae_v_cmove(&a->ptr.pp_complex[offs+i][offs+m], 1, &a->ptr.pp_complex[pivots->ptr.p_int[offs+i]][offs+m], 1, "N", ae_v_len(offs+m,offs+n-1)); ae_v_cmove(&a->ptr.pp_complex[pivots->ptr.p_int[offs+i]][offs+m], 1, &tmp->ptr.p_complex[0], 1, "N", ae_v_len(offs+m,offs+n-1)); } cmatrixlefttrsm(m, n-m, a, offs, offs, ae_false, ae_true, 0, a, offs, offs+m, _state); return; } /* * Non-kernel case */ ablascomplexsplitlength(a, n, &n1, &n2, _state); trfac_cmatrixplurec(a, offs, m, n1, pivots, tmp, _state); if( n2>0 ) { for(i=0; i<=n1-1; i++) { if( offs+i!=pivots->ptr.p_int[offs+i] ) { ae_v_cmove(&tmp->ptr.p_complex[0], 1, &a->ptr.pp_complex[offs+i][offs+n1], 1, "N", ae_v_len(0,n2-1)); ae_v_cmove(&a->ptr.pp_complex[offs+i][offs+n1], 1, &a->ptr.pp_complex[pivots->ptr.p_int[offs+i]][offs+n1], 1, "N", ae_v_len(offs+n1,offs+n-1)); ae_v_cmove(&a->ptr.pp_complex[pivots->ptr.p_int[offs+i]][offs+n1], 1, &tmp->ptr.p_complex[0], 1, "N", ae_v_len(offs+n1,offs+n-1)); } } cmatrixlefttrsm(n1, n2, a, offs, offs, ae_false, ae_true, 0, a, offs, offs+n1, _state); cmatrixgemm(m-n1, n-n1, n1, ae_complex_from_d(-1.0), a, offs+n1, offs, 0, a, offs, offs+n1, 0, ae_complex_from_d(1.0), a, offs+n1, offs+n1, _state); trfac_cmatrixplurec(a, offs+n1, m-n1, n-n1, pivots, tmp, _state); for(i=0; i<=n2-1; i++) { if( offs+n1+i!=pivots->ptr.p_int[offs+n1+i] ) { ae_v_cmove(&tmp->ptr.p_complex[0], 1, &a->ptr.pp_complex[offs+n1+i][offs], 1, "N", ae_v_len(0,n1-1)); ae_v_cmove(&a->ptr.pp_complex[offs+n1+i][offs], 1, &a->ptr.pp_complex[pivots->ptr.p_int[offs+n1+i]][offs], 1, "N", ae_v_len(offs,offs+n1-1)); ae_v_cmove(&a->ptr.pp_complex[pivots->ptr.p_int[offs+n1+i]][offs], 1, &tmp->ptr.p_complex[0], 1, "N", ae_v_len(offs,offs+n1-1)); } } } } /************************************************************************* Recurrent real LU subroutine. Never call it directly. -- ALGLIB routine -- 04.01.2010 Bochkanov Sergey *************************************************************************/ static void trfac_rmatrixplurec(/* Real */ ae_matrix* a, ae_int_t offs, ae_int_t m, ae_int_t n, /* Integer */ ae_vector* pivots, /* Real */ ae_vector* tmp, ae_state *_state) { ae_int_t i; ae_int_t n1; ae_int_t n2; /* * Basecases */ if( rmatrixplumkl(a, offs, m, n, pivots, _state) ) { return; } if( ae_minint(m, n, _state)<=ablasblocksize(a, _state) ) { trfac_rmatrixplu2(a, offs, m, n, pivots, tmp, _state); return; } /* * Preliminary step, make M>=N. 
* * A = (A1 A2), where A1 is square * Factorize A1, update A2 */ if( n>m ) { trfac_rmatrixplurec(a, offs, m, m, pivots, tmp, _state); for(i=0; i<=m-1; i++) { ae_v_move(&tmp->ptr.p_double[0], 1, &a->ptr.pp_double[offs+i][offs+m], 1, ae_v_len(0,n-m-1)); ae_v_move(&a->ptr.pp_double[offs+i][offs+m], 1, &a->ptr.pp_double[pivots->ptr.p_int[offs+i]][offs+m], 1, ae_v_len(offs+m,offs+n-1)); ae_v_move(&a->ptr.pp_double[pivots->ptr.p_int[offs+i]][offs+m], 1, &tmp->ptr.p_double[0], 1, ae_v_len(offs+m,offs+n-1)); } rmatrixlefttrsm(m, n-m, a, offs, offs, ae_false, ae_true, 0, a, offs, offs+m, _state); return; } /* * Non-kernel case */ ablassplitlength(a, n, &n1, &n2, _state); trfac_rmatrixplurec(a, offs, m, n1, pivots, tmp, _state); if( n2>0 ) { for(i=0; i<=n1-1; i++) { if( offs+i!=pivots->ptr.p_int[offs+i] ) { ae_v_move(&tmp->ptr.p_double[0], 1, &a->ptr.pp_double[offs+i][offs+n1], 1, ae_v_len(0,n2-1)); ae_v_move(&a->ptr.pp_double[offs+i][offs+n1], 1, &a->ptr.pp_double[pivots->ptr.p_int[offs+i]][offs+n1], 1, ae_v_len(offs+n1,offs+n-1)); ae_v_move(&a->ptr.pp_double[pivots->ptr.p_int[offs+i]][offs+n1], 1, &tmp->ptr.p_double[0], 1, ae_v_len(offs+n1,offs+n-1)); } } rmatrixlefttrsm(n1, n2, a, offs, offs, ae_false, ae_true, 0, a, offs, offs+n1, _state); rmatrixgemm(m-n1, n-n1, n1, -1.0, a, offs+n1, offs, 0, a, offs, offs+n1, 0, 1.0, a, offs+n1, offs+n1, _state); trfac_rmatrixplurec(a, offs+n1, m-n1, n-n1, pivots, tmp, _state); for(i=0; i<=n2-1; i++) { if( offs+n1+i!=pivots->ptr.p_int[offs+n1+i] ) { ae_v_move(&tmp->ptr.p_double[0], 1, &a->ptr.pp_double[offs+n1+i][offs], 1, ae_v_len(0,n1-1)); ae_v_move(&a->ptr.pp_double[offs+n1+i][offs], 1, &a->ptr.pp_double[pivots->ptr.p_int[offs+n1+i]][offs], 1, ae_v_len(offs,offs+n1-1)); ae_v_move(&a->ptr.pp_double[pivots->ptr.p_int[offs+n1+i]][offs], 1, &tmp->ptr.p_double[0], 1, ae_v_len(offs,offs+n1-1)); } } } } /************************************************************************* Complex LUP kernel -- ALGLIB routine -- 10.01.2010 Bochkanov Sergey *************************************************************************/ static void trfac_cmatrixlup2(/* Complex */ ae_matrix* a, ae_int_t offs, ae_int_t m, ae_int_t n, /* Integer */ ae_vector* pivots, /* Complex */ ae_vector* tmp, ae_state *_state) { ae_int_t i; ae_int_t j; ae_int_t jp; ae_complex s; /* * Quick return if possible */ if( m==0||n==0 ) { return; } /* * main cycle */ for(j=0; j<=ae_minint(m-1, n-1, _state); j++) { /* * Find pivot, swap columns */ jp = j; for(i=j+1; i<=n-1; i++) { if( ae_fp_greater(ae_c_abs(a->ptr.pp_complex[offs+j][offs+i], _state),ae_c_abs(a->ptr.pp_complex[offs+j][offs+jp], _state)) ) { jp = i; } } pivots->ptr.p_int[offs+j] = offs+jp; if( jp!=j ) { ae_v_cmove(&tmp->ptr.p_complex[0], 1, &a->ptr.pp_complex[offs][offs+j], a->stride, "N", ae_v_len(0,m-1)); ae_v_cmove(&a->ptr.pp_complex[offs][offs+j], a->stride, &a->ptr.pp_complex[offs][offs+jp], a->stride, "N", ae_v_len(offs,offs+m-1)); ae_v_cmove(&a->ptr.pp_complex[offs][offs+jp], a->stride, &tmp->ptr.p_complex[0], 1, "N", ae_v_len(offs,offs+m-1)); } /* * LU decomposition of 1x(N-J) matrix */ if( ae_c_neq_d(a->ptr.pp_complex[offs+j][offs+j],(double)(0))&&j+1<=n-1 ) { s = ae_c_d_div(1,a->ptr.pp_complex[offs+j][offs+j]); ae_v_cmulc(&a->ptr.pp_complex[offs+j][offs+j+1], 1, ae_v_len(offs+j+1,offs+n-1), s); } /* * Update trailing (M-J-1)x(N-J-1) matrix */ if( jptr.p_complex[0], 1, &a->ptr.pp_complex[offs+j+1][offs+j], a->stride, "N", ae_v_len(0,m-j-2)); ae_v_cmoveneg(&tmp->ptr.p_complex[m], 1, &a->ptr.pp_complex[offs+j][offs+j+1], 1, "N", 
ae_v_len(m,m+n-j-2)); cmatrixrank1(m-j-1, n-j-1, a, offs+j+1, offs+j+1, tmp, 0, tmp, m, _state); } } } /************************************************************************* Real LUP kernel -- ALGLIB routine -- 10.01.2010 Bochkanov Sergey *************************************************************************/ static void trfac_rmatrixlup2(/* Real */ ae_matrix* a, ae_int_t offs, ae_int_t m, ae_int_t n, /* Integer */ ae_vector* pivots, /* Real */ ae_vector* tmp, ae_state *_state) { ae_int_t i; ae_int_t j; ae_int_t jp; double s; /* * Quick return if possible */ if( m==0||n==0 ) { return; } /* * main cycle */ for(j=0; j<=ae_minint(m-1, n-1, _state); j++) { /* * Find pivot, swap columns */ jp = j; for(i=j+1; i<=n-1; i++) { if( ae_fp_greater(ae_fabs(a->ptr.pp_double[offs+j][offs+i], _state),ae_fabs(a->ptr.pp_double[offs+j][offs+jp], _state)) ) { jp = i; } } pivots->ptr.p_int[offs+j] = offs+jp; if( jp!=j ) { ae_v_move(&tmp->ptr.p_double[0], 1, &a->ptr.pp_double[offs][offs+j], a->stride, ae_v_len(0,m-1)); ae_v_move(&a->ptr.pp_double[offs][offs+j], a->stride, &a->ptr.pp_double[offs][offs+jp], a->stride, ae_v_len(offs,offs+m-1)); ae_v_move(&a->ptr.pp_double[offs][offs+jp], a->stride, &tmp->ptr.p_double[0], 1, ae_v_len(offs,offs+m-1)); } /* * LU decomposition of 1x(N-J) matrix */ if( ae_fp_neq(a->ptr.pp_double[offs+j][offs+j],(double)(0))&&j+1<=n-1 ) { s = 1/a->ptr.pp_double[offs+j][offs+j]; ae_v_muld(&a->ptr.pp_double[offs+j][offs+j+1], 1, ae_v_len(offs+j+1,offs+n-1), s); } /* * Update trailing (M-J-1)x(N-J-1) matrix */ if( jptr.p_double[0], 1, &a->ptr.pp_double[offs+j+1][offs+j], a->stride, ae_v_len(0,m-j-2)); ae_v_moveneg(&tmp->ptr.p_double[m], 1, &a->ptr.pp_double[offs+j][offs+j+1], 1, ae_v_len(m,m+n-j-2)); rmatrixrank1(m-j-1, n-j-1, a, offs+j+1, offs+j+1, tmp, 0, tmp, m, _state); } } } /************************************************************************* Complex PLU kernel -- LAPACK routine (version 3.0) -- Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., Courant Institute, Argonne National Lab, and Rice University June 30, 1992 *************************************************************************/ static void trfac_cmatrixplu2(/* Complex */ ae_matrix* a, ae_int_t offs, ae_int_t m, ae_int_t n, /* Integer */ ae_vector* pivots, /* Complex */ ae_vector* tmp, ae_state *_state) { ae_int_t i; ae_int_t j; ae_int_t jp; ae_complex s; /* * Quick return if possible */ if( m==0||n==0 ) { return; } for(j=0; j<=ae_minint(m-1, n-1, _state); j++) { /* * Find pivot and test for singularity. */ jp = j; for(i=j+1; i<=m-1; i++) { if( ae_fp_greater(ae_c_abs(a->ptr.pp_complex[offs+i][offs+j], _state),ae_c_abs(a->ptr.pp_complex[offs+jp][offs+j], _state)) ) { jp = i; } } pivots->ptr.p_int[offs+j] = offs+jp; if( ae_c_neq_d(a->ptr.pp_complex[offs+jp][offs+j],(double)(0)) ) { /* *Apply the interchange to rows */ if( jp!=j ) { for(i=0; i<=n-1; i++) { s = a->ptr.pp_complex[offs+j][offs+i]; a->ptr.pp_complex[offs+j][offs+i] = a->ptr.pp_complex[offs+jp][offs+i]; a->ptr.pp_complex[offs+jp][offs+i] = s; } } /* *Compute elements J+1:M of J-th column. 
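*
* (the subdiagonal part of column J is divided by the pivot A[J,J]; these
* multipliers form the unit lower triangular factor of the pivoted LU
* decomposition)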
*/ if( j+1<=m-1 ) { s = ae_c_d_div(1,a->ptr.pp_complex[offs+j][offs+j]); ae_v_cmulc(&a->ptr.pp_complex[offs+j+1][offs+j], a->stride, ae_v_len(offs+j+1,offs+m-1), s); } } if( jptr.p_complex[0], 1, &a->ptr.pp_complex[offs+j+1][offs+j], a->stride, "N", ae_v_len(0,m-j-2)); ae_v_cmoveneg(&tmp->ptr.p_complex[m], 1, &a->ptr.pp_complex[offs+j][offs+j+1], 1, "N", ae_v_len(m,m+n-j-2)); cmatrixrank1(m-j-1, n-j-1, a, offs+j+1, offs+j+1, tmp, 0, tmp, m, _state); } } } /************************************************************************* Real PLU kernel -- LAPACK routine (version 3.0) -- Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., Courant Institute, Argonne National Lab, and Rice University June 30, 1992 *************************************************************************/ static void trfac_rmatrixplu2(/* Real */ ae_matrix* a, ae_int_t offs, ae_int_t m, ae_int_t n, /* Integer */ ae_vector* pivots, /* Real */ ae_vector* tmp, ae_state *_state) { ae_int_t i; ae_int_t j; ae_int_t jp; double s; /* * Quick return if possible */ if( m==0||n==0 ) { return; } for(j=0; j<=ae_minint(m-1, n-1, _state); j++) { /* * Find pivot and test for singularity. */ jp = j; for(i=j+1; i<=m-1; i++) { if( ae_fp_greater(ae_fabs(a->ptr.pp_double[offs+i][offs+j], _state),ae_fabs(a->ptr.pp_double[offs+jp][offs+j], _state)) ) { jp = i; } } pivots->ptr.p_int[offs+j] = offs+jp; if( ae_fp_neq(a->ptr.pp_double[offs+jp][offs+j],(double)(0)) ) { /* *Apply the interchange to rows */ if( jp!=j ) { for(i=0; i<=n-1; i++) { s = a->ptr.pp_double[offs+j][offs+i]; a->ptr.pp_double[offs+j][offs+i] = a->ptr.pp_double[offs+jp][offs+i]; a->ptr.pp_double[offs+jp][offs+i] = s; } } /* *Compute elements J+1:M of J-th column. */ if( j+1<=m-1 ) { s = 1/a->ptr.pp_double[offs+j][offs+j]; ae_v_muld(&a->ptr.pp_double[offs+j+1][offs+j], a->stride, ae_v_len(offs+j+1,offs+m-1), s); } } if( jptr.p_double[0], 1, &a->ptr.pp_double[offs+j+1][offs+j], a->stride, ae_v_len(0,m-j-2)); ae_v_moveneg(&tmp->ptr.p_double[m], 1, &a->ptr.pp_double[offs+j][offs+j+1], 1, ae_v_len(m,m+n-j-2)); rmatrixrank1(m-j-1, n-j-1, a, offs+j+1, offs+j+1, tmp, 0, tmp, m, _state); } } } /************************************************************************* Recursive computational subroutine for HPDMatrixCholesky -- ALGLIB routine -- 15.12.2009 Bochkanov Sergey *************************************************************************/ static ae_bool trfac_hpdmatrixcholeskyrec(/* Complex */ ae_matrix* a, ae_int_t offs, ae_int_t n, ae_bool isupper, /* Complex */ ae_vector* tmp, ae_state *_state) { ae_int_t n1; ae_int_t n2; ae_bool result; /* * check N */ if( n<1 ) { result = ae_false; return result; } /* * Prepare buffer */ if( tmp->cnt<2*n ) { ae_vector_set_length(tmp, 2*n, _state); } /* * special cases * * NOTE: we do not use MKL for basecases because their price is only * minor part of overall running time for N>256. 
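*
* For N not exceeding the complex block size the level-2 kernel
* trfac_hpdmatrixcholesky2() is used directly; larger problems are split
* recursively in a cache-oblivious manner, exactly as in the real-valued
* spdmatrixcholeskyrec().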
*/ if( n==1 ) { if( ae_fp_greater(a->ptr.pp_complex[offs][offs].x,(double)(0)) ) { a->ptr.pp_complex[offs][offs] = ae_complex_from_d(ae_sqrt(a->ptr.pp_complex[offs][offs].x, _state)); result = ae_true; } else { result = ae_false; } return result; } if( n<=ablascomplexblocksize(a, _state) ) { result = trfac_hpdmatrixcholesky2(a, offs, n, isupper, tmp, _state); return result; } /* * general case: split task in cache-oblivious manner */ result = ae_true; ablascomplexsplitlength(a, n, &n1, &n2, _state); result = trfac_hpdmatrixcholeskyrec(a, offs, n1, isupper, tmp, _state); if( !result ) { return result; } if( n2>0 ) { if( isupper ) { cmatrixlefttrsm(n1, n2, a, offs, offs, isupper, ae_false, 2, a, offs, offs+n1, _state); cmatrixherk(n2, n1, -1.0, a, offs, offs+n1, 2, 1.0, a, offs+n1, offs+n1, isupper, _state); } else { cmatrixrighttrsm(n2, n1, a, offs, offs, isupper, ae_false, 2, a, offs+n1, offs, _state); cmatrixherk(n2, n1, -1.0, a, offs+n1, offs, 0, 1.0, a, offs+n1, offs+n1, isupper, _state); } result = trfac_hpdmatrixcholeskyrec(a, offs+n1, n2, isupper, tmp, _state); if( !result ) { return result; } } return result; } /************************************************************************* Level-2 Hermitian Cholesky subroutine. -- LAPACK routine (version 3.0) -- Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., Courant Institute, Argonne National Lab, and Rice University February 29, 1992 *************************************************************************/ static ae_bool trfac_hpdmatrixcholesky2(/* Complex */ ae_matrix* aaa, ae_int_t offs, ae_int_t n, ae_bool isupper, /* Complex */ ae_vector* tmp, ae_state *_state) { ae_int_t i; ae_int_t j; double ajj; ae_complex v; double r; ae_bool result; result = ae_true; if( n<0 ) { result = ae_false; return result; } /* * Quick return if possible */ if( n==0 ) { return result; } if( isupper ) { /* * Compute the Cholesky factorization A = U'*U. */ for(j=0; j<=n-1; j++) { /* * Compute U(J,J) and test for non-positive-definiteness. */ v = ae_v_cdotproduct(&aaa->ptr.pp_complex[offs][offs+j], aaa->stride, "Conj", &aaa->ptr.pp_complex[offs][offs+j], aaa->stride, "N", ae_v_len(offs,offs+j-1)); ajj = ae_c_sub(aaa->ptr.pp_complex[offs+j][offs+j],v).x; if( ae_fp_less_eq(ajj,(double)(0)) ) { aaa->ptr.pp_complex[offs+j][offs+j] = ae_complex_from_d(ajj); result = ae_false; return result; } ajj = ae_sqrt(ajj, _state); aaa->ptr.pp_complex[offs+j][offs+j] = ae_complex_from_d(ajj); /* * Compute elements J+1:N-1 of row J. */ if( j0 ) { ae_v_cmoveneg(&tmp->ptr.p_complex[0], 1, &aaa->ptr.pp_complex[offs][offs+j], aaa->stride, "Conj", ae_v_len(0,j-1)); cmatrixmv(n-j-1, j, aaa, offs, offs+j+1, 1, tmp, 0, tmp, n, _state); ae_v_cadd(&aaa->ptr.pp_complex[offs+j][offs+j+1], 1, &tmp->ptr.p_complex[n], 1, "N", ae_v_len(offs+j+1,offs+n-1)); } r = 1/ajj; ae_v_cmuld(&aaa->ptr.pp_complex[offs+j][offs+j+1], 1, ae_v_len(offs+j+1,offs+n-1), r); } } } else { /* * Compute the Cholesky factorization A = L*L'. */ for(j=0; j<=n-1; j++) { /* * Compute L(J+1,J+1) and test for non-positive-definiteness. */ v = ae_v_cdotproduct(&aaa->ptr.pp_complex[offs+j][offs], 1, "Conj", &aaa->ptr.pp_complex[offs+j][offs], 1, "N", ae_v_len(offs,offs+j-1)); ajj = ae_c_sub(aaa->ptr.pp_complex[offs+j][offs+j],v).x; if( ae_fp_less_eq(ajj,(double)(0)) ) { aaa->ptr.pp_complex[offs+j][offs+j] = ae_complex_from_d(ajj); result = ae_false; return result; } ajj = ae_sqrt(ajj, _state); aaa->ptr.pp_complex[offs+j][offs+j] = ae_complex_from_d(ajj); /* * Compute elements J+1:N of column J. 
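*
* (column J below the diagonal is obtained as
*  ( A[J+1:N-1,J] - L[J+1:N-1,0:J-1]*conj(L[J,0:J-1])' ) / L[J,J],
*  with the matrix-vector product evaluated by cmatrixmv)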
*/ if( j0 ) { ae_v_cmove(&tmp->ptr.p_complex[0], 1, &aaa->ptr.pp_complex[offs+j][offs], 1, "Conj", ae_v_len(0,j-1)); cmatrixmv(n-j-1, j, aaa, offs+j+1, offs, 0, tmp, 0, tmp, n, _state); for(i=0; i<=n-j-2; i++) { aaa->ptr.pp_complex[offs+j+1+i][offs+j] = ae_c_div_d(ae_c_sub(aaa->ptr.pp_complex[offs+j+1+i][offs+j],tmp->ptr.p_complex[n+i]),ajj); } } else { for(i=0; i<=n-j-2; i++) { aaa->ptr.pp_complex[offs+j+1+i][offs+j] = ae_c_div_d(aaa->ptr.pp_complex[offs+j+1+i][offs+j],ajj); } } } } } return result; } /************************************************************************* Level-2 Cholesky subroutine -- LAPACK routine (version 3.0) -- Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., Courant Institute, Argonne National Lab, and Rice University February 29, 1992 *************************************************************************/ static ae_bool trfac_spdmatrixcholesky2(/* Real */ ae_matrix* aaa, ae_int_t offs, ae_int_t n, ae_bool isupper, /* Real */ ae_vector* tmp, ae_state *_state) { ae_int_t i; ae_int_t j; double ajj; double v; double r; ae_bool result; result = ae_true; if( n<0 ) { result = ae_false; return result; } /* * Quick return if possible */ if( n==0 ) { return result; } if( isupper ) { /* * Compute the Cholesky factorization A = U'*U. */ for(j=0; j<=n-1; j++) { /* * Compute U(J,J) and test for non-positive-definiteness. */ v = ae_v_dotproduct(&aaa->ptr.pp_double[offs][offs+j], aaa->stride, &aaa->ptr.pp_double[offs][offs+j], aaa->stride, ae_v_len(offs,offs+j-1)); ajj = aaa->ptr.pp_double[offs+j][offs+j]-v; if( ae_fp_less_eq(ajj,(double)(0)) ) { aaa->ptr.pp_double[offs+j][offs+j] = ajj; result = ae_false; return result; } ajj = ae_sqrt(ajj, _state); aaa->ptr.pp_double[offs+j][offs+j] = ajj; /* * Compute elements J+1:N-1 of row J. */ if( j0 ) { ae_v_moveneg(&tmp->ptr.p_double[0], 1, &aaa->ptr.pp_double[offs][offs+j], aaa->stride, ae_v_len(0,j-1)); rmatrixmv(n-j-1, j, aaa, offs, offs+j+1, 1, tmp, 0, tmp, n, _state); ae_v_add(&aaa->ptr.pp_double[offs+j][offs+j+1], 1, &tmp->ptr.p_double[n], 1, ae_v_len(offs+j+1,offs+n-1)); } r = 1/ajj; ae_v_muld(&aaa->ptr.pp_double[offs+j][offs+j+1], 1, ae_v_len(offs+j+1,offs+n-1), r); } } } else { /* * Compute the Cholesky factorization A = L*L'. */ for(j=0; j<=n-1; j++) { /* * Compute L(J+1,J+1) and test for non-positive-definiteness. */ v = ae_v_dotproduct(&aaa->ptr.pp_double[offs+j][offs], 1, &aaa->ptr.pp_double[offs+j][offs], 1, ae_v_len(offs,offs+j-1)); ajj = aaa->ptr.pp_double[offs+j][offs+j]-v; if( ae_fp_less_eq(ajj,(double)(0)) ) { aaa->ptr.pp_double[offs+j][offs+j] = ajj; result = ae_false; return result; } ajj = ae_sqrt(ajj, _state); aaa->ptr.pp_double[offs+j][offs+j] = ajj; /* * Compute elements J+1:N of column J. */ if( j0 ) { ae_v_move(&tmp->ptr.p_double[0], 1, &aaa->ptr.pp_double[offs+j][offs], 1, ae_v_len(0,j-1)); rmatrixmv(n-j-1, j, aaa, offs+j+1, offs, 0, tmp, 0, tmp, n, _state); for(i=0; i<=n-j-2; i++) { aaa->ptr.pp_double[offs+j+1+i][offs+j] = (aaa->ptr.pp_double[offs+j+1+i][offs+j]-tmp->ptr.p_double[n+i])/ajj; } } else { for(i=0; i<=n-j-2; i++) { aaa->ptr.pp_double[offs+j+1+i][offs+j] = aaa->ptr.pp_double[offs+j+1+i][offs+j]/ajj; } } } } } return result; } /************************************************************************* Estimate of a matrix condition number (1-norm) The algorithm calculates a lower bound of the condition number. In this case, the algorithm does not return a lower bound of the condition number, but an inverse number (to avoid an overflow in case of a singular matrix). 
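In other words, the returned value estimates 1/(||A||_1*||inv(A)||_1): the 1-norm of A is computed exactly from its column sums, then A is LU-factorized and the norm of the inverse is estimated from the triangular factors, without forming inv(A) explicitly.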
Input parameters: A - matrix. Array whose indexes range within [0..N-1, 0..N-1]. N - size of matrix A. Result: 1/LowerBound(cond(A)) NOTE: if k(A) is very large, then matrix is assumed degenerate, k(A)=INF, 0.0 is returned in such cases. *************************************************************************/ double rmatrixrcond1(/* Real */ ae_matrix* a, ae_int_t n, ae_state *_state) { ae_frame _frame_block; ae_matrix _a; ae_int_t i; ae_int_t j; double v; double nrm; ae_vector pivots; ae_vector t; double result; ae_frame_make(_state, &_frame_block); ae_matrix_init_copy(&_a, a, _state); a = &_a; ae_vector_init(&pivots, 0, DT_INT, _state); ae_vector_init(&t, 0, DT_REAL, _state); ae_assert(n>=1, "RMatrixRCond1: N<1!", _state); ae_vector_set_length(&t, n, _state); for(i=0; i<=n-1; i++) { t.ptr.p_double[i] = (double)(0); } for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { t.ptr.p_double[j] = t.ptr.p_double[j]+ae_fabs(a->ptr.pp_double[i][j], _state); } } nrm = (double)(0); for(i=0; i<=n-1; i++) { nrm = ae_maxreal(nrm, t.ptr.p_double[i], _state); } rmatrixlu(a, n, n, &pivots, _state); rcond_rmatrixrcondluinternal(a, n, ae_true, ae_true, nrm, &v, _state); result = v; ae_frame_leave(_state); return result; } /************************************************************************* Estimate of a matrix condition number (infinity-norm). The algorithm calculates a lower bound of the condition number. In this case, the algorithm does not return a lower bound of the condition number, but an inverse number (to avoid an overflow in case of a singular matrix). Input parameters: A - matrix. Array whose indexes range within [0..N-1, 0..N-1]. N - size of matrix A. Result: 1/LowerBound(cond(A)) NOTE: if k(A) is very large, then matrix is assumed degenerate, k(A)=INF, 0.0 is returned in such cases. *************************************************************************/ double rmatrixrcondinf(/* Real */ ae_matrix* a, ae_int_t n, ae_state *_state) { ae_frame _frame_block; ae_matrix _a; ae_int_t i; ae_int_t j; double v; double nrm; ae_vector pivots; double result; ae_frame_make(_state, &_frame_block); ae_matrix_init_copy(&_a, a, _state); a = &_a; ae_vector_init(&pivots, 0, DT_INT, _state); ae_assert(n>=1, "RMatrixRCondInf: N<1!", _state); nrm = (double)(0); for(i=0; i<=n-1; i++) { v = (double)(0); for(j=0; j<=n-1; j++) { v = v+ae_fabs(a->ptr.pp_double[i][j], _state); } nrm = ae_maxreal(nrm, v, _state); } rmatrixlu(a, n, n, &pivots, _state); rcond_rmatrixrcondluinternal(a, n, ae_false, ae_true, nrm, &v, _state); result = v; ae_frame_leave(_state); return result; } /************************************************************************* Condition number estimate of a symmetric positive definite matrix. The algorithm calculates a lower bound of the condition number. In this case, the algorithm does not return a lower bound of the condition number, but an inverse number (to avoid an overflow in case of a singular matrix). It should be noted that 1-norm and inf-norm of condition numbers of symmetric matrices are equal, so the algorithm doesn't take into account the differences between these types of norms. Input parameters: A - symmetric positive definite matrix which is given by its upper or lower triangle depending on the value of IsUpper. Array with elements [0..N-1, 0..N-1]. N - size of matrix A. IsUpper - storage format. Result: 1/LowerBound(cond(A)), if matrix A is positive definite, -1, if matrix A is not positive definite, and its condition number could not be found by this algorithm. 
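The estimate is computed from the Cholesky factor of A; a failed Cholesky factorization is also how the algorithm detects that A is not positive definite (this is the case in which -1 is returned). A typical use of the returned value is a relative threshold test, for example (illustrative sketch only; the matrix A, its size N and the ae_state object are assumed to be initialized elsewhere, and the 1000*eps threshold is an arbitrary choice):

    double r = spdmatrixrcond(&a, n, ae_true, _state);
    ae_bool wellconditioned = r>(double)(0) && ae_fp_greater(r, 1000*ae_machineepsilon);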
NOTE: if k(A) is very large, then matrix is assumed degenerate, k(A)=INF, 0.0 is returned in such cases. *************************************************************************/ double spdmatrixrcond(/* Real */ ae_matrix* a, ae_int_t n, ae_bool isupper, ae_state *_state) { ae_frame _frame_block; ae_matrix _a; ae_int_t i; ae_int_t j; ae_int_t j1; ae_int_t j2; double v; double nrm; ae_vector t; double result; ae_frame_make(_state, &_frame_block); ae_matrix_init_copy(&_a, a, _state); a = &_a; ae_vector_init(&t, 0, DT_REAL, _state); ae_vector_set_length(&t, n, _state); for(i=0; i<=n-1; i++) { t.ptr.p_double[i] = (double)(0); } for(i=0; i<=n-1; i++) { if( isupper ) { j1 = i; j2 = n-1; } else { j1 = 0; j2 = i; } for(j=j1; j<=j2; j++) { if( i==j ) { t.ptr.p_double[i] = t.ptr.p_double[i]+ae_fabs(a->ptr.pp_double[i][i], _state); } else { t.ptr.p_double[i] = t.ptr.p_double[i]+ae_fabs(a->ptr.pp_double[i][j], _state); t.ptr.p_double[j] = t.ptr.p_double[j]+ae_fabs(a->ptr.pp_double[i][j], _state); } } } nrm = (double)(0); for(i=0; i<=n-1; i++) { nrm = ae_maxreal(nrm, t.ptr.p_double[i], _state); } if( spdmatrixcholesky(a, n, isupper, _state) ) { rcond_spdmatrixrcondcholeskyinternal(a, n, isupper, ae_true, nrm, &v, _state); result = v; } else { result = (double)(-1); } ae_frame_leave(_state); return result; } /************************************************************************* Triangular matrix: estimate of a condition number (1-norm) The algorithm calculates a lower bound of the condition number. In this case, the algorithm does not return a lower bound of the condition number, but an inverse number (to avoid an overflow in case of a singular matrix). Input parameters: A - matrix. Array[0..N-1, 0..N-1]. N - size of A. IsUpper - True, if the matrix is upper triangular. IsUnit - True, if the matrix has a unit diagonal. Result: 1/LowerBound(cond(A)) NOTE: if k(A) is very large, then matrix is assumed degenerate, k(A)=INF, 0.0 is returned in such cases. *************************************************************************/ double rmatrixtrrcond1(/* Real */ ae_matrix* a, ae_int_t n, ae_bool isupper, ae_bool isunit, ae_state *_state) { ae_frame _frame_block; ae_int_t i; ae_int_t j; double v; double nrm; ae_vector pivots; ae_vector t; ae_int_t j1; ae_int_t j2; double result; ae_frame_make(_state, &_frame_block); ae_vector_init(&pivots, 0, DT_INT, _state); ae_vector_init(&t, 0, DT_REAL, _state); ae_assert(n>=1, "RMatrixTRRCond1: N<1!", _state); ae_vector_set_length(&t, n, _state); for(i=0; i<=n-1; i++) { t.ptr.p_double[i] = (double)(0); } for(i=0; i<=n-1; i++) { if( isupper ) { j1 = i+1; j2 = n-1; } else { j1 = 0; j2 = i-1; } for(j=j1; j<=j2; j++) { t.ptr.p_double[j] = t.ptr.p_double[j]+ae_fabs(a->ptr.pp_double[i][j], _state); } if( isunit ) { t.ptr.p_double[i] = t.ptr.p_double[i]+1; } else { t.ptr.p_double[i] = t.ptr.p_double[i]+ae_fabs(a->ptr.pp_double[i][i], _state); } } nrm = (double)(0); for(i=0; i<=n-1; i++) { nrm = ae_maxreal(nrm, t.ptr.p_double[i], _state); } rcond_rmatrixrcondtrinternal(a, n, isupper, isunit, ae_true, nrm, &v, _state); result = v; ae_frame_leave(_state); return result; } /************************************************************************* Triangular matrix: estimate of a matrix condition number (infinity-norm). The algorithm calculates a lower bound of the condition number. In this case, the algorithm does not return a lower bound of the condition number, but an inverse number (to avoid an overflow in case of a singular matrix). Input parameters: A - matrix. 
Array whose indexes range within [0..N-1, 0..N-1]. N - size of matrix A. IsUpper - True, if the matrix is upper triangular. IsUnit - True, if the matrix has a unit diagonal. Result: 1/LowerBound(cond(A)) NOTE: if k(A) is very large, then matrix is assumed degenerate, k(A)=INF, 0.0 is returned in such cases. *************************************************************************/ double rmatrixtrrcondinf(/* Real */ ae_matrix* a, ae_int_t n, ae_bool isupper, ae_bool isunit, ae_state *_state) { ae_frame _frame_block; ae_int_t i; ae_int_t j; double v; double nrm; ae_vector pivots; ae_int_t j1; ae_int_t j2; double result; ae_frame_make(_state, &_frame_block); ae_vector_init(&pivots, 0, DT_INT, _state); ae_assert(n>=1, "RMatrixTRRCondInf: N<1!", _state); nrm = (double)(0); for(i=0; i<=n-1; i++) { if( isupper ) { j1 = i+1; j2 = n-1; } else { j1 = 0; j2 = i-1; } v = (double)(0); for(j=j1; j<=j2; j++) { v = v+ae_fabs(a->ptr.pp_double[i][j], _state); } if( isunit ) { v = v+1; } else { v = v+ae_fabs(a->ptr.pp_double[i][i], _state); } nrm = ae_maxreal(nrm, v, _state); } rcond_rmatrixrcondtrinternal(a, n, isupper, isunit, ae_false, nrm, &v, _state); result = v; ae_frame_leave(_state); return result; } /************************************************************************* Condition number estimate of a Hermitian positive definite matrix. The algorithm calculates a lower bound of the condition number. In this case, the algorithm does not return a lower bound of the condition number, but an inverse number (to avoid an overflow in case of a singular matrix). It should be noted that 1-norm and inf-norm of condition numbers of symmetric matrices are equal, so the algorithm doesn't take into account the differences between these types of norms. Input parameters: A - Hermitian positive definite matrix which is given by its upper or lower triangle depending on the value of IsUpper. Array with elements [0..N-1, 0..N-1]. N - size of matrix A. IsUpper - storage format. Result: 1/LowerBound(cond(A)), if matrix A is positive definite, -1, if matrix A is not positive definite, and its condition number could not be found by this algorithm. NOTE: if k(A) is very large, then matrix is assumed degenerate, k(A)=INF, 0.0 is returned in such cases. 
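The reciprocal value returned here is most useful as a rough accuracy indicator: a common rule of thumb is that a linear solve or inversion loses about log10(cond(A)) decimal digits. A small helper illustrating that rule (plain C++, independent of ALGLIB; the rule of thumb is a heuristic, not a guarantee made by this routine):

    #include <cmath>
    #include <cstdio>

    // rc is a value returned by one of the rcond functions, i.e. 1/LowerBound(cond(A))
    void report_expected_accuracy(double rc)
    {
        if( rc<=0 )
        {
            std::printf("matrix is degenerate or not positive definite\n");
            return;
        }
        double digits_lost = std::log10(1.0/rc);   // heuristic: digits lost ~ log10(cond(A))
        std::printf("roughly %.1f decimal digits of accuracy may be lost\n", digits_lost);
    }
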
*************************************************************************/ double hpdmatrixrcond(/* Complex */ ae_matrix* a, ae_int_t n, ae_bool isupper, ae_state *_state) { ae_frame _frame_block; ae_matrix _a; ae_int_t i; ae_int_t j; ae_int_t j1; ae_int_t j2; double v; double nrm; ae_vector t; double result; ae_frame_make(_state, &_frame_block); ae_matrix_init_copy(&_a, a, _state); a = &_a; ae_vector_init(&t, 0, DT_REAL, _state); ae_vector_set_length(&t, n, _state); for(i=0; i<=n-1; i++) { t.ptr.p_double[i] = (double)(0); } for(i=0; i<=n-1; i++) { if( isupper ) { j1 = i; j2 = n-1; } else { j1 = 0; j2 = i; } for(j=j1; j<=j2; j++) { if( i==j ) { t.ptr.p_double[i] = t.ptr.p_double[i]+ae_c_abs(a->ptr.pp_complex[i][i], _state); } else { t.ptr.p_double[i] = t.ptr.p_double[i]+ae_c_abs(a->ptr.pp_complex[i][j], _state); t.ptr.p_double[j] = t.ptr.p_double[j]+ae_c_abs(a->ptr.pp_complex[i][j], _state); } } } nrm = (double)(0); for(i=0; i<=n-1; i++) { nrm = ae_maxreal(nrm, t.ptr.p_double[i], _state); } if( hpdmatrixcholesky(a, n, isupper, _state) ) { rcond_hpdmatrixrcondcholeskyinternal(a, n, isupper, ae_true, nrm, &v, _state); result = v; } else { result = (double)(-1); } ae_frame_leave(_state); return result; } /************************************************************************* Estimate of a matrix condition number (1-norm) The algorithm calculates a lower bound of the condition number. In this case, the algorithm does not return a lower bound of the condition number, but an inverse number (to avoid an overflow in case of a singular matrix). Input parameters: A - matrix. Array whose indexes range within [0..N-1, 0..N-1]. N - size of matrix A. Result: 1/LowerBound(cond(A)) NOTE: if k(A) is very large, then matrix is assumed degenerate, k(A)=INF, 0.0 is returned in such cases. *************************************************************************/ double cmatrixrcond1(/* Complex */ ae_matrix* a, ae_int_t n, ae_state *_state) { ae_frame _frame_block; ae_matrix _a; ae_int_t i; ae_int_t j; double v; double nrm; ae_vector pivots; ae_vector t; double result; ae_frame_make(_state, &_frame_block); ae_matrix_init_copy(&_a, a, _state); a = &_a; ae_vector_init(&pivots, 0, DT_INT, _state); ae_vector_init(&t, 0, DT_REAL, _state); ae_assert(n>=1, "CMatrixRCond1: N<1!", _state); ae_vector_set_length(&t, n, _state); for(i=0; i<=n-1; i++) { t.ptr.p_double[i] = (double)(0); } for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { t.ptr.p_double[j] = t.ptr.p_double[j]+ae_c_abs(a->ptr.pp_complex[i][j], _state); } } nrm = (double)(0); for(i=0; i<=n-1; i++) { nrm = ae_maxreal(nrm, t.ptr.p_double[i], _state); } cmatrixlu(a, n, n, &pivots, _state); rcond_cmatrixrcondluinternal(a, n, ae_true, ae_true, nrm, &v, _state); result = v; ae_frame_leave(_state); return result; } /************************************************************************* Estimate of a matrix condition number (infinity-norm). The algorithm calculates a lower bound of the condition number. In this case, the algorithm does not return a lower bound of the condition number, but an inverse number (to avoid an overflow in case of a singular matrix). Input parameters: A - matrix. Array whose indexes range within [0..N-1, 0..N-1]. N - size of matrix A. Result: 1/LowerBound(cond(A)) NOTE: if k(A) is very large, then matrix is assumed degenerate, k(A)=INF, 0.0 is returned in such cases. 
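The 1-norm used by the *RCond1 functions is the maximum absolute column sum, while the infinity-norm used by the *RCondInf functions is the maximum absolute row sum; the loops above accumulate exactly these quantities before factorizing. A standalone sketch of the same computation for a dense real matrix (plain C++, not part of ALGLIB):

    #include <algorithm>
    #include <cmath>
    #include <vector>

    // nrm1: maximum absolute column sum, nrminf: maximum absolute row sum
    void matrix_norms(const std::vector<std::vector<double> >& a,
                      double& nrm1, double& nrminf)
    {
        size_t n = a.size();
        std::vector<double> colsum(n, 0.0);
        nrm1 = 0.0;
        nrminf = 0.0;
        for(size_t i=0; i<n; i++)
        {
            double rowsum = 0.0;
            for(size_t j=0; j<n; j++)
            {
                colsum[j] += std::fabs(a[i][j]);
                rowsum    += std::fabs(a[i][j]);
            }
            nrminf = std::max(nrminf, rowsum);
        }
        for(size_t j=0; j<n; j++)
            nrm1 = std::max(nrm1, colsum[j]);
    }
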
*************************************************************************/ double cmatrixrcondinf(/* Complex */ ae_matrix* a, ae_int_t n, ae_state *_state) { ae_frame _frame_block; ae_matrix _a; ae_int_t i; ae_int_t j; double v; double nrm; ae_vector pivots; double result; ae_frame_make(_state, &_frame_block); ae_matrix_init_copy(&_a, a, _state); a = &_a; ae_vector_init(&pivots, 0, DT_INT, _state); ae_assert(n>=1, "CMatrixRCondInf: N<1!", _state); nrm = (double)(0); for(i=0; i<=n-1; i++) { v = (double)(0); for(j=0; j<=n-1; j++) { v = v+ae_c_abs(a->ptr.pp_complex[i][j], _state); } nrm = ae_maxreal(nrm, v, _state); } cmatrixlu(a, n, n, &pivots, _state); rcond_cmatrixrcondluinternal(a, n, ae_false, ae_true, nrm, &v, _state); result = v; ae_frame_leave(_state); return result; } /************************************************************************* Estimate of the condition number of a matrix given by its LU decomposition (1-norm) The algorithm calculates a lower bound of the condition number. In this case, the algorithm does not return a lower bound of the condition number, but an inverse number (to avoid an overflow in case of a singular matrix). Input parameters: LUA - LU decomposition of a matrix in compact form. Output of the RMatrixLU subroutine. N - size of matrix A. Result: 1/LowerBound(cond(A)) NOTE: if k(A) is very large, then matrix is assumed degenerate, k(A)=INF, 0.0 is returned in such cases. *************************************************************************/ double rmatrixlurcond1(/* Real */ ae_matrix* lua, ae_int_t n, ae_state *_state) { double v; double result; rcond_rmatrixrcondluinternal(lua, n, ae_true, ae_false, (double)(0), &v, _state); result = v; return result; } /************************************************************************* Estimate of the condition number of a matrix given by its LU decomposition (infinity norm). The algorithm calculates a lower bound of the condition number. In this case, the algorithm does not return a lower bound of the condition number, but an inverse number (to avoid an overflow in case of a singular matrix). Input parameters: LUA - LU decomposition of a matrix in compact form. Output of the RMatrixLU subroutine. N - size of matrix A. Result: 1/LowerBound(cond(A)) NOTE: if k(A) is very large, then matrix is assumed degenerate, k(A)=INF, 0.0 is returned in such cases. *************************************************************************/ double rmatrixlurcondinf(/* Real */ ae_matrix* lua, ae_int_t n, ae_state *_state) { double v; double result; rcond_rmatrixrcondluinternal(lua, n, ae_false, ae_false, (double)(0), &v, _state); result = v; return result; } /************************************************************************* Condition number estimate of a symmetric positive definite matrix given by Cholesky decomposition. The algorithm calculates a lower bound of the condition number. In this case, the algorithm does not return a lower bound of the condition number, but an inverse number (to avoid an overflow in case of a singular matrix). It should be noted that 1-norm and inf-norm condition numbers of symmetric matrices are equal, so the algorithm doesn't take into account the differences between these types of norms. Input parameters: CD - Cholesky decomposition of matrix A, output of SMatrixCholesky subroutine. N - size of matrix A. Result: 1/LowerBound(cond(A)) NOTE: if k(A) is very large, then matrix is assumed degenerate, k(A)=INF, 0.0 is returned in such cases. 
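If the Cholesky factor is needed anyway (for example, for a later solve), it is cheaper to factor once and call this function on the factor than to call SPDMatrixRCond, which repeats the factorization internally. A minimal sketch, assuming the C++ wrappers spdmatrixcholesky() and spdmatrixcholeskyrcond() declared in ALGLIB's linalg.h (the wrapper signatures are assumptions of the sketch):

    #include "linalg.h"

    // Returns 1/LowerBound(cond(A)), or -1 if A is not positive definite.
    // On return, a holds the Cholesky factor and can be reused.
    double rcond_from_existing_factor(alglib::real_2d_array &a, int n, bool isupper)
    {
        if( !alglib::spdmatrixcholesky(a, n, isupper) )
            return -1;                                   // factorization failed: not SPD
        return alglib::spdmatrixcholeskyrcond(a, n, isupper);
    }
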
*************************************************************************/ double spdmatrixcholeskyrcond(/* Real */ ae_matrix* a, ae_int_t n, ae_bool isupper, ae_state *_state) { double v; double result; rcond_spdmatrixrcondcholeskyinternal(a, n, isupper, ae_false, (double)(0), &v, _state); result = v; return result; } /************************************************************************* Condition number estimate of a Hermitian positive definite matrix given by Cholesky decomposition. The algorithm calculates a lower bound of the condition number. In this case, the algorithm does not return a lower bound of the condition number, but an inverse number (to avoid an overflow in case of a singular matrix). It should be noted that 1-norm and inf-norm condition numbers of symmetric matrices are equal, so the algorithm doesn't take into account the differences between these types of norms. Input parameters: CD - Cholesky decomposition of matrix A, output of SMatrixCholesky subroutine. N - size of matrix A. Result: 1/LowerBound(cond(A)) NOTE: if k(A) is very large, then matrix is assumed degenerate, k(A)=INF, 0.0 is returned in such cases. *************************************************************************/ double hpdmatrixcholeskyrcond(/* Complex */ ae_matrix* a, ae_int_t n, ae_bool isupper, ae_state *_state) { double v; double result; rcond_hpdmatrixrcondcholeskyinternal(a, n, isupper, ae_false, (double)(0), &v, _state); result = v; return result; } /************************************************************************* Estimate of the condition number of a matrix given by its LU decomposition (1-norm) The algorithm calculates a lower bound of the condition number. In this case, the algorithm does not return a lower bound of the condition number, but an inverse number (to avoid an overflow in case of a singular matrix). Input parameters: LUA - LU decomposition of a matrix in compact form. Output of the CMatrixLU subroutine. N - size of matrix A. Result: 1/LowerBound(cond(A)) NOTE: if k(A) is very large, then matrix is assumed degenerate, k(A)=INF, 0.0 is returned in such cases. *************************************************************************/ double cmatrixlurcond1(/* Complex */ ae_matrix* lua, ae_int_t n, ae_state *_state) { double v; double result; ae_assert(n>=1, "CMatrixLURCond1: N<1!", _state); rcond_cmatrixrcondluinternal(lua, n, ae_true, ae_false, 0.0, &v, _state); result = v; return result; } /************************************************************************* Estimate of the condition number of a matrix given by its LU decomposition (infinity norm). The algorithm calculates a lower bound of the condition number. In this case, the algorithm does not return a lower bound of the condition number, but an inverse number (to avoid an overflow in case of a singular matrix). Input parameters: LUA - LU decomposition of a matrix in compact form. Output of the CMatrixLU subroutine. N - size of matrix A. Result: 1/LowerBound(cond(A)) NOTE: if k(A) is very large, then matrix is assumed degenerate, k(A)=INF, 0.0 is returned in such cases. 
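Returning the reciprocal (rather than the condition number itself) keeps the result representable even for singular matrices; internally every estimate below a fixed threshold is clamped to zero (see rcondthreshold() further below in this unit). A sketch of the same clamping rule in plain C++, with DBL_MIN standing in for ae_minrealnumber (an assumption of the sketch):

    #include <cfloat>
    #include <cmath>

    // Estimates below sqrt(sqrt(min_real)) are treated as zero, so that
    // 1/rc cannot overflow for any estimate that passes the test.
    inline double rcond_threshold()
    {
        return std::sqrt(std::sqrt(DBL_MIN));
    }

    inline bool is_numerically_singular(double rc)
    {
        return rc < rcond_threshold();                   // also covers rc==0.0
    }
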
*************************************************************************/ double cmatrixlurcondinf(/* Complex */ ae_matrix* lua, ae_int_t n, ae_state *_state) { double v; double result; ae_assert(n>=1, "CMatrixLURCondInf: N<1!", _state); rcond_cmatrixrcondluinternal(lua, n, ae_false, ae_false, 0.0, &v, _state); result = v; return result; } /************************************************************************* Triangular matrix: estimate of a condition number (1-norm) The algorithm calculates a lower bound of the condition number. In this case, the algorithm does not return a lower bound of the condition number, but an inverse number (to avoid an overflow in case of a singular matrix). Input parameters: A - matrix. Array[0..N-1, 0..N-1]. N - size of A. IsUpper - True, if the matrix is upper triangular. IsUnit - True, if the matrix has a unit diagonal. Result: 1/LowerBound(cond(A)) NOTE: if k(A) is very large, then matrix is assumed degenerate, k(A)=INF, 0.0 is returned in such cases. *************************************************************************/ double cmatrixtrrcond1(/* Complex */ ae_matrix* a, ae_int_t n, ae_bool isupper, ae_bool isunit, ae_state *_state) { ae_frame _frame_block; ae_int_t i; ae_int_t j; double v; double nrm; ae_vector pivots; ae_vector t; ae_int_t j1; ae_int_t j2; double result; ae_frame_make(_state, &_frame_block); ae_vector_init(&pivots, 0, DT_INT, _state); ae_vector_init(&t, 0, DT_REAL, _state); ae_assert(n>=1, "RMatrixTRRCond1: N<1!", _state); ae_vector_set_length(&t, n, _state); for(i=0; i<=n-1; i++) { t.ptr.p_double[i] = (double)(0); } for(i=0; i<=n-1; i++) { if( isupper ) { j1 = i+1; j2 = n-1; } else { j1 = 0; j2 = i-1; } for(j=j1; j<=j2; j++) { t.ptr.p_double[j] = t.ptr.p_double[j]+ae_c_abs(a->ptr.pp_complex[i][j], _state); } if( isunit ) { t.ptr.p_double[i] = t.ptr.p_double[i]+1; } else { t.ptr.p_double[i] = t.ptr.p_double[i]+ae_c_abs(a->ptr.pp_complex[i][i], _state); } } nrm = (double)(0); for(i=0; i<=n-1; i++) { nrm = ae_maxreal(nrm, t.ptr.p_double[i], _state); } rcond_cmatrixrcondtrinternal(a, n, isupper, isunit, ae_true, nrm, &v, _state); result = v; ae_frame_leave(_state); return result; } /************************************************************************* Triangular matrix: estimate of a matrix condition number (infinity-norm). The algorithm calculates a lower bound of the condition number. In this case, the algorithm does not return a lower bound of the condition number, but an inverse number (to avoid an overflow in case of a singular matrix). Input parameters: A - matrix. Array whose indexes range within [0..N-1, 0..N-1]. N - size of matrix A. IsUpper - True, if the matrix is upper triangular. IsUnit - True, if the matrix has a unit diagonal. Result: 1/LowerBound(cond(A)) NOTE: if k(A) is very large, then matrix is assumed degenerate, k(A)=INF, 0.0 is returned in such cases. 
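For triangular input the norm is accumulated over the referenced triangle only, and a unit diagonal contributes 1 to every row sum. The following standalone helper (plain C++, not part of ALGLIB) mirrors the infinity-norm computation performed by this function before the estimator is invoked:

    #include <algorithm>
    #include <cmath>
    #include <vector>

    // Infinity-norm of a triangular matrix stored in a full N x N array;
    // only the referenced triangle is read, IsUnit adds an implicit unit diagonal.
    double triangular_norm_inf(const std::vector<std::vector<double> >& a,
                               int n, bool isupper, bool isunit)
    {
        double nrm = 0.0;
        for(int i=0; i<n; i++)
        {
            int j1 = isupper ? i+1 : 0;
            int j2 = isupper ? n-1 : i-1;
            double v = 0.0;
            for(int j=j1; j<=j2; j++)
                v += std::fabs(a[i][j]);
            v += isunit ? 1.0 : std::fabs(a[i][i]);
            nrm = std::max(nrm, v);
        }
        return nrm;
    }
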
*************************************************************************/ double cmatrixtrrcondinf(/* Complex */ ae_matrix* a, ae_int_t n, ae_bool isupper, ae_bool isunit, ae_state *_state) { ae_frame _frame_block; ae_int_t i; ae_int_t j; double v; double nrm; ae_vector pivots; ae_int_t j1; ae_int_t j2; double result; ae_frame_make(_state, &_frame_block); ae_vector_init(&pivots, 0, DT_INT, _state); ae_assert(n>=1, "RMatrixTRRCondInf: N<1!", _state); nrm = (double)(0); for(i=0; i<=n-1; i++) { if( isupper ) { j1 = i+1; j2 = n-1; } else { j1 = 0; j2 = i-1; } v = (double)(0); for(j=j1; j<=j2; j++) { v = v+ae_c_abs(a->ptr.pp_complex[i][j], _state); } if( isunit ) { v = v+1; } else { v = v+ae_c_abs(a->ptr.pp_complex[i][i], _state); } nrm = ae_maxreal(nrm, v, _state); } rcond_cmatrixrcondtrinternal(a, n, isupper, isunit, ae_false, nrm, &v, _state); result = v; ae_frame_leave(_state); return result; } /************************************************************************* Threshold for rcond: matrices with condition number beyond this threshold are considered singular. Threshold must be far enough from underflow, at least Sqr(Threshold) must be greater than underflow. *************************************************************************/ double rcondthreshold(ae_state *_state) { double result; result = ae_sqrt(ae_sqrt(ae_minrealnumber, _state), _state); return result; } /************************************************************************* Internal subroutine for condition number estimation -- LAPACK routine (version 3.0) -- Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., Courant Institute, Argonne National Lab, and Rice University February 29, 1992 *************************************************************************/ static void rcond_rmatrixrcondtrinternal(/* Real */ ae_matrix* a, ae_int_t n, ae_bool isupper, ae_bool isunit, ae_bool onenorm, double anorm, double* rc, ae_state *_state) { ae_frame _frame_block; ae_vector ex; ae_vector ev; ae_vector iwork; ae_vector tmp; ae_int_t i; ae_int_t j; ae_int_t kase; ae_int_t kase1; ae_int_t j1; ae_int_t j2; double ainvnm; double maxgrowth; double s; ae_frame_make(_state, &_frame_block); *rc = 0; ae_vector_init(&ex, 0, DT_REAL, _state); ae_vector_init(&ev, 0, DT_REAL, _state); ae_vector_init(&iwork, 0, DT_INT, _state); ae_vector_init(&tmp, 0, DT_REAL, _state); /* * RC=0 if something happens */ *rc = (double)(0); /* * init */ if( onenorm ) { kase1 = 1; } else { kase1 = 2; } ae_vector_set_length(&iwork, n+1, _state); ae_vector_set_length(&tmp, n, _state); /* * prepare parameters for triangular solver */ maxgrowth = 1/rcondthreshold(_state); s = (double)(0); for(i=0; i<=n-1; i++) { if( isupper ) { j1 = i+1; j2 = n-1; } else { j1 = 0; j2 = i-1; } for(j=j1; j<=j2; j++) { s = ae_maxreal(s, ae_fabs(a->ptr.pp_double[i][j], _state), _state); } if( isunit ) { s = ae_maxreal(s, (double)(1), _state); } else { s = ae_maxreal(s, ae_fabs(a->ptr.pp_double[i][i], _state), _state); } } if( ae_fp_eq(s,(double)(0)) ) { s = (double)(1); } s = 1/s; /* * Scale according to S */ anorm = anorm*s; /* * Quick return if possible * We assume that ANORM<>0 after this block */ if( ae_fp_eq(anorm,(double)(0)) ) { ae_frame_leave(_state); return; } if( n==1 ) { *rc = (double)(1); ae_frame_leave(_state); return; } /* * Estimate the norm of inv(A). 
*/ ainvnm = (double)(0); kase = 0; for(;;) { rcond_rmatrixestimatenorm(n, &ev, &ex, &iwork, &ainvnm, &kase, _state); if( kase==0 ) { break; } /* * from 1-based array to 0-based */ for(i=0; i<=n-1; i++) { ex.ptr.p_double[i] = ex.ptr.p_double[i+1]; } /* * multiply by inv(A) or inv(A') */ if( kase==kase1 ) { /* * multiply by inv(A) */ if( !rmatrixscaledtrsafesolve(a, s, n, &ex, isupper, 0, isunit, maxgrowth, _state) ) { ae_frame_leave(_state); return; } } else { /* * multiply by inv(A') */ if( !rmatrixscaledtrsafesolve(a, s, n, &ex, isupper, 1, isunit, maxgrowth, _state) ) { ae_frame_leave(_state); return; } } /* * from 0-based array to 1-based */ for(i=n-1; i>=0; i--) { ex.ptr.p_double[i+1] = ex.ptr.p_double[i]; } } /* * Compute the estimate of the reciprocal condition number. */ if( ae_fp_neq(ainvnm,(double)(0)) ) { *rc = 1/ainvnm; *rc = *rc/anorm; if( ae_fp_less(*rc,rcondthreshold(_state)) ) { *rc = (double)(0); } } ae_frame_leave(_state); } /************************************************************************* Condition number estimation -- LAPACK routine (version 3.0) -- Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., Courant Institute, Argonne National Lab, and Rice University March 31, 1993 *************************************************************************/ static void rcond_cmatrixrcondtrinternal(/* Complex */ ae_matrix* a, ae_int_t n, ae_bool isupper, ae_bool isunit, ae_bool onenorm, double anorm, double* rc, ae_state *_state) { ae_frame _frame_block; ae_vector ex; ae_vector cwork2; ae_vector cwork3; ae_vector cwork4; ae_vector isave; ae_vector rsave; ae_int_t kase; ae_int_t kase1; double ainvnm; ae_int_t i; ae_int_t j; ae_int_t j1; ae_int_t j2; double s; double maxgrowth; ae_frame_make(_state, &_frame_block); *rc = 0; ae_vector_init(&ex, 0, DT_COMPLEX, _state); ae_vector_init(&cwork2, 0, DT_COMPLEX, _state); ae_vector_init(&cwork3, 0, DT_COMPLEX, _state); ae_vector_init(&cwork4, 0, DT_COMPLEX, _state); ae_vector_init(&isave, 0, DT_INT, _state); ae_vector_init(&rsave, 0, DT_REAL, _state); /* * RC=0 if something happens */ *rc = (double)(0); /* * init */ if( n<=0 ) { ae_frame_leave(_state); return; } if( n==0 ) { *rc = (double)(1); ae_frame_leave(_state); return; } ae_vector_set_length(&cwork2, n+1, _state); /* * prepare parameters for triangular solver */ maxgrowth = 1/rcondthreshold(_state); s = (double)(0); for(i=0; i<=n-1; i++) { if( isupper ) { j1 = i+1; j2 = n-1; } else { j1 = 0; j2 = i-1; } for(j=j1; j<=j2; j++) { s = ae_maxreal(s, ae_c_abs(a->ptr.pp_complex[i][j], _state), _state); } if( isunit ) { s = ae_maxreal(s, (double)(1), _state); } else { s = ae_maxreal(s, ae_c_abs(a->ptr.pp_complex[i][i], _state), _state); } } if( ae_fp_eq(s,(double)(0)) ) { s = (double)(1); } s = 1/s; /* * Scale according to S */ anorm = anorm*s; /* * Quick return if possible */ if( ae_fp_eq(anorm,(double)(0)) ) { ae_frame_leave(_state); return; } /* * Estimate the norm of inv(A). 
*/ ainvnm = (double)(0); if( onenorm ) { kase1 = 1; } else { kase1 = 2; } kase = 0; for(;;) { rcond_cmatrixestimatenorm(n, &cwork4, &ex, &ainvnm, &kase, &isave, &rsave, _state); if( kase==0 ) { break; } /* * From 1-based to 0-based */ for(i=0; i<=n-1; i++) { ex.ptr.p_complex[i] = ex.ptr.p_complex[i+1]; } /* * multiply by inv(A) or inv(A') */ if( kase==kase1 ) { /* * multiply by inv(A) */ if( !cmatrixscaledtrsafesolve(a, s, n, &ex, isupper, 0, isunit, maxgrowth, _state) ) { ae_frame_leave(_state); return; } } else { /* * multiply by inv(A') */ if( !cmatrixscaledtrsafesolve(a, s, n, &ex, isupper, 2, isunit, maxgrowth, _state) ) { ae_frame_leave(_state); return; } } /* * from 0-based to 1-based */ for(i=n-1; i>=0; i--) { ex.ptr.p_complex[i+1] = ex.ptr.p_complex[i]; } } /* * Compute the estimate of the reciprocal condition number. */ if( ae_fp_neq(ainvnm,(double)(0)) ) { *rc = 1/ainvnm; *rc = *rc/anorm; if( ae_fp_less(*rc,rcondthreshold(_state)) ) { *rc = (double)(0); } } ae_frame_leave(_state); } /************************************************************************* Internal subroutine for condition number estimation -- LAPACK routine (version 3.0) -- Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., Courant Institute, Argonne National Lab, and Rice University February 29, 1992 *************************************************************************/ static void rcond_spdmatrixrcondcholeskyinternal(/* Real */ ae_matrix* cha, ae_int_t n, ae_bool isupper, ae_bool isnormprovided, double anorm, double* rc, ae_state *_state) { ae_frame _frame_block; ae_int_t i; ae_int_t j; ae_int_t kase; double ainvnm; ae_vector ex; ae_vector ev; ae_vector tmp; ae_vector iwork; double sa; double v; double maxgrowth; ae_frame_make(_state, &_frame_block); *rc = 0; ae_vector_init(&ex, 0, DT_REAL, _state); ae_vector_init(&ev, 0, DT_REAL, _state); ae_vector_init(&tmp, 0, DT_REAL, _state); ae_vector_init(&iwork, 0, DT_INT, _state); ae_assert(n>=1, "Assertion failed", _state); ae_vector_set_length(&tmp, n, _state); /* * RC=0 if something happens */ *rc = (double)(0); /* * prepare parameters for triangular solver */ maxgrowth = 1/rcondthreshold(_state); sa = (double)(0); if( isupper ) { for(i=0; i<=n-1; i++) { for(j=i; j<=n-1; j++) { sa = ae_maxreal(sa, ae_c_abs(ae_complex_from_d(cha->ptr.pp_double[i][j]), _state), _state); } } } else { for(i=0; i<=n-1; i++) { for(j=0; j<=i; j++) { sa = ae_maxreal(sa, ae_c_abs(ae_complex_from_d(cha->ptr.pp_double[i][j]), _state), _state); } } } if( ae_fp_eq(sa,(double)(0)) ) { sa = (double)(1); } sa = 1/sa; /* * Estimate the norm of A. 
*/ if( !isnormprovided ) { kase = 0; anorm = (double)(0); for(;;) { rcond_rmatrixestimatenorm(n, &ev, &ex, &iwork, &anorm, &kase, _state); if( kase==0 ) { break; } if( isupper ) { /* * Multiply by U */ for(i=1; i<=n; i++) { v = ae_v_dotproduct(&cha->ptr.pp_double[i-1][i-1], 1, &ex.ptr.p_double[i], 1, ae_v_len(i-1,n-1)); ex.ptr.p_double[i] = v; } ae_v_muld(&ex.ptr.p_double[1], 1, ae_v_len(1,n), sa); /* * Multiply by U' */ for(i=0; i<=n-1; i++) { tmp.ptr.p_double[i] = (double)(0); } for(i=0; i<=n-1; i++) { v = ex.ptr.p_double[i+1]; ae_v_addd(&tmp.ptr.p_double[i], 1, &cha->ptr.pp_double[i][i], 1, ae_v_len(i,n-1), v); } ae_v_move(&ex.ptr.p_double[1], 1, &tmp.ptr.p_double[0], 1, ae_v_len(1,n)); ae_v_muld(&ex.ptr.p_double[1], 1, ae_v_len(1,n), sa); } else { /* * Multiply by L' */ for(i=0; i<=n-1; i++) { tmp.ptr.p_double[i] = (double)(0); } for(i=0; i<=n-1; i++) { v = ex.ptr.p_double[i+1]; ae_v_addd(&tmp.ptr.p_double[0], 1, &cha->ptr.pp_double[i][0], 1, ae_v_len(0,i), v); } ae_v_move(&ex.ptr.p_double[1], 1, &tmp.ptr.p_double[0], 1, ae_v_len(1,n)); ae_v_muld(&ex.ptr.p_double[1], 1, ae_v_len(1,n), sa); /* * Multiply by L */ for(i=n; i>=1; i--) { v = ae_v_dotproduct(&cha->ptr.pp_double[i-1][0], 1, &ex.ptr.p_double[1], 1, ae_v_len(0,i-1)); ex.ptr.p_double[i] = v; } ae_v_muld(&ex.ptr.p_double[1], 1, ae_v_len(1,n), sa); } } } /* * Quick return if possible */ if( ae_fp_eq(anorm,(double)(0)) ) { ae_frame_leave(_state); return; } if( n==1 ) { *rc = (double)(1); ae_frame_leave(_state); return; } /* * Estimate the 1-norm of inv(A). */ kase = 0; for(;;) { rcond_rmatrixestimatenorm(n, &ev, &ex, &iwork, &ainvnm, &kase, _state); if( kase==0 ) { break; } for(i=0; i<=n-1; i++) { ex.ptr.p_double[i] = ex.ptr.p_double[i+1]; } if( isupper ) { /* * Multiply by inv(U'). */ if( !rmatrixscaledtrsafesolve(cha, sa, n, &ex, isupper, 1, ae_false, maxgrowth, _state) ) { ae_frame_leave(_state); return; } /* * Multiply by inv(U). */ if( !rmatrixscaledtrsafesolve(cha, sa, n, &ex, isupper, 0, ae_false, maxgrowth, _state) ) { ae_frame_leave(_state); return; } } else { /* * Multiply by inv(L). */ if( !rmatrixscaledtrsafesolve(cha, sa, n, &ex, isupper, 0, ae_false, maxgrowth, _state) ) { ae_frame_leave(_state); return; } /* * Multiply by inv(L'). */ if( !rmatrixscaledtrsafesolve(cha, sa, n, &ex, isupper, 1, ae_false, maxgrowth, _state) ) { ae_frame_leave(_state); return; } } for(i=n-1; i>=0; i--) { ex.ptr.p_double[i+1] = ex.ptr.p_double[i]; } } /* * Compute the estimate of the reciprocal condition number. */ if( ae_fp_neq(ainvnm,(double)(0)) ) { v = 1/ainvnm; *rc = v/anorm; if( ae_fp_less(*rc,rcondthreshold(_state)) ) { *rc = (double)(0); } } ae_frame_leave(_state); } /************************************************************************* Internal subroutine for condition number estimation -- LAPACK routine (version 3.0) -- Univ. of Tennessee, Univ. 
of California Berkeley, NAG Ltd., Courant Institute, Argonne National Lab, and Rice University February 29, 1992 *************************************************************************/ static void rcond_hpdmatrixrcondcholeskyinternal(/* Complex */ ae_matrix* cha, ae_int_t n, ae_bool isupper, ae_bool isnormprovided, double anorm, double* rc, ae_state *_state) { ae_frame _frame_block; ae_vector isave; ae_vector rsave; ae_vector ex; ae_vector ev; ae_vector tmp; ae_int_t kase; double ainvnm; ae_complex v; ae_int_t i; ae_int_t j; double sa; double maxgrowth; ae_frame_make(_state, &_frame_block); *rc = 0; ae_vector_init(&isave, 0, DT_INT, _state); ae_vector_init(&rsave, 0, DT_REAL, _state); ae_vector_init(&ex, 0, DT_COMPLEX, _state); ae_vector_init(&ev, 0, DT_COMPLEX, _state); ae_vector_init(&tmp, 0, DT_COMPLEX, _state); ae_assert(n>=1, "Assertion failed", _state); ae_vector_set_length(&tmp, n, _state); /* * RC=0 if something happens */ *rc = (double)(0); /* * prepare parameters for triangular solver */ maxgrowth = 1/rcondthreshold(_state); sa = (double)(0); if( isupper ) { for(i=0; i<=n-1; i++) { for(j=i; j<=n-1; j++) { sa = ae_maxreal(sa, ae_c_abs(cha->ptr.pp_complex[i][j], _state), _state); } } } else { for(i=0; i<=n-1; i++) { for(j=0; j<=i; j++) { sa = ae_maxreal(sa, ae_c_abs(cha->ptr.pp_complex[i][j], _state), _state); } } } if( ae_fp_eq(sa,(double)(0)) ) { sa = (double)(1); } sa = 1/sa; /* * Estimate the norm of A */ if( !isnormprovided ) { anorm = (double)(0); kase = 0; for(;;) { rcond_cmatrixestimatenorm(n, &ev, &ex, &anorm, &kase, &isave, &rsave, _state); if( kase==0 ) { break; } if( isupper ) { /* * Multiply by U */ for(i=1; i<=n; i++) { v = ae_v_cdotproduct(&cha->ptr.pp_complex[i-1][i-1], 1, "N", &ex.ptr.p_complex[i], 1, "N", ae_v_len(i-1,n-1)); ex.ptr.p_complex[i] = v; } ae_v_cmuld(&ex.ptr.p_complex[1], 1, ae_v_len(1,n), sa); /* * Multiply by U' */ for(i=0; i<=n-1; i++) { tmp.ptr.p_complex[i] = ae_complex_from_i(0); } for(i=0; i<=n-1; i++) { v = ex.ptr.p_complex[i+1]; ae_v_caddc(&tmp.ptr.p_complex[i], 1, &cha->ptr.pp_complex[i][i], 1, "Conj", ae_v_len(i,n-1), v); } ae_v_cmove(&ex.ptr.p_complex[1], 1, &tmp.ptr.p_complex[0], 1, "N", ae_v_len(1,n)); ae_v_cmuld(&ex.ptr.p_complex[1], 1, ae_v_len(1,n), sa); } else { /* * Multiply by L' */ for(i=0; i<=n-1; i++) { tmp.ptr.p_complex[i] = ae_complex_from_i(0); } for(i=0; i<=n-1; i++) { v = ex.ptr.p_complex[i+1]; ae_v_caddc(&tmp.ptr.p_complex[0], 1, &cha->ptr.pp_complex[i][0], 1, "Conj", ae_v_len(0,i), v); } ae_v_cmove(&ex.ptr.p_complex[1], 1, &tmp.ptr.p_complex[0], 1, "N", ae_v_len(1,n)); ae_v_cmuld(&ex.ptr.p_complex[1], 1, ae_v_len(1,n), sa); /* * Multiply by L */ for(i=n; i>=1; i--) { v = ae_v_cdotproduct(&cha->ptr.pp_complex[i-1][0], 1, "N", &ex.ptr.p_complex[1], 1, "N", ae_v_len(0,i-1)); ex.ptr.p_complex[i] = v; } ae_v_cmuld(&ex.ptr.p_complex[1], 1, ae_v_len(1,n), sa); } } } /* * Quick return if possible * After this block we assume that ANORM<>0 */ if( ae_fp_eq(anorm,(double)(0)) ) { ae_frame_leave(_state); return; } if( n==1 ) { *rc = (double)(1); ae_frame_leave(_state); return; } /* * Estimate the norm of inv(A). */ ainvnm = (double)(0); kase = 0; for(;;) { rcond_cmatrixestimatenorm(n, &ev, &ex, &ainvnm, &kase, &isave, &rsave, _state); if( kase==0 ) { break; } for(i=0; i<=n-1; i++) { ex.ptr.p_complex[i] = ex.ptr.p_complex[i+1]; } if( isupper ) { /* * Multiply by inv(U'). */ if( !cmatrixscaledtrsafesolve(cha, sa, n, &ex, isupper, 2, ae_false, maxgrowth, _state) ) { ae_frame_leave(_state); return; } /* * Multiply by inv(U). 
*/ if( !cmatrixscaledtrsafesolve(cha, sa, n, &ex, isupper, 0, ae_false, maxgrowth, _state) ) { ae_frame_leave(_state); return; } } else { /* * Multiply by inv(L). */ if( !cmatrixscaledtrsafesolve(cha, sa, n, &ex, isupper, 0, ae_false, maxgrowth, _state) ) { ae_frame_leave(_state); return; } /* * Multiply by inv(L'). */ if( !cmatrixscaledtrsafesolve(cha, sa, n, &ex, isupper, 2, ae_false, maxgrowth, _state) ) { ae_frame_leave(_state); return; } } for(i=n-1; i>=0; i--) { ex.ptr.p_complex[i+1] = ex.ptr.p_complex[i]; } } /* * Compute the estimate of the reciprocal condition number. */ if( ae_fp_neq(ainvnm,(double)(0)) ) { *rc = 1/ainvnm; *rc = *rc/anorm; if( ae_fp_less(*rc,rcondthreshold(_state)) ) { *rc = (double)(0); } } ae_frame_leave(_state); } /************************************************************************* Internal subroutine for condition number estimation -- LAPACK routine (version 3.0) -- Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., Courant Institute, Argonne National Lab, and Rice University February 29, 1992 *************************************************************************/ static void rcond_rmatrixrcondluinternal(/* Real */ ae_matrix* lua, ae_int_t n, ae_bool onenorm, ae_bool isanormprovided, double anorm, double* rc, ae_state *_state) { ae_frame _frame_block; ae_vector ex; ae_vector ev; ae_vector iwork; ae_vector tmp; double v; ae_int_t i; ae_int_t j; ae_int_t kase; ae_int_t kase1; double ainvnm; double maxgrowth; double su; double sl; ae_bool mupper; ae_bool munit; ae_frame_make(_state, &_frame_block); *rc = 0; ae_vector_init(&ex, 0, DT_REAL, _state); ae_vector_init(&ev, 0, DT_REAL, _state); ae_vector_init(&iwork, 0, DT_INT, _state); ae_vector_init(&tmp, 0, DT_REAL, _state); /* * RC=0 if something happens */ *rc = (double)(0); /* * init */ if( onenorm ) { kase1 = 1; } else { kase1 = 2; } mupper = ae_true; munit = ae_true; ae_vector_set_length(&iwork, n+1, _state); ae_vector_set_length(&tmp, n, _state); /* * prepare parameters for triangular solver */ maxgrowth = 1/rcondthreshold(_state); su = (double)(0); sl = (double)(1); for(i=0; i<=n-1; i++) { for(j=0; j<=i-1; j++) { sl = ae_maxreal(sl, ae_fabs(lua->ptr.pp_double[i][j], _state), _state); } for(j=i; j<=n-1; j++) { su = ae_maxreal(su, ae_fabs(lua->ptr.pp_double[i][j], _state), _state); } } if( ae_fp_eq(su,(double)(0)) ) { su = (double)(1); } su = 1/su; sl = 1/sl; /* * Estimate the norm of A. 
*/ if( !isanormprovided ) { kase = 0; anorm = (double)(0); for(;;) { rcond_rmatrixestimatenorm(n, &ev, &ex, &iwork, &anorm, &kase, _state); if( kase==0 ) { break; } if( kase==kase1 ) { /* * Multiply by U */ for(i=1; i<=n; i++) { v = ae_v_dotproduct(&lua->ptr.pp_double[i-1][i-1], 1, &ex.ptr.p_double[i], 1, ae_v_len(i-1,n-1)); ex.ptr.p_double[i] = v; } /* * Multiply by L */ for(i=n; i>=1; i--) { if( i>1 ) { v = ae_v_dotproduct(&lua->ptr.pp_double[i-1][0], 1, &ex.ptr.p_double[1], 1, ae_v_len(0,i-2)); } else { v = (double)(0); } ex.ptr.p_double[i] = ex.ptr.p_double[i]+v; } } else { /* * Multiply by L' */ for(i=0; i<=n-1; i++) { tmp.ptr.p_double[i] = (double)(0); } for(i=0; i<=n-1; i++) { v = ex.ptr.p_double[i+1]; if( i>=1 ) { ae_v_addd(&tmp.ptr.p_double[0], 1, &lua->ptr.pp_double[i][0], 1, ae_v_len(0,i-1), v); } tmp.ptr.p_double[i] = tmp.ptr.p_double[i]+v; } ae_v_move(&ex.ptr.p_double[1], 1, &tmp.ptr.p_double[0], 1, ae_v_len(1,n)); /* * Multiply by U' */ for(i=0; i<=n-1; i++) { tmp.ptr.p_double[i] = (double)(0); } for(i=0; i<=n-1; i++) { v = ex.ptr.p_double[i+1]; ae_v_addd(&tmp.ptr.p_double[i], 1, &lua->ptr.pp_double[i][i], 1, ae_v_len(i,n-1), v); } ae_v_move(&ex.ptr.p_double[1], 1, &tmp.ptr.p_double[0], 1, ae_v_len(1,n)); } } } /* * Scale according to SU/SL */ anorm = anorm*su*sl; /* * Quick return if possible * We assume that ANORM<>0 after this block */ if( ae_fp_eq(anorm,(double)(0)) ) { ae_frame_leave(_state); return; } if( n==1 ) { *rc = (double)(1); ae_frame_leave(_state); return; } /* * Estimate the norm of inv(A). */ ainvnm = (double)(0); kase = 0; for(;;) { rcond_rmatrixestimatenorm(n, &ev, &ex, &iwork, &ainvnm, &kase, _state); if( kase==0 ) { break; } /* * from 1-based array to 0-based */ for(i=0; i<=n-1; i++) { ex.ptr.p_double[i] = ex.ptr.p_double[i+1]; } /* * multiply by inv(A) or inv(A') */ if( kase==kase1 ) { /* * Multiply by inv(L). */ if( !rmatrixscaledtrsafesolve(lua, sl, n, &ex, !mupper, 0, munit, maxgrowth, _state) ) { ae_frame_leave(_state); return; } /* * Multiply by inv(U). */ if( !rmatrixscaledtrsafesolve(lua, su, n, &ex, mupper, 0, !munit, maxgrowth, _state) ) { ae_frame_leave(_state); return; } } else { /* * Multiply by inv(U'). */ if( !rmatrixscaledtrsafesolve(lua, su, n, &ex, mupper, 1, !munit, maxgrowth, _state) ) { ae_frame_leave(_state); return; } /* * Multiply by inv(L'). */ if( !rmatrixscaledtrsafesolve(lua, sl, n, &ex, !mupper, 1, munit, maxgrowth, _state) ) { ae_frame_leave(_state); return; } } /* * from 0-based array to 1-based */ for(i=n-1; i>=0; i--) { ex.ptr.p_double[i+1] = ex.ptr.p_double[i]; } } /* * Compute the estimate of the reciprocal condition number. */ if( ae_fp_neq(ainvnm,(double)(0)) ) { *rc = 1/ainvnm; *rc = *rc/anorm; if( ae_fp_less(*rc,rcondthreshold(_state)) ) { *rc = (double)(0); } } ae_frame_leave(_state); } /************************************************************************* Condition number estimation -- LAPACK routine (version 3.0) -- Univ. of Tennessee, Univ. 
of California Berkeley, NAG Ltd., Courant Institute, Argonne National Lab, and Rice University March 31, 1993 *************************************************************************/ static void rcond_cmatrixrcondluinternal(/* Complex */ ae_matrix* lua, ae_int_t n, ae_bool onenorm, ae_bool isanormprovided, double anorm, double* rc, ae_state *_state) { ae_frame _frame_block; ae_vector ex; ae_vector cwork2; ae_vector cwork3; ae_vector cwork4; ae_vector isave; ae_vector rsave; ae_int_t kase; ae_int_t kase1; double ainvnm; ae_complex v; ae_int_t i; ae_int_t j; double su; double sl; double maxgrowth; ae_frame_make(_state, &_frame_block); *rc = 0; ae_vector_init(&ex, 0, DT_COMPLEX, _state); ae_vector_init(&cwork2, 0, DT_COMPLEX, _state); ae_vector_init(&cwork3, 0, DT_COMPLEX, _state); ae_vector_init(&cwork4, 0, DT_COMPLEX, _state); ae_vector_init(&isave, 0, DT_INT, _state); ae_vector_init(&rsave, 0, DT_REAL, _state); if( n<=0 ) { ae_frame_leave(_state); return; } ae_vector_set_length(&cwork2, n+1, _state); *rc = (double)(0); if( n==0 ) { *rc = (double)(1); ae_frame_leave(_state); return; } /* * prepare parameters for triangular solver */ maxgrowth = 1/rcondthreshold(_state); su = (double)(0); sl = (double)(1); for(i=0; i<=n-1; i++) { for(j=0; j<=i-1; j++) { sl = ae_maxreal(sl, ae_c_abs(lua->ptr.pp_complex[i][j], _state), _state); } for(j=i; j<=n-1; j++) { su = ae_maxreal(su, ae_c_abs(lua->ptr.pp_complex[i][j], _state), _state); } } if( ae_fp_eq(su,(double)(0)) ) { su = (double)(1); } su = 1/su; sl = 1/sl; /* * Estimate the norm of SU*SL*A. */ if( !isanormprovided ) { anorm = (double)(0); if( onenorm ) { kase1 = 1; } else { kase1 = 2; } kase = 0; do { rcond_cmatrixestimatenorm(n, &cwork4, &ex, &anorm, &kase, &isave, &rsave, _state); if( kase!=0 ) { if( kase==kase1 ) { /* * Multiply by U */ for(i=1; i<=n; i++) { v = ae_v_cdotproduct(&lua->ptr.pp_complex[i-1][i-1], 1, "N", &ex.ptr.p_complex[i], 1, "N", ae_v_len(i-1,n-1)); ex.ptr.p_complex[i] = v; } /* * Multiply by L */ for(i=n; i>=1; i--) { v = ae_complex_from_i(0); if( i>1 ) { v = ae_v_cdotproduct(&lua->ptr.pp_complex[i-1][0], 1, "N", &ex.ptr.p_complex[1], 1, "N", ae_v_len(0,i-2)); } ex.ptr.p_complex[i] = ae_c_add(v,ex.ptr.p_complex[i]); } } else { /* * Multiply by L' */ for(i=1; i<=n; i++) { cwork2.ptr.p_complex[i] = ae_complex_from_i(0); } for(i=1; i<=n; i++) { v = ex.ptr.p_complex[i]; if( i>1 ) { ae_v_caddc(&cwork2.ptr.p_complex[1], 1, &lua->ptr.pp_complex[i-1][0], 1, "Conj", ae_v_len(1,i-1), v); } cwork2.ptr.p_complex[i] = ae_c_add(cwork2.ptr.p_complex[i],v); } /* * Multiply by U' */ for(i=1; i<=n; i++) { ex.ptr.p_complex[i] = ae_complex_from_i(0); } for(i=1; i<=n; i++) { v = cwork2.ptr.p_complex[i]; ae_v_caddc(&ex.ptr.p_complex[i], 1, &lua->ptr.pp_complex[i-1][i-1], 1, "Conj", ae_v_len(i,n), v); } } } } while(kase!=0); } /* * Scale according to SU/SL */ anorm = anorm*su*sl; /* * Quick return if possible */ if( ae_fp_eq(anorm,(double)(0)) ) { ae_frame_leave(_state); return; } /* * Estimate the norm of inv(A). */ ainvnm = (double)(0); if( onenorm ) { kase1 = 1; } else { kase1 = 2; } kase = 0; for(;;) { rcond_cmatrixestimatenorm(n, &cwork4, &ex, &ainvnm, &kase, &isave, &rsave, _state); if( kase==0 ) { break; } /* * From 1-based to 0-based */ for(i=0; i<=n-1; i++) { ex.ptr.p_complex[i] = ex.ptr.p_complex[i+1]; } /* * multiply by inv(A) or inv(A') */ if( kase==kase1 ) { /* * Multiply by inv(L). 
*/ if( !cmatrixscaledtrsafesolve(lua, sl, n, &ex, ae_false, 0, ae_true, maxgrowth, _state) ) { *rc = (double)(0); ae_frame_leave(_state); return; } /* * Multiply by inv(U). */ if( !cmatrixscaledtrsafesolve(lua, su, n, &ex, ae_true, 0, ae_false, maxgrowth, _state) ) { *rc = (double)(0); ae_frame_leave(_state); return; } } else { /* * Multiply by inv(U'). */ if( !cmatrixscaledtrsafesolve(lua, su, n, &ex, ae_true, 2, ae_false, maxgrowth, _state) ) { *rc = (double)(0); ae_frame_leave(_state); return; } /* * Multiply by inv(L'). */ if( !cmatrixscaledtrsafesolve(lua, sl, n, &ex, ae_false, 2, ae_true, maxgrowth, _state) ) { *rc = (double)(0); ae_frame_leave(_state); return; } } /* * from 0-based to 1-based */ for(i=n-1; i>=0; i--) { ex.ptr.p_complex[i+1] = ex.ptr.p_complex[i]; } } /* * Compute the estimate of the reciprocal condition number. */ if( ae_fp_neq(ainvnm,(double)(0)) ) { *rc = 1/ainvnm; *rc = *rc/anorm; if( ae_fp_less(*rc,rcondthreshold(_state)) ) { *rc = (double)(0); } } ae_frame_leave(_state); } /************************************************************************* Internal subroutine for matrix norm estimation -- LAPACK auxiliary routine (version 3.0) -- Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., Courant Institute, Argonne National Lab, and Rice University February 29, 1992 *************************************************************************/ static void rcond_rmatrixestimatenorm(ae_int_t n, /* Real */ ae_vector* v, /* Real */ ae_vector* x, /* Integer */ ae_vector* isgn, double* est, ae_int_t* kase, ae_state *_state) { ae_int_t itmax; ae_int_t i; double t; ae_bool flg; ae_int_t positer; ae_int_t posj; ae_int_t posjlast; ae_int_t posjump; ae_int_t posaltsgn; ae_int_t posestold; ae_int_t postemp; itmax = 5; posaltsgn = n+1; posestold = n+2; postemp = n+3; positer = n+1; posj = n+2; posjlast = n+3; posjump = n+4; if( *kase==0 ) { ae_vector_set_length(v, n+4, _state); ae_vector_set_length(x, n+1, _state); ae_vector_set_length(isgn, n+5, _state); t = (double)1/(double)n; for(i=1; i<=n; i++) { x->ptr.p_double[i] = t; } *kase = 1; isgn->ptr.p_int[posjump] = 1; return; } /* * ................ ENTRY (JUMP = 1) * FIRST ITERATION. X HAS BEEN OVERWRITTEN BY A*X. */ if( isgn->ptr.p_int[posjump]==1 ) { if( n==1 ) { v->ptr.p_double[1] = x->ptr.p_double[1]; *est = ae_fabs(v->ptr.p_double[1], _state); *kase = 0; return; } *est = (double)(0); for(i=1; i<=n; i++) { *est = *est+ae_fabs(x->ptr.p_double[i], _state); } for(i=1; i<=n; i++) { if( ae_fp_greater_eq(x->ptr.p_double[i],(double)(0)) ) { x->ptr.p_double[i] = (double)(1); } else { x->ptr.p_double[i] = (double)(-1); } isgn->ptr.p_int[i] = ae_sign(x->ptr.p_double[i], _state); } *kase = 2; isgn->ptr.p_int[posjump] = 2; return; } /* * ................ ENTRY (JUMP = 2) * FIRST ITERATION. X HAS BEEN OVERWRITTEN BY TRANDPOSE(A)*X. */ if( isgn->ptr.p_int[posjump]==2 ) { isgn->ptr.p_int[posj] = 1; for(i=2; i<=n; i++) { if( ae_fp_greater(ae_fabs(x->ptr.p_double[i], _state),ae_fabs(x->ptr.p_double[isgn->ptr.p_int[posj]], _state)) ) { isgn->ptr.p_int[posj] = i; } } isgn->ptr.p_int[positer] = 2; /* * MAIN LOOP - ITERATIONS 2,3,...,ITMAX. */ for(i=1; i<=n; i++) { x->ptr.p_double[i] = (double)(0); } x->ptr.p_double[isgn->ptr.p_int[posj]] = (double)(1); *kase = 1; isgn->ptr.p_int[posjump] = 3; return; } /* * ................ ENTRY (JUMP = 3) * X HAS BEEN OVERWRITTEN BY A*X. 
*/ if( isgn->ptr.p_int[posjump]==3 ) { ae_v_move(&v->ptr.p_double[1], 1, &x->ptr.p_double[1], 1, ae_v_len(1,n)); v->ptr.p_double[posestold] = *est; *est = (double)(0); for(i=1; i<=n; i++) { *est = *est+ae_fabs(v->ptr.p_double[i], _state); } flg = ae_false; for(i=1; i<=n; i++) { if( (ae_fp_greater_eq(x->ptr.p_double[i],(double)(0))&&isgn->ptr.p_int[i]<0)||(ae_fp_less(x->ptr.p_double[i],(double)(0))&&isgn->ptr.p_int[i]>=0) ) { flg = ae_true; } } /* * REPEATED SIGN VECTOR DETECTED, HENCE ALGORITHM HAS CONVERGED. * OR MAY BE CYCLING. */ if( !flg||ae_fp_less_eq(*est,v->ptr.p_double[posestold]) ) { v->ptr.p_double[posaltsgn] = (double)(1); for(i=1; i<=n; i++) { x->ptr.p_double[i] = v->ptr.p_double[posaltsgn]*(1+(double)(i-1)/(double)(n-1)); v->ptr.p_double[posaltsgn] = -v->ptr.p_double[posaltsgn]; } *kase = 1; isgn->ptr.p_int[posjump] = 5; return; } for(i=1; i<=n; i++) { if( ae_fp_greater_eq(x->ptr.p_double[i],(double)(0)) ) { x->ptr.p_double[i] = (double)(1); isgn->ptr.p_int[i] = 1; } else { x->ptr.p_double[i] = (double)(-1); isgn->ptr.p_int[i] = -1; } } *kase = 2; isgn->ptr.p_int[posjump] = 4; return; } /* * ................ ENTRY (JUMP = 4) * X HAS BEEN OVERWRITTEN BY TRANDPOSE(A)*X. */ if( isgn->ptr.p_int[posjump]==4 ) { isgn->ptr.p_int[posjlast] = isgn->ptr.p_int[posj]; isgn->ptr.p_int[posj] = 1; for(i=2; i<=n; i++) { if( ae_fp_greater(ae_fabs(x->ptr.p_double[i], _state),ae_fabs(x->ptr.p_double[isgn->ptr.p_int[posj]], _state)) ) { isgn->ptr.p_int[posj] = i; } } if( ae_fp_neq(x->ptr.p_double[isgn->ptr.p_int[posjlast]],ae_fabs(x->ptr.p_double[isgn->ptr.p_int[posj]], _state))&&isgn->ptr.p_int[positer]ptr.p_int[positer] = isgn->ptr.p_int[positer]+1; for(i=1; i<=n; i++) { x->ptr.p_double[i] = (double)(0); } x->ptr.p_double[isgn->ptr.p_int[posj]] = (double)(1); *kase = 1; isgn->ptr.p_int[posjump] = 3; return; } /* * ITERATION COMPLETE. FINAL STAGE. */ v->ptr.p_double[posaltsgn] = (double)(1); for(i=1; i<=n; i++) { x->ptr.p_double[i] = v->ptr.p_double[posaltsgn]*(1+(double)(i-1)/(double)(n-1)); v->ptr.p_double[posaltsgn] = -v->ptr.p_double[posaltsgn]; } *kase = 1; isgn->ptr.p_int[posjump] = 5; return; } /* * ................ ENTRY (JUMP = 5) * X HAS BEEN OVERWRITTEN BY A*X. */ if( isgn->ptr.p_int[posjump]==5 ) { v->ptr.p_double[postemp] = (double)(0); for(i=1; i<=n; i++) { v->ptr.p_double[postemp] = v->ptr.p_double[postemp]+ae_fabs(x->ptr.p_double[i], _state); } v->ptr.p_double[postemp] = 2*v->ptr.p_double[postemp]/(3*n); if( ae_fp_greater(v->ptr.p_double[postemp],*est) ) { ae_v_move(&v->ptr.p_double[1], 1, &x->ptr.p_double[1], 1, ae_v_len(1,n)); *est = v->ptr.p_double[postemp]; } *kase = 0; return; } } static void rcond_cmatrixestimatenorm(ae_int_t n, /* Complex */ ae_vector* v, /* Complex */ ae_vector* x, double* est, ae_int_t* kase, /* Integer */ ae_vector* isave, /* Real */ ae_vector* rsave, ae_state *_state) { ae_int_t itmax; ae_int_t i; ae_int_t iter; ae_int_t j; ae_int_t jlast; ae_int_t jump; double absxi; double altsgn; double estold; double safmin; double temp; /* *Executable Statements .. 
*/ itmax = 5; safmin = ae_minrealnumber; if( *kase==0 ) { ae_vector_set_length(v, n+1, _state); ae_vector_set_length(x, n+1, _state); ae_vector_set_length(isave, 5, _state); ae_vector_set_length(rsave, 4, _state); for(i=1; i<=n; i++) { x->ptr.p_complex[i] = ae_complex_from_d((double)1/(double)n); } *kase = 1; jump = 1; rcond_internalcomplexrcondsaveall(isave, rsave, &i, &iter, &j, &jlast, &jump, &absxi, &altsgn, &estold, &temp, _state); return; } rcond_internalcomplexrcondloadall(isave, rsave, &i, &iter, &j, &jlast, &jump, &absxi, &altsgn, &estold, &temp, _state); /* * ENTRY (JUMP = 1) * FIRST ITERATION. X HAS BEEN OVERWRITTEN BY A*X. */ if( jump==1 ) { if( n==1 ) { v->ptr.p_complex[1] = x->ptr.p_complex[1]; *est = ae_c_abs(v->ptr.p_complex[1], _state); *kase = 0; rcond_internalcomplexrcondsaveall(isave, rsave, &i, &iter, &j, &jlast, &jump, &absxi, &altsgn, &estold, &temp, _state); return; } *est = rcond_internalcomplexrcondscsum1(x, n, _state); for(i=1; i<=n; i++) { absxi = ae_c_abs(x->ptr.p_complex[i], _state); if( ae_fp_greater(absxi,safmin) ) { x->ptr.p_complex[i] = ae_c_div_d(x->ptr.p_complex[i],absxi); } else { x->ptr.p_complex[i] = ae_complex_from_i(1); } } *kase = 2; jump = 2; rcond_internalcomplexrcondsaveall(isave, rsave, &i, &iter, &j, &jlast, &jump, &absxi, &altsgn, &estold, &temp, _state); return; } /* * ENTRY (JUMP = 2) * FIRST ITERATION. X HAS BEEN OVERWRITTEN BY CTRANS(A)*X. */ if( jump==2 ) { j = rcond_internalcomplexrcondicmax1(x, n, _state); iter = 2; /* * MAIN LOOP - ITERATIONS 2,3,...,ITMAX. */ for(i=1; i<=n; i++) { x->ptr.p_complex[i] = ae_complex_from_i(0); } x->ptr.p_complex[j] = ae_complex_from_i(1); *kase = 1; jump = 3; rcond_internalcomplexrcondsaveall(isave, rsave, &i, &iter, &j, &jlast, &jump, &absxi, &altsgn, &estold, &temp, _state); return; } /* * ENTRY (JUMP = 3) * X HAS BEEN OVERWRITTEN BY A*X. */ if( jump==3 ) { ae_v_cmove(&v->ptr.p_complex[1], 1, &x->ptr.p_complex[1], 1, "N", ae_v_len(1,n)); estold = *est; *est = rcond_internalcomplexrcondscsum1(v, n, _state); /* * TEST FOR CYCLING. */ if( ae_fp_less_eq(*est,estold) ) { /* * ITERATION COMPLETE. FINAL STAGE. */ altsgn = (double)(1); for(i=1; i<=n; i++) { x->ptr.p_complex[i] = ae_complex_from_d(altsgn*(1+(double)(i-1)/(double)(n-1))); altsgn = -altsgn; } *kase = 1; jump = 5; rcond_internalcomplexrcondsaveall(isave, rsave, &i, &iter, &j, &jlast, &jump, &absxi, &altsgn, &estold, &temp, _state); return; } for(i=1; i<=n; i++) { absxi = ae_c_abs(x->ptr.p_complex[i], _state); if( ae_fp_greater(absxi,safmin) ) { x->ptr.p_complex[i] = ae_c_div_d(x->ptr.p_complex[i],absxi); } else { x->ptr.p_complex[i] = ae_complex_from_i(1); } } *kase = 2; jump = 4; rcond_internalcomplexrcondsaveall(isave, rsave, &i, &iter, &j, &jlast, &jump, &absxi, &altsgn, &estold, &temp, _state); return; } /* * ENTRY (JUMP = 4) * X HAS BEEN OVERWRITTEN BY CTRANS(A)*X. */ if( jump==4 ) { jlast = j; j = rcond_internalcomplexrcondicmax1(x, n, _state); if( ae_fp_neq(ae_c_abs(x->ptr.p_complex[jlast], _state),ae_c_abs(x->ptr.p_complex[j], _state))&&iterptr.p_complex[i] = ae_complex_from_i(0); } x->ptr.p_complex[j] = ae_complex_from_i(1); *kase = 1; jump = 3; rcond_internalcomplexrcondsaveall(isave, rsave, &i, &iter, &j, &jlast, &jump, &absxi, &altsgn, &estold, &temp, _state); return; } /* * ITERATION COMPLETE. FINAL STAGE. 
*/ altsgn = (double)(1); for(i=1; i<=n; i++) { x->ptr.p_complex[i] = ae_complex_from_d(altsgn*(1+(double)(i-1)/(double)(n-1))); altsgn = -altsgn; } *kase = 1; jump = 5; rcond_internalcomplexrcondsaveall(isave, rsave, &i, &iter, &j, &jlast, &jump, &absxi, &altsgn, &estold, &temp, _state); return; } /* * ENTRY (JUMP = 5) * X HAS BEEN OVERWRITTEN BY A*X. */ if( jump==5 ) { temp = 2*(rcond_internalcomplexrcondscsum1(x, n, _state)/(3*n)); if( ae_fp_greater(temp,*est) ) { ae_v_cmove(&v->ptr.p_complex[1], 1, &x->ptr.p_complex[1], 1, "N", ae_v_len(1,n)); *est = temp; } *kase = 0; rcond_internalcomplexrcondsaveall(isave, rsave, &i, &iter, &j, &jlast, &jump, &absxi, &altsgn, &estold, &temp, _state); return; } } static double rcond_internalcomplexrcondscsum1(/* Complex */ ae_vector* x, ae_int_t n, ae_state *_state) { ae_int_t i; double result; result = (double)(0); for(i=1; i<=n; i++) { result = result+ae_c_abs(x->ptr.p_complex[i], _state); } return result; } static ae_int_t rcond_internalcomplexrcondicmax1(/* Complex */ ae_vector* x, ae_int_t n, ae_state *_state) { ae_int_t i; double m; ae_int_t result; result = 1; m = ae_c_abs(x->ptr.p_complex[1], _state); for(i=2; i<=n; i++) { if( ae_fp_greater(ae_c_abs(x->ptr.p_complex[i], _state),m) ) { result = i; m = ae_c_abs(x->ptr.p_complex[i], _state); } } return result; } static void rcond_internalcomplexrcondsaveall(/* Integer */ ae_vector* isave, /* Real */ ae_vector* rsave, ae_int_t* i, ae_int_t* iter, ae_int_t* j, ae_int_t* jlast, ae_int_t* jump, double* absxi, double* altsgn, double* estold, double* temp, ae_state *_state) { isave->ptr.p_int[0] = *i; isave->ptr.p_int[1] = *iter; isave->ptr.p_int[2] = *j; isave->ptr.p_int[3] = *jlast; isave->ptr.p_int[4] = *jump; rsave->ptr.p_double[0] = *absxi; rsave->ptr.p_double[1] = *altsgn; rsave->ptr.p_double[2] = *estold; rsave->ptr.p_double[3] = *temp; } static void rcond_internalcomplexrcondloadall(/* Integer */ ae_vector* isave, /* Real */ ae_vector* rsave, ae_int_t* i, ae_int_t* iter, ae_int_t* j, ae_int_t* jlast, ae_int_t* jump, double* absxi, double* altsgn, double* estold, double* temp, ae_state *_state) { *i = isave->ptr.p_int[0]; *iter = isave->ptr.p_int[1]; *j = isave->ptr.p_int[2]; *jlast = isave->ptr.p_int[3]; *jump = isave->ptr.p_int[4]; *absxi = rsave->ptr.p_double[0]; *altsgn = rsave->ptr.p_double[1]; *estold = rsave->ptr.p_double[2]; *temp = rsave->ptr.p_double[3]; } /************************************************************************* Inversion of a matrix given by its LU decomposition. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that matrix inversion is harder to ! 
parallelize than, say, matrix-matrix product - this algorithm has ! many internal synchronization points which can not be avoided. However ! parallelism starts to be profitable starting from N=1024, achieving ! near-linear speedup for N=4096 or higher. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS: A - LU decomposition of the matrix (output of RMatrixLU subroutine). Pivots - table of permutations (the output of RMatrixLU subroutine). N - size of matrix A (optional) : * if given, only principal NxN submatrix is processed and overwritten. other elements are unchanged. * if not given, size is automatically determined from matrix size (A must be square matrix) OUTPUT PARAMETERS: Info - return code: * -3 A is singular, or VERY close to singular. it is filled by zeros in such cases. * 1 task is solved (but matrix A may be ill-conditioned, check R1/RInf parameters for condition numbers). Rep - solver report, see below for more info A - inverse of matrix A. Array whose indexes range within [0..N-1, 0..N-1]. SOLVER REPORT Subroutine sets following fields of the Rep structure: * R1 reciprocal of condition number: 1/cond(A), 1-norm. * RInf reciprocal of condition number: 1/cond(A), inf-norm. -- ALGLIB routine -- 05.02.2010 Bochkanov Sergey *************************************************************************/ void rmatrixluinverse(/* Real */ ae_matrix* a, /* Integer */ ae_vector* pivots, ae_int_t n, ae_int_t* info, matinvreport* rep, ae_state *_state) { ae_frame _frame_block; ae_vector work; ae_int_t i; ae_int_t j; ae_int_t k; double v; sinteger sinfo; ae_frame_make(_state, &_frame_block); *info = 0; _matinvreport_clear(rep); ae_vector_init(&work, 0, DT_REAL, _state); _sinteger_init(&sinfo, _state); ae_assert(n>0, "RMatrixLUInverse: N<=0!", _state); ae_assert(a->cols>=n, "RMatrixLUInverse: cols(A)<N!", _state); ae_assert(a->rows>=n, "RMatrixLUInverse: rows(A)<N!", _state); ae_assert(pivots->cnt>=n, "RMatrixLUInverse: len(Pivots)<N!", _state); ae_assert(apservisfinitematrix(a, n, n, _state), "RMatrixLUInverse: A contains infinite or NaN values!", _state); *info = 1; for(i=0; i<=n-1; i++) { if( pivots->ptr.p_int[i]>n-1||pivots->ptr.p_int[i]<i ) { *info = -1; } } ae_assert(*info>0, "RMatrixLUInverse: incorrect Pivots array!", _state); /* * calculate condition numbers */ rep->r1 = rmatrixlurcond1(a, n, _state); rep->rinf = rmatrixlurcondinf(a, n, _state); if( ae_fp_less(rep->r1,rcondthreshold(_state))||ae_fp_less(rep->rinf,rcondthreshold(_state)) ) { for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { a->ptr.pp_double[i][j] = (double)(0); } } rep->r1 = (double)(0); rep->rinf = (double)(0); *info = -3; ae_frame_leave(_state); return; } /* * Call cache-oblivious code */ ae_vector_set_length(&work, n, _state); sinfo.val = 1; matinv_rmatrixluinverserec(a, 0, n, &work, &sinfo, rep, _state); *info = sinfo.val; /* * apply permutations */ for(i=0; i<=n-1; i++) { for(j=n-2; j>=0; j--) { k = pivots->ptr.p_int[j]; v = a->ptr.pp_double[i][j]; a->ptr.pp_double[i][j] = a->ptr.pp_double[i][k]; a->ptr.pp_double[i][k] = v; } } ae_frame_leave(_state); } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code.
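A typical serial workflow for RMatrixLUInverse, documented above, is to factor with RMatrixLU and then invert in place, inspecting Info and the condition-number report. A minimal sketch, assuming the C++ wrappers rmatrixlu() and rmatrixluinverse() from linalg.h and the friendly overload that infers N from the array (these signatures are assumptions of the sketch):

    #include <cstdio>
    #include "linalg.h"

    void lu_then_invert()
    {
        alglib::real_2d_array a = "[[2,1],[1,3]]";
        alglib::integer_1d_array pivots;
        alglib::ae_int_t info;
        alglib::matinvreport rep;
        alglib::rmatrixlu(a, 2, 2, pivots);               // A = P*L*U, stored compactly in a
        alglib::rmatrixluinverse(a, pivots, info, rep);   // a is overwritten by inv(A)
        if( info!=1 )
        {
            std::printf("inversion failed, info=%d\n", (int)info);
            return;
        }
        std::printf("1/cond_1(A)=%.3e, 1/cond_inf(A)=%.3e\n", rep.r1, rep.rinf);
    }
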
*************************************************************************/ void _pexec_rmatrixluinverse(/* Real */ ae_matrix* a, /* Integer */ ae_vector* pivots, ae_int_t n, ae_int_t* info, matinvreport* rep, ae_state *_state) { rmatrixluinverse(a,pivots,n,info,rep, _state); } /************************************************************************* Inversion of a general matrix. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that matrix inversion is harder to ! parallelize than, say, matrix-matrix product - this algorithm has ! many internal synchronization points which can not be avoided. However ! parallelism starts to be profitable starting from N=1024, achieving ! near-linear speedup for N=4096 or higher. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. Input parameters: A - matrix. N - size of matrix A (optional) : * if given, only principal NxN submatrix is processed and overwritten. other elements are unchanged. * if not given, size is automatically determined from matrix size (A must be square matrix) Output parameters: Info - return code, same as in RMatrixLUInverse Rep - solver report, same as in RMatrixLUInverse A - inverse of matrix A, same as in RMatrixLUInverse Result: True, if the matrix is not singular. False, if the matrix is singular. 
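Usage of the function documented above from the C++ interface might look as follows; this is a minimal sketch assuming the friendly overload rmatrixinverse(a, info, rep) and the matinvreport fields r1/rinf declared in linalg.h (these are assumptions of the sketch):

    #include <cstdio>
    #include "linalg.h"

    void invert_and_check()
    {
        alglib::real_2d_array a = "[[1,2],[3,4]]";
        alglib::ae_int_t info;
        alglib::matinvreport rep;
        alglib::rmatrixinverse(a, info, rep);             // N is inferred from the array
        if( info!=1 )
        {
            std::printf("matrix is singular or nearly singular\n");
            return;
        }
        std::printf("1/cond_1(A) = %.3e\n", rep.r1);      // reciprocal condition numbers
        std::printf("inv(A)[0][0] = %.6f\n", a[0][0]);
    }
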
-- ALGLIB -- Copyright 2005-2010 by Bochkanov Sergey *************************************************************************/ void rmatrixinverse(/* Real */ ae_matrix* a, ae_int_t n, ae_int_t* info, matinvreport* rep, ae_state *_state) { ae_frame _frame_block; ae_vector pivots; ae_frame_make(_state, &_frame_block); *info = 0; _matinvreport_clear(rep); ae_vector_init(&pivots, 0, DT_INT, _state); ae_assert(n>0, "RMatrixInverse: N<=0!", _state); ae_assert(a->cols>=n, "RMatrixInverse: cols(A)rows>=n, "RMatrixInverse: rows(A)0, "CMatrixLUInverse: N<=0!", _state); ae_assert(a->cols>=n, "CMatrixLUInverse: cols(A)rows>=n, "CMatrixLUInverse: rows(A)cnt>=n, "CMatrixLUInverse: len(Pivots)ptr.p_int[i]>n-1||pivots->ptr.p_int[i]0, "CMatrixLUInverse: incorrect Pivots array!", _state); /* * calculate condition numbers */ rep->r1 = cmatrixlurcond1(a, n, _state); rep->rinf = cmatrixlurcondinf(a, n, _state); if( ae_fp_less(rep->r1,rcondthreshold(_state))||ae_fp_less(rep->rinf,rcondthreshold(_state)) ) { for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { a->ptr.pp_complex[i][j] = ae_complex_from_i(0); } } rep->r1 = (double)(0); rep->rinf = (double)(0); *info = -3; ae_frame_leave(_state); return; } /* * Call cache-oblivious code */ ae_vector_set_length(&work, n, _state); matinv_cmatrixluinverserec(a, 0, n, &work, info, rep, _state); /* * apply permutations */ for(i=0; i<=n-1; i++) { for(j=n-2; j>=0; j--) { k = pivots->ptr.p_int[j]; v = a->ptr.pp_complex[i][j]; a->ptr.pp_complex[i][j] = a->ptr.pp_complex[i][k]; a->ptr.pp_complex[i][k] = v; } } ae_frame_leave(_state); } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. *************************************************************************/ void _pexec_cmatrixluinverse(/* Complex */ ae_matrix* a, /* Integer */ ae_vector* pivots, ae_int_t n, ae_int_t* info, matinvreport* rep, ae_state *_state) { cmatrixluinverse(a,pivots,n,info,rep, _state); } /************************************************************************* Inversion of a general matrix. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that matrix inversion is harder to ! parallelize than, say, matrix-matrix product - this algorithm has ! many internal synchronization points which can not be avoided. However ! parallelism starts to be profitable starting from N=1024, achieving ! near-linear speedup for N=4096 or higher. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! 
multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. Input parameters: A - matrix N - size of matrix A (optional) : * if given, only principal NxN submatrix is processed and overwritten. other elements are unchanged. * if not given, size is automatically determined from matrix size (A must be square matrix) Output parameters: Info - return code, same as in RMatrixLUInverse Rep - solver report, same as in RMatrixLUInverse A - inverse of matrix A, same as in RMatrixLUInverse -- ALGLIB -- Copyright 2005 by Bochkanov Sergey *************************************************************************/ void cmatrixinverse(/* Complex */ ae_matrix* a, ae_int_t n, ae_int_t* info, matinvreport* rep, ae_state *_state) { ae_frame _frame_block; ae_vector pivots; ae_frame_make(_state, &_frame_block); *info = 0; _matinvreport_clear(rep); ae_vector_init(&pivots, 0, DT_INT, _state); ae_assert(n>0, "CRMatrixInverse: N<=0!", _state); ae_assert(a->cols>=n, "CRMatrixInverse: cols(A)rows>=n, "CRMatrixInverse: rows(A)0, "SPDMatrixCholeskyInverse: N<=0!", _state); ae_assert(a->cols>=n, "SPDMatrixCholeskyInverse: cols(A)rows>=n, "SPDMatrixCholeskyInverse: rows(A)ptr.pp_double[i][i], _state); } ae_assert(f, "SPDMatrixCholeskyInverse: A contains infinite or NaN values!", _state); /* * calculate condition numbers */ rep->r1 = spdmatrixcholeskyrcond(a, n, isupper, _state); rep->rinf = rep->r1; if( ae_fp_less(rep->r1,rcondthreshold(_state))||ae_fp_less(rep->rinf,rcondthreshold(_state)) ) { if( isupper ) { for(i=0; i<=n-1; i++) { for(j=i; j<=n-1; j++) { a->ptr.pp_double[i][j] = (double)(0); } } } else { for(i=0; i<=n-1; i++) { for(j=0; j<=i; j++) { a->ptr.pp_double[i][j] = (double)(0); } } } rep->r1 = (double)(0); rep->rinf = (double)(0); *info = -3; ae_frame_leave(_state); return; } /* * Inverse */ ae_vector_set_length(&tmp, n, _state); spdmatrixcholeskyinverserec(a, 0, n, isupper, &tmp, _state); ae_frame_leave(_state); } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. *************************************************************************/ void _pexec_spdmatrixcholeskyinverse(/* Real */ ae_matrix* a, ae_int_t n, ae_bool isupper, ae_int_t* info, matinvreport* rep, ae_state *_state) { spdmatrixcholeskyinverse(a,n,isupper,info,rep, _state); } /************************************************************************* Inversion of a symmetric positive definite matrix. Given an upper or lower triangle of a symmetric positive definite matrix, the algorithm generates matrix A^-1 and saves the upper or lower triangle depending on the input. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! 
support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. However, Cholesky inversion is a "difficult" ! algorithm - it has lots of internal synchronization points which ! prevents efficient parallelization of algorithm. Only very large ! problems (N=thousands) can be efficiently parallelized. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. Input parameters: A - matrix to be inverted (upper or lower triangle). Array with elements [0..N-1,0..N-1]. N - size of matrix A (optional) : * if given, only principal NxN submatrix is processed and overwritten. other elements are unchanged. * if not given, size is automatically determined from matrix size (A must be square matrix) IsUpper - storage type (optional): * if True, symmetric matrix A is given by its upper triangle, and the lower triangle isn't used/changed by function * if False, symmetric matrix A is given by its lower triangle, and the upper triangle isn't used/changed by function * if not given, both lower and upper triangles must be filled. Output parameters: Info - return code, same as in RMatrixLUInverse Rep - solver report, same as in RMatrixLUInverse A - inverse of matrix A, same as in RMatrixLUInverse -- ALGLIB routine -- 10.02.2010 Bochkanov Sergey *************************************************************************/ void spdmatrixinverse(/* Real */ ae_matrix* a, ae_int_t n, ae_bool isupper, ae_int_t* info, matinvreport* rep, ae_state *_state) { *info = 0; _matinvreport_clear(rep); ae_assert(n>0, "SPDMatrixInverse: N<=0!", _state); ae_assert(a->cols>=n, "SPDMatrixInverse: cols(A)rows>=n, "SPDMatrixInverse: rows(A)0, "HPDMatrixCholeskyInverse: N<=0!", _state); ae_assert(a->cols>=n, "HPDMatrixCholeskyInverse: cols(A)rows>=n, "HPDMatrixCholeskyInverse: rows(A)ptr.pp_complex[i][i].x, _state))&&ae_isfinite(a->ptr.pp_complex[i][i].y, _state); } ae_assert(f, "HPDMatrixCholeskyInverse: A contains infinite or NaN values!", _state); *info = 1; /* * calculate condition numbers */ rep->r1 = hpdmatrixcholeskyrcond(a, n, isupper, _state); rep->rinf = rep->r1; if( ae_fp_less(rep->r1,rcondthreshold(_state))||ae_fp_less(rep->rinf,rcondthreshold(_state)) ) { if( isupper ) { for(i=0; i<=n-1; i++) { for(j=i; j<=n-1; j++) { a->ptr.pp_complex[i][j] = ae_complex_from_i(0); } } } else { for(i=0; i<=n-1; i++) { for(j=0; j<=i; j++) { a->ptr.pp_complex[i][j] = ae_complex_from_i(0); } } } rep->r1 = (double)(0); rep->rinf = (double)(0); *info = -3; ae_frame_leave(_state); return; } /* * Inverse */ ae_vector_set_length(&tmp, n, _state); matinv_hpdmatrixcholeskyinverserec(a, 0, n, isupper, &tmp, _state); ae_frame_leave(_state); } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. *************************************************************************/ void _pexec_hpdmatrixcholeskyinverse(/* Complex */ ae_matrix* a, ae_int_t n, ae_bool isupper, ae_int_t* info, matinvreport* rep, ae_state *_state) { hpdmatrixcholeskyinverse(a,n,isupper,info,rep, _state); } /************************************************************************* Inversion of a Hermitian positive definite matrix. 
Given an upper or lower triangle of a Hermitian positive definite matrix, the algorithm generates matrix A^-1 and saves the upper or lower triangle depending on the input. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. However, Cholesky inversion is a "difficult" ! algorithm - it has lots of internal synchronization points which ! prevents efficient parallelization of algorithm. Only very large ! problems (N=thousands) can be efficiently parallelized. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. Input parameters: A - matrix to be inverted (upper or lower triangle). Array with elements [0..N-1,0..N-1]. N - size of matrix A (optional) : * if given, only principal NxN submatrix is processed and overwritten. other elements are unchanged. * if not given, size is automatically determined from matrix size (A must be square matrix) IsUpper - storage type (optional): * if True, symmetric matrix A is given by its upper triangle, and the lower triangle isn't used/changed by function * if False, symmetric matrix A is given by its lower triangle, and the upper triangle isn't used/changed by function * if not given, both lower and upper triangles must be filled. 
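(Note: the 'optional' behaviour described for N and IsUpper above applies to
the friendly C++/C# interfaces; in the C core listed here both arguments are
always passed explicitly.)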
Output parameters: Info - return code, same as in RMatrixLUInverse Rep - solver report, same as in RMatrixLUInverse A - inverse of matrix A, same as in RMatrixLUInverse -- ALGLIB routine -- 10.02.2010 Bochkanov Sergey *************************************************************************/ void hpdmatrixinverse(/* Complex */ ae_matrix* a, ae_int_t n, ae_bool isupper, ae_int_t* info, matinvreport* rep, ae_state *_state) { *info = 0; _matinvreport_clear(rep); ae_assert(n>0, "HPDMatrixInverse: N<=0!", _state); ae_assert(a->cols>=n, "HPDMatrixInverse: cols(A)rows>=n, "HPDMatrixInverse: rows(A)0, "RMatrixTRInverse: N<=0!", _state); ae_assert(a->cols>=n, "RMatrixTRInverse: cols(A)rows>=n, "RMatrixTRInverse: rows(A)r1 = rmatrixtrrcond1(a, n, isupper, isunit, _state); rep->rinf = rmatrixtrrcondinf(a, n, isupper, isunit, _state); if( ae_fp_less(rep->r1,rcondthreshold(_state))||ae_fp_less(rep->rinf,rcondthreshold(_state)) ) { for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { a->ptr.pp_double[i][j] = (double)(0); } } rep->r1 = (double)(0); rep->rinf = (double)(0); *info = -3; ae_frame_leave(_state); return; } /* * Invert */ ae_vector_set_length(&tmp, n, _state); sinfo.val = 1; matinv_rmatrixtrinverserec(a, 0, n, isupper, isunit, &tmp, &sinfo, rep, _state); *info = sinfo.val; ae_frame_leave(_state); } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. *************************************************************************/ void _pexec_rmatrixtrinverse(/* Real */ ae_matrix* a, ae_int_t n, ae_bool isupper, ae_bool isunit, ae_int_t* info, matinvreport* rep, ae_state *_state) { rmatrixtrinverse(a,n,isupper,isunit,info,rep, _state); } /************************************************************************* Triangular matrix inverse (complex) The subroutine inverts the following types of matrices: * upper triangular * upper triangular with unit diagonal * lower triangular * lower triangular with unit diagonal In case of an upper (lower) triangular matrix, the inverse matrix will also be upper (lower) triangular, and after the end of the algorithm, the inverse matrix replaces the source matrix. The elements below (above) the main diagonal are not changed by the algorithm. If the matrix has a unit diagonal, the inverse matrix also has a unit diagonal, and the diagonal elements are not passed to the algorithm. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that triangular inverse is harder to ! parallelize than, say, matrix-matrix product - this algorithm has ! 
many internal synchronization points which can not be avoided. However ! parallelism starts to be profitable starting from N=1024, achieving ! near-linear speedup for N=4096 or higher. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. Input parameters: A - matrix, array[0..N-1, 0..N-1]. N - size of matrix A (optional) : * if given, only principal NxN submatrix is processed and overwritten. other elements are unchanged. * if not given, size is automatically determined from matrix size (A must be square matrix) IsUpper - True, if the matrix is upper triangular. IsUnit - diagonal type (optional): * if True, matrix has unit diagonal (a[i,i] are NOT used) * if False, matrix diagonal is arbitrary * if not given, False is assumed Output parameters: Info - same as for RMatrixLUInverse Rep - same as for RMatrixLUInverse A - same as for RMatrixLUInverse. -- ALGLIB -- Copyright 05.02.2010 by Bochkanov Sergey *************************************************************************/ void cmatrixtrinverse(/* Complex */ ae_matrix* a, ae_int_t n, ae_bool isupper, ae_bool isunit, ae_int_t* info, matinvreport* rep, ae_state *_state) { ae_frame _frame_block; ae_int_t i; ae_int_t j; ae_vector tmp; ae_frame_make(_state, &_frame_block); *info = 0; _matinvreport_clear(rep); ae_vector_init(&tmp, 0, DT_COMPLEX, _state); ae_assert(n>0, "CMatrixTRInverse: N<=0!", _state); ae_assert(a->cols>=n, "CMatrixTRInverse: cols(A)rows>=n, "CMatrixTRInverse: rows(A)r1 = cmatrixtrrcond1(a, n, isupper, isunit, _state); rep->rinf = cmatrixtrrcondinf(a, n, isupper, isunit, _state); if( ae_fp_less(rep->r1,rcondthreshold(_state))||ae_fp_less(rep->rinf,rcondthreshold(_state)) ) { for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { a->ptr.pp_complex[i][j] = ae_complex_from_i(0); } } rep->r1 = (double)(0); rep->rinf = (double)(0); *info = -3; ae_frame_leave(_state); return; } /* * Invert */ ae_vector_set_length(&tmp, n, _state); matinv_cmatrixtrinverserec(a, 0, n, isupper, isunit, &tmp, info, rep, _state); ae_frame_leave(_state); } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. *************************************************************************/ void _pexec_cmatrixtrinverse(/* Complex */ ae_matrix* a, ae_int_t n, ae_bool isupper, ae_bool isunit, ae_int_t* info, matinvreport* rep, ae_state *_state) { cmatrixtrinverse(a,n,isupper,isunit,info,rep, _state); } /************************************************************************* Recursive subroutine for SPD inversion. NOTE: this function expects that matris is strictly positive-definite. 
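The recursion works directly on the Cholesky factor stored in A: for the
upper storage case A = U'*U, so inv(A) = inv(U)*inv(U)'; for the lower
storage case A = L*L', so inv(A) = inv(L)'*inv(L). The base case below first
inverts the triangular factor in place (via matinv_rmatrixtrinverserec) and
then accumulates the corresponding triangular product, touching only the
stored triangle; the recursive case splits A into 2x2 blocks and combines
TRSM/SYRK updates of the off-diagonal and diagonal blocks with two recursive
calls on the diagonal blocks.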
-- ALGLIB routine -- 10.02.2010 Bochkanov Sergey *************************************************************************/ void spdmatrixcholeskyinverserec(/* Real */ ae_matrix* a, ae_int_t offs, ae_int_t n, ae_bool isupper, /* Real */ ae_vector* tmp, ae_state *_state) { ae_frame _frame_block; ae_int_t i; ae_int_t j; double v; ae_int_t n1; ae_int_t n2; sinteger sinfo2; matinvreport rep2; ae_frame_make(_state, &_frame_block); _sinteger_init(&sinfo2, _state); _matinvreport_init(&rep2, _state); if( n<1 ) { ae_frame_leave(_state); return; } /* * Base case */ if( n<=ablasblocksize(a, _state) ) { sinfo2.val = 1; matinv_rmatrixtrinverserec(a, offs, n, isupper, ae_false, tmp, &sinfo2, &rep2, _state); ae_assert(sinfo2.val>0, "SPDMatrixCholeskyInverseRec: integrity check failed", _state); if( isupper ) { /* * Compute the product U * U'. * NOTE: we never assume that diagonal of U is real */ for(i=0; i<=n-1; i++) { if( i==0 ) { /* * 1x1 matrix */ a->ptr.pp_double[offs+i][offs+i] = ae_sqr(a->ptr.pp_double[offs+i][offs+i], _state); } else { /* * (I+1)x(I+1) matrix, * * ( A11 A12 ) ( A11^H ) ( A11*A11^H+A12*A12^H A12*A22^H ) * ( ) * ( ) = ( ) * ( A22 ) ( A12^H A22^H ) ( A22*A12^H A22*A22^H ) * * A11 is IxI, A22 is 1x1. */ ae_v_move(&tmp->ptr.p_double[0], 1, &a->ptr.pp_double[offs][offs+i], a->stride, ae_v_len(0,i-1)); for(j=0; j<=i-1; j++) { v = a->ptr.pp_double[offs+j][offs+i]; ae_v_addd(&a->ptr.pp_double[offs+j][offs+j], 1, &tmp->ptr.p_double[j], 1, ae_v_len(offs+j,offs+i-1), v); } v = a->ptr.pp_double[offs+i][offs+i]; ae_v_muld(&a->ptr.pp_double[offs][offs+i], a->stride, ae_v_len(offs,offs+i-1), v); a->ptr.pp_double[offs+i][offs+i] = ae_sqr(a->ptr.pp_double[offs+i][offs+i], _state); } } } else { /* * Compute the product L' * L * NOTE: we never assume that diagonal of L is real */ for(i=0; i<=n-1; i++) { if( i==0 ) { /* * 1x1 matrix */ a->ptr.pp_double[offs+i][offs+i] = ae_sqr(a->ptr.pp_double[offs+i][offs+i], _state); } else { /* * (I+1)x(I+1) matrix, * * ( A11^H A21^H ) ( A11 ) ( A11^H*A11+A21^H*A21 A21^H*A22 ) * ( ) * ( ) = ( ) * ( A22^H ) ( A21 A22 ) ( A22^H*A21 A22^H*A22 ) * * A11 is IxI, A22 is 1x1. 
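*
* (At this point the Cholesky factor stored in A has already
* been replaced by its inverse by the call above, so this loop
* only accumulates the product of the inverted factor with its
* transpose in place, growing the result by one row/column per
* iteration and touching nothing outside the stored triangle.)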
*/ ae_v_move(&tmp->ptr.p_double[0], 1, &a->ptr.pp_double[offs+i][offs], 1, ae_v_len(0,i-1)); for(j=0; j<=i-1; j++) { v = a->ptr.pp_double[offs+i][offs+j]; ae_v_addd(&a->ptr.pp_double[offs+j][offs], 1, &tmp->ptr.p_double[0], 1, ae_v_len(offs,offs+j), v); } v = a->ptr.pp_double[offs+i][offs+i]; ae_v_muld(&a->ptr.pp_double[offs+i][offs], 1, ae_v_len(offs,offs+i-1), v); a->ptr.pp_double[offs+i][offs+i] = ae_sqr(a->ptr.pp_double[offs+i][offs+i], _state); } } } ae_frame_leave(_state); return; } /* * Recursive code: triangular factor inversion merged with * UU' or L'L multiplication */ ablassplitlength(a, n, &n1, &n2, _state); /* * form off-diagonal block of trangular inverse */ if( isupper ) { for(i=0; i<=n1-1; i++) { ae_v_muld(&a->ptr.pp_double[offs+i][offs+n1], 1, ae_v_len(offs+n1,offs+n-1), -1); } rmatrixlefttrsm(n1, n2, a, offs, offs, isupper, ae_false, 0, a, offs, offs+n1, _state); rmatrixrighttrsm(n1, n2, a, offs+n1, offs+n1, isupper, ae_false, 0, a, offs, offs+n1, _state); } else { for(i=0; i<=n2-1; i++) { ae_v_muld(&a->ptr.pp_double[offs+n1+i][offs], 1, ae_v_len(offs,offs+n1-1), -1); } rmatrixrighttrsm(n2, n1, a, offs, offs, isupper, ae_false, 0, a, offs+n1, offs, _state); rmatrixlefttrsm(n2, n1, a, offs+n1, offs+n1, isupper, ae_false, 0, a, offs+n1, offs, _state); } /* * invert first diagonal block */ spdmatrixcholeskyinverserec(a, offs, n1, isupper, tmp, _state); /* * update first diagonal block with off-diagonal block, * update off-diagonal block */ if( isupper ) { rmatrixsyrk(n1, n2, 1.0, a, offs, offs+n1, 0, 1.0, a, offs, offs, isupper, _state); rmatrixrighttrsm(n1, n2, a, offs+n1, offs+n1, isupper, ae_false, 1, a, offs, offs+n1, _state); } else { rmatrixsyrk(n1, n2, 1.0, a, offs+n1, offs, 1, 1.0, a, offs, offs, isupper, _state); rmatrixlefttrsm(n2, n1, a, offs+n1, offs+n1, isupper, ae_false, 1, a, offs+n1, offs, _state); } /* * invert second diagonal block */ spdmatrixcholeskyinverserec(a, offs+n1, n2, isupper, tmp, _state); ae_frame_leave(_state); } /************************************************************************* Triangular matrix inversion, recursive subroutine NOTE: this function sets Info on failure, leaves it unchanged on success. NOTE: only Tmp[Offs:Offs+N-1] is modified, other entries of the temporary array are not modified -- ALGLIB -- 05.02.2010, Bochkanov Sergey. Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., Courant Institute, Argonne National Lab, and Rice University February 29, 1992. *************************************************************************/ static void matinv_rmatrixtrinverserec(/* Real */ ae_matrix* a, ae_int_t offs, ae_int_t n, ae_bool isupper, ae_bool isunit, /* Real */ ae_vector* tmp, sinteger* info, matinvreport* rep, ae_state *_state) { ae_int_t n1; ae_int_t n2; ae_int_t i; ae_int_t j; double v; double ajj; if( n<1 ) { info->val = -1; return; } /* * Base case */ if( n<=ablasblocksize(a, _state) ) { if( isupper ) { /* * Compute inverse of upper triangular matrix. */ for(j=0; j<=n-1; j++) { if( !isunit ) { if( ae_fp_eq(a->ptr.pp_double[offs+j][offs+j],(double)(0)) ) { info->val = -3; return; } a->ptr.pp_double[offs+j][offs+j] = 1/a->ptr.pp_double[offs+j][offs+j]; ajj = -a->ptr.pp_double[offs+j][offs+j]; } else { ajj = (double)(-1); } /* * Compute elements 1:j-1 of j-th column. 
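* Columns 0..J-1 already hold the inverse of the leading
* block inv(U11), while A(0:J-1,J) still holds the original
* column U(0:J-1,J), so the new column is
*     inv(U)(0:J-1,J) = -inv(U11)*U(0:J-1,J)/U(J,J)
* with the factor -1/U(J,J) being AJJ computed above
* (AJJ=-1 in the unit-diagonal case). This is essentially
* the column-oriented Level 2 scheme of LAPACK's TRTI2.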
*/ if( j>0 ) { ae_v_move(&tmp->ptr.p_double[offs+0], 1, &a->ptr.pp_double[offs+0][offs+j], a->stride, ae_v_len(offs+0,offs+j-1)); for(i=0; i<=j-1; i++) { if( iptr.pp_double[offs+i][offs+i+1], 1, &tmp->ptr.p_double[offs+i+1], 1, ae_v_len(offs+i+1,offs+j-1)); } else { v = (double)(0); } if( !isunit ) { a->ptr.pp_double[offs+i][offs+j] = v+a->ptr.pp_double[offs+i][offs+i]*tmp->ptr.p_double[offs+i]; } else { a->ptr.pp_double[offs+i][offs+j] = v+tmp->ptr.p_double[offs+i]; } } ae_v_muld(&a->ptr.pp_double[offs+0][offs+j], a->stride, ae_v_len(offs+0,offs+j-1), ajj); } } } else { /* * Compute inverse of lower triangular matrix. */ for(j=n-1; j>=0; j--) { if( !isunit ) { if( ae_fp_eq(a->ptr.pp_double[offs+j][offs+j],(double)(0)) ) { info->val = -3; return; } a->ptr.pp_double[offs+j][offs+j] = 1/a->ptr.pp_double[offs+j][offs+j]; ajj = -a->ptr.pp_double[offs+j][offs+j]; } else { ajj = (double)(-1); } if( jptr.p_double[offs+j+1], 1, &a->ptr.pp_double[offs+j+1][offs+j], a->stride, ae_v_len(offs+j+1,offs+n-1)); for(i=j+1; i<=n-1; i++) { if( i>j+1 ) { v = ae_v_dotproduct(&a->ptr.pp_double[offs+i][offs+j+1], 1, &tmp->ptr.p_double[offs+j+1], 1, ae_v_len(offs+j+1,offs+i-1)); } else { v = (double)(0); } if( !isunit ) { a->ptr.pp_double[offs+i][offs+j] = v+a->ptr.pp_double[offs+i][offs+i]*tmp->ptr.p_double[offs+i]; } else { a->ptr.pp_double[offs+i][offs+j] = v+tmp->ptr.p_double[offs+i]; } } ae_v_muld(&a->ptr.pp_double[offs+j+1][offs+j], a->stride, ae_v_len(offs+j+1,offs+n-1), ajj); } } } return; } /* * Recursive case */ ablassplitlength(a, n, &n1, &n2, _state); if( n2>0 ) { if( isupper ) { for(i=0; i<=n1-1; i++) { ae_v_muld(&a->ptr.pp_double[offs+i][offs+n1], 1, ae_v_len(offs+n1,offs+n-1), -1); } rmatrixrighttrsm(n1, n2, a, offs+n1, offs+n1, isupper, isunit, 0, a, offs, offs+n1, _state); rmatrixlefttrsm(n1, n2, a, offs, offs, isupper, isunit, 0, a, offs, offs+n1, _state); matinv_rmatrixtrinverserec(a, offs+n1, n2, isupper, isunit, tmp, info, rep, _state); } else { for(i=0; i<=n2-1; i++) { ae_v_muld(&a->ptr.pp_double[offs+n1+i][offs], 1, ae_v_len(offs,offs+n1-1), -1); } rmatrixlefttrsm(n2, n1, a, offs+n1, offs+n1, isupper, isunit, 0, a, offs+n1, offs, _state); rmatrixrighttrsm(n2, n1, a, offs, offs, isupper, isunit, 0, a, offs+n1, offs, _state); matinv_rmatrixtrinverserec(a, offs+n1, n2, isupper, isunit, tmp, info, rep, _state); } } matinv_rmatrixtrinverserec(a, offs, n1, isupper, isunit, tmp, info, rep, _state); } /************************************************************************* Triangular matrix inversion, recursive subroutine -- ALGLIB -- 05.02.2010, Bochkanov Sergey. Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., Courant Institute, Argonne National Lab, and Rice University February 29, 1992. *************************************************************************/ static void matinv_cmatrixtrinverserec(/* Complex */ ae_matrix* a, ae_int_t offs, ae_int_t n, ae_bool isupper, ae_bool isunit, /* Complex */ ae_vector* tmp, ae_int_t* info, matinvreport* rep, ae_state *_state) { ae_int_t n1; ae_int_t n2; ae_int_t i; ae_int_t j; ae_complex v; ae_complex ajj; if( n<1 ) { *info = -1; return; } /* * Base case */ if( n<=ablascomplexblocksize(a, _state) ) { if( isupper ) { /* * Compute inverse of upper triangular matrix. 
*/ for(j=0; j<=n-1; j++) { if( !isunit ) { if( ae_c_eq_d(a->ptr.pp_complex[offs+j][offs+j],(double)(0)) ) { *info = -3; return; } a->ptr.pp_complex[offs+j][offs+j] = ae_c_d_div(1,a->ptr.pp_complex[offs+j][offs+j]); ajj = ae_c_neg(a->ptr.pp_complex[offs+j][offs+j]); } else { ajj = ae_complex_from_i(-1); } /* * Compute elements 1:j-1 of j-th column. */ if( j>0 ) { ae_v_cmove(&tmp->ptr.p_complex[0], 1, &a->ptr.pp_complex[offs+0][offs+j], a->stride, "N", ae_v_len(0,j-1)); for(i=0; i<=j-1; i++) { if( iptr.pp_complex[offs+i][offs+i+1], 1, "N", &tmp->ptr.p_complex[i+1], 1, "N", ae_v_len(offs+i+1,offs+j-1)); } else { v = ae_complex_from_i(0); } if( !isunit ) { a->ptr.pp_complex[offs+i][offs+j] = ae_c_add(v,ae_c_mul(a->ptr.pp_complex[offs+i][offs+i],tmp->ptr.p_complex[i])); } else { a->ptr.pp_complex[offs+i][offs+j] = ae_c_add(v,tmp->ptr.p_complex[i]); } } ae_v_cmulc(&a->ptr.pp_complex[offs+0][offs+j], a->stride, ae_v_len(offs+0,offs+j-1), ajj); } } } else { /* * Compute inverse of lower triangular matrix. */ for(j=n-1; j>=0; j--) { if( !isunit ) { if( ae_c_eq_d(a->ptr.pp_complex[offs+j][offs+j],(double)(0)) ) { *info = -3; return; } a->ptr.pp_complex[offs+j][offs+j] = ae_c_d_div(1,a->ptr.pp_complex[offs+j][offs+j]); ajj = ae_c_neg(a->ptr.pp_complex[offs+j][offs+j]); } else { ajj = ae_complex_from_i(-1); } if( jptr.p_complex[j+1], 1, &a->ptr.pp_complex[offs+j+1][offs+j], a->stride, "N", ae_v_len(j+1,n-1)); for(i=j+1; i<=n-1; i++) { if( i>j+1 ) { v = ae_v_cdotproduct(&a->ptr.pp_complex[offs+i][offs+j+1], 1, "N", &tmp->ptr.p_complex[j+1], 1, "N", ae_v_len(offs+j+1,offs+i-1)); } else { v = ae_complex_from_i(0); } if( !isunit ) { a->ptr.pp_complex[offs+i][offs+j] = ae_c_add(v,ae_c_mul(a->ptr.pp_complex[offs+i][offs+i],tmp->ptr.p_complex[i])); } else { a->ptr.pp_complex[offs+i][offs+j] = ae_c_add(v,tmp->ptr.p_complex[i]); } } ae_v_cmulc(&a->ptr.pp_complex[offs+j+1][offs+j], a->stride, ae_v_len(offs+j+1,offs+n-1), ajj); } } } return; } /* * Recursive case */ ablascomplexsplitlength(a, n, &n1, &n2, _state); if( n2>0 ) { if( isupper ) { for(i=0; i<=n1-1; i++) { ae_v_cmuld(&a->ptr.pp_complex[offs+i][offs+n1], 1, ae_v_len(offs+n1,offs+n-1), -1); } cmatrixlefttrsm(n1, n2, a, offs, offs, isupper, isunit, 0, a, offs, offs+n1, _state); cmatrixrighttrsm(n1, n2, a, offs+n1, offs+n1, isupper, isunit, 0, a, offs, offs+n1, _state); } else { for(i=0; i<=n2-1; i++) { ae_v_cmuld(&a->ptr.pp_complex[offs+n1+i][offs], 1, ae_v_len(offs,offs+n1-1), -1); } cmatrixrighttrsm(n2, n1, a, offs, offs, isupper, isunit, 0, a, offs+n1, offs, _state); cmatrixlefttrsm(n2, n1, a, offs+n1, offs+n1, isupper, isunit, 0, a, offs+n1, offs, _state); } matinv_cmatrixtrinverserec(a, offs+n1, n2, isupper, isunit, tmp, info, rep, _state); } matinv_cmatrixtrinverserec(a, offs, n1, isupper, isunit, tmp, info, rep, _state); } static void matinv_rmatrixluinverserec(/* Real */ ae_matrix* a, ae_int_t offs, ae_int_t n, /* Real */ ae_vector* work, sinteger* info, matinvreport* rep, ae_state *_state) { ae_int_t i; ae_int_t j; double v; ae_int_t n1; ae_int_t n2; if( n<1 ) { info->val = -1; return; } /* * Base case */ if( n<=ablasblocksize(a, _state) ) { /* * Form inv(U) */ matinv_rmatrixtrinverserec(a, offs, n, ae_true, ae_false, work, info, rep, _state); if( info->val<=0 ) { return; } /* * Solve the equation inv(A)*L = inv(U) for inv(A). */ for(j=n-1; j>=0; j--) { /* * Copy current column of L to WORK and replace with zeros. 
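* Columns J+1..N-1 of A already hold the corresponding
* columns of inv(A), so with the strictly lower part of
* column J saved in WORK and zeroed in A, the update below
* forms
*     inv(A)(:,J) = inv(U)(:,J) - sum_{K>J} inv(A)(:,K)*L(K,J),
* i.e. the equation inv(A)*L = inv(U) is solved one column
* at a time, from the last column to the first.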
*/ for(i=j+1; i<=n-1; i++) { work->ptr.p_double[i] = a->ptr.pp_double[offs+i][offs+j]; a->ptr.pp_double[offs+i][offs+j] = (double)(0); } /* * Compute current column of inv(A). */ if( jptr.pp_double[offs+i][offs+j+1], 1, &work->ptr.p_double[j+1], 1, ae_v_len(offs+j+1,offs+n-1)); a->ptr.pp_double[offs+i][offs+j] = a->ptr.pp_double[offs+i][offs+j]-v; } } } return; } /* * Recursive code: * * ( L1 ) ( U1 U12 ) * A = ( ) * ( ) * ( L12 L2 ) ( U2 ) * * ( W X ) * A^-1 = ( ) * ( Y Z ) */ ablassplitlength(a, n, &n1, &n2, _state); ae_assert(n2>0, "LUInverseRec: internal error!", _state); /* * X := inv(U1)*U12*inv(U2) */ rmatrixlefttrsm(n1, n2, a, offs, offs, ae_true, ae_false, 0, a, offs, offs+n1, _state); rmatrixrighttrsm(n1, n2, a, offs+n1, offs+n1, ae_true, ae_false, 0, a, offs, offs+n1, _state); /* * Y := inv(L2)*L12*inv(L1) */ rmatrixlefttrsm(n2, n1, a, offs+n1, offs+n1, ae_false, ae_true, 0, a, offs+n1, offs, _state); rmatrixrighttrsm(n2, n1, a, offs, offs, ae_false, ae_true, 0, a, offs+n1, offs, _state); /* * W := inv(L1*U1)+X*Y */ matinv_rmatrixluinverserec(a, offs, n1, work, info, rep, _state); if( info->val<=0 ) { return; } rmatrixgemm(n1, n1, n2, 1.0, a, offs, offs+n1, 0, a, offs+n1, offs, 0, 1.0, a, offs, offs, _state); /* * X := -X*inv(L2) * Y := -inv(U2)*Y */ rmatrixrighttrsm(n1, n2, a, offs+n1, offs+n1, ae_false, ae_true, 0, a, offs, offs+n1, _state); for(i=0; i<=n1-1; i++) { ae_v_muld(&a->ptr.pp_double[offs+i][offs+n1], 1, ae_v_len(offs+n1,offs+n-1), -1); } rmatrixlefttrsm(n2, n1, a, offs+n1, offs+n1, ae_true, ae_false, 0, a, offs+n1, offs, _state); for(i=0; i<=n2-1; i++) { ae_v_muld(&a->ptr.pp_double[offs+n1+i][offs], 1, ae_v_len(offs,offs+n1-1), -1); } /* * Z := inv(L2*U2) */ matinv_rmatrixluinverserec(a, offs+n1, n2, work, info, rep, _state); } static void matinv_cmatrixluinverserec(/* Complex */ ae_matrix* a, ae_int_t offs, ae_int_t n, /* Complex */ ae_vector* work, ae_int_t* info, matinvreport* rep, ae_state *_state) { ae_int_t i; ae_int_t j; ae_complex v; ae_int_t n1; ae_int_t n2; if( n<1 ) { *info = -1; return; } /* * Base case */ if( n<=ablascomplexblocksize(a, _state) ) { /* * Form inv(U) */ matinv_cmatrixtrinverserec(a, offs, n, ae_true, ae_false, work, info, rep, _state); if( *info<=0 ) { return; } /* * Solve the equation inv(A)*L = inv(U) for inv(A). */ for(j=n-1; j>=0; j--) { /* * Copy current column of L to WORK and replace with zeros. */ for(i=j+1; i<=n-1; i++) { work->ptr.p_complex[i] = a->ptr.pp_complex[offs+i][offs+j]; a->ptr.pp_complex[offs+i][offs+j] = ae_complex_from_i(0); } /* * Compute current column of inv(A). 
*/ if( jptr.pp_complex[offs+i][offs+j+1], 1, "N", &work->ptr.p_complex[j+1], 1, "N", ae_v_len(offs+j+1,offs+n-1)); a->ptr.pp_complex[offs+i][offs+j] = ae_c_sub(a->ptr.pp_complex[offs+i][offs+j],v); } } } return; } /* * Recursive code: * * ( L1 ) ( U1 U12 ) * A = ( ) * ( ) * ( L12 L2 ) ( U2 ) * * ( W X ) * A^-1 = ( ) * ( Y Z ) */ ablascomplexsplitlength(a, n, &n1, &n2, _state); ae_assert(n2>0, "LUInverseRec: internal error!", _state); /* * X := inv(U1)*U12*inv(U2) */ cmatrixlefttrsm(n1, n2, a, offs, offs, ae_true, ae_false, 0, a, offs, offs+n1, _state); cmatrixrighttrsm(n1, n2, a, offs+n1, offs+n1, ae_true, ae_false, 0, a, offs, offs+n1, _state); /* * Y := inv(L2)*L12*inv(L1) */ cmatrixlefttrsm(n2, n1, a, offs+n1, offs+n1, ae_false, ae_true, 0, a, offs+n1, offs, _state); cmatrixrighttrsm(n2, n1, a, offs, offs, ae_false, ae_true, 0, a, offs+n1, offs, _state); /* * W := inv(L1*U1)+X*Y */ matinv_cmatrixluinverserec(a, offs, n1, work, info, rep, _state); if( *info<=0 ) { return; } cmatrixgemm(n1, n1, n2, ae_complex_from_d(1.0), a, offs, offs+n1, 0, a, offs+n1, offs, 0, ae_complex_from_d(1.0), a, offs, offs, _state); /* * X := -X*inv(L2) * Y := -inv(U2)*Y */ cmatrixrighttrsm(n1, n2, a, offs+n1, offs+n1, ae_false, ae_true, 0, a, offs, offs+n1, _state); for(i=0; i<=n1-1; i++) { ae_v_cmuld(&a->ptr.pp_complex[offs+i][offs+n1], 1, ae_v_len(offs+n1,offs+n-1), -1); } cmatrixlefttrsm(n2, n1, a, offs+n1, offs+n1, ae_true, ae_false, 0, a, offs+n1, offs, _state); for(i=0; i<=n2-1; i++) { ae_v_cmuld(&a->ptr.pp_complex[offs+n1+i][offs], 1, ae_v_len(offs,offs+n1-1), -1); } /* * Z := inv(L2*U2) */ matinv_cmatrixluinverserec(a, offs+n1, n2, work, info, rep, _state); } /************************************************************************* Recursive subroutine for HPD inversion. -- ALGLIB routine -- 10.02.2010 Bochkanov Sergey *************************************************************************/ static void matinv_hpdmatrixcholeskyinverserec(/* Complex */ ae_matrix* a, ae_int_t offs, ae_int_t n, ae_bool isupper, /* Complex */ ae_vector* tmp, ae_state *_state) { ae_frame _frame_block; ae_int_t i; ae_int_t j; ae_complex v; ae_int_t n1; ae_int_t n2; ae_int_t info2; matinvreport rep2; ae_frame_make(_state, &_frame_block); _matinvreport_init(&rep2, _state); if( n<1 ) { ae_frame_leave(_state); return; } /* * Base case */ if( n<=ablascomplexblocksize(a, _state) ) { matinv_cmatrixtrinverserec(a, offs, n, isupper, ae_false, tmp, &info2, &rep2, _state); if( isupper ) { /* * Compute the product U * U'. * NOTE: we never assume that diagonal of U is real */ for(i=0; i<=n-1; i++) { if( i==0 ) { /* * 1x1 matrix */ a->ptr.pp_complex[offs+i][offs+i] = ae_complex_from_d(ae_sqr(a->ptr.pp_complex[offs+i][offs+i].x, _state)+ae_sqr(a->ptr.pp_complex[offs+i][offs+i].y, _state)); } else { /* * (I+1)x(I+1) matrix, * * ( A11 A12 ) ( A11^H ) ( A11*A11^H+A12*A12^H A12*A22^H ) * ( ) * ( ) = ( ) * ( A22 ) ( A12^H A22^H ) ( A22*A12^H A22*A22^H ) * * A11 is IxI, A22 is 1x1. 
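*
* (Hermitian analogue of the SPD case above: the triangular
* factor has already been replaced by its inverse, and this
* loop accumulates inv(U)*inv(U)^H column by column; the
* "Conj" copies and ae_c_conj() calls below supply the
* conjugate-transpose part of the product.)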
*/ ae_v_cmove(&tmp->ptr.p_complex[0], 1, &a->ptr.pp_complex[offs][offs+i], a->stride, "Conj", ae_v_len(0,i-1)); for(j=0; j<=i-1; j++) { v = a->ptr.pp_complex[offs+j][offs+i]; ae_v_caddc(&a->ptr.pp_complex[offs+j][offs+j], 1, &tmp->ptr.p_complex[j], 1, "N", ae_v_len(offs+j,offs+i-1), v); } v = ae_c_conj(a->ptr.pp_complex[offs+i][offs+i], _state); ae_v_cmulc(&a->ptr.pp_complex[offs][offs+i], a->stride, ae_v_len(offs,offs+i-1), v); a->ptr.pp_complex[offs+i][offs+i] = ae_complex_from_d(ae_sqr(a->ptr.pp_complex[offs+i][offs+i].x, _state)+ae_sqr(a->ptr.pp_complex[offs+i][offs+i].y, _state)); } } } else { /* * Compute the product L' * L * NOTE: we never assume that diagonal of L is real */ for(i=0; i<=n-1; i++) { if( i==0 ) { /* * 1x1 matrix */ a->ptr.pp_complex[offs+i][offs+i] = ae_complex_from_d(ae_sqr(a->ptr.pp_complex[offs+i][offs+i].x, _state)+ae_sqr(a->ptr.pp_complex[offs+i][offs+i].y, _state)); } else { /* * (I+1)x(I+1) matrix, * * ( A11^H A21^H ) ( A11 ) ( A11^H*A11+A21^H*A21 A21^H*A22 ) * ( ) * ( ) = ( ) * ( A22^H ) ( A21 A22 ) ( A22^H*A21 A22^H*A22 ) * * A11 is IxI, A22 is 1x1. */ ae_v_cmove(&tmp->ptr.p_complex[0], 1, &a->ptr.pp_complex[offs+i][offs], 1, "N", ae_v_len(0,i-1)); for(j=0; j<=i-1; j++) { v = ae_c_conj(a->ptr.pp_complex[offs+i][offs+j], _state); ae_v_caddc(&a->ptr.pp_complex[offs+j][offs], 1, &tmp->ptr.p_complex[0], 1, "N", ae_v_len(offs,offs+j), v); } v = ae_c_conj(a->ptr.pp_complex[offs+i][offs+i], _state); ae_v_cmulc(&a->ptr.pp_complex[offs+i][offs], 1, ae_v_len(offs,offs+i-1), v); a->ptr.pp_complex[offs+i][offs+i] = ae_complex_from_d(ae_sqr(a->ptr.pp_complex[offs+i][offs+i].x, _state)+ae_sqr(a->ptr.pp_complex[offs+i][offs+i].y, _state)); } } } ae_frame_leave(_state); return; } /* * Recursive code: triangular factor inversion merged with * UU' or L'L multiplication */ ablascomplexsplitlength(a, n, &n1, &n2, _state); /* * form off-diagonal block of trangular inverse */ if( isupper ) { for(i=0; i<=n1-1; i++) { ae_v_cmuld(&a->ptr.pp_complex[offs+i][offs+n1], 1, ae_v_len(offs+n1,offs+n-1), -1); } cmatrixlefttrsm(n1, n2, a, offs, offs, isupper, ae_false, 0, a, offs, offs+n1, _state); cmatrixrighttrsm(n1, n2, a, offs+n1, offs+n1, isupper, ae_false, 0, a, offs, offs+n1, _state); } else { for(i=0; i<=n2-1; i++) { ae_v_cmuld(&a->ptr.pp_complex[offs+n1+i][offs], 1, ae_v_len(offs,offs+n1-1), -1); } cmatrixrighttrsm(n2, n1, a, offs, offs, isupper, ae_false, 0, a, offs+n1, offs, _state); cmatrixlefttrsm(n2, n1, a, offs+n1, offs+n1, isupper, ae_false, 0, a, offs+n1, offs, _state); } /* * invert first diagonal block */ matinv_hpdmatrixcholeskyinverserec(a, offs, n1, isupper, tmp, _state); /* * update first diagonal block with off-diagonal block, * update off-diagonal block */ if( isupper ) { cmatrixherk(n1, n2, 1.0, a, offs, offs+n1, 0, 1.0, a, offs, offs, isupper, _state); cmatrixrighttrsm(n1, n2, a, offs+n1, offs+n1, isupper, ae_false, 2, a, offs, offs+n1, _state); } else { cmatrixherk(n1, n2, 1.0, a, offs+n1, offs, 2, 1.0, a, offs, offs, isupper, _state); cmatrixlefttrsm(n2, n1, a, offs+n1, offs+n1, isupper, ae_false, 2, a, offs+n1, offs, _state); } /* * invert second diagonal block */ matinv_hpdmatrixcholeskyinverserec(a, offs+n1, n2, isupper, tmp, _state); ae_frame_leave(_state); } void _matinvreport_init(void* _p, ae_state *_state) { matinvreport *p = (matinvreport*)_p; ae_touch_ptr((void*)p); } void _matinvreport_init_copy(void* _dst, void* _src, ae_state *_state) { matinvreport *dst = (matinvreport*)_dst; matinvreport *src = (matinvreport*)_src; dst->r1 = src->r1; dst->rinf = 
src->rinf; } void _matinvreport_clear(void* _p) { matinvreport *p = (matinvreport*)_p; ae_touch_ptr((void*)p); } void _matinvreport_destroy(void* _p) { matinvreport *p = (matinvreport*)_p; ae_touch_ptr((void*)p); } /************************************************************************* QR decomposition of a rectangular matrix of size MxN COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that QP decomposition is harder to ! parallelize than, say, matrix-matrix product - this algorithm has ! many internal synchronization points which can not be avoided. However ! parallelism starts to be profitable starting from N=512, achieving ! near-linear speedup for N=4096 or higher. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. Input parameters: A - matrix A whose indexes range within [0..M-1, 0..N-1]. M - number of rows in matrix A. N - number of columns in matrix A. Output parameters: A - matrices Q and R in compact form (see below). Tau - array of scalar factors which are used to form matrix Q. Array whose index ranges within [0.. Min(M-1,N-1)]. Matrix A is represented as A = QR, where Q is an orthogonal matrix of size MxM, R - upper triangular (or upper trapezoid) matrix of size M x N. The elements of matrix R are located on and above the main diagonal of matrix A. The elements which are located in Tau array and below the main diagonal of matrix A are used to form matrix Q as follows: Matrix Q is represented as a product of elementary reflections Q = H(0)*H(2)*...*H(k-1), where k = min(m,n), and each H(i) is in the form H(i) = 1 - tau * v * (v^T) where tau is a scalar stored in Tau[I]; v - real vector, so that v(0:i-1) = 0, v(i) = 1, v(i+1:m-1) stored in A(i+1:m-1,i). 
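As an illustrative note (not part of the original interface description),
the compact form is sufficient to apply Q or Q^T to a vector without ever
forming Q explicitly. Since Q = H(0)*H(1)*...*H(k-1) and each reflector
H(i) is symmetric, the product Q^T*b can be accumulated as

    for i=0,...,k-1:
        v = (0,...,0, 1, A(i+1,i),...,A(m-1,i))
        b = b - Tau[i]*v*(v^T*b)

which is essentially the per-column operation that the blocked code below
applies to whole groups of columns at once through the compact WY
representation.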
-- ALGLIB routine -- 17.02.2010 Bochkanov Sergey *************************************************************************/ void rmatrixqr(/* Real */ ae_matrix* a, ae_int_t m, ae_int_t n, /* Real */ ae_vector* tau, ae_state *_state) { ae_frame _frame_block; ae_vector work; ae_vector t; ae_vector taubuf; ae_int_t minmn; ae_matrix tmpa; ae_matrix tmpt; ae_matrix tmpr; ae_int_t blockstart; ae_int_t blocksize; ae_int_t rowscount; ae_int_t i; ae_frame_make(_state, &_frame_block); ae_vector_clear(tau); ae_vector_init(&work, 0, DT_REAL, _state); ae_vector_init(&t, 0, DT_REAL, _state); ae_vector_init(&taubuf, 0, DT_REAL, _state); ae_matrix_init(&tmpa, 0, 0, DT_REAL, _state); ae_matrix_init(&tmpt, 0, 0, DT_REAL, _state); ae_matrix_init(&tmpr, 0, 0, DT_REAL, _state); if( m<=0||n<=0 ) { ae_frame_leave(_state); return; } minmn = ae_minint(m, n, _state); ae_vector_set_length(&work, ae_maxint(m, n, _state)+1, _state); ae_vector_set_length(&t, ae_maxint(m, n, _state)+1, _state); ae_vector_set_length(tau, minmn, _state); ae_vector_set_length(&taubuf, minmn, _state); ae_matrix_set_length(&tmpa, m, ablasblocksize(a, _state), _state); ae_matrix_set_length(&tmpt, ablasblocksize(a, _state), 2*ablasblocksize(a, _state), _state); ae_matrix_set_length(&tmpr, 2*ablasblocksize(a, _state), n, _state); /* * Blocked code */ blockstart = 0; while(blockstart!=minmn) { /* * Determine block size */ blocksize = minmn-blockstart; if( blocksize>ablasblocksize(a, _state) ) { blocksize = ablasblocksize(a, _state); } rowscount = m-blockstart; /* * QR decomposition of submatrix. * Matrix is copied to temporary storage to solve * some TLB issues arising from non-contiguous memory * access pattern. */ rmatrixcopy(rowscount, blocksize, a, blockstart, blockstart, &tmpa, 0, 0, _state); rmatrixqrbasecase(&tmpa, rowscount, blocksize, &work, &t, &taubuf, _state); rmatrixcopy(rowscount, blocksize, &tmpa, 0, 0, a, blockstart, blockstart, _state); ae_v_move(&tau->ptr.p_double[blockstart], 1, &taubuf.ptr.p_double[0], 1, ae_v_len(blockstart,blockstart+blocksize-1)); /* * Update the rest, choose between: * a) Level 2 algorithm (when the rest of the matrix is small enough) * b) blocked algorithm, see algorithm 5 from 'A storage efficient WY * representation for products of Householder transformations', * by R. Schreiber and C. Van Loan. */ if( blockstart+blocksize<=n-1 ) { if( n-blockstart-blocksize>=2*ablasblocksize(a, _state)||rowscount>=4*ablasblocksize(a, _state) ) { /* * Prepare block reflector */ ortfac_rmatrixblockreflector(&tmpa, &taubuf, ae_true, rowscount, blocksize, &tmpt, &work, _state); /* * Multiply the rest of A by Q'. 
* * Q = E + Y*T*Y' = E + TmpA*TmpT*TmpA' * Q' = E + Y*T'*Y' = E + TmpA*TmpT'*TmpA' */ rmatrixgemm(blocksize, n-blockstart-blocksize, rowscount, 1.0, &tmpa, 0, 0, 1, a, blockstart, blockstart+blocksize, 0, 0.0, &tmpr, 0, 0, _state); rmatrixgemm(blocksize, n-blockstart-blocksize, blocksize, 1.0, &tmpt, 0, 0, 1, &tmpr, 0, 0, 0, 0.0, &tmpr, blocksize, 0, _state); rmatrixgemm(rowscount, n-blockstart-blocksize, blocksize, 1.0, &tmpa, 0, 0, 0, &tmpr, blocksize, 0, 0, 1.0, a, blockstart, blockstart+blocksize, _state); } else { /* * Level 2 algorithm */ for(i=0; i<=blocksize-1; i++) { ae_v_move(&t.ptr.p_double[1], 1, &tmpa.ptr.pp_double[i][i], tmpa.stride, ae_v_len(1,rowscount-i)); t.ptr.p_double[1] = (double)(1); applyreflectionfromtheleft(a, taubuf.ptr.p_double[i], &t, blockstart+i, m-1, blockstart+blocksize, n-1, &work, _state); } } } /* * Advance */ blockstart = blockstart+blocksize; } ae_frame_leave(_state); } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. *************************************************************************/ void _pexec_rmatrixqr(/* Real */ ae_matrix* a, ae_int_t m, ae_int_t n, /* Real */ ae_vector* tau, ae_state *_state) { rmatrixqr(a,m,n,tau, _state); } /************************************************************************* LQ decomposition of a rectangular matrix of size MxN COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that QP decomposition is harder to ! parallelize than, say, matrix-matrix product - this algorithm has ! many internal synchronization points which can not be avoided. However ! parallelism starts to be profitable starting from N=512, achieving ! near-linear speedup for N=4096 or higher. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. Input parameters: A - matrix A whose indexes range within [0..M-1, 0..N-1]. M - number of rows in matrix A. N - number of columns in matrix A. Output parameters: A - matrices L and Q in compact form (see below) Tau - array of scalar factors which are used to form matrix Q. Array whose index ranges within [0..Min(M,N)-1]. 
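(Relationship to QR, noted for clarity: the LQ factorization produced here is
the transposed counterpart of a QR factorization of A^T - if A^T = Q0*R0,
then A = R0'*Q0' = L*Q - but the routine computes L and Q directly, using
row-oriented reflectors, without forming A^T.)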
Matrix A is represented as A = LQ, where Q is an orthogonal matrix of size MxM, L - lower triangular (or lower trapezoid) matrix of size M x N. The elements of matrix L are located on and below the main diagonal of matrix A. The elements which are located in Tau array and above the main diagonal of matrix A are used to form matrix Q as follows: Matrix Q is represented as a product of elementary reflections Q = H(k-1)*H(k-2)*...*H(1)*H(0), where k = min(m,n), and each H(i) is of the form H(i) = 1 - tau * v * (v^T) where tau is a scalar stored in Tau[I]; v - real vector, so that v(0:i-1)=0, v(i) = 1, v(i+1:n-1) stored in A(i,i+1:n-1). -- ALGLIB routine -- 17.02.2010 Bochkanov Sergey *************************************************************************/ void rmatrixlq(/* Real */ ae_matrix* a, ae_int_t m, ae_int_t n, /* Real */ ae_vector* tau, ae_state *_state) { ae_frame _frame_block; ae_vector work; ae_vector t; ae_vector taubuf; ae_int_t minmn; ae_matrix tmpa; ae_matrix tmpt; ae_matrix tmpr; ae_int_t blockstart; ae_int_t blocksize; ae_int_t columnscount; ae_int_t i; ae_frame_make(_state, &_frame_block); ae_vector_clear(tau); ae_vector_init(&work, 0, DT_REAL, _state); ae_vector_init(&t, 0, DT_REAL, _state); ae_vector_init(&taubuf, 0, DT_REAL, _state); ae_matrix_init(&tmpa, 0, 0, DT_REAL, _state); ae_matrix_init(&tmpt, 0, 0, DT_REAL, _state); ae_matrix_init(&tmpr, 0, 0, DT_REAL, _state); if( m<=0||n<=0 ) { ae_frame_leave(_state); return; } minmn = ae_minint(m, n, _state); ae_vector_set_length(&work, ae_maxint(m, n, _state)+1, _state); ae_vector_set_length(&t, ae_maxint(m, n, _state)+1, _state); ae_vector_set_length(tau, minmn, _state); ae_vector_set_length(&taubuf, minmn, _state); ae_matrix_set_length(&tmpa, ablasblocksize(a, _state), n, _state); ae_matrix_set_length(&tmpt, ablasblocksize(a, _state), 2*ablasblocksize(a, _state), _state); ae_matrix_set_length(&tmpr, m, 2*ablasblocksize(a, _state), _state); /* * Blocked code */ blockstart = 0; while(blockstart!=minmn) { /* * Determine block size */ blocksize = minmn-blockstart; if( blocksize>ablasblocksize(a, _state) ) { blocksize = ablasblocksize(a, _state); } columnscount = n-blockstart; /* * LQ decomposition of submatrix. * Matrix is copied to temporary storage to solve * some TLB issues arising from non-contiguous memory * access pattern. */ rmatrixcopy(blocksize, columnscount, a, blockstart, blockstart, &tmpa, 0, 0, _state); rmatrixlqbasecase(&tmpa, blocksize, columnscount, &work, &t, &taubuf, _state); rmatrixcopy(blocksize, columnscount, &tmpa, 0, 0, a, blockstart, blockstart, _state); ae_v_move(&tau->ptr.p_double[blockstart], 1, &taubuf.ptr.p_double[0], 1, ae_v_len(blockstart,blockstart+blocksize-1)); /* * Update the rest, choose between: * a) Level 2 algorithm (when the rest of the matrix is small enough) * b) blocked algorithm, see algorithm 5 from 'A storage efficient WY * representation for products of Householder transformations', * by R. Schreiber and C. Van Loan. */ if( blockstart+blocksize<=m-1 ) { if( m-blockstart-blocksize>=2*ablasblocksize(a, _state) ) { /* * Prepare block reflector */ ortfac_rmatrixblockreflector(&tmpa, &taubuf, ae_false, columnscount, blocksize, &tmpt, &work, _state); /* * Multiply the rest of A by Q. 
* * Q = E + Y*T*Y' = E + TmpA'*TmpT*TmpA */ rmatrixgemm(m-blockstart-blocksize, blocksize, columnscount, 1.0, a, blockstart+blocksize, blockstart, 0, &tmpa, 0, 0, 1, 0.0, &tmpr, 0, 0, _state); rmatrixgemm(m-blockstart-blocksize, blocksize, blocksize, 1.0, &tmpr, 0, 0, 0, &tmpt, 0, 0, 0, 0.0, &tmpr, 0, blocksize, _state); rmatrixgemm(m-blockstart-blocksize, columnscount, blocksize, 1.0, &tmpr, 0, blocksize, 0, &tmpa, 0, 0, 0, 1.0, a, blockstart+blocksize, blockstart, _state); } else { /* * Level 2 algorithm */ for(i=0; i<=blocksize-1; i++) { ae_v_move(&t.ptr.p_double[1], 1, &tmpa.ptr.pp_double[i][i], 1, ae_v_len(1,columnscount-i)); t.ptr.p_double[1] = (double)(1); applyreflectionfromtheright(a, taubuf.ptr.p_double[i], &t, blockstart+blocksize, m-1, blockstart+i, n-1, &work, _state); } } } /* * Advance */ blockstart = blockstart+blocksize; } ae_frame_leave(_state); } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. *************************************************************************/ void _pexec_rmatrixlq(/* Real */ ae_matrix* a, ae_int_t m, ae_int_t n, /* Real */ ae_vector* tau, ae_state *_state) { rmatrixlq(a,m,n,tau, _state); } /************************************************************************* QR decomposition of a rectangular complex matrix of size MxN COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that QP decomposition is harder to ! parallelize than, say, matrix-matrix product - this algorithm has ! many internal synchronization points which can not be avoided. However ! parallelism starts to be profitable starting from N=512, achieving ! near-linear speedup for N=4096 or higher. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. Input parameters: A - matrix A whose indexes range within [0..M-1, 0..N-1] M - number of rows in matrix A. N - number of columns in matrix A. Output parameters: A - matrices Q and R in compact form Tau - array of scalar factors which are used to form matrix Q. Array whose indexes range within [0.. Min(M,N)-1] Matrix A is represented as A = QR, where Q is an orthogonal matrix of size MxM, R - upper triangular (or upper trapezoid) matrix of size MxN. -- LAPACK routine (version 3.0) -- Univ. 
of Tennessee, Univ. of California Berkeley, NAG Ltd., Courant Institute, Argonne National Lab, and Rice University September 30, 1994 *************************************************************************/ void cmatrixqr(/* Complex */ ae_matrix* a, ae_int_t m, ae_int_t n, /* Complex */ ae_vector* tau, ae_state *_state) { ae_frame _frame_block; ae_vector work; ae_vector t; ae_vector taubuf; ae_int_t minmn; ae_matrix tmpa; ae_matrix tmpt; ae_matrix tmpr; ae_int_t blockstart; ae_int_t blocksize; ae_int_t rowscount; ae_int_t i; ae_frame_make(_state, &_frame_block); ae_vector_clear(tau); ae_vector_init(&work, 0, DT_COMPLEX, _state); ae_vector_init(&t, 0, DT_COMPLEX, _state); ae_vector_init(&taubuf, 0, DT_COMPLEX, _state); ae_matrix_init(&tmpa, 0, 0, DT_COMPLEX, _state); ae_matrix_init(&tmpt, 0, 0, DT_COMPLEX, _state); ae_matrix_init(&tmpr, 0, 0, DT_COMPLEX, _state); if( m<=0||n<=0 ) { ae_frame_leave(_state); return; } minmn = ae_minint(m, n, _state); ae_vector_set_length(&work, ae_maxint(m, n, _state)+1, _state); ae_vector_set_length(&t, ae_maxint(m, n, _state)+1, _state); ae_vector_set_length(tau, minmn, _state); ae_vector_set_length(&taubuf, minmn, _state); ae_matrix_set_length(&tmpa, m, ablascomplexblocksize(a, _state), _state); ae_matrix_set_length(&tmpt, ablascomplexblocksize(a, _state), ablascomplexblocksize(a, _state), _state); ae_matrix_set_length(&tmpr, 2*ablascomplexblocksize(a, _state), n, _state); /* * Blocked code */ blockstart = 0; while(blockstart!=minmn) { /* * Determine block size */ blocksize = minmn-blockstart; if( blocksize>ablascomplexblocksize(a, _state) ) { blocksize = ablascomplexblocksize(a, _state); } rowscount = m-blockstart; /* * QR decomposition of submatrix. * Matrix is copied to temporary storage to solve * some TLB issues arising from non-contiguous memory * access pattern. */ cmatrixcopy(rowscount, blocksize, a, blockstart, blockstart, &tmpa, 0, 0, _state); ortfac_cmatrixqrbasecase(&tmpa, rowscount, blocksize, &work, &t, &taubuf, _state); cmatrixcopy(rowscount, blocksize, &tmpa, 0, 0, a, blockstart, blockstart, _state); ae_v_cmove(&tau->ptr.p_complex[blockstart], 1, &taubuf.ptr.p_complex[0], 1, "N", ae_v_len(blockstart,blockstart+blocksize-1)); /* * Update the rest, choose between: * a) Level 2 algorithm (when the rest of the matrix is small enough) * b) blocked algorithm, see algorithm 5 from 'A storage efficient WY * representation for products of Householder transformations', * by R. Schreiber and C. Van Loan. */ if( blockstart+blocksize<=n-1 ) { if( n-blockstart-blocksize>=2*ablascomplexblocksize(a, _state) ) { /* * Prepare block reflector */ ortfac_cmatrixblockreflector(&tmpa, &taubuf, ae_true, rowscount, blocksize, &tmpt, &work, _state); /* * Multiply the rest of A by Q'. 
* * Q = E + Y*T*Y' = E + TmpA*TmpT*TmpA' * Q' = E + Y*T'*Y' = E + TmpA*TmpT'*TmpA' */ cmatrixgemm(blocksize, n-blockstart-blocksize, rowscount, ae_complex_from_d(1.0), &tmpa, 0, 0, 2, a, blockstart, blockstart+blocksize, 0, ae_complex_from_d(0.0), &tmpr, 0, 0, _state); cmatrixgemm(blocksize, n-blockstart-blocksize, blocksize, ae_complex_from_d(1.0), &tmpt, 0, 0, 2, &tmpr, 0, 0, 0, ae_complex_from_d(0.0), &tmpr, blocksize, 0, _state); cmatrixgemm(rowscount, n-blockstart-blocksize, blocksize, ae_complex_from_d(1.0), &tmpa, 0, 0, 0, &tmpr, blocksize, 0, 0, ae_complex_from_d(1.0), a, blockstart, blockstart+blocksize, _state); } else { /* * Level 2 algorithm */ for(i=0; i<=blocksize-1; i++) { ae_v_cmove(&t.ptr.p_complex[1], 1, &tmpa.ptr.pp_complex[i][i], tmpa.stride, "N", ae_v_len(1,rowscount-i)); t.ptr.p_complex[1] = ae_complex_from_i(1); complexapplyreflectionfromtheleft(a, ae_c_conj(taubuf.ptr.p_complex[i], _state), &t, blockstart+i, m-1, blockstart+blocksize, n-1, &work, _state); } } } /* * Advance */ blockstart = blockstart+blocksize; } ae_frame_leave(_state); } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. *************************************************************************/ void _pexec_cmatrixqr(/* Complex */ ae_matrix* a, ae_int_t m, ae_int_t n, /* Complex */ ae_vector* tau, ae_state *_state) { cmatrixqr(a,m,n,tau, _state); } /************************************************************************* LQ decomposition of a rectangular complex matrix of size MxN COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that QP decomposition is harder to ! parallelize than, say, matrix-matrix product - this algorithm has ! many internal synchronization points which can not be avoided. However ! parallelism starts to be profitable starting from N=512, achieving ! near-linear speedup for N=4096 or higher. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. Input parameters: A - matrix A whose indexes range within [0..M-1, 0..N-1] M - number of rows in matrix A. N - number of columns in matrix A. Output parameters: A - matrices Q and L in compact form Tau - array of scalar factors which are used to form matrix Q. 
Array whose indexes range within [0.. Min(M,N)-1] Matrix A is represented as A = LQ, where Q is an orthogonal matrix of size MxM, L - lower triangular (or lower trapezoid) matrix of size MxN. -- LAPACK routine (version 3.0) -- Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., Courant Institute, Argonne National Lab, and Rice University September 30, 1994 *************************************************************************/ void cmatrixlq(/* Complex */ ae_matrix* a, ae_int_t m, ae_int_t n, /* Complex */ ae_vector* tau, ae_state *_state) { ae_frame _frame_block; ae_vector work; ae_vector t; ae_vector taubuf; ae_int_t minmn; ae_matrix tmpa; ae_matrix tmpt; ae_matrix tmpr; ae_int_t blockstart; ae_int_t blocksize; ae_int_t columnscount; ae_int_t i; ae_frame_make(_state, &_frame_block); ae_vector_clear(tau); ae_vector_init(&work, 0, DT_COMPLEX, _state); ae_vector_init(&t, 0, DT_COMPLEX, _state); ae_vector_init(&taubuf, 0, DT_COMPLEX, _state); ae_matrix_init(&tmpa, 0, 0, DT_COMPLEX, _state); ae_matrix_init(&tmpt, 0, 0, DT_COMPLEX, _state); ae_matrix_init(&tmpr, 0, 0, DT_COMPLEX, _state); if( m<=0||n<=0 ) { ae_frame_leave(_state); return; } minmn = ae_minint(m, n, _state); ae_vector_set_length(&work, ae_maxint(m, n, _state)+1, _state); ae_vector_set_length(&t, ae_maxint(m, n, _state)+1, _state); ae_vector_set_length(tau, minmn, _state); ae_vector_set_length(&taubuf, minmn, _state); ae_matrix_set_length(&tmpa, ablascomplexblocksize(a, _state), n, _state); ae_matrix_set_length(&tmpt, ablascomplexblocksize(a, _state), ablascomplexblocksize(a, _state), _state); ae_matrix_set_length(&tmpr, m, 2*ablascomplexblocksize(a, _state), _state); /* * Blocked code */ blockstart = 0; while(blockstart!=minmn) { /* * Determine block size */ blocksize = minmn-blockstart; if( blocksize>ablascomplexblocksize(a, _state) ) { blocksize = ablascomplexblocksize(a, _state); } columnscount = n-blockstart; /* * LQ decomposition of submatrix. * Matrix is copied to temporary storage to solve * some TLB issues arising from non-contiguous memory * access pattern. */ cmatrixcopy(blocksize, columnscount, a, blockstart, blockstart, &tmpa, 0, 0, _state); ortfac_cmatrixlqbasecase(&tmpa, blocksize, columnscount, &work, &t, &taubuf, _state); cmatrixcopy(blocksize, columnscount, &tmpa, 0, 0, a, blockstart, blockstart, _state); ae_v_cmove(&tau->ptr.p_complex[blockstart], 1, &taubuf.ptr.p_complex[0], 1, "N", ae_v_len(blockstart,blockstart+blocksize-1)); /* * Update the rest, choose between: * a) Level 2 algorithm (when the rest of the matrix is small enough) * b) blocked algorithm, see algorithm 5 from 'A storage efficient WY * representation for products of Householder transformations', * by R. Schreiber and C. Van Loan. */ if( blockstart+blocksize<=m-1 ) { if( m-blockstart-blocksize>=2*ablascomplexblocksize(a, _state) ) { /* * Prepare block reflector */ ortfac_cmatrixblockreflector(&tmpa, &taubuf, ae_false, columnscount, blocksize, &tmpt, &work, _state); /* * Multiply the rest of A by Q. 
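* (The update is performed below by three CMatrixGEMM calls: W = A2*Y^H, W = W*T, A2 = A2 + W*Y,
*  where the block of Householder vectors Y is stored row-wise in TmpA, the triangular factor T in TmpT,
*  the intermediate W in TmpR, and A2 denotes the trailing block A[BlockStart+BlockSize..M-1, BlockStart..N-1].)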
* * Q = E + Y*T*Y' = E + TmpA'*TmpT*TmpA */ cmatrixgemm(m-blockstart-blocksize, blocksize, columnscount, ae_complex_from_d(1.0), a, blockstart+blocksize, blockstart, 0, &tmpa, 0, 0, 2, ae_complex_from_d(0.0), &tmpr, 0, 0, _state); cmatrixgemm(m-blockstart-blocksize, blocksize, blocksize, ae_complex_from_d(1.0), &tmpr, 0, 0, 0, &tmpt, 0, 0, 0, ae_complex_from_d(0.0), &tmpr, 0, blocksize, _state); cmatrixgemm(m-blockstart-blocksize, columnscount, blocksize, ae_complex_from_d(1.0), &tmpr, 0, blocksize, 0, &tmpa, 0, 0, 0, ae_complex_from_d(1.0), a, blockstart+blocksize, blockstart, _state); } else { /* * Level 2 algorithm */ for(i=0; i<=blocksize-1; i++) { ae_v_cmove(&t.ptr.p_complex[1], 1, &tmpa.ptr.pp_complex[i][i], 1, "Conj", ae_v_len(1,columnscount-i)); t.ptr.p_complex[1] = ae_complex_from_i(1); complexapplyreflectionfromtheright(a, taubuf.ptr.p_complex[i], &t, blockstart+blocksize, m-1, blockstart+i, n-1, &work, _state); } } } /* * Advance */ blockstart = blockstart+blocksize; } ae_frame_leave(_state); } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. *************************************************************************/ void _pexec_cmatrixlq(/* Complex */ ae_matrix* a, ae_int_t m, ae_int_t n, /* Complex */ ae_vector* tau, ae_state *_state) { cmatrixlq(a,m,n,tau, _state); } /************************************************************************* Partial unpacking of matrix Q from the QR decomposition of a matrix A COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that QP decomposition is harder to ! parallelize than, say, matrix-matrix product - this algorithm has ! many internal synchronization points which can not be avoided. However ! parallelism starts to be profitable starting from N=512, achieving ! near-linear speedup for N=4096 or higher. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. Input parameters: A - matrices Q and R in compact form. Output of RMatrixQR subroutine. M - number of rows in given matrix A. M>=0. N - number of columns in given matrix A. N>=0. Tau - scalar factors which are used to form Q. Output of the RMatrixQR subroutine. QColumns - required number of columns of matrix Q. M>=QColumns>=0. 
Output parameters: Q - first QColumns columns of matrix Q. Array whose indexes range within [0..M-1, 0..QColumns-1]. If QColumns=0, the array remains unchanged. -- ALGLIB routine -- 17.02.2010 Bochkanov Sergey *************************************************************************/ void rmatrixqrunpackq(/* Real */ ae_matrix* a, ae_int_t m, ae_int_t n, /* Real */ ae_vector* tau, ae_int_t qcolumns, /* Real */ ae_matrix* q, ae_state *_state) { ae_frame _frame_block; ae_vector work; ae_vector t; ae_vector taubuf; ae_int_t minmn; ae_int_t refcnt; ae_matrix tmpa; ae_matrix tmpt; ae_matrix tmpr; ae_int_t blockstart; ae_int_t blocksize; ae_int_t rowscount; ae_int_t i; ae_int_t j; ae_frame_make(_state, &_frame_block); ae_matrix_clear(q); ae_vector_init(&work, 0, DT_REAL, _state); ae_vector_init(&t, 0, DT_REAL, _state); ae_vector_init(&taubuf, 0, DT_REAL, _state); ae_matrix_init(&tmpa, 0, 0, DT_REAL, _state); ae_matrix_init(&tmpt, 0, 0, DT_REAL, _state); ae_matrix_init(&tmpr, 0, 0, DT_REAL, _state); ae_assert(qcolumns<=m, "UnpackQFromQR: QColumns>M!", _state); if( (m<=0||n<=0)||qcolumns<=0 ) { ae_frame_leave(_state); return; } /* * init */ minmn = ae_minint(m, n, _state); refcnt = ae_minint(minmn, qcolumns, _state); ae_matrix_set_length(q, m, qcolumns, _state); for(i=0; i<=m-1; i++) { for(j=0; j<=qcolumns-1; j++) { if( i==j ) { q->ptr.pp_double[i][j] = (double)(1); } else { q->ptr.pp_double[i][j] = (double)(0); } } } ae_vector_set_length(&work, ae_maxint(m, qcolumns, _state)+1, _state); ae_vector_set_length(&t, ae_maxint(m, qcolumns, _state)+1, _state); ae_vector_set_length(&taubuf, minmn, _state); ae_matrix_set_length(&tmpa, m, ablasblocksize(a, _state), _state); ae_matrix_set_length(&tmpt, ablasblocksize(a, _state), 2*ablasblocksize(a, _state), _state); ae_matrix_set_length(&tmpr, 2*ablasblocksize(a, _state), qcolumns, _state); /* * Blocked code */ blockstart = ablasblocksize(a, _state)*(refcnt/ablasblocksize(a, _state)); blocksize = refcnt-blockstart; while(blockstart>=0) { rowscount = m-blockstart; if( blocksize>0 ) { /* * Copy current block */ rmatrixcopy(rowscount, blocksize, a, blockstart, blockstart, &tmpa, 0, 0, _state); ae_v_move(&taubuf.ptr.p_double[0], 1, &tau->ptr.p_double[blockstart], 1, ae_v_len(0,blocksize-1)); /* * Update, choose between: * a) Level 2 algorithm (when the rest of the matrix is small enough) * b) blocked algorithm, see algorithm 5 from 'A storage efficient WY * representation for products of Householder transformations', * by R. Schreiber and C. Van Loan. */ if( qcolumns>=2*ablasblocksize(a, _state) ) { /* * Prepare block reflector */ ortfac_rmatrixblockreflector(&tmpa, &taubuf, ae_true, rowscount, blocksize, &tmpt, &work, _state); /* * Multiply matrix by Q. 
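* (The update is performed below by three RMatrixGEMM calls: W = Y'*Q2, W = T*W, Q2 = Q2 + Y*W,
*  where the block of Householder vectors Y is stored in TmpA, the triangular factor T in TmpT,
*  the intermediate W in TmpR, and Q2 denotes rows BlockStart..M-1 of Q.)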
* * Q = E + Y*T*Y' = E + TmpA*TmpT*TmpA' */ rmatrixgemm(blocksize, qcolumns, rowscount, 1.0, &tmpa, 0, 0, 1, q, blockstart, 0, 0, 0.0, &tmpr, 0, 0, _state); rmatrixgemm(blocksize, qcolumns, blocksize, 1.0, &tmpt, 0, 0, 0, &tmpr, 0, 0, 0, 0.0, &tmpr, blocksize, 0, _state); rmatrixgemm(rowscount, qcolumns, blocksize, 1.0, &tmpa, 0, 0, 0, &tmpr, blocksize, 0, 0, 1.0, q, blockstart, 0, _state); } else { /* * Level 2 algorithm */ for(i=blocksize-1; i>=0; i--) { ae_v_move(&t.ptr.p_double[1], 1, &tmpa.ptr.pp_double[i][i], tmpa.stride, ae_v_len(1,rowscount-i)); t.ptr.p_double[1] = (double)(1); applyreflectionfromtheleft(q, taubuf.ptr.p_double[i], &t, blockstart+i, m-1, 0, qcolumns-1, &work, _state); } } } /* * Advance */ blockstart = blockstart-ablasblocksize(a, _state); blocksize = ablasblocksize(a, _state); } ae_frame_leave(_state); } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. *************************************************************************/ void _pexec_rmatrixqrunpackq(/* Real */ ae_matrix* a, ae_int_t m, ae_int_t n, /* Real */ ae_vector* tau, ae_int_t qcolumns, /* Real */ ae_matrix* q, ae_state *_state) { rmatrixqrunpackq(a,m,n,tau,qcolumns,q, _state); } /************************************************************************* Unpacking of matrix R from the QR decomposition of a matrix A Input parameters: A - matrices Q and R in compact form. Output of RMatrixQR subroutine. M - number of rows in given matrix A. M>=0. N - number of columns in given matrix A. N>=0. Output parameters: R - matrix R, array[0..M-1, 0..N-1]. -- ALGLIB routine -- 17.02.2010 Bochkanov Sergey *************************************************************************/ void rmatrixqrunpackr(/* Real */ ae_matrix* a, ae_int_t m, ae_int_t n, /* Real */ ae_matrix* r, ae_state *_state) { ae_int_t i; ae_int_t k; ae_matrix_clear(r); if( m<=0||n<=0 ) { return; } k = ae_minint(m, n, _state); ae_matrix_set_length(r, m, n, _state); for(i=0; i<=n-1; i++) { r->ptr.pp_double[0][i] = (double)(0); } for(i=1; i<=m-1; i++) { ae_v_move(&r->ptr.pp_double[i][0], 1, &r->ptr.pp_double[0][0], 1, ae_v_len(0,n-1)); } for(i=0; i<=k-1; i++) { ae_v_move(&r->ptr.pp_double[i][i], 1, &a->ptr.pp_double[i][i], 1, ae_v_len(i,n-1)); } } /************************************************************************* Partial unpacking of matrix Q from the LQ decomposition of a matrix A COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that QP decomposition is harder to ! parallelize than, say, matrix-matrix product - this algorithm has ! 
many internal synchronization points which can not be avoided. However ! parallelism starts to be profitable starting from N=512, achieving ! near-linear speedup for N=4096 or higher. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. Input parameters: A - matrices L and Q in compact form. Output of RMatrixLQ subroutine. M - number of rows in given matrix A. M>=0. N - number of columns in given matrix A. N>=0. Tau - scalar factors which are used to form Q. Output of the RMatrixLQ subroutine. QRows - required number of rows in matrix Q. N>=QRows>=0. Output parameters: Q - first QRows rows of matrix Q. Array whose indexes range within [0..QRows-1, 0..N-1]. If QRows=0, the array remains unchanged. -- ALGLIB routine -- 17.02.2010 Bochkanov Sergey *************************************************************************/ void rmatrixlqunpackq(/* Real */ ae_matrix* a, ae_int_t m, ae_int_t n, /* Real */ ae_vector* tau, ae_int_t qrows, /* Real */ ae_matrix* q, ae_state *_state) { ae_frame _frame_block; ae_vector work; ae_vector t; ae_vector taubuf; ae_int_t minmn; ae_int_t refcnt; ae_matrix tmpa; ae_matrix tmpt; ae_matrix tmpr; ae_int_t blockstart; ae_int_t blocksize; ae_int_t columnscount; ae_int_t i; ae_int_t j; ae_frame_make(_state, &_frame_block); ae_matrix_clear(q); ae_vector_init(&work, 0, DT_REAL, _state); ae_vector_init(&t, 0, DT_REAL, _state); ae_vector_init(&taubuf, 0, DT_REAL, _state); ae_matrix_init(&tmpa, 0, 0, DT_REAL, _state); ae_matrix_init(&tmpt, 0, 0, DT_REAL, _state); ae_matrix_init(&tmpr, 0, 0, DT_REAL, _state); ae_assert(qrows<=n, "RMatrixLQUnpackQ: QRows>N!", _state); if( (m<=0||n<=0)||qrows<=0 ) { ae_frame_leave(_state); return; } /* * init */ minmn = ae_minint(m, n, _state); refcnt = ae_minint(minmn, qrows, _state); ae_vector_set_length(&work, ae_maxint(m, n, _state)+1, _state); ae_vector_set_length(&t, ae_maxint(m, n, _state)+1, _state); ae_vector_set_length(&taubuf, minmn, _state); ae_matrix_set_length(&tmpa, ablasblocksize(a, _state), n, _state); ae_matrix_set_length(&tmpt, ablasblocksize(a, _state), 2*ablasblocksize(a, _state), _state); ae_matrix_set_length(&tmpr, qrows, 2*ablasblocksize(a, _state), _state); ae_matrix_set_length(q, qrows, n, _state); for(i=0; i<=qrows-1; i++) { for(j=0; j<=n-1; j++) { if( i==j ) { q->ptr.pp_double[i][j] = (double)(1); } else { q->ptr.pp_double[i][j] = (double)(0); } } } /* * Blocked code */ blockstart = ablasblocksize(a, _state)*(refcnt/ablasblocksize(a, _state)); blocksize = refcnt-blockstart; while(blockstart>=0) { columnscount = n-blockstart; if( blocksize>0 ) { /* * Copy submatrix */ rmatrixcopy(blocksize, columnscount, a, blockstart, blockstart, &tmpa, 0, 0, _state); ae_v_move(&taubuf.ptr.p_double[0], 1, &tau->ptr.p_double[blockstart], 1, ae_v_len(0,blocksize-1)); /* * Update matrix, choose between: * a) Level 2 algorithm (when the rest of the matrix is small enough) * b) blocked algorithm, see algorithm 5 from 'A storage efficient WY * representation for products of Householder transformations', * by R. Schreiber and C. Van Loan. 
*/ if( qrows>=2*ablasblocksize(a, _state) ) { /* * Prepare block reflector */ ortfac_rmatrixblockreflector(&tmpa, &taubuf, ae_false, columnscount, blocksize, &tmpt, &work, _state); /* * Multiply the rest of A by Q'. * * Q' = E + Y*T'*Y' = E + TmpA'*TmpT'*TmpA */ rmatrixgemm(qrows, blocksize, columnscount, 1.0, q, 0, blockstart, 0, &tmpa, 0, 0, 1, 0.0, &tmpr, 0, 0, _state); rmatrixgemm(qrows, blocksize, blocksize, 1.0, &tmpr, 0, 0, 0, &tmpt, 0, 0, 1, 0.0, &tmpr, 0, blocksize, _state); rmatrixgemm(qrows, columnscount, blocksize, 1.0, &tmpr, 0, blocksize, 0, &tmpa, 0, 0, 0, 1.0, q, 0, blockstart, _state); } else { /* * Level 2 algorithm */ for(i=blocksize-1; i>=0; i--) { ae_v_move(&t.ptr.p_double[1], 1, &tmpa.ptr.pp_double[i][i], 1, ae_v_len(1,columnscount-i)); t.ptr.p_double[1] = (double)(1); applyreflectionfromtheright(q, taubuf.ptr.p_double[i], &t, 0, qrows-1, blockstart+i, n-1, &work, _state); } } } /* * Advance */ blockstart = blockstart-ablasblocksize(a, _state); blocksize = ablasblocksize(a, _state); } ae_frame_leave(_state); } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. *************************************************************************/ void _pexec_rmatrixlqunpackq(/* Real */ ae_matrix* a, ae_int_t m, ae_int_t n, /* Real */ ae_vector* tau, ae_int_t qrows, /* Real */ ae_matrix* q, ae_state *_state) { rmatrixlqunpackq(a,m,n,tau,qrows,q, _state); } /************************************************************************* Unpacking of matrix L from the LQ decomposition of a matrix A Input parameters: A - matrices Q and L in compact form. Output of RMatrixLQ subroutine. M - number of rows in given matrix A. M>=0. N - number of columns in given matrix A. N>=0. Output parameters: L - matrix L, array[0..M-1, 0..N-1]. -- ALGLIB routine -- 17.02.2010 Bochkanov Sergey *************************************************************************/ void rmatrixlqunpackl(/* Real */ ae_matrix* a, ae_int_t m, ae_int_t n, /* Real */ ae_matrix* l, ae_state *_state) { ae_int_t i; ae_int_t k; ae_matrix_clear(l); if( m<=0||n<=0 ) { return; } ae_matrix_set_length(l, m, n, _state); for(i=0; i<=n-1; i++) { l->ptr.pp_double[0][i] = (double)(0); } for(i=1; i<=m-1; i++) { ae_v_move(&l->ptr.pp_double[i][0], 1, &l->ptr.pp_double[0][0], 1, ae_v_len(0,n-1)); } for(i=0; i<=m-1; i++) { k = ae_minint(i, n-1, _state); ae_v_move(&l->ptr.pp_double[i][0], 1, &a->ptr.pp_double[i][0], 1, ae_v_len(0,k)); } } /************************************************************************* Partial unpacking of matrix Q from QR decomposition of a complex matrix A. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multicore support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be: ! * about 2-3x faster than ALGLIB for C++ without MKL ! * about 7-10x faster than "pure C#" edition of ALGLIB ! Difference in performance will be more striking on newer CPU's with ! support for newer SIMD instructions. Generally, MKL accelerates any ! problem whose size is at least 128, with best efficiency achieved for ! N's larger than 512. ! ! 
Commercial edition of ALGLIB also supports multithreaded acceleration ! of this function. We should note that QP decomposition is harder to ! parallelize than, say, matrix-matrix product - this algorithm has ! many internal synchronization points which can not be avoided. However ! parallelism starts to be profitable starting from N=512, achieving ! near-linear speedup for N=4096 or higher. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. Input parameters: A - matrices Q and R in compact form. Output of CMatrixQR subroutine . M - number of rows in matrix A. M>=0. N - number of columns in matrix A. N>=0. Tau - scalar factors which are used to form Q. Output of CMatrixQR subroutine . QColumns - required number of columns in matrix Q. M>=QColumns>=0. Output parameters: Q - first QColumns columns of matrix Q. Array whose index ranges within [0..M-1, 0..QColumns-1]. If QColumns=0, array isn't changed. -- ALGLIB routine -- 17.02.2010 Bochkanov Sergey *************************************************************************/ void cmatrixqrunpackq(/* Complex */ ae_matrix* a, ae_int_t m, ae_int_t n, /* Complex */ ae_vector* tau, ae_int_t qcolumns, /* Complex */ ae_matrix* q, ae_state *_state) { ae_frame _frame_block; ae_vector work; ae_vector t; ae_vector taubuf; ae_int_t minmn; ae_int_t refcnt; ae_matrix tmpa; ae_matrix tmpt; ae_matrix tmpr; ae_int_t blockstart; ae_int_t blocksize; ae_int_t rowscount; ae_int_t i; ae_int_t j; ae_frame_make(_state, &_frame_block); ae_matrix_clear(q); ae_vector_init(&work, 0, DT_COMPLEX, _state); ae_vector_init(&t, 0, DT_COMPLEX, _state); ae_vector_init(&taubuf, 0, DT_COMPLEX, _state); ae_matrix_init(&tmpa, 0, 0, DT_COMPLEX, _state); ae_matrix_init(&tmpt, 0, 0, DT_COMPLEX, _state); ae_matrix_init(&tmpr, 0, 0, DT_COMPLEX, _state); ae_assert(qcolumns<=m, "UnpackQFromQR: QColumns>M!", _state); if( m<=0||n<=0 ) { ae_frame_leave(_state); return; } /* * init */ minmn = ae_minint(m, n, _state); refcnt = ae_minint(minmn, qcolumns, _state); ae_vector_set_length(&work, ae_maxint(m, n, _state)+1, _state); ae_vector_set_length(&t, ae_maxint(m, n, _state)+1, _state); ae_vector_set_length(&taubuf, minmn, _state); ae_matrix_set_length(&tmpa, m, ablascomplexblocksize(a, _state), _state); ae_matrix_set_length(&tmpt, ablascomplexblocksize(a, _state), ablascomplexblocksize(a, _state), _state); ae_matrix_set_length(&tmpr, 2*ablascomplexblocksize(a, _state), qcolumns, _state); ae_matrix_set_length(q, m, qcolumns, _state); for(i=0; i<=m-1; i++) { for(j=0; j<=qcolumns-1; j++) { if( i==j ) { q->ptr.pp_complex[i][j] = ae_complex_from_i(1); } else { q->ptr.pp_complex[i][j] = ae_complex_from_i(0); } } } /* * Blocked code */ blockstart = ablascomplexblocksize(a, _state)*(refcnt/ablascomplexblocksize(a, _state)); blocksize = refcnt-blockstart; while(blockstart>=0) { rowscount = m-blockstart; if( blocksize>0 ) { /* * QR decomposition of submatrix. * Matrix is copied to temporary storage to solve * some TLB issues arising from non-contiguous memory * access pattern. 
*/ cmatrixcopy(rowscount, blocksize, a, blockstart, blockstart, &tmpa, 0, 0, _state); ae_v_cmove(&taubuf.ptr.p_complex[0], 1, &tau->ptr.p_complex[blockstart], 1, "N", ae_v_len(0,blocksize-1)); /* * Update matrix, choose between: * a) Level 2 algorithm (when the rest of the matrix is small enough) * b) blocked algorithm, see algorithm 5 from 'A storage efficient WY * representation for products of Householder transformations', * by R. Schreiber and C. Van Loan. */ if( qcolumns>=2*ablascomplexblocksize(a, _state) ) { /* * Prepare block reflector */ ortfac_cmatrixblockreflector(&tmpa, &taubuf, ae_true, rowscount, blocksize, &tmpt, &work, _state); /* * Multiply the rest of A by Q. * * Q = E + Y*T*Y' = E + TmpA*TmpT*TmpA' */ cmatrixgemm(blocksize, qcolumns, rowscount, ae_complex_from_d(1.0), &tmpa, 0, 0, 2, q, blockstart, 0, 0, ae_complex_from_d(0.0), &tmpr, 0, 0, _state); cmatrixgemm(blocksize, qcolumns, blocksize, ae_complex_from_d(1.0), &tmpt, 0, 0, 0, &tmpr, 0, 0, 0, ae_complex_from_d(0.0), &tmpr, blocksize, 0, _state); cmatrixgemm(rowscount, qcolumns, blocksize, ae_complex_from_d(1.0), &tmpa, 0, 0, 0, &tmpr, blocksize, 0, 0, ae_complex_from_d(1.0), q, blockstart, 0, _state); } else { /* * Level 2 algorithm */ for(i=blocksize-1; i>=0; i--) { ae_v_cmove(&t.ptr.p_complex[1], 1, &tmpa.ptr.pp_complex[i][i], tmpa.stride, "N", ae_v_len(1,rowscount-i)); t.ptr.p_complex[1] = ae_complex_from_i(1); complexapplyreflectionfromtheleft(q, taubuf.ptr.p_complex[i], &t, blockstart+i, m-1, 0, qcolumns-1, &work, _state); } } } /* * Advance */ blockstart = blockstart-ablascomplexblocksize(a, _state); blocksize = ablascomplexblocksize(a, _state); } ae_frame_leave(_state); } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. *************************************************************************/ void _pexec_cmatrixqrunpackq(/* Complex */ ae_matrix* a, ae_int_t m, ae_int_t n, /* Complex */ ae_vector* tau, ae_int_t qcolumns, /* Complex */ ae_matrix* q, ae_state *_state) { cmatrixqrunpackq(a,m,n,tau,qcolumns,q, _state); } /************************************************************************* Unpacking of matrix R from the QR decomposition of a matrix A Input parameters: A - matrices Q and R in compact form. Output of CMatrixQR subroutine. M - number of rows in given matrix A. M>=0. N - number of columns in given matrix A. N>=0. Output parameters: R - matrix R, array[0..M-1, 0..N-1]. -- ALGLIB routine -- 17.02.2010 Bochkanov Sergey *************************************************************************/ void cmatrixqrunpackr(/* Complex */ ae_matrix* a, ae_int_t m, ae_int_t n, /* Complex */ ae_matrix* r, ae_state *_state) { ae_int_t i; ae_int_t k; ae_matrix_clear(r); if( m<=0||n<=0 ) { return; } k = ae_minint(m, n, _state); ae_matrix_set_length(r, m, n, _state); for(i=0; i<=n-1; i++) { r->ptr.pp_complex[0][i] = ae_complex_from_i(0); } for(i=1; i<=m-1; i++) { ae_v_cmove(&r->ptr.pp_complex[i][0], 1, &r->ptr.pp_complex[0][0], 1, "N", ae_v_len(0,n-1)); } for(i=0; i<=k-1; i++) { ae_v_cmove(&r->ptr.pp_complex[i][i], 1, &a->ptr.pp_complex[i][i], 1, "N", ae_v_len(i,n-1)); } } /************************************************************************* Partial unpacking of matrix Q from LQ decomposition of a complex matrix A. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! 
* Intel MKL support (lightweight Intel MKL is shipped with ALGLIB)
!   * multicore support
!
!   Intel MKL gives approximately constant (with respect to number of
!   worker threads) acceleration factor which depends on CPU being used,
!   problem size and "baseline" ALGLIB edition which is used for
!   comparison.
!
!   Say, on SSE2-capable CPU with N=1024, HPC ALGLIB will be:
!   * about 2-3x faster than ALGLIB for C++ without MKL
!   * about 7-10x faster than "pure C#" edition of ALGLIB
!   Difference in performance will be more striking on newer CPUs with
!   support for newer SIMD instructions. Generally, MKL accelerates any
!   problem whose size is at least 128, with best efficiency achieved for
!   N's larger than 512.
!
!   Commercial edition of ALGLIB also supports multithreaded acceleration
!   of this function. We should note that LQ decomposition is harder to
!   parallelize than, say, matrix-matrix product - this algorithm has
!   many internal synchronization points which cannot be avoided. However,
!   parallelism starts to be profitable starting from N=512, achieving
!   near-linear speedup for N=4096 or higher.
!
!   In order to use multicore features you have to:
!   * use commercial version of ALGLIB
!   * call this function with "smp_" prefix, which indicates that
!     multicore code will be used (for multicore support)
!
!   We recommend you to read 'Working with commercial version' section of
!   ALGLIB Reference Manual in order to find out how to use performance-
!   related features provided by commercial edition of ALGLIB.

Input parameters:
    A           -   matrices L and Q in compact form.
                    Output of CMatrixLQ subroutine.
    M           -   number of rows in matrix A. M>=0.
    N           -   number of columns in matrix A. N>=0.
    Tau         -   scalar factors which are used to form Q.
                    Output of CMatrixLQ subroutine.
    QRows       -   required number of rows in matrix Q. N>=QRows>=0.

Output parameters:
    Q           -   first QRows rows of matrix Q.
                    Array whose index ranges within [0..QRows-1, 0..N-1].
                    If QRows=0, the array isn't changed.
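A minimal usage sketch (assumption: the C++ interface wrappers alglib::cmatrixlq,
alglib::cmatrixlqunpackq and alglib::cmatrixlqunpackl, operating on complex_2d_array
and complex_1d_array, are available as documented in the C++ section of this manual):

    alglib::complex_2d_array a;
    a.setlength(2, 3);                              // 2x3 matrix to be factorized as A = L*Q
    a[0][0] = alglib::complex(1,1); a[0][1] = 2.0; a[0][2] = 3.0;
    a[1][0] = 4.0; a[1][1] = alglib::complex(5,-2); a[1][2] = 6.0;
    alglib::complex_1d_array tau;
    alglib::cmatrixlq(a, 2, 3, tau);                // A now holds L and Q in compact form
    alglib::complex_2d_array l, q;
    alglib::cmatrixlqunpackl(a, 2, 3, l);           // 2x3 lower trapezoidal L
    alglib::cmatrixlqunpackq(a, 2, 3, tau, 2, q);   // first 2 rows of the unitary Q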
-- ALGLIB routine -- 17.02.2010 Bochkanov Sergey *************************************************************************/ void cmatrixlqunpackq(/* Complex */ ae_matrix* a, ae_int_t m, ae_int_t n, /* Complex */ ae_vector* tau, ae_int_t qrows, /* Complex */ ae_matrix* q, ae_state *_state) { ae_frame _frame_block; ae_vector work; ae_vector t; ae_vector taubuf; ae_int_t minmn; ae_int_t refcnt; ae_matrix tmpa; ae_matrix tmpt; ae_matrix tmpr; ae_int_t blockstart; ae_int_t blocksize; ae_int_t columnscount; ae_int_t i; ae_int_t j; ae_frame_make(_state, &_frame_block); ae_matrix_clear(q); ae_vector_init(&work, 0, DT_COMPLEX, _state); ae_vector_init(&t, 0, DT_COMPLEX, _state); ae_vector_init(&taubuf, 0, DT_COMPLEX, _state); ae_matrix_init(&tmpa, 0, 0, DT_COMPLEX, _state); ae_matrix_init(&tmpt, 0, 0, DT_COMPLEX, _state); ae_matrix_init(&tmpr, 0, 0, DT_COMPLEX, _state); if( m<=0||n<=0 ) { ae_frame_leave(_state); return; } /* * Init */ minmn = ae_minint(m, n, _state); refcnt = ae_minint(minmn, qrows, _state); ae_vector_set_length(&work, ae_maxint(m, n, _state)+1, _state); ae_vector_set_length(&t, ae_maxint(m, n, _state)+1, _state); ae_vector_set_length(&taubuf, minmn, _state); ae_matrix_set_length(&tmpa, ablascomplexblocksize(a, _state), n, _state); ae_matrix_set_length(&tmpt, ablascomplexblocksize(a, _state), ablascomplexblocksize(a, _state), _state); ae_matrix_set_length(&tmpr, qrows, 2*ablascomplexblocksize(a, _state), _state); ae_matrix_set_length(q, qrows, n, _state); for(i=0; i<=qrows-1; i++) { for(j=0; j<=n-1; j++) { if( i==j ) { q->ptr.pp_complex[i][j] = ae_complex_from_i(1); } else { q->ptr.pp_complex[i][j] = ae_complex_from_i(0); } } } /* * Blocked code */ blockstart = ablascomplexblocksize(a, _state)*(refcnt/ablascomplexblocksize(a, _state)); blocksize = refcnt-blockstart; while(blockstart>=0) { columnscount = n-blockstart; if( blocksize>0 ) { /* * LQ decomposition of submatrix. * Matrix is copied to temporary storage to solve * some TLB issues arising from non-contiguous memory * access pattern. */ cmatrixcopy(blocksize, columnscount, a, blockstart, blockstart, &tmpa, 0, 0, _state); ae_v_cmove(&taubuf.ptr.p_complex[0], 1, &tau->ptr.p_complex[blockstart], 1, "N", ae_v_len(0,blocksize-1)); /* * Update matrix, choose between: * a) Level 2 algorithm (when the rest of the matrix is small enough) * b) blocked algorithm, see algorithm 5 from 'A storage efficient WY * representation for products of Householder transformations', * by R. Schreiber and C. Van Loan. */ if( qrows>=2*ablascomplexblocksize(a, _state) ) { /* * Prepare block reflector */ ortfac_cmatrixblockreflector(&tmpa, &taubuf, ae_false, columnscount, blocksize, &tmpt, &work, _state); /* * Multiply the rest of A by Q'. 
* * Q' = E + Y*T'*Y' = E + TmpA'*TmpT'*TmpA */ cmatrixgemm(qrows, blocksize, columnscount, ae_complex_from_d(1.0), q, 0, blockstart, 0, &tmpa, 0, 0, 2, ae_complex_from_d(0.0), &tmpr, 0, 0, _state); cmatrixgemm(qrows, blocksize, blocksize, ae_complex_from_d(1.0), &tmpr, 0, 0, 0, &tmpt, 0, 0, 2, ae_complex_from_d(0.0), &tmpr, 0, blocksize, _state); cmatrixgemm(qrows, columnscount, blocksize, ae_complex_from_d(1.0), &tmpr, 0, blocksize, 0, &tmpa, 0, 0, 0, ae_complex_from_d(1.0), q, 0, blockstart, _state); } else { /* * Level 2 algorithm */ for(i=blocksize-1; i>=0; i--) { ae_v_cmove(&t.ptr.p_complex[1], 1, &tmpa.ptr.pp_complex[i][i], 1, "Conj", ae_v_len(1,columnscount-i)); t.ptr.p_complex[1] = ae_complex_from_i(1); complexapplyreflectionfromtheright(q, ae_c_conj(taubuf.ptr.p_complex[i], _state), &t, 0, qrows-1, blockstart+i, n-1, &work, _state); } } } /* * Advance */ blockstart = blockstart-ablascomplexblocksize(a, _state); blocksize = ablascomplexblocksize(a, _state); } ae_frame_leave(_state); } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. *************************************************************************/ void _pexec_cmatrixlqunpackq(/* Complex */ ae_matrix* a, ae_int_t m, ae_int_t n, /* Complex */ ae_vector* tau, ae_int_t qrows, /* Complex */ ae_matrix* q, ae_state *_state) { cmatrixlqunpackq(a,m,n,tau,qrows,q, _state); } /************************************************************************* Unpacking of matrix L from the LQ decomposition of a matrix A Input parameters: A - matrices Q and L in compact form. Output of CMatrixLQ subroutine. M - number of rows in given matrix A. M>=0. N - number of columns in given matrix A. N>=0. Output parameters: L - matrix L, array[0..M-1, 0..N-1]. -- ALGLIB routine -- 17.02.2010 Bochkanov Sergey *************************************************************************/ void cmatrixlqunpackl(/* Complex */ ae_matrix* a, ae_int_t m, ae_int_t n, /* Complex */ ae_matrix* l, ae_state *_state) { ae_int_t i; ae_int_t k; ae_matrix_clear(l); if( m<=0||n<=0 ) { return; } ae_matrix_set_length(l, m, n, _state); for(i=0; i<=n-1; i++) { l->ptr.pp_complex[0][i] = ae_complex_from_i(0); } for(i=1; i<=m-1; i++) { ae_v_cmove(&l->ptr.pp_complex[i][0], 1, &l->ptr.pp_complex[0][0], 1, "N", ae_v_len(0,n-1)); } for(i=0; i<=m-1; i++) { k = ae_minint(i, n-1, _state); ae_v_cmove(&l->ptr.pp_complex[i][0], 1, &a->ptr.pp_complex[i][0], 1, "N", ae_v_len(0,k)); } } /************************************************************************* Base case for real QR -- LAPACK routine (version 3.0) -- Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., Courant Institute, Argonne National Lab, and Rice University September 30, 1994. Sergey Bochkanov, ALGLIB project, translation from FORTRAN to pseudocode, 2007-2010. 
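(Unblocked algorithm: for I=0..Min(M,N)-1 an elementary reflector H(I) is generated from the
column A[I..M-1,I] and applied from the left to the trailing columns A[I..M-1,I+1..N-1]; the
scalar factor of H(I) is stored in Tau[I].)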
*************************************************************************/ void rmatrixqrbasecase(/* Real */ ae_matrix* a, ae_int_t m, ae_int_t n, /* Real */ ae_vector* work, /* Real */ ae_vector* t, /* Real */ ae_vector* tau, ae_state *_state) { ae_int_t i; ae_int_t k; ae_int_t minmn; double tmp; minmn = ae_minint(m, n, _state); /* * Test the input arguments */ k = minmn; for(i=0; i<=k-1; i++) { /* * Generate elementary reflector H(i) to annihilate A(i+1:m,i) */ ae_v_move(&t->ptr.p_double[1], 1, &a->ptr.pp_double[i][i], a->stride, ae_v_len(1,m-i)); generatereflection(t, m-i, &tmp, _state); tau->ptr.p_double[i] = tmp; ae_v_move(&a->ptr.pp_double[i][i], a->stride, &t->ptr.p_double[1], 1, ae_v_len(i,m-1)); t->ptr.p_double[1] = (double)(1); if( iptr.p_double[i], t, i, m-1, i+1, n-1, work, _state); } } } /************************************************************************* Base case for real LQ -- LAPACK routine (version 3.0) -- Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., Courant Institute, Argonne National Lab, and Rice University September 30, 1994. Sergey Bochkanov, ALGLIB project, translation from FORTRAN to pseudocode, 2007-2010. *************************************************************************/ void rmatrixlqbasecase(/* Real */ ae_matrix* a, ae_int_t m, ae_int_t n, /* Real */ ae_vector* work, /* Real */ ae_vector* t, /* Real */ ae_vector* tau, ae_state *_state) { ae_int_t i; ae_int_t k; double tmp; k = ae_minint(m, n, _state); for(i=0; i<=k-1; i++) { /* * Generate elementary reflector H(i) to annihilate A(i,i+1:n-1) */ ae_v_move(&t->ptr.p_double[1], 1, &a->ptr.pp_double[i][i], 1, ae_v_len(1,n-i)); generatereflection(t, n-i, &tmp, _state); tau->ptr.p_double[i] = tmp; ae_v_move(&a->ptr.pp_double[i][i], 1, &t->ptr.p_double[1], 1, ae_v_len(i,n-1)); t->ptr.p_double[1] = (double)(1); if( iptr.p_double[i], t, i+1, m-1, i, n-1, work, _state); } } } /************************************************************************* Reduction of a rectangular matrix to bidiagonal form The algorithm reduces the rectangular matrix A to bidiagonal form by orthogonal transformations P and Q: A = Q*B*(P^T). COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes one important improvement of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Multithreaded acceleration is NOT supported for this function because ! bidiagonal decompostion is inherently sequential in nature. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. Input parameters: A - source matrix. array[0..M-1, 0..N-1] M - number of rows in matrix A. N - number of columns in matrix A. Output parameters: A - matrices Q, B, P in compact form (see below). TauQ - scalar factors which are used to form matrix Q. TauP - scalar factors which are used to form matrix P. The main diagonal and one of the secondary diagonals of matrix A are replaced with bidiagonal matrix B. Other elements contain elementary reflections which form MxM matrix Q and NxN matrix P, respectively. If M>=N, B is the upper bidiagonal MxN matrix and is stored in the corresponding elements of matrix A. 
Matrix Q is represented as a product of elementary reflections Q = H(0)*H(1)*...*H(n-1), where H(i) = 1-tau*v*v'. Here tau is a scalar which is stored in TauQ[i], and vector v has the following structure: v(0:i-1)=0, v(i)=1, v(i+1:m-1) is stored in elements A(i+1:m-1,i). Matrix P is as follows: P = G(0)*G(1)*...*G(n-2), where G(i) = 1 - tau*u*u'. Tau is stored in TauP[i], u(0:i)=0, u(i+1)=1, u(i+2:n-1) is stored in elements A(i,i+2:n-1). If M n): m=5, n=6 (m < n): ( d e u1 u1 u1 ) ( d u1 u1 u1 u1 u1 ) ( v1 d e u2 u2 ) ( e d u2 u2 u2 u2 ) ( v1 v2 d e u3 ) ( v1 e d u3 u3 u3 ) ( v1 v2 v3 d e ) ( v1 v2 e d u4 u4 ) ( v1 v2 v3 v4 d ) ( v1 v2 v3 e d u5 ) ( v1 v2 v3 v4 v5 ) Here vi and ui are vectors which form H(i) and G(i), and d and e - are the diagonal and off-diagonal elements of matrix B. -- LAPACK routine (version 3.0) -- Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., Courant Institute, Argonne National Lab, and Rice University September 30, 1994. Sergey Bochkanov, ALGLIB project, translation from FORTRAN to pseudocode, 2007-2010. *************************************************************************/ void rmatrixbd(/* Real */ ae_matrix* a, ae_int_t m, ae_int_t n, /* Real */ ae_vector* tauq, /* Real */ ae_vector* taup, ae_state *_state) { ae_frame _frame_block; ae_vector work; ae_vector t; ae_int_t maxmn; ae_int_t i; double ltau; ae_frame_make(_state, &_frame_block); ae_vector_clear(tauq); ae_vector_clear(taup); ae_vector_init(&work, 0, DT_REAL, _state); ae_vector_init(&t, 0, DT_REAL, _state); /* * Prepare */ if( n<=0||m<=0 ) { ae_frame_leave(_state); return; } maxmn = ae_maxint(m, n, _state); ae_vector_set_length(&work, maxmn+1, _state); ae_vector_set_length(&t, maxmn+1, _state); if( m>=n ) { ae_vector_set_length(tauq, n, _state); ae_vector_set_length(taup, n, _state); for(i=0; i<=n-1; i++) { tauq->ptr.p_double[i] = 0.0; taup->ptr.p_double[i] = 0.0; } } else { ae_vector_set_length(tauq, m, _state); ae_vector_set_length(taup, m, _state); for(i=0; i<=m-1; i++) { tauq->ptr.p_double[i] = 0.0; taup->ptr.p_double[i] = 0.0; } } /* * Try to use MKL code * * NOTE: buffers Work[] and T[] are used for temporary storage of diagonals; * because they are present in A[], we do not use them. 
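* (If rmatrixbdmkl() below performs the reduction, the function returns immediately and the
*  generic ALGLIB code further down is skipped.)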
*/ if( rmatrixbdmkl(a, m, n, &work, &t, tauq, taup, _state) ) { ae_frame_leave(_state); return; } /* * ALGLIB code */ if( m>=n ) { /* * Reduce to upper bidiagonal form */ for(i=0; i<=n-1; i++) { /* * Generate elementary reflector H(i) to annihilate A(i+1:m-1,i) */ ae_v_move(&t.ptr.p_double[1], 1, &a->ptr.pp_double[i][i], a->stride, ae_v_len(1,m-i)); generatereflection(&t, m-i, <au, _state); tauq->ptr.p_double[i] = ltau; ae_v_move(&a->ptr.pp_double[i][i], a->stride, &t.ptr.p_double[1], 1, ae_v_len(i,m-1)); t.ptr.p_double[1] = (double)(1); /* * Apply H(i) to A(i:m-1,i+1:n-1) from the left */ applyreflectionfromtheleft(a, ltau, &t, i, m-1, i+1, n-1, &work, _state); if( iptr.pp_double[i][i+1], 1, ae_v_len(1,n-i-1)); generatereflection(&t, n-1-i, <au, _state); taup->ptr.p_double[i] = ltau; ae_v_move(&a->ptr.pp_double[i][i+1], 1, &t.ptr.p_double[1], 1, ae_v_len(i+1,n-1)); t.ptr.p_double[1] = (double)(1); /* * Apply G(i) to A(i+1:m-1,i+1:n-1) from the right */ applyreflectionfromtheright(a, ltau, &t, i+1, m-1, i+1, n-1, &work, _state); } else { taup->ptr.p_double[i] = (double)(0); } } } else { /* * Reduce to lower bidiagonal form */ for(i=0; i<=m-1; i++) { /* * Generate elementary reflector G(i) to annihilate A(i,i+1:n-1) */ ae_v_move(&t.ptr.p_double[1], 1, &a->ptr.pp_double[i][i], 1, ae_v_len(1,n-i)); generatereflection(&t, n-i, <au, _state); taup->ptr.p_double[i] = ltau; ae_v_move(&a->ptr.pp_double[i][i], 1, &t.ptr.p_double[1], 1, ae_v_len(i,n-1)); t.ptr.p_double[1] = (double)(1); /* * Apply G(i) to A(i+1:m-1,i:n-1) from the right */ applyreflectionfromtheright(a, ltau, &t, i+1, m-1, i, n-1, &work, _state); if( iptr.pp_double[i+1][i], a->stride, ae_v_len(1,m-1-i)); generatereflection(&t, m-1-i, <au, _state); tauq->ptr.p_double[i] = ltau; ae_v_move(&a->ptr.pp_double[i+1][i], a->stride, &t.ptr.p_double[1], 1, ae_v_len(i+1,m-1)); t.ptr.p_double[1] = (double)(1); /* * Apply H(i) to A(i+1:m-1,i+1:n-1) from the left */ applyreflectionfromtheleft(a, ltau, &t, i+1, m-1, i+1, n-1, &work, _state); } else { tauq->ptr.p_double[i] = (double)(0); } } } ae_frame_leave(_state); } /************************************************************************* Unpacking matrix Q which reduces a matrix to bidiagonal form. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes one important improvement of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Multithreaded acceleration is NOT supported for this function. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. Input parameters: QP - matrices Q and P in compact form. Output of ToBidiagonal subroutine. M - number of rows in matrix A. N - number of columns in matrix A. TAUQ - scalar factors which are used to form Q. Output of ToBidiagonal subroutine. QColumns - required number of columns in matrix Q. M>=QColumns>=0. Output parameters: Q - first QColumns columns of matrix Q. Array[0..M-1, 0..QColumns-1] If QColumns=0, the array is not modified. 
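A minimal usage sketch (assumption: the C++ interface wrappers alglib::rmatrixbd and
alglib::rmatrixbdunpackq are available as documented in the C++ section of this manual):

    alglib::real_2d_array a = "[[1,2],[3,4],[5,6]]";   // 3x2 matrix, M>=N, so B is upper bidiagonal
    alglib::real_1d_array tauq, taup;
    alglib::rmatrixbd(a, 3, 2, tauq, taup);             // A now holds B, Q and P in compact form
    alglib::real_2d_array q;
    alglib::rmatrixbdunpackq(a, 3, 2, tauq, 2, q);      // first 2 columns of the 3x3 matrix Q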
-- ALGLIB -- 2005-2010 Bochkanov Sergey *************************************************************************/ void rmatrixbdunpackq(/* Real */ ae_matrix* qp, ae_int_t m, ae_int_t n, /* Real */ ae_vector* tauq, ae_int_t qcolumns, /* Real */ ae_matrix* q, ae_state *_state) { ae_int_t i; ae_int_t j; ae_matrix_clear(q); ae_assert(qcolumns<=m, "RMatrixBDUnpackQ: QColumns>M!", _state); ae_assert(qcolumns>=0, "RMatrixBDUnpackQ: QColumns<0!", _state); if( (m==0||n==0)||qcolumns==0 ) { return; } /* * prepare Q */ ae_matrix_set_length(q, m, qcolumns, _state); for(i=0; i<=m-1; i++) { for(j=0; j<=qcolumns-1; j++) { if( i==j ) { q->ptr.pp_double[i][j] = (double)(1); } else { q->ptr.pp_double[i][j] = (double)(0); } } } /* * Calculate */ rmatrixbdmultiplybyq(qp, m, n, tauq, q, m, qcolumns, ae_false, ae_false, _state); } /************************************************************************* Multiplication by matrix Q which reduces matrix A to bidiagonal form. The algorithm allows pre- or post-multiply by Q or Q'. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes one important improvement of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Multithreaded acceleration is NOT supported for this function. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. Input parameters: QP - matrices Q and P in compact form. Output of ToBidiagonal subroutine. M - number of rows in matrix A. N - number of columns in matrix A. TAUQ - scalar factors which are used to form Q. Output of ToBidiagonal subroutine. Z - multiplied matrix. array[0..ZRows-1,0..ZColumns-1] ZRows - number of rows in matrix Z. If FromTheRight=False, ZRows=M, otherwise ZRows can be arbitrary. ZColumns - number of columns in matrix Z. If FromTheRight=True, ZColumns=M, otherwise ZColumns can be arbitrary. FromTheRight - pre- or post-multiply. DoTranspose - multiply by Q or Q'. Output parameters: Z - product of Z and Q. Array[0..ZRows-1,0..ZColumns-1] If ZRows=0 or ZColumns=0, the array is not modified. 
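A minimal usage sketch (assumption: the C++ interface wrapper alglib::rmatrixbdmultiplybyq is
available with the parameter order shown in this listing); it applies Q' to a column vector
without forming Q explicitly:

    alglib::real_2d_array a = "[[1,2],[3,4],[5,6]]";
    alglib::real_1d_array tauq, taup;
    alglib::rmatrixbd(a, 3, 2, tauq, taup);
    alglib::real_2d_array z = "[[1],[0],[0]]";          // 3x1; ZRows must equal M because FromTheRight=false
    alglib::rmatrixbdmultiplybyq(a, 3, 2, tauq, z, 3, 1, false, true);   // Z := Q'*Z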
-- ALGLIB -- 2005-2010 Bochkanov Sergey *************************************************************************/ void rmatrixbdmultiplybyq(/* Real */ ae_matrix* qp, ae_int_t m, ae_int_t n, /* Real */ ae_vector* tauq, /* Real */ ae_matrix* z, ae_int_t zrows, ae_int_t zcolumns, ae_bool fromtheright, ae_bool dotranspose, ae_state *_state) { ae_frame _frame_block; ae_int_t i; ae_int_t i1; ae_int_t i2; ae_int_t istep; ae_vector v; ae_vector work; ae_vector dummy; ae_int_t mx; ae_frame_make(_state, &_frame_block); ae_vector_init(&v, 0, DT_REAL, _state); ae_vector_init(&work, 0, DT_REAL, _state); ae_vector_init(&dummy, 0, DT_REAL, _state); if( ((m<=0||n<=0)||zrows<=0)||zcolumns<=0 ) { ae_frame_leave(_state); return; } ae_assert((fromtheright&&zcolumns==m)||(!fromtheright&&zrows==m), "RMatrixBDMultiplyByQ: incorrect Z size!", _state); /* * Try to use MKL code */ if( rmatrixbdmultiplybymkl(qp, m, n, tauq, &dummy, z, zrows, zcolumns, ae_true, fromtheright, dotranspose, _state) ) { ae_frame_leave(_state); return; } /* * init */ mx = ae_maxint(m, n, _state); mx = ae_maxint(mx, zrows, _state); mx = ae_maxint(mx, zcolumns, _state); ae_vector_set_length(&v, mx+1, _state); ae_vector_set_length(&work, mx+1, _state); if( m>=n ) { /* * setup */ if( fromtheright ) { i1 = 0; i2 = n-1; istep = 1; } else { i1 = n-1; i2 = 0; istep = -1; } if( dotranspose ) { i = i1; i1 = i2; i2 = i; istep = -istep; } /* * Process */ i = i1; do { ae_v_move(&v.ptr.p_double[1], 1, &qp->ptr.pp_double[i][i], qp->stride, ae_v_len(1,m-i)); v.ptr.p_double[1] = (double)(1); if( fromtheright ) { applyreflectionfromtheright(z, tauq->ptr.p_double[i], &v, 0, zrows-1, i, m-1, &work, _state); } else { applyreflectionfromtheleft(z, tauq->ptr.p_double[i], &v, i, m-1, 0, zcolumns-1, &work, _state); } i = i+istep; } while(i!=i2+istep); } else { /* * setup */ if( fromtheright ) { i1 = 0; i2 = m-2; istep = 1; } else { i1 = m-2; i2 = 0; istep = -1; } if( dotranspose ) { i = i1; i1 = i2; i2 = i; istep = -istep; } /* * Process */ if( m-1>0 ) { i = i1; do { ae_v_move(&v.ptr.p_double[1], 1, &qp->ptr.pp_double[i+1][i], qp->stride, ae_v_len(1,m-i-1)); v.ptr.p_double[1] = (double)(1); if( fromtheright ) { applyreflectionfromtheright(z, tauq->ptr.p_double[i], &v, 0, zrows-1, i+1, m-1, &work, _state); } else { applyreflectionfromtheleft(z, tauq->ptr.p_double[i], &v, i+1, m-1, 0, zcolumns-1, &work, _state); } i = i+istep; } while(i!=i2+istep); } } ae_frame_leave(_state); } /************************************************************************* Unpacking matrix P which reduces matrix A to bidiagonal form. The subroutine returns transposed matrix P. Input parameters: QP - matrices Q and P in compact form. Output of ToBidiagonal subroutine. M - number of rows in matrix A. N - number of columns in matrix A. TAUP - scalar factors which are used to form P. Output of ToBidiagonal subroutine. PTRows - required number of rows of matrix P^T. N >= PTRows >= 0. Output parameters: PT - first PTRows columns of matrix P^T Array[0..PTRows-1, 0..N-1] If PTRows=0, the array is not modified. 
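A minimal usage sketch (assumption: the C++ interface wrappers alglib::rmatrixbdunpackpt and
alglib::rmatrixbdunpackdiagonals are available as documented in the C++ section of this manual):

    alglib::real_2d_array a = "[[1,2],[3,4],[5,6]]";
    alglib::real_1d_array tauq, taup;
    alglib::rmatrixbd(a, 3, 2, tauq, taup);
    alglib::real_2d_array pt;
    alglib::rmatrixbdunpackpt(a, 3, 2, taup, 2, pt);    // P^T, here the full 2x2 matrix
    bool isupper;
    alglib::real_1d_array d, e;
    alglib::rmatrixbdunpackdiagonals(a, 3, 2, isupper, d, e);  // IsUpper=true since M>=N; D and E hold the diagonals of B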
-- ALGLIB -- 2005-2010 Bochkanov Sergey *************************************************************************/ void rmatrixbdunpackpt(/* Real */ ae_matrix* qp, ae_int_t m, ae_int_t n, /* Real */ ae_vector* taup, ae_int_t ptrows, /* Real */ ae_matrix* pt, ae_state *_state) { ae_int_t i; ae_int_t j; ae_matrix_clear(pt); ae_assert(ptrows<=n, "RMatrixBDUnpackPT: PTRows>N!", _state); ae_assert(ptrows>=0, "RMatrixBDUnpackPT: PTRows<0!", _state); if( (m==0||n==0)||ptrows==0 ) { return; } /* * prepare PT */ ae_matrix_set_length(pt, ptrows, n, _state); for(i=0; i<=ptrows-1; i++) { for(j=0; j<=n-1; j++) { if( i==j ) { pt->ptr.pp_double[i][j] = (double)(1); } else { pt->ptr.pp_double[i][j] = (double)(0); } } } /* * Calculate */ rmatrixbdmultiplybyp(qp, m, n, taup, pt, ptrows, n, ae_true, ae_true, _state); } /************************************************************************* Multiplication by matrix P which reduces matrix A to bidiagonal form. The algorithm allows pre- or post-multiply by P or P'. Input parameters: QP - matrices Q and P in compact form. Output of RMatrixBD subroutine. M - number of rows in matrix A. N - number of columns in matrix A. TAUP - scalar factors which are used to form P. Output of RMatrixBD subroutine. Z - multiplied matrix. Array whose indexes range within [0..ZRows-1,0..ZColumns-1]. ZRows - number of rows in matrix Z. If FromTheRight=False, ZRows=N, otherwise ZRows can be arbitrary. ZColumns - number of columns in matrix Z. If FromTheRight=True, ZColumns=N, otherwise ZColumns can be arbitrary. FromTheRight - pre- or post-multiply. DoTranspose - multiply by P or P'. Output parameters: Z - product of Z and P. Array whose indexes range within [0..ZRows-1,0..ZColumns-1]. If ZRows=0 or ZColumns=0, the array is not modified. -- ALGLIB -- 2005-2010 Bochkanov Sergey *************************************************************************/ void rmatrixbdmultiplybyp(/* Real */ ae_matrix* qp, ae_int_t m, ae_int_t n, /* Real */ ae_vector* taup, /* Real */ ae_matrix* z, ae_int_t zrows, ae_int_t zcolumns, ae_bool fromtheright, ae_bool dotranspose, ae_state *_state) { ae_frame _frame_block; ae_int_t i; ae_vector v; ae_vector work; ae_vector dummy; ae_int_t mx; ae_int_t i1; ae_int_t i2; ae_int_t istep; ae_frame_make(_state, &_frame_block); ae_vector_init(&v, 0, DT_REAL, _state); ae_vector_init(&work, 0, DT_REAL, _state); ae_vector_init(&dummy, 0, DT_REAL, _state); if( ((m<=0||n<=0)||zrows<=0)||zcolumns<=0 ) { ae_frame_leave(_state); return; } ae_assert((fromtheright&&zcolumns==n)||(!fromtheright&&zrows==n), "RMatrixBDMultiplyByP: incorrect Z size!", _state); /* * init */ mx = ae_maxint(m, n, _state); mx = ae_maxint(mx, zrows, _state); mx = ae_maxint(mx, zcolumns, _state); ae_vector_set_length(&v, mx+1, _state); ae_vector_set_length(&work, mx+1, _state); if( m>=n ) { /* * setup */ if( fromtheright ) { i1 = n-2; i2 = 0; istep = -1; } else { i1 = 0; i2 = n-2; istep = 1; } if( !dotranspose ) { i = i1; i1 = i2; i2 = i; istep = -istep; } /* * Process */ if( n-1>0 ) { i = i1; do { ae_v_move(&v.ptr.p_double[1], 1, &qp->ptr.pp_double[i][i+1], 1, ae_v_len(1,n-1-i)); v.ptr.p_double[1] = (double)(1); if( fromtheright ) { applyreflectionfromtheright(z, taup->ptr.p_double[i], &v, 0, zrows-1, i+1, n-1, &work, _state); } else { applyreflectionfromtheleft(z, taup->ptr.p_double[i], &v, i+1, n-1, 0, zcolumns-1, &work, _state); } i = i+istep; } while(i!=i2+istep); } } else { /* * setup */ if( fromtheright ) { i1 = m-1; i2 = 0; istep = -1; } else { i1 = 0; i2 = m-1; istep = 1; } if( !dotranspose ) 
{ i = i1; i1 = i2; i2 = i; istep = -istep; } /* * Process */ i = i1; do { ae_v_move(&v.ptr.p_double[1], 1, &qp->ptr.pp_double[i][i], 1, ae_v_len(1,n-i)); v.ptr.p_double[1] = (double)(1); if( fromtheright ) { applyreflectionfromtheright(z, taup->ptr.p_double[i], &v, 0, zrows-1, i, n-1, &work, _state); } else { applyreflectionfromtheleft(z, taup->ptr.p_double[i], &v, i, n-1, 0, zcolumns-1, &work, _state); } i = i+istep; } while(i!=i2+istep); } ae_frame_leave(_state); } /************************************************************************* Unpacking of the main and secondary diagonals of bidiagonal decomposition of matrix A. Input parameters: B - output of RMatrixBD subroutine. M - number of rows in matrix B. N - number of columns in matrix B. Output parameters: IsUpper - True, if the matrix is upper bidiagonal. otherwise IsUpper is False. D - the main diagonal. Array whose index ranges within [0..Min(M,N)-1]. E - the secondary diagonal (upper or lower, depending on the value of IsUpper). Array index ranges within [0..Min(M,N)-1], the last element is not used. -- ALGLIB -- 2005-2010 Bochkanov Sergey *************************************************************************/ void rmatrixbdunpackdiagonals(/* Real */ ae_matrix* b, ae_int_t m, ae_int_t n, ae_bool* isupper, /* Real */ ae_vector* d, /* Real */ ae_vector* e, ae_state *_state) { ae_int_t i; *isupper = ae_false; ae_vector_clear(d); ae_vector_clear(e); *isupper = m>=n; if( m<=0||n<=0 ) { return; } if( *isupper ) { ae_vector_set_length(d, n, _state); ae_vector_set_length(e, n, _state); for(i=0; i<=n-2; i++) { d->ptr.p_double[i] = b->ptr.pp_double[i][i]; e->ptr.p_double[i] = b->ptr.pp_double[i][i+1]; } d->ptr.p_double[n-1] = b->ptr.pp_double[n-1][n-1]; } else { ae_vector_set_length(d, m, _state); ae_vector_set_length(e, m, _state); for(i=0; i<=m-2; i++) { d->ptr.p_double[i] = b->ptr.pp_double[i][i]; e->ptr.p_double[i] = b->ptr.pp_double[i+1][i]; } d->ptr.p_double[m-1] = b->ptr.pp_double[m-1][m-1]; } } /************************************************************************* Reduction of a square matrix to upper Hessenberg form: Q'*A*Q = H, where Q is an orthogonal matrix, H - Hessenberg matrix. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes one important improvement of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Generally, commercial ALGLIB is several times faster than open-source ! generic C edition, and many times faster than open-source C# edition. ! ! Multithreaded acceleration is NOT supported for this function. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. Input parameters: A - matrix A with elements [0..N-1, 0..N-1] N - size of matrix A. Output parameters: A - matrices Q and P in compact form (see below). Tau - array of scalar factors which are used to form matrix Q. Array whose index ranges within [0..N-2] Matrix H is located on the main diagonal, on the lower secondary diagonal and above the main diagonal of matrix A. 
The elements which are used to form matrix Q are situated in array Tau and below the lower secondary diagonal of matrix A as follows: Matrix Q is represented as a product of elementary reflections Q = H(0)*H(2)*...*H(n-2), where each H(i) is given by H(i) = 1 - tau * v * (v^T) where tau is a scalar stored in Tau[I]; v - is a real vector, so that v(0:i) = 0, v(i+1) = 1, v(i+2:n-1) stored in A(i+2:n-1,i). -- LAPACK routine (version 3.0) -- Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., Courant Institute, Argonne National Lab, and Rice University October 31, 1992 *************************************************************************/ void rmatrixhessenberg(/* Real */ ae_matrix* a, ae_int_t n, /* Real */ ae_vector* tau, ae_state *_state) { ae_frame _frame_block; ae_int_t i; double v; ae_vector t; ae_vector work; ae_frame_make(_state, &_frame_block); ae_vector_clear(tau); ae_vector_init(&t, 0, DT_REAL, _state); ae_vector_init(&work, 0, DT_REAL, _state); ae_assert(n>=0, "RMatrixHessenberg: incorrect N!", _state); /* * Quick return if possible */ if( n<=1 ) { ae_frame_leave(_state); return; } /* * Allocate place */ ae_vector_set_length(tau, n-2+1, _state); ae_vector_set_length(&t, n+1, _state); ae_vector_set_length(&work, n-1+1, _state); /* * MKL version */ if( rmatrixhessenbergmkl(a, n, tau, _state) ) { ae_frame_leave(_state); return; } /* * ALGLIB version */ for(i=0; i<=n-2; i++) { /* * Compute elementary reflector H(i) to annihilate A(i+2:ihi,i) */ ae_v_move(&t.ptr.p_double[1], 1, &a->ptr.pp_double[i+1][i], a->stride, ae_v_len(1,n-i-1)); generatereflection(&t, n-i-1, &v, _state); ae_v_move(&a->ptr.pp_double[i+1][i], a->stride, &t.ptr.p_double[1], 1, ae_v_len(i+1,n-1)); tau->ptr.p_double[i] = v; t.ptr.p_double[1] = (double)(1); /* * Apply H(i) to A(1:ihi,i+1:ihi) from the right */ applyreflectionfromtheright(a, v, &t, 0, n-1, i+1, n-1, &work, _state); /* * Apply H(i) to A(i+1:ihi,i+1:n) from the left */ applyreflectionfromtheleft(a, v, &t, i+1, n-1, i+1, n-1, &work, _state); } ae_frame_leave(_state); } /************************************************************************* Unpacking matrix Q which reduces matrix A to upper Hessenberg form COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes one important improvement of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Generally, commercial ALGLIB is several times faster than open-source ! generic C edition, and many times faster than open-source C# edition. ! ! Multithreaded acceleration is NOT supported for this function. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. Input parameters: A - output of RMatrixHessenberg subroutine. N - size of matrix A. Tau - scalar factors which are used to form Q. Output of RMatrixHessenberg subroutine. Output parameters: Q - matrix Q. Array whose indexes range within [0..N-1, 0..N-1]. 
-- ALGLIB -- 2005-2010 Bochkanov Sergey *************************************************************************/ void rmatrixhessenbergunpackq(/* Real */ ae_matrix* a, ae_int_t n, /* Real */ ae_vector* tau, /* Real */ ae_matrix* q, ae_state *_state) { ae_frame _frame_block; ae_int_t i; ae_int_t j; ae_vector v; ae_vector work; ae_frame_make(_state, &_frame_block); ae_matrix_clear(q); ae_vector_init(&v, 0, DT_REAL, _state); ae_vector_init(&work, 0, DT_REAL, _state); if( n==0 ) { ae_frame_leave(_state); return; } /* * init */ ae_matrix_set_length(q, n-1+1, n-1+1, _state); ae_vector_set_length(&v, n-1+1, _state); ae_vector_set_length(&work, n-1+1, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { if( i==j ) { q->ptr.pp_double[i][j] = (double)(1); } else { q->ptr.pp_double[i][j] = (double)(0); } } } /* * MKL version */ if( rmatrixhessenbergunpackqmkl(a, n, tau, q, _state) ) { ae_frame_leave(_state); return; } /* * ALGLIB version: unpack Q */ for(i=0; i<=n-2; i++) { /* * Apply H(i) */ ae_v_move(&v.ptr.p_double[1], 1, &a->ptr.pp_double[i+1][i], a->stride, ae_v_len(1,n-i-1)); v.ptr.p_double[1] = (double)(1); applyreflectionfromtheright(q, tau->ptr.p_double[i], &v, 0, n-1, i+1, n-1, &work, _state); } ae_frame_leave(_state); } /************************************************************************* Unpacking matrix H (the result of matrix A reduction to upper Hessenberg form) Input parameters: A - output of RMatrixHessenberg subroutine. N - size of matrix A. Output parameters: H - matrix H. Array whose indexes range within [0..N-1, 0..N-1]. -- ALGLIB -- 2005-2010 Bochkanov Sergey *************************************************************************/ void rmatrixhessenbergunpackh(/* Real */ ae_matrix* a, ae_int_t n, /* Real */ ae_matrix* h, ae_state *_state) { ae_frame _frame_block; ae_int_t i; ae_int_t j; ae_vector v; ae_vector work; ae_frame_make(_state, &_frame_block); ae_matrix_clear(h); ae_vector_init(&v, 0, DT_REAL, _state); ae_vector_init(&work, 0, DT_REAL, _state); if( n==0 ) { ae_frame_leave(_state); return; } ae_matrix_set_length(h, n-1+1, n-1+1, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=i-2; j++) { h->ptr.pp_double[i][j] = (double)(0); } j = ae_maxint(0, i-1, _state); ae_v_move(&h->ptr.pp_double[i][j], 1, &a->ptr.pp_double[i][j], 1, ae_v_len(j,n-1)); } ae_frame_leave(_state); } /************************************************************************* Reduction of a symmetric matrix which is given by its higher or lower triangular part to a tridiagonal matrix using orthogonal similarity transformation: Q'*A*Q=T. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes one important improvement of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Generally, commercial ALGLIB is several times faster than open-source ! generic C edition, and many times faster than open-source C# edition. ! ! Multithreaded acceleration is NOT supported for this function. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. Input parameters: A - matrix to be transformed array with elements [0..N-1, 0..N-1]. N - size of matrix A. 
IsUpper - storage format. If IsUpper = True, then matrix A is given by its upper triangle, and the lower triangle is not used and not modified by the algorithm, and vice versa if IsUpper = False. Output parameters: A - matrices T and Q in compact form (see lower) Tau - array of factors which are forming matrices H(i) array with elements [0..N-2]. D - main diagonal of symmetric matrix T. array with elements [0..N-1]. E - secondary diagonal of symmetric matrix T. array with elements [0..N-2]. If IsUpper=True, the matrix Q is represented as a product of elementary reflectors Q = H(n-2) . . . H(2) H(0). Each H(i) has the form H(i) = I - tau * v * v' where tau is a real scalar, and v is a real vector with v(i+1:n-1) = 0, v(i) = 1, v(0:i-1) is stored on exit in A(0:i-1,i+1), and tau in TAU(i). If IsUpper=False, the matrix Q is represented as a product of elementary reflectors Q = H(0) H(2) . . . H(n-2). Each H(i) has the form H(i) = I - tau * v * v' where tau is a real scalar, and v is a real vector with v(0:i) = 0, v(i+1) = 1, v(i+2:n-1) is stored on exit in A(i+2:n-1,i), and tau in TAU(i). The contents of A on exit are illustrated by the following examples with n = 5: if UPLO = 'U': if UPLO = 'L': ( d e v1 v2 v3 ) ( d ) ( d e v2 v3 ) ( e d ) ( d e v3 ) ( v0 e d ) ( d e ) ( v0 v1 e d ) ( d ) ( v0 v1 v2 e d ) where d and e denote diagonal and off-diagonal elements of T, and vi denotes an element of the vector defining H(i). -- LAPACK routine (version 3.0) -- Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., Courant Institute, Argonne National Lab, and Rice University October 31, 1992 *************************************************************************/ void smatrixtd(/* Real */ ae_matrix* a, ae_int_t n, ae_bool isupper, /* Real */ ae_vector* tau, /* Real */ ae_vector* d, /* Real */ ae_vector* e, ae_state *_state) { ae_frame _frame_block; ae_int_t i; double alpha; double taui; double v; ae_vector t; ae_vector t2; ae_vector t3; ae_frame_make(_state, &_frame_block); ae_vector_clear(tau); ae_vector_clear(d); ae_vector_clear(e); ae_vector_init(&t, 0, DT_REAL, _state); ae_vector_init(&t2, 0, DT_REAL, _state); ae_vector_init(&t3, 0, DT_REAL, _state); if( n<=0 ) { ae_frame_leave(_state); return; } ae_vector_set_length(&t, n+1, _state); ae_vector_set_length(&t2, n+1, _state); ae_vector_set_length(&t3, n+1, _state); if( n>1 ) { ae_vector_set_length(tau, n-2+1, _state); } ae_vector_set_length(d, n-1+1, _state); if( n>1 ) { ae_vector_set_length(e, n-2+1, _state); } /* * Try to use MKL */ if( smatrixtdmkl(a, n, isupper, tau, d, e, _state) ) { ae_frame_leave(_state); return; } /* * ALGLIB version */ if( isupper ) { /* * Reduce the upper triangle of A */ for(i=n-2; i>=0; i--) { /* * Generate elementary reflector H() = E - tau * v * v' */ if( i>=1 ) { ae_v_move(&t.ptr.p_double[2], 1, &a->ptr.pp_double[0][i+1], a->stride, ae_v_len(2,i+1)); } t.ptr.p_double[1] = a->ptr.pp_double[i][i+1]; generatereflection(&t, i+1, &taui, _state); if( i>=1 ) { ae_v_move(&a->ptr.pp_double[0][i+1], a->stride, &t.ptr.p_double[2], 1, ae_v_len(0,i-1)); } a->ptr.pp_double[i][i+1] = t.ptr.p_double[1]; e->ptr.p_double[i] = a->ptr.pp_double[i][i+1]; if( ae_fp_neq(taui,(double)(0)) ) { /* * Apply H from both sides to A */ a->ptr.pp_double[i][i+1] = (double)(1); /* * Compute x := tau * A * v storing x in TAU */ ae_v_move(&t.ptr.p_double[1], 1, &a->ptr.pp_double[0][i+1], a->stride, ae_v_len(1,i+1)); symmetricmatrixvectormultiply(a, isupper, 0, i, &t, taui, &t3, _state); ae_v_move(&tau->ptr.p_double[0], 1, &t3.ptr.p_double[1], 
1, ae_v_len(0,i)); /* * Compute w := x - 1/2 * tau * (x'*v) * v */ v = ae_v_dotproduct(&tau->ptr.p_double[0], 1, &a->ptr.pp_double[0][i+1], a->stride, ae_v_len(0,i)); alpha = -0.5*taui*v; ae_v_addd(&tau->ptr.p_double[0], 1, &a->ptr.pp_double[0][i+1], a->stride, ae_v_len(0,i), alpha); /* * Apply the transformation as a rank-2 update: * A := A - v * w' - w * v' */ ae_v_move(&t.ptr.p_double[1], 1, &a->ptr.pp_double[0][i+1], a->stride, ae_v_len(1,i+1)); ae_v_move(&t3.ptr.p_double[1], 1, &tau->ptr.p_double[0], 1, ae_v_len(1,i+1)); symmetricrank2update(a, isupper, 0, i, &t, &t3, &t2, (double)(-1), _state); a->ptr.pp_double[i][i+1] = e->ptr.p_double[i]; } d->ptr.p_double[i+1] = a->ptr.pp_double[i+1][i+1]; tau->ptr.p_double[i] = taui; } d->ptr.p_double[0] = a->ptr.pp_double[0][0]; } else { /* * Reduce the lower triangle of A */ for(i=0; i<=n-2; i++) { /* * Generate elementary reflector H = E - tau * v * v' */ ae_v_move(&t.ptr.p_double[1], 1, &a->ptr.pp_double[i+1][i], a->stride, ae_v_len(1,n-i-1)); generatereflection(&t, n-i-1, &taui, _state); ae_v_move(&a->ptr.pp_double[i+1][i], a->stride, &t.ptr.p_double[1], 1, ae_v_len(i+1,n-1)); e->ptr.p_double[i] = a->ptr.pp_double[i+1][i]; if( ae_fp_neq(taui,(double)(0)) ) { /* * Apply H from both sides to A */ a->ptr.pp_double[i+1][i] = (double)(1); /* * Compute x := tau * A * v storing y in TAU */ ae_v_move(&t.ptr.p_double[1], 1, &a->ptr.pp_double[i+1][i], a->stride, ae_v_len(1,n-i-1)); symmetricmatrixvectormultiply(a, isupper, i+1, n-1, &t, taui, &t2, _state); ae_v_move(&tau->ptr.p_double[i], 1, &t2.ptr.p_double[1], 1, ae_v_len(i,n-2)); /* * Compute w := x - 1/2 * tau * (x'*v) * v */ v = ae_v_dotproduct(&tau->ptr.p_double[i], 1, &a->ptr.pp_double[i+1][i], a->stride, ae_v_len(i,n-2)); alpha = -0.5*taui*v; ae_v_addd(&tau->ptr.p_double[i], 1, &a->ptr.pp_double[i+1][i], a->stride, ae_v_len(i,n-2), alpha); /* * Apply the transformation as a rank-2 update: * A := A - v * w' - w * v' * */ ae_v_move(&t.ptr.p_double[1], 1, &a->ptr.pp_double[i+1][i], a->stride, ae_v_len(1,n-i-1)); ae_v_move(&t2.ptr.p_double[1], 1, &tau->ptr.p_double[i], 1, ae_v_len(1,n-i-1)); symmetricrank2update(a, isupper, i+1, n-1, &t, &t2, &t3, (double)(-1), _state); a->ptr.pp_double[i+1][i] = e->ptr.p_double[i]; } d->ptr.p_double[i] = a->ptr.pp_double[i][i]; tau->ptr.p_double[i] = taui; } d->ptr.p_double[n-1] = a->ptr.pp_double[n-1][n-1]; } ae_frame_leave(_state); } /************************************************************************* Unpacking matrix Q which reduces symmetric matrix to a tridiagonal form. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes one important improvement of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Generally, commercial ALGLIB is several times faster than open-source ! generic C edition, and many times faster than open-source C# edition. ! ! Multithreaded acceleration is NOT supported for this function. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. Input parameters: A - the result of a SMatrixTD subroutine N - size of matrix A. 
IsUpper - storage format (a parameter of SMatrixTD subroutine) Tau - the result of a SMatrixTD subroutine Output parameters: Q - transformation matrix. array with elements [0..N-1, 0..N-1]. -- ALGLIB -- Copyright 2005-2010 by Bochkanov Sergey *************************************************************************/ void smatrixtdunpackq(/* Real */ ae_matrix* a, ae_int_t n, ae_bool isupper, /* Real */ ae_vector* tau, /* Real */ ae_matrix* q, ae_state *_state) { ae_frame _frame_block; ae_int_t i; ae_int_t j; ae_vector v; ae_vector work; ae_frame_make(_state, &_frame_block); ae_matrix_clear(q); ae_vector_init(&v, 0, DT_REAL, _state); ae_vector_init(&work, 0, DT_REAL, _state); if( n==0 ) { ae_frame_leave(_state); return; } /* * init */ ae_matrix_set_length(q, n-1+1, n-1+1, _state); ae_vector_set_length(&v, n+1, _state); ae_vector_set_length(&work, n-1+1, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { if( i==j ) { q->ptr.pp_double[i][j] = (double)(1); } else { q->ptr.pp_double[i][j] = (double)(0); } } } /* * MKL version */ if( smatrixtdunpackqmkl(a, n, isupper, tau, q, _state) ) { ae_frame_leave(_state); return; } /* * ALGLIB version: unpack Q */ if( isupper ) { for(i=0; i<=n-2; i++) { /* * Apply H(i) */ ae_v_move(&v.ptr.p_double[1], 1, &a->ptr.pp_double[0][i+1], a->stride, ae_v_len(1,i+1)); v.ptr.p_double[i+1] = (double)(1); applyreflectionfromtheleft(q, tau->ptr.p_double[i], &v, 0, i, 0, n-1, &work, _state); } } else { for(i=n-2; i>=0; i--) { /* * Apply H(i) */ ae_v_move(&v.ptr.p_double[1], 1, &a->ptr.pp_double[i+1][i], a->stride, ae_v_len(1,n-i-1)); v.ptr.p_double[1] = (double)(1); applyreflectionfromtheleft(q, tau->ptr.p_double[i], &v, i+1, n-1, 0, n-1, &work, _state); } } ae_frame_leave(_state); } /************************************************************************* Reduction of a Hermitian matrix which is given by its higher or lower triangular part to a real tridiagonal matrix using unitary similarity transformation: Q'*A*Q = T. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes one important improvement of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Generally, commercial ALGLIB is several times faster than open-source ! generic C edition, and many times faster than open-source C# edition. ! ! Multithreaded acceleration is NOT supported for this function. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. Input parameters: A - matrix to be transformed array with elements [0..N-1, 0..N-1]. N - size of matrix A. IsUpper - storage format. If IsUpper = True, then matrix A is given by its upper triangle, and the lower triangle is not used and not modified by the algorithm, and vice versa if IsUpper = False. Output parameters: A - matrices T and Q in compact form (see lower) Tau - array of factors which are forming matrices H(i) array with elements [0..N-2]. D - main diagonal of real symmetric matrix T. array with elements [0..N-1]. E - secondary diagonal of real symmetric matrix T. array with elements [0..N-2]. If IsUpper=True, the matrix Q is represented as a product of elementary reflectors Q = H(n-2) . . . 
H(2) H(0). Each H(i) has the form H(i) = I - tau * v * v' where tau is a complex scalar, and v is a complex vector with v(i+1:n-1) = 0, v(i) = 1, v(0:i-1) is stored on exit in A(0:i-1,i+1), and tau in TAU(i). If IsUpper=False, the matrix Q is represented as a product of elementary reflectors Q = H(0) H(2) . . . H(n-2). Each H(i) has the form H(i) = I - tau * v * v' where tau is a complex scalar, and v is a complex vector with v(0:i) = 0, v(i+1) = 1, v(i+2:n-1) is stored on exit in A(i+2:n-1,i), and tau in TAU(i). The contents of A on exit are illustrated by the following examples with n = 5: if UPLO = 'U': if UPLO = 'L': ( d e v1 v2 v3 ) ( d ) ( d e v2 v3 ) ( e d ) ( d e v3 ) ( v0 e d ) ( d e ) ( v0 v1 e d ) ( d ) ( v0 v1 v2 e d ) where d and e denote diagonal and off-diagonal elements of T, and vi denotes an element of the vector defining H(i). -- LAPACK routine (version 3.0) -- Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., Courant Institute, Argonne National Lab, and Rice University October 31, 1992 *************************************************************************/ void hmatrixtd(/* Complex */ ae_matrix* a, ae_int_t n, ae_bool isupper, /* Complex */ ae_vector* tau, /* Real */ ae_vector* d, /* Real */ ae_vector* e, ae_state *_state) { ae_frame _frame_block; ae_int_t i; ae_complex alpha; ae_complex taui; ae_complex v; ae_vector t; ae_vector t2; ae_vector t3; ae_frame_make(_state, &_frame_block); ae_vector_clear(tau); ae_vector_clear(d); ae_vector_clear(e); ae_vector_init(&t, 0, DT_COMPLEX, _state); ae_vector_init(&t2, 0, DT_COMPLEX, _state); ae_vector_init(&t3, 0, DT_COMPLEX, _state); /* * Init and test */ if( n<=0 ) { ae_frame_leave(_state); return; } for(i=0; i<=n-1; i++) { ae_assert(ae_fp_eq(a->ptr.pp_complex[i][i].y,(double)(0)), "Assertion failed", _state); } if( n>1 ) { ae_vector_set_length(tau, n-2+1, _state); ae_vector_set_length(e, n-2+1, _state); } ae_vector_set_length(d, n-1+1, _state); ae_vector_set_length(&t, n-1+1, _state); ae_vector_set_length(&t2, n-1+1, _state); ae_vector_set_length(&t3, n-1+1, _state); /* * MKL version */ if( hmatrixtdmkl(a, n, isupper, tau, d, e, _state) ) { ae_frame_leave(_state); return; } /* * ALGLIB version */ if( isupper ) { /* * Reduce the upper triangle of A */ a->ptr.pp_complex[n-1][n-1] = ae_complex_from_d(a->ptr.pp_complex[n-1][n-1].x); for(i=n-2; i>=0; i--) { /* * Generate elementary reflector H = I+1 - tau * v * v' */ alpha = a->ptr.pp_complex[i][i+1]; t.ptr.p_complex[1] = alpha; if( i>=1 ) { ae_v_cmove(&t.ptr.p_complex[2], 1, &a->ptr.pp_complex[0][i+1], a->stride, "N", ae_v_len(2,i+1)); } complexgeneratereflection(&t, i+1, &taui, _state); if( i>=1 ) { ae_v_cmove(&a->ptr.pp_complex[0][i+1], a->stride, &t.ptr.p_complex[2], 1, "N", ae_v_len(0,i-1)); } alpha = t.ptr.p_complex[1]; e->ptr.p_double[i] = alpha.x; if( ae_c_neq_d(taui,(double)(0)) ) { /* * Apply H(I+1) from both sides to A */ a->ptr.pp_complex[i][i+1] = ae_complex_from_i(1); /* * Compute x := tau * A * v storing x in TAU */ ae_v_cmove(&t.ptr.p_complex[1], 1, &a->ptr.pp_complex[0][i+1], a->stride, "N", ae_v_len(1,i+1)); hermitianmatrixvectormultiply(a, isupper, 0, i, &t, taui, &t2, _state); ae_v_cmove(&tau->ptr.p_complex[0], 1, &t2.ptr.p_complex[1], 1, "N", ae_v_len(0,i)); /* * Compute w := x - 1/2 * tau * (x'*v) * v */ v = ae_v_cdotproduct(&tau->ptr.p_complex[0], 1, "Conj", &a->ptr.pp_complex[0][i+1], a->stride, "N", ae_v_len(0,i)); alpha = ae_c_neg(ae_c_mul(ae_c_mul_d(taui,0.5),v)); ae_v_caddc(&tau->ptr.p_complex[0], 1, &a->ptr.pp_complex[0][i+1], a->stride, 
"N", ae_v_len(0,i), alpha); /* * Apply the transformation as a rank-2 update: * A := A - v * w' - w * v' */ ae_v_cmove(&t.ptr.p_complex[1], 1, &a->ptr.pp_complex[0][i+1], a->stride, "N", ae_v_len(1,i+1)); ae_v_cmove(&t3.ptr.p_complex[1], 1, &tau->ptr.p_complex[0], 1, "N", ae_v_len(1,i+1)); hermitianrank2update(a, isupper, 0, i, &t, &t3, &t2, ae_complex_from_i(-1), _state); } else { a->ptr.pp_complex[i][i] = ae_complex_from_d(a->ptr.pp_complex[i][i].x); } a->ptr.pp_complex[i][i+1] = ae_complex_from_d(e->ptr.p_double[i]); d->ptr.p_double[i+1] = a->ptr.pp_complex[i+1][i+1].x; tau->ptr.p_complex[i] = taui; } d->ptr.p_double[0] = a->ptr.pp_complex[0][0].x; } else { /* * Reduce the lower triangle of A */ a->ptr.pp_complex[0][0] = ae_complex_from_d(a->ptr.pp_complex[0][0].x); for(i=0; i<=n-2; i++) { /* * Generate elementary reflector H = I - tau * v * v' */ ae_v_cmove(&t.ptr.p_complex[1], 1, &a->ptr.pp_complex[i+1][i], a->stride, "N", ae_v_len(1,n-i-1)); complexgeneratereflection(&t, n-i-1, &taui, _state); ae_v_cmove(&a->ptr.pp_complex[i+1][i], a->stride, &t.ptr.p_complex[1], 1, "N", ae_v_len(i+1,n-1)); e->ptr.p_double[i] = a->ptr.pp_complex[i+1][i].x; if( ae_c_neq_d(taui,(double)(0)) ) { /* * Apply H(i) from both sides to A(i+1:n,i+1:n) */ a->ptr.pp_complex[i+1][i] = ae_complex_from_i(1); /* * Compute x := tau * A * v storing y in TAU */ ae_v_cmove(&t.ptr.p_complex[1], 1, &a->ptr.pp_complex[i+1][i], a->stride, "N", ae_v_len(1,n-i-1)); hermitianmatrixvectormultiply(a, isupper, i+1, n-1, &t, taui, &t2, _state); ae_v_cmove(&tau->ptr.p_complex[i], 1, &t2.ptr.p_complex[1], 1, "N", ae_v_len(i,n-2)); /* * Compute w := x - 1/2 * tau * (x'*v) * v */ v = ae_v_cdotproduct(&tau->ptr.p_complex[i], 1, "Conj", &a->ptr.pp_complex[i+1][i], a->stride, "N", ae_v_len(i,n-2)); alpha = ae_c_neg(ae_c_mul(ae_c_mul_d(taui,0.5),v)); ae_v_caddc(&tau->ptr.p_complex[i], 1, &a->ptr.pp_complex[i+1][i], a->stride, "N", ae_v_len(i,n-2), alpha); /* * Apply the transformation as a rank-2 update: * A := A - v * w' - w * v' */ ae_v_cmove(&t.ptr.p_complex[1], 1, &a->ptr.pp_complex[i+1][i], a->stride, "N", ae_v_len(1,n-i-1)); ae_v_cmove(&t2.ptr.p_complex[1], 1, &tau->ptr.p_complex[i], 1, "N", ae_v_len(1,n-i-1)); hermitianrank2update(a, isupper, i+1, n-1, &t, &t2, &t3, ae_complex_from_i(-1), _state); } else { a->ptr.pp_complex[i+1][i+1] = ae_complex_from_d(a->ptr.pp_complex[i+1][i+1].x); } a->ptr.pp_complex[i+1][i] = ae_complex_from_d(e->ptr.p_double[i]); d->ptr.p_double[i] = a->ptr.pp_complex[i][i].x; tau->ptr.p_complex[i] = taui; } d->ptr.p_double[n-1] = a->ptr.pp_complex[n-1][n-1].x; } ae_frame_leave(_state); } /************************************************************************* Unpacking matrix Q which reduces a Hermitian matrix to a real tridiagonal form. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes one important improvement of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Generally, commercial ALGLIB is several times faster than open-source ! generic C edition, and many times faster than open-source C# edition. ! ! Multithreaded acceleration is NOT supported for this function. ! ! We recommend you to read 'Working with commercial version' section of ! 
ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. Input parameters: A - the result of a HMatrixTD subroutine N - size of matrix A. IsUpper - storage format (a parameter of HMatrixTD subroutine) Tau - the result of a HMatrixTD subroutine Output parameters: Q - transformation matrix. array with elements [0..N-1, 0..N-1]. -- ALGLIB -- Copyright 2005-2010 by Bochkanov Sergey *************************************************************************/ void hmatrixtdunpackq(/* Complex */ ae_matrix* a, ae_int_t n, ae_bool isupper, /* Complex */ ae_vector* tau, /* Complex */ ae_matrix* q, ae_state *_state) { ae_frame _frame_block; ae_int_t i; ae_int_t j; ae_vector v; ae_vector work; ae_frame_make(_state, &_frame_block); ae_matrix_clear(q); ae_vector_init(&v, 0, DT_COMPLEX, _state); ae_vector_init(&work, 0, DT_COMPLEX, _state); if( n==0 ) { ae_frame_leave(_state); return; } /* * init */ ae_matrix_set_length(q, n-1+1, n-1+1, _state); ae_vector_set_length(&v, n+1, _state); ae_vector_set_length(&work, n-1+1, _state); /* * MKL version */ if( hmatrixtdunpackqmkl(a, n, isupper, tau, q, _state) ) { ae_frame_leave(_state); return; } /* * ALGLIB version */ for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { if( i==j ) { q->ptr.pp_complex[i][j] = ae_complex_from_i(1); } else { q->ptr.pp_complex[i][j] = ae_complex_from_i(0); } } } if( isupper ) { for(i=0; i<=n-2; i++) { /* * Apply H(i) */ ae_v_cmove(&v.ptr.p_complex[1], 1, &a->ptr.pp_complex[0][i+1], a->stride, "N", ae_v_len(1,i+1)); v.ptr.p_complex[i+1] = ae_complex_from_i(1); complexapplyreflectionfromtheleft(q, tau->ptr.p_complex[i], &v, 0, i, 0, n-1, &work, _state); } } else { for(i=n-2; i>=0; i--) { /* * Apply H(i) */ ae_v_cmove(&v.ptr.p_complex[1], 1, &a->ptr.pp_complex[i+1][i], a->stride, "N", ae_v_len(1,n-i-1)); v.ptr.p_complex[1] = ae_complex_from_i(1); complexapplyreflectionfromtheleft(q, tau->ptr.p_complex[i], &v, i+1, n-1, 0, n-1, &work, _state); } } ae_frame_leave(_state); } /************************************************************************* Base case for complex QR -- LAPACK routine (version 3.0) -- Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., Courant Institute, Argonne National Lab, and Rice University September 30, 1994. Sergey Bochkanov, ALGLIB project, translation from FORTRAN to pseudocode, 2007-2010. *************************************************************************/ static void ortfac_cmatrixqrbasecase(/* Complex */ ae_matrix* a, ae_int_t m, ae_int_t n, /* Complex */ ae_vector* work, /* Complex */ ae_vector* t, /* Complex */ ae_vector* tau, ae_state *_state) { ae_int_t i; ae_int_t k; ae_int_t mmi; ae_int_t minmn; ae_complex tmp; minmn = ae_minint(m, n, _state); if( minmn<=0 ) { return; } /* * Test the input arguments */ k = ae_minint(m, n, _state); for(i=0; i<=k-1; i++) { /* * Generate elementary reflector H(i) to annihilate A(i+1:m,i) */ mmi = m-i; ae_v_cmove(&t->ptr.p_complex[1], 1, &a->ptr.pp_complex[i][i], a->stride, "N", ae_v_len(1,mmi)); complexgeneratereflection(t, mmi, &tmp, _state); tau->ptr.p_complex[i] = tmp; ae_v_cmove(&a->ptr.pp_complex[i][i], a->stride, &t->ptr.p_complex[1], 1, "N", ae_v_len(i,m-1)); t->ptr.p_complex[1] = ae_complex_from_i(1); if( i<n-1 ) { complexapplyreflectionfromtheleft(a, ae_c_conj(tau->ptr.p_complex[i], _state), t, i, m-1, i+1, n-1, work, _state); } } } /************************************************************************* Base case for complex LQ -- LAPACK routine (version 3.0) -- Univ. of Tennessee, Univ. 
of California Berkeley, NAG Ltd., Courant Institute, Argonne National Lab, and Rice University September 30, 1994. Sergey Bochkanov, ALGLIB project, translation from FORTRAN to pseudocode, 2007-2010. *************************************************************************/ static void ortfac_cmatrixlqbasecase(/* Complex */ ae_matrix* a, ae_int_t m, ae_int_t n, /* Complex */ ae_vector* work, /* Complex */ ae_vector* t, /* Complex */ ae_vector* tau, ae_state *_state) { ae_int_t i; ae_int_t minmn; ae_complex tmp; minmn = ae_minint(m, n, _state); if( minmn<=0 ) { return; } /* * Test the input arguments */ for(i=0; i<=minmn-1; i++) { /* * Generate elementary reflector H(i) * * NOTE: ComplexGenerateReflection() generates left reflector, * i.e. H which reduces x by applying from the left, but we * need RIGHT reflector. So we replace H=E-tau*v*v' by H^H, * which changes v to conj(v). */ ae_v_cmove(&t->ptr.p_complex[1], 1, &a->ptr.pp_complex[i][i], 1, "Conj", ae_v_len(1,n-i)); complexgeneratereflection(t, n-i, &tmp, _state); tau->ptr.p_complex[i] = tmp; ae_v_cmove(&a->ptr.pp_complex[i][i], 1, &t->ptr.p_complex[1], 1, "Conj", ae_v_len(i,n-1)); t->ptr.p_complex[1] = ae_complex_from_i(1); if( i<m-1 ) { complexapplyreflectionfromtheright(a, tau->ptr.p_complex[i], t, i+1, m-1, i, n-1, work, _state); } } } /************************************************************************* Generate block reflector: * fill unused parts of reflectors matrix by zeros * fill diagonal of reflectors matrix by ones * generate triangular factor T PARAMETERS: A - either LengthA*BlockSize (if ColumnwiseA) or BlockSize*LengthA (if not ColumnwiseA) matrix of elementary reflectors. Modified on exit. Tau - scalar factors ColumnwiseA - reflectors are stored in rows or in columns LengthA - length of largest reflector BlockSize - number of reflectors T - array[BlockSize,2*BlockSize]. Left BlockSize*BlockSize submatrix stores triangular factor on exit. 
WORK - array[BlockSize] -- ALGLIB routine -- 17.02.2010 Bochkanov Sergey *************************************************************************/ static void ortfac_rmatrixblockreflector(/* Real */ ae_matrix* a, /* Real */ ae_vector* tau, ae_bool columnwisea, ae_int_t lengtha, ae_int_t blocksize, /* Real */ ae_matrix* t, /* Real */ ae_vector* work, ae_state *_state) { ae_int_t i; ae_int_t j; ae_int_t k; double v; /* * fill beginning of new column with zeros, * load 1.0 in the first non-zero element */ for(k=0; k<=blocksize-1; k++) { if( columnwisea ) { for(i=0; i<=k-1; i++) { a->ptr.pp_double[i][k] = (double)(0); } } else { for(i=0; i<=k-1; i++) { a->ptr.pp_double[k][i] = (double)(0); } } a->ptr.pp_double[k][k] = (double)(1); } /* * Calculate Gram matrix of A */ for(i=0; i<=blocksize-1; i++) { for(j=0; j<=blocksize-1; j++) { t->ptr.pp_double[i][blocksize+j] = (double)(0); } } for(k=0; k<=lengtha-1; k++) { for(j=1; j<=blocksize-1; j++) { if( columnwisea ) { v = a->ptr.pp_double[k][j]; if( ae_fp_neq(v,(double)(0)) ) { ae_v_addd(&t->ptr.pp_double[j][blocksize], 1, &a->ptr.pp_double[k][0], 1, ae_v_len(blocksize,blocksize+j-1), v); } } else { v = a->ptr.pp_double[j][k]; if( ae_fp_neq(v,(double)(0)) ) { ae_v_addd(&t->ptr.pp_double[j][blocksize], 1, &a->ptr.pp_double[0][k], a->stride, ae_v_len(blocksize,blocksize+j-1), v); } } } } /* * Prepare Y (stored in TmpA) and T (stored in TmpT) */ for(k=0; k<=blocksize-1; k++) { /* * fill non-zero part of T, use pre-calculated Gram matrix */ ae_v_move(&work->ptr.p_double[0], 1, &t->ptr.pp_double[k][blocksize], 1, ae_v_len(0,k-1)); for(i=0; i<=k-1; i++) { v = ae_v_dotproduct(&t->ptr.pp_double[i][i], 1, &work->ptr.p_double[i], 1, ae_v_len(i,k-1)); t->ptr.pp_double[i][k] = -tau->ptr.p_double[k]*v; } t->ptr.pp_double[k][k] = -tau->ptr.p_double[k]; /* * Rest of T is filled by zeros */ for(i=k+1; i<=blocksize-1; i++) { t->ptr.pp_double[i][k] = (double)(0); } } } /************************************************************************* Generate block reflector (complex): * fill unused parts of reflectors matrix by zeros * fill diagonal of reflectors matrix by ones * generate triangular factor T -- ALGLIB routine -- 17.02.2010 Bochkanov Sergey *************************************************************************/ static void ortfac_cmatrixblockreflector(/* Complex */ ae_matrix* a, /* Complex */ ae_vector* tau, ae_bool columnwisea, ae_int_t lengtha, ae_int_t blocksize, /* Complex */ ae_matrix* t, /* Complex */ ae_vector* work, ae_state *_state) { ae_int_t i; ae_int_t k; ae_complex v; /* * Prepare Y (stored in TmpA) and T (stored in TmpT) */ for(k=0; k<=blocksize-1; k++) { /* * fill beginning of new column with zeros, * load 1.0 in the first non-zero element */ if( columnwisea ) { for(i=0; i<=k-1; i++) { a->ptr.pp_complex[i][k] = ae_complex_from_i(0); } } else { for(i=0; i<=k-1; i++) { a->ptr.pp_complex[k][i] = ae_complex_from_i(0); } } a->ptr.pp_complex[k][k] = ae_complex_from_i(1); /* * fill non-zero part of T, */ for(i=0; i<=k-1; i++) { if( columnwisea ) { v = ae_v_cdotproduct(&a->ptr.pp_complex[k][i], a->stride, "Conj", &a->ptr.pp_complex[k][k], a->stride, "N", ae_v_len(k,lengtha-1)); } else { v = ae_v_cdotproduct(&a->ptr.pp_complex[i][k], 1, "N", &a->ptr.pp_complex[k][k], 1, "Conj", ae_v_len(k,lengtha-1)); } work->ptr.p_complex[i] = v; } for(i=0; i<=k-1; i++) { v = ae_v_cdotproduct(&t->ptr.pp_complex[i][i], 1, "N", &work->ptr.p_complex[i], 1, "N", ae_v_len(i,k-1)); t->ptr.pp_complex[i][k] = ae_c_neg(ae_c_mul(tau->ptr.p_complex[k],v)); } 
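/* Diagonal of the triangular factor T: T(k,k) = -tau(k) */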
t->ptr.pp_complex[k][k] = ae_c_neg(tau->ptr.p_complex[k]); /* * Rest of T is filled by zeros */ for(i=k+1; i<=blocksize-1; i++) { t->ptr.pp_complex[i][k] = ae_complex_from_i(0); } } } /************************************************************************* Basic Cholesky solver for ScaleA*Cholesky(A)'*x = y. This subroutine assumes that: * A*ScaleA is well scaled * A is well-conditioned, so no zero divisions or overflow may occur INPUT PARAMETERS: CHA - Cholesky decomposition of A SqrtScaleA- square root of scale factor ScaleA N - matrix size, N>=0. IsUpper - storage type XB - right part Tmp - buffer; function automatically allocates it, if it is too small. It can be reused if function is called several times. OUTPUT PARAMETERS: XB - solution NOTE 1: no assertion or tests are done during algorithm operation NOTE 2: N=0 will force algorithm to silently return -- ALGLIB -- Copyright 13.10.2010 by Bochkanov Sergey *************************************************************************/ void fblscholeskysolve(/* Real */ ae_matrix* cha, double sqrtscalea, ae_int_t n, ae_bool isupper, /* Real */ ae_vector* xb, /* Real */ ae_vector* tmp, ae_state *_state) { ae_int_t i; double v; if( n==0 ) { return; } if( tmp->cnt<n ) { ae_vector_set_length(tmp, n, _state); } if( isupper ) { /* * Solve U'*y=b first */ for(i=0; i<=n-1; i++) { xb->ptr.p_double[i] = xb->ptr.p_double[i]/(sqrtscalea*cha->ptr.pp_double[i][i]); if( i<n-1 ) { v = xb->ptr.p_double[i]; ae_v_moved(&tmp->ptr.p_double[i+1], 1, &cha->ptr.pp_double[i][i+1], 1, ae_v_len(i+1,n-1), sqrtscalea); ae_v_subd(&xb->ptr.p_double[i+1], 1, &tmp->ptr.p_double[i+1], 1, ae_v_len(i+1,n-1), v); } } /* * Solve U*x=y then. */ for(i=n-1; i>=0; i--) { if( i<n-1 ) { ae_v_moved(&tmp->ptr.p_double[i+1], 1, &cha->ptr.pp_double[i][i+1], 1, ae_v_len(i+1,n-1), sqrtscalea); v = ae_v_dotproduct(&tmp->ptr.p_double[i+1], 1, &xb->ptr.p_double[i+1], 1, ae_v_len(i+1,n-1)); xb->ptr.p_double[i] = xb->ptr.p_double[i]-v; } xb->ptr.p_double[i] = xb->ptr.p_double[i]/(sqrtscalea*cha->ptr.pp_double[i][i]); } } else { /* * Solve L*y=b first */ for(i=0; i<=n-1; i++) { if( i>0 ) { ae_v_moved(&tmp->ptr.p_double[0], 1, &cha->ptr.pp_double[i][0], 1, ae_v_len(0,i-1), sqrtscalea); v = ae_v_dotproduct(&tmp->ptr.p_double[0], 1, &xb->ptr.p_double[0], 1, ae_v_len(0,i-1)); xb->ptr.p_double[i] = xb->ptr.p_double[i]-v; } xb->ptr.p_double[i] = xb->ptr.p_double[i]/(sqrtscalea*cha->ptr.pp_double[i][i]); } /* * Solve L'*x=y then. */ for(i=n-1; i>=0; i--) { xb->ptr.p_double[i] = xb->ptr.p_double[i]/(sqrtscalea*cha->ptr.pp_double[i][i]); if( i>0 ) { v = xb->ptr.p_double[i]; ae_v_moved(&tmp->ptr.p_double[0], 1, &cha->ptr.pp_double[i][0], 1, ae_v_len(0,i-1), sqrtscalea); ae_v_subd(&xb->ptr.p_double[0], 1, &tmp->ptr.p_double[0], 1, ae_v_len(0,i-1), v); } } } } /************************************************************************* Fast basic linear solver: linear SPD CG Solves (A^T*A + alpha*I)*x = b where: * A is MxN matrix * alpha>0 is a scalar * I is NxN identity matrix * b is Nx1 vector * X is Nx1 unknown vector. N iterations of linear conjugate gradient are used to solve problem. INPUT PARAMETERS: A - array[M,N], matrix M - number of rows N - number of unknowns B - array[N], right part X - initial approximation, array[N] Buf - buffer; function automatically allocates it, if it is too small. It can be reused if function is called several times with same M and N. OUTPUT PARAMETERS: X - improved solution NOTES: * solver checks quality of improved solution. If (because of problem condition number, numerical noise, etc.) new solution is WORSE than original approximation, then original approximation is returned. 
* solver assumes that both A, B, Alpha are well scaled (i.e. they are less than sqrt(overflow) and greater than sqrt(underflow)). -- ALGLIB -- Copyright 20.08.2009 by Bochkanov Sergey *************************************************************************/ void fblssolvecgx(/* Real */ ae_matrix* a, ae_int_t m, ae_int_t n, double alpha, /* Real */ ae_vector* b, /* Real */ ae_vector* x, /* Real */ ae_vector* buf, ae_state *_state) { ae_int_t k; ae_int_t offsrk; ae_int_t offsrk1; ae_int_t offsxk; ae_int_t offsxk1; ae_int_t offspk; ae_int_t offspk1; ae_int_t offstmp1; ae_int_t offstmp2; ae_int_t bs; double e1; double e2; double rk2; double rk12; double pap; double s; double betak; double v1; double v2; /* * Test for special case: B=0 */ v1 = ae_v_dotproduct(&b->ptr.p_double[0], 1, &b->ptr.p_double[0], 1, ae_v_len(0,n-1)); if( ae_fp_eq(v1,(double)(0)) ) { for(k=0; k<=n-1; k++) { x->ptr.p_double[k] = (double)(0); } return; } /* * Offsets inside Buf for: * * R[K], R[K+1] * * X[K], X[K+1] * * P[K], P[K+1] * * Tmp1 - array[M], Tmp2 - array[N] */ offsrk = 0; offsrk1 = offsrk+n; offsxk = offsrk1+n; offsxk1 = offsxk+n; offspk = offsxk1+n; offspk1 = offspk+n; offstmp1 = offspk1+n; offstmp2 = offstmp1+m; bs = offstmp2+n; if( buf->cnt<bs ) { ae_vector_set_length(buf, bs, _state); } ae_v_move(&buf->ptr.p_double[offsxk], 1, &x->ptr.p_double[0], 1, ae_v_len(offsxk,offsxk+n-1)); /* * r(0) = b-A*x(0) * RK2 = r(0)'*r(0) */ rmatrixmv(m, n, a, 0, 0, 0, buf, offsxk, buf, offstmp1, _state); rmatrixmv(n, m, a, 0, 0, 1, buf, offstmp1, buf, offstmp2, _state); ae_v_addd(&buf->ptr.p_double[offstmp2], 1, &buf->ptr.p_double[offsxk], 1, ae_v_len(offstmp2,offstmp2+n-1), alpha); ae_v_move(&buf->ptr.p_double[offsrk], 1, &b->ptr.p_double[0], 1, ae_v_len(offsrk,offsrk+n-1)); ae_v_sub(&buf->ptr.p_double[offsrk], 1, &buf->ptr.p_double[offstmp2], 1, ae_v_len(offsrk,offsrk+n-1)); rk2 = ae_v_dotproduct(&buf->ptr.p_double[offsrk], 1, &buf->ptr.p_double[offsrk], 1, ae_v_len(offsrk,offsrk+n-1)); ae_v_move(&buf->ptr.p_double[offspk], 1, &buf->ptr.p_double[offsrk], 1, ae_v_len(offspk,offspk+n-1)); e1 = ae_sqrt(rk2, _state); /* * Cycle */ for(k=0; k<=n-1; k++) { /* * Calculate A*p(k) - store in Buf[OffsTmp2:OffsTmp2+N-1] * and p(k)'*A*p(k) - store in PAP * * If PAP=0, break (iteration is over) */ rmatrixmv(m, n, a, 0, 0, 0, buf, offspk, buf, offstmp1, _state); v1 = ae_v_dotproduct(&buf->ptr.p_double[offstmp1], 1, &buf->ptr.p_double[offstmp1], 1, ae_v_len(offstmp1,offstmp1+m-1)); v2 = ae_v_dotproduct(&buf->ptr.p_double[offspk], 1, &buf->ptr.p_double[offspk], 1, ae_v_len(offspk,offspk+n-1)); pap = v1+alpha*v2; rmatrixmv(n, m, a, 0, 0, 1, buf, offstmp1, buf, offstmp2, _state); ae_v_addd(&buf->ptr.p_double[offstmp2], 1, &buf->ptr.p_double[offspk], 1, ae_v_len(offstmp2,offstmp2+n-1), alpha); if( ae_fp_eq(pap,(double)(0)) ) { break; } /* * S = (r(k)'*r(k))/(p(k)'*A*p(k)) */ s = rk2/pap; /* * x(k+1) = x(k) + S*p(k) */ ae_v_move(&buf->ptr.p_double[offsxk1], 1, &buf->ptr.p_double[offsxk], 1, ae_v_len(offsxk1,offsxk1+n-1)); ae_v_addd(&buf->ptr.p_double[offsxk1], 1, &buf->ptr.p_double[offspk], 1, ae_v_len(offsxk1,offsxk1+n-1), s); /* * r(k+1) = r(k) - S*A*p(k) * RK12 = r(k+1)'*r(k+1) * * Break if r(k+1) small enough (when compared to r(k)) */ ae_v_move(&buf->ptr.p_double[offsrk1], 1, &buf->ptr.p_double[offsrk], 1, ae_v_len(offsrk1,offsrk1+n-1)); ae_v_subd(&buf->ptr.p_double[offsrk1], 1, &buf->ptr.p_double[offstmp2], 1, ae_v_len(offsrk1,offsrk1+n-1), s); rk12 = ae_v_dotproduct(&buf->ptr.p_double[offsrk1], 1, &buf->ptr.p_double[offsrk1], 1, ae_v_len(offsrk1,offsrk1+n-1)); if( ae_fp_less_eq(ae_sqrt(rk12, 
_state),100*ae_machineepsilon*ae_sqrt(rk2, _state)) ) { /* * X(k) = x(k+1) before exit - * - because we expect to find solution at x(k) */ ae_v_move(&buf->ptr.p_double[offsxk], 1, &buf->ptr.p_double[offsxk1], 1, ae_v_len(offsxk,offsxk+n-1)); break; } /* * BetaK = RK12/RK2 * p(k+1) = r(k+1)+betak*p(k) */ betak = rk12/rk2; ae_v_move(&buf->ptr.p_double[offspk1], 1, &buf->ptr.p_double[offsrk1], 1, ae_v_len(offspk1,offspk1+n-1)); ae_v_addd(&buf->ptr.p_double[offspk1], 1, &buf->ptr.p_double[offspk], 1, ae_v_len(offspk1,offspk1+n-1), betak); /* * r(k) := r(k+1) * x(k) := x(k+1) * p(k) := p(k+1) */ ae_v_move(&buf->ptr.p_double[offsrk], 1, &buf->ptr.p_double[offsrk1], 1, ae_v_len(offsrk,offsrk+n-1)); ae_v_move(&buf->ptr.p_double[offsxk], 1, &buf->ptr.p_double[offsxk1], 1, ae_v_len(offsxk,offsxk+n-1)); ae_v_move(&buf->ptr.p_double[offspk], 1, &buf->ptr.p_double[offspk1], 1, ae_v_len(offspk,offspk+n-1)); rk2 = rk12; } /* * Calculate E2 */ rmatrixmv(m, n, a, 0, 0, 0, buf, offsxk, buf, offstmp1, _state); rmatrixmv(n, m, a, 0, 0, 1, buf, offstmp1, buf, offstmp2, _state); ae_v_addd(&buf->ptr.p_double[offstmp2], 1, &buf->ptr.p_double[offsxk], 1, ae_v_len(offstmp2,offstmp2+n-1), alpha); ae_v_move(&buf->ptr.p_double[offsrk], 1, &b->ptr.p_double[0], 1, ae_v_len(offsrk,offsrk+n-1)); ae_v_sub(&buf->ptr.p_double[offsrk], 1, &buf->ptr.p_double[offstmp2], 1, ae_v_len(offsrk,offsrk+n-1)); v1 = ae_v_dotproduct(&buf->ptr.p_double[offsrk], 1, &buf->ptr.p_double[offsrk], 1, ae_v_len(offsrk,offsrk+n-1)); e2 = ae_sqrt(v1, _state); /* * Output result (if it was improved) */ if( ae_fp_less(e2,e1) ) { ae_v_move(&x->ptr.p_double[0], 1, &buf->ptr.p_double[offsxk], 1, ae_v_len(0,n-1)); } } /************************************************************************* Construction of linear conjugate gradient solver. State parameter is passed using "var" semantics (i.e. previous state is NOT erased). When it is already initialized, we can reuse previously allocated memory. INPUT PARAMETERS: X - initial solution B - right part N - system size State - structure; may be preallocated, if we want to reuse memory OUTPUT PARAMETERS: State - structure which is used by FBLSCGIteration() to store algorithm state between subsequent calls. NOTE: no error checking is done; caller must check all parameters, prevent overflows, and so on. -- ALGLIB -- Copyright 22.10.2009 by Bochkanov Sergey *************************************************************************/ void fblscgcreate(/* Real */ ae_vector* x, /* Real */ ae_vector* b, ae_int_t n, fblslincgstate* state, ae_state *_state) { if( state->b.cnt<n ) { ae_vector_set_length(&state->b, n, _state); } if( state->rk.cnt<n ) { ae_vector_set_length(&state->rk, n, _state); } if( state->rk1.cnt<n ) { ae_vector_set_length(&state->rk1, n, _state); } if( state->xk.cnt<n ) { ae_vector_set_length(&state->xk, n, _state); } if( state->xk1.cnt<n ) { ae_vector_set_length(&state->xk1, n, _state); } if( state->pk.cnt<n ) { ae_vector_set_length(&state->pk, n, _state); } if( state->pk1.cnt<n ) { ae_vector_set_length(&state->pk1, n, _state); } if( state->tmp2.cnt<n ) { ae_vector_set_length(&state->tmp2, n, _state); } if( state->x.cnt<n ) { ae_vector_set_length(&state->x, n, _state); } if( state->ax.cnt<n ) { ae_vector_set_length(&state->ax, n, _state); } state->n = n; ae_v_move(&state->xk.ptr.p_double[0], 1, &x->ptr.p_double[0], 1, ae_v_len(0,n-1)); ae_v_move(&state->b.ptr.p_double[0], 1, &b->ptr.p_double[0], 1, ae_v_len(0,n-1)); ae_vector_set_length(&state->rstate.ia, 1+1, _state); ae_vector_set_length(&state->rstate.ra, 6+1, _state); state->rstate.stage = -1; } /************************************************************************* Linear CG solver, function relying on reverse communication to calculate matrix-vector products. See comments for FBLSLinCGState structure for more info. 
-- ALGLIB -- Copyright 22.10.2009 by Bochkanov Sergey *************************************************************************/ ae_bool fblscgiteration(fblslincgstate* state, ae_state *_state) { ae_int_t n; ae_int_t k; double rk2; double rk12; double pap; double s; double betak; double v1; double v2; ae_bool result; /* * Reverse communication preparations * I know it looks ugly, but it works the same way * anywhere from C++ to Python. * * This code initializes locals by: * * random values determined during code * generation - on first subroutine call * * values from previous call - on subsequent calls */ if( state->rstate.stage>=0 ) { n = state->rstate.ia.ptr.p_int[0]; k = state->rstate.ia.ptr.p_int[1]; rk2 = state->rstate.ra.ptr.p_double[0]; rk12 = state->rstate.ra.ptr.p_double[1]; pap = state->rstate.ra.ptr.p_double[2]; s = state->rstate.ra.ptr.p_double[3]; betak = state->rstate.ra.ptr.p_double[4]; v1 = state->rstate.ra.ptr.p_double[5]; v2 = state->rstate.ra.ptr.p_double[6]; } else { n = 359; k = -58; rk2 = -919; rk12 = -909; pap = 81; s = 255; betak = 74; v1 = -788; v2 = 809; } if( state->rstate.stage==0 ) { goto lbl_0; } if( state->rstate.stage==1 ) { goto lbl_1; } if( state->rstate.stage==2 ) { goto lbl_2; } /* * Routine body */ /* * prepare locals */ n = state->n; /* * Test for special case: B=0 */ v1 = ae_v_dotproduct(&state->b.ptr.p_double[0], 1, &state->b.ptr.p_double[0], 1, ae_v_len(0,n-1)); if( ae_fp_eq(v1,(double)(0)) ) { for(k=0; k<=n-1; k++) { state->xk.ptr.p_double[k] = (double)(0); } result = ae_false; return result; } /* * r(0) = b-A*x(0) * RK2 = r(0)'*r(0) */ ae_v_move(&state->x.ptr.p_double[0], 1, &state->xk.ptr.p_double[0], 1, ae_v_len(0,n-1)); state->rstate.stage = 0; goto lbl_rcomm; lbl_0: ae_v_move(&state->rk.ptr.p_double[0], 1, &state->b.ptr.p_double[0], 1, ae_v_len(0,n-1)); ae_v_sub(&state->rk.ptr.p_double[0], 1, &state->ax.ptr.p_double[0], 1, ae_v_len(0,n-1)); rk2 = ae_v_dotproduct(&state->rk.ptr.p_double[0], 1, &state->rk.ptr.p_double[0], 1, ae_v_len(0,n-1)); ae_v_move(&state->pk.ptr.p_double[0], 1, &state->rk.ptr.p_double[0], 1, ae_v_len(0,n-1)); state->e1 = ae_sqrt(rk2, _state); /* * Cycle */ k = 0; lbl_3: if( k>n-1 ) { goto lbl_5; } /* * Calculate A*p(k) - store in State.Tmp2 * and p(k)'*A*p(k) - store in PAP * * If PAP=0, break (iteration is over) */ ae_v_move(&state->x.ptr.p_double[0], 1, &state->pk.ptr.p_double[0], 1, ae_v_len(0,n-1)); state->rstate.stage = 1; goto lbl_rcomm; lbl_1: ae_v_move(&state->tmp2.ptr.p_double[0], 1, &state->ax.ptr.p_double[0], 1, ae_v_len(0,n-1)); pap = state->xax; if( !ae_isfinite(pap, _state) ) { goto lbl_5; } if( ae_fp_less_eq(pap,(double)(0)) ) { goto lbl_5; } /* * S = (r(k)'*r(k))/(p(k)'*A*p(k)) */ s = rk2/pap; /* * x(k+1) = x(k) + S*p(k) */ ae_v_move(&state->xk1.ptr.p_double[0], 1, &state->xk.ptr.p_double[0], 1, ae_v_len(0,n-1)); ae_v_addd(&state->xk1.ptr.p_double[0], 1, &state->pk.ptr.p_double[0], 1, ae_v_len(0,n-1), s); /* * r(k+1) = r(k) - S*A*p(k) * RK12 = r(k+1)'*r(k+1) * * Break if r(k+1) small enough (when compared to r(k)) */ ae_v_move(&state->rk1.ptr.p_double[0], 1, &state->rk.ptr.p_double[0], 1, ae_v_len(0,n-1)); ae_v_subd(&state->rk1.ptr.p_double[0], 1, &state->tmp2.ptr.p_double[0], 1, ae_v_len(0,n-1), s); rk12 = ae_v_dotproduct(&state->rk1.ptr.p_double[0], 1, &state->rk1.ptr.p_double[0], 1, ae_v_len(0,n-1)); if( ae_fp_less_eq(ae_sqrt(rk12, _state),100*ae_machineepsilon*state->e1) ) { /* * X(k) = x(k+1) before exit - * - because we expect to find solution at x(k) */ ae_v_move(&state->xk.ptr.p_double[0], 1, 
&state->xk1.ptr.p_double[0], 1, ae_v_len(0,n-1)); goto lbl_5; } /* * BetaK = RK12/RK2 * p(k+1) = r(k+1)+betak*p(k) * * NOTE: we expect that BetaK won't overflow because of * "Sqrt(RK12)<=100*MachineEpsilon*E1" test above. */ betak = rk12/rk2; ae_v_move(&state->pk1.ptr.p_double[0], 1, &state->rk1.ptr.p_double[0], 1, ae_v_len(0,n-1)); ae_v_addd(&state->pk1.ptr.p_double[0], 1, &state->pk.ptr.p_double[0], 1, ae_v_len(0,n-1), betak); /* * r(k) := r(k+1) * x(k) := x(k+1) * p(k) := p(k+1) */ ae_v_move(&state->rk.ptr.p_double[0], 1, &state->rk1.ptr.p_double[0], 1, ae_v_len(0,n-1)); ae_v_move(&state->xk.ptr.p_double[0], 1, &state->xk1.ptr.p_double[0], 1, ae_v_len(0,n-1)); ae_v_move(&state->pk.ptr.p_double[0], 1, &state->pk1.ptr.p_double[0], 1, ae_v_len(0,n-1)); rk2 = rk12; k = k+1; goto lbl_3; lbl_5: /* * Calculate E2 */ ae_v_move(&state->x.ptr.p_double[0], 1, &state->xk.ptr.p_double[0], 1, ae_v_len(0,n-1)); state->rstate.stage = 2; goto lbl_rcomm; lbl_2: ae_v_move(&state->rk.ptr.p_double[0], 1, &state->b.ptr.p_double[0], 1, ae_v_len(0,n-1)); ae_v_sub(&state->rk.ptr.p_double[0], 1, &state->ax.ptr.p_double[0], 1, ae_v_len(0,n-1)); v1 = ae_v_dotproduct(&state->rk.ptr.p_double[0], 1, &state->rk.ptr.p_double[0], 1, ae_v_len(0,n-1)); state->e2 = ae_sqrt(v1, _state); result = ae_false; return result; /* * Saving state */ lbl_rcomm: result = ae_true; state->rstate.ia.ptr.p_int[0] = n; state->rstate.ia.ptr.p_int[1] = k; state->rstate.ra.ptr.p_double[0] = rk2; state->rstate.ra.ptr.p_double[1] = rk12; state->rstate.ra.ptr.p_double[2] = pap; state->rstate.ra.ptr.p_double[3] = s; state->rstate.ra.ptr.p_double[4] = betak; state->rstate.ra.ptr.p_double[5] = v1; state->rstate.ra.ptr.p_double[6] = v2; return result; } /************************************************************************* Fast least squares solver, solves well conditioned system without performing any checks for degeneracy, and using user-provided buffers (which are automatically reallocated if too small). This function is intended for solution of moderately sized systems. It uses factorization algorithms based on Level 2 BLAS operations, thus it won't work efficiently on large scale systems. INPUT PARAMETERS: A - array[M,N], system matrix. Contents of A is destroyed during solution. B - array[M], right part M - number of equations N - number of variables, N<=M Tmp0, Tmp1, Tmp2- buffers; function automatically allocates them, if they are too small. They can be reused if function is called several times. 
OUTPUT PARAMETERS: B - solution (first N components, next M-N are zero) -- ALGLIB -- Copyright 20.01.2012 by Bochkanov Sergey *************************************************************************/ void fblssolvels(/* Real */ ae_matrix* a, /* Real */ ae_vector* b, ae_int_t m, ae_int_t n, /* Real */ ae_vector* tmp0, /* Real */ ae_vector* tmp1, /* Real */ ae_vector* tmp2, ae_state *_state) { ae_int_t i; ae_int_t k; double v; ae_assert(n>0, "FBLSSolveLS: N<=0", _state); ae_assert(m>=n, "FBLSSolveLS: Mrows>=m, "FBLSSolveLS: Rows(A)cols>=n, "FBLSSolveLS: Cols(A)cnt>=m, "FBLSSolveLS: Length(B)ptr.p_double[i] = (double)(0); } ae_v_move(&tmp0->ptr.p_double[k], 1, &a->ptr.pp_double[k][k], a->stride, ae_v_len(k,m-1)); tmp0->ptr.p_double[k] = (double)(1); v = ae_v_dotproduct(&tmp0->ptr.p_double[k], 1, &b->ptr.p_double[k], 1, ae_v_len(k,m-1)); v = v*tmp2->ptr.p_double[k]; ae_v_subd(&b->ptr.p_double[k], 1, &tmp0->ptr.p_double[k], 1, ae_v_len(k,m-1), v); } /* * Solve triangular system */ b->ptr.p_double[n-1] = b->ptr.p_double[n-1]/a->ptr.pp_double[n-1][n-1]; for(i=n-2; i>=0; i--) { v = ae_v_dotproduct(&a->ptr.pp_double[i][i+1], 1, &b->ptr.p_double[i+1], 1, ae_v_len(i+1,n-1)); b->ptr.p_double[i] = (b->ptr.p_double[i]-v)/a->ptr.pp_double[i][i]; } for(i=n; i<=m-1; i++) { b->ptr.p_double[i] = 0.0; } } void _fblslincgstate_init(void* _p, ae_state *_state) { fblslincgstate *p = (fblslincgstate*)_p; ae_touch_ptr((void*)p); ae_vector_init(&p->x, 0, DT_REAL, _state); ae_vector_init(&p->ax, 0, DT_REAL, _state); ae_vector_init(&p->rk, 0, DT_REAL, _state); ae_vector_init(&p->rk1, 0, DT_REAL, _state); ae_vector_init(&p->xk, 0, DT_REAL, _state); ae_vector_init(&p->xk1, 0, DT_REAL, _state); ae_vector_init(&p->pk, 0, DT_REAL, _state); ae_vector_init(&p->pk1, 0, DT_REAL, _state); ae_vector_init(&p->b, 0, DT_REAL, _state); _rcommstate_init(&p->rstate, _state); ae_vector_init(&p->tmp2, 0, DT_REAL, _state); } void _fblslincgstate_init_copy(void* _dst, void* _src, ae_state *_state) { fblslincgstate *dst = (fblslincgstate*)_dst; fblslincgstate *src = (fblslincgstate*)_src; dst->e1 = src->e1; dst->e2 = src->e2; ae_vector_init_copy(&dst->x, &src->x, _state); ae_vector_init_copy(&dst->ax, &src->ax, _state); dst->xax = src->xax; dst->n = src->n; ae_vector_init_copy(&dst->rk, &src->rk, _state); ae_vector_init_copy(&dst->rk1, &src->rk1, _state); ae_vector_init_copy(&dst->xk, &src->xk, _state); ae_vector_init_copy(&dst->xk1, &src->xk1, _state); ae_vector_init_copy(&dst->pk, &src->pk, _state); ae_vector_init_copy(&dst->pk1, &src->pk1, _state); ae_vector_init_copy(&dst->b, &src->b, _state); _rcommstate_init_copy(&dst->rstate, &src->rstate, _state); ae_vector_init_copy(&dst->tmp2, &src->tmp2, _state); } void _fblslincgstate_clear(void* _p) { fblslincgstate *p = (fblslincgstate*)_p; ae_touch_ptr((void*)p); ae_vector_clear(&p->x); ae_vector_clear(&p->ax); ae_vector_clear(&p->rk); ae_vector_clear(&p->rk1); ae_vector_clear(&p->xk); ae_vector_clear(&p->xk1); ae_vector_clear(&p->pk); ae_vector_clear(&p->pk1); ae_vector_clear(&p->b); _rcommstate_clear(&p->rstate); ae_vector_clear(&p->tmp2); } void _fblslincgstate_destroy(void* _p) { fblslincgstate *p = (fblslincgstate*)_p; ae_touch_ptr((void*)p); ae_vector_destroy(&p->x); ae_vector_destroy(&p->ax); ae_vector_destroy(&p->rk); ae_vector_destroy(&p->rk1); ae_vector_destroy(&p->xk); ae_vector_destroy(&p->xk1); ae_vector_destroy(&p->pk); ae_vector_destroy(&p->pk1); ae_vector_destroy(&p->b); _rcommstate_destroy(&p->rstate); ae_vector_destroy(&p->tmp2); } 
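/*************************************************************************
NOTE (illustrative example, not part of ALGLIB): the sketch below shows one
possible way to drive the reverse-communication pair FBLSCGCreate() /
FBLSCGIteration() defined above, assuming A is an N*N symmetric positive
definite matrix stored as ae_matrix. Whenever FBLSCGIteration() returns
True, the caller computes A*x for the point stored in State.X, places the
product into State.AX and the scalar x'*A*x into State.XAX, then calls the
function again. The driver name example_fblscg_dense is hypothetical; only
the ALGLIB calls and structure fields are taken from the code above.
*************************************************************************/
static void example_fblscg_dense(/* Real */ ae_matrix* a,
     /* Real */ ae_vector* b,
     /* Real */ ae_vector* x,
     ae_int_t n,
     ae_state *_state)
{
    fblslincgstate s;
    ae_int_t i;
    double v;

    _fblslincgstate_init(&s, _state);
    fblscgcreate(x, b, n, &s, _state);
    while( fblscgiteration(&s, _state) )
    {
        /*
         * Reverse communication request: compute s.ax = A*s.x and s.xax = x'*A*x
         */
        s.xax = 0.0;
        for(i=0; i<=n-1; i++)
        {
            v = ae_v_dotproduct(&a->ptr.pp_double[i][0], 1, &s.x.ptr.p_double[0], 1, ae_v_len(0,n-1));
            s.ax.ptr.p_double[i] = v;
            s.xax = s.xax+s.x.ptr.p_double[i]*v;
        }
    }
    /*
     * Final iterate is in s.xk; s.e1/s.e2 hold initial/final residual norms.
     * Accept it only if the residual improved, mirroring FBLSSolveCGX above.
     */
    if( ae_fp_less(s.e2,s.e1) )
    {
        for(i=0; i<=n-1; i++)
        {
            x->ptr.p_double[i] = s.xk.ptr.p_double[i];
        }
    }
    _fblslincgstate_clear(&s);
}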
/************************************************************************* Singular value decomposition of a bidiagonal matrix (extended algorithm) COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes one important improvement of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Generally, commercial ALGLIB is several times faster than open-source ! generic C edition, and many times faster than open-source C# edition. ! ! Multithreaded acceleration is NOT supported for this function. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. The algorithm performs the singular value decomposition of a bidiagonal matrix B (upper or lower) representing it as B = Q*S*P^T, where Q and P - orthogonal matrices, S - diagonal matrix with non-negative elements on the main diagonal, in descending order. The algorithm finds singular values. In addition, the algorithm can calculate matrices Q and P (more precisely, not the matrices, but their product with given matrices U and VT - U*Q and (P^T)*VT)). Of course, matrices U and VT can be of any type, including identity. Furthermore, the algorithm can calculate Q'*C (this product is calculated more effectively than U*Q, because this calculation operates with rows instead of matrix columns). The feature of the algorithm is its ability to find all singular values including those which are arbitrarily close to 0 with relative accuracy close to machine precision. If the parameter IsFractionalAccuracyRequired is set to True, all singular values will have high relative accuracy close to machine precision. If the parameter is set to False, only the biggest singular value will have relative accuracy close to machine precision. The absolute error of other singular values is equal to the absolute error of the biggest singular value. Input parameters: D - main diagonal of matrix B. Array whose index ranges within [0..N-1]. E - superdiagonal (or subdiagonal) of matrix B. Array whose index ranges within [0..N-2]. N - size of matrix B. IsUpper - True, if the matrix is upper bidiagonal. IsFractionalAccuracyRequired - THIS PARAMETER IS IGNORED SINCE ALGLIB 3.5.0 SINGULAR VALUES ARE ALWAYS SEARCHED WITH HIGH ACCURACY. U - matrix to be multiplied by Q. Array whose indexes range within [0..NRU-1, 0..N-1]. The matrix can be bigger, in that case only the submatrix [0..NRU-1, 0..N-1] will be multiplied by Q. NRU - number of rows in matrix U. C - matrix to be multiplied by Q'. Array whose indexes range within [0..N-1, 0..NCC-1]. The matrix can be bigger, in that case only the submatrix [0..N-1, 0..NCC-1] will be multiplied by Q'. NCC - number of columns in matrix C. VT - matrix to be multiplied by P^T. Array whose indexes range within [0..N-1, 0..NCVT-1]. The matrix can be bigger, in that case only the submatrix [0..N-1, 0..NCVT-1] will be multiplied by P^T. NCVT - number of columns in matrix VT. Output parameters: D - singular values of matrix B in descending order. U - if NRU>0, contains matrix U*Q. VT - if NCVT>0, contains matrix (P^T)*VT. C - if NCC>0, contains matrix Q'*C. Result: True, if the algorithm has converged. 
False, if the algorithm hasn't converged (rare case). NOTE: multiplication U*Q is performed by means of transposition to internal buffer, multiplication and backward transposition. It helps to avoid costly columnwise operations and speed-up algorithm. Additional information: The type of convergence is controlled by the internal parameter TOL. If the parameter is greater than 0, the singular values will have relative accuracy TOL. If TOL<0, the singular values will have absolute accuracy ABS(TOL)*norm(B). By default, |TOL| falls within the range of 10*Epsilon and 100*Epsilon, where Epsilon is the machine precision. It is not recommended to use TOL less than 10*Epsilon since this will considerably slow down the algorithm and may not lead to error decreasing. History: * 31 March, 2007. changed MAXITR from 6 to 12. -- LAPACK routine (version 3.0) -- Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., Courant Institute, Argonne National Lab, and Rice University October 31, 1999. *************************************************************************/ ae_bool rmatrixbdsvd(/* Real */ ae_vector* d, /* Real */ ae_vector* e, ae_int_t n, ae_bool isupper, ae_bool isfractionalaccuracyrequired, /* Real */ ae_matrix* u, ae_int_t nru, /* Real */ ae_matrix* c, ae_int_t ncc, /* Real */ ae_matrix* vt, ae_int_t ncvt, ae_state *_state) { ae_frame _frame_block; ae_vector _e; ae_int_t i; ae_vector en; ae_vector d1; ae_vector e1; ae_bool result; ae_frame_make(_state, &_frame_block); ae_vector_init_copy(&_e, e, _state); e = &_e; ae_vector_init(&en, 0, DT_REAL, _state); ae_vector_init(&d1, 0, DT_REAL, _state); ae_vector_init(&e1, 0, DT_REAL, _state); result = ae_false; /* * Try to use MKL */ ae_vector_set_length(&en, n, _state); for(i=0; i<=n-2; i++) { en.ptr.p_double[i] = e->ptr.p_double[i]; } en.ptr.p_double[n-1] = 0.0; if( rmatrixbdsvdmkl(d, &en, n, isupper, u, nru, c, ncc, vt, ncvt, &result, _state) ) { ae_frame_leave(_state); return result; } /* * Use ALGLIB code */ ae_vector_set_length(&d1, n+1, _state); ae_v_move(&d1.ptr.p_double[1], 1, &d->ptr.p_double[0], 1, ae_v_len(1,n)); if( n>1 ) { ae_vector_set_length(&e1, n-1+1, _state); ae_v_move(&e1.ptr.p_double[1], 1, &e->ptr.p_double[0], 1, ae_v_len(1,n-1)); } result = bdsvd_bidiagonalsvddecompositioninternal(&d1, &e1, n, isupper, isfractionalaccuracyrequired, u, 0, nru, c, 0, ncc, vt, 0, ncvt, _state); ae_v_move(&d->ptr.p_double[0], 1, &d1.ptr.p_double[1], 1, ae_v_len(0,n-1)); ae_frame_leave(_state); return result; } ae_bool bidiagonalsvddecomposition(/* Real */ ae_vector* d, /* Real */ ae_vector* e, ae_int_t n, ae_bool isupper, ae_bool isfractionalaccuracyrequired, /* Real */ ae_matrix* u, ae_int_t nru, /* Real */ ae_matrix* c, ae_int_t ncc, /* Real */ ae_matrix* vt, ae_int_t ncvt, ae_state *_state) { ae_frame _frame_block; ae_vector _e; ae_bool result; ae_frame_make(_state, &_frame_block); ae_vector_init_copy(&_e, e, _state); e = &_e; result = bdsvd_bidiagonalsvddecompositioninternal(d, e, n, isupper, isfractionalaccuracyrequired, u, 1, nru, c, 1, ncc, vt, 1, ncvt, _state); ae_frame_leave(_state); return result; } /************************************************************************* Internal working subroutine for bidiagonal decomposition *************************************************************************/ static ae_bool bdsvd_bidiagonalsvddecompositioninternal(/* Real */ ae_vector* d, /* Real */ ae_vector* e, ae_int_t n, ae_bool isupper, ae_bool isfractionalaccuracyrequired, /* Real */ ae_matrix* uu, ae_int_t ustart, ae_int_t nru, /* Real 
*/ ae_matrix* c, ae_int_t cstart, ae_int_t ncc, /* Real */ ae_matrix* vt, ae_int_t vstart, ae_int_t ncvt, ae_state *_state) { ae_frame _frame_block; ae_vector _e; ae_int_t i; ae_int_t idir; ae_int_t isub; ae_int_t iter; ae_int_t j; ae_int_t ll; ae_int_t lll; ae_int_t m; ae_int_t maxit; ae_int_t oldll; ae_int_t oldm; double abse; double abss; double cosl; double cosr; double cs; double eps; double f; double g; double h; double mu; double oldcs; double oldsn; double r; double shift; double sigmn; double sigmx; double sinl; double sinr; double sll; double smax; double smin; double sminl; double sminoa; double sn; double thresh; double tol; double tolmul; double unfl; ae_vector work0; ae_vector work1; ae_vector work2; ae_vector work3; ae_int_t maxitr; ae_bool matrixsplitflag; ae_bool iterflag; ae_vector utemp; ae_vector vttemp; ae_vector ctemp; ae_vector etemp; ae_matrix ut; ae_bool fwddir; double tmp; ae_int_t mm1; ae_int_t mm0; ae_bool bchangedir; ae_int_t uend; ae_int_t cend; ae_int_t vend; ae_bool result; ae_frame_make(_state, &_frame_block); ae_vector_init_copy(&_e, e, _state); e = &_e; ae_vector_init(&work0, 0, DT_REAL, _state); ae_vector_init(&work1, 0, DT_REAL, _state); ae_vector_init(&work2, 0, DT_REAL, _state); ae_vector_init(&work3, 0, DT_REAL, _state); ae_vector_init(&utemp, 0, DT_REAL, _state); ae_vector_init(&vttemp, 0, DT_REAL, _state); ae_vector_init(&ctemp, 0, DT_REAL, _state); ae_vector_init(&etemp, 0, DT_REAL, _state); ae_matrix_init(&ut, 0, 0, DT_REAL, _state); result = ae_true; if( n==0 ) { ae_frame_leave(_state); return result; } if( n==1 ) { if( ae_fp_less(d->ptr.p_double[1],(double)(0)) ) { d->ptr.p_double[1] = -d->ptr.p_double[1]; if( ncvt>0 ) { ae_v_muld(&vt->ptr.pp_double[vstart][vstart], 1, ae_v_len(vstart,vstart+ncvt-1), -1); } } ae_frame_leave(_state); return result; } /* * these initializers are not really necessary, * but without them compiler complains about uninitialized locals */ ll = 0; oldsn = (double)(0); /* * init */ ae_vector_set_length(&work0, n-1+1, _state); ae_vector_set_length(&work1, n-1+1, _state); ae_vector_set_length(&work2, n-1+1, _state); ae_vector_set_length(&work3, n-1+1, _state); uend = ustart+ae_maxint(nru-1, 0, _state); vend = vstart+ae_maxint(ncvt-1, 0, _state); cend = cstart+ae_maxint(ncc-1, 0, _state); ae_vector_set_length(&utemp, uend+1, _state); ae_vector_set_length(&vttemp, vend+1, _state); ae_vector_set_length(&ctemp, cend+1, _state); maxitr = 12; fwddir = ae_true; if( nru>0 ) { ae_matrix_set_length(&ut, ustart+n, ustart+nru, _state); rmatrixtranspose(nru, n, uu, ustart, ustart, &ut, ustart, ustart, _state); } /* * resize E from N-1 to N */ ae_vector_set_length(&etemp, n+1, _state); for(i=1; i<=n-1; i++) { etemp.ptr.p_double[i] = e->ptr.p_double[i]; } ae_vector_set_length(e, n+1, _state); for(i=1; i<=n-1; i++) { e->ptr.p_double[i] = etemp.ptr.p_double[i]; } e->ptr.p_double[n] = (double)(0); idir = 0; /* * Get machine constants */ eps = ae_machineepsilon; unfl = ae_minrealnumber; /* * If matrix lower bidiagonal, rotate to be upper bidiagonal * by applying Givens rotations on the left */ if( !isupper ) { for(i=1; i<=n-1; i++) { generaterotation(d->ptr.p_double[i], e->ptr.p_double[i], &cs, &sn, &r, _state); d->ptr.p_double[i] = r; e->ptr.p_double[i] = sn*d->ptr.p_double[i+1]; d->ptr.p_double[i+1] = cs*d->ptr.p_double[i+1]; work0.ptr.p_double[i] = cs; work1.ptr.p_double[i] = sn; } /* * Update singular vectors if desired */ if( nru>0 ) { applyrotationsfromtheleft(fwddir, 1+ustart-1, n+ustart-1, ustart, uend, &work0, &work1, &ut, &utemp, 
_state); } if( ncc>0 ) { applyrotationsfromtheleft(fwddir, 1+cstart-1, n+cstart-1, cstart, cend, &work0, &work1, c, &ctemp, _state); } } /* * Compute singular values to relative accuracy TOL * (By setting TOL to be negative, algorithm will compute * singular values to absolute accuracy ABS(TOL)*norm(input matrix)) */ tolmul = ae_maxreal((double)(10), ae_minreal((double)(100), ae_pow(eps, -0.125, _state), _state), _state); tol = tolmul*eps; /* * Compute approximate maximum, minimum singular values */ smax = (double)(0); for(i=1; i<=n; i++) { smax = ae_maxreal(smax, ae_fabs(d->ptr.p_double[i], _state), _state); } for(i=1; i<=n-1; i++) { smax = ae_maxreal(smax, ae_fabs(e->ptr.p_double[i], _state), _state); } sminl = (double)(0); if( ae_fp_greater_eq(tol,(double)(0)) ) { /* * Relative accuracy desired */ sminoa = ae_fabs(d->ptr.p_double[1], _state); if( ae_fp_neq(sminoa,(double)(0)) ) { mu = sminoa; for(i=2; i<=n; i++) { mu = ae_fabs(d->ptr.p_double[i], _state)*(mu/(mu+ae_fabs(e->ptr.p_double[i-1], _state))); sminoa = ae_minreal(sminoa, mu, _state); if( ae_fp_eq(sminoa,(double)(0)) ) { break; } } } sminoa = sminoa/ae_sqrt((double)(n), _state); thresh = ae_maxreal(tol*sminoa, maxitr*n*n*unfl, _state); } else { /* * Absolute accuracy desired */ thresh = ae_maxreal(ae_fabs(tol, _state)*smax, maxitr*n*n*unfl, _state); } /* * Prepare for main iteration loop for the singular values * (MAXIT is the maximum number of passes through the inner * loop permitted before nonconvergence signalled.) */ maxit = maxitr*n*n; iter = 0; oldll = -1; oldm = -1; /* * M points to last element of unconverged part of matrix */ m = n; /* * Begin main iteration loop */ for(;;) { /* * Check for convergence or exceeding iteration count */ if( m<=1 ) { break; } if( iter>maxit ) { result = ae_false; ae_frame_leave(_state); return result; } /* * Find diagonal block of matrix to work on */ if( ae_fp_less(tol,(double)(0))&&ae_fp_less_eq(ae_fabs(d->ptr.p_double[m], _state),thresh) ) { d->ptr.p_double[m] = (double)(0); } smax = ae_fabs(d->ptr.p_double[m], _state); smin = smax; matrixsplitflag = ae_false; for(lll=1; lll<=m-1; lll++) { ll = m-lll; abss = ae_fabs(d->ptr.p_double[ll], _state); abse = ae_fabs(e->ptr.p_double[ll], _state); if( ae_fp_less(tol,(double)(0))&&ae_fp_less_eq(abss,thresh) ) { d->ptr.p_double[ll] = (double)(0); } if( ae_fp_less_eq(abse,thresh) ) { matrixsplitflag = ae_true; break; } smin = ae_minreal(smin, abss, _state); smax = ae_maxreal(smax, ae_maxreal(abss, abse, _state), _state); } if( !matrixsplitflag ) { ll = 0; } else { /* * Matrix splits since E(LL) = 0 */ e->ptr.p_double[ll] = (double)(0); if( ll==m-1 ) { /* * Convergence of bottom singular value, return to top of loop */ m = m-1; continue; } } ll = ll+1; /* * E(LL) through E(M-1) are nonzero, E(LL-1) is zero */ if( ll==m-1 ) { /* * 2 by 2 block, handle separately */ bdsvd_svdv2x2(d->ptr.p_double[m-1], e->ptr.p_double[m-1], d->ptr.p_double[m], &sigmn, &sigmx, &sinr, &cosr, &sinl, &cosl, _state); d->ptr.p_double[m-1] = sigmx; e->ptr.p_double[m-1] = (double)(0); d->ptr.p_double[m] = sigmn; /* * Compute singular vectors, if desired */ if( ncvt>0 ) { mm0 = m+(vstart-1); mm1 = m-1+(vstart-1); ae_v_moved(&vttemp.ptr.p_double[vstart], 1, &vt->ptr.pp_double[mm1][vstart], 1, ae_v_len(vstart,vend), cosr); ae_v_addd(&vttemp.ptr.p_double[vstart], 1, &vt->ptr.pp_double[mm0][vstart], 1, ae_v_len(vstart,vend), sinr); ae_v_muld(&vt->ptr.pp_double[mm0][vstart], 1, ae_v_len(vstart,vend), cosr); ae_v_subd(&vt->ptr.pp_double[mm0][vstart], 1, 
&vt->ptr.pp_double[mm1][vstart], 1, ae_v_len(vstart,vend), sinr); ae_v_move(&vt->ptr.pp_double[mm1][vstart], 1, &vttemp.ptr.p_double[vstart], 1, ae_v_len(vstart,vend)); } if( nru>0 ) { mm0 = m+ustart-1; mm1 = m-1+ustart-1; ae_v_moved(&utemp.ptr.p_double[ustart], 1, &ut.ptr.pp_double[mm1][ustart], 1, ae_v_len(ustart,uend), cosl); ae_v_addd(&utemp.ptr.p_double[ustart], 1, &ut.ptr.pp_double[mm0][ustart], 1, ae_v_len(ustart,uend), sinl); ae_v_muld(&ut.ptr.pp_double[mm0][ustart], 1, ae_v_len(ustart,uend), cosl); ae_v_subd(&ut.ptr.pp_double[mm0][ustart], 1, &ut.ptr.pp_double[mm1][ustart], 1, ae_v_len(ustart,uend), sinl); ae_v_move(&ut.ptr.pp_double[mm1][ustart], 1, &utemp.ptr.p_double[ustart], 1, ae_v_len(ustart,uend)); } if( ncc>0 ) { mm0 = m+cstart-1; mm1 = m-1+cstart-1; ae_v_moved(&ctemp.ptr.p_double[cstart], 1, &c->ptr.pp_double[mm1][cstart], 1, ae_v_len(cstart,cend), cosl); ae_v_addd(&ctemp.ptr.p_double[cstart], 1, &c->ptr.pp_double[mm0][cstart], 1, ae_v_len(cstart,cend), sinl); ae_v_muld(&c->ptr.pp_double[mm0][cstart], 1, ae_v_len(cstart,cend), cosl); ae_v_subd(&c->ptr.pp_double[mm0][cstart], 1, &c->ptr.pp_double[mm1][cstart], 1, ae_v_len(cstart,cend), sinl); ae_v_move(&c->ptr.pp_double[mm1][cstart], 1, &ctemp.ptr.p_double[cstart], 1, ae_v_len(cstart,cend)); } m = m-2; continue; } /* * If working on new submatrix, choose shift direction * (from larger end diagonal element towards smaller) * * Previously was * "if (LL>OLDM) or (M * Very strange that LAPACK still contains it. */ bchangedir = ae_false; if( idir==1&&ae_fp_less(ae_fabs(d->ptr.p_double[ll], _state),1.0E-3*ae_fabs(d->ptr.p_double[m], _state)) ) { bchangedir = ae_true; } if( idir==2&&ae_fp_less(ae_fabs(d->ptr.p_double[m], _state),1.0E-3*ae_fabs(d->ptr.p_double[ll], _state)) ) { bchangedir = ae_true; } if( (ll!=oldll||m!=oldm)||bchangedir ) { if( ae_fp_greater_eq(ae_fabs(d->ptr.p_double[ll], _state),ae_fabs(d->ptr.p_double[m], _state)) ) { /* * Chase bulge from top (big end) to bottom (small end) */ idir = 1; } else { /* * Chase bulge from bottom (big end) to top (small end) */ idir = 2; } } /* * Apply convergence tests */ if( idir==1 ) { /* * Run convergence test in forward direction * First apply standard test to bottom of matrix */ if( ae_fp_less_eq(ae_fabs(e->ptr.p_double[m-1], _state),ae_fabs(tol, _state)*ae_fabs(d->ptr.p_double[m], _state))||(ae_fp_less(tol,(double)(0))&&ae_fp_less_eq(ae_fabs(e->ptr.p_double[m-1], _state),thresh)) ) { e->ptr.p_double[m-1] = (double)(0); continue; } if( ae_fp_greater_eq(tol,(double)(0)) ) { /* * If relative accuracy desired, * apply convergence criterion forward */ mu = ae_fabs(d->ptr.p_double[ll], _state); sminl = mu; iterflag = ae_false; for(lll=ll; lll<=m-1; lll++) { if( ae_fp_less_eq(ae_fabs(e->ptr.p_double[lll], _state),tol*mu) ) { e->ptr.p_double[lll] = (double)(0); iterflag = ae_true; break; } mu = ae_fabs(d->ptr.p_double[lll+1], _state)*(mu/(mu+ae_fabs(e->ptr.p_double[lll], _state))); sminl = ae_minreal(sminl, mu, _state); } if( iterflag ) { continue; } } } else { /* * Run convergence test in backward direction * First apply standard test to top of matrix */ if( ae_fp_less_eq(ae_fabs(e->ptr.p_double[ll], _state),ae_fabs(tol, _state)*ae_fabs(d->ptr.p_double[ll], _state))||(ae_fp_less(tol,(double)(0))&&ae_fp_less_eq(ae_fabs(e->ptr.p_double[ll], _state),thresh)) ) { e->ptr.p_double[ll] = (double)(0); continue; } if( ae_fp_greater_eq(tol,(double)(0)) ) { /* * If relative accuracy desired, * apply convergence criterion backward */ mu = ae_fabs(d->ptr.p_double[m], _state); sminl = mu; 
iterflag = ae_false; for(lll=m-1; lll>=ll; lll--) { if( ae_fp_less_eq(ae_fabs(e->ptr.p_double[lll], _state),tol*mu) ) { e->ptr.p_double[lll] = (double)(0); iterflag = ae_true; break; } mu = ae_fabs(d->ptr.p_double[lll], _state)*(mu/(mu+ae_fabs(e->ptr.p_double[lll], _state))); sminl = ae_minreal(sminl, mu, _state); } if( iterflag ) { continue; } } } oldll = ll; oldm = m; /* * Compute shift. First, test if shifting would ruin relative * accuracy, and if so set the shift to zero. */ if( ae_fp_greater_eq(tol,(double)(0))&&ae_fp_less_eq(n*tol*(sminl/smax),ae_maxreal(eps, 0.01*tol, _state)) ) { /* * Use a zero shift to avoid loss of relative accuracy */ shift = (double)(0); } else { /* * Compute the shift from 2-by-2 block at end of matrix */ if( idir==1 ) { sll = ae_fabs(d->ptr.p_double[ll], _state); bdsvd_svd2x2(d->ptr.p_double[m-1], e->ptr.p_double[m-1], d->ptr.p_double[m], &shift, &r, _state); } else { sll = ae_fabs(d->ptr.p_double[m], _state); bdsvd_svd2x2(d->ptr.p_double[ll], e->ptr.p_double[ll], d->ptr.p_double[ll+1], &shift, &r, _state); } /* * Test if shift negligible, and if so set to zero */ if( ae_fp_greater(sll,(double)(0)) ) { if( ae_fp_less(ae_sqr(shift/sll, _state),eps) ) { shift = (double)(0); } } } /* * Increment iteration count */ iter = iter+m-ll; /* * If SHIFT = 0, do simplified QR iteration */ if( ae_fp_eq(shift,(double)(0)) ) { if( idir==1 ) { /* * Chase bulge from top to bottom * Save cosines and sines for later singular vector updates */ cs = (double)(1); oldcs = (double)(1); for(i=ll; i<=m-1; i++) { generaterotation(d->ptr.p_double[i]*cs, e->ptr.p_double[i], &cs, &sn, &r, _state); if( i>ll ) { e->ptr.p_double[i-1] = oldsn*r; } generaterotation(oldcs*r, d->ptr.p_double[i+1]*sn, &oldcs, &oldsn, &tmp, _state); d->ptr.p_double[i] = tmp; work0.ptr.p_double[i-ll+1] = cs; work1.ptr.p_double[i-ll+1] = sn; work2.ptr.p_double[i-ll+1] = oldcs; work3.ptr.p_double[i-ll+1] = oldsn; } h = d->ptr.p_double[m]*cs; d->ptr.p_double[m] = h*oldcs; e->ptr.p_double[m-1] = h*oldsn; /* * Update singular vectors */ if( ncvt>0 ) { applyrotationsfromtheleft(fwddir, ll+vstart-1, m+vstart-1, vstart, vend, &work0, &work1, vt, &vttemp, _state); } if( nru>0 ) { applyrotationsfromtheleft(fwddir, ll+ustart-1, m+ustart-1, ustart, uend, &work2, &work3, &ut, &utemp, _state); } if( ncc>0 ) { applyrotationsfromtheleft(fwddir, ll+cstart-1, m+cstart-1, cstart, cend, &work2, &work3, c, &ctemp, _state); } /* * Test convergence */ if( ae_fp_less_eq(ae_fabs(e->ptr.p_double[m-1], _state),thresh) ) { e->ptr.p_double[m-1] = (double)(0); } } else { /* * Chase bulge from bottom to top * Save cosines and sines for later singular vector updates */ cs = (double)(1); oldcs = (double)(1); for(i=m; i>=ll+1; i--) { generaterotation(d->ptr.p_double[i]*cs, e->ptr.p_double[i-1], &cs, &sn, &r, _state); if( i<m ) { e->ptr.p_double[i] = oldsn*r; } generaterotation(oldcs*r, d->ptr.p_double[i-1]*sn, &oldcs, &oldsn, &tmp, _state); d->ptr.p_double[i] = tmp; work0.ptr.p_double[i-ll] = cs; work1.ptr.p_double[i-ll] = -sn; work2.ptr.p_double[i-ll] = oldcs; work3.ptr.p_double[i-ll] = -oldsn; } h = d->ptr.p_double[ll]*cs; d->ptr.p_double[ll] = h*oldcs; e->ptr.p_double[ll] = h*oldsn; /* * Update singular vectors */ if( ncvt>0 ) { applyrotationsfromtheleft(!fwddir, ll+vstart-1, m+vstart-1, vstart, vend, &work2, &work3, vt, &vttemp, _state); } if( nru>0 ) { applyrotationsfromtheleft(!fwddir, ll+ustart-1, m+ustart-1, ustart, uend, &work0, &work1, &ut, &utemp, _state); } if( ncc>0 ) { applyrotationsfromtheleft(!fwddir, ll+cstart-1, m+cstart-1, cstart, cend, &work0, &work1, c, &ctemp, _state); } /* * Test convergence */ if( ae_fp_less_eq(ae_fabs(e->ptr.p_double[ll], _state),thresh) ) { e->ptr.p_double[ll] = (double)(0); } } } else { /* * Use nonzero shift */ if( idir==1 ) { /* * Chase bulge from top to bottom * Save cosines and sines for later singular vector updates */ f = (ae_fabs(d->ptr.p_double[ll], _state)-shift)*(bdsvd_extsignbdsqr((double)(1), d->ptr.p_double[ll], _state)+shift/d->ptr.p_double[ll]); g = e->ptr.p_double[ll]; for(i=ll; i<=m-1; i++) { generaterotation(f, g, &cosr, &sinr, &r, _state); if( i>ll ) { e->ptr.p_double[i-1] = r; } f = cosr*d->ptr.p_double[i]+sinr*e->ptr.p_double[i]; e->ptr.p_double[i] = cosr*e->ptr.p_double[i]-sinr*d->ptr.p_double[i]; g = sinr*d->ptr.p_double[i+1]; d->ptr.p_double[i+1] = cosr*d->ptr.p_double[i+1]; generaterotation(f, g, &cosl, &sinl, &r, _state); d->ptr.p_double[i] = r; f = cosl*e->ptr.p_double[i]+sinl*d->ptr.p_double[i+1]; d->ptr.p_double[i+1] = cosl*d->ptr.p_double[i+1]-sinl*e->ptr.p_double[i]; if( i<m-1 ) { g = sinl*e->ptr.p_double[i+1]; e->ptr.p_double[i+1] = cosl*e->ptr.p_double[i+1]; } work0.ptr.p_double[i-ll+1] = cosr; work1.ptr.p_double[i-ll+1] = sinr; work2.ptr.p_double[i-ll+1] = cosl; work3.ptr.p_double[i-ll+1] = sinl; } e->ptr.p_double[m-1] = f; /* * Update singular vectors */ if( ncvt>0 ) { applyrotationsfromtheleft(fwddir, ll+vstart-1, m+vstart-1, vstart, vend, &work0, &work1, vt, &vttemp, _state); } if( nru>0 ) { applyrotationsfromtheleft(fwddir, ll+ustart-1, m+ustart-1, ustart, uend, &work2, &work3, &ut, &utemp, _state); } if( ncc>0 ) { applyrotationsfromtheleft(fwddir, ll+cstart-1, m+cstart-1, cstart, cend, &work2, &work3, c, &ctemp, _state); } /* * Test convergence */ if( ae_fp_less_eq(ae_fabs(e->ptr.p_double[m-1], _state),thresh) ) { e->ptr.p_double[m-1] = (double)(0); } } else { /* * Chase bulge from bottom to top * Save cosines and sines for later singular vector updates */ f = (ae_fabs(d->ptr.p_double[m], _state)-shift)*(bdsvd_extsignbdsqr((double)(1), d->ptr.p_double[m], _state)+shift/d->ptr.p_double[m]); g = e->ptr.p_double[m-1]; for(i=m; i>=ll+1; i--) { generaterotation(f, g, &cosr, &sinr, &r, _state); if( i<m ) { e->ptr.p_double[i] = r; } f = cosr*d->ptr.p_double[i]+sinr*e->ptr.p_double[i-1]; e->ptr.p_double[i-1] = cosr*e->ptr.p_double[i-1]-sinr*d->ptr.p_double[i]; g = sinr*d->ptr.p_double[i-1]; d->ptr.p_double[i-1] = cosr*d->ptr.p_double[i-1]; generaterotation(f, g, &cosl, &sinl, &r, _state); d->ptr.p_double[i] = r; f = cosl*e->ptr.p_double[i-1]+sinl*d->ptr.p_double[i-1]; d->ptr.p_double[i-1] = cosl*d->ptr.p_double[i-1]-sinl*e->ptr.p_double[i-1]; if( i>ll+1 ) { g = sinl*e->ptr.p_double[i-2]; e->ptr.p_double[i-2] = cosl*e->ptr.p_double[i-2]; } work0.ptr.p_double[i-ll] = cosr; work1.ptr.p_double[i-ll] = -sinr; work2.ptr.p_double[i-ll] = cosl; work3.ptr.p_double[i-ll] = -sinl; } e->ptr.p_double[ll] = f; /* * Test convergence */ if( ae_fp_less_eq(ae_fabs(e->ptr.p_double[ll], _state),thresh) ) { e->ptr.p_double[ll] = (double)(0); } /* * Update singular vectors if desired */ if( ncvt>0 ) { applyrotationsfromtheleft(!fwddir, ll+vstart-1, m+vstart-1, vstart, vend, &work2, &work3, vt, &vttemp, _state); } if( nru>0 ) { applyrotationsfromtheleft(!fwddir, ll+ustart-1, m+ustart-1, ustart, uend, &work0, &work1, &ut, &utemp, _state); } if( ncc>0 ) { applyrotationsfromtheleft(!fwddir, ll+cstart-1, m+cstart-1, cstart, cend, &work0, &work1, c, &ctemp, _state); } } } /* * QR iteration finished, go back and check convergence */ continue; } /* * All singular values converged, so make them positive */ for(i=1; i<=n; i++) { 
if( ae_fp_less(d->ptr.p_double[i],(double)(0)) ) { d->ptr.p_double[i] = -d->ptr.p_double[i]; /* * Change sign of singular vectors, if desired */ if( ncvt>0 ) { ae_v_muld(&vt->ptr.pp_double[i+vstart-1][vstart], 1, ae_v_len(vstart,vend), -1); } } } /* * Sort the singular values into decreasing order (insertion sort on * singular values, but only one transposition per singular vector) */ for(i=1; i<=n-1; i++) { /* * Scan for smallest D(I) */ isub = 1; smin = d->ptr.p_double[1]; for(j=2; j<=n+1-i; j++) { if( ae_fp_less_eq(d->ptr.p_double[j],smin) ) { isub = j; smin = d->ptr.p_double[j]; } } if( isub!=n+1-i ) { /* * Swap singular values and vectors */ d->ptr.p_double[isub] = d->ptr.p_double[n+1-i]; d->ptr.p_double[n+1-i] = smin; if( ncvt>0 ) { j = n+1-i; ae_v_move(&vttemp.ptr.p_double[vstart], 1, &vt->ptr.pp_double[isub+vstart-1][vstart], 1, ae_v_len(vstart,vend)); ae_v_move(&vt->ptr.pp_double[isub+vstart-1][vstart], 1, &vt->ptr.pp_double[j+vstart-1][vstart], 1, ae_v_len(vstart,vend)); ae_v_move(&vt->ptr.pp_double[j+vstart-1][vstart], 1, &vttemp.ptr.p_double[vstart], 1, ae_v_len(vstart,vend)); } if( nru>0 ) { j = n+1-i; ae_v_move(&utemp.ptr.p_double[ustart], 1, &ut.ptr.pp_double[isub+ustart-1][ustart], 1, ae_v_len(ustart,uend)); ae_v_move(&ut.ptr.pp_double[isub+ustart-1][ustart], 1, &ut.ptr.pp_double[j+ustart-1][ustart], 1, ae_v_len(ustart,uend)); ae_v_move(&ut.ptr.pp_double[j+ustart-1][ustart], 1, &utemp.ptr.p_double[ustart], 1, ae_v_len(ustart,uend)); } if( ncc>0 ) { j = n+1-i; ae_v_move(&ctemp.ptr.p_double[cstart], 1, &c->ptr.pp_double[isub+cstart-1][cstart], 1, ae_v_len(cstart,cend)); ae_v_move(&c->ptr.pp_double[isub+cstart-1][cstart], 1, &c->ptr.pp_double[j+cstart-1][cstart], 1, ae_v_len(cstart,cend)); ae_v_move(&c->ptr.pp_double[j+cstart-1][cstart], 1, &ctemp.ptr.p_double[cstart], 1, ae_v_len(cstart,cend)); } } } /* * Copy U back from temporary storage */ if( nru>0 ) { rmatrixtranspose(n, nru, &ut, ustart, ustart, uu, ustart, ustart, _state); } ae_frame_leave(_state); return result; } static double bdsvd_extsignbdsqr(double a, double b, ae_state *_state) { double result; if( ae_fp_greater_eq(b,(double)(0)) ) { result = ae_fabs(a, _state); } else { result = -ae_fabs(a, _state); } return result; } static void bdsvd_svd2x2(double f, double g, double h, double* ssmin, double* ssmax, ae_state *_state) { double aas; double at; double au; double c; double fa; double fhmn; double fhmx; double ga; double ha; *ssmin = 0; *ssmax = 0; fa = ae_fabs(f, _state); ga = ae_fabs(g, _state); ha = ae_fabs(h, _state); fhmn = ae_minreal(fa, ha, _state); fhmx = ae_maxreal(fa, ha, _state); if( ae_fp_eq(fhmn,(double)(0)) ) { *ssmin = (double)(0); if( ae_fp_eq(fhmx,(double)(0)) ) { *ssmax = ga; } else { *ssmax = ae_maxreal(fhmx, ga, _state)*ae_sqrt(1+ae_sqr(ae_minreal(fhmx, ga, _state)/ae_maxreal(fhmx, ga, _state), _state), _state); } } else { if( ae_fp_less(ga,fhmx) ) { aas = 1+fhmn/fhmx; at = (fhmx-fhmn)/fhmx; au = ae_sqr(ga/fhmx, _state); c = 2/(ae_sqrt(aas*aas+au, _state)+ae_sqrt(at*at+au, _state)); *ssmin = fhmn*c; *ssmax = fhmx/c; } else { au = fhmx/ga; if( ae_fp_eq(au,(double)(0)) ) { /* * Avoid possible harmful underflow if exponent range * asymmetric (true SSMIN may not underflow even if * AU underflows) */ *ssmin = fhmn*fhmx/ga; *ssmax = ga; } else { aas = 1+fhmn/fhmx; at = (fhmx-fhmn)/fhmx; c = 1/(ae_sqrt(1+ae_sqr(aas*au, _state), _state)+ae_sqrt(1+ae_sqr(at*au, _state), _state)); *ssmin = fhmn*c*au; *ssmin = *ssmin+(*ssmin); *ssmax = ga/(c+c); } } } } static void bdsvd_svdv2x2(double f, double g, double 
h, double* ssmin, double* ssmax, double* snr, double* csr, double* snl, double* csl, ae_state *_state) { ae_bool gasmal; ae_bool swp; ae_int_t pmax; double a; double clt; double crt; double d; double fa; double ft; double ga; double gt; double ha; double ht; double l; double m; double mm; double r; double s; double slt; double srt; double t; double temp; double tsign; double tt; double v; *ssmin = 0; *ssmax = 0; *snr = 0; *csr = 0; *snl = 0; *csl = 0; ft = f; fa = ae_fabs(ft, _state); ht = h; ha = ae_fabs(h, _state); /* * these initializers are not really necessary, * but without them compiler complains about uninitialized locals */ clt = (double)(0); crt = (double)(0); slt = (double)(0); srt = (double)(0); tsign = (double)(0); /* * PMAX points to the maximum absolute element of matrix * PMAX = 1 if F largest in absolute values * PMAX = 2 if G largest in absolute values * PMAX = 3 if H largest in absolute values */ pmax = 1; swp = ae_fp_greater(ha,fa); if( swp ) { /* * Now FA .ge. HA */ pmax = 3; temp = ft; ft = ht; ht = temp; temp = fa; fa = ha; ha = temp; } gt = g; ga = ae_fabs(gt, _state); if( ae_fp_eq(ga,(double)(0)) ) { /* * Diagonal matrix */ *ssmin = ha; *ssmax = fa; clt = (double)(1); crt = (double)(1); slt = (double)(0); srt = (double)(0); } else { gasmal = ae_true; if( ae_fp_greater(ga,fa) ) { pmax = 2; if( ae_fp_less(fa/ga,ae_machineepsilon) ) { /* * Case of very large GA */ gasmal = ae_false; *ssmax = ga; if( ae_fp_greater(ha,(double)(1)) ) { v = ga/ha; *ssmin = fa/v; } else { v = fa/ga; *ssmin = v*ha; } clt = (double)(1); slt = ht/gt; srt = (double)(1); crt = ft/gt; } } if( gasmal ) { /* * Normal case */ d = fa-ha; if( ae_fp_eq(d,fa) ) { l = (double)(1); } else { l = d/fa; } m = gt/ft; t = 2-l; mm = m*m; tt = t*t; s = ae_sqrt(tt+mm, _state); if( ae_fp_eq(l,(double)(0)) ) { r = ae_fabs(m, _state); } else { r = ae_sqrt(l*l+mm, _state); } a = 0.5*(s+r); *ssmin = ha/a; *ssmax = fa*a; if( ae_fp_eq(mm,(double)(0)) ) { /* * Note that M is very tiny */ if( ae_fp_eq(l,(double)(0)) ) { t = bdsvd_extsignbdsqr((double)(2), ft, _state)*bdsvd_extsignbdsqr((double)(1), gt, _state); } else { t = gt/bdsvd_extsignbdsqr(d, ft, _state)+m/t; } } else { t = (m/(s+t)+m/(r+l))*(1+a); } l = ae_sqrt(t*t+4, _state); crt = 2/l; srt = t/l; clt = (crt+srt*m)/a; v = ht/ft; slt = v*srt/a; } } if( swp ) { *csl = srt; *snl = crt; *csr = slt; *snr = clt; } else { *csl = clt; *snl = slt; *csr = crt; *snr = srt; } /* * Correct signs of SSMAX and SSMIN */ if( pmax==1 ) { tsign = bdsvd_extsignbdsqr((double)(1), *csr, _state)*bdsvd_extsignbdsqr((double)(1), *csl, _state)*bdsvd_extsignbdsqr((double)(1), f, _state); } if( pmax==2 ) { tsign = bdsvd_extsignbdsqr((double)(1), *snr, _state)*bdsvd_extsignbdsqr((double)(1), *csl, _state)*bdsvd_extsignbdsqr((double)(1), g, _state); } if( pmax==3 ) { tsign = bdsvd_extsignbdsqr((double)(1), *snr, _state)*bdsvd_extsignbdsqr((double)(1), *snl, _state)*bdsvd_extsignbdsqr((double)(1), h, _state); } *ssmax = bdsvd_extsignbdsqr(*ssmax, tsign, _state); *ssmin = bdsvd_extsignbdsqr(*ssmin, tsign*bdsvd_extsignbdsqr((double)(1), f, _state)*bdsvd_extsignbdsqr((double)(1), h, _state), _state); } /************************************************************************* Singular value decomposition of a rectangular matrix. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes one important improvement of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! ! 
Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Generally, commercial ALGLIB is several times faster than open-source ! generic C edition, and many times faster than open-source C# edition. ! ! Multithreaded acceleration is only partially supported (some parts are ! optimized, but most - are not). ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. The algorithm calculates the singular value decomposition of a matrix of size MxN: A = U * S * V^T The algorithm finds the singular values and, optionally, matrices U and V^T. The algorithm can find both first min(M,N) columns of matrix U and rows of matrix V^T (singular vectors), and matrices U and V^T wholly (of sizes MxM and NxN respectively). Take into account that the subroutine does not return matrix V but V^T. Input parameters: A - matrix to be decomposed. Array whose indexes range within [0..M-1, 0..N-1]. M - number of rows in matrix A. N - number of columns in matrix A. UNeeded - 0, 1 or 2. See the description of the parameter U. VTNeeded - 0, 1 or 2. See the description of the parameter VT. AdditionalMemory - If the parameter: * equals 0, the algorithm doesn't use additional memory (lower requirements, lower performance). * equals 1, the algorithm uses additional memory of size min(M,N)*min(M,N) of real numbers. It often speeds up the algorithm. * equals 2, the algorithm uses additional memory of size M*min(M,N) of real numbers. It allows to get a maximum performance. The recommended value of the parameter is 2. Output parameters: W - contains singular values in descending order. U - if UNeeded=0, U isn't changed, the left singular vectors are not calculated. if Uneeded=1, U contains left singular vectors (first min(M,N) columns of matrix U). Array whose indexes range within [0..M-1, 0..Min(M,N)-1]. if UNeeded=2, U contains matrix U wholly. Array whose indexes range within [0..M-1, 0..M-1]. VT - if VTNeeded=0, VT isn't changed, the right singular vectors are not calculated. if VTNeeded=1, VT contains right singular vectors (first min(M,N) rows of matrix V^T). Array whose indexes range within [0..min(M,N)-1, 0..N-1]. if VTNeeded=2, VT contains matrix V^T wholly. Array whose indexes range within [0..N-1, 0..N-1]. 
-- ALGLIB -- Copyright 2005 by Bochkanov Sergey *************************************************************************/ ae_bool rmatrixsvd(/* Real */ ae_matrix* a, ae_int_t m, ae_int_t n, ae_int_t uneeded, ae_int_t vtneeded, ae_int_t additionalmemory, /* Real */ ae_vector* w, /* Real */ ae_matrix* u, /* Real */ ae_matrix* vt, ae_state *_state) { ae_frame _frame_block; ae_matrix _a; ae_vector tauq; ae_vector taup; ae_vector tau; ae_vector e; ae_vector work; ae_matrix t2; ae_bool isupper; ae_int_t minmn; ae_int_t ncu; ae_int_t nrvt; ae_int_t nru; ae_int_t ncvt; ae_int_t i; ae_int_t j; ae_bool result; ae_frame_make(_state, &_frame_block); ae_matrix_init_copy(&_a, a, _state); a = &_a; ae_vector_clear(w); ae_matrix_clear(u); ae_matrix_clear(vt); ae_vector_init(&tauq, 0, DT_REAL, _state); ae_vector_init(&taup, 0, DT_REAL, _state); ae_vector_init(&tau, 0, DT_REAL, _state); ae_vector_init(&e, 0, DT_REAL, _state); ae_vector_init(&work, 0, DT_REAL, _state); ae_matrix_init(&t2, 0, 0, DT_REAL, _state); result = ae_true; if( m==0||n==0 ) { ae_frame_leave(_state); return result; } ae_assert(uneeded>=0&&uneeded<=2, "SVDDecomposition: wrong parameters!", _state); ae_assert(vtneeded>=0&&vtneeded<=2, "SVDDecomposition: wrong parameters!", _state); ae_assert(additionalmemory>=0&&additionalmemory<=2, "SVDDecomposition: wrong parameters!", _state); /* * initialize */ minmn = ae_minint(m, n, _state); ae_vector_set_length(w, minmn+1, _state); ncu = 0; nru = 0; if( uneeded==1 ) { nru = m; ncu = minmn; ae_matrix_set_length(u, nru-1+1, ncu-1+1, _state); } if( uneeded==2 ) { nru = m; ncu = m; ae_matrix_set_length(u, nru-1+1, ncu-1+1, _state); } nrvt = 0; ncvt = 0; if( vtneeded==1 ) { nrvt = minmn; ncvt = n; ae_matrix_set_length(vt, nrvt-1+1, ncvt-1+1, _state); } if( vtneeded==2 ) { nrvt = n; ncvt = n; ae_matrix_set_length(vt, nrvt-1+1, ncvt-1+1, _state); } /* * M much larger than N * Use bidiagonal reduction with QR-decomposition */ if( ae_fp_greater((double)(m),1.6*n) ) { if( uneeded==0 ) { /* * No left singular vectors to be computed */ rmatrixqr(a, m, n, &tau, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=i-1; j++) { a->ptr.pp_double[i][j] = (double)(0); } } rmatrixbd(a, n, n, &tauq, &taup, _state); rmatrixbdunpackpt(a, n, n, &taup, nrvt, vt, _state); rmatrixbdunpackdiagonals(a, n, n, &isupper, w, &e, _state); result = rmatrixbdsvd(w, &e, n, isupper, ae_false, u, 0, a, 0, vt, ncvt, _state); ae_frame_leave(_state); return result; } else { /* * Left singular vectors (may be full matrix U) to be computed */ rmatrixqr(a, m, n, &tau, _state); rmatrixqrunpackq(a, m, n, &tau, ncu, u, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=i-1; j++) { a->ptr.pp_double[i][j] = (double)(0); } } rmatrixbd(a, n, n, &tauq, &taup, _state); rmatrixbdunpackpt(a, n, n, &taup, nrvt, vt, _state); rmatrixbdunpackdiagonals(a, n, n, &isupper, w, &e, _state); if( additionalmemory<1 ) { /* * No additional memory can be used */ rmatrixbdmultiplybyq(a, n, n, &tauq, u, m, n, ae_true, ae_false, _state); result = rmatrixbdsvd(w, &e, n, isupper, ae_false, u, m, a, 0, vt, ncvt, _state); } else { /* * Large U. 
Transforming intermediate matrix T2 */ ae_vector_set_length(&work, ae_maxint(m, n, _state)+1, _state); rmatrixbdunpackq(a, n, n, &tauq, n, &t2, _state); copymatrix(u, 0, m-1, 0, n-1, a, 0, m-1, 0, n-1, _state); inplacetranspose(&t2, 0, n-1, 0, n-1, &work, _state); result = rmatrixbdsvd(w, &e, n, isupper, ae_false, u, 0, &t2, n, vt, ncvt, _state); rmatrixgemm(m, n, n, 1.0, a, 0, 0, 0, &t2, 0, 0, 1, 0.0, u, 0, 0, _state); } ae_frame_leave(_state); return result; } } /* * N much larger than M * Use bidiagonal reduction with LQ-decomposition */ if( ae_fp_greater((double)(n),1.6*m) ) { if( vtneeded==0 ) { /* * No right singular vectors to be computed */ rmatrixlq(a, m, n, &tau, _state); for(i=0; i<=m-1; i++) { for(j=i+1; j<=m-1; j++) { a->ptr.pp_double[i][j] = (double)(0); } } rmatrixbd(a, m, m, &tauq, &taup, _state); rmatrixbdunpackq(a, m, m, &tauq, ncu, u, _state); rmatrixbdunpackdiagonals(a, m, m, &isupper, w, &e, _state); ae_vector_set_length(&work, m+1, _state); inplacetranspose(u, 0, nru-1, 0, ncu-1, &work, _state); result = rmatrixbdsvd(w, &e, m, isupper, ae_false, a, 0, u, nru, vt, 0, _state); inplacetranspose(u, 0, nru-1, 0, ncu-1, &work, _state); ae_frame_leave(_state); return result; } else { /* * Right singular vectors (may be full matrix VT) to be computed */ rmatrixlq(a, m, n, &tau, _state); rmatrixlqunpackq(a, m, n, &tau, nrvt, vt, _state); for(i=0; i<=m-1; i++) { for(j=i+1; j<=m-1; j++) { a->ptr.pp_double[i][j] = (double)(0); } } rmatrixbd(a, m, m, &tauq, &taup, _state); rmatrixbdunpackq(a, m, m, &tauq, ncu, u, _state); rmatrixbdunpackdiagonals(a, m, m, &isupper, w, &e, _state); ae_vector_set_length(&work, ae_maxint(m, n, _state)+1, _state); inplacetranspose(u, 0, nru-1, 0, ncu-1, &work, _state); if( additionalmemory<1 ) { /* * No additional memory available */ rmatrixbdmultiplybyp(a, m, m, &taup, vt, m, n, ae_false, ae_true, _state); result = rmatrixbdsvd(w, &e, m, isupper, ae_false, a, 0, u, nru, vt, n, _state); } else { /* * Large VT. 
Transforming intermediate matrix T2 */ rmatrixbdunpackpt(a, m, m, &taup, m, &t2, _state); result = rmatrixbdsvd(w, &e, m, isupper, ae_false, a, 0, u, nru, &t2, m, _state); copymatrix(vt, 0, m-1, 0, n-1, a, 0, m-1, 0, n-1, _state); rmatrixgemm(m, n, m, 1.0, &t2, 0, 0, 0, a, 0, 0, 0, 0.0, vt, 0, 0, _state); } inplacetranspose(u, 0, nru-1, 0, ncu-1, &work, _state); ae_frame_leave(_state); return result; } } /* * M<=N * We can use inplace transposition of U to get rid of columnwise operations */ if( m<=n ) { rmatrixbd(a, m, n, &tauq, &taup, _state); rmatrixbdunpackq(a, m, n, &tauq, ncu, u, _state); rmatrixbdunpackpt(a, m, n, &taup, nrvt, vt, _state); rmatrixbdunpackdiagonals(a, m, n, &isupper, w, &e, _state); ae_vector_set_length(&work, m+1, _state); inplacetranspose(u, 0, nru-1, 0, ncu-1, &work, _state); result = rmatrixbdsvd(w, &e, minmn, isupper, ae_false, a, 0, u, nru, vt, ncvt, _state); inplacetranspose(u, 0, nru-1, 0, ncu-1, &work, _state); ae_frame_leave(_state); return result; } /* * Simple bidiagonal reduction */ rmatrixbd(a, m, n, &tauq, &taup, _state); rmatrixbdunpackq(a, m, n, &tauq, ncu, u, _state); rmatrixbdunpackpt(a, m, n, &taup, nrvt, vt, _state); rmatrixbdunpackdiagonals(a, m, n, &isupper, w, &e, _state); if( additionalmemory<2||uneeded==0 ) { /* * We cant use additional memory or there is no need in such operations */ result = rmatrixbdsvd(w, &e, minmn, isupper, ae_false, u, nru, a, 0, vt, ncvt, _state); } else { /* * We can use additional memory */ ae_matrix_set_length(&t2, minmn-1+1, m-1+1, _state); copyandtranspose(u, 0, m-1, 0, minmn-1, &t2, 0, minmn-1, 0, m-1, _state); result = rmatrixbdsvd(w, &e, minmn, isupper, ae_false, u, 0, &t2, m, vt, ncvt, _state); copyandtranspose(&t2, 0, minmn-1, 0, m-1, u, 0, m-1, 0, minmn-1, _state); } ae_frame_leave(_state); return result; } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. *************************************************************************/ ae_bool _pexec_rmatrixsvd(/* Real */ ae_matrix* a, ae_int_t m, ae_int_t n, ae_int_t uneeded, ae_int_t vtneeded, ae_int_t additionalmemory, /* Real */ ae_vector* w, /* Real */ ae_matrix* u, /* Real */ ae_matrix* vt, ae_state *_state) { return rmatrixsvd(a,m,n,uneeded,vtneeded,additionalmemory,w,u,vt, _state); } /************************************************************************* This procedure initializes matrix norm estimator. USAGE: 1. User initializes algorithm state with NormEstimatorCreate() call 2. User calls NormEstimatorEstimateSparse() (or NormEstimatorIteration()) 3. User calls NormEstimatorResults() to get solution. INPUT PARAMETERS: M - number of rows in the matrix being estimated, M>0 N - number of columns in the matrix being estimated, N>0 NStart - number of random starting vectors recommended value - at least 5. NIts - number of iterations to do with best starting vector recommended value - at least 5. OUTPUT PARAMETERS: State - structure which stores algorithm state NOTE: this algorithm is effectively deterministic, i.e. it always returns same result when repeatedly called for the same matrix. In fact, algorithm uses randomized starting vectors, but internal random numbers generator always generates same sequence of the random values (it is a feature, not bug). Algorithm can be made non-deterministic with NormEstimatorSetSeed(0) call. 
-- ALGLIB -- Copyright 06.12.2011 by Bochkanov Sergey *************************************************************************/ void normestimatorcreate(ae_int_t m, ae_int_t n, ae_int_t nstart, ae_int_t nits, normestimatorstate* state, ae_state *_state) { _normestimatorstate_clear(state); ae_assert(m>0, "NormEstimatorCreate: M<=0", _state); ae_assert(n>0, "NormEstimatorCreate: N<=0", _state); ae_assert(nstart>0, "NormEstimatorCreate: NStart<=0", _state); ae_assert(nits>0, "NormEstimatorCreate: NIts<=0", _state); state->m = m; state->n = n; state->nstart = nstart; state->nits = nits; state->seedval = 11; hqrndrandomize(&state->r, _state); ae_vector_set_length(&state->x0, state->n, _state); ae_vector_set_length(&state->t, state->m, _state); ae_vector_set_length(&state->x1, state->n, _state); ae_vector_set_length(&state->xbest, state->n, _state); ae_vector_set_length(&state->x, ae_maxint(state->n, state->m, _state), _state); ae_vector_set_length(&state->mv, state->m, _state); ae_vector_set_length(&state->mtv, state->n, _state); ae_vector_set_length(&state->rstate.ia, 3+1, _state); ae_vector_set_length(&state->rstate.ra, 2+1, _state); state->rstate.stage = -1; } /************************************************************************* This function changes seed value used by algorithm. In some cases we need deterministic processing, i.e. subsequent calls must return equal results, in other cases we need non-deterministic algorithm which returns different results for the same matrix on every pass. Setting zero seed will lead to non-deterministic algorithm, while non-zero value will make our algorithm deterministic. INPUT PARAMETERS: State - norm estimator state, must be initialized with a call to NormEstimatorCreate() SeedVal - seed value, >=0. Zero value = non-deterministic algo. -- ALGLIB -- Copyright 06.12.2011 by Bochkanov Sergey *************************************************************************/ void normestimatorsetseed(normestimatorstate* state, ae_int_t seedval, ae_state *_state) { ae_assert(seedval>=0, "NormEstimatorSetSeed: SeedVal<0", _state); state->seedval = seedval; } /************************************************************************* -- ALGLIB -- Copyright 06.12.2011 by Bochkanov Sergey *************************************************************************/ ae_bool normestimatoriteration(normestimatorstate* state, ae_state *_state) { ae_int_t n; ae_int_t m; ae_int_t i; ae_int_t itcnt; double v; double growth; double bestgrowth; ae_bool result; /* * Reverse communication preparations * I know it looks ugly, but it works the same way * anywhere from C++ to Python. 
* * This code initializes locals by: * * random values determined during code * generation - on first subroutine call * * values from previous call - on subsequent calls */ if( state->rstate.stage>=0 ) { n = state->rstate.ia.ptr.p_int[0]; m = state->rstate.ia.ptr.p_int[1]; i = state->rstate.ia.ptr.p_int[2]; itcnt = state->rstate.ia.ptr.p_int[3]; v = state->rstate.ra.ptr.p_double[0]; growth = state->rstate.ra.ptr.p_double[1]; bestgrowth = state->rstate.ra.ptr.p_double[2]; } else { n = 359; m = -58; i = -919; itcnt = -909; v = 81; growth = 255; bestgrowth = 74; } if( state->rstate.stage==0 ) { goto lbl_0; } if( state->rstate.stage==1 ) { goto lbl_1; } if( state->rstate.stage==2 ) { goto lbl_2; } if( state->rstate.stage==3 ) { goto lbl_3; } /* * Routine body */ n = state->n; m = state->m; if( state->seedval>0 ) { hqrndseed(state->seedval, state->seedval+2, &state->r, _state); } bestgrowth = (double)(0); state->xbest.ptr.p_double[0] = (double)(1); for(i=1; i<=n-1; i++) { state->xbest.ptr.p_double[i] = (double)(0); } itcnt = 0; lbl_4: if( itcnt>state->nstart-1 ) { goto lbl_6; } do { v = (double)(0); for(i=0; i<=n-1; i++) { state->x0.ptr.p_double[i] = hqrndnormal(&state->r, _state); v = v+ae_sqr(state->x0.ptr.p_double[i], _state); } } while(ae_fp_eq(v,(double)(0))); v = 1/ae_sqrt(v, _state); ae_v_muld(&state->x0.ptr.p_double[0], 1, ae_v_len(0,n-1), v); ae_v_move(&state->x.ptr.p_double[0], 1, &state->x0.ptr.p_double[0], 1, ae_v_len(0,n-1)); state->needmv = ae_true; state->needmtv = ae_false; state->rstate.stage = 0; goto lbl_rcomm; lbl_0: ae_v_move(&state->x.ptr.p_double[0], 1, &state->mv.ptr.p_double[0], 1, ae_v_len(0,m-1)); state->needmv = ae_false; state->needmtv = ae_true; state->rstate.stage = 1; goto lbl_rcomm; lbl_1: ae_v_move(&state->x1.ptr.p_double[0], 1, &state->mtv.ptr.p_double[0], 1, ae_v_len(0,n-1)); v = (double)(0); for(i=0; i<=n-1; i++) { v = v+ae_sqr(state->x1.ptr.p_double[i], _state); } growth = ae_sqrt(ae_sqrt(v, _state), _state); if( ae_fp_greater(growth,bestgrowth) ) { v = 1/ae_sqrt(v, _state); ae_v_moved(&state->xbest.ptr.p_double[0], 1, &state->x1.ptr.p_double[0], 1, ae_v_len(0,n-1), v); bestgrowth = growth; } itcnt = itcnt+1; goto lbl_4; lbl_6: ae_v_move(&state->x0.ptr.p_double[0], 1, &state->xbest.ptr.p_double[0], 1, ae_v_len(0,n-1)); itcnt = 0; lbl_7: if( itcnt>state->nits-1 ) { goto lbl_9; } ae_v_move(&state->x.ptr.p_double[0], 1, &state->x0.ptr.p_double[0], 1, ae_v_len(0,n-1)); state->needmv = ae_true; state->needmtv = ae_false; state->rstate.stage = 2; goto lbl_rcomm; lbl_2: ae_v_move(&state->x.ptr.p_double[0], 1, &state->mv.ptr.p_double[0], 1, ae_v_len(0,m-1)); state->needmv = ae_false; state->needmtv = ae_true; state->rstate.stage = 3; goto lbl_rcomm; lbl_3: ae_v_move(&state->x1.ptr.p_double[0], 1, &state->mtv.ptr.p_double[0], 1, ae_v_len(0,n-1)); v = (double)(0); for(i=0; i<=n-1; i++) { v = v+ae_sqr(state->x1.ptr.p_double[i], _state); } state->repnorm = ae_sqrt(ae_sqrt(v, _state), _state); if( ae_fp_neq(v,(double)(0)) ) { v = 1/ae_sqrt(v, _state); ae_v_moved(&state->x0.ptr.p_double[0], 1, &state->x1.ptr.p_double[0], 1, ae_v_len(0,n-1), v); } itcnt = itcnt+1; goto lbl_7; lbl_9: result = ae_false; return result; /* * Saving state */ lbl_rcomm: result = ae_true; state->rstate.ia.ptr.p_int[0] = n; state->rstate.ia.ptr.p_int[1] = m; state->rstate.ia.ptr.p_int[2] = i; state->rstate.ia.ptr.p_int[3] = itcnt; state->rstate.ra.ptr.p_double[0] = v; state->rstate.ra.ptr.p_double[1] = growth; state->rstate.ra.ptr.p_double[2] = bestgrowth; return result; } 
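In practice the reverse communication loop above is driven by a short wrapper. normestimatorestimatesparse() below is the built-in driver for CRS sparse matrices; the following sketch (not part of ALGLIB) shows the same pattern for a matrix which is available only through user-supplied products, where my_av() and my_atv() are hypothetical callbacks computing A*x and A'*x.

/* Hypothetical user callbacks: y = A*x (length M) and y = A'*x (length N) */
void my_av(ae_vector* x, ae_vector* y);
void my_atv(ae_vector* x, ae_vector* y);

/* Illustrative sketch (not shipped with ALGLIB): estimating norm(A) for an
   implicitly defined operator via the reverse communication interface */
double normestimator_rcomm_sketch(normestimatorstate* state, ae_state *_state)
{
    double nrm;

    normestimatorrestart(state, _state);
    while(normestimatoriteration(state, _state))
    {
        if( state->needmv )
        {
            my_av(&state->x, &state->mv);    /* request: state->mv  = A*state->x  */
            continue;
        }
        if( state->needmtv )
        {
            my_atv(&state->x, &state->mtv);  /* request: state->mtv = A'*state->x */
            continue;
        }
    }
    normestimatorresults(state, &nrm, _state);
    return nrm;                              /* estimate of the 2-norm of A */
}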
/************************************************************************* This function estimates norm of the sparse M*N matrix A. INPUT PARAMETERS: State - norm estimator state, must be initialized with a call to NormEstimatorCreate() A - sparse M*N matrix, must be converted to CRS format prior to calling this function. After this function is over you can call NormEstimatorResults() to get estimate of the norm(A). -- ALGLIB -- Copyright 06.12.2011 by Bochkanov Sergey *************************************************************************/ void normestimatorestimatesparse(normestimatorstate* state, sparsematrix* a, ae_state *_state) { normestimatorrestart(state, _state); while(normestimatoriteration(state, _state)) { if( state->needmv ) { sparsemv(a, &state->x, &state->mv, _state); continue; } if( state->needmtv ) { sparsemtv(a, &state->x, &state->mtv, _state); continue; } } } /************************************************************************* Matrix norm estimation results INPUT PARAMETERS: State - algorithm state OUTPUT PARAMETERS: Nrm - estimate of the matrix norm, Nrm>=0 -- ALGLIB -- Copyright 06.12.2011 by Bochkanov Sergey *************************************************************************/ void normestimatorresults(normestimatorstate* state, double* nrm, ae_state *_state) { *nrm = 0; *nrm = state->repnorm; } /************************************************************************* This function restarts estimator and prepares it for the next estimation round. INPUT PARAMETERS: State - algorithm state -- ALGLIB -- Copyright 06.12.2011 by Bochkanov Sergey *************************************************************************/ void normestimatorrestart(normestimatorstate* state, ae_state *_state) { ae_vector_set_length(&state->rstate.ia, 3+1, _state); ae_vector_set_length(&state->rstate.ra, 2+1, _state); state->rstate.stage = -1; } void _normestimatorstate_init(void* _p, ae_state *_state) { normestimatorstate *p = (normestimatorstate*)_p; ae_touch_ptr((void*)p); ae_vector_init(&p->x0, 0, DT_REAL, _state); ae_vector_init(&p->x1, 0, DT_REAL, _state); ae_vector_init(&p->t, 0, DT_REAL, _state); ae_vector_init(&p->xbest, 0, DT_REAL, _state); _hqrndstate_init(&p->r, _state); ae_vector_init(&p->x, 0, DT_REAL, _state); ae_vector_init(&p->mv, 0, DT_REAL, _state); ae_vector_init(&p->mtv, 0, DT_REAL, _state); _rcommstate_init(&p->rstate, _state); } void _normestimatorstate_init_copy(void* _dst, void* _src, ae_state *_state) { normestimatorstate *dst = (normestimatorstate*)_dst; normestimatorstate *src = (normestimatorstate*)_src; dst->n = src->n; dst->m = src->m; dst->nstart = src->nstart; dst->nits = src->nits; dst->seedval = src->seedval; ae_vector_init_copy(&dst->x0, &src->x0, _state); ae_vector_init_copy(&dst->x1, &src->x1, _state); ae_vector_init_copy(&dst->t, &src->t, _state); ae_vector_init_copy(&dst->xbest, &src->xbest, _state); _hqrndstate_init_copy(&dst->r, &src->r, _state); ae_vector_init_copy(&dst->x, &src->x, _state); ae_vector_init_copy(&dst->mv, &src->mv, _state); ae_vector_init_copy(&dst->mtv, &src->mtv, _state); dst->needmv = src->needmv; dst->needmtv = src->needmtv; dst->repnorm = src->repnorm; _rcommstate_init_copy(&dst->rstate, &src->rstate, _state); } void _normestimatorstate_clear(void* _p) { normestimatorstate *p = (normestimatorstate*)_p; ae_touch_ptr((void*)p); ae_vector_clear(&p->x0); ae_vector_clear(&p->x1); ae_vector_clear(&p->t); ae_vector_clear(&p->xbest); _hqrndstate_clear(&p->r); ae_vector_clear(&p->x); ae_vector_clear(&p->mv); 
ae_vector_clear(&p->mtv); _rcommstate_clear(&p->rstate); } void _normestimatorstate_destroy(void* _p) { normestimatorstate *p = (normestimatorstate*)_p; ae_touch_ptr((void*)p); ae_vector_destroy(&p->x0); ae_vector_destroy(&p->x1); ae_vector_destroy(&p->t); ae_vector_destroy(&p->xbest); _hqrndstate_destroy(&p->r); ae_vector_destroy(&p->x); ae_vector_destroy(&p->mv); ae_vector_destroy(&p->mtv); _rcommstate_destroy(&p->rstate); } /************************************************************************* This function initializes subspace iteration solver. This solver is used to solve symmetric real eigenproblems where just a few (top K) eigenvalues and corresponding eigenvectors are required. This solver can be significantly faster than complete EVD decomposition in the following cases: * when only just a small fraction of top eigenpairs of dense matrix is required. When K approaches N, this solver is slower than complete dense EVD * when problem matrix is sparse (and/or is not known explicitly, i.e. only matrix-matrix product can be performed) USAGE (explicit dense/sparse matrix): 1. User initializes algorithm state with eigsubspacecreate() call 2. [optional] User tunes solver parameters by calling eigsubspacesetcond() or other functions 3. User calls eigsubspacesolvedense() or eigsubspacesolvesparse() methods, which take algorithm state and 2D array or alglib.sparsematrix object. USAGE (out-of-core mode): 1. User initializes algorithm state with eigsubspacecreate() call 2. [optional] User tunes solver parameters by calling eigsubspacesetcond() or other functions 3. User activates out-of-core mode of the solver and repeatedly calls communication functions in a loop like below: > alglib.eigsubspaceoocstart(state) > while alglib.eigsubspaceooccontinue(state) do > alglib.eigsubspaceoocgetrequestinfo(state, out RequestType, out M) > alglib.eigsubspaceoocgetrequestdata(state, out X) > [calculate Y=A*X, with X=R^NxM] > alglib.eigsubspaceoocsendresult(state, in Y) > alglib.eigsubspaceoocstop(state, out W, out Z, out Report) INPUT PARAMETERS: N - problem dimensionality, N>0 K - number of top eigenvectors to calculate, 0<K<=N OUTPUT PARAMETERS: State - structure which stores algorithm state -- ALGLIB -- Copyright 16.01.2017 by Bochkanov Sergey *************************************************************************/ void eigsubspacecreate(ae_int_t n, ae_int_t k, eigsubspacestate* state, ae_state *_state) { _eigsubspacestate_clear(state); ae_assert(n>0, "EigSubspaceCreate: N<=0", _state); ae_assert(k>0, "EigSubspaceCreate: K<=0", _state); ae_assert(k<=n, "EigSubspaceCreate: K>N", _state); eigsubspacecreatebuf(n, k, state, _state); } /************************************************************************* Buffered version of constructor which aims to reuse previously allocated memory as much as possible. 
-- ALGLIB -- Copyright 16.01.2017 by Bochkanov Sergey *************************************************************************/ void eigsubspacecreatebuf(ae_int_t n, ae_int_t k, eigsubspacestate* state, ae_state *_state) { ae_assert(n>0, "EigSubspaceCreate: N<=0", _state); ae_assert(k>0, "EigSubspaceCreate: K<=0", _state); ae_assert(k<=n, "EigSubspaceCreate: K>N", _state); /* * Initialize algorithm parameters */ state->running = ae_false; state->n = n; state->k = k; state->nwork = ae_minint(ae_maxint(2*k, 8, _state), n, _state); state->eigenvectorsneeded = 1; eigsubspacesetcond(state, 0.0, 0, _state); /* * Allocate temporaries */ rmatrixsetlengthatleast(&state->x, state->n, state->nwork, _state); rmatrixsetlengthatleast(&state->ax, state->n, state->nwork, _state); } /************************************************************************* This function sets stopping critera for the solver: * error in eigenvector/value allowed by solver * maximum number of iterations to perform INPUT PARAMETERS: State - solver structure Eps - eps>=0, with non-zero value used to tell solver that it can stop after all eigenvalues converged with error roughly proportional to eps*MAX(LAMBDA_MAX), where LAMBDA_MAX is a maximum eigenvalue. Zero value means that no check for precision is performed. MaxIts - maxits>=0, with non-zero value used to tell solver that it can stop after maxits steps (no matter how precise current estimate is) NOTE: passing eps=0 and maxits=0 results in automatic selection of moderate eps as stopping criteria (1.0E-6 in current implementation, but it may change without notice). NOTE: very small values of eps are possible (say, 1.0E-12), although the larger problem you solve (N and/or K), the harder it is to find precise eigenvectors because rounding errors tend to accumulate. NOTE: passing non-zero eps results in some performance penalty, roughly equal to 2N*(2K)^2 FLOPs per iteration. These additional computations are required in order to estimate current error in eigenvalues via Rayleigh-Ritz process. Most of this additional time is spent in construction of ~2Kx2K symmetric subproblem whose eigenvalues are checked with exact eigensolver. This additional time is negligible if you search for eigenvalues of the large dense matrix, but may become noticeable on highly sparse EVD problems, where cost of matrix-matrix product is low. If you set eps to exactly zero, Rayleigh-Ritz phase is completely turned off. -- ALGLIB -- Copyright 16.01.2017 by Bochkanov Sergey *************************************************************************/ void eigsubspacesetcond(eigsubspacestate* state, double eps, ae_int_t maxits, ae_state *_state) { ae_assert(!state->running, "EigSubspaceSetCond: solver is already running", _state); ae_assert(ae_isfinite(eps, _state)&&ae_fp_greater_eq(eps,(double)(0)), "EigSubspaceSetCond: Eps<0 or NAN/INF", _state); ae_assert(maxits>=0, "EigSubspaceSetCond: MaxIts<0", _state); if( ae_fp_eq(eps,(double)(0))&&maxits==0 ) { eps = 1.0E-6; } state->eps = eps; state->maxits = maxits; } /************************************************************************* This function initiates out-of-core mode of subspace eigensolver. 
It should be used in conjunction with other out-of-core-related functions of this subspackage in a loop like below: > alglib.eigsubspaceoocstart(state) > while alglib.eigsubspaceooccontinue(state) do > alglib.eigsubspaceoocgetrequestinfo(state, out RequestType, out M) > alglib.eigsubspaceoocgetrequestdata(state, out X) > [calculate Y=A*X, with X=R^NxM] > alglib.eigsubspaceoocsendresult(state, in Y) > alglib.eigsubspaceoocstop(state, out W, out Z, out Report) INPUT PARAMETERS: State - solver object MType - matrix type: * 0 for real symmetric matrix (solver assumes that matrix being processed is symmetric; symmetric direct eigensolver is used for smaller subproblems arising during solution of larger "full" task) Future versions of ALGLIB may introduce support for other matrix types; for now, only symmetric eigenproblems are supported. -- ALGLIB -- Copyright 16.01.2017 by Bochkanov Sergey *************************************************************************/ void eigsubspaceoocstart(eigsubspacestate* state, ae_int_t mtype, ae_state *_state) { ae_assert(!state->running, "EigSubspaceStart: solver is already running", _state); ae_assert(mtype==0, "EigSubspaceStart: incorrect mtype parameter", _state); ae_vector_set_length(&state->rstate.ia, 7+1, _state); ae_vector_set_length(&state->rstate.ra, 1+1, _state); state->rstate.stage = -1; evd_clearrfields(state, _state); state->running = ae_true; state->matrixtype = mtype; } /************************************************************************* This function performs subspace iteration in the out-of-core mode. It should be used in conjunction with other out-of-core-related functions of this subspackage in a loop like below: > alglib.eigsubspaceoocstart(state) > while alglib.eigsubspaceooccontinue(state) do > alglib.eigsubspaceoocgetrequestinfo(state, out RequestType, out M) > alglib.eigsubspaceoocgetrequestdata(state, out X) > [calculate Y=A*X, with X=R^NxM] > alglib.eigsubspaceoocsendresult(state, in Y) > alglib.eigsubspaceoocstop(state, out W, out Z, out Report) -- ALGLIB -- Copyright 16.01.2017 by Bochkanov Sergey *************************************************************************/ ae_bool eigsubspaceooccontinue(eigsubspacestate* state, ae_state *_state) { ae_bool result; ae_assert(state->running, "EigSubspaceContinue: solver is not running", _state); result = eigsubspaceiteration(state, _state); state->running = result; return result; } /************************************************************************* This function is used to retrieve information about out-of-core request sent by solver to user code: request type (current version of the solver sends only requests for matrix-matrix products) and request size (size of the matrices being multiplied). This function returns just request metrics; in order to get contents of the matrices being multiplied, use eigsubspaceoocgetrequestdata(). 
It should be used in conjunction with other out-of-core-related functions of this subspackage in a loop like below: > alglib.eigsubspaceoocstart(state) > while alglib.eigsubspaceooccontinue(state) do > alglib.eigsubspaceoocgetrequestinfo(state, out RequestType, out M) > alglib.eigsubspaceoocgetrequestdata(state, out X) > [calculate Y=A*X, with X=R^NxM] > alglib.eigsubspaceoocsendresult(state, in Y) > alglib.eigsubspaceoocstop(state, out W, out Z, out Report) INPUT PARAMETERS: State - solver running in out-of-core mode OUTPUT PARAMETERS: RequestType - type of the request to process: * 0 - for matrix-matrix product A*X, with A being NxN matrix whose eigenvalues/vectors are needed, and X being NxREQUESTSIZE one which is returned by the eigsubspaceoocgetrequestdata(). RequestSize - size of the X matrix (number of columns), usually it is several times larger than number of vectors K requested by user. -- ALGLIB -- Copyright 16.01.2017 by Bochkanov Sergey *************************************************************************/ void eigsubspaceoocgetrequestinfo(eigsubspacestate* state, ae_int_t* requesttype, ae_int_t* requestsize, ae_state *_state) { *requesttype = 0; *requestsize = 0; ae_assert(state->running, "EigSubspaceOOCGetRequestInfo: solver is not running", _state); *requesttype = state->requesttype; *requestsize = state->requestsize; } /************************************************************************* This function is used to retrieve information about out-of-core request sent by solver to user code: matrix X (array[N,RequestSize) which have to be multiplied by out-of-core matrix A in a product A*X. This function returns just request data; in order to get size of the data prior to processing requestm, use eigsubspaceoocgetrequestinfo(). It should be used in conjunction with other out-of-core-related functions of this subspackage in a loop like below: > alglib.eigsubspaceoocstart(state) > while alglib.eigsubspaceooccontinue(state) do > alglib.eigsubspaceoocgetrequestinfo(state, out RequestType, out M) > alglib.eigsubspaceoocgetrequestdata(state, out X) > [calculate Y=A*X, with X=R^NxM] > alglib.eigsubspaceoocsendresult(state, in Y) > alglib.eigsubspaceoocstop(state, out W, out Z, out Report) INPUT PARAMETERS: State - solver running in out-of-core mode X - possibly preallocated storage; reallocated if needed, left unchanged, if large enough to store request data. OUTPUT PARAMETERS: X - array[N,RequestSize] or larger, leading rectangle is filled with dense matrix X. -- ALGLIB -- Copyright 16.01.2017 by Bochkanov Sergey *************************************************************************/ void eigsubspaceoocgetrequestdata(eigsubspacestate* state, /* Real */ ae_matrix* x, ae_state *_state) { ae_int_t i; ae_int_t j; ae_assert(state->running, "EigSubspaceOOCGetRequestInfo: solver is not running", _state); rmatrixsetlengthatleast(x, state->n, state->requestsize, _state); for(i=0; i<=state->n-1; i++) { for(j=0; j<=state->requestsize-1; j++) { x->ptr.pp_double[i][j] = state->x.ptr.pp_double[i][j]; } } } /************************************************************************* This function is used to send user reply to out-of-core request sent by solver. Usually it is product A*X for returned by solver matrix X. 
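If A is held as a dense alglib::real_2d_array, the reply can be formed with
a single GEMM call (a sketch, assuming the C++ wrappers; a is the caller's
NxN dense matrix, and n, rsize, x, ax, s are the variables from the loop
outlined below):

> ax.setlength(n, rsize);
> alglib::rmatrixgemm(n, rsize, n, 1.0, a, 0, 0, 0, x, 0, 0, 0, 0.0, ax, 0, 0);
> alglib::eigsubspaceoocsendresult(s, ax);

This is essentially what eigsubspacesolvedenses() (see below) does
internally; the out-of-core protocol is intended for matrices which are too
large or too inconveniently stored for that driver.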
It should be used in conjunction with other out-of-core-related functions of this subspackage in a loop like below: > alglib.eigsubspaceoocstart(state) > while alglib.eigsubspaceooccontinue(state) do > alglib.eigsubspaceoocgetrequestinfo(state, out RequestType, out M) > alglib.eigsubspaceoocgetrequestdata(state, out X) > [calculate Y=A*X, with X=R^NxM] > alglib.eigsubspaceoocsendresult(state, in Y) > alglib.eigsubspaceoocstop(state, out W, out Z, out Report) INPUT PARAMETERS: State - solver running in out-of-core mode AX - array[N,RequestSize] or larger, leading rectangle is filled with product A*X. -- ALGLIB -- Copyright 16.01.2017 by Bochkanov Sergey *************************************************************************/ void eigsubspaceoocsendresult(eigsubspacestate* state, /* Real */ ae_matrix* ax, ae_state *_state) { ae_int_t i; ae_int_t j; ae_assert(state->running, "EigSubspaceOOCGetRequestInfo: solver is not running", _state); for(i=0; i<=state->n-1; i++) { for(j=0; j<=state->requestsize-1; j++) { state->ax.ptr.pp_double[i][j] = ax->ptr.pp_double[i][j]; } } } /************************************************************************* This function finalizes out-of-core mode of subspace eigensolver. It should be used in conjunction with other out-of-core-related functions of this subspackage in a loop like below: > alglib.eigsubspaceoocstart(state) > while alglib.eigsubspaceooccontinue(state) do > alglib.eigsubspaceoocgetrequestinfo(state, out RequestType, out M) > alglib.eigsubspaceoocgetrequestdata(state, out X) > [calculate Y=A*X, with X=R^NxM] > alglib.eigsubspaceoocsendresult(state, in Y) > alglib.eigsubspaceoocstop(state, out W, out Z, out Report) INPUT PARAMETERS: State - solver state OUTPUT PARAMETERS: W - array[K], depending on solver settings: * top K eigenvalues ordered by descending - if eigenvectors are returned in Z * zeros - if invariant subspace is returned in Z Z - array[N,K], depending on solver settings either: * matrix of eigenvectors found * orthogonal basis of K-dimensional invariant subspace Rep - report with additional parameters -- ALGLIB -- Copyright 16.01.2017 by Bochkanov Sergey *************************************************************************/ void eigsubspaceoocstop(eigsubspacestate* state, /* Real */ ae_vector* w, /* Real */ ae_matrix* z, eigsubspacereport* rep, ae_state *_state) { ae_int_t n; ae_int_t k; ae_int_t i; ae_int_t j; ae_vector_clear(w); ae_matrix_clear(z); _eigsubspacereport_clear(rep); ae_assert(!state->running, "EigSubspaceStop: solver is still running", _state); n = state->n; k = state->k; ae_vector_set_length(w, k, _state); ae_matrix_set_length(z, n, k, _state); for(i=0; i<=k-1; i++) { w->ptr.p_double[i] = state->rw.ptr.p_double[i]; } for(i=0; i<=n-1; i++) { for(j=0; j<=k-1; j++) { z->ptr.pp_double[i][j] = state->rq.ptr.pp_double[i][j]; } } rep->iterationscount = state->repiterationscount; } /************************************************************************* This function runs eigensolver for dense NxN symmetric matrix A, given by upper or lower triangle. This function can not process nonsymmetric matrices. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * multithreading support ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! ! For a situation when you need just a few eigenvectors (~1-10), ! multithreading typically gives sublinear (wrt to cores count) speedup. ! 
For larger problems it may give you nearly linear increase in ! performance. ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. Best results are achieved for high-dimensional problems ! (NVars is at least 256). ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS: State - solver state A - array[N,N], symmetric NxN matrix given by one of its triangles IsUpper - whether upper or lower triangle of A is given (the other one is not referenced at all). OUTPUT PARAMETERS: W - array[K], top K eigenvalues ordered by descending of their absolute values Z - array[N,K], matrix of eigenvectors found Rep - report with additional parameters NOTE: internally this function allocates a copy of NxN dense A. You should take it into account when working with very large matrices occupying almost all RAM. -- ALGLIB -- Copyright 16.01.2017 by Bochkanov Sergey *************************************************************************/ void eigsubspacesolvedenses(eigsubspacestate* state, /* Real */ ae_matrix* a, ae_bool isupper, /* Real */ ae_vector* w, /* Real */ ae_matrix* z, eigsubspacereport* rep, ae_state *_state) { ae_frame _frame_block; ae_int_t n; ae_int_t m; ae_int_t i; ae_int_t j; ae_int_t k; double v; ae_matrix acopy; ae_frame_make(_state, &_frame_block); ae_vector_clear(w); ae_matrix_clear(z); _eigsubspacereport_clear(rep); ae_matrix_init(&acopy, 0, 0, DT_REAL, _state); ae_assert(!state->running, "EigSubspaceSolveDenseS: solver is still running", _state); n = state->n; /* * Allocate copy of A, copy one triangle to another */ ae_matrix_set_length(&acopy, n, n, _state); for(i=0; i<=n-1; i++) { for(j=i; j<=n-1; j++) { if( isupper ) { v = a->ptr.pp_double[i][j]; } else { v = a->ptr.pp_double[j][i]; } acopy.ptr.pp_double[i][j] = v; acopy.ptr.pp_double[j][i] = v; } } /* * Start iterations */ state->matrixtype = 0; ae_vector_set_length(&state->rstate.ia, 7+1, _state); ae_vector_set_length(&state->rstate.ra, 1+1, _state); state->rstate.stage = -1; evd_clearrfields(state, _state); while(eigsubspaceiteration(state, _state)) { /* * Calculate A*X with RMatrixGEMM */ ae_assert(state->requesttype==0, "EigSubspaceSolveDense: integrity check failed", _state); ae_assert(state->requestsize>0, "EigSubspaceSolveDense: integrity check failed", _state); m = state->requestsize; rmatrixgemm(n, m, n, 1.0, &acopy, 0, 0, 0, &state->x, 0, 0, 0, 0.0, &state->ax, 0, 0, _state); } k = state->k; ae_vector_set_length(w, k, _state); ae_matrix_set_length(z, n, k, _state); for(i=0; i<=k-1; i++) { w->ptr.p_double[i] = state->rw.ptr.p_double[i]; } for(i=0; i<=n-1; i++) { for(j=0; j<=k-1; j++) { z->ptr.pp_double[i][j] = state->rq.ptr.pp_double[i][j]; } } rep->iterationscount = state->repiterationscount; ae_frame_leave(_state); } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. 
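NOTE: the dense driver eigsubspacesolvedenses() above (and its sparse
      counterpart eigsubspacesolvesparses() below) can be called directly,
      without writing the out-of-core loop. A minimal sketch, assuming the
      C++ wrappers generated from this core and a symmetric real_2d_array a
      (or a sparsematrix sa) supplied by the caller:

> alglib::eigsubspacestate s;
> alglib::eigsubspacereport rep;
> alglib::real_1d_array w;
> alglib::real_2d_array z;
> alglib::eigsubspacecreate(n, k, s);
> alglib::eigsubspacesolvedenses(s, a, true, w, z, rep);    // upper triangle of a is used
> // or, for a sparse matrix:
> // alglib::eigsubspacesolvesparses(s, sa, true, w, z, rep);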
*************************************************************************/ void _pexec_eigsubspacesolvedenses(eigsubspacestate* state, /* Real */ ae_matrix* a, ae_bool isupper, /* Real */ ae_vector* w, /* Real */ ae_matrix* z, eigsubspacereport* rep, ae_state *_state) { eigsubspacesolvedenses(state,a,isupper,w,z,rep, _state); } /************************************************************************* This function runs eigensolver for dense NxN symmetric matrix A, given by upper or lower triangle. This function can not process nonsymmetric matrices. INPUT PARAMETERS: State - solver state A - NxN symmetric matrix given by one of its triangles IsUpper - whether upper or lower triangle of A is given (the other one is not referenced at all). OUTPUT PARAMETERS: W - array[K], top K eigenvalues ordered by descending of their absolute values Z - array[N,K], matrix of eigenvectors found Rep - report with additional parameters -- ALGLIB -- Copyright 16.01.2017 by Bochkanov Sergey *************************************************************************/ void eigsubspacesolvesparses(eigsubspacestate* state, sparsematrix* a, ae_bool isupper, /* Real */ ae_vector* w, /* Real */ ae_matrix* z, eigsubspacereport* rep, ae_state *_state) { ae_int_t n; ae_int_t i; ae_int_t j; ae_int_t k; ae_vector_clear(w); ae_matrix_clear(z); _eigsubspacereport_clear(rep); ae_assert(!state->running, "EigSubspaceSolveSparseS: solver is still running", _state); n = state->n; state->matrixtype = 0; ae_vector_set_length(&state->rstate.ia, 7+1, _state); ae_vector_set_length(&state->rstate.ra, 1+1, _state); state->rstate.stage = -1; evd_clearrfields(state, _state); while(eigsubspaceiteration(state, _state)) { ae_assert(state->requesttype==0, "EigSubspaceSolveDense: integrity check failed", _state); ae_assert(state->requestsize>0, "EigSubspaceSolveDense: integrity check failed", _state); sparsesmm(a, isupper, &state->x, state->requestsize, &state->ax, _state); } k = state->k; ae_vector_set_length(w, k, _state); ae_matrix_set_length(z, n, k, _state); for(i=0; i<=k-1; i++) { w->ptr.p_double[i] = state->rw.ptr.p_double[i]; } for(i=0; i<=n-1; i++) { for(j=0; j<=k-1; j++) { z->ptr.pp_double[i][j] = state->rq.ptr.pp_double[i][j]; } } rep->iterationscount = state->repiterationscount; } /************************************************************************* Internal r-comm function. -- ALGLIB -- Copyright 16.01.2017 by Bochkanov Sergey *************************************************************************/ ae_bool eigsubspaceiteration(eigsubspacestate* state, ae_state *_state) { ae_int_t n; ae_int_t nwork; ae_int_t k; ae_int_t cnt; ae_int_t i; ae_int_t i1; ae_int_t j; double vv; double v; ae_int_t convcnt; ae_bool result; /* * Reverse communication preparations * I know it looks ugly, but it works the same way * anywhere from C++ to Python. 
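 *
 * In short: whenever the routine needs a product A*X it stores its
 * locals into state->rstate, records the resume point in rstate.stage,
 * and returns ae_true; the caller computes state->ax := A*state->x and
 * calls the routine again, which restores the locals and jumps back to
 * the statement following the request. A return value of ae_false means
 * that iterations are finished.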
* * This code initializes locals by: * * random values determined during code * generation - on first subroutine call * * values from previous call - on subsequent calls */ if( state->rstate.stage>=0 ) { n = state->rstate.ia.ptr.p_int[0]; nwork = state->rstate.ia.ptr.p_int[1]; k = state->rstate.ia.ptr.p_int[2]; cnt = state->rstate.ia.ptr.p_int[3]; i = state->rstate.ia.ptr.p_int[4]; i1 = state->rstate.ia.ptr.p_int[5]; j = state->rstate.ia.ptr.p_int[6]; convcnt = state->rstate.ia.ptr.p_int[7]; vv = state->rstate.ra.ptr.p_double[0]; v = state->rstate.ra.ptr.p_double[1]; } else { n = 359; nwork = -58; k = -919; cnt = -909; i = 81; i1 = 255; j = 74; convcnt = -788; vv = 809; v = 205; } if( state->rstate.stage==0 ) { goto lbl_0; } if( state->rstate.stage==1 ) { goto lbl_1; } /* * Routine body */ n = state->n; k = state->k; nwork = state->nwork; /* * Initialize RNG. Deterministic initialization (with fixed * seed) is required because we need deterministic behavior * of the entire solver. */ hqrndseed(453, 463664, &state->rs, _state); /* * Prepare iteration * Initialize QCur with random orthogonal matrix. */ state->repiterationscount = 0; rmatrixsetlengthatleast(&state->qcur, nwork, n, _state); rmatrixsetlengthatleast(&state->znew, nwork, n, _state); rvectorsetlengthatleast(&state->wcur, nwork, _state); rvectorsetlengthatleast(&state->wprev, nwork, _state); rvectorsetlengthatleast(&state->wrank, nwork, _state); rmatrixsetlengthatleast(&state->x, n, nwork, _state); rmatrixsetlengthatleast(&state->ax, n, nwork, _state); for(i=0; i<=nwork-1; i++) { state->wprev.ptr.p_double[i] = -1.0; } for(i=0; i<=nwork-1; i++) { for(j=0; j<=n-1; j++) { state->znew.ptr.pp_double[i][j] = hqrndnormal(&state->rs, _state); } } rmatrixlq(&state->znew, nwork, n, &state->tau, _state); rmatrixlqunpackq(&state->znew, nwork, n, &state->tau, nwork, &state->qcur, _state); /* * Start iteration */ state->repiterationscount = 0; convcnt = 0; lbl_2: if( !((state->maxits==0||state->repiterationscountmaxits)&&convcntqcur, 0, 0, &state->x, 0, 0, _state); evd_clearrfields(state, _state); state->requesttype = 0; state->requestsize = nwork; state->rstate.stage = 0; goto lbl_rcomm; lbl_0: /* * Perform Rayleigh-Ritz step to estimate convergence of diagonal eigenvalues */ if( ae_fp_greater(state->eps,(double)(0)) ) { ae_assert(state->matrixtype==0, "integrity check failed", _state); rmatrixsetlengthatleast(&state->r, nwork, nwork, _state); rmatrixgemm(nwork, nwork, n, 1.0, &state->qcur, 0, 0, 0, &state->ax, 0, 0, 0, 0.0, &state->r, 0, 0, _state); if( !smatrixevd(&state->r, nwork, 0, ae_true, &state->wcur, &state->dummy, _state) ) { ae_assert(ae_false, "EigSubspace: direct eigensolver failed to converge", _state); } for(j=0; j<=nwork-1; j++) { state->wrank.ptr.p_double[j] = ae_fabs(state->wcur.ptr.p_double[j], _state); } rankxuntied(&state->wrank, nwork, &state->buf, _state); v = (double)(0); vv = (double)(0); for(j=0; j<=nwork-1; j++) { if( ae_fp_greater_eq(state->wrank.ptr.p_double[j],(double)(nwork-k)) ) { v = ae_maxreal(v, ae_fabs(state->wcur.ptr.p_double[j]-state->wprev.ptr.p_double[j], _state), _state); vv = ae_maxreal(vv, ae_fabs(state->wcur.ptr.p_double[j], _state), _state); } } if( ae_fp_eq(vv,(double)(0)) ) { vv = (double)(1); } if( ae_fp_less_eq(v,state->eps*vv) ) { inc(&convcnt, _state); } else { convcnt = 0; } for(j=0; j<=nwork-1; j++) { state->wprev.ptr.p_double[j] = state->wcur.ptr.p_double[j]; } } /* * QR renormalization and update of QCur */ rmatrixtranspose(n, nwork, &state->ax, 0, 0, &state->znew, 0, 0, _state); 
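/*
 * At this point ZNew holds (A*X)^T; its rows are re-orthonormalized by the
 * LQ factorization below, and the orthonormal factor becomes the new QCur.
 * Without this step repeated multiplication by A would make all columns of
 * the iterate collapse onto the dominant eigendirections, destroying the
 * basis of the subspace being tracked.
 */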
rmatrixlq(&state->znew, nwork, n, &state->tau, _state); rmatrixlqunpackq(&state->znew, nwork, n, &state->tau, nwork, &state->qcur, _state); /* * Update iteration index */ state->repiterationscount = state->repiterationscount+1; goto lbl_2; lbl_3: /* * Perform Rayleigh-Ritz step: find true eigenpairs in NWork-dimensional * subspace. */ ae_assert(state->matrixtype==0, "integrity check failed", _state); ae_assert(state->eigenvectorsneeded==1, "Assertion failed", _state); rmatrixsetlengthatleast(&state->r, nwork, nwork, _state); rmatrixtranspose(nwork, n, &state->qcur, 0, 0, &state->x, 0, 0, _state); evd_clearrfields(state, _state); state->requesttype = 0; state->requestsize = nwork; state->rstate.stage = 1; goto lbl_rcomm; lbl_1: rmatrixgemm(nwork, nwork, n, 1.0, &state->qcur, 0, 0, 0, &state->ax, 0, 0, 0, 0.0, &state->r, 0, 0, _state); if( !smatrixevd(&state->r, nwork, 1, ae_true, &state->tw, &state->tz, _state) ) { ae_assert(ae_false, "EigSubspace: direct eigensolver failed to converge", _state); } /* * Reorder eigenpairs according to their absolute magnitude, select * K top ones. This reordering algorithm is very inefficient and has * O(NWork*K) running time, but it is still faster than other parts * of the solver, so we may use it. * * Then, we transform RZ to RQ (full N-dimensional representation). * After this part is done, RW and RQ contain solution. */ for(j=0; j<=nwork-1; j++) { state->wrank.ptr.p_double[j] = ae_fabs(state->tw.ptr.p_double[j], _state); } rankxuntied(&state->wrank, nwork, &state->buf, _state); cnt = 0; rvectorsetlengthatleast(&state->rw, k, _state); rmatrixsetlengthatleast(&state->rz, nwork, k, _state); for(i=nwork-1; i>=nwork-k; i--) { for(i1=0; i1<=nwork-1; i1++) { if( ae_fp_eq(state->wrank.ptr.p_double[i1],(double)(i)) ) { ae_assert(cntrw.ptr.p_double[cnt] = state->tw.ptr.p_double[i1]; for(j=0; j<=nwork-1; j++) { state->rz.ptr.pp_double[j][cnt] = state->tz.ptr.pp_double[j][i1]; } cnt = cnt+1; } } } ae_assert(cnt==k, "EigSubspace: integrity check failed", _state); rmatrixsetlengthatleast(&state->rq, n, k, _state); rmatrixgemm(n, k, nwork, 1.0, &state->qcur, 0, 0, 1, &state->rz, 0, 0, 0, 0.0, &state->rq, 0, 0, _state); result = ae_false; return result; /* * Saving state */ lbl_rcomm: result = ae_true; state->rstate.ia.ptr.p_int[0] = n; state->rstate.ia.ptr.p_int[1] = nwork; state->rstate.ia.ptr.p_int[2] = k; state->rstate.ia.ptr.p_int[3] = cnt; state->rstate.ia.ptr.p_int[4] = i; state->rstate.ia.ptr.p_int[5] = i1; state->rstate.ia.ptr.p_int[6] = j; state->rstate.ia.ptr.p_int[7] = convcnt; state->rstate.ra.ptr.p_double[0] = vv; state->rstate.ra.ptr.p_double[1] = v; return result; } /************************************************************************* Finding the eigenvalues and eigenvectors of a symmetric matrix The algorithm finds eigen pairs of a symmetric matrix by reducing it to tridiagonal form and using the QL/QR algorithm. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes one important improvement of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Generally, commercial ALGLIB is several times faster than open-source ! generic C edition, and many times faster than open-source C# edition. ! ! 
Multithreaded acceleration is NOT supported for this function. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. Input parameters: A - symmetric matrix which is given by its upper or lower triangular part. Array whose indexes range within [0..N-1, 0..N-1]. N - size of matrix A. ZNeeded - flag controlling whether the eigenvectors are needed or not. If ZNeeded is equal to: * 0, the eigenvectors are not returned; * 1, the eigenvectors are returned. IsUpper - storage format. Output parameters: D - eigenvalues in ascending order. Array whose index ranges within [0..N-1]. Z - if ZNeeded is equal to: * 0, Z hasn't changed; * 1, Z contains the eigenvectors. Array whose indexes range within [0..N-1, 0..N-1]. The eigenvectors are stored in the matrix columns. Result: True, if the algorithm has converged. False, if the algorithm hasn't converged (rare case). -- ALGLIB -- Copyright 2005-2008 by Bochkanov Sergey *************************************************************************/ ae_bool smatrixevd(/* Real */ ae_matrix* a, ae_int_t n, ae_int_t zneeded, ae_bool isupper, /* Real */ ae_vector* d, /* Real */ ae_matrix* z, ae_state *_state) { ae_frame _frame_block; ae_matrix _a; ae_vector tau; ae_vector e; ae_bool result; ae_frame_make(_state, &_frame_block); ae_matrix_init_copy(&_a, a, _state); a = &_a; ae_vector_clear(d); ae_matrix_clear(z); ae_vector_init(&tau, 0, DT_REAL, _state); ae_vector_init(&e, 0, DT_REAL, _state); ae_assert(zneeded==0||zneeded==1, "SMatrixEVD: incorrect ZNeeded", _state); smatrixtd(a, n, isupper, &tau, d, &e, _state); if( zneeded==1 ) { smatrixtdunpackq(a, n, isupper, &tau, z, _state); } result = smatrixtdevd(d, &e, n, zneeded, z, _state); ae_frame_leave(_state); return result; } /************************************************************************* Subroutine for finding the eigenvalues (and eigenvectors) of a symmetric matrix in a given half open interval (A, B] by using a bisection and inverse iteration Input parameters: A - symmetric matrix which is given by its upper or lower triangular part. Array [0..N-1, 0..N-1]. N - size of matrix A. ZNeeded - flag controlling whether the eigenvectors are needed or not. If ZNeeded is equal to: * 0, the eigenvectors are not returned; * 1, the eigenvectors are returned. IsUpperA - storage format of matrix A. B1, B2 - half open interval (B1, B2] to search eigenvalues in. Output parameters: M - number of eigenvalues found in a given half-interval (M>=0). W - array of the eigenvalues found. Array whose index ranges within [0..M-1]. Z - if ZNeeded is equal to: * 0, Z hasn't changed; * 1, Z contains eigenvectors. Array whose indexes range within [0..N-1, 0..M-1]. The eigenvectors are stored in the matrix columns. Result: True, if successful. M contains the number of eigenvalues in the given half-interval (could be equal to 0), W contains the eigenvalues, Z contains the eigenvectors (if needed). False, if the bisection method subroutine wasn't able to find the eigenvalues in the given interval or if the inverse iteration subroutine wasn't able to find all the corresponding eigenvectors. In that case, the eigenvalues and eigenvectors are not returned, M is equal to 0. 
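A short usage sketch (assuming the C++ wrappers generated from this core;
dense matrices and vectors are alglib::real_2d_array and
alglib::real_1d_array):

> alglib::real_2d_array a = "[[4,1],[1,3]]";
> alglib::real_1d_array d;
> alglib::real_2d_array z;
> bool ok = alglib::smatrixevd(a, 2, 1, true, d, z);              // full spectrum, eigenvectors in columns of z
>
> alglib::real_1d_array w;
> alglib::real_2d_array zr;
> alglib::ae_int_t m;
> ok = alglib::smatrixevdr(a, 2, 1, true, 2.0, 5.0, m, w, zr);    // only eigenvalues in (2.0, 5.0]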
-- ALGLIB -- Copyright 07.01.2006 by Bochkanov Sergey *************************************************************************/ ae_bool smatrixevdr(/* Real */ ae_matrix* a, ae_int_t n, ae_int_t zneeded, ae_bool isupper, double b1, double b2, ae_int_t* m, /* Real */ ae_vector* w, /* Real */ ae_matrix* z, ae_state *_state) { ae_frame _frame_block; ae_matrix _a; ae_vector tau; ae_vector e; ae_bool result; ae_frame_make(_state, &_frame_block); ae_matrix_init_copy(&_a, a, _state); a = &_a; *m = 0; ae_vector_clear(w); ae_matrix_clear(z); ae_vector_init(&tau, 0, DT_REAL, _state); ae_vector_init(&e, 0, DT_REAL, _state); ae_assert(zneeded==0||zneeded==1, "SMatrixTDEVDR: incorrect ZNeeded", _state); smatrixtd(a, n, isupper, &tau, w, &e, _state); if( zneeded==1 ) { smatrixtdunpackq(a, n, isupper, &tau, z, _state); } result = smatrixtdevdr(w, &e, n, zneeded, b1, b2, m, z, _state); ae_frame_leave(_state); return result; } /************************************************************************* Subroutine for finding the eigenvalues and eigenvectors of a symmetric matrix with given indexes by using bisection and inverse iteration methods. Input parameters: A - symmetric matrix which is given by its upper or lower triangular part. Array whose indexes range within [0..N-1, 0..N-1]. N - size of matrix A. ZNeeded - flag controlling whether the eigenvectors are needed or not. If ZNeeded is equal to: * 0, the eigenvectors are not returned; * 1, the eigenvectors are returned. IsUpperA - storage format of matrix A. I1, I2 - index interval for searching (from I1 to I2). 0 <= I1 <= I2 <= N-1. Output parameters: W - array of the eigenvalues found. Array whose index ranges within [0..I2-I1]. Z - if ZNeeded is equal to: * 0, Z hasn't changed; * 1, Z contains eigenvectors. Array whose indexes range within [0..N-1, 0..I2-I1]. In that case, the eigenvectors are stored in the matrix columns. Result: True, if successful. W contains the eigenvalues, Z contains the eigenvectors (if needed). False, if the bisection method subroutine wasn't able to find the eigenvalues in the given interval or if the inverse iteration subroutine wasn't able to find all the corresponding eigenvectors. In that case, the eigenvalues and eigenvectors are not returned. -- ALGLIB -- Copyright 07.01.2006 by Bochkanov Sergey *************************************************************************/ ae_bool smatrixevdi(/* Real */ ae_matrix* a, ae_int_t n, ae_int_t zneeded, ae_bool isupper, ae_int_t i1, ae_int_t i2, /* Real */ ae_vector* w, /* Real */ ae_matrix* z, ae_state *_state) { ae_frame _frame_block; ae_matrix _a; ae_vector tau; ae_vector e; ae_bool result; ae_frame_make(_state, &_frame_block); ae_matrix_init_copy(&_a, a, _state); a = &_a; ae_vector_clear(w); ae_matrix_clear(z); ae_vector_init(&tau, 0, DT_REAL, _state); ae_vector_init(&e, 0, DT_REAL, _state); ae_assert(zneeded==0||zneeded==1, "SMatrixEVDI: incorrect ZNeeded", _state); smatrixtd(a, n, isupper, &tau, w, &e, _state); if( zneeded==1 ) { smatrixtdunpackq(a, n, isupper, &tau, z, _state); } result = smatrixtdevdi(w, &e, n, zneeded, i1, i2, z, _state); ae_frame_leave(_state); return result; } /************************************************************************* Finding the eigenvalues and eigenvectors of a Hermitian matrix The algorithm finds eigen pairs of a Hermitian matrix by reducing it to real tridiagonal form and using the QL/QR algorithm. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes one important improvement of ! 
this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Generally, commercial ALGLIB is several times faster than open-source ! generic C edition, and many times faster than open-source C# edition. ! ! Multithreaded acceleration is NOT supported for this function. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. Input parameters: A - Hermitian matrix which is given by its upper or lower triangular part. Array whose indexes range within [0..N-1, 0..N-1]. N - size of matrix A. IsUpper - storage format. ZNeeded - flag controlling whether the eigenvectors are needed or not. If ZNeeded is equal to: * 0, the eigenvectors are not returned; * 1, the eigenvectors are returned. Output parameters: D - eigenvalues in ascending order. Array whose index ranges within [0..N-1]. Z - if ZNeeded is equal to: * 0, Z hasn't changed; * 1, Z contains the eigenvectors. Array whose indexes range within [0..N-1, 0..N-1]. The eigenvectors are stored in the matrix columns. Result: True, if the algorithm has converged. False, if the algorithm hasn't converged (rare case). Note: eigenvectors of Hermitian matrix are defined up to multiplication by a complex number L, such that |L|=1. -- ALGLIB -- Copyright 2005, 23 March 2007 by Bochkanov Sergey *************************************************************************/ ae_bool hmatrixevd(/* Complex */ ae_matrix* a, ae_int_t n, ae_int_t zneeded, ae_bool isupper, /* Real */ ae_vector* d, /* Complex */ ae_matrix* z, ae_state *_state) { ae_frame _frame_block; ae_matrix _a; ae_vector tau; ae_vector e; ae_matrix t; ae_matrix qz; ae_matrix q; ae_int_t i; ae_int_t j; ae_bool result; ae_frame_make(_state, &_frame_block); ae_matrix_init_copy(&_a, a, _state); a = &_a; ae_vector_clear(d); ae_matrix_clear(z); ae_vector_init(&tau, 0, DT_COMPLEX, _state); ae_vector_init(&e, 0, DT_REAL, _state); ae_matrix_init(&t, 0, 0, DT_REAL, _state); ae_matrix_init(&qz, 0, 0, DT_REAL, _state); ae_matrix_init(&q, 0, 0, DT_COMPLEX, _state); ae_assert(zneeded==0||zneeded==1, "HermitianEVD: incorrect ZNeeded", _state); /* * Reduce to tridiagonal form */ hmatrixtd(a, n, isupper, &tau, d, &e, _state); if( zneeded==1 ) { hmatrixtdunpackq(a, n, isupper, &tau, &q, _state); zneeded = 2; } /* * TDEVD */ result = smatrixtdevd(d, &e, n, zneeded, &t, _state); /* * Eigenvectors are needed * Calculate Z = Q*T = Re(Q)*T + i*Im(Q)*T */ if( result&&zneeded!=0 ) { ae_matrix_set_length(z, n, n, _state); ae_matrix_set_length(&qz, n, 2*n, _state); /* * Calculate Re(Q)*T */ for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { qz.ptr.pp_double[i][j] = q.ptr.pp_complex[i][j].x; } } rmatrixgemm(n, n, n, 1.0, &qz, 0, 0, 0, &t, 0, 0, 0, 0.0, &qz, 0, n, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { z->ptr.pp_complex[i][j].x = qz.ptr.pp_double[i][n+j]; } } /* * Calculate Im(Q)*T */ for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { qz.ptr.pp_double[i][j] = q.ptr.pp_complex[i][j].y; } } rmatrixgemm(n, n, n, 1.0, &qz, 0, 0, 0, &t, 0, 0, 0, 0.0, &qz, 0, n, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { z->ptr.pp_complex[i][j].y = qz.ptr.pp_double[i][n+j]; } } } ae_frame_leave(_state); 
return result; } /************************************************************************* Subroutine for finding the eigenvalues (and eigenvectors) of a Hermitian matrix in a given half-interval (A, B] by using a bisection and inverse iteration Input parameters: A - Hermitian matrix which is given by its upper or lower triangular part. Array whose indexes range within [0..N-1, 0..N-1]. N - size of matrix A. ZNeeded - flag controlling whether the eigenvectors are needed or not. If ZNeeded is equal to: * 0, the eigenvectors are not returned; * 1, the eigenvectors are returned. IsUpperA - storage format of matrix A. B1, B2 - half-interval (B1, B2] to search eigenvalues in. Output parameters: M - number of eigenvalues found in a given half-interval, M>=0 W - array of the eigenvalues found. Array whose index ranges within [0..M-1]. Z - if ZNeeded is equal to: * 0, Z hasn't changed; * 1, Z contains eigenvectors. Array whose indexes range within [0..N-1, 0..M-1]. The eigenvectors are stored in the matrix columns. Result: True, if successful. M contains the number of eigenvalues in the given half-interval (could be equal to 0), W contains the eigenvalues, Z contains the eigenvectors (if needed). False, if the bisection method subroutine wasn't able to find the eigenvalues in the given interval or if the inverse iteration subroutine wasn't able to find all the corresponding eigenvectors. In that case, the eigenvalues and eigenvectors are not returned, M is equal to 0. Note: eigen vectors of Hermitian matrix are defined up to multiplication by a complex number L, such as |L|=1. -- ALGLIB -- Copyright 07.01.2006, 24.03.2007 by Bochkanov Sergey. *************************************************************************/ ae_bool hmatrixevdr(/* Complex */ ae_matrix* a, ae_int_t n, ae_int_t zneeded, ae_bool isupper, double b1, double b2, ae_int_t* m, /* Real */ ae_vector* w, /* Complex */ ae_matrix* z, ae_state *_state) { ae_frame _frame_block; ae_matrix _a; ae_matrix q; ae_matrix t; ae_vector tau; ae_vector e; ae_vector work; ae_int_t i; ae_int_t k; double v; ae_bool result; ae_frame_make(_state, &_frame_block); ae_matrix_init_copy(&_a, a, _state); a = &_a; *m = 0; ae_vector_clear(w); ae_matrix_clear(z); ae_matrix_init(&q, 0, 0, DT_COMPLEX, _state); ae_matrix_init(&t, 0, 0, DT_REAL, _state); ae_vector_init(&tau, 0, DT_COMPLEX, _state); ae_vector_init(&e, 0, DT_REAL, _state); ae_vector_init(&work, 0, DT_REAL, _state); ae_assert(zneeded==0||zneeded==1, "HermitianEigenValuesAndVectorsInInterval: incorrect ZNeeded", _state); /* * Reduce to tridiagonal form */ hmatrixtd(a, n, isupper, &tau, w, &e, _state); if( zneeded==1 ) { hmatrixtdunpackq(a, n, isupper, &tau, &q, _state); zneeded = 2; } /* * Bisection and inverse iteration */ result = smatrixtdevdr(w, &e, n, zneeded, b1, b2, m, &t, _state); /* * Eigenvectors are needed * Calculate Z = Q*T = Re(Q)*T + i*Im(Q)*T */ if( (result&&zneeded!=0)&&*m!=0 ) { ae_vector_set_length(&work, *m-1+1, _state); ae_matrix_set_length(z, n-1+1, *m-1+1, _state); for(i=0; i<=n-1; i++) { /* * Calculate real part */ for(k=0; k<=*m-1; k++) { work.ptr.p_double[k] = (double)(0); } for(k=0; k<=n-1; k++) { v = q.ptr.pp_complex[i][k].x; ae_v_addd(&work.ptr.p_double[0], 1, &t.ptr.pp_double[k][0], 1, ae_v_len(0,*m-1), v); } for(k=0; k<=*m-1; k++) { z->ptr.pp_complex[i][k].x = work.ptr.p_double[k]; } /* * Calculate imaginary part */ for(k=0; k<=*m-1; k++) { work.ptr.p_double[k] = (double)(0); } for(k=0; k<=n-1; k++) { v = q.ptr.pp_complex[i][k].y; ae_v_addd(&work.ptr.p_double[0], 1, 
&t.ptr.pp_double[k][0], 1, ae_v_len(0,*m-1), v); } for(k=0; k<=*m-1; k++) { z->ptr.pp_complex[i][k].y = work.ptr.p_double[k]; } } } ae_frame_leave(_state); return result; } /************************************************************************* Subroutine for finding the eigenvalues and eigenvectors of a Hermitian matrix with given indexes by using bisection and inverse iteration methods Input parameters: A - Hermitian matrix which is given by its upper or lower triangular part. Array whose indexes range within [0..N-1, 0..N-1]. N - size of matrix A. ZNeeded - flag controlling whether the eigenvectors are needed or not. If ZNeeded is equal to: * 0, the eigenvectors are not returned; * 1, the eigenvectors are returned. IsUpperA - storage format of matrix A. I1, I2 - index interval for searching (from I1 to I2). 0 <= I1 <= I2 <= N-1. Output parameters: W - array of the eigenvalues found. Array whose index ranges within [0..I2-I1]. Z - if ZNeeded is equal to: * 0, Z hasn't changed; * 1, Z contains eigenvectors. Array whose indexes range within [0..N-1, 0..I2-I1]. In that case, the eigenvectors are stored in the matrix columns. Result: True, if successful. W contains the eigenvalues, Z contains the eigenvectors (if needed). False, if the bisection method subroutine wasn't able to find the eigenvalues in the given interval or if the inverse iteration subroutine wasn't able to find all the corresponding eigenvectors. In that case, the eigenvalues and eigenvectors are not returned. Note: eigen vectors of Hermitian matrix are defined up to multiplication by a complex number L, such as |L|=1. -- ALGLIB -- Copyright 07.01.2006, 24.03.2007 by Bochkanov Sergey. *************************************************************************/ ae_bool hmatrixevdi(/* Complex */ ae_matrix* a, ae_int_t n, ae_int_t zneeded, ae_bool isupper, ae_int_t i1, ae_int_t i2, /* Real */ ae_vector* w, /* Complex */ ae_matrix* z, ae_state *_state) { ae_frame _frame_block; ae_matrix _a; ae_matrix q; ae_matrix t; ae_vector tau; ae_vector e; ae_vector work; ae_int_t i; ae_int_t k; double v; ae_int_t m; ae_bool result; ae_frame_make(_state, &_frame_block); ae_matrix_init_copy(&_a, a, _state); a = &_a; ae_vector_clear(w); ae_matrix_clear(z); ae_matrix_init(&q, 0, 0, DT_COMPLEX, _state); ae_matrix_init(&t, 0, 0, DT_REAL, _state); ae_vector_init(&tau, 0, DT_COMPLEX, _state); ae_vector_init(&e, 0, DT_REAL, _state); ae_vector_init(&work, 0, DT_REAL, _state); ae_assert(zneeded==0||zneeded==1, "HermitianEigenValuesAndVectorsByIndexes: incorrect ZNeeded", _state); /* * Reduce to tridiagonal form */ hmatrixtd(a, n, isupper, &tau, w, &e, _state); if( zneeded==1 ) { hmatrixtdunpackq(a, n, isupper, &tau, &q, _state); zneeded = 2; } /* * Bisection and inverse iteration */ result = smatrixtdevdi(w, &e, n, zneeded, i1, i2, &t, _state); /* * Eigenvectors are needed * Calculate Z = Q*T = Re(Q)*T + i*Im(Q)*T */ m = i2-i1+1; if( result&&zneeded!=0 ) { ae_vector_set_length(&work, m-1+1, _state); ae_matrix_set_length(z, n-1+1, m-1+1, _state); for(i=0; i<=n-1; i++) { /* * Calculate real part */ for(k=0; k<=m-1; k++) { work.ptr.p_double[k] = (double)(0); } for(k=0; k<=n-1; k++) { v = q.ptr.pp_complex[i][k].x; ae_v_addd(&work.ptr.p_double[0], 1, &t.ptr.pp_double[k][0], 1, ae_v_len(0,m-1), v); } for(k=0; k<=m-1; k++) { z->ptr.pp_complex[i][k].x = work.ptr.p_double[k]; } /* * Calculate imaginary part */ for(k=0; k<=m-1; k++) { work.ptr.p_double[k] = (double)(0); } for(k=0; k<=n-1; k++) { v = q.ptr.pp_complex[i][k].y; ae_v_addd(&work.ptr.p_double[0], 1, 
&t.ptr.pp_double[k][0], 1, ae_v_len(0,m-1), v); } for(k=0; k<=m-1; k++) { z->ptr.pp_complex[i][k].y = work.ptr.p_double[k]; } } } ae_frame_leave(_state); return result; } /************************************************************************* Finding the eigenvalues and eigenvectors of a tridiagonal symmetric matrix The algorithm finds the eigen pairs of a tridiagonal symmetric matrix by using an QL/QR algorithm with implicit shifts. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes one important improvement of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Generally, commercial ALGLIB is several times faster than open-source ! generic C edition, and many times faster than open-source C# edition. ! ! Multithreaded acceleration is NOT supported for this function. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. Input parameters: D - the main diagonal of a tridiagonal matrix. Array whose index ranges within [0..N-1]. E - the secondary diagonal of a tridiagonal matrix. Array whose index ranges within [0..N-2]. N - size of matrix A. ZNeeded - flag controlling whether the eigenvectors are needed or not. If ZNeeded is equal to: * 0, the eigenvectors are not needed; * 1, the eigenvectors of a tridiagonal matrix are multiplied by the square matrix Z. It is used if the tridiagonal matrix is obtained by the similarity transformation of a symmetric matrix; * 2, the eigenvectors of a tridiagonal matrix replace the square matrix Z; * 3, matrix Z contains the first row of the eigenvectors matrix. Z - if ZNeeded=1, Z contains the square matrix by which the eigenvectors are multiplied. Array whose indexes range within [0..N-1, 0..N-1]. Output parameters: D - eigenvalues in ascending order. Array whose index ranges within [0..N-1]. Z - if ZNeeded is equal to: * 0, Z hasn't changed; * 1, Z contains the product of a given matrix (from the left) and the eigenvectors matrix (from the right); * 2, Z contains the eigenvectors. * 3, Z contains the first row of the eigenvectors matrix. If ZNeeded<3, Z is the array whose indexes range within [0..N-1, 0..N-1]. In that case, the eigenvectors are stored in the matrix columns. If ZNeeded=3, Z is the array whose indexes range within [0..0, 0..N-1]. Result: True, if the algorithm has converged. False, if the algorithm hasn't converged. -- LAPACK routine (version 3.0) -- Univ. of Tennessee, Univ. 
of California Berkeley, NAG Ltd., Courant Institute, Argonne National Lab, and Rice University September 30, 1994 *************************************************************************/ ae_bool smatrixtdevd(/* Real */ ae_vector* d, /* Real */ ae_vector* e, ae_int_t n, ae_int_t zneeded, /* Real */ ae_matrix* z, ae_state *_state) { ae_frame _frame_block; ae_vector _e; ae_vector d1; ae_vector e1; ae_vector ex; ae_matrix z1; ae_int_t i; ae_int_t j; ae_bool result; ae_frame_make(_state, &_frame_block); ae_vector_init_copy(&_e, e, _state); e = &_e; ae_vector_init(&d1, 0, DT_REAL, _state); ae_vector_init(&e1, 0, DT_REAL, _state); ae_vector_init(&ex, 0, DT_REAL, _state); ae_matrix_init(&z1, 0, 0, DT_REAL, _state); ae_assert(n>=1, "SMatrixTDEVD: N<=0", _state); ae_assert(zneeded>=0&&zneeded<=3, "SMatrixTDEVD: incorrect ZNeeded", _state); result = ae_false; /* * Preprocess Z: make ZNeeded equal to 0, 1 or 3. * Ensure that memory for Z is allocated. */ if( zneeded==2 ) { /* * Load identity to Z */ rmatrixsetlengthatleast(z, n, n, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { z->ptr.pp_double[i][j] = 0.0; } z->ptr.pp_double[i][i] = 1.0; } zneeded = 1; } if( zneeded==3 ) { /* * Allocate memory */ rmatrixsetlengthatleast(z, 1, n, _state); } /* * Try to solve problem with MKL */ ae_vector_set_length(&ex, n, _state); for(i=0; i<=n-2; i++) { ex.ptr.p_double[i] = e->ptr.p_double[i]; } if( smatrixtdevdmkl(d, &ex, n, zneeded, z, &result, _state) ) { ae_frame_leave(_state); return result; } /* * Prepare 1-based task */ ae_vector_set_length(&d1, n+1, _state); ae_vector_set_length(&e1, n+1, _state); ae_v_move(&d1.ptr.p_double[1], 1, &d->ptr.p_double[0], 1, ae_v_len(1,n)); if( n>1 ) { ae_v_move(&e1.ptr.p_double[1], 1, &e->ptr.p_double[0], 1, ae_v_len(1,n-1)); } if( zneeded==1 ) { ae_matrix_set_length(&z1, n+1, n+1, _state); for(i=1; i<=n; i++) { ae_v_move(&z1.ptr.pp_double[i][1], 1, &z->ptr.pp_double[i-1][0], 1, ae_v_len(1,n)); } } /* * Solve 1-based task */ result = evd_tridiagonalevd(&d1, &e1, n, zneeded, &z1, _state); if( !result ) { ae_frame_leave(_state); return result; } /* * Convert back to 0-based result */ ae_v_move(&d->ptr.p_double[0], 1, &d1.ptr.p_double[1], 1, ae_v_len(0,n-1)); if( zneeded!=0 ) { if( zneeded==1 ) { for(i=1; i<=n; i++) { ae_v_move(&z->ptr.pp_double[i-1][0], 1, &z1.ptr.pp_double[i][1], 1, ae_v_len(0,n-1)); } ae_frame_leave(_state); return result; } if( zneeded==2 ) { ae_matrix_set_length(z, n-1+1, n-1+1, _state); for(i=1; i<=n; i++) { ae_v_move(&z->ptr.pp_double[i-1][0], 1, &z1.ptr.pp_double[i][1], 1, ae_v_len(0,n-1)); } ae_frame_leave(_state); return result; } if( zneeded==3 ) { ae_matrix_set_length(z, 0+1, n-1+1, _state); ae_v_move(&z->ptr.pp_double[0][0], 1, &z1.ptr.pp_double[1][1], 1, ae_v_len(0,n-1)); ae_frame_leave(_state); return result; } ae_assert(ae_false, "SMatrixTDEVD: Incorrect ZNeeded!", _state); } ae_frame_leave(_state); return result; } /************************************************************************* Subroutine for finding the tridiagonal matrix eigenvalues/vectors in a given half-interval (A, B] by using bisection and inverse iteration. Input parameters: D - the main diagonal of a tridiagonal matrix. Array whose index ranges within [0..N-1]. E - the secondary diagonal of a tridiagonal matrix. Array whose index ranges within [0..N-2]. N - size of matrix, N>=0. ZNeeded - flag controlling whether the eigenvectors are needed or not. 
If ZNeeded is equal to: * 0, the eigenvectors are not needed; * 1, the eigenvectors of a tridiagonal matrix are multiplied by the square matrix Z. It is used if the tridiagonal matrix is obtained by the similarity transformation of a symmetric matrix. * 2, the eigenvectors of a tridiagonal matrix replace matrix Z. A, B - half-interval (A, B] to search eigenvalues in. Z - if ZNeeded is equal to: * 0, Z isn't used and remains unchanged; * 1, Z contains the square matrix (array whose indexes range within [0..N-1, 0..N-1]) which reduces the given symmetric matrix to tridiagonal form; * 2, Z isn't used (but changed on the exit). Output parameters: D - array of the eigenvalues found. Array whose index ranges within [0..M-1]. M - number of eigenvalues found in the given half-interval (M>=0). Z - if ZNeeded is equal to: * 0, doesn't contain any information; * 1, contains the product of a given NxN matrix Z (from the left) and NxM matrix of the eigenvectors found (from the right). Array whose indexes range within [0..N-1, 0..M-1]. * 2, contains the matrix of the eigenvectors found. Array whose indexes range within [0..N-1, 0..M-1]. Result: True, if successful. In that case, M contains the number of eigenvalues in the given half-interval (could be equal to 0), D contains the eigenvalues, Z contains the eigenvectors (if needed). It should be noted that the subroutine changes the size of arrays D and Z. False, if the bisection method subroutine wasn't able to find the eigenvalues in the given interval or if the inverse iteration subroutine wasn't able to find all the corresponding eigenvectors. In that case, the eigenvalues and eigenvectors are not returned, M is equal to 0. -- ALGLIB -- Copyright 31.03.2008 by Bochkanov Sergey *************************************************************************/ ae_bool smatrixtdevdr(/* Real */ ae_vector* d, /* Real */ ae_vector* e, ae_int_t n, ae_int_t zneeded, double a, double b, ae_int_t* m, /* Real */ ae_matrix* z, ae_state *_state) { ae_frame _frame_block; ae_int_t errorcode; ae_int_t nsplit; ae_int_t i; ae_int_t j; ae_int_t k; ae_int_t cr; ae_vector iblock; ae_vector isplit; ae_vector ifail; ae_vector d1; ae_vector e1; ae_vector w; ae_matrix z2; ae_matrix z3; double v; ae_bool result; ae_frame_make(_state, &_frame_block); *m = 0; ae_vector_init(&iblock, 0, DT_INT, _state); ae_vector_init(&isplit, 0, DT_INT, _state); ae_vector_init(&ifail, 0, DT_INT, _state); ae_vector_init(&d1, 0, DT_REAL, _state); ae_vector_init(&e1, 0, DT_REAL, _state); ae_vector_init(&w, 0, DT_REAL, _state); ae_matrix_init(&z2, 0, 0, DT_REAL, _state); ae_matrix_init(&z3, 0, 0, DT_REAL, _state); ae_assert(zneeded>=0&&zneeded<=2, "SMatrixTDEVDR: incorrect ZNeeded!", _state); /* * Special cases */ if( ae_fp_less_eq(b,a) ) { *m = 0; result = ae_true; ae_frame_leave(_state); return result; } if( n<=0 ) { *m = 0; result = ae_true; ae_frame_leave(_state); return result; } /* * Copy D,E to D1, E1 */ ae_vector_set_length(&d1, n+1, _state); ae_v_move(&d1.ptr.p_double[1], 1, &d->ptr.p_double[0], 1, ae_v_len(1,n)); if( n>1 ) { ae_vector_set_length(&e1, n-1+1, _state); ae_v_move(&e1.ptr.p_double[1], 1, &e->ptr.p_double[0], 1, ae_v_len(1,n-1)); } /* * No eigen vectors */ if( zneeded==0 ) { result = evd_internalbisectioneigenvalues(&d1, &e1, n, 2, 1, a, b, 0, 0, (double)(-1), &w, m, &nsplit, &iblock, &isplit, &errorcode, _state); if( !result||*m==0 ) { *m = 0; ae_frame_leave(_state); return result; } ae_vector_set_length(d, *m-1+1, _state); ae_v_move(&d->ptr.p_double[0], 1, &w.ptr.p_double[1], 1, 
ae_v_len(0,*m-1)); ae_frame_leave(_state); return result; } /* * Eigen vectors are multiplied by Z */ if( zneeded==1 ) { /* * Find eigen pairs */ result = evd_internalbisectioneigenvalues(&d1, &e1, n, 2, 2, a, b, 0, 0, (double)(-1), &w, m, &nsplit, &iblock, &isplit, &errorcode, _state); if( !result||*m==0 ) { *m = 0; ae_frame_leave(_state); return result; } evd_internaldstein(n, &d1, &e1, *m, &w, &iblock, &isplit, &z2, &ifail, &cr, _state); if( cr!=0 ) { *m = 0; result = ae_false; ae_frame_leave(_state); return result; } /* * Sort eigen values and vectors */ for(i=1; i<=*m; i++) { k = i; for(j=i; j<=*m; j++) { if( ae_fp_less(w.ptr.p_double[j],w.ptr.p_double[k]) ) { k = j; } } v = w.ptr.p_double[i]; w.ptr.p_double[i] = w.ptr.p_double[k]; w.ptr.p_double[k] = v; for(j=1; j<=n; j++) { v = z2.ptr.pp_double[j][i]; z2.ptr.pp_double[j][i] = z2.ptr.pp_double[j][k]; z2.ptr.pp_double[j][k] = v; } } /* * Transform Z2 and overwrite Z */ ae_matrix_set_length(&z3, *m+1, n+1, _state); for(i=1; i<=*m; i++) { ae_v_move(&z3.ptr.pp_double[i][1], 1, &z2.ptr.pp_double[1][i], z2.stride, ae_v_len(1,n)); } for(i=1; i<=n; i++) { for(j=1; j<=*m; j++) { v = ae_v_dotproduct(&z->ptr.pp_double[i-1][0], 1, &z3.ptr.pp_double[j][1], 1, ae_v_len(0,n-1)); z2.ptr.pp_double[i][j] = v; } } ae_matrix_set_length(z, n-1+1, *m-1+1, _state); for(i=1; i<=*m; i++) { ae_v_move(&z->ptr.pp_double[0][i-1], z->stride, &z2.ptr.pp_double[1][i], z2.stride, ae_v_len(0,n-1)); } /* * Store W */ ae_vector_set_length(d, *m-1+1, _state); for(i=1; i<=*m; i++) { d->ptr.p_double[i-1] = w.ptr.p_double[i]; } ae_frame_leave(_state); return result; } /* * Eigen vectors are stored in Z */ if( zneeded==2 ) { /* * Find eigen pairs */ result = evd_internalbisectioneigenvalues(&d1, &e1, n, 2, 2, a, b, 0, 0, (double)(-1), &w, m, &nsplit, &iblock, &isplit, &errorcode, _state); if( !result||*m==0 ) { *m = 0; ae_frame_leave(_state); return result; } evd_internaldstein(n, &d1, &e1, *m, &w, &iblock, &isplit, &z2, &ifail, &cr, _state); if( cr!=0 ) { *m = 0; result = ae_false; ae_frame_leave(_state); return result; } /* * Sort eigen values and vectors */ for(i=1; i<=*m; i++) { k = i; for(j=i; j<=*m; j++) { if( ae_fp_less(w.ptr.p_double[j],w.ptr.p_double[k]) ) { k = j; } } v = w.ptr.p_double[i]; w.ptr.p_double[i] = w.ptr.p_double[k]; w.ptr.p_double[k] = v; for(j=1; j<=n; j++) { v = z2.ptr.pp_double[j][i]; z2.ptr.pp_double[j][i] = z2.ptr.pp_double[j][k]; z2.ptr.pp_double[j][k] = v; } } /* * Store W */ ae_vector_set_length(d, *m-1+1, _state); for(i=1; i<=*m; i++) { d->ptr.p_double[i-1] = w.ptr.p_double[i]; } ae_matrix_set_length(z, n-1+1, *m-1+1, _state); for(i=1; i<=*m; i++) { ae_v_move(&z->ptr.pp_double[0][i-1], z->stride, &z2.ptr.pp_double[1][i], z2.stride, ae_v_len(0,n-1)); } ae_frame_leave(_state); return result; } result = ae_false; ae_frame_leave(_state); return result; } /************************************************************************* Subroutine for finding tridiagonal matrix eigenvalues/vectors with given indexes (in ascending order) by using the bisection and inverse iteraion. Input parameters: D - the main diagonal of a tridiagonal matrix. Array whose index ranges within [0..N-1]. E - the secondary diagonal of a tridiagonal matrix. Array whose index ranges within [0..N-2]. N - size of matrix. N>=0. ZNeeded - flag controlling whether the eigenvectors are needed or not. If ZNeeded is equal to: * 0, the eigenvectors are not needed; * 1, the eigenvectors of a tridiagonal matrix are multiplied by the square matrix Z. 
It is used if the tridiagonal matrix is obtained by the similarity transformation of a symmetric matrix. * 2, the eigenvectors of a tridiagonal matrix replace matrix Z. I1, I2 - index interval for searching (from I1 to I2). 0 <= I1 <= I2 <= N-1. Z - if ZNeeded is equal to: * 0, Z isn't used and remains unchanged; * 1, Z contains the square matrix (array whose indexes range within [0..N-1, 0..N-1]) which reduces the given symmetric matrix to tridiagonal form; * 2, Z isn't used (but changed on the exit). Output parameters: D - array of the eigenvalues found. Array whose index ranges within [0..I2-I1]. Z - if ZNeeded is equal to: * 0, doesn't contain any information; * 1, contains the product of a given NxN matrix Z (from the left) and Nx(I2-I1) matrix of the eigenvectors found (from the right). Array whose indexes range within [0..N-1, 0..I2-I1]. * 2, contains the matrix of the eigenvalues found. Array whose indexes range within [0..N-1, 0..I2-I1]. Result: True, if successful. In that case, D contains the eigenvalues, Z contains the eigenvectors (if needed). It should be noted that the subroutine changes the size of arrays D and Z. False, if the bisection method subroutine wasn't able to find the eigenvalues in the given interval or if the inverse iteration subroutine wasn't able to find all the corresponding eigenvectors. In that case, the eigenvalues and eigenvectors are not returned. -- ALGLIB -- Copyright 25.12.2005 by Bochkanov Sergey *************************************************************************/ ae_bool smatrixtdevdi(/* Real */ ae_vector* d, /* Real */ ae_vector* e, ae_int_t n, ae_int_t zneeded, ae_int_t i1, ae_int_t i2, /* Real */ ae_matrix* z, ae_state *_state) { ae_frame _frame_block; ae_int_t errorcode; ae_int_t nsplit; ae_int_t i; ae_int_t j; ae_int_t k; ae_int_t m; ae_int_t cr; ae_vector iblock; ae_vector isplit; ae_vector ifail; ae_vector w; ae_vector d1; ae_vector e1; ae_matrix z2; ae_matrix z3; double v; ae_bool result; ae_frame_make(_state, &_frame_block); ae_vector_init(&iblock, 0, DT_INT, _state); ae_vector_init(&isplit, 0, DT_INT, _state); ae_vector_init(&ifail, 0, DT_INT, _state); ae_vector_init(&w, 0, DT_REAL, _state); ae_vector_init(&d1, 0, DT_REAL, _state); ae_vector_init(&e1, 0, DT_REAL, _state); ae_matrix_init(&z2, 0, 0, DT_REAL, _state); ae_matrix_init(&z3, 0, 0, DT_REAL, _state); ae_assert((0<=i1&&i1<=i2)&&i2ptr.p_double[0], 1, ae_v_len(1,n)); if( n>1 ) { ae_vector_set_length(&e1, n-1+1, _state); ae_v_move(&e1.ptr.p_double[1], 1, &e->ptr.p_double[0], 1, ae_v_len(1,n-1)); } /* * No eigen vectors */ if( zneeded==0 ) { result = evd_internalbisectioneigenvalues(&d1, &e1, n, 3, 1, (double)(0), (double)(0), i1+1, i2+1, (double)(-1), &w, &m, &nsplit, &iblock, &isplit, &errorcode, _state); if( !result ) { ae_frame_leave(_state); return result; } if( m!=i2-i1+1 ) { result = ae_false; ae_frame_leave(_state); return result; } ae_vector_set_length(d, m-1+1, _state); for(i=1; i<=m; i++) { d->ptr.p_double[i-1] = w.ptr.p_double[i]; } ae_frame_leave(_state); return result; } /* * Eigen vectors are multiplied by Z */ if( zneeded==1 ) { /* * Find eigen pairs */ result = evd_internalbisectioneigenvalues(&d1, &e1, n, 3, 2, (double)(0), (double)(0), i1+1, i2+1, (double)(-1), &w, &m, &nsplit, &iblock, &isplit, &errorcode, _state); if( !result ) { ae_frame_leave(_state); return result; } if( m!=i2-i1+1 ) { result = ae_false; ae_frame_leave(_state); return result; } evd_internaldstein(n, &d1, &e1, m, &w, &iblock, &isplit, &z2, &ifail, &cr, _state); if( cr!=0 ) { result = 
ae_false; ae_frame_leave(_state); return result; } /* * Sort eigen values and vectors */ for(i=1; i<=m; i++) { k = i; for(j=i; j<=m; j++) { if( ae_fp_less(w.ptr.p_double[j],w.ptr.p_double[k]) ) { k = j; } } v = w.ptr.p_double[i]; w.ptr.p_double[i] = w.ptr.p_double[k]; w.ptr.p_double[k] = v; for(j=1; j<=n; j++) { v = z2.ptr.pp_double[j][i]; z2.ptr.pp_double[j][i] = z2.ptr.pp_double[j][k]; z2.ptr.pp_double[j][k] = v; } } /* * Transform Z2 and overwrite Z */ ae_matrix_set_length(&z3, m+1, n+1, _state); for(i=1; i<=m; i++) { ae_v_move(&z3.ptr.pp_double[i][1], 1, &z2.ptr.pp_double[1][i], z2.stride, ae_v_len(1,n)); } for(i=1; i<=n; i++) { for(j=1; j<=m; j++) { v = ae_v_dotproduct(&z->ptr.pp_double[i-1][0], 1, &z3.ptr.pp_double[j][1], 1, ae_v_len(0,n-1)); z2.ptr.pp_double[i][j] = v; } } ae_matrix_set_length(z, n-1+1, m-1+1, _state); for(i=1; i<=m; i++) { ae_v_move(&z->ptr.pp_double[0][i-1], z->stride, &z2.ptr.pp_double[1][i], z2.stride, ae_v_len(0,n-1)); } /* * Store W */ ae_vector_set_length(d, m-1+1, _state); for(i=1; i<=m; i++) { d->ptr.p_double[i-1] = w.ptr.p_double[i]; } ae_frame_leave(_state); return result; } /* * Eigen vectors are stored in Z */ if( zneeded==2 ) { /* * Find eigen pairs */ result = evd_internalbisectioneigenvalues(&d1, &e1, n, 3, 2, (double)(0), (double)(0), i1+1, i2+1, (double)(-1), &w, &m, &nsplit, &iblock, &isplit, &errorcode, _state); if( !result ) { ae_frame_leave(_state); return result; } if( m!=i2-i1+1 ) { result = ae_false; ae_frame_leave(_state); return result; } evd_internaldstein(n, &d1, &e1, m, &w, &iblock, &isplit, &z2, &ifail, &cr, _state); if( cr!=0 ) { result = ae_false; ae_frame_leave(_state); return result; } /* * Sort eigen values and vectors */ for(i=1; i<=m; i++) { k = i; for(j=i; j<=m; j++) { if( ae_fp_less(w.ptr.p_double[j],w.ptr.p_double[k]) ) { k = j; } } v = w.ptr.p_double[i]; w.ptr.p_double[i] = w.ptr.p_double[k]; w.ptr.p_double[k] = v; for(j=1; j<=n; j++) { v = z2.ptr.pp_double[j][i]; z2.ptr.pp_double[j][i] = z2.ptr.pp_double[j][k]; z2.ptr.pp_double[j][k] = v; } } /* * Store Z */ ae_matrix_set_length(z, n-1+1, m-1+1, _state); for(i=1; i<=m; i++) { ae_v_move(&z->ptr.pp_double[0][i-1], z->stride, &z2.ptr.pp_double[1][i], z2.stride, ae_v_len(0,n-1)); } /* * Store W */ ae_vector_set_length(d, m-1+1, _state); for(i=1; i<=m; i++) { d->ptr.p_double[i-1] = w.ptr.p_double[i]; } ae_frame_leave(_state); return result; } result = ae_false; ae_frame_leave(_state); return result; } /************************************************************************* Finding eigenvalues and eigenvectors of a general (unsymmetric) matrix COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes one important improvement of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. Speed-up provided by MKL for this particular problem (EVD) ! is really high, because MKL uses combination of (a) better low-level ! optimizations, and (b) better EVD algorithms. ! ! On one particular SSE-capable machine for N=1024, commercial MKL- ! -capable ALGLIB was: ! * 7-10 times faster than open source "generic C" version ! * 15-18 times faster than "pure C#" version ! ! Multithreaded acceleration is NOT supported for this function. ! ! 
We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. The algorithm finds eigenvalues and eigenvectors of a general matrix by using the QR algorithm with multiple shifts. The algorithm can find eigenvalues and both left and right eigenvectors. The right eigenvector is a vector x such that A*x = w*x, and the left eigenvector is a vector y such that y'*A = w*y' (here y' implies a complex conjugate transposition of vector y). Input parameters: A - matrix. Array whose indexes range within [0..N-1, 0..N-1]. N - size of matrix A. VNeeded - flag controlling whether eigenvectors are needed or not. If VNeeded is equal to: * 0, eigenvectors are not returned; * 1, right eigenvectors are returned; * 2, left eigenvectors are returned; * 3, both left and right eigenvectors are returned. Output parameters: WR - real parts of eigenvalues. Array whose index ranges within [0..N-1]. WR - imaginary parts of eigenvalues. Array whose index ranges within [0..N-1]. VL, VR - arrays of left and right eigenvectors (if they are needed). If WI[i]=0, the respective eigenvalue is a real number, and it corresponds to the column number I of matrices VL/VR. If WI[i]>0, we have a pair of complex conjugate numbers with positive and negative imaginary parts: the first eigenvalue WR[i] + sqrt(-1)*WI[i]; the second eigenvalue WR[i+1] + sqrt(-1)*WI[i+1]; WI[i]>0 WI[i+1] = -WI[i] < 0 In that case, the eigenvector corresponding to the first eigenvalue is located in i and i+1 columns of matrices VL/VR (the column number i contains the real part, and the column number i+1 contains the imaginary part), and the vector corresponding to the second eigenvalue is a complex conjugate to the first vector. Arrays whose indexes range within [0..N-1, 0..N-1]. Result: True, if the algorithm has converged. False, if the algorithm has not converged. Note 1: Some users may ask the following question: what if WI[N-1]>0? WI[N] must contain an eigenvalue which is complex conjugate to the N-th eigenvalue, but the array has only size N? The answer is as follows: such a situation cannot occur because the algorithm finds a pairs of eigenvalues, therefore, if WI[i]>0, I is strictly less than N-1. Note 2: The algorithm performance depends on the value of the internal parameter NS of the InternalSchurDecomposition subroutine which defines the number of shifts in the QR algorithm (similarly to the block width in block-matrix algorithms of linear algebra). If you require maximum performance on your machine, it is recommended to adjust this parameter manually. See also the InternalTREVC subroutine. The algorithm is based on the LAPACK 3.0 library. 
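Below is a minimal usage sketch (it is not part of the library sources). It assumes the C++ wrapper rmatrixevd() declared in linalg.h, whose parameters mirror the ones documented above and which returns a success flag; the 2x2 test matrix is chosen only for illustration. The sketch shows how the packed WR/WI/VL/VR representation of complex conjugate eigenpairs described above is interpreted.

#include <cstdio>
#include "linalg.h"
using namespace alglib;

int main()
{
    real_2d_array a = "[[0,1],[-1,0]]";   // eigenvalues are +i and -i
    real_1d_array wr, wi;
    real_2d_array vl, vr;
    if( !rmatrixevd(a, 2, 1, wr, wi, vl, vr) )   // VNeeded=1: right eigenvectors only
        return 1;
    for(int i=0; i<2; i++)
    {
        if( wi[i]==0.0 )
            printf("real lambda=%g, eigenvector in column %d of VR\n", wr[i], i);
        else
        {
            // WI[i]>0: columns i and i+1 of VR hold Re(x) and Im(x) of the
            // eigenvector for lambda = WR[i]+sqrt(-1)*WI[i]; the eigenvector of
            // the conjugate eigenvalue is the complex conjugate of this vector.
            printf("complex pair %g+-%g*i, Re/Im in columns %d and %d of VR\n",
                   wr[i], wi[i], i, i+1);
            i++;   // skip the conjugate twin
        }
    }
    return 0;
}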
*************************************************************************/ ae_bool rmatrixevd(/* Real */ ae_matrix* a, ae_int_t n, ae_int_t vneeded, /* Real */ ae_vector* wr, /* Real */ ae_vector* wi, /* Real */ ae_matrix* vl, /* Real */ ae_matrix* vr, ae_state *_state) { ae_frame _frame_block; ae_matrix _a; ae_matrix a1; ae_matrix vl1; ae_matrix vr1; ae_matrix s1; ae_matrix s; ae_matrix dummy; ae_vector wr1; ae_vector wi1; ae_vector tau; ae_int_t i; ae_int_t info; ae_vector sel1; ae_int_t m1; ae_bool result; ae_frame_make(_state, &_frame_block); ae_matrix_init_copy(&_a, a, _state); a = &_a; ae_vector_clear(wr); ae_vector_clear(wi); ae_matrix_clear(vl); ae_matrix_clear(vr); ae_matrix_init(&a1, 0, 0, DT_REAL, _state); ae_matrix_init(&vl1, 0, 0, DT_REAL, _state); ae_matrix_init(&vr1, 0, 0, DT_REAL, _state); ae_matrix_init(&s1, 0, 0, DT_REAL, _state); ae_matrix_init(&s, 0, 0, DT_REAL, _state); ae_matrix_init(&dummy, 0, 0, DT_REAL, _state); ae_vector_init(&wr1, 0, DT_REAL, _state); ae_vector_init(&wi1, 0, DT_REAL, _state); ae_vector_init(&tau, 0, DT_REAL, _state); ae_vector_init(&sel1, 0, DT_BOOL, _state); ae_assert(vneeded>=0&&vneeded<=3, "RMatrixEVD: incorrect VNeeded!", _state); if( vneeded==0 ) { /* * Eigen values only */ rmatrixhessenberg(a, n, &tau, _state); rmatrixinternalschurdecomposition(a, n, 0, 0, wr, wi, &dummy, &info, _state); result = info==0; ae_frame_leave(_state); return result; } /* * Eigen values and vectors */ rmatrixhessenberg(a, n, &tau, _state); rmatrixhessenbergunpackq(a, n, &tau, &s, _state); rmatrixinternalschurdecomposition(a, n, 1, 1, wr, wi, &s, &info, _state); result = info==0; if( !result ) { ae_frame_leave(_state); return result; } if( vneeded==1||vneeded==3 ) { ae_matrix_set_length(vr, n, n, _state); for(i=0; i<=n-1; i++) { ae_v_move(&vr->ptr.pp_double[i][0], 1, &s.ptr.pp_double[i][0], 1, ae_v_len(0,n-1)); } } if( vneeded==2||vneeded==3 ) { ae_matrix_set_length(vl, n, n, _state); for(i=0; i<=n-1; i++) { ae_v_move(&vl->ptr.pp_double[i][0], 1, &s.ptr.pp_double[i][0], 1, ae_v_len(0,n-1)); } } evd_rmatrixinternaltrevc(a, n, vneeded, 1, &sel1, vl, vr, &m1, &info, _state); result = info==0; ae_frame_leave(_state); return result; } /************************************************************************* Clears request fileds (to be sure that we don't forgot to clear something) *************************************************************************/ static void evd_clearrfields(eigsubspacestate* state, ae_state *_state) { state->requesttype = -1; state->requestsize = -1; } static ae_bool evd_tridiagonalevd(/* Real */ ae_vector* d, /* Real */ ae_vector* e, ae_int_t n, ae_int_t zneeded, /* Real */ ae_matrix* z, ae_state *_state) { ae_frame _frame_block; ae_vector _e; ae_int_t maxit; ae_int_t i; ae_int_t ii; ae_int_t iscale; ae_int_t j; ae_int_t jtot; ae_int_t k; ae_int_t t; ae_int_t l; ae_int_t l1; ae_int_t lend; ae_int_t lendm1; ae_int_t lendp1; ae_int_t lendsv; ae_int_t lm1; ae_int_t lsv; ae_int_t m; ae_int_t mm1; ae_int_t nm1; ae_int_t nmaxit; ae_int_t tmpint; double anorm; double b; double c; double eps; double eps2; double f; double g; double p; double r; double rt1; double rt2; double s; double safmax; double safmin; double ssfmax; double ssfmin; double tst; double tmp; ae_vector work1; ae_vector work2; ae_vector workc; ae_vector works; ae_vector wtemp; ae_bool gotoflag; ae_int_t zrows; ae_bool wastranspose; ae_bool result; ae_frame_make(_state, &_frame_block); ae_vector_init_copy(&_e, e, _state); e = &_e; ae_vector_init(&work1, 0, DT_REAL, _state); 
ae_vector_init(&work2, 0, DT_REAL, _state); ae_vector_init(&workc, 0, DT_REAL, _state); ae_vector_init(&works, 0, DT_REAL, _state); ae_vector_init(&wtemp, 0, DT_REAL, _state); ae_assert(zneeded>=0&&zneeded<=3, "TridiagonalEVD: Incorrent ZNeeded", _state); /* * Quick return if possible */ if( zneeded<0||zneeded>3 ) { result = ae_false; ae_frame_leave(_state); return result; } result = ae_true; if( n==0 ) { ae_frame_leave(_state); return result; } if( n==1 ) { if( zneeded==2||zneeded==3 ) { ae_matrix_set_length(z, 1+1, 1+1, _state); z->ptr.pp_double[1][1] = (double)(1); } ae_frame_leave(_state); return result; } maxit = 30; /* * Initialize arrays */ ae_vector_set_length(&wtemp, n+1, _state); ae_vector_set_length(&work1, n-1+1, _state); ae_vector_set_length(&work2, n-1+1, _state); ae_vector_set_length(&workc, n+1, _state); ae_vector_set_length(&works, n+1, _state); /* * Determine the unit roundoff and over/underflow thresholds. */ eps = ae_machineepsilon; eps2 = ae_sqr(eps, _state); safmin = ae_minrealnumber; safmax = ae_maxrealnumber; ssfmax = ae_sqrt(safmax, _state)/3; ssfmin = ae_sqrt(safmin, _state)/eps2; /* * Prepare Z * * Here we are using transposition to get rid of column operations * */ wastranspose = ae_false; zrows = 0; if( zneeded==1 ) { zrows = n; } if( zneeded==2 ) { zrows = n; } if( zneeded==3 ) { zrows = 1; } if( zneeded==1 ) { wastranspose = ae_true; inplacetranspose(z, 1, n, 1, n, &wtemp, _state); } if( zneeded==2 ) { wastranspose = ae_true; ae_matrix_set_length(z, n+1, n+1, _state); for(i=1; i<=n; i++) { for(j=1; j<=n; j++) { if( i==j ) { z->ptr.pp_double[i][j] = (double)(1); } else { z->ptr.pp_double[i][j] = (double)(0); } } } } if( zneeded==3 ) { wastranspose = ae_false; ae_matrix_set_length(z, 1+1, n+1, _state); for(j=1; j<=n; j++) { if( j==1 ) { z->ptr.pp_double[1][j] = (double)(1); } else { z->ptr.pp_double[1][j] = (double)(0); } } } nmaxit = n*maxit; jtot = 0; /* * Determine where the matrix splits and choose QL or QR iteration * for each block, according to whether top or bottom diagonal * element is smaller. 
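 *
 * (Editorial note, not in the original source: a split is declared at index M
 * when |E[M]| is negligible relative to eps*sqrt(|D[M]|*|D[M+1]|), see the test
 * below; each block between splits is then processed independently.)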
*/ l1 = 1; nm1 = n-1; for(;;) { if( l1>n ) { break; } if( l1>1 ) { e->ptr.p_double[l1-1] = (double)(0); } gotoflag = ae_false; m = l1; if( l1<=nm1 ) { for(m=l1; m<=nm1; m++) { tst = ae_fabs(e->ptr.p_double[m], _state); if( ae_fp_eq(tst,(double)(0)) ) { gotoflag = ae_true; break; } if( ae_fp_less_eq(tst,ae_sqrt(ae_fabs(d->ptr.p_double[m], _state), _state)*ae_sqrt(ae_fabs(d->ptr.p_double[m+1], _state), _state)*eps) ) { e->ptr.p_double[m] = (double)(0); gotoflag = ae_true; break; } } } if( !gotoflag ) { m = n; } /* * label 30: */ l = l1; lsv = l; lend = m; lendsv = lend; l1 = m+1; if( lend==l ) { continue; } /* * Scale submatrix in rows and columns L to LEND */ if( l==lend ) { anorm = ae_fabs(d->ptr.p_double[l], _state); } else { anorm = ae_maxreal(ae_fabs(d->ptr.p_double[l], _state)+ae_fabs(e->ptr.p_double[l], _state), ae_fabs(e->ptr.p_double[lend-1], _state)+ae_fabs(d->ptr.p_double[lend], _state), _state); for(i=l+1; i<=lend-1; i++) { anorm = ae_maxreal(anorm, ae_fabs(d->ptr.p_double[i], _state)+ae_fabs(e->ptr.p_double[i], _state)+ae_fabs(e->ptr.p_double[i-1], _state), _state); } } iscale = 0; if( ae_fp_eq(anorm,(double)(0)) ) { continue; } if( ae_fp_greater(anorm,ssfmax) ) { iscale = 1; tmp = ssfmax/anorm; tmpint = lend-1; ae_v_muld(&d->ptr.p_double[l], 1, ae_v_len(l,lend), tmp); ae_v_muld(&e->ptr.p_double[l], 1, ae_v_len(l,tmpint), tmp); } if( ae_fp_less(anorm,ssfmin) ) { iscale = 2; tmp = ssfmin/anorm; tmpint = lend-1; ae_v_muld(&d->ptr.p_double[l], 1, ae_v_len(l,lend), tmp); ae_v_muld(&e->ptr.p_double[l], 1, ae_v_len(l,tmpint), tmp); } /* * Choose between QL and QR iteration */ if( ae_fp_less(ae_fabs(d->ptr.p_double[lend], _state),ae_fabs(d->ptr.p_double[l], _state)) ) { lend = lsv; l = lendsv; } if( lend>l ) { /* * QL Iteration * * Look for small subdiagonal element. */ for(;;) { gotoflag = ae_false; if( l!=lend ) { lendm1 = lend-1; for(m=l; m<=lendm1; m++) { tst = ae_sqr(ae_fabs(e->ptr.p_double[m], _state), _state); if( ae_fp_less_eq(tst,eps2*ae_fabs(d->ptr.p_double[m], _state)*ae_fabs(d->ptr.p_double[m+1], _state)+safmin) ) { gotoflag = ae_true; break; } } } if( !gotoflag ) { m = lend; } if( mptr.p_double[m] = (double)(0); } p = d->ptr.p_double[l]; if( m!=l ) { /* * If remaining matrix is 2-by-2, use DLAE2 or SLAEV2 * to compute its eigensystem. */ if( m==l+1 ) { if( zneeded>0 ) { evd_tdevdev2(d->ptr.p_double[l], e->ptr.p_double[l], d->ptr.p_double[l+1], &rt1, &rt2, &c, &s, _state); work1.ptr.p_double[l] = c; work2.ptr.p_double[l] = s; workc.ptr.p_double[1] = work1.ptr.p_double[l]; works.ptr.p_double[1] = work2.ptr.p_double[l]; if( !wastranspose ) { applyrotationsfromtheright(ae_false, 1, zrows, l, l+1, &workc, &works, z, &wtemp, _state); } else { applyrotationsfromtheleft(ae_false, l, l+1, 1, zrows, &workc, &works, z, &wtemp, _state); } } else { evd_tdevde2(d->ptr.p_double[l], e->ptr.p_double[l], d->ptr.p_double[l+1], &rt1, &rt2, _state); } d->ptr.p_double[l] = rt1; d->ptr.p_double[l+1] = rt2; e->ptr.p_double[l] = (double)(0); l = l+2; if( l<=lend ) { continue; } /* * GOTO 140 */ break; } if( jtot==nmaxit ) { /* * GOTO 140 */ break; } jtot = jtot+1; /* * Form shift. 
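 *
 * (Editorial note, not in the original source: the shift is the Wilkinson
 * shift, i.e. the eigenvalue of the leading 2x2 block [[D[L],E[L]],[E[L],D[L+1]]]
 * closest to D[L]; G below accumulates D[M] minus this shift, written in a
 * form which avoids cancellation when E[L] is small.)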
*/ g = (d->ptr.p_double[l+1]-p)/(2*e->ptr.p_double[l]); r = evd_tdevdpythag(g, (double)(1), _state); g = d->ptr.p_double[m]-p+e->ptr.p_double[l]/(g+evd_tdevdextsign(r, g, _state)); s = (double)(1); c = (double)(1); p = (double)(0); /* * Inner loop */ mm1 = m-1; for(i=mm1; i>=l; i--) { f = s*e->ptr.p_double[i]; b = c*e->ptr.p_double[i]; generaterotation(g, f, &c, &s, &r, _state); if( i!=m-1 ) { e->ptr.p_double[i+1] = r; } g = d->ptr.p_double[i+1]-p; r = (d->ptr.p_double[i]-g)*s+2*c*b; p = s*r; d->ptr.p_double[i+1] = g+p; g = c*r-b; /* * If eigenvectors are desired, then save rotations. */ if( zneeded>0 ) { work1.ptr.p_double[i] = c; work2.ptr.p_double[i] = -s; } } /* * If eigenvectors are desired, then apply saved rotations. */ if( zneeded>0 ) { for(i=l; i<=m-1; i++) { workc.ptr.p_double[i-l+1] = work1.ptr.p_double[i]; works.ptr.p_double[i-l+1] = work2.ptr.p_double[i]; } if( !wastranspose ) { applyrotationsfromtheright(ae_false, 1, zrows, l, m, &workc, &works, z, &wtemp, _state); } else { applyrotationsfromtheleft(ae_false, l, m, 1, zrows, &workc, &works, z, &wtemp, _state); } } d->ptr.p_double[l] = d->ptr.p_double[l]-p; e->ptr.p_double[l] = g; continue; } /* * Eigenvalue found. */ d->ptr.p_double[l] = p; l = l+1; if( l<=lend ) { continue; } break; } } else { /* * QR Iteration * * Look for small superdiagonal element. */ for(;;) { gotoflag = ae_false; if( l!=lend ) { lendp1 = lend+1; for(m=l; m>=lendp1; m--) { tst = ae_sqr(ae_fabs(e->ptr.p_double[m-1], _state), _state); if( ae_fp_less_eq(tst,eps2*ae_fabs(d->ptr.p_double[m], _state)*ae_fabs(d->ptr.p_double[m-1], _state)+safmin) ) { gotoflag = ae_true; break; } } } if( !gotoflag ) { m = lend; } if( m>lend ) { e->ptr.p_double[m-1] = (double)(0); } p = d->ptr.p_double[l]; if( m!=l ) { /* * If remaining matrix is 2-by-2, use DLAE2 or SLAEV2 * to compute its eigensystem. */ if( m==l-1 ) { if( zneeded>0 ) { evd_tdevdev2(d->ptr.p_double[l-1], e->ptr.p_double[l-1], d->ptr.p_double[l], &rt1, &rt2, &c, &s, _state); work1.ptr.p_double[m] = c; work2.ptr.p_double[m] = s; workc.ptr.p_double[1] = c; works.ptr.p_double[1] = s; if( !wastranspose ) { applyrotationsfromtheright(ae_true, 1, zrows, l-1, l, &workc, &works, z, &wtemp, _state); } else { applyrotationsfromtheleft(ae_true, l-1, l, 1, zrows, &workc, &works, z, &wtemp, _state); } } else { evd_tdevde2(d->ptr.p_double[l-1], e->ptr.p_double[l-1], d->ptr.p_double[l], &rt1, &rt2, _state); } d->ptr.p_double[l-1] = rt1; d->ptr.p_double[l] = rt2; e->ptr.p_double[l-1] = (double)(0); l = l-2; if( l>=lend ) { continue; } break; } if( jtot==nmaxit ) { break; } jtot = jtot+1; /* * Form shift. */ g = (d->ptr.p_double[l-1]-p)/(2*e->ptr.p_double[l-1]); r = evd_tdevdpythag(g, (double)(1), _state); g = d->ptr.p_double[m]-p+e->ptr.p_double[l-1]/(g+evd_tdevdextsign(r, g, _state)); s = (double)(1); c = (double)(1); p = (double)(0); /* * Inner loop */ lm1 = l-1; for(i=m; i<=lm1; i++) { f = s*e->ptr.p_double[i]; b = c*e->ptr.p_double[i]; generaterotation(g, f, &c, &s, &r, _state); if( i!=m ) { e->ptr.p_double[i-1] = r; } g = d->ptr.p_double[i]-p; r = (d->ptr.p_double[i+1]-g)*s+2*c*b; p = s*r; d->ptr.p_double[i] = g+p; g = c*r-b; /* * If eigenvectors are desired, then save rotations. */ if( zneeded>0 ) { work1.ptr.p_double[i] = c; work2.ptr.p_double[i] = s; } } /* * If eigenvectors are desired, then apply saved rotations. 
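 *
 * (Editorial note, not in the original source: the Givens rotations produced
 * by the sweep above are accumulated into Z over rows/columns M..L - applied
 * from the left when Z is stored transposed, from the right otherwise - so
 * that Z ends up holding the full orthogonal transformation.)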
*/ if( zneeded>0 ) { for(i=m; i<=l-1; i++) { workc.ptr.p_double[i-m+1] = work1.ptr.p_double[i]; works.ptr.p_double[i-m+1] = work2.ptr.p_double[i]; } if( !wastranspose ) { applyrotationsfromtheright(ae_true, 1, zrows, m, l, &workc, &works, z, &wtemp, _state); } else { applyrotationsfromtheleft(ae_true, m, l, 1, zrows, &workc, &works, z, &wtemp, _state); } } d->ptr.p_double[l] = d->ptr.p_double[l]-p; e->ptr.p_double[lm1] = g; continue; } /* * Eigenvalue found. */ d->ptr.p_double[l] = p; l = l-1; if( l>=lend ) { continue; } break; } } /* * Undo scaling if necessary */ if( iscale==1 ) { tmp = anorm/ssfmax; tmpint = lendsv-1; ae_v_muld(&d->ptr.p_double[lsv], 1, ae_v_len(lsv,lendsv), tmp); ae_v_muld(&e->ptr.p_double[lsv], 1, ae_v_len(lsv,tmpint), tmp); } if( iscale==2 ) { tmp = anorm/ssfmin; tmpint = lendsv-1; ae_v_muld(&d->ptr.p_double[lsv], 1, ae_v_len(lsv,lendsv), tmp); ae_v_muld(&e->ptr.p_double[lsv], 1, ae_v_len(lsv,tmpint), tmp); } /* * Check for no convergence to an eigenvalue after a total * of N*MAXIT iterations. */ if( jtot>=nmaxit ) { result = ae_false; if( wastranspose ) { inplacetranspose(z, 1, n, 1, n, &wtemp, _state); } ae_frame_leave(_state); return result; } } /* * Order eigenvalues and eigenvectors. */ if( zneeded==0 ) { /* * Sort */ if( n==1 ) { ae_frame_leave(_state); return result; } if( n==2 ) { if( ae_fp_greater(d->ptr.p_double[1],d->ptr.p_double[2]) ) { tmp = d->ptr.p_double[1]; d->ptr.p_double[1] = d->ptr.p_double[2]; d->ptr.p_double[2] = tmp; } ae_frame_leave(_state); return result; } i = 2; do { t = i; while(t!=1) { k = t/2; if( ae_fp_greater_eq(d->ptr.p_double[k],d->ptr.p_double[t]) ) { t = 1; } else { tmp = d->ptr.p_double[k]; d->ptr.p_double[k] = d->ptr.p_double[t]; d->ptr.p_double[t] = tmp; t = k; } } i = i+1; } while(i<=n); i = n-1; do { tmp = d->ptr.p_double[i+1]; d->ptr.p_double[i+1] = d->ptr.p_double[1]; d->ptr.p_double[1] = tmp; t = 1; while(t!=0) { k = 2*t; if( k>i ) { t = 0; } else { if( kptr.p_double[k+1],d->ptr.p_double[k]) ) { k = k+1; } } if( ae_fp_greater_eq(d->ptr.p_double[t],d->ptr.p_double[k]) ) { t = 0; } else { tmp = d->ptr.p_double[k]; d->ptr.p_double[k] = d->ptr.p_double[t]; d->ptr.p_double[t] = tmp; t = k; } } } i = i-1; } while(i>=1); } else { /* * Use Selection Sort to minimize swaps of eigenvectors */ for(ii=2; ii<=n; ii++) { i = ii-1; k = i; p = d->ptr.p_double[i]; for(j=ii; j<=n; j++) { if( ae_fp_less(d->ptr.p_double[j],p) ) { k = j; p = d->ptr.p_double[j]; } } if( k!=i ) { d->ptr.p_double[k] = d->ptr.p_double[i]; d->ptr.p_double[i] = p; if( wastranspose ) { ae_v_move(&wtemp.ptr.p_double[1], 1, &z->ptr.pp_double[i][1], 1, ae_v_len(1,n)); ae_v_move(&z->ptr.pp_double[i][1], 1, &z->ptr.pp_double[k][1], 1, ae_v_len(1,n)); ae_v_move(&z->ptr.pp_double[k][1], 1, &wtemp.ptr.p_double[1], 1, ae_v_len(1,n)); } else { ae_v_move(&wtemp.ptr.p_double[1], 1, &z->ptr.pp_double[1][i], z->stride, ae_v_len(1,zrows)); ae_v_move(&z->ptr.pp_double[1][i], z->stride, &z->ptr.pp_double[1][k], z->stride, ae_v_len(1,zrows)); ae_v_move(&z->ptr.pp_double[1][k], z->stride, &wtemp.ptr.p_double[1], 1, ae_v_len(1,zrows)); } } } if( wastranspose ) { inplacetranspose(z, 1, n, 1, n, &wtemp, _state); } } ae_frame_leave(_state); return result; } /************************************************************************* DLAE2 computes the eigenvalues of a 2-by-2 symmetric matrix [ A B ] [ B C ]. On return, RT1 is the eigenvalue of larger absolute value, and RT2 is the eigenvalue of smaller absolute value. -- LAPACK auxiliary routine (version 3.0) -- Univ. of Tennessee, Univ. 
of California Berkeley, NAG Ltd., Courant Institute, Argonne National Lab, and Rice University October 31, 1992 *************************************************************************/ static void evd_tdevde2(double a, double b, double c, double* rt1, double* rt2, ae_state *_state) { double ab; double acmn; double acmx; double adf; double df; double rt; double sm; double tb; *rt1 = 0; *rt2 = 0; sm = a+c; df = a-c; adf = ae_fabs(df, _state); tb = b+b; ab = ae_fabs(tb, _state); if( ae_fp_greater(ae_fabs(a, _state),ae_fabs(c, _state)) ) { acmx = a; acmn = c; } else { acmx = c; acmn = a; } if( ae_fp_greater(adf,ab) ) { rt = adf*ae_sqrt(1+ae_sqr(ab/adf, _state), _state); } else { if( ae_fp_less(adf,ab) ) { rt = ab*ae_sqrt(1+ae_sqr(adf/ab, _state), _state); } else { /* * Includes case AB=ADF=0 */ rt = ab*ae_sqrt((double)(2), _state); } } if( ae_fp_less(sm,(double)(0)) ) { *rt1 = 0.5*(sm-rt); /* * Order of execution important. * To get fully accurate smaller eigenvalue, * next line needs to be executed in higher precision. */ *rt2 = acmx/(*rt1)*acmn-b/(*rt1)*b; } else { if( ae_fp_greater(sm,(double)(0)) ) { *rt1 = 0.5*(sm+rt); /* * Order of execution important. * To get fully accurate smaller eigenvalue, * next line needs to be executed in higher precision. */ *rt2 = acmx/(*rt1)*acmn-b/(*rt1)*b; } else { /* * Includes case RT1 = RT2 = 0 */ *rt1 = 0.5*rt; *rt2 = -0.5*rt; } } } /************************************************************************* DLAEV2 computes the eigendecomposition of a 2-by-2 symmetric matrix [ A B ] [ B C ]. On return, RT1 is the eigenvalue of larger absolute value, RT2 is the eigenvalue of smaller absolute value, and (CS1,SN1) is the unit right eigenvector for RT1, giving the decomposition [ CS1 SN1 ] [ A B ] [ CS1 -SN1 ] = [ RT1 0 ] [-SN1 CS1 ] [ B C ] [ SN1 CS1 ] [ 0 RT2 ]. -- LAPACK auxiliary routine (version 3.0) -- Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., Courant Institute, Argonne National Lab, and Rice University October 31, 1992 *************************************************************************/ static void evd_tdevdev2(double a, double b, double c, double* rt1, double* rt2, double* cs1, double* sn1, ae_state *_state) { ae_int_t sgn1; ae_int_t sgn2; double ab; double acmn; double acmx; double acs; double adf; double cs; double ct; double df; double rt; double sm; double tb; double tn; *rt1 = 0; *rt2 = 0; *cs1 = 0; *sn1 = 0; /* * Compute the eigenvalues */ sm = a+c; df = a-c; adf = ae_fabs(df, _state); tb = b+b; ab = ae_fabs(tb, _state); if( ae_fp_greater(ae_fabs(a, _state),ae_fabs(c, _state)) ) { acmx = a; acmn = c; } else { acmx = c; acmn = a; } if( ae_fp_greater(adf,ab) ) { rt = adf*ae_sqrt(1+ae_sqr(ab/adf, _state), _state); } else { if( ae_fp_less(adf,ab) ) { rt = ab*ae_sqrt(1+ae_sqr(adf/ab, _state), _state); } else { /* * Includes case AB=ADF=0 */ rt = ab*ae_sqrt((double)(2), _state); } } if( ae_fp_less(sm,(double)(0)) ) { *rt1 = 0.5*(sm-rt); sgn1 = -1; /* * Order of execution important. * To get fully accurate smaller eigenvalue, * next line needs to be executed in higher precision. */ *rt2 = acmx/(*rt1)*acmn-b/(*rt1)*b; } else { if( ae_fp_greater(sm,(double)(0)) ) { *rt1 = 0.5*(sm+rt); sgn1 = 1; /* * Order of execution important. * To get fully accurate smaller eigenvalue, * next line needs to be executed in higher precision. 
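 *
 * (Editorial note, not in the original source: RT1*RT2 equals the determinant
 * A*C-B*B, so RT2 = (A*C-B*B)/RT1; the factored form ACMX/RT1*ACMN - B/RT1*B
 * keeps intermediate quantities of moderate size instead of forming A*C-B*B
 * directly.)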
*/ *rt2 = acmx/(*rt1)*acmn-b/(*rt1)*b; } else { /* * Includes case RT1 = RT2 = 0 */ *rt1 = 0.5*rt; *rt2 = -0.5*rt; sgn1 = 1; } } /* * Compute the eigenvector */ if( ae_fp_greater_eq(df,(double)(0)) ) { cs = df+rt; sgn2 = 1; } else { cs = df-rt; sgn2 = -1; } acs = ae_fabs(cs, _state); if( ae_fp_greater(acs,ab) ) { ct = -tb/cs; *sn1 = 1/ae_sqrt(1+ct*ct, _state); *cs1 = ct*(*sn1); } else { if( ae_fp_eq(ab,(double)(0)) ) { *cs1 = (double)(1); *sn1 = (double)(0); } else { tn = -cs/tb; *cs1 = 1/ae_sqrt(1+tn*tn, _state); *sn1 = tn*(*cs1); } } if( sgn1==sgn2 ) { tn = *cs1; *cs1 = -*sn1; *sn1 = tn; } } /************************************************************************* Internal routine *************************************************************************/ static double evd_tdevdpythag(double a, double b, ae_state *_state) { double result; if( ae_fp_less(ae_fabs(a, _state),ae_fabs(b, _state)) ) { result = ae_fabs(b, _state)*ae_sqrt(1+ae_sqr(a/b, _state), _state); } else { result = ae_fabs(a, _state)*ae_sqrt(1+ae_sqr(b/a, _state), _state); } return result; } /************************************************************************* Internal routine *************************************************************************/ static double evd_tdevdextsign(double a, double b, ae_state *_state) { double result; if( ae_fp_greater_eq(b,(double)(0)) ) { result = ae_fabs(a, _state); } else { result = -ae_fabs(a, _state); } return result; } static ae_bool evd_internalbisectioneigenvalues(/* Real */ ae_vector* d, /* Real */ ae_vector* e, ae_int_t n, ae_int_t irange, ae_int_t iorder, double vl, double vu, ae_int_t il, ae_int_t iu, double abstol, /* Real */ ae_vector* w, ae_int_t* m, ae_int_t* nsplit, /* Integer */ ae_vector* iblock, /* Integer */ ae_vector* isplit, ae_int_t* errorcode, ae_state *_state) { ae_frame _frame_block; ae_vector _d; ae_vector _e; double fudge; double relfac; ae_bool ncnvrg; ae_bool toofew; ae_int_t ib; ae_int_t ibegin; ae_int_t idiscl; ae_int_t idiscu; ae_int_t ie; ae_int_t iend; ae_int_t iinfo; ae_int_t im; ae_int_t iin; ae_int_t ioff; ae_int_t iout; ae_int_t itmax; ae_int_t iw; ae_int_t iwoff; ae_int_t j; ae_int_t itmp1; ae_int_t jb; ae_int_t jdisc; ae_int_t je; ae_int_t nwl; ae_int_t nwu; double atoli; double bnorm; double gl; double gu; double pivmin; double rtoli; double safemn; double tmp1; double tmp2; double tnorm; double ulp; double wkill; double wl; double wlu; double wu; double wul; double scalefactor; double t; ae_vector idumma; ae_vector work; ae_vector iwork; ae_vector ia1s2; ae_vector ra1s2; ae_matrix ra1s2x2; ae_matrix ia1s2x2; ae_vector ra1siin; ae_vector ra2siin; ae_vector ra3siin; ae_vector ra4siin; ae_matrix ra1siinx2; ae_matrix ia1siinx2; ae_vector iworkspace; ae_vector rworkspace; ae_int_t tmpi; ae_bool result; ae_frame_make(_state, &_frame_block); ae_vector_init_copy(&_d, d, _state); d = &_d; ae_vector_init_copy(&_e, e, _state); e = &_e; ae_vector_clear(w); *m = 0; *nsplit = 0; ae_vector_clear(iblock); ae_vector_clear(isplit); *errorcode = 0; ae_vector_init(&idumma, 0, DT_INT, _state); ae_vector_init(&work, 0, DT_REAL, _state); ae_vector_init(&iwork, 0, DT_INT, _state); ae_vector_init(&ia1s2, 0, DT_INT, _state); ae_vector_init(&ra1s2, 0, DT_REAL, _state); ae_matrix_init(&ra1s2x2, 0, 0, DT_REAL, _state); ae_matrix_init(&ia1s2x2, 0, 0, DT_INT, _state); ae_vector_init(&ra1siin, 0, DT_REAL, _state); ae_vector_init(&ra2siin, 0, DT_REAL, _state); ae_vector_init(&ra3siin, 0, DT_REAL, _state); ae_vector_init(&ra4siin, 0, DT_REAL, _state); 
ae_matrix_init(&ra1siinx2, 0, 0, DT_REAL, _state); ae_matrix_init(&ia1siinx2, 0, 0, DT_INT, _state); ae_vector_init(&iworkspace, 0, DT_INT, _state); ae_vector_init(&rworkspace, 0, DT_REAL, _state); /* * Quick return if possible */ *m = 0; if( n==0 ) { result = ae_true; ae_frame_leave(_state); return result; } /* * Get machine constants * NB is the minimum vector length for vector bisection, or 0 * if only scalar is to be done. */ fudge = (double)(2); relfac = (double)(2); safemn = ae_minrealnumber; ulp = 2*ae_machineepsilon; rtoli = ulp*relfac; ae_vector_set_length(&idumma, 1+1, _state); ae_vector_set_length(&work, 4*n+1, _state); ae_vector_set_length(&iwork, 3*n+1, _state); ae_vector_set_length(w, n+1, _state); ae_vector_set_length(iblock, n+1, _state); ae_vector_set_length(isplit, n+1, _state); ae_vector_set_length(&ia1s2, 2+1, _state); ae_vector_set_length(&ra1s2, 2+1, _state); ae_matrix_set_length(&ra1s2x2, 2+1, 2+1, _state); ae_matrix_set_length(&ia1s2x2, 2+1, 2+1, _state); ae_vector_set_length(&ra1siin, n+1, _state); ae_vector_set_length(&ra2siin, n+1, _state); ae_vector_set_length(&ra3siin, n+1, _state); ae_vector_set_length(&ra4siin, n+1, _state); ae_matrix_set_length(&ra1siinx2, n+1, 2+1, _state); ae_matrix_set_length(&ia1siinx2, n+1, 2+1, _state); ae_vector_set_length(&iworkspace, n+1, _state); ae_vector_set_length(&rworkspace, n+1, _state); /* * these initializers are not really necessary, * but without them compiler complains about uninitialized locals */ wlu = (double)(0); wul = (double)(0); /* * Check for Errors */ result = ae_false; *errorcode = 0; if( irange<=0||irange>=4 ) { *errorcode = -4; } if( iorder<=0||iorder>=3 ) { *errorcode = -5; } if( n<0 ) { *errorcode = -3; } if( irange==2&&ae_fp_greater_eq(vl,vu) ) { *errorcode = -6; } if( irange==3&&(il<1||il>ae_maxint(1, n, _state)) ) { *errorcode = -8; } if( irange==3&&(iun) ) { *errorcode = -9; } if( *errorcode!=0 ) { ae_frame_leave(_state); return result; } /* * Initialize error flags */ ncnvrg = ae_false; toofew = ae_false; /* * Simplifications: */ if( (irange==3&&il==1)&&iu==n ) { irange = 1; } /* * Special Case when N=1 */ if( n==1 ) { *nsplit = 1; isplit->ptr.p_int[1] = 1; if( irange==2&&(ae_fp_greater_eq(vl,d->ptr.p_double[1])||ae_fp_less(vu,d->ptr.p_double[1])) ) { *m = 0; } else { w->ptr.p_double[1] = d->ptr.p_double[1]; iblock->ptr.p_int[1] = 1; *m = 1; } result = ae_true; ae_frame_leave(_state); return result; } /* * Scaling */ t = ae_fabs(d->ptr.p_double[n], _state); for(j=1; j<=n-1; j++) { t = ae_maxreal(t, ae_fabs(d->ptr.p_double[j], _state), _state); t = ae_maxreal(t, ae_fabs(e->ptr.p_double[j], _state), _state); } scalefactor = (double)(1); if( ae_fp_neq(t,(double)(0)) ) { if( ae_fp_greater(t,ae_sqrt(ae_sqrt(ae_minrealnumber, _state), _state)*ae_sqrt(ae_maxrealnumber, _state)) ) { scalefactor = t; } if( ae_fp_less(t,ae_sqrt(ae_sqrt(ae_maxrealnumber, _state), _state)*ae_sqrt(ae_minrealnumber, _state)) ) { scalefactor = t; } for(j=1; j<=n-1; j++) { d->ptr.p_double[j] = d->ptr.p_double[j]/scalefactor; e->ptr.p_double[j] = e->ptr.p_double[j]/scalefactor; } d->ptr.p_double[n] = d->ptr.p_double[n]/scalefactor; } /* * Compute Splitting Points */ *nsplit = 1; work.ptr.p_double[n] = (double)(0); pivmin = (double)(1); for(j=2; j<=n; j++) { tmp1 = ae_sqr(e->ptr.p_double[j-1], _state); if( ae_fp_greater(ae_fabs(d->ptr.p_double[j]*d->ptr.p_double[j-1], _state)*ae_sqr(ulp, _state)+safemn,tmp1) ) { isplit->ptr.p_int[*nsplit] = j-1; *nsplit = *nsplit+1; work.ptr.p_double[j-1] = (double)(0); } else { work.ptr.p_double[j-1] = 
tmp1; pivmin = ae_maxreal(pivmin, tmp1, _state); } } isplit->ptr.p_int[*nsplit] = n; pivmin = pivmin*safemn; /* * Compute Interval and ATOLI */ if( irange==3 ) { /* * RANGE='I': Compute the interval containing eigenvalues * IL through IU. * * Compute Gershgorin interval for entire (split) matrix * and use it as the initial interval */ gu = d->ptr.p_double[1]; gl = d->ptr.p_double[1]; tmp1 = (double)(0); for(j=1; j<=n-1; j++) { tmp2 = ae_sqrt(work.ptr.p_double[j], _state); gu = ae_maxreal(gu, d->ptr.p_double[j]+tmp1+tmp2, _state); gl = ae_minreal(gl, d->ptr.p_double[j]-tmp1-tmp2, _state); tmp1 = tmp2; } gu = ae_maxreal(gu, d->ptr.p_double[n]+tmp1, _state); gl = ae_minreal(gl, d->ptr.p_double[n]-tmp1, _state); tnorm = ae_maxreal(ae_fabs(gl, _state), ae_fabs(gu, _state), _state); gl = gl-fudge*tnorm*ulp*n-fudge*2*pivmin; gu = gu+fudge*tnorm*ulp*n+fudge*pivmin; /* * Compute Iteration parameters */ itmax = ae_iceil((ae_log(tnorm+pivmin, _state)-ae_log(pivmin, _state))/ae_log((double)(2), _state), _state)+2; if( ae_fp_less_eq(abstol,(double)(0)) ) { atoli = ulp*tnorm; } else { atoli = abstol; } work.ptr.p_double[n+1] = gl; work.ptr.p_double[n+2] = gl; work.ptr.p_double[n+3] = gu; work.ptr.p_double[n+4] = gu; work.ptr.p_double[n+5] = gl; work.ptr.p_double[n+6] = gu; iwork.ptr.p_int[1] = -1; iwork.ptr.p_int[2] = -1; iwork.ptr.p_int[3] = n+1; iwork.ptr.p_int[4] = n+1; iwork.ptr.p_int[5] = il-1; iwork.ptr.p_int[6] = iu; /* * Calling DLAEBZ * * DLAEBZ( 3, ITMAX, N, 2, 2, NB, ATOLI, RTOLI, PIVMIN, D, E, * WORK, IWORK( 5 ), WORK( N+1 ), WORK( N+5 ), IOUT, * IWORK, W, IBLOCK, IINFO ) */ ia1s2.ptr.p_int[1] = iwork.ptr.p_int[5]; ia1s2.ptr.p_int[2] = iwork.ptr.p_int[6]; ra1s2.ptr.p_double[1] = work.ptr.p_double[n+5]; ra1s2.ptr.p_double[2] = work.ptr.p_double[n+6]; ra1s2x2.ptr.pp_double[1][1] = work.ptr.p_double[n+1]; ra1s2x2.ptr.pp_double[2][1] = work.ptr.p_double[n+2]; ra1s2x2.ptr.pp_double[1][2] = work.ptr.p_double[n+3]; ra1s2x2.ptr.pp_double[2][2] = work.ptr.p_double[n+4]; ia1s2x2.ptr.pp_int[1][1] = iwork.ptr.p_int[1]; ia1s2x2.ptr.pp_int[2][1] = iwork.ptr.p_int[2]; ia1s2x2.ptr.pp_int[1][2] = iwork.ptr.p_int[3]; ia1s2x2.ptr.pp_int[2][2] = iwork.ptr.p_int[4]; evd_internaldlaebz(3, itmax, n, 2, 2, atoli, rtoli, pivmin, d, e, &work, &ia1s2, &ra1s2x2, &ra1s2, &iout, &ia1s2x2, w, iblock, &iinfo, _state); iwork.ptr.p_int[5] = ia1s2.ptr.p_int[1]; iwork.ptr.p_int[6] = ia1s2.ptr.p_int[2]; work.ptr.p_double[n+5] = ra1s2.ptr.p_double[1]; work.ptr.p_double[n+6] = ra1s2.ptr.p_double[2]; work.ptr.p_double[n+1] = ra1s2x2.ptr.pp_double[1][1]; work.ptr.p_double[n+2] = ra1s2x2.ptr.pp_double[2][1]; work.ptr.p_double[n+3] = ra1s2x2.ptr.pp_double[1][2]; work.ptr.p_double[n+4] = ra1s2x2.ptr.pp_double[2][2]; iwork.ptr.p_int[1] = ia1s2x2.ptr.pp_int[1][1]; iwork.ptr.p_int[2] = ia1s2x2.ptr.pp_int[2][1]; iwork.ptr.p_int[3] = ia1s2x2.ptr.pp_int[1][2]; iwork.ptr.p_int[4] = ia1s2x2.ptr.pp_int[2][2]; if( iwork.ptr.p_int[6]==iu ) { wl = work.ptr.p_double[n+1]; wlu = work.ptr.p_double[n+3]; nwl = iwork.ptr.p_int[1]; wu = work.ptr.p_double[n+4]; wul = work.ptr.p_double[n+2]; nwu = iwork.ptr.p_int[4]; } else { wl = work.ptr.p_double[n+2]; wlu = work.ptr.p_double[n+4]; nwl = iwork.ptr.p_int[2]; wu = work.ptr.p_double[n+3]; wul = work.ptr.p_double[n+1]; nwu = iwork.ptr.p_int[3]; } if( ((nwl<0||nwl>=n)||nwu<1)||nwu>n ) { *errorcode = 4; result = ae_false; ae_frame_leave(_state); return result; } } else { /* * RANGE='A' or 'V' -- Set ATOLI */ tnorm = ae_maxreal(ae_fabs(d->ptr.p_double[1], _state)+ae_fabs(e->ptr.p_double[1], _state), 
ae_fabs(d->ptr.p_double[n], _state)+ae_fabs(e->ptr.p_double[n-1], _state), _state); for(j=2; j<=n-1; j++) { tnorm = ae_maxreal(tnorm, ae_fabs(d->ptr.p_double[j], _state)+ae_fabs(e->ptr.p_double[j-1], _state)+ae_fabs(e->ptr.p_double[j], _state), _state); } if( ae_fp_less_eq(abstol,(double)(0)) ) { atoli = ulp*tnorm; } else { atoli = abstol; } if( irange==2 ) { wl = vl; wu = vu; } else { wl = (double)(0); wu = (double)(0); } } /* * Find Eigenvalues -- Loop Over Blocks and recompute NWL and NWU. * NWL accumulates the number of eigenvalues .le. WL, * NWU accumulates the number of eigenvalues .le. WU */ *m = 0; iend = 0; *errorcode = 0; nwl = 0; nwu = 0; for(jb=1; jb<=*nsplit; jb++) { ioff = iend; ibegin = ioff+1; iend = isplit->ptr.p_int[jb]; iin = iend-ioff; if( iin==1 ) { /* * Special Case -- IIN=1 */ if( irange==1||ae_fp_greater_eq(wl,d->ptr.p_double[ibegin]-pivmin) ) { nwl = nwl+1; } if( irange==1||ae_fp_greater_eq(wu,d->ptr.p_double[ibegin]-pivmin) ) { nwu = nwu+1; } if( irange==1||(ae_fp_less(wl,d->ptr.p_double[ibegin]-pivmin)&&ae_fp_greater_eq(wu,d->ptr.p_double[ibegin]-pivmin)) ) { *m = *m+1; w->ptr.p_double[*m] = d->ptr.p_double[ibegin]; iblock->ptr.p_int[*m] = jb; } } else { /* * General Case -- IIN > 1 * * Compute Gershgorin Interval * and use it as the initial interval */ gu = d->ptr.p_double[ibegin]; gl = d->ptr.p_double[ibegin]; tmp1 = (double)(0); for(j=ibegin; j<=iend-1; j++) { tmp2 = ae_fabs(e->ptr.p_double[j], _state); gu = ae_maxreal(gu, d->ptr.p_double[j]+tmp1+tmp2, _state); gl = ae_minreal(gl, d->ptr.p_double[j]-tmp1-tmp2, _state); tmp1 = tmp2; } gu = ae_maxreal(gu, d->ptr.p_double[iend]+tmp1, _state); gl = ae_minreal(gl, d->ptr.p_double[iend]-tmp1, _state); bnorm = ae_maxreal(ae_fabs(gl, _state), ae_fabs(gu, _state), _state); gl = gl-fudge*bnorm*ulp*iin-fudge*pivmin; gu = gu+fudge*bnorm*ulp*iin+fudge*pivmin; /* * Compute ATOLI for the current submatrix */ if( ae_fp_less_eq(abstol,(double)(0)) ) { atoli = ulp*ae_maxreal(ae_fabs(gl, _state), ae_fabs(gu, _state), _state); } else { atoli = abstol; } if( irange>1 ) { if( ae_fp_less(gu,wl) ) { nwl = nwl+iin; nwu = nwu+iin; continue; } gl = ae_maxreal(gl, wl, _state); gu = ae_minreal(gu, wu, _state); if( ae_fp_greater_eq(gl,gu) ) { continue; } } /* * Set Up Initial Interval */ work.ptr.p_double[n+1] = gl; work.ptr.p_double[n+iin+1] = gu; /* * Calling DLAEBZ * * CALL DLAEBZ( 1, 0, IN, IN, 1, NB, ATOLI, RTOLI, PIVMIN, * D( IBEGIN ), E( IBEGIN ), WORK( IBEGIN ), * IDUMMA, WORK( N+1 ), WORK( N+2*IN+1 ), IM, * IWORK, W( M+1 ), IBLOCK( M+1 ), IINFO ) */ for(tmpi=1; tmpi<=iin; tmpi++) { ra1siin.ptr.p_double[tmpi] = d->ptr.p_double[ibegin-1+tmpi]; if( ibegin-1+tmpiptr.p_double[ibegin-1+tmpi]; } ra3siin.ptr.p_double[tmpi] = work.ptr.p_double[ibegin-1+tmpi]; ra1siinx2.ptr.pp_double[tmpi][1] = work.ptr.p_double[n+tmpi]; ra1siinx2.ptr.pp_double[tmpi][2] = work.ptr.p_double[n+tmpi+iin]; ra4siin.ptr.p_double[tmpi] = work.ptr.p_double[n+2*iin+tmpi]; rworkspace.ptr.p_double[tmpi] = w->ptr.p_double[*m+tmpi]; iworkspace.ptr.p_int[tmpi] = iblock->ptr.p_int[*m+tmpi]; ia1siinx2.ptr.pp_int[tmpi][1] = iwork.ptr.p_int[tmpi]; ia1siinx2.ptr.pp_int[tmpi][2] = iwork.ptr.p_int[tmpi+iin]; } evd_internaldlaebz(1, 0, iin, iin, 1, atoli, rtoli, pivmin, &ra1siin, &ra2siin, &ra3siin, &idumma, &ra1siinx2, &ra4siin, &im, &ia1siinx2, &rworkspace, &iworkspace, &iinfo, _state); for(tmpi=1; tmpi<=iin; tmpi++) { work.ptr.p_double[n+tmpi] = ra1siinx2.ptr.pp_double[tmpi][1]; work.ptr.p_double[n+tmpi+iin] = ra1siinx2.ptr.pp_double[tmpi][2]; work.ptr.p_double[n+2*iin+tmpi] 
= ra4siin.ptr.p_double[tmpi]; w->ptr.p_double[*m+tmpi] = rworkspace.ptr.p_double[tmpi]; iblock->ptr.p_int[*m+tmpi] = iworkspace.ptr.p_int[tmpi]; iwork.ptr.p_int[tmpi] = ia1siinx2.ptr.pp_int[tmpi][1]; iwork.ptr.p_int[tmpi+iin] = ia1siinx2.ptr.pp_int[tmpi][2]; } nwl = nwl+iwork.ptr.p_int[1]; nwu = nwu+iwork.ptr.p_int[iin+1]; iwoff = *m-iwork.ptr.p_int[1]; /* * Compute Eigenvalues */ itmax = ae_iceil((ae_log(gu-gl+pivmin, _state)-ae_log(pivmin, _state))/ae_log((double)(2), _state), _state)+2; /* * Calling DLAEBZ * *CALL DLAEBZ( 2, ITMAX, IN, IN, 1, NB, ATOLI, RTOLI, PIVMIN, * D( IBEGIN ), E( IBEGIN ), WORK( IBEGIN ), * IDUMMA, WORK( N+1 ), WORK( N+2*IN+1 ), IOUT, * IWORK, W( M+1 ), IBLOCK( M+1 ), IINFO ) */ for(tmpi=1; tmpi<=iin; tmpi++) { ra1siin.ptr.p_double[tmpi] = d->ptr.p_double[ibegin-1+tmpi]; if( ibegin-1+tmpiptr.p_double[ibegin-1+tmpi]; } ra3siin.ptr.p_double[tmpi] = work.ptr.p_double[ibegin-1+tmpi]; ra1siinx2.ptr.pp_double[tmpi][1] = work.ptr.p_double[n+tmpi]; ra1siinx2.ptr.pp_double[tmpi][2] = work.ptr.p_double[n+tmpi+iin]; ra4siin.ptr.p_double[tmpi] = work.ptr.p_double[n+2*iin+tmpi]; rworkspace.ptr.p_double[tmpi] = w->ptr.p_double[*m+tmpi]; iworkspace.ptr.p_int[tmpi] = iblock->ptr.p_int[*m+tmpi]; ia1siinx2.ptr.pp_int[tmpi][1] = iwork.ptr.p_int[tmpi]; ia1siinx2.ptr.pp_int[tmpi][2] = iwork.ptr.p_int[tmpi+iin]; } evd_internaldlaebz(2, itmax, iin, iin, 1, atoli, rtoli, pivmin, &ra1siin, &ra2siin, &ra3siin, &idumma, &ra1siinx2, &ra4siin, &iout, &ia1siinx2, &rworkspace, &iworkspace, &iinfo, _state); for(tmpi=1; tmpi<=iin; tmpi++) { work.ptr.p_double[n+tmpi] = ra1siinx2.ptr.pp_double[tmpi][1]; work.ptr.p_double[n+tmpi+iin] = ra1siinx2.ptr.pp_double[tmpi][2]; work.ptr.p_double[n+2*iin+tmpi] = ra4siin.ptr.p_double[tmpi]; w->ptr.p_double[*m+tmpi] = rworkspace.ptr.p_double[tmpi]; iblock->ptr.p_int[*m+tmpi] = iworkspace.ptr.p_int[tmpi]; iwork.ptr.p_int[tmpi] = ia1siinx2.ptr.pp_int[tmpi][1]; iwork.ptr.p_int[tmpi+iin] = ia1siinx2.ptr.pp_int[tmpi][2]; } /* * Copy Eigenvalues Into W and IBLOCK * Use -JB for block number for unconverged eigenvalues. */ for(j=1; j<=iout; j++) { tmp1 = 0.5*(work.ptr.p_double[j+n]+work.ptr.p_double[j+iin+n]); /* * Flag non-convergence. */ if( j>iout-iinfo ) { ncnvrg = ae_true; ib = -jb; } else { ib = jb; } for(je=iwork.ptr.p_int[j]+1+iwoff; je<=iwork.ptr.p_int[j+iin]+iwoff; je++) { w->ptr.p_double[je] = tmp1; iblock->ptr.p_int[je] = ib; } } *m = *m+im; } } /* * If RANGE='I', then (WL,WU) contains eigenvalues NWL+1,...,NWU * If NWL+1 < IL or NWU > IU, discard extra eigenvalues. */ if( irange==3 ) { im = 0; idiscl = il-1-nwl; idiscu = nwu-iu; if( idiscl>0||idiscu>0 ) { for(je=1; je<=*m; je++) { if( ae_fp_less_eq(w->ptr.p_double[je],wlu)&&idiscl>0 ) { idiscl = idiscl-1; } else { if( ae_fp_greater_eq(w->ptr.p_double[je],wul)&&idiscu>0 ) { idiscu = idiscu-1; } else { im = im+1; w->ptr.p_double[im] = w->ptr.p_double[je]; iblock->ptr.p_int[im] = iblock->ptr.p_int[je]; } } } *m = im; } if( idiscl>0||idiscu>0 ) { /* * Code to deal with effects of bad arithmetic: * Some low eigenvalues to be discarded are not in (WL,WLU], * or high eigenvalues to be discarded are not in (WUL,WU] * so just kill off the smallest IDISCL/largest IDISCU * eigenvalues, by simply finding the smallest/largest * eigenvalue(s). * * (If N(w) is monotone non-decreasing, this should never * happen.) 
*/ if( idiscl>0 ) { wkill = wu; for(jdisc=1; jdisc<=idiscl; jdisc++) { iw = 0; for(je=1; je<=*m; je++) { if( iblock->ptr.p_int[je]!=0&&(ae_fp_less(w->ptr.p_double[je],wkill)||iw==0) ) { iw = je; wkill = w->ptr.p_double[je]; } } iblock->ptr.p_int[iw] = 0; } } if( idiscu>0 ) { wkill = wl; for(jdisc=1; jdisc<=idiscu; jdisc++) { iw = 0; for(je=1; je<=*m; je++) { if( iblock->ptr.p_int[je]!=0&&(ae_fp_greater(w->ptr.p_double[je],wkill)||iw==0) ) { iw = je; wkill = w->ptr.p_double[je]; } } iblock->ptr.p_int[iw] = 0; } } im = 0; for(je=1; je<=*m; je++) { if( iblock->ptr.p_int[je]!=0 ) { im = im+1; w->ptr.p_double[im] = w->ptr.p_double[je]; iblock->ptr.p_int[im] = iblock->ptr.p_int[je]; } } *m = im; } if( idiscl<0||idiscu<0 ) { toofew = ae_true; } } /* * If ORDER='B', do nothing -- the eigenvalues are already sorted * by block. * If ORDER='E', sort the eigenvalues from smallest to largest */ if( iorder==1&&*nsplit>1 ) { for(je=1; je<=*m-1; je++) { ie = 0; tmp1 = w->ptr.p_double[je]; for(j=je+1; j<=*m; j++) { if( ae_fp_less(w->ptr.p_double[j],tmp1) ) { ie = j; tmp1 = w->ptr.p_double[j]; } } if( ie!=0 ) { itmp1 = iblock->ptr.p_int[ie]; w->ptr.p_double[ie] = w->ptr.p_double[je]; iblock->ptr.p_int[ie] = iblock->ptr.p_int[je]; w->ptr.p_double[je] = tmp1; iblock->ptr.p_int[je] = itmp1; } } } for(j=1; j<=*m; j++) { w->ptr.p_double[j] = w->ptr.p_double[j]*scalefactor; } *errorcode = 0; if( ncnvrg ) { *errorcode = *errorcode+1; } if( toofew ) { *errorcode = *errorcode+2; } result = *errorcode==0; ae_frame_leave(_state); return result; } static void evd_internaldstein(ae_int_t n, /* Real */ ae_vector* d, /* Real */ ae_vector* e, ae_int_t m, /* Real */ ae_vector* w, /* Integer */ ae_vector* iblock, /* Integer */ ae_vector* isplit, /* Real */ ae_matrix* z, /* Integer */ ae_vector* ifail, ae_int_t* info, ae_state *_state) { ae_frame _frame_block; ae_vector _e; ae_vector _w; ae_int_t maxits; ae_int_t extra; ae_int_t b1; ae_int_t blksiz; ae_int_t bn; ae_int_t gpind; ae_int_t i; ae_int_t iinfo; ae_int_t its; ae_int_t j; ae_int_t j1; ae_int_t jblk; ae_int_t jmax; ae_int_t nblk; ae_int_t nrmchk; double dtpcrt; double eps; double eps1; double nrm; double onenrm; double ortol; double pertol; double scl; double sep; double tol; double xj; double xjm; double ztr; ae_vector work1; ae_vector work2; ae_vector work3; ae_vector work4; ae_vector work5; ae_vector iwork; ae_bool tmpcriterion; ae_int_t ti; ae_int_t i1; ae_int_t i2; double v; ae_frame_make(_state, &_frame_block); ae_vector_init_copy(&_e, e, _state); e = &_e; ae_vector_init_copy(&_w, w, _state); w = &_w; ae_matrix_clear(z); ae_vector_clear(ifail); *info = 0; ae_vector_init(&work1, 0, DT_REAL, _state); ae_vector_init(&work2, 0, DT_REAL, _state); ae_vector_init(&work3, 0, DT_REAL, _state); ae_vector_init(&work4, 0, DT_REAL, _state); ae_vector_init(&work5, 0, DT_REAL, _state); ae_vector_init(&iwork, 0, DT_INT, _state); maxits = 5; extra = 2; ae_vector_set_length(&work1, ae_maxint(n, 1, _state)+1, _state); ae_vector_set_length(&work2, ae_maxint(n-1, 1, _state)+1, _state); ae_vector_set_length(&work3, ae_maxint(n, 1, _state)+1, _state); ae_vector_set_length(&work4, ae_maxint(n, 1, _state)+1, _state); ae_vector_set_length(&work5, ae_maxint(n, 1, _state)+1, _state); ae_vector_set_length(&iwork, ae_maxint(n, 1, _state)+1, _state); ae_vector_set_length(ifail, ae_maxint(m, 1, _state)+1, _state); ae_matrix_set_length(z, ae_maxint(n, 1, _state)+1, ae_maxint(m, 1, _state)+1, _state); /* * these initializers are not really necessary, * but without them compiler complains about 
uninitialized locals */ gpind = 0; onenrm = (double)(0); ortol = (double)(0); dtpcrt = (double)(0); xjm = (double)(0); /* * Test the input parameters. */ *info = 0; for(i=1; i<=m; i++) { ifail->ptr.p_int[i] = 0; } if( n<0 ) { *info = -1; ae_frame_leave(_state); return; } if( m<0||m>n ) { *info = -4; ae_frame_leave(_state); return; } for(j=2; j<=m; j++) { if( iblock->ptr.p_int[j]ptr.p_int[j-1] ) { *info = -6; break; } if( iblock->ptr.p_int[j]==iblock->ptr.p_int[j-1]&&ae_fp_less(w->ptr.p_double[j],w->ptr.p_double[j-1]) ) { *info = -5; break; } } if( *info!=0 ) { ae_frame_leave(_state); return; } /* * Quick return if possible */ if( n==0||m==0 ) { ae_frame_leave(_state); return; } if( n==1 ) { z->ptr.pp_double[1][1] = (double)(1); ae_frame_leave(_state); return; } /* * Some preparations */ ti = n-1; ae_v_move(&work1.ptr.p_double[1], 1, &e->ptr.p_double[1], 1, ae_v_len(1,ti)); ae_vector_set_length(e, n+1, _state); ae_v_move(&e->ptr.p_double[1], 1, &work1.ptr.p_double[1], 1, ae_v_len(1,ti)); ae_v_move(&work1.ptr.p_double[1], 1, &w->ptr.p_double[1], 1, ae_v_len(1,m)); ae_vector_set_length(w, n+1, _state); ae_v_move(&w->ptr.p_double[1], 1, &work1.ptr.p_double[1], 1, ae_v_len(1,m)); /* * Get machine constants. */ eps = ae_machineepsilon; /* * Compute eigenvectors of matrix blocks. */ j1 = 1; for(nblk=1; nblk<=iblock->ptr.p_int[m]; nblk++) { /* * Find starting and ending indices of block nblk. */ if( nblk==1 ) { b1 = 1; } else { b1 = isplit->ptr.p_int[nblk-1]+1; } bn = isplit->ptr.p_int[nblk]; blksiz = bn-b1+1; if( blksiz!=1 ) { /* * Compute reorthogonalization criterion and stopping criterion. */ gpind = b1; onenrm = ae_fabs(d->ptr.p_double[b1], _state)+ae_fabs(e->ptr.p_double[b1], _state); onenrm = ae_maxreal(onenrm, ae_fabs(d->ptr.p_double[bn], _state)+ae_fabs(e->ptr.p_double[bn-1], _state), _state); for(i=b1+1; i<=bn-1; i++) { onenrm = ae_maxreal(onenrm, ae_fabs(d->ptr.p_double[i], _state)+ae_fabs(e->ptr.p_double[i-1], _state)+ae_fabs(e->ptr.p_double[i], _state), _state); } ortol = 0.001*onenrm; dtpcrt = ae_sqrt(0.1/blksiz, _state); } /* * Loop through eigenvalues of block nblk. */ jblk = 0; for(j=j1; j<=m; j++) { if( iblock->ptr.p_int[j]!=nblk ) { j1 = j; break; } jblk = jblk+1; xj = w->ptr.p_double[j]; if( blksiz==1 ) { /* * Skip all the work if the block size is one. */ work1.ptr.p_double[1] = (double)(1); } else { /* * If eigenvalues j and j-1 are too close, add a relatively * small perturbation. */ if( jblk>1 ) { eps1 = ae_fabs(eps*xj, _state); pertol = 10*eps1; sep = xj-xjm; if( ae_fp_less(sep,pertol) ) { xj = xjm+pertol; } } its = 0; nrmchk = 0; /* * Get random starting vector. */ for(ti=1; ti<=blksiz; ti++) { work1.ptr.p_double[ti] = 2*ae_randomreal(_state)-1; } /* * Copy the matrix T so it won't be destroyed in factorization. */ for(ti=1; ti<=blksiz-1; ti++) { work2.ptr.p_double[ti] = e->ptr.p_double[b1+ti-1]; work3.ptr.p_double[ti] = e->ptr.p_double[b1+ti-1]; work4.ptr.p_double[ti] = d->ptr.p_double[b1+ti-1]; } work4.ptr.p_double[blksiz] = d->ptr.p_double[b1+blksiz-1]; /* * Compute LU factors with partial pivoting ( PT = LU ) */ tol = (double)(0); evd_tdininternaldlagtf(blksiz, &work4, xj, &work2, &work3, tol, &work5, &iwork, &iinfo, _state); /* * Update iteration count. */ do { its = its+1; if( its>maxits ) { /* * If stopping criterion was not satisfied, update info and * store eigenvector number in array ifail. */ *info = *info+1; ifail->ptr.p_int[*info] = j; break; } /* * Normalize and scale the righthand side vector Pb. 
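 *
 * (Editorial note, not in the original source: inverse iteration repeatedly
 * solves (T - lambda*I)*x_new = x_old with the nearly singular factors
 * computed above; the right-hand side is rescaled on every pass so that the
 * solve stays within floating-point range.)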
*/ v = (double)(0); for(ti=1; ti<=blksiz; ti++) { v = v+ae_fabs(work1.ptr.p_double[ti], _state); } scl = blksiz*onenrm*ae_maxreal(eps, ae_fabs(work4.ptr.p_double[blksiz], _state), _state)/v; ae_v_muld(&work1.ptr.p_double[1], 1, ae_v_len(1,blksiz), scl); /* * Solve the system LU = Pb. */ evd_tdininternaldlagts(blksiz, &work4, &work2, &work3, &work5, &iwork, &work1, &tol, &iinfo, _state); /* * Reorthogonalize by modified Gram-Schmidt if eigenvalues are * close enough. */ if( jblk!=1 ) { if( ae_fp_greater(ae_fabs(xj-xjm, _state),ortol) ) { gpind = j; } if( gpind!=j ) { for(i=gpind; i<=j-1; i++) { i1 = b1; i2 = b1+blksiz-1; ztr = ae_v_dotproduct(&work1.ptr.p_double[1], 1, &z->ptr.pp_double[i1][i], z->stride, ae_v_len(1,blksiz)); ae_v_subd(&work1.ptr.p_double[1], 1, &z->ptr.pp_double[i1][i], z->stride, ae_v_len(1,blksiz), ztr); touchint(&i2, _state); } } } /* * Check the infinity norm of the iterate. */ jmax = vectoridxabsmax(&work1, 1, blksiz, _state); nrm = ae_fabs(work1.ptr.p_double[jmax], _state); /* * Continue for additional iterations after norm reaches * stopping criterion. */ tmpcriterion = ae_false; if( ae_fp_less(nrm,dtpcrt) ) { tmpcriterion = ae_true; } else { nrmchk = nrmchk+1; if( nrmchkptr.pp_double[i][j] = (double)(0); } for(i=1; i<=blksiz; i++) { z->ptr.pp_double[b1+i-1][j] = work1.ptr.p_double[i]; } /* * Save the shift to check eigenvalue spacing at next * iteration. */ xjm = xj; } } ae_frame_leave(_state); } static void evd_tdininternaldlagtf(ae_int_t n, /* Real */ ae_vector* a, double lambdav, /* Real */ ae_vector* b, /* Real */ ae_vector* c, double tol, /* Real */ ae_vector* d, /* Integer */ ae_vector* iin, ae_int_t* info, ae_state *_state) { ae_int_t k; double eps; double mult; double piv1; double piv2; double scale1; double scale2; double temp; double tl; *info = 0; *info = 0; if( n<0 ) { *info = -1; return; } if( n==0 ) { return; } a->ptr.p_double[1] = a->ptr.p_double[1]-lambdav; iin->ptr.p_int[n] = 0; if( n==1 ) { if( ae_fp_eq(a->ptr.p_double[1],(double)(0)) ) { iin->ptr.p_int[1] = 1; } return; } eps = ae_machineepsilon; tl = ae_maxreal(tol, eps, _state); scale1 = ae_fabs(a->ptr.p_double[1], _state)+ae_fabs(b->ptr.p_double[1], _state); for(k=1; k<=n-1; k++) { a->ptr.p_double[k+1] = a->ptr.p_double[k+1]-lambdav; scale2 = ae_fabs(c->ptr.p_double[k], _state)+ae_fabs(a->ptr.p_double[k+1], _state); if( kptr.p_double[k+1], _state); } if( ae_fp_eq(a->ptr.p_double[k],(double)(0)) ) { piv1 = (double)(0); } else { piv1 = ae_fabs(a->ptr.p_double[k], _state)/scale1; } if( ae_fp_eq(c->ptr.p_double[k],(double)(0)) ) { iin->ptr.p_int[k] = 0; piv2 = (double)(0); scale1 = scale2; if( kptr.p_double[k] = (double)(0); } } else { piv2 = ae_fabs(c->ptr.p_double[k], _state)/scale2; if( ae_fp_less_eq(piv2,piv1) ) { iin->ptr.p_int[k] = 0; scale1 = scale2; c->ptr.p_double[k] = c->ptr.p_double[k]/a->ptr.p_double[k]; a->ptr.p_double[k+1] = a->ptr.p_double[k+1]-c->ptr.p_double[k]*b->ptr.p_double[k]; if( kptr.p_double[k] = (double)(0); } } else { iin->ptr.p_int[k] = 1; mult = a->ptr.p_double[k]/c->ptr.p_double[k]; a->ptr.p_double[k] = c->ptr.p_double[k]; temp = a->ptr.p_double[k+1]; a->ptr.p_double[k+1] = b->ptr.p_double[k]-mult*temp; if( kptr.p_double[k] = b->ptr.p_double[k+1]; b->ptr.p_double[k+1] = -mult*d->ptr.p_double[k]; } b->ptr.p_double[k] = temp; c->ptr.p_double[k] = mult; } } if( ae_fp_less_eq(ae_maxreal(piv1, piv2, _state),tl)&&iin->ptr.p_int[n]==0 ) { iin->ptr.p_int[n] = k; } } if( ae_fp_less_eq(ae_fabs(a->ptr.p_double[n], _state),scale1*tl)&&iin->ptr.p_int[n]==0 ) { iin->ptr.p_int[n] = 
n; } } static void evd_tdininternaldlagts(ae_int_t n, /* Real */ ae_vector* a, /* Real */ ae_vector* b, /* Real */ ae_vector* c, /* Real */ ae_vector* d, /* Integer */ ae_vector* iin, /* Real */ ae_vector* y, double* tol, ae_int_t* info, ae_state *_state) { ae_int_t k; double absak; double ak; double bignum; double eps; double pert; double sfmin; double temp; *info = 0; *info = 0; if( n<0 ) { *info = -1; return; } if( n==0 ) { return; } eps = ae_machineepsilon; sfmin = ae_minrealnumber; bignum = 1/sfmin; if( ae_fp_less_eq(*tol,(double)(0)) ) { *tol = ae_fabs(a->ptr.p_double[1], _state); if( n>1 ) { *tol = ae_maxreal(*tol, ae_maxreal(ae_fabs(a->ptr.p_double[2], _state), ae_fabs(b->ptr.p_double[1], _state), _state), _state); } for(k=3; k<=n; k++) { *tol = ae_maxreal(*tol, ae_maxreal(ae_fabs(a->ptr.p_double[k], _state), ae_maxreal(ae_fabs(b->ptr.p_double[k-1], _state), ae_fabs(d->ptr.p_double[k-2], _state), _state), _state), _state); } *tol = *tol*eps; if( ae_fp_eq(*tol,(double)(0)) ) { *tol = eps; } } for(k=2; k<=n; k++) { if( iin->ptr.p_int[k-1]==0 ) { y->ptr.p_double[k] = y->ptr.p_double[k]-c->ptr.p_double[k-1]*y->ptr.p_double[k-1]; } else { temp = y->ptr.p_double[k-1]; y->ptr.p_double[k-1] = y->ptr.p_double[k]; y->ptr.p_double[k] = temp-c->ptr.p_double[k-1]*y->ptr.p_double[k]; } } for(k=n; k>=1; k--) { if( k<=n-2 ) { temp = y->ptr.p_double[k]-b->ptr.p_double[k]*y->ptr.p_double[k+1]-d->ptr.p_double[k]*y->ptr.p_double[k+2]; } else { if( k==n-1 ) { temp = y->ptr.p_double[k]-b->ptr.p_double[k]*y->ptr.p_double[k+1]; } else { temp = y->ptr.p_double[k]; } } ak = a->ptr.p_double[k]; pert = ae_fabs(*tol, _state); if( ae_fp_less(ak,(double)(0)) ) { pert = -pert; } for(;;) { absak = ae_fabs(ak, _state); if( ae_fp_less(absak,(double)(1)) ) { if( ae_fp_less(absak,sfmin) ) { if( ae_fp_eq(absak,(double)(0))||ae_fp_greater(ae_fabs(temp, _state)*sfmin,absak) ) { ak = ak+pert; pert = 2*pert; continue; } else { temp = temp*bignum; ak = ak*bignum; } } else { if( ae_fp_greater(ae_fabs(temp, _state),absak*bignum) ) { ak = ak+pert; pert = 2*pert; continue; } } } break; } y->ptr.p_double[k] = temp/ak; } } static void evd_internaldlaebz(ae_int_t ijob, ae_int_t nitmax, ae_int_t n, ae_int_t mmax, ae_int_t minp, double abstol, double reltol, double pivmin, /* Real */ ae_vector* d, /* Real */ ae_vector* e, /* Real */ ae_vector* e2, /* Integer */ ae_vector* nval, /* Real */ ae_matrix* ab, /* Real */ ae_vector* c, ae_int_t* mout, /* Integer */ ae_matrix* nab, /* Real */ ae_vector* work, /* Integer */ ae_vector* iwork, ae_int_t* info, ae_state *_state) { ae_int_t itmp1; ae_int_t itmp2; ae_int_t j; ae_int_t ji; ae_int_t jit; ae_int_t jp; ae_int_t kf; ae_int_t kfnew; ae_int_t kl; ae_int_t klnew; double tmp1; double tmp2; *mout = 0; *info = 0; *info = 0; if( ijob<1||ijob>3 ) { *info = -1; return; } /* * Initialize NAB */ if( ijob==1 ) { /* * Compute the number of eigenvalues in the initial intervals. 
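 *
 * (Editorial note, not in the original source: the count is a Sturm sequence
 * count. For a trial point w, the recurrence t = D[j] - E2[j-1]/t - w produces
 * the pivots of the LDL^T factorization of T - w*I, and the number of
 * nonpositive pivots equals the number of eigenvalues not exceeding w;
 * PIVMIN guards against division by a negligible pivot.)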
*/ *mout = 0; /* *DIR$ NOVECTOR */ for(ji=1; ji<=minp; ji++) { for(jp=1; jp<=2; jp++) { tmp1 = d->ptr.p_double[1]-ab->ptr.pp_double[ji][jp]; if( ae_fp_less(ae_fabs(tmp1, _state),pivmin) ) { tmp1 = -pivmin; } nab->ptr.pp_int[ji][jp] = 0; if( ae_fp_less_eq(tmp1,(double)(0)) ) { nab->ptr.pp_int[ji][jp] = 1; } for(j=2; j<=n; j++) { tmp1 = d->ptr.p_double[j]-e2->ptr.p_double[j-1]/tmp1-ab->ptr.pp_double[ji][jp]; if( ae_fp_less(ae_fabs(tmp1, _state),pivmin) ) { tmp1 = -pivmin; } if( ae_fp_less_eq(tmp1,(double)(0)) ) { nab->ptr.pp_int[ji][jp] = nab->ptr.pp_int[ji][jp]+1; } } } *mout = *mout+nab->ptr.pp_int[ji][2]-nab->ptr.pp_int[ji][1]; } return; } /* * Initialize for loop * * KF and KL have the following meaning: * Intervals 1,...,KF-1 have converged. * Intervals KF,...,KL still need to be refined. */ kf = 1; kl = minp; /* * If IJOB=2, initialize C. * If IJOB=3, use the user-supplied starting point. */ if( ijob==2 ) { for(ji=1; ji<=minp; ji++) { c->ptr.p_double[ji] = 0.5*(ab->ptr.pp_double[ji][1]+ab->ptr.pp_double[ji][2]); } } /* * Iteration loop */ for(jit=1; jit<=nitmax; jit++) { /* * Loop over intervals * * * Serial Version of the loop */ klnew = kl; for(ji=kf; ji<=kl; ji++) { /* * Compute N(w), the number of eigenvalues less than w */ tmp1 = c->ptr.p_double[ji]; tmp2 = d->ptr.p_double[1]-tmp1; itmp1 = 0; if( ae_fp_less_eq(tmp2,pivmin) ) { itmp1 = 1; tmp2 = ae_minreal(tmp2, -pivmin, _state); } /* * A series of compiler directives to defeat vectorization * for the next loop * **$PL$ CMCHAR=' ' *CDIR$ NEXTSCALAR *C$DIR SCALAR *CDIR$ NEXT SCALAR *CVD$L NOVECTOR *CDEC$ NOVECTOR *CVD$ NOVECTOR **VDIR NOVECTOR **VOCL LOOP,SCALAR *CIBM PREFER SCALAR **$PL$ CMCHAR='*' */ for(j=2; j<=n; j++) { tmp2 = d->ptr.p_double[j]-e2->ptr.p_double[j-1]/tmp2-tmp1; if( ae_fp_less_eq(tmp2,pivmin) ) { itmp1 = itmp1+1; tmp2 = ae_minreal(tmp2, -pivmin, _state); } } if( ijob<=2 ) { /* * IJOB=2: Choose all intervals containing eigenvalues. * * Insure that N(w) is monotone */ itmp1 = ae_minint(nab->ptr.pp_int[ji][2], ae_maxint(nab->ptr.pp_int[ji][1], itmp1, _state), _state); /* * Update the Queue -- add intervals if both halves * contain eigenvalues. */ if( itmp1==nab->ptr.pp_int[ji][2] ) { /* * No eigenvalue in the upper interval: * just use the lower interval. */ ab->ptr.pp_double[ji][2] = tmp1; } else { if( itmp1==nab->ptr.pp_int[ji][1] ) { /* * No eigenvalue in the lower interval: * just use the upper interval. */ ab->ptr.pp_double[ji][1] = tmp1; } else { if( klnewptr.pp_double[klnew][2] = ab->ptr.pp_double[ji][2]; nab->ptr.pp_int[klnew][2] = nab->ptr.pp_int[ji][2]; ab->ptr.pp_double[klnew][1] = tmp1; nab->ptr.pp_int[klnew][1] = itmp1; ab->ptr.pp_double[ji][2] = tmp1; nab->ptr.pp_int[ji][2] = itmp1; } else { *info = mmax+1; return; } } } } else { /* * IJOB=3: Binary search. Keep only the interval * containing w s.t. 
N(w) = NVAL */ if( itmp1<=nval->ptr.p_int[ji] ) { ab->ptr.pp_double[ji][1] = tmp1; nab->ptr.pp_int[ji][1] = itmp1; } if( itmp1>=nval->ptr.p_int[ji] ) { ab->ptr.pp_double[ji][2] = tmp1; nab->ptr.pp_int[ji][2] = itmp1; } } } kl = klnew; /* * Check for convergence */ kfnew = kf; for(ji=kf; ji<=kl; ji++) { tmp1 = ae_fabs(ab->ptr.pp_double[ji][2]-ab->ptr.pp_double[ji][1], _state); tmp2 = ae_maxreal(ae_fabs(ab->ptr.pp_double[ji][2], _state), ae_fabs(ab->ptr.pp_double[ji][1], _state), _state); if( ae_fp_less(tmp1,ae_maxreal(abstol, ae_maxreal(pivmin, reltol*tmp2, _state), _state))||nab->ptr.pp_int[ji][1]>=nab->ptr.pp_int[ji][2] ) { /* * Converged -- Swap with position KFNEW, * then increment KFNEW */ if( ji>kfnew ) { tmp1 = ab->ptr.pp_double[ji][1]; tmp2 = ab->ptr.pp_double[ji][2]; itmp1 = nab->ptr.pp_int[ji][1]; itmp2 = nab->ptr.pp_int[ji][2]; ab->ptr.pp_double[ji][1] = ab->ptr.pp_double[kfnew][1]; ab->ptr.pp_double[ji][2] = ab->ptr.pp_double[kfnew][2]; nab->ptr.pp_int[ji][1] = nab->ptr.pp_int[kfnew][1]; nab->ptr.pp_int[ji][2] = nab->ptr.pp_int[kfnew][2]; ab->ptr.pp_double[kfnew][1] = tmp1; ab->ptr.pp_double[kfnew][2] = tmp2; nab->ptr.pp_int[kfnew][1] = itmp1; nab->ptr.pp_int[kfnew][2] = itmp2; if( ijob==3 ) { itmp1 = nval->ptr.p_int[ji]; nval->ptr.p_int[ji] = nval->ptr.p_int[kfnew]; nval->ptr.p_int[kfnew] = itmp1; } } kfnew = kfnew+1; } } kf = kfnew; /* * Choose Midpoints */ for(ji=kf; ji<=kl; ji++) { c->ptr.p_double[ji] = 0.5*(ab->ptr.pp_double[ji][1]+ab->ptr.pp_double[ji][2]); } /* * If no more intervals to refine, quit. */ if( kf>kl ) { break; } } /* * Converged */ *info = ae_maxint(kl+1-kf, 0, _state); *mout = kl; } /************************************************************************* Internal subroutine -- LAPACK routine (version 3.0) -- Univ. of Tennessee, Univ. 
of California Berkeley, NAG Ltd., Courant Institute, Argonne National Lab, and Rice University June 30, 1999 *************************************************************************/ static void evd_rmatrixinternaltrevc(/* Real */ ae_matrix* t, ae_int_t n, ae_int_t side, ae_int_t howmny, /* Boolean */ ae_vector* vselect, /* Real */ ae_matrix* vl, /* Real */ ae_matrix* vr, ae_int_t* m, ae_int_t* info, ae_state *_state) { ae_frame _frame_block; ae_vector _vselect; ae_int_t i; ae_int_t j; ae_matrix t1; ae_matrix vl1; ae_matrix vr1; ae_vector vselect1; ae_frame_make(_state, &_frame_block); ae_vector_init_copy(&_vselect, vselect, _state); vselect = &_vselect; *m = 0; *info = 0; ae_matrix_init(&t1, 0, 0, DT_REAL, _state); ae_matrix_init(&vl1, 0, 0, DT_REAL, _state); ae_matrix_init(&vr1, 0, 0, DT_REAL, _state); ae_vector_init(&vselect1, 0, DT_BOOL, _state); /* * Allocate VL/VR, if needed */ if( howmny==2||howmny==3 ) { if( side==1||side==3 ) { rmatrixsetlengthatleast(vr, n, n, _state); } if( side==2||side==3 ) { rmatrixsetlengthatleast(vl, n, n, _state); } } /* * Try to use MKL kernel */ if( rmatrixinternaltrevcmkl(t, n, side, howmny, vl, vr, m, info, _state) ) { ae_frame_leave(_state); return; } /* * ALGLIB version */ ae_matrix_set_length(&t1, n+1, n+1, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { t1.ptr.pp_double[i+1][j+1] = t->ptr.pp_double[i][j]; } } if( howmny==3 ) { ae_vector_set_length(&vselect1, n+1, _state); for(i=0; i<=n-1; i++) { vselect1.ptr.p_bool[1+i] = vselect->ptr.p_bool[i]; } } if( (side==2||side==3)&&howmny==1 ) { ae_matrix_set_length(&vl1, n+1, n+1, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { vl1.ptr.pp_double[i+1][j+1] = vl->ptr.pp_double[i][j]; } } } if( (side==1||side==3)&&howmny==1 ) { ae_matrix_set_length(&vr1, n+1, n+1, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { vr1.ptr.pp_double[i+1][j+1] = vr->ptr.pp_double[i][j]; } } } evd_internaltrevc(&t1, n, side, howmny, &vselect1, &vl1, &vr1, m, info, _state); if( side!=1 ) { rmatrixsetlengthatleast(vl, n, n, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { vl->ptr.pp_double[i][j] = vl1.ptr.pp_double[i+1][j+1]; } } } if( side!=2 ) { rmatrixsetlengthatleast(vr, n, n, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { vr->ptr.pp_double[i][j] = vr1.ptr.pp_double[i+1][j+1]; } } } ae_frame_leave(_state); } /************************************************************************* Internal subroutine -- LAPACK routine (version 3.0) -- Univ. of Tennessee, Univ. 
of California Berkeley, NAG Ltd., Courant Institute, Argonne National Lab, and Rice University June 30, 1999 *************************************************************************/ static void evd_internaltrevc(/* Real */ ae_matrix* t, ae_int_t n, ae_int_t side, ae_int_t howmny, /* Boolean */ ae_vector* vselect, /* Real */ ae_matrix* vl, /* Real */ ae_matrix* vr, ae_int_t* m, ae_int_t* info, ae_state *_state) { ae_frame _frame_block; ae_vector _vselect; ae_bool allv; ae_bool bothv; ae_bool leftv; ae_bool over; ae_bool pair; ae_bool rightv; ae_bool somev; ae_int_t i; ae_int_t ierr; ae_int_t ii; ae_int_t ip; ae_int_t iis; ae_int_t j; ae_int_t j1; ae_int_t j2; ae_int_t jnxt; ae_int_t k; ae_int_t ki; ae_int_t n2; double beta; double bignum; double emax; double rec; double remax; double scl; double smin; double smlnum; double ulp; double unfl; double vcrit; double vmax; double wi; double wr; double xnorm; ae_matrix x; ae_vector work; ae_vector temp; ae_matrix temp11; ae_matrix temp22; ae_matrix temp11b; ae_matrix temp21b; ae_matrix temp12b; ae_matrix temp22b; ae_bool skipflag; ae_int_t k1; ae_int_t k2; ae_int_t k3; ae_int_t k4; double vt; ae_vector rswap4; ae_vector zswap4; ae_matrix ipivot44; ae_vector civ4; ae_vector crv4; ae_frame_make(_state, &_frame_block); ae_vector_init_copy(&_vselect, vselect, _state); vselect = &_vselect; *m = 0; *info = 0; ae_matrix_init(&x, 0, 0, DT_REAL, _state); ae_vector_init(&work, 0, DT_REAL, _state); ae_vector_init(&temp, 0, DT_REAL, _state); ae_matrix_init(&temp11, 0, 0, DT_REAL, _state); ae_matrix_init(&temp22, 0, 0, DT_REAL, _state); ae_matrix_init(&temp11b, 0, 0, DT_REAL, _state); ae_matrix_init(&temp21b, 0, 0, DT_REAL, _state); ae_matrix_init(&temp12b, 0, 0, DT_REAL, _state); ae_matrix_init(&temp22b, 0, 0, DT_REAL, _state); ae_vector_init(&rswap4, 0, DT_BOOL, _state); ae_vector_init(&zswap4, 0, DT_BOOL, _state); ae_matrix_init(&ipivot44, 0, 0, DT_INT, _state); ae_vector_init(&civ4, 0, DT_REAL, _state); ae_vector_init(&crv4, 0, DT_REAL, _state); ae_matrix_set_length(&x, 2+1, 2+1, _state); ae_matrix_set_length(&temp11, 1+1, 1+1, _state); ae_matrix_set_length(&temp11b, 1+1, 1+1, _state); ae_matrix_set_length(&temp21b, 2+1, 1+1, _state); ae_matrix_set_length(&temp12b, 1+1, 2+1, _state); ae_matrix_set_length(&temp22b, 2+1, 2+1, _state); ae_matrix_set_length(&temp22, 2+1, 2+1, _state); ae_vector_set_length(&work, 3*n+1, _state); ae_vector_set_length(&temp, n+1, _state); ae_vector_set_length(&rswap4, 4+1, _state); ae_vector_set_length(&zswap4, 4+1, _state); ae_matrix_set_length(&ipivot44, 4+1, 4+1, _state); ae_vector_set_length(&civ4, 4+1, _state); ae_vector_set_length(&crv4, 4+1, _state); if( howmny!=1 ) { if( side==1||side==3 ) { ae_matrix_set_length(vr, n+1, n+1, _state); } if( side==2||side==3 ) { ae_matrix_set_length(vl, n+1, n+1, _state); } } /* * Decode and test the input parameters */ bothv = side==3; rightv = side==1||bothv; leftv = side==2||bothv; allv = howmny==2; over = howmny==1; somev = howmny==3; *info = 0; if( n<0 ) { *info = -2; ae_frame_leave(_state); return; } if( !rightv&&!leftv ) { *info = -3; ae_frame_leave(_state); return; } if( (!allv&&!over)&&!somev ) { *info = -4; ae_frame_leave(_state); return; } /* * Set M to the number of columns required to store the selected * eigenvectors, standardize the array SELECT if necessary, and * test MM. 
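* (If a 2-by-2 diagonal block of T, i.e. a complex conjugate pair of eigenvalues, is encountered, selecting either member of the pair selects both: SELECT is standardized so that only the first member is marked, and two columns are counted for the pair.)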
*/ if( somev ) { *m = 0; pair = ae_false; for(j=1; j<=n; j++) { if( pair ) { pair = ae_false; vselect->ptr.p_bool[j] = ae_false; } else { if( jptr.pp_double[j+1][j],(double)(0)) ) { if( vselect->ptr.p_bool[j] ) { *m = *m+1; } } else { pair = ae_true; if( vselect->ptr.p_bool[j]||vselect->ptr.p_bool[j+1] ) { vselect->ptr.p_bool[j] = ae_true; *m = *m+2; } } } else { if( vselect->ptr.p_bool[n] ) { *m = *m+1; } } } } } else { *m = n; } /* * Quick return if possible. */ if( n==0 ) { ae_frame_leave(_state); return; } /* * Set the constants to control overflow. */ unfl = ae_minrealnumber; ulp = ae_machineepsilon; smlnum = unfl*(n/ulp); bignum = (1-ulp)/smlnum; /* * Compute 1-norm of each column of strictly upper triangular * part of T to control overflow in triangular solver. */ work.ptr.p_double[1] = (double)(0); for(j=2; j<=n; j++) { work.ptr.p_double[j] = (double)(0); for(i=1; i<=j-1; i++) { work.ptr.p_double[j] = work.ptr.p_double[j]+ae_fabs(t->ptr.pp_double[i][j], _state); } } /* * Index IP is used to specify the real or complex eigenvalue: * IP = 0, real eigenvalue, * 1, first of conjugate complex pair: (wr,wi) * -1, second of conjugate complex pair: (wr,wi) */ n2 = 2*n; if( rightv ) { /* * Compute right eigenvectors. */ ip = 0; iis = *m; for(ki=n; ki>=1; ki--) { skipflag = ae_false; if( ip==1 ) { skipflag = ae_true; } else { if( ki!=1 ) { if( ae_fp_neq(t->ptr.pp_double[ki][ki-1],(double)(0)) ) { ip = -1; } } if( somev ) { if( ip==0 ) { if( !vselect->ptr.p_bool[ki] ) { skipflag = ae_true; } } else { if( !vselect->ptr.p_bool[ki-1] ) { skipflag = ae_true; } } } } if( !skipflag ) { /* * Compute the KI-th eigenvalue (WR,WI). */ wr = t->ptr.pp_double[ki][ki]; wi = (double)(0); if( ip!=0 ) { wi = ae_sqrt(ae_fabs(t->ptr.pp_double[ki][ki-1], _state), _state)*ae_sqrt(ae_fabs(t->ptr.pp_double[ki-1][ki], _state), _state); } smin = ae_maxreal(ulp*(ae_fabs(wr, _state)+ae_fabs(wi, _state)), smlnum, _state); if( ip==0 ) { /* * Real right eigenvector */ work.ptr.p_double[ki+n] = (double)(1); /* * Form right-hand side */ for(k=1; k<=ki-1; k++) { work.ptr.p_double[k+n] = -t->ptr.pp_double[k][ki]; } /* * Solve the upper quasi-triangular system: * (T(1:KI-1,1:KI-1) - WR)*X = SCALE*WORK. */ jnxt = ki-1; for(j=ki-1; j>=1; j--) { if( j>jnxt ) { continue; } j1 = j; j2 = j; jnxt = j-1; if( j>1 ) { if( ae_fp_neq(t->ptr.pp_double[j][j-1],(double)(0)) ) { j1 = j-1; jnxt = j-2; } } if( j1==j2 ) { /* * 1-by-1 diagonal block */ temp11.ptr.pp_double[1][1] = t->ptr.pp_double[j][j]; temp11b.ptr.pp_double[1][1] = work.ptr.p_double[j+n]; evd_internalhsevdlaln2(ae_false, 1, 1, smin, (double)(1), &temp11, 1.0, 1.0, &temp11b, wr, 0.0, &rswap4, &zswap4, &ipivot44, &civ4, &crv4, &x, &scl, &xnorm, &ierr, _state); /* * Scale X(1,1) to avoid overflow when updating * the right-hand side. 
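* (If XNORM exceeds 1 and the 1-norm of column J of T exceeds BIGNUM/XNORM, X and the accumulated scale factor SCL are divided by XNORM so that the right-hand side update below cannot overflow.)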
*/ if( ae_fp_greater(xnorm,(double)(1)) ) { if( ae_fp_greater(work.ptr.p_double[j],bignum/xnorm) ) { x.ptr.pp_double[1][1] = x.ptr.pp_double[1][1]/xnorm; scl = scl/xnorm; } } /* * Scale if necessary */ if( ae_fp_neq(scl,(double)(1)) ) { k1 = n+1; k2 = n+ki; ae_v_muld(&work.ptr.p_double[k1], 1, ae_v_len(k1,k2), scl); } work.ptr.p_double[j+n] = x.ptr.pp_double[1][1]; /* * Update right-hand side */ k1 = 1+n; k2 = j-1+n; k3 = j-1; vt = -x.ptr.pp_double[1][1]; ae_v_addd(&work.ptr.p_double[k1], 1, &t->ptr.pp_double[1][j], t->stride, ae_v_len(k1,k2), vt); } else { /* * 2-by-2 diagonal block */ temp22.ptr.pp_double[1][1] = t->ptr.pp_double[j-1][j-1]; temp22.ptr.pp_double[1][2] = t->ptr.pp_double[j-1][j]; temp22.ptr.pp_double[2][1] = t->ptr.pp_double[j][j-1]; temp22.ptr.pp_double[2][2] = t->ptr.pp_double[j][j]; temp21b.ptr.pp_double[1][1] = work.ptr.p_double[j-1+n]; temp21b.ptr.pp_double[2][1] = work.ptr.p_double[j+n]; evd_internalhsevdlaln2(ae_false, 2, 1, smin, 1.0, &temp22, 1.0, 1.0, &temp21b, wr, (double)(0), &rswap4, &zswap4, &ipivot44, &civ4, &crv4, &x, &scl, &xnorm, &ierr, _state); /* * Scale X(1,1) and X(2,1) to avoid overflow when * updating the right-hand side. */ if( ae_fp_greater(xnorm,(double)(1)) ) { beta = ae_maxreal(work.ptr.p_double[j-1], work.ptr.p_double[j], _state); if( ae_fp_greater(beta,bignum/xnorm) ) { x.ptr.pp_double[1][1] = x.ptr.pp_double[1][1]/xnorm; x.ptr.pp_double[2][1] = x.ptr.pp_double[2][1]/xnorm; scl = scl/xnorm; } } /* * Scale if necessary */ if( ae_fp_neq(scl,(double)(1)) ) { k1 = 1+n; k2 = ki+n; ae_v_muld(&work.ptr.p_double[k1], 1, ae_v_len(k1,k2), scl); } work.ptr.p_double[j-1+n] = x.ptr.pp_double[1][1]; work.ptr.p_double[j+n] = x.ptr.pp_double[2][1]; /* * Update right-hand side */ k1 = 1+n; k2 = j-2+n; k3 = j-2; k4 = j-1; vt = -x.ptr.pp_double[1][1]; ae_v_addd(&work.ptr.p_double[k1], 1, &t->ptr.pp_double[1][k4], t->stride, ae_v_len(k1,k2), vt); vt = -x.ptr.pp_double[2][1]; ae_v_addd(&work.ptr.p_double[k1], 1, &t->ptr.pp_double[1][j], t->stride, ae_v_len(k1,k2), vt); } } /* * Copy the vector x or Q*x to VR and normalize. */ if( !over ) { k1 = 1+n; k2 = ki+n; ae_v_move(&vr->ptr.pp_double[1][iis], vr->stride, &work.ptr.p_double[k1], 1, ae_v_len(1,ki)); ii = columnidxabsmax(vr, 1, ki, iis, _state); remax = 1/ae_fabs(vr->ptr.pp_double[ii][iis], _state); ae_v_muld(&vr->ptr.pp_double[1][iis], vr->stride, ae_v_len(1,ki), remax); for(k=ki+1; k<=n; k++) { vr->ptr.pp_double[k][iis] = (double)(0); } } else { if( ki>1 ) { ae_v_move(&temp.ptr.p_double[1], 1, &vr->ptr.pp_double[1][ki], vr->stride, ae_v_len(1,n)); matrixvectormultiply(vr, 1, n, 1, ki-1, ae_false, &work, 1+n, ki-1+n, 1.0, &temp, 1, n, work.ptr.p_double[ki+n], _state); ae_v_move(&vr->ptr.pp_double[1][ki], vr->stride, &temp.ptr.p_double[1], 1, ae_v_len(1,n)); } ii = columnidxabsmax(vr, 1, n, ki, _state); remax = 1/ae_fabs(vr->ptr.pp_double[ii][ki], _state); ae_v_muld(&vr->ptr.pp_double[1][ki], vr->stride, ae_v_len(1,n), remax); } } else { /* * Complex right eigenvector. * * Initial solve * [ (T(KI-1,KI-1) T(KI-1,KI) ) - (WR + I* WI)]*X = 0. 
* [ (T(KI,KI-1) T(KI,KI) ) ] */ if( ae_fp_greater_eq(ae_fabs(t->ptr.pp_double[ki-1][ki], _state),ae_fabs(t->ptr.pp_double[ki][ki-1], _state)) ) { work.ptr.p_double[ki-1+n] = (double)(1); work.ptr.p_double[ki+n2] = wi/t->ptr.pp_double[ki-1][ki]; } else { work.ptr.p_double[ki-1+n] = -wi/t->ptr.pp_double[ki][ki-1]; work.ptr.p_double[ki+n2] = (double)(1); } work.ptr.p_double[ki+n] = (double)(0); work.ptr.p_double[ki-1+n2] = (double)(0); /* * Form right-hand side */ for(k=1; k<=ki-2; k++) { work.ptr.p_double[k+n] = -work.ptr.p_double[ki-1+n]*t->ptr.pp_double[k][ki-1]; work.ptr.p_double[k+n2] = -work.ptr.p_double[ki+n2]*t->ptr.pp_double[k][ki]; } /* * Solve upper quasi-triangular system: * (T(1:KI-2,1:KI-2) - (WR+i*WI))*X = SCALE*(WORK+i*WORK2) */ jnxt = ki-2; for(j=ki-2; j>=1; j--) { if( j>jnxt ) { continue; } j1 = j; j2 = j; jnxt = j-1; if( j>1 ) { if( ae_fp_neq(t->ptr.pp_double[j][j-1],(double)(0)) ) { j1 = j-1; jnxt = j-2; } } if( j1==j2 ) { /* * 1-by-1 diagonal block */ temp11.ptr.pp_double[1][1] = t->ptr.pp_double[j][j]; temp12b.ptr.pp_double[1][1] = work.ptr.p_double[j+n]; temp12b.ptr.pp_double[1][2] = work.ptr.p_double[j+n+n]; evd_internalhsevdlaln2(ae_false, 1, 2, smin, 1.0, &temp11, 1.0, 1.0, &temp12b, wr, wi, &rswap4, &zswap4, &ipivot44, &civ4, &crv4, &x, &scl, &xnorm, &ierr, _state); /* * Scale X(1,1) and X(1,2) to avoid overflow when * updating the right-hand side. */ if( ae_fp_greater(xnorm,(double)(1)) ) { if( ae_fp_greater(work.ptr.p_double[j],bignum/xnorm) ) { x.ptr.pp_double[1][1] = x.ptr.pp_double[1][1]/xnorm; x.ptr.pp_double[1][2] = x.ptr.pp_double[1][2]/xnorm; scl = scl/xnorm; } } /* * Scale if necessary */ if( ae_fp_neq(scl,(double)(1)) ) { k1 = 1+n; k2 = ki+n; ae_v_muld(&work.ptr.p_double[k1], 1, ae_v_len(k1,k2), scl); k1 = 1+n2; k2 = ki+n2; ae_v_muld(&work.ptr.p_double[k1], 1, ae_v_len(k1,k2), scl); } work.ptr.p_double[j+n] = x.ptr.pp_double[1][1]; work.ptr.p_double[j+n2] = x.ptr.pp_double[1][2]; /* * Update the right-hand side */ k1 = 1+n; k2 = j-1+n; k3 = 1; k4 = j-1; vt = -x.ptr.pp_double[1][1]; ae_v_addd(&work.ptr.p_double[k1], 1, &t->ptr.pp_double[k3][j], t->stride, ae_v_len(k1,k2), vt); k1 = 1+n2; k2 = j-1+n2; k3 = 1; k4 = j-1; vt = -x.ptr.pp_double[1][2]; ae_v_addd(&work.ptr.p_double[k1], 1, &t->ptr.pp_double[k3][j], t->stride, ae_v_len(k1,k2), vt); } else { /* * 2-by-2 diagonal block */ temp22.ptr.pp_double[1][1] = t->ptr.pp_double[j-1][j-1]; temp22.ptr.pp_double[1][2] = t->ptr.pp_double[j-1][j]; temp22.ptr.pp_double[2][1] = t->ptr.pp_double[j][j-1]; temp22.ptr.pp_double[2][2] = t->ptr.pp_double[j][j]; temp22b.ptr.pp_double[1][1] = work.ptr.p_double[j-1+n]; temp22b.ptr.pp_double[1][2] = work.ptr.p_double[j-1+n+n]; temp22b.ptr.pp_double[2][1] = work.ptr.p_double[j+n]; temp22b.ptr.pp_double[2][2] = work.ptr.p_double[j+n+n]; evd_internalhsevdlaln2(ae_false, 2, 2, smin, 1.0, &temp22, 1.0, 1.0, &temp22b, wr, wi, &rswap4, &zswap4, &ipivot44, &civ4, &crv4, &x, &scl, &xnorm, &ierr, _state); /* * Scale X to avoid overflow when updating * the right-hand side. 
*/ if( ae_fp_greater(xnorm,(double)(1)) ) { beta = ae_maxreal(work.ptr.p_double[j-1], work.ptr.p_double[j], _state); if( ae_fp_greater(beta,bignum/xnorm) ) { rec = 1/xnorm; x.ptr.pp_double[1][1] = x.ptr.pp_double[1][1]*rec; x.ptr.pp_double[1][2] = x.ptr.pp_double[1][2]*rec; x.ptr.pp_double[2][1] = x.ptr.pp_double[2][1]*rec; x.ptr.pp_double[2][2] = x.ptr.pp_double[2][2]*rec; scl = scl*rec; } } /* * Scale if necessary */ if( ae_fp_neq(scl,(double)(1)) ) { ae_v_muld(&work.ptr.p_double[1+n], 1, ae_v_len(1+n,ki+n), scl); ae_v_muld(&work.ptr.p_double[1+n2], 1, ae_v_len(1+n2,ki+n2), scl); } work.ptr.p_double[j-1+n] = x.ptr.pp_double[1][1]; work.ptr.p_double[j+n] = x.ptr.pp_double[2][1]; work.ptr.p_double[j-1+n2] = x.ptr.pp_double[1][2]; work.ptr.p_double[j+n2] = x.ptr.pp_double[2][2]; /* * Update the right-hand side */ vt = -x.ptr.pp_double[1][1]; ae_v_addd(&work.ptr.p_double[n+1], 1, &t->ptr.pp_double[1][j-1], t->stride, ae_v_len(n+1,n+j-2), vt); vt = -x.ptr.pp_double[2][1]; ae_v_addd(&work.ptr.p_double[n+1], 1, &t->ptr.pp_double[1][j], t->stride, ae_v_len(n+1,n+j-2), vt); vt = -x.ptr.pp_double[1][2]; ae_v_addd(&work.ptr.p_double[n2+1], 1, &t->ptr.pp_double[1][j-1], t->stride, ae_v_len(n2+1,n2+j-2), vt); vt = -x.ptr.pp_double[2][2]; ae_v_addd(&work.ptr.p_double[n2+1], 1, &t->ptr.pp_double[1][j], t->stride, ae_v_len(n2+1,n2+j-2), vt); } } /* * Copy the vector x or Q*x to VR and normalize. */ if( !over ) { ae_v_move(&vr->ptr.pp_double[1][iis-1], vr->stride, &work.ptr.p_double[n+1], 1, ae_v_len(1,ki)); ae_v_move(&vr->ptr.pp_double[1][iis], vr->stride, &work.ptr.p_double[n2+1], 1, ae_v_len(1,ki)); emax = (double)(0); for(k=1; k<=ki; k++) { emax = ae_maxreal(emax, ae_fabs(vr->ptr.pp_double[k][iis-1], _state)+ae_fabs(vr->ptr.pp_double[k][iis], _state), _state); } remax = 1/emax; ae_v_muld(&vr->ptr.pp_double[1][iis-1], vr->stride, ae_v_len(1,ki), remax); ae_v_muld(&vr->ptr.pp_double[1][iis], vr->stride, ae_v_len(1,ki), remax); for(k=ki+1; k<=n; k++) { vr->ptr.pp_double[k][iis-1] = (double)(0); vr->ptr.pp_double[k][iis] = (double)(0); } } else { if( ki>2 ) { ae_v_move(&temp.ptr.p_double[1], 1, &vr->ptr.pp_double[1][ki-1], vr->stride, ae_v_len(1,n)); matrixvectormultiply(vr, 1, n, 1, ki-2, ae_false, &work, 1+n, ki-2+n, 1.0, &temp, 1, n, work.ptr.p_double[ki-1+n], _state); ae_v_move(&vr->ptr.pp_double[1][ki-1], vr->stride, &temp.ptr.p_double[1], 1, ae_v_len(1,n)); ae_v_move(&temp.ptr.p_double[1], 1, &vr->ptr.pp_double[1][ki], vr->stride, ae_v_len(1,n)); matrixvectormultiply(vr, 1, n, 1, ki-2, ae_false, &work, 1+n2, ki-2+n2, 1.0, &temp, 1, n, work.ptr.p_double[ki+n2], _state); ae_v_move(&vr->ptr.pp_double[1][ki], vr->stride, &temp.ptr.p_double[1], 1, ae_v_len(1,n)); } else { vt = work.ptr.p_double[ki-1+n]; ae_v_muld(&vr->ptr.pp_double[1][ki-1], vr->stride, ae_v_len(1,n), vt); vt = work.ptr.p_double[ki+n2]; ae_v_muld(&vr->ptr.pp_double[1][ki], vr->stride, ae_v_len(1,n), vt); } emax = (double)(0); for(k=1; k<=n; k++) { emax = ae_maxreal(emax, ae_fabs(vr->ptr.pp_double[k][ki-1], _state)+ae_fabs(vr->ptr.pp_double[k][ki], _state), _state); } remax = 1/emax; ae_v_muld(&vr->ptr.pp_double[1][ki-1], vr->stride, ae_v_len(1,n), remax); ae_v_muld(&vr->ptr.pp_double[1][ki], vr->stride, ae_v_len(1,n), remax); } } iis = iis-1; if( ip!=0 ) { iis = iis-1; } } if( ip==1 ) { ip = 0; } if( ip==-1 ) { ip = 1; } } } if( leftv ) { /* * Compute left eigenvectors. 
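* Left eigenvectors y satisfy y'*T = w*y'; they are obtained by forward substitution in the transposed quasi-triangular system (columns KI+1..N of T), in contrast to the backward substitution used for the right eigenvectors above.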
*/ ip = 0; iis = 1; for(ki=1; ki<=n; ki++) { skipflag = ae_false; if( ip==-1 ) { skipflag = ae_true; } else { if( ki!=n ) { if( ae_fp_neq(t->ptr.pp_double[ki+1][ki],(double)(0)) ) { ip = 1; } } if( somev ) { if( !vselect->ptr.p_bool[ki] ) { skipflag = ae_true; } } } if( !skipflag ) { /* * Compute the KI-th eigenvalue (WR,WI). */ wr = t->ptr.pp_double[ki][ki]; wi = (double)(0); if( ip!=0 ) { wi = ae_sqrt(ae_fabs(t->ptr.pp_double[ki][ki+1], _state), _state)*ae_sqrt(ae_fabs(t->ptr.pp_double[ki+1][ki], _state), _state); } smin = ae_maxreal(ulp*(ae_fabs(wr, _state)+ae_fabs(wi, _state)), smlnum, _state); if( ip==0 ) { /* * Real left eigenvector. */ work.ptr.p_double[ki+n] = (double)(1); /* * Form right-hand side */ for(k=ki+1; k<=n; k++) { work.ptr.p_double[k+n] = -t->ptr.pp_double[ki][k]; } /* * Solve the quasi-triangular system: * (T(KI+1:N,KI+1:N) - WR)'*X = SCALE*WORK */ vmax = (double)(1); vcrit = bignum; jnxt = ki+1; for(j=ki+1; j<=n; j++) { if( jptr.pp_double[j+1][j],(double)(0)) ) { j2 = j+1; jnxt = j+2; } } if( j1==j2 ) { /* * 1-by-1 diagonal block * * Scale if necessary to avoid overflow when forming * the right-hand side. */ if( ae_fp_greater(work.ptr.p_double[j],vcrit) ) { rec = 1/vmax; ae_v_muld(&work.ptr.p_double[ki+n], 1, ae_v_len(ki+n,n+n), rec); vmax = (double)(1); vcrit = bignum; } vt = ae_v_dotproduct(&t->ptr.pp_double[ki+1][j], t->stride, &work.ptr.p_double[ki+1+n], 1, ae_v_len(ki+1,j-1)); work.ptr.p_double[j+n] = work.ptr.p_double[j+n]-vt; /* * Solve (T(J,J)-WR)'*X = WORK */ temp11.ptr.pp_double[1][1] = t->ptr.pp_double[j][j]; temp11b.ptr.pp_double[1][1] = work.ptr.p_double[j+n]; evd_internalhsevdlaln2(ae_false, 1, 1, smin, 1.0, &temp11, 1.0, 1.0, &temp11b, wr, (double)(0), &rswap4, &zswap4, &ipivot44, &civ4, &crv4, &x, &scl, &xnorm, &ierr, _state); /* * Scale if necessary */ if( ae_fp_neq(scl,(double)(1)) ) { ae_v_muld(&work.ptr.p_double[ki+n], 1, ae_v_len(ki+n,n+n), scl); } work.ptr.p_double[j+n] = x.ptr.pp_double[1][1]; vmax = ae_maxreal(ae_fabs(work.ptr.p_double[j+n], _state), vmax, _state); vcrit = bignum/vmax; } else { /* * 2-by-2 diagonal block * * Scale if necessary to avoid overflow when forming * the right-hand side. 
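* (The running maximum VMAX of the solution entries and the threshold VCRIT = BIGNUM/VMAX are maintained; whenever a column norm exceeds VCRIT, the work vector is rescaled by 1/VMAX and the threshold is reset.)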
*/ beta = ae_maxreal(work.ptr.p_double[j], work.ptr.p_double[j+1], _state); if( ae_fp_greater(beta,vcrit) ) { rec = 1/vmax; ae_v_muld(&work.ptr.p_double[ki+n], 1, ae_v_len(ki+n,n+n), rec); vmax = (double)(1); vcrit = bignum; } vt = ae_v_dotproduct(&t->ptr.pp_double[ki+1][j], t->stride, &work.ptr.p_double[ki+1+n], 1, ae_v_len(ki+1,j-1)); work.ptr.p_double[j+n] = work.ptr.p_double[j+n]-vt; vt = ae_v_dotproduct(&t->ptr.pp_double[ki+1][j+1], t->stride, &work.ptr.p_double[ki+1+n], 1, ae_v_len(ki+1,j-1)); work.ptr.p_double[j+1+n] = work.ptr.p_double[j+1+n]-vt; /* * Solve * [T(J,J)-WR T(J,J+1) ]'* X = SCALE*( WORK1 ) * [T(J+1,J) T(J+1,J+1)-WR] ( WORK2 ) */ temp22.ptr.pp_double[1][1] = t->ptr.pp_double[j][j]; temp22.ptr.pp_double[1][2] = t->ptr.pp_double[j][j+1]; temp22.ptr.pp_double[2][1] = t->ptr.pp_double[j+1][j]; temp22.ptr.pp_double[2][2] = t->ptr.pp_double[j+1][j+1]; temp21b.ptr.pp_double[1][1] = work.ptr.p_double[j+n]; temp21b.ptr.pp_double[2][1] = work.ptr.p_double[j+1+n]; evd_internalhsevdlaln2(ae_true, 2, 1, smin, 1.0, &temp22, 1.0, 1.0, &temp21b, wr, (double)(0), &rswap4, &zswap4, &ipivot44, &civ4, &crv4, &x, &scl, &xnorm, &ierr, _state); /* * Scale if necessary */ if( ae_fp_neq(scl,(double)(1)) ) { ae_v_muld(&work.ptr.p_double[ki+n], 1, ae_v_len(ki+n,n+n), scl); } work.ptr.p_double[j+n] = x.ptr.pp_double[1][1]; work.ptr.p_double[j+1+n] = x.ptr.pp_double[2][1]; vmax = ae_maxreal(ae_fabs(work.ptr.p_double[j+n], _state), ae_maxreal(ae_fabs(work.ptr.p_double[j+1+n], _state), vmax, _state), _state); vcrit = bignum/vmax; } } /* * Copy the vector x or Q*x to VL and normalize. */ if( !over ) { ae_v_move(&vl->ptr.pp_double[ki][iis], vl->stride, &work.ptr.p_double[ki+n], 1, ae_v_len(ki,n)); ii = columnidxabsmax(vl, ki, n, iis, _state); remax = 1/ae_fabs(vl->ptr.pp_double[ii][iis], _state); ae_v_muld(&vl->ptr.pp_double[ki][iis], vl->stride, ae_v_len(ki,n), remax); for(k=1; k<=ki-1; k++) { vl->ptr.pp_double[k][iis] = (double)(0); } } else { if( kiptr.pp_double[1][ki], vl->stride, ae_v_len(1,n)); matrixvectormultiply(vl, 1, n, ki+1, n, ae_false, &work, ki+1+n, n+n, 1.0, &temp, 1, n, work.ptr.p_double[ki+n], _state); ae_v_move(&vl->ptr.pp_double[1][ki], vl->stride, &temp.ptr.p_double[1], 1, ae_v_len(1,n)); } ii = columnidxabsmax(vl, 1, n, ki, _state); remax = 1/ae_fabs(vl->ptr.pp_double[ii][ki], _state); ae_v_muld(&vl->ptr.pp_double[1][ki], vl->stride, ae_v_len(1,n), remax); } } else { /* * Complex left eigenvector. * * Initial solve: * ((T(KI,KI) T(KI,KI+1) )' - (WR - I* WI))*X = 0. * ((T(KI+1,KI) T(KI+1,KI+1)) ) */ if( ae_fp_greater_eq(ae_fabs(t->ptr.pp_double[ki][ki+1], _state),ae_fabs(t->ptr.pp_double[ki+1][ki], _state)) ) { work.ptr.p_double[ki+n] = wi/t->ptr.pp_double[ki][ki+1]; work.ptr.p_double[ki+1+n2] = (double)(1); } else { work.ptr.p_double[ki+n] = (double)(1); work.ptr.p_double[ki+1+n2] = -wi/t->ptr.pp_double[ki+1][ki]; } work.ptr.p_double[ki+1+n] = (double)(0); work.ptr.p_double[ki+n2] = (double)(0); /* * Form right-hand side */ for(k=ki+2; k<=n; k++) { work.ptr.p_double[k+n] = -work.ptr.p_double[ki+n]*t->ptr.pp_double[ki][k]; work.ptr.p_double[k+n2] = -work.ptr.p_double[ki+1+n2]*t->ptr.pp_double[ki+1][k]; } /* * Solve complex quasi-triangular system: * ( T(KI+2,N:KI+2,N) - (WR-i*WI) )*X = WORK1+i*WORK2 */ vmax = (double)(1); vcrit = bignum; jnxt = ki+2; for(j=ki+2; j<=n; j++) { if( jptr.pp_double[j+1][j],(double)(0)) ) { j2 = j+1; jnxt = j+2; } } if( j1==j2 ) { /* * 1-by-1 diagonal block * * Scale if necessary to avoid overflow when * forming the right-hand side elements. 
*/ if( ae_fp_greater(work.ptr.p_double[j],vcrit) ) { rec = 1/vmax; ae_v_muld(&work.ptr.p_double[ki+n], 1, ae_v_len(ki+n,n+n), rec); ae_v_muld(&work.ptr.p_double[ki+n2], 1, ae_v_len(ki+n2,n+n2), rec); vmax = (double)(1); vcrit = bignum; } vt = ae_v_dotproduct(&t->ptr.pp_double[ki+2][j], t->stride, &work.ptr.p_double[ki+2+n], 1, ae_v_len(ki+2,j-1)); work.ptr.p_double[j+n] = work.ptr.p_double[j+n]-vt; vt = ae_v_dotproduct(&t->ptr.pp_double[ki+2][j], t->stride, &work.ptr.p_double[ki+2+n2], 1, ae_v_len(ki+2,j-1)); work.ptr.p_double[j+n2] = work.ptr.p_double[j+n2]-vt; /* * Solve (T(J,J)-(WR-i*WI))*(X11+i*X12)= WK+I*WK2 */ temp11.ptr.pp_double[1][1] = t->ptr.pp_double[j][j]; temp12b.ptr.pp_double[1][1] = work.ptr.p_double[j+n]; temp12b.ptr.pp_double[1][2] = work.ptr.p_double[j+n+n]; evd_internalhsevdlaln2(ae_false, 1, 2, smin, 1.0, &temp11, 1.0, 1.0, &temp12b, wr, -wi, &rswap4, &zswap4, &ipivot44, &civ4, &crv4, &x, &scl, &xnorm, &ierr, _state); /* * Scale if necessary */ if( ae_fp_neq(scl,(double)(1)) ) { ae_v_muld(&work.ptr.p_double[ki+n], 1, ae_v_len(ki+n,n+n), scl); ae_v_muld(&work.ptr.p_double[ki+n2], 1, ae_v_len(ki+n2,n+n2), scl); } work.ptr.p_double[j+n] = x.ptr.pp_double[1][1]; work.ptr.p_double[j+n2] = x.ptr.pp_double[1][2]; vmax = ae_maxreal(ae_fabs(work.ptr.p_double[j+n], _state), ae_maxreal(ae_fabs(work.ptr.p_double[j+n2], _state), vmax, _state), _state); vcrit = bignum/vmax; } else { /* * 2-by-2 diagonal block * * Scale if necessary to avoid overflow when forming * the right-hand side elements. */ beta = ae_maxreal(work.ptr.p_double[j], work.ptr.p_double[j+1], _state); if( ae_fp_greater(beta,vcrit) ) { rec = 1/vmax; ae_v_muld(&work.ptr.p_double[ki+n], 1, ae_v_len(ki+n,n+n), rec); ae_v_muld(&work.ptr.p_double[ki+n2], 1, ae_v_len(ki+n2,n+n2), rec); vmax = (double)(1); vcrit = bignum; } vt = ae_v_dotproduct(&t->ptr.pp_double[ki+2][j], t->stride, &work.ptr.p_double[ki+2+n], 1, ae_v_len(ki+2,j-1)); work.ptr.p_double[j+n] = work.ptr.p_double[j+n]-vt; vt = ae_v_dotproduct(&t->ptr.pp_double[ki+2][j], t->stride, &work.ptr.p_double[ki+2+n2], 1, ae_v_len(ki+2,j-1)); work.ptr.p_double[j+n2] = work.ptr.p_double[j+n2]-vt; vt = ae_v_dotproduct(&t->ptr.pp_double[ki+2][j+1], t->stride, &work.ptr.p_double[ki+2+n], 1, ae_v_len(ki+2,j-1)); work.ptr.p_double[j+1+n] = work.ptr.p_double[j+1+n]-vt; vt = ae_v_dotproduct(&t->ptr.pp_double[ki+2][j+1], t->stride, &work.ptr.p_double[ki+2+n2], 1, ae_v_len(ki+2,j-1)); work.ptr.p_double[j+1+n2] = work.ptr.p_double[j+1+n2]-vt; /* * Solve 2-by-2 complex linear equation * ([T(j,j) T(j,j+1) ]'-(wr-i*wi)*I)*X = SCALE*B * ([T(j+1,j) T(j+1,j+1)] ) */ temp22.ptr.pp_double[1][1] = t->ptr.pp_double[j][j]; temp22.ptr.pp_double[1][2] = t->ptr.pp_double[j][j+1]; temp22.ptr.pp_double[2][1] = t->ptr.pp_double[j+1][j]; temp22.ptr.pp_double[2][2] = t->ptr.pp_double[j+1][j+1]; temp22b.ptr.pp_double[1][1] = work.ptr.p_double[j+n]; temp22b.ptr.pp_double[1][2] = work.ptr.p_double[j+n+n]; temp22b.ptr.pp_double[2][1] = work.ptr.p_double[j+1+n]; temp22b.ptr.pp_double[2][2] = work.ptr.p_double[j+1+n+n]; evd_internalhsevdlaln2(ae_true, 2, 2, smin, 1.0, &temp22, 1.0, 1.0, &temp22b, wr, -wi, &rswap4, &zswap4, &ipivot44, &civ4, &crv4, &x, &scl, &xnorm, &ierr, _state); /* * Scale if necessary */ if( ae_fp_neq(scl,(double)(1)) ) { ae_v_muld(&work.ptr.p_double[ki+n], 1, ae_v_len(ki+n,n+n), scl); ae_v_muld(&work.ptr.p_double[ki+n2], 1, ae_v_len(ki+n2,n+n2), scl); } work.ptr.p_double[j+n] = x.ptr.pp_double[1][1]; work.ptr.p_double[j+n2] = x.ptr.pp_double[1][2]; work.ptr.p_double[j+1+n] = 
x.ptr.pp_double[2][1]; work.ptr.p_double[j+1+n2] = x.ptr.pp_double[2][2]; vmax = ae_maxreal(ae_fabs(x.ptr.pp_double[1][1], _state), vmax, _state); vmax = ae_maxreal(ae_fabs(x.ptr.pp_double[1][2], _state), vmax, _state); vmax = ae_maxreal(ae_fabs(x.ptr.pp_double[2][1], _state), vmax, _state); vmax = ae_maxreal(ae_fabs(x.ptr.pp_double[2][2], _state), vmax, _state); vcrit = bignum/vmax; } } /* * Copy the vector x or Q*x to VL and normalize. */ if( !over ) { ae_v_move(&vl->ptr.pp_double[ki][iis], vl->stride, &work.ptr.p_double[ki+n], 1, ae_v_len(ki,n)); ae_v_move(&vl->ptr.pp_double[ki][iis+1], vl->stride, &work.ptr.p_double[ki+n2], 1, ae_v_len(ki,n)); emax = (double)(0); for(k=ki; k<=n; k++) { emax = ae_maxreal(emax, ae_fabs(vl->ptr.pp_double[k][iis], _state)+ae_fabs(vl->ptr.pp_double[k][iis+1], _state), _state); } remax = 1/emax; ae_v_muld(&vl->ptr.pp_double[ki][iis], vl->stride, ae_v_len(ki,n), remax); ae_v_muld(&vl->ptr.pp_double[ki][iis+1], vl->stride, ae_v_len(ki,n), remax); for(k=1; k<=ki-1; k++) { vl->ptr.pp_double[k][iis] = (double)(0); vl->ptr.pp_double[k][iis+1] = (double)(0); } } else { if( kiptr.pp_double[1][ki], vl->stride, ae_v_len(1,n)); matrixvectormultiply(vl, 1, n, ki+2, n, ae_false, &work, ki+2+n, n+n, 1.0, &temp, 1, n, work.ptr.p_double[ki+n], _state); ae_v_move(&vl->ptr.pp_double[1][ki], vl->stride, &temp.ptr.p_double[1], 1, ae_v_len(1,n)); ae_v_move(&temp.ptr.p_double[1], 1, &vl->ptr.pp_double[1][ki+1], vl->stride, ae_v_len(1,n)); matrixvectormultiply(vl, 1, n, ki+2, n, ae_false, &work, ki+2+n2, n+n2, 1.0, &temp, 1, n, work.ptr.p_double[ki+1+n2], _state); ae_v_move(&vl->ptr.pp_double[1][ki+1], vl->stride, &temp.ptr.p_double[1], 1, ae_v_len(1,n)); } else { vt = work.ptr.p_double[ki+n]; ae_v_muld(&vl->ptr.pp_double[1][ki], vl->stride, ae_v_len(1,n), vt); vt = work.ptr.p_double[ki+1+n2]; ae_v_muld(&vl->ptr.pp_double[1][ki+1], vl->stride, ae_v_len(1,n), vt); } emax = (double)(0); for(k=1; k<=n; k++) { emax = ae_maxreal(emax, ae_fabs(vl->ptr.pp_double[k][ki], _state)+ae_fabs(vl->ptr.pp_double[k][ki+1], _state), _state); } remax = 1/emax; ae_v_muld(&vl->ptr.pp_double[1][ki], vl->stride, ae_v_len(1,n), remax); ae_v_muld(&vl->ptr.pp_double[1][ki+1], vl->stride, ae_v_len(1,n), remax); } } iis = iis+1; if( ip!=0 ) { iis = iis+1; } } if( ip==-1 ) { ip = 0; } if( ip==1 ) { ip = -1; } } } ae_frame_leave(_state); } /************************************************************************* DLALN2 solves a system of the form (ca A - w D ) X = s B or (ca A' - w D) X = s B with possible scaling ("s") and perturbation of A. (A' means A-transpose.) A is an NA x NA real matrix, ca is a real scalar, D is an NA x NA real diagonal matrix, w is a real or complex value, and X and B are NA x 1 matrices -- real if w is real, complex if w is complex. NA may be 1 or 2. If w is complex, X and B are represented as NA x 2 matrices, the first column of each being the real part and the second being the imaginary part. "s" is a scaling factor (.LE. 1), computed by DLALN2, which is so chosen that X can be computed without overflow. X is further scaled if necessary to assure that norm(ca A - w D)*norm(X) is less than overflow. If both singular values of (ca A - w D) are less than SMIN, SMIN*identity will be used instead of (ca A - w D). If only one singular value is less than SMIN, one element of (ca A - w D) will be perturbed enough to make the smallest singular value roughly SMIN. If both singular values are at least SMIN, (ca A - w D) will not be perturbed. 
In any case, the perturbation will be at most some small multiple of max( SMIN, ulp*norm(ca A - w D) ). The singular values are computed by infinity-norm approximations, and thus will only be correct to a factor of 2 or so. Note: all input quantities are assumed to be smaller than overflow by a reasonable factor. (See BIGNUM.) -- LAPACK auxiliary routine (version 3.0) -- Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., Courant Institute, Argonne National Lab, and Rice University October 31, 1992 *************************************************************************/ static void evd_internalhsevdlaln2(ae_bool ltrans, ae_int_t na, ae_int_t nw, double smin, double ca, /* Real */ ae_matrix* a, double d1, double d2, /* Real */ ae_matrix* b, double wr, double wi, /* Boolean */ ae_vector* rswap4, /* Boolean */ ae_vector* zswap4, /* Integer */ ae_matrix* ipivot44, /* Real */ ae_vector* civ4, /* Real */ ae_vector* crv4, /* Real */ ae_matrix* x, double* scl, double* xnorm, ae_int_t* info, ae_state *_state) { ae_int_t icmax; ae_int_t j; double bbnd; double bi1; double bi2; double bignum; double bnorm; double br1; double br2; double ci21; double ci22; double cmax; double cnorm; double cr21; double cr22; double csi; double csr; double li21; double lr21; double smini; double smlnum; double temp; double u22abs; double ui11; double ui11r; double ui12; double ui12s; double ui22; double ur11; double ur11r; double ur12; double ur12s; double ur22; double xi1; double xi2; double xr1; double xr2; double tmp1; double tmp2; *scl = 0; *xnorm = 0; *info = 0; zswap4->ptr.p_bool[1] = ae_false; zswap4->ptr.p_bool[2] = ae_false; zswap4->ptr.p_bool[3] = ae_true; zswap4->ptr.p_bool[4] = ae_true; rswap4->ptr.p_bool[1] = ae_false; rswap4->ptr.p_bool[2] = ae_true; rswap4->ptr.p_bool[3] = ae_false; rswap4->ptr.p_bool[4] = ae_true; ipivot44->ptr.pp_int[1][1] = 1; ipivot44->ptr.pp_int[2][1] = 2; ipivot44->ptr.pp_int[3][1] = 3; ipivot44->ptr.pp_int[4][1] = 4; ipivot44->ptr.pp_int[1][2] = 2; ipivot44->ptr.pp_int[2][2] = 1; ipivot44->ptr.pp_int[3][2] = 4; ipivot44->ptr.pp_int[4][2] = 3; ipivot44->ptr.pp_int[1][3] = 3; ipivot44->ptr.pp_int[2][3] = 4; ipivot44->ptr.pp_int[3][3] = 1; ipivot44->ptr.pp_int[4][3] = 2; ipivot44->ptr.pp_int[1][4] = 4; ipivot44->ptr.pp_int[2][4] = 3; ipivot44->ptr.pp_int[3][4] = 2; ipivot44->ptr.pp_int[4][4] = 1; smlnum = 2*ae_minrealnumber; bignum = 1/smlnum; smini = ae_maxreal(smin, smlnum, _state); /* * Don't check for input errors */ *info = 0; /* * Standard Initializations */ *scl = (double)(1); if( na==1 ) { /* * 1 x 1 (i.e., scalar) system C X = B */ if( nw==1 ) { /* * Real 1x1 system. 
* * C = ca A - w D */ csr = ca*a->ptr.pp_double[1][1]-wr*d1; cnorm = ae_fabs(csr, _state); /* * If | C | < SMINI, use C = SMINI */ if( ae_fp_less(cnorm,smini) ) { csr = smini; cnorm = smini; *info = 1; } /* * Check scaling for X = B / C */ bnorm = ae_fabs(b->ptr.pp_double[1][1], _state); if( ae_fp_less(cnorm,(double)(1))&&ae_fp_greater(bnorm,(double)(1)) ) { if( ae_fp_greater(bnorm,bignum*cnorm) ) { *scl = 1/bnorm; } } /* * Compute X */ x->ptr.pp_double[1][1] = b->ptr.pp_double[1][1]*(*scl)/csr; *xnorm = ae_fabs(x->ptr.pp_double[1][1], _state); } else { /* * Complex 1x1 system (w is complex) * * C = ca A - w D */ csr = ca*a->ptr.pp_double[1][1]-wr*d1; csi = -wi*d1; cnorm = ae_fabs(csr, _state)+ae_fabs(csi, _state); /* * If | C | < SMINI, use C = SMINI */ if( ae_fp_less(cnorm,smini) ) { csr = smini; csi = (double)(0); cnorm = smini; *info = 1; } /* * Check scaling for X = B / C */ bnorm = ae_fabs(b->ptr.pp_double[1][1], _state)+ae_fabs(b->ptr.pp_double[1][2], _state); if( ae_fp_less(cnorm,(double)(1))&&ae_fp_greater(bnorm,(double)(1)) ) { if( ae_fp_greater(bnorm,bignum*cnorm) ) { *scl = 1/bnorm; } } /* * Compute X */ evd_internalhsevdladiv(*scl*b->ptr.pp_double[1][1], *scl*b->ptr.pp_double[1][2], csr, csi, &tmp1, &tmp2, _state); x->ptr.pp_double[1][1] = tmp1; x->ptr.pp_double[1][2] = tmp2; *xnorm = ae_fabs(x->ptr.pp_double[1][1], _state)+ae_fabs(x->ptr.pp_double[1][2], _state); } } else { /* * 2x2 System * * Compute the real part of C = ca A - w D (or ca A' - w D ) */ crv4->ptr.p_double[1+0] = ca*a->ptr.pp_double[1][1]-wr*d1; crv4->ptr.p_double[2+2] = ca*a->ptr.pp_double[2][2]-wr*d2; if( ltrans ) { crv4->ptr.p_double[1+2] = ca*a->ptr.pp_double[2][1]; crv4->ptr.p_double[2+0] = ca*a->ptr.pp_double[1][2]; } else { crv4->ptr.p_double[2+0] = ca*a->ptr.pp_double[2][1]; crv4->ptr.p_double[1+2] = ca*a->ptr.pp_double[1][2]; } if( nw==1 ) { /* * Real 2x2 system (w is real) * * Find the largest element in C */ cmax = (double)(0); icmax = 0; for(j=1; j<=4; j++) { if( ae_fp_greater(ae_fabs(crv4->ptr.p_double[j], _state),cmax) ) { cmax = ae_fabs(crv4->ptr.p_double[j], _state); icmax = j; } } /* * If norm(C) < SMINI, use SMINI*identity. */ if( ae_fp_less(cmax,smini) ) { bnorm = ae_maxreal(ae_fabs(b->ptr.pp_double[1][1], _state), ae_fabs(b->ptr.pp_double[2][1], _state), _state); if( ae_fp_less(smini,(double)(1))&&ae_fp_greater(bnorm,(double)(1)) ) { if( ae_fp_greater(bnorm,bignum*smini) ) { *scl = 1/bnorm; } } temp = *scl/smini; x->ptr.pp_double[1][1] = temp*b->ptr.pp_double[1][1]; x->ptr.pp_double[2][1] = temp*b->ptr.pp_double[2][1]; *xnorm = temp*bnorm; *info = 1; return; } /* * Gaussian elimination with complete pivoting. 
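* (The 2-by-2 matrix C is stored column by column in CRV4: positions 1..4 hold C(1,1), C(2,1), C(1,2), C(2,2). The entry of largest magnitude is chosen as pivot; IPIVOT44 maps the remaining entries relative to the pivot position, while RSWAP4 and ZSWAP4 record whether the right-hand side rows and the solution rows, respectively, must be swapped back.)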
*/ ur11 = crv4->ptr.p_double[icmax]; cr21 = crv4->ptr.p_double[ipivot44->ptr.pp_int[2][icmax]]; ur12 = crv4->ptr.p_double[ipivot44->ptr.pp_int[3][icmax]]; cr22 = crv4->ptr.p_double[ipivot44->ptr.pp_int[4][icmax]]; ur11r = 1/ur11; lr21 = ur11r*cr21; ur22 = cr22-ur12*lr21; /* * If smaller pivot < SMINI, use SMINI */ if( ae_fp_less(ae_fabs(ur22, _state),smini) ) { ur22 = smini; *info = 1; } if( rswap4->ptr.p_bool[icmax] ) { br1 = b->ptr.pp_double[2][1]; br2 = b->ptr.pp_double[1][1]; } else { br1 = b->ptr.pp_double[1][1]; br2 = b->ptr.pp_double[2][1]; } br2 = br2-lr21*br1; bbnd = ae_maxreal(ae_fabs(br1*(ur22*ur11r), _state), ae_fabs(br2, _state), _state); if( ae_fp_greater(bbnd,(double)(1))&&ae_fp_less(ae_fabs(ur22, _state),(double)(1)) ) { if( ae_fp_greater_eq(bbnd,bignum*ae_fabs(ur22, _state)) ) { *scl = 1/bbnd; } } xr2 = br2*(*scl)/ur22; xr1 = *scl*br1*ur11r-xr2*(ur11r*ur12); if( zswap4->ptr.p_bool[icmax] ) { x->ptr.pp_double[1][1] = xr2; x->ptr.pp_double[2][1] = xr1; } else { x->ptr.pp_double[1][1] = xr1; x->ptr.pp_double[2][1] = xr2; } *xnorm = ae_maxreal(ae_fabs(xr1, _state), ae_fabs(xr2, _state), _state); /* * Further scaling if norm(A) norm(X) > overflow */ if( ae_fp_greater(*xnorm,(double)(1))&&ae_fp_greater(cmax,(double)(1)) ) { if( ae_fp_greater(*xnorm,bignum/cmax) ) { temp = cmax/bignum; x->ptr.pp_double[1][1] = temp*x->ptr.pp_double[1][1]; x->ptr.pp_double[2][1] = temp*x->ptr.pp_double[2][1]; *xnorm = temp*(*xnorm); *scl = temp*(*scl); } } } else { /* * Complex 2x2 system (w is complex) * * Find the largest element in C */ civ4->ptr.p_double[1+0] = -wi*d1; civ4->ptr.p_double[2+0] = (double)(0); civ4->ptr.p_double[1+2] = (double)(0); civ4->ptr.p_double[2+2] = -wi*d2; cmax = (double)(0); icmax = 0; for(j=1; j<=4; j++) { if( ae_fp_greater(ae_fabs(crv4->ptr.p_double[j], _state)+ae_fabs(civ4->ptr.p_double[j], _state),cmax) ) { cmax = ae_fabs(crv4->ptr.p_double[j], _state)+ae_fabs(civ4->ptr.p_double[j], _state); icmax = j; } } /* * If norm(C) < SMINI, use SMINI*identity. */ if( ae_fp_less(cmax,smini) ) { bnorm = ae_maxreal(ae_fabs(b->ptr.pp_double[1][1], _state)+ae_fabs(b->ptr.pp_double[1][2], _state), ae_fabs(b->ptr.pp_double[2][1], _state)+ae_fabs(b->ptr.pp_double[2][2], _state), _state); if( ae_fp_less(smini,(double)(1))&&ae_fp_greater(bnorm,(double)(1)) ) { if( ae_fp_greater(bnorm,bignum*smini) ) { *scl = 1/bnorm; } } temp = *scl/smini; x->ptr.pp_double[1][1] = temp*b->ptr.pp_double[1][1]; x->ptr.pp_double[2][1] = temp*b->ptr.pp_double[2][1]; x->ptr.pp_double[1][2] = temp*b->ptr.pp_double[1][2]; x->ptr.pp_double[2][2] = temp*b->ptr.pp_double[2][2]; *xnorm = temp*bnorm; *info = 1; return; } /* * Gaussian elimination with complete pivoting. 
*/ ur11 = crv4->ptr.p_double[icmax]; ui11 = civ4->ptr.p_double[icmax]; cr21 = crv4->ptr.p_double[ipivot44->ptr.pp_int[2][icmax]]; ci21 = civ4->ptr.p_double[ipivot44->ptr.pp_int[2][icmax]]; ur12 = crv4->ptr.p_double[ipivot44->ptr.pp_int[3][icmax]]; ui12 = civ4->ptr.p_double[ipivot44->ptr.pp_int[3][icmax]]; cr22 = crv4->ptr.p_double[ipivot44->ptr.pp_int[4][icmax]]; ci22 = civ4->ptr.p_double[ipivot44->ptr.pp_int[4][icmax]]; if( icmax==1||icmax==4 ) { /* * Code when off-diagonals of pivoted C are real */ if( ae_fp_greater(ae_fabs(ur11, _state),ae_fabs(ui11, _state)) ) { temp = ui11/ur11; ur11r = 1/(ur11*(1+ae_sqr(temp, _state))); ui11r = -temp*ur11r; } else { temp = ur11/ui11; ui11r = -1/(ui11*(1+ae_sqr(temp, _state))); ur11r = -temp*ui11r; } lr21 = cr21*ur11r; li21 = cr21*ui11r; ur12s = ur12*ur11r; ui12s = ur12*ui11r; ur22 = cr22-ur12*lr21; ui22 = ci22-ur12*li21; } else { /* * Code when diagonals of pivoted C are real */ ur11r = 1/ur11; ui11r = (double)(0); lr21 = cr21*ur11r; li21 = ci21*ur11r; ur12s = ur12*ur11r; ui12s = ui12*ur11r; ur22 = cr22-ur12*lr21+ui12*li21; ui22 = -ur12*li21-ui12*lr21; } u22abs = ae_fabs(ur22, _state)+ae_fabs(ui22, _state); /* * If smaller pivot < SMINI, use SMINI */ if( ae_fp_less(u22abs,smini) ) { ur22 = smini; ui22 = (double)(0); *info = 1; } if( rswap4->ptr.p_bool[icmax] ) { br2 = b->ptr.pp_double[1][1]; br1 = b->ptr.pp_double[2][1]; bi2 = b->ptr.pp_double[1][2]; bi1 = b->ptr.pp_double[2][2]; } else { br1 = b->ptr.pp_double[1][1]; br2 = b->ptr.pp_double[2][1]; bi1 = b->ptr.pp_double[1][2]; bi2 = b->ptr.pp_double[2][2]; } br2 = br2-lr21*br1+li21*bi1; bi2 = bi2-li21*br1-lr21*bi1; bbnd = ae_maxreal((ae_fabs(br1, _state)+ae_fabs(bi1, _state))*(u22abs*(ae_fabs(ur11r, _state)+ae_fabs(ui11r, _state))), ae_fabs(br2, _state)+ae_fabs(bi2, _state), _state); if( ae_fp_greater(bbnd,(double)(1))&&ae_fp_less(u22abs,(double)(1)) ) { if( ae_fp_greater_eq(bbnd,bignum*u22abs) ) { *scl = 1/bbnd; br1 = *scl*br1; bi1 = *scl*bi1; br2 = *scl*br2; bi2 = *scl*bi2; } } evd_internalhsevdladiv(br2, bi2, ur22, ui22, &xr2, &xi2, _state); xr1 = ur11r*br1-ui11r*bi1-ur12s*xr2+ui12s*xi2; xi1 = ui11r*br1+ur11r*bi1-ui12s*xr2-ur12s*xi2; if( zswap4->ptr.p_bool[icmax] ) { x->ptr.pp_double[1][1] = xr2; x->ptr.pp_double[2][1] = xr1; x->ptr.pp_double[1][2] = xi2; x->ptr.pp_double[2][2] = xi1; } else { x->ptr.pp_double[1][1] = xr1; x->ptr.pp_double[2][1] = xr2; x->ptr.pp_double[1][2] = xi1; x->ptr.pp_double[2][2] = xi2; } *xnorm = ae_maxreal(ae_fabs(xr1, _state)+ae_fabs(xi1, _state), ae_fabs(xr2, _state)+ae_fabs(xi2, _state), _state); /* * Further scaling if norm(A) norm(X) > overflow */ if( ae_fp_greater(*xnorm,(double)(1))&&ae_fp_greater(cmax,(double)(1)) ) { if( ae_fp_greater(*xnorm,bignum/cmax) ) { temp = cmax/bignum; x->ptr.pp_double[1][1] = temp*x->ptr.pp_double[1][1]; x->ptr.pp_double[2][1] = temp*x->ptr.pp_double[2][1]; x->ptr.pp_double[1][2] = temp*x->ptr.pp_double[1][2]; x->ptr.pp_double[2][2] = temp*x->ptr.pp_double[2][2]; *xnorm = temp*(*xnorm); *scl = temp*(*scl); } } } } } /************************************************************************* performs complex division in real arithmetic a + i*b p + i*q = --------- c + i*d The algorithm is due to Robert L. Smith and can be found in D. Knuth, The art of Computer Programming, Vol.2, p.195 -- LAPACK auxiliary routine (version 3.0) -- Univ. of Tennessee, Univ. 
of California Berkeley, NAG Ltd., Courant Institute, Argonne National Lab, and Rice University October 31, 1992 *************************************************************************/ static void evd_internalhsevdladiv(double a, double b, double c, double d, double* p, double* q, ae_state *_state) { double e; double f; *p = 0; *q = 0; if( ae_fp_less(ae_fabs(d, _state),ae_fabs(c, _state)) ) { e = d/c; f = c+d*e; *p = (a+b*e)/f; *q = (b-a*e)/f; } else { e = c/d; f = d+c*e; *p = (b+a*e)/f; *q = (-a+b*e)/f; } } void _eigsubspacestate_init(void* _p, ae_state *_state) { eigsubspacestate *p = (eigsubspacestate*)_p; ae_touch_ptr((void*)p); _hqrndstate_init(&p->rs, _state); ae_vector_init(&p->tau, 0, DT_REAL, _state); ae_matrix_init(&p->qcur, 0, 0, DT_REAL, _state); ae_matrix_init(&p->znew, 0, 0, DT_REAL, _state); ae_matrix_init(&p->r, 0, 0, DT_REAL, _state); ae_matrix_init(&p->rz, 0, 0, DT_REAL, _state); ae_matrix_init(&p->tz, 0, 0, DT_REAL, _state); ae_matrix_init(&p->rq, 0, 0, DT_REAL, _state); ae_matrix_init(&p->dummy, 0, 0, DT_REAL, _state); ae_vector_init(&p->rw, 0, DT_REAL, _state); ae_vector_init(&p->tw, 0, DT_REAL, _state); ae_vector_init(&p->wcur, 0, DT_REAL, _state); ae_vector_init(&p->wprev, 0, DT_REAL, _state); ae_vector_init(&p->wrank, 0, DT_REAL, _state); _apbuffers_init(&p->buf, _state); ae_matrix_init(&p->x, 0, 0, DT_REAL, _state); ae_matrix_init(&p->ax, 0, 0, DT_REAL, _state); _rcommstate_init(&p->rstate, _state); } void _eigsubspacestate_init_copy(void* _dst, void* _src, ae_state *_state) { eigsubspacestate *dst = (eigsubspacestate*)_dst; eigsubspacestate *src = (eigsubspacestate*)_src; dst->n = src->n; dst->k = src->k; dst->nwork = src->nwork; dst->maxits = src->maxits; dst->eps = src->eps; dst->eigenvectorsneeded = src->eigenvectorsneeded; dst->matrixtype = src->matrixtype; _hqrndstate_init_copy(&dst->rs, &src->rs, _state); dst->running = src->running; ae_vector_init_copy(&dst->tau, &src->tau, _state); ae_matrix_init_copy(&dst->qcur, &src->qcur, _state); ae_matrix_init_copy(&dst->znew, &src->znew, _state); ae_matrix_init_copy(&dst->r, &src->r, _state); ae_matrix_init_copy(&dst->rz, &src->rz, _state); ae_matrix_init_copy(&dst->tz, &src->tz, _state); ae_matrix_init_copy(&dst->rq, &src->rq, _state); ae_matrix_init_copy(&dst->dummy, &src->dummy, _state); ae_vector_init_copy(&dst->rw, &src->rw, _state); ae_vector_init_copy(&dst->tw, &src->tw, _state); ae_vector_init_copy(&dst->wcur, &src->wcur, _state); ae_vector_init_copy(&dst->wprev, &src->wprev, _state); ae_vector_init_copy(&dst->wrank, &src->wrank, _state); _apbuffers_init_copy(&dst->buf, &src->buf, _state); ae_matrix_init_copy(&dst->x, &src->x, _state); ae_matrix_init_copy(&dst->ax, &src->ax, _state); dst->requesttype = src->requesttype; dst->requestsize = src->requestsize; dst->repiterationscount = src->repiterationscount; _rcommstate_init_copy(&dst->rstate, &src->rstate, _state); } void _eigsubspacestate_clear(void* _p) { eigsubspacestate *p = (eigsubspacestate*)_p; ae_touch_ptr((void*)p); _hqrndstate_clear(&p->rs); ae_vector_clear(&p->tau); ae_matrix_clear(&p->qcur); ae_matrix_clear(&p->znew); ae_matrix_clear(&p->r); ae_matrix_clear(&p->rz); ae_matrix_clear(&p->tz); ae_matrix_clear(&p->rq); ae_matrix_clear(&p->dummy); ae_vector_clear(&p->rw); ae_vector_clear(&p->tw); ae_vector_clear(&p->wcur); ae_vector_clear(&p->wprev); ae_vector_clear(&p->wrank); _apbuffers_clear(&p->buf); ae_matrix_clear(&p->x); ae_matrix_clear(&p->ax); _rcommstate_clear(&p->rstate); } void _eigsubspacestate_destroy(void* _p) { eigsubspacestate *p = 
(eigsubspacestate*)_p; ae_touch_ptr((void*)p); _hqrndstate_destroy(&p->rs); ae_vector_destroy(&p->tau); ae_matrix_destroy(&p->qcur); ae_matrix_destroy(&p->znew); ae_matrix_destroy(&p->r); ae_matrix_destroy(&p->rz); ae_matrix_destroy(&p->tz); ae_matrix_destroy(&p->rq); ae_matrix_destroy(&p->dummy); ae_vector_destroy(&p->rw); ae_vector_destroy(&p->tw); ae_vector_destroy(&p->wcur); ae_vector_destroy(&p->wprev); ae_vector_destroy(&p->wrank); _apbuffers_destroy(&p->buf); ae_matrix_destroy(&p->x); ae_matrix_destroy(&p->ax); _rcommstate_destroy(&p->rstate); } void _eigsubspacereport_init(void* _p, ae_state *_state) { eigsubspacereport *p = (eigsubspacereport*)_p; ae_touch_ptr((void*)p); } void _eigsubspacereport_init_copy(void* _dst, void* _src, ae_state *_state) { eigsubspacereport *dst = (eigsubspacereport*)_dst; eigsubspacereport *src = (eigsubspacereport*)_src; dst->iterationscount = src->iterationscount; } void _eigsubspacereport_clear(void* _p) { eigsubspacereport *p = (eigsubspacereport*)_p; ae_touch_ptr((void*)p); } void _eigsubspacereport_destroy(void* _p) { eigsubspacereport *p = (eigsubspacereport*)_p; ae_touch_ptr((void*)p); } /************************************************************************* Subroutine performing the Schur decomposition of a general matrix by using the QR algorithm with multiple shifts. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes one important improvement of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Multithreaded acceleration is NOT supported for this function. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. The source matrix A is represented as S'*A*S = T, where S is an orthogonal matrix (Schur vectors), T - upper quasi-triangular matrix (with blocks of sizes 1x1 and 2x2 on the main diagonal). Input parameters: A - matrix to be decomposed. Array whose indexes range within [0..N-1, 0..N-1]. N - size of A, N>=0. Output parameters: A - contains matrix T. Array whose indexes range within [0..N-1, 0..N-1]. S - contains Schur vectors. Array whose indexes range within [0..N-1, 0..N-1]. Note 1: The block structure of matrix T can be easily recognized: since all the elements below the blocks are zeros, the elements a[i+1,i] which are equal to 0 show the block border. Note 2: The algorithm performance depends on the value of the internal parameter NS of the InternalSchurDecomposition subroutine which defines the number of shifts in the QR algorithm (similarly to the block width in block-matrix algorithms in linear algebra). If you require maximum performance on your machine, it is recommended to adjust this parameter manually. Result: True, if the algorithm has converged and parameters A and S contain the result. False, if the algorithm has not converged. Algorithm implemented on the basis of the DHSEQR subroutine (LAPACK 3.0 library). 
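The eigenvalues of A can be read off the diagonal of T: each 1x1 block is a real eigenvalue, and each 2x2 block corresponds to a pair of complex conjugate eigenvalues.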
*************************************************************************/ ae_bool rmatrixschur(/* Real */ ae_matrix* a, ae_int_t n, /* Real */ ae_matrix* s, ae_state *_state) { ae_frame _frame_block; ae_vector tau; ae_vector wi; ae_vector wr; ae_int_t info; ae_bool result; ae_frame_make(_state, &_frame_block); ae_matrix_clear(s); ae_vector_init(&tau, 0, DT_REAL, _state); ae_vector_init(&wi, 0, DT_REAL, _state); ae_vector_init(&wr, 0, DT_REAL, _state); /* * Upper Hessenberg form of the 0-based matrix */ rmatrixhessenberg(a, n, &tau, _state); rmatrixhessenbergunpackq(a, n, &tau, s, _state); /* * Schur decomposition */ rmatrixinternalschurdecomposition(a, n, 1, 1, &wr, &wi, s, &info, _state); result = info==0; ae_frame_leave(_state); return result; } /************************************************************************* Algorithm for solving the following generalized symmetric positive-definite eigenproblem: A*x = lambda*B*x (1) or A*B*x = lambda*x (2) or B*A*x = lambda*x (3). where A is a symmetric matrix, B - symmetric positive-definite matrix. The problem is solved by reducing it to an ordinary symmetric eigenvalue problem. Input parameters: A - symmetric matrix which is given by its upper or lower triangular part. Array whose indexes range within [0..N-1, 0..N-1]. N - size of matrices A and B. IsUpperA - storage format of matrix A. B - symmetric positive-definite matrix which is given by its upper or lower triangular part. Array whose indexes range within [0..N-1, 0..N-1]. IsUpperB - storage format of matrix B. ZNeeded - if ZNeeded is equal to: * 0, the eigenvectors are not returned; * 1, the eigenvectors are returned. ProblemType - if ProblemType is equal to: * 1, the following problem is solved: A*x = lambda*B*x; * 2, the following problem is solved: A*B*x = lambda*x; * 3, the following problem is solved: B*A*x = lambda*x. Output parameters: D - eigenvalues in ascending order. Array whose index ranges within [0..N-1]. Z - if ZNeeded is equal to: * 0, Z hasn't changed; * 1, Z contains eigenvectors. Array whose indexes range within [0..N-1, 0..N-1]. The eigenvectors are stored in matrix columns. It should be noted that the eigenvectors in such problems do not form an orthogonal system. Result: True, if the problem was solved successfully. False, if the error occurred during the Cholesky decomposition of matrix B (the matrix isn't positive-definite) or during the work of the iterative algorithm for solving the symmetric eigenproblem. See also the GeneralizedSymmetricDefiniteEVDReduce subroutine. 
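For problem (1) the reduction uses the Cholesky factorization B = L*L': the matrix of the ordinary problem is C = L^(-1)*A*L^(-T), and the eigenvectors of the original problem are recovered as x = L^(-T)*y. For problems (2) and (3) the factorization B = U'*U is used and C = U*A*U'.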
-- ALGLIB -- Copyright 1.28.2006 by Bochkanov Sergey *************************************************************************/ ae_bool smatrixgevd(/* Real */ ae_matrix* a, ae_int_t n, ae_bool isuppera, /* Real */ ae_matrix* b, ae_bool isupperb, ae_int_t zneeded, ae_int_t problemtype, /* Real */ ae_vector* d, /* Real */ ae_matrix* z, ae_state *_state) { ae_frame _frame_block; ae_matrix _a; ae_matrix r; ae_matrix t; ae_bool isupperr; ae_int_t j1; ae_int_t j2; ae_int_t j1inc; ae_int_t j2inc; ae_int_t i; ae_int_t j; double v; ae_bool result; ae_frame_make(_state, &_frame_block); ae_matrix_init_copy(&_a, a, _state); a = &_a; ae_vector_clear(d); ae_matrix_clear(z); ae_matrix_init(&r, 0, 0, DT_REAL, _state); ae_matrix_init(&t, 0, 0, DT_REAL, _state); /* * Reduce and solve */ result = smatrixgevdreduce(a, n, isuppera, b, isupperb, problemtype, &r, &isupperr, _state); if( !result ) { ae_frame_leave(_state); return result; } result = smatrixevd(a, n, zneeded, isuppera, d, &t, _state); if( !result ) { ae_frame_leave(_state); return result; } /* * Transform eigenvectors if needed */ if( zneeded!=0 ) { /* * fill Z with zeros */ ae_matrix_set_length(z, n-1+1, n-1+1, _state); for(j=0; j<=n-1; j++) { z->ptr.pp_double[0][j] = 0.0; } for(i=1; i<=n-1; i++) { ae_v_move(&z->ptr.pp_double[i][0], 1, &z->ptr.pp_double[0][0], 1, ae_v_len(0,n-1)); } /* * Setup R properties */ if( isupperr ) { j1 = 0; j2 = n-1; j1inc = 1; j2inc = 0; } else { j1 = 0; j2 = 0; j1inc = 0; j2inc = 1; } /* * Calculate R*Z */ for(i=0; i<=n-1; i++) { for(j=j1; j<=j2; j++) { v = r.ptr.pp_double[i][j]; ae_v_addd(&z->ptr.pp_double[i][0], 1, &t.ptr.pp_double[j][0], 1, ae_v_len(0,n-1), v); } j1 = j1+j1inc; j2 = j2+j2inc; } } ae_frame_leave(_state); return result; } /************************************************************************* Algorithm for reduction of the following generalized symmetric positive- definite eigenvalue problem: A*x = lambda*B*x (1) or A*B*x = lambda*x (2) or B*A*x = lambda*x (3) to the symmetric eigenvalues problem C*y = lambda*y (eigenvalues of this and the given problems are the same, and the eigenvectors of the given problem could be obtained by multiplying the obtained eigenvectors by the transformation matrix x = R*y). Here A is a symmetric matrix, B - symmetric positive-definite matrix. Input parameters: A - symmetric matrix which is given by its upper or lower triangular part. Array whose indexes range within [0..N-1, 0..N-1]. N - size of matrices A and B. IsUpperA - storage format of matrix A. B - symmetric positive-definite matrix which is given by its upper or lower triangular part. Array whose indexes range within [0..N-1, 0..N-1]. IsUpperB - storage format of matrix B. ProblemType - if ProblemType is equal to: * 1, the following problem is solved: A*x = lambda*B*x; * 2, the following problem is solved: A*B*x = lambda*x; * 3, the following problem is solved: B*A*x = lambda*x. Output parameters: A - symmetric matrix which is given by its upper or lower triangle depending on IsUpperA. Contains matrix C. Array whose indexes range within [0..N-1, 0..N-1]. R - upper triangular or low triangular transformation matrix which is used to obtain the eigenvectors of a given problem as the product of eigenvectors of C (from the right) and matrix R (from the left). If the matrix is upper triangular, the elements below the main diagonal are equal to 0 (and vice versa). Thus, we can perform the multiplication without taking into account the internal structure (which is an easier though less effective way). 
Array whose indexes range within [0..N-1, 0..N-1]. IsUpperR - type of matrix R (upper or lower triangular). Result: True, if the problem was reduced successfully. False, if the error occurred during the Cholesky decomposition of matrix B (the matrix is not positive-definite). -- ALGLIB -- Copyright 1.28.2006 by Bochkanov Sergey *************************************************************************/ ae_bool smatrixgevdreduce(/* Real */ ae_matrix* a, ae_int_t n, ae_bool isuppera, /* Real */ ae_matrix* b, ae_bool isupperb, ae_int_t problemtype, /* Real */ ae_matrix* r, ae_bool* isupperr, ae_state *_state) { ae_frame _frame_block; ae_matrix t; ae_vector w1; ae_vector w2; ae_vector w3; ae_int_t i; ae_int_t j; double v; matinvreport rep; ae_int_t info; ae_bool result; ae_frame_make(_state, &_frame_block); ae_matrix_clear(r); *isupperr = ae_false; ae_matrix_init(&t, 0, 0, DT_REAL, _state); ae_vector_init(&w1, 0, DT_REAL, _state); ae_vector_init(&w2, 0, DT_REAL, _state); ae_vector_init(&w3, 0, DT_REAL, _state); _matinvreport_init(&rep, _state); ae_assert(n>0, "SMatrixGEVDReduce: N<=0!", _state); ae_assert((problemtype==1||problemtype==2)||problemtype==3, "SMatrixGEVDReduce: incorrect ProblemType!", _state); result = ae_true; /* * Problem 1: A*x = lambda*B*x * * Reducing to: * C*y = lambda*y * C = L^(-1) * A * L^(-T) * x = L^(-T) * y */ if( problemtype==1 ) { /* * Factorize B in T: B = LL' */ ae_matrix_set_length(&t, n-1+1, n-1+1, _state); if( isupperb ) { for(i=0; i<=n-1; i++) { ae_v_move(&t.ptr.pp_double[i][i], t.stride, &b->ptr.pp_double[i][i], 1, ae_v_len(i,n-1)); } } else { for(i=0; i<=n-1; i++) { ae_v_move(&t.ptr.pp_double[i][0], 1, &b->ptr.pp_double[i][0], 1, ae_v_len(0,i)); } } if( !spdmatrixcholesky(&t, n, ae_false, _state) ) { result = ae_false; ae_frame_leave(_state); return result; } /* * Invert L in T */ rmatrixtrinverse(&t, n, ae_false, ae_false, &info, &rep, _state); if( info<=0 ) { result = ae_false; ae_frame_leave(_state); return result; } /* * Build L^(-1) * A * L^(-T) in R */ ae_vector_set_length(&w1, n+1, _state); ae_vector_set_length(&w2, n+1, _state); ae_matrix_set_length(r, n-1+1, n-1+1, _state); for(j=1; j<=n; j++) { /* * Form w2 = A * l'(j) (here l'(j) is j-th column of L^(-T)) */ ae_v_move(&w1.ptr.p_double[1], 1, &t.ptr.pp_double[j-1][0], 1, ae_v_len(1,j)); symmetricmatrixvectormultiply(a, isuppera, 0, j-1, &w1, 1.0, &w2, _state); if( isuppera ) { matrixvectormultiply(a, 0, j-1, j, n-1, ae_true, &w1, 1, j, 1.0, &w2, j+1, n, 0.0, _state); } else { matrixvectormultiply(a, j, n-1, 0, j-1, ae_false, &w1, 1, j, 1.0, &w2, j+1, n, 0.0, _state); } /* * Form l(i)*w2 (here l(i) is i-th row of L^(-1)) */ for(i=1; i<=n; i++) { v = ae_v_dotproduct(&t.ptr.pp_double[i-1][0], 1, &w2.ptr.p_double[1], 1, ae_v_len(0,i-1)); r->ptr.pp_double[i-1][j-1] = v; } } /* * Copy R to A */ for(i=0; i<=n-1; i++) { ae_v_move(&a->ptr.pp_double[i][0], 1, &r->ptr.pp_double[i][0], 1, ae_v_len(0,n-1)); } /* * Copy L^(-1) from T to R and transpose */ *isupperr = ae_true; for(i=0; i<=n-1; i++) { for(j=0; j<=i-1; j++) { r->ptr.pp_double[i][j] = (double)(0); } } for(i=0; i<=n-1; i++) { ae_v_move(&r->ptr.pp_double[i][i], 1, &t.ptr.pp_double[i][i], t.stride, ae_v_len(i,n-1)); } ae_frame_leave(_state); return result; } /* * Problem 2: A*B*x = lambda*x * or * problem 3: B*A*x = lambda*x * * Reducing to: * C*y = lambda*y * C = U * A * U' * B = U'* U */ if( problemtype==2||problemtype==3 ) { /* * Factorize B in T: B = U'*U */ ae_matrix_set_length(&t, n-1+1, n-1+1, _state); if( isupperb ) { for(i=0; i<=n-1; i++) { 
ae_v_move(&t.ptr.pp_double[i][i], 1, &b->ptr.pp_double[i][i], 1, ae_v_len(i,n-1)); } } else { for(i=0; i<=n-1; i++) { ae_v_move(&t.ptr.pp_double[i][i], 1, &b->ptr.pp_double[i][i], b->stride, ae_v_len(i,n-1)); } } if( !spdmatrixcholesky(&t, n, ae_true, _state) ) { result = ae_false; ae_frame_leave(_state); return result; } /* * Build U * A * U' in R */ ae_vector_set_length(&w1, n+1, _state); ae_vector_set_length(&w2, n+1, _state); ae_vector_set_length(&w3, n+1, _state); ae_matrix_set_length(r, n-1+1, n-1+1, _state); for(j=1; j<=n; j++) { /* * Form w2 = A * u'(j) (here u'(j) is j-th column of U') */ ae_v_move(&w1.ptr.p_double[1], 1, &t.ptr.pp_double[j-1][j-1], 1, ae_v_len(1,n-j+1)); symmetricmatrixvectormultiply(a, isuppera, j-1, n-1, &w1, 1.0, &w3, _state); ae_v_move(&w2.ptr.p_double[j], 1, &w3.ptr.p_double[1], 1, ae_v_len(j,n)); ae_v_move(&w1.ptr.p_double[j], 1, &t.ptr.pp_double[j-1][j-1], 1, ae_v_len(j,n)); if( isuppera ) { matrixvectormultiply(a, 0, j-2, j-1, n-1, ae_false, &w1, j, n, 1.0, &w2, 1, j-1, 0.0, _state); } else { matrixvectormultiply(a, j-1, n-1, 0, j-2, ae_true, &w1, j, n, 1.0, &w2, 1, j-1, 0.0, _state); } /* * Form u(i)*w2 (here u(i) is i-th row of U) */ for(i=1; i<=n; i++) { v = ae_v_dotproduct(&t.ptr.pp_double[i-1][i-1], 1, &w2.ptr.p_double[i], 1, ae_v_len(i-1,n-1)); r->ptr.pp_double[i-1][j-1] = v; } } /* * Copy R to A */ for(i=0; i<=n-1; i++) { ae_v_move(&a->ptr.pp_double[i][0], 1, &r->ptr.pp_double[i][0], 1, ae_v_len(0,n-1)); } if( problemtype==2 ) { /* * Invert U in T */ rmatrixtrinverse(&t, n, ae_true, ae_false, &info, &rep, _state); if( info<=0 ) { result = ae_false; ae_frame_leave(_state); return result; } /* * Copy U^-1 from T to R */ *isupperr = ae_true; for(i=0; i<=n-1; i++) { for(j=0; j<=i-1; j++) { r->ptr.pp_double[i][j] = (double)(0); } } for(i=0; i<=n-1; i++) { ae_v_move(&r->ptr.pp_double[i][i], 1, &t.ptr.pp_double[i][i], 1, ae_v_len(i,n-1)); } } else { /* * Copy U from T to R and transpose */ *isupperr = ae_false; for(i=0; i<=n-1; i++) { for(j=i+1; j<=n-1; j++) { r->ptr.pp_double[i][j] = (double)(0); } } for(i=0; i<=n-1; i++) { ae_v_move(&r->ptr.pp_double[i][i], r->stride, &t.ptr.pp_double[i][i], 1, ae_v_len(i,n-1)); } } } ae_frame_leave(_state); return result; } /************************************************************************* Inverse matrix update by the Sherman-Morrison formula The algorithm updates matrix A^-1 when adding a number to an element of matrix A. Input parameters: InvA - inverse of matrix A. Array whose indexes range within [0..N-1, 0..N-1]. N - size of matrix A. UpdRow - row where the element to be updated is stored. UpdColumn - column where the element to be updated is stored. UpdVal - a number to be added to the element. Output parameters: InvA - inverse of modified matrix A. 
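The routine applies the rank-1 Sherman-Morrison correction

    InvA_new = InvA - (UpdVal/(1+Lambda)) * (InvA*e_r) * (e_c'*InvA),
    Lambda   = UpdVal*InvA[UpdColumn,UpdRow]

where e_r and e_c are the UpdRow-th and UpdColumn-th unit vectors, so the update costs
O(N^2) operations instead of the O(N^3) needed to re-invert the modified matrix from
scratch. A minimal usage sketch (illustrative variable names; InvA is assumed to
already hold the inverse of A):

    rmatrixinvupdatesimple(&inva, n, updrow, updcolumn, updval, _state);

after which InvA holds the inverse of A with UpdVal added to A[UpdRow,UpdColumn].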
-- ALGLIB -- Copyright 2005 by Bochkanov Sergey *************************************************************************/ void rmatrixinvupdatesimple(/* Real */ ae_matrix* inva, ae_int_t n, ae_int_t updrow, ae_int_t updcolumn, double updval, ae_state *_state) { ae_frame _frame_block; ae_vector t1; ae_vector t2; ae_int_t i; double lambdav; double vt; ae_frame_make(_state, &_frame_block); ae_vector_init(&t1, 0, DT_REAL, _state); ae_vector_init(&t2, 0, DT_REAL, _state); ae_assert(updrow>=0&&updrow=0&&updcolumnptr.pp_double[0][updrow], inva->stride, ae_v_len(0,n-1)); /* * T2 = v*InvA */ ae_v_move(&t2.ptr.p_double[0], 1, &inva->ptr.pp_double[updcolumn][0], 1, ae_v_len(0,n-1)); /* * Lambda = v * InvA * U */ lambdav = updval*inva->ptr.pp_double[updcolumn][updrow]; /* * InvA = InvA - correction */ for(i=0; i<=n-1; i++) { vt = updval*t1.ptr.p_double[i]; vt = vt/(1+lambdav); ae_v_subd(&inva->ptr.pp_double[i][0], 1, &t2.ptr.p_double[0], 1, ae_v_len(0,n-1), vt); } ae_frame_leave(_state); } /************************************************************************* Inverse matrix update by the Sherman-Morrison formula The algorithm updates matrix A^-1 when adding a vector to a row of matrix A. Input parameters: InvA - inverse of matrix A. Array whose indexes range within [0..N-1, 0..N-1]. N - size of matrix A. UpdRow - the row of A whose vector V was added. 0 <= Row <= N-1 V - the vector to be added to a row. Array whose index ranges within [0..N-1]. Output parameters: InvA - inverse of modified matrix A. -- ALGLIB -- Copyright 2005 by Bochkanov Sergey *************************************************************************/ void rmatrixinvupdaterow(/* Real */ ae_matrix* inva, ae_int_t n, ae_int_t updrow, /* Real */ ae_vector* v, ae_state *_state) { ae_frame _frame_block; ae_vector t1; ae_vector t2; ae_int_t i; ae_int_t j; double lambdav; double vt; ae_frame_make(_state, &_frame_block); ae_vector_init(&t1, 0, DT_REAL, _state); ae_vector_init(&t2, 0, DT_REAL, _state); ae_vector_set_length(&t1, n-1+1, _state); ae_vector_set_length(&t2, n-1+1, _state); /* * T1 = InvA * U */ ae_v_move(&t1.ptr.p_double[0], 1, &inva->ptr.pp_double[0][updrow], inva->stride, ae_v_len(0,n-1)); /* * T2 = v*InvA * Lambda = v * InvA * U */ for(j=0; j<=n-1; j++) { vt = ae_v_dotproduct(&v->ptr.p_double[0], 1, &inva->ptr.pp_double[0][j], inva->stride, ae_v_len(0,n-1)); t2.ptr.p_double[j] = vt; } lambdav = t2.ptr.p_double[updrow]; /* * InvA = InvA - correction */ for(i=0; i<=n-1; i++) { vt = t1.ptr.p_double[i]/(1+lambdav); ae_v_subd(&inva->ptr.pp_double[i][0], 1, &t2.ptr.p_double[0], 1, ae_v_len(0,n-1), vt); } ae_frame_leave(_state); } /************************************************************************* Inverse matrix update by the Sherman-Morrison formula The algorithm updates matrix A^-1 when adding a vector to a column of matrix A. Input parameters: InvA - inverse of matrix A. Array whose indexes range within [0..N-1, 0..N-1]. N - size of matrix A. UpdColumn - the column of A whose vector U was added. 0 <= UpdColumn <= N-1 U - the vector to be added to a column. Array whose index ranges within [0..N-1]. Output parameters: InvA - inverse of modified matrix A. 
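Here the modification is A_new = A + U*e_c' (e_c is the UpdColumn-th unit vector), so
the same Sherman-Morrison correction as above applies with Lambda = (InvA*U)[UpdColumn],
again at O(N^2) cost. A minimal usage sketch (illustrative variable names; InvA already
holds the inverse of A, U holds N finite values):

    rmatrixinvupdatecolumn(&inva, n, updcolumn, &u, _state);

after which InvA is the inverse of the matrix whose UpdColumn-th column was incremented
by U.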
-- ALGLIB -- Copyright 2005 by Bochkanov Sergey *************************************************************************/ void rmatrixinvupdatecolumn(/* Real */ ae_matrix* inva, ae_int_t n, ae_int_t updcolumn, /* Real */ ae_vector* u, ae_state *_state) { ae_frame _frame_block; ae_vector t1; ae_vector t2; ae_int_t i; double lambdav; double vt; ae_frame_make(_state, &_frame_block); ae_vector_init(&t1, 0, DT_REAL, _state); ae_vector_init(&t2, 0, DT_REAL, _state); ae_vector_set_length(&t1, n-1+1, _state); ae_vector_set_length(&t2, n-1+1, _state); /* * T1 = InvA * U * Lambda = v * InvA * U */ for(i=0; i<=n-1; i++) { vt = ae_v_dotproduct(&inva->ptr.pp_double[i][0], 1, &u->ptr.p_double[0], 1, ae_v_len(0,n-1)); t1.ptr.p_double[i] = vt; } lambdav = t1.ptr.p_double[updcolumn]; /* * T2 = v*InvA */ ae_v_move(&t2.ptr.p_double[0], 1, &inva->ptr.pp_double[updcolumn][0], 1, ae_v_len(0,n-1)); /* * InvA = InvA - correction */ for(i=0; i<=n-1; i++) { vt = t1.ptr.p_double[i]/(1+lambdav); ae_v_subd(&inva->ptr.pp_double[i][0], 1, &t2.ptr.p_double[0], 1, ae_v_len(0,n-1), vt); } ae_frame_leave(_state); } /************************************************************************* Inverse matrix update by the Sherman-Morrison formula The algorithm computes the inverse of matrix A+u*v' by using the given matrix A^-1 and the vectors u and v. Input parameters: InvA - inverse of matrix A. Array whose indexes range within [0..N-1, 0..N-1]. N - size of matrix A. U - the vector modifying the matrix. Array whose index ranges within [0..N-1]. V - the vector modifying the matrix. Array whose index ranges within [0..N-1]. Output parameters: InvA - inverse of matrix A + u*v'. -- ALGLIB -- Copyright 2005 by Bochkanov Sergey *************************************************************************/ void rmatrixinvupdateuv(/* Real */ ae_matrix* inva, ae_int_t n, /* Real */ ae_vector* u, /* Real */ ae_vector* v, ae_state *_state) { ae_frame _frame_block; ae_vector t1; ae_vector t2; ae_int_t i; ae_int_t j; double lambdav; double vt; ae_frame_make(_state, &_frame_block); ae_vector_init(&t1, 0, DT_REAL, _state); ae_vector_init(&t2, 0, DT_REAL, _state); ae_vector_set_length(&t1, n-1+1, _state); ae_vector_set_length(&t2, n-1+1, _state); /* * T1 = InvA * U * Lambda = v * T1 */ for(i=0; i<=n-1; i++) { vt = ae_v_dotproduct(&inva->ptr.pp_double[i][0], 1, &u->ptr.p_double[0], 1, ae_v_len(0,n-1)); t1.ptr.p_double[i] = vt; } lambdav = ae_v_dotproduct(&v->ptr.p_double[0], 1, &t1.ptr.p_double[0], 1, ae_v_len(0,n-1)); /* * T2 = v*InvA */ for(j=0; j<=n-1; j++) { vt = ae_v_dotproduct(&v->ptr.p_double[0], 1, &inva->ptr.pp_double[0][j], inva->stride, ae_v_len(0,n-1)); t2.ptr.p_double[j] = vt; } /* * InvA = InvA - correction */ for(i=0; i<=n-1; i++) { vt = t1.ptr.p_double[i]/(1+lambdav); ae_v_subd(&inva->ptr.pp_double[i][0], 1, &t2.ptr.p_double[0], 1, ae_v_len(0,n-1), vt); } ae_frame_leave(_state); } /************************************************************************* Determinant calculation of the matrix given by its LU decomposition. Input parameters: A - LU decomposition of the matrix (output of RMatrixLU subroutine). Pivots - table of permutations which were made during the LU decomposition. Output of RMatrixLU subroutine. N - (optional) size of matrix A: * if given, only principal NxN submatrix is processed and overwritten. other elements are unchanged. * if not given, automatically determined from matrix size (A must be square matrix) Result: matrix determinant. 
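The value is obtained as the product of the diagonal entries of the LU factorization
times the sign of the permutation: every position with Pivots[i]!=i flips the sign.
If the LU factors themselves are not needed, RMatrixDet below performs the
decomposition internally; a minimal sketch (A is an already initialized and filled
2x2 ae_matrix):

    d = rmatrixdet(&a, 2, _state);

which for A = [[1,2],[3,4]] returns d = 1*4 - 2*3 = -2.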
-- ALGLIB -- Copyright 2005 by Bochkanov Sergey *************************************************************************/ double rmatrixludet(/* Real */ ae_matrix* a, /* Integer */ ae_vector* pivots, ae_int_t n, ae_state *_state) { ae_int_t i; ae_int_t s; double result; ae_assert(n>=1, "RMatrixLUDet: N<1!", _state); ae_assert(pivots->cnt>=n, "RMatrixLUDet: Pivots array is too short!", _state); ae_assert(a->rows>=n, "RMatrixLUDet: rows(A)cols>=n, "RMatrixLUDet: cols(A)ptr.pp_double[i][i]; if( pivots->ptr.p_int[i]!=i ) { s = -s; } } result = result*s; return result; } /************************************************************************* Calculation of the determinant of a general matrix Input parameters: A - matrix, array[0..N-1, 0..N-1] N - (optional) size of matrix A: * if given, only principal NxN submatrix is processed and overwritten. other elements are unchanged. * if not given, automatically determined from matrix size (A must be square matrix) Result: determinant of matrix A. -- ALGLIB -- Copyright 2005 by Bochkanov Sergey *************************************************************************/ double rmatrixdet(/* Real */ ae_matrix* a, ae_int_t n, ae_state *_state) { ae_frame _frame_block; ae_matrix _a; ae_vector pivots; double result; ae_frame_make(_state, &_frame_block); ae_matrix_init_copy(&_a, a, _state); a = &_a; ae_vector_init(&pivots, 0, DT_INT, _state); ae_assert(n>=1, "RMatrixDet: N<1!", _state); ae_assert(a->rows>=n, "RMatrixDet: rows(A)cols>=n, "RMatrixDet: cols(A)=1, "CMatrixLUDet: N<1!", _state); ae_assert(pivots->cnt>=n, "CMatrixLUDet: Pivots array is too short!", _state); ae_assert(a->rows>=n, "CMatrixLUDet: rows(A)cols>=n, "CMatrixLUDet: cols(A)ptr.pp_complex[i][i]); if( pivots->ptr.p_int[i]!=i ) { s = -s; } } result = ae_c_mul_d(result,(double)(s)); return result; } /************************************************************************* Calculation of the determinant of a general matrix Input parameters: A - matrix, array[0..N-1, 0..N-1] N - (optional) size of matrix A: * if given, only principal NxN submatrix is processed and overwritten. other elements are unchanged. * if not given, automatically determined from matrix size (A must be square matrix) Result: determinant of matrix A. -- ALGLIB -- Copyright 2005 by Bochkanov Sergey *************************************************************************/ ae_complex cmatrixdet(/* Complex */ ae_matrix* a, ae_int_t n, ae_state *_state) { ae_frame _frame_block; ae_matrix _a; ae_vector pivots; ae_complex result; ae_frame_make(_state, &_frame_block); ae_matrix_init_copy(&_a, a, _state); a = &_a; ae_vector_init(&pivots, 0, DT_INT, _state); ae_assert(n>=1, "CMatrixDet: N<1!", _state); ae_assert(a->rows>=n, "CMatrixDet: rows(A)cols>=n, "CMatrixDet: cols(A)=1, "SPDMatrixCholeskyDet: N<1!", _state); ae_assert(a->rows>=n, "SPDMatrixCholeskyDet: rows(A)cols>=n, "SPDMatrixCholeskyDet: cols(A)ptr.pp_double[i][i], _state); } ae_assert(f, "SPDMatrixCholeskyDet: A contains infinite or NaN values!", _state); result = (double)(1); for(i=0; i<=n-1; i++) { result = result*ae_sqr(a->ptr.pp_double[i][i], _state); } return result; } /************************************************************************* Determinant calculation of the symmetric positive definite matrix. Input parameters: A - matrix. Array with elements [0..N-1, 0..N-1]. N - (optional) size of matrix A: * if given, only principal NxN submatrix is processed and overwritten. other elements are unchanged. 
* if not given, automatically determined from matrix size (A must be square matrix) IsUpper - (optional) storage type: * if True, symmetric matrix A is given by its upper triangle, and the lower triangle isn't used/changed by function * if False, symmetric matrix A is given by its lower triangle, and the upper triangle isn't used/changed by function * if not given, both lower and upper triangles must be filled. Result: determinant of matrix A. If matrix A is not positive definite, exception is thrown. -- ALGLIB -- Copyright 2005-2008 by Bochkanov Sergey *************************************************************************/ double spdmatrixdet(/* Real */ ae_matrix* a, ae_int_t n, ae_bool isupper, ae_state *_state) { ae_frame _frame_block; ae_matrix _a; ae_bool b; double result; ae_frame_make(_state, &_frame_block); ae_matrix_init_copy(&_a, a, _state); a = &_a; ae_assert(n>=1, "SPDMatrixDet: N<1!", _state); ae_assert(a->rows>=n, "SPDMatrixDet: rows(A)cols>=n, "SPDMatrixDet: cols(A)>> SOURCE LICENSE >>> This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation (www.fsf.org); either version 2 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. A copy of the GNU General Public License is available at http://www.fsf.org/licensing/licenses >>> END OF LICENSE >>> *************************************************************************/ #include "stdafx.h" #include "alglibinternal.h" // disable some irrelevant warnings #if (AE_COMPILER==AE_MSVC) #pragma warning(disable:4100) #pragma warning(disable:4127) #pragma warning(disable:4702) #pragma warning(disable:4996) #endif using namespace std; ///////////////////////////////////////////////////////////////////////// // // THIS SECTION CONTAINS IMPLEMENTATION OF C++ INTERFACE // ///////////////////////////////////////////////////////////////////////// namespace alglib { } ///////////////////////////////////////////////////////////////////////// // // THIS SECTION CONTAINS IMPLEMENTATION OF COMPUTATIONAL CORE // ///////////////////////////////////////////////////////////////////////// namespace alglib_impl { static void tsort_tagsortfastirec(/* Real */ ae_vector* a, /* Integer */ ae_vector* b, /* Real */ ae_vector* bufa, /* Integer */ ae_vector* bufb, ae_int_t i1, ae_int_t i2, ae_state *_state); static void tsort_tagsortfastrrec(/* Real */ ae_vector* a, /* Real */ ae_vector* b, /* Real */ ae_vector* bufa, /* Real */ ae_vector* bufb, ae_int_t i1, ae_int_t i2, ae_state *_state); static void tsort_tagsortfastrec(/* Real */ ae_vector* a, /* Real */ ae_vector* bufa, ae_int_t i1, ae_int_t i2, ae_state *_state); static ae_bool safesolve_cbasicsolveandupdate(ae_complex alpha, ae_complex beta, double lnmax, double bnorm, double maxgrowth, double* xnorm, ae_complex* x, ae_state *_state); static double linmin_ftol = 0.001; static double linmin_xtol = 100*ae_machineepsilon; static ae_int_t linmin_maxfev = 20; static double linmin_stpmin = 1.0E-50; static double linmin_defstpmax = 1.0E+50; static double linmin_armijofactor = 1.3; static void linmin_mcstep(double* stx, double* fx, double* dx, double* sty, double* fy, double* dy, double* stp, double fp, double dp, ae_bool* brackt, double stmin, double stmax, ae_int_t* info, 
ae_state *_state); static void xblas_xsum(/* Real */ ae_vector* w, double mx, ae_int_t n, double* r, double* rerr, ae_state *_state); static double xblas_xfastpow(double r, ae_int_t n, ae_state *_state); static void hsschur_internalauxschur(ae_bool wantt, ae_bool wantz, ae_int_t n, ae_int_t ilo, ae_int_t ihi, /* Real */ ae_matrix* h, /* Real */ ae_vector* wr, /* Real */ ae_vector* wi, ae_int_t iloz, ae_int_t ihiz, /* Real */ ae_matrix* z, /* Real */ ae_vector* work, /* Real */ ae_vector* workv3, /* Real */ ae_vector* workc1, /* Real */ ae_vector* works1, ae_int_t* info, ae_state *_state); static void hsschur_aux2x2schur(double* a, double* b, double* c, double* d, double* rt1r, double* rt1i, double* rt2r, double* rt2i, double* cs, double* sn, ae_state *_state); static double hsschur_extschursign(double a, double b, ae_state *_state); static ae_int_t hsschur_extschursigntoone(double b, ae_state *_state); static ae_bool hpccores_hpcpreparechunkedgradientx(/* Real */ ae_vector* weights, ae_int_t wcount, /* Real */ ae_vector* hpcbuf, ae_state *_state); static ae_bool hpccores_hpcfinalizechunkedgradientx(/* Real */ ae_vector* buf, ae_int_t wcount, /* Real */ ae_vector* grad, ae_state *_state); static ae_bool ntheory_isprime(ae_int_t n, ae_state *_state); static ae_int_t ntheory_modmul(ae_int_t a, ae_int_t b, ae_int_t n, ae_state *_state); static ae_int_t ntheory_modexp(ae_int_t a, ae_int_t b, ae_int_t n, ae_state *_state); static ae_int_t ftbase_coltype = 0; static ae_int_t ftbase_coloperandscnt = 1; static ae_int_t ftbase_coloperandsize = 2; static ae_int_t ftbase_colmicrovectorsize = 3; static ae_int_t ftbase_colparam0 = 4; static ae_int_t ftbase_colparam1 = 5; static ae_int_t ftbase_colparam2 = 6; static ae_int_t ftbase_colparam3 = 7; static ae_int_t ftbase_colscnt = 8; static ae_int_t ftbase_opend = 0; static ae_int_t ftbase_opcomplexreffft = 1; static ae_int_t ftbase_opbluesteinsfft = 2; static ae_int_t ftbase_opcomplexcodeletfft = 3; static ae_int_t ftbase_opcomplexcodelettwfft = 4; static ae_int_t ftbase_opradersfft = 5; static ae_int_t ftbase_opcomplextranspose = -1; static ae_int_t ftbase_opcomplexfftfactors = -2; static ae_int_t ftbase_opstart = -3; static ae_int_t ftbase_opjmp = -4; static ae_int_t ftbase_opparallelcall = -5; static ae_int_t ftbase_maxradix = 6; static ae_int_t ftbase_updatetw = 16; static ae_int_t ftbase_recursivethreshold = 1024; static ae_int_t ftbase_raderthreshold = 19; static ae_int_t ftbase_ftbasecodeletrecommended = 5; static double ftbase_ftbaseinefficiencyfactor = 1.3; static ae_int_t ftbase_ftbasemaxsmoothfactor = 5; static void ftbase_ftdeterminespacerequirements(ae_int_t n, ae_int_t* precrsize, ae_int_t* precisize, ae_state *_state); static void ftbase_ftcomplexfftplanrec(ae_int_t n, ae_int_t k, ae_bool childplan, ae_bool topmostplan, ae_int_t* rowptr, ae_int_t* bluesteinsize, ae_int_t* precrptr, ae_int_t* preciptr, fasttransformplan* plan, ae_state *_state); static void ftbase_ftpushentry(fasttransformplan* plan, ae_int_t* rowptr, ae_int_t etype, ae_int_t eopcnt, ae_int_t eopsize, ae_int_t emcvsize, ae_int_t eparam0, ae_state *_state); static void ftbase_ftpushentry2(fasttransformplan* plan, ae_int_t* rowptr, ae_int_t etype, ae_int_t eopcnt, ae_int_t eopsize, ae_int_t emcvsize, ae_int_t eparam0, ae_int_t eparam1, ae_state *_state); static void ftbase_ftpushentry4(fasttransformplan* plan, ae_int_t* rowptr, ae_int_t etype, ae_int_t eopcnt, ae_int_t eopsize, ae_int_t emcvsize, ae_int_t eparam0, ae_int_t eparam1, ae_int_t eparam2, ae_int_t eparam3, ae_state 
*_state); static void ftbase_ftapplysubplan(fasttransformplan* plan, ae_int_t subplan, /* Real */ ae_vector* a, ae_int_t abase, ae_int_t aoffset, /* Real */ ae_vector* buf, ae_int_t repcnt, ae_state *_state); static void ftbase_ftapplycomplexreffft(/* Real */ ae_vector* a, ae_int_t offs, ae_int_t operandscnt, ae_int_t operandsize, ae_int_t microvectorsize, /* Real */ ae_vector* buf, ae_state *_state); static void ftbase_ftapplycomplexcodeletfft(/* Real */ ae_vector* a, ae_int_t offs, ae_int_t operandscnt, ae_int_t operandsize, ae_int_t microvectorsize, ae_state *_state); static void ftbase_ftapplycomplexcodelettwfft(/* Real */ ae_vector* a, ae_int_t offs, ae_int_t operandscnt, ae_int_t operandsize, ae_int_t microvectorsize, ae_state *_state); static void ftbase_ftprecomputebluesteinsfft(ae_int_t n, ae_int_t m, /* Real */ ae_vector* precr, ae_int_t offs, ae_state *_state); static void ftbase_ftbluesteinsfft(fasttransformplan* plan, /* Real */ ae_vector* a, ae_int_t abase, ae_int_t aoffset, ae_int_t operandscnt, ae_int_t n, ae_int_t m, ae_int_t precoffs, ae_int_t subplan, /* Real */ ae_vector* bufa, /* Real */ ae_vector* bufb, /* Real */ ae_vector* bufc, /* Real */ ae_vector* bufd, ae_state *_state); static void ftbase_ftprecomputeradersfft(ae_int_t n, ae_int_t rq, ae_int_t riq, /* Real */ ae_vector* precr, ae_int_t offs, ae_state *_state); static void ftbase_ftradersfft(fasttransformplan* plan, /* Real */ ae_vector* a, ae_int_t abase, ae_int_t aoffset, ae_int_t operandscnt, ae_int_t n, ae_int_t subplan, ae_int_t rq, ae_int_t riq, ae_int_t precoffs, /* Real */ ae_vector* buf, ae_state *_state); static void ftbase_ftfactorize(ae_int_t n, ae_bool isroot, ae_int_t* n1, ae_int_t* n2, ae_state *_state); static ae_int_t ftbase_ftoptimisticestimate(ae_int_t n, ae_state *_state); static void ftbase_ffttwcalc(/* Real */ ae_vector* a, ae_int_t aoffset, ae_int_t n1, ae_int_t n2, ae_state *_state); static void ftbase_internalcomplexlintranspose(/* Real */ ae_vector* a, ae_int_t m, ae_int_t n, ae_int_t astart, /* Real */ ae_vector* buf, ae_state *_state); static void ftbase_ffticltrec(/* Real */ ae_vector* a, ae_int_t astart, ae_int_t astride, /* Real */ ae_vector* b, ae_int_t bstart, ae_int_t bstride, ae_int_t m, ae_int_t n, ae_state *_state); static void ftbase_fftirltrec(/* Real */ ae_vector* a, ae_int_t astart, ae_int_t astride, /* Real */ ae_vector* b, ae_int_t bstart, ae_int_t bstride, ae_int_t m, ae_int_t n, ae_state *_state); static void ftbase_ftbasefindsmoothrec(ae_int_t n, ae_int_t seed, ae_int_t leastfactor, ae_int_t* best, ae_state *_state); ae_int_t getrdfserializationcode(ae_state *_state) { ae_int_t result; result = 1; return result; } ae_int_t getkdtreeserializationcode(ae_state *_state) { ae_int_t result; result = 2; return result; } ae_int_t getmlpserializationcode(ae_state *_state) { ae_int_t result; result = 3; return result; } ae_int_t getmlpeserializationcode(ae_state *_state) { ae_int_t result; result = 4; return result; } ae_int_t getrbfserializationcode(ae_state *_state) { ae_int_t result; result = 5; return result; } /************************************************************************* This function is used to set error flags during unit tests. When COND parameter is True, FLAG variable is set to True. When COND is False, FLAG is unchanged. The purpose of this function is to have single point where failures of unit tests can be detected. This function returns value of COND. 
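A typical accumulation pattern in the unit tests (sketch; Err starts as ae_false and is
inspected once after all checks have run; v, vref, tol, cnt and expectedcnt are
illustrative names):

    seterrorflag(&err, ae_fp_greater(ae_fabs(v-vref, _state), tol), _state);
    seterrorflag(&err, cnt!=expectedcnt, _state);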
*************************************************************************/ ae_bool seterrorflag(ae_bool* flag, ae_bool cond, ae_state *_state) { ae_bool result; if( cond ) { *flag = ae_true; } result = cond; return result; } /************************************************************************* Internally calls SetErrorFlag() with condition: Abs(Val-RefVal)>Tol*Max(Abs(RefVal),S) This function is used to test relative error in Val against RefVal, with relative error being replaced by absolute when scale of RefVal is less than S. This function returns value of COND. *************************************************************************/ ae_bool seterrorflagdiff(ae_bool* flag, double val, double refval, double tol, double s, ae_state *_state) { ae_bool result; result = seterrorflag(flag, ae_fp_greater(ae_fabs(val-refval, _state),tol*ae_maxreal(ae_fabs(refval, _state), s, _state)), _state); return result; } /************************************************************************* The function "touches" integer - it is used to avoid compiler messages about unused variables (in rare cases when we do NOT want to remove these variables). -- ALGLIB -- Copyright 17.09.2012 by Bochkanov Sergey *************************************************************************/ void touchint(ae_int_t* a, ae_state *_state) { } /************************************************************************* The function "touches" real - it is used to avoid compiler messages about unused variables (in rare cases when we do NOT want to remove these variables). -- ALGLIB -- Copyright 17.09.2012 by Bochkanov Sergey *************************************************************************/ void touchreal(double* a, ae_state *_state) { } /************************************************************************* The function performs zero-coalescing on real value. NOTE: no check is performed for B<>0 -- ALGLIB -- Copyright 18.05.2015 by Bochkanov Sergey *************************************************************************/ double coalesce(double a, double b, ae_state *_state) { double result; result = a; if( ae_fp_eq(a,0.0) ) { result = b; } return result; } /************************************************************************* The function performs zero-coalescing on integer value. NOTE: no check is performed for B<>0 -- ALGLIB -- Copyright 18.05.2015 by Bochkanov Sergey *************************************************************************/ ae_int_t coalescei(ae_int_t a, ae_int_t b, ae_state *_state) { ae_int_t result; result = a; if( a==0 ) { result = b; } return result; } /************************************************************************* The function convert integer value to real value. -- ALGLIB -- Copyright 17.09.2012 by Bochkanov Sergey *************************************************************************/ double inttoreal(ae_int_t a, ae_state *_state) { double result; result = (double)(a); return result; } /************************************************************************* The function calculates binary logarithm. 
NOTE: it costs twice as much as Ln(x) -- ALGLIB -- Copyright 17.09.2012 by Bochkanov Sergey *************************************************************************/ double logbase2(double x, ae_state *_state) { double result; result = ae_log(x, _state)/ae_log((double)(2), _state); return result; } /************************************************************************* This function compares two numbers for approximate equality, with tolerance to errors as large as tol. -- ALGLIB -- Copyright 02.12.2009 by Bochkanov Sergey *************************************************************************/ ae_bool approxequal(double a, double b, double tol, ae_state *_state) { ae_bool result; result = ae_fp_less_eq(ae_fabs(a-b, _state),tol); return result; } /************************************************************************* This function compares two numbers for approximate equality, with tolerance to errors as large as max(|a|,|b|)*tol. -- ALGLIB -- Copyright 02.12.2009 by Bochkanov Sergey *************************************************************************/ ae_bool approxequalrel(double a, double b, double tol, ae_state *_state) { ae_bool result; result = ae_fp_less_eq(ae_fabs(a-b, _state),ae_maxreal(ae_fabs(a, _state), ae_fabs(b, _state), _state)*tol); return result; } /************************************************************************* This function generates 1-dimensional general interpolation task with moderate Lipshitz constant (close to 1.0) If N=1 then suborutine generates only one point at the middle of [A,B] -- ALGLIB -- Copyright 02.12.2009 by Bochkanov Sergey *************************************************************************/ void taskgenint1d(double a, double b, ae_int_t n, /* Real */ ae_vector* x, /* Real */ ae_vector* y, ae_state *_state) { ae_int_t i; double h; ae_vector_clear(x); ae_vector_clear(y); ae_assert(n>=1, "TaskGenInterpolationEqdist1D: N<1!", _state); ae_vector_set_length(x, n, _state); ae_vector_set_length(y, n, _state); if( n>1 ) { x->ptr.p_double[0] = a; y->ptr.p_double[0] = 2*ae_randomreal(_state)-1; h = (b-a)/(n-1); for(i=1; i<=n-1; i++) { if( i!=n-1 ) { x->ptr.p_double[i] = a+(i+0.2*(2*ae_randomreal(_state)-1))*h; } else { x->ptr.p_double[i] = b; } y->ptr.p_double[i] = y->ptr.p_double[i-1]+(2*ae_randomreal(_state)-1)*(x->ptr.p_double[i]-x->ptr.p_double[i-1]); } } else { x->ptr.p_double[0] = 0.5*(a+b); y->ptr.p_double[0] = 2*ae_randomreal(_state)-1; } } /************************************************************************* This function generates 1-dimensional equidistant interpolation task with moderate Lipshitz constant (close to 1.0) If N=1 then suborutine generates only one point at the middle of [A,B] -- ALGLIB -- Copyright 02.12.2009 by Bochkanov Sergey *************************************************************************/ void taskgenint1dequidist(double a, double b, ae_int_t n, /* Real */ ae_vector* x, /* Real */ ae_vector* y, ae_state *_state) { ae_int_t i; double h; ae_vector_clear(x); ae_vector_clear(y); ae_assert(n>=1, "TaskGenInterpolationEqdist1D: N<1!", _state); ae_vector_set_length(x, n, _state); ae_vector_set_length(y, n, _state); if( n>1 ) { x->ptr.p_double[0] = a; y->ptr.p_double[0] = 2*ae_randomreal(_state)-1; h = (b-a)/(n-1); for(i=1; i<=n-1; i++) { x->ptr.p_double[i] = a+i*h; y->ptr.p_double[i] = y->ptr.p_double[i-1]+(2*ae_randomreal(_state)-1)*h; } } else { x->ptr.p_double[0] = 0.5*(a+b); y->ptr.p_double[0] = 2*ae_randomreal(_state)-1; } } 
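/*************************************************************************
A minimal usage sketch for the task generators above (illustrative only,
hypothetical function name; assumes it is called from code which already
has an ae_state available): generate 11 equidistant nodes on [0,1] with a
random function of moderate Lipschitz constant and verify that the nodes
are distinct.
*************************************************************************/
static void taskgen_usage_sketch(ae_state *_state)
{
    ae_frame _frame_block;
    ae_vector x;
    ae_vector y;

    ae_frame_make(_state, &_frame_block);
    ae_vector_init(&x, 0, DT_REAL, _state);
    ae_vector_init(&y, 0, DT_REAL, _state);

    /* X and Y are (re)allocated by the generator itself */
    taskgenint1dequidist(0.0, 1.0, 11, &x, &y, _state);

    /* equidistant nodes are strictly increasing, so this check always passes */
    ae_assert(aredistinct(&x, 11, _state), "taskgen_usage_sketch: nodes are not distinct", _state);

    ae_frame_leave(_state);
}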
/************************************************************************* This function generates 1-dimensional Chebyshev-1 interpolation task with moderate Lipshitz constant (close to 1.0) If N=1 then suborutine generates only one point at the middle of [A,B] -- ALGLIB -- Copyright 02.12.2009 by Bochkanov Sergey *************************************************************************/ void taskgenint1dcheb1(double a, double b, ae_int_t n, /* Real */ ae_vector* x, /* Real */ ae_vector* y, ae_state *_state) { ae_int_t i; ae_vector_clear(x); ae_vector_clear(y); ae_assert(n>=1, "TaskGenInterpolation1DCheb1: N<1!", _state); ae_vector_set_length(x, n, _state); ae_vector_set_length(y, n, _state); if( n>1 ) { for(i=0; i<=n-1; i++) { x->ptr.p_double[i] = 0.5*(b+a)+0.5*(b-a)*ae_cos(ae_pi*(2*i+1)/(2*n), _state); if( i==0 ) { y->ptr.p_double[i] = 2*ae_randomreal(_state)-1; } else { y->ptr.p_double[i] = y->ptr.p_double[i-1]+(2*ae_randomreal(_state)-1)*(x->ptr.p_double[i]-x->ptr.p_double[i-1]); } } } else { x->ptr.p_double[0] = 0.5*(a+b); y->ptr.p_double[0] = 2*ae_randomreal(_state)-1; } } /************************************************************************* This function generates 1-dimensional Chebyshev-2 interpolation task with moderate Lipshitz constant (close to 1.0) If N=1 then suborutine generates only one point at the middle of [A,B] -- ALGLIB -- Copyright 02.12.2009 by Bochkanov Sergey *************************************************************************/ void taskgenint1dcheb2(double a, double b, ae_int_t n, /* Real */ ae_vector* x, /* Real */ ae_vector* y, ae_state *_state) { ae_int_t i; ae_vector_clear(x); ae_vector_clear(y); ae_assert(n>=1, "TaskGenInterpolation1DCheb2: N<1!", _state); ae_vector_set_length(x, n, _state); ae_vector_set_length(y, n, _state); if( n>1 ) { for(i=0; i<=n-1; i++) { x->ptr.p_double[i] = 0.5*(b+a)+0.5*(b-a)*ae_cos(ae_pi*i/(n-1), _state); if( i==0 ) { y->ptr.p_double[i] = 2*ae_randomreal(_state)-1; } else { y->ptr.p_double[i] = y->ptr.p_double[i-1]+(2*ae_randomreal(_state)-1)*(x->ptr.p_double[i]-x->ptr.p_double[i-1]); } } } else { x->ptr.p_double[0] = 0.5*(a+b); y->ptr.p_double[0] = 2*ae_randomreal(_state)-1; } } /************************************************************************* This function checks that all values from X[] are distinct. It does more than just usual floating point comparison: * first, it calculates max(X) and min(X) * second, it maps X[] from [min,max] to [1,2] * only at this stage actual comparison is done The meaning of such check is to ensure that all values are "distinct enough" and will not cause interpolation subroutine to fail. 
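In effect the comparison becomes a relative one at the scale of the whole data range:
two nodes collide only if their images (X[i]-min)/(max-min)+1 coincide as doubles,
i.e. only if the nodes differ by less than roughly one ulp (about 2.2E-16) of max-min.
For example, with min(X)=0 and max(X)=1000 the nodes 500 and 500.000000001 map to
1.5 and 1.500000000001 and are still reported as distinct.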
NOTE: X[] must be sorted by ascending (subroutine ASSERT's it) -- ALGLIB -- Copyright 02.12.2009 by Bochkanov Sergey *************************************************************************/ ae_bool aredistinct(/* Real */ ae_vector* x, ae_int_t n, ae_state *_state) { double a; double b; ae_int_t i; ae_bool nonsorted; ae_bool result; ae_assert(n>=1, "APSERVAreDistinct: internal error (N<1)", _state); if( n==1 ) { /* * everything is alright, it is up to caller to decide whether it * can interpolate something with just one point */ result = ae_true; return result; } a = x->ptr.p_double[0]; b = x->ptr.p_double[0]; nonsorted = ae_false; for(i=1; i<=n-1; i++) { a = ae_minreal(a, x->ptr.p_double[i], _state); b = ae_maxreal(b, x->ptr.p_double[i], _state); nonsorted = nonsorted||ae_fp_greater_eq(x->ptr.p_double[i-1],x->ptr.p_double[i]); } ae_assert(!nonsorted, "APSERVAreDistinct: internal error (not sorted)", _state); for(i=1; i<=n-1; i++) { if( ae_fp_eq((x->ptr.p_double[i]-a)/(b-a)+1,(x->ptr.p_double[i-1]-a)/(b-a)+1) ) { result = ae_false; return result; } } result = ae_true; return result; } /************************************************************************* This function checks that two boolean values are the same (both are True or both are False). -- ALGLIB -- Copyright 02.12.2009 by Bochkanov Sergey *************************************************************************/ ae_bool aresameboolean(ae_bool v1, ae_bool v2, ae_state *_state) { ae_bool result; result = (v1&&v2)||(!v1&&!v2); return result; } /************************************************************************* If Length(X)cntcntcnt0&&n>0 ) { if( x->rowscolscnt; ae_swap_vectors(x, &oldx); ae_vector_set_length(x, n, _state); for(i=0; i<=n-1; i++) { if( iptr.p_int[i] = oldx.ptr.p_int[i]; } else { x->ptr.p_int[i] = 0; } } ae_frame_leave(_state); } /************************************************************************* Resizes X and: * preserves old contents of X * fills new elements by zeros -- ALGLIB -- Copyright 20.03.2009 by Bochkanov Sergey *************************************************************************/ void rvectorresize(/* Real */ ae_vector* x, ae_int_t n, ae_state *_state) { ae_frame _frame_block; ae_vector oldx; ae_int_t i; ae_int_t n2; ae_frame_make(_state, &_frame_block); ae_vector_init(&oldx, 0, DT_REAL, _state); n2 = x->cnt; ae_swap_vectors(x, &oldx); ae_vector_set_length(x, n, _state); for(i=0; i<=n-1; i++) { if( iptr.p_double[i] = oldx.ptr.p_double[i]; } else { x->ptr.p_double[i] = (double)(0); } } ae_frame_leave(_state); } /************************************************************************* Resizes X and: * preserves old contents of X * fills new elements by zeros -- ALGLIB -- Copyright 20.03.2009 by Bochkanov Sergey *************************************************************************/ void rmatrixresize(/* Real */ ae_matrix* x, ae_int_t m, ae_int_t n, ae_state *_state) { ae_frame _frame_block; ae_matrix oldx; ae_int_t i; ae_int_t j; ae_int_t m2; ae_int_t n2; ae_frame_make(_state, &_frame_block); ae_matrix_init(&oldx, 0, 0, DT_REAL, _state); m2 = x->rows; n2 = x->cols; ae_swap_matrices(x, &oldx); ae_matrix_set_length(x, m, n, _state); for(i=0; i<=m-1; i++) { for(j=0; j<=n-1; j++) { if( iptr.pp_double[i][j] = oldx.ptr.pp_double[i][j]; } else { x->ptr.pp_double[i][j] = 0.0; } } } ae_frame_leave(_state); } /************************************************************************* Resizes X and: * preserves old contents of X * fills new elements by zeros -- ALGLIB -- Copyright 20.03.2009 
by Bochkanov Sergey *************************************************************************/ void imatrixresize(/* Integer */ ae_matrix* x, ae_int_t m, ae_int_t n, ae_state *_state) { ae_frame _frame_block; ae_matrix oldx; ae_int_t i; ae_int_t j; ae_int_t m2; ae_int_t n2; ae_frame_make(_state, &_frame_block); ae_matrix_init(&oldx, 0, 0, DT_INT, _state); m2 = x->rows; n2 = x->cols; ae_swap_matrices(x, &oldx); ae_matrix_set_length(x, m, n, _state); for(i=0; i<=m-1; i++) { for(j=0; j<=n-1; j++) { if( iptr.pp_int[i][j] = oldx.ptr.pp_int[i][j]; } else { x->ptr.pp_int[i][j] = 0; } } } ae_frame_leave(_state); } /************************************************************************* This function checks that length(X) is at least N and first N values from X[] are finite -- ALGLIB -- Copyright 18.06.2010 by Bochkanov Sergey *************************************************************************/ ae_bool isfinitevector(/* Real */ ae_vector* x, ae_int_t n, ae_state *_state) { ae_int_t i; ae_bool result; ae_assert(n>=0, "APSERVIsFiniteVector: internal error (N<0)", _state); if( n==0 ) { result = ae_true; return result; } if( x->cntptr.p_double[i], _state) ) { result = ae_false; return result; } } result = ae_true; return result; } /************************************************************************* This function checks that first N values from X[] are finite -- ALGLIB -- Copyright 18.06.2010 by Bochkanov Sergey *************************************************************************/ ae_bool isfinitecvector(/* Complex */ ae_vector* z, ae_int_t n, ae_state *_state) { ae_int_t i; ae_bool result; ae_assert(n>=0, "APSERVIsFiniteCVector: internal error (N<0)", _state); for(i=0; i<=n-1; i++) { if( !ae_isfinite(z->ptr.p_complex[i].x, _state)||!ae_isfinite(z->ptr.p_complex[i].y, _state) ) { result = ae_false; return result; } } result = ae_true; return result; } /************************************************************************* This function checks that size of X is at least MxN and values from X[0..M-1,0..N-1] are finite. 
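Checks of this kind are typically used as input guards at the top of the public
routines, e.g. (sketch, with illustrative names and message):

    ae_assert(isfinitevector(&x, n, _state), "Solver: X contains infinite or NaN values!", _state);
    ae_assert(apservisfinitematrix(&a, m, n, _state), "Solver: A contains infinite or NaN values!", _state);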
-- ALGLIB -- Copyright 18.06.2010 by Bochkanov Sergey *************************************************************************/ ae_bool apservisfinitematrix(/* Real */ ae_matrix* x, ae_int_t m, ae_int_t n, ae_state *_state) { ae_int_t i; ae_int_t j; ae_bool result; ae_assert(n>=0, "APSERVIsFiniteMatrix: internal error (N<0)", _state); ae_assert(m>=0, "APSERVIsFiniteMatrix: internal error (M<0)", _state); if( m==0||n==0 ) { result = ae_true; return result; } if( x->rowscolsptr.pp_double[i][j], _state) ) { result = ae_false; return result; } } } result = ae_true; return result; } /************************************************************************* This function checks that all values from X[0..M-1,0..N-1] are finite -- ALGLIB -- Copyright 18.06.2010 by Bochkanov Sergey *************************************************************************/ ae_bool apservisfinitecmatrix(/* Complex */ ae_matrix* x, ae_int_t m, ae_int_t n, ae_state *_state) { ae_int_t i; ae_int_t j; ae_bool result; ae_assert(n>=0, "APSERVIsFiniteCMatrix: internal error (N<0)", _state); ae_assert(m>=0, "APSERVIsFiniteCMatrix: internal error (M<0)", _state); for(i=0; i<=m-1; i++) { for(j=0; j<=n-1; j++) { if( !ae_isfinite(x->ptr.pp_complex[i][j].x, _state)||!ae_isfinite(x->ptr.pp_complex[i][j].y, _state) ) { result = ae_false; return result; } } } result = ae_true; return result; } /************************************************************************* This function checks that size of X is at least NxN and all values from upper/lower triangle of X[0..N-1,0..N-1] are finite -- ALGLIB -- Copyright 18.06.2010 by Bochkanov Sergey *************************************************************************/ ae_bool isfinitertrmatrix(/* Real */ ae_matrix* x, ae_int_t n, ae_bool isupper, ae_state *_state) { ae_int_t i; ae_int_t j1; ae_int_t j2; ae_int_t j; ae_bool result; ae_assert(n>=0, "APSERVIsFiniteRTRMatrix: internal error (N<0)", _state); if( n==0 ) { result = ae_true; return result; } if( x->rowscolsptr.pp_double[i][j], _state) ) { result = ae_false; return result; } } } result = ae_true; return result; } /************************************************************************* This function checks that all values from upper/lower triangle of X[0..N-1,0..N-1] are finite -- ALGLIB -- Copyright 18.06.2010 by Bochkanov Sergey *************************************************************************/ ae_bool apservisfinitectrmatrix(/* Complex */ ae_matrix* x, ae_int_t n, ae_bool isupper, ae_state *_state) { ae_int_t i; ae_int_t j1; ae_int_t j2; ae_int_t j; ae_bool result; ae_assert(n>=0, "APSERVIsFiniteCTRMatrix: internal error (N<0)", _state); for(i=0; i<=n-1; i++) { if( isupper ) { j1 = i; j2 = n-1; } else { j1 = 0; j2 = i; } for(j=j1; j<=j2; j++) { if( !ae_isfinite(x->ptr.pp_complex[i][j].x, _state)||!ae_isfinite(x->ptr.pp_complex[i][j].y, _state) ) { result = ae_false; return result; } } } result = ae_true; return result; } /************************************************************************* This function checks that all values from X[0..M-1,0..N-1] are finite or NaN's. 
-- ALGLIB -- Copyright 18.06.2010 by Bochkanov Sergey *************************************************************************/ ae_bool apservisfiniteornanmatrix(/* Real */ ae_matrix* x, ae_int_t m, ae_int_t n, ae_state *_state) { ae_int_t i; ae_int_t j; ae_bool result; ae_assert(n>=0, "APSERVIsFiniteOrNaNMatrix: internal error (N<0)", _state); ae_assert(m>=0, "APSERVIsFiniteOrNaNMatrix: internal error (M<0)", _state); for(i=0; i<=m-1; i++) { for(j=0; j<=n-1; j++) { if( !(ae_isfinite(x->ptr.pp_double[i][j], _state)||ae_isnan(x->ptr.pp_double[i][j], _state)) ) { result = ae_false; return result; } } } result = ae_true; return result; } /************************************************************************* Safe sqrt(x^2+y^2) -- ALGLIB -- Copyright by Bochkanov Sergey *************************************************************************/ double safepythag2(double x, double y, ae_state *_state) { double w; double xabs; double yabs; double z; double result; xabs = ae_fabs(x, _state); yabs = ae_fabs(y, _state); w = ae_maxreal(xabs, yabs, _state); z = ae_minreal(xabs, yabs, _state); if( ae_fp_eq(z,(double)(0)) ) { result = w; } else { result = w*ae_sqrt(1+ae_sqr(z/w, _state), _state); } return result; } /************************************************************************* Safe sqrt(x^2+y^2) -- ALGLIB -- Copyright by Bochkanov Sergey *************************************************************************/ double safepythag3(double x, double y, double z, ae_state *_state) { double w; double result; w = ae_maxreal(ae_fabs(x, _state), ae_maxreal(ae_fabs(y, _state), ae_fabs(z, _state), _state), _state); if( ae_fp_eq(w,(double)(0)) ) { result = (double)(0); return result; } x = x/w; y = y/w; z = z/w; result = w*ae_sqrt(ae_sqr(x, _state)+ae_sqr(y, _state)+ae_sqr(z, _state), _state); return result; } /************************************************************************* Safe division. This function attempts to calculate R=X/Y without overflow. It returns: * +1, if abs(X/Y)>=MaxRealNumber or undefined - overflow-like situation (no overlfow is generated, R is either NAN, PosINF, NegINF) * 0, if MinRealNumber0 (R contains result, may be zero) * -1, if 00 */ if( ae_fp_eq(y,(double)(0)) ) { result = 1; if( ae_fp_eq(x,(double)(0)) ) { *r = _state->v_nan; } if( ae_fp_greater(x,(double)(0)) ) { *r = _state->v_posinf; } if( ae_fp_less(x,(double)(0)) ) { *r = _state->v_neginf; } return result; } if( ae_fp_eq(x,(double)(0)) ) { *r = (double)(0); result = 0; return result; } /* * make Y>0 */ if( ae_fp_less(y,(double)(0)) ) { x = -x; y = -y; } /* * */ if( ae_fp_greater_eq(y,(double)(1)) ) { *r = x/y; if( ae_fp_less_eq(ae_fabs(*r, _state),ae_minrealnumber) ) { result = -1; *r = (double)(0); } else { result = 0; } } else { if( ae_fp_greater_eq(ae_fabs(x, _state),ae_maxrealnumber*y) ) { if( ae_fp_greater(x,(double)(0)) ) { *r = _state->v_posinf; } else { *r = _state->v_neginf; } result = 1; } else { *r = x/y; result = 0; } } return result; } /************************************************************************* This function calculates "safe" min(X/Y,V) for positive finite X, Y, V. No overflow is generated in any case. 
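When Y>=1 the quotient X/Y cannot overflow, so it is computed and compared with V
directly; when Y<1 the function first compares X with V*Y and performs the division
only if X/Y is known to be the smaller value. For example, with X=1.0E300, Y=1.0E-10
and V=1 the result is V=1 and the overflowing quotient X/Y is never evaluated.
The same scaling idea is used by SafePythag2/SafePythag3 above, which evaluate
sqrt(x^2+y^2) as w*sqrt(1+(z/w)^2) with w=max(|x|,|y|) and z=min(|x|,|y|).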
-- ALGLIB -- Copyright by Bochkanov Sergey *************************************************************************/ double safeminposrv(double x, double y, double v, ae_state *_state) { double r; double result; if( ae_fp_greater_eq(y,(double)(1)) ) { /* * Y>=1, we can safely divide by Y */ r = x/y; result = v; if( ae_fp_greater(v,r) ) { result = r; } else { result = v; } } else { /* * Y<1, we can safely multiply by Y */ if( ae_fp_less(x,v*y) ) { result = x/y; } else { result = v; } } return result; } /************************************************************************* This function makes periodic mapping of X to [A,B]. It accepts X, A, B (A>B). It returns T which lies in [A,B] and integer K, such that X = T + K*(B-A). NOTES: * K is represented as real value, although actually it is integer * T is guaranteed to be in [A,B] * T replaces X -- ALGLIB -- Copyright by Bochkanov Sergey *************************************************************************/ void apperiodicmap(double* x, double a, double b, double* k, ae_state *_state) { *k = 0; ae_assert(ae_fp_less(a,b), "APPeriodicMap: internal error!", _state); *k = (double)(ae_ifloor((*x-a)/(b-a), _state)); *x = *x-*k*(b-a); while(ae_fp_less(*x,a)) { *x = *x+(b-a); *k = *k-1; } while(ae_fp_greater(*x,b)) { *x = *x-(b-a); *k = *k+1; } *x = ae_maxreal(*x, a, _state); *x = ae_minreal(*x, b, _state); } /************************************************************************* Returns random normal number using low-quality system-provided generator -- ALGLIB -- Copyright 20.03.2009 by Bochkanov Sergey *************************************************************************/ double randomnormal(ae_state *_state) { double u; double v; double s; double result; for(;;) { u = 2*ae_randomreal(_state)-1; v = 2*ae_randomreal(_state)-1; s = ae_sqr(u, _state)+ae_sqr(v, _state); if( ae_fp_greater(s,(double)(0))&&ae_fp_less(s,(double)(1)) ) { /* * two Sqrt's instead of one to * avoid overflow when S is too small */ s = ae_sqrt(-2*ae_log(s, _state), _state)/ae_sqrt(s, _state); result = u*s; break; } } return result; } /************************************************************************* Generates random unit vector using low-quality system-provided generator. Reallocates array if its size is too short. 
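RandomNormal above uses the Marsaglia polar method: it draws (u,v) uniformly from the
square (-1,1)x(-1,1), accepts the pair when s=u^2+v^2 lies in (0,1), and returns
u*sqrt(-2*ln(s)/s). This routine fills X with such normal deviates and rescales the
vector to unit length, which produces a direction uniformly distributed over the unit
sphere in R^N.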
-- ALGLIB -- Copyright 20.03.2009 by Bochkanov Sergey *************************************************************************/ void randomunit(ae_int_t n, /* Real */ ae_vector* x, ae_state *_state) { ae_int_t i; double v; double vv; ae_assert(n>0, "RandomUnit: N<=0", _state); if( x->cntptr.p_double[i] = vv; v = v+vv*vv; } } while(ae_fp_less_eq(v,(double)(0))); v = 1/ae_sqrt(v, _state); for(i=0; i<=n-1; i++) { x->ptr.p_double[i] = x->ptr.p_double[i]*v; } } /************************************************************************* This function is used to swap two integer values *************************************************************************/ void swapi(ae_int_t* v0, ae_int_t* v1, ae_state *_state) { ae_int_t v; v = *v0; *v0 = *v1; *v1 = v; } /************************************************************************* This function is used to swap two real values *************************************************************************/ void swapr(double* v0, double* v1, ae_state *_state) { double v; v = *v0; *v0 = *v1; *v1 = v; } /************************************************************************* This function is used to return maximum of three real values *************************************************************************/ double maxreal3(double v0, double v1, double v2, ae_state *_state) { double result; result = v0; if( ae_fp_less(result,v1) ) { result = v1; } if( ae_fp_less(result,v2) ) { result = v2; } return result; } /************************************************************************* This function is used to increment value of integer variable *************************************************************************/ void inc(ae_int_t* v, ae_state *_state) { *v = *v+1; } /************************************************************************* This function is used to decrement value of integer variable *************************************************************************/ void dec(ae_int_t* v, ae_state *_state) { *v = *v-1; } /************************************************************************* This function performs two operations: 1. decrements value of integer variable, if it is positive 2. explicitly sets variable to zero if it is non-positive It is used by some algorithms to decrease value of internal counters. *************************************************************************/ void countdown(ae_int_t* v, ae_state *_state) { if( *v>0 ) { *v = *v-1; } else { *v = 0; } } /************************************************************************* This function returns product of two real numbers. It is convenient when you have to perform typecast-and-product of two INTEGERS. 
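A typical call is rmul2((double)m, (double)n, _state), which avoids the ae_int_t
overflow that m*n could produce for large m and n (at the price of the usual
double-precision rounding once the product exceeds 2^53).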
*************************************************************************/ double rmul2(double v0, double v1, ae_state *_state) { double result; result = v0*v1; return result; } /************************************************************************* 'bounds' value: maps X to [B1,B2] -- ALGLIB -- Copyright 20.03.2009 by Bochkanov Sergey *************************************************************************/ double boundval(double x, double b1, double b2, ae_state *_state) { double result; if( ae_fp_less_eq(x,b1) ) { result = b1; return result; } if( ae_fp_greater_eq(x,b2) ) { result = b2; return result; } result = x; return result; } /************************************************************************* Allocation of serializer: complex value *************************************************************************/ void alloccomplex(ae_serializer* s, ae_complex v, ae_state *_state) { ae_serializer_alloc_entry(s); ae_serializer_alloc_entry(s); } /************************************************************************* Serialization: complex value *************************************************************************/ void serializecomplex(ae_serializer* s, ae_complex v, ae_state *_state) { ae_serializer_serialize_double(s, v.x, _state); ae_serializer_serialize_double(s, v.y, _state); } /************************************************************************* Unserialization: complex value *************************************************************************/ ae_complex unserializecomplex(ae_serializer* s, ae_state *_state) { ae_complex result; ae_serializer_unserialize_double(s, &result.x, _state); ae_serializer_unserialize_double(s, &result.y, _state); return result; } /************************************************************************* Allocation of serializer: real array *************************************************************************/ void allocrealarray(ae_serializer* s, /* Real */ ae_vector* v, ae_int_t n, ae_state *_state) { ae_int_t i; if( n<0 ) { n = v->cnt; } ae_serializer_alloc_entry(s); for(i=0; i<=n-1; i++) { ae_serializer_alloc_entry(s); } } /************************************************************************* Serialization: complex value *************************************************************************/ void serializerealarray(ae_serializer* s, /* Real */ ae_vector* v, ae_int_t n, ae_state *_state) { ae_int_t i; if( n<0 ) { n = v->cnt; } ae_serializer_serialize_int(s, n, _state); for(i=0; i<=n-1; i++) { ae_serializer_serialize_double(s, v->ptr.p_double[i], _state); } } /************************************************************************* Unserialization: complex value *************************************************************************/ void unserializerealarray(ae_serializer* s, /* Real */ ae_vector* v, ae_state *_state) { ae_int_t n; ae_int_t i; double t; ae_vector_clear(v); ae_serializer_unserialize_int(s, &n, _state); if( n==0 ) { return; } ae_vector_set_length(v, n, _state); for(i=0; i<=n-1; i++) { ae_serializer_unserialize_double(s, &t, _state); v->ptr.p_double[i] = t; } } /************************************************************************* Allocation of serializer: Integer array *************************************************************************/ void allocintegerarray(ae_serializer* s, /* Integer */ ae_vector* v, ae_int_t n, ae_state *_state) { ae_int_t i; if( n<0 ) { n = v->cnt; } ae_serializer_alloc_entry(s); for(i=0; i<=n-1; i++) { ae_serializer_alloc_entry(s); } } 
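/*************************************************************************
The allocation, serialization and unserialization helpers above follow the
serializer's two-phase protocol: the alloc-routine is executed first, in
allocation mode, so that the serializer can count entries and reserve
enough space; the serialize-routine is executed afterwards and must emit
exactly the same entries in the same order; the unserialize-routine mirrors
it on the reading side. A minimal sketch (S is an ae_serializer already
prepared by the caller; setup and teardown are omitted):

    allocrealarray(&s, &v, -1, _state);       negative N means "use v.cnt"
    ...
    serializerealarray(&s, &v, -1, _state);   writes the length, then the elements
    ...
    unserializerealarray(&s, &w, _state);     restores the length and the contents
*************************************************************************/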
/************************************************************************* Serialization: Integer array *************************************************************************/ void serializeintegerarray(ae_serializer* s, /* Integer */ ae_vector* v, ae_int_t n, ae_state *_state) { ae_int_t i; if( n<0 ) { n = v->cnt; } ae_serializer_serialize_int(s, n, _state); for(i=0; i<=n-1; i++) { ae_serializer_serialize_int(s, v->ptr.p_int[i], _state); } } /************************************************************************* Unserialization: complex value *************************************************************************/ void unserializeintegerarray(ae_serializer* s, /* Integer */ ae_vector* v, ae_state *_state) { ae_int_t n; ae_int_t i; ae_int_t t; ae_vector_clear(v); ae_serializer_unserialize_int(s, &n, _state); if( n==0 ) { return; } ae_vector_set_length(v, n, _state); for(i=0; i<=n-1; i++) { ae_serializer_unserialize_int(s, &t, _state); v->ptr.p_int[i] = t; } } /************************************************************************* Allocation of serializer: real matrix *************************************************************************/ void allocrealmatrix(ae_serializer* s, /* Real */ ae_matrix* v, ae_int_t n0, ae_int_t n1, ae_state *_state) { ae_int_t i; ae_int_t j; if( n0<0 ) { n0 = v->rows; } if( n1<0 ) { n1 = v->cols; } ae_serializer_alloc_entry(s); ae_serializer_alloc_entry(s); for(i=0; i<=n0-1; i++) { for(j=0; j<=n1-1; j++) { ae_serializer_alloc_entry(s); } } } /************************************************************************* Serialization: complex value *************************************************************************/ void serializerealmatrix(ae_serializer* s, /* Real */ ae_matrix* v, ae_int_t n0, ae_int_t n1, ae_state *_state) { ae_int_t i; ae_int_t j; if( n0<0 ) { n0 = v->rows; } if( n1<0 ) { n1 = v->cols; } ae_serializer_serialize_int(s, n0, _state); ae_serializer_serialize_int(s, n1, _state); for(i=0; i<=n0-1; i++) { for(j=0; j<=n1-1; j++) { ae_serializer_serialize_double(s, v->ptr.pp_double[i][j], _state); } } } /************************************************************************* Unserialization: complex value *************************************************************************/ void unserializerealmatrix(ae_serializer* s, /* Real */ ae_matrix* v, ae_state *_state) { ae_int_t i; ae_int_t j; ae_int_t n0; ae_int_t n1; double t; ae_matrix_clear(v); ae_serializer_unserialize_int(s, &n0, _state); ae_serializer_unserialize_int(s, &n1, _state); if( n0==0||n1==0 ) { return; } ae_matrix_set_length(v, n0, n1, _state); for(i=0; i<=n0-1; i++) { for(j=0; j<=n1-1; j++) { ae_serializer_unserialize_double(s, &t, _state); v->ptr.pp_double[i][j] = t; } } } /************************************************************************* Copy integer array *************************************************************************/ void copyintegerarray(/* Integer */ ae_vector* src, /* Integer */ ae_vector* dst, ae_state *_state) { ae_int_t i; ae_vector_clear(dst); if( src->cnt>0 ) { ae_vector_set_length(dst, src->cnt, _state); for(i=0; i<=src->cnt-1; i++) { dst->ptr.p_int[i] = src->ptr.p_int[i]; } } } /************************************************************************* Copy real array *************************************************************************/ void copyrealarray(/* Real */ ae_vector* src, /* Real */ ae_vector* dst, ae_state *_state) { ae_int_t i; ae_vector_clear(dst); if( src->cnt>0 ) { ae_vector_set_length(dst, src->cnt, _state); for(i=0; 
i<=src->cnt-1; i++) { dst->ptr.p_double[i] = src->ptr.p_double[i]; } } } /************************************************************************* Copy real matrix *************************************************************************/ void copyrealmatrix(/* Real */ ae_matrix* src, /* Real */ ae_matrix* dst, ae_state *_state) { ae_int_t i; ae_int_t j; ae_matrix_clear(dst); if( src->rows>0&&src->cols>0 ) { ae_matrix_set_length(dst, src->rows, src->cols, _state); for(i=0; i<=src->rows-1; i++) { for(j=0; j<=src->cols-1; j++) { dst->ptr.pp_double[i][j] = src->ptr.pp_double[i][j]; } } } } /************************************************************************* Clears integer array *************************************************************************/ void unsetintegerarray(/* Integer */ ae_vector* a, ae_state *_state) { ae_vector_clear(a); } /************************************************************************* Clears real array *************************************************************************/ void unsetrealarray(/* Real */ ae_vector* a, ae_state *_state) { ae_vector_clear(a); } /************************************************************************* Clears real matrix *************************************************************************/ void unsetrealmatrix(/* Real */ ae_matrix* a, ae_state *_state) { ae_matrix_clear(a); } /************************************************************************* This function searches integer array. Elements in this array are actually records, each NRec elements wide. Each record has unique header - NHeader integer values, which identify it. Records are lexicographically sorted by header. Records are identified by their index, not offset (offset = NRec*index). This function searches A (records with indices [I0,I1)) for a record with header B. It returns index of this record (not offset!), or -1 on failure. -- ALGLIB -- Copyright 28.03.2011 by Bochkanov Sergey *************************************************************************/ ae_int_t recsearch(/* Integer */ ae_vector* a, ae_int_t nrec, ae_int_t nheader, ae_int_t i0, ae_int_t i1, /* Integer */ ae_vector* b, ae_state *_state) { ae_int_t mididx; ae_int_t cflag; ae_int_t k; ae_int_t offs; ae_int_t result; result = -1; for(;;) { if( i0>=i1 ) { break; } mididx = (i0+i1)/2; offs = nrec*mididx; cflag = 0; for(k=0; k<=nheader-1; k++) { if( a->ptr.p_int[offs+k]ptr.p_int[k] ) { cflag = -1; break; } if( a->ptr.p_int[offs+k]>b->ptr.p_int[k] ) { cflag = 1; break; } } if( cflag==0 ) { result = mididx; return result; } if( cflag<0 ) { i0 = mididx+1; } else { i1 = mididx; } } return result; } /************************************************************************* This function is used in parallel functions for recurrent division of large task into two smaller tasks. 
It has following properties: * it works only for TaskSize>=2 (assertion is thrown otherwise) * for TaskSize=2, it returns Task0=1, Task1=1 * in case TaskSize is odd, Task0=TaskSize-1, Task1=1 * in case TaskSize is even, Task0 and Task1 are approximately TaskSize/2 and both Task0 and Task1 are even, Task0>=Task1 -- ALGLIB -- Copyright 07.04.2013 by Bochkanov Sergey *************************************************************************/ void splitlengtheven(ae_int_t tasksize, ae_int_t* task0, ae_int_t* task1, ae_state *_state) { *task0 = 0; *task1 = 0; ae_assert(tasksize>=2, "SplitLengthEven: TaskSize<2", _state); if( tasksize==2 ) { *task0 = 1; *task1 = 1; return; } if( tasksize%2==0 ) { /* * Even division */ *task0 = tasksize/2; *task1 = tasksize/2; if( *task0%2!=0 ) { *task0 = *task0+1; *task1 = *task1-1; } } else { /* * Odd task size, split trailing odd part from it. */ *task0 = tasksize-1; *task1 = 1; } ae_assert(*task0>=1, "SplitLengthEven: internal error", _state); ae_assert(*task1>=1, "SplitLengthEven: internal error", _state); } /************************************************************************* This function is used in parallel functions for recurrent division of large task into two smaller tasks. It has following properties: * it works only for TaskSize>=2 and ChunkSize>=2 (assertion is thrown otherwise) * Task0+Task1=TaskSize, Task0>0, Task1>0 * Task0 and Task1 are close to each other * in case TaskSize>ChunkSize, Task0 is always divisible by ChunkSize -- ALGLIB -- Copyright 07.04.2013 by Bochkanov Sergey *************************************************************************/ void splitlength(ae_int_t tasksize, ae_int_t chunksize, ae_int_t* task0, ae_int_t* task1, ae_state *_state) { *task0 = 0; *task1 = 0; ae_assert(chunksize>=2, "SplitLength: ChunkSize<2", _state); ae_assert(tasksize>=2, "SplitLength: TaskSize<2", _state); *task0 = tasksize/2; if( *task0>chunksize&&*task0%chunksize!=0 ) { *task0 = *task0-*task0%chunksize; } *task1 = tasksize-(*task0); ae_assert(*task0>=1, "SplitLength: internal error", _state); ae_assert(*task1>=1, "SplitLength: internal error", _state); } /************************************************************************* This function is used to calculate number of chunks (including partial, non-complete chunks) in some set. It expects that ChunkSize>=1, TaskSize>=0. Assertion is thrown otherwise. Function result is equivalent to Ceil(TaskSize/ChunkSize), but with guarantees that rounding errors won't ruin results. 
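For example, TaskSize=10 with ChunkSize=3 gives three complete chunks plus one partial chunk holding the remaining element, so the function returns 4.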
-- ALGLIB -- Copyright 21.01.2015 by Bochkanov Sergey *************************************************************************/ ae_int_t chunkscount(ae_int_t tasksize, ae_int_t chunksize, ae_state *_state) { ae_int_t result; ae_assert(tasksize>=0, "ChunksCount: TaskSize<0", _state); ae_assert(chunksize>=1, "ChunksCount: ChunkSize<1", _state); result = tasksize/chunksize; if( tasksize%chunksize!=0 ) { result = result+1; } return result; } void _apbuffers_init(void* _p, ae_state *_state) { apbuffers *p = (apbuffers*)_p; ae_touch_ptr((void*)p); ae_vector_init(&p->ba0, 0, DT_BOOL, _state); ae_vector_init(&p->ia0, 0, DT_INT, _state); ae_vector_init(&p->ia1, 0, DT_INT, _state); ae_vector_init(&p->ia2, 0, DT_INT, _state); ae_vector_init(&p->ia3, 0, DT_INT, _state); ae_vector_init(&p->ra0, 0, DT_REAL, _state); ae_vector_init(&p->ra1, 0, DT_REAL, _state); ae_vector_init(&p->ra2, 0, DT_REAL, _state); ae_vector_init(&p->ra3, 0, DT_REAL, _state); ae_matrix_init(&p->rm0, 0, 0, DT_REAL, _state); ae_matrix_init(&p->rm1, 0, 0, DT_REAL, _state); } void _apbuffers_init_copy(void* _dst, void* _src, ae_state *_state) { apbuffers *dst = (apbuffers*)_dst; apbuffers *src = (apbuffers*)_src; ae_vector_init_copy(&dst->ba0, &src->ba0, _state); ae_vector_init_copy(&dst->ia0, &src->ia0, _state); ae_vector_init_copy(&dst->ia1, &src->ia1, _state); ae_vector_init_copy(&dst->ia2, &src->ia2, _state); ae_vector_init_copy(&dst->ia3, &src->ia3, _state); ae_vector_init_copy(&dst->ra0, &src->ra0, _state); ae_vector_init_copy(&dst->ra1, &src->ra1, _state); ae_vector_init_copy(&dst->ra2, &src->ra2, _state); ae_vector_init_copy(&dst->ra3, &src->ra3, _state); ae_matrix_init_copy(&dst->rm0, &src->rm0, _state); ae_matrix_init_copy(&dst->rm1, &src->rm1, _state); } void _apbuffers_clear(void* _p) { apbuffers *p = (apbuffers*)_p; ae_touch_ptr((void*)p); ae_vector_clear(&p->ba0); ae_vector_clear(&p->ia0); ae_vector_clear(&p->ia1); ae_vector_clear(&p->ia2); ae_vector_clear(&p->ia3); ae_vector_clear(&p->ra0); ae_vector_clear(&p->ra1); ae_vector_clear(&p->ra2); ae_vector_clear(&p->ra3); ae_matrix_clear(&p->rm0); ae_matrix_clear(&p->rm1); } void _apbuffers_destroy(void* _p) { apbuffers *p = (apbuffers*)_p; ae_touch_ptr((void*)p); ae_vector_destroy(&p->ba0); ae_vector_destroy(&p->ia0); ae_vector_destroy(&p->ia1); ae_vector_destroy(&p->ia2); ae_vector_destroy(&p->ia3); ae_vector_destroy(&p->ra0); ae_vector_destroy(&p->ra1); ae_vector_destroy(&p->ra2); ae_vector_destroy(&p->ra3); ae_matrix_destroy(&p->rm0); ae_matrix_destroy(&p->rm1); } void _sboolean_init(void* _p, ae_state *_state) { sboolean *p = (sboolean*)_p; ae_touch_ptr((void*)p); } void _sboolean_init_copy(void* _dst, void* _src, ae_state *_state) { sboolean *dst = (sboolean*)_dst; sboolean *src = (sboolean*)_src; dst->val = src->val; } void _sboolean_clear(void* _p) { sboolean *p = (sboolean*)_p; ae_touch_ptr((void*)p); } void _sboolean_destroy(void* _p) { sboolean *p = (sboolean*)_p; ae_touch_ptr((void*)p); } void _sbooleanarray_init(void* _p, ae_state *_state) { sbooleanarray *p = (sbooleanarray*)_p; ae_touch_ptr((void*)p); ae_vector_init(&p->val, 0, DT_BOOL, _state); } void _sbooleanarray_init_copy(void* _dst, void* _src, ae_state *_state) { sbooleanarray *dst = (sbooleanarray*)_dst; sbooleanarray *src = (sbooleanarray*)_src; ae_vector_init_copy(&dst->val, &src->val, _state); } void _sbooleanarray_clear(void* _p) { sbooleanarray *p = (sbooleanarray*)_p; ae_touch_ptr((void*)p); ae_vector_clear(&p->val); } void _sbooleanarray_destroy(void* _p) { sbooleanarray *p = 
(sbooleanarray*)_p; ae_touch_ptr((void*)p); ae_vector_destroy(&p->val); } void _sinteger_init(void* _p, ae_state *_state) { sinteger *p = (sinteger*)_p; ae_touch_ptr((void*)p); } void _sinteger_init_copy(void* _dst, void* _src, ae_state *_state) { sinteger *dst = (sinteger*)_dst; sinteger *src = (sinteger*)_src; dst->val = src->val; } void _sinteger_clear(void* _p) { sinteger *p = (sinteger*)_p; ae_touch_ptr((void*)p); } void _sinteger_destroy(void* _p) { sinteger *p = (sinteger*)_p; ae_touch_ptr((void*)p); } void _sintegerarray_init(void* _p, ae_state *_state) { sintegerarray *p = (sintegerarray*)_p; ae_touch_ptr((void*)p); ae_vector_init(&p->val, 0, DT_INT, _state); } void _sintegerarray_init_copy(void* _dst, void* _src, ae_state *_state) { sintegerarray *dst = (sintegerarray*)_dst; sintegerarray *src = (sintegerarray*)_src; ae_vector_init_copy(&dst->val, &src->val, _state); } void _sintegerarray_clear(void* _p) { sintegerarray *p = (sintegerarray*)_p; ae_touch_ptr((void*)p); ae_vector_clear(&p->val); } void _sintegerarray_destroy(void* _p) { sintegerarray *p = (sintegerarray*)_p; ae_touch_ptr((void*)p); ae_vector_destroy(&p->val); } void _sreal_init(void* _p, ae_state *_state) { sreal *p = (sreal*)_p; ae_touch_ptr((void*)p); } void _sreal_init_copy(void* _dst, void* _src, ae_state *_state) { sreal *dst = (sreal*)_dst; sreal *src = (sreal*)_src; dst->val = src->val; } void _sreal_clear(void* _p) { sreal *p = (sreal*)_p; ae_touch_ptr((void*)p); } void _sreal_destroy(void* _p) { sreal *p = (sreal*)_p; ae_touch_ptr((void*)p); } void _srealarray_init(void* _p, ae_state *_state) { srealarray *p = (srealarray*)_p; ae_touch_ptr((void*)p); ae_vector_init(&p->val, 0, DT_REAL, _state); } void _srealarray_init_copy(void* _dst, void* _src, ae_state *_state) { srealarray *dst = (srealarray*)_dst; srealarray *src = (srealarray*)_src; ae_vector_init_copy(&dst->val, &src->val, _state); } void _srealarray_clear(void* _p) { srealarray *p = (srealarray*)_p; ae_touch_ptr((void*)p); ae_vector_clear(&p->val); } void _srealarray_destroy(void* _p) { srealarray *p = (srealarray*)_p; ae_touch_ptr((void*)p); ae_vector_destroy(&p->val); } void _scomplex_init(void* _p, ae_state *_state) { scomplex *p = (scomplex*)_p; ae_touch_ptr((void*)p); } void _scomplex_init_copy(void* _dst, void* _src, ae_state *_state) { scomplex *dst = (scomplex*)_dst; scomplex *src = (scomplex*)_src; dst->val = src->val; } void _scomplex_clear(void* _p) { scomplex *p = (scomplex*)_p; ae_touch_ptr((void*)p); } void _scomplex_destroy(void* _p) { scomplex *p = (scomplex*)_p; ae_touch_ptr((void*)p); } void _scomplexarray_init(void* _p, ae_state *_state) { scomplexarray *p = (scomplexarray*)_p; ae_touch_ptr((void*)p); ae_vector_init(&p->val, 0, DT_COMPLEX, _state); } void _scomplexarray_init_copy(void* _dst, void* _src, ae_state *_state) { scomplexarray *dst = (scomplexarray*)_dst; scomplexarray *src = (scomplexarray*)_src; ae_vector_init_copy(&dst->val, &src->val, _state); } void _scomplexarray_clear(void* _p) { scomplexarray *p = (scomplexarray*)_p; ae_touch_ptr((void*)p); ae_vector_clear(&p->val); } void _scomplexarray_destroy(void* _p) { scomplexarray *p = (scomplexarray*)_p; ae_touch_ptr((void*)p); ae_vector_destroy(&p->val); } /************************************************************************* This function sorts array of real keys by ascending. Its results are: * sorted array A * permutation tables P1, P2 Algorithm outputs permutation tables using two formats: * as usual permutation of [0..N-1]. 
If P1[i]=j, then sorted A[i] contains value which was moved there from J-th position. * as a sequence of pairwise permutations. Sorted A[] may be obtained by swaping A[i] and A[P2[i]] for all i from 0 to N-1. INPUT PARAMETERS: A - unsorted array N - array size OUPUT PARAMETERS: A - sorted array P1, P2 - permutation tables, array[N] NOTES: this function assumes that A[] is finite; it doesn't checks that condition. All other conditions (size of input arrays, etc.) are not checked too. -- ALGLIB -- Copyright 14.05.2008 by Bochkanov Sergey *************************************************************************/ void tagsort(/* Real */ ae_vector* a, ae_int_t n, /* Integer */ ae_vector* p1, /* Integer */ ae_vector* p2, ae_state *_state) { ae_frame _frame_block; apbuffers buf; ae_frame_make(_state, &_frame_block); ae_vector_clear(p1); ae_vector_clear(p2); _apbuffers_init(&buf, _state); tagsortbuf(a, n, p1, p2, &buf, _state); ae_frame_leave(_state); } /************************************************************************* Buffered variant of TagSort, which accepts preallocated output arrays as well as special structure for buffered allocations. If arrays are too short, they are reallocated. If they are large enough, no memory allocation is done. It is intended to be used in the performance-critical parts of code, where additional allocations can lead to severe performance degradation -- ALGLIB -- Copyright 14.05.2008 by Bochkanov Sergey *************************************************************************/ void tagsortbuf(/* Real */ ae_vector* a, ae_int_t n, /* Integer */ ae_vector* p1, /* Integer */ ae_vector* p2, apbuffers* buf, ae_state *_state) { ae_int_t i; ae_int_t lv; ae_int_t lp; ae_int_t rv; ae_int_t rp; /* * Special cases */ if( n<=0 ) { return; } if( n==1 ) { ivectorsetlengthatleast(p1, 1, _state); ivectorsetlengthatleast(p2, 1, _state); p1->ptr.p_int[0] = 0; p2->ptr.p_int[0] = 0; return; } /* * General case, N>1: prepare permutations table P1 */ ivectorsetlengthatleast(p1, n, _state); for(i=0; i<=n-1; i++) { p1->ptr.p_int[i] = i; } /* * General case, N>1: sort, update P1 */ rvectorsetlengthatleast(&buf->ra0, n, _state); ivectorsetlengthatleast(&buf->ia0, n, _state); tagsortfasti(a, p1, &buf->ra0, &buf->ia0, n, _state); /* * General case, N>1: fill permutations table P2 * * To fill P2 we maintain two arrays: * * PV (Buf.IA0), Position(Value). PV[i] contains position of I-th key at the moment * * VP (Buf.IA1), Value(Position). VP[i] contains key which has position I at the moment * * At each step we making permutation of two items: * Left, which is given by position/value pair LP/LV * and Right, which is given by RP/RV * and updating PV[] and VP[] correspondingly. */ ivectorsetlengthatleast(&buf->ia0, n, _state); ivectorsetlengthatleast(&buf->ia1, n, _state); ivectorsetlengthatleast(p2, n, _state); for(i=0; i<=n-1; i++) { buf->ia0.ptr.p_int[i] = i; buf->ia1.ptr.p_int[i] = i; } for(i=0; i<=n-1; i++) { /* * calculate LP, LV, RP, RV */ lp = i; lv = buf->ia1.ptr.p_int[lp]; rv = p1->ptr.p_int[i]; rp = buf->ia0.ptr.p_int[rv]; /* * Fill P2 */ p2->ptr.p_int[i] = rp; /* * update PV and VP */ buf->ia1.ptr.p_int[lp] = rv; buf->ia1.ptr.p_int[rp] = lv; buf->ia0.ptr.p_int[lv] = rp; buf->ia0.ptr.p_int[rv] = lp; } } /************************************************************************* Same as TagSort, but optimized for real keys and integer labels. A is sorted, and same permutations are applied to B. NOTES: 1. this function assumes that A[] is finite; it doesn't checks that condition. 
All other conditions (size of input arrays, etc.) are not checked too. 2. this function uses two buffers, BufA and BufB, each is N elements large. They may be preallocated (which will save some time) or not, in which case function will automatically allocate memory. -- ALGLIB -- Copyright 11.12.2008 by Bochkanov Sergey *************************************************************************/ void tagsortfasti(/* Real */ ae_vector* a, /* Integer */ ae_vector* b, /* Real */ ae_vector* bufa, /* Integer */ ae_vector* bufb, ae_int_t n, ae_state *_state) { ae_int_t i; ae_int_t j; ae_bool isascending; ae_bool isdescending; double tmpr; ae_int_t tmpi; /* * Special case */ if( n<=1 ) { return; } /* * Test for already sorted set */ isascending = ae_true; isdescending = ae_true; for(i=1; i<=n-1; i++) { isascending = isascending&&a->ptr.p_double[i]>=a->ptr.p_double[i-1]; isdescending = isdescending&&a->ptr.p_double[i]<=a->ptr.p_double[i-1]; } if( isascending ) { return; } if( isdescending ) { for(i=0; i<=n-1; i++) { j = n-1-i; if( j<=i ) { break; } tmpr = a->ptr.p_double[i]; a->ptr.p_double[i] = a->ptr.p_double[j]; a->ptr.p_double[j] = tmpr; tmpi = b->ptr.p_int[i]; b->ptr.p_int[i] = b->ptr.p_int[j]; b->ptr.p_int[j] = tmpi; } return; } /* * General case */ if( bufa->cntcntptr.p_double[i]>=a->ptr.p_double[i-1]; isdescending = isdescending&&a->ptr.p_double[i]<=a->ptr.p_double[i-1]; } if( isascending ) { return; } if( isdescending ) { for(i=0; i<=n-1; i++) { j = n-1-i; if( j<=i ) { break; } tmpr = a->ptr.p_double[i]; a->ptr.p_double[i] = a->ptr.p_double[j]; a->ptr.p_double[j] = tmpr; tmpr = b->ptr.p_double[i]; b->ptr.p_double[i] = b->ptr.p_double[j]; b->ptr.p_double[j] = tmpr; } return; } /* * General case */ if( bufa->cntcntptr.p_double[i]>=a->ptr.p_double[i-1]; isdescending = isdescending&&a->ptr.p_double[i]<=a->ptr.p_double[i-1]; } if( isascending ) { return; } if( isdescending ) { for(i=0; i<=n-1; i++) { j = n-1-i; if( j<=i ) { break; } tmpr = a->ptr.p_double[i]; a->ptr.p_double[i] = a->ptr.p_double[j]; a->ptr.p_double[j] = tmpr; } return; } /* * General case */ if( bufa->cnt1: sort, update B */ i = 2; do { t = i; while(t!=1) { k = t/2; if( a->ptr.p_int[offset+k-1]>=a->ptr.p_int[offset+t-1] ) { t = 1; } else { tmp = a->ptr.p_int[offset+k-1]; a->ptr.p_int[offset+k-1] = a->ptr.p_int[offset+t-1]; a->ptr.p_int[offset+t-1] = tmp; tmpr = b->ptr.p_double[offset+k-1]; b->ptr.p_double[offset+k-1] = b->ptr.p_double[offset+t-1]; b->ptr.p_double[offset+t-1] = tmpr; t = k; } } i = i+1; } while(i<=n); i = n-1; do { tmp = a->ptr.p_int[offset+i]; a->ptr.p_int[offset+i] = a->ptr.p_int[offset+0]; a->ptr.p_int[offset+0] = tmp; tmpr = b->ptr.p_double[offset+i]; b->ptr.p_double[offset+i] = b->ptr.p_double[offset+0]; b->ptr.p_double[offset+0] = tmpr; t = 1; while(t!=0) { k = 2*t; if( k>i ) { t = 0; } else { if( kptr.p_int[offset+k]>a->ptr.p_int[offset+k-1] ) { k = k+1; } } if( a->ptr.p_int[offset+t-1]>=a->ptr.p_int[offset+k-1] ) { t = 0; } else { tmp = a->ptr.p_int[offset+k-1]; a->ptr.p_int[offset+k-1] = a->ptr.p_int[offset+t-1]; a->ptr.p_int[offset+t-1] = tmp; tmpr = b->ptr.p_double[offset+k-1]; b->ptr.p_double[offset+k-1] = b->ptr.p_double[offset+t-1]; b->ptr.p_double[offset+t-1] = tmpr; t = k; } } } i = i-1; } while(i>=1); } /************************************************************************* Heap operations: adds element to the heap PARAMETERS: A - heap itself, must be at least array[0..N] B - array of integer tags, which are updated according to permutations in the heap N - size of the heap (without new 
element). updated on output VA - value of the element being added VB - value of the tag -- ALGLIB -- Copyright 28.02.2010 by Bochkanov Sergey *************************************************************************/ void tagheappushi(/* Real */ ae_vector* a, /* Integer */ ae_vector* b, ae_int_t* n, double va, ae_int_t vb, ae_state *_state) { ae_int_t j; ae_int_t k; double v; if( *n<0 ) { return; } /* * N=0 is a special case */ if( *n==0 ) { a->ptr.p_double[0] = va; b->ptr.p_int[0] = vb; *n = *n+1; return; } /* * add current point to the heap * (add to the bottom, then move up) * * we don't write the point to the heap * until its final position is determined * (it allows us to reduce number of array access operations) */ j = *n; *n = *n+1; while(j>0) { k = (j-1)/2; v = a->ptr.p_double[k]; if( ae_fp_less(v,va) ) { /* * swap with higher element */ a->ptr.p_double[j] = v; b->ptr.p_int[j] = b->ptr.p_int[k]; j = k; } else { /* * element in its place. terminate. */ break; } } a->ptr.p_double[j] = va; b->ptr.p_int[j] = vb; } /************************************************************************* Heap operations: replaces top element with new element (which is moved down) PARAMETERS: A - heap itself, must be at least array[0..N-1] B - array of integer tags, which are updated according to permutations in the heap N - size of the heap VA - value of the element which replaces top element VB - value of the tag -- ALGLIB -- Copyright 28.02.2010 by Bochkanov Sergey *************************************************************************/ void tagheapreplacetopi(/* Real */ ae_vector* a, /* Integer */ ae_vector* b, ae_int_t n, double va, ae_int_t vb, ae_state *_state) { ae_int_t j; ae_int_t k1; ae_int_t k2; double v; double v1; double v2; if( n<1 ) { return; } /* * N=1 is a special case */ if( n==1 ) { a->ptr.p_double[0] = va; b->ptr.p_int[0] = vb; return; } /* * move down through heap: * * J - current element * * K1 - first child (always exists) * * K2 - second child (may not exist) * * we don't write the point to the heap * until its final position is determined * (it allows us to reduce number of array access operations) */ j = 0; k1 = 1; k2 = 2; while(k1<n) { if( k2>=n ) { /* * only one child. * * swap and terminate (because this child * has no siblings due to heap structure) */ v = a->ptr.p_double[k1]; if( ae_fp_greater(v,va) ) { a->ptr.p_double[j] = v; b->ptr.p_int[j] = b->ptr.p_int[k1]; j = k1; } break; } else { /* * two children */ v1 = a->ptr.p_double[k1]; v2 = a->ptr.p_double[k2]; if( ae_fp_greater(v1,v2) ) { if( ae_fp_less(va,v1) ) { a->ptr.p_double[j] = v1; b->ptr.p_int[j] = b->ptr.p_int[k1]; j = k1; } else { break; } } else { if( ae_fp_less(va,v2) ) { a->ptr.p_double[j] = v2; b->ptr.p_int[j] = b->ptr.p_int[k2]; j = k2; } else { break; } } k1 = 2*j+1; k2 = 2*j+2; } } a->ptr.p_double[j] = va; b->ptr.p_int[j] = vb; } /************************************************************************* Heap operations: pops top element from the heap PARAMETERS: A - heap itself, must be at least array[0..N-1] B - array of integer tags, which are updated according to permutations in the heap N - size of the heap, N>=1 On output top element is moved to A[N-1], B[N-1], heap is reordered, N is decreased by 1. 
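Note: the heap maintained by TagHeapPushI, TagHeapReplaceTopI and TagHeapPopI is a max-heap (the largest key is kept at A[0]), so popping all N elements one after another leaves A[0..N-1] sorted in ascending order, which is the usual heapsort behaviour.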
-- ALGLIB -- Copyright 28.02.2010 by Bochkanov Sergey *************************************************************************/ void tagheappopi(/* Real */ ae_vector* a, /* Integer */ ae_vector* b, ae_int_t* n, ae_state *_state) { double va; ae_int_t vb; if( *n<1 ) { return; } /* * N=1 is a special case */ if( *n==1 ) { *n = 0; return; } /* * swap top element and last element, * then reorder heap */ va = a->ptr.p_double[*n-1]; vb = b->ptr.p_int[*n-1]; a->ptr.p_double[*n-1] = a->ptr.p_double[0]; b->ptr.p_int[*n-1] = b->ptr.p_int[0]; *n = *n-1; tagheapreplacetopi(a, b, *n, va, vb, _state); } /************************************************************************* Search for the first element not less than T in a sorted array. PARAMETERS: A - array sorted in ascending order, indices 0 to N-1 N - number of elements in array T - the desired element RESULT: Index of the first element which is not less than T. If there is no such element, returns N. *************************************************************************/ ae_int_t lowerbound(/* Real */ ae_vector* a, ae_int_t n, double t, ae_state *_state) { ae_int_t l; ae_int_t half; ae_int_t first; ae_int_t middle; ae_int_t result; l = n; first = 0; while(l>0) { half = l/2; middle = first+half; if( ae_fp_less(a->ptr.p_double[middle],t) ) { first = middle+1; l = l-half-1; } else { l = half; } } result = first; return result; } /************************************************************************* Search for the first element greater than T in a sorted array. PARAMETERS: A - array sorted in ascending order, indices 0 to N-1 N - number of elements in array T - the desired element RESULT: Index of the first element which is greater than T. If there is no such element, returns N. *************************************************************************/ ae_int_t upperbound(/* Real */ ae_vector* a, ae_int_t n, double t, ae_state *_state) { ae_int_t l; ae_int_t half; ae_int_t first; ae_int_t middle; ae_int_t result; l = n; first = 0; while(l>0) { half = l/2; middle = first+half; if( ae_fp_less(t,a->ptr.p_double[middle]) ) { l = half; } else { first = middle+1; l = l-half-1; } } result = first; return result; } /************************************************************************* Internal TagSortFastI: sorts A[I1...I2] (both bounds are included), applies same permutations to B. -- ALGLIB -- Copyright 06.09.2010 by Bochkanov Sergey *************************************************************************/ static void tsort_tagsortfastirec(/* Real */ ae_vector* a, /* Integer */ ae_vector* b, /* Real */ ae_vector* bufa, /* Integer */ ae_vector* bufb, ae_int_t i1, ae_int_t i2, ae_state *_state) { ae_int_t i; ae_int_t j; ae_int_t k; ae_int_t cntless; ae_int_t cnteq; ae_int_t cntgreater; double tmpr; ae_int_t tmpi; double v0; double v1; double v2; double vp; /* * Fast exit */ if( i2<=i1 ) { return; } /* * Non-recursive sort for small arrays */ if( i2-i1<=16 ) { for(j=i1+1; j<=i2; j++) { /* * Search elements [I1..J-1] for place to insert Jth element. 
* * This code stops immediately if we can leave A[J] at J-th position * (all elements have same value of A[J] larger than any of them) */ tmpr = a->ptr.p_double[j]; tmpi = j; for(k=j-1; k>=i1; k--) { if( a->ptr.p_double[k]<=tmpr ) { break; } tmpi = k; } k = tmpi; /* * Insert Jth element into Kth position */ if( k!=j ) { tmpr = a->ptr.p_double[j]; tmpi = b->ptr.p_int[j]; for(i=j-1; i>=k; i--) { a->ptr.p_double[i+1] = a->ptr.p_double[i]; b->ptr.p_int[i+1] = b->ptr.p_int[i]; } a->ptr.p_double[k] = tmpr; b->ptr.p_int[k] = tmpi; } } return; } /* * Quicksort: choose pivot * Here we assume that I2-I1>=2 */ v0 = a->ptr.p_double[i1]; v1 = a->ptr.p_double[i1+(i2-i1)/2]; v2 = a->ptr.p_double[i2]; if( v0>v1 ) { tmpr = v1; v1 = v0; v0 = tmpr; } if( v1>v2 ) { tmpr = v2; v2 = v1; v1 = tmpr; } if( v0>v1 ) { tmpr = v1; v1 = v0; v0 = tmpr; } vp = v1; /* * now pass through A/B and: * * move elements that are LESS than VP to the left of A/B * * move elements that are EQUAL to VP to the right of BufA/BufB (in the reverse order) * * move elements that are GREATER than VP to the left of BufA/BufB (in the normal order * * move elements from the tail of BufA/BufB to the middle of A/B (restoring normal order) * * move elements from the left of BufA/BufB to the end of A/B */ cntless = 0; cnteq = 0; cntgreater = 0; for(i=i1; i<=i2; i++) { v0 = a->ptr.p_double[i]; if( v0ptr.p_double[k] = v0; b->ptr.p_int[k] = b->ptr.p_int[i]; } cntless = cntless+1; continue; } if( v0==vp ) { /* * EQUAL */ k = i2-cnteq; bufa->ptr.p_double[k] = v0; bufb->ptr.p_int[k] = b->ptr.p_int[i]; cnteq = cnteq+1; continue; } /* * GREATER */ k = i1+cntgreater; bufa->ptr.p_double[k] = v0; bufb->ptr.p_int[k] = b->ptr.p_int[i]; cntgreater = cntgreater+1; } for(i=0; i<=cnteq-1; i++) { j = i1+cntless+cnteq-1-i; k = i2+i-(cnteq-1); a->ptr.p_double[j] = bufa->ptr.p_double[k]; b->ptr.p_int[j] = bufb->ptr.p_int[k]; } for(i=0; i<=cntgreater-1; i++) { j = i1+cntless+cnteq+i; k = i1+i; a->ptr.p_double[j] = bufa->ptr.p_double[k]; b->ptr.p_int[j] = bufb->ptr.p_int[k]; } /* * Sort left and right parts of the array (ignoring middle part) */ tsort_tagsortfastirec(a, b, bufa, bufb, i1, i1+cntless-1, _state); tsort_tagsortfastirec(a, b, bufa, bufb, i1+cntless+cnteq, i2, _state); } /************************************************************************* Internal TagSortFastR: sorts A[I1...I2] (both bounds are included), applies same permutations to B. -- ALGLIB -- Copyright 06.09.2010 by Bochkanov Sergey *************************************************************************/ static void tsort_tagsortfastrrec(/* Real */ ae_vector* a, /* Real */ ae_vector* b, /* Real */ ae_vector* bufa, /* Real */ ae_vector* bufb, ae_int_t i1, ae_int_t i2, ae_state *_state) { ae_int_t i; ae_int_t j; ae_int_t k; double tmpr; double tmpr2; ae_int_t tmpi; ae_int_t cntless; ae_int_t cnteq; ae_int_t cntgreater; double v0; double v1; double v2; double vp; /* * Fast exit */ if( i2<=i1 ) { return; } /* * Non-recursive sort for small arrays */ if( i2-i1<=16 ) { for(j=i1+1; j<=i2; j++) { /* * Search elements [I1..J-1] for place to insert Jth element. 
* * This code stops immediatly if we can leave A[J] at J-th position * (all elements have same value of A[J] larger than any of them) */ tmpr = a->ptr.p_double[j]; tmpi = j; for(k=j-1; k>=i1; k--) { if( a->ptr.p_double[k]<=tmpr ) { break; } tmpi = k; } k = tmpi; /* * Insert Jth element into Kth position */ if( k!=j ) { tmpr = a->ptr.p_double[j]; tmpr2 = b->ptr.p_double[j]; for(i=j-1; i>=k; i--) { a->ptr.p_double[i+1] = a->ptr.p_double[i]; b->ptr.p_double[i+1] = b->ptr.p_double[i]; } a->ptr.p_double[k] = tmpr; b->ptr.p_double[k] = tmpr2; } } return; } /* * Quicksort: choose pivot * Here we assume that I2-I1>=16 */ v0 = a->ptr.p_double[i1]; v1 = a->ptr.p_double[i1+(i2-i1)/2]; v2 = a->ptr.p_double[i2]; if( v0>v1 ) { tmpr = v1; v1 = v0; v0 = tmpr; } if( v1>v2 ) { tmpr = v2; v2 = v1; v1 = tmpr; } if( v0>v1 ) { tmpr = v1; v1 = v0; v0 = tmpr; } vp = v1; /* * now pass through A/B and: * * move elements that are LESS than VP to the left of A/B * * move elements that are EQUAL to VP to the right of BufA/BufB (in the reverse order) * * move elements that are GREATER than VP to the left of BufA/BufB (in the normal order * * move elements from the tail of BufA/BufB to the middle of A/B (restoring normal order) * * move elements from the left of BufA/BufB to the end of A/B */ cntless = 0; cnteq = 0; cntgreater = 0; for(i=i1; i<=i2; i++) { v0 = a->ptr.p_double[i]; if( v0ptr.p_double[k] = v0; b->ptr.p_double[k] = b->ptr.p_double[i]; } cntless = cntless+1; continue; } if( v0==vp ) { /* * EQUAL */ k = i2-cnteq; bufa->ptr.p_double[k] = v0; bufb->ptr.p_double[k] = b->ptr.p_double[i]; cnteq = cnteq+1; continue; } /* * GREATER */ k = i1+cntgreater; bufa->ptr.p_double[k] = v0; bufb->ptr.p_double[k] = b->ptr.p_double[i]; cntgreater = cntgreater+1; } for(i=0; i<=cnteq-1; i++) { j = i1+cntless+cnteq-1-i; k = i2+i-(cnteq-1); a->ptr.p_double[j] = bufa->ptr.p_double[k]; b->ptr.p_double[j] = bufb->ptr.p_double[k]; } for(i=0; i<=cntgreater-1; i++) { j = i1+cntless+cnteq+i; k = i1+i; a->ptr.p_double[j] = bufa->ptr.p_double[k]; b->ptr.p_double[j] = bufb->ptr.p_double[k]; } /* * Sort left and right parts of the array (ignoring middle part) */ tsort_tagsortfastrrec(a, b, bufa, bufb, i1, i1+cntless-1, _state); tsort_tagsortfastrrec(a, b, bufa, bufb, i1+cntless+cnteq, i2, _state); } /************************************************************************* Internal TagSortFastI: sorts A[I1...I2] (both bounds are included), applies same permutations to B. -- ALGLIB -- Copyright 06.09.2010 by Bochkanov Sergey *************************************************************************/ static void tsort_tagsortfastrec(/* Real */ ae_vector* a, /* Real */ ae_vector* bufa, ae_int_t i1, ae_int_t i2, ae_state *_state) { ae_int_t cntless; ae_int_t cnteq; ae_int_t cntgreater; ae_int_t i; ae_int_t j; ae_int_t k; double tmpr; ae_int_t tmpi; double v0; double v1; double v2; double vp; /* * Fast exit */ if( i2<=i1 ) { return; } /* * Non-recursive sort for small arrays */ if( i2-i1<=16 ) { for(j=i1+1; j<=i2; j++) { /* * Search elements [I1..J-1] for place to insert Jth element. 
* * This code stops immediatly if we can leave A[J] at J-th position * (all elements have same value of A[J] larger than any of them) */ tmpr = a->ptr.p_double[j]; tmpi = j; for(k=j-1; k>=i1; k--) { if( a->ptr.p_double[k]<=tmpr ) { break; } tmpi = k; } k = tmpi; /* * Insert Jth element into Kth position */ if( k!=j ) { tmpr = a->ptr.p_double[j]; for(i=j-1; i>=k; i--) { a->ptr.p_double[i+1] = a->ptr.p_double[i]; } a->ptr.p_double[k] = tmpr; } } return; } /* * Quicksort: choose pivot * Here we assume that I2-I1>=16 */ v0 = a->ptr.p_double[i1]; v1 = a->ptr.p_double[i1+(i2-i1)/2]; v2 = a->ptr.p_double[i2]; if( v0>v1 ) { tmpr = v1; v1 = v0; v0 = tmpr; } if( v1>v2 ) { tmpr = v2; v2 = v1; v1 = tmpr; } if( v0>v1 ) { tmpr = v1; v1 = v0; v0 = tmpr; } vp = v1; /* * now pass through A/B and: * * move elements that are LESS than VP to the left of A/B * * move elements that are EQUAL to VP to the right of BufA/BufB (in the reverse order) * * move elements that are GREATER than VP to the left of BufA/BufB (in the normal order * * move elements from the tail of BufA/BufB to the middle of A/B (restoring normal order) * * move elements from the left of BufA/BufB to the end of A/B */ cntless = 0; cnteq = 0; cntgreater = 0; for(i=i1; i<=i2; i++) { v0 = a->ptr.p_double[i]; if( v0ptr.p_double[k] = v0; } cntless = cntless+1; continue; } if( v0==vp ) { /* * EQUAL */ k = i2-cnteq; bufa->ptr.p_double[k] = v0; cnteq = cnteq+1; continue; } /* * GREATER */ k = i1+cntgreater; bufa->ptr.p_double[k] = v0; cntgreater = cntgreater+1; } for(i=0; i<=cnteq-1; i++) { j = i1+cntless+cnteq-1-i; k = i2+i-(cnteq-1); a->ptr.p_double[j] = bufa->ptr.p_double[k]; } for(i=0; i<=cntgreater-1; i++) { j = i1+cntless+cnteq+i; k = i1+i; a->ptr.p_double[j] = bufa->ptr.p_double[k]; } /* * Sort left and right parts of the array (ignoring middle part) */ tsort_tagsortfastrec(a, bufa, i1, i1+cntless-1, _state); tsort_tagsortfastrec(a, bufa, i1+cntless+cnteq, i2, _state); } /************************************************************************* Generation of an elementary reflection transformation The subroutine generates elementary reflection H of order N, so that, for a given X, the following equality holds true: ( X(1) ) ( Beta ) H * ( .. ) = ( 0 ) ( X(n) ) ( 0 ) where ( V(1) ) H = 1 - Tau * ( .. ) * ( V(1), ..., V(n) ) ( V(n) ) where the first component of vector V equals 1. Input parameters: X - vector. Array whose index ranges within [1..N]. N - reflection order. Output parameters: X - components from 2 to N are replaced with vector V. The first component is replaced with parameter Beta. Tau - scalar value Tau. If X is a null vector, Tau equals 0, otherwise 1 <= Tau <= 2. This subroutine is the modification of the DLARFG subroutines from the LAPACK library. MODIFICATIONS: 24.12.2005 sign(Alpha) was replaced with an analogous to the Fortran SIGN code. -- LAPACK auxiliary routine (version 3.0) -- Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., Courant Institute, Argonne National Lab, and Rice University September 30, 1994 *************************************************************************/ void generatereflection(/* Real */ ae_vector* x, ae_int_t n, double* tau, ae_state *_state) { ae_int_t j; double alpha; double xnorm; double v; double beta; double mx; double s; *tau = 0; if( n<=1 ) { *tau = (double)(0); return; } /* * Scale if needed (to avoid overflow/underflow during intermediate * calculations). 
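* The vector is temporarily multiplied by 1/S, with S chosen so that the largest component magnitude MX falls into a numerically safe range; the first output component (Beta) is multiplied back by S at the end of the routine.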
*/ mx = (double)(0); for(j=1; j<=n; j++) { mx = ae_maxreal(ae_fabs(x->ptr.p_double[j], _state), mx, _state); } s = (double)(1); if( ae_fp_neq(mx,(double)(0)) ) { if( ae_fp_less_eq(mx,ae_minrealnumber/ae_machineepsilon) ) { s = ae_minrealnumber/ae_machineepsilon; v = 1/s; ae_v_muld(&x->ptr.p_double[1], 1, ae_v_len(1,n), v); mx = mx*v; } else { if( ae_fp_greater_eq(mx,ae_maxrealnumber*ae_machineepsilon) ) { s = ae_maxrealnumber*ae_machineepsilon; v = 1/s; ae_v_muld(&x->ptr.p_double[1], 1, ae_v_len(1,n), v); mx = mx*v; } } } /* * XNORM = DNRM2( N-1, X, INCX ) */ alpha = x->ptr.p_double[1]; xnorm = (double)(0); if( ae_fp_neq(mx,(double)(0)) ) { for(j=2; j<=n; j++) { xnorm = xnorm+ae_sqr(x->ptr.p_double[j]/mx, _state); } xnorm = ae_sqrt(xnorm, _state)*mx; } if( ae_fp_eq(xnorm,(double)(0)) ) { /* * H = I */ *tau = (double)(0); x->ptr.p_double[1] = x->ptr.p_double[1]*s; return; } /* * general case */ mx = ae_maxreal(ae_fabs(alpha, _state), ae_fabs(xnorm, _state), _state); beta = -mx*ae_sqrt(ae_sqr(alpha/mx, _state)+ae_sqr(xnorm/mx, _state), _state); if( ae_fp_less(alpha,(double)(0)) ) { beta = -beta; } *tau = (beta-alpha)/beta; v = 1/(alpha-beta); ae_v_muld(&x->ptr.p_double[2], 1, ae_v_len(2,n), v); x->ptr.p_double[1] = beta; /* * Scale back outputs */ x->ptr.p_double[1] = x->ptr.p_double[1]*s; } /************************************************************************* Application of an elementary reflection to a rectangular matrix of size MxN The algorithm pre-multiplies the matrix by an elementary reflection transformation which is given by column V and scalar Tau (see the description of the GenerateReflection procedure). Not the whole matrix but only a part of it is transformed (rows from M1 to M2, columns from N1 to N2). Only the elements of this submatrix are changed. Input parameters: C - matrix to be transformed. Tau - scalar defining the transformation. V - column defining the transformation. Array whose index ranges within [1..M2-M1+1]. M1, M2 - range of rows to be transformed. N1, N2 - range of columns to be transformed. WORK - working array whose indexes goes from N1 to N2. Output parameters: C - the result of multiplying the input matrix C by the transformation matrix which is given by Tau and V. If N1>N2 or M1>M2, C is not modified. -- LAPACK auxiliary routine (version 3.0) -- Univ. of Tennessee, Univ. 
of California Berkeley, NAG Ltd., Courant Institute, Argonne National Lab, and Rice University September 30, 1994 *************************************************************************/ void applyreflectionfromtheleft(/* Real */ ae_matrix* c, double tau, /* Real */ ae_vector* v, ae_int_t m1, ae_int_t m2, ae_int_t n1, ae_int_t n2, /* Real */ ae_vector* work, ae_state *_state) { double t; ae_int_t i; if( (ae_fp_eq(tau,(double)(0))||n1>n2)||m1>m2 ) { return; } /* * w := C' * v */ for(i=n1; i<=n2; i++) { work->ptr.p_double[i] = (double)(0); } for(i=m1; i<=m2; i++) { t = v->ptr.p_double[i+1-m1]; ae_v_addd(&work->ptr.p_double[n1], 1, &c->ptr.pp_double[i][n1], 1, ae_v_len(n1,n2), t); } /* * C := C - tau * v * w' */ for(i=m1; i<=m2; i++) { t = v->ptr.p_double[i-m1+1]*tau; ae_v_subd(&c->ptr.pp_double[i][n1], 1, &work->ptr.p_double[n1], 1, ae_v_len(n1,n2), t); } } /************************************************************************* Application of an elementary reflection to a rectangular matrix of size MxN The algorithm post-multiplies the matrix by an elementary reflection transformation which is given by column V and scalar Tau (see the description of the GenerateReflection procedure). Not the whole matrix but only a part of it is transformed (rows from M1 to M2, columns from N1 to N2). Only the elements of this submatrix are changed. Input parameters: C - matrix to be transformed. Tau - scalar defining the transformation. V - column defining the transformation. Array whose index ranges within [1..N2-N1+1]. M1, M2 - range of rows to be transformed. N1, N2 - range of columns to be transformed. WORK - working array whose indexes goes from M1 to M2. Output parameters: C - the result of multiplying the input matrix C by the transformation matrix which is given by Tau and V. If N1>N2 or M1>M2, C is not modified. -- LAPACK auxiliary routine (version 3.0) -- Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., Courant Institute, Argonne National Lab, and Rice University September 30, 1994 *************************************************************************/ void applyreflectionfromtheright(/* Real */ ae_matrix* c, double tau, /* Real */ ae_vector* v, ae_int_t m1, ae_int_t m2, ae_int_t n1, ae_int_t n2, /* Real */ ae_vector* work, ae_state *_state) { double t; ae_int_t i; ae_int_t vm; if( (ae_fp_eq(tau,(double)(0))||n1>n2)||m1>m2 ) { return; } vm = n2-n1+1; for(i=m1; i<=m2; i++) { t = ae_v_dotproduct(&c->ptr.pp_double[i][n1], 1, &v->ptr.p_double[1], 1, ae_v_len(n1,n2)); t = t*tau; ae_v_subd(&c->ptr.pp_double[i][n1], 1, &v->ptr.p_double[1], 1, ae_v_len(n1,n2), t); } /* * This line is necessary to avoid spurious compiler warnings */ touchint(&vm, _state); } /************************************************************************* Generation of an elementary complex reflection transformation The subroutine generates elementary complex reflection H of order N, so that, for a given X, the following equality holds true: ( X(1) ) ( Beta ) H' * ( .. ) = ( 0 ), H'*H = I, Beta is a real number ( X(n) ) ( 0 ) where ( V(1) ) H = 1 - Tau * ( .. ) * ( conj(V(1)), ..., conj(V(n)) ) ( V(n) ) where the first component of vector V equals 1. Input parameters: X - vector. Array with elements [1..N]. N - reflection order. Output parameters: X - components from 2 to N are replaced by vector V. The first component is replaced with parameter Beta. Tau - scalar value Tau. This subroutine is the modification of CLARFG subroutines from the LAPACK library. 
It has similar functionality except for the fact that it doesn't handle errors when intermediate results cause an overflow. -- LAPACK auxiliary routine (version 3.0) -- Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., Courant Institute, Argonne National Lab, and Rice University September 30, 1994 *************************************************************************/ void complexgeneratereflection(/* Complex */ ae_vector* x, ae_int_t n, ae_complex* tau, ae_state *_state) { ae_int_t j; ae_complex alpha; double alphi; double alphr; double beta; double xnorm; double mx; ae_complex t; double s; ae_complex v; tau->x = 0; tau->y = 0; if( n<=0 ) { *tau = ae_complex_from_i(0); return; } /* * Scale if needed (to avoid overflow/underflow during intermediate * calculations). */ mx = (double)(0); for(j=1; j<=n; j++) { mx = ae_maxreal(ae_c_abs(x->ptr.p_complex[j], _state), mx, _state); } s = (double)(1); if( ae_fp_neq(mx,(double)(0)) ) { if( ae_fp_less(mx,(double)(1)) ) { s = ae_sqrt(ae_minrealnumber, _state); v = ae_complex_from_d(1/s); ae_v_cmulc(&x->ptr.p_complex[1], 1, ae_v_len(1,n), v); } else { s = ae_sqrt(ae_maxrealnumber, _state); v = ae_complex_from_d(1/s); ae_v_cmulc(&x->ptr.p_complex[1], 1, ae_v_len(1,n), v); } } /* * calculate */ alpha = x->ptr.p_complex[1]; mx = (double)(0); for(j=2; j<=n; j++) { mx = ae_maxreal(ae_c_abs(x->ptr.p_complex[j], _state), mx, _state); } xnorm = (double)(0); if( ae_fp_neq(mx,(double)(0)) ) { for(j=2; j<=n; j++) { t = ae_c_div_d(x->ptr.p_complex[j],mx); xnorm = xnorm+ae_c_mul(t,ae_c_conj(t, _state)).x; } xnorm = ae_sqrt(xnorm, _state)*mx; } alphr = alpha.x; alphi = alpha.y; if( ae_fp_eq(xnorm,(double)(0))&&ae_fp_eq(alphi,(double)(0)) ) { *tau = ae_complex_from_i(0); x->ptr.p_complex[1] = ae_c_mul_d(x->ptr.p_complex[1],s); return; } mx = ae_maxreal(ae_fabs(alphr, _state), ae_fabs(alphi, _state), _state); mx = ae_maxreal(mx, ae_fabs(xnorm, _state), _state); beta = -mx*ae_sqrt(ae_sqr(alphr/mx, _state)+ae_sqr(alphi/mx, _state)+ae_sqr(xnorm/mx, _state), _state); if( ae_fp_less(alphr,(double)(0)) ) { beta = -beta; } tau->x = (beta-alphr)/beta; tau->y = -alphi/beta; alpha = ae_c_d_div(1,ae_c_sub_d(alpha,beta)); if( n>1 ) { ae_v_cmulc(&x->ptr.p_complex[2], 1, ae_v_len(2,n), alpha); } alpha = ae_complex_from_d(beta); x->ptr.p_complex[1] = alpha; /* * Scale back */ x->ptr.p_complex[1] = ae_c_mul_d(x->ptr.p_complex[1],s); } /************************************************************************* Application of an elementary reflection to a rectangular matrix of size MxN The algorithm pre-multiplies the matrix by an elementary reflection transformation which is given by column V and scalar Tau (see the description of the GenerateReflection). Not the whole matrix but only a part of it is transformed (rows from M1 to M2, columns from N1 to N2). Only the elements of this submatrix are changed. Note: the matrix is multiplied by H, not by H'. If it is required to multiply the matrix by H', it is necessary to pass Conj(Tau) instead of Tau. Input parameters: C - matrix to be transformed. Tau - scalar defining transformation. V - column defining transformation. Array whose index ranges within [1..M2-M1+1] M1, M2 - range of rows to be transformed. N1, N2 - range of columns to be transformed. WORK - working array whose index goes from N1 to N2. Output parameters: C - the result of multiplying the input matrix C by the transformation matrix which is given by Tau and V. If N1>N2 or M1>M2, C is not modified. -- LAPACK auxiliary routine (version 3.0) -- Univ. 
of Tennessee, Univ. of California Berkeley, NAG Ltd., Courant Institute, Argonne National Lab, and Rice University September 30, 1994 *************************************************************************/ void complexapplyreflectionfromtheleft(/* Complex */ ae_matrix* c, ae_complex tau, /* Complex */ ae_vector* v, ae_int_t m1, ae_int_t m2, ae_int_t n1, ae_int_t n2, /* Complex */ ae_vector* work, ae_state *_state) { ae_complex t; ae_int_t i; if( (ae_c_eq_d(tau,(double)(0))||n1>n2)||m1>m2 ) { return; } /* * w := C^T * conj(v) */ for(i=n1; i<=n2; i++) { work->ptr.p_complex[i] = ae_complex_from_i(0); } for(i=m1; i<=m2; i++) { t = ae_c_conj(v->ptr.p_complex[i+1-m1], _state); ae_v_caddc(&work->ptr.p_complex[n1], 1, &c->ptr.pp_complex[i][n1], 1, "N", ae_v_len(n1,n2), t); } /* * C := C - tau * v * w^T */ for(i=m1; i<=m2; i++) { t = ae_c_mul(v->ptr.p_complex[i-m1+1],tau); ae_v_csubc(&c->ptr.pp_complex[i][n1], 1, &work->ptr.p_complex[n1], 1, "N", ae_v_len(n1,n2), t); } } /************************************************************************* Application of an elementary reflection to a rectangular matrix of size MxN The algorithm post-multiplies the matrix by an elementary reflection transformation which is given by column V and scalar Tau (see the description of the GenerateReflection). Not the whole matrix but only a part of it is transformed (rows from M1 to M2, columns from N1 to N2). Only the elements of this submatrix are changed. Input parameters: C - matrix to be transformed. Tau - scalar defining transformation. V - column defining transformation. Array whose index ranges within [1..N2-N1+1] M1, M2 - range of rows to be transformed. N1, N2 - range of columns to be transformed. WORK - working array whose index goes from M1 to M2. Output parameters: C - the result of multiplying the input matrix C by the transformation matrix which is given by Tau and V. If N1>N2 or M1>M2, C is not modified. -- LAPACK auxiliary routine (version 3.0) -- Univ. of Tennessee, Univ. 
of California Berkeley, NAG Ltd., Courant Institute, Argonne National Lab, and Rice University September 30, 1994 *************************************************************************/ void complexapplyreflectionfromtheright(/* Complex */ ae_matrix* c, ae_complex tau, /* Complex */ ae_vector* v, ae_int_t m1, ae_int_t m2, ae_int_t n1, ae_int_t n2, /* Complex */ ae_vector* work, ae_state *_state) { ae_complex t; ae_int_t i; ae_int_t vm; if( (ae_c_eq_d(tau,(double)(0))||n1>n2)||m1>m2 ) { return; } /* * w := C * v */ vm = n2-n1+1; for(i=m1; i<=m2; i++) { t = ae_v_cdotproduct(&c->ptr.pp_complex[i][n1], 1, "N", &v->ptr.p_complex[1], 1, "N", ae_v_len(n1,n2)); work->ptr.p_complex[i] = t; } /* * C := C - w * conj(v^T) */ ae_v_cmove(&v->ptr.p_complex[1], 1, &v->ptr.p_complex[1], 1, "Conj", ae_v_len(1,vm)); for(i=m1; i<=m2; i++) { t = ae_c_mul(work->ptr.p_complex[i],tau); ae_v_csubc(&c->ptr.pp_complex[i][n1], 1, &v->ptr.p_complex[1], 1, "N", ae_v_len(n1,n2), t); } ae_v_cmove(&v->ptr.p_complex[1], 1, &v->ptr.p_complex[1], 1, "Conj", ae_v_len(1,vm)); } /************************************************************************* Fast kernel -- ALGLIB routine -- 19.01.2010 Bochkanov Sergey *************************************************************************/ ae_bool cmatrixrank1f(ae_int_t m, ae_int_t n, /* Complex */ ae_matrix* a, ae_int_t ia, ae_int_t ja, /* Complex */ ae_vector* u, ae_int_t iu, /* Complex */ ae_vector* v, ae_int_t iv, ae_state *_state) { #ifndef ALGLIB_INTERCEPTS_ABLAS ae_bool result; result = ae_false; return result; #else return _ialglib_i_cmatrixrank1f(m, n, a, ia, ja, u, iu, v, iv); #endif } /************************************************************************* Fast kernel -- ALGLIB routine -- 19.01.2010 Bochkanov Sergey *************************************************************************/ ae_bool rmatrixrank1f(ae_int_t m, ae_int_t n, /* Real */ ae_matrix* a, ae_int_t ia, ae_int_t ja, /* Real */ ae_vector* u, ae_int_t iu, /* Real */ ae_vector* v, ae_int_t iv, ae_state *_state) { #ifndef ALGLIB_INTERCEPTS_ABLAS ae_bool result; result = ae_false; return result; #else return _ialglib_i_rmatrixrank1f(m, n, a, ia, ja, u, iu, v, iv); #endif } /************************************************************************* Fast kernel -- ALGLIB routine -- 19.01.2010 Bochkanov Sergey *************************************************************************/ ae_bool cmatrixmvf(ae_int_t m, ae_int_t n, /* Complex */ ae_matrix* a, ae_int_t ia, ae_int_t ja, ae_int_t opa, /* Complex */ ae_vector* x, ae_int_t ix, /* Complex */ ae_vector* y, ae_int_t iy, ae_state *_state) { ae_bool result; result = ae_false; return result; } /************************************************************************* Fast kernel -- ALGLIB routine -- 19.01.2010 Bochkanov Sergey *************************************************************************/ ae_bool rmatrixmvf(ae_int_t m, ae_int_t n, /* Real */ ae_matrix* a, ae_int_t ia, ae_int_t ja, ae_int_t opa, /* Real */ ae_vector* x, ae_int_t ix, /* Real */ ae_vector* y, ae_int_t iy, ae_state *_state) { ae_bool result; result = ae_false; return result; } /************************************************************************* Fast kernel -- ALGLIB routine -- 19.01.2010 Bochkanov Sergey *************************************************************************/ ae_bool cmatrixrighttrsmf(ae_int_t m, ae_int_t n, /* Complex */ ae_matrix* a, ae_int_t i1, ae_int_t j1, ae_bool isupper, ae_bool isunit, ae_int_t optype, /* Complex */ ae_matrix* x, ae_int_t i2, 
ae_int_t j2, ae_state *_state) { #ifndef ALGLIB_INTERCEPTS_ABLAS ae_bool result; result = ae_false; return result; #else return _ialglib_i_cmatrixrighttrsmf(m, n, a, i1, j1, isupper, isunit, optype, x, i2, j2); #endif } /************************************************************************* Fast kernel -- ALGLIB routine -- 19.01.2010 Bochkanov Sergey *************************************************************************/ ae_bool cmatrixlefttrsmf(ae_int_t m, ae_int_t n, /* Complex */ ae_matrix* a, ae_int_t i1, ae_int_t j1, ae_bool isupper, ae_bool isunit, ae_int_t optype, /* Complex */ ae_matrix* x, ae_int_t i2, ae_int_t j2, ae_state *_state) { #ifndef ALGLIB_INTERCEPTS_ABLAS ae_bool result; result = ae_false; return result; #else return _ialglib_i_cmatrixlefttrsmf(m, n, a, i1, j1, isupper, isunit, optype, x, i2, j2); #endif } /************************************************************************* Fast kernel -- ALGLIB routine -- 19.01.2010 Bochkanov Sergey *************************************************************************/ ae_bool rmatrixrighttrsmf(ae_int_t m, ae_int_t n, /* Real */ ae_matrix* a, ae_int_t i1, ae_int_t j1, ae_bool isupper, ae_bool isunit, ae_int_t optype, /* Real */ ae_matrix* x, ae_int_t i2, ae_int_t j2, ae_state *_state) { #ifndef ALGLIB_INTERCEPTS_ABLAS ae_bool result; result = ae_false; return result; #else return _ialglib_i_rmatrixrighttrsmf(m, n, a, i1, j1, isupper, isunit, optype, x, i2, j2); #endif } /************************************************************************* Fast kernel -- ALGLIB routine -- 19.01.2010 Bochkanov Sergey *************************************************************************/ ae_bool rmatrixlefttrsmf(ae_int_t m, ae_int_t n, /* Real */ ae_matrix* a, ae_int_t i1, ae_int_t j1, ae_bool isupper, ae_bool isunit, ae_int_t optype, /* Real */ ae_matrix* x, ae_int_t i2, ae_int_t j2, ae_state *_state) { #ifndef ALGLIB_INTERCEPTS_ABLAS ae_bool result; result = ae_false; return result; #else return _ialglib_i_rmatrixlefttrsmf(m, n, a, i1, j1, isupper, isunit, optype, x, i2, j2); #endif } /************************************************************************* Fast kernel -- ALGLIB routine -- 19.01.2010 Bochkanov Sergey *************************************************************************/ ae_bool cmatrixherkf(ae_int_t n, ae_int_t k, double alpha, /* Complex */ ae_matrix* a, ae_int_t ia, ae_int_t ja, ae_int_t optypea, double beta, /* Complex */ ae_matrix* c, ae_int_t ic, ae_int_t jc, ae_bool isupper, ae_state *_state) { #ifndef ALGLIB_INTERCEPTS_ABLAS ae_bool result; result = ae_false; return result; #else return _ialglib_i_cmatrixherkf(n, k, alpha, a, ia, ja, optypea, beta, c, ic, jc, isupper); #endif } /************************************************************************* Fast kernel -- ALGLIB routine -- 19.01.2010 Bochkanov Sergey *************************************************************************/ ae_bool rmatrixsyrkf(ae_int_t n, ae_int_t k, double alpha, /* Real */ ae_matrix* a, ae_int_t ia, ae_int_t ja, ae_int_t optypea, double beta, /* Real */ ae_matrix* c, ae_int_t ic, ae_int_t jc, ae_bool isupper, ae_state *_state) { #ifndef ALGLIB_INTERCEPTS_ABLAS ae_bool result; result = ae_false; return result; #else return _ialglib_i_rmatrixsyrkf(n, k, alpha, a, ia, ja, optypea, beta, c, ic, jc, isupper); #endif } /************************************************************************* Fast kernel -- ALGLIB routine -- 19.01.2010 Bochkanov Sergey *************************************************************************/ 
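/* Note: the *F fast-kernel entry points in this unit are thin dispatchers. When ALGLIB_INTERCEPTS_ABLAS is not defined they simply return ae_false, which tells the generic kernels (CMatrixGEMMK, RMatrixGEMMK and similar routines) to perform the computation themselves; when the intercept is compiled in, they forward to the corresponding _ialglib_i_* implementation. */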
ae_bool rmatrixgemmf(ae_int_t m, ae_int_t n, ae_int_t k, double alpha, /* Real */ ae_matrix* a, ae_int_t ia, ae_int_t ja, ae_int_t optypea, /* Real */ ae_matrix* b, ae_int_t ib, ae_int_t jb, ae_int_t optypeb, double beta, /* Real */ ae_matrix* c, ae_int_t ic, ae_int_t jc, ae_state *_state) { #ifndef ALGLIB_INTERCEPTS_ABLAS ae_bool result; result = ae_false; return result; #else return _ialglib_i_rmatrixgemmf(m, n, k, alpha, a, ia, ja, optypea, b, ib, jb, optypeb, beta, c, ic, jc); #endif } /************************************************************************* Fast kernel -- ALGLIB routine -- 19.01.2010 Bochkanov Sergey *************************************************************************/ ae_bool cmatrixgemmf(ae_int_t m, ae_int_t n, ae_int_t k, ae_complex alpha, /* Complex */ ae_matrix* a, ae_int_t ia, ae_int_t ja, ae_int_t optypea, /* Complex */ ae_matrix* b, ae_int_t ib, ae_int_t jb, ae_int_t optypeb, ae_complex beta, /* Complex */ ae_matrix* c, ae_int_t ic, ae_int_t jc, ae_state *_state) { #ifndef ALGLIB_INTERCEPTS_ABLAS ae_bool result; result = ae_false; return result; #else return _ialglib_i_cmatrixgemmf(m, n, k, alpha, a, ia, ja, optypea, b, ib, jb, optypeb, beta, c, ic, jc); #endif } /************************************************************************* CMatrixGEMM kernel, basecase code for CMatrixGEMM. This subroutine calculates C = alpha*op1(A)*op2(B) +beta*C where: * C is MxN general matrix * op1(A) is MxK matrix * op2(B) is KxN matrix * "op" may be identity transformation, transposition, conjugate transposition Additional info: * multiplication result replaces C. If Beta=0, C elements are not used in calculations (not multiplied by zero - just not referenced) * if Alpha=0, A is not used (not multiplied by zero - just not referenced) * if both Beta and Alpha are zero, C is filled by zeros. IMPORTANT: This function does NOT preallocate output matrix C, it MUST be preallocated by caller prior to calling this function. In case C does not have enough space to store result, exception will be generated. 
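For example, calling the kernel with OpTypeA=2 and OpTypeB=0 computes C := alpha*A^H*B + beta*C, where A^H denotes the conjugate transpose of the submatrix of A selected by IA/JA.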
INPUT PARAMETERS M - matrix size, M>0 N - matrix size, N>0 K - matrix size, K>0 Alpha - coefficient A - matrix IA - submatrix offset JA - submatrix offset OpTypeA - transformation type: * 0 - no transformation * 1 - transposition * 2 - conjugate transposition B - matrix IB - submatrix offset JB - submatrix offset OpTypeB - transformation type: * 0 - no transformation * 1 - transposition * 2 - conjugate transposition Beta - coefficient C - PREALLOCATED output matrix IC - submatrix offset JC - submatrix offset -- ALGLIB routine -- 27.03.2013 Bochkanov Sergey *************************************************************************/ void cmatrixgemmk(ae_int_t m, ae_int_t n, ae_int_t k, ae_complex alpha, /* Complex */ ae_matrix* a, ae_int_t ia, ae_int_t ja, ae_int_t optypea, /* Complex */ ae_matrix* b, ae_int_t ib, ae_int_t jb, ae_int_t optypeb, ae_complex beta, /* Complex */ ae_matrix* c, ae_int_t ic, ae_int_t jc, ae_state *_state) { ae_int_t i; ae_int_t j; ae_complex v; ae_complex v00; ae_complex v01; ae_complex v10; ae_complex v11; double v00x; double v00y; double v01x; double v01y; double v10x; double v10y; double v11x; double v11y; double a0x; double a0y; double a1x; double a1y; double b0x; double b0y; double b1x; double b1y; ae_int_t idxa0; ae_int_t idxa1; ae_int_t idxb0; ae_int_t idxb1; ae_int_t i0; ae_int_t i1; ae_int_t ik; ae_int_t j0; ae_int_t j1; ae_int_t jk; ae_int_t t; ae_int_t offsa; ae_int_t offsb; /* * if matrix size is zero */ if( m==0||n==0 ) { return; } /* * Try optimized code */ if( cmatrixgemmf(m, n, k, alpha, a, ia, ja, optypea, b, ib, jb, optypeb, beta, c, ic, jc, _state) ) { return; } /* * if K=0 or Alpha=0, then C=Beta*C */ if( k==0||ae_c_eq_d(alpha,(double)(0)) ) { if( ae_c_neq_d(beta,(double)(1)) ) { if( ae_c_neq_d(beta,(double)(0)) ) { for(i=0; i<=m-1; i++) { for(j=0; j<=n-1; j++) { c->ptr.pp_complex[ic+i][jc+j] = ae_c_mul(beta,c->ptr.pp_complex[ic+i][jc+j]); } } } else { for(i=0; i<=m-1; i++) { for(j=0; j<=n-1; j++) { c->ptr.pp_complex[ic+i][jc+j] = ae_complex_from_i(0); } } } } return; } /* * This phase is not really necessary, but compiler complains * about "possibly uninitialized variables" */ a0x = (double)(0); a0y = (double)(0); a1x = (double)(0); a1y = (double)(0); b0x = (double)(0); b0y = (double)(0); b1x = (double)(0); b1y = (double)(0); /* * General case */ i = 0; while(iptr.pp_complex[idxa0][offsa].x; a0y = a->ptr.pp_complex[idxa0][offsa].y; a1x = a->ptr.pp_complex[idxa1][offsa].x; a1y = a->ptr.pp_complex[idxa1][offsa].y; } if( optypea==1 ) { a0x = a->ptr.pp_complex[offsa][idxa0].x; a0y = a->ptr.pp_complex[offsa][idxa0].y; a1x = a->ptr.pp_complex[offsa][idxa1].x; a1y = a->ptr.pp_complex[offsa][idxa1].y; } if( optypea==2 ) { a0x = a->ptr.pp_complex[offsa][idxa0].x; a0y = -a->ptr.pp_complex[offsa][idxa0].y; a1x = a->ptr.pp_complex[offsa][idxa1].x; a1y = -a->ptr.pp_complex[offsa][idxa1].y; } if( optypeb==0 ) { b0x = b->ptr.pp_complex[offsb][idxb0].x; b0y = b->ptr.pp_complex[offsb][idxb0].y; b1x = b->ptr.pp_complex[offsb][idxb1].x; b1y = b->ptr.pp_complex[offsb][idxb1].y; } if( optypeb==1 ) { b0x = b->ptr.pp_complex[idxb0][offsb].x; b0y = b->ptr.pp_complex[idxb0][offsb].y; b1x = b->ptr.pp_complex[idxb1][offsb].x; b1y = b->ptr.pp_complex[idxb1][offsb].y; } if( optypeb==2 ) { b0x = b->ptr.pp_complex[idxb0][offsb].x; b0y = -b->ptr.pp_complex[idxb0][offsb].y; b1x = b->ptr.pp_complex[idxb1][offsb].x; b1y = -b->ptr.pp_complex[idxb1][offsb].y; } v00x = v00x+a0x*b0x-a0y*b0y; v00y = v00y+a0x*b0y+a0y*b0x; v01x = v01x+a0x*b1x-a0y*b1y; v01y = v01y+a0x*b1y+a0y*b1x; v10x = 
v10x+a1x*b0x-a1y*b0y; v10y = v10y+a1x*b0y+a1y*b0x; v11x = v11x+a1x*b1x-a1y*b1y; v11y = v11y+a1x*b1y+a1y*b1x; offsa = offsa+1; offsb = offsb+1; } v00.x = v00x; v00.y = v00y; v10.x = v10x; v10.y = v10y; v01.x = v01x; v01.y = v01y; v11.x = v11x; v11.y = v11y; if( ae_c_eq_d(beta,(double)(0)) ) { c->ptr.pp_complex[ic+i+0][jc+j+0] = ae_c_mul(alpha,v00); c->ptr.pp_complex[ic+i+0][jc+j+1] = ae_c_mul(alpha,v01); c->ptr.pp_complex[ic+i+1][jc+j+0] = ae_c_mul(alpha,v10); c->ptr.pp_complex[ic+i+1][jc+j+1] = ae_c_mul(alpha,v11); } else { c->ptr.pp_complex[ic+i+0][jc+j+0] = ae_c_add(ae_c_mul(beta,c->ptr.pp_complex[ic+i+0][jc+j+0]),ae_c_mul(alpha,v00)); c->ptr.pp_complex[ic+i+0][jc+j+1] = ae_c_add(ae_c_mul(beta,c->ptr.pp_complex[ic+i+0][jc+j+1]),ae_c_mul(alpha,v01)); c->ptr.pp_complex[ic+i+1][jc+j+0] = ae_c_add(ae_c_mul(beta,c->ptr.pp_complex[ic+i+1][jc+j+0]),ae_c_mul(alpha,v10)); c->ptr.pp_complex[ic+i+1][jc+j+1] = ae_c_add(ae_c_mul(beta,c->ptr.pp_complex[ic+i+1][jc+j+1]),ae_c_mul(alpha,v11)); } } else { /* * Determine submatrix [I0..I1]x[J0..J1] to process */ i0 = i; i1 = ae_minint(i+1, m-1, _state); j0 = j; j1 = ae_minint(j+1, n-1, _state); /* * Process submatrix */ for(ik=i0; ik<=i1; ik++) { for(jk=j0; jk<=j1; jk++) { if( k==0||ae_c_eq_d(alpha,(double)(0)) ) { v = ae_complex_from_i(0); } else { v = ae_complex_from_d(0.0); if( optypea==0&&optypeb==0 ) { v = ae_v_cdotproduct(&a->ptr.pp_complex[ia+ik][ja], 1, "N", &b->ptr.pp_complex[ib][jb+jk], b->stride, "N", ae_v_len(ja,ja+k-1)); } if( optypea==0&&optypeb==1 ) { v = ae_v_cdotproduct(&a->ptr.pp_complex[ia+ik][ja], 1, "N", &b->ptr.pp_complex[ib+jk][jb], 1, "N", ae_v_len(ja,ja+k-1)); } if( optypea==0&&optypeb==2 ) { v = ae_v_cdotproduct(&a->ptr.pp_complex[ia+ik][ja], 1, "N", &b->ptr.pp_complex[ib+jk][jb], 1, "Conj", ae_v_len(ja,ja+k-1)); } if( optypea==1&&optypeb==0 ) { v = ae_v_cdotproduct(&a->ptr.pp_complex[ia][ja+ik], a->stride, "N", &b->ptr.pp_complex[ib][jb+jk], b->stride, "N", ae_v_len(ia,ia+k-1)); } if( optypea==1&&optypeb==1 ) { v = ae_v_cdotproduct(&a->ptr.pp_complex[ia][ja+ik], a->stride, "N", &b->ptr.pp_complex[ib+jk][jb], 1, "N", ae_v_len(ia,ia+k-1)); } if( optypea==1&&optypeb==2 ) { v = ae_v_cdotproduct(&a->ptr.pp_complex[ia][ja+ik], a->stride, "N", &b->ptr.pp_complex[ib+jk][jb], 1, "Conj", ae_v_len(ia,ia+k-1)); } if( optypea==2&&optypeb==0 ) { v = ae_v_cdotproduct(&a->ptr.pp_complex[ia][ja+ik], a->stride, "Conj", &b->ptr.pp_complex[ib][jb+jk], b->stride, "N", ae_v_len(ia,ia+k-1)); } if( optypea==2&&optypeb==1 ) { v = ae_v_cdotproduct(&a->ptr.pp_complex[ia][ja+ik], a->stride, "Conj", &b->ptr.pp_complex[ib+jk][jb], 1, "N", ae_v_len(ia,ia+k-1)); } if( optypea==2&&optypeb==2 ) { v = ae_v_cdotproduct(&a->ptr.pp_complex[ia][ja+ik], a->stride, "Conj", &b->ptr.pp_complex[ib+jk][jb], 1, "Conj", ae_v_len(ia,ia+k-1)); } } if( ae_c_eq_d(beta,(double)(0)) ) { c->ptr.pp_complex[ic+ik][jc+jk] = ae_c_mul(alpha,v); } else { c->ptr.pp_complex[ic+ik][jc+jk] = ae_c_add(ae_c_mul(beta,c->ptr.pp_complex[ic+ik][jc+jk]),ae_c_mul(alpha,v)); } } } } j = j+2; } i = i+2; } } /************************************************************************* RMatrixGEMM kernel, basecase code for RMatrixGEMM. This subroutine calculates C = alpha*op1(A)*op2(B) +beta*C where: * C is MxN general matrix * op1(A) is MxK matrix * op2(B) is KxN matrix * "op" may be identity transformation, transposition Additional info: * multiplication result replaces C. 
If Beta=0, C elements are not used in calculations (not multiplied by zero - just not referenced) * if Alpha=0, A is not used (not multiplied by zero - just not referenced) * if both Beta and Alpha are zero, C is filled by zeros. IMPORTANT: This function does NOT preallocate output matrix C, it MUST be preallocated by caller prior to calling this function. In case C does not have enough space to store result, exception will be generated. INPUT PARAMETERS M - matrix size, M>0 N - matrix size, N>0 K - matrix size, K>0 Alpha - coefficient A - matrix IA - submatrix offset JA - submatrix offset OpTypeA - transformation type: * 0 - no transformation * 1 - transposition B - matrix IB - submatrix offset JB - submatrix offset OpTypeB - transformation type: * 0 - no transformation * 1 - transposition Beta - coefficient C - PREALLOCATED output matrix IC - submatrix offset JC - submatrix offset -- ALGLIB routine -- 27.03.2013 Bochkanov Sergey *************************************************************************/ void rmatrixgemmk(ae_int_t m, ae_int_t n, ae_int_t k, double alpha, /* Real */ ae_matrix* a, ae_int_t ia, ae_int_t ja, ae_int_t optypea, /* Real */ ae_matrix* b, ae_int_t ib, ae_int_t jb, ae_int_t optypeb, double beta, /* Real */ ae_matrix* c, ae_int_t ic, ae_int_t jc, ae_state *_state) { ae_int_t i; ae_int_t j; /* * if matrix size is zero */ if( m==0||n==0 ) { return; } /* * Try optimized code */ if( rmatrixgemmf(m, n, k, alpha, a, ia, ja, optypea, b, ib, jb, optypeb, beta, c, ic, jc, _state) ) { return; } /* * if K=0 or Alpha=0, then C=Beta*C */ if( k==0||ae_fp_eq(alpha,(double)(0)) ) { if( ae_fp_neq(beta,(double)(1)) ) { if( ae_fp_neq(beta,(double)(0)) ) { for(i=0; i<=m-1; i++) { for(j=0; j<=n-1; j++) { c->ptr.pp_double[ic+i][jc+j] = beta*c->ptr.pp_double[ic+i][jc+j]; } } } else { for(i=0; i<=m-1; i++) { for(j=0; j<=n-1; j++) { c->ptr.pp_double[ic+i][jc+j] = (double)(0); } } } } return; } /* * Call specialized code. * * NOTE: specialized code was moved to separate function because of strange * issues with instructions cache on some systems; Having too long * functions significantly slows down internal loop of the algorithm. */ if( optypea==0&&optypeb==0 ) { rmatrixgemmk44v00(m, n, k, alpha, a, ia, ja, b, ib, jb, beta, c, ic, jc, _state); } if( optypea==0&&optypeb!=0 ) { rmatrixgemmk44v01(m, n, k, alpha, a, ia, ja, b, ib, jb, beta, c, ic, jc, _state); } if( optypea!=0&&optypeb==0 ) { rmatrixgemmk44v10(m, n, k, alpha, a, ia, ja, b, ib, jb, beta, c, ic, jc, _state); } if( optypea!=0&&optypeb!=0 ) { rmatrixgemmk44v11(m, n, k, alpha, a, ia, ja, b, ib, jb, beta, c, ic, jc, _state); } } /************************************************************************* RMatrixGEMM kernel, basecase code for RMatrixGEMM, specialized for sitation with OpTypeA=0 and OpTypeB=0. 
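For reference, every GEMM kernel in this unit computes the same BLAS-style update, C := Alpha*op(A)*op(B) + Beta*C, restricted to the submatrices selected by the IA/JA, IB/JB and IC/JC offsets; the dispatcher above merely chooses which of the four specialized 4x4 kernels to run. A deliberately naive sketch of that update for plain double** storage with no transposition is given below. It is an illustration of the semantics only, not ALGLIB code, and the name ref_gemm is hypothetical.

/* Illustrative reference GEMM: C = alpha*A*B + beta*C,
   A is m x k, B is k x n, C is m x n, no transposition. */
static void ref_gemm(int m, int n, int k, double alpha,
                     double **a, double **b, double beta, double **c)
{
    int i, j, t;
    for(i=0; i<m; i++)
        for(j=0; j<n; j++)
        {
            double v = 0.0;
            for(t=0; t<k; t++)
                v += a[i][t]*b[t][j];                 /* row i of A times column j of B */
            /* beta==0 means "do not read C", matching the kernels above */
            c[i][j] = (beta==0.0 ? 0.0 : beta*c[i][j]) + alpha*v;
        }
}

The real kernels produce exactly this result, but walk C in 4x4 tiles (2x2 for the complex kernel) so that the accumulators can stay in registers.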
Additional info: * this function requires that Alpha<>0 (assertion is thrown otherwise) INPUT PARAMETERS M - matrix size, M>0 N - matrix size, N>0 K - matrix size, K>0 Alpha - coefficient A - matrix IA - submatrix offset JA - submatrix offset B - matrix IB - submatrix offset JB - submatrix offset Beta - coefficient C - PREALLOCATED output matrix IC - submatrix offset JC - submatrix offset -- ALGLIB routine -- 27.03.2013 Bochkanov Sergey *************************************************************************/ void rmatrixgemmk44v00(ae_int_t m, ae_int_t n, ae_int_t k, double alpha, /* Real */ ae_matrix* a, ae_int_t ia, ae_int_t ja, /* Real */ ae_matrix* b, ae_int_t ib, ae_int_t jb, double beta, /* Real */ ae_matrix* c, ae_int_t ic, ae_int_t jc, ae_state *_state) { ae_int_t i; ae_int_t j; double v; double v00; double v01; double v02; double v03; double v10; double v11; double v12; double v13; double v20; double v21; double v22; double v23; double v30; double v31; double v32; double v33; double a0; double a1; double a2; double a3; double b0; double b1; double b2; double b3; ae_int_t idxa0; ae_int_t idxa1; ae_int_t idxa2; ae_int_t idxa3; ae_int_t idxb0; ae_int_t idxb1; ae_int_t idxb2; ae_int_t idxb3; ae_int_t i0; ae_int_t i1; ae_int_t ik; ae_int_t j0; ae_int_t j1; ae_int_t jk; ae_int_t t; ae_int_t offsa; ae_int_t offsb; ae_assert(ae_fp_neq(alpha,(double)(0)), "RMatrixGEMMK44V00: internal error (Alpha=0)", _state); /* * if matrix size is zero */ if( m==0||n==0 ) { return; } /* * A*B */ i = 0; while(iptr.pp_double[idxa0][offsa]; a1 = a->ptr.pp_double[idxa1][offsa]; b0 = b->ptr.pp_double[offsb][idxb0]; b1 = b->ptr.pp_double[offsb][idxb1]; v00 = v00+a0*b0; v01 = v01+a0*b1; v10 = v10+a1*b0; v11 = v11+a1*b1; a2 = a->ptr.pp_double[idxa2][offsa]; a3 = a->ptr.pp_double[idxa3][offsa]; v20 = v20+a2*b0; v21 = v21+a2*b1; v30 = v30+a3*b0; v31 = v31+a3*b1; b2 = b->ptr.pp_double[offsb][idxb2]; b3 = b->ptr.pp_double[offsb][idxb3]; v22 = v22+a2*b2; v23 = v23+a2*b3; v32 = v32+a3*b2; v33 = v33+a3*b3; v02 = v02+a0*b2; v03 = v03+a0*b3; v12 = v12+a1*b2; v13 = v13+a1*b3; offsa = offsa+1; offsb = offsb+1; } if( ae_fp_eq(beta,(double)(0)) ) { c->ptr.pp_double[ic+i+0][jc+j+0] = alpha*v00; c->ptr.pp_double[ic+i+0][jc+j+1] = alpha*v01; c->ptr.pp_double[ic+i+0][jc+j+2] = alpha*v02; c->ptr.pp_double[ic+i+0][jc+j+3] = alpha*v03; c->ptr.pp_double[ic+i+1][jc+j+0] = alpha*v10; c->ptr.pp_double[ic+i+1][jc+j+1] = alpha*v11; c->ptr.pp_double[ic+i+1][jc+j+2] = alpha*v12; c->ptr.pp_double[ic+i+1][jc+j+3] = alpha*v13; c->ptr.pp_double[ic+i+2][jc+j+0] = alpha*v20; c->ptr.pp_double[ic+i+2][jc+j+1] = alpha*v21; c->ptr.pp_double[ic+i+2][jc+j+2] = alpha*v22; c->ptr.pp_double[ic+i+2][jc+j+3] = alpha*v23; c->ptr.pp_double[ic+i+3][jc+j+0] = alpha*v30; c->ptr.pp_double[ic+i+3][jc+j+1] = alpha*v31; c->ptr.pp_double[ic+i+3][jc+j+2] = alpha*v32; c->ptr.pp_double[ic+i+3][jc+j+3] = alpha*v33; } else { c->ptr.pp_double[ic+i+0][jc+j+0] = beta*c->ptr.pp_double[ic+i+0][jc+j+0]+alpha*v00; c->ptr.pp_double[ic+i+0][jc+j+1] = beta*c->ptr.pp_double[ic+i+0][jc+j+1]+alpha*v01; c->ptr.pp_double[ic+i+0][jc+j+2] = beta*c->ptr.pp_double[ic+i+0][jc+j+2]+alpha*v02; c->ptr.pp_double[ic+i+0][jc+j+3] = beta*c->ptr.pp_double[ic+i+0][jc+j+3]+alpha*v03; c->ptr.pp_double[ic+i+1][jc+j+0] = beta*c->ptr.pp_double[ic+i+1][jc+j+0]+alpha*v10; c->ptr.pp_double[ic+i+1][jc+j+1] = beta*c->ptr.pp_double[ic+i+1][jc+j+1]+alpha*v11; c->ptr.pp_double[ic+i+1][jc+j+2] = beta*c->ptr.pp_double[ic+i+1][jc+j+2]+alpha*v12; c->ptr.pp_double[ic+i+1][jc+j+3] = 
beta*c->ptr.pp_double[ic+i+1][jc+j+3]+alpha*v13; c->ptr.pp_double[ic+i+2][jc+j+0] = beta*c->ptr.pp_double[ic+i+2][jc+j+0]+alpha*v20; c->ptr.pp_double[ic+i+2][jc+j+1] = beta*c->ptr.pp_double[ic+i+2][jc+j+1]+alpha*v21; c->ptr.pp_double[ic+i+2][jc+j+2] = beta*c->ptr.pp_double[ic+i+2][jc+j+2]+alpha*v22; c->ptr.pp_double[ic+i+2][jc+j+3] = beta*c->ptr.pp_double[ic+i+2][jc+j+3]+alpha*v23; c->ptr.pp_double[ic+i+3][jc+j+0] = beta*c->ptr.pp_double[ic+i+3][jc+j+0]+alpha*v30; c->ptr.pp_double[ic+i+3][jc+j+1] = beta*c->ptr.pp_double[ic+i+3][jc+j+1]+alpha*v31; c->ptr.pp_double[ic+i+3][jc+j+2] = beta*c->ptr.pp_double[ic+i+3][jc+j+2]+alpha*v32; c->ptr.pp_double[ic+i+3][jc+j+3] = beta*c->ptr.pp_double[ic+i+3][jc+j+3]+alpha*v33; } } else { /* * Determine submatrix [I0..I1]x[J0..J1] to process */ i0 = i; i1 = ae_minint(i+3, m-1, _state); j0 = j; j1 = ae_minint(j+3, n-1, _state); /* * Process submatrix */ for(ik=i0; ik<=i1; ik++) { for(jk=j0; jk<=j1; jk++) { if( k==0||ae_fp_eq(alpha,(double)(0)) ) { v = (double)(0); } else { v = ae_v_dotproduct(&a->ptr.pp_double[ia+ik][ja], 1, &b->ptr.pp_double[ib][jb+jk], b->stride, ae_v_len(ja,ja+k-1)); } if( ae_fp_eq(beta,(double)(0)) ) { c->ptr.pp_double[ic+ik][jc+jk] = alpha*v; } else { c->ptr.pp_double[ic+ik][jc+jk] = beta*c->ptr.pp_double[ic+ik][jc+jk]+alpha*v; } } } } j = j+4; } i = i+4; } } /************************************************************************* RMatrixGEMM kernel, basecase code for RMatrixGEMM, specialized for sitation with OpTypeA=0 and OpTypeB=1. Additional info: * this function requires that Alpha<>0 (assertion is thrown otherwise) INPUT PARAMETERS M - matrix size, M>0 N - matrix size, N>0 K - matrix size, K>0 Alpha - coefficient A - matrix IA - submatrix offset JA - submatrix offset B - matrix IB - submatrix offset JB - submatrix offset Beta - coefficient C - PREALLOCATED output matrix IC - submatrix offset JC - submatrix offset -- ALGLIB routine -- 27.03.2013 Bochkanov Sergey *************************************************************************/ void rmatrixgemmk44v01(ae_int_t m, ae_int_t n, ae_int_t k, double alpha, /* Real */ ae_matrix* a, ae_int_t ia, ae_int_t ja, /* Real */ ae_matrix* b, ae_int_t ib, ae_int_t jb, double beta, /* Real */ ae_matrix* c, ae_int_t ic, ae_int_t jc, ae_state *_state) { ae_int_t i; ae_int_t j; double v; double v00; double v01; double v02; double v03; double v10; double v11; double v12; double v13; double v20; double v21; double v22; double v23; double v30; double v31; double v32; double v33; double a0; double a1; double a2; double a3; double b0; double b1; double b2; double b3; ae_int_t idxa0; ae_int_t idxa1; ae_int_t idxa2; ae_int_t idxa3; ae_int_t idxb0; ae_int_t idxb1; ae_int_t idxb2; ae_int_t idxb3; ae_int_t i0; ae_int_t i1; ae_int_t ik; ae_int_t j0; ae_int_t j1; ae_int_t jk; ae_int_t t; ae_int_t offsa; ae_int_t offsb; ae_assert(ae_fp_neq(alpha,(double)(0)), "RMatrixGEMMK44V00: internal error (Alpha=0)", _state); /* * if matrix size is zero */ if( m==0||n==0 ) { return; } /* * A*B' */ i = 0; while(iptr.pp_double[idxa0][offsa]; a1 = a->ptr.pp_double[idxa1][offsa]; b0 = b->ptr.pp_double[idxb0][offsb]; b1 = b->ptr.pp_double[idxb1][offsb]; v00 = v00+a0*b0; v01 = v01+a0*b1; v10 = v10+a1*b0; v11 = v11+a1*b1; a2 = a->ptr.pp_double[idxa2][offsa]; a3 = a->ptr.pp_double[idxa3][offsa]; v20 = v20+a2*b0; v21 = v21+a2*b1; v30 = v30+a3*b0; v31 = v31+a3*b1; b2 = b->ptr.pp_double[idxb2][offsb]; b3 = b->ptr.pp_double[idxb3][offsb]; v22 = v22+a2*b2; v23 = v23+a2*b3; v32 = v32+a3*b2; v33 = v33+a3*b3; v02 = v02+a0*b2; v03 = 
v03+a0*b3; v12 = v12+a1*b2; v13 = v13+a1*b3; offsa = offsa+1; offsb = offsb+1; } if( ae_fp_eq(beta,(double)(0)) ) { c->ptr.pp_double[ic+i+0][jc+j+0] = alpha*v00; c->ptr.pp_double[ic+i+0][jc+j+1] = alpha*v01; c->ptr.pp_double[ic+i+0][jc+j+2] = alpha*v02; c->ptr.pp_double[ic+i+0][jc+j+3] = alpha*v03; c->ptr.pp_double[ic+i+1][jc+j+0] = alpha*v10; c->ptr.pp_double[ic+i+1][jc+j+1] = alpha*v11; c->ptr.pp_double[ic+i+1][jc+j+2] = alpha*v12; c->ptr.pp_double[ic+i+1][jc+j+3] = alpha*v13; c->ptr.pp_double[ic+i+2][jc+j+0] = alpha*v20; c->ptr.pp_double[ic+i+2][jc+j+1] = alpha*v21; c->ptr.pp_double[ic+i+2][jc+j+2] = alpha*v22; c->ptr.pp_double[ic+i+2][jc+j+3] = alpha*v23; c->ptr.pp_double[ic+i+3][jc+j+0] = alpha*v30; c->ptr.pp_double[ic+i+3][jc+j+1] = alpha*v31; c->ptr.pp_double[ic+i+3][jc+j+2] = alpha*v32; c->ptr.pp_double[ic+i+3][jc+j+3] = alpha*v33; } else { c->ptr.pp_double[ic+i+0][jc+j+0] = beta*c->ptr.pp_double[ic+i+0][jc+j+0]+alpha*v00; c->ptr.pp_double[ic+i+0][jc+j+1] = beta*c->ptr.pp_double[ic+i+0][jc+j+1]+alpha*v01; c->ptr.pp_double[ic+i+0][jc+j+2] = beta*c->ptr.pp_double[ic+i+0][jc+j+2]+alpha*v02; c->ptr.pp_double[ic+i+0][jc+j+3] = beta*c->ptr.pp_double[ic+i+0][jc+j+3]+alpha*v03; c->ptr.pp_double[ic+i+1][jc+j+0] = beta*c->ptr.pp_double[ic+i+1][jc+j+0]+alpha*v10; c->ptr.pp_double[ic+i+1][jc+j+1] = beta*c->ptr.pp_double[ic+i+1][jc+j+1]+alpha*v11; c->ptr.pp_double[ic+i+1][jc+j+2] = beta*c->ptr.pp_double[ic+i+1][jc+j+2]+alpha*v12; c->ptr.pp_double[ic+i+1][jc+j+3] = beta*c->ptr.pp_double[ic+i+1][jc+j+3]+alpha*v13; c->ptr.pp_double[ic+i+2][jc+j+0] = beta*c->ptr.pp_double[ic+i+2][jc+j+0]+alpha*v20; c->ptr.pp_double[ic+i+2][jc+j+1] = beta*c->ptr.pp_double[ic+i+2][jc+j+1]+alpha*v21; c->ptr.pp_double[ic+i+2][jc+j+2] = beta*c->ptr.pp_double[ic+i+2][jc+j+2]+alpha*v22; c->ptr.pp_double[ic+i+2][jc+j+3] = beta*c->ptr.pp_double[ic+i+2][jc+j+3]+alpha*v23; c->ptr.pp_double[ic+i+3][jc+j+0] = beta*c->ptr.pp_double[ic+i+3][jc+j+0]+alpha*v30; c->ptr.pp_double[ic+i+3][jc+j+1] = beta*c->ptr.pp_double[ic+i+3][jc+j+1]+alpha*v31; c->ptr.pp_double[ic+i+3][jc+j+2] = beta*c->ptr.pp_double[ic+i+3][jc+j+2]+alpha*v32; c->ptr.pp_double[ic+i+3][jc+j+3] = beta*c->ptr.pp_double[ic+i+3][jc+j+3]+alpha*v33; } } else { /* * Determine submatrix [I0..I1]x[J0..J1] to process */ i0 = i; i1 = ae_minint(i+3, m-1, _state); j0 = j; j1 = ae_minint(j+3, n-1, _state); /* * Process submatrix */ for(ik=i0; ik<=i1; ik++) { for(jk=j0; jk<=j1; jk++) { if( k==0||ae_fp_eq(alpha,(double)(0)) ) { v = (double)(0); } else { v = ae_v_dotproduct(&a->ptr.pp_double[ia+ik][ja], 1, &b->ptr.pp_double[ib+jk][jb], 1, ae_v_len(ja,ja+k-1)); } if( ae_fp_eq(beta,(double)(0)) ) { c->ptr.pp_double[ic+ik][jc+jk] = alpha*v; } else { c->ptr.pp_double[ic+ik][jc+jk] = beta*c->ptr.pp_double[ic+ik][jc+jk]+alpha*v; } } } } j = j+4; } i = i+4; } } /************************************************************************* RMatrixGEMM kernel, basecase code for RMatrixGEMM, specialized for sitation with OpTypeA=1 and OpTypeB=0. 
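All four real 4x4 kernels (and the complex 2x2 kernel) share the same register-blocking scheme: for each tile of C, the partial sums v00..v33 are kept in scalar locals, the K-loop streams through A and B exactly once, and C is read and written only after the loop (and not read at all when Beta=0). A stripped-down sketch of the idea for a single 2x2 tile is shown below; tile2x2 is a hypothetical helper for illustration, not part of ALGLIB.

/* Illustrative 2x2 register-blocked tile: C[i..i+1][j..j+1] = beta*C + alpha*A*B. */
static void tile2x2(int i, int j, int k, double alpha, double beta,
                    double **a, double **b, double **c)
{
    double v00 = 0, v01 = 0, v10 = 0, v11 = 0;
    int t;
    for(t = 0; t < k; t++)
    {
        double a0 = a[i+0][t], a1 = a[i+1][t];   /* two rows of A     */
        double b0 = b[t][j+0], b1 = b[t][j+1];   /* two columns of B  */
        v00 += a0*b0;  v01 += a0*b1;             /* rank-1 update of  */
        v10 += a1*b0;  v11 += a1*b1;             /* the accumulators  */
    }
    c[i+0][j+0] = beta*c[i+0][j+0] + alpha*v00;  /* C touched once,   */
    c[i+0][j+1] = beta*c[i+0][j+1] + alpha*v01;  /* after the K-loop  */
    c[i+1][j+0] = beta*c[i+1][j+0] + alpha*v10;
    c[i+1][j+1] = beta*c[i+1][j+1] + alpha*v11;
}

Edge rows and columns that do not fill a complete tile are handled by the "Determine submatrix" branches, which fall back to plain ae_v_dotproduct calls.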
Additional info: * this function requires that Alpha<>0 (assertion is thrown otherwise) INPUT PARAMETERS M - matrix size, M>0 N - matrix size, N>0 K - matrix size, K>0 Alpha - coefficient A - matrix IA - submatrix offset JA - submatrix offset B - matrix IB - submatrix offset JB - submatrix offset Beta - coefficient C - PREALLOCATED output matrix IC - submatrix offset JC - submatrix offset -- ALGLIB routine -- 27.03.2013 Bochkanov Sergey *************************************************************************/ void rmatrixgemmk44v10(ae_int_t m, ae_int_t n, ae_int_t k, double alpha, /* Real */ ae_matrix* a, ae_int_t ia, ae_int_t ja, /* Real */ ae_matrix* b, ae_int_t ib, ae_int_t jb, double beta, /* Real */ ae_matrix* c, ae_int_t ic, ae_int_t jc, ae_state *_state) { ae_int_t i; ae_int_t j; double v; double v00; double v01; double v02; double v03; double v10; double v11; double v12; double v13; double v20; double v21; double v22; double v23; double v30; double v31; double v32; double v33; double a0; double a1; double a2; double a3; double b0; double b1; double b2; double b3; ae_int_t idxa0; ae_int_t idxa1; ae_int_t idxa2; ae_int_t idxa3; ae_int_t idxb0; ae_int_t idxb1; ae_int_t idxb2; ae_int_t idxb3; ae_int_t i0; ae_int_t i1; ae_int_t ik; ae_int_t j0; ae_int_t j1; ae_int_t jk; ae_int_t t; ae_int_t offsa; ae_int_t offsb; ae_assert(ae_fp_neq(alpha,(double)(0)), "RMatrixGEMMK44V00: internal error (Alpha=0)", _state); /* * if matrix size is zero */ if( m==0||n==0 ) { return; } /* * A'*B */ i = 0; while(iptr.pp_double[offsa][idxa0]; a1 = a->ptr.pp_double[offsa][idxa1]; b0 = b->ptr.pp_double[offsb][idxb0]; b1 = b->ptr.pp_double[offsb][idxb1]; v00 = v00+a0*b0; v01 = v01+a0*b1; v10 = v10+a1*b0; v11 = v11+a1*b1; a2 = a->ptr.pp_double[offsa][idxa2]; a3 = a->ptr.pp_double[offsa][idxa3]; v20 = v20+a2*b0; v21 = v21+a2*b1; v30 = v30+a3*b0; v31 = v31+a3*b1; b2 = b->ptr.pp_double[offsb][idxb2]; b3 = b->ptr.pp_double[offsb][idxb3]; v22 = v22+a2*b2; v23 = v23+a2*b3; v32 = v32+a3*b2; v33 = v33+a3*b3; v02 = v02+a0*b2; v03 = v03+a0*b3; v12 = v12+a1*b2; v13 = v13+a1*b3; offsa = offsa+1; offsb = offsb+1; } if( ae_fp_eq(beta,(double)(0)) ) { c->ptr.pp_double[ic+i+0][jc+j+0] = alpha*v00; c->ptr.pp_double[ic+i+0][jc+j+1] = alpha*v01; c->ptr.pp_double[ic+i+0][jc+j+2] = alpha*v02; c->ptr.pp_double[ic+i+0][jc+j+3] = alpha*v03; c->ptr.pp_double[ic+i+1][jc+j+0] = alpha*v10; c->ptr.pp_double[ic+i+1][jc+j+1] = alpha*v11; c->ptr.pp_double[ic+i+1][jc+j+2] = alpha*v12; c->ptr.pp_double[ic+i+1][jc+j+3] = alpha*v13; c->ptr.pp_double[ic+i+2][jc+j+0] = alpha*v20; c->ptr.pp_double[ic+i+2][jc+j+1] = alpha*v21; c->ptr.pp_double[ic+i+2][jc+j+2] = alpha*v22; c->ptr.pp_double[ic+i+2][jc+j+3] = alpha*v23; c->ptr.pp_double[ic+i+3][jc+j+0] = alpha*v30; c->ptr.pp_double[ic+i+3][jc+j+1] = alpha*v31; c->ptr.pp_double[ic+i+3][jc+j+2] = alpha*v32; c->ptr.pp_double[ic+i+3][jc+j+3] = alpha*v33; } else { c->ptr.pp_double[ic+i+0][jc+j+0] = beta*c->ptr.pp_double[ic+i+0][jc+j+0]+alpha*v00; c->ptr.pp_double[ic+i+0][jc+j+1] = beta*c->ptr.pp_double[ic+i+0][jc+j+1]+alpha*v01; c->ptr.pp_double[ic+i+0][jc+j+2] = beta*c->ptr.pp_double[ic+i+0][jc+j+2]+alpha*v02; c->ptr.pp_double[ic+i+0][jc+j+3] = beta*c->ptr.pp_double[ic+i+0][jc+j+3]+alpha*v03; c->ptr.pp_double[ic+i+1][jc+j+0] = beta*c->ptr.pp_double[ic+i+1][jc+j+0]+alpha*v10; c->ptr.pp_double[ic+i+1][jc+j+1] = beta*c->ptr.pp_double[ic+i+1][jc+j+1]+alpha*v11; c->ptr.pp_double[ic+i+1][jc+j+2] = beta*c->ptr.pp_double[ic+i+1][jc+j+2]+alpha*v12; c->ptr.pp_double[ic+i+1][jc+j+3] = 
beta*c->ptr.pp_double[ic+i+1][jc+j+3]+alpha*v13; c->ptr.pp_double[ic+i+2][jc+j+0] = beta*c->ptr.pp_double[ic+i+2][jc+j+0]+alpha*v20; c->ptr.pp_double[ic+i+2][jc+j+1] = beta*c->ptr.pp_double[ic+i+2][jc+j+1]+alpha*v21; c->ptr.pp_double[ic+i+2][jc+j+2] = beta*c->ptr.pp_double[ic+i+2][jc+j+2]+alpha*v22; c->ptr.pp_double[ic+i+2][jc+j+3] = beta*c->ptr.pp_double[ic+i+2][jc+j+3]+alpha*v23; c->ptr.pp_double[ic+i+3][jc+j+0] = beta*c->ptr.pp_double[ic+i+3][jc+j+0]+alpha*v30; c->ptr.pp_double[ic+i+3][jc+j+1] = beta*c->ptr.pp_double[ic+i+3][jc+j+1]+alpha*v31; c->ptr.pp_double[ic+i+3][jc+j+2] = beta*c->ptr.pp_double[ic+i+3][jc+j+2]+alpha*v32; c->ptr.pp_double[ic+i+3][jc+j+3] = beta*c->ptr.pp_double[ic+i+3][jc+j+3]+alpha*v33; } } else { /* * Determine submatrix [I0..I1]x[J0..J1] to process */ i0 = i; i1 = ae_minint(i+3, m-1, _state); j0 = j; j1 = ae_minint(j+3, n-1, _state); /* * Process submatrix */ for(ik=i0; ik<=i1; ik++) { for(jk=j0; jk<=j1; jk++) { if( k==0||ae_fp_eq(alpha,(double)(0)) ) { v = (double)(0); } else { v = 0.0; v = ae_v_dotproduct(&a->ptr.pp_double[ia][ja+ik], a->stride, &b->ptr.pp_double[ib][jb+jk], b->stride, ae_v_len(ia,ia+k-1)); } if( ae_fp_eq(beta,(double)(0)) ) { c->ptr.pp_double[ic+ik][jc+jk] = alpha*v; } else { c->ptr.pp_double[ic+ik][jc+jk] = beta*c->ptr.pp_double[ic+ik][jc+jk]+alpha*v; } } } } j = j+4; } i = i+4; } } /************************************************************************* RMatrixGEMM kernel, basecase code for RMatrixGEMM, specialized for sitation with OpTypeA=1 and OpTypeB=1. Additional info: * this function requires that Alpha<>0 (assertion is thrown otherwise) INPUT PARAMETERS M - matrix size, M>0 N - matrix size, N>0 K - matrix size, K>0 Alpha - coefficient A - matrix IA - submatrix offset JA - submatrix offset B - matrix IB - submatrix offset JB - submatrix offset Beta - coefficient C - PREALLOCATED output matrix IC - submatrix offset JC - submatrix offset -- ALGLIB routine -- 27.03.2013 Bochkanov Sergey *************************************************************************/ void rmatrixgemmk44v11(ae_int_t m, ae_int_t n, ae_int_t k, double alpha, /* Real */ ae_matrix* a, ae_int_t ia, ae_int_t ja, /* Real */ ae_matrix* b, ae_int_t ib, ae_int_t jb, double beta, /* Real */ ae_matrix* c, ae_int_t ic, ae_int_t jc, ae_state *_state) { ae_int_t i; ae_int_t j; double v; double v00; double v01; double v02; double v03; double v10; double v11; double v12; double v13; double v20; double v21; double v22; double v23; double v30; double v31; double v32; double v33; double a0; double a1; double a2; double a3; double b0; double b1; double b2; double b3; ae_int_t idxa0; ae_int_t idxa1; ae_int_t idxa2; ae_int_t idxa3; ae_int_t idxb0; ae_int_t idxb1; ae_int_t idxb2; ae_int_t idxb3; ae_int_t i0; ae_int_t i1; ae_int_t ik; ae_int_t j0; ae_int_t j1; ae_int_t jk; ae_int_t t; ae_int_t offsa; ae_int_t offsb; ae_assert(ae_fp_neq(alpha,(double)(0)), "RMatrixGEMMK44V00: internal error (Alpha=0)", _state); /* * if matrix size is zero */ if( m==0||n==0 ) { return; } /* * A'*B' */ i = 0; while(iptr.pp_double[offsa][idxa0]; a1 = a->ptr.pp_double[offsa][idxa1]; b0 = b->ptr.pp_double[idxb0][offsb]; b1 = b->ptr.pp_double[idxb1][offsb]; v00 = v00+a0*b0; v01 = v01+a0*b1; v10 = v10+a1*b0; v11 = v11+a1*b1; a2 = a->ptr.pp_double[offsa][idxa2]; a3 = a->ptr.pp_double[offsa][idxa3]; v20 = v20+a2*b0; v21 = v21+a2*b1; v30 = v30+a3*b0; v31 = v31+a3*b1; b2 = b->ptr.pp_double[idxb2][offsb]; b3 = b->ptr.pp_double[idxb3][offsb]; v22 = v22+a2*b2; v23 = v23+a2*b3; v32 = v32+a3*b2; v33 = v33+a3*b3; v02 = 
v02+a0*b2; v03 = v03+a0*b3; v12 = v12+a1*b2; v13 = v13+a1*b3; offsa = offsa+1; offsb = offsb+1; } if( ae_fp_eq(beta,(double)(0)) ) { c->ptr.pp_double[ic+i+0][jc+j+0] = alpha*v00; c->ptr.pp_double[ic+i+0][jc+j+1] = alpha*v01; c->ptr.pp_double[ic+i+0][jc+j+2] = alpha*v02; c->ptr.pp_double[ic+i+0][jc+j+3] = alpha*v03; c->ptr.pp_double[ic+i+1][jc+j+0] = alpha*v10; c->ptr.pp_double[ic+i+1][jc+j+1] = alpha*v11; c->ptr.pp_double[ic+i+1][jc+j+2] = alpha*v12; c->ptr.pp_double[ic+i+1][jc+j+3] = alpha*v13; c->ptr.pp_double[ic+i+2][jc+j+0] = alpha*v20; c->ptr.pp_double[ic+i+2][jc+j+1] = alpha*v21; c->ptr.pp_double[ic+i+2][jc+j+2] = alpha*v22; c->ptr.pp_double[ic+i+2][jc+j+3] = alpha*v23; c->ptr.pp_double[ic+i+3][jc+j+0] = alpha*v30; c->ptr.pp_double[ic+i+3][jc+j+1] = alpha*v31; c->ptr.pp_double[ic+i+3][jc+j+2] = alpha*v32; c->ptr.pp_double[ic+i+3][jc+j+3] = alpha*v33; } else { c->ptr.pp_double[ic+i+0][jc+j+0] = beta*c->ptr.pp_double[ic+i+0][jc+j+0]+alpha*v00; c->ptr.pp_double[ic+i+0][jc+j+1] = beta*c->ptr.pp_double[ic+i+0][jc+j+1]+alpha*v01; c->ptr.pp_double[ic+i+0][jc+j+2] = beta*c->ptr.pp_double[ic+i+0][jc+j+2]+alpha*v02; c->ptr.pp_double[ic+i+0][jc+j+3] = beta*c->ptr.pp_double[ic+i+0][jc+j+3]+alpha*v03; c->ptr.pp_double[ic+i+1][jc+j+0] = beta*c->ptr.pp_double[ic+i+1][jc+j+0]+alpha*v10; c->ptr.pp_double[ic+i+1][jc+j+1] = beta*c->ptr.pp_double[ic+i+1][jc+j+1]+alpha*v11; c->ptr.pp_double[ic+i+1][jc+j+2] = beta*c->ptr.pp_double[ic+i+1][jc+j+2]+alpha*v12; c->ptr.pp_double[ic+i+1][jc+j+3] = beta*c->ptr.pp_double[ic+i+1][jc+j+3]+alpha*v13; c->ptr.pp_double[ic+i+2][jc+j+0] = beta*c->ptr.pp_double[ic+i+2][jc+j+0]+alpha*v20; c->ptr.pp_double[ic+i+2][jc+j+1] = beta*c->ptr.pp_double[ic+i+2][jc+j+1]+alpha*v21; c->ptr.pp_double[ic+i+2][jc+j+2] = beta*c->ptr.pp_double[ic+i+2][jc+j+2]+alpha*v22; c->ptr.pp_double[ic+i+2][jc+j+3] = beta*c->ptr.pp_double[ic+i+2][jc+j+3]+alpha*v23; c->ptr.pp_double[ic+i+3][jc+j+0] = beta*c->ptr.pp_double[ic+i+3][jc+j+0]+alpha*v30; c->ptr.pp_double[ic+i+3][jc+j+1] = beta*c->ptr.pp_double[ic+i+3][jc+j+1]+alpha*v31; c->ptr.pp_double[ic+i+3][jc+j+2] = beta*c->ptr.pp_double[ic+i+3][jc+j+2]+alpha*v32; c->ptr.pp_double[ic+i+3][jc+j+3] = beta*c->ptr.pp_double[ic+i+3][jc+j+3]+alpha*v33; } } else { /* * Determine submatrix [I0..I1]x[J0..J1] to process */ i0 = i; i1 = ae_minint(i+3, m-1, _state); j0 = j; j1 = ae_minint(j+3, n-1, _state); /* * Process submatrix */ for(ik=i0; ik<=i1; ik++) { for(jk=j0; jk<=j1; jk++) { if( k==0||ae_fp_eq(alpha,(double)(0)) ) { v = (double)(0); } else { v = 0.0; v = ae_v_dotproduct(&a->ptr.pp_double[ia][ja+ik], a->stride, &b->ptr.pp_double[ib+jk][jb], 1, ae_v_len(ia,ia+k-1)); } if( ae_fp_eq(beta,(double)(0)) ) { c->ptr.pp_double[ic+ik][jc+jk] = alpha*v; } else { c->ptr.pp_double[ic+ik][jc+jk] = beta*c->ptr.pp_double[ic+ik][jc+jk]+alpha*v; } } } } j = j+4; } i = i+4; } } /************************************************************************* MKL-based kernel -- ALGLIB routine -- 01.10.2013 Bochkanov Sergey *************************************************************************/ ae_bool rmatrixsyrkmkl(ae_int_t n, ae_int_t k, double alpha, /* Real */ ae_matrix* a, ae_int_t ia, ae_int_t ja, ae_int_t optypea, double beta, /* Real */ ae_matrix* c, ae_int_t ic, ae_int_t jc, ae_bool isupper, ae_state *_state) { #ifndef ALGLIB_INTERCEPTS_MKL ae_bool result; result = ae_false; return result; #else return _ialglib_i_rmatrixsyrkmkl(n, k, alpha, a, ia, ja, optypea, beta, c, ic, jc, isupper); #endif } /************************************************************************* 
MKL-based kernel -- ALGLIB routine -- 01.10.2013 Bochkanov Sergey *************************************************************************/ ae_bool cmatrixherkmkl(ae_int_t n, ae_int_t k, double alpha, /* Complex */ ae_matrix* a, ae_int_t ia, ae_int_t ja, ae_int_t optypea, double beta, /* Complex */ ae_matrix* c, ae_int_t ic, ae_int_t jc, ae_bool isupper, ae_state *_state) { #ifndef ALGLIB_INTERCEPTS_MKL ae_bool result; result = ae_false; return result; #else return _ialglib_i_cmatrixherkmkl(n, k, alpha, a, ia, ja, optypea, beta, c, ic, jc, isupper); #endif } /************************************************************************* MKL-based kernel -- ALGLIB routine -- 01.10.2013 Bochkanov Sergey *************************************************************************/ ae_bool rmatrixgemmmkl(ae_int_t m, ae_int_t n, ae_int_t k, double alpha, /* Real */ ae_matrix* a, ae_int_t ia, ae_int_t ja, ae_int_t optypea, /* Real */ ae_matrix* b, ae_int_t ib, ae_int_t jb, ae_int_t optypeb, double beta, /* Real */ ae_matrix* c, ae_int_t ic, ae_int_t jc, ae_state *_state) { #ifndef ALGLIB_INTERCEPTS_MKL ae_bool result; result = ae_false; return result; #else return _ialglib_i_rmatrixgemmmkl(m, n, k, alpha, a, ia, ja, optypea, b, ib, jb, optypeb, beta, c, ic, jc); #endif } /************************************************************************* MKL-based kernel -- ALGLIB routine -- 16.10.2014 Bochkanov Sergey *************************************************************************/ ae_bool cmatrixgemmmkl(ae_int_t m, ae_int_t n, ae_int_t k, ae_complex alpha, /* Complex */ ae_matrix* a, ae_int_t ia, ae_int_t ja, ae_int_t optypea, /* Complex */ ae_matrix* b, ae_int_t ib, ae_int_t jb, ae_int_t optypeb, ae_complex beta, /* Complex */ ae_matrix* c, ae_int_t ic, ae_int_t jc, ae_state *_state) { #ifndef ALGLIB_INTERCEPTS_MKL ae_bool result; result = ae_false; return result; #else return _ialglib_i_cmatrixgemmmkl(m, n, k, alpha, a, ia, ja, optypea, b, ib, jb, optypeb, beta, c, ic, jc); #endif } /************************************************************************* MKL-based kernel -- ALGLIB routine -- 16.10.2014 Bochkanov Sergey *************************************************************************/ ae_bool cmatrixlefttrsmmkl(ae_int_t m, ae_int_t n, /* Complex */ ae_matrix* a, ae_int_t i1, ae_int_t j1, ae_bool isupper, ae_bool isunit, ae_int_t optype, /* Complex */ ae_matrix* x, ae_int_t i2, ae_int_t j2, ae_state *_state) { #ifndef ALGLIB_INTERCEPTS_MKL ae_bool result; result = ae_false; return result; #else return _ialglib_i_cmatrixlefttrsmmkl(m, n, a, i1, j1, isupper, isunit, optype, x, i2, j2); #endif } /************************************************************************* MKL-based kernel -- ALGLIB routine -- 16.10.2014 Bochkanov Sergey *************************************************************************/ ae_bool cmatrixrighttrsmmkl(ae_int_t m, ae_int_t n, /* Complex */ ae_matrix* a, ae_int_t i1, ae_int_t j1, ae_bool isupper, ae_bool isunit, ae_int_t optype, /* Complex */ ae_matrix* x, ae_int_t i2, ae_int_t j2, ae_state *_state) { #ifndef ALGLIB_INTERCEPTS_MKL ae_bool result; result = ae_false; return result; #else return _ialglib_i_cmatrixrighttrsmmkl(m, n, a, i1, j1, isupper, isunit, optype, x, i2, j2); #endif } /************************************************************************* MKL-based kernel -- ALGLIB routine -- 16.10.2014 Bochkanov Sergey *************************************************************************/ ae_bool rmatrixlefttrsmmkl(ae_int_t m, ae_int_t n, /* Real */ 
ae_matrix* a, ae_int_t i1, ae_int_t j1, ae_bool isupper, ae_bool isunit, ae_int_t optype, /* Real */ ae_matrix* x, ae_int_t i2, ae_int_t j2, ae_state *_state) { #ifndef ALGLIB_INTERCEPTS_MKL ae_bool result; result = ae_false; return result; #else return _ialglib_i_rmatrixlefttrsmmkl(m, n, a, i1, j1, isupper, isunit, optype, x, i2, j2); #endif } /************************************************************************* MKL-based kernel -- ALGLIB routine -- 16.10.2014 Bochkanov Sergey *************************************************************************/ ae_bool rmatrixrighttrsmmkl(ae_int_t m, ae_int_t n, /* Real */ ae_matrix* a, ae_int_t i1, ae_int_t j1, ae_bool isupper, ae_bool isunit, ae_int_t optype, /* Real */ ae_matrix* x, ae_int_t i2, ae_int_t j2, ae_state *_state) { #ifndef ALGLIB_INTERCEPTS_MKL ae_bool result; result = ae_false; return result; #else return _ialglib_i_rmatrixrighttrsmmkl(m, n, a, i1, j1, isupper, isunit, optype, x, i2, j2); #endif } /************************************************************************* MKL-based kernel. NOTE: if function returned False, CholResult is NOT modified. Not ever referenced! if function returned True, CholResult is set to status of Cholesky decomposition (True on succeess). -- ALGLIB routine -- 16.10.2014 Bochkanov Sergey *************************************************************************/ ae_bool spdmatrixcholeskymkl(/* Real */ ae_matrix* a, ae_int_t offs, ae_int_t n, ae_bool isupper, ae_bool* cholresult, ae_state *_state) { #ifndef ALGLIB_INTERCEPTS_MKL ae_bool result; result = ae_false; return result; #else return _ialglib_i_spdmatrixcholeskymkl(a, offs, n, isupper, cholresult); #endif } /************************************************************************* MKL-based kernel. -- ALGLIB routine -- 20.10.2014 Bochkanov Sergey *************************************************************************/ ae_bool rmatrixplumkl(/* Real */ ae_matrix* a, ae_int_t offs, ae_int_t m, ae_int_t n, /* Integer */ ae_vector* pivots, ae_state *_state) { #ifndef ALGLIB_INTERCEPTS_MKL ae_bool result; result = ae_false; return result; #else return _ialglib_i_rmatrixplumkl(a, offs, m, n, pivots); #endif } /************************************************************************* MKL-based kernel. NOTE: this function needs preallocated output/temporary arrays. D and E must be at least max(M,N)-wide. -- ALGLIB routine -- 20.10.2014 Bochkanov Sergey *************************************************************************/ ae_bool rmatrixbdmkl(/* Real */ ae_matrix* a, ae_int_t m, ae_int_t n, /* Real */ ae_vector* d, /* Real */ ae_vector* e, /* Real */ ae_vector* tauq, /* Real */ ae_vector* taup, ae_state *_state) { #ifndef ALGLIB_INTERCEPTS_MKL ae_bool result; result = ae_false; return result; #else return _ialglib_i_rmatrixbdmkl(a, m, n, d, e, tauq, taup); #endif } /************************************************************************* MKL-based kernel. If ByQ is True, TauP is not used (can be empty array). If ByQ is False, TauQ is not used (can be empty array). 
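All of the *mkl functions in this block follow the same interception pattern: when ALGLIB is compiled without ALGLIB_INTERCEPTS_MKL they immediately return ae_false and the caller falls through to the portable C kernels; when MKL interception is enabled, the request is forwarded to the corresponding _ialglib_i_* hook and its status is returned. A usage sketch, mirroring the "Try optimized code" blocks seen in the GEMM kernels above, is shown below; my_generic_gemm is a hypothetical placeholder for the fallback path, not an ALGLIB function.

/* Try the MKL-backed kernel first; if it reports ae_false (MKL not
   compiled in, or request not handled), run the portable fallback. */
if( !rmatrixgemmmkl(m, n, k, alpha, a, ia, ja, optypea,
                    b, ib, jb, optypeb, beta, c, ic, jc, _state) )
{
    my_generic_gemm(m, n, k, alpha, a, ia, ja, optypea,
                    b, ib, jb, optypeb, beta, c, ic, jc, _state);
}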
-- ALGLIB routine -- 20.10.2014 Bochkanov Sergey *************************************************************************/ ae_bool rmatrixbdmultiplybymkl(/* Real */ ae_matrix* qp, ae_int_t m, ae_int_t n, /* Real */ ae_vector* tauq, /* Real */ ae_vector* taup, /* Real */ ae_matrix* z, ae_int_t zrows, ae_int_t zcolumns, ae_bool byq, ae_bool fromtheright, ae_bool dotranspose, ae_state *_state) { #ifndef ALGLIB_INTERCEPTS_MKL ae_bool result; result = ae_false; return result; #else return _ialglib_i_rmatrixbdmultiplybymkl(qp, m, n, tauq, taup, z, zrows, zcolumns, byq, fromtheright, dotranspose); #endif } /************************************************************************* MKL-based kernel. NOTE: Tau must be preallocated array with at least N-1 elements. -- ALGLIB routine -- 20.10.2014 Bochkanov Sergey *************************************************************************/ ae_bool rmatrixhessenbergmkl(/* Real */ ae_matrix* a, ae_int_t n, /* Real */ ae_vector* tau, ae_state *_state) { #ifndef ALGLIB_INTERCEPTS_MKL ae_bool result; result = ae_false; return result; #else return _ialglib_i_rmatrixhessenbergmkl(a, n, tau); #endif } /************************************************************************* MKL-based kernel. NOTE: Q must be preallocated N*N array -- ALGLIB routine -- 20.10.2014 Bochkanov Sergey *************************************************************************/ ae_bool rmatrixhessenbergunpackqmkl(/* Real */ ae_matrix* a, ae_int_t n, /* Real */ ae_vector* tau, /* Real */ ae_matrix* q, ae_state *_state) { #ifndef ALGLIB_INTERCEPTS_MKL ae_bool result; result = ae_false; return result; #else return _ialglib_i_rmatrixhessenbergunpackqmkl(a, n, tau, q); #endif } /************************************************************************* MKL-based kernel. NOTE: Tau, D, E must be preallocated arrays; length(E)=length(Tau)=N-1 (or larger) length(D)=N (or larger) -- ALGLIB routine -- 20.10.2014 Bochkanov Sergey *************************************************************************/ ae_bool smatrixtdmkl(/* Real */ ae_matrix* a, ae_int_t n, ae_bool isupper, /* Real */ ae_vector* tau, /* Real */ ae_vector* d, /* Real */ ae_vector* e, ae_state *_state) { #ifndef ALGLIB_INTERCEPTS_MKL ae_bool result; result = ae_false; return result; #else return _ialglib_i_smatrixtdmkl(a, n, isupper, tau, d, e); #endif } /************************************************************************* MKL-based kernel. NOTE: Q must be preallocated N*N array -- ALGLIB routine -- 20.10.2014 Bochkanov Sergey *************************************************************************/ ae_bool smatrixtdunpackqmkl(/* Real */ ae_matrix* a, ae_int_t n, ae_bool isupper, /* Real */ ae_vector* tau, /* Real */ ae_matrix* q, ae_state *_state) { #ifndef ALGLIB_INTERCEPTS_MKL ae_bool result; result = ae_false; return result; #else return _ialglib_i_smatrixtdunpackqmkl(a, n, isupper, tau, q); #endif } /************************************************************************* MKL-based kernel. 
NOTE: Tau, D, E must be preallocated arrays; length(E)=length(Tau)=N-1 (or larger) length(D)=N (or larger) -- ALGLIB routine -- 20.10.2014 Bochkanov Sergey *************************************************************************/ ae_bool hmatrixtdmkl(/* Complex */ ae_matrix* a, ae_int_t n, ae_bool isupper, /* Complex */ ae_vector* tau, /* Real */ ae_vector* d, /* Real */ ae_vector* e, ae_state *_state) { #ifndef ALGLIB_INTERCEPTS_MKL ae_bool result; result = ae_false; return result; #else return _ialglib_i_hmatrixtdmkl(a, n, isupper, tau, d, e); #endif } /************************************************************************* MKL-based kernel. NOTE: Q must be preallocated N*N array -- ALGLIB routine -- 20.10.2014 Bochkanov Sergey *************************************************************************/ ae_bool hmatrixtdunpackqmkl(/* Complex */ ae_matrix* a, ae_int_t n, ae_bool isupper, /* Complex */ ae_vector* tau, /* Complex */ ae_matrix* q, ae_state *_state) { #ifndef ALGLIB_INTERCEPTS_MKL ae_bool result; result = ae_false; return result; #else return _ialglib_i_hmatrixtdunpackqmkl(a, n, isupper, tau, q); #endif } /************************************************************************* MKL-based kernel. Returns True if MKL was present and handled request (MKL completion code is returned as separate output parameter). D and E are pre-allocated arrays with length N (both of them!). On output, D constraints singular values, and E is destroyed. SVDResult is modified if and only if MKL is present. -- ALGLIB routine -- 20.10.2014 Bochkanov Sergey *************************************************************************/ ae_bool rmatrixbdsvdmkl(/* Real */ ae_vector* d, /* Real */ ae_vector* e, ae_int_t n, ae_bool isupper, /* Real */ ae_matrix* u, ae_int_t nru, /* Real */ ae_matrix* c, ae_int_t ncc, /* Real */ ae_matrix* vt, ae_int_t ncvt, ae_bool* svdresult, ae_state *_state) { #ifndef ALGLIB_INTERCEPTS_MKL ae_bool result; result = ae_false; return result; #else return _ialglib_i_rmatrixbdsvdmkl(d, e, n, isupper, u, nru, c, ncc, vt, ncvt, svdresult); #endif } /************************************************************************* MKL-based DHSEQR kernel. Returns True if MKL was present and handled request. WR and WI are pre-allocated arrays with length N. Z is pre-allocated array[N,N]. -- ALGLIB routine -- 20.10.2014 Bochkanov Sergey *************************************************************************/ ae_bool rmatrixinternalschurdecompositionmkl(/* Real */ ae_matrix* h, ae_int_t n, ae_int_t tneeded, ae_int_t zneeded, /* Real */ ae_vector* wr, /* Real */ ae_vector* wi, /* Real */ ae_matrix* z, ae_int_t* info, ae_state *_state) { #ifndef ALGLIB_INTERCEPTS_MKL ae_bool result; result = ae_false; return result; #else return _ialglib_i_rmatrixinternalschurdecompositionmkl(h, n, tneeded, zneeded, wr, wi, z, info); #endif } /************************************************************************* MKL-based DTREVC kernel. Returns True if MKL was present and handled request. NOTE: this function does NOT support HOWMNY=3!!!! VL and VR are pre-allocated arrays with length N*N, if required. If particalar variables is not required, it can be dummy (empty) array. 
-- ALGLIB routine -- 20.10.2014 Bochkanov Sergey *************************************************************************/ ae_bool rmatrixinternaltrevcmkl(/* Real */ ae_matrix* t, ae_int_t n, ae_int_t side, ae_int_t howmny, /* Real */ ae_matrix* vl, /* Real */ ae_matrix* vr, ae_int_t* m, ae_int_t* info, ae_state *_state) { #ifndef ALGLIB_INTERCEPTS_MKL ae_bool result; result = ae_false; return result; #else return _ialglib_i_rmatrixinternaltrevcmkl(t, n, side, howmny, vl, vr, m, info); #endif } /************************************************************************* MKL-based kernel. Returns True if MKL was present and handled request (MKL completion code is returned as separate output parameter). D and E are pre-allocated arrays with length N (both of them!). On output, D constraints eigenvalues, and E is destroyed. Z is preallocated array[N,N] for ZNeeded<>0; ignored for ZNeeded=0. EVDResult is modified if and only if MKL is present. -- ALGLIB routine -- 20.10.2014 Bochkanov Sergey *************************************************************************/ ae_bool smatrixtdevdmkl(/* Real */ ae_vector* d, /* Real */ ae_vector* e, ae_int_t n, ae_int_t zneeded, /* Real */ ae_matrix* z, ae_bool* evdresult, ae_state *_state) { #ifndef ALGLIB_INTERCEPTS_MKL ae_bool result; result = ae_false; return result; #else return _ialglib_i_smatrixtdevdmkl(d, e, n, zneeded, z, evdresult); #endif } /************************************************************************* Application of a sequence of elementary rotations to a matrix The algorithm pre-multiplies the matrix by a sequence of rotation transformations which is given by arrays C and S. Depending on the value of the IsForward parameter either 1 and 2, 3 and 4 and so on (if IsForward=true) rows are rotated, or the rows N and N-1, N-2 and N-3 and so on, are rotated. Not the whole matrix but only a part of it is transformed (rows from M1 to M2, columns from N1 to N2). Only the elements of this submatrix are changed. Input parameters: IsForward - the sequence of the rotation application. M1,M2 - the range of rows to be transformed. N1, N2 - the range of columns to be transformed. C,S - transformation coefficients. Array whose index ranges within [1..M2-M1]. A - processed matrix. WORK - working array whose index ranges within [N1..N2]. Output parameters: A - transformed matrix. Utility subroutine. 
*************************************************************************/ void applyrotationsfromtheleft(ae_bool isforward, ae_int_t m1, ae_int_t m2, ae_int_t n1, ae_int_t n2, /* Real */ ae_vector* c, /* Real */ ae_vector* s, /* Real */ ae_matrix* a, /* Real */ ae_vector* work, ae_state *_state) { ae_int_t j; ae_int_t jp1; double ctemp; double stemp; double temp; if( m1>m2||n1>n2 ) { return; } /* * Form P * A */ if( isforward ) { if( n1!=n2 ) { /* * Common case: N1<>N2 */ for(j=m1; j<=m2-1; j++) { ctemp = c->ptr.p_double[j-m1+1]; stemp = s->ptr.p_double[j-m1+1]; if( ae_fp_neq(ctemp,(double)(1))||ae_fp_neq(stemp,(double)(0)) ) { jp1 = j+1; ae_v_moved(&work->ptr.p_double[n1], 1, &a->ptr.pp_double[jp1][n1], 1, ae_v_len(n1,n2), ctemp); ae_v_subd(&work->ptr.p_double[n1], 1, &a->ptr.pp_double[j][n1], 1, ae_v_len(n1,n2), stemp); ae_v_muld(&a->ptr.pp_double[j][n1], 1, ae_v_len(n1,n2), ctemp); ae_v_addd(&a->ptr.pp_double[j][n1], 1, &a->ptr.pp_double[jp1][n1], 1, ae_v_len(n1,n2), stemp); ae_v_move(&a->ptr.pp_double[jp1][n1], 1, &work->ptr.p_double[n1], 1, ae_v_len(n1,n2)); } } } else { /* * Special case: N1=N2 */ for(j=m1; j<=m2-1; j++) { ctemp = c->ptr.p_double[j-m1+1]; stemp = s->ptr.p_double[j-m1+1]; if( ae_fp_neq(ctemp,(double)(1))||ae_fp_neq(stemp,(double)(0)) ) { temp = a->ptr.pp_double[j+1][n1]; a->ptr.pp_double[j+1][n1] = ctemp*temp-stemp*a->ptr.pp_double[j][n1]; a->ptr.pp_double[j][n1] = stemp*temp+ctemp*a->ptr.pp_double[j][n1]; } } } } else { if( n1!=n2 ) { /* * Common case: N1<>N2 */ for(j=m2-1; j>=m1; j--) { ctemp = c->ptr.p_double[j-m1+1]; stemp = s->ptr.p_double[j-m1+1]; if( ae_fp_neq(ctemp,(double)(1))||ae_fp_neq(stemp,(double)(0)) ) { jp1 = j+1; ae_v_moved(&work->ptr.p_double[n1], 1, &a->ptr.pp_double[jp1][n1], 1, ae_v_len(n1,n2), ctemp); ae_v_subd(&work->ptr.p_double[n1], 1, &a->ptr.pp_double[j][n1], 1, ae_v_len(n1,n2), stemp); ae_v_muld(&a->ptr.pp_double[j][n1], 1, ae_v_len(n1,n2), ctemp); ae_v_addd(&a->ptr.pp_double[j][n1], 1, &a->ptr.pp_double[jp1][n1], 1, ae_v_len(n1,n2), stemp); ae_v_move(&a->ptr.pp_double[jp1][n1], 1, &work->ptr.p_double[n1], 1, ae_v_len(n1,n2)); } } } else { /* * Special case: N1=N2 */ for(j=m2-1; j>=m1; j--) { ctemp = c->ptr.p_double[j-m1+1]; stemp = s->ptr.p_double[j-m1+1]; if( ae_fp_neq(ctemp,(double)(1))||ae_fp_neq(stemp,(double)(0)) ) { temp = a->ptr.pp_double[j+1][n1]; a->ptr.pp_double[j+1][n1] = ctemp*temp-stemp*a->ptr.pp_double[j][n1]; a->ptr.pp_double[j][n1] = stemp*temp+ctemp*a->ptr.pp_double[j][n1]; } } } } } /************************************************************************* Application of a sequence of elementary rotations to a matrix The algorithm post-multiplies the matrix by a sequence of rotation transformations which is given by arrays C and S. Depending on the value of the IsForward parameter either 1 and 2, 3 and 4 and so on (if IsForward=true) rows are rotated, or the rows N and N-1, N-2 and N-3 and so on are rotated. Not the whole matrix but only a part of it is transformed (rows from M1 to M2, columns from N1 to N2). Only the elements of this submatrix are changed. Input parameters: IsForward - the sequence of the rotation application. M1,M2 - the range of rows to be transformed. N1, N2 - the range of columns to be transformed. C,S - transformation coefficients. Array whose index ranges within [1..N2-N1]. A - processed matrix. WORK - working array whose index ranges within [M1..M2]. Output parameters: A - transformed matrix. Utility subroutine. 
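In both ApplyRotationsFromTheLeft and ApplyRotationsFromTheRight every step applies the same 2x2 plane rotation, with coefficients taken from C[] and S[] (satisfying CS^2+SN^2=1, as produced by GenerateRotation below). Written out for one pair of rows (the right-hand version acts on a pair of columns in exactly the same way), the in-place update performed by the code is:

    new A(j,  *) =  CS*A(j,  *) + SN*A(j+1,*)
    new A(j+1,*) = -SN*A(j,  *) + CS*A(j+1,*)

i.e. pre-multiplication of the row pair by [ CS SN ; -SN CS ]. The "special case" branches compute this element by element; the "common case" branches compute the same update vector-wise, using WORK to hold the new row (or column) before it overwrites A.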
*************************************************************************/ void applyrotationsfromtheright(ae_bool isforward, ae_int_t m1, ae_int_t m2, ae_int_t n1, ae_int_t n2, /* Real */ ae_vector* c, /* Real */ ae_vector* s, /* Real */ ae_matrix* a, /* Real */ ae_vector* work, ae_state *_state) { ae_int_t j; ae_int_t jp1; double ctemp; double stemp; double temp; /* * Form A * P' */ if( isforward ) { if( m1!=m2 ) { /* * Common case: M1<>M2 */ for(j=n1; j<=n2-1; j++) { ctemp = c->ptr.p_double[j-n1+1]; stemp = s->ptr.p_double[j-n1+1]; if( ae_fp_neq(ctemp,(double)(1))||ae_fp_neq(stemp,(double)(0)) ) { jp1 = j+1; ae_v_moved(&work->ptr.p_double[m1], 1, &a->ptr.pp_double[m1][jp1], a->stride, ae_v_len(m1,m2), ctemp); ae_v_subd(&work->ptr.p_double[m1], 1, &a->ptr.pp_double[m1][j], a->stride, ae_v_len(m1,m2), stemp); ae_v_muld(&a->ptr.pp_double[m1][j], a->stride, ae_v_len(m1,m2), ctemp); ae_v_addd(&a->ptr.pp_double[m1][j], a->stride, &a->ptr.pp_double[m1][jp1], a->stride, ae_v_len(m1,m2), stemp); ae_v_move(&a->ptr.pp_double[m1][jp1], a->stride, &work->ptr.p_double[m1], 1, ae_v_len(m1,m2)); } } } else { /* * Special case: M1=M2 */ for(j=n1; j<=n2-1; j++) { ctemp = c->ptr.p_double[j-n1+1]; stemp = s->ptr.p_double[j-n1+1]; if( ae_fp_neq(ctemp,(double)(1))||ae_fp_neq(stemp,(double)(0)) ) { temp = a->ptr.pp_double[m1][j+1]; a->ptr.pp_double[m1][j+1] = ctemp*temp-stemp*a->ptr.pp_double[m1][j]; a->ptr.pp_double[m1][j] = stemp*temp+ctemp*a->ptr.pp_double[m1][j]; } } } } else { if( m1!=m2 ) { /* * Common case: M1<>M2 */ for(j=n2-1; j>=n1; j--) { ctemp = c->ptr.p_double[j-n1+1]; stemp = s->ptr.p_double[j-n1+1]; if( ae_fp_neq(ctemp,(double)(1))||ae_fp_neq(stemp,(double)(0)) ) { jp1 = j+1; ae_v_moved(&work->ptr.p_double[m1], 1, &a->ptr.pp_double[m1][jp1], a->stride, ae_v_len(m1,m2), ctemp); ae_v_subd(&work->ptr.p_double[m1], 1, &a->ptr.pp_double[m1][j], a->stride, ae_v_len(m1,m2), stemp); ae_v_muld(&a->ptr.pp_double[m1][j], a->stride, ae_v_len(m1,m2), ctemp); ae_v_addd(&a->ptr.pp_double[m1][j], a->stride, &a->ptr.pp_double[m1][jp1], a->stride, ae_v_len(m1,m2), stemp); ae_v_move(&a->ptr.pp_double[m1][jp1], a->stride, &work->ptr.p_double[m1], 1, ae_v_len(m1,m2)); } } } else { /* * Special case: M1=M2 */ for(j=n2-1; j>=n1; j--) { ctemp = c->ptr.p_double[j-n1+1]; stemp = s->ptr.p_double[j-n1+1]; if( ae_fp_neq(ctemp,(double)(1))||ae_fp_neq(stemp,(double)(0)) ) { temp = a->ptr.pp_double[m1][j+1]; a->ptr.pp_double[m1][j+1] = ctemp*temp-stemp*a->ptr.pp_double[m1][j]; a->ptr.pp_double[m1][j] = stemp*temp+ctemp*a->ptr.pp_double[m1][j]; } } } } } /************************************************************************* The subroutine generates the elementary rotation, so that: [ CS SN ] . 
[ F ] = [ R ] [ -SN CS ] [ G ] [ 0 ] CS**2 + SN**2 = 1 *************************************************************************/ void generaterotation(double f, double g, double* cs, double* sn, double* r, ae_state *_state) { double f1; double g1; *cs = 0; *sn = 0; *r = 0; if( ae_fp_eq(g,(double)(0)) ) { *cs = (double)(1); *sn = (double)(0); *r = f; } else { if( ae_fp_eq(f,(double)(0)) ) { *cs = (double)(0); *sn = (double)(1); *r = g; } else { f1 = f; g1 = g; if( ae_fp_greater(ae_fabs(f1, _state),ae_fabs(g1, _state)) ) { *r = ae_fabs(f1, _state)*ae_sqrt(1+ae_sqr(g1/f1, _state), _state); } else { *r = ae_fabs(g1, _state)*ae_sqrt(1+ae_sqr(f1/g1, _state), _state); } *cs = f1/(*r); *sn = g1/(*r); if( ae_fp_greater(ae_fabs(f, _state),ae_fabs(g, _state))&&ae_fp_less(*cs,(double)(0)) ) { *cs = -*cs; *sn = -*sn; *r = -*r; } } } } /************************************************************************* Utility subroutine performing the "safe" solution of system of linear equations with triangular coefficient matrices. The subroutine uses scaling and solves the scaled system A*x=s*b (where s is a scalar value) instead of A*x=b, choosing s so that x can be represented by a floating-point number. The closer the system gets to a singular, the less s is. If the system is singular, s=0 and x contains the non-trivial solution of equation A*x=0. The feature of an algorithm is that it could not cause an overflow or a division by zero regardless of the matrix used as the input. The algorithm can solve systems of equations with upper/lower triangular matrices, with/without unit diagonal, and systems of type A*x=b or A'*x=b (where A' is a transposed matrix A). Input parameters: A - system matrix. Array whose indexes range within [0..N-1, 0..N-1]. N - size of matrix A. X - right-hand member of a system. Array whose index ranges within [0..N-1]. IsUpper - matrix type. If it is True, the system matrix is the upper triangular and is located in the corresponding part of matrix A. Trans - problem type. If it is True, the problem to be solved is A'*x=b, otherwise it is A*x=b. Isunit - matrix type. If it is True, the system matrix has a unit diagonal (the elements on the main diagonal are not used in the calculation process), otherwise the matrix is considered to be a general triangular matrix. Output parameters: X - solution. Array whose index ranges within [0..N-1]. S - scaling factor. -- LAPACK auxiliary routine (version 3.0) -- Univ. of Tennessee, Univ. 
of California Berkeley, NAG Ltd., Courant Institute, Argonne National Lab, and Rice University June 30, 1992 *************************************************************************/ void rmatrixtrsafesolve(/* Real */ ae_matrix* a, ae_int_t n, /* Real */ ae_vector* x, double* s, ae_bool isupper, ae_bool istrans, ae_bool isunit, ae_state *_state) { ae_frame _frame_block; ae_bool normin; ae_vector cnorm; ae_matrix a1; ae_vector x1; ae_int_t i; ae_frame_make(_state, &_frame_block); *s = 0; ae_vector_init(&cnorm, 0, DT_REAL, _state); ae_matrix_init(&a1, 0, 0, DT_REAL, _state); ae_vector_init(&x1, 0, DT_REAL, _state); /* * From 0-based to 1-based */ normin = ae_false; ae_matrix_set_length(&a1, n+1, n+1, _state); ae_vector_set_length(&x1, n+1, _state); for(i=1; i<=n; i++) { ae_v_move(&a1.ptr.pp_double[i][1], 1, &a->ptr.pp_double[i-1][0], 1, ae_v_len(1,n)); } ae_v_move(&x1.ptr.p_double[1], 1, &x->ptr.p_double[0], 1, ae_v_len(1,n)); /* * Solve 1-based */ safesolvetriangular(&a1, n, &x1, s, isupper, istrans, isunit, normin, &cnorm, _state); /* * From 1-based to 0-based */ ae_v_move(&x->ptr.p_double[0], 1, &x1.ptr.p_double[1], 1, ae_v_len(0,n-1)); ae_frame_leave(_state); } /************************************************************************* Obsolete 1-based subroutine. See RMatrixTRSafeSolve for 0-based replacement. *************************************************************************/ void safesolvetriangular(/* Real */ ae_matrix* a, ae_int_t n, /* Real */ ae_vector* x, double* s, ae_bool isupper, ae_bool istrans, ae_bool isunit, ae_bool normin, /* Real */ ae_vector* cnorm, ae_state *_state) { ae_int_t i; ae_int_t imax; ae_int_t j; ae_int_t jfirst; ae_int_t jinc; ae_int_t jlast; ae_int_t jm1; ae_int_t jp1; ae_int_t ip1; ae_int_t im1; ae_int_t k; ae_int_t flg; double v; double vd; double bignum; double grow; double rec; double smlnum; double sumj; double tjj; double tjjs; double tmax; double tscal; double uscal; double xbnd; double xj; double xmax; ae_bool notran; ae_bool upper; ae_bool nounit; *s = 0; upper = isupper; notran = !istrans; nounit = !isunit; /* * these initializers are not really necessary, * but without them compiler complains about uninitialized locals */ tjjs = (double)(0); /* * Quick return if possible */ if( n==0 ) { return; } /* * Determine machine dependent parameters to control overflow. */ smlnum = ae_minrealnumber/(ae_machineepsilon*2); bignum = 1/smlnum; *s = (double)(1); if( !normin ) { ae_vector_set_length(cnorm, n+1, _state); /* * Compute the 1-norm of each column, not including the diagonal. */ if( upper ) { /* * A is upper triangular. */ for(j=1; j<=n; j++) { v = (double)(0); for(k=1; k<=j-1; k++) { v = v+ae_fabs(a->ptr.pp_double[k][j], _state); } cnorm->ptr.p_double[j] = v; } } else { /* * A is lower triangular. */ for(j=1; j<=n-1; j++) { v = (double)(0); for(k=j+1; k<=n; k++) { v = v+ae_fabs(a->ptr.pp_double[k][j], _state); } cnorm->ptr.p_double[j] = v; } cnorm->ptr.p_double[n] = (double)(0); } } /* * Scale the column norms by TSCAL if the maximum element in CNORM is * greater than BIGNUM. */ imax = 1; for(k=2; k<=n; k++) { if( ae_fp_greater(cnorm->ptr.p_double[k],cnorm->ptr.p_double[imax]) ) { imax = k; } } tmax = cnorm->ptr.p_double[imax]; if( ae_fp_less_eq(tmax,bignum) ) { tscal = (double)(1); } else { tscal = 1/(smlnum*tmax); ae_v_muld(&cnorm->ptr.p_double[1], 1, ae_v_len(1,n), tscal); } /* * Compute a bound on the computed solution vector to see if the * Level 2 BLAS routine DTRSV can be used. 
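* (Illustration of why this bound matters: the routine solves the scaled
* system A*x = s*b, with s chosen so that x stays representable (s=1 when
* no scaling is needed, smaller as the system approaches singularity).
* For example, for a 1x1 system with A = 1.0E-200 and b = 1.0E+200 the
* exact solution 1.0E+400 overflows double precision, so the solver
* instead returns a finite x together with s < 1 such that A*x = s*b.
* The growth estimate computed below decides whether scaling can be
* skipped and the fast unscaled Level 2 solve used directly.)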
*/ j = 1; for(k=2; k<=n; k++) { if( ae_fp_greater(ae_fabs(x->ptr.p_double[k], _state),ae_fabs(x->ptr.p_double[j], _state)) ) { j = k; } } xmax = ae_fabs(x->ptr.p_double[j], _state); xbnd = xmax; if( notran ) { /* * Compute the growth in A * x = b. */ if( upper ) { jfirst = n; jlast = 1; jinc = -1; } else { jfirst = 1; jlast = n; jinc = 1; } if( ae_fp_neq(tscal,(double)(1)) ) { grow = (double)(0); } else { if( nounit ) { /* * A is non-unit triangular. * * Compute GROW = 1/G(j) and XBND = 1/M(j). * Initially, G(0) = max{x(i), i=1,...,n}. */ grow = 1/ae_maxreal(xbnd, smlnum, _state); xbnd = grow; j = jfirst; while((jinc>0&&j<=jlast)||(jinc<0&&j>=jlast)) { /* * Exit the loop if the growth factor is too small. */ if( ae_fp_less_eq(grow,smlnum) ) { break; } /* * M(j) = G(j-1) / abs(A(j,j)) */ tjj = ae_fabs(a->ptr.pp_double[j][j], _state); xbnd = ae_minreal(xbnd, ae_minreal((double)(1), tjj, _state)*grow, _state); if( ae_fp_greater_eq(tjj+cnorm->ptr.p_double[j],smlnum) ) { /* * G(j) = G(j-1)*( 1 + CNORM(j) / abs(A(j,j)) ) */ grow = grow*(tjj/(tjj+cnorm->ptr.p_double[j])); } else { /* * G(j) could overflow, set GROW to 0. */ grow = (double)(0); } if( j==jlast ) { grow = xbnd; } j = j+jinc; } } else { /* * A is unit triangular. * * Compute GROW = 1/G(j), where G(0) = max{x(i), i=1,...,n}. */ grow = ae_minreal((double)(1), 1/ae_maxreal(xbnd, smlnum, _state), _state); j = jfirst; while((jinc>0&&j<=jlast)||(jinc<0&&j>=jlast)) { /* * Exit the loop if the growth factor is too small. */ if( ae_fp_less_eq(grow,smlnum) ) { break; } /* * G(j) = G(j-1)*( 1 + CNORM(j) ) */ grow = grow*(1/(1+cnorm->ptr.p_double[j])); j = j+jinc; } } } } else { /* * Compute the growth in A' * x = b. */ if( upper ) { jfirst = 1; jlast = n; jinc = 1; } else { jfirst = n; jlast = 1; jinc = -1; } if( ae_fp_neq(tscal,(double)(1)) ) { grow = (double)(0); } else { if( nounit ) { /* * A is non-unit triangular. * * Compute GROW = 1/G(j) and XBND = 1/M(j). * Initially, M(0) = max{x(i), i=1,...,n}. */ grow = 1/ae_maxreal(xbnd, smlnum, _state); xbnd = grow; j = jfirst; while((jinc>0&&j<=jlast)||(jinc<0&&j>=jlast)) { /* * Exit the loop if the growth factor is too small. */ if( ae_fp_less_eq(grow,smlnum) ) { break; } /* * G(j) = max( G(j-1), M(j-1)*( 1 + CNORM(j) ) ) */ xj = 1+cnorm->ptr.p_double[j]; grow = ae_minreal(grow, xbnd/xj, _state); /* * M(j) = M(j-1)*( 1 + CNORM(j) ) / abs(A(j,j)) */ tjj = ae_fabs(a->ptr.pp_double[j][j], _state); if( ae_fp_greater(xj,tjj) ) { xbnd = xbnd*(tjj/xj); } if( j==jlast ) { grow = ae_minreal(grow, xbnd, _state); } j = j+jinc; } } else { /* * A is unit triangular. * * Compute GROW = 1/G(j), where G(0) = max{x(i), i=1,...,n}. */ grow = ae_minreal((double)(1), 1/ae_maxreal(xbnd, smlnum, _state), _state); j = jfirst; while((jinc>0&&j<=jlast)||(jinc<0&&j>=jlast)) { /* * Exit the loop if the growth factor is too small. */ if( ae_fp_less_eq(grow,smlnum) ) { break; } /* * G(j) = ( 1 + CNORM(j) )*G(j-1) */ xj = 1+cnorm->ptr.p_double[j]; grow = grow/xj; j = j+jinc; } } } } if( ae_fp_greater(grow*tscal,smlnum) ) { /* * Use the Level 2 BLAS solve if the reciprocal of the bound on * elements of X is not too small. 
*/ if( (upper&¬ran)||(!upper&&!notran) ) { if( nounit ) { vd = a->ptr.pp_double[n][n]; } else { vd = (double)(1); } x->ptr.p_double[n] = x->ptr.p_double[n]/vd; for(i=n-1; i>=1; i--) { ip1 = i+1; if( upper ) { v = ae_v_dotproduct(&a->ptr.pp_double[i][ip1], 1, &x->ptr.p_double[ip1], 1, ae_v_len(ip1,n)); } else { v = ae_v_dotproduct(&a->ptr.pp_double[ip1][i], a->stride, &x->ptr.p_double[ip1], 1, ae_v_len(ip1,n)); } if( nounit ) { vd = a->ptr.pp_double[i][i]; } else { vd = (double)(1); } x->ptr.p_double[i] = (x->ptr.p_double[i]-v)/vd; } } else { if( nounit ) { vd = a->ptr.pp_double[1][1]; } else { vd = (double)(1); } x->ptr.p_double[1] = x->ptr.p_double[1]/vd; for(i=2; i<=n; i++) { im1 = i-1; if( upper ) { v = ae_v_dotproduct(&a->ptr.pp_double[1][i], a->stride, &x->ptr.p_double[1], 1, ae_v_len(1,im1)); } else { v = ae_v_dotproduct(&a->ptr.pp_double[i][1], 1, &x->ptr.p_double[1], 1, ae_v_len(1,im1)); } if( nounit ) { vd = a->ptr.pp_double[i][i]; } else { vd = (double)(1); } x->ptr.p_double[i] = (x->ptr.p_double[i]-v)/vd; } } } else { /* * Use a Level 1 BLAS solve, scaling intermediate results. */ if( ae_fp_greater(xmax,bignum) ) { /* * Scale X so that its components are less than or equal to * BIGNUM in absolute value. */ *s = bignum/xmax; ae_v_muld(&x->ptr.p_double[1], 1, ae_v_len(1,n), *s); xmax = bignum; } if( notran ) { /* * Solve A * x = b */ j = jfirst; while((jinc>0&&j<=jlast)||(jinc<0&&j>=jlast)) { /* * Compute x(j) = b(j) / A(j,j), scaling x if necessary. */ xj = ae_fabs(x->ptr.p_double[j], _state); flg = 0; if( nounit ) { tjjs = a->ptr.pp_double[j][j]*tscal; } else { tjjs = tscal; if( ae_fp_eq(tscal,(double)(1)) ) { flg = 100; } } if( flg!=100 ) { tjj = ae_fabs(tjjs, _state); if( ae_fp_greater(tjj,smlnum) ) { /* * abs(A(j,j)) > SMLNUM: */ if( ae_fp_less(tjj,(double)(1)) ) { if( ae_fp_greater(xj,tjj*bignum) ) { /* * Scale x by 1/b(j). */ rec = 1/xj; ae_v_muld(&x->ptr.p_double[1], 1, ae_v_len(1,n), rec); *s = *s*rec; xmax = xmax*rec; } } x->ptr.p_double[j] = x->ptr.p_double[j]/tjjs; xj = ae_fabs(x->ptr.p_double[j], _state); } else { if( ae_fp_greater(tjj,(double)(0)) ) { /* * 0 < abs(A(j,j)) <= SMLNUM: */ if( ae_fp_greater(xj,tjj*bignum) ) { /* * Scale x by (1/abs(x(j)))*abs(A(j,j))*BIGNUM * to avoid overflow when dividing by A(j,j). */ rec = tjj*bignum/xj; if( ae_fp_greater(cnorm->ptr.p_double[j],(double)(1)) ) { /* * Scale by 1/CNORM(j) to avoid overflow when * multiplying x(j) times column j. */ rec = rec/cnorm->ptr.p_double[j]; } ae_v_muld(&x->ptr.p_double[1], 1, ae_v_len(1,n), rec); *s = *s*rec; xmax = xmax*rec; } x->ptr.p_double[j] = x->ptr.p_double[j]/tjjs; xj = ae_fabs(x->ptr.p_double[j], _state); } else { /* * A(j,j) = 0: Set x(1:n) = 0, x(j) = 1, and * scale = 0, and compute a solution to A*x = 0. */ for(i=1; i<=n; i++) { x->ptr.p_double[i] = (double)(0); } x->ptr.p_double[j] = (double)(1); xj = (double)(1); *s = (double)(0); xmax = (double)(0); } } } /* * Scale x if necessary to avoid overflow when adding a * multiple of column j of A. */ if( ae_fp_greater(xj,(double)(1)) ) { rec = 1/xj; if( ae_fp_greater(cnorm->ptr.p_double[j],(bignum-xmax)*rec) ) { /* * Scale x by 1/(2*abs(x(j))). */ rec = rec*0.5; ae_v_muld(&x->ptr.p_double[1], 1, ae_v_len(1,n), rec); *s = *s*rec; } } else { if( ae_fp_greater(xj*cnorm->ptr.p_double[j],bignum-xmax) ) { /* * Scale x by 1/2. 
*/ ae_v_muld(&x->ptr.p_double[1], 1, ae_v_len(1,n), 0.5); *s = *s*0.5; } } if( upper ) { if( j>1 ) { /* * Compute the update * x(1:j-1) := x(1:j-1) - x(j) * A(1:j-1,j) */ v = x->ptr.p_double[j]*tscal; jm1 = j-1; ae_v_subd(&x->ptr.p_double[1], 1, &a->ptr.pp_double[1][j], a->stride, ae_v_len(1,jm1), v); i = 1; for(k=2; k<=j-1; k++) { if( ae_fp_greater(ae_fabs(x->ptr.p_double[k], _state),ae_fabs(x->ptr.p_double[i], _state)) ) { i = k; } } xmax = ae_fabs(x->ptr.p_double[i], _state); } } else { if( jptr.p_double[j]*tscal; ae_v_subd(&x->ptr.p_double[jp1], 1, &a->ptr.pp_double[jp1][j], a->stride, ae_v_len(jp1,n), v); i = j+1; for(k=j+2; k<=n; k++) { if( ae_fp_greater(ae_fabs(x->ptr.p_double[k], _state),ae_fabs(x->ptr.p_double[i], _state)) ) { i = k; } } xmax = ae_fabs(x->ptr.p_double[i], _state); } } j = j+jinc; } } else { /* * Solve A' * x = b */ j = jfirst; while((jinc>0&&j<=jlast)||(jinc<0&&j>=jlast)) { /* * Compute x(j) = b(j) - sum A(k,j)*x(k). * k<>j */ xj = ae_fabs(x->ptr.p_double[j], _state); uscal = tscal; rec = 1/ae_maxreal(xmax, (double)(1), _state); if( ae_fp_greater(cnorm->ptr.p_double[j],(bignum-xj)*rec) ) { /* * If x(j) could overflow, scale x by 1/(2*XMAX). */ rec = rec*0.5; if( nounit ) { tjjs = a->ptr.pp_double[j][j]*tscal; } else { tjjs = tscal; } tjj = ae_fabs(tjjs, _state); if( ae_fp_greater(tjj,(double)(1)) ) { /* * Divide by A(j,j) when scaling x if A(j,j) > 1. */ rec = ae_minreal((double)(1), rec*tjj, _state); uscal = uscal/tjjs; } if( ae_fp_less(rec,(double)(1)) ) { ae_v_muld(&x->ptr.p_double[1], 1, ae_v_len(1,n), rec); *s = *s*rec; xmax = xmax*rec; } } sumj = (double)(0); if( ae_fp_eq(uscal,(double)(1)) ) { /* * If the scaling needed for A in the dot product is 1, * call DDOT to perform the dot product. */ if( upper ) { if( j>1 ) { jm1 = j-1; sumj = ae_v_dotproduct(&a->ptr.pp_double[1][j], a->stride, &x->ptr.p_double[1], 1, ae_v_len(1,jm1)); } else { sumj = (double)(0); } } else { if( jptr.pp_double[jp1][j], a->stride, &x->ptr.p_double[jp1], 1, ae_v_len(jp1,n)); } } } else { /* * Otherwise, use in-line code for the dot product. */ if( upper ) { for(i=1; i<=j-1; i++) { v = a->ptr.pp_double[i][j]*uscal; sumj = sumj+v*x->ptr.p_double[i]; } } else { if( jptr.pp_double[i][j]*uscal; sumj = sumj+v*x->ptr.p_double[i]; } } } } if( ae_fp_eq(uscal,tscal) ) { /* * Compute x(j) := ( x(j) - sumj ) / A(j,j) if 1/A(j,j) * was not used to scale the dotproduct. */ x->ptr.p_double[j] = x->ptr.p_double[j]-sumj; xj = ae_fabs(x->ptr.p_double[j], _state); flg = 0; if( nounit ) { tjjs = a->ptr.pp_double[j][j]*tscal; } else { tjjs = tscal; if( ae_fp_eq(tscal,(double)(1)) ) { flg = 150; } } /* * Compute x(j) = x(j) / A(j,j), scaling if necessary. */ if( flg!=150 ) { tjj = ae_fabs(tjjs, _state); if( ae_fp_greater(tjj,smlnum) ) { /* * abs(A(j,j)) > SMLNUM: */ if( ae_fp_less(tjj,(double)(1)) ) { if( ae_fp_greater(xj,tjj*bignum) ) { /* * Scale X by 1/abs(x(j)). */ rec = 1/xj; ae_v_muld(&x->ptr.p_double[1], 1, ae_v_len(1,n), rec); *s = *s*rec; xmax = xmax*rec; } } x->ptr.p_double[j] = x->ptr.p_double[j]/tjjs; } else { if( ae_fp_greater(tjj,(double)(0)) ) { /* * 0 < abs(A(j,j)) <= SMLNUM: */ if( ae_fp_greater(xj,tjj*bignum) ) { /* * Scale x by (1/abs(x(j)))*abs(A(j,j))*BIGNUM. */ rec = tjj*bignum/xj; ae_v_muld(&x->ptr.p_double[1], 1, ae_v_len(1,n), rec); *s = *s*rec; xmax = xmax*rec; } x->ptr.p_double[j] = x->ptr.p_double[j]/tjjs; } else { /* * A(j,j) = 0: Set x(1:n) = 0, x(j) = 1, and * scale = 0, and compute a solution to A'*x = 0. 
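 * (The zero scale factor S returned to the caller signals that the matrix
 * is exactly singular; X is left holding a nonzero solution of the
 * homogeneous system.)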
*/ for(i=1; i<=n; i++) { x->ptr.p_double[i] = (double)(0); } x->ptr.p_double[j] = (double)(1); *s = (double)(0); xmax = (double)(0); } } } } else { /* * Compute x(j) := x(j) / A(j,j) - sumj if the dot * product has already been divided by 1/A(j,j). */ x->ptr.p_double[j] = x->ptr.p_double[j]/tjjs-sumj; } xmax = ae_maxreal(xmax, ae_fabs(x->ptr.p_double[j], _state), _state); j = j+jinc; } } *s = *s/tscal; } /* * Scale the column norms by 1/TSCAL for return. */ if( ae_fp_neq(tscal,(double)(1)) ) { v = 1/tscal; ae_v_muld(&cnorm->ptr.p_double[1], 1, ae_v_len(1,n), v); } } /************************************************************************* Real implementation of CMatrixScaledTRSafeSolve -- ALGLIB routine -- 21.01.2010 Bochkanov Sergey *************************************************************************/ ae_bool rmatrixscaledtrsafesolve(/* Real */ ae_matrix* a, double sa, ae_int_t n, /* Real */ ae_vector* x, ae_bool isupper, ae_int_t trans, ae_bool isunit, double maxgrowth, ae_state *_state) { ae_frame _frame_block; double lnmax; double nrmb; double nrmx; ae_int_t i; ae_complex alpha; ae_complex beta; double vr; ae_complex cx; ae_vector tmp; ae_bool result; ae_frame_make(_state, &_frame_block); ae_vector_init(&tmp, 0, DT_REAL, _state); ae_assert(n>0, "RMatrixTRSafeSolve: incorrect N!", _state); ae_assert(trans==0||trans==1, "RMatrixTRSafeSolve: incorrect Trans!", _state); result = ae_true; lnmax = ae_log(ae_maxrealnumber, _state); /* * Quick return if possible */ if( n<=0 ) { ae_frame_leave(_state); return result; } /* * Load norms: right part and X */ nrmb = (double)(0); for(i=0; i<=n-1; i++) { nrmb = ae_maxreal(nrmb, ae_fabs(x->ptr.p_double[i], _state), _state); } nrmx = (double)(0); /* * Solve */ ae_vector_set_length(&tmp, n, _state); result = ae_true; if( isupper&&trans==0 ) { /* * U*x = b */ for(i=n-1; i>=0; i--) { /* * Task is reduced to alpha*x[i] = beta */ if( isunit ) { alpha = ae_complex_from_d(sa); } else { alpha = ae_complex_from_d(a->ptr.pp_double[i][i]*sa); } if( iptr.pp_double[i][i+1], 1, ae_v_len(i+1,n-1), sa); vr = ae_v_dotproduct(&tmp.ptr.p_double[i+1], 1, &x->ptr.p_double[i+1], 1, ae_v_len(i+1,n-1)); beta = ae_complex_from_d(x->ptr.p_double[i]-vr); } else { beta = ae_complex_from_d(x->ptr.p_double[i]); } /* * solve alpha*x[i] = beta */ result = safesolve_cbasicsolveandupdate(alpha, beta, lnmax, nrmb, maxgrowth, &nrmx, &cx, _state); if( !result ) { ae_frame_leave(_state); return result; } x->ptr.p_double[i] = cx.x; } ae_frame_leave(_state); return result; } if( !isupper&&trans==0 ) { /* * L*x = b */ for(i=0; i<=n-1; i++) { /* * Task is reduced to alpha*x[i] = beta */ if( isunit ) { alpha = ae_complex_from_d(sa); } else { alpha = ae_complex_from_d(a->ptr.pp_double[i][i]*sa); } if( i>0 ) { ae_v_moved(&tmp.ptr.p_double[0], 1, &a->ptr.pp_double[i][0], 1, ae_v_len(0,i-1), sa); vr = ae_v_dotproduct(&tmp.ptr.p_double[0], 1, &x->ptr.p_double[0], 1, ae_v_len(0,i-1)); beta = ae_complex_from_d(x->ptr.p_double[i]-vr); } else { beta = ae_complex_from_d(x->ptr.p_double[i]); } /* * solve alpha*x[i] = beta */ result = safesolve_cbasicsolveandupdate(alpha, beta, lnmax, nrmb, maxgrowth, &nrmx, &cx, _state); if( !result ) { ae_frame_leave(_state); return result; } x->ptr.p_double[i] = cx.x; } ae_frame_leave(_state); return result; } if( isupper&&trans==1 ) { /* * U^T*x = b */ for(i=0; i<=n-1; i++) { /* * Task is reduced to alpha*x[i] = beta */ if( isunit ) { alpha = ae_complex_from_d(sa); } else { alpha = ae_complex_from_d(a->ptr.pp_double[i][i]*sa); } beta = 
ae_complex_from_d(x->ptr.p_double[i]); /* * solve alpha*x[i] = beta */ result = safesolve_cbasicsolveandupdate(alpha, beta, lnmax, nrmb, maxgrowth, &nrmx, &cx, _state); if( !result ) { ae_frame_leave(_state); return result; } x->ptr.p_double[i] = cx.x; /* * update the rest of right part */ if( iptr.pp_double[i][i+1], 1, ae_v_len(i+1,n-1), sa); ae_v_subd(&x->ptr.p_double[i+1], 1, &tmp.ptr.p_double[i+1], 1, ae_v_len(i+1,n-1), vr); } } ae_frame_leave(_state); return result; } if( !isupper&&trans==1 ) { /* * L^T*x = b */ for(i=n-1; i>=0; i--) { /* * Task is reduced to alpha*x[i] = beta */ if( isunit ) { alpha = ae_complex_from_d(sa); } else { alpha = ae_complex_from_d(a->ptr.pp_double[i][i]*sa); } beta = ae_complex_from_d(x->ptr.p_double[i]); /* * solve alpha*x[i] = beta */ result = safesolve_cbasicsolveandupdate(alpha, beta, lnmax, nrmb, maxgrowth, &nrmx, &cx, _state); if( !result ) { ae_frame_leave(_state); return result; } x->ptr.p_double[i] = cx.x; /* * update the rest of right part */ if( i>0 ) { vr = cx.x; ae_v_moved(&tmp.ptr.p_double[0], 1, &a->ptr.pp_double[i][0], 1, ae_v_len(0,i-1), sa); ae_v_subd(&x->ptr.p_double[0], 1, &tmp.ptr.p_double[0], 1, ae_v_len(0,i-1), vr); } } ae_frame_leave(_state); return result; } result = ae_false; ae_frame_leave(_state); return result; } /************************************************************************* Internal subroutine for safe solution of SA*op(A)=b where A is NxN upper/lower triangular/unitriangular matrix, op(A) is either identity transform, transposition or Hermitian transposition, SA is a scaling factor such that max(|SA*A[i,j]|) is close to 1.0 in magnutude. This subroutine limits relative growth of solution (in inf-norm) by MaxGrowth, returning False if growth exceeds MaxGrowth. Degenerate or near-degenerate matrices are handled correctly (False is returned) as long as MaxGrowth is significantly less than MaxRealNumber/norm(b). 
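Hypothetical calling sketch (added for illustration, not part of the original
interface notes; it assumes A and X are initialized ae_matrix/ae_vector
structures of size N, SA was chosen by the caller so that max|SA*A[i,j]| is
about 1, and handle_degenerate_case() is a placeholder for caller code):

    if( !cmatrixscaledtrsafesolve(&a, sa, n, &x, ae_true, 0, ae_false, maxgrowth, _state) )
        handle_degenerate_case();
    (on success X is overwritten by the solution; False means the matrix is
    degenerate or the relative growth of the solution exceeded MaxGrowth)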
-- ALGLIB routine -- 21.01.2010 Bochkanov Sergey *************************************************************************/ ae_bool cmatrixscaledtrsafesolve(/* Complex */ ae_matrix* a, double sa, ae_int_t n, /* Complex */ ae_vector* x, ae_bool isupper, ae_int_t trans, ae_bool isunit, double maxgrowth, ae_state *_state) { ae_frame _frame_block; double lnmax; double nrmb; double nrmx; ae_int_t i; ae_complex alpha; ae_complex beta; ae_complex vc; ae_vector tmp; ae_bool result; ae_frame_make(_state, &_frame_block); ae_vector_init(&tmp, 0, DT_COMPLEX, _state); ae_assert(n>0, "CMatrixTRSafeSolve: incorrect N!", _state); ae_assert((trans==0||trans==1)||trans==2, "CMatrixTRSafeSolve: incorrect Trans!", _state); result = ae_true; lnmax = ae_log(ae_maxrealnumber, _state); /* * Quick return if possible */ if( n<=0 ) { ae_frame_leave(_state); return result; } /* * Load norms: right part and X */ nrmb = (double)(0); for(i=0; i<=n-1; i++) { nrmb = ae_maxreal(nrmb, ae_c_abs(x->ptr.p_complex[i], _state), _state); } nrmx = (double)(0); /* * Solve */ ae_vector_set_length(&tmp, n, _state); result = ae_true; if( isupper&&trans==0 ) { /* * U*x = b */ for(i=n-1; i>=0; i--) { /* * Task is reduced to alpha*x[i] = beta */ if( isunit ) { alpha = ae_complex_from_d(sa); } else { alpha = ae_c_mul_d(a->ptr.pp_complex[i][i],sa); } if( iptr.pp_complex[i][i+1], 1, "N", ae_v_len(i+1,n-1), sa); vc = ae_v_cdotproduct(&tmp.ptr.p_complex[i+1], 1, "N", &x->ptr.p_complex[i+1], 1, "N", ae_v_len(i+1,n-1)); beta = ae_c_sub(x->ptr.p_complex[i],vc); } else { beta = x->ptr.p_complex[i]; } /* * solve alpha*x[i] = beta */ result = safesolve_cbasicsolveandupdate(alpha, beta, lnmax, nrmb, maxgrowth, &nrmx, &vc, _state); if( !result ) { ae_frame_leave(_state); return result; } x->ptr.p_complex[i] = vc; } ae_frame_leave(_state); return result; } if( !isupper&&trans==0 ) { /* * L*x = b */ for(i=0; i<=n-1; i++) { /* * Task is reduced to alpha*x[i] = beta */ if( isunit ) { alpha = ae_complex_from_d(sa); } else { alpha = ae_c_mul_d(a->ptr.pp_complex[i][i],sa); } if( i>0 ) { ae_v_cmoved(&tmp.ptr.p_complex[0], 1, &a->ptr.pp_complex[i][0], 1, "N", ae_v_len(0,i-1), sa); vc = ae_v_cdotproduct(&tmp.ptr.p_complex[0], 1, "N", &x->ptr.p_complex[0], 1, "N", ae_v_len(0,i-1)); beta = ae_c_sub(x->ptr.p_complex[i],vc); } else { beta = x->ptr.p_complex[i]; } /* * solve alpha*x[i] = beta */ result = safesolve_cbasicsolveandupdate(alpha, beta, lnmax, nrmb, maxgrowth, &nrmx, &vc, _state); if( !result ) { ae_frame_leave(_state); return result; } x->ptr.p_complex[i] = vc; } ae_frame_leave(_state); return result; } if( isupper&&trans==1 ) { /* * U^T*x = b */ for(i=0; i<=n-1; i++) { /* * Task is reduced to alpha*x[i] = beta */ if( isunit ) { alpha = ae_complex_from_d(sa); } else { alpha = ae_c_mul_d(a->ptr.pp_complex[i][i],sa); } beta = x->ptr.p_complex[i]; /* * solve alpha*x[i] = beta */ result = safesolve_cbasicsolveandupdate(alpha, beta, lnmax, nrmb, maxgrowth, &nrmx, &vc, _state); if( !result ) { ae_frame_leave(_state); return result; } x->ptr.p_complex[i] = vc; /* * update the rest of right part */ if( iptr.pp_complex[i][i+1], 1, "N", ae_v_len(i+1,n-1), sa); ae_v_csubc(&x->ptr.p_complex[i+1], 1, &tmp.ptr.p_complex[i+1], 1, "N", ae_v_len(i+1,n-1), vc); } } ae_frame_leave(_state); return result; } if( !isupper&&trans==1 ) { /* * L^T*x = b */ for(i=n-1; i>=0; i--) { /* * Task is reduced to alpha*x[i] = beta */ if( isunit ) { alpha = ae_complex_from_d(sa); } else { alpha = ae_c_mul_d(a->ptr.pp_complex[i][i],sa); } beta = x->ptr.p_complex[i]; /* * solve alpha*x[i] = 
beta */ result = safesolve_cbasicsolveandupdate(alpha, beta, lnmax, nrmb, maxgrowth, &nrmx, &vc, _state); if( !result ) { ae_frame_leave(_state); return result; } x->ptr.p_complex[i] = vc; /* * update the rest of right part */ if( i>0 ) { ae_v_cmoved(&tmp.ptr.p_complex[0], 1, &a->ptr.pp_complex[i][0], 1, "N", ae_v_len(0,i-1), sa); ae_v_csubc(&x->ptr.p_complex[0], 1, &tmp.ptr.p_complex[0], 1, "N", ae_v_len(0,i-1), vc); } } ae_frame_leave(_state); return result; } if( isupper&&trans==2 ) { /* * U^H*x = b */ for(i=0; i<=n-1; i++) { /* * Task is reduced to alpha*x[i] = beta */ if( isunit ) { alpha = ae_complex_from_d(sa); } else { alpha = ae_c_mul_d(ae_c_conj(a->ptr.pp_complex[i][i], _state),sa); } beta = x->ptr.p_complex[i]; /* * solve alpha*x[i] = beta */ result = safesolve_cbasicsolveandupdate(alpha, beta, lnmax, nrmb, maxgrowth, &nrmx, &vc, _state); if( !result ) { ae_frame_leave(_state); return result; } x->ptr.p_complex[i] = vc; /* * update the rest of right part */ if( iptr.pp_complex[i][i+1], 1, "Conj", ae_v_len(i+1,n-1), sa); ae_v_csubc(&x->ptr.p_complex[i+1], 1, &tmp.ptr.p_complex[i+1], 1, "N", ae_v_len(i+1,n-1), vc); } } ae_frame_leave(_state); return result; } if( !isupper&&trans==2 ) { /* * L^T*x = b */ for(i=n-1; i>=0; i--) { /* * Task is reduced to alpha*x[i] = beta */ if( isunit ) { alpha = ae_complex_from_d(sa); } else { alpha = ae_c_mul_d(ae_c_conj(a->ptr.pp_complex[i][i], _state),sa); } beta = x->ptr.p_complex[i]; /* * solve alpha*x[i] = beta */ result = safesolve_cbasicsolveandupdate(alpha, beta, lnmax, nrmb, maxgrowth, &nrmx, &vc, _state); if( !result ) { ae_frame_leave(_state); return result; } x->ptr.p_complex[i] = vc; /* * update the rest of right part */ if( i>0 ) { ae_v_cmoved(&tmp.ptr.p_complex[0], 1, &a->ptr.pp_complex[i][0], 1, "Conj", ae_v_len(0,i-1), sa); ae_v_csubc(&x->ptr.p_complex[0], 1, &tmp.ptr.p_complex[0], 1, "N", ae_v_len(0,i-1), vc); } } ae_frame_leave(_state); return result; } result = ae_false; ae_frame_leave(_state); return result; } /************************************************************************* complex basic solver-updater for reduced linear system alpha*x[i] = beta solves this equation and updates it in overlfow-safe manner (keeping track of relative growth of solution). Parameters: Alpha - alpha Beta - beta LnMax - precomputed Ln(MaxRealNumber) BNorm - inf-norm of b (right part of original system) MaxGrowth- maximum growth of norm(x) relative to norm(b) XNorm - inf-norm of other components of X (which are already processed) it is updated by CBasicSolveAndUpdate. 
X - solution -- ALGLIB routine -- 26.01.2009 Bochkanov Sergey *************************************************************************/ static ae_bool safesolve_cbasicsolveandupdate(ae_complex alpha, ae_complex beta, double lnmax, double bnorm, double maxgrowth, double* xnorm, ae_complex* x, ae_state *_state) { double v; ae_bool result; x->x = 0; x->y = 0; result = ae_false; if( ae_c_eq_d(alpha,(double)(0)) ) { return result; } if( ae_c_neq_d(beta,(double)(0)) ) { /* * alpha*x[i]=beta */ v = ae_log(ae_c_abs(beta, _state), _state)-ae_log(ae_c_abs(alpha, _state), _state); if( ae_fp_greater(v,lnmax) ) { return result; } *x = ae_c_div(beta,alpha); } else { /* * alpha*x[i]=0 */ *x = ae_complex_from_i(0); } /* * update NrmX, test growth limit */ *xnorm = ae_maxreal(*xnorm, ae_c_abs(*x, _state), _state); if( ae_fp_greater(*xnorm,maxgrowth*bnorm) ) { return result; } result = ae_true; return result; } void hermitianmatrixvectormultiply(/* Complex */ ae_matrix* a, ae_bool isupper, ae_int_t i1, ae_int_t i2, /* Complex */ ae_vector* x, ae_complex alpha, /* Complex */ ae_vector* y, ae_state *_state) { ae_int_t i; ae_int_t ba1; ae_int_t by1; ae_int_t by2; ae_int_t bx1; ae_int_t bx2; ae_int_t n; ae_complex v; n = i2-i1+1; if( n<=0 ) { return; } /* * Let A = L + D + U, where * L is strictly lower triangular (main diagonal is zero) * D is diagonal * U is strictly upper triangular (main diagonal is zero) * * A*x = L*x + D*x + U*x * * Calculate D*x first */ for(i=i1; i<=i2; i++) { y->ptr.p_complex[i-i1+1] = ae_c_mul(a->ptr.pp_complex[i][i],x->ptr.p_complex[i-i1+1]); } /* * Add L*x + U*x */ if( isupper ) { for(i=i1; i<=i2-1; i++) { /* * Add L*x to the result */ v = x->ptr.p_complex[i-i1+1]; by1 = i-i1+2; by2 = n; ba1 = i+1; ae_v_caddc(&y->ptr.p_complex[by1], 1, &a->ptr.pp_complex[i][ba1], 1, "Conj", ae_v_len(by1,by2), v); /* * Add U*x to the result */ bx1 = i-i1+2; bx2 = n; ba1 = i+1; v = ae_v_cdotproduct(&x->ptr.p_complex[bx1], 1, "N", &a->ptr.pp_complex[i][ba1], 1, "N", ae_v_len(bx1,bx2)); y->ptr.p_complex[i-i1+1] = ae_c_add(y->ptr.p_complex[i-i1+1],v); } } else { for(i=i1+1; i<=i2; i++) { /* * Add L*x to the result */ bx1 = 1; bx2 = i-i1; ba1 = i1; v = ae_v_cdotproduct(&x->ptr.p_complex[bx1], 1, "N", &a->ptr.pp_complex[i][ba1], 1, "N", ae_v_len(bx1,bx2)); y->ptr.p_complex[i-i1+1] = ae_c_add(y->ptr.p_complex[i-i1+1],v); /* * Add U*x to the result */ v = x->ptr.p_complex[i-i1+1]; by1 = 1; by2 = i-i1; ba1 = i1; ae_v_caddc(&y->ptr.p_complex[by1], 1, &a->ptr.pp_complex[i][ba1], 1, "Conj", ae_v_len(by1,by2), v); } } ae_v_cmulc(&y->ptr.p_complex[1], 1, ae_v_len(1,n), alpha); } void hermitianrank2update(/* Complex */ ae_matrix* a, ae_bool isupper, ae_int_t i1, ae_int_t i2, /* Complex */ ae_vector* x, /* Complex */ ae_vector* y, /* Complex */ ae_vector* t, ae_complex alpha, ae_state *_state) { ae_int_t i; ae_int_t tp1; ae_int_t tp2; ae_complex v; if( isupper ) { for(i=i1; i<=i2; i++) { tp1 = i+1-i1; tp2 = i2-i1+1; v = ae_c_mul(alpha,x->ptr.p_complex[i+1-i1]); ae_v_cmovec(&t->ptr.p_complex[tp1], 1, &y->ptr.p_complex[tp1], 1, "Conj", ae_v_len(tp1,tp2), v); v = ae_c_mul(ae_c_conj(alpha, _state),y->ptr.p_complex[i+1-i1]); ae_v_caddc(&t->ptr.p_complex[tp1], 1, &x->ptr.p_complex[tp1], 1, "Conj", ae_v_len(tp1,tp2), v); ae_v_cadd(&a->ptr.pp_complex[i][i], 1, &t->ptr.p_complex[tp1], 1, "N", ae_v_len(i,i2)); } } else { for(i=i1; i<=i2; i++) { tp1 = 1; tp2 = i+1-i1; v = ae_c_mul(alpha,x->ptr.p_complex[i+1-i1]); ae_v_cmovec(&t->ptr.p_complex[tp1], 1, &y->ptr.p_complex[tp1], 1, "Conj", ae_v_len(tp1,tp2), v); v = 
ae_c_mul(ae_c_conj(alpha, _state),y->ptr.p_complex[i+1-i1]); ae_v_caddc(&t->ptr.p_complex[tp1], 1, &x->ptr.p_complex[tp1], 1, "Conj", ae_v_len(tp1,tp2), v); ae_v_cadd(&a->ptr.pp_complex[i][i1], 1, &t->ptr.p_complex[tp1], 1, "N", ae_v_len(i1,i)); } } } void symmetricmatrixvectormultiply(/* Real */ ae_matrix* a, ae_bool isupper, ae_int_t i1, ae_int_t i2, /* Real */ ae_vector* x, double alpha, /* Real */ ae_vector* y, ae_state *_state) { ae_int_t i; ae_int_t ba1; ae_int_t ba2; ae_int_t by1; ae_int_t by2; ae_int_t bx1; ae_int_t bx2; ae_int_t n; double v; n = i2-i1+1; if( n<=0 ) { return; } /* * Let A = L + D + U, where * L is strictly lower triangular (main diagonal is zero) * D is diagonal * U is strictly upper triangular (main diagonal is zero) * * A*x = L*x + D*x + U*x * * Calculate D*x first */ for(i=i1; i<=i2; i++) { y->ptr.p_double[i-i1+1] = a->ptr.pp_double[i][i]*x->ptr.p_double[i-i1+1]; } /* * Add L*x + U*x */ if( isupper ) { for(i=i1; i<=i2-1; i++) { /* * Add L*x to the result */ v = x->ptr.p_double[i-i1+1]; by1 = i-i1+2; by2 = n; ba1 = i+1; ba2 = i2; ae_v_addd(&y->ptr.p_double[by1], 1, &a->ptr.pp_double[i][ba1], 1, ae_v_len(by1,by2), v); /* * Add U*x to the result */ bx1 = i-i1+2; bx2 = n; ba1 = i+1; ba2 = i2; v = ae_v_dotproduct(&x->ptr.p_double[bx1], 1, &a->ptr.pp_double[i][ba1], 1, ae_v_len(bx1,bx2)); y->ptr.p_double[i-i1+1] = y->ptr.p_double[i-i1+1]+v; } } else { for(i=i1+1; i<=i2; i++) { /* * Add L*x to the result */ bx1 = 1; bx2 = i-i1; ba1 = i1; ba2 = i-1; v = ae_v_dotproduct(&x->ptr.p_double[bx1], 1, &a->ptr.pp_double[i][ba1], 1, ae_v_len(bx1,bx2)); y->ptr.p_double[i-i1+1] = y->ptr.p_double[i-i1+1]+v; /* * Add U*x to the result */ v = x->ptr.p_double[i-i1+1]; by1 = 1; by2 = i-i1; ba1 = i1; ba2 = i-1; ae_v_addd(&y->ptr.p_double[by1], 1, &a->ptr.pp_double[i][ba1], 1, ae_v_len(by1,by2), v); } } ae_v_muld(&y->ptr.p_double[1], 1, ae_v_len(1,n), alpha); touchint(&ba2, _state); } void symmetricrank2update(/* Real */ ae_matrix* a, ae_bool isupper, ae_int_t i1, ae_int_t i2, /* Real */ ae_vector* x, /* Real */ ae_vector* y, /* Real */ ae_vector* t, double alpha, ae_state *_state) { ae_int_t i; ae_int_t tp1; ae_int_t tp2; double v; if( isupper ) { for(i=i1; i<=i2; i++) { tp1 = i+1-i1; tp2 = i2-i1+1; v = x->ptr.p_double[i+1-i1]; ae_v_moved(&t->ptr.p_double[tp1], 1, &y->ptr.p_double[tp1], 1, ae_v_len(tp1,tp2), v); v = y->ptr.p_double[i+1-i1]; ae_v_addd(&t->ptr.p_double[tp1], 1, &x->ptr.p_double[tp1], 1, ae_v_len(tp1,tp2), v); ae_v_muld(&t->ptr.p_double[tp1], 1, ae_v_len(tp1,tp2), alpha); ae_v_add(&a->ptr.pp_double[i][i], 1, &t->ptr.p_double[tp1], 1, ae_v_len(i,i2)); } } else { for(i=i1; i<=i2; i++) { tp1 = 1; tp2 = i+1-i1; v = x->ptr.p_double[i+1-i1]; ae_v_moved(&t->ptr.p_double[tp1], 1, &y->ptr.p_double[tp1], 1, ae_v_len(tp1,tp2), v); v = y->ptr.p_double[i+1-i1]; ae_v_addd(&t->ptr.p_double[tp1], 1, &x->ptr.p_double[tp1], 1, ae_v_len(tp1,tp2), v); ae_v_muld(&t->ptr.p_double[tp1], 1, ae_v_len(tp1,tp2), alpha); ae_v_add(&a->ptr.pp_double[i][i1], 1, &t->ptr.p_double[tp1], 1, ae_v_len(i1,i)); } } } double vectornorm2(/* Real */ ae_vector* x, ae_int_t i1, ae_int_t i2, ae_state *_state) { ae_int_t n; ae_int_t ix; double absxi; double scl; double ssq; double result; n = i2-i1+1; if( n<1 ) { result = (double)(0); return result; } if( n==1 ) { result = ae_fabs(x->ptr.p_double[i1], _state); return result; } scl = (double)(0); ssq = (double)(1); for(ix=i1; ix<=i2; ix++) { if( ae_fp_neq(x->ptr.p_double[ix],(double)(0)) ) { absxi = ae_fabs(x->ptr.p_double[ix], _state); if( ae_fp_less(scl,absxi) ) { 
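/*
 * |x[ix]| becomes the new scaling factor: rebase the running sum of
 * squares as SSQ := 1 + SSQ*(SCL/ABSXI)^2 so that the final result
 * SCL*sqrt(SSQ) is obtained without intermediate overflow or underflow.
 */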
ssq = 1+ssq*ae_sqr(scl/absxi, _state); scl = absxi; } else { ssq = ssq+ae_sqr(absxi/scl, _state); } } } result = scl*ae_sqrt(ssq, _state); return result; } ae_int_t vectoridxabsmax(/* Real */ ae_vector* x, ae_int_t i1, ae_int_t i2, ae_state *_state) { ae_int_t i; ae_int_t result; result = i1; for(i=i1+1; i<=i2; i++) { if( ae_fp_greater(ae_fabs(x->ptr.p_double[i], _state),ae_fabs(x->ptr.p_double[result], _state)) ) { result = i; } } return result; } ae_int_t columnidxabsmax(/* Real */ ae_matrix* x, ae_int_t i1, ae_int_t i2, ae_int_t j, ae_state *_state) { ae_int_t i; ae_int_t result; result = i1; for(i=i1+1; i<=i2; i++) { if( ae_fp_greater(ae_fabs(x->ptr.pp_double[i][j], _state),ae_fabs(x->ptr.pp_double[result][j], _state)) ) { result = i; } } return result; } ae_int_t rowidxabsmax(/* Real */ ae_matrix* x, ae_int_t j1, ae_int_t j2, ae_int_t i, ae_state *_state) { ae_int_t j; ae_int_t result; result = j1; for(j=j1+1; j<=j2; j++) { if( ae_fp_greater(ae_fabs(x->ptr.pp_double[i][j], _state),ae_fabs(x->ptr.pp_double[i][result], _state)) ) { result = j; } } return result; } double upperhessenberg1norm(/* Real */ ae_matrix* a, ae_int_t i1, ae_int_t i2, ae_int_t j1, ae_int_t j2, /* Real */ ae_vector* work, ae_state *_state) { ae_int_t i; ae_int_t j; double result; ae_assert(i2-i1==j2-j1, "UpperHessenberg1Norm: I2-I1<>J2-J1!", _state); for(j=j1; j<=j2; j++) { work->ptr.p_double[j] = (double)(0); } for(i=i1; i<=i2; i++) { for(j=ae_maxint(j1, j1+i-i1-1, _state); j<=j2; j++) { work->ptr.p_double[j] = work->ptr.p_double[j]+ae_fabs(a->ptr.pp_double[i][j], _state); } } result = (double)(0); for(j=j1; j<=j2; j++) { result = ae_maxreal(result, work->ptr.p_double[j], _state); } return result; } void copymatrix(/* Real */ ae_matrix* a, ae_int_t is1, ae_int_t is2, ae_int_t js1, ae_int_t js2, /* Real */ ae_matrix* b, ae_int_t id1, ae_int_t id2, ae_int_t jd1, ae_int_t jd2, ae_state *_state) { ae_int_t isrc; ae_int_t idst; if( is1>is2||js1>js2 ) { return; } ae_assert(is2-is1==id2-id1, "CopyMatrix: different sizes!", _state); ae_assert(js2-js1==jd2-jd1, "CopyMatrix: different sizes!", _state); for(isrc=is1; isrc<=is2; isrc++) { idst = isrc-is1+id1; ae_v_move(&b->ptr.pp_double[idst][jd1], 1, &a->ptr.pp_double[isrc][js1], 1, ae_v_len(jd1,jd2)); } } void inplacetranspose(/* Real */ ae_matrix* a, ae_int_t i1, ae_int_t i2, ae_int_t j1, ae_int_t j2, /* Real */ ae_vector* work, ae_state *_state) { ae_int_t i; ae_int_t j; ae_int_t ips; ae_int_t jps; ae_int_t l; if( i1>i2||j1>j2 ) { return; } ae_assert(i1-i2==j1-j2, "InplaceTranspose error: incorrect array size!", _state); for(i=i1; i<=i2-1; i++) { j = j1+i-i1; ips = i+1; jps = j1+ips-i1; l = i2-i; ae_v_move(&work->ptr.p_double[1], 1, &a->ptr.pp_double[ips][j], a->stride, ae_v_len(1,l)); ae_v_move(&a->ptr.pp_double[ips][j], a->stride, &a->ptr.pp_double[i][jps], 1, ae_v_len(ips,i2)); ae_v_move(&a->ptr.pp_double[i][jps], 1, &work->ptr.p_double[1], 1, ae_v_len(jps,j2)); } } void copyandtranspose(/* Real */ ae_matrix* a, ae_int_t is1, ae_int_t is2, ae_int_t js1, ae_int_t js2, /* Real */ ae_matrix* b, ae_int_t id1, ae_int_t id2, ae_int_t jd1, ae_int_t jd2, ae_state *_state) { ae_int_t isrc; ae_int_t jdst; if( is1>is2||js1>js2 ) { return; } ae_assert(is2-is1==jd2-jd1, "CopyAndTranspose: different sizes!", _state); ae_assert(js2-js1==id2-id1, "CopyAndTranspose: different sizes!", _state); for(isrc=is1; isrc<=is2; isrc++) { jdst = isrc-is1+jd1; ae_v_move(&b->ptr.pp_double[id1][jdst], b->stride, &a->ptr.pp_double[isrc][js1], 1, ae_v_len(id1,id2)); } } void matrixvectormultiply(/* 
Real */ ae_matrix* a, ae_int_t i1, ae_int_t i2, ae_int_t j1, ae_int_t j2, ae_bool trans, /* Real */ ae_vector* x, ae_int_t ix1, ae_int_t ix2, double alpha, /* Real */ ae_vector* y, ae_int_t iy1, ae_int_t iy2, double beta, ae_state *_state) { ae_int_t i; double v; if( !trans ) { /* * y := alpha*A*x + beta*y; */ if( i1>i2||j1>j2 ) { return; } ae_assert(j2-j1==ix2-ix1, "MatrixVectorMultiply: A and X dont match!", _state); ae_assert(i2-i1==iy2-iy1, "MatrixVectorMultiply: A and Y dont match!", _state); /* * beta*y */ if( ae_fp_eq(beta,(double)(0)) ) { for(i=iy1; i<=iy2; i++) { y->ptr.p_double[i] = (double)(0); } } else { ae_v_muld(&y->ptr.p_double[iy1], 1, ae_v_len(iy1,iy2), beta); } /* * alpha*A*x */ for(i=i1; i<=i2; i++) { v = ae_v_dotproduct(&a->ptr.pp_double[i][j1], 1, &x->ptr.p_double[ix1], 1, ae_v_len(j1,j2)); y->ptr.p_double[iy1+i-i1] = y->ptr.p_double[iy1+i-i1]+alpha*v; } } else { /* * y := alpha*A'*x + beta*y; */ if( i1>i2||j1>j2 ) { return; } ae_assert(i2-i1==ix2-ix1, "MatrixVectorMultiply: A and X dont match!", _state); ae_assert(j2-j1==iy2-iy1, "MatrixVectorMultiply: A and Y dont match!", _state); /* * beta*y */ if( ae_fp_eq(beta,(double)(0)) ) { for(i=iy1; i<=iy2; i++) { y->ptr.p_double[i] = (double)(0); } } else { ae_v_muld(&y->ptr.p_double[iy1], 1, ae_v_len(iy1,iy2), beta); } /* * alpha*A'*x */ for(i=i1; i<=i2; i++) { v = alpha*x->ptr.p_double[ix1+i-i1]; ae_v_addd(&y->ptr.p_double[iy1], 1, &a->ptr.pp_double[i][j1], 1, ae_v_len(iy1,iy2), v); } } } double pythag2(double x, double y, ae_state *_state) { double w; double xabs; double yabs; double z; double result; xabs = ae_fabs(x, _state); yabs = ae_fabs(y, _state); w = ae_maxreal(xabs, yabs, _state); z = ae_minreal(xabs, yabs, _state); if( ae_fp_eq(z,(double)(0)) ) { result = w; } else { result = w*ae_sqrt(1+ae_sqr(z/w, _state), _state); } return result; } void matrixmatrixmultiply(/* Real */ ae_matrix* a, ae_int_t ai1, ae_int_t ai2, ae_int_t aj1, ae_int_t aj2, ae_bool transa, /* Real */ ae_matrix* b, ae_int_t bi1, ae_int_t bi2, ae_int_t bj1, ae_int_t bj2, ae_bool transb, double alpha, /* Real */ ae_matrix* c, ae_int_t ci1, ae_int_t ci2, ae_int_t cj1, ae_int_t cj2, double beta, /* Real */ ae_vector* work, ae_state *_state) { ae_int_t arows; ae_int_t acols; ae_int_t brows; ae_int_t bcols; ae_int_t crows; ae_int_t i; ae_int_t j; ae_int_t k; ae_int_t l; ae_int_t r; double v; /* * Setup */ if( !transa ) { arows = ai2-ai1+1; acols = aj2-aj1+1; } else { arows = aj2-aj1+1; acols = ai2-ai1+1; } if( !transb ) { brows = bi2-bi1+1; bcols = bj2-bj1+1; } else { brows = bj2-bj1+1; bcols = bi2-bi1+1; } ae_assert(acols==brows, "MatrixMatrixMultiply: incorrect matrix sizes!", _state); if( ((arows<=0||acols<=0)||brows<=0)||bcols<=0 ) { return; } crows = arows; /* * Test WORK */ i = ae_maxint(arows, acols, _state); i = ae_maxint(brows, i, _state); i = ae_maxint(i, bcols, _state); work->ptr.p_double[1] = (double)(0); work->ptr.p_double[i] = (double)(0); /* * Prepare C */ if( ae_fp_eq(beta,(double)(0)) ) { for(i=ci1; i<=ci2; i++) { for(j=cj1; j<=cj2; j++) { c->ptr.pp_double[i][j] = (double)(0); } } } else { for(i=ci1; i<=ci2; i++) { ae_v_muld(&c->ptr.pp_double[i][cj1], 1, ae_v_len(cj1,cj2), beta); } } /* * A*B */ if( !transa&&!transb ) { for(l=ai1; l<=ai2; l++) { for(r=bi1; r<=bi2; r++) { v = alpha*a->ptr.pp_double[l][aj1+r-bi1]; k = ci1+l-ai1; ae_v_addd(&c->ptr.pp_double[k][cj1], 1, &b->ptr.pp_double[r][bj1], 1, ae_v_len(cj1,cj2), v); } } return; } /* * A*B' */ if( !transa&&transb ) { if( arows*acolsptr.pp_double[l][aj1], 1, &b->ptr.pp_double[r][bj1], 
1, ae_v_len(aj1,aj2)); c->ptr.pp_double[ci1+l-ai1][cj1+r-bi1] = c->ptr.pp_double[ci1+l-ai1][cj1+r-bi1]+alpha*v; } } return; } else { for(l=ai1; l<=ai2; l++) { for(r=bi1; r<=bi2; r++) { v = ae_v_dotproduct(&a->ptr.pp_double[l][aj1], 1, &b->ptr.pp_double[r][bj1], 1, ae_v_len(aj1,aj2)); c->ptr.pp_double[ci1+l-ai1][cj1+r-bi1] = c->ptr.pp_double[ci1+l-ai1][cj1+r-bi1]+alpha*v; } } return; } } /* * A'*B */ if( transa&&!transb ) { for(l=aj1; l<=aj2; l++) { for(r=bi1; r<=bi2; r++) { v = alpha*a->ptr.pp_double[ai1+r-bi1][l]; k = ci1+l-aj1; ae_v_addd(&c->ptr.pp_double[k][cj1], 1, &b->ptr.pp_double[r][bj1], 1, ae_v_len(cj1,cj2), v); } } return; } /* * A'*B' */ if( transa&&transb ) { if( arows*acolsptr.p_double[i] = 0.0; } for(l=ai1; l<=ai2; l++) { v = alpha*b->ptr.pp_double[r][bj1+l-ai1]; ae_v_addd(&work->ptr.p_double[1], 1, &a->ptr.pp_double[l][aj1], 1, ae_v_len(1,crows), v); } ae_v_add(&c->ptr.pp_double[ci1][k], c->stride, &work->ptr.p_double[1], 1, ae_v_len(ci1,ci2)); } return; } else { for(l=aj1; l<=aj2; l++) { k = ai2-ai1+1; ae_v_move(&work->ptr.p_double[1], 1, &a->ptr.pp_double[ai1][l], a->stride, ae_v_len(1,k)); for(r=bi1; r<=bi2; r++) { v = ae_v_dotproduct(&work->ptr.p_double[1], 1, &b->ptr.pp_double[r][bj1], 1, ae_v_len(1,k)); c->ptr.pp_double[ci1+l-aj1][cj1+r-bi1] = c->ptr.pp_double[ci1+l-aj1][cj1+r-bi1]+alpha*v; } } return; } } } /************************************************************************* Normalizes direction/step pair: makes |D|=1, scales Stp. If |D|=0, it returns, leavind D/Stp unchanged. -- ALGLIB -- Copyright 01.04.2010 by Bochkanov Sergey *************************************************************************/ void linminnormalized(/* Real */ ae_vector* d, double* stp, ae_int_t n, ae_state *_state) { double mx; double s; ae_int_t i; /* * first, scale D to avoid underflow/overflow durng squaring */ mx = (double)(0); for(i=0; i<=n-1; i++) { mx = ae_maxreal(mx, ae_fabs(d->ptr.p_double[i], _state), _state); } if( ae_fp_eq(mx,(double)(0)) ) { return; } s = 1/mx; ae_v_muld(&d->ptr.p_double[0], 1, ae_v_len(0,n-1), s); *stp = *stp/s; /* * normalize D */ s = ae_v_dotproduct(&d->ptr.p_double[0], 1, &d->ptr.p_double[0], 1, ae_v_len(0,n-1)); s = 1/ae_sqrt(s, _state); ae_v_muld(&d->ptr.p_double[0], 1, ae_v_len(0,n-1), s); *stp = *stp/s; } /************************************************************************* THE PURPOSE OF MCSRCH IS TO FIND A STEP WHICH SATISFIES A SUFFICIENT DECREASE CONDITION AND A CURVATURE CONDITION. AT EACH STAGE THE SUBROUTINE UPDATES AN INTERVAL OF UNCERTAINTY WITH ENDPOINTS STX AND STY. THE INTERVAL OF UNCERTAINTY IS INITIALLY CHOSEN SO THAT IT CONTAINS A MINIMIZER OF THE MODIFIED FUNCTION F(X+STP*S) - F(X) - FTOL*STP*(GRADF(X)'S). IF A STEP IS OBTAINED FOR WHICH THE MODIFIED FUNCTION HAS A NONPOSITIVE FUNCTION VALUE AND NONNEGATIVE DERIVATIVE, THEN THE INTERVAL OF UNCERTAINTY IS CHOSEN SO THAT IT CONTAINS A MINIMIZER OF F(X+STP*S). THE ALGORITHM IS DESIGNED TO FIND A STEP WHICH SATISFIES THE SUFFICIENT DECREASE CONDITION F(X+STP*S) .LE. F(X) + FTOL*STP*(GRADF(X)'S), AND THE CURVATURE CONDITION ABS(GRADF(X+STP*S)'S)) .LE. GTOL*ABS(GRADF(X)'S). IF FTOL IS LESS THAN GTOL AND IF, FOR EXAMPLE, THE FUNCTION IS BOUNDED BELOW, THEN THERE IS ALWAYS A STEP WHICH SATISFIES BOTH CONDITIONS. IF NO STEP CAN BE FOUND WHICH SATISFIES BOTH CONDITIONS, THEN THE ALGORITHM USUALLY STOPS WHEN ROUNDING ERRORS PREVENT FURTHER PROGRESS. IN THIS CASE STP ONLY SATISFIES THE SUFFICIENT DECREASE CONDITION. 
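(In modern terminology the two conditions above are the strong Wolfe
conditions: sufficient decrease with parameter FTOL and a curvature condition
with parameter GTOL.)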
:::::::::::::IMPORTANT NOTES::::::::::::: NOTE 1: This routine guarantees that it will stop at the last point where function value was calculated. It won't make several additional function evaluations after finding good point. So if you store function evaluations requested by this routine, you can be sure that last one is the point where we've stopped. NOTE 2: when 0initial_point - after rounding to machine precision ::::::::::::::::::::::::::::::::::::::::: PARAMETERS DESCRIPRION STAGE IS ZERO ON FIRST CALL, ZERO ON FINAL EXIT N IS A POSITIVE INTEGER INPUT VARIABLE SET TO THE NUMBER OF VARIABLES. X IS AN ARRAY OF LENGTH N. ON INPUT IT MUST CONTAIN THE BASE POINT FOR THE LINE SEARCH. ON OUTPUT IT CONTAINS X+STP*S. F IS A VARIABLE. ON INPUT IT MUST CONTAIN THE VALUE OF F AT X. ON OUTPUT IT CONTAINS THE VALUE OF F AT X + STP*S. G IS AN ARRAY OF LENGTH N. ON INPUT IT MUST CONTAIN THE GRADIENT OF F AT X. ON OUTPUT IT CONTAINS THE GRADIENT OF F AT X + STP*S. S IS AN INPUT ARRAY OF LENGTH N WHICH SPECIFIES THE SEARCH DIRECTION. STP IS A NONNEGATIVE VARIABLE. ON INPUT STP CONTAINS AN INITIAL ESTIMATE OF A SATISFACTORY STEP. ON OUTPUT STP CONTAINS THE FINAL ESTIMATE. FTOL AND GTOL ARE NONNEGATIVE INPUT VARIABLES. TERMINATION OCCURS WHEN THE SUFFICIENT DECREASE CONDITION AND THE DIRECTIONAL DERIVATIVE CONDITION ARE SATISFIED. XTOL IS A NONNEGATIVE INPUT VARIABLE. TERMINATION OCCURS WHEN THE RELATIVE WIDTH OF THE INTERVAL OF UNCERTAINTY IS AT MOST XTOL. STPMIN AND STPMAX ARE NONNEGATIVE INPUT VARIABLES WHICH SPECIFY LOWER AND UPPER BOUNDS FOR THE STEP. MAXFEV IS A POSITIVE INTEGER INPUT VARIABLE. TERMINATION OCCURS WHEN THE NUMBER OF CALLS TO FCN IS AT LEAST MAXFEV BY THE END OF AN ITERATION. INFO IS AN INTEGER OUTPUT VARIABLE SET AS FOLLOWS: INFO = 0 IMPROPER INPUT PARAMETERS. INFO = 1 THE SUFFICIENT DECREASE CONDITION AND THE DIRECTIONAL DERIVATIVE CONDITION HOLD. INFO = 2 RELATIVE WIDTH OF THE INTERVAL OF UNCERTAINTY IS AT MOST XTOL. INFO = 3 NUMBER OF CALLS TO FCN HAS REACHED MAXFEV. INFO = 4 THE STEP IS AT THE LOWER BOUND STPMIN. INFO = 5 THE STEP IS AT THE UPPER BOUND STPMAX. INFO = 6 ROUNDING ERRORS PREVENT FURTHER PROGRESS. THERE MAY NOT BE A STEP WHICH SATISFIES THE SUFFICIENT DECREASE AND CURVATURE CONDITIONS. TOLERANCES MAY BE TOO SMALL. NFEV IS AN INTEGER OUTPUT VARIABLE SET TO THE NUMBER OF CALLS TO FCN. WA IS A WORK ARRAY OF LENGTH N. ARGONNE NATIONAL LABORATORY. MINPACK PROJECT. JUNE 1983 JORGE J. MORE', DAVID J. THUENTE *************************************************************************/ void mcsrch(ae_int_t n, /* Real */ ae_vector* x, double* f, /* Real */ ae_vector* g, /* Real */ ae_vector* s, double* stp, double stpmax, double gtol, ae_int_t* info, ae_int_t* nfev, /* Real */ ae_vector* wa, linminstate* state, ae_int_t* stage, ae_state *_state) { ae_int_t i; double v; double p5; double p66; double zero; /* * init */ p5 = 0.5; p66 = 0.66; state->xtrapf = 4.0; zero = (double)(0); if( ae_fp_eq(stpmax,(double)(0)) ) { stpmax = linmin_defstpmax; } if( ae_fp_less(*stp,linmin_stpmin) ) { *stp = linmin_stpmin; } if( ae_fp_greater(*stp,stpmax) ) { *stp = stpmax; } /* * Main cycle */ for(;;) { if( *stage==0 ) { /* * NEXT */ *stage = 2; continue; } if( *stage==2 ) { state->infoc = 1; *info = 0; /* * CHECK THE INPUT PARAMETERS FOR ERRORS. 
*/ if( ae_fp_less(stpmax,linmin_stpmin)&&ae_fp_greater(stpmax,(double)(0)) ) { *info = 5; *stp = stpmax; *stage = 0; return; } if( ((((((n<=0||ae_fp_less_eq(*stp,(double)(0)))||ae_fp_less(linmin_ftol,(double)(0)))||ae_fp_less(gtol,zero))||ae_fp_less(linmin_xtol,zero))||ae_fp_less(linmin_stpmin,zero))||ae_fp_less(stpmax,linmin_stpmin))||linmin_maxfev<=0 ) { *stage = 0; return; } /* * COMPUTE THE INITIAL GRADIENT IN THE SEARCH DIRECTION * AND CHECK THAT S IS A DESCENT DIRECTION. */ v = ae_v_dotproduct(&g->ptr.p_double[0], 1, &s->ptr.p_double[0], 1, ae_v_len(0,n-1)); state->dginit = v; if( ae_fp_greater_eq(state->dginit,(double)(0)) ) { *stage = 0; return; } /* * INITIALIZE LOCAL VARIABLES. */ state->brackt = ae_false; state->stage1 = ae_true; *nfev = 0; state->finit = *f; state->dgtest = linmin_ftol*state->dginit; state->width = stpmax-linmin_stpmin; state->width1 = state->width/p5; ae_v_move(&wa->ptr.p_double[0], 1, &x->ptr.p_double[0], 1, ae_v_len(0,n-1)); /* * THE VARIABLES STX, FX, DGX CONTAIN THE VALUES OF THE STEP, * FUNCTION, AND DIRECTIONAL DERIVATIVE AT THE BEST STEP. * THE VARIABLES STY, FY, DGY CONTAIN THE VALUE OF THE STEP, * FUNCTION, AND DERIVATIVE AT THE OTHER ENDPOINT OF * THE INTERVAL OF UNCERTAINTY. * THE VARIABLES STP, F, DG CONTAIN THE VALUES OF THE STEP, * FUNCTION, AND DERIVATIVE AT THE CURRENT STEP. */ state->stx = (double)(0); state->fx = state->finit; state->dgx = state->dginit; state->sty = (double)(0); state->fy = state->finit; state->dgy = state->dginit; /* * NEXT */ *stage = 3; continue; } if( *stage==3 ) { /* * START OF ITERATION. * * SET THE MINIMUM AND MAXIMUM STEPS TO CORRESPOND * TO THE PRESENT INTERVAL OF UNCERTAINTY. */ if( state->brackt ) { if( ae_fp_less(state->stx,state->sty) ) { state->stmin = state->stx; state->stmax = state->sty; } else { state->stmin = state->sty; state->stmax = state->stx; } } else { state->stmin = state->stx; state->stmax = *stp+state->xtrapf*(*stp-state->stx); } /* * FORCE THE STEP TO BE WITHIN THE BOUNDS STPMAX AND STPMIN. */ if( ae_fp_greater(*stp,stpmax) ) { *stp = stpmax; } if( ae_fp_less(*stp,linmin_stpmin) ) { *stp = linmin_stpmin; } /* * IF AN UNUSUAL TERMINATION IS TO OCCUR THEN LET * STP BE THE LOWEST POINT OBTAINED SO FAR. */ if( (((state->brackt&&(ae_fp_less_eq(*stp,state->stmin)||ae_fp_greater_eq(*stp,state->stmax)))||*nfev>=linmin_maxfev-1)||state->infoc==0)||(state->brackt&&ae_fp_less_eq(state->stmax-state->stmin,linmin_xtol*state->stmax)) ) { *stp = state->stx; } /* * EVALUATE THE FUNCTION AND GRADIENT AT STP * AND COMPUTE THE DIRECTIONAL DERIVATIVE. */ ae_v_move(&x->ptr.p_double[0], 1, &wa->ptr.p_double[0], 1, ae_v_len(0,n-1)); ae_v_addd(&x->ptr.p_double[0], 1, &s->ptr.p_double[0], 1, ae_v_len(0,n-1), *stp); /* * NEXT */ *stage = 4; return; } if( *stage==4 ) { *info = 0; *nfev = *nfev+1; v = ae_v_dotproduct(&g->ptr.p_double[0], 1, &s->ptr.p_double[0], 1, ae_v_len(0,n-1)); state->dg = v; state->ftest1 = state->finit+*stp*state->dgtest; /* * TEST FOR CONVERGENCE. 
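 * (The INFO codes assigned below follow the convention documented in the
 * routine header: 1 - both conditions hold, 2 - interval width below XTOL,
 * 3 - MAXFEV reached, 4/5 - step at lower/upper bound, 6 - rounding errors
 * prevent further progress.)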
*/ if( (state->brackt&&(ae_fp_less_eq(*stp,state->stmin)||ae_fp_greater_eq(*stp,state->stmax)))||state->infoc==0 ) { *info = 6; } if( ((ae_fp_eq(*stp,stpmax)&&ae_fp_less(*f,state->finit))&&ae_fp_less_eq(*f,state->ftest1))&&ae_fp_less_eq(state->dg,state->dgtest) ) { *info = 5; } if( ae_fp_eq(*stp,linmin_stpmin)&&((ae_fp_greater_eq(*f,state->finit)||ae_fp_greater(*f,state->ftest1))||ae_fp_greater_eq(state->dg,state->dgtest)) ) { *info = 4; } if( *nfev>=linmin_maxfev ) { *info = 3; } if( state->brackt&&ae_fp_less_eq(state->stmax-state->stmin,linmin_xtol*state->stmax) ) { *info = 2; } if( (ae_fp_less(*f,state->finit)&&ae_fp_less_eq(*f,state->ftest1))&&ae_fp_less_eq(ae_fabs(state->dg, _state),-gtol*state->dginit) ) { *info = 1; } /* * CHECK FOR TERMINATION. */ if( *info!=0 ) { /* * Check guarantees provided by the function for INFO=1 or INFO=5 */ if( *info==1||*info==5 ) { v = 0.0; for(i=0; i<=n-1; i++) { v = v+(wa->ptr.p_double[i]-x->ptr.p_double[i])*(wa->ptr.p_double[i]-x->ptr.p_double[i]); } if( ae_fp_greater_eq(*f,state->finit)||ae_fp_eq(v,0.0) ) { *info = 6; } } *stage = 0; return; } /* * IN THE FIRST STAGE WE SEEK A STEP FOR WHICH THE MODIFIED * FUNCTION HAS A NONPOSITIVE VALUE AND NONNEGATIVE DERIVATIVE. */ if( (state->stage1&&ae_fp_less_eq(*f,state->ftest1))&&ae_fp_greater_eq(state->dg,ae_minreal(linmin_ftol, gtol, _state)*state->dginit) ) { state->stage1 = ae_false; } /* * A MODIFIED FUNCTION IS USED TO PREDICT THE STEP ONLY IF * WE HAVE NOT OBTAINED A STEP FOR WHICH THE MODIFIED * FUNCTION HAS A NONPOSITIVE FUNCTION VALUE AND NONNEGATIVE * DERIVATIVE, AND IF A LOWER FUNCTION VALUE HAS BEEN * OBTAINED BUT THE DECREASE IS NOT SUFFICIENT. */ if( (state->stage1&&ae_fp_less_eq(*f,state->fx))&&ae_fp_greater(*f,state->ftest1) ) { /* * DEFINE THE MODIFIED FUNCTION AND DERIVATIVE VALUES. */ state->fm = *f-*stp*state->dgtest; state->fxm = state->fx-state->stx*state->dgtest; state->fym = state->fy-state->sty*state->dgtest; state->dgm = state->dg-state->dgtest; state->dgxm = state->dgx-state->dgtest; state->dgym = state->dgy-state->dgtest; /* * CALL CSTEP TO UPDATE THE INTERVAL OF UNCERTAINTY * AND TO COMPUTE THE NEW STEP. */ linmin_mcstep(&state->stx, &state->fxm, &state->dgxm, &state->sty, &state->fym, &state->dgym, stp, state->fm, state->dgm, &state->brackt, state->stmin, state->stmax, &state->infoc, _state); /* * RESET THE FUNCTION AND GRADIENT VALUES FOR F. */ state->fx = state->fxm+state->stx*state->dgtest; state->fy = state->fym+state->sty*state->dgtest; state->dgx = state->dgxm+state->dgtest; state->dgy = state->dgym+state->dgtest; } else { /* * CALL MCSTEP TO UPDATE THE INTERVAL OF UNCERTAINTY * AND TO COMPUTE THE NEW STEP. */ linmin_mcstep(&state->stx, &state->fx, &state->dgx, &state->sty, &state->fy, &state->dgy, stp, *f, state->dg, &state->brackt, state->stmin, state->stmax, &state->infoc, _state); } /* * FORCE A SUFFICIENT DECREASE IN THE SIZE OF THE * INTERVAL OF UNCERTAINTY. */ if( state->brackt ) { if( ae_fp_greater_eq(ae_fabs(state->sty-state->stx, _state),p66*state->width1) ) { *stp = state->stx+p5*(state->sty-state->stx); } state->width1 = state->width; state->width = ae_fabs(state->sty-state->stx, _state); } /* * NEXT. */ *stage = 3; continue; } } } /************************************************************************* These functions perform Armijo line search using at most FMAX function evaluations. It doesn't enforce some kind of " sufficient decrease" criterion - it just tries different Armijo steps and returns optimum found so far. 
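A hypothetical driver loop illustrating the reverse-communication protocol
described below (sketch only; func() stands for a user-supplied objective and
is not part of ALGLIB, State is assumed to be initialized with
_armijostate_init, and f0/stp0 are the starting values documented below):

    armijocreate(n, &x, f0, &s, stp0, stpmax, fmax, &state, _state);
    while( armijoiteration(&state, _state) )
        state.f = func(&state.x);    (State.NeedF is set: evaluate F at State.X)
    armijoresults(&state, &info, &stp, &f, _state);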
Optimization is done using F-rcomm interface: * ArmijoCreate initializes State structure (reusing previously allocated buffers) * ArmijoIteration is subsequently called * ArmijoResults returns results INPUT PARAMETERS: N - problem size X - array[N], starting point F - F(X+S*STP) S - step direction, S>0 STP - step length STPMAX - maximum value for STP or zero (if no limit is imposed) FMAX - maximum number of function evaluations State - optimization state -- ALGLIB -- Copyright 05.10.2010 by Bochkanov Sergey *************************************************************************/ void armijocreate(ae_int_t n, /* Real */ ae_vector* x, double f, /* Real */ ae_vector* s, double stp, double stpmax, ae_int_t fmax, armijostate* state, ae_state *_state) { if( state->x.cntx, n, _state); } if( state->xbase.cntxbase, n, _state); } if( state->s.cnts, n, _state); } state->stpmax = stpmax; state->fmax = fmax; state->stplen = stp; state->fcur = f; state->n = n; ae_v_move(&state->xbase.ptr.p_double[0], 1, &x->ptr.p_double[0], 1, ae_v_len(0,n-1)); ae_v_move(&state->s.ptr.p_double[0], 1, &s->ptr.p_double[0], 1, ae_v_len(0,n-1)); ae_vector_set_length(&state->rstate.ia, 0+1, _state); ae_vector_set_length(&state->rstate.ra, 0+1, _state); state->rstate.stage = -1; } /************************************************************************* This is rcomm-based search function -- ALGLIB -- Copyright 05.10.2010 by Bochkanov Sergey *************************************************************************/ ae_bool armijoiteration(armijostate* state, ae_state *_state) { double v; ae_int_t n; ae_bool result; /* * Reverse communication preparations * I know it looks ugly, but it works the same way * anywhere from C++ to Python. * * This code initializes locals by: * * random values determined during code * generation - on first subroutine call * * values from previous call - on subsequent calls */ if( state->rstate.stage>=0 ) { n = state->rstate.ia.ptr.p_int[0]; v = state->rstate.ra.ptr.p_double[0]; } else { n = 359; v = -58; } if( state->rstate.stage==0 ) { goto lbl_0; } if( state->rstate.stage==1 ) { goto lbl_1; } if( state->rstate.stage==2 ) { goto lbl_2; } if( state->rstate.stage==3 ) { goto lbl_3; } /* * Routine body */ if( (ae_fp_less_eq(state->stplen,(double)(0))||ae_fp_less(state->stpmax,(double)(0)))||state->fmax<2 ) { state->info = 0; result = ae_false; return result; } if( ae_fp_less_eq(state->stplen,linmin_stpmin) ) { state->info = 4; result = ae_false; return result; } n = state->n; state->nfev = 0; /* * We always need F */ state->needf = ae_true; /* * Bound StpLen */ if( ae_fp_greater(state->stplen,state->stpmax)&&ae_fp_neq(state->stpmax,(double)(0)) ) { state->stplen = state->stpmax; } /* * Increase length */ v = state->stplen*linmin_armijofactor; if( ae_fp_greater(v,state->stpmax)&&ae_fp_neq(state->stpmax,(double)(0)) ) { v = state->stpmax; } ae_v_move(&state->x.ptr.p_double[0], 1, &state->xbase.ptr.p_double[0], 1, ae_v_len(0,n-1)); ae_v_addd(&state->x.ptr.p_double[0], 1, &state->s.ptr.p_double[0], 1, ae_v_len(0,n-1), v); state->rstate.stage = 0; goto lbl_rcomm; lbl_0: state->nfev = state->nfev+1; if( ae_fp_greater_eq(state->f,state->fcur) ) { goto lbl_4; } state->stplen = v; state->fcur = state->f; lbl_6: if( ae_false ) { goto lbl_7; } /* * test stopping conditions */ if( state->nfev>=state->fmax ) { state->info = 3; result = ae_false; return result; } if( ae_fp_greater_eq(state->stplen,state->stpmax) ) { state->info = 5; result = ae_false; return result; } /* * evaluate F */ v = 
state->stplen*linmin_armijofactor; if( ae_fp_greater(v,state->stpmax)&&ae_fp_neq(state->stpmax,(double)(0)) ) { v = state->stpmax; } ae_v_move(&state->x.ptr.p_double[0], 1, &state->xbase.ptr.p_double[0], 1, ae_v_len(0,n-1)); ae_v_addd(&state->x.ptr.p_double[0], 1, &state->s.ptr.p_double[0], 1, ae_v_len(0,n-1), v); state->rstate.stage = 1; goto lbl_rcomm; lbl_1: state->nfev = state->nfev+1; /* * make decision */ if( ae_fp_less(state->f,state->fcur) ) { state->stplen = v; state->fcur = state->f; } else { state->info = 1; result = ae_false; return result; } goto lbl_6; lbl_7: lbl_4: /* * Decrease length */ v = state->stplen/linmin_armijofactor; ae_v_move(&state->x.ptr.p_double[0], 1, &state->xbase.ptr.p_double[0], 1, ae_v_len(0,n-1)); ae_v_addd(&state->x.ptr.p_double[0], 1, &state->s.ptr.p_double[0], 1, ae_v_len(0,n-1), v); state->rstate.stage = 2; goto lbl_rcomm; lbl_2: state->nfev = state->nfev+1; if( ae_fp_greater_eq(state->f,state->fcur) ) { goto lbl_8; } state->stplen = state->stplen/linmin_armijofactor; state->fcur = state->f; lbl_10: if( ae_false ) { goto lbl_11; } /* * test stopping conditions */ if( state->nfev>=state->fmax ) { state->info = 3; result = ae_false; return result; } if( ae_fp_less_eq(state->stplen,linmin_stpmin) ) { state->info = 4; result = ae_false; return result; } /* * evaluate F */ v = state->stplen/linmin_armijofactor; ae_v_move(&state->x.ptr.p_double[0], 1, &state->xbase.ptr.p_double[0], 1, ae_v_len(0,n-1)); ae_v_addd(&state->x.ptr.p_double[0], 1, &state->s.ptr.p_double[0], 1, ae_v_len(0,n-1), v); state->rstate.stage = 3; goto lbl_rcomm; lbl_3: state->nfev = state->nfev+1; /* * make decision */ if( ae_fp_less(state->f,state->fcur) ) { state->stplen = state->stplen/linmin_armijofactor; state->fcur = state->f; } else { state->info = 1; result = ae_false; return result; } goto lbl_10; lbl_11: lbl_8: /* * Nothing to be done */ state->info = 1; result = ae_false; return result; /* * Saving state */ lbl_rcomm: result = ae_true; state->rstate.ia.ptr.p_int[0] = n; state->rstate.ra.ptr.p_double[0] = v; return result; } /************************************************************************* Results of Armijo search OUTPUT PARAMETERS: INFO - on output it is set to one of the return codes: * 0 improper input params * 1 optimum step is found with at most FMAX evaluations * 3 FMAX evaluations were used, X contains optimum found so far * 4 step is at lower bound STPMIN * 5 step is at upper bound STP - step length (in case of failure it is still returned) F - function value (in case of failure it is still returned) -- ALGLIB -- Copyright 05.10.2010 by Bochkanov Sergey *************************************************************************/ void armijoresults(armijostate* state, ae_int_t* info, double* stp, double* f, ae_state *_state) { *info = state->info; *stp = state->stplen; *f = state->fcur; } static void linmin_mcstep(double* stx, double* fx, double* dx, double* sty, double* fy, double* dy, double* stp, double fp, double dp, ae_bool* brackt, double stmin, double stmax, ae_int_t* info, ae_state *_state) { ae_bool bound; double gamma; double p; double q; double r; double s; double sgnd; double stpc; double stpf; double stpq; double theta; *info = 0; /* * CHECK THE INPUT PARAMETERS FOR ERRORS. */ if( ((*brackt&&(ae_fp_less_eq(*stp,ae_minreal(*stx, *sty, _state))||ae_fp_greater_eq(*stp,ae_maxreal(*stx, *sty, _state))))||ae_fp_greater_eq(*dx*(*stp-(*stx)),(double)(0)))||ae_fp_less(stmax,stmin) ) { return; } /* * DETERMINE IF THE DERIVATIVES HAVE OPPOSITE SIGN. 
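 * (SGND = DP*sign(DX); a negative value means the directional derivative
 * changed sign between STX and STP, i.e. a minimizer is bracketed.)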
*/ sgnd = dp*(*dx/ae_fabs(*dx, _state)); /* * FIRST CASE. A HIGHER FUNCTION VALUE. * THE MINIMUM IS BRACKETED. IF THE CUBIC STEP IS CLOSER * TO STX THAN THE QUADRATIC STEP, THE CUBIC STEP IS TAKEN, * ELSE THE AVERAGE OF THE CUBIC AND QUADRATIC STEPS IS TAKEN. */ if( ae_fp_greater(fp,*fx) ) { *info = 1; bound = ae_true; theta = 3*(*fx-fp)/(*stp-(*stx))+(*dx)+dp; s = ae_maxreal(ae_fabs(theta, _state), ae_maxreal(ae_fabs(*dx, _state), ae_fabs(dp, _state), _state), _state); gamma = s*ae_sqrt(ae_sqr(theta/s, _state)-*dx/s*(dp/s), _state); if( ae_fp_less(*stp,*stx) ) { gamma = -gamma; } p = gamma-(*dx)+theta; q = gamma-(*dx)+gamma+dp; r = p/q; stpc = *stx+r*(*stp-(*stx)); stpq = *stx+*dx/((*fx-fp)/(*stp-(*stx))+(*dx))/2*(*stp-(*stx)); if( ae_fp_less(ae_fabs(stpc-(*stx), _state),ae_fabs(stpq-(*stx), _state)) ) { stpf = stpc; } else { stpf = stpc+(stpq-stpc)/2; } *brackt = ae_true; } else { if( ae_fp_less(sgnd,(double)(0)) ) { /* * SECOND CASE. A LOWER FUNCTION VALUE AND DERIVATIVES OF * OPPOSITE SIGN. THE MINIMUM IS BRACKETED. IF THE CUBIC * STEP IS CLOSER TO STX THAN THE QUADRATIC (SECANT) STEP, * THE CUBIC STEP IS TAKEN, ELSE THE QUADRATIC STEP IS TAKEN. */ *info = 2; bound = ae_false; theta = 3*(*fx-fp)/(*stp-(*stx))+(*dx)+dp; s = ae_maxreal(ae_fabs(theta, _state), ae_maxreal(ae_fabs(*dx, _state), ae_fabs(dp, _state), _state), _state); gamma = s*ae_sqrt(ae_sqr(theta/s, _state)-*dx/s*(dp/s), _state); if( ae_fp_greater(*stp,*stx) ) { gamma = -gamma; } p = gamma-dp+theta; q = gamma-dp+gamma+(*dx); r = p/q; stpc = *stp+r*(*stx-(*stp)); stpq = *stp+dp/(dp-(*dx))*(*stx-(*stp)); if( ae_fp_greater(ae_fabs(stpc-(*stp), _state),ae_fabs(stpq-(*stp), _state)) ) { stpf = stpc; } else { stpf = stpq; } *brackt = ae_true; } else { if( ae_fp_less(ae_fabs(dp, _state),ae_fabs(*dx, _state)) ) { /* * THIRD CASE. A LOWER FUNCTION VALUE, DERIVATIVES OF THE * SAME SIGN, AND THE MAGNITUDE OF THE DERIVATIVE DECREASES. * THE CUBIC STEP IS ONLY USED IF THE CUBIC TENDS TO INFINITY * IN THE DIRECTION OF THE STEP OR IF THE MINIMUM OF THE CUBIC * IS BEYOND STP. OTHERWISE THE CUBIC STEP IS DEFINED TO BE * EITHER STPMIN OR STPMAX. THE QUADRATIC (SECANT) STEP IS ALSO * COMPUTED AND IF THE MINIMUM IS BRACKETED THEN THE THE STEP * CLOSEST TO STX IS TAKEN, ELSE THE STEP FARTHEST AWAY IS TAKEN. */ *info = 3; bound = ae_true; theta = 3*(*fx-fp)/(*stp-(*stx))+(*dx)+dp; s = ae_maxreal(ae_fabs(theta, _state), ae_maxreal(ae_fabs(*dx, _state), ae_fabs(dp, _state), _state), _state); /* * THE CASE GAMMA = 0 ONLY ARISES IF THE CUBIC DOES NOT TEND * TO INFINITY IN THE DIRECTION OF THE STEP. */ gamma = s*ae_sqrt(ae_maxreal((double)(0), ae_sqr(theta/s, _state)-*dx/s*(dp/s), _state), _state); if( ae_fp_greater(*stp,*stx) ) { gamma = -gamma; } p = gamma-dp+theta; q = gamma+(*dx-dp)+gamma; r = p/q; if( ae_fp_less(r,(double)(0))&&ae_fp_neq(gamma,(double)(0)) ) { stpc = *stp+r*(*stx-(*stp)); } else { if( ae_fp_greater(*stp,*stx) ) { stpc = stmax; } else { stpc = stmin; } } stpq = *stp+dp/(dp-(*dx))*(*stx-(*stp)); if( *brackt ) { if( ae_fp_less(ae_fabs(*stp-stpc, _state),ae_fabs(*stp-stpq, _state)) ) { stpf = stpc; } else { stpf = stpq; } } else { if( ae_fp_greater(ae_fabs(*stp-stpc, _state),ae_fabs(*stp-stpq, _state)) ) { stpf = stpc; } else { stpf = stpq; } } } else { /* * FOURTH CASE. A LOWER FUNCTION VALUE, DERIVATIVES OF THE * SAME SIGN, AND THE MAGNITUDE OF THE DERIVATIVE DOES * NOT DECREASE. IF THE MINIMUM IS NOT BRACKETED, THE STEP * IS EITHER STPMIN OR STPMAX, ELSE THE CUBIC STEP IS TAKEN. 
*/ *info = 4; bound = ae_false; if( *brackt ) { theta = 3*(fp-(*fy))/(*sty-(*stp))+(*dy)+dp; s = ae_maxreal(ae_fabs(theta, _state), ae_maxreal(ae_fabs(*dy, _state), ae_fabs(dp, _state), _state), _state); gamma = s*ae_sqrt(ae_sqr(theta/s, _state)-*dy/s*(dp/s), _state); if( ae_fp_greater(*stp,*sty) ) { gamma = -gamma; } p = gamma-dp+theta; q = gamma-dp+gamma+(*dy); r = p/q; stpc = *stp+r*(*sty-(*stp)); stpf = stpc; } else { if( ae_fp_greater(*stp,*stx) ) { stpf = stmax; } else { stpf = stmin; } } } } } /* * UPDATE THE INTERVAL OF UNCERTAINTY. THIS UPDATE DOES NOT * DEPEND ON THE NEW STEP OR THE CASE ANALYSIS ABOVE. */ if( ae_fp_greater(fp,*fx) ) { *sty = *stp; *fy = fp; *dy = dp; } else { if( ae_fp_less(sgnd,0.0) ) { *sty = *stx; *fy = *fx; *dy = *dx; } *stx = *stp; *fx = fp; *dx = dp; } /* * COMPUTE THE NEW STEP AND SAFEGUARD IT. */ stpf = ae_minreal(stmax, stpf, _state); stpf = ae_maxreal(stmin, stpf, _state); *stp = stpf; if( *brackt&&bound ) { if( ae_fp_greater(*sty,*stx) ) { *stp = ae_minreal(*stx+0.66*(*sty-(*stx)), *stp, _state); } else { *stp = ae_maxreal(*stx+0.66*(*sty-(*stx)), *stp, _state); } } } void _linminstate_init(void* _p, ae_state *_state) { linminstate *p = (linminstate*)_p; ae_touch_ptr((void*)p); } void _linminstate_init_copy(void* _dst, void* _src, ae_state *_state) { linminstate *dst = (linminstate*)_dst; linminstate *src = (linminstate*)_src; dst->brackt = src->brackt; dst->stage1 = src->stage1; dst->infoc = src->infoc; dst->dg = src->dg; dst->dgm = src->dgm; dst->dginit = src->dginit; dst->dgtest = src->dgtest; dst->dgx = src->dgx; dst->dgxm = src->dgxm; dst->dgy = src->dgy; dst->dgym = src->dgym; dst->finit = src->finit; dst->ftest1 = src->ftest1; dst->fm = src->fm; dst->fx = src->fx; dst->fxm = src->fxm; dst->fy = src->fy; dst->fym = src->fym; dst->stx = src->stx; dst->sty = src->sty; dst->stmin = src->stmin; dst->stmax = src->stmax; dst->width = src->width; dst->width1 = src->width1; dst->xtrapf = src->xtrapf; } void _linminstate_clear(void* _p) { linminstate *p = (linminstate*)_p; ae_touch_ptr((void*)p); } void _linminstate_destroy(void* _p) { linminstate *p = (linminstate*)_p; ae_touch_ptr((void*)p); } void _armijostate_init(void* _p, ae_state *_state) { armijostate *p = (armijostate*)_p; ae_touch_ptr((void*)p); ae_vector_init(&p->x, 0, DT_REAL, _state); ae_vector_init(&p->xbase, 0, DT_REAL, _state); ae_vector_init(&p->s, 0, DT_REAL, _state); _rcommstate_init(&p->rstate, _state); } void _armijostate_init_copy(void* _dst, void* _src, ae_state *_state) { armijostate *dst = (armijostate*)_dst; armijostate *src = (armijostate*)_src; dst->needf = src->needf; ae_vector_init_copy(&dst->x, &src->x, _state); dst->f = src->f; dst->n = src->n; ae_vector_init_copy(&dst->xbase, &src->xbase, _state); ae_vector_init_copy(&dst->s, &src->s, _state); dst->stplen = src->stplen; dst->fcur = src->fcur; dst->stpmax = src->stpmax; dst->fmax = src->fmax; dst->nfev = src->nfev; dst->info = src->info; _rcommstate_init_copy(&dst->rstate, &src->rstate, _state); } void _armijostate_clear(void* _p) { armijostate *p = (armijostate*)_p; ae_touch_ptr((void*)p); ae_vector_clear(&p->x); ae_vector_clear(&p->xbase); ae_vector_clear(&p->s); _rcommstate_clear(&p->rstate); } void _armijostate_destroy(void* _p) { armijostate *p = (armijostate*)_p; ae_touch_ptr((void*)p); ae_vector_destroy(&p->x); ae_vector_destroy(&p->xbase); ae_vector_destroy(&p->s); _rcommstate_destroy(&p->rstate); } /************************************************************************* More precise dot-product. 
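(The elementwise products A[i]*B[i] are stored in Temp and then added up by
the extra-precise internal summation subroutine documented further below.)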
Absolute error of subroutine result is about 1 ulp of max(MX,V), where: MX = max( |a[i]*b[i]| ) V = |(a,b)| INPUT PARAMETERS A - array[0..N-1], vector 1 B - array[0..N-1], vector 2 N - vectors length, N<2^29. Temp - array[0..N-1], pre-allocated temporary storage OUTPUT PARAMETERS R - (A,B) RErr - estimate of error. This estimate accounts for both errors during calculation of (A,B) and errors introduced by rounding of A and B to fit in double (about 1 ulp). -- ALGLIB -- Copyright 24.08.2009 by Bochkanov Sergey *************************************************************************/ void xdot(/* Real */ ae_vector* a, /* Real */ ae_vector* b, ae_int_t n, /* Real */ ae_vector* temp, double* r, double* rerr, ae_state *_state) { ae_int_t i; double mx; double v; *r = 0; *rerr = 0; /* * special cases: * * N=0 */ if( n==0 ) { *r = (double)(0); *rerr = (double)(0); return; } mx = (double)(0); for(i=0; i<=n-1; i++) { v = a->ptr.p_double[i]*b->ptr.p_double[i]; temp->ptr.p_double[i] = v; mx = ae_maxreal(mx, ae_fabs(v, _state), _state); } if( ae_fp_eq(mx,(double)(0)) ) { *r = (double)(0); *rerr = (double)(0); return; } xblas_xsum(temp, mx, n, r, rerr, _state); } /************************************************************************* More precise complex dot-product. Absolute error of subroutine result is about 1 ulp of max(MX,V), where: MX = max( |a[i]*b[i]| ) V = |(a,b)| INPUT PARAMETERS A - array[0..N-1], vector 1 B - array[0..N-1], vector 2 N - vectors length, N<2^29. Temp - array[0..2*N-1], pre-allocated temporary storage OUTPUT PARAMETERS R - (A,B) RErr - estimate of error. This estimate accounts for both errors during calculation of (A,B) and errors introduced by rounding of A and B to fit in double (about 1 ulp). -- ALGLIB -- Copyright 27.01.2010 by Bochkanov Sergey *************************************************************************/ void xcdot(/* Complex */ ae_vector* a, /* Complex */ ae_vector* b, ae_int_t n, /* Real */ ae_vector* temp, ae_complex* r, double* rerr, ae_state *_state) { ae_int_t i; double mx; double v; double rerrx; double rerry; r->x = 0; r->y = 0; *rerr = 0; /* * special cases: * * N=0 */ if( n==0 ) { *r = ae_complex_from_i(0); *rerr = (double)(0); return; } /* * calculate real part */ mx = (double)(0); for(i=0; i<=n-1; i++) { v = a->ptr.p_complex[i].x*b->ptr.p_complex[i].x; temp->ptr.p_double[2*i+0] = v; mx = ae_maxreal(mx, ae_fabs(v, _state), _state); v = -a->ptr.p_complex[i].y*b->ptr.p_complex[i].y; temp->ptr.p_double[2*i+1] = v; mx = ae_maxreal(mx, ae_fabs(v, _state), _state); } if( ae_fp_eq(mx,(double)(0)) ) { r->x = (double)(0); rerrx = (double)(0); } else { xblas_xsum(temp, mx, 2*n, &r->x, &rerrx, _state); } /* * calculate imaginary part */ mx = (double)(0); for(i=0; i<=n-1; i++) { v = a->ptr.p_complex[i].x*b->ptr.p_complex[i].y; temp->ptr.p_double[2*i+0] = v; mx = ae_maxreal(mx, ae_fabs(v, _state), _state); v = a->ptr.p_complex[i].y*b->ptr.p_complex[i].x; temp->ptr.p_double[2*i+1] = v; mx = ae_maxreal(mx, ae_fabs(v, _state), _state); } if( ae_fp_eq(mx,(double)(0)) ) { r->y = (double)(0); rerry = (double)(0); } else { xblas_xsum(temp, mx, 2*n, &r->y, &rerry, _state); } /* * total error */ if( ae_fp_eq(rerrx,(double)(0))&&ae_fp_eq(rerry,(double)(0)) ) { *rerr = (double)(0); } else { *rerr = ae_maxreal(rerrx, rerry, _state)*ae_sqrt(1+ae_sqr(ae_minreal(rerrx, rerry, _state)/ae_maxreal(rerrx, rerry, _state), _state), _state); } } /************************************************************************* Internal subroutine for extra-precise calculation of 
SUM(w[i]). INPUT PARAMETERS: W - array[0..N-1], values to be added W is modified during calculations. MX - max(W[i]) N - array size OUTPUT PARAMETERS: R - SUM(w[i]) RErr- error estimate for R -- ALGLIB -- Copyright 24.08.2009 by Bochkanov Sergey *************************************************************************/ static void xblas_xsum(/* Real */ ae_vector* w, double mx, ae_int_t n, double* r, double* rerr, ae_state *_state) { ae_int_t i; ae_int_t k; ae_int_t ks; double v; double s; double ln2; double chunk; double invchunk; ae_bool allzeros; *r = 0; *rerr = 0; /* * special cases: * * N=0 * * N is too large to use integer arithmetics */ if( n==0 ) { *r = (double)(0); *rerr = (double)(0); return; } if( ae_fp_eq(mx,(double)(0)) ) { *r = (double)(0); *rerr = (double)(0); return; } ae_assert(n<536870912, "XDot: N is too large!", _state); /* * Prepare */ ln2 = ae_log((double)(2), _state); *rerr = mx*ae_machineepsilon; /* * 1. find S such that 0.5<=S*MX<1 * 2. multiply W by S, so task is normalized in some sense * 3. S:=1/S so we can obtain original vector multiplying by S */ k = ae_round(ae_log(mx, _state)/ln2, _state); s = xblas_xfastpow((double)(2), -k, _state); if( !ae_isfinite(s, _state) ) { /* * Overflow or underflow during evaluation of S; fallback low-precision code */ *r = (double)(0); *rerr = mx*ae_machineepsilon; for(i=0; i<=n-1; i++) { *r = *r+w->ptr.p_double[i]; } return; } while(ae_fp_greater_eq(s*mx,(double)(1))) { s = 0.5*s; } while(ae_fp_less(s*mx,0.5)) { s = 2*s; } ae_v_muld(&w->ptr.p_double[0], 1, ae_v_len(0,n-1), s); s = 1/s; /* * find Chunk=2^M such that N*Chunk<2^29 * * we have chosen upper limit (2^29) with enough space left * to tolerate possible problems with rounding and N's close * to the limit, so we don't want to be very strict here. 
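 *
 * For example (illustrative numbers only): for N=1000 we have
 * 2^29/N ~ 536871, so K=trunc(log2(536871))=19, Chunk=2^19=524288 and
 * N*Chunk=524288000, which is still below 2^29=536870912.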
*/ k = ae_trunc(ae_log((double)536870912/(double)n, _state)/ln2, _state); chunk = xblas_xfastpow((double)(2), k, _state); if( ae_fp_less(chunk,(double)(2)) ) { chunk = (double)(2); } invchunk = 1/chunk; /* * calculate result */ *r = (double)(0); ae_v_muld(&w->ptr.p_double[0], 1, ae_v_len(0,n-1), chunk); for(;;) { s = s*invchunk; allzeros = ae_true; ks = 0; for(i=0; i<=n-1; i++) { v = w->ptr.p_double[i]; k = ae_trunc(v, _state); if( ae_fp_neq(v,(double)(k)) ) { allzeros = ae_false; } w->ptr.p_double[i] = chunk*(v-k); ks = ks+k; } *r = *r+s*ks; v = ae_fabs(*r, _state); if( allzeros||ae_fp_eq(s*n+mx,mx) ) { break; } } /* * correct error */ *rerr = ae_maxreal(*rerr, ae_fabs(*r, _state)*ae_machineepsilon, _state); } /************************************************************************* Fast Pow -- ALGLIB -- Copyright 24.08.2009 by Bochkanov Sergey *************************************************************************/ static double xblas_xfastpow(double r, ae_int_t n, ae_state *_state) { double result; result = (double)(0); if( n>0 ) { if( n%2==0 ) { result = ae_sqr(xblas_xfastpow(r, n/2, _state), _state); } else { result = r*xblas_xfastpow(r, n-1, _state); } return result; } if( n==0 ) { result = (double)(1); } if( n<0 ) { result = xblas_xfastpow(1/r, -n, _state); } return result; } void rmatrixinternalschurdecomposition(/* Real */ ae_matrix* h, ae_int_t n, ae_int_t tneeded, ae_int_t zneeded, /* Real */ ae_vector* wr, /* Real */ ae_vector* wi, /* Real */ ae_matrix* z, ae_int_t* info, ae_state *_state) { ae_frame _frame_block; ae_int_t i; ae_int_t j; ae_matrix h1; ae_matrix z1; ae_vector wr1; ae_vector wi1; ae_frame_make(_state, &_frame_block); ae_vector_clear(wr); ae_vector_clear(wi); *info = 0; ae_matrix_init(&h1, 0, 0, DT_REAL, _state); ae_matrix_init(&z1, 0, 0, DT_REAL, _state); ae_vector_init(&wr1, 0, DT_REAL, _state); ae_vector_init(&wi1, 0, DT_REAL, _state); /* * Allocate space */ ae_vector_set_length(wr, n, _state); ae_vector_set_length(wi, n, _state); if( zneeded==2 ) { rmatrixsetlengthatleast(z, n, n, _state); } /* * MKL version */ if( rmatrixinternalschurdecompositionmkl(h, n, tneeded, zneeded, wr, wi, z, info, _state) ) { ae_frame_leave(_state); return; } /* * ALGLIB version */ ae_matrix_set_length(&h1, n+1, n+1, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { h1.ptr.pp_double[1+i][1+j] = h->ptr.pp_double[i][j]; } } if( zneeded==1 ) { ae_matrix_set_length(&z1, n+1, n+1, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { z1.ptr.pp_double[1+i][1+j] = z->ptr.pp_double[i][j]; } } } internalschurdecomposition(&h1, n, tneeded, zneeded, &wr1, &wi1, &z1, info, _state); for(i=0; i<=n-1; i++) { wr->ptr.p_double[i] = wr1.ptr.p_double[i+1]; wi->ptr.p_double[i] = wi1.ptr.p_double[i+1]; } if( tneeded!=0 ) { for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { h->ptr.pp_double[i][j] = h1.ptr.pp_double[1+i][1+j]; } } } if( zneeded!=0 ) { rmatrixsetlengthatleast(z, n, n, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { z->ptr.pp_double[i][j] = z1.ptr.pp_double[1+i][1+j]; } } } ae_frame_leave(_state); } /************************************************************************* Subroutine performing the Schur decomposition of a matrix in upper Hessenberg form using the QR algorithm with multiple shifts. The source matrix H is represented as S'*H*S = T, where H - matrix in upper Hessenberg form, S - orthogonal matrix (Schur vectors), T - upper quasi-triangular matrix (with blocks of sizes 1x1 and 2x2 on the main diagonal). Input parameters: H - matrix to be decomposed. 
Array whose indexes range within [1..N, 1..N]. N - size of H, N>=0. Output parameters: H - contains the matrix T. Array whose indexes range within [1..N, 1..N]. All elements below the blocks on the main diagonal are equal to 0. S - contains Schur vectors. Array whose indexes range within [1..N, 1..N]. Note 1: The block structure of matrix T could be easily recognized: since all the elements below the blocks are zeros, the elements a[i+1,i] which are equal to 0 show the block border. Note 2: the algorithm performance depends on the value of the internal parameter NS of InternalSchurDecomposition subroutine which defines the number of shifts in the QR algorithm (analog of the block width in block matrix algorithms in linear algebra). If you require maximum performance on your machine, it is recommended to adjust this parameter manually. Result: True, if the algorithm has converged and the parameters H and S contain the result. False, if the algorithm has not converged. Algorithm implemented on the basis of subroutine DHSEQR (LAPACK 3.0 library). *************************************************************************/ ae_bool upperhessenbergschurdecomposition(/* Real */ ae_matrix* h, ae_int_t n, /* Real */ ae_matrix* s, ae_state *_state) { ae_frame _frame_block; ae_vector wi; ae_vector wr; ae_int_t info; ae_bool result; ae_frame_make(_state, &_frame_block); ae_matrix_clear(s); ae_vector_init(&wi, 0, DT_REAL, _state); ae_vector_init(&wr, 0, DT_REAL, _state); internalschurdecomposition(h, n, 1, 2, &wr, &wi, s, &info, _state); result = info==0; ae_frame_leave(_state); return result; } void internalschurdecomposition(/* Real */ ae_matrix* h, ae_int_t n, ae_int_t tneeded, ae_int_t zneeded, /* Real */ ae_vector* wr, /* Real */ ae_vector* wi, /* Real */ ae_matrix* z, ae_int_t* info, ae_state *_state) { ae_frame _frame_block; ae_vector work; ae_int_t i; ae_int_t i1; ae_int_t i2; ae_int_t ierr; ae_int_t ii; ae_int_t itemp; ae_int_t itn; ae_int_t its; ae_int_t j; ae_int_t k; ae_int_t l; ae_int_t maxb; ae_int_t nr; ae_int_t ns; ae_int_t nv; double absw; double smlnum; double tau; double temp; double tst1; double ulp; double unfl; ae_matrix s; ae_vector v; ae_vector vv; ae_vector workc1; ae_vector works1; ae_vector workv3; ae_vector tmpwr; ae_vector tmpwi; ae_bool initz; ae_bool wantt; ae_bool wantz; double cnst; ae_bool failflag; ae_int_t p1; ae_int_t p2; double vt; ae_frame_make(_state, &_frame_block); ae_vector_clear(wr); ae_vector_clear(wi); *info = 0; ae_vector_init(&work, 0, DT_REAL, _state); ae_matrix_init(&s, 0, 0, DT_REAL, _state); ae_vector_init(&v, 0, DT_REAL, _state); ae_vector_init(&vv, 0, DT_REAL, _state); ae_vector_init(&workc1, 0, DT_REAL, _state); ae_vector_init(&works1, 0, DT_REAL, _state); ae_vector_init(&workv3, 0, DT_REAL, _state); ae_vector_init(&tmpwr, 0, DT_REAL, _state); ae_vector_init(&tmpwi, 0, DT_REAL, _state); /* * Set the order of the multi-shift QR algorithm to be used. * If you want to tune algorithm, change this values */ ns = 12; maxb = 50; /* * Now 2 < NS <= MAXB < NH. 
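 *
 * With the default values above (NS=12, MAXB=50) the check below sends
 * every matrix with N<=50 to the standard double-shift code, while
 * larger matrices are processed with 12 simultaneous shifts.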
*/ maxb = ae_maxint(3, maxb, _state); ns = ae_minint(maxb, ns, _state); /* * Initialize */ cnst = 1.5; ae_vector_set_length(&work, ae_maxint(n, 1, _state)+1, _state); ae_matrix_set_length(&s, ns+1, ns+1, _state); ae_vector_set_length(&v, ns+1+1, _state); ae_vector_set_length(&vv, ns+1+1, _state); ae_vector_set_length(wr, ae_maxint(n, 1, _state)+1, _state); ae_vector_set_length(wi, ae_maxint(n, 1, _state)+1, _state); ae_vector_set_length(&workc1, 1+1, _state); ae_vector_set_length(&works1, 1+1, _state); ae_vector_set_length(&workv3, 3+1, _state); ae_vector_set_length(&tmpwr, ae_maxint(n, 1, _state)+1, _state); ae_vector_set_length(&tmpwi, ae_maxint(n, 1, _state)+1, _state); ae_assert(n>=0, "InternalSchurDecomposition: incorrect N!", _state); ae_assert(tneeded==0||tneeded==1, "InternalSchurDecomposition: incorrect TNeeded!", _state); ae_assert((zneeded==0||zneeded==1)||zneeded==2, "InternalSchurDecomposition: incorrect ZNeeded!", _state); wantt = tneeded==1; initz = zneeded==2; wantz = zneeded!=0; *info = 0; /* * Initialize Z, if necessary */ if( initz ) { rmatrixsetlengthatleast(z, n+1, n+1, _state); for(i=1; i<=n; i++) { for(j=1; j<=n; j++) { if( i==j ) { z->ptr.pp_double[i][j] = (double)(1); } else { z->ptr.pp_double[i][j] = (double)(0); } } } } /* * Quick return if possible */ if( n==0 ) { ae_frame_leave(_state); return; } if( n==1 ) { wr->ptr.p_double[1] = h->ptr.pp_double[1][1]; wi->ptr.p_double[1] = (double)(0); ae_frame_leave(_state); return; } /* * Set rows and columns 1 to N to zero below the first * subdiagonal. */ for(j=1; j<=n-2; j++) { for(i=j+2; i<=n; i++) { h->ptr.pp_double[i][j] = (double)(0); } } /* * Test if N is sufficiently small */ if( (ns<=2||ns>n)||maxb>=n ) { /* * Use the standard double-shift algorithm */ hsschur_internalauxschur(wantt, wantz, n, 1, n, h, wr, wi, 1, n, z, &work, &workv3, &workc1, &works1, info, _state); /* * fill entries under diagonal blocks of T with zeros */ if( wantt ) { j = 1; while(j<=n) { if( ae_fp_eq(wi->ptr.p_double[j],(double)(0)) ) { for(i=j+1; i<=n; i++) { h->ptr.pp_double[i][j] = (double)(0); } j = j+1; } else { for(i=j+2; i<=n; i++) { h->ptr.pp_double[i][j] = (double)(0); h->ptr.pp_double[i][j+1] = (double)(0); } j = j+2; } } } ae_frame_leave(_state); return; } unfl = ae_minrealnumber; ulp = 2*ae_machineepsilon; smlnum = unfl*(n/ulp); /* * I1 and I2 are the indices of the first row and last column of H * to which transformations must be applied. If eigenvalues only are * being computed, I1 and I2 are set inside the main loop. */ i1 = 1; i2 = n; /* * ITN is the total number of multiple-shift QR iterations allowed. */ itn = 30*n; /* * The main loop begins here. I is the loop index and decreases from * IHI to ILO in steps of at most MAXB. Each iteration of the loop * works with the active submatrix in rows and columns L to I. * Eigenvalues I+1 to IHI have already converged. Either L = ILO or * H(L,L-1) is negligible so that the matrix splits. 
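 *
 * Note that the iteration budget ITN=30*N is shared by all eigenvalues:
 * after each deflation the number of iterations spent on the current
 * submatrix (ITS) is subtracted from ITN at the end of the main loop.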
*/ i = n; for(;;) { l = 1; if( i<1 ) { /* * fill entries under diagonal blocks of T with zeros */ if( wantt ) { j = 1; while(j<=n) { if( ae_fp_eq(wi->ptr.p_double[j],(double)(0)) ) { for(i=j+1; i<=n; i++) { h->ptr.pp_double[i][j] = (double)(0); } j = j+1; } else { for(i=j+2; i<=n; i++) { h->ptr.pp_double[i][j] = (double)(0); h->ptr.pp_double[i][j+1] = (double)(0); } j = j+2; } } } /* * Exit */ ae_frame_leave(_state); return; } /* * Perform multiple-shift QR iterations on rows and columns ILO to I * until a submatrix of order at most MAXB splits off at the bottom * because a subdiagonal element has become negligible. */ failflag = ae_true; for(its=0; its<=itn; its++) { /* * Look for a single small subdiagonal element. */ for(k=i; k>=l+1; k--) { tst1 = ae_fabs(h->ptr.pp_double[k-1][k-1], _state)+ae_fabs(h->ptr.pp_double[k][k], _state); if( ae_fp_eq(tst1,(double)(0)) ) { tst1 = upperhessenberg1norm(h, l, i, l, i, &work, _state); } if( ae_fp_less_eq(ae_fabs(h->ptr.pp_double[k][k-1], _state),ae_maxreal(ulp*tst1, smlnum, _state)) ) { break; } } l = k; if( l>1 ) { /* * H(L,L-1) is negligible. */ h->ptr.pp_double[l][l-1] = (double)(0); } /* * Exit from loop if a submatrix of order <= MAXB has split off. */ if( l>=i-maxb+1 ) { failflag = ae_false; break; } /* * Now the active submatrix is in rows and columns L to I. If * eigenvalues only are being computed, only the active submatrix * need be transformed. */ if( its==20||its==30 ) { /* * Exceptional shifts. */ for(ii=i-ns+1; ii<=i; ii++) { wr->ptr.p_double[ii] = cnst*(ae_fabs(h->ptr.pp_double[ii][ii-1], _state)+ae_fabs(h->ptr.pp_double[ii][ii], _state)); wi->ptr.p_double[ii] = (double)(0); } } else { /* * Use eigenvalues of trailing submatrix of order NS as shifts. */ copymatrix(h, i-ns+1, i, i-ns+1, i, &s, 1, ns, 1, ns, _state); hsschur_internalauxschur(ae_false, ae_false, ns, 1, ns, &s, &tmpwr, &tmpwi, 1, ns, z, &work, &workv3, &workc1, &works1, &ierr, _state); for(p1=1; p1<=ns; p1++) { wr->ptr.p_double[i-ns+p1] = tmpwr.ptr.p_double[p1]; wi->ptr.p_double[i-ns+p1] = tmpwi.ptr.p_double[p1]; } if( ierr>0 ) { /* * If DLAHQR failed to compute all NS eigenvalues, use the * unconverged diagonal elements as the remaining shifts. */ for(ii=1; ii<=ierr; ii++) { wr->ptr.p_double[i-ns+ii] = s.ptr.pp_double[ii][ii]; wi->ptr.p_double[i-ns+ii] = (double)(0); } } } /* * Form the first column of (G-w(1)) (G-w(2)) . . . (G-w(ns)) * where G is the Hessenberg submatrix H(L:I,L:I) and w is * the vector of shifts (stored in WR and WI). The result is * stored in the local array V. 
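 *
 * Each real shift w contributes one linear factor (G-w) and increases NV
 * by one; a complex conjugate pair w, conj(w) is applied as the real
 * quadratic factor G^2-2*Re(w)*G+|w|^2*I and increases NV by two, which
 * is why only shifts with nonnegative imaginary part are visited below.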
*/ v.ptr.p_double[1] = (double)(1); for(ii=2; ii<=ns+1; ii++) { v.ptr.p_double[ii] = (double)(0); } nv = 1; for(j=i-ns+1; j<=i; j++) { if( ae_fp_greater_eq(wi->ptr.p_double[j],(double)(0)) ) { if( ae_fp_eq(wi->ptr.p_double[j],(double)(0)) ) { /* * real shift */ p1 = nv+1; ae_v_move(&vv.ptr.p_double[1], 1, &v.ptr.p_double[1], 1, ae_v_len(1,p1)); matrixvectormultiply(h, l, l+nv, l, l+nv-1, ae_false, &vv, 1, nv, 1.0, &v, 1, nv+1, -wr->ptr.p_double[j], _state); nv = nv+1; } else { if( ae_fp_greater(wi->ptr.p_double[j],(double)(0)) ) { /* * complex conjugate pair of shifts */ p1 = nv+1; ae_v_move(&vv.ptr.p_double[1], 1, &v.ptr.p_double[1], 1, ae_v_len(1,p1)); matrixvectormultiply(h, l, l+nv, l, l+nv-1, ae_false, &v, 1, nv, 1.0, &vv, 1, nv+1, -2*wr->ptr.p_double[j], _state); itemp = vectoridxabsmax(&vv, 1, nv+1, _state); temp = 1/ae_maxreal(ae_fabs(vv.ptr.p_double[itemp], _state), smlnum, _state); p1 = nv+1; ae_v_muld(&vv.ptr.p_double[1], 1, ae_v_len(1,p1), temp); absw = pythag2(wr->ptr.p_double[j], wi->ptr.p_double[j], _state); temp = temp*absw*absw; matrixvectormultiply(h, l, l+nv+1, l, l+nv, ae_false, &vv, 1, nv+1, 1.0, &v, 1, nv+2, temp, _state); nv = nv+2; } } /* * Scale V(1:NV) so that max(abs(V(i))) = 1. If V is zero, * reset it to the unit vector. */ itemp = vectoridxabsmax(&v, 1, nv, _state); temp = ae_fabs(v.ptr.p_double[itemp], _state); if( ae_fp_eq(temp,(double)(0)) ) { v.ptr.p_double[1] = (double)(1); for(ii=2; ii<=nv; ii++) { v.ptr.p_double[ii] = (double)(0); } } else { temp = ae_maxreal(temp, smlnum, _state); vt = 1/temp; ae_v_muld(&v.ptr.p_double[1], 1, ae_v_len(1,nv), vt); } } } /* * Multiple-shift QR step */ for(k=l; k<=i-1; k++) { /* * The first iteration of this loop determines a reflection G * from the vector V and applies it from left and right to H, * thus creating a nonzero bulge below the subdiagonal. * * Each subsequent iteration determines a reflection G to * restore the Hessenberg form in the (K-1)th column, and thus * chases the bulge one step toward the bottom of the active * submatrix. NR is the order of G. */ nr = ae_minint(ns+1, i-k+1, _state); if( k>l ) { p1 = k-1; p2 = k+nr-1; ae_v_move(&v.ptr.p_double[1], 1, &h->ptr.pp_double[k][p1], h->stride, ae_v_len(1,nr)); touchint(&p2, _state); } generatereflection(&v, nr, &tau, _state); if( k>l ) { h->ptr.pp_double[k][k-1] = v.ptr.p_double[1]; for(ii=k+1; ii<=i; ii++) { h->ptr.pp_double[ii][k-1] = (double)(0); } } v.ptr.p_double[1] = (double)(1); /* * Apply G from the left to transform the rows of the matrix in * columns K to I2. */ applyreflectionfromtheleft(h, tau, &v, k, k+nr-1, k, i2, &work, _state); /* * Apply G from the right to transform the columns of the * matrix in rows I1 to min(K+NR,I). */ applyreflectionfromtheright(h, tau, &v, i1, ae_minint(k+nr, i, _state), k, k+nr-1, &work, _state); if( wantz ) { /* * Accumulate transformations in the matrix Z */ applyreflectionfromtheright(z, tau, &v, 1, n, k, k+nr-1, &work, _state); } } } /* * Failure to converge in remaining number of iterations */ if( failflag ) { *info = i; ae_frame_leave(_state); return; } /* * A submatrix of order <= MAXB in rows and columns L to I has split * off. Use the double-shift QR algorithm to handle it. */ hsschur_internalauxschur(wantt, wantz, n, l, i, h, wr, wi, 1, n, z, &work, &workv3, &workc1, &works1, info, _state); if( *info>0 ) { ae_frame_leave(_state); return; } /* * Decrement number of remaining iterations, and return to start of * the main loop with a new value of I. 
*/ itn = itn-its; i = l-1; } ae_frame_leave(_state); } /************************************************************************* Translation of DLAHQR from LAPACK. *************************************************************************/ static void hsschur_internalauxschur(ae_bool wantt, ae_bool wantz, ae_int_t n, ae_int_t ilo, ae_int_t ihi, /* Real */ ae_matrix* h, /* Real */ ae_vector* wr, /* Real */ ae_vector* wi, ae_int_t iloz, ae_int_t ihiz, /* Real */ ae_matrix* z, /* Real */ ae_vector* work, /* Real */ ae_vector* workv3, /* Real */ ae_vector* workc1, /* Real */ ae_vector* works1, ae_int_t* info, ae_state *_state) { double safmin; double tst; double ab; double ba; double aa; double bb; double rt1r; double rt1i; double rt2r; double rt2i; double tr; double det; double rtdisc; double h21s; ae_int_t i; ae_int_t i1; ae_int_t i2; ae_int_t itmax; ae_int_t its; ae_int_t j; ae_int_t k; ae_int_t l; ae_int_t m; ae_int_t nh; ae_int_t nr; ae_int_t nz; double cs; double h11; double h12; double h21; double h22; double s; double smlnum; double sn; double sum; double t1; double t2; double t3; double v2; double v3; ae_bool failflag; double dat1; double dat2; ae_int_t p1; double him1im1; double him1i; double hiim1; double hii; double wrim1; double wri; double wiim1; double wii; double ulp; *info = 0; *info = 0; dat1 = 0.75; dat2 = -0.4375; /* * Quick return if possible */ if( n==0 ) { return; } if( ilo==ihi ) { wr->ptr.p_double[ilo] = h->ptr.pp_double[ilo][ilo]; wi->ptr.p_double[ilo] = (double)(0); return; } /* * ==== clear out the trash ==== */ for(j=ilo; j<=ihi-3; j++) { h->ptr.pp_double[j+2][j] = (double)(0); h->ptr.pp_double[j+3][j] = (double)(0); } if( ilo<=ihi-2 ) { h->ptr.pp_double[ihi][ihi-2] = (double)(0); } nh = ihi-ilo+1; nz = ihiz-iloz+1; /* * Set machine-dependent constants for the stopping criterion. */ safmin = ae_minrealnumber; ulp = ae_machineepsilon; smlnum = safmin*(nh/ulp); /* * I1 and I2 are the indices of the first row and last column of H * to which transformations must be applied. If eigenvalues only are * being computed, I1 and I2 are set inside the main loop. */ if( wantt ) { i1 = 1; i2 = n; } /* * ITMAX is the total number of QR iterations allowed. */ itmax = 30*ae_maxint(10, nh, _state); /* * The main loop begins here. I is the loop index and decreases from * IHI to ILO in steps of 1 or 2. Each iteration of the loop works * with the active submatrix in rows and columns L to I. * Eigenvalues I+1 to IHI have already converged. Either L = ILO or * H(L,L-1) is negligible so that the matrix splits. */ i = ihi; for(;;) { l = ilo; if( i<ilo ) { break; } /* * Perform QR iterations on rows and columns ILO to I until a * submatrix of order 1 or 2 splits off at the bottom because a * subdiagonal element has become negligible. */ failflag = ae_true; for(its=0; its<=itmax; its++) { /* * Look for a single small subdiagonal element. */ for(k=i; k>=l+1; k--) { if( ae_fp_less_eq(ae_fabs(h->ptr.pp_double[k][k-1], _state),smlnum) ) { break; } tst = ae_fabs(h->ptr.pp_double[k-1][k-1], _state)+ae_fabs(h->ptr.pp_double[k][k], _state); if( ae_fp_eq(tst,(double)(0)) ) { if( k-2>=ilo ) { tst = tst+ae_fabs(h->ptr.pp_double[k-1][k-2], _state); } if( k+1<=ihi ) { tst = tst+ae_fabs(h->ptr.pp_double[k+1][k], _state); } } /* * ==== The following is a conservative small subdiagonal * . deflation criterion due to Ahues & Tisseur (LAWN 122, * . 1997). It has better mathematical foundation and * . improves accuracy in some cases.
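 * .
 * . In terms of the variables computed below: with
 * . AB=max(|H(k,k-1)|,|H(k-1,k)|), BA=min(|H(k,k-1)|,|H(k-1,k)|),
 * . AA=max(|H(k,k)|,|H(k-1,k-1)-H(k,k)|),
 * . BB=min(|H(k,k)|,|H(k-1,k-1)-H(k,k)|) and S=AA+AB, the subdiagonal
 * . entry H(k,k-1) is treated as negligible when
 * . BA*(AB/S) <= max(SmlNum, ULP*BB*(AA/S)).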
==== */ if( ae_fp_less_eq(ae_fabs(h->ptr.pp_double[k][k-1], _state),ulp*tst) ) { ab = ae_maxreal(ae_fabs(h->ptr.pp_double[k][k-1], _state), ae_fabs(h->ptr.pp_double[k-1][k], _state), _state); ba = ae_minreal(ae_fabs(h->ptr.pp_double[k][k-1], _state), ae_fabs(h->ptr.pp_double[k-1][k], _state), _state); aa = ae_maxreal(ae_fabs(h->ptr.pp_double[k][k], _state), ae_fabs(h->ptr.pp_double[k-1][k-1]-h->ptr.pp_double[k][k], _state), _state); bb = ae_minreal(ae_fabs(h->ptr.pp_double[k][k], _state), ae_fabs(h->ptr.pp_double[k-1][k-1]-h->ptr.pp_double[k][k], _state), _state); s = aa+ab; if( ae_fp_less_eq(ba*(ab/s),ae_maxreal(smlnum, ulp*(bb*(aa/s)), _state)) ) { break; } } } l = k; if( l>ilo ) { /* * H(L,L-1) is negligible */ h->ptr.pp_double[l][l-1] = (double)(0); } /* * Exit from loop if a submatrix of order 1 or 2 has split off. */ if( l>=i-1 ) { failflag = ae_false; break; } /* * Now the active submatrix is in rows and columns L to I. If * eigenvalues only are being computed, only the active submatrix * need be transformed. */ if( !wantt ) { i1 = l; i2 = i; } /* * Shifts */ if( its==10 ) { /* * Exceptional shift. */ s = ae_fabs(h->ptr.pp_double[l+1][l], _state)+ae_fabs(h->ptr.pp_double[l+2][l+1], _state); h11 = dat1*s+h->ptr.pp_double[l][l]; h12 = dat2*s; h21 = s; h22 = h11; } else { if( its==20 ) { /* * Exceptional shift. */ s = ae_fabs(h->ptr.pp_double[i][i-1], _state)+ae_fabs(h->ptr.pp_double[i-1][i-2], _state); h11 = dat1*s+h->ptr.pp_double[i][i]; h12 = dat2*s; h21 = s; h22 = h11; } else { /* * Prepare to use Francis' double shift * (i.e. 2nd degree generalized Rayleigh quotient) */ h11 = h->ptr.pp_double[i-1][i-1]; h21 = h->ptr.pp_double[i][i-1]; h12 = h->ptr.pp_double[i-1][i]; h22 = h->ptr.pp_double[i][i]; } } s = ae_fabs(h11, _state)+ae_fabs(h12, _state)+ae_fabs(h21, _state)+ae_fabs(h22, _state); if( ae_fp_eq(s,(double)(0)) ) { rt1r = (double)(0); rt1i = (double)(0); rt2r = (double)(0); rt2i = (double)(0); } else { h11 = h11/s; h21 = h21/s; h12 = h12/s; h22 = h22/s; tr = (h11+h22)/2; det = (h11-tr)*(h22-tr)-h12*h21; rtdisc = ae_sqrt(ae_fabs(det, _state), _state); if( ae_fp_greater_eq(det,(double)(0)) ) { /* * ==== complex conjugate shifts ==== */ rt1r = tr*s; rt2r = rt1r; rt1i = rtdisc*s; rt2i = -rt1i; } else { /* * ==== real shifts (use only one of them) ==== */ rt1r = tr+rtdisc; rt2r = tr-rtdisc; if( ae_fp_less_eq(ae_fabs(rt1r-h22, _state),ae_fabs(rt2r-h22, _state)) ) { rt1r = rt1r*s; rt2r = rt1r; } else { rt2r = rt2r*s; rt1r = rt2r; } rt1i = (double)(0); rt2i = (double)(0); } } /* * Look for two consecutive small subdiagonal elements. */ for(m=i-2; m>=l; m--) { /* * Determine the effect of starting the double-shift QR * iteration at row M, and see if this would make H(M,M-1) * negligible. (The following uses scaling to avoid * overflows and most underflows.) 
*/ h21s = h->ptr.pp_double[m+1][m]; s = ae_fabs(h->ptr.pp_double[m][m]-rt2r, _state)+ae_fabs(rt2i, _state)+ae_fabs(h21s, _state); h21s = h->ptr.pp_double[m+1][m]/s; workv3->ptr.p_double[1] = h21s*h->ptr.pp_double[m][m+1]+(h->ptr.pp_double[m][m]-rt1r)*((h->ptr.pp_double[m][m]-rt2r)/s)-rt1i*(rt2i/s); workv3->ptr.p_double[2] = h21s*(h->ptr.pp_double[m][m]+h->ptr.pp_double[m+1][m+1]-rt1r-rt2r); workv3->ptr.p_double[3] = h21s*h->ptr.pp_double[m+2][m+1]; s = ae_fabs(workv3->ptr.p_double[1], _state)+ae_fabs(workv3->ptr.p_double[2], _state)+ae_fabs(workv3->ptr.p_double[3], _state); workv3->ptr.p_double[1] = workv3->ptr.p_double[1]/s; workv3->ptr.p_double[2] = workv3->ptr.p_double[2]/s; workv3->ptr.p_double[3] = workv3->ptr.p_double[3]/s; if( m==l ) { break; } if( ae_fp_less_eq(ae_fabs(h->ptr.pp_double[m][m-1], _state)*(ae_fabs(workv3->ptr.p_double[2], _state)+ae_fabs(workv3->ptr.p_double[3], _state)),ulp*ae_fabs(workv3->ptr.p_double[1], _state)*(ae_fabs(h->ptr.pp_double[m-1][m-1], _state)+ae_fabs(h->ptr.pp_double[m][m], _state)+ae_fabs(h->ptr.pp_double[m+1][m+1], _state))) ) { break; } } /* * Double-shift QR step */ for(k=m; k<=i-1; k++) { /* * The first iteration of this loop determines a reflection G * from the vector V and applies it from left and right to H, * thus creating a nonzero bulge below the subdiagonal. * * Each subsequent iteration determines a reflection G to * restore the Hessenberg form in the (K-1)th column, and thus * chases the bulge one step toward the bottom of the active * submatrix. NR is the order of G. */ nr = ae_minint(3, i-k+1, _state); if( k>m ) { for(p1=1; p1<=nr; p1++) { workv3->ptr.p_double[p1] = h->ptr.pp_double[k+p1-1][k-1]; } } generatereflection(workv3, nr, &t1, _state); if( k>m ) { h->ptr.pp_double[k][k-1] = workv3->ptr.p_double[1]; h->ptr.pp_double[k+1][k-1] = (double)(0); if( k<i-1 ) { h->ptr.pp_double[k+2][k-1] = (double)(0); } } else { if( m>l ) { /* * ==== Use the following instead of * H( K, K-1 ) = -H( K, K-1 ) to * avoid a bug when v(2) and v(3) * underflow. ==== */ h->ptr.pp_double[k][k-1] = h->ptr.pp_double[k][k-1]*(1-t1); } } v2 = workv3->ptr.p_double[2]; t2 = t1*v2; if( nr==3 ) { v3 = workv3->ptr.p_double[3]; t3 = t1*v3; /* * Apply G from the left to transform the rows of the matrix * in columns K to I2. */ for(j=k; j<=i2; j++) { sum = h->ptr.pp_double[k][j]+v2*h->ptr.pp_double[k+1][j]+v3*h->ptr.pp_double[k+2][j]; h->ptr.pp_double[k][j] = h->ptr.pp_double[k][j]-sum*t1; h->ptr.pp_double[k+1][j] = h->ptr.pp_double[k+1][j]-sum*t2; h->ptr.pp_double[k+2][j] = h->ptr.pp_double[k+2][j]-sum*t3; } /* * Apply G from the right to transform the columns of the * matrix in rows I1 to min(K+3,I). */ for(j=i1; j<=ae_minint(k+3, i, _state); j++) { sum = h->ptr.pp_double[j][k]+v2*h->ptr.pp_double[j][k+1]+v3*h->ptr.pp_double[j][k+2]; h->ptr.pp_double[j][k] = h->ptr.pp_double[j][k]-sum*t1; h->ptr.pp_double[j][k+1] = h->ptr.pp_double[j][k+1]-sum*t2; h->ptr.pp_double[j][k+2] = h->ptr.pp_double[j][k+2]-sum*t3; } if( wantz ) { /* * Accumulate transformations in the matrix Z */ for(j=iloz; j<=ihiz; j++) { sum = z->ptr.pp_double[j][k]+v2*z->ptr.pp_double[j][k+1]+v3*z->ptr.pp_double[j][k+2]; z->ptr.pp_double[j][k] = z->ptr.pp_double[j][k]-sum*t1; z->ptr.pp_double[j][k+1] = z->ptr.pp_double[j][k+1]-sum*t2; z->ptr.pp_double[j][k+2] = z->ptr.pp_double[j][k+2]-sum*t3; } } } else { if( nr==2 ) { /* * Apply G from the left to transform the rows of the matrix * in columns K to I2.
*/ for(j=k; j<=i2; j++) { sum = h->ptr.pp_double[k][j]+v2*h->ptr.pp_double[k+1][j]; h->ptr.pp_double[k][j] = h->ptr.pp_double[k][j]-sum*t1; h->ptr.pp_double[k+1][j] = h->ptr.pp_double[k+1][j]-sum*t2; } /* * Apply G from the right to transform the columns of the * matrix in rows I1 to min(K+3,I). */ for(j=i1; j<=i; j++) { sum = h->ptr.pp_double[j][k]+v2*h->ptr.pp_double[j][k+1]; h->ptr.pp_double[j][k] = h->ptr.pp_double[j][k]-sum*t1; h->ptr.pp_double[j][k+1] = h->ptr.pp_double[j][k+1]-sum*t2; } if( wantz ) { /* * Accumulate transformations in the matrix Z */ for(j=iloz; j<=ihiz; j++) { sum = z->ptr.pp_double[j][k]+v2*z->ptr.pp_double[j][k+1]; z->ptr.pp_double[j][k] = z->ptr.pp_double[j][k]-sum*t1; z->ptr.pp_double[j][k+1] = z->ptr.pp_double[j][k+1]-sum*t2; } } } } } } /* * Failure to converge in remaining number of iterations */ if( failflag ) { *info = i; return; } /* * Convergence */ if( l==i ) { /* * H(I,I-1) is negligible: one eigenvalue has converged. */ wr->ptr.p_double[i] = h->ptr.pp_double[i][i]; wi->ptr.p_double[i] = (double)(0); } else { if( l==i-1 ) { /* * H(I-1,I-2) is negligible: a pair of eigenvalues have converged. * * Transform the 2-by-2 submatrix to standard Schur form, * and compute and store the eigenvalues. */ him1im1 = h->ptr.pp_double[i-1][i-1]; him1i = h->ptr.pp_double[i-1][i]; hiim1 = h->ptr.pp_double[i][i-1]; hii = h->ptr.pp_double[i][i]; hsschur_aux2x2schur(&him1im1, &him1i, &hiim1, &hii, &wrim1, &wiim1, &wri, &wii, &cs, &sn, _state); wr->ptr.p_double[i-1] = wrim1; wi->ptr.p_double[i-1] = wiim1; wr->ptr.p_double[i] = wri; wi->ptr.p_double[i] = wii; h->ptr.pp_double[i-1][i-1] = him1im1; h->ptr.pp_double[i-1][i] = him1i; h->ptr.pp_double[i][i-1] = hiim1; h->ptr.pp_double[i][i] = hii; if( wantt ) { /* * Apply the transformation to the rest of H. */ if( i2>i ) { workc1->ptr.p_double[1] = cs; works1->ptr.p_double[1] = sn; applyrotationsfromtheleft(ae_true, i-1, i, i+1, i2, workc1, works1, h, work, _state); } workc1->ptr.p_double[1] = cs; works1->ptr.p_double[1] = sn; applyrotationsfromtheright(ae_true, i1, i-2, i-1, i, workc1, works1, h, work, _state); } if( wantz ) { /* * Apply the transformation to Z. */ workc1->ptr.p_double[1] = cs; works1->ptr.p_double[1] = sn; applyrotationsfromtheright(ae_true, iloz, iloz+nz-1, i-1, i, workc1, works1, z, work, _state); } } } /* * return to start of the main loop with new value of I. 
*/ i = l-1; } } static void hsschur_aux2x2schur(double* a, double* b, double* c, double* d, double* rt1r, double* rt1i, double* rt2r, double* rt2i, double* cs, double* sn, ae_state *_state) { double multpl; double aa; double bb; double bcmax; double bcmis; double cc; double cs1; double dd; double eps; double p; double sab; double sac; double scl; double sigma; double sn1; double tau; double temp; double z; *rt1r = 0; *rt1i = 0; *rt2r = 0; *rt2i = 0; *cs = 0; *sn = 0; multpl = 4.0; eps = ae_machineepsilon; if( ae_fp_eq(*c,(double)(0)) ) { *cs = (double)(1); *sn = (double)(0); } else { if( ae_fp_eq(*b,(double)(0)) ) { /* * Swap rows and columns */ *cs = (double)(0); *sn = (double)(1); temp = *d; *d = *a; *a = temp; *b = -*c; *c = (double)(0); } else { if( ae_fp_eq(*a-(*d),(double)(0))&&hsschur_extschursigntoone(*b, _state)!=hsschur_extschursigntoone(*c, _state) ) { *cs = (double)(1); *sn = (double)(0); } else { temp = *a-(*d); p = 0.5*temp; bcmax = ae_maxreal(ae_fabs(*b, _state), ae_fabs(*c, _state), _state); bcmis = ae_minreal(ae_fabs(*b, _state), ae_fabs(*c, _state), _state)*hsschur_extschursigntoone(*b, _state)*hsschur_extschursigntoone(*c, _state); scl = ae_maxreal(ae_fabs(p, _state), bcmax, _state); z = p/scl*p+bcmax/scl*bcmis; /* * If Z is of the order of the machine accuracy, postpone the * decision on the nature of eigenvalues */ if( ae_fp_greater_eq(z,multpl*eps) ) { /* * Real eigenvalues. Compute A and D. */ z = p+hsschur_extschursign(ae_sqrt(scl, _state)*ae_sqrt(z, _state), p, _state); *a = *d+z; *d = *d-bcmax/z*bcmis; /* * Compute B and the rotation matrix */ tau = pythag2(*c, z, _state); *cs = z/tau; *sn = *c/tau; *b = *b-(*c); *c = (double)(0); } else { /* * Complex eigenvalues, or real (almost) equal eigenvalues. * Make diagonal elements equal. */ sigma = *b+(*c); tau = pythag2(sigma, temp, _state); *cs = ae_sqrt(0.5*(1+ae_fabs(sigma, _state)/tau), _state); *sn = -p/(tau*(*cs))*hsschur_extschursign((double)(1), sigma, _state); /* * Compute [ AA BB ] = [ A B ] [ CS -SN ] * [ CC DD ] [ C D ] [ SN CS ] */ aa = *a*(*cs)+*b*(*sn); bb = -*a*(*sn)+*b*(*cs); cc = *c*(*cs)+*d*(*sn); dd = -*c*(*sn)+*d*(*cs); /* * Compute [ A B ] = [ CS SN ] [ AA BB ] * [ C D ] [-SN CS ] [ CC DD ] */ *a = aa*(*cs)+cc*(*sn); *b = bb*(*cs)+dd*(*sn); *c = -aa*(*sn)+cc*(*cs); *d = -bb*(*sn)+dd*(*cs); temp = 0.5*(*a+(*d)); *a = temp; *d = temp; if( ae_fp_neq(*c,(double)(0)) ) { if( ae_fp_neq(*b,(double)(0)) ) { if( hsschur_extschursigntoone(*b, _state)==hsschur_extschursigntoone(*c, _state) ) { /* * Real eigenvalues: reduce to upper triangular form */ sab = ae_sqrt(ae_fabs(*b, _state), _state); sac = ae_sqrt(ae_fabs(*c, _state), _state); p = hsschur_extschursign(sab*sac, *c, _state); tau = 1/ae_sqrt(ae_fabs(*b+(*c), _state), _state); *a = temp+p; *d = temp-p; *b = *b-(*c); *c = (double)(0); cs1 = sab*tau; sn1 = sac*tau; temp = *cs*cs1-*sn*sn1; *sn = *cs*sn1+*sn*cs1; *cs = temp; } } else { *b = -*c; *c = (double)(0); temp = *cs; *cs = -*sn; *sn = temp; } } } } } } /* * Store eigenvalues in (RT1R,RT1I) and (RT2R,RT2I). 
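 *
 * After the standardization above, the remaining complex case has A=D and
 * B*C<0, so the eigenvalues are A +/- i*sqrt(|B|*|C|); in the real case
 * C=0 and the eigenvalues are simply the diagonal entries A and D.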
*/ *rt1r = *a; *rt2r = *d; if( ae_fp_eq(*c,(double)(0)) ) { *rt1i = (double)(0); *rt2i = (double)(0); } else { *rt1i = ae_sqrt(ae_fabs(*b, _state), _state)*ae_sqrt(ae_fabs(*c, _state), _state); *rt2i = -*rt1i; } } static double hsschur_extschursign(double a, double b, ae_state *_state) { double result; if( ae_fp_greater_eq(b,(double)(0)) ) { result = ae_fabs(a, _state); } else { result = -ae_fabs(a, _state); } return result; } static ae_int_t hsschur_extschursigntoone(double b, ae_state *_state) { ae_int_t result; if( ae_fp_greater_eq(b,(double)(0)) ) { result = 1; } else { result = -1; } return result; } /************************************************************************* Internal tied ranking subroutine. INPUT PARAMETERS: X - array to rank N - array size IsCentered- whether ranks are centered or not: * True - ranks are centered in such way that their sum is zero * False - ranks are not centered Buf - temporary buffers NOTE: when IsCentered is True and all X[] are equal, this function fills X by zeros (exact zeros are used, not sum which is only approximately equal to zero). *************************************************************************/ void rankx(/* Real */ ae_vector* x, ae_int_t n, ae_bool iscentered, apbuffers* buf, ae_state *_state) { ae_int_t i; ae_int_t j; ae_int_t k; double tmp; double voffs; /* * Prepare */ if( n<1 ) { return; } if( n==1 ) { x->ptr.p_double[0] = (double)(0); return; } if( buf->ra1.cnt<n ) { ae_vector_set_length(&buf->ra1, n, _state); } if( buf->ia1.cnt<n ) { ae_vector_set_length(&buf->ia1, n, _state); } for(i=0; i<=n-1; i++) { buf->ra1.ptr.p_double[i] = x->ptr.p_double[i]; buf->ia1.ptr.p_int[i] = i; } tagsortfasti(&buf->ra1, &buf->ia1, &buf->ra2, &buf->ia2, n, _state); /* * Special test for all values being equal */ if( ae_fp_eq(buf->ra1.ptr.p_double[0],buf->ra1.ptr.p_double[n-1]) ) { if( iscentered ) { tmp = 0.0; } else { tmp = (double)(n-1)/(double)2; } for(i=0; i<=n-1; i++) { x->ptr.p_double[i] = tmp; } return; } /* * compute tied ranks */ i = 0; while(i<=n-1) { j = i+1; while(j<=n-1) { if( ae_fp_neq(buf->ra1.ptr.p_double[j],buf->ra1.ptr.p_double[i]) ) { break; } j = j+1; } for(k=i; k<=j-1; k++) { buf->ra1.ptr.p_double[k] = (double)(i+j-1)/(double)2; } i = j; } /* * back to x */ if( iscentered ) { voffs = (double)(n-1)/(double)2; } else { voffs = 0.0; } for(i=0; i<=n-1; i++) { x->ptr.p_double[buf->ia1.ptr.p_int[i]] = buf->ra1.ptr.p_double[i]-voffs; } } /************************************************************************* Internal untied ranking subroutine. INPUT PARAMETERS: X - array to rank N - array size Buf - temporary buffers Returns untied ranks (in case of a tie ranks are resolved arbitrarily). *************************************************************************/ void rankxuntied(/* Real */ ae_vector* x, ae_int_t n, apbuffers* buf, ae_state *_state) { ae_int_t i; /* * Prepare */ if( n<1 ) { return; } if( n==1 ) { x->ptr.p_double[0] = (double)(0); return; } if( buf->ra1.cnt<n ) { ae_vector_set_length(&buf->ra1, n, _state); } if( buf->ia1.cnt<n ) { ae_vector_set_length(&buf->ia1, n, _state); } for(i=0; i<=n-1; i++) { buf->ra1.ptr.p_double[i] = x->ptr.p_double[i]; buf->ia1.ptr.p_int[i] = i; } tagsortfasti(&buf->ra1, &buf->ia1, &buf->ra2, &buf->ia2, n, _state); for(i=0; i<=n-1; i++) { x->ptr.p_double[buf->ia1.ptr.p_int[i]] = (double)(i); } } /************************************************************************* Prepares HPC computations of chunked gradient with HPCChunkedGradient(). You have to call this function before calling HPCChunkedGradient() for a new set of weights.
You have to call it only once, see example below: HOW TO PROCESS DATASET WITH THIS FUNCTION: Grad:=0 HPCPrepareChunkedGradient(Weights, WCount, NTotal, NIn, NOut, Buf) foreach chunk-of-dataset do HPCChunkedGradient(...) HPCFinalizeChunkedGradient(Buf, Grad) *************************************************************************/ void hpcpreparechunkedgradient(/* Real */ ae_vector* weights, ae_int_t wcount, ae_int_t ntotal, ae_int_t nin, ae_int_t nout, mlpbuffers* buf, ae_state *_state) { ae_int_t i; ae_int_t batch4size; ae_int_t chunksize; chunksize = 4; batch4size = 3*chunksize*ntotal+chunksize*(2*nout+1); if( buf->xy.rows<chunksize||buf->xy.cols<nin+nout ) { ae_matrix_set_length(&buf->xy, chunksize, nin+nout, _state); } if( buf->xy2.rows<chunksize||buf->xy2.cols<nin+nout ) { ae_matrix_set_length(&buf->xy2, chunksize, nin+nout, _state); } if( buf->xyrow.cnt<nin+nout ) { ae_vector_set_length(&buf->xyrow, nin+nout, _state); } if( buf->x.cnt<nin ) { ae_vector_set_length(&buf->x, nin, _state); } if( buf->y.cnt<nout ) { ae_vector_set_length(&buf->y, nout, _state); } if( buf->desiredy.cnt<nout ) { ae_vector_set_length(&buf->desiredy, nout, _state); } if( buf->batch4buf.cnt<batch4size ) { ae_vector_set_length(&buf->batch4buf, batch4size, _state); } if( buf->hpcbuf.cnt<wcount ) { ae_vector_set_length(&buf->hpcbuf, wcount, _state); } if( buf->g.cnt<wcount ) { ae_vector_set_length(&buf->g, wcount, _state); } if( !hpccores_hpcpreparechunkedgradientx(weights, wcount, &buf->hpcbuf, _state) ) { for(i=0; i<=wcount-1; i++) { buf->hpcbuf.ptr.p_double[i] = 0.0; } } buf->wcount = wcount; buf->ntotal = ntotal; buf->nin = nin; buf->nout = nout; buf->chunksize = chunksize; } /************************************************************************* Finalizes HPC computations of chunked gradient with HPCChunkedGradient(). You have to call this function after calling HPCChunkedGradient() for a new set of weights. You have to call it only once, see example below: HOW TO PROCESS DATASET WITH THIS FUNCTION: Grad:=0 HPCPrepareChunkedGradient(Weights, WCount, NTotal, NIn, NOut, Buf) foreach chunk-of-dataset do HPCChunkedGradient(...) HPCFinalizeChunkedGradient(Buf, Grad) *************************************************************************/ void hpcfinalizechunkedgradient(mlpbuffers* buf, /* Real */ ae_vector* grad, ae_state *_state) { ae_int_t i; if( !hpccores_hpcfinalizechunkedgradientx(&buf->hpcbuf, buf->wcount, grad, _state) ) { for(i=0; i<=buf->wcount-1; i++) { grad->ptr.p_double[i] = grad->ptr.p_double[i]+buf->hpcbuf.ptr.p_double[i]; } } } /************************************************************************* Fast kernel for chunked gradient. *************************************************************************/ ae_bool hpcchunkedgradient(/* Real */ ae_vector* weights, /* Integer */ ae_vector* structinfo, /* Real */ ae_vector* columnmeans, /* Real */ ae_vector* columnsigmas, /* Real */ ae_matrix* xy, ae_int_t cstart, ae_int_t csize, /* Real */ ae_vector* batch4buf, /* Real */ ae_vector* hpcbuf, double* e, ae_bool naturalerrorfunc, ae_state *_state) { #ifndef ALGLIB_INTERCEPTS_SSE2 ae_bool result; result = ae_false; return result; #else return _ialglib_i_hpcchunkedgradient(weights, structinfo, columnmeans, columnsigmas, xy, cstart, csize, batch4buf, hpcbuf, e, naturalerrorfunc); #endif } /************************************************************************* Fast kernel for chunked processing.
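NOTE: like HPCChunkedGradient(), this kernel is compiled as a real SSE2
      intercept only when ALGLIB_INTERCEPTS_SSE2 is defined; otherwise it
      simply returns False and the caller is expected to fall back to the
      generic (non-HPC) implementation.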
*************************************************************************/ ae_bool hpcchunkedprocess(/* Real */ ae_vector* weights, /* Integer */ ae_vector* structinfo, /* Real */ ae_vector* columnmeans, /* Real */ ae_vector* columnsigmas, /* Real */ ae_matrix* xy, ae_int_t cstart, ae_int_t csize, /* Real */ ae_vector* batch4buf, /* Real */ ae_vector* hpcbuf, ae_state *_state) { #ifndef ALGLIB_INTERCEPTS_SSE2 ae_bool result; result = ae_false; return result; #else return _ialglib_i_hpcchunkedprocess(weights, structinfo, columnmeans, columnsigmas, xy, cstart, csize, batch4buf, hpcbuf); #endif } /************************************************************************* Stub function. -- ALGLIB routine -- 14.06.2013 Bochkanov Sergey *************************************************************************/ static ae_bool hpccores_hpcpreparechunkedgradientx(/* Real */ ae_vector* weights, ae_int_t wcount, /* Real */ ae_vector* hpcbuf, ae_state *_state) { #ifndef ALGLIB_INTERCEPTS_SSE2 ae_bool result; result = ae_false; return result; #else return _ialglib_i_hpcpreparechunkedgradientx(weights, wcount, hpcbuf); #endif } /************************************************************************* Stub function. -- ALGLIB routine -- 14.06.2013 Bochkanov Sergey *************************************************************************/ static ae_bool hpccores_hpcfinalizechunkedgradientx(/* Real */ ae_vector* buf, ae_int_t wcount, /* Real */ ae_vector* grad, ae_state *_state) { #ifndef ALGLIB_INTERCEPTS_SSE2 ae_bool result; result = ae_false; return result; #else return _ialglib_i_hpcfinalizechunkedgradientx(buf, wcount, grad); #endif } void _mlpbuffers_init(void* _p, ae_state *_state) { mlpbuffers *p = (mlpbuffers*)_p; ae_touch_ptr((void*)p); ae_vector_init(&p->batch4buf, 0, DT_REAL, _state); ae_vector_init(&p->hpcbuf, 0, DT_REAL, _state); ae_matrix_init(&p->xy, 0, 0, DT_REAL, _state); ae_matrix_init(&p->xy2, 0, 0, DT_REAL, _state); ae_vector_init(&p->xyrow, 0, DT_REAL, _state); ae_vector_init(&p->x, 0, DT_REAL, _state); ae_vector_init(&p->y, 0, DT_REAL, _state); ae_vector_init(&p->desiredy, 0, DT_REAL, _state); ae_vector_init(&p->g, 0, DT_REAL, _state); ae_vector_init(&p->tmp0, 0, DT_REAL, _state); } void _mlpbuffers_init_copy(void* _dst, void* _src, ae_state *_state) { mlpbuffers *dst = (mlpbuffers*)_dst; mlpbuffers *src = (mlpbuffers*)_src; dst->chunksize = src->chunksize; dst->ntotal = src->ntotal; dst->nin = src->nin; dst->nout = src->nout; dst->wcount = src->wcount; ae_vector_init_copy(&dst->batch4buf, &src->batch4buf, _state); ae_vector_init_copy(&dst->hpcbuf, &src->hpcbuf, _state); ae_matrix_init_copy(&dst->xy, &src->xy, _state); ae_matrix_init_copy(&dst->xy2, &src->xy2, _state); ae_vector_init_copy(&dst->xyrow, &src->xyrow, _state); ae_vector_init_copy(&dst->x, &src->x, _state); ae_vector_init_copy(&dst->y, &src->y, _state); ae_vector_init_copy(&dst->desiredy, &src->desiredy, _state); dst->e = src->e; ae_vector_init_copy(&dst->g, &src->g, _state); ae_vector_init_copy(&dst->tmp0, &src->tmp0, _state); } void _mlpbuffers_clear(void* _p) { mlpbuffers *p = (mlpbuffers*)_p; ae_touch_ptr((void*)p); ae_vector_clear(&p->batch4buf); ae_vector_clear(&p->hpcbuf); ae_matrix_clear(&p->xy); ae_matrix_clear(&p->xy2); ae_vector_clear(&p->xyrow); ae_vector_clear(&p->x); ae_vector_clear(&p->y); ae_vector_clear(&p->desiredy); ae_vector_clear(&p->g); ae_vector_clear(&p->tmp0); } void _mlpbuffers_destroy(void* _p) { mlpbuffers *p = (mlpbuffers*)_p; ae_touch_ptr((void*)p); ae_vector_destroy(&p->batch4buf); 
ae_vector_destroy(&p->hpcbuf); ae_matrix_destroy(&p->xy); ae_matrix_destroy(&p->xy2); ae_vector_destroy(&p->xyrow); ae_vector_destroy(&p->x); ae_vector_destroy(&p->y); ae_vector_destroy(&p->desiredy); ae_vector_destroy(&p->g); ae_vector_destroy(&p->tmp0); } void findprimitiverootandinverse(ae_int_t n, ae_int_t* proot, ae_int_t* invproot, ae_state *_state) { ae_int_t candroot; ae_int_t phin; ae_int_t q; ae_int_t f; ae_bool allnonone; ae_int_t x; ae_int_t lastx; ae_int_t y; ae_int_t lasty; ae_int_t a; ae_int_t b; ae_int_t t; ae_int_t n2; *proot = 0; *invproot = 0; ae_assert(n>=3, "FindPrimitiveRootAndInverse: N<3", _state); *proot = 0; *invproot = 0; /* * check that N is prime */ ae_assert(ntheory_isprime(n, _state), "FindPrimitiveRoot: N is not prime", _state); /* * Because N is prime, Euler totient function is equal to N-1 */ phin = n-1; /* * Test different values of PRoot - from 2 to N-1. * One of these values MUST be primitive root. * * For testing we use algorithm from Wiki (Primitive root modulo n): * * compute phi(N) * * determine the different prime factors of phi(N), say p1, ..., pk * * for every element m of Zn*, compute m^(phi(N)/pi) mod N for i=1..k * using a fast algorithm for modular exponentiation. * * a number m for which these k results are all different from 1 is a * primitive root. */ for(candroot=2; candroot<=n-1; candroot++) { /* * We have current candidate root in CandRoot. * * Scan different prime factors of PhiN. Here: * * F is a current candidate factor * * Q is a current quotient - amount which was left after dividing PhiN * by all previous factors * * For each factor, perform test mentioned above. */ q = phin; f = 2; allnonone = ae_true; while(q>1) { if( q%f==0 ) { t = ntheory_modexp(candroot, phin/f, n, _state); if( t==1 ) { allnonone = ae_false; break; } while(q%f==0) { q = q/f; } } f = f+1; } if( allnonone ) { *proot = candroot; break; } } ae_assert(*proot>=2, "FindPrimitiveRoot: internal error (root not found)", _state); /* * Use extended Euclidean algorithm to find multiplicative inverse of primitive root */ x = 0; lastx = 1; y = 1; lasty = 0; a = *proot; b = n; while(b!=0) { q = a/b; t = a%b; a = b; b = t; t = lastx-q*x; lastx = x; x = t; t = lasty-q*y; lasty = y; y = t; } while(lastx<0) { lastx = lastx+n; } *invproot = lastx; /* * Check that it is safe to perform multiplication modulo N. * Check results for consistency. 
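 *
 * The division-based asserts below verify that (N-1)*(N-1) and
 * PRoot*InvPRoot did not overflow ae_int_t (dividing the product by one
 * factor must recover the other factor), and that PRoot*InvPRoot mod N
 * equals 1.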
*/ n2 = (n-1)*(n-1); ae_assert(n2/(n-1)==n-1, "FindPrimitiveRoot: internal error", _state); ae_assert(*proot*(*invproot)/(*proot)==(*invproot), "FindPrimitiveRoot: internal error", _state); ae_assert(*proot*(*invproot)/(*invproot)==(*proot), "FindPrimitiveRoot: internal error", _state); ae_assert(*proot*(*invproot)%n==1, "FindPrimitiveRoot: internal error", _state); } static ae_bool ntheory_isprime(ae_int_t n, ae_state *_state) { ae_int_t p; ae_bool result; result = ae_false; p = 2; while(p*p<=n) { if( n%p==0 ) { return result; } p = p+1; } result = ae_true; return result; } static ae_int_t ntheory_modmul(ae_int_t a, ae_int_t b, ae_int_t n, ae_state *_state) { ae_int_t t; double ra; double rb; ae_int_t result; ae_assert(a>=0&&a<n, "ModMul: A<0 or A>=N", _state); ae_assert(b>=0&&b<n, "ModMul: B<0 or B>=N", _state); /* * Base cases */ ra = (double)(a); rb = (double)(b); if( b==0||a==0 ) { result = 0; return result; } if( b==1||a==1 ) { result = a*b; return result; } if( ae_fp_eq(ra*rb,(double)(a*b)) ) { result = a*b%n; return result; } /* * Non-base cases */ if( b%2==0 ) { /* * A*B = (A*(B/2)) * 2 * * Product T=A*(B/2) is calculated recursively, product T*2 is * calculated as follows: * * result:=T-N * * result:=result+T * * if result<0 then result:=result+N * * In case integer result overflows, we generate exception */ t = ntheory_modmul(a, b/2, n, _state); result = t-n; result = result+t; if( result<0 ) { result = result+n; } } else { /* * A*B = (A*(B div 2)) * 2 + A * * Product T=A*(B/2) is calculated recursively, product T*2 is * calculated as follows: * * result:=T-N * * result:=result+T * * if result<0 then result:=result+N * * In case integer result overflows, we generate exception */ t = ntheory_modmul(a, b/2, n, _state); result = t-n; result = result+t; if( result<0 ) { result = result+n; } result = result-n; result = result+a; if( result<0 ) { result = result+n; } } return result; } static ae_int_t ntheory_modexp(ae_int_t a, ae_int_t b, ae_int_t n, ae_state *_state) { ae_int_t t; ae_int_t result; ae_assert(a>=0&&a<n, "ModExp: A<0 or A>=N", _state); ae_assert(b>=0, "ModExp: B<0", _state); /* * Base cases */ if( b==0 ) { result = 1; return result; } if( b==1 ) { result = a; return result; } /* * Non-base cases */ if( b%2==0 ) { t = ntheory_modmul(a, a, n, _state); result = ntheory_modexp(t, b/2, n, _state); } else { t = ntheory_modmul(a, a, n, _state); result = ntheory_modexp(t, b/2, n, _state); result = ntheory_modmul(result, a, n, _state); } return result; } /************************************************************************* This subroutine generates FFT plan for K complex FFT's with length N each. INPUT PARAMETERS: N - FFT length (in complex numbers), N>=1 K - number of repetitions, K>=1 OUTPUT PARAMETERS: Plan - plan -- ALGLIB -- Copyright 05.04.2013 by Bochkanov Sergey *************************************************************************/ void ftcomplexfftplan(ae_int_t n, ae_int_t k, fasttransformplan* plan, ae_state *_state) { ae_frame _frame_block; srealarray bluesteinbuf; ae_int_t rowptr; ae_int_t bluesteinsize; ae_int_t precrptr; ae_int_t preciptr; ae_int_t precrsize; ae_int_t precisize; ae_frame_make(_state, &_frame_block); _fasttransformplan_clear(plan); _srealarray_init(&bluesteinbuf, _state); /* * Initial check for parameters */ ae_assert(n>0, "FTComplexFFTPlan: N<=0", _state); ae_assert(k>0, "FTComplexFFTPlan: K<=0", _state); /* * Determine required sizes of precomputed real and integer * buffers.
This stage of code is highly dependent on internals * of FTComplexFFTPlanRec() and must be kept synchronized with * possible changes in internals of plan generation function. * * Buffer size is determined as follows: * * N is factorized * * we factor out anything which is less or equal to MaxRadix * * prime factor F>RaderThreshold requires 4*FTBaseFindSmooth(2*F-1) * real entries to store precomputed Quantities for Bluestein's * transformation * * prime factor F<=RaderThreshold does NOT require * precomputed storage */ precrsize = 0; precisize = 0; ftbase_ftdeterminespacerequirements(n, &precrsize, &precisize, _state); if( precrsize>0 ) { ae_vector_set_length(&plan->precr, precrsize, _state); } if( precisize>0 ) { ae_vector_set_length(&plan->preci, precisize, _state); } /* * Generate plan */ rowptr = 0; precrptr = 0; preciptr = 0; bluesteinsize = 1; ae_vector_set_length(&plan->buffer, 2*n*k, _state); ftbase_ftcomplexfftplanrec(n, k, ae_true, ae_true, &rowptr, &bluesteinsize, &precrptr, &preciptr, plan, _state); ae_vector_set_length(&bluesteinbuf.val, bluesteinsize, _state); ae_shared_pool_set_seed(&plan->bluesteinpool, &bluesteinbuf, sizeof(bluesteinbuf), _srealarray_init, _srealarray_init_copy, _srealarray_destroy, _state); /* * Check that actual amount of precomputed space used by transformation * plan is EXACTLY equal to amount of space allocated by us. */ ae_assert(precrptr==precrsize, "FTComplexFFTPlan: internal error (PrecRPtr<>PrecRSize)", _state); ae_assert(preciptr==precisize, "FTComplexFFTPlan: internal error (PrecRPtr<>PrecRSize)", _state); ae_frame_leave(_state); } /************************************************************************* This subroutine applies transformation plan to input/output array A. INPUT PARAMETERS: Plan - transformation plan A - array, must be large enough for plan to work OffsA - offset of the subarray to process RepCnt - repetition count (transformation is repeatedly applied to subsequent subarrays) OUTPUT PARAMETERS: Plan - plan (temporary buffers can be modified, plan itself is unchanged and can be reused) A - transformed array -- ALGLIB -- Copyright 05.04.2013 by Bochkanov Sergey *************************************************************************/ void ftapplyplan(fasttransformplan* plan, /* Real */ ae_vector* a, ae_int_t offsa, ae_int_t repcnt, ae_state *_state) { ae_int_t plansize; ae_int_t i; plansize = plan->entries.ptr.pp_int[0][ftbase_coloperandscnt]*plan->entries.ptr.pp_int[0][ftbase_coloperandsize]*plan->entries.ptr.pp_int[0][ftbase_colmicrovectorsize]; for(i=0; i<=repcnt-1; i++) { ftbase_ftapplysubplan(plan, 0, a, offsa+plansize*i, 0, &plan->buffer, 1, _state); } } /************************************************************************* Returns good factorization N=N1*N2. Usually N1<=N2 (but not always - small N's may be exception). if N1<>1 then N2<>1. Factorization is chosen depending on task type and codelets we have. 
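For example (illustrative; this assumes that the recommended codelet size
is 5): N=12 is factorized as N1=4, N2=3, because 4 is the largest
codelet-sized divisor of 12, while a prime N=7 has no such divisor and is
returned as N1=1, N2=7.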
-- ALGLIB -- Copyright 01.05.2009 by Bochkanov Sergey *************************************************************************/ void ftbasefactorize(ae_int_t n, ae_int_t tasktype, ae_int_t* n1, ae_int_t* n2, ae_state *_state) { ae_int_t j; *n1 = 0; *n2 = 0; *n1 = 0; *n2 = 0; /* * try to find good codelet */ if( *n1*(*n2)!=n ) { for(j=ftbase_ftbasecodeletrecommended; j>=2; j--) { if( n%j==0 ) { *n1 = j; *n2 = n/j; break; } } } /* * try to factorize N */ if( *n1*(*n2)!=n ) { for(j=ftbase_ftbasecodeletrecommended+1; j<=n-1; j++) { if( n%j==0 ) { *n1 = j; *n2 = n/j; break; } } } /* * looks like N is prime :( */ if( *n1*(*n2)!=n ) { *n1 = 1; *n2 = n; } /* * normalize */ if( *n2==1&&*n1!=1 ) { *n2 = *n1; *n1 = 1; } } /************************************************************************* Is number smooth? -- ALGLIB -- Copyright 01.05.2009 by Bochkanov Sergey *************************************************************************/ ae_bool ftbaseissmooth(ae_int_t n, ae_state *_state) { ae_int_t i; ae_bool result; for(i=2; i<=ftbase_ftbasemaxsmoothfactor; i++) { while(n%i==0) { n = n/i; } } result = n==1; return result; } /************************************************************************* Returns smallest smooth (divisible only by 2, 3, 5) number that is greater than or equal to max(N,2) -- ALGLIB -- Copyright 01.05.2009 by Bochkanov Sergey *************************************************************************/ ae_int_t ftbasefindsmooth(ae_int_t n, ae_state *_state) { ae_int_t best; ae_int_t result; best = 2; while(bestRaderThreshold requires 4*FTBaseFindSmooth(2*F-1) * real entries to store precomputed Quantities for Bluestein's * transformation * * prime factor F<=RaderThreshold requires 2*(F-1)+ESTIMATE(F-1) * precomputed storage */ ncur = n; for(i=2; i<=ftbase_maxradix; i++) { while(ncur%i==0) { ncur = ncur/i; } } f = 2; while(f<=ncur) { while(ncur%f==0) { if( f>ftbase_raderthreshold ) { *precrsize = *precrsize+4*ftbasefindsmooth(2*f-1, _state); } else { *precrsize = *precrsize+2*(f-1); ftbase_ftdeterminespacerequirements(f-1, precrsize, precisize, _state); } ncur = ncur/f; } f = f+1; } } /************************************************************************* Recurrent function called by FTComplexFFTPlan() and other functions. It recursively builds transformation plan INPUT PARAMETERS: N - FFT length (in complex numbers), N>=1 K - number of repetitions, K>=1 ChildPlan - if True, plan generator inserts OpStart/opEnd in the plan header/footer. TopmostPlan - if True, plan generator assumes that it is topmost plan: * it may use global buffer for transpositions and there is no other plan which executes in parallel RowPtr - index which points to past-the-last entry generated so far BluesteinSize- amount of storage (in real numbers) required for Bluestein buffer PrecRPtr - pointer to unused part of precomputed real buffer (Plan.PrecR): * when this function stores some data to precomputed buffer, it advances pointer. * it is responsibility of the function to assert that Plan.PrecR has enough space to store data before actually writing to buffer. * it is responsibility of the caller to allocate enough space before calling this function PrecIPtr - pointer to unused part of precomputed integer buffer (Plan.PrecI): * when this function stores some data to precomputed buffer, it advances pointer. * it is responsibility of the function to assert that Plan.PrecR has enough space to store data before actually writing to buffer. 
* it is responsibility of the caller to allocate enough space before calling this function Plan - plan (generated so far) OUTPUT PARAMETERS: RowPtr - updated pointer (advanced by number of entries generated by function) BluesteinSize- updated amount (may be increased, but may never be decreased) NOTE: in case TopmostPlan is True, ChildPlan is also must be True. -- ALGLIB -- Copyright 05.04.2013 by Bochkanov Sergey *************************************************************************/ static void ftbase_ftcomplexfftplanrec(ae_int_t n, ae_int_t k, ae_bool childplan, ae_bool topmostplan, ae_int_t* rowptr, ae_int_t* bluesteinsize, ae_int_t* precrptr, ae_int_t* preciptr, fasttransformplan* plan, ae_state *_state) { ae_frame _frame_block; srealarray localbuf; ae_int_t m; ae_int_t n1; ae_int_t n2; ae_int_t gq; ae_int_t giq; ae_int_t row0; ae_int_t row1; ae_int_t row2; ae_int_t row3; ae_frame_make(_state, &_frame_block); _srealarray_init(&localbuf, _state); ae_assert(n>0, "FTComplexFFTPlan: N<=0", _state); ae_assert(k>0, "FTComplexFFTPlan: K<=0", _state); ae_assert(!topmostplan||childplan, "FTComplexFFTPlan: ChildPlan is inconsistent with TopmostPlan", _state); /* * Try to generate "topmost" plan */ if( topmostplan&&n>ftbase_recursivethreshold ) { ftbase_ftfactorize(n, ae_false, &n1, &n2, _state); if( n1*n2==0 ) { /* * Handle prime-factor FFT with Bluestein's FFT. * Determine size of Bluestein's buffer. */ m = ftbasefindsmooth(2*n-1, _state); *bluesteinsize = ae_maxint(2*m, *bluesteinsize, _state); /* * Generate plan */ ftbase_ftpushentry2(plan, rowptr, ftbase_opstart, k, n, 2, -1, ftbase_ftoptimisticestimate(n, _state), _state); ftbase_ftpushentry4(plan, rowptr, ftbase_opbluesteinsfft, k, n, 2, m, 2, *precrptr, 0, _state); row0 = *rowptr; ftbase_ftpushentry(plan, rowptr, ftbase_opjmp, 0, 0, 0, 0, _state); ftbase_ftcomplexfftplanrec(m, 1, ae_true, ae_true, rowptr, bluesteinsize, precrptr, preciptr, plan, _state); row1 = *rowptr; plan->entries.ptr.pp_int[row0][ftbase_colparam0] = row1-row0; ftbase_ftpushentry(plan, rowptr, ftbase_opend, k, n, 2, 0, _state); /* * Fill precomputed buffer */ ftbase_ftprecomputebluesteinsfft(n, m, &plan->precr, *precrptr, _state); /* * Update pointer to the precomputed area */ *precrptr = *precrptr+4*m; } else { /* * Handle composite FFT with recursive Cooley-Tukey which * uses global buffer instead of local one. 
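 *
 * The entries emitted below encode the classic four-step scheme:
 * transpose (parameter N1), K*N2 subtransforms of length N1 (first
 * parallel call), multiplication by twiddle factors, another transpose
 * (parameter N2), K*N1 subtransforms of length N2 (second parallel
 * call) and a final transpose; the two child subplans referenced by the
 * parallel calls are generated after the parent plan.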
*/ ftbase_ftpushentry2(plan, rowptr, ftbase_opstart, k, n, 2, -1, ftbase_ftoptimisticestimate(n, _state), _state); ftbase_ftpushentry(plan, rowptr, ftbase_opcomplextranspose, k, n, 2, n1, _state); row0 = *rowptr; ftbase_ftpushentry2(plan, rowptr, ftbase_opparallelcall, k*n2, n1, 2, 0, ftbase_ftoptimisticestimate(n, _state), _state); ftbase_ftpushentry(plan, rowptr, ftbase_opcomplexfftfactors, k, n, 2, n1, _state); ftbase_ftpushentry(plan, rowptr, ftbase_opcomplextranspose, k, n, 2, n2, _state); row2 = *rowptr; ftbase_ftpushentry2(plan, rowptr, ftbase_opparallelcall, k*n1, n2, 2, 0, ftbase_ftoptimisticestimate(n, _state), _state); ftbase_ftpushentry(plan, rowptr, ftbase_opcomplextranspose, k, n, 2, n1, _state); ftbase_ftpushentry(plan, rowptr, ftbase_opend, k, n, 2, 0, _state); row1 = *rowptr; ftbase_ftcomplexfftplanrec(n1, 1, ae_true, ae_false, rowptr, bluesteinsize, precrptr, preciptr, plan, _state); plan->entries.ptr.pp_int[row0][ftbase_colparam0] = row1-row0; row3 = *rowptr; ftbase_ftcomplexfftplanrec(n2, 1, ae_true, ae_false, rowptr, bluesteinsize, precrptr, preciptr, plan, _state); plan->entries.ptr.pp_int[row2][ftbase_colparam0] = row3-row2; } ae_frame_leave(_state); return; } /* * Prepare "non-topmost" plan: * * calculate factorization * * use local (shared) buffer * * update buffer size - ANY plan will need at least * 2*N temporaries, additional requirements can be * applied later */ ftbase_ftfactorize(n, ae_false, &n1, &n2, _state); /* * Handle FFT's with N1*N2=0: either small-N or prime-factor */ if( n1*n2==0 ) { if( n<=ftbase_maxradix ) { /* * Small-N FFT */ if( childplan ) { ftbase_ftpushentry2(plan, rowptr, ftbase_opstart, k, n, 2, -1, ftbase_ftoptimisticestimate(n, _state), _state); } ftbase_ftpushentry(plan, rowptr, ftbase_opcomplexcodeletfft, k, n, 2, 0, _state); if( childplan ) { ftbase_ftpushentry(plan, rowptr, ftbase_opend, k, n, 2, 0, _state); } ae_frame_leave(_state); return; } if( n<=ftbase_raderthreshold ) { /* * Handle prime-factor FFT's with Rader's FFT */ m = n-1; if( childplan ) { ftbase_ftpushentry2(plan, rowptr, ftbase_opstart, k, n, 2, -1, ftbase_ftoptimisticestimate(n, _state), _state); } findprimitiverootandinverse(n, &gq, &giq, _state); ftbase_ftpushentry4(plan, rowptr, ftbase_opradersfft, k, n, 2, 2, gq, giq, *precrptr, _state); ftbase_ftprecomputeradersfft(n, gq, giq, &plan->precr, *precrptr, _state); *precrptr = *precrptr+2*(n-1); row0 = *rowptr; ftbase_ftpushentry(plan, rowptr, ftbase_opjmp, 0, 0, 0, 0, _state); ftbase_ftcomplexfftplanrec(m, 1, ae_true, ae_false, rowptr, bluesteinsize, precrptr, preciptr, plan, _state); row1 = *rowptr; plan->entries.ptr.pp_int[row0][ftbase_colparam0] = row1-row0; if( childplan ) { ftbase_ftpushentry(plan, rowptr, ftbase_opend, k, n, 2, 0, _state); } } else { /* * Handle prime-factor FFT's with Bluestein's FFT */ m = ftbasefindsmooth(2*n-1, _state); *bluesteinsize = ae_maxint(2*m, *bluesteinsize, _state); if( childplan ) { ftbase_ftpushentry2(plan, rowptr, ftbase_opstart, k, n, 2, -1, ftbase_ftoptimisticestimate(n, _state), _state); } ftbase_ftpushentry4(plan, rowptr, ftbase_opbluesteinsfft, k, n, 2, m, 2, *precrptr, 0, _state); ftbase_ftprecomputebluesteinsfft(n, m, &plan->precr, *precrptr, _state); *precrptr = *precrptr+4*m; row0 = *rowptr; ftbase_ftpushentry(plan, rowptr, ftbase_opjmp, 0, 0, 0, 0, _state); ftbase_ftcomplexfftplanrec(m, 1, ae_true, ae_false, rowptr, bluesteinsize, precrptr, preciptr, plan, _state); row1 = *rowptr; plan->entries.ptr.pp_int[row0][ftbase_colparam0] = row1-row0; if( childplan ) { 
ftbase_ftpushentry(plan, rowptr, ftbase_opend, k, n, 2, 0, _state); } } ae_frame_leave(_state); return; } /* * Handle Cooley-Tukey FFT with small N1 */ if( n1<=ftbase_maxradix ) { /* * Specialized transformation for small N1: * * N2 short inplace FFT's, each N1-point, with integrated twiddle factors * * N1 long FFT's * * final transposition */ if( childplan ) { ftbase_ftpushentry2(plan, rowptr, ftbase_opstart, k, n, 2, -1, ftbase_ftoptimisticestimate(n, _state), _state); } ftbase_ftpushentry(plan, rowptr, ftbase_opcomplexcodelettwfft, k, n1, 2*n2, 0, _state); ftbase_ftcomplexfftplanrec(n2, k*n1, ae_false, ae_false, rowptr, bluesteinsize, precrptr, preciptr, plan, _state); ftbase_ftpushentry(plan, rowptr, ftbase_opcomplextranspose, k, n, 2, n1, _state); if( childplan ) { ftbase_ftpushentry(plan, rowptr, ftbase_opend, k, n, 2, 0, _state); } ae_frame_leave(_state); return; } /* * Handle general Cooley-Tukey FFT, either "flat" or "recursive" */ if( n<=ftbase_recursivethreshold ) { /* * General code for large N1/N2, "flat" version without explicit recurrence * (nested subplans are inserted directly into the body of the plan) */ if( childplan ) { ftbase_ftpushentry2(plan, rowptr, ftbase_opstart, k, n, 2, -1, ftbase_ftoptimisticestimate(n, _state), _state); } ftbase_ftpushentry(plan, rowptr, ftbase_opcomplextranspose, k, n, 2, n1, _state); ftbase_ftcomplexfftplanrec(n1, k*n2, ae_false, ae_false, rowptr, bluesteinsize, precrptr, preciptr, plan, _state); ftbase_ftpushentry(plan, rowptr, ftbase_opcomplexfftfactors, k, n, 2, n1, _state); ftbase_ftpushentry(plan, rowptr, ftbase_opcomplextranspose, k, n, 2, n2, _state); ftbase_ftcomplexfftplanrec(n2, k*n1, ae_false, ae_false, rowptr, bluesteinsize, precrptr, preciptr, plan, _state); ftbase_ftpushentry(plan, rowptr, ftbase_opcomplextranspose, k, n, 2, n1, _state); if( childplan ) { ftbase_ftpushentry(plan, rowptr, ftbase_opend, k, n, 2, 0, _state); } } else { /* * General code for large N1/N2, "recursive" version - nested subplans * are separated from the plan body. * * Generate parent plan. */ if( childplan ) { ftbase_ftpushentry2(plan, rowptr, ftbase_opstart, k, n, 2, -1, ftbase_ftoptimisticestimate(n, _state), _state); } ftbase_ftpushentry(plan, rowptr, ftbase_opcomplextranspose, k, n, 2, n1, _state); row0 = *rowptr; ftbase_ftpushentry2(plan, rowptr, ftbase_opparallelcall, k*n2, n1, 2, 0, ftbase_ftoptimisticestimate(n, _state), _state); ftbase_ftpushentry(plan, rowptr, ftbase_opcomplexfftfactors, k, n, 2, n1, _state); ftbase_ftpushentry(plan, rowptr, ftbase_opcomplextranspose, k, n, 2, n2, _state); row2 = *rowptr; ftbase_ftpushentry2(plan, rowptr, ftbase_opparallelcall, k*n1, n2, 2, 0, ftbase_ftoptimisticestimate(n, _state), _state); ftbase_ftpushentry(plan, rowptr, ftbase_opcomplextranspose, k, n, 2, n1, _state); if( childplan ) { ftbase_ftpushentry(plan, rowptr, ftbase_opend, k, n, 2, 0, _state); } /* * Generate child subplans, insert refence to parent plans */ row1 = *rowptr; ftbase_ftcomplexfftplanrec(n1, 1, ae_true, ae_false, rowptr, bluesteinsize, precrptr, preciptr, plan, _state); plan->entries.ptr.pp_int[row0][ftbase_colparam0] = row1-row0; row3 = *rowptr; ftbase_ftcomplexfftplanrec(n2, 1, ae_true, ae_false, rowptr, bluesteinsize, precrptr, preciptr, plan, _state); plan->entries.ptr.pp_int[row2][ftbase_colparam0] = row3-row2; } ae_frame_leave(_state); } /************************************************************************* This function pushes one more entry to the plan. It resizes Entries matrix if needed. 
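Each entry occupies one row of Plan.Entries, with columns (in the order used
throughout this unit) ColType, ColOperandsCnt, ColOperandSize,
ColMicrovectorSize, ColParam0..ColParam3. As an illustrative sketch (not an
additional API), a call like

    FTPushEntry(Plan, RowPtr, OpComplexTranspose, K, N, 2, N1)

appends the row [OpComplexTranspose, K, N, 2, N1, -1, 0, 0] and advances
RowPtr by one.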
INPUT PARAMETERS: Plan - plan (generated so far) RowPtr - index which points to past-the-last entry generated so far EType - entry type EOpCnt - operands count EOpSize - operand size EMcvSize - microvector size EParam0 - parameter 0 OUTPUT PARAMETERS: Plan - updated plan RowPtr - updated pointer NOTE: Param1 is set to -1. -- ALGLIB -- Copyright 05.04.2013 by Bochkanov Sergey *************************************************************************/ static void ftbase_ftpushentry(fasttransformplan* plan, ae_int_t* rowptr, ae_int_t etype, ae_int_t eopcnt, ae_int_t eopsize, ae_int_t emcvsize, ae_int_t eparam0, ae_state *_state) { ftbase_ftpushentry2(plan, rowptr, etype, eopcnt, eopsize, emcvsize, eparam0, -1, _state); } /************************************************************************* Same as FTPushEntry(), but sets Param0 AND Param1. This function pushes one more entry to the plan. It resized Entries matrix if needed. INPUT PARAMETERS: Plan - plan (generated so far) RowPtr - index which points to past-the-last entry generated so far EType - entry type EOpCnt - operands count EOpSize - operand size EMcvSize - microvector size EParam0 - parameter 0 EParam1 - parameter 1 OUTPUT PARAMETERS: Plan - updated plan RowPtr - updated pointer -- ALGLIB -- Copyright 05.04.2013 by Bochkanov Sergey *************************************************************************/ static void ftbase_ftpushentry2(fasttransformplan* plan, ae_int_t* rowptr, ae_int_t etype, ae_int_t eopcnt, ae_int_t eopsize, ae_int_t emcvsize, ae_int_t eparam0, ae_int_t eparam1, ae_state *_state) { if( *rowptr>=plan->entries.rows ) { imatrixresize(&plan->entries, ae_maxint(2*plan->entries.rows, 1, _state), ftbase_colscnt, _state); } plan->entries.ptr.pp_int[*rowptr][ftbase_coltype] = etype; plan->entries.ptr.pp_int[*rowptr][ftbase_coloperandscnt] = eopcnt; plan->entries.ptr.pp_int[*rowptr][ftbase_coloperandsize] = eopsize; plan->entries.ptr.pp_int[*rowptr][ftbase_colmicrovectorsize] = emcvsize; plan->entries.ptr.pp_int[*rowptr][ftbase_colparam0] = eparam0; plan->entries.ptr.pp_int[*rowptr][ftbase_colparam1] = eparam1; plan->entries.ptr.pp_int[*rowptr][ftbase_colparam2] = 0; plan->entries.ptr.pp_int[*rowptr][ftbase_colparam3] = 0; *rowptr = *rowptr+1; } /************************************************************************* Same as FTPushEntry(), but sets Param0, Param1, Param2 and Param3. This function pushes one more entry to the plan. It resized Entries matrix if needed. 
INPUT PARAMETERS: Plan - plan (generated so far) RowPtr - index which points to past-the-last entry generated so far EType - entry type EOpCnt - operands count EOpSize - operand size EMcvSize - microvector size EParam0 - parameter 0 EParam1 - parameter 1 EParam2 - parameter 2 EParam3 - parameter 3 OUTPUT PARAMETERS: Plan - updated plan RowPtr - updated pointer -- ALGLIB -- Copyright 05.04.2013 by Bochkanov Sergey *************************************************************************/ static void ftbase_ftpushentry4(fasttransformplan* plan, ae_int_t* rowptr, ae_int_t etype, ae_int_t eopcnt, ae_int_t eopsize, ae_int_t emcvsize, ae_int_t eparam0, ae_int_t eparam1, ae_int_t eparam2, ae_int_t eparam3, ae_state *_state) { if( *rowptr>=plan->entries.rows ) { imatrixresize(&plan->entries, ae_maxint(2*plan->entries.rows, 1, _state), ftbase_colscnt, _state); } plan->entries.ptr.pp_int[*rowptr][ftbase_coltype] = etype; plan->entries.ptr.pp_int[*rowptr][ftbase_coloperandscnt] = eopcnt; plan->entries.ptr.pp_int[*rowptr][ftbase_coloperandsize] = eopsize; plan->entries.ptr.pp_int[*rowptr][ftbase_colmicrovectorsize] = emcvsize; plan->entries.ptr.pp_int[*rowptr][ftbase_colparam0] = eparam0; plan->entries.ptr.pp_int[*rowptr][ftbase_colparam1] = eparam1; plan->entries.ptr.pp_int[*rowptr][ftbase_colparam2] = eparam2; plan->entries.ptr.pp_int[*rowptr][ftbase_colparam3] = eparam3; *rowptr = *rowptr+1; } /************************************************************************* This subroutine applies subplan to input/output array A. INPUT PARAMETERS: Plan - transformation plan SubPlan - subplan index A - array, must be large enough for plan to work ABase - base offset in array A, this value points to start of subarray whose length is equal to length of the plan AOffset - offset with respect to ABase, 0<=AOffsetentries.ptr.pp_int[subplan][ftbase_coltype]==ftbase_opstart, "FTApplySubPlan: incorrect subplan header", _state); rowidx = subplan+1; while(plan->entries.ptr.pp_int[rowidx][ftbase_coltype]!=ftbase_opend) { operation = plan->entries.ptr.pp_int[rowidx][ftbase_coltype]; operandscnt = repcnt*plan->entries.ptr.pp_int[rowidx][ftbase_coloperandscnt]; operandsize = plan->entries.ptr.pp_int[rowidx][ftbase_coloperandsize]; microvectorsize = plan->entries.ptr.pp_int[rowidx][ftbase_colmicrovectorsize]; param0 = plan->entries.ptr.pp_int[rowidx][ftbase_colparam0]; param1 = plan->entries.ptr.pp_int[rowidx][ftbase_colparam1]; touchint(¶m1, _state); /* * Process "jump" operation */ if( operation==ftbase_opjmp ) { rowidx = rowidx+plan->entries.ptr.pp_int[rowidx][ftbase_colparam0]; continue; } /* * Process "parallel call" operation: * * we perform initial check for consistency between parent and child plans * * we call FTSplitAndApplyParallelPlan(), which splits parallel plan into * several parallel tasks */ if( operation==ftbase_opparallelcall ) { parentsize = operandsize*microvectorsize; childsize = plan->entries.ptr.pp_int[rowidx+param0][ftbase_coloperandscnt]*plan->entries.ptr.pp_int[rowidx+param0][ftbase_coloperandsize]*plan->entries.ptr.pp_int[rowidx+param0][ftbase_colmicrovectorsize]; ae_assert(plan->entries.ptr.pp_int[rowidx+param0][ftbase_coltype]==ftbase_opstart, "FTApplySubPlan: incorrect child subplan header", _state); ae_assert(parentsize==childsize, "FTApplySubPlan: incorrect child subplan header", _state); chunksize = ae_maxint(ftbase_recursivethreshold/childsize, 1, _state); lastchunksize = operandscnt%chunksize; if( lastchunksize==0 ) { lastchunksize = chunksize; } i = 0; while(ibluesteinpool, &_bufa, 
_state); ae_shared_pool_retrieve(&plan->bluesteinpool, &_bufb, _state); ae_shared_pool_retrieve(&plan->bluesteinpool, &_bufc, _state); ae_shared_pool_retrieve(&plan->bluesteinpool, &_bufd, _state); ftbase_ftbluesteinsfft(plan, a, abase, aoffset, operandscnt, operandsize, plan->entries.ptr.pp_int[rowidx][ftbase_colparam0], plan->entries.ptr.pp_int[rowidx][ftbase_colparam2], rowidx+plan->entries.ptr.pp_int[rowidx][ftbase_colparam1], &bufa->val, &bufb->val, &bufc->val, &bufd->val, _state); ae_shared_pool_recycle(&plan->bluesteinpool, &_bufa, _state); ae_shared_pool_recycle(&plan->bluesteinpool, &_bufb, _state); ae_shared_pool_recycle(&plan->bluesteinpool, &_bufc, _state); ae_shared_pool_recycle(&plan->bluesteinpool, &_bufd, _state); rowidx = rowidx+1; continue; } /* * Process Rader's FFT */ if( operation==ftbase_opradersfft ) { ftbase_ftradersfft(plan, a, abase, aoffset, operandscnt, operandsize, rowidx+plan->entries.ptr.pp_int[rowidx][ftbase_colparam0], plan->entries.ptr.pp_int[rowidx][ftbase_colparam1], plan->entries.ptr.pp_int[rowidx][ftbase_colparam2], plan->entries.ptr.pp_int[rowidx][ftbase_colparam3], buf, _state); rowidx = rowidx+1; continue; } /* * Process "complex twiddle factors" operation */ if( operation==ftbase_opcomplexfftfactors ) { ae_assert(microvectorsize==2, "FTApplySubPlan: MicrovectorSize<>1", _state); n1 = plan->entries.ptr.pp_int[rowidx][ftbase_colparam0]; n2 = operandsize/n1; for(i=0; i<=operandscnt-1; i++) { ftbase_ffttwcalc(a, abase+aoffset+i*operandsize*2, n1, n2, _state); } rowidx = rowidx+1; continue; } /* * Process "complex transposition" operation */ if( operation==ftbase_opcomplextranspose ) { ae_assert(microvectorsize==2, "FTApplySubPlan: MicrovectorSize<>1", _state); n1 = plan->entries.ptr.pp_int[rowidx][ftbase_colparam0]; n2 = operandsize/n1; for(i=0; i<=operandscnt-1; i++) { ftbase_internalcomplexlintranspose(a, n1, n2, abase+aoffset+i*operandsize*2, buf, _state); } rowidx = rowidx+1; continue; } /* * Error */ ae_assert(ae_false, "FTApplySubPlan: unexpected plan type", _state); } ae_frame_leave(_state); } /************************************************************************* This subroutine applies complex reference FFT to input/output array A. 
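It evaluates the defining sum directly, using the same convention as the fast
transforms in this unit:

    A_out[i] = SUM(A_in[k]*exp(-2*pi*sqrt(-1)*i*k/N), k = 0..N-1)

for every operand, i.e. O(N^2) operations per operand (this merely restates
the double loop in the code below).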
VERY SLOW OPERATION, do not use it in real life plans :) INPUT PARAMETERS: A - array, must be large enough for plan to work Offs - offset of the subarray to process OperandsCnt - operands count (see description of FastTransformPlan) OperandSize - operand size (see description of FastTransformPlan) MicrovectorSize-microvector size (see description of FastTransformPlan) Buf - temporary array, must be at least OperandsCnt*OperandSize*MicrovectorSize OUTPUT PARAMETERS: A - transformed array -- ALGLIB -- Copyright 05.04.2013 by Bochkanov Sergey *************************************************************************/ static void ftbase_ftapplycomplexreffft(/* Real */ ae_vector* a, ae_int_t offs, ae_int_t operandscnt, ae_int_t operandsize, ae_int_t microvectorsize, /* Real */ ae_vector* buf, ae_state *_state) { ae_int_t opidx; ae_int_t i; ae_int_t k; double hre; double him; double c; double s; double re; double im; ae_int_t n; ae_assert(operandscnt>=1, "FTApplyComplexRefFFT: OperandsCnt<1", _state); ae_assert(operandsize>=1, "FTApplyComplexRefFFT: OperandSize<1", _state); ae_assert(microvectorsize==2, "FTApplyComplexRefFFT: MicrovectorSize<>2", _state); n = operandsize; for(opidx=0; opidx<=operandscnt-1; opidx++) { for(i=0; i<=n-1; i++) { hre = (double)(0); him = (double)(0); for(k=0; k<=n-1; k++) { re = a->ptr.p_double[offs+opidx*operandsize*2+2*k+0]; im = a->ptr.p_double[offs+opidx*operandsize*2+2*k+1]; c = ae_cos(-2*ae_pi*k*i/n, _state); s = ae_sin(-2*ae_pi*k*i/n, _state); hre = hre+c*re-s*im; him = him+c*im+s*re; } buf->ptr.p_double[2*i+0] = hre; buf->ptr.p_double[2*i+1] = him; } for(i=0; i<=operandsize*2-1; i++) { a->ptr.p_double[offs+opidx*operandsize*2+i] = buf->ptr.p_double[i]; } } } /************************************************************************* This subroutine applies complex codelet FFT to input/output array A. 
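The codelets are hard-coded butterflies for N=2..6 (N<=MaxRadix), operating on
interleaved real/imaginary pairs. For example, the N=2 branch below computes

    A_out[0] = A[0]+A[1],   A_out[1] = A[0]-A[1]

while larger N reuse short-DFT identities with precomputed cosine/sine
constants (C1..C5 in the code); this note is explanatory only.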
INPUT PARAMETERS: A - array, must be large enough for plan to work Offs - offset of the subarray to process OperandsCnt - operands count (see description of FastTransformPlan) OperandSize - operand size (see description of FastTransformPlan) MicrovectorSize-microvector size, must be 2 OUTPUT PARAMETERS: A - transformed array -- ALGLIB -- Copyright 05.04.2013 by Bochkanov Sergey *************************************************************************/ static void ftbase_ftapplycomplexcodeletfft(/* Real */ ae_vector* a, ae_int_t offs, ae_int_t operandscnt, ae_int_t operandsize, ae_int_t microvectorsize, ae_state *_state) { ae_int_t opidx; ae_int_t n; ae_int_t aoffset; double a0x; double a0y; double a1x; double a1y; double a2x; double a2y; double a3x; double a3y; double a4x; double a4y; double a5x; double a5y; double v0; double v1; double v2; double v3; double t1x; double t1y; double t2x; double t2y; double t3x; double t3y; double t4x; double t4y; double t5x; double t5y; double m1x; double m1y; double m2x; double m2y; double m3x; double m3y; double m4x; double m4y; double m5x; double m5y; double s1x; double s1y; double s2x; double s2y; double s3x; double s3y; double s4x; double s4y; double s5x; double s5y; double c1; double c2; double c3; double c4; double c5; double v; ae_assert(operandscnt>=1, "FTApplyComplexCodeletFFT: OperandsCnt<1", _state); ae_assert(operandsize>=1, "FTApplyComplexCodeletFFT: OperandSize<1", _state); ae_assert(microvectorsize==2, "FTApplyComplexCodeletFFT: MicrovectorSize<>2", _state); n = operandsize; /* * Hard-coded transforms for different N's */ ae_assert(n<=ftbase_maxradix, "FTApplyComplexCodeletFFT: N>MaxRadix", _state); if( n==2 ) { for(opidx=0; opidx<=operandscnt-1; opidx++) { aoffset = offs+opidx*operandsize*2; a0x = a->ptr.p_double[aoffset+0]; a0y = a->ptr.p_double[aoffset+1]; a1x = a->ptr.p_double[aoffset+2]; a1y = a->ptr.p_double[aoffset+3]; v0 = a0x+a1x; v1 = a0y+a1y; v2 = a0x-a1x; v3 = a0y-a1y; a->ptr.p_double[aoffset+0] = v0; a->ptr.p_double[aoffset+1] = v1; a->ptr.p_double[aoffset+2] = v2; a->ptr.p_double[aoffset+3] = v3; } return; } if( n==3 ) { c1 = ae_cos(2*ae_pi/3, _state)-1; c2 = ae_sin(2*ae_pi/3, _state); for(opidx=0; opidx<=operandscnt-1; opidx++) { aoffset = offs+opidx*operandsize*2; a0x = a->ptr.p_double[aoffset+0]; a0y = a->ptr.p_double[aoffset+1]; a1x = a->ptr.p_double[aoffset+2]; a1y = a->ptr.p_double[aoffset+3]; a2x = a->ptr.p_double[aoffset+4]; a2y = a->ptr.p_double[aoffset+5]; t1x = a1x+a2x; t1y = a1y+a2y; a0x = a0x+t1x; a0y = a0y+t1y; m1x = c1*t1x; m1y = c1*t1y; m2x = c2*(a1y-a2y); m2y = c2*(a2x-a1x); s1x = a0x+m1x; s1y = a0y+m1y; a1x = s1x+m2x; a1y = s1y+m2y; a2x = s1x-m2x; a2y = s1y-m2y; a->ptr.p_double[aoffset+0] = a0x; a->ptr.p_double[aoffset+1] = a0y; a->ptr.p_double[aoffset+2] = a1x; a->ptr.p_double[aoffset+3] = a1y; a->ptr.p_double[aoffset+4] = a2x; a->ptr.p_double[aoffset+5] = a2y; } return; } if( n==4 ) { for(opidx=0; opidx<=operandscnt-1; opidx++) { aoffset = offs+opidx*operandsize*2; a0x = a->ptr.p_double[aoffset+0]; a0y = a->ptr.p_double[aoffset+1]; a1x = a->ptr.p_double[aoffset+2]; a1y = a->ptr.p_double[aoffset+3]; a2x = a->ptr.p_double[aoffset+4]; a2y = a->ptr.p_double[aoffset+5]; a3x = a->ptr.p_double[aoffset+6]; a3y = a->ptr.p_double[aoffset+7]; t1x = a0x+a2x; t1y = a0y+a2y; t2x = a1x+a3x; t2y = a1y+a3y; m2x = a0x-a2x; m2y = a0y-a2y; m3x = a1y-a3y; m3y = a3x-a1x; a->ptr.p_double[aoffset+0] = t1x+t2x; a->ptr.p_double[aoffset+1] = t1y+t2y; a->ptr.p_double[aoffset+4] = t1x-t2x; a->ptr.p_double[aoffset+5] = t1y-t2y; 
a->ptr.p_double[aoffset+2] = m2x+m3x; a->ptr.p_double[aoffset+3] = m2y+m3y; a->ptr.p_double[aoffset+6] = m2x-m3x; a->ptr.p_double[aoffset+7] = m2y-m3y; } return; } if( n==5 ) { v = 2*ae_pi/5; c1 = (ae_cos(v, _state)+ae_cos(2*v, _state))/2-1; c2 = (ae_cos(v, _state)-ae_cos(2*v, _state))/2; c3 = -ae_sin(v, _state); c4 = -(ae_sin(v, _state)+ae_sin(2*v, _state)); c5 = ae_sin(v, _state)-ae_sin(2*v, _state); for(opidx=0; opidx<=operandscnt-1; opidx++) { aoffset = offs+opidx*operandsize*2; t1x = a->ptr.p_double[aoffset+2]+a->ptr.p_double[aoffset+8]; t1y = a->ptr.p_double[aoffset+3]+a->ptr.p_double[aoffset+9]; t2x = a->ptr.p_double[aoffset+4]+a->ptr.p_double[aoffset+6]; t2y = a->ptr.p_double[aoffset+5]+a->ptr.p_double[aoffset+7]; t3x = a->ptr.p_double[aoffset+2]-a->ptr.p_double[aoffset+8]; t3y = a->ptr.p_double[aoffset+3]-a->ptr.p_double[aoffset+9]; t4x = a->ptr.p_double[aoffset+6]-a->ptr.p_double[aoffset+4]; t4y = a->ptr.p_double[aoffset+7]-a->ptr.p_double[aoffset+5]; t5x = t1x+t2x; t5y = t1y+t2y; a->ptr.p_double[aoffset+0] = a->ptr.p_double[aoffset+0]+t5x; a->ptr.p_double[aoffset+1] = a->ptr.p_double[aoffset+1]+t5y; m1x = c1*t5x; m1y = c1*t5y; m2x = c2*(t1x-t2x); m2y = c2*(t1y-t2y); m3x = -c3*(t3y+t4y); m3y = c3*(t3x+t4x); m4x = -c4*t4y; m4y = c4*t4x; m5x = -c5*t3y; m5y = c5*t3x; s3x = m3x-m4x; s3y = m3y-m4y; s5x = m3x+m5x; s5y = m3y+m5y; s1x = a->ptr.p_double[aoffset+0]+m1x; s1y = a->ptr.p_double[aoffset+1]+m1y; s2x = s1x+m2x; s2y = s1y+m2y; s4x = s1x-m2x; s4y = s1y-m2y; a->ptr.p_double[aoffset+2] = s2x+s3x; a->ptr.p_double[aoffset+3] = s2y+s3y; a->ptr.p_double[aoffset+4] = s4x+s5x; a->ptr.p_double[aoffset+5] = s4y+s5y; a->ptr.p_double[aoffset+6] = s4x-s5x; a->ptr.p_double[aoffset+7] = s4y-s5y; a->ptr.p_double[aoffset+8] = s2x-s3x; a->ptr.p_double[aoffset+9] = s2y-s3y; } return; } if( n==6 ) { c1 = ae_cos(2*ae_pi/3, _state)-1; c2 = ae_sin(2*ae_pi/3, _state); c3 = ae_cos(-ae_pi/3, _state); c4 = ae_sin(-ae_pi/3, _state); for(opidx=0; opidx<=operandscnt-1; opidx++) { aoffset = offs+opidx*operandsize*2; a0x = a->ptr.p_double[aoffset+0]; a0y = a->ptr.p_double[aoffset+1]; a1x = a->ptr.p_double[aoffset+2]; a1y = a->ptr.p_double[aoffset+3]; a2x = a->ptr.p_double[aoffset+4]; a2y = a->ptr.p_double[aoffset+5]; a3x = a->ptr.p_double[aoffset+6]; a3y = a->ptr.p_double[aoffset+7]; a4x = a->ptr.p_double[aoffset+8]; a4y = a->ptr.p_double[aoffset+9]; a5x = a->ptr.p_double[aoffset+10]; a5y = a->ptr.p_double[aoffset+11]; v0 = a0x; v1 = a0y; a0x = a0x+a3x; a0y = a0y+a3y; a3x = v0-a3x; a3y = v1-a3y; v0 = a1x; v1 = a1y; a1x = a1x+a4x; a1y = a1y+a4y; a4x = v0-a4x; a4y = v1-a4y; v0 = a2x; v1 = a2y; a2x = a2x+a5x; a2y = a2y+a5y; a5x = v0-a5x; a5y = v1-a5y; t4x = a4x*c3-a4y*c4; t4y = a4x*c4+a4y*c3; a4x = t4x; a4y = t4y; t5x = -a5x*c3-a5y*c4; t5y = a5x*c4-a5y*c3; a5x = t5x; a5y = t5y; t1x = a1x+a2x; t1y = a1y+a2y; a0x = a0x+t1x; a0y = a0y+t1y; m1x = c1*t1x; m1y = c1*t1y; m2x = c2*(a1y-a2y); m2y = c2*(a2x-a1x); s1x = a0x+m1x; s1y = a0y+m1y; a1x = s1x+m2x; a1y = s1y+m2y; a2x = s1x-m2x; a2y = s1y-m2y; t1x = a4x+a5x; t1y = a4y+a5y; a3x = a3x+t1x; a3y = a3y+t1y; m1x = c1*t1x; m1y = c1*t1y; m2x = c2*(a4y-a5y); m2y = c2*(a5x-a4x); s1x = a3x+m1x; s1y = a3y+m1y; a4x = s1x+m2x; a4y = s1y+m2y; a5x = s1x-m2x; a5y = s1y-m2y; a->ptr.p_double[aoffset+0] = a0x; a->ptr.p_double[aoffset+1] = a0y; a->ptr.p_double[aoffset+2] = a3x; a->ptr.p_double[aoffset+3] = a3y; a->ptr.p_double[aoffset+4] = a1x; a->ptr.p_double[aoffset+5] = a1y; a->ptr.p_double[aoffset+6] = a4x; a->ptr.p_double[aoffset+7] = a4y; a->ptr.p_double[aoffset+8] = a2x; 
a->ptr.p_double[aoffset+9] = a2y; a->ptr.p_double[aoffset+10] = a5x; a->ptr.p_double[aoffset+11] = a5y; } return; } } /************************************************************************* This subroutine applies complex "integrated" codelet FFT to input/output array A. "Integrated" codelet differs from "normal" one in following ways: * it can work with MicrovectorSize>1 * hence, it can be used in Cooley-Tukey FFT without transpositions * it performs inlined multiplication by twiddle factors of Cooley-Tukey FFT with N2=MicrovectorSize/2. INPUT PARAMETERS: A - array, must be large enough for plan to work Offs - offset of the subarray to process OperandsCnt - operands count (see description of FastTransformPlan) OperandSize - operand size (see description of FastTransformPlan) MicrovectorSize-microvector size, must be 1 OUTPUT PARAMETERS: A - transformed array -- ALGLIB -- Copyright 05.04.2013 by Bochkanov Sergey *************************************************************************/ static void ftbase_ftapplycomplexcodelettwfft(/* Real */ ae_vector* a, ae_int_t offs, ae_int_t operandscnt, ae_int_t operandsize, ae_int_t microvectorsize, ae_state *_state) { ae_int_t opidx; ae_int_t mvidx; ae_int_t n; ae_int_t m; ae_int_t aoffset0; ae_int_t aoffset2; ae_int_t aoffset4; ae_int_t aoffset6; ae_int_t aoffset8; ae_int_t aoffset10; double a0x; double a0y; double a1x; double a1y; double a2x; double a2y; double a3x; double a3y; double a4x; double a4y; double a5x; double a5y; double v0; double v1; double v2; double v3; double q0x; double q0y; double t1x; double t1y; double t2x; double t2y; double t3x; double t3y; double t4x; double t4y; double t5x; double t5y; double m1x; double m1y; double m2x; double m2y; double m3x; double m3y; double m4x; double m4y; double m5x; double m5y; double s1x; double s1y; double s2x; double s2y; double s3x; double s3y; double s4x; double s4y; double s5x; double s5y; double c1; double c2; double c3; double c4; double c5; double v; double tw0; double tw1; double twx; double twxm1; double twy; double tw2x; double tw2y; double tw3x; double tw3y; double tw4x; double tw4y; double tw5x; double tw5y; ae_assert(operandscnt>=1, "FTApplyComplexCodeletFFT: OperandsCnt<1", _state); ae_assert(operandsize>=1, "FTApplyComplexCodeletFFT: OperandSize<1", _state); ae_assert(microvectorsize>=1, "FTApplyComplexCodeletFFT: MicrovectorSize<>1", _state); ae_assert(microvectorsize%2==0, "FTApplyComplexCodeletFFT: MicrovectorSize is not even", _state); n = operandsize; m = microvectorsize/2; /* * Hard-coded transforms for different N's */ ae_assert(n<=ftbase_maxradix, "FTApplyComplexCodeletTwFFT: N>MaxRadix", _state); if( n==2 ) { v = -2*ae_pi/(n*m); tw0 = -2*ae_sqr(ae_sin(0.5*v, _state), _state); tw1 = ae_sin(v, _state); for(opidx=0; opidx<=operandscnt-1; opidx++) { aoffset0 = offs+opidx*operandsize*microvectorsize; aoffset2 = aoffset0+microvectorsize; twxm1 = 0.0; twy = 0.0; for(mvidx=0; mvidx<=m-1; mvidx++) { a0x = a->ptr.p_double[aoffset0]; a0y = a->ptr.p_double[aoffset0+1]; a1x = a->ptr.p_double[aoffset2]; a1y = a->ptr.p_double[aoffset2+1]; v0 = a0x+a1x; v1 = a0y+a1y; v2 = a0x-a1x; v3 = a0y-a1y; a->ptr.p_double[aoffset0] = v0; a->ptr.p_double[aoffset0+1] = v1; a->ptr.p_double[aoffset2] = v2*(1+twxm1)-v3*twy; a->ptr.p_double[aoffset2+1] = v3*(1+twxm1)+v2*twy; aoffset0 = aoffset0+2; aoffset2 = aoffset2+2; if( (mvidx+1)%ftbase_updatetw==0 ) { v = -2*ae_pi*(mvidx+1)/(n*m); twxm1 = ae_sin(0.5*v, _state); twxm1 = -2*twxm1*twxm1; twy = ae_sin(v, _state); } else { v = 
twxm1+tw0+twxm1*tw0-twy*tw1; twy = twy+tw1+twxm1*tw1+twy*tw0; twxm1 = v; } } } return; } if( n==3 ) { v = -2*ae_pi/(n*m); tw0 = -2*ae_sqr(ae_sin(0.5*v, _state), _state); tw1 = ae_sin(v, _state); c1 = ae_cos(2*ae_pi/3, _state)-1; c2 = ae_sin(2*ae_pi/3, _state); for(opidx=0; opidx<=operandscnt-1; opidx++) { aoffset0 = offs+opidx*operandsize*microvectorsize; aoffset2 = aoffset0+microvectorsize; aoffset4 = aoffset2+microvectorsize; twx = 1.0; twxm1 = 0.0; twy = 0.0; for(mvidx=0; mvidx<=m-1; mvidx++) { a0x = a->ptr.p_double[aoffset0]; a0y = a->ptr.p_double[aoffset0+1]; a1x = a->ptr.p_double[aoffset2]; a1y = a->ptr.p_double[aoffset2+1]; a2x = a->ptr.p_double[aoffset4]; a2y = a->ptr.p_double[aoffset4+1]; t1x = a1x+a2x; t1y = a1y+a2y; a0x = a0x+t1x; a0y = a0y+t1y; m1x = c1*t1x; m1y = c1*t1y; m2x = c2*(a1y-a2y); m2y = c2*(a2x-a1x); s1x = a0x+m1x; s1y = a0y+m1y; a1x = s1x+m2x; a1y = s1y+m2y; a2x = s1x-m2x; a2y = s1y-m2y; tw2x = twx*twx-twy*twy; tw2y = 2*twx*twy; a->ptr.p_double[aoffset0] = a0x; a->ptr.p_double[aoffset0+1] = a0y; a->ptr.p_double[aoffset2] = a1x*twx-a1y*twy; a->ptr.p_double[aoffset2+1] = a1y*twx+a1x*twy; a->ptr.p_double[aoffset4] = a2x*tw2x-a2y*tw2y; a->ptr.p_double[aoffset4+1] = a2y*tw2x+a2x*tw2y; aoffset0 = aoffset0+2; aoffset2 = aoffset2+2; aoffset4 = aoffset4+2; if( (mvidx+1)%ftbase_updatetw==0 ) { v = -2*ae_pi*(mvidx+1)/(n*m); twxm1 = ae_sin(0.5*v, _state); twxm1 = -2*twxm1*twxm1; twy = ae_sin(v, _state); twx = twxm1+1; } else { v = twxm1+tw0+twxm1*tw0-twy*tw1; twy = twy+tw1+twxm1*tw1+twy*tw0; twxm1 = v; twx = v+1; } } } return; } if( n==4 ) { v = -2*ae_pi/(n*m); tw0 = -2*ae_sqr(ae_sin(0.5*v, _state), _state); tw1 = ae_sin(v, _state); for(opidx=0; opidx<=operandscnt-1; opidx++) { aoffset0 = offs+opidx*operandsize*microvectorsize; aoffset2 = aoffset0+microvectorsize; aoffset4 = aoffset2+microvectorsize; aoffset6 = aoffset4+microvectorsize; twx = 1.0; twxm1 = 0.0; twy = 0.0; for(mvidx=0; mvidx<=m-1; mvidx++) { a0x = a->ptr.p_double[aoffset0]; a0y = a->ptr.p_double[aoffset0+1]; a1x = a->ptr.p_double[aoffset2]; a1y = a->ptr.p_double[aoffset2+1]; a2x = a->ptr.p_double[aoffset4]; a2y = a->ptr.p_double[aoffset4+1]; a3x = a->ptr.p_double[aoffset6]; a3y = a->ptr.p_double[aoffset6+1]; t1x = a0x+a2x; t1y = a0y+a2y; t2x = a1x+a3x; t2y = a1y+a3y; m2x = a0x-a2x; m2y = a0y-a2y; m3x = a1y-a3y; m3y = a3x-a1x; tw2x = twx*twx-twy*twy; tw2y = 2*twx*twy; tw3x = twx*tw2x-twy*tw2y; tw3y = twx*tw2y+twy*tw2x; a1x = m2x+m3x; a1y = m2y+m3y; a2x = t1x-t2x; a2y = t1y-t2y; a3x = m2x-m3x; a3y = m2y-m3y; a->ptr.p_double[aoffset0] = t1x+t2x; a->ptr.p_double[aoffset0+1] = t1y+t2y; a->ptr.p_double[aoffset2] = a1x*twx-a1y*twy; a->ptr.p_double[aoffset2+1] = a1y*twx+a1x*twy; a->ptr.p_double[aoffset4] = a2x*tw2x-a2y*tw2y; a->ptr.p_double[aoffset4+1] = a2y*tw2x+a2x*tw2y; a->ptr.p_double[aoffset6] = a3x*tw3x-a3y*tw3y; a->ptr.p_double[aoffset6+1] = a3y*tw3x+a3x*tw3y; aoffset0 = aoffset0+2; aoffset2 = aoffset2+2; aoffset4 = aoffset4+2; aoffset6 = aoffset6+2; if( (mvidx+1)%ftbase_updatetw==0 ) { v = -2*ae_pi*(mvidx+1)/(n*m); twxm1 = ae_sin(0.5*v, _state); twxm1 = -2*twxm1*twxm1; twy = ae_sin(v, _state); twx = twxm1+1; } else { v = twxm1+tw0+twxm1*tw0-twy*tw1; twy = twy+tw1+twxm1*tw1+twy*tw0; twxm1 = v; twx = v+1; } } } return; } if( n==5 ) { v = -2*ae_pi/(n*m); tw0 = -2*ae_sqr(ae_sin(0.5*v, _state), _state); tw1 = ae_sin(v, _state); v = 2*ae_pi/5; c1 = (ae_cos(v, _state)+ae_cos(2*v, _state))/2-1; c2 = (ae_cos(v, _state)-ae_cos(2*v, _state))/2; c3 = -ae_sin(v, _state); c4 = -(ae_sin(v, _state)+ae_sin(2*v, _state)); c5 = 
ae_sin(v, _state)-ae_sin(2*v, _state); for(opidx=0; opidx<=operandscnt-1; opidx++) { aoffset0 = offs+opidx*operandsize*microvectorsize; aoffset2 = aoffset0+microvectorsize; aoffset4 = aoffset2+microvectorsize; aoffset6 = aoffset4+microvectorsize; aoffset8 = aoffset6+microvectorsize; twx = 1.0; twxm1 = 0.0; twy = 0.0; for(mvidx=0; mvidx<=m-1; mvidx++) { a0x = a->ptr.p_double[aoffset0]; a0y = a->ptr.p_double[aoffset0+1]; a1x = a->ptr.p_double[aoffset2]; a1y = a->ptr.p_double[aoffset2+1]; a2x = a->ptr.p_double[aoffset4]; a2y = a->ptr.p_double[aoffset4+1]; a3x = a->ptr.p_double[aoffset6]; a3y = a->ptr.p_double[aoffset6+1]; a4x = a->ptr.p_double[aoffset8]; a4y = a->ptr.p_double[aoffset8+1]; t1x = a1x+a4x; t1y = a1y+a4y; t2x = a2x+a3x; t2y = a2y+a3y; t3x = a1x-a4x; t3y = a1y-a4y; t4x = a3x-a2x; t4y = a3y-a2y; t5x = t1x+t2x; t5y = t1y+t2y; q0x = a0x+t5x; q0y = a0y+t5y; m1x = c1*t5x; m1y = c1*t5y; m2x = c2*(t1x-t2x); m2y = c2*(t1y-t2y); m3x = -c3*(t3y+t4y); m3y = c3*(t3x+t4x); m4x = -c4*t4y; m4y = c4*t4x; m5x = -c5*t3y; m5y = c5*t3x; s3x = m3x-m4x; s3y = m3y-m4y; s5x = m3x+m5x; s5y = m3y+m5y; s1x = q0x+m1x; s1y = q0y+m1y; s2x = s1x+m2x; s2y = s1y+m2y; s4x = s1x-m2x; s4y = s1y-m2y; tw2x = twx*twx-twy*twy; tw2y = 2*twx*twy; tw3x = twx*tw2x-twy*tw2y; tw3y = twx*tw2y+twy*tw2x; tw4x = tw2x*tw2x-tw2y*tw2y; tw4y = tw2x*tw2y+tw2y*tw2x; a1x = s2x+s3x; a1y = s2y+s3y; a2x = s4x+s5x; a2y = s4y+s5y; a3x = s4x-s5x; a3y = s4y-s5y; a4x = s2x-s3x; a4y = s2y-s3y; a->ptr.p_double[aoffset0] = q0x; a->ptr.p_double[aoffset0+1] = q0y; a->ptr.p_double[aoffset2] = a1x*twx-a1y*twy; a->ptr.p_double[aoffset2+1] = a1x*twy+a1y*twx; a->ptr.p_double[aoffset4] = a2x*tw2x-a2y*tw2y; a->ptr.p_double[aoffset4+1] = a2x*tw2y+a2y*tw2x; a->ptr.p_double[aoffset6] = a3x*tw3x-a3y*tw3y; a->ptr.p_double[aoffset6+1] = a3x*tw3y+a3y*tw3x; a->ptr.p_double[aoffset8] = a4x*tw4x-a4y*tw4y; a->ptr.p_double[aoffset8+1] = a4x*tw4y+a4y*tw4x; aoffset0 = aoffset0+2; aoffset2 = aoffset2+2; aoffset4 = aoffset4+2; aoffset6 = aoffset6+2; aoffset8 = aoffset8+2; if( (mvidx+1)%ftbase_updatetw==0 ) { v = -2*ae_pi*(mvidx+1)/(n*m); twxm1 = ae_sin(0.5*v, _state); twxm1 = -2*twxm1*twxm1; twy = ae_sin(v, _state); twx = twxm1+1; } else { v = twxm1+tw0+twxm1*tw0-twy*tw1; twy = twy+tw1+twxm1*tw1+twy*tw0; twxm1 = v; twx = v+1; } } } return; } if( n==6 ) { c1 = ae_cos(2*ae_pi/3, _state)-1; c2 = ae_sin(2*ae_pi/3, _state); c3 = ae_cos(-ae_pi/3, _state); c4 = ae_sin(-ae_pi/3, _state); v = -2*ae_pi/(n*m); tw0 = -2*ae_sqr(ae_sin(0.5*v, _state), _state); tw1 = ae_sin(v, _state); for(opidx=0; opidx<=operandscnt-1; opidx++) { aoffset0 = offs+opidx*operandsize*microvectorsize; aoffset2 = aoffset0+microvectorsize; aoffset4 = aoffset2+microvectorsize; aoffset6 = aoffset4+microvectorsize; aoffset8 = aoffset6+microvectorsize; aoffset10 = aoffset8+microvectorsize; twx = 1.0; twxm1 = 0.0; twy = 0.0; for(mvidx=0; mvidx<=m-1; mvidx++) { a0x = a->ptr.p_double[aoffset0+0]; a0y = a->ptr.p_double[aoffset0+1]; a1x = a->ptr.p_double[aoffset2+0]; a1y = a->ptr.p_double[aoffset2+1]; a2x = a->ptr.p_double[aoffset4+0]; a2y = a->ptr.p_double[aoffset4+1]; a3x = a->ptr.p_double[aoffset6+0]; a3y = a->ptr.p_double[aoffset6+1]; a4x = a->ptr.p_double[aoffset8+0]; a4y = a->ptr.p_double[aoffset8+1]; a5x = a->ptr.p_double[aoffset10+0]; a5y = a->ptr.p_double[aoffset10+1]; v0 = a0x; v1 = a0y; a0x = a0x+a3x; a0y = a0y+a3y; a3x = v0-a3x; a3y = v1-a3y; v0 = a1x; v1 = a1y; a1x = a1x+a4x; a1y = a1y+a4y; a4x = v0-a4x; a4y = v1-a4y; v0 = a2x; v1 = a2y; a2x = a2x+a5x; a2y = a2y+a5y; a5x = v0-a5x; a5y = v1-a5y; t4x = 
a4x*c3-a4y*c4; t4y = a4x*c4+a4y*c3; a4x = t4x; a4y = t4y; t5x = -a5x*c3-a5y*c4; t5y = a5x*c4-a5y*c3; a5x = t5x; a5y = t5y; t1x = a1x+a2x; t1y = a1y+a2y; a0x = a0x+t1x; a0y = a0y+t1y; m1x = c1*t1x; m1y = c1*t1y; m2x = c2*(a1y-a2y); m2y = c2*(a2x-a1x); s1x = a0x+m1x; s1y = a0y+m1y; a1x = s1x+m2x; a1y = s1y+m2y; a2x = s1x-m2x; a2y = s1y-m2y; t1x = a4x+a5x; t1y = a4y+a5y; a3x = a3x+t1x; a3y = a3y+t1y; m1x = c1*t1x; m1y = c1*t1y; m2x = c2*(a4y-a5y); m2y = c2*(a5x-a4x); s1x = a3x+m1x; s1y = a3y+m1y; a4x = s1x+m2x; a4y = s1y+m2y; a5x = s1x-m2x; a5y = s1y-m2y; tw2x = twx*twx-twy*twy; tw2y = 2*twx*twy; tw3x = twx*tw2x-twy*tw2y; tw3y = twx*tw2y+twy*tw2x; tw4x = tw2x*tw2x-tw2y*tw2y; tw4y = 2*tw2x*tw2y; tw5x = tw3x*tw2x-tw3y*tw2y; tw5y = tw3x*tw2y+tw3y*tw2x; a->ptr.p_double[aoffset0+0] = a0x; a->ptr.p_double[aoffset0+1] = a0y; a->ptr.p_double[aoffset2+0] = a3x*twx-a3y*twy; a->ptr.p_double[aoffset2+1] = a3y*twx+a3x*twy; a->ptr.p_double[aoffset4+0] = a1x*tw2x-a1y*tw2y; a->ptr.p_double[aoffset4+1] = a1y*tw2x+a1x*tw2y; a->ptr.p_double[aoffset6+0] = a4x*tw3x-a4y*tw3y; a->ptr.p_double[aoffset6+1] = a4y*tw3x+a4x*tw3y; a->ptr.p_double[aoffset8+0] = a2x*tw4x-a2y*tw4y; a->ptr.p_double[aoffset8+1] = a2y*tw4x+a2x*tw4y; a->ptr.p_double[aoffset10+0] = a5x*tw5x-a5y*tw5y; a->ptr.p_double[aoffset10+1] = a5y*tw5x+a5x*tw5y; aoffset0 = aoffset0+2; aoffset2 = aoffset2+2; aoffset4 = aoffset4+2; aoffset6 = aoffset6+2; aoffset8 = aoffset8+2; aoffset10 = aoffset10+2; if( (mvidx+1)%ftbase_updatetw==0 ) { v = -2*ae_pi*(mvidx+1)/(n*m); twxm1 = ae_sin(0.5*v, _state); twxm1 = -2*twxm1*twxm1; twy = ae_sin(v, _state); twx = twxm1+1; } else { v = twxm1+tw0+twxm1*tw0-twy*tw1; twy = twy+tw1+twxm1*tw1+twy*tw0; twxm1 = v; twx = v+1; } } } return; } } /************************************************************************* This subroutine precomputes data for complex Bluestein's FFT and writes them to array PrecR[] at specified offset. It is responsibility of the caller to make sure that PrecR[] is large enough. INPUT PARAMETERS: N - original size of the transform M - size of the "padded" Bluestein's transform PrecR - preallocated array Offs - offset OUTPUT PARAMETERS: PrecR - data at Offs:Offs+4*M-1 are modified: * PrecR[Offs:Offs+2*M-1] stores Z[k]=exp(i*pi*k^2/N) * PrecR[Offs+2*M:Offs+4*M-1] stores FFT of the Z Other parts of PrecR are unchanged. NOTE: this function performs internal M-point FFT. It allocates temporary plan which is destroyed after leaving this function. 
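An explanatory sketch of why the buffer has this layout: using the identity
j*k = (j^2+k^2-(j-k)^2)/2, the N-point DFT of X can be written as

    DFT(X)[k] = conj(Z[k]) * SUM( X[j]*conj(Z[j]) * Z[k-j], j=0..N-1 )

with indices of Z taken modulo M and X zero-padded to length M, i.e. a
multiplication by conj(Z), an M-point circular convolution with Z (any smooth
M>=2*N-1 works, and the plan generator picks the smallest such M), and one
more multiplication by conj(Z). Keeping both Z and FFT(Z) here allows
FTBluesteinsFFT() to perform that convolution with two applications of the
same forward M-point subplan plus conjugation.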
-- ALGLIB -- Copyright 08.05.2013 by Bochkanov Sergey *************************************************************************/ static void ftbase_ftprecomputebluesteinsfft(ae_int_t n, ae_int_t m, /* Real */ ae_vector* precr, ae_int_t offs, ae_state *_state) { ae_frame _frame_block; ae_int_t i; double bx; double by; fasttransformplan plan; ae_frame_make(_state, &_frame_block); _fasttransformplan_init(&plan, _state); /* * Fill first half of PrecR with b[k] = exp(i*pi*k^2/N) */ for(i=0; i<=2*m-1; i++) { precr->ptr.p_double[offs+i] = (double)(0); } for(i=0; i<=n-1; i++) { bx = ae_cos(ae_pi/n*i*i, _state); by = ae_sin(ae_pi/n*i*i, _state); precr->ptr.p_double[offs+2*i+0] = bx; precr->ptr.p_double[offs+2*i+1] = by; precr->ptr.p_double[offs+2*((m-i)%m)+0] = bx; precr->ptr.p_double[offs+2*((m-i)%m)+1] = by; } /* * Precomputed FFT */ ftcomplexfftplan(m, 1, &plan, _state); for(i=0; i<=2*m-1; i++) { precr->ptr.p_double[offs+2*m+i] = precr->ptr.p_double[offs+i]; } ftbase_ftapplysubplan(&plan, 0, precr, offs+2*m, 0, &plan.buffer, 1, _state); ae_frame_leave(_state); } /************************************************************************* This subroutine applies complex Bluestein's FFT to input/output array A. INPUT PARAMETERS: Plan - transformation plan A - array, must be large enough for plan to work ABase - base offset in array A, this value points to start of subarray whose length is equal to length of the plan AOffset - offset with respect to ABase, 0<=AOffsetptr.p_double[p0+0]; y = a->ptr.p_double[p0+1]; bx = plan->precr.ptr.p_double[p1+0]; by = -plan->precr.ptr.p_double[p1+1]; bufa->ptr.p_double[2*i+0] = x*bx-y*by; bufa->ptr.p_double[2*i+1] = x*by+y*bx; p0 = p0+2; p1 = p1+2; } for(i=2*n; i<=2*m-1; i++) { bufa->ptr.p_double[i] = (double)(0); } /* * Perform convolution of A and Z (using precomputed * FFT of Z stored in Plan structure). */ ftbase_ftapplysubplan(plan, subplan, bufa, 0, 0, bufc, 1, _state); p0 = 0; p1 = precoffs+2*m; for(i=0; i<=m-1; i++) { ax = bufa->ptr.p_double[p0+0]; ay = bufa->ptr.p_double[p0+1]; bx = plan->precr.ptr.p_double[p1+0]; by = plan->precr.ptr.p_double[p1+1]; bufa->ptr.p_double[p0+0] = ax*bx-ay*by; bufa->ptr.p_double[p0+1] = -(ax*by+ay*bx); p0 = p0+2; p1 = p1+2; } ftbase_ftapplysubplan(plan, subplan, bufa, 0, 0, bufc, 1, _state); /* * Post processing: * A:=conj(Z)*conj(A)/M * Here conj(A)/M corresponds to last stage of inverse DFT, * and conj(Z) comes from Bluestein's FFT algorithm. */ p0 = precoffs; p1 = 0; p2 = abase+aoffset+op*2*n; for(i=0; i<=n-1; i++) { bx = plan->precr.ptr.p_double[p0+0]; by = plan->precr.ptr.p_double[p0+1]; rx = bufa->ptr.p_double[p1+0]/m; ry = -bufa->ptr.p_double[p1+1]/m; a->ptr.p_double[p2+0] = rx*bx-ry*(-by); a->ptr.p_double[p2+1] = rx*(-by)+ry*bx; p0 = p0+2; p1 = p1+2; p2 = p2+2; } } } /************************************************************************* This subroutine precomputes data for complex Rader's FFT and writes them to array PrecR[] at specified offset. It is responsibility of the caller to make sure that PrecR[] is large enough. INPUT PARAMETERS: N - original size of the transform (before reduction to N-1) RQ - primitive root modulo N RIQ - inverse of primitive root modulo N PrecR - preallocated array Offs - offset OUTPUT PARAMETERS: PrecR - data at Offs:Offs+2*(N-1)-1 store FFT of Rader's factors, other parts of PrecR are unchanged. NOTE: this function performs internal (N-1)-point FFT. It allocates temporary plan which is destroyed after leaving this function. 
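An explanatory sketch: for prime N the nonzero indices form a cyclic group
generated by the primitive root RQ, so substituting j=RQ^p and k=RIQ^q
(RIQ=RQ^-1 mod N) turns the nontrivial part of the DFT into a cyclic
convolution of length N-1 whose fixed kernel is

    exp(-2*pi*sqrt(-1)*RIQ^q/N),  q = 0..N-2

The loop below fills exactly these kernel values and then replaces them with
their FFT, so that FTRadersFFT() can perform the convolution in the frequency
domain.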
-- ALGLIB -- Copyright 08.05.2013 by Bochkanov Sergey *************************************************************************/ static void ftbase_ftprecomputeradersfft(ae_int_t n, ae_int_t rq, ae_int_t riq, /* Real */ ae_vector* precr, ae_int_t offs, ae_state *_state) { ae_frame _frame_block; ae_int_t q; fasttransformplan plan; ae_int_t kiq; double v; ae_frame_make(_state, &_frame_block); _fasttransformplan_init(&plan, _state); /* * Fill PrecR with Rader factors, perform FFT */ kiq = 1; for(q=0; q<=n-2; q++) { v = -2*ae_pi*kiq/n; precr->ptr.p_double[offs+2*q+0] = ae_cos(v, _state); precr->ptr.p_double[offs+2*q+1] = ae_sin(v, _state); kiq = kiq*riq%n; } ftcomplexfftplan(n-1, 1, &plan, _state); ftbase_ftapplysubplan(&plan, 0, precr, offs, 0, &plan.buffer, 1, _state); ae_frame_leave(_state); } /************************************************************************* This subroutine applies complex Rader's FFT to input/output array A. INPUT PARAMETERS: A - array, must be large enough for plan to work ABase - base offset in array A, this value points to start of subarray whose length is equal to length of the plan AOffset - offset with respect to ABase, 0<=AOffset=1, "FTApplyComplexRefFFT: OperandsCnt<1", _state); /* * Process operands */ for(opidx=0; opidx<=operandscnt-1; opidx++) { /* * fill QA */ kq = 1; p0 = abase+aoffset+opidx*n*2; p1 = aoffset+opidx*n*2; rx = a->ptr.p_double[p0+0]; ry = a->ptr.p_double[p0+1]; x0 = rx; y0 = ry; for(q=0; q<=n-2; q++) { ax = a->ptr.p_double[p0+2*kq+0]; ay = a->ptr.p_double[p0+2*kq+1]; buf->ptr.p_double[p1+0] = ax; buf->ptr.p_double[p1+1] = ay; rx = rx+ax; ry = ry+ay; kq = kq*rq%n; p1 = p1+2; } p0 = abase+aoffset+opidx*n*2; p1 = aoffset+opidx*n*2; for(q=0; q<=n-2; q++) { a->ptr.p_double[p0] = buf->ptr.p_double[p1]; a->ptr.p_double[p0+1] = buf->ptr.p_double[p1+1]; p0 = p0+2; p1 = p1+2; } /* * Convolution */ ftbase_ftapplysubplan(plan, subplan, a, abase, aoffset+opidx*n*2, buf, 1, _state); p0 = abase+aoffset+opidx*n*2; p1 = precoffs; for(i=0; i<=n-2; i++) { ax = a->ptr.p_double[p0+0]; ay = a->ptr.p_double[p0+1]; bx = plan->precr.ptr.p_double[p1+0]; by = plan->precr.ptr.p_double[p1+1]; a->ptr.p_double[p0+0] = ax*bx-ay*by; a->ptr.p_double[p0+1] = -(ax*by+ay*bx); p0 = p0+2; p1 = p1+2; } ftbase_ftapplysubplan(plan, subplan, a, abase, aoffset+opidx*n*2, buf, 1, _state); p0 = abase+aoffset+opidx*n*2; for(i=0; i<=n-2; i++) { a->ptr.p_double[p0+0] = a->ptr.p_double[p0+0]/(n-1); a->ptr.p_double[p0+1] = -a->ptr.p_double[p0+1]/(n-1); p0 = p0+2; } /* * Result */ buf->ptr.p_double[aoffset+opidx*n*2+0] = rx; buf->ptr.p_double[aoffset+opidx*n*2+1] = ry; kiq = 1; p0 = aoffset+opidx*n*2; p1 = abase+aoffset+opidx*n*2; for(q=0; q<=n-2; q++) { buf->ptr.p_double[p0+2*kiq+0] = x0+a->ptr.p_double[p1+0]; buf->ptr.p_double[p0+2*kiq+1] = y0+a->ptr.p_double[p1+1]; kiq = kiq*riq%n; p1 = p1+2; } p0 = abase+aoffset+opidx*n*2; p1 = aoffset+opidx*n*2; for(q=0; q<=n-1; q++) { a->ptr.p_double[p0] = buf->ptr.p_double[p1]; a->ptr.p_double[p0+1] = buf->ptr.p_double[p1+1]; p0 = p0+2; p1 = p1+2; } } } /************************************************************************* Factorizes task size N into product of two smaller sizes N1 and N2 INPUT PARAMETERS: N - task size, N>0 IsRoot - whether taks is root task (first one in a sequence) OUTPUT PARAMETERS: N1, N2 - such numbers that: * for prime N: N1=N2=0 * for composite N<=MaxRadix: N1=N2=0 * for composite N>MaxRadix: 1<=N1<=N2, N1*N2=N -- ALGLIB -- Copyright 08.04.2013 by Bochkanov Sergey 
*************************************************************************/ static void ftbase_ftfactorize(ae_int_t n, ae_bool isroot, ae_int_t* n1, ae_int_t* n2, ae_state *_state) { ae_int_t j; ae_int_t k; *n1 = 0; *n2 = 0; ae_assert(n>0, "FTFactorize: N<=0", _state); *n1 = 0; *n2 = 0; /* * Small N */ if( n<=ftbase_maxradix ) { return; } /* * Large N, recursive split */ if( n>ftbase_recursivethreshold ) { k = ae_iceil(ae_sqrt((double)(n), _state), _state)+1; ae_assert(k*k>=n, "FTFactorize: internal error during recursive factorization", _state); for(j=k; j>=2; j--) { if( n%j==0 ) { *n1 = ae_minint(n/j, j, _state); *n2 = ae_maxint(n/j, j, _state); return; } } } /* * N>MaxRadix, try to find good codelet */ for(j=ftbase_maxradix; j>=2; j--) { if( n%j==0 ) { *n1 = j; *n2 = n/j; break; } } /* * In case no good codelet was found, * try to factorize N into product of ANY primes. */ if( *n1*(*n2)!=n ) { for(j=2; j<=n-1; j++) { if( n%j==0 ) { *n1 = j; *n2 = n/j; break; } if( j*j>n ) { break; } } } /* * normalize */ if( *n1>(*n2) ) { j = *n1; *n1 = *n2; *n2 = j; } } /************************************************************************* Returns optimistic estimate of the FFT cost, in UNITs (1 UNIT = 100 KFLOPs) INPUT PARAMETERS: N - task size, N>0 RESULU: cost in UNITs, rounded down to nearest integer NOTE: If FFT cost is less than 1 UNIT, it will return 0 as result. -- ALGLIB -- Copyright 08.04.2013 by Bochkanov Sergey *************************************************************************/ static ae_int_t ftbase_ftoptimisticestimate(ae_int_t n, ae_state *_state) { ae_int_t result; ae_assert(n>0, "FTOptimisticEstimate: N<=0", _state); result = ae_ifloor(1.0E-5*5*n*ae_log((double)(n), _state)/ae_log((double)(2), _state), _state); return result; } /************************************************************************* Twiddle factors calculation -- ALGLIB -- Copyright 01.05.2009 by Bochkanov Sergey *************************************************************************/ static void ftbase_ffttwcalc(/* Real */ ae_vector* a, ae_int_t aoffset, ae_int_t n1, ae_int_t n2, ae_state *_state) { ae_int_t i; ae_int_t j2; ae_int_t n; ae_int_t halfn1; ae_int_t offs; double x; double y; double twxm1; double twy; double twbasexm1; double twbasey; double twrowxm1; double twrowy; double tmpx; double tmpy; double v; ae_int_t updatetw2; /* * Multiplication by twiddle factors for complex Cooley-Tukey FFT * with N factorized as N1*N2. * * Naive solution to this problem is given below: * * > for K:=1 to N2-1 do * > for J:=1 to N1-1 do * > begin * > Idx:=K*N1+J; * > X:=A[AOffset+2*Idx+0]; * > Y:=A[AOffset+2*Idx+1]; * > TwX:=Cos(-2*Pi()*K*J/(N1*N2)); * > TwY:=Sin(-2*Pi()*K*J/(N1*N2)); * > A[AOffset+2*Idx+0]:=X*TwX-Y*TwY; * > A[AOffset+2*Idx+1]:=X*TwY+Y*TwX; * > end; * * However, there are exist more efficient solutions. * * Each pass of the inner cycle corresponds to multiplication of one * entry of A by W[k,j]=exp(-I*2*pi*k*j/N). This factor can be rewritten * as exp(-I*2*pi*k/N)^j. So we can replace costly exponentiation by * repeated multiplication: W[k,j+1]=W[k,j]*exp(-I*2*pi*k/N), with * second factor being computed once in the beginning of the iteration. * * Also, exp(-I*2*pi*k/N) can be represented as exp(-I*2*pi/N)^k, i.e. * we have W[K+1,1]=W[K,1]*W[1,1]. 
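 *
 * A note on the representation (restating the code below): twiddles are kept
 * in "X minus 1" form, TwXM1=cos(phi)-1 and TwY=sin(phi), with cos(phi)-1
 * computed as -2*sin^2(phi/2) to avoid cancellation for small phi. In this
 * form the product of two factors (XM1a,Ya) and (XM1b,Yb) is
 *   XM1 = XM1a + ( (1+XM1a)*XM1b - Ya*Yb )
 *   Y   = Ya   + ( (1+XM1a)*Yb   + Ya*XM1b )
 * which is exactly the unconditional update performed in the inner loop.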
* * In our loop we use following variables: * * [TwBaseXM1,TwBaseY] = [cos(2*pi/N)-1, sin(2*pi/N)] * * [TwRowXM1, TwRowY] = [cos(2*pi*I/N)-1, sin(2*pi*I/N)] * * [TwXM1, TwY] = [cos(2*pi*I*J/N)-1, sin(2*pi*I*J/N)] * * Meaning of the variables: * * [TwXM1,TwY] is current twiddle factor W[I,J] * * [TwRowXM1, TwRowY] is W[I,1] * * [TwBaseXM1,TwBaseY] is W[1,1] * * During inner loop we multiply current twiddle factor by W[I,1], * during outer loop we update W[I,1]. * */ ae_assert(ftbase_updatetw>=2, "FFTTwCalc: internal error - UpdateTw<2", _state); updatetw2 = ftbase_updatetw/2; halfn1 = n1/2; n = n1*n2; v = -2*ae_pi/n; twbasexm1 = -2*ae_sqr(ae_sin(0.5*v, _state), _state); twbasey = ae_sin(v, _state); twrowxm1 = (double)(0); twrowy = (double)(0); offs = aoffset; for(i=0; i<=n2-1; i++) { /* * Initialize twiddle factor for current row */ twxm1 = (double)(0); twy = (double)(0); /* * N1-point block is separated into 2-point chunks and residual 1-point chunk * (in case N1 is odd). Unrolled loop is several times faster. */ for(j2=0; j2<=halfn1-1; j2++) { /* * Processing: * * process first element in a chunk. * * update twiddle factor (unconditional update) * * process second element * * conditional update of the twiddle factor */ x = a->ptr.p_double[offs+0]; y = a->ptr.p_double[offs+1]; tmpx = x*(1+twxm1)-y*twy; tmpy = x*twy+y*(1+twxm1); a->ptr.p_double[offs+0] = tmpx; a->ptr.p_double[offs+1] = tmpy; tmpx = (1+twxm1)*twrowxm1-twy*twrowy; twy = twy+(1+twxm1)*twrowy+twy*twrowxm1; twxm1 = twxm1+tmpx; x = a->ptr.p_double[offs+2]; y = a->ptr.p_double[offs+3]; tmpx = x*(1+twxm1)-y*twy; tmpy = x*twy+y*(1+twxm1); a->ptr.p_double[offs+2] = tmpx; a->ptr.p_double[offs+3] = tmpy; offs = offs+4; if( (j2+1)%updatetw2==0&&j2ptr.p_double[offs+0]; y = a->ptr.p_double[offs+1]; tmpx = x*(1+twxm1)-y*twy; tmpy = x*twy+y*(1+twxm1); a->ptr.p_double[offs+0] = tmpx; a->ptr.p_double[offs+1] = tmpy; offs = offs+2; } /* * update TwRow: TwRow(new) = TwRow(old)*TwBase */ if( iptr.p_double[astart], 1, &buf->ptr.p_double[0], 1, ae_v_len(astart,astart+2*m*n-1)); } /************************************************************************* Recurrent subroutine for a InternalComplexLinTranspose Write A^T to B, where: * A is m*n complex matrix stored in array A as pairs of real/image values, beginning from AStart position, with AStride stride * B is n*m complex matrix stored in array B as pairs of real/image values, beginning from BStart position, with BStride stride stride is measured in complex numbers, i.e. in real/image pairs. 
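The recursion halves the longer dimension (rounding the split up to a multiple
of 8 when the remaining part would still hold at least 8 rows/columns) until a
block with max(M,N)<=8 remains, which is copied directly. For example, with
M=3 and N=20 the column split N1=10 is rounded up to N1=16, producing
sub-blocks of widths 16 and 4; the width-16 block is then split into 8+8.
This keeps both source and destination accesses within small cache-friendly
tiles (explanatory note; the code below is authoritative).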
-- ALGLIB -- Copyright 01.05.2009 by Bochkanov Sergey *************************************************************************/ static void ftbase_ffticltrec(/* Real */ ae_vector* a, ae_int_t astart, ae_int_t astride, /* Real */ ae_vector* b, ae_int_t bstart, ae_int_t bstride, ae_int_t m, ae_int_t n, ae_state *_state) { ae_int_t i; ae_int_t j; ae_int_t idx1; ae_int_t idx2; ae_int_t m2; ae_int_t m1; ae_int_t n1; if( m==0||n==0 ) { return; } if( ae_maxint(m, n, _state)<=8 ) { m2 = 2*bstride; for(i=0; i<=m-1; i++) { idx1 = bstart+2*i; idx2 = astart+2*i*astride; for(j=0; j<=n-1; j++) { b->ptr.p_double[idx1+0] = a->ptr.p_double[idx2+0]; b->ptr.p_double[idx1+1] = a->ptr.p_double[idx2+1]; idx1 = idx1+m2; idx2 = idx2+2; } } return; } if( n>m ) { /* * New partition: * * "A^T -> B" becomes "(A1 A2)^T -> ( B1 ) * ( B2 ) */ n1 = n/2; if( n-n1>=8&&n1%8!=0 ) { n1 = n1+(8-n1%8); } ae_assert(n-n1>0, "Assertion failed", _state); ftbase_ffticltrec(a, astart, astride, b, bstart, bstride, m, n1, _state); ftbase_ffticltrec(a, astart+2*n1, astride, b, bstart+2*n1*bstride, bstride, m, n-n1, _state); } else { /* * New partition: * * "A^T -> B" becomes "( A1 )^T -> ( B1 B2 ) * ( A2 ) */ m1 = m/2; if( m-m1>=8&&m1%8!=0 ) { m1 = m1+(8-m1%8); } ae_assert(m-m1>0, "Assertion failed", _state); ftbase_ffticltrec(a, astart, astride, b, bstart, bstride, m1, n, _state); ftbase_ffticltrec(a, astart+2*m1*astride, astride, b, bstart+2*m1, bstride, m-m1, n, _state); } } /************************************************************************* Recurrent subroutine for a InternalRealLinTranspose -- ALGLIB -- Copyright 01.05.2009 by Bochkanov Sergey *************************************************************************/ static void ftbase_fftirltrec(/* Real */ ae_vector* a, ae_int_t astart, ae_int_t astride, /* Real */ ae_vector* b, ae_int_t bstart, ae_int_t bstride, ae_int_t m, ae_int_t n, ae_state *_state) { ae_int_t i; ae_int_t j; ae_int_t idx1; ae_int_t idx2; ae_int_t m1; ae_int_t n1; if( m==0||n==0 ) { return; } if( ae_maxint(m, n, _state)<=8 ) { for(i=0; i<=m-1; i++) { idx1 = bstart+i; idx2 = astart+i*astride; for(j=0; j<=n-1; j++) { b->ptr.p_double[idx1] = a->ptr.p_double[idx2]; idx1 = idx1+bstride; idx2 = idx2+1; } } return; } if( n>m ) { /* * New partition: * * "A^T -> B" becomes "(A1 A2)^T -> ( B1 ) * ( B2 ) */ n1 = n/2; if( n-n1>=8&&n1%8!=0 ) { n1 = n1+(8-n1%8); } ae_assert(n-n1>0, "Assertion failed", _state); ftbase_fftirltrec(a, astart, astride, b, bstart, bstride, m, n1, _state); ftbase_fftirltrec(a, astart+n1, astride, b, bstart+n1*bstride, bstride, m, n-n1, _state); } else { /* * New partition: * * "A^T -> B" becomes "( A1 )^T -> ( B1 B2 ) * ( A2 ) */ m1 = m/2; if( m-m1>=8&&m1%8!=0 ) { m1 = m1+(8-m1%8); } ae_assert(m-m1>0, "Assertion failed", _state); ftbase_fftirltrec(a, astart, astride, b, bstart, bstride, m1, n, _state); ftbase_fftirltrec(a, astart+m1*astride, astride, b, bstart+m1, bstride, m-m1, n, _state); } } /************************************************************************* recurrent subroutine for FFTFindSmoothRec -- ALGLIB -- Copyright 01.05.2009 by Bochkanov Sergey *************************************************************************/ static void ftbase_ftbasefindsmoothrec(ae_int_t n, ae_int_t seed, ae_int_t leastfactor, ae_int_t* best, ae_state *_state) { ae_assert(ftbase_ftbasemaxsmoothfactor<=5, "FTBaseFindSmoothRec: internal error!", _state); if( seed>=n ) { *best = ae_minint(*best, seed, _state); return; } if( leastfactor<=2 ) { ftbase_ftbasefindsmoothrec(n, seed*2, 2, best, _state); 
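/* Explanatory comment (added for readability): passing LeastFactor down the
   recursion emits the prime factors 2, 3, 5 in non-decreasing order, so each
   smooth candidate Seed*2^a*3^b*5^c is generated exactly once; Best tracks
   the smallest candidate that is >= N. */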
} if( leastfactor<=3 ) { ftbase_ftbasefindsmoothrec(n, seed*3, 3, best, _state); } if( leastfactor<=5 ) { ftbase_ftbasefindsmoothrec(n, seed*5, 5, best, _state); } } void _fasttransformplan_init(void* _p, ae_state *_state) { fasttransformplan *p = (fasttransformplan*)_p; ae_touch_ptr((void*)p); ae_matrix_init(&p->entries, 0, 0, DT_INT, _state); ae_vector_init(&p->buffer, 0, DT_REAL, _state); ae_vector_init(&p->precr, 0, DT_REAL, _state); ae_vector_init(&p->preci, 0, DT_REAL, _state); ae_shared_pool_init(&p->bluesteinpool, _state); } void _fasttransformplan_init_copy(void* _dst, void* _src, ae_state *_state) { fasttransformplan *dst = (fasttransformplan*)_dst; fasttransformplan *src = (fasttransformplan*)_src; ae_matrix_init_copy(&dst->entries, &src->entries, _state); ae_vector_init_copy(&dst->buffer, &src->buffer, _state); ae_vector_init_copy(&dst->precr, &src->precr, _state); ae_vector_init_copy(&dst->preci, &src->preci, _state); ae_shared_pool_init_copy(&dst->bluesteinpool, &src->bluesteinpool, _state); } void _fasttransformplan_clear(void* _p) { fasttransformplan *p = (fasttransformplan*)_p; ae_touch_ptr((void*)p); ae_matrix_clear(&p->entries); ae_vector_clear(&p->buffer); ae_vector_clear(&p->precr); ae_vector_clear(&p->preci); ae_shared_pool_clear(&p->bluesteinpool); } void _fasttransformplan_destroy(void* _p) { fasttransformplan *p = (fasttransformplan*)_p; ae_touch_ptr((void*)p); ae_matrix_destroy(&p->entries); ae_vector_destroy(&p->buffer); ae_vector_destroy(&p->precr); ae_vector_destroy(&p->preci); ae_shared_pool_destroy(&p->bluesteinpool); } double nulog1p(double x, ae_state *_state) { double z; double lp; double lq; double result; z = 1.0+x; if( ae_fp_less(z,0.70710678118654752440)||ae_fp_greater(z,1.41421356237309504880) ) { result = ae_log(z, _state); return result; } z = x*x; lp = 4.5270000862445199635215E-5; lp = lp*x+4.9854102823193375972212E-1; lp = lp*x+6.5787325942061044846969E0; lp = lp*x+2.9911919328553073277375E1; lp = lp*x+6.0949667980987787057556E1; lp = lp*x+5.7112963590585538103336E1; lp = lp*x+2.0039553499201281259648E1; lq = 1.0000000000000000000000E0; lq = lq*x+1.5062909083469192043167E1; lq = lq*x+8.3047565967967209469434E1; lq = lq*x+2.2176239823732856465394E2; lq = lq*x+3.0909872225312059774938E2; lq = lq*x+2.1642788614495947685003E2; lq = lq*x+6.0118660497603843919306E1; z = -0.5*z+x*(z*lp/lq); result = x+z; return result; } double nuexpm1(double x, ae_state *_state) { double r; double xx; double ep; double eq; double result; if( ae_fp_less(x,-0.5)||ae_fp_greater(x,0.5) ) { result = ae_exp(x, _state)-1.0; return result; } xx = x*x; ep = 1.2617719307481059087798E-4; ep = ep*xx+3.0299440770744196129956E-2; ep = ep*xx+9.9999999999999999991025E-1; eq = 3.0019850513866445504159E-6; eq = eq*xx+2.5244834034968410419224E-3; eq = eq*xx+2.2726554820815502876593E-1; eq = eq*xx+2.0000000000000000000897E0; r = x*ep; r = r/(eq-r); result = r+r; return result; } double nucosm1(double x, ae_state *_state) { double xx; double c; double result; if( ae_fp_less(x,-0.25*ae_pi)||ae_fp_greater(x,0.25*ae_pi) ) { result = ae_cos(x, _state)-1; return result; } xx = x*x; c = 4.7377507964246204691685E-14; c = c*xx-1.1470284843425359765671E-11; c = c*xx+2.0876754287081521758361E-9; c = c*xx-2.7557319214999787979814E-7; c = c*xx+2.4801587301570552304991E-5; c = c*xx-1.3888888888888872993737E-3; c = c*xx+4.1666666666666666609054E-2; result = -0.5*xx+xx*xx*c; return result; } } cpp/src/fasttransforms.h0000755000175000017500000006205313105126766015252 0ustar 
sergeysergey/************************************************************************* ALGLIB 3.11.0 (source code generated 2017-05-11) Copyright (c) Sergey Bochkanov (ALGLIB project). >>> SOURCE LICENSE >>> This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation (www.fsf.org); either version 2 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. A copy of the GNU General Public License is available at http://www.fsf.org/licensing/licenses >>> END OF LICENSE >>> *************************************************************************/ #ifndef _fasttransforms_pkg_h #define _fasttransforms_pkg_h #include "ap.h" #include "alglibinternal.h" ///////////////////////////////////////////////////////////////////////// // // THIS SECTION CONTAINS COMPUTATIONAL CORE DECLARATIONS (DATATYPES) // ///////////////////////////////////////////////////////////////////////// namespace alglib_impl { } ///////////////////////////////////////////////////////////////////////// // // THIS SECTION CONTAINS C++ INTERFACE // ///////////////////////////////////////////////////////////////////////// namespace alglib { /************************************************************************* 1-dimensional complex FFT. Array size N may be arbitrary number (composite or prime). Composite N's are handled with cache-oblivious variation of a Cooley-Tukey algorithm. Small prime-factors are transformed using hard coded codelets (similar to FFTW codelets, but without low-level optimization), large prime-factors are handled with Bluestein's algorithm. Fastests transforms are for smooth N's (prime factors are 2, 3, 5 only), most fast for powers of 2. When N have prime factors larger than these, but orders of magnitude smaller than N, computations will be about 4 times slower than for nearby highly composite N's. When N itself is prime, speed will be 6 times lower. Algorithm has O(N*logN) complexity for any N (composite or prime). INPUT PARAMETERS A - array[0..N-1] - complex function to be transformed N - problem size OUTPUT PARAMETERS A - DFT of a input array, array[0..N-1] A_out[j] = SUM(A_in[k]*exp(-2*pi*sqrt(-1)*j*k/N), k = 0..N-1) -- ALGLIB -- Copyright 29.05.2009 by Bochkanov Sergey *************************************************************************/ void fftc1d(complex_1d_array &a, const ae_int_t n); void fftc1d(complex_1d_array &a); /************************************************************************* 1-dimensional complex inverse FFT. Array size N may be arbitrary number (composite or prime). Algorithm has O(N*logN) complexity for any N (composite or prime). See FFTC1D() description for more information about algorithm performance. INPUT PARAMETERS A - array[0..N-1] - complex array to be transformed N - problem size OUTPUT PARAMETERS A - inverse DFT of a input array, array[0..N-1] A_out[j] = SUM(A_in[k]/N*exp(+2*pi*sqrt(-1)*j*k/N), k = 0..N-1) -- ALGLIB -- Copyright 29.05.2009 by Bochkanov Sergey *************************************************************************/ void fftc1dinv(complex_1d_array &a, const ae_int_t n); void fftc1dinv(complex_1d_array &a); /************************************************************************* 1-dimensional real FFT. 
Algorithm has O(N*logN) complexity for any N (composite or prime). INPUT PARAMETERS A - array[0..N-1] - real function to be transformed N - problem size OUTPUT PARAMETERS F - DFT of a input array, array[0..N-1] F[j] = SUM(A[k]*exp(-2*pi*sqrt(-1)*j*k/N), k = 0..N-1) NOTE: F[] satisfies symmetry property F[k] = conj(F[N-k]), so just one half of array is usually needed. But for convinience subroutine returns full complex array (with frequencies above N/2), so its result may be used by other FFT-related subroutines. -- ALGLIB -- Copyright 01.06.2009 by Bochkanov Sergey *************************************************************************/ void fftr1d(const real_1d_array &a, const ae_int_t n, complex_1d_array &f); void fftr1d(const real_1d_array &a, complex_1d_array &f); /************************************************************************* 1-dimensional real inverse FFT. Algorithm has O(N*logN) complexity for any N (composite or prime). INPUT PARAMETERS F - array[0..floor(N/2)] - frequencies from forward real FFT N - problem size OUTPUT PARAMETERS A - inverse DFT of a input array, array[0..N-1] NOTE: F[] should satisfy symmetry property F[k] = conj(F[N-k]), so just one half of frequencies array is needed - elements from 0 to floor(N/2). F[0] is ALWAYS real. If N is even F[floor(N/2)] is real too. If N is odd, then F[floor(N/2)] has no special properties. Relying on properties noted above, FFTR1DInv subroutine uses only elements from 0th to floor(N/2)-th. It ignores imaginary part of F[0], and in case N is even it ignores imaginary part of F[floor(N/2)] too. When you call this function using full arguments list - "FFTR1DInv(F,N,A)" - you can pass either either frequencies array with N elements or reduced array with roughly N/2 elements - subroutine will successfully transform both. If you call this function using reduced arguments list - "FFTR1DInv(F,A)" - you must pass FULL array with N elements (although higher N/2 are still not used) because array size is used to automatically determine FFT length -- ALGLIB -- Copyright 01.06.2009 by Bochkanov Sergey *************************************************************************/ void fftr1dinv(const complex_1d_array &f, const ae_int_t n, real_1d_array &a); void fftr1dinv(const complex_1d_array &f, real_1d_array &a); /************************************************************************* 1-dimensional Fast Hartley Transform. Algorithm has O(N*logN) complexity for any N (composite or prime). INPUT PARAMETERS A - array[0..N-1] - real function to be transformed N - problem size OUTPUT PARAMETERS A - FHT of a input array, array[0..N-1], A_out[k] = sum(A_in[j]*(cos(2*pi*j*k/N)+sin(2*pi*j*k/N)), j=0..N-1) -- ALGLIB -- Copyright 04.06.2009 by Bochkanov Sergey *************************************************************************/ void fhtr1d(real_1d_array &a, const ae_int_t n); /************************************************************************* 1-dimensional inverse FHT. Algorithm has O(N*logN) complexity for any N (composite or prime). INPUT PARAMETERS A - array[0..N-1] - complex array to be transformed N - problem size OUTPUT PARAMETERS A - inverse FHT of a input array, array[0..N-1] -- ALGLIB -- Copyright 29.05.2009 by Bochkanov Sergey *************************************************************************/ void fhtr1dinv(real_1d_array &a, const ae_int_t n); /************************************************************************* 1-dimensional complex convolution. For given A/B returns conv(A,B) (non-circular). 
Subroutine can automatically choose between three implementations: straightforward O(M*N) formula for very small N (or M), overlap-add algorithm for cases where max(M,N) is significantly larger than min(M,N), but O(M*N) algorithm is too slow, and general FFT-based formula for cases where two previois algorithms are too slow. Algorithm has max(M,N)*log(max(M,N)) complexity for any M/N. INPUT PARAMETERS A - array[0..M-1] - complex function to be transformed M - problem size B - array[0..N-1] - complex function to be transformed N - problem size OUTPUT PARAMETERS R - convolution: A*B. array[0..N+M-2]. NOTE: It is assumed that A is zero at T<0, B is zero too. If one or both functions have non-zero values at negative T's, you can still use this subroutine - just shift its result correspondingly. -- ALGLIB -- Copyright 21.07.2009 by Bochkanov Sergey *************************************************************************/ void convc1d(const complex_1d_array &a, const ae_int_t m, const complex_1d_array &b, const ae_int_t n, complex_1d_array &r); /************************************************************************* 1-dimensional complex non-circular deconvolution (inverse of ConvC1D()). Algorithm has M*log(M)) complexity for any M (composite or prime). INPUT PARAMETERS A - array[0..M-1] - convolved signal, A = conv(R, B) M - convolved signal length B - array[0..N-1] - response N - response length, N<=M OUTPUT PARAMETERS R - deconvolved signal. array[0..M-N]. NOTE: deconvolution is unstable process and may result in division by zero (if your response function is degenerate, i.e. has zero Fourier coefficient). NOTE: It is assumed that A is zero at T<0, B is zero too. If one or both functions have non-zero values at negative T's, you can still use this subroutine - just shift its result correspondingly. -- ALGLIB -- Copyright 21.07.2009 by Bochkanov Sergey *************************************************************************/ void convc1dinv(const complex_1d_array &a, const ae_int_t m, const complex_1d_array &b, const ae_int_t n, complex_1d_array &r); /************************************************************************* 1-dimensional circular complex convolution. For given S/R returns conv(S,R) (circular). Algorithm has linearithmic complexity for any M/N. IMPORTANT: normal convolution is commutative, i.e. it is symmetric - conv(A,B)=conv(B,A). Cyclic convolution IS NOT. One function - S - is a signal, periodic function, and another - R - is a response, non-periodic function with limited length. INPUT PARAMETERS S - array[0..M-1] - complex periodic signal M - problem size B - array[0..N-1] - complex non-periodic response N - problem size OUTPUT PARAMETERS R - convolution: A*B. array[0..M-1]. NOTE: It is assumed that B is zero at T<0. If it has non-zero values at negative T's, you can still use this subroutine - just shift its result correspondingly. -- ALGLIB -- Copyright 21.07.2009 by Bochkanov Sergey *************************************************************************/ void convc1dcircular(const complex_1d_array &s, const ae_int_t m, const complex_1d_array &r, const ae_int_t n, complex_1d_array &c); /************************************************************************* 1-dimensional circular complex deconvolution (inverse of ConvC1DCircular()). Algorithm has M*log(M)) complexity for any M (composite or prime). 
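A minimal round-trip sketch (illustrative only; the sample values are arbitrary, and the response was deliberately chosen without zero Fourier coefficients so that deconvolution is well-defined). It uses only the C++ wrappers declared in this header:

    alglib::complex_1d_array s = "[1,2,3,4]";     // periodic signal, M=4
    alglib::complex_1d_array r = "[2,1]";         // response, N=2 (non-degenerate)
    alglib::complex_1d_array c, s2;
    alglib::convc1dcircular(s, 4, r, 2, c);       // c = circular conv(s,r), array[0..3]
    alglib::convc1dcircularinv(c, 4, r, 2, s2);   // s2 recovers s up to rounding error
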
INPUT PARAMETERS A - array[0..M-1] - convolved periodic signal, A = conv(R, B) M - convolved signal length B - array[0..N-1] - non-periodic response N - response length OUTPUT PARAMETERS R - deconvolved signal. array[0..M-1]. NOTE: deconvolution is unstable process and may result in division by zero (if your response function is degenerate, i.e. has zero Fourier coefficient). NOTE: It is assumed that B is zero at T<0. If it has non-zero values at negative T's, you can still use this subroutine - just shift its result correspondingly. -- ALGLIB -- Copyright 21.07.2009 by Bochkanov Sergey *************************************************************************/ void convc1dcircularinv(const complex_1d_array &a, const ae_int_t m, const complex_1d_array &b, const ae_int_t n, complex_1d_array &r); /************************************************************************* 1-dimensional real convolution. Analogous to ConvC1D(), see ConvC1D() comments for more details. INPUT PARAMETERS A - array[0..M-1] - real function to be transformed M - problem size B - array[0..N-1] - real function to be transformed N - problem size OUTPUT PARAMETERS R - convolution: A*B. array[0..N+M-2]. NOTE: It is assumed that A is zero at T<0, B is zero too. If one or both functions have non-zero values at negative T's, you can still use this subroutine - just shift its result correspondingly. -- ALGLIB -- Copyright 21.07.2009 by Bochkanov Sergey *************************************************************************/ void convr1d(const real_1d_array &a, const ae_int_t m, const real_1d_array &b, const ae_int_t n, real_1d_array &r); /************************************************************************* 1-dimensional real deconvolution (inverse of ConvC1D()). Algorithm has M*log(M)) complexity for any M (composite or prime). INPUT PARAMETERS A - array[0..M-1] - convolved signal, A = conv(R, B) M - convolved signal length B - array[0..N-1] - response N - response length, N<=M OUTPUT PARAMETERS R - deconvolved signal. array[0..M-N]. NOTE: deconvolution is unstable process and may result in division by zero (if your response function is degenerate, i.e. has zero Fourier coefficient). NOTE: It is assumed that A is zero at T<0, B is zero too. If one or both functions have non-zero values at negative T's, you can still use this subroutine - just shift its result correspondingly. -- ALGLIB -- Copyright 21.07.2009 by Bochkanov Sergey *************************************************************************/ void convr1dinv(const real_1d_array &a, const ae_int_t m, const real_1d_array &b, const ae_int_t n, real_1d_array &r); /************************************************************************* 1-dimensional circular real convolution. Analogous to ConvC1DCircular(), see ConvC1DCircular() comments for more details. INPUT PARAMETERS S - array[0..M-1] - real signal M - problem size B - array[0..N-1] - real response N - problem size OUTPUT PARAMETERS R - convolution: A*B. array[0..M-1]. NOTE: It is assumed that B is zero at T<0. If it has non-zero values at negative T's, you can still use this subroutine - just shift its result correspondingly. -- ALGLIB -- Copyright 21.07.2009 by Bochkanov Sergey *************************************************************************/ void convr1dcircular(const real_1d_array &s, const ae_int_t m, const real_1d_array &r, const ae_int_t n, real_1d_array &c); /************************************************************************* 1-dimensional complex deconvolution (inverse of ConvC1D()). 
Algorithm has M*log(M)) complexity for any M (composite or prime). INPUT PARAMETERS A - array[0..M-1] - convolved signal, A = conv(R, B) M - convolved signal length B - array[0..N-1] - response N - response length OUTPUT PARAMETERS R - deconvolved signal. array[0..M-N]. NOTE: deconvolution is unstable process and may result in division by zero (if your response function is degenerate, i.e. has zero Fourier coefficient). NOTE: It is assumed that B is zero at T<0. If it has non-zero values at negative T's, you can still use this subroutine - just shift its result correspondingly. -- ALGLIB -- Copyright 21.07.2009 by Bochkanov Sergey *************************************************************************/ void convr1dcircularinv(const real_1d_array &a, const ae_int_t m, const real_1d_array &b, const ae_int_t n, real_1d_array &r); /************************************************************************* 1-dimensional complex cross-correlation. For given Pattern/Signal returns corr(Pattern,Signal) (non-circular). Correlation is calculated using reduction to convolution. Algorithm with max(N,N)*log(max(N,N)) complexity is used (see ConvC1D() for more info about performance). IMPORTANT: for historical reasons subroutine accepts its parameters in reversed order: CorrC1D(Signal, Pattern) = Pattern x Signal (using traditional definition of cross-correlation, denoting cross-correlation as "x"). INPUT PARAMETERS Signal - array[0..N-1] - complex function to be transformed, signal containing pattern N - problem size Pattern - array[0..M-1] - complex function to be transformed, pattern to search withing signal M - problem size OUTPUT PARAMETERS R - cross-correlation, array[0..N+M-2]: * positive lags are stored in R[0..N-1], R[i] = sum(conj(pattern[j])*signal[i+j] * negative lags are stored in R[N..N+M-2], R[N+M-1-i] = sum(conj(pattern[j])*signal[-i+j] NOTE: It is assumed that pattern domain is [0..M-1]. If Pattern is non-zero on [-K..M-1], you can still use this subroutine, just shift result by K. -- ALGLIB -- Copyright 21.07.2009 by Bochkanov Sergey *************************************************************************/ void corrc1d(const complex_1d_array &signal, const ae_int_t n, const complex_1d_array &pattern, const ae_int_t m, complex_1d_array &r); /************************************************************************* 1-dimensional circular complex cross-correlation. For given Pattern/Signal returns corr(Pattern,Signal) (circular). Algorithm has linearithmic complexity for any M/N. IMPORTANT: for historical reasons subroutine accepts its parameters in reversed order: CorrC1DCircular(Signal, Pattern) = Pattern x Signal (using traditional definition of cross-correlation, denoting cross-correlation as "x"). INPUT PARAMETERS Signal - array[0..N-1] - complex function to be transformed, periodic signal containing pattern N - problem size Pattern - array[0..M-1] - complex function to be transformed, non-periodic pattern to search withing signal M - problem size OUTPUT PARAMETERS R - convolution: A*B. array[0..M-1]. -- ALGLIB -- Copyright 21.07.2009 by Bochkanov Sergey *************************************************************************/ void corrc1dcircular(const complex_1d_array &signal, const ae_int_t m, const complex_1d_array &pattern, const ae_int_t n, complex_1d_array &c); /************************************************************************* 1-dimensional real cross-correlation. For given Pattern/Signal returns corr(Pattern,Signal) (non-circular). 
Correlation is calculated using reduction to convolution. Algorithm with max(N,N)*log(max(N,N)) complexity is used (see ConvC1D() for more info about performance). IMPORTANT: for historical reasons subroutine accepts its parameters in reversed order: CorrR1D(Signal, Pattern) = Pattern x Signal (using traditional definition of cross-correlation, denoting cross-correlation as "x"). INPUT PARAMETERS Signal - array[0..N-1] - real function to be transformed, signal containing pattern N - problem size Pattern - array[0..M-1] - real function to be transformed, pattern to search withing signal M - problem size OUTPUT PARAMETERS R - cross-correlation, array[0..N+M-2]: * positive lags are stored in R[0..N-1], R[i] = sum(pattern[j]*signal[i+j] * negative lags are stored in R[N..N+M-2], R[N+M-1-i] = sum(pattern[j]*signal[-i+j] NOTE: It is assumed that pattern domain is [0..M-1]. If Pattern is non-zero on [-K..M-1], you can still use this subroutine, just shift result by K. -- ALGLIB -- Copyright 21.07.2009 by Bochkanov Sergey *************************************************************************/ void corrr1d(const real_1d_array &signal, const ae_int_t n, const real_1d_array &pattern, const ae_int_t m, real_1d_array &r); /************************************************************************* 1-dimensional circular real cross-correlation. For given Pattern/Signal returns corr(Pattern,Signal) (circular). Algorithm has linearithmic complexity for any M/N. IMPORTANT: for historical reasons subroutine accepts its parameters in reversed order: CorrR1DCircular(Signal, Pattern) = Pattern x Signal (using traditional definition of cross-correlation, denoting cross-correlation as "x"). INPUT PARAMETERS Signal - array[0..N-1] - real function to be transformed, periodic signal containing pattern N - problem size Pattern - array[0..M-1] - real function to be transformed, non-periodic pattern to search withing signal M - problem size OUTPUT PARAMETERS R - convolution: A*B. array[0..M-1]. 
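Usage sketch (illustrative only; the array contents are arbitrary and serve merely to show the calling convention of the wrapper declared below):

    alglib::real_1d_array signal  = "[0,0,1,0,0,0,0,0]";   // periodic signal, M=8
    alglib::real_1d_array pattern = "[1,2]";               // pattern, N=2
    alglib::real_1d_array c;
    alglib::corrr1dcircular(signal, 8, pattern, 2, c);     // c has length M=8
    // the largest element of c marks the cyclic lag at which the pattern
    // aligns best with the signal
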
-- ALGLIB -- Copyright 21.07.2009 by Bochkanov Sergey *************************************************************************/ void corrr1dcircular(const real_1d_array &signal, const ae_int_t m, const real_1d_array &pattern, const ae_int_t n, real_1d_array &c); } ///////////////////////////////////////////////////////////////////////// // // THIS SECTION CONTAINS COMPUTATIONAL CORE DECLARATIONS (FUNCTIONS) // ///////////////////////////////////////////////////////////////////////// namespace alglib_impl { void fftc1d(/* Complex */ ae_vector* a, ae_int_t n, ae_state *_state); void fftc1dinv(/* Complex */ ae_vector* a, ae_int_t n, ae_state *_state); void fftr1d(/* Real */ ae_vector* a, ae_int_t n, /* Complex */ ae_vector* f, ae_state *_state); void fftr1dinv(/* Complex */ ae_vector* f, ae_int_t n, /* Real */ ae_vector* a, ae_state *_state); void fftr1dinternaleven(/* Real */ ae_vector* a, ae_int_t n, /* Real */ ae_vector* buf, fasttransformplan* plan, ae_state *_state); void fftr1dinvinternaleven(/* Real */ ae_vector* a, ae_int_t n, /* Real */ ae_vector* buf, fasttransformplan* plan, ae_state *_state); void fhtr1d(/* Real */ ae_vector* a, ae_int_t n, ae_state *_state); void fhtr1dinv(/* Real */ ae_vector* a, ae_int_t n, ae_state *_state); void convc1d(/* Complex */ ae_vector* a, ae_int_t m, /* Complex */ ae_vector* b, ae_int_t n, /* Complex */ ae_vector* r, ae_state *_state); void convc1dinv(/* Complex */ ae_vector* a, ae_int_t m, /* Complex */ ae_vector* b, ae_int_t n, /* Complex */ ae_vector* r, ae_state *_state); void convc1dcircular(/* Complex */ ae_vector* s, ae_int_t m, /* Complex */ ae_vector* r, ae_int_t n, /* Complex */ ae_vector* c, ae_state *_state); void convc1dcircularinv(/* Complex */ ae_vector* a, ae_int_t m, /* Complex */ ae_vector* b, ae_int_t n, /* Complex */ ae_vector* r, ae_state *_state); void convr1d(/* Real */ ae_vector* a, ae_int_t m, /* Real */ ae_vector* b, ae_int_t n, /* Real */ ae_vector* r, ae_state *_state); void convr1dinv(/* Real */ ae_vector* a, ae_int_t m, /* Real */ ae_vector* b, ae_int_t n, /* Real */ ae_vector* r, ae_state *_state); void convr1dcircular(/* Real */ ae_vector* s, ae_int_t m, /* Real */ ae_vector* r, ae_int_t n, /* Real */ ae_vector* c, ae_state *_state); void convr1dcircularinv(/* Real */ ae_vector* a, ae_int_t m, /* Real */ ae_vector* b, ae_int_t n, /* Real */ ae_vector* r, ae_state *_state); void convc1dx(/* Complex */ ae_vector* a, ae_int_t m, /* Complex */ ae_vector* b, ae_int_t n, ae_bool circular, ae_int_t alg, ae_int_t q, /* Complex */ ae_vector* r, ae_state *_state); void convr1dx(/* Real */ ae_vector* a, ae_int_t m, /* Real */ ae_vector* b, ae_int_t n, ae_bool circular, ae_int_t alg, ae_int_t q, /* Real */ ae_vector* r, ae_state *_state); void corrc1d(/* Complex */ ae_vector* signal, ae_int_t n, /* Complex */ ae_vector* pattern, ae_int_t m, /* Complex */ ae_vector* r, ae_state *_state); void corrc1dcircular(/* Complex */ ae_vector* signal, ae_int_t m, /* Complex */ ae_vector* pattern, ae_int_t n, /* Complex */ ae_vector* c, ae_state *_state); void corrr1d(/* Real */ ae_vector* signal, ae_int_t n, /* Real */ ae_vector* pattern, ae_int_t m, /* Real */ ae_vector* r, ae_state *_state); void corrr1dcircular(/* Real */ ae_vector* signal, ae_int_t m, /* Real */ ae_vector* pattern, ae_int_t n, /* Real */ ae_vector* c, ae_state *_state); } #endif cpp/src/statistics.h0000755000175000017500000015723013105126765014371 0ustar sergeysergey/************************************************************************* ALGLIB 3.11.0 (source code 
generated 2017-05-11) Copyright (c) Sergey Bochkanov (ALGLIB project). >>> SOURCE LICENSE >>> This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation (www.fsf.org); either version 2 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. A copy of the GNU General Public License is available at http://www.fsf.org/licensing/licenses >>> END OF LICENSE >>> *************************************************************************/ #ifndef _statistics_pkg_h #define _statistics_pkg_h #include "ap.h" #include "alglibinternal.h" #include "linalg.h" #include "specialfunctions.h" #include "alglibmisc.h" ///////////////////////////////////////////////////////////////////////// // // THIS SECTION CONTAINS COMPUTATIONAL CORE DECLARATIONS (DATATYPES) // ///////////////////////////////////////////////////////////////////////// namespace alglib_impl { } ///////////////////////////////////////////////////////////////////////// // // THIS SECTION CONTAINS C++ INTERFACE // ///////////////////////////////////////////////////////////////////////// namespace alglib { /************************************************************************* Calculation of the distribution moments: mean, variance, skewness, kurtosis. INPUT PARAMETERS: X - sample N - N>=0, sample size: * if given, only leading N elements of X are processed * if not given, automatically determined from size of X OUTPUT PARAMETERS Mean - mean. Variance- variance. Skewness- skewness (if variance<>0; zero otherwise). Kurtosis- kurtosis (if variance<>0; zero otherwise). NOTE: variance is calculated by dividing sum of squares by N-1, not N. -- ALGLIB -- Copyright 06.09.2006 by Bochkanov Sergey *************************************************************************/ void samplemoments(const real_1d_array &x, const ae_int_t n, double &mean, double &variance, double &skewness, double &kurtosis); void samplemoments(const real_1d_array &x, double &mean, double &variance, double &skewness, double &kurtosis); /************************************************************************* Calculation of the mean. INPUT PARAMETERS: X - sample N - N>=0, sample size: * if given, only leading N elements of X are processed * if not given, automatically determined from size of X NOTE: This function return result which calculated by 'SampleMoments' function and stored at 'Mean' variable. -- ALGLIB -- Copyright 06.09.2006 by Bochkanov Sergey *************************************************************************/ double samplemean(const real_1d_array &x, const ae_int_t n); double samplemean(const real_1d_array &x); /************************************************************************* Calculation of the variance. INPUT PARAMETERS: X - sample N - N>=0, sample size: * if given, only leading N elements of X are processed * if not given, automatically determined from size of X NOTE: This function return result which calculated by 'SampleMoments' function and stored at 'Variance' variable. 
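Usage sketch (illustrative only; the data values are arbitrary):

    alglib::real_1d_array x = "[2,4,4,4,5,5,7,9]";
    double v = alglib::samplevariance(x);            // sum of squared deviations divided by N-1
    double mean, var, skew, kurt;
    alglib::samplemoments(x, mean, var, skew, kurt); // var equals the value returned above
    // for this particular data mean=5 and v=32/7
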
-- ALGLIB -- Copyright 06.09.2006 by Bochkanov Sergey *************************************************************************/ double samplevariance(const real_1d_array &x, const ae_int_t n); double samplevariance(const real_1d_array &x); /************************************************************************* Calculation of the skewness. INPUT PARAMETERS: X - sample N - N>=0, sample size: * if given, only leading N elements of X are processed * if not given, automatically determined from size of X NOTE: This function return result which calculated by 'SampleMoments' function and stored at 'Skewness' variable. -- ALGLIB -- Copyright 06.09.2006 by Bochkanov Sergey *************************************************************************/ double sampleskewness(const real_1d_array &x, const ae_int_t n); double sampleskewness(const real_1d_array &x); /************************************************************************* Calculation of the kurtosis. INPUT PARAMETERS: X - sample N - N>=0, sample size: * if given, only leading N elements of X are processed * if not given, automatically determined from size of X NOTE: This function return result which calculated by 'SampleMoments' function and stored at 'Kurtosis' variable. -- ALGLIB -- Copyright 06.09.2006 by Bochkanov Sergey *************************************************************************/ double samplekurtosis(const real_1d_array &x, const ae_int_t n); double samplekurtosis(const real_1d_array &x); /************************************************************************* ADev Input parameters: X - sample N - N>=0, sample size: * if given, only leading N elements of X are processed * if not given, automatically determined from size of X Output parameters: ADev- ADev -- ALGLIB -- Copyright 06.09.2006 by Bochkanov Sergey *************************************************************************/ void sampleadev(const real_1d_array &x, const ae_int_t n, double &adev); void sampleadev(const real_1d_array &x, double &adev); /************************************************************************* Median calculation. Input parameters: X - sample (array indexes: [0..N-1]) N - N>=0, sample size: * if given, only leading N elements of X are processed * if not given, automatically determined from size of X Output parameters: Median -- ALGLIB -- Copyright 06.09.2006 by Bochkanov Sergey *************************************************************************/ void samplemedian(const real_1d_array &x, const ae_int_t n, double &median); void samplemedian(const real_1d_array &x, double &median); /************************************************************************* Percentile calculation. 
Input parameters: X - sample (array indexes: [0..N-1]) N - N>=0, sample size: * if given, only leading N elements of X are processed * if not given, automatically determined from size of X P - percentile (0<=P<=1) Output parameters: V - percentile -- ALGLIB -- Copyright 01.03.2008 by Bochkanov Sergey *************************************************************************/ void samplepercentile(const real_1d_array &x, const ae_int_t n, const double p, double &v); void samplepercentile(const real_1d_array &x, const double p, double &v); /************************************************************************* 2-sample covariance Input parameters: X - sample 1 (array indexes: [0..N-1]) Y - sample 2 (array indexes: [0..N-1]) N - N>=0, sample size: * if given, only N leading elements of X/Y are processed * if not given, automatically determined from input sizes Result: covariance (zero for N=0 or N=1) -- ALGLIB -- Copyright 28.10.2010 by Bochkanov Sergey *************************************************************************/ double cov2(const real_1d_array &x, const real_1d_array &y, const ae_int_t n); double cov2(const real_1d_array &x, const real_1d_array &y); /************************************************************************* Pearson product-moment correlation coefficient Input parameters: X - sample 1 (array indexes: [0..N-1]) Y - sample 2 (array indexes: [0..N-1]) N - N>=0, sample size: * if given, only N leading elements of X/Y are processed * if not given, automatically determined from input sizes Result: Pearson product-moment correlation coefficient (zero for N=0 or N=1) -- ALGLIB -- Copyright 28.10.2010 by Bochkanov Sergey *************************************************************************/ double pearsoncorr2(const real_1d_array &x, const real_1d_array &y, const ae_int_t n); double pearsoncorr2(const real_1d_array &x, const real_1d_array &y); /************************************************************************* Spearman's rank correlation coefficient Input parameters: X - sample 1 (array indexes: [0..N-1]) Y - sample 2 (array indexes: [0..N-1]) N - N>=0, sample size: * if given, only N leading elements of X/Y are processed * if not given, automatically determined from input sizes Result: Spearman's rank correlation coefficient (zero for N=0 or N=1) -- ALGLIB -- Copyright 09.04.2007 by Bochkanov Sergey *************************************************************************/ double spearmancorr2(const real_1d_array &x, const real_1d_array &y, const ae_int_t n); double spearmancorr2(const real_1d_array &x, const real_1d_array &y); /************************************************************************* Covariance matrix SMP EDITION OF ALGLIB: ! This function can utilize multicore capabilities of your system. In ! order to do this you have to call version with "smp_" prefix, which ! indicates that multicore code will be used. ! ! This note is given for users of SMP edition; if you use GPL edition, ! or commercial edition of ALGLIB without SMP support, you still will ! be able to call smp-version of this function, but all computations ! will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. ! ! You should remember that starting/stopping worker thread always have ! non-zero cost. Although multicore version is pretty efficient on ! large problems, we do not recommend you to use it on small problems - ! with covariance matrices smaller than 128*128. 
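A minimal call sketch contrasting the serial and SMP entry points (illustrative only; the matrix contents are arbitrary, and both overloads shown are the ones declared below):

    alglib::real_2d_array x = "[[1,2],[2,4],[3,5],[4,8]]";  // 4 observations of 2 variables
    alglib::real_2d_array c;
    alglib::covm(x, c);      // serial version, c becomes the 2x2 covariance matrix
    alglib::smp_covm(x, c);  // same result; the multicore path is used only by SMP-capable editions
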
INPUT PARAMETERS: X - array[N,M], sample matrix: * J-th column corresponds to J-th variable * I-th row corresponds to I-th observation N - N>=0, number of observations: * if given, only leading N rows of X are used * if not given, automatically determined from input size M - M>0, number of variables: * if given, only leading M columns of X are used * if not given, automatically determined from input size OUTPUT PARAMETERS: C - array[M,M], covariance matrix (zero if N=0 or N=1) -- ALGLIB -- Copyright 28.10.2010 by Bochkanov Sergey *************************************************************************/ void covm(const real_2d_array &x, const ae_int_t n, const ae_int_t m, real_2d_array &c); void smp_covm(const real_2d_array &x, const ae_int_t n, const ae_int_t m, real_2d_array &c); void covm(const real_2d_array &x, real_2d_array &c); void smp_covm(const real_2d_array &x, real_2d_array &c); /************************************************************************* Pearson product-moment correlation matrix SMP EDITION OF ALGLIB: ! This function can utilize multicore capabilities of your system. In ! order to do this you have to call version with "smp_" prefix, which ! indicates that multicore code will be used. ! ! This note is given for users of SMP edition; if you use GPL edition, ! or commercial edition of ALGLIB without SMP support, you still will ! be able to call smp-version of this function, but all computations ! will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. ! ! You should remember that starting/stopping worker thread always have ! non-zero cost. Although multicore version is pretty efficient on ! large problems, we do not recommend you to use it on small problems - ! with correlation matrices smaller than 128*128. INPUT PARAMETERS: X - array[N,M], sample matrix: * J-th column corresponds to J-th variable * I-th row corresponds to I-th observation N - N>=0, number of observations: * if given, only leading N rows of X are used * if not given, automatically determined from input size M - M>0, number of variables: * if given, only leading M columns of X are used * if not given, automatically determined from input size OUTPUT PARAMETERS: C - array[M,M], correlation matrix (zero if N=0 or N=1) -- ALGLIB -- Copyright 28.10.2010 by Bochkanov Sergey *************************************************************************/ void pearsoncorrm(const real_2d_array &x, const ae_int_t n, const ae_int_t m, real_2d_array &c); void smp_pearsoncorrm(const real_2d_array &x, const ae_int_t n, const ae_int_t m, real_2d_array &c); void pearsoncorrm(const real_2d_array &x, real_2d_array &c); void smp_pearsoncorrm(const real_2d_array &x, real_2d_array &c); /************************************************************************* Spearman's rank correlation matrix SMP EDITION OF ALGLIB: ! This function can utilize multicore capabilities of your system. In ! order to do this you have to call version with "smp_" prefix, which ! indicates that multicore code will be used. ! ! This note is given for users of SMP edition; if you use GPL edition, ! or commercial edition of ALGLIB without SMP support, you still will ! be able to call smp-version of this function, but all computations ! will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. ! ! 
You should remember that starting/stopping worker thread always have ! non-zero cost. Although multicore version is pretty efficient on ! large problems, we do not recommend you to use it on small problems - ! with correlation matrices smaller than 128*128. INPUT PARAMETERS: X - array[N,M], sample matrix: * J-th column corresponds to J-th variable * I-th row corresponds to I-th observation N - N>=0, number of observations: * if given, only leading N rows of X are used * if not given, automatically determined from input size M - M>0, number of variables: * if given, only leading M columns of X are used * if not given, automatically determined from input size OUTPUT PARAMETERS: C - array[M,M], correlation matrix (zero if N=0 or N=1) -- ALGLIB -- Copyright 28.10.2010 by Bochkanov Sergey *************************************************************************/ void spearmancorrm(const real_2d_array &x, const ae_int_t n, const ae_int_t m, real_2d_array &c); void smp_spearmancorrm(const real_2d_array &x, const ae_int_t n, const ae_int_t m, real_2d_array &c); void spearmancorrm(const real_2d_array &x, real_2d_array &c); void smp_spearmancorrm(const real_2d_array &x, real_2d_array &c); /************************************************************************* Cross-covariance matrix SMP EDITION OF ALGLIB: ! This function can utilize multicore capabilities of your system. In ! order to do this you have to call version with "smp_" prefix, which ! indicates that multicore code will be used. ! ! This note is given for users of SMP edition; if you use GPL edition, ! or commercial edition of ALGLIB without SMP support, you still will ! be able to call smp-version of this function, but all computations ! will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. ! ! You should remember that starting/stopping worker thread always have ! non-zero cost. Although multicore version is pretty efficient on ! large problems, we do not recommend you to use it on small problems - ! with covariance matrices smaller than 128*128. 
INPUT PARAMETERS: X - array[N,M1], sample matrix: * J-th column corresponds to J-th variable * I-th row corresponds to I-th observation Y - array[N,M2], sample matrix: * J-th column corresponds to J-th variable * I-th row corresponds to I-th observation N - N>=0, number of observations: * if given, only leading N rows of X/Y are used * if not given, automatically determined from input sizes M1 - M1>0, number of variables in X: * if given, only leading M1 columns of X are used * if not given, automatically determined from input size M2 - M2>0, number of variables in Y: * if given, only leading M1 columns of X are used * if not given, automatically determined from input size OUTPUT PARAMETERS: C - array[M1,M2], cross-covariance matrix (zero if N=0 or N=1) -- ALGLIB -- Copyright 28.10.2010 by Bochkanov Sergey *************************************************************************/ void covm2(const real_2d_array &x, const real_2d_array &y, const ae_int_t n, const ae_int_t m1, const ae_int_t m2, real_2d_array &c); void smp_covm2(const real_2d_array &x, const real_2d_array &y, const ae_int_t n, const ae_int_t m1, const ae_int_t m2, real_2d_array &c); void covm2(const real_2d_array &x, const real_2d_array &y, real_2d_array &c); void smp_covm2(const real_2d_array &x, const real_2d_array &y, real_2d_array &c); /************************************************************************* Pearson product-moment cross-correlation matrix SMP EDITION OF ALGLIB: ! This function can utilize multicore capabilities of your system. In ! order to do this you have to call version with "smp_" prefix, which ! indicates that multicore code will be used. ! ! This note is given for users of SMP edition; if you use GPL edition, ! or commercial edition of ALGLIB without SMP support, you still will ! be able to call smp-version of this function, but all computations ! will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. ! ! You should remember that starting/stopping worker thread always have ! non-zero cost. Although multicore version is pretty efficient on ! large problems, we do not recommend you to use it on small problems - ! with correlation matrices smaller than 128*128. 
INPUT PARAMETERS: X - array[N,M1], sample matrix: * J-th column corresponds to J-th variable * I-th row corresponds to I-th observation Y - array[N,M2], sample matrix: * J-th column corresponds to J-th variable * I-th row corresponds to I-th observation N - N>=0, number of observations: * if given, only leading N rows of X/Y are used * if not given, automatically determined from input sizes M1 - M1>0, number of variables in X: * if given, only leading M1 columns of X are used * if not given, automatically determined from input size M2 - M2>0, number of variables in Y: * if given, only leading M1 columns of X are used * if not given, automatically determined from input size OUTPUT PARAMETERS: C - array[M1,M2], cross-correlation matrix (zero if N=0 or N=1) -- ALGLIB -- Copyright 28.10.2010 by Bochkanov Sergey *************************************************************************/ void pearsoncorrm2(const real_2d_array &x, const real_2d_array &y, const ae_int_t n, const ae_int_t m1, const ae_int_t m2, real_2d_array &c); void smp_pearsoncorrm2(const real_2d_array &x, const real_2d_array &y, const ae_int_t n, const ae_int_t m1, const ae_int_t m2, real_2d_array &c); void pearsoncorrm2(const real_2d_array &x, const real_2d_array &y, real_2d_array &c); void smp_pearsoncorrm2(const real_2d_array &x, const real_2d_array &y, real_2d_array &c); /************************************************************************* Spearman's rank cross-correlation matrix SMP EDITION OF ALGLIB: ! This function can utilize multicore capabilities of your system. In ! order to do this you have to call version with "smp_" prefix, which ! indicates that multicore code will be used. ! ! This note is given for users of SMP edition; if you use GPL edition, ! or commercial edition of ALGLIB without SMP support, you still will ! be able to call smp-version of this function, but all computations ! will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. ! ! You should remember that starting/stopping worker thread always have ! non-zero cost. Although multicore version is pretty efficient on ! large problems, we do not recommend you to use it on small problems - ! with correlation matrices smaller than 128*128. 
INPUT PARAMETERS: X - array[N,M1], sample matrix: * J-th column corresponds to J-th variable * I-th row corresponds to I-th observation Y - array[N,M2], sample matrix: * J-th column corresponds to J-th variable * I-th row corresponds to I-th observation N - N>=0, number of observations: * if given, only leading N rows of X/Y are used * if not given, automatically determined from input sizes M1 - M1>0, number of variables in X: * if given, only leading M1 columns of X are used * if not given, automatically determined from input size M2 - M2>0, number of variables in Y: * if given, only leading M1 columns of X are used * if not given, automatically determined from input size OUTPUT PARAMETERS: C - array[M1,M2], cross-correlation matrix (zero if N=0 or N=1) -- ALGLIB -- Copyright 28.10.2010 by Bochkanov Sergey *************************************************************************/ void spearmancorrm2(const real_2d_array &x, const real_2d_array &y, const ae_int_t n, const ae_int_t m1, const ae_int_t m2, real_2d_array &c); void smp_spearmancorrm2(const real_2d_array &x, const real_2d_array &y, const ae_int_t n, const ae_int_t m1, const ae_int_t m2, real_2d_array &c); void spearmancorrm2(const real_2d_array &x, const real_2d_array &y, real_2d_array &c); void smp_spearmancorrm2(const real_2d_array &x, const real_2d_array &y, real_2d_array &c); /************************************************************************* This function replaces data in XY by their ranks: * XY is processed row-by-row * rows are processed separately * tied data are correctly handled (tied ranks are calculated) * ranking starts from 0, ends at NFeatures-1 * sum of within-row values is equal to (NFeatures-1)*NFeatures/2 SMP EDITION OF ALGLIB: ! This function can utilize multicore capabilities of your system. In ! order to do this you have to call version with "smp_" prefix, which ! indicates that multicore code will be used. ! ! This note is given for users of SMP edition; if you use GPL edition, ! or commercial edition of ALGLIB without SMP support, you still will ! be able to call smp-version of this function, but all computations ! will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. ! ! You should remember that starting/stopping worker thread always have ! non-zero cost. Although multicore version is pretty efficient on ! large problems, we do not recommend you to use it on small problems - ! 
ones where expected operations count is less than 100.000 INPUT PARAMETERS: XY - array[NPoints,NFeatures], dataset NPoints - number of points NFeatures- number of features OUTPUT PARAMETERS: XY - data are replaced by their within-row ranks; ranking starts from 0, ends at NFeatures-1 -- ALGLIB -- Copyright 18.04.2013 by Bochkanov Sergey *************************************************************************/ void rankdata(const real_2d_array &xy, const ae_int_t npoints, const ae_int_t nfeatures); void smp_rankdata(const real_2d_array &xy, const ae_int_t npoints, const ae_int_t nfeatures); void rankdata(real_2d_array &xy); void smp_rankdata(real_2d_array &xy); /************************************************************************* This function replaces data in XY by their CENTERED ranks: * XY is processed row-by-row * rows are processed separately * tied data are correctly handled (tied ranks are calculated) * centered ranks are just usual ranks, but centered in such way that sum of within-row values is equal to 0.0. * centering is performed by subtracting mean from each row, i.e it changes mean value, but does NOT change higher moments SMP EDITION OF ALGLIB: ! This function can utilize multicore capabilities of your system. In ! order to do this you have to call version with "smp_" prefix, which ! indicates that multicore code will be used. ! ! This note is given for users of SMP edition; if you use GPL edition, ! or commercial edition of ALGLIB without SMP support, you still will ! be able to call smp-version of this function, but all computations ! will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. ! ! You should remember that starting/stopping worker thread always have ! non-zero cost. Although multicore version is pretty efficient on ! large problems, we do not recommend you to use it on small problems - ! ones where expected operations count is less than 100.000 INPUT PARAMETERS: XY - array[NPoints,NFeatures], dataset NPoints - number of points NFeatures- number of features OUTPUT PARAMETERS: XY - data are replaced by their within-row ranks; ranking starts from 0, ends at NFeatures-1 -- ALGLIB -- Copyright 18.04.2013 by Bochkanov Sergey *************************************************************************/ void rankdatacentered(const real_2d_array &xy, const ae_int_t npoints, const ae_int_t nfeatures); void smp_rankdatacentered(const real_2d_array &xy, const ae_int_t npoints, const ae_int_t nfeatures); void rankdatacentered(real_2d_array &xy); void smp_rankdatacentered(real_2d_array &xy); /************************************************************************* Obsolete function, we recommend to use PearsonCorr2(). -- ALGLIB -- Copyright 09.04.2007 by Bochkanov Sergey *************************************************************************/ double pearsoncorrelation(const real_1d_array &x, const real_1d_array &y, const ae_int_t n); /************************************************************************* Obsolete function, we recommend to use SpearmanCorr2(). -- ALGLIB -- Copyright 09.04.2007 by Bochkanov Sergey *************************************************************************/ double spearmanrankcorrelation(const real_1d_array &x, const real_1d_array &y, const ae_int_t n); /************************************************************************* Wilcoxon signed-rank test This test checks three hypotheses about the median of the given sample. 
The following tests are performed: * two-tailed test (null hypothesis - the median is equal to the given value) * left-tailed test (null hypothesis - the median is greater than or equal to the given value) * right-tailed test (null hypothesis - the median is less than or equal to the given value) Requirements: * the scale of measurement should be ordinal, interval or ratio (i.e. the test could not be applied to nominal variables). * the distribution should be continuous and symmetric relative to its median. * number of distinct values in the X array should be greater than 4 The test is non-parametric and doesn't require distribution X to be normal Input parameters: X - sample. Array whose index goes from 0 to N-1. N - size of the sample. Median - assumed median value. Output parameters: BothTails - p-value for two-tailed test. If BothTails is less than the given significance level the null hypothesis is rejected. LeftTail - p-value for left-tailed test. If LeftTail is less than the given significance level, the null hypothesis is rejected. RightTail - p-value for right-tailed test. If RightTail is less than the given significance level the null hypothesis is rejected. To calculate p-values, special approximation is used. This method lets us calculate p-values with two decimal places in interval [0.0001, 1]. "Two decimal places" does not sound very impressive, but in practice the relative error of less than 1% is enough to make a decision. There is no approximation outside the [0.0001, 1] interval. Therefore, if the significance level outlies this interval, the test returns 0.0001. -- ALGLIB -- Copyright 08.09.2006 by Bochkanov Sergey *************************************************************************/ void wilcoxonsignedranktest(const real_1d_array &x, const ae_int_t n, const double e, double &bothtails, double &lefttail, double &righttail); /************************************************************************* Sign test This test checks three hypotheses about the median of the given sample. The following tests are performed: * two-tailed test (null hypothesis - the median is equal to the given value) * left-tailed test (null hypothesis - the median is greater than or equal to the given value) * right-tailed test (null hypothesis - the median is less than or equal to the given value) Requirements: * the scale of measurement should be ordinal, interval or ratio (i.e. the test could not be applied to nominal variables). The test is non-parametric and doesn't require distribution X to be normal Input parameters: X - sample. Array whose index goes from 0 to N-1. N - size of the sample. Median - assumed median value. Output parameters: BothTails - p-value for two-tailed test. If BothTails is less than the given significance level the null hypothesis is rejected. LeftTail - p-value for left-tailed test. If LeftTail is less than the given significance level, the null hypothesis is rejected. RightTail - p-value for right-tailed test. If RightTail is less than the given significance level the null hypothesis is rejected. While calculating p-values high-precision binomial distribution approximation is used, so significance levels have about 15 exact digits. 
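Usage sketch (illustrative only; the sample and the hypothesised median are arbitrary):

    alglib::real_1d_array x = "[502,498,510,495,503,507,499,501,505,497]";
    double bothtails, lefttail, righttail;
    alglib::onesamplesigntest(x, 10, 500.0, bothtails, lefttail, righttail);
    // reject a null hypothesis when the corresponding p-value falls below the
    // chosen significance level (for example 0.05)
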
-- ALGLIB -- Copyright 08.09.2006 by Bochkanov Sergey *************************************************************************/ void onesamplesigntest(const real_1d_array &x, const ae_int_t n, const double median, double &bothtails, double &lefttail, double &righttail); /************************************************************************* Pearson's correlation coefficient significance test This test checks hypotheses about whether X and Y are samples of two continuous distributions having zero correlation or whether their correlation is non-zero. The following tests are performed: * two-tailed test (null hypothesis - X and Y have zero correlation) * left-tailed test (null hypothesis - the correlation coefficient is greater than or equal to 0) * right-tailed test (null hypothesis - the correlation coefficient is less than or equal to 0). Requirements: * the number of elements in each sample is not less than 5 * normality of distributions of X and Y. Input parameters: R - Pearson's correlation coefficient for X and Y N - number of elements in samples, N>=5. Output parameters: BothTails - p-value for two-tailed test. If BothTails is less than the given significance level the null hypothesis is rejected. LeftTail - p-value for left-tailed test. If LeftTail is less than the given significance level, the null hypothesis is rejected. RightTail - p-value for right-tailed test. If RightTail is less than the given significance level the null hypothesis is rejected. -- ALGLIB -- Copyright 09.04.2007 by Bochkanov Sergey *************************************************************************/ void pearsoncorrelationsignificance(const double r, const ae_int_t n, double &bothtails, double &lefttail, double &righttail); /************************************************************************* Spearman's rank correlation coefficient significance test This test checks hypotheses about whether X and Y are samples of two continuous distributions having zero correlation or whether their correlation is non-zero. The following tests are performed: * two-tailed test (null hypothesis - X and Y have zero correlation) * left-tailed test (null hypothesis - the correlation coefficient is greater than or equal to 0) * right-tailed test (null hypothesis - the correlation coefficient is less than or equal to 0). Requirements: * the number of elements in each sample is not less than 5. The test is non-parametric and doesn't require distributions X and Y to be normal. Input parameters: R - Spearman's rank correlation coefficient for X and Y N - number of elements in samples, N>=5. Output parameters: BothTails - p-value for two-tailed test. If BothTails is less than the given significance level the null hypothesis is rejected. LeftTail - p-value for left-tailed test. If LeftTail is less than the given significance level, the null hypothesis is rejected. RightTail - p-value for right-tailed test. If RightTail is less than the given significance level the null hypothesis is rejected. -- ALGLIB -- Copyright 09.04.2007 by Bochkanov Sergey *************************************************************************/ void spearmanrankcorrelationsignificance(const double r, const ae_int_t n, double &bothtails, double &lefttail, double &righttail); /************************************************************************* One-sample t-test This test checks three hypotheses about the mean of the given sample. 
The following tests are performed: * two-tailed test (null hypothesis - the mean is equal to the given value) * left-tailed test (null hypothesis - the mean is greater than or equal to the given value) * right-tailed test (null hypothesis - the mean is less than or equal to the given value). The test is based on the assumption that a given sample has a normal distribution and an unknown dispersion. If the distribution sharply differs from normal, the test will work incorrectly. INPUT PARAMETERS: X - sample. Array whose index goes from 0 to N-1. N - size of sample, N>=0 Mean - assumed value of the mean. OUTPUT PARAMETERS: BothTails - p-value for two-tailed test. If BothTails is less than the given significance level the null hypothesis is rejected. LeftTail - p-value for left-tailed test. If LeftTail is less than the given significance level, the null hypothesis is rejected. RightTail - p-value for right-tailed test. If RightTail is less than the given significance level the null hypothesis is rejected. NOTE: this function correctly handles degenerate cases: * when N=0, all p-values are set to 1.0 * when variance of X[] is exactly zero, p-values are set to 1.0 or 0.0, depending on difference between sample mean and value of mean being tested. -- ALGLIB -- Copyright 08.09.2006 by Bochkanov Sergey *************************************************************************/ void studentttest1(const real_1d_array &x, const ae_int_t n, const double mean, double &bothtails, double &lefttail, double &righttail); /************************************************************************* Two-sample pooled test This test checks three hypotheses about the mean of the given samples. The following tests are performed: * two-tailed test (null hypothesis - the means are equal) * left-tailed test (null hypothesis - the mean of the first sample is greater than or equal to the mean of the second sample) * right-tailed test (null hypothesis - the mean of the first sample is less than or equal to the mean of the second sample). Test is based on the following assumptions: * given samples have normal distributions * dispersions are equal * samples are independent. Input parameters: X - sample 1. Array whose index goes from 0 to N-1. N - size of sample. Y - sample 2. Array whose index goes from 0 to M-1. M - size of sample. Output parameters: BothTails - p-value for two-tailed test. If BothTails is less than the given significance level the null hypothesis is rejected. LeftTail - p-value for left-tailed test. If LeftTail is less than the given significance level, the null hypothesis is rejected. RightTail - p-value for right-tailed test. If RightTail is less than the given significance level the null hypothesis is rejected. NOTE: this function correctly handles degenerate cases: * when N=0 or M=0, all p-values are set to 1.0 * when both samples has exactly zero variance, p-values are set to 1.0 or 0.0, depending on difference between means. -- ALGLIB -- Copyright 18.09.2006 by Bochkanov Sergey *************************************************************************/ void studentttest2(const real_1d_array &x, const ae_int_t n, const real_1d_array &y, const ae_int_t m, double &bothtails, double &lefttail, double &righttail); /************************************************************************* Two-sample unpooled test This test checks three hypotheses about the mean of the given samples. 
The following tests are performed: * two-tailed test (null hypothesis - the means are equal) * left-tailed test (null hypothesis - the mean of the first sample is greater than or equal to the mean of the second sample) * right-tailed test (null hypothesis - the mean of the first sample is less than or equal to the mean of the second sample). Test is based on the following assumptions: * given samples have normal distributions * samples are independent. Equality of variances is NOT required. Input parameters: X - sample 1. Array whose index goes from 0 to N-1. N - size of the sample. Y - sample 2. Array whose index goes from 0 to M-1. M - size of the sample. Output parameters: BothTails - p-value for two-tailed test. If BothTails is less than the given significance level the null hypothesis is rejected. LeftTail - p-value for left-tailed test. If LeftTail is less than the given significance level, the null hypothesis is rejected. RightTail - p-value for right-tailed test. If RightTail is less than the given significance level the null hypothesis is rejected. NOTE: this function correctly handles degenerate cases: * when N=0 or M=0, all p-values are set to 1.0 * when both samples has zero variance, p-values are set to 1.0 or 0.0, depending on difference between means. * when only one sample has zero variance, test reduces to 1-sample version. -- ALGLIB -- Copyright 18.09.2006 by Bochkanov Sergey *************************************************************************/ void unequalvariancettest(const real_1d_array &x, const ae_int_t n, const real_1d_array &y, const ae_int_t m, double &bothtails, double &lefttail, double &righttail); /************************************************************************* Mann-Whitney U-test This test checks hypotheses about whether X and Y are samples of two continuous distributions of the same shape and same median or whether their medians are different. The following tests are performed: * two-tailed test (null hypothesis - the medians are equal) * left-tailed test (null hypothesis - the median of the first sample is greater than or equal to the median of the second sample) * right-tailed test (null hypothesis - the median of the first sample is less than or equal to the median of the second sample). Requirements: * the samples are independent * X and Y are continuous distributions (or discrete distributions well- approximating continuous distributions) * distributions of X and Y have the same shape. The only possible difference is their position (i.e. the value of the median) * the number of elements in each sample is not less than 5 * the scale of measurement should be ordinal, interval or ratio (i.e. the test could not be applied to nominal variables). The test is non-parametric and doesn't require distributions to be normal. Input parameters: X - sample 1. Array whose index goes from 0 to N-1. N - size of the sample. N>=5 Y - sample 2. Array whose index goes from 0 to M-1. M - size of the sample. M>=5 Output parameters: BothTails - p-value for two-tailed test. If BothTails is less than the given significance level the null hypothesis is rejected. LeftTail - p-value for left-tailed test. If LeftTail is less than the given significance level, the null hypothesis is rejected. RightTail - p-value for right-tailed test. If RightTail is less than the given significance level the null hypothesis is rejected. To calculate p-values, special approximation is used. This method lets us calculate p-values with satisfactory accuracy in interval [0.0001, 1]. 
There is no approximation outside the [0.0001, 1] interval. Therefore, if the significance level outlies this interval, the test returns 0.0001. Relative precision of approximation of p-value: N M Max.err. Rms.err. 5..10 N..10 1.4e-02 6.0e-04 5..10 N..100 2.2e-02 5.3e-06 10..15 N..15 1.0e-02 3.2e-04 10..15 N..100 1.0e-02 2.2e-05 15..100 N..100 6.1e-03 2.7e-06 For N,M>100 accuracy checks weren't put into practice, but taking into account characteristics of asymptotic approximation used, precision should not be sharply different from the values for interval [5, 100]. NOTE: P-value approximation was optimized for 0.0001<=p<=0.2500. Thus, P's outside of this interval are enforced to these bounds. Say, you may quite often get P equal to exactly 0.25 or 0.0001. -- ALGLIB -- Copyright 09.04.2007 by Bochkanov Sergey *************************************************************************/ void mannwhitneyutest(const real_1d_array &x, const ae_int_t n, const real_1d_array &y, const ae_int_t m, double &bothtails, double &lefttail, double &righttail); /************************************************************************* Jarque-Bera test This test checks hypotheses about the fact that a given sample X is a sample of normal random variable. Requirements: * the number of elements in the sample is not less than 5. Input parameters: X - sample. Array whose index goes from 0 to N-1. N - size of the sample. N>=5 Output parameters: P - p-value for the test Accuracy of the approximation used (5<=N<=1951): p-value relative error (5<=N<=1951) [1, 0.1] < 1% [0.1, 0.01] < 2% [0.01, 0.001] < 6% [0.001, 0] wasn't measured For N>1951 accuracy wasn't measured but it shouldn't be sharply different from table values. -- ALGLIB -- Copyright 09.04.2007 by Bochkanov Sergey *************************************************************************/ void jarqueberatest(const real_1d_array &x, const ae_int_t n, double &p); /************************************************************************* Two-sample F-test This test checks three hypotheses about dispersions of the given samples. The following tests are performed: * two-tailed test (null hypothesis - the dispersions are equal) * left-tailed test (null hypothesis - the dispersion of the first sample is greater than or equal to the dispersion of the second sample). * right-tailed test (null hypothesis - the dispersion of the first sample is less than or equal to the dispersion of the second sample) The test is based on the following assumptions: * the given samples have normal distributions * the samples are independent. Input parameters: X - sample 1. Array whose index goes from 0 to N-1. N - sample size. Y - sample 2. Array whose index goes from 0 to M-1. M - sample size. Output parameters: BothTails - p-value for two-tailed test. If BothTails is less than the given significance level the null hypothesis is rejected. LeftTail - p-value for left-tailed test. If LeftTail is less than the given significance level, the null hypothesis is rejected. RightTail - p-value for right-tailed test. If RightTail is less than the given significance level the null hypothesis is rejected. 
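USAGE SKETCH: an illustrative, hedged example (not one of the official ALGLIB examples). It assumes that the statistics.h header of this distribution declares FTest() as shown below, and it uses small made-up samples:

    #include <cstdio>
    #include "statistics.h"
    using namespace alglib;

    int main()
    {
        // two small hand-made samples with visibly different spread
        real_1d_array x = "[0.9, 1.1, 1.0, 0.8, 1.2, 1.0]";
        real_1d_array y = "[0.2, 1.8, 0.5, 1.5, 1.0, 0.9]";
        double bothtails, lefttail, righttail;

        // two-tailed null hypothesis: the dispersions are equal
        ftest(x, x.length(), y, y.length(), bothtails, lefttail, righttail);

        // reject equality of dispersions at significance level 0.05
        // whenever bothtails < 0.05
        printf("two-tailed p-value: %.4f\n", bothtails);
        return 0;
    }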
-- ALGLIB -- Copyright 19.09.2006 by Bochkanov Sergey *************************************************************************/ void ftest(const real_1d_array &x, const ae_int_t n, const real_1d_array &y, const ae_int_t m, double &bothtails, double &lefttail, double &righttail); /************************************************************************* One-sample chi-square test This test checks three hypotheses about the dispersion of the given sample The following tests are performed: * two-tailed test (null hypothesis - the dispersion equals the given number) * left-tailed test (null hypothesis - the dispersion is greater than or equal to the given number) * right-tailed test (null hypothesis - dispersion is less than or equal to the given number). Test is based on the following assumptions: * the given sample has a normal distribution. Input parameters: X - sample 1. Array whose index goes from 0 to N-1. N - size of the sample. Variance - dispersion value to compare with. Output parameters: BothTails - p-value for two-tailed test. If BothTails is less than the given significance level the null hypothesis is rejected. LeftTail - p-value for left-tailed test. If LeftTail is less than the given significance level, the null hypothesis is rejected. RightTail - p-value for right-tailed test. If RightTail is less than the given significance level the null hypothesis is rejected. -- ALGLIB -- Copyright 19.09.2006 by Bochkanov Sergey *************************************************************************/ void onesamplevariancetest(const real_1d_array &x, const ae_int_t n, const double variance, double &bothtails, double &lefttail, double &righttail); } ///////////////////////////////////////////////////////////////////////// // // THIS SECTION CONTAINS COMPUTATIONAL CORE DECLARATIONS (FUNCTIONS) // ///////////////////////////////////////////////////////////////////////// namespace alglib_impl { void samplemoments(/* Real */ ae_vector* x, ae_int_t n, double* mean, double* variance, double* skewness, double* kurtosis, ae_state *_state); double samplemean(/* Real */ ae_vector* x, ae_int_t n, ae_state *_state); double samplevariance(/* Real */ ae_vector* x, ae_int_t n, ae_state *_state); double sampleskewness(/* Real */ ae_vector* x, ae_int_t n, ae_state *_state); double samplekurtosis(/* Real */ ae_vector* x, ae_int_t n, ae_state *_state); void sampleadev(/* Real */ ae_vector* x, ae_int_t n, double* adev, ae_state *_state); void samplemedian(/* Real */ ae_vector* x, ae_int_t n, double* median, ae_state *_state); void samplepercentile(/* Real */ ae_vector* x, ae_int_t n, double p, double* v, ae_state *_state); double cov2(/* Real */ ae_vector* x, /* Real */ ae_vector* y, ae_int_t n, ae_state *_state); double pearsoncorr2(/* Real */ ae_vector* x, /* Real */ ae_vector* y, ae_int_t n, ae_state *_state); double spearmancorr2(/* Real */ ae_vector* x, /* Real */ ae_vector* y, ae_int_t n, ae_state *_state); void covm(/* Real */ ae_matrix* x, ae_int_t n, ae_int_t m, /* Real */ ae_matrix* c, ae_state *_state); void _pexec_covm(/* Real */ ae_matrix* x, ae_int_t n, ae_int_t m, /* Real */ ae_matrix* c, ae_state *_state); void pearsoncorrm(/* Real */ ae_matrix* x, ae_int_t n, ae_int_t m, /* Real */ ae_matrix* c, ae_state *_state); void _pexec_pearsoncorrm(/* Real */ ae_matrix* x, ae_int_t n, ae_int_t m, /* Real */ ae_matrix* c, ae_state *_state); void spearmancorrm(/* Real */ ae_matrix* x, ae_int_t n, ae_int_t m, /* Real */ ae_matrix* c, ae_state *_state); void _pexec_spearmancorrm(/* Real */ ae_matrix* x, 
ae_int_t n, ae_int_t m, /* Real */ ae_matrix* c, ae_state *_state); void covm2(/* Real */ ae_matrix* x, /* Real */ ae_matrix* y, ae_int_t n, ae_int_t m1, ae_int_t m2, /* Real */ ae_matrix* c, ae_state *_state); void _pexec_covm2(/* Real */ ae_matrix* x, /* Real */ ae_matrix* y, ae_int_t n, ae_int_t m1, ae_int_t m2, /* Real */ ae_matrix* c, ae_state *_state); void pearsoncorrm2(/* Real */ ae_matrix* x, /* Real */ ae_matrix* y, ae_int_t n, ae_int_t m1, ae_int_t m2, /* Real */ ae_matrix* c, ae_state *_state); void _pexec_pearsoncorrm2(/* Real */ ae_matrix* x, /* Real */ ae_matrix* y, ae_int_t n, ae_int_t m1, ae_int_t m2, /* Real */ ae_matrix* c, ae_state *_state); void spearmancorrm2(/* Real */ ae_matrix* x, /* Real */ ae_matrix* y, ae_int_t n, ae_int_t m1, ae_int_t m2, /* Real */ ae_matrix* c, ae_state *_state); void _pexec_spearmancorrm2(/* Real */ ae_matrix* x, /* Real */ ae_matrix* y, ae_int_t n, ae_int_t m1, ae_int_t m2, /* Real */ ae_matrix* c, ae_state *_state); void rankdata(/* Real */ ae_matrix* xy, ae_int_t npoints, ae_int_t nfeatures, ae_state *_state); void _pexec_rankdata(/* Real */ ae_matrix* xy, ae_int_t npoints, ae_int_t nfeatures, ae_state *_state); void rankdatacentered(/* Real */ ae_matrix* xy, ae_int_t npoints, ae_int_t nfeatures, ae_state *_state); void _pexec_rankdatacentered(/* Real */ ae_matrix* xy, ae_int_t npoints, ae_int_t nfeatures, ae_state *_state); double pearsoncorrelation(/* Real */ ae_vector* x, /* Real */ ae_vector* y, ae_int_t n, ae_state *_state); double spearmanrankcorrelation(/* Real */ ae_vector* x, /* Real */ ae_vector* y, ae_int_t n, ae_state *_state); void wilcoxonsignedranktest(/* Real */ ae_vector* x, ae_int_t n, double e, double* bothtails, double* lefttail, double* righttail, ae_state *_state); void onesamplesigntest(/* Real */ ae_vector* x, ae_int_t n, double median, double* bothtails, double* lefttail, double* righttail, ae_state *_state); void pearsoncorrelationsignificance(double r, ae_int_t n, double* bothtails, double* lefttail, double* righttail, ae_state *_state); void spearmanrankcorrelationsignificance(double r, ae_int_t n, double* bothtails, double* lefttail, double* righttail, ae_state *_state); void studentttest1(/* Real */ ae_vector* x, ae_int_t n, double mean, double* bothtails, double* lefttail, double* righttail, ae_state *_state); void studentttest2(/* Real */ ae_vector* x, ae_int_t n, /* Real */ ae_vector* y, ae_int_t m, double* bothtails, double* lefttail, double* righttail, ae_state *_state); void unequalvariancettest(/* Real */ ae_vector* x, ae_int_t n, /* Real */ ae_vector* y, ae_int_t m, double* bothtails, double* lefttail, double* righttail, ae_state *_state); void mannwhitneyutest(/* Real */ ae_vector* x, ae_int_t n, /* Real */ ae_vector* y, ae_int_t m, double* bothtails, double* lefttail, double* righttail, ae_state *_state); void jarqueberatest(/* Real */ ae_vector* x, ae_int_t n, double* p, ae_state *_state); void ftest(/* Real */ ae_vector* x, ae_int_t n, /* Real */ ae_vector* y, ae_int_t m, double* bothtails, double* lefttail, double* righttail, ae_state *_state); void onesamplevariancetest(/* Real */ ae_vector* x, ae_int_t n, double variance, double* bothtails, double* lefttail, double* righttail, ae_state *_state); } #endif cpp/src/interpolation.cpp0000755000175000017500000636744513105126766015443 0ustar sergeysergey/************************************************************************* ALGLIB 3.11.0 (source code generated 2017-05-11) Copyright (c) Sergey Bochkanov (ALGLIB project). 
>>> SOURCE LICENSE >>> This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation (www.fsf.org); either version 2 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. A copy of the GNU General Public License is available at http://www.fsf.org/licensing/licenses >>> END OF LICENSE >>> *************************************************************************/ #include "stdafx.h" #include "interpolation.h" // disable some irrelevant warnings #if (AE_COMPILER==AE_MSVC) #pragma warning(disable:4100) #pragma warning(disable:4127) #pragma warning(disable:4702) #pragma warning(disable:4996) #endif using namespace std; ///////////////////////////////////////////////////////////////////////// // // THIS SECTION CONTAINS IMPLEMENTATION OF C++ INTERFACE // ///////////////////////////////////////////////////////////////////////// namespace alglib { /************************************************************************* IDW interpolant. *************************************************************************/ _idwinterpolant_owner::_idwinterpolant_owner() { p_struct = (alglib_impl::idwinterpolant*)alglib_impl::ae_malloc(sizeof(alglib_impl::idwinterpolant), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_idwinterpolant_init(p_struct, NULL); } _idwinterpolant_owner::_idwinterpolant_owner(const _idwinterpolant_owner &rhs) { p_struct = (alglib_impl::idwinterpolant*)alglib_impl::ae_malloc(sizeof(alglib_impl::idwinterpolant), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_idwinterpolant_init_copy(p_struct, const_cast(rhs.p_struct), NULL); } _idwinterpolant_owner& _idwinterpolant_owner::operator=(const _idwinterpolant_owner &rhs) { if( this==&rhs ) return *this; alglib_impl::_idwinterpolant_clear(p_struct); alglib_impl::_idwinterpolant_init_copy(p_struct, const_cast(rhs.p_struct), NULL); return *this; } _idwinterpolant_owner::~_idwinterpolant_owner() { alglib_impl::_idwinterpolant_clear(p_struct); ae_free(p_struct); } alglib_impl::idwinterpolant* _idwinterpolant_owner::c_ptr() { return p_struct; } alglib_impl::idwinterpolant* _idwinterpolant_owner::c_ptr() const { return const_cast(p_struct); } idwinterpolant::idwinterpolant() : _idwinterpolant_owner() { } idwinterpolant::idwinterpolant(const idwinterpolant &rhs):_idwinterpolant_owner(rhs) { } idwinterpolant& idwinterpolant::operator=(const idwinterpolant &rhs) { if( this==&rhs ) return *this; _idwinterpolant_owner::operator=(rhs); return *this; } idwinterpolant::~idwinterpolant() { } /************************************************************************* IDW interpolation INPUT PARAMETERS: Z - IDW interpolant built with one of model building subroutines. 
X - array[0..NX-1], interpolation point Result: IDW interpolant Z(X) -- ALGLIB -- Copyright 02.03.2010 by Bochkanov Sergey *************************************************************************/ double idwcalc(const idwinterpolant &z, const real_1d_array &x) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::idwcalc(const_cast(z.c_ptr()), const_cast(x.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* IDW interpolant using modified Shepard method for uniform point distributions. INPUT PARAMETERS: XY - X and Y values, array[0..N-1,0..NX]. First NX columns contain X-values, last column contain Y-values. N - number of nodes, N>0. NX - space dimension, NX>=1. D - nodal function type, either: * 0 constant model. Just for demonstration only, worst model ever. * 1 linear model, least squares fitting. Simpe model for datasets too small for quadratic models * 2 quadratic model, least squares fitting. Best model available (if your dataset is large enough). * -1 "fast" linear model, use with caution!!! It is significantly faster than linear/quadratic and better than constant model. But it is less robust (especially in the presence of noise). NQ - number of points used to calculate nodal functions (ignored for constant models). NQ should be LARGER than: * max(1.5*(1+NX),2^NX+1) for linear model, * max(3/4*(NX+2)*(NX+1),2^NX+1) for quadratic model. Values less than this threshold will be silently increased. NW - number of points used to calculate weights and to interpolate. Required: >=2^NX+1, values less than this threshold will be silently increased. Recommended value: about 2*NQ OUTPUT PARAMETERS: Z - IDW interpolant. NOTES: * best results are obtained with quadratic models, worst - with constant models * when N is large, NQ and NW must be significantly smaller than N both to obtain optimal performance and to obtain optimal accuracy. In 2 or 3-dimensional tasks NQ=15 and NW=25 are good values to start with. * NQ and NW may be greater than N. In such cases they will be automatically decreased. * this subroutine is always succeeds (as long as correct parameters are passed). * see 'Multivariate Interpolation of Large Sets of Scattered Data' by Robert J. Renka for more information on this algorithm. * this subroutine assumes that point distribution is uniform at the small scales. 
If it isn't - for example, points are concentrated along "lines", but "lines" distribution is uniform at the larger scale - then you should use IDWBuildModifiedShepardR() -- ALGLIB PROJECT -- Copyright 02.03.2010 by Bochkanov Sergey *************************************************************************/ void idwbuildmodifiedshepard(const real_2d_array &xy, const ae_int_t n, const ae_int_t nx, const ae_int_t d, const ae_int_t nq, const ae_int_t nw, idwinterpolant &z) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::idwbuildmodifiedshepard(const_cast(xy.c_ptr()), n, nx, d, nq, nw, const_cast(z.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* IDW interpolant using modified Shepard method for non-uniform datasets. This type of model uses constant nodal functions and interpolates using all nodes which are closer than user-specified radius R. It may be used when points distribution is non-uniform at the small scale, but it is at the distances as large as R. INPUT PARAMETERS: XY - X and Y values, array[0..N-1,0..NX]. First NX columns contain X-values, last column contain Y-values. N - number of nodes, N>0. NX - space dimension, NX>=1. R - radius, R>0 OUTPUT PARAMETERS: Z - IDW interpolant. NOTES: * if there is less than IDWKMin points within R-ball, algorithm selects IDWKMin closest ones, so that continuity properties of interpolant are preserved even far from points. -- ALGLIB PROJECT -- Copyright 11.04.2010 by Bochkanov Sergey *************************************************************************/ void idwbuildmodifiedshepardr(const real_2d_array &xy, const ae_int_t n, const ae_int_t nx, const double r, idwinterpolant &z) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::idwbuildmodifiedshepardr(const_cast(xy.c_ptr()), n, nx, r, const_cast(z.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* IDW model for noisy data. This subroutine may be used to handle noisy data, i.e. data with noise in OUTPUT values. It differs from IDWBuildModifiedShepard() in the following aspects: * nodal functions are not constrained to pass through nodes: Qi(xi)<>yi, i.e. we have fitting instead of interpolation. * weights which are used during least squares fitting stage are all equal to 1.0 (independently of distance) * "fast"-linear or constant nodal functions are not supported (either not robust enough or too rigid) This problem require far more complex tuning than interpolation problems. Below you can find some recommendations regarding this problem: * focus on tuning NQ; it controls noise reduction. As for NW, you can just make it equal to 2*NQ. * you can use cross-validation to determine optimal NQ. * optimal NQ is a result of complex tradeoff between noise level (more noise = larger NQ required) and underlying function complexity (given fixed N, larger NQ means smoothing of compex features in the data). For example, NQ=N will reduce noise to the minimum level possible, but you will end up with just constant/linear/quadratic (depending on D) least squares model for the whole dataset. 
INPUT PARAMETERS: XY - X and Y values, array[0..N-1,0..NX]. First NX columns contain X-values, last column contain Y-values. N - number of nodes, N>0. NX - space dimension, NX>=1. D - nodal function degree, either: * 1 linear model, least squares fitting. Simpe model for datasets too small for quadratic models (or for very noisy problems). * 2 quadratic model, least squares fitting. Best model available (if your dataset is large enough). NQ - number of points used to calculate nodal functions. NQ should be significantly larger than 1.5 times the number of coefficients in a nodal function to overcome effects of noise: * larger than 1.5*(1+NX) for linear model, * larger than 3/4*(NX+2)*(NX+1) for quadratic model. Values less than this threshold will be silently increased. NW - number of points used to calculate weights and to interpolate. Required: >=2^NX+1, values less than this threshold will be silently increased. Recommended value: about 2*NQ or larger OUTPUT PARAMETERS: Z - IDW interpolant. NOTES: * best results are obtained with quadratic models, linear models are not recommended to use unless you are pretty sure that it is what you want * this subroutine is always succeeds (as long as correct parameters are passed). * see 'Multivariate Interpolation of Large Sets of Scattered Data' by Robert J. Renka for more information on this algorithm. -- ALGLIB PROJECT -- Copyright 02.03.2010 by Bochkanov Sergey *************************************************************************/ void idwbuildnoisy(const real_2d_array &xy, const ae_int_t n, const ae_int_t nx, const ae_int_t d, const ae_int_t nq, const ae_int_t nw, idwinterpolant &z) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::idwbuildnoisy(const_cast(xy.c_ptr()), n, nx, d, nq, nw, const_cast(z.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Barycentric interpolant. 
*************************************************************************/ _barycentricinterpolant_owner::_barycentricinterpolant_owner() { p_struct = (alglib_impl::barycentricinterpolant*)alglib_impl::ae_malloc(sizeof(alglib_impl::barycentricinterpolant), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_barycentricinterpolant_init(p_struct, NULL); } _barycentricinterpolant_owner::_barycentricinterpolant_owner(const _barycentricinterpolant_owner &rhs) { p_struct = (alglib_impl::barycentricinterpolant*)alglib_impl::ae_malloc(sizeof(alglib_impl::barycentricinterpolant), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_barycentricinterpolant_init_copy(p_struct, const_cast(rhs.p_struct), NULL); } _barycentricinterpolant_owner& _barycentricinterpolant_owner::operator=(const _barycentricinterpolant_owner &rhs) { if( this==&rhs ) return *this; alglib_impl::_barycentricinterpolant_clear(p_struct); alglib_impl::_barycentricinterpolant_init_copy(p_struct, const_cast(rhs.p_struct), NULL); return *this; } _barycentricinterpolant_owner::~_barycentricinterpolant_owner() { alglib_impl::_barycentricinterpolant_clear(p_struct); ae_free(p_struct); } alglib_impl::barycentricinterpolant* _barycentricinterpolant_owner::c_ptr() { return p_struct; } alglib_impl::barycentricinterpolant* _barycentricinterpolant_owner::c_ptr() const { return const_cast(p_struct); } barycentricinterpolant::barycentricinterpolant() : _barycentricinterpolant_owner() { } barycentricinterpolant::barycentricinterpolant(const barycentricinterpolant &rhs):_barycentricinterpolant_owner(rhs) { } barycentricinterpolant& barycentricinterpolant::operator=(const barycentricinterpolant &rhs) { if( this==&rhs ) return *this; _barycentricinterpolant_owner::operator=(rhs); return *this; } barycentricinterpolant::~barycentricinterpolant() { } /************************************************************************* Rational interpolation using barycentric formula F(t) = SUM(i=0,n-1,w[i]*f[i]/(t-x[i])) / SUM(i=0,n-1,w[i]/(t-x[i])) Input parameters: B - barycentric interpolant built with one of model building subroutines. T - interpolation point Result: barycentric interpolant F(t) -- ALGLIB -- Copyright 17.08.2009 by Bochkanov Sergey *************************************************************************/ double barycentriccalc(const barycentricinterpolant &b, const double t) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::barycentriccalc(const_cast(b.c_ptr()), t, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Differentiation of barycentric interpolant: first derivative. Algorithm used in this subroutine is very robust and should not fail until provided with values too close to MaxRealNumber (usually MaxRealNumber/N or greater will overflow). INPUT PARAMETERS: B - barycentric interpolant built with one of model building subroutines. 
T - interpolation point OUTPUT PARAMETERS: F - barycentric interpolant at T DF - first derivative NOTE -- ALGLIB -- Copyright 17.08.2009 by Bochkanov Sergey *************************************************************************/ void barycentricdiff1(const barycentricinterpolant &b, const double t, double &f, double &df) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::barycentricdiff1(const_cast(b.c_ptr()), t, &f, &df, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Differentiation of barycentric interpolant: first/second derivatives. INPUT PARAMETERS: B - barycentric interpolant built with one of model building subroutines. T - interpolation point OUTPUT PARAMETERS: F - barycentric interpolant at T DF - first derivative D2F - second derivative NOTE: this algorithm may fail due to overflow/underflor if used on data whose values are close to MaxRealNumber or MinRealNumber. Use more robust BarycentricDiff1() subroutine in such cases. -- ALGLIB -- Copyright 17.08.2009 by Bochkanov Sergey *************************************************************************/ void barycentricdiff2(const barycentricinterpolant &b, const double t, double &f, double &df, double &d2f) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::barycentricdiff2(const_cast(b.c_ptr()), t, &f, &df, &d2f, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This subroutine performs linear transformation of the argument. INPUT PARAMETERS: B - rational interpolant in barycentric form CA, CB - transformation coefficients: x = CA*t + CB OUTPUT PARAMETERS: B - transformed interpolant with X replaced by T -- ALGLIB PROJECT -- Copyright 19.08.2009 by Bochkanov Sergey *************************************************************************/ void barycentriclintransx(const barycentricinterpolant &b, const double ca, const double cb) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::barycentriclintransx(const_cast(b.c_ptr()), ca, cb, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This subroutine performs linear transformation of the barycentric interpolant. 
INPUT PARAMETERS: B - rational interpolant in barycentric form CA, CB - transformation coefficients: B2(x) = CA*B(x) + CB OUTPUT PARAMETERS: B - transformed interpolant -- ALGLIB PROJECT -- Copyright 19.08.2009 by Bochkanov Sergey *************************************************************************/ void barycentriclintransy(const barycentricinterpolant &b, const double ca, const double cb) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::barycentriclintransy(const_cast(b.c_ptr()), ca, cb, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Extracts X/Y/W arrays from rational interpolant INPUT PARAMETERS: B - barycentric interpolant OUTPUT PARAMETERS: N - nodes count, N>0 X - interpolation nodes, array[0..N-1] F - function values, array[0..N-1] W - barycentric weights, array[0..N-1] -- ALGLIB -- Copyright 17.08.2009 by Bochkanov Sergey *************************************************************************/ void barycentricunpack(const barycentricinterpolant &b, ae_int_t &n, real_1d_array &x, real_1d_array &y, real_1d_array &w) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::barycentricunpack(const_cast(b.c_ptr()), &n, const_cast(x.c_ptr()), const_cast(y.c_ptr()), const_cast(w.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Rational interpolant from X/Y/W arrays F(t) = SUM(i=0,n-1,w[i]*f[i]/(t-x[i])) / SUM(i=0,n-1,w[i]/(t-x[i])) INPUT PARAMETERS: X - interpolation nodes, array[0..N-1] F - function values, array[0..N-1] W - barycentric weights, array[0..N-1] N - nodes count, N>0 OUTPUT PARAMETERS: B - barycentric interpolant built from (X, Y, W) -- ALGLIB -- Copyright 17.08.2009 by Bochkanov Sergey *************************************************************************/ void barycentricbuildxyw(const real_1d_array &x, const real_1d_array &y, const real_1d_array &w, const ae_int_t n, barycentricinterpolant &b) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::barycentricbuildxyw(const_cast(x.c_ptr()), const_cast(y.c_ptr()), const_cast(w.c_ptr()), n, const_cast(b.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Rational interpolant without poles The subroutine constructs the rational interpolating function without real poles (see 'Barycentric rational interpolation with no poles and high rates of approximation', Michael S. Floater. and Kai Hormann, for more information on this subject). Input parameters: X - interpolation nodes, array[0..N-1]. Y - function values, array[0..N-1]. N - number of nodes, N>0. D - order of the interpolation scheme, 0 <= D <= N-1. D<0 will cause an error. D>=N it will be replaced with D=N-1. if you don't know what D to choose, use small value about 3-5. Output parameters: B - barycentric interpolant. Note: this algorithm always succeeds and calculates the weights with close to machine precision. 
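USAGE SKETCH: an illustrative, hedged example (not one of the official ALGLIB examples). It builds a Floater-Hormann interpolant with D=3 and evaluates it with BarycentricCalc(), relying only on declarations shown in this file; the sample data are made up:

    #include <cstdio>
    #include "interpolation.h"
    using namespace alglib;

    int main()
    {
        // five samples of f(x) = x^2 on [0,2]
        real_1d_array x = "[0.0, 0.5, 1.0, 1.5, 2.0]";
        real_1d_array y = "[0.00, 0.25, 1.00, 2.25, 4.00]";
        barycentricinterpolant b;

        // D=3 is a reasonable default order when nothing is known about the data
        barycentricbuildfloaterhormann(x, y, x.length(), 3, b);

        // evaluate the rational interpolant between the nodes;
        // for this quadratic data the value should be close to 0.75^2 = 0.5625
        printf("%.4f\n", barycentriccalc(b, 0.75));
        return 0;
    }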
-- ALGLIB PROJECT -- Copyright 17.06.2007 by Bochkanov Sergey *************************************************************************/ void barycentricbuildfloaterhormann(const real_1d_array &x, const real_1d_array &y, const ae_int_t n, const ae_int_t d, barycentricinterpolant &b) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::barycentricbuildfloaterhormann(const_cast(x.c_ptr()), const_cast(y.c_ptr()), n, d, const_cast(b.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* 1-dimensional spline interpolant *************************************************************************/ _spline1dinterpolant_owner::_spline1dinterpolant_owner() { p_struct = (alglib_impl::spline1dinterpolant*)alglib_impl::ae_malloc(sizeof(alglib_impl::spline1dinterpolant), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_spline1dinterpolant_init(p_struct, NULL); } _spline1dinterpolant_owner::_spline1dinterpolant_owner(const _spline1dinterpolant_owner &rhs) { p_struct = (alglib_impl::spline1dinterpolant*)alglib_impl::ae_malloc(sizeof(alglib_impl::spline1dinterpolant), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_spline1dinterpolant_init_copy(p_struct, const_cast(rhs.p_struct), NULL); } _spline1dinterpolant_owner& _spline1dinterpolant_owner::operator=(const _spline1dinterpolant_owner &rhs) { if( this==&rhs ) return *this; alglib_impl::_spline1dinterpolant_clear(p_struct); alglib_impl::_spline1dinterpolant_init_copy(p_struct, const_cast(rhs.p_struct), NULL); return *this; } _spline1dinterpolant_owner::~_spline1dinterpolant_owner() { alglib_impl::_spline1dinterpolant_clear(p_struct); ae_free(p_struct); } alglib_impl::spline1dinterpolant* _spline1dinterpolant_owner::c_ptr() { return p_struct; } alglib_impl::spline1dinterpolant* _spline1dinterpolant_owner::c_ptr() const { return const_cast(p_struct); } spline1dinterpolant::spline1dinterpolant() : _spline1dinterpolant_owner() { } spline1dinterpolant::spline1dinterpolant(const spline1dinterpolant &rhs):_spline1dinterpolant_owner(rhs) { } spline1dinterpolant& spline1dinterpolant::operator=(const spline1dinterpolant &rhs) { if( this==&rhs ) return *this; _spline1dinterpolant_owner::operator=(rhs); return *this; } spline1dinterpolant::~spline1dinterpolant() { } /************************************************************************* This subroutine builds linear spline interpolant INPUT PARAMETERS: X - spline nodes, array[0..N-1] Y - function values, array[0..N-1] N - points count (optional): * N>=2 * if given, only first N points are used to build spline * if not given, automatically detected from X/Y sizes (len(X) must be equal to len(Y)) OUTPUT PARAMETERS: C - spline interpolant ORDER OF POINTS Subroutine automatically sorts points, so caller may pass unsorted array. 
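USAGE SKETCH: an illustrative, hedged example with made-up data (not one of the official ALGLIB examples). Spline1DCalc(), used here for evaluation, is documented further down in this file and is assumed to have its usual signature:

    #include <cstdio>
    #include "interpolation.h"
    using namespace alglib;

    int main()
    {
        // nodes may be passed unsorted - the builder sorts them internally
        real_1d_array x = "[2.0, 0.0, 1.0, 3.0]";
        real_1d_array y = "[4.0, 0.0, 2.0, 6.0]";   // y = 2*x at each node
        spline1dinterpolant c;

        spline1dbuildlinear(x, y, x.length(), c);

        // piecewise linear interpolation between the nodes
        printf("%.2f\n", spline1dcalc(c, 1.5));     // expected: 3.00
        return 0;
    }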
-- ALGLIB PROJECT -- Copyright 24.06.2007 by Bochkanov Sergey *************************************************************************/ void spline1dbuildlinear(const real_1d_array &x, const real_1d_array &y, const ae_int_t n, spline1dinterpolant &c) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::spline1dbuildlinear(const_cast(x.c_ptr()), const_cast(y.c_ptr()), n, const_cast(c.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This subroutine builds linear spline interpolant INPUT PARAMETERS: X - spline nodes, array[0..N-1] Y - function values, array[0..N-1] N - points count (optional): * N>=2 * if given, only first N points are used to build spline * if not given, automatically detected from X/Y sizes (len(X) must be equal to len(Y)) OUTPUT PARAMETERS: C - spline interpolant ORDER OF POINTS Subroutine automatically sorts points, so caller may pass unsorted array. -- ALGLIB PROJECT -- Copyright 24.06.2007 by Bochkanov Sergey *************************************************************************/ void spline1dbuildlinear(const real_1d_array &x, const real_1d_array &y, spline1dinterpolant &c) { alglib_impl::ae_state _alglib_env_state; ae_int_t n; if( (x.length()!=y.length())) throw ap_error("Error while calling 'spline1dbuildlinear': looks like one of arguments has wrong size"); n = x.length(); alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::spline1dbuildlinear(const_cast(x.c_ptr()), const_cast(y.c_ptr()), n, const_cast(c.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This subroutine builds cubic spline interpolant. INPUT PARAMETERS: X - spline nodes, array[0..N-1]. Y - function values, array[0..N-1]. OPTIONAL PARAMETERS: N - points count: * N>=2 * if given, only first N points are used to build spline * if not given, automatically detected from X/Y sizes (len(X) must be equal to len(Y)) BoundLType - boundary condition type for the left boundary BoundL - left boundary condition (first or second derivative, depending on the BoundLType) BoundRType - boundary condition type for the right boundary BoundR - right boundary condition (first or second derivative, depending on the BoundRType) OUTPUT PARAMETERS: C - spline interpolant ORDER OF POINTS Subroutine automatically sorts points, so caller may pass unsorted array. SETTING BOUNDARY VALUES: The BoundLType/BoundRType parameters can have the following values: * -1, which corresonds to the periodic (cyclic) boundary conditions. In this case: * both BoundLType and BoundRType must be equal to -1. * BoundL/BoundR are ignored * Y[last] is ignored (it is assumed to be equal to Y[first]). * 0, which corresponds to the parabolically terminated spline (BoundL and/or BoundR are ignored). * 1, which corresponds to the first derivative boundary condition * 2, which corresponds to the second derivative boundary condition * by default, BoundType=0 is used PROBLEMS WITH PERIODIC BOUNDARY CONDITIONS: Problems with periodic boundary conditions have Y[first_point]=Y[last_point]. 
However, this subroutine doesn't require you to specify equal values for the first and last points - it automatically forces them to be equal by copying Y[first_point] (corresponds to the leftmost, minimal X[]) to Y[last_point]. However it is recommended to pass consistent values of Y[], i.e. to make Y[first_point]=Y[last_point]. -- ALGLIB PROJECT -- Copyright 23.06.2007 by Bochkanov Sergey *************************************************************************/ void spline1dbuildcubic(const real_1d_array &x, const real_1d_array &y, const ae_int_t n, const ae_int_t boundltype, const double boundl, const ae_int_t boundrtype, const double boundr, spline1dinterpolant &c) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::spline1dbuildcubic(const_cast(x.c_ptr()), const_cast(y.c_ptr()), n, boundltype, boundl, boundrtype, boundr, const_cast(c.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This subroutine builds cubic spline interpolant. INPUT PARAMETERS: X - spline nodes, array[0..N-1]. Y - function values, array[0..N-1]. OPTIONAL PARAMETERS: N - points count: * N>=2 * if given, only first N points are used to build spline * if not given, automatically detected from X/Y sizes (len(X) must be equal to len(Y)) BoundLType - boundary condition type for the left boundary BoundL - left boundary condition (first or second derivative, depending on the BoundLType) BoundRType - boundary condition type for the right boundary BoundR - right boundary condition (first or second derivative, depending on the BoundRType) OUTPUT PARAMETERS: C - spline interpolant ORDER OF POINTS Subroutine automatically sorts points, so caller may pass unsorted array. SETTING BOUNDARY VALUES: The BoundLType/BoundRType parameters can have the following values: * -1, which corresonds to the periodic (cyclic) boundary conditions. In this case: * both BoundLType and BoundRType must be equal to -1. * BoundL/BoundR are ignored * Y[last] is ignored (it is assumed to be equal to Y[first]). * 0, which corresponds to the parabolically terminated spline (BoundL and/or BoundR are ignored). * 1, which corresponds to the first derivative boundary condition * 2, which corresponds to the second derivative boundary condition * by default, BoundType=0 is used PROBLEMS WITH PERIODIC BOUNDARY CONDITIONS: Problems with periodic boundary conditions have Y[first_point]=Y[last_point]. However, this subroutine doesn't require you to specify equal values for the first and last points - it automatically forces them to be equal by copying Y[first_point] (corresponds to the leftmost, minimal X[]) to Y[last_point]. However it is recommended to pass consistent values of Y[], i.e. to make Y[first_point]=Y[last_point]. 
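USAGE SKETCH: an illustrative, hedged example (not one of the official ALGLIB examples). It uses the full-argument overload declared above to request "natural" boundary conditions; Spline1DCalc(), used for evaluation, is documented further down in this file:

    #include <cstdio>
    #include "interpolation.h"
    using namespace alglib;

    int main()
    {
        real_1d_array x = "[0.0, 1.0, 2.0, 3.0]";
        real_1d_array y = "[0.0, 1.0, 8.0, 27.0]";   // samples of x^3
        spline1dinterpolant c;

        // BoundLType=BoundRType=2 with BoundL=BoundR=0.0 requests a "natural"
        // cubic spline: zero second derivative at both ends of the grid
        spline1dbuildcubic(x, y, x.length(), 2, 0.0, 2, 0.0, c);

        // evaluate between the nodes
        printf("%.4f\n", spline1dcalc(c, 1.5));
        return 0;
    }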
-- ALGLIB PROJECT -- Copyright 23.06.2007 by Bochkanov Sergey *************************************************************************/ void spline1dbuildcubic(const real_1d_array &x, const real_1d_array &y, spline1dinterpolant &c) { alglib_impl::ae_state _alglib_env_state; ae_int_t n; ae_int_t boundltype; double boundl; ae_int_t boundrtype; double boundr; if( (x.length()!=y.length())) throw ap_error("Error while calling 'spline1dbuildcubic': looks like one of arguments has wrong size"); n = x.length(); boundltype = 0; boundl = 0; boundrtype = 0; boundr = 0; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::spline1dbuildcubic(const_cast(x.c_ptr()), const_cast(y.c_ptr()), n, boundltype, boundl, boundrtype, boundr, const_cast(c.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function solves following problem: given table y[] of function values at nodes x[], it calculates and returns table of function derivatives d[] (calculated at the same nodes x[]). This function yields same result as Spline1DBuildCubic() call followed by sequence of Spline1DDiff() calls, but it can be several times faster when called for ordered X[] and X2[]. INPUT PARAMETERS: X - spline nodes Y - function values OPTIONAL PARAMETERS: N - points count: * N>=2 * if given, only first N points are used * if not given, automatically detected from X/Y sizes (len(X) must be equal to len(Y)) BoundLType - boundary condition type for the left boundary BoundL - left boundary condition (first or second derivative, depending on the BoundLType) BoundRType - boundary condition type for the right boundary BoundR - right boundary condition (first or second derivative, depending on the BoundRType) OUTPUT PARAMETERS: D - derivative values at X[] ORDER OF POINTS Subroutine automatically sorts points, so caller may pass unsorted array. Derivative values are correctly reordered on return, so D[I] is always equal to S'(X[I]) independently of points order. SETTING BOUNDARY VALUES: The BoundLType/BoundRType parameters can have the following values: * -1, which corresonds to the periodic (cyclic) boundary conditions. In this case: * both BoundLType and BoundRType must be equal to -1. * BoundL/BoundR are ignored * Y[last] is ignored (it is assumed to be equal to Y[first]). * 0, which corresponds to the parabolically terminated spline (BoundL and/or BoundR are ignored). * 1, which corresponds to the first derivative boundary condition * 2, which corresponds to the second derivative boundary condition * by default, BoundType=0 is used PROBLEMS WITH PERIODIC BOUNDARY CONDITIONS: Problems with periodic boundary conditions have Y[first_point]=Y[last_point]. However, this subroutine doesn't require you to specify equal values for the first and last points - it automatically forces them to be equal by copying Y[first_point] (corresponds to the leftmost, minimal X[]) to Y[last_point]. However it is recommended to pass consistent values of Y[], i.e. to make Y[first_point]=Y[last_point]. 
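USAGE SKETCH: an illustrative, hedged example with made-up quadratic data and default boundary conditions (not one of the official ALGLIB examples):

    #include <cstdio>
    #include "interpolation.h"
    using namespace alglib;

    int main()
    {
        real_1d_array x = "[0.0, 1.0, 2.0, 3.0]";
        real_1d_array y = "[0.0, 1.0, 4.0, 9.0]";   // samples of x^2
        real_1d_array d;

        // BoundLType=BoundRType=0: parabolically terminated spline,
        // boundary values BoundL/BoundR are ignored in this mode
        spline1dgriddiffcubic(x, y, x.length(), 0, 0.0, 0, 0.0, d);

        // d[i] approximates f'(x[i]); for this data d[1] should be close to 2.0
        printf("%.4f\n", d[1]);
        return 0;
    }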
-- ALGLIB PROJECT -- Copyright 03.09.2010 by Bochkanov Sergey *************************************************************************/ void spline1dgriddiffcubic(const real_1d_array &x, const real_1d_array &y, const ae_int_t n, const ae_int_t boundltype, const double boundl, const ae_int_t boundrtype, const double boundr, real_1d_array &d) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::spline1dgriddiffcubic(const_cast(x.c_ptr()), const_cast(y.c_ptr()), n, boundltype, boundl, boundrtype, boundr, const_cast(d.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function solves following problem: given table y[] of function values at nodes x[], it calculates and returns table of function derivatives d[] (calculated at the same nodes x[]). This function yields same result as Spline1DBuildCubic() call followed by sequence of Spline1DDiff() calls, but it can be several times faster when called for ordered X[] and X2[]. INPUT PARAMETERS: X - spline nodes Y - function values OPTIONAL PARAMETERS: N - points count: * N>=2 * if given, only first N points are used * if not given, automatically detected from X/Y sizes (len(X) must be equal to len(Y)) BoundLType - boundary condition type for the left boundary BoundL - left boundary condition (first or second derivative, depending on the BoundLType) BoundRType - boundary condition type for the right boundary BoundR - right boundary condition (first or second derivative, depending on the BoundRType) OUTPUT PARAMETERS: D - derivative values at X[] ORDER OF POINTS Subroutine automatically sorts points, so caller may pass unsorted array. Derivative values are correctly reordered on return, so D[I] is always equal to S'(X[I]) independently of points order. SETTING BOUNDARY VALUES: The BoundLType/BoundRType parameters can have the following values: * -1, which corresonds to the periodic (cyclic) boundary conditions. In this case: * both BoundLType and BoundRType must be equal to -1. * BoundL/BoundR are ignored * Y[last] is ignored (it is assumed to be equal to Y[first]). * 0, which corresponds to the parabolically terminated spline (BoundL and/or BoundR are ignored). * 1, which corresponds to the first derivative boundary condition * 2, which corresponds to the second derivative boundary condition * by default, BoundType=0 is used PROBLEMS WITH PERIODIC BOUNDARY CONDITIONS: Problems with periodic boundary conditions have Y[first_point]=Y[last_point]. However, this subroutine doesn't require you to specify equal values for the first and last points - it automatically forces them to be equal by copying Y[first_point] (corresponds to the leftmost, minimal X[]) to Y[last_point]. However it is recommended to pass consistent values of Y[], i.e. to make Y[first_point]=Y[last_point]. 
-- ALGLIB PROJECT -- Copyright 03.09.2010 by Bochkanov Sergey *************************************************************************/ void spline1dgriddiffcubic(const real_1d_array &x, const real_1d_array &y, real_1d_array &d) { alglib_impl::ae_state _alglib_env_state; ae_int_t n; ae_int_t boundltype; double boundl; ae_int_t boundrtype; double boundr; if( (x.length()!=y.length())) throw ap_error("Error while calling 'spline1dgriddiffcubic': looks like one of arguments has wrong size"); n = x.length(); boundltype = 0; boundl = 0; boundrtype = 0; boundr = 0; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::spline1dgriddiffcubic(const_cast(x.c_ptr()), const_cast(y.c_ptr()), n, boundltype, boundl, boundrtype, boundr, const_cast(d.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function solves following problem: given table y[] of function values at nodes x[], it calculates and returns tables of first and second function derivatives d1[] and d2[] (calculated at the same nodes x[]). This function yields same result as Spline1DBuildCubic() call followed by sequence of Spline1DDiff() calls, but it can be several times faster when called for ordered X[] and X2[]. INPUT PARAMETERS: X - spline nodes Y - function values OPTIONAL PARAMETERS: N - points count: * N>=2 * if given, only first N points are used * if not given, automatically detected from X/Y sizes (len(X) must be equal to len(Y)) BoundLType - boundary condition type for the left boundary BoundL - left boundary condition (first or second derivative, depending on the BoundLType) BoundRType - boundary condition type for the right boundary BoundR - right boundary condition (first or second derivative, depending on the BoundRType) OUTPUT PARAMETERS: D1 - S' values at X[] D2 - S'' values at X[] ORDER OF POINTS Subroutine automatically sorts points, so caller may pass unsorted array. Derivative values are correctly reordered on return, so D[I] is always equal to S'(X[I]) independently of points order. SETTING BOUNDARY VALUES: The BoundLType/BoundRType parameters can have the following values: * -1, which corresonds to the periodic (cyclic) boundary conditions. In this case: * both BoundLType and BoundRType must be equal to -1. * BoundL/BoundR are ignored * Y[last] is ignored (it is assumed to be equal to Y[first]). * 0, which corresponds to the parabolically terminated spline (BoundL and/or BoundR are ignored). * 1, which corresponds to the first derivative boundary condition * 2, which corresponds to the second derivative boundary condition * by default, BoundType=0 is used PROBLEMS WITH PERIODIC BOUNDARY CONDITIONS: Problems with periodic boundary conditions have Y[first_point]=Y[last_point]. However, this subroutine doesn't require you to specify equal values for the first and last points - it automatically forces them to be equal by copying Y[first_point] (corresponds to the leftmost, minimal X[]) to Y[last_point]. However it is recommended to pass consistent values of Y[], i.e. to make Y[first_point]=Y[last_point]. 
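USAGE SKETCH: an illustrative, hedged example, analogous to the Spline1DGridDiffCubic() sketch above, showing how both derivative tables are obtained in one call (not one of the official ALGLIB examples):

    #include <cstdio>
    #include "interpolation.h"
    using namespace alglib;

    int main()
    {
        real_1d_array x = "[0.0, 1.0, 2.0, 3.0]";
        real_1d_array y = "[0.0, 1.0, 4.0, 9.0]";   // samples of x^2
        real_1d_array d1, d2;

        // one call returns both first and second derivatives at the nodes
        spline1dgriddiff2cubic(x, y, x.length(), 0, 0.0, 0, 0.0, d1, d2);

        // for this data d1[1] should be close to 2.0 (=f'(1))
        // and d2[1] close to 2.0 (=f''(1))
        printf("%.4f %.4f\n", d1[1], d2[1]);
        return 0;
    }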
-- ALGLIB PROJECT -- Copyright 03.09.2010 by Bochkanov Sergey *************************************************************************/ void spline1dgriddiff2cubic(const real_1d_array &x, const real_1d_array &y, const ae_int_t n, const ae_int_t boundltype, const double boundl, const ae_int_t boundrtype, const double boundr, real_1d_array &d1, real_1d_array &d2) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::spline1dgriddiff2cubic(const_cast(x.c_ptr()), const_cast(y.c_ptr()), n, boundltype, boundl, boundrtype, boundr, const_cast(d1.c_ptr()), const_cast(d2.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function solves following problem: given table y[] of function values at nodes x[], it calculates and returns tables of first and second function derivatives d1[] and d2[] (calculated at the same nodes x[]). This function yields same result as Spline1DBuildCubic() call followed by sequence of Spline1DDiff() calls, but it can be several times faster when called for ordered X[] and X2[]. INPUT PARAMETERS: X - spline nodes Y - function values OPTIONAL PARAMETERS: N - points count: * N>=2 * if given, only first N points are used * if not given, automatically detected from X/Y sizes (len(X) must be equal to len(Y)) BoundLType - boundary condition type for the left boundary BoundL - left boundary condition (first or second derivative, depending on the BoundLType) BoundRType - boundary condition type for the right boundary BoundR - right boundary condition (first or second derivative, depending on the BoundRType) OUTPUT PARAMETERS: D1 - S' values at X[] D2 - S'' values at X[] ORDER OF POINTS Subroutine automatically sorts points, so caller may pass unsorted array. Derivative values are correctly reordered on return, so D[I] is always equal to S'(X[I]) independently of points order. SETTING BOUNDARY VALUES: The BoundLType/BoundRType parameters can have the following values: * -1, which corresonds to the periodic (cyclic) boundary conditions. In this case: * both BoundLType and BoundRType must be equal to -1. * BoundL/BoundR are ignored * Y[last] is ignored (it is assumed to be equal to Y[first]). * 0, which corresponds to the parabolically terminated spline (BoundL and/or BoundR are ignored). * 1, which corresponds to the first derivative boundary condition * 2, which corresponds to the second derivative boundary condition * by default, BoundType=0 is used PROBLEMS WITH PERIODIC BOUNDARY CONDITIONS: Problems with periodic boundary conditions have Y[first_point]=Y[last_point]. However, this subroutine doesn't require you to specify equal values for the first and last points - it automatically forces them to be equal by copying Y[first_point] (corresponds to the leftmost, minimal X[]) to Y[last_point]. However it is recommended to pass consistent values of Y[], i.e. to make Y[first_point]=Y[last_point]. 
-- ALGLIB PROJECT -- Copyright 03.09.2010 by Bochkanov Sergey *************************************************************************/ void spline1dgriddiff2cubic(const real_1d_array &x, const real_1d_array &y, real_1d_array &d1, real_1d_array &d2) { alglib_impl::ae_state _alglib_env_state; ae_int_t n; ae_int_t boundltype; double boundl; ae_int_t boundrtype; double boundr; if( (x.length()!=y.length())) throw ap_error("Error while calling 'spline1dgriddiff2cubic': looks like one of arguments has wrong size"); n = x.length(); boundltype = 0; boundl = 0; boundrtype = 0; boundr = 0; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::spline1dgriddiff2cubic(const_cast(x.c_ptr()), const_cast(y.c_ptr()), n, boundltype, boundl, boundrtype, boundr, const_cast(d1.c_ptr()), const_cast(d2.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function solves following problem: given table y[] of function values at old nodes x[] and new nodes x2[], it calculates and returns table of function values y2[] (calculated at x2[]). This function yields same result as Spline1DBuildCubic() call followed by sequence of Spline1DDiff() calls, but it can be several times faster when called for ordered X[] and X2[]. INPUT PARAMETERS: X - old spline nodes Y - function values X2 - new spline nodes OPTIONAL PARAMETERS: N - points count: * N>=2 * if given, only first N points from X/Y are used * if not given, automatically detected from X/Y sizes (len(X) must be equal to len(Y)) BoundLType - boundary condition type for the left boundary BoundL - left boundary condition (first or second derivative, depending on the BoundLType) BoundRType - boundary condition type for the right boundary BoundR - right boundary condition (first or second derivative, depending on the BoundRType) N2 - new points count: * N2>=2 * if given, only first N2 points from X2 are used * if not given, automatically detected from X2 size OUTPUT PARAMETERS: F2 - function values at X2[] ORDER OF POINTS Subroutine automatically sorts points, so caller may pass unsorted array. Function values are correctly reordered on return, so F2[I] is always equal to S(X2[I]) independently of points order. SETTING BOUNDARY VALUES: The BoundLType/BoundRType parameters can have the following values: * -1, which corresonds to the periodic (cyclic) boundary conditions. In this case: * both BoundLType and BoundRType must be equal to -1. * BoundL/BoundR are ignored * Y[last] is ignored (it is assumed to be equal to Y[first]). * 0, which corresponds to the parabolically terminated spline (BoundL and/or BoundR are ignored). * 1, which corresponds to the first derivative boundary condition * 2, which corresponds to the second derivative boundary condition * by default, BoundType=0 is used PROBLEMS WITH PERIODIC BOUNDARY CONDITIONS: Problems with periodic boundary conditions have Y[first_point]=Y[last_point]. However, this subroutine doesn't require you to specify equal values for the first and last points - it automatically forces them to be equal by copying Y[first_point] (corresponds to the leftmost, minimal X[]) to Y[last_point]. However it is recommended to pass consistent values of Y[], i.e. to make Y[first_point]=Y[last_point]. 
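USAGE SKETCH: an illustrative, hedged example (not one of the official ALGLIB examples) which resamples a coarse table of made-up values onto a finer grid in a single call:

    #include <cstdio>
    #include "interpolation.h"
    using namespace alglib;

    int main()
    {
        // function values on a coarse grid, resampled onto a finer one
        real_1d_array x  = "[0.0, 1.0, 2.0, 3.0]";
        real_1d_array y  = "[0.0, 1.0, 4.0, 9.0]";   // samples of x^2
        real_1d_array x2 = "[0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0]";
        real_1d_array y2;

        // default (parabolically terminated) boundary conditions,
        // explicit sizes for both the old and the new grid
        spline1dconvcubic(x, y, x.length(), 0, 0.0, 0, 0.0, x2, x2.length(), y2);

        // y2[i] holds the spline value at x2[i]
        printf("%.4f\n", y2[3]);                     // value at x=1.5
        return 0;
    }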
-- ALGLIB PROJECT -- Copyright 03.09.2010 by Bochkanov Sergey *************************************************************************/ void spline1dconvcubic(const real_1d_array &x, const real_1d_array &y, const ae_int_t n, const ae_int_t boundltype, const double boundl, const ae_int_t boundrtype, const double boundr, const real_1d_array &x2, const ae_int_t n2, real_1d_array &y2) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::spline1dconvcubic(const_cast(x.c_ptr()), const_cast(y.c_ptr()), n, boundltype, boundl, boundrtype, boundr, const_cast(x2.c_ptr()), n2, const_cast(y2.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function solves following problem: given table y[] of function values at old nodes x[] and new nodes x2[], it calculates and returns table of function values y2[] (calculated at x2[]). This function yields same result as Spline1DBuildCubic() call followed by sequence of Spline1DDiff() calls, but it can be several times faster when called for ordered X[] and X2[]. INPUT PARAMETERS: X - old spline nodes Y - function values X2 - new spline nodes OPTIONAL PARAMETERS: N - points count: * N>=2 * if given, only first N points from X/Y are used * if not given, automatically detected from X/Y sizes (len(X) must be equal to len(Y)) BoundLType - boundary condition type for the left boundary BoundL - left boundary condition (first or second derivative, depending on the BoundLType) BoundRType - boundary condition type for the right boundary BoundR - right boundary condition (first or second derivative, depending on the BoundRType) N2 - new points count: * N2>=2 * if given, only first N2 points from X2 are used * if not given, automatically detected from X2 size OUTPUT PARAMETERS: F2 - function values at X2[] ORDER OF POINTS Subroutine automatically sorts points, so caller may pass unsorted array. Function values are correctly reordered on return, so F2[I] is always equal to S(X2[I]) independently of points order. SETTING BOUNDARY VALUES: The BoundLType/BoundRType parameters can have the following values: * -1, which corresonds to the periodic (cyclic) boundary conditions. In this case: * both BoundLType and BoundRType must be equal to -1. * BoundL/BoundR are ignored * Y[last] is ignored (it is assumed to be equal to Y[first]). * 0, which corresponds to the parabolically terminated spline (BoundL and/or BoundR are ignored). * 1, which corresponds to the first derivative boundary condition * 2, which corresponds to the second derivative boundary condition * by default, BoundType=0 is used PROBLEMS WITH PERIODIC BOUNDARY CONDITIONS: Problems with periodic boundary conditions have Y[first_point]=Y[last_point]. However, this subroutine doesn't require you to specify equal values for the first and last points - it automatically forces them to be equal by copying Y[first_point] (corresponds to the leftmost, minimal X[]) to Y[last_point]. However it is recommended to pass consistent values of Y[], i.e. to make Y[first_point]=Y[last_point]. 
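A minimal usage sketch of the short spline1dconvcubic() overload described above (illustrative data; interpolation.h assumed as the declaring header). The spline1dconvdiffcubic()/spline1dconvdiff2cubic() functions further below follow the same calling pattern with additional output arrays:

#include <cstdio>
#include "interpolation.h"
using namespace alglib;

int main()
{
    // Resample y=x^2 from five old nodes onto four new nodes in a single call.
    real_1d_array x  = "[0,1,2,3,4]";
    real_1d_array y  = "[0,1,4,9,16]";
    real_1d_array x2 = "[0.5,1.5,2.5,3.5]";
    real_1d_array y2;
    spline1dconvcubic(x, y, x2, y2);            // default boundary conditions
    printf("%s\n", y2.tostring(3).c_str());     // close to [0.25,2.25,6.25,12.25]
    return 0;
}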
-- ALGLIB PROJECT -- Copyright 03.09.2010 by Bochkanov Sergey *************************************************************************/ void spline1dconvcubic(const real_1d_array &x, const real_1d_array &y, const real_1d_array &x2, real_1d_array &y2) { alglib_impl::ae_state _alglib_env_state; ae_int_t n; ae_int_t boundltype; double boundl; ae_int_t boundrtype; double boundr; ae_int_t n2; if( (x.length()!=y.length())) throw ap_error("Error while calling 'spline1dconvcubic': looks like one of arguments has wrong size"); n = x.length(); boundltype = 0; boundl = 0; boundrtype = 0; boundr = 0; n2 = x2.length(); alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::spline1dconvcubic(const_cast(x.c_ptr()), const_cast(y.c_ptr()), n, boundltype, boundl, boundrtype, boundr, const_cast(x2.c_ptr()), n2, const_cast(y2.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function solves following problem: given table y[] of function values at old nodes x[] and new nodes x2[], it calculates and returns table of function values y2[] and derivatives d2[] (calculated at x2[]). This function yields same result as Spline1DBuildCubic() call followed by sequence of Spline1DDiff() calls, but it can be several times faster when called for ordered X[] and X2[]. INPUT PARAMETERS: X - old spline nodes Y - function values X2 - new spline nodes OPTIONAL PARAMETERS: N - points count: * N>=2 * if given, only first N points from X/Y are used * if not given, automatically detected from X/Y sizes (len(X) must be equal to len(Y)) BoundLType - boundary condition type for the left boundary BoundL - left boundary condition (first or second derivative, depending on the BoundLType) BoundRType - boundary condition type for the right boundary BoundR - right boundary condition (first or second derivative, depending on the BoundRType) N2 - new points count: * N2>=2 * if given, only first N2 points from X2 are used * if not given, automatically detected from X2 size OUTPUT PARAMETERS: F2 - function values at X2[] D2 - first derivatives at X2[] ORDER OF POINTS Subroutine automatically sorts points, so caller may pass unsorted array. Function values are correctly reordered on return, so F2[I] is always equal to S(X2[I]) independently of points order. SETTING BOUNDARY VALUES: The BoundLType/BoundRType parameters can have the following values: * -1, which corresonds to the periodic (cyclic) boundary conditions. In this case: * both BoundLType and BoundRType must be equal to -1. * BoundL/BoundR are ignored * Y[last] is ignored (it is assumed to be equal to Y[first]). * 0, which corresponds to the parabolically terminated spline (BoundL and/or BoundR are ignored). * 1, which corresponds to the first derivative boundary condition * 2, which corresponds to the second derivative boundary condition * by default, BoundType=0 is used PROBLEMS WITH PERIODIC BOUNDARY CONDITIONS: Problems with periodic boundary conditions have Y[first_point]=Y[last_point]. However, this subroutine doesn't require you to specify equal values for the first and last points - it automatically forces them to be equal by copying Y[first_point] (corresponds to the leftmost, minimal X[]) to Y[last_point]. However it is recommended to pass consistent values of Y[], i.e. to make Y[first_point]=Y[last_point]. 
-- ALGLIB PROJECT -- Copyright 03.09.2010 by Bochkanov Sergey *************************************************************************/ void spline1dconvdiffcubic(const real_1d_array &x, const real_1d_array &y, const ae_int_t n, const ae_int_t boundltype, const double boundl, const ae_int_t boundrtype, const double boundr, const real_1d_array &x2, const ae_int_t n2, real_1d_array &y2, real_1d_array &d2) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::spline1dconvdiffcubic(const_cast(x.c_ptr()), const_cast(y.c_ptr()), n, boundltype, boundl, boundrtype, boundr, const_cast(x2.c_ptr()), n2, const_cast(y2.c_ptr()), const_cast(d2.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function solves following problem: given table y[] of function values at old nodes x[] and new nodes x2[], it calculates and returns table of function values y2[] and derivatives d2[] (calculated at x2[]). This function yields same result as Spline1DBuildCubic() call followed by sequence of Spline1DDiff() calls, but it can be several times faster when called for ordered X[] and X2[]. INPUT PARAMETERS: X - old spline nodes Y - function values X2 - new spline nodes OPTIONAL PARAMETERS: N - points count: * N>=2 * if given, only first N points from X/Y are used * if not given, automatically detected from X/Y sizes (len(X) must be equal to len(Y)) BoundLType - boundary condition type for the left boundary BoundL - left boundary condition (first or second derivative, depending on the BoundLType) BoundRType - boundary condition type for the right boundary BoundR - right boundary condition (first or second derivative, depending on the BoundRType) N2 - new points count: * N2>=2 * if given, only first N2 points from X2 are used * if not given, automatically detected from X2 size OUTPUT PARAMETERS: F2 - function values at X2[] D2 - first derivatives at X2[] ORDER OF POINTS Subroutine automatically sorts points, so caller may pass unsorted array. Function values are correctly reordered on return, so F2[I] is always equal to S(X2[I]) independently of points order. SETTING BOUNDARY VALUES: The BoundLType/BoundRType parameters can have the following values: * -1, which corresonds to the periodic (cyclic) boundary conditions. In this case: * both BoundLType and BoundRType must be equal to -1. * BoundL/BoundR are ignored * Y[last] is ignored (it is assumed to be equal to Y[first]). * 0, which corresponds to the parabolically terminated spline (BoundL and/or BoundR are ignored). * 1, which corresponds to the first derivative boundary condition * 2, which corresponds to the second derivative boundary condition * by default, BoundType=0 is used PROBLEMS WITH PERIODIC BOUNDARY CONDITIONS: Problems with periodic boundary conditions have Y[first_point]=Y[last_point]. However, this subroutine doesn't require you to specify equal values for the first and last points - it automatically forces them to be equal by copying Y[first_point] (corresponds to the leftmost, minimal X[]) to Y[last_point]. However it is recommended to pass consistent values of Y[], i.e. to make Y[first_point]=Y[last_point]. 
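A sketch of the full-parameter spline1dconvdiffcubic() overload shown above, with first-derivative boundary conditions (BoundLType=BoundRType=1). The sine data, node count and expected accuracy are illustrative assumptions:

#include <cstdio>
#include <cmath>
#include "interpolation.h"
using namespace alglib;

int main()
{
    const double pi = 3.14159265358979323846;
    const ae_int_t n = 5, n2 = 4;
    real_1d_array x, y, x2, y2, d2;
    x.setlength(n);
    y.setlength(n);
    x2.setlength(n2);
    for(int i = 0; i < n; i++)
    {
        x[i] = pi * i / (n - 1);                // nodes on [0,pi]
        y[i] = sin(x[i]);
    }
    for(int i = 0; i < n2; i++)
        x2[i] = pi * (i + 0.5) / (n - 1);       // evaluation points between the nodes
    // S'(0)=cos(0)=1 and S'(pi)=cos(pi)=-1 passed as first-derivative conditions
    spline1dconvdiffcubic(x, y, n, 1, 1.0, 1, -1.0, x2, n2, y2, d2);
    printf("%s\n", y2.tostring(3).c_str());     // close to sin(x2)
    printf("%s\n", d2.tostring(3).c_str());     // close to cos(x2)
    return 0;
}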
-- ALGLIB PROJECT -- Copyright 03.09.2010 by Bochkanov Sergey *************************************************************************/ void spline1dconvdiffcubic(const real_1d_array &x, const real_1d_array &y, const real_1d_array &x2, real_1d_array &y2, real_1d_array &d2) { alglib_impl::ae_state _alglib_env_state; ae_int_t n; ae_int_t boundltype; double boundl; ae_int_t boundrtype; double boundr; ae_int_t n2; if( (x.length()!=y.length())) throw ap_error("Error while calling 'spline1dconvdiffcubic': looks like one of arguments has wrong size"); n = x.length(); boundltype = 0; boundl = 0; boundrtype = 0; boundr = 0; n2 = x2.length(); alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::spline1dconvdiffcubic(const_cast(x.c_ptr()), const_cast(y.c_ptr()), n, boundltype, boundl, boundrtype, boundr, const_cast(x2.c_ptr()), n2, const_cast(y2.c_ptr()), const_cast(d2.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function solves following problem: given table y[] of function values at old nodes x[] and new nodes x2[], it calculates and returns table of function values y2[], first and second derivatives d2[] and dd2[] (calculated at x2[]). This function yields same result as Spline1DBuildCubic() call followed by sequence of Spline1DDiff() calls, but it can be several times faster when called for ordered X[] and X2[]. INPUT PARAMETERS: X - old spline nodes Y - function values X2 - new spline nodes OPTIONAL PARAMETERS: N - points count: * N>=2 * if given, only first N points from X/Y are used * if not given, automatically detected from X/Y sizes (len(X) must be equal to len(Y)) BoundLType - boundary condition type for the left boundary BoundL - left boundary condition (first or second derivative, depending on the BoundLType) BoundRType - boundary condition type for the right boundary BoundR - right boundary condition (first or second derivative, depending on the BoundRType) N2 - new points count: * N2>=2 * if given, only first N2 points from X2 are used * if not given, automatically detected from X2 size OUTPUT PARAMETERS: F2 - function values at X2[] D2 - first derivatives at X2[] DD2 - second derivatives at X2[] ORDER OF POINTS Subroutine automatically sorts points, so caller may pass unsorted array. Function values are correctly reordered on return, so F2[I] is always equal to S(X2[I]) independently of points order. SETTING BOUNDARY VALUES: The BoundLType/BoundRType parameters can have the following values: * -1, which corresonds to the periodic (cyclic) boundary conditions. In this case: * both BoundLType and BoundRType must be equal to -1. * BoundL/BoundR are ignored * Y[last] is ignored (it is assumed to be equal to Y[first]). * 0, which corresponds to the parabolically terminated spline (BoundL and/or BoundR are ignored). * 1, which corresponds to the first derivative boundary condition * 2, which corresponds to the second derivative boundary condition * by default, BoundType=0 is used PROBLEMS WITH PERIODIC BOUNDARY CONDITIONS: Problems with periodic boundary conditions have Y[first_point]=Y[last_point]. However, this subroutine doesn't require you to specify equal values for the first and last points - it automatically forces them to be equal by copying Y[first_point] (corresponds to the leftmost, minimal X[]) to Y[last_point]. 
However it is recommended to pass consistent values of Y[], i.e. to make Y[first_point]=Y[last_point]. -- ALGLIB PROJECT -- Copyright 03.09.2010 by Bochkanov Sergey *************************************************************************/ void spline1dconvdiff2cubic(const real_1d_array &x, const real_1d_array &y, const ae_int_t n, const ae_int_t boundltype, const double boundl, const ae_int_t boundrtype, const double boundr, const real_1d_array &x2, const ae_int_t n2, real_1d_array &y2, real_1d_array &d2, real_1d_array &dd2) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::spline1dconvdiff2cubic(const_cast(x.c_ptr()), const_cast(y.c_ptr()), n, boundltype, boundl, boundrtype, boundr, const_cast(x2.c_ptr()), n2, const_cast(y2.c_ptr()), const_cast(d2.c_ptr()), const_cast(dd2.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function solves following problem: given table y[] of function values at old nodes x[] and new nodes x2[], it calculates and returns table of function values y2[], first and second derivatives d2[] and dd2[] (calculated at x2[]). This function yields same result as Spline1DBuildCubic() call followed by sequence of Spline1DDiff() calls, but it can be several times faster when called for ordered X[] and X2[]. INPUT PARAMETERS: X - old spline nodes Y - function values X2 - new spline nodes OPTIONAL PARAMETERS: N - points count: * N>=2 * if given, only first N points from X/Y are used * if not given, automatically detected from X/Y sizes (len(X) must be equal to len(Y)) BoundLType - boundary condition type for the left boundary BoundL - left boundary condition (first or second derivative, depending on the BoundLType) BoundRType - boundary condition type for the right boundary BoundR - right boundary condition (first or second derivative, depending on the BoundRType) N2 - new points count: * N2>=2 * if given, only first N2 points from X2 are used * if not given, automatically detected from X2 size OUTPUT PARAMETERS: F2 - function values at X2[] D2 - first derivatives at X2[] DD2 - second derivatives at X2[] ORDER OF POINTS Subroutine automatically sorts points, so caller may pass unsorted array. Function values are correctly reordered on return, so F2[I] is always equal to S(X2[I]) independently of points order. SETTING BOUNDARY VALUES: The BoundLType/BoundRType parameters can have the following values: * -1, which corresonds to the periodic (cyclic) boundary conditions. In this case: * both BoundLType and BoundRType must be equal to -1. * BoundL/BoundR are ignored * Y[last] is ignored (it is assumed to be equal to Y[first]). * 0, which corresponds to the parabolically terminated spline (BoundL and/or BoundR are ignored). * 1, which corresponds to the first derivative boundary condition * 2, which corresponds to the second derivative boundary condition * by default, BoundType=0 is used PROBLEMS WITH PERIODIC BOUNDARY CONDITIONS: Problems with periodic boundary conditions have Y[first_point]=Y[last_point]. However, this subroutine doesn't require you to specify equal values for the first and last points - it automatically forces them to be equal by copying Y[first_point] (corresponds to the leftmost, minimal X[]) to Y[last_point]. However it is recommended to pass consistent values of Y[], i.e. 
to make Y[first_point]=Y[last_point]. -- ALGLIB PROJECT -- Copyright 03.09.2010 by Bochkanov Sergey *************************************************************************/ void spline1dconvdiff2cubic(const real_1d_array &x, const real_1d_array &y, const real_1d_array &x2, real_1d_array &y2, real_1d_array &d2, real_1d_array &dd2) { alglib_impl::ae_state _alglib_env_state; ae_int_t n; ae_int_t boundltype; double boundl; ae_int_t boundrtype; double boundr; ae_int_t n2; if( (x.length()!=y.length())) throw ap_error("Error while calling 'spline1dconvdiff2cubic': looks like one of arguments has wrong size"); n = x.length(); boundltype = 0; boundl = 0; boundrtype = 0; boundr = 0; n2 = x2.length(); alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::spline1dconvdiff2cubic(const_cast(x.c_ptr()), const_cast(y.c_ptr()), n, boundltype, boundl, boundrtype, boundr, const_cast(x2.c_ptr()), n2, const_cast(y2.c_ptr()), const_cast(d2.c_ptr()), const_cast(dd2.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This subroutine builds Catmull-Rom spline interpolant. INPUT PARAMETERS: X - spline nodes, array[0..N-1]. Y - function values, array[0..N-1]. OPTIONAL PARAMETERS: N - points count: * N>=2 * if given, only first N points are used to build spline * if not given, automatically detected from X/Y sizes (len(X) must be equal to len(Y)) BoundType - boundary condition type: * -1 for periodic boundary condition * 0 for parabolically terminated spline (default) Tension - tension parameter: * tension=0 corresponds to classic Catmull-Rom spline (default) * 0(x.c_ptr()), const_cast(y.c_ptr()), n, boundtype, tension, const_cast(c.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This subroutine builds Catmull-Rom spline interpolant. INPUT PARAMETERS: X - spline nodes, array[0..N-1]. Y - function values, array[0..N-1]. OPTIONAL PARAMETERS: N - points count: * N>=2 * if given, only first N points are used to build spline * if not given, automatically detected from X/Y sizes (len(X) must be equal to len(Y)) BoundType - boundary condition type: * -1 for periodic boundary condition * 0 for parabolically terminated spline (default) Tension - tension parameter: * tension=0 corresponds to classic Catmull-Rom spline (default) * 0(x.c_ptr()), const_cast(y.c_ptr()), n, boundtype, tension, const_cast(c.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This subroutine builds Hermite spline interpolant. INPUT PARAMETERS: X - spline nodes, array[0..N-1] Y - function values, array[0..N-1] D - derivatives, array[0..N-1] N - points count (optional): * N>=2 * if given, only first N points are used to build spline * if not given, automatically detected from X/Y sizes (len(X) must be equal to len(Y)) OUTPUT PARAMETERS: C - spline interpolant. ORDER OF POINTS Subroutine automatically sorts points, so caller may pass unsorted array. 
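A minimal usage sketch of spline1dbuildhermite() (illustrative data; spline1dcalc(), used to evaluate the result, is documented further below). Because the nodes carry exact values and derivatives of f(x)=x^3, the Hermite interpolant should reproduce that cubic:

#include <cstdio>
#include "interpolation.h"
using namespace alglib;

int main()
{
    real_1d_array x = "[0,1,2]";
    real_1d_array y = "[0,1,8]";      // f(x)=x^3
    real_1d_array d = "[0,3,12]";     // f'(x)=3*x^2
    spline1dinterpolant c;
    spline1dbuildhermite(x, y, d, c);
    printf("%.4f\n", spline1dcalc(c, 1.5));   // about 3.375
    return 0;
}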
-- ALGLIB PROJECT -- Copyright 23.06.2007 by Bochkanov Sergey *************************************************************************/ void spline1dbuildhermite(const real_1d_array &x, const real_1d_array &y, const real_1d_array &d, const ae_int_t n, spline1dinterpolant &c) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::spline1dbuildhermite(const_cast(x.c_ptr()), const_cast(y.c_ptr()), const_cast(d.c_ptr()), n, const_cast(c.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This subroutine builds Hermite spline interpolant. INPUT PARAMETERS: X - spline nodes, array[0..N-1] Y - function values, array[0..N-1] D - derivatives, array[0..N-1] N - points count (optional): * N>=2 * if given, only first N points are used to build spline * if not given, automatically detected from X/Y sizes (len(X) must be equal to len(Y)) OUTPUT PARAMETERS: C - spline interpolant. ORDER OF POINTS Subroutine automatically sorts points, so caller may pass unsorted array. -- ALGLIB PROJECT -- Copyright 23.06.2007 by Bochkanov Sergey *************************************************************************/ void spline1dbuildhermite(const real_1d_array &x, const real_1d_array &y, const real_1d_array &d, spline1dinterpolant &c) { alglib_impl::ae_state _alglib_env_state; ae_int_t n; if( (x.length()!=y.length()) || (x.length()!=d.length())) throw ap_error("Error while calling 'spline1dbuildhermite': looks like one of arguments has wrong size"); n = x.length(); alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::spline1dbuildhermite(const_cast(x.c_ptr()), const_cast(y.c_ptr()), const_cast(d.c_ptr()), n, const_cast(c.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This subroutine builds Akima spline interpolant INPUT PARAMETERS: X - spline nodes, array[0..N-1] Y - function values, array[0..N-1] N - points count (optional): * N>=2 * if given, only first N points are used to build spline * if not given, automatically detected from X/Y sizes (len(X) must be equal to len(Y)) OUTPUT PARAMETERS: C - spline interpolant ORDER OF POINTS Subroutine automatically sorts points, so caller may pass unsorted array. 
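A minimal usage sketch of spline1dbuildakima() combined with spline1ddiff(), which is documented further below (illustrative data; for y=x^2 the value and derivatives at x=2.5 should come out close to 6.25, 5 and 2):

#include <cstdio>
#include "interpolation.h"
using namespace alglib;

int main()
{
    real_1d_array x = "[0,1,2,3,4,5]";
    real_1d_array y = "[0,1,4,9,16,25]";
    spline1dinterpolant c;
    spline1dbuildakima(x, y, c);
    double s, ds, d2s;
    spline1ddiff(c, 2.5, s, ds, d2s);
    printf("%.3f %.3f %.3f\n", s, ds, d2s);
    return 0;
}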
-- ALGLIB PROJECT -- Copyright 24.06.2007 by Bochkanov Sergey *************************************************************************/ void spline1dbuildakima(const real_1d_array &x, const real_1d_array &y, const ae_int_t n, spline1dinterpolant &c) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::spline1dbuildakima(const_cast(x.c_ptr()), const_cast(y.c_ptr()), n, const_cast(c.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This subroutine builds Akima spline interpolant INPUT PARAMETERS: X - spline nodes, array[0..N-1] Y - function values, array[0..N-1] N - points count (optional): * N>=2 * if given, only first N points are used to build spline * if not given, automatically detected from X/Y sizes (len(X) must be equal to len(Y)) OUTPUT PARAMETERS: C - spline interpolant ORDER OF POINTS Subroutine automatically sorts points, so caller may pass unsorted array. -- ALGLIB PROJECT -- Copyright 24.06.2007 by Bochkanov Sergey *************************************************************************/ void spline1dbuildakima(const real_1d_array &x, const real_1d_array &y, spline1dinterpolant &c) { alglib_impl::ae_state _alglib_env_state; ae_int_t n; if( (x.length()!=y.length())) throw ap_error("Error while calling 'spline1dbuildakima': looks like one of arguments has wrong size"); n = x.length(); alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::spline1dbuildakima(const_cast(x.c_ptr()), const_cast(y.c_ptr()), n, const_cast(c.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This subroutine calculates the value of the spline at the given point X. INPUT PARAMETERS: C - spline interpolant X - point Result: S(x) -- ALGLIB PROJECT -- Copyright 23.06.2007 by Bochkanov Sergey *************************************************************************/ double spline1dcalc(const spline1dinterpolant &c, const double x) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::spline1dcalc(const_cast(c.c_ptr()), x, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This subroutine differentiates the spline. INPUT PARAMETERS: C - spline interpolant. 
X - point Result: S - S(x) DS - S'(x) D2S - S''(x) -- ALGLIB PROJECT -- Copyright 24.06.2007 by Bochkanov Sergey *************************************************************************/ void spline1ddiff(const spline1dinterpolant &c, const double x, double &s, double &ds, double &d2s) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::spline1ddiff(const_cast(c.c_ptr()), x, &s, &ds, &d2s, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This subroutine unpacks the spline into the coefficients table. INPUT PARAMETERS: C - spline interpolant. X - point OUTPUT PARAMETERS: Tbl - coefficients table, unpacked format, array[0..N-2, 0..5]. For I = 0...N-2: Tbl[I,0] = X[i] Tbl[I,1] = X[i+1] Tbl[I,2] = C0 Tbl[I,3] = C1 Tbl[I,4] = C2 Tbl[I,5] = C3 On [x[i], x[i+1]] spline is equals to: S(x) = C0 + C1*t + C2*t^2 + C3*t^3 t = x-x[i] NOTE: You can rebuild spline with Spline1DBuildHermite() function, which accepts as inputs function values and derivatives at nodes, which are easy to calculate when you have coefficients. -- ALGLIB PROJECT -- Copyright 29.06.2007 by Bochkanov Sergey *************************************************************************/ void spline1dunpack(const spline1dinterpolant &c, ae_int_t &n, real_2d_array &tbl) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::spline1dunpack(const_cast(c.c_ptr()), &n, const_cast(tbl.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This subroutine performs linear transformation of the spline argument. INPUT PARAMETERS: C - spline interpolant. A, B- transformation coefficients: x = A*t + B Result: C - transformed spline -- ALGLIB PROJECT -- Copyright 30.06.2007 by Bochkanov Sergey *************************************************************************/ void spline1dlintransx(const spline1dinterpolant &c, const double a, const double b) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::spline1dlintransx(const_cast(c.c_ptr()), a, b, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This subroutine performs linear transformation of the spline. INPUT PARAMETERS: C - spline interpolant. 
A, B- transformation coefficients: S2(x) = A*S(x) + B Result: C - transformed spline -- ALGLIB PROJECT -- Copyright 30.06.2007 by Bochkanov Sergey *************************************************************************/ void spline1dlintransy(const spline1dinterpolant &c, const double a, const double b) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::spline1dlintransy(const_cast(c.c_ptr()), a, b, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This subroutine integrates the spline. INPUT PARAMETERS: C - spline interpolant. X - right bound of the integration interval [a, x], here 'a' denotes min(x[]) Result: integral(S(t)dt,a,x) -- ALGLIB PROJECT -- Copyright 23.06.2007 by Bochkanov Sergey *************************************************************************/ double spline1dintegrate(const spline1dinterpolant &c, const double x) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::spline1dintegrate(const_cast(c.c_ptr()), x, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function builds monotone cubic Hermite interpolant. This interpolant is monotonic in [x(0),x(n-1)] and is constant outside of this interval. In case y[] form non-monotonic sequence, interpolant is piecewise monotonic. Say, for x=(0,1,2,3,4) and y=(0,1,2,1,0) interpolant will monotonically grow at [0..2] and monotonically decrease at [2..4]. INPUT PARAMETERS: X - spline nodes, array[0..N-1]. Subroutine automatically sorts points, so caller may pass unsorted array. Y - function values, array[0..N-1] N - the number of points(N>=2). OUTPUT PARAMETERS: C - spline interpolant. -- ALGLIB PROJECT -- Copyright 21.06.2012 by Bochkanov Sergey *************************************************************************/ void spline1dbuildmonotone(const real_1d_array &x, const real_1d_array &y, const ae_int_t n, spline1dinterpolant &c) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::spline1dbuildmonotone(const_cast(x.c_ptr()), const_cast(y.c_ptr()), n, const_cast(c.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function builds monotone cubic Hermite interpolant. This interpolant is monotonic in [x(0),x(n-1)] and is constant outside of this interval. In case y[] form non-monotonic sequence, interpolant is piecewise monotonic. Say, for x=(0,1,2,3,4) and y=(0,1,2,1,0) interpolant will monotonically grow at [0..2] and monotonically decrease at [2..4]. INPUT PARAMETERS: X - spline nodes, array[0..N-1]. Subroutine automatically sorts points, so caller may pass unsorted array. Y - function values, array[0..N-1] N - the number of points(N>=2). OUTPUT PARAMETERS: C - spline interpolant. 
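A minimal usage sketch of spline1dbuildmonotone(), reusing the x/y data from the description above (spline1dcalc() is documented earlier in this section; the expected behavior noted in the comments follows from the monotonicity and constant-continuation properties stated above):

#include <cstdio>
#include "interpolation.h"
using namespace alglib;

int main()
{
    real_1d_array x = "[0,1,2,3,4]";
    real_1d_array y = "[0,1,2,1,0]";
    spline1dinterpolant c;
    spline1dbuildmonotone(x, y, c);
    printf("%.3f\n", spline1dcalc(c, 1.5));   // stays between 1 and 2, no overshoot
    printf("%.3f\n", spline1dcalc(c, 5.0));   // constant continuation outside [0,4], i.e. 0
    return 0;
}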
-- ALGLIB PROJECT -- Copyright 21.06.2012 by Bochkanov Sergey *************************************************************************/ void spline1dbuildmonotone(const real_1d_array &x, const real_1d_array &y, spline1dinterpolant &c) { alglib_impl::ae_state _alglib_env_state; ae_int_t n; if( (x.length()!=y.length())) throw ap_error("Error while calling 'spline1dbuildmonotone': looks like one of arguments has wrong size"); n = x.length(); alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::spline1dbuildmonotone(const_cast(x.c_ptr()), const_cast(y.c_ptr()), n, const_cast(c.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Parametric spline inteprolant: 2-dimensional curve. You should not try to access its members directly - use PSpline2XXXXXXXX() functions instead. *************************************************************************/ _pspline2interpolant_owner::_pspline2interpolant_owner() { p_struct = (alglib_impl::pspline2interpolant*)alglib_impl::ae_malloc(sizeof(alglib_impl::pspline2interpolant), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_pspline2interpolant_init(p_struct, NULL); } _pspline2interpolant_owner::_pspline2interpolant_owner(const _pspline2interpolant_owner &rhs) { p_struct = (alglib_impl::pspline2interpolant*)alglib_impl::ae_malloc(sizeof(alglib_impl::pspline2interpolant), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_pspline2interpolant_init_copy(p_struct, const_cast(rhs.p_struct), NULL); } _pspline2interpolant_owner& _pspline2interpolant_owner::operator=(const _pspline2interpolant_owner &rhs) { if( this==&rhs ) return *this; alglib_impl::_pspline2interpolant_clear(p_struct); alglib_impl::_pspline2interpolant_init_copy(p_struct, const_cast(rhs.p_struct), NULL); return *this; } _pspline2interpolant_owner::~_pspline2interpolant_owner() { alglib_impl::_pspline2interpolant_clear(p_struct); ae_free(p_struct); } alglib_impl::pspline2interpolant* _pspline2interpolant_owner::c_ptr() { return p_struct; } alglib_impl::pspline2interpolant* _pspline2interpolant_owner::c_ptr() const { return const_cast(p_struct); } pspline2interpolant::pspline2interpolant() : _pspline2interpolant_owner() { } pspline2interpolant::pspline2interpolant(const pspline2interpolant &rhs):_pspline2interpolant_owner(rhs) { } pspline2interpolant& pspline2interpolant::operator=(const pspline2interpolant &rhs) { if( this==&rhs ) return *this; _pspline2interpolant_owner::operator=(rhs); return *this; } pspline2interpolant::~pspline2interpolant() { } /************************************************************************* Parametric spline inteprolant: 3-dimensional curve. You should not try to access its members directly - use PSpline3XXXXXXXX() functions instead. 
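The 2- and 3-dimensional parametric spline types above are used through the PSpline2.../PSpline3... functions documented below. A minimal 2D sketch under illustrative assumptions (three-point curve, ST=1 for a Catmull-Rom spline, PT=1 for chord-length parameterization):

#include <cstdio>
#include "interpolation.h"
using namespace alglib;

int main()
{
    // An arch-like curve through three points; point order matters for parametric fits.
    real_2d_array xy = "[[0,0],[1,1],[2,0]]";
    pspline2interpolant p;
    pspline2build(xy, 3, 1, 1, p);

    double x, y, tx, ty;
    pspline2calc(p, 0.0, x, y);                // first point of the curve, (0,0)
    printf("%.3f %.3f\n", x, y);
    pspline2calc(p, 0.5, x, y);                // middle data point (1,1) for this symmetric curve
    printf("%.3f %.3f\n", x, y);
    pspline2tangent(p, 0.5, tx, ty);           // normalized tangent at the apex, close to (1,0)
    printf("%.3f %.3f\n", tx, ty);
    printf("%.3f\n", pspline2arclength(p, 0.0, 1.0));   // total curve length
    return 0;
}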
*************************************************************************/ _pspline3interpolant_owner::_pspline3interpolant_owner() { p_struct = (alglib_impl::pspline3interpolant*)alglib_impl::ae_malloc(sizeof(alglib_impl::pspline3interpolant), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_pspline3interpolant_init(p_struct, NULL); } _pspline3interpolant_owner::_pspline3interpolant_owner(const _pspline3interpolant_owner &rhs) { p_struct = (alglib_impl::pspline3interpolant*)alglib_impl::ae_malloc(sizeof(alglib_impl::pspline3interpolant), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_pspline3interpolant_init_copy(p_struct, const_cast(rhs.p_struct), NULL); } _pspline3interpolant_owner& _pspline3interpolant_owner::operator=(const _pspline3interpolant_owner &rhs) { if( this==&rhs ) return *this; alglib_impl::_pspline3interpolant_clear(p_struct); alglib_impl::_pspline3interpolant_init_copy(p_struct, const_cast(rhs.p_struct), NULL); return *this; } _pspline3interpolant_owner::~_pspline3interpolant_owner() { alglib_impl::_pspline3interpolant_clear(p_struct); ae_free(p_struct); } alglib_impl::pspline3interpolant* _pspline3interpolant_owner::c_ptr() { return p_struct; } alglib_impl::pspline3interpolant* _pspline3interpolant_owner::c_ptr() const { return const_cast(p_struct); } pspline3interpolant::pspline3interpolant() : _pspline3interpolant_owner() { } pspline3interpolant::pspline3interpolant(const pspline3interpolant &rhs):_pspline3interpolant_owner(rhs) { } pspline3interpolant& pspline3interpolant::operator=(const pspline3interpolant &rhs) { if( this==&rhs ) return *this; _pspline3interpolant_owner::operator=(rhs); return *this; } pspline3interpolant::~pspline3interpolant() { } /************************************************************************* This function builds non-periodic 2-dimensional parametric spline which starts at (X[0],Y[0]) and ends at (X[N-1],Y[N-1]). INPUT PARAMETERS: XY - points, array[0..N-1,0..1]. XY[I,0:1] corresponds to the Ith point. Order of points is important! N - points count, N>=5 for Akima splines, N>=2 for other types of splines. ST - spline type: * 0 Akima spline * 1 parabolically terminated Catmull-Rom spline (Tension=0) * 2 parabolically terminated cubic spline PT - parameterization type: * 0 uniform * 1 chord length * 2 centripetal OUTPUT PARAMETERS: P - parametric spline interpolant NOTES: * this function assumes that there all consequent points are distinct. I.e. (x0,y0)<>(x1,y1), (x1,y1)<>(x2,y2), (x2,y2)<>(x3,y3) and so on. However, non-consequent points may coincide, i.e. we can have (x0,y0)= =(x2,y2). -- ALGLIB PROJECT -- Copyright 28.05.2010 by Bochkanov Sergey *************************************************************************/ void pspline2build(const real_2d_array &xy, const ae_int_t n, const ae_int_t st, const ae_int_t pt, pspline2interpolant &p) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::pspline2build(const_cast(xy.c_ptr()), n, st, pt, const_cast(p.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function builds non-periodic 3-dimensional parametric spline which starts at (X[0],Y[0],Z[0]) and ends at (X[N-1],Y[N-1],Z[N-1]). 
Same as PSpline2Build() function, but for 3D, so we won't duplicate its description here. -- ALGLIB PROJECT -- Copyright 28.05.2010 by Bochkanov Sergey *************************************************************************/ void pspline3build(const real_2d_array &xy, const ae_int_t n, const ae_int_t st, const ae_int_t pt, pspline3interpolant &p) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::pspline3build(const_cast(xy.c_ptr()), n, st, pt, const_cast(p.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function builds periodic 2-dimensional parametric spline which starts at (X[0],Y[0]), goes through all points to (X[N-1],Y[N-1]) and then back to (X[0],Y[0]). INPUT PARAMETERS: XY - points, array[0..N-1,0..1]. XY[I,0:1] corresponds to the Ith point. XY[N-1,0:1] must be different from XY[0,0:1]. Order of points is important! N - points count, N>=3 for other types of splines. ST - spline type: * 1 Catmull-Rom spline (Tension=0) with cyclic boundary conditions * 2 cubic spline with cyclic boundary conditions PT - parameterization type: * 0 uniform * 1 chord length * 2 centripetal OUTPUT PARAMETERS: P - parametric spline interpolant NOTES: * this function assumes that there all consequent points are distinct. I.e. (x0,y0)<>(x1,y1), (x1,y1)<>(x2,y2), (x2,y2)<>(x3,y3) and so on. However, non-consequent points may coincide, i.e. we can have (x0,y0)= =(x2,y2). * last point of sequence is NOT equal to the first point. You shouldn't make curve "explicitly periodic" by making them equal. -- ALGLIB PROJECT -- Copyright 28.05.2010 by Bochkanov Sergey *************************************************************************/ void pspline2buildperiodic(const real_2d_array &xy, const ae_int_t n, const ae_int_t st, const ae_int_t pt, pspline2interpolant &p) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::pspline2buildperiodic(const_cast(xy.c_ptr()), n, st, pt, const_cast(p.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function builds periodic 3-dimensional parametric spline which starts at (X[0],Y[0],Z[0]), goes through all points to (X[N-1],Y[N-1],Z[N-1]) and then back to (X[0],Y[0],Z[0]). Same as PSpline2Build() function, but for 3D, so we won't duplicate its description here. -- ALGLIB PROJECT -- Copyright 28.05.2010 by Bochkanov Sergey *************************************************************************/ void pspline3buildperiodic(const real_2d_array &xy, const ae_int_t n, const ae_int_t st, const ae_int_t pt, pspline3interpolant &p) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::pspline3buildperiodic(const_cast(xy.c_ptr()), n, st, pt, const_cast(p.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function returns vector of parameter values correspoding to points. I.e. 
for P created from (X[0],Y[0])...(X[N-1],Y[N-1]) and U=TValues(P) we have (X[0],Y[0]) = PSpline2Calc(P,U[0]), (X[1],Y[1]) = PSpline2Calc(P,U[1]), (X[2],Y[2]) = PSpline2Calc(P,U[2]), ... INPUT PARAMETERS: P - parametric spline interpolant OUTPUT PARAMETERS: N - array size T - array[0..N-1] NOTES: * for non-periodic splines U[0]=0, U[0](p.c_ptr()), &n, const_cast(t.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function returns vector of parameter values correspoding to points. Same as PSpline2ParameterValues(), but for 3D. -- ALGLIB PROJECT -- Copyright 28.05.2010 by Bochkanov Sergey *************************************************************************/ void pspline3parametervalues(const pspline3interpolant &p, ae_int_t &n, real_1d_array &t) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::pspline3parametervalues(const_cast(p.c_ptr()), &n, const_cast(t.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function calculates the value of the parametric spline for a given value of parameter T INPUT PARAMETERS: P - parametric spline interpolant T - point: * T in [0,1] corresponds to interval spanned by points * for non-periodic splines T<0 (or T>1) correspond to parts of the curve before the first (after the last) point * for periodic splines T<0 (or T>1) are projected into [0,1] by making T=T-floor(T). OUTPUT PARAMETERS: X - X-position Y - Y-position -- ALGLIB PROJECT -- Copyright 28.05.2010 by Bochkanov Sergey *************************************************************************/ void pspline2calc(const pspline2interpolant &p, const double t, double &x, double &y) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::pspline2calc(const_cast(p.c_ptr()), t, &x, &y, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function calculates the value of the parametric spline for a given value of parameter T. INPUT PARAMETERS: P - parametric spline interpolant T - point: * T in [0,1] corresponds to interval spanned by points * for non-periodic splines T<0 (or T>1) correspond to parts of the curve before the first (after the last) point * for periodic splines T<0 (or T>1) are projected into [0,1] by making T=T-floor(T). 
OUTPUT PARAMETERS: X - X-position Y - Y-position Z - Z-position -- ALGLIB PROJECT -- Copyright 28.05.2010 by Bochkanov Sergey *************************************************************************/ void pspline3calc(const pspline3interpolant &p, const double t, double &x, double &y, double &z) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::pspline3calc(const_cast(p.c_ptr()), t, &x, &y, &z, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function calculates tangent vector for a given value of parameter T INPUT PARAMETERS: P - parametric spline interpolant T - point: * T in [0,1] corresponds to interval spanned by points * for non-periodic splines T<0 (or T>1) correspond to parts of the curve before the first (after the last) point * for periodic splines T<0 (or T>1) are projected into [0,1] by making T=T-floor(T). OUTPUT PARAMETERS: X - X-component of tangent vector (normalized) Y - Y-component of tangent vector (normalized) NOTE: X^2+Y^2 is either 1 (for non-zero tangent vector) or 0. -- ALGLIB PROJECT -- Copyright 28.05.2010 by Bochkanov Sergey *************************************************************************/ void pspline2tangent(const pspline2interpolant &p, const double t, double &x, double &y) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::pspline2tangent(const_cast(p.c_ptr()), t, &x, &y, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function calculates tangent vector for a given value of parameter T INPUT PARAMETERS: P - parametric spline interpolant T - point: * T in [0,1] corresponds to interval spanned by points * for non-periodic splines T<0 (or T>1) correspond to parts of the curve before the first (after the last) point * for periodic splines T<0 (or T>1) are projected into [0,1] by making T=T-floor(T). OUTPUT PARAMETERS: X - X-component of tangent vector (normalized) Y - Y-component of tangent vector (normalized) Z - Z-component of tangent vector (normalized) NOTE: X^2+Y^2+Z^2 is either 1 (for non-zero tangent vector) or 0. -- ALGLIB PROJECT -- Copyright 28.05.2010 by Bochkanov Sergey *************************************************************************/ void pspline3tangent(const pspline3interpolant &p, const double t, double &x, double &y, double &z) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::pspline3tangent(const_cast(p.c_ptr()), t, &x, &y, &z, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function calculates derivative, i.e. it returns (dX/dT,dY/dT). INPUT PARAMETERS: P - parametric spline interpolant T - point: * T in [0,1] corresponds to interval spanned by points * for non-periodic splines T<0 (or T>1) correspond to parts of the curve before the first (after the last) point * for periodic splines T<0 (or T>1) are projected into [0,1] by making T=T-floor(T). 
OUTPUT PARAMETERS: X - X-value DX - X-derivative Y - Y-value DY - Y-derivative -- ALGLIB PROJECT -- Copyright 28.05.2010 by Bochkanov Sergey *************************************************************************/ void pspline2diff(const pspline2interpolant &p, const double t, double &x, double &dx, double &y, double &dy) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::pspline2diff(const_cast(p.c_ptr()), t, &x, &dx, &y, &dy, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function calculates derivative, i.e. it returns (dX/dT,dY/dT,dZ/dT). INPUT PARAMETERS: P - parametric spline interpolant T - point: * T in [0,1] corresponds to interval spanned by points * for non-periodic splines T<0 (or T>1) correspond to parts of the curve before the first (after the last) point * for periodic splines T<0 (or T>1) are projected into [0,1] by making T=T-floor(T). OUTPUT PARAMETERS: X - X-value DX - X-derivative Y - Y-value DY - Y-derivative Z - Z-value DZ - Z-derivative -- ALGLIB PROJECT -- Copyright 28.05.2010 by Bochkanov Sergey *************************************************************************/ void pspline3diff(const pspline3interpolant &p, const double t, double &x, double &dx, double &y, double &dy, double &z, double &dz) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::pspline3diff(const_cast(p.c_ptr()), t, &x, &dx, &y, &dy, &z, &dz, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function calculates first and second derivative with respect to T. INPUT PARAMETERS: P - parametric spline interpolant T - point: * T in [0,1] corresponds to interval spanned by points * for non-periodic splines T<0 (or T>1) correspond to parts of the curve before the first (after the last) point * for periodic splines T<0 (or T>1) are projected into [0,1] by making T=T-floor(T). OUTPUT PARAMETERS: X - X-value DX - derivative D2X - second derivative Y - Y-value DY - derivative D2Y - second derivative -- ALGLIB PROJECT -- Copyright 28.05.2010 by Bochkanov Sergey *************************************************************************/ void pspline2diff2(const pspline2interpolant &p, const double t, double &x, double &dx, double &d2x, double &y, double &dy, double &d2y) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::pspline2diff2(const_cast(p.c_ptr()), t, &x, &dx, &d2x, &y, &dy, &d2y, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function calculates first and second derivative with respect to T. INPUT PARAMETERS: P - parametric spline interpolant T - point: * T in [0,1] corresponds to interval spanned by points * for non-periodic splines T<0 (or T>1) correspond to parts of the curve before the first (after the last) point * for periodic splines T<0 (or T>1) are projected into [0,1] by making T=T-floor(T). 
OUTPUT PARAMETERS: X - X-value DX - derivative D2X - second derivative Y - Y-value DY - derivative D2Y - second derivative Z - Z-value DZ - derivative D2Z - second derivative -- ALGLIB PROJECT -- Copyright 28.05.2010 by Bochkanov Sergey *************************************************************************/ void pspline3diff2(const pspline3interpolant &p, const double t, double &x, double &dx, double &d2x, double &y, double &dy, double &d2y, double &z, double &dz, double &d2z) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::pspline3diff2(const_cast(p.c_ptr()), t, &x, &dx, &d2x, &y, &dy, &d2y, &z, &dz, &d2z, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function calculates arc length, i.e. length of curve between t=a and t=b. INPUT PARAMETERS: P - parametric spline interpolant A,B - parameter values corresponding to arc ends: * B>A will result in positive length returned * B(p.c_ptr()), a, b, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function calculates arc length, i.e. length of curve between t=a and t=b. INPUT PARAMETERS: P - parametric spline interpolant A,B - parameter values corresponding to arc ends: * B>A will result in positive length returned * B(p.c_ptr()), a, b, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This subroutine fits piecewise linear curve to points with Ramer-Douglas- Peucker algorithm. This function performs PARAMETRIC fit, i.e. it can be used to fit curves like circles. On input it accepts dataset which describes parametric multidimensional curve X(t), with X being vector, and t taking values in [0,N), where N is a number of points in dataset. As result, it returns reduced dataset X2, which can be used to build parametric curve X2(t), which approximates X(t) with desired precision (or has specified number of sections). 
INPUT PARAMETERS: X - array of multidimensional points: * at least N elements, leading N elements are used if more than N elements were specified * order of points is IMPORTANT because it is parametric fit * each row of array is one point which has D coordinates N - number of elements in X D - number of dimensions (elements per row of X) StopM - stopping condition - desired number of sections: * at most M sections are generated by this function * less than M sections can be generated if we have N(x.c_ptr()), n, d, stopm, stopeps, const_cast(x2.c_ptr()), const_cast(idx2.c_ptr()), &nsections, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* 3-dimensional spline inteprolant *************************************************************************/ _spline3dinterpolant_owner::_spline3dinterpolant_owner() { p_struct = (alglib_impl::spline3dinterpolant*)alglib_impl::ae_malloc(sizeof(alglib_impl::spline3dinterpolant), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_spline3dinterpolant_init(p_struct, NULL); } _spline3dinterpolant_owner::_spline3dinterpolant_owner(const _spline3dinterpolant_owner &rhs) { p_struct = (alglib_impl::spline3dinterpolant*)alglib_impl::ae_malloc(sizeof(alglib_impl::spline3dinterpolant), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_spline3dinterpolant_init_copy(p_struct, const_cast(rhs.p_struct), NULL); } _spline3dinterpolant_owner& _spline3dinterpolant_owner::operator=(const _spline3dinterpolant_owner &rhs) { if( this==&rhs ) return *this; alglib_impl::_spline3dinterpolant_clear(p_struct); alglib_impl::_spline3dinterpolant_init_copy(p_struct, const_cast(rhs.p_struct), NULL); return *this; } _spline3dinterpolant_owner::~_spline3dinterpolant_owner() { alglib_impl::_spline3dinterpolant_clear(p_struct); ae_free(p_struct); } alglib_impl::spline3dinterpolant* _spline3dinterpolant_owner::c_ptr() { return p_struct; } alglib_impl::spline3dinterpolant* _spline3dinterpolant_owner::c_ptr() const { return const_cast(p_struct); } spline3dinterpolant::spline3dinterpolant() : _spline3dinterpolant_owner() { } spline3dinterpolant::spline3dinterpolant(const spline3dinterpolant &rhs):_spline3dinterpolant_owner(rhs) { } spline3dinterpolant& spline3dinterpolant::operator=(const spline3dinterpolant &rhs) { if( this==&rhs ) return *this; _spline3dinterpolant_owner::operator=(rhs); return *this; } spline3dinterpolant::~spline3dinterpolant() { } /************************************************************************* This subroutine calculates the value of the trilinear or tricubic spline at the given point (X,Y,Z). INPUT PARAMETERS: C - coefficients table. Built by BuildBilinearSpline or BuildBicubicSpline. 
X, Y, Z - point Result: S(x,y,z) -- ALGLIB PROJECT -- Copyright 26.04.2012 by Bochkanov Sergey *************************************************************************/ double spline3dcalc(const spline3dinterpolant &c, const double x, const double y, const double z) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::spline3dcalc(const_cast(c.c_ptr()), x, y, z, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This subroutine performs linear transformation of the spline argument. INPUT PARAMETERS: C - spline interpolant AX, BX - transformation coefficients: x = A*u + B AY, BY - transformation coefficients: y = A*v + B AZ, BZ - transformation coefficients: z = A*w + B OUTPUT PARAMETERS: C - transformed spline -- ALGLIB PROJECT -- Copyright 26.04.2012 by Bochkanov Sergey *************************************************************************/ void spline3dlintransxyz(const spline3dinterpolant &c, const double ax, const double bx, const double ay, const double by, const double az, const double bz) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::spline3dlintransxyz(const_cast(c.c_ptr()), ax, bx, ay, by, az, bz, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This subroutine performs linear transformation of the spline. INPUT PARAMETERS: C - spline interpolant. A, B- transformation coefficients: S2(x,y) = A*S(x,y,z) + B OUTPUT PARAMETERS: C - transformed spline -- ALGLIB PROJECT -- Copyright 26.04.2012 by Bochkanov Sergey *************************************************************************/ void spline3dlintransf(const spline3dinterpolant &c, const double a, const double b) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::spline3dlintransf(const_cast(c.c_ptr()), a, b, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Trilinear spline resampling INPUT PARAMETERS: A - array[0..OldXCount*OldYCount*OldZCount-1], function values at the old grid, : A[0] x=0,y=0,z=0 A[1] x=1,y=0,z=0 A[..] ... A[..] x=oldxcount-1,y=0,z=0 A[..] x=0,y=1,z=0 A[..] ... ... OldZCount - old Z-count, OldZCount>1 OldYCount - old Y-count, OldYCount>1 OldXCount - old X-count, OldXCount>1 NewZCount - new Z-count, NewZCount>1 NewYCount - new Y-count, NewYCount>1 NewXCount - new X-count, NewXCount>1 OUTPUT PARAMETERS: B - array[0..NewXCount*NewYCount*NewZCount-1], function values at the new grid: B[0] x=0,y=0,z=0 B[1] x=1,y=0,z=0 B[..] ... B[..] x=newxcount-1,y=0,z=0 B[..] x=0,y=1,z=0 B[..] ... ... 
-- ALGLIB routine -- 26.04.2012 Copyright by Bochkanov Sergey *************************************************************************/ void spline3dresampletrilinear(const real_1d_array &a, const ae_int_t oldzcount, const ae_int_t oldycount, const ae_int_t oldxcount, const ae_int_t newzcount, const ae_int_t newycount, const ae_int_t newxcount, real_1d_array &b) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::spline3dresampletrilinear(const_cast(a.c_ptr()), oldzcount, oldycount, oldxcount, newzcount, newycount, newxcount, const_cast(b.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This subroutine builds trilinear vector-valued spline. INPUT PARAMETERS: X - spline abscissas, array[0..N-1] Y - spline ordinates, array[0..M-1] Z - spline applicates, array[0..L-1] F - function values, array[0..M*N*L*D-1]: * first D elements store D values at (X[0],Y[0],Z[0]) * next D elements store D values at (X[1],Y[0],Z[0]) * next D elements store D values at (X[2],Y[0],Z[0]) * ... * next D elements store D values at (X[0],Y[1],Z[0]) * next D elements store D values at (X[1],Y[1],Z[0]) * next D elements store D values at (X[2],Y[1],Z[0]) * ... * next D elements store D values at (X[0],Y[0],Z[1]) * next D elements store D values at (X[1],Y[0],Z[1]) * next D elements store D values at (X[2],Y[0],Z[1]) * ... * general form - D function values at (X[i],Y[j]) are stored at F[D*(N*(M*K+J)+I)...D*(N*(M*K+J)+I)+D-1]. M,N, L - grid size, M>=2, N>=2, L>=2 D - vector dimension, D>=1 OUTPUT PARAMETERS: C - spline interpolant -- ALGLIB PROJECT -- Copyright 26.04.2012 by Bochkanov Sergey *************************************************************************/ void spline3dbuildtrilinearv(const real_1d_array &x, const ae_int_t n, const real_1d_array &y, const ae_int_t m, const real_1d_array &z, const ae_int_t l, const real_1d_array &f, const ae_int_t d, spline3dinterpolant &c) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::spline3dbuildtrilinearv(const_cast(x.c_ptr()), n, const_cast(y.c_ptr()), m, const_cast(z.c_ptr()), l, const_cast(f.c_ptr()), d, const_cast(c.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This subroutine calculates bilinear or bicubic vector-valued spline at the given point (X,Y,Z). INPUT PARAMETERS: C - spline interpolant. X, Y, Z - point F - output buffer, possibly preallocated array. In case array size is large enough to store result, it is not reallocated. 
Array which is too short will be reallocated OUTPUT PARAMETERS: F - array[D] (or larger) which stores function values -- ALGLIB PROJECT -- Copyright 26.04.2012 by Bochkanov Sergey *************************************************************************/ void spline3dcalcvbuf(const spline3dinterpolant &c, const double x, const double y, const double z, real_1d_array &f) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::spline3dcalcvbuf(const_cast(c.c_ptr()), x, y, z, const_cast(f.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This subroutine calculates trilinear or tricubic vector-valued spline at the given point (X,Y,Z). INPUT PARAMETERS: C - spline interpolant. X, Y, Z - point OUTPUT PARAMETERS: F - array[D] which stores function values. F is out-parameter and it is reallocated after call to this function. In case you want to reuse previously allocated F, you may use Spline2DCalcVBuf(), which reallocates F only when it is too small. -- ALGLIB PROJECT -- Copyright 26.04.2012 by Bochkanov Sergey *************************************************************************/ void spline3dcalcv(const spline3dinterpolant &c, const double x, const double y, const double z, real_1d_array &f) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::spline3dcalcv(const_cast(c.c_ptr()), x, y, z, const_cast(f.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This subroutine unpacks tri-dimensional spline into the coefficients table INPUT PARAMETERS: C - spline interpolant. Result: N - grid size (X) M - grid size (Y) L - grid size (Z) D - number of components SType- spline type. Currently, only one spline type is supported: trilinear spline, as indicated by SType=1. Tbl - spline coefficients: [0..(N-1)*(M-1)*(L-1)*D-1, 0..13]. For T=0..D-1 (component index), I = 0...N-2 (x index), J=0..M-2 (y index), K=0..L-2 (z index): Q := T + I*D + J*D*(N-1) + K*D*(N-1)*(M-1), Q-th row stores decomposition for T-th component of the vector-valued function Tbl[Q,0] = X[i] Tbl[Q,1] = X[i+1] Tbl[Q,2] = Y[j] Tbl[Q,3] = Y[j+1] Tbl[Q,4] = Z[k] Tbl[Q,5] = Z[k+1] Tbl[Q,6] = C000 Tbl[Q,7] = C100 Tbl[Q,8] = C010 Tbl[Q,9] = C110 Tbl[Q,10]= C001 Tbl[Q,11]= C101 Tbl[Q,12]= C011 Tbl[Q,13]= C111 On each grid square spline is equals to: S(x) = SUM(c[i,j,k]*(x^i)*(y^j)*(z^k), i=0..1, j=0..1, k=0..1) t = x-x[j] u = y-y[i] v = z-z[k] NOTE: format of Tbl is given for SType=1. Future versions of ALGLIB can use different formats for different values of SType. 
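EXAMPLE (illustrative sketch of reading the unpacked table; it assumes a spline3dinterpolant C was built earlier with Spline3DBuildTrilinearV, and the cell/component indices below are hypothetical):

    alglib::ae_int_t n, m, l, d, stype;
    alglib::real_2d_array tbl;
    alglib::spline3dunpackv(c, n, m, l, d, stype, tbl);
    alglib::ae_int_t i = 0, j = 0, k = 0, t = 0;                  // hypothetical cell (i,j,k), component t
    alglib::ae_int_t q = t + i*d + j*d*(n-1) + k*d*(n-1)*(m-1);   // row index Q, as defined above
    double x0   = tbl(q, 0),  x1   = tbl(q, 1);                   // cell bounds along X
    double c000 = tbl(q, 6),  c111 = tbl(q, 13);                  // two of the eight corner coefficients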
-- ALGLIB PROJECT -- Copyright 26.04.2012 by Bochkanov Sergey *************************************************************************/ void spline3dunpackv(const spline3dinterpolant &c, ae_int_t &n, ae_int_t &m, ae_int_t &l, ae_int_t &d, ae_int_t &stype, real_2d_array &tbl) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::spline3dunpackv(const_cast(c.c_ptr()), &n, &m, &l, &d, &stype, const_cast(tbl.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Conversion from barycentric representation to Chebyshev basis. This function has O(N^2) complexity. INPUT PARAMETERS: P - polynomial in barycentric form A,B - base interval for Chebyshev polynomials (see below) A<>B OUTPUT PARAMETERS T - coefficients of Chebyshev representation; P(x) = sum { T[i]*Ti(2*(x-A)/(B-A)-1), i=0..N-1 }, where Ti - I-th Chebyshev polynomial. NOTES: barycentric interpolant passed as P may be either polynomial obtained from polynomial interpolation/ fitting or rational function which is NOT polynomial. We can't distinguish between these two cases, and this algorithm just tries to work assuming that P IS a polynomial. If not, algorithm will return results, but they won't have any meaning. -- ALGLIB -- Copyright 30.09.2010 by Bochkanov Sergey *************************************************************************/ void polynomialbar2cheb(const barycentricinterpolant &p, const double a, const double b, real_1d_array &t) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::polynomialbar2cheb(const_cast(p.c_ptr()), a, b, const_cast(t.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Conversion from Chebyshev basis to barycentric representation. This function has O(N^2) complexity. INPUT PARAMETERS: T - coefficients of Chebyshev representation; P(x) = sum { T[i]*Ti(2*(x-A)/(B-A)-1), i=0..N }, where Ti - I-th Chebyshev polynomial. N - number of coefficients: * if given, only leading N elements of T are used * if not given, automatically determined from size of T A,B - base interval for Chebyshev polynomials (see above) A(t.c_ptr()), n, a, b, const_cast(p.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Conversion from Chebyshev basis to barycentric representation. This function has O(N^2) complexity. INPUT PARAMETERS: T - coefficients of Chebyshev representation; P(x) = sum { T[i]*Ti(2*(x-A)/(B-A)-1), i=0..N }, where Ti - I-th Chebyshev polynomial. 
N - number of coefficients: * if given, only leading N elements of T are used * if not given, automatically determined from size of T A,B - base interval for Chebyshev polynomials (see above) A(t.c_ptr()), n, a, b, const_cast(p.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Conversion from barycentric representation to power basis. This function has O(N^2) complexity. INPUT PARAMETERS: P - polynomial in barycentric form C - offset (see below); 0.0 is used as default value. S - scale (see below); 1.0 is used as default value. S<>0. OUTPUT PARAMETERS A - coefficients, P(x) = sum { A[i]*((X-C)/S)^i, i=0..N-1 } N - number of coefficients (polynomial degree plus 1) NOTES: 1. this function accepts offset and scale, which can be set to improve numerical properties of polynomial. For example, if P was obtained as result of interpolation on [-1,+1], you can set C=0 and S=1 and represent P as sum of 1, x, x^2, x^3 and so on. In most cases you it is exactly what you need. However, if your interpolation model was built on [999,1001], you will see significant growth of numerical errors when using {1, x, x^2, x^3} as basis. Representing P as sum of 1, (x-1000), (x-1000)^2, (x-1000)^3 will be better option. Such representation can be obtained by using 1000.0 as offset C and 1.0 as scale S. 2. power basis is ill-conditioned and tricks described above can't solve this problem completely. This function will return coefficients in any case, but for N>8 they will become unreliable. However, N's less than 5 are pretty safe. 3. barycentric interpolant passed as P may be either polynomial obtained from polynomial interpolation/ fitting or rational function which is NOT polynomial. We can't distinguish between these two cases, and this algorithm just tries to work assuming that P IS a polynomial. If not, algorithm will return results, but they won't have any meaning. -- ALGLIB -- Copyright 30.09.2010 by Bochkanov Sergey *************************************************************************/ void polynomialbar2pow(const barycentricinterpolant &p, const double c, const double s, real_1d_array &a) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::polynomialbar2pow(const_cast(p.c_ptr()), c, s, const_cast(a.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Conversion from barycentric representation to power basis. This function has O(N^2) complexity. INPUT PARAMETERS: P - polynomial in barycentric form C - offset (see below); 0.0 is used as default value. S - scale (see below); 1.0 is used as default value. S<>0. OUTPUT PARAMETERS A - coefficients, P(x) = sum { A[i]*((X-C)/S)^i, i=0..N-1 } N - number of coefficients (polynomial degree plus 1) NOTES: 1. this function accepts offset and scale, which can be set to improve numerical properties of polynomial. For example, if P was obtained as result of interpolation on [-1,+1], you can set C=0 and S=1 and represent P as sum of 1, x, x^2, x^3 and so on. In most cases you it is exactly what you need. 
However, if your interpolation model was built on [999,1001], you will see significant growth of numerical errors when using {1, x, x^2, x^3} as basis. Representing P as sum of 1, (x-1000), (x-1000)^2, (x-1000)^3 will be better option. Such representation can be obtained by using 1000.0 as offset C and 1.0 as scale S. 2. power basis is ill-conditioned and tricks described above can't solve this problem completely. This function will return coefficients in any case, but for N>8 they will become unreliable. However, N's less than 5 are pretty safe. 3. barycentric interpolant passed as P may be either polynomial obtained from polynomial interpolation/ fitting or rational function which is NOT polynomial. We can't distinguish between these two cases, and this algorithm just tries to work assuming that P IS a polynomial. If not, algorithm will return results, but they won't have any meaning. -- ALGLIB -- Copyright 30.09.2010 by Bochkanov Sergey *************************************************************************/ void polynomialbar2pow(const barycentricinterpolant &p, real_1d_array &a) { alglib_impl::ae_state _alglib_env_state; double c; double s; c = 0; s = 1; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::polynomialbar2pow(const_cast(p.c_ptr()), c, s, const_cast(a.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Conversion from power basis to barycentric representation. This function has O(N^2) complexity. INPUT PARAMETERS: A - coefficients, P(x) = sum { A[i]*((X-C)/S)^i, i=0..N-1 } N - number of coefficients (polynomial degree plus 1) * if given, only leading N elements of A are used * if not given, automatically determined from size of A C - offset (see below); 0.0 is used as default value. S - scale (see below); 1.0 is used as default value. S<>0. OUTPUT PARAMETERS P - polynomial in barycentric form NOTES: 1. this function accepts offset and scale, which can be set to improve numerical properties of polynomial. For example, if you interpolate on [-1,+1], you can set C=0 and S=1 and convert from sum of 1, x, x^2, x^3 and so on. In most cases you it is exactly what you need. However, if your interpolation model was built on [999,1001], you will see significant growth of numerical errors when using {1, x, x^2, x^3} as input basis. Converting from sum of 1, (x-1000), (x-1000)^2, (x-1000)^3 will be better option (you have to specify 1000.0 as offset C and 1.0 as scale S). 2. power basis is ill-conditioned and tricks described above can't solve this problem completely. This function will return barycentric model in any case, but for N>8 accuracy well degrade. However, N's less than 5 are pretty safe. 
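EXAMPLE (illustrative sketch of a power-to-barycentric round trip with default offset/scale; assumes #include "interpolation.h" and using namespace alglib):

    real_1d_array a = "[1.0, 2.0, 3.0]";        // P(x) = 1 + 2*x + 3*x^2  (C=0, S=1)
    barycentricinterpolant p;
    polynomialpow2bar(a, p);                    // power basis -> barycentric form
    double v = barycentriccalc(p, 2.0);         // 1 + 4 + 12 = 17
    real_1d_array a2;
    polynomialbar2pow(p, a2);                   // back to power basis, a2 is approximately [1,2,3]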
-- ALGLIB -- Copyright 30.09.2010 by Bochkanov Sergey *************************************************************************/ void polynomialpow2bar(const real_1d_array &a, const ae_int_t n, const double c, const double s, barycentricinterpolant &p) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::polynomialpow2bar(const_cast(a.c_ptr()), n, c, s, const_cast(p.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Conversion from power basis to barycentric representation. This function has O(N^2) complexity. INPUT PARAMETERS: A - coefficients, P(x) = sum { A[i]*((X-C)/S)^i, i=0..N-1 } N - number of coefficients (polynomial degree plus 1) * if given, only leading N elements of A are used * if not given, automatically determined from size of A C - offset (see below); 0.0 is used as default value. S - scale (see below); 1.0 is used as default value. S<>0. OUTPUT PARAMETERS P - polynomial in barycentric form NOTES: 1. this function accepts offset and scale, which can be set to improve numerical properties of polynomial. For example, if you interpolate on [-1,+1], you can set C=0 and S=1 and convert from sum of 1, x, x^2, x^3 and so on. In most cases you it is exactly what you need. However, if your interpolation model was built on [999,1001], you will see significant growth of numerical errors when using {1, x, x^2, x^3} as input basis. Converting from sum of 1, (x-1000), (x-1000)^2, (x-1000)^3 will be better option (you have to specify 1000.0 as offset C and 1.0 as scale S). 2. power basis is ill-conditioned and tricks described above can't solve this problem completely. This function will return barycentric model in any case, but for N>8 accuracy well degrade. However, N's less than 5 are pretty safe. -- ALGLIB -- Copyright 30.09.2010 by Bochkanov Sergey *************************************************************************/ void polynomialpow2bar(const real_1d_array &a, barycentricinterpolant &p) { alglib_impl::ae_state _alglib_env_state; ae_int_t n; double c; double s; n = a.length(); c = 0; s = 1; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::polynomialpow2bar(const_cast(a.c_ptr()), n, c, s, const_cast(p.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Lagrange intepolant: generation of the model on the general grid. This function has O(N^2) complexity. INPUT PARAMETERS: X - abscissas, array[0..N-1] Y - function values, array[0..N-1] N - number of points, N>=1 OUTPUT PARAMETERS P - barycentric model which represents Lagrange interpolant (see ratint unit info and BarycentricCalc() description for more information). 
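EXAMPLE (illustrative sketch; assumes #include "interpolation.h" and using namespace alglib):

    real_1d_array x = "[0.0, 1.0, 2.0, 3.0]";
    real_1d_array y = "[0.0, 1.0, 4.0, 9.0]";   // samples of y = x^2
    barycentricinterpolant p;
    polynomialbuild(x, y, p);                   // N is deduced from the array sizes
    double v = barycentriccalc(p, 1.5);         // 2.25, since the data are exactly quadratic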
-- ALGLIB -- Copyright 02.12.2009 by Bochkanov Sergey *************************************************************************/ void polynomialbuild(const real_1d_array &x, const real_1d_array &y, const ae_int_t n, barycentricinterpolant &p) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::polynomialbuild(const_cast(x.c_ptr()), const_cast(y.c_ptr()), n, const_cast(p.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Lagrange intepolant: generation of the model on the general grid. This function has O(N^2) complexity. INPUT PARAMETERS: X - abscissas, array[0..N-1] Y - function values, array[0..N-1] N - number of points, N>=1 OUTPUT PARAMETERS P - barycentric model which represents Lagrange interpolant (see ratint unit info and BarycentricCalc() description for more information). -- ALGLIB -- Copyright 02.12.2009 by Bochkanov Sergey *************************************************************************/ void polynomialbuild(const real_1d_array &x, const real_1d_array &y, barycentricinterpolant &p) { alglib_impl::ae_state _alglib_env_state; ae_int_t n; if( (x.length()!=y.length())) throw ap_error("Error while calling 'polynomialbuild': looks like one of arguments has wrong size"); n = x.length(); alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::polynomialbuild(const_cast(x.c_ptr()), const_cast(y.c_ptr()), n, const_cast(p.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Lagrange intepolant: generation of the model on equidistant grid. This function has O(N) complexity. INPUT PARAMETERS: A - left boundary of [A,B] B - right boundary of [A,B] Y - function values at the nodes, array[0..N-1] N - number of points, N>=1 for N=1 a constant model is constructed. OUTPUT PARAMETERS P - barycentric model which represents Lagrange interpolant (see ratint unit info and BarycentricCalc() description for more information). -- ALGLIB -- Copyright 03.12.2009 by Bochkanov Sergey *************************************************************************/ void polynomialbuildeqdist(const double a, const double b, const real_1d_array &y, const ae_int_t n, barycentricinterpolant &p) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::polynomialbuildeqdist(a, b, const_cast(y.c_ptr()), n, const_cast(p.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Lagrange intepolant: generation of the model on equidistant grid. This function has O(N) complexity. INPUT PARAMETERS: A - left boundary of [A,B] B - right boundary of [A,B] Y - function values at the nodes, array[0..N-1] N - number of points, N>=1 for N=1 a constant model is constructed. OUTPUT PARAMETERS P - barycentric model which represents Lagrange interpolant (see ratint unit info and BarycentricCalc() description for more information). 
-- ALGLIB -- Copyright 03.12.2009 by Bochkanov Sergey *************************************************************************/ void polynomialbuildeqdist(const double a, const double b, const real_1d_array &y, barycentricinterpolant &p) { alglib_impl::ae_state _alglib_env_state; ae_int_t n; n = y.length(); alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::polynomialbuildeqdist(a, b, const_cast(y.c_ptr()), n, const_cast(p.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Lagrange intepolant on Chebyshev grid (first kind). This function has O(N) complexity. INPUT PARAMETERS: A - left boundary of [A,B] B - right boundary of [A,B] Y - function values at the nodes, array[0..N-1], Y[I] = Y(0.5*(B+A) + 0.5*(B-A)*Cos(PI*(2*i+1)/(2*n))) N - number of points, N>=1 for N=1 a constant model is constructed. OUTPUT PARAMETERS P - barycentric model which represents Lagrange interpolant (see ratint unit info and BarycentricCalc() description for more information). -- ALGLIB -- Copyright 03.12.2009 by Bochkanov Sergey *************************************************************************/ void polynomialbuildcheb1(const double a, const double b, const real_1d_array &y, const ae_int_t n, barycentricinterpolant &p) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::polynomialbuildcheb1(a, b, const_cast(y.c_ptr()), n, const_cast(p.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Lagrange intepolant on Chebyshev grid (first kind). This function has O(N) complexity. INPUT PARAMETERS: A - left boundary of [A,B] B - right boundary of [A,B] Y - function values at the nodes, array[0..N-1], Y[I] = Y(0.5*(B+A) + 0.5*(B-A)*Cos(PI*(2*i+1)/(2*n))) N - number of points, N>=1 for N=1 a constant model is constructed. OUTPUT PARAMETERS P - barycentric model which represents Lagrange interpolant (see ratint unit info and BarycentricCalc() description for more information). -- ALGLIB -- Copyright 03.12.2009 by Bochkanov Sergey *************************************************************************/ void polynomialbuildcheb1(const double a, const double b, const real_1d_array &y, barycentricinterpolant &p) { alglib_impl::ae_state _alglib_env_state; ae_int_t n; n = y.length(); alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::polynomialbuildcheb1(a, b, const_cast(y.c_ptr()), n, const_cast(p.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Lagrange intepolant on Chebyshev grid (second kind). This function has O(N) complexity. INPUT PARAMETERS: A - left boundary of [A,B] B - right boundary of [A,B] Y - function values at the nodes, array[0..N-1], Y[I] = Y(0.5*(B+A) + 0.5*(B-A)*Cos(PI*i/(n-1))) N - number of points, N>=1 for N=1 a constant model is constructed. OUTPUT PARAMETERS P - barycentric model which represents Lagrange interpolant (see ratint unit info and BarycentricCalc() description for more information). 
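EXAMPLE (illustrative sketch; Y must be sampled exactly at the Chebyshev (second kind) nodes given above; assumes <math.h> and using namespace alglib, with exp() as a hypothetical function being interpolated):

    const double pi = 3.141592653589793;
    double a = -1.0, b = +1.0;
    ae_int_t n = 5;
    real_1d_array y;
    y.setlength(n);
    for(ae_int_t i = 0; i < n; i++)
    {
        double xi = 0.5*(b+a) + 0.5*(b-a)*cos(pi*i/(n-1));   // node of the second-kind grid
        y[i] = exp(xi);
    }
    barycentricinterpolant p;
    polynomialbuildcheb2(a, b, y, n, p);
    double v = barycentriccalc(p, 0.25);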
-- ALGLIB -- Copyright 03.12.2009 by Bochkanov Sergey *************************************************************************/ void polynomialbuildcheb2(const double a, const double b, const real_1d_array &y, const ae_int_t n, barycentricinterpolant &p) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::polynomialbuildcheb2(a, b, const_cast(y.c_ptr()), n, const_cast(p.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Lagrange intepolant on Chebyshev grid (second kind). This function has O(N) complexity. INPUT PARAMETERS: A - left boundary of [A,B] B - right boundary of [A,B] Y - function values at the nodes, array[0..N-1], Y[I] = Y(0.5*(B+A) + 0.5*(B-A)*Cos(PI*i/(n-1))) N - number of points, N>=1 for N=1 a constant model is constructed. OUTPUT PARAMETERS P - barycentric model which represents Lagrange interpolant (see ratint unit info and BarycentricCalc() description for more information). -- ALGLIB -- Copyright 03.12.2009 by Bochkanov Sergey *************************************************************************/ void polynomialbuildcheb2(const double a, const double b, const real_1d_array &y, barycentricinterpolant &p) { alglib_impl::ae_state _alglib_env_state; ae_int_t n; n = y.length(); alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::polynomialbuildcheb2(a, b, const_cast(y.c_ptr()), n, const_cast(p.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Fast equidistant polynomial interpolation function with O(N) complexity INPUT PARAMETERS: A - left boundary of [A,B] B - right boundary of [A,B] F - function values, array[0..N-1] N - number of points on equidistant grid, N>=1 for N=1 a constant model is constructed. T - position where P(x) is calculated RESULT value of the Lagrange interpolant at T IMPORTANT this function provides fast interface which is not overflow-safe nor it is very precise. the best option is to use PolynomialBuildEqDist()/BarycentricCalc() subroutines unless you are pretty sure that your data will not result in overflow. -- ALGLIB -- Copyright 02.12.2009 by Bochkanov Sergey *************************************************************************/ double polynomialcalceqdist(const double a, const double b, const real_1d_array &f, const ae_int_t n, const double t) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::polynomialcalceqdist(a, b, const_cast(f.c_ptr()), n, t, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Fast equidistant polynomial interpolation function with O(N) complexity INPUT PARAMETERS: A - left boundary of [A,B] B - right boundary of [A,B] F - function values, array[0..N-1] N - number of points on equidistant grid, N>=1 for N=1 a constant model is constructed. 
T - position where P(x) is calculated RESULT value of the Lagrange interpolant at T IMPORTANT this function provides fast interface which is not overflow-safe nor it is very precise. the best option is to use PolynomialBuildEqDist()/BarycentricCalc() subroutines unless you are pretty sure that your data will not result in overflow. -- ALGLIB -- Copyright 02.12.2009 by Bochkanov Sergey *************************************************************************/ double polynomialcalceqdist(const double a, const double b, const real_1d_array &f, const double t) { alglib_impl::ae_state _alglib_env_state; ae_int_t n; n = f.length(); alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::polynomialcalceqdist(a, b, const_cast(f.c_ptr()), n, t, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Fast polynomial interpolation function on Chebyshev points (first kind) with O(N) complexity. INPUT PARAMETERS: A - left boundary of [A,B] B - right boundary of [A,B] F - function values, array[0..N-1] N - number of points on Chebyshev grid (first kind), X[i] = 0.5*(B+A) + 0.5*(B-A)*Cos(PI*(2*i+1)/(2*n)) for N=1 a constant model is constructed. T - position where P(x) is calculated RESULT value of the Lagrange interpolant at T IMPORTANT this function provides fast interface which is not overflow-safe nor it is very precise. the best option is to use PolIntBuildCheb1()/BarycentricCalc() subroutines unless you are pretty sure that your data will not result in overflow. -- ALGLIB -- Copyright 02.12.2009 by Bochkanov Sergey *************************************************************************/ double polynomialcalccheb1(const double a, const double b, const real_1d_array &f, const ae_int_t n, const double t) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::polynomialcalccheb1(a, b, const_cast(f.c_ptr()), n, t, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Fast polynomial interpolation function on Chebyshev points (first kind) with O(N) complexity. INPUT PARAMETERS: A - left boundary of [A,B] B - right boundary of [A,B] F - function values, array[0..N-1] N - number of points on Chebyshev grid (first kind), X[i] = 0.5*(B+A) + 0.5*(B-A)*Cos(PI*(2*i+1)/(2*n)) for N=1 a constant model is constructed. T - position where P(x) is calculated RESULT value of the Lagrange interpolant at T IMPORTANT this function provides fast interface which is not overflow-safe nor it is very precise. the best option is to use PolIntBuildCheb1()/BarycentricCalc() subroutines unless you are pretty sure that your data will not result in overflow. 
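EXAMPLE (illustrative sketch contrasting the fast path with the safer route recommended above; it assumes A, B, F, T were declared earlier and F holds values at the Chebyshev (first kind) nodes defined above):

    // fast interface, not overflow-safe:
    double v1 = polynomialcalccheb1(a, b, f, t);
    // safer alternative: build the barycentric model once, then evaluate:
    barycentricinterpolant p;
    polynomialbuildcheb1(a, b, f, p);
    double v2 = barycentriccalc(p, t);           // v1 and v2 should agree up to rounding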
-- ALGLIB -- Copyright 02.12.2009 by Bochkanov Sergey *************************************************************************/ double polynomialcalccheb1(const double a, const double b, const real_1d_array &f, const double t) { alglib_impl::ae_state _alglib_env_state; ae_int_t n; n = f.length(); alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::polynomialcalccheb1(a, b, const_cast(f.c_ptr()), n, t, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Fast polynomial interpolation function on Chebyshev points (second kind) with O(N) complexity. INPUT PARAMETERS: A - left boundary of [A,B] B - right boundary of [A,B] F - function values, array[0..N-1] N - number of points on Chebyshev grid (second kind), X[i] = 0.5*(B+A) + 0.5*(B-A)*Cos(PI*i/(n-1)) for N=1 a constant model is constructed. T - position where P(x) is calculated RESULT value of the Lagrange interpolant at T IMPORTANT this function provides fast interface which is not overflow-safe nor it is very precise. the best option is to use PolIntBuildCheb2()/BarycentricCalc() subroutines unless you are pretty sure that your data will not result in overflow. -- ALGLIB -- Copyright 02.12.2009 by Bochkanov Sergey *************************************************************************/ double polynomialcalccheb2(const double a, const double b, const real_1d_array &f, const ae_int_t n, const double t) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::polynomialcalccheb2(a, b, const_cast(f.c_ptr()), n, t, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Fast polynomial interpolation function on Chebyshev points (second kind) with O(N) complexity. INPUT PARAMETERS: A - left boundary of [A,B] B - right boundary of [A,B] F - function values, array[0..N-1] N - number of points on Chebyshev grid (second kind), X[i] = 0.5*(B+A) + 0.5*(B-A)*Cos(PI*i/(n-1)) for N=1 a constant model is constructed. T - position where P(x) is calculated RESULT value of the Lagrange interpolant at T IMPORTANT this function provides fast interface which is not overflow-safe nor it is very precise. the best option is to use PolIntBuildCheb2()/BarycentricCalc() subroutines unless you are pretty sure that your data will not result in overflow. 
-- ALGLIB -- Copyright 02.12.2009 by Bochkanov Sergey *************************************************************************/ double polynomialcalccheb2(const double a, const double b, const real_1d_array &f, const double t) { alglib_impl::ae_state _alglib_env_state; ae_int_t n; n = f.length(); alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::polynomialcalccheb2(a, b, const_cast(f.c_ptr()), n, t, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Polynomial fitting report: TaskRCond reciprocal of task's condition number RMSError RMS error AvgError average error AvgRelError average relative error (for non-zero Y[I]) MaxError maximum error *************************************************************************/ _polynomialfitreport_owner::_polynomialfitreport_owner() { p_struct = (alglib_impl::polynomialfitreport*)alglib_impl::ae_malloc(sizeof(alglib_impl::polynomialfitreport), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_polynomialfitreport_init(p_struct, NULL); } _polynomialfitreport_owner::_polynomialfitreport_owner(const _polynomialfitreport_owner &rhs) { p_struct = (alglib_impl::polynomialfitreport*)alglib_impl::ae_malloc(sizeof(alglib_impl::polynomialfitreport), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_polynomialfitreport_init_copy(p_struct, const_cast(rhs.p_struct), NULL); } _polynomialfitreport_owner& _polynomialfitreport_owner::operator=(const _polynomialfitreport_owner &rhs) { if( this==&rhs ) return *this; alglib_impl::_polynomialfitreport_clear(p_struct); alglib_impl::_polynomialfitreport_init_copy(p_struct, const_cast(rhs.p_struct), NULL); return *this; } _polynomialfitreport_owner::~_polynomialfitreport_owner() { alglib_impl::_polynomialfitreport_clear(p_struct); ae_free(p_struct); } alglib_impl::polynomialfitreport* _polynomialfitreport_owner::c_ptr() { return p_struct; } alglib_impl::polynomialfitreport* _polynomialfitreport_owner::c_ptr() const { return const_cast(p_struct); } polynomialfitreport::polynomialfitreport() : _polynomialfitreport_owner() ,taskrcond(p_struct->taskrcond),rmserror(p_struct->rmserror),avgerror(p_struct->avgerror),avgrelerror(p_struct->avgrelerror),maxerror(p_struct->maxerror) { } polynomialfitreport::polynomialfitreport(const polynomialfitreport &rhs):_polynomialfitreport_owner(rhs) ,taskrcond(p_struct->taskrcond),rmserror(p_struct->rmserror),avgerror(p_struct->avgerror),avgrelerror(p_struct->avgrelerror),maxerror(p_struct->maxerror) { } polynomialfitreport& polynomialfitreport::operator=(const polynomialfitreport &rhs) { if( this==&rhs ) return *this; _polynomialfitreport_owner::operator=(rhs); return *this; } polynomialfitreport::~polynomialfitreport() { } /************************************************************************* Barycentric fitting report: RMSError RMS error AvgError average error AvgRelError average relative error (for non-zero Y[I]) MaxError maximum error TaskRCond reciprocal of task's condition number *************************************************************************/ _barycentricfitreport_owner::_barycentricfitreport_owner() { p_struct = (alglib_impl::barycentricfitreport*)alglib_impl::ae_malloc(sizeof(alglib_impl::barycentricfitreport), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: 
malloc error"); alglib_impl::_barycentricfitreport_init(p_struct, NULL); } _barycentricfitreport_owner::_barycentricfitreport_owner(const _barycentricfitreport_owner &rhs) { p_struct = (alglib_impl::barycentricfitreport*)alglib_impl::ae_malloc(sizeof(alglib_impl::barycentricfitreport), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_barycentricfitreport_init_copy(p_struct, const_cast(rhs.p_struct), NULL); } _barycentricfitreport_owner& _barycentricfitreport_owner::operator=(const _barycentricfitreport_owner &rhs) { if( this==&rhs ) return *this; alglib_impl::_barycentricfitreport_clear(p_struct); alglib_impl::_barycentricfitreport_init_copy(p_struct, const_cast(rhs.p_struct), NULL); return *this; } _barycentricfitreport_owner::~_barycentricfitreport_owner() { alglib_impl::_barycentricfitreport_clear(p_struct); ae_free(p_struct); } alglib_impl::barycentricfitreport* _barycentricfitreport_owner::c_ptr() { return p_struct; } alglib_impl::barycentricfitreport* _barycentricfitreport_owner::c_ptr() const { return const_cast(p_struct); } barycentricfitreport::barycentricfitreport() : _barycentricfitreport_owner() ,taskrcond(p_struct->taskrcond),dbest(p_struct->dbest),rmserror(p_struct->rmserror),avgerror(p_struct->avgerror),avgrelerror(p_struct->avgrelerror),maxerror(p_struct->maxerror) { } barycentricfitreport::barycentricfitreport(const barycentricfitreport &rhs):_barycentricfitreport_owner(rhs) ,taskrcond(p_struct->taskrcond),dbest(p_struct->dbest),rmserror(p_struct->rmserror),avgerror(p_struct->avgerror),avgrelerror(p_struct->avgrelerror),maxerror(p_struct->maxerror) { } barycentricfitreport& barycentricfitreport::operator=(const barycentricfitreport &rhs) { if( this==&rhs ) return *this; _barycentricfitreport_owner::operator=(rhs); return *this; } barycentricfitreport::~barycentricfitreport() { } /************************************************************************* Spline fitting report: RMSError RMS error AvgError average error AvgRelError average relative error (for non-zero Y[I]) MaxError maximum error Fields below are filled by obsolete functions (Spline1DFitCubic, Spline1DFitHermite). 
Modern fitting functions do NOT fill these fields: TaskRCond reciprocal of task's condition number *************************************************************************/ _spline1dfitreport_owner::_spline1dfitreport_owner() { p_struct = (alglib_impl::spline1dfitreport*)alglib_impl::ae_malloc(sizeof(alglib_impl::spline1dfitreport), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_spline1dfitreport_init(p_struct, NULL); } _spline1dfitreport_owner::_spline1dfitreport_owner(const _spline1dfitreport_owner &rhs) { p_struct = (alglib_impl::spline1dfitreport*)alglib_impl::ae_malloc(sizeof(alglib_impl::spline1dfitreport), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_spline1dfitreport_init_copy(p_struct, const_cast(rhs.p_struct), NULL); } _spline1dfitreport_owner& _spline1dfitreport_owner::operator=(const _spline1dfitreport_owner &rhs) { if( this==&rhs ) return *this; alglib_impl::_spline1dfitreport_clear(p_struct); alglib_impl::_spline1dfitreport_init_copy(p_struct, const_cast(rhs.p_struct), NULL); return *this; } _spline1dfitreport_owner::~_spline1dfitreport_owner() { alglib_impl::_spline1dfitreport_clear(p_struct); ae_free(p_struct); } alglib_impl::spline1dfitreport* _spline1dfitreport_owner::c_ptr() { return p_struct; } alglib_impl::spline1dfitreport* _spline1dfitreport_owner::c_ptr() const { return const_cast(p_struct); } spline1dfitreport::spline1dfitreport() : _spline1dfitreport_owner() ,taskrcond(p_struct->taskrcond),rmserror(p_struct->rmserror),avgerror(p_struct->avgerror),avgrelerror(p_struct->avgrelerror),maxerror(p_struct->maxerror) { } spline1dfitreport::spline1dfitreport(const spline1dfitreport &rhs):_spline1dfitreport_owner(rhs) ,taskrcond(p_struct->taskrcond),rmserror(p_struct->rmserror),avgerror(p_struct->avgerror),avgrelerror(p_struct->avgrelerror),maxerror(p_struct->maxerror) { } spline1dfitreport& spline1dfitreport::operator=(const spline1dfitreport &rhs) { if( this==&rhs ) return *this; _spline1dfitreport_owner::operator=(rhs); return *this; } spline1dfitreport::~spline1dfitreport() { } /************************************************************************* Least squares fitting report. This structure contains informational fields which are set by fitting functions provided by this unit. Different functions initialize different sets of fields, so you should read documentation on specific function you used in order to know which fields are initialized. TaskRCond reciprocal of task's condition number IterationsCount number of internal iterations VarIdx if user-supplied gradient contains errors which were detected by nonlinear fitter, this field is set to index of the first component of gradient which is suspected to be spoiled by bugs. RMSError RMS error AvgError average error AvgRelError average relative error (for non-zero Y[I]) MaxError maximum error WRMSError weighted RMS error CovPar covariance matrix for parameters, filled by some solvers ErrPar vector of errors in parameters, filled by some solvers ErrCurve vector of fit errors - variability of the best-fit curve, filled by some solvers. Noise vector of per-point noise estimates, filled by some solvers. R2 coefficient of determination (non-weighted, non-adjusted), filled by some solvers. 
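EXAMPLE (illustrative sketch of reading report fields after a linear least squares fit; lsfitlinear() is documented in the lsfit section of this manual, and the design matrix/data below are hypothetical):

    real_2d_array fmatrix = "[[1,0],[1,1],[1,2],[1,3]]";   // basis functions 1 and x at four points
    real_1d_array y = "[0.9, 2.1, 2.9, 4.2]";
    ae_int_t info;
    real_1d_array c;
    lsfitreport rep;
    lsfitlinear(y, fmatrix, info, c, rep);
    if( info>0 )
        printf("rms=%.4f r2=%.4f\n", double(rep.rmserror), double(rep.r2));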
*************************************************************************/ _lsfitreport_owner::_lsfitreport_owner() { p_struct = (alglib_impl::lsfitreport*)alglib_impl::ae_malloc(sizeof(alglib_impl::lsfitreport), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_lsfitreport_init(p_struct, NULL); } _lsfitreport_owner::_lsfitreport_owner(const _lsfitreport_owner &rhs) { p_struct = (alglib_impl::lsfitreport*)alglib_impl::ae_malloc(sizeof(alglib_impl::lsfitreport), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_lsfitreport_init_copy(p_struct, const_cast(rhs.p_struct), NULL); } _lsfitreport_owner& _lsfitreport_owner::operator=(const _lsfitreport_owner &rhs) { if( this==&rhs ) return *this; alglib_impl::_lsfitreport_clear(p_struct); alglib_impl::_lsfitreport_init_copy(p_struct, const_cast(rhs.p_struct), NULL); return *this; } _lsfitreport_owner::~_lsfitreport_owner() { alglib_impl::_lsfitreport_clear(p_struct); ae_free(p_struct); } alglib_impl::lsfitreport* _lsfitreport_owner::c_ptr() { return p_struct; } alglib_impl::lsfitreport* _lsfitreport_owner::c_ptr() const { return const_cast(p_struct); } lsfitreport::lsfitreport() : _lsfitreport_owner() ,taskrcond(p_struct->taskrcond),iterationscount(p_struct->iterationscount),varidx(p_struct->varidx),rmserror(p_struct->rmserror),avgerror(p_struct->avgerror),avgrelerror(p_struct->avgrelerror),maxerror(p_struct->maxerror),wrmserror(p_struct->wrmserror),covpar(&p_struct->covpar),errpar(&p_struct->errpar),errcurve(&p_struct->errcurve),noise(&p_struct->noise),r2(p_struct->r2) { } lsfitreport::lsfitreport(const lsfitreport &rhs):_lsfitreport_owner(rhs) ,taskrcond(p_struct->taskrcond),iterationscount(p_struct->iterationscount),varidx(p_struct->varidx),rmserror(p_struct->rmserror),avgerror(p_struct->avgerror),avgrelerror(p_struct->avgrelerror),maxerror(p_struct->maxerror),wrmserror(p_struct->wrmserror),covpar(&p_struct->covpar),errpar(&p_struct->errpar),errcurve(&p_struct->errcurve),noise(&p_struct->noise),r2(p_struct->r2) { } lsfitreport& lsfitreport::operator=(const lsfitreport &rhs) { if( this==&rhs ) return *this; _lsfitreport_owner::operator=(rhs); return *this; } lsfitreport::~lsfitreport() { } /************************************************************************* Nonlinear fitter. You should use ALGLIB functions to work with fitter. Never try to access its fields directly! 
*************************************************************************/ _lsfitstate_owner::_lsfitstate_owner() { p_struct = (alglib_impl::lsfitstate*)alglib_impl::ae_malloc(sizeof(alglib_impl::lsfitstate), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_lsfitstate_init(p_struct, NULL); } _lsfitstate_owner::_lsfitstate_owner(const _lsfitstate_owner &rhs) { p_struct = (alglib_impl::lsfitstate*)alglib_impl::ae_malloc(sizeof(alglib_impl::lsfitstate), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_lsfitstate_init_copy(p_struct, const_cast(rhs.p_struct), NULL); } _lsfitstate_owner& _lsfitstate_owner::operator=(const _lsfitstate_owner &rhs) { if( this==&rhs ) return *this; alglib_impl::_lsfitstate_clear(p_struct); alglib_impl::_lsfitstate_init_copy(p_struct, const_cast(rhs.p_struct), NULL); return *this; } _lsfitstate_owner::~_lsfitstate_owner() { alglib_impl::_lsfitstate_clear(p_struct); ae_free(p_struct); } alglib_impl::lsfitstate* _lsfitstate_owner::c_ptr() { return p_struct; } alglib_impl::lsfitstate* _lsfitstate_owner::c_ptr() const { return const_cast(p_struct); } lsfitstate::lsfitstate() : _lsfitstate_owner() ,needf(p_struct->needf),needfg(p_struct->needfg),needfgh(p_struct->needfgh),xupdated(p_struct->xupdated),c(&p_struct->c),f(p_struct->f),g(&p_struct->g),h(&p_struct->h),x(&p_struct->x) { } lsfitstate::lsfitstate(const lsfitstate &rhs):_lsfitstate_owner(rhs) ,needf(p_struct->needf),needfg(p_struct->needfg),needfgh(p_struct->needfgh),xupdated(p_struct->xupdated),c(&p_struct->c),f(p_struct->f),g(&p_struct->g),h(&p_struct->h),x(&p_struct->x) { } lsfitstate& lsfitstate::operator=(const lsfitstate &rhs) { if( this==&rhs ) return *this; _lsfitstate_owner::operator=(rhs); return *this; } lsfitstate::~lsfitstate() { } /************************************************************************* This subroutine fits piecewise linear curve to points with Ramer-Douglas- Peucker algorithm, which stops after generating specified number of linear sections. IMPORTANT: * it does NOT perform least-squares fitting; it builds curve, but this curve does not minimize some least squares metric. See description of RDP algorithm (say, in Wikipedia) for more details on WHAT is performed. * this function does NOT work with parametric curves (i.e. curves which can be represented as {X(t),Y(t)}. It works with curves which can be represented as Y(X). Thus, it is impossible to model figures like circles with this functions. If you want to work with parametric curves, you should use ParametricRDPFixed() function provided by "Parametric" subpackage of "Interpolation" package. 
INPUT PARAMETERS: X - array of X-coordinates: * at least N elements * can be unordered (points are automatically sorted) * this function may accept non-distinct X (see below for more information on handling of such inputs) Y - array of Y-coordinates: * at least N elements N - number of elements in X/Y M - desired number of sections: * at most M sections are generated by this function * less than M sections can be generated if we have N(x.c_ptr()), const_cast(y.c_ptr()), n, m, const_cast(x2.c_ptr()), const_cast(y2.c_ptr()), &nsections, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This subroutine fits piecewise linear curve to points with Ramer-Douglas- Peucker algorithm, which stops after achieving desired precision. IMPORTANT: * it performs non-least-squares fitting; it builds curve, but this curve does not minimize some least squares metric. See description of RDP algorithm (say, in Wikipedia) for more details on WHAT is performed. * this function does NOT work with parametric curves (i.e. curves which can be represented as {X(t),Y(t)}. It works with curves which can be represented as Y(X). Thus, it is impossible to model figures like circles with this functions. If you want to work with parametric curves, you should use ParametricRDPFixed() function provided by "Parametric" subpackage of "Interpolation" package. INPUT PARAMETERS: X - array of X-coordinates: * at least N elements * can be unordered (points are automatically sorted) * this function may accept non-distinct X (see below for more information on handling of such inputs) Y - array of Y-coordinates: * at least N elements N - number of elements in X/Y Eps - positive number, desired precision. OUTPUT PARAMETERS: X2 - X-values of corner points for piecewise approximation, has length NSections+1 or zero (for NSections=0). Y2 - Y-values of corner points, has length NSections+1 or zero (for NSections=0). NSections- number of sections found by algorithm, NSections can be zero for degenerate datasets (N<=1 or all X[] are non-distinct). NOTE: X2/Y2 are ordered arrays, i.e. (X2[0],Y2[0]) is a first point of curve, (X2[NSection-1],Y2[NSection-1]) is the last point. -- ALGLIB -- Copyright 02.10.2014 by Bochkanov Sergey *************************************************************************/ void lstfitpiecewiselinearrdp(const real_1d_array &x, const real_1d_array &y, const ae_int_t n, const double eps, real_1d_array &x2, real_1d_array &y2, ae_int_t &nsections) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::lstfitpiecewiselinearrdp(const_cast(x.c_ptr()), const_cast(y.c_ptr()), n, eps, const_cast(x2.c_ptr()), const_cast(y2.c_ptr()), &nsections, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Fitting by polynomials in barycentric form. This function provides simple unterface for unconstrained unweighted fitting. See PolynomialFitWC() if you need constrained fitting. Task is linear, so linear least squares solver is used. Complexity of this computational scheme is O(N*M^2), mostly dominated by least squares solver SEE ALSO: PolynomialFitWC() COMMERCIAL EDITION OF ALGLIB: ! 
Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multithreading support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Speed-up provided by multithreading greatly depends on problem size ! - only large problems (number of coefficients is more than 500) can be ! efficiently multithreaded. ! ! Generally, commercial ALGLIB is several times faster than open-source ! generic C edition, and many times faster than open-source C# edition. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS: X - points, array[0..N-1]. Y - function values, array[0..N-1]. N - number of points, N>0 * if given, only leading N elements of X/Y are used * if not given, automatically determined from sizes of X/Y M - number of basis functions (= polynomial_degree + 1), M>=1 OUTPUT PARAMETERS: Info- same format as in LSFitLinearW() subroutine: * Info>0 task is solved * Info<=0 an error occured: -4 means inconvergence of internal SVD P - interpolant in barycentric form. Rep - report, same format as in LSFitLinearW() subroutine. Following fields are set: * RMSError rms error on the (X,Y). * AvgError average error on the (X,Y). * AvgRelError average relative error on the non-zero Y * MaxError maximum error NON-WEIGHTED ERRORS ARE CALCULATED NOTES: you can convert P from barycentric form to the power or Chebyshev basis with PolynomialBar2Pow() or PolynomialBar2Cheb() functions from POLINT subpackage. -- ALGLIB PROJECT -- Copyright 10.12.2009 by Bochkanov Sergey *************************************************************************/ void polynomialfit(const real_1d_array &x, const real_1d_array &y, const ae_int_t n, const ae_int_t m, ae_int_t &info, barycentricinterpolant &p, polynomialfitreport &rep) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::polynomialfit(const_cast(x.c_ptr()), const_cast(y.c_ptr()), n, m, &info, const_cast(p.c_ptr()), const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void smp_polynomialfit(const real_1d_array &x, const real_1d_array &y, const ae_int_t n, const ae_int_t m, ae_int_t &info, barycentricinterpolant &p, polynomialfitreport &rep) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::_pexec_polynomialfit(const_cast(x.c_ptr()), const_cast(y.c_ptr()), n, m, &info, const_cast(p.c_ptr()), const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Fitting by polynomials in barycentric form. This function provides simple unterface for unconstrained unweighted fitting. See PolynomialFitWC() if you need constrained fitting. Task is linear, so linear least squares solver is used. 
Complexity of this computational scheme is O(N*M^2), mostly dominated by least squares solver SEE ALSO: PolynomialFitWC() COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multithreading support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Speed-up provided by multithreading greatly depends on problem size ! - only large problems (number of coefficients is more than 500) can be ! efficiently multithreaded. ! ! Generally, commercial ALGLIB is several times faster than open-source ! generic C edition, and many times faster than open-source C# edition. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS: X - points, array[0..N-1]. Y - function values, array[0..N-1]. N - number of points, N>0 * if given, only leading N elements of X/Y are used * if not given, automatically determined from sizes of X/Y M - number of basis functions (= polynomial_degree + 1), M>=1 OUTPUT PARAMETERS: Info- same format as in LSFitLinearW() subroutine: * Info>0 task is solved * Info<=0 an error occured: -4 means inconvergence of internal SVD P - interpolant in barycentric form. Rep - report, same format as in LSFitLinearW() subroutine. Following fields are set: * RMSError rms error on the (X,Y). * AvgError average error on the (X,Y). * AvgRelError average relative error on the non-zero Y * MaxError maximum error NON-WEIGHTED ERRORS ARE CALCULATED NOTES: you can convert P from barycentric form to the power or Chebyshev basis with PolynomialBar2Pow() or PolynomialBar2Cheb() functions from POLINT subpackage. 
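Example (illustrative sketch only, not part of the original listing): fitting five points by a degree-2 polynomial (M=3) via the "friendly" overload and evaluating the result with barycentriccalc(). Header names follow the C++ distribution; the sample data are hypothetical.

    #include "stdafx.h"
    #include <stdio.h>
    #include "interpolation.h"
    using namespace alglib;

    int main(int argc, char **argv)
    {
        // hypothetical sample data, roughly y=x^2
        real_1d_array x = "[0.0,1.0,2.0,3.0,4.0]";
        real_1d_array y = "[0.1,0.9,4.1,8.9,16.1]";
        ae_int_t info;
        barycentricinterpolant p;
        polynomialfitreport rep;

        // M=3 basis functions, i.e. polynomial of degree 2;
        // N is inferred from the array sizes by this overload
        polynomialfit(x, y, 3, info, p, rep);
        if( info>0 )
            printf("P(1.5)=%.3f, RMS error=%.3f\n", barycentriccalc(p, 1.5), rep.rmserror);
        return 0;
    }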
-- ALGLIB PROJECT -- Copyright 10.12.2009 by Bochkanov Sergey *************************************************************************/ void polynomialfit(const real_1d_array &x, const real_1d_array &y, const ae_int_t m, ae_int_t &info, barycentricinterpolant &p, polynomialfitreport &rep) { alglib_impl::ae_state _alglib_env_state; ae_int_t n; if( (x.length()!=y.length())) throw ap_error("Error while calling 'polynomialfit': looks like one of arguments has wrong size"); n = x.length(); alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::polynomialfit(const_cast(x.c_ptr()), const_cast(y.c_ptr()), n, m, &info, const_cast(p.c_ptr()), const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void smp_polynomialfit(const real_1d_array &x, const real_1d_array &y, const ae_int_t m, ae_int_t &info, barycentricinterpolant &p, polynomialfitreport &rep) { alglib_impl::ae_state _alglib_env_state; ae_int_t n; if( (x.length()!=y.length())) throw ap_error("Error while calling 'polynomialfit': looks like one of arguments has wrong size"); n = x.length(); alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::_pexec_polynomialfit(const_cast(x.c_ptr()), const_cast(y.c_ptr()), n, m, &info, const_cast(p.c_ptr()), const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Weighted fitting by polynomials in barycentric form, with constraints on function values or first derivatives. Small regularizing term is used when solving constrained tasks (to improve stability). Task is linear, so linear least squares solver is used. Complexity of this computational scheme is O(N*M^2), mostly dominated by least squares solver SEE ALSO: PolynomialFit() COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multithreading support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Speed-up provided by multithreading greatly depends on problem size ! - only large problems (number of coefficients is more than 500) can be ! efficiently multithreaded. ! ! Generally, commercial ALGLIB is several times faster than open-source ! generic C edition, and many times faster than open-source C# edition. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS: X - points, array[0..N-1]. Y - function values, array[0..N-1]. W - weights, array[0..N-1] Each summand in square sum of approximation deviations from given values is multiplied by the square of corresponding weight. Fill it by 1's if you don't want to solve weighted task. N - number of points, N>0. * if given, only leading N elements of X/Y/W are used * if not given, automatically determined from sizes of X/Y/W XC - points where polynomial values/derivatives are constrained, array[0..K-1]. 
YC - values of constraints, array[0..K-1] DC - array[0..K-1], types of constraints: * DC[i]=0 means that P(XC[i])=YC[i] * DC[i]=1 means that P'(XC[i])=YC[i] SEE BELOW FOR IMPORTANT INFORMATION ON CONSTRAINTS K - number of constraints, 0<=K<M. K=0 means no constraints (XC/YC/DC are not used in such cases) M - number of basis functions (= polynomial_degree + 1), M>=1 OUTPUT PARAMETERS: Info- same format as in LSFitLinearW() subroutine: * Info>0 task is solved * Info<=0 an error occurred: -4 means inconvergence of internal SVD -3 means inconsistent constraints P - interpolant in barycentric form. Rep - report, same format as in LSFitLinearW() subroutine. Following fields are set: * RMSError rms error on the (X,Y). * AvgError average error on the (X,Y). * AvgRelError average relative error on the non-zero Y * MaxError maximum error NON-WEIGHTED ERRORS ARE CALCULATED IMPORTANT: this subroutine doesn't calculate task's condition number for K<>0. NOTES: you can convert P from barycentric form to the power or Chebyshev basis with PolynomialBar2Pow() or PolynomialBar2Cheb() functions from POLINT subpackage. SETTING CONSTRAINTS - DANGERS AND OPPORTUNITIES: Setting constraints can lead to undesired results, like ill-conditioned behavior, or inconsistency being detected. On the other hand, it allows us to improve quality of the fit. Here we summarize our experience with constrained regression splines: * even simple constraints can be inconsistent, see Wikipedia article on this subject: http://en.wikipedia.org/wiki/Birkhoff_interpolation * the greater M is (given fixed constraints), the higher the chance that constraints will be consistent * in the general case, consistency of constraints is NOT GUARANTEED. * in one special case, however, we can guarantee consistency. This case is: M>1 and constraints on the function values (NOT DERIVATIVES) Our final recommendation is to use constraints WHEN AND ONLY WHEN you can't solve your task without them. Anything beyond special cases given above is not guaranteed and may result in inconsistency.
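Example (illustrative fragment, not part of the original listing; assumes "using namespace alglib" and hypothetical data): degree-2 fit with unit weights and a single value constraint P(0)=0, using the "friendly" overload which infers N and K from the array sizes.

    real_1d_array x  = "[1.0,2.0,3.0,4.0]";
    real_1d_array y  = "[0.9,4.1,9.2,15.9]";
    real_1d_array w  = "[1,1,1,1]";          // unit weights = unweighted fit
    real_1d_array xc = "[0.0]";              // constrain at X=0...
    real_1d_array yc = "[0.0]";              // ...to the value 0
    integer_1d_array dc = "[0]";             // DC[0]=0 => constraint on P(XC[0]), not on P'
    ae_int_t info;
    barycentricinterpolant p;
    polynomialfitreport rep;
    polynomialfitwc(x, y, w, xc, yc, dc, 3, info, p, rep);   // M=3, i.e. degree 2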
-- ALGLIB PROJECT -- Copyright 10.12.2009 by Bochkanov Sergey *************************************************************************/ void polynomialfitwc(const real_1d_array &x, const real_1d_array &y, const real_1d_array &w, const ae_int_t n, const real_1d_array &xc, const real_1d_array &yc, const integer_1d_array &dc, const ae_int_t k, const ae_int_t m, ae_int_t &info, barycentricinterpolant &p, polynomialfitreport &rep) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::polynomialfitwc(const_cast(x.c_ptr()), const_cast(y.c_ptr()), const_cast(w.c_ptr()), n, const_cast(xc.c_ptr()), const_cast(yc.c_ptr()), const_cast(dc.c_ptr()), k, m, &info, const_cast(p.c_ptr()), const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void smp_polynomialfitwc(const real_1d_array &x, const real_1d_array &y, const real_1d_array &w, const ae_int_t n, const real_1d_array &xc, const real_1d_array &yc, const integer_1d_array &dc, const ae_int_t k, const ae_int_t m, ae_int_t &info, barycentricinterpolant &p, polynomialfitreport &rep) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::_pexec_polynomialfitwc(const_cast(x.c_ptr()), const_cast(y.c_ptr()), const_cast(w.c_ptr()), n, const_cast(xc.c_ptr()), const_cast(yc.c_ptr()), const_cast(dc.c_ptr()), k, m, &info, const_cast(p.c_ptr()), const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Weighted fitting by polynomials in barycentric form, with constraints on function values or first derivatives. Small regularizing term is used when solving constrained tasks (to improve stability). Task is linear, so linear least squares solver is used. Complexity of this computational scheme is O(N*M^2), mostly dominated by least squares solver SEE ALSO: PolynomialFit() COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multithreading support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Speed-up provided by multithreading greatly depends on problem size ! - only large problems (number of coefficients is more than 500) can be ! efficiently multithreaded. ! ! Generally, commercial ALGLIB is several times faster than open-source ! generic C edition, and many times faster than open-source C# edition. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS: X - points, array[0..N-1]. Y - function values, array[0..N-1]. W - weights, array[0..N-1] Each summand in square sum of approximation deviations from given values is multiplied by the square of corresponding weight. Fill it by 1's if you don't want to solve weighted task. N - number of points, N>0. 
* if given, only leading N elements of X/Y/W are used * if not given, automatically determined from sizes of X/Y/W XC - points where polynomial values/derivatives are constrained, array[0..K-1]. YC - values of constraints, array[0..K-1] DC - array[0..K-1], types of constraints: * DC[i]=0 means that P(XC[i])=YC[i] * DC[i]=1 means that P'(XC[i])=YC[i] SEE BELOW FOR IMPORTANT INFORMATION ON CONSTRAINTS K - number of constraints, 0<=K=1 OUTPUT PARAMETERS: Info- same format as in LSFitLinearW() subroutine: * Info>0 task is solved * Info<=0 an error occured: -4 means inconvergence of internal SVD -3 means inconsistent constraints P - interpolant in barycentric form. Rep - report, same format as in LSFitLinearW() subroutine. Following fields are set: * RMSError rms error on the (X,Y). * AvgError average error on the (X,Y). * AvgRelError average relative error on the non-zero Y * MaxError maximum error NON-WEIGHTED ERRORS ARE CALCULATED IMPORTANT: this subroitine doesn't calculate task's condition number for K<>0. NOTES: you can convert P from barycentric form to the power or Chebyshev basis with PolynomialBar2Pow() or PolynomialBar2Cheb() functions from POLINT subpackage. SETTING CONSTRAINTS - DANGERS AND OPPORTUNITIES: Setting constraints can lead to undesired results, like ill-conditioned behavior, or inconsistency being detected. From the other side, it allows us to improve quality of the fit. Here we summarize our experience with constrained regression splines: * even simple constraints can be inconsistent, see Wikipedia article on this subject: http://en.wikipedia.org/wiki/Birkhoff_interpolation * the greater is M (given fixed constraints), the more chances that constraints will be consistent * in the general case, consistency of constraints is NOT GUARANTEED. * in the one special cases, however, we can guarantee consistency. This case is: M>1 and constraints on the function values (NOT DERIVATIVES) Our final recommendation is to use constraints WHEN AND ONLY when you can't solve your task without them. Anything beyond special cases given above is not guaranteed and may result in inconsistency. 
-- ALGLIB PROJECT -- Copyright 10.12.2009 by Bochkanov Sergey *************************************************************************/ void polynomialfitwc(const real_1d_array &x, const real_1d_array &y, const real_1d_array &w, const real_1d_array &xc, const real_1d_array &yc, const integer_1d_array &dc, const ae_int_t m, ae_int_t &info, barycentricinterpolant &p, polynomialfitreport &rep) { alglib_impl::ae_state _alglib_env_state; ae_int_t n; ae_int_t k; if( (x.length()!=y.length()) || (x.length()!=w.length())) throw ap_error("Error while calling 'polynomialfitwc': looks like one of arguments has wrong size"); if( (xc.length()!=yc.length()) || (xc.length()!=dc.length())) throw ap_error("Error while calling 'polynomialfitwc': looks like one of arguments has wrong size"); n = x.length(); k = xc.length(); alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::polynomialfitwc(const_cast(x.c_ptr()), const_cast(y.c_ptr()), const_cast(w.c_ptr()), n, const_cast(xc.c_ptr()), const_cast(yc.c_ptr()), const_cast(dc.c_ptr()), k, m, &info, const_cast(p.c_ptr()), const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void smp_polynomialfitwc(const real_1d_array &x, const real_1d_array &y, const real_1d_array &w, const real_1d_array &xc, const real_1d_array &yc, const integer_1d_array &dc, const ae_int_t m, ae_int_t &info, barycentricinterpolant &p, polynomialfitreport &rep) { alglib_impl::ae_state _alglib_env_state; ae_int_t n; ae_int_t k; if( (x.length()!=y.length()) || (x.length()!=w.length())) throw ap_error("Error while calling 'polynomialfitwc': looks like one of arguments has wrong size"); if( (xc.length()!=yc.length()) || (xc.length()!=dc.length())) throw ap_error("Error while calling 'polynomialfitwc': looks like one of arguments has wrong size"); n = x.length(); k = xc.length(); alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::_pexec_polynomialfitwc(const_cast(x.c_ptr()), const_cast(y.c_ptr()), const_cast(w.c_ptr()), n, const_cast(xc.c_ptr()), const_cast(yc.c_ptr()), const_cast(dc.c_ptr()), k, m, &info, const_cast(p.c_ptr()), const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function calculates value of four-parameter logistic (4PL) model at specified point X. 4PL model has following form: F(x|A,B,C,D) = D+(A-D)/(1+Power(x/C,B)) INPUT PARAMETERS: X - current point, X>=0: * zero X is correctly handled even for B<=0 * negative X results in exception. A, B, C, D- parameters of 4PL model: * A is unconstrained * B is unconstrained; zero or negative values are handled correctly. * C>0, non-positive value results in exception * D is unconstrained RESULT: model value at X NOTE: if B=0, denominator is assumed to be equal to 2.0 even for zero X (strictly speaking, 0^0 is undefined). NOTE: this function also throws exception if all input parameters are correct, but overflow was detected during calculations. NOTE: this function performs a lot of checks; if you need really high performance, consider evaluating model yourself, without checking for degenerate cases. 
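Example (illustrative fragment, not part of the original listing; parameter values are hypothetical): evaluating the 4PL model F(x|A,B,C,D) = D+(A-D)/(1+(x/C)^B) for B>0 and C>0.

    double a = 1.0, b = 2.0, c = 3.0, d = 10.0;      // A, B, C>0, D (hypothetical)
    double v0 = logisticcalc4(0.0, a, b, c, d);      // x=0 => F=A = 1.0 (since B>0)
    double vc = logisticcalc4(3.0, a, b, c, d);      // x=C => F=D+(A-D)/2 = 5.5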
-- ALGLIB PROJECT -- Copyright 14.05.2014 by Bochkanov Sergey *************************************************************************/ double logisticcalc4(const double x, const double a, const double b, const double c, const double d) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::logisticcalc4(x, a, b, c, d, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function calculates value of five-parameter logistic (5PL) model at specified point X. 5PL model has following form: F(x|A,B,C,D,G) = D+(A-D)/Power(1+Power(x/C,B),G) INPUT PARAMETERS: X - current point, X>=0: * zero X is correctly handled even for B<=0 * negative X results in exception. A, B, C, D, G- parameters of 5PL model: * A is unconstrained * B is unconstrained; zero or negative values are handled correctly. * C>0, non-positive value results in exception * D is unconstrained * G>0, non-positive value results in exception RESULT: model value at X NOTE: if B=0, denominator is assumed to be equal to Power(2.0,G) even for zero X (strictly speaking, 0^0 is undefined). NOTE: this function also throws exception if all input parameters are correct, but overflow was detected during calculations. NOTE: this function performs a lot of checks; if you need really high performance, consider evaluating model yourself, without checking for degenerate cases. -- ALGLIB PROJECT -- Copyright 14.05.2014 by Bochkanov Sergey *************************************************************************/ double logisticcalc5(const double x, const double a, const double b, const double c, const double d, const double g) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::logisticcalc5(x, a, b, c, d, g, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function fits four-parameter logistic (4PL) model to data provided by user. 4PL model has following form: F(x|A,B,C,D) = D+(A-D)/(1+Power(x/C,B)) Here: * A, D - unconstrained (see LogisticFit4EC() for constrained 4PL) * B>=0 * C>0 IMPORTANT: output of this function is constrained in such way that B>0. Because 4PL model is symmetric with respect to B, there is no need to explore B<0. Constraining B makes algorithm easier to stabilize and debug. Users who for some reason prefer to work with negative B's should transform output themselves (swap A and D, replace B by -B). 4PL fitting is implemented as follows: * we perform small number of restarts from random locations which helps to solve problem of bad local extrema. Locations are only partially random - we use input data to determine good initial guess, but we include controlled amount of randomness. * we perform Levenberg-Marquardt fitting with very tight constraints on parameters B and C - it allows us to find good initial guess for the second stage without risk of running into "flat spot". * second Levenberg-Marquardt round is performed without excessive constraints. Results from the previous round are used as initial guess. 
* after fitting is done, we compare results with best values found so far, rewrite "best solution" if needed, and move to next random location. Overall algorithm is very stable and is not prone to bad local extrema. Furthermore, it automatically scales when input data have very large or very small range. INPUT PARAMETERS: X - array[N], stores X-values. MUST include only non-negative numbers (but may include zero values). Can be unsorted. Y - array[N], values to fit. N - number of points. If N is less than length of X/Y, only leading N elements are used. OUTPUT PARAMETERS: A, B, C, D- parameters of 4PL model Rep - fitting report. This structure has many fields, but ONLY ONES LISTED BELOW ARE SET: * Rep.IterationsCount - number of iterations performed * Rep.RMSError - root-mean-square error * Rep.AvgError - average absolute error * Rep.AvgRelError - average relative error (calculated for non-zero Y-values) * Rep.MaxError - maximum absolute error * Rep.R2 - coefficient of determination, R-squared. This coefficient is calculated as R2=1-RSS/TSS (in case of nonlinear regression there are multiple ways to define R2, each of them giving different results). NOTE: after you obtained coefficients, you can evaluate model with LogisticCalc4() function. NOTE: if you need better control over fitting process than provided by this function, you may use LogisticFit45X(). NOTE: step is automatically scaled according to scale of parameters being fitted before we compare its length with EpsX. Thus, this function can be used to fit data with very small or very large values without changing EpsX. -- ALGLIB PROJECT -- Copyright 14.02.2014 by Bochkanov Sergey *************************************************************************/ void logisticfit4(const real_1d_array &x, const real_1d_array &y, const ae_int_t n, double &a, double &b, double &c, double &d, lsfitreport &rep) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::logisticfit4(const_cast(x.c_ptr()), const_cast(y.c_ptr()), n, &a, &b, &c, &d, const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function fits four-parameter logistic (4PL) model to data provided by user, with optional constraints on parameters A and D. 4PL model has following form: F(x|A,B,C,D) = D+(A-D)/(1+Power(x/C,B)) Here: * A, D - with optional equality constraints * B>=0 * C>0 IMPORTANT: output of this function is constrained in such way that B>0. Because 4PL model is symmetric with respect to B, there is no need to explore B<0. Constraining B makes algorithm easier to stabilize and debug. Users who for some reason prefer to work with negative B's should transform output themselves (swap A and D, replace B by -B). 4PL fitting is implemented as follows: * we perform small number of restarts from random locations which helps to solve problem of bad local extrema. Locations are only partially random - we use input data to determine good initial guess, but we include controlled amount of randomness. * we perform Levenberg-Marquardt fitting with very tight constraints on parameters B and C - it allows us to find good initial guess for the second stage without risk of running into "flat spot". * second Levenberg-Marquardt round is performed without excessive constraints. 
Results from the previous round are used as initial guess. * after fitting is done, we compare results with best values found so far, rewrite "best solution" if needed, and move to next random location. Overall algorithm is very stable and is not prone to bad local extrema. Furthermore, it automatically scales when input data have very large or very small range. INPUT PARAMETERS: X - array[N], stores X-values. MUST include only non-negative numbers (but may include zero values). Can be unsorted. Y - array[N], values to fit. N - number of points. If N is less than length of X/Y, only leading N elements are used. CnstrLeft- optional equality constraint for model value at the left boundary (at X=0). Specify NAN (Not-a-Number) if you do not need constraint on the model value at X=0 (in C++ you can pass alglib::fp_nan as parameter, in C# it will be Double.NaN). See below, section "EQUALITY CONSTRAINTS" for more information about constraints. CnstrRight- optional equality constraint for model value at X=infinity. Specify NAN (Not-a-Number) if you do not need constraint on the model value (in C++ you can pass alglib::fp_nan as parameter, in C# it will be Double.NaN). See below, section "EQUALITY CONSTRAINTS" for more information about constraints. OUTPUT PARAMETERS: A, B, C, D- parameters of 4PL model Rep - fitting report. This structure has many fields, but ONLY ONES LISTED BELOW ARE SET: * Rep.IterationsCount - number of iterations performed * Rep.RMSError - root-mean-square error * Rep.AvgError - average absolute error * Rep.AvgRelError - average relative error (calculated for non-zero Y-values) * Rep.MaxError - maximum absolute error * Rep.R2 - coefficient of determination, R-squared. This coefficient is calculated as R2=1-RSS/TSS (in case of nonlinear regression there are multiple ways to define R2, each of them giving different results). NOTE: after you obtained coefficients, you can evaluate model with LogisticCalc4() function. NOTE: if you need better control over fitting process than provided by this function, you may use LogisticFit45X(). NOTE: step is automatically scaled according to scale of parameters being fitted before we compare its length with EpsX. Thus, this function can be used to fit data with very small or very large values without changing EpsX. EQUALITY CONSTRAINTS ON PARAMETERS 4PL/5PL solver supports equality constraints on model values at the left boundary (X=0) and right boundary (X=infinity). These constraints are completely optional and you can specify both of them, only one - or no constraints at all. Parameter CnstrLeft contains left constraint (or NAN for unconstrained fitting), and CnstrRight contains right one. For 4PL, left constraint ALWAYS corresponds to parameter A, and right one is ALWAYS constraint on D. That's because 4PL model is normalized in such way that B>=0. 
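Example (illustrative fragment, not part of the original listing; data values are hypothetical): fitting a decreasing dose-response-like dataset while constraining the model value at X=0 to 1.0 and leaving the right boundary unconstrained (alglib::fp_nan).

    real_1d_array x = "[0.0,1.0,2.0,4.0,8.0,16.0]";
    real_1d_array y = "[0.98,0.93,0.78,0.47,0.19,0.06]";
    double a, b, c, d;
    lsfitreport rep;
    // CnstrLeft=1.0 pins F(0); CnstrRight=fp_nan leaves F(+inf) free
    logisticfit4ec(x, y, 6, 1.0, fp_nan, a, b, c, d, rep);
    // rep.r2, rep.rmserror etc. are filled as described above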
-- ALGLIB PROJECT -- Copyright 14.02.2014 by Bochkanov Sergey *************************************************************************/ void logisticfit4ec(const real_1d_array &x, const real_1d_array &y, const ae_int_t n, const double cnstrleft, const double cnstrright, double &a, double &b, double &c, double &d, lsfitreport &rep) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::logisticfit4ec(const_cast(x.c_ptr()), const_cast(y.c_ptr()), n, cnstrleft, cnstrright, &a, &b, &c, &d, const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function fits five-parameter logistic (5PL) model to data provided by user. 5PL model has following form: F(x|A,B,C,D,G) = D+(A-D)/Power(1+Power(x/C,B),G) Here: * A, D - unconstrained * B - unconstrained * C>0 * G>0 IMPORTANT: unlike in 4PL fitting, output of this function is NOT constrained in such way that B is guaranteed to be positive. Furthermore, unlike 4PL, 5PL model is NOT symmetric with respect to B, so you can NOT transform model to equivalent one, with B having desired sign (>0 or <0). 5PL fitting is implemented as follows: * we perform small number of restarts from random locations which helps to solve problem of bad local extrema. Locations are only partially random - we use input data to determine good initial guess, but we include controlled amount of randomness. * we perform Levenberg-Marquardt fitting with very tight constraints on parameters B and C - it allows us to find good initial guess for the second stage without risk of running into "flat spot". Parameter G is fixed at G=1. * second Levenberg-Marquardt round is performed without excessive constraints on B and C, but with G still equal to 1. Results from the previous round are used as initial guess. * third Levenberg-Marquardt round relaxes constraints on G and tries two different models - one with B>0 and one with B<0. * after fitting is done, we compare results with best values found so far, rewrite "best solution" if needed, and move to next random location. Overall algorithm is very stable and is not prone to bad local extrema. Furthermore, it automatically scales when input data have very large or very small range. INPUT PARAMETERS: X - array[N], stores X-values. MUST include only non-negative numbers (but may include zero values). Can be unsorted. Y - array[N], values to fit. N - number of points. If N is less than length of X/Y, only leading N elements are used. OUTPUT PARAMETERS: A,B,C,D,G- parameters of 5PL model Rep - fitting report. This structure has many fields, but ONLY ONES LISTED BELOW ARE SET: * Rep.IterationsCount - number of iterations performed * Rep.RMSError - root-mean-square error * Rep.AvgError - average absolute error * Rep.AvgRelError - average relative error (calculated for non-zero Y-values) * Rep.MaxError - maximum absolute error * Rep.R2 - coefficient of determination, R-squared. This coefficient is calculated as R2=1-RSS/TSS (in case of nonlinear regression there are multiple ways to define R2, each of them giving different results). NOTE: after you obtained coefficients, you can evaluate model with LogisticCalc5() function. NOTE: if you need better control over fitting process than provided by this function, you may use LogisticFit45X(). 
NOTE: step is automatically scaled according to scale of parameters being fitted before we compare its length with EpsX. Thus, this function can be used to fit data with very small or very large values without changing EpsX. -- ALGLIB PROJECT -- Copyright 14.02.2014 by Bochkanov Sergey *************************************************************************/ void logisticfit5(const real_1d_array &x, const real_1d_array &y, const ae_int_t n, double &a, double &b, double &c, double &d, double &g, lsfitreport &rep) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::logisticfit5(const_cast(x.c_ptr()), const_cast(y.c_ptr()), n, &a, &b, &c, &d, &g, const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function fits five-parameter logistic (5PL) model to data provided by user, subject to optional equality constraints on parameters A and D. 5PL model has following form: F(x|A,B,C,D,G) = D+(A-D)/Power(1+Power(x/C,B),G) Here: * A, D - with optional equality constraints * B - unconstrained * C>0 * G>0 IMPORTANT: unlike in 4PL fitting, output of this function is NOT constrained in such way that B is guaranteed to be positive. Furthermore, unlike 4PL, 5PL model is NOT symmetric with respect to B, so you can NOT transform model to equivalent one, with B having desired sign (>0 or <0). 5PL fitting is implemented as follows: * we perform small number of restarts from random locations which helps to solve problem of bad local extrema. Locations are only partially random - we use input data to determine good initial guess, but we include controlled amount of randomness. * we perform Levenberg-Marquardt fitting with very tight constraints on parameters B and C - it allows us to find good initial guess for the second stage without risk of running into "flat spot". Parameter G is fixed at G=1. * second Levenberg-Marquardt round is performed without excessive constraints on B and C, but with G still equal to 1. Results from the previous round are used as initial guess. * third Levenberg-Marquardt round relaxes constraints on G and tries two different models - one with B>0 and one with B<0. * after fitting is done, we compare results with best values found so far, rewrite "best solution" if needed, and move to next random location. Overall algorithm is very stable and is not prone to bad local extrema. Furthermore, it automatically scales when input data have very large or very small range. INPUT PARAMETERS: X - array[N], stores X-values. MUST include only non-negative numbers (but may include zero values). Can be unsorted. Y - array[N], values to fit. N - number of points. If N is less than length of X/Y, only leading N elements are used. CnstrLeft- optional equality constraint for model value at the left boundary (at X=0). Specify NAN (Not-a-Number) if you do not need constraint on the model value at X=0 (in C++ you can pass alglib::fp_nan as parameter, in C# it will be Double.NaN). See below, section "EQUALITY CONSTRAINTS" for more information about constraints. CnstrRight- optional equality constraint for model value at X=infinity. Specify NAN (Not-a-Number) if you do not need constraint on the model value (in C++ you can pass alglib::fp_nan as parameter, in C# it will be Double.NaN). 
See below, section "EQUALITY CONSTRAINTS" for more information about constraints. OUTPUT PARAMETERS: A,B,C,D,G- parameters of 5PL model Rep - fitting report. This structure has many fields, but ONLY ONES LISTED BELOW ARE SET: * Rep.IterationsCount - number of iterations performed * Rep.RMSError - root-mean-square error * Rep.AvgError - average absolute error * Rep.AvgRelError - average relative error (calculated for non-zero Y-values) * Rep.MaxError - maximum absolute error * Rep.R2 - coefficient of determination, R-squared. This coefficient is calculated as R2=1-RSS/TSS (in case of nonlinear regression there are multiple ways to define R2, each of them giving different results). NOTE: after you obtained coefficients, you can evaluate model with LogisticCalc5() function. NOTE: if you need better control over fitting process than provided by this function, you may use LogisticFit45X(). NOTE: step is automatically scaled according to scale of parameters being fitted before we compare its length with EpsX. Thus, this function can be used to fit data with very small or very large values without changing EpsX. EQUALITY CONSTRAINTS ON PARAMETERS 5PL solver supports equality constraints on model values at the left boundary (X=0) and right boundary (X=infinity). These constraints are completely optional and you can specify both of them, only one - or no constraints at all. Parameter CnstrLeft contains left constraint (or NAN for unconstrained fitting), and CnstrRight contains right one. Unlike 4PL one, 5PL model is NOT symmetric with respect to change in sign of B. Thus, negative B's are possible, and left constraint may constrain parameter A (for positive B's) - or parameter D (for negative B's). Similarly changes meaning of right constraint. You do not have to decide what parameter to constrain - algorithm will automatically determine correct parameters as fitting progresses. However, question highlighted above is important when you interpret fitting results. -- ALGLIB PROJECT -- Copyright 14.02.2014 by Bochkanov Sergey *************************************************************************/ void logisticfit5ec(const real_1d_array &x, const real_1d_array &y, const ae_int_t n, const double cnstrleft, const double cnstrright, double &a, double &b, double &c, double &d, double &g, lsfitreport &rep) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::logisticfit5ec(const_cast(x.c_ptr()), const_cast(y.c_ptr()), n, cnstrleft, cnstrright, &a, &b, &c, &d, &g, const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This is "expert" 4PL/5PL fitting function, which can be used if you need better control over fitting process than provided by LogisticFit4() or LogisticFit5(). This function fits model of the form F(x|A,B,C,D) = D+(A-D)/(1+Power(x/C,B)) (4PL model) or F(x|A,B,C,D,G) = D+(A-D)/Power(1+Power(x/C,B),G) (5PL model) Here: * A, D - unconstrained * B>=0 for 4PL, unconstrained for 5PL * C>0 * G>0 (if present) INPUT PARAMETERS: X - array[N], stores X-values. MUST include only non-negative numbers (but may include zero values). Can be unsorted. Y - array[N], values to fit. N - number of points. If N is less than length of X/Y, only leading N elements are used. 
CnstrLeft- optional equality constraint for model value at the left boundary (at X=0). Specify NAN (Not-a-Number) if you do not need constraint on the model value at X=0 (in C++ you can pass alglib::fp_nan as parameter, in C# it will be Double.NaN). See below, section "EQUALITY CONSTRAINTS" for more information about constraints. CnstrRight- optional equality constraint for model value at X=infinity. Specify NAN (Not-a-Number) if you do not need constraint on the model value (in C++ you can pass alglib::fp_nan as parameter, in C# it will be Double.NaN). See below, section "EQUALITY CONSTRAINTS" for more information about constraints. Is4PL - whether 4PL or 5PL models are fitted LambdaV - regularization coefficient, LambdaV>=0. Set it to zero unless you know what you are doing. EpsX - stopping condition (step size), EpsX>=0. Zero value means that small step is automatically chosen. See notes below for more information. RsCnt - number of repeated restarts from random points. 4PL/5PL models are prone to problem of bad local extrema. Utilizing multiple random restarts allows us to improve algorithm convergence. RsCnt>=0. Zero value means that function automatically choose small amount of restarts (recommended). OUTPUT PARAMETERS: A, B, C, D- parameters of 4PL model G - parameter of 5PL model; for Is4PL=True, G=1 is returned. Rep - fitting report. This structure has many fields, but ONLY ONES LISTED BELOW ARE SET: * Rep.IterationsCount - number of iterations performed * Rep.RMSError - root-mean-square error * Rep.AvgError - average absolute error * Rep.AvgRelError - average relative error (calculated for non-zero Y-values) * Rep.MaxError - maximum absolute error * Rep.R2 - coefficient of determination, R-squared. This coefficient is calculated as R2=1-RSS/TSS (in case of nonlinear regression there are multiple ways to define R2, each of them giving different results). NOTE: after you obtained coefficients, you can evaluate model with LogisticCalc5() function. NOTE: step is automatically scaled according to scale of parameters being fitted before we compare its length with EpsX. Thus, this function can be used to fit data with very small or very large values without changing EpsX. EQUALITY CONSTRAINTS ON PARAMETERS 4PL/5PL solver supports equality constraints on model values at the left boundary (X=0) and right boundary (X=infinity). These constraints are completely optional and you can specify both of them, only one - or no constraints at all. Parameter CnstrLeft contains left constraint (or NAN for unconstrained fitting), and CnstrRight contains right one. For 4PL, left constraint ALWAYS corresponds to parameter A, and right one is ALWAYS constraint on D. That's because 4PL model is normalized in such way that B>=0. For 5PL model things are different. Unlike 4PL one, 5PL model is NOT symmetric with respect to change in sign of B. Thus, negative B's are possible, and left constraint may constrain parameter A (for positive B's) - or parameter D (for negative B's). Similarly changes meaning of right constraint. You do not have to decide what parameter to constrain - algorithm will automatically determine correct parameters as fitting progresses. However, question highlighted above is important when you interpret fitting results. 
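Example (illustrative fragment, not part of the original listing; data are hypothetical): an "expert" call fitting a 5PL model (Is4PL=false) with no equality constraints, no extra regularization, and automatically chosen step size and restart count (zero values, as described above).

    real_1d_array x = "[0.0,1.0,2.0,4.0,8.0,16.0]";
    real_1d_array y = "[0.98,0.93,0.78,0.47,0.19,0.06]";
    double a, b, c, d, g;
    lsfitreport rep;
    logisticfit45x(x, y, 6, fp_nan, fp_nan, false, 0.0, 0.0, 0,
                   a, b, c, d, g, rep);
    // for Is4PL=true the same call would return G=1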
-- ALGLIB PROJECT -- Copyright 14.02.2014 by Bochkanov Sergey *************************************************************************/ void logisticfit45x(const real_1d_array &x, const real_1d_array &y, const ae_int_t n, const double cnstrleft, const double cnstrright, const bool is4pl, const double lambdav, const double epsx, const ae_int_t rscnt, double &a, double &b, double &c, double &d, double &g, lsfitreport &rep) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::logisticfit45x(const_cast(x.c_ptr()), const_cast(y.c_ptr()), n, cnstrleft, cnstrright, is4pl, lambdav, epsx, rscnt, &a, &b, &c, &d, &g, const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Weghted rational least squares fitting using Floater-Hormann rational functions with optimal D chosen from [0,9], with constraints and individual weights. Equidistant grid with M node on [min(x),max(x)] is used to build basis functions. Different values of D are tried, optimal D (least WEIGHTED root mean square error) is chosen. Task is linear, so linear least squares solver is used. Complexity of this computational scheme is O(N*M^2) (mostly dominated by the least squares solver). SEE ALSO * BarycentricFitFloaterHormann(), "lightweight" fitting without invididual weights and constraints. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multithreading support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Speed-up provided by multithreading greatly depends on problem size ! - only large problems (number of coefficients is more than 500) can be ! efficiently multithreaded. ! ! Generally, commercial ALGLIB is several times faster than open-source ! generic C edition, and many times faster than open-source C# edition. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS: X - points, array[0..N-1]. Y - function values, array[0..N-1]. W - weights, array[0..N-1] Each summand in square sum of approximation deviations from given values is multiplied by the square of corresponding weight. Fill it by 1's if you don't want to solve weighted task. N - number of points, N>0. XC - points where function values/derivatives are constrained, array[0..K-1]. YC - values of constraints, array[0..K-1] DC - array[0..K-1], types of constraints: * DC[i]=0 means that S(XC[i])=YC[i] * DC[i]=1 means that S'(XC[i])=YC[i] SEE BELOW FOR IMPORTANT INFORMATION ON CONSTRAINTS K - number of constraints, 0<=K=2. OUTPUT PARAMETERS: Info- same format as in LSFitLinearWC() subroutine. * Info>0 task is solved * Info<=0 an error occured: -4 means inconvergence of internal SVD -3 means inconsistent constraints -1 means another errors in parameters passed (N<=0, for example) B - barycentric interpolant. Rep - report, same format as in LSFitLinearWC() subroutine. 
Following fields are set: * DBest best value of the D parameter * RMSError rms error on the (X,Y). * AvgError average error on the (X,Y). * AvgRelError average relative error on the non-zero Y * MaxError maximum error NON-WEIGHTED ERRORS ARE CALCULATED IMPORTANT: this subroutine doesn't calculate task's condition number for K<>0. SETTING CONSTRAINTS - DANGERS AND OPPORTUNITIES: Setting constraints can lead to undesired results, like ill-conditioned behavior, or inconsistency being detected. From the other side, it allows us to improve quality of the fit. Here we summarize our experience with constrained barycentric interpolants: * excessive constraints can be inconsistent. Floater-Hormann basis functions aren't as flexible as splines (although they are very smooth). * the more evenly constraints are spread across [min(x),max(x)], the more chances that they will be consistent * the greater is M (given fixed constraints), the more chances that constraints will be consistent * in the general case, consistency of constraints IS NOT GUARANTEED. * in the several special cases, however, we CAN guarantee consistency. * one of this cases is constraints on the function VALUES at the interval boundaries. Note that consustency of the constraints on the function DERIVATIVES is NOT guaranteed (you can use in such cases cubic splines which are more flexible). * another special case is ONE constraint on the function value (OR, but not AND, derivative) anywhere in the interval Our final recommendation is to use constraints WHEN AND ONLY WHEN you can't solve your task without them. Anything beyond special cases given above is not guaranteed and may result in inconsistency. -- ALGLIB PROJECT -- Copyright 18.08.2009 by Bochkanov Sergey *************************************************************************/ void barycentricfitfloaterhormannwc(const real_1d_array &x, const real_1d_array &y, const real_1d_array &w, const ae_int_t n, const real_1d_array &xc, const real_1d_array &yc, const integer_1d_array &dc, const ae_int_t k, const ae_int_t m, ae_int_t &info, barycentricinterpolant &b, barycentricfitreport &rep) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::barycentricfitfloaterhormannwc(const_cast(x.c_ptr()), const_cast(y.c_ptr()), const_cast(w.c_ptr()), n, const_cast(xc.c_ptr()), const_cast(yc.c_ptr()), const_cast(dc.c_ptr()), k, m, &info, const_cast(b.c_ptr()), const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void smp_barycentricfitfloaterhormannwc(const real_1d_array &x, const real_1d_array &y, const real_1d_array &w, const ae_int_t n, const real_1d_array &xc, const real_1d_array &yc, const integer_1d_array &dc, const ae_int_t k, const ae_int_t m, ae_int_t &info, barycentricinterpolant &b, barycentricfitreport &rep) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::_pexec_barycentricfitfloaterhormannwc(const_cast(x.c_ptr()), const_cast(y.c_ptr()), const_cast(w.c_ptr()), n, const_cast(xc.c_ptr()), const_cast(yc.c_ptr()), const_cast(dc.c_ptr()), k, m, &info, const_cast(b.c_ptr()), const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* 
Rational least squares fitting using Floater-Hormann rational functions with optimal D chosen from [0,9]. Equidistant grid with M node on [min(x),max(x)] is used to build basis functions. Different values of D are tried, optimal D (least root mean square error) is chosen. Task is linear, so linear least squares solver is used. Complexity of this computational scheme is O(N*M^2) (mostly dominated by the least squares solver). COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multithreading support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Speed-up provided by multithreading greatly depends on problem size ! - only large problems (number of coefficients is more than 500) can be ! efficiently multithreaded. ! ! Generally, commercial ALGLIB is several times faster than open-source ! generic C edition, and many times faster than open-source C# edition. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS: X - points, array[0..N-1]. Y - function values, array[0..N-1]. N - number of points, N>0. M - number of basis functions ( = number_of_nodes), M>=2. OUTPUT PARAMETERS: Info- same format as in LSFitLinearWC() subroutine. * Info>0 task is solved * Info<=0 an error occured: -4 means inconvergence of internal SVD -3 means inconsistent constraints B - barycentric interpolant. Rep - report, same format as in LSFitLinearWC() subroutine. Following fields are set: * DBest best value of the D parameter * RMSError rms error on the (X,Y). * AvgError average error on the (X,Y). 
* AvgRelError average relative error on the non-zero Y * MaxError maximum error NON-WEIGHTED ERRORS ARE CALCULATED -- ALGLIB PROJECT -- Copyright 18.08.2009 by Bochkanov Sergey *************************************************************************/ void barycentricfitfloaterhormann(const real_1d_array &x, const real_1d_array &y, const ae_int_t n, const ae_int_t m, ae_int_t &info, barycentricinterpolant &b, barycentricfitreport &rep) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::barycentricfitfloaterhormann(const_cast(x.c_ptr()), const_cast(y.c_ptr()), n, m, &info, const_cast(b.c_ptr()), const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void smp_barycentricfitfloaterhormann(const real_1d_array &x, const real_1d_array &y, const ae_int_t n, const ae_int_t m, ae_int_t &info, barycentricinterpolant &b, barycentricfitreport &rep) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::_pexec_barycentricfitfloaterhormann(const_cast(x.c_ptr()), const_cast(y.c_ptr()), n, m, &info, const_cast(b.c_ptr()), const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Fitting by penalized cubic spline. Equidistant grid with M nodes on [min(x,xc),max(x,xc)] is used to build basis functions. Basis functions are cubic splines with natural boundary conditions. Problem is regularized by adding non-linearity penalty to the usual least squares penalty function: S(x) = arg min { LS + P }, where LS = SUM { w[i]^2*(y[i] - S(x[i]))^2 } - least squares penalty P = C*10^rho*integral{ S''(x)^2*dx } - non-linearity penalty rho - tunable constant given by user C - automatically determined scale parameter, makes penalty invariant with respect to scaling of X, Y, W. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multithreading support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Speed-up provided by multithreading greatly depends on problem size ! - only large problems (number of coefficients is more than 500) can be ! efficiently multithreaded. ! ! Generally, commercial ALGLIB is several times faster than open-source ! generic C edition, and many times faster than open-source C# edition. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS: X - points, array[0..N-1]. Y - function values, array[0..N-1]. N - number of points (optional): * N>0 * if given, only first N elements of X/Y are processed * if not given, automatically determined from X/Y sizes M - number of basis functions ( = number_of_nodes), M>=4. Rho - regularization constant passed by user. It penalizes nonlinearity in the regression spline. It is logarithmically scaled, i.e. 
actual value of regularization constant is calculated as 10^Rho. It is automatically scaled so that: * Rho=2.0 corresponds to moderate amount of nonlinearity * generally, it should be somewhere in the [-8.0,+8.0] If you do not want to penalize nonlineary, pass small Rho. Values as low as -15 should work. OUTPUT PARAMETERS: Info- same format as in LSFitLinearWC() subroutine. * Info>0 task is solved * Info<=0 an error occured: -4 means inconvergence of internal SVD or Cholesky decomposition; problem may be too ill-conditioned (very rare) S - spline interpolant. Rep - Following fields are set: * RMSError rms error on the (X,Y). * AvgError average error on the (X,Y). * AvgRelError average relative error on the non-zero Y * MaxError maximum error NON-WEIGHTED ERRORS ARE CALCULATED IMPORTANT: this subroitine doesn't calculate task's condition number for K<>0. NOTE 1: additional nodes are added to the spline outside of the fitting interval to force linearity when xmax(x,xc). It is done for consistency - we penalize non-linearity at [min(x,xc),max(x,xc)], so it is natural to force linearity outside of this interval. NOTE 2: function automatically sorts points, so caller may pass unsorted array. -- ALGLIB PROJECT -- Copyright 18.08.2009 by Bochkanov Sergey *************************************************************************/ void spline1dfitpenalized(const real_1d_array &x, const real_1d_array &y, const ae_int_t n, const ae_int_t m, const double rho, ae_int_t &info, spline1dinterpolant &s, spline1dfitreport &rep) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::spline1dfitpenalized(const_cast(x.c_ptr()), const_cast(y.c_ptr()), n, m, rho, &info, const_cast(s.c_ptr()), const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void smp_spline1dfitpenalized(const real_1d_array &x, const real_1d_array &y, const ae_int_t n, const ae_int_t m, const double rho, ae_int_t &info, spline1dinterpolant &s, spline1dfitreport &rep) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::_pexec_spline1dfitpenalized(const_cast(x.c_ptr()), const_cast(y.c_ptr()), n, m, rho, &info, const_cast(s.c_ptr()), const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Fitting by penalized cubic spline. Equidistant grid with M nodes on [min(x,xc),max(x,xc)] is used to build basis functions. Basis functions are cubic splines with natural boundary conditions. Problem is regularized by adding non-linearity penalty to the usual least squares penalty function: S(x) = arg min { LS + P }, where LS = SUM { w[i]^2*(y[i] - S(x[i]))^2 } - least squares penalty P = C*10^rho*integral{ S''(x)^2*dx } - non-linearity penalty rho - tunable constant given by user C - automatically determined scale parameter, makes penalty invariant with respect to scaling of X, Y, W. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multithreading support ! ! Intel MKL gives approximately constant (with respect to number of ! 
worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Speed-up provided by multithreading greatly depends on problem size ! - only large problems (number of coefficients is more than 500) can be ! efficiently multithreaded. ! ! Generally, commercial ALGLIB is several times faster than open-source ! generic C edition, and many times faster than open-source C# edition. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS: X - points, array[0..N-1]. Y - function values, array[0..N-1]. N - number of points (optional): * N>0 * if given, only first N elements of X/Y are processed * if not given, automatically determined from X/Y sizes M - number of basis functions ( = number_of_nodes), M>=4. Rho - regularization constant passed by user. It penalizes nonlinearity in the regression spline. It is logarithmically scaled, i.e. actual value of regularization constant is calculated as 10^Rho. It is automatically scaled so that: * Rho=2.0 corresponds to moderate amount of nonlinearity * generally, it should be somewhere in the [-8.0,+8.0] range If you do not want to penalize nonlinearity, pass small Rho. Values as low as -15 should work. OUTPUT PARAMETERS: Info- same format as in LSFitLinearWC() subroutine. * Info>0 task is solved * Info<=0 an error occurred: -4 means inconvergence of internal SVD or Cholesky decomposition; problem may be too ill-conditioned (very rare) S - spline interpolant. Rep - Following fields are set: * RMSError rms error on the (X,Y). * AvgError average error on the (X,Y). * AvgRelError average relative error on the non-zero Y * MaxError maximum error NON-WEIGHTED ERRORS ARE CALCULATED IMPORTANT: this subroutine doesn't calculate task's condition number for K<>0. NOTE 1: additional nodes are added to the spline outside of the fitting interval to force linearity when x<min(x,xc) or x>max(x,xc). It is done for consistency - we penalize non-linearity at [min(x,xc),max(x,xc)], so it is natural to force linearity outside of this interval. NOTE 2: function automatically sorts points, so caller may pass unsorted array.
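Example (illustrative fragment, not part of the original listing; data and Rho are hypothetical, and the includes from the first sketch above are assumed): penalized spline fit with M=50 nodes and a moderate nonlinearity penalty, evaluated afterwards with spline1dcalc().

    real_1d_array x = "[0.0,0.5,1.0,1.5,2.0,2.5,3.0,3.5,4.0]";
    real_1d_array y = "[0.10,0.62,0.88,1.05,0.92,0.58,0.15,-0.32,-0.71]";
    ae_int_t info;
    spline1dinterpolant s;
    spline1dfitreport rep;
    // M=50 equidistant nodes, Rho=2.0 (moderate penalty); regularization
    // makes M>N a valid choice for penalized fitting
    spline1dfitpenalized(x, y, 50, 2.0, info, s, rep);
    if( info>0 )
        printf("S(1.25)=%.3f\n", spline1dcalc(s, 1.25));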
-- ALGLIB PROJECT -- Copyright 18.08.2009 by Bochkanov Sergey *************************************************************************/ void spline1dfitpenalized(const real_1d_array &x, const real_1d_array &y, const ae_int_t m, const double rho, ae_int_t &info, spline1dinterpolant &s, spline1dfitreport &rep) { alglib_impl::ae_state _alglib_env_state; ae_int_t n; if( (x.length()!=y.length())) throw ap_error("Error while calling 'spline1dfitpenalized': looks like one of arguments has wrong size"); n = x.length(); alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::spline1dfitpenalized(const_cast(x.c_ptr()), const_cast(y.c_ptr()), n, m, rho, &info, const_cast(s.c_ptr()), const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void smp_spline1dfitpenalized(const real_1d_array &x, const real_1d_array &y, const ae_int_t m, const double rho, ae_int_t &info, spline1dinterpolant &s, spline1dfitreport &rep) { alglib_impl::ae_state _alglib_env_state; ae_int_t n; if( (x.length()!=y.length())) throw ap_error("Error while calling 'spline1dfitpenalized': looks like one of arguments has wrong size"); n = x.length(); alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::_pexec_spline1dfitpenalized(const_cast(x.c_ptr()), const_cast(y.c_ptr()), n, m, rho, &info, const_cast(s.c_ptr()), const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Weighted fitting by penalized cubic spline. Equidistant grid with M nodes on [min(x,xc),max(x,xc)] is used to build basis functions. Basis functions are cubic splines with natural boundary conditions. Problem is regularized by adding non-linearity penalty to the usual least squares penalty function: S(x) = arg min { LS + P }, where LS = SUM { w[i]^2*(y[i] - S(x[i]))^2 } - least squares penalty P = C*10^rho*integral{ S''(x)^2*dx } - non-linearity penalty rho - tunable constant given by user C - automatically determined scale parameter, makes penalty invariant with respect to scaling of X, Y, W. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multithreading support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Speed-up provided by multithreading greatly depends on problem size ! - only large problems (number of coefficients is more than 500) can be ! efficiently multithreaded. ! ! Generally, commercial ALGLIB is several times faster than open-source ! generic C edition, and many times faster than open-source C# edition. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS: X - points, array[0..N-1]. Y - function values, array[0..N-1]. W - weights, array[0..N-1] Each summand in square sum of approximation deviations from given values is multiplied by the square of corresponding weight. 
Fill it by 1's if you don't want to solve weighted problem. N - number of points (optional): * N>0 * if given, only first N elements of X/Y/W are processed * if not given, automatically determined from X/Y/W sizes M - number of basis functions ( = number_of_nodes), M>=4. Rho - regularization constant passed by user. It penalizes nonlinearity in the regression spline. It is logarithmically scaled, i.e. actual value of regularization constant is calculated as 10^Rho. It is automatically scaled so that: * Rho=2.0 corresponds to moderate amount of nonlinearity * generally, it should be somewhere in the [-8.0,+8.0] If you do not want to penalize nonlineary, pass small Rho. Values as low as -15 should work. OUTPUT PARAMETERS: Info- same format as in LSFitLinearWC() subroutine. * Info>0 task is solved * Info<=0 an error occured: -4 means inconvergence of internal SVD or Cholesky decomposition; problem may be too ill-conditioned (very rare) S - spline interpolant. Rep - Following fields are set: * RMSError rms error on the (X,Y). * AvgError average error on the (X,Y). * AvgRelError average relative error on the non-zero Y * MaxError maximum error NON-WEIGHTED ERRORS ARE CALCULATED IMPORTANT: this subroitine doesn't calculate task's condition number for K<>0. NOTE 1: additional nodes are added to the spline outside of the fitting interval to force linearity when xmax(x,xc). It is done for consistency - we penalize non-linearity at [min(x,xc),max(x,xc)], so it is natural to force linearity outside of this interval. NOTE 2: function automatically sorts points, so caller may pass unsorted array. -- ALGLIB PROJECT -- Copyright 19.10.2010 by Bochkanov Sergey *************************************************************************/ void spline1dfitpenalizedw(const real_1d_array &x, const real_1d_array &y, const real_1d_array &w, const ae_int_t n, const ae_int_t m, const double rho, ae_int_t &info, spline1dinterpolant &s, spline1dfitreport &rep) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::spline1dfitpenalizedw(const_cast(x.c_ptr()), const_cast(y.c_ptr()), const_cast(w.c_ptr()), n, m, rho, &info, const_cast(s.c_ptr()), const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void smp_spline1dfitpenalizedw(const real_1d_array &x, const real_1d_array &y, const real_1d_array &w, const ae_int_t n, const ae_int_t m, const double rho, ae_int_t &info, spline1dinterpolant &s, spline1dfitreport &rep) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::_pexec_spline1dfitpenalizedw(const_cast(x.c_ptr()), const_cast(y.c_ptr()), const_cast(w.c_ptr()), n, m, rho, &info, const_cast(s.c_ptr()), const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Weighted fitting by penalized cubic spline. Equidistant grid with M nodes on [min(x,xc),max(x,xc)] is used to build basis functions. Basis functions are cubic splines with natural boundary conditions. 
Problem is regularized by adding non-linearity penalty to the usual least squares penalty function: S(x) = arg min { LS + P }, where LS = SUM { w[i]^2*(y[i] - S(x[i]))^2 } - least squares penalty P = C*10^rho*integral{ S''(x)^2*dx } - non-linearity penalty rho - tunable constant given by user C - automatically determined scale parameter, makes penalty invariant with respect to scaling of X, Y, W. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multithreading support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Speed-up provided by multithreading greatly depends on problem size ! - only large problems (number of coefficients is more than 500) can be ! efficiently multithreaded. ! ! Generally, commercial ALGLIB is several times faster than open-source ! generic C edition, and many times faster than open-source C# edition. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS: X - points, array[0..N-1]. Y - function values, array[0..N-1]. W - weights, array[0..N-1] Each summand in square sum of approximation deviations from given values is multiplied by the square of corresponding weight. Fill it by 1's if you don't want to solve weighted problem. N - number of points (optional): * N>0 * if given, only first N elements of X/Y/W are processed * if not given, automatically determined from X/Y/W sizes M - number of basis functions ( = number_of_nodes), M>=4. Rho - regularization constant passed by user. It penalizes nonlinearity in the regression spline. It is logarithmically scaled, i.e. actual value of regularization constant is calculated as 10^Rho. It is automatically scaled so that: * Rho=2.0 corresponds to moderate amount of nonlinearity * generally, it should be somewhere in the [-8.0,+8.0] If you do not want to penalize nonlineary, pass small Rho. Values as low as -15 should work. OUTPUT PARAMETERS: Info- same format as in LSFitLinearWC() subroutine. * Info>0 task is solved * Info<=0 an error occured: -4 means inconvergence of internal SVD or Cholesky decomposition; problem may be too ill-conditioned (very rare) S - spline interpolant. Rep - Following fields are set: * RMSError rms error on the (X,Y). * AvgError average error on the (X,Y). * AvgRelError average relative error on the non-zero Y * MaxError maximum error NON-WEIGHTED ERRORS ARE CALCULATED IMPORTANT: this subroitine doesn't calculate task's condition number for K<>0. NOTE 1: additional nodes are added to the spline outside of the fitting interval to force linearity when xmax(x,xc). It is done for consistency - we penalize non-linearity at [min(x,xc),max(x,xc)], so it is natural to force linearity outside of this interval. NOTE 2: function automatically sorts points, so caller may pass unsorted array. 
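The weighted variant is called the same way, with an additional weight vector. Below is a small illustrative sketch (data and weights are made up and not taken from ALGLIB's shipped examples); the last point is treated as unreliable and therefore gets a small weight.

#include <cstdio>
#include "interpolation.h"

using namespace alglib;

int main()
{
    // illustrative data; the last sample is known to be noisy, so it gets a low weight
    real_1d_array x = "[0.0,0.2,0.4,0.6,0.8,1.0]";
    real_1d_array y = "[0.01,0.22,0.39,0.61,0.82,1.45]";
    real_1d_array w = "[1,1,1,1,1,0.1]";

    ae_int_t info;
    spline1dinterpolant s;
    spline1dfitreport rep;

    // M=4 nodes, Rho=3.0; N is deduced from array sizes by the 'friendly' overload
    spline1dfitpenalizedw(x, y, w, 4, 3.0, info, s, rep);
    if( info>0 )
        printf("S(0.5)=%.4f\n", spline1dcalc(s, 0.5));
    return 0;
}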
-- ALGLIB PROJECT -- Copyright 19.10.2010 by Bochkanov Sergey *************************************************************************/ void spline1dfitpenalizedw(const real_1d_array &x, const real_1d_array &y, const real_1d_array &w, const ae_int_t m, const double rho, ae_int_t &info, spline1dinterpolant &s, spline1dfitreport &rep) { alglib_impl::ae_state _alglib_env_state; ae_int_t n; if( (x.length()!=y.length()) || (x.length()!=w.length())) throw ap_error("Error while calling 'spline1dfitpenalizedw': looks like one of arguments has wrong size"); n = x.length(); alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::spline1dfitpenalizedw(const_cast(x.c_ptr()), const_cast(y.c_ptr()), const_cast(w.c_ptr()), n, m, rho, &info, const_cast(s.c_ptr()), const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void smp_spline1dfitpenalizedw(const real_1d_array &x, const real_1d_array &y, const real_1d_array &w, const ae_int_t m, const double rho, ae_int_t &info, spline1dinterpolant &s, spline1dfitreport &rep) { alglib_impl::ae_state _alglib_env_state; ae_int_t n; if( (x.length()!=y.length()) || (x.length()!=w.length())) throw ap_error("Error while calling 'spline1dfitpenalizedw': looks like one of arguments has wrong size"); n = x.length(); alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::_pexec_spline1dfitpenalizedw(const_cast(x.c_ptr()), const_cast(y.c_ptr()), const_cast(w.c_ptr()), n, m, rho, &info, const_cast(s.c_ptr()), const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Weighted fitting by cubic spline, with constraints on function values or derivatives. Equidistant grid with M-2 nodes on [min(x,xc),max(x,xc)] is used to build basis functions. Basis functions are cubic splines with continuous second derivatives and non-fixed first derivatives at interval ends. Small regularizing term is used when solving constrained tasks (to improve stability). Task is linear, so linear least squares solver is used. Complexity of this computational scheme is O(N*M^2), mostly dominated by least squares solver SEE ALSO Spline1DFitHermiteWC() - fitting by Hermite splines (more flexible, less smooth) Spline1DFitCubic() - "lightweight" fitting by cubic splines, without invididual weights and constraints COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multithreading support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Speed-up provided by multithreading greatly depends on problem size ! - only large problems (number of coefficients is more than 500) can be ! efficiently multithreaded. ! ! Generally, commercial ALGLIB is several times faster than open-source ! generic C edition, and many times faster than open-source C# edition. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! 
related features provided by commercial edition of ALGLIB. INPUT PARAMETERS: X - points, array[0..N-1]. Y - function values, array[0..N-1]. W - weights, array[0..N-1] Each summand in square sum of approximation deviations from given values is multiplied by the square of corresponding weight. Fill it by 1's if you don't want to solve weighted task. N - number of points (optional): * N>0 * if given, only first N elements of X/Y/W are processed * if not given, automatically determined from X/Y/W sizes XC - points where spline values/derivatives are constrained, array[0..K-1]. YC - values of constraints, array[0..K-1] DC - array[0..K-1], types of constraints: * DC[i]=0 means that S(XC[i])=YC[i] * DC[i]=1 means that S'(XC[i])=YC[i] SEE BELOW FOR IMPORTANT INFORMATION ON CONSTRAINTS K - number of constraints (optional): * 0<=K<M * if given, only first K elements of XC/YC/DC are processed * if not given, automatically determined from XC/YC/DC sizes M - number of basis functions ( = number_of_nodes), M>=4. OUTPUT PARAMETERS: Info - same format as in LSFitLinearWC() subroutine. * Info>0 task is solved * Info<=0 an error occurred: -4 means inconvergence of internal SVD -3 means inconsistent constraints S - spline interpolant. Rep - report, same format as in LSFitLinearWC() subroutine. Following fields are set: * RMSError rms error on the (X,Y). * AvgError average error on the (X,Y). * AvgRelError average relative error on the non-zero Y * MaxError maximum error NON-WEIGHTED ERRORS ARE CALCULATED IMPORTANT: this subroutine doesn't calculate task's condition number for K<>0. ORDER OF POINTS Subroutine automatically sorts points, so caller may pass unsorted array. SETTING CONSTRAINTS - DANGERS AND OPPORTUNITIES: Setting constraints can lead to undesired results, like ill-conditioned behavior, or inconsistency being detected. On the other hand, constraints allow us to improve the quality of the fit. Here we summarize our experience with constrained regression splines: * excessive constraints can be inconsistent. Splines are piecewise cubic functions, and it is easy to create an example where a large number of constraints concentrated in a small area results in inconsistency, simply because the spline is not flexible enough to satisfy all of them; the same constraints spread across [min(x),max(x)] will be perfectly consistent. * the more evenly constraints are spread across [min(x),max(x)], the better the chances that they will be consistent * the greater M is (given fixed constraints), the better the chances that the constraints will be consistent * in the general case, consistency of constraints IS NOT GUARANTEED. * in several special cases, however, we CAN guarantee consistency. * one of these cases is constraints on the function values AND/OR their derivatives at the interval boundaries. * another special case is ONE constraint on the function value (OR, but not AND, derivative) anywhere in the interval. Our final recommendation is to use constraints WHEN AND ONLY WHEN you can't solve your task without them. Anything beyond special cases given above is not guaranteed and may result in inconsistency.
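As an illustration of the 'expert' interface described above, the sketch below fits a constrained cubic spline with a single function-value constraint S(0)=0 at the interval boundary (one of the cases where consistency is guaranteed). All data values, the number of nodes M and the constraint itself are invented for demonstration only.

#include <cstdio>
#include "interpolation.h"

using namespace alglib;

int main()
{
    // illustrative data set
    real_1d_array x = "[0.0,0.1,0.3,0.4,0.6,0.7,0.9,1.0]";
    real_1d_array y = "[0.05,0.12,0.28,0.41,0.62,0.68,0.88,0.97]";
    real_1d_array w = "[1,1,1,1,1,1,1,1]";

    // one constraint: force S(0)=0 (function value constraint, DC[0]=0)
    real_1d_array xc = "[0.0]";
    real_1d_array yc = "[0.0]";
    integer_1d_array dc = "[0]";

    ae_int_t info;
    spline1dinterpolant s;
    spline1dfitreport rep;

    // expert interface: N=8 and K=1 are passed explicitly; M=6 basis functions
    spline1dfitcubicwc(x, y, w, 8, xc, yc, dc, 1, 6, info, s, rep);
    if( info>0 )
        printf("S(0)=%.6f (should be close to 0)\n", spline1dcalc(s, 0.0));
    return 0;
}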
-- ALGLIB PROJECT -- Copyright 18.08.2009 by Bochkanov Sergey *************************************************************************/ void spline1dfitcubicwc(const real_1d_array &x, const real_1d_array &y, const real_1d_array &w, const ae_int_t n, const real_1d_array &xc, const real_1d_array &yc, const integer_1d_array &dc, const ae_int_t k, const ae_int_t m, ae_int_t &info, spline1dinterpolant &s, spline1dfitreport &rep) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::spline1dfitcubicwc(const_cast(x.c_ptr()), const_cast(y.c_ptr()), const_cast(w.c_ptr()), n, const_cast(xc.c_ptr()), const_cast(yc.c_ptr()), const_cast(dc.c_ptr()), k, m, &info, const_cast(s.c_ptr()), const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void smp_spline1dfitcubicwc(const real_1d_array &x, const real_1d_array &y, const real_1d_array &w, const ae_int_t n, const real_1d_array &xc, const real_1d_array &yc, const integer_1d_array &dc, const ae_int_t k, const ae_int_t m, ae_int_t &info, spline1dinterpolant &s, spline1dfitreport &rep) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::_pexec_spline1dfitcubicwc(const_cast(x.c_ptr()), const_cast(y.c_ptr()), const_cast(w.c_ptr()), n, const_cast(xc.c_ptr()), const_cast(yc.c_ptr()), const_cast(dc.c_ptr()), k, m, &info, const_cast(s.c_ptr()), const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Weighted fitting by cubic spline, with constraints on function values or derivatives. Equidistant grid with M-2 nodes on [min(x,xc),max(x,xc)] is used to build basis functions. Basis functions are cubic splines with continuous second derivatives and non-fixed first derivatives at interval ends. Small regularizing term is used when solving constrained tasks (to improve stability). Task is linear, so linear least squares solver is used. Complexity of this computational scheme is O(N*M^2), mostly dominated by least squares solver SEE ALSO Spline1DFitHermiteWC() - fitting by Hermite splines (more flexible, less smooth) Spline1DFitCubic() - "lightweight" fitting by cubic splines, without invididual weights and constraints COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multithreading support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Speed-up provided by multithreading greatly depends on problem size ! - only large problems (number of coefficients is more than 500) can be ! efficiently multithreaded. ! ! Generally, commercial ALGLIB is several times faster than open-source ! generic C edition, and many times faster than open-source C# edition. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. 
INPUT PARAMETERS: X - points, array[0..N-1]. Y - function values, array[0..N-1]. W - weights, array[0..N-1] Each summand in square sum of approximation deviations from given values is multiplied by the square of corresponding weight. Fill it by 1's if you don't want to solve weighted task. N - number of points (optional): * N>0 * if given, only first N elements of X/Y/W are processed * if not given, automatically determined from X/Y/W sizes XC - points where spline values/derivatives are constrained, array[0..K-1]. YC - values of constraints, array[0..K-1] DC - array[0..K-1], types of constraints: * DC[i]=0 means that S(XC[i])=YC[i] * DC[i]=1 means that S'(XC[i])=YC[i] SEE BELOW FOR IMPORTANT INFORMATION ON CONSTRAINTS K - number of constraints (optional): * 0<=K=4. OUTPUT PARAMETERS: Info- same format as in LSFitLinearWC() subroutine. * Info>0 task is solved * Info<=0 an error occured: -4 means inconvergence of internal SVD -3 means inconsistent constraints S - spline interpolant. Rep - report, same format as in LSFitLinearWC() subroutine. Following fields are set: * RMSError rms error on the (X,Y). * AvgError average error on the (X,Y). * AvgRelError average relative error on the non-zero Y * MaxError maximum error NON-WEIGHTED ERRORS ARE CALCULATED IMPORTANT: this subroitine doesn't calculate task's condition number for K<>0. ORDER OF POINTS Subroutine automatically sorts points, so caller may pass unsorted array. SETTING CONSTRAINTS - DANGERS AND OPPORTUNITIES: Setting constraints can lead to undesired results, like ill-conditioned behavior, or inconsistency being detected. From the other side, it allows us to improve quality of the fit. Here we summarize our experience with constrained regression splines: * excessive constraints can be inconsistent. Splines are piecewise cubic functions, and it is easy to create an example, where large number of constraints concentrated in small area will result in inconsistency. Just because spline is not flexible enough to satisfy all of them. And same constraints spread across the [min(x),max(x)] will be perfectly consistent. * the more evenly constraints are spread across [min(x),max(x)], the more chances that they will be consistent * the greater is M (given fixed constraints), the more chances that constraints will be consistent * in the general case, consistency of constraints IS NOT GUARANTEED. * in the several special cases, however, we CAN guarantee consistency. * one of this cases is constraints on the function values AND/OR its derivatives at the interval boundaries. * another special case is ONE constraint on the function value (OR, but not AND, derivative) anywhere in the interval Our final recommendation is to use constraints WHEN AND ONLY WHEN you can't solve your task without them. Anything beyond special cases given above is not guaranteed and may result in inconsistency. 
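For comparison, the 'friendly' overload documented above omits N and K and deduces them from the array sizes, throwing ap_error if the sizes are inconsistent. A short illustrative sketch with made-up data:

#include <cstdio>
#include "interpolation.h"

using namespace alglib;

int main()
{
    real_1d_array x  = "[0.0,0.1,0.3,0.4,0.6,0.7,0.9,1.0]";
    real_1d_array y  = "[0.05,0.12,0.28,0.41,0.62,0.68,0.88,0.97]";
    real_1d_array w  = "[1,1,1,1,1,1,1,1]";
    real_1d_array xc = "[0.0]";
    real_1d_array yc = "[0.0]";
    integer_1d_array dc = "[0]";

    ae_int_t info;
    spline1dinterpolant s;
    spline1dfitreport rep;

    // 'friendly' overload: N and K are deduced from x/y/w and xc/yc/dc sizes
    spline1dfitcubicwc(x, y, w, xc, yc, dc, 6, info, s, rep);
    if( info>0 )
        printf("max error: %.6f\n", double(rep.maxerror));
    return 0;
}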
-- ALGLIB PROJECT -- Copyright 18.08.2009 by Bochkanov Sergey *************************************************************************/ void spline1dfitcubicwc(const real_1d_array &x, const real_1d_array &y, const real_1d_array &w, const real_1d_array &xc, const real_1d_array &yc, const integer_1d_array &dc, const ae_int_t m, ae_int_t &info, spline1dinterpolant &s, spline1dfitreport &rep) { alglib_impl::ae_state _alglib_env_state; ae_int_t n; ae_int_t k; if( (x.length()!=y.length()) || (x.length()!=w.length())) throw ap_error("Error while calling 'spline1dfitcubicwc': looks like one of arguments has wrong size"); if( (xc.length()!=yc.length()) || (xc.length()!=dc.length())) throw ap_error("Error while calling 'spline1dfitcubicwc': looks like one of arguments has wrong size"); n = x.length(); k = xc.length(); alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::spline1dfitcubicwc(const_cast(x.c_ptr()), const_cast(y.c_ptr()), const_cast(w.c_ptr()), n, const_cast(xc.c_ptr()), const_cast(yc.c_ptr()), const_cast(dc.c_ptr()), k, m, &info, const_cast(s.c_ptr()), const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void smp_spline1dfitcubicwc(const real_1d_array &x, const real_1d_array &y, const real_1d_array &w, const real_1d_array &xc, const real_1d_array &yc, const integer_1d_array &dc, const ae_int_t m, ae_int_t &info, spline1dinterpolant &s, spline1dfitreport &rep) { alglib_impl::ae_state _alglib_env_state; ae_int_t n; ae_int_t k; if( (x.length()!=y.length()) || (x.length()!=w.length())) throw ap_error("Error while calling 'spline1dfitcubicwc': looks like one of arguments has wrong size"); if( (xc.length()!=yc.length()) || (xc.length()!=dc.length())) throw ap_error("Error while calling 'spline1dfitcubicwc': looks like one of arguments has wrong size"); n = x.length(); k = xc.length(); alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::_pexec_spline1dfitcubicwc(const_cast(x.c_ptr()), const_cast(y.c_ptr()), const_cast(w.c_ptr()), n, const_cast(xc.c_ptr()), const_cast(yc.c_ptr()), const_cast(dc.c_ptr()), k, m, &info, const_cast(s.c_ptr()), const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Weighted fitting by Hermite spline, with constraints on function values or first derivatives. Equidistant grid with M nodes on [min(x,xc),max(x,xc)] is used to build basis functions. Basis functions are Hermite splines. Small regularizing term is used when solving constrained tasks (to improve stability). Task is linear, so linear least squares solver is used. Complexity of this computational scheme is O(N*M^2), mostly dominated by least squares solver SEE ALSO Spline1DFitCubicWC() - fitting by Cubic splines (less flexible, more smooth) Spline1DFitHermite() - "lightweight" Hermite fitting, without invididual weights and constraints COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multithreading support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! 
problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Speed-up provided by multithreading greatly depends on problem size ! - only large problems (number of coefficients is more than 500) can be ! efficiently multithreaded. ! ! Generally, commercial ALGLIB is several times faster than open-source ! generic C edition, and many times faster than open-source C# edition. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS: X - points, array[0..N-1]. Y - function values, array[0..N-1]. W - weights, array[0..N-1] Each summand in square sum of approximation deviations from given values is multiplied by the square of corresponding weight. Fill it by 1's if you don't want to solve weighted task. N - number of points (optional): * N>0 * if given, only first N elements of X/Y/W are processed * if not given, automatically determined from X/Y/W sizes XC - points where spline values/derivatives are constrained, array[0..K-1]. YC - values of constraints, array[0..K-1] DC - array[0..K-1], types of constraints: * DC[i]=0 means that S(XC[i])=YC[i] * DC[i]=1 means that S'(XC[i])=YC[i] SEE BELOW FOR IMPORTANT INFORMATION ON CONSTRAINTS K - number of constraints (optional): * 0<=K<M * if given, only first K elements of XC/YC/DC are processed * if not given, automatically determined from XC/YC/DC sizes M - number of basis functions ( = number_of_nodes), M>=4, M IS EVEN! OUTPUT PARAMETERS: Info - same format as in LSFitLinearW() subroutine: * Info>0 task is solved * Info<=0 an error occurred: -4 means inconvergence of internal SVD -3 means inconsistent constraints -2 means odd M was passed (which is not supported) -1 means other errors in parameters passed (N<=0, for example) S - spline interpolant. Rep - report, same format as in LSFitLinearW() subroutine. Following fields are set: * RMSError rms error on the (X,Y). * AvgError average error on the (X,Y). * AvgRelError average relative error on the non-zero Y * MaxError maximum error NON-WEIGHTED ERRORS ARE CALCULATED IMPORTANT: this subroutine doesn't calculate task's condition number for K<>0. IMPORTANT: this subroutine supports only even M's ORDER OF POINTS Subroutine automatically sorts points, so caller may pass unsorted array. SETTING CONSTRAINTS - DANGERS AND OPPORTUNITIES: Setting constraints can lead to undesired results, like ill-conditioned behavior, or inconsistency being detected. On the other hand, constraints allow us to improve the quality of the fit. Here we summarize our experience with constrained regression splines: * excessive constraints can be inconsistent. Splines are piecewise cubic functions, and it is easy to create an example where a large number of constraints concentrated in a small area results in inconsistency, simply because the spline is not flexible enough to satisfy all of them; the same constraints spread across [min(x),max(x)] will be perfectly consistent. * the more evenly constraints are spread across [min(x),max(x)], the better the chances that they will be consistent * the greater M is (given fixed constraints), the better the chances that the constraints will be consistent * in the general case, consistency of constraints is NOT GUARANTEED. * in several special cases, however, we can guarantee consistency. * one of these cases is M>=4 and constraints on the function value (AND/OR its derivative) at the interval boundaries.
* another special case is M>=4 and ONE constraint on the function value (OR, BUT NOT AND, derivative) anywhere in [min(x),max(x)] Our final recommendation is to use constraints WHEN AND ONLY when you can't solve your task without them. Anything beyond special cases given above is not guaranteed and may result in inconsistency. -- ALGLIB PROJECT -- Copyright 18.08.2009 by Bochkanov Sergey *************************************************************************/ void spline1dfithermitewc(const real_1d_array &x, const real_1d_array &y, const real_1d_array &w, const ae_int_t n, const real_1d_array &xc, const real_1d_array &yc, const integer_1d_array &dc, const ae_int_t k, const ae_int_t m, ae_int_t &info, spline1dinterpolant &s, spline1dfitreport &rep) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::spline1dfithermitewc(const_cast(x.c_ptr()), const_cast(y.c_ptr()), const_cast(w.c_ptr()), n, const_cast(xc.c_ptr()), const_cast(yc.c_ptr()), const_cast(dc.c_ptr()), k, m, &info, const_cast(s.c_ptr()), const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void smp_spline1dfithermitewc(const real_1d_array &x, const real_1d_array &y, const real_1d_array &w, const ae_int_t n, const real_1d_array &xc, const real_1d_array &yc, const integer_1d_array &dc, const ae_int_t k, const ae_int_t m, ae_int_t &info, spline1dinterpolant &s, spline1dfitreport &rep) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::_pexec_spline1dfithermitewc(const_cast(x.c_ptr()), const_cast(y.c_ptr()), const_cast(w.c_ptr()), n, const_cast(xc.c_ptr()), const_cast(yc.c_ptr()), const_cast(dc.c_ptr()), k, m, &info, const_cast(s.c_ptr()), const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Weighted fitting by Hermite spline, with constraints on function values or first derivatives. Equidistant grid with M nodes on [min(x,xc),max(x,xc)] is used to build basis functions. Basis functions are Hermite splines. Small regularizing term is used when solving constrained tasks (to improve stability). Task is linear, so linear least squares solver is used. Complexity of this computational scheme is O(N*M^2), mostly dominated by least squares solver SEE ALSO Spline1DFitCubicWC() - fitting by Cubic splines (less flexible, more smooth) Spline1DFitHermite() - "lightweight" Hermite fitting, without invididual weights and constraints COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multithreading support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Speed-up provided by multithreading greatly depends on problem size ! - only large problems (number of coefficients is more than 500) can be ! efficiently multithreaded. ! ! Generally, commercial ALGLIB is several times faster than open-source ! 
generic C edition, and many times faster than open-source C# edition. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS: X - points, array[0..N-1]. Y - function values, array[0..N-1]. W - weights, array[0..N-1] Each summand in square sum of approximation deviations from given values is multiplied by the square of corresponding weight. Fill it by 1's if you don't want to solve weighted task. N - number of points (optional): * N>0 * if given, only first N elements of X/Y/W are processed * if not given, automatically determined from X/Y/W sizes XC - points where spline values/derivatives are constrained, array[0..K-1]. YC - values of constraints, array[0..K-1] DC - array[0..K-1], types of constraints: * DC[i]=0 means that S(XC[i])=YC[i] * DC[i]=1 means that S'(XC[i])=YC[i] SEE BELOW FOR IMPORTANT INFORMATION ON CONSTRAINTS K - number of constraints (optional): * 0<=K=4, M IS EVEN! OUTPUT PARAMETERS: Info- same format as in LSFitLinearW() subroutine: * Info>0 task is solved * Info<=0 an error occured: -4 means inconvergence of internal SVD -3 means inconsistent constraints -2 means odd M was passed (which is not supported) -1 means another errors in parameters passed (N<=0, for example) S - spline interpolant. Rep - report, same format as in LSFitLinearW() subroutine. Following fields are set: * RMSError rms error on the (X,Y). * AvgError average error on the (X,Y). * AvgRelError average relative error on the non-zero Y * MaxError maximum error NON-WEIGHTED ERRORS ARE CALCULATED IMPORTANT: this subroitine doesn't calculate task's condition number for K<>0. IMPORTANT: this subroitine supports only even M's ORDER OF POINTS Subroutine automatically sorts points, so caller may pass unsorted array. SETTING CONSTRAINTS - DANGERS AND OPPORTUNITIES: Setting constraints can lead to undesired results, like ill-conditioned behavior, or inconsistency being detected. From the other side, it allows us to improve quality of the fit. Here we summarize our experience with constrained regression splines: * excessive constraints can be inconsistent. Splines are piecewise cubic functions, and it is easy to create an example, where large number of constraints concentrated in small area will result in inconsistency. Just because spline is not flexible enough to satisfy all of them. And same constraints spread across the [min(x),max(x)] will be perfectly consistent. * the more evenly constraints are spread across [min(x),max(x)], the more chances that they will be consistent * the greater is M (given fixed constraints), the more chances that constraints will be consistent * in the general case, consistency of constraints is NOT GUARANTEED. * in the several special cases, however, we can guarantee consistency. * one of this cases is M>=4 and constraints on the function value (AND/OR its derivative) at the interval boundaries. * another special case is M>=4 and ONE constraint on the function value (OR, BUT NOT AND, derivative) anywhere in [min(x),max(x)] Our final recommendation is to use constraints WHEN AND ONLY when you can't solve your task without them. Anything beyond special cases given above is not guaranteed and may result in inconsistency. 
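An illustrative sketch of a constrained Hermite fit with a first-derivative constraint S'(0)=1 at the left boundary. Note the even M required by this function; the data values below are invented for demonstration.

#include <cstdio>
#include "interpolation.h"

using namespace alglib;

int main()
{
    real_1d_array x  = "[0.0,0.2,0.4,0.6,0.8,1.0]";
    real_1d_array y  = "[0.0,0.2,0.4,0.6,0.8,1.0]";
    real_1d_array w  = "[1,1,1,1,1,1]";

    // single constraint on the first derivative at the left boundary: S'(0)=1 (DC[0]=1)
    real_1d_array xc = "[0.0]";
    real_1d_array yc = "[1.0]";
    integer_1d_array dc = "[1]";

    ae_int_t info;
    spline1dinterpolant s;
    spline1dfitreport rep;

    // Hermite fitting requires an EVEN number of basis functions; M=6 here
    spline1dfithermitewc(x, y, w, xc, yc, dc, 6, info, s, rep);
    if( info>0 )
        printf("S(0.5)=%.4f\n", spline1dcalc(s, 0.5));
    return 0;
}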
-- ALGLIB PROJECT -- Copyright 18.08.2009 by Bochkanov Sergey *************************************************************************/ void spline1dfithermitewc(const real_1d_array &x, const real_1d_array &y, const real_1d_array &w, const real_1d_array &xc, const real_1d_array &yc, const integer_1d_array &dc, const ae_int_t m, ae_int_t &info, spline1dinterpolant &s, spline1dfitreport &rep) { alglib_impl::ae_state _alglib_env_state; ae_int_t n; ae_int_t k; if( (x.length()!=y.length()) || (x.length()!=w.length())) throw ap_error("Error while calling 'spline1dfithermitewc': looks like one of arguments has wrong size"); if( (xc.length()!=yc.length()) || (xc.length()!=dc.length())) throw ap_error("Error while calling 'spline1dfithermitewc': looks like one of arguments has wrong size"); n = x.length(); k = xc.length(); alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::spline1dfithermitewc(const_cast(x.c_ptr()), const_cast(y.c_ptr()), const_cast(w.c_ptr()), n, const_cast(xc.c_ptr()), const_cast(yc.c_ptr()), const_cast(dc.c_ptr()), k, m, &info, const_cast(s.c_ptr()), const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void smp_spline1dfithermitewc(const real_1d_array &x, const real_1d_array &y, const real_1d_array &w, const real_1d_array &xc, const real_1d_array &yc, const integer_1d_array &dc, const ae_int_t m, ae_int_t &info, spline1dinterpolant &s, spline1dfitreport &rep) { alglib_impl::ae_state _alglib_env_state; ae_int_t n; ae_int_t k; if( (x.length()!=y.length()) || (x.length()!=w.length())) throw ap_error("Error while calling 'spline1dfithermitewc': looks like one of arguments has wrong size"); if( (xc.length()!=yc.length()) || (xc.length()!=dc.length())) throw ap_error("Error while calling 'spline1dfithermitewc': looks like one of arguments has wrong size"); n = x.length(); k = xc.length(); alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::_pexec_spline1dfithermitewc(const_cast(x.c_ptr()), const_cast(y.c_ptr()), const_cast(w.c_ptr()), n, const_cast(xc.c_ptr()), const_cast(yc.c_ptr()), const_cast(dc.c_ptr()), k, m, &info, const_cast(s.c_ptr()), const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Least squares fitting by cubic spline. This subroutine is "lightweight" alternative for more complex and feature- rich Spline1DFitCubicWC(). See Spline1DFitCubicWC() for more information about subroutine parameters (we don't duplicate it here because of length) COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multithreading support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Speed-up provided by multithreading greatly depends on problem size ! - only large problems (number of coefficients is more than 500) can be ! efficiently multithreaded. ! ! Generally, commercial ALGLIB is several times faster than open-source ! 
generic C edition, and many times faster than open-source C# edition. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. -- ALGLIB PROJECT -- Copyright 18.08.2009 by Bochkanov Sergey *************************************************************************/ void spline1dfitcubic(const real_1d_array &x, const real_1d_array &y, const ae_int_t n, const ae_int_t m, ae_int_t &info, spline1dinterpolant &s, spline1dfitreport &rep) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::spline1dfitcubic(const_cast(x.c_ptr()), const_cast(y.c_ptr()), n, m, &info, const_cast(s.c_ptr()), const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void smp_spline1dfitcubic(const real_1d_array &x, const real_1d_array &y, const ae_int_t n, const ae_int_t m, ae_int_t &info, spline1dinterpolant &s, spline1dfitreport &rep) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::_pexec_spline1dfitcubic(const_cast(x.c_ptr()), const_cast(y.c_ptr()), n, m, &info, const_cast(s.c_ptr()), const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Least squares fitting by cubic spline. This subroutine is "lightweight" alternative for more complex and feature- rich Spline1DFitCubicWC(). See Spline1DFitCubicWC() for more information about subroutine parameters (we don't duplicate it here because of length) COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multithreading support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Speed-up provided by multithreading greatly depends on problem size ! - only large problems (number of coefficients is more than 500) can be ! efficiently multithreaded. ! ! Generally, commercial ALGLIB is several times faster than open-source ! generic C edition, and many times faster than open-source C# edition. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. 
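A minimal sketch of this 'lightweight' cubic fit (no weights, no constraints); the data values are illustrative only and are not taken from the ALGLIB examples.

#include <cstdio>
#include "interpolation.h"

using namespace alglib;

int main()
{
    // unconstrained, unweighted cubic spline fit on made-up data
    real_1d_array x = "[0.0,0.1,0.2,0.3,0.4,0.5,0.6,0.7,0.8,0.9,1.0]";
    real_1d_array y = "[0.00,0.31,0.59,0.81,0.95,1.00,0.95,0.81,0.59,0.31,0.00]";

    ae_int_t info;
    spline1dinterpolant s;
    spline1dfitreport rep;

    // 'friendly' overload: N is deduced from x/y; M=5 spline nodes
    spline1dfitcubic(x, y, 5, info, s, rep);
    if( info>0 )
        printf("S(0.5)=%.4f, rms=%.4f\n", spline1dcalc(s, 0.5), double(rep.rmserror));
    return 0;
}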
-- ALGLIB PROJECT -- Copyright 18.08.2009 by Bochkanov Sergey *************************************************************************/ void spline1dfitcubic(const real_1d_array &x, const real_1d_array &y, const ae_int_t m, ae_int_t &info, spline1dinterpolant &s, spline1dfitreport &rep) { alglib_impl::ae_state _alglib_env_state; ae_int_t n; if( (x.length()!=y.length())) throw ap_error("Error while calling 'spline1dfitcubic': looks like one of arguments has wrong size"); n = x.length(); alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::spline1dfitcubic(const_cast(x.c_ptr()), const_cast(y.c_ptr()), n, m, &info, const_cast(s.c_ptr()), const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void smp_spline1dfitcubic(const real_1d_array &x, const real_1d_array &y, const ae_int_t m, ae_int_t &info, spline1dinterpolant &s, spline1dfitreport &rep) { alglib_impl::ae_state _alglib_env_state; ae_int_t n; if( (x.length()!=y.length())) throw ap_error("Error while calling 'spline1dfitcubic': looks like one of arguments has wrong size"); n = x.length(); alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::_pexec_spline1dfitcubic(const_cast(x.c_ptr()), const_cast(y.c_ptr()), n, m, &info, const_cast(s.c_ptr()), const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Least squares fitting by Hermite spline. This subroutine is "lightweight" alternative for more complex and feature- rich Spline1DFitHermiteWC(). See Spline1DFitHermiteWC() description for more information about subroutine parameters (we don't duplicate it here because of length). COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multithreading support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Speed-up provided by multithreading greatly depends on problem size ! - only large problems (number of coefficients is more than 500) can be ! efficiently multithreaded. ! ! Generally, commercial ALGLIB is several times faster than open-source ! generic C edition, and many times faster than open-source C# edition. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. 
-- ALGLIB PROJECT -- Copyright 18.08.2009 by Bochkanov Sergey *************************************************************************/ void spline1dfithermite(const real_1d_array &x, const real_1d_array &y, const ae_int_t n, const ae_int_t m, ae_int_t &info, spline1dinterpolant &s, spline1dfitreport &rep) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::spline1dfithermite(const_cast(x.c_ptr()), const_cast(y.c_ptr()), n, m, &info, const_cast(s.c_ptr()), const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void smp_spline1dfithermite(const real_1d_array &x, const real_1d_array &y, const ae_int_t n, const ae_int_t m, ae_int_t &info, spline1dinterpolant &s, spline1dfitreport &rep) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::_pexec_spline1dfithermite(const_cast(x.c_ptr()), const_cast(y.c_ptr()), n, m, &info, const_cast(s.c_ptr()), const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Least squares fitting by Hermite spline. This subroutine is "lightweight" alternative for more complex and feature- rich Spline1DFitHermiteWC(). See Spline1DFitHermiteWC() description for more information about subroutine parameters (we don't duplicate it here because of length). COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multithreading support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Speed-up provided by multithreading greatly depends on problem size ! - only large problems (number of coefficients is more than 500) can be ! efficiently multithreaded. ! ! Generally, commercial ALGLIB is several times faster than open-source ! generic C edition, and many times faster than open-source C# edition. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. 
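A minimal sketch of the lightweight Hermite fit; remember that only even M is supported (an odd M makes the call fail with Info=-2). Data values are illustrative.

#include <cstdio>
#include "interpolation.h"

using namespace alglib;

int main()
{
    real_1d_array x = "[0.0,0.2,0.4,0.6,0.8,1.0]";
    real_1d_array y = "[1.00,0.96,0.84,0.64,0.36,0.00]";

    ae_int_t info;
    spline1dinterpolant s;
    spline1dfitreport rep;

    // M must be even for Hermite fitting; M=4 here
    spline1dfithermite(x, y, 4, info, s, rep);
    if( info>0 )
        printf("S(0.3)=%.4f\n", spline1dcalc(s, 0.3));
    return 0;
}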
-- ALGLIB PROJECT -- Copyright 18.08.2009 by Bochkanov Sergey *************************************************************************/ void spline1dfithermite(const real_1d_array &x, const real_1d_array &y, const ae_int_t m, ae_int_t &info, spline1dinterpolant &s, spline1dfitreport &rep) { alglib_impl::ae_state _alglib_env_state; ae_int_t n; if( (x.length()!=y.length())) throw ap_error("Error while calling 'spline1dfithermite': looks like one of arguments has wrong size"); n = x.length(); alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::spline1dfithermite(const_cast(x.c_ptr()), const_cast(y.c_ptr()), n, m, &info, const_cast(s.c_ptr()), const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void smp_spline1dfithermite(const real_1d_array &x, const real_1d_array &y, const ae_int_t m, ae_int_t &info, spline1dinterpolant &s, spline1dfitreport &rep) { alglib_impl::ae_state _alglib_env_state; ae_int_t n; if( (x.length()!=y.length())) throw ap_error("Error while calling 'spline1dfithermite': looks like one of arguments has wrong size"); n = x.length(); alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::_pexec_spline1dfithermite(const_cast(x.c_ptr()), const_cast(y.c_ptr()), n, m, &info, const_cast(s.c_ptr()), const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Weighted linear least squares fitting. QR decomposition is used to reduce task to MxM, then triangular solver or SVD-based solver is used depending on condition number of the system. It allows to maximize speed and retain decent accuracy. IMPORTANT: if you want to perform polynomial fitting, it may be more convenient to use PolynomialFit() function. This function gives best results on polynomial problems and solves numerical stability issues which arise when you fit high-degree polynomials to your data. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multithreading support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Speed-up provided by multithreading greatly depends on problem size ! - only large problems (number of coefficients is more than 500) can be ! efficiently multithreaded. ! ! Generally, commercial ALGLIB is several times faster than open-source ! generic C edition, and many times faster than open-source C# edition. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS: Y - array[0..N-1] Function values in N points. W - array[0..N-1] Weights corresponding to function values. Each summand in square sum of approximation deviations from given values is multiplied by the square of corresponding weight. FMatrix - a table of basis functions values, array[0..N-1, 0..M-1]. FMatrix[I, J] - value of J-th basis function in I-th point. 
N - number of points used. N>=1. M - number of basis functions, M>=1. OUTPUT PARAMETERS: Info - error code: * -4 internal SVD decomposition subroutine failed (very rare and for degenerate systems only) * -1 incorrect N/M were specified * 1 task is solved C - decomposition coefficients, array[0..M-1] Rep - fitting report. Following fields are set: * Rep.TaskRCond reciprocal of condition number * R2 non-adjusted coefficient of determination (non-weighted) * RMSError rms error on the (X,Y). * AvgError average error on the (X,Y). * AvgRelError average relative error on the non-zero Y * MaxError maximum error NON-WEIGHTED ERRORS ARE CALCULATED ERRORS IN PARAMETERS This solver also calculates different kinds of errors in parameters and fills corresponding fields of report: * Rep.CovPar covariance matrix for parameters, array[K,K]. * Rep.ErrPar errors in parameters, array[K], errpar = sqrt(diag(CovPar)) * Rep.ErrCurve vector of fit errors - standard deviations of empirical best-fit curve from "ideal" best-fit curve built with infinite number of samples, array[N]. errcurve = sqrt(diag(F*CovPar*F')), where F is functions matrix. * Rep.Noise vector of per-point estimates of noise, array[N] NOTE: noise in the data is estimated as follows: * for fitting without user-supplied weights all points are assumed to have same level of noise, which is estimated from the data * for fitting with user-supplied weights we assume that noise level in I-th point is inversely proportional to Ith weight. Coefficient of proportionality is estimated from the data. NOTE: we apply small amount of regularization when we invert squared Jacobian and calculate covariance matrix. It guarantees that algorithm won't divide by zero during inversion, but skews error estimates a bit (fractional error is about 10^-9). However, we believe that this difference is insignificant for all practical purposes except for the situation when you want to compare ALGLIB results with "reference" implementation up to the last significant digit. NOTE: covariance matrix is estimated using correction for degrees of freedom (covariances are divided by N-M instead of dividing by N). 
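The sketch below illustrates how FMatrix is laid out for the 'expert' interface: to fit y ~ c0*1 + c1*x with weights, each row of FMatrix holds the basis functions {1, x} evaluated at one sample point. All data values are invented for illustration.

#include <cstdio>
#include "interpolation.h"

using namespace alglib;

int main()
{
    // weighted linear fit with basis {1, x}; the last point is down-weighted
    real_1d_array y  = "[0.9,2.1,2.9,4.2,5.1]";
    real_1d_array w  = "[1,1,1,1,0.5]";
    real_2d_array fm = "[[1,0],[1,1],[1,2],[1,3],[1,4]]";

    ae_int_t info;
    real_1d_array c;
    lsfitreport rep;

    // expert interface: N=5 points, M=2 basis functions
    lsfitlinearw(y, w, fm, 5, 2, info, c, rep);
    if( info>0 )
        printf("c0=%.3f c1=%.3f, rms=%.4f\n",
               double(c[0]), double(c[1]), double(rep.rmserror));
    return 0;
}

For plain polynomial fitting, PolynomialFit() (mentioned above) is usually preferable because it handles the numerical stability issues of high-degree polynomial bases.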
-- ALGLIB -- Copyright 17.08.2009 by Bochkanov Sergey *************************************************************************/ void lsfitlinearw(const real_1d_array &y, const real_1d_array &w, const real_2d_array &fmatrix, const ae_int_t n, const ae_int_t m, ae_int_t &info, real_1d_array &c, lsfitreport &rep) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::lsfitlinearw(const_cast(y.c_ptr()), const_cast(w.c_ptr()), const_cast(fmatrix.c_ptr()), n, m, &info, const_cast(c.c_ptr()), const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void smp_lsfitlinearw(const real_1d_array &y, const real_1d_array &w, const real_2d_array &fmatrix, const ae_int_t n, const ae_int_t m, ae_int_t &info, real_1d_array &c, lsfitreport &rep) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::_pexec_lsfitlinearw(const_cast(y.c_ptr()), const_cast(w.c_ptr()), const_cast(fmatrix.c_ptr()), n, m, &info, const_cast(c.c_ptr()), const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Weighted linear least squares fitting. QR decomposition is used to reduce task to MxM, then triangular solver or SVD-based solver is used depending on condition number of the system. It allows to maximize speed and retain decent accuracy. IMPORTANT: if you want to perform polynomial fitting, it may be more convenient to use PolynomialFit() function. This function gives best results on polynomial problems and solves numerical stability issues which arise when you fit high-degree polynomials to your data. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multithreading support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Speed-up provided by multithreading greatly depends on problem size ! - only large problems (number of coefficients is more than 500) can be ! efficiently multithreaded. ! ! Generally, commercial ALGLIB is several times faster than open-source ! generic C edition, and many times faster than open-source C# edition. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS: Y - array[0..N-1] Function values in N points. W - array[0..N-1] Weights corresponding to function values. Each summand in square sum of approximation deviations from given values is multiplied by the square of corresponding weight. FMatrix - a table of basis functions values, array[0..N-1, 0..M-1]. FMatrix[I, J] - value of J-th basis function in I-th point. N - number of points used. N>=1. M - number of basis functions, M>=1. 
OUTPUT PARAMETERS: Info - error code: * -4 internal SVD decomposition subroutine failed (very rare and for degenerate systems only) * -1 incorrect N/M were specified * 1 task is solved C - decomposition coefficients, array[0..M-1] Rep - fitting report. Following fields are set: * Rep.TaskRCond reciprocal of condition number * R2 non-adjusted coefficient of determination (non-weighted) * RMSError rms error on the (X,Y). * AvgError average error on the (X,Y). * AvgRelError average relative error on the non-zero Y * MaxError maximum error NON-WEIGHTED ERRORS ARE CALCULATED ERRORS IN PARAMETERS This solver also calculates different kinds of errors in parameters and fills corresponding fields of report: * Rep.CovPar covariance matrix for parameters, array[K,K]. * Rep.ErrPar errors in parameters, array[K], errpar = sqrt(diag(CovPar)) * Rep.ErrCurve vector of fit errors - standard deviations of empirical best-fit curve from "ideal" best-fit curve built with infinite number of samples, array[N]. errcurve = sqrt(diag(F*CovPar*F')), where F is functions matrix. * Rep.Noise vector of per-point estimates of noise, array[N] NOTE: noise in the data is estimated as follows: * for fitting without user-supplied weights all points are assumed to have same level of noise, which is estimated from the data * for fitting with user-supplied weights we assume that noise level in I-th point is inversely proportional to Ith weight. Coefficient of proportionality is estimated from the data. NOTE: we apply small amount of regularization when we invert squared Jacobian and calculate covariance matrix. It guarantees that algorithm won't divide by zero during inversion, but skews error estimates a bit (fractional error is about 10^-9). However, we believe that this difference is insignificant for all practical purposes except for the situation when you want to compare ALGLIB results with "reference" implementation up to the last significant digit. NOTE: covariance matrix is estimated using correction for degrees of freedom (covariances are divided by N-M instead of dividing by N). 
-- ALGLIB -- Copyright 17.08.2009 by Bochkanov Sergey *************************************************************************/ void lsfitlinearw(const real_1d_array &y, const real_1d_array &w, const real_2d_array &fmatrix, ae_int_t &info, real_1d_array &c, lsfitreport &rep) { alglib_impl::ae_state _alglib_env_state; ae_int_t n; ae_int_t m; if( (y.length()!=w.length()) || (y.length()!=fmatrix.rows())) throw ap_error("Error while calling 'lsfitlinearw': looks like one of arguments has wrong size"); n = y.length(); m = fmatrix.cols(); alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::lsfitlinearw(const_cast(y.c_ptr()), const_cast(w.c_ptr()), const_cast(fmatrix.c_ptr()), n, m, &info, const_cast(c.c_ptr()), const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void smp_lsfitlinearw(const real_1d_array &y, const real_1d_array &w, const real_2d_array &fmatrix, ae_int_t &info, real_1d_array &c, lsfitreport &rep) { alglib_impl::ae_state _alglib_env_state; ae_int_t n; ae_int_t m; if( (y.length()!=w.length()) || (y.length()!=fmatrix.rows())) throw ap_error("Error while calling 'lsfitlinearw': looks like one of arguments has wrong size"); n = y.length(); m = fmatrix.cols(); alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::_pexec_lsfitlinearw(const_cast(y.c_ptr()), const_cast(w.c_ptr()), const_cast(fmatrix.c_ptr()), n, m, &info, const_cast(c.c_ptr()), const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Weighted constained linear least squares fitting. This is variation of LSFitLinearW(), which searchs for min|A*x=b| given that K additional constaints C*x=bc are satisfied. It reduces original task to modified one: min|B*y-d| WITHOUT constraints, then LSFitLinearW() is called. IMPORTANT: if you want to perform polynomial fitting, it may be more convenient to use PolynomialFit() function. This function gives best results on polynomial problems and solves numerical stability issues which arise when you fit high-degree polynomials to your data. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multithreading support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Speed-up provided by multithreading greatly depends on problem size ! - only large problems (number of coefficients is more than 500) can be ! efficiently multithreaded. ! ! Generally, commercial ALGLIB is several times faster than open-source ! generic C edition, and many times faster than open-source C# edition. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS: Y - array[0..N-1] Function values in N points. W - array[0..N-1] Weights corresponding to function values. 
Each summand in square sum of approximation deviations from given values is multiplied by the square of corresponding weight. FMatrix - a table of basis functions values, array[0..N-1, 0..M-1]. FMatrix[I,J] - value of J-th basis function in I-th point. CMatrix - a table of constraints, array[0..K-1,0..M]. I-th row of CMatrix corresponds to I-th linear constraint: CMatrix[I,0]*C[0] + ... + CMatrix[I,M-1]*C[M-1] = CMatrix[I,M] N - number of points used. N>=1. M - number of basis functions, M>=1. K - number of constraints, 0 <= K < M K=0 corresponds to absence of constraints. OUTPUT PARAMETERS: Info - error code: * -4 internal SVD decomposition subroutine failed (very rare and for degenerate systems only) * -3 either too many constraints (M or more), degenerate constraints (some constraints are repeated twice) or inconsistent constraints were specified. * 1 task is solved C - decomposition coefficients, array[0..M-1] Rep - fitting report. Following fields are set: * R2 non-adjusted coefficient of determination (non-weighted) * RMSError rms error on the (X,Y). * AvgError average error on the (X,Y). * AvgRelError average relative error on the non-zero Y * MaxError maximum error NON-WEIGHTED ERRORS ARE CALCULATED IMPORTANT: this subroutine doesn't calculate task's condition number for K<>0. ERRORS IN PARAMETERS This solver also calculates different kinds of errors in parameters and fills corresponding fields of report: * Rep.CovPar covariance matrix for parameters, array[K,K]. * Rep.ErrPar errors in parameters, array[K], errpar = sqrt(diag(CovPar)) * Rep.ErrCurve vector of fit errors - standard deviations of empirical best-fit curve from "ideal" best-fit curve built with infinite number of samples, array[N]. errcurve = sqrt(diag(F*CovPar*F')), where F is functions matrix. * Rep.Noise vector of per-point estimates of noise, array[N] IMPORTANT: errors in parameters are calculated without taking into account boundary/linear constraints! Presence of constraints changes distribution of errors, but there is no easy way to account for constraints when you calculate covariance matrix. NOTE: noise in the data is estimated as follows: * for fitting without user-supplied weights all points are assumed to have same level of noise, which is estimated from the data * for fitting with user-supplied weights we assume that noise level in I-th point is inversely proportional to I-th weight. Coefficient of proportionality is estimated from the data. NOTE: we apply small amount of regularization when we invert squared Jacobian and calculate covariance matrix. It guarantees that algorithm won't divide by zero during inversion, but skews error estimates a bit (fractional error is about 10^-9). However, we believe that this difference is insignificant for all practical purposes except for the situation when you want to compare ALGLIB results with "reference" implementation up to the last significant digit. NOTE: covariance matrix is estimated using correction for degrees of freedom (covariances are divided by N-M instead of dividing by N).
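A short sketch of the constraint format described above: with the basis {1, x}, a single CMatrix row [1, 0, 1] encodes 1*C[0] + 0*C[1] = 1, i.e. the fitted line is forced through the point (0,1). All numbers are invented for illustration, and the standard ALGLIB C++ header interpolation.h is assumed.

    #include <cstdio>
    #include "interpolation.h"

    using namespace alglib;

    int main()
    {
        // basis {1, x} at x = 0, 1, 2, 3 (hypothetical data)
        real_2d_array fmatrix = "[[1,0],[1,1],[1,2],[1,3]]";
        real_1d_array y = "[1.1,1.9,3.2,3.9]";
        real_1d_array w = "[1,1,1,1]";
        // one constraint: 1*C[0] + 0*C[1] = 1  (line passes through (0,1))
        real_2d_array cmatrix = "[[1,0,1]]";
        ae_int_t info;
        real_1d_array c;
        lsfitreport rep;

        // expert interface: N=4 points, M=2 basis functions, K=1 constraint
        lsfitlinearwc(y, w, fmatrix, cmatrix, 4, 2, 1, info, c, rep);
        if( info>0 )
            printf("c = [%.3f, %.3f]\n", c[0], c[1]);
        return 0;
    }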
-- ALGLIB -- Copyright 07.09.2009 by Bochkanov Sergey *************************************************************************/ void lsfitlinearwc(const real_1d_array &y, const real_1d_array &w, const real_2d_array &fmatrix, const real_2d_array &cmatrix, const ae_int_t n, const ae_int_t m, const ae_int_t k, ae_int_t &info, real_1d_array &c, lsfitreport &rep) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::lsfitlinearwc(const_cast(y.c_ptr()), const_cast(w.c_ptr()), const_cast(fmatrix.c_ptr()), const_cast(cmatrix.c_ptr()), n, m, k, &info, const_cast(c.c_ptr()), const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void smp_lsfitlinearwc(const real_1d_array &y, const real_1d_array &w, const real_2d_array &fmatrix, const real_2d_array &cmatrix, const ae_int_t n, const ae_int_t m, const ae_int_t k, ae_int_t &info, real_1d_array &c, lsfitreport &rep) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::_pexec_lsfitlinearwc(const_cast(y.c_ptr()), const_cast(w.c_ptr()), const_cast(fmatrix.c_ptr()), const_cast(cmatrix.c_ptr()), n, m, k, &info, const_cast(c.c_ptr()), const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Weighted constained linear least squares fitting. This is variation of LSFitLinearW(), which searchs for min|A*x=b| given that K additional constaints C*x=bc are satisfied. It reduces original task to modified one: min|B*y-d| WITHOUT constraints, then LSFitLinearW() is called. IMPORTANT: if you want to perform polynomial fitting, it may be more convenient to use PolynomialFit() function. This function gives best results on polynomial problems and solves numerical stability issues which arise when you fit high-degree polynomials to your data. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multithreading support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Speed-up provided by multithreading greatly depends on problem size ! - only large problems (number of coefficients is more than 500) can be ! efficiently multithreaded. ! ! Generally, commercial ALGLIB is several times faster than open-source ! generic C edition, and many times faster than open-source C# edition. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS: Y - array[0..N-1] Function values in N points. W - array[0..N-1] Weights corresponding to function values. Each summand in square sum of approximation deviations from given values is multiplied by the square of corresponding weight. FMatrix - a table of basis functions values, array[0..N-1, 0..M-1]. FMatrix[I,J] - value of J-th basis function in I-th point. 
CMatrix - a table of constraints, array[0..K-1,0..M]. I-th row of CMatrix corresponds to I-th linear constraint: CMatrix[I,0]*C[0] + ... + CMatrix[I,M-1]*C[M-1] = CMatrix[I,M] N - number of points used. N>=1. M - number of basis functions, M>=1. K - number of constraints, 0 <= K < M K=0 corresponds to absence of constraints. OUTPUT PARAMETERS: Info - error code: * -4 internal SVD decomposition subroutine failed (very rare and for degenerate systems only) * -3 either too many constraints (M or more), degenerate constraints (some constraints are repeated twice) or inconsistent constraints were specified. * 1 task is solved C - decomposition coefficients, array[0..M-1] Rep - fitting report. Following fields are set: * R2 non-adjusted coefficient of determination (non-weighted) * RMSError rms error on the (X,Y). * AvgError average error on the (X,Y). * AvgRelError average relative error on the non-zero Y * MaxError maximum error NON-WEIGHTED ERRORS ARE CALCULATED IMPORTANT: this subroutine doesn't calculate task's condition number for K<>0. ERRORS IN PARAMETERS This solver also calculates different kinds of errors in parameters and fills corresponding fields of report: * Rep.CovPar covariance matrix for parameters, array[K,K]. * Rep.ErrPar errors in parameters, array[K], errpar = sqrt(diag(CovPar)) * Rep.ErrCurve vector of fit errors - standard deviations of empirical best-fit curve from "ideal" best-fit curve built with infinite number of samples, array[N]. errcurve = sqrt(diag(F*CovPar*F')), where F is functions matrix. * Rep.Noise vector of per-point estimates of noise, array[N] IMPORTANT: errors in parameters are calculated without taking into account boundary/linear constraints! Presence of constraints changes distribution of errors, but there is no easy way to account for constraints when you calculate covariance matrix. NOTE: noise in the data is estimated as follows: * for fitting without user-supplied weights all points are assumed to have same level of noise, which is estimated from the data * for fitting with user-supplied weights we assume that noise level in I-th point is inversely proportional to I-th weight. Coefficient of proportionality is estimated from the data. NOTE: we apply small amount of regularization when we invert squared Jacobian and calculate covariance matrix. It guarantees that algorithm won't divide by zero during inversion, but skews error estimates a bit (fractional error is about 10^-9). However, we believe that this difference is insignificant for all practical purposes except for the situation when you want to compare ALGLIB results with "reference" implementation up to the last significant digit. NOTE: covariance matrix is estimated using correction for degrees of freedom (covariances are divided by N-M instead of dividing by N).
-- ALGLIB -- Copyright 07.09.2009 by Bochkanov Sergey *************************************************************************/ void lsfitlinearwc(const real_1d_array &y, const real_1d_array &w, const real_2d_array &fmatrix, const real_2d_array &cmatrix, ae_int_t &info, real_1d_array &c, lsfitreport &rep) { alglib_impl::ae_state _alglib_env_state; ae_int_t n; ae_int_t m; ae_int_t k; if( (y.length()!=w.length()) || (y.length()!=fmatrix.rows())) throw ap_error("Error while calling 'lsfitlinearwc': looks like one of arguments has wrong size"); if( (fmatrix.cols()!=cmatrix.cols()-1)) throw ap_error("Error while calling 'lsfitlinearwc': looks like one of arguments has wrong size"); n = y.length(); m = fmatrix.cols(); k = cmatrix.rows(); alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::lsfitlinearwc(const_cast(y.c_ptr()), const_cast(w.c_ptr()), const_cast(fmatrix.c_ptr()), const_cast(cmatrix.c_ptr()), n, m, k, &info, const_cast(c.c_ptr()), const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void smp_lsfitlinearwc(const real_1d_array &y, const real_1d_array &w, const real_2d_array &fmatrix, const real_2d_array &cmatrix, ae_int_t &info, real_1d_array &c, lsfitreport &rep) { alglib_impl::ae_state _alglib_env_state; ae_int_t n; ae_int_t m; ae_int_t k; if( (y.length()!=w.length()) || (y.length()!=fmatrix.rows())) throw ap_error("Error while calling 'lsfitlinearwc': looks like one of arguments has wrong size"); if( (fmatrix.cols()!=cmatrix.cols()-1)) throw ap_error("Error while calling 'lsfitlinearwc': looks like one of arguments has wrong size"); n = y.length(); m = fmatrix.cols(); k = cmatrix.rows(); alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::_pexec_lsfitlinearwc(const_cast(y.c_ptr()), const_cast(w.c_ptr()), const_cast(fmatrix.c_ptr()), const_cast(cmatrix.c_ptr()), n, m, k, &info, const_cast(c.c_ptr()), const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Linear least squares fitting. QR decomposition is used to reduce task to MxM, then triangular solver or SVD-based solver is used depending on condition number of the system. It allows to maximize speed and retain decent accuracy. IMPORTANT: if you want to perform polynomial fitting, it may be more convenient to use PolynomialFit() function. This function gives best results on polynomial problems and solves numerical stability issues which arise when you fit high-degree polynomials to your data. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multithreading support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Speed-up provided by multithreading greatly depends on problem size ! - only large problems (number of coefficients is more than 500) can be ! efficiently multithreaded. ! ! Generally, commercial ALGLIB is several times faster than open-source ! 
generic C edition, and many times faster than open-source C# edition. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS: Y - array[0..N-1] Function values in N points. FMatrix - a table of basis functions values, array[0..N-1, 0..M-1]. FMatrix[I, J] - value of J-th basis function in I-th point. N - number of points used. N>=1. M - number of basis functions, M>=1. OUTPUT PARAMETERS: Info - error code: * -4 internal SVD decomposition subroutine failed (very rare and for degenerate systems only) * 1 task is solved C - decomposition coefficients, array[0..M-1] Rep - fitting report. Following fields are set: * Rep.TaskRCond reciprocal of condition number * R2 non-adjusted coefficient of determination (non-weighted) * RMSError rms error on the (X,Y). * AvgError average error on the (X,Y). * AvgRelError average relative error on the non-zero Y * MaxError maximum error NON-WEIGHTED ERRORS ARE CALCULATED ERRORS IN PARAMETERS This solver also calculates different kinds of errors in parameters and fills corresponding fields of report: * Rep.CovPar covariance matrix for parameters, array[K,K]. * Rep.ErrPar errors in parameters, array[K], errpar = sqrt(diag(CovPar)) * Rep.ErrCurve vector of fit errors - standard deviations of empirical best-fit curve from "ideal" best-fit curve built with infinite number of samples, array[N]. errcurve = sqrt(diag(F*CovPar*F')), where F is functions matrix. * Rep.Noise vector of per-point estimates of noise, array[N] NOTE: noise in the data is estimated as follows: * for fitting without user-supplied weights all points are assumed to have same level of noise, which is estimated from the data * for fitting with user-supplied weights we assume that noise level in I-th point is inversely proportional to Ith weight. Coefficient of proportionality is estimated from the data. NOTE: we apply small amount of regularization when we invert squared Jacobian and calculate covariance matrix. It guarantees that algorithm won't divide by zero during inversion, but skews error estimates a bit (fractional error is about 10^-9). However, we believe that this difference is insignificant for all practical purposes except for the situation when you want to compare ALGLIB results with "reference" implementation up to the last significant digit. NOTE: covariance matrix is estimated using correction for degrees of freedom (covariances are divided by N-M instead of dividing by N). 
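The unweighted variant is called the same way, minus the weight vector. A minimal sketch with invented data follows; the 'expert' interface shown here passes N and M explicitly, while the 'friendly' overload documented further below infers them from the array sizes. The ALGLIB C++ header interpolation.h is assumed.

    #include <cstdio>
    #include "interpolation.h"

    using namespace alglib;

    int main()
    {
        // basis {1, x} at x = 0, 1, 2, 3 (hypothetical data)
        real_2d_array fmatrix = "[[1,0],[1,1],[1,2],[1,3]]";
        real_1d_array y = "[0.9,2.1,2.9,4.2]";
        ae_int_t info;
        real_1d_array c;
        lsfitreport rep;

        lsfitlinear(y, fmatrix, 4, 2, info, c, rep);   // N=4 points, M=2 basis functions
        if( info>0 )
            printf("c = [%.3f, %.3f], rms = %.3f\n", c[0], c[1], rep.rmserror);
        return 0;
    }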
-- ALGLIB -- Copyright 17.08.2009 by Bochkanov Sergey *************************************************************************/ void lsfitlinear(const real_1d_array &y, const real_2d_array &fmatrix, const ae_int_t n, const ae_int_t m, ae_int_t &info, real_1d_array &c, lsfitreport &rep) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::lsfitlinear(const_cast(y.c_ptr()), const_cast(fmatrix.c_ptr()), n, m, &info, const_cast(c.c_ptr()), const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void smp_lsfitlinear(const real_1d_array &y, const real_2d_array &fmatrix, const ae_int_t n, const ae_int_t m, ae_int_t &info, real_1d_array &c, lsfitreport &rep) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::_pexec_lsfitlinear(const_cast(y.c_ptr()), const_cast(fmatrix.c_ptr()), n, m, &info, const_cast(c.c_ptr()), const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Linear least squares fitting. QR decomposition is used to reduce task to MxM, then triangular solver or SVD-based solver is used depending on condition number of the system. It allows to maximize speed and retain decent accuracy. IMPORTANT: if you want to perform polynomial fitting, it may be more convenient to use PolynomialFit() function. This function gives best results on polynomial problems and solves numerical stability issues which arise when you fit high-degree polynomials to your data. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multithreading support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Speed-up provided by multithreading greatly depends on problem size ! - only large problems (number of coefficients is more than 500) can be ! efficiently multithreaded. ! ! Generally, commercial ALGLIB is several times faster than open-source ! generic C edition, and many times faster than open-source C# edition. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS: Y - array[0..N-1] Function values in N points. FMatrix - a table of basis functions values, array[0..N-1, 0..M-1]. FMatrix[I, J] - value of J-th basis function in I-th point. N - number of points used. N>=1. M - number of basis functions, M>=1. OUTPUT PARAMETERS: Info - error code: * -4 internal SVD decomposition subroutine failed (very rare and for degenerate systems only) * 1 task is solved C - decomposition coefficients, array[0..M-1] Rep - fitting report. Following fields are set: * Rep.TaskRCond reciprocal of condition number * R2 non-adjusted coefficient of determination (non-weighted) * RMSError rms error on the (X,Y). * AvgError average error on the (X,Y). 
* AvgRelError average relative error on the non-zero Y * MaxError maximum error NON-WEIGHTED ERRORS ARE CALCULATED ERRORS IN PARAMETERS This solver also calculates different kinds of errors in parameters and fills corresponding fields of report: * Rep.CovPar covariance matrix for parameters, array[K,K]. * Rep.ErrPar errors in parameters, array[K], errpar = sqrt(diag(CovPar)) * Rep.ErrCurve vector of fit errors - standard deviations of empirical best-fit curve from "ideal" best-fit curve built with infinite number of samples, array[N]. errcurve = sqrt(diag(F*CovPar*F')), where F is functions matrix. * Rep.Noise vector of per-point estimates of noise, array[N] NOTE: noise in the data is estimated as follows: * for fitting without user-supplied weights all points are assumed to have same level of noise, which is estimated from the data * for fitting with user-supplied weights we assume that noise level in I-th point is inversely proportional to Ith weight. Coefficient of proportionality is estimated from the data. NOTE: we apply small amount of regularization when we invert squared Jacobian and calculate covariance matrix. It guarantees that algorithm won't divide by zero during inversion, but skews error estimates a bit (fractional error is about 10^-9). However, we believe that this difference is insignificant for all practical purposes except for the situation when you want to compare ALGLIB results with "reference" implementation up to the last significant digit. NOTE: covariance matrix is estimated using correction for degrees of freedom (covariances are divided by N-M instead of dividing by N). -- ALGLIB -- Copyright 17.08.2009 by Bochkanov Sergey *************************************************************************/ void lsfitlinear(const real_1d_array &y, const real_2d_array &fmatrix, ae_int_t &info, real_1d_array &c, lsfitreport &rep) { alglib_impl::ae_state _alglib_env_state; ae_int_t n; ae_int_t m; if( (y.length()!=fmatrix.rows())) throw ap_error("Error while calling 'lsfitlinear': looks like one of arguments has wrong size"); n = y.length(); m = fmatrix.cols(); alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::lsfitlinear(const_cast(y.c_ptr()), const_cast(fmatrix.c_ptr()), n, m, &info, const_cast(c.c_ptr()), const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void smp_lsfitlinear(const real_1d_array &y, const real_2d_array &fmatrix, ae_int_t &info, real_1d_array &c, lsfitreport &rep) { alglib_impl::ae_state _alglib_env_state; ae_int_t n; ae_int_t m; if( (y.length()!=fmatrix.rows())) throw ap_error("Error while calling 'lsfitlinear': looks like one of arguments has wrong size"); n = y.length(); m = fmatrix.cols(); alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::_pexec_lsfitlinear(const_cast(y.c_ptr()), const_cast(fmatrix.c_ptr()), n, m, &info, const_cast(c.c_ptr()), const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Constained linear least squares fitting. This is variation of LSFitLinear(), which searchs for min|A*x=b| given that K additional constaints C*x=bc are satisfied. It reduces original task to modified one: min|B*y-d| WITHOUT constraints, then LSFitLinear() is called. 
IMPORTANT: if you want to perform polynomial fitting, it may be more convenient to use PolynomialFit() function. This function gives best results on polynomial problems and solves numerical stability issues which arise when you fit high-degree polynomials to your data. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multithreading support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Speed-up provided by multithreading greatly depends on problem size ! - only large problems (number of coefficients is more than 500) can be ! efficiently multithreaded. ! ! Generally, commercial ALGLIB is several times faster than open-source ! generic C edition, and many times faster than open-source C# edition. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS: Y - array[0..N-1] Function values in N points. FMatrix - a table of basis functions values, array[0..N-1, 0..M-1]. FMatrix[I,J] - value of J-th basis function in I-th point. CMatrix - a table of constraints, array[0..K-1,0..M]. I-th row of CMatrix corresponds to I-th linear constraint: CMatrix[I,0]*C[0] + ... + CMatrix[I,M-1]*C[M-1] = CMatrix[I,M] N - number of points used. N>=1. M - number of basis functions, M>=1. K - number of constraints, 0 <= K < M K=0 corresponds to absence of constraints. OUTPUT PARAMETERS: Info - error code: * -4 internal SVD decomposition subroutine failed (very rare and for degenerate systems only) * -3 either too many constraints (M or more), degenerate constraints (some constraints are repeated twice) or inconsistent constraints were specified. * 1 task is solved C - decomposition coefficients, array[0..M-1] Rep - fitting report. Following fields are set: * R2 non-adjusted coefficient of determination (non-weighted) * RMSError rms error on the (X,Y). * AvgError average error on the (X,Y). * AvgRelError average relative error on the non-zero Y * MaxError maximum error NON-WEIGHTED ERRORS ARE CALCULATED IMPORTANT: this subroutine doesn't calculate task's condition number for K<>0. ERRORS IN PARAMETERS This solver also calculates different kinds of errors in parameters and fills corresponding fields of report: * Rep.CovPar covariance matrix for parameters, array[K,K]. * Rep.ErrPar errors in parameters, array[K], errpar = sqrt(diag(CovPar)) * Rep.ErrCurve vector of fit errors - standard deviations of empirical best-fit curve from "ideal" best-fit curve built with infinite number of samples, array[N]. errcurve = sqrt(diag(F*CovPar*F')), where F is functions matrix. * Rep.Noise vector of per-point estimates of noise, array[N] IMPORTANT: errors in parameters are calculated without taking into account boundary/linear constraints! Presence of constraints changes distribution of errors, but there is no easy way to account for constraints when you calculate covariance matrix.
NOTE: noise in the data is estimated as follows: * for fitting without user-supplied weights all points are assumed to have same level of noise, which is estimated from the data * for fitting with user-supplied weights we assume that noise level in I-th point is inversely proportional to Ith weight. Coefficient of proportionality is estimated from the data. NOTE: we apply small amount of regularization when we invert squared Jacobian and calculate covariance matrix. It guarantees that algorithm won't divide by zero during inversion, but skews error estimates a bit (fractional error is about 10^-9). However, we believe that this difference is insignificant for all practical purposes except for the situation when you want to compare ALGLIB results with "reference" implementation up to the last significant digit. NOTE: covariance matrix is estimated using correction for degrees of freedom (covariances are divided by N-M instead of dividing by N). -- ALGLIB -- Copyright 07.09.2009 by Bochkanov Sergey *************************************************************************/ void lsfitlinearc(const real_1d_array &y, const real_2d_array &fmatrix, const real_2d_array &cmatrix, const ae_int_t n, const ae_int_t m, const ae_int_t k, ae_int_t &info, real_1d_array &c, lsfitreport &rep) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::lsfitlinearc(const_cast(y.c_ptr()), const_cast(fmatrix.c_ptr()), const_cast(cmatrix.c_ptr()), n, m, k, &info, const_cast(c.c_ptr()), const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void smp_lsfitlinearc(const real_1d_array &y, const real_2d_array &fmatrix, const real_2d_array &cmatrix, const ae_int_t n, const ae_int_t m, const ae_int_t k, ae_int_t &info, real_1d_array &c, lsfitreport &rep) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::_pexec_lsfitlinearc(const_cast(y.c_ptr()), const_cast(fmatrix.c_ptr()), const_cast(cmatrix.c_ptr()), n, m, k, &info, const_cast(c.c_ptr()), const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Constained linear least squares fitting. This is variation of LSFitLinear(), which searchs for min|A*x=b| given that K additional constaints C*x=bc are satisfied. It reduces original task to modified one: min|B*y-d| WITHOUT constraints, then LSFitLinear() is called. IMPORTANT: if you want to perform polynomial fitting, it may be more convenient to use PolynomialFit() function. This function gives best results on polynomial problems and solves numerical stability issues which arise when you fit high-degree polynomials to your data. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multithreading support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Speed-up provided by multithreading greatly depends on problem size ! 
- only large problems (number of coefficients is more than 500) can be ! efficiently multithreaded. ! ! Generally, commercial ALGLIB is several times faster than open-source ! generic C edition, and many times faster than open-source C# edition. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS: Y - array[0..N-1] Function values in N points. FMatrix - a table of basis functions values, array[0..N-1, 0..M-1]. FMatrix[I,J] - value of J-th basis function in I-th point. CMatrix - a table of constraints, array[0..K-1,0..M]. I-th row of CMatrix corresponds to I-th linear constraint: CMatrix[I,0]*C[0] + ... + CMatrix[I,M-1]*C[M-1] = CMatrix[I,M] N - number of points used. N>=1. M - number of basis functions, M>=1. K - number of constraints, 0 <= K < M K=0 corresponds to absence of constraints. OUTPUT PARAMETERS: Info - error code: * -4 internal SVD decomposition subroutine failed (very rare and for degenerate systems only) * -3 either too many constraints (M or more), degenerate constraints (some constraints are repeated twice) or inconsistent constraints were specified. * 1 task is solved C - decomposition coefficients, array[0..M-1] Rep - fitting report. Following fields are set: * R2 non-adjusted coefficient of determination (non-weighted) * RMSError rms error on the (X,Y). * AvgError average error on the (X,Y). * AvgRelError average relative error on the non-zero Y * MaxError maximum error NON-WEIGHTED ERRORS ARE CALCULATED IMPORTANT: this subroutine doesn't calculate task's condition number for K<>0. ERRORS IN PARAMETERS This solver also calculates different kinds of errors in parameters and fills corresponding fields of report: * Rep.CovPar covariance matrix for parameters, array[K,K]. * Rep.ErrPar errors in parameters, array[K], errpar = sqrt(diag(CovPar)) * Rep.ErrCurve vector of fit errors - standard deviations of empirical best-fit curve from "ideal" best-fit curve built with infinite number of samples, array[N]. errcurve = sqrt(diag(F*CovPar*F')), where F is functions matrix. * Rep.Noise vector of per-point estimates of noise, array[N] IMPORTANT: errors in parameters are calculated without taking into account boundary/linear constraints! Presence of constraints changes distribution of errors, but there is no easy way to account for constraints when you calculate covariance matrix. NOTE: noise in the data is estimated as follows: * for fitting without user-supplied weights all points are assumed to have same level of noise, which is estimated from the data * for fitting with user-supplied weights we assume that noise level in I-th point is inversely proportional to I-th weight. Coefficient of proportionality is estimated from the data. NOTE: we apply small amount of regularization when we invert squared Jacobian and calculate covariance matrix. It guarantees that algorithm won't divide by zero during inversion, but skews error estimates a bit (fractional error is about 10^-9). However, we believe that this difference is insignificant for all practical purposes except for the situation when you want to compare ALGLIB results with "reference" implementation up to the last significant digit. NOTE: covariance matrix is estimated using correction for degrees of freedom (covariances are divided by N-M instead of dividing by N).
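For the 'friendly' overload documented above, N, M and K are inferred from the array sizes, so a constrained unweighted fit reduces to the sketch below. Data and constraint are again invented for illustration, and the ALGLIB C++ header interpolation.h is assumed.

    #include <cstdio>
    #include "interpolation.h"

    using namespace alglib;

    int main()
    {
        // basis {1, x} at x = 0, 1, 2, 3 (hypothetical data)
        real_2d_array fmatrix = "[[1,0],[1,1],[1,2],[1,3]]";
        real_1d_array y = "[1.1,1.9,3.2,3.9]";
        // one constraint: 1*C[0] + 0*C[1] = 1  (fit passes through (0,1))
        real_2d_array cmatrix = "[[1,0,1]]";
        ae_int_t info;
        real_1d_array c;
        lsfitreport rep;

        // 'friendly' interface: N, M, K are deduced from array sizes
        lsfitlinearc(y, fmatrix, cmatrix, info, c, rep);
        if( info>0 )
            printf("c = [%.3f, %.3f]\n", c[0], c[1]);
        return 0;
    }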
-- ALGLIB -- Copyright 07.09.2009 by Bochkanov Sergey *************************************************************************/ void lsfitlinearc(const real_1d_array &y, const real_2d_array &fmatrix, const real_2d_array &cmatrix, ae_int_t &info, real_1d_array &c, lsfitreport &rep) { alglib_impl::ae_state _alglib_env_state; ae_int_t n; ae_int_t m; ae_int_t k; if( (y.length()!=fmatrix.rows())) throw ap_error("Error while calling 'lsfitlinearc': looks like one of arguments has wrong size"); if( (fmatrix.cols()!=cmatrix.cols()-1)) throw ap_error("Error while calling 'lsfitlinearc': looks like one of arguments has wrong size"); n = y.length(); m = fmatrix.cols(); k = cmatrix.rows(); alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::lsfitlinearc(const_cast(y.c_ptr()), const_cast(fmatrix.c_ptr()), const_cast(cmatrix.c_ptr()), n, m, k, &info, const_cast(c.c_ptr()), const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void smp_lsfitlinearc(const real_1d_array &y, const real_2d_array &fmatrix, const real_2d_array &cmatrix, ae_int_t &info, real_1d_array &c, lsfitreport &rep) { alglib_impl::ae_state _alglib_env_state; ae_int_t n; ae_int_t m; ae_int_t k; if( (y.length()!=fmatrix.rows())) throw ap_error("Error while calling 'lsfitlinearc': looks like one of arguments has wrong size"); if( (fmatrix.cols()!=cmatrix.cols()-1)) throw ap_error("Error while calling 'lsfitlinearc': looks like one of arguments has wrong size"); n = y.length(); m = fmatrix.cols(); k = cmatrix.rows(); alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::_pexec_lsfitlinearc(const_cast(y.c_ptr()), const_cast(fmatrix.c_ptr()), const_cast(cmatrix.c_ptr()), n, m, k, &info, const_cast(c.c_ptr()), const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Weighted nonlinear least squares fitting using function values only. Combination of numerical differentiation and secant updates is used to obtain function Jacobian. Nonlinear task min(F(c)) is solved, where F(c) = (w[0]*(f(c,x[0])-y[0]))^2 + ... + (w[n-1]*(f(c,x[n-1])-y[n-1]))^2, * N is a number of points, * M is a dimension of a space points belong to, * K is a dimension of a space of parameters being fitted, * w is an N-dimensional vector of weight coefficients, * x is a set of N points, each of them is an M-dimensional vector, * c is a K-dimensional vector of parameters being fitted This subroutine uses only f(c,x[i]). INPUT PARAMETERS: X - array[0..N-1,0..M-1], points (one row = one point) Y - array[0..N-1], function values. 
W - weights, array[0..N-1] C - array[0..K-1], initial approximation to the solution, N - number of points, N>1 M - dimension of space K - number of parameters being fitted DiffStep- numerical differentiation step; should not be very small or large; large = loss of accuracy small = growth of round-off errors OUTPUT PARAMETERS: State - structure which stores algorithm state -- ALGLIB -- Copyright 18.10.2008 by Bochkanov Sergey *************************************************************************/ void lsfitcreatewf(const real_2d_array &x, const real_1d_array &y, const real_1d_array &w, const real_1d_array &c, const ae_int_t n, const ae_int_t m, const ae_int_t k, const double diffstep, lsfitstate &state) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::lsfitcreatewf(const_cast(x.c_ptr()), const_cast(y.c_ptr()), const_cast(w.c_ptr()), const_cast(c.c_ptr()), n, m, k, diffstep, const_cast(state.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Weighted nonlinear least squares fitting using function values only. Combination of numerical differentiation and secant updates is used to obtain function Jacobian. Nonlinear task min(F(c)) is solved, where F(c) = (w[0]*(f(c,x[0])-y[0]))^2 + ... + (w[n-1]*(f(c,x[n-1])-y[n-1]))^2, * N is a number of points, * M is a dimension of a space points belong to, * K is a dimension of a space of parameters being fitted, * w is an N-dimensional vector of weight coefficients, * x is a set of N points, each of them is an M-dimensional vector, * c is a K-dimensional vector of parameters being fitted This subroutine uses only f(c,x[i]). INPUT PARAMETERS: X - array[0..N-1,0..M-1], points (one row = one point) Y - array[0..N-1], function values. W - weights, array[0..N-1] C - array[0..K-1], initial approximation to the solution, N - number of points, N>1 M - dimension of space K - number of parameters being fitted DiffStep- numerical differentiation step; should not be very small or large; large = loss of accuracy small = growth of round-off errors OUTPUT PARAMETERS: State - structure which stores algorithm state -- ALGLIB -- Copyright 18.10.2008 by Bochkanov Sergey *************************************************************************/ void lsfitcreatewf(const real_2d_array &x, const real_1d_array &y, const real_1d_array &w, const real_1d_array &c, const double diffstep, lsfitstate &state) { alglib_impl::ae_state _alglib_env_state; ae_int_t n; ae_int_t m; ae_int_t k; if( (x.rows()!=y.length()) || (x.rows()!=w.length())) throw ap_error("Error while calling 'lsfitcreatewf': looks like one of arguments has wrong size"); n = x.rows(); m = x.cols(); k = c.length(); alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::lsfitcreatewf(const_cast(x.c_ptr()), const_cast(y.c_ptr()), const_cast(w.c_ptr()), const_cast(c.c_ptr()), n, m, k, diffstep, const_cast(state.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Nonlinear least squares fitting using function values only. Combination of numerical differentiation and secant updates is used to obtain function Jacobian. 
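LSFitCreateWF only prepares the solver state; the actual optimization is driven by the lsfitfit()/lsfitresults() routines from the same lsfit subpackage (documented further in this manual), with stopping criteria optionally tuned via LSFitSetCond. A minimal end-to-end sketch, assuming the model f(c,x) = exp(-c0*x), invented data and the ALGLIB C++ header interpolation.h:

    #include <cstdio>
    #include <cmath>
    #include "interpolation.h"

    using namespace alglib;

    // model f(c,x) = exp(-c[0]*x[0]); only function values are supplied,
    // the Jacobian is built internally by numerical differentiation
    static void f_exp(const real_1d_array &c, const real_1d_array &x, double &func, void *ptr)
    {
        func = exp(-c[0]*x[0]);
    }

    int main()
    {
        real_2d_array x = "[[0.0],[0.1],[0.2],[0.3],[0.4],[0.5],[0.6],[0.7],[0.8],[0.9],[1.0]]";
        real_1d_array y = "[1.00,0.86,0.74,0.64,0.55,0.47,0.41,0.35,0.30,0.26,0.22]";  // hypothetical data
        real_1d_array w = "[1,1,1,1,1,1,1,1,1,1,1]";
        real_1d_array c = "[0.3]";      // initial guess for the single parameter
        double diffstep = 0.0001;       // numerical differentiation step
        ae_int_t info;
        lsfitstate state;
        lsfitreport rep;

        lsfitcreatewf(x, y, w, c, diffstep, state);   // 'friendly' interface: sizes deduced
        lsfitfit(state, f_exp);                       // run the optimizer with default stopping criteria
        lsfitresults(state, info, c, rep);
        if( info>0 )
            printf("c0 = %.3f\n", c[0]);
        return 0;
    }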
Nonlinear task min(F(c)) is solved, where F(c) = (f(c,x[0])-y[0])^2 + ... + (f(c,x[n-1])-y[n-1])^2, * N is a number of points, * M is a dimension of a space points belong to, * K is a dimension of a space of parameters being fitted, * w is an N-dimensional vector of weight coefficients, * x is a set of N points, each of them is an M-dimensional vector, * c is a K-dimensional vector of parameters being fitted This subroutine uses only f(c,x[i]). INPUT PARAMETERS: X - array[0..N-1,0..M-1], points (one row = one point) Y - array[0..N-1], function values. C - array[0..K-1], initial approximation to the solution, N - number of points, N>1 M - dimension of space K - number of parameters being fitted DiffStep- numerical differentiation step; should not be very small or large; large = loss of accuracy small = growth of round-off errors OUTPUT PARAMETERS: State - structure which stores algorithm state -- ALGLIB -- Copyright 18.10.2008 by Bochkanov Sergey *************************************************************************/ void lsfitcreatef(const real_2d_array &x, const real_1d_array &y, const real_1d_array &c, const ae_int_t n, const ae_int_t m, const ae_int_t k, const double diffstep, lsfitstate &state) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::lsfitcreatef(const_cast(x.c_ptr()), const_cast(y.c_ptr()), const_cast(c.c_ptr()), n, m, k, diffstep, const_cast(state.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Nonlinear least squares fitting using function values only. Combination of numerical differentiation and secant updates is used to obtain function Jacobian. Nonlinear task min(F(c)) is solved, where F(c) = (f(c,x[0])-y[0])^2 + ... + (f(c,x[n-1])-y[n-1])^2, * N is a number of points, * M is a dimension of a space points belong to, * K is a dimension of a space of parameters being fitted, * w is an N-dimensional vector of weight coefficients, * x is a set of N points, each of them is an M-dimensional vector, * c is a K-dimensional vector of parameters being fitted This subroutine uses only f(c,x[i]). INPUT PARAMETERS: X - array[0..N-1,0..M-1], points (one row = one point) Y - array[0..N-1], function values. 
C - array[0..K-1], initial approximation to the solution, N - number of points, N>1 M - dimension of space K - number of parameters being fitted DiffStep- numerical differentiation step; should not be very small or large; large = loss of accuracy small = growth of round-off errors OUTPUT PARAMETERS: State - structure which stores algorithm state -- ALGLIB -- Copyright 18.10.2008 by Bochkanov Sergey *************************************************************************/ void lsfitcreatef(const real_2d_array &x, const real_1d_array &y, const real_1d_array &c, const double diffstep, lsfitstate &state) { alglib_impl::ae_state _alglib_env_state; ae_int_t n; ae_int_t m; ae_int_t k; if( (x.rows()!=y.length())) throw ap_error("Error while calling 'lsfitcreatef': looks like one of arguments has wrong size"); n = x.rows(); m = x.cols(); k = c.length(); alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::lsfitcreatef(const_cast(x.c_ptr()), const_cast(y.c_ptr()), const_cast(c.c_ptr()), n, m, k, diffstep, const_cast(state.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Weighted nonlinear least squares fitting using gradient only. Nonlinear task min(F(c)) is solved, where F(c) = (w[0]*(f(c,x[0])-y[0]))^2 + ... + (w[n-1]*(f(c,x[n-1])-y[n-1]))^2, * N is a number of points, * M is a dimension of a space points belong to, * K is a dimension of a space of parameters being fitted, * w is an N-dimensional vector of weight coefficients, * x is a set of N points, each of them is an M-dimensional vector, * c is a K-dimensional vector of parameters being fitted This subroutine uses only f(c,x[i]) and its gradient. INPUT PARAMETERS: X - array[0..N-1,0..M-1], points (one row = one point) Y - array[0..N-1], function values. W - weights, array[0..N-1] C - array[0..K-1], initial approximation to the solution, N - number of points, N>1 M - dimension of space K - number of parameters being fitted CheapFG - boolean flag, which is: * True if both function and gradient calculation complexity are less than O(M^2). An improved algorithm can be used which corresponds to FGJ scheme from MINLM unit. * False otherwise. Standard Jacibian-bases Levenberg-Marquardt algo will be used (FJ scheme). OUTPUT PARAMETERS: State - structure which stores algorithm state See also: LSFitResults LSFitCreateFG (fitting without weights) LSFitCreateWFGH (fitting using Hessian) LSFitCreateFGH (fitting using Hessian, without weights) -- ALGLIB -- Copyright 17.08.2009 by Bochkanov Sergey *************************************************************************/ void lsfitcreatewfg(const real_2d_array &x, const real_1d_array &y, const real_1d_array &w, const real_1d_array &c, const ae_int_t n, const ae_int_t m, const ae_int_t k, const bool cheapfg, lsfitstate &state) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::lsfitcreatewfg(const_cast(x.c_ptr()), const_cast(y.c_ptr()), const_cast(w.c_ptr()), const_cast(c.c_ptr()), n, m, k, cheapfg, const_cast(state.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Weighted nonlinear least squares fitting using gradient only. 
Nonlinear task min(F(c)) is solved, where F(c) = (w[0]*(f(c,x[0])-y[0]))^2 + ... + (w[n-1]*(f(c,x[n-1])-y[n-1]))^2, * N is a number of points, * M is a dimension of a space points belong to, * K is a dimension of a space of parameters being fitted, * w is an N-dimensional vector of weight coefficients, * x is a set of N points, each of them is an M-dimensional vector, * c is a K-dimensional vector of parameters being fitted This subroutine uses only f(c,x[i]) and its gradient. INPUT PARAMETERS: X - array[0..N-1,0..M-1], points (one row = one point) Y - array[0..N-1], function values. W - weights, array[0..N-1] C - array[0..K-1], initial approximation to the solution, N - number of points, N>1 M - dimension of space K - number of parameters being fitted CheapFG - boolean flag, which is: * True if both function and gradient calculation complexity are less than O(M^2). An improved algorithm can be used which corresponds to FGJ scheme from MINLM unit. * False otherwise. Standard Jacibian-bases Levenberg-Marquardt algo will be used (FJ scheme). OUTPUT PARAMETERS: State - structure which stores algorithm state See also: LSFitResults LSFitCreateFG (fitting without weights) LSFitCreateWFGH (fitting using Hessian) LSFitCreateFGH (fitting using Hessian, without weights) -- ALGLIB -- Copyright 17.08.2009 by Bochkanov Sergey *************************************************************************/ void lsfitcreatewfg(const real_2d_array &x, const real_1d_array &y, const real_1d_array &w, const real_1d_array &c, const bool cheapfg, lsfitstate &state) { alglib_impl::ae_state _alglib_env_state; ae_int_t n; ae_int_t m; ae_int_t k; if( (x.rows()!=y.length()) || (x.rows()!=w.length())) throw ap_error("Error while calling 'lsfitcreatewfg': looks like one of arguments has wrong size"); n = x.rows(); m = x.cols(); k = c.length(); alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::lsfitcreatewfg(const_cast(x.c_ptr()), const_cast(y.c_ptr()), const_cast(w.c_ptr()), const_cast(c.c_ptr()), n, m, k, cheapfg, const_cast(state.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Nonlinear least squares fitting using gradient only, without individual weights. Nonlinear task min(F(c)) is solved, where F(c) = ((f(c,x[0])-y[0]))^2 + ... + ((f(c,x[n-1])-y[n-1]))^2, * N is a number of points, * M is a dimension of a space points belong to, * K is a dimension of a space of parameters being fitted, * x is a set of N points, each of them is an M-dimensional vector, * c is a K-dimensional vector of parameters being fitted This subroutine uses only f(c,x[i]) and its gradient. INPUT PARAMETERS: X - array[0..N-1,0..M-1], points (one row = one point) Y - array[0..N-1], function values. C - array[0..K-1], initial approximation to the solution, N - number of points, N>1 M - dimension of space K - number of parameters being fitted CheapFG - boolean flag, which is: * True if both function and gradient calculation complexity are less than O(M^2). An improved algorithm can be used which corresponds to FGJ scheme from MINLM unit. * False otherwise. Standard Jacibian-bases Levenberg-Marquardt algo will be used (FJ scheme). 
OUTPUT PARAMETERS: State - structure which stores algorithm state -- ALGLIB -- Copyright 17.08.2009 by Bochkanov Sergey *************************************************************************/ void lsfitcreatefg(const real_2d_array &x, const real_1d_array &y, const real_1d_array &c, const ae_int_t n, const ae_int_t m, const ae_int_t k, const bool cheapfg, lsfitstate &state) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::lsfitcreatefg(const_cast(x.c_ptr()), const_cast(y.c_ptr()), const_cast(c.c_ptr()), n, m, k, cheapfg, const_cast(state.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Nonlinear least squares fitting using gradient only, without individual weights. Nonlinear task min(F(c)) is solved, where F(c) = ((f(c,x[0])-y[0]))^2 + ... + ((f(c,x[n-1])-y[n-1]))^2, * N is a number of points, * M is a dimension of a space points belong to, * K is a dimension of a space of parameters being fitted, * x is a set of N points, each of them is an M-dimensional vector, * c is a K-dimensional vector of parameters being fitted This subroutine uses only f(c,x[i]) and its gradient. INPUT PARAMETERS: X - array[0..N-1,0..M-1], points (one row = one point) Y - array[0..N-1], function values. C - array[0..K-1], initial approximation to the solution, N - number of points, N>1 M - dimension of space K - number of parameters being fitted CheapFG - boolean flag, which is: * True if both function and gradient calculation complexity are less than O(M^2). An improved algorithm can be used which corresponds to FGJ scheme from MINLM unit. * False otherwise. Standard Jacibian-bases Levenberg-Marquardt algo will be used (FJ scheme). OUTPUT PARAMETERS: State - structure which stores algorithm state -- ALGLIB -- Copyright 17.08.2009 by Bochkanov Sergey *************************************************************************/ void lsfitcreatefg(const real_2d_array &x, const real_1d_array &y, const real_1d_array &c, const bool cheapfg, lsfitstate &state) { alglib_impl::ae_state _alglib_env_state; ae_int_t n; ae_int_t m; ae_int_t k; if( (x.rows()!=y.length())) throw ap_error("Error while calling 'lsfitcreatefg': looks like one of arguments has wrong size"); n = x.rows(); m = x.cols(); k = c.length(); alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::lsfitcreatefg(const_cast(x.c_ptr()), const_cast(y.c_ptr()), const_cast(c.c_ptr()), n, m, k, cheapfg, const_cast(state.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Weighted nonlinear least squares fitting using gradient/Hessian. Nonlinear task min(F(c)) is solved, where F(c) = (w[0]*(f(c,x[0])-y[0]))^2 + ... + (w[n-1]*(f(c,x[n-1])-y[n-1]))^2, * N is a number of points, * M is a dimension of a space points belong to, * K is a dimension of a space of parameters being fitted, * w is an N-dimensional vector of weight coefficients, * x is a set of N points, each of them is an M-dimensional vector, * c is a K-dimensional vector of parameters being fitted This subroutine uses f(c,x[i]), its gradient and its Hessian. 
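Before moving on to the Hessian-based variants, note that the gradient-based functions above (LSFitCreateWFG/LSFitCreateFG) follow exactly the same workflow as the numerical-differentiation variants, except that lsfitfit() is additionally given a callback which fills the gradient of f with respect to c. A sketch for the same exp(-c0*x) model, with invented data, CheapFG=true (both the function and its gradient are cheap to evaluate), and the ALGLIB C++ header interpolation.h assumed:

    #include <cstdio>
    #include <cmath>
    #include "interpolation.h"

    using namespace alglib;

    // f(c,x) = exp(-c0*x)
    static void f_exp(const real_1d_array &c, const real_1d_array &x, double &func, void *ptr)
    {
        func = exp(-c[0]*x[0]);
    }

    // gradient of f with respect to the parameter c0
    static void f_exp_grad(const real_1d_array &c, const real_1d_array &x, double &func, real_1d_array &grad, void *ptr)
    {
        func = exp(-c[0]*x[0]);
        grad[0] = -x[0]*exp(-c[0]*x[0]);
    }

    int main()
    {
        real_2d_array x = "[[0.0],[0.2],[0.4],[0.6],[0.8],[1.0]]";
        real_1d_array y = "[1.00,0.74,0.55,0.41,0.30,0.22]";   // hypothetical data
        real_1d_array c = "[0.3]";                             // initial guess
        ae_int_t info;
        lsfitstate state;
        lsfitreport rep;

        lsfitcreatefg(x, y, c, true, state);    // 'friendly' interface, CheapFG=true
        lsfitfit(state, f_exp, f_exp_grad);     // Levenberg-Marquardt with analytic gradient
        lsfitresults(state, info, c, rep);
        if( info>0 )
            printf("c0 = %.3f\n", c[0]);
        return 0;
    }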
INPUT PARAMETERS: X - array[0..N-1,0..M-1], points (one row = one point) Y - array[0..N-1], function values. W - weights, array[0..N-1] C - array[0..K-1], initial approximation to the solution, N - number of points, N>1 M - dimension of space K - number of parameters being fitted OUTPUT PARAMETERS: State - structure which stores algorithm state -- ALGLIB -- Copyright 17.08.2009 by Bochkanov Sergey *************************************************************************/ void lsfitcreatewfgh(const real_2d_array &x, const real_1d_array &y, const real_1d_array &w, const real_1d_array &c, const ae_int_t n, const ae_int_t m, const ae_int_t k, lsfitstate &state) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::lsfitcreatewfgh(const_cast(x.c_ptr()), const_cast(y.c_ptr()), const_cast(w.c_ptr()), const_cast(c.c_ptr()), n, m, k, const_cast(state.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Weighted nonlinear least squares fitting using gradient/Hessian. Nonlinear task min(F(c)) is solved, where F(c) = (w[0]*(f(c,x[0])-y[0]))^2 + ... + (w[n-1]*(f(c,x[n-1])-y[n-1]))^2, * N is a number of points, * M is a dimension of a space points belong to, * K is a dimension of a space of parameters being fitted, * w is an N-dimensional vector of weight coefficients, * x is a set of N points, each of them is an M-dimensional vector, * c is a K-dimensional vector of parameters being fitted This subroutine uses f(c,x[i]), its gradient and its Hessian. INPUT PARAMETERS: X - array[0..N-1,0..M-1], points (one row = one point) Y - array[0..N-1], function values. W - weights, array[0..N-1] C - array[0..K-1], initial approximation to the solution, N - number of points, N>1 M - dimension of space K - number of parameters being fitted OUTPUT PARAMETERS: State - structure which stores algorithm state -- ALGLIB -- Copyright 17.08.2009 by Bochkanov Sergey *************************************************************************/ void lsfitcreatewfgh(const real_2d_array &x, const real_1d_array &y, const real_1d_array &w, const real_1d_array &c, lsfitstate &state) { alglib_impl::ae_state _alglib_env_state; ae_int_t n; ae_int_t m; ae_int_t k; if( (x.rows()!=y.length()) || (x.rows()!=w.length())) throw ap_error("Error while calling 'lsfitcreatewfgh': looks like one of arguments has wrong size"); n = x.rows(); m = x.cols(); k = c.length(); alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::lsfitcreatewfgh(const_cast(x.c_ptr()), const_cast(y.c_ptr()), const_cast(w.c_ptr()), const_cast(c.c_ptr()), n, m, k, const_cast(state.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Nonlinear least squares fitting using gradient/Hessian, without individial weights. Nonlinear task min(F(c)) is solved, where F(c) = ((f(c,x[0])-y[0]))^2 + ... 
+ ((f(c,x[n-1])-y[n-1]))^2, * N is a number of points, * M is a dimension of a space points belong to, * K is a dimension of a space of parameters being fitted, * x is a set of N points, each of them is an M-dimensional vector, * c is a K-dimensional vector of parameters being fitted This subroutine uses f(c,x[i]), its gradient and its Hessian. INPUT PARAMETERS: X - array[0..N-1,0..M-1], points (one row = one point) Y - array[0..N-1], function values. C - array[0..K-1], initial approximation to the solution, N - number of points, N>1 M - dimension of space K - number of parameters being fitted OUTPUT PARAMETERS: State - structure which stores algorithm state -- ALGLIB -- Copyright 17.08.2009 by Bochkanov Sergey *************************************************************************/ void lsfitcreatefgh(const real_2d_array &x, const real_1d_array &y, const real_1d_array &c, const ae_int_t n, const ae_int_t m, const ae_int_t k, lsfitstate &state) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::lsfitcreatefgh(const_cast(x.c_ptr()), const_cast(y.c_ptr()), const_cast(c.c_ptr()), n, m, k, const_cast(state.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Nonlinear least squares fitting using gradient/Hessian, without individial weights. Nonlinear task min(F(c)) is solved, where F(c) = ((f(c,x[0])-y[0]))^2 + ... + ((f(c,x[n-1])-y[n-1]))^2, * N is a number of points, * M is a dimension of a space points belong to, * K is a dimension of a space of parameters being fitted, * x is a set of N points, each of them is an M-dimensional vector, * c is a K-dimensional vector of parameters being fitted This subroutine uses f(c,x[i]), its gradient and its Hessian. INPUT PARAMETERS: X - array[0..N-1,0..M-1], points (one row = one point) Y - array[0..N-1], function values. C - array[0..K-1], initial approximation to the solution, N - number of points, N>1 M - dimension of space K - number of parameters being fitted OUTPUT PARAMETERS: State - structure which stores algorithm state -- ALGLIB -- Copyright 17.08.2009 by Bochkanov Sergey *************************************************************************/ void lsfitcreatefgh(const real_2d_array &x, const real_1d_array &y, const real_1d_array &c, lsfitstate &state) { alglib_impl::ae_state _alglib_env_state; ae_int_t n; ae_int_t m; ae_int_t k; if( (x.rows()!=y.length())) throw ap_error("Error while calling 'lsfitcreatefgh': looks like one of arguments has wrong size"); n = x.rows(); m = x.cols(); k = c.length(); alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::lsfitcreatefgh(const_cast(x.c_ptr()), const_cast(y.c_ptr()), const_cast(c.c_ptr()), n, m, k, const_cast(state.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Stopping conditions for nonlinear least squares fitting. 
INPUT PARAMETERS: State - structure which stores algorithm state EpsX - >=0 The subroutine finishes its work if on k+1-th iteration the condition |v|<=EpsX is fulfilled, where: * |.| means Euclidian norm * v - scaled step vector, v[i]=dx[i]/s[i] * dx - ste pvector, dx=X(k+1)-X(k) * s - scaling coefficients set by LSFitSetScale() MaxIts - maximum number of iterations. If MaxIts=0, the number of iterations is unlimited. Only Levenberg-Marquardt iterations are counted (L-BFGS/CG iterations are NOT counted because their cost is very low compared to that of LM). NOTE Passing EpsX=0 and MaxIts=0 (simultaneously) will lead to automatic stopping criterion selection (according to the scheme used by MINLM unit). -- ALGLIB -- Copyright 17.08.2009 by Bochkanov Sergey *************************************************************************/ void lsfitsetcond(const lsfitstate &state, const double epsx, const ae_int_t maxits) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::lsfitsetcond(const_cast(state.c_ptr()), epsx, maxits, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function sets maximum step length INPUT PARAMETERS: State - structure which stores algorithm state StpMax - maximum step length, >=0. Set StpMax to 0.0, if you don't want to limit step length. Use this subroutine when you optimize target function which contains exp() or other fast growing functions, and optimization algorithm makes too large steps which leads to overflow. This function allows us to reject steps that are too large (and therefore expose us to the possible overflow) without actually calculating function value at the x+stp*d. NOTE: non-zero StpMax leads to moderate performance degradation because intermediate step of preconditioned L-BFGS optimization is incompatible with limits on step size. -- ALGLIB -- Copyright 02.04.2010 by Bochkanov Sergey *************************************************************************/ void lsfitsetstpmax(const lsfitstate &state, const double stpmax) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::lsfitsetstpmax(const_cast(state.c_ptr()), stpmax, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function turns on/off reporting. INPUT PARAMETERS: State - structure which stores algorithm state NeedXRep- whether iteration reports are needed or not When reports are needed, State.C (current parameters) and State.F (current value of fitting function) are reported. 
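Usage sketch (illustrative, not part of the library source): the fragment below shows how the tuning calls documented here are typically combined once the solver state has been created by one of the LSFitCreate* constructors; the tolerance values are arbitrary examples.

    lsfitstate state;
    // ... state created with lsfitcreatewfg()/lsfitcreatefg()/... as documented above ...
    lsfitsetcond(state, 1.0e-6, 0);   // stop when scaled step |v| <= 1.0E-6, iteration count unlimited
    lsfitsetstpmax(state, 0.0);       // do not limit step length
    lsfitsetxrep(state, true);        // report State.C and State.F on every iteration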
-- ALGLIB -- Copyright 15.08.2010 by Bochkanov Sergey *************************************************************************/ void lsfitsetxrep(const lsfitstate &state, const bool needxrep) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::lsfitsetxrep(const_cast(state.c_ptr()), needxrep, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function sets scaling coefficients for underlying optimizer. ALGLIB optimizers use scaling matrices to test stopping conditions (step size and gradient are scaled before comparison with tolerances). Scale of the I-th variable is a translation invariant measure of: a) "how large" the variable is b) how large the step should be to make significant changes in the function Generally, scale is NOT considered to be a form of preconditioner. But LM optimizer is unique in that it uses scaling matrix both in the stopping condition tests and as Marquardt damping factor. Proper scaling is very important for the algorithm performance. It is less important for the quality of results, but still has some influence (it is easier to converge when variables are properly scaled, so premature stopping is possible when very badly scalled variables are combined with relaxed stopping conditions). INPUT PARAMETERS: State - structure stores algorithm state S - array[N], non-zero scaling coefficients S[i] may be negative, sign doesn't matter. -- ALGLIB -- Copyright 14.01.2011 by Bochkanov Sergey *************************************************************************/ void lsfitsetscale(const lsfitstate &state, const real_1d_array &s) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::lsfitsetscale(const_cast(state.c_ptr()), const_cast(s.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function sets boundary constraints for underlying optimizer Boundary constraints are inactive by default (after initial creation). They are preserved until explicitly turned off with another SetBC() call. INPUT PARAMETERS: State - structure stores algorithm state BndL - lower bounds, array[K]. If some (all) variables are unbounded, you may specify very small number or -INF (latter is recommended because it will allow solver to use better algorithm). BndU - upper bounds, array[K]. If some (all) variables are unbounded, you may specify very large number or +INF (latter is recommended because it will allow solver to use better algorithm). NOTE 1: it is possible to specify BndL[i]=BndU[i]. In this case I-th variable will be "frozen" at X[i]=BndL[i]=BndU[i]. 
NOTE 2: unlike other constrained optimization algorithms, this solver has following useful properties: * bound constraints are always satisfied exactly * function is evaluated only INSIDE area specified by bound constraints -- ALGLIB -- Copyright 14.01.2011 by Bochkanov Sergey *************************************************************************/ void lsfitsetbc(const lsfitstate &state, const real_1d_array &bndl, const real_1d_array &bndu) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::lsfitsetbc(const_cast(state.c_ptr()), const_cast(bndl.c_ptr()), const_cast(bndu.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function sets linear constraints for underlying optimizer Linear constraints are inactive by default (after initial creation). They are preserved until explicitly turned off with another SetLC() call. INPUT PARAMETERS: State - structure stores algorithm state C - linear constraints, array[K,N+1]. Each row of C represents one constraint, either equality or inequality (see below): * first N elements correspond to coefficients, * last element corresponds to the right part. All elements of C (including right part) must be finite. CT - type of constraints, array[K]: * if CT[i]>0, then I-th constraint is C[i,*]*x >= C[i,n+1] * if CT[i]=0, then I-th constraint is C[i,*]*x = C[i,n+1] * if CT[i]<0, then I-th constraint is C[i,*]*x <= C[i,n+1] K - number of equality/inequality constraints, K>=0: * if given, only leading K elements of C/CT are used * if not given, automatically determined from sizes of C/CT IMPORTANT: if you have linear constraints, it is strongly recommended to set scale of variables with lsfitsetscale(). QP solver which is used to calculate linearly constrained steps heavily relies on good scaling of input problems. NOTE: linear (non-box) constraints are satisfied only approximately - there always exists some violation due to numerical errors and algorithmic limitations. NOTE: general linear constraints add significant overhead to solution process. Although solver performs roughly same amount of iterations (when compared with similar box-only constrained problem), each iteration now involves solution of linearly constrained QP subproblem, which requires ~3-5 times more Cholesky decompositions. Thus, if you can reformulate your problem in such way this it has only box constraints, it may be beneficial to do so. -- ALGLIB -- Copyright 29.04.2017 by Bochkanov Sergey *************************************************************************/ void lsfitsetlc(const lsfitstate &state, const real_2d_array &c, const integer_1d_array &ct, const ae_int_t k) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::lsfitsetlc(const_cast(state.c_ptr()), const_cast(c.c_ptr()), const_cast(ct.c_ptr()), k, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function sets linear constraints for underlying optimizer Linear constraints are inactive by default (after initial creation). They are preserved until explicitly turned off with another SetLC() call. 
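Usage sketch (illustrative, not part of the library source): the fragment below applies box constraints and one linear inequality constraint to a hypothetical two-parameter fit on an already created lsfitstate; every numeric value is made up.

    real_1d_array bndl = "[0.0, -inf]";       // c0 >= 0, c1 unbounded from below
    real_1d_array bndu = "[10.0, +inf]";      // c0 <= 10, c1 unbounded from above
    lsfitsetbc(state, bndl, bndu);
    real_2d_array lc = "[[1.0, 1.0, 1.0]]";   // coefficients for c0 and c1, then the right part
    integer_1d_array lct = "[1]";             // CT[0]>0, so this row means c0 + c1 >= 1
    lsfitsetlc(state, lc, lct);               // constraint count is deduced from the sizes of lc/lct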
INPUT PARAMETERS: State - structure stores algorithm state C - linear constraints, array[K,N+1]. Each row of C represents one constraint, either equality or inequality (see below): * first N elements correspond to coefficients, * last element corresponds to the right part. All elements of C (including right part) must be finite. CT - type of constraints, array[K]: * if CT[i]>0, then I-th constraint is C[i,*]*x >= C[i,n+1] * if CT[i]=0, then I-th constraint is C[i,*]*x = C[i,n+1] * if CT[i]<0, then I-th constraint is C[i,*]*x <= C[i,n+1] K - number of equality/inequality constraints, K>=0: * if given, only leading K elements of C/CT are used * if not given, automatically determined from sizes of C/CT IMPORTANT: if you have linear constraints, it is strongly recommended to set scale of variables with lsfitsetscale(). QP solver which is used to calculate linearly constrained steps heavily relies on good scaling of input problems. NOTE: linear (non-box) constraints are satisfied only approximately - there always exists some violation due to numerical errors and algorithmic limitations. NOTE: general linear constraints add significant overhead to solution process. Although solver performs roughly same amount of iterations (when compared with similar box-only constrained problem), each iteration now involves solution of linearly constrained QP subproblem, which requires ~3-5 times more Cholesky decompositions. Thus, if you can reformulate your problem in such way this it has only box constraints, it may be beneficial to do so. -- ALGLIB -- Copyright 29.04.2017 by Bochkanov Sergey *************************************************************************/ void lsfitsetlc(const lsfitstate &state, const real_2d_array &c, const integer_1d_array &ct) { alglib_impl::ae_state _alglib_env_state; ae_int_t k; if( (c.rows()!=ct.length())) throw ap_error("Error while calling 'lsfitsetlc': looks like one of arguments has wrong size"); k = c.rows(); alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::lsfitsetlc(const_cast(state.c_ptr()), const_cast(c.c_ptr()), const_cast(ct.c_ptr()), k, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function provides reverse communication interface Reverse communication interface is not documented or recommended to use. 
See below for functions which provide better documented API *************************************************************************/ bool lsfititeration(const lsfitstate &state) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { ae_bool result = alglib_impl::lsfititeration(const_cast(state.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void lsfitfit(lsfitstate &state, void (*func)(const real_1d_array &c, const real_1d_array &x, double &func, void *ptr), void (*rep)(const real_1d_array &c, double func, void *ptr), void *ptr) { alglib_impl::ae_state _alglib_env_state; if( func==NULL ) throw ap_error("ALGLIB: error in 'lsfitfit()' (func is NULL)"); alglib_impl::ae_state_init(&_alglib_env_state); try { while( alglib_impl::lsfititeration(state.c_ptr(), &_alglib_env_state) ) { if( state.needf ) { func(state.c, state.x, state.f, ptr); continue; } if( state.xupdated ) { if( rep!=NULL ) rep(state.c, state.f, ptr); continue; } throw ap_error("ALGLIB: error in 'lsfitfit' (some derivatives were not provided?)"); } alglib_impl::ae_state_clear(&_alglib_env_state); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void lsfitfit(lsfitstate &state, void (*func)(const real_1d_array &c, const real_1d_array &x, double &func, void *ptr), void (*grad)(const real_1d_array &c, const real_1d_array &x, double &func, real_1d_array &grad, void *ptr), void (*rep)(const real_1d_array &c, double func, void *ptr), void *ptr) { alglib_impl::ae_state _alglib_env_state; if( func==NULL ) throw ap_error("ALGLIB: error in 'lsfitfit()' (func is NULL)"); if( grad==NULL ) throw ap_error("ALGLIB: error in 'lsfitfit()' (grad is NULL)"); alglib_impl::ae_state_init(&_alglib_env_state); try { while( alglib_impl::lsfititeration(state.c_ptr(), &_alglib_env_state) ) { if( state.needf ) { func(state.c, state.x, state.f, ptr); continue; } if( state.needfg ) { grad(state.c, state.x, state.f, state.g, ptr); continue; } if( state.xupdated ) { if( rep!=NULL ) rep(state.c, state.f, ptr); continue; } throw ap_error("ALGLIB: error in 'lsfitfit' (some derivatives were not provided?)"); } alglib_impl::ae_state_clear(&_alglib_env_state); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void lsfitfit(lsfitstate &state, void (*func)(const real_1d_array &c, const real_1d_array &x, double &func, void *ptr), void (*grad)(const real_1d_array &c, const real_1d_array &x, double &func, real_1d_array &grad, void *ptr), void (*hess)(const real_1d_array &c, const real_1d_array &x, double &func, real_1d_array &grad, real_2d_array &hess, void *ptr), void (*rep)(const real_1d_array &c, double func, void *ptr), void *ptr) { alglib_impl::ae_state _alglib_env_state; if( func==NULL ) throw ap_error("ALGLIB: error in 'lsfitfit()' (func is NULL)"); if( grad==NULL ) throw ap_error("ALGLIB: error in 'lsfitfit()' (grad is NULL)"); if( hess==NULL ) throw ap_error("ALGLIB: error in 'lsfitfit()' (hess is NULL)"); alglib_impl::ae_state_init(&_alglib_env_state); try { while( alglib_impl::lsfititeration(state.c_ptr(), &_alglib_env_state) ) { if( state.needf ) { func(state.c, state.x, state.f, ptr); continue; } if( state.needfg ) { grad(state.c, state.x, state.f, state.g, ptr); continue; } if( state.needfgh ) { hess(state.c, state.x, state.f, state.g, state.h, ptr); continue; } if( state.xupdated ) { if( 
rep!=NULL ) rep(state.c, state.f, ptr); continue; } throw ap_error("ALGLIB: error in 'lsfitfit' (some derivatives were not provided?)"); } alglib_impl::ae_state_clear(&_alglib_env_state); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Nonlinear least squares fitting results. Called after return from LSFitFit(). INPUT PARAMETERS: State - algorithm state OUTPUT PARAMETERS: Info - completion code: * -7 gradient verification failed. See LSFitSetGradientCheck() for more information. * 2 relative step is no more than EpsX. * 5 MaxIts steps was taken * 7 stopping conditions are too stringent, further improvement is impossible C - array[0..K-1], solution Rep - optimization report. On success following fields are set: * R2 non-adjusted coefficient of determination (non-weighted) * RMSError rms error on the (X,Y). * AvgError average error on the (X,Y). * AvgRelError average relative error on the non-zero Y * MaxError maximum error NON-WEIGHTED ERRORS ARE CALCULATED * WRMSError weighted rms error on the (X,Y). ERRORS IN PARAMETERS This solver also calculates different kinds of errors in parameters and fills corresponding fields of report: * Rep.CovPar covariance matrix for parameters, array[K,K]. * Rep.ErrPar errors in parameters, array[K], errpar = sqrt(diag(CovPar)) * Rep.ErrCurve vector of fit errors - standard deviations of empirical best-fit curve from "ideal" best-fit curve built with infinite number of samples, array[N]. errcurve = sqrt(diag(J*CovPar*J')), where J is Jacobian matrix. * Rep.Noise vector of per-point estimates of noise, array[N] IMPORTANT: errors in parameters are calculated without taking into account boundary/linear constraints! Presence of constraints changes distribution of errors, but there is no easy way to account for constraints when you calculate covariance matrix. NOTE: noise in the data is estimated as follows: * for fitting without user-supplied weights all points are assumed to have same level of noise, which is estimated from the data * for fitting with user-supplied weights we assume that noise level in I-th point is inversely proportional to Ith weight. Coefficient of proportionality is estimated from the data. NOTE: we apply small amount of regularization when we invert squared Jacobian and calculate covariance matrix. It guarantees that algorithm won't divide by zero during inversion, but skews error estimates a bit (fractional error is about 10^-9). However, we believe that this difference is insignificant for all practical purposes except for the situation when you want to compare ALGLIB results with "reference" implementation up to the last significant digit. NOTE: covariance matrix is estimated using correction for degrees of freedom (covariances are divided by N-M instead of dividing by N). 
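Usage sketch (illustrative, not part of the library source): a compact fitting session tying together the functions documented above. The model f(c,x) = exp(-c0*x0^2), the dataset and the tolerance are made-up examples, and the interpolation.h header is assumed.

    #include <stdio.h>
    #include <math.h>
    #include "interpolation.h"
    using namespace alglib;

    // Hypothetical model f(c,x) = exp(-c0*x0^2) and its gradient with respect to c.
    void model_func(const real_1d_array &c, const real_1d_array &x, double &func, void *ptr)
    {
        func = exp(-c[0]*x[0]*x[0]);
    }
    void model_grad(const real_1d_array &c, const real_1d_array &x, double &func, real_1d_array &grad, void *ptr)
    {
        func = exp(-c[0]*x[0]*x[0]);
        grad[0] = -x[0]*x[0]*exp(-c[0]*x[0]*x[0]);
    }

    int main(int argc, char **argv)
    {
        // Made-up dataset sampled from exp(-1.5*x^2), equal weights.
        real_2d_array x = "[[-1],[-0.8],[-0.6],[-0.4],[-0.2],[0],[0.2],[0.4],[0.6],[0.8],[1.0]]";
        real_1d_array y = "[0.2231, 0.3829, 0.5827, 0.7866, 0.9418, 1.0000, 0.9418, 0.7866, 0.5827, 0.3829, 0.2231]";
        real_1d_array w = "[1,1,1,1,1,1,1,1,1,1,1]";
        real_1d_array c = "[0.3]";                // initial approximation to c0
        lsfitstate state;
        lsfitreport rep;
        ae_int_t info;

        lsfitcreatewfg(x, y, w, c, true, state);  // weighted F/G fitting, CheapFG=true
        lsfitsetcond(state, 1.0e-6, 0);           // EpsX=1.0E-6, unlimited iterations
        // lsfitsetgradientcheck(state, 1.0e-4);  // optional, costly verification of the analytic gradient
        lsfitfit(state, model_func, model_grad, NULL, NULL);
        lsfitresults(state, info, c, rep);
        printf("%d\n", int(info));                // positive completion code on success
        printf("%.2f\n", double(c[0]));           // should be close to 1.5 for this dataset
        return 0;
    }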
-- ALGLIB -- Copyright 17.08.2009 by Bochkanov Sergey *************************************************************************/ void lsfitresults(const lsfitstate &state, ae_int_t &info, real_1d_array &c, lsfitreport &rep) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::lsfitresults(const_cast(state.c_ptr()), &info, const_cast(c.c_ptr()), const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This subroutine turns on verification of the user-supplied analytic gradient: * user calls this subroutine before fitting begins * LSFitFit() is called * prior to actual fitting, for each point in data set X_i and each component of parameters being fited C_j algorithm performs following steps: * two trial steps are made to C_j-TestStep*S[j] and C_j+TestStep*S[j], where C_j is j-th parameter and S[j] is a scale of j-th parameter * if needed, steps are bounded with respect to constraints on C[] * F(X_i|C) is evaluated at these trial points * we perform one more evaluation in the middle point of the interval * we build cubic model using function values and derivatives at trial points and we compare its prediction with actual value in the middle point * in case difference between prediction and actual value is higher than some predetermined threshold, algorithm stops with completion code -7; Rep.VarIdx is set to index of the parameter with incorrect derivative. * after verification is over, algorithm proceeds to the actual optimization. NOTE 1: verification needs N*K (points count * parameters count) gradient evaluations. It is very costly and you should use it only for low dimensional problems, when you want to be sure that you've correctly calculated analytic derivatives. You should not use it in the production code (unless you want to check derivatives provided by some third party). NOTE 2: you should carefully choose TestStep. Value which is too large (so large that function behaviour is significantly non-cubic) will lead to false alarms. You may use different step for different parameters by means of setting scale with LSFitSetScale(). NOTE 3: this function may lead to false positives. In case it reports that I-th derivative was calculated incorrectly, you may decrease test step and try one more time - maybe your function changes too sharply and your step is too large for such rapidly chanding function. NOTE 4: this function works only for optimizers created with LSFitCreateWFG() or LSFitCreateFG() constructors. 
INPUT PARAMETERS: State - structure used to store algorithm state TestStep - verification step: * TestStep=0 turns verification off * TestStep>0 activates verification -- ALGLIB -- Copyright 15.06.2012 by Bochkanov Sergey *************************************************************************/ void lsfitsetgradientcheck(const lsfitstate &state, const double teststep) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::lsfitsetgradientcheck(const_cast(state.c_ptr()), teststep, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Fits minimum circumscribed (MCC) circle (or NX-dimensional sphere) to data (a set of points in NX-dimensional space). INPUT PARAMETERS: XY - array[NPoints,NX] (or larger), contains dataset. One row = one point in NX-dimensional space. NPoints - dataset size, NPoints>0 NX - space dimensionality, NX>0 (1, 2, 3, 4, 5 and so on) OUTPUT PARAMETERS: CX - central point for a sphere RHi - radius NOTE: this function is an easy-to-use wrapper around more powerful "expert" function nsfitspherex(). This wrapper is optimized for ease of use and stability - at the cost of somewhat lower performance (we have to use very tight stopping criteria for inner optimizer because we want to make sure that it will converge on any dataset). If you are ready to experiment with settings of "expert" function, you can achieve ~2-4x speedup over standard "bulletproof" settings. -- ALGLIB -- Copyright 14.04.2017 by Bochkanov Sergey *************************************************************************/ void nsfitspheremcc(const real_2d_array &xy, const ae_int_t npoints, const ae_int_t nx, real_1d_array &cx, double &rhi) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::nsfitspheremcc(const_cast(xy.c_ptr()), npoints, nx, const_cast(cx.c_ptr()), &rhi, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Fits maximum inscribed circle (or NX-dimensional sphere) to data (a set of points in NX-dimensional space). INPUT PARAMETERS: XY - array[NPoints,NX] (or larger), contains dataset. One row = one point in NX-dimensional space. NPoints - dataset size, NPoints>0 NX - space dimensionality, NX>0 (1, 2, 3, 4, 5 and so on) OUTPUT PARAMETERS: CX - central point for a sphere RLo - radius NOTE: this function is an easy-to-use wrapper around more powerful "expert" function nsfitspherex(). This wrapper is optimized for ease of use and stability - at the cost of somewhat lower performance (we have to use very tight stopping criteria for inner optimizer because we want to make sure that it will converge on any dataset). If you are ready to experiment with settings of "expert" function, you can achieve ~2-4x speedup over standard "bulletproof" settings. 
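Usage sketch (illustrative, not part of the library source): fitting circumscribed and inscribed circles to a small made-up 2D point cloud with the easy-to-use wrappers documented here, assuming the interpolation.h header.

    #include <stdio.h>
    #include "interpolation.h"
    using namespace alglib;

    int main(int argc, char **argv)
    {
        // Eight made-up points lying approximately on the unit circle.
        real_2d_array xy = "[[1,0],[0,1],[-1,0],[0,-1],[0.7,0.7],[-0.7,0.7],[-0.7,-0.7],[0.7,-0.7]]";
        real_1d_array cx;
        double rlo, rhi;
        nsfitspheremcc(xy, 8, 2, cx, rhi);   // minimum circumscribed circle: center near (0,0), RHi close to 1.0
        nsfitspheremic(xy, 8, 2, cx, rlo);   // maximum inscribed circle: RLo slightly below 1.0 for these points
        printf("%.3f %.3f\n", rlo, rhi);
        return 0;
    }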
-- ALGLIB -- Copyright 14.04.2017 by Bochkanov Sergey *************************************************************************/ void nsfitspheremic(const real_2d_array &xy, const ae_int_t npoints, const ae_int_t nx, real_1d_array &cx, double &rlo) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::nsfitspheremic(const_cast(xy.c_ptr()), npoints, nx, const_cast(cx.c_ptr()), &rlo, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Fits minimum zone circle (or NX-dimensional sphere) to data (a set of points in NX-dimensional space). INPUT PARAMETERS: XY - array[NPoints,NX] (or larger), contains dataset. One row = one point in NX-dimensional space. NPoints - dataset size, NPoints>0 NX - space dimensionality, NX>0 (1, 2, 3, 4, 5 and so on) OUTPUT PARAMETERS: CX - central point for a sphere RLo - radius of inscribed circle RHo - radius of circumscribed circle NOTE: this function is an easy-to-use wrapper around more powerful "expert" function nsfitspherex(). This wrapper is optimized for ease of use and stability - at the cost of somewhat lower performance (we have to use very tight stopping criteria for inner optimizer because we want to make sure that it will converge on any dataset). If you are ready to experiment with settings of "expert" function, you can achieve ~2-4x speedup over standard "bulletproof" settings. -- ALGLIB -- Copyright 14.04.2017 by Bochkanov Sergey *************************************************************************/ void nsfitspheremzc(const real_2d_array &xy, const ae_int_t npoints, const ae_int_t nx, real_1d_array &cx, double &rlo, double &rhi) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::nsfitspheremzc(const_cast(xy.c_ptr()), npoints, nx, const_cast(cx.c_ptr()), &rlo, &rhi, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Fitting minimum circumscribed, maximum inscribed or minimum zone circles (or NX-dimensional spheres) to data (a set of points in NX-dimensional space). This is expert function which allows to tweak many parameters of underlying nonlinear solver: * stopping criteria for inner iterations * number of outer iterations * penalty coefficient used to handle nonlinear constraints (we convert unconstrained nonsmooth optimization problem ivolving max() and/or min() operations to quadratically constrained smooth one). You may tweak all these parameters or only some of them, leaving other ones at their default state - just specify zero value, and solver will fill it with appropriate default one. These comments also include some discussion of approach used to handle such unusual fitting problem, its stability, drawbacks of alternative methods, and convergence properties. INPUT PARAMETERS: XY - array[NPoints,NX] (or larger), contains dataset. One row = one point in NX-dimensional space. 
NPoints - dataset size, NPoints>0 NX - space dimensionality, NX>0 (1, 2, 3, 4, 5 and so on) ProblemType-used to encode problem type: * 1 for minimum circumscribed circle/sphere fitting (MCC) * 2 for maximum inscribed circle/sphere fitting (MIC) * 3 for minimum zone circle fitting (difference between Rhi and Rlo is minimized), denoted as MZC EpsX - stopping condition for NLC optimizer: * must be non-negative * use 0 to choose default value (1.0E-12 is used by default) * you may specify larger values, up to 1.0E-6, if you want to speed-up solver; NLC solver performs several preconditioned outer iterations, so final result typically has precision much better than EpsX. AULIts - number of outer iterations performed by NLC optimizer: * must be non-negative * use 0 to choose default value (20 is used by default) * you may specify values smaller than 20 if you want to speed up solver; 10 often results in good combination of precision and speed; sometimes you may get good results with just 6 outer iterations. Penalty - penalty coefficient for NLC optimizer: * must be non-negative * use 0 to choose default value (1.0E6 in current version) * it should be really large, 1.0E6...1.0E7 is a good value to start from; * generally, default value is good enough OUTPUT PARAMETERS: CX - central point for a sphere RLo - radius: * for ProblemType=2,3, radius of the inscribed sphere * for ProblemType=1 - zero RHo - radius: * for ProblemType=1,3, radius of the circumscribed sphere * for ProblemType=2 - zero NOTE: ON THE UNIQUENESS OF SOLUTIONS ALGLIB provides solution to several related circle fitting problems: MCC (minimum circumscribed), MIC (maximum inscribed) and MZC (minimum zone) fitting. It is important to note that among these problems only MCC is convex and has unique solution independently from starting point. As for MIC, it may (or may not, depending on dataset properties) have multiple solutions, and it always has one degenerate solution C=infinity which corresponds to infinitely large radius. Thus, there are no guarantees that solution to MIC returned by this solver will be the best one (and no one can provide you with such guarantee because problem is NP-hard). The only guarantee you have is that this solution is locally optimal, i.e. it can not be improved by infinitesimally small tweaks in the parameters. It is also possible to "run away" to infinity when started from bad initial point located outside of point cloud (or when point cloud does not span entire circumference/surface of the sphere). Finally, MZC (minimum zone circle) stands somewhere between MCC and MIC in stability. It is somewhat regularized by "circumscribed" term of the merit function; however, solutions to MZC may be non-unique, and in some unlucky cases it is also possible to "run away to infinity". NOTE: ON THE NONLINEARLY CONSTRAINED PROGRAMMING APPROACH The problem formulation for MCC (minimum circumscribed circle; for the sake of simplicity we omit MZC and MIC here) is: [ [ ]2 ] min [ max [ XY[i]-C ] ] C [ i [ ] ] i.e. it is unconstrained nonsmooth optimization problem of finding "best" central point, with radius R being unambiguously determined from C. In order to move away from non-smoothness we use following reformulation: [ ] [ ]2 min [ R ] subject to R>=0, [ XY[i]-C ] <= R^2 C,R [ ] [ ] i.e. it becomes smooth quadratically constrained optimization problem with linear target function. Such problem statement is 100% equivalent to the original nonsmooth one, but much easier to approach. 
We solve it with MinNLC solver provided by ALGLIB. NOTE: ON INSTABILITY OF SEQUENTIAL LINEAR PROGRAMMING APPROACJ ALGLIB has nonlinearly constrained solver which proved to be stable on such problems. However, some authors proposed to linearize constraints in the vicinity of current approximation (Ci,Ri) and to get next approximate solution (Ci+1,Ri+1) as solution to linear programming problem. Obviously, LP problems are easier than nonlinearly constrained ones. Indeed, SLP approach to MCC/MIC/MZC resulted in ~10-20x increase in performance (when compared with NLC solver). However, it turned out that in some cases linearized model fails to predict correct direction for next step and tells us that we converged to solution even when we are still 2-4 digits of precision away from it. It is important that it is not failure of LP solver - it is failure of the linear model; even when solved exactly, it fails to handle subtle nonlinearities which arise near the solution. We validated it by comparing results returned by ALGLIB linear solver with that of MATLAB. In our experiments with SLP solver: * MCC failed most often, at both realistic and synthetic datasets * MIC sometimes failed, but sometimes succeeded * MZC often succeeded; our guess is that presence of two independent sets of constraints (one set for Rlo and another one for Rhi) and two terms in the target function (Rlo and Rhi) regularizes task, so when linear model fails to handle nonlinearities from Rlo, it uses Rhi as a hint (and vice versa). Because SLP approach failed to achieve stable results, we do not include it in ALGLIB. -- ALGLIB -- Copyright 14.04.2017 by Bochkanov Sergey *************************************************************************/ void nsfitspherex(const real_2d_array &xy, const ae_int_t npoints, const ae_int_t nx, const ae_int_t problemtype, const double epsx, const ae_int_t aulits, const double penalty, real_1d_array &cx, double &rlo, double &rhi) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::nsfitspherex(const_cast(xy.c_ptr()), npoints, nx, problemtype, epsx, aulits, penalty, const_cast(cx.c_ptr()), &rlo, &rhi, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* 2-dimensional spline inteprolant *************************************************************************/ _spline2dinterpolant_owner::_spline2dinterpolant_owner() { p_struct = (alglib_impl::spline2dinterpolant*)alglib_impl::ae_malloc(sizeof(alglib_impl::spline2dinterpolant), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_spline2dinterpolant_init(p_struct, NULL); } _spline2dinterpolant_owner::_spline2dinterpolant_owner(const _spline2dinterpolant_owner &rhs) { p_struct = (alglib_impl::spline2dinterpolant*)alglib_impl::ae_malloc(sizeof(alglib_impl::spline2dinterpolant), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_spline2dinterpolant_init_copy(p_struct, const_cast(rhs.p_struct), NULL); } _spline2dinterpolant_owner& _spline2dinterpolant_owner::operator=(const _spline2dinterpolant_owner &rhs) { if( this==&rhs ) return *this; alglib_impl::_spline2dinterpolant_clear(p_struct); alglib_impl::_spline2dinterpolant_init_copy(p_struct, const_cast(rhs.p_struct), NULL); return *this; } _spline2dinterpolant_owner::~_spline2dinterpolant_owner() { 
alglib_impl::_spline2dinterpolant_clear(p_struct); ae_free(p_struct); } alglib_impl::spline2dinterpolant* _spline2dinterpolant_owner::c_ptr() { return p_struct; } alglib_impl::spline2dinterpolant* _spline2dinterpolant_owner::c_ptr() const { return const_cast(p_struct); } spline2dinterpolant::spline2dinterpolant() : _spline2dinterpolant_owner() { } spline2dinterpolant::spline2dinterpolant(const spline2dinterpolant &rhs):_spline2dinterpolant_owner(rhs) { } spline2dinterpolant& spline2dinterpolant::operator=(const spline2dinterpolant &rhs) { if( this==&rhs ) return *this; _spline2dinterpolant_owner::operator=(rhs); return *this; } spline2dinterpolant::~spline2dinterpolant() { } /************************************************************************* This subroutine calculates the value of the bilinear or bicubic spline at the given point X. Input parameters: C - coefficients table. Built by BuildBilinearSpline or BuildBicubicSpline. X, Y- point Result: S(x,y) -- ALGLIB PROJECT -- Copyright 05.07.2007 by Bochkanov Sergey *************************************************************************/ double spline2dcalc(const spline2dinterpolant &c, const double x, const double y) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::spline2dcalc(const_cast(c.c_ptr()), x, y, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This subroutine calculates the value of the bilinear or bicubic spline at the given point X and its derivatives. Input parameters: C - spline interpolant. X, Y- point Output parameters: F - S(x,y) FX - dS(x,y)/dX FY - dS(x,y)/dY FXY - d2S(x,y)/dXdY -- ALGLIB PROJECT -- Copyright 05.07.2007 by Bochkanov Sergey *************************************************************************/ void spline2ddiff(const spline2dinterpolant &c, const double x, const double y, double &f, double &fx, double &fy, double &fxy) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::spline2ddiff(const_cast(c.c_ptr()), x, y, &f, &fx, &fy, &fxy, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This subroutine performs linear transformation of the spline argument. Input parameters: C - spline interpolant AX, BX - transformation coefficients: x = A*t + B AY, BY - transformation coefficients: y = A*u + B Result: C - transformed spline -- ALGLIB PROJECT -- Copyright 30.06.2007 by Bochkanov Sergey *************************************************************************/ void spline2dlintransxy(const spline2dinterpolant &c, const double ax, const double bx, const double ay, const double by) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::spline2dlintransxy(const_cast(c.c_ptr()), ax, bx, ay, by, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This subroutine performs linear transformation of the spline. 
Input parameters: C - spline interpolant. A, B- transformation coefficients: S2(x,y) = A*S(x,y) + B Output parameters: C - transformed spline -- ALGLIB PROJECT -- Copyright 30.06.2007 by Bochkanov Sergey *************************************************************************/ void spline2dlintransf(const spline2dinterpolant &c, const double a, const double b) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::spline2dlintransf(const_cast(c.c_ptr()), a, b, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This subroutine makes the copy of the spline model. Input parameters: C - spline interpolant Output parameters: CC - spline copy -- ALGLIB PROJECT -- Copyright 29.06.2007 by Bochkanov Sergey *************************************************************************/ void spline2dcopy(const spline2dinterpolant &c, spline2dinterpolant &cc) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::spline2dcopy(const_cast(c.c_ptr()), const_cast(cc.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Bicubic spline resampling Input parameters: A - function values at the old grid, array[0..OldHeight-1, 0..OldWidth-1] OldHeight - old grid height, OldHeight>1 OldWidth - old grid width, OldWidth>1 NewHeight - new grid height, NewHeight>1 NewWidth - new grid width, NewWidth>1 Output parameters: B - function values at the new grid, array[0..NewHeight-1, 0..NewWidth-1] -- ALGLIB routine -- 15 May, 2007 Copyright by Bochkanov Sergey *************************************************************************/ void spline2dresamplebicubic(const real_2d_array &a, const ae_int_t oldheight, const ae_int_t oldwidth, real_2d_array &b, const ae_int_t newheight, const ae_int_t newwidth) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::spline2dresamplebicubic(const_cast(a.c_ptr()), oldheight, oldwidth, const_cast(b.c_ptr()), newheight, newwidth, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Bilinear spline resampling Input parameters: A - function values at the old grid, array[0..OldHeight-1, 0..OldWidth-1] OldHeight - old grid height, OldHeight>1 OldWidth - old grid width, OldWidth>1 NewHeight - new grid height, NewHeight>1 NewWidth - new grid width, NewWidth>1 Output parameters: B - function values at the new grid, array[0..NewHeight-1, 0..NewWidth-1] -- ALGLIB routine -- 09.07.2007 Copyright by Bochkanov Sergey *************************************************************************/ void spline2dresamplebilinear(const real_2d_array &a, const ae_int_t oldheight, const ae_int_t oldwidth, real_2d_array &b, const ae_int_t newheight, const ae_int_t newwidth) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::spline2dresamplebilinear(const_cast(a.c_ptr()), oldheight, oldwidth, const_cast(b.c_ptr()), newheight, 
newwidth, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This subroutine builds bilinear vector-valued spline. Input parameters: X - spline abscissas, array[0..N-1] Y - spline ordinates, array[0..M-1] F - function values, array[0..M*N*D-1]: * first D elements store D values at (X[0],Y[0]) * next D elements store D values at (X[1],Y[0]) * general form - D function values at (X[i],Y[j]) are stored at F[D*(J*N+I)...D*(J*N+I)+D-1]. M,N - grid size, M>=2, N>=2 D - vector dimension, D>=1 Output parameters: C - spline interpolant -- ALGLIB PROJECT -- Copyright 16.04.2012 by Bochkanov Sergey *************************************************************************/ void spline2dbuildbilinearv(const real_1d_array &x, const ae_int_t n, const real_1d_array &y, const ae_int_t m, const real_1d_array &f, const ae_int_t d, spline2dinterpolant &c) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::spline2dbuildbilinearv(const_cast(x.c_ptr()), n, const_cast(y.c_ptr()), m, const_cast(f.c_ptr()), d, const_cast(c.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This subroutine builds bicubic vector-valued spline. Input parameters: X - spline abscissas, array[0..N-1] Y - spline ordinates, array[0..M-1] F - function values, array[0..M*N*D-1]: * first D elements store D values at (X[0],Y[0]) * next D elements store D values at (X[1],Y[0]) * general form - D function values at (X[i],Y[j]) are stored at F[D*(J*N+I)...D*(J*N+I)+D-1]. M,N - grid size, M>=2, N>=2 D - vector dimension, D>=1 Output parameters: C - spline interpolant -- ALGLIB PROJECT -- Copyright 16.04.2012 by Bochkanov Sergey *************************************************************************/ void spline2dbuildbicubicv(const real_1d_array &x, const ae_int_t n, const real_1d_array &y, const ae_int_t m, const real_1d_array &f, const ae_int_t d, spline2dinterpolant &c) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::spline2dbuildbicubicv(const_cast(x.c_ptr()), n, const_cast(y.c_ptr()), m, const_cast(f.c_ptr()), d, const_cast(c.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This subroutine calculates bilinear or bicubic vector-valued spline at the given point (X,Y). INPUT PARAMETERS: C - spline interpolant. X, Y- point F - output buffer, possibly preallocated array. In case array size is large enough to store result, it is not reallocated. 
Array which is too short will be reallocated OUTPUT PARAMETERS: F - array[D] (or larger) which stores function values -- ALGLIB PROJECT -- Copyright 16.04.2012 by Bochkanov Sergey *************************************************************************/ void spline2dcalcvbuf(const spline2dinterpolant &c, const double x, const double y, real_1d_array &f) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::spline2dcalcvbuf(const_cast(c.c_ptr()), x, y, const_cast(f.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This subroutine calculates bilinear or bicubic vector-valued spline at the given point (X,Y). INPUT PARAMETERS: C - spline interpolant. X, Y- point OUTPUT PARAMETERS: F - array[D] which stores function values. F is out-parameter and it is reallocated after call to this function. In case you want to reuse previously allocated F, you may use Spline2DCalcVBuf(), which reallocates F only when it is too small. -- ALGLIB PROJECT -- Copyright 16.04.2012 by Bochkanov Sergey *************************************************************************/ void spline2dcalcv(const spline2dinterpolant &c, const double x, const double y, real_1d_array &f) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::spline2dcalcv(const_cast(c.c_ptr()), x, y, const_cast(f.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This subroutine unpacks two-dimensional spline into the coefficients table Input parameters: C - spline interpolant. Result: M, N- grid size (x-axis and y-axis) D - number of components Tbl - coefficients table, unpacked format, D - components: [0..(N-1)*(M-1)*D-1, 0..19]. For T=0..D-1 (component index), I = 0...N-2 (x index), J=0..M-2 (y index): K := T + I*D + J*D*(N-1) K-th row stores decomposition for T-th component of the vector-valued function Tbl[K,0] = X[i] Tbl[K,1] = X[i+1] Tbl[K,2] = Y[j] Tbl[K,3] = Y[j+1] Tbl[K,4] = C00 Tbl[K,5] = C01 Tbl[K,6] = C02 Tbl[K,7] = C03 Tbl[K,8] = C10 Tbl[K,9] = C11 ... Tbl[K,19] = C33 On each grid square spline is equals to: S(x) = SUM(c[i,j]*(t^i)*(u^j), i=0..3, j=0..3) t = x-x[j] u = y-y[i] -- ALGLIB PROJECT -- Copyright 16.04.2012 by Bochkanov Sergey *************************************************************************/ void spline2dunpackv(const spline2dinterpolant &c, ae_int_t &m, ae_int_t &n, ae_int_t &d, real_2d_array &tbl) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::spline2dunpackv(const_cast(c.c_ptr()), &m, &n, &d, const_cast(tbl.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This subroutine was deprecated in ALGLIB 3.6.0 We recommend you to switch to Spline2DBuildBilinearV(), which is more flexible and accepts its arguments in more convenient order. 
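Usage sketch (illustrative, not part of the library source): building the recommended vector-valued bilinear spline with D=1 and evaluating it at the cell center; the grid and the values are made up, and the interpolation.h header is assumed.

    #include <stdio.h>
    #include "interpolation.h"
    using namespace alglib;

    int main(int argc, char **argv)
    {
        // Made-up 2x2 grid with one value (D=1) per node; the x index changes fastest in F.
        real_1d_array gx = "[0.0, 1.0]";
        real_1d_array gy = "[0.0, 1.0]";
        real_1d_array gf = "[0.0, 1.0, 1.0, 2.0]";   // f(0,0), f(1,0), f(0,1), f(1,1)
        spline2dinterpolant s;
        spline2dbuildbilinearv(gx, 2, gy, 2, gf, 1, s);
        printf("%.2f\n", spline2dcalc(s, 0.5, 0.5)); // bilinear value at the cell center, prints 1.00
        return 0;
    }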
-- ALGLIB PROJECT -- Copyright 05.07.2007 by Bochkanov Sergey *************************************************************************/ void spline2dbuildbilinear(const real_1d_array &x, const real_1d_array &y, const real_2d_array &f, const ae_int_t m, const ae_int_t n, spline2dinterpolant &c) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::spline2dbuildbilinear(const_cast(x.c_ptr()), const_cast(y.c_ptr()), const_cast(f.c_ptr()), m, n, const_cast(c.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This subroutine was deprecated in ALGLIB 3.6.0 We recommend you to switch to Spline2DBuildBicubicV(), which is more flexible and accepts its arguments in more convenient order. -- ALGLIB PROJECT -- Copyright 05.07.2007 by Bochkanov Sergey *************************************************************************/ void spline2dbuildbicubic(const real_1d_array &x, const real_1d_array &y, const real_2d_array &f, const ae_int_t m, const ae_int_t n, spline2dinterpolant &c) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::spline2dbuildbicubic(const_cast(x.c_ptr()), const_cast(y.c_ptr()), const_cast(f.c_ptr()), m, n, const_cast(c.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This subroutine was deprecated in ALGLIB 3.6.0 We recommend you to switch to Spline2DUnpackV(), which is more flexible and accepts its arguments in more convenient order. -- ALGLIB PROJECT -- Copyright 29.06.2007 by Bochkanov Sergey *************************************************************************/ void spline2dunpack(const spline2dinterpolant &c, ae_int_t &m, ae_int_t &n, real_2d_array &tbl) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::spline2dunpack(const_cast(c.c_ptr()), &m, &n, const_cast(tbl.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Buffer object which is used to perform nearest neighbor requests in the multithreaded mode (multiple threads working with same KD-tree object). This object should be created with KDTreeCreateBuffer(). 
*************************************************************************/ _rbfcalcbuffer_owner::_rbfcalcbuffer_owner() { p_struct = (alglib_impl::rbfcalcbuffer*)alglib_impl::ae_malloc(sizeof(alglib_impl::rbfcalcbuffer), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_rbfcalcbuffer_init(p_struct, NULL); } _rbfcalcbuffer_owner::_rbfcalcbuffer_owner(const _rbfcalcbuffer_owner &rhs) { p_struct = (alglib_impl::rbfcalcbuffer*)alglib_impl::ae_malloc(sizeof(alglib_impl::rbfcalcbuffer), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_rbfcalcbuffer_init_copy(p_struct, const_cast(rhs.p_struct), NULL); } _rbfcalcbuffer_owner& _rbfcalcbuffer_owner::operator=(const _rbfcalcbuffer_owner &rhs) { if( this==&rhs ) return *this; alglib_impl::_rbfcalcbuffer_clear(p_struct); alglib_impl::_rbfcalcbuffer_init_copy(p_struct, const_cast(rhs.p_struct), NULL); return *this; } _rbfcalcbuffer_owner::~_rbfcalcbuffer_owner() { alglib_impl::_rbfcalcbuffer_clear(p_struct); ae_free(p_struct); } alglib_impl::rbfcalcbuffer* _rbfcalcbuffer_owner::c_ptr() { return p_struct; } alglib_impl::rbfcalcbuffer* _rbfcalcbuffer_owner::c_ptr() const { return const_cast(p_struct); } rbfcalcbuffer::rbfcalcbuffer() : _rbfcalcbuffer_owner() { } rbfcalcbuffer::rbfcalcbuffer(const rbfcalcbuffer &rhs):_rbfcalcbuffer_owner(rhs) { } rbfcalcbuffer& rbfcalcbuffer::operator=(const rbfcalcbuffer &rhs) { if( this==&rhs ) return *this; _rbfcalcbuffer_owner::operator=(rhs); return *this; } rbfcalcbuffer::~rbfcalcbuffer() { } /************************************************************************* RBF model. Never try to directly work with fields of this object - always use ALGLIB functions to use this object. *************************************************************************/ _rbfmodel_owner::_rbfmodel_owner() { p_struct = (alglib_impl::rbfmodel*)alglib_impl::ae_malloc(sizeof(alglib_impl::rbfmodel), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_rbfmodel_init(p_struct, NULL); } _rbfmodel_owner::_rbfmodel_owner(const _rbfmodel_owner &rhs) { p_struct = (alglib_impl::rbfmodel*)alglib_impl::ae_malloc(sizeof(alglib_impl::rbfmodel), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_rbfmodel_init_copy(p_struct, const_cast(rhs.p_struct), NULL); } _rbfmodel_owner& _rbfmodel_owner::operator=(const _rbfmodel_owner &rhs) { if( this==&rhs ) return *this; alglib_impl::_rbfmodel_clear(p_struct); alglib_impl::_rbfmodel_init_copy(p_struct, const_cast(rhs.p_struct), NULL); return *this; } _rbfmodel_owner::~_rbfmodel_owner() { alglib_impl::_rbfmodel_clear(p_struct); ae_free(p_struct); } alglib_impl::rbfmodel* _rbfmodel_owner::c_ptr() { return p_struct; } alglib_impl::rbfmodel* _rbfmodel_owner::c_ptr() const { return const_cast(p_struct); } rbfmodel::rbfmodel() : _rbfmodel_owner() { } rbfmodel::rbfmodel(const rbfmodel &rhs):_rbfmodel_owner(rhs) { } rbfmodel& rbfmodel::operator=(const rbfmodel &rhs) { if( this==&rhs ) return *this; _rbfmodel_owner::operator=(rhs); return *this; } rbfmodel::~rbfmodel() { } /************************************************************************* RBF solution report: * TerminationType - termination type, positive values - success, non-positive - failure. 
Fields which are set by modern RBF solvers (hierarchical): * RMSError - root-mean-square error; NAN for old solvers (ML, QNN) * MaxError - maximum error; NAN for old solvers (ML, QNN) *************************************************************************/ _rbfreport_owner::_rbfreport_owner() { p_struct = (alglib_impl::rbfreport*)alglib_impl::ae_malloc(sizeof(alglib_impl::rbfreport), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_rbfreport_init(p_struct, NULL); } _rbfreport_owner::_rbfreport_owner(const _rbfreport_owner &rhs) { p_struct = (alglib_impl::rbfreport*)alglib_impl::ae_malloc(sizeof(alglib_impl::rbfreport), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_rbfreport_init_copy(p_struct, const_cast(rhs.p_struct), NULL); } _rbfreport_owner& _rbfreport_owner::operator=(const _rbfreport_owner &rhs) { if( this==&rhs ) return *this; alglib_impl::_rbfreport_clear(p_struct); alglib_impl::_rbfreport_init_copy(p_struct, const_cast(rhs.p_struct), NULL); return *this; } _rbfreport_owner::~_rbfreport_owner() { alglib_impl::_rbfreport_clear(p_struct); ae_free(p_struct); } alglib_impl::rbfreport* _rbfreport_owner::c_ptr() { return p_struct; } alglib_impl::rbfreport* _rbfreport_owner::c_ptr() const { return const_cast(p_struct); } rbfreport::rbfreport() : _rbfreport_owner() ,rmserror(p_struct->rmserror),maxerror(p_struct->maxerror),arows(p_struct->arows),acols(p_struct->acols),annz(p_struct->annz),iterationscount(p_struct->iterationscount),nmv(p_struct->nmv),terminationtype(p_struct->terminationtype) { } rbfreport::rbfreport(const rbfreport &rhs):_rbfreport_owner(rhs) ,rmserror(p_struct->rmserror),maxerror(p_struct->maxerror),arows(p_struct->arows),acols(p_struct->acols),annz(p_struct->annz),iterationscount(p_struct->iterationscount),nmv(p_struct->nmv),terminationtype(p_struct->terminationtype) { } rbfreport& rbfreport::operator=(const rbfreport &rhs) { if( this==&rhs ) return *this; _rbfreport_owner::operator=(rhs); return *this; } rbfreport::~rbfreport() { } /************************************************************************* This function serializes data structure to string. Important properties of s_out: * it contains alphanumeric characters, dots, underscores, minus signs * these symbols are grouped into words, which are separated by spaces and Windows-style (CR+LF) newlines * although serializer uses spaces and CR+LF as separators, you can replace any separator character by arbitrary combination of spaces, tabs, Windows or Unix newlines. It allows flexible reformatting of the string in case you want to include it into text or XML file. But you should not insert separators into the middle of the "words" nor you should change case of letters. * s_out can be freely moved between 32-bit and 64-bit systems, little and big endian machines, and so on. You can serialize structure on 32-bit machine and unserialize it on 64-bit one (or vice versa), or serialize it on SPARC and unserialize on x86. You can also serialize it in C++ version of ALGLIB and unserialize in C# one, and vice versa. 
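A minimal round-trip sketch ('model' denotes an rbfmodel that was already built with rbfbuildmodel(), see below; the evaluation point is illustrative):

    std::string blob;
    alglib::rbfserialize(model, blob);                 // model -> portable text representation
    alglib::rbfmodel restored;
    alglib::rbfunserialize(blob, restored);            // text representation -> new model instance
    double v = alglib::rbfcalc2(restored, 0.0, 0.0);   // restored model is immediately usable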
*************************************************************************/ void rbfserialize(rbfmodel &obj, std::string &s_out) { alglib_impl::ae_state state; alglib_impl::ae_serializer serializer; alglib_impl::ae_int_t ssize; alglib_impl::ae_state_init(&state); try { alglib_impl::ae_serializer_init(&serializer); alglib_impl::ae_serializer_alloc_start(&serializer); alglib_impl::rbfalloc(&serializer, obj.c_ptr(), &state); ssize = alglib_impl::ae_serializer_get_alloc_size(&serializer); s_out.clear(); s_out.reserve((size_t)(ssize+1)); alglib_impl::ae_serializer_sstart_str(&serializer, &s_out); alglib_impl::rbfserialize(&serializer, obj.c_ptr(), &state); alglib_impl::ae_serializer_stop(&serializer, &state); if( s_out.length()>(size_t)ssize ) throw ap_error("ALGLIB: serialization integrity error"); alglib_impl::ae_serializer_clear(&serializer); alglib_impl::ae_state_clear(&state); } catch(alglib_impl::ae_error_type) { throw ap_error(state.error_msg); } } /************************************************************************* This function unserializes data structure from string. *************************************************************************/ void rbfunserialize(const std::string &s_in, rbfmodel &obj) { alglib_impl::ae_state state; alglib_impl::ae_serializer serializer; alglib_impl::ae_state_init(&state); try { alglib_impl::ae_serializer_init(&serializer); alglib_impl::ae_serializer_ustart_str(&serializer, &s_in); alglib_impl::rbfunserialize(&serializer, obj.c_ptr(), &state); alglib_impl::ae_serializer_stop(&serializer, &state); alglib_impl::ae_serializer_clear(&serializer); alglib_impl::ae_state_clear(&state); } catch(alglib_impl::ae_error_type) { throw ap_error(state.error_msg); } } /************************************************************************* This function serializes data structure to C++ stream. Data stream generated by this function is same as string representation generated by string version of serializer - alphanumeric characters, dots, underscores, minus signs, which are grouped into words separated by spaces and CR+LF. We recommend you to read comments on string version of serializer to find out more about serialization of AlGLIB objects. *************************************************************************/ void rbfserialize(rbfmodel &obj, std::ostream &s_out) { alglib_impl::ae_state state; alglib_impl::ae_serializer serializer; alglib_impl::ae_state_init(&state); try { alglib_impl::ae_serializer_init(&serializer); alglib_impl::ae_serializer_alloc_start(&serializer); alglib_impl::rbfalloc(&serializer, obj.c_ptr(), &state); alglib_impl::ae_serializer_get_alloc_size(&serializer); // not actually needed, but we have to ask alglib_impl::ae_serializer_sstart_stream(&serializer, &s_out); alglib_impl::rbfserialize(&serializer, obj.c_ptr(), &state); alglib_impl::ae_serializer_stop(&serializer, &state); alglib_impl::ae_serializer_clear(&serializer); alglib_impl::ae_state_clear(&state); } catch(alglib_impl::ae_error_type) { throw ap_error(state.error_msg); } } /************************************************************************* This function unserializes data structure from stream. 
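A minimal stream-based round trip; std::stringstream (requires the <sstream> header) stands in for any std::ostream/std::istream pair, and 'model' denotes an rbfmodel that was already built with rbfbuildmodel():

    std::stringstream ss;
    alglib::rbfserialize(model, ss);        // write model to the stream (overload documented above)
    alglib::rbfmodel restored;
    alglib::rbfunserialize(ss, restored);   // read the model back from the same stream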
*************************************************************************/ void rbfunserialize(const std::istream &s_in, rbfmodel &obj) { alglib_impl::ae_state state; alglib_impl::ae_serializer serializer; alglib_impl::ae_state_init(&state); try { alglib_impl::ae_serializer_init(&serializer); alglib_impl::ae_serializer_ustart_stream(&serializer, &s_in); alglib_impl::rbfunserialize(&serializer, obj.c_ptr(), &state); alglib_impl::ae_serializer_stop(&serializer, &state); alglib_impl::ae_serializer_clear(&serializer); alglib_impl::ae_state_clear(&state); } catch(alglib_impl::ae_error_type) { throw ap_error(state.error_msg); } } /************************************************************************* This function creates RBF model for a scalar (NY=1) or vector (NY>1) function in a NX-dimensional space (NX>=1). Newly created model is empty. It can be used for interpolation right after creation, but it just returns zeros. You have to add points to the model, tune interpolation settings, and then call model construction function rbfbuildmodel() which will update model according to your specification. USAGE: 1. User creates model with rbfcreate() 2. User adds dataset with rbfsetpoints() (points do NOT have to be on a regular grid) or rbfsetpointsandscales(). 3. (OPTIONAL) User chooses polynomial term by calling: * rbflinterm() to set linear term * rbfconstterm() to set constant term * rbfzeroterm() to set zero term By default, linear term is used. 4. User tweaks algorithm properties with rbfsetalgohierarchical() method (or chooses one of the legacy algorithms - QNN (rbfsetalgoqnn) or ML (rbfsetalgomultilayer)). 5. User calls rbfbuildmodel() function which rebuilds model according to the specification 6. User may call rbfcalc() to calculate model value at the specified point, rbfgridcalc() to calculate model values at the points of the regular grid. User may extract model coefficients with rbfunpack() call. IMPORTANT: we recommend you to use latest model construction algorithm - hierarchical RBFs, which is activated by rbfsetalgohierarchical() function. This algorithm is the fastest one, and most memory- efficient. However, it is incompatible with older versions of ALGLIB (pre-3.11). So, if you serialize hierarchical model, you will be unable to load it in pre-3.11 ALGLIB. Other model types (QNN and RBF-ML) are still backward-compatible. INPUT PARAMETERS: NX - dimension of the space, NX>=1 NY - function dimension, NY>=1 OUTPUT PARAMETERS: S - RBF model (initially equals to zero) NOTE 1: memory requirements. RBF models require amount of memory which is proportional to the number of data points. Some additional memory is allocated during model construction, but most of this memory is freed after model coefficients are calculated. Amount of this additional memory depends on model construction algorithm being used. NOTE 2: prior to ALGLIB version 3.11, RBF models supported only NX=2 or NX=3. Any attempt to create single-dimensional or more than 3-dimensional RBF model resulted in exception. ALGLIB 3.11 supports any NX>0, but models created with NX!=2 and NX!=3 are incompatible with (a) older versions of ALGLIB, (b) old model construction algorithms (QNN or RBF-ML). So, if you create a model with NX=2 or NX=3, then, depending on specific model construction algorithm being chosen, you will (QNN and RBF-ML) or will not (HierarchicalRBF) get backward compatibility with older versions of ALGLIB. You have a choice here. 
However, if you create a model with NX neither 2 nor 3, you have no backward compatibility from the start, and you are forced to use hierarchical RBFs and ALGLIB 3.11 or later. -- ALGLIB -- Copyright 13.12.2011, 20.06.2016 by Bochkanov Sergey *************************************************************************/ void rbfcreate(const ae_int_t nx, const ae_int_t ny, rbfmodel &s) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::rbfcreate(nx, ny, const_cast(s.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function creates buffer structure which can be used to perform parallel RBF model evaluations (with one RBF model instance being used from multiple threads, as long as different threads use different instances of buffer). This buffer object can be used with rbftscalcbuf() function (here "ts" stands for "thread-safe", "buf" is a suffix which denotes function which reuses previously allocated output space). How to use it: * create RBF model structure with rbfcreate() * load data, tune parameters * call rbfbuildmodel() * call rbfcreatecalcbuffer(), once per thread working with RBF model (you should call this function only AFTER call to rbfbuildmodel(), see below for more information) * call rbftscalcbuf() from different threads, with each thread working with its own copy of buffer object. INPUT PARAMETERS S - RBF model OUTPUT PARAMETERS Buf - external buffer. IMPORTANT: buffer object should be used only with RBF model object which was used to initialize buffer. Any attempt to use buffer with different object is dangerous - you may get memory violation error because sizes of internal arrays do not fit to dimensions of RBF structure. IMPORTANT: you should call this function only for model which was built with rbfbuildmodel() function, after successful invocation of rbfbuildmodel(). Sizes of some internal structures are determined only after model is built, so buffer object created before model construction stage will be useless (and any attempt to use it will result in exception). -- ALGLIB -- Copyright 02.04.2016 by Sergey Bochkanov *************************************************************************/ void rbfcreatecalcbuffer(const rbfmodel &s, rbfcalcbuffer &buf) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::rbfcreatecalcbuffer(const_cast(s.c_ptr()), const_cast(buf.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function adds dataset. This function overrides results of the previous calls, i.e. multiple calls of this function will result in only the last set being added. IMPORTANT: ALGLIB version 3.11 and later allows you to specify a set of per-dimension scales. Interpolation radii are multiplied by the scale vector. It may be useful if you have mixed spatio-temporal data (say, a set of 3D slices recorded at different times). You should call rbfsetpointsandscales() function to use this feature. INPUT PARAMETERS: S - RBF model, initialized by rbfcreate() call. XY - points, array[N,NX+NY]. One row corresponds to one point in the dataset. 
First NX elements are coordinates, next NY elements are function values. Array may be larger than specified, in this case only leading [N,NX+NY] elements will be used. N - number of points in the dataset After you've added dataset and (optionally) tuned algorithm settings you should call rbfbuildmodel() in order to build a model for you. NOTE: dataset added by this function is not saved during model serialization. MODEL ITSELF is serialized, but data used to build it are not. So, if you 1) add dataset to empty RBF model, 2) serialize and unserialize it, then you will get an empty RBF model with no dataset being attached. From the other side, if you call rbfbuildmodel() between (1) and (2), then after (2) you will get your fully constructed RBF model - but again with no dataset attached, so subsequent calls to rbfbuildmodel() will produce empty model. -- ALGLIB -- Copyright 13.12.2011 by Bochkanov Sergey *************************************************************************/ void rbfsetpoints(const rbfmodel &s, const real_2d_array &xy, const ae_int_t n) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::rbfsetpoints(const_cast(s.c_ptr()), const_cast(xy.c_ptr()), n, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function adds dataset. This function overrides results of the previous calls, i.e. multiple calls of this function will result in only the last set being added. IMPORTANT: ALGLIB version 3.11 and later allows you to specify a set of per-dimension scales. Interpolation radii are multiplied by the scale vector. It may be useful if you have mixed spatio-temporal data (say, a set of 3D slices recorded at different times). You should call rbfsetpointsandscales() function to use this feature. INPUT PARAMETERS: S - RBF model, initialized by rbfcreate() call. XY - points, array[N,NX+NY]. One row corresponds to one point in the dataset. First NX elements are coordinates, next NY elements are function values. Array may be larger than specified, in this case only leading [N,NX+NY] elements will be used. N - number of points in the dataset After you've added dataset and (optionally) tuned algorithm settings you should call rbfbuildmodel() in order to build a model for you. NOTE: dataset added by this function is not saved during model serialization. MODEL ITSELF is serialized, but data used to build it are not. So, if you 1) add dataset to empty RBF model, 2) serialize and unserialize it, then you will get an empty RBF model with no dataset being attached. From the other side, if you call rbfbuildmodel() between (1) and (2), then after (2) you will get your fully constructed RBF model - but again with no dataset attached, so subsequent calls to rbfbuildmodel() will produce empty model. 
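Putting the pieces together, a minimal end-to-end sketch for a 2-dimensional scalar problem (the dataset, radius and layer count below are illustrative values only):

    alglib::rbfmodel model;
    alglib::rbfcreate(2, 1, model);                                   // NX=2, NY=1
    alglib::real_2d_array xy = "[[0,0,1],[1,0,2],[0,1,3],[1,1,4]]";   // one row = [x0, x1, f]
    alglib::rbfsetpoints(model, xy);                                  // N is taken from xy.rows()
    alglib::rbfsetalgohierarchical(model, 1.0, 3, 0.0);               // RBase=1.0, NLayers=3, no smoothing
    alglib::rbfreport rep;
    alglib::rbfbuildmodel(model, rep);                                // rep.terminationtype>0 on success
    double v = alglib::rbfcalc2(model, 0.5, 0.5);                     // evaluate the fitted model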
-- ALGLIB -- Copyright 13.12.2011 by Bochkanov Sergey *************************************************************************/ void rbfsetpoints(const rbfmodel &s, const real_2d_array &xy) { alglib_impl::ae_state _alglib_env_state; ae_int_t n; n = xy.rows(); alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::rbfsetpoints(const_cast(s.c_ptr()), const_cast(xy.c_ptr()), n, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function adds dataset and a vector of per-dimension scales. It may be useful if you have mixed spatio-temporal data - say, a set of 3D slices recorded at different times. Such data typically require different RBF radii for spatial and temporal dimensions. ALGLIB solves this problem by specifying single RBF radius, which is (optionally) multiplied by the scale vector. This function overrides results of the previous calls, i.e. multiple calls of this function will result in only the last set being added. IMPORTANT: only HierarchicalRBF algorithm can work with scaled points. So, using this function results in RBF models which can be used in ALGLIB 3.11 or later. Previous versions of the library will be unable to unserialize models produced by HierarchicalRBF algo. Any attempt to use this function with RBF-ML or QNN algorithms will result in -3 error code being returned (incorrect algorithm). INPUT PARAMETERS: R - RBF model, initialized by rbfcreate() call. XY - points, array[N,NX+NY]. One row corresponds to one point in the dataset. First NX elements are coordinates, next NY elements are function values. Array may be larger than specified, in this case only leading [N,NX+NY] elements will be used. N - number of points in the dataset S - array[NX], scale vector, S[i]>0. After you've added dataset and (optionally) tuned algorithm settings you should call rbfbuildmodel() in order to build a model for you. NOTE: dataset added by this function is not saved during model serialization. MODEL ITSELF is serialized, but data used to build it are not. So, if you 1) add dataset to empty RBF model, 2) serialize and unserialize it, then you will get an empty RBF model with no dataset being attached. From the other side, if you call rbfbuildmodel() between (1) and (2), then after (2) you will get your fully constructed RBF model - but again with no dataset attached, so subsequent calls to rbfbuildmodel() will produce empty model. -- ALGLIB -- Copyright 20.06.2016 by Bochkanov Sergey *************************************************************************/ void rbfsetpointsandscales(const rbfmodel &r, const real_2d_array &xy, const ae_int_t n, const real_1d_array &s) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::rbfsetpointsandscales(const_cast(r.c_ptr()), const_cast(xy.c_ptr()), n, const_cast(s.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function adds dataset and a vector of per-dimension scales. It may be useful if you have mixed spatio-temporal data - say, a set of 3D slices recorded at different times. Such data typically require different RBF radii for spatial and temporal dimensions. 
ALGLIB solves this problem by specifying single RBF radius, which is (optionally) multiplied by the scale vector. This function overrides results of the previous calls, i.e. multiple calls of this function will result in only the last set being added. IMPORTANT: only HierarchicalRBF algorithm can work with scaled points. So, using this function results in RBF models which can be used in ALGLIB 3.11 or later. Previous versions of the library will be unable to unserialize models produced by HierarchicalRBF algo. Any attempt to use this function with RBF-ML or QNN algorithms will result in -3 error code being returned (incorrect algorithm). INPUT PARAMETERS: R - RBF model, initialized by rbfcreate() call. XY - points, array[N,NX+NY]. One row corresponds to one point in the dataset. First NX elements are coordinates, next NY elements are function values. Array may be larger than specified, in this case only leading [N,NX+NY] elements will be used. N - number of points in the dataset S - array[NX], scale vector, S[i]>0. After you've added dataset and (optionally) tuned algorithm settings you should call rbfbuildmodel() in order to build a model for you. NOTE: dataset added by this function is not saved during model serialization. MODEL ITSELF is serialized, but data used to build it are not. So, if you 1) add dataset to empty RBF model, 2) serialize and unserialize it, then you will get an empty RBF model with no dataset being attached. From the other side, if you call rbfbuildmodel() between (1) and (2), then after (2) you will get your fully constructed RBF model - but again with no dataset attached, so subsequent calls to rbfbuildmodel() will produce empty model. -- ALGLIB -- Copyright 20.06.2016 by Bochkanov Sergey *************************************************************************/ void rbfsetpointsandscales(const rbfmodel &r, const real_2d_array &xy, const real_1d_array &s) { alglib_impl::ae_state _alglib_env_state; ae_int_t n; n = xy.rows(); alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::rbfsetpointsandscales(const_cast(r.c_ptr()), const_cast(xy.c_ptr()), n, const_cast(s.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* DEPRECATED:since version 3.11 ALGLIB includes new RBF model construction algorithm, Hierarchical RBF. This algorithm is faster and requires less memory than QNN and RBF-ML. It is especially good for large-scale interpolation problems. So, we recommend you to consider Hierarchical RBF as default option. ========================================================================== This function sets RBF interpolation algorithm. ALGLIB supports several RBF algorithms with different properties. This algorithm is called RBF-QNN and it is good for point sets with following properties: a) all points are distinct b) all points are well separated. c) points distribution is approximately uniform. There is no "contour lines", clusters of points, or other small-scale structures. Algorithm description: 1) interpolation centers are allocated to data points 2) interpolation radii are calculated as distances to the nearest centers times Q coefficient (where Q is a value from [0.75,1.50]). 3) after performing (2) radii are transformed in order to avoid situation when single outlier has very large radius and influences many points across all dataset. 
Transformation has following form: new_r[i] = min(r[i],Z*median(r[])) where r[i] is I-th radius, median() is a median radius across entire dataset, Z is user-specified value which controls amount of deviation from median radius. When (a) is violated, we will be unable to build RBF model. When (b) or (c) are violated, model will be built, but interpolation quality will be low. See http://www.alglib.net/interpolation/ for more information on this subject. This algorithm is used by default. Additional Q parameter controls smoothness properties of the RBF basis: * Q<0.75 will give perfectly conditioned basis, but terrible smoothness properties (RBF interpolant will have sharp peaks around function values) * Q around 1.0 gives good balance between smoothness and condition number * Q>1.5 will lead to badly conditioned systems and slow convergence of the underlying linear solver (although smoothness will be very good) * Q>2.0 will effectively make optimizer useless because it won't converge within reasonable amount of iterations. It is possible to set such large Q, but it is advised not to do so. INPUT PARAMETERS: S - RBF model, initialized by RBFCreate() call Q - Q parameter, Q>0, recommended value - 1.0 Z - Z parameter, Z>0, recommended value - 5.0 NOTE: this function has some serialization-related subtleties. We recommend you to study serialization examples from ALGLIB Reference Manual if you want to perform serialization of your models. -- ALGLIB -- Copyright 13.12.2011 by Bochkanov Sergey *************************************************************************/ void rbfsetalgoqnn(const rbfmodel &s, const double q, const double z) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::rbfsetalgoqnn(const_cast(s.c_ptr()), q, z, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* DEPRECATED:since version 3.11 ALGLIB includes new RBF model construction algorithm, Hierarchical RBF. This algorithm is faster and requires less memory than QNN and RBF-ML. It is especially good for large-scale interpolation problems. So, we recommend you to consider Hierarchical RBF as default option. ========================================================================== This function sets RBF interpolation algorithm. ALGLIB supports several RBF algorithms with different properties. This algorithm is called RBF-QNN and it is good for point sets with following properties: a) all points are distinct b) all points are well separated. c) points distribution is approximately uniform. There is no "contour lines", clusters of points, or other small-scale structures. Algorithm description: 1) interpolation centers are allocated to data points 2) interpolation radii are calculated as distances to the nearest centers times Q coefficient (where Q is a value from [0.75,1.50]). 3) after performing (2) radii are transformed in order to avoid situation when single outlier has very large radius and influences many points across all dataset. Transformation has following form: new_r[i] = min(r[i],Z*median(r[])) where r[i] is I-th radius, median() is a median radius across entire dataset, Z is user-specified value which controls amount of deviation from median radius. When (a) is violated, we will be unable to build RBF model. 
When (b) or (c) are violated, model will be built, but interpolation quality will be low. See http://www.alglib.net/interpolation/ for more information on this subject. This algorithm is used by default. Additional Q parameter controls smoothness properties of the RBF basis: * Q<0.75 will give perfectly conditioned basis, but terrible smoothness properties (RBF interpolant will have sharp peaks around function values) * Q around 1.0 gives good balance between smoothness and condition number * Q>1.5 will lead to badly conditioned systems and slow convergence of the underlying linear solver (although smoothness will be very good) * Q>2.0 will effectively make optimizer useless because it won't converge within reasonable amount of iterations. It is possible to set such large Q, but it is advised not to do so. INPUT PARAMETERS: S - RBF model, initialized by RBFCreate() call Q - Q parameter, Q>0, recommended value - 1.0 Z - Z parameter, Z>0, recommended value - 5.0 NOTE: this function has some serialization-related subtleties. We recommend you to study serialization examples from ALGLIB Reference Manual if you want to perform serialization of your models. -- ALGLIB -- Copyright 13.12.2011 by Bochkanov Sergey *************************************************************************/ void rbfsetalgoqnn(const rbfmodel &s) { alglib_impl::ae_state _alglib_env_state; double q; double z; q = 1.0; z = 5.0; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::rbfsetalgoqnn(const_cast(s.c_ptr()), q, z, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* DEPRECATED:since version 3.11 ALGLIB includes new RBF model construction algorithm, Hierarchical RBF. This algorithm is faster and requires less memory than QNN and RBF-ML. It is especially good for large-scale interpolation problems. So, we recommend you to consider Hierarchical RBF as default option. ========================================================================== This function sets RBF interpolation algorithm. ALGLIB supports several RBF algorithms with different properties. This algorithm is called RBF-ML. It builds multilayer RBF model, i.e. model with subsequently decreasing radii, which allows us to combine smoothness (due to large radii of the first layers) with exactness (due to small radii of the last layers) and fast convergence. Internally RBF-ML uses many different means of acceleration, from sparse matrices to KD-trees, which results in algorithm whose working time is roughly proportional to N*log(N)*Density*RBase^2*NLayers, where N is a number of points, Density is an average density if points per unit of the interpolation space, RBase is an initial radius, NLayers is a number of layers. RBF-ML is good for following kinds of interpolation problems: 1. "exact" problems (perfect fit) with well separated points 2. least squares problems with arbitrary distribution of points (algorithm gives perfect fit where it is possible, and resorts to least squares fit in the hard areas). 3. noisy problems where we want to apply some controlled amount of smoothing. INPUT PARAMETERS: S - RBF model, initialized by RBFCreate() call RBase - RBase parameter, RBase>0 NLayers - NLayers parameter, NLayers>0, recommended value to start with - about 5. LambdaV - regularization value, can be useful when solving problem in the least squares sense. 
Optimal lambda is problem- dependent and require trial and error. In our experience, good lambda can be as large as 0.1, and you can use 0.001 as initial guess. Default value - 0.01, which is used when LambdaV is not given. You can specify zero value, but it is not recommended to do so. TUNING ALGORITHM In order to use this algorithm you have to choose three parameters: * initial radius RBase * number of layers in the model NLayers * regularization coefficient LambdaV Initial radius is easy to choose - you can pick any number several times larger than the average distance between points. Algorithm won't break down if you choose radius which is too large (model construction time will increase, but model will be built correctly). Choose such number of layers that RLast=RBase/2^(NLayers-1) (radius used by the last layer) will be smaller than the typical distance between points. In case model error is too large, you can increase number of layers. Having more layers will make model construction and evaluation proportionally slower, but it will allow you to have model which precisely fits your data. From the other side, if you want to suppress noise, you can DECREASE number of layers to make your model less flexible. Regularization coefficient LambdaV controls smoothness of the individual models built for each layer. We recommend you to use default value in case you don't want to tune this parameter, because having non-zero LambdaV accelerates and stabilizes internal iterative algorithm. In case you want to suppress noise you can use LambdaV as additional parameter (larger value = more smoothness) to tune. TYPICAL ERRORS 1. Using initial radius which is too large. Memory requirements of the RBF-ML are roughly proportional to N*Density*RBase^2 (where Density is an average density of points per unit of the interpolation space). In the extreme case of the very large RBase we will need O(N^2) units of memory - and many layers in order to decrease radius to some reasonably small value. 2. Using too small number of layers - RBF models with large radius are not flexible enough to reproduce small variations in the target function. You need many layers with different radii, from large to small, in order to have good model. 3. Using initial radius which is too small. You will get model with "holes" in the areas which are too far away from interpolation centers. However, algorithm will work correctly (and quickly) in this case. 4. Using too many layers - you will get too large and too slow model. This model will perfectly reproduce your function, but maybe you will be able to achieve similar results with less layers (and less memory). -- ALGLIB -- Copyright 02.03.2012 by Bochkanov Sergey *************************************************************************/ void rbfsetalgomultilayer(const rbfmodel &s, const double rbase, const ae_int_t nlayers, const double lambdav) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::rbfsetalgomultilayer(const_cast(s.c_ptr()), rbase, nlayers, lambdav, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* DEPRECATED:since version 3.11 ALGLIB includes new RBF model construction algorithm, Hierarchical RBF. This algorithm is faster and requires less memory than QNN and RBF-ML. 
It is especially good for large-scale interpolation problems. So, we recommend you to consider Hierarchical RBF as default option. ========================================================================== This function sets RBF interpolation algorithm. ALGLIB supports several RBF algorithms with different properties. This algorithm is called RBF-ML. It builds multilayer RBF model, i.e. model with subsequently decreasing radii, which allows us to combine smoothness (due to large radii of the first layers) with exactness (due to small radii of the last layers) and fast convergence. Internally RBF-ML uses many different means of acceleration, from sparse matrices to KD-trees, which results in algorithm whose working time is roughly proportional to N*log(N)*Density*RBase^2*NLayers, where N is a number of points, Density is an average density if points per unit of the interpolation space, RBase is an initial radius, NLayers is a number of layers. RBF-ML is good for following kinds of interpolation problems: 1. "exact" problems (perfect fit) with well separated points 2. least squares problems with arbitrary distribution of points (algorithm gives perfect fit where it is possible, and resorts to least squares fit in the hard areas). 3. noisy problems where we want to apply some controlled amount of smoothing. INPUT PARAMETERS: S - RBF model, initialized by RBFCreate() call RBase - RBase parameter, RBase>0 NLayers - NLayers parameter, NLayers>0, recommended value to start with - about 5. LambdaV - regularization value, can be useful when solving problem in the least squares sense. Optimal lambda is problem- dependent and require trial and error. In our experience, good lambda can be as large as 0.1, and you can use 0.001 as initial guess. Default value - 0.01, which is used when LambdaV is not given. You can specify zero value, but it is not recommended to do so. TUNING ALGORITHM In order to use this algorithm you have to choose three parameters: * initial radius RBase * number of layers in the model NLayers * regularization coefficient LambdaV Initial radius is easy to choose - you can pick any number several times larger than the average distance between points. Algorithm won't break down if you choose radius which is too large (model construction time will increase, but model will be built correctly). Choose such number of layers that RLast=RBase/2^(NLayers-1) (radius used by the last layer) will be smaller than the typical distance between points. In case model error is too large, you can increase number of layers. Having more layers will make model construction and evaluation proportionally slower, but it will allow you to have model which precisely fits your data. From the other side, if you want to suppress noise, you can DECREASE number of layers to make your model less flexible. Regularization coefficient LambdaV controls smoothness of the individual models built for each layer. We recommend you to use default value in case you don't want to tune this parameter, because having non-zero LambdaV accelerates and stabilizes internal iterative algorithm. In case you want to suppress noise you can use LambdaV as additional parameter (larger value = more smoothness) to tune. TYPICAL ERRORS 1. Using initial radius which is too large. Memory requirements of the RBF-ML are roughly proportional to N*Density*RBase^2 (where Density is an average density of points per unit of the interpolation space). 
In the extreme case of the very large RBase we will need O(N^2) units of memory - and many layers in order to decrease radius to some reasonably small value. 2. Using too small number of layers - RBF models with large radius are not flexible enough to reproduce small variations in the target function. You need many layers with different radii, from large to small, in order to have good model. 3. Using initial radius which is too small. You will get model with "holes" in the areas which are too far away from interpolation centers. However, algorithm will work correctly (and quickly) in this case. 4. Using too many layers - you will get too large and too slow model. This model will perfectly reproduce your function, but maybe you will be able to achieve similar results with less layers (and less memory). -- ALGLIB -- Copyright 02.03.2012 by Bochkanov Sergey *************************************************************************/ void rbfsetalgomultilayer(const rbfmodel &s, const double rbase, const ae_int_t nlayers) { alglib_impl::ae_state _alglib_env_state; double lambdav; lambdav = 0.01; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::rbfsetalgomultilayer(const_cast(s.c_ptr()), rbase, nlayers, lambdav, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function sets RBF interpolation algorithm. ALGLIB supports several RBF algorithms with different properties. This algorithm is called Hierarchical RBF. It similar to its previous incarnation, RBF-ML, i.e. it also builds a sequence of models with decreasing radii. However, it uses more economical way of building upper layers (ones with large radii), which results in faster model construction and evaluation, as well as smaller memory footprint during construction. This algorithm has following important features: * ability to handle millions of points * controllable smoothing via nonlinearity penalization * support for NX-dimensional models with NX=1 or NX>3 (unlike QNN or RBF-ML) * support for specification of per-dimensional radii via scale vector, which is set by means of rbfsetpointsandscales() function. This feature is useful if you solve spatio-temporal interpolation problems, where different radii are required for spatial and temporal dimensions. Running times are roughly proportional to: * N*log(N)*NLayers - for model construction * N*NLayers - for model evaluation You may see that running time does not depend on search radius or points density, just on number of layers in the hierarchy. IMPORTANT: this model construction algorithm was introduced in ALGLIB 3.11 and produces models which are INCOMPATIBLE with previous versions of ALGLIB. You can not unserialize models produced with this function in ALGLIB 3.10 or earlier. INPUT PARAMETERS: S - RBF model, initialized by rbfcreate() call RBase - RBase parameter, RBase>0 NLayers - NLayers parameter, NLayers>0, recommended value to start with - about 5. LambdaNS- >=0, nonlinearity penalty coefficient, negative values are not allowed. This parameter adds controllable smoothing to the problem, which may reduce noise. Specification of non- zero lambda means that in addition to fitting error solver will also minimize LambdaNS*|S''(x)|^2 (appropriately generalized to multiple dimensions. 
Specification of exactly zero value means that no penalty is added (we do not even evaluate matrix of second derivatives which is necessary for smoothing). Calculation of nonlinearity penalty is costly - it results in several-fold increase of model construction time. Evaluation time remains the same. Optimal lambda is problem-dependent and requires trial and error. Good value to start from is 1e-5...1e-6, which corresponds to slightly noticeable smoothing of the function. Value 1e-2 usually means that quite heavy smoothing is applied. TUNING ALGORITHM In order to use this algorithm you have to choose three parameters: * initial radius RBase * number of layers in the model NLayers * penalty coefficient LambdaNS Initial radius is easy to choose - you can pick any number several times larger than the average distance between points. Algorithm won't break down if you choose radius which is too large (model construction time will increase, but model will be built correctly). Choose such number of layers that RLast=RBase/2^(NLayers-1) (radius used by the last layer) will be smaller than the typical distance between points. In case model error is too large, you can increase number of layers. Having more layers will make model construction and evaluation proportionally slower, but it will allow you to have model which precisely fits your data. From the other side, if you want to suppress noise, you can DECREASE number of layers to make your model less flexible (or specify non-zero LambdaNS). TYPICAL ERRORS 1. Using too small number of layers - RBF models with large radius are not flexible enough to reproduce small variations in the target function. You need many layers with different radii, from large to small, in order to have good model. 2. Using initial radius which is too small. You will get model with "holes" in the areas which are too far away from interpolation centers. However, algorithm will work correctly (and quickly) in this case. -- ALGLIB -- Copyright 20.06.2016 by Bochkanov Sergey *************************************************************************/ void rbfsetalgohierarchical(const rbfmodel &s, const double rbase, const ae_int_t nlayers, const double lambdans) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::rbfsetalgohierarchical(const_cast(s.c_ptr()), rbase, nlayers, lambdans, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function sets linear term (model is a sum of radial basis functions plus linear polynomial). This function won't have effect until next call to RBFBuildModel(). INPUT PARAMETERS: S - RBF model, initialized by RBFCreate() call NOTE: this function has some serialization-related subtleties. We recommend you to study serialization examples from ALGLIB Reference Manual if you want to perform serialization of your models. 
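A minimal sketch ('model' and 'rep' are assumed to come from the workflow shown earlier; the linear term is the default, so an explicit call is mainly useful to restore it after rbfsetconstterm() or rbfsetzeroterm()):

    alglib::rbfsetlinterm(model);        // request linear polynomial tail
    alglib::rbfbuildmodel(model, rep);   // the change takes effect only when the model is rebuilt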
-- ALGLIB -- Copyright 13.12.2011 by Bochkanov Sergey *************************************************************************/ void rbfsetlinterm(const rbfmodel &s) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::rbfsetlinterm(const_cast(s.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function sets constant term (model is a sum of radial basis functions plus constant). This function won't have effect until next call to RBFBuildModel(). INPUT PARAMETERS: S - RBF model, initialized by RBFCreate() call NOTE: this function has some serialization-related subtleties. We recommend you to study serialization examples from ALGLIB Reference Manual if you want to perform serialization of your models. -- ALGLIB -- Copyright 13.12.2011 by Bochkanov Sergey *************************************************************************/ void rbfsetconstterm(const rbfmodel &s) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::rbfsetconstterm(const_cast(s.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function sets zero term (model is a sum of radial basis functions without polynomial term). This function won't have effect until next call to RBFBuildModel(). INPUT PARAMETERS: S - RBF model, initialized by RBFCreate() call NOTE: this function has some serialization-related subtleties. We recommend you to study serialization examples from ALGLIB Reference Manual if you want to perform serialization of your models. -- ALGLIB -- Copyright 13.12.2011 by Bochkanov Sergey *************************************************************************/ void rbfsetzeroterm(const rbfmodel &s) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::rbfsetzeroterm(const_cast(s.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function sets basis function type, which can be: * 0 for classic Gaussian * 1 for fast and compact bell-like basis function, which becomes exactly zero at distance equal to 3*R (default option). INPUT PARAMETERS: S - RBF model, initialized by RBFCreate() call BF - basis function type: * 0 - classic Gaussian * 1 - fast and compact one -- ALGLIB -- Copyright 01.02.2017 by Bochkanov Sergey *************************************************************************/ void rbfsetv2bf(const rbfmodel &s, const ae_int_t bf) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::rbfsetv2bf(const_cast(s.c_ptr()), bf, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function sets stopping criteria of the underlying linear solver for hierarchical (version 2) RBF constructor. 
INPUT PARAMETERS: S - RBF model, initialized by RBFCreate() call MaxIts - this criterion will stop algorithm after MaxIts iterations. Typically a few hundreds iterations is required, with 400 being a good default value to start experimentation. Zero value means that default value will be selected. -- ALGLIB -- Copyright 01.02.2017 by Bochkanov Sergey *************************************************************************/ void rbfsetv2its(const rbfmodel &s, const ae_int_t maxits) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::rbfsetv2its(const_cast(s.c_ptr()), maxits, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function sets support radius parameter of hierarchical (version 2) RBF constructor. Hierarchical RBF model achieves great speed-up by removing from the model excessive (too dense) nodes. Say, if you have RBF radius equal to 1 meter, and two nodes are just 1 millimeter apart, you may remove one of them without reducing model quality. Support radius parameter is used to justify which points need removal, and which do not. If two points are less than SUPPORT_R*CUR_RADIUS units of distance apart, one of them is removed from the model. The larger support radius is, the faster model construction AND evaluation are. However, too large values result in "bumpy" models. INPUT PARAMETERS: S - RBF model, initialized by RBFCreate() call R - support radius coefficient, >=0. Recommended values are [0.1,0.4] range, with 0.1 being default value. -- ALGLIB -- Copyright 01.02.2017 by Bochkanov Sergey *************************************************************************/ void rbfsetv2supportr(const rbfmodel &s, const double r) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::rbfsetv2supportr(const_cast(s.c_ptr()), r, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function builds RBF model and returns report (contains some information which can be used for evaluation of the algorithm properties). Call to this function modifies RBF model by calculating its centers/radii/ weights and saving them into RBFModel structure. Initially RBFModel contain zero coefficients, but after call to this function we will have coefficients which were calculated in order to fit our dataset. After you called this function you can call RBFCalc(), RBFGridCalc() and other model calculation functions. INPUT PARAMETERS: S - RBF model, initialized by RBFCreate() call Rep - report: * Rep.TerminationType: * -5 - non-distinct basis function centers were detected, interpolation aborted; only QNN returns this error code, other algorithms can handle non- distinct nodes. * -4 - nonconvergence of the internal SVD solver * -3 incorrect model construction algorithm was chosen: QNN or RBF-ML, combined with one of the incompatible features - NX=1 or NX>3; points with per-dimension scales. 
* 1 - successful termination Fields which are set only by modern RBF solvers (hierarchical or nonnegative; older solvers like QNN and ML initialize these fields by NANs): * rep.rmserror - root-mean-square error at nodes * rep.maxerror - maximum error at nodes Fields are used for debugging purposes: * Rep.IterationsCount - iterations count of the LSQR solver * Rep.NMV - number of matrix-vector products * Rep.ARows - rows count for the system matrix * Rep.ACols - columns count for the system matrix * Rep.ANNZ - number of significantly non-zero elements (elements above some algorithm-determined threshold) NOTE: failure to build model will leave current state of the structure unchanged. -- ALGLIB -- Copyright 13.12.2011 by Bochkanov Sergey *************************************************************************/ void rbfbuildmodel(const rbfmodel &s, rbfreport &rep) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::rbfbuildmodel(const_cast(s.c_ptr()), const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function calculates values of the RBF model in the given point. IMPORTANT: this function works only with modern (hierarchical) RBFs. It can not be used with legacy (version 1) RBFs because older RBF code does not support 1-dimensional models. This function should be used when we have NY=1 (scalar function) and NX=1 (1-dimensional space). If you have 3-dimensional space, use rbfcalc3(). If you have 2-dimensional space, use rbfcalc3(). If you have general situation (NX-dimensional space, NY-dimensional function) you should use generic rbfcalc(). If you want to perform parallel model evaluation from multiple threads, use rbftscalcbuf() with per-thread buffer object. This function returns 0.0 when: * model is not initialized * NX<>1 * NY<>1 INPUT PARAMETERS: S - RBF model X0 - X-coordinate, finite number RESULT: value of the model or 0.0 (as defined above) -- ALGLIB -- Copyright 13.12.2011 by Bochkanov Sergey *************************************************************************/ double rbfcalc1(const rbfmodel &s, const double x0) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::rbfcalc1(const_cast(s.c_ptr()), x0, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function calculates values of the RBF model in the given point. This function should be used when we have NY=1 (scalar function) and NX=2 (2-dimensional space). If you have 3-dimensional space, use rbfcalc3(). If you have general situation (NX-dimensional space, NY-dimensional function) you should use generic rbfcalc(). If you want to calculate function values many times, consider using rbfgridcalc2v(), which is far more efficient than many subsequent calls to rbfcalc2(). If you want to perform parallel model evaluation from multiple threads, use rbftscalcbuf() with per-thread buffer object. 
This function returns 0.0 when: * model is not initialized * NX<>2 *NY<>1 INPUT PARAMETERS: S - RBF model X0 - first coordinate, finite number X1 - second coordinate, finite number RESULT: value of the model or 0.0 (as defined above) -- ALGLIB -- Copyright 13.12.2011 by Bochkanov Sergey *************************************************************************/ double rbfcalc2(const rbfmodel &s, const double x0, const double x1) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::rbfcalc2(const_cast(s.c_ptr()), x0, x1, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function calculates value of the RBF model in the given point. This function should be used when we have NY=1 (scalar function) and NX=3 (3-dimensional space). If you have 2-dimensional space, use rbfcalc2(). If you have general situation (NX-dimensional space, NY-dimensional function) you should use generic rbfcalc(). If you want to calculate function values many times, consider using rbfgridcalc3v(), which is far more efficient than many subsequent calls to rbfcalc3(). If you want to perform parallel model evaluation from multiple threads, use rbftscalcbuf() with per-thread buffer object. This function returns 0.0 when: * model is not initialized * NX<>3 *NY<>1 INPUT PARAMETERS: S - RBF model X0 - first coordinate, finite number X1 - second coordinate, finite number X2 - third coordinate, finite number RESULT: value of the model or 0.0 (as defined above) -- ALGLIB -- Copyright 13.12.2011 by Bochkanov Sergey *************************************************************************/ double rbfcalc3(const rbfmodel &s, const double x0, const double x1, const double x2) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::rbfcalc3(const_cast(s.c_ptr()), x0, x1, x2, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function calculates values of the RBF model at the given point. This is general function which can be used for arbitrary NX (dimension of the space of arguments) and NY (dimension of the function itself). However when you have NY=1 you may find more convenient to use rbfcalc2() or rbfcalc3(). If you want to perform parallel model evaluation from multiple threads, use rbftscalcbuf() with per-thread buffer object. This function returns 0.0 when model is not initialized. INPUT PARAMETERS: S - RBF model X - coordinates, array[NX]. X may have more than NX elements, in this case only leading NX will be used. OUTPUT PARAMETERS: Y - function value, array[NY]. Y is out-parameter and reallocated after call to this function. In case you want to reuse previously allocated Y, you may use RBFCalcBuf(), which reallocates Y only when it is too small. 
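Continuing the sketch above (illustrative only), the three evaluation styles mentioned here differ only in how Y is managed: rbfcalc() reallocates Y, rbfcalcbuf() reuses a preallocated Y, and rbftscalcbuf() additionally takes a per-thread buffer created with rbfcreatecalcbuffer().

    real_1d_array x = "[0.5, 0.5]";
    real_1d_array y;                      // reallocated by rbfcalc() to length NY
    rbfcalc(model, x, y);

    real_1d_array ybuf = "[0.0]";         // preallocated; reused if it is large enough
    rbfcalcbuf(model, x, ybuf);

    rbfcalcbuffer buf;                    // create one such buffer per worker thread
    rbfcreatecalcbuffer(model, buf);
    rbftscalcbuf(model, buf, x, ybuf);    // safe to call concurrently, given distinct buffers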
-- ALGLIB -- Copyright 13.12.2011 by Bochkanov Sergey *************************************************************************/ void rbfcalc(const rbfmodel &s, const real_1d_array &x, real_1d_array &y) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::rbfcalc(const_cast(s.c_ptr()), const_cast(x.c_ptr()), const_cast(y.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function calculates values of the RBF model at the given point. Same as rbfcalc(), but does not reallocate Y when in is large enough to store function values. If you want to perform parallel model evaluation from multiple threads, use rbftscalcbuf() with per-thread buffer object. INPUT PARAMETERS: S - RBF model X - coordinates, array[NX]. X may have more than NX elements, in this case only leading NX will be used. Y - possibly preallocated array OUTPUT PARAMETERS: Y - function value, array[NY]. Y is not reallocated when it is larger than NY. -- ALGLIB -- Copyright 13.12.2011 by Bochkanov Sergey *************************************************************************/ void rbfcalcbuf(const rbfmodel &s, const real_1d_array &x, real_1d_array &y) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::rbfcalcbuf(const_cast(s.c_ptr()), const_cast(x.c_ptr()), const_cast(y.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function calculates values of the RBF model at the given point, using external buffer object (internal temporaries of RBF model are not modified). This function allows to use same RBF model object in different threads, assuming that different threads use different instances of buffer structure. INPUT PARAMETERS: S - RBF model, may be shared between different threads Buf - buffer object created for this particular instance of RBF model with rbfcreatecalcbuffer(). X - coordinates, array[NX]. X may have more than NX elements, in this case only leading NX will be used. Y - possibly preallocated array OUTPUT PARAMETERS: Y - function value, array[NY]. Y is not reallocated when it is larger than NY. -- ALGLIB -- Copyright 13.12.2011 by Bochkanov Sergey *************************************************************************/ void rbftscalcbuf(const rbfmodel &s, const rbfcalcbuffer &buf, const real_1d_array &x, real_1d_array &y) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::rbftscalcbuf(const_cast(s.c_ptr()), const_cast(buf.c_ptr()), const_cast(x.c_ptr()), const_cast(y.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This is legacy function for gridded calculation of RBF model. It is superseded by rbfgridcalc2v() and rbfgridcalc2vsubset() functions. 
-- ALGLIB -- Copyright 13.12.2011 by Bochkanov Sergey *************************************************************************/ void rbfgridcalc2(const rbfmodel &s, const real_1d_array &x0, const ae_int_t n0, const real_1d_array &x1, const ae_int_t n1, real_2d_array &y) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::rbfgridcalc2(const_cast(s.c_ptr()), const_cast(x0.c_ptr()), n0, const_cast(x1.c_ptr()), n1, const_cast(y.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function calculates values of the RBF model at the regular grid, which has N0*N1 points, with Point[I,J] = (X0[I], X1[J]). Vector-valued RBF models are supported. This function returns 0.0 when: * model is not initialized * NX<>2 FOR USERS OF COMMERCIAL EDITION: ! Commercial version of ALGLIB includes multicore support (C++ and C# ! computational cores) for this function. ! ! Parallel processing gives close-to-linear speedup on multicore systems, ! assuming that problem is large enough to be divided between cores. ! You should solve at least 200x200 problem to get nearly-linear speed- ! up (high efficiency). ! ! Parallel processing is implemented only for modern (hierarchical) ! RBFs. Legacy version 1 RBFs (created by QNN or RBF-ML) are still ! processed serially. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! This note is given for users of commercial edition; if you use GPL ! edition, you still will be able to call smp-version of this function, ! but all computations will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. INPUT PARAMETERS: S - RBF model, used in read-only mode, can be shared between multiple invocations of this function from multiple threads. X0 - array of grid nodes, first coordinates, array[N0]. Must be ordered by ascending. Exception is generated if the array is not correctly ordered. N0 - grid size (number of nodes) in the first dimension X1 - array of grid nodes, second coordinates, array[N1] Must be ordered by ascending. Exception is generated if the array is not correctly ordered. N1 - grid size (number of nodes) in the second dimension OUTPUT PARAMETERS: Y - function values, array[NY*N0*N1], where NY is a number of "output" vector values (this function supports vector- valued RBF models). Y is out-variable and is reallocated by this function. Y[K+NY*(I0+I1*N0)]=F_k(X0[I0],X1[I1]), for: * K=0...NY-1 * I0=0...N0-1 * I1=0...N1-1 NOTE: this function supports weakly ordered grid nodes, i.e. you may have X[i]=X[i+1] for some i. It does not provide you any performance benefits due to duplication of points, just convenience and flexibility. NOTE: this function is re-entrant, i.e. you may use same rbfmodel structure in multiple threads calling this function for different grids. NOTE: if you need function values on some subset of regular grid, which may be described as "several compact and dense islands", you may use rbfgridcalc2vsubset(). 
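The flat layout of Y is easiest to see in a small sketch (illustrative only, reusing the model built in the earlier sketch). For a scalar model (NY=1) the value at node (X0[I0],X1[I1]) is simply Y[I0+I1*N0]:

    real_1d_array gx0 = "[0.0, 0.5, 1.0]";      // N0=3, ascending
    real_1d_array gx1 = "[0.0, 1.0]";           // N1=2, ascending
    real_1d_array gy;
    rbfgridcalc2v(model, gx0, 3, gx1, 2, gy);   // gy.length()==NY*N0*N1==6
    double f10 = gy[1+0*3];                     // model value at (X0[1],X1[0])=(0.5,0.0)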
-- ALGLIB -- Copyright 27.01.2017 by Bochkanov Sergey *************************************************************************/ void rbfgridcalc2v(const rbfmodel &s, const real_1d_array &x0, const ae_int_t n0, const real_1d_array &x1, const ae_int_t n1, real_1d_array &y) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::rbfgridcalc2v(const_cast(s.c_ptr()), const_cast(x0.c_ptr()), n0, const_cast(x1.c_ptr()), n1, const_cast(y.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void smp_rbfgridcalc2v(const rbfmodel &s, const real_1d_array &x0, const ae_int_t n0, const real_1d_array &x1, const ae_int_t n1, real_1d_array &y) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::_pexec_rbfgridcalc2v(const_cast(s.c_ptr()), const_cast(x0.c_ptr()), n0, const_cast(x1.c_ptr()), n1, const_cast(y.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function calculates values of the RBF model at some subset of regular grid: * grid has N0*N1 points, with Point[I,J] = (X0[I], X1[J]) * only values at some subset of this grid are required Vector-valued RBF models are supported. This function returns 0.0 when: * model is not initialized * NX<>2 FOR USERS OF COMMERCIAL EDITION: ! Commercial version of ALGLIB includes multicore support (C++ and C# ! computational cores) for this function. ! ! Parallel processing gives close-to-linear speedup on multicore systems, ! assuming that problem is large enough to be divided between cores. ! You should solve at least 200x200 problem to get nearly-linear speed- ! up (high efficiency). ! ! Parallel processing is implemented only for modern (hierarchical) ! RBFs. Legacy version 1 RBFs (created by QNN or RBF-ML) are still ! processed serially. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! This note is given for users of commercial edition; if you use GPL ! edition, you still will be able to call smp-version of this function, ! but all computations will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. INPUT PARAMETERS: S - RBF model, used in read-only mode, can be shared between multiple invocations of this function from multiple threads. X0 - array of grid nodes, first coordinates, array[N0]. Must be ordered by ascending. Exception is generated if the array is not correctly ordered. N0 - grid size (number of nodes) in the first dimension X1 - array of grid nodes, second coordinates, array[N1] Must be ordered by ascending. Exception is generated if the array is not correctly ordered. N1 - grid size (number of nodes) in the second dimension FlagY - array[N0*N1]: * Y[I0+I1*N0] corresponds to node (X0[I0],X1[I1]) * it is a "bitmap" array which contains False for nodes which are NOT calculated, and True for nodes which are required. 
OUTPUT PARAMETERS: Y - function values, array[NY*N0*N1*N2], where NY is a number of "output" vector values (this function supports vector- valued RBF models): * Y[K+NY*(I0+I1*N0)]=F_k(X0[I0],X1[I1]), for K=0...NY-1, I0=0...N0-1, I1=0...N1-1. * elements of Y[] which correspond to FlagY[]=True are loaded by model values (which may be exactly zero for some nodes). * elements of Y[] which correspond to FlagY[]=False MAY be initialized by zeros OR may be calculated. This function processes grid as a hierarchy of nested blocks and micro-rows. If just one element of micro-row is required, entire micro-row (up to 8 nodes in the current version, but no promises) is calculated. NOTE: this function supports weakly ordered grid nodes, i.e. you may have X[i]=X[i+1] for some i. It does not provide you any performance benefits due to duplication of points, just convenience and flexibility. NOTE: this function is re-entrant, i.e. you may use same rbfmodel structure in multiple threads calling this function for different grids. -- ALGLIB -- Copyright 04.03.2016 by Bochkanov Sergey *************************************************************************/ void rbfgridcalc2vsubset(const rbfmodel &s, const real_1d_array &x0, const ae_int_t n0, const real_1d_array &x1, const ae_int_t n1, const boolean_1d_array &flagy, real_1d_array &y) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::rbfgridcalc2vsubset(const_cast(s.c_ptr()), const_cast(x0.c_ptr()), n0, const_cast(x1.c_ptr()), n1, const_cast(flagy.c_ptr()), const_cast(y.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void smp_rbfgridcalc2vsubset(const rbfmodel &s, const real_1d_array &x0, const ae_int_t n0, const real_1d_array &x1, const ae_int_t n1, const boolean_1d_array &flagy, real_1d_array &y) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::_pexec_rbfgridcalc2vsubset(const_cast(s.c_ptr()), const_cast(x0.c_ptr()), n0, const_cast(x1.c_ptr()), n1, const_cast(flagy.c_ptr()), const_cast(y.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function calculates values of the RBF model at the regular grid, which has N0*N1*N2 points, with Point[I,J,K] = (X0[I], X1[J], X2[K]). Vector-valued RBF models are supported. This function returns 0.0 when: * model is not initialized * NX<>3 FOR USERS OF COMMERCIAL EDITION: ! Commercial version of ALGLIB includes multicore support (C++ and C# ! computational cores) for this function. ! ! Parallel processing gives close-to-linear speedup on multicore systems, ! assuming that problem is large enough to be divided between cores. ! You should solve at least 50x50x50 problem to get nearly-linear speed- ! up (high efficiency). ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! This note is given for users of commercial edition; if you use GPL ! edition, you still will be able to call smp-version of this function, ! but all computations will be done serially. ! ! 
We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. INPUT PARAMETERS: S - RBF model, used in read-only mode, can be shared between multiple invocations of this function from multiple threads. X0 - array of grid nodes, first coordinates, array[N0]. Must be ordered by ascending. Exception is generated if the array is not correctly ordered. N0 - grid size (number of nodes) in the first dimension X1 - array of grid nodes, second coordinates, array[N1] Must be ordered by ascending. Exception is generated if the array is not correctly ordered. N1 - grid size (number of nodes) in the second dimension X2 - array of grid nodes, third coordinates, array[N2] Must be ordered by ascending. Exception is generated if the array is not correctly ordered. N2 - grid size (number of nodes) in the third dimension OUTPUT PARAMETERS: Y - function values, array[NY*N0*N1*N2], where NY is a number of "output" vector values (this function supports vector- valued RBF models). Y is out-variable and is reallocated by this function. Y[K+NY*(I0+I1*N0+I2*N0*N1)]=F_k(X0[I0],X1[I1],X2[I2]), for: * K=0...NY-1 * I0=0...N0-1 * I1=0...N1-1 * I2=0...N2-1 NOTE: this function supports weakly ordered grid nodes, i.e. you may have X[i]=X[i+1] for some i. It does not provide you any performance benefits due to duplication of points, just convenience and flexibility. NOTE: this function is re-entrant, i.e. you may use same rbfmodel structure in multiple threads calling this function for different grids. NOTE: if you need function values on some subset of regular grid, which may be described as "several compact and dense islands", you may use rbfgridcalc3vsubset(). -- ALGLIB -- Copyright 04.03.2016 by Bochkanov Sergey *************************************************************************/ void rbfgridcalc3v(const rbfmodel &s, const real_1d_array &x0, const ae_int_t n0, const real_1d_array &x1, const ae_int_t n1, const real_1d_array &x2, const ae_int_t n2, real_1d_array &y) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::rbfgridcalc3v(const_cast(s.c_ptr()), const_cast(x0.c_ptr()), n0, const_cast(x1.c_ptr()), n1, const_cast(x2.c_ptr()), n2, const_cast(y.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void smp_rbfgridcalc3v(const rbfmodel &s, const real_1d_array &x0, const ae_int_t n0, const real_1d_array &x1, const ae_int_t n1, const real_1d_array &x2, const ae_int_t n2, real_1d_array &y) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::_pexec_rbfgridcalc3v(const_cast(s.c_ptr()), const_cast(x0.c_ptr()), n0, const_cast(x1.c_ptr()), n1, const_cast(x2.c_ptr()), n2, const_cast(y.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function calculates values of the RBF model at some subset of regular grid: * grid has N0*N1*N2 points, with Point[I,J,K] = (X0[I], X1[J], X2[K]) * only values at some subset of this grid are required Vector-valued RBF models are supported. This function returns 0.0 when: * model is not initialized * NX<>3 FOR USERS OF COMMERCIAL EDITION: ! 
Commercial version of ALGLIB includes multicore support (C++ and C# ! computational cores) for this function. ! ! Parallel processing gives close-to-linear speedup on multicore systems, ! assuming that problem is large enough to be divided between cores. ! You should solve at least 50x50x50 problem to get nearly-linear speed- ! up (high efficiency). ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! This note is given for users of commercial edition; if you use GPL ! edition, you still will be able to call smp-version of this function, ! but all computations will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. INPUT PARAMETERS: S - RBF model, used in read-only mode, can be shared between multiple invocations of this function from multiple threads. X0 - array of grid nodes, first coordinates, array[N0]. Must be ordered by ascending. Exception is generated if the array is not correctly ordered. N0 - grid size (number of nodes) in the first dimension X1 - array of grid nodes, second coordinates, array[N1] Must be ordered by ascending. Exception is generated if the array is not correctly ordered. N1 - grid size (number of nodes) in the second dimension X2 - array of grid nodes, third coordinates, array[N2] Must be ordered by ascending. Exception is generated if the array is not correctly ordered. N2 - grid size (number of nodes) in the third dimension FlagY - array[N0*N1*N2]: * Y[I0+I1*N0+I2*N0*N1] corresponds to node (X0[I0],X1[I1],X2[I2]) * it is a "bitmap" array which contains False for nodes which are NOT calculated, and True for nodes which are required. OUTPUT PARAMETERS: Y - function values, array[NY*N0*N1*N2], where NY is a number of "output" vector values (this function supports vector- valued RBF models): * Y[K+NY*(I0+I1*N0+I2*N0*N1)]=F_k(X0[I0],X1[I1],X2[I2]), for K=0...NY-1, I0=0...N0-1, I1=0...N1-1, I2=0...N2-1. * elements of Y[] which correspond to FlagY[]=True are loaded by model values (which may be exactly zero for some nodes). * elements of Y[] which correspond to FlagY[]=False MAY be initialized by zeros OR may be calculated. This function processes grid as a hierarchy of nested blocks and micro-rows. If just one element of micro-row is required, entire micro-row (up to 8 nodes in the current version, but no promises) is calculated. NOTE: this function supports weakly ordered grid nodes, i.e. you may have X[i]=X[i+1] for some i. It does not provide you any performance benefits due to duplication of points, just convenience and flexibility. NOTE: this function is re-entrant, i.e. you may use same rbfmodel structure in multiple threads calling this function for different grids. 
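The FlagY bitmap is easiest to see in a short sketch (illustrative only, reusing the model built in the earlier sketch). The 2-dimensional subset function is shown; the 3-dimensional one differs only in the extra X2/N2 arguments and the I2*N0*N1 term of the index:

    real_1d_array gx0 = "[0.0, 0.5, 1.0]";               // N0=3
    real_1d_array gx1 = "[0.0, 1.0]";                     // N1=2
    boolean_1d_array need;
    need.setlength(3*2);
    for(int i=0; i<3*2; i++)
        need[i] = false;
    need[1+0*3] = true;                                   // request only node (X0[1],X1[0])
    real_1d_array gy;
    rbfgridcalc2vsubset(model, gx0, 3, gx1, 2, need, gy);
    double v = gy[1+0*3];                                 // entries with FlagY=False may be 0 or computed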
-- ALGLIB -- Copyright 04.03.2016 by Bochkanov Sergey *************************************************************************/ void rbfgridcalc3vsubset(const rbfmodel &s, const real_1d_array &x0, const ae_int_t n0, const real_1d_array &x1, const ae_int_t n1, const real_1d_array &x2, const ae_int_t n2, const boolean_1d_array &flagy, real_1d_array &y) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::rbfgridcalc3vsubset(const_cast(s.c_ptr()), const_cast(x0.c_ptr()), n0, const_cast(x1.c_ptr()), n1, const_cast(x2.c_ptr()), n2, const_cast(flagy.c_ptr()), const_cast(y.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void smp_rbfgridcalc3vsubset(const rbfmodel &s, const real_1d_array &x0, const ae_int_t n0, const real_1d_array &x1, const ae_int_t n1, const real_1d_array &x2, const ae_int_t n2, const boolean_1d_array &flagy, real_1d_array &y) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::_pexec_rbfgridcalc3vsubset(const_cast(s.c_ptr()), const_cast(x0.c_ptr()), n0, const_cast(x1.c_ptr()), n1, const_cast(x2.c_ptr()), n2, const_cast(flagy.c_ptr()), const_cast(y.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function "unpacks" RBF model by extracting its coefficients. INPUT PARAMETERS: S - RBF model OUTPUT PARAMETERS: NX - dimensionality of argument NY - dimensionality of the target function XWR - model information, array[NC,NX+NY+1]. One row of the array corresponds to one basis function: * first NX columns - coordinates of the center * next NY columns - weights, one per dimension of the function being modelled For ModelVersion=1: * last column - radius, same for all dimensions of the function being modelled For ModelVersion=2: * last NX columns - radii, one per dimension NC - number of the centers V - polynomial term , array[NY,NX+1]. One row per one dimension of the function being modelled. First NX elements are linear coefficients, V[NX] is equal to the constant part. ModelVersion-version of the RBF model: * 1 - for models created by QNN and RBF-ML algorithms, compatible with ALGLIB 3.10 or earlier. * 2 - for models created by HierarchicalRBF, requires ALGLIB 3.11 or later -- ALGLIB -- Copyright 13.12.2011 by Bochkanov Sergey *************************************************************************/ void rbfunpack(const rbfmodel &s, ae_int_t &nx, ae_int_t &ny, real_2d_array &xwr, ae_int_t &nc, real_2d_array &v, ae_int_t &modelversion) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::rbfunpack(const_cast(s.c_ptr()), &nx, &ny, const_cast(xwr.c_ptr()), &nc, const_cast(v.c_ptr()), &modelversion, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function returns model version. INPUT PARAMETERS: S - RBF model RESULT: * 1 - for models created by QNN and RBF-ML algorithms, compatible with ALGLIB 3.10 or earlier. 
* 2 - for models created by HierarchicalRBF, requires ALGLIB 3.11 or later -- ALGLIB -- Copyright 06.07.2016 by Bochkanov Sergey *************************************************************************/ ae_int_t rbfgetmodelversion(const rbfmodel &s) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::ae_int_t result = alglib_impl::rbfgetmodelversion(const_cast(s.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } } ///////////////////////////////////////////////////////////////////////// // // THIS SECTION CONTAINS IMPLEMENTATION OF COMPUTATIONAL CORE // ///////////////////////////////////////////////////////////////////////// namespace alglib_impl { static double idwint_idwqfactor = 1.5; static ae_int_t idwint_idwkmin = 5; static double idwint_idwcalcq(idwinterpolant* z, /* Real */ ae_vector* x, ae_int_t k, ae_state *_state); static void idwint_idwinit1(ae_int_t n, ae_int_t nx, ae_int_t d, ae_int_t nq, ae_int_t nw, idwinterpolant* z, ae_state *_state); static void idwint_idwinternalsolver(/* Real */ ae_vector* y, /* Real */ ae_vector* w, /* Real */ ae_matrix* fmatrix, /* Real */ ae_vector* temp, ae_int_t n, ae_int_t m, ae_int_t* info, /* Real */ ae_vector* x, double* taskrcond, ae_state *_state); static void ratint_barycentricnormalize(barycentricinterpolant* b, ae_state *_state); static void spline1d_spline1dgriddiffcubicinternal(/* Real */ ae_vector* x, /* Real */ ae_vector* y, ae_int_t n, ae_int_t boundltype, double boundl, ae_int_t boundrtype, double boundr, /* Real */ ae_vector* d, /* Real */ ae_vector* a1, /* Real */ ae_vector* a2, /* Real */ ae_vector* a3, /* Real */ ae_vector* b, /* Real */ ae_vector* dt, ae_state *_state); static void spline1d_heapsortpoints(/* Real */ ae_vector* x, /* Real */ ae_vector* y, ae_int_t n, ae_state *_state); static void spline1d_heapsortppoints(/* Real */ ae_vector* x, /* Real */ ae_vector* y, /* Integer */ ae_vector* p, ae_int_t n, ae_state *_state); static void spline1d_solvetridiagonal(/* Real */ ae_vector* a, /* Real */ ae_vector* b, /* Real */ ae_vector* c, /* Real */ ae_vector* d, ae_int_t n, /* Real */ ae_vector* x, ae_state *_state); static void spline1d_solvecyclictridiagonal(/* Real */ ae_vector* a, /* Real */ ae_vector* b, /* Real */ ae_vector* c, /* Real */ ae_vector* d, ae_int_t n, /* Real */ ae_vector* x, ae_state *_state); static double spline1d_diffthreepoint(double t, double x0, double f0, double x1, double f1, double x2, double f2, ae_state *_state); static void spline1d_hermitecalc(double p0, double m0, double p1, double m1, double t, double* s, double* ds, ae_state *_state); static double spline1d_rescaleval(double a0, double b0, double a1, double b1, double t, ae_state *_state); static void parametric_pspline2par(/* Real */ ae_matrix* xy, ae_int_t n, ae_int_t pt, /* Real */ ae_vector* p, ae_state *_state); static void parametric_pspline3par(/* Real */ ae_matrix* xy, ae_int_t n, ae_int_t pt, /* Real */ ae_vector* p, ae_state *_state); static void parametric_rdpanalyzesectionpar(/* Real */ ae_matrix* xy, ae_int_t i0, ae_int_t i1, ae_int_t d, ae_int_t* worstidx, double* worsterror, ae_state *_state); static void spline3d_spline3ddiff(spline3dinterpolant* c, double x, double y, double z, double* f, double* fx, double* fy, double* fxy, ae_state *_state); static void lsfit_rdpanalyzesection(/* Real */ ae_vector* x, /* Real */ 
ae_vector* y, ae_int_t i0, ae_int_t i1, ae_int_t* worstidx, double* worsterror, ae_state *_state); static void lsfit_rdprecursive(/* Real */ ae_vector* x, /* Real */ ae_vector* y, ae_int_t i0, ae_int_t i1, double eps, /* Real */ ae_vector* xout, /* Real */ ae_vector* yout, ae_int_t* nout, ae_state *_state); static void lsfit_logisticfitinternal(/* Real */ ae_vector* x, /* Real */ ae_vector* y, ae_int_t n, ae_bool is4pl, double lambdav, minlmstate* state, minlmreport* replm, /* Real */ ae_vector* p1, double* flast, ae_state *_state); static void lsfit_logisticfit45errors(/* Real */ ae_vector* x, /* Real */ ae_vector* y, ae_int_t n, double a, double b, double c, double d, double g, lsfitreport* rep, ae_state *_state); static void lsfit_spline1dfitinternal(ae_int_t st, /* Real */ ae_vector* x, /* Real */ ae_vector* y, /* Real */ ae_vector* w, ae_int_t n, /* Real */ ae_vector* xc, /* Real */ ae_vector* yc, /* Integer */ ae_vector* dc, ae_int_t k, ae_int_t m, ae_int_t* info, spline1dinterpolant* s, spline1dfitreport* rep, ae_state *_state); static void lsfit_lsfitlinearinternal(/* Real */ ae_vector* y, /* Real */ ae_vector* w, /* Real */ ae_matrix* fmatrix, ae_int_t n, ae_int_t m, ae_int_t* info, /* Real */ ae_vector* c, lsfitreport* rep, ae_state *_state); static void lsfit_lsfitclearrequestfields(lsfitstate* state, ae_state *_state); static void lsfit_barycentriccalcbasis(barycentricinterpolant* b, double t, /* Real */ ae_vector* y, ae_state *_state); static void lsfit_internalchebyshevfit(/* Real */ ae_vector* x, /* Real */ ae_vector* y, /* Real */ ae_vector* w, ae_int_t n, /* Real */ ae_vector* xc, /* Real */ ae_vector* yc, /* Integer */ ae_vector* dc, ae_int_t k, ae_int_t m, ae_int_t* info, /* Real */ ae_vector* c, lsfitreport* rep, ae_state *_state); static void lsfit_barycentricfitwcfixedd(/* Real */ ae_vector* x, /* Real */ ae_vector* y, /* Real */ ae_vector* w, ae_int_t n, /* Real */ ae_vector* xc, /* Real */ ae_vector* yc, /* Integer */ ae_vector* dc, ae_int_t k, ae_int_t m, ae_int_t d, ae_int_t* info, barycentricinterpolant* b, barycentricfitreport* rep, ae_state *_state); static void lsfit_clearreport(lsfitreport* rep, ae_state *_state); static void lsfit_estimateerrors(/* Real */ ae_matrix* f1, /* Real */ ae_vector* f0, /* Real */ ae_vector* y, /* Real */ ae_vector* w, /* Real */ ae_vector* x, /* Real */ ae_vector* s, ae_int_t n, ae_int_t k, lsfitreport* rep, /* Real */ ae_matrix* z, ae_int_t zkind, ae_state *_state); static double rbfv2_defaultlambdareg = 1.0E-6; static double rbfv2_defaultsupportr = 0.10; static ae_int_t rbfv2_defaultmaxits = 400; static ae_int_t rbfv2_defaultbf = 1; static ae_int_t rbfv2_maxnodesize = 6; static double rbfv2_minbasecasecost = 100000; static ae_bool rbfv2_rbfv2buildlinearmodel(/* Real */ ae_matrix* x, /* Real */ ae_matrix* y, ae_int_t n, ae_int_t nx, ae_int_t ny, ae_int_t modeltype, /* Real */ ae_matrix* v, ae_state *_state); static void rbfv2_allocatecalcbuffer(rbfv2model* s, rbfv2calcbuffer* buf, ae_state *_state); static void rbfv2_convertandappendtree(kdtree* curtree, ae_int_t n, ae_int_t nx, ae_int_t ny, /* Integer */ ae_vector* kdnodes, /* Real */ ae_vector* kdsplits, /* Real */ ae_vector* cw, ae_state *_state); static void rbfv2_converttreerec(kdtree* curtree, ae_int_t n, ae_int_t nx, ae_int_t ny, ae_int_t nodeoffset, ae_int_t nodesbase, ae_int_t splitsbase, ae_int_t cwbase, /* Integer */ ae_vector* localnodes, ae_int_t* localnodessize, /* Real */ ae_vector* localsplits, ae_int_t* localsplitssize, /* Real */ ae_vector* localcw, ae_int_t* 
localcwsize, /* Real */ ae_matrix* xybuf, ae_state *_state); static void rbfv2_partialcalcrec(rbfv2model* s, rbfv2calcbuffer* buf, ae_int_t rootidx, double invr2, double queryr2, /* Real */ ae_vector* x, /* Real */ ae_vector* y, ae_state *_state); static void rbfv2_partialrowcalcrec(rbfv2model* s, rbfv2calcbuffer* buf, ae_int_t rootidx, double invr2, double rquery2, double rfar2, /* Real */ ae_vector* cx, /* Real */ ae_vector* rx, /* Boolean */ ae_vector* rf, ae_int_t rowsize, /* Real */ ae_vector* ry, ae_state *_state); static void rbfv2_preparepartialquery(/* Real */ ae_vector* x, /* Real */ ae_vector* kdboxmin, /* Real */ ae_vector* kdboxmax, ae_int_t nx, rbfv2calcbuffer* buf, ae_int_t* cnt, ae_state *_state); static void rbfv2_partialqueryrec(/* Integer */ ae_vector* kdnodes, /* Real */ ae_vector* kdsplits, /* Real */ ae_vector* cw, ae_int_t nx, ae_int_t ny, rbfv2calcbuffer* buf, ae_int_t rootidx, double queryr2, /* Real */ ae_vector* x, /* Real */ ae_vector* r2, /* Integer */ ae_vector* offs, ae_int_t* k, ae_state *_state); static ae_int_t rbfv2_partialcountrec(/* Integer */ ae_vector* kdnodes, /* Real */ ae_vector* kdsplits, /* Real */ ae_vector* cw, ae_int_t nx, ae_int_t ny, rbfv2calcbuffer* buf, ae_int_t rootidx, double queryr2, /* Real */ ae_vector* x, ae_state *_state); static void rbfv2_partialunpackrec(/* Integer */ ae_vector* kdnodes, /* Real */ ae_vector* kdsplits, /* Real */ ae_vector* cw, /* Real */ ae_vector* s, ae_int_t nx, ae_int_t ny, ae_int_t rootidx, double r, /* Real */ ae_matrix* xwr, ae_int_t* k, ae_state *_state); static ae_int_t rbfv2_designmatrixrowsize(/* Integer */ ae_vector* kdnodes, /* Real */ ae_vector* kdsplits, /* Real */ ae_vector* cw, /* Real */ ae_vector* ri, /* Integer */ ae_vector* kdroots, /* Real */ ae_vector* kdboxmin, /* Real */ ae_vector* kdboxmax, ae_int_t nx, ae_int_t ny, ae_int_t nh, ae_int_t level, double rcoeff, /* Real */ ae_vector* x0, rbfv2calcbuffer* calcbuf, ae_state *_state); static void rbfv2_designmatrixgeneraterow(/* Integer */ ae_vector* kdnodes, /* Real */ ae_vector* kdsplits, /* Real */ ae_vector* cw, /* Real */ ae_vector* ri, /* Integer */ ae_vector* kdroots, /* Real */ ae_vector* kdboxmin, /* Real */ ae_vector* kdboxmax, /* Integer */ ae_vector* cwrange, ae_int_t nx, ae_int_t ny, ae_int_t nh, ae_int_t level, ae_int_t bf, double rcoeff, ae_int_t rowsperpoint, double penalty, /* Real */ ae_vector* x0, rbfv2calcbuffer* calcbuf, /* Real */ ae_vector* tmpr2, /* Integer */ ae_vector* tmpoffs, /* Integer */ ae_vector* rowidx, /* Real */ ae_vector* rowval, ae_int_t* rowsize, ae_state *_state); static void spline2d_bicubiccalcderivatives(/* Real */ ae_matrix* a, /* Real */ ae_vector* x, /* Real */ ae_vector* y, ae_int_t m, ae_int_t n, /* Real */ ae_matrix* dx, /* Real */ ae_matrix* dy, /* Real */ ae_matrix* dxy, ae_state *_state); static ae_int_t rbfv1_mxnx = 3; static double rbfv1_rbffarradius = 6; static double rbfv1_rbfnearradius = 2.1; static double rbfv1_rbfmlradius = 3; static double rbfv1_minbasecasecost = 100000; static ae_bool rbfv1_rbfv1buildlinearmodel(/* Real */ ae_matrix* x, /* Real */ ae_matrix* y, ae_int_t n, ae_int_t ny, ae_int_t modeltype, /* Real */ ae_matrix* v, ae_state *_state); static void rbfv1_buildrbfmodellsqr(/* Real */ ae_matrix* x, /* Real */ ae_matrix* y, /* Real */ ae_matrix* xc, /* Real */ ae_vector* r, ae_int_t n, ae_int_t nc, ae_int_t ny, kdtree* pointstree, kdtree* centerstree, double epsort, double epserr, ae_int_t maxits, ae_int_t* gnnz, ae_int_t* snnz, /* Real */ ae_matrix* w, ae_int_t* info, 
ae_int_t* iterationscount, ae_int_t* nmv, ae_state *_state); static void rbfv1_buildrbfmlayersmodellsqr(/* Real */ ae_matrix* x, /* Real */ ae_matrix* y, /* Real */ ae_matrix* xc, double rval, /* Real */ ae_vector* r, ae_int_t n, ae_int_t* nc, ae_int_t ny, ae_int_t nlayers, kdtree* centerstree, double epsort, double epserr, ae_int_t maxits, double lambdav, ae_int_t* annz, /* Real */ ae_matrix* w, ae_int_t* info, ae_int_t* iterationscount, ae_int_t* nmv, ae_state *_state); static double rbf_eps = 1.0E-6; static double rbf_rbffarradius = 6; static ae_int_t rbf_rbffirstversion = 0; static ae_int_t rbf_rbfversion2 = 2; static void rbf_rbfpreparenonserializablefields(rbfmodel* s, ae_state *_state); static void rbf_initializev1(ae_int_t nx, ae_int_t ny, rbfv1model* s, ae_state *_state); static void rbf_initializev2(ae_int_t nx, ae_int_t ny, rbfv2model* s, ae_state *_state); static void rbf_clearreportfields(rbfreport* rep, ae_state *_state); /************************************************************************* IDW interpolation INPUT PARAMETERS: Z - IDW interpolant built with one of model building subroutines. X - array[0..NX-1], interpolation point Result: IDW interpolant Z(X) -- ALGLIB -- Copyright 02.03.2010 by Bochkanov Sergey *************************************************************************/ double idwcalc(idwinterpolant* z, /* Real */ ae_vector* x, ae_state *_state) { ae_int_t i; ae_int_t k; double r; double s; double w; double v1; double v2; double d0; double di; double result; /* * these initializers are not really necessary, * but without them compiler complains about uninitialized locals */ k = 0; /* * Query */ if( z->modeltype==0 ) { /* * NQ/NW-based model */ k = kdtreequeryknn(&z->tree, x, z->nw, ae_true, _state); kdtreequeryresultsdistances(&z->tree, &z->rbuf, _state); kdtreequeryresultstags(&z->tree, &z->tbuf, _state); } if( z->modeltype==1 ) { /* * R-based model */ k = kdtreequeryrnn(&z->tree, x, z->r, ae_true, _state); kdtreequeryresultsdistances(&z->tree, &z->rbuf, _state); kdtreequeryresultstags(&z->tree, &z->tbuf, _state); if( ktree, x, idwint_idwkmin, ae_true, _state); kdtreequeryresultsdistances(&z->tree, &z->rbuf, _state); kdtreequeryresultstags(&z->tree, &z->tbuf, _state); } } /* * initialize weights for linear/quadratic members calculation. * * NOTE 1: weights are calculated using NORMALIZED modified * Shepard's formula. Original formula gives w(i) = sqr((R-di)/(R*di)), * where di is i-th distance, R is max(di). Modified formula have * following form: * w_mod(i) = 1, if di=d0 * w_mod(i) = w(i)/w(0), if di<>d0 * * NOTE 2: self-match is USED for this query * * NOTE 3: last point almost always gain zero weight, but it MUST * be used for fitting because sometimes it will gain NON-ZERO * weight - for example, when all distances are equal. */ r = z->rbuf.ptr.p_double[k-1]; d0 = z->rbuf.ptr.p_double[0]; result = (double)(0); s = (double)(0); for(i=0; i<=k-1; i++) { di = z->rbuf.ptr.p_double[i]; if( ae_fp_eq(di,d0) ) { /* * distance is equal to shortest, set it 1.0 * without explicitly calculating (which would give * us same result, but 'll expose us to the risk of * division by zero). 
*/ w = (double)(1); } else { /* * use normalized formula */ v1 = (r-di)/(r-d0); v2 = d0/di; w = ae_sqr(v1*v2, _state); } result = result+w*idwint_idwcalcq(z, x, z->tbuf.ptr.p_int[i], _state); s = s+w; } result = result/s; return result; } /************************************************************************* IDW interpolant using modified Shepard method for uniform point distributions. INPUT PARAMETERS: XY - X and Y values, array[0..N-1,0..NX]. First NX columns contain X-values, last column contain Y-values. N - number of nodes, N>0. NX - space dimension, NX>=1. D - nodal function type, either: * 0 constant model. Just for demonstration only, worst model ever. * 1 linear model, least squares fitting. Simpe model for datasets too small for quadratic models * 2 quadratic model, least squares fitting. Best model available (if your dataset is large enough). * -1 "fast" linear model, use with caution!!! It is significantly faster than linear/quadratic and better than constant model. But it is less robust (especially in the presence of noise). NQ - number of points used to calculate nodal functions (ignored for constant models). NQ should be LARGER than: * max(1.5*(1+NX),2^NX+1) for linear model, * max(3/4*(NX+2)*(NX+1),2^NX+1) for quadratic model. Values less than this threshold will be silently increased. NW - number of points used to calculate weights and to interpolate. Required: >=2^NX+1, values less than this threshold will be silently increased. Recommended value: about 2*NQ OUTPUT PARAMETERS: Z - IDW interpolant. NOTES: * best results are obtained with quadratic models, worst - with constant models * when N is large, NQ and NW must be significantly smaller than N both to obtain optimal performance and to obtain optimal accuracy. In 2 or 3-dimensional tasks NQ=15 and NW=25 are good values to start with. * NQ and NW may be greater than N. In such cases they will be automatically decreased. * this subroutine is always succeeds (as long as correct parameters are passed). * see 'Multivariate Interpolation of Large Sets of Scattered Data' by Robert J. Renka for more information on this algorithm. * this subroutine assumes that point distribution is uniform at the small scales. 
If it isn't - for example, points are concentrated along "lines", but "lines" distribution is uniform at the larger scale - then you should use IDWBuildModifiedShepardR() -- ALGLIB PROJECT -- Copyright 02.03.2010 by Bochkanov Sergey *************************************************************************/ void idwbuildmodifiedshepard(/* Real */ ae_matrix* xy, ae_int_t n, ae_int_t nx, ae_int_t d, ae_int_t nq, ae_int_t nw, idwinterpolant* z, ae_state *_state) { ae_frame _frame_block; ae_int_t i; ae_int_t j; ae_int_t k; ae_int_t j2; ae_int_t j3; double v; double r; double s; double d0; double di; double v1; double v2; ae_int_t nc; ae_int_t offs; ae_vector x; ae_vector qrbuf; ae_matrix qxybuf; ae_vector y; ae_matrix fmatrix; ae_vector w; ae_vector qsol; ae_vector temp; ae_vector tags; ae_int_t info; double taskrcond; ae_frame_make(_state, &_frame_block); _idwinterpolant_clear(z); ae_vector_init(&x, 0, DT_REAL, _state); ae_vector_init(&qrbuf, 0, DT_REAL, _state); ae_matrix_init(&qxybuf, 0, 0, DT_REAL, _state); ae_vector_init(&y, 0, DT_REAL, _state); ae_matrix_init(&fmatrix, 0, 0, DT_REAL, _state); ae_vector_init(&w, 0, DT_REAL, _state); ae_vector_init(&qsol, 0, DT_REAL, _state); ae_vector_init(&temp, 0, DT_REAL, _state); ae_vector_init(&tags, 0, DT_INT, _state); /* * these initializers are not really necessary, * but without them compiler complains about uninitialized locals */ nc = 0; /* * assertions */ ae_assert(n>0, "IDWBuildModifiedShepard: N<=0!", _state); ae_assert(nx>=1, "IDWBuildModifiedShepard: NX<1!", _state); ae_assert(d>=-1&&d<=2, "IDWBuildModifiedShepard: D<>-1 and D<>0 and D<>1 and D<>2!", _state); /* * Correct parameters if needed */ if( d==1 ) { nq = ae_maxint(nq, ae_iceil(idwint_idwqfactor*(1+nx), _state)+1, _state); nq = ae_maxint(nq, ae_round(ae_pow((double)(2), (double)(nx), _state), _state)+1, _state); } if( d==2 ) { nq = ae_maxint(nq, ae_iceil(idwint_idwqfactor*(nx+2)*(nx+1)/2, _state)+1, _state); nq = ae_maxint(nq, ae_round(ae_pow((double)(2), (double)(nx), _state), _state)+1, _state); } nw = ae_maxint(nw, ae_round(ae_pow((double)(2), (double)(nx), _state), _state)+1, _state); nq = ae_minint(nq, n, _state); nw = ae_minint(nw, n, _state); /* * primary initialization of Z */ idwint_idwinit1(n, nx, d, nq, nw, z, _state); z->modeltype = 0; /* * Create KD-tree */ ae_vector_set_length(&tags, n, _state); for(i=0; i<=n-1; i++) { tags.ptr.p_int[i] = i; } kdtreebuildtagged(xy, &tags, n, nx, 1, 2, &z->tree, _state); /* * build nodal functions */ ae_vector_set_length(&temp, nq+1, _state); ae_vector_set_length(&x, nx, _state); ae_vector_set_length(&qrbuf, nq, _state); ae_matrix_set_length(&qxybuf, nq, nx+1, _state); if( d==-1 ) { ae_vector_set_length(&w, nq, _state); } if( d==1 ) { ae_vector_set_length(&y, nq, _state); ae_vector_set_length(&w, nq, _state); ae_vector_set_length(&qsol, nx, _state); /* * NX for linear members, * 1 for temporary storage */ ae_matrix_set_length(&fmatrix, nq, nx+1, _state); } if( d==2 ) { ae_vector_set_length(&y, nq, _state); ae_vector_set_length(&w, nq, _state); ae_vector_set_length(&qsol, nx+ae_round(nx*(nx+1)*0.5, _state), _state); /* * NX for linear members, * Round(NX*(NX+1)*0.5) for quadratic model, * 1 for temporary storage */ ae_matrix_set_length(&fmatrix, nq, nx+ae_round(nx*(nx+1)*0.5, _state)+1, _state); } for(i=0; i<=n-1; i++) { /* * Initialize center and function value. 
* If D=0 it is all what we need */ ae_v_move(&z->q.ptr.pp_double[i][0], 1, &xy->ptr.pp_double[i][0], 1, ae_v_len(0,nx)); if( d==0 ) { continue; } /* * calculate weights for linear/quadratic members calculation. * * NOTE 1: weights are calculated using NORMALIZED modified * Shepard's formula. Original formula is w(i) = sqr((R-di)/(R*di)), * where di is i-th distance, R is max(di). Modified formula have * following form: * w_mod(i) = 1, if di=d0 * w_mod(i) = w(i)/w(0), if di<>d0 * * NOTE 2: self-match is NOT used for this query * * NOTE 3: last point almost always gain zero weight, but it MUST * be used for fitting because sometimes it will gain NON-ZERO * weight - for example, when all distances are equal. */ ae_v_move(&x.ptr.p_double[0], 1, &xy->ptr.pp_double[i][0], 1, ae_v_len(0,nx-1)); k = kdtreequeryknn(&z->tree, &x, nq, ae_false, _state); kdtreequeryresultsxy(&z->tree, &qxybuf, _state); kdtreequeryresultsdistances(&z->tree, &qrbuf, _state); r = qrbuf.ptr.p_double[k-1]; d0 = qrbuf.ptr.p_double[0]; for(j=0; j<=k-1; j++) { di = qrbuf.ptr.p_double[j]; if( ae_fp_eq(di,d0) ) { /* * distance is equal to shortest, set it 1.0 * without explicitly calculating (which would give * us same result, but 'll expose us to the risk of * division by zero). */ w.ptr.p_double[j] = (double)(1); } else { /* * use normalized formula */ v1 = (r-di)/(r-d0); v2 = d0/di; w.ptr.p_double[j] = ae_sqr(v1*v2, _state); } } /* * calculate linear/quadratic members */ if( d==-1 ) { /* * "Fast" linear nodal function calculated using * inverse distance weighting */ for(j=0; j<=nx-1; j++) { x.ptr.p_double[j] = (double)(0); } s = (double)(0); for(j=0; j<=k-1; j++) { /* * calculate J-th inverse distance weighted gradient: * grad_k = (y_j-y_k)*(x_j-x_k)/sqr(norm(x_j-x_k)) * grad = sum(wk*grad_k)/sum(w_k) */ v = (double)(0); for(j2=0; j2<=nx-1; j2++) { v = v+ae_sqr(qxybuf.ptr.pp_double[j][j2]-xy->ptr.pp_double[i][j2], _state); } /* * Although x_j<>x_k, sqr(norm(x_j-x_k)) may be zero due to * underflow. If it is, we assume than J-th gradient is zero * (i.e. 
don't add anything) */ if( ae_fp_neq(v,(double)(0)) ) { for(j2=0; j2<=nx-1; j2++) { x.ptr.p_double[j2] = x.ptr.p_double[j2]+w.ptr.p_double[j]*(qxybuf.ptr.pp_double[j][nx]-xy->ptr.pp_double[i][nx])*(qxybuf.ptr.pp_double[j][j2]-xy->ptr.pp_double[i][j2])/v; } } s = s+w.ptr.p_double[j]; } for(j=0; j<=nx-1; j++) { z->q.ptr.pp_double[i][nx+1+j] = x.ptr.p_double[j]/s; } } else { /* * Least squares models: build */ if( d==1 ) { /* * Linear nodal function calculated using * least squares fitting to its neighbors */ for(j=0; j<=k-1; j++) { for(j2=0; j2<=nx-1; j2++) { fmatrix.ptr.pp_double[j][j2] = qxybuf.ptr.pp_double[j][j2]-xy->ptr.pp_double[i][j2]; } y.ptr.p_double[j] = qxybuf.ptr.pp_double[j][nx]-xy->ptr.pp_double[i][nx]; } nc = nx; } if( d==2 ) { /* * Quadratic nodal function calculated using * least squares fitting to its neighbors */ for(j=0; j<=k-1; j++) { offs = 0; for(j2=0; j2<=nx-1; j2++) { fmatrix.ptr.pp_double[j][offs] = qxybuf.ptr.pp_double[j][j2]-xy->ptr.pp_double[i][j2]; offs = offs+1; } for(j2=0; j2<=nx-1; j2++) { for(j3=j2; j3<=nx-1; j3++) { fmatrix.ptr.pp_double[j][offs] = (qxybuf.ptr.pp_double[j][j2]-xy->ptr.pp_double[i][j2])*(qxybuf.ptr.pp_double[j][j3]-xy->ptr.pp_double[i][j3]); offs = offs+1; } } y.ptr.p_double[j] = qxybuf.ptr.pp_double[j][nx]-xy->ptr.pp_double[i][nx]; } nc = nx+ae_round(nx*(nx+1)*0.5, _state); } idwint_idwinternalsolver(&y, &w, &fmatrix, &temp, k, nc, &info, &qsol, &taskrcond, _state); /* * Least squares models: copy results */ if( info>0 ) { /* * LLS task is solved, copy results */ z->debugworstrcond = ae_minreal(z->debugworstrcond, taskrcond, _state); z->debugbestrcond = ae_maxreal(z->debugbestrcond, taskrcond, _state); for(j=0; j<=nc-1; j++) { z->q.ptr.pp_double[i][nx+1+j] = qsol.ptr.p_double[j]; } } else { /* * Solver failure, very strange, but we will use * zero values to handle it. */ z->debugsolverfailures = z->debugsolverfailures+1; for(j=0; j<=nc-1; j++) { z->q.ptr.pp_double[i][nx+1+j] = (double)(0); } } } } ae_frame_leave(_state); } /************************************************************************* IDW interpolant using modified Shepard method for non-uniform datasets. This type of model uses constant nodal functions and interpolates using all nodes which are closer than user-specified radius R. It may be used when points distribution is non-uniform at the small scale, but it is at the distances as large as R. INPUT PARAMETERS: XY - X and Y values, array[0..N-1,0..NX]. First NX columns contain X-values, last column contain Y-values. N - number of nodes, N>0. NX - space dimension, NX>=1. R - radius, R>0 OUTPUT PARAMETERS: Z - IDW interpolant. NOTES: * if there is less than IDWKMin points within R-ball, algorithm selects IDWKMin closest ones, so that continuity properties of interpolant are preserved even far from points. 
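A minimal IDW sketch (illustrative only; the dataset and parameter values are made up) combining the two builders above with idwcalc(). NQ=15/NW=25 follow the recommendation given for idwbuildmodifiedshepard(); both are automatically reduced when they exceed N:

    // 2-dimensional dataset, columns are x0, x1, y
    real_2d_array xy = "[[0,0,1],[1,0,2],[0,1,3],[1,1,4],[0.5,0.5,2.5]]";
    real_1d_array p = "[0.25, 0.25]";
    idwinterpolant z;
    idwbuildmodifiedshepard(xy, 5, 2, 2, 15, 25, z);   // D=2: quadratic nodal functions
    double v = idwcalc(z, p);

    idwinterpolant zr;                                 // non-uniform data: constant nodal
    idwbuildmodifiedshepardr(xy, 5, 2, 0.7, zr);       // functions, all nodes within R=0.7
    double vr = idwcalc(zr, p);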
-- ALGLIB PROJECT -- Copyright 11.04.2010 by Bochkanov Sergey *************************************************************************/ void idwbuildmodifiedshepardr(/* Real */ ae_matrix* xy, ae_int_t n, ae_int_t nx, double r, idwinterpolant* z, ae_state *_state) { ae_frame _frame_block; ae_int_t i; ae_vector tags; ae_frame_make(_state, &_frame_block); _idwinterpolant_clear(z); ae_vector_init(&tags, 0, DT_INT, _state); /* * assertions */ ae_assert(n>0, "IDWBuildModifiedShepardR: N<=0!", _state); ae_assert(nx>=1, "IDWBuildModifiedShepardR: NX<1!", _state); ae_assert(ae_fp_greater(r,(double)(0)), "IDWBuildModifiedShepardR: R<=0!", _state); /* * primary initialization of Z */ idwint_idwinit1(n, nx, 0, 0, n, z, _state); z->modeltype = 1; z->r = r; /* * Create KD-tree */ ae_vector_set_length(&tags, n, _state); for(i=0; i<=n-1; i++) { tags.ptr.p_int[i] = i; } kdtreebuildtagged(xy, &tags, n, nx, 1, 2, &z->tree, _state); /* * build nodal functions */ for(i=0; i<=n-1; i++) { ae_v_move(&z->q.ptr.pp_double[i][0], 1, &xy->ptr.pp_double[i][0], 1, ae_v_len(0,nx)); } ae_frame_leave(_state); } /************************************************************************* IDW model for noisy data. This subroutine may be used to handle noisy data, i.e. data with noise in OUTPUT values. It differs from IDWBuildModifiedShepard() in the following aspects: * nodal functions are not constrained to pass through nodes: Qi(xi)<>yi, i.e. we have fitting instead of interpolation. * weights which are used during least squares fitting stage are all equal to 1.0 (independently of distance) * "fast"-linear or constant nodal functions are not supported (either not robust enough or too rigid) This problem require far more complex tuning than interpolation problems. Below you can find some recommendations regarding this problem: * focus on tuning NQ; it controls noise reduction. As for NW, you can just make it equal to 2*NQ. * you can use cross-validation to determine optimal NQ. * optimal NQ is a result of complex tradeoff between noise level (more noise = larger NQ required) and underlying function complexity (given fixed N, larger NQ means smoothing of compex features in the data). For example, NQ=N will reduce noise to the minimum level possible, but you will end up with just constant/linear/quadratic (depending on D) least squares model for the whole dataset. INPUT PARAMETERS: XY - X and Y values, array[0..N-1,0..NX]. First NX columns contain X-values, last column contain Y-values. N - number of nodes, N>0. NX - space dimension, NX>=1. D - nodal function degree, either: * 1 linear model, least squares fitting. Simpe model for datasets too small for quadratic models (or for very noisy problems). * 2 quadratic model, least squares fitting. Best model available (if your dataset is large enough). NQ - number of points used to calculate nodal functions. NQ should be significantly larger than 1.5 times the number of coefficients in a nodal function to overcome effects of noise: * larger than 1.5*(1+NX) for linear model, * larger than 3/4*(NX+2)*(NX+1) for quadratic model. Values less than this threshold will be silently increased. NW - number of points used to calculate weights and to interpolate. Required: >=2^NX+1, values less than this threshold will be silently increased. Recommended value: about 2*NQ or larger OUTPUT PARAMETERS: Z - IDW interpolant. 
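For noisy data the call pattern is the same; a sketch follows (illustrative only, reusing xy and p from the sketch above), with NW set to roughly 2*NQ as recommended:

    idwinterpolant zn;
    idwbuildnoisy(xy, 5, 2, 2, 20, 40, zn);    // D=2 quadratic fit, NQ=20, NW=40
    double smoothed = idwcalc(zn, p);          // fitted (not interpolated) value at p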
NOTES: * best results are obtained with quadratic models, linear models are not recommended to use unless you are pretty sure that it is what you want * this subroutine is always succeeds (as long as correct parameters are passed). * see 'Multivariate Interpolation of Large Sets of Scattered Data' by Robert J. Renka for more information on this algorithm. -- ALGLIB PROJECT -- Copyright 02.03.2010 by Bochkanov Sergey *************************************************************************/ void idwbuildnoisy(/* Real */ ae_matrix* xy, ae_int_t n, ae_int_t nx, ae_int_t d, ae_int_t nq, ae_int_t nw, idwinterpolant* z, ae_state *_state) { ae_frame _frame_block; ae_int_t i; ae_int_t j; ae_int_t k; ae_int_t j2; ae_int_t j3; double v; ae_int_t nc; ae_int_t offs; double taskrcond; ae_vector x; ae_vector qrbuf; ae_matrix qxybuf; ae_vector y; ae_vector w; ae_matrix fmatrix; ae_vector qsol; ae_vector tags; ae_vector temp; ae_int_t info; ae_frame_make(_state, &_frame_block); _idwinterpolant_clear(z); ae_vector_init(&x, 0, DT_REAL, _state); ae_vector_init(&qrbuf, 0, DT_REAL, _state); ae_matrix_init(&qxybuf, 0, 0, DT_REAL, _state); ae_vector_init(&y, 0, DT_REAL, _state); ae_vector_init(&w, 0, DT_REAL, _state); ae_matrix_init(&fmatrix, 0, 0, DT_REAL, _state); ae_vector_init(&qsol, 0, DT_REAL, _state); ae_vector_init(&tags, 0, DT_INT, _state); ae_vector_init(&temp, 0, DT_REAL, _state); /* * these initializers are not really necessary, * but without them compiler complains about uninitialized locals */ nc = 0; /* * assertions */ ae_assert(n>0, "IDWBuildNoisy: N<=0!", _state); ae_assert(nx>=1, "IDWBuildNoisy: NX<1!", _state); ae_assert(d>=1&&d<=2, "IDWBuildNoisy: D<>1 and D<>2!", _state); /* * Correct parameters if needed */ if( d==1 ) { nq = ae_maxint(nq, ae_iceil(idwint_idwqfactor*(1+nx), _state)+1, _state); } if( d==2 ) { nq = ae_maxint(nq, ae_iceil(idwint_idwqfactor*(nx+2)*(nx+1)/2, _state)+1, _state); } nw = ae_maxint(nw, ae_round(ae_pow((double)(2), (double)(nx), _state), _state)+1, _state); nq = ae_minint(nq, n, _state); nw = ae_minint(nw, n, _state); /* * primary initialization of Z */ idwint_idwinit1(n, nx, d, nq, nw, z, _state); z->modeltype = 0; /* * Create KD-tree */ ae_vector_set_length(&tags, n, _state); for(i=0; i<=n-1; i++) { tags.ptr.p_int[i] = i; } kdtreebuildtagged(xy, &tags, n, nx, 1, 2, &z->tree, _state); /* * build nodal functions * (special algorithm for noisy data is used) */ ae_vector_set_length(&temp, nq+1, _state); ae_vector_set_length(&x, nx, _state); ae_vector_set_length(&qrbuf, nq, _state); ae_matrix_set_length(&qxybuf, nq, nx+1, _state); if( d==1 ) { ae_vector_set_length(&y, nq, _state); ae_vector_set_length(&w, nq, _state); ae_vector_set_length(&qsol, 1+nx, _state); /* * 1 for constant member, * NX for linear members, * 1 for temporary storage */ ae_matrix_set_length(&fmatrix, nq, 1+nx+1, _state); } if( d==2 ) { ae_vector_set_length(&y, nq, _state); ae_vector_set_length(&w, nq, _state); ae_vector_set_length(&qsol, 1+nx+ae_round(nx*(nx+1)*0.5, _state), _state); /* * 1 for constant member, * NX for linear members, * Round(NX*(NX+1)*0.5) for quadratic model, * 1 for temporary storage */ ae_matrix_set_length(&fmatrix, nq, 1+nx+ae_round(nx*(nx+1)*0.5, _state)+1, _state); } for(i=0; i<=n-1; i++) { /* * Initialize center. 
*/ ae_v_move(&z->q.ptr.pp_double[i][0], 1, &xy->ptr.pp_double[i][0], 1, ae_v_len(0,nx-1)); /* * Calculate linear/quadratic members * using least squares fit * NOTE 1: all weight are equal to 1.0 * NOTE 2: self-match is USED for this query */ ae_v_move(&x.ptr.p_double[0], 1, &xy->ptr.pp_double[i][0], 1, ae_v_len(0,nx-1)); k = kdtreequeryknn(&z->tree, &x, nq, ae_true, _state); kdtreequeryresultsxy(&z->tree, &qxybuf, _state); kdtreequeryresultsdistances(&z->tree, &qrbuf, _state); if( d==1 ) { /* * Linear nodal function calculated using * least squares fitting to its neighbors */ for(j=0; j<=k-1; j++) { fmatrix.ptr.pp_double[j][0] = 1.0; for(j2=0; j2<=nx-1; j2++) { fmatrix.ptr.pp_double[j][1+j2] = qxybuf.ptr.pp_double[j][j2]-xy->ptr.pp_double[i][j2]; } y.ptr.p_double[j] = qxybuf.ptr.pp_double[j][nx]; w.ptr.p_double[j] = (double)(1); } nc = 1+nx; } if( d==2 ) { /* * Quadratic nodal function calculated using * least squares fitting to its neighbors */ for(j=0; j<=k-1; j++) { fmatrix.ptr.pp_double[j][0] = (double)(1); offs = 1; for(j2=0; j2<=nx-1; j2++) { fmatrix.ptr.pp_double[j][offs] = qxybuf.ptr.pp_double[j][j2]-xy->ptr.pp_double[i][j2]; offs = offs+1; } for(j2=0; j2<=nx-1; j2++) { for(j3=j2; j3<=nx-1; j3++) { fmatrix.ptr.pp_double[j][offs] = (qxybuf.ptr.pp_double[j][j2]-xy->ptr.pp_double[i][j2])*(qxybuf.ptr.pp_double[j][j3]-xy->ptr.pp_double[i][j3]); offs = offs+1; } } y.ptr.p_double[j] = qxybuf.ptr.pp_double[j][nx]; w.ptr.p_double[j] = (double)(1); } nc = 1+nx+ae_round(nx*(nx+1)*0.5, _state); } idwint_idwinternalsolver(&y, &w, &fmatrix, &temp, k, nc, &info, &qsol, &taskrcond, _state); /* * Least squares models: copy results */ if( info>0 ) { /* * LLS task is solved, copy results */ z->debugworstrcond = ae_minreal(z->debugworstrcond, taskrcond, _state); z->debugbestrcond = ae_maxreal(z->debugbestrcond, taskrcond, _state); for(j=0; j<=nc-1; j++) { z->q.ptr.pp_double[i][nx+j] = qsol.ptr.p_double[j]; } } else { /* * Solver failure, very strange, but we will use * zero values to handle it. */ z->debugsolverfailures = z->debugsolverfailures+1; v = (double)(0); for(j=0; j<=k-1; j++) { v = v+qxybuf.ptr.pp_double[j][nx]; } z->q.ptr.pp_double[i][nx] = v/k; for(j=0; j<=nc-2; j++) { z->q.ptr.pp_double[i][nx+1+j] = (double)(0); } } } ae_frame_leave(_state); } /************************************************************************* Internal subroutine: K-th nodal function calculation -- ALGLIB -- Copyright 02.03.2010 by Bochkanov Sergey *************************************************************************/ static double idwint_idwcalcq(idwinterpolant* z, /* Real */ ae_vector* x, ae_int_t k, ae_state *_state) { ae_int_t nx; ae_int_t i; ae_int_t j; ae_int_t offs; double result; nx = z->nx; /* * constant member */ result = z->q.ptr.pp_double[k][nx]; /* * linear members */ if( z->d>=1 ) { for(i=0; i<=nx-1; i++) { result = result+z->q.ptr.pp_double[k][nx+1+i]*(x->ptr.p_double[i]-z->q.ptr.pp_double[k][i]); } } /* * quadratic members */ if( z->d>=2 ) { offs = nx+1+nx; for(i=0; i<=nx-1; i++) { for(j=i; j<=nx-1; j++) { result = result+z->q.ptr.pp_double[k][offs]*(x->ptr.p_double[i]-z->q.ptr.pp_double[k][i])*(x->ptr.p_double[j]-z->q.ptr.pp_double[k][j]); offs = offs+1; } } } return result; } /************************************************************************* Initialization of internal structures. It assumes correctness of all parameters. 
-- ALGLIB -- Copyright 02.03.2010 by Bochkanov Sergey *************************************************************************/ static void idwint_idwinit1(ae_int_t n, ae_int_t nx, ae_int_t d, ae_int_t nq, ae_int_t nw, idwinterpolant* z, ae_state *_state) { z->debugsolverfailures = 0; z->debugworstrcond = 1.0; z->debugbestrcond = (double)(0); z->n = n; z->nx = nx; z->d = 0; if( d==1 ) { z->d = 1; } if( d==2 ) { z->d = 2; } if( d==-1 ) { z->d = 1; } z->nw = nw; if( d==-1 ) { ae_matrix_set_length(&z->q, n, nx+1+nx, _state); } if( d==0 ) { ae_matrix_set_length(&z->q, n, nx+1, _state); } if( d==1 ) { ae_matrix_set_length(&z->q, n, nx+1+nx, _state); } if( d==2 ) { ae_matrix_set_length(&z->q, n, nx+1+nx+ae_round(nx*(nx+1)*0.5, _state), _state); } ae_vector_set_length(&z->tbuf, nw, _state); ae_vector_set_length(&z->rbuf, nw, _state); ae_matrix_set_length(&z->xybuf, nw, nx+1, _state); ae_vector_set_length(&z->xbuf, nx, _state); } /************************************************************************* Linear least squares solver for small tasks. Works faster than standard ALGLIB solver in non-degenerate cases (due to absense of internal allocations and optimized row/colums). In degenerate cases it calls standard solver, which results in small performance penalty associated with preliminary steps. INPUT PARAMETERS: Y array[0..N-1] W array[0..N-1] FMatrix array[0..N-1,0..M], have additional column for temporary values Temp array[0..N] *************************************************************************/ static void idwint_idwinternalsolver(/* Real */ ae_vector* y, /* Real */ ae_vector* w, /* Real */ ae_matrix* fmatrix, /* Real */ ae_vector* temp, ae_int_t n, ae_int_t m, ae_int_t* info, /* Real */ ae_vector* x, double* taskrcond, ae_state *_state) { ae_frame _frame_block; ae_int_t i; ae_int_t j; double v; double tau; ae_vector b; densesolverlsreport srep; ae_frame_make(_state, &_frame_block); *info = 0; ae_vector_init(&b, 0, DT_REAL, _state); _densesolverlsreport_init(&srep, _state); /* * set up info */ *info = 1; /* * prepare matrix */ for(i=0; i<=n-1; i++) { fmatrix->ptr.pp_double[i][m] = y->ptr.p_double[i]; v = w->ptr.p_double[i]; ae_v_muld(&fmatrix->ptr.pp_double[i][0], 1, ae_v_len(0,m), v); } /* * use either fast algorithm or general algorithm */ if( m<=n ) { /* * QR decomposition * We assume that M<=N (we would have called LSFit() otherwise) */ for(i=0; i<=m-1; i++) { if( iptr.p_double[1], 1, &fmatrix->ptr.pp_double[i][i], fmatrix->stride, ae_v_len(1,n-i)); generatereflection(temp, n-i, &tau, _state); fmatrix->ptr.pp_double[i][i] = temp->ptr.p_double[1]; temp->ptr.p_double[1] = (double)(1); for(j=i+1; j<=m; j++) { v = ae_v_dotproduct(&fmatrix->ptr.pp_double[i][j], fmatrix->stride, &temp->ptr.p_double[1], 1, ae_v_len(i,n-1)); v = tau*v; ae_v_subd(&fmatrix->ptr.pp_double[i][j], fmatrix->stride, &temp->ptr.p_double[1], 1, ae_v_len(i,n-1), v); } } } /* * Check condition number */ *taskrcond = rmatrixtrrcondinf(fmatrix, m, ae_true, ae_false, _state); /* * use either fast algorithm for non-degenerate cases * or slow algorithm for degenerate cases */ if( ae_fp_greater(*taskrcond,10000*n*ae_machineepsilon) ) { /* * solve triangular system R*x = FMatrix[0:M-1,M] * using fast algorithm, then exit */ x->ptr.p_double[m-1] = fmatrix->ptr.pp_double[m-1][m]/fmatrix->ptr.pp_double[m-1][m-1]; for(i=m-2; i>=0; i--) { v = ae_v_dotproduct(&fmatrix->ptr.pp_double[i][i+1], 1, &x->ptr.p_double[i+1], 1, ae_v_len(i+1,m-1)); x->ptr.p_double[i] = (fmatrix->ptr.pp_double[i][m]-v)/fmatrix->ptr.pp_double[i][i]; } } 
else { /* * use more general algorithm */ ae_vector_set_length(&b, m, _state); for(i=0; i<=m-1; i++) { for(j=0; j<=i-1; j++) { fmatrix->ptr.pp_double[i][j] = 0.0; } b.ptr.p_double[i] = fmatrix->ptr.pp_double[i][m]; } rmatrixsolvels(fmatrix, m, m, &b, 10000*ae_machineepsilon, info, &srep, x, _state); } } else { /* * use more general algorithm */ ae_vector_set_length(&b, n, _state); for(i=0; i<=n-1; i++) { b.ptr.p_double[i] = fmatrix->ptr.pp_double[i][m]; } rmatrixsolvels(fmatrix, n, m, &b, 10000*ae_machineepsilon, info, &srep, x, _state); *taskrcond = srep.r2; } ae_frame_leave(_state); } void _idwinterpolant_init(void* _p, ae_state *_state) { idwinterpolant *p = (idwinterpolant*)_p; ae_touch_ptr((void*)p); _kdtree_init(&p->tree, _state); ae_matrix_init(&p->q, 0, 0, DT_REAL, _state); ae_vector_init(&p->xbuf, 0, DT_REAL, _state); ae_vector_init(&p->tbuf, 0, DT_INT, _state); ae_vector_init(&p->rbuf, 0, DT_REAL, _state); ae_matrix_init(&p->xybuf, 0, 0, DT_REAL, _state); } void _idwinterpolant_init_copy(void* _dst, void* _src, ae_state *_state) { idwinterpolant *dst = (idwinterpolant*)_dst; idwinterpolant *src = (idwinterpolant*)_src; dst->n = src->n; dst->nx = src->nx; dst->d = src->d; dst->r = src->r; dst->nw = src->nw; _kdtree_init_copy(&dst->tree, &src->tree, _state); dst->modeltype = src->modeltype; ae_matrix_init_copy(&dst->q, &src->q, _state); ae_vector_init_copy(&dst->xbuf, &src->xbuf, _state); ae_vector_init_copy(&dst->tbuf, &src->tbuf, _state); ae_vector_init_copy(&dst->rbuf, &src->rbuf, _state); ae_matrix_init_copy(&dst->xybuf, &src->xybuf, _state); dst->debugsolverfailures = src->debugsolverfailures; dst->debugworstrcond = src->debugworstrcond; dst->debugbestrcond = src->debugbestrcond; } void _idwinterpolant_clear(void* _p) { idwinterpolant *p = (idwinterpolant*)_p; ae_touch_ptr((void*)p); _kdtree_clear(&p->tree); ae_matrix_clear(&p->q); ae_vector_clear(&p->xbuf); ae_vector_clear(&p->tbuf); ae_vector_clear(&p->rbuf); ae_matrix_clear(&p->xybuf); } void _idwinterpolant_destroy(void* _p) { idwinterpolant *p = (idwinterpolant*)_p; ae_touch_ptr((void*)p); _kdtree_destroy(&p->tree); ae_matrix_destroy(&p->q); ae_vector_destroy(&p->xbuf); ae_vector_destroy(&p->tbuf); ae_vector_destroy(&p->rbuf); ae_matrix_destroy(&p->xybuf); } /************************************************************************* Rational interpolation using barycentric formula F(t) = SUM(i=0,n-1,w[i]*f[i]/(t-x[i])) / SUM(i=0,n-1,w[i]/(t-x[i])) Input parameters: B - barycentric interpolant built with one of model building subroutines. T - interpolation point Result: barycentric interpolant F(t) -- ALGLIB -- Copyright 17.08.2009 by Bochkanov Sergey *************************************************************************/ double barycentriccalc(barycentricinterpolant* b, double t, ae_state *_state) { double s1; double s2; double s; double v; ae_int_t i; double result; ae_assert(!ae_isinf(t, _state), "BarycentricCalc: infinite T!", _state); /* * special case: NaN */ if( ae_isnan(t, _state) ) { result = _state->v_nan; return result; } /* * special case: N=1 */ if( b->n==1 ) { result = b->sy*b->y.ptr.p_double[0]; return result; } /* * Here we assume that task is normalized, i.e.: * 1. abs(Y[i])<=1 * 2. abs(W[i])<=1 * 3. 
X[] is ordered */ s = ae_fabs(t-b->x.ptr.p_double[0], _state); for(i=0; i<=b->n-1; i++) { v = b->x.ptr.p_double[i]; if( ae_fp_eq(v,t) ) { result = b->sy*b->y.ptr.p_double[i]; return result; } v = ae_fabs(t-v, _state); if( ae_fp_less(v,s) ) { s = v; } } s1 = (double)(0); s2 = (double)(0); for(i=0; i<=b->n-1; i++) { v = s/(t-b->x.ptr.p_double[i]); v = v*b->w.ptr.p_double[i]; s1 = s1+v*b->y.ptr.p_double[i]; s2 = s2+v; } result = b->sy*s1/s2; return result; } /************************************************************************* Differentiation of barycentric interpolant: first derivative. Algorithm used in this subroutine is very robust and should not fail until provided with values too close to MaxRealNumber (usually MaxRealNumber/N or greater will overflow). INPUT PARAMETERS: B - barycentric interpolant built with one of model building subroutines. T - interpolation point OUTPUT PARAMETERS: F - barycentric interpolant at T DF - first derivative NOTE -- ALGLIB -- Copyright 17.08.2009 by Bochkanov Sergey *************************************************************************/ void barycentricdiff1(barycentricinterpolant* b, double t, double* f, double* df, ae_state *_state) { double v; double vv; ae_int_t i; ae_int_t k; double n0; double n1; double d0; double d1; double s0; double s1; double xk; double xi; double xmin; double xmax; double xscale1; double xoffs1; double xscale2; double xoffs2; double xprev; *f = 0; *df = 0; ae_assert(!ae_isinf(t, _state), "BarycentricDiff1: infinite T!", _state); /* * special case: NaN */ if( ae_isnan(t, _state) ) { *f = _state->v_nan; *df = _state->v_nan; return; } /* * special case: N=1 */ if( b->n==1 ) { *f = b->sy*b->y.ptr.p_double[0]; *df = (double)(0); return; } if( ae_fp_eq(b->sy,(double)(0)) ) { *f = (double)(0); *df = (double)(0); return; } ae_assert(ae_fp_greater(b->sy,(double)(0)), "BarycentricDiff1: internal error", _state); /* * We assume than N>1 and B.SY>0. Find: * 1. pivot point (X[i] closest to T) * 2. 
width of interval containing X[i] */ v = ae_fabs(b->x.ptr.p_double[0]-t, _state); k = 0; xmin = b->x.ptr.p_double[0]; xmax = b->x.ptr.p_double[0]; for(i=1; i<=b->n-1; i++) { vv = b->x.ptr.p_double[i]; if( ae_fp_less(ae_fabs(vv-t, _state),v) ) { v = ae_fabs(vv-t, _state); k = i; } xmin = ae_minreal(xmin, vv, _state); xmax = ae_maxreal(xmax, vv, _state); } /* * pivot point found, calculate dNumerator and dDenominator */ xscale1 = 1/(xmax-xmin); xoffs1 = -xmin/(xmax-xmin)+1; xscale2 = (double)(2); xoffs2 = (double)(-3); t = t*xscale1+xoffs1; t = t*xscale2+xoffs2; xk = b->x.ptr.p_double[k]; xk = xk*xscale1+xoffs1; xk = xk*xscale2+xoffs2; v = t-xk; n0 = (double)(0); n1 = (double)(0); d0 = (double)(0); d1 = (double)(0); xprev = (double)(-2); for(i=0; i<=b->n-1; i++) { xi = b->x.ptr.p_double[i]; xi = xi*xscale1+xoffs1; xi = xi*xscale2+xoffs2; ae_assert(ae_fp_greater(xi,xprev), "BarycentricDiff1: points are too close!", _state); xprev = xi; if( i!=k ) { vv = ae_sqr(t-xi, _state); s0 = (t-xk)/(t-xi); s1 = (xk-xi)/vv; } else { s0 = (double)(1); s1 = (double)(0); } vv = b->w.ptr.p_double[i]*b->y.ptr.p_double[i]; n0 = n0+s0*vv; n1 = n1+s1*vv; vv = b->w.ptr.p_double[i]; d0 = d0+s0*vv; d1 = d1+s1*vv; } *f = b->sy*n0/d0; *df = (n1*d0-n0*d1)/ae_sqr(d0, _state); if( ae_fp_neq(*df,(double)(0)) ) { *df = ae_sign(*df, _state)*ae_exp(ae_log(ae_fabs(*df, _state), _state)+ae_log(b->sy, _state)+ae_log(xscale1, _state)+ae_log(xscale2, _state), _state); } } /************************************************************************* Differentiation of barycentric interpolant: first/second derivatives. INPUT PARAMETERS: B - barycentric interpolant built with one of model building subroutines. T - interpolation point OUTPUT PARAMETERS: F - barycentric interpolant at T DF - first derivative D2F - second derivative NOTE: this algorithm may fail due to overflow/underflor if used on data whose values are close to MaxRealNumber or MinRealNumber. Use more robust BarycentricDiff1() subroutine in such cases. -- ALGLIB -- Copyright 17.08.2009 by Bochkanov Sergey *************************************************************************/ void barycentricdiff2(barycentricinterpolant* b, double t, double* f, double* df, double* d2f, ae_state *_state) { double v; double vv; ae_int_t i; ae_int_t k; double n0; double n1; double n2; double d0; double d1; double d2; double s0; double s1; double s2; double xk; double xi; *f = 0; *df = 0; *d2f = 0; ae_assert(!ae_isinf(t, _state), "BarycentricDiff1: infinite T!", _state); /* * special case: NaN */ if( ae_isnan(t, _state) ) { *f = _state->v_nan; *df = _state->v_nan; *d2f = _state->v_nan; return; } /* * special case: N=1 */ if( b->n==1 ) { *f = b->sy*b->y.ptr.p_double[0]; *df = (double)(0); *d2f = (double)(0); return; } if( ae_fp_eq(b->sy,(double)(0)) ) { *f = (double)(0); *df = (double)(0); *d2f = (double)(0); return; } /* * We assume than N>1 and B.SY>0. Find: * 1. pivot point (X[i] closest to T) * 2. 
width of interval containing X[i] */ ae_assert(ae_fp_greater(b->sy,(double)(0)), "BarycentricDiff: internal error", _state); *f = (double)(0); *df = (double)(0); *d2f = (double)(0); v = ae_fabs(b->x.ptr.p_double[0]-t, _state); k = 0; for(i=1; i<=b->n-1; i++) { vv = b->x.ptr.p_double[i]; if( ae_fp_less(ae_fabs(vv-t, _state),v) ) { v = ae_fabs(vv-t, _state); k = i; } } /* * pivot point found, calculate dNumerator and dDenominator */ xk = b->x.ptr.p_double[k]; v = t-xk; n0 = (double)(0); n1 = (double)(0); n2 = (double)(0); d0 = (double)(0); d1 = (double)(0); d2 = (double)(0); for(i=0; i<=b->n-1; i++) { if( i!=k ) { xi = b->x.ptr.p_double[i]; vv = ae_sqr(t-xi, _state); s0 = (t-xk)/(t-xi); s1 = (xk-xi)/vv; s2 = -2*(xk-xi)/(vv*(t-xi)); } else { s0 = (double)(1); s1 = (double)(0); s2 = (double)(0); } vv = b->w.ptr.p_double[i]*b->y.ptr.p_double[i]; n0 = n0+s0*vv; n1 = n1+s1*vv; n2 = n2+s2*vv; vv = b->w.ptr.p_double[i]; d0 = d0+s0*vv; d1 = d1+s1*vv; d2 = d2+s2*vv; } *f = b->sy*n0/d0; *df = b->sy*(n1*d0-n0*d1)/ae_sqr(d0, _state); *d2f = b->sy*((n2*d0-n0*d2)*ae_sqr(d0, _state)-(n1*d0-n0*d1)*2*d0*d1)/ae_sqr(ae_sqr(d0, _state), _state); } /************************************************************************* This subroutine performs linear transformation of the argument. INPUT PARAMETERS: B - rational interpolant in barycentric form CA, CB - transformation coefficients: x = CA*t + CB OUTPUT PARAMETERS: B - transformed interpolant with X replaced by T -- ALGLIB PROJECT -- Copyright 19.08.2009 by Bochkanov Sergey *************************************************************************/ void barycentriclintransx(barycentricinterpolant* b, double ca, double cb, ae_state *_state) { ae_int_t i; ae_int_t j; double v; /* * special case, replace by constant F(CB) */ if( ae_fp_eq(ca,(double)(0)) ) { b->sy = barycentriccalc(b, cb, _state); v = (double)(1); for(i=0; i<=b->n-1; i++) { b->y.ptr.p_double[i] = (double)(1); b->w.ptr.p_double[i] = v; v = -v; } return; } /* * general case: CA<>0 */ for(i=0; i<=b->n-1; i++) { b->x.ptr.p_double[i] = (b->x.ptr.p_double[i]-cb)/ca; } if( ae_fp_less(ca,(double)(0)) ) { for(i=0; i<=b->n-1; i++) { if( in-1-i ) { j = b->n-1-i; v = b->x.ptr.p_double[i]; b->x.ptr.p_double[i] = b->x.ptr.p_double[j]; b->x.ptr.p_double[j] = v; v = b->y.ptr.p_double[i]; b->y.ptr.p_double[i] = b->y.ptr.p_double[j]; b->y.ptr.p_double[j] = v; v = b->w.ptr.p_double[i]; b->w.ptr.p_double[i] = b->w.ptr.p_double[j]; b->w.ptr.p_double[j] = v; } else { break; } } } } /************************************************************************* This subroutine performs linear transformation of the barycentric interpolant. 
INPUT PARAMETERS: B - rational interpolant in barycentric form CA, CB - transformation coefficients: B2(x) = CA*B(x) + CB OUTPUT PARAMETERS: B - transformed interpolant -- ALGLIB PROJECT -- Copyright 19.08.2009 by Bochkanov Sergey *************************************************************************/ void barycentriclintransy(barycentricinterpolant* b, double ca, double cb, ae_state *_state) { ae_int_t i; double v; for(i=0; i<=b->n-1; i++) { b->y.ptr.p_double[i] = ca*b->sy*b->y.ptr.p_double[i]+cb; } b->sy = (double)(0); for(i=0; i<=b->n-1; i++) { b->sy = ae_maxreal(b->sy, ae_fabs(b->y.ptr.p_double[i], _state), _state); } if( ae_fp_greater(b->sy,(double)(0)) ) { v = 1/b->sy; ae_v_muld(&b->y.ptr.p_double[0], 1, ae_v_len(0,b->n-1), v); } } /************************************************************************* Extracts X/Y/W arrays from rational interpolant INPUT PARAMETERS: B - barycentric interpolant OUTPUT PARAMETERS: N - nodes count, N>0 X - interpolation nodes, array[0..N-1] F - function values, array[0..N-1] W - barycentric weights, array[0..N-1] -- ALGLIB -- Copyright 17.08.2009 by Bochkanov Sergey *************************************************************************/ void barycentricunpack(barycentricinterpolant* b, ae_int_t* n, /* Real */ ae_vector* x, /* Real */ ae_vector* y, /* Real */ ae_vector* w, ae_state *_state) { double v; *n = 0; ae_vector_clear(x); ae_vector_clear(y); ae_vector_clear(w); *n = b->n; ae_vector_set_length(x, *n, _state); ae_vector_set_length(y, *n, _state); ae_vector_set_length(w, *n, _state); v = b->sy; ae_v_move(&x->ptr.p_double[0], 1, &b->x.ptr.p_double[0], 1, ae_v_len(0,*n-1)); ae_v_moved(&y->ptr.p_double[0], 1, &b->y.ptr.p_double[0], 1, ae_v_len(0,*n-1), v); ae_v_move(&w->ptr.p_double[0], 1, &b->w.ptr.p_double[0], 1, ae_v_len(0,*n-1)); } /************************************************************************* Rational interpolant from X/Y/W arrays F(t) = SUM(i=0,n-1,w[i]*f[i]/(t-x[i])) / SUM(i=0,n-1,w[i]/(t-x[i])) INPUT PARAMETERS: X - interpolation nodes, array[0..N-1] F - function values, array[0..N-1] W - barycentric weights, array[0..N-1] N - nodes count, N>0 OUTPUT PARAMETERS: B - barycentric interpolant built from (X, Y, W) -- ALGLIB -- Copyright 17.08.2009 by Bochkanov Sergey *************************************************************************/ void barycentricbuildxyw(/* Real */ ae_vector* x, /* Real */ ae_vector* y, /* Real */ ae_vector* w, ae_int_t n, barycentricinterpolant* b, ae_state *_state) { _barycentricinterpolant_clear(b); ae_assert(n>0, "BarycentricBuildXYW: incorrect N!", _state); /* * fill X/Y/W */ ae_vector_set_length(&b->x, n, _state); ae_vector_set_length(&b->y, n, _state); ae_vector_set_length(&b->w, n, _state); ae_v_move(&b->x.ptr.p_double[0], 1, &x->ptr.p_double[0], 1, ae_v_len(0,n-1)); ae_v_move(&b->y.ptr.p_double[0], 1, &y->ptr.p_double[0], 1, ae_v_len(0,n-1)); ae_v_move(&b->w.ptr.p_double[0], 1, &w->ptr.p_double[0], 1, ae_v_len(0,n-1)); b->n = n; /* * Normalize */ ratint_barycentricnormalize(b, _state); } /************************************************************************* Rational interpolant without poles The subroutine constructs the rational interpolating function without real poles (see 'Barycentric rational interpolation with no poles and high rates of approximation', Michael S. Floater. and Kai Hormann, for more information on this subject). Input parameters: X - interpolation nodes, array[0..N-1]. Y - function values, array[0..N-1]. N - number of nodes, N>0. 
D - order of the interpolation scheme, 0 <= D <= N-1. D<0 will cause an error. D>=N it will be replaced with D=N-1. if you don't know what D to choose, use small value about 3-5. Output parameters: B - barycentric interpolant. Note: this algorithm always succeeds and calculates the weights with close to machine precision. -- ALGLIB PROJECT -- Copyright 17.06.2007 by Bochkanov Sergey *************************************************************************/ void barycentricbuildfloaterhormann(/* Real */ ae_vector* x, /* Real */ ae_vector* y, ae_int_t n, ae_int_t d, barycentricinterpolant* b, ae_state *_state) { ae_frame _frame_block; double s0; double s; double v; ae_int_t i; ae_int_t j; ae_int_t k; ae_vector perm; ae_vector wtemp; ae_vector sortrbuf; ae_vector sortrbuf2; ae_frame_make(_state, &_frame_block); _barycentricinterpolant_clear(b); ae_vector_init(&perm, 0, DT_INT, _state); ae_vector_init(&wtemp, 0, DT_REAL, _state); ae_vector_init(&sortrbuf, 0, DT_REAL, _state); ae_vector_init(&sortrbuf2, 0, DT_REAL, _state); ae_assert(n>0, "BarycentricFloaterHormann: N<=0!", _state); ae_assert(d>=0, "BarycentricFloaterHormann: incorrect D!", _state); /* * Prepare */ if( d>n-1 ) { d = n-1; } b->n = n; /* * special case: N=1 */ if( n==1 ) { ae_vector_set_length(&b->x, n, _state); ae_vector_set_length(&b->y, n, _state); ae_vector_set_length(&b->w, n, _state); b->x.ptr.p_double[0] = x->ptr.p_double[0]; b->y.ptr.p_double[0] = y->ptr.p_double[0]; b->w.ptr.p_double[0] = (double)(1); ratint_barycentricnormalize(b, _state); ae_frame_leave(_state); return; } /* * Fill X/Y */ ae_vector_set_length(&b->x, n, _state); ae_vector_set_length(&b->y, n, _state); ae_v_move(&b->x.ptr.p_double[0], 1, &x->ptr.p_double[0], 1, ae_v_len(0,n-1)); ae_v_move(&b->y.ptr.p_double[0], 1, &y->ptr.p_double[0], 1, ae_v_len(0,n-1)); tagsortfastr(&b->x, &b->y, &sortrbuf, &sortrbuf2, n, _state); /* * Calculate Wk */ ae_vector_set_length(&b->w, n, _state); s0 = (double)(1); for(k=1; k<=d; k++) { s0 = -s0; } for(k=0; k<=n-1; k++) { /* * Wk */ s = (double)(0); for(i=ae_maxint(k-d, 0, _state); i<=ae_minint(k, n-1-d, _state); i++) { v = (double)(1); for(j=i; j<=i+d; j++) { if( j!=k ) { v = v/ae_fabs(b->x.ptr.p_double[k]-b->x.ptr.p_double[j], _state); } } s = s+v; } b->w.ptr.p_double[k] = s0*s; /* * Next S0 */ s0 = -s0; } /* * Normalize */ ratint_barycentricnormalize(b, _state); ae_frame_leave(_state); } /************************************************************************* Copying of the barycentric interpolant (for internal use only) INPUT PARAMETERS: B - barycentric interpolant OUTPUT PARAMETERS: B2 - copy(B1) -- ALGLIB -- Copyright 17.08.2009 by Bochkanov Sergey *************************************************************************/ void barycentriccopy(barycentricinterpolant* b, barycentricinterpolant* b2, ae_state *_state) { _barycentricinterpolant_clear(b2); b2->n = b->n; b2->sy = b->sy; ae_vector_set_length(&b2->x, b2->n, _state); ae_vector_set_length(&b2->y, b2->n, _state); ae_vector_set_length(&b2->w, b2->n, _state); ae_v_move(&b2->x.ptr.p_double[0], 1, &b->x.ptr.p_double[0], 1, ae_v_len(0,b2->n-1)); ae_v_move(&b2->y.ptr.p_double[0], 1, &b->y.ptr.p_double[0], 1, ae_v_len(0,b2->n-1)); ae_v_move(&b2->w.ptr.p_double[0], 1, &b->w.ptr.p_double[0], 1, ae_v_len(0,b2->n-1)); } /************************************************************************* Normalization of barycentric interpolant: * B.N, B.X, B.Y and B.W are initialized * B.SY is NOT initialized * Y[] is normalized, scaling coefficient is stored in B.SY * W[] is 
normalized, no scaling coefficient is stored * X[] is sorted Internal subroutine. *************************************************************************/ static void ratint_barycentricnormalize(barycentricinterpolant* b, ae_state *_state) { ae_frame _frame_block; ae_vector p1; ae_vector p2; ae_int_t i; ae_int_t j; ae_int_t j2; double v; ae_frame_make(_state, &_frame_block); ae_vector_init(&p1, 0, DT_INT, _state); ae_vector_init(&p2, 0, DT_INT, _state); /* * Normalize task: |Y|<=1, |W|<=1, sort X[] */ b->sy = (double)(0); for(i=0; i<=b->n-1; i++) { b->sy = ae_maxreal(b->sy, ae_fabs(b->y.ptr.p_double[i], _state), _state); } if( ae_fp_greater(b->sy,(double)(0))&&ae_fp_greater(ae_fabs(b->sy-1, _state),10*ae_machineepsilon) ) { v = 1/b->sy; ae_v_muld(&b->y.ptr.p_double[0], 1, ae_v_len(0,b->n-1), v); } v = (double)(0); for(i=0; i<=b->n-1; i++) { v = ae_maxreal(v, ae_fabs(b->w.ptr.p_double[i], _state), _state); } if( ae_fp_greater(v,(double)(0))&&ae_fp_greater(ae_fabs(v-1, _state),10*ae_machineepsilon) ) { v = 1/v; ae_v_muld(&b->w.ptr.p_double[0], 1, ae_v_len(0,b->n-1), v); } for(i=0; i<=b->n-2; i++) { if( ae_fp_less(b->x.ptr.p_double[i+1],b->x.ptr.p_double[i]) ) { tagsort(&b->x, b->n, &p1, &p2, _state); for(j=0; j<=b->n-1; j++) { j2 = p2.ptr.p_int[j]; v = b->y.ptr.p_double[j]; b->y.ptr.p_double[j] = b->y.ptr.p_double[j2]; b->y.ptr.p_double[j2] = v; v = b->w.ptr.p_double[j]; b->w.ptr.p_double[j] = b->w.ptr.p_double[j2]; b->w.ptr.p_double[j2] = v; } break; } } ae_frame_leave(_state); } void _barycentricinterpolant_init(void* _p, ae_state *_state) { barycentricinterpolant *p = (barycentricinterpolant*)_p; ae_touch_ptr((void*)p); ae_vector_init(&p->x, 0, DT_REAL, _state); ae_vector_init(&p->y, 0, DT_REAL, _state); ae_vector_init(&p->w, 0, DT_REAL, _state); } void _barycentricinterpolant_init_copy(void* _dst, void* _src, ae_state *_state) { barycentricinterpolant *dst = (barycentricinterpolant*)_dst; barycentricinterpolant *src = (barycentricinterpolant*)_src; dst->n = src->n; dst->sy = src->sy; ae_vector_init_copy(&dst->x, &src->x, _state); ae_vector_init_copy(&dst->y, &src->y, _state); ae_vector_init_copy(&dst->w, &src->w, _state); } void _barycentricinterpolant_clear(void* _p) { barycentricinterpolant *p = (barycentricinterpolant*)_p; ae_touch_ptr((void*)p); ae_vector_clear(&p->x); ae_vector_clear(&p->y); ae_vector_clear(&p->w); } void _barycentricinterpolant_destroy(void* _p) { barycentricinterpolant *p = (barycentricinterpolant*)_p; ae_touch_ptr((void*)p); ae_vector_destroy(&p->x); ae_vector_destroy(&p->y); ae_vector_destroy(&p->w); } /************************************************************************* This subroutine builds linear spline interpolant INPUT PARAMETERS: X - spline nodes, array[0..N-1] Y - function values, array[0..N-1] N - points count (optional): * N>=2 * if given, only first N points are used to build spline * if not given, automatically detected from X/Y sizes (len(X) must be equal to len(Y)) OUTPUT PARAMETERS: C - spline interpolant ORDER OF POINTS Subroutine automatically sorts points, so caller may pass unsorted array. 
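A minimal usage sketch for the builder described above, assuming the friendly C++ overloads spline1dbuildlinear()/spline1dcalc() from interpolation.h; node and value arrays are illustrative only.

    #include <cstdio>
    #include "interpolation.h"
    using namespace alglib;

    int main()
    {
        // nodes may be passed unsorted - the builder sorts them internally
        real_1d_array x = "[2.0, 0.0, 1.0, 3.0]";
        real_1d_array y = "[4.0, 0.0, 1.0, 9.0]";
        spline1dinterpolant s;

        spline1dbuildlinear(x, y, s);            // piecewise linear interpolant
        printf("%.3f\n", spline1dcalc(s, 1.5));  // value halfway between two nodes
        return 0;
    }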
-- ALGLIB PROJECT -- Copyright 24.06.2007 by Bochkanov Sergey *************************************************************************/ void spline1dbuildlinear(/* Real */ ae_vector* x, /* Real */ ae_vector* y, ae_int_t n, spline1dinterpolant* c, ae_state *_state) { ae_frame _frame_block; ae_vector _x; ae_vector _y; ae_int_t i; ae_frame_make(_state, &_frame_block); ae_vector_init_copy(&_x, x, _state); x = &_x; ae_vector_init_copy(&_y, y, _state); y = &_y; _spline1dinterpolant_clear(c); ae_assert(n>1, "Spline1DBuildLinear: N<2!", _state); ae_assert(x->cnt>=n, "Spline1DBuildLinear: Length(X)cnt>=n, "Spline1DBuildLinear: Length(Y)periodic = ae_false; c->n = n; c->k = 3; c->continuity = 0; ae_vector_set_length(&c->x, n, _state); ae_vector_set_length(&c->c, 4*(n-1)+2, _state); for(i=0; i<=n-1; i++) { c->x.ptr.p_double[i] = x->ptr.p_double[i]; } for(i=0; i<=n-2; i++) { c->c.ptr.p_double[4*i+0] = y->ptr.p_double[i]; c->c.ptr.p_double[4*i+1] = (y->ptr.p_double[i+1]-y->ptr.p_double[i])/(x->ptr.p_double[i+1]-x->ptr.p_double[i]); c->c.ptr.p_double[4*i+2] = (double)(0); c->c.ptr.p_double[4*i+3] = (double)(0); } c->c.ptr.p_double[4*(n-1)+0] = y->ptr.p_double[n-1]; c->c.ptr.p_double[4*(n-1)+1] = c->c.ptr.p_double[4*(n-2)+1]; ae_frame_leave(_state); } /************************************************************************* This subroutine builds cubic spline interpolant. INPUT PARAMETERS: X - spline nodes, array[0..N-1]. Y - function values, array[0..N-1]. OPTIONAL PARAMETERS: N - points count: * N>=2 * if given, only first N points are used to build spline * if not given, automatically detected from X/Y sizes (len(X) must be equal to len(Y)) BoundLType - boundary condition type for the left boundary BoundL - left boundary condition (first or second derivative, depending on the BoundLType) BoundRType - boundary condition type for the right boundary BoundR - right boundary condition (first or second derivative, depending on the BoundRType) OUTPUT PARAMETERS: C - spline interpolant ORDER OF POINTS Subroutine automatically sorts points, so caller may pass unsorted array. SETTING BOUNDARY VALUES: The BoundLType/BoundRType parameters can have the following values: * -1, which corresonds to the periodic (cyclic) boundary conditions. In this case: * both BoundLType and BoundRType must be equal to -1. * BoundL/BoundR are ignored * Y[last] is ignored (it is assumed to be equal to Y[first]). * 0, which corresponds to the parabolically terminated spline (BoundL and/or BoundR are ignored). * 1, which corresponds to the first derivative boundary condition * 2, which corresponds to the second derivative boundary condition * by default, BoundType=0 is used PROBLEMS WITH PERIODIC BOUNDARY CONDITIONS: Problems with periodic boundary conditions have Y[first_point]=Y[last_point]. However, this subroutine doesn't require you to specify equal values for the first and last points - it automatically forces them to be equal by copying Y[first_point] (corresponds to the leftmost, minimal X[]) to Y[last_point]. However it is recommended to pass consistent values of Y[], i.e. to make Y[first_point]=Y[last_point]. 
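A short sketch of the expert call with explicit boundary conditions (here a "natural" spline, i.e. zero second derivative at both ends), assuming the C++ wrappers spline1dbuildcubic()/spline1ddiff() from interpolation.h; the data values are made up.

    #include <cstdio>
    #include "interpolation.h"
    using namespace alglib;

    int main()
    {
        real_1d_array x = "[0.0, 1.0, 2.0, 3.0]";
        real_1d_array y = "[0.0, 1.0, 8.0, 27.0]";
        spline1dinterpolant s;

        // BoundLType=BoundRType=2 with zero values: S''=0 at both ends ("natural" spline)
        spline1dbuildcubic(x, y, 4, 2, 0.0, 2, 0.0, s);

        double f, df, d2f;
        spline1ddiff(s, 1.5, f, df, d2f);        // value and first/second derivatives at t=1.5
        printf("%.3f %.3f %.3f\n", f, df, d2f);
        return 0;
    }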
-- ALGLIB PROJECT -- Copyright 23.06.2007 by Bochkanov Sergey *************************************************************************/ void spline1dbuildcubic(/* Real */ ae_vector* x, /* Real */ ae_vector* y, ae_int_t n, ae_int_t boundltype, double boundl, ae_int_t boundrtype, double boundr, spline1dinterpolant* c, ae_state *_state) { ae_frame _frame_block; ae_vector _x; ae_vector _y; ae_vector a1; ae_vector a2; ae_vector a3; ae_vector b; ae_vector dt; ae_vector d; ae_vector p; ae_int_t ylen; ae_frame_make(_state, &_frame_block); ae_vector_init_copy(&_x, x, _state); x = &_x; ae_vector_init_copy(&_y, y, _state); y = &_y; _spline1dinterpolant_clear(c); ae_vector_init(&a1, 0, DT_REAL, _state); ae_vector_init(&a2, 0, DT_REAL, _state); ae_vector_init(&a3, 0, DT_REAL, _state); ae_vector_init(&b, 0, DT_REAL, _state); ae_vector_init(&dt, 0, DT_REAL, _state); ae_vector_init(&d, 0, DT_REAL, _state); ae_vector_init(&p, 0, DT_INT, _state); /* * check correctness of boundary conditions */ ae_assert(((boundltype==-1||boundltype==0)||boundltype==1)||boundltype==2, "Spline1DBuildCubic: incorrect BoundLType!", _state); ae_assert(((boundrtype==-1||boundrtype==0)||boundrtype==1)||boundrtype==2, "Spline1DBuildCubic: incorrect BoundRType!", _state); ae_assert((boundrtype==-1&&boundltype==-1)||(boundrtype!=-1&&boundltype!=-1), "Spline1DBuildCubic: incorrect BoundLType/BoundRType!", _state); if( boundltype==1||boundltype==2 ) { ae_assert(ae_isfinite(boundl, _state), "Spline1DBuildCubic: BoundL is infinite or NAN!", _state); } if( boundrtype==1||boundrtype==2 ) { ae_assert(ae_isfinite(boundr, _state), "Spline1DBuildCubic: BoundR is infinite or NAN!", _state); } /* * check lengths of arguments */ ae_assert(n>=2, "Spline1DBuildCubic: N<2!", _state); ae_assert(x->cnt>=n, "Spline1DBuildCubic: Length(X)cnt>=n, "Spline1DBuildCubic: Length(Y)ptr.p_double[n-1] = y->ptr.p_double[0]; } spline1d_spline1dgriddiffcubicinternal(x, y, n, boundltype, boundl, boundrtype, boundr, &d, &a1, &a2, &a3, &b, &dt, _state); spline1dbuildhermite(x, y, &d, n, c, _state); c->periodic = boundltype==-1||boundrtype==-1; c->continuity = 2; ae_frame_leave(_state); } /************************************************************************* This function solves following problem: given table y[] of function values at nodes x[], it calculates and returns table of function derivatives d[] (calculated at the same nodes x[]). This function yields same result as Spline1DBuildCubic() call followed by sequence of Spline1DDiff() calls, but it can be several times faster when called for ordered X[] and X2[]. INPUT PARAMETERS: X - spline nodes Y - function values OPTIONAL PARAMETERS: N - points count: * N>=2 * if given, only first N points are used * if not given, automatically detected from X/Y sizes (len(X) must be equal to len(Y)) BoundLType - boundary condition type for the left boundary BoundL - left boundary condition (first or second derivative, depending on the BoundLType) BoundRType - boundary condition type for the right boundary BoundR - right boundary condition (first or second derivative, depending on the BoundRType) OUTPUT PARAMETERS: D - derivative values at X[] ORDER OF POINTS Subroutine automatically sorts points, so caller may pass unsorted array. Derivative values are correctly reordered on return, so D[I] is always equal to S'(X[I]) independently of points order. SETTING BOUNDARY VALUES: The BoundLType/BoundRType parameters can have the following values: * -1, which corresonds to the periodic (cyclic) boundary conditions. 
In this case: * both BoundLType and BoundRType must be equal to -1. * BoundL/BoundR are ignored * Y[last] is ignored (it is assumed to be equal to Y[first]). * 0, which corresponds to the parabolically terminated spline (BoundL and/or BoundR are ignored). * 1, which corresponds to the first derivative boundary condition * 2, which corresponds to the second derivative boundary condition * by default, BoundType=0 is used PROBLEMS WITH PERIODIC BOUNDARY CONDITIONS: Problems with periodic boundary conditions have Y[first_point]=Y[last_point]. However, this subroutine doesn't require you to specify equal values for the first and last points - it automatically forces them to be equal by copying Y[first_point] (corresponds to the leftmost, minimal X[]) to Y[last_point]. However it is recommended to pass consistent values of Y[], i.e. to make Y[first_point]=Y[last_point]. -- ALGLIB PROJECT -- Copyright 03.09.2010 by Bochkanov Sergey *************************************************************************/ void spline1dgriddiffcubic(/* Real */ ae_vector* x, /* Real */ ae_vector* y, ae_int_t n, ae_int_t boundltype, double boundl, ae_int_t boundrtype, double boundr, /* Real */ ae_vector* d, ae_state *_state) { ae_frame _frame_block; ae_vector _x; ae_vector _y; ae_vector a1; ae_vector a2; ae_vector a3; ae_vector b; ae_vector dt; ae_vector p; ae_int_t i; ae_int_t ylen; ae_frame_make(_state, &_frame_block); ae_vector_init_copy(&_x, x, _state); x = &_x; ae_vector_init_copy(&_y, y, _state); y = &_y; ae_vector_clear(d); ae_vector_init(&a1, 0, DT_REAL, _state); ae_vector_init(&a2, 0, DT_REAL, _state); ae_vector_init(&a3, 0, DT_REAL, _state); ae_vector_init(&b, 0, DT_REAL, _state); ae_vector_init(&dt, 0, DT_REAL, _state); ae_vector_init(&p, 0, DT_INT, _state); /* * check correctness of boundary conditions */ ae_assert(((boundltype==-1||boundltype==0)||boundltype==1)||boundltype==2, "Spline1DGridDiffCubic: incorrect BoundLType!", _state); ae_assert(((boundrtype==-1||boundrtype==0)||boundrtype==1)||boundrtype==2, "Spline1DGridDiffCubic: incorrect BoundRType!", _state); ae_assert((boundrtype==-1&&boundltype==-1)||(boundrtype!=-1&&boundltype!=-1), "Spline1DGridDiffCubic: incorrect BoundLType/BoundRType!", _state); if( boundltype==1||boundltype==2 ) { ae_assert(ae_isfinite(boundl, _state), "Spline1DGridDiffCubic: BoundL is infinite or NAN!", _state); } if( boundrtype==1||boundrtype==2 ) { ae_assert(ae_isfinite(boundr, _state), "Spline1DGridDiffCubic: BoundR is infinite or NAN!", _state); } /* * check lengths of arguments */ ae_assert(n>=2, "Spline1DGridDiffCubic: N<2!", _state); ae_assert(x->cnt>=n, "Spline1DGridDiffCubic: Length(X)cnt>=n, "Spline1DGridDiffCubic: Length(Y)ptr.p_double[i]; } ae_v_move(&d->ptr.p_double[0], 1, &dt.ptr.p_double[0], 1, ae_v_len(0,n-1)); ae_frame_leave(_state); } /************************************************************************* This function solves following problem: given table y[] of function values at nodes x[], it calculates and returns tables of first and second function derivatives d1[] and d2[] (calculated at the same nodes x[]). This function yields same result as Spline1DBuildCubic() call followed by sequence of Spline1DDiff() calls, but it can be several times faster when called for ordered X[] and X2[]. 
INPUT PARAMETERS:
    X           -   spline nodes
    Y           -   function values

OPTIONAL PARAMETERS:
    N           -   points count:
                    * N>=2
                    * if given, only first N points are used
                    * if not given, automatically detected from X/Y sizes
                      (len(X) must be equal to len(Y))
    BoundLType  -   boundary condition type for the left boundary
    BoundL      -   left boundary condition (first or second derivative,
                    depending on the BoundLType)
    BoundRType  -   boundary condition type for the right boundary
    BoundR      -   right boundary condition (first or second derivative,
                    depending on the BoundRType)

OUTPUT PARAMETERS:
    D1          -   S' values at X[]
    D2          -   S'' values at X[]

ORDER OF POINTS

Subroutine automatically sorts points, so caller may pass unsorted array.
Derivative values are correctly reordered on return, so D[I] is always
equal to S'(X[I]) independently of points order.

SETTING BOUNDARY VALUES:

The BoundLType/BoundRType parameters can have the following values:
    * -1, which corresponds to the periodic (cyclic) boundary conditions.
          In this case:
          * both BoundLType and BoundRType must be equal to -1.
          * BoundL/BoundR are ignored
          * Y[last] is ignored (it is assumed to be equal to Y[first]).
    *  0, which corresponds to the parabolically terminated spline
          (BoundL and/or BoundR are ignored).
    *  1, which corresponds to the first derivative boundary condition
    *  2, which corresponds to the second derivative boundary condition
    *  by default, BoundType=0 is used

PROBLEMS WITH PERIODIC BOUNDARY CONDITIONS:

Problems with periodic boundary conditions have Y[first_point]=Y[last_point].
However, this subroutine doesn't require you to specify equal values for
the first and last points - it automatically forces them to be equal by
copying Y[first_point] (corresponds to the leftmost, minimal X[]) to
Y[last_point]. However, it is recommended to pass consistent values of
Y[], i.e. to make Y[first_point]=Y[last_point].
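A compact sketch of both grid-differentiation helpers, assuming the friendly C++ overloads spline1dgriddiffcubic()/spline1dgriddiff2cubic() from interpolation.h; the grid below is illustrative.

    #include <cstdio>
    #include "interpolation.h"
    using namespace alglib;

    int main()
    {
        real_1d_array x = "[0.0, 1.0, 2.0, 3.0]";
        real_1d_array y = "[0.0, 1.0, 4.0, 9.0]";   // samples of f(x)=x^2
        real_1d_array d, d1, d2;

        spline1dgriddiffcubic(x, y, d);             // S'(x[i]) at the grid nodes
        spline1dgriddiff2cubic(x, y, d1, d2);       // S'(x[i]) and S''(x[i]) in one pass

        printf("S'(1)=%.3f  S''(1)=%.3f\n", d1(1), d2(1));
        return 0;
    }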
-- ALGLIB PROJECT -- Copyright 03.09.2010 by Bochkanov Sergey *************************************************************************/ void spline1dgriddiff2cubic(/* Real */ ae_vector* x, /* Real */ ae_vector* y, ae_int_t n, ae_int_t boundltype, double boundl, ae_int_t boundrtype, double boundr, /* Real */ ae_vector* d1, /* Real */ ae_vector* d2, ae_state *_state) { ae_frame _frame_block; ae_vector _x; ae_vector _y; ae_vector a1; ae_vector a2; ae_vector a3; ae_vector b; ae_vector dt; ae_vector p; ae_int_t i; ae_int_t ylen; double delta; double delta2; double delta3; double s2; double s3; ae_frame_make(_state, &_frame_block); ae_vector_init_copy(&_x, x, _state); x = &_x; ae_vector_init_copy(&_y, y, _state); y = &_y; ae_vector_clear(d1); ae_vector_clear(d2); ae_vector_init(&a1, 0, DT_REAL, _state); ae_vector_init(&a2, 0, DT_REAL, _state); ae_vector_init(&a3, 0, DT_REAL, _state); ae_vector_init(&b, 0, DT_REAL, _state); ae_vector_init(&dt, 0, DT_REAL, _state); ae_vector_init(&p, 0, DT_INT, _state); /* * check correctness of boundary conditions */ ae_assert(((boundltype==-1||boundltype==0)||boundltype==1)||boundltype==2, "Spline1DGridDiff2Cubic: incorrect BoundLType!", _state); ae_assert(((boundrtype==-1||boundrtype==0)||boundrtype==1)||boundrtype==2, "Spline1DGridDiff2Cubic: incorrect BoundRType!", _state); ae_assert((boundrtype==-1&&boundltype==-1)||(boundrtype!=-1&&boundltype!=-1), "Spline1DGridDiff2Cubic: incorrect BoundLType/BoundRType!", _state); if( boundltype==1||boundltype==2 ) { ae_assert(ae_isfinite(boundl, _state), "Spline1DGridDiff2Cubic: BoundL is infinite or NAN!", _state); } if( boundrtype==1||boundrtype==2 ) { ae_assert(ae_isfinite(boundr, _state), "Spline1DGridDiff2Cubic: BoundR is infinite or NAN!", _state); } /* * check lengths of arguments */ ae_assert(n>=2, "Spline1DGridDiff2Cubic: N<2!", _state); ae_assert(x->cnt>=n, "Spline1DGridDiff2Cubic: Length(X)cnt>=n, "Spline1DGridDiff2Cubic: Length(Y)ptr.p_double[i+1]-x->ptr.p_double[i]; delta2 = ae_sqr(delta, _state); delta3 = delta*delta2; s2 = (3*(y->ptr.p_double[i+1]-y->ptr.p_double[i])-2*d1->ptr.p_double[i]*delta-d1->ptr.p_double[i+1]*delta)/delta2; s3 = (2*(y->ptr.p_double[i]-y->ptr.p_double[i+1])+d1->ptr.p_double[i]*delta+d1->ptr.p_double[i+1]*delta)/delta3; d2->ptr.p_double[i] = 2*s2; } d2->ptr.p_double[n-1] = 2*s2+6*s3*delta; /* * Remember that HeapSortPPoints() call? * Now we have to reorder them back. */ if( dt.cntptr.p_double[i]; } ae_v_move(&d1->ptr.p_double[0], 1, &dt.ptr.p_double[0], 1, ae_v_len(0,n-1)); for(i=0; i<=n-1; i++) { dt.ptr.p_double[p.ptr.p_int[i]] = d2->ptr.p_double[i]; } ae_v_move(&d2->ptr.p_double[0], 1, &dt.ptr.p_double[0], 1, ae_v_len(0,n-1)); ae_frame_leave(_state); } /************************************************************************* This function solves following problem: given table y[] of function values at old nodes x[] and new nodes x2[], it calculates and returns table of function values y2[] (calculated at x2[]). This function yields same result as Spline1DBuildCubic() call followed by sequence of Spline1DDiff() calls, but it can be several times faster when called for ordered X[] and X2[]. 
INPUT PARAMETERS:
    X           -   old spline nodes
    Y           -   function values
    X2          -   new spline nodes

OPTIONAL PARAMETERS:
    N           -   points count:
                    * N>=2
                    * if given, only first N points from X/Y are used
                    * if not given, automatically detected from X/Y sizes
                      (len(X) must be equal to len(Y))
    BoundLType  -   boundary condition type for the left boundary
    BoundL      -   left boundary condition (first or second derivative,
                    depending on the BoundLType)
    BoundRType  -   boundary condition type for the right boundary
    BoundR      -   right boundary condition (first or second derivative,
                    depending on the BoundRType)
    N2          -   new points count:
                    * N2>=2
                    * if given, only first N2 points from X2 are used
                    * if not given, automatically detected from X2 size

OUTPUT PARAMETERS:
    F2          -   function values at X2[]

ORDER OF POINTS

Subroutine automatically sorts points, so caller may pass unsorted array.
Function values are correctly reordered on return, so F2[I] is always
equal to S(X2[I]) independently of points order.

SETTING BOUNDARY VALUES:

The BoundLType/BoundRType parameters can have the following values:
    * -1, which corresponds to the periodic (cyclic) boundary conditions.
          In this case:
          * both BoundLType and BoundRType must be equal to -1.
          * BoundL/BoundR are ignored
          * Y[last] is ignored (it is assumed to be equal to Y[first]).
    *  0, which corresponds to the parabolically terminated spline
          (BoundL and/or BoundR are ignored).
    *  1, which corresponds to the first derivative boundary condition
    *  2, which corresponds to the second derivative boundary condition
    *  by default, BoundType=0 is used

PROBLEMS WITH PERIODIC BOUNDARY CONDITIONS:

Problems with periodic boundary conditions have Y[first_point]=Y[last_point].
However, this subroutine doesn't require you to specify equal values for
the first and last points - it automatically forces them to be equal by
copying Y[first_point] (corresponds to the leftmost, minimal X[]) to
Y[last_point]. However, it is recommended to pass consistent values of
Y[], i.e. to make Y[first_point]=Y[last_point].
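A minimal sketch of resampling onto a new grid in one call, assuming the friendly C++ overload spline1dconvcubic() from interpolation.h; the grids are illustrative.

    #include <cstdio>
    #include "interpolation.h"
    using namespace alglib;

    int main()
    {
        real_1d_array x  = "[0.0, 1.0, 2.0, 3.0]";
        real_1d_array y  = "[0.0, 1.0, 4.0, 9.0]";
        real_1d_array x2 = "[0.5, 1.5, 2.5]";       // new nodes
        real_1d_array y2;

        spline1dconvcubic(x, y, x2, y2);            // y2[i] = S(x2[i])
        printf("%.3f %.3f %.3f\n", y2(0), y2(1), y2(2));
        return 0;
    }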
-- ALGLIB PROJECT -- Copyright 03.09.2010 by Bochkanov Sergey *************************************************************************/ void spline1dconvcubic(/* Real */ ae_vector* x, /* Real */ ae_vector* y, ae_int_t n, ae_int_t boundltype, double boundl, ae_int_t boundrtype, double boundr, /* Real */ ae_vector* x2, ae_int_t n2, /* Real */ ae_vector* y2, ae_state *_state) { ae_frame _frame_block; ae_vector _x; ae_vector _y; ae_vector _x2; ae_vector a1; ae_vector a2; ae_vector a3; ae_vector b; ae_vector d; ae_vector dt; ae_vector d1; ae_vector d2; ae_vector p; ae_vector p2; ae_int_t i; ae_int_t ylen; double t; double t2; ae_frame_make(_state, &_frame_block); ae_vector_init_copy(&_x, x, _state); x = &_x; ae_vector_init_copy(&_y, y, _state); y = &_y; ae_vector_init_copy(&_x2, x2, _state); x2 = &_x2; ae_vector_clear(y2); ae_vector_init(&a1, 0, DT_REAL, _state); ae_vector_init(&a2, 0, DT_REAL, _state); ae_vector_init(&a3, 0, DT_REAL, _state); ae_vector_init(&b, 0, DT_REAL, _state); ae_vector_init(&d, 0, DT_REAL, _state); ae_vector_init(&dt, 0, DT_REAL, _state); ae_vector_init(&d1, 0, DT_REAL, _state); ae_vector_init(&d2, 0, DT_REAL, _state); ae_vector_init(&p, 0, DT_INT, _state); ae_vector_init(&p2, 0, DT_INT, _state); /* * check correctness of boundary conditions */ ae_assert(((boundltype==-1||boundltype==0)||boundltype==1)||boundltype==2, "Spline1DConvCubic: incorrect BoundLType!", _state); ae_assert(((boundrtype==-1||boundrtype==0)||boundrtype==1)||boundrtype==2, "Spline1DConvCubic: incorrect BoundRType!", _state); ae_assert((boundrtype==-1&&boundltype==-1)||(boundrtype!=-1&&boundltype!=-1), "Spline1DConvCubic: incorrect BoundLType/BoundRType!", _state); if( boundltype==1||boundltype==2 ) { ae_assert(ae_isfinite(boundl, _state), "Spline1DConvCubic: BoundL is infinite or NAN!", _state); } if( boundrtype==1||boundrtype==2 ) { ae_assert(ae_isfinite(boundr, _state), "Spline1DConvCubic: BoundR is infinite or NAN!", _state); } /* * check lengths of arguments */ ae_assert(n>=2, "Spline1DConvCubic: N<2!", _state); ae_assert(x->cnt>=n, "Spline1DConvCubic: Length(X)cnt>=n, "Spline1DConvCubic: Length(Y)=2, "Spline1DConvCubic: N2<2!", _state); ae_assert(x2->cnt>=n2, "Spline1DConvCubic: Length(X2)ptr.p_double[i]; apperiodicmap(&t, x->ptr.p_double[0], x->ptr.p_double[n-1], &t2, _state); x2->ptr.p_double[i] = t; } } spline1d_heapsortppoints(x2, &dt, &p2, n2, _state); /* * Now we've checked and preordered everything, so we: * * call internal GridDiff() function to get Hermite form of spline * * convert using internal Conv() function * * convert Y2 back to original order */ spline1d_spline1dgriddiffcubicinternal(x, y, n, boundltype, boundl, boundrtype, boundr, &d, &a1, &a2, &a3, &b, &dt, _state); spline1dconvdiffinternal(x, y, &d, n, x2, n2, y2, ae_true, &d1, ae_false, &d2, ae_false, _state); ae_assert(dt.cnt>=n2, "Spline1DConvCubic: internal error!", _state); for(i=0; i<=n2-1; i++) { dt.ptr.p_double[p2.ptr.p_int[i]] = y2->ptr.p_double[i]; } ae_v_move(&y2->ptr.p_double[0], 1, &dt.ptr.p_double[0], 1, ae_v_len(0,n2-1)); ae_frame_leave(_state); } /************************************************************************* This function solves following problem: given table y[] of function values at old nodes x[] and new nodes x2[], it calculates and returns table of function values y2[] and derivatives d2[] (calculated at x2[]). This function yields same result as Spline1DBuildCubic() call followed by sequence of Spline1DDiff() calls, but it can be several times faster when called for ordered X[] and X2[]. 
INPUT PARAMETERS:
    X           -   old spline nodes
    Y           -   function values
    X2          -   new spline nodes

OPTIONAL PARAMETERS:
    N           -   points count:
                    * N>=2
                    * if given, only first N points from X/Y are used
                    * if not given, automatically detected from X/Y sizes
                      (len(X) must be equal to len(Y))
    BoundLType  -   boundary condition type for the left boundary
    BoundL      -   left boundary condition (first or second derivative,
                    depending on the BoundLType)
    BoundRType  -   boundary condition type for the right boundary
    BoundR      -   right boundary condition (first or second derivative,
                    depending on the BoundRType)
    N2          -   new points count:
                    * N2>=2
                    * if given, only first N2 points from X2 are used
                    * if not given, automatically detected from X2 size

OUTPUT PARAMETERS:
    F2          -   function values at X2[]
    D2          -   first derivatives at X2[]

ORDER OF POINTS

Subroutine automatically sorts points, so caller may pass unsorted array.
Function values are correctly reordered on return, so F2[I] is always
equal to S(X2[I]) independently of points order.

SETTING BOUNDARY VALUES:

The BoundLType/BoundRType parameters can have the following values:
    * -1, which corresponds to the periodic (cyclic) boundary conditions.
          In this case:
          * both BoundLType and BoundRType must be equal to -1.
          * BoundL/BoundR are ignored
          * Y[last] is ignored (it is assumed to be equal to Y[first]).
    *  0, which corresponds to the parabolically terminated spline
          (BoundL and/or BoundR are ignored).
    *  1, which corresponds to the first derivative boundary condition
    *  2, which corresponds to the second derivative boundary condition
    *  by default, BoundType=0 is used

PROBLEMS WITH PERIODIC BOUNDARY CONDITIONS:

Problems with periodic boundary conditions have Y[first_point]=Y[last_point].
However, this subroutine doesn't require you to specify equal values for
the first and last points - it automatically forces them to be equal by
copying Y[first_point] (corresponds to the leftmost, minimal X[]) to
Y[last_point]. However, it is recommended to pass consistent values of
Y[], i.e. to make Y[first_point]=Y[last_point].
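The same pattern, extended to the variant that also returns first derivatives at the new nodes (sketch, assuming the friendly C++ overload spline1dconvdiffcubic(); illustrative data).

    #include <cstdio>
    #include "interpolation.h"
    using namespace alglib;

    int main()
    {
        real_1d_array x  = "[0.0, 1.0, 2.0, 3.0]";
        real_1d_array y  = "[0.0, 1.0, 4.0, 9.0]";
        real_1d_array x2 = "[0.5, 1.5, 2.5]";
        real_1d_array y2, d2;

        spline1dconvdiffcubic(x, y, x2, y2, d2);    // y2[i]=S(x2[i]), d2[i]=S'(x2[i])
        printf("%.3f %.3f\n", y2(1), d2(1));
        return 0;
    }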
-- ALGLIB PROJECT -- Copyright 03.09.2010 by Bochkanov Sergey *************************************************************************/ void spline1dconvdiffcubic(/* Real */ ae_vector* x, /* Real */ ae_vector* y, ae_int_t n, ae_int_t boundltype, double boundl, ae_int_t boundrtype, double boundr, /* Real */ ae_vector* x2, ae_int_t n2, /* Real */ ae_vector* y2, /* Real */ ae_vector* d2, ae_state *_state) { ae_frame _frame_block; ae_vector _x; ae_vector _y; ae_vector _x2; ae_vector a1; ae_vector a2; ae_vector a3; ae_vector b; ae_vector d; ae_vector dt; ae_vector rt1; ae_vector p; ae_vector p2; ae_int_t i; ae_int_t ylen; double t; double t2; ae_frame_make(_state, &_frame_block); ae_vector_init_copy(&_x, x, _state); x = &_x; ae_vector_init_copy(&_y, y, _state); y = &_y; ae_vector_init_copy(&_x2, x2, _state); x2 = &_x2; ae_vector_clear(y2); ae_vector_clear(d2); ae_vector_init(&a1, 0, DT_REAL, _state); ae_vector_init(&a2, 0, DT_REAL, _state); ae_vector_init(&a3, 0, DT_REAL, _state); ae_vector_init(&b, 0, DT_REAL, _state); ae_vector_init(&d, 0, DT_REAL, _state); ae_vector_init(&dt, 0, DT_REAL, _state); ae_vector_init(&rt1, 0, DT_REAL, _state); ae_vector_init(&p, 0, DT_INT, _state); ae_vector_init(&p2, 0, DT_INT, _state); /* * check correctness of boundary conditions */ ae_assert(((boundltype==-1||boundltype==0)||boundltype==1)||boundltype==2, "Spline1DConvDiffCubic: incorrect BoundLType!", _state); ae_assert(((boundrtype==-1||boundrtype==0)||boundrtype==1)||boundrtype==2, "Spline1DConvDiffCubic: incorrect BoundRType!", _state); ae_assert((boundrtype==-1&&boundltype==-1)||(boundrtype!=-1&&boundltype!=-1), "Spline1DConvDiffCubic: incorrect BoundLType/BoundRType!", _state); if( boundltype==1||boundltype==2 ) { ae_assert(ae_isfinite(boundl, _state), "Spline1DConvDiffCubic: BoundL is infinite or NAN!", _state); } if( boundrtype==1||boundrtype==2 ) { ae_assert(ae_isfinite(boundr, _state), "Spline1DConvDiffCubic: BoundR is infinite or NAN!", _state); } /* * check lengths of arguments */ ae_assert(n>=2, "Spline1DConvDiffCubic: N<2!", _state); ae_assert(x->cnt>=n, "Spline1DConvDiffCubic: Length(X)cnt>=n, "Spline1DConvDiffCubic: Length(Y)=2, "Spline1DConvDiffCubic: N2<2!", _state); ae_assert(x2->cnt>=n2, "Spline1DConvDiffCubic: Length(X2)ptr.p_double[i]; apperiodicmap(&t, x->ptr.p_double[0], x->ptr.p_double[n-1], &t2, _state); x2->ptr.p_double[i] = t; } } spline1d_heapsortppoints(x2, &dt, &p2, n2, _state); /* * Now we've checked and preordered everything, so we: * * call internal GridDiff() function to get Hermite form of spline * * convert using internal Conv() function * * convert Y2 back to original order */ spline1d_spline1dgriddiffcubicinternal(x, y, n, boundltype, boundl, boundrtype, boundr, &d, &a1, &a2, &a3, &b, &dt, _state); spline1dconvdiffinternal(x, y, &d, n, x2, n2, y2, ae_true, d2, ae_true, &rt1, ae_false, _state); ae_assert(dt.cnt>=n2, "Spline1DConvDiffCubic: internal error!", _state); for(i=0; i<=n2-1; i++) { dt.ptr.p_double[p2.ptr.p_int[i]] = y2->ptr.p_double[i]; } ae_v_move(&y2->ptr.p_double[0], 1, &dt.ptr.p_double[0], 1, ae_v_len(0,n2-1)); for(i=0; i<=n2-1; i++) { dt.ptr.p_double[p2.ptr.p_int[i]] = d2->ptr.p_double[i]; } ae_v_move(&d2->ptr.p_double[0], 1, &dt.ptr.p_double[0], 1, ae_v_len(0,n2-1)); ae_frame_leave(_state); } /************************************************************************* This function solves following problem: given table y[] of function values at old nodes x[] and new nodes x2[], it calculates and returns table of function values y2[], first and second 
derivatives d2[] and dd2[] (calculated at x2[]). This function yields the
same result as a Spline1DBuildCubic() call followed by a sequence of
Spline1DDiff() calls, but it can be several times faster when called for
ordered X[] and X2[].

INPUT PARAMETERS:
    X           -   old spline nodes
    Y           -   function values
    X2          -   new spline nodes

OPTIONAL PARAMETERS:
    N           -   points count:
                    * N>=2
                    * if given, only first N points from X/Y are used
                    * if not given, automatically detected from X/Y sizes
                      (len(X) must be equal to len(Y))
    BoundLType  -   boundary condition type for the left boundary
    BoundL      -   left boundary condition (first or second derivative,
                    depending on the BoundLType)
    BoundRType  -   boundary condition type for the right boundary
    BoundR      -   right boundary condition (first or second derivative,
                    depending on the BoundRType)
    N2          -   new points count:
                    * N2>=2
                    * if given, only first N2 points from X2 are used
                    * if not given, automatically detected from X2 size

OUTPUT PARAMETERS:
    F2          -   function values at X2[]
    D2          -   first derivatives at X2[]
    DD2         -   second derivatives at X2[]

ORDER OF POINTS

Subroutine automatically sorts points, so caller may pass unsorted array.
Function values are correctly reordered on return, so F2[I] is always
equal to S(X2[I]) independently of points order.

SETTING BOUNDARY VALUES:

The BoundLType/BoundRType parameters can have the following values:
    * -1, which corresponds to the periodic (cyclic) boundary conditions.
          In this case:
          * both BoundLType and BoundRType must be equal to -1.
          * BoundL/BoundR are ignored
          * Y[last] is ignored (it is assumed to be equal to Y[first]).
    *  0, which corresponds to the parabolically terminated spline
          (BoundL and/or BoundR are ignored).
    *  1, which corresponds to the first derivative boundary condition
    *  2, which corresponds to the second derivative boundary condition
    *  by default, BoundType=0 is used

PROBLEMS WITH PERIODIC BOUNDARY CONDITIONS:

Problems with periodic boundary conditions have Y[first_point]=Y[last_point].
However, this subroutine doesn't require you to specify equal values for
the first and last points - it automatically forces them to be equal by
copying Y[first_point] (corresponds to the leftmost, minimal X[]) to
Y[last_point]. However, it is recommended to pass consistent values of
Y[], i.e. to make Y[first_point]=Y[last_point].
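And the full variant returning values together with first and second derivatives at the new nodes (sketch, assuming the friendly C++ overload spline1dconvdiff2cubic(); illustrative data).

    #include <cstdio>
    #include "interpolation.h"
    using namespace alglib;

    int main()
    {
        real_1d_array x  = "[0.0, 1.0, 2.0, 3.0]";
        real_1d_array y  = "[0.0, 1.0, 4.0, 9.0]";
        real_1d_array x2 = "[0.5, 1.5, 2.5]";
        real_1d_array y2, d2, dd2;

        spline1dconvdiff2cubic(x, y, x2, y2, d2, dd2);  // S, S' and S'' at the new nodes
        printf("%.3f %.3f %.3f\n", y2(1), d2(1), dd2(1));
        return 0;
    }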
-- ALGLIB PROJECT -- Copyright 03.09.2010 by Bochkanov Sergey *************************************************************************/ void spline1dconvdiff2cubic(/* Real */ ae_vector* x, /* Real */ ae_vector* y, ae_int_t n, ae_int_t boundltype, double boundl, ae_int_t boundrtype, double boundr, /* Real */ ae_vector* x2, ae_int_t n2, /* Real */ ae_vector* y2, /* Real */ ae_vector* d2, /* Real */ ae_vector* dd2, ae_state *_state) { ae_frame _frame_block; ae_vector _x; ae_vector _y; ae_vector _x2; ae_vector a1; ae_vector a2; ae_vector a3; ae_vector b; ae_vector d; ae_vector dt; ae_vector p; ae_vector p2; ae_int_t i; ae_int_t ylen; double t; double t2; ae_frame_make(_state, &_frame_block); ae_vector_init_copy(&_x, x, _state); x = &_x; ae_vector_init_copy(&_y, y, _state); y = &_y; ae_vector_init_copy(&_x2, x2, _state); x2 = &_x2; ae_vector_clear(y2); ae_vector_clear(d2); ae_vector_clear(dd2); ae_vector_init(&a1, 0, DT_REAL, _state); ae_vector_init(&a2, 0, DT_REAL, _state); ae_vector_init(&a3, 0, DT_REAL, _state); ae_vector_init(&b, 0, DT_REAL, _state); ae_vector_init(&d, 0, DT_REAL, _state); ae_vector_init(&dt, 0, DT_REAL, _state); ae_vector_init(&p, 0, DT_INT, _state); ae_vector_init(&p2, 0, DT_INT, _state); /* * check correctness of boundary conditions */ ae_assert(((boundltype==-1||boundltype==0)||boundltype==1)||boundltype==2, "Spline1DConvDiff2Cubic: incorrect BoundLType!", _state); ae_assert(((boundrtype==-1||boundrtype==0)||boundrtype==1)||boundrtype==2, "Spline1DConvDiff2Cubic: incorrect BoundRType!", _state); ae_assert((boundrtype==-1&&boundltype==-1)||(boundrtype!=-1&&boundltype!=-1), "Spline1DConvDiff2Cubic: incorrect BoundLType/BoundRType!", _state); if( boundltype==1||boundltype==2 ) { ae_assert(ae_isfinite(boundl, _state), "Spline1DConvDiff2Cubic: BoundL is infinite or NAN!", _state); } if( boundrtype==1||boundrtype==2 ) { ae_assert(ae_isfinite(boundr, _state), "Spline1DConvDiff2Cubic: BoundR is infinite or NAN!", _state); } /* * check lengths of arguments */ ae_assert(n>=2, "Spline1DConvDiff2Cubic: N<2!", _state); ae_assert(x->cnt>=n, "Spline1DConvDiff2Cubic: Length(X)cnt>=n, "Spline1DConvDiff2Cubic: Length(Y)=2, "Spline1DConvDiff2Cubic: N2<2!", _state); ae_assert(x2->cnt>=n2, "Spline1DConvDiff2Cubic: Length(X2)ptr.p_double[i]; apperiodicmap(&t, x->ptr.p_double[0], x->ptr.p_double[n-1], &t2, _state); x2->ptr.p_double[i] = t; } } spline1d_heapsortppoints(x2, &dt, &p2, n2, _state); /* * Now we've checked and preordered everything, so we: * * call internal GridDiff() function to get Hermite form of spline * * convert using internal Conv() function * * convert Y2 back to original order */ spline1d_spline1dgriddiffcubicinternal(x, y, n, boundltype, boundl, boundrtype, boundr, &d, &a1, &a2, &a3, &b, &dt, _state); spline1dconvdiffinternal(x, y, &d, n, x2, n2, y2, ae_true, d2, ae_true, dd2, ae_true, _state); ae_assert(dt.cnt>=n2, "Spline1DConvDiff2Cubic: internal error!", _state); for(i=0; i<=n2-1; i++) { dt.ptr.p_double[p2.ptr.p_int[i]] = y2->ptr.p_double[i]; } ae_v_move(&y2->ptr.p_double[0], 1, &dt.ptr.p_double[0], 1, ae_v_len(0,n2-1)); for(i=0; i<=n2-1; i++) { dt.ptr.p_double[p2.ptr.p_int[i]] = d2->ptr.p_double[i]; } ae_v_move(&d2->ptr.p_double[0], 1, &dt.ptr.p_double[0], 1, ae_v_len(0,n2-1)); for(i=0; i<=n2-1; i++) { dt.ptr.p_double[p2.ptr.p_int[i]] = dd2->ptr.p_double[i]; } ae_v_move(&dd2->ptr.p_double[0], 1, &dt.ptr.p_double[0], 1, ae_v_len(0,n2-1)); ae_frame_leave(_state); } /************************************************************************* This subroutine builds 
Catmull-Rom spline interpolant. INPUT PARAMETERS: X - spline nodes, array[0..N-1]. Y - function values, array[0..N-1]. OPTIONAL PARAMETERS: N - points count: * N>=2 * if given, only first N points are used to build spline * if not given, automatically detected from X/Y sizes (len(X) must be equal to len(Y)) BoundType - boundary condition type: * -1 for periodic boundary condition * 0 for parabolically terminated spline (default) Tension - tension parameter: * tension=0 corresponds to classic Catmull-Rom spline (default) * 0=2, "Spline1DBuildCatmullRom: N<2!", _state); ae_assert(boundtype==-1||boundtype==0, "Spline1DBuildCatmullRom: incorrect BoundType!", _state); ae_assert(ae_fp_greater_eq(tension,(double)(0)), "Spline1DBuildCatmullRom: Tension<0!", _state); ae_assert(ae_fp_less_eq(tension,(double)(1)), "Spline1DBuildCatmullRom: Tension>1!", _state); ae_assert(x->cnt>=n, "Spline1DBuildCatmullRom: Length(X)cnt>=n, "Spline1DBuildCatmullRom: Length(Y)ptr.p_double[n-1] = y->ptr.p_double[0]; ae_vector_set_length(&d, n, _state); d.ptr.p_double[0] = (y->ptr.p_double[1]-y->ptr.p_double[n-2])/(2*(x->ptr.p_double[1]-x->ptr.p_double[0]+x->ptr.p_double[n-1]-x->ptr.p_double[n-2])); for(i=1; i<=n-2; i++) { d.ptr.p_double[i] = (1-tension)*(y->ptr.p_double[i+1]-y->ptr.p_double[i-1])/(x->ptr.p_double[i+1]-x->ptr.p_double[i-1]); } d.ptr.p_double[n-1] = d.ptr.p_double[0]; /* * Now problem is reduced to the cubic Hermite spline */ spline1dbuildhermite(x, y, &d, n, c, _state); c->periodic = ae_true; } else { /* * Non-periodic boundary conditions */ ae_vector_set_length(&d, n, _state); for(i=1; i<=n-2; i++) { d.ptr.p_double[i] = (1-tension)*(y->ptr.p_double[i+1]-y->ptr.p_double[i-1])/(x->ptr.p_double[i+1]-x->ptr.p_double[i-1]); } d.ptr.p_double[0] = 2*(y->ptr.p_double[1]-y->ptr.p_double[0])/(x->ptr.p_double[1]-x->ptr.p_double[0])-d.ptr.p_double[1]; d.ptr.p_double[n-1] = 2*(y->ptr.p_double[n-1]-y->ptr.p_double[n-2])/(x->ptr.p_double[n-1]-x->ptr.p_double[n-2])-d.ptr.p_double[n-2]; /* * Now problem is reduced to the cubic Hermite spline */ spline1dbuildhermite(x, y, &d, n, c, _state); } ae_frame_leave(_state); } /************************************************************************* This subroutine builds Hermite spline interpolant. INPUT PARAMETERS: X - spline nodes, array[0..N-1] Y - function values, array[0..N-1] D - derivatives, array[0..N-1] N - points count (optional): * N>=2 * if given, only first N points are used to build spline * if not given, automatically detected from X/Y sizes (len(X) must be equal to len(Y)) OUTPUT PARAMETERS: C - spline interpolant. ORDER OF POINTS Subroutine automatically sorts points, so caller may pass unsorted array. 
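Each segment of the resulting spline is stored in the power basis S(x) = C0 + C1*t + C2*t^2 + C3*t^3 with t = x-x[i]. The following standalone sketch (illustrative numbers, no ALGLIB dependency) checks that the per-interval coefficient formulas used by the implementation below reproduce the right endpoint's value and derivative:

    // Hedged standalone sketch: the per-interval coefficients computed by
    // Spline1DBuildHermite(), verified against the interval's right endpoint.
    // The numbers are illustrative only.
    #include <cstdio>

    int main()
    {
        // one interval [x0,x1] with values y0,y1 and derivatives d0,d1
        double x0 = 1.0, x1 = 3.0;
        double y0 = 2.0, y1 = -1.0;
        double d0 = 0.5, d1 = 4.0;

        double delta  = x1 - x0;
        double delta2 = delta*delta;
        double delta3 = delta*delta2;

        // same formulas as in spline1dbuildhermite()
        double c0 = y0;
        double c1 = d0;
        double c2 = (3*(y1 - y0) - 2*d0*delta - d1*delta)/delta2;
        double c3 = (2*(y0 - y1) + d0*delta + d1*delta)/delta3;

        // at t = delta the polynomial must reproduce y1 and d1
        double t  = delta;
        double s  = c0 + t*(c1 + t*(c2 + t*c3));
        double ds = c1 + 2*t*c2 + 3*t*t*c3;
        printf("S(x1)  = %.12f (expected %.12f)\n", s,  y1);
        printf("S'(x1) = %.12f (expected %.12f)\n", ds, d1);
        return 0;
    }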
-- ALGLIB PROJECT -- Copyright 23.06.2007 by Bochkanov Sergey *************************************************************************/ void spline1dbuildhermite(/* Real */ ae_vector* x, /* Real */ ae_vector* y, /* Real */ ae_vector* d, ae_int_t n, spline1dinterpolant* c, ae_state *_state) { ae_frame _frame_block; ae_vector _x; ae_vector _y; ae_vector _d; ae_int_t i; double delta; double delta2; double delta3; ae_frame_make(_state, &_frame_block); ae_vector_init_copy(&_x, x, _state); x = &_x; ae_vector_init_copy(&_y, y, _state); y = &_y; ae_vector_init_copy(&_d, d, _state); d = &_d; _spline1dinterpolant_clear(c); ae_assert(n>=2, "Spline1DBuildHermite: N<2!", _state); ae_assert(x->cnt>=n, "Spline1DBuildHermite: Length(X)cnt>=n, "Spline1DBuildHermite: Length(Y)cnt>=n, "Spline1DBuildHermite: Length(D)x, n, _state); ae_vector_set_length(&c->c, 4*(n-1)+2, _state); c->periodic = ae_false; c->k = 3; c->n = n; c->continuity = 1; for(i=0; i<=n-1; i++) { c->x.ptr.p_double[i] = x->ptr.p_double[i]; } for(i=0; i<=n-2; i++) { delta = x->ptr.p_double[i+1]-x->ptr.p_double[i]; delta2 = ae_sqr(delta, _state); delta3 = delta*delta2; c->c.ptr.p_double[4*i+0] = y->ptr.p_double[i]; c->c.ptr.p_double[4*i+1] = d->ptr.p_double[i]; c->c.ptr.p_double[4*i+2] = (3*(y->ptr.p_double[i+1]-y->ptr.p_double[i])-2*d->ptr.p_double[i]*delta-d->ptr.p_double[i+1]*delta)/delta2; c->c.ptr.p_double[4*i+3] = (2*(y->ptr.p_double[i]-y->ptr.p_double[i+1])+d->ptr.p_double[i]*delta+d->ptr.p_double[i+1]*delta)/delta3; } c->c.ptr.p_double[4*(n-1)+0] = y->ptr.p_double[n-1]; c->c.ptr.p_double[4*(n-1)+1] = d->ptr.p_double[n-1]; ae_frame_leave(_state); } /************************************************************************* This subroutine builds Akima spline interpolant INPUT PARAMETERS: X - spline nodes, array[0..N-1] Y - function values, array[0..N-1] N - points count (optional): * N>=2 * if given, only first N points are used to build spline * if not given, automatically detected from X/Y sizes (len(X) must be equal to len(Y)) OUTPUT PARAMETERS: C - spline interpolant ORDER OF POINTS Subroutine automatically sorts points, so caller may pass unsorted array. 
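For interior nodes the implementation below blends the divided differences diff[] with weights w[i] = |diff[i]-diff[i-1]|, which damps the influence of segments adjacent to abrupt slope changes. A standalone sketch of that slope rule on illustrative, step-like data (not taken from the ALGLIB sources):

    // Hedged standalone sketch: the Akima slope rule used by spline1dbuildakima()
    // for interior nodes i = 2..n-3.
    #include <cstdio>
    #include <cmath>

    int main()
    {
        const int n = 7;
        double x[n] = {0, 1, 2, 3, 4, 5, 6};
        double y[n] = {0, 0, 0, 1, 1, 1, 1};   // a step-like profile

        double diff[n-1], w[n-1];
        for(int i = 0; i < n-1; i++)
            diff[i] = (y[i+1] - y[i])/(x[i+1] - x[i]);
        for(int i = 1; i < n-1; i++)
            w[i] = std::fabs(diff[i] - diff[i-1]);

        // interior slopes, same formulas as in the implementation below
        for(int i = 2; i <= n-3; i++)
        {
            double d;
            if(w[i-1] + w[i+1] != 0.0)
                d = (w[i+1]*diff[i-1] + w[i-1]*diff[i])/(w[i+1] + w[i-1]);
            else
                d = ((x[i+1]-x[i])*diff[i-1] + (x[i]-x[i-1])*diff[i])/(x[i+1]-x[i-1]);
            printf("d[%d] = %.4f\n", i, d);
        }
        return 0;
    }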
-- ALGLIB PROJECT -- Copyright 24.06.2007 by Bochkanov Sergey *************************************************************************/ void spline1dbuildakima(/* Real */ ae_vector* x, /* Real */ ae_vector* y, ae_int_t n, spline1dinterpolant* c, ae_state *_state) { ae_frame _frame_block; ae_vector _x; ae_vector _y; ae_int_t i; ae_vector d; ae_vector w; ae_vector diff; ae_frame_make(_state, &_frame_block); ae_vector_init_copy(&_x, x, _state); x = &_x; ae_vector_init_copy(&_y, y, _state); y = &_y; _spline1dinterpolant_clear(c); ae_vector_init(&d, 0, DT_REAL, _state); ae_vector_init(&w, 0, DT_REAL, _state); ae_vector_init(&diff, 0, DT_REAL, _state); ae_assert(n>=2, "Spline1DBuildAkima: N<2!", _state); ae_assert(x->cnt>=n, "Spline1DBuildAkima: Length(X)cnt>=n, "Spline1DBuildAkima: Length(Y)ptr.p_double[i+1]-y->ptr.p_double[i])/(x->ptr.p_double[i+1]-x->ptr.p_double[i]); } for(i=1; i<=n-2; i++) { w.ptr.p_double[i] = ae_fabs(diff.ptr.p_double[i]-diff.ptr.p_double[i-1], _state); } /* * Prepare Hermite interpolation scheme */ ae_vector_set_length(&d, n, _state); for(i=2; i<=n-3; i++) { if( ae_fp_neq(ae_fabs(w.ptr.p_double[i-1], _state)+ae_fabs(w.ptr.p_double[i+1], _state),(double)(0)) ) { d.ptr.p_double[i] = (w.ptr.p_double[i+1]*diff.ptr.p_double[i-1]+w.ptr.p_double[i-1]*diff.ptr.p_double[i])/(w.ptr.p_double[i+1]+w.ptr.p_double[i-1]); } else { d.ptr.p_double[i] = ((x->ptr.p_double[i+1]-x->ptr.p_double[i])*diff.ptr.p_double[i-1]+(x->ptr.p_double[i]-x->ptr.p_double[i-1])*diff.ptr.p_double[i])/(x->ptr.p_double[i+1]-x->ptr.p_double[i-1]); } } d.ptr.p_double[0] = spline1d_diffthreepoint(x->ptr.p_double[0], x->ptr.p_double[0], y->ptr.p_double[0], x->ptr.p_double[1], y->ptr.p_double[1], x->ptr.p_double[2], y->ptr.p_double[2], _state); d.ptr.p_double[1] = spline1d_diffthreepoint(x->ptr.p_double[1], x->ptr.p_double[0], y->ptr.p_double[0], x->ptr.p_double[1], y->ptr.p_double[1], x->ptr.p_double[2], y->ptr.p_double[2], _state); d.ptr.p_double[n-2] = spline1d_diffthreepoint(x->ptr.p_double[n-2], x->ptr.p_double[n-3], y->ptr.p_double[n-3], x->ptr.p_double[n-2], y->ptr.p_double[n-2], x->ptr.p_double[n-1], y->ptr.p_double[n-1], _state); d.ptr.p_double[n-1] = spline1d_diffthreepoint(x->ptr.p_double[n-1], x->ptr.p_double[n-3], y->ptr.p_double[n-3], x->ptr.p_double[n-2], y->ptr.p_double[n-2], x->ptr.p_double[n-1], y->ptr.p_double[n-1], _state); /* * Build Akima spline using Hermite interpolation scheme */ spline1dbuildhermite(x, y, &d, n, c, _state); ae_frame_leave(_state); } /************************************************************************* This subroutine calculates the value of the spline at the given point X. 
INPUT PARAMETERS: C - spline interpolant X - point Result: S(x) -- ALGLIB PROJECT -- Copyright 23.06.2007 by Bochkanov Sergey *************************************************************************/ double spline1dcalc(spline1dinterpolant* c, double x, ae_state *_state) { ae_int_t l; ae_int_t r; ae_int_t m; double t; double result; ae_assert(c->k==3, "Spline1DCalc: internal error", _state); ae_assert(!ae_isinf(x, _state), "Spline1DCalc: infinite X!", _state); /* * special case: NaN */ if( ae_isnan(x, _state) ) { result = _state->v_nan; return result; } /* * correct if periodic */ if( c->periodic ) { apperiodicmap(&x, c->x.ptr.p_double[0], c->x.ptr.p_double[c->n-1], &t, _state); } /* * Binary search in the [ x[0], ..., x[n-2] ] (x[n-1] is not included) */ l = 0; r = c->n-2+1; while(l!=r-1) { m = (l+r)/2; if( c->x.ptr.p_double[m]>=x ) { r = m; } else { l = m; } } /* * Interpolation */ x = x-c->x.ptr.p_double[l]; m = 4*l; result = c->c.ptr.p_double[m]+x*(c->c.ptr.p_double[m+1]+x*(c->c.ptr.p_double[m+2]+x*c->c.ptr.p_double[m+3])); return result; } /************************************************************************* This subroutine differentiates the spline. INPUT PARAMETERS: C - spline interpolant. X - point Result: S - S(x) DS - S'(x) D2S - S''(x) -- ALGLIB PROJECT -- Copyright 24.06.2007 by Bochkanov Sergey *************************************************************************/ void spline1ddiff(spline1dinterpolant* c, double x, double* s, double* ds, double* d2s, ae_state *_state) { ae_int_t l; ae_int_t r; ae_int_t m; double t; *s = 0; *ds = 0; *d2s = 0; ae_assert(c->k==3, "Spline1DDiff: internal error", _state); ae_assert(!ae_isinf(x, _state), "Spline1DDiff: infinite X!", _state); /* * special case: NaN */ if( ae_isnan(x, _state) ) { *s = _state->v_nan; *ds = _state->v_nan; *d2s = _state->v_nan; return; } /* * correct if periodic */ if( c->periodic ) { apperiodicmap(&x, c->x.ptr.p_double[0], c->x.ptr.p_double[c->n-1], &t, _state); } /* * Binary search */ l = 0; r = c->n-2+1; while(l!=r-1) { m = (l+r)/2; if( c->x.ptr.p_double[m]>=x ) { r = m; } else { l = m; } } /* * Differentiation */ x = x-c->x.ptr.p_double[l]; m = 4*l; *s = c->c.ptr.p_double[m]+x*(c->c.ptr.p_double[m+1]+x*(c->c.ptr.p_double[m+2]+x*c->c.ptr.p_double[m+3])); *ds = c->c.ptr.p_double[m+1]+2*x*c->c.ptr.p_double[m+2]+3*ae_sqr(x, _state)*c->c.ptr.p_double[m+3]; *d2s = 2*c->c.ptr.p_double[m+2]+6*x*c->c.ptr.p_double[m+3]; } /************************************************************************* This subroutine makes the copy of the spline. INPUT PARAMETERS: C - spline interpolant. Result: CC - spline copy -- ALGLIB PROJECT -- Copyright 29.06.2007 by Bochkanov Sergey *************************************************************************/ void spline1dcopy(spline1dinterpolant* c, spline1dinterpolant* cc, ae_state *_state) { ae_int_t s; _spline1dinterpolant_clear(cc); cc->periodic = c->periodic; cc->n = c->n; cc->k = c->k; cc->continuity = c->continuity; ae_vector_set_length(&cc->x, cc->n, _state); ae_v_move(&cc->x.ptr.p_double[0], 1, &c->x.ptr.p_double[0], 1, ae_v_len(0,cc->n-1)); s = c->c.cnt; ae_vector_set_length(&cc->c, s, _state); ae_v_move(&cc->c.ptr.p_double[0], 1, &c->c.ptr.p_double[0], 1, ae_v_len(0,s-1)); } /************************************************************************* This subroutine unpacks the spline into the coefficients table. INPUT PARAMETERS: C - spline interpolant. X - point OUTPUT PARAMETERS: Tbl - coefficients table, unpacked format, array[0..N-2, 0..5]. 
For I = 0...N-2: Tbl[I,0] = X[i] Tbl[I,1] = X[i+1] Tbl[I,2] = C0 Tbl[I,3] = C1 Tbl[I,4] = C2 Tbl[I,5] = C3 On [x[i], x[i+1]] spline is equals to: S(x) = C0 + C1*t + C2*t^2 + C3*t^3 t = x-x[i] NOTE: You can rebuild spline with Spline1DBuildHermite() function, which accepts as inputs function values and derivatives at nodes, which are easy to calculate when you have coefficients. -- ALGLIB PROJECT -- Copyright 29.06.2007 by Bochkanov Sergey *************************************************************************/ void spline1dunpack(spline1dinterpolant* c, ae_int_t* n, /* Real */ ae_matrix* tbl, ae_state *_state) { ae_int_t i; ae_int_t j; *n = 0; ae_matrix_clear(tbl); ae_matrix_set_length(tbl, c->n-2+1, 2+c->k+1, _state); *n = c->n; /* * Fill */ for(i=0; i<=*n-2; i++) { tbl->ptr.pp_double[i][0] = c->x.ptr.p_double[i]; tbl->ptr.pp_double[i][1] = c->x.ptr.p_double[i+1]; for(j=0; j<=c->k; j++) { tbl->ptr.pp_double[i][2+j] = c->c.ptr.p_double[(c->k+1)*i+j]; } } } /************************************************************************* This subroutine performs linear transformation of the spline argument. INPUT PARAMETERS: C - spline interpolant. A, B- transformation coefficients: x = A*t + B Result: C - transformed spline -- ALGLIB PROJECT -- Copyright 30.06.2007 by Bochkanov Sergey *************************************************************************/ void spline1dlintransx(spline1dinterpolant* c, double a, double b, ae_state *_state) { ae_frame _frame_block; ae_int_t i; ae_int_t n; double v; double dv; double d2v; ae_vector x; ae_vector y; ae_vector d; ae_bool isperiodic; ae_int_t contval; ae_frame_make(_state, &_frame_block); ae_vector_init(&x, 0, DT_REAL, _state); ae_vector_init(&y, 0, DT_REAL, _state); ae_vector_init(&d, 0, DT_REAL, _state); ae_assert(c->k==3, "Spline1DLinTransX: internal error", _state); n = c->n; ae_vector_set_length(&x, n, _state); ae_vector_set_length(&y, n, _state); ae_vector_set_length(&d, n, _state); /* * Unpack, X, Y, dY/dX. * Scale and pack with Spline1DBuildHermite again. */ if( ae_fp_eq(a,(double)(0)) ) { /* * Special case: A=0 */ v = spline1dcalc(c, b, _state); for(i=0; i<=n-1; i++) { x.ptr.p_double[i] = c->x.ptr.p_double[i]; y.ptr.p_double[i] = v; d.ptr.p_double[i] = 0.0; } } else { /* * General case, A<>0 */ for(i=0; i<=n-1; i++) { x.ptr.p_double[i] = c->x.ptr.p_double[i]; spline1ddiff(c, x.ptr.p_double[i], &v, &dv, &d2v, _state); x.ptr.p_double[i] = (x.ptr.p_double[i]-b)/a; y.ptr.p_double[i] = v; d.ptr.p_double[i] = a*dv; } } isperiodic = c->periodic; contval = c->continuity; if( contval>0 ) { spline1dbuildhermite(&x, &y, &d, n, c, _state); } else { spline1dbuildlinear(&x, &y, n, c, _state); } c->periodic = isperiodic; c->continuity = contval; ae_frame_leave(_state); } /************************************************************************* This subroutine performs linear transformation of the spline. INPUT PARAMETERS: C - spline interpolant. 
A, B- transformation coefficients: S2(x) = A*S(x) + B Result: C - transformed spline -- ALGLIB PROJECT -- Copyright 30.06.2007 by Bochkanov Sergey *************************************************************************/ void spline1dlintransy(spline1dinterpolant* c, double a, double b, ae_state *_state) { ae_int_t i; ae_int_t j; ae_int_t n; ae_assert(c->k==3, "Spline1DLinTransX: internal error", _state); n = c->n; for(i=0; i<=n-2; i++) { c->c.ptr.p_double[4*i] = a*c->c.ptr.p_double[4*i]+b; for(j=1; j<=3; j++) { c->c.ptr.p_double[4*i+j] = a*c->c.ptr.p_double[4*i+j]; } } c->c.ptr.p_double[4*(n-1)+0] = a*c->c.ptr.p_double[4*(n-1)+0]+b; c->c.ptr.p_double[4*(n-1)+1] = a*c->c.ptr.p_double[4*(n-1)+1]; } /************************************************************************* This subroutine integrates the spline. INPUT PARAMETERS: C - spline interpolant. X - right bound of the integration interval [a, x], here 'a' denotes min(x[]) Result: integral(S(t)dt,a,x) -- ALGLIB PROJECT -- Copyright 23.06.2007 by Bochkanov Sergey *************************************************************************/ double spline1dintegrate(spline1dinterpolant* c, double x, ae_state *_state) { ae_int_t n; ae_int_t i; ae_int_t j; ae_int_t l; ae_int_t r; ae_int_t m; double w; double v; double t; double intab; double additionalterm; double result; n = c->n; /* * Periodic splines require special treatment. We make * following transformation: * * integral(S(t)dt,A,X) = integral(S(t)dt,A,Z)+AdditionalTerm * * here X may lie outside of [A,B], Z lies strictly in [A,B], * AdditionalTerm is equals to integral(S(t)dt,A,B) times some * integer number (may be zero). */ if( c->periodic&&(ae_fp_less(x,c->x.ptr.p_double[0])||ae_fp_greater(x,c->x.ptr.p_double[c->n-1])) ) { /* * compute integral(S(x)dx,A,B) */ intab = (double)(0); for(i=0; i<=c->n-2; i++) { w = c->x.ptr.p_double[i+1]-c->x.ptr.p_double[i]; m = (c->k+1)*i; intab = intab+c->c.ptr.p_double[m]*w; v = w; for(j=1; j<=c->k; j++) { v = v*w; intab = intab+c->c.ptr.p_double[m+j]*v/(j+1); } } /* * map X into [A,B] */ apperiodicmap(&x, c->x.ptr.p_double[0], c->x.ptr.p_double[c->n-1], &t, _state); additionalterm = t*intab; } else { additionalterm = (double)(0); } /* * Binary search in the [ x[0], ..., x[n-2] ] (x[n-1] is not included) */ l = 0; r = n-2+1; while(l!=r-1) { m = (l+r)/2; if( ae_fp_greater_eq(c->x.ptr.p_double[m],x) ) { r = m; } else { l = m; } } /* * Integration */ result = (double)(0); for(i=0; i<=l-1; i++) { w = c->x.ptr.p_double[i+1]-c->x.ptr.p_double[i]; m = (c->k+1)*i; result = result+c->c.ptr.p_double[m]*w; v = w; for(j=1; j<=c->k; j++) { v = v*w; result = result+c->c.ptr.p_double[m+j]*v/(j+1); } } w = x-c->x.ptr.p_double[l]; m = (c->k+1)*l; v = w; result = result+c->c.ptr.p_double[m]*w; for(j=1; j<=c->k; j++) { v = v*w; result = result+c->c.ptr.p_double[m+j]*v/(j+1); } result = result+additionalterm; return result; } /************************************************************************* Internal version of Spline1DConvDiff Converts from Hermite spline given by grid XOld to new grid X2 INPUT PARAMETERS: XOld - old grid YOld - values at old grid DOld - first derivative at old grid N - grid size X2 - new grid N2 - new grid size Y - possibly preallocated output array (reallocate if too small) NeedY - do we need Y? D1 - possibly preallocated output array (reallocate if too small) NeedD1 - do we need D1? D2 - possibly preallocated output array (reallocate if too small) NeedD2 - do we need D1? 
OUTPUT ARRAYS: Y - values, if needed D1 - first derivative, if needed D2 - second derivative, if needed -- ALGLIB PROJECT -- Copyright 03.09.2010 by Bochkanov Sergey *************************************************************************/ void spline1dconvdiffinternal(/* Real */ ae_vector* xold, /* Real */ ae_vector* yold, /* Real */ ae_vector* dold, ae_int_t n, /* Real */ ae_vector* x2, ae_int_t n2, /* Real */ ae_vector* y, ae_bool needy, /* Real */ ae_vector* d1, ae_bool needd1, /* Real */ ae_vector* d2, ae_bool needd2, ae_state *_state) { ae_int_t intervalindex; ae_int_t pointindex; ae_bool havetoadvance; double c0; double c1; double c2; double c3; double a; double b; double w; double w2; double w3; double fa; double fb; double da; double db; double t; /* * Prepare space */ if( needy&&y->cntcntcnt=n2 ) { break; } t = x2->ptr.p_double[pointindex]; /* * do we need to advance interval? */ havetoadvance = ae_false; if( intervalindex==-1 ) { havetoadvance = ae_true; } else { if( intervalindexptr.p_double[intervalindex]; b = xold->ptr.p_double[intervalindex+1]; w = b-a; w2 = w*w; w3 = w*w2; fa = yold->ptr.p_double[intervalindex]; fb = yold->ptr.p_double[intervalindex+1]; da = dold->ptr.p_double[intervalindex]; db = dold->ptr.p_double[intervalindex+1]; c0 = fa; c1 = da; c2 = (3*(fb-fa)-2*da*w-db*w)/w2; c3 = (2*(fa-fb)+da*w+db*w)/w3; continue; } /* * Calculate spline and its derivatives using power basis */ t = t-a; if( needy ) { y->ptr.p_double[pointindex] = c0+t*(c1+t*(c2+t*c3)); } if( needd1 ) { d1->ptr.p_double[pointindex] = c1+2*t*c2+3*t*t*c3; } if( needd2 ) { d2->ptr.p_double[pointindex] = 2*c2+6*t*c3; } pointindex = pointindex+1; } } /************************************************************************* This function finds all roots and extrema of the spline S(x) defined at [A,B] (interval which contains spline nodes). It does not extrapolates function, so roots and extrema located outside of [A,B] will not be found. It returns all isolated (including multiple) roots and extrema. INPUT PARAMETERS C - spline interpolant OUTPUT PARAMETERS R - array[NR], contains roots of the spline. In case there is no roots, this array has zero length. NR - number of roots, >=0 DR - is set to True in case there is at least one interval where spline is just a zero constant. Such degenerate cases are not reported in the R/NR E - array[NE], contains extrema (maximums/minimums) of the spline. In case there is no extrema, this array has zero length. ET - array[NE], extrema types: * ET[i]>0 in case I-th extrema is a minimum * ET[i]<0 in case I-th extrema is a maximum NE - number of extrema, >=0 DE - is set to True in case there is at least one interval where spline is a constant. Such degenerate cases are not reported in the E/NE. NOTES: 1. This function does NOT report following kinds of roots: * intervals where function is constantly zero * roots which are outside of [A,B] (note: it CAN return A or B) 2. 
This function does NOT report following kinds of extrema: * intervals where function is a constant * extrema which are outside of (A,B) (note: it WON'T return A or B) -- ALGLIB PROJECT -- Copyright 26.09.2011 by Bochkanov Sergey *************************************************************************/ void spline1drootsandextrema(spline1dinterpolant* c, /* Real */ ae_vector* r, ae_int_t* nr, ae_bool* dr, /* Real */ ae_vector* e, /* Integer */ ae_vector* et, ae_int_t* ne, ae_bool* de, ae_state *_state) { ae_frame _frame_block; double pl; double ml; double pll; double pr; double mr; ae_vector tr; ae_vector tmpr; ae_vector tmpe; ae_vector tmpet; ae_vector tmpc; double x0; double x1; double x2; double ex0; double ex1; ae_int_t tne; ae_int_t tnr; ae_int_t i; ae_int_t j; ae_bool nstep; ae_frame_make(_state, &_frame_block); ae_vector_clear(r); *nr = 0; *dr = ae_false; ae_vector_clear(e); ae_vector_clear(et); *ne = 0; *de = ae_false; ae_vector_init(&tr, 0, DT_REAL, _state); ae_vector_init(&tmpr, 0, DT_REAL, _state); ae_vector_init(&tmpe, 0, DT_REAL, _state); ae_vector_init(&tmpet, 0, DT_INT, _state); ae_vector_init(&tmpc, 0, DT_REAL, _state); /* *exception handling */ ae_assert(c->k==3, "Spline1DRootsAndExtrema : incorrect parameter C.K!", _state); ae_assert(c->continuity>=0, "Spline1DRootsAndExtrema : parameter C.Continuity must not be less than 0!", _state); /* *initialization of variable */ *nr = 0; *ne = 0; *dr = ae_false; *de = ae_false; nstep = ae_true; /* *consider case, when C.Continuty=0 */ if( c->continuity==0 ) { /* *allocation for auxiliary arrays *'TmpR ' - it stores a time value for roots *'TmpE ' - it stores a time value for extremums *'TmpET '- it stores a time value for extremums type */ rvectorsetlengthatleast(&tmpr, 3*(c->n-1), _state); rvectorsetlengthatleast(&tmpe, 2*(c->n-1), _state); ivectorsetlengthatleast(&tmpet, 2*(c->n-1), _state); /* *start calculating */ for(i=0; i<=c->n-2; i++) { /* *initialization pL, mL, pR, mR */ pl = c->c.ptr.p_double[4*i]; ml = c->c.ptr.p_double[4*i+1]; pr = c->c.ptr.p_double[4*(i+1)]; mr = c->c.ptr.p_double[4*i+1]+2*c->c.ptr.p_double[4*i+2]*(c->x.ptr.p_double[i+1]-c->x.ptr.p_double[i])+3*c->c.ptr.p_double[4*i+3]*(c->x.ptr.p_double[i+1]-c->x.ptr.p_double[i])*(c->x.ptr.p_double[i+1]-c->x.ptr.p_double[i]); /* *pre-searching roots and extremums */ solvecubicpolinom(pl, ml, pr, mr, c->x.ptr.p_double[i], c->x.ptr.p_double[i+1], &x0, &x1, &x2, &ex0, &ex1, &tnr, &tne, &tr, _state); *dr = *dr||tnr==-1; *de = *de||tne==-1; /* *searching of roots */ if( tnr==1&&nstep ) { /* *is there roots? */ if( *nr>0 ) { /* *is a next root equal a previous root? 
*if is't, then write new root */ if( ae_fp_neq(x0,tmpr.ptr.p_double[*nr-1]) ) { tmpr.ptr.p_double[*nr] = x0; *nr = *nr+1; } } else { /* *write a first root */ tmpr.ptr.p_double[*nr] = x0; *nr = *nr+1; } } else { /* *case when function at a segment identically to zero *then we have to clear a root, if the one located on a *constant segment */ if( tnr==-1 ) { /* *safe state variable as constant */ if( nstep ) { nstep = ae_false; } /* *clear the root, if there is */ if( *nr>0 ) { if( ae_fp_eq(c->x.ptr.p_double[i],tmpr.ptr.p_double[*nr-1]) ) { *nr = *nr-1; } } /* *change state for 'DR' */ if( !*dr ) { *dr = ae_true; } } else { nstep = ae_true; } } /* *searching of extremums */ if( i>0 ) { pll = c->c.ptr.p_double[4*(i-1)]; /* *if pL=pLL or pL=pR then */ if( tne==-1 ) { if( !*de ) { *de = ae_true; } } else { if( ae_fp_greater(pl,pll)&&ae_fp_greater(pl,pr) ) { /* *maximum */ tmpet.ptr.p_int[*ne] = -1; tmpe.ptr.p_double[*ne] = c->x.ptr.p_double[i]; *ne = *ne+1; } else { if( ae_fp_less(pl,pll)&&ae_fp_less(pl,pr) ) { /* *minimum */ tmpet.ptr.p_int[*ne] = 1; tmpe.ptr.p_double[*ne] = c->x.ptr.p_double[i]; *ne = *ne+1; } } } } } /* *write final result */ rvectorsetlengthatleast(r, *nr, _state); rvectorsetlengthatleast(e, *ne, _state); ivectorsetlengthatleast(et, *ne, _state); /* *write roots */ for(i=0; i<=*nr-1; i++) { r->ptr.p_double[i] = tmpr.ptr.p_double[i]; } /* *write extremums and their types */ for(i=0; i<=*ne-1; i++) { e->ptr.p_double[i] = tmpe.ptr.p_double[i]; et->ptr.p_int[i] = tmpet.ptr.p_int[i]; } } else { /* *case, when C.Continuity>=1 *'TmpR ' - it stores a time value for roots *'TmpC' - it stores a time value for extremums and *their function value (TmpC={EX0,F(EX0), EX1,F(EX1), ..., EXn,F(EXn)};) *'TmpE' - it stores a time value for extremums only *'TmpET'- it stores a time value for extremums type */ rvectorsetlengthatleast(&tmpr, 2*c->n-1, _state); rvectorsetlengthatleast(&tmpc, 4*c->n, _state); rvectorsetlengthatleast(&tmpe, 2*c->n, _state); ivectorsetlengthatleast(&tmpet, 2*c->n, _state); /* *start calculating */ for(i=0; i<=c->n-2; i++) { /* *we calculate pL,mL, pR,mR as Fi+1(F'i+1) at left border */ pl = c->c.ptr.p_double[4*i]; ml = c->c.ptr.p_double[4*i+1]; pr = c->c.ptr.p_double[4*(i+1)]; mr = c->c.ptr.p_double[4*(i+1)+1]; /* *calculating roots and extremums at [X[i],X[i+1]] */ solvecubicpolinom(pl, ml, pr, mr, c->x.ptr.p_double[i], c->x.ptr.p_double[i+1], &x0, &x1, &x2, &ex0, &ex1, &tnr, &tne, &tr, _state); /* *searching roots */ if( tnr>0 ) { /* *re-init tR */ if( tnr>=1 ) { tr.ptr.p_double[0] = x0; } if( tnr>=2 ) { tr.ptr.p_double[1] = x1; } if( tnr==3 ) { tr.ptr.p_double[2] = x2; } /* *start root selection */ if( *nr>0 ) { if( ae_fp_neq(tmpr.ptr.p_double[*nr-1],x0) ) { /* *previous segment was't constant identical zero */ if( nstep ) { for(j=0; j<=tnr-1; j++) { tmpr.ptr.p_double[*nr+j] = tr.ptr.p_double[j]; } *nr = *nr+tnr; } else { /* *previous segment was constant identical zero *and we must ignore [NR+j-1] root */ for(j=1; j<=tnr-1; j++) { tmpr.ptr.p_double[*nr+j-1] = tr.ptr.p_double[j]; } *nr = *nr+tnr-1; nstep = ae_true; } } else { for(j=1; j<=tnr-1; j++) { tmpr.ptr.p_double[*nr+j-1] = tr.ptr.p_double[j]; } *nr = *nr+tnr-1; } } else { /* *write first root */ for(j=0; j<=tnr-1; j++) { tmpr.ptr.p_double[*nr+j] = tr.ptr.p_double[j]; } *nr = *nr+tnr; } } else { if( tnr==-1 ) { /* *decrement 'NR' if at previous step was writen a root *(previous segment identical zero) */ if( *nr>0&&nstep ) { *nr = *nr-1; } /* *previous segment is't constant */ if( nstep ) { nstep = 
ae_false; } /* *rewrite 'DR' */ if( !*dr ) { *dr = ae_true; } } } /* *searching extremums *write all term like extremums */ if( tne==1 ) { if( *ne>0 ) { /* *just ignore identical extremums *because he must be one */ if( ae_fp_neq(tmpc.ptr.p_double[*ne-2],ex0) ) { tmpc.ptr.p_double[*ne] = ex0; tmpc.ptr.p_double[*ne+1] = c->c.ptr.p_double[4*i]+c->c.ptr.p_double[4*i+1]*(ex0-c->x.ptr.p_double[i])+c->c.ptr.p_double[4*i+2]*(ex0-c->x.ptr.p_double[i])*(ex0-c->x.ptr.p_double[i])+c->c.ptr.p_double[4*i+3]*(ex0-c->x.ptr.p_double[i])*(ex0-c->x.ptr.p_double[i])*(ex0-c->x.ptr.p_double[i]); *ne = *ne+2; } } else { /* *write first extremum and it function value */ tmpc.ptr.p_double[*ne] = ex0; tmpc.ptr.p_double[*ne+1] = c->c.ptr.p_double[4*i]+c->c.ptr.p_double[4*i+1]*(ex0-c->x.ptr.p_double[i])+c->c.ptr.p_double[4*i+2]*(ex0-c->x.ptr.p_double[i])*(ex0-c->x.ptr.p_double[i])+c->c.ptr.p_double[4*i+3]*(ex0-c->x.ptr.p_double[i])*(ex0-c->x.ptr.p_double[i])*(ex0-c->x.ptr.p_double[i]); *ne = *ne+2; } } else { if( tne==2 ) { if( *ne>0 ) { /* *ignore identical extremum */ if( ae_fp_neq(tmpc.ptr.p_double[*ne-2],ex0) ) { tmpc.ptr.p_double[*ne] = ex0; tmpc.ptr.p_double[*ne+1] = c->c.ptr.p_double[4*i]+c->c.ptr.p_double[4*i+1]*(ex0-c->x.ptr.p_double[i])+c->c.ptr.p_double[4*i+2]*(ex0-c->x.ptr.p_double[i])*(ex0-c->x.ptr.p_double[i])+c->c.ptr.p_double[4*i+3]*(ex0-c->x.ptr.p_double[i])*(ex0-c->x.ptr.p_double[i])*(ex0-c->x.ptr.p_double[i]); *ne = *ne+2; } } else { /* *write first extremum */ tmpc.ptr.p_double[*ne] = ex0; tmpc.ptr.p_double[*ne+1] = c->c.ptr.p_double[4*i]+c->c.ptr.p_double[4*i+1]*(ex0-c->x.ptr.p_double[i])+c->c.ptr.p_double[4*i+2]*(ex0-c->x.ptr.p_double[i])*(ex0-c->x.ptr.p_double[i])+c->c.ptr.p_double[4*i+3]*(ex0-c->x.ptr.p_double[i])*(ex0-c->x.ptr.p_double[i])*(ex0-c->x.ptr.p_double[i]); *ne = *ne+2; } /* *write second extremum */ tmpc.ptr.p_double[*ne] = ex1; tmpc.ptr.p_double[*ne+1] = c->c.ptr.p_double[4*i]+c->c.ptr.p_double[4*i+1]*(ex1-c->x.ptr.p_double[i])+c->c.ptr.p_double[4*i+2]*(ex1-c->x.ptr.p_double[i])*(ex1-c->x.ptr.p_double[i])+c->c.ptr.p_double[4*i+3]*(ex1-c->x.ptr.p_double[i])*(ex1-c->x.ptr.p_double[i])*(ex1-c->x.ptr.p_double[i]); *ne = *ne+2; } else { if( tne==-1 ) { if( !*de ) { *de = ae_true; } } } } } /* *checking of arrays *get number of extremums (tNe=NE/2) *initialize pL as value F0(X[0]) and *initialize pR as value Fn-1(X[N]) */ tne = *ne/2; *ne = 0; pl = c->c.ptr.p_double[0]; pr = c->c.ptr.p_double[4*(c->n-1)]; for(i=0; i<=tne-1; i++) { if( i>0&&ix.ptr.p_double[0]) ) { if( ae_fp_greater(tmpc.ptr.p_double[2*i+1],pl)&&ae_fp_greater(tmpc.ptr.p_double[2*i+1],tmpc.ptr.p_double[2*(i+1)+1]) ) { /* *maximum */ tmpe.ptr.p_double[*ne] = tmpc.ptr.p_double[2*i]; tmpet.ptr.p_int[*ne] = -1; *ne = *ne+1; } else { if( ae_fp_less(tmpc.ptr.p_double[2*i+1],pl)&&ae_fp_less(tmpc.ptr.p_double[2*i+1],tmpc.ptr.p_double[2*(i+1)+1]) ) { /* *minimum */ tmpe.ptr.p_double[*ne] = tmpc.ptr.p_double[2*i]; tmpet.ptr.p_int[*ne] = 1; *ne = *ne+1; } } } } else { if( i==tne-1 ) { if( ae_fp_neq(tmpc.ptr.p_double[2*i],c->x.ptr.p_double[c->n-1]) ) { if( ae_fp_greater(tmpc.ptr.p_double[2*i+1],tmpc.ptr.p_double[2*(i-1)+1])&&ae_fp_greater(tmpc.ptr.p_double[2*i+1],pr) ) { /* *maximum */ tmpe.ptr.p_double[*ne] = tmpc.ptr.p_double[2*i]; tmpet.ptr.p_int[*ne] = -1; *ne = *ne+1; } else { if( ae_fp_less(tmpc.ptr.p_double[2*i+1],tmpc.ptr.p_double[2*(i-1)+1])&&ae_fp_less(tmpc.ptr.p_double[2*i+1],pr) ) { /* *minimum */ tmpe.ptr.p_double[*ne] = tmpc.ptr.p_double[2*i]; tmpet.ptr.p_int[*ne] = 1; *ne = *ne+1; } } } } } } } /* *final results 
*allocate R, E, ET */ rvectorsetlengthatleast(r, *nr, _state); rvectorsetlengthatleast(e, *ne, _state); ivectorsetlengthatleast(et, *ne, _state); /* *write result for extremus and their types */ for(i=0; i<=*ne-1; i++) { e->ptr.p_double[i] = tmpe.ptr.p_double[i]; et->ptr.p_int[i] = tmpet.ptr.p_int[i]; } /* *write result for roots */ for(i=0; i<=*nr-1; i++) { r->ptr.p_double[i] = tmpr.ptr.p_double[i]; } } ae_frame_leave(_state); } /************************************************************************* Internal subroutine. Heap sort. *************************************************************************/ void heapsortdpoints(/* Real */ ae_vector* x, /* Real */ ae_vector* y, /* Real */ ae_vector* d, ae_int_t n, ae_state *_state) { ae_frame _frame_block; ae_vector rbuf; ae_vector ibuf; ae_vector rbuf2; ae_vector ibuf2; ae_int_t i; ae_frame_make(_state, &_frame_block); ae_vector_init(&rbuf, 0, DT_REAL, _state); ae_vector_init(&ibuf, 0, DT_INT, _state); ae_vector_init(&rbuf2, 0, DT_REAL, _state); ae_vector_init(&ibuf2, 0, DT_INT, _state); ae_vector_set_length(&ibuf, n, _state); ae_vector_set_length(&rbuf, n, _state); for(i=0; i<=n-1; i++) { ibuf.ptr.p_int[i] = i; } tagsortfasti(x, &ibuf, &rbuf2, &ibuf2, n, _state); for(i=0; i<=n-1; i++) { rbuf.ptr.p_double[i] = y->ptr.p_double[ibuf.ptr.p_int[i]]; } ae_v_move(&y->ptr.p_double[0], 1, &rbuf.ptr.p_double[0], 1, ae_v_len(0,n-1)); for(i=0; i<=n-1; i++) { rbuf.ptr.p_double[i] = d->ptr.p_double[ibuf.ptr.p_int[i]]; } ae_v_move(&d->ptr.p_double[0], 1, &rbuf.ptr.p_double[0], 1, ae_v_len(0,n-1)); ae_frame_leave(_state); } /************************************************************************* This procedure search roots of an quadratic equation inside [0;1] and it number of roots. INPUT PARAMETERS: P0 - value of a function at 0 M0 - value of a derivative at 0 P1 - value of a function at 1 M1 - value of a derivative at 1 OUTPUT PARAMETERS: X0 - first root of an equation X1 - second root of an equation NR - number of roots RESTRICTIONS OF PARAMETERS: Parameters for this procedure has't to be zero simultaneously. Is expected, that input polinom is't degenerate or constant identicaly ZERO. REMARK: The procedure always fill value for X1 and X2, even if it is't belongs to [0;1]. But first true root(even if existing one) is in X1. Number of roots is NR. 
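The quadratic A*t^2 + B*t + C assembled below is the derivative of the cubic Hermite segment H(t) on [0;1] with H(0)=P0, H'(0)=M0, H(1)=P1, H'(1)=M1 (the same basis as spline1d_hermitecalc() further below), so its roots in [0;1] are the segment's interior extrema. A standalone numerical check with illustrative data:

    // Hedged standalone sketch: verify that the coefficients assembled by
    // solvepolinom2() match the derivative of the Hermite basis form.
    #include <cstdio>
    #include <cmath>

    int main()
    {
        double p0 = 1.0, m0 = -2.0, p1 = 0.5, m1 = 3.0;   // illustrative segment data

        // coefficients as assembled by solvepolinom2()
        double A =  6*p0 + 3*m0 - 6*p1 + 3*m1;
        double B = -6*p0 - 4*m0 + 6*p1 - 2*m1;
        double C =  m0;

        for(double t = 0.0; t <= 1.0; t += 0.25)
        {
            // derivative of the Hermite basis form, as in spline1d_hermitecalc()
            double ds = -p0*6*t*(1-t) + m0*(1-t)*(1-3*t) + p1*6*t*(1-t) + m1*t*(3*t-2);
            double q  = A*t*t + B*t + C;
            printf("t=%.2f  H'(t)=%.6f  quadratic=%.6f  diff=%.1e\n",
                   t, ds, q, std::fabs(ds - q));
        }
        return 0;
    }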
-- ALGLIB PROJECT -- Copyright 26.09.2011 by Bochkanov Sergey *************************************************************************/ void solvepolinom2(double p0, double m0, double p1, double m1, double* x0, double* x1, ae_int_t* nr, ae_state *_state) { double a; double b; double c; double dd; double tmp; double exf; double extr; *x0 = 0; *x1 = 0; *nr = 0; /* *calculate parameters for equation: A, B and C */ a = 6*p0+3*m0-6*p1+3*m1; b = -6*p0-4*m0+6*p1-2*m1; c = m0; /* *check case, when A=0 *we are considering the linear equation */ if( ae_fp_eq(a,(double)(0)) ) { /* *B<>0 and root inside [0;1] *one root */ if( (ae_fp_neq(b,(double)(0))&&ae_sign(c, _state)*ae_sign(b, _state)<=0)&&ae_fp_greater_eq(ae_fabs(b, _state),ae_fabs(c, _state)) ) { *x0 = -c/b; *nr = 1; return; } else { *nr = 0; return; } } /* *consider case, when extremumu outside (0;1) *exist one root only */ if( ae_fp_less_eq(ae_fabs(2*a, _state),ae_fabs(b, _state))||ae_sign(b, _state)*ae_sign(a, _state)>=0 ) { if( ae_sign(m0, _state)*ae_sign(m1, _state)>0 ) { *nr = 0; return; } /* *consider case, when the one exist *same sign of derivative */ if( ae_sign(m0, _state)*ae_sign(m1, _state)<0 ) { *nr = 1; extr = -b/(2*a); dd = b*b-4*a*c; if( ae_fp_less(dd,(double)(0)) ) { return; } *x0 = (-b-ae_sqrt(dd, _state))/(2*a); *x1 = (-b+ae_sqrt(dd, _state))/(2*a); if( (ae_fp_greater_eq(extr,(double)(1))&&ae_fp_less_eq(*x1,extr))||(ae_fp_less_eq(extr,(double)(0))&&ae_fp_greater_eq(*x1,extr)) ) { *x0 = *x1; } return; } /* *consider case, when the one is 0 */ if( ae_fp_eq(m0,(double)(0)) ) { *x0 = (double)(0); *nr = 1; return; } if( ae_fp_eq(m1,(double)(0)) ) { *x0 = (double)(1); *nr = 1; return; } } else { /* *consider case, when both of derivatives is 0 */ if( ae_fp_eq(m0,(double)(0))&&ae_fp_eq(m1,(double)(0)) ) { *x0 = (double)(0); *x1 = (double)(1); *nr = 2; return; } /* *consider case, when derivative at 0 is 0, and derivative at 1 is't 0 */ if( ae_fp_eq(m0,(double)(0))&&ae_fp_neq(m1,(double)(0)) ) { dd = b*b-4*a*c; if( ae_fp_less(dd,(double)(0)) ) { *x0 = (double)(0); *nr = 1; return; } *x0 = (-b-ae_sqrt(dd, _state))/(2*a); *x1 = (-b+ae_sqrt(dd, _state))/(2*a); extr = -b/(2*a); exf = a*extr*extr+b*extr+c; if( ae_sign(exf, _state)*ae_sign(m1, _state)>0 ) { *x0 = (double)(0); *nr = 1; return; } else { if( ae_fp_greater(extr,*x0) ) { *x0 = (double)(0); } else { *x1 = (double)(0); } *nr = 2; /* *roots must placed ascending */ if( ae_fp_greater(*x0,*x1) ) { tmp = *x0; *x0 = *x1; *x1 = tmp; } return; } } if( ae_fp_eq(m1,(double)(0))&&ae_fp_neq(m0,(double)(0)) ) { dd = b*b-4*a*c; if( ae_fp_less(dd,(double)(0)) ) { *x0 = (double)(1); *nr = 1; return; } *x0 = (-b-ae_sqrt(dd, _state))/(2*a); *x1 = (-b+ae_sqrt(dd, _state))/(2*a); extr = -b/(2*a); exf = a*extr*extr+b*extr+c; if( ae_sign(exf, _state)*ae_sign(m0, _state)>0 ) { *x0 = (double)(1); *nr = 1; return; } else { if( ae_fp_less(extr,*x0) ) { *x0 = (double)(1); } else { *x1 = (double)(1); } *nr = 2; /* *roots must placed ascending */ if( ae_fp_greater(*x0,*x1) ) { tmp = *x0; *x0 = *x1; *x1 = tmp; } return; } } else { extr = -b/(2*a); exf = a*extr*extr+b*extr+c; if( ae_sign(exf, _state)*ae_sign(m0, _state)>0&&ae_sign(exf, _state)*ae_sign(m1, _state)>0 ) { *nr = 0; return; } dd = b*b-4*a*c; if( ae_fp_less(dd,(double)(0)) ) { *nr = 0; return; } *x0 = (-b-ae_sqrt(dd, _state))/(2*a); *x1 = (-b+ae_sqrt(dd, _state))/(2*a); /* *if EXF and m0, EXF and m1 has different signs, then equation has two roots */ if( ae_sign(exf, _state)*ae_sign(m0, _state)<0&&ae_sign(exf, _state)*ae_sign(m1, _state)<0 ) { 
*nr = 2; /* *roots must placed ascending */ if( ae_fp_greater(*x0,*x1) ) { tmp = *x0; *x0 = *x1; *x1 = tmp; } return; } else { *nr = 1; if( ae_sign(exf, _state)*ae_sign(m0, _state)<0 ) { if( ae_fp_less(*x1,extr) ) { *x0 = *x1; } return; } if( ae_sign(exf, _state)*ae_sign(m1, _state)<0 ) { if( ae_fp_greater(*x1,extr) ) { *x0 = *x1; } return; } } } } } /************************************************************************* This procedure search roots of an cubic equation inside [A;B], it number of roots and number of extremums. INPUT PARAMETERS: pA - value of a function at A mA - value of a derivative at A pB - value of a function at B mB - value of a derivative at B A0 - left border [A0;B0] B0 - right border [A0;B0] OUTPUT PARAMETERS: X0 - first root of an equation X1 - second root of an equation X2 - third root of an equation EX0 - first extremum of a function EX0 - second extremum of a function NR - number of roots NR - number of extrmums RESTRICTIONS OF PARAMETERS: Length of [A;B] must be positive and is't zero, i.e. A<>B and AB */ ae_assert(ae_fp_less(a,b), "\nSolveCubicPolinom: incorrect borders for [A;B]!\n", _state); /* *case 1 *function can be identicaly to ZERO */ if( ((ae_fp_eq(ma,(double)(0))&&ae_fp_eq(mb,(double)(0)))&&ae_fp_eq(pa,pb))&&ae_fp_eq(pa,(double)(0)) ) { *nr = -1; *ne = -1; return; } if( (ae_fp_eq(ma,(double)(0))&&ae_fp_eq(mb,(double)(0)))&&ae_fp_eq(pa,pb) ) { *nr = 0; *ne = -1; return; } tmpma = ma*(b-a); tmpmb = mb*(b-a); solvepolinom2(pa, tmpma, pb, tmpmb, ex0, ex1, ne, _state); *ex0 = spline1d_rescaleval((double)(0), (double)(1), a, b, *ex0, _state); *ex1 = spline1d_rescaleval((double)(0), (double)(1), a, b, *ex1, _state); /* *case 3.1 *no extremums at [A;B] */ if( *ne==0 ) { *nr = bisectmethod(pa, tmpma, pb, tmpmb, (double)(0), (double)(1), x0, _state); if( *nr==1 ) { *x0 = spline1d_rescaleval((double)(0), (double)(1), a, b, *x0, _state); } return; } /* *case 3.2 *one extremum */ if( *ne==1 ) { if( ae_fp_eq(*ex0,a)||ae_fp_eq(*ex0,b) ) { *nr = bisectmethod(pa, tmpma, pb, tmpmb, (double)(0), (double)(1), x0, _state); if( *nr==1 ) { *x0 = spline1d_rescaleval((double)(0), (double)(1), a, b, *x0, _state); } return; } else { *nr = 0; i = 0; tex0 = spline1d_rescaleval(a, b, (double)(0), (double)(1), *ex0, _state); *nr = bisectmethod(pa, tmpma, pb, tmpmb, (double)(0), tex0, x0, _state)+(*nr); if( *nr>i ) { tempdata->ptr.p_double[i] = spline1d_rescaleval((double)(0), tex0, a, *ex0, *x0, _state); i = i+1; } *nr = bisectmethod(pa, tmpma, pb, tmpmb, tex0, (double)(1), x0, _state)+(*nr); if( *nr>i ) { *x0 = spline1d_rescaleval(tex0, (double)(1), *ex0, b, *x0, _state); if( i>0 ) { if( ae_fp_neq(*x0,tempdata->ptr.p_double[i-1]) ) { tempdata->ptr.p_double[i] = *x0; i = i+1; } else { *nr = *nr-1; } } else { tempdata->ptr.p_double[i] = *x0; i = i+1; } } if( *nr>0 ) { *x0 = tempdata->ptr.p_double[0]; if( *nr>1 ) { *x1 = tempdata->ptr.p_double[1]; } return; } } return; } else { /* *case 3.3 *two extremums(or more, but it's impossible) * * *case 3.3.0 *both extremums at the border */ if( ae_fp_eq(*ex0,a)&&ae_fp_eq(*ex1,b) ) { *nr = bisectmethod(pa, tmpma, pb, tmpmb, (double)(0), (double)(1), x0, _state); if( *nr==1 ) { *x0 = spline1d_rescaleval((double)(0), (double)(1), a, b, *x0, _state); } return; } if( ae_fp_eq(*ex0,a)&&ae_fp_neq(*ex1,b) ) { *nr = 0; i = 0; tex1 = spline1d_rescaleval(a, b, (double)(0), (double)(1), *ex1, _state); *nr = bisectmethod(pa, tmpma, pb, tmpmb, (double)(0), tex1, x0, _state)+(*nr); if( *nr>i ) { tempdata->ptr.p_double[i] = 
spline1d_rescaleval((double)(0), tex1, a, *ex1, *x0, _state); i = i+1; } *nr = bisectmethod(pa, tmpma, pb, tmpmb, tex1, (double)(1), x0, _state)+(*nr); if( *nr>i ) { *x0 = spline1d_rescaleval(tex1, (double)(1), *ex1, b, *x0, _state); if( ae_fp_neq(*x0,tempdata->ptr.p_double[i-1]) ) { tempdata->ptr.p_double[i] = *x0; i = i+1; } else { *nr = *nr-1; } } if( *nr>0 ) { *x0 = tempdata->ptr.p_double[0]; if( *nr>1 ) { *x1 = tempdata->ptr.p_double[1]; } return; } } if( ae_fp_eq(*ex1,b)&&ae_fp_neq(*ex0,a) ) { *nr = 0; i = 0; tex0 = spline1d_rescaleval(a, b, (double)(0), (double)(1), *ex0, _state); *nr = bisectmethod(pa, tmpma, pb, tmpmb, (double)(0), tex0, x0, _state)+(*nr); if( *nr>i ) { tempdata->ptr.p_double[i] = spline1d_rescaleval((double)(0), tex0, a, *ex0, *x0, _state); i = i+1; } *nr = bisectmethod(pa, tmpma, pb, tmpmb, tex0, (double)(1), x0, _state)+(*nr); if( *nr>i ) { *x0 = spline1d_rescaleval(tex0, (double)(1), *ex0, b, *x0, _state); if( i>0 ) { if( ae_fp_neq(*x0,tempdata->ptr.p_double[i-1]) ) { tempdata->ptr.p_double[i] = *x0; i = i+1; } else { *nr = *nr-1; } } else { tempdata->ptr.p_double[i] = *x0; i = i+1; } } if( *nr>0 ) { *x0 = tempdata->ptr.p_double[0]; if( *nr>1 ) { *x1 = tempdata->ptr.p_double[1]; } return; } } else { /* *case 3.3.2 *both extremums inside (0;1) */ *nr = 0; i = 0; tex0 = spline1d_rescaleval(a, b, (double)(0), (double)(1), *ex0, _state); tex1 = spline1d_rescaleval(a, b, (double)(0), (double)(1), *ex1, _state); *nr = bisectmethod(pa, tmpma, pb, tmpmb, (double)(0), tex0, x0, _state)+(*nr); if( *nr>i ) { tempdata->ptr.p_double[i] = spline1d_rescaleval((double)(0), tex0, a, *ex0, *x0, _state); i = i+1; } *nr = bisectmethod(pa, tmpma, pb, tmpmb, tex0, tex1, x0, _state)+(*nr); if( *nr>i ) { *x0 = spline1d_rescaleval(tex0, tex1, *ex0, *ex1, *x0, _state); if( i>0 ) { if( ae_fp_neq(*x0,tempdata->ptr.p_double[i-1]) ) { tempdata->ptr.p_double[i] = *x0; i = i+1; } else { *nr = *nr-1; } } else { tempdata->ptr.p_double[i] = *x0; i = i+1; } } *nr = bisectmethod(pa, tmpma, pb, tmpmb, tex1, (double)(1), x0, _state)+(*nr); if( *nr>i ) { *x0 = spline1d_rescaleval(tex1, (double)(1), *ex1, b, *x0, _state); if( i>0 ) { if( ae_fp_neq(*x0,tempdata->ptr.p_double[i-1]) ) { tempdata->ptr.p_double[i] = *x0; i = i+1; } else { *nr = *nr-1; } } else { tempdata->ptr.p_double[i] = *x0; i = i+1; } } /* *write are found roots */ if( *nr>0 ) { *x0 = tempdata->ptr.p_double[0]; if( *nr>1 ) { *x1 = tempdata->ptr.p_double[1]; } if( *nr>2 ) { *x2 = tempdata->ptr.p_double[2]; } return; } } } } /************************************************************************* Function for searching a root at [A;B] by bisection method and return number of roots (0 or 1) INPUT PARAMETERS: pA - value of a function at A mA - value of a derivative at A pB - value of a function at B mB - value of a derivative at B A0 - left border [A0;B0] B0 - right border [A0;B0] RESTRICTIONS OF PARAMETERS: We assume, that B0>A0. REMARK: Assume, that exist one root only at [A;B], else function may be work incorrectly. The function dont check value A0,B0! 
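The routine below is a plain bisection on a bracketed sign change. A standalone sketch of the same idea on an illustrative cubic (not the Hermite-segment version used by ALGLIB):

    // Hedged standalone sketch of the bisection idea behind BisectMethod(): given
    // a continuous function with a single sign change on [a,b], halve the bracket
    // until its width falls below a tolerance.
    #include <cstdio>

    static double f(double t) { return t*t*t - 2*t - 5; }   // one real root near 2.0946

    int main()
    {
        double a = 1.0, b = 3.0;
        double fa = f(a), fb = f(b);
        if(fa*fb > 0) { printf("no bracketed root\n"); return 0; }

        double eps = 1e-12*(b - a);
        double m = a;
        while(b - a >= eps)
        {
            m = 0.5*(a + b);
            double fm = f(m);
            if(fm == 0.0) break;
            if(fa*fm < 0) { b = m; fb = fm; }
            else          { a = m; fa = fm; }
        }
        printf("root ~= %.12f\n", m);
        return 0;
    }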
-- ALGLIB PROJECT -- Copyright 26.09.2011 by Bochkanov Sergey *************************************************************************/ ae_int_t bisectmethod(double pa, double ma, double pb, double mb, double a, double b, double* x, ae_state *_state) { double vacuum; double eps; double a0; double b0; double m; double lf; double rf; double mf; ae_int_t result; *x = 0; /* *accuracy */ eps = 1000*(b-a)*ae_machineepsilon; /* *initialization left and right borders */ a0 = a; b0 = b; /* *initialize function value at 'A' and 'B' */ spline1d_hermitecalc(pa, ma, pb, mb, a, &lf, &vacuum, _state); spline1d_hermitecalc(pa, ma, pb, mb, b, &rf, &vacuum, _state); /* *check, that 'A' and 'B' are't roots, *and that root exist */ if( ae_sign(lf, _state)*ae_sign(rf, _state)>0 ) { result = 0; return result; } else { if( ae_fp_eq(lf,(double)(0)) ) { *x = a; result = 1; return result; } else { if( ae_fp_eq(rf,(double)(0)) ) { *x = b; result = 1; return result; } } } /* *searching a root */ do { m = (b0+a0)/2; spline1d_hermitecalc(pa, ma, pb, mb, a0, &lf, &vacuum, _state); spline1d_hermitecalc(pa, ma, pb, mb, b0, &rf, &vacuum, _state); spline1d_hermitecalc(pa, ma, pb, mb, m, &mf, &vacuum, _state); if( ae_sign(mf, _state)*ae_sign(lf, _state)<0 ) { b0 = m; } else { if( ae_sign(mf, _state)*ae_sign(rf, _state)<0 ) { a0 = m; } else { if( ae_fp_eq(lf,(double)(0)) ) { *x = a0; result = 1; return result; } if( ae_fp_eq(rf,(double)(0)) ) { *x = b0; result = 1; return result; } if( ae_fp_eq(mf,(double)(0)) ) { *x = m; result = 1; return result; } } } } while(ae_fp_greater_eq(ae_fabs(b0-a0, _state),eps)); *x = m; result = 1; return result; } /************************************************************************* This function builds monotone cubic Hermite interpolant. This interpolant is monotonic in [x(0),x(n-1)] and is constant outside of this interval. In case y[] form non-monotonic sequence, interpolant is piecewise monotonic. Say, for x=(0,1,2,3,4) and y=(0,1,2,1,0) interpolant will monotonically grow at [0..2] and monotonically decrease at [2..4]. INPUT PARAMETERS: X - spline nodes, array[0..N-1]. Subroutine automatically sorts points, so caller may pass unsorted array. Y - function values, array[0..N-1] N - the number of points(N>=2). OUTPUT PARAMETERS: C - spline interpolant. 
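The key step of the implementation below is a slope-limiting rule: for each segment the normalized slopes alpha = d[j]/delta and beta = d[j+1]/delta are scaled back onto the disk of radius 3 whenever they leave it, which is a standard sufficient condition for monotonicity of a cubic Hermite segment. A standalone sketch with illustrative numbers:

    // Hedged standalone sketch of the slope limiting used by
    // spline1dbuildmonotone(); delta is the secant slope of the segment.
    #include <cstdio>
    #include <cmath>

    int main()
    {
        double delta = 1.0;
        double d0 = 4.0, d1 = 2.0;     // too steep: alpha^2 + beta^2 > 9

        double alpha = d0/delta, beta = d1/delta;
        double cb = std::sqrt(alpha*alpha + beta*beta);
        if(cb > 3.0)
        {
            d0 = 3.0*alpha*delta/cb;
            d1 = 3.0*beta*delta/cb;
        }
        printf("limited slopes: d0=%.4f d1=%.4f (|(d0,d1)/delta| = %.4f)\n",
               d0, d1, std::sqrt(d0*d0 + d1*d1)/std::fabs(delta));
        return 0;
    }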
-- ALGLIB PROJECT -- Copyright 21.06.2012 by Bochkanov Sergey *************************************************************************/ void spline1dbuildmonotone(/* Real */ ae_vector* x, /* Real */ ae_vector* y, ae_int_t n, spline1dinterpolant* c, ae_state *_state) { ae_frame _frame_block; ae_vector _x; ae_vector _y; ae_vector d; ae_vector ex; ae_vector ey; ae_vector p; double delta; double alpha; double beta; ae_int_t tmpn; ae_int_t sn; double ca; double cb; double epsilon; ae_int_t i; ae_int_t j; ae_frame_make(_state, &_frame_block); ae_vector_init_copy(&_x, x, _state); x = &_x; ae_vector_init_copy(&_y, y, _state); y = &_y; _spline1dinterpolant_clear(c); ae_vector_init(&d, 0, DT_REAL, _state); ae_vector_init(&ex, 0, DT_REAL, _state); ae_vector_init(&ey, 0, DT_REAL, _state); ae_vector_init(&p, 0, DT_INT, _state); /* * Check lengths of arguments */ ae_assert(n>=2, "Spline1DBuildMonotone: N<2", _state); ae_assert(x->cnt>=n, "Spline1DBuildMonotone: Length(X)cnt>=n, "Spline1DBuildMonotone: Length(Y)ptr.p_double[0]-ae_fabs(x->ptr.p_double[1]-x->ptr.p_double[0], _state); ex.ptr.p_double[n-1] = x->ptr.p_double[n-3]+ae_fabs(x->ptr.p_double[n-3]-x->ptr.p_double[n-4], _state); ey.ptr.p_double[0] = y->ptr.p_double[0]; ey.ptr.p_double[n-1] = y->ptr.p_double[n-3]; for(i=1; i<=n-2; i++) { ex.ptr.p_double[i] = x->ptr.p_double[i-1]; ey.ptr.p_double[i] = y->ptr.p_double[i-1]; } /* * Init sign of the function for first segment */ i = 0; ca = (double)(0); do { ca = ey.ptr.p_double[i+1]-ey.ptr.p_double[i]; i = i+1; } while(!(ae_fp_neq(ca,(double)(0))||i>n-2)); if( ae_fp_neq(ca,(double)(0)) ) { ca = ca/ae_fabs(ca, _state); } i = 0; while(i=2, "Spline1DBuildMonotone: internal error", _state); /* * Calculate derivatives for current segment */ d.ptr.p_double[i] = (double)(0); d.ptr.p_double[sn-1] = (double)(0); for(j=i+1; j<=sn-2; j++) { d.ptr.p_double[j] = ((ey.ptr.p_double[j]-ey.ptr.p_double[j-1])/(ex.ptr.p_double[j]-ex.ptr.p_double[j-1])+(ey.ptr.p_double[j+1]-ey.ptr.p_double[j])/(ex.ptr.p_double[j+1]-ex.ptr.p_double[j]))/2; } for(j=i; j<=sn-2; j++) { delta = (ey.ptr.p_double[j+1]-ey.ptr.p_double[j])/(ex.ptr.p_double[j+1]-ex.ptr.p_double[j]); if( ae_fp_less_eq(ae_fabs(delta, _state),epsilon) ) { d.ptr.p_double[j] = (double)(0); d.ptr.p_double[j+1] = (double)(0); } else { alpha = d.ptr.p_double[j]/delta; beta = d.ptr.p_double[j+1]/delta; if( ae_fp_neq(alpha,(double)(0)) ) { cb = alpha*ae_sqrt(1+ae_sqr(beta/alpha, _state), _state); } else { if( ae_fp_neq(beta,(double)(0)) ) { cb = beta; } else { continue; } } if( ae_fp_greater(cb,(double)(3)) ) { d.ptr.p_double[j] = 3*alpha*delta/cb; d.ptr.p_double[j+1] = 3*beta*delta/cb; } } } /* * Transition to next segment */ i = sn-1; } spline1dbuildhermite(&ex, &ey, &d, n, c, _state); c->continuity = 2; ae_frame_leave(_state); } /************************************************************************* Internal version of Spline1DGridDiffCubic. Accepts pre-ordered X/Y, temporary arrays (which may be preallocated, if you want to save time, or not) and output array (which may be preallocated too). Y is passed as var-parameter because we may need to force last element to be equal to the first one (if periodic boundary conditions are specified). 
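This internal routine reduces the derivative computation to a tridiagonal (or, for periodic conditions, cyclic tridiagonal) linear system which is then solved by forward elimination and back substitution. A standalone sketch of that solver step (the Thomas algorithm, same scheme as spline1d_solvetridiagonal() further below) on an illustrative 3x3 system:

    // Hedged standalone sketch: Thomas algorithm for a tridiagonal system;
    // the system below is illustrative and diagonally dominant (solution 1,1,1).
    #include <cstdio>

    int main()
    {
        const int n = 3;
        // sub-diagonal a[1..n-1], diagonal b[0..n-1], super-diagonal c[0..n-2], rhs d[]
        double a[n] = {0.0, 1.0, 1.0};
        double b[n] = {4.0, 4.0, 4.0};
        double c[n] = {1.0, 1.0, 0.0};
        double d[n] = {5.0, 6.0, 5.0};
        double x[n];

        for(int k = 1; k < n; k++)              // forward elimination
        {
            double t = a[k]/b[k-1];
            b[k] -= t*c[k-1];
            d[k] -= t*d[k-1];
        }
        x[n-1] = d[n-1]/b[n-1];                 // back substitution
        for(int k = n-2; k >= 0; k--)
            x[k] = (d[k] - c[k]*x[k+1])/b[k];

        for(int k = 0; k < n; k++)
            printf("x[%d] = %.6f\n", k, x[k]);
        return 0;
    }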
-- ALGLIB PROJECT -- Copyright 03.09.2010 by Bochkanov Sergey *************************************************************************/ static void spline1d_spline1dgriddiffcubicinternal(/* Real */ ae_vector* x, /* Real */ ae_vector* y, ae_int_t n, ae_int_t boundltype, double boundl, ae_int_t boundrtype, double boundr, /* Real */ ae_vector* d, /* Real */ ae_vector* a1, /* Real */ ae_vector* a2, /* Real */ ae_vector* a3, /* Real */ ae_vector* b, /* Real */ ae_vector* dt, ae_state *_state) { ae_int_t i; /* * allocate arrays */ if( d->cntcntcntcntcntcntptr.p_double[0] = (y->ptr.p_double[1]-y->ptr.p_double[0])/(x->ptr.p_double[1]-x->ptr.p_double[0]); d->ptr.p_double[1] = d->ptr.p_double[0]; return; } if( (n==2&&boundltype==-1)&&boundrtype==-1 ) { d->ptr.p_double[0] = (double)(0); d->ptr.p_double[1] = (double)(0); return; } /* * Periodic and non-periodic boundary conditions are * two separate classes */ if( boundrtype==-1&&boundltype==-1 ) { /* * Periodic boundary conditions */ y->ptr.p_double[n-1] = y->ptr.p_double[0]; /* * Boundary conditions at N-1 points * (one point less because last point is the same as first point). */ a1->ptr.p_double[0] = x->ptr.p_double[1]-x->ptr.p_double[0]; a2->ptr.p_double[0] = 2*(x->ptr.p_double[1]-x->ptr.p_double[0]+x->ptr.p_double[n-1]-x->ptr.p_double[n-2]); a3->ptr.p_double[0] = x->ptr.p_double[n-1]-x->ptr.p_double[n-2]; b->ptr.p_double[0] = 3*(y->ptr.p_double[n-1]-y->ptr.p_double[n-2])/(x->ptr.p_double[n-1]-x->ptr.p_double[n-2])*(x->ptr.p_double[1]-x->ptr.p_double[0])+3*(y->ptr.p_double[1]-y->ptr.p_double[0])/(x->ptr.p_double[1]-x->ptr.p_double[0])*(x->ptr.p_double[n-1]-x->ptr.p_double[n-2]); for(i=1; i<=n-2; i++) { /* * Altough last point is [N-2], we use X[N-1] and Y[N-1] * (because of periodicity) */ a1->ptr.p_double[i] = x->ptr.p_double[i+1]-x->ptr.p_double[i]; a2->ptr.p_double[i] = 2*(x->ptr.p_double[i+1]-x->ptr.p_double[i-1]); a3->ptr.p_double[i] = x->ptr.p_double[i]-x->ptr.p_double[i-1]; b->ptr.p_double[i] = 3*(y->ptr.p_double[i]-y->ptr.p_double[i-1])/(x->ptr.p_double[i]-x->ptr.p_double[i-1])*(x->ptr.p_double[i+1]-x->ptr.p_double[i])+3*(y->ptr.p_double[i+1]-y->ptr.p_double[i])/(x->ptr.p_double[i+1]-x->ptr.p_double[i])*(x->ptr.p_double[i]-x->ptr.p_double[i-1]); } /* * Solve, add last point (with index N-1) */ spline1d_solvecyclictridiagonal(a1, a2, a3, b, n-1, dt, _state); ae_v_move(&d->ptr.p_double[0], 1, &dt->ptr.p_double[0], 1, ae_v_len(0,n-2)); d->ptr.p_double[n-1] = d->ptr.p_double[0]; } else { /* * Non-periodic boundary condition. * Left boundary conditions. 
*/ if( boundltype==0 ) { a1->ptr.p_double[0] = (double)(0); a2->ptr.p_double[0] = (double)(1); a3->ptr.p_double[0] = (double)(1); b->ptr.p_double[0] = 2*(y->ptr.p_double[1]-y->ptr.p_double[0])/(x->ptr.p_double[1]-x->ptr.p_double[0]); } if( boundltype==1 ) { a1->ptr.p_double[0] = (double)(0); a2->ptr.p_double[0] = (double)(1); a3->ptr.p_double[0] = (double)(0); b->ptr.p_double[0] = boundl; } if( boundltype==2 ) { a1->ptr.p_double[0] = (double)(0); a2->ptr.p_double[0] = (double)(2); a3->ptr.p_double[0] = (double)(1); b->ptr.p_double[0] = 3*(y->ptr.p_double[1]-y->ptr.p_double[0])/(x->ptr.p_double[1]-x->ptr.p_double[0])-0.5*boundl*(x->ptr.p_double[1]-x->ptr.p_double[0]); } /* * Central conditions */ for(i=1; i<=n-2; i++) { a1->ptr.p_double[i] = x->ptr.p_double[i+1]-x->ptr.p_double[i]; a2->ptr.p_double[i] = 2*(x->ptr.p_double[i+1]-x->ptr.p_double[i-1]); a3->ptr.p_double[i] = x->ptr.p_double[i]-x->ptr.p_double[i-1]; b->ptr.p_double[i] = 3*(y->ptr.p_double[i]-y->ptr.p_double[i-1])/(x->ptr.p_double[i]-x->ptr.p_double[i-1])*(x->ptr.p_double[i+1]-x->ptr.p_double[i])+3*(y->ptr.p_double[i+1]-y->ptr.p_double[i])/(x->ptr.p_double[i+1]-x->ptr.p_double[i])*(x->ptr.p_double[i]-x->ptr.p_double[i-1]); } /* * Right boundary conditions */ if( boundrtype==0 ) { a1->ptr.p_double[n-1] = (double)(1); a2->ptr.p_double[n-1] = (double)(1); a3->ptr.p_double[n-1] = (double)(0); b->ptr.p_double[n-1] = 2*(y->ptr.p_double[n-1]-y->ptr.p_double[n-2])/(x->ptr.p_double[n-1]-x->ptr.p_double[n-2]); } if( boundrtype==1 ) { a1->ptr.p_double[n-1] = (double)(0); a2->ptr.p_double[n-1] = (double)(1); a3->ptr.p_double[n-1] = (double)(0); b->ptr.p_double[n-1] = boundr; } if( boundrtype==2 ) { a1->ptr.p_double[n-1] = (double)(1); a2->ptr.p_double[n-1] = (double)(2); a3->ptr.p_double[n-1] = (double)(0); b->ptr.p_double[n-1] = 3*(y->ptr.p_double[n-1]-y->ptr.p_double[n-2])/(x->ptr.p_double[n-1]-x->ptr.p_double[n-2])+0.5*boundr*(x->ptr.p_double[n-1]-x->ptr.p_double[n-2]); } /* * Solve */ spline1d_solvetridiagonal(a1, a2, a3, b, n, d, _state); } } /************************************************************************* Internal subroutine. Heap sort. *************************************************************************/ static void spline1d_heapsortpoints(/* Real */ ae_vector* x, /* Real */ ae_vector* y, ae_int_t n, ae_state *_state) { ae_frame _frame_block; ae_vector bufx; ae_vector bufy; ae_frame_make(_state, &_frame_block); ae_vector_init(&bufx, 0, DT_REAL, _state); ae_vector_init(&bufy, 0, DT_REAL, _state); tagsortfastr(x, y, &bufx, &bufy, n, _state); ae_frame_leave(_state); } /************************************************************************* Internal subroutine. Heap sort. 
Accepts: X, Y - points P - empty or preallocated array Returns: X, Y - sorted by X P - array of permutations; I-th position of output arrays X/Y contains (X[P[I]],Y[P[I]]) *************************************************************************/ static void spline1d_heapsortppoints(/* Real */ ae_vector* x, /* Real */ ae_vector* y, /* Integer */ ae_vector* p, ae_int_t n, ae_state *_state) { ae_frame _frame_block; ae_vector rbuf; ae_vector ibuf; ae_int_t i; ae_frame_make(_state, &_frame_block); ae_vector_init(&rbuf, 0, DT_REAL, _state); ae_vector_init(&ibuf, 0, DT_INT, _state); if( p->cntptr.p_int[i] = i; } tagsortfasti(x, p, &rbuf, &ibuf, n, _state); for(i=0; i<=n-1; i++) { rbuf.ptr.p_double[i] = y->ptr.p_double[p->ptr.p_int[i]]; } ae_v_move(&y->ptr.p_double[0], 1, &rbuf.ptr.p_double[0], 1, ae_v_len(0,n-1)); ae_frame_leave(_state); } /************************************************************************* Internal subroutine. Tridiagonal solver. Solves ( B[0] C[0] ( A[1] B[1] C[1] ) ( A[2] B[2] C[2] ) ( .......... ) * X = D ( .......... ) ( A[N-2] B[N-2] C[N-2] ) ( A[N-1] B[N-1] ) *************************************************************************/ static void spline1d_solvetridiagonal(/* Real */ ae_vector* a, /* Real */ ae_vector* b, /* Real */ ae_vector* c, /* Real */ ae_vector* d, ae_int_t n, /* Real */ ae_vector* x, ae_state *_state) { ae_frame _frame_block; ae_vector _b; ae_vector _d; ae_int_t k; double t; ae_frame_make(_state, &_frame_block); ae_vector_init_copy(&_b, b, _state); b = &_b; ae_vector_init_copy(&_d, d, _state); d = &_d; if( x->cntptr.p_double[k]/b->ptr.p_double[k-1]; b->ptr.p_double[k] = b->ptr.p_double[k]-t*c->ptr.p_double[k-1]; d->ptr.p_double[k] = d->ptr.p_double[k]-t*d->ptr.p_double[k-1]; } x->ptr.p_double[n-1] = d->ptr.p_double[n-1]/b->ptr.p_double[n-1]; for(k=n-2; k>=0; k--) { x->ptr.p_double[k] = (d->ptr.p_double[k]-c->ptr.p_double[k]*x->ptr.p_double[k+1])/b->ptr.p_double[k]; } ae_frame_leave(_state); } /************************************************************************* Internal subroutine. Cyclic tridiagonal solver. Solves ( B[0] C[0] A[0] ) ( A[1] B[1] C[1] ) ( A[2] B[2] C[2] ) ( .......... ) * X = D ( .......... 
) ( A[N-2] B[N-2] C[N-2] ) ( C[N-1] A[N-1] B[N-1] ) *************************************************************************/ static void spline1d_solvecyclictridiagonal(/* Real */ ae_vector* a, /* Real */ ae_vector* b, /* Real */ ae_vector* c, /* Real */ ae_vector* d, ae_int_t n, /* Real */ ae_vector* x, ae_state *_state) { ae_frame _frame_block; ae_vector _b; ae_int_t k; double alpha; double beta; double gamma; ae_vector y; ae_vector z; ae_vector u; ae_frame_make(_state, &_frame_block); ae_vector_init_copy(&_b, b, _state); b = &_b; ae_vector_init(&y, 0, DT_REAL, _state); ae_vector_init(&z, 0, DT_REAL, _state); ae_vector_init(&u, 0, DT_REAL, _state); if( x->cntptr.p_double[0]; alpha = c->ptr.p_double[n-1]; gamma = -b->ptr.p_double[0]; b->ptr.p_double[0] = 2*b->ptr.p_double[0]; b->ptr.p_double[n-1] = b->ptr.p_double[n-1]-alpha*beta/gamma; ae_vector_set_length(&u, n, _state); for(k=0; k<=n-1; k++) { u.ptr.p_double[k] = (double)(0); } u.ptr.p_double[0] = gamma; u.ptr.p_double[n-1] = alpha; spline1d_solvetridiagonal(a, b, c, d, n, &y, _state); spline1d_solvetridiagonal(a, b, c, &u, n, &z, _state); for(k=0; k<=n-1; k++) { x->ptr.p_double[k] = y.ptr.p_double[k]-(y.ptr.p_double[0]+beta/gamma*y.ptr.p_double[n-1])/(1+z.ptr.p_double[0]+beta/gamma*z.ptr.p_double[n-1])*z.ptr.p_double[k]; } ae_frame_leave(_state); } /************************************************************************* Internal subroutine. Three-point differentiation *************************************************************************/ static double spline1d_diffthreepoint(double t, double x0, double f0, double x1, double f1, double x2, double f2, ae_state *_state) { double a; double b; double result; t = t-x0; x1 = x1-x0; x2 = x2-x0; a = (f2-f0-x2/x1*(f1-f0))/(ae_sqr(x2, _state)-x1*x2); b = (f1-f0-a*ae_sqr(x1, _state))/x1; result = 2*a*t+b; return result; } /************************************************************************* Procedure for calculating value of a function is providet in the form of Hermite polinom INPUT PARAMETERS: P0 - value of a function at 0 M0 - value of a derivative at 0 P1 - value of a function at 1 M1 - value of a derivative at 1 T - point inside [0;1] OUTPUT PARAMETERS: S - value of a function at T B0 - value of a derivative function at T -- ALGLIB PROJECT -- Copyright 26.09.2011 by Bochkanov Sergey *************************************************************************/ static void spline1d_hermitecalc(double p0, double m0, double p1, double m1, double t, double* s, double* ds, ae_state *_state) { *s = 0; *ds = 0; *s = p0*(1+2*t)*(1-t)*(1-t)+m0*t*(1-t)*(1-t)+p1*(3-2*t)*t*t+m1*t*t*(t-1); *ds = -p0*6*t*(1-t)+m0*(1-t)*(1-3*t)+p1*6*t*(1-t)+m1*t*(3*t-2); } /************************************************************************* Function for mapping from [A0;B0] to [A1;B1] INPUT PARAMETERS: A0 - left border [A0;B0] B0 - right border [A0;B0] A1 - left border [A1;B1] B1 - right border [A1;B1] T - value inside [A0;B0] RESTRICTIONS OF PARAMETERS: We assume, that B0>A0 and B1>A1. But we chech, that T is inside [A0;B0], and if TB0 then T - B1. INPUT PARAMETERS: A0 - left border for segment [A0;B0] from 'T' is converted to [A1;B1] B0 - right border for segment [A0;B0] from 'T' is converted to [A1;B1] A1 - left border for segment [A1;B1] to 'T' is converted from [A0;B0] B1 - right border for segment [A1;B1] to 'T' is converted from [A0;B0] T - the parameter is mapped from [A0;B0] to [A1;B1] Result: is converted value for 'T' from [A0;B0] to [A1;B1] REMARK: The function dont check value A0,B0 and A1,B1! 
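A standalone sketch of the mapping performed by this helper (linear rescaling from [A0;B0] to [A1;B1], clamped at the borders), with illustrative values:

    // Hedged standalone sketch of spline1d_rescaleval().
    #include <cstdio>

    static double rescale(double a0, double b0, double a1, double b1, double t)
    {
        if(t <= a0) return a1;                       // clamp to the left border
        if(t >= b0) return b1;                       // clamp to the right border
        return (b1 - a1)*(t - a0)/(b0 - a0) + a1;    // linear map in between
    }

    int main()
    {
        printf("%.3f %.3f %.3f\n",
               rescale(0, 1, -2, 2, 0.25),    // -> -1.0
               rescale(0, 1, -2, 2, 1.50),    // -> clamped to 2.0
               rescale(0, 1, -2, 2, -0.1));   // -> clamped to -2.0
        return 0;
    }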
-- ALGLIB PROJECT -- Copyright 26.09.2011 by Bochkanov Sergey *************************************************************************/ static double spline1d_rescaleval(double a0, double b0, double a1, double b1, double t, ae_state *_state) { double result; /* *return left border */ if( ae_fp_less_eq(t,a0) ) { result = a1; return result; } /* *return right border */ if( ae_fp_greater_eq(t,b0) ) { result = b1; return result; } /* *return value between left and right borders */ result = (b1-a1)*(t-a0)/(b0-a0)+a1; return result; } void _spline1dinterpolant_init(void* _p, ae_state *_state) { spline1dinterpolant *p = (spline1dinterpolant*)_p; ae_touch_ptr((void*)p); ae_vector_init(&p->x, 0, DT_REAL, _state); ae_vector_init(&p->c, 0, DT_REAL, _state); } void _spline1dinterpolant_init_copy(void* _dst, void* _src, ae_state *_state) { spline1dinterpolant *dst = (spline1dinterpolant*)_dst; spline1dinterpolant *src = (spline1dinterpolant*)_src; dst->periodic = src->periodic; dst->n = src->n; dst->k = src->k; dst->continuity = src->continuity; ae_vector_init_copy(&dst->x, &src->x, _state); ae_vector_init_copy(&dst->c, &src->c, _state); } void _spline1dinterpolant_clear(void* _p) { spline1dinterpolant *p = (spline1dinterpolant*)_p; ae_touch_ptr((void*)p); ae_vector_clear(&p->x); ae_vector_clear(&p->c); } void _spline1dinterpolant_destroy(void* _p) { spline1dinterpolant *p = (spline1dinterpolant*)_p; ae_touch_ptr((void*)p); ae_vector_destroy(&p->x); ae_vector_destroy(&p->c); } /************************************************************************* This function builds non-periodic 2-dimensional parametric spline which starts at (X[0],Y[0]) and ends at (X[N-1],Y[N-1]). INPUT PARAMETERS: XY - points, array[0..N-1,0..1]. XY[I,0:1] corresponds to the Ith point. Order of points is important! N - points count, N>=5 for Akima splines, N>=2 for other types of splines. ST - spline type: * 0 Akima spline * 1 parabolically terminated Catmull-Rom spline (Tension=0) * 2 parabolically terminated cubic spline PT - parameterization type: * 0 uniform * 1 chord length * 2 centripetal OUTPUT PARAMETERS: P - parametric spline interpolant NOTES: * this function assumes that there all consequent points are distinct. I.e. (x0,y0)<>(x1,y1), (x1,y1)<>(x2,y2), (x2,y2)<>(x3,y3) and so on. However, non-consequent points may coincide, i.e. we can have (x0,y0)= =(x2,y2). 
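The PT argument only controls how the input points are spread along the parameter axis before the per-coordinate splines are built. As a standalone illustration (not ALGLIB code; the helper name and use of std::array are just for this sketch), chord-length parameterization accumulates the distances between consecutive points, centripetal parameterization accumulates their square roots, and the result is normalized so that the first parameter is 0 and the last is 1:

    #include <array>
    #include <cmath>
    #include <cstddef>
    #include <vector>

    // pts[i] = (x,y) of the i-th point, at least two points, consecutive points distinct.
    // centripetal=false gives chord-length (PT=1), true gives centripetal (PT=2).
    static std::vector<double> parameterize(const std::vector<std::array<double,2>>& pts,
                                            bool centripetal)
    {
        std::vector<double> p(pts.size());
        p[0] = 0.0;
        for(std::size_t i=1; i<pts.size(); i++)
        {
            double dist = std::hypot(pts[i][0]-pts[i-1][0], pts[i][1]-pts[i-1][1]);
            p[i] = p[i-1] + (centripetal ? std::sqrt(dist) : dist);
        }
        double total = p.back();             // >0 because consecutive points are distinct
        for(double& v : p)                   // normalize so that p[0]=0, p[N-1]=1
            v /= total;
        return p;
    }

Uniform parameterization (PT=0) would simply use p[i]=i before the same normalization.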
-- ALGLIB PROJECT -- Copyright 28.05.2010 by Bochkanov Sergey *************************************************************************/ void pspline2build(/* Real */ ae_matrix* xy, ae_int_t n, ae_int_t st, ae_int_t pt, pspline2interpolant* p, ae_state *_state) { ae_frame _frame_block; ae_matrix _xy; ae_vector tmp; ae_frame_make(_state, &_frame_block); ae_matrix_init_copy(&_xy, xy, _state); xy = &_xy; _pspline2interpolant_clear(p); ae_vector_init(&tmp, 0, DT_REAL, _state); ae_assert(st>=0&&st<=2, "PSpline2Build: incorrect spline type!", _state); ae_assert(pt>=0&&pt<=2, "PSpline2Build: incorrect parameterization type!", _state); if( st==0 ) { ae_assert(n>=5, "PSpline2Build: N<5 (minimum value for Akima splines)!", _state); } else { ae_assert(n>=2, "PSpline2Build: N<2!", _state); } /* * Prepare */ p->n = n; p->periodic = ae_false; ae_vector_set_length(&tmp, n, _state); /* * Build parameterization, check that all parameters are distinct */ parametric_pspline2par(xy, n, pt, &p->p, _state); ae_assert(aredistinct(&p->p, n, _state), "PSpline2Build: consequent points are too close!", _state); /* * Build splines */ if( st==0 ) { ae_v_move(&tmp.ptr.p_double[0], 1, &xy->ptr.pp_double[0][0], xy->stride, ae_v_len(0,n-1)); spline1dbuildakima(&p->p, &tmp, n, &p->x, _state); ae_v_move(&tmp.ptr.p_double[0], 1, &xy->ptr.pp_double[0][1], xy->stride, ae_v_len(0,n-1)); spline1dbuildakima(&p->p, &tmp, n, &p->y, _state); } if( st==1 ) { ae_v_move(&tmp.ptr.p_double[0], 1, &xy->ptr.pp_double[0][0], xy->stride, ae_v_len(0,n-1)); spline1dbuildcatmullrom(&p->p, &tmp, n, 0, 0.0, &p->x, _state); ae_v_move(&tmp.ptr.p_double[0], 1, &xy->ptr.pp_double[0][1], xy->stride, ae_v_len(0,n-1)); spline1dbuildcatmullrom(&p->p, &tmp, n, 0, 0.0, &p->y, _state); } if( st==2 ) { ae_v_move(&tmp.ptr.p_double[0], 1, &xy->ptr.pp_double[0][0], xy->stride, ae_v_len(0,n-1)); spline1dbuildcubic(&p->p, &tmp, n, 0, 0.0, 0, 0.0, &p->x, _state); ae_v_move(&tmp.ptr.p_double[0], 1, &xy->ptr.pp_double[0][1], xy->stride, ae_v_len(0,n-1)); spline1dbuildcubic(&p->p, &tmp, n, 0, 0.0, 0, 0.0, &p->y, _state); } ae_frame_leave(_state); } /************************************************************************* This function builds non-periodic 3-dimensional parametric spline which starts at (X[0],Y[0],Z[0]) and ends at (X[N-1],Y[N-1],Z[N-1]). Same as PSpline2Build() function, but for 3D, so we won't duplicate its description here. 
-- ALGLIB PROJECT -- Copyright 28.05.2010 by Bochkanov Sergey *************************************************************************/ void pspline3build(/* Real */ ae_matrix* xy, ae_int_t n, ae_int_t st, ae_int_t pt, pspline3interpolant* p, ae_state *_state) { ae_frame _frame_block; ae_matrix _xy; ae_vector tmp; ae_frame_make(_state, &_frame_block); ae_matrix_init_copy(&_xy, xy, _state); xy = &_xy; _pspline3interpolant_clear(p); ae_vector_init(&tmp, 0, DT_REAL, _state); ae_assert(st>=0&&st<=2, "PSpline3Build: incorrect spline type!", _state); ae_assert(pt>=0&&pt<=2, "PSpline3Build: incorrect parameterization type!", _state); if( st==0 ) { ae_assert(n>=5, "PSpline3Build: N<5 (minimum value for Akima splines)!", _state); } else { ae_assert(n>=2, "PSpline3Build: N<2!", _state); } /* * Prepare */ p->n = n; p->periodic = ae_false; ae_vector_set_length(&tmp, n, _state); /* * Build parameterization, check that all parameters are distinct */ parametric_pspline3par(xy, n, pt, &p->p, _state); ae_assert(aredistinct(&p->p, n, _state), "PSpline3Build: consequent points are too close!", _state); /* * Build splines */ if( st==0 ) { ae_v_move(&tmp.ptr.p_double[0], 1, &xy->ptr.pp_double[0][0], xy->stride, ae_v_len(0,n-1)); spline1dbuildakima(&p->p, &tmp, n, &p->x, _state); ae_v_move(&tmp.ptr.p_double[0], 1, &xy->ptr.pp_double[0][1], xy->stride, ae_v_len(0,n-1)); spline1dbuildakima(&p->p, &tmp, n, &p->y, _state); ae_v_move(&tmp.ptr.p_double[0], 1, &xy->ptr.pp_double[0][2], xy->stride, ae_v_len(0,n-1)); spline1dbuildakima(&p->p, &tmp, n, &p->z, _state); } if( st==1 ) { ae_v_move(&tmp.ptr.p_double[0], 1, &xy->ptr.pp_double[0][0], xy->stride, ae_v_len(0,n-1)); spline1dbuildcatmullrom(&p->p, &tmp, n, 0, 0.0, &p->x, _state); ae_v_move(&tmp.ptr.p_double[0], 1, &xy->ptr.pp_double[0][1], xy->stride, ae_v_len(0,n-1)); spline1dbuildcatmullrom(&p->p, &tmp, n, 0, 0.0, &p->y, _state); ae_v_move(&tmp.ptr.p_double[0], 1, &xy->ptr.pp_double[0][2], xy->stride, ae_v_len(0,n-1)); spline1dbuildcatmullrom(&p->p, &tmp, n, 0, 0.0, &p->z, _state); } if( st==2 ) { ae_v_move(&tmp.ptr.p_double[0], 1, &xy->ptr.pp_double[0][0], xy->stride, ae_v_len(0,n-1)); spline1dbuildcubic(&p->p, &tmp, n, 0, 0.0, 0, 0.0, &p->x, _state); ae_v_move(&tmp.ptr.p_double[0], 1, &xy->ptr.pp_double[0][1], xy->stride, ae_v_len(0,n-1)); spline1dbuildcubic(&p->p, &tmp, n, 0, 0.0, 0, 0.0, &p->y, _state); ae_v_move(&tmp.ptr.p_double[0], 1, &xy->ptr.pp_double[0][2], xy->stride, ae_v_len(0,n-1)); spline1dbuildcubic(&p->p, &tmp, n, 0, 0.0, 0, 0.0, &p->z, _state); } ae_frame_leave(_state); } /************************************************************************* This function builds periodic 2-dimensional parametric spline which starts at (X[0],Y[0]), goes through all points to (X[N-1],Y[N-1]) and then back to (X[0],Y[0]). INPUT PARAMETERS: XY - points, array[0..N-1,0..1]. XY[I,0:1] corresponds to the Ith point. XY[N-1,0:1] must be different from XY[0,0:1]. Order of points is important! N - points count, N>=3 for other types of splines. ST - spline type: * 1 Catmull-Rom spline (Tension=0) with cyclic boundary conditions * 2 cubic spline with cyclic boundary conditions PT - parameterization type: * 0 uniform * 1 chord length * 2 centripetal OUTPUT PARAMETERS: P - parametric spline interpolant NOTES: * this function assumes that there all consequent points are distinct. I.e. (x0,y0)<>(x1,y1), (x1,y1)<>(x2,y2), (x2,y2)<>(x3,y3) and so on. However, non-consequent points may coincide, i.e. we can have (x0,y0)= =(x2,y2). 
* last point of sequence is NOT equal to the first point. You shouldn't make curve "explicitly periodic" by making them equal. -- ALGLIB PROJECT -- Copyright 28.05.2010 by Bochkanov Sergey *************************************************************************/ void pspline2buildperiodic(/* Real */ ae_matrix* xy, ae_int_t n, ae_int_t st, ae_int_t pt, pspline2interpolant* p, ae_state *_state) { ae_frame _frame_block; ae_matrix _xy; ae_matrix xyp; ae_vector tmp; ae_frame_make(_state, &_frame_block); ae_matrix_init_copy(&_xy, xy, _state); xy = &_xy; _pspline2interpolant_clear(p); ae_matrix_init(&xyp, 0, 0, DT_REAL, _state); ae_vector_init(&tmp, 0, DT_REAL, _state); ae_assert(st>=1&&st<=2, "PSpline2BuildPeriodic: incorrect spline type!", _state); ae_assert(pt>=0&&pt<=2, "PSpline2BuildPeriodic: incorrect parameterization type!", _state); ae_assert(n>=3, "PSpline2BuildPeriodic: N<3!", _state); /* * Prepare */ p->n = n; p->periodic = ae_true; ae_vector_set_length(&tmp, n+1, _state); ae_matrix_set_length(&xyp, n+1, 2, _state); ae_v_move(&xyp.ptr.pp_double[0][0], xyp.stride, &xy->ptr.pp_double[0][0], xy->stride, ae_v_len(0,n-1)); ae_v_move(&xyp.ptr.pp_double[0][1], xyp.stride, &xy->ptr.pp_double[0][1], xy->stride, ae_v_len(0,n-1)); ae_v_move(&xyp.ptr.pp_double[n][0], 1, &xy->ptr.pp_double[0][0], 1, ae_v_len(0,1)); /* * Build parameterization, check that all parameters are distinct */ parametric_pspline2par(&xyp, n+1, pt, &p->p, _state); ae_assert(aredistinct(&p->p, n+1, _state), "PSpline2BuildPeriodic: consequent (or first and last) points are too close!", _state); /* * Build splines */ if( st==1 ) { ae_v_move(&tmp.ptr.p_double[0], 1, &xyp.ptr.pp_double[0][0], xyp.stride, ae_v_len(0,n)); spline1dbuildcatmullrom(&p->p, &tmp, n+1, -1, 0.0, &p->x, _state); ae_v_move(&tmp.ptr.p_double[0], 1, &xyp.ptr.pp_double[0][1], xyp.stride, ae_v_len(0,n)); spline1dbuildcatmullrom(&p->p, &tmp, n+1, -1, 0.0, &p->y, _state); } if( st==2 ) { ae_v_move(&tmp.ptr.p_double[0], 1, &xyp.ptr.pp_double[0][0], xyp.stride, ae_v_len(0,n)); spline1dbuildcubic(&p->p, &tmp, n+1, -1, 0.0, -1, 0.0, &p->x, _state); ae_v_move(&tmp.ptr.p_double[0], 1, &xyp.ptr.pp_double[0][1], xyp.stride, ae_v_len(0,n)); spline1dbuildcubic(&p->p, &tmp, n+1, -1, 0.0, -1, 0.0, &p->y, _state); } ae_frame_leave(_state); } /************************************************************************* This function builds periodic 3-dimensional parametric spline which starts at (X[0],Y[0],Z[0]), goes through all points to (X[N-1],Y[N-1],Z[N-1]) and then back to (X[0],Y[0],Z[0]). Same as PSpline2Build() function, but for 3D, so we won't duplicate its description here. 
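As noted above for the periodic builders, the input must not repeat the first point at the end; the closing segment back to the first point is added by the builder itself. A small standalone sketch of preparing such input for a closed curve, here a unit circle (illustrative only; the helper name is made up):

    #include <array>
    #include <cmath>
    #include <vector>

    // N points on the unit circle, WITHOUT repeating the first point at the end;
    // element i holds (x,y) of the i-th point, in traversal order.
    static std::vector<std::array<double,2>> closed_circle_points(int n)
    {
        const double pi = 3.14159265358979323846;
        std::vector<std::array<double,2>> xy(n);
        for(int i=0; i<n; i++)
        {
            double phi = 2.0*pi*i/n;         // i/n, not i/(n-1): no duplicate endpoint
            xy[i] = { std::cos(phi), std::sin(phi) };
        }
        return xy;
    }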
-- ALGLIB PROJECT -- Copyright 28.05.2010 by Bochkanov Sergey *************************************************************************/ void pspline3buildperiodic(/* Real */ ae_matrix* xy, ae_int_t n, ae_int_t st, ae_int_t pt, pspline3interpolant* p, ae_state *_state) { ae_frame _frame_block; ae_matrix _xy; ae_matrix xyp; ae_vector tmp; ae_frame_make(_state, &_frame_block); ae_matrix_init_copy(&_xy, xy, _state); xy = &_xy; _pspline3interpolant_clear(p); ae_matrix_init(&xyp, 0, 0, DT_REAL, _state); ae_vector_init(&tmp, 0, DT_REAL, _state); ae_assert(st>=1&&st<=2, "PSpline3BuildPeriodic: incorrect spline type!", _state); ae_assert(pt>=0&&pt<=2, "PSpline3BuildPeriodic: incorrect parameterization type!", _state); ae_assert(n>=3, "PSpline3BuildPeriodic: N<3!", _state); /* * Prepare */ p->n = n; p->periodic = ae_true; ae_vector_set_length(&tmp, n+1, _state); ae_matrix_set_length(&xyp, n+1, 3, _state); ae_v_move(&xyp.ptr.pp_double[0][0], xyp.stride, &xy->ptr.pp_double[0][0], xy->stride, ae_v_len(0,n-1)); ae_v_move(&xyp.ptr.pp_double[0][1], xyp.stride, &xy->ptr.pp_double[0][1], xy->stride, ae_v_len(0,n-1)); ae_v_move(&xyp.ptr.pp_double[0][2], xyp.stride, &xy->ptr.pp_double[0][2], xy->stride, ae_v_len(0,n-1)); ae_v_move(&xyp.ptr.pp_double[n][0], 1, &xy->ptr.pp_double[0][0], 1, ae_v_len(0,2)); /* * Build parameterization, check that all parameters are distinct */ parametric_pspline3par(&xyp, n+1, pt, &p->p, _state); ae_assert(aredistinct(&p->p, n+1, _state), "PSplineBuild2Periodic: consequent (or first and last) points are too close!", _state); /* * Build splines */ if( st==1 ) { ae_v_move(&tmp.ptr.p_double[0], 1, &xyp.ptr.pp_double[0][0], xyp.stride, ae_v_len(0,n)); spline1dbuildcatmullrom(&p->p, &tmp, n+1, -1, 0.0, &p->x, _state); ae_v_move(&tmp.ptr.p_double[0], 1, &xyp.ptr.pp_double[0][1], xyp.stride, ae_v_len(0,n)); spline1dbuildcatmullrom(&p->p, &tmp, n+1, -1, 0.0, &p->y, _state); ae_v_move(&tmp.ptr.p_double[0], 1, &xyp.ptr.pp_double[0][2], xyp.stride, ae_v_len(0,n)); spline1dbuildcatmullrom(&p->p, &tmp, n+1, -1, 0.0, &p->z, _state); } if( st==2 ) { ae_v_move(&tmp.ptr.p_double[0], 1, &xyp.ptr.pp_double[0][0], xyp.stride, ae_v_len(0,n)); spline1dbuildcubic(&p->p, &tmp, n+1, -1, 0.0, -1, 0.0, &p->x, _state); ae_v_move(&tmp.ptr.p_double[0], 1, &xyp.ptr.pp_double[0][1], xyp.stride, ae_v_len(0,n)); spline1dbuildcubic(&p->p, &tmp, n+1, -1, 0.0, -1, 0.0, &p->y, _state); ae_v_move(&tmp.ptr.p_double[0], 1, &xyp.ptr.pp_double[0][2], xyp.stride, ae_v_len(0,n)); spline1dbuildcubic(&p->p, &tmp, n+1, -1, 0.0, -1, 0.0, &p->z, _state); } ae_frame_leave(_state); } /************************************************************************* This function returns vector of parameter values correspoding to points. I.e. for P created from (X[0],Y[0])...(X[N-1],Y[N-1]) and U=TValues(P) we have (X[0],Y[0]) = PSpline2Calc(P,U[0]), (X[1],Y[1]) = PSpline2Calc(P,U[1]), (X[2],Y[2]) = PSpline2Calc(P,U[2]), ... INPUT PARAMETERS: P - parametric spline interpolant OUTPUT PARAMETERS: N - array size T - array[0..N-1] NOTES: * for non-periodic splines U[0]=0, U[0]n>=2, "PSpline2ParameterValues: internal error!", _state); *n = p->n; ae_vector_set_length(t, *n, _state); ae_v_move(&t->ptr.p_double[0], 1, &p->p.ptr.p_double[0], 1, ae_v_len(0,*n-1)); t->ptr.p_double[0] = (double)(0); if( !p->periodic ) { t->ptr.p_double[*n-1] = (double)(1); } } /************************************************************************* This function returns vector of parameter values correspoding to points. 
Same as PSpline2ParameterValues(), but for 3D. -- ALGLIB PROJECT -- Copyright 28.05.2010 by Bochkanov Sergey *************************************************************************/ void pspline3parametervalues(pspline3interpolant* p, ae_int_t* n, /* Real */ ae_vector* t, ae_state *_state) { *n = 0; ae_vector_clear(t); ae_assert(p->n>=2, "PSpline3ParameterValues: internal error!", _state); *n = p->n; ae_vector_set_length(t, *n, _state); ae_v_move(&t->ptr.p_double[0], 1, &p->p.ptr.p_double[0], 1, ae_v_len(0,*n-1)); t->ptr.p_double[0] = (double)(0); if( !p->periodic ) { t->ptr.p_double[*n-1] = (double)(1); } } /************************************************************************* This function calculates the value of the parametric spline for a given value of parameter T INPUT PARAMETERS: P - parametric spline interpolant T - point: * T in [0,1] corresponds to interval spanned by points * for non-periodic splines T<0 (or T>1) correspond to parts of the curve before the first (after the last) point * for periodic splines T<0 (or T>1) are projected into [0,1] by making T=T-floor(T). OUTPUT PARAMETERS: X - X-position Y - Y-position -- ALGLIB PROJECT -- Copyright 28.05.2010 by Bochkanov Sergey *************************************************************************/ void pspline2calc(pspline2interpolant* p, double t, double* x, double* y, ae_state *_state) { *x = 0; *y = 0; if( p->periodic ) { t = t-ae_ifloor(t, _state); } *x = spline1dcalc(&p->x, t, _state); *y = spline1dcalc(&p->y, t, _state); } /************************************************************************* This function calculates the value of the parametric spline for a given value of parameter T. INPUT PARAMETERS: P - parametric spline interpolant T - point: * T in [0,1] corresponds to interval spanned by points * for non-periodic splines T<0 (or T>1) correspond to parts of the curve before the first (after the last) point * for periodic splines T<0 (or T>1) are projected into [0,1] by making T=T-floor(T). OUTPUT PARAMETERS: X - X-position Y - Y-position Z - Z-position -- ALGLIB PROJECT -- Copyright 28.05.2010 by Bochkanov Sergey *************************************************************************/ void pspline3calc(pspline3interpolant* p, double t, double* x, double* y, double* z, ae_state *_state) { *x = 0; *y = 0; *z = 0; if( p->periodic ) { t = t-ae_ifloor(t, _state); } *x = spline1dcalc(&p->x, t, _state); *y = spline1dcalc(&p->y, t, _state); *z = spline1dcalc(&p->z, t, _state); } /************************************************************************* This function calculates tangent vector for a given value of parameter T INPUT PARAMETERS: P - parametric spline interpolant T - point: * T in [0,1] corresponds to interval spanned by points * for non-periodic splines T<0 (or T>1) correspond to parts of the curve before the first (after the last) point * for periodic splines T<0 (or T>1) are projected into [0,1] by making T=T-floor(T). OUTPUT PARAMETERS: X - X-component of tangent vector (normalized) Y - Y-component of tangent vector (normalized) NOTE: X^2+Y^2 is either 1 (for non-zero tangent vector) or 0. 
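The tangent routines that follow simply normalize the curve derivative; the only subtlety is using a "safe" hypotenuse (ALGLIB's SafePythag2/SafePythag3) so that squaring large components cannot overflow, and leaving a zero vector as zero. A standalone sketch of that step for the 2D case (illustrative; std::hypot plays the role of the safe hypotenuse here):

    #include <cmath>

    // Normalize (dx,dy) to unit length; returns false and leaves (*x,*y)=(0,0)
    // for a zero tangent. std::hypot avoids overflow in dx*dx+dy*dy.
    static bool normalize_tangent(double dx, double dy, double* x, double* y)
    {
        *x = 0.0;
        *y = 0.0;
        if( dx==0.0 && dy==0.0 )
            return false;
        double v = std::hypot(dx, dy);
        *x = dx/v;
        *y = dy/v;
        return true;
    }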
-- ALGLIB PROJECT -- Copyright 28.05.2010 by Bochkanov Sergey *************************************************************************/ void pspline2tangent(pspline2interpolant* p, double t, double* x, double* y, ae_state *_state) { double v; double v0; double v1; *x = 0; *y = 0; if( p->periodic ) { t = t-ae_ifloor(t, _state); } pspline2diff(p, t, &v0, x, &v1, y, _state); if( ae_fp_neq(*x,(double)(0))||ae_fp_neq(*y,(double)(0)) ) { /* * this code is a bit more complex than X^2+Y^2 to avoid * overflow for large values of X and Y. */ v = safepythag2(*x, *y, _state); *x = *x/v; *y = *y/v; } } /************************************************************************* This function calculates tangent vector for a given value of parameter T INPUT PARAMETERS: P - parametric spline interpolant T - point: * T in [0,1] corresponds to interval spanned by points * for non-periodic splines T<0 (or T>1) correspond to parts of the curve before the first (after the last) point * for periodic splines T<0 (or T>1) are projected into [0,1] by making T=T-floor(T). OUTPUT PARAMETERS: X - X-component of tangent vector (normalized) Y - Y-component of tangent vector (normalized) Z - Z-component of tangent vector (normalized) NOTE: X^2+Y^2+Z^2 is either 1 (for non-zero tangent vector) or 0. -- ALGLIB PROJECT -- Copyright 28.05.2010 by Bochkanov Sergey *************************************************************************/ void pspline3tangent(pspline3interpolant* p, double t, double* x, double* y, double* z, ae_state *_state) { double v; double v0; double v1; double v2; *x = 0; *y = 0; *z = 0; if( p->periodic ) { t = t-ae_ifloor(t, _state); } pspline3diff(p, t, &v0, x, &v1, y, &v2, z, _state); if( (ae_fp_neq(*x,(double)(0))||ae_fp_neq(*y,(double)(0)))||ae_fp_neq(*z,(double)(0)) ) { v = safepythag3(*x, *y, *z, _state); *x = *x/v; *y = *y/v; *z = *z/v; } } /************************************************************************* This function calculates derivative, i.e. it returns (dX/dT,dY/dT). INPUT PARAMETERS: P - parametric spline interpolant T - point: * T in [0,1] corresponds to interval spanned by points * for non-periodic splines T<0 (or T>1) correspond to parts of the curve before the first (after the last) point * for periodic splines T<0 (or T>1) are projected into [0,1] by making T=T-floor(T). OUTPUT PARAMETERS: X - X-value DX - X-derivative Y - Y-value DY - Y-derivative -- ALGLIB PROJECT -- Copyright 28.05.2010 by Bochkanov Sergey *************************************************************************/ void pspline2diff(pspline2interpolant* p, double t, double* x, double* dx, double* y, double* dy, ae_state *_state) { double d2s; *x = 0; *dx = 0; *y = 0; *dy = 0; if( p->periodic ) { t = t-ae_ifloor(t, _state); } spline1ddiff(&p->x, t, x, dx, &d2s, _state); spline1ddiff(&p->y, t, y, dy, &d2s, _state); } /************************************************************************* This function calculates derivative, i.e. it returns (dX/dT,dY/dT,dZ/dT). INPUT PARAMETERS: P - parametric spline interpolant T - point: * T in [0,1] corresponds to interval spanned by points * for non-periodic splines T<0 (or T>1) correspond to parts of the curve before the first (after the last) point * for periodic splines T<0 (or T>1) are projected into [0,1] by making T=T-floor(T). 
OUTPUT PARAMETERS: X - X-value DX - X-derivative Y - Y-value DY - Y-derivative Z - Z-value DZ - Z-derivative -- ALGLIB PROJECT -- Copyright 28.05.2010 by Bochkanov Sergey *************************************************************************/ void pspline3diff(pspline3interpolant* p, double t, double* x, double* dx, double* y, double* dy, double* z, double* dz, ae_state *_state) { double d2s; *x = 0; *dx = 0; *y = 0; *dy = 0; *z = 0; *dz = 0; if( p->periodic ) { t = t-ae_ifloor(t, _state); } spline1ddiff(&p->x, t, x, dx, &d2s, _state); spline1ddiff(&p->y, t, y, dy, &d2s, _state); spline1ddiff(&p->z, t, z, dz, &d2s, _state); } /************************************************************************* This function calculates first and second derivative with respect to T. INPUT PARAMETERS: P - parametric spline interpolant T - point: * T in [0,1] corresponds to interval spanned by points * for non-periodic splines T<0 (or T>1) correspond to parts of the curve before the first (after the last) point * for periodic splines T<0 (or T>1) are projected into [0,1] by making T=T-floor(T). OUTPUT PARAMETERS: X - X-value DX - derivative D2X - second derivative Y - Y-value DY - derivative D2Y - second derivative -- ALGLIB PROJECT -- Copyright 28.05.2010 by Bochkanov Sergey *************************************************************************/ void pspline2diff2(pspline2interpolant* p, double t, double* x, double* dx, double* d2x, double* y, double* dy, double* d2y, ae_state *_state) { *x = 0; *dx = 0; *d2x = 0; *y = 0; *dy = 0; *d2y = 0; if( p->periodic ) { t = t-ae_ifloor(t, _state); } spline1ddiff(&p->x, t, x, dx, d2x, _state); spline1ddiff(&p->y, t, y, dy, d2y, _state); } /************************************************************************* This function calculates first and second derivative with respect to T. INPUT PARAMETERS: P - parametric spline interpolant T - point: * T in [0,1] corresponds to interval spanned by points * for non-periodic splines T<0 (or T>1) correspond to parts of the curve before the first (after the last) point * for periodic splines T<0 (or T>1) are projected into [0,1] by making T=T-floor(T). OUTPUT PARAMETERS: X - X-value DX - derivative D2X - second derivative Y - Y-value DY - derivative D2Y - second derivative Z - Z-value DZ - derivative D2Z - second derivative -- ALGLIB PROJECT -- Copyright 28.05.2010 by Bochkanov Sergey *************************************************************************/ void pspline3diff2(pspline3interpolant* p, double t, double* x, double* dx, double* d2x, double* y, double* dy, double* d2y, double* z, double* dz, double* d2z, ae_state *_state) { *x = 0; *dx = 0; *d2x = 0; *y = 0; *dy = 0; *d2y = 0; *z = 0; *dz = 0; *d2z = 0; if( p->periodic ) { t = t-ae_ifloor(t, _state); } spline1ddiff(&p->x, t, x, dx, d2x, _state); spline1ddiff(&p->y, t, y, dy, d2y, _state); spline1ddiff(&p->z, t, z, dz, d2z, _state); } /************************************************************************* This function calculates arc length, i.e. length of curve between t=a and t=b. 
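The arc length computed below is the integral of sqrt((dX/dT)^2+(dY/dT)^2) over [A,B]; ALGLIB evaluates it with its adaptive AutoGK integrator. As a rough standalone approximation of the same quantity, a fixed-step composite Simpson rule can be used instead (illustrative sketch; the derivative callbacks and the function name are assumptions of the sketch, not ALGLIB API):

    #include <cmath>
    #include <functional>

    // Approximate the integral of speed(t)=sqrt(dx(t)^2+dy(t)^2) over [a,b]
    // with the composite Simpson rule on 2*m subintervals. dx,dy return the
    // derivatives of the curve coordinates with respect to t.
    static double arc_length(const std::function<double(double)>& dx,
                             const std::function<double(double)>& dy,
                             double a, double b, int m = 256)
    {
        auto speed = [&](double t) { return std::hypot(dx(t), dy(t)); };
        const int n = 2*m;
        const double h = (b-a)/n;
        double s = speed(a) + speed(b);
        for(int i=1; i<n; i++)
            s += (i%2==1 ? 4.0 : 2.0)*speed(a+i*h);
        return s*h/3.0;                      // negative when b<a, like the exact integral
    }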
INPUT PARAMETERS: P - parametric spline interpolant A,B - parameter values corresponding to arc ends: * B>A will result in positive length returned * Bx, state.x, &sx, &dsx, &d2sx, _state); spline1ddiff(&p->y, state.x, &sy, &dsy, &d2sy, _state); state.f = safepythag2(dsx, dsy, _state); } autogkresults(&state, &result, &rep, _state); ae_assert(rep.terminationtype>0, "PSpline2ArcLength: internal error!", _state); ae_frame_leave(_state); return result; } /************************************************************************* This function calculates arc length, i.e. length of curve between t=a and t=b. INPUT PARAMETERS: P - parametric spline interpolant A,B - parameter values corresponding to arc ends: * B>A will result in positive length returned * Bx, state.x, &sx, &dsx, &d2sx, _state); spline1ddiff(&p->y, state.x, &sy, &dsy, &d2sy, _state); spline1ddiff(&p->z, state.x, &sz, &dsz, &d2sz, _state); state.f = safepythag3(dsx, dsy, dsz, _state); } autogkresults(&state, &result, &rep, _state); ae_assert(rep.terminationtype>0, "PSpline3ArcLength: internal error!", _state); ae_frame_leave(_state); return result; } /************************************************************************* This subroutine fits piecewise linear curve to points with Ramer-Douglas- Peucker algorithm. This function performs PARAMETRIC fit, i.e. it can be used to fit curves like circles. On input it accepts dataset which describes parametric multidimensional curve X(t), with X being vector, and t taking values in [0,N), where N is a number of points in dataset. As result, it returns reduced dataset X2, which can be used to build parametric curve X2(t), which approximates X(t) with desired precision (or has specified number of sections). INPUT PARAMETERS: X - array of multidimensional points: * at least N elements, leading N elements are used if more than N elements were specified * order of points is IMPORTANT because it is parametric fit * each row of array is one point which has D coordinates N - number of elements in X D - number of dimensions (elements per row of X) StopM - stopping condition - desired number of sections: * at most M sections are generated by this function * less than M sections can be generated if we have N=0, "LSTFitPiecewiseLinearParametricRDP: N<0", _state); ae_assert(d>=1, "LSTFitPiecewiseLinearParametricRDP: D<=0", _state); ae_assert(stopm>=0, "LSTFitPiecewiseLinearParametricRDP: StopM<1", _state); ae_assert(ae_isfinite(stopeps, _state)&&ae_fp_greater_eq(stopeps,(double)(0)), "LSTFitPiecewiseLinearParametricRDP: StopEps<0 or is infinite", _state); ae_assert(x->rows>=n, "LSTFitPiecewiseLinearParametricRDP: Rows(X)cols>=d, "LSTFitPiecewiseLinearParametricRDP: Cols(X)ptr.pp_double[i][j],x->ptr.pp_double[0][j]); } } if( allsame ) { *nsections = 0; ae_frame_leave(_state); return; } /* * Prepare first section */ parametric_rdpanalyzesectionpar(x, 0, n-1, d, &worstidx, &worsterror, _state); ae_matrix_set_length(§ions, n, 4, _state); ae_vector_set_length(&heaperrors, n, _state); ae_vector_set_length(&heaptags, n, _state); *nsections = 1; sections.ptr.pp_double[0][0] = (double)(0); sections.ptr.pp_double[0][1] = (double)(n-1); sections.ptr.pp_double[0][2] = (double)(worstidx); sections.ptr.pp_double[0][3] = worsterror; heaperrors.ptr.p_double[0] = worsterror; heaptags.ptr.p_int[0] = 0; ae_assert(ae_fp_eq(sections.ptr.pp_double[0][1],(double)(n-1)), "RDP algorithm: integrity check failed", _state); /* * Main loop. * Repeatedly find section with worst error and divide it. 
* Terminate after M-th section, or because of other reasons (see loop internals). */ for(;;) { /* * Break loop if one of the stopping conditions was met. * Store index of worst section to K. */ if( ae_fp_eq(heaperrors.ptr.p_double[0],(double)(0)) ) { break; } if( ae_fp_greater(stopeps,(double)(0))&&ae_fp_less_eq(heaperrors.ptr.p_double[0],stopeps) ) { break; } if( stopm>0&&*nsections>=stopm ) { break; } k = heaptags.ptr.p_int[0]; /* * K-th section is divided in two: * * first one spans interval from X[Sections[K,0]] to X[Sections[K,2]] * * second one spans interval from X[Sections[K,2]] to X[Sections[K,1]] * * First section is stored at K-th position, second one is appended to the table. * Then we update heap which stores pairs of (error,section_index) */ k0 = ae_round(sections.ptr.pp_double[k][0], _state); k1 = ae_round(sections.ptr.pp_double[k][1], _state); k2 = ae_round(sections.ptr.pp_double[k][2], _state); parametric_rdpanalyzesectionpar(x, k0, k2, d, &idx0, &e0, _state); parametric_rdpanalyzesectionpar(x, k2, k1, d, &idx1, &e1, _state); sections.ptr.pp_double[k][0] = (double)(k0); sections.ptr.pp_double[k][1] = (double)(k2); sections.ptr.pp_double[k][2] = (double)(idx0); sections.ptr.pp_double[k][3] = e0; tagheapreplacetopi(&heaperrors, &heaptags, *nsections, e0, k, _state); sections.ptr.pp_double[*nsections][0] = (double)(k2); sections.ptr.pp_double[*nsections][1] = (double)(k1); sections.ptr.pp_double[*nsections][2] = (double)(idx1); sections.ptr.pp_double[*nsections][3] = e1; tagheappushi(&heaperrors, &heaptags, nsections, e1, *nsections, _state); } /* * Convert from sections to indexes */ ae_vector_set_length(&buf0, *nsections+1, _state); for(i=0; i<=*nsections-1; i++) { buf0.ptr.p_double[i] = (double)(ae_round(sections.ptr.pp_double[i][0], _state)); } buf0.ptr.p_double[*nsections] = (double)(n-1); tagsortfast(&buf0, &buf1, *nsections+1, _state); ae_vector_set_length(idx2, *nsections+1, _state); for(i=0; i<=*nsections; i++) { idx2->ptr.p_int[i] = ae_round(buf0.ptr.p_double[i], _state); } ae_assert(idx2->ptr.p_int[0]==0, "RDP algorithm: integrity check failed", _state); ae_assert(idx2->ptr.p_int[*nsections]==n-1, "RDP algorithm: integrity check failed", _state); /* * Output sections: * * first NSection elements of X2/Y2 are filled by x/y at left boundaries of sections * * last element of X2/Y2 is filled by right boundary of rightmost section * * X2/Y2 is sorted by ascending of X2 */ ae_matrix_set_length(x2, *nsections+1, d, _state); for(i=0; i<=*nsections; i++) { for(j=0; j<=d-1; j++) { x2->ptr.pp_double[i][j] = x->ptr.pp_double[idx2->ptr.p_int[i]][j]; } } ae_frame_leave(_state); } /************************************************************************* Builds non-periodic parameterization for 2-dimensional spline *************************************************************************/ static void parametric_pspline2par(/* Real */ ae_matrix* xy, ae_int_t n, ae_int_t pt, /* Real */ ae_vector* p, ae_state *_state) { double v; ae_int_t i; ae_vector_clear(p); ae_assert(pt>=0&&pt<=2, "PSpline2Par: internal error!", _state); /* * Build parameterization: * * fill by non-normalized values * * normalize them so we have P[0]=0, P[N-1]=1. 
*/ ae_vector_set_length(p, n, _state); if( pt==0 ) { for(i=0; i<=n-1; i++) { p->ptr.p_double[i] = (double)(i); } } if( pt==1 ) { p->ptr.p_double[0] = (double)(0); for(i=1; i<=n-1; i++) { p->ptr.p_double[i] = p->ptr.p_double[i-1]+safepythag2(xy->ptr.pp_double[i][0]-xy->ptr.pp_double[i-1][0], xy->ptr.pp_double[i][1]-xy->ptr.pp_double[i-1][1], _state); } } if( pt==2 ) { p->ptr.p_double[0] = (double)(0); for(i=1; i<=n-1; i++) { p->ptr.p_double[i] = p->ptr.p_double[i-1]+ae_sqrt(safepythag2(xy->ptr.pp_double[i][0]-xy->ptr.pp_double[i-1][0], xy->ptr.pp_double[i][1]-xy->ptr.pp_double[i-1][1], _state), _state); } } v = 1/p->ptr.p_double[n-1]; ae_v_muld(&p->ptr.p_double[0], 1, ae_v_len(0,n-1), v); } /************************************************************************* Builds non-periodic parameterization for 3-dimensional spline *************************************************************************/ static void parametric_pspline3par(/* Real */ ae_matrix* xy, ae_int_t n, ae_int_t pt, /* Real */ ae_vector* p, ae_state *_state) { double v; ae_int_t i; ae_vector_clear(p); ae_assert(pt>=0&&pt<=2, "PSpline3Par: internal error!", _state); /* * Build parameterization: * * fill by non-normalized values * * normalize them so we have P[0]=0, P[N-1]=1. */ ae_vector_set_length(p, n, _state); if( pt==0 ) { for(i=0; i<=n-1; i++) { p->ptr.p_double[i] = (double)(i); } } if( pt==1 ) { p->ptr.p_double[0] = (double)(0); for(i=1; i<=n-1; i++) { p->ptr.p_double[i] = p->ptr.p_double[i-1]+safepythag3(xy->ptr.pp_double[i][0]-xy->ptr.pp_double[i-1][0], xy->ptr.pp_double[i][1]-xy->ptr.pp_double[i-1][1], xy->ptr.pp_double[i][2]-xy->ptr.pp_double[i-1][2], _state); } } if( pt==2 ) { p->ptr.p_double[0] = (double)(0); for(i=1; i<=n-1; i++) { p->ptr.p_double[i] = p->ptr.p_double[i-1]+ae_sqrt(safepythag3(xy->ptr.pp_double[i][0]-xy->ptr.pp_double[i-1][0], xy->ptr.pp_double[i][1]-xy->ptr.pp_double[i-1][1], xy->ptr.pp_double[i][2]-xy->ptr.pp_double[i-1][2], _state), _state); } } v = 1/p->ptr.p_double[n-1]; ae_v_muld(&p->ptr.p_double[0], 1, ae_v_len(0,n-1), v); } /************************************************************************* This function analyzes section of curve for processing by RDP algorithm: given set of points X,Y with indexes [I0,I1] it returns point with worst deviation from linear model (PARAMETRIC version which sees curve as X(t) with vector X). Input parameters: XY - array I0,I1 - interval (boundaries included) to process D - number of dimensions OUTPUT PARAMETERS: WorstIdx - index of worst point WorstError - error at worst point NOTE: this function guarantees that it returns exactly zero for a section with less than 3 points. -- ALGLIB PROJECT -- Copyright 02.10.2014 by Bochkanov Sergey *************************************************************************/ static void parametric_rdpanalyzesectionpar(/* Real */ ae_matrix* xy, ae_int_t i0, ae_int_t i1, ae_int_t d, ae_int_t* worstidx, double* worsterror, ae_state *_state) { ae_int_t i; ae_int_t j; double v; double d2; double ts; double vv; *worstidx = 0; *worsterror = 0; /* * Quick exit for 0, 1, 2 points */ if( i1-i0+1<3 ) { *worstidx = i0; *worsterror = 0.0; return; } /* * Estimate D2 - squared distance between XY[I1] and XY[I0]. * In case D2=0 handle it as special case. 
*/ d2 = 0.0; for(j=0; j<=d-1; j++) { d2 = d2+ae_sqr(xy->ptr.pp_double[i1][j]-xy->ptr.pp_double[i0][j], _state); } if( ae_fp_eq(d2,(double)(0)) ) { /* * First and last points are equal, interval evaluation is * trivial - we just calculate distance from all points to * the first/last one. */ *worstidx = i0; *worsterror = 0.0; for(i=i0+1; i<=i1-1; i++) { vv = 0.0; for(j=0; j<=d-1; j++) { v = xy->ptr.pp_double[i][j]-xy->ptr.pp_double[i0][j]; vv = vv+v*v; } vv = ae_sqrt(vv, _state); if( ae_fp_greater(vv,*worsterror) ) { *worsterror = vv; *worstidx = i; } } return; } /* * General case * * Current section of curve is modeled as x(t) = d*t+c, where * d = XY[I1]-XY[I0] * c = XY[I0] * t is in [0,1] */ *worstidx = i0; *worsterror = 0.0; for(i=i0+1; i<=i1-1; i++) { /* * Determine t_s - parameter value for projected point. */ ts = (double)(i-i0)/(double)(i1-i0); /* * Estimate error norm */ vv = 0.0; for(j=0; j<=d-1; j++) { v = (xy->ptr.pp_double[i1][j]-xy->ptr.pp_double[i0][j])*ts-(xy->ptr.pp_double[i][j]-xy->ptr.pp_double[i0][j]); vv = vv+ae_sqr(v, _state); } vv = ae_sqrt(vv, _state); if( ae_fp_greater(vv,*worsterror) ) { *worsterror = vv; *worstidx = i; } } } void _pspline2interpolant_init(void* _p, ae_state *_state) { pspline2interpolant *p = (pspline2interpolant*)_p; ae_touch_ptr((void*)p); ae_vector_init(&p->p, 0, DT_REAL, _state); _spline1dinterpolant_init(&p->x, _state); _spline1dinterpolant_init(&p->y, _state); } void _pspline2interpolant_init_copy(void* _dst, void* _src, ae_state *_state) { pspline2interpolant *dst = (pspline2interpolant*)_dst; pspline2interpolant *src = (pspline2interpolant*)_src; dst->n = src->n; dst->periodic = src->periodic; ae_vector_init_copy(&dst->p, &src->p, _state); _spline1dinterpolant_init_copy(&dst->x, &src->x, _state); _spline1dinterpolant_init_copy(&dst->y, &src->y, _state); } void _pspline2interpolant_clear(void* _p) { pspline2interpolant *p = (pspline2interpolant*)_p; ae_touch_ptr((void*)p); ae_vector_clear(&p->p); _spline1dinterpolant_clear(&p->x); _spline1dinterpolant_clear(&p->y); } void _pspline2interpolant_destroy(void* _p) { pspline2interpolant *p = (pspline2interpolant*)_p; ae_touch_ptr((void*)p); ae_vector_destroy(&p->p); _spline1dinterpolant_destroy(&p->x); _spline1dinterpolant_destroy(&p->y); } void _pspline3interpolant_init(void* _p, ae_state *_state) { pspline3interpolant *p = (pspline3interpolant*)_p; ae_touch_ptr((void*)p); ae_vector_init(&p->p, 0, DT_REAL, _state); _spline1dinterpolant_init(&p->x, _state); _spline1dinterpolant_init(&p->y, _state); _spline1dinterpolant_init(&p->z, _state); } void _pspline3interpolant_init_copy(void* _dst, void* _src, ae_state *_state) { pspline3interpolant *dst = (pspline3interpolant*)_dst; pspline3interpolant *src = (pspline3interpolant*)_src; dst->n = src->n; dst->periodic = src->periodic; ae_vector_init_copy(&dst->p, &src->p, _state); _spline1dinterpolant_init_copy(&dst->x, &src->x, _state); _spline1dinterpolant_init_copy(&dst->y, &src->y, _state); _spline1dinterpolant_init_copy(&dst->z, &src->z, _state); } void _pspline3interpolant_clear(void* _p) { pspline3interpolant *p = (pspline3interpolant*)_p; ae_touch_ptr((void*)p); ae_vector_clear(&p->p); _spline1dinterpolant_clear(&p->x); _spline1dinterpolant_clear(&p->y); _spline1dinterpolant_clear(&p->z); } void _pspline3interpolant_destroy(void* _p) { pspline3interpolant *p = (pspline3interpolant*)_p; ae_touch_ptr((void*)p); ae_vector_destroy(&p->p); _spline1dinterpolant_destroy(&p->x); _spline1dinterpolant_destroy(&p->y); _spline1dinterpolant_destroy(&p->z); } 
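The RDP helper above (parametric_rdpanalyzesectionpar) scores every interior point of a section by its distance to the straight segment between the section endpoints, with the segment parameterized linearly in the point index; the point with the largest distance becomes the next split point. A standalone sketch of that scoring for a D-dimensional section (illustrative only; it omits the coinciding-endpoints special case handled by the routine above):

    #include <cmath>
    #include <vector>

    // For points x[i0..i1] (each a vector of d coordinates) return the index of
    // the point with the largest distance to the linear model
    //   model(i) = x[i0] + ts*(x[i1]-x[i0]),  ts=(i-i0)/(i1-i0),
    // and that distance. Sections with fewer than 3 points get error 0.
    static void worst_deviation(const std::vector<std::vector<double>>& x,
                                int i0, int i1, int d,
                                int* worstidx, double* worsterror)
    {
        *worstidx = i0;
        *worsterror = 0.0;
        if( i1-i0+1 < 3 )
            return;
        for(int i=i0+1; i<=i1-1; i++)
        {
            double ts = double(i-i0)/double(i1-i0);
            double vv = 0.0;
            for(int j=0; j<d; j++)
            {
                double v = (x[i1][j]-x[i0][j])*ts - (x[i][j]-x[i0][j]);
                vv += v*v;
            }
            vv = std::sqrt(vv);
            if( vv > *worsterror )
            {
                *worsterror = vv;
                *worstidx = i;
            }
        }
    }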
/************************************************************************* This subroutine calculates the value of the trilinear or tricubic spline at the given point (X,Y,Z). INPUT PARAMETERS: C - coefficients table. Built by BuildBilinearSpline or BuildBicubicSpline. X, Y, Z - point Result: S(x,y,z) -- ALGLIB PROJECT -- Copyright 26.04.2012 by Bochkanov Sergey *************************************************************************/ double spline3dcalc(spline3dinterpolant* c, double x, double y, double z, ae_state *_state) { double v; double vx; double vy; double vxy; double result; ae_assert(c->stype==-1||c->stype==-3, "Spline3DCalc: incorrect C (incorrect parameter C.SType)", _state); ae_assert((ae_isfinite(x, _state)&&ae_isfinite(y, _state))&&ae_isfinite(z, _state), "Spline3DCalc: X=NaN/Infinite, Y=NaN/Infinite or Z=NaN/Infinite", _state); if( c->d!=1 ) { result = (double)(0); return result; } spline3d_spline3ddiff(c, x, y, z, &v, &vx, &vy, &vxy, _state); result = v; return result; } /************************************************************************* This subroutine performs linear transformation of the spline argument. INPUT PARAMETERS: C - spline interpolant AX, BX - transformation coefficients: x = A*u + B AY, BY - transformation coefficients: y = A*v + B AZ, BZ - transformation coefficients: z = A*w + B OUTPUT PARAMETERS: C - transformed spline -- ALGLIB PROJECT -- Copyright 26.04.2012 by Bochkanov Sergey *************************************************************************/ void spline3dlintransxyz(spline3dinterpolant* c, double ax, double bx, double ay, double by, double az, double bz, ae_state *_state) { ae_frame _frame_block; ae_vector x; ae_vector y; ae_vector z; ae_vector f; ae_vector v; ae_int_t i; ae_int_t j; ae_int_t k; ae_int_t di; ae_frame_make(_state, &_frame_block); ae_vector_init(&x, 0, DT_REAL, _state); ae_vector_init(&y, 0, DT_REAL, _state); ae_vector_init(&z, 0, DT_REAL, _state); ae_vector_init(&f, 0, DT_REAL, _state); ae_vector_init(&v, 0, DT_REAL, _state); ae_assert(c->stype==-3||c->stype==-1, "Spline3DLinTransXYZ: incorrect C (incorrect parameter C.SType)", _state); ae_vector_set_length(&x, c->n, _state); ae_vector_set_length(&y, c->m, _state); ae_vector_set_length(&z, c->l, _state); ae_vector_set_length(&f, c->m*c->n*c->l*c->d, _state); for(j=0; j<=c->n-1; j++) { x.ptr.p_double[j] = c->x.ptr.p_double[j]; } for(i=0; i<=c->m-1; i++) { y.ptr.p_double[i] = c->y.ptr.p_double[i]; } for(i=0; i<=c->l-1; i++) { z.ptr.p_double[i] = c->z.ptr.p_double[i]; } /* * Handle different combinations of zero/nonzero AX/AY/AZ */ if( (ae_fp_neq(ax,(double)(0))&&ae_fp_neq(ay,(double)(0)))&&ae_fp_neq(az,(double)(0)) ) { ae_v_move(&f.ptr.p_double[0], 1, &c->f.ptr.p_double[0], 1, ae_v_len(0,c->m*c->n*c->l*c->d-1)); } if( (ae_fp_eq(ax,(double)(0))&&ae_fp_neq(ay,(double)(0)))&&ae_fp_neq(az,(double)(0)) ) { for(i=0; i<=c->m-1; i++) { for(j=0; j<=c->l-1; j++) { spline3dcalcv(c, bx, y.ptr.p_double[i], z.ptr.p_double[j], &v, _state); for(k=0; k<=c->n-1; k++) { for(di=0; di<=c->d-1; di++) { f.ptr.p_double[c->d*(c->n*(c->m*j+i)+k)+di] = v.ptr.p_double[di]; } } } } ax = (double)(1); bx = (double)(0); } if( (ae_fp_neq(ax,(double)(0))&&ae_fp_eq(ay,(double)(0)))&&ae_fp_neq(az,(double)(0)) ) { for(i=0; i<=c->n-1; i++) { for(j=0; j<=c->l-1; j++) { spline3dcalcv(c, x.ptr.p_double[i], by, z.ptr.p_double[j], &v, _state); for(k=0; k<=c->m-1; k++) { for(di=0; di<=c->d-1; di++) { f.ptr.p_double[c->d*(c->n*(c->m*j+k)+i)+di] = v.ptr.p_double[di]; } } } } ay = (double)(1); by = (double)(0); } if( 
(ae_fp_neq(ax,(double)(0))&&ae_fp_neq(ay,(double)(0)))&&ae_fp_eq(az,(double)(0)) ) { for(i=0; i<=c->n-1; i++) { for(j=0; j<=c->m-1; j++) { spline3dcalcv(c, x.ptr.p_double[i], y.ptr.p_double[j], bz, &v, _state); for(k=0; k<=c->l-1; k++) { for(di=0; di<=c->d-1; di++) { f.ptr.p_double[c->d*(c->n*(c->m*k+j)+i)+di] = v.ptr.p_double[di]; } } } } az = (double)(1); bz = (double)(0); } if( (ae_fp_eq(ax,(double)(0))&&ae_fp_eq(ay,(double)(0)))&&ae_fp_neq(az,(double)(0)) ) { for(i=0; i<=c->l-1; i++) { spline3dcalcv(c, bx, by, z.ptr.p_double[i], &v, _state); for(k=0; k<=c->m-1; k++) { for(j=0; j<=c->n-1; j++) { for(di=0; di<=c->d-1; di++) { f.ptr.p_double[c->d*(c->n*(c->m*i+k)+j)+di] = v.ptr.p_double[di]; } } } } ax = (double)(1); bx = (double)(0); ay = (double)(1); by = (double)(0); } if( (ae_fp_eq(ax,(double)(0))&&ae_fp_neq(ay,(double)(0)))&&ae_fp_eq(az,(double)(0)) ) { for(i=0; i<=c->m-1; i++) { spline3dcalcv(c, bx, y.ptr.p_double[i], bz, &v, _state); for(k=0; k<=c->l-1; k++) { for(j=0; j<=c->n-1; j++) { for(di=0; di<=c->d-1; di++) { f.ptr.p_double[c->d*(c->n*(c->m*k+i)+j)+di] = v.ptr.p_double[di]; } } } } ax = (double)(1); bx = (double)(0); az = (double)(1); bz = (double)(0); } if( (ae_fp_neq(ax,(double)(0))&&ae_fp_eq(ay,(double)(0)))&&ae_fp_eq(az,(double)(0)) ) { for(i=0; i<=c->n-1; i++) { spline3dcalcv(c, x.ptr.p_double[i], by, bz, &v, _state); for(k=0; k<=c->l-1; k++) { for(j=0; j<=c->m-1; j++) { for(di=0; di<=c->d-1; di++) { f.ptr.p_double[c->d*(c->n*(c->m*k+j)+i)+di] = v.ptr.p_double[di]; } } } } ay = (double)(1); by = (double)(0); az = (double)(1); bz = (double)(0); } if( (ae_fp_eq(ax,(double)(0))&&ae_fp_eq(ay,(double)(0)))&&ae_fp_eq(az,(double)(0)) ) { spline3dcalcv(c, bx, by, bz, &v, _state); for(k=0; k<=c->l-1; k++) { for(j=0; j<=c->m-1; j++) { for(i=0; i<=c->n-1; i++) { for(di=0; di<=c->d-1; di++) { f.ptr.p_double[c->d*(c->n*(c->m*k+j)+i)+di] = v.ptr.p_double[di]; } } } } ax = (double)(1); bx = (double)(0); ay = (double)(1); by = (double)(0); az = (double)(1); bz = (double)(0); } /* * General case: AX<>0, AY<>0, AZ<>0 * Unpack, scale and pack again. */ for(i=0; i<=c->n-1; i++) { x.ptr.p_double[i] = (x.ptr.p_double[i]-bx)/ax; } for(i=0; i<=c->m-1; i++) { y.ptr.p_double[i] = (y.ptr.p_double[i]-by)/ay; } for(i=0; i<=c->l-1; i++) { z.ptr.p_double[i] = (z.ptr.p_double[i]-bz)/az; } if( c->stype==-1 ) { spline3dbuildtrilinearv(&x, c->n, &y, c->m, &z, c->l, &f, c->d, c, _state); } ae_frame_leave(_state); } /************************************************************************* This subroutine performs linear transformation of the spline. INPUT PARAMETERS: C - spline interpolant. 
A, B- transformation coefficients: S2(x,y) = A*S(x,y,z) + B OUTPUT PARAMETERS: C - transformed spline -- ALGLIB PROJECT -- Copyright 26.04.2012 by Bochkanov Sergey *************************************************************************/ void spline3dlintransf(spline3dinterpolant* c, double a, double b, ae_state *_state) { ae_frame _frame_block; ae_vector x; ae_vector y; ae_vector z; ae_vector f; ae_int_t i; ae_int_t j; ae_frame_make(_state, &_frame_block); ae_vector_init(&x, 0, DT_REAL, _state); ae_vector_init(&y, 0, DT_REAL, _state); ae_vector_init(&z, 0, DT_REAL, _state); ae_vector_init(&f, 0, DT_REAL, _state); ae_assert(c->stype==-3||c->stype==-1, "Spline3DLinTransF: incorrect C (incorrect parameter C.SType)", _state); ae_vector_set_length(&x, c->n, _state); ae_vector_set_length(&y, c->m, _state); ae_vector_set_length(&z, c->l, _state); ae_vector_set_length(&f, c->m*c->n*c->l*c->d, _state); for(j=0; j<=c->n-1; j++) { x.ptr.p_double[j] = c->x.ptr.p_double[j]; } for(i=0; i<=c->m-1; i++) { y.ptr.p_double[i] = c->y.ptr.p_double[i]; } for(i=0; i<=c->l-1; i++) { z.ptr.p_double[i] = c->z.ptr.p_double[i]; } for(i=0; i<=c->m*c->n*c->l*c->d-1; i++) { f.ptr.p_double[i] = a*c->f.ptr.p_double[i]+b; } if( c->stype==-1 ) { spline3dbuildtrilinearv(&x, c->n, &y, c->m, &z, c->l, &f, c->d, c, _state); } ae_frame_leave(_state); } /************************************************************************* This subroutine makes the copy of the spline model. INPUT PARAMETERS: C - spline interpolant OUTPUT PARAMETERS: CC - spline copy -- ALGLIB PROJECT -- Copyright 26.04.2012 by Bochkanov Sergey *************************************************************************/ void spline3dcopy(spline3dinterpolant* c, spline3dinterpolant* cc, ae_state *_state) { ae_int_t tblsize; _spline3dinterpolant_clear(cc); ae_assert(c->k==1||c->k==3, "Spline3DCopy: incorrect C (incorrect parameter C.K)", _state); cc->k = c->k; cc->n = c->n; cc->m = c->m; cc->l = c->l; cc->d = c->d; tblsize = c->n*c->m*c->l*c->d; cc->stype = c->stype; ae_vector_set_length(&cc->x, cc->n, _state); ae_vector_set_length(&cc->y, cc->m, _state); ae_vector_set_length(&cc->z, cc->l, _state); ae_vector_set_length(&cc->f, tblsize, _state); ae_v_move(&cc->x.ptr.p_double[0], 1, &c->x.ptr.p_double[0], 1, ae_v_len(0,cc->n-1)); ae_v_move(&cc->y.ptr.p_double[0], 1, &c->y.ptr.p_double[0], 1, ae_v_len(0,cc->m-1)); ae_v_move(&cc->z.ptr.p_double[0], 1, &c->z.ptr.p_double[0], 1, ae_v_len(0,cc->l-1)); ae_v_move(&cc->f.ptr.p_double[0], 1, &c->f.ptr.p_double[0], 1, ae_v_len(0,tblsize-1)); } /************************************************************************* Trilinear spline resampling INPUT PARAMETERS: A - array[0..OldXCount*OldYCount*OldZCount-1], function values at the old grid, : A[0] x=0,y=0,z=0 A[1] x=1,y=0,z=0 A[..] ... A[..] x=oldxcount-1,y=0,z=0 A[..] x=0,y=1,z=0 A[..] ... ... OldZCount - old Z-count, OldZCount>1 OldYCount - old Y-count, OldYCount>1 OldXCount - old X-count, OldXCount>1 NewZCount - new Z-count, NewZCount>1 NewYCount - new Y-count, NewYCount>1 NewXCount - new X-count, NewXCount>1 OUTPUT PARAMETERS: B - array[0..NewXCount*NewYCount*NewZCount-1], function values at the new grid: B[0] x=0,y=0,z=0 B[1] x=1,y=0,z=0 B[..] ... B[..] x=newxcount-1,y=0,z=0 B[..] x=0,y=1,z=0 B[..] ... ... 
-- ALGLIB routine -- 26.04.2012 Copyright by Bochkanov Sergey *************************************************************************/ void spline3dresampletrilinear(/* Real */ ae_vector* a, ae_int_t oldzcount, ae_int_t oldycount, ae_int_t oldxcount, ae_int_t newzcount, ae_int_t newycount, ae_int_t newxcount, /* Real */ ae_vector* b, ae_state *_state) { double xd; double yd; double zd; double c0; double c1; double c2; double c3; ae_int_t ix; ae_int_t iy; ae_int_t iz; ae_int_t i; ae_int_t j; ae_int_t k; ae_vector_clear(b); ae_assert((oldycount>1&&oldzcount>1)&&oldxcount>1, "Spline3DResampleTrilinear: length/width/height less than 1", _state); ae_assert((newycount>1&&newzcount>1)&&newxcount>1, "Spline3DResampleTrilinear: length/width/height less than 1", _state); ae_assert(a->cnt>=oldycount*oldzcount*oldxcount, "Spline3DResampleTrilinear: length/width/height less than 1", _state); ae_vector_set_length(b, newxcount*newycount*newzcount, _state); for(i=0; i<=newxcount-1; i++) { for(j=0; j<=newycount-1; j++) { for(k=0; k<=newzcount-1; k++) { ix = i*(oldxcount-1)/(newxcount-1); if( ix==oldxcount-1 ) { ix = oldxcount-2; } xd = (double)(i*(oldxcount-1))/(double)(newxcount-1)-ix; iy = j*(oldycount-1)/(newycount-1); if( iy==oldycount-1 ) { iy = oldycount-2; } yd = (double)(j*(oldycount-1))/(double)(newycount-1)-iy; iz = k*(oldzcount-1)/(newzcount-1); if( iz==oldzcount-1 ) { iz = oldzcount-2; } zd = (double)(k*(oldzcount-1))/(double)(newzcount-1)-iz; c0 = a->ptr.p_double[oldxcount*(oldycount*iz+iy)+ix]*(1-xd)+a->ptr.p_double[oldxcount*(oldycount*iz+iy)+(ix+1)]*xd; c1 = a->ptr.p_double[oldxcount*(oldycount*iz+(iy+1))+ix]*(1-xd)+a->ptr.p_double[oldxcount*(oldycount*iz+(iy+1))+(ix+1)]*xd; c2 = a->ptr.p_double[oldxcount*(oldycount*(iz+1)+iy)+ix]*(1-xd)+a->ptr.p_double[oldxcount*(oldycount*(iz+1)+iy)+(ix+1)]*xd; c3 = a->ptr.p_double[oldxcount*(oldycount*(iz+1)+(iy+1))+ix]*(1-xd)+a->ptr.p_double[oldxcount*(oldycount*(iz+1)+(iy+1))+(ix+1)]*xd; c0 = c0*(1-yd)+c1*yd; c1 = c2*(1-yd)+c3*yd; b->ptr.p_double[newxcount*(newycount*k+j)+i] = c0*(1-zd)+c1*zd; } } } } /************************************************************************* This subroutine builds trilinear vector-valued spline. INPUT PARAMETERS: X - spline abscissas, array[0..N-1] Y - spline ordinates, array[0..M-1] Z - spline applicates, array[0..L-1] F - function values, array[0..M*N*L*D-1]: * first D elements store D values at (X[0],Y[0],Z[0]) * next D elements store D values at (X[1],Y[0],Z[0]) * next D elements store D values at (X[2],Y[0],Z[0]) * ... * next D elements store D values at (X[0],Y[1],Z[0]) * next D elements store D values at (X[1],Y[1],Z[0]) * next D elements store D values at (X[2],Y[1],Z[0]) * ... * next D elements store D values at (X[0],Y[0],Z[1]) * next D elements store D values at (X[1],Y[0],Z[1]) * next D elements store D values at (X[2],Y[0],Z[1]) * ... * general form - D function values at (X[i],Y[j]) are stored at F[D*(N*(M*K+J)+I)...D*(N*(M*K+J)+I)+D-1]. 
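The F layout described above is the one non-obvious part of calling the trilinear builder: the D components of the value at grid node (X[i],Y[j],Z[k]) start at offset D*(N*(M*k+j)+i). A standalone sketch of filling such an array from a callback (illustrative; the helper name and callback signature are made up):

    #include <cstddef>
    #include <vector>

    // Fill the flat value array used by the trilinear builder: the d components
    // of f(x[i],y[j],z[k]) occupy indices d*(n*(m*k+j)+i) .. d*(n*(m*k+j)+i)+d-1.
    template<typename Func>  // Func: void(double x, double y, double z, double* out_d_values)
    static std::vector<double> pack_grid_values(const std::vector<double>& x,
                                                const std::vector<double>& y,
                                                const std::vector<double>& z,
                                                int d, Func f)
    {
        const int n = (int)x.size(), m = (int)y.size(), l = (int)z.size();
        std::vector<double> fv((std::size_t)n*m*l*d);
        for(int k=0; k<l; k++)
            for(int j=0; j<m; j++)
                for(int i=0; i<n; i++)
                    f(x[i], y[j], z[k], &fv[(std::size_t)d*(n*(m*k+j)+i)]);
        return fv;
    }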
M,N, L - grid size, M>=2, N>=2, L>=2 D - vector dimension, D>=1 OUTPUT PARAMETERS: C - spline interpolant -- ALGLIB PROJECT -- Copyright 26.04.2012 by Bochkanov Sergey *************************************************************************/ void spline3dbuildtrilinearv(/* Real */ ae_vector* x, ae_int_t n, /* Real */ ae_vector* y, ae_int_t m, /* Real */ ae_vector* z, ae_int_t l, /* Real */ ae_vector* f, ae_int_t d, spline3dinterpolant* c, ae_state *_state) { double t; ae_int_t tblsize; ae_int_t i; ae_int_t j; ae_int_t k; ae_int_t i0; ae_int_t j0; _spline3dinterpolant_clear(c); ae_assert(m>=2, "Spline3DBuildTrilinearV: M<2", _state); ae_assert(n>=2, "Spline3DBuildTrilinearV: N<2", _state); ae_assert(l>=2, "Spline3DBuildTrilinearV: L<2", _state); ae_assert(d>=1, "Spline3DBuildTrilinearV: D<1", _state); ae_assert((x->cnt>=n&&y->cnt>=m)&&z->cnt>=l, "Spline3DBuildTrilinearV: length of X, Y or Z is too short (Length(X/Y/Z)cnt>=tblsize, "Spline3DBuildTrilinearV: length of F is too short (Length(F)k = 1; c->n = n; c->m = m; c->l = l; c->d = d; c->stype = -1; ae_vector_set_length(&c->x, c->n, _state); ae_vector_set_length(&c->y, c->m, _state); ae_vector_set_length(&c->z, c->l, _state); ae_vector_set_length(&c->f, tblsize, _state); for(i=0; i<=c->n-1; i++) { c->x.ptr.p_double[i] = x->ptr.p_double[i]; } for(i=0; i<=c->m-1; i++) { c->y.ptr.p_double[i] = y->ptr.p_double[i]; } for(i=0; i<=c->l-1; i++) { c->z.ptr.p_double[i] = z->ptr.p_double[i]; } for(i=0; i<=tblsize-1; i++) { c->f.ptr.p_double[i] = f->ptr.p_double[i]; } /* * Sort points: * * sort x; * * sort y; * * sort z. */ for(j=0; j<=c->n-1; j++) { k = j; for(i=j+1; i<=c->n-1; i++) { if( ae_fp_less(c->x.ptr.p_double[i],c->x.ptr.p_double[k]) ) { k = i; } } if( k!=j ) { for(i=0; i<=c->m-1; i++) { for(j0=0; j0<=c->l-1; j0++) { for(i0=0; i0<=c->d-1; i0++) { t = c->f.ptr.p_double[c->d*(c->n*(c->m*j0+i)+j)+i0]; c->f.ptr.p_double[c->d*(c->n*(c->m*j0+i)+j)+i0] = c->f.ptr.p_double[c->d*(c->n*(c->m*j0+i)+k)+i0]; c->f.ptr.p_double[c->d*(c->n*(c->m*j0+i)+k)+i0] = t; } } } t = c->x.ptr.p_double[j]; c->x.ptr.p_double[j] = c->x.ptr.p_double[k]; c->x.ptr.p_double[k] = t; } } for(i=0; i<=c->m-1; i++) { k = i; for(j=i+1; j<=c->m-1; j++) { if( ae_fp_less(c->y.ptr.p_double[j],c->y.ptr.p_double[k]) ) { k = j; } } if( k!=i ) { for(j=0; j<=c->n-1; j++) { for(j0=0; j0<=c->l-1; j0++) { for(i0=0; i0<=c->d-1; i0++) { t = c->f.ptr.p_double[c->d*(c->n*(c->m*j0+i)+j)+i0]; c->f.ptr.p_double[c->d*(c->n*(c->m*j0+i)+j)+i0] = c->f.ptr.p_double[c->d*(c->n*(c->m*j0+k)+j)+i0]; c->f.ptr.p_double[c->d*(c->n*(c->m*j0+k)+j)+i0] = t; } } } t = c->y.ptr.p_double[i]; c->y.ptr.p_double[i] = c->y.ptr.p_double[k]; c->y.ptr.p_double[k] = t; } } for(k=0; k<=c->l-1; k++) { i = k; for(j=i+1; j<=c->l-1; j++) { if( ae_fp_less(c->z.ptr.p_double[j],c->z.ptr.p_double[i]) ) { i = j; } } if( i!=k ) { for(j=0; j<=c->m-1; j++) { for(j0=0; j0<=c->n-1; j0++) { for(i0=0; i0<=c->d-1; i0++) { t = c->f.ptr.p_double[c->d*(c->n*(c->m*k+j)+j0)+i0]; c->f.ptr.p_double[c->d*(c->n*(c->m*k+j)+j0)+i0] = c->f.ptr.p_double[c->d*(c->n*(c->m*i+j)+j0)+i0]; c->f.ptr.p_double[c->d*(c->n*(c->m*i+j)+j0)+i0] = t; } } } t = c->z.ptr.p_double[k]; c->z.ptr.p_double[k] = c->z.ptr.p_double[i]; c->z.ptr.p_double[i] = t; } } } /************************************************************************* This subroutine calculates bilinear or bicubic vector-valued spline at the given point (X,Y,Z). INPUT PARAMETERS: C - spline interpolant. X, Y, Z - point F - output buffer, possibly preallocated array. 
In case array size is large enough to store result, it is not reallocated. Array which is too short will be reallocated OUTPUT PARAMETERS: F - array[D] (or larger) which stores function values -- ALGLIB PROJECT -- Copyright 26.04.2012 by Bochkanov Sergey *************************************************************************/ void spline3dcalcvbuf(spline3dinterpolant* c, double x, double y, double z, /* Real */ ae_vector* f, ae_state *_state) { double xd; double yd; double zd; double c0; double c1; double c2; double c3; ae_int_t ix; ae_int_t iy; ae_int_t iz; ae_int_t l; ae_int_t r; ae_int_t h; ae_int_t i; ae_assert(c->stype==-1||c->stype==-3, "Spline3DCalcVBuf: incorrect C (incorrect parameter C.SType)", _state); ae_assert((ae_isfinite(x, _state)&&ae_isfinite(y, _state))&&ae_isfinite(z, _state), "Spline3DCalcVBuf: X, Y or Z contains NaN/Infinite", _state); rvectorsetlengthatleast(f, c->d, _state); /* * Binary search in the [ x[0], ..., x[n-2] ] (x[n-1] is not included) */ l = 0; r = c->n-1; while(l!=r-1) { h = (l+r)/2; if( ae_fp_greater_eq(c->x.ptr.p_double[h],x) ) { r = h; } else { l = h; } } ix = l; /* * Binary search in the [ y[0], ..., y[n-2] ] (y[n-1] is not included) */ l = 0; r = c->m-1; while(l!=r-1) { h = (l+r)/2; if( ae_fp_greater_eq(c->y.ptr.p_double[h],y) ) { r = h; } else { l = h; } } iy = l; /* * Binary search in the [ z[0], ..., z[n-2] ] (z[n-1] is not included) */ l = 0; r = c->l-1; while(l!=r-1) { h = (l+r)/2; if( ae_fp_greater_eq(c->z.ptr.p_double[h],z) ) { r = h; } else { l = h; } } iz = l; xd = (x-c->x.ptr.p_double[ix])/(c->x.ptr.p_double[ix+1]-c->x.ptr.p_double[ix]); yd = (y-c->y.ptr.p_double[iy])/(c->y.ptr.p_double[iy+1]-c->y.ptr.p_double[iy]); zd = (z-c->z.ptr.p_double[iz])/(c->z.ptr.p_double[iz+1]-c->z.ptr.p_double[iz]); for(i=0; i<=c->d-1; i++) { /* * Trilinear interpolation */ if( c->stype==-1 ) { c0 = c->f.ptr.p_double[c->d*(c->n*(c->m*iz+iy)+ix)+i]*(1-xd)+c->f.ptr.p_double[c->d*(c->n*(c->m*iz+iy)+(ix+1))+i]*xd; c1 = c->f.ptr.p_double[c->d*(c->n*(c->m*iz+(iy+1))+ix)+i]*(1-xd)+c->f.ptr.p_double[c->d*(c->n*(c->m*iz+(iy+1))+(ix+1))+i]*xd; c2 = c->f.ptr.p_double[c->d*(c->n*(c->m*(iz+1)+iy)+ix)+i]*(1-xd)+c->f.ptr.p_double[c->d*(c->n*(c->m*(iz+1)+iy)+(ix+1))+i]*xd; c3 = c->f.ptr.p_double[c->d*(c->n*(c->m*(iz+1)+(iy+1))+ix)+i]*(1-xd)+c->f.ptr.p_double[c->d*(c->n*(c->m*(iz+1)+(iy+1))+(ix+1))+i]*xd; c0 = c0*(1-yd)+c1*yd; c1 = c2*(1-yd)+c3*yd; f->ptr.p_double[i] = c0*(1-zd)+c1*zd; } } } /************************************************************************* This subroutine calculates trilinear or tricubic vector-valued spline at the given point (X,Y,Z). INPUT PARAMETERS: C - spline interpolant. X, Y, Z - point OUTPUT PARAMETERS: F - array[D] which stores function values. F is out-parameter and it is reallocated after call to this function. In case you want to reuse previously allocated F, you may use Spline2DCalcVBuf(), which reallocates F only when it is too small. 
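Once the bracketing cell and the normalized offsets xd, yd, zd are known, the trilinear value is just a chain of linear blends along x, then y, then z, exactly as in the c0..c3 arithmetic used by Spline3DCalcVBuf above. A standalone sketch of that blend for a single component (illustrative; the corner-value names encode the (x,y,z) corner):

    // Trilinear blend of the 8 corner values of one cell; xd,yd,zd in [0,1] are
    // the normalized offsets inside the cell.
    static double trilinear(double f000, double f100, double f010, double f110,
                            double f001, double f101, double f011, double f111,
                            double xd, double yd, double zd)
    {
        double c0 = f000*(1-xd) + f100*xd;   // blend along x at (y0,z0)
        double c1 = f010*(1-xd) + f110*xd;   // blend along x at (y1,z0)
        double c2 = f001*(1-xd) + f101*xd;   // blend along x at (y0,z1)
        double c3 = f011*(1-xd) + f111*xd;   // blend along x at (y1,z1)
        double d0 = c0*(1-yd) + c1*yd;       // blend along y at z0
        double d1 = c2*(1-yd) + c3*yd;       // blend along y at z1
        return d0*(1-zd) + d1*zd;            // blend along z
    }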
-- ALGLIB PROJECT -- Copyright 26.04.2012 by Bochkanov Sergey *************************************************************************/ void spline3dcalcv(spline3dinterpolant* c, double x, double y, double z, /* Real */ ae_vector* f, ae_state *_state) { ae_vector_clear(f); ae_assert(c->stype==-1||c->stype==-3, "Spline3DCalcV: incorrect C (incorrect parameter C.SType)", _state); ae_assert((ae_isfinite(x, _state)&&ae_isfinite(y, _state))&&ae_isfinite(z, _state), "Spline3DCalcV: X=NaN/Infinite, Y=NaN/Infinite or Z=NaN/Infinite", _state); ae_vector_set_length(f, c->d, _state); spline3dcalcvbuf(c, x, y, z, f, _state); } /************************************************************************* This subroutine unpacks tri-dimensional spline into the coefficients table INPUT PARAMETERS: C - spline interpolant. Result: N - grid size (X) M - grid size (Y) L - grid size (Z) D - number of components SType- spline type. Currently, only one spline type is supported: trilinear spline, as indicated by SType=1. Tbl - spline coefficients: [0..(N-1)*(M-1)*(L-1)*D-1, 0..13]. For T=0..D-1 (component index), I = 0...N-2 (x index), J=0..M-2 (y index), K=0..L-2 (z index): Q := T + I*D + J*D*(N-1) + K*D*(N-1)*(M-1), Q-th row stores decomposition for T-th component of the vector-valued function Tbl[Q,0] = X[i] Tbl[Q,1] = X[i+1] Tbl[Q,2] = Y[j] Tbl[Q,3] = Y[j+1] Tbl[Q,4] = Z[k] Tbl[Q,5] = Z[k+1] Tbl[Q,6] = C000 Tbl[Q,7] = C100 Tbl[Q,8] = C010 Tbl[Q,9] = C110 Tbl[Q,10]= C001 Tbl[Q,11]= C101 Tbl[Q,12]= C011 Tbl[Q,13]= C111 On each grid square spline is equals to: S(x) = SUM(c[i,j,k]*(x^i)*(y^j)*(z^k), i=0..1, j=0..1, k=0..1) t = x-x[j] u = y-y[i] v = z-z[k] NOTE: format of Tbl is given for SType=1. Future versions of ALGLIB can use different formats for different values of SType. 
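Each row of the unpacked table fully describes one cell: columns 0..5 hold the cell bounds and columns 6..13 hold the eight coefficients, the coefficient of t^i*u^j*v^k being stored at column 6+2*(2*k+j)+i with t=x-X[i], u=y-Y[j], v=z-Z[k]. A standalone sketch of evaluating one such row at a point inside its cell (illustrative; the function name is made up):

    // Evaluate one row of the table returned by Spline3DUnpackV at (x,y,z)
    // lying inside that row's cell. row points at the 14 values of the row.
    static double eval_unpacked_cell(const double* row, double x, double y, double z)
    {
        double t = x - row[0];               // offset from X[i]
        double u = y - row[2];               // offset from Y[j]
        double v = z - row[4];               // offset from Z[k]
        double s = 0.0;
        for(int k=0; k<=1; k++)
            for(int j=0; j<=1; j++)
                for(int i=0; i<=1; i++)
                {
                    double ti = (i==0 ? 1.0 : t);
                    double uj = (j==0 ? 1.0 : u);
                    double vk = (k==0 ? 1.0 : v);
                    s += row[6+2*(2*k+j)+i]*ti*uj*vk;
                }
        return s;
    }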
-- ALGLIB PROJECT -- Copyright 26.04.2012 by Bochkanov Sergey *************************************************************************/ void spline3dunpackv(spline3dinterpolant* c, ae_int_t* n, ae_int_t* m, ae_int_t* l, ae_int_t* d, ae_int_t* stype, /* Real */ ae_matrix* tbl, ae_state *_state) { ae_int_t p; ae_int_t ci; ae_int_t cj; ae_int_t ck; double du; double dv; double dw; ae_int_t i; ae_int_t j; ae_int_t k; ae_int_t di; ae_int_t i0; *n = 0; *m = 0; *l = 0; *d = 0; *stype = 0; ae_matrix_clear(tbl); ae_assert(c->stype==-1, "Spline3DUnpackV: incorrect C (incorrect parameter C.SType)", _state); *n = c->n; *m = c->m; *l = c->l; *d = c->d; *stype = ae_iabs(c->stype, _state); ae_matrix_set_length(tbl, (*n-1)*(*m-1)*(*l-1)*(*d), 14, _state); /* * Fill */ for(i=0; i<=*n-2; i++) { for(j=0; j<=*m-2; j++) { for(k=0; k<=*l-2; k++) { for(di=0; di<=*d-1; di++) { p = *d*((*n-1)*((*m-1)*k+j)+i)+di; tbl->ptr.pp_double[p][0] = c->x.ptr.p_double[i]; tbl->ptr.pp_double[p][1] = c->x.ptr.p_double[i+1]; tbl->ptr.pp_double[p][2] = c->y.ptr.p_double[j]; tbl->ptr.pp_double[p][3] = c->y.ptr.p_double[j+1]; tbl->ptr.pp_double[p][4] = c->z.ptr.p_double[k]; tbl->ptr.pp_double[p][5] = c->z.ptr.p_double[k+1]; du = 1/(tbl->ptr.pp_double[p][1]-tbl->ptr.pp_double[p][0]); dv = 1/(tbl->ptr.pp_double[p][3]-tbl->ptr.pp_double[p][2]); dw = 1/(tbl->ptr.pp_double[p][5]-tbl->ptr.pp_double[p][4]); /* * Trilinear interpolation */ if( c->stype==-1 ) { for(i0=6; i0<=13; i0++) { tbl->ptr.pp_double[p][i0] = (double)(0); } tbl->ptr.pp_double[p][6+2*(2*0+0)+0] = c->f.ptr.p_double[*d*(*n*(*m*k+j)+i)+di]; tbl->ptr.pp_double[p][6+2*(2*0+0)+1] = c->f.ptr.p_double[*d*(*n*(*m*k+j)+(i+1))+di]-c->f.ptr.p_double[*d*(*n*(*m*k+j)+i)+di]; tbl->ptr.pp_double[p][6+2*(2*0+1)+0] = c->f.ptr.p_double[*d*(*n*(*m*k+(j+1))+i)+di]-c->f.ptr.p_double[*d*(*n*(*m*k+j)+i)+di]; tbl->ptr.pp_double[p][6+2*(2*0+1)+1] = c->f.ptr.p_double[*d*(*n*(*m*k+(j+1))+(i+1))+di]-c->f.ptr.p_double[*d*(*n*(*m*k+(j+1))+i)+di]-c->f.ptr.p_double[*d*(*n*(*m*k+j)+(i+1))+di]+c->f.ptr.p_double[*d*(*n*(*m*k+j)+i)+di]; tbl->ptr.pp_double[p][6+2*(2*1+0)+0] = c->f.ptr.p_double[*d*(*n*(*m*(k+1)+j)+i)+di]-c->f.ptr.p_double[*d*(*n*(*m*k+j)+i)+di]; tbl->ptr.pp_double[p][6+2*(2*1+0)+1] = c->f.ptr.p_double[*d*(*n*(*m*(k+1)+j)+(i+1))+di]-c->f.ptr.p_double[*d*(*n*(*m*(k+1)+j)+i)+di]-c->f.ptr.p_double[*d*(*n*(*m*k+j)+(i+1))+di]+c->f.ptr.p_double[*d*(*n*(*m*k+j)+i)+di]; tbl->ptr.pp_double[p][6+2*(2*1+1)+0] = c->f.ptr.p_double[*d*(*n*(*m*(k+1)+(j+1))+i)+di]-c->f.ptr.p_double[*d*(*n*(*m*(k+1)+j)+i)+di]-c->f.ptr.p_double[*d*(*n*(*m*k+(j+1))+i)+di]+c->f.ptr.p_double[*d*(*n*(*m*k+j)+i)+di]; tbl->ptr.pp_double[p][6+2*(2*1+1)+1] = c->f.ptr.p_double[*d*(*n*(*m*(k+1)+(j+1))+(i+1))+di]-c->f.ptr.p_double[*d*(*n*(*m*(k+1)+(j+1))+i)+di]-c->f.ptr.p_double[*d*(*n*(*m*(k+1)+j)+(i+1))+di]+c->f.ptr.p_double[*d*(*n*(*m*(k+1)+j)+i)+di]-c->f.ptr.p_double[*d*(*n*(*m*k+(j+1))+(i+1))+di]+c->f.ptr.p_double[*d*(*n*(*m*k+(j+1))+i)+di]+c->f.ptr.p_double[*d*(*n*(*m*k+j)+(i+1))+di]-c->f.ptr.p_double[*d*(*n*(*m*k+j)+i)+di]; } /* * Rescale Cij */ for(ci=0; ci<=1; ci++) { for(cj=0; cj<=1; cj++) { for(ck=0; ck<=1; ck++) { tbl->ptr.pp_double[p][6+2*(2*ck+cj)+ci] = tbl->ptr.pp_double[p][6+2*(2*ck+cj)+ci]*ae_pow(du, (double)(ci), _state)*ae_pow(dv, (double)(cj), _state)*ae_pow(dw, (double)(ck), _state); } } } } } } } } /************************************************************************* This subroutine calculates the value of the trilinear(or tricubic;possible will be later) spline at the given point X(and its derivatives; 
possible will be later). INPUT PARAMETERS: C - spline interpolant. X, Y, Z - point OUTPUT PARAMETERS: F - S(x,y,z) FX - dS(x,y,z)/dX FY - dS(x,y,z)/dY FXY - d2S(x,y,z)/dXdY -- ALGLIB PROJECT -- Copyright 26.04.2012 by Bochkanov Sergey *************************************************************************/ static void spline3d_spline3ddiff(spline3dinterpolant* c, double x, double y, double z, double* f, double* fx, double* fy, double* fxy, ae_state *_state) { double xd; double yd; double zd; double c0; double c1; double c2; double c3; ae_int_t ix; ae_int_t iy; ae_int_t iz; ae_int_t l; ae_int_t r; ae_int_t h; *f = 0; *fx = 0; *fy = 0; *fxy = 0; ae_assert(c->stype==-1||c->stype==-3, "Spline3DDiff: incorrect C (incorrect parameter C.SType)", _state); ae_assert(ae_isfinite(x, _state)&&ae_isfinite(y, _state), "Spline3DDiff: X or Y contains NaN or Infinite value", _state); /* * Prepare F, dF/dX, dF/dY, d2F/dXdY */ *f = (double)(0); *fx = (double)(0); *fy = (double)(0); *fxy = (double)(0); if( c->d!=1 ) { return; } /* * Binary search in the [ x[0], ..., x[n-2] ] (x[n-1] is not included) */ l = 0; r = c->n-1; while(l!=r-1) { h = (l+r)/2; if( ae_fp_greater_eq(c->x.ptr.p_double[h],x) ) { r = h; } else { l = h; } } ix = l; /* * Binary search in the [ y[0], ..., y[n-2] ] (y[n-1] is not included) */ l = 0; r = c->m-1; while(l!=r-1) { h = (l+r)/2; if( ae_fp_greater_eq(c->y.ptr.p_double[h],y) ) { r = h; } else { l = h; } } iy = l; /* * Binary search in the [ z[0], ..., z[n-2] ] (z[n-1] is not included) */ l = 0; r = c->l-1; while(l!=r-1) { h = (l+r)/2; if( ae_fp_greater_eq(c->z.ptr.p_double[h],z) ) { r = h; } else { l = h; } } iz = l; xd = (x-c->x.ptr.p_double[ix])/(c->x.ptr.p_double[ix+1]-c->x.ptr.p_double[ix]); yd = (y-c->y.ptr.p_double[iy])/(c->y.ptr.p_double[iy+1]-c->y.ptr.p_double[iy]); zd = (z-c->z.ptr.p_double[iz])/(c->z.ptr.p_double[iz+1]-c->z.ptr.p_double[iz]); /* * Trilinear interpolation */ if( c->stype==-1 ) { c0 = c->f.ptr.p_double[c->n*(c->m*iz+iy)+ix]*(1-xd)+c->f.ptr.p_double[c->n*(c->m*iz+iy)+(ix+1)]*xd; c1 = c->f.ptr.p_double[c->n*(c->m*iz+(iy+1))+ix]*(1-xd)+c->f.ptr.p_double[c->n*(c->m*iz+(iy+1))+(ix+1)]*xd; c2 = c->f.ptr.p_double[c->n*(c->m*(iz+1)+iy)+ix]*(1-xd)+c->f.ptr.p_double[c->n*(c->m*(iz+1)+iy)+(ix+1)]*xd; c3 = c->f.ptr.p_double[c->n*(c->m*(iz+1)+(iy+1))+ix]*(1-xd)+c->f.ptr.p_double[c->n*(c->m*(iz+1)+(iy+1))+(ix+1)]*xd; c0 = c0*(1-yd)+c1*yd; c1 = c2*(1-yd)+c3*yd; *f = c0*(1-zd)+c1*zd; } } void _spline3dinterpolant_init(void* _p, ae_state *_state) { spline3dinterpolant *p = (spline3dinterpolant*)_p; ae_touch_ptr((void*)p); ae_vector_init(&p->x, 0, DT_REAL, _state); ae_vector_init(&p->y, 0, DT_REAL, _state); ae_vector_init(&p->z, 0, DT_REAL, _state); ae_vector_init(&p->f, 0, DT_REAL, _state); } void _spline3dinterpolant_init_copy(void* _dst, void* _src, ae_state *_state) { spline3dinterpolant *dst = (spline3dinterpolant*)_dst; spline3dinterpolant *src = (spline3dinterpolant*)_src; dst->k = src->k; dst->stype = src->stype; dst->n = src->n; dst->m = src->m; dst->l = src->l; dst->d = src->d; ae_vector_init_copy(&dst->x, &src->x, _state); ae_vector_init_copy(&dst->y, &src->y, _state); ae_vector_init_copy(&dst->z, &src->z, _state); ae_vector_init_copy(&dst->f, &src->f, _state); } void _spline3dinterpolant_clear(void* _p) { spline3dinterpolant *p = (spline3dinterpolant*)_p; ae_touch_ptr((void*)p); ae_vector_clear(&p->x); ae_vector_clear(&p->y); ae_vector_clear(&p->z); ae_vector_clear(&p->f); } void _spline3dinterpolant_destroy(void* _p) { spline3dinterpolant *p = 
(spline3dinterpolant*)_p; ae_touch_ptr((void*)p); ae_vector_destroy(&p->x); ae_vector_destroy(&p->y); ae_vector_destroy(&p->z); ae_vector_destroy(&p->f); } /************************************************************************* Conversion from barycentric representation to Chebyshev basis. This function has O(N^2) complexity. INPUT PARAMETERS: P - polynomial in barycentric form A,B - base interval for Chebyshev polynomials (see below) A<>B OUTPUT PARAMETERS T - coefficients of Chebyshev representation; P(x) = sum { T[i]*Ti(2*(x-A)/(B-A)-1), i=0..N-1 }, where Ti - I-th Chebyshev polynomial. NOTES: barycentric interpolant passed as P may be either polynomial obtained from polynomial interpolation/ fitting or rational function which is NOT polynomial. We can't distinguish between these two cases, and this algorithm just tries to work assuming that P IS a polynomial. If not, algorithm will return results, but they won't have any meaning. -- ALGLIB -- Copyright 30.09.2010 by Bochkanov Sergey *************************************************************************/ void polynomialbar2cheb(barycentricinterpolant* p, double a, double b, /* Real */ ae_vector* t, ae_state *_state) { ae_frame _frame_block; ae_int_t i; ae_int_t k; ae_vector vp; ae_vector vx; ae_vector tk; ae_vector tk1; double v; ae_frame_make(_state, &_frame_block); ae_vector_clear(t); ae_vector_init(&vp, 0, DT_REAL, _state); ae_vector_init(&vx, 0, DT_REAL, _state); ae_vector_init(&tk, 0, DT_REAL, _state); ae_vector_init(&tk1, 0, DT_REAL, _state); ae_assert(ae_isfinite(a, _state), "PolynomialBar2Cheb: A is not finite!", _state); ae_assert(ae_isfinite(b, _state), "PolynomialBar2Cheb: B is not finite!", _state); ae_assert(ae_fp_neq(a,b), "PolynomialBar2Cheb: A=B!", _state); ae_assert(p->n>0, "PolynomialBar2Cheb: P is not correctly initialized barycentric interpolant!", _state); /* * Calculate function values on a Chebyshev grid */ ae_vector_set_length(&vp, p->n, _state); ae_vector_set_length(&vx, p->n, _state); for(i=0; i<=p->n-1; i++) { vx.ptr.p_double[i] = ae_cos(ae_pi*(i+0.5)/p->n, _state); vp.ptr.p_double[i] = barycentriccalc(p, 0.5*(vx.ptr.p_double[i]+1)*(b-a)+a, _state); } /* * T[0] */ ae_vector_set_length(t, p->n, _state); v = (double)(0); for(i=0; i<=p->n-1; i++) { v = v+vp.ptr.p_double[i]; } t->ptr.p_double[0] = v/p->n; /* * other T's. * * NOTES: * 1. TK stores T{k} on VX, TK1 stores T{k-1} on VX * 2. we can do same calculations with fast DCT, but it * * adds dependencies * * still leaves us with O(N^2) algorithm because * preparation of function values is O(N^2) process */ if( p->n>1 ) { ae_vector_set_length(&tk, p->n, _state); ae_vector_set_length(&tk1, p->n, _state); for(i=0; i<=p->n-1; i++) { tk.ptr.p_double[i] = vx.ptr.p_double[i]; tk1.ptr.p_double[i] = (double)(1); } for(k=1; k<=p->n-1; k++) { /* * calculate discrete product of function vector and TK */ v = ae_v_dotproduct(&tk.ptr.p_double[0], 1, &vp.ptr.p_double[0], 1, ae_v_len(0,p->n-1)); t->ptr.p_double[k] = v/(0.5*p->n); /* * Update TK and TK1 */ for(i=0; i<=p->n-1; i++) { v = 2*vx.ptr.p_double[i]*tk.ptr.p_double[i]-tk1.ptr.p_double[i]; tk1.ptr.p_double[i] = tk.ptr.p_double[i]; tk.ptr.p_double[i] = v; } } } ae_frame_leave(_state); } /************************************************************************* Conversion from Chebyshev basis to barycentric representation. This function has O(N^2) complexity. INPUT PARAMETERS: T - coefficients of Chebyshev representation; P(x) = sum { T[i]*Ti(2*(x-A)/(B-A)-1), i=0..N }, where Ti - I-th Chebyshev polynomial. 
N - number of coefficients: * if given, only leading N elements of T are used * if not given, automatically determined from size of T A,B - base interval for Chebyshev polynomials (see above) A=1, "PolynomialBar2Cheb: N<1", _state); ae_assert(t->cnt>=n, "PolynomialBar2Cheb: Length(T)ptr.p_double[0]; tk1 = (double)(1); tk = vx; for(k=1; k<=n-1; k++) { vy = vy+t->ptr.p_double[k]*tk; v = 2*vx*tk-tk1; tk1 = tk; tk = v; } y.ptr.p_double[i] = vy; } /* * Build barycentric interpolant, map grid from [-1,+1] to [A,B] */ polynomialbuildcheb1(a, b, &y, n, p, _state); ae_frame_leave(_state); } /************************************************************************* Conversion from barycentric representation to power basis. This function has O(N^2) complexity. INPUT PARAMETERS: P - polynomial in barycentric form C - offset (see below); 0.0 is used as default value. S - scale (see below); 1.0 is used as default value. S<>0. OUTPUT PARAMETERS A - coefficients, P(x) = sum { A[i]*((X-C)/S)^i, i=0..N-1 } N - number of coefficients (polynomial degree plus 1) NOTES: 1. this function accepts offset and scale, which can be set to improve numerical properties of polynomial. For example, if P was obtained as result of interpolation on [-1,+1], you can set C=0 and S=1 and represent P as sum of 1, x, x^2, x^3 and so on. In most cases you it is exactly what you need. However, if your interpolation model was built on [999,1001], you will see significant growth of numerical errors when using {1, x, x^2, x^3} as basis. Representing P as sum of 1, (x-1000), (x-1000)^2, (x-1000)^3 will be better option. Such representation can be obtained by using 1000.0 as offset C and 1.0 as scale S. 2. power basis is ill-conditioned and tricks described above can't solve this problem completely. This function will return coefficients in any case, but for N>8 they will become unreliable. However, N's less than 5 are pretty safe. 3. barycentric interpolant passed as P may be either polynomial obtained from polynomial interpolation/ fitting or rational function which is NOT polynomial. We can't distinguish between these two cases, and this algorithm just tries to work assuming that P IS a polynomial. If not, algorithm will return results, but they won't have any meaning. -- ALGLIB -- Copyright 30.09.2010 by Bochkanov Sergey *************************************************************************/ void polynomialbar2pow(barycentricinterpolant* p, double c, double s, /* Real */ ae_vector* a, ae_state *_state) { ae_frame _frame_block; ae_int_t i; ae_int_t k; double e; double d; ae_vector vp; ae_vector vx; ae_vector tk; ae_vector tk1; ae_vector t; double v; double c0; double s0; double va; double vb; ae_vector vai; ae_vector vbi; double minx; double maxx; ae_frame_make(_state, &_frame_block); ae_vector_clear(a); ae_vector_init(&vp, 0, DT_REAL, _state); ae_vector_init(&vx, 0, DT_REAL, _state); ae_vector_init(&tk, 0, DT_REAL, _state); ae_vector_init(&tk1, 0, DT_REAL, _state); ae_vector_init(&t, 0, DT_REAL, _state); ae_vector_init(&vai, 0, DT_REAL, _state); ae_vector_init(&vbi, 0, DT_REAL, _state); /* * We have barycentric model built using set of points X[], and we * want to convert it to power basis centered about point C with * scale S: I-th basis function is ((X-C)/S)^i. * * We use following three-stage algorithm: * * 1. we build Chebyshev representation of polynomial using * intermediate center C0 and scale S0, which are derived from X[]: * C0 = 0.5*(min(X)+max(X)), S0 = 0.5*(max(X)-min(X)). 
Chebyshev * representation is built by sampling points around center C0, * with typical distance between them proportional to S0. * 2. then we transform form Chebyshev basis to intermediate power * basis, using same center/scale C0/S0. * 3. after that, we apply linear transformation to intermediate * power basis which moves it to final center/scale C/S. * * The idea of such multi-stage algorithm is that it is much easier to * transform barycentric model to Chebyshev basis, and only later to * power basis, than transforming it directly to power basis. It is * also more numerically stable to sample points using intermediate C0/S0, * which are derived from user-supplied model, than using "final" C/S, * which may be unsuitable for sampling (say, if S=1, we may have stability * problems when working with models built from dataset with non-unit * scale of abscissas). */ ae_assert(ae_isfinite(c, _state), "PolynomialBar2Pow: C is not finite!", _state); ae_assert(ae_isfinite(s, _state), "PolynomialBar2Pow: S is not finite!", _state); ae_assert(ae_fp_neq(s,(double)(0)), "PolynomialBar2Pow: S=0!", _state); ae_assert(p->n>0, "PolynomialBar2Pow: P is not correctly initialized barycentric interpolant!", _state); /* * Select intermediate center/scale */ minx = p->x.ptr.p_double[0]; maxx = p->x.ptr.p_double[0]; for(i=1; i<=p->n-1; i++) { minx = ae_minreal(minx, p->x.ptr.p_double[i], _state); maxx = ae_maxreal(maxx, p->x.ptr.p_double[i], _state); } if( ae_fp_eq(minx,maxx) ) { c0 = minx; s0 = 1.0; } else { c0 = 0.5*(maxx+minx); s0 = 0.5*(maxx-minx); } /* * Calculate function values on a Chebyshev grid using intermediate C0/S0 */ ae_vector_set_length(&vp, p->n+1, _state); ae_vector_set_length(&vx, p->n, _state); for(i=0; i<=p->n-1; i++) { vx.ptr.p_double[i] = ae_cos(ae_pi*(i+0.5)/p->n, _state); vp.ptr.p_double[i] = barycentriccalc(p, s0*vx.ptr.p_double[i]+c0, _state); } /* * T[0] */ ae_vector_set_length(&t, p->n, _state); v = (double)(0); for(i=0; i<=p->n-1; i++) { v = v+vp.ptr.p_double[i]; } t.ptr.p_double[0] = v/p->n; /* * other T's. * * NOTES: * 1. TK stores T{k} on VX, TK1 stores T{k-1} on VX * 2. 
we can do same calculations with fast DCT, but it * * adds dependencies * * still leaves us with O(N^2) algorithm because * preparation of function values is O(N^2) process */ if( p->n>1 ) { ae_vector_set_length(&tk, p->n, _state); ae_vector_set_length(&tk1, p->n, _state); for(i=0; i<=p->n-1; i++) { tk.ptr.p_double[i] = vx.ptr.p_double[i]; tk1.ptr.p_double[i] = (double)(1); } for(k=1; k<=p->n-1; k++) { /* * calculate discrete product of function vector and TK */ v = ae_v_dotproduct(&tk.ptr.p_double[0], 1, &vp.ptr.p_double[0], 1, ae_v_len(0,p->n-1)); t.ptr.p_double[k] = v/(0.5*p->n); /* * Update TK and TK1 */ for(i=0; i<=p->n-1; i++) { v = 2*vx.ptr.p_double[i]*tk.ptr.p_double[i]-tk1.ptr.p_double[i]; tk1.ptr.p_double[i] = tk.ptr.p_double[i]; tk.ptr.p_double[i] = v; } } } /* * Convert from Chebyshev basis to power basis */ ae_vector_set_length(a, p->n, _state); for(i=0; i<=p->n-1; i++) { a->ptr.p_double[i] = (double)(0); } d = (double)(0); for(i=0; i<=p->n-1; i++) { for(k=i; k<=p->n-1; k++) { e = a->ptr.p_double[k]; a->ptr.p_double[k] = (double)(0); if( i<=1&&k==i ) { a->ptr.p_double[k] = (double)(1); } else { if( i!=0 ) { a->ptr.p_double[k] = 2*d; } if( k>i+1 ) { a->ptr.p_double[k] = a->ptr.p_double[k]-a->ptr.p_double[k-2]; } } d = e; } d = a->ptr.p_double[i]; e = (double)(0); k = i; while(k<=p->n-1) { e = e+a->ptr.p_double[k]*t.ptr.p_double[k]; k = k+2; } a->ptr.p_double[i] = e; } /* * Apply linear transformation which converts basis from intermediate * one Fi=((x-C0)/S0)^i to final one Fi=((x-C)/S)^i. * * We have y=(x-C0)/S0, z=(x-C)/S, and coefficients A[] for basis Fi(y). * Because we have y=A*z+B, for A=s/s0 and B=c/s0-c0/s0, we can perform * substitution and get coefficients A_new[] in basis Fi(z). */ ae_assert(vp.cnt>=p->n+1, "PolynomialBar2Pow: internal error", _state); ae_assert(t.cnt>=p->n, "PolynomialBar2Pow: internal error", _state); for(i=0; i<=p->n-1; i++) { t.ptr.p_double[i] = 0.0; } va = s/s0; vb = c/s0-c0/s0; ae_vector_set_length(&vai, p->n, _state); ae_vector_set_length(&vbi, p->n, _state); vai.ptr.p_double[0] = (double)(1); vbi.ptr.p_double[0] = (double)(1); for(k=1; k<=p->n-1; k++) { vai.ptr.p_double[k] = vai.ptr.p_double[k-1]*va; vbi.ptr.p_double[k] = vbi.ptr.p_double[k-1]*vb; } for(k=0; k<=p->n-1; k++) { /* * Generate set of binomial coefficients in VP[] */ if( k>0 ) { vp.ptr.p_double[k] = (double)(1); for(i=k-1; i>=1; i--) { vp.ptr.p_double[i] = vp.ptr.p_double[i]+vp.ptr.p_double[i-1]; } vp.ptr.p_double[0] = (double)(1); } else { vp.ptr.p_double[0] = (double)(1); } /* * Update T[] with expansion of K-th basis function */ for(i=0; i<=k; i++) { t.ptr.p_double[i] = t.ptr.p_double[i]+a->ptr.p_double[k]*vai.ptr.p_double[i]*vbi.ptr.p_double[k-i]*vp.ptr.p_double[i]; } } for(k=0; k<=p->n-1; k++) { a->ptr.p_double[k] = t.ptr.p_double[k]; } ae_frame_leave(_state); } /************************************************************************* Conversion from power basis to barycentric representation. This function has O(N^2) complexity. INPUT PARAMETERS: A - coefficients, P(x) = sum { A[i]*((X-C)/S)^i, i=0..N-1 } N - number of coefficients (polynomial degree plus 1) * if given, only leading N elements of A are used * if not given, automatically determined from size of A C - offset (see below); 0.0 is used as default value. S - scale (see below); 1.0 is used as default value. S<>0. OUTPUT PARAMETERS P - polynomial in barycentric form NOTES: 1. this function accepts offset and scale, which can be set to improve numerical properties of polynomial. 
For example, if you interpolate on [-1,+1], you can set C=0 and S=1 and convert from sum of 1, x, x^2, x^3 and so on. In most cases you it is exactly what you need. However, if your interpolation model was built on [999,1001], you will see significant growth of numerical errors when using {1, x, x^2, x^3} as input basis. Converting from sum of 1, (x-1000), (x-1000)^2, (x-1000)^3 will be better option (you have to specify 1000.0 as offset C and 1.0 as scale S). 2. power basis is ill-conditioned and tricks described above can't solve this problem completely. This function will return barycentric model in any case, but for N>8 accuracy well degrade. However, N's less than 5 are pretty safe. -- ALGLIB -- Copyright 30.09.2010 by Bochkanov Sergey *************************************************************************/ void polynomialpow2bar(/* Real */ ae_vector* a, ae_int_t n, double c, double s, barycentricinterpolant* p, ae_state *_state) { ae_frame _frame_block; ae_int_t i; ae_int_t k; ae_vector y; double vx; double vy; double px; ae_frame_make(_state, &_frame_block); _barycentricinterpolant_clear(p); ae_vector_init(&y, 0, DT_REAL, _state); ae_assert(ae_isfinite(c, _state), "PolynomialPow2Bar: C is not finite!", _state); ae_assert(ae_isfinite(s, _state), "PolynomialPow2Bar: S is not finite!", _state); ae_assert(ae_fp_neq(s,(double)(0)), "PolynomialPow2Bar: S is zero!", _state); ae_assert(n>=1, "PolynomialPow2Bar: N<1", _state); ae_assert(a->cnt>=n, "PolynomialPow2Bar: Length(A)ptr.p_double[0]; px = vx; for(k=1; k<=n-1; k++) { vy = vy+px*a->ptr.p_double[k]; px = px*vx; } y.ptr.p_double[i] = vy; } /* * Build barycentric interpolant, map grid from [-1,+1] to [A,B] */ polynomialbuildcheb1(c-s, c+s, &y, n, p, _state); ae_frame_leave(_state); } /************************************************************************* Lagrange intepolant: generation of the model on the general grid. This function has O(N^2) complexity. INPUT PARAMETERS: X - abscissas, array[0..N-1] Y - function values, array[0..N-1] N - number of points, N>=1 OUTPUT PARAMETERS P - barycentric model which represents Lagrange interpolant (see ratint unit info and BarycentricCalc() description for more information). 
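The model built here is the classical barycentric Lagrange interpolant: weights W[j] = 1/prod(X[j]-X[k], k<>j) and evaluation by the second barycentric formula. The following is a bare-bones sketch in plain C, without the (B-A) rescaling, periodic renormalization and overflow guards which the actual implementation below adds; function names are illustrative only.

#include <stdio.h>

#define N 4

/* w[j] = 1 / prod_{k<>j} (x[j]-x[k]),  O(N^2), no rescaling */
static void bary_weights(const double *x, int n, double *w)
{
    int j, k;
    for(j=0; j<n; j++)
    {
        w[j] = 1.0;
        for(k=0; k<n; k++)
            if( k!=j )
                w[j] /= (x[j]-x[k]);
    }
}

/* second barycentric formula; t must not coincide with a node in this sketch */
static double bary_eval(const double *x, const double *y,
                        const double *w, int n, double t)
{
    double s1 = 0.0, s2 = 0.0;
    int j;
    for(j=0; j<n; j++)
    {
        double v = w[j]/(t-x[j]);
        s1 += v*y[j];
        s2 += v;
    }
    return s1/s2;
}

int main(void)
{
    /* four nodes of y = x^3: the cubic is reproduced exactly */
    double x[N] = { 0.0, 1.0, 2.0, 3.0 };
    double y[N] = { 0.0, 1.0, 8.0, 27.0 };
    double w[N];
    bary_weights(x, N, w);
    printf("%f\n", bary_eval(x, y, w, N, 1.5));   /* prints 3.375000 */
    return 0;
}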
-- ALGLIB -- Copyright 02.12.2009 by Bochkanov Sergey *************************************************************************/ void polynomialbuild(/* Real */ ae_vector* x, /* Real */ ae_vector* y, ae_int_t n, barycentricinterpolant* p, ae_state *_state) { ae_frame _frame_block; ae_vector _x; ae_vector _y; ae_int_t j; ae_int_t k; ae_vector w; double b; double a; double v; double mx; ae_vector sortrbuf; ae_vector sortrbuf2; ae_frame_make(_state, &_frame_block); ae_vector_init_copy(&_x, x, _state); x = &_x; ae_vector_init_copy(&_y, y, _state); y = &_y; _barycentricinterpolant_clear(p); ae_vector_init(&w, 0, DT_REAL, _state); ae_vector_init(&sortrbuf, 0, DT_REAL, _state); ae_vector_init(&sortrbuf2, 0, DT_REAL, _state); ae_assert(n>0, "PolynomialBuild: N<=0!", _state); ae_assert(x->cnt>=n, "PolynomialBuild: Length(X)cnt>=n, "PolynomialBuild: Length(Y)ptr.p_double[0]; b = x->ptr.p_double[0]; for(j=0; j<=n-1; j++) { w.ptr.p_double[j] = (double)(1); a = ae_minreal(a, x->ptr.p_double[j], _state); b = ae_maxreal(b, x->ptr.p_double[j], _state); } for(k=0; k<=n-1; k++) { /* * W[K] is used instead of 0.0 because * cycle on J does not touch K-th element * and we MUST get maximum from ALL elements */ mx = ae_fabs(w.ptr.p_double[k], _state); for(j=0; j<=n-1; j++) { if( j!=k ) { v = (b-a)/(x->ptr.p_double[j]-x->ptr.p_double[k]); w.ptr.p_double[j] = w.ptr.p_double[j]*v; mx = ae_maxreal(mx, ae_fabs(w.ptr.p_double[j], _state), _state); } } if( k%5==0 ) { /* * every 5-th run we renormalize W[] */ v = 1/mx; ae_v_muld(&w.ptr.p_double[0], 1, ae_v_len(0,n-1), v); } } barycentricbuildxyw(x, y, &w, n, p, _state); ae_frame_leave(_state); } /************************************************************************* Lagrange intepolant: generation of the model on equidistant grid. This function has O(N) complexity. INPUT PARAMETERS: A - left boundary of [A,B] B - right boundary of [A,B] Y - function values at the nodes, array[0..N-1] N - number of points, N>=1 for N=1 a constant model is constructed. OUTPUT PARAMETERS P - barycentric model which represents Lagrange interpolant (see ratint unit info and BarycentricCalc() description for more information). -- ALGLIB -- Copyright 03.12.2009 by Bochkanov Sergey *************************************************************************/ void polynomialbuildeqdist(double a, double b, /* Real */ ae_vector* y, ae_int_t n, barycentricinterpolant* p, ae_state *_state) { ae_frame _frame_block; ae_int_t i; ae_vector w; ae_vector x; double v; ae_frame_make(_state, &_frame_block); _barycentricinterpolant_clear(p); ae_vector_init(&w, 0, DT_REAL, _state); ae_vector_init(&x, 0, DT_REAL, _state); ae_assert(n>0, "PolynomialBuildEqDist: N<=0!", _state); ae_assert(y->cnt>=n, "PolynomialBuildEqDist: Length(Y)=1 for N=1 a constant model is constructed. OUTPUT PARAMETERS P - barycentric model which represents Lagrange interpolant (see ratint unit info and BarycentricCalc() description for more information). 
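On the special grids used by the equidistant and Chebyshev builders and the fast evaluators of this unit, the barycentric weights have well-known closed forms (up to a common factor): (-1)^k*C(n-1,k) on an equidistant grid, (-1)^k*sin((2k+1)*pi/(2n)) on Chebyshev points of the first kind, and (-1)^k with halved endpoint weights on Chebyshev points of the second kind. A standalone sketch of these formulas (plain C, illustrative names, not ALGLIB code):

#include <math.h>
#include <stdio.h>

#ifndef M_PI
#define M_PI 3.14159265358979323846
#endif

/* equidistant grid: w[k] = (-1)^k * C(n-1,k) */
static void weights_eqdist(int n, double *w)
{
    int i;
    w[0] = 1.0;
    for(i=1; i<n; i++)
        w[i] = -w[i-1]*(n-i)/i;
}

/* Chebyshev points of the first kind: w[k] = (-1)^k * sin((2k+1)*pi/(2n)) */
static void weights_cheb1(int n, double *w)
{
    int i;
    for(i=0; i<n; i++)
        w[i] = (i%2 ? -1.0 : 1.0)*sin((2*i+1)*M_PI/(2*n));
}

/* Chebyshev points of the second kind: w[k] = (-1)^k, halved at the ends */
static void weights_cheb2(int n, double *w)
{
    int i;
    for(i=0; i<n; i++)
        w[i] = (i%2 ? -1.0 : 1.0)*((i==0||i==n-1) ? 0.5 : 1.0);
}

int main(void)
{
    double w[4];
    weights_eqdist(4, w);
    printf("%g %g %g %g\n", w[0], w[1], w[2], w[3]);   /* prints 1 -3 3 -1 */
    weights_cheb1(4, w);
    printf("%g %g %g %g\n", w[0], w[1], w[2], w[3]);   /* ~0.38 -0.92 0.92 -0.38 */
    weights_cheb2(4, w);
    printf("%g %g %g %g\n", w[0], w[1], w[2], w[3]);   /* prints 0.5 -1 1 -0.5 */
    return 0;
}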
-- ALGLIB -- Copyright 03.12.2009 by Bochkanov Sergey *************************************************************************/ void polynomialbuildcheb1(double a, double b, /* Real */ ae_vector* y, ae_int_t n, barycentricinterpolant* p, ae_state *_state) { ae_frame _frame_block; ae_int_t i; ae_vector w; ae_vector x; double v; double t; ae_frame_make(_state, &_frame_block); _barycentricinterpolant_clear(p); ae_vector_init(&w, 0, DT_REAL, _state); ae_vector_init(&x, 0, DT_REAL, _state); ae_assert(n>0, "PolynomialBuildCheb1: N<=0!", _state); ae_assert(y->cnt>=n, "PolynomialBuildCheb1: Length(Y)=1 for N=1 a constant model is constructed. OUTPUT PARAMETERS P - barycentric model which represents Lagrange interpolant (see ratint unit info and BarycentricCalc() description for more information). -- ALGLIB -- Copyright 03.12.2009 by Bochkanov Sergey *************************************************************************/ void polynomialbuildcheb2(double a, double b, /* Real */ ae_vector* y, ae_int_t n, barycentricinterpolant* p, ae_state *_state) { ae_frame _frame_block; ae_int_t i; ae_vector w; ae_vector x; double v; ae_frame_make(_state, &_frame_block); _barycentricinterpolant_clear(p); ae_vector_init(&w, 0, DT_REAL, _state); ae_vector_init(&x, 0, DT_REAL, _state); ae_assert(n>0, "PolynomialBuildCheb2: N<=0!", _state); ae_assert(y->cnt>=n, "PolynomialBuildCheb2: Length(Y)=1 for N=1 a constant model is constructed. T - position where P(x) is calculated RESULT value of the Lagrange interpolant at T IMPORTANT this function provides fast interface which is not overflow-safe nor it is very precise. the best option is to use PolynomialBuildEqDist()/BarycentricCalc() subroutines unless you are pretty sure that your data will not result in overflow. -- ALGLIB -- Copyright 02.12.2009 by Bochkanov Sergey *************************************************************************/ double polynomialcalceqdist(double a, double b, /* Real */ ae_vector* f, ae_int_t n, double t, ae_state *_state) { double s1; double s2; double v; double threshold; double s; double h; ae_int_t i; ae_int_t j; double w; double x; double result; ae_assert(n>0, "PolynomialCalcEqDist: N<=0!", _state); ae_assert(f->cnt>=n, "PolynomialCalcEqDist: Length(F)v_nan; return result; } /* * Special case: N=1 */ if( n==1 ) { result = f->ptr.p_double[0]; return result; } /* * First, decide: should we use "safe" formula (guarded * against overflow) or fast one? */ threshold = ae_sqrt(ae_minrealnumber, _state); j = 0; s = t-a; for(i=1; i<=n-1; i++) { x = a+(double)i/(double)(n-1)*(b-a); if( ae_fp_less(ae_fabs(t-x, _state),ae_fabs(s, _state)) ) { s = t-x; j = i; } } if( ae_fp_eq(s,(double)(0)) ) { result = f->ptr.p_double[j]; return result; } if( ae_fp_greater(ae_fabs(s, _state),threshold) ) { /* * use fast formula */ j = -1; s = 1.0; } /* * Calculate using safe or fast barycentric formula */ s1 = (double)(0); s2 = (double)(0); w = 1.0; h = (b-a)/(n-1); for(i=0; i<=n-1; i++) { if( i!=j ) { v = s*w/(t-(a+i*h)); s1 = s1+v*f->ptr.p_double[i]; s2 = s2+v; } else { v = w; s1 = s1+v*f->ptr.p_double[i]; s2 = s2+v; } w = -w*(n-1-i); w = w/(i+1); } result = s1/s2; return result; } /************************************************************************* Fast polynomial interpolation function on Chebyshev points (first kind) with O(N) complexity. 
INPUT PARAMETERS: A - left boundary of [A,B] B - right boundary of [A,B] F - function values, array[0..N-1] N - number of points on Chebyshev grid (first kind), X[i] = 0.5*(B+A) + 0.5*(B-A)*Cos(PI*(2*i+1)/(2*n)) for N=1 a constant model is constructed. T - position where P(x) is calculated RESULT value of the Lagrange interpolant at T IMPORTANT this function provides fast interface which is not overflow-safe nor it is very precise. the best option is to use PolIntBuildCheb1()/BarycentricCalc() subroutines unless you are pretty sure that your data will not result in overflow. -- ALGLIB -- Copyright 02.12.2009 by Bochkanov Sergey *************************************************************************/ double polynomialcalccheb1(double a, double b, /* Real */ ae_vector* f, ae_int_t n, double t, ae_state *_state) { double s1; double s2; double v; double threshold; double s; ae_int_t i; ae_int_t j; double a0; double delta; double alpha; double beta; double ca; double sa; double tempc; double temps; double x; double w; double p1; double result; ae_assert(n>0, "PolynomialCalcCheb1: N<=0!", _state); ae_assert(f->cnt>=n, "PolynomialCalcCheb1: Length(F)v_nan; return result; } /* * Special case: N=1 */ if( n==1 ) { result = f->ptr.p_double[0]; return result; } /* * Prepare information for the recurrence formula * used to calculate sin(pi*(2j+1)/(2n+2)) and * cos(pi*(2j+1)/(2n+2)): * * A0 = pi/(2n+2) * Delta = pi/(n+1) * Alpha = 2 sin^2 (Delta/2) * Beta = sin(Delta) * * so that sin(..) = sin(A0+j*delta) and cos(..) = cos(A0+j*delta). * Then we use * * sin(x+delta) = sin(x) - (alpha*sin(x) - beta*cos(x)) * cos(x+delta) = cos(x) - (alpha*cos(x) - beta*sin(x)) * * to repeatedly calculate sin(..) and cos(..). */ threshold = ae_sqrt(ae_minrealnumber, _state); t = (t-0.5*(a+b))/(0.5*(b-a)); a0 = ae_pi/(2*(n-1)+2); delta = 2*ae_pi/(2*(n-1)+2); alpha = 2*ae_sqr(ae_sin(delta/2, _state), _state); beta = ae_sin(delta, _state); /* * First, decide: should we use "safe" formula (guarded * against overflow) or fast one? */ ca = ae_cos(a0, _state); sa = ae_sin(a0, _state); j = 0; x = ca; s = t-x; for(i=1; i<=n-1; i++) { /* * Next X[i] */ temps = sa-(alpha*sa-beta*ca); tempc = ca-(alpha*ca+beta*sa); sa = temps; ca = tempc; x = ca; /* * Use X[i] */ if( ae_fp_less(ae_fabs(t-x, _state),ae_fabs(s, _state)) ) { s = t-x; j = i; } } if( ae_fp_eq(s,(double)(0)) ) { result = f->ptr.p_double[j]; return result; } if( ae_fp_greater(ae_fabs(s, _state),threshold) ) { /* * use fast formula */ j = -1; s = 1.0; } /* * Calculate using safe or fast barycentric formula */ s1 = (double)(0); s2 = (double)(0); ca = ae_cos(a0, _state); sa = ae_sin(a0, _state); p1 = 1.0; for(i=0; i<=n-1; i++) { /* * Calculate X[i], W[i] */ x = ca; w = p1*sa; /* * Proceed */ if( i!=j ) { v = s*w/(t-x); s1 = s1+v*f->ptr.p_double[i]; s2 = s2+v; } else { v = w; s1 = s1+v*f->ptr.p_double[i]; s2 = s2+v; } /* * Next CA, SA, P1 */ temps = sa-(alpha*sa-beta*ca); tempc = ca-(alpha*ca+beta*sa); sa = temps; ca = tempc; p1 = -p1; } result = s1/s2; return result; } /************************************************************************* Fast polynomial interpolation function on Chebyshev points (second kind) with O(N) complexity. INPUT PARAMETERS: A - left boundary of [A,B] B - right boundary of [A,B] F - function values, array[0..N-1] N - number of points on Chebyshev grid (second kind), X[i] = 0.5*(B+A) + 0.5*(B-A)*Cos(PI*i/(n-1)) for N=1 a constant model is constructed. 
T - position where P(x) is calculated RESULT value of the Lagrange interpolant at T IMPORTANT this function provides fast interface which is not overflow-safe nor it is very precise. the best option is to use PolIntBuildCheb2()/BarycentricCalc() subroutines unless you are pretty sure that your data will not result in overflow. -- ALGLIB -- Copyright 02.12.2009 by Bochkanov Sergey *************************************************************************/ double polynomialcalccheb2(double a, double b, /* Real */ ae_vector* f, ae_int_t n, double t, ae_state *_state) { double s1; double s2; double v; double threshold; double s; ae_int_t i; ae_int_t j; double a0; double delta; double alpha; double beta; double ca; double sa; double tempc; double temps; double x; double w; double p1; double result; ae_assert(n>0, "PolynomialCalcCheb2: N<=0!", _state); ae_assert(f->cnt>=n, "PolynomialCalcCheb2: Length(F)v_nan; return result; } /* * Special case: N=1 */ if( n==1 ) { result = f->ptr.p_double[0]; return result; } /* * Prepare information for the recurrence formula * used to calculate sin(pi*i/n) and * cos(pi*i/n): * * A0 = 0 * Delta = pi/n * Alpha = 2 sin^2 (Delta/2) * Beta = sin(Delta) * * so that sin(..) = sin(A0+j*delta) and cos(..) = cos(A0+j*delta). * Then we use * * sin(x+delta) = sin(x) - (alpha*sin(x) - beta*cos(x)) * cos(x+delta) = cos(x) - (alpha*cos(x) - beta*sin(x)) * * to repeatedly calculate sin(..) and cos(..). */ threshold = ae_sqrt(ae_minrealnumber, _state); t = (t-0.5*(a+b))/(0.5*(b-a)); a0 = 0.0; delta = ae_pi/(n-1); alpha = 2*ae_sqr(ae_sin(delta/2, _state), _state); beta = ae_sin(delta, _state); /* * First, decide: should we use "safe" formula (guarded * against overflow) or fast one? */ ca = ae_cos(a0, _state); sa = ae_sin(a0, _state); j = 0; x = ca; s = t-x; for(i=1; i<=n-1; i++) { /* * Next X[i] */ temps = sa-(alpha*sa-beta*ca); tempc = ca-(alpha*ca+beta*sa); sa = temps; ca = tempc; x = ca; /* * Use X[i] */ if( ae_fp_less(ae_fabs(t-x, _state),ae_fabs(s, _state)) ) { s = t-x; j = i; } } if( ae_fp_eq(s,(double)(0)) ) { result = f->ptr.p_double[j]; return result; } if( ae_fp_greater(ae_fabs(s, _state),threshold) ) { /* * use fast formula */ j = -1; s = 1.0; } /* * Calculate using safe or fast barycentric formula */ s1 = (double)(0); s2 = (double)(0); ca = ae_cos(a0, _state); sa = ae_sin(a0, _state); p1 = 1.0; for(i=0; i<=n-1; i++) { /* * Calculate X[i], W[i] */ x = ca; if( i==0||i==n-1 ) { w = 0.5*p1; } else { w = 1.0*p1; } /* * Proceed */ if( i!=j ) { v = s*w/(t-x); s1 = s1+v*f->ptr.p_double[i]; s2 = s2+v; } else { v = w; s1 = s1+v*f->ptr.p_double[i]; s2 = s2+v; } /* * Next CA, SA, P1 */ temps = sa-(alpha*sa-beta*ca); tempc = ca-(alpha*ca+beta*sa); sa = temps; ca = tempc; p1 = -p1; } result = s1/s2; return result; } /************************************************************************* This subroutine fits piecewise linear curve to points with Ramer-Douglas- Peucker algorithm, which stops after generating specified number of linear sections. IMPORTANT: * it does NOT perform least-squares fitting; it builds curve, but this curve does not minimize some least squares metric. See description of RDP algorithm (say, in Wikipedia) for more details on WHAT is performed. * this function does NOT work with parametric curves (i.e. curves which can be represented as {X(t),Y(t)}. It works with curves which can be represented as Y(X). Thus, it is impossible to model figures like circles with this functions. 
If you want to work with parametric curves, you should use ParametricRDPFixed() function provided by "Parametric" subpackage of "Interpolation" package. INPUT PARAMETERS: X - array of X-coordinates: * at least N elements * can be unordered (points are automatically sorted) * this function may accept non-distinct X (see below for more information on handling of such inputs) Y - array of Y-coordinates: * at least N elements N - number of elements in X/Y M - desired number of sections: * at most M sections are generated by this function * less than M sections can be generated if we have N=0, "LSTFitPiecewiseLinearRDPFixed: N<0", _state); ae_assert(m>=1, "LSTFitPiecewiseLinearRDPFixed: M<1", _state); ae_assert(x->cnt>=n, "LSTFitPiecewiseLinearRDPFixed: Length(X)cnt>=n, "LSTFitPiecewiseLinearRDPFixed: Length(Y)ptr.p_double[i]; while(j<=n-1&&ae_fp_eq(x->ptr.p_double[j],x->ptr.p_double[i])) { v = v+y->ptr.p_double[j]; j = j+1; } v = v/(j-i); for(k=i; k<=j-1; k++) { y->ptr.p_double[k] = v; } i = j; } /* * Handle degenerate case x[0]=x[N-1] */ if( ae_fp_eq(x->ptr.p_double[n-1],x->ptr.p_double[0]) ) { *nsections = 0; ae_frame_leave(_state); return; } /* * Prepare first section */ lsfit_rdpanalyzesection(x, y, 0, n-1, &worstidx, &worsterror, _state); ae_matrix_set_length(§ions, m, 4, _state); ae_vector_set_length(&heaperrors, m, _state); ae_vector_set_length(&heaptags, m, _state); *nsections = 1; sections.ptr.pp_double[0][0] = (double)(0); sections.ptr.pp_double[0][1] = (double)(n-1); sections.ptr.pp_double[0][2] = (double)(worstidx); sections.ptr.pp_double[0][3] = worsterror; heaperrors.ptr.p_double[0] = worsterror; heaptags.ptr.p_int[0] = 0; ae_assert(ae_fp_eq(sections.ptr.pp_double[0][1],(double)(n-1)), "RDP algorithm: integrity check failed", _state); /* * Main loop. * Repeatedly find section with worst error and divide it. * Terminate after M-th section, or because of other reasons (see loop internals). */ while(*nsectionsptr.p_double[ae_round(sections.ptr.pp_double[i][1], _state)],x->ptr.p_double[k]) ) { k = ae_round(sections.ptr.pp_double[i][1], _state); } } points.ptr.p_double[*nsections] = (double)(k); tagsortfast(&points, &buf0, *nsections+1, _state); /* * Output sections: * * first NSection elements of X2/Y2 are filled by x/y at left boundaries of sections * * last element of X2/Y2 is filled by right boundary of rightmost section * * X2/Y2 is sorted by ascending of X2 */ ae_vector_set_length(x2, *nsections+1, _state); ae_vector_set_length(y2, *nsections+1, _state); for(i=0; i<=*nsections; i++) { x2->ptr.p_double[i] = x->ptr.p_double[ae_round(points.ptr.p_double[i], _state)]; y2->ptr.p_double[i] = y->ptr.p_double[ae_round(points.ptr.p_double[i], _state)]; } ae_frame_leave(_state); } /************************************************************************* This subroutine fits piecewise linear curve to points with Ramer-Douglas- Peucker algorithm, which stops after achieving desired precision. IMPORTANT: * it performs non-least-squares fitting; it builds curve, but this curve does not minimize some least squares metric. See description of RDP algorithm (say, in Wikipedia) for more details on WHAT is performed. * this function does NOT work with parametric curves (i.e. curves which can be represented as {X(t),Y(t)}. It works with curves which can be represented as Y(X). Thus, it is impossible to model figures like circles with this functions. 
If you want to work with parametric curves, you should use ParametricRDPFixed() function provided by "Parametric" subpackage of "Interpolation" package. INPUT PARAMETERS: X - array of X-coordinates: * at least N elements * can be unordered (points are automatically sorted) * this function may accept non-distinct X (see below for more information on handling of such inputs) Y - array of Y-coordinates: * at least N elements N - number of elements in X/Y Eps - positive number, desired precision. OUTPUT PARAMETERS: X2 - X-values of corner points for piecewise approximation, has length NSections+1 or zero (for NSections=0). Y2 - Y-values of corner points, has length NSections+1 or zero (for NSections=0). NSections- number of sections found by algorithm, NSections can be zero for degenerate datasets (N<=1 or all X[] are non-distinct). NOTE: X2/Y2 are ordered arrays, i.e. (X2[0],Y2[0]) is a first point of curve, (X2[NSection-1],Y2[NSection-1]) is the last point. -- ALGLIB -- Copyright 02.10.2014 by Bochkanov Sergey *************************************************************************/ void lstfitpiecewiselinearrdp(/* Real */ ae_vector* x, /* Real */ ae_vector* y, ae_int_t n, double eps, /* Real */ ae_vector* x2, /* Real */ ae_vector* y2, ae_int_t* nsections, ae_state *_state) { ae_frame _frame_block; ae_vector _x; ae_vector _y; ae_int_t i; ae_int_t j; ae_int_t k; ae_vector buf0; ae_vector buf1; ae_vector xtmp; ae_vector ytmp; double v; ae_int_t npts; ae_frame_make(_state, &_frame_block); ae_vector_init_copy(&_x, x, _state); x = &_x; ae_vector_init_copy(&_y, y, _state); y = &_y; ae_vector_clear(x2); ae_vector_clear(y2); *nsections = 0; ae_vector_init(&buf0, 0, DT_REAL, _state); ae_vector_init(&buf1, 0, DT_REAL, _state); ae_vector_init(&xtmp, 0, DT_REAL, _state); ae_vector_init(&ytmp, 0, DT_REAL, _state); ae_assert(n>=0, "LSTFitPiecewiseLinearRDP: N<0", _state); ae_assert(ae_fp_greater(eps,(double)(0)), "LSTFitPiecewiseLinearRDP: Eps<=0", _state); ae_assert(x->cnt>=n, "LSTFitPiecewiseLinearRDP: Length(X)cnt>=n, "LSTFitPiecewiseLinearRDP: Length(Y)ptr.p_double[i]; while(j<=n-1&&ae_fp_eq(x->ptr.p_double[j],x->ptr.p_double[i])) { v = v+y->ptr.p_double[j]; j = j+1; } v = v/(j-i); for(k=i; k<=j-1; k++) { y->ptr.p_double[k] = v; } i = j; } /* * Handle degenerate case x[0]=x[N-1] */ if( ae_fp_eq(x->ptr.p_double[n-1],x->ptr.p_double[0]) ) { *nsections = 0; ae_frame_leave(_state); return; } /* * Prepare data for recursive algorithm */ ae_vector_set_length(&xtmp, n, _state); ae_vector_set_length(&ytmp, n, _state); npts = 2; xtmp.ptr.p_double[0] = x->ptr.p_double[0]; ytmp.ptr.p_double[0] = y->ptr.p_double[0]; xtmp.ptr.p_double[1] = x->ptr.p_double[n-1]; ytmp.ptr.p_double[1] = y->ptr.p_double[n-1]; lsfit_rdprecursive(x, y, 0, n-1, eps, &xtmp, &ytmp, &npts, _state); /* * Output sections: * * first NSection elements of X2/Y2 are filled by x/y at left boundaries of sections * * last element of X2/Y2 is filled by right boundary of rightmost section * * X2/Y2 is sorted by ascending of X2 */ *nsections = npts-1; ae_vector_set_length(x2, npts, _state); ae_vector_set_length(y2, npts, _state); for(i=0; i<=*nsections; i++) { x2->ptr.p_double[i] = xtmp.ptr.p_double[i]; y2->ptr.p_double[i] = ytmp.ptr.p_double[i]; } tagsortfastr(x2, y2, &buf0, &buf1, npts, _state); ae_frame_leave(_state); } /************************************************************************* Fitting by polynomials in barycentric form. This function provides simple unterface for unconstrained unweighted fitting. 
See PolynomialFitWC() if you need constrained fitting. Task is linear, so linear least squares solver is used. Complexity of this computational scheme is O(N*M^2), mostly dominated by least squares solver SEE ALSO: PolynomialFitWC() COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multithreading support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Speed-up provided by multithreading greatly depends on problem size ! - only large problems (number of coefficients is more than 500) can be ! efficiently multithreaded. ! ! Generally, commercial ALGLIB is several times faster than open-source ! generic C edition, and many times faster than open-source C# edition. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS: X - points, array[0..N-1]. Y - function values, array[0..N-1]. N - number of points, N>0 * if given, only leading N elements of X/Y are used * if not given, automatically determined from sizes of X/Y M - number of basis functions (= polynomial_degree + 1), M>=1 OUTPUT PARAMETERS: Info- same format as in LSFitLinearW() subroutine: * Info>0 task is solved * Info<=0 an error occured: -4 means inconvergence of internal SVD P - interpolant in barycentric form. Rep - report, same format as in LSFitLinearW() subroutine. Following fields are set: * RMSError rms error on the (X,Y). * AvgError average error on the (X,Y). * AvgRelError average relative error on the non-zero Y * MaxError maximum error NON-WEIGHTED ERRORS ARE CALCULATED NOTES: you can convert P from barycentric form to the power or Chebyshev basis with PolynomialBar2Pow() or PolynomialBar2Cheb() functions from POLINT subpackage. -- ALGLIB PROJECT -- Copyright 10.12.2009 by Bochkanov Sergey *************************************************************************/ void polynomialfit(/* Real */ ae_vector* x, /* Real */ ae_vector* y, ae_int_t n, ae_int_t m, ae_int_t* info, barycentricinterpolant* p, polynomialfitreport* rep, ae_state *_state) { ae_frame _frame_block; ae_int_t i; ae_vector w; ae_vector xc; ae_vector yc; ae_vector dc; ae_frame_make(_state, &_frame_block); *info = 0; _barycentricinterpolant_clear(p); _polynomialfitreport_clear(rep); ae_vector_init(&w, 0, DT_REAL, _state); ae_vector_init(&xc, 0, DT_REAL, _state); ae_vector_init(&yc, 0, DT_REAL, _state); ae_vector_init(&dc, 0, DT_INT, _state); ae_assert(n>0, "PolynomialFit: N<=0!", _state); ae_assert(m>0, "PolynomialFit: M<=0!", _state); ae_assert(x->cnt>=n, "PolynomialFit: Length(X)cnt>=n, "PolynomialFit: Length(Y)0. * if given, only leading N elements of X/Y/W are used * if not given, automatically determined from sizes of X/Y/W XC - points where polynomial values/derivatives are constrained, array[0..K-1]. 
YC - values of constraints, array[0..K-1] DC - array[0..K-1], types of constraints: * DC[i]=0 means that P(XC[i])=YC[i] * DC[i]=1 means that P'(XC[i])=YC[i] SEE BELOW FOR IMPORTANT INFORMATION ON CONSTRAINTS K - number of constraints, 0<=K=1 OUTPUT PARAMETERS: Info- same format as in LSFitLinearW() subroutine: * Info>0 task is solved * Info<=0 an error occured: -4 means inconvergence of internal SVD -3 means inconsistent constraints P - interpolant in barycentric form. Rep - report, same format as in LSFitLinearW() subroutine. Following fields are set: * RMSError rms error on the (X,Y). * AvgError average error on the (X,Y). * AvgRelError average relative error on the non-zero Y * MaxError maximum error NON-WEIGHTED ERRORS ARE CALCULATED IMPORTANT: this subroitine doesn't calculate task's condition number for K<>0. NOTES: you can convert P from barycentric form to the power or Chebyshev basis with PolynomialBar2Pow() or PolynomialBar2Cheb() functions from POLINT subpackage. SETTING CONSTRAINTS - DANGERS AND OPPORTUNITIES: Setting constraints can lead to undesired results, like ill-conditioned behavior, or inconsistency being detected. From the other side, it allows us to improve quality of the fit. Here we summarize our experience with constrained regression splines: * even simple constraints can be inconsistent, see Wikipedia article on this subject: http://en.wikipedia.org/wiki/Birkhoff_interpolation * the greater is M (given fixed constraints), the more chances that constraints will be consistent * in the general case, consistency of constraints is NOT GUARANTEED. * in the one special cases, however, we can guarantee consistency. This case is: M>1 and constraints on the function values (NOT DERIVATIVES) Our final recommendation is to use constraints WHEN AND ONLY when you can't solve your task without them. Anything beyond special cases given above is not guaranteed and may result in inconsistency. 
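Internally the fit is carried out in a Chebyshev basis; the implementation that follows converts the resulting coefficients to barycentric form by evaluating sum(T[j]*Tj(u), j=0..M-1) at the Chebyshev nodes u = cos(pi*i/(M-1)), using the three-term recurrence Tj(u) = 2*u*T{j-1}(u) - T{j-2}(u). A standalone sketch of that evaluation (plain C; eval_cheb_series is an illustrative name, not an ALGLIB function):

#include <stdio.h>

/* Evaluate SUM( t[j]*Tj(u), j=0..m-1 ) with the recurrence
   T0(u)=1, T1(u)=u, Tj(u)=2*u*T{j-1}(u)-T{j-2}(u). */
static double eval_cheb_series(const double *t, int m, double u)
{
    double tkm2 = 1.0;   /* T{j-2} */
    double tkm1 = u;     /* T{j-1} */
    double s = t[0];
    int j;
    if( m>1 )
        s += t[1]*u;
    for(j=2; j<m; j++)
    {
        double tk = 2.0*u*tkm1 - tkm2;
        s += t[j]*tk;
        tkm2 = tkm1;
        tkm1 = tk;
    }
    return s;
}

int main(void)
{
    /* coefficients {0,0,1} select T2(u) = 2*u^2-1, so the value at u=0.5 is -0.5 */
    double t[3] = { 0.0, 0.0, 1.0 };
    printf("%f\n", eval_cheb_series(t, 3, 0.5));   /* prints -0.500000 */
    return 0;
}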
-- ALGLIB PROJECT -- Copyright 10.12.2009 by Bochkanov Sergey *************************************************************************/ void polynomialfitwc(/* Real */ ae_vector* x, /* Real */ ae_vector* y, /* Real */ ae_vector* w, ae_int_t n, /* Real */ ae_vector* xc, /* Real */ ae_vector* yc, /* Integer */ ae_vector* dc, ae_int_t k, ae_int_t m, ae_int_t* info, barycentricinterpolant* p, polynomialfitreport* rep, ae_state *_state) { ae_frame _frame_block; ae_vector _x; ae_vector _y; ae_vector _w; ae_vector _xc; ae_vector _yc; double xa; double xb; double sa; double sb; ae_vector xoriginal; ae_vector yoriginal; ae_vector y2; ae_vector w2; ae_vector tmp; ae_vector tmp2; ae_vector bx; ae_vector by; ae_vector bw; ae_int_t i; ae_int_t j; double u; double v; double s; ae_int_t relcnt; lsfitreport lrep; ae_frame_make(_state, &_frame_block); ae_vector_init_copy(&_x, x, _state); x = &_x; ae_vector_init_copy(&_y, y, _state); y = &_y; ae_vector_init_copy(&_w, w, _state); w = &_w; ae_vector_init_copy(&_xc, xc, _state); xc = &_xc; ae_vector_init_copy(&_yc, yc, _state); yc = &_yc; *info = 0; _barycentricinterpolant_clear(p); _polynomialfitreport_clear(rep); ae_vector_init(&xoriginal, 0, DT_REAL, _state); ae_vector_init(&yoriginal, 0, DT_REAL, _state); ae_vector_init(&y2, 0, DT_REAL, _state); ae_vector_init(&w2, 0, DT_REAL, _state); ae_vector_init(&tmp, 0, DT_REAL, _state); ae_vector_init(&tmp2, 0, DT_REAL, _state); ae_vector_init(&bx, 0, DT_REAL, _state); ae_vector_init(&by, 0, DT_REAL, _state); ae_vector_init(&bw, 0, DT_REAL, _state); _lsfitreport_init(&lrep, _state); ae_assert(n>0, "PolynomialFitWC: N<=0!", _state); ae_assert(m>0, "PolynomialFitWC: M<=0!", _state); ae_assert(k>=0, "PolynomialFitWC: K<0!", _state); ae_assert(k=M!", _state); ae_assert(x->cnt>=n, "PolynomialFitWC: Length(X)cnt>=n, "PolynomialFitWC: Length(Y)cnt>=n, "PolynomialFitWC: Length(W)cnt>=k, "PolynomialFitWC: Length(XC)cnt>=k, "PolynomialFitWC: Length(YC)cnt>=k, "PolynomialFitWC: Length(DC)ptr.p_int[i]==0||dc->ptr.p_int[i]==1, "PolynomialFitWC: one of DC[] is not 0 or 1!", _state); } /* * Scale X, Y, XC, YC. * Solve scaled problem using internal Chebyshev fitting function. */ lsfitscalexy(x, y, w, n, xc, yc, dc, k, &xa, &xb, &sa, &sb, &xoriginal, &yoriginal, _state); lsfit_internalchebyshevfit(x, y, w, n, xc, yc, dc, k, m, info, &tmp, &lrep, _state); if( *info<0 ) { ae_frame_leave(_state); return; } /* * Generate barycentric model and scale it * * BX, BY store barycentric model nodes * * FMatrix is reused (remember - it is at least MxM, what we need) * * Model intialization is done in O(M^2). In principle, it can be * done in O(M*log(M)), but before it we solved task with O(N*M^2) * complexity, so it is only a small amount of total time spent. 
*/ ae_vector_set_length(&bx, m, _state); ae_vector_set_length(&by, m, _state); ae_vector_set_length(&bw, m, _state); ae_vector_set_length(&tmp2, m, _state); s = (double)(1); for(i=0; i<=m-1; i++) { if( m!=1 ) { u = ae_cos(ae_pi*i/(m-1), _state); } else { u = (double)(0); } v = (double)(0); for(j=0; j<=m-1; j++) { if( j==0 ) { tmp2.ptr.p_double[j] = (double)(1); } else { if( j==1 ) { tmp2.ptr.p_double[j] = u; } else { tmp2.ptr.p_double[j] = 2*u*tmp2.ptr.p_double[j-1]-tmp2.ptr.p_double[j-2]; } } v = v+tmp.ptr.p_double[j]*tmp2.ptr.p_double[j]; } bx.ptr.p_double[i] = u; by.ptr.p_double[i] = v; bw.ptr.p_double[i] = s; if( i==0||i==m-1 ) { bw.ptr.p_double[i] = 0.5*bw.ptr.p_double[i]; } s = -s; } barycentricbuildxyw(&bx, &by, &bw, m, p, _state); barycentriclintransx(p, 2/(xb-xa), -(xa+xb)/(xb-xa), _state); barycentriclintransy(p, sb-sa, sa, _state); /* * Scale absolute errors obtained from LSFitLinearW. * Relative error should be calculated separately * (because of shifting/scaling of the task) */ rep->taskrcond = lrep.taskrcond; rep->rmserror = lrep.rmserror*(sb-sa); rep->avgerror = lrep.avgerror*(sb-sa); rep->maxerror = lrep.maxerror*(sb-sa); rep->avgrelerror = (double)(0); relcnt = 0; for(i=0; i<=n-1; i++) { if( ae_fp_neq(yoriginal.ptr.p_double[i],(double)(0)) ) { rep->avgrelerror = rep->avgrelerror+ae_fabs(barycentriccalc(p, xoriginal.ptr.p_double[i], _state)-yoriginal.ptr.p_double[i], _state)/ae_fabs(yoriginal.ptr.p_double[i], _state); relcnt = relcnt+1; } } if( relcnt!=0 ) { rep->avgrelerror = rep->avgrelerror/relcnt; } ae_frame_leave(_state); } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. *************************************************************************/ void _pexec_polynomialfitwc(/* Real */ ae_vector* x, /* Real */ ae_vector* y, /* Real */ ae_vector* w, ae_int_t n, /* Real */ ae_vector* xc, /* Real */ ae_vector* yc, /* Integer */ ae_vector* dc, ae_int_t k, ae_int_t m, ae_int_t* info, barycentricinterpolant* p, polynomialfitreport* rep, ae_state *_state) { polynomialfitwc(x,y,w,n,xc,yc,dc,k,m,info,p,rep, _state); } /************************************************************************* This function calculates value of four-parameter logistic (4PL) model at specified point X. 4PL model has following form: F(x|A,B,C,D) = D+(A-D)/(1+Power(x/C,B)) INPUT PARAMETERS: X - current point, X>=0: * zero X is correctly handled even for B<=0 * negative X results in exception. A, B, C, D- parameters of 4PL model: * A is unconstrained * B is unconstrained; zero or negative values are handled correctly. * C>0, non-positive value results in exception * D is unconstrained RESULT: model value at X NOTE: if B=0, denominator is assumed to be equal to 2.0 even for zero X (strictly speaking, 0^0 is undefined). NOTE: this function also throws exception if all input parameters are correct, but overflow was detected during calculations. NOTE: this function performs a lot of checks; if you need really high performance, consider evaluating model yourself, without checking for degenerate cases. 
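A standalone sketch of the 4PL formula, mirroring the degenerate-case handling described above and in the implementation that follows (plain C, no finiteness or overflow checks; logistic4 is an illustrative name, the library routine is LogisticCalc4):

#include <math.h>
#include <stdio.h>

/* F(x|A,B,C,D) = D + (A-D)/(1+(x/C)^B);  B=0 -> denominator 2,
   x=0 -> A for B>0 and D for B<0. */
static double logistic4(double x, double a, double b, double c, double d)
{
    if( b==0.0 )
        return 0.5*(a+d);
    if( x==0.0 )
        return b>0.0 ? a : d;
    return d + (a-d)/(1.0+pow(x/c, b));
}

int main(void)
{
    /* A=1, B=2, C=10, D=0: at x=C the model is exactly halfway between A and D */
    printf("%f\n", logistic4(10.0, 1.0, 2.0, 10.0, 0.0));   /* prints 0.500000 */
    return 0;
}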
-- ALGLIB PROJECT -- Copyright 14.05.2014 by Bochkanov Sergey *************************************************************************/ double logisticcalc4(double x, double a, double b, double c, double d, ae_state *_state) { double result; ae_assert(ae_isfinite(x, _state), "LogisticCalc4: X is not finite", _state); ae_assert(ae_isfinite(a, _state), "LogisticCalc4: A is not finite", _state); ae_assert(ae_isfinite(b, _state), "LogisticCalc4: B is not finite", _state); ae_assert(ae_isfinite(c, _state), "LogisticCalc4: C is not finite", _state); ae_assert(ae_isfinite(d, _state), "LogisticCalc4: D is not finite", _state); ae_assert(ae_fp_greater_eq(x,(double)(0)), "LogisticCalc4: X is negative", _state); ae_assert(ae_fp_greater(c,(double)(0)), "LogisticCalc4: C is non-positive", _state); /* * Check for degenerate cases */ if( ae_fp_eq(b,(double)(0)) ) { result = 0.5*(a+d); return result; } if( ae_fp_eq(x,(double)(0)) ) { if( ae_fp_greater(b,(double)(0)) ) { result = a; } else { result = d; } return result; } /* * General case */ result = d+(a-d)/(1.0+ae_pow(x/c, b, _state)); ae_assert(ae_isfinite(result, _state), "LogisticCalc4: overflow during calculations", _state); return result; } /************************************************************************* This function calculates value of five-parameter logistic (5PL) model at specified point X. 5PL model has following form: F(x|A,B,C,D,G) = D+(A-D)/Power(1+Power(x/C,B),G) INPUT PARAMETERS: X - current point, X>=0: * zero X is correctly handled even for B<=0 * negative X results in exception. A, B, C, D, G- parameters of 5PL model: * A is unconstrained * B is unconstrained; zero or negative values are handled correctly. * C>0, non-positive value results in exception * D is unconstrained * G>0, non-positive value results in exception RESULT: model value at X NOTE: if B=0, denominator is assumed to be equal to Power(2.0,G) even for zero X (strictly speaking, 0^0 is undefined). NOTE: this function also throws exception if all input parameters are correct, but overflow was detected during calculations. NOTE: this function performs a lot of checks; if you need really high performance, consider evaluating model yourself, without checking for degenerate cases. 
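The same kind of sketch for the 5PL formula, mirroring the degenerate-case handling described above and in the implementation that follows (plain C, illustrative only; the library routine is LogisticCalc5). With G=1 the model reduces to the 4PL one:

#include <math.h>
#include <stdio.h>

/* F(x|A,B,C,D,G) = D + (A-D)/(1+(x/C)^B)^G;  B=0 -> denominator 2^G,
   x=0 -> A for B>0 and D for B<0. */
static double logistic5(double x, double a, double b, double c,
                        double d, double g)
{
    if( b==0.0 )
        return d + (a-d)/pow(2.0, g);
    if( x==0.0 )
        return b>0.0 ? a : d;
    return d + (a-d)/pow(1.0+pow(x/c, b), g);
}

int main(void)
{
    printf("%f\n", logistic5(10.0, 1.0, 2.0, 10.0, 0.0, 1.0));   /* prints 0.500000 */
    return 0;
}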
-- ALGLIB PROJECT -- Copyright 14.05.2014 by Bochkanov Sergey *************************************************************************/ double logisticcalc5(double x, double a, double b, double c, double d, double g, ae_state *_state) { double result; ae_assert(ae_isfinite(x, _state), "LogisticCalc5: X is not finite", _state); ae_assert(ae_isfinite(a, _state), "LogisticCalc5: A is not finite", _state); ae_assert(ae_isfinite(b, _state), "LogisticCalc5: B is not finite", _state); ae_assert(ae_isfinite(c, _state), "LogisticCalc5: C is not finite", _state); ae_assert(ae_isfinite(d, _state), "LogisticCalc5: D is not finite", _state); ae_assert(ae_isfinite(g, _state), "LogisticCalc5: G is not finite", _state); ae_assert(ae_fp_greater_eq(x,(double)(0)), "LogisticCalc5: X is negative", _state); ae_assert(ae_fp_greater(c,(double)(0)), "LogisticCalc5: C is non-positive", _state); ae_assert(ae_fp_greater(g,(double)(0)), "LogisticCalc5: G is non-positive", _state); /* * Check for degenerate cases */ if( ae_fp_eq(b,(double)(0)) ) { result = d+(a-d)/ae_pow(2.0, g, _state); return result; } if( ae_fp_eq(x,(double)(0)) ) { if( ae_fp_greater(b,(double)(0)) ) { result = a; } else { result = d; } return result; } /* * General case */ result = d+(a-d)/ae_pow(1.0+ae_pow(x/c, b, _state), g, _state); ae_assert(ae_isfinite(result, _state), "LogisticCalc5: overflow during calculations", _state); return result; } /************************************************************************* This function fits four-parameter logistic (4PL) model to data provided by user. 4PL model has following form: F(x|A,B,C,D) = D+(A-D)/(1+Power(x/C,B)) Here: * A, D - unconstrained (see LogisticFit4EC() for constrained 4PL) * B>=0 * C>0 IMPORTANT: output of this function is constrained in such way that B>0. Because 4PL model is symmetric with respect to B, there is no need to explore B<0. Constraining B makes algorithm easier to stabilize and debug. Users who for some reason prefer to work with negative B's should transform output themselves (swap A and D, replace B by -B). 4PL fitting is implemented as follows: * we perform small number of restarts from random locations which helps to solve problem of bad local extrema. Locations are only partially random - we use input data to determine good initial guess, but we include controlled amount of randomness. * we perform Levenberg-Marquardt fitting with very tight constraints on parameters B and C - it allows us to find good initial guess for the second stage without risk of running into "flat spot". * second Levenberg-Marquardt round is performed without excessive constraints. Results from the previous round are used as initial guess. * after fitting is done, we compare results with best values found so far, rewrite "best solution" if needed, and move to next random location. Overall algorithm is very stable and is not prone to bad local extrema. Furthermore, it automatically scales when input data have very large or very small range. INPUT PARAMETERS: X - array[N], stores X-values. MUST include only non-negative numbers (but may include zero values). Can be unsorted. Y - array[N], values to fit. N - number of points. If N is less than length of X/Y, only leading N elements are used. OUTPUT PARAMETERS: A, B, C, D- parameters of 4PL model Rep - fitting report. 
This structure has many fields, but ONLY ONES LISTED BELOW ARE SET: * Rep.IterationsCount - number of iterations performed * Rep.RMSError - root-mean-square error * Rep.AvgError - average absolute error * Rep.AvgRelError - average relative error (calculated for non-zero Y-values) * Rep.MaxError - maximum absolute error * Rep.R2 - coefficient of determination, R-squared. This coefficient is calculated as R2=1-RSS/TSS (in case of nonlinear regression there are multiple ways to define R2, each of them giving different results). NOTE: after you obtained coefficients, you can evaluate model with LogisticCalc4() function. NOTE: if you need better control over fitting process than provided by this function, you may use LogisticFit45X(). NOTE: step is automatically scaled according to scale of parameters being fitted before we compare its length with EpsX. Thus, this function can be used to fit data with very small or very large values without changing EpsX. -- ALGLIB PROJECT -- Copyright 14.02.2014 by Bochkanov Sergey *************************************************************************/ void logisticfit4(/* Real */ ae_vector* x, /* Real */ ae_vector* y, ae_int_t n, double* a, double* b, double* c, double* d, lsfitreport* rep, ae_state *_state) { ae_frame _frame_block; ae_vector _x; ae_vector _y; double g; ae_frame_make(_state, &_frame_block); ae_vector_init_copy(&_x, x, _state); x = &_x; ae_vector_init_copy(&_y, y, _state); y = &_y; *a = 0; *b = 0; *c = 0; *d = 0; _lsfitreport_clear(rep); logisticfit45x(x, y, n, _state->v_nan, _state->v_nan, ae_true, 0.0, 0.0, 0, a, b, c, d, &g, rep, _state); ae_frame_leave(_state); } /************************************************************************* This function fits four-parameter logistic (4PL) model to data provided by user, with optional constraints on parameters A and D. 4PL model has following form: F(x|A,B,C,D) = D+(A-D)/(1+Power(x/C,B)) Here: * A, D - with optional equality constraints * B>=0 * C>0 IMPORTANT: output of this function is constrained in such way that B>0. Because 4PL model is symmetric with respect to B, there is no need to explore B<0. Constraining B makes algorithm easier to stabilize and debug. Users who for some reason prefer to work with negative B's should transform output themselves (swap A and D, replace B by -B). 4PL fitting is implemented as follows: * we perform small number of restarts from random locations which helps to solve problem of bad local extrema. Locations are only partially random - we use input data to determine good initial guess, but we include controlled amount of randomness. * we perform Levenberg-Marquardt fitting with very tight constraints on parameters B and C - it allows us to find good initial guess for the second stage without risk of running into "flat spot". * second Levenberg-Marquardt round is performed without excessive constraints. Results from the previous round are used as initial guess. * after fitting is done, we compare results with best values found so far, rewrite "best solution" if needed, and move to next random location. Overall algorithm is very stable and is not prone to bad local extrema. Furthermore, it automatically scales when input data have very large or very small range. INPUT PARAMETERS: X - array[N], stores X-values. MUST include only non-negative numbers (but may include zero values). Can be unsorted. Y - array[N], values to fit. N - number of points. If N is less than length of X/Y, only leading N elements are used. 
CnstrLeft- optional equality constraint for model value at the left boundary (at X=0). Specify NAN (Not-a-Number) if you do not need constraint on the model value at X=0 (in C++ you can pass alglib::fp_nan as parameter, in C# it will be Double.NaN). See below, section "EQUALITY CONSTRAINTS" for more information about constraints. CnstrRight- optional equality constraint for model value at X=infinity. Specify NAN (Not-a-Number) if you do not need constraint on the model value (in C++ you can pass alglib::fp_nan as parameter, in C# it will be Double.NaN). See below, section "EQUALITY CONSTRAINTS" for more information about constraints. OUTPUT PARAMETERS: A, B, C, D- parameters of 4PL model Rep - fitting report. This structure has many fields, but ONLY ONES LISTED BELOW ARE SET: * Rep.IterationsCount - number of iterations performed * Rep.RMSError - root-mean-square error * Rep.AvgError - average absolute error * Rep.AvgRelError - average relative error (calculated for non-zero Y-values) * Rep.MaxError - maximum absolute error * Rep.R2 - coefficient of determination, R-squared. This coefficient is calculated as R2=1-RSS/TSS (in case of nonlinear regression there are multiple ways to define R2, each of them giving different results). NOTE: after you obtained coefficients, you can evaluate model with LogisticCalc4() function. NOTE: if you need better control over fitting process than provided by this function, you may use LogisticFit45X(). NOTE: step is automatically scaled according to scale of parameters being fitted before we compare its length with EpsX. Thus, this function can be used to fit data with very small or very large values without changing EpsX. EQUALITY CONSTRAINTS ON PARAMETERS 4PL/5PL solver supports equality constraints on model values at the left boundary (X=0) and right boundary (X=infinity). These constraints are completely optional and you can specify both of them, only one - or no constraints at all. Parameter CnstrLeft contains left constraint (or NAN for unconstrained fitting), and CnstrRight contains right one. For 4PL, left constraint ALWAYS corresponds to parameter A, and right one is ALWAYS constraint on D. That's because 4PL model is normalized in such way that B>=0. -- ALGLIB PROJECT -- Copyright 14.02.2014 by Bochkanov Sergey *************************************************************************/ void logisticfit4ec(/* Real */ ae_vector* x, /* Real */ ae_vector* y, ae_int_t n, double cnstrleft, double cnstrright, double* a, double* b, double* c, double* d, lsfitreport* rep, ae_state *_state) { ae_frame _frame_block; ae_vector _x; ae_vector _y; double g; ae_frame_make(_state, &_frame_block); ae_vector_init_copy(&_x, x, _state); x = &_x; ae_vector_init_copy(&_y, y, _state); y = &_y; *a = 0; *b = 0; *c = 0; *d = 0; _lsfitreport_clear(rep); logisticfit45x(x, y, n, cnstrleft, cnstrright, ae_true, 0.0, 0.0, 0, a, b, c, d, &g, rep, _state); ae_frame_leave(_state); } /************************************************************************* This function fits five-parameter logistic (5PL) model to data provided by user. 5PL model has following form: F(x|A,B,C,D,G) = D+(A-D)/Power(1+Power(x/C,B),G) Here: * A, D - unconstrained * B - unconstrained * C>0 * G>0 IMPORTANT: unlike in 4PL fitting, output of this function is NOT constrained in such way that B is guaranteed to be positive. Furthermore, unlike 4PL, 5PL model is NOT symmetric with respect to B, so you can NOT transform model to equivalent one, with B having desired sign (>0 or <0). 
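USAGE SKETCH. Before the 5PL discussion continues below, here is a minimal sketch of driving the 4PL fitting functions documented above, LogisticFit4() and LogisticFit4EC(). It assumes the usual C++ wrapper signatures in the alglib namespace (real_1d_array inputs, lsfitreport output, alglib::fp_nan for an omitted constraint); it is not taken from the ALGLIB sources and the data values are made up for illustration.

#include <stdio.h>
#include "interpolation.h"
using namespace alglib;

int main()
{
    // Illustrative dose-response data; X must be non-negative (zero is allowed).
    real_1d_array x = "[0.0, 0.5, 1.0, 2.0, 4.0, 8.0, 16.0]";
    real_1d_array y = "[0.07, 0.09, 0.15, 0.36, 0.68, 0.91, 0.97]";
    double a, b, c, d;
    lsfitreport rep;

    // Unconstrained 4PL fit; the returned B is always positive.
    logisticfit4(x, y, 7, a, b, c, d, rep);
    printf("4PL:    A=%0.3f B=%0.3f C=%0.3f D=%0.3f rms=%0.4f\n", a, b, c, d, rep.rmserror);

    // Constrained variant: pin the model value at X=0 to 0.05,
    // leave the value at X=+infinity unconstrained (fp_nan).
    logisticfit4ec(x, y, 7, 0.05, fp_nan, a, b, c, d, rep);
    printf("4PL/EC: A=%0.3f D=%0.3f rms=%0.4f\n", a, d, rep.rmserror);
    return 0;
}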
5PL fitting is implemented as follows: * we perform small number of restarts from random locations which helps to solve problem of bad local extrema. Locations are only partially random - we use input data to determine good initial guess, but we include controlled amount of randomness. * we perform Levenberg-Marquardt fitting with very tight constraints on parameters B and C - it allows us to find good initial guess for the second stage without risk of running into "flat spot". Parameter G is fixed at G=1. * second Levenberg-Marquardt round is performed without excessive constraints on B and C, but with G still equal to 1. Results from the previous round are used as initial guess. * third Levenberg-Marquardt round relaxes constraints on G and tries two different models - one with B>0 and one with B<0. * after fitting is done, we compare results with best values found so far, rewrite "best solution" if needed, and move to next random location. Overall algorithm is very stable and is not prone to bad local extrema. Furthermore, it automatically scales when input data have very large or very small range. INPUT PARAMETERS: X - array[N], stores X-values. MUST include only non-negative numbers (but may include zero values). Can be unsorted. Y - array[N], values to fit. N - number of points. If N is less than length of X/Y, only leading N elements are used. OUTPUT PARAMETERS: A,B,C,D,G- parameters of 5PL model Rep - fitting report. This structure has many fields, but ONLY ONES LISTED BELOW ARE SET: * Rep.IterationsCount - number of iterations performed * Rep.RMSError - root-mean-square error * Rep.AvgError - average absolute error * Rep.AvgRelError - average relative error (calculated for non-zero Y-values) * Rep.MaxError - maximum absolute error * Rep.R2 - coefficient of determination, R-squared. This coefficient is calculated as R2=1-RSS/TSS (in case of nonlinear regression there are multiple ways to define R2, each of them giving different results). NOTE: after you obtained coefficients, you can evaluate model with LogisticCalc5() function. NOTE: if you need better control over fitting process than provided by this function, you may use LogisticFit45X(). NOTE: step is automatically scaled according to scale of parameters being fitted before we compare its length with EpsX. Thus, this function can be used to fit data with very small or very large values without changing EpsX. -- ALGLIB PROJECT -- Copyright 14.02.2014 by Bochkanov Sergey *************************************************************************/ void logisticfit5(/* Real */ ae_vector* x, /* Real */ ae_vector* y, ae_int_t n, double* a, double* b, double* c, double* d, double* g, lsfitreport* rep, ae_state *_state) { ae_frame _frame_block; ae_vector _x; ae_vector _y; ae_frame_make(_state, &_frame_block); ae_vector_init_copy(&_x, x, _state); x = &_x; ae_vector_init_copy(&_y, y, _state); y = &_y; *a = 0; *b = 0; *c = 0; *d = 0; *g = 0; _lsfitreport_clear(rep); logisticfit45x(x, y, n, _state->v_nan, _state->v_nan, ae_false, 0.0, 0.0, 0, a, b, c, d, g, rep, _state); ae_frame_leave(_state); } /************************************************************************* This function fits five-parameter logistic (5PL) model to data provided by user, subject to optional equality constraints on parameters A and D. 
5PL model has following form: F(x|A,B,C,D,G) = D+(A-D)/Power(1+Power(x/C,B),G) Here: * A, D - with optional equality constraints * B - unconstrained * C>0 * G>0 IMPORTANT: unlike in 4PL fitting, output of this function is NOT constrained in such way that B is guaranteed to be positive. Furthermore, unlike 4PL, 5PL model is NOT symmetric with respect to B, so you can NOT transform model to equivalent one, with B having desired sign (>0 or <0). 5PL fitting is implemented as follows: * we perform small number of restarts from random locations which helps to solve problem of bad local extrema. Locations are only partially random - we use input data to determine good initial guess, but we include controlled amount of randomness. * we perform Levenberg-Marquardt fitting with very tight constraints on parameters B and C - it allows us to find good initial guess for the second stage without risk of running into "flat spot". Parameter G is fixed at G=1. * second Levenberg-Marquardt round is performed without excessive constraints on B and C, but with G still equal to 1. Results from the previous round are used as initial guess. * third Levenberg-Marquardt round relaxes constraints on G and tries two different models - one with B>0 and one with B<0. * after fitting is done, we compare results with best values found so far, rewrite "best solution" if needed, and move to next random location. Overall algorithm is very stable and is not prone to bad local extrema. Furthermore, it automatically scales when input data have very large or very small range. INPUT PARAMETERS: X - array[N], stores X-values. MUST include only non-negative numbers (but may include zero values). Can be unsorted. Y - array[N], values to fit. N - number of points. If N is less than length of X/Y, only leading N elements are used. CnstrLeft- optional equality constraint for model value at the left boundary (at X=0). Specify NAN (Not-a-Number) if you do not need constraint on the model value at X=0 (in C++ you can pass alglib::fp_nan as parameter, in C# it will be Double.NaN). See below, section "EQUALITY CONSTRAINTS" for more information about constraints. CnstrRight- optional equality constraint for model value at X=infinity. Specify NAN (Not-a-Number) if you do not need constraint on the model value (in C++ you can pass alglib::fp_nan as parameter, in C# it will be Double.NaN). See below, section "EQUALITY CONSTRAINTS" for more information about constraints. OUTPUT PARAMETERS: A,B,C,D,G- parameters of 5PL model Rep - fitting report. This structure has many fields, but ONLY ONES LISTED BELOW ARE SET: * Rep.IterationsCount - number of iterations performed * Rep.RMSError - root-mean-square error * Rep.AvgError - average absolute error * Rep.AvgRelError - average relative error (calculated for non-zero Y-values) * Rep.MaxError - maximum absolute error * Rep.R2 - coefficient of determination, R-squared. This coefficient is calculated as R2=1-RSS/TSS (in case of nonlinear regression there are multiple ways to define R2, each of them giving different results). NOTE: after you obtained coefficients, you can evaluate model with LogisticCalc5() function. NOTE: if you need better control over fitting process than provided by this function, you may use LogisticFit45X(). NOTE: step is automatically scaled according to scale of parameters being fitted before we compare its length with EpsX. Thus, this function can be used to fit data with very small or very large values without changing EpsX. 
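USAGE SKETCH. A minimal sketch of 5PL fitting with LogisticFit5() and its constrained counterpart LogisticFit5EC(), again assuming the usual C++ wrapper signatures in the alglib namespace; it is not taken from the ALGLIB sources and the data values are made up for illustration. Remember that for 5PL the sign of B is not fixed, so a left constraint may end up pinning either A or D (see the EQUALITY CONSTRAINTS section below).

#include <stdio.h>
#include "interpolation.h"
using namespace alglib;

int main()
{
    real_1d_array x = "[0.0, 0.5, 1.0, 2.0, 4.0, 8.0, 16.0]";
    real_1d_array y = "[0.07, 0.09, 0.15, 0.36, 0.68, 0.91, 0.97]";
    double a, b, c, d, g;
    lsfitreport rep;

    // Unconstrained 5PL fit; B may come out negative, in which case
    // A describes the right asymptote and D the left one.
    logisticfit5(x, y, 7, a, b, c, d, g, rep);
    printf("5PL:    A=%0.3f B=%0.3f C=%0.3f D=%0.3f G=%0.3f\n", a, b, c, d, g);

    // Constrained variant: model value at X=0 pinned to 0.05, X=+infinity left free.
    logisticfit5ec(x, y, 7, 0.05, fp_nan, a, b, c, d, g, rep);
    printf("5PL/EC: rms=%0.4f avg=%0.4f\n", rep.rmserror, rep.avgerror);
    return 0;
}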
EQUALITY CONSTRAINTS ON PARAMETERS 5PL solver supports equality constraints on model values at the left boundary (X=0) and right boundary (X=infinity). These constraints are completely optional and you can specify both of them, only one - or no constraints at all. Parameter CnstrLeft contains left constraint (or NAN for unconstrained fitting), and CnstrRight contains right one. Unlike 4PL one, 5PL model is NOT symmetric with respect to change in sign of B. Thus, negative B's are possible, and left constraint may constrain parameter A (for positive B's) - or parameter D (for negative B's). Similarly changes meaning of right constraint. You do not have to decide what parameter to constrain - algorithm will automatically determine correct parameters as fitting progresses. However, question highlighted above is important when you interpret fitting results. -- ALGLIB PROJECT -- Copyright 14.02.2014 by Bochkanov Sergey *************************************************************************/ void logisticfit5ec(/* Real */ ae_vector* x, /* Real */ ae_vector* y, ae_int_t n, double cnstrleft, double cnstrright, double* a, double* b, double* c, double* d, double* g, lsfitreport* rep, ae_state *_state) { ae_frame _frame_block; ae_vector _x; ae_vector _y; ae_frame_make(_state, &_frame_block); ae_vector_init_copy(&_x, x, _state); x = &_x; ae_vector_init_copy(&_y, y, _state); y = &_y; *a = 0; *b = 0; *c = 0; *d = 0; *g = 0; _lsfitreport_clear(rep); logisticfit45x(x, y, n, cnstrleft, cnstrright, ae_false, 0.0, 0.0, 0, a, b, c, d, g, rep, _state); ae_frame_leave(_state); } /************************************************************************* This is "expert" 4PL/5PL fitting function, which can be used if you need better control over fitting process than provided by LogisticFit4() or LogisticFit5(). This function fits model of the form F(x|A,B,C,D) = D+(A-D)/(1+Power(x/C,B)) (4PL model) or F(x|A,B,C,D,G) = D+(A-D)/Power(1+Power(x/C,B),G) (5PL model) Here: * A, D - unconstrained * B>=0 for 4PL, unconstrained for 5PL * C>0 * G>0 (if present) INPUT PARAMETERS: X - array[N], stores X-values. MUST include only non-negative numbers (but may include zero values). Can be unsorted. Y - array[N], values to fit. N - number of points. If N is less than length of X/Y, only leading N elements are used. CnstrLeft- optional equality constraint for model value at the left boundary (at X=0). Specify NAN (Not-a-Number) if you do not need constraint on the model value at X=0 (in C++ you can pass alglib::fp_nan as parameter, in C# it will be Double.NaN). See below, section "EQUALITY CONSTRAINTS" for more information about constraints. CnstrRight- optional equality constraint for model value at X=infinity. Specify NAN (Not-a-Number) if you do not need constraint on the model value (in C++ you can pass alglib::fp_nan as parameter, in C# it will be Double.NaN). See below, section "EQUALITY CONSTRAINTS" for more information about constraints. Is4PL - whether 4PL or 5PL models are fitted LambdaV - regularization coefficient, LambdaV>=0. Set it to zero unless you know what you are doing. EpsX - stopping condition (step size), EpsX>=0. Zero value means that small step is automatically chosen. See notes below for more information. RsCnt - number of repeated restarts from random points. 4PL/5PL models are prone to problem of bad local extrema. Utilizing multiple random restarts allows us to improve algorithm convergence. RsCnt>=0. Zero value means that function automatically choose small amount of restarts (recommended). 
OUTPUT PARAMETERS: A, B, C, D- parameters of 4PL model G - parameter of 5PL model; for Is4PL=True, G=1 is returned. Rep - fitting report. This structure has many fields, but ONLY ONES LISTED BELOW ARE SET: * Rep.IterationsCount - number of iterations performed * Rep.RMSError - root-mean-square error * Rep.AvgError - average absolute error * Rep.AvgRelError - average relative error (calculated for non-zero Y-values) * Rep.MaxError - maximum absolute error * Rep.R2 - coefficient of determination, R-squared. This coefficient is calculated as R2=1-RSS/TSS (in case of nonlinear regression there are multiple ways to define R2, each of them giving different results). NOTE: after you obtained coefficients, you can evaluate model with LogisticCalc5() function. NOTE: step is automatically scaled according to scale of parameters being fitted before we compare its length with EpsX. Thus, this function can be used to fit data with very small or very large values without changing EpsX. EQUALITY CONSTRAINTS ON PARAMETERS 4PL/5PL solver supports equality constraints on model values at the left boundary (X=0) and right boundary (X=infinity). These constraints are completely optional and you can specify both of them, only one - or no constraints at all. Parameter CnstrLeft contains left constraint (or NAN for unconstrained fitting), and CnstrRight contains right one. For 4PL, left constraint ALWAYS corresponds to parameter A, and right one is ALWAYS constraint on D. That's because 4PL model is normalized in such way that B>=0. For 5PL model things are different. Unlike 4PL one, 5PL model is NOT symmetric with respect to change in sign of B. Thus, negative B's are possible, and left constraint may constrain parameter A (for positive B's) - or parameter D (for negative B's). Similarly changes meaning of right constraint. You do not have to decide what parameter to constrain - algorithm will automatically determine correct parameters as fitting progresses. However, question highlighted above is important when you interpret fitting results. 
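USAGE SKETCH. A minimal sketch of the expert entry point LogisticFit45X() documented above. It assumes that the C++ wrapper in the alglib namespace mirrors the argument order of the computational core shown below (constraints, Is4PL flag, LambdaV, EpsX, RsCnt, then outputs); it is not taken from the ALGLIB sources and the data and settings are made up for illustration.

#include <stdio.h>
#include "interpolation.h"
using namespace alglib;

int main()
{
    real_1d_array x = "[0.0, 0.5, 1.0, 2.0, 4.0, 8.0, 16.0]";
    real_1d_array y = "[0.07, 0.09, 0.15, 0.36, 0.68, 0.91, 0.97]";
    double a, b, c, d, g;
    lsfitreport rep;

    // Expert call: no equality constraints (fp_nan), full 5PL model (is4pl=false),
    // no extra regularization (LambdaV=0), automatic step tolerance (EpsX=0),
    // automatically chosen number of random restarts (RsCnt=0).
    logisticfit45x(x, y, 7, fp_nan, fp_nan,
                   false,   // Is4PL
                   0.0,     // LambdaV
                   0.0,     // EpsX
                   0,       // RsCnt
                   a, b, c, d, g, rep);
    printf("A=%0.3f B=%0.3f C=%0.3f D=%0.3f G=%0.3f iterations=%d\n",
           a, b, c, d, g, (int)rep.iterationscount);
    return 0;
}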
-- ALGLIB PROJECT -- Copyright 14.02.2014 by Bochkanov Sergey *************************************************************************/ void logisticfit45x(/* Real */ ae_vector* x, /* Real */ ae_vector* y, ae_int_t n, double cnstrleft, double cnstrright, ae_bool is4pl, double lambdav, double epsx, ae_int_t rscnt, double* a, double* b, double* c, double* d, double* g, lsfitreport* rep, ae_state *_state) { ae_frame _frame_block; ae_vector _x; ae_vector _y; ae_int_t i; ae_int_t outerit; ae_int_t nz; double v; ae_vector p0; ae_vector p1; ae_vector p2; ae_vector bndl; ae_vector bndu; ae_vector s; ae_vector bndl1; ae_vector bndu1; ae_vector bndl2; ae_vector bndu2; ae_matrix z; hqrndstate rs; minlmstate state; minlmreport replm; ae_int_t maxits; double fbest; double flast; double scalex; double scaley; ae_vector bufx; ae_vector bufy; double fposb; double fnegb; ae_frame_make(_state, &_frame_block); ae_vector_init_copy(&_x, x, _state); x = &_x; ae_vector_init_copy(&_y, y, _state); y = &_y; *a = 0; *b = 0; *c = 0; *d = 0; *g = 0; _lsfitreport_clear(rep); ae_vector_init(&p0, 0, DT_REAL, _state); ae_vector_init(&p1, 0, DT_REAL, _state); ae_vector_init(&p2, 0, DT_REAL, _state); ae_vector_init(&bndl, 0, DT_REAL, _state); ae_vector_init(&bndu, 0, DT_REAL, _state); ae_vector_init(&s, 0, DT_REAL, _state); ae_vector_init(&bndl1, 0, DT_REAL, _state); ae_vector_init(&bndu1, 0, DT_REAL, _state); ae_vector_init(&bndl2, 0, DT_REAL, _state); ae_vector_init(&bndu2, 0, DT_REAL, _state); ae_matrix_init(&z, 0, 0, DT_REAL, _state); _hqrndstate_init(&rs, _state); _minlmstate_init(&state, _state); _minlmreport_init(&replm, _state); ae_vector_init(&bufx, 0, DT_REAL, _state); ae_vector_init(&bufy, 0, DT_REAL, _state); ae_assert(ae_isfinite(epsx, _state), "LogisticFitX: EpsX is infinite/NAN", _state); ae_assert(ae_isfinite(lambdav, _state), "LogisticFitX: LambdaV is infinite/NAN", _state); ae_assert(ae_isfinite(cnstrleft, _state)||ae_isnan(cnstrleft, _state), "LogisticFitX: CnstrLeft is NOT finite or NAN", _state); ae_assert(ae_isfinite(cnstrright, _state)||ae_isnan(cnstrright, _state), "LogisticFitX: CnstrRight is NOT finite or NAN", _state); ae_assert(ae_fp_greater_eq(lambdav,(double)(0)), "LogisticFitX: negative LambdaV", _state); ae_assert(n>0, "LogisticFitX: N<=0", _state); ae_assert(rscnt>=0, "LogisticFitX: RsCnt<0", _state); ae_assert(ae_fp_greater_eq(epsx,(double)(0)), "LogisticFitX: EpsX<0", _state); ae_assert(x->cnt>=n, "LogisticFitX: Length(X)cnt>=n, "LogisticFitX: Length(Y)ptr.p_double[0],(double)(0)), "LogisticFitX: some X[] are negative", _state); nz = n; for(i=0; i<=n-1; i++) { if( ae_fp_greater(x->ptr.p_double[i],(double)(0)) ) { nz = i; break; } } /* * For NZ=N (all X[] are zero) special code is used. * For NZiterationscount = 0; if( nz==n ) { /* * NZ=N, degenerate problem. * No need to run optimizer. */ v = 0.0; for(i=0; i<=n-1; i++) { v = v+y->ptr.p_double[i]; } v = v/n; if( ae_isfinite(cnstrleft, _state) ) { *a = cnstrleft; } else { *a = v; } *b = (double)(1); *c = (double)(1); if( ae_isfinite(cnstrright, _state) ) { *d = cnstrright; } else { *d = *a; } *g = (double)(1); lsfit_logisticfit45errors(x, y, n, *a, *b, *c, *d, *g, rep, _state); ae_frame_leave(_state); return; } /* * Non-degenerate problem. * Determine scale of data. 
*/ scalex = x->ptr.p_double[nz+(n-nz)/2]; ae_assert(ae_fp_greater(scalex,(double)(0)), "LogisticFitX: internal error", _state); v = 0.0; for(i=0; i<=n-1; i++) { v = v+y->ptr.p_double[i]; } v = v/n; scaley = 0.0; for(i=0; i<=n-1; i++) { scaley = scaley+ae_sqr(y->ptr.p_double[i]-v, _state); } scaley = ae_sqrt(scaley/n, _state); if( ae_fp_eq(scaley,(double)(0)) ) { scaley = 1.0; } ae_vector_set_length(&s, 5, _state); s.ptr.p_double[0] = scaley; s.ptr.p_double[1] = 0.1; s.ptr.p_double[2] = scalex; s.ptr.p_double[3] = scaley; s.ptr.p_double[4] = 0.1; ae_vector_set_length(&p0, 5, _state); p0.ptr.p_double[0] = (double)(0); p0.ptr.p_double[1] = (double)(0); p0.ptr.p_double[2] = (double)(0); p0.ptr.p_double[3] = (double)(0); p0.ptr.p_double[4] = (double)(0); ae_vector_set_length(&bndl, 5, _state); ae_vector_set_length(&bndu, 5, _state); ae_vector_set_length(&bndl1, 5, _state); ae_vector_set_length(&bndu1, 5, _state); ae_vector_set_length(&bndl2, 5, _state); ae_vector_set_length(&bndu2, 5, _state); minlmcreatevj(5, n+5, &p0, &state, _state); minlmsetscale(&state, &s, _state); minlmsetcond(&state, epsx, maxits, _state); minlmsetxrep(&state, ae_true, _state); ae_vector_set_length(&p1, 5, _state); ae_vector_set_length(&p2, 5, _state); /* * Is it 4PL problem? */ if( is4pl ) { /* * Run outer iterations */ *a = (double)(0); *b = (double)(1); *c = (double)(1); *d = (double)(1); *g = (double)(1); fbest = ae_maxrealnumber; for(outerit=0; outerit<=rscnt-1; outerit++) { /* * Prepare initial point; use B>0 */ if( ae_isfinite(cnstrleft, _state) ) { p1.ptr.p_double[0] = cnstrleft; } else { p1.ptr.p_double[0] = y->ptr.p_double[0]+0.15*scaley*(hqrnduniformr(&rs, _state)-0.5); } p1.ptr.p_double[1] = 0.5+hqrnduniformr(&rs, _state); p1.ptr.p_double[2] = x->ptr.p_double[nz+hqrnduniformi(&rs, n-nz, _state)]; if( ae_isfinite(cnstrright, _state) ) { p1.ptr.p_double[3] = cnstrright; } else { p1.ptr.p_double[3] = y->ptr.p_double[n-1]+0.25*scaley*(hqrnduniformr(&rs, _state)-0.5); } p1.ptr.p_double[4] = 1.0; /* * Run optimization with tight constraints and increased regularization */ if( ae_isfinite(cnstrleft, _state) ) { bndl.ptr.p_double[0] = cnstrleft; bndu.ptr.p_double[0] = cnstrleft; } else { bndl.ptr.p_double[0] = _state->v_neginf; bndu.ptr.p_double[0] = _state->v_posinf; } bndl.ptr.p_double[1] = 0.5; bndu.ptr.p_double[1] = 2.0; bndl.ptr.p_double[2] = 0.5*scalex; bndu.ptr.p_double[2] = 2.0*scalex; if( ae_isfinite(cnstrright, _state) ) { bndl.ptr.p_double[3] = cnstrright; bndu.ptr.p_double[3] = cnstrright; } else { bndl.ptr.p_double[3] = _state->v_neginf; bndu.ptr.p_double[3] = _state->v_posinf; } bndl.ptr.p_double[4] = 1.0; bndu.ptr.p_double[4] = 1.0; minlmsetbc(&state, &bndl, &bndu, _state); lsfit_logisticfitinternal(x, y, n, is4pl, 100*lambdav, &state, &replm, &p1, &flast, _state); rep->iterationscount = rep->iterationscount+replm.iterationscount; /* * Relax constraints, run optimization one more time */ bndl.ptr.p_double[1] = 0.1; bndu.ptr.p_double[1] = 10.0; bndl.ptr.p_double[2] = ae_machineepsilon*scalex; bndu.ptr.p_double[2] = scalex/ae_machineepsilon; minlmsetbc(&state, &bndl, &bndu, _state); lsfit_logisticfitinternal(x, y, n, is4pl, lambdav, &state, &replm, &p1, &flast, _state); rep->iterationscount = rep->iterationscount+replm.iterationscount; /* * Compare results with best value found so far. 
*/ if( ae_fp_less(flast,fbest) ) { *a = p1.ptr.p_double[0]; *b = p1.ptr.p_double[1]; *c = p1.ptr.p_double[2]; *d = p1.ptr.p_double[3]; *g = p1.ptr.p_double[4]; fbest = flast; } } lsfit_logisticfit45errors(x, y, n, *a, *b, *c, *d, *g, rep, _state); ae_frame_leave(_state); return; } /* * Well.... we have 5PL fit, and we have to test two separate branches: * B>0 and B<0, because of asymmetry in the curve. First, we run optimization * with tight constraints two times, in order to determine better sign for B. * * Run outer iterations */ *a = (double)(0); *b = (double)(1); *c = (double)(1); *d = (double)(1); *g = (double)(1); fbest = ae_maxrealnumber; for(outerit=0; outerit<=rscnt-1; outerit++) { /* * First, we try positive B. */ p1.ptr.p_double[0] = y->ptr.p_double[0]+0.15*scaley*(hqrnduniformr(&rs, _state)-0.5); p1.ptr.p_double[1] = 0.5+hqrnduniformr(&rs, _state); p1.ptr.p_double[2] = x->ptr.p_double[nz+hqrnduniformi(&rs, n-nz, _state)]; p1.ptr.p_double[3] = y->ptr.p_double[n-1]+0.25*scaley*(hqrnduniformr(&rs, _state)-0.5); p1.ptr.p_double[4] = 1.0; bndl1.ptr.p_double[0] = _state->v_neginf; bndu1.ptr.p_double[0] = _state->v_posinf; bndl1.ptr.p_double[1] = 0.5; bndu1.ptr.p_double[1] = 2.0; bndl1.ptr.p_double[2] = 0.5*scalex; bndu1.ptr.p_double[2] = 2.0*scalex; bndl1.ptr.p_double[3] = _state->v_neginf; bndu1.ptr.p_double[3] = _state->v_posinf; bndl1.ptr.p_double[4] = 0.5; bndu1.ptr.p_double[4] = 2.0; if( ae_isfinite(cnstrleft, _state) ) { p1.ptr.p_double[0] = cnstrleft; bndl1.ptr.p_double[0] = cnstrleft; bndu1.ptr.p_double[0] = cnstrleft; } if( ae_isfinite(cnstrright, _state) ) { p1.ptr.p_double[3] = cnstrright; bndl1.ptr.p_double[3] = cnstrright; bndu1.ptr.p_double[3] = cnstrright; } minlmsetbc(&state, &bndl1, &bndu1, _state); lsfit_logisticfitinternal(x, y, n, is4pl, 100*lambdav, &state, &replm, &p1, &fposb, _state); rep->iterationscount = rep->iterationscount+replm.iterationscount; /* * Second attempt - with negative B (constraints are still tight). 
*/ p2.ptr.p_double[0] = y->ptr.p_double[n-1]+0.15*scaley*(hqrnduniformr(&rs, _state)-0.5); p2.ptr.p_double[1] = -(0.5+hqrnduniformr(&rs, _state)); p2.ptr.p_double[2] = x->ptr.p_double[nz+hqrnduniformi(&rs, n-nz, _state)]; p2.ptr.p_double[3] = y->ptr.p_double[0]+0.25*scaley*(hqrnduniformr(&rs, _state)-0.5); p2.ptr.p_double[4] = 1.0; bndl2.ptr.p_double[0] = _state->v_neginf; bndu2.ptr.p_double[0] = _state->v_posinf; bndl2.ptr.p_double[1] = -2.0; bndu2.ptr.p_double[1] = -0.5; bndl2.ptr.p_double[2] = 0.5*scalex; bndu2.ptr.p_double[2] = 2.0*scalex; bndl2.ptr.p_double[3] = _state->v_neginf; bndu2.ptr.p_double[3] = _state->v_posinf; bndl2.ptr.p_double[4] = 0.5; bndu2.ptr.p_double[4] = 2.0; if( ae_isfinite(cnstrleft, _state) ) { p2.ptr.p_double[3] = cnstrleft; bndl2.ptr.p_double[3] = cnstrleft; bndu2.ptr.p_double[3] = cnstrleft; } if( ae_isfinite(cnstrright, _state) ) { p2.ptr.p_double[0] = cnstrright; bndl2.ptr.p_double[0] = cnstrright; bndu2.ptr.p_double[0] = cnstrright; } minlmsetbc(&state, &bndl2, &bndu2, _state); lsfit_logisticfitinternal(x, y, n, is4pl, 100*lambdav, &state, &replm, &p2, &fnegb, _state); rep->iterationscount = rep->iterationscount+replm.iterationscount; /* * Select best version of B sign */ if( ae_fp_less(fposb,fnegb) ) { /* * Prepare relaxed constraints assuming that B is positive */ bndl1.ptr.p_double[1] = 0.1; bndu1.ptr.p_double[1] = 10.0; bndl1.ptr.p_double[2] = ae_machineepsilon*scalex; bndu1.ptr.p_double[2] = scalex/ae_machineepsilon; bndl1.ptr.p_double[4] = 0.1; bndu1.ptr.p_double[4] = 10.0; minlmsetbc(&state, &bndl1, &bndu1, _state); lsfit_logisticfitinternal(x, y, n, is4pl, lambdav, &state, &replm, &p1, &flast, _state); rep->iterationscount = rep->iterationscount+replm.iterationscount; /* * Compare results with best value found so far. */ if( ae_fp_less(flast,fbest) ) { *a = p1.ptr.p_double[0]; *b = p1.ptr.p_double[1]; *c = p1.ptr.p_double[2]; *d = p1.ptr.p_double[3]; *g = p1.ptr.p_double[4]; fbest = flast; } } else { /* * Prepare relaxed constraints assuming that B is positive */ bndl2.ptr.p_double[1] = -10.0; bndu2.ptr.p_double[1] = -0.1; bndl2.ptr.p_double[2] = ae_machineepsilon*scalex; bndu2.ptr.p_double[2] = scalex/ae_machineepsilon; bndl2.ptr.p_double[4] = 0.1; bndu2.ptr.p_double[4] = 10.0; minlmsetbc(&state, &bndl2, &bndu2, _state); lsfit_logisticfitinternal(x, y, n, is4pl, lambdav, &state, &replm, &p2, &flast, _state); rep->iterationscount = rep->iterationscount+replm.iterationscount; /* * Compare results with best value found so far. */ if( ae_fp_less(flast,fbest) ) { *a = p2.ptr.p_double[0]; *b = p2.ptr.p_double[1]; *c = p2.ptr.p_double[2]; *d = p2.ptr.p_double[3]; *g = p2.ptr.p_double[4]; fbest = flast; } } } lsfit_logisticfit45errors(x, y, n, *a, *b, *c, *d, *g, rep, _state); ae_frame_leave(_state); } /************************************************************************* Weghted rational least squares fitting using Floater-Hormann rational functions with optimal D chosen from [0,9], with constraints and individual weights. Equidistant grid with M node on [min(x),max(x)] is used to build basis functions. Different values of D are tried, optimal D (least WEIGHTED root mean square error) is chosen. Task is linear, so linear least squares solver is used. Complexity of this computational scheme is O(N*M^2) (mostly dominated by the least squares solver). SEE ALSO * BarycentricFitFloaterHormann(), "lightweight" fitting without invididual weights and constraints. COMMERCIAL EDITION OF ALGLIB: ! 
Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multithreading support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Speed-up provided by multithreading greatly depends on problem size ! - only large problems (number of coefficients is more than 500) can be ! efficiently multithreaded. ! ! Generally, commercial ALGLIB is several times faster than open-source ! generic C edition, and many times faster than open-source C# edition. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS: X - points, array[0..N-1]. Y - function values, array[0..N-1]. W - weights, array[0..N-1] Each summand in square sum of approximation deviations from given values is multiplied by the square of corresponding weight. Fill it by 1's if you don't want to solve weighted task. N - number of points, N>0. XC - points where function values/derivatives are constrained, array[0..K-1]. YC - values of constraints, array[0..K-1] DC - array[0..K-1], types of constraints: * DC[i]=0 means that S(XC[i])=YC[i] * DC[i]=1 means that S'(XC[i])=YC[i] SEE BELOW FOR IMPORTANT INFORMATION ON CONSTRAINTS K - number of constraints, 0<=K=2. OUTPUT PARAMETERS: Info- same format as in LSFitLinearWC() subroutine. * Info>0 task is solved * Info<=0 an error occured: -4 means inconvergence of internal SVD -3 means inconsistent constraints -1 means another errors in parameters passed (N<=0, for example) B - barycentric interpolant. Rep - report, same format as in LSFitLinearWC() subroutine. Following fields are set: * DBest best value of the D parameter * RMSError rms error on the (X,Y). * AvgError average error on the (X,Y). * AvgRelError average relative error on the non-zero Y * MaxError maximum error NON-WEIGHTED ERRORS ARE CALCULATED IMPORTANT: this subroutine doesn't calculate task's condition number for K<>0. SETTING CONSTRAINTS - DANGERS AND OPPORTUNITIES: Setting constraints can lead to undesired results, like ill-conditioned behavior, or inconsistency being detected. From the other side, it allows us to improve quality of the fit. Here we summarize our experience with constrained barycentric interpolants: * excessive constraints can be inconsistent. Floater-Hormann basis functions aren't as flexible as splines (although they are very smooth). * the more evenly constraints are spread across [min(x),max(x)], the more chances that they will be consistent * the greater is M (given fixed constraints), the more chances that constraints will be consistent * in the general case, consistency of constraints IS NOT GUARANTEED. * in the several special cases, however, we CAN guarantee consistency. * one of this cases is constraints on the function VALUES at the interval boundaries. Note that consustency of the constraints on the function DERIVATIVES is NOT guaranteed (you can use in such cases cubic splines which are more flexible). * another special case is ONE constraint on the function value (OR, but not AND, derivative) anywhere in the interval Our final recommendation is to use constraints WHEN AND ONLY WHEN you can't solve your task without them. 
Anything beyond special cases given above is not guaranteed and may result in inconsistency. -- ALGLIB PROJECT -- Copyright 18.08.2009 by Bochkanov Sergey *************************************************************************/ void barycentricfitfloaterhormannwc(/* Real */ ae_vector* x, /* Real */ ae_vector* y, /* Real */ ae_vector* w, ae_int_t n, /* Real */ ae_vector* xc, /* Real */ ae_vector* yc, /* Integer */ ae_vector* dc, ae_int_t k, ae_int_t m, ae_int_t* info, barycentricinterpolant* b, barycentricfitreport* rep, ae_state *_state) { ae_frame _frame_block; ae_int_t d; ae_int_t i; double wrmscur; double wrmsbest; barycentricinterpolant locb; barycentricfitreport locrep; ae_int_t locinfo; ae_frame_make(_state, &_frame_block); *info = 0; _barycentricinterpolant_clear(b); _barycentricfitreport_clear(rep); _barycentricinterpolant_init(&locb, _state); _barycentricfitreport_init(&locrep, _state); ae_assert(n>0, "BarycentricFitFloaterHormannWC: N<=0!", _state); ae_assert(m>0, "BarycentricFitFloaterHormannWC: M<=0!", _state); ae_assert(k>=0, "BarycentricFitFloaterHormannWC: K<0!", _state); ae_assert(k=M!", _state); ae_assert(x->cnt>=n, "BarycentricFitFloaterHormannWC: Length(X)cnt>=n, "BarycentricFitFloaterHormannWC: Length(Y)cnt>=n, "BarycentricFitFloaterHormannWC: Length(W)cnt>=k, "BarycentricFitFloaterHormannWC: Length(XC)cnt>=k, "BarycentricFitFloaterHormannWC: Length(YC)cnt>=k, "BarycentricFitFloaterHormannWC: Length(DC)ptr.p_int[i]==0||dc->ptr.p_int[i]==1, "BarycentricFitFloaterHormannWC: one of DC[] is not 0 or 1!", _state); } /* * Find optimal D * * Info is -3 by default (degenerate constraints). * If LocInfo will always be equal to -3, Info will remain equal to -3. * If at least once LocInfo will be -4, Info will be -4. */ wrmsbest = ae_maxrealnumber; rep->dbest = -1; *info = -3; for(d=0; d<=ae_minint(9, n-1, _state); d++) { lsfit_barycentricfitwcfixedd(x, y, w, n, xc, yc, dc, k, m, d, &locinfo, &locb, &locrep, _state); ae_assert((locinfo==-4||locinfo==-3)||locinfo>0, "BarycentricFitFloaterHormannWC: unexpected result from BarycentricFitWCFixedD!", _state); if( locinfo>0 ) { /* * Calculate weghted RMS */ wrmscur = (double)(0); for(i=0; i<=n-1; i++) { wrmscur = wrmscur+ae_sqr(w->ptr.p_double[i]*(y->ptr.p_double[i]-barycentriccalc(&locb, x->ptr.p_double[i], _state)), _state); } wrmscur = ae_sqrt(wrmscur/n, _state); if( ae_fp_less(wrmscur,wrmsbest)||rep->dbest<0 ) { barycentriccopy(&locb, b, _state); rep->dbest = d; *info = 1; rep->rmserror = locrep.rmserror; rep->avgerror = locrep.avgerror; rep->avgrelerror = locrep.avgrelerror; rep->maxerror = locrep.maxerror; rep->taskrcond = locrep.taskrcond; wrmsbest = wrmscur; } } else { if( locinfo!=-3&&*info<0 ) { *info = locinfo; } } } ae_frame_leave(_state); } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. 
*************************************************************************/ void _pexec_barycentricfitfloaterhormannwc(/* Real */ ae_vector* x, /* Real */ ae_vector* y, /* Real */ ae_vector* w, ae_int_t n, /* Real */ ae_vector* xc, /* Real */ ae_vector* yc, /* Integer */ ae_vector* dc, ae_int_t k, ae_int_t m, ae_int_t* info, barycentricinterpolant* b, barycentricfitreport* rep, ae_state *_state) { barycentricfitfloaterhormannwc(x,y,w,n,xc,yc,dc,k,m,info,b,rep, _state); } /************************************************************************* Rational least squares fitting using Floater-Hormann rational functions with optimal D chosen from [0,9]. Equidistant grid with M node on [min(x),max(x)] is used to build basis functions. Different values of D are tried, optimal D (least root mean square error) is chosen. Task is linear, so linear least squares solver is used. Complexity of this computational scheme is O(N*M^2) (mostly dominated by the least squares solver). COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multithreading support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Speed-up provided by multithreading greatly depends on problem size ! - only large problems (number of coefficients is more than 500) can be ! efficiently multithreaded. ! ! Generally, commercial ALGLIB is several times faster than open-source ! generic C edition, and many times faster than open-source C# edition. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS: X - points, array[0..N-1]. Y - function values, array[0..N-1]. N - number of points, N>0. M - number of basis functions ( = number_of_nodes), M>=2. OUTPUT PARAMETERS: Info- same format as in LSFitLinearWC() subroutine. * Info>0 task is solved * Info<=0 an error occured: -4 means inconvergence of internal SVD -3 means inconsistent constraints B - barycentric interpolant. Rep - report, same format as in LSFitLinearWC() subroutine. Following fields are set: * DBest best value of the D parameter * RMSError rms error on the (X,Y). * AvgError average error on the (X,Y). 
* AvgRelError average relative error on the non-zero Y * MaxError maximum error NON-WEIGHTED ERRORS ARE CALCULATED -- ALGLIB PROJECT -- Copyright 18.08.2009 by Bochkanov Sergey *************************************************************************/ void barycentricfitfloaterhormann(/* Real */ ae_vector* x, /* Real */ ae_vector* y, ae_int_t n, ae_int_t m, ae_int_t* info, barycentricinterpolant* b, barycentricfitreport* rep, ae_state *_state) { ae_frame _frame_block; ae_vector w; ae_vector xc; ae_vector yc; ae_vector dc; ae_int_t i; ae_frame_make(_state, &_frame_block); *info = 0; _barycentricinterpolant_clear(b); _barycentricfitreport_clear(rep); ae_vector_init(&w, 0, DT_REAL, _state); ae_vector_init(&xc, 0, DT_REAL, _state); ae_vector_init(&yc, 0, DT_REAL, _state); ae_vector_init(&dc, 0, DT_INT, _state); ae_assert(n>0, "BarycentricFitFloaterHormann: N<=0!", _state); ae_assert(m>0, "BarycentricFitFloaterHormann: M<=0!", _state); ae_assert(x->cnt>=n, "BarycentricFitFloaterHormann: Length(X)cnt>=n, "BarycentricFitFloaterHormann: Length(Y)0 * if given, only first N elements of X/Y are processed * if not given, automatically determined from X/Y sizes M - number of basis functions ( = number_of_nodes), M>=4. Rho - regularization constant passed by user. It penalizes nonlinearity in the regression spline. It is logarithmically scaled, i.e. actual value of regularization constant is calculated as 10^Rho. It is automatically scaled so that: * Rho=2.0 corresponds to moderate amount of nonlinearity * generally, it should be somewhere in the [-8.0,+8.0] If you do not want to penalize nonlineary, pass small Rho. Values as low as -15 should work. OUTPUT PARAMETERS: Info- same format as in LSFitLinearWC() subroutine. * Info>0 task is solved * Info<=0 an error occured: -4 means inconvergence of internal SVD or Cholesky decomposition; problem may be too ill-conditioned (very rare) S - spline interpolant. Rep - Following fields are set: * RMSError rms error on the (X,Y). * AvgError average error on the (X,Y). * AvgRelError average relative error on the non-zero Y * MaxError maximum error NON-WEIGHTED ERRORS ARE CALCULATED IMPORTANT: this subroitine doesn't calculate task's condition number for K<>0. NOTE 1: additional nodes are added to the spline outside of the fitting interval to force linearity when xmax(x,xc). It is done for consistency - we penalize non-linearity at [min(x,xc),max(x,xc)], so it is natural to force linearity outside of this interval. NOTE 2: function automatically sorts points, so caller may pass unsorted array. 
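USAGE SKETCH. A minimal sketch covering the two "lightweight" fitting functions documented above: the Floater-Hormann rational fit BarycentricFitFloaterHormann() and the penalized regression spline Spline1DFitPenalized() (whose implementation follows below). It assumes the usual C++ wrapper signatures in the alglib namespace; it is not taken from the ALGLIB sources and the data values are made up for illustration.

#include <stdio.h>
#include "interpolation.h"
using namespace alglib;

int main()
{
    // Noisy samples of a smooth bump on [0,3]; the values are illustrative.
    real_1d_array x = "[0.0, 0.3, 0.6, 0.9, 1.2, 1.5, 1.8, 2.1, 2.4, 2.7, 3.0]";
    real_1d_array y = "[0.02, 0.31, 0.55, 0.80, 0.94, 1.02, 0.97, 0.85, 0.69, 0.40, 0.12]";
    ae_int_t info;

    // Floater-Hormann rational fit with M=5 equidistant nodes; the optimal D
    // from [0,9] is chosen automatically and reported in brep.dbest.
    barycentricinterpolant bi;
    barycentricfitreport brep;
    barycentricfitfloaterhormann(x, y, 11, 5, info, bi, brep);
    printf("FH fit:     info=%d  D=%d  rms=%0.4f  f(1.0)=%0.4f\n",
           (int)info, (int)brep.dbest, brep.rmserror, barycentriccalc(bi, 1.0));

    // Penalized regression spline with M=8 nodes and moderate smoothing (Rho=2).
    spline1dinterpolant s;
    spline1dfitreport srep;
    spline1dfitpenalized(x, y, 11, 8, 2.0, info, s, srep);
    printf("Penalized:  info=%d  rms=%0.4f  s(1.0)=%0.4f\n",
           (int)info, srep.rmserror, spline1dcalc(s, 1.0));
    return 0;
}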
-- ALGLIB PROJECT -- Copyright 18.08.2009 by Bochkanov Sergey *************************************************************************/ void spline1dfitpenalized(/* Real */ ae_vector* x, /* Real */ ae_vector* y, ae_int_t n, ae_int_t m, double rho, ae_int_t* info, spline1dinterpolant* s, spline1dfitreport* rep, ae_state *_state) { ae_frame _frame_block; ae_vector _x; ae_vector _y; ae_vector w; ae_int_t i; ae_frame_make(_state, &_frame_block); ae_vector_init_copy(&_x, x, _state); x = &_x; ae_vector_init_copy(&_y, y, _state); y = &_y; *info = 0; _spline1dinterpolant_clear(s); _spline1dfitreport_clear(rep); ae_vector_init(&w, 0, DT_REAL, _state); ae_assert(n>=1, "Spline1DFitPenalized: N<1!", _state); ae_assert(m>=4, "Spline1DFitPenalized: M<4!", _state); ae_assert(x->cnt>=n, "Spline1DFitPenalized: Length(X)cnt>=n, "Spline1DFitPenalized: Length(Y)0 * if given, only first N elements of X/Y/W are processed * if not given, automatically determined from X/Y/W sizes M - number of basis functions ( = number_of_nodes), M>=4. Rho - regularization constant passed by user. It penalizes nonlinearity in the regression spline. It is logarithmically scaled, i.e. actual value of regularization constant is calculated as 10^Rho. It is automatically scaled so that: * Rho=2.0 corresponds to moderate amount of nonlinearity * generally, it should be somewhere in the [-8.0,+8.0] If you do not want to penalize nonlineary, pass small Rho. Values as low as -15 should work. OUTPUT PARAMETERS: Info- same format as in LSFitLinearWC() subroutine. * Info>0 task is solved * Info<=0 an error occured: -4 means inconvergence of internal SVD or Cholesky decomposition; problem may be too ill-conditioned (very rare) S - spline interpolant. Rep - Following fields are set: * RMSError rms error on the (X,Y). * AvgError average error on the (X,Y). * AvgRelError average relative error on the non-zero Y * MaxError maximum error NON-WEIGHTED ERRORS ARE CALCULATED IMPORTANT: this subroitine doesn't calculate task's condition number for K<>0. NOTE 1: additional nodes are added to the spline outside of the fitting interval to force linearity when xmax(x,xc). It is done for consistency - we penalize non-linearity at [min(x,xc),max(x,xc)], so it is natural to force linearity outside of this interval. NOTE 2: function automatically sorts points, so caller may pass unsorted array. 
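USAGE SKETCH. A minimal sketch of the weighted variant Spline1DFitPenalizedW() documented above, assuming the usual C++ wrapper signatures in the alglib namespace; it is not taken from the ALGLIB sources and the data and weights are made up for illustration.

#include <stdio.h>
#include "interpolation.h"
using namespace alglib;

int main()
{
    real_1d_array x = "[0.0, 0.3, 0.6, 0.9, 1.2, 1.5, 1.8, 2.1, 2.4, 2.7, 3.0]";
    real_1d_array y = "[0.02, 0.31, 0.55, 0.80, 0.94, 1.02, 0.97, 0.85, 0.69, 0.40, 0.12]";
    // Down-weight the two endpoints, e.g. because they are known to be noisier.
    real_1d_array w = "[0.2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0.2]";
    ae_int_t info;
    spline1dinterpolant s;
    spline1dfitreport rep;

    // Weighted penalized fit: M=8 basis functions, Rho=3.0 (stronger smoothing).
    // Note that the errors in rep are always reported in non-weighted form.
    spline1dfitpenalizedw(x, y, w, 11, 8, 3.0, info, s, rep);
    if( info>0 )
        printf("rms=%0.4f  max=%0.4f  s(1.5)=%0.4f\n",
               rep.rmserror, rep.maxerror, spline1dcalc(s, 1.5));
    return 0;
}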
-- ALGLIB PROJECT -- Copyright 19.10.2010 by Bochkanov Sergey *************************************************************************/ void spline1dfitpenalizedw(/* Real */ ae_vector* x, /* Real */ ae_vector* y, /* Real */ ae_vector* w, ae_int_t n, ae_int_t m, double rho, ae_int_t* info, spline1dinterpolant* s, spline1dfitreport* rep, ae_state *_state) { ae_frame _frame_block; ae_vector _x; ae_vector _y; ae_vector _w; ae_int_t i; ae_int_t j; ae_int_t b; double v; double relcnt; double xa; double xb; double sa; double sb; ae_vector xoriginal; ae_vector yoriginal; double pdecay; double tdecay; ae_matrix fmatrix; ae_vector fcolumn; ae_vector y2; ae_vector w2; ae_vector xc; ae_vector yc; ae_vector dc; double fdmax; double admax; ae_matrix amatrix; ae_matrix d2matrix; double fa; double ga; double fb; double gb; double lambdav; ae_vector bx; ae_vector by; ae_vector bd1; ae_vector bd2; ae_vector tx; ae_vector ty; ae_vector td; spline1dinterpolant bs; ae_matrix nmatrix; ae_vector rightpart; fblslincgstate cgstate; ae_vector c; ae_vector tmp0; ae_frame_make(_state, &_frame_block); ae_vector_init_copy(&_x, x, _state); x = &_x; ae_vector_init_copy(&_y, y, _state); y = &_y; ae_vector_init_copy(&_w, w, _state); w = &_w; *info = 0; _spline1dinterpolant_clear(s); _spline1dfitreport_clear(rep); ae_vector_init(&xoriginal, 0, DT_REAL, _state); ae_vector_init(&yoriginal, 0, DT_REAL, _state); ae_matrix_init(&fmatrix, 0, 0, DT_REAL, _state); ae_vector_init(&fcolumn, 0, DT_REAL, _state); ae_vector_init(&y2, 0, DT_REAL, _state); ae_vector_init(&w2, 0, DT_REAL, _state); ae_vector_init(&xc, 0, DT_REAL, _state); ae_vector_init(&yc, 0, DT_REAL, _state); ae_vector_init(&dc, 0, DT_INT, _state); ae_matrix_init(&amatrix, 0, 0, DT_REAL, _state); ae_matrix_init(&d2matrix, 0, 0, DT_REAL, _state); ae_vector_init(&bx, 0, DT_REAL, _state); ae_vector_init(&by, 0, DT_REAL, _state); ae_vector_init(&bd1, 0, DT_REAL, _state); ae_vector_init(&bd2, 0, DT_REAL, _state); ae_vector_init(&tx, 0, DT_REAL, _state); ae_vector_init(&ty, 0, DT_REAL, _state); ae_vector_init(&td, 0, DT_REAL, _state); _spline1dinterpolant_init(&bs, _state); ae_matrix_init(&nmatrix, 0, 0, DT_REAL, _state); ae_vector_init(&rightpart, 0, DT_REAL, _state); _fblslincgstate_init(&cgstate, _state); ae_vector_init(&c, 0, DT_REAL, _state); ae_vector_init(&tmp0, 0, DT_REAL, _state); ae_assert(n>=1, "Spline1DFitPenalizedW: N<1!", _state); ae_assert(m>=4, "Spline1DFitPenalizedW: M<4!", _state); ae_assert(x->cnt>=n, "Spline1DFitPenalizedW: Length(X)cnt>=n, "Spline1DFitPenalizedW: Length(Y)cnt>=n, "Spline1DFitPenalizedW: Length(W)ptr.p_double[i]*fcolumn.ptr.p_double[i], _state); } fdmax = ae_maxreal(fdmax, v, _state); /* * Fill temporary with second derivatives of basis function */ ae_v_move(&d2matrix.ptr.pp_double[b][0], 1, &bd2.ptr.p_double[0], 1, ae_v_len(0,m-1)); } /* * * calculate penalty matrix A * * calculate max of diagonal elements of A * * calculate PDecay - coefficient before penalty matrix */ for(i=0; i<=m-1; i++) { for(j=i; j<=m-1; j++) { /* * calculate integral(B_i''*B_j'') where B_i and B_j are * i-th and j-th basis splines. * B_i and B_j are piecewise linear functions. 
*/ v = (double)(0); for(b=0; b<=m-2; b++) { fa = d2matrix.ptr.pp_double[i][b]; fb = d2matrix.ptr.pp_double[i][b+1]; ga = d2matrix.ptr.pp_double[j][b]; gb = d2matrix.ptr.pp_double[j][b+1]; v = v+(bx.ptr.p_double[b+1]-bx.ptr.p_double[b])*(fa*ga+(fa*(gb-ga)+ga*(fb-fa))/2+(fb-fa)*(gb-ga)/3); } amatrix.ptr.pp_double[i][j] = v; amatrix.ptr.pp_double[j][i] = v; } } admax = (double)(0); for(i=0; i<=m-1; i++) { admax = ae_maxreal(admax, ae_fabs(amatrix.ptr.pp_double[i][i], _state), _state); } pdecay = lambdav*fdmax/admax; /* * Calculate TDecay for Tikhonov regularization */ tdecay = fdmax*(1+pdecay)*10*ae_machineepsilon; /* * Prepare system * * NOTE: FMatrix is spoiled during this process */ for(i=0; i<=n-1; i++) { v = w->ptr.p_double[i]; ae_v_muld(&fmatrix.ptr.pp_double[i][0], 1, ae_v_len(0,m-1), v); } rmatrixgemm(m, m, n, 1.0, &fmatrix, 0, 0, 1, &fmatrix, 0, 0, 0, 0.0, &nmatrix, 0, 0, _state); for(i=0; i<=m-1; i++) { for(j=0; j<=m-1; j++) { nmatrix.ptr.pp_double[i][j] = nmatrix.ptr.pp_double[i][j]+pdecay*amatrix.ptr.pp_double[i][j]; } } for(i=0; i<=m-1; i++) { nmatrix.ptr.pp_double[i][i] = nmatrix.ptr.pp_double[i][i]+tdecay; } for(i=0; i<=m-1; i++) { rightpart.ptr.p_double[i] = (double)(0); } for(i=0; i<=n-1; i++) { v = y->ptr.p_double[i]*w->ptr.p_double[i]; ae_v_addd(&rightpart.ptr.p_double[0], 1, &fmatrix.ptr.pp_double[i][0], 1, ae_v_len(0,m-1), v); } /* * Solve system */ if( !spdmatrixcholesky(&nmatrix, m, ae_true, _state) ) { *info = -4; ae_frame_leave(_state); return; } fblscholeskysolve(&nmatrix, 1.0, m, ae_true, &rightpart, &tmp0, _state); ae_v_move(&c.ptr.p_double[0], 1, &rightpart.ptr.p_double[0], 1, ae_v_len(0,m-1)); /* * add nodes to force linearity outside of the fitting interval */ spline1dgriddiffcubic(&bx, &c, m, 2, 0.0, 2, 0.0, &bd1, _state); ae_vector_set_length(&tx, m+2, _state); ae_vector_set_length(&ty, m+2, _state); ae_vector_set_length(&td, m+2, _state); ae_v_move(&tx.ptr.p_double[1], 1, &bx.ptr.p_double[0], 1, ae_v_len(1,m)); ae_v_move(&ty.ptr.p_double[1], 1, &rightpart.ptr.p_double[0], 1, ae_v_len(1,m)); ae_v_move(&td.ptr.p_double[1], 1, &bd1.ptr.p_double[0], 1, ae_v_len(1,m)); tx.ptr.p_double[0] = tx.ptr.p_double[1]-(tx.ptr.p_double[2]-tx.ptr.p_double[1]); ty.ptr.p_double[0] = ty.ptr.p_double[1]-td.ptr.p_double[1]*(tx.ptr.p_double[2]-tx.ptr.p_double[1]); td.ptr.p_double[0] = td.ptr.p_double[1]; tx.ptr.p_double[m+1] = tx.ptr.p_double[m]+(tx.ptr.p_double[m]-tx.ptr.p_double[m-1]); ty.ptr.p_double[m+1] = ty.ptr.p_double[m]+td.ptr.p_double[m]*(tx.ptr.p_double[m]-tx.ptr.p_double[m-1]); td.ptr.p_double[m+1] = td.ptr.p_double[m]; spline1dbuildhermite(&tx, &ty, &td, m+2, s, _state); spline1dlintransx(s, 2/(xb-xa), -(xa+xb)/(xb-xa), _state); spline1dlintransy(s, sb-sa, sa, _state); *info = 1; /* * Fill report */ rep->rmserror = (double)(0); rep->avgerror = (double)(0); rep->avgrelerror = (double)(0); rep->maxerror = (double)(0); relcnt = (double)(0); spline1dconvcubic(&bx, &rightpart, m, 2, 0.0, 2, 0.0, x, n, &fcolumn, _state); for(i=0; i<=n-1; i++) { v = (sb-sa)*fcolumn.ptr.p_double[i]+sa; rep->rmserror = rep->rmserror+ae_sqr(v-yoriginal.ptr.p_double[i], _state); rep->avgerror = rep->avgerror+ae_fabs(v-yoriginal.ptr.p_double[i], _state); if( ae_fp_neq(yoriginal.ptr.p_double[i],(double)(0)) ) { rep->avgrelerror = rep->avgrelerror+ae_fabs(v-yoriginal.ptr.p_double[i], _state)/ae_fabs(yoriginal.ptr.p_double[i], _state); relcnt = relcnt+1; } rep->maxerror = ae_maxreal(rep->maxerror, ae_fabs(v-yoriginal.ptr.p_double[i], _state), _state); } rep->rmserror = ae_sqrt(rep->rmserror/n, 
_state); rep->avgerror = rep->avgerror/n; if( ae_fp_neq(relcnt,(double)(0)) ) { rep->avgrelerror = rep->avgrelerror/relcnt; } ae_frame_leave(_state); } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. *************************************************************************/ void _pexec_spline1dfitpenalizedw(/* Real */ ae_vector* x, /* Real */ ae_vector* y, /* Real */ ae_vector* w, ae_int_t n, ae_int_t m, double rho, ae_int_t* info, spline1dinterpolant* s, spline1dfitreport* rep, ae_state *_state) { spline1dfitpenalizedw(x,y,w,n,m,rho,info,s,rep, _state); } /************************************************************************* Weighted fitting by cubic spline, with constraints on function values or derivatives. Equidistant grid with M-2 nodes on [min(x,xc),max(x,xc)] is used to build basis functions. Basis functions are cubic splines with continuous second derivatives and non-fixed first derivatives at interval ends. Small regularizing term is used when solving constrained tasks (to improve stability). Task is linear, so linear least squares solver is used. Complexity of this computational scheme is O(N*M^2), mostly dominated by least squares solver SEE ALSO Spline1DFitHermiteWC() - fitting by Hermite splines (more flexible, less smooth) Spline1DFitCubic() - "lightweight" fitting by cubic splines, without invididual weights and constraints COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multithreading support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Speed-up provided by multithreading greatly depends on problem size ! - only large problems (number of coefficients is more than 500) can be ! efficiently multithreaded. ! ! Generally, commercial ALGLIB is several times faster than open-source ! generic C edition, and many times faster than open-source C# edition. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS: X - points, array[0..N-1]. Y - function values, array[0..N-1]. W - weights, array[0..N-1] Each summand in square sum of approximation deviations from given values is multiplied by the square of corresponding weight. Fill it by 1's if you don't want to solve weighted task. N - number of points (optional): * N>0 * if given, only first N elements of X/Y/W are processed * if not given, automatically determined from X/Y/W sizes XC - points where spline values/derivatives are constrained, array[0..K-1]. YC - values of constraints, array[0..K-1] DC - array[0..K-1], types of constraints: * DC[i]=0 means that S(XC[i])=YC[i] * DC[i]=1 means that S'(XC[i])=YC[i] SEE BELOW FOR IMPORTANT INFORMATION ON CONSTRAINTS K - number of constraints (optional): * 0<=K=4. OUTPUT PARAMETERS: Info- same format as in LSFitLinearWC() subroutine. * Info>0 task is solved * Info<=0 an error occured: -4 means inconvergence of internal SVD -3 means inconsistent constraints S - spline interpolant. Rep - report, same format as in LSFitLinearWC() subroutine. 
Following fields are set: * RMSError rms error on the (X,Y). * AvgError average error on the (X,Y). * AvgRelError average relative error on the non-zero Y * MaxError maximum error NON-WEIGHTED ERRORS ARE CALCULATED IMPORTANT: this subroitine doesn't calculate task's condition number for K<>0. ORDER OF POINTS Subroutine automatically sorts points, so caller may pass unsorted array. SETTING CONSTRAINTS - DANGERS AND OPPORTUNITIES: Setting constraints can lead to undesired results, like ill-conditioned behavior, or inconsistency being detected. From the other side, it allows us to improve quality of the fit. Here we summarize our experience with constrained regression splines: * excessive constraints can be inconsistent. Splines are piecewise cubic functions, and it is easy to create an example, where large number of constraints concentrated in small area will result in inconsistency. Just because spline is not flexible enough to satisfy all of them. And same constraints spread across the [min(x),max(x)] will be perfectly consistent. * the more evenly constraints are spread across [min(x),max(x)], the more chances that they will be consistent * the greater is M (given fixed constraints), the more chances that constraints will be consistent * in the general case, consistency of constraints IS NOT GUARANTEED. * in the several special cases, however, we CAN guarantee consistency. * one of this cases is constraints on the function values AND/OR its derivatives at the interval boundaries. * another special case is ONE constraint on the function value (OR, but not AND, derivative) anywhere in the interval Our final recommendation is to use constraints WHEN AND ONLY WHEN you can't solve your task without them. Anything beyond special cases given above is not guaranteed and may result in inconsistency. -- ALGLIB PROJECT -- Copyright 18.08.2009 by Bochkanov Sergey *************************************************************************/ void spline1dfitcubicwc(/* Real */ ae_vector* x, /* Real */ ae_vector* y, /* Real */ ae_vector* w, ae_int_t n, /* Real */ ae_vector* xc, /* Real */ ae_vector* yc, /* Integer */ ae_vector* dc, ae_int_t k, ae_int_t m, ae_int_t* info, spline1dinterpolant* s, spline1dfitreport* rep, ae_state *_state) { ae_int_t i; *info = 0; _spline1dinterpolant_clear(s); _spline1dfitreport_clear(rep); ae_assert(n>=1, "Spline1DFitCubicWC: N<1!", _state); ae_assert(m>=4, "Spline1DFitCubicWC: M<4!", _state); ae_assert(k>=0, "Spline1DFitCubicWC: K<0!", _state); ae_assert(k=M!", _state); ae_assert(x->cnt>=n, "Spline1DFitCubicWC: Length(X)cnt>=n, "Spline1DFitCubicWC: Length(Y)cnt>=n, "Spline1DFitCubicWC: Length(W)cnt>=k, "Spline1DFitCubicWC: Length(XC)cnt>=k, "Spline1DFitCubicWC: Length(YC)cnt>=k, "Spline1DFitCubicWC: Length(DC)ptr.p_int[i]==0||dc->ptr.p_int[i]==1, "Spline1DFitCubicWC: DC[i] is neither 0 or 1!", _state); } lsfit_spline1dfitinternal(0, x, y, w, n, xc, yc, dc, k, m, info, s, rep, _state); } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. 
*************************************************************************/ void _pexec_spline1dfitcubicwc(/* Real */ ae_vector* x, /* Real */ ae_vector* y, /* Real */ ae_vector* w, ae_int_t n, /* Real */ ae_vector* xc, /* Real */ ae_vector* yc, /* Integer */ ae_vector* dc, ae_int_t k, ae_int_t m, ae_int_t* info, spline1dinterpolant* s, spline1dfitreport* rep, ae_state *_state) { spline1dfitcubicwc(x,y,w,n,xc,yc,dc,k,m,info,s,rep, _state); } /************************************************************************* Weighted fitting by Hermite spline, with constraints on function values or first derivatives. Equidistant grid with M nodes on [min(x,xc),max(x,xc)] is used to build basis functions. Basis functions are Hermite splines. Small regularizing term is used when solving constrained tasks (to improve stability). Task is linear, so linear least squares solver is used. Complexity of this computational scheme is O(N*M^2), mostly dominated by least squares solver SEE ALSO Spline1DFitCubicWC() - fitting by Cubic splines (less flexible, more smooth) Spline1DFitHermite() - "lightweight" Hermite fitting, without invididual weights and constraints COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multithreading support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Speed-up provided by multithreading greatly depends on problem size ! - only large problems (number of coefficients is more than 500) can be ! efficiently multithreaded. ! ! Generally, commercial ALGLIB is several times faster than open-source ! generic C edition, and many times faster than open-source C# edition. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS: X - points, array[0..N-1]. Y - function values, array[0..N-1]. W - weights, array[0..N-1] Each summand in square sum of approximation deviations from given values is multiplied by the square of corresponding weight. Fill it by 1's if you don't want to solve weighted task. N - number of points (optional): * N>0 * if given, only first N elements of X/Y/W are processed * if not given, automatically determined from X/Y/W sizes XC - points where spline values/derivatives are constrained, array[0..K-1]. YC - values of constraints, array[0..K-1] DC - array[0..K-1], types of constraints: * DC[i]=0 means that S(XC[i])=YC[i] * DC[i]=1 means that S'(XC[i])=YC[i] SEE BELOW FOR IMPORTANT INFORMATION ON CONSTRAINTS K - number of constraints (optional): * 0<=K=4, M IS EVEN! OUTPUT PARAMETERS: Info- same format as in LSFitLinearW() subroutine: * Info>0 task is solved * Info<=0 an error occured: -4 means inconvergence of internal SVD -3 means inconsistent constraints -2 means odd M was passed (which is not supported) -1 means another errors in parameters passed (N<=0, for example) S - spline interpolant. Rep - report, same format as in LSFitLinearW() subroutine. Following fields are set: * RMSError rms error on the (X,Y). * AvgError average error on the (X,Y). 
* AvgRelError average relative error on the non-zero Y * MaxError maximum error NON-WEIGHTED ERRORS ARE CALCULATED IMPORTANT: this subroitine doesn't calculate task's condition number for K<>0. IMPORTANT: this subroitine supports only even M's ORDER OF POINTS Subroutine automatically sorts points, so caller may pass unsorted array. SETTING CONSTRAINTS - DANGERS AND OPPORTUNITIES: Setting constraints can lead to undesired results, like ill-conditioned behavior, or inconsistency being detected. From the other side, it allows us to improve quality of the fit. Here we summarize our experience with constrained regression splines: * excessive constraints can be inconsistent. Splines are piecewise cubic functions, and it is easy to create an example, where large number of constraints concentrated in small area will result in inconsistency. Just because spline is not flexible enough to satisfy all of them. And same constraints spread across the [min(x),max(x)] will be perfectly consistent. * the more evenly constraints are spread across [min(x),max(x)], the more chances that they will be consistent * the greater is M (given fixed constraints), the more chances that constraints will be consistent * in the general case, consistency of constraints is NOT GUARANTEED. * in the several special cases, however, we can guarantee consistency. * one of this cases is M>=4 and constraints on the function value (AND/OR its derivative) at the interval boundaries. * another special case is M>=4 and ONE constraint on the function value (OR, BUT NOT AND, derivative) anywhere in [min(x),max(x)] Our final recommendation is to use constraints WHEN AND ONLY when you can't solve your task without them. Anything beyond special cases given above is not guaranteed and may result in inconsistency. -- ALGLIB PROJECT -- Copyright 18.08.2009 by Bochkanov Sergey *************************************************************************/ void spline1dfithermitewc(/* Real */ ae_vector* x, /* Real */ ae_vector* y, /* Real */ ae_vector* w, ae_int_t n, /* Real */ ae_vector* xc, /* Real */ ae_vector* yc, /* Integer */ ae_vector* dc, ae_int_t k, ae_int_t m, ae_int_t* info, spline1dinterpolant* s, spline1dfitreport* rep, ae_state *_state) { ae_int_t i; *info = 0; _spline1dinterpolant_clear(s); _spline1dfitreport_clear(rep); ae_assert(n>=1, "Spline1DFitHermiteWC: N<1!", _state); ae_assert(m>=4, "Spline1DFitHermiteWC: M<4!", _state); ae_assert(m%2==0, "Spline1DFitHermiteWC: M is odd!", _state); ae_assert(k>=0, "Spline1DFitHermiteWC: K<0!", _state); ae_assert(k=M!", _state); ae_assert(x->cnt>=n, "Spline1DFitHermiteWC: Length(X)cnt>=n, "Spline1DFitHermiteWC: Length(Y)cnt>=n, "Spline1DFitHermiteWC: Length(W)cnt>=k, "Spline1DFitHermiteWC: Length(XC)cnt>=k, "Spline1DFitHermiteWC: Length(YC)cnt>=k, "Spline1DFitHermiteWC: Length(DC)ptr.p_int[i]==0||dc->ptr.p_int[i]==1, "Spline1DFitHermiteWC: DC[i] is neither 0 or 1!", _state); } lsfit_spline1dfitinternal(1, x, y, w, n, xc, yc, dc, k, m, info, s, rep, _state); } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. 
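/*************************************************************************
Illustrative usage sketch (a minimal example, not part of the library
source): Spline1DFitHermiteWC() is called exactly like the cubic variant,
the only additional requirement being an even M (odd M yields Info=-2).
The alglib:: wrapper is assumed to mirror the C parameter list documented
above; all data values are made up for illustration.
*************************************************************************/
#include <cstdio>
#include "interpolation.h"
using namespace alglib;

int example_spline1dfithermitewc()
{
    real_1d_array x = "[0.0, 0.2, 0.4, 0.6, 0.8, 1.0]";
    real_1d_array y = "[0.10, 0.25, 0.38, 0.58, 0.82, 1.05]";
    real_1d_array w = "[1, 1, 1, 1, 1, 1]";
    real_1d_array    xc = "[0.0]";
    real_1d_array    yc = "[0.5]";
    integer_1d_array dc = "[1]";           // DC[0]=1 requests S'(0.0)=0.5

    ae_int_t n = 6, k = 1, m = 8;          // M must be even for Hermite fitting
    ae_int_t info;
    spline1dinterpolant s;
    spline1dfitreport rep;

    spline1dfithermitewc(x, y, w, n, xc, yc, dc, k, m, info, s, rep);
    if( info>0 )
        printf("max error = %.6f\n", rep.maxerror);
    return int(info);
}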
*************************************************************************/ void _pexec_spline1dfithermitewc(/* Real */ ae_vector* x, /* Real */ ae_vector* y, /* Real */ ae_vector* w, ae_int_t n, /* Real */ ae_vector* xc, /* Real */ ae_vector* yc, /* Integer */ ae_vector* dc, ae_int_t k, ae_int_t m, ae_int_t* info, spline1dinterpolant* s, spline1dfitreport* rep, ae_state *_state) { spline1dfithermitewc(x,y,w,n,xc,yc,dc,k,m,info,s,rep, _state); } /************************************************************************* Least squares fitting by cubic spline. This subroutine is "lightweight" alternative for more complex and feature- rich Spline1DFitCubicWC(). See Spline1DFitCubicWC() for more information about subroutine parameters (we don't duplicate it here because of length) COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multithreading support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Speed-up provided by multithreading greatly depends on problem size ! - only large problems (number of coefficients is more than 500) can be ! efficiently multithreaded. ! ! Generally, commercial ALGLIB is several times faster than open-source ! generic C edition, and many times faster than open-source C# edition. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. -- ALGLIB PROJECT -- Copyright 18.08.2009 by Bochkanov Sergey *************************************************************************/ void spline1dfitcubic(/* Real */ ae_vector* x, /* Real */ ae_vector* y, ae_int_t n, ae_int_t m, ae_int_t* info, spline1dinterpolant* s, spline1dfitreport* rep, ae_state *_state) { ae_frame _frame_block; ae_int_t i; ae_vector w; ae_vector xc; ae_vector yc; ae_vector dc; ae_frame_make(_state, &_frame_block); *info = 0; _spline1dinterpolant_clear(s); _spline1dfitreport_clear(rep); ae_vector_init(&w, 0, DT_REAL, _state); ae_vector_init(&xc, 0, DT_REAL, _state); ae_vector_init(&yc, 0, DT_REAL, _state); ae_vector_init(&dc, 0, DT_INT, _state); ae_assert(n>=1, "Spline1DFitCubic: N<1!", _state); ae_assert(m>=4, "Spline1DFitCubic: M<4!", _state); ae_assert(x->cnt>=n, "Spline1DFitCubic: Length(X)cnt>=n, "Spline1DFitCubic: Length(Y)=1, "Spline1DFitHermite: N<1!", _state); ae_assert(m>=4, "Spline1DFitHermite: M<4!", _state); ae_assert(m%2==0, "Spline1DFitHermite: M is odd!", _state); ae_assert(x->cnt>=n, "Spline1DFitHermite: Length(X)cnt>=n, "Spline1DFitHermite: Length(Y)=1. M - number of basis functions, M>=1. OUTPUT PARAMETERS: Info - error code: * -4 internal SVD decomposition subroutine failed (very rare and for degenerate systems only) * -1 incorrect N/M were specified * 1 task is solved C - decomposition coefficients, array[0..M-1] Rep - fitting report. Following fields are set: * Rep.TaskRCond reciprocal of condition number * R2 non-adjusted coefficient of determination (non-weighted) * RMSError rms error on the (X,Y). * AvgError average error on the (X,Y). 
* AvgRelError average relative error on the non-zero Y * MaxError maximum error NON-WEIGHTED ERRORS ARE CALCULATED ERRORS IN PARAMETERS This solver also calculates different kinds of errors in parameters and fills corresponding fields of report: * Rep.CovPar covariance matrix for parameters, array[K,K]. * Rep.ErrPar errors in parameters, array[K], errpar = sqrt(diag(CovPar)) * Rep.ErrCurve vector of fit errors - standard deviations of empirical best-fit curve from "ideal" best-fit curve built with infinite number of samples, array[N]. errcurve = sqrt(diag(F*CovPar*F')), where F is functions matrix. * Rep.Noise vector of per-point estimates of noise, array[N] NOTE: noise in the data is estimated as follows: * for fitting without user-supplied weights all points are assumed to have same level of noise, which is estimated from the data * for fitting with user-supplied weights we assume that noise level in I-th point is inversely proportional to Ith weight. Coefficient of proportionality is estimated from the data. NOTE: we apply small amount of regularization when we invert squared Jacobian and calculate covariance matrix. It guarantees that algorithm won't divide by zero during inversion, but skews error estimates a bit (fractional error is about 10^-9). However, we believe that this difference is insignificant for all practical purposes except for the situation when you want to compare ALGLIB results with "reference" implementation up to the last significant digit. NOTE: covariance matrix is estimated using correction for degrees of freedom (covariances are divided by N-M instead of dividing by N). -- ALGLIB -- Copyright 17.08.2009 by Bochkanov Sergey *************************************************************************/ void lsfitlinearw(/* Real */ ae_vector* y, /* Real */ ae_vector* w, /* Real */ ae_matrix* fmatrix, ae_int_t n, ae_int_t m, ae_int_t* info, /* Real */ ae_vector* c, lsfitreport* rep, ae_state *_state) { *info = 0; ae_vector_clear(c); _lsfitreport_clear(rep); ae_assert(n>=1, "LSFitLinearW: N<1!", _state); ae_assert(m>=1, "LSFitLinearW: M<1!", _state); ae_assert(y->cnt>=n, "LSFitLinearW: length(Y)cnt>=n, "LSFitLinearW: length(W)rows>=n, "LSFitLinearW: rows(FMatrix)cols>=m, "LSFitLinearW: cols(FMatrix)=1. M - number of basis functions, M>=1. K - number of constraints, 0 <= K < M K=0 corresponds to absence of constraints. OUTPUT PARAMETERS: Info - error code: * -4 internal SVD decomposition subroutine failed (very rare and for degenerate systems only) * -3 either too many constraints (M or more), degenerate constraints (some constraints are repetead twice) or inconsistent constraints were specified. * 1 task is solved C - decomposition coefficients, array[0..M-1] Rep - fitting report. Following fields are set: * R2 non-adjusted coefficient of determination (non-weighted) * RMSError rms error on the (X,Y). * AvgError average error on the (X,Y). * AvgRelError average relative error on the non-zero Y * MaxError maximum error NON-WEIGHTED ERRORS ARE CALCULATED IMPORTANT: this subroitine doesn't calculate task's condition number for K<>0. ERRORS IN PARAMETERS This solver also calculates different kinds of errors in parameters and fills corresponding fields of report: * Rep.CovPar covariance matrix for parameters, array[K,K]. * Rep.ErrPar errors in parameters, array[K], errpar = sqrt(diag(CovPar)) * Rep.ErrCurve vector of fit errors - standard deviations of empirical best-fit curve from "ideal" best-fit curve built with infinite number of samples, array[N]. 
errcurve = sqrt(diag(F*CovPar*F')), where F is functions matrix. * Rep.Noise vector of per-point estimates of noise, array[N] IMPORTANT: errors in parameters are calculated without taking into account boundary/linear constraints! Presence of constraints changes distribution of errors, but there is no easy way to account for constraints when you calculate covariance matrix. NOTE: noise in the data is estimated as follows: * for fitting without user-supplied weights all points are assumed to have same level of noise, which is estimated from the data * for fitting with user-supplied weights we assume that noise level in I-th point is inversely proportional to Ith weight. Coefficient of proportionality is estimated from the data. NOTE: we apply small amount of regularization when we invert squared Jacobian and calculate covariance matrix. It guarantees that algorithm won't divide by zero during inversion, but skews error estimates a bit (fractional error is about 10^-9). However, we believe that this difference is insignificant for all practical purposes except for the situation when you want to compare ALGLIB results with "reference" implementation up to the last significant digit. NOTE: covariance matrix is estimated using correction for degrees of freedom (covariances are divided by N-M instead of dividing by N). -- ALGLIB -- Copyright 07.09.2009 by Bochkanov Sergey *************************************************************************/ void lsfitlinearwc(/* Real */ ae_vector* y, /* Real */ ae_vector* w, /* Real */ ae_matrix* fmatrix, /* Real */ ae_matrix* cmatrix, ae_int_t n, ae_int_t m, ae_int_t k, ae_int_t* info, /* Real */ ae_vector* c, lsfitreport* rep, ae_state *_state) { ae_frame _frame_block; ae_vector _y; ae_matrix _cmatrix; ae_int_t i; ae_int_t j; ae_vector tau; ae_matrix q; ae_matrix f2; ae_vector tmp; ae_vector c0; double v; ae_frame_make(_state, &_frame_block); ae_vector_init_copy(&_y, y, _state); y = &_y; ae_matrix_init_copy(&_cmatrix, cmatrix, _state); cmatrix = &_cmatrix; *info = 0; ae_vector_clear(c); _lsfitreport_clear(rep); ae_vector_init(&tau, 0, DT_REAL, _state); ae_matrix_init(&q, 0, 0, DT_REAL, _state); ae_matrix_init(&f2, 0, 0, DT_REAL, _state); ae_vector_init(&tmp, 0, DT_REAL, _state); ae_vector_init(&c0, 0, DT_REAL, _state); ae_assert(n>=1, "LSFitLinearWC: N<1!", _state); ae_assert(m>=1, "LSFitLinearWC: M<1!", _state); ae_assert(k>=0, "LSFitLinearWC: K<0!", _state); ae_assert(y->cnt>=n, "LSFitLinearWC: length(Y)cnt>=n, "LSFitLinearWC: length(W)rows>=n, "LSFitLinearWC: rows(FMatrix)cols>=m, "LSFitLinearWC: cols(FMatrix)rows>=k, "LSFitLinearWC: rows(CMatrix)cols>=m+1||k==0, "LSFitLinearWC: cols(CMatrix)=m ) { *info = -3; ae_frame_leave(_state); return; } /* * Solve */ if( k==0 ) { /* * no constraints */ lsfit_lsfitlinearinternal(y, w, fmatrix, n, m, info, c, rep, _state); } else { /* * First, find general form solution of constraints system: * * factorize C = L*Q * * unpack Q * * fill upper part of C with zeros (for RCond) * * We got C=C0+Q2'*y where Q2 is lower M-K rows of Q. 
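In other words: the constraint matrix is factored as C = L*Q, with Q1 denoting the first K rows of Q and Q2 the remaining M-K rows. The particular solution C0 = Q1'*inv(L)*b satisfies the constraints exactly, and because C*Q2' = 0, every vector of the form C0 + Q2'*y0 satisfies them as well. The code below therefore shifts the right-hand side by F*C0, solves an unconstrained least squares problem with the reduced design matrix F2 = F*Q2', and finally maps the reduced solution y0 back to the original coefficients.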
*/ rmatrixlq(cmatrix, k, m, &tau, _state); rmatrixlqunpackq(cmatrix, k, m, &tau, m, &q, _state); for(i=0; i<=k-1; i++) { for(j=i+1; j<=m-1; j++) { cmatrix->ptr.pp_double[i][j] = 0.0; } } if( ae_fp_less(rmatrixlurcondinf(cmatrix, k, _state),1000*ae_machineepsilon) ) { *info = -3; ae_frame_leave(_state); return; } ae_vector_set_length(&tmp, k, _state); for(i=0; i<=k-1; i++) { if( i>0 ) { v = ae_v_dotproduct(&cmatrix->ptr.pp_double[i][0], 1, &tmp.ptr.p_double[0], 1, ae_v_len(0,i-1)); } else { v = (double)(0); } tmp.ptr.p_double[i] = (cmatrix->ptr.pp_double[i][m]-v)/cmatrix->ptr.pp_double[i][i]; } ae_vector_set_length(&c0, m, _state); for(i=0; i<=m-1; i++) { c0.ptr.p_double[i] = (double)(0); } for(i=0; i<=k-1; i++) { v = tmp.ptr.p_double[i]; ae_v_addd(&c0.ptr.p_double[0], 1, &q.ptr.pp_double[i][0], 1, ae_v_len(0,m-1), v); } /* * Second, prepare modified matrix F2 = F*Q2' and solve modified task */ ae_vector_set_length(&tmp, ae_maxint(n, m, _state)+1, _state); ae_matrix_set_length(&f2, n, m-k, _state); matrixvectormultiply(fmatrix, 0, n-1, 0, m-1, ae_false, &c0, 0, m-1, -1.0, y, 0, n-1, 1.0, _state); rmatrixgemm(n, m-k, m, 1.0, fmatrix, 0, 0, 0, &q, k, 0, 1, 0.0, &f2, 0, 0, _state); lsfit_lsfitlinearinternal(y, w, &f2, n, m-k, info, &tmp, rep, _state); rep->taskrcond = (double)(-1); if( *info<=0 ) { ae_frame_leave(_state); return; } /* * then, convert back to original answer: C = C0 + Q2'*Y0 */ ae_vector_set_length(c, m, _state); ae_v_move(&c->ptr.p_double[0], 1, &c0.ptr.p_double[0], 1, ae_v_len(0,m-1)); matrixvectormultiply(&q, k, m-1, 0, m-1, ae_true, &tmp, 0, m-k-1, 1.0, c, 0, m-1, 1.0, _state); } ae_frame_leave(_state); } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. *************************************************************************/ void _pexec_lsfitlinearwc(/* Real */ ae_vector* y, /* Real */ ae_vector* w, /* Real */ ae_matrix* fmatrix, /* Real */ ae_matrix* cmatrix, ae_int_t n, ae_int_t m, ae_int_t k, ae_int_t* info, /* Real */ ae_vector* c, lsfitreport* rep, ae_state *_state) { lsfitlinearwc(y,w,fmatrix,cmatrix,n,m,k,info,c,rep, _state); } /************************************************************************* Linear least squares fitting. QR decomposition is used to reduce task to MxM, then triangular solver or SVD-based solver is used depending on condition number of the system. It allows to maximize speed and retain decent accuracy. IMPORTANT: if you want to perform polynomial fitting, it may be more convenient to use PolynomialFit() function. This function gives best results on polynomial problems and solves numerical stability issues which arise when you fit high-degree polynomials to your data. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multithreading support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Speed-up provided by multithreading greatly depends on problem size ! - only large problems (number of coefficients is more than 500) can be ! efficiently multithreaded. ! ! Generally, commercial ALGLIB is several times faster than open-source ! generic C edition, and many times faster than open-source C# edition. ! ! 
We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS: Y - array[0..N-1] Function values in N points. FMatrix - a table of basis functions values, array[0..N-1, 0..M-1]. FMatrix[I, J] - value of J-th basis function in I-th point. N - number of points used. N>=1. M - number of basis functions, M>=1. OUTPUT PARAMETERS: Info - error code: * -4 internal SVD decomposition subroutine failed (very rare and for degenerate systems only) * 1 task is solved C - decomposition coefficients, array[0..M-1] Rep - fitting report. Following fields are set: * Rep.TaskRCond reciprocal of condition number * R2 non-adjusted coefficient of determination (non-weighted) * RMSError rms error on the (X,Y). * AvgError average error on the (X,Y). * AvgRelError average relative error on the non-zero Y * MaxError maximum error NON-WEIGHTED ERRORS ARE CALCULATED ERRORS IN PARAMETERS This solver also calculates different kinds of errors in parameters and fills corresponding fields of report: * Rep.CovPar covariance matrix for parameters, array[K,K]. * Rep.ErrPar errors in parameters, array[K], errpar = sqrt(diag(CovPar)) * Rep.ErrCurve vector of fit errors - standard deviations of empirical best-fit curve from "ideal" best-fit curve built with infinite number of samples, array[N]. errcurve = sqrt(diag(F*CovPar*F')), where F is functions matrix. * Rep.Noise vector of per-point estimates of noise, array[N] NOTE: noise in the data is estimated as follows: * for fitting without user-supplied weights all points are assumed to have same level of noise, which is estimated from the data * for fitting with user-supplied weights we assume that noise level in I-th point is inversely proportional to Ith weight. Coefficient of proportionality is estimated from the data. NOTE: we apply small amount of regularization when we invert squared Jacobian and calculate covariance matrix. It guarantees that algorithm won't divide by zero during inversion, but skews error estimates a bit (fractional error is about 10^-9). However, we believe that this difference is insignificant for all practical purposes except for the situation when you want to compare ALGLIB results with "reference" implementation up to the last significant digit. NOTE: covariance matrix is estimated using correction for degrees of freedom (covariances are divided by N-M instead of dividing by N). -- ALGLIB -- Copyright 17.08.2009 by Bochkanov Sergey *************************************************************************/ void lsfitlinear(/* Real */ ae_vector* y, /* Real */ ae_matrix* fmatrix, ae_int_t n, ae_int_t m, ae_int_t* info, /* Real */ ae_vector* c, lsfitreport* rep, ae_state *_state) { ae_frame _frame_block; ae_vector w; ae_int_t i; ae_frame_make(_state, &_frame_block); *info = 0; ae_vector_clear(c); _lsfitreport_clear(rep); ae_vector_init(&w, 0, DT_REAL, _state); ae_assert(n>=1, "LSFitLinear: N<1!", _state); ae_assert(m>=1, "LSFitLinear: M<1!", _state); ae_assert(y->cnt>=n, "LSFitLinear: length(Y)rows>=n, "LSFitLinear: rows(FMatrix)cols>=m, "LSFitLinear: cols(FMatrix)=1. M - number of basis functions, M>=1. K - number of constraints, 0 <= K < M K=0 corresponds to absence of constraints. 
OUTPUT PARAMETERS:
    Info    -   error code:
                * -4    internal SVD decomposition subroutine failed (very rare and for degenerate systems only)
                * -3    either too many constraints (M or more), degenerate constraints (some constraints are repeated twice) or inconsistent constraints were specified.
                *  1    task is solved
    C       -   decomposition coefficients, array[0..M-1]
    Rep     -   fitting report. Following fields are set:
                * R2            non-adjusted coefficient of determination (non-weighted)
                * RMSError      rms error on the (X,Y).
                * AvgError      average error on the (X,Y).
                * AvgRelError   average relative error on the non-zero Y
                * MaxError      maximum error
                                NON-WEIGHTED ERRORS ARE CALCULATED

IMPORTANT: this subroutine doesn't calculate task's condition number for K<>0.

ERRORS IN PARAMETERS

This solver also calculates different kinds of errors in parameters and fills corresponding fields of report:
* Rep.CovPar        covariance matrix for parameters, array[K,K].
* Rep.ErrPar        errors in parameters, array[K], errpar = sqrt(diag(CovPar))
* Rep.ErrCurve      vector of fit errors - standard deviations of empirical best-fit curve from "ideal" best-fit curve built with infinite number of samples, array[N]. errcurve = sqrt(diag(F*CovPar*F')), where F is functions matrix.
* Rep.Noise         vector of per-point estimates of noise, array[N]

IMPORTANT: errors in parameters are calculated without taking into account boundary/linear constraints! Presence of constraints changes distribution of errors, but there is no easy way to account for constraints when you calculate covariance matrix.

NOTE: noise in the data is estimated as follows:
* for fitting without user-supplied weights all points are assumed to have same level of noise, which is estimated from the data
* for fitting with user-supplied weights we assume that noise level in I-th point is inversely proportional to I-th weight. Coefficient of proportionality is estimated from the data.

NOTE: we apply small amount of regularization when we invert squared Jacobian and calculate covariance matrix. It guarantees that algorithm won't divide by zero during inversion, but skews error estimates a bit (fractional error is about 10^-9). However, we believe that this difference is insignificant for all practical purposes except for the situation when you want to compare ALGLIB results with "reference" implementation up to the last significant digit.

NOTE: covariance matrix is estimated using correction for degrees of freedom (covariances are divided by N-M instead of dividing by N).
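/*************************************************************************
Illustrative usage sketch (a minimal example, not part of the library
source) for the linear solvers documented above: weighted linear least
squares with LSFitLinearW(), fitting y ~ c0 + c1*x by filling FMatrix with
the basis functions {1, x} evaluated at every point. The alglib:: wrapper
is assumed to mirror the C parameter list (Y, W, FMatrix, N, M, Info, C,
Rep); all data values are made up for illustration. The constrained
variants (LSFitLinearWC/LSFitLinearC) additionally take CMatrix and K.
*************************************************************************/
#include <cstdio>
#include "interpolation.h"
using namespace alglib;

int example_lsfitlinearw()
{
    // 4 points, basis {1, x}: FMatrix[i][0]=1, FMatrix[i][1]=x[i]
    real_2d_array fmatrix = "[[1,0],[1,1],[1,2],[1,3]]";
    real_1d_array y       = "[0.9, 3.1, 5.0, 7.2]";
    real_1d_array w       = "[1, 1, 1, 0.5]";     // down-weight the last point

    ae_int_t n = 4, m = 2;
    ae_int_t info;
    real_1d_array c;
    lsfitreport rep;

    lsfitlinearw(y, w, fmatrix, n, m, info, c, rep);
    if( info>0 )
        printf("c0 = %.4f +/- %.4f, c1 = %.4f, rms = %.6f\n",
               c[0], rep.errpar[0], c[1], rep.rmserror);
    return int(info);
}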
-- ALGLIB -- Copyright 07.09.2009 by Bochkanov Sergey *************************************************************************/ void lsfitlinearc(/* Real */ ae_vector* y, /* Real */ ae_matrix* fmatrix, /* Real */ ae_matrix* cmatrix, ae_int_t n, ae_int_t m, ae_int_t k, ae_int_t* info, /* Real */ ae_vector* c, lsfitreport* rep, ae_state *_state) { ae_frame _frame_block; ae_vector _y; ae_vector w; ae_int_t i; ae_frame_make(_state, &_frame_block); ae_vector_init_copy(&_y, y, _state); y = &_y; *info = 0; ae_vector_clear(c); _lsfitreport_clear(rep); ae_vector_init(&w, 0, DT_REAL, _state); ae_assert(n>=1, "LSFitLinearC: N<1!", _state); ae_assert(m>=1, "LSFitLinearC: M<1!", _state); ae_assert(k>=0, "LSFitLinearC: K<0!", _state); ae_assert(y->cnt>=n, "LSFitLinearC: length(Y)rows>=n, "LSFitLinearC: rows(FMatrix)cols>=m, "LSFitLinearC: cols(FMatrix)rows>=k, "LSFitLinearC: rows(CMatrix)cols>=m+1||k==0, "LSFitLinearC: cols(CMatrix)1 M - dimension of space K - number of parameters being fitted DiffStep- numerical differentiation step; should not be very small or large; large = loss of accuracy small = growth of round-off errors OUTPUT PARAMETERS: State - structure which stores algorithm state -- ALGLIB -- Copyright 18.10.2008 by Bochkanov Sergey *************************************************************************/ void lsfitcreatewf(/* Real */ ae_matrix* x, /* Real */ ae_vector* y, /* Real */ ae_vector* w, /* Real */ ae_vector* c, ae_int_t n, ae_int_t m, ae_int_t k, double diffstep, lsfitstate* state, ae_state *_state) { ae_int_t i; _lsfitstate_clear(state); ae_assert(n>=1, "LSFitCreateWF: N<1!", _state); ae_assert(m>=1, "LSFitCreateWF: M<1!", _state); ae_assert(k>=1, "LSFitCreateWF: K<1!", _state); ae_assert(c->cnt>=k, "LSFitCreateWF: length(C)cnt>=n, "LSFitCreateWF: length(Y)cnt>=n, "LSFitCreateWF: length(W)rows>=n, "LSFitCreateWF: rows(X)cols>=m, "LSFitCreateWF: cols(X)teststep = (double)(0); state->diffstep = diffstep; state->npoints = n; state->nweights = n; state->wkind = 1; state->m = m; state->k = k; lsfitsetcond(state, 0.0, 0, _state); lsfitsetstpmax(state, 0.0, _state); lsfitsetxrep(state, ae_false, _state); ae_matrix_set_length(&state->taskx, n, m, _state); ae_vector_set_length(&state->tasky, n, _state); ae_vector_set_length(&state->taskw, n, _state); ae_vector_set_length(&state->c, k, _state); ae_vector_set_length(&state->x, m, _state); ae_v_move(&state->c.ptr.p_double[0], 1, &c->ptr.p_double[0], 1, ae_v_len(0,k-1)); ae_v_move(&state->taskw.ptr.p_double[0], 1, &w->ptr.p_double[0], 1, ae_v_len(0,n-1)); for(i=0; i<=n-1; i++) { ae_v_move(&state->taskx.ptr.pp_double[i][0], 1, &x->ptr.pp_double[i][0], 1, ae_v_len(0,m-1)); state->tasky.ptr.p_double[i] = y->ptr.p_double[i]; } ae_vector_set_length(&state->s, k, _state); ae_vector_set_length(&state->bndl, k, _state); ae_vector_set_length(&state->bndu, k, _state); for(i=0; i<=k-1; i++) { state->s.ptr.p_double[i] = 1.0; state->bndl.ptr.p_double[i] = _state->v_neginf; state->bndu.ptr.p_double[i] = _state->v_posinf; } state->optalgo = 0; state->prevnpt = -1; state->prevalgo = -1; state->nec = 0; state->nic = 0; minlmcreatev(k, n, &state->c, diffstep, &state->optstate, _state); lsfit_lsfitclearrequestfields(state, _state); ae_vector_set_length(&state->rstate.ia, 6+1, _state); ae_vector_set_length(&state->rstate.ra, 8+1, _state); state->rstate.stage = -1; } /************************************************************************* Nonlinear least squares fitting using function values only. 
Combination of numerical differentiation and secant updates is used to obtain function Jacobian. Nonlinear task min(F(c)) is solved, where F(c) = (f(c,x[0])-y[0])^2 + ... + (f(c,x[n-1])-y[n-1])^2, * N is a number of points, * M is a dimension of a space points belong to, * K is a dimension of a space of parameters being fitted, * w is an N-dimensional vector of weight coefficients, * x is a set of N points, each of them is an M-dimensional vector, * c is a K-dimensional vector of parameters being fitted This subroutine uses only f(c,x[i]). INPUT PARAMETERS: X - array[0..N-1,0..M-1], points (one row = one point) Y - array[0..N-1], function values. C - array[0..K-1], initial approximation to the solution, N - number of points, N>1 M - dimension of space K - number of parameters being fitted DiffStep- numerical differentiation step; should not be very small or large; large = loss of accuracy small = growth of round-off errors OUTPUT PARAMETERS: State - structure which stores algorithm state -- ALGLIB -- Copyright 18.10.2008 by Bochkanov Sergey *************************************************************************/ void lsfitcreatef(/* Real */ ae_matrix* x, /* Real */ ae_vector* y, /* Real */ ae_vector* c, ae_int_t n, ae_int_t m, ae_int_t k, double diffstep, lsfitstate* state, ae_state *_state) { ae_int_t i; _lsfitstate_clear(state); ae_assert(n>=1, "LSFitCreateF: N<1!", _state); ae_assert(m>=1, "LSFitCreateF: M<1!", _state); ae_assert(k>=1, "LSFitCreateF: K<1!", _state); ae_assert(c->cnt>=k, "LSFitCreateF: length(C)cnt>=n, "LSFitCreateF: length(Y)rows>=n, "LSFitCreateF: rows(X)cols>=m, "LSFitCreateF: cols(X)rows>=n, "LSFitCreateF: rows(X)cols>=m, "LSFitCreateF: cols(X)teststep = (double)(0); state->diffstep = diffstep; state->npoints = n; state->wkind = 0; state->m = m; state->k = k; lsfitsetcond(state, 0.0, 0, _state); lsfitsetstpmax(state, 0.0, _state); lsfitsetxrep(state, ae_false, _state); ae_matrix_set_length(&state->taskx, n, m, _state); ae_vector_set_length(&state->tasky, n, _state); ae_vector_set_length(&state->c, k, _state); ae_vector_set_length(&state->x, m, _state); ae_v_move(&state->c.ptr.p_double[0], 1, &c->ptr.p_double[0], 1, ae_v_len(0,k-1)); for(i=0; i<=n-1; i++) { ae_v_move(&state->taskx.ptr.pp_double[i][0], 1, &x->ptr.pp_double[i][0], 1, ae_v_len(0,m-1)); state->tasky.ptr.p_double[i] = y->ptr.p_double[i]; } ae_vector_set_length(&state->s, k, _state); ae_vector_set_length(&state->bndl, k, _state); ae_vector_set_length(&state->bndu, k, _state); for(i=0; i<=k-1; i++) { state->s.ptr.p_double[i] = 1.0; state->bndl.ptr.p_double[i] = _state->v_neginf; state->bndu.ptr.p_double[i] = _state->v_posinf; } state->optalgo = 0; state->prevnpt = -1; state->prevalgo = -1; state->nec = 0; state->nic = 0; minlmcreatev(k, n, &state->c, diffstep, &state->optstate, _state); lsfit_lsfitclearrequestfields(state, _state); ae_vector_set_length(&state->rstate.ia, 6+1, _state); ae_vector_set_length(&state->rstate.ra, 8+1, _state); state->rstate.stage = -1; } /************************************************************************* Weighted nonlinear least squares fitting using gradient only. Nonlinear task min(F(c)) is solved, where F(c) = (w[0]*(f(c,x[0])-y[0]))^2 + ... 
+ (w[n-1]*(f(c,x[n-1])-y[n-1]))^2, * N is a number of points, * M is a dimension of a space points belong to, * K is a dimension of a space of parameters being fitted, * w is an N-dimensional vector of weight coefficients, * x is a set of N points, each of them is an M-dimensional vector, * c is a K-dimensional vector of parameters being fitted This subroutine uses only f(c,x[i]) and its gradient. INPUT PARAMETERS: X - array[0..N-1,0..M-1], points (one row = one point) Y - array[0..N-1], function values. W - weights, array[0..N-1] C - array[0..K-1], initial approximation to the solution, N - number of points, N>1 M - dimension of space K - number of parameters being fitted CheapFG - boolean flag, which is: * True if both function and gradient calculation complexity are less than O(M^2). An improved algorithm can be used which corresponds to FGJ scheme from MINLM unit. * False otherwise. Standard Jacibian-bases Levenberg-Marquardt algo will be used (FJ scheme). OUTPUT PARAMETERS: State - structure which stores algorithm state See also: LSFitResults LSFitCreateFG (fitting without weights) LSFitCreateWFGH (fitting using Hessian) LSFitCreateFGH (fitting using Hessian, without weights) -- ALGLIB -- Copyright 17.08.2009 by Bochkanov Sergey *************************************************************************/ void lsfitcreatewfg(/* Real */ ae_matrix* x, /* Real */ ae_vector* y, /* Real */ ae_vector* w, /* Real */ ae_vector* c, ae_int_t n, ae_int_t m, ae_int_t k, ae_bool cheapfg, lsfitstate* state, ae_state *_state) { ae_int_t i; _lsfitstate_clear(state); ae_assert(n>=1, "LSFitCreateWFG: N<1!", _state); ae_assert(m>=1, "LSFitCreateWFG: M<1!", _state); ae_assert(k>=1, "LSFitCreateWFG: K<1!", _state); ae_assert(c->cnt>=k, "LSFitCreateWFG: length(C)cnt>=n, "LSFitCreateWFG: length(Y)cnt>=n, "LSFitCreateWFG: length(W)rows>=n, "LSFitCreateWFG: rows(X)cols>=m, "LSFitCreateWFG: cols(X)teststep = (double)(0); state->diffstep = (double)(0); state->npoints = n; state->nweights = n; state->wkind = 1; state->m = m; state->k = k; lsfitsetcond(state, 0.0, 0, _state); lsfitsetstpmax(state, 0.0, _state); lsfitsetxrep(state, ae_false, _state); ae_matrix_set_length(&state->taskx, n, m, _state); ae_vector_set_length(&state->tasky, n, _state); ae_vector_set_length(&state->taskw, n, _state); ae_vector_set_length(&state->c, k, _state); ae_vector_set_length(&state->x, m, _state); ae_vector_set_length(&state->g, k, _state); ae_v_move(&state->c.ptr.p_double[0], 1, &c->ptr.p_double[0], 1, ae_v_len(0,k-1)); ae_v_move(&state->taskw.ptr.p_double[0], 1, &w->ptr.p_double[0], 1, ae_v_len(0,n-1)); for(i=0; i<=n-1; i++) { ae_v_move(&state->taskx.ptr.pp_double[i][0], 1, &x->ptr.pp_double[i][0], 1, ae_v_len(0,m-1)); state->tasky.ptr.p_double[i] = y->ptr.p_double[i]; } ae_vector_set_length(&state->s, k, _state); ae_vector_set_length(&state->bndl, k, _state); ae_vector_set_length(&state->bndu, k, _state); for(i=0; i<=k-1; i++) { state->s.ptr.p_double[i] = 1.0; state->bndl.ptr.p_double[i] = _state->v_neginf; state->bndu.ptr.p_double[i] = _state->v_posinf; } state->optalgo = 1; state->prevnpt = -1; state->prevalgo = -1; state->nec = 0; state->nic = 0; if( cheapfg ) { minlmcreatevgj(k, n, &state->c, &state->optstate, _state); } else { minlmcreatevj(k, n, &state->c, &state->optstate, _state); } lsfit_lsfitclearrequestfields(state, _state); ae_vector_set_length(&state->rstate.ia, 6+1, _state); ae_vector_set_length(&state->rstate.ra, 8+1, _state); state->rstate.stage = -1; } 
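/*************************************************************************
Illustrative usage sketch (a minimal example, not part of the library
source): an end-to-end nonlinear fit that uses function values only, i.e.
LSFitCreateWF() with numerical differentiation. The model
f(c,x) = exp(-c0*x^2) and all data are made up for illustration, and the
alglib:: wrapper names and signatures (lsfitcreatewf, lsfitsetbc,
lsfitsetcond, lsfitfit, lsfitresults) are assumed to mirror the C routines
documented in this section.
*************************************************************************/
#include <cmath>
#include <cstdio>
#include "interpolation.h"
using namespace alglib;

// callback invoked by lsfitfit(): computes f(c,x) for current parameters c
static void gauss_f(const real_1d_array &c, const real_1d_array &x,
                    double &func, void *ptr)
{
    func = exp(-c[0]*x[0]*x[0]);
}

int example_lsfitcreatewf()
{
    real_2d_array x = "[[-1],[-0.5],[0],[0.5],[1]]";     // N=5 points, M=1
    real_1d_array y = "[0.22, 0.70, 1.00, 0.68, 0.23]";  // observed values
    real_1d_array w = "[1, 1, 1, 1, 1]";                 // unit weights
    real_1d_array c = "[0.3]";                           // initial guess, K=1
    double diffstep = 0.0001;                            // differentiation step

    lsfitstate state;
    lsfitreport rep;
    ae_int_t info;

    lsfitcreatewf(x, y, w, c, 5, 1, 1, diffstep, state); // expert form: N, M, K passed explicitly
    real_1d_array bndl = "[0.0]";                        // keep c0 non-negative
    real_1d_array bndu = "[1.0e6]";                      // "very large number" as upper bound
    lsfitsetbc(state, bndl, bndu);
    lsfitsetcond(state, 1.0e-6, 0);                      // EpsX=1.0E-6, unlimited iterations
    lsfitfit(state, gauss_f);                            // runs the reverse-communication loop
    lsfitresults(state, info, c, rep);

    if( info>0 )
        printf("c0 = %.4f, rms = %.6f\n", c[0], rep.rmserror);
    return int(info);
}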
/************************************************************************* Nonlinear least squares fitting using gradient only, without individual weights. Nonlinear task min(F(c)) is solved, where F(c) = ((f(c,x[0])-y[0]))^2 + ... + ((f(c,x[n-1])-y[n-1]))^2, * N is a number of points, * M is a dimension of a space points belong to, * K is a dimension of a space of parameters being fitted, * x is a set of N points, each of them is an M-dimensional vector, * c is a K-dimensional vector of parameters being fitted This subroutine uses only f(c,x[i]) and its gradient. INPUT PARAMETERS: X - array[0..N-1,0..M-1], points (one row = one point) Y - array[0..N-1], function values. C - array[0..K-1], initial approximation to the solution, N - number of points, N>1 M - dimension of space K - number of parameters being fitted CheapFG - boolean flag, which is: * True if both function and gradient calculation complexity are less than O(M^2). An improved algorithm can be used which corresponds to FGJ scheme from MINLM unit. * False otherwise. Standard Jacibian-bases Levenberg-Marquardt algo will be used (FJ scheme). OUTPUT PARAMETERS: State - structure which stores algorithm state -- ALGLIB -- Copyright 17.08.2009 by Bochkanov Sergey *************************************************************************/ void lsfitcreatefg(/* Real */ ae_matrix* x, /* Real */ ae_vector* y, /* Real */ ae_vector* c, ae_int_t n, ae_int_t m, ae_int_t k, ae_bool cheapfg, lsfitstate* state, ae_state *_state) { ae_int_t i; _lsfitstate_clear(state); ae_assert(n>=1, "LSFitCreateFG: N<1!", _state); ae_assert(m>=1, "LSFitCreateFG: M<1!", _state); ae_assert(k>=1, "LSFitCreateFG: K<1!", _state); ae_assert(c->cnt>=k, "LSFitCreateFG: length(C)cnt>=n, "LSFitCreateFG: length(Y)rows>=n, "LSFitCreateFG: rows(X)cols>=m, "LSFitCreateFG: cols(X)rows>=n, "LSFitCreateFG: rows(X)cols>=m, "LSFitCreateFG: cols(X)teststep = (double)(0); state->diffstep = (double)(0); state->npoints = n; state->wkind = 0; state->m = m; state->k = k; lsfitsetcond(state, 0.0, 0, _state); lsfitsetstpmax(state, 0.0, _state); lsfitsetxrep(state, ae_false, _state); ae_matrix_set_length(&state->taskx, n, m, _state); ae_vector_set_length(&state->tasky, n, _state); ae_vector_set_length(&state->c, k, _state); ae_vector_set_length(&state->x, m, _state); ae_vector_set_length(&state->g, k, _state); ae_v_move(&state->c.ptr.p_double[0], 1, &c->ptr.p_double[0], 1, ae_v_len(0,k-1)); for(i=0; i<=n-1; i++) { ae_v_move(&state->taskx.ptr.pp_double[i][0], 1, &x->ptr.pp_double[i][0], 1, ae_v_len(0,m-1)); state->tasky.ptr.p_double[i] = y->ptr.p_double[i]; } ae_vector_set_length(&state->s, k, _state); ae_vector_set_length(&state->bndl, k, _state); ae_vector_set_length(&state->bndu, k, _state); for(i=0; i<=k-1; i++) { state->s.ptr.p_double[i] = 1.0; state->bndl.ptr.p_double[i] = _state->v_neginf; state->bndu.ptr.p_double[i] = _state->v_posinf; } state->optalgo = 1; state->prevnpt = -1; state->prevalgo = -1; state->nec = 0; state->nic = 0; if( cheapfg ) { minlmcreatevgj(k, n, &state->c, &state->optstate, _state); } else { minlmcreatevj(k, n, &state->c, &state->optstate, _state); } lsfit_lsfitclearrequestfields(state, _state); ae_vector_set_length(&state->rstate.ia, 6+1, _state); ae_vector_set_length(&state->rstate.ra, 8+1, _state); state->rstate.stage = -1; } /************************************************************************* Weighted nonlinear least squares fitting using gradient/Hessian. Nonlinear task min(F(c)) is solved, where F(c) = (w[0]*(f(c,x[0])-y[0]))^2 + ... 
+ (w[n-1]*(f(c,x[n-1])-y[n-1]))^2, * N is a number of points, * M is a dimension of a space points belong to, * K is a dimension of a space of parameters being fitted, * w is an N-dimensional vector of weight coefficients, * x is a set of N points, each of them is an M-dimensional vector, * c is a K-dimensional vector of parameters being fitted This subroutine uses f(c,x[i]), its gradient and its Hessian. INPUT PARAMETERS: X - array[0..N-1,0..M-1], points (one row = one point) Y - array[0..N-1], function values. W - weights, array[0..N-1] C - array[0..K-1], initial approximation to the solution, N - number of points, N>1 M - dimension of space K - number of parameters being fitted OUTPUT PARAMETERS: State - structure which stores algorithm state -- ALGLIB -- Copyright 17.08.2009 by Bochkanov Sergey *************************************************************************/ void lsfitcreatewfgh(/* Real */ ae_matrix* x, /* Real */ ae_vector* y, /* Real */ ae_vector* w, /* Real */ ae_vector* c, ae_int_t n, ae_int_t m, ae_int_t k, lsfitstate* state, ae_state *_state) { ae_int_t i; _lsfitstate_clear(state); ae_assert(n>=1, "LSFitCreateWFGH: N<1!", _state); ae_assert(m>=1, "LSFitCreateWFGH: M<1!", _state); ae_assert(k>=1, "LSFitCreateWFGH: K<1!", _state); ae_assert(c->cnt>=k, "LSFitCreateWFGH: length(C)cnt>=n, "LSFitCreateWFGH: length(Y)cnt>=n, "LSFitCreateWFGH: length(W)rows>=n, "LSFitCreateWFGH: rows(X)cols>=m, "LSFitCreateWFGH: cols(X)teststep = (double)(0); state->diffstep = (double)(0); state->npoints = n; state->nweights = n; state->wkind = 1; state->m = m; state->k = k; lsfitsetcond(state, 0.0, 0, _state); lsfitsetstpmax(state, 0.0, _state); lsfitsetxrep(state, ae_false, _state); ae_matrix_set_length(&state->taskx, n, m, _state); ae_vector_set_length(&state->tasky, n, _state); ae_vector_set_length(&state->taskw, n, _state); ae_vector_set_length(&state->c, k, _state); ae_matrix_set_length(&state->h, k, k, _state); ae_vector_set_length(&state->x, m, _state); ae_vector_set_length(&state->g, k, _state); ae_v_move(&state->c.ptr.p_double[0], 1, &c->ptr.p_double[0], 1, ae_v_len(0,k-1)); ae_v_move(&state->taskw.ptr.p_double[0], 1, &w->ptr.p_double[0], 1, ae_v_len(0,n-1)); for(i=0; i<=n-1; i++) { ae_v_move(&state->taskx.ptr.pp_double[i][0], 1, &x->ptr.pp_double[i][0], 1, ae_v_len(0,m-1)); state->tasky.ptr.p_double[i] = y->ptr.p_double[i]; } ae_vector_set_length(&state->s, k, _state); ae_vector_set_length(&state->bndl, k, _state); ae_vector_set_length(&state->bndu, k, _state); for(i=0; i<=k-1; i++) { state->s.ptr.p_double[i] = 1.0; state->bndl.ptr.p_double[i] = _state->v_neginf; state->bndu.ptr.p_double[i] = _state->v_posinf; } state->optalgo = 2; state->prevnpt = -1; state->prevalgo = -1; state->nec = 0; state->nic = 0; minlmcreatefgh(k, &state->c, &state->optstate, _state); lsfit_lsfitclearrequestfields(state, _state); ae_vector_set_length(&state->rstate.ia, 6+1, _state); ae_vector_set_length(&state->rstate.ra, 8+1, _state); state->rstate.stage = -1; } /************************************************************************* Nonlinear least squares fitting using gradient/Hessian, without individial weights. Nonlinear task min(F(c)) is solved, where F(c) = ((f(c,x[0])-y[0]))^2 + ... 
+ ((f(c,x[n-1])-y[n-1]))^2, * N is a number of points, * M is a dimension of a space points belong to, * K is a dimension of a space of parameters being fitted, * x is a set of N points, each of them is an M-dimensional vector, * c is a K-dimensional vector of parameters being fitted This subroutine uses f(c,x[i]), its gradient and its Hessian. INPUT PARAMETERS: X - array[0..N-1,0..M-1], points (one row = one point) Y - array[0..N-1], function values. C - array[0..K-1], initial approximation to the solution, N - number of points, N>1 M - dimension of space K - number of parameters being fitted OUTPUT PARAMETERS: State - structure which stores algorithm state -- ALGLIB -- Copyright 17.08.2009 by Bochkanov Sergey *************************************************************************/ void lsfitcreatefgh(/* Real */ ae_matrix* x, /* Real */ ae_vector* y, /* Real */ ae_vector* c, ae_int_t n, ae_int_t m, ae_int_t k, lsfitstate* state, ae_state *_state) { ae_int_t i; _lsfitstate_clear(state); ae_assert(n>=1, "LSFitCreateFGH: N<1!", _state); ae_assert(m>=1, "LSFitCreateFGH: M<1!", _state); ae_assert(k>=1, "LSFitCreateFGH: K<1!", _state); ae_assert(c->cnt>=k, "LSFitCreateFGH: length(C)cnt>=n, "LSFitCreateFGH: length(Y)rows>=n, "LSFitCreateFGH: rows(X)cols>=m, "LSFitCreateFGH: cols(X)teststep = (double)(0); state->diffstep = (double)(0); state->npoints = n; state->wkind = 0; state->m = m; state->k = k; lsfitsetcond(state, 0.0, 0, _state); lsfitsetstpmax(state, 0.0, _state); lsfitsetxrep(state, ae_false, _state); ae_matrix_set_length(&state->taskx, n, m, _state); ae_vector_set_length(&state->tasky, n, _state); ae_vector_set_length(&state->c, k, _state); ae_matrix_set_length(&state->h, k, k, _state); ae_vector_set_length(&state->x, m, _state); ae_vector_set_length(&state->g, k, _state); ae_v_move(&state->c.ptr.p_double[0], 1, &c->ptr.p_double[0], 1, ae_v_len(0,k-1)); for(i=0; i<=n-1; i++) { ae_v_move(&state->taskx.ptr.pp_double[i][0], 1, &x->ptr.pp_double[i][0], 1, ae_v_len(0,m-1)); state->tasky.ptr.p_double[i] = y->ptr.p_double[i]; } ae_vector_set_length(&state->s, k, _state); ae_vector_set_length(&state->bndl, k, _state); ae_vector_set_length(&state->bndu, k, _state); for(i=0; i<=k-1; i++) { state->s.ptr.p_double[i] = 1.0; state->bndl.ptr.p_double[i] = _state->v_neginf; state->bndu.ptr.p_double[i] = _state->v_posinf; } state->optalgo = 2; state->prevnpt = -1; state->prevalgo = -1; state->nec = 0; state->nic = 0; minlmcreatefgh(k, &state->c, &state->optstate, _state); lsfit_lsfitclearrequestfields(state, _state); ae_vector_set_length(&state->rstate.ia, 6+1, _state); ae_vector_set_length(&state->rstate.ra, 8+1, _state); state->rstate.stage = -1; } /************************************************************************* Stopping conditions for nonlinear least squares fitting. INPUT PARAMETERS: State - structure which stores algorithm state EpsX - >=0 The subroutine finishes its work if on k+1-th iteration the condition |v|<=EpsX is fulfilled, where: * |.| means Euclidian norm * v - scaled step vector, v[i]=dx[i]/s[i] * dx - ste pvector, dx=X(k+1)-X(k) * s - scaling coefficients set by LSFitSetScale() MaxIts - maximum number of iterations. If MaxIts=0, the number of iterations is unlimited. Only Levenberg-Marquardt iterations are counted (L-BFGS/CG iterations are NOT counted because their cost is very low compared to that of LM). NOTE Passing EpsX=0 and MaxIts=0 (simultaneously) will lead to automatic stopping criterion selection (according to the scheme used by MINLM unit). 
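/*************************************************************************
Illustrative usage sketch (a minimal example, not part of the library
source): typical stopping criteria for the nonlinear fitter, applied to a
state object created by one of the LSFitCreate* routines. The call is
assumed to mirror the C routine whose description precedes this note.
*************************************************************************/
#include "interpolation.h"
using namespace alglib;

// configure stopping criteria on an already created fitter state
void configure_stopping(lsfitstate &state)
{
    // stop once the scaled step |v| drops below 1.0E-6, or after at most
    // 100 Levenberg-Marquardt iterations, whichever happens first
    lsfitsetcond(state, 1.0e-6, 100);

    // lsfitsetcond(state, 0.0, 0);   // EpsX=0, MaxIts=0: automatic criterion
}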
-- ALGLIB -- Copyright 17.08.2009 by Bochkanov Sergey *************************************************************************/ void lsfitsetcond(lsfitstate* state, double epsx, ae_int_t maxits, ae_state *_state) { ae_assert(ae_isfinite(epsx, _state), "LSFitSetCond: EpsX is not finite!", _state); ae_assert(ae_fp_greater_eq(epsx,(double)(0)), "LSFitSetCond: negative EpsX!", _state); ae_assert(maxits>=0, "LSFitSetCond: negative MaxIts!", _state); state->epsx = epsx; state->maxits = maxits; } /************************************************************************* This function sets maximum step length INPUT PARAMETERS: State - structure which stores algorithm state StpMax - maximum step length, >=0. Set StpMax to 0.0, if you don't want to limit step length. Use this subroutine when you optimize target function which contains exp() or other fast growing functions, and optimization algorithm makes too large steps which leads to overflow. This function allows us to reject steps that are too large (and therefore expose us to the possible overflow) without actually calculating function value at the x+stp*d. NOTE: non-zero StpMax leads to moderate performance degradation because intermediate step of preconditioned L-BFGS optimization is incompatible with limits on step size. -- ALGLIB -- Copyright 02.04.2010 by Bochkanov Sergey *************************************************************************/ void lsfitsetstpmax(lsfitstate* state, double stpmax, ae_state *_state) { ae_assert(ae_fp_greater_eq(stpmax,(double)(0)), "LSFitSetStpMax: StpMax<0!", _state); state->stpmax = stpmax; } /************************************************************************* This function turns on/off reporting. INPUT PARAMETERS: State - structure which stores algorithm state NeedXRep- whether iteration reports are needed or not When reports are needed, State.C (current parameters) and State.F (current value of fitting function) are reported. -- ALGLIB -- Copyright 15.08.2010 by Bochkanov Sergey *************************************************************************/ void lsfitsetxrep(lsfitstate* state, ae_bool needxrep, ae_state *_state) { state->xrep = needxrep; } /************************************************************************* This function sets scaling coefficients for underlying optimizer. ALGLIB optimizers use scaling matrices to test stopping conditions (step size and gradient are scaled before comparison with tolerances). Scale of the I-th variable is a translation invariant measure of: a) "how large" the variable is b) how large the step should be to make significant changes in the function Generally, scale is NOT considered to be a form of preconditioner. But LM optimizer is unique in that it uses scaling matrix both in the stopping condition tests and as Marquardt damping factor. Proper scaling is very important for the algorithm performance. It is less important for the quality of results, but still has some influence (it is easier to converge when variables are properly scaled, so premature stopping is possible when very badly scalled variables are combined with relaxed stopping conditions). INPUT PARAMETERS: State - structure stores algorithm state S - array[N], non-zero scaling coefficients S[i] may be negative, sign doesn't matter. 
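/*************************************************************************
Illustrative usage sketch (a minimal example, not part of the library
source): setting variable scales when the fitted parameters have very
different magnitudes, here an amplitude of order 1.0E+6 and a decay rate
of order 1.0E-3. The call is assumed to mirror the C routine whose
description precedes this note.
*************************************************************************/
#include "interpolation.h"
using namespace alglib;

// tell the optimizer about the typical magnitude of each parameter
void configure_scaling(lsfitstate &state)
{
    real_1d_array s = "[1.0e6, 1.0e-3]";   // scales of c0 and c1; signs are ignored
    lsfitsetscale(state, s);
}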
-- ALGLIB -- Copyright 14.01.2011 by Bochkanov Sergey *************************************************************************/ void lsfitsetscale(lsfitstate* state, /* Real */ ae_vector* s, ae_state *_state) { ae_int_t i; ae_assert(s->cnt>=state->k, "LSFitSetScale: Length(S)k-1; i++) { ae_assert(ae_isfinite(s->ptr.p_double[i], _state), "LSFitSetScale: S contains infinite or NAN elements", _state); ae_assert(ae_fp_neq(s->ptr.p_double[i],(double)(0)), "LSFitSetScale: S contains infinite or NAN elements", _state); state->s.ptr.p_double[i] = ae_fabs(s->ptr.p_double[i], _state); } } /************************************************************************* This function sets boundary constraints for underlying optimizer Boundary constraints are inactive by default (after initial creation). They are preserved until explicitly turned off with another SetBC() call. INPUT PARAMETERS: State - structure stores algorithm state BndL - lower bounds, array[K]. If some (all) variables are unbounded, you may specify very small number or -INF (latter is recommended because it will allow solver to use better algorithm). BndU - upper bounds, array[K]. If some (all) variables are unbounded, you may specify very large number or +INF (latter is recommended because it will allow solver to use better algorithm). NOTE 1: it is possible to specify BndL[i]=BndU[i]. In this case I-th variable will be "frozen" at X[i]=BndL[i]=BndU[i]. NOTE 2: unlike other constrained optimization algorithms, this solver has following useful properties: * bound constraints are always satisfied exactly * function is evaluated only INSIDE area specified by bound constraints -- ALGLIB -- Copyright 14.01.2011 by Bochkanov Sergey *************************************************************************/ void lsfitsetbc(lsfitstate* state, /* Real */ ae_vector* bndl, /* Real */ ae_vector* bndu, ae_state *_state) { ae_int_t i; ae_int_t k; k = state->k; ae_assert(bndl->cnt>=k, "LSFitSetBC: Length(BndL)cnt>=k, "LSFitSetBC: Length(BndU)ptr.p_double[i], _state)||ae_isneginf(bndl->ptr.p_double[i], _state), "LSFitSetBC: BndL contains NAN or +INF", _state); ae_assert(ae_isfinite(bndu->ptr.p_double[i], _state)||ae_isposinf(bndu->ptr.p_double[i], _state), "LSFitSetBC: BndU contains NAN or -INF", _state); if( ae_isfinite(bndl->ptr.p_double[i], _state)&&ae_isfinite(bndu->ptr.p_double[i], _state) ) { ae_assert(ae_fp_less_eq(bndl->ptr.p_double[i],bndu->ptr.p_double[i]), "LSFitSetBC: BndL[i]>BndU[i]", _state); } state->bndl.ptr.p_double[i] = bndl->ptr.p_double[i]; state->bndu.ptr.p_double[i] = bndu->ptr.p_double[i]; } } /************************************************************************* This function sets linear constraints for underlying optimizer Linear constraints are inactive by default (after initial creation). They are preserved until explicitly turned off with another SetLC() call. INPUT PARAMETERS: State - structure stores algorithm state C - linear constraints, array[K,N+1]. Each row of C represents one constraint, either equality or inequality (see below): * first N elements correspond to coefficients, * last element corresponds to the right part. All elements of C (including right part) must be finite. 
CT - type of constraints, array[K]: * if CT[i]>0, then I-th constraint is C[i,*]*x >= C[i,n+1] * if CT[i]=0, then I-th constraint is C[i,*]*x = C[i,n+1] * if CT[i]<0, then I-th constraint is C[i,*]*x <= C[i,n+1] K - number of equality/inequality constraints, K>=0: * if given, only leading K elements of C/CT are used * if not given, automatically determined from sizes of C/CT IMPORTANT: if you have linear constraints, it is strongly recommended to set scale of variables with lsfitsetscale(). QP solver which is used to calculate linearly constrained steps heavily relies on good scaling of input problems. NOTE: linear (non-box) constraints are satisfied only approximately - there always exists some violation due to numerical errors and algorithmic limitations. NOTE: general linear constraints add significant overhead to solution process. Although solver performs roughly same amount of iterations (when compared with similar box-only constrained problem), each iteration now involves solution of linearly constrained QP subproblem, which requires ~3-5 times more Cholesky decompositions. Thus, if you can reformulate your problem in such way this it has only box constraints, it may be beneficial to do so. -- ALGLIB -- Copyright 29.04.2017 by Bochkanov Sergey *************************************************************************/ void lsfitsetlc(lsfitstate* state, /* Real */ ae_matrix* c, /* Integer */ ae_vector* ct, ae_int_t k, ae_state *_state) { ae_int_t i; ae_int_t n; n = state->k; /* * First, check for errors in the inputs */ ae_assert(k>=0, "LSFitSetLC: K<0", _state); ae_assert(c->cols>=n+1||k==0, "LSFitSetLC: Cols(C)rows>=k, "LSFitSetLC: Rows(C)cnt>=k, "LSFitSetLC: Length(CT)nec = 0; state->nic = 0; return; } /* * Equality constraints are stored first, in the upper * NEC rows of State.CLEIC matrix. Inequality constraints * are stored in the next NIC rows. * * NOTE: we convert inequality constraints to the form * A*x<=b before copying them. */ rmatrixsetlengthatleast(&state->cleic, k, n+1, _state); state->nec = 0; state->nic = 0; for(i=0; i<=k-1; i++) { if( ct->ptr.p_int[i]==0 ) { ae_v_move(&state->cleic.ptr.pp_double[state->nec][0], 1, &c->ptr.pp_double[i][0], 1, ae_v_len(0,n)); state->nec = state->nec+1; } } for(i=0; i<=k-1; i++) { if( ct->ptr.p_int[i]!=0 ) { if( ct->ptr.p_int[i]>0 ) { ae_v_moveneg(&state->cleic.ptr.pp_double[state->nec+state->nic][0], 1, &c->ptr.pp_double[i][0], 1, ae_v_len(0,n)); } else { ae_v_move(&state->cleic.ptr.pp_double[state->nec+state->nic][0], 1, &c->ptr.pp_double[i][0], 1, ae_v_len(0,n)); } state->nic = state->nic+1; } } } /************************************************************************* NOTES: 1. this algorithm is somewhat unusual because it works with parameterized function f(C,X), where X is a function argument (we have many points which are characterized by different argument values), and C is a parameter to fit. For example, if we want to do linear fit by f(c0,c1,x) = c0*x+c1, then x will be argument, and {c0,c1} will be parameters. It is important to understand that this algorithm finds minimum in the space of function PARAMETERS (not arguments), so it needs derivatives of f() with respect to C, not X. In the example above it will need f=c0*x+c1 and {df/dc0,df/dc1} = {x,1} instead of {df/dx} = {c0}. 2. Callback functions accept C as the first parameter, and X as the second 3. 
If state was created with LSFitCreateFG(), algorithm needs just function and its gradient, but if state was created with LSFitCreateFGH(), algorithm will need function, gradient and Hessian. According to the said above, there ase several versions of this function, which accept different sets of callbacks. This flexibility opens way to subtle errors - you may create state with LSFitCreateFGH() (optimization using Hessian), but call function which does not accept Hessian. So when algorithm will request Hessian, there will be no callback to call. In this case exception will be thrown. Be careful to avoid such errors because there is no way to find them at compile time - you can see them at runtime only. -- ALGLIB -- Copyright 17.08.2009 by Bochkanov Sergey *************************************************************************/ ae_bool lsfititeration(lsfitstate* state, ae_state *_state) { double lx; double lf; double ld; double rx; double rf; double rd; ae_int_t n; ae_int_t m; ae_int_t k; double v; double vv; double relcnt; ae_int_t i; ae_int_t j; ae_int_t j1; ae_int_t info; ae_bool result; /* * Reverse communication preparations * I know it looks ugly, but it works the same way * anywhere from C++ to Python. * * This code initializes locals by: * * random values determined during code * generation - on first subroutine call * * values from previous call - on subsequent calls */ if( state->rstate.stage>=0 ) { n = state->rstate.ia.ptr.p_int[0]; m = state->rstate.ia.ptr.p_int[1]; k = state->rstate.ia.ptr.p_int[2]; i = state->rstate.ia.ptr.p_int[3]; j = state->rstate.ia.ptr.p_int[4]; j1 = state->rstate.ia.ptr.p_int[5]; info = state->rstate.ia.ptr.p_int[6]; lx = state->rstate.ra.ptr.p_double[0]; lf = state->rstate.ra.ptr.p_double[1]; ld = state->rstate.ra.ptr.p_double[2]; rx = state->rstate.ra.ptr.p_double[3]; rf = state->rstate.ra.ptr.p_double[4]; rd = state->rstate.ra.ptr.p_double[5]; v = state->rstate.ra.ptr.p_double[6]; vv = state->rstate.ra.ptr.p_double[7]; relcnt = state->rstate.ra.ptr.p_double[8]; } else { n = 359; m = -58; k = -919; i = -909; j = 81; j1 = 255; info = 74; lx = -788; lf = 809; ld = 205; rx = -838; rf = 939; rd = -526; v = 763; vv = -541; relcnt = -698; } if( state->rstate.stage==0 ) { goto lbl_0; } if( state->rstate.stage==1 ) { goto lbl_1; } if( state->rstate.stage==2 ) { goto lbl_2; } if( state->rstate.stage==3 ) { goto lbl_3; } if( state->rstate.stage==4 ) { goto lbl_4; } if( state->rstate.stage==5 ) { goto lbl_5; } if( state->rstate.stage==6 ) { goto lbl_6; } if( state->rstate.stage==7 ) { goto lbl_7; } if( state->rstate.stage==8 ) { goto lbl_8; } if( state->rstate.stage==9 ) { goto lbl_9; } if( state->rstate.stage==10 ) { goto lbl_10; } if( state->rstate.stage==11 ) { goto lbl_11; } if( state->rstate.stage==12 ) { goto lbl_12; } if( state->rstate.stage==13 ) { goto lbl_13; } /* * Routine body */ /* * Init */ if( state->wkind==1 ) { ae_assert(state->npoints==state->nweights, "LSFitFit: number of points is not equal to the number of weights", _state); } state->repvaridx = -1; n = state->npoints; m = state->m; k = state->k; ivectorsetlengthatleast(&state->tmpct, state->nec+state->nic, _state); for(i=0; i<=state->nec-1; i++) { state->tmpct.ptr.p_int[i] = 0; } for(i=0; i<=state->nic-1; i++) { state->tmpct.ptr.p_int[state->nec+i] = -1; } minlmsetcond(&state->optstate, state->epsx, state->maxits, _state); minlmsetstpmax(&state->optstate, state->stpmax, _state); minlmsetxrep(&state->optstate, state->xrep, _state); minlmsetscale(&state->optstate, &state->s, _state); 
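/*
 * The stopping criteria, step limit, report flag and variable scales set via
 * LSFitSetCond/LSFitSetStpMax/LSFitSetXRep/LSFitSetScale have just been
 * forwarded to the underlying MinLM optimizer; the box and linear constraints
 * follow below, before the reverse-communication loop begins.
 */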
minlmsetbc(&state->optstate, &state->bndl, &state->bndu, _state); minlmsetlc(&state->optstate, &state->cleic, &state->tmpct, state->nec+state->nic, _state); /* * Check that user-supplied gradient is correct */ lsfit_lsfitclearrequestfields(state, _state); if( !(ae_fp_greater(state->teststep,(double)(0))&&state->optalgo==1) ) { goto lbl_14; } for(i=0; i<=k-1; i++) { if( ae_isfinite(state->bndl.ptr.p_double[i], _state) ) { state->c.ptr.p_double[i] = ae_maxreal(state->c.ptr.p_double[i], state->bndl.ptr.p_double[i], _state); } if( ae_isfinite(state->bndu.ptr.p_double[i], _state) ) { state->c.ptr.p_double[i] = ae_minreal(state->c.ptr.p_double[i], state->bndu.ptr.p_double[i], _state); } } state->needfg = ae_true; i = 0; lbl_16: if( i>k-1 ) { goto lbl_18; } ae_assert(ae_fp_less_eq(state->bndl.ptr.p_double[i],state->c.ptr.p_double[i])&&ae_fp_less_eq(state->c.ptr.p_double[i],state->bndu.ptr.p_double[i]), "LSFitIteration: internal error(State.C is out of bounds)", _state); v = state->c.ptr.p_double[i]; j = 0; lbl_19: if( j>n-1 ) { goto lbl_21; } ae_v_move(&state->x.ptr.p_double[0], 1, &state->taskx.ptr.pp_double[j][0], 1, ae_v_len(0,m-1)); state->c.ptr.p_double[i] = v-state->teststep*state->s.ptr.p_double[i]; if( ae_isfinite(state->bndl.ptr.p_double[i], _state) ) { state->c.ptr.p_double[i] = ae_maxreal(state->c.ptr.p_double[i], state->bndl.ptr.p_double[i], _state); } lx = state->c.ptr.p_double[i]; state->rstate.stage = 0; goto lbl_rcomm; lbl_0: lf = state->f; ld = state->g.ptr.p_double[i]; state->c.ptr.p_double[i] = v+state->teststep*state->s.ptr.p_double[i]; if( ae_isfinite(state->bndu.ptr.p_double[i], _state) ) { state->c.ptr.p_double[i] = ae_minreal(state->c.ptr.p_double[i], state->bndu.ptr.p_double[i], _state); } rx = state->c.ptr.p_double[i]; state->rstate.stage = 1; goto lbl_rcomm; lbl_1: rf = state->f; rd = state->g.ptr.p_double[i]; state->c.ptr.p_double[i] = (lx+rx)/2; if( ae_isfinite(state->bndl.ptr.p_double[i], _state) ) { state->c.ptr.p_double[i] = ae_maxreal(state->c.ptr.p_double[i], state->bndl.ptr.p_double[i], _state); } if( ae_isfinite(state->bndu.ptr.p_double[i], _state) ) { state->c.ptr.p_double[i] = ae_minreal(state->c.ptr.p_double[i], state->bndu.ptr.p_double[i], _state); } state->rstate.stage = 2; goto lbl_rcomm; lbl_2: state->c.ptr.p_double[i] = v; if( !derivativecheck(lf, ld, rf, rd, state->f, state->g.ptr.p_double[i], rx-lx, _state) ) { state->repvaridx = i; state->repterminationtype = -7; result = ae_false; return result; } j = j+1; goto lbl_19; lbl_21: i = i+1; goto lbl_16; lbl_18: state->needfg = ae_false; lbl_14: /* * Fill WCur by weights: * * for WKind=0 unit weights are chosen * * for WKind=1 we use user-supplied weights stored in State.TaskW */ rvectorsetlengthatleast(&state->wcur, n, _state); for(i=0; i<=n-1; i++) { state->wcur.ptr.p_double[i] = 1.0; if( state->wkind==1 ) { state->wcur.ptr.p_double[i] = state->taskw.ptr.p_double[i]; } } /* * Optimize */ lbl_22: if( !minlmiteration(&state->optstate, _state) ) { goto lbl_23; } if( !state->optstate.needfi ) { goto lbl_24; } /* * calculate f[] = wi*(f(xi,c)-yi) */ i = 0; lbl_26: if( i>n-1 ) { goto lbl_28; } ae_v_move(&state->c.ptr.p_double[0], 1, &state->optstate.x.ptr.p_double[0], 1, ae_v_len(0,k-1)); ae_v_move(&state->x.ptr.p_double[0], 1, &state->taskx.ptr.pp_double[i][0], 1, ae_v_len(0,m-1)); state->pointindex = i; lsfit_lsfitclearrequestfields(state, _state); state->needf = ae_true; state->rstate.stage = 3; goto lbl_rcomm; lbl_3: state->needf = ae_false; vv = state->wcur.ptr.p_double[i]; 
state->optstate.fi.ptr.p_double[i] = vv*(state->f-state->tasky.ptr.p_double[i]); i = i+1; goto lbl_26; lbl_28: goto lbl_22; lbl_24: if( !state->optstate.needf ) { goto lbl_29; } /* * calculate F = sum (wi*(f(xi,c)-yi))^2 */ state->optstate.f = (double)(0); i = 0; lbl_31: if( i>n-1 ) { goto lbl_33; } ae_v_move(&state->c.ptr.p_double[0], 1, &state->optstate.x.ptr.p_double[0], 1, ae_v_len(0,k-1)); ae_v_move(&state->x.ptr.p_double[0], 1, &state->taskx.ptr.pp_double[i][0], 1, ae_v_len(0,m-1)); state->pointindex = i; lsfit_lsfitclearrequestfields(state, _state); state->needf = ae_true; state->rstate.stage = 4; goto lbl_rcomm; lbl_4: state->needf = ae_false; vv = state->wcur.ptr.p_double[i]; state->optstate.f = state->optstate.f+ae_sqr(vv*(state->f-state->tasky.ptr.p_double[i]), _state); i = i+1; goto lbl_31; lbl_33: goto lbl_22; lbl_29: if( !state->optstate.needfg ) { goto lbl_34; } /* * calculate F/gradF */ state->optstate.f = (double)(0); for(i=0; i<=k-1; i++) { state->optstate.g.ptr.p_double[i] = (double)(0); } i = 0; lbl_36: if( i>n-1 ) { goto lbl_38; } ae_v_move(&state->c.ptr.p_double[0], 1, &state->optstate.x.ptr.p_double[0], 1, ae_v_len(0,k-1)); ae_v_move(&state->x.ptr.p_double[0], 1, &state->taskx.ptr.pp_double[i][0], 1, ae_v_len(0,m-1)); state->pointindex = i; lsfit_lsfitclearrequestfields(state, _state); state->needfg = ae_true; state->rstate.stage = 5; goto lbl_rcomm; lbl_5: state->needfg = ae_false; vv = state->wcur.ptr.p_double[i]; state->optstate.f = state->optstate.f+ae_sqr(vv*(state->f-state->tasky.ptr.p_double[i]), _state); v = ae_sqr(vv, _state)*2*(state->f-state->tasky.ptr.p_double[i]); ae_v_addd(&state->optstate.g.ptr.p_double[0], 1, &state->g.ptr.p_double[0], 1, ae_v_len(0,k-1), v); i = i+1; goto lbl_36; lbl_38: goto lbl_22; lbl_34: if( !state->optstate.needfij ) { goto lbl_39; } /* * calculate Fi/jac(Fi) */ i = 0; lbl_41: if( i>n-1 ) { goto lbl_43; } ae_v_move(&state->c.ptr.p_double[0], 1, &state->optstate.x.ptr.p_double[0], 1, ae_v_len(0,k-1)); ae_v_move(&state->x.ptr.p_double[0], 1, &state->taskx.ptr.pp_double[i][0], 1, ae_v_len(0,m-1)); state->pointindex = i; lsfit_lsfitclearrequestfields(state, _state); state->needfg = ae_true; state->rstate.stage = 6; goto lbl_rcomm; lbl_6: state->needfg = ae_false; vv = state->wcur.ptr.p_double[i]; state->optstate.fi.ptr.p_double[i] = vv*(state->f-state->tasky.ptr.p_double[i]); ae_v_moved(&state->optstate.j.ptr.pp_double[i][0], 1, &state->g.ptr.p_double[0], 1, ae_v_len(0,k-1), vv); i = i+1; goto lbl_41; lbl_43: goto lbl_22; lbl_39: if( !state->optstate.needfgh ) { goto lbl_44; } /* * calculate F/grad(F)/hess(F) */ state->optstate.f = (double)(0); for(i=0; i<=k-1; i++) { state->optstate.g.ptr.p_double[i] = (double)(0); } for(i=0; i<=k-1; i++) { for(j=0; j<=k-1; j++) { state->optstate.h.ptr.pp_double[i][j] = (double)(0); } } i = 0; lbl_46: if( i>n-1 ) { goto lbl_48; } ae_v_move(&state->c.ptr.p_double[0], 1, &state->optstate.x.ptr.p_double[0], 1, ae_v_len(0,k-1)); ae_v_move(&state->x.ptr.p_double[0], 1, &state->taskx.ptr.pp_double[i][0], 1, ae_v_len(0,m-1)); state->pointindex = i; lsfit_lsfitclearrequestfields(state, _state); state->needfgh = ae_true; state->rstate.stage = 7; goto lbl_rcomm; lbl_7: state->needfgh = ae_false; vv = state->wcur.ptr.p_double[i]; state->optstate.f = state->optstate.f+ae_sqr(vv*(state->f-state->tasky.ptr.p_double[i]), _state); v = ae_sqr(vv, _state)*2*(state->f-state->tasky.ptr.p_double[i]); ae_v_addd(&state->optstate.g.ptr.p_double[0], 1, &state->g.ptr.p_double[0], 1, ae_v_len(0,k-1), v); for(j=0; j<=k-1; 
j++) { v = 2*ae_sqr(vv, _state)*state->g.ptr.p_double[j]; ae_v_addd(&state->optstate.h.ptr.pp_double[j][0], 1, &state->g.ptr.p_double[0], 1, ae_v_len(0,k-1), v); v = 2*ae_sqr(vv, _state)*(state->f-state->tasky.ptr.p_double[i]); ae_v_addd(&state->optstate.h.ptr.pp_double[j][0], 1, &state->h.ptr.pp_double[j][0], 1, ae_v_len(0,k-1), v); } i = i+1; goto lbl_46; lbl_48: goto lbl_22; lbl_44: if( !state->optstate.xupdated ) { goto lbl_49; } /* * Report new iteration */ ae_v_move(&state->c.ptr.p_double[0], 1, &state->optstate.x.ptr.p_double[0], 1, ae_v_len(0,k-1)); state->f = state->optstate.f; lsfit_lsfitclearrequestfields(state, _state); state->xupdated = ae_true; state->rstate.stage = 8; goto lbl_rcomm; lbl_8: state->xupdated = ae_false; goto lbl_22; lbl_49: goto lbl_22; lbl_23: minlmresults(&state->optstate, &state->c, &state->optrep, _state); state->repterminationtype = state->optrep.terminationtype; state->repiterationscount = state->optrep.iterationscount; /* * calculate errors */ if( state->repterminationtype<=0 ) { goto lbl_51; } /* * Calculate RMS/Avg/Max/... errors */ state->reprmserror = (double)(0); state->repwrmserror = (double)(0); state->repavgerror = (double)(0); state->repavgrelerror = (double)(0); state->repmaxerror = (double)(0); relcnt = (double)(0); i = 0; lbl_53: if( i>n-1 ) { goto lbl_55; } ae_v_move(&state->c.ptr.p_double[0], 1, &state->c.ptr.p_double[0], 1, ae_v_len(0,k-1)); ae_v_move(&state->x.ptr.p_double[0], 1, &state->taskx.ptr.pp_double[i][0], 1, ae_v_len(0,m-1)); state->pointindex = i; lsfit_lsfitclearrequestfields(state, _state); state->needf = ae_true; state->rstate.stage = 9; goto lbl_rcomm; lbl_9: state->needf = ae_false; v = state->f; vv = state->wcur.ptr.p_double[i]; state->reprmserror = state->reprmserror+ae_sqr(v-state->tasky.ptr.p_double[i], _state); state->repwrmserror = state->repwrmserror+ae_sqr(vv*(v-state->tasky.ptr.p_double[i]), _state); state->repavgerror = state->repavgerror+ae_fabs(v-state->tasky.ptr.p_double[i], _state); if( ae_fp_neq(state->tasky.ptr.p_double[i],(double)(0)) ) { state->repavgrelerror = state->repavgrelerror+ae_fabs(v-state->tasky.ptr.p_double[i], _state)/ae_fabs(state->tasky.ptr.p_double[i], _state); relcnt = relcnt+1; } state->repmaxerror = ae_maxreal(state->repmaxerror, ae_fabs(v-state->tasky.ptr.p_double[i], _state), _state); i = i+1; goto lbl_53; lbl_55: state->reprmserror = ae_sqrt(state->reprmserror/n, _state); state->repwrmserror = ae_sqrt(state->repwrmserror/n, _state); state->repavgerror = state->repavgerror/n; if( ae_fp_neq(relcnt,(double)(0)) ) { state->repavgrelerror = state->repavgrelerror/relcnt; } /* * Calculate covariance matrix */ rmatrixsetlengthatleast(&state->tmpjac, n, k, _state); rvectorsetlengthatleast(&state->tmpf, n, _state); rvectorsetlengthatleast(&state->tmp, k, _state); if( ae_fp_less_eq(state->diffstep,(double)(0)) ) { goto lbl_56; } /* * Compute Jacobian by means of numerical differentiation */ lsfit_lsfitclearrequestfields(state, _state); state->needf = ae_true; i = 0; lbl_58: if( i>n-1 ) { goto lbl_60; } ae_v_move(&state->x.ptr.p_double[0], 1, &state->taskx.ptr.pp_double[i][0], 1, ae_v_len(0,m-1)); state->pointindex = i; state->rstate.stage = 10; goto lbl_rcomm; lbl_10: state->tmpf.ptr.p_double[i] = state->f; j = 0; lbl_61: if( j>k-1 ) { goto lbl_63; } v = state->c.ptr.p_double[j]; lx = v-state->diffstep*state->s.ptr.p_double[j]; state->c.ptr.p_double[j] = lx; if( ae_isfinite(state->bndl.ptr.p_double[j], _state) ) { state->c.ptr.p_double[j] = ae_maxreal(state->c.ptr.p_double[j], 
state->bndl.ptr.p_double[j], _state); } state->rstate.stage = 11; goto lbl_rcomm; lbl_11: lf = state->f; rx = v+state->diffstep*state->s.ptr.p_double[j]; state->c.ptr.p_double[j] = rx; if( ae_isfinite(state->bndu.ptr.p_double[j], _state) ) { state->c.ptr.p_double[j] = ae_minreal(state->c.ptr.p_double[j], state->bndu.ptr.p_double[j], _state); } state->rstate.stage = 12; goto lbl_rcomm; lbl_12: rf = state->f; state->c.ptr.p_double[j] = v; if( ae_fp_neq(rx,lx) ) { state->tmpjac.ptr.pp_double[i][j] = (rf-lf)/(rx-lx); } else { state->tmpjac.ptr.pp_double[i][j] = (double)(0); } j = j+1; goto lbl_61; lbl_63: i = i+1; goto lbl_58; lbl_60: state->needf = ae_false; goto lbl_57; lbl_56: /* * Jacobian is calculated with user-provided analytic gradient */ lsfit_lsfitclearrequestfields(state, _state); state->needfg = ae_true; i = 0; lbl_64: if( i>n-1 ) { goto lbl_66; } ae_v_move(&state->x.ptr.p_double[0], 1, &state->taskx.ptr.pp_double[i][0], 1, ae_v_len(0,m-1)); state->pointindex = i; state->rstate.stage = 13; goto lbl_rcomm; lbl_13: state->tmpf.ptr.p_double[i] = state->f; for(j=0; j<=k-1; j++) { state->tmpjac.ptr.pp_double[i][j] = state->g.ptr.p_double[j]; } i = i+1; goto lbl_64; lbl_66: state->needfg = ae_false; lbl_57: for(i=0; i<=k-1; i++) { state->tmp.ptr.p_double[i] = 0.0; } lsfit_estimateerrors(&state->tmpjac, &state->tmpf, &state->tasky, &state->wcur, &state->tmp, &state->s, n, k, &state->rep, &state->tmpjacw, 0, _state); lbl_51: result = ae_false; return result; /* * Saving state */ lbl_rcomm: result = ae_true; state->rstate.ia.ptr.p_int[0] = n; state->rstate.ia.ptr.p_int[1] = m; state->rstate.ia.ptr.p_int[2] = k; state->rstate.ia.ptr.p_int[3] = i; state->rstate.ia.ptr.p_int[4] = j; state->rstate.ia.ptr.p_int[5] = j1; state->rstate.ia.ptr.p_int[6] = info; state->rstate.ra.ptr.p_double[0] = lx; state->rstate.ra.ptr.p_double[1] = lf; state->rstate.ra.ptr.p_double[2] = ld; state->rstate.ra.ptr.p_double[3] = rx; state->rstate.ra.ptr.p_double[4] = rf; state->rstate.ra.ptr.p_double[5] = rd; state->rstate.ra.ptr.p_double[6] = v; state->rstate.ra.ptr.p_double[7] = vv; state->rstate.ra.ptr.p_double[8] = relcnt; return result; } /************************************************************************* Nonlinear least squares fitting results. Called after return from LSFitFit(). INPUT PARAMETERS: State - algorithm state OUTPUT PARAMETERS: Info - completion code: * -7 gradient verification failed. See LSFitSetGradientCheck() for more information. * 2 relative step is no more than EpsX. * 5 MaxIts steps was taken * 7 stopping conditions are too stringent, further improvement is impossible C - array[0..K-1], solution Rep - optimization report. On success following fields are set: * R2 non-adjusted coefficient of determination (non-weighted) * RMSError rms error on the (X,Y). * AvgError average error on the (X,Y). * AvgRelError average relative error on the non-zero Y * MaxError maximum error NON-WEIGHTED ERRORS ARE CALCULATED * WRMSError weighted rms error on the (X,Y). ERRORS IN PARAMETERS This solver also calculates different kinds of errors in parameters and fills corresponding fields of report: * Rep.CovPar covariance matrix for parameters, array[K,K]. * Rep.ErrPar errors in parameters, array[K], errpar = sqrt(diag(CovPar)) * Rep.ErrCurve vector of fit errors - standard deviations of empirical best-fit curve from "ideal" best-fit curve built with infinite number of samples, array[N]. errcurve = sqrt(diag(J*CovPar*J')), where J is Jacobian matrix. 
* Rep.Noise vector of per-point estimates of noise, array[N] IMPORTANT: errors in parameters are calculated without taking into account boundary/linear constraints! Presence of constraints changes distribution of errors, but there is no easy way to account for constraints when you calculate covariance matrix. NOTE: noise in the data is estimated as follows: * for fitting without user-supplied weights all points are assumed to have same level of noise, which is estimated from the data * for fitting with user-supplied weights we assume that noise level in I-th point is inversely proportional to Ith weight. Coefficient of proportionality is estimated from the data. NOTE: we apply small amount of regularization when we invert squared Jacobian and calculate covariance matrix. It guarantees that algorithm won't divide by zero during inversion, but skews error estimates a bit (fractional error is about 10^-9). However, we believe that this difference is insignificant for all practical purposes except for the situation when you want to compare ALGLIB results with "reference" implementation up to the last significant digit. NOTE: covariance matrix is estimated using correction for degrees of freedom (covariances are divided by N-M instead of dividing by N). -- ALGLIB -- Copyright 17.08.2009 by Bochkanov Sergey *************************************************************************/ void lsfitresults(lsfitstate* state, ae_int_t* info, /* Real */ ae_vector* c, lsfitreport* rep, ae_state *_state) { ae_int_t i; ae_int_t j; *info = 0; ae_vector_clear(c); _lsfitreport_clear(rep); lsfit_clearreport(rep, _state); *info = state->repterminationtype; rep->varidx = state->repvaridx; if( *info>0 ) { ae_vector_set_length(c, state->k, _state); ae_v_move(&c->ptr.p_double[0], 1, &state->c.ptr.p_double[0], 1, ae_v_len(0,state->k-1)); rep->rmserror = state->reprmserror; rep->wrmserror = state->repwrmserror; rep->avgerror = state->repavgerror; rep->avgrelerror = state->repavgrelerror; rep->maxerror = state->repmaxerror; rep->iterationscount = state->repiterationscount; ae_matrix_set_length(&rep->covpar, state->k, state->k, _state); ae_vector_set_length(&rep->errpar, state->k, _state); ae_vector_set_length(&rep->errcurve, state->npoints, _state); ae_vector_set_length(&rep->noise, state->npoints, _state); rep->r2 = state->rep.r2; for(i=0; i<=state->k-1; i++) { for(j=0; j<=state->k-1; j++) { rep->covpar.ptr.pp_double[i][j] = state->rep.covpar.ptr.pp_double[i][j]; } rep->errpar.ptr.p_double[i] = state->rep.errpar.ptr.p_double[i]; } for(i=0; i<=state->npoints-1; i++) { rep->errcurve.ptr.p_double[i] = state->rep.errcurve.ptr.p_double[i]; rep->noise.ptr.p_double[i] = state->rep.noise.ptr.p_double[i]; } } } /************************************************************************* This subroutine turns on verification of the user-supplied analytic gradient: * user calls this subroutine before fitting begins * LSFitFit() is called * prior to actual fitting, for each point in data set X_i and each component of parameters being fited C_j algorithm performs following steps: * two trial steps are made to C_j-TestStep*S[j] and C_j+TestStep*S[j], where C_j is j-th parameter and S[j] is a scale of j-th parameter * if needed, steps are bounded with respect to constraints on C[] * F(X_i|C) is evaluated at these trial points * we perform one more evaluation in the middle point of the interval * we build cubic model using function values and derivatives at trial points and we compare its prediction with actual value in the middle point 
* in case difference between prediction and actual value is higher than some predetermined threshold, algorithm stops with completion code -7; Rep.VarIdx is set to index of the parameter with incorrect derivative. * after verification is over, algorithm proceeds to the actual optimization. NOTE 1: verification needs N*K (points count * parameters count) gradient evaluations. It is very costly and you should use it only for low dimensional problems, when you want to be sure that you've correctly calculated analytic derivatives. You should not use it in the production code (unless you want to check derivatives provided by some third party). NOTE 2: you should carefully choose TestStep. Value which is too large (so large that function behaviour is significantly non-cubic) will lead to false alarms. You may use different step for different parameters by means of setting scale with LSFitSetScale(). NOTE 3: this function may lead to false positives. In case it reports that I-th derivative was calculated incorrectly, you may decrease test step and try one more time - maybe your function changes too sharply and your step is too large for such rapidly chanding function. NOTE 4: this function works only for optimizers created with LSFitCreateWFG() or LSFitCreateFG() constructors. INPUT PARAMETERS: State - structure used to store algorithm state TestStep - verification step: * TestStep=0 turns verification off * TestStep>0 activates verification -- ALGLIB -- Copyright 15.06.2012 by Bochkanov Sergey *************************************************************************/ void lsfitsetgradientcheck(lsfitstate* state, double teststep, ae_state *_state) { ae_assert(ae_isfinite(teststep, _state), "LSFitSetGradientCheck: TestStep contains NaN or Infinite", _state); ae_assert(ae_fp_greater_eq(teststep,(double)(0)), "LSFitSetGradientCheck: invalid argument TestStep(TestStep<0)", _state); state->teststep = teststep; } /************************************************************************* Internal subroutine: automatic scaling for LLS tasks. NEVER CALL IT DIRECTLY! Maps abscissas to [-1,1], standartizes ordinates and correspondingly scales constraints. It also scales weights so that max(W[i])=1 Transformations performed: * X, XC [XA,XB] => [-1,+1] transformation makes min(X)=-1, max(X)=+1 * Y [SA,SB] => [0,1] transformation makes mean(Y)=0, stddev(Y)=1 * YC transformed accordingly to SA, SB, DC[I] -- ALGLIB PROJECT -- Copyright 08.09.2009 by Bochkanov Sergey *************************************************************************/ void lsfitscalexy(/* Real */ ae_vector* x, /* Real */ ae_vector* y, /* Real */ ae_vector* w, ae_int_t n, /* Real */ ae_vector* xc, /* Real */ ae_vector* yc, /* Integer */ ae_vector* dc, ae_int_t k, double* xa, double* xb, double* sa, double* sb, /* Real */ ae_vector* xoriginal, /* Real */ ae_vector* yoriginal, ae_state *_state) { double xmin; double xmax; ae_int_t i; double mx; *xa = 0; *xb = 0; *sa = 0; *sb = 0; ae_vector_clear(xoriginal); ae_vector_clear(yoriginal); ae_assert(n>=1, "LSFitScaleXY: incorrect N", _state); ae_assert(k>=0, "LSFitScaleXY: incorrect K", _state); /* * Calculate xmin/xmax. * Force xmin<>xmax. 
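* (a degenerate interval is widened before use: if all abscissas coincide at zero, the interval is replaced by [-1,+1]; otherwise one endpoint is moved by halving it, so that the subsequent mapping of [XA,XB] onto [-1,+1] never divides by zero)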
*/ xmin = x->ptr.p_double[0]; xmax = x->ptr.p_double[0]; for(i=1; i<=n-1; i++) { xmin = ae_minreal(xmin, x->ptr.p_double[i], _state); xmax = ae_maxreal(xmax, x->ptr.p_double[i], _state); } for(i=0; i<=k-1; i++) { xmin = ae_minreal(xmin, xc->ptr.p_double[i], _state); xmax = ae_maxreal(xmax, xc->ptr.p_double[i], _state); } if( ae_fp_eq(xmin,xmax) ) { if( ae_fp_eq(xmin,(double)(0)) ) { xmin = (double)(-1); xmax = (double)(1); } else { if( ae_fp_greater(xmin,(double)(0)) ) { xmin = 0.5*xmin; } else { xmax = 0.5*xmax; } } } /* * Transform abscissas: map [XA,XB] to [0,1] * * Store old X[] in XOriginal[] (it will be used * to calculate relative error). */ ae_vector_set_length(xoriginal, n, _state); ae_v_move(&xoriginal->ptr.p_double[0], 1, &x->ptr.p_double[0], 1, ae_v_len(0,n-1)); *xa = xmin; *xb = xmax; for(i=0; i<=n-1; i++) { x->ptr.p_double[i] = 2*(x->ptr.p_double[i]-0.5*(*xa+(*xb)))/(*xb-(*xa)); } for(i=0; i<=k-1; i++) { ae_assert(dc->ptr.p_int[i]>=0, "LSFitScaleXY: internal error!", _state); xc->ptr.p_double[i] = 2*(xc->ptr.p_double[i]-0.5*(*xa+(*xb)))/(*xb-(*xa)); yc->ptr.p_double[i] = yc->ptr.p_double[i]*ae_pow(0.5*(*xb-(*xa)), (double)(dc->ptr.p_int[i]), _state); } /* * Transform function values: map [SA,SB] to [0,1] * SA = mean(Y), * SB = SA+stddev(Y). * * Store old Y[] in YOriginal[] (it will be used * to calculate relative error). */ ae_vector_set_length(yoriginal, n, _state); ae_v_move(&yoriginal->ptr.p_double[0], 1, &y->ptr.p_double[0], 1, ae_v_len(0,n-1)); *sa = (double)(0); for(i=0; i<=n-1; i++) { *sa = *sa+y->ptr.p_double[i]; } *sa = *sa/n; *sb = (double)(0); for(i=0; i<=n-1; i++) { *sb = *sb+ae_sqr(y->ptr.p_double[i]-(*sa), _state); } *sb = ae_sqrt(*sb/n, _state)+(*sa); if( ae_fp_eq(*sb,*sa) ) { *sb = 2*(*sa); } if( ae_fp_eq(*sb,*sa) ) { *sb = *sa+1; } for(i=0; i<=n-1; i++) { y->ptr.p_double[i] = (y->ptr.p_double[i]-(*sa))/(*sb-(*sa)); } for(i=0; i<=k-1; i++) { if( dc->ptr.p_int[i]==0 ) { yc->ptr.p_double[i] = (yc->ptr.p_double[i]-(*sa))/(*sb-(*sa)); } else { yc->ptr.p_double[i] = yc->ptr.p_double[i]/(*sb-(*sa)); } } /* * Scale weights */ mx = (double)(0); for(i=0; i<=n-1; i++) { mx = ae_maxreal(mx, ae_fabs(w->ptr.p_double[i], _state), _state); } if( ae_fp_neq(mx,(double)(0)) ) { for(i=0; i<=n-1; i++) { w->ptr.p_double[i] = w->ptr.p_double[i]/mx; } } } /************************************************************************* This function analyzes section of curve for processing by RDP algorithm: given set of points X,Y with indexes [I0,I1] it returns point with worst deviation from linear model (non-parametric version which sees curve as Y(x)). Input parameters: X, Y - SORTED arrays. I0,I1 - interval (boundaries included) to process Eps - desired precision OUTPUT PARAMETERS: WorstIdx - index of worst point WorstError - error at worst point NOTE: this function guarantees that it returns exactly zero for a section with less than 3 points. 
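NOTE: the sketch below is not part of ALGLIB - it is a plain-C illustration (hypothetical names, ordinary double arrays) of the measure used by this function: the section is replaced by the secant line y = a*x+b through its endpoints, and the reported error is the largest vertical deviation of an interior point from that line.

#include <math.h>

static void rdp_section_error(const double *x, const double *y, int i0, int i1, int *worstidx, double *worsterror)
{
    // secant line through the endpoints of the section
    double a = (y[i1]-y[i0])/(x[i1]-x[i0]);
    double b = (y[i0]*x[i1]-y[i1]*x[i0])/(x[i1]-x[i0]);
    int i;
    *worstidx = i0;
    *worsterror = 0.0;
    for(i=i0+1; i<=i1-1; i++)
    {
        double e = fabs(a*x[i]+b-y[i]);   // vertical deviation from the secant
        if( e>*worsterror )
        {
            *worsterror = e;
            *worstidx = i;
        }
    }
}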
-- ALGLIB PROJECT -- Copyright 02.10.2014 by Bochkanov Sergey *************************************************************************/ static void lsfit_rdpanalyzesection(/* Real */ ae_vector* x, /* Real */ ae_vector* y, ae_int_t i0, ae_int_t i1, ae_int_t* worstidx, double* worsterror, ae_state *_state) { ae_int_t i; double xleft; double xright; double vx; double ve; double a; double b; *worstidx = 0; *worsterror = 0; xleft = x->ptr.p_double[i0]; xright = x->ptr.p_double[i1]; if( i1-i0+1<3||ae_fp_eq(xright,xleft) ) { *worstidx = i0; *worsterror = 0.0; return; } a = (y->ptr.p_double[i1]-y->ptr.p_double[i0])/(xright-xleft); b = (y->ptr.p_double[i0]*xright-y->ptr.p_double[i1]*xleft)/(xright-xleft); *worstidx = -1; *worsterror = (double)(0); for(i=i0+1; i<=i1-1; i++) { vx = x->ptr.p_double[i]; ve = ae_fabs(a*vx+b-y->ptr.p_double[i], _state); if( (ae_fp_greater(vx,xleft)&&ae_fp_less(vx,xright))&&ae_fp_greater(ve,*worsterror) ) { *worsterror = ve; *worstidx = i; } } } /************************************************************************* Recursive splitting of interval [I0,I1] (right boundary included) with RDP algorithm (non-parametric version which sees curve as Y(x)). Input parameters: X, Y - SORTED arrays. I0,I1 - interval (boundaries included) to process Eps - desired precision XOut,YOut - preallocated output arrays large enough to store result; XOut[0..1], YOut[0..1] contain first and last points of curve NOut - must contain 2 on input OUTPUT PARAMETERS: XOut, YOut - curve generated by RDP algorithm, UNSORTED NOut - number of points in curve -- ALGLIB PROJECT -- Copyright 02.10.2014 by Bochkanov Sergey *************************************************************************/ static void lsfit_rdprecursive(/* Real */ ae_vector* x, /* Real */ ae_vector* y, ae_int_t i0, ae_int_t i1, double eps, /* Real */ ae_vector* xout, /* Real */ ae_vector* yout, ae_int_t* nout, ae_state *_state) { ae_int_t worstidx; double worsterror; ae_assert(ae_fp_greater(eps,(double)(0)), "RDPRecursive: internal error, Eps<0", _state); lsfit_rdpanalyzesection(x, y, i0, i1, &worstidx, &worsterror, _state); if( ae_fp_less_eq(worsterror,eps) ) { return; } xout->ptr.p_double[*nout] = x->ptr.p_double[worstidx]; yout->ptr.p_double[*nout] = y->ptr.p_double[worstidx]; *nout = *nout+1; if( worstidx-i0x.ptr.p_double[0]; tb = state->x.ptr.p_double[1]; tc = state->x.ptr.p_double[2]; td = state->x.ptr.p_double[3]; tg = state->x.ptr.p_double[4]; if( state->xupdated ) { /* * Save best function value obtained so far. */ *flast = state->f; continue; } if( state->needfi||state->needfij ) { /* * Function vector and Jacobian */ for(i=0; i<=n-1; i++) { ae_assert(ae_fp_greater_eq(x->ptr.p_double[i],(double)(0)), "LogisticFitInternal: integrity error", _state); /* * Handle zero X */ if( ae_fp_eq(x->ptr.p_double[i],(double)(0)) ) { if( ae_fp_greater_eq(tb,(double)(0)) ) { /* * Positive or zero TB, limit X^TB subject to X->+0 is equal to zero. */ state->fi.ptr.p_double[i] = ta-y->ptr.p_double[i]; if( state->needfij ) { state->j.ptr.pp_double[i][0] = (double)(1); state->j.ptr.pp_double[i][1] = (double)(0); state->j.ptr.pp_double[i][2] = (double)(0); state->j.ptr.pp_double[i][3] = (double)(0); state->j.ptr.pp_double[i][4] = (double)(0); } } else { /* * Negative TB, limit X^TB subject to X->+0 is equal to +INF. 
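* (so the model value tends to TD: the residual is set to TD-Y[i] and only the derivative with respect to TD is nonzero)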
*/ state->fi.ptr.p_double[i] = td-y->ptr.p_double[i]; if( state->needfij ) { state->j.ptr.pp_double[i][0] = (double)(0); state->j.ptr.pp_double[i][1] = (double)(0); state->j.ptr.pp_double[i][2] = (double)(0); state->j.ptr.pp_double[i][3] = (double)(1); state->j.ptr.pp_double[i][4] = (double)(0); } } continue; } /* * Positive X. * Prepare VP0/VP1, it may become infinite or nearly overflow in some rare cases, * handle these cases */ vp0 = ae_pow(x->ptr.p_double[i]/tc, tb, _state); if( is4pl ) { vp1 = 1+vp0; } else { vp1 = ae_pow(1+vp0, tg, _state); } if( (!ae_isfinite(vp1, _state)||ae_fp_greater(vp0,1.0E50))||ae_fp_greater(vp1,1.0E50) ) { /* * VP0/VP1 are not finite, assume that it is +INF or -INF */ state->fi.ptr.p_double[i] = td-y->ptr.p_double[i]; if( state->needfij ) { state->j.ptr.pp_double[i][0] = (double)(0); state->j.ptr.pp_double[i][1] = (double)(0); state->j.ptr.pp_double[i][2] = (double)(0); state->j.ptr.pp_double[i][3] = (double)(1); state->j.ptr.pp_double[i][4] = (double)(0); } continue; } /* * VP0/VP1 are finite, normal processing */ if( is4pl ) { state->fi.ptr.p_double[i] = td+(ta-td)/vp1-y->ptr.p_double[i]; if( state->needfij ) { state->j.ptr.pp_double[i][0] = 1/vp1; state->j.ptr.pp_double[i][1] = -(ta-td)*vp0*ae_log(x->ptr.p_double[i]/tc, _state)/ae_sqr(vp1, _state); state->j.ptr.pp_double[i][2] = (ta-td)*(tb/tc)*vp0/ae_sqr(vp1, _state); state->j.ptr.pp_double[i][3] = 1-1/vp1; state->j.ptr.pp_double[i][4] = (double)(0); } } else { state->fi.ptr.p_double[i] = td+(ta-td)/vp1-y->ptr.p_double[i]; if( state->needfij ) { state->j.ptr.pp_double[i][0] = 1/vp1; state->j.ptr.pp_double[i][1] = (ta-td)*(-tg)*ae_pow(1+vp0, -tg-1, _state)*vp0*ae_log(x->ptr.p_double[i]/tc, _state); state->j.ptr.pp_double[i][2] = (ta-td)*(-tg)*ae_pow(1+vp0, -tg-1, _state)*vp0*(-tb/tc); state->j.ptr.pp_double[i][3] = 1-1/vp1; state->j.ptr.pp_double[i][4] = -(ta-td)/vp1*ae_log(1+vp0, _state); } } } /* * Add regularizer */ for(i=0; i<=4; i++) { state->fi.ptr.p_double[n+i] = lambdav*state->x.ptr.p_double[i]; if( state->needfij ) { for(j=0; j<=4; j++) { state->j.ptr.pp_double[n+i][j] = 0.0; } state->j.ptr.pp_double[n+i][i] = lambdav; } } /* * Done */ continue; } ae_assert(ae_false, "LogisticFitX: internal error", _state); } minlmresultsbuf(state, p1, replm, _state); ae_assert(replm->terminationtype>0, "LogisticFitX: internal error", _state); } /************************************************************************* Calculate errors for 4PL/5PL fit. Leaves other fields of Rep unchanged, so caller should properly initialize it with ClearRep() call. 
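NOTE: residuals are measured against the model f(x) = D + (A-D)/(1+(x/C)^B)^G, with G=1 giving the 4PL model; for x=0 the model is evaluated by its one-sided limit (A when B>=0, D when B<0). The sketch below is not part of ALGLIB - it is a plain-C illustration of this evaluation rule (hypothetical helper name):

#include <math.h>

// Evaluates the 5PL model; pass g=1.0 to obtain the 4PL model.
static double logistic_5pl(double x, double a, double b, double c, double d, double g)
{
    if( x>0 )
        return d+(a-d)/pow(1.0+pow(x/c, b), g);
    return b>=0 ? a : d;   // limit of the model as x approaches 0 from above
}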
-- ALGLIB PROJECT -- Copyright 28.04.2017 by Bochkanov Sergey *************************************************************************/ static void lsfit_logisticfit45errors(/* Real */ ae_vector* x, /* Real */ ae_vector* y, ae_int_t n, double a, double b, double c, double d, double g, lsfitreport* rep, ae_state *_state) { ae_int_t i; ae_int_t k; double v; double rss; double tss; double meany; /* * Calculate errors */ rep->rmserror = (double)(0); rep->avgerror = (double)(0); rep->avgrelerror = (double)(0); rep->maxerror = (double)(0); k = 0; rss = 0.0; tss = 0.0; meany = 0.0; for(i=0; i<=n-1; i++) { meany = meany+y->ptr.p_double[i]; } meany = meany/n; for(i=0; i<=n-1; i++) { /* * Calculate residual from regression */ if( ae_fp_greater(x->ptr.p_double[i],(double)(0)) ) { v = d+(a-d)/ae_pow(1.0+ae_pow(x->ptr.p_double[i]/c, b, _state), g, _state)-y->ptr.p_double[i]; } else { if( ae_fp_greater_eq(b,(double)(0)) ) { v = a-y->ptr.p_double[i]; } else { v = d-y->ptr.p_double[i]; } } /* * Update RSS (residual sum of squares) and TSS (total sum of squares) * which are used to calculate coefficient of determination. * * NOTE: we use formula R2 = 1-RSS/TSS because it has nice property of * being equal to 0.0 if and only if model perfectly fits data. * * When we fit nonlinear models, there are exist multiple ways of * determining R2, each of them giving different results. Formula * above is the most intuitive one. */ rss = rss+v*v; tss = tss+ae_sqr(y->ptr.p_double[i]-meany, _state); /* * Update errors */ rep->rmserror = rep->rmserror+ae_sqr(v, _state); rep->avgerror = rep->avgerror+ae_fabs(v, _state); if( ae_fp_neq(y->ptr.p_double[i],(double)(0)) ) { rep->avgrelerror = rep->avgrelerror+ae_fabs(v/y->ptr.p_double[i], _state); k = k+1; } rep->maxerror = ae_maxreal(rep->maxerror, ae_fabs(v, _state), _state); } rep->rmserror = ae_sqrt(rep->rmserror/n, _state); rep->avgerror = rep->avgerror/n; if( k>0 ) { rep->avgrelerror = rep->avgrelerror/k; } rep->r2 = 1.0-rss/tss; } /************************************************************************* Internal spline fitting subroutine -- ALGLIB PROJECT -- Copyright 08.09.2009 by Bochkanov Sergey *************************************************************************/ static void lsfit_spline1dfitinternal(ae_int_t st, /* Real */ ae_vector* x, /* Real */ ae_vector* y, /* Real */ ae_vector* w, ae_int_t n, /* Real */ ae_vector* xc, /* Real */ ae_vector* yc, /* Integer */ ae_vector* dc, ae_int_t k, ae_int_t m, ae_int_t* info, spline1dinterpolant* s, spline1dfitreport* rep, ae_state *_state) { ae_frame _frame_block; ae_vector _x; ae_vector _y; ae_vector _w; ae_vector _xc; ae_vector _yc; ae_matrix fmatrix; ae_matrix cmatrix; ae_vector y2; ae_vector w2; ae_vector sx; ae_vector sy; ae_vector sd; ae_vector tmp; ae_vector xoriginal; ae_vector yoriginal; lsfitreport lrep; double v0; double v1; double v2; double mx; spline1dinterpolant s2; ae_int_t i; ae_int_t j; ae_int_t relcnt; double xa; double xb; double sa; double sb; double bl; double br; double decay; ae_frame_make(_state, &_frame_block); ae_vector_init_copy(&_x, x, _state); x = &_x; ae_vector_init_copy(&_y, y, _state); y = &_y; ae_vector_init_copy(&_w, w, _state); w = &_w; ae_vector_init_copy(&_xc, xc, _state); xc = &_xc; ae_vector_init_copy(&_yc, yc, _state); yc = &_yc; *info = 0; _spline1dinterpolant_clear(s); _spline1dfitreport_clear(rep); ae_matrix_init(&fmatrix, 0, 0, DT_REAL, _state); ae_matrix_init(&cmatrix, 0, 0, DT_REAL, _state); ae_vector_init(&y2, 0, DT_REAL, _state); ae_vector_init(&w2, 0, DT_REAL, 
_state); ae_vector_init(&sx, 0, DT_REAL, _state); ae_vector_init(&sy, 0, DT_REAL, _state); ae_vector_init(&sd, 0, DT_REAL, _state); ae_vector_init(&tmp, 0, DT_REAL, _state); ae_vector_init(&xoriginal, 0, DT_REAL, _state); ae_vector_init(&yoriginal, 0, DT_REAL, _state); _lsfitreport_init(&lrep, _state); _spline1dinterpolant_init(&s2, _state); ae_assert(st==0||st==1, "Spline1DFit: internal error!", _state); if( st==0&&m<4 ) { *info = -1; ae_frame_leave(_state); return; } if( st==1&&m<4 ) { *info = -1; ae_frame_leave(_state); return; } if( (n<1||k<0)||k>=m ) { *info = -1; ae_frame_leave(_state); return; } for(i=0; i<=k-1; i++) { *info = 0; if( dc->ptr.p_int[i]<0 ) { *info = -1; } if( dc->ptr.p_int[i]>1 ) { *info = -1; } if( *info<0 ) { ae_frame_leave(_state); return; } } if( st==1&&m%2!=0 ) { /* * Hermite fitter must have even number of basis functions */ *info = -2; ae_frame_leave(_state); return; } /* * weight decay for correct handling of task which becomes * degenerate after constraints are applied */ decay = 10000*ae_machineepsilon; /* * Scale X, Y, XC, YC */ lsfitscalexy(x, y, w, n, xc, yc, dc, k, &xa, &xb, &sa, &sb, &xoriginal, &yoriginal, _state); /* * allocate space, initialize: * * SX - grid for basis functions * * SY - values of basis functions at grid points * * FMatrix- values of basis functions at X[] * * CMatrix- values (derivatives) of basis functions at XC[] */ ae_vector_set_length(&y2, n+m, _state); ae_vector_set_length(&w2, n+m, _state); ae_matrix_set_length(&fmatrix, n+m, m, _state); if( k>0 ) { ae_matrix_set_length(&cmatrix, k, m+1, _state); } if( st==0 ) { /* * allocate space for cubic spline */ ae_vector_set_length(&sx, m-2, _state); ae_vector_set_length(&sy, m-2, _state); for(j=0; j<=m-2-1; j++) { sx.ptr.p_double[j] = (double)(2*j)/(double)(m-2-1)-1; } } if( st==1 ) { /* * allocate space for Hermite spline */ ae_vector_set_length(&sx, m/2, _state); ae_vector_set_length(&sy, m/2, _state); ae_vector_set_length(&sd, m/2, _state); for(j=0; j<=m/2-1; j++) { sx.ptr.p_double[j] = (double)(2*j)/(double)(m/2-1)-1; } } /* * Prepare design and constraints matrices: * * fill constraints matrix * * fill first N rows of design matrix with values * * fill next M rows of design matrix with regularizing term * * append M zeros to Y * * append M elements, mean(abs(W)) each, to W */ for(j=0; j<=m-1; j++) { /* * prepare Jth basis function */ if( st==0 ) { /* * cubic spline basis */ for(i=0; i<=m-2-1; i++) { sy.ptr.p_double[i] = (double)(0); } bl = (double)(0); br = (double)(0); if( jptr.p_double[i], _state); } for(i=0; i<=k-1; i++) { ae_assert(dc->ptr.p_int[i]>=0&&dc->ptr.p_int[i]<=2, "Spline1DFit: internal error!", _state); spline1ddiff(&s2, xc->ptr.p_double[i], &v0, &v1, &v2, _state); if( dc->ptr.p_int[i]==0 ) { cmatrix.ptr.pp_double[i][j] = v0; } if( dc->ptr.p_int[i]==1 ) { cmatrix.ptr.pp_double[i][j] = v1; } if( dc->ptr.p_int[i]==2 ) { cmatrix.ptr.pp_double[i][j] = v2; } } } for(i=0; i<=k-1; i++) { cmatrix.ptr.pp_double[i][m] = yc->ptr.p_double[i]; } for(i=0; i<=m-1; i++) { for(j=0; j<=m-1; j++) { if( i==j ) { fmatrix.ptr.pp_double[n+i][j] = decay; } else { fmatrix.ptr.pp_double[n+i][j] = (double)(0); } } } ae_vector_set_length(&y2, n+m, _state); ae_vector_set_length(&w2, n+m, _state); ae_v_move(&y2.ptr.p_double[0], 1, &y->ptr.p_double[0], 1, ae_v_len(0,n-1)); ae_v_move(&w2.ptr.p_double[0], 1, &w->ptr.p_double[0], 1, ae_v_len(0,n-1)); mx = (double)(0); for(i=0; i<=n-1; i++) { mx = mx+ae_fabs(w->ptr.p_double[i], _state); } mx = mx/n; for(i=0; i<=m-1; i++) { y2.ptr.p_double[n+i] = 
(double)(0); w2.ptr.p_double[n+i] = mx; } /* * Solve constrained task */ if( k>0 ) { /* * solve using regularization */ lsfitlinearwc(&y2, &w2, &fmatrix, &cmatrix, n+m, m, k, info, &tmp, &lrep, _state); } else { /* * no constraints, no regularization needed */ lsfitlinearwc(y, w, &fmatrix, &cmatrix, n, m, k, info, &tmp, &lrep, _state); } if( *info<0 ) { ae_frame_leave(_state); return; } /* * Generate spline and scale it */ if( st==0 ) { /* * cubic spline basis */ ae_v_move(&sy.ptr.p_double[0], 1, &tmp.ptr.p_double[0], 1, ae_v_len(0,m-2-1)); spline1dbuildcubic(&sx, &sy, m-2, 1, tmp.ptr.p_double[m-2], 1, tmp.ptr.p_double[m-1], s, _state); } if( st==1 ) { /* * Hermite basis */ for(i=0; i<=m/2-1; i++) { sy.ptr.p_double[i] = tmp.ptr.p_double[2*i]; sd.ptr.p_double[i] = tmp.ptr.p_double[2*i+1]; } spline1dbuildhermite(&sx, &sy, &sd, m/2, s, _state); } spline1dlintransx(s, 2/(xb-xa), -(xa+xb)/(xb-xa), _state); spline1dlintransy(s, sb-sa, sa, _state); /* * Scale absolute errors obtained from LSFitLinearW. * Relative error should be calculated separately * (because of shifting/scaling of the task) */ rep->taskrcond = lrep.taskrcond; rep->rmserror = lrep.rmserror*(sb-sa); rep->avgerror = lrep.avgerror*(sb-sa); rep->maxerror = lrep.maxerror*(sb-sa); rep->avgrelerror = (double)(0); relcnt = 0; for(i=0; i<=n-1; i++) { if( ae_fp_neq(yoriginal.ptr.p_double[i],(double)(0)) ) { rep->avgrelerror = rep->avgrelerror+ae_fabs(spline1dcalc(s, xoriginal.ptr.p_double[i], _state)-yoriginal.ptr.p_double[i], _state)/ae_fabs(yoriginal.ptr.p_double[i], _state); relcnt = relcnt+1; } } if( relcnt!=0 ) { rep->avgrelerror = rep->avgrelerror/relcnt; } ae_frame_leave(_state); } /************************************************************************* Internal fitting subroutine *************************************************************************/ static void lsfit_lsfitlinearinternal(/* Real */ ae_vector* y, /* Real */ ae_vector* w, /* Real */ ae_matrix* fmatrix, ae_int_t n, ae_int_t m, ae_int_t* info, /* Real */ ae_vector* c, lsfitreport* rep, ae_state *_state) { ae_frame _frame_block; double threshold; ae_matrix ft; ae_matrix q; ae_matrix l; ae_matrix r; ae_vector b; ae_vector wmod; ae_vector tau; ae_vector nzeros; ae_vector s; ae_int_t i; ae_int_t j; double v; ae_vector sv; ae_matrix u; ae_matrix vt; ae_vector tmp; ae_vector utb; ae_vector sutb; ae_int_t relcnt; ae_frame_make(_state, &_frame_block); *info = 0; ae_vector_clear(c); _lsfitreport_clear(rep); ae_matrix_init(&ft, 0, 0, DT_REAL, _state); ae_matrix_init(&q, 0, 0, DT_REAL, _state); ae_matrix_init(&l, 0, 0, DT_REAL, _state); ae_matrix_init(&r, 0, 0, DT_REAL, _state); ae_vector_init(&b, 0, DT_REAL, _state); ae_vector_init(&wmod, 0, DT_REAL, _state); ae_vector_init(&tau, 0, DT_REAL, _state); ae_vector_init(&nzeros, 0, DT_REAL, _state); ae_vector_init(&s, 0, DT_REAL, _state); ae_vector_init(&sv, 0, DT_REAL, _state); ae_matrix_init(&u, 0, 0, DT_REAL, _state); ae_matrix_init(&vt, 0, 0, DT_REAL, _state); ae_vector_init(&tmp, 0, DT_REAL, _state); ae_vector_init(&utb, 0, DT_REAL, _state); ae_vector_init(&sutb, 0, DT_REAL, _state); lsfit_clearreport(rep, _state); if( n<1||m<1 ) { *info = -1; ae_frame_leave(_state); return; } *info = 1; threshold = ae_sqrt(ae_machineepsilon, _state); /* * Degenerate case, needs special handling */ if( n<m ) { /* * Create design matrix. */ ae_matrix_set_length(&ft, n, m, _state); ae_vector_set_length(&b, n, _state); ae_vector_set_length(&wmod, n, _state); for(j=0; j<=n-1; j++) { v = w->ptr.p_double[j]; ae_v_moved(&ft.ptr.pp_double[j][0], 1, &fmatrix->ptr.pp_double[j][0], 1, ae_v_len(0,m-1), v); b.ptr.p_double[j] = w->ptr.p_double[j]*y->ptr.p_double[j]; wmod.ptr.p_double[j] = (double)(1); } /* * LQ decomposition and 
reduction to M=N */ ae_vector_set_length(c, m, _state); for(i=0; i<=m-1; i++) { c->ptr.p_double[i] = (double)(0); } rep->taskrcond = (double)(0); rmatrixlq(&ft, n, m, &tau, _state); rmatrixlqunpackq(&ft, n, m, &tau, n, &q, _state); rmatrixlqunpackl(&ft, n, m, &l, _state); lsfit_lsfitlinearinternal(&b, &wmod, &l, n, n, info, &tmp, rep, _state); if( *info<=0 ) { ae_frame_leave(_state); return; } for(i=0; i<=n-1; i++) { v = tmp.ptr.p_double[i]; ae_v_addd(&c->ptr.p_double[0], 1, &q.ptr.pp_double[i][0], 1, ae_v_len(0,m-1), v); } ae_frame_leave(_state); return; } /* * N>=M. Generate design matrix and reduce to N=M using * QR decomposition. */ ae_matrix_set_length(&ft, n, m, _state); ae_vector_set_length(&b, n, _state); for(j=0; j<=n-1; j++) { v = w->ptr.p_double[j]; ae_v_moved(&ft.ptr.pp_double[j][0], 1, &fmatrix->ptr.pp_double[j][0], 1, ae_v_len(0,m-1), v); b.ptr.p_double[j] = w->ptr.p_double[j]*y->ptr.p_double[j]; } rmatrixqr(&ft, n, m, &tau, _state); rmatrixqrunpackq(&ft, n, m, &tau, m, &q, _state); rmatrixqrunpackr(&ft, n, m, &r, _state); ae_vector_set_length(&tmp, m, _state); for(i=0; i<=m-1; i++) { tmp.ptr.p_double[i] = (double)(0); } for(i=0; i<=n-1; i++) { v = b.ptr.p_double[i]; ae_v_addd(&tmp.ptr.p_double[0], 1, &q.ptr.pp_double[i][0], 1, ae_v_len(0,m-1), v); } ae_vector_set_length(&b, m, _state); ae_v_move(&b.ptr.p_double[0], 1, &tmp.ptr.p_double[0], 1, ae_v_len(0,m-1)); /* * R contains reduced MxM design upper triangular matrix, * B contains reduced Mx1 right part. * * Determine system condition number and decide * should we use triangular solver (faster) or * SVD-based solver (more stable). * * We can use LU-based RCond estimator for this task. */ rep->taskrcond = rmatrixlurcondinf(&r, m, _state); if( ae_fp_greater(rep->taskrcond,threshold) ) { /* * use QR-based solver */ ae_vector_set_length(c, m, _state); c->ptr.p_double[m-1] = b.ptr.p_double[m-1]/r.ptr.pp_double[m-1][m-1]; for(i=m-2; i>=0; i--) { v = ae_v_dotproduct(&r.ptr.pp_double[i][i+1], 1, &c->ptr.p_double[i+1], 1, ae_v_len(i+1,m-1)); c->ptr.p_double[i] = (b.ptr.p_double[i]-v)/r.ptr.pp_double[i][i]; } } else { /* * use SVD-based solver */ if( !rmatrixsvd(&r, m, m, 1, 1, 2, &sv, &u, &vt, _state) ) { *info = -4; ae_frame_leave(_state); return; } ae_vector_set_length(&utb, m, _state); ae_vector_set_length(&sutb, m, _state); for(i=0; i<=m-1; i++) { utb.ptr.p_double[i] = (double)(0); } for(i=0; i<=m-1; i++) { v = b.ptr.p_double[i]; ae_v_addd(&utb.ptr.p_double[0], 1, &u.ptr.pp_double[i][0], 1, ae_v_len(0,m-1), v); } if( ae_fp_greater(sv.ptr.p_double[0],(double)(0)) ) { rep->taskrcond = sv.ptr.p_double[m-1]/sv.ptr.p_double[0]; for(i=0; i<=m-1; i++) { if( ae_fp_greater(sv.ptr.p_double[i],threshold*sv.ptr.p_double[0]) ) { sutb.ptr.p_double[i] = utb.ptr.p_double[i]/sv.ptr.p_double[i]; } else { sutb.ptr.p_double[i] = (double)(0); } } } else { rep->taskrcond = (double)(0); for(i=0; i<=m-1; i++) { sutb.ptr.p_double[i] = (double)(0); } } ae_vector_set_length(c, m, _state); for(i=0; i<=m-1; i++) { c->ptr.p_double[i] = (double)(0); } for(i=0; i<=m-1; i++) { v = sutb.ptr.p_double[i]; ae_v_addd(&c->ptr.p_double[0], 1, &vt.ptr.pp_double[i][0], 1, ae_v_len(0,m-1), v); } } /* * calculate errors */ rep->rmserror = (double)(0); rep->avgerror = (double)(0); rep->avgrelerror = (double)(0); rep->maxerror = (double)(0); relcnt = 0; for(i=0; i<=n-1; i++) { v = ae_v_dotproduct(&fmatrix->ptr.pp_double[i][0], 1, &c->ptr.p_double[0], 1, ae_v_len(0,m-1)); rep->rmserror = rep->rmserror+ae_sqr(v-y->ptr.p_double[i], _state); rep->avgerror = 
rep->avgerror+ae_fabs(v-y->ptr.p_double[i], _state); if( ae_fp_neq(y->ptr.p_double[i],(double)(0)) ) { rep->avgrelerror = rep->avgrelerror+ae_fabs(v-y->ptr.p_double[i], _state)/ae_fabs(y->ptr.p_double[i], _state); relcnt = relcnt+1; } rep->maxerror = ae_maxreal(rep->maxerror, ae_fabs(v-y->ptr.p_double[i], _state), _state); } rep->rmserror = ae_sqrt(rep->rmserror/n, _state); rep->avgerror = rep->avgerror/n; if( relcnt!=0 ) { rep->avgrelerror = rep->avgrelerror/relcnt; } ae_vector_set_length(&nzeros, n, _state); ae_vector_set_length(&s, m, _state); for(i=0; i<=m-1; i++) { s.ptr.p_double[i] = (double)(0); } for(i=0; i<=n-1; i++) { for(j=0; j<=m-1; j++) { s.ptr.p_double[j] = s.ptr.p_double[j]+ae_sqr(fmatrix->ptr.pp_double[i][j], _state); } nzeros.ptr.p_double[i] = (double)(0); } for(i=0; i<=m-1; i++) { if( ae_fp_neq(s.ptr.p_double[i],(double)(0)) ) { s.ptr.p_double[i] = ae_sqrt(1/s.ptr.p_double[i], _state); } else { s.ptr.p_double[i] = (double)(1); } } lsfit_estimateerrors(fmatrix, &nzeros, y, w, c, &s, n, m, rep, &r, 1, _state); ae_frame_leave(_state); } /************************************************************************* Internal subroutine *************************************************************************/ static void lsfit_lsfitclearrequestfields(lsfitstate* state, ae_state *_state) { state->needf = ae_false; state->needfg = ae_false; state->needfgh = ae_false; state->xupdated = ae_false; } /************************************************************************* Internal subroutine, calculates barycentric basis functions. Used for efficient simultaneous calculation of N basis functions. -- ALGLIB -- Copyright 17.08.2009 by Bochkanov Sergey *************************************************************************/ static void lsfit_barycentriccalcbasis(barycentricinterpolant* b, double t, /* Real */ ae_vector* y, ae_state *_state) { double s2; double s; double v; ae_int_t i; ae_int_t j; /* * special case: N=1 */ if( b->n==1 ) { y->ptr.p_double[0] = (double)(1); return; } /* * Here we assume that task is normalized, i.e.: * 1. abs(Y[i])<=1 * 2. abs(W[i])<=1 * 3. X[] is ordered * * First, we decide: should we use "safe" formula (guarded * against overflow) or fast one? */ s = ae_fabs(t-b->x.ptr.p_double[0], _state); for(i=0; i<=b->n-1; i++) { v = b->x.ptr.p_double[i]; if( ae_fp_eq(v,t) ) { for(j=0; j<=b->n-1; j++) { y->ptr.p_double[j] = (double)(0); } y->ptr.p_double[i] = (double)(1); return; } v = ae_fabs(t-v, _state); if( ae_fp_less(v,s) ) { s = v; } } s2 = (double)(0); for(i=0; i<=b->n-1; i++) { v = s/(t-b->x.ptr.p_double[i]); v = v*b->w.ptr.p_double[i]; y->ptr.p_double[i] = v; s2 = s2+v; } v = 1/s2; ae_v_muld(&y->ptr.p_double[0], 1, ae_v_len(0,b->n-1), v); } /************************************************************************* This is internal function for Chebyshev fitting. It assumes that input data are normalized: * X/XC belong to [-1,+1], * mean(Y)=0, stddev(Y)=1. It does not checks inputs for errors. This function is used to fit general (shifted) Chebyshev models, power basis models or barycentric models. INPUT PARAMETERS: X - points, array[0..N-1]. Y - function values, array[0..N-1]. W - weights, array[0..N-1] N - number of points, N>0. XC - points where polynomial values/derivatives are constrained, array[0..K-1]. 
YC - values of constraints, array[0..K-1] DC - array[0..K-1], types of constraints: * DC[i]=0 means that P(XC[i])=YC[i] * DC[i]=1 means that P'(XC[i])=YC[i] K - number of constraints, 0<=K<M. K=0 means no constraints (XC/YC/DC are not used in such cases) M - number of basis functions (= polynomial_degree + 1), M>=1 OUTPUT PARAMETERS: Info- same format as in LSFitLinearW() subroutine: * Info>0 task is solved * Info<=0 an error occurred: -4 means inconvergence of internal SVD -3 means inconsistent constraints C - interpolant in Chebyshev form; [-1,+1] is used as base interval Rep - report, same format as in LSFitLinearW() subroutine. Following fields are set: * RMSError rms error on the (X,Y). * AvgError average error on the (X,Y). * AvgRelError average relative error on the non-zero Y * MaxError maximum error NON-WEIGHTED ERRORS ARE CALCULATED IMPORTANT: this subroutine doesn't calculate task's condition number for K<>0. -- ALGLIB PROJECT -- Copyright 10.12.2009 by Bochkanov Sergey *************************************************************************/ static void lsfit_internalchebyshevfit(/* Real */ ae_vector* x, /* Real */ ae_vector* y, /* Real */ ae_vector* w, ae_int_t n, /* Real */ ae_vector* xc, /* Real */ ae_vector* yc, /* Integer */ ae_vector* dc, ae_int_t k, ae_int_t m, ae_int_t* info, /* Real */ ae_vector* c, lsfitreport* rep, ae_state *_state) { ae_frame _frame_block; ae_vector _xc; ae_vector _yc; ae_vector y2; ae_vector w2; ae_vector tmp; ae_vector tmp2; ae_vector tmpdiff; ae_vector bx; ae_vector by; ae_vector bw; ae_matrix fmatrix; ae_matrix cmatrix; ae_int_t i; ae_int_t j; double mx; double decay; ae_frame_make(_state, &_frame_block); ae_vector_init_copy(&_xc, xc, _state); xc = &_xc; ae_vector_init_copy(&_yc, yc, _state); yc = &_yc; *info = 0; ae_vector_clear(c); _lsfitreport_clear(rep); ae_vector_init(&y2, 0, DT_REAL, _state); ae_vector_init(&w2, 0, DT_REAL, _state); ae_vector_init(&tmp, 0, DT_REAL, _state); ae_vector_init(&tmp2, 0, DT_REAL, _state); ae_vector_init(&tmpdiff, 0, DT_REAL, _state); ae_vector_init(&bx, 0, DT_REAL, _state); ae_vector_init(&by, 0, DT_REAL, _state); ae_vector_init(&bw, 0, DT_REAL, _state); ae_matrix_init(&fmatrix, 0, 0, DT_REAL, _state); ae_matrix_init(&cmatrix, 0, 0, DT_REAL, _state); lsfit_clearreport(rep, _state); /* * weight decay for correct handling of task which becomes * degenerate after constraints are applied */ decay = 10000*ae_machineepsilon; /* * allocate space, initialize/fill: * * FMatrix- values of basis functions at X[] * * CMatrix- values (derivatives) of basis functions at XC[] * * fill constraints matrix * * fill first N rows of design matrix with values * * fill next M rows of design matrix with regularizing term * * append M zeros to Y * * append M elements, mean(abs(W)) each, to W */ ae_vector_set_length(&y2, n+m, _state); ae_vector_set_length(&w2, n+m, _state); ae_vector_set_length(&tmp, m, _state); ae_vector_set_length(&tmpdiff, m, _state); ae_matrix_set_length(&fmatrix, n+m, m, _state); if( k>0 ) { ae_matrix_set_length(&cmatrix, k, m+1, _state); } /* * Fill design matrix, Y2, W2: * * first N rows with basis functions for original points * * next M rows with decay terms */ for(i=0; i<=n-1; i++) { /* * prepare Ith row * use Tmp for calculations to avoid multidimensional arrays overhead */ for(j=0; j<=m-1; j++) { if( j==0 ) { tmp.ptr.p_double[j] = (double)(1); } else { if( j==1 ) { tmp.ptr.p_double[j] = x->ptr.p_double[i]; } else { tmp.ptr.p_double[j] = 2*x->ptr.p_double[i]*tmp.ptr.p_double[j-1]-tmp.ptr.p_double[j-2]; } } } ae_v_move(&fmatrix.ptr.pp_double[i][0], 1, &tmp.ptr.p_double[0], 1, ae_v_len(0,m-1)); } for(i=0; i<=m-1; i++) { for(j=0; 
j<=m-1; j++) { if( i==j ) { fmatrix.ptr.pp_double[n+i][j] = decay; } else { fmatrix.ptr.pp_double[n+i][j] = (double)(0); } } } ae_v_move(&y2.ptr.p_double[0], 1, &y->ptr.p_double[0], 1, ae_v_len(0,n-1)); ae_v_move(&w2.ptr.p_double[0], 1, &w->ptr.p_double[0], 1, ae_v_len(0,n-1)); mx = (double)(0); for(i=0; i<=n-1; i++) { mx = mx+ae_fabs(w->ptr.p_double[i], _state); } mx = mx/n; for(i=0; i<=m-1; i++) { y2.ptr.p_double[n+i] = (double)(0); w2.ptr.p_double[n+i] = mx; } /* * fill constraints matrix */ for(i=0; i<=k-1; i++) { /* * prepare Ith row * use Tmp for basis function values, * TmpDiff for basos function derivatives */ for(j=0; j<=m-1; j++) { if( j==0 ) { tmp.ptr.p_double[j] = (double)(1); tmpdiff.ptr.p_double[j] = (double)(0); } else { if( j==1 ) { tmp.ptr.p_double[j] = xc->ptr.p_double[i]; tmpdiff.ptr.p_double[j] = (double)(1); } else { tmp.ptr.p_double[j] = 2*xc->ptr.p_double[i]*tmp.ptr.p_double[j-1]-tmp.ptr.p_double[j-2]; tmpdiff.ptr.p_double[j] = 2*(tmp.ptr.p_double[j-1]+xc->ptr.p_double[i]*tmpdiff.ptr.p_double[j-1])-tmpdiff.ptr.p_double[j-2]; } } } if( dc->ptr.p_int[i]==0 ) { ae_v_move(&cmatrix.ptr.pp_double[i][0], 1, &tmp.ptr.p_double[0], 1, ae_v_len(0,m-1)); } if( dc->ptr.p_int[i]==1 ) { ae_v_move(&cmatrix.ptr.pp_double[i][0], 1, &tmpdiff.ptr.p_double[0], 1, ae_v_len(0,m-1)); } cmatrix.ptr.pp_double[i][m] = yc->ptr.p_double[i]; } /* * Solve constrained task */ if( k>0 ) { /* * solve using regularization */ lsfitlinearwc(&y2, &w2, &fmatrix, &cmatrix, n+m, m, k, info, c, rep, _state); } else { /* * no constraints, no regularization needed */ lsfitlinearwc(y, w, &fmatrix, &cmatrix, n, m, 0, info, c, rep, _state); } if( *info<0 ) { ae_frame_leave(_state); return; } ae_frame_leave(_state); } /************************************************************************* Internal Floater-Hormann fitting subroutine for fixed D *************************************************************************/ static void lsfit_barycentricfitwcfixedd(/* Real */ ae_vector* x, /* Real */ ae_vector* y, /* Real */ ae_vector* w, ae_int_t n, /* Real */ ae_vector* xc, /* Real */ ae_vector* yc, /* Integer */ ae_vector* dc, ae_int_t k, ae_int_t m, ae_int_t d, ae_int_t* info, barycentricinterpolant* b, barycentricfitreport* rep, ae_state *_state) { ae_frame _frame_block; ae_vector _x; ae_vector _y; ae_vector _w; ae_vector _xc; ae_vector _yc; ae_matrix fmatrix; ae_matrix cmatrix; ae_vector y2; ae_vector w2; ae_vector sx; ae_vector sy; ae_vector sbf; ae_vector xoriginal; ae_vector yoriginal; ae_vector tmp; lsfitreport lrep; double v0; double v1; double mx; barycentricinterpolant b2; ae_int_t i; ae_int_t j; ae_int_t relcnt; double xa; double xb; double sa; double sb; double decay; ae_frame_make(_state, &_frame_block); ae_vector_init_copy(&_x, x, _state); x = &_x; ae_vector_init_copy(&_y, y, _state); y = &_y; ae_vector_init_copy(&_w, w, _state); w = &_w; ae_vector_init_copy(&_xc, xc, _state); xc = &_xc; ae_vector_init_copy(&_yc, yc, _state); yc = &_yc; *info = 0; _barycentricinterpolant_clear(b); _barycentricfitreport_clear(rep); ae_matrix_init(&fmatrix, 0, 0, DT_REAL, _state); ae_matrix_init(&cmatrix, 0, 0, DT_REAL, _state); ae_vector_init(&y2, 0, DT_REAL, _state); ae_vector_init(&w2, 0, DT_REAL, _state); ae_vector_init(&sx, 0, DT_REAL, _state); ae_vector_init(&sy, 0, DT_REAL, _state); ae_vector_init(&sbf, 0, DT_REAL, _state); ae_vector_init(&xoriginal, 0, DT_REAL, _state); ae_vector_init(&yoriginal, 0, DT_REAL, _state); ae_vector_init(&tmp, 0, DT_REAL, _state); _lsfitreport_init(&lrep, _state); 
_barycentricinterpolant_init(&b2, _state); if( ((n<1||m<2)||k<0)||k>=m ) { *info = -1; ae_frame_leave(_state); return; } for(i=0; i<=k-1; i++) { *info = 0; if( dc->ptr.p_int[i]<0 ) { *info = -1; } if( dc->ptr.p_int[i]>1 ) { *info = -1; } if( *info<0 ) { ae_frame_leave(_state); return; } } /* * weight decay for correct handling of task which becomes * degenerate after constraints are applied */ decay = 10000*ae_machineepsilon; /* * Scale X, Y, XC, YC */ lsfitscalexy(x, y, w, n, xc, yc, dc, k, &xa, &xb, &sa, &sb, &xoriginal, &yoriginal, _state); /* * allocate space, initialize: * * FMatrix- values of basis functions at X[] * * CMatrix- values (derivatives) of basis functions at XC[] */ ae_vector_set_length(&y2, n+m, _state); ae_vector_set_length(&w2, n+m, _state); ae_matrix_set_length(&fmatrix, n+m, m, _state); if( k>0 ) { ae_matrix_set_length(&cmatrix, k, m+1, _state); } ae_vector_set_length(&y2, n+m, _state); ae_vector_set_length(&w2, n+m, _state); /* * Prepare design and constraints matrices: * * fill constraints matrix * * fill first N rows of design matrix with values * * fill next M rows of design matrix with regularizing term * * append M zeros to Y * * append M elements, mean(abs(W)) each, to W */ ae_vector_set_length(&sx, m, _state); ae_vector_set_length(&sy, m, _state); ae_vector_set_length(&sbf, m, _state); for(j=0; j<=m-1; j++) { sx.ptr.p_double[j] = (double)(2*j)/(double)(m-1)-1; } for(i=0; i<=m-1; i++) { sy.ptr.p_double[i] = (double)(1); } barycentricbuildfloaterhormann(&sx, &sy, m, d, &b2, _state); mx = (double)(0); for(i=0; i<=n-1; i++) { lsfit_barycentriccalcbasis(&b2, x->ptr.p_double[i], &sbf, _state); ae_v_move(&fmatrix.ptr.pp_double[i][0], 1, &sbf.ptr.p_double[0], 1, ae_v_len(0,m-1)); y2.ptr.p_double[i] = y->ptr.p_double[i]; w2.ptr.p_double[i] = w->ptr.p_double[i]; mx = mx+ae_fabs(w->ptr.p_double[i], _state)/n; } for(i=0; i<=m-1; i++) { for(j=0; j<=m-1; j++) { if( i==j ) { fmatrix.ptr.pp_double[n+i][j] = decay; } else { fmatrix.ptr.pp_double[n+i][j] = (double)(0); } } y2.ptr.p_double[n+i] = (double)(0); w2.ptr.p_double[n+i] = mx; } if( k>0 ) { for(j=0; j<=m-1; j++) { for(i=0; i<=m-1; i++) { sy.ptr.p_double[i] = (double)(0); } sy.ptr.p_double[j] = (double)(1); barycentricbuildfloaterhormann(&sx, &sy, m, d, &b2, _state); for(i=0; i<=k-1; i++) { ae_assert(dc->ptr.p_int[i]>=0&&dc->ptr.p_int[i]<=1, "BarycentricFit: internal error!", _state); barycentricdiff1(&b2, xc->ptr.p_double[i], &v0, &v1, _state); if( dc->ptr.p_int[i]==0 ) { cmatrix.ptr.pp_double[i][j] = v0; } if( dc->ptr.p_int[i]==1 ) { cmatrix.ptr.pp_double[i][j] = v1; } } } for(i=0; i<=k-1; i++) { cmatrix.ptr.pp_double[i][m] = yc->ptr.p_double[i]; } } /* * Solve constrained task */ if( k>0 ) { /* * solve using regularization */ lsfitlinearwc(&y2, &w2, &fmatrix, &cmatrix, n+m, m, k, info, &tmp, &lrep, _state); } else { /* * no constraints, no regularization needed */ lsfitlinearwc(y, w, &fmatrix, &cmatrix, n, m, k, info, &tmp, &lrep, _state); } if( *info<0 ) { ae_frame_leave(_state); return; } /* * Generate interpolant and scale it */ ae_v_move(&sy.ptr.p_double[0], 1, &tmp.ptr.p_double[0], 1, ae_v_len(0,m-1)); barycentricbuildfloaterhormann(&sx, &sy, m, d, b, _state); barycentriclintransx(b, 2/(xb-xa), -(xa+xb)/(xb-xa), _state); barycentriclintransy(b, sb-sa, sa, _state); /* * Scale absolute errors obtained from LSFitLinearW. 
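* (the fit was performed on ordinates scaled as y' = (y-SA)/(SB-SA), so an absolute error E' reported in the scaled space corresponds to E = E'*(SB-SA) in the original units - hence the multiplication by (SB-SA) below)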
* Relative error should be calculated separately * (because of shifting/scaling of the task) */ rep->taskrcond = lrep.taskrcond; rep->rmserror = lrep.rmserror*(sb-sa); rep->avgerror = lrep.avgerror*(sb-sa); rep->maxerror = lrep.maxerror*(sb-sa); rep->avgrelerror = (double)(0); relcnt = 0; for(i=0; i<=n-1; i++) { if( ae_fp_neq(yoriginal.ptr.p_double[i],(double)(0)) ) { rep->avgrelerror = rep->avgrelerror+ae_fabs(barycentriccalc(b, xoriginal.ptr.p_double[i], _state)-yoriginal.ptr.p_double[i], _state)/ae_fabs(yoriginal.ptr.p_double[i], _state); relcnt = relcnt+1; } } if( relcnt!=0 ) { rep->avgrelerror = rep->avgrelerror/relcnt; } ae_frame_leave(_state); } static void lsfit_clearreport(lsfitreport* rep, ae_state *_state) { rep->taskrcond = (double)(0); rep->iterationscount = 0; rep->varidx = -1; rep->rmserror = (double)(0); rep->avgerror = (double)(0); rep->avgrelerror = (double)(0); rep->maxerror = (double)(0); rep->wrmserror = (double)(0); rep->r2 = (double)(0); ae_matrix_set_length(&rep->covpar, 0, 0, _state); ae_vector_set_length(&rep->errpar, 0, _state); ae_vector_set_length(&rep->errcurve, 0, _state); ae_vector_set_length(&rep->noise, 0, _state); } /************************************************************************* This internal function estimates covariance matrix and other error-related information for linear/nonlinear least squares model. It has a bit awkward interface, but it can be used for both linear and nonlinear problems. INPUT PARAMETERS: F1 - array[0..N-1,0..K-1]: * for linear problems - matrix of function values * for nonlinear problems - Jacobian matrix F0 - array[0..N-1]: * for linear problems - must be filled with zeros * for nonlinear problems - must store values of function being fitted Y - array[0..N-1]: * for linear and nonlinear problems - must store target values W - weights, array[0..N-1]: * for linear and nonlinear problems - weights X - array[0..K-1]: * for linear and nonlinear problems - current solution S - array[0..K-1]: * its components should be strictly positive * squared inverse of this diagonal matrix is used as damping factor for covariance matrix (linear and nonlinear problems) * for nonlinear problems, when scale of the variables is usually explicitly given by user, you may use scale vector for this parameter * for linear problems you may set this parameter to S=sqrt(1/diag(F'*F)) * this parameter is automatically rescaled by this function, only relative magnitudes of its components (with respect to each other) matter. N - number of points, N>0. K - number of dimensions Rep - structure which is used to store results Z - additional matrix which, depending on ZKind, may contain some information used to accelerate calculations - or just can be temporary buffer: * for ZKind=0 Z contains no information, just temporary buffer which can be resized and used as needed * for ZKind=1 Z contains triangular matrix from QR decomposition of W*F1. This matrix can be used to speedup calculation of covariance matrix. It should not be changed by algorithm. ZKind- contents of Z OUTPUT PARAMETERS: * Rep.CovPar covariance matrix for parameters, array[K,K]. * Rep.ErrPar errors in parameters, array[K], errpar = sqrt(diag(CovPar)) * Rep.ErrCurve vector of fit errors - standard deviations of empirical best-fit curve from "ideal" best-fit curve built with infinite number of samples, array[N]. errcurve = sqrt(diag(J*CovPar*J')), where J is Jacobian matrix. 
* Rep.Noise vector of per-point estimates of noise, array[N] * Rep.R2 coefficient of determination (non-weighted) Other fields of Rep are not changed. IMPORTANT: errors in parameters are calculated without taking into account boundary/linear constraints! Presence of constraints changes distribution of errors, but there is no easy way to account for constraints when you calculate covariance matrix. NOTE: noise in the data is estimated as follows: * for fitting without user-supplied weights all points are assumed to have same level of noise, which is estimated from the data * for fitting with user-supplied weights we assume that noise level in I-th point is inversely proportional to Ith weight. Coefficient of proportionality is estimated from the data. NOTE: we apply small amount of regularization when we invert squared Jacobian and calculate covariance matrix. It guarantees that algorithm won't divide by zero during inversion, but skews error estimates a bit (fractional error is about 10^-9). However, we believe that this difference is insignificant for all practical purposes except for the situation when you want to compare ALGLIB results with "reference" implementation up to the last significant digit. -- ALGLIB PROJECT -- Copyright 10.12.2009 by Bochkanov Sergey *************************************************************************/ static void lsfit_estimateerrors(/* Real */ ae_matrix* f1, /* Real */ ae_vector* f0, /* Real */ ae_vector* y, /* Real */ ae_vector* w, /* Real */ ae_vector* x, /* Real */ ae_vector* s, ae_int_t n, ae_int_t k, lsfitreport* rep, /* Real */ ae_matrix* z, ae_int_t zkind, ae_state *_state) { ae_frame _frame_block; ae_vector _s; ae_int_t i; ae_int_t j; ae_int_t j1; double v; double noisec; ae_int_t info; matinvreport invrep; ae_int_t nzcnt; double avg; double rss; double tss; double sz; double ss; ae_frame_make(_state, &_frame_block); ae_vector_init_copy(&_s, s, _state); s = &_s; _matinvreport_init(&invrep, _state); /* * Compute NZCnt - count of non-zero weights */ nzcnt = 0; for(i=0; i<=n-1; i++) { if( ae_fp_neq(w->ptr.p_double[i],(double)(0)) ) { nzcnt = nzcnt+1; } } /* * Compute R2 */ if( nzcnt>0 ) { avg = 0.0; for(i=0; i<=n-1; i++) { if( ae_fp_neq(w->ptr.p_double[i],(double)(0)) ) { avg = avg+y->ptr.p_double[i]; } } avg = avg/nzcnt; rss = 0.0; tss = 0.0; for(i=0; i<=n-1; i++) { if( ae_fp_neq(w->ptr.p_double[i],(double)(0)) ) { v = ae_v_dotproduct(&f1->ptr.pp_double[i][0], 1, &x->ptr.p_double[0], 1, ae_v_len(0,k-1)); v = v+f0->ptr.p_double[i]; rss = rss+ae_sqr(v-y->ptr.p_double[i], _state); tss = tss+ae_sqr(y->ptr.p_double[i]-avg, _state); } } if( ae_fp_neq(tss,(double)(0)) ) { rep->r2 = ae_maxreal(1.0-rss/tss, 0.0, _state); } else { rep->r2 = 1.0; } } else { rep->r2 = (double)(0); } /* * Compute estimate of proportionality between noise in the data and weights: * NoiseC = mean(per-point-noise*per-point-weight) * Noise level (standard deviation) at each point is equal to NoiseC/W[I]. 
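EXAMPLE (illustrative sketch on plain arrays, hypothetical helper, not an
ALGLIB function): the noise model used below. Residuals are multiplied by the
weights, NoiseC is the standard deviation of these scaled residuals with
NZCnt-K degrees of freedom, and the per-point noise estimate is NoiseC/W[I];
zero-weight points get a zero estimate. Uses sqrt() from <math.h>.

    static void estimate_noise(const double *fitted, const double *y,
                               const double *w, int n, int k, double *noise)
    {
        double noisec = 0.0;
        int    i, nz = 0;
        for(i=0; i<n; i++)
            if( w[i]!=0.0 )
            {
                double r = (fitted[i]-y[i])*w[i];   // weighted residual
                noisec += r*r;
                nz++;
            }
        noisec = nz>k ? sqrt(noisec/(nz-k)) : 0.0;  // proportionality constant
        for(i=0; i<n; i++)
            noise[i] = w[i]!=0.0 ? noisec/w[i] : 0.0;
    }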
*/ if( nzcnt>k ) { noisec = 0.0; for(i=0; i<=n-1; i++) { if( ae_fp_neq(w->ptr.p_double[i],(double)(0)) ) { v = ae_v_dotproduct(&f1->ptr.pp_double[i][0], 1, &x->ptr.p_double[0], 1, ae_v_len(0,k-1)); v = v+f0->ptr.p_double[i]; noisec = noisec+ae_sqr((v-y->ptr.p_double[i])*w->ptr.p_double[i], _state); } } noisec = ae_sqrt(noisec/(nzcnt-k), _state); } else { noisec = 0.0; } /* * Two branches on noise level: * * NoiseC>0 normal situation * * NoiseC=0 degenerate case CovPar is filled by zeros */ rmatrixsetlengthatleast(&rep->covpar, k, k, _state); if( ae_fp_greater(noisec,(double)(0)) ) { /* * Normal situation: non-zero noise level */ ae_assert(zkind==0||zkind==1, "LSFit: internal error in EstimateErrors() function", _state); if( zkind==0 ) { /* * Z contains no additional information which can be used to speed up * calculations. We have to calculate covariance matrix on our own: * * Compute scaled Jacobian N*J, where N[i,i]=WCur[I]/NoiseC, store in Z * * Compute Z'*Z, store in CovPar * * Apply moderate regularization to CovPar and compute matrix inverse. * In case inverse failed, increase regularization parameter and try * again. */ rmatrixsetlengthatleast(z, n, k, _state); for(i=0; i<=n-1; i++) { v = w->ptr.p_double[i]/noisec; ae_v_moved(&z->ptr.pp_double[i][0], 1, &f1->ptr.pp_double[i][0], 1, ae_v_len(0,k-1), v); } /* * Convert S to automatically scaled damped matrix: * * calculate SZ - sum of diagonal elements of Z'*Z * * calculate SS - sum of diagonal elements of S^(-2) * * overwrite S by (SZ/SS)*S^(-2) * * now S has approximately same magnitude as giagonal of Z'*Z */ sz = (double)(0); for(i=0; i<=n-1; i++) { for(j=0; j<=k-1; j++) { sz = sz+z->ptr.pp_double[i][j]*z->ptr.pp_double[i][j]; } } if( ae_fp_eq(sz,(double)(0)) ) { sz = (double)(1); } ss = (double)(0); for(j=0; j<=k-1; j++) { ss = ss+1/ae_sqr(s->ptr.p_double[j], _state); } for(j=0; j<=k-1; j++) { s->ptr.p_double[j] = sz/ss/ae_sqr(s->ptr.p_double[j], _state); } /* * Calculate damped inverse inv(Z'*Z+S). * We increase damping factor V until Z'*Z become well-conditioned. */ v = 1.0E3*ae_machineepsilon; do { rmatrixsyrk(k, n, 1.0, z, 0, 0, 2, 0.0, &rep->covpar, 0, 0, ae_true, _state); for(i=0; i<=k-1; i++) { rep->covpar.ptr.pp_double[i][i] = rep->covpar.ptr.pp_double[i][i]+v*s->ptr.p_double[i]; } spdmatrixinverse(&rep->covpar, k, ae_true, &info, &invrep, _state); v = 10*v; } while(info<=0); for(i=0; i<=k-1; i++) { for(j=i+1; j<=k-1; j++) { rep->covpar.ptr.pp_double[j][i] = rep->covpar.ptr.pp_double[i][j]; } } } if( zkind==1 ) { /* * We can reuse additional information: * * Z contains R matrix from QR decomposition of W*F1 * * After multiplication by 1/NoiseC we get Z_mod = N*F1, where diag(N)=w[i]/NoiseC * * Such triangular Z_mod is a Cholesky factor from decomposition of J'*N'*N*J. * Thus, we can calculate covariance matrix as inverse of the matrix given by * its Cholesky decomposition. It allow us to avoid time-consuming calculation * of J'*N'*N*J in CovPar - complexity is reduced from O(N*K^2) to O(K^3), which * is quite good because K is usually orders of magnitude smaller than N. 
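EXAMPLE (illustrative sketch, hypothetical helpers, not ALGLIB functions):
both branches of this function use the same strategy - add V*S to the
diagonal, try to factorize, and multiply V by 10 until the factorization
succeeds. A minimal stand-alone version with an in-place lower Cholesky
factorization, assuming row-major K*K matrices and sqrt() from <math.h>:

    static int try_cholesky(double *a, int k)        // 1 on success, 0 if not SPD
    {
        int i, j, p;
        for(j=0; j<k; j++)
        {
            double d = a[j*k+j];
            for(p=0; p<j; p++) d -= a[j*k+p]*a[j*k+p];
            if( d<=0.0 ) return 0;
            a[j*k+j] = sqrt(d);
            for(i=j+1; i<k; i++)
            {
                double s = a[i*k+j];
                for(p=0; p<j; p++) s -= a[i*k+p]*a[j*k+p];
                a[i*k+j] = s/a[j*k+j];
            }
        }
        return 1;
    }

    static void damped_cholesky(const double *ata, const double *s, double *l, int k)
    {
        double v = 1.0E3*2.2E-16;                    // roughly 1000*machine epsilon
        int i, j;
        do
        {
            for(i=0; i<k; i++)
                for(j=0; j<k; j++)
                    l[i*k+j] = ata[i*k+j] + (i==j ? v*s[i] : 0.0);
            v = 10*v;                                // increase damping on failure
        }
        while( !try_cholesky(l, k) );
    }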
* * First, convert S to automatically scaled damped matrix: * * calculate SZ - sum of magnitudes of diagonal elements of Z/NoiseC * * calculate SS - sum of diagonal elements of S^(-1) * * overwrite S by (SZ/SS)*S^(-1) * * now S has approximately same magnitude as giagonal of Z'*Z */ sz = (double)(0); for(j=0; j<=k-1; j++) { sz = sz+ae_fabs(z->ptr.pp_double[j][j]/noisec, _state); } if( ae_fp_eq(sz,(double)(0)) ) { sz = (double)(1); } ss = (double)(0); for(j=0; j<=k-1; j++) { ss = ss+1/s->ptr.p_double[j]; } for(j=0; j<=k-1; j++) { s->ptr.p_double[j] = sz/ss/s->ptr.p_double[j]; } /* * Calculate damped inverse of inv((Z+v*S)'*(Z+v*S)) * We increase damping factor V until matrix become well-conditioned. */ v = 1.0E3*ae_machineepsilon; do { for(i=0; i<=k-1; i++) { for(j=i; j<=k-1; j++) { rep->covpar.ptr.pp_double[i][j] = z->ptr.pp_double[i][j]/noisec; } rep->covpar.ptr.pp_double[i][i] = rep->covpar.ptr.pp_double[i][i]+v*s->ptr.p_double[i]; } spdmatrixcholeskyinverse(&rep->covpar, k, ae_true, &info, &invrep, _state); v = 10*v; } while(info<=0); for(i=0; i<=k-1; i++) { for(j=i+1; j<=k-1; j++) { rep->covpar.ptr.pp_double[j][i] = rep->covpar.ptr.pp_double[i][j]; } } } } else { /* * Degenerate situation: zero noise level, covariance matrix is zero. */ for(i=0; i<=k-1; i++) { for(j=0; j<=k-1; j++) { rep->covpar.ptr.pp_double[j][i] = (double)(0); } } } /* * Estimate erorrs in parameters, curve and per-point noise */ rvectorsetlengthatleast(&rep->errpar, k, _state); rvectorsetlengthatleast(&rep->errcurve, n, _state); rvectorsetlengthatleast(&rep->noise, n, _state); for(i=0; i<=k-1; i++) { rep->errpar.ptr.p_double[i] = ae_sqrt(rep->covpar.ptr.pp_double[i][i], _state); } for(i=0; i<=n-1; i++) { /* * ErrCurve[I] is sqrt(P[i,i]) where P=J*CovPar*J' */ v = 0.0; for(j=0; j<=k-1; j++) { for(j1=0; j1<=k-1; j1++) { v = v+f1->ptr.pp_double[i][j]*rep->covpar.ptr.pp_double[j][j1]*f1->ptr.pp_double[i][j1]; } } rep->errcurve.ptr.p_double[i] = ae_sqrt(v, _state); /* * Noise[i] is filled using weights and current estimate of noise level */ if( ae_fp_neq(w->ptr.p_double[i],(double)(0)) ) { rep->noise.ptr.p_double[i] = noisec/w->ptr.p_double[i]; } else { rep->noise.ptr.p_double[i] = (double)(0); } } ae_frame_leave(_state); } void _polynomialfitreport_init(void* _p, ae_state *_state) { polynomialfitreport *p = (polynomialfitreport*)_p; ae_touch_ptr((void*)p); } void _polynomialfitreport_init_copy(void* _dst, void* _src, ae_state *_state) { polynomialfitreport *dst = (polynomialfitreport*)_dst; polynomialfitreport *src = (polynomialfitreport*)_src; dst->taskrcond = src->taskrcond; dst->rmserror = src->rmserror; dst->avgerror = src->avgerror; dst->avgrelerror = src->avgrelerror; dst->maxerror = src->maxerror; } void _polynomialfitreport_clear(void* _p) { polynomialfitreport *p = (polynomialfitreport*)_p; ae_touch_ptr((void*)p); } void _polynomialfitreport_destroy(void* _p) { polynomialfitreport *p = (polynomialfitreport*)_p; ae_touch_ptr((void*)p); } void _barycentricfitreport_init(void* _p, ae_state *_state) { barycentricfitreport *p = (barycentricfitreport*)_p; ae_touch_ptr((void*)p); } void _barycentricfitreport_init_copy(void* _dst, void* _src, ae_state *_state) { barycentricfitreport *dst = (barycentricfitreport*)_dst; barycentricfitreport *src = (barycentricfitreport*)_src; dst->taskrcond = src->taskrcond; dst->dbest = src->dbest; dst->rmserror = src->rmserror; dst->avgerror = src->avgerror; dst->avgrelerror = src->avgrelerror; dst->maxerror = src->maxerror; } void _barycentricfitreport_clear(void* _p) { 
barycentricfitreport *p = (barycentricfitreport*)_p; ae_touch_ptr((void*)p); } void _barycentricfitreport_destroy(void* _p) { barycentricfitreport *p = (barycentricfitreport*)_p; ae_touch_ptr((void*)p); } void _spline1dfitreport_init(void* _p, ae_state *_state) { spline1dfitreport *p = (spline1dfitreport*)_p; ae_touch_ptr((void*)p); } void _spline1dfitreport_init_copy(void* _dst, void* _src, ae_state *_state) { spline1dfitreport *dst = (spline1dfitreport*)_dst; spline1dfitreport *src = (spline1dfitreport*)_src; dst->taskrcond = src->taskrcond; dst->rmserror = src->rmserror; dst->avgerror = src->avgerror; dst->avgrelerror = src->avgrelerror; dst->maxerror = src->maxerror; } void _spline1dfitreport_clear(void* _p) { spline1dfitreport *p = (spline1dfitreport*)_p; ae_touch_ptr((void*)p); } void _spline1dfitreport_destroy(void* _p) { spline1dfitreport *p = (spline1dfitreport*)_p; ae_touch_ptr((void*)p); } void _lsfitreport_init(void* _p, ae_state *_state) { lsfitreport *p = (lsfitreport*)_p; ae_touch_ptr((void*)p); ae_matrix_init(&p->covpar, 0, 0, DT_REAL, _state); ae_vector_init(&p->errpar, 0, DT_REAL, _state); ae_vector_init(&p->errcurve, 0, DT_REAL, _state); ae_vector_init(&p->noise, 0, DT_REAL, _state); } void _lsfitreport_init_copy(void* _dst, void* _src, ae_state *_state) { lsfitreport *dst = (lsfitreport*)_dst; lsfitreport *src = (lsfitreport*)_src; dst->taskrcond = src->taskrcond; dst->iterationscount = src->iterationscount; dst->varidx = src->varidx; dst->rmserror = src->rmserror; dst->avgerror = src->avgerror; dst->avgrelerror = src->avgrelerror; dst->maxerror = src->maxerror; dst->wrmserror = src->wrmserror; ae_matrix_init_copy(&dst->covpar, &src->covpar, _state); ae_vector_init_copy(&dst->errpar, &src->errpar, _state); ae_vector_init_copy(&dst->errcurve, &src->errcurve, _state); ae_vector_init_copy(&dst->noise, &src->noise, _state); dst->r2 = src->r2; } void _lsfitreport_clear(void* _p) { lsfitreport *p = (lsfitreport*)_p; ae_touch_ptr((void*)p); ae_matrix_clear(&p->covpar); ae_vector_clear(&p->errpar); ae_vector_clear(&p->errcurve); ae_vector_clear(&p->noise); } void _lsfitreport_destroy(void* _p) { lsfitreport *p = (lsfitreport*)_p; ae_touch_ptr((void*)p); ae_matrix_destroy(&p->covpar); ae_vector_destroy(&p->errpar); ae_vector_destroy(&p->errcurve); ae_vector_destroy(&p->noise); } void _lsfitstate_init(void* _p, ae_state *_state) { lsfitstate *p = (lsfitstate*)_p; ae_touch_ptr((void*)p); ae_vector_init(&p->s, 0, DT_REAL, _state); ae_vector_init(&p->bndl, 0, DT_REAL, _state); ae_vector_init(&p->bndu, 0, DT_REAL, _state); ae_matrix_init(&p->taskx, 0, 0, DT_REAL, _state); ae_vector_init(&p->tasky, 0, DT_REAL, _state); ae_vector_init(&p->taskw, 0, DT_REAL, _state); ae_matrix_init(&p->cleic, 0, 0, DT_REAL, _state); ae_vector_init(&p->x, 0, DT_REAL, _state); ae_vector_init(&p->c, 0, DT_REAL, _state); ae_vector_init(&p->g, 0, DT_REAL, _state); ae_matrix_init(&p->h, 0, 0, DT_REAL, _state); ae_vector_init(&p->wcur, 0, DT_REAL, _state); ae_vector_init(&p->tmpct, 0, DT_INT, _state); ae_vector_init(&p->tmp, 0, DT_REAL, _state); ae_vector_init(&p->tmpf, 0, DT_REAL, _state); ae_matrix_init(&p->tmpjac, 0, 0, DT_REAL, _state); ae_matrix_init(&p->tmpjacw, 0, 0, DT_REAL, _state); _matinvreport_init(&p->invrep, _state); _lsfitreport_init(&p->rep, _state); _minlmstate_init(&p->optstate, _state); _minlmreport_init(&p->optrep, _state); _rcommstate_init(&p->rstate, _state); } void _lsfitstate_init_copy(void* _dst, void* _src, ae_state *_state) { lsfitstate *dst = (lsfitstate*)_dst; lsfitstate *src = 
(lsfitstate*)_src; dst->optalgo = src->optalgo; dst->m = src->m; dst->k = src->k; dst->epsx = src->epsx; dst->maxits = src->maxits; dst->stpmax = src->stpmax; dst->xrep = src->xrep; ae_vector_init_copy(&dst->s, &src->s, _state); ae_vector_init_copy(&dst->bndl, &src->bndl, _state); ae_vector_init_copy(&dst->bndu, &src->bndu, _state); ae_matrix_init_copy(&dst->taskx, &src->taskx, _state); ae_vector_init_copy(&dst->tasky, &src->tasky, _state); dst->npoints = src->npoints; ae_vector_init_copy(&dst->taskw, &src->taskw, _state); dst->nweights = src->nweights; dst->wkind = src->wkind; dst->wits = src->wits; dst->diffstep = src->diffstep; dst->teststep = src->teststep; ae_matrix_init_copy(&dst->cleic, &src->cleic, _state); dst->nec = src->nec; dst->nic = src->nic; dst->xupdated = src->xupdated; dst->needf = src->needf; dst->needfg = src->needfg; dst->needfgh = src->needfgh; dst->pointindex = src->pointindex; ae_vector_init_copy(&dst->x, &src->x, _state); ae_vector_init_copy(&dst->c, &src->c, _state); dst->f = src->f; ae_vector_init_copy(&dst->g, &src->g, _state); ae_matrix_init_copy(&dst->h, &src->h, _state); ae_vector_init_copy(&dst->wcur, &src->wcur, _state); ae_vector_init_copy(&dst->tmpct, &src->tmpct, _state); ae_vector_init_copy(&dst->tmp, &src->tmp, _state); ae_vector_init_copy(&dst->tmpf, &src->tmpf, _state); ae_matrix_init_copy(&dst->tmpjac, &src->tmpjac, _state); ae_matrix_init_copy(&dst->tmpjacw, &src->tmpjacw, _state); dst->tmpnoise = src->tmpnoise; _matinvreport_init_copy(&dst->invrep, &src->invrep, _state); dst->repiterationscount = src->repiterationscount; dst->repterminationtype = src->repterminationtype; dst->repvaridx = src->repvaridx; dst->reprmserror = src->reprmserror; dst->repavgerror = src->repavgerror; dst->repavgrelerror = src->repavgrelerror; dst->repmaxerror = src->repmaxerror; dst->repwrmserror = src->repwrmserror; _lsfitreport_init_copy(&dst->rep, &src->rep, _state); _minlmstate_init_copy(&dst->optstate, &src->optstate, _state); _minlmreport_init_copy(&dst->optrep, &src->optrep, _state); dst->prevnpt = src->prevnpt; dst->prevalgo = src->prevalgo; _rcommstate_init_copy(&dst->rstate, &src->rstate, _state); } void _lsfitstate_clear(void* _p) { lsfitstate *p = (lsfitstate*)_p; ae_touch_ptr((void*)p); ae_vector_clear(&p->s); ae_vector_clear(&p->bndl); ae_vector_clear(&p->bndu); ae_matrix_clear(&p->taskx); ae_vector_clear(&p->tasky); ae_vector_clear(&p->taskw); ae_matrix_clear(&p->cleic); ae_vector_clear(&p->x); ae_vector_clear(&p->c); ae_vector_clear(&p->g); ae_matrix_clear(&p->h); ae_vector_clear(&p->wcur); ae_vector_clear(&p->tmpct); ae_vector_clear(&p->tmp); ae_vector_clear(&p->tmpf); ae_matrix_clear(&p->tmpjac); ae_matrix_clear(&p->tmpjacw); _matinvreport_clear(&p->invrep); _lsfitreport_clear(&p->rep); _minlmstate_clear(&p->optstate); _minlmreport_clear(&p->optrep); _rcommstate_clear(&p->rstate); } void _lsfitstate_destroy(void* _p) { lsfitstate *p = (lsfitstate*)_p; ae_touch_ptr((void*)p); ae_vector_destroy(&p->s); ae_vector_destroy(&p->bndl); ae_vector_destroy(&p->bndu); ae_matrix_destroy(&p->taskx); ae_vector_destroy(&p->tasky); ae_vector_destroy(&p->taskw); ae_matrix_destroy(&p->cleic); ae_vector_destroy(&p->x); ae_vector_destroy(&p->c); ae_vector_destroy(&p->g); ae_matrix_destroy(&p->h); ae_vector_destroy(&p->wcur); ae_vector_destroy(&p->tmpct); ae_vector_destroy(&p->tmp); ae_vector_destroy(&p->tmpf); ae_matrix_destroy(&p->tmpjac); ae_matrix_destroy(&p->tmpjacw); _matinvreport_destroy(&p->invrep); _lsfitreport_destroy(&p->rep); _minlmstate_destroy(&p->optstate); 
_minlmreport_destroy(&p->optrep); _rcommstate_destroy(&p->rstate); } /************************************************************************* This function creates RBF model for a scalar (NY=1) or vector (NY>1) function in a NX-dimensional space (NX=2 or NX=3). INPUT PARAMETERS: NX - dimension of the space, NX=2 or NX=3 NY - function dimension, NY>=1 OUTPUT PARAMETERS: S - RBF model (initially equals to zero) -- ALGLIB -- Copyright 13.12.2011 by Bochkanov Sergey *************************************************************************/ void rbfv2create(ae_int_t nx, ae_int_t ny, rbfv2model* s, ae_state *_state) { ae_int_t i; ae_int_t j; _rbfv2model_clear(s); ae_assert(nx>=1, "RBFCreate: NX<1", _state); ae_assert(ny>=1, "RBFCreate: NY<1", _state); /* * Serializable parameters */ s->nx = nx; s->ny = ny; s->bf = 0; s->nh = 0; ae_matrix_set_length(&s->v, ny, nx+1, _state); for(i=0; i<=ny-1; i++) { for(j=0; j<=nx; j++) { s->v.ptr.pp_double[i][j] = (double)(0); } } /* * Non-serializable parameters */ s->lambdareg = rbfv2_defaultlambdareg; s->maxits = rbfv2_defaultmaxits; s->supportr = rbfv2_defaultsupportr; s->basisfunction = rbfv2_defaultbf; } /************************************************************************* This function creates buffer structure which can be used to perform parallel RBF model evaluations (with one RBF model instance being used from multiple threads, as long as different threads use different instances of buffer). This buffer object can be used with rbftscalcbuf() function (here "ts" stands for "thread-safe", "buf" is a suffix which denotes function which reuses previously allocated output space). How to use it: * create RBF model structure with rbfcreate() * load data, tune parameters * call rbfbuildmodel() * call rbfcreatecalcbuffer(), once per thread working with RBF model (you should call this function only AFTER call to rbfbuildmodel(), see below for more information) * call rbftscalcbuf() from different threads, with each thread working with its own copy of buffer object. INPUT PARAMETERS S - RBF model OUTPUT PARAMETERS Buf - external buffer. IMPORTANT: buffer object should be used only with RBF model object which was used to initialize buffer. Any attempt to use buffer with different object is dangerous - you may get memory violation error because sizes of internal arrays do not fit to dimensions of RBF structure. IMPORTANT: you should call this function only for model which was built with rbfbuildmodel() function, after successful invocation of rbfbuildmodel(). Sizes of some internal structures are determined only after model is built, so buffer object created before model construction stage will be useless (and any attempt to use it will result in exception). -- ALGLIB -- Copyright 02.04.2016 by Sergey Bochkanov *************************************************************************/ void rbfv2createcalcbuffer(rbfv2model* s, rbfv2calcbuffer* buf, ae_state *_state) { _rbfv2calcbuffer_clear(buf); rbfv2_allocatecalcbuffer(s, buf, _state); } /************************************************************************* This function builds hierarchical RBF model. INPUT PARAMETERS: X - array[N,S.NX], X-values Y - array[N,S.NY], Y-values ScaleVec- array[S.NX], vector of per-dimension scales N - points count ATerm - linear term type, 1 for linear, 2 for constant, 3 for zero. NH - hierarchy height RBase - base RBF radius BF - basis function type: 0 for Gaussian, 1 for compact LambdaNS- non-smoothness penalty coefficient. 
Exactly zero value means that no penalty is applied, and even system matrix does not contain penalty-related rows. Value of 1 means Nonnegative-whether model is nonnegatively constrained or not. Depending on presence of nonnegativity constraint different solvers are used: layerwise LSQR for unconstrained model, single-step BLEIC for constrained models. MaxNNIts- iteration count for BLEIC solver; ignored for Nonnegative=False. S - RBF model, initialized by RBFCreate() call. OUTPUT PARAMETERS: S - updated model (for rep.terminationtype>0, unchanged otherwise) Rep - report: * Rep.TerminationType: * -5 - non-distinct basis function centers were detected, interpolation aborted * -4 - nonconvergence of the internal SVD solver * 1 - successful termination Fields are used for debugging purposes: * Rep.IterationsCount - iterations count of the LSQR solver * Rep.NMV - number of matrix-vector products * Rep.ARows - rows count for the system matrix * Rep.ACols - columns count for the system matrix * Rep.ANNZ - number of significantly non-zero elements (elements above some algorithm-determined threshold) NOTE: failure to build model will leave current state of the structure unchanged. -- ALGLIB -- Copyright 20.06.2016 by Bochkanov Sergey *************************************************************************/ void rbfv2buildhierarchical(/* Real */ ae_matrix* x, /* Real */ ae_matrix* y, ae_int_t n, /* Real */ ae_vector* scalevec, ae_int_t aterm, ae_int_t nh, double rbase, double lambdans, ae_bool nonnegative, ae_int_t nnmaxits, rbfv2model* s, rbfv2report* rep, ae_state *_state) { ae_frame _frame_block; ae_int_t nx; ae_int_t ny; ae_int_t bf; ae_matrix rhs; ae_matrix residualy; ae_matrix v; ae_int_t rowsperpoint; ae_vector hidx; ae_vector xr; ae_vector ri; ae_vector kdroots; ae_vector kdnodes; ae_vector kdsplits; ae_vector kdboxmin; ae_vector kdboxmax; ae_vector cw; ae_vector cwrange; ae_matrix curxy; ae_int_t curn; ae_int_t nbasis; kdtree curtree; kdtree globaltree; ae_vector x0; ae_vector x1; ae_vector tags; ae_vector dist; ae_vector nncnt; ae_vector rowsizes; ae_vector diagata; ae_vector bndl; ae_vector bndu; ae_int_t i; ae_int_t j; ae_int_t k; ae_int_t k2; ae_int_t levelidx; ae_int_t offsi; ae_int_t offsj; double val; double criticalr; ae_int_t cnt; double avgdiagata; linlsqrstate linstate; linlsqrreport lsqrrep; sparsematrix sparseacrs; ae_vector densew1; ae_vector denseb1; rbfv2calcbuffer calcbuf; ae_vector vr2; ae_vector voffs; ae_vector rowindexes; ae_vector rowvals; double penalty; ae_frame_make(_state, &_frame_block); _rbfv2report_clear(rep); ae_matrix_init(&rhs, 0, 0, DT_REAL, _state); ae_matrix_init(&residualy, 0, 0, DT_REAL, _state); ae_matrix_init(&v, 0, 0, DT_REAL, _state); ae_vector_init(&hidx, 0, DT_INT, _state); ae_vector_init(&xr, 0, DT_REAL, _state); ae_vector_init(&ri, 0, DT_REAL, _state); ae_vector_init(&kdroots, 0, DT_INT, _state); ae_vector_init(&kdnodes, 0, DT_INT, _state); ae_vector_init(&kdsplits, 0, DT_REAL, _state); ae_vector_init(&kdboxmin, 0, DT_REAL, _state); ae_vector_init(&kdboxmax, 0, DT_REAL, _state); ae_vector_init(&cw, 0, DT_REAL, _state); ae_vector_init(&cwrange, 0, DT_INT, _state); ae_matrix_init(&curxy, 0, 0, DT_REAL, _state); _kdtree_init(&curtree, _state); _kdtree_init(&globaltree, _state); ae_vector_init(&x0, 0, DT_REAL, _state); ae_vector_init(&x1, 0, DT_REAL, _state); ae_vector_init(&tags, 0, DT_INT, _state); ae_vector_init(&dist, 0, DT_REAL, _state); ae_vector_init(&nncnt, 0, DT_INT, _state); ae_vector_init(&rowsizes, 0, DT_INT, _state); 
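/*************************************************************************
Illustrative sketch (hypothetical helper, not an ALGLIB function): the
schedule used by the hierarchical builder below. Per-level radii start at
RBase and halve at every level, the non-smoothness penalty is scaled by
RBase^2 so that it does not depend on the scaling of the variables, and a
non-zero penalty adds NX additional penalty rows per dataset point.
*************************************************************************/
static void hierarchical_schedule(double rbase, int nh, double lambdans, int nx,
                                  double *ri, double *penalty, int *rowsperpoint)
{
    int level;
    double r = rbase;
    for(level=0; level<nh; level++)
    {
        ri[level] = r;                    /* RBase, RBase/2, RBase/4, ... */
        r = 0.5*r;
    }
    *penalty = lambdans*rbase*rbase;      /* LambdaNS*RBase^2 */
    *rowsperpoint = lambdans==0.0 ? 1 : 1+nx;
}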
ae_vector_init(&diagata, 0, DT_REAL, _state); ae_vector_init(&bndl, 0, DT_REAL, _state); ae_vector_init(&bndu, 0, DT_REAL, _state); _linlsqrstate_init(&linstate, _state); _linlsqrreport_init(&lsqrrep, _state); _sparsematrix_init(&sparseacrs, _state); ae_vector_init(&densew1, 0, DT_REAL, _state); ae_vector_init(&denseb1, 0, DT_REAL, _state); _rbfv2calcbuffer_init(&calcbuf, _state); ae_vector_init(&vr2, 0, DT_REAL, _state); ae_vector_init(&voffs, 0, DT_INT, _state); ae_vector_init(&rowindexes, 0, DT_INT, _state); ae_vector_init(&rowvals, 0, DT_REAL, _state); ae_assert(s->nx>0, "RBFV2BuildHierarchical: incorrect NX", _state); ae_assert(s->ny>0, "RBFV2BuildHierarchical: incorrect NY", _state); ae_assert(!nonnegative||nnmaxits>0, "RBFV2BuildHierarchical: incorrect NNMaxIts", _state); ae_assert(ae_fp_greater_eq(lambdans,(double)(0)), "RBFV2BuildHierarchical: incorrect LambdaNS", _state); for(j=0; j<=s->nx-1; j++) { ae_assert(ae_fp_greater(scalevec->ptr.p_double[j],(double)(0)), "RBFV2BuildHierarchical: incorrect ScaleVec", _state); } nx = s->nx; ny = s->ny; bf = s->basisfunction; ae_assert(bf==0||bf==1, "RBFV2BuildHierarchical: incorrect BF", _state); /* * Quick exit when we have no points */ if( n==0 ) { rep->terminationtype = 1; rep->maxerror = (double)(0); rep->rmserror = (double)(0); s->bf = bf; s->nh = 0; ae_vector_set_length(&s->ri, 0, _state); ae_vector_set_length(&s->s, 0, _state); ae_vector_set_length(&s->kdroots, 0, _state); ae_vector_set_length(&s->kdnodes, 0, _state); ae_vector_set_length(&s->kdsplits, 0, _state); ae_vector_set_length(&s->kdboxmin, 0, _state); ae_vector_set_length(&s->kdboxmax, 0, _state); ae_vector_set_length(&s->cw, 0, _state); ae_matrix_set_length(&s->v, ny, nx+1, _state); for(i=0; i<=ny-1; i++) { for(j=0; j<=nx; j++) { s->v.ptr.pp_double[i][j] = (double)(0); } } ae_frame_leave(_state); return; } /* * First model in a sequence - linear model. * Residuals from linear regression are stored in the ResidualY variable * (used later to build RBF models). */ ae_matrix_set_length(&residualy, n, ny, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=ny-1; j++) { residualy.ptr.pp_double[i][j] = y->ptr.pp_double[i][j]; } } if( !rbfv2_rbfv2buildlinearmodel(x, &residualy, n, nx, ny, aterm, &v, _state) ) { rep->terminationtype = -5; ae_frame_leave(_state); return; } /* * Handle special case: multilayer model with NLayers=0. * Quick exit. */ if( nh==0 ) { rep->terminationtype = 1; s->bf = bf; s->nh = 0; ae_vector_set_length(&s->ri, 0, _state); ae_vector_set_length(&s->s, 0, _state); ae_vector_set_length(&s->kdroots, 0, _state); ae_vector_set_length(&s->kdnodes, 0, _state); ae_vector_set_length(&s->kdsplits, 0, _state); ae_vector_set_length(&s->kdboxmin, 0, _state); ae_vector_set_length(&s->kdboxmax, 0, _state); ae_vector_set_length(&s->cw, 0, _state); ae_matrix_set_length(&s->v, ny, nx+1, _state); for(i=0; i<=ny-1; i++) { for(j=0; j<=nx; j++) { s->v.ptr.pp_double[i][j] = v.ptr.pp_double[i][j]; } } rep->maxerror = (double)(0); rep->rmserror = (double)(0); for(i=0; i<=n-1; i++) { for(j=0; j<=ny-1; j++) { rep->maxerror = ae_maxreal(rep->maxerror, ae_fabs(residualy.ptr.pp_double[i][j], _state), _state); rep->rmserror = rep->rmserror+ae_sqr(residualy.ptr.pp_double[i][j], _state); } } rep->rmserror = ae_sqrt(rep->rmserror/(n*ny), _state); ae_frame_leave(_state); return; } /* * Penalty coefficient is set to LambdaNS*RBase^2. 
* * We use such normalization because VALUES of radial basis * functions have roughly unit magnitude, but their DERIVATIVES * are (roughly) inversely proportional to the radius. Thus, * without additional scaling, regularization coefficient * looses invariancy w.r.t. scaling of variables. */ if( ae_fp_eq(lambdans,(double)(0)) ) { rowsperpoint = 1; } else { /* * NOTE: simplified penalty function is used, which does not provide rotation invariance */ rowsperpoint = 1+nx; } penalty = lambdans*ae_sqr(rbase, _state); /* * Prepare temporary structures */ ae_matrix_set_length(&rhs, n*rowsperpoint, ny, _state); ae_matrix_set_length(&curxy, n, nx+ny, _state); ae_vector_set_length(&x0, nx, _state); ae_vector_set_length(&x1, nx, _state); ae_vector_set_length(&tags, n, _state); ae_vector_set_length(&dist, n, _state); ae_vector_set_length(&vr2, n, _state); ae_vector_set_length(&voffs, n, _state); ae_vector_set_length(&nncnt, n, _state); ae_vector_set_length(&rowsizes, n*rowsperpoint, _state); ae_vector_set_length(&denseb1, n*rowsperpoint, _state); for(i=0; i<=n*rowsperpoint-1; i++) { for(j=0; j<=ny-1; j++) { rhs.ptr.pp_double[i][j] = (double)(0); } } for(i=0; i<=n-1; i++) { for(j=0; j<=nx-1; j++) { curxy.ptr.pp_double[i][j] = x->ptr.pp_double[i][j]/scalevec->ptr.p_double[j]; } for(j=0; j<=ny-1; j++) { rhs.ptr.pp_double[i*rowsperpoint][j] = residualy.ptr.pp_double[i][j]; } tags.ptr.p_int[i] = i; } kdtreebuildtagged(&curxy, &tags, n, nx, 0, 2, &globaltree, _state); /* * Generate sequence of layer radii. * Prepare assignment of different levels to points. */ ae_assert(n>0, "RBFV2BuildHierarchical: integrity check failed", _state); ae_vector_set_length(&ri, nh, _state); for(levelidx=0; levelidx<=nh-1; levelidx++) { ri.ptr.p_double[levelidx] = rbase*ae_pow((double)(2), (double)(-levelidx), _state); } ae_vector_set_length(&hidx, n, _state); ae_vector_set_length(&xr, n, _state); for(i=0; i<=n-1; i++) { hidx.ptr.p_int[i] = nh; xr.ptr.p_double[i] = ae_maxrealnumber; ae_assert(ae_fp_greater(xr.ptr.p_double[i],ri.ptr.p_double[0]), "RBFV2BuildHierarchical: integrity check failed", _state); } for(levelidx=0; levelidx<=nh-1; levelidx++) { /* * Scan dataset points, for each such point that distance to nearest * "support" point is larger than SupportR*Ri[LevelIdx] we: * * set distance of current point to 0 (it is support now) and update HIdx * * perform R-NN request with radius SupportR*Ri[LevelIdx] * * for each point in request update its distance */ criticalr = s->supportr*ri.ptr.p_double[levelidx]; for(i=0; i<=n-1; i++) { if( ae_fp_greater(xr.ptr.p_double[i],criticalr) ) { /* * Mark point as support */ ae_assert(hidx.ptr.p_int[i]==nh, "RBFV2BuildHierarchical: integrity check failed", _state); hidx.ptr.p_int[i] = levelidx; xr.ptr.p_double[i] = (double)(0); /* * Update neighbors */ for(j=0; j<=nx-1; j++) { x0.ptr.p_double[j] = x->ptr.pp_double[i][j]/scalevec->ptr.p_double[j]; } k = kdtreequeryrnn(&globaltree, &x0, criticalr, ae_true, _state); kdtreequeryresultstags(&globaltree, &tags, _state); kdtreequeryresultsdistances(&globaltree, &dist, _state); for(j=0; j<=k-1; j++) { xr.ptr.p_double[tags.ptr.p_int[j]] = ae_minreal(xr.ptr.p_double[tags.ptr.p_int[j]], dist.ptr.p_double[j], _state); } } } } /* * Build multitree (with zero weights) according to hierarchy. * * NOTE: this code assumes that during every iteration kdNodes, * kdSplits and CW have size which EXACTLY fits their * contents, and that these variables are resized at each * iteration when we add new hierarchical model. 
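EXAMPLE (illustrative brute-force sketch, hypothetical helper; the actual
code above uses a kd-tree for the neighbor queries): the level assignment
performed above. A point becomes a center of level L when its distance to
the nearest center selected so far exceeds SupportR*Ri[L]; remaining points
then update their distance to the nearest center. Squared distances are used
throughout, so no <math.h> is needed. X is row-major array[N,NX], HIdx
receives the level of every point (NH for points that never become centers),
XR2 is an array[N] workspace.

    static void assign_levels(const double *x, int n, int nx,
                              const double *ri, int nh, double supportr,
                              int *hidx, double *xr2)
    {
        int i, j, k, level;
        for(i=0; i<n; i++) { hidx[i] = nh; xr2[i] = 1.0E300; }
        for(level=0; level<nh; level++)
        {
            double criticalr2 = supportr*ri[level]*supportr*ri[level];
            for(i=0; i<n; i++)
                if( xr2[i]>criticalr2 )
                {
                    hidx[i] = level;                       // point I is a center of this level
                    xr2[i] = 0.0;
                    for(j=0; j<n; j++)                     // update distances of the rest
                    {
                        double d2 = 0.0;
                        for(k=0; k<nx; k++)
                            d2 += (x[j*nx+k]-x[i*nx+k])*(x[j*nx+k]-x[i*nx+k]);
                        if( d2<xr2[j] ) xr2[j] = d2;
                    }
                }
        }
    }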
*/ ae_vector_set_length(&kdroots, nh+1, _state); ae_vector_set_length(&kdnodes, 0, _state); ae_vector_set_length(&kdsplits, 0, _state); ae_vector_set_length(&kdboxmin, nx, _state); ae_vector_set_length(&kdboxmax, nx, _state); ae_vector_set_length(&cw, 0, _state); ae_vector_set_length(&cwrange, nh+1, _state); kdtreeexplorebox(&globaltree, &kdboxmin, &kdboxmax, _state); cwrange.ptr.p_int[0] = 0; for(levelidx=0; levelidx<=nh-1; levelidx++) { /* * Prepare radius and root offset */ kdroots.ptr.p_int[levelidx] = kdnodes.cnt; /* * Generate LevelIdx-th tree and append to multi-tree */ curn = 0; for(i=0; i<=n-1; i++) { if( hidx.ptr.p_int[i]<=levelidx ) { for(j=0; j<=nx-1; j++) { curxy.ptr.pp_double[curn][j] = x->ptr.pp_double[i][j]/scalevec->ptr.p_double[j]; } for(j=0; j<=ny-1; j++) { curxy.ptr.pp_double[curn][nx+j] = (double)(0); } inc(&curn, _state); } } ae_assert(curn>0, "RBFV2BuildHierarchical: integrity check failed", _state); kdtreebuild(&curxy, curn, nx, ny, 2, &curtree, _state); rbfv2_convertandappendtree(&curtree, curn, nx, ny, &kdnodes, &kdsplits, &cw, _state); /* * Fill entry of CWRange (we assume that length of CW exactly fits its actual size) */ cwrange.ptr.p_int[levelidx+1] = cw.cnt; } kdroots.ptr.p_int[nh] = kdnodes.cnt; /* * Prepare buffer and scaled dataset */ rbfv2_allocatecalcbuffer(s, &calcbuf, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=nx-1; j++) { curxy.ptr.pp_double[i][j] = x->ptr.pp_double[i][j]/scalevec->ptr.p_double[j]; } } /* * Depending on presence of nonnegativity constraint, use different solvers */ if( !nonnegative ) { /* * Build unconstrained model with LSQR solver, applied layer by layer */ for(levelidx=0; levelidx<=nh-1; levelidx++) { /* * Generate A - matrix of basis functions (near radius is used) * * NOTE: AvgDiagATA is average value of diagonal element of A^T*A. * It is used to calculate value of Tikhonov regularization * coefficient. 
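/*************************************************************************
Illustrative sketch (hypothetical helper, not an ALGLIB function): the
quantity described in the note above. AvgDiagATA is the sum of squares of
all design-matrix entries divided by the number of basis functions, i.e. the
average diagonal element of A^T*A, and the LSQR damping used below is
sqrt(LambdaReg*AvgDiagATA), which keeps the Tikhonov term proportional to
the typical magnitude of the normal matrix. Uses sqrt() from <math.h>.
*************************************************************************/
static double lsqr_damping(const double *a, int rows, int nbasis, double lambdareg)
{
    double avgdiagata = 0.0;
    int i, j;
    for(i=0; i<rows; i++)
        for(j=0; j<nbasis; j++)
            avgdiagata += a[i*nbasis+j]*a[i*nbasis+j];
    avgdiagata = avgdiagata/nbasis;
    return sqrt(lambdareg*avgdiagata);
}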
*/ nbasis = (cwrange.ptr.p_int[levelidx+1]-cwrange.ptr.p_int[levelidx])/(nx+ny); ae_assert(cwrange.ptr.p_int[levelidx+1]-cwrange.ptr.p_int[levelidx]==nbasis*(nx+ny), "Assertion failed", _state); for(i=0; i<=n-1; i++) { for(j=0; j<=nx-1; j++) { x0.ptr.p_double[j] = curxy.ptr.pp_double[i][j]; } cnt = rbfv2_designmatrixrowsize(&kdnodes, &kdsplits, &cw, &ri, &kdroots, &kdboxmin, &kdboxmax, nx, ny, nh, levelidx, rbfv2nearradius(bf, _state), &x0, &calcbuf, _state); nncnt.ptr.p_int[i] = cnt; for(j=0; j<=rowsperpoint-1; j++) { rowsizes.ptr.p_int[i*rowsperpoint+j] = cnt; } } ivectorsetlengthatleast(&rowindexes, nbasis, _state); rvectorsetlengthatleast(&rowvals, nbasis*rowsperpoint, _state); sparsecreatecrsbuf(n*rowsperpoint, nbasis, &rowsizes, &sparseacrs, _state); avgdiagata = 0.0; for(i=0; i<=n-1; i++) { for(j=0; j<=nx-1; j++) { x0.ptr.p_double[j] = curxy.ptr.pp_double[i][j]; } rbfv2_designmatrixgeneraterow(&kdnodes, &kdsplits, &cw, &ri, &kdroots, &kdboxmin, &kdboxmax, &cwrange, nx, ny, nh, levelidx, bf, rbfv2nearradius(bf, _state), rowsperpoint, penalty, &x0, &calcbuf, &vr2, &voffs, &rowindexes, &rowvals, &cnt, _state); ae_assert(cnt==nncnt.ptr.p_int[i], "RBFV2BuildHierarchical: integrity check failed", _state); for(k=0; k<=rowsperpoint-1; k++) { for(j=0; j<=cnt-1; j++) { val = rowvals.ptr.p_double[j*rowsperpoint+k]; sparseset(&sparseacrs, i*rowsperpoint+k, rowindexes.ptr.p_int[j], val, _state); avgdiagata = avgdiagata+ae_sqr(val, _state); } } } avgdiagata = avgdiagata/nbasis; /* * solve */ linlsqrcreate(n*rowsperpoint, nbasis, &linstate, _state); linlsqrsetcond(&linstate, 0.0, 0.0, coalescei(s->maxits, rbfv2_defaultmaxits, _state), _state); linlsqrsetlambdai(&linstate, ae_sqrt(s->lambdareg*avgdiagata, _state), _state); for(j=0; j<=ny-1; j++) { for(i=0; i<=n*rowsperpoint-1; i++) { denseb1.ptr.p_double[i] = rhs.ptr.pp_double[i][j]; } linlsqrsolvesparse(&linstate, &sparseacrs, &denseb1, _state); linlsqrresults(&linstate, &densew1, &lsqrrep, _state); ae_assert(lsqrrep.terminationtype>0, "RBFV2BuildHierarchical: integrity check failed", _state); for(i=0; i<=nbasis-1; i++) { offsi = cwrange.ptr.p_int[levelidx]+(nx+ny)*i; cw.ptr.p_double[offsi+nx+j] = densew1.ptr.p_double[i]; } } /* * Update residuals (far radius is used) */ for(i=0; i<=n-1; i++) { for(j=0; j<=nx-1; j++) { x0.ptr.p_double[j] = curxy.ptr.pp_double[i][j]; } rbfv2_designmatrixgeneraterow(&kdnodes, &kdsplits, &cw, &ri, &kdroots, &kdboxmin, &kdboxmax, &cwrange, nx, ny, nh, levelidx, bf, rbfv2farradius(bf, _state), rowsperpoint, penalty, &x0, &calcbuf, &vr2, &voffs, &rowindexes, &rowvals, &cnt, _state); for(j=0; j<=cnt-1; j++) { offsj = cwrange.ptr.p_int[levelidx]+(nx+ny)*rowindexes.ptr.p_int[j]+nx; for(k=0; k<=rowsperpoint-1; k++) { val = rowvals.ptr.p_double[j*rowsperpoint+k]; for(k2=0; k2<=ny-1; k2++) { rhs.ptr.pp_double[i*rowsperpoint+k][k2] = rhs.ptr.pp_double[i*rowsperpoint+k][k2]-val*cw.ptr.p_double[offsj+k2]; } } } } } } else { ae_assert(ae_false, "RBFV2: unsupported mode", _state); } /* * Model is built. * * Copy local variables by swapping, global ones (ScaleVec) are copied * explicitly. 
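/*************************************************************************
Illustrative sketch (hypothetical helper; the dense form below ignores the
sparsity and the near/far radius bookkeeping of the actual code): the
residual update performed at the end of every level above. After the weights
W of the current level are found, that level's prediction is subtracted from
the right-hand side, so that the next, finer level only fits what is still
unexplained. Phi is the row-major array[N,NBasis] of basis-function values
at the dataset points.
*************************************************************************/
static void subtract_level_prediction(double *rhs, const double *phi,
                                      const double *w, int n, int nbasis)
{
    int i, j;
    for(i=0; i<n; i++)
        for(j=0; j<nbasis; j++)
            rhs[i] -= phi[i*nbasis+j]*w[j];
}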
*/ s->bf = bf; s->nh = nh; ae_swap_vectors(&s->ri, &ri); ae_swap_vectors(&s->kdroots, &kdroots); ae_swap_vectors(&s->kdnodes, &kdnodes); ae_swap_vectors(&s->kdsplits, &kdsplits); ae_swap_vectors(&s->kdboxmin, &kdboxmin); ae_swap_vectors(&s->kdboxmax, &kdboxmax); ae_swap_vectors(&s->cw, &cw); ae_swap_matrices(&s->v, &v); ae_vector_set_length(&s->s, nx, _state); for(i=0; i<=nx-1; i++) { s->s.ptr.p_double[i] = scalevec->ptr.p_double[i]; } rep->terminationtype = 1; /* * Calculate maximum and RMS errors */ rep->maxerror = (double)(0); rep->rmserror = (double)(0); for(i=0; i<=n-1; i++) { for(j=0; j<=ny-1; j++) { rep->maxerror = ae_maxreal(rep->maxerror, ae_fabs(rhs.ptr.pp_double[i*rowsperpoint][j], _state), _state); rep->rmserror = rep->rmserror+ae_sqr(rhs.ptr.pp_double[i*rowsperpoint][j], _state); } } rep->rmserror = ae_sqrt(rep->rmserror/(n*ny), _state); ae_frame_leave(_state); } /************************************************************************* Serializer: allocation -- ALGLIB -- Copyright 02.02.2012 by Bochkanov Sergey *************************************************************************/ void rbfv2alloc(ae_serializer* s, rbfv2model* model, ae_state *_state) { /* * Data */ ae_serializer_alloc_entry(s); ae_serializer_alloc_entry(s); ae_serializer_alloc_entry(s); ae_serializer_alloc_entry(s); allocrealarray(s, &model->ri, -1, _state); allocrealarray(s, &model->s, -1, _state); allocintegerarray(s, &model->kdroots, -1, _state); allocintegerarray(s, &model->kdnodes, -1, _state); allocrealarray(s, &model->kdsplits, -1, _state); allocrealarray(s, &model->kdboxmin, -1, _state); allocrealarray(s, &model->kdboxmax, -1, _state); allocrealarray(s, &model->cw, -1, _state); allocrealmatrix(s, &model->v, -1, -1, _state); } /************************************************************************* Serializer: serialization -- ALGLIB -- Copyright 02.02.2012 by Bochkanov Sergey *************************************************************************/ void rbfv2serialize(ae_serializer* s, rbfv2model* model, ae_state *_state) { /* * Data */ ae_serializer_serialize_int(s, model->nx, _state); ae_serializer_serialize_int(s, model->ny, _state); ae_serializer_serialize_int(s, model->nh, _state); ae_serializer_serialize_int(s, model->bf, _state); serializerealarray(s, &model->ri, -1, _state); serializerealarray(s, &model->s, -1, _state); serializeintegerarray(s, &model->kdroots, -1, _state); serializeintegerarray(s, &model->kdnodes, -1, _state); serializerealarray(s, &model->kdsplits, -1, _state); serializerealarray(s, &model->kdboxmin, -1, _state); serializerealarray(s, &model->kdboxmax, -1, _state); serializerealarray(s, &model->cw, -1, _state); serializerealmatrix(s, &model->v, -1, -1, _state); } /************************************************************************* Serializer: unserialization -- ALGLIB -- Copyright 02.02.2012 by Bochkanov Sergey *************************************************************************/ void rbfv2unserialize(ae_serializer* s, rbfv2model* model, ae_state *_state) { ae_int_t nx; ae_int_t ny; _rbfv2model_clear(model); /* * Unserialize primary model parameters, initialize model. * * It is necessary to call RBFCreate() because some internal fields * which are NOT unserialized will need initialization. 
*/ ae_serializer_unserialize_int(s, &nx, _state); ae_serializer_unserialize_int(s, &ny, _state); rbfv2create(nx, ny, model, _state); ae_serializer_unserialize_int(s, &model->nh, _state); ae_serializer_unserialize_int(s, &model->bf, _state); unserializerealarray(s, &model->ri, _state); unserializerealarray(s, &model->s, _state); unserializeintegerarray(s, &model->kdroots, _state); unserializeintegerarray(s, &model->kdnodes, _state); unserializerealarray(s, &model->kdsplits, _state); unserializerealarray(s, &model->kdboxmin, _state); unserializerealarray(s, &model->kdboxmax, _state); unserializerealarray(s, &model->cw, _state); unserializerealmatrix(s, &model->v, _state); } /************************************************************************* Returns far radius for basis function type *************************************************************************/ double rbfv2farradius(ae_int_t bf, ae_state *_state) { double result; result = (double)(1); if( bf==0 ) { result = 5.0; } if( bf==1 ) { result = (double)(3); } return result; } /************************************************************************* Returns near radius for basis function type *************************************************************************/ double rbfv2nearradius(ae_int_t bf, ae_state *_state) { double result; result = (double)(1); if( bf==0 ) { result = 3.0; } if( bf==1 ) { result = (double)(3); } return result; } /************************************************************************* Returns basis function value. Assumes that D2>=0 *************************************************************************/ double rbfv2basisfunc(ae_int_t bf, double d2, ae_state *_state) { double v; double result; result = (double)(0); if( bf==0 ) { result = ae_exp(-d2, _state); return result; } if( bf==1 ) { /* * if D2<3: * Exp(1)*Exp(-D2)*Exp(-1/(1-D2/9)) * else: * 0 */ v = 1-d2/9; if( ae_fp_less_eq(v,(double)(0)) ) { result = (double)(0); return result; } result = 2.718281828459045*ae_exp(-d2, _state)*ae_exp(-1/v, _state); return result; } ae_assert(ae_false, "RBFV2BasisFunc: unknown BF type", _state); return result; } /************************************************************************* Returns basis function value, first and second derivatives Assumes that D2>=0 *************************************************************************/ void rbfv2basisfuncdiff2(ae_int_t bf, double d2, double* f, double* df, double* d2f, ae_state *_state) { double v; *f = 0; *df = 0; *d2f = 0; if( bf==0 ) { *f = ae_exp(-d2, _state); *df = -*f; *d2f = *f; return; } if( bf==1 ) { /* * if D2<3: * F = Exp(1)*Exp(-D2)*Exp(-1/(1-D2/9)) * dF = -F * [pow(D2/9-1,-2)/9 + 1] * d2F = -dF * [pow(D2/9-1,-2)/9 + 1] + F*(2/81)*pow(D2/9-1,-3) * else: * 0 */ v = 1-d2/9; if( ae_fp_less_eq(v,(double)(0)) ) { *f = (double)(0); *df = (double)(0); *d2f = (double)(0); return; } *f = ae_exp((double)(1), _state)*ae_exp(-d2, _state)*ae_exp(-1/v, _state); *df = -*f*(1/(9*v*v)+1); *d2f = -*df*(1/(9*v*v)+1)+*f*((double)2/(double)81)/(v*v*v); return; } ae_assert(ae_false, "RBFV2BasisFuncDiff2: unknown BF type", _state); } /************************************************************************* This function calculates values of the RBF model in the given point. This function should be used when we have NY=1 (scalar function) and NX=1 (1-dimensional space). 
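EXAMPLE (illustrative sketch, hypothetical helpers, not ALGLIB functions):
the compactly supported basis function of RBFV2BasisFunc above (BF=1),
restated as stand-alone routines together with its derivative with respect
to the squared distance D2, plus a finite-difference check of that
derivative. Uses exp() and fabs() from <math.h>.

    static double compact_bf(double d2)
    {
        double v = 1-d2/9;
        if( v<=0.0 )
            return 0.0;
        return 2.718281828459045*exp(-d2)*exp(-1/v);      // Exp(1)*Exp(-D2)*Exp(-1/(1-D2/9))
    }

    static double compact_bf_d(double d2)                 // d/d(D2), as in RBFV2BasisFuncDiff2
    {
        double v = 1-d2/9;
        if( v<=0.0 )
            return 0.0;
        return -compact_bf(d2)*(1/(9*v*v)+1);
    }

    // check: fabs(compact_bf_d(2.0)-(compact_bf(2.0+1.0E-6)-compact_bf(2.0-1.0E-6))/2.0E-6)
    // should be small (well below 1.0E-6), confirming the analytic derivative.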
This function returns 0.0 when: * model is not initialized * NX<>1 *NY<>1 INPUT PARAMETERS: S - RBF model X0 - X-coordinate, finite number RESULT: value of the model or 0.0 (as defined above) -- ALGLIB -- Copyright 13.12.2011 by Bochkanov Sergey *************************************************************************/ double rbfv2calc1(rbfv2model* s, double x0, ae_state *_state) { double result; ae_assert(ae_isfinite(x0, _state), "RBFCalc1: invalid value for X0 (X0 is Inf)!", _state); if( s->ny!=1||s->nx!=1 ) { result = (double)(0); return result; } result = s->v.ptr.pp_double[0][0]*x0-s->v.ptr.pp_double[0][1]; if( s->nh==0 ) { return result; } rbfv2_allocatecalcbuffer(s, &s->calcbuf, _state); s->calcbuf.x123.ptr.p_double[0] = x0; rbfv2tscalcbuf(s, &s->calcbuf, &s->calcbuf.x123, &s->calcbuf.y123, _state); result = s->calcbuf.y123.ptr.p_double[0]; return result; } /************************************************************************* This function calculates values of the RBF model in the given point. This function should be used when we have NY=1 (scalar function) and NX=2 (2-dimensional space). If you have 3-dimensional space, use RBFCalc3(). If you have general situation (NX-dimensional space, NY-dimensional function) you should use general, less efficient implementation RBFCalc(). If you want to calculate function values many times, consider using RBFGridCalc2(), which is far more efficient than many subsequent calls to RBFCalc2(). This function returns 0.0 when: * model is not initialized * NX<>2 *NY<>1 INPUT PARAMETERS: S - RBF model X0 - first coordinate, finite number X1 - second coordinate, finite number RESULT: value of the model or 0.0 (as defined above) -- ALGLIB -- Copyright 13.12.2011 by Bochkanov Sergey *************************************************************************/ double rbfv2calc2(rbfv2model* s, double x0, double x1, ae_state *_state) { double result; ae_assert(ae_isfinite(x0, _state), "RBFCalc2: invalid value for X0 (X0 is Inf)!", _state); ae_assert(ae_isfinite(x1, _state), "RBFCalc2: invalid value for X1 (X1 is Inf)!", _state); if( s->ny!=1||s->nx!=2 ) { result = (double)(0); return result; } result = s->v.ptr.pp_double[0][0]*x0+s->v.ptr.pp_double[0][1]*x1+s->v.ptr.pp_double[0][2]; if( s->nh==0 ) { return result; } rbfv2_allocatecalcbuffer(s, &s->calcbuf, _state); s->calcbuf.x123.ptr.p_double[0] = x0; s->calcbuf.x123.ptr.p_double[1] = x1; rbfv2tscalcbuf(s, &s->calcbuf, &s->calcbuf.x123, &s->calcbuf.y123, _state); result = s->calcbuf.y123.ptr.p_double[0]; return result; } /************************************************************************* This function calculates values of the RBF model in the given point. This function should be used when we have NY=1 (scalar function) and NX=3 (3-dimensional space). If you have 2-dimensional space, use RBFCalc2(). If you have general situation (NX-dimensional space, NY-dimensional function) you should use general, less efficient implementation RBFCalc(). 
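EXAMPLE (illustrative sketch, hypothetical helper): the linear term used by
RBFV2TsCalcBuf below; the 2D and 3D Calc functions expand the same sum
explicitly. For a coefficient matrix V of size NY x (NX+1), the linear part
of the I-th output is V[I][NX] plus the sum over J of V[I][J]*X[J]; the
hierarchical RBF correction is then added on top of this value.

    static void linear_term(const double *v, const double *x, int nx, int ny, double *y)
    {
        int i, j;
        for(i=0; i<ny; i++)
        {
            y[i] = v[i*(nx+1)+nx];                 // constant coefficient
            for(j=0; j<nx; j++)
                y[i] += v[i*(nx+1)+j]*x[j];        // linear coefficients
        }
    }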
This function returns 0.0 when: * model is not initialized * NX<>3 *NY<>1 INPUT PARAMETERS: S - RBF model X0 - first coordinate, finite number X1 - second coordinate, finite number X2 - third coordinate, finite number RESULT: value of the model or 0.0 (as defined above) -- ALGLIB -- Copyright 13.12.2011 by Bochkanov Sergey *************************************************************************/ double rbfv2calc3(rbfv2model* s, double x0, double x1, double x2, ae_state *_state) { double result; ae_assert(ae_isfinite(x0, _state), "RBFCalc3: invalid value for X0 (X0 is Inf or NaN)!", _state); ae_assert(ae_isfinite(x1, _state), "RBFCalc3: invalid value for X1 (X1 is Inf or NaN)!", _state); ae_assert(ae_isfinite(x2, _state), "RBFCalc3: invalid value for X2 (X2 is Inf or NaN)!", _state); if( s->ny!=1||s->nx!=3 ) { result = (double)(0); return result; } result = s->v.ptr.pp_double[0][0]*x0+s->v.ptr.pp_double[0][1]*x1+s->v.ptr.pp_double[0][2]*x2+s->v.ptr.pp_double[0][3]; if( s->nh==0 ) { return result; } rbfv2_allocatecalcbuffer(s, &s->calcbuf, _state); s->calcbuf.x123.ptr.p_double[0] = x0; s->calcbuf.x123.ptr.p_double[1] = x1; s->calcbuf.x123.ptr.p_double[2] = x2; rbfv2tscalcbuf(s, &s->calcbuf, &s->calcbuf.x123, &s->calcbuf.y123, _state); result = s->calcbuf.y123.ptr.p_double[0]; return result; } /************************************************************************* This function calculates values of the RBF model at the given point. Same as RBFCalc(), but does not reallocate Y when in is large enough to store function values. INPUT PARAMETERS: S - RBF model X - coordinates, array[NX]. X may have more than NX elements, in this case only leading NX will be used. Y - possibly preallocated array OUTPUT PARAMETERS: Y - function value, array[NY]. Y is not reallocated when it is larger than NY. -- ALGLIB -- Copyright 13.12.2011 by Bochkanov Sergey *************************************************************************/ void rbfv2calcbuf(rbfv2model* s, /* Real */ ae_vector* x, /* Real */ ae_vector* y, ae_state *_state) { rbfv2tscalcbuf(s, &s->calcbuf, x, y, _state); } /************************************************************************* This function calculates values of the RBF model at the given point, using external buffer object (internal temporaries of RBF model are not modified). This function allows to use same RBF model object in different threads, assuming that different threads use different instances of buffer structure. INPUT PARAMETERS: S - RBF model, may be shared between different threads Buf - buffer object created for this particular instance of RBF model with rbfcreatecalcbuffer(). X - coordinates, array[NX]. X may have more than NX elements, in this case only leading NX will be used. Y - possibly preallocated array OUTPUT PARAMETERS: Y - function value, array[NY]. Y is not reallocated when it is larger than NY. 
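EXAMPLE (usage sketch, illustrative only): it follows the workflow described
in the rbfcreatecalcbuffer() comments above, using the C++ wrapper names
mentioned there (rbfcreatecalcbuffer, rbftscalcbuf). The exact wrapper
signatures, the header name and the parallel_eval() helper are assumptions of
this sketch, not part of ALGLIB documentation.

    #include <thread>
    #include "interpolation.h"
    using namespace alglib;

    // evaluate a built RBF model at many points from two threads, with one
    // evaluation buffer per thread; the model itself is shared read-only
    void parallel_eval(rbfmodel &model, const real_2d_array &pts, real_1d_array *out)
    {
        auto worker = [&](int lo, int hi)
        {
            rbfcalcbuffer buf;
            rbfcreatecalcbuffer(model, buf);        // only after rbfbuildmodel()
            real_1d_array x, y;
            x.setlength(pts.cols());
            for(int i=lo; i<hi; i++)
            {
                for(int j=0; j<pts.cols(); j++)
                    x[j] = pts[i][j];
                rbftscalcbuf(model, buf, x, y);     // thread-safe evaluation
                out[i] = y;
            }
        };
        std::thread t0(worker, 0,                 (int)pts.rows()/2);
        std::thread t1(worker, (int)pts.rows()/2, (int)pts.rows());
        t0.join();
        t1.join();
    }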
-- ALGLIB -- Copyright 13.12.2011 by Bochkanov Sergey *************************************************************************/ void rbfv2tscalcbuf(rbfv2model* s, rbfv2calcbuffer* buf, /* Real */ ae_vector* x, /* Real */ ae_vector* y, ae_state *_state) { ae_int_t i; ae_int_t j; ae_int_t levelidx; double rcur; double rquery2; double invrc2; ae_int_t nx; ae_int_t ny; ae_assert(x->cnt>=s->nx, "RBFCalcBuf: Length(X)nx, _state), "RBFCalcBuf: X contains infinite or NaN values", _state); nx = s->nx; ny = s->ny; /* * Handle linear term */ if( y->cntptr.p_double[i] = s->v.ptr.pp_double[i][nx]; for(j=0; j<=nx-1; j++) { y->ptr.p_double[i] = y->ptr.p_double[i]+s->v.ptr.pp_double[i][j]*x->ptr.p_double[j]; } } if( s->nh==0 ) { return; } /* * Handle nonlinear term */ rbfv2_allocatecalcbuffer(s, buf, _state); for(j=0; j<=nx-1; j++) { buf->x.ptr.p_double[j] = x->ptr.p_double[j]/s->s.ptr.p_double[j]; } for(levelidx=0; levelidx<=s->nh-1; levelidx++) { /* * Prepare fields of Buf required by PartialCalcRec() */ buf->curdist2 = (double)(0); for(j=0; j<=nx-1; j++) { buf->curboxmin.ptr.p_double[j] = s->kdboxmin.ptr.p_double[j]; buf->curboxmax.ptr.p_double[j] = s->kdboxmax.ptr.p_double[j]; if( ae_fp_less(buf->x.ptr.p_double[j],buf->curboxmin.ptr.p_double[j]) ) { buf->curdist2 = buf->curdist2+ae_sqr(buf->curboxmin.ptr.p_double[j]-buf->x.ptr.p_double[j], _state); } else { if( ae_fp_greater(buf->x.ptr.p_double[j],buf->curboxmax.ptr.p_double[j]) ) { buf->curdist2 = buf->curdist2+ae_sqr(buf->x.ptr.p_double[j]-buf->curboxmax.ptr.p_double[j], _state); } } } /* * Call PartialCalcRec() */ rcur = s->ri.ptr.p_double[levelidx]; invrc2 = 1/(rcur*rcur); rquery2 = ae_sqr(rcur*rbfv2farradius(s->bf, _state), _state); rbfv2_partialcalcrec(s, buf, s->kdroots.ptr.p_int[levelidx], invrc2, rquery2, &buf->x, y, _state); } } /************************************************************************* This function calculates values of the RBF model at the regular grid. Grid have N0*N1 points, with Point[I,J] = (X0[I], X1[J]) This function returns 0.0 when: * model is not initialized * NX<>2 *NY<>1 INPUT PARAMETERS: S - RBF model X0 - array of grid nodes, first coordinates, array[N0] N0 - grid size (number of nodes) in the first dimension X1 - array of grid nodes, second coordinates, array[N1] N1 - grid size (number of nodes) in the second dimension OUTPUT PARAMETERS: Y - function values, array[N0,N1]. Y is out-variable and is reallocated by this function. NOTE: as a special exception, this function supports unordered arrays X0 and X1. However, future versions may be more efficient for X0/X1 ordered by ascending. 
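EXAMPLE (illustrative sketch, hypothetical helper): how results of the
gridded functions are laid out. Node (I,J) of the N0 x N1 grid corresponds
to the point (X0[I], X1[J]), and in the flat output vector used by
RBFV2GridCalcVX below its NY values start at offset NY*(I+J*N0);
RBFV2GridCalc2 then unpacks this vector into the two-dimensional array Y.

    static int grid_value_offset(int i, int j, int n0, int ny)
    {
        return ny*(i+j*n0);     // subscript I changes fastest, then J
    }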
-- ALGLIB -- Copyright 13.12.2011 by Bochkanov Sergey *************************************************************************/ void rbfv2gridcalc2(rbfv2model* s, /* Real */ ae_vector* x0, ae_int_t n0, /* Real */ ae_vector* x1, ae_int_t n1, /* Real */ ae_matrix* y, ae_state *_state) { ae_frame _frame_block; ae_vector cpx0; ae_vector cpx1; ae_vector dummyx2; ae_vector dummyx3; ae_vector dummyflag; ae_vector p01; ae_vector p11; ae_vector p2; ae_vector vy; ae_int_t i; ae_int_t j; ae_frame_make(_state, &_frame_block); ae_matrix_clear(y); ae_vector_init(&cpx0, 0, DT_REAL, _state); ae_vector_init(&cpx1, 0, DT_REAL, _state); ae_vector_init(&dummyx2, 0, DT_REAL, _state); ae_vector_init(&dummyx3, 0, DT_REAL, _state); ae_vector_init(&dummyflag, 0, DT_BOOL, _state); ae_vector_init(&p01, 0, DT_INT, _state); ae_vector_init(&p11, 0, DT_INT, _state); ae_vector_init(&p2, 0, DT_INT, _state); ae_vector_init(&vy, 0, DT_REAL, _state); ae_assert(n0>0, "RBFGridCalc2: invalid value for N0 (N0<=0)!", _state); ae_assert(n1>0, "RBFGridCalc2: invalid value for N1 (N1<=0)!", _state); ae_assert(x0->cnt>=n0, "RBFGridCalc2: Length(X0)cnt>=n1, "RBFGridCalc2: Length(X1)ptr.pp_double[i][j] = (double)(0); } } if( s->ny!=1||s->nx!=2 ) { ae_frame_leave(_state); return; } /* *create and sort arrays */ ae_vector_set_length(&cpx0, n0, _state); for(i=0; i<=n0-1; i++) { cpx0.ptr.p_double[i] = x0->ptr.p_double[i]; } tagsort(&cpx0, n0, &p01, &p2, _state); ae_vector_set_length(&cpx1, n1, _state); for(i=0; i<=n1-1; i++) { cpx1.ptr.p_double[i] = x1->ptr.p_double[i]; } tagsort(&cpx1, n1, &p11, &p2, _state); ae_vector_set_length(&dummyx2, 1, _state); dummyx2.ptr.p_double[0] = (double)(0); ae_vector_set_length(&dummyx3, 1, _state); dummyx3.ptr.p_double[0] = (double)(0); ae_vector_set_length(&vy, n0*n1, _state); rbfv2gridcalcvx(s, &cpx0, n0, &cpx1, n1, &dummyx2, 1, &dummyx3, 1, &dummyflag, ae_false, &vy, _state); for(i=0; i<=n0-1; i++) { for(j=0; j<=n1-1; j++) { y->ptr.pp_double[i][j] = vy.ptr.p_double[i+j*n0]; } } ae_frame_leave(_state); } /************************************************************************* This function is used to perform gridded calculation for 2D, 3D or 4D problems. It accepts parameters X0...X3 and counters N0...N3. If RBF model has dimensionality less than 4, corresponding arrays should contain just one element equal to zero, and corresponding N's should be equal to 1. NOTE: array Y should be preallocated by caller. 
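EXAMPLE (illustrative sketch, hypothetical helper): the node ordering assumed
by the flat output array Y. The grid has N0*N1*N2*N3 nodes, the subscript of
dimension 0 changes fastest, and a model with fewer than 4 dimensions passes
a single zero node and a count of 1 for each unused dimension, as the
two-dimensional wrapper above does.

    static void decode_node(int idx, int n0, int n1, int n2,
                            int *i0, int *i1, int *i2, int *i3)
    {
        *i0 = idx%n0; idx = idx/n0;
        *i1 = idx%n1; idx = idx/n1;
        *i2 = idx%n2; idx = idx/n2;
        *i3 = idx;
    }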
-- ALGLIB -- Copyright 12.07.2016 by Bochkanov Sergey *************************************************************************/ void rbfv2gridcalcvx(rbfv2model* s, /* Real */ ae_vector* x0, ae_int_t n0, /* Real */ ae_vector* x1, ae_int_t n1, /* Real */ ae_vector* x2, ae_int_t n2, /* Real */ ae_vector* x3, ae_int_t n3, /* Boolean */ ae_vector* flagy, ae_bool sparsey, /* Real */ ae_vector* y, ae_state *_state) { ae_frame _frame_block; ae_int_t nx; ae_int_t ny; ae_int_t i; ae_int_t j; ae_int_t k; ae_vector tx; ae_vector ty; ae_vector z; ae_int_t dstoffs; ae_int_t dummy; rbfv2gridcalcbuffer bufseedv2; ae_shared_pool bufpool; ae_int_t rowidx; ae_int_t rowcnt; double v; double rcur; ae_int_t levelidx; double searchradius2; ae_int_t ntrials; double avgfuncpernode; hqrndstate rs; ae_vector blocks0; ae_vector blocks1; ae_vector blocks2; ae_vector blocks3; ae_int_t blockscnt0; ae_int_t blockscnt1; ae_int_t blockscnt2; ae_int_t blockscnt3; double blockwidth0; double blockwidth1; double blockwidth2; double blockwidth3; ae_int_t maxblocksize; ae_frame_make(_state, &_frame_block); ae_vector_init(&tx, 0, DT_REAL, _state); ae_vector_init(&ty, 0, DT_REAL, _state); ae_vector_init(&z, 0, DT_REAL, _state); _rbfv2gridcalcbuffer_init(&bufseedv2, _state); ae_shared_pool_init(&bufpool, _state); _hqrndstate_init(&rs, _state); ae_vector_init(&blocks0, 0, DT_INT, _state); ae_vector_init(&blocks1, 0, DT_INT, _state); ae_vector_init(&blocks2, 0, DT_INT, _state); ae_vector_init(&blocks3, 0, DT_INT, _state); nx = s->nx; ny = s->ny; hqrndseed(532, 54734, &rs, _state); /* * Perform integrity checks */ ae_assert(s->nx==2||s->nx==3, "RBFGridCalcVX: integrity check failed", _state); ae_assert(s->nx>=4||((x3->cnt>=1&&ae_fp_eq(x3->ptr.p_double[0],(double)(0)))&&n3==1), "RBFGridCalcVX: integrity check failed", _state); ae_assert(s->nx>=3||((x2->cnt>=1&&ae_fp_eq(x2->ptr.p_double[0],(double)(0)))&&n2==1), "RBFGridCalcVX: integrity check failed", _state); ae_assert(s->nx>=2||((x1->cnt>=1&&ae_fp_eq(x1->ptr.p_double[0],(double)(0)))&&n1==1), "RBFGridCalcVX: integrity check failed", _state); /* * Allocate arrays */ ae_assert(s->nx<=4, "RBFGridCalcVX: integrity check failed", _state); ae_vector_set_length(&z, ny, _state); ae_vector_set_length(&tx, 4, _state); ae_vector_set_length(&ty, ny, _state); /* * Calculate linear term */ rowcnt = n1*n2*n3; for(rowidx=0; rowidx<=rowcnt-1; rowidx++) { /* * Calculate TX - current position */ k = rowidx; tx.ptr.p_double[0] = (double)(0); tx.ptr.p_double[1] = x1->ptr.p_double[k%n1]; k = k/n1; tx.ptr.p_double[2] = x2->ptr.p_double[k%n2]; k = k/n2; tx.ptr.p_double[3] = x3->ptr.p_double[k%n3]; k = k/n3; ae_assert(k==0, "RBFGridCalcVX: integrity check failed", _state); for(j=0; j<=ny-1; j++) { v = s->v.ptr.pp_double[j][nx]; for(k=1; k<=nx-1; k++) { v = v+tx.ptr.p_double[k]*s->v.ptr.pp_double[j][k]; } z.ptr.p_double[j] = v; } for(i=0; i<=n0-1; i++) { dstoffs = ny*(rowidx*n0+i); if( sparsey&&!flagy->ptr.p_bool[rowidx*n0+i] ) { for(j=0; j<=ny-1; j++) { y->ptr.p_double[j+dstoffs] = (double)(0); } continue; } v = x0->ptr.p_double[i]; for(j=0; j<=ny-1; j++) { y->ptr.p_double[j+dstoffs] = z.ptr.p_double[j]+v*s->v.ptr.pp_double[j][0]; } } } if( s->nh==0 ) { ae_frame_leave(_state); return; } /* * Process RBF terms, layer by layer */ for(levelidx=0; levelidx<=s->nh-1; levelidx++) { rcur = s->ri.ptr.p_double[levelidx]; blockwidth0 = (double)(1); blockwidth1 = (double)(1); blockwidth2 = (double)(1); blockwidth3 = (double)(1); if( nx>=1 ) { blockwidth0 = rcur*s->s.ptr.p_double[0]; } if( nx>=2 ) { blockwidth1 = 
rcur*s->s.ptr.p_double[1]; } if( nx>=3 ) { blockwidth2 = rcur*s->s.ptr.p_double[2]; } if( nx>=4 ) { blockwidth3 = rcur*s->s.ptr.p_double[3]; } maxblocksize = 8; /* * Group grid nodes into blocks according to current radius */ ae_vector_set_length(&blocks0, n0+1, _state); blockscnt0 = 0; blocks0.ptr.p_int[0] = 0; for(i=1; i<=n0-1; i++) { if( ae_fp_greater(x0->ptr.p_double[i]-x0->ptr.p_double[blocks0.ptr.p_int[blockscnt0]],blockwidth0)||i-blocks0.ptr.p_int[blockscnt0]>=maxblocksize ) { inc(&blockscnt0, _state); blocks0.ptr.p_int[blockscnt0] = i; } } inc(&blockscnt0, _state); blocks0.ptr.p_int[blockscnt0] = n0; ae_vector_set_length(&blocks1, n1+1, _state); blockscnt1 = 0; blocks1.ptr.p_int[0] = 0; for(i=1; i<=n1-1; i++) { if( ae_fp_greater(x1->ptr.p_double[i]-x1->ptr.p_double[blocks1.ptr.p_int[blockscnt1]],blockwidth1)||i-blocks1.ptr.p_int[blockscnt1]>=maxblocksize ) { inc(&blockscnt1, _state); blocks1.ptr.p_int[blockscnt1] = i; } } inc(&blockscnt1, _state); blocks1.ptr.p_int[blockscnt1] = n1; ae_vector_set_length(&blocks2, n2+1, _state); blockscnt2 = 0; blocks2.ptr.p_int[0] = 0; for(i=1; i<=n2-1; i++) { if( ae_fp_greater(x2->ptr.p_double[i]-x2->ptr.p_double[blocks2.ptr.p_int[blockscnt2]],blockwidth2)||i-blocks2.ptr.p_int[blockscnt2]>=maxblocksize ) { inc(&blockscnt2, _state); blocks2.ptr.p_int[blockscnt2] = i; } } inc(&blockscnt2, _state); blocks2.ptr.p_int[blockscnt2] = n2; ae_vector_set_length(&blocks3, n3+1, _state); blockscnt3 = 0; blocks3.ptr.p_int[0] = 0; for(i=1; i<=n3-1; i++) { if( ae_fp_greater(x3->ptr.p_double[i]-x3->ptr.p_double[blocks3.ptr.p_int[blockscnt3]],blockwidth3)||i-blocks3.ptr.p_int[blockscnt3]>=maxblocksize ) { inc(&blockscnt3, _state); blocks3.ptr.p_int[blockscnt3] = i; } } inc(&blockscnt3, _state); blocks3.ptr.p_int[blockscnt3] = n3; /* * Prepare seed for shared pool */ rbfv2_allocatecalcbuffer(s, &bufseedv2.calcbuf, _state); ae_shared_pool_set_seed(&bufpool, &bufseedv2, sizeof(bufseedv2), _rbfv2gridcalcbuffer_init, _rbfv2gridcalcbuffer_init_copy, _rbfv2gridcalcbuffer_destroy, _state); /* * Determine average number of neighbor per node */ searchradius2 = ae_sqr(rcur*rbfv2farradius(s->bf, _state), _state); ntrials = 100; avgfuncpernode = 0.0; for(i=0; i<=ntrials-1; i++) { tx.ptr.p_double[0] = x0->ptr.p_double[hqrnduniformi(&rs, n0, _state)]; tx.ptr.p_double[1] = x1->ptr.p_double[hqrnduniformi(&rs, n1, _state)]; tx.ptr.p_double[2] = x2->ptr.p_double[hqrnduniformi(&rs, n2, _state)]; tx.ptr.p_double[3] = x3->ptr.p_double[hqrnduniformi(&rs, n3, _state)]; rbfv2_preparepartialquery(&tx, &s->kdboxmin, &s->kdboxmax, nx, &bufseedv2.calcbuf, &dummy, _state); avgfuncpernode = avgfuncpernode+(double)rbfv2_partialcountrec(&s->kdnodes, &s->kdsplits, &s->cw, nx, ny, &bufseedv2.calcbuf, s->kdroots.ptr.p_int[levelidx], searchradius2, &tx, _state)/(double)ntrials; } /* * Perform calculation in multithreaded mode */ rbfv2partialgridcalcrec(s, x0, n0, x1, n1, x2, n2, x3, n3, &blocks0, 0, blockscnt0, &blocks1, 0, blockscnt1, &blocks2, 0, blockscnt2, &blocks3, 0, blockscnt3, flagy, sparsey, levelidx, avgfuncpernode, &bufpool, y, _state); } ae_frame_leave(_state); } void rbfv2partialgridcalcrec(rbfv2model* s, /* Real */ ae_vector* x0, ae_int_t n0, /* Real */ ae_vector* x1, ae_int_t n1, /* Real */ ae_vector* x2, ae_int_t n2, /* Real */ ae_vector* x3, ae_int_t n3, /* Integer */ ae_vector* blocks0, ae_int_t block0a, ae_int_t block0b, /* Integer */ ae_vector* blocks1, ae_int_t block1a, ae_int_t block1b, /* Integer */ ae_vector* blocks2, ae_int_t block2a, ae_int_t block2b, /* Integer */ 
ae_vector* blocks3, ae_int_t block3a, ae_int_t block3b, /* Boolean */ ae_vector* flagy, ae_bool sparsey, ae_int_t levelidx, double avgfuncpernode, ae_shared_pool* bufpool, /* Real */ ae_vector* y, ae_state *_state) { ae_frame _frame_block; ae_int_t nx; ae_int_t ny; ae_int_t k; ae_int_t l; ae_int_t blkidx; ae_int_t blkcnt; ae_int_t nodeidx; ae_int_t nodescnt; ae_int_t rowidx; ae_int_t rowscnt; ae_int_t i0; ae_int_t i1; ae_int_t i2; ae_int_t i3; ae_int_t j0; ae_int_t j1; ae_int_t j2; ae_int_t j3; double rcur; double invrc2; double rquery2; double rfar2; ae_int_t dstoffs; ae_int_t srcoffs; ae_int_t dummy; double rowwidth; double maxrowwidth; double problemcost; ae_int_t maxbs; ae_int_t midpoint; ae_bool emptyrow; rbfv2gridcalcbuffer *buf; ae_smart_ptr _buf; ae_frame_make(_state, &_frame_block); ae_smart_ptr_init(&_buf, (void**)&buf, _state); nx = s->nx; ny = s->ny; /* * Integrity checks */ ae_assert(s->nx==2||s->nx==3, "RBFV2PartialGridCalcRec: integrity check failed", _state); /* * Try to split large problem */ problemcost = s->ny*2*(avgfuncpernode+1); problemcost = problemcost*(blocks0->ptr.p_int[block0b]-blocks0->ptr.p_int[block0a]); problemcost = problemcost*(blocks1->ptr.p_int[block1b]-blocks1->ptr.p_int[block1a]); problemcost = problemcost*(blocks2->ptr.p_int[block2b]-blocks2->ptr.p_int[block2a]); problemcost = problemcost*(blocks3->ptr.p_int[block3b]-blocks3->ptr.p_int[block3a]); maxbs = 0; maxbs = ae_maxint(maxbs, block0b-block0a, _state); maxbs = ae_maxint(maxbs, block1b-block1a, _state); maxbs = ae_maxint(maxbs, block2b-block2a, _state); maxbs = ae_maxint(maxbs, block3b-block3a, _state); if( ae_fp_greater_eq(problemcost,rbfv2_minbasecasecost)&&maxbs>=2 ) { if( block0b-block0a==maxbs ) { midpoint = block0a+maxbs/2; rbfv2partialgridcalcrec(s, x0, n0, x1, n1, x2, n2, x3, n3, blocks0, block0a, midpoint, blocks1, block1a, block1b, blocks2, block2a, block2b, blocks3, block3a, block3b, flagy, sparsey, levelidx, avgfuncpernode, bufpool, y, _state); rbfv2partialgridcalcrec(s, x0, n0, x1, n1, x2, n2, x3, n3, blocks0, midpoint, block0b, blocks1, block1a, block1b, blocks2, block2a, block2b, blocks3, block3a, block3b, flagy, sparsey, levelidx, avgfuncpernode, bufpool, y, _state); ae_frame_leave(_state); return; } if( block1b-block1a==maxbs ) { midpoint = block1a+maxbs/2; rbfv2partialgridcalcrec(s, x0, n0, x1, n1, x2, n2, x3, n3, blocks0, block0a, block0b, blocks1, block1a, midpoint, blocks2, block2a, block2b, blocks3, block3a, block3b, flagy, sparsey, levelidx, avgfuncpernode, bufpool, y, _state); rbfv2partialgridcalcrec(s, x0, n0, x1, n1, x2, n2, x3, n3, blocks0, block0a, block0b, blocks1, midpoint, block1b, blocks2, block2a, block2b, blocks3, block3a, block3b, flagy, sparsey, levelidx, avgfuncpernode, bufpool, y, _state); ae_frame_leave(_state); return; } if( block2b-block2a==maxbs ) { midpoint = block2a+maxbs/2; rbfv2partialgridcalcrec(s, x0, n0, x1, n1, x2, n2, x3, n3, blocks0, block0a, block0b, blocks1, block1a, block1b, blocks2, block2a, midpoint, blocks3, block3a, block3b, flagy, sparsey, levelidx, avgfuncpernode, bufpool, y, _state); rbfv2partialgridcalcrec(s, x0, n0, x1, n1, x2, n2, x3, n3, blocks0, block0a, block0b, blocks1, block1a, block1b, blocks2, midpoint, block2b, blocks3, block3a, block3b, flagy, sparsey, levelidx, avgfuncpernode, bufpool, y, _state); ae_frame_leave(_state); return; } if( block3b-block3a==maxbs ) { midpoint = block3a+maxbs/2; rbfv2partialgridcalcrec(s, x0, n0, x1, n1, x2, n2, x3, n3, blocks0, block0a, block0b, blocks1, block1a, block1b, blocks2, block2a, 
block2b, blocks3, block3a, midpoint, flagy, sparsey, levelidx, avgfuncpernode, bufpool, y, _state); rbfv2partialgridcalcrec(s, x0, n0, x1, n1, x2, n2, x3, n3, blocks0, block0a, block0b, blocks1, block1a, block1b, blocks2, block2a, block2b, blocks3, midpoint, block3b, flagy, sparsey, levelidx, avgfuncpernode, bufpool, y, _state); ae_frame_leave(_state); return; } ae_assert(ae_false, "RBFV2PartialGridCalcRec: integrity check failed", _state); } /* * Retrieve buffer object from pool (it will be returned later) */ ae_shared_pool_retrieve(bufpool, &_buf, _state); /* * Calculate RBF model */ ae_assert(nx<=4, "RBFV2PartialGridCalcRec: integrity check failed", _state); ae_vector_set_length(&buf->tx, 4, _state); ae_vector_set_length(&buf->cx, 4, _state); ae_vector_set_length(&buf->ty, ny, _state); rcur = s->ri.ptr.p_double[levelidx]; invrc2 = 1/(rcur*rcur); blkcnt = (block3b-block3a)*(block2b-block2a)*(block1b-block1a)*(block0b-block0a); for(blkidx=0; blkidx<=blkcnt-1; blkidx++) { /* * Select block (I0,I1,I2,I3). * * NOTE: for problems with NX<4 corresponding I_? are zero. */ k = blkidx; i0 = block0a+k%(block0b-block0a); k = k/(block0b-block0a); i1 = block1a+k%(block1b-block1a); k = k/(block1b-block1a); i2 = block2a+k%(block2b-block2a); k = k/(block2b-block2a); i3 = block3a+k%(block3b-block3a); k = k/(block3b-block3a); ae_assert(k==0, "RBFV2PartialGridCalcRec: integrity check failed", _state); /* * We partitioned grid into blocks and selected block with * index (I0,I1,I2,I3). This block is a 4D cube (some dimensions * may be zero) of nodes with indexes (J0,J1,J2,J3), which is * further partitioned into a set of rows, each row corresponding * to indexes J1...J3 being fixed. * * We process block row by row, and each row may be handled * by either "generic" (nodes are processed separately) or * batch algorithm (that's the reason to use rows, after all). * * * Process nodes of the block */ rowscnt = (blocks3->ptr.p_int[i3+1]-blocks3->ptr.p_int[i3])*(blocks2->ptr.p_int[i2+1]-blocks2->ptr.p_int[i2])*(blocks1->ptr.p_int[i1+1]-blocks1->ptr.p_int[i1]); for(rowidx=0; rowidx<=rowscnt-1; rowidx++) { /* * Find out node indexes (*,J1,J2,J3). * * NOTE: for problems with NX<4 corresponding J_? are zero. */ k = rowidx; j1 = blocks1->ptr.p_int[i1]+k%(blocks1->ptr.p_int[i1+1]-blocks1->ptr.p_int[i1]); k = k/(blocks1->ptr.p_int[i1+1]-blocks1->ptr.p_int[i1]); j2 = blocks2->ptr.p_int[i2]+k%(blocks2->ptr.p_int[i2+1]-blocks2->ptr.p_int[i2]); k = k/(blocks2->ptr.p_int[i2+1]-blocks2->ptr.p_int[i2]); j3 = blocks3->ptr.p_int[i3]+k%(blocks3->ptr.p_int[i3+1]-blocks3->ptr.p_int[i3]); k = k/(blocks3->ptr.p_int[i3+1]-blocks3->ptr.p_int[i3]); ae_assert(k==0, "RBFV2PartialGridCalcRec: integrity check failed", _state); /* * Analyze row, skip completely empty rows */ nodescnt = blocks0->ptr.p_int[i0+1]-blocks0->ptr.p_int[i0]; srcoffs = blocks0->ptr.p_int[i0]+(j1+(j2+j3*n2)*n1)*n0; emptyrow = ae_true; for(nodeidx=0; nodeidx<=nodescnt-1; nodeidx++) { emptyrow = emptyrow&&(sparsey&&!flagy->ptr.p_bool[srcoffs+nodeidx]); } if( emptyrow ) { continue; } /* * Process row - use either "batch" (rowsize>1) or "generic" * (row size is 1) algorithm. * * NOTE: "generic" version may also be used as fallback code for * situations when we do not want to use batch code. 
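* 
* NOTE (illustrative): as the code below shows, the batch branch is taken only 
* when the row contains more than one node and its width along X0 does not 
* exceed 0.5*NearRadius(BF)*RCur*S[0], i.e. when the whole row is narrow 
* compared to the basis function support; otherwise nodes fall through to the 
* per-node branch. 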
*/ maxrowwidth = 0.5*rbfv2nearradius(s->bf, _state)*rcur*s->s.ptr.p_double[0]; rowwidth = x0->ptr.p_double[blocks0->ptr.p_int[i0+1]-1]-x0->ptr.p_double[blocks0->ptr.p_int[i0]]; if( nodescnt>1&&ae_fp_less_eq(rowwidth,maxrowwidth) ) { /* * "Batch" code which processes entire row at once, saving * some time in kd-tree search code. */ rquery2 = ae_sqr(rcur*rbfv2farradius(s->bf, _state)+0.5*rowwidth/s->s.ptr.p_double[0], _state); rfar2 = ae_sqr(rcur*rbfv2farradius(s->bf, _state), _state); j0 = blocks0->ptr.p_int[i0]; if( nx>0 ) { buf->cx.ptr.p_double[0] = (x0->ptr.p_double[j0]+0.5*rowwidth)/s->s.ptr.p_double[0]; } if( nx>1 ) { buf->cx.ptr.p_double[1] = x1->ptr.p_double[j1]/s->s.ptr.p_double[1]; } if( nx>2 ) { buf->cx.ptr.p_double[2] = x2->ptr.p_double[j2]/s->s.ptr.p_double[2]; } if( nx>3 ) { buf->cx.ptr.p_double[3] = x3->ptr.p_double[j3]/s->s.ptr.p_double[3]; } srcoffs = j0+(j1+(j2+j3*n2)*n1)*n0; dstoffs = ny*srcoffs; rvectorsetlengthatleast(&buf->rx, nodescnt, _state); bvectorsetlengthatleast(&buf->rf, nodescnt, _state); rvectorsetlengthatleast(&buf->ry, nodescnt*ny, _state); for(nodeidx=0; nodeidx<=nodescnt-1; nodeidx++) { buf->rx.ptr.p_double[nodeidx] = x0->ptr.p_double[j0+nodeidx]/s->s.ptr.p_double[0]; buf->rf.ptr.p_bool[nodeidx] = !sparsey||flagy->ptr.p_bool[srcoffs+nodeidx]; } for(k=0; k<=nodescnt*ny-1; k++) { buf->ry.ptr.p_double[k] = (double)(0); } rbfv2_preparepartialquery(&buf->cx, &s->kdboxmin, &s->kdboxmax, nx, &buf->calcbuf, &dummy, _state); rbfv2_partialrowcalcrec(s, &buf->calcbuf, s->kdroots.ptr.p_int[levelidx], invrc2, rquery2, rfar2, &buf->cx, &buf->rx, &buf->rf, nodescnt, &buf->ry, _state); for(k=0; k<=nodescnt*ny-1; k++) { y->ptr.p_double[dstoffs+k] = y->ptr.p_double[dstoffs+k]+buf->ry.ptr.p_double[k]; } } else { /* * "Generic" code. Although we usually move here * only when NodesCnt=1, we still use a loop on * NodeIdx just to be able to use this branch as * fallback code without any modifications. */ rquery2 = ae_sqr(rcur*rbfv2farradius(s->bf, _state), _state); for(nodeidx=0; nodeidx<=nodescnt-1; nodeidx++) { /* * Prepare TX - current point */ j0 = blocks0->ptr.p_int[i0]+nodeidx; if( nx>0 ) { buf->tx.ptr.p_double[0] = x0->ptr.p_double[j0]/s->s.ptr.p_double[0]; } if( nx>1 ) { buf->tx.ptr.p_double[1] = x1->ptr.p_double[j1]/s->s.ptr.p_double[1]; } if( nx>2 ) { buf->tx.ptr.p_double[2] = x2->ptr.p_double[j2]/s->s.ptr.p_double[2]; } if( nx>3 ) { buf->tx.ptr.p_double[3] = x3->ptr.p_double[j3]/s->s.ptr.p_double[3]; } /* * Evaluate and add to Y */ srcoffs = j0+(j1+(j2+j3*n2)*n1)*n0; dstoffs = ny*srcoffs; for(l=0; l<=ny-1; l++) { buf->ty.ptr.p_double[l] = (double)(0); } if( !sparsey||flagy->ptr.p_bool[srcoffs] ) { rbfv2_preparepartialquery(&buf->tx, &s->kdboxmin, &s->kdboxmax, nx, &buf->calcbuf, &dummy, _state); rbfv2_partialcalcrec(s, &buf->calcbuf, s->kdroots.ptr.p_int[levelidx], invrc2, rquery2, &buf->tx, &buf->ty, _state); } for(l=0; l<=ny-1; l++) { y->ptr.p_double[dstoffs+l] = y->ptr.p_double[dstoffs+l]+buf->ty.ptr.p_double[l]; } } } } } /* * Recycle buffer object back to pool */ ae_shared_pool_recycle(bufpool, &_buf, _state); ae_frame_leave(_state); } /************************************************************************* This function "unpacks" RBF model by extracting its coefficients. INPUT PARAMETERS: S - RBF model OUTPUT PARAMETERS: NX - dimensionality of argument NY - dimensionality of the target function XWR - model information, array[NC,NX+NY+1]. 
One row of the array corresponds to one basis function: * first NX columns - coordinates of the center * next NY columns - weights, one per dimension of the function being modelled * last NX columns - radii, per dimension NC - number of the centers V - polynomial term , array[NY,NX+1]. One row per one dimension of the function being modelled. First NX elements are linear coefficients, V[NX] is equal to the constant part. -- ALGLIB -- Copyright 13.12.2011 by Bochkanov Sergey *************************************************************************/ void rbfv2unpack(rbfv2model* s, ae_int_t* nx, ae_int_t* ny, /* Real */ ae_matrix* xwr, ae_int_t* nc, /* Real */ ae_matrix* v, ae_state *_state) { ae_int_t i; ae_int_t ncactual; *nx = 0; *ny = 0; ae_matrix_clear(xwr); *nc = 0; ae_matrix_clear(v); *nx = s->nx; *ny = s->ny; *nc = 0; /* * Fill V */ ae_matrix_set_length(v, s->ny, s->nx+1, _state); for(i=0; i<=s->ny-1; i++) { ae_v_move(&v->ptr.pp_double[i][0], 1, &s->v.ptr.pp_double[i][0], 1, ae_v_len(0,s->nx)); } /* * Fill XWR */ ae_assert(s->cw.cnt%(s->nx+s->ny)==0, "RBFV2Unpack: integrity error", _state); *nc = s->cw.cnt/(s->nx+s->ny); ncactual = 0; if( *nc>0 ) { ae_matrix_set_length(xwr, *nc, s->nx+s->ny+s->nx, _state); for(i=0; i<=s->nh-1; i++) { rbfv2_partialunpackrec(&s->kdnodes, &s->kdsplits, &s->cw, &s->s, s->nx, s->ny, s->kdroots.ptr.p_int[i], s->ri.ptr.p_double[i], xwr, &ncactual, _state); } } ae_assert(*nc==ncactual, "RBFV2Unpack: integrity error", _state); } static ae_bool rbfv2_rbfv2buildlinearmodel(/* Real */ ae_matrix* x, /* Real */ ae_matrix* y, ae_int_t n, ae_int_t nx, ae_int_t ny, ae_int_t modeltype, /* Real */ ae_matrix* v, ae_state *_state) { ae_frame _frame_block; ae_vector tmpy; ae_matrix a; double scaling; ae_vector shifting; double mn; double mx; ae_vector c; lsfitreport rep; ae_int_t i; ae_int_t j; ae_int_t k; ae_int_t info; ae_bool result; ae_frame_make(_state, &_frame_block); ae_matrix_clear(v); ae_vector_init(&tmpy, 0, DT_REAL, _state); ae_matrix_init(&a, 0, 0, DT_REAL, _state); ae_vector_init(&shifting, 0, DT_REAL, _state); ae_vector_init(&c, 0, DT_REAL, _state); _lsfitreport_init(&rep, _state); ae_assert(n>=0, "BuildLinearModel: N<0", _state); ae_assert(nx>0, "BuildLinearModel: NX<=0", _state); ae_assert(ny>0, "BuildLinearModel: NY<=0", _state); /* * Handle degenerate case (N=0) */ result = ae_true; ae_matrix_set_length(v, ny, nx+1, _state); if( n==0 ) { for(j=0; j<=nx; j++) { for(i=0; i<=ny-1; i++) { v->ptr.pp_double[i][j] = (double)(0); } } ae_frame_leave(_state); return result; } /* * Allocate temporaries */ ae_vector_set_length(&tmpy, n, _state); /* * General linear model. 
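* 
* Illustrative outline of the branch below: each coordinate is mapped to 
* roughly [-1,+1] via t = (x-Shift)/Scaling, where Shift is the per-dimension 
* midpoint and Scaling is half of the largest per-dimension spread; an LLS 
* problem y ~ c[0]*t[0]+...+c[nx-1]*t[nx-1]+c[nx] is solved with lsfitlinear(), 
* the coefficients are mapped back as V[i][j]=c[j]/Scaling and 
* V[i][nx]=c[nx]-sum_j Shift[j]*V[i][j], and the resulting linear trend is 
* subtracted from Y so that the RBF stage fits only residuals. 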
*/ if( modeltype==1 ) { /* * Calculate scaling/shifting, transform variables, prepare LLS problem */ ae_matrix_set_length(&a, n, nx+1, _state); ae_vector_set_length(&shifting, nx, _state); scaling = (double)(0); for(i=0; i<=nx-1; i++) { mn = x->ptr.pp_double[0][i]; mx = mn; for(j=1; j<=n-1; j++) { if( ae_fp_greater(mn,x->ptr.pp_double[j][i]) ) { mn = x->ptr.pp_double[j][i]; } if( ae_fp_less(mx,x->ptr.pp_double[j][i]) ) { mx = x->ptr.pp_double[j][i]; } } scaling = ae_maxreal(scaling, mx-mn, _state); shifting.ptr.p_double[i] = 0.5*(mx+mn); } if( ae_fp_eq(scaling,(double)(0)) ) { scaling = (double)(1); } else { scaling = 0.5*scaling; } for(i=0; i<=n-1; i++) { for(j=0; j<=nx-1; j++) { a.ptr.pp_double[i][j] = (x->ptr.pp_double[i][j]-shifting.ptr.p_double[j])/scaling; } } for(i=0; i<=n-1; i++) { a.ptr.pp_double[i][nx] = (double)(1); } /* * Solve linear system in transformed variables, make backward */ for(i=0; i<=ny-1; i++) { for(j=0; j<=n-1; j++) { tmpy.ptr.p_double[j] = y->ptr.pp_double[j][i]; } lsfitlinear(&tmpy, &a, n, nx+1, &info, &c, &rep, _state); if( info<=0 ) { result = ae_false; ae_frame_leave(_state); return result; } for(j=0; j<=nx-1; j++) { v->ptr.pp_double[i][j] = c.ptr.p_double[j]/scaling; } v->ptr.pp_double[i][nx] = c.ptr.p_double[nx]; for(j=0; j<=nx-1; j++) { v->ptr.pp_double[i][nx] = v->ptr.pp_double[i][nx]-shifting.ptr.p_double[j]*v->ptr.pp_double[i][j]; } for(j=0; j<=n-1; j++) { for(k=0; k<=nx-1; k++) { y->ptr.pp_double[j][i] = y->ptr.pp_double[j][i]-x->ptr.pp_double[j][k]*v->ptr.pp_double[i][k]; } y->ptr.pp_double[j][i] = y->ptr.pp_double[j][i]-v->ptr.pp_double[i][nx]; } } ae_frame_leave(_state); return result; } /* * Constant model, very simple */ if( modeltype==2 ) { for(i=0; i<=ny-1; i++) { for(j=0; j<=nx; j++) { v->ptr.pp_double[i][j] = (double)(0); } for(j=0; j<=n-1; j++) { v->ptr.pp_double[i][nx] = v->ptr.pp_double[i][nx]+y->ptr.pp_double[j][i]; } if( n>0 ) { v->ptr.pp_double[i][nx] = v->ptr.pp_double[i][nx]/n; } for(j=0; j<=n-1; j++) { y->ptr.pp_double[j][i] = y->ptr.pp_double[j][i]-v->ptr.pp_double[i][nx]; } } ae_frame_leave(_state); return result; } /* * Zero model */ ae_assert(modeltype==3, "BuildLinearModel: unknown model type", _state); for(i=0; i<=ny-1; i++) { for(j=0; j<=nx; j++) { v->ptr.pp_double[i][j] = (double)(0); } } ae_frame_leave(_state); return result; } /************************************************************************* Reallocates calcBuf if necessary, reuses previously allocated space if possible. -- ALGLIB -- Copyright 20.06.2016 by Sergey Bochkanov *************************************************************************/ static void rbfv2_allocatecalcbuffer(rbfv2model* s, rbfv2calcbuffer* buf, ae_state *_state) { if( buf->x.cnt<s->nx ) { ae_vector_set_length(&buf->x, s->nx, _state); } if( buf->curboxmin.cnt<s->nx ) { ae_vector_set_length(&buf->curboxmin, s->nx, _state); } if( buf->curboxmax.cnt<s->nx ) { ae_vector_set_length(&buf->curboxmax, s->nx, _state); } if( buf->x123.cnt<s->nx ) { ae_vector_set_length(&buf->x123, s->nx, _state); } if( buf->y123.cnt<s->ny ) { ae_vector_set_length(&buf->y123, s->ny, _state); } } /************************************************************************* Extracts structure (and XY-values too) from kd-tree built for a small subset of points and appends it to multi-tree.
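NOTE (illustrative, inferred from the conversion and query code in this unit): the flattened multi-tree uses compact per-node records in kdNodes[]. A leaf is stored as two integers [PointCount, OffsetInCW] with PointCount>0, each of its points occupying NX+NY consecutive entries of CW[] (center coordinates followed by weights). A split node is stored as five integers [0, SplitDimension, IndexInKdSplits, OffsetOfChildLE, OffsetOfChildGE]. kdRoots[] holds one root offset per hierarchy level.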
-- ALGLIB -- Copyright 20.06.2016 by Sergey Bochkanov *************************************************************************/ static void rbfv2_convertandappendtree(kdtree* curtree, ae_int_t n, ae_int_t nx, ae_int_t ny, /* Integer */ ae_vector* kdnodes, /* Real */ ae_vector* kdsplits, /* Real */ ae_vector* cw, ae_state *_state) { ae_frame _frame_block; ae_int_t nodesbase; ae_int_t splitsbase; ae_int_t cwbase; ae_vector localnodes; ae_vector localsplits; ae_vector localcw; ae_matrix xybuf; ae_int_t localnodessize; ae_int_t localsplitssize; ae_int_t localcwsize; ae_int_t i; ae_frame_make(_state, &_frame_block); ae_vector_init(&localnodes, 0, DT_INT, _state); ae_vector_init(&localsplits, 0, DT_REAL, _state); ae_vector_init(&localcw, 0, DT_REAL, _state); ae_matrix_init(&xybuf, 0, 0, DT_REAL, _state); /* * Calculate base offsets */ nodesbase = kdnodes->cnt; splitsbase = kdsplits->cnt; cwbase = cw->cnt; /* * Prepare local copy of tree */ ae_vector_set_length(&localnodes, n*rbfv2_maxnodesize, _state); ae_vector_set_length(&localsplits, n, _state); ae_vector_set_length(&localcw, (nx+ny)*n, _state); localnodessize = 0; localsplitssize = 0; localcwsize = 0; rbfv2_converttreerec(curtree, n, nx, ny, 0, nodesbase, splitsbase, cwbase, &localnodes, &localnodessize, &localsplits, &localsplitssize, &localcw, &localcwsize, &xybuf, _state); /* * Append to multi-tree */ ivectorresize(kdnodes, kdnodes->cnt+localnodessize, _state); rvectorresize(kdsplits, kdsplits->cnt+localsplitssize, _state); rvectorresize(cw, cw->cnt+localcwsize, _state); for(i=0; i<=localnodessize-1; i++) { kdnodes->ptr.p_int[nodesbase+i] = localnodes.ptr.p_int[i]; } for(i=0; i<=localsplitssize-1; i++) { kdsplits->ptr.p_double[splitsbase+i] = localsplits.ptr.p_double[i]; } for(i=0; i<=localcwsize-1; i++) { cw->ptr.p_double[cwbase+i] = localcw.ptr.p_double[i]; } ae_frame_leave(_state); } /************************************************************************* Recurrent tree conversion CurTree - tree to convert N, NX, NY - dataset metrics NodeOffset - offset of current tree node, 0 for root NodesBase - a value which is added to intra-tree node indexes; although this tree is stored in separate array, it is intended to be stored in the larger tree, with localNodes being moved to offset NodesBase. 
SplitsBase - similarly, offset of localSplits in the final tree CWBase - similarly, offset of localCW in the final tree *************************************************************************/ static void rbfv2_converttreerec(kdtree* curtree, ae_int_t n, ae_int_t nx, ae_int_t ny, ae_int_t nodeoffset, ae_int_t nodesbase, ae_int_t splitsbase, ae_int_t cwbase, /* Integer */ ae_vector* localnodes, ae_int_t* localnodessize, /* Real */ ae_vector* localsplits, ae_int_t* localsplitssize, /* Real */ ae_vector* localcw, ae_int_t* localcwsize, /* Real */ ae_matrix* xybuf, ae_state *_state) { ae_int_t i; ae_int_t j; ae_int_t nodetype; ae_int_t cnt; ae_int_t d; double s; ae_int_t nodele; ae_int_t nodege; ae_int_t oldnodessize; kdtreeexplorenodetype(curtree, nodeoffset, &nodetype, _state); /* * Leaf node */ if( nodetype==0 ) { kdtreeexploreleaf(curtree, nodeoffset, xybuf, &cnt, _state); ae_assert(localnodes->cnt>=*localnodessize+2, "ConvertTreeRec: integrity check failed", _state); ae_assert(localcw->cnt>=*localcwsize+cnt*(nx+ny), "ConvertTreeRec: integrity check failed", _state); localnodes->ptr.p_int[*localnodessize+0] = cnt; localnodes->ptr.p_int[*localnodessize+1] = cwbase+(*localcwsize); *localnodessize = *localnodessize+2; for(i=0; i<=cnt-1; i++) { for(j=0; j<=nx+ny-1; j++) { localcw->ptr.p_double[*localcwsize+i*(nx+ny)+j] = xybuf->ptr.pp_double[i][j]; } } *localcwsize = *localcwsize+cnt*(nx+ny); return; } /* * Split node */ if( nodetype==1 ) { kdtreeexploresplit(curtree, nodeoffset, &d, &s, &nodele, &nodege, _state); ae_assert(localnodes->cnt>=*localnodessize+rbfv2_maxnodesize, "ConvertTreeRec: integrity check failed", _state); ae_assert(localsplits->cnt>=*localsplitssize+1, "ConvertTreeRec: integrity check failed", _state); oldnodessize = *localnodessize; localnodes->ptr.p_int[*localnodessize+0] = 0; localnodes->ptr.p_int[*localnodessize+1] = d; localnodes->ptr.p_int[*localnodessize+2] = splitsbase+(*localsplitssize); localnodes->ptr.p_int[*localnodessize+3] = -1; localnodes->ptr.p_int[*localnodessize+4] = -1; *localnodessize = *localnodessize+5; localsplits->ptr.p_double[*localsplitssize+0] = s; *localsplitssize = *localsplitssize+1; localnodes->ptr.p_int[oldnodessize+3] = nodesbase+(*localnodessize); rbfv2_converttreerec(curtree, n, nx, ny, nodele, nodesbase, splitsbase, cwbase, localnodes, localnodessize, localsplits, localsplitssize, localcw, localcwsize, xybuf, _state); localnodes->ptr.p_int[oldnodessize+4] = nodesbase+(*localnodessize); rbfv2_converttreerec(curtree, n, nx, ny, nodege, nodesbase, splitsbase, cwbase, localnodes, localnodessize, localsplits, localsplitssize, localcw, localcwsize, xybuf, _state); return; } /* * Integrity error */ ae_assert(ae_false, "ConvertTreeRec: integrity check failed", _state); } /************************************************************************* This function performs partial calculation of hierarchical model: given evaluation point X and partially computed value Y, it updates Y by values computed using part of multi-tree given by RootIdx. 
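A typical call sequence, shown only as an illustrative sketch mirroring the grid evaluation code earlier in this unit (X is given in the internally rescaled coordinates, Dummy is a throwaway integer):

    rbfv2_preparepartialquery(&x, &s->kdboxmin, &s->kdboxmax, nx, buf, &dummy, _state);
    rbfv2_partialcalcrec(s, buf, s->kdroots.ptr.p_int[levelidx], invr2, queryr2, &x, &y, _state);

with InvR2 = 1/Ri[LevelIdx]^2 and QueryR2 = (Ri[LevelIdx]*FarRadius(BF))^2; contributions are accumulated in Y over all hierarchy levels.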
INPUT PARAMETERS: S - V2 model Buf - calc-buffer, this function uses following fields: * Buf.CurBoxMin - should be set by caller * Buf.CurBoxMax - should be set by caller * Buf.CurDist2 - squared distance from X to current bounding box, should be set by caller RootIdx - offset of partial kd-tree InvR2 - 1/R^2, where R is basis function radius QueryR2 - squared query radius, usually it is (R*FarRadius(BasisFunction))^2 X - evaluation point, array[NX] Y - partial value, array[NY] OUTPUT PARAMETERS Y - updated partial value -- ALGLIB -- Copyright 20.06.2016 by Bochkanov Sergey *************************************************************************/ static void rbfv2_partialcalcrec(rbfv2model* s, rbfv2calcbuffer* buf, ae_int_t rootidx, double invr2, double queryr2, /* Real */ ae_vector* x, /* Real */ ae_vector* y, ae_state *_state) { ae_int_t i; ae_int_t j; double ptdist2; double v; double v0; double v1; ae_int_t cwoffs; ae_int_t cwcnt; ae_int_t itemoffs; double arg; double val; ae_int_t d; double split; ae_int_t childle; ae_int_t childge; ae_int_t childoffs; ae_bool updatemin; double prevdist2; double t1; ae_int_t nx; ae_int_t ny; nx = s->nx; ny = s->ny; /* * Leaf node. */ if( s->kdnodes.ptr.p_int[rootidx]>0 ) { cwcnt = s->kdnodes.ptr.p_int[rootidx+0]; cwoffs = s->kdnodes.ptr.p_int[rootidx+1]; for(i=0; i<=cwcnt-1; i++) { /* * Calculate distance */ itemoffs = cwoffs+i*(nx+ny); ptdist2 = (double)(0); for(j=0; j<=nx-1; j++) { v = s->cw.ptr.p_double[itemoffs+j]-x->ptr.p_double[j]; ptdist2 = ptdist2+v*v; } /* * Skip points if distance too large */ if( ptdist2>=queryr2 ) { continue; } /* * Update Y */ arg = ptdist2*invr2; if( s->bf==0 ) { val = ae_exp(-arg, _state); } else { if( s->bf==1 ) { val = rbfv2basisfunc(s->bf, arg, _state); } else { ae_assert(ae_false, "PartialCalcRec: integrity check failed", _state); } } itemoffs = itemoffs+nx; for(j=0; j<=ny-1; j++) { y->ptr.p_double[j] = y->ptr.p_double[j]+val*s->cw.ptr.p_double[itemoffs+j]; } } return; } /* * Simple split */ if( s->kdnodes.ptr.p_int[rootidx]==0 ) { /* * Load: * * D dimension to split * * Split split position * * ChildLE, ChildGE - indexes of childs */ d = s->kdnodes.ptr.p_int[rootidx+1]; split = s->kdsplits.ptr.p_double[s->kdnodes.ptr.p_int[rootidx+2]]; childle = s->kdnodes.ptr.p_int[rootidx+3]; childge = s->kdnodes.ptr.p_int[rootidx+4]; /* * Navigate through childs */ for(i=0; i<=1; i++) { /* * Select child to process: * * ChildOffs current child offset in Nodes[] * * UpdateMin whether minimum or maximum value * of bounding box is changed on update */ updatemin = i!=0; if( i==0 ) { childoffs = childle; } else { childoffs = childge; } /* * Update bounding box and current distance */ prevdist2 = buf->curdist2; t1 = x->ptr.p_double[d]; if( updatemin ) { v = buf->curboxmin.ptr.p_double[d]; if( t1<=split ) { v0 = v-t1; if( v0<0 ) { v0 = (double)(0); } v1 = split-t1; buf->curdist2 = buf->curdist2-v0*v0+v1*v1; } buf->curboxmin.ptr.p_double[d] = split; } else { v = buf->curboxmax.ptr.p_double[d]; if( t1>=split ) { v0 = t1-v; if( v0<0 ) { v0 = (double)(0); } v1 = t1-split; buf->curdist2 = buf->curdist2-v0*v0+v1*v1; } buf->curboxmax.ptr.p_double[d] = split; } /* * Decide: to dive into cell or not to dive */ if( buf->curdist2<queryr2 ) { rbfv2_partialcalcrec(s, buf, childoffs, invr2, queryr2, x, y, _state); } /* * Restore bounding box and distance */ if( updatemin ) { buf->curboxmin.ptr.p_double[d] = v; } else { buf->curboxmax.ptr.p_double[d] = v; } buf->curdist2 = prevdist2; } return; } /* * Integrity failure */ ae_assert(ae_false, "PartialCalcRec: integrity check failed", _state); } /************************************************************************* This function performs the same
operation as partialcalcrec(), but for entire row of the grid. "Row" is a set of nodes (x0,x1,x2,x3) which share x1..x3, but have different x0's. (note: for 2D/3D problems x2..x3 are zero). Row is given by: * central point XC, which is located at the center of the row, and used to perform kd-tree requests * set of x0 coordinates stored in RX array (array may be unordered, but it is expected that spread of x0 is no more than R; function may be inefficient for larger spreads). * set of YFlag values stored in RF INPUT PARAMETERS: S - V2 model Buf - calc-buffer, this function uses following fields: * Buf.CurBoxMin - should be set by caller * Buf.CurBoxMax - should be set by caller * Buf.CurDist2 - squared distance from X to current bounding box, should be set by caller RootIdx - offset of partial kd-tree InvR2 - 1/R^2, where R is basis function radius RQuery2 - squared query radius, usually it is (R*FarRadius(BasisFunction)+0.5*RowWidth)^2, where RowWidth is its spatial extent (after scaling of variables). This radius is used to perform initial query for neighbors of CX. RFar2 - squared far radius; far radius is used to perform actual filtering of results of query made with RQuery2. CX - central point, array[NX], used for queries RX - x0 coordinates, array[RowSize] RF - sparsity flags, array[RowSize] RowSize - row size in elements RY - input partial value, array[NY] OUTPUT PARAMETERS RY - updated partial value (function adds its results to RY) -- ALGLIB -- Copyright 20.06.2016 by Bochkanov Sergey *************************************************************************/ static void rbfv2_partialrowcalcrec(rbfv2model* s, rbfv2calcbuffer* buf, ae_int_t rootidx, double invr2, double rquery2, double rfar2, /* Real */ ae_vector* cx, /* Real */ ae_vector* rx, /* Boolean */ ae_vector* rf, ae_int_t rowsize, /* Real */ ae_vector* ry, ae_state *_state) { ae_int_t i; ae_int_t j; ae_int_t i0; ae_int_t i1; double partialptdist2; double ptdist2; double v; double v0; double v1; ae_int_t cwoffs; ae_int_t cwcnt; ae_int_t itemoffs; ae_int_t woffs; double val; ae_int_t d; double split; ae_int_t childle; ae_int_t childge; ae_int_t childoffs; ae_bool updatemin; double prevdist2; double t1; ae_int_t nx; ae_int_t ny; nx = s->nx; ny = s->ny; /* * Leaf node. 
*/ if( s->kdnodes.ptr.p_int[rootidx]>0 ) { cwcnt = s->kdnodes.ptr.p_int[rootidx+0]; cwoffs = s->kdnodes.ptr.p_int[rootidx+1]; for(i0=0; i0<=cwcnt-1; i0++) { /* * Calculate partial distance (components from 1 to NX-1) */ itemoffs = cwoffs+i0*(nx+ny); partialptdist2 = (double)(0); for(j=1; j<=nx-1; j++) { v = s->cw.ptr.p_double[itemoffs+j]-cx->ptr.p_double[j]; partialptdist2 = partialptdist2+v*v; } /* * Process each element of the row */ for(i1=0; i1<=rowsize-1; i1++) { if( rf->ptr.p_bool[i1] ) { /* * Calculate distance */ v = s->cw.ptr.p_double[itemoffs]-rx->ptr.p_double[i1]; ptdist2 = partialptdist2+v*v; /* * Skip points if distance too large */ if( ptdist2>=rfar2 ) { continue; } /* * Update Y */ val = rbfv2basisfunc(s->bf, ptdist2*invr2, _state); woffs = itemoffs+nx; for(j=0; j<=ny-1; j++) { ry->ptr.p_double[j+i1*ny] = ry->ptr.p_double[j+i1*ny]+val*s->cw.ptr.p_double[woffs+j]; } } } } return; } /* * Simple split */ if( s->kdnodes.ptr.p_int[rootidx]==0 ) { /* * Load: * * D dimension to split * * Split split position * * ChildLE, ChildGE - indexes of childs */ d = s->kdnodes.ptr.p_int[rootidx+1]; split = s->kdsplits.ptr.p_double[s->kdnodes.ptr.p_int[rootidx+2]]; childle = s->kdnodes.ptr.p_int[rootidx+3]; childge = s->kdnodes.ptr.p_int[rootidx+4]; /* * Navigate through childs */ for(i=0; i<=1; i++) { /* * Select child to process: * * ChildOffs current child offset in Nodes[] * * UpdateMin whether minimum or maximum value * of bounding box is changed on update */ updatemin = i!=0; if( i==0 ) { childoffs = childle; } else { childoffs = childge; } /* * Update bounding box and current distance */ prevdist2 = buf->curdist2; t1 = cx->ptr.p_double[d]; if( updatemin ) { v = buf->curboxmin.ptr.p_double[d]; if( t1<=split ) { v0 = v-t1; if( v0<0 ) { v0 = (double)(0); } v1 = split-t1; buf->curdist2 = buf->curdist2-v0*v0+v1*v1; } buf->curboxmin.ptr.p_double[d] = split; } else { v = buf->curboxmax.ptr.p_double[d]; if( t1>=split ) { v0 = t1-v; if( v0<0 ) { v0 = (double)(0); } v1 = t1-split; buf->curdist2 = buf->curdist2-v0*v0+v1*v1; } buf->curboxmax.ptr.p_double[d] = split; } /* * Decide: to dive into cell or not to dive */ if( buf->curdist2<rquery2 ) { rbfv2_partialrowcalcrec(s, buf, childoffs, invr2, rquery2, rfar2, cx, rx, rf, rowsize, ry, _state); } /* * Restore bounding box and distance */ if( updatemin ) { buf->curboxmin.ptr.p_double[d] = v; } else { buf->curboxmax.ptr.p_double[d] = v; } buf->curdist2 = prevdist2; } return; } /* * Integrity failure */ ae_assert(ae_false, "PartialCalcRec: integrity check failed", _state); } /************************************************************************* This function prepares a partial query INPUT PARAMETERS: X - query point kdBoxMin, kdBoxMax - current bounding box NX - problem size Buf - preallocated buffer; this function just loads data, but does not allocate place for them. Cnt - counter variable which is set to zero by this function, as a convenience, and as a reminder that the counter must be zeroed prior to calling partialqueryrec().
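NOTE (illustrative): after this call Buf.CurDist2 equals the squared distance from X to the box [kdBoxMin,kdBoxMax], accumulated per dimension as

    CurDist2 = sum_j ( max(0, kdBoxMin[j]-X[j])^2 + max(0, X[j]-kdBoxMax[j])^2 )

so it is zero whenever X lies inside the box; the recursive query routines then shrink and restore the box, updating this value incrementally.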
OUTPUT PARAMETERS Buf - calc-buffer: * Buf.CurBoxMin - current box * Buf.CurBoxMax - current box * Buf.CurDist2 - squared distance from X to current box Cnt - set to zero -- ALGLIB -- Copyright 20.06.2016 by Bochkanov Sergey *************************************************************************/ static void rbfv2_preparepartialquery(/* Real */ ae_vector* x, /* Real */ ae_vector* kdboxmin, /* Real */ ae_vector* kdboxmax, ae_int_t nx, rbfv2calcbuffer* buf, ae_int_t* cnt, ae_state *_state) { ae_int_t j; *cnt = 0; buf->curdist2 = (double)(0); for(j=0; j<=nx-1; j++) { buf->curboxmin.ptr.p_double[j] = kdboxmin->ptr.p_double[j]; buf->curboxmax.ptr.p_double[j] = kdboxmax->ptr.p_double[j]; if( ae_fp_less(x->ptr.p_double[j],buf->curboxmin.ptr.p_double[j]) ) { buf->curdist2 = buf->curdist2+ae_sqr(buf->curboxmin.ptr.p_double[j]-x->ptr.p_double[j], _state); } else { if( ae_fp_greater(x->ptr.p_double[j],buf->curboxmax.ptr.p_double[j]) ) { buf->curdist2 = buf->curdist2+ae_sqr(x->ptr.p_double[j]-buf->curboxmax.ptr.p_double[j], _state); } } } } /************************************************************************* This function performs partial (for just one subtree of multi-tree) query for neighbors located in R-sphere around X. It returns squared distances from X to points and offsets in S.CW[] array for points being found. INPUT PARAMETERS: kdNodes, kdSplits, CW, NX, NY - corresponding fields of V2 model Buf - calc-buffer, this function uses following fields: * Buf.CurBoxMin - should be set by caller * Buf.CurBoxMax - should be set by caller * Buf.CurDist2 - squared distance from X to current bounding box, should be set by caller You may use preparepartialquery() function to initialize these fields. RootIdx - offset of partial kd-tree QueryR2 - squared query radius X - array[NX], point being queried R2 - preallocated output buffer; it is caller's responsibility to make sure that R2 has enough space. Offs - preallocated output buffer; it is caller's responsibility to make sure that Offs has enough space. K - MUST BE ZERO ON INITIAL CALL. This variable is incremented, not set. So, any no-zero value will result in the incorrect points count being returned. OUTPUT PARAMETERS R2 - squared distances in first K elements Offs - offsets in S.CW in first K elements K - points count -- ALGLIB -- Copyright 20.06.2016 by Bochkanov Sergey *************************************************************************/ static void rbfv2_partialqueryrec(/* Integer */ ae_vector* kdnodes, /* Real */ ae_vector* kdsplits, /* Real */ ae_vector* cw, ae_int_t nx, ae_int_t ny, rbfv2calcbuffer* buf, ae_int_t rootidx, double queryr2, /* Real */ ae_vector* x, /* Real */ ae_vector* r2, /* Integer */ ae_vector* offs, ae_int_t* k, ae_state *_state) { ae_int_t i; ae_int_t j; double ptdist2; double v; ae_int_t cwoffs; ae_int_t cwcnt; ae_int_t itemoffs; ae_int_t d; double split; ae_int_t childle; ae_int_t childge; ae_int_t childoffs; ae_bool updatemin; double prevdist2; double t1; /* * Leaf node. 
*/ if( kdnodes->ptr.p_int[rootidx]>0 ) { cwcnt = kdnodes->ptr.p_int[rootidx+0]; cwoffs = kdnodes->ptr.p_int[rootidx+1]; for(i=0; i<=cwcnt-1; i++) { /* * Calculate distance */ itemoffs = cwoffs+i*(nx+ny); ptdist2 = (double)(0); for(j=0; j<=nx-1; j++) { v = cw->ptr.p_double[itemoffs+j]-x->ptr.p_double[j]; ptdist2 = ptdist2+v*v; } /* * Skip points if distance too large */ if( ae_fp_greater_eq(ptdist2,queryr2) ) { continue; } /* * Output */ r2->ptr.p_double[*k] = ptdist2; offs->ptr.p_int[*k] = itemoffs; *k = *k+1; } return; } /* * Simple split */ if( kdnodes->ptr.p_int[rootidx]==0 ) { /* * Load: * * D dimension to split * * Split split position * * ChildLE, ChildGE - indexes of childs */ d = kdnodes->ptr.p_int[rootidx+1]; split = kdsplits->ptr.p_double[kdnodes->ptr.p_int[rootidx+2]]; childle = kdnodes->ptr.p_int[rootidx+3]; childge = kdnodes->ptr.p_int[rootidx+4]; /* * Navigate through childs */ for(i=0; i<=1; i++) { /* * Select child to process: * * ChildOffs current child offset in Nodes[] * * UpdateMin whether minimum or maximum value * of bounding box is changed on update */ updatemin = i!=0; if( i==0 ) { childoffs = childle; } else { childoffs = childge; } /* * Update bounding box and current distance */ prevdist2 = buf->curdist2; t1 = x->ptr.p_double[d]; if( updatemin ) { v = buf->curboxmin.ptr.p_double[d]; if( ae_fp_less_eq(t1,split) ) { buf->curdist2 = buf->curdist2-ae_sqr(ae_maxreal(v-t1, (double)(0), _state), _state)+ae_sqr(split-t1, _state); } buf->curboxmin.ptr.p_double[d] = split; } else { v = buf->curboxmax.ptr.p_double[d]; if( ae_fp_greater_eq(t1,split) ) { buf->curdist2 = buf->curdist2-ae_sqr(ae_maxreal(t1-v, (double)(0), _state), _state)+ae_sqr(t1-split, _state); } buf->curboxmax.ptr.p_double[d] = split; } /* * Decide: to dive into cell or not to dive */ if( ae_fp_less(buf->curdist2,queryr2) ) { rbfv2_partialqueryrec(kdnodes, kdsplits, cw, nx, ny, buf, childoffs, queryr2, x, r2, offs, k, _state); } /* * Restore bounding box and distance */ if( updatemin ) { buf->curboxmin.ptr.p_double[d] = v; } else { buf->curboxmax.ptr.p_double[d] = v; } buf->curdist2 = prevdist2; } return; } /* * Integrity failure */ ae_assert(ae_false, "PartialQueryRec: integrity check failed", _state); } /************************************************************************* This function performs partial (for just one subtree of multi-tree) counting of neighbors located in R-sphere around X. This function does not guarantee consistency of results with other partial queries, it should be used only to get approximate estimates (well, we do not use approximate algorithms, but rounding errors may give us inconsistent results in just-at-the-boundary cases). INPUT PARAMETERS: kdNodes, kdSplits, CW, NX, NY - corresponding fields of V2 model Buf - calc-buffer, this function uses following fields: * Buf.CurBoxMin - should be set by caller * Buf.CurBoxMax - should be set by caller * Buf.CurDist2 - squared distance from X to current bounding box, should be set by caller You may use preparepartialquery() function to initialize these fields. 
RootIdx - offset of partial kd-tree QueryR2 - squared query radius X - array[NX], point being queried RESULT: points count -- ALGLIB -- Copyright 20.06.2016 by Bochkanov Sergey *************************************************************************/ static ae_int_t rbfv2_partialcountrec(/* Integer */ ae_vector* kdnodes, /* Real */ ae_vector* kdsplits, /* Real */ ae_vector* cw, ae_int_t nx, ae_int_t ny, rbfv2calcbuffer* buf, ae_int_t rootidx, double queryr2, /* Real */ ae_vector* x, ae_state *_state) { ae_int_t i; ae_int_t j; double ptdist2; double v; ae_int_t cwoffs; ae_int_t cwcnt; ae_int_t itemoffs; ae_int_t d; double split; ae_int_t childle; ae_int_t childge; ae_int_t childoffs; ae_bool updatemin; double prevdist2; double t1; ae_int_t result; result = 0; /* * Leaf node. */ if( kdnodes->ptr.p_int[rootidx]>0 ) { cwcnt = kdnodes->ptr.p_int[rootidx+0]; cwoffs = kdnodes->ptr.p_int[rootidx+1]; for(i=0; i<=cwcnt-1; i++) { /* * Calculate distance */ itemoffs = cwoffs+i*(nx+ny); ptdist2 = (double)(0); for(j=0; j<=nx-1; j++) { v = cw->ptr.p_double[itemoffs+j]-x->ptr.p_double[j]; ptdist2 = ptdist2+v*v; } /* * Skip points if distance too large */ if( ae_fp_greater_eq(ptdist2,queryr2) ) { continue; } /* * Output */ result = result+1; } return result; } /* * Simple split */ if( kdnodes->ptr.p_int[rootidx]==0 ) { /* * Load: * * D dimension to split * * Split split position * * ChildLE, ChildGE - indexes of childs */ d = kdnodes->ptr.p_int[rootidx+1]; split = kdsplits->ptr.p_double[kdnodes->ptr.p_int[rootidx+2]]; childle = kdnodes->ptr.p_int[rootidx+3]; childge = kdnodes->ptr.p_int[rootidx+4]; /* * Navigate through childs */ for(i=0; i<=1; i++) { /* * Select child to process: * * ChildOffs current child offset in Nodes[] * * UpdateMin whether minimum or maximum value * of bounding box is changed on update */ updatemin = i!=0; if( i==0 ) { childoffs = childle; } else { childoffs = childge; } /* * Update bounding box and current distance */ prevdist2 = buf->curdist2; t1 = x->ptr.p_double[d]; if( updatemin ) { v = buf->curboxmin.ptr.p_double[d]; if( ae_fp_less_eq(t1,split) ) { buf->curdist2 = buf->curdist2-ae_sqr(ae_maxreal(v-t1, (double)(0), _state), _state)+ae_sqr(split-t1, _state); } buf->curboxmin.ptr.p_double[d] = split; } else { v = buf->curboxmax.ptr.p_double[d]; if( ae_fp_greater_eq(t1,split) ) { buf->curdist2 = buf->curdist2-ae_sqr(ae_maxreal(t1-v, (double)(0), _state), _state)+ae_sqr(t1-split, _state); } buf->curboxmax.ptr.p_double[d] = split; } /* * Decide: to dive into cell or not to dive */ if( ae_fp_less(buf->curdist2,queryr2) ) { result = result+rbfv2_partialcountrec(kdnodes, kdsplits, cw, nx, ny, buf, childoffs, queryr2, x, _state); } /* * Restore bounding box and distance */ if( updatemin ) { buf->curboxmin.ptr.p_double[d] = v; } else { buf->curboxmax.ptr.p_double[d] = v; } buf->curdist2 = prevdist2; } return result; } /* * Integrity failure */ ae_assert(ae_false, "PartialCountRec: integrity check failed", _state); return result; } /************************************************************************* This function performs partial (for just one subtree of multi-tree) unpack for RBF model. It appends center coordinates, weights and per-dimension radii (according to current scaling) to preallocated output array. INPUT PARAMETERS: kdNodes, kdSplits, CW, S, NX, NY - corresponding fields of V2 model RootIdx - offset of partial kd-tree R - radius for current partial tree XWR - preallocated output buffer; it is caller's responsibility to make sure that XWR has enough space. 
First K rows are already occupied. K - number of already occupied rows in XWR. OUTPUT PARAMETERS XWR - updated XWR K - updated rows count -- ALGLIB -- Copyright 20.06.2016 by Bochkanov Sergey *************************************************************************/ static void rbfv2_partialunpackrec(/* Integer */ ae_vector* kdnodes, /* Real */ ae_vector* kdsplits, /* Real */ ae_vector* cw, /* Real */ ae_vector* s, ae_int_t nx, ae_int_t ny, ae_int_t rootidx, double r, /* Real */ ae_matrix* xwr, ae_int_t* k, ae_state *_state) { ae_int_t i; ae_int_t j; ae_int_t childle; ae_int_t childge; ae_int_t itemoffs; ae_int_t cwoffs; ae_int_t cwcnt; /* * Leaf node. */ if( kdnodes->ptr.p_int[rootidx]>0 ) { cwcnt = kdnodes->ptr.p_int[rootidx+0]; cwoffs = kdnodes->ptr.p_int[rootidx+1]; for(i=0; i<=cwcnt-1; i++) { itemoffs = cwoffs+i*(nx+ny); for(j=0; j<=nx+ny-1; j++) { xwr->ptr.pp_double[*k][j] = cw->ptr.p_double[itemoffs+j]; } for(j=0; j<=nx-1; j++) { xwr->ptr.pp_double[*k][j] = xwr->ptr.pp_double[*k][j]*s->ptr.p_double[j]; } for(j=0; j<=nx-1; j++) { xwr->ptr.pp_double[*k][nx+ny+j] = r*s->ptr.p_double[j]; } *k = *k+1; } return; } /* * Simple split */ if( kdnodes->ptr.p_int[rootidx]==0 ) { /* * Load: * * ChildLE, ChildGE - indexes of childs */ childle = kdnodes->ptr.p_int[rootidx+3]; childge = kdnodes->ptr.p_int[rootidx+4]; /* * Process both parts of split */ rbfv2_partialunpackrec(kdnodes, kdsplits, cw, s, nx, ny, childle, r, xwr, k, _state); rbfv2_partialunpackrec(kdnodes, kdsplits, cw, s, nx, ny, childge, r, xwr, k, _state); return; } /* * Integrity failure */ ae_assert(ae_false, "PartialUnpackRec: integrity check failed", _state); } /************************************************************************* This function returns size of design matrix row for evaluation point X0, given: * query radius multiplier (either RBFV2NearRadius() or RBFV2FarRadius()) * hierarchy level: value in [0,NH) for single-level model, or negative value for multilevel model (all levels of hierarchy in single matrix, like one used by nonnegative RBF) INPUT PARAMETERS: kdNodes, kdSplits, CW, Ri, kdRoots, kdBoxMin, kdBoxMax, NX, NY, NH - corresponding fields of V2 model Level - value in [0,NH) for single-level design matrix, negative value for multilevel design matrix RCoeff - radius coefficient, either RBFV2NearRadius() or RBFV2FarRadius() X0 - query point CalcBuf - buffer for PreparePartialQuery(), allocated by caller RESULT: row size -- ALGLIB -- Copyright 28.09.2016 by Bochkanov Sergey *************************************************************************/ static ae_int_t rbfv2_designmatrixrowsize(/* Integer */ ae_vector* kdnodes, /* Real */ ae_vector* kdsplits, /* Real */ ae_vector* cw, /* Real */ ae_vector* ri, /* Integer */ ae_vector* kdroots, /* Real */ ae_vector* kdboxmin, /* Real */ ae_vector* kdboxmax, ae_int_t nx, ae_int_t ny, ae_int_t nh, ae_int_t level, double rcoeff, /* Real */ ae_vector* x0, rbfv2calcbuffer* calcbuf, ae_state *_state) { ae_int_t dummy; ae_int_t levelidx; ae_int_t level0; ae_int_t level1; double curradius2; ae_int_t result; ae_assert(nh>0, "DesignMatrixRowSize: integrity failure", _state); if( level>=0 ) { level0 = level; level1 = level; } else { level0 = 0; level1 = nh-1; } result = 0; for(levelidx=level0; levelidx<=level1; levelidx++) { curradius2 = ae_sqr(ri->ptr.p_double[levelidx]*rcoeff, _state); rbfv2_preparepartialquery(x0, kdboxmin, kdboxmax, nx, calcbuf, &dummy, _state); result = result+rbfv2_partialcountrec(kdnodes, kdsplits, cw, nx, ny, calcbuf, kdroots->ptr.p_int[levelidx], 
curradius2, x0, _state); } return result; } /************************************************************************* This function generates design matrix row for evaluation point X0, given: * query radius multiplier (either RBFV2NearRadius() or RBFV2FarRadius()) * hierarchy level: value in [0,NH) for single-level model, or negative value for multilevel model (all levels of hierarchy in single matrix, like one used by nonnegative RBF) INPUT PARAMETERS: kdNodes, kdSplits, CW, Ri, kdRoots, kdBoxMin, kdBoxMax, NX, NY, NH - corresponding fields of V2 model CWRange - internal array[NH+1] used by RBF construction function, stores ranges of CW occupied by NH trees. Level - value in [0,NH) for single-level design matrix, negative value for multilevel design matrix BF - basis function type RCoeff - radius coefficient, either RBFV2NearRadius() or RBFV2FarRadius() RowsPerPoint-equal to: * 1 for unpenalized regression model * 1+NX for basic form of nonsmoothness penalty Penalty - nonsmoothness penalty coefficient X0 - query point CalcBuf - buffer for PreparePartialQuery(), allocated by caller R2 - preallocated temporary buffer, size is at least NPoints; it is caller's responsibility to make sure that R2 has enough space. Offs - preallocated temporary buffer; size is at least NPoints; it is caller's responsibility to make sure that Offs has enough space. K - MUST BE ZERO ON INITIAL CALL. This variable is incremented, not set. So, any no-zero value will result in the incorrect points count being returned. RowIdx - preallocated array, at least RowSize elements RowVal - preallocated array, at least RowSize*RowsPerPoint elements RESULT: RowIdx - RowSize elements are filled with column indexes of non-zero design matrix entries RowVal - RowSize*RowsPerPoint elements are filled with design matrix values, with column RowIdx[0] being stored in first RowsPerPoint elements of RowVal, column RowIdx[1] being stored in next RowsPerPoint elements, and so on. 
First element in contiguous set of RowsPerPoint elements corresponds to RowSize - number of columns per row -- ALGLIB -- Copyright 28.09.2016 by Bochkanov Sergey *************************************************************************/ static void rbfv2_designmatrixgeneraterow(/* Integer */ ae_vector* kdnodes, /* Real */ ae_vector* kdsplits, /* Real */ ae_vector* cw, /* Real */ ae_vector* ri, /* Integer */ ae_vector* kdroots, /* Real */ ae_vector* kdboxmin, /* Real */ ae_vector* kdboxmax, /* Integer */ ae_vector* cwrange, ae_int_t nx, ae_int_t ny, ae_int_t nh, ae_int_t level, ae_int_t bf, double rcoeff, ae_int_t rowsperpoint, double penalty, /* Real */ ae_vector* x0, rbfv2calcbuffer* calcbuf, /* Real */ ae_vector* tmpr2, /* Integer */ ae_vector* tmpoffs, /* Integer */ ae_vector* rowidx, /* Real */ ae_vector* rowval, ae_int_t* rowsize, ae_state *_state) { ae_int_t j; ae_int_t k; ae_int_t cnt; ae_int_t levelidx; ae_int_t level0; ae_int_t level1; double invri2; double curradius2; double val; double dval; double d2val; *rowsize = 0; ae_assert(nh>0, "DesignMatrixGenerateRow: integrity failure (a)", _state); ae_assert(rowsperpoint==1||rowsperpoint==1+nx, "DesignMatrixGenerateRow: integrity failure (b)", _state); if( level>=0 ) { level0 = level; level1 = level; } else { level0 = 0; level1 = nh-1; } *rowsize = 0; for(levelidx=level0; levelidx<=level1; levelidx++) { curradius2 = ae_sqr(ri->ptr.p_double[levelidx]*rcoeff, _state); invri2 = 1/ae_sqr(ri->ptr.p_double[levelidx], _state); rbfv2_preparepartialquery(x0, kdboxmin, kdboxmax, nx, calcbuf, &cnt, _state); rbfv2_partialqueryrec(kdnodes, kdsplits, cw, nx, ny, calcbuf, kdroots->ptr.p_int[levelidx], curradius2, x0, tmpr2, tmpoffs, &cnt, _state); ae_assert(tmpr2->cnt>=cnt, "DesignMatrixRowSize: integrity failure (c)", _state); ae_assert(tmpoffs->cnt>=cnt, "DesignMatrixRowSize: integrity failure (d)", _state); ae_assert(rowidx->cnt>=*rowsize+cnt, "DesignMatrixRowSize: integrity failure (e)", _state); ae_assert(rowval->cnt>=rowsperpoint*(*rowsize+cnt), "DesignMatrixRowSize: integrity failure (f)", _state); for(j=0; j<=cnt-1; j++) { /* * Generate element corresponding to fitting error. * Store derivative information which may be required later. 
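* 
* NOTE (illustrative): with Arg = |X0-C|^2/Ri^2, the nonsmoothness penalty 
* entries generated below are per-axis second derivatives of the basis 
* function obtained by the chain rule, 
*   d2/dXk2 f(Arg) = f'(Arg)*2/Ri^2 + f''(Arg)*(2*(X0[k]-C[k])/Ri^2)^2, 
* which is exactly the expression multiplied by Penalty in the code. 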
*/ ae_assert((tmpoffs->ptr.p_int[j]-cwrange->ptr.p_int[level0])%(nx+ny)==0, "DesignMatrixRowSize: integrity failure (g)", _state); rbfv2basisfuncdiff2(bf, tmpr2->ptr.p_double[j]*invri2, &val, &dval, &d2val, _state); rowidx->ptr.p_int[*rowsize+j] = (tmpoffs->ptr.p_int[j]-cwrange->ptr.p_int[level0])/(nx+ny); rowval->ptr.p_double[(*rowsize+j)*rowsperpoint+0] = val; if( rowsperpoint==1 ) { continue; } /* * Generate elements corresponding to nonsmoothness penalty */ ae_assert(rowsperpoint==1+nx, "DesignMatrixRowSize: integrity failure (h)", _state); for(k=0; k<=nx-1; k++) { rowval->ptr.p_double[(*rowsize+j)*rowsperpoint+1+k] = penalty*(dval*2*invri2+d2val*ae_sqr(2*(x0->ptr.p_double[k]-cw->ptr.p_double[tmpoffs->ptr.p_int[j]+k])*invri2, _state)); } } /* * Update columns counter */ *rowsize = *rowsize+cnt; } } void _rbfv2calcbuffer_init(void* _p, ae_state *_state) { rbfv2calcbuffer *p = (rbfv2calcbuffer*)_p; ae_touch_ptr((void*)p); ae_vector_init(&p->x, 0, DT_REAL, _state); ae_vector_init(&p->curboxmin, 0, DT_REAL, _state); ae_vector_init(&p->curboxmax, 0, DT_REAL, _state); ae_vector_init(&p->x123, 0, DT_REAL, _state); ae_vector_init(&p->y123, 0, DT_REAL, _state); } void _rbfv2calcbuffer_init_copy(void* _dst, void* _src, ae_state *_state) { rbfv2calcbuffer *dst = (rbfv2calcbuffer*)_dst; rbfv2calcbuffer *src = (rbfv2calcbuffer*)_src; ae_vector_init_copy(&dst->x, &src->x, _state); ae_vector_init_copy(&dst->curboxmin, &src->curboxmin, _state); ae_vector_init_copy(&dst->curboxmax, &src->curboxmax, _state); dst->curdist2 = src->curdist2; ae_vector_init_copy(&dst->x123, &src->x123, _state); ae_vector_init_copy(&dst->y123, &src->y123, _state); } void _rbfv2calcbuffer_clear(void* _p) { rbfv2calcbuffer *p = (rbfv2calcbuffer*)_p; ae_touch_ptr((void*)p); ae_vector_clear(&p->x); ae_vector_clear(&p->curboxmin); ae_vector_clear(&p->curboxmax); ae_vector_clear(&p->x123); ae_vector_clear(&p->y123); } void _rbfv2calcbuffer_destroy(void* _p) { rbfv2calcbuffer *p = (rbfv2calcbuffer*)_p; ae_touch_ptr((void*)p); ae_vector_destroy(&p->x); ae_vector_destroy(&p->curboxmin); ae_vector_destroy(&p->curboxmax); ae_vector_destroy(&p->x123); ae_vector_destroy(&p->y123); } void _rbfv2model_init(void* _p, ae_state *_state) { rbfv2model *p = (rbfv2model*)_p; ae_touch_ptr((void*)p); ae_vector_init(&p->ri, 0, DT_REAL, _state); ae_vector_init(&p->s, 0, DT_REAL, _state); ae_vector_init(&p->kdroots, 0, DT_INT, _state); ae_vector_init(&p->kdnodes, 0, DT_INT, _state); ae_vector_init(&p->kdsplits, 0, DT_REAL, _state); ae_vector_init(&p->kdboxmin, 0, DT_REAL, _state); ae_vector_init(&p->kdboxmax, 0, DT_REAL, _state); ae_vector_init(&p->cw, 0, DT_REAL, _state); ae_matrix_init(&p->v, 0, 0, DT_REAL, _state); _rbfv2calcbuffer_init(&p->calcbuf, _state); } void _rbfv2model_init_copy(void* _dst, void* _src, ae_state *_state) { rbfv2model *dst = (rbfv2model*)_dst; rbfv2model *src = (rbfv2model*)_src; dst->ny = src->ny; dst->nx = src->nx; dst->bf = src->bf; dst->nh = src->nh; ae_vector_init_copy(&dst->ri, &src->ri, _state); ae_vector_init_copy(&dst->s, &src->s, _state); ae_vector_init_copy(&dst->kdroots, &src->kdroots, _state); ae_vector_init_copy(&dst->kdnodes, &src->kdnodes, _state); ae_vector_init_copy(&dst->kdsplits, &src->kdsplits, _state); ae_vector_init_copy(&dst->kdboxmin, &src->kdboxmin, _state); ae_vector_init_copy(&dst->kdboxmax, &src->kdboxmax, _state); ae_vector_init_copy(&dst->cw, &src->cw, _state); ae_matrix_init_copy(&dst->v, &src->v, _state); dst->lambdareg = src->lambdareg; dst->maxits = src->maxits; dst->supportr = 
src->supportr; dst->basisfunction = src->basisfunction; _rbfv2calcbuffer_init_copy(&dst->calcbuf, &src->calcbuf, _state); } void _rbfv2model_clear(void* _p) { rbfv2model *p = (rbfv2model*)_p; ae_touch_ptr((void*)p); ae_vector_clear(&p->ri); ae_vector_clear(&p->s); ae_vector_clear(&p->kdroots); ae_vector_clear(&p->kdnodes); ae_vector_clear(&p->kdsplits); ae_vector_clear(&p->kdboxmin); ae_vector_clear(&p->kdboxmax); ae_vector_clear(&p->cw); ae_matrix_clear(&p->v); _rbfv2calcbuffer_clear(&p->calcbuf); } void _rbfv2model_destroy(void* _p) { rbfv2model *p = (rbfv2model*)_p; ae_touch_ptr((void*)p); ae_vector_destroy(&p->ri); ae_vector_destroy(&p->s); ae_vector_destroy(&p->kdroots); ae_vector_destroy(&p->kdnodes); ae_vector_destroy(&p->kdsplits); ae_vector_destroy(&p->kdboxmin); ae_vector_destroy(&p->kdboxmax); ae_vector_destroy(&p->cw); ae_matrix_destroy(&p->v); _rbfv2calcbuffer_destroy(&p->calcbuf); } void _rbfv2gridcalcbuffer_init(void* _p, ae_state *_state) { rbfv2gridcalcbuffer *p = (rbfv2gridcalcbuffer*)_p; ae_touch_ptr((void*)p); _rbfv2calcbuffer_init(&p->calcbuf, _state); ae_vector_init(&p->cx, 0, DT_REAL, _state); ae_vector_init(&p->rx, 0, DT_REAL, _state); ae_vector_init(&p->ry, 0, DT_REAL, _state); ae_vector_init(&p->tx, 0, DT_REAL, _state); ae_vector_init(&p->ty, 0, DT_REAL, _state); ae_vector_init(&p->rf, 0, DT_BOOL, _state); } void _rbfv2gridcalcbuffer_init_copy(void* _dst, void* _src, ae_state *_state) { rbfv2gridcalcbuffer *dst = (rbfv2gridcalcbuffer*)_dst; rbfv2gridcalcbuffer *src = (rbfv2gridcalcbuffer*)_src; _rbfv2calcbuffer_init_copy(&dst->calcbuf, &src->calcbuf, _state); ae_vector_init_copy(&dst->cx, &src->cx, _state); ae_vector_init_copy(&dst->rx, &src->rx, _state); ae_vector_init_copy(&dst->ry, &src->ry, _state); ae_vector_init_copy(&dst->tx, &src->tx, _state); ae_vector_init_copy(&dst->ty, &src->ty, _state); ae_vector_init_copy(&dst->rf, &src->rf, _state); } void _rbfv2gridcalcbuffer_clear(void* _p) { rbfv2gridcalcbuffer *p = (rbfv2gridcalcbuffer*)_p; ae_touch_ptr((void*)p); _rbfv2calcbuffer_clear(&p->calcbuf); ae_vector_clear(&p->cx); ae_vector_clear(&p->rx); ae_vector_clear(&p->ry); ae_vector_clear(&p->tx); ae_vector_clear(&p->ty); ae_vector_clear(&p->rf); } void _rbfv2gridcalcbuffer_destroy(void* _p) { rbfv2gridcalcbuffer *p = (rbfv2gridcalcbuffer*)_p; ae_touch_ptr((void*)p); _rbfv2calcbuffer_destroy(&p->calcbuf); ae_vector_destroy(&p->cx); ae_vector_destroy(&p->rx); ae_vector_destroy(&p->ry); ae_vector_destroy(&p->tx); ae_vector_destroy(&p->ty); ae_vector_destroy(&p->rf); } void _rbfv2report_init(void* _p, ae_state *_state) { rbfv2report *p = (rbfv2report*)_p; ae_touch_ptr((void*)p); } void _rbfv2report_init_copy(void* _dst, void* _src, ae_state *_state) { rbfv2report *dst = (rbfv2report*)_dst; rbfv2report *src = (rbfv2report*)_src; dst->terminationtype = src->terminationtype; dst->maxerror = src->maxerror; dst->rmserror = src->rmserror; } void _rbfv2report_clear(void* _p) { rbfv2report *p = (rbfv2report*)_p; ae_touch_ptr((void*)p); } void _rbfv2report_destroy(void* _p) { rbfv2report *p = (rbfv2report*)_p; ae_touch_ptr((void*)p); } /************************************************************************* Fits minimum circumscribed (MCC) circle (or NX-dimensional sphere) to data (a set of points in NX-dimensional space). INPUT PARAMETERS: XY - array[NPoints,NX] (or larger), contains dataset. One row = one point in NX-dimensional space. 
NPoints - dataset size, NPoints>0 NX - space dimensionality, NX>0 (1, 2, 3, 4, 5 and so on) OUTPUT PARAMETERS: CX - central point for a sphere RHi - radius NOTE: this function is an easy-to-use wrapper around more powerful "expert" function nsfitspherex(). This wrapper is optimized for ease of use and stability - at the cost of somewhat lower performance (we have to use very tight stopping criteria for inner optimizer because we want to make sure that it will converge on any dataset). If you are ready to experiment with settings of "expert" function, you can achieve ~2-4x speedup over standard "bulletproof" settings. -- ALGLIB -- Copyright 14.04.2017 by Bochkanov Sergey *************************************************************************/ void nsfitspheremcc(/* Real */ ae_matrix* xy, ae_int_t npoints, ae_int_t nx, /* Real */ ae_vector* cx, double* rhi, ae_state *_state) { double dummy; ae_vector_clear(cx); *rhi = 0; nsfitspherex(xy, npoints, nx, 1, 0.0, 0, 0.0, cx, &dummy, rhi, _state); } /************************************************************************* Fits maximum inscribed circle (or NX-dimensional sphere) to data (a set of points in NX-dimensional space). INPUT PARAMETERS: XY - array[NPoints,NX] (or larger), contains dataset. One row = one point in NX-dimensional space. NPoints - dataset size, NPoints>0 NX - space dimensionality, NX>0 (1, 2, 3, 4, 5 and so on) OUTPUT PARAMETERS: CX - central point for a sphere RLo - radius NOTE: this function is an easy-to-use wrapper around more powerful "expert" function nsfitspherex(). This wrapper is optimized for ease of use and stability - at the cost of somewhat lower performance (we have to use very tight stopping criteria for inner optimizer because we want to make sure that it will converge on any dataset). If you are ready to experiment with settings of "expert" function, you can achieve ~2-4x speedup over standard "bulletproof" settings. -- ALGLIB -- Copyright 14.04.2017 by Bochkanov Sergey *************************************************************************/ void nsfitspheremic(/* Real */ ae_matrix* xy, ae_int_t npoints, ae_int_t nx, /* Real */ ae_vector* cx, double* rlo, ae_state *_state) { double dummy; ae_vector_clear(cx); *rlo = 0; nsfitspherex(xy, npoints, nx, 2, 0.0, 0, 0.0, cx, rlo, &dummy, _state); } /************************************************************************* Fits minimum zone circle (or NX-dimensional sphere) to data (a set of points in NX-dimensional space). INPUT PARAMETERS: XY - array[NPoints,NX] (or larger), contains dataset. One row = one point in NX-dimensional space. NPoints - dataset size, NPoints>0 NX - space dimensionality, NX>0 (1, 2, 3, 4, 5 and so on) OUTPUT PARAMETERS: CX - central point for a sphere RLo - radius of inscribed circle RHo - radius of circumscribed circle NOTE: this function is an easy-to-use wrapper around more powerful "expert" function nsfitspherex(). This wrapper is optimized for ease of use and stability - at the cost of somewhat lower performance (we have to use very tight stopping criteria for inner optimizer because we want to make sure that it will converge on any dataset). If you are ready to experiment with settings of "expert" function, you can achieve ~2-4x speedup over standard "bulletproof" settings. 
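
NOTE: USAGE SKETCH

The fragment below is an illustrative sketch only (it is not part of the library): it fits all three circle types to a small made-up 2D dataset through the C-core entry points listed in this section, and it assumes the ALGLIB core helpers (ae_state_init, ae_matrix_init, ae_vector_init) whose initialization pattern is used throughout this listing; error handling and cleanup are omitted for brevity:

    ae_state  s;
    ae_matrix xy;
    ae_vector cx;
    double    rlo, rhi;
    ae_state_init(&s);
    ae_vector_init(&cx, 0, DT_REAL, &s);
    ae_matrix_init(&xy, 4, 2, DT_REAL, &s);            // 4 points in 2D, made-up values
    xy.ptr.pp_double[0][0] = 0.0; xy.ptr.pp_double[0][1] = 0.0;
    xy.ptr.pp_double[1][0] = 1.0; xy.ptr.pp_double[1][1] = 0.1;
    xy.ptr.pp_double[2][0] = 2.1; xy.ptr.pp_double[2][1] = 0.0;
    xy.ptr.pp_double[3][0] = 1.0; xy.ptr.pp_double[3][1] = 1.0;
    nsfitspheremcc(&xy, 4, 2, &cx, &rhi, &s);          // minimum circumscribed circle
    nsfitspheremic(&xy, 4, 2, &cx, &rlo, &s);          // maximum inscribed circle
    nsfitspheremzc(&xy, 4, 2, &cx, &rlo, &rhi, &s);    // minimum zone circle
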
-- ALGLIB -- Copyright 14.04.2017 by Bochkanov Sergey *************************************************************************/ void nsfitspheremzc(/* Real */ ae_matrix* xy, ae_int_t npoints, ae_int_t nx, /* Real */ ae_vector* cx, double* rlo, double* rhi, ae_state *_state) { ae_vector_clear(cx); *rlo = 0; *rhi = 0; nsfitspherex(xy, npoints, nx, 3, 0.0, 0, 0.0, cx, rlo, rhi, _state); } /************************************************************************* Fitting minimum circumscribed, maximum inscribed or minimum zone circles (or NX-dimensional spheres) to data (a set of points in NX-dimensional space). This is an expert function which allows you to tweak many parameters of the underlying nonlinear solver: * stopping criteria for inner iterations * number of outer iterations * penalty coefficient used to handle nonlinear constraints (we convert the unconstrained nonsmooth optimization problem involving max() and/or min() operations to a quadratically constrained smooth one). You may tweak all these parameters or only some of them, leaving other ones at their default state - just specify zero value, and solver will fill it with appropriate default one. These comments also include some discussion of approach used to handle such unusual fitting problem, its stability, drawbacks of alternative methods, and convergence properties. INPUT PARAMETERS: XY - array[NPoints,NX] (or larger), contains dataset. One row = one point in NX-dimensional space. NPoints - dataset size, NPoints>0 NX - space dimensionality, NX>0 (1, 2, 3, 4, 5 and so on) ProblemType-used to encode problem type: * 1 for minimum circumscribed circle/sphere fitting (MCC) * 2 for maximum inscribed circle/sphere fitting (MIC) * 3 for minimum zone circle fitting (difference between Rhi and Rlo is minimized), denoted as MZC EpsX - stopping condition for NLC optimizer: * must be non-negative * use 0 to choose default value (1.0E-12 is used by default) * you may specify larger values, up to 1.0E-6, if you want to speed-up solver; NLC solver performs several preconditioned outer iterations, so final result typically has precision much better than EpsX. AULIts - number of outer iterations performed by NLC optimizer: * must be non-negative * use 0 to choose default value (20 is used by default) * you may specify values smaller than 20 if you want to speed up solver; 10 often results in good combination of precision and speed; sometimes you may get good results with just 6 outer iterations. Penalty - penalty coefficient for NLC optimizer: * must be non-negative * use 0 to choose default value (1.0E6 in current version) * it should be really large, 1.0E6...1.0E7 is a good value to start from; * generally, default value is good enough OUTPUT PARAMETERS: CX - central point for a sphere RLo - radius: * for ProblemType=2,3, radius of the inscribed sphere * for ProblemType=1 - zero RHi - radius: * for ProblemType=1,3, radius of the circumscribed sphere * for ProblemType=2 - zero NOTE: ON THE UNIQUENESS OF SOLUTIONS ALGLIB provides solution to several related circle fitting problems: MCC (minimum circumscribed), MIC (maximum inscribed) and MZC (minimum zone) fitting. It is important to note that among these problems only MCC is convex and has unique solution independently from starting point. As for MIC, it may (or may not, depending on dataset properties) have multiple solutions, and it always has one degenerate solution C=infinity which corresponds to infinitely large radius.
Thus, there are no guarantees that solution to MIC returned by this solver will be the best one (and no one can provide you with such guarantee because the problem is NP-hard). The only guarantee you have is that this solution is locally optimal, i.e. it can not be improved by infinitesimally small tweaks in the parameters. It is also possible to "run away" to infinity when started from a bad initial point located outside of the point cloud (or when the point cloud does not span entire circumference/surface of the sphere). Finally, MZC (minimum zone circle) stands somewhere between MCC and MIC in stability. It is somewhat regularized by the "circumscribed" term of the merit function; however, solutions to MZC may be non-unique, and in some unlucky cases it is also possible to "run away to infinity". NOTE: ON THE NONLINEARLY CONSTRAINED PROGRAMMING APPROACH The problem formulation for MCC (minimum circumscribed circle; for the sake of simplicity we omit MZC and MIC here) is:

    min_C  max_i  || XY[i] - C ||^2

i.e. it is an unconstrained nonsmooth optimization problem of finding the "best" central point, with radius R being unambiguously determined from C. In order to move away from non-smoothness we use the following reformulation:

    min_{C,R}  R    subject to    R>=0,   || XY[i] - C ||^2 <= R^2   for all i

i.e. it becomes a smooth quadratically constrained optimization problem with a linear target function. Such problem statement is 100% equivalent to the original nonsmooth one, but much easier to approach. We solve it with the MinNLC solver provided by ALGLIB. NOTE: ON INSTABILITY OF SEQUENTIAL LINEAR PROGRAMMING APPROACH ALGLIB has a nonlinearly constrained solver which proved to be stable on such problems. However, some authors proposed to linearize constraints in the vicinity of the current approximation (Ci,Ri) and to get the next approximate solution (Ci+1,Ri+1) as the solution to a linear programming problem. Obviously, LP problems are easier than nonlinearly constrained ones. Indeed, the SLP approach to MCC/MIC/MZC resulted in a ~10-20x increase in performance (when compared with the NLC solver). However, it turned out that in some cases the linearized model fails to predict the correct direction for the next step and tells us that we converged to the solution even when we are still 2-4 digits of precision away from it. It is important to note that this is not a failure of the LP solver - it is a failure of the linear model; even when solved exactly, it fails to handle subtle nonlinearities which arise near the solution. We validated it by comparing results returned by the ALGLIB linear solver with those of MATLAB. In our experiments with the SLP solver: * MCC failed most often, on both realistic and synthetic datasets * MIC sometimes failed, but sometimes succeeded * MZC often succeeded; our guess is that the presence of two independent sets of constraints (one set for Rlo and another one for Rhi) and two terms in the target function (Rlo and Rhi) regularizes the task, so when the linear model fails to handle nonlinearities from Rlo, it uses Rhi as a hint (and vice versa). Because the SLP approach failed to achieve stable results, we do not include it in ALGLIB.
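
NOTE: USAGE SKETCH

The call below is an illustrative sketch only (not part of the library); it assumes that XY, CX and the ae_state S were prepared as in the wrapper sketch given earlier in this section, and it shows how the expert settings may be relaxed to trade some precision for speed:

    double rlo, rhi;
    nsfitspherex(&xy, 4, 2,
                 3,         // ProblemType: 3 = minimum zone circle (MZC)
                 1.0E-9,    // EpsX: looser than the 1.0E-12 default
                 10,        // AULIts: fewer outer iterations than the default 20
                 0.0,       // Penalty: 0 selects the default coefficient
                 &cx, &rlo, &rhi, &s);
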
-- ALGLIB -- Copyright 14.04.2017 by Bochkanov Sergey *************************************************************************/ void nsfitspherex(/* Real */ ae_matrix* xy, ae_int_t npoints, ae_int_t nx, ae_int_t problemtype, double epsx, ae_int_t aulits, double penalty, /* Real */ ae_vector* cx, double* rlo, double* rhi, ae_state *_state) { ae_frame _frame_block; nsfitinternalreport rep; ae_frame_make(_state, &_frame_block); ae_vector_clear(cx); *rlo = 0; *rhi = 0; _nsfitinternalreport_init(&rep, _state); ae_assert(ae_isfinite(penalty, _state)&&ae_fp_greater_eq(penalty,(double)(0)), "NSFitSphereX: Penalty<0 or is not finite", _state); ae_assert(ae_isfinite(epsx, _state)&&ae_fp_greater_eq(epsx,(double)(0)), "NSFitSphereX: EpsX<0 or is not finite", _state); ae_assert(aulits>=0, "NSFitSphereX: AULIts<0", _state); nsfitsphereinternal(xy, npoints, nx, problemtype, 0, epsx, aulits, penalty, cx, rlo, rhi, &rep, _state); ae_frame_leave(_state); } /************************************************************************* Fitting minimum circumscribed, maximum inscribed or minimum zone circles (or NX-dimensional spheres) to data (a set of points in NX-dimensional space). Internal computational function. INPUT PARAMETERS: XY - array[NPoints,NX] (or larger), contains dataset. One row = one point in NX-dimensional space. NPoints - dataset size, NPoints>0 NX - space dimensionality, NX>0 (1, 2, 3, 4, 5 and so on) ProblemType-used to encode problem type: * 1 for minimum circumscribed circle/sphere fitting (MCC) * 2 for maximum inscribed circle/sphere fitting (MIC) * 3 for minimum zone circle fitting (difference between Rhi and Rlo is minimized), denoted as MZC SolverType- solver to use: * 0 use best solver available (1 in current version) * 1 use nonlinearly constrained optimization approach, NLC (it is roughly 10-20 times slower than SLP, but much more stable) * 2 use special fast IMPRECISE solver, sequential linear programming approach; SLP is fast, but sometimes fails to converge with more than 3 digits of precision; see comments below. NOT RECOMMENDED UNLESS YOU REALLY NEED HIGH PERFORMANCE AT THE COST OF SOME PRECISION. EpsX - stopping criteria for SLP and NLC optimizers: * must be non-negative * use 0 to choose default value (1.0E-12 is used by default) * if you use SLP solver, you should use default values * if you use NLC solver, you may specify larger values, up to 1.0E-6, if you want to speed-up solver; NLC solver performs several preconditioned outer iterations, so final result typically has precision much better than EpsX. 
AULIts - number of iterations performed by NLC optimizer: * must be non-negative * use 0 to choose default value (20 is used by default) * you may specify values smaller than 20 if you want to speed up solver; 10 often results in good combination of precision and speed Penalty - penalty coefficient for NLC optimizer (ignored for SLP): * must be non-negative * use 0 to choose default value (1.0E6 in current version) * it should be really large, 1.0E6...1.0E7 is a good value to start from; * generally, default value is good enough * ignored by SLP optimizer OUTPUT PARAMETERS: CX - central point for a sphere RLo - radius: * for ProblemType=2,3, radius of the inscribed sphere * for ProblemType=1 - zero RHo - radius: * for ProblemType=1,3, radius of the circumscribed sphere * for ProblemType=2 - zero -- ALGLIB -- Copyright 14.04.2017 by Bochkanov Sergey *************************************************************************/ void nsfitsphereinternal(/* Real */ ae_matrix* xy, ae_int_t npoints, ae_int_t nx, ae_int_t problemtype, ae_int_t solvertype, double epsx, ae_int_t aulits, double penalty, /* Real */ ae_vector* cx, double* rlo, double* rhi, nsfitinternalreport* rep, ae_state *_state) { ae_frame _frame_block; ae_int_t i; ae_int_t j; double v; double vv; ae_int_t cpr; ae_bool userlo; ae_bool userhi; double vlo; double vhi; ae_vector vmin; ae_vector vmax; double spread; ae_vector pcr; ae_vector scr; ae_vector bl; ae_vector bu; ae_int_t suboffset; ae_int_t dstrow; minnlcstate nlcstate; minnlcreport nlcrep; ae_matrix cmatrix; ae_vector ct; ae_int_t outeridx; ae_int_t maxouterits; ae_int_t maxits; double xsafeguard; double bi; minbleicstate blcstate; minbleicreport blcrep; ae_vector prevc; ae_frame_make(_state, &_frame_block); ae_vector_clear(cx); *rlo = 0; *rhi = 0; _nsfitinternalreport_clear(rep); ae_vector_init(&vmin, 0, DT_REAL, _state); ae_vector_init(&vmax, 0, DT_REAL, _state); ae_vector_init(&pcr, 0, DT_REAL, _state); ae_vector_init(&scr, 0, DT_REAL, _state); ae_vector_init(&bl, 0, DT_REAL, _state); ae_vector_init(&bu, 0, DT_REAL, _state); _minnlcstate_init(&nlcstate, _state); _minnlcreport_init(&nlcrep, _state); ae_matrix_init(&cmatrix, 0, 0, DT_REAL, _state); ae_vector_init(&ct, 0, DT_INT, _state); _minbleicstate_init(&blcstate, _state); _minbleicreport_init(&blcrep, _state); ae_vector_init(&prevc, 0, DT_REAL, _state); /* * Check input parameters */ ae_assert(npoints>0, "NSFitSphereX: NPoints<=0", _state); ae_assert(nx>0, "NSFitSphereX: NX<=0", _state); ae_assert(apservisfinitematrix(xy, npoints, nx, _state), "NSFitSphereX: XY contains infinite or NAN values", _state); ae_assert(problemtype>=1&&problemtype<=3, "NSFitSphereX: ProblemType is neither 1, 2 or 3", _state); ae_assert(solvertype>=0&&solvertype<=2, "NSFitSphereX: ProblemType is neither 1, 2 or 3", _state); ae_assert(ae_isfinite(penalty, _state)&&ae_fp_greater_eq(penalty,(double)(0)), "NSFitSphereX: Penalty<0 or is not finite", _state); ae_assert(ae_isfinite(epsx, _state)&&ae_fp_greater_eq(epsx,(double)(0)), "NSFitSphereX: EpsX<0 or is not finite", _state); ae_assert(aulits>=0, "NSFitSphereX: AULIts<0", _state); if( solvertype==0 ) { solvertype = 1; } if( ae_fp_eq(penalty,(double)(0)) ) { penalty = 1.0E6; } if( ae_fp_eq(epsx,(double)(0)) ) { epsx = 1.0E-12; } if( aulits==0 ) { aulits = 20; } /* * Prepare problem metrics */ userlo = problemtype==2||problemtype==3; userhi = problemtype==1||problemtype==3; if( userlo&&userhi ) { cpr = 2; } else { cpr = 1; } if( userlo ) { vlo = (double)(1); } else { vlo = (double)(0); } if( userhi 
) { vhi = (double)(1); } else { vhi = (double)(0); } maxouterits = 10; maxits = 10000; xsafeguard = (double)(10); rep->nfev = 0; rep->iterationscount = 0; /* * Determine initial values, initial estimates and spread of the points */ ae_vector_set_length(&vmin, nx, _state); ae_vector_set_length(&vmax, nx, _state); ae_vector_set_length(cx, nx, _state); for(j=0; j<=nx-1; j++) { vmin.ptr.p_double[j] = xy->ptr.pp_double[0][j]; vmax.ptr.p_double[j] = xy->ptr.pp_double[0][j]; cx->ptr.p_double[j] = (double)(0); } for(i=0; i<=npoints-1; i++) { for(j=0; j<=nx-1; j++) { cx->ptr.p_double[j] = cx->ptr.p_double[j]+xy->ptr.pp_double[i][j]; vmin.ptr.p_double[j] = ae_minreal(vmin.ptr.p_double[j], xy->ptr.pp_double[i][j], _state); vmax.ptr.p_double[j] = ae_maxreal(vmax.ptr.p_double[j], xy->ptr.pp_double[i][j], _state); } } spread = (double)(0); for(j=0; j<=nx-1; j++) { cx->ptr.p_double[j] = cx->ptr.p_double[j]/npoints; spread = ae_maxreal(spread, vmax.ptr.p_double[j]-vmin.ptr.p_double[j], _state); } *rlo = ae_maxrealnumber; *rhi = (double)(0); for(i=0; i<=npoints-1; i++) { v = (double)(0); for(j=0; j<=nx-1; j++) { v = v+ae_sqr(xy->ptr.pp_double[i][j]-cx->ptr.p_double[j], _state); } v = ae_sqrt(v, _state); *rhi = ae_maxreal(*rhi, v, _state); *rlo = ae_minreal(*rlo, v, _state); } /* * Handle degenerate case of zero spread */ if( ae_fp_eq(spread,(double)(0)) ) { for(j=0; j<=nx-1; j++) { cx->ptr.p_double[j] = vmin.ptr.p_double[j]; } *rhi = (double)(0); *rlo = (double)(0); ae_frame_leave(_state); return; } /* * Prepare initial point for optimizer, scale vector and box constraints */ ae_vector_set_length(&pcr, nx+2, _state); ae_vector_set_length(&scr, nx+2, _state); ae_vector_set_length(&bl, nx+2, _state); ae_vector_set_length(&bu, nx+2, _state); for(j=0; j<=nx-1; j++) { pcr.ptr.p_double[j] = cx->ptr.p_double[j]; scr.ptr.p_double[j] = 0.1*spread; bl.ptr.p_double[j] = cx->ptr.p_double[j]-xsafeguard*spread; bu.ptr.p_double[j] = cx->ptr.p_double[j]+xsafeguard*spread; } pcr.ptr.p_double[nx+0] = *rlo; pcr.ptr.p_double[nx+1] = *rhi; scr.ptr.p_double[nx+0] = 0.5*spread; scr.ptr.p_double[nx+1] = 0.5*spread; bl.ptr.p_double[nx+0] = (double)(0); bl.ptr.p_double[nx+1] = (double)(0); bu.ptr.p_double[nx+0] = 2*(*rhi); bu.ptr.p_double[nx+1] = 2*(*rhi); /* * Solve with NLC solver; problem is treated as general nonlinearly constrained * programming, with augmented Lagrangian solver being used. 
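 *
 * The decision vector is packed as [ C[0..NX-1], RLo, RHi ]. The target
 * function is vHi*RHi-vLo*RLo (i.e. RHi for MCC, -RLo for MIC, RHi-RLo for
 * MZC), and for every dataset point at most two nonlinear inequality
 * constraints are generated: |X[i]-C|^2-RHi^2<=0 on the circumscribed side
 * and RLo^2-|X[i]-C|^2<=0 on the inscribed side.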
*/ if( solvertype==1 ) { minnlccreate(nx+2, &pcr, &nlcstate, _state); minnlcsetscale(&nlcstate, &scr, _state); minnlcsetbc(&nlcstate, &bl, &bu, _state); minnlcsetnlc(&nlcstate, 0, cpr*npoints, _state); minnlcsetcond(&nlcstate, (double)(0), (double)(0), epsx, maxits, _state); minnlcsetprecexactrobust(&nlcstate, 5, _state); minnlcsetstpmax(&nlcstate, 0.1, _state); minnlcsetalgoaul(&nlcstate, penalty, aulits, _state); minnlcrestartfrom(&nlcstate, &pcr, _state); while(minnlciteration(&nlcstate, _state)) { if( nlcstate.needfij ) { inc(&rep->nfev, _state); nlcstate.fi.ptr.p_double[0] = vhi*nlcstate.x.ptr.p_double[nx+1]-vlo*nlcstate.x.ptr.p_double[nx+0]; for(j=0; j<=nx-1; j++) { nlcstate.j.ptr.pp_double[0][j] = (double)(0); } nlcstate.j.ptr.pp_double[0][nx+0] = -1*vlo; nlcstate.j.ptr.pp_double[0][nx+1] = 1*vhi; for(i=0; i<=npoints-1; i++) { suboffset = 0; if( userhi ) { dstrow = 1+cpr*i+suboffset; v = (double)(0); for(j=0; j<=nx-1; j++) { vv = nlcstate.x.ptr.p_double[j]-xy->ptr.pp_double[i][j]; v = v+vv*vv; nlcstate.j.ptr.pp_double[dstrow][j] = 2*vv; } vv = nlcstate.x.ptr.p_double[nx+1]; v = v-vv*vv; nlcstate.j.ptr.pp_double[dstrow][nx+0] = (double)(0); nlcstate.j.ptr.pp_double[dstrow][nx+1] = -2*vv; nlcstate.fi.ptr.p_double[dstrow] = v; inc(&suboffset, _state); } if( userlo ) { dstrow = 1+cpr*i+suboffset; v = (double)(0); for(j=0; j<=nx-1; j++) { vv = nlcstate.x.ptr.p_double[j]-xy->ptr.pp_double[i][j]; v = v-vv*vv; nlcstate.j.ptr.pp_double[dstrow][j] = -2*vv; } vv = nlcstate.x.ptr.p_double[nx+0]; v = v+vv*vv; nlcstate.j.ptr.pp_double[dstrow][nx+0] = 2*vv; nlcstate.j.ptr.pp_double[dstrow][nx+1] = (double)(0); nlcstate.fi.ptr.p_double[dstrow] = v; inc(&suboffset, _state); } ae_assert(suboffset==cpr, "Assertion failed", _state); } continue; } ae_assert(ae_false, "Assertion failed", _state); } minnlcresults(&nlcstate, &pcr, &nlcrep, _state); ae_assert(nlcrep.terminationtype>0, "NSFitSphereX: unexpected failure of NLC solver", _state); rep->iterationscount = rep->iterationscount+nlcrep.iterationscount; /* * Offload center coordinates from PCR to CX, * re-calculate exact value of RLo/RHi using CX. */ for(j=0; j<=nx-1; j++) { cx->ptr.p_double[j] = pcr.ptr.p_double[j]; } *rlo = ae_maxrealnumber; *rhi = (double)(0); for(i=0; i<=npoints-1; i++) { v = (double)(0); for(j=0; j<=nx-1; j++) { v = v+ae_sqr(xy->ptr.pp_double[i][j]-cx->ptr.p_double[j], _state); } v = ae_sqrt(v, _state); *rhi = ae_maxreal(*rhi, v, _state); *rlo = ae_minreal(*rlo, v, _state); } if( !userlo ) { *rlo = (double)(0); } if( !userhi ) { *rhi = (double)(0); } ae_frame_leave(_state); return; } /* * Solve problem with SLP (sequential LP) approach; this approach * is much faster than NLP, but often fails for MIC and MCC (for MZC * it performs well enough). * * REFERENCE: "On a sequential linear programming approach to finding * the smallest circumscribed, largest inscribed, and minimum * zone circle or sphere", Helmuth Spath and G.A.Watson */ if( solvertype==2 ) { ae_matrix_set_length(&cmatrix, cpr*npoints, nx+3, _state); ae_vector_set_length(&ct, cpr*npoints, _state); ae_vector_set_length(&prevc, nx, _state); minbleiccreate(nx+2, &pcr, &blcstate, _state); minbleicsetscale(&blcstate, &scr, _state); minbleicsetbc(&blcstate, &bl, &bu, _state); minbleicsetcond(&blcstate, (double)(0), (double)(0), epsx, maxits, _state); for(outeridx=0; outeridx<=maxouterits-1; outeridx++) { /* * Prepare initial point for algorithm; center coordinates at * PCR are used to calculate RLo/RHi and update PCR with them. 
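 *
 * Each outer iteration then linearizes the sphere constraints around the
 * current (C,RLo,RHi), solves the resulting LP-type subproblem with MinBLEIC,
 * and the outer loop terminates early once the center moves by less than EpsX.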
*/ *rlo = ae_maxrealnumber; *rhi = (double)(0); for(i=0; i<=npoints-1; i++) { v = (double)(0); for(j=0; j<=nx-1; j++) { v = v+ae_sqr(xy->ptr.pp_double[i][j]-pcr.ptr.p_double[j], _state); } v = ae_sqrt(v, _state); *rhi = ae_maxreal(*rhi, v, _state); *rlo = ae_minreal(*rlo, v, _state); } pcr.ptr.p_double[nx+0] = *rlo*0.99999; pcr.ptr.p_double[nx+1] = *rhi/0.99999; /* * Generate matrix of linear constraints */ for(i=0; i<=npoints-1; i++) { v = (double)(0); for(j=0; j<=nx-1; j++) { v = v+ae_sqr(xy->ptr.pp_double[i][j], _state); } bi = -v/2; suboffset = 0; if( userhi ) { dstrow = cpr*i+suboffset; for(j=0; j<=nx-1; j++) { cmatrix.ptr.pp_double[dstrow][j] = pcr.ptr.p_double[j]/2-xy->ptr.pp_double[i][j]; } cmatrix.ptr.pp_double[dstrow][nx+0] = (double)(0); cmatrix.ptr.pp_double[dstrow][nx+1] = -*rhi/2; cmatrix.ptr.pp_double[dstrow][nx+2] = bi; ct.ptr.p_int[dstrow] = -1; inc(&suboffset, _state); } if( userlo ) { dstrow = cpr*i+suboffset; for(j=0; j<=nx-1; j++) { cmatrix.ptr.pp_double[dstrow][j] = -(pcr.ptr.p_double[j]/2-xy->ptr.pp_double[i][j]); } cmatrix.ptr.pp_double[dstrow][nx+0] = *rlo/2; cmatrix.ptr.pp_double[dstrow][nx+1] = (double)(0); cmatrix.ptr.pp_double[dstrow][nx+2] = -bi; ct.ptr.p_int[dstrow] = -1; inc(&suboffset, _state); } ae_assert(suboffset==cpr, "Assertion failed", _state); } /* * Solve LP subproblem with MinBLEIC */ for(j=0; j<=nx-1; j++) { prevc.ptr.p_double[j] = pcr.ptr.p_double[j]; } minbleicsetlc(&blcstate, &cmatrix, &ct, cpr*npoints, _state); minbleicrestartfrom(&blcstate, &pcr, _state); while(minbleiciteration(&blcstate, _state)) { if( blcstate.needfg ) { inc(&rep->nfev, _state); blcstate.f = vhi*blcstate.x.ptr.p_double[nx+1]-vlo*blcstate.x.ptr.p_double[nx+0]; for(j=0; j<=nx-1; j++) { blcstate.g.ptr.p_double[j] = (double)(0); } blcstate.g.ptr.p_double[nx+0] = -1*vlo; blcstate.g.ptr.p_double[nx+1] = 1*vhi; continue; } } minbleicresults(&blcstate, &pcr, &blcrep, _state); ae_assert(blcrep.terminationtype>0, "NSFitSphereX: unexpected failure of BLEIC solver", _state); rep->iterationscount = rep->iterationscount+blcrep.iterationscount; /* * Terminate iterations early if we converged */ v = (double)(0); for(j=0; j<=nx-1; j++) { v = v+ae_sqr(prevc.ptr.p_double[j]-pcr.ptr.p_double[j], _state); } v = ae_sqrt(v, _state); if( ae_fp_less_eq(v,epsx) ) { break; } } /* * Offload center coordinates from PCR to CX, * re-calculate exact value of RLo/RHi using CX. */ for(j=0; j<=nx-1; j++) { cx->ptr.p_double[j] = pcr.ptr.p_double[j]; } *rlo = ae_maxrealnumber; *rhi = (double)(0); for(i=0; i<=npoints-1; i++) { v = (double)(0); for(j=0; j<=nx-1; j++) { v = v+ae_sqr(xy->ptr.pp_double[i][j]-cx->ptr.p_double[j], _state); } v = ae_sqrt(v, _state); *rhi = ae_maxreal(*rhi, v, _state); *rlo = ae_minreal(*rlo, v, _state); } if( !userlo ) { *rlo = (double)(0); } if( !userhi ) { *rhi = (double)(0); } ae_frame_leave(_state); return; } /* * Oooops...! 
*/ ae_assert(ae_false, "NSFitSphereX: integrity check failed", _state); ae_frame_leave(_state); } void _nsfitinternalreport_init(void* _p, ae_state *_state) { nsfitinternalreport *p = (nsfitinternalreport*)_p; ae_touch_ptr((void*)p); } void _nsfitinternalreport_init_copy(void* _dst, void* _src, ae_state *_state) { nsfitinternalreport *dst = (nsfitinternalreport*)_dst; nsfitinternalreport *src = (nsfitinternalreport*)_src; dst->nfev = src->nfev; dst->iterationscount = src->iterationscount; } void _nsfitinternalreport_clear(void* _p) { nsfitinternalreport *p = (nsfitinternalreport*)_p; ae_touch_ptr((void*)p); } void _nsfitinternalreport_destroy(void* _p) { nsfitinternalreport *p = (nsfitinternalreport*)_p; ae_touch_ptr((void*)p); } /************************************************************************* This subroutine calculates the value of the bilinear or bicubic spline at the given point X. Input parameters: C - coefficients table. Built by BuildBilinearSpline or BuildBicubicSpline. X, Y- point Result: S(x,y) -- ALGLIB PROJECT -- Copyright 05.07.2007 by Bochkanov Sergey *************************************************************************/ double spline2dcalc(spline2dinterpolant* c, double x, double y, ae_state *_state) { double v; double vx; double vy; double vxy; double result; ae_assert(c->stype==-1||c->stype==-3, "Spline2DCalc: incorrect C (incorrect parameter C.SType)", _state); ae_assert(ae_isfinite(x, _state)&&ae_isfinite(y, _state), "Spline2DCalc: X or Y contains NaN or Infinite value", _state); if( c->d!=1 ) { result = (double)(0); return result; } spline2ddiff(c, x, y, &v, &vx, &vy, &vxy, _state); result = v; return result; } /************************************************************************* This subroutine calculates the value of the bilinear or bicubic spline at the given point X and its derivatives. Input parameters: C - spline interpolant. 
X, Y- point Output parameters: F - S(x,y) FX - dS(x,y)/dX FY - dS(x,y)/dY FXY - d2S(x,y)/dXdY -- ALGLIB PROJECT -- Copyright 05.07.2007 by Bochkanov Sergey *************************************************************************/ void spline2ddiff(spline2dinterpolant* c, double x, double y, double* f, double* fx, double* fy, double* fxy, ae_state *_state) { double t; double dt; double u; double du; ae_int_t ix; ae_int_t iy; ae_int_t l; ae_int_t r; ae_int_t h; ae_int_t s1; ae_int_t s2; ae_int_t s3; ae_int_t s4; ae_int_t sfx; ae_int_t sfy; ae_int_t sfxy; double y1; double y2; double y3; double y4; double v; double t0; double t1; double t2; double t3; double u0; double u1; double u2; double u3; *f = 0; *fx = 0; *fy = 0; *fxy = 0; ae_assert(c->stype==-1||c->stype==-3, "Spline2DDiff: incorrect C (incorrect parameter C.SType)", _state); ae_assert(ae_isfinite(x, _state)&&ae_isfinite(y, _state), "Spline2DDiff: X or Y contains NaN or Infinite value", _state); /* * Prepare F, dF/dX, dF/dY, d2F/dXdY */ *f = (double)(0); *fx = (double)(0); *fy = (double)(0); *fxy = (double)(0); if( c->d!=1 ) { return; } /* * Binary search in the [ x[0], ..., x[n-2] ] (x[n-1] is not included) */ l = 0; r = c->n-1; while(l!=r-1) { h = (l+r)/2; if( ae_fp_greater_eq(c->x.ptr.p_double[h],x) ) { r = h; } else { l = h; } } t = (x-c->x.ptr.p_double[l])/(c->x.ptr.p_double[l+1]-c->x.ptr.p_double[l]); dt = 1.0/(c->x.ptr.p_double[l+1]-c->x.ptr.p_double[l]); ix = l; /* * Binary search in the [ y[0], ..., y[m-2] ] (y[m-1] is not included) */ l = 0; r = c->m-1; while(l!=r-1) { h = (l+r)/2; if( ae_fp_greater_eq(c->y.ptr.p_double[h],y) ) { r = h; } else { l = h; } } u = (y-c->y.ptr.p_double[l])/(c->y.ptr.p_double[l+1]-c->y.ptr.p_double[l]); du = 1.0/(c->y.ptr.p_double[l+1]-c->y.ptr.p_double[l]); iy = l; /* * Bilinear interpolation */ if( c->stype==-1 ) { y1 = c->f.ptr.p_double[c->n*iy+ix]; y2 = c->f.ptr.p_double[c->n*iy+(ix+1)]; y3 = c->f.ptr.p_double[c->n*(iy+1)+(ix+1)]; y4 = c->f.ptr.p_double[c->n*(iy+1)+ix]; *f = (1-t)*(1-u)*y1+t*(1-u)*y2+t*u*y3+(1-t)*u*y4; *fx = (-(1-u)*y1+(1-u)*y2+u*y3-u*y4)*dt; *fy = (-(1-t)*y1-t*y2+t*y3+(1-t)*y4)*du; *fxy = (y1-y2+y3-y4)*du*dt; return; } /* * Bicubic interpolation */ if( c->stype==-3 ) { /* * Prepare info */ t0 = (double)(1); t1 = t; t2 = ae_sqr(t, _state); t3 = t*t2; u0 = (double)(1); u1 = u; u2 = ae_sqr(u, _state); u3 = u*u2; sfx = c->n*c->m; sfy = 2*c->n*c->m; sfxy = 3*c->n*c->m; s1 = c->n*iy+ix; s2 = c->n*iy+(ix+1); s3 = c->n*(iy+1)+(ix+1); s4 = c->n*(iy+1)+ix; /* * Calculate */ v = c->f.ptr.p_double[s1]; *f = *f+v*t0*u0; v = c->f.ptr.p_double[sfy+s1]/du; *f = *f+v*t0*u1; *fy = *fy+v*t0*u0*du; v = -3*c->f.ptr.p_double[s1]+3*c->f.ptr.p_double[s4]-2*c->f.ptr.p_double[sfy+s1]/du-c->f.ptr.p_double[sfy+s4]/du; *f = *f+v*t0*u2; *fy = *fy+2*v*t0*u1*du; v = 2*c->f.ptr.p_double[s1]-2*c->f.ptr.p_double[s4]+c->f.ptr.p_double[sfy+s1]/du+c->f.ptr.p_double[sfy+s4]/du; *f = *f+v*t0*u3; *fy = *fy+3*v*t0*u2*du; v = c->f.ptr.p_double[sfx+s1]/dt; *f = *f+v*t1*u0; *fx = *fx+v*t0*u0*dt; v = c->f.ptr.p_double[sfxy+s1]/(dt*du); *f = *f+v*t1*u1; *fx = *fx+v*t0*u1*dt; *fy = *fy+v*t1*u0*du; *fxy = *fxy+v*t0*u0*dt*du; v = -3*c->f.ptr.p_double[sfx+s1]/dt+3*c->f.ptr.p_double[sfx+s4]/dt-2*c->f.ptr.p_double[sfxy+s1]/(dt*du)-c->f.ptr.p_double[sfxy+s4]/(dt*du); *f = *f+v*t1*u2; *fx = *fx+v*t0*u2*dt; *fy = *fy+2*v*t1*u1*du; *fxy = *fxy+2*v*t0*u1*dt*du; v = 2*c->f.ptr.p_double[sfx+s1]/dt-2*c->f.ptr.p_double[sfx+s4]/dt+c->f.ptr.p_double[sfxy+s1]/(dt*du)+c->f.ptr.p_double[sfxy+s4]/(dt*du); *f = *f+v*t1*u3; *fx = 
*fx+v*t0*u3*dt; *fy = *fy+3*v*t1*u2*du; *fxy = *fxy+3*v*t0*u2*dt*du; v = -3*c->f.ptr.p_double[s1]+3*c->f.ptr.p_double[s2]-2*c->f.ptr.p_double[sfx+s1]/dt-c->f.ptr.p_double[sfx+s2]/dt; *f = *f+v*t2*u0; *fx = *fx+2*v*t1*u0*dt; v = -3*c->f.ptr.p_double[sfy+s1]/du+3*c->f.ptr.p_double[sfy+s2]/du-2*c->f.ptr.p_double[sfxy+s1]/(dt*du)-c->f.ptr.p_double[sfxy+s2]/(dt*du); *f = *f+v*t2*u1; *fx = *fx+2*v*t1*u1*dt; *fy = *fy+v*t2*u0*du; *fxy = *fxy+2*v*t1*u0*dt*du; v = 9*c->f.ptr.p_double[s1]-9*c->f.ptr.p_double[s2]+9*c->f.ptr.p_double[s3]-9*c->f.ptr.p_double[s4]+6*c->f.ptr.p_double[sfx+s1]/dt+3*c->f.ptr.p_double[sfx+s2]/dt-3*c->f.ptr.p_double[sfx+s3]/dt-6*c->f.ptr.p_double[sfx+s4]/dt+6*c->f.ptr.p_double[sfy+s1]/du-6*c->f.ptr.p_double[sfy+s2]/du-3*c->f.ptr.p_double[sfy+s3]/du+3*c->f.ptr.p_double[sfy+s4]/du+4*c->f.ptr.p_double[sfxy+s1]/(dt*du)+2*c->f.ptr.p_double[sfxy+s2]/(dt*du)+c->f.ptr.p_double[sfxy+s3]/(dt*du)+2*c->f.ptr.p_double[sfxy+s4]/(dt*du); *f = *f+v*t2*u2; *fx = *fx+2*v*t1*u2*dt; *fy = *fy+2*v*t2*u1*du; *fxy = *fxy+4*v*t1*u1*dt*du; v = -6*c->f.ptr.p_double[s1]+6*c->f.ptr.p_double[s2]-6*c->f.ptr.p_double[s3]+6*c->f.ptr.p_double[s4]-4*c->f.ptr.p_double[sfx+s1]/dt-2*c->f.ptr.p_double[sfx+s2]/dt+2*c->f.ptr.p_double[sfx+s3]/dt+4*c->f.ptr.p_double[sfx+s4]/dt-3*c->f.ptr.p_double[sfy+s1]/du+3*c->f.ptr.p_double[sfy+s2]/du+3*c->f.ptr.p_double[sfy+s3]/du-3*c->f.ptr.p_double[sfy+s4]/du-2*c->f.ptr.p_double[sfxy+s1]/(dt*du)-c->f.ptr.p_double[sfxy+s2]/(dt*du)-c->f.ptr.p_double[sfxy+s3]/(dt*du)-2*c->f.ptr.p_double[sfxy+s4]/(dt*du); *f = *f+v*t2*u3; *fx = *fx+2*v*t1*u3*dt; *fy = *fy+3*v*t2*u2*du; *fxy = *fxy+6*v*t1*u2*dt*du; v = 2*c->f.ptr.p_double[s1]-2*c->f.ptr.p_double[s2]+c->f.ptr.p_double[sfx+s1]/dt+c->f.ptr.p_double[sfx+s2]/dt; *f = *f+v*t3*u0; *fx = *fx+3*v*t2*u0*dt; v = 2*c->f.ptr.p_double[sfy+s1]/du-2*c->f.ptr.p_double[sfy+s2]/du+c->f.ptr.p_double[sfxy+s1]/(dt*du)+c->f.ptr.p_double[sfxy+s2]/(dt*du); *f = *f+v*t3*u1; *fx = *fx+3*v*t2*u1*dt; *fy = *fy+v*t3*u0*du; *fxy = *fxy+3*v*t2*u0*dt*du; v = -6*c->f.ptr.p_double[s1]+6*c->f.ptr.p_double[s2]-6*c->f.ptr.p_double[s3]+6*c->f.ptr.p_double[s4]-3*c->f.ptr.p_double[sfx+s1]/dt-3*c->f.ptr.p_double[sfx+s2]/dt+3*c->f.ptr.p_double[sfx+s3]/dt+3*c->f.ptr.p_double[sfx+s4]/dt-4*c->f.ptr.p_double[sfy+s1]/du+4*c->f.ptr.p_double[sfy+s2]/du+2*c->f.ptr.p_double[sfy+s3]/du-2*c->f.ptr.p_double[sfy+s4]/du-2*c->f.ptr.p_double[sfxy+s1]/(dt*du)-2*c->f.ptr.p_double[sfxy+s2]/(dt*du)-c->f.ptr.p_double[sfxy+s3]/(dt*du)-c->f.ptr.p_double[sfxy+s4]/(dt*du); *f = *f+v*t3*u2; *fx = *fx+3*v*t2*u2*dt; *fy = *fy+2*v*t3*u1*du; *fxy = *fxy+6*v*t2*u1*dt*du; v = 4*c->f.ptr.p_double[s1]-4*c->f.ptr.p_double[s2]+4*c->f.ptr.p_double[s3]-4*c->f.ptr.p_double[s4]+2*c->f.ptr.p_double[sfx+s1]/dt+2*c->f.ptr.p_double[sfx+s2]/dt-2*c->f.ptr.p_double[sfx+s3]/dt-2*c->f.ptr.p_double[sfx+s4]/dt+2*c->f.ptr.p_double[sfy+s1]/du-2*c->f.ptr.p_double[sfy+s2]/du-2*c->f.ptr.p_double[sfy+s3]/du+2*c->f.ptr.p_double[sfy+s4]/du+c->f.ptr.p_double[sfxy+s1]/(dt*du)+c->f.ptr.p_double[sfxy+s2]/(dt*du)+c->f.ptr.p_double[sfxy+s3]/(dt*du)+c->f.ptr.p_double[sfxy+s4]/(dt*du); *f = *f+v*t3*u3; *fx = *fx+3*v*t2*u3*dt; *fy = *fy+3*v*t3*u2*du; *fxy = *fxy+9*v*t2*u2*dt*du; return; } } /************************************************************************* This subroutine performs linear transformation of the spline argument. 
Input parameters: C - spline interpolant AX, BX - transformation coefficients: x = A*t + B AY, BY - transformation coefficients: y = A*u + B Result: C - transformed spline -- ALGLIB PROJECT -- Copyright 30.06.2007 by Bochkanov Sergey *************************************************************************/ void spline2dlintransxy(spline2dinterpolant* c, double ax, double bx, double ay, double by, ae_state *_state) { ae_frame _frame_block; ae_vector x; ae_vector y; ae_vector f; ae_vector v; ae_int_t i; ae_int_t j; ae_int_t k; ae_frame_make(_state, &_frame_block); ae_vector_init(&x, 0, DT_REAL, _state); ae_vector_init(&y, 0, DT_REAL, _state); ae_vector_init(&f, 0, DT_REAL, _state); ae_vector_init(&v, 0, DT_REAL, _state); ae_assert(c->stype==-3||c->stype==-1, "Spline2DLinTransXY: incorrect C (incorrect parameter C.SType)", _state); ae_assert(ae_isfinite(ax, _state), "Spline2DLinTransXY: AX is infinite or NaN", _state); ae_assert(ae_isfinite(bx, _state), "Spline2DLinTransXY: BX is infinite or NaN", _state); ae_assert(ae_isfinite(ay, _state), "Spline2DLinTransXY: AY is infinite or NaN", _state); ae_assert(ae_isfinite(by, _state), "Spline2DLinTransXY: BY is infinite or NaN", _state); ae_vector_set_length(&x, c->n, _state); ae_vector_set_length(&y, c->m, _state); ae_vector_set_length(&f, c->m*c->n*c->d, _state); for(j=0; j<=c->n-1; j++) { x.ptr.p_double[j] = c->x.ptr.p_double[j]; } for(i=0; i<=c->m-1; i++) { y.ptr.p_double[i] = c->y.ptr.p_double[i]; } for(i=0; i<=c->m-1; i++) { for(j=0; j<=c->n-1; j++) { for(k=0; k<=c->d-1; k++) { f.ptr.p_double[c->d*(i*c->n+j)+k] = c->f.ptr.p_double[c->d*(i*c->n+j)+k]; } } } /* * Handle different combinations of AX/AY */ if( ae_fp_eq(ax,(double)(0))&&ae_fp_neq(ay,(double)(0)) ) { for(i=0; i<=c->m-1; i++) { spline2dcalcvbuf(c, bx, y.ptr.p_double[i], &v, _state); y.ptr.p_double[i] = (y.ptr.p_double[i]-by)/ay; for(j=0; j<=c->n-1; j++) { for(k=0; k<=c->d-1; k++) { f.ptr.p_double[c->d*(i*c->n+j)+k] = v.ptr.p_double[k]; } } } } if( ae_fp_neq(ax,(double)(0))&&ae_fp_eq(ay,(double)(0)) ) { for(j=0; j<=c->n-1; j++) { spline2dcalcvbuf(c, x.ptr.p_double[j], by, &v, _state); x.ptr.p_double[j] = (x.ptr.p_double[j]-bx)/ax; for(i=0; i<=c->m-1; i++) { for(k=0; k<=c->d-1; k++) { f.ptr.p_double[c->d*(i*c->n+j)+k] = v.ptr.p_double[k]; } } } } if( ae_fp_neq(ax,(double)(0))&&ae_fp_neq(ay,(double)(0)) ) { for(j=0; j<=c->n-1; j++) { x.ptr.p_double[j] = (x.ptr.p_double[j]-bx)/ax; } for(i=0; i<=c->m-1; i++) { y.ptr.p_double[i] = (y.ptr.p_double[i]-by)/ay; } } if( ae_fp_eq(ax,(double)(0))&&ae_fp_eq(ay,(double)(0)) ) { spline2dcalcvbuf(c, bx, by, &v, _state); for(i=0; i<=c->m-1; i++) { for(j=0; j<=c->n-1; j++) { for(k=0; k<=c->d-1; k++) { f.ptr.p_double[c->d*(i*c->n+j)+k] = v.ptr.p_double[k]; } } } } /* * Rebuild spline */ if( c->stype==-3 ) { spline2dbuildbicubicv(&x, c->n, &y, c->m, &f, c->d, c, _state); } if( c->stype==-1 ) { spline2dbuildbilinearv(&x, c->n, &y, c->m, &f, c->d, c, _state); } ae_frame_leave(_state); } /************************************************************************* This subroutine performs linear transformation of the spline. Input parameters: C - spline interpolant. 
A, B- transformation coefficients: S2(x,y) = A*S(x,y) + B Output parameters: C - transformed spline -- ALGLIB PROJECT -- Copyright 30.06.2007 by Bochkanov Sergey *************************************************************************/ void spline2dlintransf(spline2dinterpolant* c, double a, double b, ae_state *_state) { ae_frame _frame_block; ae_vector x; ae_vector y; ae_vector f; ae_int_t i; ae_int_t j; ae_frame_make(_state, &_frame_block); ae_vector_init(&x, 0, DT_REAL, _state); ae_vector_init(&y, 0, DT_REAL, _state); ae_vector_init(&f, 0, DT_REAL, _state); ae_assert(c->stype==-3||c->stype==-1, "Spline2DLinTransF: incorrect C (incorrect parameter C.SType)", _state); ae_vector_set_length(&x, c->n, _state); ae_vector_set_length(&y, c->m, _state); ae_vector_set_length(&f, c->m*c->n*c->d, _state); for(j=0; j<=c->n-1; j++) { x.ptr.p_double[j] = c->x.ptr.p_double[j]; } for(i=0; i<=c->m-1; i++) { y.ptr.p_double[i] = c->y.ptr.p_double[i]; } for(i=0; i<=c->m*c->n*c->d-1; i++) { f.ptr.p_double[i] = a*c->f.ptr.p_double[i]+b; } if( c->stype==-3 ) { spline2dbuildbicubicv(&x, c->n, &y, c->m, &f, c->d, c, _state); } if( c->stype==-1 ) { spline2dbuildbilinearv(&x, c->n, &y, c->m, &f, c->d, c, _state); } ae_frame_leave(_state); } /************************************************************************* This subroutine makes the copy of the spline model. Input parameters: C - spline interpolant Output parameters: CC - spline copy -- ALGLIB PROJECT -- Copyright 29.06.2007 by Bochkanov Sergey *************************************************************************/ void spline2dcopy(spline2dinterpolant* c, spline2dinterpolant* cc, ae_state *_state) { ae_int_t tblsize; _spline2dinterpolant_clear(cc); ae_assert(c->k==1||c->k==3, "Spline2DCopy: incorrect C (incorrect parameter C.K)", _state); cc->k = c->k; cc->n = c->n; cc->m = c->m; cc->d = c->d; cc->stype = c->stype; tblsize = -1; if( c->stype==-3 ) { tblsize = 4*c->n*c->m*c->d; } if( c->stype==-1 ) { tblsize = c->n*c->m*c->d; } ae_assert(tblsize>0, "Spline2DCopy: internal error", _state); ae_vector_set_length(&cc->x, cc->n, _state); ae_vector_set_length(&cc->y, cc->m, _state); ae_vector_set_length(&cc->f, tblsize, _state); ae_v_move(&cc->x.ptr.p_double[0], 1, &c->x.ptr.p_double[0], 1, ae_v_len(0,cc->n-1)); ae_v_move(&cc->y.ptr.p_double[0], 1, &c->y.ptr.p_double[0], 1, ae_v_len(0,cc->m-1)); ae_v_move(&cc->f.ptr.p_double[0], 1, &c->f.ptr.p_double[0], 1, ae_v_len(0,tblsize-1)); } /************************************************************************* Bicubic spline resampling Input parameters: A - function values at the old grid, array[0..OldHeight-1, 0..OldWidth-1] OldHeight - old grid height, OldHeight>1 OldWidth - old grid width, OldWidth>1 NewHeight - new grid height, NewHeight>1 NewWidth - new grid width, NewWidth>1 Output parameters: B - function values at the new grid, array[0..NewHeight-1, 0..NewWidth-1] -- ALGLIB routine -- 15 May, 2007 Copyright by Bochkanov Sergey *************************************************************************/ void spline2dresamplebicubic(/* Real */ ae_matrix* a, ae_int_t oldheight, ae_int_t oldwidth, /* Real */ ae_matrix* b, ae_int_t newheight, ae_int_t newwidth, ae_state *_state) { ae_frame _frame_block; ae_matrix buf; ae_vector x; ae_vector y; spline1dinterpolant c; ae_int_t mw; ae_int_t mh; ae_int_t i; ae_int_t j; ae_frame_make(_state, &_frame_block); ae_matrix_clear(b); ae_matrix_init(&buf, 0, 0, DT_REAL, _state); ae_vector_init(&x, 0, DT_REAL, _state); ae_vector_init(&y, 0, DT_REAL, _state); 
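/*
 * Resampling is performed as two passes of 1D cubic spline interpolation:
 * each of the OldHeight source rows is first resampled to NewWidth samples
 * (stored in Buf), then each of the NewWidth columns is resampled to
 * NewHeight samples (stored in B).
 */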
_spline1dinterpolant_init(&c, _state); ae_assert(oldwidth>1&&oldheight>1, "Spline2DResampleBicubic: width/height less than 1", _state); ae_assert(newwidth>1&&newheight>1, "Spline2DResampleBicubic: width/height less than 1", _state); /* * Prepare */ mw = ae_maxint(oldwidth, newwidth, _state); mh = ae_maxint(oldheight, newheight, _state); ae_matrix_set_length(b, newheight, newwidth, _state); ae_matrix_set_length(&buf, oldheight, newwidth, _state); ae_vector_set_length(&x, ae_maxint(mw, mh, _state), _state); ae_vector_set_length(&y, ae_maxint(mw, mh, _state), _state); /* * Horizontal interpolation */ for(i=0; i<=oldheight-1; i++) { /* * Fill X, Y */ for(j=0; j<=oldwidth-1; j++) { x.ptr.p_double[j] = (double)j/(double)(oldwidth-1); y.ptr.p_double[j] = a->ptr.pp_double[i][j]; } /* * Interpolate and place result into temporary matrix */ spline1dbuildcubic(&x, &y, oldwidth, 0, 0.0, 0, 0.0, &c, _state); for(j=0; j<=newwidth-1; j++) { buf.ptr.pp_double[i][j] = spline1dcalc(&c, (double)j/(double)(newwidth-1), _state); } } /* * Vertical interpolation */ for(j=0; j<=newwidth-1; j++) { /* * Fill X, Y */ for(i=0; i<=oldheight-1; i++) { x.ptr.p_double[i] = (double)i/(double)(oldheight-1); y.ptr.p_double[i] = buf.ptr.pp_double[i][j]; } /* * Interpolate and place result into B */ spline1dbuildcubic(&x, &y, oldheight, 0, 0.0, 0, 0.0, &c, _state); for(i=0; i<=newheight-1; i++) { b->ptr.pp_double[i][j] = spline1dcalc(&c, (double)i/(double)(newheight-1), _state); } } ae_frame_leave(_state); } /************************************************************************* Bilinear spline resampling Input parameters: A - function values at the old grid, array[0..OldHeight-1, 0..OldWidth-1] OldHeight - old grid height, OldHeight>1 OldWidth - old grid width, OldWidth>1 NewHeight - new grid height, NewHeight>1 NewWidth - new grid width, NewWidth>1 Output parameters: B - function values at the new grid, array[0..NewHeight-1, 0..NewWidth-1] -- ALGLIB routine -- 09.07.2007 Copyright by Bochkanov Sergey *************************************************************************/ void spline2dresamplebilinear(/* Real */ ae_matrix* a, ae_int_t oldheight, ae_int_t oldwidth, /* Real */ ae_matrix* b, ae_int_t newheight, ae_int_t newwidth, ae_state *_state) { ae_int_t l; ae_int_t c; double t; double u; ae_int_t i; ae_int_t j; ae_matrix_clear(b); ae_assert(oldwidth>1&&oldheight>1, "Spline2DResampleBilinear: width/height less than 1", _state); ae_assert(newwidth>1&&newheight>1, "Spline2DResampleBilinear: width/height less than 1", _state); ae_matrix_set_length(b, newheight, newwidth, _state); for(i=0; i<=newheight-1; i++) { for(j=0; j<=newwidth-1; j++) { l = i*(oldheight-1)/(newheight-1); if( l==oldheight-1 ) { l = oldheight-2; } u = (double)i/(double)(newheight-1)*(oldheight-1)-l; c = j*(oldwidth-1)/(newwidth-1); if( c==oldwidth-1 ) { c = oldwidth-2; } t = (double)(j*(oldwidth-1))/(double)(newwidth-1)-c; b->ptr.pp_double[i][j] = (1-t)*(1-u)*a->ptr.pp_double[l][c]+t*(1-u)*a->ptr.pp_double[l][c+1]+t*u*a->ptr.pp_double[l+1][c+1]+(1-t)*u*a->ptr.pp_double[l+1][c]; } } } /************************************************************************* This subroutine builds bilinear vector-valued spline. Input parameters: X - spline abscissas, array[0..N-1] Y - spline ordinates, array[0..M-1] F - function values, array[0..M*N*D-1]: * first D elements store D values at (X[0],Y[0]) * next D elements store D values at (X[1],Y[0]) * general form - D function values at (X[i],Y[j]) are stored at F[D*(J*N+I)...D*(J*N+I)+D-1]. 
M,N - grid size, M>=2, N>=2 D - vector dimension, D>=1 Output parameters: C - spline interpolant -- ALGLIB PROJECT -- Copyright 16.04.2012 by Bochkanov Sergey *************************************************************************/ void spline2dbuildbilinearv(/* Real */ ae_vector* x, ae_int_t n, /* Real */ ae_vector* y, ae_int_t m, /* Real */ ae_vector* f, ae_int_t d, spline2dinterpolant* c, ae_state *_state) { double t; ae_int_t i; ae_int_t j; ae_int_t k; ae_int_t i0; _spline2dinterpolant_clear(c); ae_assert(n>=2, "Spline2DBuildBilinearV: N is less than 2", _state); ae_assert(m>=2, "Spline2DBuildBilinearV: M is less than 2", _state); ae_assert(d>=1, "Spline2DBuildBilinearV: invalid argument D (D<1)", _state); ae_assert(x->cnt>=n&&y->cnt>=m, "Spline2DBuildBilinearV: length of X or Y is too short (Length(X/Y)<N/M)", _state); k = n*m*d; ae_assert(f->cnt>=k, "Spline2DBuildBilinearV: length of F is too short (Length(F)<N*M*D)", _state); c->k = 1; c->n = n; c->m = m; c->d = d; c->stype = -1; ae_vector_set_length(&c->x, c->n, _state); ae_vector_set_length(&c->y, c->m, _state); ae_vector_set_length(&c->f, k, _state); for(i=0; i<=c->n-1; i++) { c->x.ptr.p_double[i] = x->ptr.p_double[i]; } for(i=0; i<=c->m-1; i++) { c->y.ptr.p_double[i] = y->ptr.p_double[i]; } for(i=0; i<=k-1; i++) { c->f.ptr.p_double[i] = f->ptr.p_double[i]; } /* * Sort points */ for(j=0; j<=c->n-1; j++) { k = j; for(i=j+1; i<=c->n-1; i++) { if( ae_fp_less(c->x.ptr.p_double[i],c->x.ptr.p_double[k]) ) { k = i; } } if( k!=j ) { for(i=0; i<=c->m-1; i++) { for(i0=0; i0<=c->d-1; i0++) { t = c->f.ptr.p_double[c->d*(i*c->n+j)+i0]; c->f.ptr.p_double[c->d*(i*c->n+j)+i0] = c->f.ptr.p_double[c->d*(i*c->n+k)+i0]; c->f.ptr.p_double[c->d*(i*c->n+k)+i0] = t; } } t = c->x.ptr.p_double[j]; c->x.ptr.p_double[j] = c->x.ptr.p_double[k]; c->x.ptr.p_double[k] = t; } } for(i=0; i<=c->m-1; i++) { k = i; for(j=i+1; j<=c->m-1; j++) { if( ae_fp_less(c->y.ptr.p_double[j],c->y.ptr.p_double[k]) ) { k = j; } } if( k!=i ) { for(j=0; j<=c->n-1; j++) { for(i0=0; i0<=c->d-1; i0++) { t = c->f.ptr.p_double[c->d*(i*c->n+j)+i0]; c->f.ptr.p_double[c->d*(i*c->n+j)+i0] = c->f.ptr.p_double[c->d*(k*c->n+j)+i0]; c->f.ptr.p_double[c->d*(k*c->n+j)+i0] = t; } } t = c->y.ptr.p_double[i]; c->y.ptr.p_double[i] = c->y.ptr.p_double[k]; c->y.ptr.p_double[k] = t; } } } /************************************************************************* This subroutine builds bicubic vector-valued spline. Input parameters: X - spline abscissas, array[0..N-1] Y - spline ordinates, array[0..M-1] F - function values, array[0..M*N*D-1]: * first D elements store D values at (X[0],Y[0]) * next D elements store D values at (X[1],Y[0]) * general form - D function values at (X[i],Y[j]) are stored at F[D*(J*N+I)...D*(J*N+I)+D-1].
M,N - grid size, M>=2, N>=2 D - vector dimension, D>=1 Output parameters: C - spline interpolant -- ALGLIB PROJECT -- Copyright 16.04.2012 by Bochkanov Sergey *************************************************************************/ void spline2dbuildbicubicv(/* Real */ ae_vector* x, ae_int_t n, /* Real */ ae_vector* y, ae_int_t m, /* Real */ ae_vector* f, ae_int_t d, spline2dinterpolant* c, ae_state *_state) { ae_frame _frame_block; ae_vector _f; ae_matrix tf; ae_matrix dx; ae_matrix dy; ae_matrix dxy; double t; ae_int_t i; ae_int_t j; ae_int_t k; ae_int_t di; ae_frame_make(_state, &_frame_block); ae_vector_init_copy(&_f, f, _state); f = &_f; _spline2dinterpolant_clear(c); ae_matrix_init(&tf, 0, 0, DT_REAL, _state); ae_matrix_init(&dx, 0, 0, DT_REAL, _state); ae_matrix_init(&dy, 0, 0, DT_REAL, _state); ae_matrix_init(&dxy, 0, 0, DT_REAL, _state); ae_assert(n>=2, "Spline2DBuildBicubicV: N is less than 2", _state); ae_assert(m>=2, "Spline2DBuildBicubicV: M is less than 2", _state); ae_assert(d>=1, "Spline2DBuildBicubicV: invalid argument D (D<1)", _state); ae_assert(x->cnt>=n&&y->cnt>=m, "Spline2DBuildBicubicV: length of X or Y is too short (Length(X/Y)<N/M)", _state); k = n*m*d; ae_assert(f->cnt>=k, "Spline2DBuildBicubicV: length of F is too short (Length(F)<N*M*D)", _state); c->k = 3; c->d = d; c->n = n; c->m = m; c->stype = -3; k = 4*k; ae_vector_set_length(&c->x, c->n, _state); ae_vector_set_length(&c->y, c->m, _state); ae_vector_set_length(&c->f, k, _state); ae_matrix_set_length(&tf, c->m, c->n, _state); for(i=0; i<=c->n-1; i++) { c->x.ptr.p_double[i] = x->ptr.p_double[i]; } for(i=0; i<=c->m-1; i++) { c->y.ptr.p_double[i] = y->ptr.p_double[i]; } /* * Sort points */ for(j=0; j<=c->n-1; j++) { k = j; for(i=j+1; i<=c->n-1; i++) { if( ae_fp_less(c->x.ptr.p_double[i],c->x.ptr.p_double[k]) ) { k = i; } } if( k!=j ) { for(i=0; i<=c->m-1; i++) { for(di=0; di<=c->d-1; di++) { t = f->ptr.p_double[c->d*(i*c->n+j)+di]; f->ptr.p_double[c->d*(i*c->n+j)+di] = f->ptr.p_double[c->d*(i*c->n+k)+di]; f->ptr.p_double[c->d*(i*c->n+k)+di] = t; } } t = c->x.ptr.p_double[j]; c->x.ptr.p_double[j] = c->x.ptr.p_double[k]; c->x.ptr.p_double[k] = t; } } for(i=0; i<=c->m-1; i++) { k = i; for(j=i+1; j<=c->m-1; j++) { if( ae_fp_less(c->y.ptr.p_double[j],c->y.ptr.p_double[k]) ) { k = j; } } if( k!=i ) { for(j=0; j<=c->n-1; j++) { for(di=0; di<=c->d-1; di++) { t = f->ptr.p_double[c->d*(i*c->n+j)+di]; f->ptr.p_double[c->d*(i*c->n+j)+di] = f->ptr.p_double[c->d*(k*c->n+j)+di]; f->ptr.p_double[c->d*(k*c->n+j)+di] = t; } } t = c->y.ptr.p_double[i]; c->y.ptr.p_double[i] = c->y.ptr.p_double[k]; c->y.ptr.p_double[k] = t; } } for(di=0; di<=c->d-1; di++) { for(i=0; i<=c->m-1; i++) { for(j=0; j<=c->n-1; j++) { tf.ptr.pp_double[i][j] = f->ptr.p_double[c->d*(i*c->n+j)+di]; } } spline2d_bicubiccalcderivatives(&tf, &c->x, &c->y, c->m, c->n, &dx, &dy, &dxy, _state); for(i=0; i<=c->m-1; i++) { for(j=0; j<=c->n-1; j++) { k = c->d*(i*c->n+j)+di; c->f.ptr.p_double[k] = tf.ptr.pp_double[i][j]; c->f.ptr.p_double[c->n*c->m*c->d+k] = dx.ptr.pp_double[i][j]; c->f.ptr.p_double[2*c->n*c->m*c->d+k] = dy.ptr.pp_double[i][j]; c->f.ptr.p_double[3*c->n*c->m*c->d+k] = dxy.ptr.pp_double[i][j]; } } } ae_frame_leave(_state); } /************************************************************************* This subroutine calculates bilinear or bicubic vector-valued spline at the given point (X,Y). INPUT PARAMETERS: C - spline interpolant. X, Y- point F - output buffer, possibly preallocated array. In case array size is large enough to store result, it is not reallocated.
Array which is too short will be reallocated OUTPUT PARAMETERS: F - array[D] (or larger) which stores function values -- ALGLIB PROJECT -- Copyright 16.04.2012 by Bochkanov Sergey *************************************************************************/ void spline2dcalcvbuf(spline2dinterpolant* c, double x, double y, /* Real */ ae_vector* f, ae_state *_state) { double t; double dt; double u; double du; ae_int_t ix; ae_int_t iy; ae_int_t l; ae_int_t r; ae_int_t h; ae_int_t s1; ae_int_t s2; ae_int_t s3; ae_int_t s4; ae_int_t sfx; ae_int_t sfy; ae_int_t sfxy; double y1; double y2; double y3; double y4; double v; double t0; double t1; double t2; double t3; double u0; double u1; double u2; double u3; ae_int_t i; ae_assert(c->stype==-1||c->stype==-3, "Spline2DCalcVBuf: incorrect C (incorrect parameter C.SType)", _state); ae_assert(ae_isfinite(x, _state)&&ae_isfinite(y, _state), "Spline2DCalcVBuf: either X=NaN/Infinite or Y=NaN/Infinite", _state); rvectorsetlengthatleast(f, c->d, _state); /* * Binary search in the [ x[0], ..., x[n-2] ] (x[n-1] is not included) */ l = 0; r = c->n-1; while(l!=r-1) { h = (l+r)/2; if( ae_fp_greater_eq(c->x.ptr.p_double[h],x) ) { r = h; } else { l = h; } } t = (x-c->x.ptr.p_double[l])/(c->x.ptr.p_double[l+1]-c->x.ptr.p_double[l]); dt = 1.0/(c->x.ptr.p_double[l+1]-c->x.ptr.p_double[l]); ix = l; /* * Binary search in the [ y[0], ..., y[m-2] ] (y[m-1] is not included) */ l = 0; r = c->m-1; while(l!=r-1) { h = (l+r)/2; if( ae_fp_greater_eq(c->y.ptr.p_double[h],y) ) { r = h; } else { l = h; } } u = (y-c->y.ptr.p_double[l])/(c->y.ptr.p_double[l+1]-c->y.ptr.p_double[l]); du = 1.0/(c->y.ptr.p_double[l+1]-c->y.ptr.p_double[l]); iy = l; /* * Bilinear interpolation */ if( c->stype==-1 ) { for(i=0; i<=c->d-1; i++) { y1 = c->f.ptr.p_double[c->d*(c->n*iy+ix)+i]; y2 = c->f.ptr.p_double[c->d*(c->n*iy+(ix+1))+i]; y3 = c->f.ptr.p_double[c->d*(c->n*(iy+1)+(ix+1))+i]; y4 = c->f.ptr.p_double[c->d*(c->n*(iy+1)+ix)+i]; f->ptr.p_double[i] = (1-t)*(1-u)*y1+t*(1-u)*y2+t*u*y3+(1-t)*u*y4; } return; } /* * Bicubic interpolation */ if( c->stype==-3 ) { /* * Prepare info */ t0 = (double)(1); t1 = t; t2 = ae_sqr(t, _state); t3 = t*t2; u0 = (double)(1); u1 = u; u2 = ae_sqr(u, _state); u3 = u*u2; sfx = c->n*c->m*c->d; sfy = 2*c->n*c->m*c->d; sfxy = 3*c->n*c->m*c->d; for(i=0; i<=c->d-1; i++) { /* * Prepare F, dF/dX, dF/dY, d2F/dXdY */ f->ptr.p_double[i] = (double)(0); s1 = c->d*(c->n*iy+ix)+i; s2 = c->d*(c->n*iy+(ix+1))+i; s3 = c->d*(c->n*(iy+1)+(ix+1))+i; s4 = c->d*(c->n*(iy+1)+ix)+i; /* * Calculate */ v = c->f.ptr.p_double[s1]; f->ptr.p_double[i] = f->ptr.p_double[i]+v*t0*u0; v = c->f.ptr.p_double[sfy+s1]/du; f->ptr.p_double[i] = f->ptr.p_double[i]+v*t0*u1; v = -3*c->f.ptr.p_double[s1]+3*c->f.ptr.p_double[s4]-2*c->f.ptr.p_double[sfy+s1]/du-c->f.ptr.p_double[sfy+s4]/du; f->ptr.p_double[i] = f->ptr.p_double[i]+v*t0*u2; v = 2*c->f.ptr.p_double[s1]-2*c->f.ptr.p_double[s4]+c->f.ptr.p_double[sfy+s1]/du+c->f.ptr.p_double[sfy+s4]/du; f->ptr.p_double[i] = f->ptr.p_double[i]+v*t0*u3; v = c->f.ptr.p_double[sfx+s1]/dt; f->ptr.p_double[i] = f->ptr.p_double[i]+v*t1*u0; v = c->f.ptr.p_double[sfxy+s1]/(dt*du); f->ptr.p_double[i] = f->ptr.p_double[i]+v*t1*u1; v = -3*c->f.ptr.p_double[sfx+s1]/dt+3*c->f.ptr.p_double[sfx+s4]/dt-2*c->f.ptr.p_double[sfxy+s1]/(dt*du)-c->f.ptr.p_double[sfxy+s4]/(dt*du); f->ptr.p_double[i] = f->ptr.p_double[i]+v*t1*u2; v = 2*c->f.ptr.p_double[sfx+s1]/dt-2*c->f.ptr.p_double[sfx+s4]/dt+c->f.ptr.p_double[sfxy+s1]/(dt*du)+c->f.ptr.p_double[sfxy+s4]/(dt*du); f->ptr.p_double[i] = 
f->ptr.p_double[i]+v*t1*u3; v = -3*c->f.ptr.p_double[s1]+3*c->f.ptr.p_double[s2]-2*c->f.ptr.p_double[sfx+s1]/dt-c->f.ptr.p_double[sfx+s2]/dt; f->ptr.p_double[i] = f->ptr.p_double[i]+v*t2*u0; v = -3*c->f.ptr.p_double[sfy+s1]/du+3*c->f.ptr.p_double[sfy+s2]/du-2*c->f.ptr.p_double[sfxy+s1]/(dt*du)-c->f.ptr.p_double[sfxy+s2]/(dt*du); f->ptr.p_double[i] = f->ptr.p_double[i]+v*t2*u1; v = 9*c->f.ptr.p_double[s1]-9*c->f.ptr.p_double[s2]+9*c->f.ptr.p_double[s3]-9*c->f.ptr.p_double[s4]+6*c->f.ptr.p_double[sfx+s1]/dt+3*c->f.ptr.p_double[sfx+s2]/dt-3*c->f.ptr.p_double[sfx+s3]/dt-6*c->f.ptr.p_double[sfx+s4]/dt+6*c->f.ptr.p_double[sfy+s1]/du-6*c->f.ptr.p_double[sfy+s2]/du-3*c->f.ptr.p_double[sfy+s3]/du+3*c->f.ptr.p_double[sfy+s4]/du+4*c->f.ptr.p_double[sfxy+s1]/(dt*du)+2*c->f.ptr.p_double[sfxy+s2]/(dt*du)+c->f.ptr.p_double[sfxy+s3]/(dt*du)+2*c->f.ptr.p_double[sfxy+s4]/(dt*du); f->ptr.p_double[i] = f->ptr.p_double[i]+v*t2*u2; v = -6*c->f.ptr.p_double[s1]+6*c->f.ptr.p_double[s2]-6*c->f.ptr.p_double[s3]+6*c->f.ptr.p_double[s4]-4*c->f.ptr.p_double[sfx+s1]/dt-2*c->f.ptr.p_double[sfx+s2]/dt+2*c->f.ptr.p_double[sfx+s3]/dt+4*c->f.ptr.p_double[sfx+s4]/dt-3*c->f.ptr.p_double[sfy+s1]/du+3*c->f.ptr.p_double[sfy+s2]/du+3*c->f.ptr.p_double[sfy+s3]/du-3*c->f.ptr.p_double[sfy+s4]/du-2*c->f.ptr.p_double[sfxy+s1]/(dt*du)-c->f.ptr.p_double[sfxy+s2]/(dt*du)-c->f.ptr.p_double[sfxy+s3]/(dt*du)-2*c->f.ptr.p_double[sfxy+s4]/(dt*du); f->ptr.p_double[i] = f->ptr.p_double[i]+v*t2*u3; v = 2*c->f.ptr.p_double[s1]-2*c->f.ptr.p_double[s2]+c->f.ptr.p_double[sfx+s1]/dt+c->f.ptr.p_double[sfx+s2]/dt; f->ptr.p_double[i] = f->ptr.p_double[i]+v*t3*u0; v = 2*c->f.ptr.p_double[sfy+s1]/du-2*c->f.ptr.p_double[sfy+s2]/du+c->f.ptr.p_double[sfxy+s1]/(dt*du)+c->f.ptr.p_double[sfxy+s2]/(dt*du); f->ptr.p_double[i] = f->ptr.p_double[i]+v*t3*u1; v = -6*c->f.ptr.p_double[s1]+6*c->f.ptr.p_double[s2]-6*c->f.ptr.p_double[s3]+6*c->f.ptr.p_double[s4]-3*c->f.ptr.p_double[sfx+s1]/dt-3*c->f.ptr.p_double[sfx+s2]/dt+3*c->f.ptr.p_double[sfx+s3]/dt+3*c->f.ptr.p_double[sfx+s4]/dt-4*c->f.ptr.p_double[sfy+s1]/du+4*c->f.ptr.p_double[sfy+s2]/du+2*c->f.ptr.p_double[sfy+s3]/du-2*c->f.ptr.p_double[sfy+s4]/du-2*c->f.ptr.p_double[sfxy+s1]/(dt*du)-2*c->f.ptr.p_double[sfxy+s2]/(dt*du)-c->f.ptr.p_double[sfxy+s3]/(dt*du)-c->f.ptr.p_double[sfxy+s4]/(dt*du); f->ptr.p_double[i] = f->ptr.p_double[i]+v*t3*u2; v = 4*c->f.ptr.p_double[s1]-4*c->f.ptr.p_double[s2]+4*c->f.ptr.p_double[s3]-4*c->f.ptr.p_double[s4]+2*c->f.ptr.p_double[sfx+s1]/dt+2*c->f.ptr.p_double[sfx+s2]/dt-2*c->f.ptr.p_double[sfx+s3]/dt-2*c->f.ptr.p_double[sfx+s4]/dt+2*c->f.ptr.p_double[sfy+s1]/du-2*c->f.ptr.p_double[sfy+s2]/du-2*c->f.ptr.p_double[sfy+s3]/du+2*c->f.ptr.p_double[sfy+s4]/du+c->f.ptr.p_double[sfxy+s1]/(dt*du)+c->f.ptr.p_double[sfxy+s2]/(dt*du)+c->f.ptr.p_double[sfxy+s3]/(dt*du)+c->f.ptr.p_double[sfxy+s4]/(dt*du); f->ptr.p_double[i] = f->ptr.p_double[i]+v*t3*u3; } return; } } /************************************************************************* This subroutine calculates bilinear or bicubic vector-valued spline at the given point (X,Y). INPUT PARAMETERS: C - spline interpolant. X, Y- point OUTPUT PARAMETERS: F - array[D] which stores function values. F is out-parameter and it is reallocated after call to this function. In case you want to reuse previously allocated F, you may use Spline2DCalcVBuf(), which reallocates F only when it is too small. 
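
NOTE: USAGE SKETCH

The fragment below is an illustrative sketch only (not part of the library); it builds a two-component bilinear spline on a 2x2 grid with made-up values and evaluates it at one point, assuming the structure/vector helpers used throughout this listing:

    ae_state  s;
    ae_vector gx, gy, gf, vals;
    spline2dinterpolant spl;
    ae_int_t  i;
    ae_state_init(&s);
    ae_vector_init(&gx, 2, DT_REAL, &s);        // grid abscissas X[0..1]
    ae_vector_init(&gy, 2, DT_REAL, &s);        // grid ordinates Y[0..1]
    ae_vector_init(&gf, 8, DT_REAL, &s);        // M*N*D values, D=2 components
    ae_vector_init(&vals, 0, DT_REAL, &s);
    _spline2dinterpolant_init(&spl, &s);
    gx.ptr.p_double[0] = 0.0; gx.ptr.p_double[1] = 1.0;
    gy.ptr.p_double[0] = 0.0; gy.ptr.p_double[1] = 1.0;
    for(i=0; i<=7; i++)
        gf.ptr.p_double[i] = 0.125*i;           // F[D*(J*N+I)+k], made-up data
    spline2dbuildbilinearv(&gx, 2, &gy, 2, &gf, 2, &spl, &s);
    spline2dcalcv(&spl, 0.25, 0.75, &vals, &s); // vals[0..1] = S(0.25,0.75)
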
-- ALGLIB PROJECT -- Copyright 16.04.2012 by Bochkanov Sergey *************************************************************************/ void spline2dcalcv(spline2dinterpolant* c, double x, double y, /* Real */ ae_vector* f, ae_state *_state) { ae_vector_clear(f); ae_assert(c->stype==-1||c->stype==-3, "Spline2DCalcV: incorrect C (incorrect parameter C.SType)", _state); ae_assert(ae_isfinite(x, _state)&&ae_isfinite(y, _state), "Spline2DCalcV: either X=NaN/Infinite or Y=NaN/Infinite", _state); ae_vector_set_length(f, c->d, _state); spline2dcalcvbuf(c, x, y, f, _state); } /************************************************************************* This subroutine unpacks two-dimensional spline into the coefficients table Input parameters: C - spline interpolant. Result: M, N- grid size (x-axis and y-axis) D - number of components Tbl - coefficients table, unpacked format, D - components: [0..(N-1)*(M-1)*D-1, 0..19]. For T=0..D-1 (component index), I = 0...N-2 (x index), J=0..M-2 (y index): K := T + I*D + J*D*(N-1) K-th row stores decomposition for T-th component of the vector-valued function Tbl[K,0] = X[i] Tbl[K,1] = X[i+1] Tbl[K,2] = Y[j] Tbl[K,3] = Y[j+1] Tbl[K,4] = C00 Tbl[K,5] = C01 Tbl[K,6] = C02 Tbl[K,7] = C03 Tbl[K,8] = C10 Tbl[K,9] = C11 ... Tbl[K,19] = C33 On each grid square spline is equals to: S(x) = SUM(c[i,j]*(t^i)*(u^j), i=0..3, j=0..3) t = x-x[j] u = y-y[i] -- ALGLIB PROJECT -- Copyright 16.04.2012 by Bochkanov Sergey *************************************************************************/ void spline2dunpackv(spline2dinterpolant* c, ae_int_t* m, ae_int_t* n, ae_int_t* d, /* Real */ ae_matrix* tbl, ae_state *_state) { ae_int_t k; ae_int_t p; ae_int_t ci; ae_int_t cj; ae_int_t s1; ae_int_t s2; ae_int_t s3; ae_int_t s4; ae_int_t sfx; ae_int_t sfy; ae_int_t sfxy; double y1; double y2; double y3; double y4; double dt; double du; ae_int_t i; ae_int_t j; ae_int_t k0; *m = 0; *n = 0; *d = 0; ae_matrix_clear(tbl); ae_assert(c->stype==-3||c->stype==-1, "Spline2DUnpackV: incorrect C (incorrect parameter C.SType)", _state); *n = c->n; *m = c->m; *d = c->d; ae_matrix_set_length(tbl, (*n-1)*(*m-1)*(*d), 20, _state); sfx = *n*(*m)*(*d); sfy = 2*(*n)*(*m)*(*d); sfxy = 3*(*n)*(*m)*(*d); for(i=0; i<=*m-2; i++) { for(j=0; j<=*n-2; j++) { for(k=0; k<=*d-1; k++) { p = *d*(i*(*n-1)+j)+k; tbl->ptr.pp_double[p][0] = c->x.ptr.p_double[j]; tbl->ptr.pp_double[p][1] = c->x.ptr.p_double[j+1]; tbl->ptr.pp_double[p][2] = c->y.ptr.p_double[i]; tbl->ptr.pp_double[p][3] = c->y.ptr.p_double[i+1]; dt = 1/(tbl->ptr.pp_double[p][1]-tbl->ptr.pp_double[p][0]); du = 1/(tbl->ptr.pp_double[p][3]-tbl->ptr.pp_double[p][2]); /* * Bilinear interpolation */ if( c->stype==-1 ) { for(k0=4; k0<=19; k0++) { tbl->ptr.pp_double[p][k0] = (double)(0); } y1 = c->f.ptr.p_double[*d*(*n*i+j)+k]; y2 = c->f.ptr.p_double[*d*(*n*i+(j+1))+k]; y3 = c->f.ptr.p_double[*d*(*n*(i+1)+(j+1))+k]; y4 = c->f.ptr.p_double[*d*(*n*(i+1)+j)+k]; tbl->ptr.pp_double[p][4] = y1; tbl->ptr.pp_double[p][4+1*4+0] = y2-y1; tbl->ptr.pp_double[p][4+0*4+1] = y4-y1; tbl->ptr.pp_double[p][4+1*4+1] = y3-y2-y4+y1; } /* * Bicubic interpolation */ if( c->stype==-3 ) { s1 = *d*(*n*i+j)+k; s2 = *d*(*n*i+(j+1))+k; s3 = *d*(*n*(i+1)+(j+1))+k; s4 = *d*(*n*(i+1)+j)+k; tbl->ptr.pp_double[p][4+0*4+0] = c->f.ptr.p_double[s1]; tbl->ptr.pp_double[p][4+0*4+1] = c->f.ptr.p_double[sfy+s1]/du; tbl->ptr.pp_double[p][4+0*4+2] = -3*c->f.ptr.p_double[s1]+3*c->f.ptr.p_double[s4]-2*c->f.ptr.p_double[sfy+s1]/du-c->f.ptr.p_double[sfy+s4]/du; tbl->ptr.pp_double[p][4+0*4+3] = 
2*c->f.ptr.p_double[s1]-2*c->f.ptr.p_double[s4]+c->f.ptr.p_double[sfy+s1]/du+c->f.ptr.p_double[sfy+s4]/du; tbl->ptr.pp_double[p][4+1*4+0] = c->f.ptr.p_double[sfx+s1]/dt; tbl->ptr.pp_double[p][4+1*4+1] = c->f.ptr.p_double[sfxy+s1]/(dt*du); tbl->ptr.pp_double[p][4+1*4+2] = -3*c->f.ptr.p_double[sfx+s1]/dt+3*c->f.ptr.p_double[sfx+s4]/dt-2*c->f.ptr.p_double[sfxy+s1]/(dt*du)-c->f.ptr.p_double[sfxy+s4]/(dt*du); tbl->ptr.pp_double[p][4+1*4+3] = 2*c->f.ptr.p_double[sfx+s1]/dt-2*c->f.ptr.p_double[sfx+s4]/dt+c->f.ptr.p_double[sfxy+s1]/(dt*du)+c->f.ptr.p_double[sfxy+s4]/(dt*du); tbl->ptr.pp_double[p][4+2*4+0] = -3*c->f.ptr.p_double[s1]+3*c->f.ptr.p_double[s2]-2*c->f.ptr.p_double[sfx+s1]/dt-c->f.ptr.p_double[sfx+s2]/dt; tbl->ptr.pp_double[p][4+2*4+1] = -3*c->f.ptr.p_double[sfy+s1]/du+3*c->f.ptr.p_double[sfy+s2]/du-2*c->f.ptr.p_double[sfxy+s1]/(dt*du)-c->f.ptr.p_double[sfxy+s2]/(dt*du); tbl->ptr.pp_double[p][4+2*4+2] = 9*c->f.ptr.p_double[s1]-9*c->f.ptr.p_double[s2]+9*c->f.ptr.p_double[s3]-9*c->f.ptr.p_double[s4]+6*c->f.ptr.p_double[sfx+s1]/dt+3*c->f.ptr.p_double[sfx+s2]/dt-3*c->f.ptr.p_double[sfx+s3]/dt-6*c->f.ptr.p_double[sfx+s4]/dt+6*c->f.ptr.p_double[sfy+s1]/du-6*c->f.ptr.p_double[sfy+s2]/du-3*c->f.ptr.p_double[sfy+s3]/du+3*c->f.ptr.p_double[sfy+s4]/du+4*c->f.ptr.p_double[sfxy+s1]/(dt*du)+2*c->f.ptr.p_double[sfxy+s2]/(dt*du)+c->f.ptr.p_double[sfxy+s3]/(dt*du)+2*c->f.ptr.p_double[sfxy+s4]/(dt*du); tbl->ptr.pp_double[p][4+2*4+3] = -6*c->f.ptr.p_double[s1]+6*c->f.ptr.p_double[s2]-6*c->f.ptr.p_double[s3]+6*c->f.ptr.p_double[s4]-4*c->f.ptr.p_double[sfx+s1]/dt-2*c->f.ptr.p_double[sfx+s2]/dt+2*c->f.ptr.p_double[sfx+s3]/dt+4*c->f.ptr.p_double[sfx+s4]/dt-3*c->f.ptr.p_double[sfy+s1]/du+3*c->f.ptr.p_double[sfy+s2]/du+3*c->f.ptr.p_double[sfy+s3]/du-3*c->f.ptr.p_double[sfy+s4]/du-2*c->f.ptr.p_double[sfxy+s1]/(dt*du)-c->f.ptr.p_double[sfxy+s2]/(dt*du)-c->f.ptr.p_double[sfxy+s3]/(dt*du)-2*c->f.ptr.p_double[sfxy+s4]/(dt*du); tbl->ptr.pp_double[p][4+3*4+0] = 2*c->f.ptr.p_double[s1]-2*c->f.ptr.p_double[s2]+c->f.ptr.p_double[sfx+s1]/dt+c->f.ptr.p_double[sfx+s2]/dt; tbl->ptr.pp_double[p][4+3*4+1] = 2*c->f.ptr.p_double[sfy+s1]/du-2*c->f.ptr.p_double[sfy+s2]/du+c->f.ptr.p_double[sfxy+s1]/(dt*du)+c->f.ptr.p_double[sfxy+s2]/(dt*du); tbl->ptr.pp_double[p][4+3*4+2] = -6*c->f.ptr.p_double[s1]+6*c->f.ptr.p_double[s2]-6*c->f.ptr.p_double[s3]+6*c->f.ptr.p_double[s4]-3*c->f.ptr.p_double[sfx+s1]/dt-3*c->f.ptr.p_double[sfx+s2]/dt+3*c->f.ptr.p_double[sfx+s3]/dt+3*c->f.ptr.p_double[sfx+s4]/dt-4*c->f.ptr.p_double[sfy+s1]/du+4*c->f.ptr.p_double[sfy+s2]/du+2*c->f.ptr.p_double[sfy+s3]/du-2*c->f.ptr.p_double[sfy+s4]/du-2*c->f.ptr.p_double[sfxy+s1]/(dt*du)-2*c->f.ptr.p_double[sfxy+s2]/(dt*du)-c->f.ptr.p_double[sfxy+s3]/(dt*du)-c->f.ptr.p_double[sfxy+s4]/(dt*du); tbl->ptr.pp_double[p][4+3*4+3] = 4*c->f.ptr.p_double[s1]-4*c->f.ptr.p_double[s2]+4*c->f.ptr.p_double[s3]-4*c->f.ptr.p_double[s4]+2*c->f.ptr.p_double[sfx+s1]/dt+2*c->f.ptr.p_double[sfx+s2]/dt-2*c->f.ptr.p_double[sfx+s3]/dt-2*c->f.ptr.p_double[sfx+s4]/dt+2*c->f.ptr.p_double[sfy+s1]/du-2*c->f.ptr.p_double[sfy+s2]/du-2*c->f.ptr.p_double[sfy+s3]/du+2*c->f.ptr.p_double[sfy+s4]/du+c->f.ptr.p_double[sfxy+s1]/(dt*du)+c->f.ptr.p_double[sfxy+s2]/(dt*du)+c->f.ptr.p_double[sfxy+s3]/(dt*du)+c->f.ptr.p_double[sfxy+s4]/(dt*du); } /* * Rescale Cij */ for(ci=0; ci<=3; ci++) { for(cj=0; cj<=3; cj++) { tbl->ptr.pp_double[p][4+ci*4+cj] = tbl->ptr.pp_double[p][4+ci*4+cj]*ae_pow(dt, (double)(ci), _state)*ae_pow(du, (double)(cj), _state); } } } } } } 
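/*************************************************************************
Illustrative sketch only, NOT part of ALGLIB: this hypothetical helper
shows how one row of the table produced by Spline2DUnpackV() can be
evaluated with the S(x,y) = SUM(c[i,j]*t^i*u^j, i=0..3, j=0..3) expansion
documented above. The caller is assumed to have selected ROW as
K = T + I*D + J*D*(N-1) for the patch which contains (X,Y). Powers are
accumulated by repeated multiplication to keep the sketch self-contained.
*************************************************************************/
static double example_spline2d_evalfromtable(/* Real */ ae_matrix* tbl,
     ae_int_t row,
     double x,
     double y)
{
    double t;
    double u;
    double tp;
    double up;
    double s;
    ae_int_t i;
    ae_int_t j;

    t = x-tbl->ptr.pp_double[row][0];
    u = y-tbl->ptr.pp_double[row][2];
    s = (double)(0);
    tp = (double)(1);
    for(i=0; i<=3; i++)
    {
        up = (double)(1);
        for(j=0; j<=3; j++)
        {
            s = s+tbl->ptr.pp_double[row][4+i*4+j]*tp*up;
            up = up*u;
        }
        tp = tp*t;
    }
    return s;
}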
/************************************************************************* This subroutine was deprecated in ALGLIB 3.6.0 We recommend you to switch to Spline2DBuildBilinearV(), which is more flexible and accepts its arguments in more convenient order. -- ALGLIB PROJECT -- Copyright 05.07.2007 by Bochkanov Sergey *************************************************************************/ void spline2dbuildbilinear(/* Real */ ae_vector* x, /* Real */ ae_vector* y, /* Real */ ae_matrix* f, ae_int_t m, ae_int_t n, spline2dinterpolant* c, ae_state *_state) { double t; ae_int_t i; ae_int_t j; ae_int_t k; _spline2dinterpolant_clear(c); ae_assert(n>=2, "Spline2DBuildBilinear: N<2", _state); ae_assert(m>=2, "Spline2DBuildBilinear: M<2", _state); ae_assert(x->cnt>=n&&y->cnt>=m, "Spline2DBuildBilinear: length of X or Y is too short (Length(X/Y)rows>=m&&f->cols>=n, "Spline2DBuildBilinear: size of F is too small (rows(F)k = 1; c->n = n; c->m = m; c->d = 1; c->stype = -1; ae_vector_set_length(&c->x, c->n, _state); ae_vector_set_length(&c->y, c->m, _state); ae_vector_set_length(&c->f, c->n*c->m, _state); for(i=0; i<=c->n-1; i++) { c->x.ptr.p_double[i] = x->ptr.p_double[i]; } for(i=0; i<=c->m-1; i++) { c->y.ptr.p_double[i] = y->ptr.p_double[i]; } for(i=0; i<=c->m-1; i++) { for(j=0; j<=c->n-1; j++) { c->f.ptr.p_double[i*c->n+j] = f->ptr.pp_double[i][j]; } } /* * Sort points */ for(j=0; j<=c->n-1; j++) { k = j; for(i=j+1; i<=c->n-1; i++) { if( ae_fp_less(c->x.ptr.p_double[i],c->x.ptr.p_double[k]) ) { k = i; } } if( k!=j ) { for(i=0; i<=c->m-1; i++) { t = c->f.ptr.p_double[i*c->n+j]; c->f.ptr.p_double[i*c->n+j] = c->f.ptr.p_double[i*c->n+k]; c->f.ptr.p_double[i*c->n+k] = t; } t = c->x.ptr.p_double[j]; c->x.ptr.p_double[j] = c->x.ptr.p_double[k]; c->x.ptr.p_double[k] = t; } } for(i=0; i<=c->m-1; i++) { k = i; for(j=i+1; j<=c->m-1; j++) { if( ae_fp_less(c->y.ptr.p_double[j],c->y.ptr.p_double[k]) ) { k = j; } } if( k!=i ) { for(j=0; j<=c->n-1; j++) { t = c->f.ptr.p_double[i*c->n+j]; c->f.ptr.p_double[i*c->n+j] = c->f.ptr.p_double[k*c->n+j]; c->f.ptr.p_double[k*c->n+j] = t; } t = c->y.ptr.p_double[i]; c->y.ptr.p_double[i] = c->y.ptr.p_double[k]; c->y.ptr.p_double[k] = t; } } } /************************************************************************* This subroutine was deprecated in ALGLIB 3.6.0 We recommend you to switch to Spline2DBuildBicubicV(), which is more flexible and accepts its arguments in more convenient order. 
-- ALGLIB PROJECT -- Copyright 05.07.2007 by Bochkanov Sergey *************************************************************************/ void spline2dbuildbicubic(/* Real */ ae_vector* x, /* Real */ ae_vector* y, /* Real */ ae_matrix* f, ae_int_t m, ae_int_t n, spline2dinterpolant* c, ae_state *_state) { ae_frame _frame_block; ae_matrix _f; ae_int_t sfx; ae_int_t sfy; ae_int_t sfxy; ae_matrix dx; ae_matrix dy; ae_matrix dxy; double t; ae_int_t i; ae_int_t j; ae_int_t k; ae_frame_make(_state, &_frame_block); ae_matrix_init_copy(&_f, f, _state); f = &_f; _spline2dinterpolant_clear(c); ae_matrix_init(&dx, 0, 0, DT_REAL, _state); ae_matrix_init(&dy, 0, 0, DT_REAL, _state); ae_matrix_init(&dxy, 0, 0, DT_REAL, _state); ae_assert(n>=2, "Spline2DBuildBicubicSpline: N<2", _state); ae_assert(m>=2, "Spline2DBuildBicubicSpline: M<2", _state); ae_assert(x->cnt>=n&&y->cnt>=m, "Spline2DBuildBicubic: length of X or Y is too short (Length(X/Y)rows>=m&&f->cols>=n, "Spline2DBuildBicubic: size of F is too small (rows(F)k = 3; c->d = 1; c->n = n; c->m = m; c->stype = -3; sfx = c->n*c->m; sfy = 2*c->n*c->m; sfxy = 3*c->n*c->m; ae_vector_set_length(&c->x, c->n, _state); ae_vector_set_length(&c->y, c->m, _state); ae_vector_set_length(&c->f, 4*c->n*c->m, _state); for(i=0; i<=c->n-1; i++) { c->x.ptr.p_double[i] = x->ptr.p_double[i]; } for(i=0; i<=c->m-1; i++) { c->y.ptr.p_double[i] = y->ptr.p_double[i]; } /* * Sort points */ for(j=0; j<=c->n-1; j++) { k = j; for(i=j+1; i<=c->n-1; i++) { if( ae_fp_less(c->x.ptr.p_double[i],c->x.ptr.p_double[k]) ) { k = i; } } if( k!=j ) { for(i=0; i<=c->m-1; i++) { t = f->ptr.pp_double[i][j]; f->ptr.pp_double[i][j] = f->ptr.pp_double[i][k]; f->ptr.pp_double[i][k] = t; } t = c->x.ptr.p_double[j]; c->x.ptr.p_double[j] = c->x.ptr.p_double[k]; c->x.ptr.p_double[k] = t; } } for(i=0; i<=c->m-1; i++) { k = i; for(j=i+1; j<=c->m-1; j++) { if( ae_fp_less(c->y.ptr.p_double[j],c->y.ptr.p_double[k]) ) { k = j; } } if( k!=i ) { for(j=0; j<=c->n-1; j++) { t = f->ptr.pp_double[i][j]; f->ptr.pp_double[i][j] = f->ptr.pp_double[k][j]; f->ptr.pp_double[k][j] = t; } t = c->y.ptr.p_double[i]; c->y.ptr.p_double[i] = c->y.ptr.p_double[k]; c->y.ptr.p_double[k] = t; } } spline2d_bicubiccalcderivatives(f, &c->x, &c->y, c->m, c->n, &dx, &dy, &dxy, _state); for(i=0; i<=c->m-1; i++) { for(j=0; j<=c->n-1; j++) { k = i*c->n+j; c->f.ptr.p_double[k] = f->ptr.pp_double[i][j]; c->f.ptr.p_double[sfx+k] = dx.ptr.pp_double[i][j]; c->f.ptr.p_double[sfy+k] = dy.ptr.pp_double[i][j]; c->f.ptr.p_double[sfxy+k] = dxy.ptr.pp_double[i][j]; } } ae_frame_leave(_state); } /************************************************************************* This subroutine was deprecated in ALGLIB 3.6.0 We recommend you to switch to Spline2DUnpackV(), which is more flexible and accepts its arguments in more convenient order. 
-- ALGLIB PROJECT -- Copyright 29.06.2007 by Bochkanov Sergey *************************************************************************/ void spline2dunpack(spline2dinterpolant* c, ae_int_t* m, ae_int_t* n, /* Real */ ae_matrix* tbl, ae_state *_state) { ae_int_t k; ae_int_t p; ae_int_t ci; ae_int_t cj; ae_int_t s1; ae_int_t s2; ae_int_t s3; ae_int_t s4; ae_int_t sfx; ae_int_t sfy; ae_int_t sfxy; double y1; double y2; double y3; double y4; double dt; double du; ae_int_t i; ae_int_t j; *m = 0; *n = 0; ae_matrix_clear(tbl); ae_assert(c->stype==-3||c->stype==-1, "Spline2DUnpack: incorrect C (incorrect parameter C.SType)", _state); if( c->d!=1 ) { *n = 0; *m = 0; return; } *n = c->n; *m = c->m; ae_matrix_set_length(tbl, (*n-1)*(*m-1), 20, _state); sfx = *n*(*m); sfy = 2*(*n)*(*m); sfxy = 3*(*n)*(*m); /* * Fill */ for(i=0; i<=*m-2; i++) { for(j=0; j<=*n-2; j++) { p = i*(*n-1)+j; tbl->ptr.pp_double[p][0] = c->x.ptr.p_double[j]; tbl->ptr.pp_double[p][1] = c->x.ptr.p_double[j+1]; tbl->ptr.pp_double[p][2] = c->y.ptr.p_double[i]; tbl->ptr.pp_double[p][3] = c->y.ptr.p_double[i+1]; dt = 1/(tbl->ptr.pp_double[p][1]-tbl->ptr.pp_double[p][0]); du = 1/(tbl->ptr.pp_double[p][3]-tbl->ptr.pp_double[p][2]); /* * Bilinear interpolation */ if( c->stype==-1 ) { for(k=4; k<=19; k++) { tbl->ptr.pp_double[p][k] = (double)(0); } y1 = c->f.ptr.p_double[*n*i+j]; y2 = c->f.ptr.p_double[*n*i+(j+1)]; y3 = c->f.ptr.p_double[*n*(i+1)+(j+1)]; y4 = c->f.ptr.p_double[*n*(i+1)+j]; tbl->ptr.pp_double[p][4] = y1; tbl->ptr.pp_double[p][4+1*4+0] = y2-y1; tbl->ptr.pp_double[p][4+0*4+1] = y4-y1; tbl->ptr.pp_double[p][4+1*4+1] = y3-y2-y4+y1; } /* * Bicubic interpolation */ if( c->stype==-3 ) { s1 = *n*i+j; s2 = *n*i+(j+1); s3 = *n*(i+1)+(j+1); s4 = *n*(i+1)+j; tbl->ptr.pp_double[p][4+0*4+0] = c->f.ptr.p_double[s1]; tbl->ptr.pp_double[p][4+0*4+1] = c->f.ptr.p_double[sfy+s1]/du; tbl->ptr.pp_double[p][4+0*4+2] = -3*c->f.ptr.p_double[s1]+3*c->f.ptr.p_double[s4]-2*c->f.ptr.p_double[sfy+s1]/du-c->f.ptr.p_double[sfy+s4]/du; tbl->ptr.pp_double[p][4+0*4+3] = 2*c->f.ptr.p_double[s1]-2*c->f.ptr.p_double[s4]+c->f.ptr.p_double[sfy+s1]/du+c->f.ptr.p_double[sfy+s4]/du; tbl->ptr.pp_double[p][4+1*4+0] = c->f.ptr.p_double[sfx+s1]/dt; tbl->ptr.pp_double[p][4+1*4+1] = c->f.ptr.p_double[sfxy+s1]/(dt*du); tbl->ptr.pp_double[p][4+1*4+2] = -3*c->f.ptr.p_double[sfx+s1]/dt+3*c->f.ptr.p_double[sfx+s4]/dt-2*c->f.ptr.p_double[sfxy+s1]/(dt*du)-c->f.ptr.p_double[sfxy+s4]/(dt*du); tbl->ptr.pp_double[p][4+1*4+3] = 2*c->f.ptr.p_double[sfx+s1]/dt-2*c->f.ptr.p_double[sfx+s4]/dt+c->f.ptr.p_double[sfxy+s1]/(dt*du)+c->f.ptr.p_double[sfxy+s4]/(dt*du); tbl->ptr.pp_double[p][4+2*4+0] = -3*c->f.ptr.p_double[s1]+3*c->f.ptr.p_double[s2]-2*c->f.ptr.p_double[sfx+s1]/dt-c->f.ptr.p_double[sfx+s2]/dt; tbl->ptr.pp_double[p][4+2*4+1] = -3*c->f.ptr.p_double[sfy+s1]/du+3*c->f.ptr.p_double[sfy+s2]/du-2*c->f.ptr.p_double[sfxy+s1]/(dt*du)-c->f.ptr.p_double[sfxy+s2]/(dt*du); tbl->ptr.pp_double[p][4+2*4+2] = 9*c->f.ptr.p_double[s1]-9*c->f.ptr.p_double[s2]+9*c->f.ptr.p_double[s3]-9*c->f.ptr.p_double[s4]+6*c->f.ptr.p_double[sfx+s1]/dt+3*c->f.ptr.p_double[sfx+s2]/dt-3*c->f.ptr.p_double[sfx+s3]/dt-6*c->f.ptr.p_double[sfx+s4]/dt+6*c->f.ptr.p_double[sfy+s1]/du-6*c->f.ptr.p_double[sfy+s2]/du-3*c->f.ptr.p_double[sfy+s3]/du+3*c->f.ptr.p_double[sfy+s4]/du+4*c->f.ptr.p_double[sfxy+s1]/(dt*du)+2*c->f.ptr.p_double[sfxy+s2]/(dt*du)+c->f.ptr.p_double[sfxy+s3]/(dt*du)+2*c->f.ptr.p_double[sfxy+s4]/(dt*du); tbl->ptr.pp_double[p][4+2*4+3] = 
-6*c->f.ptr.p_double[s1]+6*c->f.ptr.p_double[s2]-6*c->f.ptr.p_double[s3]+6*c->f.ptr.p_double[s4]-4*c->f.ptr.p_double[sfx+s1]/dt-2*c->f.ptr.p_double[sfx+s2]/dt+2*c->f.ptr.p_double[sfx+s3]/dt+4*c->f.ptr.p_double[sfx+s4]/dt-3*c->f.ptr.p_double[sfy+s1]/du+3*c->f.ptr.p_double[sfy+s2]/du+3*c->f.ptr.p_double[sfy+s3]/du-3*c->f.ptr.p_double[sfy+s4]/du-2*c->f.ptr.p_double[sfxy+s1]/(dt*du)-c->f.ptr.p_double[sfxy+s2]/(dt*du)-c->f.ptr.p_double[sfxy+s3]/(dt*du)-2*c->f.ptr.p_double[sfxy+s4]/(dt*du); tbl->ptr.pp_double[p][4+3*4+0] = 2*c->f.ptr.p_double[s1]-2*c->f.ptr.p_double[s2]+c->f.ptr.p_double[sfx+s1]/dt+c->f.ptr.p_double[sfx+s2]/dt; tbl->ptr.pp_double[p][4+3*4+1] = 2*c->f.ptr.p_double[sfy+s1]/du-2*c->f.ptr.p_double[sfy+s2]/du+c->f.ptr.p_double[sfxy+s1]/(dt*du)+c->f.ptr.p_double[sfxy+s2]/(dt*du); tbl->ptr.pp_double[p][4+3*4+2] = -6*c->f.ptr.p_double[s1]+6*c->f.ptr.p_double[s2]-6*c->f.ptr.p_double[s3]+6*c->f.ptr.p_double[s4]-3*c->f.ptr.p_double[sfx+s1]/dt-3*c->f.ptr.p_double[sfx+s2]/dt+3*c->f.ptr.p_double[sfx+s3]/dt+3*c->f.ptr.p_double[sfx+s4]/dt-4*c->f.ptr.p_double[sfy+s1]/du+4*c->f.ptr.p_double[sfy+s2]/du+2*c->f.ptr.p_double[sfy+s3]/du-2*c->f.ptr.p_double[sfy+s4]/du-2*c->f.ptr.p_double[sfxy+s1]/(dt*du)-2*c->f.ptr.p_double[sfxy+s2]/(dt*du)-c->f.ptr.p_double[sfxy+s3]/(dt*du)-c->f.ptr.p_double[sfxy+s4]/(dt*du); tbl->ptr.pp_double[p][4+3*4+3] = 4*c->f.ptr.p_double[s1]-4*c->f.ptr.p_double[s2]+4*c->f.ptr.p_double[s3]-4*c->f.ptr.p_double[s4]+2*c->f.ptr.p_double[sfx+s1]/dt+2*c->f.ptr.p_double[sfx+s2]/dt-2*c->f.ptr.p_double[sfx+s3]/dt-2*c->f.ptr.p_double[sfx+s4]/dt+2*c->f.ptr.p_double[sfy+s1]/du-2*c->f.ptr.p_double[sfy+s2]/du-2*c->f.ptr.p_double[sfy+s3]/du+2*c->f.ptr.p_double[sfy+s4]/du+c->f.ptr.p_double[sfxy+s1]/(dt*du)+c->f.ptr.p_double[sfxy+s2]/(dt*du)+c->f.ptr.p_double[sfxy+s3]/(dt*du)+c->f.ptr.p_double[sfxy+s4]/(dt*du); } /* * Rescale Cij */ for(ci=0; ci<=3; ci++) { for(cj=0; cj<=3; cj++) { tbl->ptr.pp_double[p][4+ci*4+cj] = tbl->ptr.pp_double[p][4+ci*4+cj]*ae_pow(dt, (double)(ci), _state)*ae_pow(du, (double)(cj), _state); } } } } } /************************************************************************* Internal subroutine. Calculation of the first derivatives and the cross-derivative. 
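Derivatives are computed dimension by dimension: a 1D cubic spline (spline1dbuildcubic) is built along every grid row and differentiated with spline1ddiff to obtain dF/dX, the same is done along every column to obtain dF/dY, and the cross-derivative d2F/dXdY is obtained by differentiating the dF/dY field along X.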
*************************************************************************/ static void spline2d_bicubiccalcderivatives(/* Real */ ae_matrix* a, /* Real */ ae_vector* x, /* Real */ ae_vector* y, ae_int_t m, ae_int_t n, /* Real */ ae_matrix* dx, /* Real */ ae_matrix* dy, /* Real */ ae_matrix* dxy, ae_state *_state) { ae_frame _frame_block; ae_int_t i; ae_int_t j; ae_vector xt; ae_vector ft; double s; double ds; double d2s; spline1dinterpolant c; ae_frame_make(_state, &_frame_block); ae_matrix_clear(dx); ae_matrix_clear(dy); ae_matrix_clear(dxy); ae_vector_init(&xt, 0, DT_REAL, _state); ae_vector_init(&ft, 0, DT_REAL, _state); _spline1dinterpolant_init(&c, _state); ae_matrix_set_length(dx, m, n, _state); ae_matrix_set_length(dy, m, n, _state); ae_matrix_set_length(dxy, m, n, _state); /* * dF/dX */ ae_vector_set_length(&xt, n, _state); ae_vector_set_length(&ft, n, _state); for(i=0; i<=m-1; i++) { for(j=0; j<=n-1; j++) { xt.ptr.p_double[j] = x->ptr.p_double[j]; ft.ptr.p_double[j] = a->ptr.pp_double[i][j]; } spline1dbuildcubic(&xt, &ft, n, 0, 0.0, 0, 0.0, &c, _state); for(j=0; j<=n-1; j++) { spline1ddiff(&c, x->ptr.p_double[j], &s, &ds, &d2s, _state); dx->ptr.pp_double[i][j] = ds; } } /* * dF/dY */ ae_vector_set_length(&xt, m, _state); ae_vector_set_length(&ft, m, _state); for(j=0; j<=n-1; j++) { for(i=0; i<=m-1; i++) { xt.ptr.p_double[i] = y->ptr.p_double[i]; ft.ptr.p_double[i] = a->ptr.pp_double[i][j]; } spline1dbuildcubic(&xt, &ft, m, 0, 0.0, 0, 0.0, &c, _state); for(i=0; i<=m-1; i++) { spline1ddiff(&c, y->ptr.p_double[i], &s, &ds, &d2s, _state); dy->ptr.pp_double[i][j] = ds; } } /* * d2F/dXdY */ ae_vector_set_length(&xt, n, _state); ae_vector_set_length(&ft, n, _state); for(i=0; i<=m-1; i++) { for(j=0; j<=n-1; j++) { xt.ptr.p_double[j] = x->ptr.p_double[j]; ft.ptr.p_double[j] = dy->ptr.pp_double[i][j]; } spline1dbuildcubic(&xt, &ft, n, 0, 0.0, 0, 0.0, &c, _state); for(j=0; j<=n-1; j++) { spline1ddiff(&c, x->ptr.p_double[j], &s, &ds, &d2s, _state); dxy->ptr.pp_double[i][j] = ds; } } ae_frame_leave(_state); } void _spline2dinterpolant_init(void* _p, ae_state *_state) { spline2dinterpolant *p = (spline2dinterpolant*)_p; ae_touch_ptr((void*)p); ae_vector_init(&p->x, 0, DT_REAL, _state); ae_vector_init(&p->y, 0, DT_REAL, _state); ae_vector_init(&p->f, 0, DT_REAL, _state); } void _spline2dinterpolant_init_copy(void* _dst, void* _src, ae_state *_state) { spline2dinterpolant *dst = (spline2dinterpolant*)_dst; spline2dinterpolant *src = (spline2dinterpolant*)_src; dst->k = src->k; dst->stype = src->stype; dst->n = src->n; dst->m = src->m; dst->d = src->d; ae_vector_init_copy(&dst->x, &src->x, _state); ae_vector_init_copy(&dst->y, &src->y, _state); ae_vector_init_copy(&dst->f, &src->f, _state); } void _spline2dinterpolant_clear(void* _p) { spline2dinterpolant *p = (spline2dinterpolant*)_p; ae_touch_ptr((void*)p); ae_vector_clear(&p->x); ae_vector_clear(&p->y); ae_vector_clear(&p->f); } void _spline2dinterpolant_destroy(void* _p) { spline2dinterpolant *p = (spline2dinterpolant*)_p; ae_touch_ptr((void*)p); ae_vector_destroy(&p->x); ae_vector_destroy(&p->y); ae_vector_destroy(&p->f); } /************************************************************************* This function creates RBF model for a scalar (NY=1) or vector (NY>1) function in a NX-dimensional space (NX=2 or NX=3). 
INPUT PARAMETERS: NX - dimension of the space, NX=2 or NX=3 NY - function dimension, NY>=1 OUTPUT PARAMETERS: S - RBF model (initially equals to zero) -- ALGLIB -- Copyright 13.12.2011 by Bochkanov Sergey *************************************************************************/ void rbfv1create(ae_int_t nx, ae_int_t ny, rbfv1model* s, ae_state *_state) { ae_int_t i; ae_int_t j; _rbfv1model_clear(s); ae_assert(nx==2||nx==3, "RBFCreate: NX<>2 and NX<>3", _state); ae_assert(ny>=1, "RBFCreate: NY<1", _state); s->nx = nx; s->ny = ny; s->nl = 0; s->nc = 0; ae_matrix_set_length(&s->v, ny, rbfv1_mxnx+1, _state); for(i=0; i<=ny-1; i++) { for(j=0; j<=rbfv1_mxnx; j++) { s->v.ptr.pp_double[i][j] = (double)(0); } } s->rmax = (double)(0); } /************************************************************************* This function creates buffer structure which can be used to perform parallel RBF model evaluations (with one RBF model instance being used from multiple threads, as long as different threads use different instances of buffer). This buffer object can be used with rbftscalcbuf() function (here "ts" stands for "thread-safe", "buf" is a suffix which denotes function which reuses previously allocated output space). How to use it: * create RBF model structure with rbfcreate() * load data, tune parameters * call rbfbuildmodel() * call rbfcreatecalcbuffer(), once per thread working with RBF model (you should call this function only AFTER call to rbfbuildmodel(), see below for more information) * call rbftscalcbuf() from different threads, with each thread working with its own copy of buffer object. INPUT PARAMETERS S - RBF model OUTPUT PARAMETERS Buf - external buffer. IMPORTANT: buffer object should be used only with RBF model object which was used to initialize buffer. Any attempt to use buffer with different object is dangerous - you may get memory violation error because sizes of internal arrays do not fit to dimensions of RBF structure. IMPORTANT: you should call this function only for model which was built with rbfbuildmodel() function, after successful invocation of rbfbuildmodel(). Sizes of some internal structures are determined only after model is built, so buffer object created before model construction stage will be useless (and any attempt to use it will result in exception). -- ALGLIB -- Copyright 02.04.2016 by Sergey Bochkanov *************************************************************************/ void rbfv1createcalcbuffer(rbfv1model* s, rbfv1calcbuffer* buf, ae_state *_state) { _rbfv1calcbuffer_clear(buf); kdtreecreaterequestbuffer(&s->tree, &buf->requestbuffer, _state); } /************************************************************************* This function builds RBF model and returns report (contains some information which can be used for evaluation of the algorithm properties). Call to this function modifies RBF model by calculating its centers/radii/ weights and saving them into RBFModel structure. Initially RBFModel contain zero coefficients, but after call to this function we will have coefficients which were calculated in order to fit our dataset. After you called this function you can call RBFCalc(), RBFGridCalc() and other model calculation functions. 
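A typical calling sequence at this (internal, v1) level is sketched below; it is
illustrative only. The _rbfv1model_init()/_rbfv1report_init() constructors are assumed
to exist by analogy with _spline2dinterpolant_init(), the dataset matrices XY (N rows,
padded to rbfv1_mxnx=3 columns, since this code always stores three coordinates
internally) and FV (N x NY) are assumed to be filled by the caller, and the numeric
arguments are merely plausible values, not recommended defaults:

    rbfv1model s;
    rbfv1report rep;
    double v;
    _rbfv1model_init(&s, _state);
    _rbfv1report_init(&rep, _state);
    rbfv1create(2, 1, &s, _state);
    rbfv1buildmodel(&xy, &fv, n, 1, 1, 0, 1.0, 5.0, 0.0, 1.0E-6, 1.0E-6, 100, &s, &rep, _state);
    if( rep.terminationtype>0 )
    {
        v = rbfv1calc2(&s, 0.0, 0.0, _state);
    }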
INPUT PARAMETERS: S - RBF model, initialized by RBFCreate() call Rep - report: * Rep.TerminationType: * -5 - non-distinct basis function centers were detected, interpolation aborted * -4 - nonconvergence of the internal SVD solver * 1 - successful termination Fields are used for debugging purposes: * Rep.IterationsCount - iterations count of the LSQR solver * Rep.NMV - number of matrix-vector products * Rep.ARows - rows count for the system matrix * Rep.ACols - columns count for the system matrix * Rep.ANNZ - number of significantly non-zero elements (elements above some algorithm-determined threshold) NOTE: failure to build model will leave current state of the structure unchanged. -- ALGLIB -- Copyright 13.12.2011 by Bochkanov Sergey *************************************************************************/ void rbfv1buildmodel(/* Real */ ae_matrix* x, /* Real */ ae_matrix* y, ae_int_t n, ae_int_t aterm, ae_int_t algorithmtype, ae_int_t nlayers, double radvalue, double radzvalue, double lambdav, double epsort, double epserr, ae_int_t maxits, rbfv1model* s, rbfv1report* rep, ae_state *_state) { ae_frame _frame_block; kdtree tree; kdtree ctree; ae_vector dist; ae_vector xcx; ae_matrix a; ae_matrix v; ae_matrix omega; ae_matrix residualy; ae_vector radius; ae_matrix xc; ae_int_t nc; double rmax; ae_vector tags; ae_vector ctags; ae_int_t i; ae_int_t j; ae_int_t k; ae_int_t snnz; ae_vector tmp0; ae_vector tmp1; ae_int_t layerscnt; ae_bool modelstatus; ae_frame_make(_state, &_frame_block); _rbfv1report_clear(rep); _kdtree_init(&tree, _state); _kdtree_init(&ctree, _state); ae_vector_init(&dist, 0, DT_REAL, _state); ae_vector_init(&xcx, 0, DT_REAL, _state); ae_matrix_init(&a, 0, 0, DT_REAL, _state); ae_matrix_init(&v, 0, 0, DT_REAL, _state); ae_matrix_init(&omega, 0, 0, DT_REAL, _state); ae_matrix_init(&residualy, 0, 0, DT_REAL, _state); ae_vector_init(&radius, 0, DT_REAL, _state); ae_matrix_init(&xc, 0, 0, DT_REAL, _state); ae_vector_init(&tags, 0, DT_INT, _state); ae_vector_init(&ctags, 0, DT_INT, _state); ae_vector_init(&tmp0, 0, DT_REAL, _state); ae_vector_init(&tmp1, 0, DT_REAL, _state); ae_assert(s->nx==2||s->nx==3, "RBFBuildModel: S.NX<>2 or S.NX<>3!", _state); /* * Quick exit when we have no points */ if( n==0 ) { rep->terminationtype = 1; rep->iterationscount = 0; rep->nmv = 0; rep->arows = 0; rep->acols = 0; kdtreebuildtagged(&s->xc, &tags, 0, rbfv1_mxnx, 0, 2, &s->tree, _state); ae_matrix_set_length(&s->xc, 0, 0, _state); ae_matrix_set_length(&s->wr, 0, 0, _state); s->nc = 0; s->rmax = (double)(0); ae_matrix_set_length(&s->v, s->ny, rbfv1_mxnx+1, _state); for(i=0; i<=s->ny-1; i++) { for(j=0; j<=rbfv1_mxnx; j++) { s->v.ptr.pp_double[i][j] = (double)(0); } } ae_frame_leave(_state); return; } /* * General case, N>0 */ rep->annz = 0; rep->iterationscount = 0; rep->nmv = 0; ae_vector_set_length(&xcx, rbfv1_mxnx, _state); /* * First model in a sequence - linear model. * Residuals from linear regression are stored in the ResidualY variable * (used later to build RBF models). */ ae_matrix_set_length(&residualy, n, s->ny, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=s->ny-1; j++) { residualy.ptr.pp_double[i][j] = y->ptr.pp_double[i][j]; } } if( !rbfv1_rbfv1buildlinearmodel(x, &residualy, n, s->ny, aterm, &v, _state) ) { rep->terminationtype = -5; ae_frame_leave(_state); return; } /* * Handle special case: multilayer model with NLayers=0. * Quick exit. 
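* In this case the model reduces to the linear/constant term V computed above: no RBF centers are stored, NC is set to zero and an empty kd-tree is built.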
*/ if( algorithmtype==2&&nlayers==0 ) { rep->terminationtype = 1; rep->iterationscount = 0; rep->nmv = 0; rep->arows = 0; rep->acols = 0; kdtreebuildtagged(&s->xc, &tags, 0, rbfv1_mxnx, 0, 2, &s->tree, _state); ae_matrix_set_length(&s->xc, 0, 0, _state); ae_matrix_set_length(&s->wr, 0, 0, _state); s->nc = 0; s->rmax = (double)(0); ae_matrix_set_length(&s->v, s->ny, rbfv1_mxnx+1, _state); for(i=0; i<=s->ny-1; i++) { for(j=0; j<=rbfv1_mxnx; j++) { s->v.ptr.pp_double[i][j] = v.ptr.pp_double[i][j]; } } ae_frame_leave(_state); return; } /* * Second model in a sequence - RBF term. * * NOTE: assignments below are not necessary, but without them * MSVC complains about unitialized variables. */ nc = 0; rmax = (double)(0); layerscnt = 0; modelstatus = ae_false; if( algorithmtype==1 ) { /* * Add RBF model. * This model uses local KD-trees to speed-up nearest neighbor searches. */ nc = n; ae_matrix_set_length(&xc, nc, rbfv1_mxnx, _state); for(i=0; i<=nc-1; i++) { for(j=0; j<=rbfv1_mxnx-1; j++) { xc.ptr.pp_double[i][j] = x->ptr.pp_double[i][j]; } } rmax = (double)(0); ae_vector_set_length(&radius, nc, _state); ae_vector_set_length(&ctags, nc, _state); for(i=0; i<=nc-1; i++) { ctags.ptr.p_int[i] = i; } kdtreebuildtagged(&xc, &ctags, nc, rbfv1_mxnx, 0, 2, &ctree, _state); if( nc==0 ) { rmax = (double)(1); } else { if( nc==1 ) { radius.ptr.p_double[0] = radvalue; rmax = radius.ptr.p_double[0]; } else { /* * NC>1, calculate radii using distances to nearest neigbors */ for(i=0; i<=nc-1; i++) { for(j=0; j<=rbfv1_mxnx-1; j++) { xcx.ptr.p_double[j] = xc.ptr.pp_double[i][j]; } if( kdtreequeryknn(&ctree, &xcx, 1, ae_false, _state)>0 ) { kdtreequeryresultsdistances(&ctree, &dist, _state); radius.ptr.p_double[i] = radvalue*dist.ptr.p_double[0]; } else { /* * No neighbors found (it will happen when we have only one center). * Initialize radius with default value. 
*/ radius.ptr.p_double[i] = 1.0; } } /* * Apply filtering */ rvectorsetlengthatleast(&tmp0, nc, _state); for(i=0; i<=nc-1; i++) { tmp0.ptr.p_double[i] = radius.ptr.p_double[i]; } tagsortfast(&tmp0, &tmp1, nc, _state); for(i=0; i<=nc-1; i++) { radius.ptr.p_double[i] = ae_minreal(radius.ptr.p_double[i], radzvalue*tmp0.ptr.p_double[nc/2], _state); } /* * Calculate RMax, check that all radii are non-zero */ for(i=0; i<=nc-1; i++) { rmax = ae_maxreal(rmax, radius.ptr.p_double[i], _state); } for(i=0; i<=nc-1; i++) { if( ae_fp_eq(radius.ptr.p_double[i],(double)(0)) ) { rep->terminationtype = -5; ae_frame_leave(_state); return; } } } } ivectorsetlengthatleast(&tags, n, _state); for(i=0; i<=n-1; i++) { tags.ptr.p_int[i] = i; } kdtreebuildtagged(x, &tags, n, rbfv1_mxnx, 0, 2, &tree, _state); rbfv1_buildrbfmodellsqr(x, &residualy, &xc, &radius, n, nc, s->ny, &tree, &ctree, epsort, epserr, maxits, &rep->annz, &snnz, &omega, &rep->terminationtype, &rep->iterationscount, &rep->nmv, _state); layerscnt = 1; modelstatus = ae_true; } if( algorithmtype==2 ) { rmax = radvalue; rbfv1_buildrbfmlayersmodellsqr(x, &residualy, &xc, radvalue, &radius, n, &nc, s->ny, nlayers, &ctree, 1.0E-6, 1.0E-6, 50, lambdav, &rep->annz, &omega, &rep->terminationtype, &rep->iterationscount, &rep->nmv, _state); layerscnt = nlayers; modelstatus = ae_true; } ae_assert(modelstatus, "RBFBuildModel: integrity error", _state); if( rep->terminationtype<=0 ) { ae_frame_leave(_state); return; } /* * Model is built */ s->nc = nc/layerscnt; s->rmax = rmax; s->nl = layerscnt; ae_matrix_set_length(&s->xc, s->nc, rbfv1_mxnx, _state); ae_matrix_set_length(&s->wr, s->nc, 1+s->nl*s->ny, _state); ae_matrix_set_length(&s->v, s->ny, rbfv1_mxnx+1, _state); for(i=0; i<=s->nc-1; i++) { for(j=0; j<=rbfv1_mxnx-1; j++) { s->xc.ptr.pp_double[i][j] = xc.ptr.pp_double[i][j]; } } ivectorsetlengthatleast(&tags, s->nc, _state); for(i=0; i<=s->nc-1; i++) { tags.ptr.p_int[i] = i; } kdtreebuildtagged(&s->xc, &tags, s->nc, rbfv1_mxnx, 0, 2, &s->tree, _state); for(i=0; i<=s->nc-1; i++) { s->wr.ptr.pp_double[i][0] = radius.ptr.p_double[i]; for(k=0; k<=layerscnt-1; k++) { for(j=0; j<=s->ny-1; j++) { s->wr.ptr.pp_double[i][1+k*s->ny+j] = omega.ptr.pp_double[k*s->nc+i][j]; } } } for(i=0; i<=s->ny-1; i++) { for(j=0; j<=rbfv1_mxnx; j++) { s->v.ptr.pp_double[i][j] = v.ptr.pp_double[i][j]; } } rep->terminationtype = 1; rep->arows = n; rep->acols = s->nc; ae_frame_leave(_state); } /************************************************************************* Serializer: allocation -- ALGLIB -- Copyright 02.02.2012 by Bochkanov Sergey *************************************************************************/ void rbfv1alloc(ae_serializer* s, rbfv1model* model, ae_state *_state) { /* * Data */ ae_serializer_alloc_entry(s); ae_serializer_alloc_entry(s); ae_serializer_alloc_entry(s); ae_serializer_alloc_entry(s); kdtreealloc(s, &model->tree, _state); allocrealmatrix(s, &model->xc, -1, -1, _state); allocrealmatrix(s, &model->wr, -1, -1, _state); ae_serializer_alloc_entry(s); allocrealmatrix(s, &model->v, -1, -1, _state); } /************************************************************************* Serializer: serialization -- ALGLIB -- Copyright 02.02.2012 by Bochkanov Sergey *************************************************************************/ void rbfv1serialize(ae_serializer* s, rbfv1model* model, ae_state *_state) { /* * Data */ ae_serializer_serialize_int(s, model->nx, _state); ae_serializer_serialize_int(s, model->ny, _state); ae_serializer_serialize_int(s, model->nc, 
_state); ae_serializer_serialize_int(s, model->nl, _state); kdtreeserialize(s, &model->tree, _state); serializerealmatrix(s, &model->xc, -1, -1, _state); serializerealmatrix(s, &model->wr, -1, -1, _state); ae_serializer_serialize_double(s, model->rmax, _state); serializerealmatrix(s, &model->v, -1, -1, _state); } /************************************************************************* Serializer: unserialization -- ALGLIB -- Copyright 02.02.2012 by Bochkanov Sergey *************************************************************************/ void rbfv1unserialize(ae_serializer* s, rbfv1model* model, ae_state *_state) { ae_int_t nx; ae_int_t ny; _rbfv1model_clear(model); /* * Unserialize primary model parameters, initialize model. * * It is necessary to call RBFCreate() because some internal fields * which are NOT unserialized will need initialization. */ ae_serializer_unserialize_int(s, &nx, _state); ae_serializer_unserialize_int(s, &ny, _state); rbfv1create(nx, ny, model, _state); ae_serializer_unserialize_int(s, &model->nc, _state); ae_serializer_unserialize_int(s, &model->nl, _state); kdtreeunserialize(s, &model->tree, _state); unserializerealmatrix(s, &model->xc, _state); unserializerealmatrix(s, &model->wr, _state); ae_serializer_unserialize_double(s, &model->rmax, _state); unserializerealmatrix(s, &model->v, _state); } /************************************************************************* This function calculates values of the RBF model at the given point. This function should be used when we have NY=1 (scalar function) and NX=2 (2-dimensional space). If you have a 3-dimensional space, use RBFCalc3(). If you have a general situation (NX-dimensional space, NY-dimensional function), you should use the general, less efficient implementation RBFCalc(). If you want to calculate function values many times, consider using RBFGridCalc2(), which is far more efficient than many subsequent calls to RBFCalc2(). 
This function returns 0.0 when: * model is not initialized * NX<>2 *NY<>1 INPUT PARAMETERS: S - RBF model X0 - first coordinate, finite number X1 - second coordinate, finite number RESULT: value of the model or 0.0 (as defined above) -- ALGLIB -- Copyright 13.12.2011 by Bochkanov Sergey *************************************************************************/ double rbfv1calc2(rbfv1model* s, double x0, double x1, ae_state *_state) { ae_int_t i; ae_int_t j; ae_int_t lx; ae_int_t tg; double d2; double t; double bfcur; double rcur; double result; ae_assert(ae_isfinite(x0, _state), "RBFCalc2: invalid value for X0 (X0 is Inf)!", _state); ae_assert(ae_isfinite(x1, _state), "RBFCalc2: invalid value for X1 (X1 is Inf)!", _state); if( s->ny!=1||s->nx!=2 ) { result = (double)(0); return result; } result = s->v.ptr.pp_double[0][0]*x0+s->v.ptr.pp_double[0][1]*x1+s->v.ptr.pp_double[0][rbfv1_mxnx]; if( s->nc==0 ) { return result; } rvectorsetlengthatleast(&s->calcbufxcx, rbfv1_mxnx, _state); for(i=0; i<=rbfv1_mxnx-1; i++) { s->calcbufxcx.ptr.p_double[i] = 0.0; } s->calcbufxcx.ptr.p_double[0] = x0; s->calcbufxcx.ptr.p_double[1] = x1; lx = kdtreequeryrnn(&s->tree, &s->calcbufxcx, s->rmax*rbfv1_rbffarradius, ae_true, _state); kdtreequeryresultsx(&s->tree, &s->calcbufx, _state); kdtreequeryresultstags(&s->tree, &s->calcbuftags, _state); for(i=0; i<=lx-1; i++) { tg = s->calcbuftags.ptr.p_int[i]; d2 = ae_sqr(x0-s->calcbufx.ptr.pp_double[i][0], _state)+ae_sqr(x1-s->calcbufx.ptr.pp_double[i][1], _state); rcur = s->wr.ptr.pp_double[tg][0]; bfcur = ae_exp(-d2/(rcur*rcur), _state); for(j=0; j<=s->nl-1; j++) { result = result+bfcur*s->wr.ptr.pp_double[tg][1+j]; rcur = 0.5*rcur; t = bfcur*bfcur; bfcur = t*t; } } return result; } /************************************************************************* This function calculates values of the RBF model in the given point. This function should be used when we have NY=1 (scalar function) and NX=3 (3-dimensional space). If you have 2-dimensional space, use RBFCalc2(). If you have general situation (NX-dimensional space, NY-dimensional function) you should use general, less efficient implementation RBFCalc(). 
This function returns 0.0 when: * model is not initialized * NX<>3 *NY<>1 INPUT PARAMETERS: S - RBF model X0 - first coordinate, finite number X1 - second coordinate, finite number X2 - third coordinate, finite number RESULT: value of the model or 0.0 (as defined above) -- ALGLIB -- Copyright 13.12.2011 by Bochkanov Sergey *************************************************************************/ double rbfv1calc3(rbfv1model* s, double x0, double x1, double x2, ae_state *_state) { ae_int_t i; ae_int_t j; ae_int_t lx; ae_int_t tg; double t; double rcur; double bf; double result; ae_assert(ae_isfinite(x0, _state), "RBFCalc3: invalid value for X0 (X0 is Inf or NaN)!", _state); ae_assert(ae_isfinite(x1, _state), "RBFCalc3: invalid value for X1 (X1 is Inf or NaN)!", _state); ae_assert(ae_isfinite(x2, _state), "RBFCalc3: invalid value for X2 (X2 is Inf or NaN)!", _state); if( s->ny!=1||s->nx!=3 ) { result = (double)(0); return result; } result = s->v.ptr.pp_double[0][0]*x0+s->v.ptr.pp_double[0][1]*x1+s->v.ptr.pp_double[0][2]*x2+s->v.ptr.pp_double[0][rbfv1_mxnx]; if( s->nc==0 ) { return result; } /* * calculating value for F(X) */ rvectorsetlengthatleast(&s->calcbufxcx, rbfv1_mxnx, _state); for(i=0; i<=rbfv1_mxnx-1; i++) { s->calcbufxcx.ptr.p_double[i] = 0.0; } s->calcbufxcx.ptr.p_double[0] = x0; s->calcbufxcx.ptr.p_double[1] = x1; s->calcbufxcx.ptr.p_double[2] = x2; lx = kdtreequeryrnn(&s->tree, &s->calcbufxcx, s->rmax*rbfv1_rbffarradius, ae_true, _state); kdtreequeryresultsx(&s->tree, &s->calcbufx, _state); kdtreequeryresultstags(&s->tree, &s->calcbuftags, _state); for(i=0; i<=lx-1; i++) { tg = s->calcbuftags.ptr.p_int[i]; rcur = s->wr.ptr.pp_double[tg][0]; bf = ae_exp(-(ae_sqr(x0-s->calcbufx.ptr.pp_double[i][0], _state)+ae_sqr(x1-s->calcbufx.ptr.pp_double[i][1], _state)+ae_sqr(x2-s->calcbufx.ptr.pp_double[i][2], _state))/ae_sqr(rcur, _state), _state); for(j=0; j<=s->nl-1; j++) { result = result+bf*s->wr.ptr.pp_double[tg][1+j]; t = bf*bf; bf = t*t; } } return result; } /************************************************************************* This function calculates values of the RBF model at the given point. Same as RBFCalc(), but does not reallocate Y when in is large enough to store function values. INPUT PARAMETERS: S - RBF model X - coordinates, array[NX]. X may have more than NX elements, in this case only leading NX will be used. Y - possibly preallocated array OUTPUT PARAMETERS: Y - function value, array[NY]. Y is not reallocated when it is larger than NY. 
-- ALGLIB -- Copyright 13.12.2011 by Bochkanov Sergey *************************************************************************/ void rbfv1calcbuf(rbfv1model* s, /* Real */ ae_vector* x, /* Real */ ae_vector* y, ae_state *_state) { ae_int_t i; ae_int_t j; ae_int_t k; ae_int_t lx; ae_int_t tg; double t; double rcur; double bf; ae_assert(x->cnt>=s->nx, "RBFCalcBuf: Length(X)nx, _state), "RBFCalcBuf: X contains infinite or NaN values", _state); if( y->cntny ) { ae_vector_set_length(y, s->ny, _state); } for(i=0; i<=s->ny-1; i++) { y->ptr.p_double[i] = s->v.ptr.pp_double[i][rbfv1_mxnx]; for(j=0; j<=s->nx-1; j++) { y->ptr.p_double[i] = y->ptr.p_double[i]+s->v.ptr.pp_double[i][j]*x->ptr.p_double[j]; } } if( s->nc==0 ) { return; } rvectorsetlengthatleast(&s->calcbufxcx, rbfv1_mxnx, _state); for(i=0; i<=rbfv1_mxnx-1; i++) { s->calcbufxcx.ptr.p_double[i] = 0.0; } for(i=0; i<=s->nx-1; i++) { s->calcbufxcx.ptr.p_double[i] = x->ptr.p_double[i]; } lx = kdtreequeryrnn(&s->tree, &s->calcbufxcx, s->rmax*rbfv1_rbffarradius, ae_true, _state); kdtreequeryresultsx(&s->tree, &s->calcbufx, _state); kdtreequeryresultstags(&s->tree, &s->calcbuftags, _state); for(i=0; i<=s->ny-1; i++) { for(j=0; j<=lx-1; j++) { tg = s->calcbuftags.ptr.p_int[j]; rcur = s->wr.ptr.pp_double[tg][0]; bf = ae_exp(-(ae_sqr(s->calcbufxcx.ptr.p_double[0]-s->calcbufx.ptr.pp_double[j][0], _state)+ae_sqr(s->calcbufxcx.ptr.p_double[1]-s->calcbufx.ptr.pp_double[j][1], _state)+ae_sqr(s->calcbufxcx.ptr.p_double[2]-s->calcbufx.ptr.pp_double[j][2], _state))/ae_sqr(rcur, _state), _state); for(k=0; k<=s->nl-1; k++) { y->ptr.p_double[i] = y->ptr.p_double[i]+bf*s->wr.ptr.pp_double[tg][1+k*s->ny+i]; t = bf*bf; bf = t*t; } } } } /************************************************************************* This function calculates values of the RBF model at the given point, using external buffer object (internal temporaries of RBF model are not modified). This function allows to use same RBF model object in different threads, assuming that different threads use different instances of buffer structure. INPUT PARAMETERS: S - RBF model, may be shared between different threads Buf - buffer object created for this particular instance of RBF model with rbfcreatecalcbuffer(). X - coordinates, array[NX]. X may have more than NX elements, in this case only leading NX will be used. Y - possibly preallocated array OUTPUT PARAMETERS: Y - function value, array[NY]. Y is not reallocated when it is larger than NY. 
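Thread-safe usage sketch (illustrative only; the _rbfv1calcbuffer_init() constructor
name is assumed by analogy with the other init functions in this unit). Each worker
thread owns its buffer, while the model S is shared read-only between threads:

    rbfv1calcbuffer buf;
    ae_vector x;
    ae_vector y;
    _rbfv1calcbuffer_init(&buf, _state);
    rbfv1createcalcbuffer(s, &buf, _state);
    ae_vector_init(&x, s->nx, DT_REAL, _state);
    ae_vector_init(&y, 0, DT_REAL, _state);
    ... fill x with the query point ...
    rbfv1tscalcbuf(s, &buf, &x, &y, _state);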
-- ALGLIB -- Copyright 13.12.2011 by Bochkanov Sergey *************************************************************************/ void rbfv1tscalcbuf(rbfv1model* s, rbfv1calcbuffer* buf, /* Real */ ae_vector* x, /* Real */ ae_vector* y, ae_state *_state) { ae_int_t i; ae_int_t j; ae_int_t k; ae_int_t lx; ae_int_t tg; double t; double rcur; double bf; ae_assert(x->cnt>=s->nx, "RBFCalcBuf: Length(X)nx, _state), "RBFCalcBuf: X contains infinite or NaN values", _state); if( y->cntny ) { ae_vector_set_length(y, s->ny, _state); } for(i=0; i<=s->ny-1; i++) { y->ptr.p_double[i] = s->v.ptr.pp_double[i][rbfv1_mxnx]; for(j=0; j<=s->nx-1; j++) { y->ptr.p_double[i] = y->ptr.p_double[i]+s->v.ptr.pp_double[i][j]*x->ptr.p_double[j]; } } if( s->nc==0 ) { return; } rvectorsetlengthatleast(&buf->calcbufxcx, rbfv1_mxnx, _state); for(i=0; i<=rbfv1_mxnx-1; i++) { buf->calcbufxcx.ptr.p_double[i] = 0.0; } for(i=0; i<=s->nx-1; i++) { buf->calcbufxcx.ptr.p_double[i] = x->ptr.p_double[i]; } lx = kdtreetsqueryrnn(&s->tree, &buf->requestbuffer, &buf->calcbufxcx, s->rmax*rbfv1_rbffarradius, ae_true, _state); kdtreetsqueryresultsx(&s->tree, &buf->requestbuffer, &buf->calcbufx, _state); kdtreetsqueryresultstags(&s->tree, &buf->requestbuffer, &buf->calcbuftags, _state); for(i=0; i<=s->ny-1; i++) { for(j=0; j<=lx-1; j++) { tg = buf->calcbuftags.ptr.p_int[j]; rcur = s->wr.ptr.pp_double[tg][0]; bf = ae_exp(-(ae_sqr(buf->calcbufxcx.ptr.p_double[0]-buf->calcbufx.ptr.pp_double[j][0], _state)+ae_sqr(buf->calcbufxcx.ptr.p_double[1]-buf->calcbufx.ptr.pp_double[j][1], _state)+ae_sqr(buf->calcbufxcx.ptr.p_double[2]-buf->calcbufx.ptr.pp_double[j][2], _state))/ae_sqr(rcur, _state), _state); for(k=0; k<=s->nl-1; k++) { y->ptr.p_double[i] = y->ptr.p_double[i]+bf*s->wr.ptr.pp_double[tg][1+k*s->ny+i]; t = bf*bf; bf = t*t; } } } } /************************************************************************* This function calculates values of the RBF model at the regular grid. Grid have N0*N1 points, with Point[I,J] = (X0[I], X1[J]) This function returns 0.0 when: * model is not initialized * NX<>2 *NY<>1 INPUT PARAMETERS: S - RBF model X0 - array of grid nodes, first coordinates, array[N0] N0 - grid size (number of nodes) in the first dimension X1 - array of grid nodes, second coordinates, array[N1] N1 - grid size (number of nodes) in the second dimension OUTPUT PARAMETERS: Y - function values, array[N0,N1]. Y is out-variable and is reallocated by this function. NOTE: as a special exception, this function supports unordered arrays X0 and X1. However, future versions may be more efficient for X0/X1 ordered by ascending. 
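Usage sketch (illustrative only; grid node coordinates are filled by the caller):

    ae_vector x0;
    ae_vector x1;
    ae_matrix y;
    ae_vector_init(&x0, n0, DT_REAL, _state);
    ae_vector_init(&x1, n1, DT_REAL, _state);
    ae_matrix_init(&y, 0, 0, DT_REAL, _state);
    ... fill x0.ptr.p_double[] and x1.ptr.p_double[] with grid nodes ...
    rbfv1gridcalc2(s, &x0, n0, &x1, n1, &y, _state);
    ... y.ptr.pp_double[i][j] holds the model value at (x0[i], x1[j]) ...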
-- ALGLIB -- Copyright 13.12.2011 by Bochkanov Sergey *************************************************************************/ void rbfv1gridcalc2(rbfv1model* s, /* Real */ ae_vector* x0, ae_int_t n0, /* Real */ ae_vector* x1, ae_int_t n1, /* Real */ ae_matrix* y, ae_state *_state) { ae_frame _frame_block; ae_vector cpx0; ae_vector cpx1; ae_vector p01; ae_vector p11; ae_vector p2; double rlimit; double xcnorm2; ae_int_t hp01; double hcpx0; double xc0; double xc1; double omega; double radius; ae_int_t i; ae_int_t j; ae_int_t k; ae_int_t d; ae_int_t i00; ae_int_t i01; ae_int_t i10; ae_int_t i11; ae_frame_make(_state, &_frame_block); ae_matrix_clear(y); ae_vector_init(&cpx0, 0, DT_REAL, _state); ae_vector_init(&cpx1, 0, DT_REAL, _state); ae_vector_init(&p01, 0, DT_INT, _state); ae_vector_init(&p11, 0, DT_INT, _state); ae_vector_init(&p2, 0, DT_INT, _state); ae_assert(n0>0, "RBFGridCalc2: invalid value for N0 (N0<=0)!", _state); ae_assert(n1>0, "RBFGridCalc2: invalid value for N1 (N1<=0)!", _state); ae_assert(x0->cnt>=n0, "RBFGridCalc2: Length(X0)cnt>=n1, "RBFGridCalc2: Length(X1)ptr.pp_double[i][j] = (double)(0); } } if( (s->ny!=1||s->nx!=2)||s->nc==0 ) { ae_frame_leave(_state); return; } /* *create and sort arrays */ ae_vector_set_length(&cpx0, n0, _state); for(i=0; i<=n0-1; i++) { cpx0.ptr.p_double[i] = x0->ptr.p_double[i]; } tagsort(&cpx0, n0, &p01, &p2, _state); ae_vector_set_length(&cpx1, n1, _state); for(i=0; i<=n1-1; i++) { cpx1.ptr.p_double[i] = x1->ptr.p_double[i]; } tagsort(&cpx1, n1, &p11, &p2, _state); /* *calculate function's value */ for(i=0; i<=s->nc-1; i++) { radius = s->wr.ptr.pp_double[i][0]; for(d=0; d<=s->nl-1; d++) { omega = s->wr.ptr.pp_double[i][1+d]; rlimit = radius*rbfv1_rbffarradius; /* *search lower and upper indexes */ i00 = lowerbound(&cpx0, n0, s->xc.ptr.pp_double[i][0]-rlimit, _state); i01 = upperbound(&cpx0, n0, s->xc.ptr.pp_double[i][0]+rlimit, _state); i10 = lowerbound(&cpx1, n1, s->xc.ptr.pp_double[i][1]-rlimit, _state); i11 = upperbound(&cpx1, n1, s->xc.ptr.pp_double[i][1]+rlimit, _state); xc0 = s->xc.ptr.pp_double[i][0]; xc1 = s->xc.ptr.pp_double[i][1]; for(j=i00; j<=i01-1; j++) { hcpx0 = cpx0.ptr.p_double[j]; hp01 = p01.ptr.p_int[j]; for(k=i10; k<=i11-1; k++) { xcnorm2 = ae_sqr(hcpx0-xc0, _state)+ae_sqr(cpx1.ptr.p_double[k]-xc1, _state); if( ae_fp_less_eq(xcnorm2,rlimit*rlimit) ) { y->ptr.pp_double[hp01][p11.ptr.p_int[k]] = y->ptr.pp_double[hp01][p11.ptr.p_int[k]]+ae_exp(-xcnorm2/ae_sqr(radius, _state), _state)*omega; } } } radius = 0.5*radius; } } /* *add linear term */ for(i=0; i<=n0-1; i++) { for(j=0; j<=n1-1; j++) { y->ptr.pp_double[i][j] = y->ptr.pp_double[i][j]+s->v.ptr.pp_double[0][0]*x0->ptr.p_double[i]+s->v.ptr.pp_double[0][1]*x1->ptr.p_double[j]+s->v.ptr.pp_double[0][rbfv1_mxnx]; } } ae_frame_leave(_state); } void rbfv1gridcalc3vrec(rbfv1model* s, /* Real */ ae_vector* x0, ae_int_t n0, /* Real */ ae_vector* x1, ae_int_t n1, /* Real */ ae_vector* x2, ae_int_t n2, /* Integer */ ae_vector* blocks0, ae_int_t block0a, ae_int_t block0b, /* Integer */ ae_vector* blocks1, ae_int_t block1a, ae_int_t block1b, /* Integer */ ae_vector* blocks2, ae_int_t block2a, ae_int_t block2b, /* Boolean */ ae_vector* flagy, ae_bool sparsey, double searchradius, double avgfuncpernode, ae_shared_pool* bufpool, /* Real */ ae_vector* y, ae_state *_state) { ae_frame _frame_block; ae_int_t i; ae_int_t j; ae_int_t k; ae_int_t t; ae_int_t l; ae_int_t i0; ae_int_t i1; ae_int_t i2; ae_int_t ic; gridcalc3v1buf *pbuf; ae_smart_ptr _pbuf; ae_int_t flag12dim1; ae_int_t flag12dim2; 
double problemcost; ae_int_t maxbs; ae_int_t nx; ae_int_t ny; double v; ae_int_t kc; ae_int_t tg; double rcur; double rcur2; double basisfuncval; ae_int_t dstoffs; ae_int_t srcoffs; ae_int_t ubnd; double w0; double w1; double w2; ae_bool allnodes; ae_bool somenodes; ae_frame_make(_state, &_frame_block); ae_smart_ptr_init(&_pbuf, (void**)&pbuf, _state); nx = s->nx; ny = s->ny; /* * Try to split large problem */ problemcost = (s->nl+1)*s->ny*2*(avgfuncpernode+1); problemcost = problemcost*(blocks0->ptr.p_int[block0b]-blocks0->ptr.p_int[block0a]); problemcost = problemcost*(blocks1->ptr.p_int[block1b]-blocks1->ptr.p_int[block1a]); problemcost = problemcost*(blocks2->ptr.p_int[block2b]-blocks2->ptr.p_int[block2a]); maxbs = 0; maxbs = ae_maxint(maxbs, block0b-block0a, _state); maxbs = ae_maxint(maxbs, block1b-block1a, _state); maxbs = ae_maxint(maxbs, block2b-block2a, _state); if( ae_fp_greater_eq(problemcost,rbfv1_minbasecasecost)&&maxbs>=2 ) { if( block0b-block0a==maxbs ) { rbfv1gridcalc3vrec(s, x0, n0, x1, n1, x2, n2, blocks0, block0a, block0a+maxbs/2, blocks1, block1a, block1b, blocks2, block2a, block2b, flagy, sparsey, searchradius, avgfuncpernode, bufpool, y, _state); rbfv1gridcalc3vrec(s, x0, n0, x1, n1, x2, n2, blocks0, block0a+maxbs/2, block0b, blocks1, block1a, block1b, blocks2, block2a, block2b, flagy, sparsey, searchradius, avgfuncpernode, bufpool, y, _state); ae_frame_leave(_state); return; } if( block1b-block1a==maxbs ) { rbfv1gridcalc3vrec(s, x0, n0, x1, n1, x2, n2, blocks0, block0a, block0b, blocks1, block1a, block1a+maxbs/2, blocks2, block2a, block2b, flagy, sparsey, searchradius, avgfuncpernode, bufpool, y, _state); rbfv1gridcalc3vrec(s, x0, n0, x1, n1, x2, n2, blocks0, block0a, block0b, blocks1, block1a+maxbs/2, block1b, blocks2, block2a, block2b, flagy, sparsey, searchradius, avgfuncpernode, bufpool, y, _state); ae_frame_leave(_state); return; } if( block2b-block2a==maxbs ) { rbfv1gridcalc3vrec(s, x0, n0, x1, n1, x2, n2, blocks0, block0a, block0b, blocks1, block1a, block1b, blocks2, block2a, block2a+maxbs/2, flagy, sparsey, searchradius, avgfuncpernode, bufpool, y, _state); rbfv1gridcalc3vrec(s, x0, n0, x1, n1, x2, n2, blocks0, block0a, block0b, blocks1, block1a, block1b, blocks2, block2a+maxbs/2, block2b, flagy, sparsey, searchradius, avgfuncpernode, bufpool, y, _state); ae_frame_leave(_state); return; } } /* * Retrieve buffer object from pool (it will be returned later) */ ae_shared_pool_retrieve(bufpool, &_pbuf, _state); /* * Calculate RBF model */ for(i2=block2a; i2<=block2b-1; i2++) { for(i1=block1a; i1<=block1b-1; i1++) { for(i0=block0a; i0<=block0b-1; i0++) { /* * Analyze block - determine what elements are needed and what are not. * * After this block is done, two flag variables can be used: * * SomeNodes, which is True when there are at least one node which have * to be calculated * * AllNodes, which is True when all nodes are required */ somenodes = ae_true; allnodes = ae_true; flag12dim1 = blocks1->ptr.p_int[i1+1]-blocks1->ptr.p_int[i1]; flag12dim2 = blocks2->ptr.p_int[i2+1]-blocks2->ptr.p_int[i2]; if( sparsey ) { /* * Use FlagY to determine what is required. 
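* Flag0/Flag1/Flag2 mark grid lines (one flag array per axis) which contain at least one requested node, Flag12 marks requested (J,K) micro-rows, and SomeNodes/AllNodes summarize whether the block can be skipped entirely or processed without per-node checks.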
*/ bvectorsetlengthatleast(&pbuf->flag0, n0, _state); bvectorsetlengthatleast(&pbuf->flag1, n1, _state); bvectorsetlengthatleast(&pbuf->flag2, n2, _state); bvectorsetlengthatleast(&pbuf->flag12, flag12dim1*flag12dim2, _state); for(i=blocks0->ptr.p_int[i0]; i<=blocks0->ptr.p_int[i0+1]-1; i++) { pbuf->flag0.ptr.p_bool[i] = ae_false; } for(j=blocks1->ptr.p_int[i1]; j<=blocks1->ptr.p_int[i1+1]-1; j++) { pbuf->flag1.ptr.p_bool[j] = ae_false; } for(k=blocks2->ptr.p_int[i2]; k<=blocks2->ptr.p_int[i2+1]-1; k++) { pbuf->flag2.ptr.p_bool[k] = ae_false; } for(i=0; i<=flag12dim1*flag12dim2-1; i++) { pbuf->flag12.ptr.p_bool[i] = ae_false; } somenodes = ae_false; allnodes = ae_true; for(k=blocks2->ptr.p_int[i2]; k<=blocks2->ptr.p_int[i2+1]-1; k++) { for(j=blocks1->ptr.p_int[i1]; j<=blocks1->ptr.p_int[i1+1]-1; j++) { dstoffs = j-blocks1->ptr.p_int[i1]+flag12dim1*(k-blocks2->ptr.p_int[i2]); srcoffs = j*n0+k*n0*n1; for(i=blocks0->ptr.p_int[i0]; i<=blocks0->ptr.p_int[i0+1]-1; i++) { if( flagy->ptr.p_bool[srcoffs+i] ) { pbuf->flag0.ptr.p_bool[i] = ae_true; pbuf->flag1.ptr.p_bool[j] = ae_true; pbuf->flag2.ptr.p_bool[k] = ae_true; pbuf->flag12.ptr.p_bool[dstoffs] = ae_true; somenodes = ae_true; } else { allnodes = ae_false; } } } } } /* * Skip block if it is completely empty. */ if( !somenodes ) { continue; } /* * compute linear term for block (I0,I1,I2) */ for(k=blocks2->ptr.p_int[i2]; k<=blocks2->ptr.p_int[i2+1]-1; k++) { for(j=blocks1->ptr.p_int[i1]; j<=blocks1->ptr.p_int[i1+1]-1; j++) { /* * do we need this micro-row? */ if( !allnodes&&!pbuf->flag12.ptr.p_bool[j-blocks1->ptr.p_int[i1]+flag12dim1*(k-blocks2->ptr.p_int[i2])] ) { continue; } /* * Compute linear term */ for(i=blocks0->ptr.p_int[i0]; i<=blocks0->ptr.p_int[i0+1]-1; i++) { pbuf->tx.ptr.p_double[0] = x0->ptr.p_double[i]; pbuf->tx.ptr.p_double[1] = x1->ptr.p_double[j]; pbuf->tx.ptr.p_double[2] = x2->ptr.p_double[k]; for(l=0; l<=s->ny-1; l++) { v = s->v.ptr.pp_double[l][rbfv1_mxnx]; for(t=0; t<=nx-1; t++) { v = v+s->v.ptr.pp_double[l][t]*pbuf->tx.ptr.p_double[t]; } y->ptr.p_double[l+ny*(i+j*n0+k*n0*n1)] = v; } } } } /* * compute RBF term for block (I0,I1,I2) */ pbuf->tx.ptr.p_double[0] = 0.5*(x0->ptr.p_double[blocks0->ptr.p_int[i0]]+x0->ptr.p_double[blocks0->ptr.p_int[i0+1]-1]); pbuf->tx.ptr.p_double[1] = 0.5*(x1->ptr.p_double[blocks1->ptr.p_int[i1]]+x1->ptr.p_double[blocks1->ptr.p_int[i1+1]-1]); pbuf->tx.ptr.p_double[2] = 0.5*(x2->ptr.p_double[blocks2->ptr.p_int[i2]]+x2->ptr.p_double[blocks2->ptr.p_int[i2+1]-1]); kc = kdtreetsqueryrnn(&s->tree, &pbuf->requestbuf, &pbuf->tx, searchradius, ae_true, _state); kdtreetsqueryresultsx(&s->tree, &pbuf->requestbuf, &pbuf->calcbufx, _state); kdtreetsqueryresultstags(&s->tree, &pbuf->requestbuf, &pbuf->calcbuftags, _state); for(ic=0; ic<=kc-1; ic++) { pbuf->cx.ptr.p_double[0] = pbuf->calcbufx.ptr.pp_double[ic][0]; pbuf->cx.ptr.p_double[1] = pbuf->calcbufx.ptr.pp_double[ic][1]; pbuf->cx.ptr.p_double[2] = pbuf->calcbufx.ptr.pp_double[ic][2]; tg = pbuf->calcbuftags.ptr.p_int[ic]; rcur = s->wr.ptr.pp_double[tg][0]; rcur2 = rcur*rcur; for(i=blocks0->ptr.p_int[i0]; i<=blocks0->ptr.p_int[i0+1]-1; i++) { if( allnodes||pbuf->flag0.ptr.p_bool[i] ) { pbuf->expbuf0.ptr.p_double[i] = ae_exp(-ae_sqr(x0->ptr.p_double[i]-pbuf->cx.ptr.p_double[0], _state)/rcur2, _state); } else { pbuf->expbuf0.ptr.p_double[i] = 0.0; } } for(j=blocks1->ptr.p_int[i1]; j<=blocks1->ptr.p_int[i1+1]-1; j++) { if( allnodes||pbuf->flag1.ptr.p_bool[j] ) { pbuf->expbuf1.ptr.p_double[j] = ae_exp(-ae_sqr(x1->ptr.p_double[j]-pbuf->cx.ptr.p_double[1], 
_state)/rcur2, _state); } else { pbuf->expbuf1.ptr.p_double[j] = 0.0; } } for(k=blocks2->ptr.p_int[i2]; k<=blocks2->ptr.p_int[i2+1]-1; k++) { if( allnodes||pbuf->flag2.ptr.p_bool[k] ) { pbuf->expbuf2.ptr.p_double[k] = ae_exp(-ae_sqr(x2->ptr.p_double[k]-pbuf->cx.ptr.p_double[2], _state)/rcur2, _state); } else { pbuf->expbuf2.ptr.p_double[k] = 0.0; } } for(t=0; t<=s->nl-1; t++) { /* * Calculate */ for(k=blocks2->ptr.p_int[i2]; k<=blocks2->ptr.p_int[i2+1]-1; k++) { for(j=blocks1->ptr.p_int[i1]; j<=blocks1->ptr.p_int[i1+1]-1; j++) { /* * do we need this micro-row? */ if( !allnodes&&!pbuf->flag12.ptr.p_bool[j-blocks1->ptr.p_int[i1]+flag12dim1*(k-blocks2->ptr.p_int[i2])] ) { continue; } /* * Prepare local variables */ dstoffs = ny*(blocks0->ptr.p_int[i0]+j*n0+k*n0*n1); v = pbuf->expbuf1.ptr.p_double[j]*pbuf->expbuf2.ptr.p_double[k]; /* * Optimized for NY=1 */ if( s->ny==1 ) { w0 = s->wr.ptr.pp_double[tg][1+t*s->ny+0]; ubnd = blocks0->ptr.p_int[i0+1]-1; for(i=blocks0->ptr.p_int[i0]; i<=ubnd; i++) { basisfuncval = pbuf->expbuf0.ptr.p_double[i]*v; y->ptr.p_double[dstoffs] = y->ptr.p_double[dstoffs]+basisfuncval*w0; dstoffs = dstoffs+1; } continue; } /* * Optimized for NY=2 */ if( s->ny==2 ) { w0 = s->wr.ptr.pp_double[tg][1+t*s->ny+0]; w1 = s->wr.ptr.pp_double[tg][1+t*s->ny+1]; ubnd = blocks0->ptr.p_int[i0+1]-1; for(i=blocks0->ptr.p_int[i0]; i<=ubnd; i++) { basisfuncval = pbuf->expbuf0.ptr.p_double[i]*v; y->ptr.p_double[dstoffs+0] = y->ptr.p_double[dstoffs+0]+basisfuncval*w0; y->ptr.p_double[dstoffs+1] = y->ptr.p_double[dstoffs+1]+basisfuncval*w1; dstoffs = dstoffs+2; } continue; } /* * Optimized for NY=3 */ if( s->ny==3 ) { w0 = s->wr.ptr.pp_double[tg][1+t*s->ny+0]; w1 = s->wr.ptr.pp_double[tg][1+t*s->ny+1]; w2 = s->wr.ptr.pp_double[tg][1+t*s->ny+2]; ubnd = blocks0->ptr.p_int[i0+1]-1; for(i=blocks0->ptr.p_int[i0]; i<=ubnd; i++) { basisfuncval = pbuf->expbuf0.ptr.p_double[i]*v; y->ptr.p_double[dstoffs+0] = y->ptr.p_double[dstoffs+0]+basisfuncval*w0; y->ptr.p_double[dstoffs+1] = y->ptr.p_double[dstoffs+1]+basisfuncval*w1; y->ptr.p_double[dstoffs+2] = y->ptr.p_double[dstoffs+2]+basisfuncval*w2; dstoffs = dstoffs+3; } continue; } /* * General case */ for(i=blocks0->ptr.p_int[i0]; i<=blocks0->ptr.p_int[i0+1]-1; i++) { basisfuncval = pbuf->expbuf0.ptr.p_double[i]*v; for(l=0; l<=s->ny-1; l++) { y->ptr.p_double[l+dstoffs] = y->ptr.p_double[l+dstoffs]+basisfuncval*s->wr.ptr.pp_double[tg][1+t*s->ny+l]; } dstoffs = dstoffs+ny; } } } /* * Update basis functions */ if( t!=s->nl-1 ) { ubnd = blocks0->ptr.p_int[i0+1]-1; for(i=blocks0->ptr.p_int[i0]; i<=ubnd; i++) { if( allnodes||pbuf->flag0.ptr.p_bool[i] ) { v = pbuf->expbuf0.ptr.p_double[i]*pbuf->expbuf0.ptr.p_double[i]; pbuf->expbuf0.ptr.p_double[i] = v*v; } } ubnd = blocks1->ptr.p_int[i1+1]-1; for(j=blocks1->ptr.p_int[i1]; j<=ubnd; j++) { if( allnodes||pbuf->flag1.ptr.p_bool[j] ) { v = pbuf->expbuf1.ptr.p_double[j]*pbuf->expbuf1.ptr.p_double[j]; pbuf->expbuf1.ptr.p_double[j] = v*v; } } ubnd = blocks2->ptr.p_int[i2+1]-1; for(k=blocks2->ptr.p_int[i2]; k<=ubnd; k++) { if( allnodes||pbuf->flag2.ptr.p_bool[k] ) { v = pbuf->expbuf2.ptr.p_double[k]*pbuf->expbuf2.ptr.p_double[k]; pbuf->expbuf2.ptr.p_double[k] = v*v; } } } } } } } } /* * Recycle buffer object back to pool */ ae_shared_pool_recycle(bufpool, &_pbuf, _state); ae_frame_leave(_state); } /************************************************************************* This function "unpacks" RBF model by extracting its coefficients. 
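For the Gaussian basis used by this model the unpacked data combine as follows (illustrative formula, K-th component of the model, 0-based indices): F[K](X) = V[K][NX] + SUM(V[K][I]*X[I], I=0..NX-1) + SUM over rows J of XWR of XWR[J][NX+K]*exp(-dist2(X,C[J])/XWR[J][NX+NY]^2), where C[J] is the center stored in the first NX columns of row J and dist2() is the squared Euclidean distance.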
INPUT PARAMETERS: S - RBF model OUTPUT PARAMETERS: NX - dimensionality of argument NY - dimensionality of the target function XWR - model information, array[NC,NX+NY+1]. One row of the array corresponds to one basis function: * first NX columns - coordinates of the center * next NY columns - weights, one per dimension of the function being modelled * last column - radius, same for all dimensions of the function being modelled NC - number of the centers V - polynomial term , array[NY,NX+1]. One row per one dimension of the function being modelled. First NX elements are linear coefficients, V[NX] is equal to the constant part. -- ALGLIB -- Copyright 13.12.2011 by Bochkanov Sergey *************************************************************************/ void rbfv1unpack(rbfv1model* s, ae_int_t* nx, ae_int_t* ny, /* Real */ ae_matrix* xwr, ae_int_t* nc, /* Real */ ae_matrix* v, ae_state *_state) { ae_int_t i; ae_int_t j; double rcur; *nx = 0; *ny = 0; ae_matrix_clear(xwr); *nc = 0; ae_matrix_clear(v); *nx = s->nx; *ny = s->ny; *nc = s->nc; /* * Fill V */ ae_matrix_set_length(v, s->ny, s->nx+1, _state); for(i=0; i<=s->ny-1; i++) { ae_v_move(&v->ptr.pp_double[i][0], 1, &s->v.ptr.pp_double[i][0], 1, ae_v_len(0,s->nx-1)); v->ptr.pp_double[i][s->nx] = s->v.ptr.pp_double[i][rbfv1_mxnx]; } /* * Fill XWR and V */ if( *nc*s->nl>0 ) { ae_matrix_set_length(xwr, s->nc*s->nl, s->nx+s->ny+1, _state); for(i=0; i<=s->nc-1; i++) { rcur = s->wr.ptr.pp_double[i][0]; for(j=0; j<=s->nl-1; j++) { ae_v_move(&xwr->ptr.pp_double[i*s->nl+j][0], 1, &s->xc.ptr.pp_double[i][0], 1, ae_v_len(0,s->nx-1)); ae_v_move(&xwr->ptr.pp_double[i*s->nl+j][s->nx], 1, &s->wr.ptr.pp_double[i][1+j*s->ny], 1, ae_v_len(s->nx,s->nx+s->ny-1)); xwr->ptr.pp_double[i*s->nl+j][s->nx+s->ny] = rcur; rcur = 0.5*rcur; } } } } static ae_bool rbfv1_rbfv1buildlinearmodel(/* Real */ ae_matrix* x, /* Real */ ae_matrix* y, ae_int_t n, ae_int_t ny, ae_int_t modeltype, /* Real */ ae_matrix* v, ae_state *_state) { ae_frame _frame_block; ae_vector tmpy; ae_matrix a; double scaling; ae_vector shifting; double mn; double mx; ae_vector c; lsfitreport rep; ae_int_t i; ae_int_t j; ae_int_t k; ae_int_t info; ae_bool result; ae_frame_make(_state, &_frame_block); ae_matrix_clear(v); ae_vector_init(&tmpy, 0, DT_REAL, _state); ae_matrix_init(&a, 0, 0, DT_REAL, _state); ae_vector_init(&shifting, 0, DT_REAL, _state); ae_vector_init(&c, 0, DT_REAL, _state); _lsfitreport_init(&rep, _state); ae_assert(n>=0, "BuildLinearModel: N<0", _state); ae_assert(ny>0, "BuildLinearModel: NY<=0", _state); /* * Handle degenerate case (N=0) */ result = ae_true; ae_matrix_set_length(v, ny, rbfv1_mxnx+1, _state); if( n==0 ) { for(j=0; j<=rbfv1_mxnx; j++) { for(i=0; i<=ny-1; i++) { v->ptr.pp_double[i][j] = (double)(0); } } ae_frame_leave(_state); return result; } /* * Allocate temporaries */ ae_vector_set_length(&tmpy, n, _state); /* * General linear model. 
*/ if( modeltype==1 ) { /* * Calculate scaling/shifting, transform variables, prepare LLS problem */ ae_matrix_set_length(&a, n, rbfv1_mxnx+1, _state); ae_vector_set_length(&shifting, rbfv1_mxnx, _state); scaling = (double)(0); for(i=0; i<=rbfv1_mxnx-1; i++) { mn = x->ptr.pp_double[0][i]; mx = mn; for(j=1; j<=n-1; j++) { if( ae_fp_greater(mn,x->ptr.pp_double[j][i]) ) { mn = x->ptr.pp_double[j][i]; } if( ae_fp_less(mx,x->ptr.pp_double[j][i]) ) { mx = x->ptr.pp_double[j][i]; } } scaling = ae_maxreal(scaling, mx-mn, _state); shifting.ptr.p_double[i] = 0.5*(mx+mn); } if( ae_fp_eq(scaling,(double)(0)) ) { scaling = (double)(1); } else { scaling = 0.5*scaling; } for(i=0; i<=n-1; i++) { for(j=0; j<=rbfv1_mxnx-1; j++) { a.ptr.pp_double[i][j] = (x->ptr.pp_double[i][j]-shifting.ptr.p_double[j])/scaling; } } for(i=0; i<=n-1; i++) { a.ptr.pp_double[i][rbfv1_mxnx] = (double)(1); } /* * Solve linear system in transformed variables, make backward */ for(i=0; i<=ny-1; i++) { for(j=0; j<=n-1; j++) { tmpy.ptr.p_double[j] = y->ptr.pp_double[j][i]; } lsfitlinear(&tmpy, &a, n, rbfv1_mxnx+1, &info, &c, &rep, _state); if( info<=0 ) { result = ae_false; ae_frame_leave(_state); return result; } for(j=0; j<=rbfv1_mxnx-1; j++) { v->ptr.pp_double[i][j] = c.ptr.p_double[j]/scaling; } v->ptr.pp_double[i][rbfv1_mxnx] = c.ptr.p_double[rbfv1_mxnx]; for(j=0; j<=rbfv1_mxnx-1; j++) { v->ptr.pp_double[i][rbfv1_mxnx] = v->ptr.pp_double[i][rbfv1_mxnx]-shifting.ptr.p_double[j]*v->ptr.pp_double[i][j]; } for(j=0; j<=n-1; j++) { for(k=0; k<=rbfv1_mxnx-1; k++) { y->ptr.pp_double[j][i] = y->ptr.pp_double[j][i]-x->ptr.pp_double[j][k]*v->ptr.pp_double[i][k]; } y->ptr.pp_double[j][i] = y->ptr.pp_double[j][i]-v->ptr.pp_double[i][rbfv1_mxnx]; } } ae_frame_leave(_state); return result; } /* * Constant model, very simple */ if( modeltype==2 ) { for(i=0; i<=ny-1; i++) { for(j=0; j<=rbfv1_mxnx; j++) { v->ptr.pp_double[i][j] = (double)(0); } for(j=0; j<=n-1; j++) { v->ptr.pp_double[i][rbfv1_mxnx] = v->ptr.pp_double[i][rbfv1_mxnx]+y->ptr.pp_double[j][i]; } if( n>0 ) { v->ptr.pp_double[i][rbfv1_mxnx] = v->ptr.pp_double[i][rbfv1_mxnx]/n; } for(j=0; j<=n-1; j++) { y->ptr.pp_double[j][i] = y->ptr.pp_double[j][i]-v->ptr.pp_double[i][rbfv1_mxnx]; } } ae_frame_leave(_state); return result; } /* * Zero model */ ae_assert(modeltype==3, "BuildLinearModel: unknown model type", _state); for(i=0; i<=ny-1; i++) { for(j=0; j<=rbfv1_mxnx; j++) { v->ptr.pp_double[i][j] = (double)(0); } } ae_frame_leave(_state); return result; } static void rbfv1_buildrbfmodellsqr(/* Real */ ae_matrix* x, /* Real */ ae_matrix* y, /* Real */ ae_matrix* xc, /* Real */ ae_vector* r, ae_int_t n, ae_int_t nc, ae_int_t ny, kdtree* pointstree, kdtree* centerstree, double epsort, double epserr, ae_int_t maxits, ae_int_t* gnnz, ae_int_t* snnz, /* Real */ ae_matrix* w, ae_int_t* info, ae_int_t* iterationscount, ae_int_t* nmv, ae_state *_state) { ae_frame _frame_block; linlsqrstate state; linlsqrreport lsqrrep; sparsematrix spg; sparsematrix sps; ae_vector nearcenterscnt; ae_vector nearpointscnt; ae_vector skipnearpointscnt; ae_vector farpointscnt; ae_int_t maxnearcenterscnt; ae_int_t maxnearpointscnt; ae_int_t maxfarpointscnt; ae_int_t sumnearcenterscnt; ae_int_t sumnearpointscnt; ae_int_t sumfarpointscnt; double maxrad; ae_vector pointstags; ae_vector centerstags; ae_matrix nearpoints; ae_matrix nearcenters; ae_matrix farpoints; ae_int_t tmpi; ae_int_t pointscnt; ae_int_t centerscnt; ae_vector xcx; ae_vector tmpy; ae_vector tc; ae_vector g; ae_vector c; ae_int_t i; ae_int_t j; ae_int_t 
k; ae_int_t sind; ae_matrix a; double vv; double vx; double vy; double vz; double vr; double gnorm2; ae_vector tmp0; ae_vector tmp1; ae_vector tmp2; double fx; ae_matrix xx; ae_matrix cx; double mrad; ae_frame_make(_state, &_frame_block); *gnnz = 0; *snnz = 0; ae_matrix_clear(w); *info = 0; *iterationscount = 0; *nmv = 0; _linlsqrstate_init(&state, _state); _linlsqrreport_init(&lsqrrep, _state); _sparsematrix_init(&spg, _state); _sparsematrix_init(&sps, _state); ae_vector_init(&nearcenterscnt, 0, DT_INT, _state); ae_vector_init(&nearpointscnt, 0, DT_INT, _state); ae_vector_init(&skipnearpointscnt, 0, DT_INT, _state); ae_vector_init(&farpointscnt, 0, DT_INT, _state); ae_vector_init(&pointstags, 0, DT_INT, _state); ae_vector_init(&centerstags, 0, DT_INT, _state); ae_matrix_init(&nearpoints, 0, 0, DT_REAL, _state); ae_matrix_init(&nearcenters, 0, 0, DT_REAL, _state); ae_matrix_init(&farpoints, 0, 0, DT_REAL, _state); ae_vector_init(&xcx, 0, DT_REAL, _state); ae_vector_init(&tmpy, 0, DT_REAL, _state); ae_vector_init(&tc, 0, DT_REAL, _state); ae_vector_init(&g, 0, DT_REAL, _state); ae_vector_init(&c, 0, DT_REAL, _state); ae_matrix_init(&a, 0, 0, DT_REAL, _state); ae_vector_init(&tmp0, 0, DT_REAL, _state); ae_vector_init(&tmp1, 0, DT_REAL, _state); ae_vector_init(&tmp2, 0, DT_REAL, _state); ae_matrix_init(&xx, 0, 0, DT_REAL, _state); ae_matrix_init(&cx, 0, 0, DT_REAL, _state); /* * Handle special cases: NC=0 */ if( nc==0 ) { *info = 1; *iterationscount = 0; *nmv = 0; ae_frame_leave(_state); return; } /* * Prepare for general case, NC>0 */ ae_vector_set_length(&xcx, rbfv1_mxnx, _state); ae_vector_set_length(&pointstags, n, _state); ae_vector_set_length(&centerstags, nc, _state); *info = -1; *iterationscount = 0; *nmv = 0; /* * This block prepares quantities used to compute approximate cardinal basis functions (ACBFs): * * NearCentersCnt[] - array[NC], whose elements store number of near centers used to build ACBF * * NearPointsCnt[] - array[NC], number of near points used to build ACBF * * FarPointsCnt[] - array[NC], number of far points (ones where ACBF is nonzero) * * MaxNearCentersCnt - max(NearCentersCnt) * * MaxNearPointsCnt - max(NearPointsCnt) * * SumNearCentersCnt - sum(NearCentersCnt) * * SumNearPointsCnt - sum(NearPointsCnt) * * SumFarPointsCnt - sum(FarPointsCnt) */ ae_vector_set_length(&nearcenterscnt, nc, _state); ae_vector_set_length(&nearpointscnt, nc, _state); ae_vector_set_length(&skipnearpointscnt, nc, _state); ae_vector_set_length(&farpointscnt, nc, _state); maxnearcenterscnt = 0; maxnearpointscnt = 0; maxfarpointscnt = 0; sumnearcenterscnt = 0; sumnearpointscnt = 0; sumfarpointscnt = 0; for(i=0; i<=nc-1; i++) { for(j=0; j<=rbfv1_mxnx-1; j++) { xcx.ptr.p_double[j] = xc->ptr.pp_double[i][j]; } /* * Determine number of near centers and maximum radius of near centers */ nearcenterscnt.ptr.p_int[i] = kdtreequeryrnn(centerstree, &xcx, r->ptr.p_double[i]*rbfv1_rbfnearradius, ae_true, _state); kdtreequeryresultstags(centerstree, &centerstags, _state); maxrad = (double)(0); for(j=0; j<=nearcenterscnt.ptr.p_int[i]-1; j++) { maxrad = ae_maxreal(maxrad, ae_fabs(r->ptr.p_double[centerstags.ptr.p_int[j]], _state), _state); } /* * Determine number of near points (ones which are used to build ACBF) * and skipped points (the most near points which are NOT used to build ACBF * and are NOT included in the near points count) */ skipnearpointscnt.ptr.p_int[i] = kdtreequeryrnn(pointstree, &xcx, 0.1*r->ptr.p_double[i], ae_true, _state); nearpointscnt.ptr.p_int[i] = kdtreequeryrnn(pointstree, &xcx, 
(r->ptr.p_double[i]+maxrad)*rbfv1_rbfnearradius, ae_true, _state)-skipnearpointscnt.ptr.p_int[i]; ae_assert(nearpointscnt.ptr.p_int[i]>=0, "BuildRBFModelLSQR: internal error", _state); /* * Determine number of far points */ farpointscnt.ptr.p_int[i] = kdtreequeryrnn(pointstree, &xcx, ae_maxreal(r->ptr.p_double[i]*rbfv1_rbfnearradius+maxrad*rbfv1_rbffarradius, r->ptr.p_double[i]*rbfv1_rbffarradius, _state), ae_true, _state); /* * calculate sum and max, make some basic checks */ ae_assert(nearcenterscnt.ptr.p_int[i]>0, "BuildRBFModelLSQR: internal error", _state); maxnearcenterscnt = ae_maxint(maxnearcenterscnt, nearcenterscnt.ptr.p_int[i], _state); maxnearpointscnt = ae_maxint(maxnearpointscnt, nearpointscnt.ptr.p_int[i], _state); maxfarpointscnt = ae_maxint(maxfarpointscnt, farpointscnt.ptr.p_int[i], _state); sumnearcenterscnt = sumnearcenterscnt+nearcenterscnt.ptr.p_int[i]; sumnearpointscnt = sumnearpointscnt+nearpointscnt.ptr.p_int[i]; sumfarpointscnt = sumfarpointscnt+farpointscnt.ptr.p_int[i]; } *snnz = sumnearcenterscnt; *gnnz = sumfarpointscnt; ae_assert(maxnearcenterscnt>0, "BuildRBFModelLSQR: internal error", _state); /* * Allocate temporaries. * * NOTE: we want to avoid allocation of zero-size arrays, so we * use max(desired_size,1) instead of desired_size when performing * memory allocation. */ ae_matrix_set_length(&a, maxnearpointscnt+maxnearcenterscnt, maxnearcenterscnt, _state); ae_vector_set_length(&tmpy, maxnearpointscnt+maxnearcenterscnt, _state); ae_vector_set_length(&g, maxnearcenterscnt, _state); ae_vector_set_length(&c, maxnearcenterscnt, _state); ae_matrix_set_length(&nearcenters, maxnearcenterscnt, rbfv1_mxnx, _state); ae_matrix_set_length(&nearpoints, ae_maxint(maxnearpointscnt, 1, _state), rbfv1_mxnx, _state); ae_matrix_set_length(&farpoints, ae_maxint(maxfarpointscnt, 1, _state), rbfv1_mxnx, _state); /* * fill matrix SpG */ sparsecreate(n, nc, *gnnz, &spg, _state); sparsecreate(nc, nc, *snnz, &sps, _state); for(i=0; i<=nc-1; i++) { centerscnt = nearcenterscnt.ptr.p_int[i]; /* * main center */ for(j=0; j<=rbfv1_mxnx-1; j++) { xcx.ptr.p_double[j] = xc->ptr.pp_double[i][j]; } /* * center's tree */ tmpi = kdtreequeryknn(centerstree, &xcx, centerscnt, ae_true, _state); ae_assert(tmpi==centerscnt, "BuildRBFModelLSQR: internal error", _state); kdtreequeryresultsx(centerstree, &cx, _state); kdtreequeryresultstags(centerstree, &centerstags, _state); /* * point's tree */ mrad = (double)(0); for(j=0; j<=centerscnt-1; j++) { mrad = ae_maxreal(mrad, r->ptr.p_double[centerstags.ptr.p_int[j]], _state); } /* * we need to be sure that 'CTree' contains * at least one side center */ sparseset(&sps, i, i, (double)(1), _state); c.ptr.p_double[0] = 1.0; for(j=1; j<=centerscnt-1; j++) { c.ptr.p_double[j] = 0.0; } if( centerscnt>1&&nearpointscnt.ptr.p_int[i]>0 ) { /* * first KDTree request for points */ pointscnt = nearpointscnt.ptr.p_int[i]; tmpi = kdtreequeryknn(pointstree, &xcx, skipnearpointscnt.ptr.p_int[i]+nearpointscnt.ptr.p_int[i], ae_true, _state); ae_assert(tmpi==skipnearpointscnt.ptr.p_int[i]+nearpointscnt.ptr.p_int[i], "BuildRBFModelLSQR: internal error", _state); kdtreequeryresultsx(pointstree, &xx, _state); sind = skipnearpointscnt.ptr.p_int[i]; for(j=0; j<=pointscnt-1; j++) { vx = xx.ptr.pp_double[sind+j][0]; vy = xx.ptr.pp_double[sind+j][1]; vz = xx.ptr.pp_double[sind+j][2]; for(k=0; k<=centerscnt-1; k++) { vr = 0.0; vv = vx-cx.ptr.pp_double[k][0]; vr = vr+vv*vv; vv = vy-cx.ptr.pp_double[k][1]; vr = vr+vv*vv; vv = vz-cx.ptr.pp_double[k][2]; vr = vr+vv*vv; vv = 
r->ptr.p_double[centerstags.ptr.p_int[k]]; a.ptr.pp_double[j][k] = ae_exp(-vr/(vv*vv), _state); } } for(j=0; j<=centerscnt-1; j++) { g.ptr.p_double[j] = ae_exp(-(ae_sqr(xcx.ptr.p_double[0]-cx.ptr.pp_double[j][0], _state)+ae_sqr(xcx.ptr.p_double[1]-cx.ptr.pp_double[j][1], _state)+ae_sqr(xcx.ptr.p_double[2]-cx.ptr.pp_double[j][2], _state))/ae_sqr(r->ptr.p_double[centerstags.ptr.p_int[j]], _state), _state); } /* * calculate the problem */ gnorm2 = ae_v_dotproduct(&g.ptr.p_double[0], 1, &g.ptr.p_double[0], 1, ae_v_len(0,centerscnt-1)); for(j=0; j<=pointscnt-1; j++) { vv = ae_v_dotproduct(&a.ptr.pp_double[j][0], 1, &g.ptr.p_double[0], 1, ae_v_len(0,centerscnt-1)); vv = vv/gnorm2; tmpy.ptr.p_double[j] = -vv; ae_v_subd(&a.ptr.pp_double[j][0], 1, &g.ptr.p_double[0], 1, ae_v_len(0,centerscnt-1), vv); } for(j=pointscnt; j<=pointscnt+centerscnt-1; j++) { for(k=0; k<=centerscnt-1; k++) { a.ptr.pp_double[j][k] = 0.0; } a.ptr.pp_double[j][j-pointscnt] = 1.0E-6; tmpy.ptr.p_double[j] = 0.0; } fblssolvels(&a, &tmpy, pointscnt+centerscnt, centerscnt, &tmp0, &tmp1, &tmp2, _state); ae_v_move(&c.ptr.p_double[0], 1, &tmpy.ptr.p_double[0], 1, ae_v_len(0,centerscnt-1)); vv = ae_v_dotproduct(&g.ptr.p_double[0], 1, &c.ptr.p_double[0], 1, ae_v_len(0,centerscnt-1)); vv = vv/gnorm2; ae_v_subd(&c.ptr.p_double[0], 1, &g.ptr.p_double[0], 1, ae_v_len(0,centerscnt-1), vv); vv = 1/gnorm2; ae_v_addd(&c.ptr.p_double[0], 1, &g.ptr.p_double[0], 1, ae_v_len(0,centerscnt-1), vv); for(j=0; j<=centerscnt-1; j++) { sparseset(&sps, i, centerstags.ptr.p_int[j], c.ptr.p_double[j], _state); } } /* * second KDTree request for points */ pointscnt = farpointscnt.ptr.p_int[i]; tmpi = kdtreequeryknn(pointstree, &xcx, pointscnt, ae_true, _state); ae_assert(tmpi==pointscnt, "BuildRBFModelLSQR: internal error", _state); kdtreequeryresultsx(pointstree, &xx, _state); kdtreequeryresultstags(pointstree, &pointstags, _state); /* *fill SpG matrix */ for(j=0; j<=pointscnt-1; j++) { fx = (double)(0); vx = xx.ptr.pp_double[j][0]; vy = xx.ptr.pp_double[j][1]; vz = xx.ptr.pp_double[j][2]; for(k=0; k<=centerscnt-1; k++) { vr = 0.0; vv = vx-cx.ptr.pp_double[k][0]; vr = vr+vv*vv; vv = vy-cx.ptr.pp_double[k][1]; vr = vr+vv*vv; vv = vz-cx.ptr.pp_double[k][2]; vr = vr+vv*vv; vv = r->ptr.p_double[centerstags.ptr.p_int[k]]; vv = vv*vv; fx = fx+c.ptr.p_double[k]*ae_exp(-vr/vv, _state); } sparseset(&spg, pointstags.ptr.p_int[j], i, fx, _state); } } sparseconverttocrs(&spg, _state); sparseconverttocrs(&sps, _state); /* * solve by LSQR method */ ae_vector_set_length(&tmpy, n, _state); ae_vector_set_length(&tc, nc, _state); ae_matrix_set_length(w, nc, ny, _state); linlsqrcreate(n, nc, &state, _state); linlsqrsetcond(&state, epsort, epserr, maxits, _state); for(i=0; i<=ny-1; i++) { for(j=0; j<=n-1; j++) { tmpy.ptr.p_double[j] = y->ptr.pp_double[j][i]; } linlsqrsolvesparse(&state, &spg, &tmpy, _state); linlsqrresults(&state, &c, &lsqrrep, _state); if( lsqrrep.terminationtype<=0 ) { *info = -4; ae_frame_leave(_state); return; } sparsemtv(&sps, &c, &tc, _state); for(j=0; j<=nc-1; j++) { w->ptr.pp_double[j][i] = tc.ptr.p_double[j]; } *iterationscount = *iterationscount+lsqrrep.iterationscount; *nmv = *nmv+lsqrrep.nmv; } *info = 1; ae_frame_leave(_state); } static void rbfv1_buildrbfmlayersmodellsqr(/* Real */ ae_matrix* x, /* Real */ ae_matrix* y, /* Real */ ae_matrix* xc, double rval, /* Real */ ae_vector* r, ae_int_t n, ae_int_t* nc, ae_int_t ny, ae_int_t nlayers, kdtree* centerstree, double epsort, double epserr, ae_int_t maxits, double lambdav, ae_int_t* annz, /* Real 
*/ ae_matrix* w, ae_int_t* info, ae_int_t* iterationscount, ae_int_t* nmv, ae_state *_state) { ae_frame _frame_block; linlsqrstate state; linlsqrreport lsqrrep; sparsematrix spa; double anorm; ae_vector omega; ae_vector xx; ae_vector tmpy; ae_matrix cx; double yval; ae_int_t nec; ae_vector centerstags; ae_int_t layer; ae_int_t i; ae_int_t j; ae_int_t k; double v; double rmaxbefore; double rmaxafter; ae_frame_make(_state, &_frame_block); ae_matrix_clear(xc); ae_vector_clear(r); *nc = 0; *annz = 0; ae_matrix_clear(w); *info = 0; *iterationscount = 0; *nmv = 0; _linlsqrstate_init(&state, _state); _linlsqrreport_init(&lsqrrep, _state); _sparsematrix_init(&spa, _state); ae_vector_init(&omega, 0, DT_REAL, _state); ae_vector_init(&xx, 0, DT_REAL, _state); ae_vector_init(&tmpy, 0, DT_REAL, _state); ae_matrix_init(&cx, 0, 0, DT_REAL, _state); ae_vector_init(&centerstags, 0, DT_INT, _state); ae_assert(nlayers>=0, "BuildRBFMLayersModelLSQR: invalid argument(NLayers<0)", _state); ae_assert(n>=0, "BuildRBFMLayersModelLSQR: invalid argument(N<0)", _state); ae_assert(rbfv1_mxnx>0&&rbfv1_mxnx<=3, "BuildRBFMLayersModelLSQR: internal error(invalid global const MxNX: either MxNX<=0 or MxNX>3)", _state); *annz = 0; if( n==0||nlayers==0 ) { *info = 1; *iterationscount = 0; *nmv = 0; ae_frame_leave(_state); return; } *nc = n*nlayers; ae_vector_set_length(&xx, rbfv1_mxnx, _state); ae_vector_set_length(&centerstags, n, _state); ae_matrix_set_length(xc, *nc, rbfv1_mxnx, _state); ae_vector_set_length(r, *nc, _state); for(i=0; i<=*nc-1; i++) { for(j=0; j<=rbfv1_mxnx-1; j++) { xc->ptr.pp_double[i][j] = x->ptr.pp_double[i%n][j]; } } for(i=0; i<=*nc-1; i++) { r->ptr.p_double[i] = rval/ae_pow((double)(2), (double)(i/n), _state); } for(i=0; i<=n-1; i++) { centerstags.ptr.p_int[i] = i; } kdtreebuildtagged(xc, &centerstags, n, rbfv1_mxnx, 0, 2, centerstree, _state); ae_vector_set_length(&omega, n, _state); ae_vector_set_length(&tmpy, n, _state); ae_matrix_set_length(w, *nc, ny, _state); *info = -1; *iterationscount = 0; *nmv = 0; linlsqrcreate(n, n, &state, _state); linlsqrsetcond(&state, epsort, epserr, maxits, _state); linlsqrsetlambdai(&state, 1.0E-6, _state); /* * calculate number of non-zero elements for sparse matrix */ for(i=0; i<=n-1; i++) { for(j=0; j<=rbfv1_mxnx-1; j++) { xx.ptr.p_double[j] = x->ptr.pp_double[i][j]; } *annz = *annz+kdtreequeryrnn(centerstree, &xx, r->ptr.p_double[0]*rbfv1_rbfmlradius, ae_true, _state); } for(layer=0; layer<=nlayers-1; layer++) { /* * Fill sparse matrix, calculate norm(A) */ anorm = 0.0; sparsecreate(n, n, *annz, &spa, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=rbfv1_mxnx-1; j++) { xx.ptr.p_double[j] = x->ptr.pp_double[i][j]; } nec = kdtreequeryrnn(centerstree, &xx, r->ptr.p_double[layer*n]*rbfv1_rbfmlradius, ae_true, _state); kdtreequeryresultsx(centerstree, &cx, _state); kdtreequeryresultstags(centerstree, &centerstags, _state); for(j=0; j<=nec-1; j++) { v = ae_exp(-(ae_sqr(xx.ptr.p_double[0]-cx.ptr.pp_double[j][0], _state)+ae_sqr(xx.ptr.p_double[1]-cx.ptr.pp_double[j][1], _state)+ae_sqr(xx.ptr.p_double[2]-cx.ptr.pp_double[j][2], _state))/ae_sqr(r->ptr.p_double[layer*n+centerstags.ptr.p_int[j]], _state), _state); sparseset(&spa, i, centerstags.ptr.p_int[j], v, _state); anorm = anorm+ae_sqr(v, _state); } } anorm = ae_sqrt(anorm, _state); sparseconverttocrs(&spa, _state); /* * Calculate maximum residual before adding new layer. * This value is not used by algorithm, the only purpose is to make debugging easier. 
*/ rmaxbefore = 0.0; for(j=0; j<=n-1; j++) { for(i=0; i<=ny-1; i++) { rmaxbefore = ae_maxreal(rmaxbefore, ae_fabs(y->ptr.pp_double[j][i], _state), _state); } } /* * Process NY dimensions of the target function */ for(i=0; i<=ny-1; i++) { for(j=0; j<=n-1; j++) { tmpy.ptr.p_double[j] = y->ptr.pp_double[j][i]; } /* * calculate Omega for current layer */ linlsqrsetlambdai(&state, lambdav*anorm/n, _state); linlsqrsolvesparse(&state, &spa, &tmpy, _state); linlsqrresults(&state, &omega, &lsqrrep, _state); if( lsqrrep.terminationtype<=0 ) { *info = -4; ae_frame_leave(_state); return; } /* * calculate error for current layer */ for(j=0; j<=n-1; j++) { yval = (double)(0); for(k=0; k<=rbfv1_mxnx-1; k++) { xx.ptr.p_double[k] = x->ptr.pp_double[j][k]; } nec = kdtreequeryrnn(centerstree, &xx, r->ptr.p_double[layer*n]*rbfv1_rbffarradius, ae_true, _state); kdtreequeryresultsx(centerstree, &cx, _state); kdtreequeryresultstags(centerstree, &centerstags, _state); for(k=0; k<=nec-1; k++) { yval = yval+omega.ptr.p_double[centerstags.ptr.p_int[k]]*ae_exp(-(ae_sqr(xx.ptr.p_double[0]-cx.ptr.pp_double[k][0], _state)+ae_sqr(xx.ptr.p_double[1]-cx.ptr.pp_double[k][1], _state)+ae_sqr(xx.ptr.p_double[2]-cx.ptr.pp_double[k][2], _state))/ae_sqr(r->ptr.p_double[layer*n+centerstags.ptr.p_int[k]], _state), _state); } y->ptr.pp_double[j][i] = y->ptr.pp_double[j][i]-yval; } /* * write Omega in out parameter W */ for(j=0; j<=n-1; j++) { w->ptr.pp_double[layer*n+j][i] = omega.ptr.p_double[j]; } *iterationscount = *iterationscount+lsqrrep.iterationscount; *nmv = *nmv+lsqrrep.nmv; } /* * Calculate maximum residual after adding new layer. * This value is not used by algorithm, the only purpose is to make debugging easier. */ rmaxafter = 0.0; for(j=0; j<=n-1; j++) { for(i=0; i<=ny-1; i++) { rmaxafter = ae_maxreal(rmaxafter, ae_fabs(y->ptr.pp_double[j][i], _state), _state); } } } *info = 1; ae_frame_leave(_state); } void _rbfv1calcbuffer_init(void* _p, ae_state *_state) { rbfv1calcbuffer *p = (rbfv1calcbuffer*)_p; ae_touch_ptr((void*)p); ae_vector_init(&p->calcbufxcx, 0, DT_REAL, _state); ae_matrix_init(&p->calcbufx, 0, 0, DT_REAL, _state); ae_vector_init(&p->calcbuftags, 0, DT_INT, _state); _kdtreerequestbuffer_init(&p->requestbuffer, _state); } void _rbfv1calcbuffer_init_copy(void* _dst, void* _src, ae_state *_state) { rbfv1calcbuffer *dst = (rbfv1calcbuffer*)_dst; rbfv1calcbuffer *src = (rbfv1calcbuffer*)_src; ae_vector_init_copy(&dst->calcbufxcx, &src->calcbufxcx, _state); ae_matrix_init_copy(&dst->calcbufx, &src->calcbufx, _state); ae_vector_init_copy(&dst->calcbuftags, &src->calcbuftags, _state); _kdtreerequestbuffer_init_copy(&dst->requestbuffer, &src->requestbuffer, _state); } void _rbfv1calcbuffer_clear(void* _p) { rbfv1calcbuffer *p = (rbfv1calcbuffer*)_p; ae_touch_ptr((void*)p); ae_vector_clear(&p->calcbufxcx); ae_matrix_clear(&p->calcbufx); ae_vector_clear(&p->calcbuftags); _kdtreerequestbuffer_clear(&p->requestbuffer); } void _rbfv1calcbuffer_destroy(void* _p) { rbfv1calcbuffer *p = (rbfv1calcbuffer*)_p; ae_touch_ptr((void*)p); ae_vector_destroy(&p->calcbufxcx); ae_matrix_destroy(&p->calcbufx); ae_vector_destroy(&p->calcbuftags); _kdtreerequestbuffer_destroy(&p->requestbuffer); } void _rbfv1model_init(void* _p, ae_state *_state) { rbfv1model *p = (rbfv1model*)_p; ae_touch_ptr((void*)p); _kdtree_init(&p->tree, _state); ae_matrix_init(&p->xc, 0, 0, DT_REAL, _state); ae_matrix_init(&p->wr, 0, 0, DT_REAL, _state); ae_matrix_init(&p->v, 0, 0, DT_REAL, _state); ae_vector_init(&p->calcbufxcx, 0, DT_REAL, _state); 
ae_matrix_init(&p->calcbufx, 0, 0, DT_REAL, _state); ae_vector_init(&p->calcbuftags, 0, DT_INT, _state); } void _rbfv1model_init_copy(void* _dst, void* _src, ae_state *_state) { rbfv1model *dst = (rbfv1model*)_dst; rbfv1model *src = (rbfv1model*)_src; dst->ny = src->ny; dst->nx = src->nx; dst->nc = src->nc; dst->nl = src->nl; _kdtree_init_copy(&dst->tree, &src->tree, _state); ae_matrix_init_copy(&dst->xc, &src->xc, _state); ae_matrix_init_copy(&dst->wr, &src->wr, _state); dst->rmax = src->rmax; ae_matrix_init_copy(&dst->v, &src->v, _state); ae_vector_init_copy(&dst->calcbufxcx, &src->calcbufxcx, _state); ae_matrix_init_copy(&dst->calcbufx, &src->calcbufx, _state); ae_vector_init_copy(&dst->calcbuftags, &src->calcbuftags, _state); } void _rbfv1model_clear(void* _p) { rbfv1model *p = (rbfv1model*)_p; ae_touch_ptr((void*)p); _kdtree_clear(&p->tree); ae_matrix_clear(&p->xc); ae_matrix_clear(&p->wr); ae_matrix_clear(&p->v); ae_vector_clear(&p->calcbufxcx); ae_matrix_clear(&p->calcbufx); ae_vector_clear(&p->calcbuftags); } void _rbfv1model_destroy(void* _p) { rbfv1model *p = (rbfv1model*)_p; ae_touch_ptr((void*)p); _kdtree_destroy(&p->tree); ae_matrix_destroy(&p->xc); ae_matrix_destroy(&p->wr); ae_matrix_destroy(&p->v); ae_vector_destroy(&p->calcbufxcx); ae_matrix_destroy(&p->calcbufx); ae_vector_destroy(&p->calcbuftags); } void _gridcalc3v1buf_init(void* _p, ae_state *_state) { gridcalc3v1buf *p = (gridcalc3v1buf*)_p; ae_touch_ptr((void*)p); ae_vector_init(&p->tx, 0, DT_REAL, _state); ae_vector_init(&p->cx, 0, DT_REAL, _state); ae_vector_init(&p->ty, 0, DT_REAL, _state); ae_vector_init(&p->flag0, 0, DT_BOOL, _state); ae_vector_init(&p->flag1, 0, DT_BOOL, _state); ae_vector_init(&p->flag2, 0, DT_BOOL, _state); ae_vector_init(&p->flag12, 0, DT_BOOL, _state); ae_vector_init(&p->expbuf0, 0, DT_REAL, _state); ae_vector_init(&p->expbuf1, 0, DT_REAL, _state); ae_vector_init(&p->expbuf2, 0, DT_REAL, _state); _kdtreerequestbuffer_init(&p->requestbuf, _state); ae_matrix_init(&p->calcbufx, 0, 0, DT_REAL, _state); ae_vector_init(&p->calcbuftags, 0, DT_INT, _state); } void _gridcalc3v1buf_init_copy(void* _dst, void* _src, ae_state *_state) { gridcalc3v1buf *dst = (gridcalc3v1buf*)_dst; gridcalc3v1buf *src = (gridcalc3v1buf*)_src; ae_vector_init_copy(&dst->tx, &src->tx, _state); ae_vector_init_copy(&dst->cx, &src->cx, _state); ae_vector_init_copy(&dst->ty, &src->ty, _state); ae_vector_init_copy(&dst->flag0, &src->flag0, _state); ae_vector_init_copy(&dst->flag1, &src->flag1, _state); ae_vector_init_copy(&dst->flag2, &src->flag2, _state); ae_vector_init_copy(&dst->flag12, &src->flag12, _state); ae_vector_init_copy(&dst->expbuf0, &src->expbuf0, _state); ae_vector_init_copy(&dst->expbuf1, &src->expbuf1, _state); ae_vector_init_copy(&dst->expbuf2, &src->expbuf2, _state); _kdtreerequestbuffer_init_copy(&dst->requestbuf, &src->requestbuf, _state); ae_matrix_init_copy(&dst->calcbufx, &src->calcbufx, _state); ae_vector_init_copy(&dst->calcbuftags, &src->calcbuftags, _state); } void _gridcalc3v1buf_clear(void* _p) { gridcalc3v1buf *p = (gridcalc3v1buf*)_p; ae_touch_ptr((void*)p); ae_vector_clear(&p->tx); ae_vector_clear(&p->cx); ae_vector_clear(&p->ty); ae_vector_clear(&p->flag0); ae_vector_clear(&p->flag1); ae_vector_clear(&p->flag2); ae_vector_clear(&p->flag12); ae_vector_clear(&p->expbuf0); ae_vector_clear(&p->expbuf1); ae_vector_clear(&p->expbuf2); _kdtreerequestbuffer_clear(&p->requestbuf); ae_matrix_clear(&p->calcbufx); ae_vector_clear(&p->calcbuftags); } void _gridcalc3v1buf_destroy(void* _p) { gridcalc3v1buf 
*p = (gridcalc3v1buf*)_p; ae_touch_ptr((void*)p); ae_vector_destroy(&p->tx); ae_vector_destroy(&p->cx); ae_vector_destroy(&p->ty); ae_vector_destroy(&p->flag0); ae_vector_destroy(&p->flag1); ae_vector_destroy(&p->flag2); ae_vector_destroy(&p->flag12); ae_vector_destroy(&p->expbuf0); ae_vector_destroy(&p->expbuf1); ae_vector_destroy(&p->expbuf2); _kdtreerequestbuffer_destroy(&p->requestbuf); ae_matrix_destroy(&p->calcbufx); ae_vector_destroy(&p->calcbuftags); } void _rbfv1report_init(void* _p, ae_state *_state) { rbfv1report *p = (rbfv1report*)_p; ae_touch_ptr((void*)p); } void _rbfv1report_init_copy(void* _dst, void* _src, ae_state *_state) { rbfv1report *dst = (rbfv1report*)_dst; rbfv1report *src = (rbfv1report*)_src; dst->arows = src->arows; dst->acols = src->acols; dst->annz = src->annz; dst->iterationscount = src->iterationscount; dst->nmv = src->nmv; dst->terminationtype = src->terminationtype; } void _rbfv1report_clear(void* _p) { rbfv1report *p = (rbfv1report*)_p; ae_touch_ptr((void*)p); } void _rbfv1report_destroy(void* _p) { rbfv1report *p = (rbfv1report*)_p; ae_touch_ptr((void*)p); } /************************************************************************* This function creates RBF model for a scalar (NY=1) or vector (NY>1) function in a NX-dimensional space (NX>=1). Newly created model is empty. It can be used for interpolation right after creation, but it just returns zeros. You have to add points to the model, tune interpolation settings, and then call model construction function rbfbuildmodel() which will update model according to your specification. USAGE: 1. User creates model with rbfcreate() 2. User adds dataset with rbfsetpoints() (points do NOT have to be on a regular grid) or rbfsetpointsandscales(). 3. (OPTIONAL) User chooses polynomial term by calling: * rbflinterm() to set linear term * rbfconstterm() to set constant term * rbfzeroterm() to set zero term By default, linear term is used. 4. User tweaks algorithm properties with rbfsetalgohierarchical() method (or chooses one of the legacy algorithms - QNN (rbfsetalgoqnn) or ML (rbfsetalgomultilayer)). 5. User calls rbfbuildmodel() function which rebuilds model according to the specification 6. User may call rbfcalc() to calculate model value at the specified point, rbfgridcalc() to calculate model values at the points of the regular grid. User may extract model coefficients with rbfunpack() call. IMPORTANT: we recommend you to use latest model construction algorithm - hierarchical RBFs, which is activated by rbfsetalgohierarchical() function. This algorithm is the fastest one, and most memory- efficient. However, it is incompatible with older versions of ALGLIB (pre-3.11). So, if you serialize hierarchical model, you will be unable to load it in pre-3.11 ALGLIB. Other model types (QNN and RBF-ML) are still backward-compatible. INPUT PARAMETERS: NX - dimension of the space, NX>=1 NY - function dimension, NY>=1 OUTPUT PARAMETERS: S - RBF model (initially equals to zero) NOTE 1: memory requirements. RBF models require amount of memory which is proportional to the number of data points. Some additional memory is allocated during model construction, but most of this memory is freed after model coefficients are calculated. Amount of this additional memory depends on model construction algorithm being used. NOTE 2: prior to ALGLIB version 3.11, RBF models supported only NX=2 or NX=3. Any attempt to create single-dimensional or more than 3-dimensional RBF model resulted in exception. 
ALGLIB 3.11 supports any NX>0, but models created with NX!=2 and NX!=3 are incompatible with (a) older versions of ALGLIB, (b) old model construction algorithms (QNN or RBF-ML). So, if you create a model with NX=2 or NX=3, then, depending on specific model construction algorithm being chosen, you will (QNN and RBF-ML) or will not (HierarchicalRBF) get backward compatibility with older versions of ALGLIB. You have a choice here. However, if you create a model with NX neither 2 nor 3, you have no backward compatibility from the start, and you are forced to use hierarchical RBFs and ALGLIB 3.11 or later. -- ALGLIB -- Copyright 13.12.2011, 20.06.2016 by Bochkanov Sergey *************************************************************************/ void rbfcreate(ae_int_t nx, ae_int_t ny, rbfmodel* s, ae_state *_state) { _rbfmodel_clear(s); ae_assert(nx>=1, "RBFCreate: NX<1", _state); ae_assert(ny>=1, "RBFCreate: NY<1", _state); s->nx = nx; s->ny = ny; rbf_rbfpreparenonserializablefields(s, _state); /* * Select default model version according to NX. * * The idea is that when we call this function with NX=2 or NX=3, backward * compatible dummy (zero) V1 model is created, so serialization produces * model which are compatible with pre-3.11 ALGLIB. */ rbf_initializev1(nx, ny, &s->model1, _state); rbf_initializev2(nx, ny, &s->model2, _state); if( nx==2||nx==3 ) { s->modelversion = 1; } else { s->modelversion = 2; } } /************************************************************************* This function creates buffer structure which can be used to perform parallel RBF model evaluations (with one RBF model instance being used from multiple threads, as long as different threads use different instances of buffer). This buffer object can be used with rbftscalcbuf() function (here "ts" stands for "thread-safe", "buf" is a suffix which denotes function which reuses previously allocated output space). How to use it: * create RBF model structure with rbfcreate() * load data, tune parameters * call rbfbuildmodel() * call rbfcreatecalcbuffer(), once per thread working with RBF model (you should call this function only AFTER call to rbfbuildmodel(), see below for more information) * call rbftscalcbuf() from different threads, with each thread working with its own copy of buffer object. INPUT PARAMETERS S - RBF model OUTPUT PARAMETERS Buf - external buffer. IMPORTANT: buffer object should be used only with RBF model object which was used to initialize buffer. Any attempt to use buffer with different object is dangerous - you may get memory violation error because sizes of internal arrays do not fit to dimensions of RBF structure. IMPORTANT: you should call this function only for model which was built with rbfbuildmodel() function, after successful invocation of rbfbuildmodel(). Sizes of some internal structures are determined only after model is built, so buffer object created before model construction stage will be useless (and any attempt to use it will result in exception). 
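USAGE SKETCH: the fragment below is a minimal sketch of the workflow described above, written against the standard C++ interface (interpolation.h). It is illustrative only; verify the exact signatures against your ALGLIB version.

    #include "interpolation.h"
    using namespace alglib;

    rbfmodel model;
    rbfreport rep;
    rbfcreate(2, 1, model);                       // NX=2, NY=1
    real_2d_array xy = "[[0,0,1],[1,0,2],[0,1,3],[1,1,4]]";
    rbfsetpoints(model, xy);
    rbfbuildmodel(model, rep);                    // buffers are valid only AFTER this call

    rbfcalcbuffer buf;                            // one buffer per worker thread
    rbfcreatecalcbuffer(model, buf);

    // in a worker thread: reuse 'buf' for thread-safe evaluation
    real_1d_array x = "[0.5,0.5]";
    real_1d_array y;
    rbftscalcbuf(model, buf, x, y);               // y[0] holds the model value at x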
-- ALGLIB -- Copyright 02.04.2016 by Sergey Bochkanov *************************************************************************/ void rbfcreatecalcbuffer(rbfmodel* s, rbfcalcbuffer* buf, ae_state *_state) { _rbfcalcbuffer_clear(buf); if( s->modelversion==1 ) { buf->modelversion = 1; rbfv1createcalcbuffer(&s->model1, &buf->bufv1, _state); return; } if( s->modelversion==2 ) { buf->modelversion = 2; rbfv2createcalcbuffer(&s->model2, &buf->bufv2, _state); return; } ae_assert(ae_false, "RBFCreateCalcBuffer: integrity check failed", _state); } /************************************************************************* This function adds dataset. This function overrides results of the previous calls, i.e. multiple calls of this function will result in only the last set being added. IMPORTANT: ALGLIB version 3.11 and later allows you to specify a set of per-dimension scales. Interpolation radii are multiplied by the scale vector. It may be useful if you have mixed spatio-temporal data (say, a set of 3D slices recorded at different times). You should call rbfsetpointsandscales() function to use this feature. INPUT PARAMETERS: S - RBF model, initialized by rbfcreate() call. XY - points, array[N,NX+NY]. One row corresponds to one point in the dataset. First NX elements are coordinates, next NY elements are function values. Array may be larger than specified, in this case only leading [N,NX+NY] elements will be used. N - number of points in the dataset After you've added dataset and (optionally) tuned algorithm settings you should call rbfbuildmodel() in order to build a model for you. NOTE: dataset added by this function is not saved during model serialization. MODEL ITSELF is serialized, but data used to build it are not. So, if you 1) add dataset to empty RBF model, 2) serialize and unserialize it, then you will get an empty RBF model with no dataset being attached. From the other side, if you call rbfbuildmodel() between (1) and (2), then after (2) you will get your fully constructed RBF model - but again with no dataset attached, so subsequent calls to rbfbuildmodel() will produce empty model. -- ALGLIB -- Copyright 13.12.2011 by Bochkanov Sergey *************************************************************************/ void rbfsetpoints(rbfmodel* s, /* Real */ ae_matrix* xy, ae_int_t n, ae_state *_state) { ae_int_t i; ae_int_t j; ae_assert(n>0, "RBFSetPoints: N<0", _state); ae_assert(xy->rows>=n, "RBFSetPoints: Rows(XY)<N", _state); ae_assert(xy->cols>=s->nx+s->ny, "RBFSetPoints: Cols(XY)<NX+NY", _state); s->n = n; s->hasscale = ae_false; ae_matrix_set_length(&s->x, s->n, s->nx, _state); ae_matrix_set_length(&s->y, s->n, s->ny, _state); for(i=0; i<=s->n-1; i++) { for(j=0; j<=s->nx-1; j++) { s->x.ptr.pp_double[i][j] = xy->ptr.pp_double[i][j]; } for(j=0; j<=s->ny-1; j++) { s->y.ptr.pp_double[i][j] = xy->ptr.pp_double[i][j+s->nx]; } } } /************************************************************************* This function adds dataset and a vector of per-dimension scales. It may be useful if you have mixed spatio-temporal data - say, a set of 3D slices recorded at different times. Such data typically require different RBF radii for spatial and temporal dimensions. ALGLIB solves this problem by specifying single RBF radius, which is (optionally) multiplied by the scale vector. This function overrides results of the previous calls, i.e. multiple calls of this function will result in only the last set being added. IMPORTANT: only HierarchicalRBF algorithm can work with scaled points. 
So, using this function results in RBF models which can be used in ALGLIB 3.11 or later. Previous versions of the library will be unable to unserialize models produced by HierarchicalRBF algo. Any attempt to use this function with RBF-ML or QNN algorithms will result in -3 error code being returned (incorrect algorithm). INPUT PARAMETERS: R - RBF model, initialized by rbfcreate() call. XY - points, array[N,NX+NY]. One row corresponds to one point in the dataset. First NX elements are coordinates, next NY elements are function values. Array may be larger than specified, in this case only leading [N,NX+NY] elements will be used. N - number of points in the dataset S - array[NX], scale vector, S[i]>0. After you've added dataset and (optionally) tuned algorithm settings you should call rbfbuildmodel() in order to build a model for you. NOTE: dataset added by this function is not saved during model serialization. MODEL ITSELF is serialized, but data used to build it are not. So, if you 1) add dataset to empty RBF model, 2) serialize and unserialize it, then you will get an empty RBF model with no dataset being attached. From the other side, if you call rbfbuildmodel() between (1) and (2), then after (2) you will get your fully constructed RBF model - but again with no dataset attached, so subsequent calls to rbfbuildmodel() will produce empty model. -- ALGLIB -- Copyright 20.06.2016 by Bochkanov Sergey *************************************************************************/ void rbfsetpointsandscales(rbfmodel* r, /* Real */ ae_matrix* xy, ae_int_t n, /* Real */ ae_vector* s, ae_state *_state) { ae_int_t i; ae_int_t j; ae_assert(n>0, "RBFSetPointsAndScales: N<0", _state); ae_assert(xy->rows>=n, "RBFSetPointsAndScales: Rows(XY)<N", _state); ae_assert(xy->cols>=r->nx+r->ny, "RBFSetPointsAndScales: Cols(XY)<NX+NY", _state); ae_assert(s->cnt>=r->nx, "RBFSetPointsAndScales: Length(S)<NX", _state); r->n = n; r->hasscale = ae_true; ae_matrix_set_length(&r->x, r->n, r->nx, _state); ae_matrix_set_length(&r->y, r->n, r->ny, _state); for(i=0; i<=r->n-1; i++) { for(j=0; j<=r->nx-1; j++) { r->x.ptr.pp_double[i][j] = xy->ptr.pp_double[i][j]; } for(j=0; j<=r->ny-1; j++) { r->y.ptr.pp_double[i][j] = xy->ptr.pp_double[i][j+r->nx]; } } ae_vector_set_length(&r->s, r->nx, _state); for(i=0; i<=r->nx-1; i++) { ae_assert(ae_isfinite(s->ptr.p_double[i], _state), "RBFSetPointsAndScales: S[i] is not finite number", _state); ae_assert(ae_fp_greater(s->ptr.p_double[i],(double)(0)), "RBFSetPointsAndScales: S[i]<=0", _state); r->s.ptr.p_double[i] = s->ptr.p_double[i]; } } /************************************************************************* DEPRECATED:since version 3.11 ALGLIB includes new RBF model construction algorithm, Hierarchical RBF. This algorithm is faster and requires less memory than QNN and RBF-ML. It is especially good for large-scale interpolation problems. So, we recommend you to consider Hierarchical RBF as default option. ========================================================================== This function sets RBF interpolation algorithm. ALGLIB supports several RBF algorithms with different properties. This algorithm is called RBF-QNN and it is good for point sets with following properties: a) all points are distinct b) all points are well separated. c) points distribution is approximately uniform. There is no "contour lines", clusters of points, or other small-scale structures.
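USAGE SKETCH: a typical RBF-QNN setup via the C++ interface is sketched below; Q=1.0 and Z=5.0 are the recommended values discussed later in this comment, 'xy' is assumed to be a real_2d_array holding the dataset, and the exact signatures should be checked against your ALGLIB version.

    rbfmodel model;
    rbfreport rep;
    rbfcreate(2, 1, model);              // QNN supports only NX=2 or NX=3
    rbfsetpoints(model, xy);             // xy: one point per row (coordinates, then values)
    rbfsetalgoqnn(model, 1.0, 5.0);      // Q=1.0, Z=5.0
    rbfbuildmodel(model, rep);           // rep.terminationtype>0 on success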
Algorithm description: 1) interpolation centers are allocated to data points 2) interpolation radii are calculated as distances to the nearest centers times Q coefficient (where Q is a value from [0.75,1.50]). 3) after performing (2) radii are transformed in order to avoid situation when single outlier has very large radius and influences many points across all dataset. Transformation has following form: new_r[i] = min(r[i],Z*median(r[])) where r[i] is I-th radius, median() is a median radius across entire dataset, Z is user-specified value which controls amount of deviation from median radius. When (a) is violated, we will be unable to build RBF model. When (b) or (c) are violated, model will be built, but interpolation quality will be low. See http://www.alglib.net/interpolation/ for more information on this subject. This algorithm is used by default. Additional Q parameter controls smoothness properties of the RBF basis: * Q<0.75 will give perfectly conditioned basis, but terrible smoothness properties (RBF interpolant will have sharp peaks around function values) * Q around 1.0 gives good balance between smoothness and condition number * Q>1.5 will lead to badly conditioned systems and slow convergence of the underlying linear solver (although smoothness will be very good) * Q>2.0 will effectively make optimizer useless because it won't converge within reasonable amount of iterations. It is possible to set such large Q, but it is advised not to do so. INPUT PARAMETERS: S - RBF model, initialized by RBFCreate() call Q - Q parameter, Q>0, recommended value - 1.0 Z - Z parameter, Z>0, recommended value - 5.0 NOTE: this function has some serialization-related subtleties. We recommend you to study serialization examples from ALGLIB Reference Manual if you want to perform serialization of your models. -- ALGLIB -- Copyright 13.12.2011 by Bochkanov Sergey *************************************************************************/ void rbfsetalgoqnn(rbfmodel* s, double q, double z, ae_state *_state) { ae_assert(ae_isfinite(q, _state), "RBFSetAlgoQNN: Q is infinite or NAN", _state); ae_assert(ae_fp_greater(q,(double)(0)), "RBFSetAlgoQNN: Q<=0", _state); ae_assert(ae_isfinite(z, _state), "RBFSetAlgoQNN: Z is infinite or NAN", _state); ae_assert(ae_fp_greater(z,(double)(0)), "RBFSetAlgoQNN: Z<=0", _state); s->radvalue = q; s->radzvalue = z; s->algorithmtype = 1; } /************************************************************************* DEPRECATED:since version 3.11 ALGLIB includes new RBF model construction algorithm, Hierarchical RBF. This algorithm is faster and requires less memory than QNN and RBF-ML. It is especially good for large-scale interpolation problems. So, we recommend you to consider Hierarchical RBF as default option. ========================================================================== This function sets RBF interpolation algorithm. ALGLIB supports several RBF algorithms with different properties. This algorithm is called RBF-ML. It builds multilayer RBF model, i.e. model with subsequently decreasing radii, which allows us to combine smoothness (due to large radii of the first layers) with exactness (due to small radii of the last layers) and fast convergence. 
Internally RBF-ML uses many different means of acceleration, from sparse matrices to KD-trees, which results in algorithm whose working time is roughly proportional to N*log(N)*Density*RBase^2*NLayers, where N is a number of points, Density is an average density if points per unit of the interpolation space, RBase is an initial radius, NLayers is a number of layers. RBF-ML is good for following kinds of interpolation problems: 1. "exact" problems (perfect fit) with well separated points 2. least squares problems with arbitrary distribution of points (algorithm gives perfect fit where it is possible, and resorts to least squares fit in the hard areas). 3. noisy problems where we want to apply some controlled amount of smoothing. INPUT PARAMETERS: S - RBF model, initialized by RBFCreate() call RBase - RBase parameter, RBase>0 NLayers - NLayers parameter, NLayers>0, recommended value to start with - about 5. LambdaV - regularization value, can be useful when solving problem in the least squares sense. Optimal lambda is problem- dependent and require trial and error. In our experience, good lambda can be as large as 0.1, and you can use 0.001 as initial guess. Default value - 0.01, which is used when LambdaV is not given. You can specify zero value, but it is not recommended to do so. TUNING ALGORITHM In order to use this algorithm you have to choose three parameters: * initial radius RBase * number of layers in the model NLayers * regularization coefficient LambdaV Initial radius is easy to choose - you can pick any number several times larger than the average distance between points. Algorithm won't break down if you choose radius which is too large (model construction time will increase, but model will be built correctly). Choose such number of layers that RLast=RBase/2^(NLayers-1) (radius used by the last layer) will be smaller than the typical distance between points. In case model error is too large, you can increase number of layers. Having more layers will make model construction and evaluation proportionally slower, but it will allow you to have model which precisely fits your data. From the other side, if you want to suppress noise, you can DECREASE number of layers to make your model less flexible. Regularization coefficient LambdaV controls smoothness of the individual models built for each layer. We recommend you to use default value in case you don't want to tune this parameter, because having non-zero LambdaV accelerates and stabilizes internal iterative algorithm. In case you want to suppress noise you can use LambdaV as additional parameter (larger value = more smoothness) to tune. TYPICAL ERRORS 1. Using initial radius which is too large. Memory requirements of the RBF-ML are roughly proportional to N*Density*RBase^2 (where Density is an average density of points per unit of the interpolation space). In the extreme case of the very large RBase we will need O(N^2) units of memory - and many layers in order to decrease radius to some reasonably small value. 2. Using too small number of layers - RBF models with large radius are not flexible enough to reproduce small variations in the target function. You need many layers with different radii, from large to small, in order to have good model. 3. Using initial radius which is too small. You will get model with "holes" in the areas which are too far away from interpolation centers. However, algorithm will work correctly (and quickly) in this case. 4. Using too many layers - you will get too large and too slow model. 
This model will perfectly reproduce your function, but maybe you will be able to achieve similar results with less layers (and less memory). -- ALGLIB -- Copyright 02.03.2012 by Bochkanov Sergey *************************************************************************/ void rbfsetalgomultilayer(rbfmodel* s, double rbase, ae_int_t nlayers, double lambdav, ae_state *_state) { ae_assert(ae_isfinite(rbase, _state), "RBFSetAlgoMultiLayer: RBase is infinite or NaN", _state); ae_assert(ae_fp_greater(rbase,(double)(0)), "RBFSetAlgoMultiLayer: RBase<=0", _state); ae_assert(nlayers>=0, "RBFSetAlgoMultiLayer: NLayers<0", _state); ae_assert(ae_isfinite(lambdav, _state), "RBFSetAlgoMultiLayer: LambdaV is infinite or NAN", _state); ae_assert(ae_fp_greater_eq(lambdav,(double)(0)), "RBFSetAlgoMultiLayer: LambdaV<0", _state); s->radvalue = rbase; s->nlayers = nlayers; s->algorithmtype = 2; s->lambdav = lambdav; } /************************************************************************* This function sets RBF interpolation algorithm. ALGLIB supports several RBF algorithms with different properties. This algorithm is called Hierarchical RBF. It similar to its previous incarnation, RBF-ML, i.e. it also builds a sequence of models with decreasing radii. However, it uses more economical way of building upper layers (ones with large radii), which results in faster model construction and evaluation, as well as smaller memory footprint during construction. This algorithm has following important features: * ability to handle millions of points * controllable smoothing via nonlinearity penalization * support for NX-dimensional models with NX=1 or NX>3 (unlike QNN or RBF-ML) * support for specification of per-dimensional radii via scale vector, which is set by means of rbfsetpointsandscales() function. This feature is useful if you solve spatio-temporal interpolation problems, where different radii are required for spatial and temporal dimensions. Running times are roughly proportional to: * N*log(N)*NLayers - for model construction * N*NLayers - for model evaluation You may see that running time does not depend on search radius or points density, just on number of layers in the hierarchy. IMPORTANT: this model construction algorithm was introduced in ALGLIB 3.11 and produces models which are INCOMPATIBLE with previous versions of ALGLIB. You can not unserialize models produced with this function in ALGLIB 3.10 or earlier. INPUT PARAMETERS: S - RBF model, initialized by rbfcreate() call RBase - RBase parameter, RBase>0 NLayers - NLayers parameter, NLayers>0, recommended value to start with - about 5. LambdaNS- >=0, nonlinearity penalty coefficient, negative values are not allowed. This parameter adds controllable smoothing to the problem, which may reduce noise. Specification of non- zero lambda means that in addition to fitting error solver will also minimize LambdaNS*|S''(x)|^2 (appropriately generalized to multiple dimensions. Specification of exactly zero value means that no penalty is added (we do not even evaluate matrix of second derivatives which is necessary for smoothing). Calculation of nonlinearity penalty is costly - it results in several-fold increase of model construction time. Evaluation time remains the same. Optimal lambda is problem-dependent and requires trial and error. Good value to start from is 1e-5...1e-6, which corresponds to slightly noticeable smoothing of the function. Value 1e-2 usually means that quite heavy smoothing is applied. 
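USAGE SKETCH: the fragment below sketches a typical hierarchical RBF setup through the C++ interface, including optional per-dimension scales set with rbfsetpointsandscales(). RBase=1.0, NLayers=5 and LambdaNS=0.0 are illustrative values only, 'xy' is assumed to be a real_2d_array holding the dataset, and signatures should be verified against your ALGLIB version. Guidelines for choosing RBase and NLayers follow in the TUNING ALGORITHM section below.

    rbfmodel model;
    rbfreport rep;
    rbfcreate(3, 1, model);                        // NX=3, NY=1
    real_1d_array s = "[1.0,1.0,10.0]";            // e.g. a coarser radius along the third (time) axis
    rbfsetpointsandscales(model, xy, s);           // xy: array[N,NX+NY]
    rbfsetalgohierarchical(model, 1.0, 5, 0.0);    // RBase=1.0, NLayers=5, no smoothing penalty
    rbfbuildmodel(model, rep);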
TUNING ALGORITHM In order to use this algorithm you have to choose three parameters: * initial radius RBase * number of layers in the model NLayers * penalty coefficient LambdaNS Initial radius is easy to choose - you can pick any number several times larger than the average distance between points. Algorithm won't break down if you choose radius which is too large (model construction time will increase, but model will be built correctly). Choose such number of layers that RLast=RBase/2^(NLayers-1) (radius used by the last layer) will be smaller than the typical distance between points. In case model error is too large, you can increase number of layers. Having more layers will make model construction and evaluation proportionally slower, but it will allow you to have model which precisely fits your data. From the other side, if you want to suppress noise, you can DECREASE number of layers to make your model less flexible (or specify non-zero LambdaNS). TYPICAL ERRORS 1. Using too small number of layers - RBF models with large radius are not flexible enough to reproduce small variations in the target function. You need many layers with different radii, from large to small, in order to have good model. 2. Using initial radius which is too small. You will get model with "holes" in the areas which are too far away from interpolation centers. However, algorithm will work correctly (and quickly) in this case. -- ALGLIB -- Copyright 20.06.2016 by Bochkanov Sergey *************************************************************************/ void rbfsetalgohierarchical(rbfmodel* s, double rbase, ae_int_t nlayers, double lambdans, ae_state *_state) { ae_assert(ae_isfinite(rbase, _state), "RBFSetAlgoHierarchical: RBase is infinite or NaN", _state); ae_assert(ae_fp_greater(rbase,(double)(0)), "RBFSetAlgoHierarchical: RBase<=0", _state); ae_assert(nlayers>=0, "RBFSetAlgoHierarchical: NLayers<0", _state); ae_assert(ae_isfinite(lambdans, _state)&&ae_fp_greater_eq(lambdans,(double)(0)), "RBFSetAlgoHierarchical: LambdaNS<0 or infinite", _state); s->radvalue = rbase; s->nlayers = nlayers; s->algorithmtype = 3; s->lambdav = lambdans; } /************************************************************************* This function sets linear term (model is a sum of radial basis functions plus linear polynomial). This function won't have effect until next call to RBFBuildModel(). INPUT PARAMETERS: S - RBF model, initialized by RBFCreate() call NOTE: this function has some serialization-related subtleties. We recommend you to study serialization examples from ALGLIB Reference Manual if you want to perform serialization of your models. -- ALGLIB -- Copyright 13.12.2011 by Bochkanov Sergey *************************************************************************/ void rbfsetlinterm(rbfmodel* s, ae_state *_state) { s->aterm = 1; } /************************************************************************* This function sets constant term (model is a sum of radial basis functions plus constant). This function won't have effect until next call to RBFBuildModel(). INPUT PARAMETERS: S - RBF model, initialized by RBFCreate() call NOTE: this function has some serialization-related subtleties. We recommend you to study serialization examples from ALGLIB Reference Manual if you want to perform serialization of your models. 
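USAGE SKETCH: the polynomial term must be selected before rbfbuildmodel() is called; the fragment below is illustrative only, assumes the standard C++ interface, and reuses 'model', 'xy' and 'rep' declared as rbfmodel, real_2d_array and rbfreport respectively.

    rbfcreate(2, 1, model);
    rbfsetpoints(model, xy);
    rbfsetconstterm(model);          // or rbfsetlinterm(model) / rbfsetzeroterm(model)
    rbfbuildmodel(model, rep);       // the selected term takes effect on this rebuild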
-- ALGLIB -- Copyright 13.12.2011 by Bochkanov Sergey *************************************************************************/ void rbfsetconstterm(rbfmodel* s, ae_state *_state) { s->aterm = 2; } /************************************************************************* This function sets zero term (model is a sum of radial basis functions without polynomial term). This function won't have effect until next call to RBFBuildModel(). INPUT PARAMETERS: S - RBF model, initialized by RBFCreate() call NOTE: this function has some serialization-related subtleties. We recommend you to study serialization examples from ALGLIB Reference Manual if you want to perform serialization of your models. -- ALGLIB -- Copyright 13.12.2011 by Bochkanov Sergey *************************************************************************/ void rbfsetzeroterm(rbfmodel* s, ae_state *_state) { s->aterm = 3; } /************************************************************************* This function sets basis function type, which can be: * 0 for classic Gaussian * 1 for fast and compact bell-like basis function, which becomes exactly zero at distance equal to 3*R (default option). INPUT PARAMETERS: S - RBF model, initialized by RBFCreate() call BF - basis function type: * 0 - classic Gaussian * 1 - fast and compact one -- ALGLIB -- Copyright 01.02.2017 by Bochkanov Sergey *************************************************************************/ void rbfsetv2bf(rbfmodel* s, ae_int_t bf, ae_state *_state) { ae_assert(bf==0||bf==1, "RBFSetV2Its: BF<>0 and BF<>1", _state); s->model2.basisfunction = bf; } /************************************************************************* This function sets stopping criteria of the underlying linear solver for hierarchical (version 2) RBF constructor. INPUT PARAMETERS: S - RBF model, initialized by RBFCreate() call MaxIts - this criterion will stop algorithm after MaxIts iterations. Typically a few hundreds iterations is required, with 400 being a good default value to start experimentation. Zero value means that default value will be selected. -- ALGLIB -- Copyright 01.02.2017 by Bochkanov Sergey *************************************************************************/ void rbfsetv2its(rbfmodel* s, ae_int_t maxits, ae_state *_state) { ae_assert(maxits>=0, "RBFSetV2Its: MaxIts is negative", _state); s->model2.maxits = maxits; } /************************************************************************* This function sets support radius parameter of hierarchical (version 2) RBF constructor. Hierarchical RBF model achieves great speed-up by removing from the model excessive (too dense) nodes. Say, if you have RBF radius equal to 1 meter, and two nodes are just 1 millimeter apart, you may remove one of them without reducing model quality. Support radius parameter is used to justify which points need removal, and which do not. If two points are less than SUPPORT_R*CUR_RADIUS units of distance apart, one of them is removed from the model. The larger support radius is, the faster model construction AND evaluation are. However, too large values result in "bumpy" models. INPUT PARAMETERS: S - RBF model, initialized by RBFCreate() call R - support radius coefficient, >=0. Recommended values are [0.1,0.4] range, with 0.1 being default value. 
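USAGE SKETCH: the version-2 (hierarchical) tuning functions are usually adjusted together; the fragment below assumes that rbfsetv2bf(), rbfsetv2its() and rbfsetv2supportr() are exposed by the C++ interface of your ALGLIB build (they are present in this C core), and that 'model' and 'rep' are an rbfmodel and rbfreport with a dataset already attached. Verify availability and signatures before use.

    rbfsetalgohierarchical(model, 1.0, 5, 0.0);
    rbfsetv2bf(model, 1);             // compact bell-like basis function (the default)
    rbfsetv2its(model, 400);          // cap the underlying linear solver at 400 iterations
    rbfsetv2supportr(model, 0.1);     // default support radius coefficient
    rbfbuildmodel(model, rep);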
-- ALGLIB -- Copyright 01.02.2017 by Bochkanov Sergey *************************************************************************/ void rbfsetv2supportr(rbfmodel* s, double r, ae_state *_state) { ae_assert(ae_isfinite(r, _state), "RBFSetV2SupportR: R is not finite", _state); ae_assert(ae_fp_greater_eq(r,(double)(0)), "RBFSetV2SupportR: R<0", _state); s->model2.supportr = r; } /************************************************************************* This function sets stopping criteria of the underlying linear solver. INPUT PARAMETERS: S - RBF model, initialized by RBFCreate() call EpsOrt - orthogonality stopping criterion, EpsOrt>=0. Algorithm will stop when ||A'*r||<=EpsOrt where A' is a transpose of the system matrix, r is a residual vector. Recommended value of EpsOrt is equal to 1E-6. This criterion will stop algorithm when we have "bad fit" situation, i.e. when we should stop in a point with large, nonzero residual. EpsErr - residual stopping criterion. Algorithm will stop when ||r||<=EpsErr*||b||, where r is a residual vector, b is a right part of the system (function values). Recommended value of EpsErr is equal to 1E-3 or 1E-6. This criterion will stop algorithm in a "good fit" situation when we have near-zero residual near the desired solution. MaxIts - this criterion will stop algorithm after MaxIts iterations. It should be used for debugging purposes only! Zero MaxIts means that no limit is placed on the number of iterations. We recommend to set moderate non-zero values EpsOrt and EpsErr simultaneously. Values equal to 10E-6 are good to start with. In case you need high performance and do not need high precision , you may decrease EpsErr down to 0.001. However, we do not recommend decreasing EpsOrt. As for MaxIts, we recommend to leave it zero unless you know what you do. NOTE: this function has some serialization-related subtleties. We recommend you to study serialization examples from ALGLIB Reference Manual if you want to perform serialization of your models. -- ALGLIB -- Copyright 13.12.2011 by Bochkanov Sergey *************************************************************************/ void rbfsetcond(rbfmodel* s, double epsort, double epserr, ae_int_t maxits, ae_state *_state) { ae_assert(ae_isfinite(epsort, _state)&&ae_fp_greater_eq(epsort,(double)(0)), "RBFSetCond: EpsOrt is negative, INF or NAN", _state); ae_assert(ae_isfinite(epserr, _state)&&ae_fp_greater_eq(epserr,(double)(0)), "RBFSetCond: EpsB is negative, INF or NAN", _state); ae_assert(maxits>=0, "RBFSetCond: MaxIts is negative", _state); if( (ae_fp_eq(epsort,(double)(0))&&ae_fp_eq(epserr,(double)(0)))&&maxits==0 ) { s->epsort = rbf_eps; s->epserr = rbf_eps; s->maxits = 0; } else { s->epsort = epsort; s->epserr = epserr; s->maxits = maxits; } } /************************************************************************* This function builds RBF model and returns report (contains some information which can be used for evaluation of the algorithm properties). Call to this function modifies RBF model by calculating its centers/radii/ weights and saving them into RBFModel structure. Initially RBFModel contain zero coefficients, but after call to this function we will have coefficients which were calculated in order to fit our dataset. After you called this function you can call RBFCalc(), RBFGridCalc() and other model calculation functions. 
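A typical build-and-evaluate sequence might look as follows (editorial
sketch which assumes the C++ interface of this manual; the tiny dataset
literal and the parameter values are purely illustrative):

    alglib::rbfmodel model;
    alglib::rbfreport rep;
    alglib::real_2d_array xy = "[[0,0,1],[1,0,2],[0,1,3],[1,1,4]]"; // rows are [x0,x1,f]
    alglib::rbfcreate(2, 1, model);                      // NX=2, NY=1
    alglib::rbfsetpoints(model, xy);
    // RBase several times the typical point spacing; NLayers chosen so
    // that RBase/2^(NLayers-1) drops below that spacing
    alglib::rbfsetalgohierarchical(model, 2.0, 4, 0.0);  // RBase, NLayers, LambdaNS
    alglib::rbfbuildmodel(model, rep);
    if( rep.terminationtype>0 )
    {
        double v = alglib::rbfcalc2(model, 0.5, 0.5);    // model value at (0.5,0.5)
    }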
INPUT PARAMETERS: S - RBF model, initialized by RBFCreate() call Rep - report: * Rep.TerminationType: * -5 - non-distinct basis function centers were detected, interpolation aborted; only QNN returns this error code, other algorithms can handle non- distinct nodes. * -4 - nonconvergence of the internal SVD solver * -3 incorrect model construction algorithm was chosen: QNN or RBF-ML, combined with one of the incompatible features - NX=1 or NX>3; points with per-dimension scales. * 1 - successful termination Fields which are set only by modern RBF solvers (hierarchical or nonnegative; older solvers like QNN and ML initialize these fields by NANs): * rep.rmserror - root-mean-square error at nodes * rep.maxerror - maximum error at nodes Fields are used for debugging purposes: * Rep.IterationsCount - iterations count of the LSQR solver * Rep.NMV - number of matrix-vector products * Rep.ARows - rows count for the system matrix * Rep.ACols - columns count for the system matrix * Rep.ANNZ - number of significantly non-zero elements (elements above some algorithm-determined threshold) NOTE: failure to build model will leave current state of the structure unchanged. -- ALGLIB -- Copyright 13.12.2011 by Bochkanov Sergey *************************************************************************/ void rbfbuildmodel(rbfmodel* s, rbfreport* rep, ae_state *_state) { ae_frame _frame_block; rbfv1report rep1; rbfv2report rep2; ae_matrix x3; ae_vector scalevec; ae_int_t i; ae_int_t curalgorithmtype; ae_frame_make(_state, &_frame_block); _rbfreport_clear(rep); _rbfv1report_init(&rep1, _state); _rbfv2report_init(&rep2, _state); ae_matrix_init(&x3, 0, 0, DT_REAL, _state); ae_vector_init(&scalevec, 0, DT_REAL, _state); /* * Clean report fields prior to processing */ rbf_clearreportfields(rep, _state); /* * Autoselect algorithm */ if( s->algorithmtype==0 ) { if( (s->nx<2||s->nx>3)||s->hasscale ) { curalgorithmtype = 3; } else { curalgorithmtype = 1; } } else { curalgorithmtype = s->algorithmtype; } /* * Algorithms which generate V1 models */ if( curalgorithmtype==1||curalgorithmtype==2 ) { /* * Perform compatibility checks */ if( (s->nx<2||s->nx>3)||s->hasscale ) { rep->terminationtype = -3; ae_frame_leave(_state); return; } /* * Try to build model. * * NOTE: due to historical reasons RBFV1BuildModel() accepts points * cast to 3-dimensional space, even if they are really 2-dimensional. * So, for 2D data we have to explicitly convert them to 3D. 
*/ if( s->nx==2 ) { /* * Convert data to 3D */ rmatrixsetlengthatleast(&x3, s->n, 3, _state); for(i=0; i<=s->n-1; i++) { x3.ptr.pp_double[i][0] = s->x.ptr.pp_double[i][0]; x3.ptr.pp_double[i][1] = s->x.ptr.pp_double[i][1]; x3.ptr.pp_double[i][2] = (double)(0); } rbfv1buildmodel(&x3, &s->y, s->n, s->aterm, curalgorithmtype, s->nlayers, s->radvalue, s->radzvalue, s->lambdav, s->epsort, s->epserr, s->maxits, &s->model1, &rep1, _state); } else { /* * Work with raw data */ rbfv1buildmodel(&s->x, &s->y, s->n, s->aterm, curalgorithmtype, s->nlayers, s->radvalue, s->radzvalue, s->lambdav, s->epsort, s->epserr, s->maxits, &s->model1, &rep1, _state); } s->modelversion = 1; /* * Convert report fields */ rep->arows = rep1.arows; rep->acols = rep1.acols; rep->annz = rep1.annz; rep->iterationscount = rep1.iterationscount; rep->nmv = rep1.nmv; rep->terminationtype = rep1.terminationtype; /* * Done */ ae_frame_leave(_state); return; } /* * Algorithms which generate V2 models */ if( curalgorithmtype==3 ) { /* * Prepare scale vector - use unit values or user supplied ones */ ae_vector_set_length(&scalevec, s->nx, _state); for(i=0; i<=s->nx-1; i++) { if( s->hasscale ) { scalevec.ptr.p_double[i] = s->s.ptr.p_double[i]; } else { scalevec.ptr.p_double[i] = (double)(1); } } /* * Build model */ rbfv2buildhierarchical(&s->x, &s->y, s->n, &scalevec, s->aterm, s->nlayers, s->radvalue, s->lambdav, ae_false, 0, &s->model2, &rep2, _state); s->modelversion = 2; /* * Convert report fields */ rep->terminationtype = rep2.terminationtype; rep->rmserror = rep2.rmserror; rep->maxerror = rep2.maxerror; /* * Done */ ae_frame_leave(_state); return; } /* * Critical error */ ae_assert(ae_false, "RBFBuildModel: integrity check failure", _state); ae_frame_leave(_state); } /************************************************************************* This function calculates values of the RBF model in the given point. IMPORTANT: this function works only with modern (hierarchical) RBFs. It can not be used with legacy (version 1) RBFs because older RBF code does not support 1-dimensional models. This function should be used when we have NY=1 (scalar function) and NX=1 (1-dimensional space). If you have 3-dimensional space, use rbfcalc3(). If you have 2-dimensional space, use rbfcalc3(). If you have general situation (NX-dimensional space, NY-dimensional function) you should use generic rbfcalc(). If you want to perform parallel model evaluation from multiple threads, use rbftscalcbuf() with per-thread buffer object. This function returns 0.0 when: * model is not initialized * NX<>1 * NY<>1 INPUT PARAMETERS: S - RBF model X0 - X-coordinate, finite number RESULT: value of the model or 0.0 (as defined above) -- ALGLIB -- Copyright 13.12.2011 by Bochkanov Sergey *************************************************************************/ double rbfcalc1(rbfmodel* s, double x0, ae_state *_state) { double result; ae_assert(ae_isfinite(x0, _state), "RBFCalc1: invalid value for X0 (X0 is Inf)!", _state); result = (double)(0); if( s->ny!=1||s->nx!=1 ) { return result; } if( s->modelversion==1 ) { result = (double)(0); return result; } if( s->modelversion==2 ) { result = rbfv2calc1(&s->model2, x0, _state); return result; } ae_assert(ae_false, "RBFCalc1: integrity check failed", _state); return result; } /************************************************************************* This function calculates values of the RBF model in the given point. This function should be used when we have NY=1 (scalar function) and NX=2 (2-dimensional space). 
If you have 3-dimensional space, use rbfcalc3(). If you have general situation (NX-dimensional space, NY-dimensional function) you should use generic rbfcalc(). If you want to calculate function values many times, consider using rbfgridcalc2v(), which is far more efficient than many subsequent calls to rbfcalc2(). If you want to perform parallel model evaluation from multiple threads, use rbftscalcbuf() with per-thread buffer object. This function returns 0.0 when: * model is not initialized * NX<>2 *NY<>1 INPUT PARAMETERS: S - RBF model X0 - first coordinate, finite number X1 - second coordinate, finite number RESULT: value of the model or 0.0 (as defined above) -- ALGLIB -- Copyright 13.12.2011 by Bochkanov Sergey *************************************************************************/ double rbfcalc2(rbfmodel* s, double x0, double x1, ae_state *_state) { double result; ae_assert(ae_isfinite(x0, _state), "RBFCalc2: invalid value for X0 (X0 is Inf)!", _state); ae_assert(ae_isfinite(x1, _state), "RBFCalc2: invalid value for X1 (X1 is Inf)!", _state); result = (double)(0); if( s->ny!=1||s->nx!=2 ) { return result; } if( s->modelversion==1 ) { result = rbfv1calc2(&s->model1, x0, x1, _state); return result; } if( s->modelversion==2 ) { result = rbfv2calc2(&s->model2, x0, x1, _state); return result; } ae_assert(ae_false, "RBFCalc2: integrity check failed", _state); return result; } /************************************************************************* This function calculates value of the RBF model in the given point. This function should be used when we have NY=1 (scalar function) and NX=3 (3-dimensional space). If you have 2-dimensional space, use rbfcalc2(). If you have general situation (NX-dimensional space, NY-dimensional function) you should use generic rbfcalc(). If you want to calculate function values many times, consider using rbfgridcalc3v(), which is far more efficient than many subsequent calls to rbfcalc3(). If you want to perform parallel model evaluation from multiple threads, use rbftscalcbuf() with per-thread buffer object. This function returns 0.0 when: * model is not initialized * NX<>3 *NY<>1 INPUT PARAMETERS: S - RBF model X0 - first coordinate, finite number X1 - second coordinate, finite number X2 - third coordinate, finite number RESULT: value of the model or 0.0 (as defined above) -- ALGLIB -- Copyright 13.12.2011 by Bochkanov Sergey *************************************************************************/ double rbfcalc3(rbfmodel* s, double x0, double x1, double x2, ae_state *_state) { double result; ae_assert(ae_isfinite(x0, _state), "RBFCalc3: invalid value for X0 (X0 is Inf or NaN)!", _state); ae_assert(ae_isfinite(x1, _state), "RBFCalc3: invalid value for X1 (X1 is Inf or NaN)!", _state); ae_assert(ae_isfinite(x2, _state), "RBFCalc3: invalid value for X2 (X2 is Inf or NaN)!", _state); result = (double)(0); if( s->ny!=1||s->nx!=3 ) { return result; } if( s->modelversion==1 ) { result = rbfv1calc3(&s->model1, x0, x1, x2, _state); return result; } if( s->modelversion==2 ) { result = rbfv2calc3(&s->model2, x0, x1, x2, _state); return result; } ae_assert(ae_false, "RBFCalc3: integrity check failed", _state); return result; } /************************************************************************* This function calculates values of the RBF model at the given point. This is general function which can be used for arbitrary NX (dimension of the space of arguments) and NY (dimension of the function itself). 
However when you have NY=1 you may find more convenient to use rbfcalc2() or rbfcalc3(). If you want to perform parallel model evaluation from multiple threads, use rbftscalcbuf() with per-thread buffer object. This function returns 0.0 when model is not initialized. INPUT PARAMETERS: S - RBF model X - coordinates, array[NX]. X may have more than NX elements, in this case only leading NX will be used. OUTPUT PARAMETERS: Y - function value, array[NY]. Y is out-parameter and reallocated after call to this function. In case you want to reuse previously allocated Y, you may use RBFCalcBuf(), which reallocates Y only when it is too small. -- ALGLIB -- Copyright 13.12.2011 by Bochkanov Sergey *************************************************************************/ void rbfcalc(rbfmodel* s, /* Real */ ae_vector* x, /* Real */ ae_vector* y, ae_state *_state) { ae_vector_clear(y); ae_assert(x->cnt>=s->nx, "RBFCalc: Length(X)nx, _state), "RBFCalc: X contains infinite or NaN values", _state); rbfcalcbuf(s, x, y, _state); } /************************************************************************* This function calculates values of the RBF model at the given point. Same as rbfcalc(), but does not reallocate Y when in is large enough to store function values. If you want to perform parallel model evaluation from multiple threads, use rbftscalcbuf() with per-thread buffer object. INPUT PARAMETERS: S - RBF model X - coordinates, array[NX]. X may have more than NX elements, in this case only leading NX will be used. Y - possibly preallocated array OUTPUT PARAMETERS: Y - function value, array[NY]. Y is not reallocated when it is larger than NY. -- ALGLIB -- Copyright 13.12.2011 by Bochkanov Sergey *************************************************************************/ void rbfcalcbuf(rbfmodel* s, /* Real */ ae_vector* x, /* Real */ ae_vector* y, ae_state *_state) { ae_int_t i; ae_assert(x->cnt>=s->nx, "RBFCalcBuf: Length(X)nx, _state), "RBFCalcBuf: X contains infinite or NaN values", _state); if( y->cntny ) { ae_vector_set_length(y, s->ny, _state); } for(i=0; i<=s->ny-1; i++) { y->ptr.p_double[i] = (double)(0); } if( s->modelversion==1 ) { rbfv1calcbuf(&s->model1, x, y, _state); return; } if( s->modelversion==2 ) { rbfv2calcbuf(&s->model2, x, y, _state); return; } ae_assert(ae_false, "RBFCalcBuf: integrity check failed", _state); } /************************************************************************* This function calculates values of the RBF model at the given point, using external buffer object (internal temporaries of RBF model are not modified). This function allows to use same RBF model object in different threads, assuming that different threads use different instances of buffer structure. INPUT PARAMETERS: S - RBF model, may be shared between different threads Buf - buffer object created for this particular instance of RBF model with rbfcreatecalcbuffer(). X - coordinates, array[NX]. X may have more than NX elements, in this case only leading NX will be used. Y - possibly preallocated array OUTPUT PARAMETERS: Y - function value, array[NY]. Y is not reallocated when it is larger than NY. 
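A minimal sketch of the intended per-thread usage pattern (editorial
addition, not taken from the ALGLIB sources; it assumes the C++ interface
of this manual and an already built model shared by worker threads, each
of which owns one buffer):

    alglib::rbfcalcbuffer buf;                 // one such buffer per thread
    alglib::rbfcreatecalcbuffer(model, buf);   // bind buffer to this model
    alglib::real_1d_array x = "[0.5, 0.5]";
    alglib::real_1d_array y;
    alglib::rbftscalcbuf(model, buf, x, y);    // model is used read-only here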
-- ALGLIB -- Copyright 13.12.2011 by Bochkanov Sergey *************************************************************************/ void rbftscalcbuf(rbfmodel* s, rbfcalcbuffer* buf, /* Real */ ae_vector* x, /* Real */ ae_vector* y, ae_state *_state) { ae_int_t i; ae_assert(x->cnt>=s->nx, "RBFCalcBuf: Length(X)nx, _state), "RBFCalcBuf: X contains infinite or NaN values", _state); ae_assert(s->modelversion==buf->modelversion, "RBFCalcBuf: buffer object is not compatible with RBF model", _state); if( y->cntny ) { ae_vector_set_length(y, s->ny, _state); } for(i=0; i<=s->ny-1; i++) { y->ptr.p_double[i] = (double)(0); } if( s->modelversion==1 ) { rbfv1tscalcbuf(&s->model1, &buf->bufv1, x, y, _state); return; } if( s->modelversion==2 ) { rbfv2tscalcbuf(&s->model2, &buf->bufv2, x, y, _state); return; } ae_assert(ae_false, "RBFTsCalcBuf: integrity check failed", _state); } /************************************************************************* This is legacy function for gridded calculation of RBF model. It is superseded by rbfgridcalc2v() and rbfgridcalc2vsubset() functions. -- ALGLIB -- Copyright 13.12.2011 by Bochkanov Sergey *************************************************************************/ void rbfgridcalc2(rbfmodel* s, /* Real */ ae_vector* x0, ae_int_t n0, /* Real */ ae_vector* x1, ae_int_t n1, /* Real */ ae_matrix* y, ae_state *_state) { ae_frame _frame_block; ae_vector cpx0; ae_vector cpx1; ae_vector p01; ae_vector p11; ae_vector p2; ae_frame_make(_state, &_frame_block); ae_matrix_clear(y); ae_vector_init(&cpx0, 0, DT_REAL, _state); ae_vector_init(&cpx1, 0, DT_REAL, _state); ae_vector_init(&p01, 0, DT_INT, _state); ae_vector_init(&p11, 0, DT_INT, _state); ae_vector_init(&p2, 0, DT_INT, _state); ae_assert(n0>0, "RBFGridCalc2: invalid value for N0 (N0<=0)!", _state); ae_assert(n1>0, "RBFGridCalc2: invalid value for N1 (N1<=0)!", _state); ae_assert(x0->cnt>=n0, "RBFGridCalc2: Length(X0)cnt>=n1, "RBFGridCalc2: Length(X1)modelversion==1 ) { rbfv1gridcalc2(&s->model1, x0, n0, x1, n1, y, _state); ae_frame_leave(_state); return; } if( s->modelversion==2 ) { rbfv2gridcalc2(&s->model2, x0, n0, x1, n1, y, _state); ae_frame_leave(_state); return; } ae_assert(ae_false, "RBFGridCalc2: integrity check failed", _state); ae_frame_leave(_state); } /************************************************************************* This function calculates values of the RBF model at the regular grid, which has N0*N1 points, with Point[I,J] = (X0[I], X1[J]). Vector-valued RBF models are supported. This function returns 0.0 when: * model is not initialized * NX<>2 FOR USERS OF COMMERCIAL EDITION: ! Commercial version of ALGLIB includes multicore support (C++ and C# ! computational cores) for this function. ! ! Parallel processing gives close-to-linear speedup on multicore systems, ! assuming that problem is large enough to be divided between cores. ! You should solve at least 200x200 problem to get nearly-linear speed- ! up (high efficiency). ! ! Parallel processing is implemented only for modern (hierarchical) ! RBFs. Legacy version 1 RBFs (created by QNN or RBF-ML) are still ! processed serially. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! This note is given for users of commercial edition; if you use GPL ! edition, you still will be able to call smp-version of this function, ! but all computations will be done serially. ! ! 
We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. INPUT PARAMETERS: S - RBF model, used in read-only mode, can be shared between multiple invocations of this function from multiple threads. X0 - array of grid nodes, first coordinates, array[N0]. Must be ordered by ascending. Exception is generated if the array is not correctly ordered. N0 - grid size (number of nodes) in the first dimension X1 - array of grid nodes, second coordinates, array[N1] Must be ordered by ascending. Exception is generated if the array is not correctly ordered. N1 - grid size (number of nodes) in the second dimension OUTPUT PARAMETERS: Y - function values, array[NY*N0*N1], where NY is a number of "output" vector values (this function supports vector- valued RBF models). Y is out-variable and is reallocated by this function. Y[K+NY*(I0+I1*N0)]=F_k(X0[I0],X1[I1]), for: * K=0...NY-1 * I0=0...N0-1 * I1=0...N1-1 NOTE: this function supports weakly ordered grid nodes, i.e. you may have X[i]=X[i+1] for some i. It does not provide you any performance benefits due to duplication of points, just convenience and flexibility. NOTE: this function is re-entrant, i.e. you may use same rbfmodel structure in multiple threads calling this function for different grids. NOTE: if you need function values on some subset of regular grid, which may be described as "several compact and dense islands", you may use rbfgridcalc2vsubset(). -- ALGLIB -- Copyright 27.01.2017 by Bochkanov Sergey *************************************************************************/ void rbfgridcalc2v(rbfmodel* s, /* Real */ ae_vector* x0, ae_int_t n0, /* Real */ ae_vector* x1, ae_int_t n1, /* Real */ ae_vector* y, ae_state *_state) { ae_frame _frame_block; ae_int_t i; ae_vector dummy; ae_frame_make(_state, &_frame_block); ae_vector_clear(y); ae_vector_init(&dummy, 0, DT_BOOL, _state); ae_assert(n0>0, "RBFGridCalc2V: invalid value for N0 (N0<=0)!", _state); ae_assert(n1>0, "RBFGridCalc2V: invalid value for N1 (N1<=0)!", _state); ae_assert(x0->cnt>=n0, "RBFGridCalc2V: Length(X0)cnt>=n1, "RBFGridCalc2V: Length(X1)ptr.p_double[i],x0->ptr.p_double[i+1]), "RBFGridCalc2V: X0 is not ordered by ascending", _state); } for(i=0; i<=n1-2; i++) { ae_assert(ae_fp_less_eq(x1->ptr.p_double[i],x1->ptr.p_double[i+1]), "RBFGridCalc2V: X1 is not ordered by ascending", _state); } rbfgridcalc2vx(s, x0, n0, x1, n1, &dummy, ae_false, y, _state); ae_frame_leave(_state); } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. *************************************************************************/ void _pexec_rbfgridcalc2v(rbfmodel* s, /* Real */ ae_vector* x0, ae_int_t n0, /* Real */ ae_vector* x1, ae_int_t n1, /* Real */ ae_vector* y, ae_state *_state) { rbfgridcalc2v(s,x0,n0,x1,n1,y, _state); } /************************************************************************* This function calculates values of the RBF model at some subset of regular grid: * grid has N0*N1 points, with Point[I,J] = (X0[I], X1[J]) * only values at some subset of this grid are required Vector-valued RBF models are supported. This function returns 0.0 when: * model is not initialized * NX<>2 FOR USERS OF COMMERCIAL EDITION: ! Commercial version of ALGLIB includes multicore support (C++ and C# ! computational cores) for this function. ! ! Parallel processing gives close-to-linear speedup on multicore systems, ! 
assuming that problem is large enough to be divided between cores. ! You should solve at least 200x200 problem to get nearly-linear speed- ! up (high efficiency). ! ! Parallel processing is implemented only for modern (hierarchical) ! RBFs. Legacy version 1 RBFs (created by QNN or RBF-ML) are still ! processed serially. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! This note is given for users of commercial edition; if you use GPL ! edition, you still will be able to call smp-version of this function, ! but all computations will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. INPUT PARAMETERS: S - RBF model, used in read-only mode, can be shared between multiple invocations of this function from multiple threads. X0 - array of grid nodes, first coordinates, array[N0]. Must be ordered by ascending. Exception is generated if the array is not correctly ordered. N0 - grid size (number of nodes) in the first dimension X1 - array of grid nodes, second coordinates, array[N1] Must be ordered by ascending. Exception is generated if the array is not correctly ordered. N1 - grid size (number of nodes) in the second dimension FlagY - array[N0*N1]: * Y[I0+I1*N0] corresponds to node (X0[I0],X1[I1]) * it is a "bitmap" array which contains False for nodes which are NOT calculated, and True for nodes which are required. OUTPUT PARAMETERS: Y - function values, array[NY*N0*N1*N2], where NY is a number of "output" vector values (this function supports vector- valued RBF models): * Y[K+NY*(I0+I1*N0)]=F_k(X0[I0],X1[I1]), for K=0...NY-1, I0=0...N0-1, I1=0...N1-1. * elements of Y[] which correspond to FlagY[]=True are loaded by model values (which may be exactly zero for some nodes). * elements of Y[] which correspond to FlagY[]=False MAY be initialized by zeros OR may be calculated. This function processes grid as a hierarchy of nested blocks and micro-rows. If just one element of micro-row is required, entire micro-row (up to 8 nodes in the current version, but no promises) is calculated. NOTE: this function supports weakly ordered grid nodes, i.e. you may have X[i]=X[i+1] for some i. It does not provide you any performance benefits due to duplication of points, just convenience and flexibility. NOTE: this function is re-entrant, i.e. you may use same rbfmodel structure in multiple threads calling this function for different grids. 
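A small sketch of the subset-evaluation pattern (editorial addition, not
taken from the ALGLIB sources; it assumes the C++ interface of this manual
and an already built model with NX=2):

    alglib::real_1d_array gx = "[0.0, 0.5, 1.0]";   // N0=3, ascending
    alglib::real_1d_array gy = "[0.0, 1.0]";        // N1=2, ascending
    alglib::boolean_1d_array flag =
        "[true,false,false,false,false,true]";      // N0*N1 entries
    alglib::real_1d_array vals;
    alglib::rbfgridcalc2vsubset(model, gx, 3, gy, 2, flag, vals);
    // for NY=1: vals[i0+i1*3] is the value at (gx[i0], gy[i1]);
    // only nodes with flag[]=true are guaranteed to be computed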
-- ALGLIB -- Copyright 04.03.2016 by Bochkanov Sergey *************************************************************************/ void rbfgridcalc2vsubset(rbfmodel* s, /* Real */ ae_vector* x0, ae_int_t n0, /* Real */ ae_vector* x1, ae_int_t n1, /* Boolean */ ae_vector* flagy, /* Real */ ae_vector* y, ae_state *_state) { ae_int_t i; ae_vector_clear(y); ae_assert(n0>0, "RBFGridCalc2VSubset: invalid value for N0 (N0<=0)!", _state); ae_assert(n1>0, "RBFGridCalc2VSubset: invalid value for N1 (N1<=0)!", _state); ae_assert(x0->cnt>=n0, "RBFGridCalc2VSubset: Length(X0)cnt>=n1, "RBFGridCalc2VSubset: Length(X1)cnt>=n0*n1, "RBFGridCalc2VSubset: Length(FlagY)ptr.p_double[i],x0->ptr.p_double[i+1]), "RBFGridCalc2VSubset: X0 is not ordered by ascending", _state); } for(i=0; i<=n1-2; i++) { ae_assert(ae_fp_less_eq(x1->ptr.p_double[i],x1->ptr.p_double[i+1]), "RBFGridCalc2VSubset: X1 is not ordered by ascending", _state); } rbfgridcalc2vx(s, x0, n0, x1, n1, flagy, ae_true, y, _state); } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. *************************************************************************/ void _pexec_rbfgridcalc2vsubset(rbfmodel* s, /* Real */ ae_vector* x0, ae_int_t n0, /* Real */ ae_vector* x1, ae_int_t n1, /* Boolean */ ae_vector* flagy, /* Real */ ae_vector* y, ae_state *_state) { rbfgridcalc2vsubset(s,x0,n0,x1,n1,flagy,y, _state); } /************************************************************************* This function calculates values of the RBF model at the regular grid, which has N0*N1*N2 points, with Point[I,J,K] = (X0[I], X1[J], X2[K]). Vector-valued RBF models are supported. This function returns 0.0 when: * model is not initialized * NX<>3 FOR USERS OF COMMERCIAL EDITION: ! Commercial version of ALGLIB includes multicore support (C++ and C# ! computational cores) for this function. ! ! Parallel processing gives close-to-linear speedup on multicore systems, ! assuming that problem is large enough to be divided between cores. ! You should solve at least 50x50x50 problem to get nearly-linear speed- ! up (high efficiency). ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! This note is given for users of commercial edition; if you use GPL ! edition, you still will be able to call smp-version of this function, ! but all computations will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. INPUT PARAMETERS: S - RBF model, used in read-only mode, can be shared between multiple invocations of this function from multiple threads. X0 - array of grid nodes, first coordinates, array[N0]. Must be ordered by ascending. Exception is generated if the array is not correctly ordered. N0 - grid size (number of nodes) in the first dimension X1 - array of grid nodes, second coordinates, array[N1] Must be ordered by ascending. Exception is generated if the array is not correctly ordered. N1 - grid size (number of nodes) in the second dimension X2 - array of grid nodes, third coordinates, array[N2] Must be ordered by ascending. Exception is generated if the array is not correctly ordered. 
N2 - grid size (number of nodes) in the third dimension OUTPUT PARAMETERS: Y - function values, array[NY*N0*N1*N2], where NY is a number of "output" vector values (this function supports vector- valued RBF models). Y is out-variable and is reallocated by this function. Y[K+NY*(I0+I1*N0+I2*N0*N1)]=F_k(X0[I0],X1[I1],X2[I2]), for: * K=0...NY-1 * I0=0...N0-1 * I1=0...N1-1 * I2=0...N2-1 NOTE: this function supports weakly ordered grid nodes, i.e. you may have X[i]=X[i+1] for some i. It does not provide you any performance benefits due to duplication of points, just convenience and flexibility. NOTE: this function is re-entrant, i.e. you may use same rbfmodel structure in multiple threads calling this function for different grids. NOTE: if you need function values on some subset of regular grid, which may be described as "several compact and dense islands", you may use rbfgridcalc3vsubset(). -- ALGLIB -- Copyright 04.03.2016 by Bochkanov Sergey *************************************************************************/ void rbfgridcalc3v(rbfmodel* s, /* Real */ ae_vector* x0, ae_int_t n0, /* Real */ ae_vector* x1, ae_int_t n1, /* Real */ ae_vector* x2, ae_int_t n2, /* Real */ ae_vector* y, ae_state *_state) { ae_frame _frame_block; ae_int_t i; ae_vector dummy; ae_frame_make(_state, &_frame_block); ae_vector_clear(y); ae_vector_init(&dummy, 0, DT_BOOL, _state); ae_assert(n0>0, "RBFGridCalc3V: invalid value for N0 (N0<=0)!", _state); ae_assert(n1>0, "RBFGridCalc3V: invalid value for N1 (N1<=0)!", _state); ae_assert(n2>0, "RBFGridCalc3V: invalid value for N2 (N2<=0)!", _state); ae_assert(x0->cnt>=n0, "RBFGridCalc3V: Length(X0)cnt>=n1, "RBFGridCalc3V: Length(X1)cnt>=n2, "RBFGridCalc3V: Length(X2)ptr.p_double[i],x0->ptr.p_double[i+1]), "RBFGridCalc3V: X0 is not ordered by ascending", _state); } for(i=0; i<=n1-2; i++) { ae_assert(ae_fp_less_eq(x1->ptr.p_double[i],x1->ptr.p_double[i+1]), "RBFGridCalc3V: X1 is not ordered by ascending", _state); } for(i=0; i<=n2-2; i++) { ae_assert(ae_fp_less_eq(x2->ptr.p_double[i],x2->ptr.p_double[i+1]), "RBFGridCalc3V: X2 is not ordered by ascending", _state); } rbfgridcalc3vx(s, x0, n0, x1, n1, x2, n2, &dummy, ae_false, y, _state); ae_frame_leave(_state); } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. *************************************************************************/ void _pexec_rbfgridcalc3v(rbfmodel* s, /* Real */ ae_vector* x0, ae_int_t n0, /* Real */ ae_vector* x1, ae_int_t n1, /* Real */ ae_vector* x2, ae_int_t n2, /* Real */ ae_vector* y, ae_state *_state) { rbfgridcalc3v(s,x0,n0,x1,n1,x2,n2,y, _state); } /************************************************************************* This function calculates values of the RBF model at some subset of regular grid: * grid has N0*N1*N2 points, with Point[I,J,K] = (X0[I], X1[J], X2[K]) * only values at some subset of this grid are required Vector-valued RBF models are supported. This function returns 0.0 when: * model is not initialized * NX<>3 FOR USERS OF COMMERCIAL EDITION: ! Commercial version of ALGLIB includes multicore support (C++ and C# ! computational cores) for this function. ! ! Parallel processing gives close-to-linear speedup on multicore systems, ! assuming that problem is large enough to be divided between cores. ! You should solve at least 50x50x50 problem to get nearly-linear speed- ! up (high efficiency). ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! 
* call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! This note is given for users of commercial edition; if you use GPL ! edition, you still will be able to call smp-version of this function, ! but all computations will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. INPUT PARAMETERS: S - RBF model, used in read-only mode, can be shared between multiple invocations of this function from multiple threads. X0 - array of grid nodes, first coordinates, array[N0]. Must be ordered by ascending. Exception is generated if the array is not correctly ordered. N0 - grid size (number of nodes) in the first dimension X1 - array of grid nodes, second coordinates, array[N1] Must be ordered by ascending. Exception is generated if the array is not correctly ordered. N1 - grid size (number of nodes) in the second dimension X2 - array of grid nodes, third coordinates, array[N2] Must be ordered by ascending. Exception is generated if the array is not correctly ordered. N2 - grid size (number of nodes) in the third dimension FlagY - array[N0*N1*N2]: * Y[I0+I1*N0+I2*N0*N1] corresponds to node (X0[I0],X1[I1],X2[I2]) * it is a "bitmap" array which contains False for nodes which are NOT calculated, and True for nodes which are required. OUTPUT PARAMETERS: Y - function values, array[NY*N0*N1*N2], where NY is a number of "output" vector values (this function supports vector- valued RBF models): * Y[K+NY*(I0+I1*N0+I2*N0*N1)]=F_k(X0[I0],X1[I1],X2[I2]), for K=0...NY-1, I0=0...N0-1, I1=0...N1-1, I2=0...N2-1. * elements of Y[] which correspond to FlagY[]=True are loaded by model values (which may be exactly zero for some nodes). * elements of Y[] which correspond to FlagY[]=False MAY be initialized by zeros OR may be calculated. This function processes grid as a hierarchy of nested blocks and micro-rows. If just one element of micro-row is required, entire micro-row (up to 8 nodes in the current version, but no promises) is calculated. NOTE: this function supports weakly ordered grid nodes, i.e. you may have X[i]=X[i+1] for some i. It does not provide you any performance benefits due to duplication of points, just convenience and flexibility. NOTE: this function is re-entrant, i.e. you may use same rbfmodel structure in multiple threads calling this function for different grids. 
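The three-dimensional case follows the same pattern (editorial sketch, not
taken from the ALGLIB sources; it assumes the C++ interface of this manual
and an already built model with NX=3):

    alglib::real_1d_array x0 = "[0.0, 1.0]";        // N0=2
    alglib::real_1d_array x1 = "[0.0, 1.0]";        // N1=2
    alglib::real_1d_array x2 = "[0.0, 0.5, 1.0]";   // N2=3
    alglib::boolean_1d_array flag;
    flag.setlength(2*2*3);
    for(alglib::ae_int_t k=0; k<flag.length(); k++)
        flag[k] = (k%2==0);                         // request every other node
    alglib::real_1d_array vals;
    alglib::rbfgridcalc3vsubset(model, x0, 2, x1, 2, x2, 3, flag, vals);
    // for NY=1: vals[i0+i1*2+i2*4] is the value at (x0[i0], x1[i1], x2[i2])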
-- ALGLIB -- Copyright 04.03.2016 by Bochkanov Sergey *************************************************************************/ void rbfgridcalc3vsubset(rbfmodel* s, /* Real */ ae_vector* x0, ae_int_t n0, /* Real */ ae_vector* x1, ae_int_t n1, /* Real */ ae_vector* x2, ae_int_t n2, /* Boolean */ ae_vector* flagy, /* Real */ ae_vector* y, ae_state *_state) { ae_int_t i; ae_vector_clear(y); ae_assert(n0>0, "RBFGridCalc3VSubset: invalid value for N0 (N0<=0)!", _state); ae_assert(n1>0, "RBFGridCalc3VSubset: invalid value for N1 (N1<=0)!", _state); ae_assert(n2>0, "RBFGridCalc3VSubset: invalid value for N2 (N2<=0)!", _state); ae_assert(x0->cnt>=n0, "RBFGridCalc3VSubset: Length(X0)cnt>=n1, "RBFGridCalc3VSubset: Length(X1)cnt>=n2, "RBFGridCalc3VSubset: Length(X2)cnt>=n0*n1*n2, "RBFGridCalc3VSubset: Length(FlagY)ptr.p_double[i],x0->ptr.p_double[i+1]), "RBFGridCalc3VSubset: X0 is not ordered by ascending", _state); } for(i=0; i<=n1-2; i++) { ae_assert(ae_fp_less_eq(x1->ptr.p_double[i],x1->ptr.p_double[i+1]), "RBFGridCalc3VSubset: X1 is not ordered by ascending", _state); } for(i=0; i<=n2-2; i++) { ae_assert(ae_fp_less_eq(x2->ptr.p_double[i],x2->ptr.p_double[i+1]), "RBFGridCalc3VSubset: X2 is not ordered by ascending", _state); } rbfgridcalc3vx(s, x0, n0, x1, n1, x2, n2, flagy, ae_true, y, _state); } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. *************************************************************************/ void _pexec_rbfgridcalc3vsubset(rbfmodel* s, /* Real */ ae_vector* x0, ae_int_t n0, /* Real */ ae_vector* x1, ae_int_t n1, /* Real */ ae_vector* x2, ae_int_t n2, /* Boolean */ ae_vector* flagy, /* Real */ ae_vector* y, ae_state *_state) { rbfgridcalc3vsubset(s,x0,n0,x1,n1,x2,n2,flagy,y, _state); } /************************************************************************* This function, depending on SparseY, acts as RBFGridCalc2V (SparseY=False) or RBFGridCalc2VSubset (SparseY=True) function. 
See comments for these functions for more information -- ALGLIB -- Copyright 04.03.2016 by Bochkanov Sergey *************************************************************************/ void rbfgridcalc2vx(rbfmodel* s, /* Real */ ae_vector* x0, ae_int_t n0, /* Real */ ae_vector* x1, ae_int_t n1, /* Boolean */ ae_vector* flagy, ae_bool sparsey, /* Real */ ae_vector* y, ae_state *_state) { ae_frame _frame_block; ae_int_t nx; ae_int_t ny; ae_int_t ylen; hqrndstate rs; ae_vector dummyx2; ae_vector dummyx3; ae_int_t i; ae_int_t j; ae_int_t k; ae_int_t l; ae_vector tx; ae_vector ty; ae_int_t dstoffs; rbfcalcbuffer calcbuf; ae_frame_make(_state, &_frame_block); _hqrndstate_init(&rs, _state); ae_vector_init(&dummyx2, 0, DT_REAL, _state); ae_vector_init(&dummyx3, 0, DT_REAL, _state); ae_vector_init(&tx, 0, DT_REAL, _state); ae_vector_init(&ty, 0, DT_REAL, _state); _rbfcalcbuffer_init(&calcbuf, _state); ae_assert(n0>0, "RBFGridCalc2VX: invalid value for N0 (N0<=0)!", _state); ae_assert(n1>0, "RBFGridCalc2VX: invalid value for N1 (N1<=0)!", _state); ae_assert(x0->cnt>=n0, "RBFGridCalc2VX: Length(X0)cnt>=n1, "RBFGridCalc2VX: Length(X1)ptr.p_double[i],x0->ptr.p_double[i+1]), "RBFGridCalc2VX: X0 is not ordered by ascending", _state); } for(i=0; i<=n1-2; i++) { ae_assert(ae_fp_less_eq(x1->ptr.p_double[i],x1->ptr.p_double[i+1]), "RBFGridCalc2VX: X1 is not ordered by ascending", _state); } /* * Prepare local variables */ nx = s->nx; ny = s->ny; hqrndseed(325, 46345, &rs, _state); /* * Prepare output array */ ylen = ny*n0*n1; ae_vector_set_length(y, ylen, _state); for(i=0; i<=ylen-1; i++) { y->ptr.p_double[i] = (double)(0); } if( s->nx!=2 ) { ae_frame_leave(_state); return; } /* * Process V2 model */ if( s->modelversion==2 ) { ae_vector_set_length(&dummyx2, 1, _state); dummyx2.ptr.p_double[0] = (double)(0); ae_vector_set_length(&dummyx3, 1, _state); dummyx3.ptr.p_double[0] = (double)(0); rbfv2gridcalcvx(&s->model2, x0, n0, x1, n1, &dummyx2, 1, &dummyx3, 1, flagy, sparsey, y, _state); ae_frame_leave(_state); return; } /* * Reference code for V1 models */ if( s->modelversion==1 ) { ae_vector_set_length(&tx, nx, _state); rbfcreatecalcbuffer(s, &calcbuf, _state); for(i=0; i<=n0-1; i++) { for(j=0; j<=n1-1; j++) { k = i+j*n0; dstoffs = ny*k; if( sparsey&&!flagy->ptr.p_bool[k] ) { for(l=0; l<=ny-1; l++) { y->ptr.p_double[l+dstoffs] = (double)(0); } continue; } tx.ptr.p_double[0] = x0->ptr.p_double[i]; tx.ptr.p_double[1] = x1->ptr.p_double[j]; rbftscalcbuf(s, &calcbuf, &tx, &ty, _state); for(l=0; l<=ny-1; l++) { y->ptr.p_double[l+dstoffs] = ty.ptr.p_double[l]; } } } ae_frame_leave(_state); return; } /* * Unknown model */ ae_assert(ae_false, "RBFGradCalc3VX: integrity check failed", _state); ae_frame_leave(_state); } /************************************************************************* This function, depending on SparseY, acts as RBFGridCalc3V (SparseY=False) or RBFGridCalc3VSubset (SparseY=True) function. 
See comments for these functions for more information -- ALGLIB -- Copyright 04.03.2016 by Bochkanov Sergey *************************************************************************/ void rbfgridcalc3vx(rbfmodel* s, /* Real */ ae_vector* x0, ae_int_t n0, /* Real */ ae_vector* x1, ae_int_t n1, /* Real */ ae_vector* x2, ae_int_t n2, /* Boolean */ ae_vector* flagy, ae_bool sparsey, /* Real */ ae_vector* y, ae_state *_state) { ae_frame _frame_block; ae_int_t i; ae_int_t ylen; ae_int_t nx; ae_int_t ny; double rmax; ae_vector blocks0; ae_vector blocks1; ae_vector blocks2; ae_int_t blockscnt0; ae_int_t blockscnt1; ae_int_t blockscnt2; double blockwidth; double searchradius; double avgfuncpernode; ae_int_t ntrials; ae_int_t maxblocksize; gridcalc3v1buf bufseedv1; ae_shared_pool bufpool; hqrndstate rs; ae_vector dummyx3; ae_frame_make(_state, &_frame_block); ae_vector_init(&blocks0, 0, DT_INT, _state); ae_vector_init(&blocks1, 0, DT_INT, _state); ae_vector_init(&blocks2, 0, DT_INT, _state); _gridcalc3v1buf_init(&bufseedv1, _state); ae_shared_pool_init(&bufpool, _state); _hqrndstate_init(&rs, _state); ae_vector_init(&dummyx3, 0, DT_REAL, _state); ae_assert(n0>0, "RBFGridCalc3V: invalid value for N0 (N0<=0)!", _state); ae_assert(n1>0, "RBFGridCalc3V: invalid value for N1 (N1<=0)!", _state); ae_assert(n2>0, "RBFGridCalc3V: invalid value for N2 (N2<=0)!", _state); ae_assert(x0->cnt>=n0, "RBFGridCalc3V: Length(X0)cnt>=n1, "RBFGridCalc3V: Length(X1)cnt>=n2, "RBFGridCalc3V: Length(X2)ptr.p_double[i],x0->ptr.p_double[i+1]), "RBFGridCalc3V: X0 is not ordered by ascending", _state); } for(i=0; i<=n1-2; i++) { ae_assert(ae_fp_less_eq(x1->ptr.p_double[i],x1->ptr.p_double[i+1]), "RBFGridCalc3V: X1 is not ordered by ascending", _state); } for(i=0; i<=n2-2; i++) { ae_assert(ae_fp_less_eq(x2->ptr.p_double[i],x2->ptr.p_double[i+1]), "RBFGridCalc3V: X2 is not ordered by ascending", _state); } /* * Prepare local variables */ nx = s->nx; ny = s->ny; hqrndseed(325, 46345, &rs, _state); /* * Prepare output array */ ylen = ny*n0*n1*n2; ae_vector_set_length(y, ylen, _state); for(i=0; i<=ylen-1; i++) { y->ptr.p_double[i] = (double)(0); } if( s->nx!=3 ) { ae_frame_leave(_state); return; } /* * Process V1 model */ if( s->modelversion==1 ) { /* * Fast exit for models without centers */ if( s->model1.nc==0 ) { ae_frame_leave(_state); return; } /* * Prepare seed, create shared pool of temporary buffers */ ae_vector_set_length(&bufseedv1.cx, nx, _state); ae_vector_set_length(&bufseedv1.tx, nx, _state); ae_vector_set_length(&bufseedv1.ty, ny, _state); ae_vector_set_length(&bufseedv1.expbuf0, n0, _state); ae_vector_set_length(&bufseedv1.expbuf1, n1, _state); ae_vector_set_length(&bufseedv1.expbuf2, n2, _state); kdtreecreaterequestbuffer(&s->model1.tree, &bufseedv1.requestbuf, _state); ae_shared_pool_set_seed(&bufpool, &bufseedv1, sizeof(bufseedv1), _gridcalc3v1buf_init, _gridcalc3v1buf_init_copy, _gridcalc3v1buf_destroy, _state); /* * Analyze input grid: * * analyze average number of basis functions per grid node * * partition grid in into blocks */ rmax = s->model1.rmax; blockwidth = 2*rmax; maxblocksize = 8; searchradius = rmax*rbf_rbffarradius+0.5*ae_sqrt((double)(s->nx), _state)*blockwidth; ntrials = 100; avgfuncpernode = 0.0; for(i=0; i<=ntrials-1; i++) { bufseedv1.tx.ptr.p_double[0] = x0->ptr.p_double[hqrnduniformi(&rs, n0, _state)]; bufseedv1.tx.ptr.p_double[1] = x1->ptr.p_double[hqrnduniformi(&rs, n1, _state)]; bufseedv1.tx.ptr.p_double[2] = x2->ptr.p_double[hqrnduniformi(&rs, n2, _state)]; avgfuncpernode = 
avgfuncpernode+(double)kdtreetsqueryrnn(&s->model1.tree, &bufseedv1.requestbuf, &bufseedv1.tx, searchradius, ae_true, _state)/(double)ntrials; } ae_vector_set_length(&blocks0, n0+1, _state); blockscnt0 = 0; blocks0.ptr.p_int[0] = 0; for(i=1; i<=n0-1; i++) { if( ae_fp_greater(x0->ptr.p_double[i]-x0->ptr.p_double[blocks0.ptr.p_int[blockscnt0]],blockwidth)||i-blocks0.ptr.p_int[blockscnt0]>=maxblocksize ) { inc(&blockscnt0, _state); blocks0.ptr.p_int[blockscnt0] = i; } } inc(&blockscnt0, _state); blocks0.ptr.p_int[blockscnt0] = n0; ae_vector_set_length(&blocks1, n1+1, _state); blockscnt1 = 0; blocks1.ptr.p_int[0] = 0; for(i=1; i<=n1-1; i++) { if( ae_fp_greater(x1->ptr.p_double[i]-x1->ptr.p_double[blocks1.ptr.p_int[blockscnt1]],blockwidth)||i-blocks1.ptr.p_int[blockscnt1]>=maxblocksize ) { inc(&blockscnt1, _state); blocks1.ptr.p_int[blockscnt1] = i; } } inc(&blockscnt1, _state); blocks1.ptr.p_int[blockscnt1] = n1; ae_vector_set_length(&blocks2, n2+1, _state); blockscnt2 = 0; blocks2.ptr.p_int[0] = 0; for(i=1; i<=n2-1; i++) { if( ae_fp_greater(x2->ptr.p_double[i]-x2->ptr.p_double[blocks2.ptr.p_int[blockscnt2]],blockwidth)||i-blocks2.ptr.p_int[blockscnt2]>=maxblocksize ) { inc(&blockscnt2, _state); blocks2.ptr.p_int[blockscnt2] = i; } } inc(&blockscnt2, _state); blocks2.ptr.p_int[blockscnt2] = n2; /* * Perform calculation in multithreaded mode */ rbfv1gridcalc3vrec(&s->model1, x0, n0, x1, n1, x2, n2, &blocks0, 0, blockscnt0, &blocks1, 0, blockscnt1, &blocks2, 0, blockscnt2, flagy, sparsey, searchradius, avgfuncpernode, &bufpool, y, _state); /* * Done */ ae_frame_leave(_state); return; } /* * Process V2 model */ if( s->modelversion==2 ) { ae_vector_set_length(&dummyx3, 1, _state); dummyx3.ptr.p_double[0] = (double)(0); rbfv2gridcalcvx(&s->model2, x0, n0, x1, n1, x2, n2, &dummyx3, 1, flagy, sparsey, y, _state); ae_frame_leave(_state); return; } /* * Unknown model */ ae_assert(ae_false, "RBFGradCalc3VX: integrity check failed", _state); ae_frame_leave(_state); } /************************************************************************* This function "unpacks" RBF model by extracting its coefficients. INPUT PARAMETERS: S - RBF model OUTPUT PARAMETERS: NX - dimensionality of argument NY - dimensionality of the target function XWR - model information, array[NC,NX+NY+1]. One row of the array corresponds to one basis function: * first NX columns - coordinates of the center * next NY columns - weights, one per dimension of the function being modelled For ModelVersion=1: * last column - radius, same for all dimensions of the function being modelled For ModelVersion=2: * last NX columns - radii, one per dimension NC - number of the centers V - polynomial term , array[NY,NX+1]. One row per one dimension of the function being modelled. First NX elements are linear coefficients, V[NX] is equal to the constant part. ModelVersion-version of the RBF model: * 1 - for models created by QNN and RBF-ML algorithms, compatible with ALGLIB 3.10 or earlier. 
* 2 - for models created by HierarchicalRBF, requires ALGLIB 3.11 or later -- ALGLIB -- Copyright 13.12.2011 by Bochkanov Sergey *************************************************************************/ void rbfunpack(rbfmodel* s, ae_int_t* nx, ae_int_t* ny, /* Real */ ae_matrix* xwr, ae_int_t* nc, /* Real */ ae_matrix* v, ae_int_t* modelversion, ae_state *_state) { *nx = 0; *ny = 0; ae_matrix_clear(xwr); *nc = 0; ae_matrix_clear(v); *modelversion = 0; if( s->modelversion==1 ) { *modelversion = 1; rbfv1unpack(&s->model1, nx, ny, xwr, nc, v, _state); return; } if( s->modelversion==2 ) { *modelversion = 2; rbfv2unpack(&s->model2, nx, ny, xwr, nc, v, _state); return; } ae_assert(ae_false, "RBFUnpack: integrity check failure", _state); } /************************************************************************* This function returns model version. INPUT PARAMETERS: S - RBF model RESULT: * 1 - for models created by QNN and RBF-ML algorithms, compatible with ALGLIB 3.10 or earlier. * 2 - for models created by HierarchicalRBF, requires ALGLIB 3.11 or later -- ALGLIB -- Copyright 06.07.2016 by Bochkanov Sergey *************************************************************************/ ae_int_t rbfgetmodelversion(rbfmodel* s, ae_state *_state) { ae_int_t result; result = s->modelversion; return result; } /************************************************************************* Serializer: allocation -- ALGLIB -- Copyright 02.02.2012 by Bochkanov Sergey *************************************************************************/ void rbfalloc(ae_serializer* s, rbfmodel* model, ae_state *_state) { /* * Header */ ae_serializer_alloc_entry(s); /* * V1 model */ if( model->modelversion==1 ) { /* * Header */ ae_serializer_alloc_entry(s); rbfv1alloc(s, &model->model1, _state); return; } /* * V2 model */ if( model->modelversion==2 ) { /* * Header */ ae_serializer_alloc_entry(s); rbfv2alloc(s, &model->model2, _state); return; } ae_assert(ae_false, "Assertion failed", _state); } /************************************************************************* Serializer: serialization -- ALGLIB -- Copyright 02.02.2012 by Bochkanov Sergey *************************************************************************/ void rbfserialize(ae_serializer* s, rbfmodel* model, ae_state *_state) { /* * Header */ ae_serializer_serialize_int(s, getrbfserializationcode(_state), _state); /* * V1 model */ if( model->modelversion==1 ) { ae_serializer_serialize_int(s, rbf_rbffirstversion, _state); rbfv1serialize(s, &model->model1, _state); return; } /* * V2 model */ if( model->modelversion==2 ) { /* * Header */ ae_serializer_serialize_int(s, rbf_rbfversion2, _state); rbfv2serialize(s, &model->model2, _state); return; } ae_assert(ae_false, "Assertion failed", _state); } /************************************************************************* Serializer: unserialization -- ALGLIB -- Copyright 02.02.2012 by Bochkanov Sergey *************************************************************************/ void rbfunserialize(ae_serializer* s, rbfmodel* model, ae_state *_state) { ae_int_t i0; ae_int_t i1; _rbfmodel_clear(model); rbf_rbfpreparenonserializablefields(model, _state); /* * Header */ ae_serializer_unserialize_int(s, &i0, _state); ae_assert(i0==getrbfserializationcode(_state), "RBFUnserialize: stream header corrupted", _state); ae_serializer_unserialize_int(s, &i1, _state); ae_assert(i1==rbf_rbffirstversion||i1==rbf_rbfversion2, "RBFUnserialize: stream header corrupted", _state); /* * V1 model */ if( i1==rbf_rbffirstversion ) { 
rbfv1unserialize(s, &model->model1, _state); model->modelversion = 1; model->ny = model->model1.ny; model->nx = model->model1.nx; rbf_initializev2(model->nx, model->ny, &model->model2, _state); return; } /* * V2 model */ if( i1==rbf_rbfversion2 ) { rbfv2unserialize(s, &model->model2, _state); model->modelversion = 2; model->ny = model->model2.ny; model->nx = model->model2.nx; rbf_initializev1(model->nx, model->ny, &model->model1, _state); return; } ae_assert(ae_false, "Assertion failed", _state); } /************************************************************************* Initialize empty model -- ALGLIB -- Copyright 12.05.2016 by Bochkanov Sergey *************************************************************************/ static void rbf_rbfpreparenonserializablefields(rbfmodel* s, ae_state *_state) { s->n = 0; s->hasscale = ae_false; s->radvalue = (double)(1); s->radzvalue = (double)(5); s->nlayers = 0; s->lambdav = (double)(0); s->aterm = 1; s->algorithmtype = 0; s->epsort = rbf_eps; s->epserr = rbf_eps; s->maxits = 0; s->nnmaxits = 100; } /************************************************************************* Initialize V1 model (skip initialization for NX=1 or NX>3) -- ALGLIB -- Copyright 12.05.2016 by Bochkanov Sergey *************************************************************************/ static void rbf_initializev1(ae_int_t nx, ae_int_t ny, rbfv1model* s, ae_state *_state) { _rbfv1model_clear(s); if( nx==2||nx==3 ) { rbfv1create(nx, ny, s, _state); } } /************************************************************************* Initialize V2 model -- ALGLIB -- Copyright 12.05.2016 by Bochkanov Sergey *************************************************************************/ static void rbf_initializev2(ae_int_t nx, ae_int_t ny, rbfv2model* s, ae_state *_state) { _rbfv2model_clear(s); rbfv2create(nx, ny, s, _state); } /************************************************************************* Cleans report fields -- ALGLIB -- Copyright 16.06.2016 by Bochkanov Sergey *************************************************************************/ static void rbf_clearreportfields(rbfreport* rep, ae_state *_state) { rep->rmserror = _state->v_nan; rep->maxerror = _state->v_nan; rep->arows = 0; rep->acols = 0; rep->annz = 0; rep->iterationscount = 0; rep->nmv = 0; rep->terminationtype = 0; } void _rbfcalcbuffer_init(void* _p, ae_state *_state) { rbfcalcbuffer *p = (rbfcalcbuffer*)_p; ae_touch_ptr((void*)p); _rbfv1calcbuffer_init(&p->bufv1, _state); _rbfv2calcbuffer_init(&p->bufv2, _state); } void _rbfcalcbuffer_init_copy(void* _dst, void* _src, ae_state *_state) { rbfcalcbuffer *dst = (rbfcalcbuffer*)_dst; rbfcalcbuffer *src = (rbfcalcbuffer*)_src; dst->modelversion = src->modelversion; _rbfv1calcbuffer_init_copy(&dst->bufv1, &src->bufv1, _state); _rbfv2calcbuffer_init_copy(&dst->bufv2, &src->bufv2, _state); } void _rbfcalcbuffer_clear(void* _p) { rbfcalcbuffer *p = (rbfcalcbuffer*)_p; ae_touch_ptr((void*)p); _rbfv1calcbuffer_clear(&p->bufv1); _rbfv2calcbuffer_clear(&p->bufv2); } void _rbfcalcbuffer_destroy(void* _p) { rbfcalcbuffer *p = (rbfcalcbuffer*)_p; ae_touch_ptr((void*)p); _rbfv1calcbuffer_destroy(&p->bufv1); _rbfv2calcbuffer_destroy(&p->bufv2); } void _rbfmodel_init(void* _p, ae_state *_state) { rbfmodel *p = (rbfmodel*)_p; ae_touch_ptr((void*)p); _rbfv1model_init(&p->model1, _state); _rbfv2model_init(&p->model2, _state); ae_matrix_init(&p->x, 0, 0, DT_REAL, _state); ae_matrix_init(&p->y, 0, 0, DT_REAL, _state); ae_vector_init(&p->s, 0, DT_REAL, _state); } void 
_rbfmodel_init_copy(void* _dst, void* _src, ae_state *_state) { rbfmodel *dst = (rbfmodel*)_dst; rbfmodel *src = (rbfmodel*)_src; dst->nx = src->nx; dst->ny = src->ny; dst->modelversion = src->modelversion; _rbfv1model_init_copy(&dst->model1, &src->model1, _state); _rbfv2model_init_copy(&dst->model2, &src->model2, _state); dst->lambdav = src->lambdav; dst->radvalue = src->radvalue; dst->radzvalue = src->radzvalue; dst->nlayers = src->nlayers; dst->aterm = src->aterm; dst->algorithmtype = src->algorithmtype; dst->epsort = src->epsort; dst->epserr = src->epserr; dst->maxits = src->maxits; dst->nnmaxits = src->nnmaxits; dst->n = src->n; ae_matrix_init_copy(&dst->x, &src->x, _state); ae_matrix_init_copy(&dst->y, &src->y, _state); dst->hasscale = src->hasscale; ae_vector_init_copy(&dst->s, &src->s, _state); } void _rbfmodel_clear(void* _p) { rbfmodel *p = (rbfmodel*)_p; ae_touch_ptr((void*)p); _rbfv1model_clear(&p->model1); _rbfv2model_clear(&p->model2); ae_matrix_clear(&p->x); ae_matrix_clear(&p->y); ae_vector_clear(&p->s); } void _rbfmodel_destroy(void* _p) { rbfmodel *p = (rbfmodel*)_p; ae_touch_ptr((void*)p); _rbfv1model_destroy(&p->model1); _rbfv2model_destroy(&p->model2); ae_matrix_destroy(&p->x); ae_matrix_destroy(&p->y); ae_vector_destroy(&p->s); } void _rbfreport_init(void* _p, ae_state *_state) { rbfreport *p = (rbfreport*)_p; ae_touch_ptr((void*)p); } void _rbfreport_init_copy(void* _dst, void* _src, ae_state *_state) { rbfreport *dst = (rbfreport*)_dst; rbfreport *src = (rbfreport*)_src; dst->rmserror = src->rmserror; dst->maxerror = src->maxerror; dst->arows = src->arows; dst->acols = src->acols; dst->annz = src->annz; dst->iterationscount = src->iterationscount; dst->nmv = src->nmv; dst->terminationtype = src->terminationtype; } void _rbfreport_clear(void* _p) { rbfreport *p = (rbfreport*)_p; ae_touch_ptr((void*)p); } void _rbfreport_destroy(void* _p) { rbfreport *p = (rbfreport*)_p; ae_touch_ptr((void*)p); } } cpp/src/alglibinternal.h0000755000175000017500000010767013105126765015171 0ustar sergeysergey/************************************************************************* ALGLIB 3.11.0 (source code generated 2017-05-11) Copyright (c) Sergey Bochkanov (ALGLIB project). >>> SOURCE LICENSE >>> This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation (www.fsf.org); either version 2 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. 
A copy of the GNU General Public License is available at http://www.fsf.org/licensing/licenses >>> END OF LICENSE >>> *************************************************************************/ #ifndef _alglibinternal_pkg_h #define _alglibinternal_pkg_h #include "ap.h" ///////////////////////////////////////////////////////////////////////// // // THIS SECTION CONTAINS COMPUTATIONAL CORE DECLARATIONS (DATATYPES) // ///////////////////////////////////////////////////////////////////////// namespace alglib_impl { typedef struct { ae_vector ba0; ae_vector ia0; ae_vector ia1; ae_vector ia2; ae_vector ia3; ae_vector ra0; ae_vector ra1; ae_vector ra2; ae_vector ra3; ae_matrix rm0; ae_matrix rm1; } apbuffers; typedef struct { ae_bool val; } sboolean; typedef struct { ae_vector val; } sbooleanarray; typedef struct { ae_int_t val; } sinteger; typedef struct { ae_vector val; } sintegerarray; typedef struct { double val; } sreal; typedef struct { ae_vector val; } srealarray; typedef struct { ae_complex val; } scomplex; typedef struct { ae_vector val; } scomplexarray; typedef struct { ae_bool brackt; ae_bool stage1; ae_int_t infoc; double dg; double dgm; double dginit; double dgtest; double dgx; double dgxm; double dgy; double dgym; double finit; double ftest1; double fm; double fx; double fxm; double fy; double fym; double stx; double sty; double stmin; double stmax; double width; double width1; double xtrapf; } linminstate; typedef struct { ae_bool needf; ae_vector x; double f; ae_int_t n; ae_vector xbase; ae_vector s; double stplen; double fcur; double stpmax; ae_int_t fmax; ae_int_t nfev; ae_int_t info; rcommstate rstate; } armijostate; typedef struct { ae_int_t chunksize; ae_int_t ntotal; ae_int_t nin; ae_int_t nout; ae_int_t wcount; ae_vector batch4buf; ae_vector hpcbuf; ae_matrix xy; ae_matrix xy2; ae_vector xyrow; ae_vector x; ae_vector y; ae_vector desiredy; double e; ae_vector g; ae_vector tmp0; } mlpbuffers; typedef struct { ae_matrix entries; ae_vector buffer; ae_vector precr; ae_vector preci; ae_shared_pool bluesteinpool; } fasttransformplan; } ///////////////////////////////////////////////////////////////////////// // // THIS SECTION CONTAINS C++ INTERFACE // ///////////////////////////////////////////////////////////////////////// namespace alglib { } ///////////////////////////////////////////////////////////////////////// // // THIS SECTION CONTAINS COMPUTATIONAL CORE DECLARATIONS (FUNCTIONS) // ///////////////////////////////////////////////////////////////////////// namespace alglib_impl { ae_int_t getrdfserializationcode(ae_state *_state); ae_int_t getkdtreeserializationcode(ae_state *_state); ae_int_t getmlpserializationcode(ae_state *_state); ae_int_t getmlpeserializationcode(ae_state *_state); ae_int_t getrbfserializationcode(ae_state *_state); ae_bool seterrorflag(ae_bool* flag, ae_bool cond, ae_state *_state); ae_bool seterrorflagdiff(ae_bool* flag, double val, double refval, double tol, double s, ae_state *_state); void touchint(ae_int_t* a, ae_state *_state); void touchreal(double* a, ae_state *_state); double coalesce(double a, double b, ae_state *_state); ae_int_t coalescei(ae_int_t a, ae_int_t b, ae_state *_state); double inttoreal(ae_int_t a, ae_state *_state); double logbase2(double x, ae_state *_state); ae_bool approxequal(double a, double b, double tol, ae_state *_state); ae_bool approxequalrel(double a, double b, double tol, ae_state *_state); void taskgenint1d(double a, double b, ae_int_t n, /* Real */ ae_vector* x, /* Real */ ae_vector* y, ae_state *_state); void 
taskgenint1dequidist(double a, double b, ae_int_t n, /* Real */ ae_vector* x, /* Real */ ae_vector* y, ae_state *_state); void taskgenint1dcheb1(double a, double b, ae_int_t n, /* Real */ ae_vector* x, /* Real */ ae_vector* y, ae_state *_state); void taskgenint1dcheb2(double a, double b, ae_int_t n, /* Real */ ae_vector* x, /* Real */ ae_vector* y, ae_state *_state); ae_bool aredistinct(/* Real */ ae_vector* x, ae_int_t n, ae_state *_state); ae_bool aresameboolean(ae_bool v1, ae_bool v2, ae_state *_state); void bvectorsetlengthatleast(/* Boolean */ ae_vector* x, ae_int_t n, ae_state *_state); void ivectorsetlengthatleast(/* Integer */ ae_vector* x, ae_int_t n, ae_state *_state); void rvectorsetlengthatleast(/* Real */ ae_vector* x, ae_int_t n, ae_state *_state); void rmatrixsetlengthatleast(/* Real */ ae_matrix* x, ae_int_t m, ae_int_t n, ae_state *_state); void ivectorresize(/* Integer */ ae_vector* x, ae_int_t n, ae_state *_state); void rvectorresize(/* Real */ ae_vector* x, ae_int_t n, ae_state *_state); void rmatrixresize(/* Real */ ae_matrix* x, ae_int_t m, ae_int_t n, ae_state *_state); void imatrixresize(/* Integer */ ae_matrix* x, ae_int_t m, ae_int_t n, ae_state *_state); ae_bool isfinitevector(/* Real */ ae_vector* x, ae_int_t n, ae_state *_state); ae_bool isfinitecvector(/* Complex */ ae_vector* z, ae_int_t n, ae_state *_state); ae_bool apservisfinitematrix(/* Real */ ae_matrix* x, ae_int_t m, ae_int_t n, ae_state *_state); ae_bool apservisfinitecmatrix(/* Complex */ ae_matrix* x, ae_int_t m, ae_int_t n, ae_state *_state); ae_bool isfinitertrmatrix(/* Real */ ae_matrix* x, ae_int_t n, ae_bool isupper, ae_state *_state); ae_bool apservisfinitectrmatrix(/* Complex */ ae_matrix* x, ae_int_t n, ae_bool isupper, ae_state *_state); ae_bool apservisfiniteornanmatrix(/* Real */ ae_matrix* x, ae_int_t m, ae_int_t n, ae_state *_state); double safepythag2(double x, double y, ae_state *_state); double safepythag3(double x, double y, double z, ae_state *_state); ae_int_t saferdiv(double x, double y, double* r, ae_state *_state); double safeminposrv(double x, double y, double v, ae_state *_state); void apperiodicmap(double* x, double a, double b, double* k, ae_state *_state); double randomnormal(ae_state *_state); void randomunit(ae_int_t n, /* Real */ ae_vector* x, ae_state *_state); void swapi(ae_int_t* v0, ae_int_t* v1, ae_state *_state); void swapr(double* v0, double* v1, ae_state *_state); double maxreal3(double v0, double v1, double v2, ae_state *_state); void inc(ae_int_t* v, ae_state *_state); void dec(ae_int_t* v, ae_state *_state); void countdown(ae_int_t* v, ae_state *_state); double rmul2(double v0, double v1, ae_state *_state); double boundval(double x, double b1, double b2, ae_state *_state); void alloccomplex(ae_serializer* s, ae_complex v, ae_state *_state); void serializecomplex(ae_serializer* s, ae_complex v, ae_state *_state); ae_complex unserializecomplex(ae_serializer* s, ae_state *_state); void allocrealarray(ae_serializer* s, /* Real */ ae_vector* v, ae_int_t n, ae_state *_state); void serializerealarray(ae_serializer* s, /* Real */ ae_vector* v, ae_int_t n, ae_state *_state); void unserializerealarray(ae_serializer* s, /* Real */ ae_vector* v, ae_state *_state); void allocintegerarray(ae_serializer* s, /* Integer */ ae_vector* v, ae_int_t n, ae_state *_state); void serializeintegerarray(ae_serializer* s, /* Integer */ ae_vector* v, ae_int_t n, ae_state *_state); void unserializeintegerarray(ae_serializer* s, /* Integer */ ae_vector* v, ae_state *_state); void 
allocrealmatrix(ae_serializer* s, /* Real */ ae_matrix* v, ae_int_t n0, ae_int_t n1, ae_state *_state); void serializerealmatrix(ae_serializer* s, /* Real */ ae_matrix* v, ae_int_t n0, ae_int_t n1, ae_state *_state); void unserializerealmatrix(ae_serializer* s, /* Real */ ae_matrix* v, ae_state *_state); void copyintegerarray(/* Integer */ ae_vector* src, /* Integer */ ae_vector* dst, ae_state *_state); void copyrealarray(/* Real */ ae_vector* src, /* Real */ ae_vector* dst, ae_state *_state); void copyrealmatrix(/* Real */ ae_matrix* src, /* Real */ ae_matrix* dst, ae_state *_state); void unsetintegerarray(/* Integer */ ae_vector* a, ae_state *_state); void unsetrealarray(/* Real */ ae_vector* a, ae_state *_state); void unsetrealmatrix(/* Real */ ae_matrix* a, ae_state *_state); ae_int_t recsearch(/* Integer */ ae_vector* a, ae_int_t nrec, ae_int_t nheader, ae_int_t i0, ae_int_t i1, /* Integer */ ae_vector* b, ae_state *_state); void splitlengtheven(ae_int_t tasksize, ae_int_t* task0, ae_int_t* task1, ae_state *_state); void splitlength(ae_int_t tasksize, ae_int_t chunksize, ae_int_t* task0, ae_int_t* task1, ae_state *_state); ae_int_t chunkscount(ae_int_t tasksize, ae_int_t chunksize, ae_state *_state); void _apbuffers_init(void* _p, ae_state *_state); void _apbuffers_init_copy(void* _dst, void* _src, ae_state *_state); void _apbuffers_clear(void* _p); void _apbuffers_destroy(void* _p); void _sboolean_init(void* _p, ae_state *_state); void _sboolean_init_copy(void* _dst, void* _src, ae_state *_state); void _sboolean_clear(void* _p); void _sboolean_destroy(void* _p); void _sbooleanarray_init(void* _p, ae_state *_state); void _sbooleanarray_init_copy(void* _dst, void* _src, ae_state *_state); void _sbooleanarray_clear(void* _p); void _sbooleanarray_destroy(void* _p); void _sinteger_init(void* _p, ae_state *_state); void _sinteger_init_copy(void* _dst, void* _src, ae_state *_state); void _sinteger_clear(void* _p); void _sinteger_destroy(void* _p); void _sintegerarray_init(void* _p, ae_state *_state); void _sintegerarray_init_copy(void* _dst, void* _src, ae_state *_state); void _sintegerarray_clear(void* _p); void _sintegerarray_destroy(void* _p); void _sreal_init(void* _p, ae_state *_state); void _sreal_init_copy(void* _dst, void* _src, ae_state *_state); void _sreal_clear(void* _p); void _sreal_destroy(void* _p); void _srealarray_init(void* _p, ae_state *_state); void _srealarray_init_copy(void* _dst, void* _src, ae_state *_state); void _srealarray_clear(void* _p); void _srealarray_destroy(void* _p); void _scomplex_init(void* _p, ae_state *_state); void _scomplex_init_copy(void* _dst, void* _src, ae_state *_state); void _scomplex_clear(void* _p); void _scomplex_destroy(void* _p); void _scomplexarray_init(void* _p, ae_state *_state); void _scomplexarray_init_copy(void* _dst, void* _src, ae_state *_state); void _scomplexarray_clear(void* _p); void _scomplexarray_destroy(void* _p); void tagsort(/* Real */ ae_vector* a, ae_int_t n, /* Integer */ ae_vector* p1, /* Integer */ ae_vector* p2, ae_state *_state); void tagsortbuf(/* Real */ ae_vector* a, ae_int_t n, /* Integer */ ae_vector* p1, /* Integer */ ae_vector* p2, apbuffers* buf, ae_state *_state); void tagsortfasti(/* Real */ ae_vector* a, /* Integer */ ae_vector* b, /* Real */ ae_vector* bufa, /* Integer */ ae_vector* bufb, ae_int_t n, ae_state *_state); void tagsortfastr(/* Real */ ae_vector* a, /* Real */ ae_vector* b, /* Real */ ae_vector* bufa, /* Real */ ae_vector* bufb, ae_int_t n, ae_state *_state); void tagsortfast(/* Real */ 
ae_vector* a, /* Real */ ae_vector* bufa, ae_int_t n, ae_state *_state); void tagsortmiddleir(/* Integer */ ae_vector* a, /* Real */ ae_vector* b, ae_int_t offset, ae_int_t n, ae_state *_state); void tagheappushi(/* Real */ ae_vector* a, /* Integer */ ae_vector* b, ae_int_t* n, double va, ae_int_t vb, ae_state *_state); void tagheapreplacetopi(/* Real */ ae_vector* a, /* Integer */ ae_vector* b, ae_int_t n, double va, ae_int_t vb, ae_state *_state); void tagheappopi(/* Real */ ae_vector* a, /* Integer */ ae_vector* b, ae_int_t* n, ae_state *_state); ae_int_t lowerbound(/* Real */ ae_vector* a, ae_int_t n, double t, ae_state *_state); ae_int_t upperbound(/* Real */ ae_vector* a, ae_int_t n, double t, ae_state *_state); void generatereflection(/* Real */ ae_vector* x, ae_int_t n, double* tau, ae_state *_state); void applyreflectionfromtheleft(/* Real */ ae_matrix* c, double tau, /* Real */ ae_vector* v, ae_int_t m1, ae_int_t m2, ae_int_t n1, ae_int_t n2, /* Real */ ae_vector* work, ae_state *_state); void applyreflectionfromtheright(/* Real */ ae_matrix* c, double tau, /* Real */ ae_vector* v, ae_int_t m1, ae_int_t m2, ae_int_t n1, ae_int_t n2, /* Real */ ae_vector* work, ae_state *_state); void complexgeneratereflection(/* Complex */ ae_vector* x, ae_int_t n, ae_complex* tau, ae_state *_state); void complexapplyreflectionfromtheleft(/* Complex */ ae_matrix* c, ae_complex tau, /* Complex */ ae_vector* v, ae_int_t m1, ae_int_t m2, ae_int_t n1, ae_int_t n2, /* Complex */ ae_vector* work, ae_state *_state); void complexapplyreflectionfromtheright(/* Complex */ ae_matrix* c, ae_complex tau, /* Complex */ ae_vector* v, ae_int_t m1, ae_int_t m2, ae_int_t n1, ae_int_t n2, /* Complex */ ae_vector* work, ae_state *_state); ae_bool cmatrixrank1f(ae_int_t m, ae_int_t n, /* Complex */ ae_matrix* a, ae_int_t ia, ae_int_t ja, /* Complex */ ae_vector* u, ae_int_t iu, /* Complex */ ae_vector* v, ae_int_t iv, ae_state *_state); ae_bool rmatrixrank1f(ae_int_t m, ae_int_t n, /* Real */ ae_matrix* a, ae_int_t ia, ae_int_t ja, /* Real */ ae_vector* u, ae_int_t iu, /* Real */ ae_vector* v, ae_int_t iv, ae_state *_state); ae_bool cmatrixmvf(ae_int_t m, ae_int_t n, /* Complex */ ae_matrix* a, ae_int_t ia, ae_int_t ja, ae_int_t opa, /* Complex */ ae_vector* x, ae_int_t ix, /* Complex */ ae_vector* y, ae_int_t iy, ae_state *_state); ae_bool rmatrixmvf(ae_int_t m, ae_int_t n, /* Real */ ae_matrix* a, ae_int_t ia, ae_int_t ja, ae_int_t opa, /* Real */ ae_vector* x, ae_int_t ix, /* Real */ ae_vector* y, ae_int_t iy, ae_state *_state); ae_bool cmatrixrighttrsmf(ae_int_t m, ae_int_t n, /* Complex */ ae_matrix* a, ae_int_t i1, ae_int_t j1, ae_bool isupper, ae_bool isunit, ae_int_t optype, /* Complex */ ae_matrix* x, ae_int_t i2, ae_int_t j2, ae_state *_state); ae_bool cmatrixlefttrsmf(ae_int_t m, ae_int_t n, /* Complex */ ae_matrix* a, ae_int_t i1, ae_int_t j1, ae_bool isupper, ae_bool isunit, ae_int_t optype, /* Complex */ ae_matrix* x, ae_int_t i2, ae_int_t j2, ae_state *_state); ae_bool rmatrixrighttrsmf(ae_int_t m, ae_int_t n, /* Real */ ae_matrix* a, ae_int_t i1, ae_int_t j1, ae_bool isupper, ae_bool isunit, ae_int_t optype, /* Real */ ae_matrix* x, ae_int_t i2, ae_int_t j2, ae_state *_state); ae_bool rmatrixlefttrsmf(ae_int_t m, ae_int_t n, /* Real */ ae_matrix* a, ae_int_t i1, ae_int_t j1, ae_bool isupper, ae_bool isunit, ae_int_t optype, /* Real */ ae_matrix* x, ae_int_t i2, ae_int_t j2, ae_state *_state); ae_bool cmatrixherkf(ae_int_t n, ae_int_t k, double alpha, /* Complex */ ae_matrix* a, ae_int_t ia, ae_int_t 
ja, ae_int_t optypea, double beta, /* Complex */ ae_matrix* c, ae_int_t ic, ae_int_t jc, ae_bool isupper, ae_state *_state); ae_bool rmatrixsyrkf(ae_int_t n, ae_int_t k, double alpha, /* Real */ ae_matrix* a, ae_int_t ia, ae_int_t ja, ae_int_t optypea, double beta, /* Real */ ae_matrix* c, ae_int_t ic, ae_int_t jc, ae_bool isupper, ae_state *_state); ae_bool rmatrixgemmf(ae_int_t m, ae_int_t n, ae_int_t k, double alpha, /* Real */ ae_matrix* a, ae_int_t ia, ae_int_t ja, ae_int_t optypea, /* Real */ ae_matrix* b, ae_int_t ib, ae_int_t jb, ae_int_t optypeb, double beta, /* Real */ ae_matrix* c, ae_int_t ic, ae_int_t jc, ae_state *_state); ae_bool cmatrixgemmf(ae_int_t m, ae_int_t n, ae_int_t k, ae_complex alpha, /* Complex */ ae_matrix* a, ae_int_t ia, ae_int_t ja, ae_int_t optypea, /* Complex */ ae_matrix* b, ae_int_t ib, ae_int_t jb, ae_int_t optypeb, ae_complex beta, /* Complex */ ae_matrix* c, ae_int_t ic, ae_int_t jc, ae_state *_state); void cmatrixgemmk(ae_int_t m, ae_int_t n, ae_int_t k, ae_complex alpha, /* Complex */ ae_matrix* a, ae_int_t ia, ae_int_t ja, ae_int_t optypea, /* Complex */ ae_matrix* b, ae_int_t ib, ae_int_t jb, ae_int_t optypeb, ae_complex beta, /* Complex */ ae_matrix* c, ae_int_t ic, ae_int_t jc, ae_state *_state); void rmatrixgemmk(ae_int_t m, ae_int_t n, ae_int_t k, double alpha, /* Real */ ae_matrix* a, ae_int_t ia, ae_int_t ja, ae_int_t optypea, /* Real */ ae_matrix* b, ae_int_t ib, ae_int_t jb, ae_int_t optypeb, double beta, /* Real */ ae_matrix* c, ae_int_t ic, ae_int_t jc, ae_state *_state); void rmatrixgemmk44v00(ae_int_t m, ae_int_t n, ae_int_t k, double alpha, /* Real */ ae_matrix* a, ae_int_t ia, ae_int_t ja, /* Real */ ae_matrix* b, ae_int_t ib, ae_int_t jb, double beta, /* Real */ ae_matrix* c, ae_int_t ic, ae_int_t jc, ae_state *_state); void rmatrixgemmk44v01(ae_int_t m, ae_int_t n, ae_int_t k, double alpha, /* Real */ ae_matrix* a, ae_int_t ia, ae_int_t ja, /* Real */ ae_matrix* b, ae_int_t ib, ae_int_t jb, double beta, /* Real */ ae_matrix* c, ae_int_t ic, ae_int_t jc, ae_state *_state); void rmatrixgemmk44v10(ae_int_t m, ae_int_t n, ae_int_t k, double alpha, /* Real */ ae_matrix* a, ae_int_t ia, ae_int_t ja, /* Real */ ae_matrix* b, ae_int_t ib, ae_int_t jb, double beta, /* Real */ ae_matrix* c, ae_int_t ic, ae_int_t jc, ae_state *_state); void rmatrixgemmk44v11(ae_int_t m, ae_int_t n, ae_int_t k, double alpha, /* Real */ ae_matrix* a, ae_int_t ia, ae_int_t ja, /* Real */ ae_matrix* b, ae_int_t ib, ae_int_t jb, double beta, /* Real */ ae_matrix* c, ae_int_t ic, ae_int_t jc, ae_state *_state); ae_bool rmatrixsyrkmkl(ae_int_t n, ae_int_t k, double alpha, /* Real */ ae_matrix* a, ae_int_t ia, ae_int_t ja, ae_int_t optypea, double beta, /* Real */ ae_matrix* c, ae_int_t ic, ae_int_t jc, ae_bool isupper, ae_state *_state); ae_bool cmatrixherkmkl(ae_int_t n, ae_int_t k, double alpha, /* Complex */ ae_matrix* a, ae_int_t ia, ae_int_t ja, ae_int_t optypea, double beta, /* Complex */ ae_matrix* c, ae_int_t ic, ae_int_t jc, ae_bool isupper, ae_state *_state); ae_bool rmatrixgemmmkl(ae_int_t m, ae_int_t n, ae_int_t k, double alpha, /* Real */ ae_matrix* a, ae_int_t ia, ae_int_t ja, ae_int_t optypea, /* Real */ ae_matrix* b, ae_int_t ib, ae_int_t jb, ae_int_t optypeb, double beta, /* Real */ ae_matrix* c, ae_int_t ic, ae_int_t jc, ae_state *_state); ae_bool cmatrixgemmmkl(ae_int_t m, ae_int_t n, ae_int_t k, ae_complex alpha, /* Complex */ ae_matrix* a, ae_int_t ia, ae_int_t ja, ae_int_t optypea, /* Complex */ ae_matrix* b, ae_int_t ib, ae_int_t jb, ae_int_t 
optypeb, ae_complex beta, /* Complex */ ae_matrix* c, ae_int_t ic, ae_int_t jc, ae_state *_state); ae_bool cmatrixlefttrsmmkl(ae_int_t m, ae_int_t n, /* Complex */ ae_matrix* a, ae_int_t i1, ae_int_t j1, ae_bool isupper, ae_bool isunit, ae_int_t optype, /* Complex */ ae_matrix* x, ae_int_t i2, ae_int_t j2, ae_state *_state); ae_bool cmatrixrighttrsmmkl(ae_int_t m, ae_int_t n, /* Complex */ ae_matrix* a, ae_int_t i1, ae_int_t j1, ae_bool isupper, ae_bool isunit, ae_int_t optype, /* Complex */ ae_matrix* x, ae_int_t i2, ae_int_t j2, ae_state *_state); ae_bool rmatrixlefttrsmmkl(ae_int_t m, ae_int_t n, /* Real */ ae_matrix* a, ae_int_t i1, ae_int_t j1, ae_bool isupper, ae_bool isunit, ae_int_t optype, /* Real */ ae_matrix* x, ae_int_t i2, ae_int_t j2, ae_state *_state); ae_bool rmatrixrighttrsmmkl(ae_int_t m, ae_int_t n, /* Real */ ae_matrix* a, ae_int_t i1, ae_int_t j1, ae_bool isupper, ae_bool isunit, ae_int_t optype, /* Real */ ae_matrix* x, ae_int_t i2, ae_int_t j2, ae_state *_state); ae_bool spdmatrixcholeskymkl(/* Real */ ae_matrix* a, ae_int_t offs, ae_int_t n, ae_bool isupper, ae_bool* cholresult, ae_state *_state); ae_bool rmatrixplumkl(/* Real */ ae_matrix* a, ae_int_t offs, ae_int_t m, ae_int_t n, /* Integer */ ae_vector* pivots, ae_state *_state); ae_bool rmatrixbdmkl(/* Real */ ae_matrix* a, ae_int_t m, ae_int_t n, /* Real */ ae_vector* d, /* Real */ ae_vector* e, /* Real */ ae_vector* tauq, /* Real */ ae_vector* taup, ae_state *_state); ae_bool rmatrixbdmultiplybymkl(/* Real */ ae_matrix* qp, ae_int_t m, ae_int_t n, /* Real */ ae_vector* tauq, /* Real */ ae_vector* taup, /* Real */ ae_matrix* z, ae_int_t zrows, ae_int_t zcolumns, ae_bool byq, ae_bool fromtheright, ae_bool dotranspose, ae_state *_state); ae_bool rmatrixhessenbergmkl(/* Real */ ae_matrix* a, ae_int_t n, /* Real */ ae_vector* tau, ae_state *_state); ae_bool rmatrixhessenbergunpackqmkl(/* Real */ ae_matrix* a, ae_int_t n, /* Real */ ae_vector* tau, /* Real */ ae_matrix* q, ae_state *_state); ae_bool smatrixtdmkl(/* Real */ ae_matrix* a, ae_int_t n, ae_bool isupper, /* Real */ ae_vector* tau, /* Real */ ae_vector* d, /* Real */ ae_vector* e, ae_state *_state); ae_bool smatrixtdunpackqmkl(/* Real */ ae_matrix* a, ae_int_t n, ae_bool isupper, /* Real */ ae_vector* tau, /* Real */ ae_matrix* q, ae_state *_state); ae_bool hmatrixtdmkl(/* Complex */ ae_matrix* a, ae_int_t n, ae_bool isupper, /* Complex */ ae_vector* tau, /* Real */ ae_vector* d, /* Real */ ae_vector* e, ae_state *_state); ae_bool hmatrixtdunpackqmkl(/* Complex */ ae_matrix* a, ae_int_t n, ae_bool isupper, /* Complex */ ae_vector* tau, /* Complex */ ae_matrix* q, ae_state *_state); ae_bool rmatrixbdsvdmkl(/* Real */ ae_vector* d, /* Real */ ae_vector* e, ae_int_t n, ae_bool isupper, /* Real */ ae_matrix* u, ae_int_t nru, /* Real */ ae_matrix* c, ae_int_t ncc, /* Real */ ae_matrix* vt, ae_int_t ncvt, ae_bool* svdresult, ae_state *_state); ae_bool rmatrixinternalschurdecompositionmkl(/* Real */ ae_matrix* h, ae_int_t n, ae_int_t tneeded, ae_int_t zneeded, /* Real */ ae_vector* wr, /* Real */ ae_vector* wi, /* Real */ ae_matrix* z, ae_int_t* info, ae_state *_state); ae_bool rmatrixinternaltrevcmkl(/* Real */ ae_matrix* t, ae_int_t n, ae_int_t side, ae_int_t howmny, /* Real */ ae_matrix* vl, /* Real */ ae_matrix* vr, ae_int_t* m, ae_int_t* info, ae_state *_state); ae_bool smatrixtdevdmkl(/* Real */ ae_vector* d, /* Real */ ae_vector* e, ae_int_t n, ae_int_t zneeded, /* Real */ ae_matrix* z, ae_bool* evdresult, ae_state *_state); void 
applyrotationsfromtheleft(ae_bool isforward, ae_int_t m1, ae_int_t m2, ae_int_t n1, ae_int_t n2, /* Real */ ae_vector* c, /* Real */ ae_vector* s, /* Real */ ae_matrix* a, /* Real */ ae_vector* work, ae_state *_state); void applyrotationsfromtheright(ae_bool isforward, ae_int_t m1, ae_int_t m2, ae_int_t n1, ae_int_t n2, /* Real */ ae_vector* c, /* Real */ ae_vector* s, /* Real */ ae_matrix* a, /* Real */ ae_vector* work, ae_state *_state); void generaterotation(double f, double g, double* cs, double* sn, double* r, ae_state *_state); void rmatrixtrsafesolve(/* Real */ ae_matrix* a, ae_int_t n, /* Real */ ae_vector* x, double* s, ae_bool isupper, ae_bool istrans, ae_bool isunit, ae_state *_state); void safesolvetriangular(/* Real */ ae_matrix* a, ae_int_t n, /* Real */ ae_vector* x, double* s, ae_bool isupper, ae_bool istrans, ae_bool isunit, ae_bool normin, /* Real */ ae_vector* cnorm, ae_state *_state); ae_bool rmatrixscaledtrsafesolve(/* Real */ ae_matrix* a, double sa, ae_int_t n, /* Real */ ae_vector* x, ae_bool isupper, ae_int_t trans, ae_bool isunit, double maxgrowth, ae_state *_state); ae_bool cmatrixscaledtrsafesolve(/* Complex */ ae_matrix* a, double sa, ae_int_t n, /* Complex */ ae_vector* x, ae_bool isupper, ae_int_t trans, ae_bool isunit, double maxgrowth, ae_state *_state); void hermitianmatrixvectormultiply(/* Complex */ ae_matrix* a, ae_bool isupper, ae_int_t i1, ae_int_t i2, /* Complex */ ae_vector* x, ae_complex alpha, /* Complex */ ae_vector* y, ae_state *_state); void hermitianrank2update(/* Complex */ ae_matrix* a, ae_bool isupper, ae_int_t i1, ae_int_t i2, /* Complex */ ae_vector* x, /* Complex */ ae_vector* y, /* Complex */ ae_vector* t, ae_complex alpha, ae_state *_state); void symmetricmatrixvectormultiply(/* Real */ ae_matrix* a, ae_bool isupper, ae_int_t i1, ae_int_t i2, /* Real */ ae_vector* x, double alpha, /* Real */ ae_vector* y, ae_state *_state); void symmetricrank2update(/* Real */ ae_matrix* a, ae_bool isupper, ae_int_t i1, ae_int_t i2, /* Real */ ae_vector* x, /* Real */ ae_vector* y, /* Real */ ae_vector* t, double alpha, ae_state *_state); double vectornorm2(/* Real */ ae_vector* x, ae_int_t i1, ae_int_t i2, ae_state *_state); ae_int_t vectoridxabsmax(/* Real */ ae_vector* x, ae_int_t i1, ae_int_t i2, ae_state *_state); ae_int_t columnidxabsmax(/* Real */ ae_matrix* x, ae_int_t i1, ae_int_t i2, ae_int_t j, ae_state *_state); ae_int_t rowidxabsmax(/* Real */ ae_matrix* x, ae_int_t j1, ae_int_t j2, ae_int_t i, ae_state *_state); double upperhessenberg1norm(/* Real */ ae_matrix* a, ae_int_t i1, ae_int_t i2, ae_int_t j1, ae_int_t j2, /* Real */ ae_vector* work, ae_state *_state); void copymatrix(/* Real */ ae_matrix* a, ae_int_t is1, ae_int_t is2, ae_int_t js1, ae_int_t js2, /* Real */ ae_matrix* b, ae_int_t id1, ae_int_t id2, ae_int_t jd1, ae_int_t jd2, ae_state *_state); void inplacetranspose(/* Real */ ae_matrix* a, ae_int_t i1, ae_int_t i2, ae_int_t j1, ae_int_t j2, /* Real */ ae_vector* work, ae_state *_state); void copyandtranspose(/* Real */ ae_matrix* a, ae_int_t is1, ae_int_t is2, ae_int_t js1, ae_int_t js2, /* Real */ ae_matrix* b, ae_int_t id1, ae_int_t id2, ae_int_t jd1, ae_int_t jd2, ae_state *_state); void matrixvectormultiply(/* Real */ ae_matrix* a, ae_int_t i1, ae_int_t i2, ae_int_t j1, ae_int_t j2, ae_bool trans, /* Real */ ae_vector* x, ae_int_t ix1, ae_int_t ix2, double alpha, /* Real */ ae_vector* y, ae_int_t iy1, ae_int_t iy2, double beta, ae_state *_state); double pythag2(double x, double y, ae_state *_state); void 
matrixmatrixmultiply(/* Real */ ae_matrix* a, ae_int_t ai1, ae_int_t ai2, ae_int_t aj1, ae_int_t aj2, ae_bool transa, /* Real */ ae_matrix* b, ae_int_t bi1, ae_int_t bi2, ae_int_t bj1, ae_int_t bj2, ae_bool transb, double alpha, /* Real */ ae_matrix* c, ae_int_t ci1, ae_int_t ci2, ae_int_t cj1, ae_int_t cj2, double beta, /* Real */ ae_vector* work, ae_state *_state); void linminnormalized(/* Real */ ae_vector* d, double* stp, ae_int_t n, ae_state *_state); void mcsrch(ae_int_t n, /* Real */ ae_vector* x, double* f, /* Real */ ae_vector* g, /* Real */ ae_vector* s, double* stp, double stpmax, double gtol, ae_int_t* info, ae_int_t* nfev, /* Real */ ae_vector* wa, linminstate* state, ae_int_t* stage, ae_state *_state); void armijocreate(ae_int_t n, /* Real */ ae_vector* x, double f, /* Real */ ae_vector* s, double stp, double stpmax, ae_int_t fmax, armijostate* state, ae_state *_state); ae_bool armijoiteration(armijostate* state, ae_state *_state); void armijoresults(armijostate* state, ae_int_t* info, double* stp, double* f, ae_state *_state); void _linminstate_init(void* _p, ae_state *_state); void _linminstate_init_copy(void* _dst, void* _src, ae_state *_state); void _linminstate_clear(void* _p); void _linminstate_destroy(void* _p); void _armijostate_init(void* _p, ae_state *_state); void _armijostate_init_copy(void* _dst, void* _src, ae_state *_state); void _armijostate_clear(void* _p); void _armijostate_destroy(void* _p); void xdot(/* Real */ ae_vector* a, /* Real */ ae_vector* b, ae_int_t n, /* Real */ ae_vector* temp, double* r, double* rerr, ae_state *_state); void xcdot(/* Complex */ ae_vector* a, /* Complex */ ae_vector* b, ae_int_t n, /* Real */ ae_vector* temp, ae_complex* r, double* rerr, ae_state *_state); void rmatrixinternalschurdecomposition(/* Real */ ae_matrix* h, ae_int_t n, ae_int_t tneeded, ae_int_t zneeded, /* Real */ ae_vector* wr, /* Real */ ae_vector* wi, /* Real */ ae_matrix* z, ae_int_t* info, ae_state *_state); ae_bool upperhessenbergschurdecomposition(/* Real */ ae_matrix* h, ae_int_t n, /* Real */ ae_matrix* s, ae_state *_state); void internalschurdecomposition(/* Real */ ae_matrix* h, ae_int_t n, ae_int_t tneeded, ae_int_t zneeded, /* Real */ ae_vector* wr, /* Real */ ae_vector* wi, /* Real */ ae_matrix* z, ae_int_t* info, ae_state *_state); void rankx(/* Real */ ae_vector* x, ae_int_t n, ae_bool iscentered, apbuffers* buf, ae_state *_state); void rankxuntied(/* Real */ ae_vector* x, ae_int_t n, apbuffers* buf, ae_state *_state); void hpcpreparechunkedgradient(/* Real */ ae_vector* weights, ae_int_t wcount, ae_int_t ntotal, ae_int_t nin, ae_int_t nout, mlpbuffers* buf, ae_state *_state); void hpcfinalizechunkedgradient(mlpbuffers* buf, /* Real */ ae_vector* grad, ae_state *_state); ae_bool hpcchunkedgradient(/* Real */ ae_vector* weights, /* Integer */ ae_vector* structinfo, /* Real */ ae_vector* columnmeans, /* Real */ ae_vector* columnsigmas, /* Real */ ae_matrix* xy, ae_int_t cstart, ae_int_t csize, /* Real */ ae_vector* batch4buf, /* Real */ ae_vector* hpcbuf, double* e, ae_bool naturalerrorfunc, ae_state *_state); ae_bool hpcchunkedprocess(/* Real */ ae_vector* weights, /* Integer */ ae_vector* structinfo, /* Real */ ae_vector* columnmeans, /* Real */ ae_vector* columnsigmas, /* Real */ ae_matrix* xy, ae_int_t cstart, ae_int_t csize, /* Real */ ae_vector* batch4buf, /* Real */ ae_vector* hpcbuf, ae_state *_state); void _mlpbuffers_init(void* _p, ae_state *_state); void _mlpbuffers_init_copy(void* _dst, void* _src, ae_state *_state); void 
_mlpbuffers_clear(void* _p); void _mlpbuffers_destroy(void* _p); void findprimitiverootandinverse(ae_int_t n, ae_int_t* proot, ae_int_t* invproot, ae_state *_state); void ftcomplexfftplan(ae_int_t n, ae_int_t k, fasttransformplan* plan, ae_state *_state); void ftapplyplan(fasttransformplan* plan, /* Real */ ae_vector* a, ae_int_t offsa, ae_int_t repcnt, ae_state *_state); void ftbasefactorize(ae_int_t n, ae_int_t tasktype, ae_int_t* n1, ae_int_t* n2, ae_state *_state); ae_bool ftbaseissmooth(ae_int_t n, ae_state *_state); ae_int_t ftbasefindsmooth(ae_int_t n, ae_state *_state); ae_int_t ftbasefindsmootheven(ae_int_t n, ae_state *_state); double ftbasegetflopestimate(ae_int_t n, ae_state *_state); void _fasttransformplan_init(void* _p, ae_state *_state); void _fasttransformplan_init_copy(void* _dst, void* _src, ae_state *_state); void _fasttransformplan_clear(void* _p); void _fasttransformplan_destroy(void* _p); double nulog1p(double x, ae_state *_state); double nuexpm1(double x, ae_state *_state); double nucosm1(double x, ae_state *_state); } #endif cpp/src/alglibmisc.h0000755000175000017500000022272313105126765014305 0ustar sergeysergey/************************************************************************* ALGLIB 3.11.0 (source code generated 2017-05-11) Copyright (c) Sergey Bochkanov (ALGLIB project). >>> SOURCE LICENSE >>> This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation (www.fsf.org); either version 2 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. A copy of the GNU General Public License is available at http://www.fsf.org/licensing/licenses >>> END OF LICENSE >>> *************************************************************************/ #ifndef _alglibmisc_pkg_h #define _alglibmisc_pkg_h #include "ap.h" #include "alglibinternal.h" ///////////////////////////////////////////////////////////////////////// // // THIS SECTION CONTAINS COMPUTATIONAL CORE DECLARATIONS (DATATYPES) // ///////////////////////////////////////////////////////////////////////// namespace alglib_impl { typedef struct { ae_vector x; ae_vector boxmin; ae_vector boxmax; ae_int_t kneeded; double rneeded; ae_bool selfmatch; double approxf; ae_int_t kcur; ae_vector idx; ae_vector r; ae_vector buf; ae_vector curboxmin; ae_vector curboxmax; double curdist; } kdtreerequestbuffer; typedef struct { ae_int_t n; ae_int_t nx; ae_int_t ny; ae_int_t normtype; ae_matrix xy; ae_vector tags; ae_vector boxmin; ae_vector boxmax; ae_vector nodes; ae_vector splits; kdtreerequestbuffer innerbuf; ae_int_t debugcounter; } kdtree; typedef struct { ae_int_t s1; ae_int_t s2; ae_int_t magicv; } hqrndstate; typedef struct { ae_int_t i; ae_complex c; ae_vector a; } xdebugrecord1; } ///////////////////////////////////////////////////////////////////////// // // THIS SECTION CONTAINS C++ INTERFACE // ///////////////////////////////////////////////////////////////////////// namespace alglib { /************************************************************************* Buffer object which is used to perform nearest neighbor requests in the multithreaded mode (multiple threads working with same KD-tree object). This object should be created with KDTreeCreateBuffer(). 
*************************************************************************/ class _kdtreerequestbuffer_owner { public: _kdtreerequestbuffer_owner(); _kdtreerequestbuffer_owner(const _kdtreerequestbuffer_owner &rhs); _kdtreerequestbuffer_owner& operator=(const _kdtreerequestbuffer_owner &rhs); virtual ~_kdtreerequestbuffer_owner(); alglib_impl::kdtreerequestbuffer* c_ptr(); alglib_impl::kdtreerequestbuffer* c_ptr() const; protected: alglib_impl::kdtreerequestbuffer *p_struct; }; class kdtreerequestbuffer : public _kdtreerequestbuffer_owner { public: kdtreerequestbuffer(); kdtreerequestbuffer(const kdtreerequestbuffer &rhs); kdtreerequestbuffer& operator=(const kdtreerequestbuffer &rhs); virtual ~kdtreerequestbuffer(); }; /************************************************************************* KD-tree object. *************************************************************************/ class _kdtree_owner { public: _kdtree_owner(); _kdtree_owner(const _kdtree_owner &rhs); _kdtree_owner& operator=(const _kdtree_owner &rhs); virtual ~_kdtree_owner(); alglib_impl::kdtree* c_ptr(); alglib_impl::kdtree* c_ptr() const; protected: alglib_impl::kdtree *p_struct; }; class kdtree : public _kdtree_owner { public: kdtree(); kdtree(const kdtree &rhs); kdtree& operator=(const kdtree &rhs); virtual ~kdtree(); }; /************************************************************************* Portable high quality random number generator state. Initialized with HQRNDRandomize() or HQRNDSeed(). Fields: S1, S2 - seed values V - precomputed value MagicV - 'magic' value used to determine whether State structure was correctly initialized. *************************************************************************/ class _hqrndstate_owner { public: _hqrndstate_owner(); _hqrndstate_owner(const _hqrndstate_owner &rhs); _hqrndstate_owner& operator=(const _hqrndstate_owner &rhs); virtual ~_hqrndstate_owner(); alglib_impl::hqrndstate* c_ptr(); alglib_impl::hqrndstate* c_ptr() const; protected: alglib_impl::hqrndstate *p_struct; }; class hqrndstate : public _hqrndstate_owner { public: hqrndstate(); hqrndstate(const hqrndstate &rhs); hqrndstate& operator=(const hqrndstate &rhs); virtual ~hqrndstate(); }; /************************************************************************* *************************************************************************/ class _xdebugrecord1_owner { public: _xdebugrecord1_owner(); _xdebugrecord1_owner(const _xdebugrecord1_owner &rhs); _xdebugrecord1_owner& operator=(const _xdebugrecord1_owner &rhs); virtual ~_xdebugrecord1_owner(); alglib_impl::xdebugrecord1* c_ptr(); alglib_impl::xdebugrecord1* c_ptr() const; protected: alglib_impl::xdebugrecord1 *p_struct; }; class xdebugrecord1 : public _xdebugrecord1_owner { public: xdebugrecord1(); xdebugrecord1(const xdebugrecord1 &rhs); xdebugrecord1& operator=(const xdebugrecord1 &rhs); virtual ~xdebugrecord1(); ae_int_t &i; alglib::complex &c; real_1d_array a; }; /************************************************************************* This function serializes data structure to string. Important properties of s_out: * it contains alphanumeric characters, dots, underscores, minus signs * these symbols are grouped into words, which are separated by spaces and Windows-style (CR+LF) newlines * although serializer uses spaces and CR+LF as separators, you can replace any separator character by arbitrary combination of spaces, tabs, Windows or Unix newlines. It allows flexible reformatting of the string in case you want to include it into text or XML file. 
But you should not insert separators into the middle of the "words" nor you should change case of letters. * s_out can be freely moved between 32-bit and 64-bit systems, little and big endian machines, and so on. You can serialize structure on 32-bit machine and unserialize it on 64-bit one (or vice versa), or serialize it on SPARC and unserialize on x86. You can also serialize it in C++ version of ALGLIB and unserialize in C# one, and vice versa. *************************************************************************/ void kdtreeserialize(kdtree &obj, std::string &s_out); /************************************************************************* This function unserializes data structure from string. *************************************************************************/ void kdtreeunserialize(const std::string &s_in, kdtree &obj); /************************************************************************* This function serializes data structure to C++ stream. Data stream generated by this function is same as string representation generated by string version of serializer - alphanumeric characters, dots, underscores, minus signs, which are grouped into words separated by spaces and CR+LF. We recommend you to read comments on string version of serializer to find out more about serialization of AlGLIB objects. *************************************************************************/ void kdtreeserialize(kdtree &obj, std::ostream &s_out); /************************************************************************* This function unserializes data structure from stream. *************************************************************************/ void kdtreeunserialize(const std::istream &s_in, kdtree &obj); /************************************************************************* KD-tree creation This subroutine creates KD-tree from set of X-values and optional Y-values INPUT PARAMETERS XY - dataset, array[0..N-1,0..NX+NY-1]. one row corresponds to one point. first NX columns contain X-values, next NY (NY may be zero) columns may contain associated Y-values N - number of points, N>=0. NX - space dimension, NX>=1. NY - number of optional Y-values, NY>=0. NormType- norm type: * 0 denotes infinity-norm * 1 denotes 1-norm * 2 denotes 2-norm (Euclidean norm) OUTPUT PARAMETERS KDT - KD-tree NOTES 1. KD-tree creation have O(N*logN) complexity and O(N*(2*NX+NY)) memory requirements. 2. Although KD-trees may be used with any combination of N and NX, they are more efficient than brute-force search only when N >> 4^NX. So they are most useful in low-dimensional tasks (NX=2, NX=3). NX=1 is another inefficient case, because simple binary search (without additional structures) is much more efficient in such tasks than KD-trees. -- ALGLIB -- Copyright 28.02.2010 by Bochkanov Sergey *************************************************************************/ void kdtreebuild(const real_2d_array &xy, const ae_int_t n, const ae_int_t nx, const ae_int_t ny, const ae_int_t normtype, kdtree &kdt); void kdtreebuild(const real_2d_array &xy, const ae_int_t nx, const ae_int_t ny, const ae_int_t normtype, kdtree &kdt); /************************************************************************* KD-tree creation This subroutine creates KD-tree from set of X-values, integer tags and optional Y-values INPUT PARAMETERS XY - dataset, array[0..N-1,0..NX+NY-1]. one row corresponds to one point. 
first NX columns contain X-values, next NY (NY may be zero) columns may contain associated Y-values Tags - tags, array[0..N-1], contains integer tags associated with points. N - number of points, N>=0 NX - space dimension, NX>=1. NY - number of optional Y-values, NY>=0. NormType- norm type: * 0 denotes infinity-norm * 1 denotes 1-norm * 2 denotes 2-norm (Euclidean norm) OUTPUT PARAMETERS KDT - KD-tree NOTES 1. KD-tree creation has O(N*logN) complexity and O(N*(2*NX+NY)) memory requirements. 2. Although KD-trees may be used with any combination of N and NX, they are more efficient than brute-force search only when N >> 4^NX. So they are most useful in low-dimensional tasks (NX=2, NX=3). NX=1 is another inefficient case, because simple binary search (without additional structures) is much more efficient in such tasks than KD-trees. -- ALGLIB -- Copyright 28.02.2010 by Bochkanov Sergey *************************************************************************/ void kdtreebuildtagged(const real_2d_array &xy, const integer_1d_array &tags, const ae_int_t n, const ae_int_t nx, const ae_int_t ny, const ae_int_t normtype, kdtree &kdt); void kdtreebuildtagged(const real_2d_array &xy, const integer_1d_array &tags, const ae_int_t nx, const ae_int_t ny, const ae_int_t normtype, kdtree &kdt);
/************************************************************************* This function creates buffer structure which can be used to perform parallel KD-tree requests. KD-tree subpackage provides two sets of request functions - ones which use internal buffer of KD-tree object (these functions are single-threaded because they use the same buffer, which cannot be shared between threads), and ones which use external buffer. This function is used to initialize external buffer. INPUT PARAMETERS KDT - KD-tree which is associated with newly created buffer OUTPUT PARAMETERS Buf - external buffer. IMPORTANT: KD-tree buffer should be used only with KD-tree object which was used to initialize buffer. Any attempt to use the buffer with a different object is dangerous - you may get integrity check failure (exception) because sizes of internal arrays do not fit to dimensions of KD-tree structure. -- ALGLIB -- Copyright 18.03.2016 by Bochkanov Sergey *************************************************************************/ void kdtreecreaterequestbuffer(const kdtree &kdt, kdtreerequestbuffer &buf);
/************************************************************************* K-NN query: K nearest neighbors IMPORTANT: this function can not be used in multithreaded code because it uses internal temporary buffer of kd-tree object, which can not be shared between multiple threads. If you want to perform parallel requests, use function which uses external request buffer: KDTreeTsQueryKNN() ("Ts" stands for "thread-safe"). INPUT PARAMETERS KDT - KD-tree X - point, array[0..NX-1]. K - number of neighbors to return, K>=1 SelfMatch - whether self-matches are allowed: * if True, nearest neighbor may be the point itself (if it exists in original dataset) * if False, then only points with non-zero distance are returned * if not given, considered True RESULT number of actual neighbors found (either K or N, if K>N). This subroutine performs query and stores its result in the internal structures of the KD-tree.
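For example, a complete build-and-query round looks as follows (a minimal sketch: the 4-point 2-dimensional dataset, the include and the main() wrapper are illustrative assumptions; the calls themselves follow the declarations in this header):

    #include "alglibmisc.h"
    using namespace alglib;

    int main()
    {
        real_2d_array xy = "[[0,0],[0,1],[1,0],[1,1]]";   // N=4 points, NX=2, NY=0
        kdtree kdt;
        kdtreebuild(xy, 4, 2, 0, 2, kdt);                 // N=4, NX=2, NY=0, 2-norm (Euclidean)
        real_1d_array pt = "[0.9,0.8]";
        ae_int_t k = kdtreequeryknn(kdt, pt, 2);          // 2 nearest neighbors, SelfMatch defaults to True
        real_2d_array rx;
        real_1d_array rdist;
        kdtreequeryresultsx(kdt, rx);                     // neighbor coordinates, closest point first
        kdtreequeryresultsdistances(kdt, rdist);          // distances in the chosen norm
        return 0;
    }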
You can use following subroutines to obtain these results: * KDTreeQueryResultsX() to get X-values * KDTreeQueryResultsXY() to get X- and Y-values * KDTreeQueryResultsTags() to get tag values * KDTreeQueryResultsDistances() to get distances -- ALGLIB -- Copyright 28.02.2010 by Bochkanov Sergey *************************************************************************/ ae_int_t kdtreequeryknn(const kdtree &kdt, const real_1d_array &x, const ae_int_t k, const bool selfmatch); ae_int_t kdtreequeryknn(const kdtree &kdt, const real_1d_array &x, const ae_int_t k); /************************************************************************* K-NN query: K nearest neighbors, using external thread-local buffer. You can call this function from multiple threads for same kd-tree instance, assuming that different instances of buffer object are passed to different threads. INPUT PARAMETERS KDT - kd-tree Buf - request buffer object created for this particular instance of kd-tree structure with kdtreecreaterequestbuffer() function. X - point, array[0..NX-1]. K - number of neighbors to return, K>=1 SelfMatch - whether self-matches are allowed: * if True, nearest neighbor may be the point itself (if it exists in original dataset) * if False, then only points with non-zero distance are returned * if not given, considered True RESULT number of actual neighbors found (either K or N, if K>N). This subroutine performs query and stores its result in the internal structures of the buffer object. You can use following subroutines to obtain these results (pay attention to "buf" in their names): * KDTreeTsQueryResultsX() to get X-values * KDTreeTsQueryResultsXY() to get X- and Y-values * KDTreeTsQueryResultsTags() to get tag values * KDTreeTsQueryResultsDistances() to get distances IMPORTANT: kd-tree buffer should be used only with KD-tree object which was used to initialize buffer. Any attempt to use biffer with different object is dangerous - you may get integrity check failure (exception) because sizes of internal arrays do not fit to dimensions of KD-tree structure. -- ALGLIB -- Copyright 18.03.2016 by Bochkanov Sergey *************************************************************************/ ae_int_t kdtreetsqueryknn(const kdtree &kdt, const kdtreerequestbuffer &buf, const real_1d_array &x, const ae_int_t k, const bool selfmatch); ae_int_t kdtreetsqueryknn(const kdtree &kdt, const kdtreerequestbuffer &buf, const real_1d_array &x, const ae_int_t k); /************************************************************************* R-NN query: all points within R-sphere centered at X IMPORTANT: this function can not be used in multithreaded code because it uses internal temporary buffer of kd-tree object, which can not be shared between multiple threads. If you want to perform parallel requests, use function which uses external request buffer: KDTreeTsQueryRNN() ("Ts" stands for "thread-safe"). INPUT PARAMETERS KDT - KD-tree X - point, array[0..NX-1]. R - radius of sphere (in corresponding norm), R>0 SelfMatch - whether self-matches are allowed: * if True, nearest neighbor may be the point itself (if it exists in original dataset) * if False, then only points with non-zero distance are returned * if not given, considered True RESULT number of neighbors found, >=0 This subroutine performs query and stores its result in the internal structures of the KD-tree. 
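For example (an illustrative fragment which reuses the kdtree kdt built in the sketch above; the radius 0.75 and the buffered variant shown after it are illustrative):

    real_1d_array center = "[0.5,0.5]";
    ae_int_t cnt = kdtreequeryrnn(kdt, center, 0.75);     // all points within distance 0.75 of center
    real_2d_array xr;
    kdtreequeryresultsx(kdt, xr);                         // their coordinates, closest first

    // thread-safe variant: each worker thread owns its own kdtreerequestbuffer
    kdtreerequestbuffer buf;
    kdtreecreaterequestbuffer(kdt, buf);
    ae_int_t cnt2 = kdtreetsqueryrnn(kdt, buf, center, 0.75);
    real_2d_array xr2;
    kdtreetsqueryresultsx(kdt, buf, xr2);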
You can use following subroutines to obtain actual results: * KDTreeQueryResultsX() to get X-values * KDTreeQueryResultsXY() to get X- and Y-values * KDTreeQueryResultsTags() to get tag values * KDTreeQueryResultsDistances() to get distances -- ALGLIB -- Copyright 28.02.2010 by Bochkanov Sergey *************************************************************************/ ae_int_t kdtreequeryrnn(const kdtree &kdt, const real_1d_array &x, const double r, const bool selfmatch); ae_int_t kdtreequeryrnn(const kdtree &kdt, const real_1d_array &x, const double r); /************************************************************************* R-NN query: all points within R-sphere centered at X, using external thread-local buffer. You can call this function from multiple threads for same kd-tree instance, assuming that different instances of buffer object are passed to different threads. INPUT PARAMETERS KDT - KD-tree Buf - request buffer object created for this particular instance of kd-tree structure with kdtreecreaterequestbuffer() function. X - point, array[0..NX-1]. R - radius of sphere (in corresponding norm), R>0 SelfMatch - whether self-matches are allowed: * if True, nearest neighbor may be the point itself (if it exists in original dataset) * if False, then only points with non-zero distance are returned * if not given, considered True RESULT number of neighbors found, >=0 This subroutine performs query and stores its result in the internal structures of the buffer object. You can use following subroutines to obtain these results (pay attention to "buf" in their names): * KDTreeTsQueryResultsX() to get X-values * KDTreeTsQueryResultsXY() to get X- and Y-values * KDTreeTsQueryResultsTags() to get tag values * KDTreeTsQueryResultsDistances() to get distances IMPORTANT: kd-tree buffer should be used only with KD-tree object which was used to initialize buffer. Any attempt to use biffer with different object is dangerous - you may get integrity check failure (exception) because sizes of internal arrays do not fit to dimensions of KD-tree structure. -- ALGLIB -- Copyright 18.03.2016 by Bochkanov Sergey *************************************************************************/ ae_int_t kdtreetsqueryrnn(const kdtree &kdt, const kdtreerequestbuffer &buf, const real_1d_array &x, const double r, const bool selfmatch); ae_int_t kdtreetsqueryrnn(const kdtree &kdt, const kdtreerequestbuffer &buf, const real_1d_array &x, const double r); /************************************************************************* K-NN query: approximate K nearest neighbors IMPORTANT: this function can not be used in multithreaded code because it uses internal temporary buffer of kd-tree object, which can not be shared between multiple threads. If you want to perform parallel requests, use function which uses external request buffer: KDTreeTsQueryAKNN() ("Ts" stands for "thread-safe"). INPUT PARAMETERS KDT - KD-tree X - point, array[0..NX-1]. K - number of neighbors to return, K>=1 SelfMatch - whether self-matches are allowed: * if True, nearest neighbor may be the point itself (if it exists in original dataset) * if False, then only points with non-zero distance are returned * if not given, considered True Eps - approximation factor, Eps>=0. eps-approximate nearest neighbor is a neighbor whose distance from X is at most (1+eps) times distance of true nearest neighbor. RESULT number of actual neighbors found (either K or N, if K>N). 
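For example (an illustrative fragment reusing the kdtree kdt built earlier; Eps=2.0 is an illustrative value):

    real_1d_array q = "[0.1,0.2]";
    ae_int_t k = kdtreequeryaknn(kdt, q, 2, 2.0);         // 2 approximate nearest neighbors, Eps=2.0
    real_1d_array dists;
    kdtreequeryresultsdistances(kdt, dists);              // approximate distances, within factor (1+Eps) of the true ones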
NOTES significant performance gain may be achieved only when Eps is is on the order of magnitude of 1 or larger. This subroutine performs query and stores its result in the internal structures of the KD-tree. You can use following subroutines to obtain these results: * KDTreeQueryResultsX() to get X-values * KDTreeQueryResultsXY() to get X- and Y-values * KDTreeQueryResultsTags() to get tag values * KDTreeQueryResultsDistances() to get distances -- ALGLIB -- Copyright 28.02.2010 by Bochkanov Sergey *************************************************************************/ ae_int_t kdtreequeryaknn(const kdtree &kdt, const real_1d_array &x, const ae_int_t k, const bool selfmatch, const double eps); ae_int_t kdtreequeryaknn(const kdtree &kdt, const real_1d_array &x, const ae_int_t k, const double eps); /************************************************************************* K-NN query: approximate K nearest neighbors, using thread-local buffer. You can call this function from multiple threads for same kd-tree instance, assuming that different instances of buffer object are passed to different threads. INPUT PARAMETERS KDT - KD-tree Buf - request buffer object created for this particular instance of kd-tree structure with kdtreecreaterequestbuffer() function. X - point, array[0..NX-1]. K - number of neighbors to return, K>=1 SelfMatch - whether self-matches are allowed: * if True, nearest neighbor may be the point itself (if it exists in original dataset) * if False, then only points with non-zero distance are returned * if not given, considered True Eps - approximation factor, Eps>=0. eps-approximate nearest neighbor is a neighbor whose distance from X is at most (1+eps) times distance of true nearest neighbor. RESULT number of actual neighbors found (either K or N, if K>N). NOTES significant performance gain may be achieved only when Eps is is on the order of magnitude of 1 or larger. This subroutine performs query and stores its result in the internal structures of the buffer object. You can use following subroutines to obtain these results (pay attention to "buf" in their names): * KDTreeTsQueryResultsX() to get X-values * KDTreeTsQueryResultsXY() to get X- and Y-values * KDTreeTsQueryResultsTags() to get tag values * KDTreeTsQueryResultsDistances() to get distances IMPORTANT: kd-tree buffer should be used only with KD-tree object which was used to initialize buffer. Any attempt to use biffer with different object is dangerous - you may get integrity check failure (exception) because sizes of internal arrays do not fit to dimensions of KD-tree structure. -- ALGLIB -- Copyright 18.03.2016 by Bochkanov Sergey *************************************************************************/ ae_int_t kdtreetsqueryaknn(const kdtree &kdt, const kdtreerequestbuffer &buf, const real_1d_array &x, const ae_int_t k, const bool selfmatch, const double eps); ae_int_t kdtreetsqueryaknn(const kdtree &kdt, const kdtreerequestbuffer &buf, const real_1d_array &x, const ae_int_t k, const double eps); /************************************************************************* Box query: all points within user-specified box. IMPORTANT: this function can not be used in multithreaded code because it uses internal temporary buffer of kd-tree object, which can not be shared between multiple threads. If you want to perform parallel requests, use function which uses external request buffer: KDTreeTsQueryBox() ("Ts" stands for "thread-safe"). INPUT PARAMETERS KDT - KD-tree BoxMin - lower bounds, array[0..NX-1]. 
BoxMax - upper bounds, array[0..NX-1]. RESULT number of actual neighbors found (in [0,N]). This subroutine performs query and stores its result in the internal structures of the KD-tree. You can use following subroutines to obtain these results: * KDTreeQueryResultsX() to get X-values * KDTreeQueryResultsXY() to get X- and Y-values * KDTreeQueryResultsTags() to get tag values * KDTreeQueryResultsDistances() returns zeros for this request NOTE: this particular query returns unordered results, because there is no meaningful way of ordering points. Furthermore, no 'distance' is associated with points - it is either INSIDE or OUTSIDE (so request for distances will return zeros). -- ALGLIB -- Copyright 14.05.2016 by Bochkanov Sergey *************************************************************************/ ae_int_t kdtreequerybox(const kdtree &kdt, const real_1d_array &boxmin, const real_1d_array &boxmax); /************************************************************************* Box query: all points within user-specified box, using thread-local buffer. You can call this function from multiple threads for same kd-tree instance, assuming that different instances of buffer object are passed to different threads. INPUT PARAMETERS KDT - KD-tree Buf - request buffer object created for this particular instance of kd-tree structure with kdtreecreaterequestbuffer() function. BoxMin - lower bounds, array[0..NX-1]. BoxMax - upper bounds, array[0..NX-1]. RESULT number of actual neighbors found (in [0,N]). This subroutine performs query and stores its result in the internal structures of the buffer object. You can use following subroutines to obtain these results (pay attention to "ts" in their names): * KDTreeTsQueryResultsX() to get X-values * KDTreeTsQueryResultsXY() to get X- and Y-values * KDTreeTsQueryResultsTags() to get tag values * KDTreeTsQueryResultsDistances() returns zeros for this query NOTE: this particular query returns unordered results, because there is no meaningful way of ordering points. Furthermore, no 'distance' is associated with points - it is either INSIDE or OUTSIDE (so request for distances will return zeros). IMPORTANT: kd-tree buffer should be used only with KD-tree object which was used to initialize buffer. Any attempt to use biffer with different object is dangerous - you may get integrity check failure (exception) because sizes of internal arrays do not fit to dimensions of KD-tree structure. -- ALGLIB -- Copyright 14.05.2016 by Bochkanov Sergey *************************************************************************/ ae_int_t kdtreetsquerybox(const kdtree &kdt, const kdtreerequestbuffer &buf, const real_1d_array &boxmin, const real_1d_array &boxmax); /************************************************************************* X-values from last query. This function retuns results stored in the internal buffer of kd-tree object. If you performed buffered requests (ones which use instances of kdtreerequestbuffer class), you should call buffered version of this function - kdtreetsqueryresultsx(). INPUT PARAMETERS KDT - KD-tree X - possibly pre-allocated buffer. If X is too small to store result, it is resized. If size(X) is enough to store result, it is left unchanged. OUTPUT PARAMETERS X - rows are filled with X-values NOTES 1. points are ordered by distance from the query point (first = closest) 2. if XY is larger than required to store result, only leading part will be overwritten; trailing part will be left unchanged. 
So if on input XY = [[A,B],[C,D]], and result is [1,2], then on exit we will get XY = [[1,2],[C,D]]. This is done purposely to increase performance; if you want function to resize array according to result size, use function with same name and suffix 'I'. SEE ALSO * KDTreeQueryResultsXY() X- and Y-values * KDTreeQueryResultsTags() tag values * KDTreeQueryResultsDistances() distances -- ALGLIB -- Copyright 28.02.2010 by Bochkanov Sergey *************************************************************************/ void kdtreequeryresultsx(const kdtree &kdt, real_2d_array &x);
/************************************************************************* X- and Y-values from last query This function returns results stored in the internal buffer of kd-tree object. If you performed buffered requests (ones which use instances of kdtreerequestbuffer class), you should call buffered version of this function - kdtreetsqueryresultsxy(). INPUT PARAMETERS KDT - KD-tree XY - possibly pre-allocated buffer. If XY is too small to store result, it is resized. If size(XY) is enough to store result, it is left unchanged. OUTPUT PARAMETERS XY - rows are filled with points: first NX columns with X-values, next NY columns - with Y-values. NOTES 1. points are ordered by distance from the query point (first = closest) 2. if XY is larger than required to store result, only leading part will be overwritten; trailing part will be left unchanged. So if on input XY = [[A,B],[C,D]], and result is [1,2], then on exit we will get XY = [[1,2],[C,D]]. This is done purposely to increase performance; if you want function to resize array according to result size, use function with same name and suffix 'I'. SEE ALSO * KDTreeQueryResultsX() X-values * KDTreeQueryResultsTags() tag values * KDTreeQueryResultsDistances() distances -- ALGLIB -- Copyright 28.02.2010 by Bochkanov Sergey *************************************************************************/ void kdtreequeryresultsxy(const kdtree &kdt, real_2d_array &xy);
/************************************************************************* Tags from last query This function returns results stored in the internal buffer of kd-tree object. If you performed buffered requests (ones which use instances of kdtreerequestbuffer class), you should call buffered version of this function - kdtreetsqueryresultstags(). INPUT PARAMETERS KDT - KD-tree Tags - possibly pre-allocated buffer. If X is too small to store result, it is resized. If size(X) is enough to store result, it is left unchanged. OUTPUT PARAMETERS Tags - filled with tags associated with points, or, when no tags were supplied, with zeros NOTES 1. points are ordered by distance from the query point (first = closest) 2. if XY is larger than required to store result, only leading part will be overwritten; trailing part will be left unchanged. So if on input XY = [[A,B],[C,D]], and result is [1,2], then on exit we will get XY = [[1,2],[C,D]]. This is done purposely to increase performance; if you want function to resize array according to result size, use function with same name and suffix 'I'.
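For example (a minimal sketch with an illustrative tagged dataset, using kdtreebuildtagged() and kdtreequeryresultstags() declared in this header):

    real_2d_array xy = "[[0,0],[0,1],[1,0],[1,1]]";
    integer_1d_array tags = "[10,20,30,40]";              // one integer tag per point
    kdtree kdt;
    kdtreebuildtagged(xy, tags, 4, 2, 0, 2, kdt);         // N=4, NX=2, NY=0, 2-norm
    real_1d_array q = "[0.9,0.8]";
    kdtreequeryknn(kdt, q, 1);                            // nearest neighbor of q
    integer_1d_array rtags;
    kdtreequeryresultstags(kdt, rtags);                   // rtags[0] is the tag of that neighbor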
SEE ALSO * KDTreeQueryResultsX() X-values * KDTreeQueryResultsXY() X- and Y-values * KDTreeQueryResultsDistances() distances -- ALGLIB -- Copyright 28.02.2010 by Bochkanov Sergey *************************************************************************/ void kdtreequeryresultstags(const kdtree &kdt, integer_1d_array &tags); /************************************************************************* Distances from last query This function retuns results stored in the internal buffer of kd-tree object. If you performed buffered requests (ones which use instances of kdtreerequestbuffer class), you should call buffered version of this function - kdtreetsqueryresultsdistances(). INPUT PARAMETERS KDT - KD-tree R - possibly pre-allocated buffer. If X is too small to store result, it is resized. If size(X) is enough to store result, it is left unchanged. OUTPUT PARAMETERS R - filled with distances (in corresponding norm) NOTES 1. points are ordered by distance from the query point (first = closest) 2. if XY is larger than required to store result, only leading part will be overwritten; trailing part will be left unchanged. So if on input XY = [[A,B],[C,D]], and result is [1,2], then on exit we will get XY = [[1,2],[C,D]]. This is done purposely to increase performance; if you want function to resize array according to result size, use function with same name and suffix 'I'. SEE ALSO * KDTreeQueryResultsX() X-values * KDTreeQueryResultsXY() X- and Y-values * KDTreeQueryResultsTags() tag values -- ALGLIB -- Copyright 28.02.2010 by Bochkanov Sergey *************************************************************************/ void kdtreequeryresultsdistances(const kdtree &kdt, real_1d_array &r); /************************************************************************* X-values from last query associated with kdtreerequestbuffer object. INPUT PARAMETERS KDT - KD-tree Buf - request buffer object created for this particular instance of kd-tree structure. X - possibly pre-allocated buffer. If X is too small to store result, it is resized. If size(X) is enough to store result, it is left unchanged. OUTPUT PARAMETERS X - rows are filled with X-values NOTES 1. points are ordered by distance from the query point (first = closest) 2. if XY is larger than required to store result, only leading part will be overwritten; trailing part will be left unchanged. So if on input XY = [[A,B],[C,D]], and result is [1,2], then on exit we will get XY = [[1,2],[C,D]]. This is done purposely to increase performance; if you want function to resize array according to result size, use function with same name and suffix 'I'. SEE ALSO * KDTreeQueryResultsXY() X- and Y-values * KDTreeQueryResultsTags() tag values * KDTreeQueryResultsDistances() distances -- ALGLIB -- Copyright 28.02.2010 by Bochkanov Sergey *************************************************************************/ void kdtreetsqueryresultsx(const kdtree &kdt, const kdtreerequestbuffer &buf, real_2d_array &x); /************************************************************************* X- and Y-values from last query associated with kdtreerequestbuffer object. INPUT PARAMETERS KDT - KD-tree Buf - request buffer object created for this particular instance of kd-tree structure. XY - possibly pre-allocated buffer. If XY is too small to store result, it is resized. If size(XY) is enough to store result, it is left unchanged. OUTPUT PARAMETERS XY - rows are filled with points: first NX columns with X-values, next NY columns - with Y-values. NOTES 1. 
points are ordered by distance from the query point (first = closest) 2. if XY is larger than required to store result, only leading part will be overwritten; trailing part will be left unchanged. So if on input XY = [[A,B],[C,D]], and result is [1,2], then on exit we will get XY = [[1,2],[C,D]]. This is done purposely to increase performance; if you want function to resize array according to result size, use function with same name and suffix 'I'. SEE ALSO * KDTreeQueryResultsX() X-values * KDTreeQueryResultsTags() tag values * KDTreeQueryResultsDistances() distances -- ALGLIB -- Copyright 28.02.2010 by Bochkanov Sergey *************************************************************************/ void kdtreetsqueryresultsxy(const kdtree &kdt, const kdtreerequestbuffer &buf, real_2d_array &xy); /************************************************************************* Tags from last query associated with kdtreerequestbuffer object. This function returns results stored in the kdtreerequestbuffer object which was used by the corresponding buffered query. It is the buffered counterpart of KDTreeQueryResultsTags(), which reads from the internal buffer of the kd-tree object instead. INPUT PARAMETERS KDT - KD-tree Buf - request buffer object created for this particular instance of kd-tree structure. Tags - possibly pre-allocated buffer. If Tags is too small to store result, it is resized. If size(Tags) is enough to store result, it is left unchanged. OUTPUT PARAMETERS Tags - filled with tags associated with points, or, when no tags were supplied, with zeros NOTES 1. points are ordered by distance from the query point (first = closest) 2. if XY is larger than required to store result, only leading part will be overwritten; trailing part will be left unchanged. So if on input XY = [[A,B],[C,D]], and result is [1,2], then on exit we will get XY = [[1,2],[C,D]]. This is done purposely to increase performance; if you want function to resize array according to result size, use function with same name and suffix 'I'. SEE ALSO * KDTreeQueryResultsX() X-values * KDTreeQueryResultsXY() X- and Y-values * KDTreeQueryResultsDistances() distances -- ALGLIB -- Copyright 28.02.2010 by Bochkanov Sergey *************************************************************************/ void kdtreetsqueryresultstags(const kdtree &kdt, const kdtreerequestbuffer &buf, integer_1d_array &tags); /************************************************************************* Distances from last query associated with kdtreerequestbuffer object. This function returns results stored in the kdtreerequestbuffer object which was used by the corresponding buffered query. It is the buffered counterpart of KDTreeQueryResultsDistances(), which reads from the internal buffer of the kd-tree object instead. INPUT PARAMETERS KDT - KD-tree Buf - request buffer object created for this particular instance of kd-tree structure. R - possibly pre-allocated buffer. If R is too small to store result, it is resized. If size(R) is enough to store result, it is left unchanged. OUTPUT PARAMETERS R - filled with distances (in corresponding norm) NOTES 1. points are ordered by distance from the query point (first = closest) 2. if XY is larger than required to store result, only leading part will be overwritten; trailing part will be left unchanged. So if on input XY = [[A,B],[C,D]], and result is [1,2], then on exit we will get XY = [[1,2],[C,D]]. 
This is done purposely to increase performance; if you want function to resize array according to result size, use function with same name and suffix 'I'. SEE ALSO * KDTreeQueryResultsX() X-values * KDTreeQueryResultsXY() X- and Y-values * KDTreeQueryResultsTags() tag values -- ALGLIB -- Copyright 28.02.2010 by Bochkanov Sergey *************************************************************************/ void kdtreetsqueryresultsdistances(const kdtree &kdt, const kdtreerequestbuffer &buf, real_1d_array &r); /************************************************************************* X-values from last query; 'interactive' variant for languages like Python which support constructs like "X = KDTreeQueryResultsXI(KDT)" and interactive mode of interpreter. This function allocates new array on each call, so it is significantly slower than its 'non-interactive' counterpart, but it is more convenient when you call it from command line. -- ALGLIB -- Copyright 28.02.2010 by Bochkanov Sergey *************************************************************************/ void kdtreequeryresultsxi(const kdtree &kdt, real_2d_array &x); /************************************************************************* XY-values from last query; 'interactive' variant for languages like Python which support constructs like "XY = KDTreeQueryResultsXYI(KDT)" and interactive mode of interpreter. This function allocates new array on each call, so it is significantly slower than its 'non-interactive' counterpart, but it is more convenient when you call it from command line. -- ALGLIB -- Copyright 28.02.2010 by Bochkanov Sergey *************************************************************************/ void kdtreequeryresultsxyi(const kdtree &kdt, real_2d_array &xy); /************************************************************************* Tags from last query; 'interactive' variant for languages like Python which support constructs like "Tags = KDTreeQueryResultsTagsI(KDT)" and interactive mode of interpreter. This function allocates new array on each call, so it is significantly slower than its 'non-interactive' counterpart, but it is more convenient when you call it from command line. -- ALGLIB -- Copyright 28.02.2010 by Bochkanov Sergey *************************************************************************/ void kdtreequeryresultstagsi(const kdtree &kdt, integer_1d_array &tags); /************************************************************************* Distances from last query; 'interactive' variant for languages like Python which support constructs like "R = KDTreeQueryResultsDistancesI(KDT)" and interactive mode of interpreter. This function allocates new array on each call, so it is significantly slower than its 'non-interactive' counterpart, but it is more convenient when you call it from command line. -- ALGLIB -- Copyright 28.02.2010 by Bochkanov Sergey *************************************************************************/ void kdtreequeryresultsdistancesi(const kdtree &kdt, real_1d_array &r); /************************************************************************* HQRNDState initialization with random values which come from standard RNG. 
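As an illustration of how the query and result-retrieval functions above fit together, here is a minimal sketch (the dataset, tags and query point are invented for the example): it builds a small 2-dimensional tree, runs a 2-NN query and reads points, tags and distances back from the internal buffer of the kd-tree object.

#include "alglibmisc.h"
#include <cstdio>
using namespace alglib;

int main()
{
    // 4 points in 2D, integer tags, no Y-values, Euclidean (2-) norm
    real_2d_array xy = "[[0,0],[0,1],[1,0],[1,1]]";
    integer_1d_array tags = "[10,20,30,40]";
    kdtree kdt;
    kdtreebuildtagged(xy, tags, 2, 0, 2, kdt);

    // 2 nearest neighbors of (0.1,0.1); self-matches allowed by default
    real_1d_array x = "[0.1,0.1]";
    ae_int_t k = kdtreequeryknn(kdt, x, 2);

    // read results; pre-allocated arrays are resized only when too small
    real_2d_array rx;
    integer_1d_array rtags;
    real_1d_array rdist;
    kdtreequeryresultsx(kdt, rx);
    kdtreequeryresultstags(kdt, rtags);
    kdtreequeryresultsdistances(kdt, rdist);
    printf("%ld neighbors, closest tag = %ld\n", (long)k, (long)rtags[0]);
    return 0;
}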
-- ALGLIB -- Copyright 02.12.2009 by Bochkanov Sergey *************************************************************************/ void hqrndrandomize(hqrndstate &state); /************************************************************************* HQRNDState initialization with seed values -- ALGLIB -- Copyright 02.12.2009 by Bochkanov Sergey *************************************************************************/ void hqrndseed(const ae_int_t s1, const ae_int_t s2, hqrndstate &state); /************************************************************************* This function generates random real number in (0,1), not including interval boundaries State structure must be initialized with HQRNDRandomize() or HQRNDSeed(). -- ALGLIB -- Copyright 02.12.2009 by Bochkanov Sergey *************************************************************************/ double hqrnduniformr(const hqrndstate &state); /************************************************************************* This function generates random integer number in [0, N) 1. State structure must be initialized with HQRNDRandomize() or HQRNDSeed() 2. N can be any positive number except for very large numbers: * close to 2^31 on 32-bit systems * close to 2^62 on 64-bit systems An exception will be generated if N is too large. -- ALGLIB -- Copyright 02.12.2009 by Bochkanov Sergey *************************************************************************/ ae_int_t hqrnduniformi(const hqrndstate &state, const ae_int_t n); /************************************************************************* Random number generator: normal numbers This function generates one random number from normal distribution. Its performance is equal to that of HQRNDNormal2() State structure must be initialized with HQRNDRandomize() or HQRNDSeed(). -- ALGLIB -- Copyright 02.12.2009 by Bochkanov Sergey *************************************************************************/ double hqrndnormal(const hqrndstate &state); /************************************************************************* Random number generator: random X and Y such that X^2+Y^2=1 State structure must be initialized with HQRNDRandomize() or HQRNDSeed(). -- ALGLIB -- Copyright 02.12.2009 by Bochkanov Sergey *************************************************************************/ void hqrndunit2(const hqrndstate &state, double &x, double &y); /************************************************************************* Random number generator: normal numbers This function generates two independent random numbers from normal distribution. Its performance is equal to that of HQRNDNormal() State structure must be initialized with HQRNDRandomize() or HQRNDSeed(). -- ALGLIB -- Copyright 02.12.2009 by Bochkanov Sergey *************************************************************************/ void hqrndnormal2(const hqrndstate &state, double &x1, double &x2); /************************************************************************* Random number generator: exponential distribution State structure must be initialized with HQRNDRandomize() or HQRNDSeed(). -- ALGLIB -- Copyright 11.08.2007 by Bochkanov Sergey *************************************************************************/ double hqrndexponential(const hqrndstate &state, const double lambdav); /************************************************************************* This function generates random number from discrete distribution given by finite sample X. 
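The HQRND functions above all work through an explicitly initialized state object. A short sketch (seed values and lambda are arbitrary) which draws numbers from several distributions:

#include "alglibmisc.h"
#include <cstdio>
using namespace alglib;

int main()
{
    hqrndstate state;
    hqrndseed(11, 42, state);                 // reproducible seeding; use hqrndrandomize(state) for a random seed

    double u = hqrnduniformr(state);          // uniform in (0,1), boundaries excluded
    ae_int_t i = hqrnduniformi(state, 10);    // uniform integer in [0,10)
    double n = hqrndnormal(state);            // one standard normal deviate
    double x1, x2;
    hqrndnormal2(state, x1, x2);              // two independent normal deviates
    double e = hqrndexponential(state, 2.0);  // exponential with lambda=2
    printf("%f %ld %f %f %f %f\n", u, (long)i, n, x1, x2, e);
    return 0;
}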
INPUT PARAMETERS State - high quality random number generator, must be initialized with HQRNDRandomize() or HQRNDSeed(). X - finite sample N - number of elements to use, N>=1 RESULT this function returns one of the X[i] for random i=0..N-1 -- ALGLIB -- Copyright 08.11.2011 by Bochkanov Sergey *************************************************************************/ double hqrnddiscrete(const hqrndstate &state, const real_1d_array &x, const ae_int_t n); /************************************************************************* This function generates random number from continuous distribution given by finite sample X. INPUT PARAMETERS State - high quality random number generator, must be initialized with HQRNDRandomize() or HQRNDSeed(). X - finite sample, array[N] (can be larger, in this case only leading N elements are used). THIS ARRAY MUST BE SORTED IN ASCENDING ORDER. N - number of elements to use, N>=1 RESULT this function returns random number from continuous distribution which tries to approximate X as much as possible. min(X)<=Result<=max(X). -- ALGLIB -- Copyright 08.11.2011 by Bochkanov Sergey *************************************************************************/ double hqrndcontinuous(const hqrndstate &state, const real_1d_array &x, const ae_int_t n); /************************************************************************* This is debug function intended for testing ALGLIB interface generator. Never use it in any real life project. Creates and returns XDebugRecord1 structure: * integer and complex fields of Rec1 are set to 1 and 1+i correspondingly * array field of Rec1 is set to [2,3] -- ALGLIB -- Copyright 27.05.2014 by Bochkanov Sergey *************************************************************************/ void xdebuginitrecord1(xdebugrecord1 &rec1); /************************************************************************* This is debug function intended for testing ALGLIB interface generator. Never use it in any real life project. Counts number of True values in the boolean 1D array. -- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/ ae_int_t xdebugb1count(const boolean_1d_array &a); /************************************************************************* This is debug function intended for testing ALGLIB interface generator. Never use it in any real life project. Replace all values in array by NOT(a[i]). Array is passed using "shared" convention. -- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/ void xdebugb1not(const boolean_1d_array &a); /************************************************************************* This is debug function intended for testing ALGLIB interface generator. Never use it in any real life project. Appends copy of array to itself. Array is passed using "var" convention. -- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/ void xdebugb1appendcopy(boolean_1d_array &a); /************************************************************************* This is debug function intended for testing ALGLIB interface generator. Never use it in any real life project. Generate N-element array with even-numbered elements set to True. Array is passed using "out" convention. 
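To make the difference between hqrnddiscrete() and hqrndcontinuous() concrete, here is a small sketch (sample values are invented); note that hqrndcontinuous() requires the sample to be sorted in ascending order:

#include "alglibmisc.h"
#include <cstdio>
using namespace alglib;

int main()
{
    hqrndstate state;
    hqrndrandomize(state);

    // sorted finite sample of 4 values
    real_1d_array sample = "[0.0,0.5,1.5,4.0]";

    // hqrnddiscrete() returns one of the sample values exactly
    double d = hqrnddiscrete(state, sample, 4);

    // hqrndcontinuous() returns a value between min(sample) and max(sample)
    double c = hqrndcontinuous(state, sample, 4);
    printf("discrete=%f continuous=%f\n", d, c);
    return 0;
}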
-- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/ void xdebugb1outeven(const ae_int_t n, boolean_1d_array &a); /************************************************************************* This is debug function intended for testing ALGLIB interface generator. Never use it in any real life project. Returns sum of elements in the array. -- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/ ae_int_t xdebugi1sum(const integer_1d_array &a); /************************************************************************* This is debug function intended for testing ALGLIB interface generator. Never use it in any real life project. Replace all values in array by -A[I] Array is passed using "shared" convention. -- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/ void xdebugi1neg(const integer_1d_array &a); /************************************************************************* This is debug function intended for testing ALGLIB interface generator. Never use it in any real life project. Appends copy of array to itself. Array is passed using "var" convention. -- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/ void xdebugi1appendcopy(integer_1d_array &a); /************************************************************************* This is debug function intended for testing ALGLIB interface generator. Never use it in any real life project. Generate N-element array with even-numbered A[I] set to I, and odd-numbered ones set to 0. Array is passed using "out" convention. -- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/ void xdebugi1outeven(const ae_int_t n, integer_1d_array &a); /************************************************************************* This is debug function intended for testing ALGLIB interface generator. Never use it in any real life project. Returns sum of elements in the array. -- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/ double xdebugr1sum(const real_1d_array &a); /************************************************************************* This is debug function intended for testing ALGLIB interface generator. Never use it in any real life project. Replace all values in array by -A[I] Array is passed using "shared" convention. -- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/ void xdebugr1neg(const real_1d_array &a); /************************************************************************* This is debug function intended for testing ALGLIB interface generator. Never use it in any real life project. Appends copy of array to itself. Array is passed using "var" convention. -- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/ void xdebugr1appendcopy(real_1d_array &a); /************************************************************************* This is debug function intended for testing ALGLIB interface generator. Never use it in any real life project. Generate N-element array with even-numbered A[I] set to I*0.25, and odd-numbered ones are set to 0. 
Array is passed using "out" convention. -- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/ void xdebugr1outeven(const ae_int_t n, real_1d_array &a); /************************************************************************* This is debug function intended for testing ALGLIB interface generator. Never use it in any real life project. Returns sum of elements in the array. -- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/ alglib::complex xdebugc1sum(const complex_1d_array &a); /************************************************************************* This is debug function intended for testing ALGLIB interface generator. Never use it in any real life project. Replace all values in array by -A[I] Array is passed using "shared" convention. -- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/ void xdebugc1neg(const complex_1d_array &a); /************************************************************************* This is debug function intended for testing ALGLIB interface generator. Never use it in any real life project. Appends copy of array to itself. Array is passed using "var" convention. -- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/ void xdebugc1appendcopy(complex_1d_array &a); /************************************************************************* This is debug function intended for testing ALGLIB interface generator. Never use it in any real life project. Generate N-element array with even-numbered A[K] set to (x,y) = (K*0.25, K*0.125) and odd-numbered ones are set to 0. Array is passed using "out" convention. -- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/ void xdebugc1outeven(const ae_int_t n, complex_1d_array &a); /************************************************************************* This is debug function intended for testing ALGLIB interface generator. Never use it in any real life project. Counts number of True values in the boolean 2D array. -- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/ ae_int_t xdebugb2count(const boolean_2d_array &a); /************************************************************************* This is debug function intended for testing ALGLIB interface generator. Never use it in any real life project. Replace all values in array by NOT(a[i]). Array is passed using "shared" convention. -- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/ void xdebugb2not(const boolean_2d_array &a); /************************************************************************* This is debug function intended for testing ALGLIB interface generator. Never use it in any real life project. Transposes array. Array is passed using "var" convention. -- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/ void xdebugb2transpose(boolean_2d_array &a); /************************************************************************* This is debug function intended for testing ALGLIB interface generator. Never use it in any real life project. 
Generate MxN matrix with elements set to "Sin(3*I+5*J)>0" Array is passed using "out" convention. -- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/ void xdebugb2outsin(const ae_int_t m, const ae_int_t n, boolean_2d_array &a); /************************************************************************* This is debug function intended for testing ALGLIB interface generator. Never use it in any real life project. Returns sum of elements in the array. -- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/ ae_int_t xdebugi2sum(const integer_2d_array &a); /************************************************************************* This is debug function intended for testing ALGLIB interface generator. Never use it in any real life project. Replace all values in array by -a[i,j] Array is passed using "shared" convention. -- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/ void xdebugi2neg(const integer_2d_array &a); /************************************************************************* This is debug function intended for testing ALGLIB interface generator. Never use it in any real life project. Transposes array. Array is passed using "var" convention. -- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/ void xdebugi2transpose(integer_2d_array &a); /************************************************************************* This is debug function intended for testing ALGLIB interface generator. Never use it in any real life project. Generate MxN matrix with elements set to "Sign(Sin(3*I+5*J))" Array is passed using "out" convention. -- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/ void xdebugi2outsin(const ae_int_t m, const ae_int_t n, integer_2d_array &a); /************************************************************************* This is debug function intended for testing ALGLIB interface generator. Never use it in any real life project. Returns sum of elements in the array. -- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/ double xdebugr2sum(const real_2d_array &a); /************************************************************************* This is debug function intended for testing ALGLIB interface generator. Never use it in any real life project. Replace all values in array by -a[i,j] Array is passed using "shared" convention. -- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/ void xdebugr2neg(const real_2d_array &a); /************************************************************************* This is debug function intended for testing ALGLIB interface generator. Never use it in any real life project. Transposes array. Array is passed using "var" convention. -- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/ void xdebugr2transpose(real_2d_array &a); /************************************************************************* This is debug function intended for testing ALGLIB interface generator. Never use it in any real life project. 
Generate MxN matrix with elements set to "Sin(3*I+5*J)" Array is passed using "out" convention. -- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/ void xdebugr2outsin(const ae_int_t m, const ae_int_t n, real_2d_array &a); /************************************************************************* This is debug function intended for testing ALGLIB interface generator. Never use it in any real life project. Returns sum of elements in the array. -- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/ alglib::complex xdebugc2sum(const complex_2d_array &a); /************************************************************************* This is debug function intended for testing ALGLIB interface generator. Never use it in any real life project. Replace all values in array by -a[i,j] Array is passed using "shared" convention. -- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/ void xdebugc2neg(const complex_2d_array &a); /************************************************************************* This is debug function intended for testing ALGLIB interface generator. Never use it in any real life project. Transposes array. Array is passed using "var" convention. -- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/ void xdebugc2transpose(complex_2d_array &a); /************************************************************************* This is debug function intended for testing ALGLIB interface generator. Never use it in any real life project. Generate MxN matrix with elements set to "Sin(3*I+5*J),Cos(3*I+5*J)" Array is passed using "out" convention. -- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/ void xdebugc2outsincos(const ae_int_t m, const ae_int_t n, complex_2d_array &a); /************************************************************************* This is debug function intended for testing ALGLIB interface generator. Never use it in any real life project. 
Returns sum of a[i,j]*(1+b[i,j]) such that c[i,j] is True -- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/ double xdebugmaskedbiasedproductsum(const ae_int_t m, const ae_int_t n, const real_2d_array &a, const real_2d_array &b, const boolean_2d_array &c); } ///////////////////////////////////////////////////////////////////////// // // THIS SECTION CONTAINS COMPUTATIONAL CORE DECLARATIONS (FUNCTIONS) // ///////////////////////////////////////////////////////////////////////// namespace alglib_impl { void kdtreebuild(/* Real */ ae_matrix* xy, ae_int_t n, ae_int_t nx, ae_int_t ny, ae_int_t normtype, kdtree* kdt, ae_state *_state); void kdtreebuildtagged(/* Real */ ae_matrix* xy, /* Integer */ ae_vector* tags, ae_int_t n, ae_int_t nx, ae_int_t ny, ae_int_t normtype, kdtree* kdt, ae_state *_state); void kdtreecreaterequestbuffer(kdtree* kdt, kdtreerequestbuffer* buf, ae_state *_state); ae_int_t kdtreequeryknn(kdtree* kdt, /* Real */ ae_vector* x, ae_int_t k, ae_bool selfmatch, ae_state *_state); ae_int_t kdtreetsqueryknn(kdtree* kdt, kdtreerequestbuffer* buf, /* Real */ ae_vector* x, ae_int_t k, ae_bool selfmatch, ae_state *_state); ae_int_t kdtreequeryrnn(kdtree* kdt, /* Real */ ae_vector* x, double r, ae_bool selfmatch, ae_state *_state); ae_int_t kdtreetsqueryrnn(kdtree* kdt, kdtreerequestbuffer* buf, /* Real */ ae_vector* x, double r, ae_bool selfmatch, ae_state *_state); ae_int_t kdtreequeryaknn(kdtree* kdt, /* Real */ ae_vector* x, ae_int_t k, ae_bool selfmatch, double eps, ae_state *_state); ae_int_t kdtreetsqueryaknn(kdtree* kdt, kdtreerequestbuffer* buf, /* Real */ ae_vector* x, ae_int_t k, ae_bool selfmatch, double eps, ae_state *_state); ae_int_t kdtreequerybox(kdtree* kdt, /* Real */ ae_vector* boxmin, /* Real */ ae_vector* boxmax, ae_state *_state); ae_int_t kdtreetsquerybox(kdtree* kdt, kdtreerequestbuffer* buf, /* Real */ ae_vector* boxmin, /* Real */ ae_vector* boxmax, ae_state *_state); void kdtreequeryresultsx(kdtree* kdt, /* Real */ ae_matrix* x, ae_state *_state); void kdtreequeryresultsxy(kdtree* kdt, /* Real */ ae_matrix* xy, ae_state *_state); void kdtreequeryresultstags(kdtree* kdt, /* Integer */ ae_vector* tags, ae_state *_state); void kdtreequeryresultsdistances(kdtree* kdt, /* Real */ ae_vector* r, ae_state *_state); void kdtreetsqueryresultsx(kdtree* kdt, kdtreerequestbuffer* buf, /* Real */ ae_matrix* x, ae_state *_state); void kdtreetsqueryresultsxy(kdtree* kdt, kdtreerequestbuffer* buf, /* Real */ ae_matrix* xy, ae_state *_state); void kdtreetsqueryresultstags(kdtree* kdt, kdtreerequestbuffer* buf, /* Integer */ ae_vector* tags, ae_state *_state); void kdtreetsqueryresultsdistances(kdtree* kdt, kdtreerequestbuffer* buf, /* Real */ ae_vector* r, ae_state *_state); void kdtreequeryresultsxi(kdtree* kdt, /* Real */ ae_matrix* x, ae_state *_state); void kdtreequeryresultsxyi(kdtree* kdt, /* Real */ ae_matrix* xy, ae_state *_state); void kdtreequeryresultstagsi(kdtree* kdt, /* Integer */ ae_vector* tags, ae_state *_state); void kdtreequeryresultsdistancesi(kdtree* kdt, /* Real */ ae_vector* r, ae_state *_state); void kdtreeexplorebox(kdtree* kdt, /* Real */ ae_vector* boxmin, /* Real */ ae_vector* boxmax, ae_state *_state); void kdtreeexplorenodetype(kdtree* kdt, ae_int_t node, ae_int_t* nodetype, ae_state *_state); void kdtreeexploreleaf(kdtree* kdt, ae_int_t node, /* Real */ ae_matrix* xy, ae_int_t* k, ae_state *_state); void kdtreeexploresplit(kdtree* kdt, ae_int_t node, ae_int_t* d, 
double* s, ae_int_t* nodele, ae_int_t* nodege, ae_state *_state); void kdtreealloc(ae_serializer* s, kdtree* tree, ae_state *_state); void kdtreeserialize(ae_serializer* s, kdtree* tree, ae_state *_state); void kdtreeunserialize(ae_serializer* s, kdtree* tree, ae_state *_state); void _kdtreerequestbuffer_init(void* _p, ae_state *_state); void _kdtreerequestbuffer_init_copy(void* _dst, void* _src, ae_state *_state); void _kdtreerequestbuffer_clear(void* _p); void _kdtreerequestbuffer_destroy(void* _p); void _kdtree_init(void* _p, ae_state *_state); void _kdtree_init_copy(void* _dst, void* _src, ae_state *_state); void _kdtree_clear(void* _p); void _kdtree_destroy(void* _p); void hqrndrandomize(hqrndstate* state, ae_state *_state); void hqrndseed(ae_int_t s1, ae_int_t s2, hqrndstate* state, ae_state *_state); double hqrnduniformr(hqrndstate* state, ae_state *_state); ae_int_t hqrnduniformi(hqrndstate* state, ae_int_t n, ae_state *_state); double hqrndnormal(hqrndstate* state, ae_state *_state); void hqrndunit2(hqrndstate* state, double* x, double* y, ae_state *_state); void hqrndnormal2(hqrndstate* state, double* x1, double* x2, ae_state *_state); double hqrndexponential(hqrndstate* state, double lambdav, ae_state *_state); double hqrnddiscrete(hqrndstate* state, /* Real */ ae_vector* x, ae_int_t n, ae_state *_state); double hqrndcontinuous(hqrndstate* state, /* Real */ ae_vector* x, ae_int_t n, ae_state *_state); void _hqrndstate_init(void* _p, ae_state *_state); void _hqrndstate_init_copy(void* _dst, void* _src, ae_state *_state); void _hqrndstate_clear(void* _p); void _hqrndstate_destroy(void* _p); void xdebuginitrecord1(xdebugrecord1* rec1, ae_state *_state); ae_int_t xdebugb1count(/* Boolean */ ae_vector* a, ae_state *_state); void xdebugb1not(/* Boolean */ ae_vector* a, ae_state *_state); void xdebugb1appendcopy(/* Boolean */ ae_vector* a, ae_state *_state); void xdebugb1outeven(ae_int_t n, /* Boolean */ ae_vector* a, ae_state *_state); ae_int_t xdebugi1sum(/* Integer */ ae_vector* a, ae_state *_state); void xdebugi1neg(/* Integer */ ae_vector* a, ae_state *_state); void xdebugi1appendcopy(/* Integer */ ae_vector* a, ae_state *_state); void xdebugi1outeven(ae_int_t n, /* Integer */ ae_vector* a, ae_state *_state); double xdebugr1sum(/* Real */ ae_vector* a, ae_state *_state); void xdebugr1neg(/* Real */ ae_vector* a, ae_state *_state); void xdebugr1appendcopy(/* Real */ ae_vector* a, ae_state *_state); void xdebugr1outeven(ae_int_t n, /* Real */ ae_vector* a, ae_state *_state); ae_complex xdebugc1sum(/* Complex */ ae_vector* a, ae_state *_state); void xdebugc1neg(/* Complex */ ae_vector* a, ae_state *_state); void xdebugc1appendcopy(/* Complex */ ae_vector* a, ae_state *_state); void xdebugc1outeven(ae_int_t n, /* Complex */ ae_vector* a, ae_state *_state); ae_int_t xdebugb2count(/* Boolean */ ae_matrix* a, ae_state *_state); void xdebugb2not(/* Boolean */ ae_matrix* a, ae_state *_state); void xdebugb2transpose(/* Boolean */ ae_matrix* a, ae_state *_state); void xdebugb2outsin(ae_int_t m, ae_int_t n, /* Boolean */ ae_matrix* a, ae_state *_state); ae_int_t xdebugi2sum(/* Integer */ ae_matrix* a, ae_state *_state); void xdebugi2neg(/* Integer */ ae_matrix* a, ae_state *_state); void xdebugi2transpose(/* Integer */ ae_matrix* a, ae_state *_state); void xdebugi2outsin(ae_int_t m, ae_int_t n, /* Integer */ ae_matrix* a, ae_state *_state); double xdebugr2sum(/* Real */ ae_matrix* a, ae_state *_state); void xdebugr2neg(/* Real */ ae_matrix* a, ae_state *_state); void xdebugr2transpose(/* Real 
*/ ae_matrix* a, ae_state *_state); void xdebugr2outsin(ae_int_t m, ae_int_t n, /* Real */ ae_matrix* a, ae_state *_state); ae_complex xdebugc2sum(/* Complex */ ae_matrix* a, ae_state *_state); void xdebugc2neg(/* Complex */ ae_matrix* a, ae_state *_state); void xdebugc2transpose(/* Complex */ ae_matrix* a, ae_state *_state); void xdebugc2outsincos(ae_int_t m, ae_int_t n, /* Complex */ ae_matrix* a, ae_state *_state); double xdebugmaskedbiasedproductsum(ae_int_t m, ae_int_t n, /* Real */ ae_matrix* a, /* Real */ ae_matrix* b, /* Boolean */ ae_matrix* c, ae_state *_state); void _xdebugrecord1_init(void* _p, ae_state *_state); void _xdebugrecord1_init_copy(void* _dst, void* _src, ae_state *_state); void _xdebugrecord1_clear(void* _p); void _xdebugrecord1_destroy(void* _p); } #endif cpp/src/alglibmisc.cpp0000755000175000017500000077417213105126765014652 0ustar sergeysergey/************************************************************************* ALGLIB 3.11.0 (source code generated 2017-05-11) Copyright (c) Sergey Bochkanov (ALGLIB project). >>> SOURCE LICENSE >>> This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation (www.fsf.org); either version 2 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. A copy of the GNU General Public License is available at http://www.fsf.org/licensing/licenses >>> END OF LICENSE >>> *************************************************************************/ #include "stdafx.h" #include "alglibmisc.h" // disable some irrelevant warnings #if (AE_COMPILER==AE_MSVC) #pragma warning(disable:4100) #pragma warning(disable:4127) #pragma warning(disable:4702) #pragma warning(disable:4996) #endif using namespace std; ///////////////////////////////////////////////////////////////////////// // // THIS SECTION CONTAINS IMPLEMENTATION OF C++ INTERFACE // ///////////////////////////////////////////////////////////////////////// namespace alglib { /************************************************************************* Buffer object which is used to perform nearest neighbor requests in the multithreaded mode (multiple threads working with same KD-tree object). This object should be created with KDTreeCreateBuffer(). 
*************************************************************************/ _kdtreerequestbuffer_owner::_kdtreerequestbuffer_owner() { p_struct = (alglib_impl::kdtreerequestbuffer*)alglib_impl::ae_malloc(sizeof(alglib_impl::kdtreerequestbuffer), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_kdtreerequestbuffer_init(p_struct, NULL); } _kdtreerequestbuffer_owner::_kdtreerequestbuffer_owner(const _kdtreerequestbuffer_owner &rhs) { p_struct = (alglib_impl::kdtreerequestbuffer*)alglib_impl::ae_malloc(sizeof(alglib_impl::kdtreerequestbuffer), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_kdtreerequestbuffer_init_copy(p_struct, const_cast(rhs.p_struct), NULL); } _kdtreerequestbuffer_owner& _kdtreerequestbuffer_owner::operator=(const _kdtreerequestbuffer_owner &rhs) { if( this==&rhs ) return *this; alglib_impl::_kdtreerequestbuffer_clear(p_struct); alglib_impl::_kdtreerequestbuffer_init_copy(p_struct, const_cast(rhs.p_struct), NULL); return *this; } _kdtreerequestbuffer_owner::~_kdtreerequestbuffer_owner() { alglib_impl::_kdtreerequestbuffer_clear(p_struct); ae_free(p_struct); } alglib_impl::kdtreerequestbuffer* _kdtreerequestbuffer_owner::c_ptr() { return p_struct; } alglib_impl::kdtreerequestbuffer* _kdtreerequestbuffer_owner::c_ptr() const { return const_cast(p_struct); } kdtreerequestbuffer::kdtreerequestbuffer() : _kdtreerequestbuffer_owner() { } kdtreerequestbuffer::kdtreerequestbuffer(const kdtreerequestbuffer &rhs):_kdtreerequestbuffer_owner(rhs) { } kdtreerequestbuffer& kdtreerequestbuffer::operator=(const kdtreerequestbuffer &rhs) { if( this==&rhs ) return *this; _kdtreerequestbuffer_owner::operator=(rhs); return *this; } kdtreerequestbuffer::~kdtreerequestbuffer() { } /************************************************************************* KD-tree object. *************************************************************************/ _kdtree_owner::_kdtree_owner() { p_struct = (alglib_impl::kdtree*)alglib_impl::ae_malloc(sizeof(alglib_impl::kdtree), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_kdtree_init(p_struct, NULL); } _kdtree_owner::_kdtree_owner(const _kdtree_owner &rhs) { p_struct = (alglib_impl::kdtree*)alglib_impl::ae_malloc(sizeof(alglib_impl::kdtree), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_kdtree_init_copy(p_struct, const_cast(rhs.p_struct), NULL); } _kdtree_owner& _kdtree_owner::operator=(const _kdtree_owner &rhs) { if( this==&rhs ) return *this; alglib_impl::_kdtree_clear(p_struct); alglib_impl::_kdtree_init_copy(p_struct, const_cast(rhs.p_struct), NULL); return *this; } _kdtree_owner::~_kdtree_owner() { alglib_impl::_kdtree_clear(p_struct); ae_free(p_struct); } alglib_impl::kdtree* _kdtree_owner::c_ptr() { return p_struct; } alglib_impl::kdtree* _kdtree_owner::c_ptr() const { return const_cast(p_struct); } kdtree::kdtree() : _kdtree_owner() { } kdtree::kdtree(const kdtree &rhs):_kdtree_owner(rhs) { } kdtree& kdtree::operator=(const kdtree &rhs) { if( this==&rhs ) return *this; _kdtree_owner::operator=(rhs); return *this; } kdtree::~kdtree() { } /************************************************************************* This function serializes data structure to string. 
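Before moving on to serialization, note that the owner classes just defined give kdtree and kdtreerequestbuffer ordinary value semantics: copy construction and assignment allocate a fresh alglib_impl structure and deep-copy its contents. A brief sketch of what this allows (data invented for illustration):

#include "alglibmisc.h"
using namespace alglib;

int main()
{
    real_2d_array xy = "[[0,0],[1,1],[2,2]]";
    kdtree a;
    kdtreebuild(xy, 2, 0, 2, a);

    kdtree b = a;   // deep copy via the _kdtree_owner copy constructor
    kdtree c;
    c = a;          // deep copy via operator=

    // each copy answers queries independently of the original
    real_1d_array x = "[1.1,1.1]";
    ae_int_t k = kdtreequeryknn(b, x, 1);
    return k==1 ? 0 : 1;
}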
Important properties of s_out: * it contains alphanumeric characters, dots, underscores, minus signs * these symbols are grouped into words, which are separated by spaces and Windows-style (CR+LF) newlines * although serializer uses spaces and CR+LF as separators, you can replace any separator character by arbitrary combination of spaces, tabs, Windows or Unix newlines. It allows flexible reformatting of the string in case you want to include it into text or XML file. But you should not insert separators into the middle of the "words" nor you should change case of letters. * s_out can be freely moved between 32-bit and 64-bit systems, little and big endian machines, and so on. You can serialize structure on 32-bit machine and unserialize it on 64-bit one (or vice versa), or serialize it on SPARC and unserialize on x86. You can also serialize it in C++ version of ALGLIB and unserialize in C# one, and vice versa. *************************************************************************/ void kdtreeserialize(kdtree &obj, std::string &s_out) { alglib_impl::ae_state state; alglib_impl::ae_serializer serializer; alglib_impl::ae_int_t ssize; alglib_impl::ae_state_init(&state); try { alglib_impl::ae_serializer_init(&serializer); alglib_impl::ae_serializer_alloc_start(&serializer); alglib_impl::kdtreealloc(&serializer, obj.c_ptr(), &state); ssize = alglib_impl::ae_serializer_get_alloc_size(&serializer); s_out.clear(); s_out.reserve((size_t)(ssize+1)); alglib_impl::ae_serializer_sstart_str(&serializer, &s_out); alglib_impl::kdtreeserialize(&serializer, obj.c_ptr(), &state); alglib_impl::ae_serializer_stop(&serializer, &state); if( s_out.length()>(size_t)ssize ) throw ap_error("ALGLIB: serialization integrity error"); alglib_impl::ae_serializer_clear(&serializer); alglib_impl::ae_state_clear(&state); } catch(alglib_impl::ae_error_type) { throw ap_error(state.error_msg); } } /************************************************************************* This function unserializes data structure from string. *************************************************************************/ void kdtreeunserialize(const std::string &s_in, kdtree &obj) { alglib_impl::ae_state state; alglib_impl::ae_serializer serializer; alglib_impl::ae_state_init(&state); try { alglib_impl::ae_serializer_init(&serializer); alglib_impl::ae_serializer_ustart_str(&serializer, &s_in); alglib_impl::kdtreeunserialize(&serializer, obj.c_ptr(), &state); alglib_impl::ae_serializer_stop(&serializer, &state); alglib_impl::ae_serializer_clear(&serializer); alglib_impl::ae_state_clear(&state); } catch(alglib_impl::ae_error_type) { throw ap_error(state.error_msg); } } /************************************************************************* This function serializes data structure to C++ stream. Data stream generated by this function is same as string representation generated by string version of serializer - alphanumeric characters, dots, underscores, minus signs, which are grouped into words separated by spaces and CR+LF. We recommend you to read comments on string version of serializer to find out more about serialization of AlGLIB objects. 
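A minimal round-trip sketch of the string serializer described above (dataset and variable names are invented): the tree is converted to a portable ASCII string and an equivalent tree is restored from it.

#include "alglibmisc.h"
#include <string>
using namespace alglib;

int main()
{
    real_2d_array xy = "[[0,0],[0,1],[1,0],[1,1]]";
    kdtree original, restored;
    kdtreebuild(xy, 2, 0, 2, original);

    // serialize to a string which can be stored or moved between platforms/languages
    std::string s;
    kdtreeserialize(original, s);

    // restore an equivalent tree and query it
    kdtreeunserialize(s, restored);
    real_1d_array x = "[0.9,0.9]";
    ae_int_t k = kdtreequeryknn(restored, x, 1);
    return k==1 ? 0 : 1;
}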
*************************************************************************/ void kdtreeserialize(kdtree &obj, std::ostream &s_out) { alglib_impl::ae_state state; alglib_impl::ae_serializer serializer; alglib_impl::ae_state_init(&state); try { alglib_impl::ae_serializer_init(&serializer); alglib_impl::ae_serializer_alloc_start(&serializer); alglib_impl::kdtreealloc(&serializer, obj.c_ptr(), &state); alglib_impl::ae_serializer_get_alloc_size(&serializer); // not actually needed, but we have to ask alglib_impl::ae_serializer_sstart_stream(&serializer, &s_out); alglib_impl::kdtreeserialize(&serializer, obj.c_ptr(), &state); alglib_impl::ae_serializer_stop(&serializer, &state); alglib_impl::ae_serializer_clear(&serializer); alglib_impl::ae_state_clear(&state); } catch(alglib_impl::ae_error_type) { throw ap_error(state.error_msg); } } /************************************************************************* This function unserializes data structure from stream. *************************************************************************/ void kdtreeunserialize(const std::istream &s_in, kdtree &obj) { alglib_impl::ae_state state; alglib_impl::ae_serializer serializer; alglib_impl::ae_state_init(&state); try { alglib_impl::ae_serializer_init(&serializer); alglib_impl::ae_serializer_ustart_stream(&serializer, &s_in); alglib_impl::kdtreeunserialize(&serializer, obj.c_ptr(), &state); alglib_impl::ae_serializer_stop(&serializer, &state); alglib_impl::ae_serializer_clear(&serializer); alglib_impl::ae_state_clear(&state); } catch(alglib_impl::ae_error_type) { throw ap_error(state.error_msg); } } /************************************************************************* KD-tree creation This subroutine creates KD-tree from set of X-values and optional Y-values INPUT PARAMETERS XY - dataset, array[0..N-1,0..NX+NY-1]. one row corresponds to one point. first NX columns contain X-values, next NY (NY may be zero) columns may contain associated Y-values N - number of points, N>=0. NX - space dimension, NX>=1. NY - number of optional Y-values, NY>=0. NormType- norm type: * 0 denotes infinity-norm * 1 denotes 1-norm * 2 denotes 2-norm (Euclidean norm) OUTPUT PARAMETERS KDT - KD-tree NOTES 1. KD-tree creation have O(N*logN) complexity and O(N*(2*NX+NY)) memory requirements. 2. Although KD-trees may be used with any combination of N and NX, they are more efficient than brute-force search only when N >> 4^NX. So they are most useful in low-dimensional tasks (NX=2, NX=3). NX=1 is another inefficient case, because simple binary search (without additional structures) is much more efficient in such tasks than KD-trees. -- ALGLIB -- Copyright 28.02.2010 by Bochkanov Sergey *************************************************************************/ void kdtreebuild(const real_2d_array &xy, const ae_int_t n, const ae_int_t nx, const ae_int_t ny, const ae_int_t normtype, kdtree &kdt) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::kdtreebuild(const_cast(xy.c_ptr()), n, nx, ny, normtype, const_cast(kdt.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* KD-tree creation This subroutine creates KD-tree from set of X-values and optional Y-values INPUT PARAMETERS XY - dataset, array[0..N-1,0..NX+NY-1]. one row corresponds to one point. 
first NX columns contain X-values, next NY (NY may be zero) columns may contain associated Y-values N - number of points, N>=0. NX - space dimension, NX>=1. NY - number of optional Y-values, NY>=0. NormType- norm type: * 0 denotes infinity-norm * 1 denotes 1-norm * 2 denotes 2-norm (Euclidean norm) OUTPUT PARAMETERS KDT - KD-tree NOTES 1. KD-tree creation have O(N*logN) complexity and O(N*(2*NX+NY)) memory requirements. 2. Although KD-trees may be used with any combination of N and NX, they are more efficient than brute-force search only when N >> 4^NX. So they are most useful in low-dimensional tasks (NX=2, NX=3). NX=1 is another inefficient case, because simple binary search (without additional structures) is much more efficient in such tasks than KD-trees. -- ALGLIB -- Copyright 28.02.2010 by Bochkanov Sergey *************************************************************************/ void kdtreebuild(const real_2d_array &xy, const ae_int_t nx, const ae_int_t ny, const ae_int_t normtype, kdtree &kdt) { alglib_impl::ae_state _alglib_env_state; ae_int_t n; n = xy.rows(); alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::kdtreebuild(const_cast(xy.c_ptr()), n, nx, ny, normtype, const_cast(kdt.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* KD-tree creation This subroutine creates KD-tree from set of X-values, integer tags and optional Y-values INPUT PARAMETERS XY - dataset, array[0..N-1,0..NX+NY-1]. one row corresponds to one point. first NX columns contain X-values, next NY (NY may be zero) columns may contain associated Y-values Tags - tags, array[0..N-1], contains integer tags associated with points. N - number of points, N>=0 NX - space dimension, NX>=1. NY - number of optional Y-values, NY>=0. NormType- norm type: * 0 denotes infinity-norm * 1 denotes 1-norm * 2 denotes 2-norm (Euclidean norm) OUTPUT PARAMETERS KDT - KD-tree NOTES 1. KD-tree creation have O(N*logN) complexity and O(N*(2*NX+NY)) memory requirements. 2. Although KD-trees may be used with any combination of N and NX, they are more efficient than brute-force search only when N >> 4^NX. So they are most useful in low-dimensional tasks (NX=2, NX=3). NX=1 is another inefficient case, because simple binary search (without additional structures) is much more efficient in such tasks than KD-trees. -- ALGLIB -- Copyright 28.02.2010 by Bochkanov Sergey *************************************************************************/ void kdtreebuildtagged(const real_2d_array &xy, const integer_1d_array &tags, const ae_int_t n, const ae_int_t nx, const ae_int_t ny, const ae_int_t normtype, kdtree &kdt) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::kdtreebuildtagged(const_cast(xy.c_ptr()), const_cast(tags.c_ptr()), n, nx, ny, normtype, const_cast(kdt.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* KD-tree creation This subroutine creates KD-tree from set of X-values, integer tags and optional Y-values INPUT PARAMETERS XY - dataset, array[0..N-1,0..NX+NY-1]. one row corresponds to one point. 
first NX columns contain X-values, next NY (NY may be zero) columns may contain associated Y-values Tags - tags, array[0..N-1], contains integer tags associated with points. N - number of points, N>=0 NX - space dimension, NX>=1. NY - number of optional Y-values, NY>=0. NormType- norm type: * 0 denotes infinity-norm * 1 denotes 1-norm * 2 denotes 2-norm (Euclidean norm) OUTPUT PARAMETERS KDT - KD-tree NOTES 1. KD-tree creation has O(N*logN) complexity and O(N*(2*NX+NY)) memory requirements. 2. Although KD-trees may be used with any combination of N and NX, they are more efficient than brute-force search only when N >> 4^NX. So they are most useful in low-dimensional tasks (NX=2, NX=3). NX=1 is another inefficient case, because simple binary search (without additional structures) is much more efficient in such tasks than KD-trees. -- ALGLIB -- Copyright 28.02.2010 by Bochkanov Sergey *************************************************************************/ void kdtreebuildtagged(const real_2d_array &xy, const integer_1d_array &tags, const ae_int_t nx, const ae_int_t ny, const ae_int_t normtype, kdtree &kdt) { alglib_impl::ae_state _alglib_env_state; ae_int_t n; if( (xy.rows()!=tags.length())) throw ap_error("Error while calling 'kdtreebuildtagged': looks like one of arguments has wrong size"); n = xy.rows(); alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::kdtreebuildtagged(const_cast(xy.c_ptr()), const_cast(tags.c_ptr()), n, nx, ny, normtype, const_cast(kdt.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function creates buffer structure which can be used to perform parallel KD-tree requests. KD-tree subpackage provides two sets of request functions - ones which use internal buffer of KD-tree object (these functions are single-threaded because they use same buffer, which cannot be shared between threads), and ones which use external buffer. This function is used to initialize external buffer. INPUT PARAMETERS KDT - KD-tree which is associated with newly created buffer OUTPUT PARAMETERS Buf - external buffer. IMPORTANT: KD-tree buffer should be used only with KD-tree object which was used to initialize buffer. Any attempt to use the buffer with a different object is dangerous - you may get integrity check failure (exception) because sizes of internal arrays do not fit to dimensions of KD-tree structure. -- ALGLIB -- Copyright 18.03.2016 by Bochkanov Sergey *************************************************************************/ void kdtreecreaterequestbuffer(const kdtree &kdt, kdtreerequestbuffer &buf) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::kdtreecreaterequestbuffer(const_cast(kdt.c_ptr()), const_cast(buf.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* K-NN query: K nearest neighbors IMPORTANT: this function can not be used in multithreaded code because it uses internal temporary buffer of kd-tree object, which can not be shared between multiple threads. If you want to perform parallel requests, use function which uses external request buffer: KDTreeTsQueryKNN() ("Ts" stands for "thread-safe"). 
INPUT PARAMETERS KDT - KD-tree X - point, array[0..NX-1]. K - number of neighbors to return, K>=1 SelfMatch - whether self-matches are allowed: * if True, nearest neighbor may be the point itself (if it exists in original dataset) * if False, then only points with non-zero distance are returned * if not given, considered True RESULT number of actual neighbors found (either K or N, if K>N). This subroutine performs query and stores its result in the internal structures of the KD-tree. You can use following subroutines to obtain these results: * KDTreeQueryResultsX() to get X-values * KDTreeQueryResultsXY() to get X- and Y-values * KDTreeQueryResultsTags() to get tag values * KDTreeQueryResultsDistances() to get distances -- ALGLIB -- Copyright 28.02.2010 by Bochkanov Sergey *************************************************************************/ ae_int_t kdtreequeryknn(const kdtree &kdt, const real_1d_array &x, const ae_int_t k, const bool selfmatch) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::ae_int_t result = alglib_impl::kdtreequeryknn(const_cast(kdt.c_ptr()), const_cast(x.c_ptr()), k, selfmatch, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* K-NN query: K nearest neighbors IMPORTANT: this function can not be used in multithreaded code because it uses internal temporary buffer of kd-tree object, which can not be shared between multiple threads. If you want to perform parallel requests, use function which uses external request buffer: KDTreeTsQueryKNN() ("Ts" stands for "thread-safe"). INPUT PARAMETERS KDT - KD-tree X - point, array[0..NX-1]. K - number of neighbors to return, K>=1 SelfMatch - whether self-matches are allowed: * if True, nearest neighbor may be the point itself (if it exists in original dataset) * if False, then only points with non-zero distance are returned * if not given, considered True RESULT number of actual neighbors found (either K or N, if K>N). This subroutine performs query and stores its result in the internal structures of the KD-tree. You can use following subroutines to obtain these results: * KDTreeQueryResultsX() to get X-values * KDTreeQueryResultsXY() to get X- and Y-values * KDTreeQueryResultsTags() to get tag values * KDTreeQueryResultsDistances() to get distances -- ALGLIB -- Copyright 28.02.2010 by Bochkanov Sergey *************************************************************************/ ae_int_t kdtreequeryknn(const kdtree &kdt, const real_1d_array &x, const ae_int_t k) { alglib_impl::ae_state _alglib_env_state; bool selfmatch; selfmatch = true; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::ae_int_t result = alglib_impl::kdtreequeryknn(const_cast(kdt.c_ptr()), const_cast(x.c_ptr()), k, selfmatch, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* K-NN query: K nearest neighbors, using external thread-local buffer. You can call this function from multiple threads for same kd-tree instance, assuming that different instances of buffer object are passed to different threads. 
INPUT PARAMETERS KDT - kd-tree Buf - request buffer object created for this particular instance of kd-tree structure with kdtreecreaterequestbuffer() function. X - point, array[0..NX-1]. K - number of neighbors to return, K>=1 SelfMatch - whether self-matches are allowed: * if True, nearest neighbor may be the point itself (if it exists in original dataset) * if False, then only points with non-zero distance are returned * if not given, considered True RESULT number of actual neighbors found (either K or N, if K>N). This subroutine performs query and stores its result in the internal structures of the buffer object. You can use following subroutines to obtain these results (pay attention to "buf" in their names): * KDTreeTsQueryResultsX() to get X-values * KDTreeTsQueryResultsXY() to get X- and Y-values * KDTreeTsQueryResultsTags() to get tag values * KDTreeTsQueryResultsDistances() to get distances IMPORTANT: kd-tree buffer should be used only with KD-tree object which was used to initialize buffer. Any attempt to use biffer with different object is dangerous - you may get integrity check failure (exception) because sizes of internal arrays do not fit to dimensions of KD-tree structure. -- ALGLIB -- Copyright 18.03.2016 by Bochkanov Sergey *************************************************************************/ ae_int_t kdtreetsqueryknn(const kdtree &kdt, const kdtreerequestbuffer &buf, const real_1d_array &x, const ae_int_t k, const bool selfmatch) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::ae_int_t result = alglib_impl::kdtreetsqueryknn(const_cast(kdt.c_ptr()), const_cast(buf.c_ptr()), const_cast(x.c_ptr()), k, selfmatch, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* K-NN query: K nearest neighbors, using external thread-local buffer. You can call this function from multiple threads for same kd-tree instance, assuming that different instances of buffer object are passed to different threads. INPUT PARAMETERS KDT - kd-tree Buf - request buffer object created for this particular instance of kd-tree structure with kdtreecreaterequestbuffer() function. X - point, array[0..NX-1]. K - number of neighbors to return, K>=1 SelfMatch - whether self-matches are allowed: * if True, nearest neighbor may be the point itself (if it exists in original dataset) * if False, then only points with non-zero distance are returned * if not given, considered True RESULT number of actual neighbors found (either K or N, if K>N). This subroutine performs query and stores its result in the internal structures of the buffer object. You can use following subroutines to obtain these results (pay attention to "buf" in their names): * KDTreeTsQueryResultsX() to get X-values * KDTreeTsQueryResultsXY() to get X- and Y-values * KDTreeTsQueryResultsTags() to get tag values * KDTreeTsQueryResultsDistances() to get distances IMPORTANT: kd-tree buffer should be used only with KD-tree object which was used to initialize buffer. Any attempt to use biffer with different object is dangerous - you may get integrity check failure (exception) because sizes of internal arrays do not fit to dimensions of KD-tree structure. 
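Putting the request-buffer machinery together, here is a minimal sketch of parallel queries (it assumes a C++11 compiler for std::thread; thread count and data are invented). Each worker owns its own kdtreerequestbuffer, so the shared kd-tree can be queried concurrently:

#include "alglibmisc.h"
#include <thread>
#include <vector>
using namespace alglib;

int main()
{
    // shared, read-only kd-tree
    real_2d_array xy = "[[0,0],[0,1],[1,0],[1,1],[2,2]]";
    kdtree kdt;
    kdtreebuild(xy, 2, 0, 2, kdt);

    // one request buffer per worker, all created from the same tree
    const int nworkers = 2;
    std::vector<kdtreerequestbuffer> bufs(nworkers);
    for(int i=0; i<nworkers; i++)
        kdtreecreaterequestbuffer(kdt, bufs[i]);

    std::vector<std::thread> workers;
    for(int i=0; i<nworkers; i++)
        workers.emplace_back([&kdt, &bufs, i]() {
            real_1d_array x = "[0.5,0.5]";
            ae_int_t k = kdtreetsqueryknn(kdt, bufs[i], x, 2);
            real_2d_array rx;
            kdtreetsqueryresultsx(kdt, bufs[i], rx);   // results live in this worker's buffer
            (void)k;
        });
    for(size_t i=0; i<workers.size(); i++)
        workers[i].join();
    return 0;
}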
-- ALGLIB -- Copyright 18.03.2016 by Bochkanov Sergey *************************************************************************/ ae_int_t kdtreetsqueryknn(const kdtree &kdt, const kdtreerequestbuffer &buf, const real_1d_array &x, const ae_int_t k) { alglib_impl::ae_state _alglib_env_state; bool selfmatch; selfmatch = true; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::ae_int_t result = alglib_impl::kdtreetsqueryknn(const_cast(kdt.c_ptr()), const_cast(buf.c_ptr()), const_cast(x.c_ptr()), k, selfmatch, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* R-NN query: all points within R-sphere centered at X IMPORTANT: this function can not be used in multithreaded code because it uses internal temporary buffer of kd-tree object, which can not be shared between multiple threads. If you want to perform parallel requests, use function which uses external request buffer: KDTreeTsQueryRNN() ("Ts" stands for "thread-safe"). INPUT PARAMETERS KDT - KD-tree X - point, array[0..NX-1]. R - radius of sphere (in corresponding norm), R>0 SelfMatch - whether self-matches are allowed: * if True, nearest neighbor may be the point itself (if it exists in original dataset) * if False, then only points with non-zero distance are returned * if not given, considered True RESULT number of neighbors found, >=0 This subroutine performs query and stores its result in the internal structures of the KD-tree. You can use following subroutines to obtain actual results: * KDTreeQueryResultsX() to get X-values * KDTreeQueryResultsXY() to get X- and Y-values * KDTreeQueryResultsTags() to get tag values * KDTreeQueryResultsDistances() to get distances -- ALGLIB -- Copyright 28.02.2010 by Bochkanov Sergey *************************************************************************/ ae_int_t kdtreequeryrnn(const kdtree &kdt, const real_1d_array &x, const double r, const bool selfmatch) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::ae_int_t result = alglib_impl::kdtreequeryrnn(const_cast(kdt.c_ptr()), const_cast(x.c_ptr()), r, selfmatch, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* R-NN query: all points within R-sphere centered at X IMPORTANT: this function can not be used in multithreaded code because it uses internal temporary buffer of kd-tree object, which can not be shared between multiple threads. If you want to perform parallel requests, use function which uses external request buffer: KDTreeTsQueryRNN() ("Ts" stands for "thread-safe"). INPUT PARAMETERS KDT - KD-tree X - point, array[0..NX-1]. R - radius of sphere (in corresponding norm), R>0 SelfMatch - whether self-matches are allowed: * if True, nearest neighbor may be the point itself (if it exists in original dataset) * if False, then only points with non-zero distance are returned * if not given, considered True RESULT number of neighbors found, >=0 This subroutine performs query and stores its result in the internal structures of the KD-tree. 
You can use following subroutines to obtain actual results: * KDTreeQueryResultsX() to get X-values * KDTreeQueryResultsXY() to get X- and Y-values * KDTreeQueryResultsTags() to get tag values * KDTreeQueryResultsDistances() to get distances -- ALGLIB -- Copyright 28.02.2010 by Bochkanov Sergey *************************************************************************/ ae_int_t kdtreequeryrnn(const kdtree &kdt, const real_1d_array &x, const double r) { alglib_impl::ae_state _alglib_env_state; bool selfmatch; selfmatch = true; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::ae_int_t result = alglib_impl::kdtreequeryrnn(const_cast(kdt.c_ptr()), const_cast(x.c_ptr()), r, selfmatch, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* R-NN query: all points within R-sphere centered at X, using external thread-local buffer. You can call this function from multiple threads for same kd-tree instance, assuming that different instances of buffer object are passed to different threads. INPUT PARAMETERS KDT - KD-tree Buf - request buffer object created for this particular instance of kd-tree structure with kdtreecreaterequestbuffer() function. X - point, array[0..NX-1]. R - radius of sphere (in corresponding norm), R>0 SelfMatch - whether self-matches are allowed: * if True, nearest neighbor may be the point itself (if it exists in original dataset) * if False, then only points with non-zero distance are returned * if not given, considered True RESULT number of neighbors found, >=0 This subroutine performs query and stores its result in the internal structures of the buffer object. You can use following subroutines to obtain these results (pay attention to "buf" in their names): * KDTreeTsQueryResultsX() to get X-values * KDTreeTsQueryResultsXY() to get X- and Y-values * KDTreeTsQueryResultsTags() to get tag values * KDTreeTsQueryResultsDistances() to get distances IMPORTANT: kd-tree buffer should be used only with KD-tree object which was used to initialize buffer. Any attempt to use biffer with different object is dangerous - you may get integrity check failure (exception) because sizes of internal arrays do not fit to dimensions of KD-tree structure. -- ALGLIB -- Copyright 18.03.2016 by Bochkanov Sergey *************************************************************************/ ae_int_t kdtreetsqueryrnn(const kdtree &kdt, const kdtreerequestbuffer &buf, const real_1d_array &x, const double r, const bool selfmatch) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::ae_int_t result = alglib_impl::kdtreetsqueryrnn(const_cast(kdt.c_ptr()), const_cast(buf.c_ptr()), const_cast(x.c_ptr()), r, selfmatch, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* R-NN query: all points within R-sphere centered at X, using external thread-local buffer. You can call this function from multiple threads for same kd-tree instance, assuming that different instances of buffer object are passed to different threads. 
INPUT PARAMETERS KDT - KD-tree Buf - request buffer object created for this particular instance of kd-tree structure with kdtreecreaterequestbuffer() function. X - point, array[0..NX-1]. R - radius of sphere (in corresponding norm), R>0 SelfMatch - whether self-matches are allowed: * if True, nearest neighbor may be the point itself (if it exists in original dataset) * if False, then only points with non-zero distance are returned * if not given, considered True RESULT number of neighbors found, >=0 This subroutine performs query and stores its result in the internal structures of the buffer object. You can use following subroutines to obtain these results (pay attention to "buf" in their names): * KDTreeTsQueryResultsX() to get X-values * KDTreeTsQueryResultsXY() to get X- and Y-values * KDTreeTsQueryResultsTags() to get tag values * KDTreeTsQueryResultsDistances() to get distances IMPORTANT: kd-tree buffer should be used only with KD-tree object which was used to initialize buffer. Any attempt to use biffer with different object is dangerous - you may get integrity check failure (exception) because sizes of internal arrays do not fit to dimensions of KD-tree structure. -- ALGLIB -- Copyright 18.03.2016 by Bochkanov Sergey *************************************************************************/ ae_int_t kdtreetsqueryrnn(const kdtree &kdt, const kdtreerequestbuffer &buf, const real_1d_array &x, const double r) { alglib_impl::ae_state _alglib_env_state; bool selfmatch; selfmatch = true; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::ae_int_t result = alglib_impl::kdtreetsqueryrnn(const_cast(kdt.c_ptr()), const_cast(buf.c_ptr()), const_cast(x.c_ptr()), r, selfmatch, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* K-NN query: approximate K nearest neighbors IMPORTANT: this function can not be used in multithreaded code because it uses internal temporary buffer of kd-tree object, which can not be shared between multiple threads. If you want to perform parallel requests, use function which uses external request buffer: KDTreeTsQueryAKNN() ("Ts" stands for "thread-safe"). INPUT PARAMETERS KDT - KD-tree X - point, array[0..NX-1]. K - number of neighbors to return, K>=1 SelfMatch - whether self-matches are allowed: * if True, nearest neighbor may be the point itself (if it exists in original dataset) * if False, then only points with non-zero distance are returned * if not given, considered True Eps - approximation factor, Eps>=0. eps-approximate nearest neighbor is a neighbor whose distance from X is at most (1+eps) times distance of true nearest neighbor. RESULT number of actual neighbors found (either K or N, if K>N). NOTES significant performance gain may be achieved only when Eps is is on the order of magnitude of 1 or larger. This subroutine performs query and stores its result in the internal structures of the KD-tree. 
You can use following subroutines to obtain these results: * KDTreeQueryResultsX() to get X-values * KDTreeQueryResultsXY() to get X- and Y-values * KDTreeQueryResultsTags() to get tag values * KDTreeQueryResultsDistances() to get distances -- ALGLIB -- Copyright 28.02.2010 by Bochkanov Sergey *************************************************************************/ ae_int_t kdtreequeryaknn(const kdtree &kdt, const real_1d_array &x, const ae_int_t k, const bool selfmatch, const double eps) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::ae_int_t result = alglib_impl::kdtreequeryaknn(const_cast(kdt.c_ptr()), const_cast(x.c_ptr()), k, selfmatch, eps, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* K-NN query: approximate K nearest neighbors IMPORTANT: this function can not be used in multithreaded code because it uses internal temporary buffer of kd-tree object, which can not be shared between multiple threads. If you want to perform parallel requests, use function which uses external request buffer: KDTreeTsQueryAKNN() ("Ts" stands for "thread-safe"). INPUT PARAMETERS KDT - KD-tree X - point, array[0..NX-1]. K - number of neighbors to return, K>=1 SelfMatch - whether self-matches are allowed: * if True, nearest neighbor may be the point itself (if it exists in original dataset) * if False, then only points with non-zero distance are returned * if not given, considered True Eps - approximation factor, Eps>=0. eps-approximate nearest neighbor is a neighbor whose distance from X is at most (1+eps) times distance of true nearest neighbor. RESULT number of actual neighbors found (either K or N, if K>N). NOTES significant performance gain may be achieved only when Eps is is on the order of magnitude of 1 or larger. This subroutine performs query and stores its result in the internal structures of the KD-tree. You can use following subroutines to obtain these results: * KDTreeQueryResultsX() to get X-values * KDTreeQueryResultsXY() to get X- and Y-values * KDTreeQueryResultsTags() to get tag values * KDTreeQueryResultsDistances() to get distances -- ALGLIB -- Copyright 28.02.2010 by Bochkanov Sergey *************************************************************************/ ae_int_t kdtreequeryaknn(const kdtree &kdt, const real_1d_array &x, const ae_int_t k, const double eps) { alglib_impl::ae_state _alglib_env_state; bool selfmatch; selfmatch = true; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::ae_int_t result = alglib_impl::kdtreequeryaknn(const_cast(kdt.c_ptr()), const_cast(x.c_ptr()), k, selfmatch, eps, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* K-NN query: approximate K nearest neighbors, using thread-local buffer. You can call this function from multiple threads for same kd-tree instance, assuming that different instances of buffer object are passed to different threads. INPUT PARAMETERS KDT - KD-tree Buf - request buffer object created for this particular instance of kd-tree structure with kdtreecreaterequestbuffer() function. X - point, array[0..NX-1]. 
K - number of neighbors to return, K>=1 SelfMatch - whether self-matches are allowed: * if True, nearest neighbor may be the point itself (if it exists in original dataset) * if False, then only points with non-zero distance are returned * if not given, considered True Eps - approximation factor, Eps>=0. eps-approximate nearest neighbor is a neighbor whose distance from X is at most (1+eps) times distance of true nearest neighbor. RESULT number of actual neighbors found (either K or N, if K>N). NOTES significant performance gain may be achieved only when Eps is is on the order of magnitude of 1 or larger. This subroutine performs query and stores its result in the internal structures of the buffer object. You can use following subroutines to obtain these results (pay attention to "buf" in their names): * KDTreeTsQueryResultsX() to get X-values * KDTreeTsQueryResultsXY() to get X- and Y-values * KDTreeTsQueryResultsTags() to get tag values * KDTreeTsQueryResultsDistances() to get distances IMPORTANT: kd-tree buffer should be used only with KD-tree object which was used to initialize buffer. Any attempt to use biffer with different object is dangerous - you may get integrity check failure (exception) because sizes of internal arrays do not fit to dimensions of KD-tree structure. -- ALGLIB -- Copyright 18.03.2016 by Bochkanov Sergey *************************************************************************/ ae_int_t kdtreetsqueryaknn(const kdtree &kdt, const kdtreerequestbuffer &buf, const real_1d_array &x, const ae_int_t k, const bool selfmatch, const double eps) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::ae_int_t result = alglib_impl::kdtreetsqueryaknn(const_cast(kdt.c_ptr()), const_cast(buf.c_ptr()), const_cast(x.c_ptr()), k, selfmatch, eps, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* K-NN query: approximate K nearest neighbors, using thread-local buffer. You can call this function from multiple threads for same kd-tree instance, assuming that different instances of buffer object are passed to different threads. INPUT PARAMETERS KDT - KD-tree Buf - request buffer object created for this particular instance of kd-tree structure with kdtreecreaterequestbuffer() function. X - point, array[0..NX-1]. K - number of neighbors to return, K>=1 SelfMatch - whether self-matches are allowed: * if True, nearest neighbor may be the point itself (if it exists in original dataset) * if False, then only points with non-zero distance are returned * if not given, considered True Eps - approximation factor, Eps>=0. eps-approximate nearest neighbor is a neighbor whose distance from X is at most (1+eps) times distance of true nearest neighbor. RESULT number of actual neighbors found (either K or N, if K>N). NOTES significant performance gain may be achieved only when Eps is is on the order of magnitude of 1 or larger. This subroutine performs query and stores its result in the internal structures of the buffer object. 
You can use following subroutines to obtain these results (pay attention to "buf" in their names): * KDTreeTsQueryResultsX() to get X-values * KDTreeTsQueryResultsXY() to get X- and Y-values * KDTreeTsQueryResultsTags() to get tag values * KDTreeTsQueryResultsDistances() to get distances IMPORTANT: kd-tree buffer should be used only with KD-tree object which was used to initialize buffer. Any attempt to use biffer with different object is dangerous - you may get integrity check failure (exception) because sizes of internal arrays do not fit to dimensions of KD-tree structure. -- ALGLIB -- Copyright 18.03.2016 by Bochkanov Sergey *************************************************************************/ ae_int_t kdtreetsqueryaknn(const kdtree &kdt, const kdtreerequestbuffer &buf, const real_1d_array &x, const ae_int_t k, const double eps) { alglib_impl::ae_state _alglib_env_state; bool selfmatch; selfmatch = true; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::ae_int_t result = alglib_impl::kdtreetsqueryaknn(const_cast(kdt.c_ptr()), const_cast(buf.c_ptr()), const_cast(x.c_ptr()), k, selfmatch, eps, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Box query: all points within user-specified box. IMPORTANT: this function can not be used in multithreaded code because it uses internal temporary buffer of kd-tree object, which can not be shared between multiple threads. If you want to perform parallel requests, use function which uses external request buffer: KDTreeTsQueryBox() ("Ts" stands for "thread-safe"). INPUT PARAMETERS KDT - KD-tree BoxMin - lower bounds, array[0..NX-1]. BoxMax - upper bounds, array[0..NX-1]. RESULT number of actual neighbors found (in [0,N]). This subroutine performs query and stores its result in the internal structures of the KD-tree. You can use following subroutines to obtain these results: * KDTreeQueryResultsX() to get X-values * KDTreeQueryResultsXY() to get X- and Y-values * KDTreeQueryResultsTags() to get tag values * KDTreeQueryResultsDistances() returns zeros for this request NOTE: this particular query returns unordered results, because there is no meaningful way of ordering points. Furthermore, no 'distance' is associated with points - it is either INSIDE or OUTSIDE (so request for distances will return zeros). -- ALGLIB -- Copyright 14.05.2016 by Bochkanov Sergey *************************************************************************/ ae_int_t kdtreequerybox(const kdtree &kdt, const real_1d_array &boxmin, const real_1d_array &boxmax) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::ae_int_t result = alglib_impl::kdtreequerybox(const_cast(kdt.c_ptr()), const_cast(boxmin.c_ptr()), const_cast(boxmax.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Box query: all points within user-specified box, using thread-local buffer. You can call this function from multiple threads for same kd-tree instance, assuming that different instances of buffer object are passed to different threads. 
INPUT PARAMETERS KDT - KD-tree Buf - request buffer object created for this particular instance of kd-tree structure with kdtreecreaterequestbuffer() function. BoxMin - lower bounds, array[0..NX-1]. BoxMax - upper bounds, array[0..NX-1]. RESULT number of actual neighbors found (in [0,N]). This subroutine performs query and stores its result in the internal structures of the buffer object. You can use following subroutines to obtain these results (pay attention to "Ts" in their names): * KDTreeTsQueryResultsX() to get X-values * KDTreeTsQueryResultsXY() to get X- and Y-values * KDTreeTsQueryResultsTags() to get tag values * KDTreeTsQueryResultsDistances() returns zeros for this query NOTE: this particular query returns unordered results, because there is no meaningful way of ordering points. Furthermore, no 'distance' is associated with points - it is either INSIDE or OUTSIDE (so request for distances will return zeros). IMPORTANT: kd-tree buffer should be used only with KD-tree object which was used to initialize buffer. Any attempt to use buffer with different object is dangerous - you may get integrity check failure (exception) because sizes of internal arrays do not fit to dimensions of KD-tree structure. -- ALGLIB -- Copyright 14.05.2016 by Bochkanov Sergey *************************************************************************/
ae_int_t kdtreetsquerybox(const kdtree &kdt, const kdtreerequestbuffer &buf, const real_1d_array &boxmin, const real_1d_array &boxmax) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::ae_int_t result = alglib_impl::kdtreetsquerybox(const_cast(kdt.c_ptr()), const_cast(buf.c_ptr()), const_cast(boxmin.c_ptr()), const_cast(boxmax.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } }
/************************************************************************* X-values from last query. This function returns results stored in the internal buffer of kd-tree object. If you performed buffered requests (ones which use instances of kdtreerequestbuffer class), you should call buffered version of this function - kdtreetsqueryresultsx(). INPUT PARAMETERS KDT - KD-tree X - possibly pre-allocated buffer. If X is too small to store result, it is resized. If size(X) is enough to store result, it is left unchanged. OUTPUT PARAMETERS X - rows are filled with X-values NOTES 1. points are ordered by distance from the query point (first = closest) 2. if X is larger than required to store result, only leading part will be overwritten; trailing part will be left unchanged. So if on input X = [[A,B],[C,D]], and result is [1,2], then on exit we will get X = [[1,2],[C,D]]. This is done purposely to increase performance; if you want function to resize array according to result size, use function with same name and suffix 'I'.
SEE ALSO * KDTreeQueryResultsXY() X- and Y-values * KDTreeQueryResultsTags() tag values * KDTreeQueryResultsDistances() distances -- ALGLIB -- Copyright 28.02.2010 by Bochkanov Sergey *************************************************************************/
void kdtreequeryresultsx(const kdtree &kdt, real_2d_array &x) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::kdtreequeryresultsx(const_cast(kdt.c_ptr()), const_cast(x.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } }
/************************************************************************* X- and Y-values from last query This function returns results stored in the internal buffer of kd-tree object. If you performed buffered requests (ones which use instances of kdtreerequestbuffer class), you should call buffered version of this function - kdtreetsqueryresultsxy(). INPUT PARAMETERS KDT - KD-tree XY - possibly pre-allocated buffer. If XY is too small to store result, it is resized. If size(XY) is enough to store result, it is left unchanged. OUTPUT PARAMETERS XY - rows are filled with points: first NX columns with X-values, next NY columns - with Y-values. NOTES 1. points are ordered by distance from the query point (first = closest) 2. if XY is larger than required to store result, only leading part will be overwritten; trailing part will be left unchanged. So if on input XY = [[A,B],[C,D]], and result is [1,2], then on exit we will get XY = [[1,2],[C,D]]. This is done purposely to increase performance; if you want function to resize array according to result size, use function with same name and suffix 'I'. SEE ALSO * KDTreeQueryResultsX() X-values * KDTreeQueryResultsTags() tag values * KDTreeQueryResultsDistances() distances -- ALGLIB -- Copyright 28.02.2010 by Bochkanov Sergey *************************************************************************/
void kdtreequeryresultsxy(const kdtree &kdt, real_2d_array &xy) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::kdtreequeryresultsxy(const_cast(kdt.c_ptr()), const_cast(xy.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } }
/************************************************************************* Tags from last query This function returns results stored in the internal buffer of kd-tree object. If you performed buffered requests (ones which use instances of kdtreerequestbuffer class), you should call buffered version of this function - kdtreetsqueryresultstags(). INPUT PARAMETERS KDT - KD-tree Tags - possibly pre-allocated buffer. If Tags is too small to store result, it is resized. If size(Tags) is enough to store result, it is left unchanged. OUTPUT PARAMETERS Tags - filled with tags associated with points, or, when no tags were supplied, with zeros NOTES 1. points are ordered by distance from the query point (first = closest) 2. if Tags is larger than required to store result, only leading part will be overwritten; trailing part will be left unchanged. So if on input Tags = [A,B,C,D], and result is [1,2], then on exit we will get Tags = [1,2,C,D]. This is done purposely to increase performance; if you want function to resize array according to result size, use function with same name and suffix 'I'.
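EXAMPLE: a short illustrative sketch (names and values are assumptions) of how tags are typically used to map neighbors back to rows of the original dataset; it assumes the tree was created with KDTreeBuildTagged(), described earlier in this manual, with Tags[i] set to the source row index:
    real_1d_array x = "[1.0, 2.0]";
    kdtreequeryknn(kdt, x, 5);               // SelfMatch omitted, considered True
    integer_1d_array tags;
    kdtreequeryresultstags(kdt, tags);       // tags[i] is the label of the i-th closest neighbor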
SEE ALSO * KDTreeQueryResultsX() X-values * KDTreeQueryResultsXY() X- and Y-values * KDTreeQueryResultsDistances() distances -- ALGLIB -- Copyright 28.02.2010 by Bochkanov Sergey *************************************************************************/ void kdtreequeryresultstags(const kdtree &kdt, integer_1d_array &tags) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::kdtreequeryresultstags(const_cast(kdt.c_ptr()), const_cast(tags.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Distances from last query This function retuns results stored in the internal buffer of kd-tree object. If you performed buffered requests (ones which use instances of kdtreerequestbuffer class), you should call buffered version of this function - kdtreetsqueryresultsdistances(). INPUT PARAMETERS KDT - KD-tree R - possibly pre-allocated buffer. If X is too small to store result, it is resized. If size(X) is enough to store result, it is left unchanged. OUTPUT PARAMETERS R - filled with distances (in corresponding norm) NOTES 1. points are ordered by distance from the query point (first = closest) 2. if XY is larger than required to store result, only leading part will be overwritten; trailing part will be left unchanged. So if on input XY = [[A,B],[C,D]], and result is [1,2], then on exit we will get XY = [[1,2],[C,D]]. This is done purposely to increase performance; if you want function to resize array according to result size, use function with same name and suffix 'I'. SEE ALSO * KDTreeQueryResultsX() X-values * KDTreeQueryResultsXY() X- and Y-values * KDTreeQueryResultsTags() tag values -- ALGLIB -- Copyright 28.02.2010 by Bochkanov Sergey *************************************************************************/ void kdtreequeryresultsdistances(const kdtree &kdt, real_1d_array &r) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::kdtreequeryresultsdistances(const_cast(kdt.c_ptr()), const_cast(r.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* X-values from last query associated with kdtreerequestbuffer object. INPUT PARAMETERS KDT - KD-tree Buf - request buffer object created for this particular instance of kd-tree structure. X - possibly pre-allocated buffer. If X is too small to store result, it is resized. If size(X) is enough to store result, it is left unchanged. OUTPUT PARAMETERS X - rows are filled with X-values NOTES 1. points are ordered by distance from the query point (first = closest) 2. if XY is larger than required to store result, only leading part will be overwritten; trailing part will be left unchanged. So if on input XY = [[A,B],[C,D]], and result is [1,2], then on exit we will get XY = [[1,2],[C,D]]. This is done purposely to increase performance; if you want function to resize array according to result size, use function with same name and suffix 'I'. 
SEE ALSO * KDTreeQueryResultsXY() X- and Y-values * KDTreeQueryResultsTags() tag values * KDTreeQueryResultsDistances() distances -- ALGLIB -- Copyright 28.02.2010 by Bochkanov Sergey *************************************************************************/ void kdtreetsqueryresultsx(const kdtree &kdt, const kdtreerequestbuffer &buf, real_2d_array &x) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::kdtreetsqueryresultsx(const_cast(kdt.c_ptr()), const_cast(buf.c_ptr()), const_cast(x.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* X- and Y-values from last query associated with kdtreerequestbuffer object. INPUT PARAMETERS KDT - KD-tree Buf - request buffer object created for this particular instance of kd-tree structure. XY - possibly pre-allocated buffer. If XY is too small to store result, it is resized. If size(XY) is enough to store result, it is left unchanged. OUTPUT PARAMETERS XY - rows are filled with points: first NX columns with X-values, next NY columns - with Y-values. NOTES 1. points are ordered by distance from the query point (first = closest) 2. if XY is larger than required to store result, only leading part will be overwritten; trailing part will be left unchanged. So if on input XY = [[A,B],[C,D]], and result is [1,2], then on exit we will get XY = [[1,2],[C,D]]. This is done purposely to increase performance; if you want function to resize array according to result size, use function with same name and suffix 'I'. SEE ALSO * KDTreeQueryResultsX() X-values * KDTreeQueryResultsTags() tag values * KDTreeQueryResultsDistances() distances -- ALGLIB -- Copyright 28.02.2010 by Bochkanov Sergey *************************************************************************/ void kdtreetsqueryresultsxy(const kdtree &kdt, const kdtreerequestbuffer &buf, real_2d_array &xy) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::kdtreetsqueryresultsxy(const_cast(kdt.c_ptr()), const_cast(buf.c_ptr()), const_cast(xy.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Tags from last query associated with kdtreerequestbuffer object. This function retuns results stored in the internal buffer of kd-tree object. If you performed buffered requests (ones which use instances of kdtreerequestbuffer class), you should call buffered version of this function - KDTreeTsqueryresultstags(). INPUT PARAMETERS KDT - KD-tree Buf - request buffer object created for this particular instance of kd-tree structure. Tags - possibly pre-allocated buffer. If X is too small to store result, it is resized. If size(X) is enough to store result, it is left unchanged. OUTPUT PARAMETERS Tags - filled with tags associated with points, or, when no tags were supplied, with zeros NOTES 1. points are ordered by distance from the query point (first = closest) 2. if XY is larger than required to store result, only leading part will be overwritten; trailing part will be left unchanged. So if on input XY = [[A,B],[C,D]], and result is [1,2], then on exit we will get XY = [[1,2],[C,D]]. 
This is done purposely to increase performance; if you want function to resize array according to result size, use function with same name and suffix 'I'. SEE ALSO * KDTreeQueryResultsX() X-values * KDTreeQueryResultsXY() X- and Y-values * KDTreeQueryResultsDistances() distances -- ALGLIB -- Copyright 28.02.2010 by Bochkanov Sergey *************************************************************************/
void kdtreetsqueryresultstags(const kdtree &kdt, const kdtreerequestbuffer &buf, integer_1d_array &tags) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::kdtreetsqueryresultstags(const_cast(kdt.c_ptr()), const_cast(buf.c_ptr()), const_cast(tags.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } }
/************************************************************************* Distances from last query associated with kdtreerequestbuffer object. This function returns results stored in the internal buffer of kd-tree object. If you performed buffered requests (ones which use instances of kdtreerequestbuffer class), you should call buffered version of this function - KDTreeTsqueryresultsdistances(). INPUT PARAMETERS KDT - KD-tree Buf - request buffer object created for this particular instance of kd-tree structure. R - possibly pre-allocated buffer. If R is too small to store result, it is resized. If size(R) is enough to store result, it is left unchanged. OUTPUT PARAMETERS R - filled with distances (in corresponding norm) NOTES 1. points are ordered by distance from the query point (first = closest) 2. if R is larger than required to store result, only leading part will be overwritten; trailing part will be left unchanged. So if on input R = [A,B,C,D], and result is [1,2], then on exit we will get R = [1,2,C,D]. This is done purposely to increase performance; if you want function to resize array according to result size, use function with same name and suffix 'I'. SEE ALSO * KDTreeQueryResultsX() X-values * KDTreeQueryResultsXY() X- and Y-values * KDTreeQueryResultsTags() tag values -- ALGLIB -- Copyright 28.02.2010 by Bochkanov Sergey *************************************************************************/
void kdtreetsqueryresultsdistances(const kdtree &kdt, const kdtreerequestbuffer &buf, real_1d_array &r) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::kdtreetsqueryresultsdistances(const_cast(kdt.c_ptr()), const_cast(buf.c_ptr()), const_cast(r.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } }
/************************************************************************* X-values from last query; 'interactive' variant for languages like Python which support constructs like "X = KDTreeQueryResultsXI(KDT)" and interactive mode of interpreter. This function allocates new array on each call, so it is significantly slower than its 'non-interactive' counterpart, but it is more convenient when you call it from command line.
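EXAMPLE: a minimal sketch of the usual non-buffered workflow (variable names, the query point and K=4 are illustrative assumptions; the kd-tree KDT is assumed to have been built earlier):
    real_1d_array x = "[0.25, 0.75]";
    ae_int_t k = kdtreequeryknn(kdt, x, 4, false);   // exclude exact self-matches
    real_2d_array xr;
    real_1d_array dist;
    kdtreequeryresultsx(kdt, xr);            // k rows with neighbor coordinates
    kdtreequeryresultsdistances(kdt, dist);  // k distances, closest first
The approximate query follows the same pattern; e.g. kdtreequeryaknn(kdt, x, 4, 2.0) performs the same request with Eps=2.0.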
-- ALGLIB -- Copyright 28.02.2010 by Bochkanov Sergey *************************************************************************/ void kdtreequeryresultsxi(const kdtree &kdt, real_2d_array &x) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::kdtreequeryresultsxi(const_cast(kdt.c_ptr()), const_cast(x.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* XY-values from last query; 'interactive' variant for languages like Python which support constructs like "XY = KDTreeQueryResultsXYI(KDT)" and interactive mode of interpreter. This function allocates new array on each call, so it is significantly slower than its 'non-interactive' counterpart, but it is more convenient when you call it from command line. -- ALGLIB -- Copyright 28.02.2010 by Bochkanov Sergey *************************************************************************/ void kdtreequeryresultsxyi(const kdtree &kdt, real_2d_array &xy) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::kdtreequeryresultsxyi(const_cast(kdt.c_ptr()), const_cast(xy.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Tags from last query; 'interactive' variant for languages like Python which support constructs like "Tags = KDTreeQueryResultsTagsI(KDT)" and interactive mode of interpreter. This function allocates new array on each call, so it is significantly slower than its 'non-interactive' counterpart, but it is more convenient when you call it from command line. -- ALGLIB -- Copyright 28.02.2010 by Bochkanov Sergey *************************************************************************/ void kdtreequeryresultstagsi(const kdtree &kdt, integer_1d_array &tags) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::kdtreequeryresultstagsi(const_cast(kdt.c_ptr()), const_cast(tags.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Distances from last query; 'interactive' variant for languages like Python which support constructs like "R = KDTreeQueryResultsDistancesI(KDT)" and interactive mode of interpreter. This function allocates new array on each call, so it is significantly slower than its 'non-interactive' counterpart, but it is more convenient when you call it from command line. 
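EXAMPLE: an illustrative comparison of the two variants (names are assumptions); the plain function reuses a sufficiently large caller-supplied buffer, while the 'I' variant always allocates a new array of exactly the right size:
    real_1d_array d1, d2;
    kdtreequeryresultsdistances(kdt, d1);    // overwrites leading part of d1, resizes only if too small
    kdtreequeryresultsdistancesi(kdt, d2);   // d2 is freshly allocated on every call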
-- ALGLIB -- Copyright 28.02.2010 by Bochkanov Sergey *************************************************************************/ void kdtreequeryresultsdistancesi(const kdtree &kdt, real_1d_array &r) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::kdtreequeryresultsdistancesi(const_cast(kdt.c_ptr()), const_cast(r.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Portable high quality random number generator state. Initialized with HQRNDRandomize() or HQRNDSeed(). Fields: S1, S2 - seed values V - precomputed value MagicV - 'magic' value used to determine whether State structure was correctly initialized. *************************************************************************/ _hqrndstate_owner::_hqrndstate_owner() { p_struct = (alglib_impl::hqrndstate*)alglib_impl::ae_malloc(sizeof(alglib_impl::hqrndstate), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_hqrndstate_init(p_struct, NULL); } _hqrndstate_owner::_hqrndstate_owner(const _hqrndstate_owner &rhs) { p_struct = (alglib_impl::hqrndstate*)alglib_impl::ae_malloc(sizeof(alglib_impl::hqrndstate), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_hqrndstate_init_copy(p_struct, const_cast(rhs.p_struct), NULL); } _hqrndstate_owner& _hqrndstate_owner::operator=(const _hqrndstate_owner &rhs) { if( this==&rhs ) return *this; alglib_impl::_hqrndstate_clear(p_struct); alglib_impl::_hqrndstate_init_copy(p_struct, const_cast(rhs.p_struct), NULL); return *this; } _hqrndstate_owner::~_hqrndstate_owner() { alglib_impl::_hqrndstate_clear(p_struct); ae_free(p_struct); } alglib_impl::hqrndstate* _hqrndstate_owner::c_ptr() { return p_struct; } alglib_impl::hqrndstate* _hqrndstate_owner::c_ptr() const { return const_cast(p_struct); } hqrndstate::hqrndstate() : _hqrndstate_owner() { } hqrndstate::hqrndstate(const hqrndstate &rhs):_hqrndstate_owner(rhs) { } hqrndstate& hqrndstate::operator=(const hqrndstate &rhs) { if( this==&rhs ) return *this; _hqrndstate_owner::operator=(rhs); return *this; } hqrndstate::~hqrndstate() { } /************************************************************************* HQRNDState initialization with random values which come from standard RNG. 
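EXAMPLE: a minimal sketch of initializing the generator and drawing uniform values (the seed values and N=10 are illustrative assumptions):
    hqrndstate rng;
    hqrndrandomize(rng);                     // nondeterministic initialization
    // hqrndseed(12345, 67890, rng);         // alternative: reproducible initialization
    double u = hqrnduniformr(rng);           // uniform real in (0,1)
    ae_int_t j = hqrnduniformi(rng, 10);     // uniform integer in [0,10)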
-- ALGLIB -- Copyright 02.12.2009 by Bochkanov Sergey *************************************************************************/ void hqrndrandomize(hqrndstate &state) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::hqrndrandomize(const_cast(state.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* HQRNDState initialization with seed values -- ALGLIB -- Copyright 02.12.2009 by Bochkanov Sergey *************************************************************************/ void hqrndseed(const ae_int_t s1, const ae_int_t s2, hqrndstate &state) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::hqrndseed(s1, s2, const_cast(state.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function generates random real number in (0,1), not including interval boundaries State structure must be initialized with HQRNDRandomize() or HQRNDSeed(). -- ALGLIB -- Copyright 02.12.2009 by Bochkanov Sergey *************************************************************************/ double hqrnduniformr(const hqrndstate &state) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::hqrnduniformr(const_cast(state.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function generates random integer number in [0, N) 1. State structure must be initialized with HQRNDRandomize() or HQRNDSeed() 2. N can be any positive number except for very large numbers: * close to 2^31 on 32-bit systems * close to 2^62 on 64-bit systems An exception will be generated if N is too large. -- ALGLIB -- Copyright 02.12.2009 by Bochkanov Sergey *************************************************************************/ ae_int_t hqrnduniformi(const hqrndstate &state, const ae_int_t n) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::ae_int_t result = alglib_impl::hqrnduniformi(const_cast(state.c_ptr()), n, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Random number generator: normal numbers This function generates one random number from normal distribution. Its performance is equal to that of HQRNDNormal2() State structure must be initialized with HQRNDRandomize() or HQRNDSeed(). 
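EXAMPLE: an illustrative sketch of drawing normal and exponential samples (seed values and LambdaV=0.5 are assumptions):
    hqrndstate rng;
    hqrndseed(1, 2, rng);
    double n1 = hqrndnormal(rng);            // one N(0,1) sample
    double a, b;
    hqrndnormal2(rng, a, b);                 // two independent N(0,1) samples
    double e = hqrndexponential(rng, 0.5);   // exponential sample with LambdaV=0.5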
-- ALGLIB -- Copyright 02.12.2009 by Bochkanov Sergey *************************************************************************/ double hqrndnormal(const hqrndstate &state) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::hqrndnormal(const_cast(state.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Random number generator: random X and Y such that X^2+Y^2=1 State structure must be initialized with HQRNDRandomize() or HQRNDSeed(). -- ALGLIB -- Copyright 02.12.2009 by Bochkanov Sergey *************************************************************************/ void hqrndunit2(const hqrndstate &state, double &x, double &y) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::hqrndunit2(const_cast(state.c_ptr()), &x, &y, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Random number generator: normal numbers This function generates two independent random numbers from normal distribution. Its performance is equal to that of HQRNDNormal() State structure must be initialized with HQRNDRandomize() or HQRNDSeed(). -- ALGLIB -- Copyright 02.12.2009 by Bochkanov Sergey *************************************************************************/ void hqrndnormal2(const hqrndstate &state, double &x1, double &x2) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::hqrndnormal2(const_cast(state.c_ptr()), &x1, &x2, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Random number generator: exponential distribution State structure must be initialized with HQRNDRandomize() or HQRNDSeed(). -- ALGLIB -- Copyright 11.08.2007 by Bochkanov Sergey *************************************************************************/ double hqrndexponential(const hqrndstate &state, const double lambdav) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::hqrndexponential(const_cast(state.c_ptr()), lambdav, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function generates random number from discrete distribution given by finite sample X. INPUT PARAMETERS State - high quality random number generator, must be initialized with HQRNDRandomize() or HQRNDSeed(). 
X - finite sample N - number of elements to use, N>=1 RESULT this function returns one of the X[i] for random i=0..N-1 -- ALGLIB -- Copyright 08.11.2011 by Bochkanov Sergey *************************************************************************/ double hqrnddiscrete(const hqrndstate &state, const real_1d_array &x, const ae_int_t n) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::hqrnddiscrete(const_cast(state.c_ptr()), const_cast(x.c_ptr()), n, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function generates random number from continuous distribution given by finite sample X. INPUT PARAMETERS State - high quality random number generator, must be initialized with HQRNDRandomize() or HQRNDSeed(). X - finite sample, array[N] (can be larger, in this case only leading N elements are used). THIS ARRAY MUST BE SORTED BY ASCENDING. N - number of elements to use, N>=1 RESULT this function returns random number from continuous distribution which tries to approximate X as mush as possible. min(X)<=Result<=max(X). -- ALGLIB -- Copyright 08.11.2011 by Bochkanov Sergey *************************************************************************/ double hqrndcontinuous(const hqrndstate &state, const real_1d_array &x, const ae_int_t n) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::hqrndcontinuous(const_cast(state.c_ptr()), const_cast(x.c_ptr()), n, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* *************************************************************************/ _xdebugrecord1_owner::_xdebugrecord1_owner() { p_struct = (alglib_impl::xdebugrecord1*)alglib_impl::ae_malloc(sizeof(alglib_impl::xdebugrecord1), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_xdebugrecord1_init(p_struct, NULL); } _xdebugrecord1_owner::_xdebugrecord1_owner(const _xdebugrecord1_owner &rhs) { p_struct = (alglib_impl::xdebugrecord1*)alglib_impl::ae_malloc(sizeof(alglib_impl::xdebugrecord1), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_xdebugrecord1_init_copy(p_struct, const_cast(rhs.p_struct), NULL); } _xdebugrecord1_owner& _xdebugrecord1_owner::operator=(const _xdebugrecord1_owner &rhs) { if( this==&rhs ) return *this; alglib_impl::_xdebugrecord1_clear(p_struct); alglib_impl::_xdebugrecord1_init_copy(p_struct, const_cast(rhs.p_struct), NULL); return *this; } _xdebugrecord1_owner::~_xdebugrecord1_owner() { alglib_impl::_xdebugrecord1_clear(p_struct); ae_free(p_struct); } alglib_impl::xdebugrecord1* _xdebugrecord1_owner::c_ptr() { return p_struct; } alglib_impl::xdebugrecord1* _xdebugrecord1_owner::c_ptr() const { return const_cast(p_struct); } xdebugrecord1::xdebugrecord1() : _xdebugrecord1_owner() ,i(p_struct->i),c(*((alglib::complex*)(&p_struct->c))),a(&p_struct->a) { } xdebugrecord1::xdebugrecord1(const xdebugrecord1 &rhs):_xdebugrecord1_owner(rhs) ,i(p_struct->i),c(*((alglib::complex*)(&p_struct->c))),a(&p_struct->a) { } xdebugrecord1& 
xdebugrecord1::operator=(const xdebugrecord1 &rhs) { if( this==&rhs ) return *this; _xdebugrecord1_owner::operator=(rhs); return *this; } xdebugrecord1::~xdebugrecord1() { } /************************************************************************* This is debug function intended for testing ALGLIB interface generator. Never use it in any real life project. Creates and returns XDebugRecord1 structure: * integer and complex fields of Rec1 are set to 1 and 1+i correspondingly * array field of Rec1 is set to [2,3] -- ALGLIB -- Copyright 27.05.2014 by Bochkanov Sergey *************************************************************************/ void xdebuginitrecord1(xdebugrecord1 &rec1) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::xdebuginitrecord1(const_cast(rec1.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This is debug function intended for testing ALGLIB interface generator. Never use it in any real life project. Counts number of True values in the boolean 1D array. -- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/ ae_int_t xdebugb1count(const boolean_1d_array &a) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::ae_int_t result = alglib_impl::xdebugb1count(const_cast(a.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This is debug function intended for testing ALGLIB interface generator. Never use it in any real life project. Replace all values in array by NOT(a[i]). Array is passed using "shared" convention. -- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/ void xdebugb1not(const boolean_1d_array &a) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::xdebugb1not(const_cast(a.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This is debug function intended for testing ALGLIB interface generator. Never use it in any real life project. Appends copy of array to itself. Array is passed using "var" convention. -- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/ void xdebugb1appendcopy(boolean_1d_array &a) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::xdebugb1appendcopy(const_cast(a.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This is debug function intended for testing ALGLIB interface generator. Never use it in any real life project. 
Generate N-element array with even-numbered elements set to True. Array is passed using "out" convention. -- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/ void xdebugb1outeven(const ae_int_t n, boolean_1d_array &a) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::xdebugb1outeven(n, const_cast(a.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This is debug function intended for testing ALGLIB interface generator. Never use it in any real life project. Returns sum of elements in the array. -- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/ ae_int_t xdebugi1sum(const integer_1d_array &a) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::ae_int_t result = alglib_impl::xdebugi1sum(const_cast(a.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This is debug function intended for testing ALGLIB interface generator. Never use it in any real life project. Replace all values in array by -A[I] Array is passed using "shared" convention. -- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/ void xdebugi1neg(const integer_1d_array &a) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::xdebugi1neg(const_cast(a.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This is debug function intended for testing ALGLIB interface generator. Never use it in any real life project. Appends copy of array to itself. Array is passed using "var" convention. -- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/ void xdebugi1appendcopy(integer_1d_array &a) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::xdebugi1appendcopy(const_cast(a.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This is debug function intended for testing ALGLIB interface generator. Never use it in any real life project. Generate N-element array with even-numbered A[I] set to I, and odd-numbered ones set to 0. Array is passed using "out" convention. 
-- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/ void xdebugi1outeven(const ae_int_t n, integer_1d_array &a) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::xdebugi1outeven(n, const_cast(a.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This is debug function intended for testing ALGLIB interface generator. Never use it in any real life project. Returns sum of elements in the array. -- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/ double xdebugr1sum(const real_1d_array &a) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::xdebugr1sum(const_cast(a.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This is debug function intended for testing ALGLIB interface generator. Never use it in any real life project. Replace all values in array by -A[I] Array is passed using "shared" convention. -- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/ void xdebugr1neg(const real_1d_array &a) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::xdebugr1neg(const_cast(a.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This is debug function intended for testing ALGLIB interface generator. Never use it in any real life project. Appends copy of array to itself. Array is passed using "var" convention. -- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/ void xdebugr1appendcopy(real_1d_array &a) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::xdebugr1appendcopy(const_cast(a.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This is debug function intended for testing ALGLIB interface generator. Never use it in any real life project. Generate N-element array with even-numbered A[I] set to I*0.25, and odd-numbered ones are set to 0. Array is passed using "out" convention. 
-- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/ void xdebugr1outeven(const ae_int_t n, real_1d_array &a) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::xdebugr1outeven(n, const_cast(a.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This is debug function intended for testing ALGLIB interface generator. Never use it in any real life project. Returns sum of elements in the array. -- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/ alglib::complex xdebugc1sum(const complex_1d_array &a) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::ae_complex result = alglib_impl::xdebugc1sum(const_cast(a.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This is debug function intended for testing ALGLIB interface generator. Never use it in any real life project. Replace all values in array by -A[I] Array is passed using "shared" convention. -- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/ void xdebugc1neg(const complex_1d_array &a) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::xdebugc1neg(const_cast(a.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This is debug function intended for testing ALGLIB interface generator. Never use it in any real life project. Appends copy of array to itself. Array is passed using "var" convention. -- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/ void xdebugc1appendcopy(complex_1d_array &a) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::xdebugc1appendcopy(const_cast(a.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This is debug function intended for testing ALGLIB interface generator. Never use it in any real life project. Generate N-element array with even-numbered A[K] set to (x,y) = (K*0.25, K*0.125) and odd-numbered ones are set to 0. Array is passed using "out" convention. 
-- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/ void xdebugc1outeven(const ae_int_t n, complex_1d_array &a) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::xdebugc1outeven(n, const_cast(a.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This is debug function intended for testing ALGLIB interface generator. Never use it in any real life project. Counts number of True values in the boolean 2D array. -- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/ ae_int_t xdebugb2count(const boolean_2d_array &a) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::ae_int_t result = alglib_impl::xdebugb2count(const_cast(a.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This is debug function intended for testing ALGLIB interface generator. Never use it in any real life project. Replace all values in array by NOT(a[i]). Array is passed using "shared" convention. -- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/ void xdebugb2not(const boolean_2d_array &a) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::xdebugb2not(const_cast(a.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This is debug function intended for testing ALGLIB interface generator. Never use it in any real life project. Transposes array. Array is passed using "var" convention. -- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/ void xdebugb2transpose(boolean_2d_array &a) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::xdebugb2transpose(const_cast(a.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This is debug function intended for testing ALGLIB interface generator. Never use it in any real life project. Generate MxN matrix with elements set to "Sin(3*I+5*J)>0" Array is passed using "out" convention. 
-- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/ void xdebugb2outsin(const ae_int_t m, const ae_int_t n, boolean_2d_array &a) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::xdebugb2outsin(m, n, const_cast(a.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This is debug function intended for testing ALGLIB interface generator. Never use it in any real life project. Returns sum of elements in the array. -- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/ ae_int_t xdebugi2sum(const integer_2d_array &a) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::ae_int_t result = alglib_impl::xdebugi2sum(const_cast(a.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This is debug function intended for testing ALGLIB interface generator. Never use it in any real life project. Replace all values in array by -a[i,j] Array is passed using "shared" convention. -- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/ void xdebugi2neg(const integer_2d_array &a) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::xdebugi2neg(const_cast(a.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This is debug function intended for testing ALGLIB interface generator. Never use it in any real life project. Transposes array. Array is passed using "var" convention. -- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/ void xdebugi2transpose(integer_2d_array &a) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::xdebugi2transpose(const_cast(a.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This is debug function intended for testing ALGLIB interface generator. Never use it in any real life project. Generate MxN matrix with elements set to "Sign(Sin(3*I+5*J))" Array is passed using "out" convention. 
-- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/ void xdebugi2outsin(const ae_int_t m, const ae_int_t n, integer_2d_array &a) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::xdebugi2outsin(m, n, const_cast(a.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This is debug function intended for testing ALGLIB interface generator. Never use it in any real life project. Returns sum of elements in the array. -- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/ double xdebugr2sum(const real_2d_array &a) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::xdebugr2sum(const_cast(a.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This is debug function intended for testing ALGLIB interface generator. Never use it in any real life project. Replace all values in array by -a[i,j] Array is passed using "shared" convention. -- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/ void xdebugr2neg(const real_2d_array &a) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::xdebugr2neg(const_cast(a.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This is debug function intended for testing ALGLIB interface generator. Never use it in any real life project. Transposes array. Array is passed using "var" convention. -- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/ void xdebugr2transpose(real_2d_array &a) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::xdebugr2transpose(const_cast(a.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This is debug function intended for testing ALGLIB interface generator. Never use it in any real life project. Generate MxN matrix with elements set to "Sin(3*I+5*J)" Array is passed using "out" convention. 
-- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/ void xdebugr2outsin(const ae_int_t m, const ae_int_t n, real_2d_array &a) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::xdebugr2outsin(m, n, const_cast(a.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This is debug function intended for testing ALGLIB interface generator. Never use it in any real life project. Returns sum of elements in the array. -- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/ alglib::complex xdebugc2sum(const complex_2d_array &a) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::ae_complex result = alglib_impl::xdebugc2sum(const_cast(a.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This is debug function intended for testing ALGLIB interface generator. Never use it in any real life project. Replace all values in array by -a[i,j] Array is passed using "shared" convention. -- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/ void xdebugc2neg(const complex_2d_array &a) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::xdebugc2neg(const_cast(a.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This is debug function intended for testing ALGLIB interface generator. Never use it in any real life project. Transposes array. Array is passed using "var" convention. -- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/ void xdebugc2transpose(complex_2d_array &a) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::xdebugc2transpose(const_cast(a.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This is debug function intended for testing ALGLIB interface generator. Never use it in any real life project. Generate MxN matrix with elements set to "Sin(3*I+5*J),Cos(3*I+5*J)" Array is passed using "out" convention. 
-- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/ void xdebugc2outsincos(const ae_int_t m, const ae_int_t n, complex_2d_array &a) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::xdebugc2outsincos(m, n, const_cast(a.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This is debug function intended for testing ALGLIB interface generator. Never use it in any real life project. Returns sum of a[i,j]*(1+b[i,j]) such that c[i,j] is True -- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/ double xdebugmaskedbiasedproductsum(const ae_int_t m, const ae_int_t n, const real_2d_array &a, const real_2d_array &b, const boolean_2d_array &c) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::xdebugmaskedbiasedproductsum(m, n, const_cast(a.c_ptr()), const_cast(b.c_ptr()), const_cast(c.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } } ///////////////////////////////////////////////////////////////////////// // // THIS SECTION CONTAINS IMPLEMENTATION OF COMPUTATIONAL CORE // ///////////////////////////////////////////////////////////////////////// namespace alglib_impl { static ae_int_t nearestneighbor_splitnodesize = 6; static ae_int_t nearestneighbor_kdtreefirstversion = 0; static void nearestneighbor_kdtreesplit(kdtree* kdt, ae_int_t i1, ae_int_t i2, ae_int_t d, double s, ae_int_t* i3, ae_state *_state); static void nearestneighbor_kdtreegeneratetreerec(kdtree* kdt, ae_int_t* nodesoffs, ae_int_t* splitsoffs, ae_int_t i1, ae_int_t i2, ae_int_t maxleafsize, ae_state *_state); static void nearestneighbor_kdtreequerynnrec(kdtree* kdt, kdtreerequestbuffer* buf, ae_int_t offs, ae_state *_state); static void nearestneighbor_kdtreequeryboxrec(kdtree* kdt, kdtreerequestbuffer* buf, ae_int_t offs, ae_state *_state); static void nearestneighbor_kdtreeinitbox(kdtree* kdt, /* Real */ ae_vector* x, kdtreerequestbuffer* buf, ae_state *_state); static void nearestneighbor_kdtreeallocdatasetindependent(kdtree* kdt, ae_int_t nx, ae_int_t ny, ae_state *_state); static void nearestneighbor_kdtreeallocdatasetdependent(kdtree* kdt, ae_int_t n, ae_int_t nx, ae_int_t ny, ae_state *_state); static void nearestneighbor_checkrequestbufferconsistency(kdtree* kdt, kdtreerequestbuffer* buf, ae_state *_state); static ae_int_t hqrnd_hqrndmax = 2147483561; static ae_int_t hqrnd_hqrndm1 = 2147483563; static ae_int_t hqrnd_hqrndm2 = 2147483399; static ae_int_t hqrnd_hqrndmagic = 1634357784; static ae_int_t hqrnd_hqrndintegerbase(hqrndstate* state, ae_state *_state); /************************************************************************* KD-tree creation This subroutine creates KD-tree from set of X-values and optional Y-values INPUT PARAMETERS XY - dataset, array[0..N-1,0..NX+NY-1]. one row corresponds to one point. first NX columns contain X-values, next NY (NY may be zero) columns may contain associated Y-values N - number of points, N>=0. NX - space dimension, NX>=1. 
NY - number of optional Y-values, NY>=0. NormType- norm type: * 0 denotes infinity-norm * 1 denotes 1-norm * 2 denotes 2-norm (Euclidean norm) OUTPUT PARAMETERS KDT - KD-tree NOTES 1. KD-tree creation have O(N*logN) complexity and O(N*(2*NX+NY)) memory requirements. 2. Although KD-trees may be used with any combination of N and NX, they are more efficient than brute-force search only when N >> 4^NX. So they are most useful in low-dimensional tasks (NX=2, NX=3). NX=1 is another inefficient case, because simple binary search (without additional structures) is much more efficient in such tasks than KD-trees. -- ALGLIB -- Copyright 28.02.2010 by Bochkanov Sergey *************************************************************************/ void kdtreebuild(/* Real */ ae_matrix* xy, ae_int_t n, ae_int_t nx, ae_int_t ny, ae_int_t normtype, kdtree* kdt, ae_state *_state) { ae_frame _frame_block; ae_vector tags; ae_int_t i; ae_frame_make(_state, &_frame_block); _kdtree_clear(kdt); ae_vector_init(&tags, 0, DT_INT, _state); ae_assert(n>=0, "KDTreeBuild: N<0", _state); ae_assert(nx>=1, "KDTreeBuild: NX<1", _state); ae_assert(ny>=0, "KDTreeBuild: NY<0", _state); ae_assert(normtype>=0&&normtype<=2, "KDTreeBuild: incorrect NormType", _state); ae_assert(xy->rows>=n, "KDTreeBuild: rows(X)cols>=nx+ny||n==0, "KDTreeBuild: cols(X)0 ) { ae_vector_set_length(&tags, n, _state); for(i=0; i<=n-1; i++) { tags.ptr.p_int[i] = 0; } } kdtreebuildtagged(xy, &tags, n, nx, ny, normtype, kdt, _state); ae_frame_leave(_state); } /************************************************************************* KD-tree creation This subroutine creates KD-tree from set of X-values, integer tags and optional Y-values INPUT PARAMETERS XY - dataset, array[0..N-1,0..NX+NY-1]. one row corresponds to one point. first NX columns contain X-values, next NY (NY may be zero) columns may contain associated Y-values Tags - tags, array[0..N-1], contains integer tags associated with points. N - number of points, N>=0 NX - space dimension, NX>=1. NY - number of optional Y-values, NY>=0. NormType- norm type: * 0 denotes infinity-norm * 1 denotes 1-norm * 2 denotes 2-norm (Euclidean norm) OUTPUT PARAMETERS KDT - KD-tree NOTES 1. KD-tree creation have O(N*logN) complexity and O(N*(2*NX+NY)) memory requirements. 2. Although KD-trees may be used with any combination of N and NX, they are more efficient than brute-force search only when N >> 4^NX. So they are most useful in low-dimensional tasks (NX=2, NX=3). NX=1 is another inefficient case, because simple binary search (without additional structures) is much more efficient in such tasks than KD-trees. 
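// Illustrative sketch added by the editor (assumption: the public C++ wrappers
// alglib::kdtreebuildtagged, alglib::kdtreequeryknn and alglib::kdtreequeryresultstags,
// which are documented elsewhere in this manual but not shown in this listing; header name
// assumed). Tags let you map query results back to identifiers of the original points; the
// untagged kdtreebuild() call is identical except that the Tags argument is omitted.
#include "alglibmisc.h"
using namespace alglib;

void kdtree_tagged_demo()
{
    real_2d_array xy = "[[0,0],[0,1],[1,0],[1,1]]";
    integer_1d_array tags = "[10,20,30,40]";       // arbitrary integer labels
    kdtree kdt;
    kdtreebuildtagged(xy, tags, 4, 2, 0, 2, kdt);  // N=4, NX=2, NY=0, 2-norm

    real_1d_array x = "[0.1,0.9]";
    kdtreequeryknn(kdt, x, 1);                     // nearest neighbor of X
    integer_1d_array t;
    kdtreequeryresultstags(kdt, t);                // t = [20], tag of point [0,1]
}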
-- ALGLIB -- Copyright 28.02.2010 by Bochkanov Sergey *************************************************************************/ void kdtreebuildtagged(/* Real */ ae_matrix* xy, /* Integer */ ae_vector* tags, ae_int_t n, ae_int_t nx, ae_int_t ny, ae_int_t normtype, kdtree* kdt, ae_state *_state) { ae_int_t i; ae_int_t j; ae_int_t maxnodes; ae_int_t nodesoffs; ae_int_t splitsoffs; _kdtree_clear(kdt); ae_assert(n>=0, "KDTreeBuildTagged: N<0", _state); ae_assert(nx>=1, "KDTreeBuildTagged: NX<1", _state); ae_assert(ny>=0, "KDTreeBuildTagged: NY<0", _state); ae_assert(normtype>=0&&normtype<=2, "KDTreeBuildTagged: incorrect NormType", _state); ae_assert(xy->rows>=n, "KDTreeBuildTagged: rows(X)cols>=nx+ny||n==0, "KDTreeBuildTagged: cols(X)n = n; kdt->nx = nx; kdt->ny = ny; kdt->normtype = normtype; kdt->innerbuf.kcur = 0; /* * N=0 => quick exit */ if( n==0 ) { return; } /* * Allocate */ nearestneighbor_kdtreeallocdatasetindependent(kdt, nx, ny, _state); nearestneighbor_kdtreeallocdatasetdependent(kdt, n, nx, ny, _state); kdtreecreaterequestbuffer(kdt, &kdt->innerbuf, _state); /* * Initial fill */ for(i=0; i<=n-1; i++) { ae_v_move(&kdt->xy.ptr.pp_double[i][0], 1, &xy->ptr.pp_double[i][0], 1, ae_v_len(0,nx-1)); ae_v_move(&kdt->xy.ptr.pp_double[i][nx], 1, &xy->ptr.pp_double[i][0], 1, ae_v_len(nx,2*nx+ny-1)); kdt->tags.ptr.p_int[i] = tags->ptr.p_int[i]; } /* * Determine bounding box */ ae_v_move(&kdt->boxmin.ptr.p_double[0], 1, &kdt->xy.ptr.pp_double[0][0], 1, ae_v_len(0,nx-1)); ae_v_move(&kdt->boxmax.ptr.p_double[0], 1, &kdt->xy.ptr.pp_double[0][0], 1, ae_v_len(0,nx-1)); for(i=1; i<=n-1; i++) { for(j=0; j<=nx-1; j++) { kdt->boxmin.ptr.p_double[j] = ae_minreal(kdt->boxmin.ptr.p_double[j], kdt->xy.ptr.pp_double[i][j], _state); kdt->boxmax.ptr.p_double[j] = ae_maxreal(kdt->boxmax.ptr.p_double[j], kdt->xy.ptr.pp_double[i][j], _state); } } /* * prepare tree structure * * MaxNodes=N because we guarantee no trivial splits, i.e. * every split will generate two non-empty boxes */ maxnodes = n; ae_vector_set_length(&kdt->nodes, nearestneighbor_splitnodesize*2*maxnodes, _state); ae_vector_set_length(&kdt->splits, 2*maxnodes, _state); nodesoffs = 0; splitsoffs = 0; ae_v_move(&kdt->innerbuf.curboxmin.ptr.p_double[0], 1, &kdt->boxmin.ptr.p_double[0], 1, ae_v_len(0,nx-1)); ae_v_move(&kdt->innerbuf.curboxmax.ptr.p_double[0], 1, &kdt->boxmax.ptr.p_double[0], 1, ae_v_len(0,nx-1)); nearestneighbor_kdtreegeneratetreerec(kdt, &nodesoffs, &splitsoffs, 0, n, 8, _state); } /************************************************************************* This function creates buffer structure which can be used to perform parallel KD-tree requests. KD-tree subpackage provides two sets of request functions - ones which use internal buffer of KD-tree object (these functions are single-threaded because they use same buffer, which can not shared between threads), and ones which use external buffer. This function is used to initialize external buffer. INPUT PARAMETERS KDT - KD-tree which is associated with newly created buffer OUTPUT PARAMETERS Buf - external buffer. IMPORTANT: KD-tree buffer should be used only with KD-tree object which was used to initialize buffer. Any attempt to use biffer with different object is dangerous - you may get integrity check failure (exception) because sizes of internal arrays do not fit to dimensions of KD-tree structure. 
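// Illustrative sketch added by the editor (assumption: the public C++ wrapper
// alglib::kdtreecreaterequestbuffer; header name assumed). One request buffer is created per
// worker thread, and each buffer must only ever be used with the tree that initialized it.
#include "alglibmisc.h"
using namespace alglib;

void make_buffers(const kdtree &kdt)
{
    kdtreerequestbuffer buf0, buf1;          // e.g. one buffer per worker thread
    kdtreecreaterequestbuffer(kdt, buf0);    // tie each buffer to this particular tree
    kdtreecreaterequestbuffer(kdt, buf1);
    // buf0/buf1 can now be handed to two threads issuing concurrent queries on kdt
}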
-- ALGLIB -- Copyright 18.03.2016 by Bochkanov Sergey *************************************************************************/ void kdtreecreaterequestbuffer(kdtree* kdt, kdtreerequestbuffer* buf, ae_state *_state) { _kdtreerequestbuffer_clear(buf); ae_vector_set_length(&buf->x, kdt->nx, _state); ae_vector_set_length(&buf->boxmin, kdt->nx, _state); ae_vector_set_length(&buf->boxmax, kdt->nx, _state); ae_vector_set_length(&buf->idx, kdt->n, _state); ae_vector_set_length(&buf->r, kdt->n, _state); ae_vector_set_length(&buf->buf, ae_maxint(kdt->n, kdt->nx, _state), _state); ae_vector_set_length(&buf->curboxmin, kdt->nx, _state); ae_vector_set_length(&buf->curboxmax, kdt->nx, _state); buf->kcur = 0; } /************************************************************************* K-NN query: K nearest neighbors IMPORTANT: this function can not be used in multithreaded code because it uses internal temporary buffer of kd-tree object, which can not be shared between multiple threads. If you want to perform parallel requests, use function which uses external request buffer: KDTreeTsQueryKNN() ("Ts" stands for "thread-safe"). INPUT PARAMETERS KDT - KD-tree X - point, array[0..NX-1]. K - number of neighbors to return, K>=1 SelfMatch - whether self-matches are allowed: * if True, nearest neighbor may be the point itself (if it exists in original dataset) * if False, then only points with non-zero distance are returned * if not given, considered True RESULT number of actual neighbors found (either K or N, if K>N). This subroutine performs query and stores its result in the internal structures of the KD-tree. You can use following subroutines to obtain these results: * KDTreeQueryResultsX() to get X-values * KDTreeQueryResultsXY() to get X- and Y-values * KDTreeQueryResultsTags() to get tag values * KDTreeQueryResultsDistances() to get distances -- ALGLIB -- Copyright 28.02.2010 by Bochkanov Sergey *************************************************************************/ ae_int_t kdtreequeryknn(kdtree* kdt, /* Real */ ae_vector* x, ae_int_t k, ae_bool selfmatch, ae_state *_state) { ae_int_t result; ae_assert(k>=1, "KDTreeQueryKNN: K<1!", _state); ae_assert(x->cnt>=kdt->nx, "KDTreeQueryKNN: Length(X)nx, _state), "KDTreeQueryKNN: X contains infinite or NaN values!", _state); result = kdtreetsqueryaknn(kdt, &kdt->innerbuf, x, k, selfmatch, 0.0, _state); return result; } /************************************************************************* K-NN query: K nearest neighbors, using external thread-local buffer. You can call this function from multiple threads for same kd-tree instance, assuming that different instances of buffer object are passed to different threads. INPUT PARAMETERS KDT - kd-tree Buf - request buffer object created for this particular instance of kd-tree structure with kdtreecreaterequestbuffer() function. X - point, array[0..NX-1]. K - number of neighbors to return, K>=1 SelfMatch - whether self-matches are allowed: * if True, nearest neighbor may be the point itself (if it exists in original dataset) * if False, then only points with non-zero distance are returned * if not given, considered True RESULT number of actual neighbors found (either K or N, if K>N). This subroutine performs query and stores its result in the internal structures of the buffer object. 
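// Illustrative sketch added by the editor (assumption: the public C++ wrappers
// alglib::kdtreetsqueryknn and alglib::kdtreetsqueryresultsx; header name assumed). A query
// routed through a request buffer leaves the kd-tree itself untouched, so several threads may
// run this function concurrently on the same tree, each with its own buffer.
#include "alglibmisc.h"
using namespace alglib;

void threadsafe_knn(const kdtree &kdt, kdtreerequestbuffer &buf)
{
    real_1d_array x = "[0.5,0.5]";
    ae_int_t found = kdtreetsqueryknn(kdt, buf, x, 2, true); // 2 nearest, self-matches allowed
    real_2d_array nn;
    kdtreetsqueryresultsx(kdt, buf, nn);                     // 'found' rows of X-values
}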
You can use following subroutines to obtain these results (pay attention to "buf" in their names): * KDTreeTsQueryResultsX() to get X-values * KDTreeTsQueryResultsXY() to get X- and Y-values * KDTreeTsQueryResultsTags() to get tag values * KDTreeTsQueryResultsDistances() to get distances IMPORTANT: kd-tree buffer should be used only with KD-tree object which was used to initialize buffer. Any attempt to use biffer with different object is dangerous - you may get integrity check failure (exception) because sizes of internal arrays do not fit to dimensions of KD-tree structure. -- ALGLIB -- Copyright 18.03.2016 by Bochkanov Sergey *************************************************************************/ ae_int_t kdtreetsqueryknn(kdtree* kdt, kdtreerequestbuffer* buf, /* Real */ ae_vector* x, ae_int_t k, ae_bool selfmatch, ae_state *_state) { ae_int_t result; ae_assert(k>=1, "KDTreeTsQueryKNN: K<1!", _state); ae_assert(x->cnt>=kdt->nx, "KDTreeTsQueryKNN: Length(X)nx, _state), "KDTreeTsQueryKNN: X contains infinite or NaN values!", _state); result = kdtreetsqueryaknn(kdt, buf, x, k, selfmatch, 0.0, _state); return result; } /************************************************************************* R-NN query: all points within R-sphere centered at X IMPORTANT: this function can not be used in multithreaded code because it uses internal temporary buffer of kd-tree object, which can not be shared between multiple threads. If you want to perform parallel requests, use function which uses external request buffer: KDTreeTsQueryRNN() ("Ts" stands for "thread-safe"). INPUT PARAMETERS KDT - KD-tree X - point, array[0..NX-1]. R - radius of sphere (in corresponding norm), R>0 SelfMatch - whether self-matches are allowed: * if True, nearest neighbor may be the point itself (if it exists in original dataset) * if False, then only points with non-zero distance are returned * if not given, considered True RESULT number of neighbors found, >=0 This subroutine performs query and stores its result in the internal structures of the KD-tree. You can use following subroutines to obtain actual results: * KDTreeQueryResultsX() to get X-values * KDTreeQueryResultsXY() to get X- and Y-values * KDTreeQueryResultsTags() to get tag values * KDTreeQueryResultsDistances() to get distances -- ALGLIB -- Copyright 28.02.2010 by Bochkanov Sergey *************************************************************************/ ae_int_t kdtreequeryrnn(kdtree* kdt, /* Real */ ae_vector* x, double r, ae_bool selfmatch, ae_state *_state) { ae_int_t result; ae_assert(ae_fp_greater(r,(double)(0)), "KDTreeQueryRNN: incorrect R!", _state); ae_assert(x->cnt>=kdt->nx, "KDTreeQueryRNN: Length(X)nx, _state), "KDTreeQueryRNN: X contains infinite or NaN values!", _state); result = kdtreetsqueryrnn(kdt, &kdt->innerbuf, x, r, selfmatch, _state); return result; } /************************************************************************* R-NN query: all points within R-sphere centered at X, using external thread-local buffer. You can call this function from multiple threads for same kd-tree instance, assuming that different instances of buffer object are passed to different threads. INPUT PARAMETERS KDT - KD-tree Buf - request buffer object created for this particular instance of kd-tree structure with kdtreecreaterequestbuffer() function. X - point, array[0..NX-1]. 
R - radius of sphere (in corresponding norm), R>0 SelfMatch - whether self-matches are allowed: * if True, nearest neighbor may be the point itself (if it exists in original dataset) * if False, then only points with non-zero distance are returned * if not given, considered True RESULT number of neighbors found, >=0 This subroutine performs query and stores its result in the internal structures of the buffer object. You can use following subroutines to obtain these results (pay attention to "buf" in their names): * KDTreeTsQueryResultsX() to get X-values * KDTreeTsQueryResultsXY() to get X- and Y-values * KDTreeTsQueryResultsTags() to get tag values * KDTreeTsQueryResultsDistances() to get distances IMPORTANT: kd-tree buffer should be used only with KD-tree object which was used to initialize buffer. Any attempt to use biffer with different object is dangerous - you may get integrity check failure (exception) because sizes of internal arrays do not fit to dimensions of KD-tree structure. -- ALGLIB -- Copyright 18.03.2016 by Bochkanov Sergey *************************************************************************/ ae_int_t kdtreetsqueryrnn(kdtree* kdt, kdtreerequestbuffer* buf, /* Real */ ae_vector* x, double r, ae_bool selfmatch, ae_state *_state) { ae_int_t i; ae_int_t j; ae_int_t result; ae_assert(ae_fp_greater(r,(double)(0)), "KDTreeTsQueryRNN: incorrect R!", _state); ae_assert(x->cnt>=kdt->nx, "KDTreeTsQueryRNN: Length(X)nx, _state), "KDTreeTsQueryRNN: X contains infinite or NaN values!", _state); /* * Handle special case: KDT.N=0 */ if( kdt->n==0 ) { buf->kcur = 0; result = 0; return result; } /* * Check consistency of request buffer */ nearestneighbor_checkrequestbufferconsistency(kdt, buf, _state); /* * Prepare parameters */ buf->kneeded = 0; if( kdt->normtype!=2 ) { buf->rneeded = r; } else { buf->rneeded = ae_sqr(r, _state); } buf->selfmatch = selfmatch; buf->approxf = (double)(1); buf->kcur = 0; /* * calculate distance from point to current bounding box */ nearestneighbor_kdtreeinitbox(kdt, x, buf, _state); /* * call recursive search * results are returned as heap */ nearestneighbor_kdtreequerynnrec(kdt, buf, 0, _state); /* * pop from heap to generate ordered representation * * last element is not pop'ed because it is already in * its place */ result = buf->kcur; j = buf->kcur; for(i=buf->kcur; i>=2; i--) { tagheappopi(&buf->r, &buf->idx, &j, _state); } return result; } /************************************************************************* K-NN query: approximate K nearest neighbors IMPORTANT: this function can not be used in multithreaded code because it uses internal temporary buffer of kd-tree object, which can not be shared between multiple threads. If you want to perform parallel requests, use function which uses external request buffer: KDTreeTsQueryAKNN() ("Ts" stands for "thread-safe"). INPUT PARAMETERS KDT - KD-tree X - point, array[0..NX-1]. K - number of neighbors to return, K>=1 SelfMatch - whether self-matches are allowed: * if True, nearest neighbor may be the point itself (if it exists in original dataset) * if False, then only points with non-zero distance are returned * if not given, considered True Eps - approximation factor, Eps>=0. eps-approximate nearest neighbor is a neighbor whose distance from X is at most (1+eps) times distance of true nearest neighbor. RESULT number of actual neighbors found (either K or N, if K>N). NOTES significant performance gain may be achieved only when Eps is is on the order of magnitude of 1 or larger. 
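// Illustrative sketch added by the editor (assumption: the public C++ wrappers
// alglib::kdtreequeryaknn and alglib::kdtreequeryresultsdistances; header name assumed).
// With Eps=2.0 a reported neighbor may be up to (1+Eps)=3 times farther away than the true
// nearest neighbor, which lets the search prune far more aggressively.
#include "alglibmisc.h"
using namespace alglib;

void approx_knn_demo(const kdtree &kdt, const real_1d_array &x)
{
    kdtreequeryaknn(kdt, x, 5, true, 2.0);   // 5 approximate neighbors, Eps=2.0
    real_1d_array dist;
    kdtreequeryresultsdistances(kdt, dist);  // distances, ordered closest first
}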
This subroutine performs query and stores its result in the internal structures of the KD-tree. You can use following subroutines to obtain these results: * KDTreeQueryResultsX() to get X-values * KDTreeQueryResultsXY() to get X- and Y-values * KDTreeQueryResultsTags() to get tag values * KDTreeQueryResultsDistances() to get distances -- ALGLIB -- Copyright 28.02.2010 by Bochkanov Sergey *************************************************************************/ ae_int_t kdtreequeryaknn(kdtree* kdt, /* Real */ ae_vector* x, ae_int_t k, ae_bool selfmatch, double eps, ae_state *_state) { ae_int_t result; result = kdtreetsqueryaknn(kdt, &kdt->innerbuf, x, k, selfmatch, eps, _state); return result; } /************************************************************************* K-NN query: approximate K nearest neighbors, using thread-local buffer. You can call this function from multiple threads for same kd-tree instance, assuming that different instances of buffer object are passed to different threads. INPUT PARAMETERS KDT - KD-tree Buf - request buffer object created for this particular instance of kd-tree structure with kdtreecreaterequestbuffer() function. X - point, array[0..NX-1]. K - number of neighbors to return, K>=1 SelfMatch - whether self-matches are allowed: * if True, nearest neighbor may be the point itself (if it exists in original dataset) * if False, then only points with non-zero distance are returned * if not given, considered True Eps - approximation factor, Eps>=0. eps-approximate nearest neighbor is a neighbor whose distance from X is at most (1+eps) times distance of true nearest neighbor. RESULT number of actual neighbors found (either K or N, if K>N). NOTES significant performance gain may be achieved only when Eps is is on the order of magnitude of 1 or larger. This subroutine performs query and stores its result in the internal structures of the buffer object. You can use following subroutines to obtain these results (pay attention to "buf" in their names): * KDTreeTsQueryResultsX() to get X-values * KDTreeTsQueryResultsXY() to get X- and Y-values * KDTreeTsQueryResultsTags() to get tag values * KDTreeTsQueryResultsDistances() to get distances IMPORTANT: kd-tree buffer should be used only with KD-tree object which was used to initialize buffer. Any attempt to use biffer with different object is dangerous - you may get integrity check failure (exception) because sizes of internal arrays do not fit to dimensions of KD-tree structure. 
-- ALGLIB -- Copyright 18.03.2016 by Bochkanov Sergey *************************************************************************/ ae_int_t kdtreetsqueryaknn(kdtree* kdt, kdtreerequestbuffer* buf, /* Real */ ae_vector* x, ae_int_t k, ae_bool selfmatch, double eps, ae_state *_state) { ae_int_t i; ae_int_t j; ae_int_t result; ae_assert(k>0, "KDTreeTsQueryAKNN: incorrect K!", _state); ae_assert(ae_fp_greater_eq(eps,(double)(0)), "KDTreeTsQueryAKNN: incorrect Eps!", _state); ae_assert(x->cnt>=kdt->nx, "KDTreeTsQueryAKNN: Length(X)nx, _state), "KDTreeTsQueryAKNN: X contains infinite or NaN values!", _state); /* * Handle special case: KDT.N=0 */ if( kdt->n==0 ) { buf->kcur = 0; result = 0; return result; } /* * Check consistency of request buffer */ nearestneighbor_checkrequestbufferconsistency(kdt, buf, _state); /* * Prepare parameters */ k = ae_minint(k, kdt->n, _state); buf->kneeded = k; buf->rneeded = (double)(0); buf->selfmatch = selfmatch; if( kdt->normtype==2 ) { buf->approxf = 1/ae_sqr(1+eps, _state); } else { buf->approxf = 1/(1+eps); } buf->kcur = 0; /* * calculate distance from point to current bounding box */ nearestneighbor_kdtreeinitbox(kdt, x, buf, _state); /* * call recursive search * results are returned as heap */ nearestneighbor_kdtreequerynnrec(kdt, buf, 0, _state); /* * pop from heap to generate ordered representation * * last element is non pop'ed because it is already in * its place */ result = buf->kcur; j = buf->kcur; for(i=buf->kcur; i>=2; i--) { tagheappopi(&buf->r, &buf->idx, &j, _state); } return result; } /************************************************************************* Box query: all points within user-specified box. IMPORTANT: this function can not be used in multithreaded code because it uses internal temporary buffer of kd-tree object, which can not be shared between multiple threads. If you want to perform parallel requests, use function which uses external request buffer: KDTreeTsQueryBox() ("Ts" stands for "thread-safe"). INPUT PARAMETERS KDT - KD-tree BoxMin - lower bounds, array[0..NX-1]. BoxMax - upper bounds, array[0..NX-1]. RESULT number of actual neighbors found (in [0,N]). This subroutine performs query and stores its result in the internal structures of the KD-tree. You can use following subroutines to obtain these results: * KDTreeQueryResultsX() to get X-values * KDTreeQueryResultsXY() to get X- and Y-values * KDTreeQueryResultsTags() to get tag values * KDTreeQueryResultsDistances() returns zeros for this request NOTE: this particular query returns unordered results, because there is no meaningful way of ordering points. Furthermore, no 'distance' is associated with points - it is either INSIDE or OUTSIDE (so request for distances will return zeros). -- ALGLIB -- Copyright 14.05.2016 by Bochkanov Sergey *************************************************************************/ ae_int_t kdtreequerybox(kdtree* kdt, /* Real */ ae_vector* boxmin, /* Real */ ae_vector* boxmax, ae_state *_state) { ae_int_t result; result = kdtreetsquerybox(kdt, &kdt->innerbuf, boxmin, boxmax, _state); return result; } /************************************************************************* Box query: all points within user-specified box, using thread-local buffer. You can call this function from multiple threads for same kd-tree instance, assuming that different instances of buffer object are passed to different threads. 
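// Illustrative sketch added by the editor (assumption: the public C++ wrappers
// alglib::kdtreequerybox and alglib::kdtreequeryresultsx; header name assumed). A box query
// returns every point whose coordinates fall inside [BoxMin, BoxMax]; results are unordered
// and reported distances are zero. The thread-safe variant described here takes an
// additional request-buffer argument but is otherwise identical.
#include "alglibmisc.h"
using namespace alglib;

void box_query_demo(const kdtree &kdt)
{
    real_1d_array boxmin = "[0.0,0.0]";
    real_1d_array boxmax = "[0.5,1.0]";
    ae_int_t cnt = kdtreequerybox(kdt, boxmin, boxmax); // number of points inside the box
    real_2d_array pts;
    kdtreequeryresultsx(kdt, pts);                      // cnt rows with X-values
}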
INPUT PARAMETERS KDT - KD-tree Buf - request buffer object created for this particular instance of kd-tree structure with kdtreecreaterequestbuffer() function. BoxMin - lower bounds, array[0..NX-1]. BoxMax - upper bounds, array[0..NX-1]. RESULT number of actual neighbors found (in [0,N]). This subroutine performs query and stores its result in the internal structures of the buffer object. You can use following subroutines to obtain these results (pay attention to "ts" in their names): * KDTreeTsQueryResultsX() to get X-values * KDTreeTsQueryResultsXY() to get X- and Y-values * KDTreeTsQueryResultsTags() to get tag values * KDTreeTsQueryResultsDistances() returns zeros for this query NOTE: this particular query returns unordered results, because there is no meaningful way of ordering points. Furthermore, no 'distance' is associated with points - it is either INSIDE or OUTSIDE (so request for distances will return zeros). IMPORTANT: kd-tree buffer should be used only with KD-tree object which was used to initialize buffer. Any attempt to use biffer with different object is dangerous - you may get integrity check failure (exception) because sizes of internal arrays do not fit to dimensions of KD-tree structure. -- ALGLIB -- Copyright 14.05.2016 by Bochkanov Sergey *************************************************************************/ ae_int_t kdtreetsquerybox(kdtree* kdt, kdtreerequestbuffer* buf, /* Real */ ae_vector* boxmin, /* Real */ ae_vector* boxmax, ae_state *_state) { ae_int_t j; ae_int_t result; ae_assert(boxmin->cnt>=kdt->nx, "KDTreeTsQueryBox: Length(BoxMin)cnt>=kdt->nx, "KDTreeTsQueryBox: Length(BoxMax)nx, _state), "KDTreeTsQueryBox: BoxMin contains infinite or NaN values!", _state); ae_assert(isfinitevector(boxmax, kdt->nx, _state), "KDTreeTsQueryBox: BoxMax contains infinite or NaN values!", _state); /* * Check consistency of request buffer */ nearestneighbor_checkrequestbufferconsistency(kdt, buf, _state); /* * Quick exit for degenerate boxes */ for(j=0; j<=kdt->nx-1; j++) { if( ae_fp_greater(boxmin->ptr.p_double[j],boxmax->ptr.p_double[j]) ) { buf->kcur = 0; result = 0; return result; } } /* * Prepare parameters */ for(j=0; j<=kdt->nx-1; j++) { buf->boxmin.ptr.p_double[j] = boxmin->ptr.p_double[j]; buf->boxmax.ptr.p_double[j] = boxmax->ptr.p_double[j]; buf->curboxmin.ptr.p_double[j] = boxmin->ptr.p_double[j]; buf->curboxmax.ptr.p_double[j] = boxmax->ptr.p_double[j]; } buf->kcur = 0; /* * call recursive search */ nearestneighbor_kdtreequeryboxrec(kdt, buf, 0, _state); result = buf->kcur; return result; } /************************************************************************* X-values from last query. This function retuns results stored in the internal buffer of kd-tree object. If you performed buffered requests (ones which use instances of kdtreerequestbuffer class), you should call buffered version of this function - kdtreetsqueryresultsx(). INPUT PARAMETERS KDT - KD-tree X - possibly pre-allocated buffer. If X is too small to store result, it is resized. If size(X) is enough to store result, it is left unchanged. OUTPUT PARAMETERS X - rows are filled with X-values NOTES 1. points are ordered by distance from the query point (first = closest) 2. if XY is larger than required to store result, only leading part will be overwritten; trailing part will be left unchanged. So if on input XY = [[A,B],[C,D]], and result is [1,2], then on exit we will get XY = [[1,2],[C,D]]. 
This is done purposely to increase performance; if you want function to resize array according to result size, use function with same name and suffix 'I'. SEE ALSO * KDTreeQueryResultsXY() X- and Y-values * KDTreeQueryResultsTags() tag values * KDTreeQueryResultsDistances() distances -- ALGLIB -- Copyright 28.02.2010 by Bochkanov Sergey *************************************************************************/ void kdtreequeryresultsx(kdtree* kdt, /* Real */ ae_matrix* x, ae_state *_state) { kdtreetsqueryresultsx(kdt, &kdt->innerbuf, x, _state); } /************************************************************************* X- and Y-values from last query This function retuns results stored in the internal buffer of kd-tree object. If you performed buffered requests (ones which use instances of kdtreerequestbuffer class), you should call buffered version of this function - kdtreetsqueryresultsxy(). INPUT PARAMETERS KDT - KD-tree XY - possibly pre-allocated buffer. If XY is too small to store result, it is resized. If size(XY) is enough to store result, it is left unchanged. OUTPUT PARAMETERS XY - rows are filled with points: first NX columns with X-values, next NY columns - with Y-values. NOTES 1. points are ordered by distance from the query point (first = closest) 2. if XY is larger than required to store result, only leading part will be overwritten; trailing part will be left unchanged. So if on input XY = [[A,B],[C,D]], and result is [1,2], then on exit we will get XY = [[1,2],[C,D]]. This is done purposely to increase performance; if you want function to resize array according to result size, use function with same name and suffix 'I'. SEE ALSO * KDTreeQueryResultsX() X-values * KDTreeQueryResultsTags() tag values * KDTreeQueryResultsDistances() distances -- ALGLIB -- Copyright 28.02.2010 by Bochkanov Sergey *************************************************************************/ void kdtreequeryresultsxy(kdtree* kdt, /* Real */ ae_matrix* xy, ae_state *_state) { kdtreetsqueryresultsxy(kdt, &kdt->innerbuf, xy, _state); } /************************************************************************* Tags from last query This function retuns results stored in the internal buffer of kd-tree object. If you performed buffered requests (ones which use instances of kdtreerequestbuffer class), you should call buffered version of this function - kdtreetsqueryresultstags(). INPUT PARAMETERS KDT - KD-tree Tags - possibly pre-allocated buffer. If X is too small to store result, it is resized. If size(X) is enough to store result, it is left unchanged. OUTPUT PARAMETERS Tags - filled with tags associated with points, or, when no tags were supplied, with zeros NOTES 1. points are ordered by distance from the query point (first = closest) 2. if XY is larger than required to store result, only leading part will be overwritten; trailing part will be left unchanged. So if on input XY = [[A,B],[C,D]], and result is [1,2], then on exit we will get XY = [[1,2],[C,D]]. This is done purposely to increase performance; if you want function to resize array according to result size, use function with same name and suffix 'I'. 
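// Illustrative sketch added by the editor (assumption: the public C++ wrappers
// alglib::kdtreequeryknn and alglib::kdtreequeryresultsdistances; header name assumed).
// It demonstrates the pre-allocation behaviour described in the notes above: an output
// buffer that is already large enough is reused as-is and only its leading part is
// overwritten, which avoids reallocations when the same buffer serves many queries.
#include "alglibmisc.h"
using namespace alglib;

void result_buffer_reuse_demo(const kdtree &kdt, const real_1d_array &x)
{
    real_1d_array dist = "[-1,-1,-1,-1,-1,-1,-1,-1]";  // 8 slots, pre-allocated
    for(int pass = 0; pass < 2; pass++)
    {
        kdtreequeryknn(kdt, x, 3);                 // 3 nearest neighbors
        kdtreequeryresultsdistances(kdt, dist);    // fills dist[0..2] only;
                                                   // dist keeps its length of 8,
                                                   // trailing -1 entries stay untouched
    }
}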
SEE ALSO * KDTreeQueryResultsX() X-values * KDTreeQueryResultsXY() X- and Y-values * KDTreeQueryResultsDistances() distances -- ALGLIB -- Copyright 28.02.2010 by Bochkanov Sergey *************************************************************************/ void kdtreequeryresultstags(kdtree* kdt, /* Integer */ ae_vector* tags, ae_state *_state) { kdtreetsqueryresultstags(kdt, &kdt->innerbuf, tags, _state); } /************************************************************************* Distances from last query This function retuns results stored in the internal buffer of kd-tree object. If you performed buffered requests (ones which use instances of kdtreerequestbuffer class), you should call buffered version of this function - kdtreetsqueryresultsdistances(). INPUT PARAMETERS KDT - KD-tree R - possibly pre-allocated buffer. If X is too small to store result, it is resized. If size(X) is enough to store result, it is left unchanged. OUTPUT PARAMETERS R - filled with distances (in corresponding norm) NOTES 1. points are ordered by distance from the query point (first = closest) 2. if XY is larger than required to store result, only leading part will be overwritten; trailing part will be left unchanged. So if on input XY = [[A,B],[C,D]], and result is [1,2], then on exit we will get XY = [[1,2],[C,D]]. This is done purposely to increase performance; if you want function to resize array according to result size, use function with same name and suffix 'I'. SEE ALSO * KDTreeQueryResultsX() X-values * KDTreeQueryResultsXY() X- and Y-values * KDTreeQueryResultsTags() tag values -- ALGLIB -- Copyright 28.02.2010 by Bochkanov Sergey *************************************************************************/ void kdtreequeryresultsdistances(kdtree* kdt, /* Real */ ae_vector* r, ae_state *_state) { kdtreetsqueryresultsdistances(kdt, &kdt->innerbuf, r, _state); } /************************************************************************* X-values from last query associated with kdtreerequestbuffer object. INPUT PARAMETERS KDT - KD-tree Buf - request buffer object created for this particular instance of kd-tree structure. X - possibly pre-allocated buffer. If X is too small to store result, it is resized. If size(X) is enough to store result, it is left unchanged. OUTPUT PARAMETERS X - rows are filled with X-values NOTES 1. points are ordered by distance from the query point (first = closest) 2. if XY is larger than required to store result, only leading part will be overwritten; trailing part will be left unchanged. So if on input XY = [[A,B],[C,D]], and result is [1,2], then on exit we will get XY = [[1,2],[C,D]]. This is done purposely to increase performance; if you want function to resize array according to result size, use function with same name and suffix 'I'. 
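// Illustrative sketch added by the editor (assumption: the public C++ wrappers
// alglib::kdtreetsqueryrnn, alglib::kdtreetsqueryresultsxy and
// alglib::kdtreetsqueryresultsdistances; header name assumed). After any buffered query,
// every result reader takes the same (tree, buffer) pair that was used for the query itself.
#include "alglibmisc.h"
using namespace alglib;

void buffered_results_demo(const kdtree &kdt, kdtreerequestbuffer &buf)
{
    real_1d_array x = "[0.5,0.5]";
    kdtreetsqueryrnn(kdt, buf, x, 0.75, true);      // all points within radius 0.75
    real_2d_array xy;
    real_1d_array dist;
    kdtreetsqueryresultsxy(kdt, buf, xy);           // X- and Y-values of the hits
    kdtreetsqueryresultsdistances(kdt, buf, dist);  // matching distances
}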
SEE ALSO * KDTreeQueryResultsXY() X- and Y-values * KDTreeQueryResultsTags() tag values * KDTreeQueryResultsDistances() distances -- ALGLIB -- Copyright 28.02.2010 by Bochkanov Sergey *************************************************************************/ void kdtreetsqueryresultsx(kdtree* kdt, kdtreerequestbuffer* buf, /* Real */ ae_matrix* x, ae_state *_state) { ae_int_t i; ae_int_t k; if( buf->kcur==0 ) { return; } if( x->rows<buf->kcur||x->cols<kdt->nx ) { ae_matrix_set_length(x, buf->kcur, kdt->nx, _state); } k = buf->kcur; for(i=0; i<=k-1; i++) { ae_v_move(&x->ptr.pp_double[i][0], 1, &kdt->xy.ptr.pp_double[buf->idx.ptr.p_int[i]][kdt->nx], 1, ae_v_len(0,kdt->nx-1)); } } /************************************************************************* X- and Y-values from last query associated with kdtreerequestbuffer object. INPUT PARAMETERS KDT - KD-tree Buf - request buffer object created for this particular instance of kd-tree structure. XY - possibly pre-allocated buffer. If XY is too small to store result, it is resized. If size(XY) is enough to store result, it is left unchanged. OUTPUT PARAMETERS XY - rows are filled with points: first NX columns with X-values, next NY columns - with Y-values. NOTES 1. points are ordered by distance from the query point (first = closest) 2. if XY is larger than required to store result, only leading part will be overwritten; trailing part will be left unchanged. So if on input XY = [[A,B],[C,D]], and result is [1,2], then on exit we will get XY = [[1,2],[C,D]]. This is done purposely to increase performance; if you want function to resize array according to result size, use function with same name and suffix 'I'. SEE ALSO * KDTreeQueryResultsX() X-values * KDTreeQueryResultsTags() tag values * KDTreeQueryResultsDistances() distances -- ALGLIB -- Copyright 28.02.2010 by Bochkanov Sergey *************************************************************************/ void kdtreetsqueryresultsxy(kdtree* kdt, kdtreerequestbuffer* buf, /* Real */ ae_matrix* xy, ae_state *_state) { ae_int_t i; ae_int_t k; if( buf->kcur==0 ) { return; } if( xy->rows<buf->kcur||xy->cols<kdt->nx+kdt->ny ) { ae_matrix_set_length(xy, buf->kcur, kdt->nx+kdt->ny, _state); } k = buf->kcur; for(i=0; i<=k-1; i++) { ae_v_move(&xy->ptr.pp_double[i][0], 1, &kdt->xy.ptr.pp_double[buf->idx.ptr.p_int[i]][kdt->nx], 1, ae_v_len(0,kdt->nx+kdt->ny-1)); } } /************************************************************************* Tags from last query associated with kdtreerequestbuffer object. This function returns results stored in the internal buffer of kd-tree object. If you performed buffered requests (ones which use instances of kdtreerequestbuffer class), you should call buffered version of this function - KDTreeTsqueryresultstags(). INPUT PARAMETERS KDT - KD-tree Buf - request buffer object created for this particular instance of kd-tree structure. Tags - possibly pre-allocated buffer. If X is too small to store result, it is resized. If size(X) is enough to store result, it is left unchanged. OUTPUT PARAMETERS Tags - filled with tags associated with points, or, when no tags were supplied, with zeros NOTES 1. points are ordered by distance from the query point (first = closest) 2. if XY is larger than required to store result, only leading part will be overwritten; trailing part will be left unchanged. So if on input XY = [[A,B],[C,D]], and result is [1,2], then on exit we will get XY = [[1,2],[C,D]]. 
This is done purposely to increase performance; if you want function to resize array according to result size, use function with same name and suffix 'I'. SEE ALSO * KDTreeQueryResultsX() X-values * KDTreeQueryResultsXY() X- and Y-values * KDTreeQueryResultsDistances() distances -- ALGLIB -- Copyright 28.02.2010 by Bochkanov Sergey *************************************************************************/ void kdtreetsqueryresultstags(kdtree* kdt, kdtreerequestbuffer* buf, /* Integer */ ae_vector* tags, ae_state *_state) { ae_int_t i; ae_int_t k; if( buf->kcur==0 ) { return; } if( tags->cnt<buf->kcur ) { ae_vector_set_length(tags, buf->kcur, _state); } k = buf->kcur; for(i=0; i<=k-1; i++) { tags->ptr.p_int[i] = kdt->tags.ptr.p_int[buf->idx.ptr.p_int[i]]; } } /************************************************************************* Distances from last query associated with kdtreerequestbuffer object. This function returns results stored in the request buffer object; use it after buffered requests (ones which use instances of the kdtreerequestbuffer class). For ordinary (non-buffered) requests use kdtreequeryresultsdistances() instead. INPUT PARAMETERS KDT - KD-tree Buf - request buffer object created for this particular instance of kd-tree structure. R - possibly pre-allocated buffer. If R is too small to store result, it is resized. If size(R) is enough to store result, it is left unchanged. OUTPUT PARAMETERS R - filled with distances (in corresponding norm) NOTES 1. points are ordered by distance from the query point (first = closest) 2. if XY is larger than required to store result, only leading part will be overwritten; trailing part will be left unchanged. So if on input XY = [[A,B],[C,D]], and result is [1,2], then on exit we will get XY = [[1,2],[C,D]]. This is done purposely to increase performance; if you want function to resize array according to result size, use function with same name and suffix 'I'. SEE ALSO * KDTreeQueryResultsX() X-values * KDTreeQueryResultsXY() X- and Y-values * KDTreeQueryResultsTags() tag values -- ALGLIB -- Copyright 28.02.2010 by Bochkanov Sergey *************************************************************************/ void kdtreetsqueryresultsdistances(kdtree* kdt, kdtreerequestbuffer* buf, /* Real */ ae_vector* r, ae_state *_state) { ae_int_t i; ae_int_t k; if( buf->kcur==0 ) { return; } if( r->cnt<buf->kcur ) { ae_vector_set_length(r, buf->kcur, _state); } k = buf->kcur; /* * unload norms * * Abs() call is used to handle cases with negative norms * (generated during KFN requests) */ if( kdt->normtype==0 ) { for(i=0; i<=k-1; i++) { r->ptr.p_double[i] = ae_fabs(buf->r.ptr.p_double[i], _state); } } if( kdt->normtype==1 ) { for(i=0; i<=k-1; i++) { r->ptr.p_double[i] = ae_fabs(buf->r.ptr.p_double[i], _state); } } if( kdt->normtype==2 ) { for(i=0; i<=k-1; i++) { r->ptr.p_double[i] = ae_sqrt(ae_fabs(buf->r.ptr.p_double[i], _state), _state); } } } /************************************************************************* X-values from last query; 'interactive' variant for languages like Python which support constructs like "X = KDTreeQueryResultsXI(KDT)" and interactive mode of interpreter. This function allocates new array on each call, so it is significantly slower than its 'non-interactive' counterpart, but it is more convenient when you call it from command line.
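A short sketch of how the plain and 'interactive' variants differ in practice (kdtreequeryknn() is assumed to have been used to issue a non-buffered query; variable names are illustrative):

    k = kdtreequeryknn(&kdt, &x, 3, ae_true, _state);   // query via the tree's internal buffer
    kdtreequeryresultsx(&kdt, &xr, _state);             // reuses XR if it is already large enough
    kdtreequeryresultsxi(&kdt, &xr, _state);            // 'I' variant: always reallocates XR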
-- ALGLIB -- Copyright 28.02.2010 by Bochkanov Sergey *************************************************************************/ void kdtreequeryresultsxi(kdtree* kdt, /* Real */ ae_matrix* x, ae_state *_state) { ae_matrix_clear(x); kdtreequeryresultsx(kdt, x, _state); } /************************************************************************* XY-values from last query; 'interactive' variant for languages like Python which support constructs like "XY = KDTreeQueryResultsXYI(KDT)" and interactive mode of interpreter. This function allocates new array on each call, so it is significantly slower than its 'non-interactive' counterpart, but it is more convenient when you call it from command line. -- ALGLIB -- Copyright 28.02.2010 by Bochkanov Sergey *************************************************************************/ void kdtreequeryresultsxyi(kdtree* kdt, /* Real */ ae_matrix* xy, ae_state *_state) { ae_matrix_clear(xy); kdtreequeryresultsxy(kdt, xy, _state); } /************************************************************************* Tags from last query; 'interactive' variant for languages like Python which support constructs like "Tags = KDTreeQueryResultsTagsI(KDT)" and interactive mode of interpreter. This function allocates new array on each call, so it is significantly slower than its 'non-interactive' counterpart, but it is more convenient when you call it from command line. -- ALGLIB -- Copyright 28.02.2010 by Bochkanov Sergey *************************************************************************/ void kdtreequeryresultstagsi(kdtree* kdt, /* Integer */ ae_vector* tags, ae_state *_state) { ae_vector_clear(tags); kdtreequeryresultstags(kdt, tags, _state); } /************************************************************************* Distances from last query; 'interactive' variant for languages like Python which support constructs like "R = KDTreeQueryResultsDistancesI(KDT)" and interactive mode of interpreter. This function allocates new array on each call, so it is significantly slower than its 'non-interactive' counterpart, but it is more convenient when you call it from command line. -- ALGLIB -- Copyright 28.02.2010 by Bochkanov Sergey *************************************************************************/ void kdtreequeryresultsdistancesi(kdtree* kdt, /* Real */ ae_vector* r, ae_state *_state) { ae_vector_clear(r); kdtreequeryresultsdistances(kdt, r, _state); } /************************************************************************* It is informational function which returns bounding box for entire dataset. This function is not visible to ALGLIB users, only ALGLIB itself may use it. This function assumes that output buffers are preallocated by caller. -- ALGLIB -- Copyright 20.06.2016 by Bochkanov Sergey *************************************************************************/ void kdtreeexplorebox(kdtree* kdt, /* Real */ ae_vector* boxmin, /* Real */ ae_vector* boxmax, ae_state *_state) { ae_int_t i; rvectorsetlengthatleast(boxmin, kdt->nx, _state); rvectorsetlengthatleast(boxmax, kdt->nx, _state); for(i=0; i<=kdt->nx-1; i++) { boxmin->ptr.p_double[i] = kdt->boxmin.ptr.p_double[i]; boxmax->ptr.p_double[i] = kdt->boxmax.ptr.p_double[i]; } } /************************************************************************* It is informational function which allows to get information about node type. Node index is given by integer value, with 0 corresponding to root node and other node indexes obtained via exploration. 
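The explore functions are internal, but purely as an illustration of how they fit together, a depth-first walk starting at node 0 (the root) could be sketched as follows (walk() is a hypothetical helper, not part of ALGLIB):

    static void walk(kdtree* kdt, ae_int_t node, ae_state *_state)
    {
        ae_frame _frame_block;
        ae_int_t nodetype;
        ae_int_t d;
        ae_int_t nodele;
        ae_int_t nodege;
        ae_int_t k;
        double s;
        ae_matrix xy;
        ae_frame_make(_state, &_frame_block);
        ae_matrix_init(&xy, 0, 0, DT_REAL, _state);
        kdtreeexplorenodetype(kdt, node, &nodetype, _state);
        if( nodetype==0 )
        {
            kdtreeexploreleaf(kdt, node, &xy, &k, _state);             // leaf: K points in XY
        }
        else
        {
            kdtreeexploresplit(kdt, node, &d, &s, &nodele, &nodege, _state);
            walk(kdt, nodele, _state);                                 // child with X[D]<=S
            walk(kdt, nodege, _state);                                 // child with X[D]>=S
        }
        ae_frame_leave(_state);
    }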
You should not expect that serialization/unserialization will retain node indexes. You should keep in mind that future versions of ALGLIB may introduce new node types. OUTPUT VALUES: NodeType - node type: * 0 corresponds to leaf node, which can be explored by kdtreeexploreleaf() function * 1 corresponds to split node, which can be explored by kdtreeexploresplit() function -- ALGLIB -- Copyright 20.06.2016 by Bochkanov Sergey *************************************************************************/ void kdtreeexplorenodetype(kdtree* kdt, ae_int_t node, ae_int_t* nodetype, ae_state *_state) { *nodetype = 0; ae_assert(node>=0, "KDTreeExploreNodeType: incorrect node", _state); ae_assert(node<kdt->nodes.cnt, "KDTreeExploreNodeType: incorrect node", _state); if( kdt->nodes.ptr.p_int[node]>0 ) { /* * Leaf node */ *nodetype = 0; return; } if( kdt->nodes.ptr.p_int[node]==0 ) { /* * Split node */ *nodetype = 1; return; } ae_assert(ae_false, "KDTreeExploreNodeType: integrity check failure", _state); } /************************************************************************* This is an informational function which returns information about a leaf node. Node index is given by integer value, with 0 corresponding to root node and other node indexes obtained via exploration. You should not expect that serialization/unserialization will retain node indexes. You should keep in mind that future versions of ALGLIB may introduce new node types. OUTPUT VALUES: XY - output buffer is reallocated (if too small) and filled by XY values K - number of rows in XY -- ALGLIB -- Copyright 20.06.2016 by Bochkanov Sergey *************************************************************************/ void kdtreeexploreleaf(kdtree* kdt, ae_int_t node, /* Real */ ae_matrix* xy, ae_int_t* k, ae_state *_state) { ae_int_t offs; ae_int_t i; ae_int_t j; *k = 0; ae_assert(node>=0, "KDTreeExploreLeaf: incorrect node index", _state); ae_assert(node+1<kdt->nodes.cnt, "KDTreeExploreLeaf: incorrect node index", _state); ae_assert(kdt->nodes.ptr.p_int[node]>0, "KDTreeExploreLeaf: incorrect node index", _state); *k = kdt->nodes.ptr.p_int[node]; offs = kdt->nodes.ptr.p_int[node+1]; ae_assert(offs>=0, "KDTreeExploreLeaf: integrity error", _state); ae_assert(offs+(*k)-1<kdt->xy.rows, "KDTreeExploreLeaf: integrity error", _state); rmatrixsetlengthatleast(xy, *k, kdt->nx+kdt->ny, _state); for(i=0; i<=*k-1; i++) { for(j=0; j<=kdt->nx+kdt->ny-1; j++) { xy->ptr.pp_double[i][j] = kdt->xy.ptr.pp_double[offs+i][kdt->nx+j]; } } } /************************************************************************* This is an informational function which returns information about a split node. Node index is given by integer value, with 0 corresponding to root node and other node indexes obtained via exploration. You should not expect that serialization/unserialization will retain node indexes. You should keep in mind that future versions of ALGLIB may introduce new node types.
OUTPUT VALUES: D - splitting dimension S - split position NodeLE - index (in Nodes[]) of the child node holding points with X[D]<=S NodeGE - index (in Nodes[]) of the child node holding points with X[D]>=S // Nodes[idx+1]=dim dimension to split // Nodes[idx+2]=offs offset of splitting point in Splits[] // Nodes[idx+3]=left position of left child in Nodes[] // Nodes[idx+4]=right position of right child in Nodes[] -- ALGLIB -- Copyright 20.06.2016 by Bochkanov Sergey *************************************************************************/ void kdtreeexploresplit(kdtree* kdt, ae_int_t node, ae_int_t* d, double* s, ae_int_t* nodele, ae_int_t* nodege, ae_state *_state) { *d = 0; *s = 0; *nodele = 0; *nodege = 0; ae_assert(node>=0, "KDTreeExploreSplit: incorrect node index", _state); ae_assert(node+4<kdt->nodes.cnt, "KDTreeExploreSplit: incorrect node index", _state); ae_assert(kdt->nodes.ptr.p_int[node]==0, "KDTreeExploreSplit: incorrect node index", _state); *d = kdt->nodes.ptr.p_int[node+1]; *s = kdt->splits.ptr.p_double[kdt->nodes.ptr.p_int[node+2]]; *nodele = kdt->nodes.ptr.p_int[node+3]; *nodege = kdt->nodes.ptr.p_int[node+4]; ae_assert(*d>=0, "KDTreeExploreSplit: integrity failure", _state); ae_assert(*d<kdt->nx, "KDTreeExploreSplit: integrity failure", _state); ae_assert(ae_isfinite(*s, _state), "KDTreeExploreSplit: integrity failure", _state); ae_assert(*nodele>=0, "KDTreeExploreSplit: integrity failure", _state); ae_assert(*nodele<kdt->nodes.cnt, "KDTreeExploreSplit: integrity failure", _state); ae_assert(*nodege>=0, "KDTreeExploreSplit: integrity failure", _state); ae_assert(*nodege<kdt->nodes.cnt, "KDTreeExploreSplit: integrity failure", _state); } /************************************************************************* Serializer: allocation -- ALGLIB -- Copyright 14.03.2011 by Bochkanov Sergey *************************************************************************/ void kdtreealloc(ae_serializer* s, kdtree* tree, ae_state *_state) { /* * Header */ ae_serializer_alloc_entry(s); ae_serializer_alloc_entry(s); /* * Data */ ae_serializer_alloc_entry(s); ae_serializer_alloc_entry(s); ae_serializer_alloc_entry(s); ae_serializer_alloc_entry(s); allocrealmatrix(s, &tree->xy, -1, -1, _state); allocintegerarray(s, &tree->tags, -1, _state); allocrealarray(s, &tree->boxmin, -1, _state); allocrealarray(s, &tree->boxmax, -1, _state); allocintegerarray(s, &tree->nodes, -1, _state); allocrealarray(s, &tree->splits, -1, _state); } /************************************************************************* Serializer: serialization -- ALGLIB -- Copyright 14.03.2011 by Bochkanov Sergey *************************************************************************/ void kdtreeserialize(ae_serializer* s, kdtree* tree, ae_state *_state) { /* * Header */ ae_serializer_serialize_int(s, getkdtreeserializationcode(_state), _state); ae_serializer_serialize_int(s, nearestneighbor_kdtreefirstversion, _state); /* * Data */ ae_serializer_serialize_int(s, tree->n, _state); ae_serializer_serialize_int(s, tree->nx, _state); ae_serializer_serialize_int(s, tree->ny, _state); ae_serializer_serialize_int(s, tree->normtype, _state); serializerealmatrix(s, &tree->xy, -1, -1, _state); serializeintegerarray(s, &tree->tags, -1, _state); serializerealarray(s, &tree->boxmin, -1, _state); serializerealarray(s, &tree->boxmax, -1, _state); serializeintegerarray(s, &tree->nodes, -1, _state); serializerealarray(s, &tree->splits, -1, _state); } /************************************************************************* Serializer: unserialization -- ALGLIB -- Copyright 14.03.2011 by Bochkanov Sergey
*************************************************************************/ void kdtreeunserialize(ae_serializer* s, kdtree* tree, ae_state *_state) { ae_int_t i0; ae_int_t i1; _kdtree_clear(tree); /* * check correctness of header */ ae_serializer_unserialize_int(s, &i0, _state); ae_assert(i0==getkdtreeserializationcode(_state), "KDTreeUnserialize: stream header corrupted", _state); ae_serializer_unserialize_int(s, &i1, _state); ae_assert(i1==nearestneighbor_kdtreefirstversion, "KDTreeUnserialize: stream header corrupted", _state); /* * Unserialize data */ ae_serializer_unserialize_int(s, &tree->n, _state); ae_serializer_unserialize_int(s, &tree->nx, _state); ae_serializer_unserialize_int(s, &tree->ny, _state); ae_serializer_unserialize_int(s, &tree->normtype, _state); unserializerealmatrix(s, &tree->xy, _state); unserializeintegerarray(s, &tree->tags, _state); unserializerealarray(s, &tree->boxmin, _state); unserializerealarray(s, &tree->boxmax, _state); unserializeintegerarray(s, &tree->nodes, _state); unserializerealarray(s, &tree->splits, _state); kdtreecreaterequestbuffer(tree, &tree->innerbuf, _state); } /************************************************************************* Rearranges nodes [I1,I2) using partition in D-th dimension with S as threshold. Returns split position I3: [I1,I3) and [I3,I2) are created as result. This subroutine doesn't create tree structures, just rearranges nodes. *************************************************************************/ static void nearestneighbor_kdtreesplit(kdtree* kdt, ae_int_t i1, ae_int_t i2, ae_int_t d, double s, ae_int_t* i3, ae_state *_state) { ae_int_t i; ae_int_t j; ae_int_t ileft; ae_int_t iright; double v; *i3 = 0; ae_assert(kdt->n>0, "KDTreeSplit: internal error", _state); /* * split XY/Tags in two parts: * * [ILeft,IRight] is non-processed part of XY/Tags * * After cycle is done, we have Ileft=IRight. We deal with * this element separately. * * After this, [I1,ILeft) contains left part, and [ILeft,I2) * contains right part. */ ileft = i1; iright = i2-1; while(ileft<iright) { if( ae_fp_less_eq(kdt->xy.ptr.pp_double[ileft][d],s) ) { /* * XY[ILeft] is on its place. * Advance ILeft. */ ileft = ileft+1; } else { /* * XY[ILeft,..] must be at IRight. * Swap and advance IRight. */ for(i=0; i<=2*kdt->nx+kdt->ny-1; i++) { v = kdt->xy.ptr.pp_double[ileft][i]; kdt->xy.ptr.pp_double[ileft][i] = kdt->xy.ptr.pp_double[iright][i]; kdt->xy.ptr.pp_double[iright][i] = v; } j = kdt->tags.ptr.p_int[ileft]; kdt->tags.ptr.p_int[ileft] = kdt->tags.ptr.p_int[iright]; kdt->tags.ptr.p_int[iright] = j; iright = iright-1; } } if( ae_fp_less_eq(kdt->xy.ptr.pp_double[ileft][d],s) ) { ileft = ileft+1; } else { iright = iright-1; } *i3 = ileft; } /************************************************************************* Recursive kd-tree generation subroutine. PARAMETERS KDT tree NodesOffs unused part of Nodes[] which must be filled by tree SplitsOffs unused part of Splits[] I1, I2 points from [I1,I2) are processed NodesOffs[] and SplitsOffs[] must be large enough.
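For reference, the Nodes[]/Splits[] encoding produced by this subroutine (as can be inferred from the code below) is:

// leaf node:
//     Nodes[idx+0] = K        number of points in the leaf (K>0)
//     Nodes[idx+1] = I1       points XY[I1..I1+K-1] belong to this leaf
// split node:
//     Nodes[idx+0] = 0
//     Nodes[idx+1] = D        dimension to split
//     Nodes[idx+2] = offs     offset of split value S in Splits[]
//     Nodes[idx+3] = left     offset of the 'X[D]<=S' child in Nodes[]
//     Nodes[idx+4] = right    offset of the 'X[D]>=S' child in Nodes[]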
-- ALGLIB -- Copyright 28.02.2010 by Bochkanov Sergey *************************************************************************/ static void nearestneighbor_kdtreegeneratetreerec(kdtree* kdt, ae_int_t* nodesoffs, ae_int_t* splitsoffs, ae_int_t i1, ae_int_t i2, ae_int_t maxleafsize, ae_state *_state) { ae_int_t n; ae_int_t nx; ae_int_t ny; ae_int_t i; ae_int_t j; ae_int_t oldoffs; ae_int_t i3; ae_int_t cntless; ae_int_t cntgreater; double minv; double maxv; ae_int_t minidx; ae_int_t maxidx; ae_int_t d; double ds; double s; double v; double v0; double v1; ae_assert(kdt->n>0, "KDTreeGenerateTreeRec: internal error", _state); ae_assert(i2>i1, "KDTreeGenerateTreeRec: internal error", _state); /* * Generate leaf if needed */ if( i2-i1<=maxleafsize ) { kdt->nodes.ptr.p_int[*nodesoffs+0] = i2-i1; kdt->nodes.ptr.p_int[*nodesoffs+1] = i1; *nodesoffs = *nodesoffs+2; return; } /* * Load values for easier access */ nx = kdt->nx; ny = kdt->ny; /* * Select dimension to split: * * D is a dimension number * In case bounding box has zero size, we enforce creation of the leaf node. */ d = 0; ds = kdt->innerbuf.curboxmax.ptr.p_double[0]-kdt->innerbuf.curboxmin.ptr.p_double[0]; for(i=1; i<=nx-1; i++) { v = kdt->innerbuf.curboxmax.ptr.p_double[i]-kdt->innerbuf.curboxmin.ptr.p_double[i]; if( ae_fp_greater(v,ds) ) { ds = v; d = i; } } if( ae_fp_eq(ds,(double)(0)) ) { kdt->nodes.ptr.p_int[*nodesoffs+0] = i2-i1; kdt->nodes.ptr.p_int[*nodesoffs+1] = i1; *nodesoffs = *nodesoffs+2; return; } /* * Select split position S using sliding midpoint rule, * rearrange points into [I1,I3) and [I3,I2). * * In case all points has same value of D-th component * (MinV=MaxV) we enforce D-th dimension of bounding * box to become exactly zero and repeat tree construction. */ s = kdt->innerbuf.curboxmin.ptr.p_double[d]+0.5*ds; ae_v_move(&kdt->innerbuf.buf.ptr.p_double[0], 1, &kdt->xy.ptr.pp_double[i1][d], kdt->xy.stride, ae_v_len(0,i2-i1-1)); n = i2-i1; cntless = 0; cntgreater = 0; minv = kdt->innerbuf.buf.ptr.p_double[0]; maxv = kdt->innerbuf.buf.ptr.p_double[0]; minidx = i1; maxidx = i1; for(i=0; i<=n-1; i++) { v = kdt->innerbuf.buf.ptr.p_double[i]; if( ae_fp_less(v,minv) ) { minv = v; minidx = i1+i; } if( ae_fp_greater(v,maxv) ) { maxv = v; maxidx = i1+i; } if( ae_fp_less(v,s) ) { cntless = cntless+1; } if( ae_fp_greater(v,s) ) { cntgreater = cntgreater+1; } } if( ae_fp_eq(minv,maxv) ) { /* * In case all points has same value of D-th component * (MinV=MaxV) we enforce D-th dimension of bounding * box to become exactly zero and repeat tree construction. */ v0 = kdt->innerbuf.curboxmin.ptr.p_double[d]; v1 = kdt->innerbuf.curboxmax.ptr.p_double[d]; kdt->innerbuf.curboxmin.ptr.p_double[d] = minv; kdt->innerbuf.curboxmax.ptr.p_double[d] = maxv; nearestneighbor_kdtreegeneratetreerec(kdt, nodesoffs, splitsoffs, i1, i2, maxleafsize, _state); kdt->innerbuf.curboxmin.ptr.p_double[d] = v0; kdt->innerbuf.curboxmax.ptr.p_double[d] = v1; return; } if( cntless>0&&cntgreater>0 ) { /* * normal midpoint split */ nearestneighbor_kdtreesplit(kdt, i1, i2, d, s, &i3, _state); } else { /* * sliding midpoint */ if( cntless==0 ) { /* * 1. move split to MinV, * 2. 
place one point to the left bin (move to I1), * others - to the right bin */ s = minv; if( minidx!=i1 ) { for(i=0; i<=2*nx+ny-1; i++) { v = kdt->xy.ptr.pp_double[minidx][i]; kdt->xy.ptr.pp_double[minidx][i] = kdt->xy.ptr.pp_double[i1][i]; kdt->xy.ptr.pp_double[i1][i] = v; } j = kdt->tags.ptr.p_int[minidx]; kdt->tags.ptr.p_int[minidx] = kdt->tags.ptr.p_int[i1]; kdt->tags.ptr.p_int[i1] = j; } i3 = i1+1; } else { /* * 1. move split to MaxV, * 2. place one point to the right bin (move to I2-1), * others - to the left bin */ s = maxv; if( maxidx!=i2-1 ) { for(i=0; i<=2*nx+ny-1; i++) { v = kdt->xy.ptr.pp_double[maxidx][i]; kdt->xy.ptr.pp_double[maxidx][i] = kdt->xy.ptr.pp_double[i2-1][i]; kdt->xy.ptr.pp_double[i2-1][i] = v; } j = kdt->tags.ptr.p_int[maxidx]; kdt->tags.ptr.p_int[maxidx] = kdt->tags.ptr.p_int[i2-1]; kdt->tags.ptr.p_int[i2-1] = j; } i3 = i2-1; } } /* * Generate 'split' node */ kdt->nodes.ptr.p_int[*nodesoffs+0] = 0; kdt->nodes.ptr.p_int[*nodesoffs+1] = d; kdt->nodes.ptr.p_int[*nodesoffs+2] = *splitsoffs; kdt->splits.ptr.p_double[*splitsoffs+0] = s; oldoffs = *nodesoffs; *nodesoffs = *nodesoffs+nearestneighbor_splitnodesize; *splitsoffs = *splitsoffs+1; /* * Recirsive generation: * * update CurBox * * call subroutine * * restore CurBox */ kdt->nodes.ptr.p_int[oldoffs+3] = *nodesoffs; v = kdt->innerbuf.curboxmax.ptr.p_double[d]; kdt->innerbuf.curboxmax.ptr.p_double[d] = s; nearestneighbor_kdtreegeneratetreerec(kdt, nodesoffs, splitsoffs, i1, i3, maxleafsize, _state); kdt->innerbuf.curboxmax.ptr.p_double[d] = v; kdt->nodes.ptr.p_int[oldoffs+4] = *nodesoffs; v = kdt->innerbuf.curboxmin.ptr.p_double[d]; kdt->innerbuf.curboxmin.ptr.p_double[d] = s; nearestneighbor_kdtreegeneratetreerec(kdt, nodesoffs, splitsoffs, i3, i2, maxleafsize, _state); kdt->innerbuf.curboxmin.ptr.p_double[d] = v; } /************************************************************************* Recursive subroutine for NN queries. -- ALGLIB -- Copyright 28.02.2010 by Bochkanov Sergey *************************************************************************/ static void nearestneighbor_kdtreequerynnrec(kdtree* kdt, kdtreerequestbuffer* buf, ae_int_t offs, ae_state *_state) { double ptdist; ae_int_t i; ae_int_t j; ae_int_t nx; ae_int_t i1; ae_int_t i2; ae_int_t d; double s; double v; double t1; ae_int_t childbestoffs; ae_int_t childworstoffs; ae_int_t childoffs; double prevdist; ae_bool todive; ae_bool bestisleft; ae_bool updatemin; ae_assert(kdt->n>0, "KDTreeQueryNNRec: internal error", _state); /* * Leaf node. * Process points. */ if( kdt->nodes.ptr.p_int[offs]>0 ) { i1 = kdt->nodes.ptr.p_int[offs+1]; i2 = i1+kdt->nodes.ptr.p_int[offs]; for(i=i1; i<=i2-1; i++) { /* * Calculate distance */ ptdist = (double)(0); nx = kdt->nx; if( kdt->normtype==0 ) { for(j=0; j<=nx-1; j++) { ptdist = ae_maxreal(ptdist, ae_fabs(kdt->xy.ptr.pp_double[i][j]-buf->x.ptr.p_double[j], _state), _state); } } if( kdt->normtype==1 ) { for(j=0; j<=nx-1; j++) { ptdist = ptdist+ae_fabs(kdt->xy.ptr.pp_double[i][j]-buf->x.ptr.p_double[j], _state); } } if( kdt->normtype==2 ) { for(j=0; j<=nx-1; j++) { ptdist = ptdist+ae_sqr(kdt->xy.ptr.pp_double[i][j]-buf->x.ptr.p_double[j], _state); } } /* * Skip points with zero distance if self-matches are turned off */ if( ae_fp_eq(ptdist,(double)(0))&&!buf->selfmatch ) { continue; } /* * We CAN'T process point if R-criterion isn't satisfied, * i.e. (RNeeded<>0) AND (PtDist>R). 
*/ if( ae_fp_eq(buf->rneeded,(double)(0))||ae_fp_less_eq(ptdist,buf->rneeded) ) { /* * R-criterion is satisfied, we must either: * * replace worst point, if (KNeeded<>0) AND (KCur=KNeeded) * (or skip, if worst point is better) * * add point without replacement otherwise */ if( buf->kcur<buf->kneeded||buf->kneeded==0 ) { /* * add current point to heap without replacement */ tagheappushi(&buf->r, &buf->idx, &buf->kcur, ptdist, i, _state); } else { /* * New points are added or not, depending on their distance. * If added, they replace element at the top of the heap */ if( ae_fp_less(ptdist,buf->r.ptr.p_double[0]) ) { if( buf->kneeded==1 ) { buf->idx.ptr.p_int[0] = i; buf->r.ptr.p_double[0] = ptdist; } else { tagheapreplacetopi(&buf->r, &buf->idx, buf->kneeded, ptdist, i, _state); } } } } } return; } /* * Simple split */ if( kdt->nodes.ptr.p_int[offs]==0 ) { /* * Load: * * D dimension to split * * S split position */ d = kdt->nodes.ptr.p_int[offs+1]; s = kdt->splits.ptr.p_double[kdt->nodes.ptr.p_int[offs+2]]; /* * Calculate: * * ChildBestOffs child box with best chances * * ChildWorstOffs child box with worst chances */ if( ae_fp_less_eq(buf->x.ptr.p_double[d],s) ) { childbestoffs = kdt->nodes.ptr.p_int[offs+3]; childworstoffs = kdt->nodes.ptr.p_int[offs+4]; bestisleft = ae_true; } else { childbestoffs = kdt->nodes.ptr.p_int[offs+4]; childworstoffs = kdt->nodes.ptr.p_int[offs+3]; bestisleft = ae_false; } /* * Navigate through childs */ for(i=0; i<=1; i++) { /* * Select child to process: * * ChildOffs current child offset in Nodes[] * * UpdateMin whether minimum or maximum value * of bounding box is changed on update */ if( i==0 ) { childoffs = childbestoffs; updatemin = !bestisleft; } else { updatemin = bestisleft; childoffs = childworstoffs; } /* * Update bounding box and current distance */ if( updatemin ) { prevdist = buf->curdist; t1 = buf->x.ptr.p_double[d]; v = buf->curboxmin.ptr.p_double[d]; if( ae_fp_less_eq(t1,s) ) { if( kdt->normtype==0 ) { buf->curdist = ae_maxreal(buf->curdist, s-t1, _state); } if( kdt->normtype==1 ) { buf->curdist = buf->curdist-ae_maxreal(v-t1, (double)(0), _state)+s-t1; } if( kdt->normtype==2 ) { buf->curdist = buf->curdist-ae_sqr(ae_maxreal(v-t1, (double)(0), _state), _state)+ae_sqr(s-t1, _state); } } buf->curboxmin.ptr.p_double[d] = s; } else { prevdist = buf->curdist; t1 = buf->x.ptr.p_double[d]; v = buf->curboxmax.ptr.p_double[d]; if( ae_fp_greater_eq(t1,s) ) { if( kdt->normtype==0 ) { buf->curdist = ae_maxreal(buf->curdist, t1-s, _state); } if( kdt->normtype==1 ) { buf->curdist = buf->curdist-ae_maxreal(t1-v, (double)(0), _state)+t1-s; } if( kdt->normtype==2 ) { buf->curdist = buf->curdist-ae_sqr(ae_maxreal(t1-v, (double)(0), _state), _state)+ae_sqr(t1-s, _state); } } buf->curboxmax.ptr.p_double[d] = s; } /* * Decide: to dive into cell or not to dive */ if( ae_fp_neq(buf->rneeded,(double)(0))&&ae_fp_greater(buf->curdist,buf->rneeded) ) { todive = ae_false; } else { if( buf->kcur<buf->kneeded||buf->kneeded==0 ) { /* * KCur<KNeeded (i.e. not all points are found) */ todive = ae_true; } else { /* * KCur=KNeeded, decide to dive or not to dive * using point position relative to bounding box. */ todive = ae_fp_less_eq(buf->curdist,buf->r.ptr.p_double[0]*buf->approxf); } } if( todive ) { nearestneighbor_kdtreequerynnrec(kdt, buf, childoffs, _state); } /* * Restore bounding box and distance */ if( updatemin ) { buf->curboxmin.ptr.p_double[d] = v; } else { buf->curboxmax.ptr.p_double[d] = v; } buf->curdist = prevdist; } return; } } /************************************************************************* Recursive subroutine for box queries.
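The pruning rule used here, stated for reference: at a split node with dimension D and split value S, the 'X[D]<=S' child is visited only when S>=BoxMin[D], and the 'X[D]>=S' child only when S<=BoxMax[D]; the current bounding box is narrowed to the corresponding side for the duration of the recursive call and restored afterwards. At the user level a box request might look like the following sketch (kdtreequerybox() is assumed to be the public entry point that eventually calls this subroutine; box query results are returned in arbitrary order):

    k = kdtreequerybox(&kdt, &boxmin, &boxmax, _state);   // number of points inside the box
    kdtreequeryresultstags(&kdt, &tags, _state);          // their tags (unordered)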
-- ALGLIB -- Copyright 28.02.2010 by Bochkanov Sergey *************************************************************************/ static void nearestneighbor_kdtreequeryboxrec(kdtree* kdt, kdtreerequestbuffer* buf, ae_int_t offs, ae_state *_state) { ae_bool inbox; ae_int_t nx; ae_int_t i1; ae_int_t i2; ae_int_t i; ae_int_t j; ae_int_t d; double s; double v; ae_assert(kdt->n>0, "KDTreeQueryBoxRec: internal error", _state); nx = kdt->nx; /* * Check that intersection of query box with bounding box is non-empty. * This check is performed once for Offs=0 (tree root). */ if( offs==0 ) { for(j=0; j<=nx-1; j++) { if( ae_fp_greater(buf->boxmin.ptr.p_double[j],buf->curboxmax.ptr.p_double[j]) ) { return; } if( ae_fp_less(buf->boxmax.ptr.p_double[j],buf->curboxmin.ptr.p_double[j]) ) { return; } } } /* * Leaf node. * Process points. */ if( kdt->nodes.ptr.p_int[offs]>0 ) { i1 = kdt->nodes.ptr.p_int[offs+1]; i2 = i1+kdt->nodes.ptr.p_int[offs]; for(i=i1; i<=i2-1; i++) { /* * Check whether point is in box or not */ inbox = ae_true; for(j=0; j<=nx-1; j++) { inbox = inbox&&ae_fp_greater_eq(kdt->xy.ptr.pp_double[i][j],buf->boxmin.ptr.p_double[j]); inbox = inbox&&ae_fp_less_eq(kdt->xy.ptr.pp_double[i][j],buf->boxmax.ptr.p_double[j]); } if( !inbox ) { continue; } /* * Add point to unordered list */ buf->r.ptr.p_double[buf->kcur] = 0.0; buf->idx.ptr.p_int[buf->kcur] = i; buf->kcur = buf->kcur+1; } return; } /* * Simple split */ if( kdt->nodes.ptr.p_int[offs]==0 ) { /* * Load: * * D dimension to split * * S split position */ d = kdt->nodes.ptr.p_int[offs+1]; s = kdt->splits.ptr.p_double[kdt->nodes.ptr.p_int[offs+2]]; /* * Check lower split (S is upper bound of new bounding box) */ if( ae_fp_greater_eq(s,buf->boxmin.ptr.p_double[d]) ) { v = buf->curboxmax.ptr.p_double[d]; buf->curboxmax.ptr.p_double[d] = s; nearestneighbor_kdtreequeryboxrec(kdt, buf, kdt->nodes.ptr.p_int[offs+3], _state); buf->curboxmax.ptr.p_double[d] = v; } /* * Check upper split (S is lower bound of new bounding box) */ if( ae_fp_less_eq(s,buf->boxmax.ptr.p_double[d]) ) { v = buf->curboxmin.ptr.p_double[d]; buf->curboxmin.ptr.p_double[d] = s; nearestneighbor_kdtreequeryboxrec(kdt, buf, kdt->nodes.ptr.p_int[offs+4], _state); buf->curboxmin.ptr.p_double[d] = v; } return; } } /************************************************************************* Copies X[] to Buf.X[] Loads distance from X[] to bounding box. Initializes Buf.CurBox[]. 
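For reference, the distance loaded into Buf.CurDist is the distance from X to the axis-aligned box [BoxMin,BoxMax], computed per component as d[i] = max(BoxMin[i]-X[i], 0, X[i]-BoxMax[i]) and combined according to NormType:

    NormType=0 (inf-norm):   CurDist = max(d[i])
    NormType=1 (1-norm):     CurDist = sum(d[i])
    NormType=2 (2-norm):     CurDist = sum(d[i]^2)   (the squared distance is kept internally;
                                                      square roots are taken only when results
                                                      are unloaded)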
-- ALGLIB -- Copyright 28.02.2010 by Bochkanov Sergey *************************************************************************/ static void nearestneighbor_kdtreeinitbox(kdtree* kdt, /* Real */ ae_vector* x, kdtreerequestbuffer* buf, ae_state *_state) { ae_int_t i; double vx; double vmin; double vmax; ae_assert(kdt->n>0, "KDTreeInitBox: internal error", _state); /* * calculate distance from point to current bounding box */ buf->curdist = (double)(0); if( kdt->normtype==0 ) { for(i=0; i<=kdt->nx-1; i++) { vx = x->ptr.p_double[i]; vmin = kdt->boxmin.ptr.p_double[i]; vmax = kdt->boxmax.ptr.p_double[i]; buf->x.ptr.p_double[i] = vx; buf->curboxmin.ptr.p_double[i] = vmin; buf->curboxmax.ptr.p_double[i] = vmax; if( ae_fp_less(vx,vmin) ) { buf->curdist = ae_maxreal(buf->curdist, vmin-vx, _state); } else { if( ae_fp_greater(vx,vmax) ) { buf->curdist = ae_maxreal(buf->curdist, vx-vmax, _state); } } } } if( kdt->normtype==1 ) { for(i=0; i<=kdt->nx-1; i++) { vx = x->ptr.p_double[i]; vmin = kdt->boxmin.ptr.p_double[i]; vmax = kdt->boxmax.ptr.p_double[i]; buf->x.ptr.p_double[i] = vx; buf->curboxmin.ptr.p_double[i] = vmin; buf->curboxmax.ptr.p_double[i] = vmax; if( ae_fp_less(vx,vmin) ) { buf->curdist = buf->curdist+vmin-vx; } else { if( ae_fp_greater(vx,vmax) ) { buf->curdist = buf->curdist+vx-vmax; } } } } if( kdt->normtype==2 ) { for(i=0; i<=kdt->nx-1; i++) { vx = x->ptr.p_double[i]; vmin = kdt->boxmin.ptr.p_double[i]; vmax = kdt->boxmax.ptr.p_double[i]; buf->x.ptr.p_double[i] = vx; buf->curboxmin.ptr.p_double[i] = vmin; buf->curboxmax.ptr.p_double[i] = vmax; if( ae_fp_less(vx,vmin) ) { buf->curdist = buf->curdist+ae_sqr(vmin-vx, _state); } else { if( ae_fp_greater(vx,vmax) ) { buf->curdist = buf->curdist+ae_sqr(vx-vmax, _state); } } } } } /************************************************************************* This function allocates all dataset-independend array fields of KDTree, i.e. such array fields that their dimensions do not depend on dataset size. This function do not sets KDT.NX or KDT.NY - it just allocates arrays -- ALGLIB -- Copyright 14.03.2011 by Bochkanov Sergey *************************************************************************/ static void nearestneighbor_kdtreeallocdatasetindependent(kdtree* kdt, ae_int_t nx, ae_int_t ny, ae_state *_state) { ae_assert(kdt->n>0, "KDTreeAllocDatasetIndependent: internal error", _state); ae_vector_set_length(&kdt->boxmin, nx, _state); ae_vector_set_length(&kdt->boxmax, nx, _state); } /************************************************************************* This function allocates all dataset-dependent array fields of KDTree, i.e. such array fields that their dimensions depend on dataset size. This function do not sets KDT.N, KDT.NX or KDT.NY - it just allocates arrays. -- ALGLIB -- Copyright 14.03.2011 by Bochkanov Sergey *************************************************************************/ static void nearestneighbor_kdtreeallocdatasetdependent(kdtree* kdt, ae_int_t n, ae_int_t nx, ae_int_t ny, ae_state *_state) { ae_assert(n>0, "KDTreeAllocDatasetDependent: internal error", _state); ae_matrix_set_length(&kdt->xy, n, 2*nx+ny, _state); ae_vector_set_length(&kdt->tags, n, _state); ae_vector_set_length(&kdt->nodes, nearestneighbor_splitnodesize*2*n, _state); ae_vector_set_length(&kdt->splits, 2*n, _state); } /************************************************************************* This function checks consistency of request buffer structure with dimensions of kd-tree object. 
-- ALGLIB -- Copyright 02.04.2016 by Bochkanov Sergey *************************************************************************/ static void nearestneighbor_checkrequestbufferconsistency(kdtree* kdt, kdtreerequestbuffer* buf, ae_state *_state) { ae_assert(buf->x.cnt>=kdt->nx, "KDTree: dimensions of kdtreerequestbuffer are inconsistent with kdtree structure", _state); ae_assert(buf->idx.cnt>=kdt->n, "KDTree: dimensions of kdtreerequestbuffer are inconsistent with kdtree structure", _state); ae_assert(buf->r.cnt>=kdt->n, "KDTree: dimensions of kdtreerequestbuffer are inconsistent with kdtree structure", _state); ae_assert(buf->buf.cnt>=ae_maxint(kdt->n, kdt->nx, _state), "KDTree: dimensions of kdtreerequestbuffer are inconsistent with kdtree structure", _state); ae_assert(buf->curboxmin.cnt>=kdt->nx, "KDTree: dimensions of kdtreerequestbuffer are inconsistent with kdtree structure", _state); ae_assert(buf->curboxmax.cnt>=kdt->nx, "KDTree: dimensions of kdtreerequestbuffer are inconsistent with kdtree structure", _state); } void _kdtreerequestbuffer_init(void* _p, ae_state *_state) { kdtreerequestbuffer *p = (kdtreerequestbuffer*)_p; ae_touch_ptr((void*)p); ae_vector_init(&p->x, 0, DT_REAL, _state); ae_vector_init(&p->boxmin, 0, DT_REAL, _state); ae_vector_init(&p->boxmax, 0, DT_REAL, _state); ae_vector_init(&p->idx, 0, DT_INT, _state); ae_vector_init(&p->r, 0, DT_REAL, _state); ae_vector_init(&p->buf, 0, DT_REAL, _state); ae_vector_init(&p->curboxmin, 0, DT_REAL, _state); ae_vector_init(&p->curboxmax, 0, DT_REAL, _state); } void _kdtreerequestbuffer_init_copy(void* _dst, void* _src, ae_state *_state) { kdtreerequestbuffer *dst = (kdtreerequestbuffer*)_dst; kdtreerequestbuffer *src = (kdtreerequestbuffer*)_src; ae_vector_init_copy(&dst->x, &src->x, _state); ae_vector_init_copy(&dst->boxmin, &src->boxmin, _state); ae_vector_init_copy(&dst->boxmax, &src->boxmax, _state); dst->kneeded = src->kneeded; dst->rneeded = src->rneeded; dst->selfmatch = src->selfmatch; dst->approxf = src->approxf; dst->kcur = src->kcur; ae_vector_init_copy(&dst->idx, &src->idx, _state); ae_vector_init_copy(&dst->r, &src->r, _state); ae_vector_init_copy(&dst->buf, &src->buf, _state); ae_vector_init_copy(&dst->curboxmin, &src->curboxmin, _state); ae_vector_init_copy(&dst->curboxmax, &src->curboxmax, _state); dst->curdist = src->curdist; } void _kdtreerequestbuffer_clear(void* _p) { kdtreerequestbuffer *p = (kdtreerequestbuffer*)_p; ae_touch_ptr((void*)p); ae_vector_clear(&p->x); ae_vector_clear(&p->boxmin); ae_vector_clear(&p->boxmax); ae_vector_clear(&p->idx); ae_vector_clear(&p->r); ae_vector_clear(&p->buf); ae_vector_clear(&p->curboxmin); ae_vector_clear(&p->curboxmax); } void _kdtreerequestbuffer_destroy(void* _p) { kdtreerequestbuffer *p = (kdtreerequestbuffer*)_p; ae_touch_ptr((void*)p); ae_vector_destroy(&p->x); ae_vector_destroy(&p->boxmin); ae_vector_destroy(&p->boxmax); ae_vector_destroy(&p->idx); ae_vector_destroy(&p->r); ae_vector_destroy(&p->buf); ae_vector_destroy(&p->curboxmin); ae_vector_destroy(&p->curboxmax); } void _kdtree_init(void* _p, ae_state *_state) { kdtree *p = (kdtree*)_p; ae_touch_ptr((void*)p); ae_matrix_init(&p->xy, 0, 0, DT_REAL, _state); ae_vector_init(&p->tags, 0, DT_INT, _state); ae_vector_init(&p->boxmin, 0, DT_REAL, _state); ae_vector_init(&p->boxmax, 0, DT_REAL, _state); ae_vector_init(&p->nodes, 0, DT_INT, _state); ae_vector_init(&p->splits, 0, DT_REAL, _state); _kdtreerequestbuffer_init(&p->innerbuf, _state); } void _kdtree_init_copy(void* _dst, void* _src, ae_state *_state) { 
kdtree *dst = (kdtree*)_dst; kdtree *src = (kdtree*)_src; dst->n = src->n; dst->nx = src->nx; dst->ny = src->ny; dst->normtype = src->normtype; ae_matrix_init_copy(&dst->xy, &src->xy, _state); ae_vector_init_copy(&dst->tags, &src->tags, _state); ae_vector_init_copy(&dst->boxmin, &src->boxmin, _state); ae_vector_init_copy(&dst->boxmax, &src->boxmax, _state); ae_vector_init_copy(&dst->nodes, &src->nodes, _state); ae_vector_init_copy(&dst->splits, &src->splits, _state); _kdtreerequestbuffer_init_copy(&dst->innerbuf, &src->innerbuf, _state); dst->debugcounter = src->debugcounter; } void _kdtree_clear(void* _p) { kdtree *p = (kdtree*)_p; ae_touch_ptr((void*)p); ae_matrix_clear(&p->xy); ae_vector_clear(&p->tags); ae_vector_clear(&p->boxmin); ae_vector_clear(&p->boxmax); ae_vector_clear(&p->nodes); ae_vector_clear(&p->splits); _kdtreerequestbuffer_clear(&p->innerbuf); } void _kdtree_destroy(void* _p) { kdtree *p = (kdtree*)_p; ae_touch_ptr((void*)p); ae_matrix_destroy(&p->xy); ae_vector_destroy(&p->tags); ae_vector_destroy(&p->boxmin); ae_vector_destroy(&p->boxmax); ae_vector_destroy(&p->nodes); ae_vector_destroy(&p->splits); _kdtreerequestbuffer_destroy(&p->innerbuf); } /************************************************************************* HQRNDState initialization with random values which come from standard RNG. -- ALGLIB -- Copyright 02.12.2009 by Bochkanov Sergey *************************************************************************/ void hqrndrandomize(hqrndstate* state, ae_state *_state) { ae_int_t s0; ae_int_t s1; _hqrndstate_clear(state); s0 = ae_randominteger(hqrnd_hqrndm1, _state); s1 = ae_randominteger(hqrnd_hqrndm2, _state); hqrndseed(s0, s1, state, _state); } /************************************************************************* HQRNDState initialization with seed values -- ALGLIB -- Copyright 02.12.2009 by Bochkanov Sergey *************************************************************************/ void hqrndseed(ae_int_t s1, ae_int_t s2, hqrndstate* state, ae_state *_state) { _hqrndstate_clear(state); /* * Protection against negative seeds: * * SEED := -(SEED+1) * * We can use just "-SEED" because there exists such integer number N * that N<0, -N=N<0 too. (This number is equal to 0x800...000). Need * to handle such seed correctly forces us to use a bit complicated * formula. */ if( s1<0 ) { s1 = -(s1+1); } if( s2<0 ) { s2 = -(s2+1); } state->s1 = s1%(hqrnd_hqrndm1-1)+1; state->s2 = s2%(hqrnd_hqrndm2-1)+1; state->magicv = hqrnd_hqrndmagic; } /************************************************************************* This function generates random real number in (0,1), not including interval boundaries State structure must be initialized with HQRNDRandomize() or HQRNDSeed(). -- ALGLIB -- Copyright 02.12.2009 by Bochkanov Sergey *************************************************************************/ double hqrnduniformr(hqrndstate* state, ae_state *_state) { double result; result = (double)(hqrnd_hqrndintegerbase(state, _state)+1)/(double)(hqrnd_hqrndmax+2); return result; } /************************************************************************* This function generates random integer number in [0, N) 1. State structure must be initialized with HQRNDRandomize() or HQRNDSeed() 2. N can be any positive number except for very large numbers: * close to 2^31 on 32-bit systems * close to 2^62 on 64-bit systems An exception will be generated if N is too large. 
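A minimal usage sketch at the C-core level (variable names are illustrative; hqrndstate contains no dynamically allocated fields, so hqrndseed() fully initializes it):

    hqrndstate rs;
    ae_int_t   i;
    double     v;
    hqrndseed(11111, 22222, &rs, _state);    // reproducible stream; use hqrndrandomize() for a random one
    i = hqrnduniformi(&rs, 100, _state);     // uniform integer in [0,100)
    v = hqrnduniformr(&rs, _state);          // uniform real in (0,1)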
-- ALGLIB -- Copyright 02.12.2009 by Bochkanov Sergey *************************************************************************/ ae_int_t hqrnduniformi(hqrndstate* state, ae_int_t n, ae_state *_state) { ae_int_t maxcnt; ae_int_t mx; ae_int_t a; ae_int_t b; ae_int_t result; ae_assert(n>0, "HQRNDUniformI: N<=0!", _state); maxcnt = hqrnd_hqrndmax+1; /* * Two branches: one for N<=MaxCnt, another for N>MaxCnt. */ if( n>maxcnt ) { /* * N>=MaxCnt. * * We have two options here: * a) N is exactly divisible by MaxCnt * b) N is not divisible by MaxCnt * * In both cases we reduce problem on interval spanning [0,N) * to several subproblems on intervals spanning [0,MaxCnt). */ if( n%maxcnt==0 ) { /* * N is exactly divisible by MaxCnt. * * [0,N) range is dividided into N/MaxCnt bins, * each of them having length equal to MaxCnt. * * We generate: * * random bin number B * * random offset within bin A * Both random numbers are generated by recursively * calling HQRNDUniformI(). * * Result is equal to A+MaxCnt*B. */ ae_assert(n/maxcnt<=maxcnt, "HQRNDUniformI: N is too large", _state); a = hqrnduniformi(state, maxcnt, _state); b = hqrnduniformi(state, n/maxcnt, _state); result = a+maxcnt*b; } else { /* * N is NOT exactly divisible by MaxCnt. * * [0,N) range is dividided into Ceil(N/MaxCnt) bins, * each of them having length equal to MaxCnt. * * We generate: * * random bin number B in [0, Ceil(N/MaxCnt)-1] * * random offset within bin A * * if both of what is below is true * 1) bin number B is that of the last bin * 2) A >= N mod MaxCnt * then we repeat generation of A/B. * This stage is essential in order to avoid bias in the result. * * otherwise, we return A*MaxCnt+N */ ae_assert(n/maxcnt+1<=maxcnt, "HQRNDUniformI: N is too large", _state); result = -1; do { a = hqrnduniformi(state, maxcnt, _state); b = hqrnduniformi(state, n/maxcnt+1, _state); if( b==n/maxcnt&&a>=n%maxcnt ) { continue; } result = a+maxcnt*b; } while(result<0); } } else { /* * N<=MaxCnt * * Code below is a bit complicated because we can not simply * return "HQRNDIntegerBase() mod N" - it will be skewed for * large N's in [0.1*HQRNDMax...HQRNDMax]. */ mx = maxcnt-maxcnt%n; do { result = hqrnd_hqrndintegerbase(state, _state); } while(result>=mx); result = result%n; } return result; } /************************************************************************* Random number generator: normal numbers This function generates one random number from normal distribution. Its performance is equal to that of HQRNDNormal2() State structure must be initialized with HQRNDRandomize() or HQRNDSeed(). -- ALGLIB -- Copyright 02.12.2009 by Bochkanov Sergey *************************************************************************/ double hqrndnormal(hqrndstate* state, ae_state *_state) { double v1; double v2; double result; hqrndnormal2(state, &v1, &v2, _state); result = v1; return result; } /************************************************************************* Random number generator: random X and Y such that X^2+Y^2=1 State structure must be initialized with HQRNDRandomize() or HQRNDSeed(). 
-- ALGLIB -- Copyright 02.12.2009 by Bochkanov Sergey *************************************************************************/ void hqrndunit2(hqrndstate* state, double* x, double* y, ae_state *_state) { double v; double mx; double mn; *x = 0; *y = 0; do { hqrndnormal2(state, x, y, _state); } while(!(ae_fp_neq(*x,(double)(0))||ae_fp_neq(*y,(double)(0)))); mx = ae_maxreal(ae_fabs(*x, _state), ae_fabs(*y, _state), _state); mn = ae_minreal(ae_fabs(*x, _state), ae_fabs(*y, _state), _state); v = mx*ae_sqrt(1+ae_sqr(mn/mx, _state), _state); *x = *x/v; *y = *y/v; } /************************************************************************* Random number generator: normal numbers This function generates two independent random numbers from normal distribution. Its performance is equal to that of HQRNDNormal() State structure must be initialized with HQRNDRandomize() or HQRNDSeed(). -- ALGLIB -- Copyright 02.12.2009 by Bochkanov Sergey *************************************************************************/ void hqrndnormal2(hqrndstate* state, double* x1, double* x2, ae_state *_state) { double u; double v; double s; *x1 = 0; *x2 = 0; for(;;) { u = 2*hqrnduniformr(state, _state)-1; v = 2*hqrnduniformr(state, _state)-1; s = ae_sqr(u, _state)+ae_sqr(v, _state); if( ae_fp_greater(s,(double)(0))&&ae_fp_less(s,(double)(1)) ) { /* * two Sqrt's instead of one to * avoid overflow when S is too small */ s = ae_sqrt(-2*ae_log(s, _state), _state)/ae_sqrt(s, _state); *x1 = u*s; *x2 = v*s; return; } } } /************************************************************************* Random number generator: exponential distribution State structure must be initialized with HQRNDRandomize() or HQRNDSeed(). -- ALGLIB -- Copyright 11.08.2007 by Bochkanov Sergey *************************************************************************/ double hqrndexponential(hqrndstate* state, double lambdav, ae_state *_state) { double result; ae_assert(ae_fp_greater(lambdav,(double)(0)), "HQRNDExponential: LambdaV<=0!", _state); result = -ae_log(hqrnduniformr(state, _state), _state)/lambdav; return result; } /************************************************************************* This function generates random number from discrete distribution given by finite sample X. INPUT PARAMETERS State - high quality random number generator, must be initialized with HQRNDRandomize() or HQRNDSeed(). X - finite sample N - number of elements to use, N>=1 RESULT this function returns one of the X[i] for random i=0..N-1 -- ALGLIB -- Copyright 08.11.2011 by Bochkanov Sergey *************************************************************************/ double hqrnddiscrete(hqrndstate* state, /* Real */ ae_vector* x, ae_int_t n, ae_state *_state) { double result; ae_assert(n>0, "HQRNDDiscrete: N<=0", _state); ae_assert(n<=x->cnt, "HQRNDDiscrete: Length(X)<N", _state); result = x->ptr.p_double[hqrnduniformi(state, n, _state)]; return result; } /************************************************************************* This function generates random number from continuous distribution given by finite sample X. INPUT PARAMETERS State - high quality random number generator, must be initialized with HQRNDRandomize() or HQRNDSeed(). X - finite sample, array[N] (can be larger, in this case only leading N elements are used). THIS ARRAY MUST BE SORTED IN ASCENDING ORDER. N - number of elements to use, N>=1 RESULT this function returns random number from continuous distribution which tries to approximate X as much as possible. min(X)<=Result<=max(X).
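For reference, the method used below: a random cell index i is drawn uniformly from [0,N-1), and the result is then uniform on [X[i],X[i+1]]. For example, with X=[0,1,3] the result is uniform on [0,1] with probability 1/2 and uniform on [1,3] with probability 1/2, i.e. the approximating density is piecewise constant with equal mass per cell rather than per unit length.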
-- ALGLIB -- Copyright 08.11.2011 by Bochkanov Sergey *************************************************************************/ double hqrndcontinuous(hqrndstate* state, /* Real */ ae_vector* x, ae_int_t n, ae_state *_state) { double mx; double mn; ae_int_t i; double result; ae_assert(n>0, "HQRNDContinuous: N<=0", _state); ae_assert(n<=x->cnt, "HQRNDContinuous: Length(X)<N", _state); if( n==1 ) { result = x->ptr.p_double[0]; return result; } i = hqrnduniformi(state, n-1, _state); mn = x->ptr.p_double[i]; mx = x->ptr.p_double[i+1]; ae_assert(ae_fp_greater_eq(mx,mn), "HQRNDContinuous: X is not sorted by ascending", _state); if( ae_fp_neq(mx,mn) ) { result = (mx-mn)*hqrnduniformr(state, _state)+mn; } else { result = mn; } return result; } /************************************************************************* This function returns random integer in [0,HQRNDMax] L'Ecuyer, Efficient and portable combined random number generators *************************************************************************/ static ae_int_t hqrnd_hqrndintegerbase(hqrndstate* state, ae_state *_state) { ae_int_t k; ae_int_t result; ae_assert(state->magicv==hqrnd_hqrndmagic, "HQRNDIntegerBase: State is not correctly initialized!", _state); k = state->s1/53668; state->s1 = 40014*(state->s1-k*53668)-k*12211; if( state->s1<0 ) { state->s1 = state->s1+2147483563; } k = state->s2/52774; state->s2 = 40692*(state->s2-k*52774)-k*3791; if( state->s2<0 ) { state->s2 = state->s2+2147483399; } /* * Result */ result = state->s1-state->s2; if( result<1 ) { result = result+2147483562; } result = result-1; return result; } void _hqrndstate_init(void* _p, ae_state *_state) { hqrndstate *p = (hqrndstate*)_p; ae_touch_ptr((void*)p); } void _hqrndstate_init_copy(void* _dst, void* _src, ae_state *_state) { hqrndstate *dst = (hqrndstate*)_dst; hqrndstate *src = (hqrndstate*)_src; dst->s1 = src->s1; dst->s2 = src->s2; dst->magicv = src->magicv; } void _hqrndstate_clear(void* _p) { hqrndstate *p = (hqrndstate*)_p; ae_touch_ptr((void*)p); } void _hqrndstate_destroy(void* _p) { hqrndstate *p = (hqrndstate*)_p; ae_touch_ptr((void*)p); } /************************************************************************* This is debug function intended for testing ALGLIB interface generator. Never use it in any real life project. Creates and returns XDebugRecord1 structure: * integer and complex fields of Rec1 are set to 1 and 1+i correspondingly * array field of Rec1 is set to [2,3] -- ALGLIB -- Copyright 27.05.2014 by Bochkanov Sergey *************************************************************************/ void xdebuginitrecord1(xdebugrecord1* rec1, ae_state *_state) { _xdebugrecord1_clear(rec1); rec1->i = 1; rec1->c.x = (double)(1); rec1->c.y = (double)(1); ae_vector_set_length(&rec1->a, 2, _state); rec1->a.ptr.p_double[0] = (double)(2); rec1->a.ptr.p_double[1] = (double)(3); } /************************************************************************* This is debug function intended for testing ALGLIB interface generator. Never use it in any real life project. Counts number of True values in the boolean 1D array. -- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/ ae_int_t xdebugb1count(/* Boolean */ ae_vector* a, ae_state *_state) { ae_int_t i; ae_int_t result; result = 0; for(i=0; i<=a->cnt-1; i++) { if( a->ptr.p_bool[i] ) { result = result+1; } } return result; } /************************************************************************* This is debug function intended for testing ALGLIB interface generator.
Never use it in any real life project. Replace all values in array by NOT(a[i]). Array is passed using "shared" convention. -- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/ void xdebugb1not(/* Boolean */ ae_vector* a, ae_state *_state) { ae_int_t i; for(i=0; i<=a->cnt-1; i++) { a->ptr.p_bool[i] = !a->ptr.p_bool[i]; } } /************************************************************************* This is debug function intended for testing ALGLIB interface generator. Never use it in any real life project. Appends copy of array to itself. Array is passed using "var" convention. -- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/ void xdebugb1appendcopy(/* Boolean */ ae_vector* a, ae_state *_state) { ae_frame _frame_block; ae_int_t i; ae_vector b; ae_frame_make(_state, &_frame_block); ae_vector_init(&b, 0, DT_BOOL, _state); ae_vector_set_length(&b, a->cnt, _state); for(i=0; i<=b.cnt-1; i++) { b.ptr.p_bool[i] = a->ptr.p_bool[i]; } ae_vector_set_length(a, 2*b.cnt, _state); for(i=0; i<=a->cnt-1; i++) { a->ptr.p_bool[i] = b.ptr.p_bool[i%b.cnt]; } ae_frame_leave(_state); } /************************************************************************* This is debug function intended for testing ALGLIB interface generator. Never use it in any real life project. Generate N-element array with even-numbered elements set to True. Array is passed using "out" convention. -- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/ void xdebugb1outeven(ae_int_t n, /* Boolean */ ae_vector* a, ae_state *_state) { ae_int_t i; ae_vector_clear(a); ae_vector_set_length(a, n, _state); for(i=0; i<=a->cnt-1; i++) { a->ptr.p_bool[i] = i%2==0; } } /************************************************************************* This is debug function intended for testing ALGLIB interface generator. Never use it in any real life project. Returns sum of elements in the array. -- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/ ae_int_t xdebugi1sum(/* Integer */ ae_vector* a, ae_state *_state) { ae_int_t i; ae_int_t result; result = 0; for(i=0; i<=a->cnt-1; i++) { result = result+a->ptr.p_int[i]; } return result; } /************************************************************************* This is debug function intended for testing ALGLIB interface generator. Never use it in any real life project. Replace all values in array by -A[I] Array is passed using "shared" convention. -- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/ void xdebugi1neg(/* Integer */ ae_vector* a, ae_state *_state) { ae_int_t i; for(i=0; i<=a->cnt-1; i++) { a->ptr.p_int[i] = -a->ptr.p_int[i]; } } /************************************************************************* This is debug function intended for testing ALGLIB interface generator. Never use it in any real life project. Appends copy of array to itself. Array is passed using "var" convention. 
-- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/ void xdebugi1appendcopy(/* Integer */ ae_vector* a, ae_state *_state) { ae_frame _frame_block; ae_int_t i; ae_vector b; ae_frame_make(_state, &_frame_block); ae_vector_init(&b, 0, DT_INT, _state); ae_vector_set_length(&b, a->cnt, _state); for(i=0; i<=b.cnt-1; i++) { b.ptr.p_int[i] = a->ptr.p_int[i]; } ae_vector_set_length(a, 2*b.cnt, _state); for(i=0; i<=a->cnt-1; i++) { a->ptr.p_int[i] = b.ptr.p_int[i%b.cnt]; } ae_frame_leave(_state); } /************************************************************************* This is debug function intended for testing ALGLIB interface generator. Never use it in any real life project. Generate N-element array with even-numbered A[I] set to I, and odd-numbered ones set to 0. Array is passed using "out" convention. -- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/ void xdebugi1outeven(ae_int_t n, /* Integer */ ae_vector* a, ae_state *_state) { ae_int_t i; ae_vector_clear(a); ae_vector_set_length(a, n, _state); for(i=0; i<=a->cnt-1; i++) { if( i%2==0 ) { a->ptr.p_int[i] = i; } else { a->ptr.p_int[i] = 0; } } } /************************************************************************* This is debug function intended for testing ALGLIB interface generator. Never use it in any real life project. Returns sum of elements in the array. -- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/ double xdebugr1sum(/* Real */ ae_vector* a, ae_state *_state) { ae_int_t i; double result; result = (double)(0); for(i=0; i<=a->cnt-1; i++) { result = result+a->ptr.p_double[i]; } return result; } /************************************************************************* This is debug function intended for testing ALGLIB interface generator. Never use it in any real life project. Replace all values in array by -A[I] Array is passed using "shared" convention. -- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/ void xdebugr1neg(/* Real */ ae_vector* a, ae_state *_state) { ae_int_t i; for(i=0; i<=a->cnt-1; i++) { a->ptr.p_double[i] = -a->ptr.p_double[i]; } } /************************************************************************* This is debug function intended for testing ALGLIB interface generator. Never use it in any real life project. Appends copy of array to itself. Array is passed using "var" convention. -- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/ void xdebugr1appendcopy(/* Real */ ae_vector* a, ae_state *_state) { ae_frame _frame_block; ae_int_t i; ae_vector b; ae_frame_make(_state, &_frame_block); ae_vector_init(&b, 0, DT_REAL, _state); ae_vector_set_length(&b, a->cnt, _state); for(i=0; i<=b.cnt-1; i++) { b.ptr.p_double[i] = a->ptr.p_double[i]; } ae_vector_set_length(a, 2*b.cnt, _state); for(i=0; i<=a->cnt-1; i++) { a->ptr.p_double[i] = b.ptr.p_double[i%b.cnt]; } ae_frame_leave(_state); } /************************************************************************* This is debug function intended for testing ALGLIB interface generator. Never use it in any real life project. Generate N-element array with even-numbered A[I] set to I*0.25, and odd-numbered ones are set to 0. Array is passed using "out" convention. 
-- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/ void xdebugr1outeven(ae_int_t n, /* Real */ ae_vector* a, ae_state *_state) { ae_int_t i; ae_vector_clear(a); ae_vector_set_length(a, n, _state); for(i=0; i<=a->cnt-1; i++) { if( i%2==0 ) { a->ptr.p_double[i] = i*0.25; } else { a->ptr.p_double[i] = (double)(0); } } } /************************************************************************* This is debug function intended for testing ALGLIB interface generator. Never use it in any real life project. Returns sum of elements in the array. -- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/ ae_complex xdebugc1sum(/* Complex */ ae_vector* a, ae_state *_state) { ae_int_t i; ae_complex result; result = ae_complex_from_i(0); for(i=0; i<=a->cnt-1; i++) { result = ae_c_add(result,a->ptr.p_complex[i]); } return result; } /************************************************************************* This is debug function intended for testing ALGLIB interface generator. Never use it in any real life project. Replace all values in array by -A[I] Array is passed using "shared" convention. -- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/ void xdebugc1neg(/* Complex */ ae_vector* a, ae_state *_state) { ae_int_t i; for(i=0; i<=a->cnt-1; i++) { a->ptr.p_complex[i] = ae_c_neg(a->ptr.p_complex[i]); } } /************************************************************************* This is debug function intended for testing ALGLIB interface generator. Never use it in any real life project. Appends copy of array to itself. Array is passed using "var" convention. -- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/ void xdebugc1appendcopy(/* Complex */ ae_vector* a, ae_state *_state) { ae_frame _frame_block; ae_int_t i; ae_vector b; ae_frame_make(_state, &_frame_block); ae_vector_init(&b, 0, DT_COMPLEX, _state); ae_vector_set_length(&b, a->cnt, _state); for(i=0; i<=b.cnt-1; i++) { b.ptr.p_complex[i] = a->ptr.p_complex[i]; } ae_vector_set_length(a, 2*b.cnt, _state); for(i=0; i<=a->cnt-1; i++) { a->ptr.p_complex[i] = b.ptr.p_complex[i%b.cnt]; } ae_frame_leave(_state); } /************************************************************************* This is debug function intended for testing ALGLIB interface generator. Never use it in any real life project. Generate N-element array with even-numbered A[K] set to (x,y) = (K*0.25, K*0.125) and odd-numbered ones are set to 0. Array is passed using "out" convention. -- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/ void xdebugc1outeven(ae_int_t n, /* Complex */ ae_vector* a, ae_state *_state) { ae_int_t i; ae_vector_clear(a); ae_vector_set_length(a, n, _state); for(i=0; i<=a->cnt-1; i++) { if( i%2==0 ) { a->ptr.p_complex[i].x = i*0.250; a->ptr.p_complex[i].y = i*0.125; } else { a->ptr.p_complex[i] = ae_complex_from_i(0); } } } /************************************************************************* This is debug function intended for testing ALGLIB interface generator. Never use it in any real life project. Counts number of True values in the boolean 2D array. 
-- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/ ae_int_t xdebugb2count(/* Boolean */ ae_matrix* a, ae_state *_state) { ae_int_t i; ae_int_t j; ae_int_t result; result = 0; for(i=0; i<=a->rows-1; i++) { for(j=0; j<=a->cols-1; j++) { if( a->ptr.pp_bool[i][j] ) { result = result+1; } } } return result; } /************************************************************************* This is debug function intended for testing ALGLIB interface generator. Never use it in any real life project. Replace all values in array by NOT(a[i]). Array is passed using "shared" convention. -- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/ void xdebugb2not(/* Boolean */ ae_matrix* a, ae_state *_state) { ae_int_t i; ae_int_t j; for(i=0; i<=a->rows-1; i++) { for(j=0; j<=a->cols-1; j++) { a->ptr.pp_bool[i][j] = !a->ptr.pp_bool[i][j]; } } } /************************************************************************* This is debug function intended for testing ALGLIB interface generator. Never use it in any real life project. Transposes array. Array is passed using "var" convention. -- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/ void xdebugb2transpose(/* Boolean */ ae_matrix* a, ae_state *_state) { ae_frame _frame_block; ae_int_t i; ae_int_t j; ae_matrix b; ae_frame_make(_state, &_frame_block); ae_matrix_init(&b, 0, 0, DT_BOOL, _state); ae_matrix_set_length(&b, a->rows, a->cols, _state); for(i=0; i<=b.rows-1; i++) { for(j=0; j<=b.cols-1; j++) { b.ptr.pp_bool[i][j] = a->ptr.pp_bool[i][j]; } } ae_matrix_set_length(a, b.cols, b.rows, _state); for(i=0; i<=b.rows-1; i++) { for(j=0; j<=b.cols-1; j++) { a->ptr.pp_bool[j][i] = b.ptr.pp_bool[i][j]; } } ae_frame_leave(_state); } /************************************************************************* This is debug function intended for testing ALGLIB interface generator. Never use it in any real life project. Generate MxN matrix with elements set to "Sin(3*I+5*J)>0" Array is passed using "out" convention. -- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/ void xdebugb2outsin(ae_int_t m, ae_int_t n, /* Boolean */ ae_matrix* a, ae_state *_state) { ae_int_t i; ae_int_t j; ae_matrix_clear(a); ae_matrix_set_length(a, m, n, _state); for(i=0; i<=a->rows-1; i++) { for(j=0; j<=a->cols-1; j++) { a->ptr.pp_bool[i][j] = ae_fp_greater(ae_sin((double)(3*i+5*j), _state),(double)(0)); } } } /************************************************************************* This is debug function intended for testing ALGLIB interface generator. Never use it in any real life project. Returns sum of elements in the array. -- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/ ae_int_t xdebugi2sum(/* Integer */ ae_matrix* a, ae_state *_state) { ae_int_t i; ae_int_t j; ae_int_t result; result = 0; for(i=0; i<=a->rows-1; i++) { for(j=0; j<=a->cols-1; j++) { result = result+a->ptr.pp_int[i][j]; } } return result; } /************************************************************************* This is debug function intended for testing ALGLIB interface generator. Never use it in any real life project. Replace all values in array by -a[i,j] Array is passed using "shared" convention. 
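EXAMPLE (illustrative sketch, not part of ALGLIB; the helper name demo_i2_shared is hypothetical). It shows the "shared" convention: the callee updates the caller's matrix in place, so negating the matrix flips the sign of its element sum.

    static void demo_i2_shared(ae_state *_state)
    {
        ae_frame _frame_block;
        ae_matrix a;
        ae_int_t s0, s1;
        ae_frame_make(_state, &_frame_block);
        ae_matrix_init(&a, 0, 0, DT_INT, _state);
        xdebugi2outsin(2, 3, &a, _state);        // a[i][j] = Sign(Sin(3*i+5*j))
        s0 = xdebugi2sum(&a, _state);
        xdebugi2neg(&a, _state);                 // in-place: a[i][j] = -a[i][j]
        s1 = xdebugi2sum(&a, _state);
        ae_assert(s0==-s1, "demo_i2_shared: unexpected sum", _state);
        ae_frame_leave(_state);
    }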
-- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/ void xdebugi2neg(/* Integer */ ae_matrix* a, ae_state *_state) { ae_int_t i; ae_int_t j; for(i=0; i<=a->rows-1; i++) { for(j=0; j<=a->cols-1; j++) { a->ptr.pp_int[i][j] = -a->ptr.pp_int[i][j]; } } } /************************************************************************* This is debug function intended for testing ALGLIB interface generator. Never use it in any real life project. Transposes array. Array is passed using "var" convention. -- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/ void xdebugi2transpose(/* Integer */ ae_matrix* a, ae_state *_state) { ae_frame _frame_block; ae_int_t i; ae_int_t j; ae_matrix b; ae_frame_make(_state, &_frame_block); ae_matrix_init(&b, 0, 0, DT_INT, _state); ae_matrix_set_length(&b, a->rows, a->cols, _state); for(i=0; i<=b.rows-1; i++) { for(j=0; j<=b.cols-1; j++) { b.ptr.pp_int[i][j] = a->ptr.pp_int[i][j]; } } ae_matrix_set_length(a, b.cols, b.rows, _state); for(i=0; i<=b.rows-1; i++) { for(j=0; j<=b.cols-1; j++) { a->ptr.pp_int[j][i] = b.ptr.pp_int[i][j]; } } ae_frame_leave(_state); } /************************************************************************* This is debug function intended for testing ALGLIB interface generator. Never use it in any real life project. Generate MxN matrix with elements set to "Sign(Sin(3*I+5*J))" Array is passed using "out" convention. -- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/ void xdebugi2outsin(ae_int_t m, ae_int_t n, /* Integer */ ae_matrix* a, ae_state *_state) { ae_int_t i; ae_int_t j; ae_matrix_clear(a); ae_matrix_set_length(a, m, n, _state); for(i=0; i<=a->rows-1; i++) { for(j=0; j<=a->cols-1; j++) { a->ptr.pp_int[i][j] = ae_sign(ae_sin((double)(3*i+5*j), _state), _state); } } } /************************************************************************* This is debug function intended for testing ALGLIB interface generator. Never use it in any real life project. Returns sum of elements in the array. -- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/ double xdebugr2sum(/* Real */ ae_matrix* a, ae_state *_state) { ae_int_t i; ae_int_t j; double result; result = (double)(0); for(i=0; i<=a->rows-1; i++) { for(j=0; j<=a->cols-1; j++) { result = result+a->ptr.pp_double[i][j]; } } return result; } /************************************************************************* This is debug function intended for testing ALGLIB interface generator. Never use it in any real life project. Replace all values in array by -a[i,j] Array is passed using "shared" convention. -- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/ void xdebugr2neg(/* Real */ ae_matrix* a, ae_state *_state) { ae_int_t i; ae_int_t j; for(i=0; i<=a->rows-1; i++) { for(j=0; j<=a->cols-1; j++) { a->ptr.pp_double[i][j] = -a->ptr.pp_double[i][j]; } } } /************************************************************************* This is debug function intended for testing ALGLIB interface generator. Never use it in any real life project. Transposes array. Array is passed using "var" convention. 
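EXAMPLE (illustrative sketch, not part of ALGLIB; the helper name demo_r2_var is hypothetical). It shows the "var" convention: the callee may resize the caller's matrix, so transposing twice restores the original shape and contents.

    static void demo_r2_var(ae_state *_state)
    {
        ae_frame _frame_block;
        ae_matrix a;
        double s_before, s_after;
        ae_frame_make(_state, &_frame_block);
        ae_matrix_init(&a, 0, 0, DT_REAL, _state);
        xdebugr2outsin(2, 5, &a, _state);        // 2x5 matrix, a[i][j] = Sin(3*i+5*j)
        s_before = xdebugr2sum(&a, _state);
        xdebugr2transpose(&a, _state);           // now 5x2
        xdebugr2transpose(&a, _state);           // back to 2x5
        s_after = xdebugr2sum(&a, _state);
        ae_assert(a.rows==2 && a.cols==5, "demo_r2_var: unexpected shape", _state);
        ae_assert(s_before==s_after, "demo_r2_var: sums differ", _state);
        ae_frame_leave(_state);
    }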
-- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/ void xdebugr2transpose(/* Real */ ae_matrix* a, ae_state *_state) { ae_frame _frame_block; ae_int_t i; ae_int_t j; ae_matrix b; ae_frame_make(_state, &_frame_block); ae_matrix_init(&b, 0, 0, DT_REAL, _state); ae_matrix_set_length(&b, a->rows, a->cols, _state); for(i=0; i<=b.rows-1; i++) { for(j=0; j<=b.cols-1; j++) { b.ptr.pp_double[i][j] = a->ptr.pp_double[i][j]; } } ae_matrix_set_length(a, b.cols, b.rows, _state); for(i=0; i<=b.rows-1; i++) { for(j=0; j<=b.cols-1; j++) { a->ptr.pp_double[j][i] = b.ptr.pp_double[i][j]; } } ae_frame_leave(_state); } /************************************************************************* This is debug function intended for testing ALGLIB interface generator. Never use it in any real life project. Generate MxN matrix with elements set to "Sin(3*I+5*J)" Array is passed using "out" convention. -- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/ void xdebugr2outsin(ae_int_t m, ae_int_t n, /* Real */ ae_matrix* a, ae_state *_state) { ae_int_t i; ae_int_t j; ae_matrix_clear(a); ae_matrix_set_length(a, m, n, _state); for(i=0; i<=a->rows-1; i++) { for(j=0; j<=a->cols-1; j++) { a->ptr.pp_double[i][j] = ae_sin((double)(3*i+5*j), _state); } } } /************************************************************************* This is debug function intended for testing ALGLIB interface generator. Never use it in any real life project. Returns sum of elements in the array. -- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/ ae_complex xdebugc2sum(/* Complex */ ae_matrix* a, ae_state *_state) { ae_int_t i; ae_int_t j; ae_complex result; result = ae_complex_from_i(0); for(i=0; i<=a->rows-1; i++) { for(j=0; j<=a->cols-1; j++) { result = ae_c_add(result,a->ptr.pp_complex[i][j]); } } return result; } /************************************************************************* This is debug function intended for testing ALGLIB interface generator. Never use it in any real life project. Replace all values in array by -a[i,j] Array is passed using "shared" convention. -- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/ void xdebugc2neg(/* Complex */ ae_matrix* a, ae_state *_state) { ae_int_t i; ae_int_t j; for(i=0; i<=a->rows-1; i++) { for(j=0; j<=a->cols-1; j++) { a->ptr.pp_complex[i][j] = ae_c_neg(a->ptr.pp_complex[i][j]); } } } /************************************************************************* This is debug function intended for testing ALGLIB interface generator. Never use it in any real life project. Transposes array. Array is passed using "var" convention. 
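EXAMPLE (illustrative sketch, not part of ALGLIB; the helper name demo_c2_sum is hypothetical). It fills a complex matrix with xdebugc2outsincos(), transposes it in place ("var" convention) and returns the element sum, which transposition does not change.

    static ae_complex demo_c2_sum(ae_state *_state)
    {
        ae_frame _frame_block;
        ae_matrix a;
        ae_complex s;
        ae_frame_make(_state, &_frame_block);
        ae_matrix_init(&a, 0, 0, DT_COMPLEX, _state);
        xdebugc2outsincos(3, 2, &a, _state);     // a[i][j] = (Sin(3*i+5*j), Cos(3*i+5*j))
        xdebugc2transpose(&a, _state);           // a becomes 2x3
        s = xdebugc2sum(&a, _state);
        ae_frame_leave(_state);
        return s;
    }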
-- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/ void xdebugc2transpose(/* Complex */ ae_matrix* a, ae_state *_state) { ae_frame _frame_block; ae_int_t i; ae_int_t j; ae_matrix b; ae_frame_make(_state, &_frame_block); ae_matrix_init(&b, 0, 0, DT_COMPLEX, _state); ae_matrix_set_length(&b, a->rows, a->cols, _state); for(i=0; i<=b.rows-1; i++) { for(j=0; j<=b.cols-1; j++) { b.ptr.pp_complex[i][j] = a->ptr.pp_complex[i][j]; } } ae_matrix_set_length(a, b.cols, b.rows, _state); for(i=0; i<=b.rows-1; i++) { for(j=0; j<=b.cols-1; j++) { a->ptr.pp_complex[j][i] = b.ptr.pp_complex[i][j]; } } ae_frame_leave(_state); } /************************************************************************* This is debug function intended for testing ALGLIB interface generator. Never use it in any real life project. Generate MxN matrix with elements set to "Sin(3*I+5*J),Cos(3*I+5*J)" Array is passed using "out" convention. -- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/ void xdebugc2outsincos(ae_int_t m, ae_int_t n, /* Complex */ ae_matrix* a, ae_state *_state) { ae_int_t i; ae_int_t j; ae_matrix_clear(a); ae_matrix_set_length(a, m, n, _state); for(i=0; i<=a->rows-1; i++) { for(j=0; j<=a->cols-1; j++) { a->ptr.pp_complex[i][j].x = ae_sin((double)(3*i+5*j), _state); a->ptr.pp_complex[i][j].y = ae_cos((double)(3*i+5*j), _state); } } } /************************************************************************* This is debug function intended for testing ALGLIB interface generator. Never use it in any real life project. Returns sum of a[i,j]*(1+b[i,j]) such that c[i,j] is True -- ALGLIB -- Copyright 11.10.2013 by Bochkanov Sergey *************************************************************************/ double xdebugmaskedbiasedproductsum(ae_int_t m, ae_int_t n, /* Real */ ae_matrix* a, /* Real */ ae_matrix* b, /* Boolean */ ae_matrix* c, ae_state *_state) { ae_int_t i; ae_int_t j; double result; ae_assert(m>=a->rows, "Assertion failed", _state); ae_assert(m>=b->rows, "Assertion failed", _state); ae_assert(m>=c->rows, "Assertion failed", _state); ae_assert(n>=a->cols, "Assertion failed", _state); ae_assert(n>=b->cols, "Assertion failed", _state); ae_assert(n>=c->cols, "Assertion failed", _state); result = 0.0; for(i=0; i<=m-1; i++) { for(j=0; j<=n-1; j++) { if( c->ptr.pp_bool[i][j] ) { result = result+a->ptr.pp_double[i][j]*(1+b->ptr.pp_double[i][j]); } } } return result; } void _xdebugrecord1_init(void* _p, ae_state *_state) { xdebugrecord1 *p = (xdebugrecord1*)_p; ae_touch_ptr((void*)p); ae_vector_init(&p->a, 0, DT_REAL, _state); } void _xdebugrecord1_init_copy(void* _dst, void* _src, ae_state *_state) { xdebugrecord1 *dst = (xdebugrecord1*)_dst; xdebugrecord1 *src = (xdebugrecord1*)_src; dst->i = src->i; dst->c = src->c; ae_vector_init_copy(&dst->a, &src->a, _state); } void _xdebugrecord1_clear(void* _p) { xdebugrecord1 *p = (xdebugrecord1*)_p; ae_touch_ptr((void*)p); ae_vector_clear(&p->a); } void _xdebugrecord1_destroy(void* _p) { xdebugrecord1 *p = (xdebugrecord1*)_p; ae_touch_ptr((void*)p); ae_vector_destroy(&p->a); } } cpp/src/ap.cpp0000755000175000017500000121372513105126766013136 0ustar sergeysergey/************************************************************************* ALGLIB 3.11.0 (source code generated 2017-05-11) Copyright (c) Sergey Bochkanov (ALGLIB project). 
>>> SOURCE LICENSE >>> This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation (www.fsf.org); either version 2 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. A copy of the GNU General Public License is available at http://www.fsf.org/licensing/licenses >>> END OF LICENSE >>> *************************************************************************/ #include "stdafx.h" #include "ap.h" #include #include using namespace std; #if defined(AE_CPU) #if (AE_CPU==AE_INTEL) #if AE_COMPILER==AE_MSVC #include #endif #endif #endif // disable some irrelevant warnings #if (AE_COMPILER==AE_MSVC) #pragma warning(disable:4100) #pragma warning(disable:4127) #pragma warning(disable:4702) #pragma warning(disable:4996) #endif ///////////////////////////////////////////////////////////////////////// // // THIS SECTION IMPLEMENTS BASIC FUNCTIONALITY LIKE // MEMORY MANAGEMENT FOR VECTORS/MATRICES WHICH IS // SHARED BETWEEN C++ AND PURE C LIBRARIES // ///////////////////////////////////////////////////////////////////////// namespace alglib_impl { /* * OS-specific includes */ #ifdef AE_USE_CPP } #endif #if AE_OS==AE_WINDOWS #ifndef _WIN32_WINNT #define _WIN32_WINNT 0x0501 #endif #include #include #elif AE_OS==AE_POSIX #include #include #include #include #endif /* Debugging helpers for Windows */ #ifdef AE_DEBUG4WINDOWS #include #include #endif #ifdef AE_USE_CPP namespace alglib_impl { #endif /* * local definitions */ #define x_nb 16 #define AE_DATA_ALIGN 64 #define AE_PTR_ALIGN sizeof(void*) #define DYN_BOTTOM ((void*)1) #define DYN_FRAME ((void*)2) #define AE_LITTLE_ENDIAN 1 #define AE_BIG_ENDIAN 2 #define AE_MIXED_ENDIAN 3 #define AE_SER_ENTRY_LENGTH 11 #define AE_SER_ENTRIES_PER_ROW 5 #define AE_SM_DEFAULT 0 #define AE_SM_ALLOC 1 #define AE_SM_READY2S 2 #define AE_SM_TO_STRING 10 #define AE_SM_TO_CPPSTRING 11 #define AE_SM_TO_STREAM 12 #define AE_SM_FROM_STRING 20 #define AE_SM_FROM_STREAM 22 #define AE_LOCK_CYCLES 512 #define AE_LOCK_TESTS_BEFORE_YIELD 16 #define AE_CRITICAL_ASSERT(x) if( !(x) ) abort() /************************************************************************* Lock. This is internal structure which implements lock functionality. *************************************************************************/ typedef struct { #if AE_OS==AE_WINDOWS volatile ae_int_t * volatile p_lock; char buf[sizeof(ae_int_t)+AE_LOCK_ALIGNMENT]; #elif AE_OS==AE_POSIX pthread_mutex_t mutex; #else ae_bool is_locked; #endif } _lock; /* * alloc counter */ ae_int64_t _alloc_counter = 0; ae_bool _use_alloc_counter = ae_false; #ifdef AE_SMP_DEBUGCOUNTERS __declspec(align(AE_LOCK_ALIGNMENT)) volatile ae_int64_t _ae_dbg_lock_acquisitions = 0; __declspec(align(AE_LOCK_ALIGNMENT)) volatile ae_int64_t _ae_dbg_lock_spinwaits = 0; __declspec(align(AE_LOCK_ALIGNMENT)) volatile ae_int64_t _ae_dbg_lock_yields = 0; #endif /* * These declarations are used to ensure that * sizeof(ae_bool)=1, sizeof(ae_int32_t)==4, sizeof(ae_int64_t)==8, sizeof(ae_int_t)==sizeof(void*). * they will lead to syntax error otherwise (array size will be negative). * * you can remove them, if you want - they are not used anywhere. 
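* An illustrative sketch of the same compile-time check (not part of ALGLIB,
* the name and the checked condition are examples only): the array size
* evaluates to 1 when the assumed condition holds and becomes negative -
* a compile error - otherwise.
*
*     static char demo_double_must_be_8_bytes_wide[1-2*((int)(sizeof(double))-8)*((int)(sizeof(double))-8)];
*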
* */ static char _ae_bool_must_be_8_bits_wide[1-2*((int)(sizeof(ae_bool))-1)*((int)(sizeof(ae_bool))-1)]; static char _ae_int32_t_must_be_32_bits_wide[1-2*((int)(sizeof(ae_int32_t))-4)*((int)(sizeof(ae_int32_t))-4)]; static char _ae_int64_t_must_be_64_bits_wide[1-2*((int)(sizeof(ae_int64_t))-8)*((int)(sizeof(ae_int64_t))-8)]; static char _ae_int_t_must_be_pointer_sized [1-2*((int)(sizeof(ae_int_t))-(int)sizeof(void*))*((int)(sizeof(ae_int_t))-(int)(sizeof(void*)))]; /* * This variable is used to prevent some tricky optimizations which may degrade multithreaded performance. * It is touched once in the ae_init_pool() function from smp.c in order to prevent optimizations. * */ static volatile ae_int_t ae_never_change_it = 1; ae_int_t ae_misalignment(const void *ptr, size_t alignment) { union _u { const void *ptr; ae_int_t iptr; } u; u.ptr = ptr; return (ae_int_t)(u.iptr%alignment); } void* ae_align(void *ptr, size_t alignment) { char *result = (char*)ptr; if( (result-(char*)0)%alignment!=0 ) result += alignment - (result-(char*)0)%alignment; return result; } /************************************************************************* This function abnormally aborts program, using one of several ways: * for AE_USE_CPP_ERROR_HANDLING being NOT defined: * for state!=NULL and state->break_jump being initialized with call to ae_state_set_break_jump() - it performs longjmp() to return site. * otherwise, abort() is called * for AE_USE_CPP_ERROR_HANDLING being DEFINED - an instance of ae_error_type() class is throw'ed. In all cases, for state!=NULL function sets state->last_error and state->error_msg fields. It also clears state with ae_state_clear(). If state is not NULL and state->thread_exception_handler is set, it is called prior to handling error and clearing state. 
*************************************************************************/ void ae_break(ae_state *state, ae_error_type error_type, const char *msg) { #ifndef AE_USE_CPP_ERROR_HANDLING if( state!=NULL ) { if( state->thread_exception_handler!=NULL ) state->thread_exception_handler(state); ae_state_clear(state); state->last_error = error_type; state->error_msg = msg; if( state->break_jump!=NULL ) longjmp(*(state->break_jump), 1); else abort(); } else abort(); #else if( state!=NULL ) { if( state->thread_exception_handler!=NULL ) state->thread_exception_handler(state); ae_state_clear(state); state->last_error = error_type; state->error_msg = msg; } throw error_type; #endif } void* aligned_malloc(size_t size, size_t alignment) { if( size==0 ) return NULL; if( alignment<=1 ) { /* no alignment, just call malloc */ void *block; void **p; ; block = malloc(sizeof(void*)+size); if( block==NULL ) return NULL; p = (void**)block; *p = block; if( _use_alloc_counter ) { #if AE_OS==AE_WINDOWS InterlockedIncrement((LONG volatile *)&_alloc_counter); #else #endif } return (void*)((char*)block+sizeof(void*)); } else { /* align */ void *block; char *result; block = malloc(alignment-1+sizeof(void*)+size); if( block==NULL ) return NULL; result = (char*)block+sizeof(void*); /*if( (result-(char*)0)%alignment!=0 ) result += alignment - (result-(char*)0)%alignment;*/ result = (char*)ae_align(result, alignment); *((void**)(result-sizeof(void*))) = block; if( _use_alloc_counter ) { #if AE_OS==AE_WINDOWS InterlockedIncrement((LONG volatile *)&_alloc_counter); #else #endif } return result; } } void aligned_free(void *block) { void *p; if( block==NULL ) return; p = *((void**)((char*)block-sizeof(void*))); free(p); if( _use_alloc_counter ) { #if AE_OS==AE_WINDOWS InterlockedDecrement((LONG volatile *)&_alloc_counter); #else #endif } } /************************************************************************ Malloc's memory with automatic alignment. Returns NULL when zero size is specified. Error handling: * if state is NULL, returns NULL on allocation error * if state is not NULL, calls ae_break() on allocation error ************************************************************************/ void* ae_malloc(size_t size, ae_state *state) { void *result; if( size==0 ) return NULL; result = aligned_malloc(size,AE_DATA_ALIGN); if( result==NULL && state!=NULL) ae_break(state, ERR_OUT_OF_MEMORY, "ae_malloc(): out of memory"); return result; } void ae_free(void *p) { if( p!=NULL ) aligned_free(p); } /************************************************************************ Sets pointers to the matrix rows. * dst must be correctly initialized matrix * dst->data.ptr points to the beginning of memory block allocated for row pointers. * dst->ptr - undefined (initialized during algorithm processing) * storage parameter points to the beginning of actual storage ************************************************************************/ void ae_matrix_update_row_pointers(ae_matrix *dst, void *storage) { char *p_base; void **pp_ptr; ae_int_t i; if( dst->rows>0 && dst->cols>0 ) { p_base = (char*)storage; pp_ptr = (void**)dst->data.ptr; dst->ptr.pp_void = pp_ptr; for(i=0; irows; i++, p_base+=dst->stride*ae_sizeof(dst->datatype)) pp_ptr[i] = p_base; } else dst->ptr.pp_void = NULL; } /************************************************************************ Returns size of datatype. Zero for dynamic types like strings or multiple precision types. 
************************************************************************/ ae_int_t ae_sizeof(ae_datatype datatype) { switch(datatype) { case DT_BOOL: return (ae_int_t)sizeof(ae_bool); case DT_INT: return (ae_int_t)sizeof(ae_int_t); case DT_REAL: return (ae_int_t)sizeof(double); case DT_COMPLEX: return 2*(ae_int_t)sizeof(double); default: return 0; } } /************************************************************************ This dummy function is used to prevent compiler messages about unused locals in automatically generated code. It makes nothing - just accepts pointer, "touches" it - and that is all. It performs several tricky operations without side effects which confuse compiler so it does not compain about unused locals in THIS function. ************************************************************************/ void ae_touch_ptr(void *p) { void * volatile fake_variable0 = p; void * volatile fake_variable1 = fake_variable0; fake_variable0 = fake_variable1; } /************************************************************************ This function initializes ALGLIB environment state. NOTES: * stacks contain no frames, so ae_make_frame() must be called before attaching dynamic blocks. Without it ae_leave_frame() will cycle forever (which is intended behavior). ************************************************************************/ void ae_state_init(ae_state *state) { ae_int32_t *vp; /* * p_next points to itself because: * * correct program should be able to detect end of the list * by looking at the ptr field. * * NULL p_next may be used to distinguish automatic blocks * (in the list) from non-automatic (not in the list) */ state->last_block.p_next = &(state->last_block); state->last_block.deallocator = NULL; state->last_block.ptr = DYN_BOTTOM; state->p_top_block = &(state->last_block); #ifndef AE_USE_CPP_ERROR_HANDLING state->break_jump = NULL; #endif state->error_msg = ""; /* * determine endianness and initialize precomputed IEEE special quantities. */ state->endianness = ae_get_endianness(); if( state->endianness==AE_LITTLE_ENDIAN ) { vp = (ae_int32_t*)(&state->v_nan); vp[0] = 0; vp[1] = (ae_int32_t)0x7FF80000; vp = (ae_int32_t*)(&state->v_posinf); vp[0] = 0; vp[1] = (ae_int32_t)0x7FF00000; vp = (ae_int32_t*)(&state->v_neginf); vp[0] = 0; vp[1] = (ae_int32_t)0xFFF00000; } else if( state->endianness==AE_BIG_ENDIAN ) { vp = (ae_int32_t*)(&state->v_nan); vp[1] = 0; vp[0] = (ae_int32_t)0x7FF80000; vp = (ae_int32_t*)(&state->v_posinf); vp[1] = 0; vp[0] = (ae_int32_t)0x7FF00000; vp = (ae_int32_t*)(&state->v_neginf); vp[1] = 0; vp[0] = (ae_int32_t)0xFFF00000; } else abort(); /* * set threading information */ state->worker_thread = NULL; state->parent_task = NULL; state->thread_exception_handler = NULL; } /************************************************************************ This function clears ALGLIB environment state. All dynamic data controlled by state are freed. ************************************************************************/ void ae_state_clear(ae_state *state) { while( state->p_top_block->ptr!=DYN_BOTTOM ) ae_frame_leave(state); } #ifndef AE_USE_CPP_ERROR_HANDLING /************************************************************************ This function sets jump buffer for error handling. buf may be NULL. 
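EXAMPLE (illustrative sketch, not part of ALGLIB; the helper name demo_guarded_call is hypothetical). When AE_USE_CPP_ERROR_HANDLING is not defined, a caller installs a jump buffer with ae_state_set_break_jump(); any subsequent ae_break() (e.g. from a failed ae_assert) longjmp's back to the setjmp() site with state->last_error and state->error_msg describing the failure. The sketch assumes an ae_state already initialized with ae_state_init().

    static int demo_guarded_call(ae_state *_state)
    {
        jmp_buf break_at;
        if( setjmp(break_at) )
        {
            // any ae_break() issued below lands here;
            // inspect _state->last_error and _state->error_msg
            return -1;
        }
        ae_state_set_break_jump(_state, &break_at);
        ae_assert(2+2==4, "never triggers", _state);  // guarded code goes here
        return 0;
    }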
************************************************************************/ void ae_state_set_break_jump(ae_state *state, jmp_buf *buf) { state->break_jump = buf; } #endif /************************************************************************ This function makes new stack frame. This function takes two parameters: environment state and pointer to the dynamic block which will be used as indicator of the frame beginning. This dynamic block must be initialized by caller and mustn't be changed/ deallocated/reused till ae_leave_frame called. It may be global or local variable (local is even better). ************************************************************************/ void ae_frame_make(ae_state *state, ae_frame *tmp) { tmp->db_marker.p_next = state->p_top_block; tmp->db_marker.deallocator = NULL; tmp->db_marker.ptr = DYN_FRAME; state->p_top_block = &tmp->db_marker; } /************************************************************************ This function leaves current stack frame and deallocates all automatic dynamic blocks which were attached to this frame. ************************************************************************/ void ae_frame_leave(ae_state *state) { while( state->p_top_block->ptr!=DYN_FRAME && state->p_top_block->ptr!=DYN_BOTTOM) { if( state->p_top_block->ptr!=NULL && state->p_top_block->deallocator!=NULL) ((ae_deallocator)(state->p_top_block->deallocator))(state->p_top_block->ptr); state->p_top_block = state->p_top_block->p_next; } state->p_top_block = state->p_top_block->p_next; } /************************************************************************ This function attaches block to the dynamic block list block block state ALGLIB environment state NOTES: * never call it for special blocks which marks frame boundaries! ************************************************************************/ void ae_db_attach(ae_dyn_block *block, ae_state *state) { block->p_next = state->p_top_block; state->p_top_block = block; } /************************************************************************ This function malloc's dynamic block: block destination block, assumed to be uninitialized size size (in bytes) state ALGLIB environment state. May be NULL. make_automatic if true, vector is added to the dynamic block list block is assumed to be uninitialized, its fields are ignored. 
Error handling: * if state is NULL, returns ae_false on allocation error * if state is not NULL, calls ae_break() on allocation error * returns ae_true on success NOTES: * never call it for blocks which are already in the list ************************************************************************/ ae_bool ae_db_malloc(ae_dyn_block *block, ae_int_t size, ae_state *state, ae_bool make_automatic) { /* ensure that size is >=0 two ways to exit: 1) through ae_assert, if we have non-NULL state, 2) by returning ae_false */ if( state!=NULL ) ae_assert(size>=0, "ae_db_malloc(): negative size", state); if( size<0 ) return ae_false; /* allocation */ block->ptr = ae_malloc((size_t)size, state); if( block->ptr==NULL && size!=0 ) { /* for state!=NULL exception is thrown from ae_malloc(), so we have to handle only situation when state is NULL */ return ae_false; } if( make_automatic && state!=NULL ) ae_db_attach(block, state); else block->p_next = NULL; block->deallocator = ae_free; return ae_true; } /************************************************************************ This function realloc's dynamic block: block destination block (initialized) size new size (in bytes) state ALGLIB environment state block is assumed to be initialized. This function: * deletes old contents * preserves automatic state Error handling: * if state is NULL, returns ae_false on allocation error * if state is not NULL, calls ae_break() on allocation error * returns ae_true on success NOTES: * never call it for special blocks which mark frame boundaries! ************************************************************************/ ae_bool ae_db_realloc(ae_dyn_block *block, ae_int_t size, ae_state *state) { /* ensure that size is >=0 two ways to exit: 1) through ae_assert, if we have non-NULL state, 2) by returning ae_false */ if( state!=NULL ) ae_assert(size>=0, "ae_db_realloc(): negative size", state); if( size<0 ) return ae_false; /* realloc */ if( block->ptr!=NULL ) ((ae_deallocator)block->deallocator)(block->ptr); block->ptr = ae_malloc((size_t)size, state); if( block->ptr==NULL && size!=0 ) { /* for state!=NULL exception is thrown from ae_malloc(), so we have to handle only situation when state is NULL */ return ae_false; } block->deallocator = ae_free; return ae_true; } /************************************************************************ This function clears dynamic block (releases all dynamically allocated memory). Dynamic block may be in automatic management list - in this case it will NOT be removed from list. block destination block (initialized) NOTES: * never call it for special blocks which marks frame boundaries! ************************************************************************/ void ae_db_free(ae_dyn_block *block) { if( block->ptr!=NULL ) ((ae_deallocator)block->deallocator)(block->ptr); block->ptr = NULL; block->deallocator = ae_free; } /************************************************************************ This function swaps contents of two dynamic blocks (pointers and deallocators) leaving other parameters (automatic management settings, etc.) unchanged. NOTES: * never call it for special blocks which marks frame boundaries! 
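EXAMPLE (illustrative sketch, not part of ALGLIB; the helper name demo_dyn_block is hypothetical). It shows the dynamic block discipline described above: a block allocated with make_automatic=ae_true is attached to the current frame and is released by ae_frame_leave(), with no explicit ae_db_free() call.

    static void demo_dyn_block(ae_state *_state)
    {
        ae_frame _frame_block;
        ae_dyn_block buf;
        double *p;
        ae_frame_make(_state, &_frame_block);
        ae_db_malloc(&buf, 16*(ae_int_t)sizeof(double), _state, ae_true);
        p = (double*)buf.ptr;                    // raw storage of the block
        p[0] = 1.0;
        ae_frame_leave(_state);                  // buf is deallocated here
    }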
************************************************************************/ void ae_db_swap(ae_dyn_block *block1, ae_dyn_block *block2) { void (*deallocator)(void*) = NULL; void * volatile ptr; ptr = block1->ptr; deallocator = block1->deallocator; block1->ptr = block2->ptr; block1->deallocator = block2->deallocator; block2->ptr = ptr; block2->deallocator = deallocator; } /************************************************************************* This function creates ae_vector. Vector size may be zero. Vector contents is uninitialized. dst destination vector, assumed to be uninitialized, its fields are ignored. size vector size, may be zero datatype guess what... state this parameter can be: * pointer to current instance of ae_state, if you want to automatically destroy this object after leaving current frame * NULL, if you do NOT want this vector to be automatically managed (say, if it is field of some object) Error handling: * on failure (size<0 or unable to allocate memory) - calls ae_break() with NULL state pointer. Usually it results in abort() call. dst is *************************************************************************/ void ae_vector_init(ae_vector *dst, ae_int_t size, ae_datatype datatype, ae_state *state) { /* ensure that size is >=0 two ways to exit: 1) through ae_assert, if we have non-NULL state, 2) by returning ae_false */ ae_assert( size>=0, "ae_vector_init(): negative size", NULL); /* init */ dst->cnt = size; dst->datatype = datatype; ae_assert( ae_db_malloc(&dst->data, size*ae_sizeof(datatype), state, state!=NULL), /* TODO: change ae_db_malloc() */ "ae_vector_init(): failed to allocate memory", NULL); dst->ptr.p_ptr = dst->data.ptr; dst->is_attached = ae_false; } /************************************************************************ This function creates copy of ae_vector. New copy of the data is created, which is managed and owned by newly initialized vector. dst destination vector src well, it is source state this parameter can be: * pointer to current instance of ae_state, if you want to automatically destroy this object after leaving current frame * NULL, if you do NOT want this vector to be automatically managed (say, if it is field of some object) Error handling: * on failure calls ae_break() with NULL state pointer. Usually it results in abort() call. dst is assumed to be uninitialized, its fields are ignored. ************************************************************************/ void ae_vector_init_copy(ae_vector *dst, ae_vector *src, ae_state *state) { ae_vector_init(dst, src->cnt, src->datatype, state); if( src->cnt!=0 ) memcpy(dst->ptr.p_ptr, src->ptr.p_ptr, (size_t)(src->cnt*ae_sizeof(src->datatype))); } /************************************************************************ This function initializes ae_vector using X-structure as source. New copy of data is created, which is owned/managed by ae_vector structure. Both structures (source and destination) remain completely independent after this call. dst destination matrix src well, it is source state this parameter can be: * pointer to current instance of ae_state, if you want to automatically destroy this object after leaving current frame * NULL, if you do NOT want this vector to be automatically managed (say, if it is field of some object) Error handling: * on failure calls ae_break() with NULL state pointer. Usually it results in abort() call. dst is assumed to be uninitialized, its fields are ignored. 
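EXAMPLE (illustrative sketch, not part of ALGLIB; the helper name demo_x_roundtrip is hypothetical). It exports an ae_vector to x-format with ae_x_attach_to_vector() (shared memory, owner=OWN_CALLER) and then builds an independent copy with ae_vector_init_from_x(). The x_vector is marked as caller-owned before attaching so that the attach call does not try to free stale memory.

    static void demo_x_roundtrip(ae_state *_state)
    {
        ae_frame _frame_block;
        ae_vector src, dst;
        x_vector xv;
        ae_frame_make(_state, &_frame_block);
        ae_vector_init(&src, 3, DT_REAL, _state);
        src.ptr.p_double[0] = 1.0;
        src.ptr.p_double[1] = 2.0;
        src.ptr.p_double[2] = 3.0;
        xv.owner = OWN_CALLER;                    // not owned by ALGLIB
        xv.ptr = NULL;
        ae_x_attach_to_vector(&xv, &src);         // xv shares memory with src
        ae_vector_init_from_x(&dst, &xv, _state); // dst is an independent copy
        ae_assert(dst.cnt==3 && dst.ptr.p_double[2]==3.0, "demo_x_roundtrip failed", _state);
        x_vector_clear(&xv);                      // xv is not owner, fields are just reset
        ae_frame_leave(_state);
    }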
************************************************************************/ void ae_vector_init_from_x(ae_vector *dst, x_vector *src, ae_state *state) { ae_vector_init(dst, (ae_int_t)src->cnt, (ae_datatype)src->datatype, state); if( src->cnt>0 ) memcpy(dst->ptr.p_ptr, src->ptr, (size_t)(((ae_int_t)src->cnt)*ae_sizeof((ae_datatype)src->datatype))); } /************************************************************************ This function initializes ae_vector using X-structure as source. New vector is attached to source: * DST shares memory with SRC * both DST and SRC are writable - all writes to DST change elements of SRC and vice versa. * DST can be reallocated with ae_vector_set_length(), in this case SRC remains untouched * SRC, however, CAN NOT BE REALLOCATED AS LONG AS DST EXISTS NOTE: is_attached field is set to ae_true in order to indicate that vector does not own its memory. dst destination vector src well, it is source state this parameter can be: * pointer to current instance of ae_state, if you want to automatically destroy this object after leaving current frame * NULL, if you do NOT want this vector to be automatically managed (say, if it is field of some object) Error handling: * on failure calls ae_break() with NULL state pointer. Usually it results in abort() call. dst is assumed to be uninitialized, its fields are ignored. ************************************************************************/ void ae_vector_attach_to_x(ae_vector *dst, x_vector *src, ae_state *state) { volatile ae_int_t cnt; cnt = (ae_int_t)src->cnt; /* ensure that size is correct */ ae_assert(cnt==src->cnt, "ae_vector_attach_to_x(): 32/64 overflow", NULL); ae_assert(cnt>=0, "ae_vector_attach_to_x(): negative length", NULL); /* init */ dst->cnt = cnt; dst->datatype = (ae_datatype)src->datatype; dst->ptr.p_ptr = src->ptr; dst->is_attached = ae_true; ae_assert( ae_db_malloc(&dst->data, 0, state, state!=NULL), "ae_vector_attach_to_x(): malloc error", NULL); } /************************************************************************ This function changes length of ae_vector. dst destination vector newsize vector size, may be zero state ALGLIB environment state Error handling: * if state is NULL, returns ae_false on allocation error * if state is not NULL, calls ae_break() on allocation error * returns ae_true on success NOTES: * vector must be initialized * all contents is destroyed during setlength() call * new size may be zero. ************************************************************************/ ae_bool ae_vector_set_length(ae_vector *dst, ae_int_t newsize, ae_state *state) { /* ensure that size is >=0 two ways to exit: 1) through ae_assert, if we have non-NULL state, 2) by returning ae_false */ if( state!=NULL ) ae_assert(newsize>=0, "ae_vector_set_length(): negative size", state); if( newsize<0 ) return ae_false; /* set length */ if( dst->cnt==newsize ) return ae_true; dst->cnt = newsize; if( !ae_db_realloc(&dst->data, newsize*ae_sizeof(dst->datatype), state) ) return ae_false; dst->ptr.p_ptr = dst->data.ptr; return ae_true; } /************************************************************************ This function provides "CLEAR" functionality for vector (contents is cleared, but structure still left in valid state). The function clears vector contents (releases all dynamically allocated memory). Vector may be in automatic management list - in this case it will NOT be removed from list. 
IMPORTANT: this function does NOT invalidates dst; it just releases all dynamically allocated storage, but dst still may be used after call to ae_vector_set_length(). dst destination vector ************************************************************************/ void ae_vector_clear(ae_vector *dst) { dst->cnt = 0; ae_db_free(&dst->data); dst->ptr.p_ptr = 0; dst->is_attached = ae_false; } /************************************************************************ This function provides "DESTROY" functionality for vector (contents is cleared, all internal structures are destroyed). For vectors it is same as CLEAR. dst destination vector ************************************************************************/ void ae_vector_destroy(ae_vector *dst) { ae_vector_clear(dst); } /************************************************************************ This function efficiently swaps contents of two vectors, leaving other pararemeters (automatic management, etc.) unchanged. ************************************************************************/ void ae_swap_vectors(ae_vector *vec1, ae_vector *vec2) { ae_int_t cnt; ae_datatype datatype; void *p_ptr; ae_assert(!vec1->is_attached, "ALGLIB: internal error, attempt to swap vectors attached to X-object", NULL); ae_assert(!vec2->is_attached, "ALGLIB: internal error, attempt to swap vectors attached to X-object", NULL); ae_db_swap(&vec1->data, &vec2->data); cnt = vec1->cnt; datatype = vec1->datatype; p_ptr = vec1->ptr.p_ptr; vec1->cnt = vec2->cnt; vec1->datatype = vec2->datatype; vec1->ptr.p_ptr = vec2->ptr.p_ptr; vec2->cnt = cnt; vec2->datatype = datatype; vec2->ptr.p_ptr = p_ptr; } /************************************************************************ This function creates ae_matrix. Matrix size may be zero, in such cases both rows and cols are zero. Matrix contents is uninitialized. dst destination matrix, assumed to be unitialized, its fields are ignored rows rows count cols cols count datatype element type state depending on your desire to register matrix in the current frame: * pointer to ALGLIB environment state, if you want the matrix to be automatically managed * NULL, if you do not want it to be automatically managed Error handling: * calls ae_break() with NULL state; usually it results in abort() call. dst is assumed to be uninitialized, its fields are ignored. ************************************************************************/ void ae_matrix_init(ae_matrix *dst, ae_int_t rows, ae_int_t cols, ae_datatype datatype, ae_state *state) { ae_assert(rows>=0 && cols>=0, "ae_matrix_init(): negative length", NULL); /* if one of rows/cols is zero, another MUST be too */ if( rows==0 || cols==0 ) { rows = 0; cols = 0; } /* init */ dst->is_attached = ae_false; dst->rows = rows; dst->cols = cols; dst->stride = cols; while( dst->stride*ae_sizeof(datatype)%AE_DATA_ALIGN!=0 ) dst->stride++; dst->datatype = datatype; ae_assert( ae_db_malloc(&dst->data, dst->rows*((ae_int_t)sizeof(void*)+dst->stride*ae_sizeof(datatype))+AE_DATA_ALIGN-1, state, state!=NULL), /* TODO: change ae_db_malloc() */ "ae_matrix_init(): failed to allocate memory", NULL); ae_matrix_update_row_pointers(dst, ae_align((char*)dst->data.ptr+dst->rows*sizeof(void*),AE_DATA_ALIGN)); } /************************************************************************ This function creates copy of ae_matrix. A new copy of the data is created. 
dst destination matrix src well, it is source state this parameter can be: * pointer to current instance of ae_state, if you want to automatically destroy this object after leaving current frame * NULL, if you do NOT want this vector to be automatically managed (say, if it is field of some object) Error handling: * on failure calls ae_break() with NULL state pointer. Usually it results in abort() call. dst is assumed to be uninitialized, its fields are ignored. ************************************************************************/ void ae_matrix_init_copy(ae_matrix *dst, ae_matrix *src, ae_state *state) { ae_int_t i; ae_matrix_init(dst, src->rows, src->cols, src->datatype, state); if( src->rows!=0 && src->cols!=0 ) { if( dst->stride==src->stride ) memcpy(dst->ptr.pp_void[0], src->ptr.pp_void[0], (size_t)(src->rows*src->stride*ae_sizeof(src->datatype))); else for(i=0; irows; i++) memcpy(dst->ptr.pp_void[i], src->ptr.pp_void[i], (size_t)(dst->cols*ae_sizeof(dst->datatype))); } } /************************************************************************ This function initializes ae_matrix using X-structure as source. New copy of data is created, which is owned/managed by ae_matrix structure. Both structures (source and destination) remain completely independent after this call. dst destination matrix src well, it is source state this parameter can be: * pointer to current instance of ae_state, if you want to automatically destroy this object after leaving current frame * NULL, if you do NOT want this vector to be automatically managed (say, if it is field of some object) Error handling: * on failure calls ae_break() with NULL state pointer. Usually it results in abort() call. dst is assumed to be uninitialized, its fields are ignored. ************************************************************************/ void ae_matrix_init_from_x(ae_matrix *dst, x_matrix *src, ae_state *state) { char *p_src_row; char *p_dst_row; ae_int_t row_size; ae_int_t i; ae_matrix_init(dst, (ae_int_t)src->rows, (ae_int_t)src->cols, (ae_datatype)src->datatype, state); if( src->rows!=0 && src->cols!=0 ) { p_src_row = (char*)src->ptr; p_dst_row = (char*)(dst->ptr.pp_void[0]); row_size = ae_sizeof((ae_datatype)src->datatype)*(ae_int_t)src->cols; for(i=0; irows; i++, p_src_row+=src->stride*ae_sizeof((ae_datatype)src->datatype), p_dst_row+=dst->stride*ae_sizeof((ae_datatype)src->datatype)) memcpy(p_dst_row, p_src_row, (size_t)(row_size)); } } /************************************************************************ This function initializes ae_matrix using X-structure as source. New matrix is attached to source: * DST shares memory with SRC * both DST and SRC are writable - all writes to DST change elements of SRC and vice versa. * DST can be reallocated with ae_matrix_set_length(), in this case SRC remains untouched * SRC, however, CAN NOT BE REALLOCATED AS LONG AS DST EXISTS dst destination matrix src well, it is source state this parameter can be: * pointer to current instance of ae_state, if you want to automatically destroy this object after leaving current frame * NULL, if you do NOT want this vector to be automatically managed (say, if it is field of some object) Error handling: * on failure calls ae_break() with NULL state pointer. Usually it results in abort() call. dst is assumed to be uninitialized, its fields are ignored. 
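EXAMPLE (illustrative sketch, not part of ALGLIB; the helper name demo_matrix_attach is hypothetical). It exports an ae_matrix to x-format with ae_x_attach_to_matrix() and attaches a second ae_matrix header to the same x-structure, so writes through the attached header are visible in the original. An 8-column real matrix is used because ae_matrix_attach_to_x() requires a densely packed source (stride equal to cols), which holds here since 8*sizeof(double) is a multiple of AE_DATA_ALIGN.

    static void demo_matrix_attach(ae_state *_state)
    {
        ae_frame _frame_block;
        ae_matrix a, view;
        x_matrix xm;
        ae_frame_make(_state, &_frame_block);
        ae_matrix_init(&a, 2, 8, DT_REAL, _state);   // stride==cols==8 for DT_REAL
        a.ptr.pp_double[0][0] = 1.0;
        xm.owner = OWN_CALLER;                       // not owned by ALGLIB
        xm.ptr = NULL;
        ae_x_attach_to_matrix(&xm, &a);              // xm points at a's storage
        ae_matrix_attach_to_x(&view, &xm, _state);   // view shares memory with a
        view.ptr.pp_double[0][0] = 7.0;
        ae_assert(a.ptr.pp_double[0][0]==7.0, "demo_matrix_attach: not shared", _state);
        ae_frame_leave(_state);
    }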
************************************************************************/ void ae_matrix_attach_to_x(ae_matrix *dst, x_matrix *src, ae_state *state) { ae_int_t rows, cols; rows = (ae_int_t)src->rows; cols = (ae_int_t)src->cols; /* check that X-source is densely packed */ ae_assert(src->cols==src->stride, "ae_matrix_attach_to_x(): unsupported stride", NULL); /* ensure that size is correct */ ae_assert(rows==src->rows, "ae_matrix_attach_to_x(): 32/64 overflow", NULL); ae_assert(cols==src->cols, "ae_matrix_attach_to_x(): 32/64 overflow", NULL); ae_assert(rows>=0 && cols>=0, "ae_matrix_attach_to_x(): negative length", NULL); /* if one of rows/cols is zero, another MUST be too */ if( rows==0 || cols==0 ) { rows = 0; cols = 0; } /* init */ dst->is_attached = ae_true; dst->rows = rows; dst->cols = cols; dst->stride = cols; dst->datatype = (ae_datatype)src->datatype; dst->ptr.pp_void = NULL; ae_assert( ae_db_malloc(&dst->data, dst->rows*(ae_int_t)sizeof(void*), state, state!=NULL), "ae_matrix_attach_to_x(): malloc error", NULL); if( dst->rows>0 && dst->cols>0 ) { ae_int_t i, rowsize; char *p_row; void **pp_ptr; p_row = (char*)src->ptr; rowsize = dst->stride*ae_sizeof(dst->datatype); pp_ptr = (void**)dst->data.ptr; dst->ptr.pp_void = pp_ptr; for(i=0; irows; i++, p_row+=rowsize) pp_ptr[i] = p_row; } } /************************************************************************ This function changes length of ae_matrix. dst destination matrix rows size, may be zero cols size, may be zero state ALGLIB environment state Error handling: * if state is NULL, returns ae_false on allocation error * if state is not NULL, calls ae_break() on allocation error * returns ae_true on success NOTES: * matrix must be initialized * all contents is destroyed during setlength() call * new size may be zero. ************************************************************************/ ae_bool ae_matrix_set_length(ae_matrix *dst, ae_int_t rows, ae_int_t cols, ae_state *state) { /* ensure that size is >=0 two ways to exit: 1) through ae_assert, if we have non-NULL state, 2) by returning ae_false */ if( state!=NULL ) ae_assert(rows>=0 && cols>=0, "ae_matrix_set_length(): negative length", state); if( rows<0 || cols<0 ) return ae_false; if( dst->rows==rows && dst->cols==cols ) return ae_true; dst->rows = rows; dst->cols = cols; dst->stride = cols; while( dst->stride*ae_sizeof(dst->datatype)%AE_DATA_ALIGN!=0 ) dst->stride++; if( !ae_db_realloc(&dst->data, dst->rows*((ae_int_t)sizeof(void*)+dst->stride*ae_sizeof(dst->datatype))+AE_DATA_ALIGN-1, state) ) return ae_false; ae_matrix_update_row_pointers(dst, ae_align((char*)dst->data.ptr+dst->rows*sizeof(void*),AE_DATA_ALIGN)); return ae_true; } /************************************************************************ This function provides "CLEAR" functionality for vector (contents is cleared, but structure still left in valid state). The function clears matrix contents (releases all dynamically allocated memory). Matrix may be in automatic management list - in this case it will NOT be removed from list. IMPORTANT: this function does NOT invalidates dst; it just releases all dynamically allocated storage, but dst still may be used after call to ae_matrix_set_length(). 
dst destination matrix ************************************************************************/ void ae_matrix_clear(ae_matrix *dst) { dst->rows = 0; dst->cols = 0; dst->stride = 0; ae_db_free(&dst->data); dst->ptr.p_ptr = 0; dst->is_attached = ae_false; } /************************************************************************ This function provides "DESTROY" functionality for matrix (contents is cleared, but structure still left in valid state). For matrices it is same as CLEAR. dst destination matrix ************************************************************************/ void ae_matrix_destroy(ae_matrix *dst) { ae_matrix_clear(dst); } /************************************************************************ This function efficiently swaps contents of two vectors, leaving other pararemeters (automatic management, etc.) unchanged. ************************************************************************/ void ae_swap_matrices(ae_matrix *mat1, ae_matrix *mat2) { ae_int_t rows; ae_int_t cols; ae_int_t stride; ae_datatype datatype; void *p_ptr; ae_assert(!mat1->is_attached, "ALGLIB: internal error, attempt to swap matrices attached to X-object", NULL); ae_assert(!mat2->is_attached, "ALGLIB: internal error, attempt to swap matrices attached to X-object", NULL); ae_db_swap(&mat1->data, &mat2->data); rows = mat1->rows; cols = mat1->cols; stride = mat1->stride; datatype = mat1->datatype; p_ptr = mat1->ptr.p_ptr; mat1->rows = mat2->rows; mat1->cols = mat2->cols; mat1->stride = mat2->stride; mat1->datatype = mat2->datatype; mat1->ptr.p_ptr = mat2->ptr.p_ptr; mat2->rows = rows; mat2->cols = cols; mat2->stride = stride; mat2->datatype = datatype; mat2->ptr.p_ptr = p_ptr; } /************************************************************************ This function creates smart pointer structure. dst destination smart pointer. already allocated, but not initialized. subscriber pointer to pointer which receives updates in the internal object stored in ae_smart_ptr. Any update to dst->ptr is translated to subscriber. Can be NULL. state this parameter can be: * pointer to current instance of ae_state, if you want to automatically destroy this object after leaving current frame * NULL, if you do NOT want this vector to be automatically managed (say, if it is field of some object) Error handling: * on failure calls ae_break() with NULL state pointer. Usually it results in abort() call. After initialization, smart pointer stores NULL pointer. ************************************************************************/ void ae_smart_ptr_init(ae_smart_ptr *dst, void **subscriber, ae_state *state) { dst->subscriber = subscriber; dst->ptr = NULL; if( dst->subscriber!=NULL ) *(dst->subscriber) = dst->ptr; dst->is_owner = ae_false; dst->is_dynamic = ae_false; dst->frame_entry.deallocator = ae_smart_ptr_destroy; dst->frame_entry.ptr = dst; if( state!=NULL ) ae_db_attach(&dst->frame_entry, state); } /************************************************************************ This function clears smart pointer structure. dst destination smart pointer. After call to this function smart pointer contains NULL reference, which is propagated to its subscriber (in cases non-NULL subscruber was specified during pointer creation). 
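EXAMPLE (illustrative sketch, not part of ALGLIB; the helper names demo_destroy_vector and demo_smart_ptr are hypothetical). It shows the ownership and subscriber mechanics: a dynamically allocated ae_vector is handed to the smart pointer with is_owner=is_dynamic=ae_true, the subscriber variable tracks the stored pointer, and leaving the frame clears the smart pointer, which runs the destructor and then frees the object itself.

    static void demo_destroy_vector(void *p)
    {
        ae_vector_destroy((ae_vector*)p);        // releases vector storage only;
    }                                            // ae_free() is done by the smart pointer

    static void demo_smart_ptr(ae_state *_state)
    {
        ae_frame _frame_block;
        ae_smart_ptr sp;
        ae_vector *subscriber = NULL;
        ae_vector *v;
        ae_frame_make(_state, &_frame_block);
        ae_smart_ptr_init(&sp, (void**)&subscriber, _state);   // frame-managed
        v = (ae_vector*)ae_malloc(sizeof(ae_vector), _state);
        ae_vector_init(v, 4, DT_REAL, NULL);                    // owned by sp below
        ae_smart_ptr_assign(&sp, v, ae_true, ae_true, demo_destroy_vector);
        ae_assert(subscriber==v, "demo_smart_ptr: subscriber not updated", _state);
        ae_frame_leave(_state);   // sp cleared: destructor runs, memory freed, subscriber=NULL
    }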
************************************************************************/ void ae_smart_ptr_clear(void *_dst) { ae_smart_ptr *dst = (ae_smart_ptr*)_dst; if( dst->is_owner && dst->ptr!=NULL ) { dst->destroy(dst->ptr); if( dst->is_dynamic ) ae_free(dst->ptr); } dst->is_owner = ae_false; dst->is_dynamic = ae_false; dst->ptr = NULL; dst->destroy = NULL; if( dst->subscriber!=NULL ) *(dst->subscriber) = NULL; } /************************************************************************ This function dstroys smart pointer structure (same as clearing it). dst destination smart pointer. ************************************************************************/ void ae_smart_ptr_destroy(void *_dst) { ae_smart_ptr_clear(_dst); } /************************************************************************ This function assigns pointer to ae_smart_ptr structure. dst destination smart pointer. new_ptr new pointer to assign is_owner whether smart pointer owns new_ptr is_dynamic whether object is dynamic - clearing such object requires BOTH calling destructor function AND calling ae_free() for memory occupied by object. destroy destructor function In case smart pointer already contains non-NULL value and owns this value, it is freed before assigning new pointer. Changes in pointer are propagated to its subscriber (in case non-NULL subscriber was specified during pointer creation). You can specify NULL new_ptr, in which case is_owner/destroy are ignored. ************************************************************************/ void ae_smart_ptr_assign(ae_smart_ptr *dst, void *new_ptr, ae_bool is_owner, ae_bool is_dynamic, void (*destroy)(void*)) { if( dst->is_owner && dst->ptr!=NULL ) dst->destroy(dst->ptr); if( new_ptr!=NULL ) { dst->ptr = new_ptr; dst->is_owner = is_owner; dst->is_dynamic = is_dynamic; dst->destroy = destroy; } else { dst->ptr = NULL; dst->is_owner = ae_false; dst->is_dynamic = ae_false; dst->destroy = NULL; } if( dst->subscriber!=NULL ) *(dst->subscriber) = dst->ptr; } /************************************************************************ This function releases pointer owned by ae_smart_ptr structure: * all internal fields are set to NULL * destructor function for internal pointer is NOT called even when we own this pointer. After this call ae_smart_ptr releases ownership of its pointer and passes it to caller. * changes in pointer are propagated to its subscriber (in case non-NULL subscriber was specified during pointer creation). dst destination smart pointer. ************************************************************************/ void ae_smart_ptr_release(ae_smart_ptr *dst) { dst->is_owner = ae_false; dst->is_dynamic = ae_false; dst->ptr = NULL; dst->destroy = NULL; if( dst->subscriber!=NULL ) *(dst->subscriber) = NULL; } /************************************************************************ This function copies contents of ae_vector (SRC) to x_vector (DST). This function should not be called for DST which is attached to SRC (opposite situation, when SRC is attached to DST, is possible). Depending on situation, following actions are performed * for SRC attached to DST, this function performs no actions (no need to do anything) * for independent vectors of different sizes it allocates storage in DST and copy contents of SRC to DST. DST->last_action field is set to ACT_NEW_LOCATION, and DST->owner is set to OWN_AE. * for independent vectors of same sizes it does not perform memory (re)allocation. It just copies SRC to already existing place. 
DST->last_action is set to ACT_SAME_LOCATION (unless it was ACT_NEW_LOCATION), DST->owner is unmodified. dst destination vector src source, vector in x-format state ALGLIB environment state NOTES: * dst is assumed to be initialized. Its contents is freed before copying data from src (if size / type are different) or overwritten (if possible given destination size). ************************************************************************/ void ae_x_set_vector(x_vector *dst, ae_vector *src, ae_state *state) { if( src->ptr.p_ptr == dst->ptr ) { /* src->ptr points to the beginning of dst, attached matrices, no need to copy */ return; } if( dst->cnt!=src->cnt || dst->datatype!=src->datatype ) { if( dst->owner==OWN_AE ) ae_free(dst->ptr); dst->ptr = ae_malloc((size_t)(src->cnt*ae_sizeof(src->datatype)), state); if( src->cnt!=0 && dst->ptr==NULL ) ae_break(state, ERR_OUT_OF_MEMORY, "ae_malloc(): out of memory"); dst->last_action = ACT_NEW_LOCATION; dst->cnt = src->cnt; dst->datatype = src->datatype; dst->owner = OWN_AE; } else { if( dst->last_action==ACT_UNCHANGED ) dst->last_action = ACT_SAME_LOCATION; else if( dst->last_action==ACT_SAME_LOCATION ) dst->last_action = ACT_SAME_LOCATION; else if( dst->last_action==ACT_NEW_LOCATION ) dst->last_action = ACT_NEW_LOCATION; else ae_assert(ae_false, "ALGLIB: internal error in ae_x_set_vector()", state); } if( src->cnt ) memcpy(dst->ptr, src->ptr.p_ptr, (size_t)(src->cnt*ae_sizeof(src->datatype))); } /************************************************************************ This function copies contents of ae_matrix to x_matrix. This function should not be called for DST which is attached to SRC (opposite situation, when SRC is attached to DST, is possible). Depending on situation, following actions are performed * for SRC attached to DST, this function performs no actions (no need to do anything) * for independent matrices of different sizes it allocates storage in DST and copy contents of SRC to DST. DST->last_action field is set to ACT_NEW_LOCATION, and DST->owner is set to OWN_AE. * for independent matrices of same sizes it does not perform memory (re)allocation. It just copies SRC to already existing place. DST->last_action is set to ACT_SAME_LOCATION (unless it was ACT_NEW_LOCATION), DST->owner is unmodified. dst destination vector src source, matrix in x-format state ALGLIB environment state NOTES: * dst is assumed to be initialized. Its contents is freed before copying data from src (if size / type are different) or overwritten (if possible given destination size). 
************************************************************************/ void ae_x_set_matrix(x_matrix *dst, ae_matrix *src, ae_state *state) { char *p_src_row; char *p_dst_row; ae_int_t i; ae_int_t row_size; if( src->ptr.pp_void!=NULL && src->ptr.pp_void[0] == dst->ptr ) { /* src->ptr points to the beginning of dst, attached matrices, no need to copy */ return; } if( dst->rows!=src->rows || dst->cols!=src->cols || dst->datatype!=src->datatype ) { if( dst->owner==OWN_AE ) ae_free(dst->ptr); dst->rows = src->rows; dst->cols = src->cols; dst->stride = src->cols; dst->datatype = src->datatype; dst->ptr = ae_malloc((size_t)(dst->rows*((ae_int_t)dst->stride)*ae_sizeof(src->datatype)), state); if( dst->rows!=0 && dst->stride!=0 && dst->ptr==NULL ) ae_break(state, ERR_OUT_OF_MEMORY, "ae_malloc(): out of memory"); dst->last_action = ACT_NEW_LOCATION; dst->owner = OWN_AE; } else { if( dst->last_action==ACT_UNCHANGED ) dst->last_action = ACT_SAME_LOCATION; else if( dst->last_action==ACT_SAME_LOCATION ) dst->last_action = ACT_SAME_LOCATION; else if( dst->last_action==ACT_NEW_LOCATION ) dst->last_action = ACT_NEW_LOCATION; else ae_assert(ae_false, "ALGLIB: internal error in ae_x_set_vector()", state); } if( src->rows!=0 && src->cols!=0 ) { p_src_row = (char*)(src->ptr.pp_void[0]); p_dst_row = (char*)dst->ptr; row_size = ae_sizeof(src->datatype)*src->cols; for(i=0; irows; i++, p_src_row+=src->stride*ae_sizeof(src->datatype), p_dst_row+=dst->stride*ae_sizeof(src->datatype)) memcpy(p_dst_row, p_src_row, (size_t)(row_size)); } } /************************************************************************ This function attaches x_vector to ae_vector's contents. Ownership of memory allocated is not changed (it is still managed by ae_matrix). dst destination vector src source, vector in x-format state ALGLIB environment state NOTES: * dst is assumed to be initialized. Its contents is freed before attaching to src. * this function doesn't need ae_state parameter because it can't fail (assuming correctly initialized src) ************************************************************************/ void ae_x_attach_to_vector(x_vector *dst, ae_vector *src) { if( dst->owner==OWN_AE ) ae_free(dst->ptr); dst->ptr = src->ptr.p_ptr; dst->last_action = ACT_NEW_LOCATION; dst->cnt = src->cnt; dst->datatype = src->datatype; dst->owner = OWN_CALLER; } /************************************************************************ This function attaches x_matrix to ae_matrix's contents. Ownership of memory allocated is not changed (it is still managed by ae_matrix). dst destination vector src source, matrix in x-format state ALGLIB environment state NOTES: * dst is assumed to be initialized. Its contents is freed before attaching to src. * this function doesn't need ae_state parameter because it can't fail (assuming correctly initialized src) ************************************************************************/ void ae_x_attach_to_matrix(x_matrix *dst, ae_matrix *src) { if( dst->owner==OWN_AE ) ae_free(dst->ptr); dst->rows = src->rows; dst->cols = src->cols; dst->stride = src->stride; dst->datatype = src->datatype; dst->ptr = &(src->ptr.pp_double[0][0]); dst->last_action = ACT_NEW_LOCATION; dst->owner = OWN_CALLER; } /************************************************************************ This function clears x_vector. It does nothing if vector is not owned by ALGLIB environment. 
dst vector ************************************************************************/ void x_vector_clear(x_vector *dst) { if( dst->owner==OWN_AE ) aligned_free(dst->ptr); dst->ptr = NULL; dst->cnt = 0; } /************************************************************************ Assertion For non-NULL state it allows to gracefully leave ALGLIB session, removing all frames and deallocating registered dynamic data structure. For NULL state it just abort()'s program. IMPORTANT: this function ALWAYS evaluates its argument. It can not be replaced by macro which does nothing. So, you may place actual function calls at cond, and these will always be performed. ************************************************************************/ void ae_assert(ae_bool cond, const char *msg, ae_state *state) { if( !cond ) ae_break(state, ERR_ASSERTION_FAILED, msg); } /************************************************************************ CPUID Returns information about features CPU and compiler support. You must tell ALGLIB what CPU family is used by defining AE_CPU symbol (without this hint zero will be returned). Note: results of this function depend on both CPU and compiler; if compiler doesn't support SSE intrinsics, function won't set corresponding flag. ************************************************************************/ static volatile ae_bool _ae_cpuid_initialized = ae_false; static volatile ae_bool _ae_cpuid_has_sse2 = ae_false; ae_int_t ae_cpuid() { /* * to speed up CPU detection we cache results from previous attempts * there is no synchronization, but it is still thread safe. * * thread safety is guaranteed on all modern architectures which * have following property: simultaneous writes by different cores * to the same location will be executed in serial manner. 
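*
* (editorial note, not part of the original sources) the returned value is
* a bit mask; a typical caller tests individual bits, e.g.
*
*     if( ae_cpuid()&CPU_SSE2 )
*     {
*         ... dispatch to the SSE2 code path ...
*     }
*
* the write to _ae_cpuid_has_sse2 is idempotent (the flag is only ever set
* to ae_true), so concurrent first calls may repeat the detection work but
* always publish the same value.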
* */
    ae_int_t result;
    /* * if not initialized, determine system properties */
    if( !_ae_cpuid_initialized )
    {
        /* * SSE2 */
#if defined(AE_CPU)
#if (AE_CPU==AE_INTEL) && defined(AE_HAS_SSE2_INTRINSICS)
#if AE_COMPILER==AE_MSVC
        { int CPUInfo[4]; __cpuid(CPUInfo, 1); if( (CPUInfo[3]&0x04000000)!=0 ) _ae_cpuid_has_sse2 = ae_true; }
#elif AE_COMPILER==AE_GNUC
        { ae_int_t a,b,c,d; __asm__ __volatile__ ("cpuid": "=a" (a), "=b" (b), "=c" (c), "=d" (d) : "a" (1)); if( (d&0x04000000)!=0 ) _ae_cpuid_has_sse2 = ae_true; }
#elif AE_COMPILER==AE_SUNC
        { ae_int_t a,b,c,d; __asm__ __volatile__ ("cpuid": "=a" (a), "=b" (b), "=c" (c), "=d" (d) : "a" (1)); if( (d&0x04000000)!=0 ) _ae_cpuid_has_sse2 = ae_true; }
#else
#endif
#endif
#endif
        /* * Perform one more CPUID call to generate memory fence */
#if AE_CPU==AE_INTEL
#if AE_COMPILER==AE_MSVC
        { int CPUInfo[4]; __cpuid(CPUInfo, 1); }
#elif AE_COMPILER==AE_GNUC
        { ae_int_t a,b,c,d; __asm__ __volatile__ ("cpuid": "=a" (a), "=b" (b), "=c" (c), "=d" (d) : "a" (1)); }
#elif AE_COMPILER==AE_SUNC
        { ae_int_t a,b,c,d; __asm__ __volatile__ ("cpuid": "=a" (a), "=b" (b), "=c" (c), "=d" (d) : "a" (1)); }
#else
#endif
#endif
        /* * set initialization flag */
        _ae_cpuid_initialized = ae_true;
    }
    /* * return */
    result = 0;
    if( _ae_cpuid_has_sse2 )
        result = result|CPU_SSE2;
    return result;
}
/************************************************************************ Real math functions ************************************************************************/
ae_bool ae_fp_eq(double v1, double v2) { /* IEEE-strict floating point comparison */ volatile double x = v1; volatile double y = v2; return x==y; }
ae_bool ae_fp_neq(double v1, double v2) { /* IEEE-strict floating point comparison */ return !ae_fp_eq(v1,v2); }
ae_bool ae_fp_less(double v1, double v2) { /* IEEE-strict floating point comparison */ volatile double x = v1; volatile double y = v2; return x<y; }
ae_bool ae_fp_less_eq(double v1, double v2) { /* IEEE-strict floating point comparison */ volatile double x = v1; volatile double y = v2; return x<=y; }
ae_bool ae_fp_greater(double v1, double v2) { /* IEEE-strict floating point comparison */ volatile double x = v1; volatile double y = v2; return x>y; }
ae_bool ae_fp_greater_eq(double v1, double v2) { /* IEEE-strict floating point comparison */ volatile double x = v1; volatile double y = v2; return x>=y; }
ae_bool ae_isfinite_stateless(double x, ae_int_t endianness) { union _u { double a; ae_int32_t p[2]; } u; ae_int32_t high; u.a = x; if( endianness==AE_LITTLE_ENDIAN ) high = u.p[1]; else high = u.p[0]; return (high & (ae_int32_t)0x7FF00000)!=(ae_int32_t)0x7FF00000; }
ae_bool ae_isnan_stateless(double x, ae_int_t endianness) { union _u { double a; ae_int32_t p[2]; } u; ae_int32_t high, low; u.a = x; if( endianness==AE_LITTLE_ENDIAN ) { high = u.p[1]; low = u.p[0]; } else { high = u.p[0]; low = u.p[1]; } return ((high &0x7FF00000)==0x7FF00000) && (((high &0x000FFFFF)!=0) || (low!=0)); }
ae_bool ae_isinf_stateless(double x, ae_int_t endianness) { union _u { double a; ae_int32_t p[2]; } u; ae_int32_t high, low; u.a = x; if( endianness==AE_LITTLE_ENDIAN ) { high = u.p[1]; low = u.p[0]; } else { high = u.p[0]; low = u.p[1]; } /* 31 least significant bits of high are compared */ return ((high&0x7FFFFFFF)==0x7FF00000) && (low==0); }
ae_bool ae_isposinf_stateless(double x, ae_int_t endianness) { union _u { double a; ae_int32_t p[2]; } u; ae_int32_t high, low; u.a = x; if( endianness==AE_LITTLE_ENDIAN ) { high = u.p[1]; low = u.p[0]; } else { high = u.p[0]; low = u.p[1]; } /* all 32 bits of high are compared */ return (high==(ae_int32_t)0x7FF00000) && (low==0); }
ae_bool ae_isneginf_stateless(double x, ae_int_t endianness) { union _u { double a; ae_int32_t p[2]; } u; ae_int32_t high, low; u.a = x; if( endianness==AE_LITTLE_ENDIAN ) { high = u.p[1]; low = u.p[0]; } else { high = u.p[0];
low = u.p[1]; } /* this code is a bit tricky to avoid comparison of high with 0xFFF00000, which may be unsafe with some buggy compilers */ return ((high&0x7FFFFFFF)==0x7FF00000) && (high!=(ae_int32_t)0x7FF00000) && (low==0); } ae_int_t ae_get_endianness() { union { double a; ae_int32_t p[2]; } u; /* * determine endianness * two types are supported: big-endian and little-endian. * mixed-endian hardware is NOT supported. * * 1983 is used as magic number because its non-periodic double * representation allow us to easily distinguish between upper * and lower halfs and to detect mixed endian hardware. * */ u.a = 1.0/1983.0; if( u.p[1]==(ae_int32_t)0x3f408642 ) return AE_LITTLE_ENDIAN; if( u.p[0]==(ae_int32_t)0x3f408642 ) return AE_BIG_ENDIAN; return AE_MIXED_ENDIAN; } ae_bool ae_isfinite(double x,ae_state *state) { return ae_isfinite_stateless(x, state->endianness); } ae_bool ae_isnan(double x, ae_state *state) { return ae_isnan_stateless(x, state->endianness); } ae_bool ae_isinf(double x, ae_state *state) { return ae_isinf_stateless(x, state->endianness); } ae_bool ae_isposinf(double x,ae_state *state) { return ae_isposinf_stateless(x, state->endianness); } ae_bool ae_isneginf(double x,ae_state *state) { return ae_isneginf_stateless(x, state->endianness); } double ae_fabs(double x, ae_state *state) { return fabs(x); } ae_int_t ae_iabs(ae_int_t x, ae_state *state) { return x>=0 ? x : -x; } double ae_sqr(double x, ae_state *state) { return x*x; } double ae_sqrt(double x, ae_state *state) { return sqrt(x); } ae_int_t ae_sign(double x, ae_state *state) { if( x>0 ) return 1; if( x<0 ) return -1; return 0; } ae_int_t ae_round(double x, ae_state *state) { return (ae_int_t)(ae_ifloor(x+0.5,state)); } ae_int_t ae_trunc(double x, ae_state *state) { return (ae_int_t)(x>0 ? ae_ifloor(x,state) : ae_iceil(x,state)); } ae_int_t ae_ifloor(double x, ae_state *state) { return (ae_int_t)(floor(x)); } ae_int_t ae_iceil(double x, ae_state *state) { return (ae_int_t)(ceil(x)); } ae_int_t ae_maxint(ae_int_t m1, ae_int_t m2, ae_state *state) { return m1>m2 ? m1 : m2; } ae_int_t ae_minint(ae_int_t m1, ae_int_t m2, ae_state *state) { return m1>m2 ? m2 : m1; } double ae_maxreal(double m1, double m2, ae_state *state) { return m1>m2 ? m1 : m2; } double ae_minreal(double m1, double m2, ae_state *state) { return m1>m2 ? 
m2 : m1; } double ae_randomreal(ae_state *state) { int i1 = rand(); int i2 = rand(); double mx = (double)(RAND_MAX)+1.0; volatile double tmp0 = i2/mx; volatile double tmp1 = i1+tmp0; return tmp1/mx; } ae_int_t ae_randominteger(ae_int_t maxv, ae_state *state) { return rand()%maxv; } double ae_sin(double x, ae_state *state) { return sin(x); } double ae_cos(double x, ae_state *state) { return cos(x); } double ae_tan(double x, ae_state *state) { return tan(x); } double ae_sinh(double x, ae_state *state) { return sinh(x); } double ae_cosh(double x, ae_state *state) { return cosh(x); } double ae_tanh(double x, ae_state *state) { return tanh(x); } double ae_asin(double x, ae_state *state) { return asin(x); } double ae_acos(double x, ae_state *state) { return acos(x); } double ae_atan(double x, ae_state *state) { return atan(x); } double ae_atan2(double y, double x, ae_state *state) { return atan2(y,x); } double ae_log(double x, ae_state *state) { return log(x); } double ae_pow(double x, double y, ae_state *state) { return pow(x,y); } double ae_exp(double x, ae_state *state) { return exp(x); } /************************************************************************ Symmetric/Hermitian properties: check and force ************************************************************************/ static void x_split_length(ae_int_t n, ae_int_t nb, ae_int_t* n1, ae_int_t* n2) { ae_int_t r; if( n<=nb ) { *n1 = n; *n2 = 0; } else { if( n%nb!=0 ) { *n2 = n%nb; *n1 = n-(*n2); } else { *n2 = n/2; *n1 = n-(*n2); if( *n1%nb==0 ) { return; } r = nb-*n1%nb; *n1 = *n1+r; *n2 = *n2-r; } } } static double x_safepythag2(double x, double y) { double w; double xabs; double yabs; double z; xabs = fabs(x); yabs = fabs(y); w = xabs>yabs ? xabs : yabs; z = xabsx_nb || len1>x_nb ) { ae_int_t n1, n2; if( len0>len1 ) { x_split_length(len0, x_nb, &n1, &n2); is_symmetric_rec_off_stat(a, offset0, offset1, n1, len1, nonfinite, mx, err, _state); is_symmetric_rec_off_stat(a, offset0+n1, offset1, n2, len1, nonfinite, mx, err, _state); } else { x_split_length(len1, x_nb, &n1, &n2); is_symmetric_rec_off_stat(a, offset0, offset1, len0, n1, nonfinite, mx, err, _state); is_symmetric_rec_off_stat(a, offset0, offset1+n1, len0, n2, nonfinite, mx, err, _state); } return; } else { /* base case */ double *p1, *p2, *prow, *pcol; double v; ae_int_t i, j; p1 = (double*)(a->ptr)+offset0*a->stride+offset1; p2 = (double*)(a->ptr)+offset1*a->stride+offset0; for(i=0; istride; for(j=0; jv ? *mx : v; v = fabs(*prow); *mx = *mx>v ? *mx : v; v = fabs(*pcol-*prow); *err = *err>v ? *err : v; } pcol += a->stride; prow++; } } } } /* * this function checks that diagonal block A0 is symmetric. * Block A0 is specified by its offset and size. * * [ . ] * [ A0 ] * A = [ . ] * [ . 
] * * this subroutine updates current values of: * a) mx maximum value of A[i,j] found so far * b) err componentwise difference between A0 and A0^T * */ static void is_symmetric_rec_diag_stat(x_matrix *a, ae_int_t offset, ae_int_t len, ae_bool *nonfinite, double *mx, double *err, ae_state *_state) { double *p, *prow, *pcol; double v; ae_int_t i, j; /* try to split problem into two smaller ones */ if( len>x_nb ) { ae_int_t n1, n2; x_split_length(len, x_nb, &n1, &n2); is_symmetric_rec_diag_stat(a, offset, n1, nonfinite, mx, err, _state); is_symmetric_rec_diag_stat(a, offset+n1, n2, nonfinite, mx, err, _state); is_symmetric_rec_off_stat(a, offset+n1, offset, n2, n1, nonfinite, mx, err, _state); return; } /* base case */ p = (double*)(a->ptr)+offset*a->stride+offset; for(i=0; istride; for(j=0; jstride,prow++) { if( !ae_isfinite(*pcol,_state) || !ae_isfinite(*prow,_state) ) { *nonfinite = ae_true; } else { v = fabs(*pcol); *mx = *mx>v ? *mx : v; v = fabs(*prow); *mx = *mx>v ? *mx : v; v = fabs(*pcol-*prow); *err = *err>v ? *err : v; } } v = fabs(p[i+i*a->stride]); *mx = *mx>v ? *mx : v; } } /* * this function checks difference between offdiagonal blocks BL and BU * (see below). Block BL is specified by offsets (offset0,offset1) and * sizes (len0,len1). * * [ . ] * [ A0 BU ] * A = [ BL A1 ] * [ . ] * * this subroutine updates current values of: * a) mx maximum value of A[i,j] found so far * b) err componentwise difference between elements of BL and BU^H * */ static void is_hermitian_rec_off_stat(x_matrix *a, ae_int_t offset0, ae_int_t offset1, ae_int_t len0, ae_int_t len1, ae_bool *nonfinite, double *mx, double *err, ae_state *_state) { /* try to split problem into two smaller ones */ if( len0>x_nb || len1>x_nb ) { ae_int_t n1, n2; if( len0>len1 ) { x_split_length(len0, x_nb, &n1, &n2); is_hermitian_rec_off_stat(a, offset0, offset1, n1, len1, nonfinite, mx, err, _state); is_hermitian_rec_off_stat(a, offset0+n1, offset1, n2, len1, nonfinite, mx, err, _state); } else { x_split_length(len1, x_nb, &n1, &n2); is_hermitian_rec_off_stat(a, offset0, offset1, len0, n1, nonfinite, mx, err, _state); is_hermitian_rec_off_stat(a, offset0, offset1+n1, len0, n2, nonfinite, mx, err, _state); } return; } else { /* base case */ ae_complex *p1, *p2, *prow, *pcol; double v; ae_int_t i, j; p1 = (ae_complex*)(a->ptr)+offset0*a->stride+offset1; p2 = (ae_complex*)(a->ptr)+offset1*a->stride+offset0; for(i=0; istride; for(j=0; jx, _state) || !ae_isfinite(pcol->y, _state) || !ae_isfinite(prow->x, _state) || !ae_isfinite(prow->y, _state) ) { *nonfinite = ae_true; } else { v = x_safepythag2(pcol->x, pcol->y); *mx = *mx>v ? *mx : v; v = x_safepythag2(prow->x, prow->y); *mx = *mx>v ? *mx : v; v = x_safepythag2(pcol->x-prow->x, pcol->y+prow->y); *err = *err>v ? *err : v; } pcol += a->stride; prow++; } } } } /* * this function checks that diagonal block A0 is Hermitian. * Block A0 is specified by its offset and size. * * [ . ] * [ A0 ] * A = [ . ] * [ . 
] * * this subroutine updates current values of: * a) mx maximum value of A[i,j] found so far * b) err componentwise difference between A0 and A0^H * */ static void is_hermitian_rec_diag_stat(x_matrix *a, ae_int_t offset, ae_int_t len, ae_bool *nonfinite, double *mx, double *err, ae_state *_state) { ae_complex *p, *prow, *pcol; double v; ae_int_t i, j; /* try to split problem into two smaller ones */ if( len>x_nb ) { ae_int_t n1, n2; x_split_length(len, x_nb, &n1, &n2); is_hermitian_rec_diag_stat(a, offset, n1, nonfinite, mx, err, _state); is_hermitian_rec_diag_stat(a, offset+n1, n2, nonfinite, mx, err, _state); is_hermitian_rec_off_stat(a, offset+n1, offset, n2, n1, nonfinite, mx, err, _state); return; } /* base case */ p = (ae_complex*)(a->ptr)+offset*a->stride+offset; for(i=0; istride; for(j=0; jstride,prow++) { if( !ae_isfinite(pcol->x, _state) || !ae_isfinite(pcol->y, _state) || !ae_isfinite(prow->x, _state) || !ae_isfinite(prow->y, _state) ) { *nonfinite = ae_true; } else { v = x_safepythag2(pcol->x, pcol->y); *mx = *mx>v ? *mx : v; v = x_safepythag2(prow->x, prow->y); *mx = *mx>v ? *mx : v; v = x_safepythag2(pcol->x-prow->x, pcol->y+prow->y); *err = *err>v ? *err : v; } } if( !ae_isfinite(p[i+i*a->stride].x, _state) || !ae_isfinite(p[i+i*a->stride].y, _state) ) { *nonfinite = ae_true; } else { v = fabs(p[i+i*a->stride].x); *mx = *mx>v ? *mx : v; v = fabs(p[i+i*a->stride].y); *err = *err>v ? *err : v; } } } /* * this function copies offdiagonal block BL to its symmetric counterpart * BU (see below). Block BL is specified by offsets (offset0,offset1) * and sizes (len0,len1). * * [ . ] * [ A0 BU ] * A = [ BL A1 ] * [ . ] * */ static void force_symmetric_rec_off_stat(x_matrix *a, ae_int_t offset0, ae_int_t offset1, ae_int_t len0, ae_int_t len1) { /* try to split problem into two smaller ones */ if( len0>x_nb || len1>x_nb ) { ae_int_t n1, n2; if( len0>len1 ) { x_split_length(len0, x_nb, &n1, &n2); force_symmetric_rec_off_stat(a, offset0, offset1, n1, len1); force_symmetric_rec_off_stat(a, offset0+n1, offset1, n2, len1); } else { x_split_length(len1, x_nb, &n1, &n2); force_symmetric_rec_off_stat(a, offset0, offset1, len0, n1); force_symmetric_rec_off_stat(a, offset0, offset1+n1, len0, n2); } return; } else { /* base case */ double *p1, *p2, *prow, *pcol; ae_int_t i, j; p1 = (double*)(a->ptr)+offset0*a->stride+offset1; p2 = (double*)(a->ptr)+offset1*a->stride+offset0; for(i=0; istride; for(j=0; jstride; prow++; } } } } /* * this function copies lower part of diagonal block A0 to its upper part * Block is specified by offset and size. * * [ . ] * [ A0 ] * A = [ . ] * [ . ] * */ static void force_symmetric_rec_diag_stat(x_matrix *a, ae_int_t offset, ae_int_t len) { double *p, *prow, *pcol; ae_int_t i, j; /* try to split problem into two smaller ones */ if( len>x_nb ) { ae_int_t n1, n2; x_split_length(len, x_nb, &n1, &n2); force_symmetric_rec_diag_stat(a, offset, n1); force_symmetric_rec_diag_stat(a, offset+n1, n2); force_symmetric_rec_off_stat(a, offset+n1, offset, n2, n1); return; } /* base case */ p = (double*)(a->ptr)+offset*a->stride+offset; for(i=0; istride; for(j=0; jstride,prow++) *pcol = *prow; } } /* * this function copies Hermitian transpose of offdiagonal block BL to * its symmetric counterpart BU (see below). Block BL is specified by * offsets (offset0,offset1) and sizes (len0,len1). * * [ . ] * [ A0 BU ] * A = [ BL A1 ] * [ . 
] */ static void force_hermitian_rec_off_stat(x_matrix *a, ae_int_t offset0, ae_int_t offset1, ae_int_t len0, ae_int_t len1) { /* try to split problem into two smaller ones */ if( len0>x_nb || len1>x_nb ) { ae_int_t n1, n2; if( len0>len1 ) { x_split_length(len0, x_nb, &n1, &n2); force_hermitian_rec_off_stat(a, offset0, offset1, n1, len1); force_hermitian_rec_off_stat(a, offset0+n1, offset1, n2, len1); } else { x_split_length(len1, x_nb, &n1, &n2); force_hermitian_rec_off_stat(a, offset0, offset1, len0, n1); force_hermitian_rec_off_stat(a, offset0, offset1+n1, len0, n2); } return; } else { /* base case */ ae_complex *p1, *p2, *prow, *pcol; ae_int_t i, j; p1 = (ae_complex*)(a->ptr)+offset0*a->stride+offset1; p2 = (ae_complex*)(a->ptr)+offset1*a->stride+offset0; for(i=0; istride; for(j=0; jstride; prow++; } } } } /* * this function copies Hermitian transpose of lower part of * diagonal block A0 to its upper part Block is specified by offset and size. * * [ . ] * [ A0 ] * A = [ . ] * [ . ] * */ static void force_hermitian_rec_diag_stat(x_matrix *a, ae_int_t offset, ae_int_t len) { ae_complex *p, *prow, *pcol; ae_int_t i, j; /* try to split problem into two smaller ones */ if( len>x_nb ) { ae_int_t n1, n2; x_split_length(len, x_nb, &n1, &n2); force_hermitian_rec_diag_stat(a, offset, n1); force_hermitian_rec_diag_stat(a, offset+n1, n2); force_hermitian_rec_off_stat(a, offset+n1, offset, n2, n1); return; } /* base case */ p = (ae_complex*)(a->ptr)+offset*a->stride+offset; for(i=0; istride; for(j=0; jstride,prow++) *pcol = *prow; } } ae_bool x_is_symmetric(x_matrix *a) { double mx, err; ae_bool nonfinite; ae_state _alglib_env_state; if( a->datatype!=DT_REAL ) return ae_false; if( a->cols!=a->rows ) return ae_false; if( a->cols==0 || a->rows==0 ) return ae_true; ae_state_init(&_alglib_env_state); mx = 0; err = 0; nonfinite = ae_false; is_symmetric_rec_diag_stat(a, 0, (ae_int_t)a->rows, &nonfinite, &mx, &err, &_alglib_env_state); if( nonfinite ) return ae_false; if( mx==0 ) return ae_true; return err/mx<=1.0E-14; } ae_bool x_is_hermitian(x_matrix *a) { double mx, err; ae_bool nonfinite; ae_state _alglib_env_state; if( a->datatype!=DT_COMPLEX ) return ae_false; if( a->cols!=a->rows ) return ae_false; if( a->cols==0 || a->rows==0 ) return ae_true; ae_state_init(&_alglib_env_state); mx = 0; err = 0; nonfinite = ae_false; is_hermitian_rec_diag_stat(a, 0, (ae_int_t)a->rows, &nonfinite, &mx, &err, &_alglib_env_state); if( nonfinite ) return ae_false; if( mx==0 ) return ae_true; return err/mx<=1.0E-14; } ae_bool x_force_symmetric(x_matrix *a) { if( a->datatype!=DT_REAL ) return ae_false; if( a->cols!=a->rows ) return ae_false; if( a->cols==0 || a->rows==0 ) return ae_true; force_symmetric_rec_diag_stat(a, 0, (ae_int_t)a->rows); return ae_true; } ae_bool x_force_hermitian(x_matrix *a) { if( a->datatype!=DT_COMPLEX ) return ae_false; if( a->cols!=a->rows ) return ae_false; if( a->cols==0 || a->rows==0 ) return ae_true; force_hermitian_rec_diag_stat(a, 0, (ae_int_t)a->rows); return ae_true; } ae_bool ae_is_symmetric(ae_matrix *a) { x_matrix x; x.owner = OWN_CALLER; ae_x_attach_to_matrix(&x, a); return x_is_symmetric(&x); } ae_bool ae_is_hermitian(ae_matrix *a) { x_matrix x; x.owner = OWN_CALLER; ae_x_attach_to_matrix(&x, a); return x_is_hermitian(&x); } ae_bool ae_force_symmetric(ae_matrix *a) { x_matrix x; x.owner = OWN_CALLER; ae_x_attach_to_matrix(&x, a); return x_force_symmetric(&x); } ae_bool ae_force_hermitian(ae_matrix *a) { x_matrix x; x.owner = OWN_CALLER; ae_x_attach_to_matrix(&x, a); return 
x_force_hermitian(&x); } /************************************************************************ This function converts six-bit value (from 0 to 63) to character (only digits, lowercase and uppercase letters, minus and underscore are used). If v is negative or greater than 63, this function returns '?'. ************************************************************************/ static char _sixbits2char_tbl[64] = { '0', '1', '2', '3', '4', '5', '6', '7', '8', '9', 'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z', 'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z', '-', '_' }; char ae_sixbits2char(ae_int_t v) { if( v<0 || v>63 ) return '?'; return _sixbits2char_tbl[v]; /* v is correct, process it */ /*if( v<10 ) return '0'+v; v -= 10; if( v<26 ) return 'A'+v; v -= 26; if( v<26 ) return 'a'+v; v -= 26; return v==0 ? '-' : '_';*/ } /************************************************************************ This function converts character to six-bit value (from 0 to 63). This function is inverse of ae_sixbits2char() If c is not correct character, this function returns -1. ************************************************************************/ static ae_int_t _ae_char2sixbits_tbl[] = { -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, 62, -1, -1, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, -1, -1, -1, -1, -1, -1, -1, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, -1, -1, -1, -1, 63, -1, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, -1, -1, -1, -1, -1 }; ae_int_t ae_char2sixbits(char c) { return (c>=0 && c<127) ? _ae_char2sixbits_tbl[(int)c] : -1; } /************************************************************************ This function converts three bytes (24 bits) to four six-bit values (24 bits again). src pointer to three bytes dst pointer to four ints ************************************************************************/ void ae_threebytes2foursixbits(const unsigned char *src, ae_int_t *dst) { dst[0] = src[0] & 0x3F; dst[1] = (src[0]>>6) | ((src[1]&0x0F)<<2); dst[2] = (src[1]>>4) | ((src[2]&0x03)<<4); dst[3] = src[2]>>2; } /************************************************************************ This function converts four six-bit values (24 bits) to three bytes (24 bits again). src pointer to four ints dst pointer to three bytes ************************************************************************/ void ae_foursixbits2threebytes(const ae_int_t *src, unsigned char *dst) { dst[0] = (unsigned char)( src[0] | ((src[1]&0x03)<<6)); dst[1] = (unsigned char)((src[1]>>2) | ((src[2]&0x0F)<<4)); dst[2] = (unsigned char)((src[2]>>4) | (src[3]<<2)); } /************************************************************************ This function serializes boolean value into buffer v boolean value to be serialized buf buffer, at least 12 characters wide (11 chars for value, one for trailing zero) state ALGLIB environment state ************************************************************************/ void ae_bool2str(ae_bool v, char *buf, ae_state *state) { char c = v ? 
'1' : '0'; ae_int_t i; for(i=0; iendianness==AE_BIG_ENDIAN ) { for(i=0; i<(ae_int_t)(sizeof(ae_int_t)/2); i++) { unsigned char tc; tc = u.bytes[i]; u.bytes[i] = u.bytes[sizeof(ae_int_t)-1-i]; u.bytes[sizeof(ae_int_t)-1-i] = tc; } } /* * convert to six-bit representation, output * * NOTE: last 12th element of sixbits is always zero, we do not output it */ ae_threebytes2foursixbits(u.bytes+0, sixbits+0); ae_threebytes2foursixbits(u.bytes+3, sixbits+4); ae_threebytes2foursixbits(u.bytes+6, sixbits+8); for(i=0; i=AE_SER_ENTRY_LENGTH ) ae_break(state, ERR_ASSERTION_FAILED, emsg); sixbits[sixbitsread] = d; sixbitsread++; buf++; } *pasttheend = buf; if( sixbitsread==0 ) ae_break(state, ERR_ASSERTION_FAILED, emsg); for(i=sixbitsread; i<12; i++) sixbits[i] = 0; ae_foursixbits2threebytes(sixbits+0, u.bytes+0); ae_foursixbits2threebytes(sixbits+4, u.bytes+3); ae_foursixbits2threebytes(sixbits+8, u.bytes+6); if( state->endianness==AE_BIG_ENDIAN ) { for(i=0; i<(ae_int_t)(sizeof(ae_int_t)/2); i++) { unsigned char tc; tc = u.bytes[i]; u.bytes[i] = u.bytes[sizeof(ae_int_t)-1-i]; u.bytes[sizeof(ae_int_t)-1-i] = tc; } } return u.ival; } /************************************************************************ This function serializes double value into buffer v double value to be serialized buf buffer, at least 12 characters wide (11 chars for value, one for trailing zero) state ALGLIB environment state ************************************************************************/ void ae_double2str(double v, char *buf, ae_state *state) { union _u { double dval; unsigned char bytes[9]; } u; ae_int_t i; ae_int_t sixbits[12]; /* * handle special quantities */ if( ae_isnan(v, state) ) { const char *s = ".nan_______"; memcpy(buf, s, strlen(s)+1); return; } if( ae_isposinf(v, state) ) { const char *s = ".posinf____"; memcpy(buf, s, strlen(s)+1); return; } if( ae_isneginf(v, state) ) { const char *s = ".neginf____"; memcpy(buf, s, strlen(s)+1); return; } /* * process general case: * 1. copy v to array of chars * 2. set 9th byte of u.bytes to zero in order to * simplify conversion to six-bit representation * 3. convert to little endian (if needed) * 4. convert to six-bit representation * (last 12th element of sixbits is always zero, we do not output it) */ u.dval = v; u.bytes[8] = 0; if( state->endianness==AE_BIG_ENDIAN ) { for(i=0; i<(ae_int_t)(sizeof(double)/2); i++) { unsigned char tc; tc = u.bytes[i]; u.bytes[i] = u.bytes[sizeof(double)-1-i]; u.bytes[sizeof(double)-1-i] = tc; } } ae_threebytes2foursixbits(u.bytes+0, sixbits+0); ae_threebytes2foursixbits(u.bytes+3, sixbits+4); ae_threebytes2foursixbits(u.bytes+6, sixbits+8); for(i=0; iv_nan; } if( strncmp(buf, s_posinf, strlen(s_posinf))==0 ) { *pasttheend = buf+strlen(s_posinf); return state->v_posinf; } if( strncmp(buf, s_neginf, strlen(s_neginf))==0 ) { *pasttheend = buf+strlen(s_neginf); return state->v_neginf; } ae_break(state, ERR_ASSERTION_FAILED, emsg); } /* * General case: * 1. read and decode six-bit digits * 2. check that all 11 digits were read * 3. set last 12th digit to zero (needed for simplicity of conversion) * 4. convert to 8 bytes * 5. 
convert to big endian representation, if needed */ sixbitsread = 0; while( *buf!=' ' && *buf!='\t' && *buf!='\n' && *buf!='\r' && *buf!=0 ) { ae_int_t d; d = ae_char2sixbits(*buf); if( d<0 || sixbitsread>=AE_SER_ENTRY_LENGTH ) ae_break(state, ERR_ASSERTION_FAILED, emsg); sixbits[sixbitsread] = d; sixbitsread++; buf++; } *pasttheend = buf; if( sixbitsread!=AE_SER_ENTRY_LENGTH ) ae_break(state, ERR_ASSERTION_FAILED, emsg); sixbits[AE_SER_ENTRY_LENGTH] = 0; ae_foursixbits2threebytes(sixbits+0, u.bytes+0); ae_foursixbits2threebytes(sixbits+4, u.bytes+3); ae_foursixbits2threebytes(sixbits+8, u.bytes+6); if( state->endianness==AE_BIG_ENDIAN ) { for(i=0; i<(ae_int_t)(sizeof(double)/2); i++) { unsigned char tc; tc = u.bytes[i]; u.bytes[i] = u.bytes[sizeof(double)-1-i]; u.bytes[sizeof(double)-1-i] = tc; } } return u.dval; } /************************************************************************ This function performs given number of spin-wait iterations ************************************************************************/ void ae_spin_wait(ae_int_t cnt) { /* * these strange operations with ae_never_change_it are necessary to * prevent compiler optimization of the loop. */ volatile ae_int_t i; /* very unlikely because no one will wait for such amount of cycles */ if( cnt>0x12345678 ) ae_never_change_it = cnt%10; /* spin wait, test condition which will never be true */ for(i=0; i0 ) ae_never_change_it--; } /************************************************************************ This function causes the calling thread to relinquish the CPU. The thread is moved to the end of the queue and some other thread gets to run. NOTE: this function should NOT be called when AE_OS is AE_UNKNOWN - the whole program will be abnormally terminated. ************************************************************************/ void ae_yield() { #if AE_OS==AE_WINDOWS if( !SwitchToThread() ) Sleep(0); #elif AE_OS==AE_POSIX sched_yield(); #else abort(); #endif } /************************************************************************ This function initializes ae_lock structure and sets lock in a free mode. ************************************************************************/ void ae_init_lock(ae_lock *lock) { _lock *p; lock->ptr = malloc(sizeof(_lock)); AE_CRITICAL_ASSERT(lock->ptr!=NULL); p = (_lock*)lock->ptr; #if AE_OS==AE_WINDOWS p->p_lock = (ae_int_t*)ae_align((void*)(&p->buf),AE_LOCK_ALIGNMENT); p->p_lock[0] = 0; #elif AE_OS==AE_POSIX pthread_mutex_init(&p->mutex, NULL); #else p->is_locked = ae_false; #endif } /************************************************************************ This function acquires lock. In case lock is busy, we perform several iterations inside tight loop before trying again. 
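EXAMPLE (editorial sketch, not part of the original sources; the critical-section body is hypothetical): the lock primitives are used as a matched set,

    ae_lock lock;
    ae_init_lock(&lock);
    ...
    ae_acquire_lock(&lock);
    ... code touching the shared state ...
    ae_release_lock(&lock);
    ...
    ae_free_lock(&lock);

one init/free pair per lock object, with acquire/release wrapped around every critical section.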
************************************************************************/ void ae_acquire_lock(ae_lock *lock) { #if AE_OS==AE_WINDOWS ae_int_t cnt = 0; #endif _lock *p; p = (_lock*)lock->ptr; #if AE_OS==AE_WINDOWS #ifdef AE_SMP_DEBUGCOUNTERS InterlockedIncrement((LONG volatile *)&_ae_dbg_lock_acquisitions); #endif for(;;) { if( InterlockedCompareExchange((LONG volatile *)p->p_lock, 1, 0)==0 ) return; ae_spin_wait(AE_LOCK_CYCLES); #ifdef AE_SMP_DEBUGCOUNTERS InterlockedIncrement((LONG volatile *)&_ae_dbg_lock_spinwaits); #endif cnt++; if( cnt%AE_LOCK_TESTS_BEFORE_YIELD==0 ) { #ifdef AE_SMP_DEBUGCOUNTERS InterlockedIncrement((LONG volatile *)&_ae_dbg_lock_yields); #endif ae_yield(); } } #elif AE_OS==AE_POSIX ae_int_t cnt = 0; for(;;) { if( pthread_mutex_trylock(&p->mutex)==0 ) return; ae_spin_wait(AE_LOCK_CYCLES); cnt++; if( cnt%AE_LOCK_TESTS_BEFORE_YIELD==0 ) ae_yield(); } ; #else AE_CRITICAL_ASSERT(!p->is_locked); p->is_locked = ae_true; #endif } /************************************************************************ This function releases lock. ************************************************************************/ void ae_release_lock(ae_lock *lock) { _lock *p; p = (_lock*)lock->ptr; #if AE_OS==AE_WINDOWS InterlockedExchange((LONG volatile *)p->p_lock, 0); #elif AE_OS==AE_POSIX pthread_mutex_unlock(&p->mutex); #else p->is_locked = ae_false; #endif } /************************************************************************ This function frees ae_lock structure. ************************************************************************/ void ae_free_lock(ae_lock *lock) { _lock *p; p = (_lock*)lock->ptr; #if AE_OS==AE_POSIX pthread_mutex_destroy(&p->mutex); #endif free(p); } /************************************************************************ This function creates ae_shared_pool structure. dst destination shared pool; already allocated, but not initialized. state this parameter can be: * pointer to current instance of ae_state, if you want to automatically destroy this object after leaving current frame * NULL, if you do NOT want this vector to be automatically managed (say, if it is field of some object) Error handling: * on failure calls ae_break() with NULL state pointer. Usually it results in abort() call. dst is assumed to be uninitialized, its fields are ignored. ************************************************************************/ void ae_shared_pool_init(void *_dst, ae_state *state) { ae_shared_pool *dst; dst = (ae_shared_pool*)_dst; /* init */ dst->seed_object = NULL; dst->recycled_objects = NULL; dst->recycled_entries = NULL; dst->enumeration_counter = NULL; dst->size_of_object = 0; dst->init = NULL; dst->init_copy = NULL; dst->destroy = NULL; dst->frame_entry.deallocator = ae_shared_pool_destroy; dst->frame_entry.ptr = dst; if( state!=NULL ) ae_db_attach(&dst->frame_entry, state); ae_init_lock(&dst->pool_lock); } /************************************************************************ This function clears all dynamically allocated fields of the pool except for the lock. It does NOT try to acquire pool_lock. NOTE: this function is NOT thread-safe, it is not protected by lock. 
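(editorial note, not part of the original sources) This helper is the common cleanup path: ae_shared_pool_clear(), ae_shared_pool_reset() and ae_shared_pool_set_seed() below all call it to drop the seed object and both recycled lists before the pool is reused or released.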
************************************************************************/ static void ae_shared_pool_internalclear(ae_shared_pool *dst) { ae_shared_pool_entry *ptr, *tmp; /* destroy seed */ if( dst->seed_object!=NULL ) { dst->destroy((void*)dst->seed_object); ae_free((void*)dst->seed_object); dst->seed_object = NULL; } /* destroy recycled objects */ for(ptr=dst->recycled_objects; ptr!=NULL;) { tmp = (ae_shared_pool_entry*)ptr->next_entry; dst->destroy(ptr->obj); ae_free(ptr->obj); ae_free(ptr); ptr = tmp; } dst->recycled_objects = NULL; /* destroy recycled entries */ for(ptr=dst->recycled_entries; ptr!=NULL;) { tmp = (ae_shared_pool_entry*)ptr->next_entry; ae_free(ptr); ptr = tmp; } dst->recycled_entries = NULL; } /************************************************************************ This function creates copy of ae_shared_pool. dst destination pool, allocated but not initialized src source pool state this parameter can be: * pointer to current instance of ae_state, if you want to automatically destroy this object after leaving current frame * NULL, if you do NOT want this vector to be automatically managed (say, if it is field of some object) Error handling: * on failure calls ae_break() with NULL state pointer. Usually it results in abort() call. dst is assumed to be uninitialized, its fields are ignored. NOTE: this function is NOT thread-safe. It does not acquire pool lock, so you should NOT call it when lock can be used by another thread. ************************************************************************/ void ae_shared_pool_init_copy(void *_dst, void *_src, ae_state *state) { ae_shared_pool *dst, *src; ae_shared_pool_entry *ptr; /* state!=NULL, allocation errors result in exception */ /* AE_CRITICAL_ASSERT(state!=NULL); */ dst = (ae_shared_pool*)_dst; src = (ae_shared_pool*)_src; ae_shared_pool_init(dst, state); /* copy non-pointer fields */ dst->size_of_object = src->size_of_object; dst->init = src->init; dst->init_copy = src->init_copy; dst->destroy = src->destroy; ae_init_lock(&dst->pool_lock); /* copy seed object */ if( src->seed_object!=NULL ) { dst->seed_object = ae_malloc(dst->size_of_object, state); dst->init_copy(dst->seed_object, src->seed_object, NULL); } /* copy recycled objects */ dst->recycled_objects = NULL; for(ptr=src->recycled_objects; ptr!=NULL; ptr=(ae_shared_pool_entry*)ptr->next_entry) { ae_shared_pool_entry *tmp; tmp = (ae_shared_pool_entry*)ae_malloc(sizeof(ae_shared_pool_entry), state); tmp->obj = ae_malloc(dst->size_of_object, state); dst->init_copy(tmp->obj, ptr->obj, NULL); tmp->next_entry = dst->recycled_objects; dst->recycled_objects = tmp; } /* recycled entries are not copied because they do not store any information */ dst->recycled_entries = NULL; /* enumeration counter is reset on copying */ dst->enumeration_counter = NULL; /* initialize frame record */ dst->frame_entry.deallocator = ae_shared_pool_destroy; dst->frame_entry.ptr = dst; } /************************************************************************ This function clears contents of the pool, but pool remain usable. IMPORTANT: this function invalidates dst, it can not be used after it is cleared. NOTE: this function is NOT thread-safe. It does not acquire pool lock, so you should NOT call it when lock can be used by another thread. 
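EXAMPLE (editorial sketch, not part of the original sources; seed, _seedtype_init, _seedtype_init_copy, _seedtype_destroy and _state are placeholders): a pool is typically prepared once by seeding it with a fully constructed object,

    ae_shared_pool pool;
    ae_shared_pool_init(&pool, &_state);
    ae_shared_pool_set_seed(&pool, &seed, sizeof(seed),
        _seedtype_init, _seedtype_init_copy, _seedtype_destroy, &_state);

after which worker threads obtain private copies with ae_shared_pool_retrieve() and hand them back with ae_shared_pool_recycle().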
************************************************************************/ void ae_shared_pool_clear(void *_dst) { ae_shared_pool *dst = (ae_shared_pool*)_dst; /* clear seed and lists */ ae_shared_pool_internalclear(dst); /* clear fields */ dst->seed_object = NULL; dst->recycled_objects = NULL; dst->recycled_entries = NULL; dst->enumeration_counter = NULL; dst->size_of_object = 0; dst->init = NULL; dst->init_copy = NULL; dst->destroy = NULL; } /************************************************************************ This function destroys pool (object is left in invalid state, all dynamically allocated memory is freed). NOTE: this function is NOT thread-safe. It does not acquire pool lock, so you should NOT call it when lock can be used by another thread. ************************************************************************/ void ae_shared_pool_destroy(void *_dst) { ae_shared_pool *dst = (ae_shared_pool*)_dst; ae_shared_pool_clear(_dst); ae_free_lock(&dst->pool_lock); } /************************************************************************ This function returns True, if internal seed object was set. It returns False for un-seeded pool. dst destination pool (initialized by constructor function) NOTE: this function is NOT thread-safe. It does not acquire pool lock, so you should NOT call it when lock can be used by another thread. ************************************************************************/ ae_bool ae_shared_pool_is_initialized(void *_dst) { ae_shared_pool *dst = (ae_shared_pool*)_dst; return dst->seed_object!=NULL; } /************************************************************************ This function sets internal seed object. All objects owned by the pool (current seed object, recycled objects) are automatically freed. dst destination pool (initialized by constructor function) seed_object new seed object size_of_object sizeof(), used to allocate memory init constructor function init_copy copy constructor clear destructor function state ALGLIB environment state NOTE: this function is NOT thread-safe. It does not acquire pool lock, so you should NOT call it when lock can be used by another thread. ************************************************************************/ void ae_shared_pool_set_seed( ae_shared_pool *dst, void *seed_object, ae_int_t size_of_object, void (*init)(void* dst, ae_state* state), void (*init_copy)(void* dst, void* src, ae_state* state), void (*destroy)(void* ptr), ae_state *state) { /* state!=NULL, allocation errors result in exception */ AE_CRITICAL_ASSERT(state!=NULL); /* destroy internal objects */ ae_shared_pool_internalclear(dst); /* set non-pointer fields */ dst->size_of_object = size_of_object; dst->init = init; dst->init_copy = init_copy; dst->destroy = destroy; /* set seed object */ dst->seed_object = ae_malloc(size_of_object, state); init_copy(dst->seed_object, seed_object, NULL); } /************************************************************************ This function retrieves a copy of the seed object from the pool and stores it to target smart pointer ptr. In case target pointer owns non-NULL value, it is deallocated before storing value retrieved from pool. Target pointer becomes owner of the value which was retrieved from pool. pool pool pptr pointer to ae_smart_ptr structure state ALGLIB environment state NOTE: this function IS thread-safe. It acquires pool lock during its operation and can be used simultaneously from several threads. 
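EXAMPLE (editorial sketch, not part of the original sources; seedtype, obj, pool and _state are placeholders): the usual per-thread pattern pairs retrieve with recycle through a smart pointer,

    ae_smart_ptr ptr;
    seedtype *obj;
    ae_smart_ptr_init(&ptr, (void**)&obj, &_state);
    ae_shared_pool_retrieve(&pool, &ptr, &_state);
    ... work with obj ...
    ae_shared_pool_recycle(&pool, &ptr, &_state);

retrieve hands the thread an owned copy of the seed (or a recycled instance), and recycle moves ownership back to the pool, leaving the smart pointer empty.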
************************************************************************/ void ae_shared_pool_retrieve( ae_shared_pool *pool, ae_smart_ptr *pptr, ae_state *state) { void *new_obj; /* state!=NULL, allocation errors are handled by throwing exception from ae_malloc() */ AE_CRITICAL_ASSERT(state!=NULL); /* assert that pool was seeded */ ae_assert( pool->seed_object!=NULL, "ALGLIB: shared pool is not seeded, PoolRetrieve() failed", state); /* acquire lock */ ae_acquire_lock(&pool->pool_lock); /* try to reuse recycled objects */ if( pool->recycled_objects!=NULL ) { void *new_obj; ae_shared_pool_entry *result; /* retrieve entry/object from list of recycled objects */ result = pool->recycled_objects; pool->recycled_objects = (ae_shared_pool_entry*)pool->recycled_objects->next_entry; new_obj = result->obj; result->obj = NULL; /* move entry to list of recycled entries */ result->next_entry = pool->recycled_entries; pool->recycled_entries = result; /* release lock */ ae_release_lock(&pool->pool_lock); /* assign object to smart pointer */ ae_smart_ptr_assign(pptr, new_obj, ae_true, ae_true, pool->destroy); return; } /* release lock; we do not need it anymore because copy constructor does not modify source variable */ ae_release_lock(&pool->pool_lock); /* create new object from seed */ new_obj = ae_malloc(pool->size_of_object, state); pool->init_copy(new_obj, pool->seed_object, NULL); /* assign object to smart pointer and return */ ae_smart_ptr_assign(pptr, new_obj, ae_true, ae_true, pool->destroy); } /************************************************************************ This function recycles object owned by smart pointer by moving it to internal storage of the shared pool. Source pointer must own the object. After function is over, it owns NULL pointer. pool pool pptr pointer to ae_smart_ptr structure state ALGLIB environment state NOTE: this function IS thread-safe. It acquires pool lock during its operation and can be used simultaneously from several threads. ************************************************************************/ void ae_shared_pool_recycle( ae_shared_pool *pool, ae_smart_ptr *pptr, ae_state *state) { ae_shared_pool_entry *new_entry; /* state!=NULL, allocation errors are handled by throwing exception from ae_malloc() */ AE_CRITICAL_ASSERT(state!=NULL); /* assert that pool was seeded */ ae_assert( pool->seed_object!=NULL, "ALGLIB: shared pool is not seeded, PoolRecycle() failed", state); /* assert that pointer non-null and owns the object */ ae_assert(pptr->is_owner, "ALGLIB: pptr in ae_shared_pool_recycle() does not own its pointer", state); ae_assert(pptr->ptr!=NULL, "ALGLIB: pptr in ae_shared_pool_recycle() is NULL", state); /* acquire lock */ ae_acquire_lock(&pool->pool_lock); /* acquire shared pool entry (reuse one from recycled_entries or malloc new one) */ if( pool->recycled_entries!=NULL ) { /* reuse previously allocated entry */ new_entry = pool->recycled_entries; pool->recycled_entries = (ae_shared_pool_entry*)new_entry->next_entry; } else { /* * Allocate memory for new entry. * * NOTE: we release pool lock during allocation because ae_malloc() may raise * exception and we do not want our pool to be left in the locked state. 
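*
* (editorial note, not part of the original sources) after the lock is
* re-acquired below, the freshly allocated entry is linked into
* recycled_objects while the lock is held; the temporary release only
* widens the window in which other threads may retrieve or recycle, it
* never lets two threads share the same entry.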
*/ ae_release_lock(&pool->pool_lock); new_entry = (ae_shared_pool_entry*)ae_malloc(sizeof(ae_shared_pool_entry), state); ae_acquire_lock(&pool->pool_lock); } /* add object to the list of recycled objects */ new_entry->obj = pptr->ptr; new_entry->next_entry = pool->recycled_objects; pool->recycled_objects = new_entry; /* release lock object */ ae_release_lock(&pool->pool_lock); /* release source pointer */ ae_smart_ptr_release(pptr); } /************************************************************************ This function clears internal list of recycled objects, but does not change seed object managed by the pool. pool pool state ALGLIB environment state NOTE: this function is NOT thread-safe. It does not acquire pool lock, so you should NOT call it when lock can be used by another thread. ************************************************************************/ void ae_shared_pool_clear_recycled( ae_shared_pool *pool, ae_state *state) { ae_shared_pool_entry *ptr, *tmp; /* clear recycled objects */ for(ptr=pool->recycled_objects; ptr!=NULL;) { tmp = (ae_shared_pool_entry*)ptr->next_entry; pool->destroy(ptr->obj); ae_free(ptr->obj); ae_free(ptr); ptr = tmp; } pool->recycled_objects = NULL; } /************************************************************************ This function allows to enumerate recycled elements of the shared pool. It stores pointer to the first recycled object in the smart pointer. IMPORTANT: * in case target pointer owns non-NULL value, it is deallocated before storing value retrieved from pool. * recycled object IS NOT removed from pool * target pointer DOES NOT become owner of the new value * this function IS NOT thread-safe * you SHOULD NOT modify shared pool during enumeration (although you can modify state of the objects retrieved from pool) * in case there is no recycled objects in the pool, NULL is stored to pptr * in case pool is not seeded, NULL is stored to pptr pool pool pptr pointer to ae_smart_ptr structure state ALGLIB environment state ************************************************************************/ void ae_shared_pool_first_recycled( ae_shared_pool *pool, ae_smart_ptr *pptr, ae_state *state) { /* modify internal enumeration counter */ pool->enumeration_counter = pool->recycled_objects; /* exit on empty list */ if( pool->enumeration_counter==NULL ) { ae_smart_ptr_assign(pptr, NULL, ae_false, ae_false, NULL); return; } /* assign object to smart pointer */ ae_smart_ptr_assign(pptr, pool->enumeration_counter->obj, ae_false, ae_false, pool->destroy); } /************************************************************************ This function allows to enumerate recycled elements of the shared pool. It stores pointer to the next recycled object in the smart pointer. IMPORTANT: * in case target pointer owns non-NULL value, it is deallocated before storing value retrieved from pool. * recycled object IS NOT removed from pool * target pointer DOES NOT become owner of the new value * this function IS NOT thread-safe * you SHOULD NOT modify shared pool during enumeration (although you can modify state of the objects retrieved from pool) * in case there is no recycled objects left in the pool, NULL is stored. * in case pool is not seeded, NULL is stored. 
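EXAMPLE (editorial sketch, not part of the original sources; obj is the subscriber variable bound to pptr by ae_smart_ptr_init()): a full enumeration therefore looks like

    ae_shared_pool_first_recycled(&pool, &ptr, &_state);
    while( obj!=NULL )
    {
        ... inspect or update obj ...
        ae_shared_pool_next_recycled(&pool, &ptr, &_state);
    }

the pointers stored during enumeration are non-owning, so nothing has to be recycled or freed when the loop ends.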
pool pool pptr pointer to ae_smart_ptr structure state ALGLIB environment state ************************************************************************/ void ae_shared_pool_next_recycled( ae_shared_pool *pool, ae_smart_ptr *pptr, ae_state *state) { /* exit on end of list */ if( pool->enumeration_counter==NULL ) { ae_smart_ptr_assign(pptr, NULL, ae_false, ae_false, NULL); return; } /* modify internal enumeration counter */ pool->enumeration_counter = (ae_shared_pool_entry*)pool->enumeration_counter->next_entry; /* exit on empty list */ if( pool->enumeration_counter==NULL ) { ae_smart_ptr_assign(pptr, NULL, ae_false, ae_false, NULL); return; } /* assign object to smart pointer */ ae_smart_ptr_assign(pptr, pool->enumeration_counter->obj, ae_false, ae_false, pool->destroy); } /************************************************************************ This function clears internal list of recycled objects and seed object. However, pool still can be used (after initialization with another seed). pool pool state ALGLIB environment state NOTE: this function is NOT thread-safe. It does not acquire pool lock, so you should NOT call it when lock can be used by another thread. ************************************************************************/ void ae_shared_pool_reset( ae_shared_pool *pool, ae_state *state) { /* clear seed and lists */ ae_shared_pool_internalclear(pool); /* clear fields */ pool->seed_object = NULL; pool->recycled_objects = NULL; pool->recycled_entries = NULL; pool->enumeration_counter = NULL; pool->size_of_object = 0; pool->init = NULL; pool->init_copy = NULL; pool->destroy = NULL; } /************************************************************************ This function initializes serializer ************************************************************************/ void ae_serializer_init(ae_serializer *serializer) { serializer->mode = AE_SM_DEFAULT; serializer->entries_needed = 0; serializer->bytes_asked = 0; } void ae_serializer_clear(ae_serializer *serializer) { } void ae_serializer_alloc_start(ae_serializer *serializer) { serializer->entries_needed = 0; serializer->bytes_asked = 0; serializer->mode = AE_SM_ALLOC; } void ae_serializer_alloc_entry(ae_serializer *serializer) { serializer->entries_needed++; } /************************************************************************ After allocation phase is done, this function returns required size of the output string buffer (including trailing zero symbol). Actual size of the data being stored can be a few characters smaller than requested. 
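EXAMPLE (editorial sketch, not part of the original sources; buf, buflen and v are hypothetical): the size reported here is consumed by the usual two-pass flow - count entries first, then write them:

    ae_serializer s;
    ae_serializer_init(&s);
    ae_serializer_alloc_start(&s);
    ae_serializer_alloc_entry(&s);
    ae_int_t buflen = ae_serializer_get_alloc_size(&s);
    ... allocate buf with room for buflen characters ...
    ae_serializer_sstart_str(&s, buf);
    ae_serializer_serialize_int(&s, v, &_state);
    ae_serializer_stop(&s, &_state);
    ae_serializer_clear(&s);

every ae_serializer_alloc_entry() performed during the first pass must be matched by exactly one serialize call during the second pass, otherwise the buffer-size consistency checks in the serialize functions will fail.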
************************************************************************/
ae_int_t ae_serializer_get_alloc_size(ae_serializer *serializer)
{
    ae_int_t rows, lastrowsize, result;
    serializer->mode = AE_SM_READY2S;
    /* if no entries needed (degenerate case) */
    if( serializer->entries_needed==0 )
    {
        serializer->bytes_asked = 4; /* a pair of chars for \r\n, one for dot, one for trailing zero */
        return serializer->bytes_asked;
    }
    /* non-degenerate case */
    rows = serializer->entries_needed/AE_SER_ENTRIES_PER_ROW;
    lastrowsize = AE_SER_ENTRIES_PER_ROW;
    if( serializer->entries_needed%AE_SER_ENTRIES_PER_ROW )
    {
        lastrowsize = serializer->entries_needed%AE_SER_ENTRIES_PER_ROW;
        rows++;
    }
    /* calculate result size */
    result = ((rows-1)*AE_SER_ENTRIES_PER_ROW+lastrowsize)*AE_SER_ENTRY_LENGTH; /* data size */
    result += (rows-1)*(AE_SER_ENTRIES_PER_ROW-1)+(lastrowsize-1); /* space symbols */
    result += rows*2; /* newline symbols */
    result += 1; /* trailing dot */
    result += 1; /* trailing zero */
    serializer->bytes_asked = result;
    return result;
}
#ifdef AE_USE_CPP_SERIALIZATION
void ae_serializer_sstart_str(ae_serializer *serializer, std::string *buf) { serializer->mode = AE_SM_TO_CPPSTRING; serializer->out_cppstr = buf; serializer->entries_saved = 0; serializer->bytes_written = 0; }
void ae_serializer_ustart_str(ae_serializer *serializer, const std::string *buf) { serializer->mode = AE_SM_FROM_STRING; serializer->in_str = buf->c_str(); }
static char cpp_writer(const char *p_string, ae_int_t aux) { std::ostream *stream = reinterpret_cast<std::ostream*>(aux); stream->write(p_string, strlen(p_string)); return stream->bad() ? 1 : 0; }
static char cpp_reader(ae_int_t aux, ae_int_t cnt, char *p_buf) { std::istream *stream = reinterpret_cast<std::istream*>(aux); int c; if( cnt<=0 ) return 1; /* unexpected cnt */ for(;;) { c = stream->get(); if( c<0 || c>255 ) return 1; /* failure! */ if( c!=' ' && c!='\t' && c!='\n' && c!='\r' ) break; } p_buf[0] = (char)c; for(int k=1; k<cnt; k++) { c = stream->get(); if( c<0 || c>255 || c==' ' || c=='\t' || c=='\n' || c=='\r' ) return 1; /* failure!
*/ p_buf[k] = (char)c; } p_buf[cnt] = 0; return 0; /* success */ } void ae_serializer_sstart_stream(ae_serializer *serializer, std::ostream *stream) { serializer->mode = AE_SM_TO_STREAM; serializer->stream_writer = cpp_writer; serializer->stream_aux = reinterpret_cast(stream); serializer->entries_saved = 0; serializer->bytes_written = 0; } void ae_serializer_ustart_stream(ae_serializer *serializer, const std::istream *stream) { serializer->mode = AE_SM_FROM_STREAM; serializer->stream_reader = cpp_reader; serializer->stream_aux = reinterpret_cast(stream); } #endif void ae_serializer_sstart_str(ae_serializer *serializer, char *buf) { serializer->mode = AE_SM_TO_STRING; serializer->out_str = buf; serializer->out_str[0] = 0; serializer->entries_saved = 0; serializer->bytes_written = 0; } void ae_serializer_ustart_str(ae_serializer *serializer, const char *buf) { serializer->mode = AE_SM_FROM_STRING; serializer->in_str = buf; } void ae_serializer_sstart_stream(ae_serializer *serializer, ae_stream_writer writer, ae_int_t aux) { serializer->mode = AE_SM_TO_STREAM; serializer->stream_writer = writer; serializer->stream_aux = aux; serializer->entries_saved = 0; serializer->bytes_written = 0; } void ae_serializer_ustart_stream(ae_serializer *serializer, ae_stream_reader reader, ae_int_t aux) { serializer->mode = AE_SM_FROM_STREAM; serializer->stream_reader = reader; serializer->stream_aux = aux; } void ae_serializer_serialize_bool(ae_serializer *serializer, ae_bool v, ae_state *state) { char buf[AE_SER_ENTRY_LENGTH+2+1]; const char *emsg = "ALGLIB: serialization integrity error"; ae_int_t bytes_appended; /* prepare serialization, check consistency */ ae_bool2str(v, buf, state); serializer->entries_saved++; if( serializer->entries_saved%AE_SER_ENTRIES_PER_ROW ) strcat(buf, " "); else strcat(buf, "\r\n"); bytes_appended = (ae_int_t)strlen(buf); ae_assert(serializer->bytes_written+bytes_appendedbytes_asked, emsg, state); /* strict "less" because we need space for trailing zero */ serializer->bytes_written += bytes_appended; /* append to buffer */ #ifdef AE_USE_CPP_SERIALIZATION if( serializer->mode==AE_SM_TO_CPPSTRING ) { *(serializer->out_cppstr) += buf; return; } #endif if( serializer->mode==AE_SM_TO_STRING ) { strcat(serializer->out_str, buf); serializer->out_str += bytes_appended; return; } if( serializer->mode==AE_SM_TO_STREAM ) { ae_assert(serializer->stream_writer(buf, serializer->stream_aux)==0, "serializer: error writing to stream", state); return; } ae_break(state, ERR_ASSERTION_FAILED, emsg); } void ae_serializer_serialize_int(ae_serializer *serializer, ae_int_t v, ae_state *state) { char buf[AE_SER_ENTRY_LENGTH+2+1]; const char *emsg = "ALGLIB: serialization integrity error"; ae_int_t bytes_appended; /* prepare serialization, check consistency */ ae_int2str(v, buf, state); serializer->entries_saved++; if( serializer->entries_saved%AE_SER_ENTRIES_PER_ROW ) strcat(buf, " "); else strcat(buf, "\r\n"); bytes_appended = (ae_int_t)strlen(buf); ae_assert(serializer->bytes_written+bytes_appendedbytes_asked, emsg, state); /* strict "less" because we need space for trailing zero */ serializer->bytes_written += bytes_appended; /* append to buffer */ #ifdef AE_USE_CPP_SERIALIZATION if( serializer->mode==AE_SM_TO_CPPSTRING ) { *(serializer->out_cppstr) += buf; return; } #endif if( serializer->mode==AE_SM_TO_STRING ) { strcat(serializer->out_str, buf); serializer->out_str += bytes_appended; return; } if( serializer->mode==AE_SM_TO_STREAM ) { ae_assert(serializer->stream_writer(buf, 
serializer->stream_aux)==0, "serializer: error writing to stream", state); return; } ae_break(state, ERR_ASSERTION_FAILED, emsg); } void ae_serializer_serialize_double(ae_serializer *serializer, double v, ae_state *state) { char buf[AE_SER_ENTRY_LENGTH+2+1]; const char *emsg = "ALGLIB: serialization integrity error"; ae_int_t bytes_appended; /* prepare serialization, check consistency */ ae_double2str(v, buf, state); serializer->entries_saved++; if( serializer->entries_saved%AE_SER_ENTRIES_PER_ROW ) strcat(buf, " "); else strcat(buf, "\r\n"); bytes_appended = (ae_int_t)strlen(buf); ae_assert(serializer->bytes_written+bytes_appendedbytes_asked, emsg, state); /* strict "less" because we need space for trailing zero */ serializer->bytes_written += bytes_appended; /* append to buffer */ #ifdef AE_USE_CPP_SERIALIZATION if( serializer->mode==AE_SM_TO_CPPSTRING ) { *(serializer->out_cppstr) += buf; return; } #endif if( serializer->mode==AE_SM_TO_STRING ) { strcat(serializer->out_str, buf); serializer->out_str += bytes_appended; return; } if( serializer->mode==AE_SM_TO_STREAM ) { ae_assert(serializer->stream_writer(buf, serializer->stream_aux)==0, "serializer: error writing to stream", state); return; } ae_break(state, ERR_ASSERTION_FAILED, emsg); } void ae_serializer_unserialize_bool(ae_serializer *serializer, ae_bool *v, ae_state *state) { if( serializer->mode==AE_SM_FROM_STRING ) { *v = ae_str2bool(serializer->in_str, state, &serializer->in_str); return; } if( serializer->mode==AE_SM_FROM_STREAM ) { char buf[AE_SER_ENTRY_LENGTH+2+1]; const char *p = buf; ae_assert(serializer->stream_reader(serializer->stream_aux, AE_SER_ENTRY_LENGTH, buf)==0, "serializer: error reading from stream", state); *v = ae_str2bool(buf, state, &p); return; } ae_break(state, ERR_ASSERTION_FAILED, "ae_serializer: integrity check failed"); } void ae_serializer_unserialize_int(ae_serializer *serializer, ae_int_t *v, ae_state *state) { if( serializer->mode==AE_SM_FROM_STRING ) { *v = ae_str2int(serializer->in_str, state, &serializer->in_str); return; } if( serializer->mode==AE_SM_FROM_STREAM ) { char buf[AE_SER_ENTRY_LENGTH+2+1]; const char *p = buf; ae_assert(serializer->stream_reader(serializer->stream_aux, AE_SER_ENTRY_LENGTH, buf)==0, "serializer: error reading from stream", state); *v = ae_str2int(buf, state, &p); return; } ae_break(state, ERR_ASSERTION_FAILED, "ae_serializer: integrity check failed"); } void ae_serializer_unserialize_double(ae_serializer *serializer, double *v, ae_state *state) { if( serializer->mode==AE_SM_FROM_STRING ) { *v = ae_str2double(serializer->in_str, state, &serializer->in_str); return; } if( serializer->mode==AE_SM_FROM_STREAM ) { char buf[AE_SER_ENTRY_LENGTH+2+1]; const char *p = buf; ae_assert(serializer->stream_reader(serializer->stream_aux, AE_SER_ENTRY_LENGTH, buf)==0, "serializer: error reading from stream", state); *v = ae_str2double(buf, state, &p); return; } ae_break(state, ERR_ASSERTION_FAILED, "ae_serializer: integrity check failed"); } void ae_serializer_stop(ae_serializer *serializer, ae_state *state) { #ifdef AE_USE_CPP_SERIALIZATION if( serializer->mode==AE_SM_TO_CPPSTRING ) { ae_assert(serializer->bytes_written+1bytes_asked, "ae_serializer: integrity check failed", state);/* strict "less" because we need space for trailing zero */ serializer->bytes_written++; *(serializer->out_cppstr) += "."; return; } #endif if( serializer->mode==AE_SM_TO_STRING ) { ae_assert(serializer->bytes_written+1bytes_asked, "ae_serializer: integrity check failed", state); /* strict "less" because 
we need space for trailing zero */ serializer->bytes_written++; strcat(serializer->out_str, "."); serializer->out_str += 1; return; } if( serializer->mode==AE_SM_TO_STREAM ) { ae_assert(serializer->bytes_written+1bytes_asked, "ae_serializer: integrity check failed", state); /* strict "less" because we need space for trailing zero */ serializer->bytes_written++; ae_assert(serializer->stream_writer(".", serializer->stream_aux)==0, "ae_serializer: error writing to stream", state); return; } if( serializer->mode==AE_SM_FROM_STRING ) { /* * because input string may be from pre-3.11 serializer, * which does not include trailing dot, we do not test * string for presence of "." symbol. Anyway, because string * is not stream, we do not have to read ALL trailing symbols. */ return; } if( serializer->mode==AE_SM_FROM_STREAM ) { /* * Read trailing dot, perform integrity check */ char buf[2]; ae_assert(serializer->stream_reader(serializer->stream_aux, 1, buf)==0, "ae_serializer: error reading from stream", state); ae_assert(buf[0]=='.', "ae_serializer: trailing . is not found in the stream", state); return; } ae_break(state, ERR_ASSERTION_FAILED, "ae_serializer: integrity check failed"); } /************************************************************************ Complex math functions ************************************************************************/ ae_complex ae_complex_from_i(ae_int_t v) { ae_complex r; r.x = (double)v; r.y = 0.0; return r; } ae_complex ae_complex_from_d(double v) { ae_complex r; r.x = v; r.y = 0.0; return r; } ae_complex ae_c_neg(ae_complex lhs) { ae_complex result; result.x = -lhs.x; result.y = -lhs.y; return result; } ae_complex ae_c_conj(ae_complex lhs, ae_state *state) { ae_complex result; result.x = +lhs.x; result.y = -lhs.y; return result; } ae_complex ae_c_sqr(ae_complex lhs, ae_state *state) { ae_complex result; result.x = lhs.x*lhs.x-lhs.y*lhs.y; result.y = 2*lhs.x*lhs.y; return result; } double ae_c_abs(ae_complex z, ae_state *state) { double w; double xabs; double yabs; double v; xabs = fabs(z.x); yabs = fabs(z.y); w = xabs>yabs ? 
xabs : yabs; v = xabsx; v0y = -v0->y; v1x = v1->x; v1y = -v1->y; rx += v0x*v1x-v0y*v1y; ry += v0x*v1y+v0y*v1x; } } if( !bconj0 && bconj1 ) { double v0x, v0y, v1x, v1y; for(i=0; ix; v0y = v0->y; v1x = v1->x; v1y = -v1->y; rx += v0x*v1x-v0y*v1y; ry += v0x*v1y+v0y*v1x; } } if( bconj0 && !bconj1 ) { double v0x, v0y, v1x, v1y; for(i=0; ix; v0y = -v0->y; v1x = v1->x; v1y = v1->y; rx += v0x*v1x-v0y*v1y; ry += v0x*v1y+v0y*v1x; } } if( !bconj0 && !bconj1 ) { double v0x, v0y, v1x, v1y; for(i=0; ix; v0y = v0->y; v1x = v1->x; v1y = v1->y; rx += v0x*v1x-v0y*v1y; ry += v0x*v1y+v0y*v1x; } } result.x = rx; result.y = ry; return result; } void ae_v_cmove(ae_complex *vdst, ae_int_t stride_dst, const ae_complex* vsrc, ae_int_t stride_src, const char *conj_src, ae_int_t n) { ae_bool bconj = !((conj_src[0]=='N') || (conj_src[0]=='n')); ae_int_t i; if( stride_dst!=1 || stride_src!=1 ) { /* * general unoptimized case */ if( bconj ) { for(i=0; ix = vsrc->x; vdst->y = -vsrc->y; } } else { for(i=0; ix = vsrc->x; vdst->y = -vsrc->y; } } else { for(i=0; ix = -vsrc->x; vdst->y = vsrc->y; } } else { for(i=0; ix = -vsrc->x; vdst->y = -vsrc->y; } } } else { /* * optimized case */ if( bconj ) { for(i=0; ix = -vsrc->x; vdst->y = vsrc->y; } } else { for(i=0; ix = -vsrc->x; vdst->y = -vsrc->y; } } } } void ae_v_cmoved(ae_complex *vdst, ae_int_t stride_dst, const ae_complex* vsrc, ae_int_t stride_src, const char *conj_src, ae_int_t n, double alpha) { ae_bool bconj = !((conj_src[0]=='N') || (conj_src[0]=='n')); ae_int_t i; if( stride_dst!=1 || stride_src!=1 ) { /* * general unoptimized case */ if( bconj ) { for(i=0; ix = alpha*vsrc->x; vdst->y = -alpha*vsrc->y; } } else { for(i=0; ix = alpha*vsrc->x; vdst->y = alpha*vsrc->y; } } } else { /* * optimized case */ if( bconj ) { for(i=0; ix = alpha*vsrc->x; vdst->y = -alpha*vsrc->y; } } else { for(i=0; ix = alpha*vsrc->x; vdst->y = alpha*vsrc->y; } } } } void ae_v_cmovec(ae_complex *vdst, ae_int_t stride_dst, const ae_complex* vsrc, ae_int_t stride_src, const char *conj_src, ae_int_t n, ae_complex alpha) { ae_bool bconj = !((conj_src[0]=='N') || (conj_src[0]=='n')); ae_int_t i; if( stride_dst!=1 || stride_src!=1 ) { /* * general unoptimized case */ if( bconj ) { double ax = alpha.x, ay = alpha.y; for(i=0; ix = ax*vsrc->x+ay*vsrc->y; vdst->y = -ax*vsrc->y+ay*vsrc->x; } } else { double ax = alpha.x, ay = alpha.y; for(i=0; ix = ax*vsrc->x-ay*vsrc->y; vdst->y = ax*vsrc->y+ay*vsrc->x; } } } else { /* * highly optimized case */ if( bconj ) { double ax = alpha.x, ay = alpha.y; for(i=0; ix = ax*vsrc->x+ay*vsrc->y; vdst->y = -ax*vsrc->y+ay*vsrc->x; } } else { double ax = alpha.x, ay = alpha.y; for(i=0; ix = ax*vsrc->x-ay*vsrc->y; vdst->y = ax*vsrc->y+ay*vsrc->x; } } } } void ae_v_cadd(ae_complex *vdst, ae_int_t stride_dst, const ae_complex *vsrc, ae_int_t stride_src, const char *conj_src, ae_int_t n) { ae_bool bconj = !((conj_src[0]=='N') || (conj_src[0]=='n')); ae_int_t i; if( stride_dst!=1 || stride_src!=1 ) { /* * general unoptimized case */ if( bconj ) { for(i=0; ix += vsrc->x; vdst->y -= vsrc->y; } } else { for(i=0; ix += vsrc->x; vdst->y += vsrc->y; } } } else { /* * optimized case */ if( bconj ) { for(i=0; ix += vsrc->x; vdst->y -= vsrc->y; } } else { for(i=0; ix += vsrc->x; vdst->y += vsrc->y; } } } } void ae_v_caddd(ae_complex *vdst, ae_int_t stride_dst, const ae_complex *vsrc, ae_int_t stride_src, const char *conj_src, ae_int_t n, double alpha) { ae_bool bconj = !((conj_src[0]=='N') || (conj_src[0]=='n')); ae_int_t i; if( stride_dst!=1 || stride_src!=1 ) { /* * general unoptimized 
case */ if( bconj ) { for(i=0; ix += alpha*vsrc->x; vdst->y -= alpha*vsrc->y; } } else { for(i=0; ix += alpha*vsrc->x; vdst->y += alpha*vsrc->y; } } } else { /* * optimized case */ if( bconj ) { for(i=0; ix += alpha*vsrc->x; vdst->y -= alpha*vsrc->y; } } else { for(i=0; ix += alpha*vsrc->x; vdst->y += alpha*vsrc->y; } } } } void ae_v_caddc(ae_complex *vdst, ae_int_t stride_dst, const ae_complex *vsrc, ae_int_t stride_src, const char *conj_src, ae_int_t n, ae_complex alpha) { ae_bool bconj = !((conj_src[0]=='N') || (conj_src[0]=='n')); ae_int_t i; if( stride_dst!=1 || stride_src!=1 ) { /* * general unoptimized case */ double ax = alpha.x, ay = alpha.y; if( bconj ) { for(i=0; ix += ax*vsrc->x+ay*vsrc->y; vdst->y -= ax*vsrc->y-ay*vsrc->x; } } else { for(i=0; ix += ax*vsrc->x-ay*vsrc->y; vdst->y += ax*vsrc->y+ay*vsrc->x; } } } else { /* * highly optimized case */ double ax = alpha.x, ay = alpha.y; if( bconj ) { for(i=0; ix += ax*vsrc->x+ay*vsrc->y; vdst->y -= ax*vsrc->y-ay*vsrc->x; } } else { for(i=0; ix += ax*vsrc->x-ay*vsrc->y; vdst->y += ax*vsrc->y+ay*vsrc->x; } } } } void ae_v_csub(ae_complex *vdst, ae_int_t stride_dst, const ae_complex *vsrc, ae_int_t stride_src, const char *conj_src, ae_int_t n) { ae_bool bconj = !((conj_src[0]=='N') || (conj_src[0]=='n')); ae_int_t i; if( stride_dst!=1 || stride_src!=1 ) { /* * general unoptimized case */ if( bconj ) { for(i=0; ix -= vsrc->x; vdst->y += vsrc->y; } } else { for(i=0; ix -= vsrc->x; vdst->y -= vsrc->y; } } } else { /* * highly optimized case */ if( bconj ) { for(i=0; ix -= vsrc->x; vdst->y += vsrc->y; } } else { for(i=0; ix -= vsrc->x; vdst->y -= vsrc->y; } } } } void ae_v_csubd(ae_complex *vdst, ae_int_t stride_dst, const ae_complex *vsrc, ae_int_t stride_src, const char *conj_src, ae_int_t n, double alpha) { ae_v_caddd(vdst, stride_dst, vsrc, stride_src, conj_src, n, -alpha); } void ae_v_csubc(ae_complex *vdst, ae_int_t stride_dst, const ae_complex *vsrc, ae_int_t stride_src, const char *conj_src, ae_int_t n, ae_complex alpha) { alpha.x = -alpha.x; alpha.y = -alpha.y; ae_v_caddc(vdst, stride_dst, vsrc, stride_src, conj_src, n, alpha); } void ae_v_cmuld(ae_complex *vdst, ae_int_t stride_dst, ae_int_t n, double alpha) { ae_int_t i; if( stride_dst!=1 ) { /* * general unoptimized case */ for(i=0; ix *= alpha; vdst->y *= alpha; } } else { /* * optimized case */ for(i=0; ix *= alpha; vdst->y *= alpha; } } } void ae_v_cmulc(ae_complex *vdst, ae_int_t stride_dst, ae_int_t n, ae_complex alpha) { ae_int_t i; if( stride_dst!=1 ) { /* * general unoptimized case */ double ax = alpha.x, ay = alpha.y; for(i=0; ix, dsty = vdst->y; vdst->x = ax*dstx-ay*dsty; vdst->y = ax*dsty+ay*dstx; } } else { /* * highly optimized case */ double ax = alpha.x, ay = alpha.y; for(i=0; ix, dsty = vdst->y; vdst->x = ax*dstx-ay*dsty; vdst->y = ax*dsty+ay*dstx; } } } /************************************************************************ Real BLAS operations ************************************************************************/ double ae_v_dotproduct(const double *v0, ae_int_t stride0, const double *v1, ae_int_t stride1, ae_int_t n) { double result = 0; ae_int_t i; if( stride0!=1 || stride1!=1 ) { /* * slow general code */ for(i=0; iba, 0, DT_BOOL, _state); ae_vector_init(&p->ia, 0, DT_INT, _state); ae_vector_init(&p->ra, 0, DT_REAL, _state); ae_vector_init(&p->ca, 0, DT_COMPLEX, _state); } void _rcommstate_init_copy(rcommstate* dst, rcommstate* src, ae_state *_state) { ae_vector_init_copy(&dst->ba, &src->ba, _state); ae_vector_init_copy(&dst->ia, &src->ia, _state); 
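/*
 * EDITORIAL NOTE - illustrative sketch, not part of the ALGLIB sources.
 * rcommstate is the reverse-communication record used by ALGLIB's iterative
 * algorithms: the stage field remembers where the algorithm must resume, and
 * the ba/ia/ra/ca vectors preserve boolean/integer/real/complex locals between
 * calls. A caller of a hypothetical rcomm-style solver would drive it with a
 * loop along these lines (somestate/someiteration are placeholder names):
 *
 *     somestate state;                        // hypothetical solver state holding an rcommstate
 *     while( someiteration(&state, _state) )  // returns ae_false when the solver is done
 *     {
 *         // the solver paused at state.rstate.stage; supply whatever it
 *         // requested (e.g. the function value at state.x), then loop again
 *     }
 */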
ae_vector_init_copy(&dst->ra, &src->ra, _state); ae_vector_init_copy(&dst->ca, &src->ca, _state); dst->stage = src->stage; } void _rcommstate_clear(rcommstate* p) { ae_vector_clear(&p->ba); ae_vector_clear(&p->ia); ae_vector_clear(&p->ra); ae_vector_clear(&p->ca); } void _rcommstate_destroy(rcommstate* p) { _rcommstate_clear(p); } #ifdef AE_DEBUG4WINDOWS int _tickcount() { return GetTickCount(); } #endif #ifdef AE_DEBUG4POSIX #include int _tickcount() { struct timeval now; ae_int64_t r, v; gettimeofday(&now, NULL); v = now.tv_sec; r = v*1000; v = now.tv_usec/1000; r = r+v; return r; /*struct timespec now; if (clock_gettime(CLOCK_MONOTONIC, &now) ) return 0; return now.tv_sec * 1000.0 + now.tv_nsec / 1000000.0;*/ } #endif } ///////////////////////////////////////////////////////////////////////// // // THIS SECTION CONTAINS C++ RELATED FUNCTIONALITY // ///////////////////////////////////////////////////////////////////////// /******************************************************************** Internal forwards ********************************************************************/ namespace alglib { double get_aenv_nan(); double get_aenv_posinf(); double get_aenv_neginf(); ae_int_t my_stricmp(const char *s1, const char *s2); char* filter_spaces(const char *s); void str_vector_create(const char *src, bool match_head_only, std::vector *p_vec); void str_matrix_create(const char *src, std::vector< std::vector > *p_mat); ae_bool parse_bool_delim(const char *s, const char *delim); ae_int_t parse_int_delim(const char *s, const char *delim); bool _parse_real_delim(const char *s, const char *delim, double *result, const char **new_s); double parse_real_delim(const char *s, const char *delim); alglib::complex parse_complex_delim(const char *s, const char *delim); std::string arraytostring(const bool *ptr, ae_int_t n); std::string arraytostring(const ae_int_t *ptr, ae_int_t n); std::string arraytostring(const double *ptr, ae_int_t n, int dps); std::string arraytostring(const alglib::complex *ptr, ae_int_t n, int dps); } /******************************************************************** Global and local constants ********************************************************************/ const double alglib::machineepsilon = 5E-16; const double alglib::maxrealnumber = 1E300; const double alglib::minrealnumber = 1E-300; const alglib::ae_int_t alglib::endianness = alglib_impl::ae_get_endianness(); const double alglib::fp_nan = alglib::get_aenv_nan(); const double alglib::fp_posinf = alglib::get_aenv_posinf(); const double alglib::fp_neginf = alglib::get_aenv_neginf(); /******************************************************************** ap_error ********************************************************************/ alglib::ap_error::ap_error() { } alglib::ap_error::ap_error(const char *s) { msg = s; } void alglib::ap_error::make_assertion(bool bClause) { if(!bClause) throw ap_error(); } void alglib::ap_error::make_assertion(bool bClause, const char *p_msg) { if(!bClause) throw ap_error(p_msg); } /******************************************************************** Complex number with double precision. 
********************************************************************/ alglib::complex::complex():x(0.0),y(0.0) { } alglib::complex::complex(const double &_x):x(_x),y(0.0) { } alglib::complex::complex(const double &_x, const double &_y):x(_x),y(_y) { } alglib::complex::complex(const alglib::complex &z):x(z.x),y(z.y) { } alglib::complex& alglib::complex::operator= (const double& v) { x = v; y = 0.0; return *this; } alglib::complex& alglib::complex::operator+=(const double& v) { x += v; return *this; } alglib::complex& alglib::complex::operator-=(const double& v) { x -= v; return *this; } alglib::complex& alglib::complex::operator*=(const double& v) { x *= v; y *= v; return *this; } alglib::complex& alglib::complex::operator/=(const double& v) { x /= v; y /= v; return *this; } alglib::complex& alglib::complex::operator= (const alglib::complex& z) { x = z.x; y = z.y; return *this; } alglib::complex& alglib::complex::operator+=(const alglib::complex& z) { x += z.x; y += z.y; return *this; } alglib::complex& alglib::complex::operator-=(const alglib::complex& z) { x -= z.x; y -= z.y; return *this; } alglib::complex& alglib::complex::operator*=(const alglib::complex& z) { double t = x*z.x-y*z.y; y = x*z.y+y*z.x; x = t; return *this; } alglib::complex& alglib::complex::operator/=(const alglib::complex& z) { alglib::complex result; double e; double f; if( fabs(z.y)=0 ? _dps : -_dps; if( dps<=0 || dps>=20 ) throw ap_error("complex::tostring(): incorrect dps"); // handle IEEE special quantities if( fp_isnan(x) || fp_isnan(y) ) return "NAN"; if( fp_isinf(x) || fp_isinf(y) ) return "INF"; // generate mask if( sprintf(mask, "%%.%d%s", dps, _dps>=0 ? "f" : "e")>=(int)sizeof(mask) ) throw ap_error("complex::tostring(): buffer overflow"); // print |x|, |y| and zero with same mask and compare if( sprintf(buf_x, mask, (double)(fabs(x)))>=(int)sizeof(buf_x) ) throw ap_error("complex::tostring(): buffer overflow"); if( sprintf(buf_y, mask, (double)(fabs(y)))>=(int)sizeof(buf_y) ) throw ap_error("complex::tostring(): buffer overflow"); if( sprintf(buf_zero, mask, (double)0)>=(int)sizeof(buf_zero) ) throw ap_error("complex::tostring(): buffer overflow"); // different zero/nonzero patterns if( strcmp(buf_x,buf_zero)!=0 && strcmp(buf_y,buf_zero)!=0 ) return std::string(x>0 ? "" : "-")+buf_x+(y>0 ? "+" : "-")+buf_y+"i"; if( strcmp(buf_x,buf_zero)!=0 && strcmp(buf_y,buf_zero)==0 ) return std::string(x>0 ? "" : "-")+buf_x; if( strcmp(buf_x,buf_zero)==0 && strcmp(buf_y,buf_zero)!=0 ) return std::string(y>0 ? 
"" : "-")+buf_y+"i"; return std::string("0"); } const bool alglib::operator==(const alglib::complex& lhs, const alglib::complex& rhs) { volatile double x1 = lhs.x; volatile double x2 = rhs.x; volatile double y1 = lhs.y; volatile double y2 = rhs.y; return x1==x2 && y1==y2; } const bool alglib::operator!=(const alglib::complex& lhs, const alglib::complex& rhs) { return !(lhs==rhs); } const alglib::complex alglib::operator+(const alglib::complex& lhs) { return lhs; } const alglib::complex alglib::operator-(const alglib::complex& lhs) { return alglib::complex(-lhs.x, -lhs.y); } const alglib::complex alglib::operator+(const alglib::complex& lhs, const alglib::complex& rhs) { alglib::complex r = lhs; r += rhs; return r; } const alglib::complex alglib::operator+(const alglib::complex& lhs, const double& rhs) { alglib::complex r = lhs; r += rhs; return r; } const alglib::complex alglib::operator+(const double& lhs, const alglib::complex& rhs) { alglib::complex r = rhs; r += lhs; return r; } const alglib::complex alglib::operator-(const alglib::complex& lhs, const alglib::complex& rhs) { alglib::complex r = lhs; r -= rhs; return r; } const alglib::complex alglib::operator-(const alglib::complex& lhs, const double& rhs) { alglib::complex r = lhs; r -= rhs; return r; } const alglib::complex alglib::operator-(const double& lhs, const alglib::complex& rhs) { alglib::complex r = lhs; r -= rhs; return r; } const alglib::complex alglib::operator*(const alglib::complex& lhs, const alglib::complex& rhs) { return alglib::complex(lhs.x*rhs.x - lhs.y*rhs.y, lhs.x*rhs.y + lhs.y*rhs.x); } const alglib::complex alglib::operator*(const alglib::complex& lhs, const double& rhs) { return alglib::complex(lhs.x*rhs, lhs.y*rhs); } const alglib::complex alglib::operator*(const double& lhs, const alglib::complex& rhs) { return alglib::complex(lhs*rhs.x, lhs*rhs.y); } const alglib::complex alglib::operator/(const alglib::complex& lhs, const alglib::complex& rhs) { alglib::complex result; double e; double f; if( fabs(rhs.y)yabs ? 
xabs : yabs; v = xabsx; v0y = -v0->y; v1x = v1->x; v1y = -v1->y; rx += v0x*v1x-v0y*v1y; ry += v0x*v1y+v0y*v1x; } } if( !bconj0 && bconj1 ) { double v0x, v0y, v1x, v1y; for(i=0; ix; v0y = v0->y; v1x = v1->x; v1y = -v1->y; rx += v0x*v1x-v0y*v1y; ry += v0x*v1y+v0y*v1x; } } if( bconj0 && !bconj1 ) { double v0x, v0y, v1x, v1y; for(i=0; ix; v0y = -v0->y; v1x = v1->x; v1y = v1->y; rx += v0x*v1x-v0y*v1y; ry += v0x*v1y+v0y*v1x; } } if( !bconj0 && !bconj1 ) { double v0x, v0y, v1x, v1y; for(i=0; ix; v0y = v0->y; v1x = v1->x; v1y = v1->y; rx += v0x*v1x-v0y*v1y; ry += v0x*v1y+v0y*v1x; } } return alglib::complex(rx,ry); } alglib::complex alglib::vdotproduct(const alglib::complex *v1, const alglib::complex *v2, ae_int_t N) { return vdotproduct(v1, 1, "N", v2, 1, "N", N); } void alglib::vmove(double *vdst, ae_int_t stride_dst, const double* vsrc, ae_int_t stride_src, ae_int_t n) { ae_int_t i; if( stride_dst!=1 || stride_src!=1 ) { // // general unoptimized case // for(i=0; ix = vsrc->x; vdst->y = -vsrc->y; } } else { for(i=0; ix = vsrc->x; vdst->y = -vsrc->y; } } else { for(i=0; ix = -vsrc->x; vdst->y = vsrc->y; } } else { for(i=0; ix = -vsrc->x; vdst->y = -vsrc->y; } } } else { // // optimized case // if( bconj ) { for(i=0; ix = -vsrc->x; vdst->y = vsrc->y; } } else { for(i=0; ix = -vsrc->x; vdst->y = -vsrc->y; } } } } void alglib::vmoveneg(alglib::complex *vdst, const alglib::complex *vsrc, ae_int_t N) { vmoveneg(vdst, 1, vsrc, 1, "N", N); } void alglib::vmove(double *vdst, ae_int_t stride_dst, const double* vsrc, ae_int_t stride_src, ae_int_t n, double alpha) { ae_int_t i; if( stride_dst!=1 || stride_src!=1 ) { // // general unoptimized case // for(i=0; ix = alpha*vsrc->x; vdst->y = -alpha*vsrc->y; } } else { for(i=0; ix = alpha*vsrc->x; vdst->y = alpha*vsrc->y; } } } else { // // optimized case // if( bconj ) { for(i=0; ix = alpha*vsrc->x; vdst->y = -alpha*vsrc->y; } } else { for(i=0; ix = alpha*vsrc->x; vdst->y = alpha*vsrc->y; } } } } void alglib::vmove(alglib::complex *vdst, const alglib::complex *vsrc, ae_int_t N, double alpha) { vmove(vdst, 1, vsrc, 1, "N", N, alpha); } void alglib::vmove(alglib::complex *vdst, ae_int_t stride_dst, const alglib::complex* vsrc, ae_int_t stride_src, const char *conj_src, ae_int_t n, alglib::complex alpha) { bool bconj = !((conj_src[0]=='N') || (conj_src[0]=='n')); ae_int_t i; if( stride_dst!=1 || stride_src!=1 ) { // // general unoptimized case // if( bconj ) { double ax = alpha.x, ay = alpha.y; for(i=0; ix = ax*vsrc->x+ay*vsrc->y; vdst->y = -ax*vsrc->y+ay*vsrc->x; } } else { double ax = alpha.x, ay = alpha.y; for(i=0; ix = ax*vsrc->x-ay*vsrc->y; vdst->y = ax*vsrc->y+ay*vsrc->x; } } } else { // // optimized case // if( bconj ) { double ax = alpha.x, ay = alpha.y; for(i=0; ix = ax*vsrc->x+ay*vsrc->y; vdst->y = -ax*vsrc->y+ay*vsrc->x; } } else { double ax = alpha.x, ay = alpha.y; for(i=0; ix = ax*vsrc->x-ay*vsrc->y; vdst->y = ax*vsrc->y+ay*vsrc->x; } } } } void alglib::vmove(alglib::complex *vdst, const alglib::complex *vsrc, ae_int_t N, alglib::complex alpha) { vmove(vdst, 1, vsrc, 1, "N", N, alpha); } void alglib::vadd(double *vdst, ae_int_t stride_dst, const double *vsrc, ae_int_t stride_src, ae_int_t n) { ae_int_t i; if( stride_dst!=1 || stride_src!=1 ) { // // general unoptimized case // for(i=0; ix += vsrc->x; vdst->y -= vsrc->y; } } else { for(i=0; ix += vsrc->x; vdst->y += vsrc->y; } } } else { // // optimized case // if( bconj ) { for(i=0; ix += vsrc->x; vdst->y -= vsrc->y; } } else { for(i=0; ix += vsrc->x; vdst->y += vsrc->y; } } } } void 
alglib::vadd(alglib::complex *vdst, const alglib::complex *vsrc, ae_int_t N) { vadd(vdst, 1, vsrc, 1, "N", N); } void alglib::vadd(double *vdst, ae_int_t stride_dst, const double *vsrc, ae_int_t stride_src, ae_int_t n, double alpha) { ae_int_t i; if( stride_dst!=1 || stride_src!=1 ) { // // general unoptimized case // for(i=0; ix += alpha*vsrc->x; vdst->y -= alpha*vsrc->y; } } else { for(i=0; ix += alpha*vsrc->x; vdst->y += alpha*vsrc->y; } } } else { // // optimized case // if( bconj ) { for(i=0; ix += alpha*vsrc->x; vdst->y -= alpha*vsrc->y; } } else { for(i=0; ix += alpha*vsrc->x; vdst->y += alpha*vsrc->y; } } } } void alglib::vadd(alglib::complex *vdst, const alglib::complex *vsrc, ae_int_t N, double alpha) { vadd(vdst, 1, vsrc, 1, "N", N, alpha); } void alglib::vadd(alglib::complex *vdst, ae_int_t stride_dst, const alglib::complex *vsrc, ae_int_t stride_src, const char *conj_src, ae_int_t n, alglib::complex alpha) { bool bconj = !((conj_src[0]=='N') || (conj_src[0]=='n')); ae_int_t i; if( stride_dst!=1 || stride_src!=1 ) { // // general unoptimized case // double ax = alpha.x, ay = alpha.y; if( bconj ) { for(i=0; ix += ax*vsrc->x+ay*vsrc->y; vdst->y -= ax*vsrc->y-ay*vsrc->x; } } else { for(i=0; ix += ax*vsrc->x-ay*vsrc->y; vdst->y += ax*vsrc->y+ay*vsrc->x; } } } else { // // optimized case // double ax = alpha.x, ay = alpha.y; if( bconj ) { for(i=0; ix += ax*vsrc->x+ay*vsrc->y; vdst->y -= ax*vsrc->y-ay*vsrc->x; } } else { for(i=0; ix += ax*vsrc->x-ay*vsrc->y; vdst->y += ax*vsrc->y+ay*vsrc->x; } } } } void alglib::vadd(alglib::complex *vdst, const alglib::complex *vsrc, ae_int_t N, alglib::complex alpha) { vadd(vdst, 1, vsrc, 1, "N", N, alpha); } void alglib::vsub(double *vdst, ae_int_t stride_dst, const double *vsrc, ae_int_t stride_src, ae_int_t n) { ae_int_t i; if( stride_dst!=1 || stride_src!=1 ) { // // general unoptimized case // for(i=0; ix -= vsrc->x; vdst->y += vsrc->y; } } else { for(i=0; ix -= vsrc->x; vdst->y -= vsrc->y; } } } else { // // optimized case // if( bconj ) { for(i=0; ix -= vsrc->x; vdst->y += vsrc->y; } } else { for(i=0; ix -= vsrc->x; vdst->y -= vsrc->y; } } } } void alglib::vsub(alglib::complex *vdst, const alglib::complex *vsrc, ae_int_t N) { vsub(vdst, 1, vsrc, 1, "N", N); } void alglib::vsub(double *vdst, ae_int_t stride_dst, const double *vsrc, ae_int_t stride_src, ae_int_t n, double alpha) { vadd(vdst, stride_dst, vsrc, stride_src, n, -alpha); } void alglib::vsub(double *vdst, const double *vsrc, ae_int_t N, double alpha) { vadd(vdst, 1, vsrc, 1, N, -alpha); } void alglib::vsub(alglib::complex *vdst, ae_int_t stride_dst, const alglib::complex *vsrc, ae_int_t stride_src, const char *conj_src, ae_int_t n, double alpha) { vadd(vdst, stride_dst, vsrc, stride_src, conj_src, n, -alpha); } void alglib::vsub(alglib::complex *vdst, const alglib::complex *vsrc, ae_int_t n, double alpha) { vadd(vdst, 1, vsrc, 1, "N", n, -alpha); } void alglib::vsub(alglib::complex *vdst, ae_int_t stride_dst, const alglib::complex *vsrc, ae_int_t stride_src, const char *conj_src, ae_int_t n, alglib::complex alpha) { vadd(vdst, stride_dst, vsrc, stride_src, conj_src, n, -alpha); } void alglib::vsub(alglib::complex *vdst, const alglib::complex *vsrc, ae_int_t n, alglib::complex alpha) { vadd(vdst, 1, vsrc, 1, "N", n, -alpha); } void alglib::vmul(double *vdst, ae_int_t stride_dst, ae_int_t n, double alpha) { ae_int_t i; if( stride_dst!=1 ) { // // general unoptimized case // for(i=0; ix *= alpha; vdst->y *= alpha; } } else { // // optimized case // for(i=0; ix *= alpha; vdst->y *= 
alpha; } } } void alglib::vmul(alglib::complex *vdst, ae_int_t N, double alpha) { vmul(vdst, 1, N, alpha); } void alglib::vmul(alglib::complex *vdst, ae_int_t stride_dst, ae_int_t n, alglib::complex alpha) { ae_int_t i; if( stride_dst!=1 ) { // // general unoptimized case // double ax = alpha.x, ay = alpha.y; for(i=0; ix, dsty = vdst->y; vdst->x = ax*dstx-ay*dsty; vdst->y = ax*dsty+ay*dstx; } } else { // // optimized case // double ax = alpha.x, ay = alpha.y; for(i=0; ix, dsty = vdst->y; vdst->x = ax*dstx-ay*dsty; vdst->y = ax*dsty+ay*dstx; } } } void alglib::vmul(alglib::complex *vdst, ae_int_t N, alglib::complex alpha) { vmul(vdst, 1, N, alpha); } /******************************************************************** Matrices and vectors ********************************************************************/ alglib::ae_vector_wrapper::ae_vector_wrapper() { p_vec = NULL; } alglib::ae_vector_wrapper::~ae_vector_wrapper() { if( p_vec==&vec ) ae_vector_clear(p_vec); } void alglib::ae_vector_wrapper::setlength(ae_int_t iLen) { if( p_vec==NULL ) throw alglib::ap_error("ALGLIB: setlength() error, p_vec==NULL (array was not correctly initialized)"); if( p_vec!=&vec ) throw alglib::ap_error("ALGLIB: setlength() error, p_vec!=&vec (attempt to resize frozen array)"); if( !ae_vector_set_length(p_vec, iLen, NULL) ) throw alglib::ap_error("ALGLIB: malloc error"); } alglib::ae_int_t alglib::ae_vector_wrapper::length() const { if( p_vec==NULL ) return 0; return p_vec->cnt; } void alglib::ae_vector_wrapper::attach_to(alglib_impl::ae_vector *ptr) { if( ptr==&vec ) throw alglib::ap_error("ALGLIB: attempt to attach vector to itself"); if( p_vec==&vec ) ae_vector_clear(p_vec); p_vec = ptr; } void alglib::ae_vector_wrapper::allocate_own(ae_int_t size, alglib_impl::ae_datatype datatype) { if( p_vec==&vec ) ae_vector_clear(p_vec); p_vec = &vec; ae_vector_init(p_vec, size, datatype, NULL); } const alglib_impl::ae_vector* alglib::ae_vector_wrapper::c_ptr() const { return p_vec; } alglib_impl::ae_vector* alglib::ae_vector_wrapper::c_ptr() { return p_vec; } void alglib::ae_vector_wrapper::create(const alglib::ae_vector_wrapper &rhs) { if( rhs.p_vec!=NULL ) { p_vec = &vec; ae_vector_init_copy(p_vec, rhs.p_vec, NULL); } else p_vec = NULL; } void alglib::ae_vector_wrapper::create(const char *s, alglib_impl::ae_datatype datatype) { std::vector svec; size_t i; char *p = filter_spaces(s); try { str_vector_create(p, true, &svec); allocate_own((ae_int_t)(svec.size()), datatype); for(i=0; iptr.p_bool[i] = parse_bool_delim(svec[i],",]"); if( datatype==alglib_impl::DT_INT ) p_vec->ptr.p_int[i] = parse_int_delim(svec[i],",]"); if( datatype==alglib_impl::DT_REAL ) p_vec->ptr.p_double[i] = parse_real_delim(svec[i],",]"); if( datatype==alglib_impl::DT_COMPLEX ) { alglib::complex t = parse_complex_delim(svec[i],",]"); p_vec->ptr.p_complex[i].x = t.x; p_vec->ptr.p_complex[i].y = t.y; } } alglib_impl::ae_free(p); } catch(...) 
{ alglib_impl::ae_free(p); throw; } } void alglib::ae_vector_wrapper::assign(const alglib::ae_vector_wrapper &rhs) { if( this==&rhs ) return; if( p_vec==&vec || p_vec==NULL ) { // // Assignment to non-proxy object // ae_vector_clear(p_vec); if( rhs.p_vec!=NULL ) { p_vec = &vec; ae_vector_init_copy(p_vec, rhs.p_vec, NULL); } else p_vec = NULL; } else { // // Assignment to proxy object // if( rhs.p_vec==NULL ) throw alglib::ap_error("ALGLIB: incorrect assignment to array (sizes dont match)"); if( rhs.p_vec->datatype!=p_vec->datatype ) throw alglib::ap_error("ALGLIB: incorrect assignment to array (types dont match)"); if( rhs.p_vec->cnt!=p_vec->cnt ) throw alglib::ap_error("ALGLIB: incorrect assignment to array (sizes dont match)"); memcpy(p_vec->ptr.p_ptr, rhs.p_vec->ptr.p_ptr, p_vec->cnt*alglib_impl::ae_sizeof(p_vec->datatype)); } } alglib::boolean_1d_array::boolean_1d_array() { allocate_own(0, alglib_impl::DT_BOOL); } alglib::boolean_1d_array::boolean_1d_array(const char *s) { create(s, alglib_impl::DT_BOOL); } alglib::boolean_1d_array::boolean_1d_array(const alglib::boolean_1d_array &rhs) { create(rhs); } alglib::boolean_1d_array::boolean_1d_array(alglib_impl::ae_vector *p) { p_vec = NULL; attach_to(p); } const alglib::boolean_1d_array& alglib::boolean_1d_array::operator=(const alglib::boolean_1d_array &rhs) { assign(rhs); return *this; } alglib::boolean_1d_array::~boolean_1d_array() { } const ae_bool& alglib::boolean_1d_array::operator()(ae_int_t i) const { return p_vec->ptr.p_bool[i]; } ae_bool& alglib::boolean_1d_array::operator()(ae_int_t i) { return p_vec->ptr.p_bool[i]; } const ae_bool& alglib::boolean_1d_array::operator[](ae_int_t i) const { return p_vec->ptr.p_bool[i]; } ae_bool& alglib::boolean_1d_array::operator[](ae_int_t i) { return p_vec->ptr.p_bool[i]; } void alglib::boolean_1d_array::setcontent(ae_int_t iLen, const bool *pContent ) { ae_int_t i; setlength(iLen); for(i=0; iptr.p_bool[i] = pContent[i]; } ae_bool* alglib::boolean_1d_array::getcontent() { return p_vec->ptr.p_bool; } const ae_bool* alglib::boolean_1d_array::getcontent() const { return p_vec->ptr.p_bool; } std::string alglib::boolean_1d_array::tostring() const { if( length()==0 ) return "[]"; return arraytostring(&(operator()(0)), length()); } alglib::integer_1d_array::integer_1d_array() { allocate_own(0, alglib_impl::DT_INT); } alglib::integer_1d_array::integer_1d_array(alglib_impl::ae_vector *p) { p_vec = NULL; attach_to(p); } alglib::integer_1d_array::integer_1d_array(const char *s) { create(s, alglib_impl::DT_INT); } alglib::integer_1d_array::integer_1d_array(const alglib::integer_1d_array &rhs) { create(rhs); } const alglib::integer_1d_array& alglib::integer_1d_array::operator=(const alglib::integer_1d_array &rhs) { assign(rhs); return *this; } alglib::integer_1d_array::~integer_1d_array() { } const alglib::ae_int_t& alglib::integer_1d_array::operator()(ae_int_t i) const { return p_vec->ptr.p_int[i]; } alglib::ae_int_t& alglib::integer_1d_array::operator()(ae_int_t i) { return p_vec->ptr.p_int[i]; } const alglib::ae_int_t& alglib::integer_1d_array::operator[](ae_int_t i) const { return p_vec->ptr.p_int[i]; } alglib::ae_int_t& alglib::integer_1d_array::operator[](ae_int_t i) { return p_vec->ptr.p_int[i]; } void alglib::integer_1d_array::setcontent(ae_int_t iLen, const ae_int_t *pContent ) { ae_int_t i; setlength(iLen); for(i=0; iptr.p_int[i] = pContent[i]; } alglib::ae_int_t* alglib::integer_1d_array::getcontent() { return p_vec->ptr.p_int; } const alglib::ae_int_t* alglib::integer_1d_array::getcontent() 
const { return p_vec->ptr.p_int; } std::string alglib::integer_1d_array::tostring() const { if( length()==0 ) return "[]"; return arraytostring(&operator()(0), length()); } alglib::real_1d_array::real_1d_array() { allocate_own(0, alglib_impl::DT_REAL); } alglib::real_1d_array::real_1d_array(alglib_impl::ae_vector *p) { p_vec = NULL; attach_to(p); } alglib::real_1d_array::real_1d_array(const char *s) { create(s, alglib_impl::DT_REAL); } alglib::real_1d_array::real_1d_array(const alglib::real_1d_array &rhs) { create(rhs); } const alglib::real_1d_array& alglib::real_1d_array::operator=(const alglib::real_1d_array &rhs) { assign(rhs); return *this; } alglib::real_1d_array::~real_1d_array() { } const double& alglib::real_1d_array::operator()(ae_int_t i) const { return p_vec->ptr.p_double[i]; } double& alglib::real_1d_array::operator()(ae_int_t i) { return p_vec->ptr.p_double[i]; } const double& alglib::real_1d_array::operator[](ae_int_t i) const { return p_vec->ptr.p_double[i]; } double& alglib::real_1d_array::operator[](ae_int_t i) { return p_vec->ptr.p_double[i]; } void alglib::real_1d_array::setcontent(ae_int_t iLen, const double *pContent ) { ae_int_t i; setlength(iLen); for(i=0; iptr.p_double[i] = pContent[i]; } double* alglib::real_1d_array::getcontent() { return p_vec->ptr.p_double; } const double* alglib::real_1d_array::getcontent() const { return p_vec->ptr.p_double; } std::string alglib::real_1d_array::tostring(int dps) const { if( length()==0 ) return "[]"; return arraytostring(&operator()(0), length(), dps); } alglib::complex_1d_array::complex_1d_array() { allocate_own(0, alglib_impl::DT_COMPLEX); } alglib::complex_1d_array::complex_1d_array(alglib_impl::ae_vector *p) { p_vec = NULL; attach_to(p); } alglib::complex_1d_array::complex_1d_array(const char *s) { create(s, alglib_impl::DT_COMPLEX); } alglib::complex_1d_array::complex_1d_array(const alglib::complex_1d_array &rhs) { create(rhs); } const alglib::complex_1d_array& alglib::complex_1d_array::operator=(const alglib::complex_1d_array &rhs) { assign(rhs); return *this; } alglib::complex_1d_array::~complex_1d_array() { } const alglib::complex& alglib::complex_1d_array::operator()(ae_int_t i) const { return *((const alglib::complex*)(p_vec->ptr.p_complex+i)); } alglib::complex& alglib::complex_1d_array::operator()(ae_int_t i) { return *((alglib::complex*)(p_vec->ptr.p_complex+i)); } const alglib::complex& alglib::complex_1d_array::operator[](ae_int_t i) const { return *((const alglib::complex*)(p_vec->ptr.p_complex+i)); } alglib::complex& alglib::complex_1d_array::operator[](ae_int_t i) { return *((alglib::complex*)(p_vec->ptr.p_complex+i)); } void alglib::complex_1d_array::setcontent(ae_int_t iLen, const alglib::complex *pContent ) { ae_int_t i; setlength(iLen); for(i=0; iptr.p_complex[i].x = pContent[i].x; p_vec->ptr.p_complex[i].y = pContent[i].y; } } alglib::complex* alglib::complex_1d_array::getcontent() { return (alglib::complex*)p_vec->ptr.p_complex; } const alglib::complex* alglib::complex_1d_array::getcontent() const { return (const alglib::complex*)p_vec->ptr.p_complex; } std::string alglib::complex_1d_array::tostring(int dps) const { if( length()==0 ) return "[]"; return arraytostring(&operator()(0), length(), dps); } alglib::ae_matrix_wrapper::ae_matrix_wrapper() { p_mat = NULL; } alglib::ae_matrix_wrapper::~ae_matrix_wrapper() { if( p_mat==&mat ) ae_matrix_clear(p_mat); } const alglib::ae_matrix_wrapper& alglib::ae_matrix_wrapper::operator=(const alglib::ae_matrix_wrapper &rhs) { assign(rhs); return *this; } void 
alglib::ae_matrix_wrapper::create(const ae_matrix_wrapper &rhs) { if( rhs.p_mat!=NULL ) { p_mat = &mat; ae_matrix_init_copy(p_mat, rhs.p_mat, NULL); } else p_mat = NULL; } void alglib::ae_matrix_wrapper::create(const char *s, alglib_impl::ae_datatype datatype) { std::vector< std::vector > smat; size_t i, j; char *p = filter_spaces(s); try { str_matrix_create(p, &smat); if( smat.size()!=0 ) { allocate_own((ae_int_t)(smat.size()), (ae_int_t)(smat[0].size()), datatype); for(i=0; iptr.pp_bool[i][j] = parse_bool_delim(smat[i][j],",]"); if( datatype==alglib_impl::DT_INT ) p_mat->ptr.pp_int[i][j] = parse_int_delim(smat[i][j],",]"); if( datatype==alglib_impl::DT_REAL ) p_mat->ptr.pp_double[i][j] = parse_real_delim(smat[i][j],",]"); if( datatype==alglib_impl::DT_COMPLEX ) { alglib::complex t = parse_complex_delim(smat[i][j],",]"); p_mat->ptr.pp_complex[i][j].x = t.x; p_mat->ptr.pp_complex[i][j].y = t.y; } } } else allocate_own(0, 0, datatype); alglib_impl::ae_free(p); } catch(...) { alglib_impl::ae_free(p); throw; } } void alglib::ae_matrix_wrapper::assign(const alglib::ae_matrix_wrapper &rhs) { if( this==&rhs ) return; if( p_mat==&mat || p_mat==NULL ) { // // Assignment to non-proxy object // ae_matrix_clear(p_mat); if( rhs.p_mat!=NULL ) { p_mat = &mat; ae_matrix_init_copy(p_mat, rhs.p_mat, NULL); } else p_mat = NULL; } else { // // Assignment to proxy object // ae_int_t i; if( rhs.p_mat==NULL ) throw alglib::ap_error("ALGLIB: incorrect assignment to array (sizes dont match)"); if( rhs.p_mat->datatype!=p_mat->datatype ) throw alglib::ap_error("ALGLIB: incorrect assignment to array (types dont match)"); if( rhs.p_mat->rows!=p_mat->rows ) throw alglib::ap_error("ALGLIB: incorrect assignment to array (sizes dont match)"); if( rhs.p_mat->cols!=p_mat->cols ) throw alglib::ap_error("ALGLIB: incorrect assignment to array (sizes dont match)"); for(i=0; irows; i++) memcpy(p_mat->ptr.pp_void[i], rhs.p_mat->ptr.pp_void[i], p_mat->cols*alglib_impl::ae_sizeof(p_mat->datatype)); } } void alglib::ae_matrix_wrapper::setlength(ae_int_t rows, ae_int_t cols) { if( p_mat==NULL ) throw alglib::ap_error("ALGLIB: setlength() error, p_mat==NULL (array was not correctly initialized)"); if( p_mat!=&mat ) throw alglib::ap_error("ALGLIB: setlength() error, p_mat!=&mat (attempt to resize frozen array)"); if( !ae_matrix_set_length(p_mat, rows, cols, NULL) ) throw alglib::ap_error("ALGLIB: malloc error"); } alglib::ae_int_t alglib::ae_matrix_wrapper::rows() const { if( p_mat==NULL ) return 0; return p_mat->rows; } alglib::ae_int_t alglib::ae_matrix_wrapper::cols() const { if( p_mat==NULL ) return 0; return p_mat->cols; } bool alglib::ae_matrix_wrapper::isempty() const { return rows()==0 || cols()==0; } alglib::ae_int_t alglib::ae_matrix_wrapper::getstride() const { if( p_mat==NULL ) return 0; return p_mat->stride; } void alglib::ae_matrix_wrapper::attach_to(alglib_impl::ae_matrix *ptr) { if( ptr==&mat ) throw alglib::ap_error("ALGLIB: attempt to attach matrix to itself"); if( p_mat==&mat ) ae_matrix_clear(p_mat); p_mat = ptr; } void alglib::ae_matrix_wrapper::allocate_own(ae_int_t rows, ae_int_t cols, alglib_impl::ae_datatype datatype) { if( p_mat==&mat ) ae_matrix_clear(p_mat); p_mat = &mat; ae_matrix_init(p_mat, rows, cols, datatype, NULL); } const alglib_impl::ae_matrix* alglib::ae_matrix_wrapper::c_ptr() const { return p_mat; } alglib_impl::ae_matrix* alglib::ae_matrix_wrapper::c_ptr() { return p_mat; } alglib::boolean_2d_array::boolean_2d_array() { allocate_own(0, 0, alglib_impl::DT_BOOL); } 
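/*
 * EDITORIAL NOTE - illustrative usage sketch, not part of the ALGLIB sources.
 * The 2D wrappers defined in this section (boolean_2d_array, integer_2d_array,
 * real_2d_array, complex_2d_array) either own their alglib_impl::ae_matrix
 * (default/copy/string construction, setlength, setcontent) or act as
 * non-resizable proxies attached to a matrix owned by the library (attach_to).
 * Typical user-level code relying only on the interface shown here:
 *
 *     alglib::real_2d_array a("[[1,2],[3,4]]");  // parsed via create(s, DT_REAL)
 *     double vals[] = {1.0, 2.0, 3.0, 4.0};
 *     alglib::real_2d_array b;
 *     b.setcontent(2, 2, vals);                  // owning array, row-major copy
 *     b(0,0) = a[1][1];                          // operator() and operator[] access
 *     std::string s = b.tostring(3);             // 3 digits after the decimal point
 */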
alglib::boolean_2d_array::boolean_2d_array(const alglib::boolean_2d_array &rhs) { create(rhs); } alglib::boolean_2d_array::boolean_2d_array(alglib_impl::ae_matrix *p) { p_mat = NULL; attach_to(p); } alglib::boolean_2d_array::boolean_2d_array(const char *s) { create(s, alglib_impl::DT_BOOL); } alglib::boolean_2d_array::~boolean_2d_array() { } const ae_bool& alglib::boolean_2d_array::operator()(ae_int_t i, ae_int_t j) const { return p_mat->ptr.pp_bool[i][j]; } ae_bool& alglib::boolean_2d_array::operator()(ae_int_t i, ae_int_t j) { return p_mat->ptr.pp_bool[i][j]; } const ae_bool* alglib::boolean_2d_array::operator[](ae_int_t i) const { return p_mat->ptr.pp_bool[i]; } ae_bool* alglib::boolean_2d_array::operator[](ae_int_t i) { return p_mat->ptr.pp_bool[i]; } void alglib::boolean_2d_array::setcontent(ae_int_t irows, ae_int_t icols, const bool *pContent ) { ae_int_t i, j; setlength(irows, icols); for(i=0; iptr.pp_bool[i][j] = pContent[i*icols+j]; } std::string alglib::boolean_2d_array::tostring() const { std::string result; ae_int_t i; if( isempty() ) return "[[]]"; result = "["; for(i=0; iptr.pp_int[i][j]; } alglib::ae_int_t& alglib::integer_2d_array::operator()(ae_int_t i, ae_int_t j) { return p_mat->ptr.pp_int[i][j]; } const alglib::ae_int_t* alglib::integer_2d_array::operator[](ae_int_t i) const { return p_mat->ptr.pp_int[i]; } alglib::ae_int_t* alglib::integer_2d_array::operator[](ae_int_t i) { return p_mat->ptr.pp_int[i]; } void alglib::integer_2d_array::setcontent(ae_int_t irows, ae_int_t icols, const ae_int_t *pContent ) { ae_int_t i, j; setlength(irows, icols); for(i=0; iptr.pp_int[i][j] = pContent[i*icols+j]; } std::string alglib::integer_2d_array::tostring() const { std::string result; ae_int_t i; if( isempty() ) return "[[]]"; result = "["; for(i=0; iptr.pp_double[i][j]; } double& alglib::real_2d_array::operator()(ae_int_t i, ae_int_t j) { return p_mat->ptr.pp_double[i][j]; } const double* alglib::real_2d_array::operator[](ae_int_t i) const { return p_mat->ptr.pp_double[i]; } double* alglib::real_2d_array::operator[](ae_int_t i) { return p_mat->ptr.pp_double[i]; } void alglib::real_2d_array::setcontent(ae_int_t irows, ae_int_t icols, const double *pContent ) { ae_int_t i, j; setlength(irows, icols); for(i=0; iptr.pp_double[i][j] = pContent[i*icols+j]; } std::string alglib::real_2d_array::tostring(int dps) const { std::string result; ae_int_t i; if( isempty() ) return "[[]]"; result = "["; for(i=0; iptr.pp_complex[i]+j)); } alglib::complex& alglib::complex_2d_array::operator()(ae_int_t i, ae_int_t j) { return *((alglib::complex*)(p_mat->ptr.pp_complex[i]+j)); } const alglib::complex* alglib::complex_2d_array::operator[](ae_int_t i) const { return (const alglib::complex*)(p_mat->ptr.pp_complex[i]); } alglib::complex* alglib::complex_2d_array::operator[](ae_int_t i) { return (alglib::complex*)(p_mat->ptr.pp_complex[i]); } void alglib::complex_2d_array::setcontent(ae_int_t irows, ae_int_t icols, const alglib::complex *pContent ) { ae_int_t i, j; setlength(irows, icols); for(i=0; iptr.pp_complex[i][j].x = pContent[i*icols+j].x; p_mat->ptr.pp_complex[i][j].y = pContent[i*icols+j].y; } } std::string alglib::complex_2d_array::tostring(int dps) const { std::string result; ae_int_t i; if( isempty() ) return "[[]]"; result = "["; for(i=0; ic2 ) return +1; } } char* alglib::filter_spaces(const char *s) { size_t i, n; char *r; char *r0; n = strlen(s); r = (char*)alglib_impl::ae_malloc(n+1, NULL); if( r==NULL ) throw ap_error("malloc error"); for(i=0,r0=r; i<=n; i++,s++) if( !isspace(*s) ) { *r0 
= *s; r0++; } return r; } void alglib::str_vector_create(const char *src, bool match_head_only, std::vector *p_vec) { // // parse beginning of the string. // try to handle "[]" string // p_vec->clear(); if( *src!='[' ) throw alglib::ap_error("Incorrect initializer for vector"); src++; if( *src==']' ) return; p_vec->push_back(src); for(;;) { if( *src==0 ) throw alglib::ap_error("Incorrect initializer for vector"); if( *src==']' ) { if( src[1]==0 || !match_head_only) return; throw alglib::ap_error("Incorrect initializer for vector"); } if( *src==',' ) { p_vec->push_back(src+1); src++; continue; } src++; } } void alglib::str_matrix_create(const char *src, std::vector< std::vector > *p_mat) { p_mat->clear(); // // Try to handle "[[]]" string // if( strcmp(src, "[[]]")==0 ) return; // // Parse non-empty string // if( *src!='[' ) throw alglib::ap_error("Incorrect initializer for matrix"); src++; for(;;) { p_mat->push_back(std::vector()); str_vector_create(src, false, &p_mat->back()); if( p_mat->back().size()==0 || p_mat->back().size()!=(*p_mat)[0].size() ) throw alglib::ap_error("Incorrect initializer for matrix"); src = strchr(src, ']'); if( src==NULL ) throw alglib::ap_error("Incorrect initializer for matrix"); src++; if( *src==',' ) { src++; continue; } if( *src==']' ) break; throw alglib::ap_error("Incorrect initializer for matrix"); } src++; if( *src!=0 ) throw alglib::ap_error("Incorrect initializer for matrix"); } ae_bool alglib::parse_bool_delim(const char *s, const char *delim) { const char *p; char buf[8]; // try to parse false p = "false"; memset(buf, 0, sizeof(buf)); strncpy(buf, s, strlen(p)); if( my_stricmp(buf, p)==0 ) { if( s[strlen(p)]==0 || strchr(delim,s[strlen(p)])==NULL ) throw alglib::ap_error("Cannot parse value"); return ae_false; } // try to parse true p = "true"; memset(buf, 0, sizeof(buf)); strncpy(buf, s, strlen(p)); if( my_stricmp(buf, p)==0 ) { if( s[strlen(p)]==0 || strchr(delim,s[strlen(p)])==NULL ) throw alglib::ap_error("Cannot parse value"); return ae_true; } // error throw alglib::ap_error("Cannot parse value"); } alglib::ae_int_t alglib::parse_int_delim(const char *s, const char *delim) { const char *p; long long_val; volatile ae_int_t ae_val; p = s; // // check string structure: // * leading sign // * at least one digit // * delimiter // if( *s=='-' || *s=='+' ) s++; if( *s==0 || strchr("1234567890",*s)==NULL) throw alglib::ap_error("Cannot parse value"); while( *s!=0 && strchr("1234567890",*s)!=NULL ) s++; if( *s==0 || strchr(delim,*s)==NULL ) throw alglib::ap_error("Cannot parse value"); // convert and ensure that value fits into ae_int_t s = p; long_val = atol(s); ae_val = long_val; if( ae_val!=long_val ) throw alglib::ap_error("Cannot parse value"); return ae_val; } bool alglib::_parse_real_delim(const char *s, const char *delim, double *result, const char **new_s) { const char *p; char *t; bool has_digits; char buf[64]; int isign; lconv *loc; p = s; // // check string structure and decide what to do // isign = 1; if( *s=='-' || *s=='+' ) { isign = *s=='-' ? -1 : +1; s++; } memset(buf, 0, sizeof(buf)); strncpy(buf, s, 3); if( my_stricmp(buf,"nan")!=0 && my_stricmp(buf,"inf")!=0 ) { // // [sign] [ddd] [.] [ddd] [e|E[sign]ddd] // has_digits = false; if( *s!=0 && strchr("1234567890",*s)!=NULL ) { has_digits = true; while( *s!=0 && strchr("1234567890",*s)!=NULL ) s++; } if( *s=='.' 
) s++; if( *s!=0 && strchr("1234567890",*s)!=NULL ) { has_digits = true; while( *s!=0 && strchr("1234567890",*s)!=NULL ) s++; } if (!has_digits ) return false; if( *s=='e' || *s=='E' ) { s++; if( *s=='-' || *s=='+' ) s++; if( *s==0 || strchr("1234567890",*s)==NULL ) return false; while( *s!=0 && strchr("1234567890",*s)!=NULL ) s++; } if( *s==0 || strchr(delim,*s)==NULL ) return false; *new_s = s; // // finite value conversion // if( *new_s-p>=(int)sizeof(buf) ) return false; strncpy(buf, p, (size_t)(*new_s-p)); buf[*new_s-p] = 0; loc = localeconv(); t = strchr(buf,'.'); if( t!=NULL ) *t = *loc->decimal_point; *result = atof(buf); return true; } else { // // check delimiter and update *new_s // s += 3; if( *s==0 || strchr(delim,*s)==NULL ) return false; *new_s = s; // // NAN, INF conversion // if( my_stricmp(buf,"nan")==0 ) *result = fp_nan; if( my_stricmp(buf,"inf")==0 ) *result = isign>0 ? fp_posinf : fp_neginf; return true; } } double alglib::parse_real_delim(const char *s, const char *delim) { double result; const char *new_s; if( !_parse_real_delim(s, delim, &result, &new_s) ) throw alglib::ap_error("Cannot parse value"); return result; } alglib::complex alglib::parse_complex_delim(const char *s, const char *delim) { double d_result; const char *new_s; alglib::complex c_result; // parse as real value if( _parse_real_delim(s, delim, &d_result, &new_s) ) return d_result; // parse as "a+bi" or "a-bi" if( _parse_real_delim(s, "+-", &c_result.x, &new_s) ) { s = new_s; if( !_parse_real_delim(s, "i", &c_result.y, &new_s) ) throw alglib::ap_error("Cannot parse value"); s = new_s+1; if( *s==0 || strchr(delim,*s)==NULL ) throw alglib::ap_error("Cannot parse value"); return c_result; } // parse as complex value "bi+a" or "bi-a" if( _parse_real_delim(s, "i", &c_result.y, &new_s) ) { s = new_s+1; if( *s==0 ) throw alglib::ap_error("Cannot parse value"); if( strchr(delim,*s)!=NULL ) { c_result.x = 0; return c_result; } if( strchr("+-",*s)!=NULL ) { if( !_parse_real_delim(s, delim, &c_result.x, &new_s) ) throw alglib::ap_error("Cannot parse value"); return c_result; } throw alglib::ap_error("Cannot parse value"); } // error throw alglib::ap_error("Cannot parse value"); } std::string alglib::arraytostring(const bool *ptr, ae_int_t n) { std::string result; ae_int_t i; result = "["; for(i=0; i=(int)sizeof(buf) ) throw ap_error("arraytostring(): buffer overflow"); result += buf; } result += "]"; return result; } std::string alglib::arraytostring(const double *ptr, ae_int_t n, int _dps) { std::string result; ae_int_t i; char buf[64]; char mask1[64]; char mask2[64]; int dps = _dps>=0 ? _dps : -_dps; result = "["; if( sprintf(mask1, "%%.%d%s", dps, _dps>=0 ? "f" : "e")>=(int)sizeof(mask1) ) throw ap_error("arraytostring(): buffer overflow"); if( sprintf(mask2, ",%s", mask1)>=(int)sizeof(mask2) ) throw ap_error("arraytostring(): buffer overflow"); for(i=0; i=(int)sizeof(buf) ) throw ap_error("arraytostring(): buffer overflow"); } else if( fp_isnan(ptr[i]) ) strcpy(buf, i==0 ? "NAN" : ",NAN"); else if( fp_isposinf(ptr[i]) ) strcpy(buf, i==0 ? "+INF" : ",+INF"); else if( fp_isneginf(ptr[i]) ) strcpy(buf, i==0 ? 
"-INF" : ",-INF"); result += buf; } result += "]"; return result; } std::string alglib::arraytostring(const alglib::complex *ptr, ae_int_t n, int dps) { std::string result; ae_int_t i; result = "["; for(i=0; i0 ) return 1; if( x<0 ) return -1; return 0; } double alglib::randomreal() { int i1 = rand(); int i2 = rand(); double mx = (double)(RAND_MAX)+1.0; volatile double tmp0 = i2/mx; volatile double tmp1 = i1+tmp0; return tmp1/mx; } alglib::ae_int_t alglib::randominteger(alglib::ae_int_t maxv) { return ((alglib::ae_int_t)rand())%maxv; } int alglib::round(double x) { return int(floor(x+0.5)); } int alglib::trunc(double x) { return int(x>0 ? floor(x) : ceil(x)); } int alglib::ifloor(double x) { return int(floor(x)); } int alglib::iceil(double x) { return int(ceil(x)); } double alglib::pi() { return 3.14159265358979323846; } double alglib::sqr(double x) { return x*x; } int alglib::maxint(int m1, int m2) { return m1>m2 ? m1 : m2; } int alglib::minint(int m1, int m2) { return m1>m2 ? m2 : m1; } double alglib::maxreal(double m1, double m2) { return m1>m2 ? m1 : m2; } double alglib::minreal(double m1, double m2) { return m1>m2 ? m2 : m1; } bool alglib::fp_eq(double v1, double v2) { // IEEE-strict floating point comparison volatile double x = v1; volatile double y = v2; return x==y; } bool alglib::fp_neq(double v1, double v2) { // IEEE-strict floating point comparison return !fp_eq(v1,v2); } bool alglib::fp_less(double v1, double v2) { // IEEE-strict floating point comparison volatile double x = v1; volatile double y = v2; return xy; } bool alglib::fp_greater_eq(double v1, double v2) { // IEEE-strict floating point comparison volatile double x = v1; volatile double y = v2; return x>=y; } bool alglib::fp_isnan(double x) { return alglib_impl::ae_isnan_stateless(x,endianness); } bool alglib::fp_isposinf(double x) { return alglib_impl::ae_isposinf_stateless(x,endianness); } bool alglib::fp_isneginf(double x) { return alglib_impl::ae_isneginf_stateless(x,endianness); } bool alglib::fp_isinf(double x) { return alglib_impl::ae_isinf_stateless(x,endianness); } bool alglib::fp_isfinite(double x) { return alglib_impl::ae_isfinite_stateless(x,endianness); } /******************************************************************** CSV functions ********************************************************************/ void alglib::read_csv(const char *filename, char separator, int flags, alglib::real_2d_array &out) { int flag; // // Parameters // bool skip_first_row = (flags&CSV_SKIP_HEADERS)!=0; // // Prepare empty output array // out.setlength(0,0); // // Open file, determine size, read contents // FILE *f_in = fopen(filename, "rb"); if( f_in==NULL ) throw alglib::ap_error("read_csv: unable to open input file"); flag = fseek(f_in, 0, SEEK_END); AE_CRITICAL_ASSERT(flag==0); long int _filesize = ftell(f_in); AE_CRITICAL_ASSERT(_filesize>=0); if( _filesize==0 ) { // empty file, return empty array, success fclose(f_in); return; } size_t filesize = _filesize; std::vector v_buf; v_buf.resize(filesize+2, 0); char *p_buf = &v_buf[0]; flag = fseek(f_in, 0, SEEK_SET); AE_CRITICAL_ASSERT(flag==0); size_t bytes_read = fread ((void*)p_buf, 1, filesize, f_in); AE_CRITICAL_ASSERT(bytes_read==filesize); fclose(f_in); // // Normalize file contents: // * replace 0x0 by spaces // * remove trailing spaces and newlines // * append trailing '\n' and '\0' characters // Return if file contains only spaces/newlines. 
// for(size_t i=0; i0; ) { char c = p_buf[filesize-1]; if( c==' ' || c=='\t' || c=='\n' || c=='\r' ) { filesize--; continue; } break; } if( filesize==0 ) return; p_buf[filesize+0] = '\n'; p_buf[filesize+1] = '\0'; filesize+=2; // // Scan dataset. // size_t rows_count = 0, cols_count = 0, max_length = 0; std::vector offsets, lengths; for(size_t row_start=0; p_buf[row_start]!=0x0; ) { // determine row length size_t row_length; for(row_length=0; p_buf[row_start+row_length]!='\n'; row_length++); // determine cols count, perform integrity check size_t cur_cols_cnt=1; for(size_t idx=0; idx0 && cols_count!=cur_cols_cnt ) throw alglib::ap_error("read_csv: non-rectangular contents, rows have different sizes"); cols_count = cur_cols_cnt; // store offsets and lengths of the fields size_t cur_offs = 0; for(size_t idx=0; idxmax_length ? idx-cur_offs : max_length; cur_offs = idx+1; } // advance row start rows_count++; row_start = row_start+row_length+1; } AE_CRITICAL_ASSERT(rows_count>=1); AE_CRITICAL_ASSERT(cols_count>=1); AE_CRITICAL_ASSERT(cols_count*rows_count==offsets.size()); AE_CRITICAL_ASSERT(cols_count*rows_count==lengths.size()); if( rows_count==1 && skip_first_row ) // empty output, return return; // // Convert // size_t row0 = skip_first_row ? 1 : 0; size_t row1 = rows_count; lconv *loc = localeconv(); out.setlength(row1-row0, cols_count); for(size_t ridx=row0; ridxdecimal_point; out[ridx-row0][cidx] = atof(p_field); } } /******************************************************************** Dataset functions ********************************************************************/ /*bool alglib::readstrings(std::string file, std::list *pOutput) { return readstrings(file, pOutput, ""); } bool alglib::readstrings(std::string file, std::list *pOutput, std::string comment) { std::string cmd, s; FILE *f; char buf[32768]; char *str; f = fopen(file.c_str(), "rb"); if( !f ) return false; s = ""; pOutput->clear(); while(str=fgets(buf, sizeof(buf), f)) { // TODO: read file by small chunks, combine in one large string if( strlen(str)==0 ) continue; // // trim trailing newline chars // char *eos = str+strlen(str)-1; if( *eos=='\n' ) { *eos = 0; eos--; } if( *eos=='\r' ) { *eos = 0; eos--; } s = str; // // skip comments // if( comment.length()>0 ) if( strncmp(s.c_str(), comment.c_str(), comment.length())==0 ) { s = ""; continue; } // // read data // if( s.length()<1 ) { fclose(f); throw alglib::ap_error("internal error in read_strings"); } pOutput->push_back(s); } fclose(f); return true; } void alglib::explodestring(std::string s, char sep, std::vector *pOutput) { std::string tmp; int i; tmp = ""; pOutput->clear(); for(i=0; ipush_back(tmp); tmp = ""; } if( tmp.length()!=0 ) pOutput->push_back(tmp); } std::string alglib::strtolower(const std::string &s) { std::string r = s; for(int i=0; i Lines; std::vector Values, RowsArr, ColsArr, VarsArr, HeadArr; std::list::iterator i; std::string s; int TrnFirst, TrnLast, ValFirst, ValLast, TstFirst, TstLast, LinesRead, j; // // Read data // if( pdataset==NULL ) return false; if( !readstrings(file, &Lines, "//") ) return false; i = Lines.begin(); *pdataset = dataset(); // // Read header // if( i==Lines.end() ) return false; s = alglib::xtrim(*i); alglib::explodestring(s, '#', &HeadArr); if( HeadArr.size()!=2 ) return false; // // Rows info // alglib::explodestring(alglib::xtrim(HeadArr[0]), ' ', &RowsArr); if( RowsArr.size()==0 || RowsArr.size()>3 ) return false; if( RowsArr.size()==1 ) { pdataset->totalsize = atol(RowsArr[0].c_str()); pdataset->trnsize = 
pdataset->totalsize; } if( RowsArr.size()==2 ) { pdataset->trnsize = atol(RowsArr[0].c_str()); pdataset->tstsize = atol(RowsArr[1].c_str()); pdataset->totalsize = pdataset->trnsize + pdataset->tstsize; } if( RowsArr.size()==3 ) { pdataset->trnsize = atol(RowsArr[0].c_str()); pdataset->valsize = atol(RowsArr[1].c_str()); pdataset->tstsize = atol(RowsArr[2].c_str()); pdataset->totalsize = pdataset->trnsize + pdataset->valsize + pdataset->tstsize; } if( pdataset->totalsize<=0 || pdataset->trnsize<0 || pdataset->valsize<0 || pdataset->tstsize<0 ) return false; TrnFirst = 0; TrnLast = TrnFirst + pdataset->trnsize; ValFirst = TrnLast; ValLast = ValFirst + pdataset->valsize; TstFirst = ValLast; TstLast = TstFirst + pdataset->tstsize; // // columns // alglib::explodestring(alglib::xtrim(HeadArr[1]), ' ', &ColsArr); if( ColsArr.size()!=1 && ColsArr.size()!=4 ) return false; if( ColsArr.size()==1 ) { pdataset->nin = atoi(ColsArr[0].c_str()); if( pdataset->nin<=0 ) return false; } if( ColsArr.size()==4 ) { if( alglib::strtolower(ColsArr[0])!="reg" && alglib::strtolower(ColsArr[0])!="cls" ) return false; if( ColsArr[2]!="=>" ) return false; pdataset->nin = atol(ColsArr[1].c_str()); if( pdataset->nin<1 ) return false; if( alglib::strtolower(ColsArr[0])=="reg" ) { pdataset->nclasses = 0; pdataset->nout = atol(ColsArr[3].c_str()); if( pdataset->nout<1 ) return false; } else { pdataset->nclasses = atol(ColsArr[3].c_str()); pdataset->nout = 1; if( pdataset->nclasses<2 ) return false; } } // // initialize arrays // pdataset->all.setlength(pdataset->totalsize, pdataset->nin+pdataset->nout); if( pdataset->trnsize>0 ) pdataset->trn.setlength(pdataset->trnsize, pdataset->nin+pdataset->nout); if( pdataset->valsize>0 ) pdataset->val.setlength(pdataset->valsize, pdataset->nin+pdataset->nout); if( pdataset->tstsize>0 ) pdataset->tst.setlength(pdataset->tstsize, pdataset->nin+pdataset->nout); // // read data // for(LinesRead=0, i++; i!=Lines.end() && LinesReadtotalsize; i++, LinesRead++) { std::string sss = *i; alglib::explodestring(alglib::xtrim(*i), ' ', &VarsArr); if( VarsArr.size()!=pdataset->nin+pdataset->nout ) return false; int tmpc = alglib::round(atof(VarsArr[pdataset->nin+pdataset->nout-1].c_str())); if( pdataset->nclasses>0 && (tmpc<0 || tmpc>=pdataset->nclasses) ) return false; for(j=0; jnin+pdataset->nout; j++) { pdataset->all(LinesRead,j) = atof(VarsArr[j].c_str()); if( LinesRead>=TrnFirst && LinesReadtrn(LinesRead-TrnFirst,j) = atof(VarsArr[j].c_str()); if( LinesRead>=ValFirst && LinesReadval(LinesRead-ValFirst,j) = atof(VarsArr[j].c_str()); if( LinesRead>=TstFirst && LinesReadtst(LinesRead-TstFirst,j) = atof(VarsArr[j].c_str()); } } if( LinesRead!=pdataset->totalsize ) return false; return true; }*/ /* previous variant bool alglib::opendataset(std::string file, dataset *pdataset) { std::list Lines; std::vector Values; std::list::iterator i; int nCol, nRow, nSplitted; int nColumns, nRows; // // Read data // if( pdataset==NULL ) return false; if( !readstrings(file, &Lines, "//") ) return false; i = Lines.begin(); *pdataset = dataset(); // // Read columns info // if( i==Lines.end() ) return false; if( sscanf(i->c_str(), " columns = %d %d ", &pdataset->nin, &pdataset->nout)!=2 ) return false; if( pdataset->nin<=0 || pdataset->nout==0 || pdataset->nout==-1) return false; if( pdataset->nout<0 ) { pdataset->nclasses = -pdataset->nout; pdataset->nout = 1; pdataset->iscls = true; } else { pdataset->isreg = true; } nColumns = pdataset->nin+pdataset->nout; i++; // // Read rows info // if( i==Lines.end() ) 
return false; if( sscanf(i->c_str(), " rows = %d %d %d ", &pdataset->trnsize, &pdataset->valsize, &pdataset->tstsize)!=3 ) return false; if( (pdataset->trnsize<0) || (pdataset->valsize<0) || (pdataset->tstsize<0) ) return false; if( (pdataset->trnsize==0) && (pdataset->valsize==0) && (pdataset->tstsize==0) ) return false; nRows = pdataset->trnsize+pdataset->valsize+pdataset->tstsize; pdataset->size = nRows; if( Lines.size()!=nRows+2 ) return false; i++; // // Read all cases // alglib::real_2d_array &arr = pdataset->all; arr.setbounds(0, nRows-1, 0, nColumns-1); for(nRow=0; nRowiscls && ((round(v)<0) || (round(v)>=pdataset->nclasses)) ) return false; if( (nCol==nColumns-1) && pdataset->iscls ) arr(nRow, nCol) = round(v); else arr(nRow, nCol) = v; } i++; } // // Split to training, validation and test sets // if( pdataset->trnsize>0 ) pdataset->trn.setbounds(0, pdataset->trnsize-1, 0, nColumns-1); if( pdataset->valsize>0 ) pdataset->val.setbounds(0, pdataset->valsize-1, 0, nColumns-1); if( pdataset->tstsize>0 ) pdataset->tst.setbounds(0, pdataset->tstsize-1, 0, nColumns-1); nSplitted=0; for(nRow=0; nRow<=pdataset->trnsize-1; nRow++, nSplitted++) for(nCol=0; nCol<=nColumns-1; nCol++) pdataset->trn(nRow,nCol) = arr(nSplitted,nCol); for(nRow=0; nRow<=pdataset->valsize-1; nRow++, nSplitted++) for(nCol=0; nCol<=nColumns-1; nCol++) pdataset->val(nRow,nCol) = arr(nSplitted,nCol); for(nRow=0; nRow<=pdataset->tstsize-1; nRow++, nSplitted++) for(nCol=0; nCol<=nColumns-1; nCol++) pdataset->tst(nRow,nCol) = arr(nSplitted,nCol); return true; }*/ alglib::ae_int_t alglib::vlen(ae_int_t n1, ae_int_t n2) { return n2-n1+1; } ///////////////////////////////////////////////////////////////////////// // // THIS SECTIONS CONTAINS OPTIMIZED LINEAR ALGEBRA CODE // IT IS SHARED BETWEEN C++ AND PURE C LIBRARIES // ///////////////////////////////////////////////////////////////////////// namespace alglib_impl { #define alglib_simd_alignment 16 #define alglib_r_block 32 #define alglib_half_r_block 16 #define alglib_twice_r_block 64 #define alglib_c_block 24 #define alglib_half_c_block 12 #define alglib_twice_c_block 48 /******************************************************************** This subroutine calculates fast 32x32 real matrix-vector product: y := beta*y + alpha*A*x using either generic C code or native optimizations (if available) IMPORTANT: * A must be stored in row-major order, stride is alglib_r_block, aligned on alglib_simd_alignment boundary * X must be aligned on alglib_simd_alignment boundary * Y may be non-aligned ********************************************************************/ void _ialglib_mv_32(const double *a, const double *x, double *y, ae_int_t stride, double alpha, double beta) { ae_int_t i, k; const double *pa0, *pa1, *pb; pa0 = a; pa1 = a+alglib_r_block; pb = x; for(i=0; i<16; i++) { double v0 = 0, v1 = 0; for(k=0; k<4; k++) { v0 += pa0[0]*pb[0]; v1 += pa1[0]*pb[0]; v0 += pa0[1]*pb[1]; v1 += pa1[1]*pb[1]; v0 += pa0[2]*pb[2]; v1 += pa1[2]*pb[2]; v0 += pa0[3]*pb[3]; v1 += pa1[3]*pb[3]; v0 += pa0[4]*pb[4]; v1 += pa1[4]*pb[4]; v0 += pa0[5]*pb[5]; v1 += pa1[5]*pb[5]; v0 += pa0[6]*pb[6]; v1 += pa1[6]*pb[6]; v0 += pa0[7]*pb[7]; v1 += pa1[7]*pb[7]; pa0 += 8; pa1 += 8; pb += 8; } y[0] = beta*y[0]+alpha*v0; y[stride] = beta*y[stride]+alpha*v1; /* * now we've processed rows I and I+1, * pa0 and pa1 are pointing to rows I+1 and I+2. * move to I+2 and I+3. 
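    * (clarifying note, not from the original sources: the inner loop over k
    *  advances pa0 and pa1 by 4*8 = 32 doubles, i.e. exactly one row of
    *  stride alglib_r_block, so the single extra increment below is enough
    *  to skip one more row)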
*/ pa0 += alglib_r_block; pa1 += alglib_r_block; pb = x; y+=2*stride; } } /************************************************************************* This function calculates MxN real matrix-vector product: y := beta*y + alpha*A*x using generic C code. It calls _ialglib_mv_32 if both M=32 and N=32. If beta is zero, we do not use previous values of y (they are overwritten by alpha*A*x without ever being read). If alpha is zero, no matrix-vector product is calculated (only beta is updated); however, this update is not efficient and this function should NOT be used for multiplication of vector and scalar. IMPORTANT: * 0<=M<=alglib_r_block, 0<=N<=alglib_r_block * A must be stored in row-major order with stride equal to alglib_r_block *************************************************************************/ void _ialglib_rmv(ae_int_t m, ae_int_t n, const double *a, const double *x, double *y, ae_int_t stride, double alpha, double beta) { /* * Handle special cases: * - alpha is zero or n is zero * - m is zero */ if( m==0 ) return; if( alpha==0.0 || n==0 ) { ae_int_t i; if( beta==0.0 ) { for(i=0; ix-beta.y*cy->y)+(alpha.x*v0-alpha.y*v1); double ty = (beta.x*cy->y+beta.y*cy->x)+(alpha.x*v1+alpha.y*v0); cy->x = tx; cy->y = ty; cy+=stride; } else { double tx = (beta.x*dy[0]-beta.y*dy[1])+(alpha.x*v0-alpha.y*v1); double ty = (beta.x*dy[1]+beta.y*dy[0])+(alpha.x*v1+alpha.y*v0); dy[0] = tx; dy[1] = ty; dy += 2*stride; } parow += 2*alglib_c_block; } } /************************************************************************* This subroutine calculates fast MxN complex matrix-vector product: y := beta*y + alpha*A*x using generic C code, where A, x, y, alpha and beta are complex. If beta is zero, we do not use previous values of y (they are overwritten by alpha*A*x without ever being read). However, when alpha is zero, we still calculate A*x and multiply it by alpha (this distinction can be important when A or x contain infinities/NANs). IMPORTANT: * 0<=M<=alglib_c_block, 0<=N<=alglib_c_block * A must be stored in row-major order, as sequence of double precision pairs. Stride is alglib_c_block (it is measured in pairs of doubles, not in doubles). * Y may be referenced by cy (pointer to ae_complex) or dy (pointer to array of double precision pair) depending on what type of output you wish. Pass pointer to Y as one of these parameters, AND SET OTHER PARAMETER TO NULL. * both A and x must be aligned; y may be non-aligned. This function supports SSE2; it can be used when: 1. AE_HAS_SSE2_INTRINSICS was defined (checked at compile-time) 2. ae_cpuid() result contains CPU_SSE2 (checked at run-time) If (1) is failed, this function will be undefined. If (2) is failed, call to this function will probably crash your system. If you want to know whether it is safe to call it, you should check results of ae_cpuid(). If CPU_SSE2 bit is set, this function is callable and will do its work. 
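
A typical dispatch pattern (an illustrative sketch only, mirroring what the
kernels below actually do; the function-pointer signature is abbreviated) is:

    void (*cmv)(...) = &_ialglib_cmv;      // portable generic C fallback
#ifdef AE_HAS_SSE2_INTRINSICS
    if( ae_cpuid() & CPU_SSE2 )
        cmv = &_ialglib_cmv_sse2;          // safe: CPU reports SSE2 support
#endif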
*************************************************************************/ #if defined(AE_HAS_SSE2_INTRINSICS) void _ialglib_cmv_sse2(ae_int_t m, ae_int_t n, const double *a, const double *x, ae_complex *cy, double *dy, ae_int_t stride, ae_complex alpha, ae_complex beta) { ae_int_t i, j, m2; const double *pa0, *pa1, *parow, *pb; __m128d vbeta, vbetax, vbetay; __m128d valpha, valphax, valphay; m2 = m/2; parow = a; if( cy!=NULL ) { dy = (double*)cy; cy = NULL; } vbeta = _mm_loadh_pd(_mm_load_sd(&beta.x),&beta.y); vbetax = _mm_unpacklo_pd(vbeta,vbeta); vbetay = _mm_unpackhi_pd(vbeta,vbeta); valpha = _mm_loadh_pd(_mm_load_sd(&alpha.x),&alpha.y); valphax = _mm_unpacklo_pd(valpha,valpha); valphay = _mm_unpackhi_pd(valpha,valpha); for(i=0; ix = 0.0; p->y = 0.0; } } else { for(i=0; ix = 0.0; p->y = 0.0; } } } /******************************************************************** This subroutine copies unaligned real vector ********************************************************************/ void _ialglib_vcopy(ae_int_t n, const double *a, ae_int_t stridea, double *b, ae_int_t strideb) { ae_int_t i, n2; if( stridea==1 && strideb==1 ) { n2 = n/2; for(i=n2; i!=0; i--, a+=2, b+=2) { b[0] = a[0]; b[1] = a[1]; } if( n%2!=0 ) b[0] = a[0]; } else { for(i=0; ix; b[1] = a->y; } } else { for(i=0; ix; b[1] = -a->y; } } } /******************************************************************** This subroutine copies unaligned complex vector (passed as double*) 1. strideb is stride measured in complex numbers, not doubles 2. conj may be "N" (no conj.) or "C" (conj.) ********************************************************************/ void _ialglib_vcopy_dcomplex(ae_int_t n, const double *a, ae_int_t stridea, double *b, ae_int_t strideb, const char *conj) { ae_int_t i; /* * more general case */ if( conj[0]=='N' || conj[0]=='n' ) { for(i=0; ix; pdst[1] = psrc->y; } } if( op==1 ) { for(i=0,psrc=a; ix; pdst[1] = psrc->y; } } if( op==2 ) { for(i=0,psrc=a; ix; pdst[1] = -psrc->y; } } if( op==3 ) { for(i=0,psrc=a; ix; pdst[1] = -psrc->y; } } } /******************************************************************** This subroutine copies matrix from aligned contigous storage to non-aligned non-contigous storage A: * 2*alglib_c_block*alglib_c_block doubles (only MxN submatrix is used) * aligned * stride is alglib_c_block * pointer to double is passed * may be transformed during copying (as prescribed by op) B: * MxN * non-aligned * non-contigous * pointer to ae_complex is passed Transformation types: * 0 - no transform * 1 - transposition * 2 - conjugate transposition * 3 - conjugate, but no transposition ********************************************************************/ void _ialglib_mcopyunblock_complex(ae_int_t m, ae_int_t n, const double *a, ae_int_t op, ae_complex* b, ae_int_t stride) { ae_int_t i, j; const double *psrc; ae_complex *pdst; if( op==0 ) { for(i=0,psrc=a; ix = psrc[0]; pdst->y = psrc[1]; } } if( op==1 ) { for(i=0,psrc=a; ix = psrc[0]; pdst->y = psrc[1]; } } if( op==2 ) { for(i=0,psrc=a; ix = psrc[0]; pdst->y = -psrc[1]; } } if( op==3 ) { for(i=0,psrc=a; ix = psrc[0]; pdst->y = -psrc[1]; } } } /******************************************************************** Real GEMM kernel ********************************************************************/ ae_bool _ialglib_rmatrixgemm(ae_int_t m, ae_int_t n, ae_int_t k, double alpha, double *_a, ae_int_t _a_stride, ae_int_t optypea, double *_b, ae_int_t _b_stride, ae_int_t optypeb, double beta, double *_c, ae_int_t _c_stride) { int i; double *crow; double 
_abuf[alglib_r_block+alglib_simd_alignment]; double _bbuf[alglib_r_block*alglib_r_block+alglib_simd_alignment]; double * const abuf = (double * const) ae_align(_abuf,alglib_simd_alignment); double * const b = (double * const) ae_align(_bbuf,alglib_simd_alignment); void (*rmv)(ae_int_t, ae_int_t, const double *, const double *, double *, ae_int_t, double, double) = &_ialglib_rmv; void (*mcopyblock)(ae_int_t, ae_int_t, const double *, ae_int_t, ae_int_t, double *) = &_ialglib_mcopyblock; if( m>alglib_r_block || n>alglib_r_block || k>alglib_r_block || m<=0 || n<=0 || k<=0 || alpha==0.0 ) return ae_false; /* * Check for SSE2 support */ #ifdef AE_HAS_SSE2_INTRINSICS if( ae_cpuid() & CPU_SSE2 ) { rmv = &_ialglib_rmv_sse2; mcopyblock = &_ialglib_mcopyblock_sse2; } #endif /* * copy b */ if( optypeb==0 ) mcopyblock(k, n, _b, 1, _b_stride, b); else mcopyblock(n, k, _b, 0, _b_stride, b); /* * multiply B by A (from the right, by rows) * and store result in C */ crow = _c; if( optypea==0 ) { const double *arow = _a; for(i=0; ialglib_c_block || n>alglib_c_block || k>alglib_c_block ) return ae_false; /* * Check for SSE2 support */ #ifdef AE_HAS_SSE2_INTRINSICS if( ae_cpuid() & CPU_SSE2 ) { cmv = &_ialglib_cmv_sse2; } #endif /* * copy b */ brows = optypeb==0 ? k : n; bcols = optypeb==0 ? n : k; if( optypeb==0 ) _ialglib_mcopyblock_complex(brows, bcols, _b, 1, _b_stride, b); if( optypeb==1 ) _ialglib_mcopyblock_complex(brows, bcols, _b, 0, _b_stride, b); if( optypeb==2 ) _ialglib_mcopyblock_complex(brows, bcols, _b, 3, _b_stride, b); /* * multiply B by A (from the right, by rows) * and store result in C */ arow = _a; crow = _c; for(i=0; ialglib_c_block || n>alglib_c_block ) return ae_false; /* * Check for SSE2 support */ #ifdef AE_HAS_SSE2_INTRINSICS if( ae_cpuid() & CPU_SSE2 ) { cmv = &_ialglib_cmv_sse2; } #endif /* * Prepare */ _ialglib_mcopyblock_complex(n, n, _a, optype, _a_stride, abuf); _ialglib_mcopyblock_complex(m, n, _x, 0, _x_stride, xbuf); if( isunit ) for(i=0,pdiag=abuf; i=0; i--,pdiag-=2*(alglib_c_block+1)) { ae_complex tmp_c; ae_complex beta; ae_complex alpha; tmp_c.x = pdiag[0]; tmp_c.y = pdiag[1]; beta = ae_c_d_div(1.0, tmp_c); alpha.x = -beta.x; alpha.y = -beta.y; _ialglib_vcopy_dcomplex(n-1-i, pdiag+2*alglib_c_block, alglib_c_block, tmpbuf, 1, "No conj"); cmv(m, n-1-i, xbuf+2*(i+1), tmpbuf, NULL, xbuf+2*i, alglib_c_block, alpha, beta); } _ialglib_mcopyunblock_complex(m, n, xbuf, 0, _x, _x_stride); } return ae_true; } /******************************************************************** real TRSM kernel ********************************************************************/ ae_bool _ialglib_rmatrixrighttrsm(ae_int_t m, ae_int_t n, double *_a, ae_int_t _a_stride, ae_bool isupper, ae_bool isunit, ae_int_t optype, double *_x, ae_int_t _x_stride) { /* * local buffers */ double *pdiag; ae_int_t i; double _loc_abuf[alglib_r_block*alglib_r_block+alglib_simd_alignment]; double _loc_xbuf[alglib_r_block*alglib_r_block+alglib_simd_alignment]; double _loc_tmpbuf[alglib_r_block+alglib_simd_alignment]; double * const abuf = (double * const) ae_align(_loc_abuf, alglib_simd_alignment); double * const xbuf = (double * const) ae_align(_loc_xbuf, alglib_simd_alignment); double * const tmpbuf = (double * const) ae_align(_loc_tmpbuf,alglib_simd_alignment); ae_bool uppera; void (*rmv)(ae_int_t, ae_int_t, const double *, const double *, double *, ae_int_t, double, double) = &_ialglib_rmv; void (*mcopyblock)(ae_int_t, ae_int_t, const double *, ae_int_t, ae_int_t, double *) = &_ialglib_mcopyblock; if( 
m>alglib_r_block || n>alglib_r_block ) return ae_false; /* * Check for SSE2 support */ #ifdef AE_HAS_SSE2_INTRINSICS if( ae_cpuid() & CPU_SSE2 ) { rmv = &_ialglib_rmv_sse2; mcopyblock = &_ialglib_mcopyblock_sse2; } #endif /* * Prepare */ mcopyblock(n, n, _a, optype, _a_stride, abuf); mcopyblock(m, n, _x, 0, _x_stride, xbuf); if( isunit ) for(i=0,pdiag=abuf; i=0; i--,pdiag-=alglib_r_block+1) { double beta = 1.0/(*pdiag); double alpha = -beta; _ialglib_vcopy(n-1-i, pdiag+alglib_r_block, alglib_r_block, tmpbuf+i+1, 1); rmv(m, n-1-i, xbuf+i+1, tmpbuf+i+1, xbuf+i, alglib_r_block, alpha, beta); } _ialglib_mcopyunblock(m, n, xbuf, 0, _x, _x_stride); } return ae_true; } /******************************************************************** complex TRSM kernel ********************************************************************/ ae_bool _ialglib_cmatrixlefttrsm(ae_int_t m, ae_int_t n, ae_complex *_a, ae_int_t _a_stride, ae_bool isupper, ae_bool isunit, ae_int_t optype, ae_complex *_x, ae_int_t _x_stride) { /* * local buffers */ double *pdiag, *arow; ae_int_t i; double _loc_abuf[2*alglib_c_block*alglib_c_block+alglib_simd_alignment]; double _loc_xbuf[2*alglib_c_block*alglib_c_block+alglib_simd_alignment]; double _loc_tmpbuf[2*alglib_c_block+alglib_simd_alignment]; double * const abuf = (double * const) ae_align(_loc_abuf, alglib_simd_alignment); double * const xbuf = (double * const) ae_align(_loc_xbuf, alglib_simd_alignment); double * const tmpbuf = (double * const) ae_align(_loc_tmpbuf,alglib_simd_alignment); ae_bool uppera; void (*cmv)(ae_int_t, ae_int_t, const double *, const double *, ae_complex *, double *, ae_int_t, ae_complex, ae_complex) = &_ialglib_cmv; if( m>alglib_c_block || n>alglib_c_block ) return ae_false; /* * Check for SSE2 support */ #ifdef AE_HAS_SSE2_INTRINSICS if( ae_cpuid() & CPU_SSE2 ) { cmv = &_ialglib_cmv_sse2; } #endif /* * Prepare * Transpose X (so we may use mv, which calculates A*x, but not x*A) */ _ialglib_mcopyblock_complex(m, m, _a, optype, _a_stride, abuf); _ialglib_mcopyblock_complex(m, n, _x, 1, _x_stride, xbuf); if( isunit ) for(i=0,pdiag=abuf; i=0; i--,pdiag-=2*(alglib_c_block+1)) { ae_complex tmp_c; ae_complex beta; ae_complex alpha; tmp_c.x = pdiag[0]; tmp_c.y = pdiag[1]; beta = ae_c_d_div(1.0, tmp_c); alpha.x = -beta.x; alpha.y = -beta.y; _ialglib_vcopy_dcomplex(m-1-i, pdiag+2, 1, tmpbuf, 1, "No conj"); cmv(n, m-1-i, xbuf+2*(i+1), tmpbuf, NULL, xbuf+2*i, alglib_c_block, alpha, beta); } _ialglib_mcopyunblock_complex(m, n, xbuf, 1, _x, _x_stride); } else { for(i=0,pdiag=abuf,arow=abuf; ialglib_r_block || n>alglib_r_block ) return ae_false; /* * Check for SSE2 support */ #ifdef AE_HAS_SSE2_INTRINSICS if( ae_cpuid() & CPU_SSE2 ) { rmv = &_ialglib_rmv_sse2; mcopyblock = &_ialglib_mcopyblock_sse2; } #endif /* * Prepare * Transpose X (so we may use mv, which calculates A*x, but not x*A) */ mcopyblock(m, m, _a, optype, _a_stride, abuf); mcopyblock(m, n, _x, 1, _x_stride, xbuf); if( isunit ) for(i=0,pdiag=abuf; i=0; i--,pdiag-=alglib_r_block+1) { double beta = 1.0/(*pdiag); double alpha = -beta; _ialglib_vcopy(m-1-i, pdiag+1, 1, tmpbuf+i+1, 1); rmv(n, m-1-i, xbuf+i+1, tmpbuf+i+1, xbuf+i, alglib_r_block, alpha, beta); } _ialglib_mcopyunblock(m, n, xbuf, 1, _x, _x_stride); } else { for(i=0,pdiag=abuf,arow=abuf; ialglib_c_block || k>alglib_c_block ) return ae_false; if( n==0 ) return ae_true; /* * copy A and C, task is transformed to "A*A^H"-form. 
* if beta==0, then C is filled by zeros (and not referenced) * * alpha==0 or k==0 are correctly processed (A is not referenced) */ c_alpha.x = alpha; c_alpha.y = 0; c_beta.x = beta; c_beta.y = 0; if( alpha==0 ) k = 0; if( k>0 ) { if( optypea==0 ) _ialglib_mcopyblock_complex(n, k, _a, 3, _a_stride, abuf); else _ialglib_mcopyblock_complex(k, n, _a, 1, _a_stride, abuf); } _ialglib_mcopyblock_complex(n, n, _c, 0, _c_stride, cbuf); if( beta==0 ) { for(i=0,crow=cbuf; ialglib_r_block || k>alglib_r_block ) return ae_false; if( n==0 ) return ae_true; /* * copy A and C, task is transformed to "A*A^T"-form. * if beta==0, then C is filled by zeros (and not referenced) * * alpha==0 or k==0 are correctly processed (A is not referenced) */ if( alpha==0 ) k = 0; if( k>0 ) { if( optypea==0 ) _ialglib_mcopyblock(n, k, _a, 0, _a_stride, abuf); else _ialglib_mcopyblock(k, n, _a, 1, _a_stride, abuf); } _ialglib_mcopyblock(n, n, _c, 0, _c_stride, cbuf); if( beta==0 ) { for(i=0,crow=cbuf; iptr.pp_double[ia]+ja, _a->stride, optypea, _b->ptr.pp_double[ib]+jb, _b->stride, optypeb, beta, _c->ptr.pp_double[ic]+jc, _c->stride); } ae_bool _ialglib_i_cmatrixgemmf(ae_int_t m, ae_int_t n, ae_int_t k, ae_complex alpha, ae_matrix *_a, ae_int_t ia, ae_int_t ja, ae_int_t optypea, ae_matrix *_b, ae_int_t ib, ae_int_t jb, ae_int_t optypeb, ae_complex beta, ae_matrix *_c, ae_int_t ic, ae_int_t jc) { /* handle degenerate cases like zero matrices by ALGLIB - greatly simplifies passing data to ALGLIB kernel */ if( (alpha.x==0.0 && alpha.y==0) || k==0 || n==0 || m==0 ) return ae_false; /* handle with optimized ALGLIB kernel */ return _ialglib_cmatrixgemm(m, n, k, alpha, _a->ptr.pp_complex[ia]+ja, _a->stride, optypea, _b->ptr.pp_complex[ib]+jb, _b->stride, optypeb, beta, _c->ptr.pp_complex[ic]+jc, _c->stride); } ae_bool _ialglib_i_cmatrixrighttrsmf(ae_int_t m, ae_int_t n, ae_matrix *a, ae_int_t i1, ae_int_t j1, ae_bool isupper, ae_bool isunit, ae_int_t optype, ae_matrix *x, ae_int_t i2, ae_int_t j2) { /* handle degenerate cases like zero matrices by ALGLIB - greatly simplifies passing data to ALGLIB kernel */ if( m==0 || n==0) return ae_false; /* handle with optimized ALGLIB kernel */ return _ialglib_cmatrixrighttrsm(m, n, &a->ptr.pp_complex[i1][j1], a->stride, isupper, isunit, optype, &x->ptr.pp_complex[i2][j2], x->stride); } ae_bool _ialglib_i_rmatrixrighttrsmf(ae_int_t m, ae_int_t n, ae_matrix *a, ae_int_t i1, ae_int_t j1, ae_bool isupper, ae_bool isunit, ae_int_t optype, ae_matrix *x, ae_int_t i2, ae_int_t j2) { /* handle degenerate cases like zero matrices by ALGLIB - greatly simplifies passing data to ALGLIB kernel */ if( m==0 || n==0) return ae_false; /* handle with optimized ALGLIB kernel */ return _ialglib_rmatrixrighttrsm(m, n, &a->ptr.pp_double[i1][j1], a->stride, isupper, isunit, optype, &x->ptr.pp_double[i2][j2], x->stride); } ae_bool _ialglib_i_cmatrixlefttrsmf(ae_int_t m, ae_int_t n, ae_matrix *a, ae_int_t i1, ae_int_t j1, ae_bool isupper, ae_bool isunit, ae_int_t optype, ae_matrix *x, ae_int_t i2, ae_int_t j2) { /* handle degenerate cases like zero matrices by ALGLIB - greatly simplifies passing data to ALGLIB kernel */ if( m==0 || n==0) return ae_false; /* handle with optimized ALGLIB kernel */ return _ialglib_cmatrixlefttrsm(m, n, &a->ptr.pp_complex[i1][j1], a->stride, isupper, isunit, optype, &x->ptr.pp_complex[i2][j2], x->stride); } ae_bool _ialglib_i_rmatrixlefttrsmf(ae_int_t m, ae_int_t n, ae_matrix *a, ae_int_t i1, ae_int_t j1, ae_bool isupper, ae_bool isunit, ae_int_t optype, ae_matrix *x, ae_int_t i2, 
ae_int_t j2) { /* handle degenerate cases like zero matrices by ALGLIB - greatly simplifies passing data to ALGLIB kernel */ if( m==0 || n==0) return ae_false; /* handle with optimized ALGLIB kernel */ return _ialglib_rmatrixlefttrsm(m, n, &a->ptr.pp_double[i1][j1], a->stride, isupper, isunit, optype, &x->ptr.pp_double[i2][j2], x->stride); } ae_bool _ialglib_i_cmatrixherkf(ae_int_t n, ae_int_t k, double alpha, ae_matrix *a, ae_int_t ia, ae_int_t ja, ae_int_t optypea, double beta, ae_matrix *c, ae_int_t ic, ae_int_t jc, ae_bool isupper) { /* handle degenerate cases like zero matrices by ALGLIB - greatly simplifies passing data to ALGLIB kernel */ if( alpha==0.0 || k==0 || n==0) return ae_false; /* ALGLIB kernel */ return _ialglib_cmatrixherk(n, k, alpha, &a->ptr.pp_complex[ia][ja], a->stride, optypea, beta, &c->ptr.pp_complex[ic][jc], c->stride, isupper); } ae_bool _ialglib_i_rmatrixsyrkf(ae_int_t n, ae_int_t k, double alpha, ae_matrix *a, ae_int_t ia, ae_int_t ja, ae_int_t optypea, double beta, ae_matrix *c, ae_int_t ic, ae_int_t jc, ae_bool isupper) { /* handle degenerate cases like zero matrices by ALGLIB - greatly simplifies passing data to ALGLIB kernel */ if( alpha==0.0 || k==0 || n==0) return ae_false; /* ALGLIB kernel */ return _ialglib_rmatrixsyrk(n, k, alpha, &a->ptr.pp_double[ia][ja], a->stride, optypea, beta, &c->ptr.pp_double[ic][jc], c->stride, isupper); } ae_bool _ialglib_i_cmatrixrank1f(ae_int_t m, ae_int_t n, ae_matrix *a, ae_int_t ia, ae_int_t ja, ae_vector *u, ae_int_t uoffs, ae_vector *v, ae_int_t voffs) { return _ialglib_cmatrixrank1(m, n, &a->ptr.pp_complex[ia][ja], a->stride, &u->ptr.p_complex[uoffs], &v->ptr.p_complex[voffs]); } ae_bool _ialglib_i_rmatrixrank1f(ae_int_t m, ae_int_t n, ae_matrix *a, ae_int_t ia, ae_int_t ja, ae_vector *u, ae_int_t uoffs, ae_vector *v, ae_int_t voffs) { return _ialglib_rmatrixrank1(m, n, &a->ptr.pp_double[ia][ja], a->stride, &u->ptr.p_double[uoffs], &v->ptr.p_double[voffs]); } /******************************************************************** This function reads rectangular matrix A given by two column pointers col0 and col1 and stride src_stride and moves it into contiguous row- by-row storage given by dst. It can handle following special cases: * col1==NULL in this case second column of A is filled by zeros ********************************************************************/ void _ialglib_pack_n2( double *col0, double *col1, ae_int_t n, ae_int_t src_stride, double *dst) { ae_int_t n2, j, stride2; /* * handle special case */ if( col1==NULL ) { for(j=0; j>> SOURCE LICENSE >>> This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation (www.fsf.org); either version 2 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. 
A copy of the GNU General Public License is available at http://www.fsf.org/licensing/licenses >>> END OF LICENSE >>> *************************************************************************/ #ifndef _ap_h #define _ap_h #include #include #include #include #include #include #include #if defined(__CODEGEARC__) #include #include #elif defined(__BORLANDC__) #include #include #else #include #include #endif #define AE_USE_CPP /* Definitions */ #define AE_UNKNOWN 0 #define AE_MSVC 1 #define AE_GNUC 2 #define AE_SUNC 3 #define AE_INTEL 1 #define AE_SPARC 2 #define AE_WINDOWS 1 #define AE_POSIX 2 #define AE_LOCK_ALIGNMENT 16 /* in case no OS is defined, use AE_UNKNOWN */ #ifndef AE_OS #define AE_OS AE_UNKNOWN #endif /* automatically determine compiler */ #define AE_COMPILER AE_UNKNOWN #ifdef __GNUC__ #undef AE_COMPILER #define AE_COMPILER AE_GNUC #endif #if defined(__SUNPRO_C)||defined(__SUNPRO_CC) #undef AE_COMPILER #define AE_COMPILER AE_SUNC #endif #ifdef _MSC_VER #undef AE_COMPILER #define AE_COMPILER AE_MSVC #endif /* compiler-specific definitions */ #if AE_COMPILER==AE_MSVC #define ALIGNED __declspec(align(8)) #elif AE_COMPILER==AE_GNUC #define ALIGNED __attribute__((aligned(8))) #else #define ALIGNED #endif /* now we are ready to include headers */ #include #include #include #include #include #include #if defined(AE_HAVE_STDINT) #include #endif /* * SSE2 intrinsics * * Preprocessor directives below: * - include headers for SSE2 intrinsics * - define AE_HAS_SSE2_INTRINSICS definition * * These actions are performed when we have: * - x86 architecture definition (AE_CPU==AE_INTEL) * - compiler which supports intrinsics * * Presence of AE_HAS_SSE2_INTRINSICS does NOT mean that our CPU * actually supports SSE2 - such things should be determined at runtime * with ae_cpuid() call. It means that we are working under Intel and * out compiler can issue SSE2-capable code. 
* */ #if defined(AE_CPU) #if AE_CPU==AE_INTEL #if AE_COMPILER==AE_MSVC #include #define AE_HAS_SSE2_INTRINSICS #endif #if AE_COMPILER==AE_GNUC #include #define AE_HAS_SSE2_INTRINSICS #endif #if AE_COMPILER==AE_SUNC #include #include #define AE_HAS_SSE2_INTRINSICS #endif #endif #endif ///////////////////////////////////////////////////////////////////////// // // THIS SECTION CONTAINS DECLARATIONS FOR BASIC FUNCTIONALITY // LIKE MEMORY MANAGEMENT FOR VECTORS/MATRICES WHICH IS SHARED // BETWEEN C++ AND PURE C LIBRARIES // ///////////////////////////////////////////////////////////////////////// namespace alglib_impl { /* if we work under C++ environment, define several conditions */ #ifdef AE_USE_CPP #define AE_USE_CPP_BOOL #define AE_USE_CPP_ERROR_HANDLING #define AE_USE_CPP_SERIALIZATION #include #endif /* * define ae_int32_t, ae_int64_t, ae_int_t, ae_bool, ae_complex, ae_error_type and ae_datatype */ #if defined(AE_INT32_T) typedef AE_INT32_T ae_int32_t; #endif #if defined(AE_HAVE_STDINT) && !defined(AE_INT32_T) typedef int32_t ae_int32_t; #endif #if !defined(AE_HAVE_STDINT) && !defined(AE_INT32_T) #if AE_COMPILER==AE_MSVC typedef __int32 ae_int32_t; #endif #if (AE_COMPILER==AE_GNUC) || (AE_COMPILER==AE_SUNC) || (AE_COMPILER==AE_UNKNOWN) typedef int ae_int32_t; #endif #endif #if defined(AE_INT64_T) typedef AE_INT64_T ae_int64_t; #endif #if defined(AE_HAVE_STDINT) && !defined(AE_INT64_T) typedef int64_t ae_int64_t; #endif #if !defined(AE_HAVE_STDINT) && !defined(AE_INT64_T) #if AE_COMPILER==AE_MSVC typedef __int64 ae_int64_t; #endif #if (AE_COMPILER==AE_GNUC) || (AE_COMPILER==AE_SUNC) || (AE_COMPILER==AE_UNKNOWN) typedef signed long long ae_int64_t; #endif #endif #if !defined(AE_INT_T) typedef ptrdiff_t ae_int_t; #endif #if !defined(AE_USE_CPP_BOOL) #define ae_bool char #define ae_true 1 #define ae_false 0 #else #define ae_bool bool #define ae_true true #define ae_false false #endif typedef struct { double x, y; } ae_complex; typedef enum { ERR_OK = 0, ERR_OUT_OF_MEMORY = 1, ERR_XARRAY_TOO_LARGE = 2, ERR_ASSERTION_FAILED = 3 } ae_error_type; typedef ae_int_t ae_datatype; /* * other definitions */ enum { OWN_CALLER=1, OWN_AE=2 }; enum { ACT_UNCHANGED=1, ACT_SAME_LOCATION=2, ACT_NEW_LOCATION=3 }; enum { DT_BOOL=1, DT_INT=2, DT_REAL=3, DT_COMPLEX=4 }; enum { CPU_SSE2=1 }; /************************************************************************ x-string (zero-terminated): owner OWN_CALLER or OWN_AE. Determines what to do on realloc(). If vector is owned by caller, X-interface will just set ptr to NULL before realloc(). If it is owned by X, it will call ae_free/x_free/aligned_free family functions. last_action ACT_UNCHANGED, ACT_SAME_LOCATION, ACT_NEW_LOCATION contents is either: unchanged, stored at the same location, stored at the new location. this field is set on return from X. ptr pointer to the actual data Members of this structure are ae_int64_t to avoid alignment problems. ************************************************************************/ typedef struct { ALIGNED ae_int64_t owner; ALIGNED ae_int64_t last_action; ALIGNED char *ptr; } x_string; /************************************************************************ x-vector: cnt number of elements datatype one of the DT_XXXX values owner OWN_CALLER or OWN_AE. Determines what to do on realloc(). If vector is owned by caller, X-interface will just set ptr to NULL before realloc(). If it is owned by X, it will call ae_free/x_free/aligned_free family functions. 
last_action ACT_UNCHANGED, ACT_SAME_LOCATION, ACT_NEW_LOCATION contents is either: unchanged, stored at the same location, stored at the new location. this field is set on return from X interface and may be used by caller as hint when deciding what to do with data (if it was ACT_UNCHANGED or ACT_SAME_LOCATION, no array reallocation or copying is required). ptr pointer to the actual data Members of this structure are ae_int64_t to avoid alignment problems. ************************************************************************/ typedef struct { ALIGNED ae_int64_t cnt; ALIGNED ae_int64_t datatype; ALIGNED ae_int64_t owner; ALIGNED ae_int64_t last_action; ALIGNED void *ptr; } x_vector; /************************************************************************ x-matrix: rows number of rows. may be zero only when cols is zero too. cols number of columns. may be zero only when rows is zero too. stride stride, i.e. distance between first elements of rows (in bytes) datatype one of the DT_XXXX values owner OWN_CALLER or OWN_AE. Determines what to do on realloc(). If vector is owned by caller, X-interface will just set ptr to NULL before realloc(). If it is owned by X, it will call ae_free/x_free/aligned_free family functions. last_action ACT_UNCHANGED, ACT_SAME_LOCATION, ACT_NEW_LOCATION contents is either: unchanged, stored at the same location, stored at the new location. this field is set on return from X interface and may be used by caller as hint when deciding what to do with data (if it was ACT_UNCHANGED or ACT_SAME_LOCATION, no array reallocation or copying is required). ptr pointer to the actual data, stored rowwise Members of this structure are ae_int64_t to avoid alignment problems. ************************************************************************/ typedef struct { ALIGNED ae_int64_t rows; ALIGNED ae_int64_t cols; ALIGNED ae_int64_t stride; ALIGNED ae_int64_t datatype; ALIGNED ae_int64_t owner; ALIGNED ae_int64_t last_action; ALIGNED void *ptr; } x_matrix; /************************************************************************ dynamic block which may be automatically deallocated during stack unwinding p_next next block in the stack unwinding list. NULL means that this block is not in the list deallocator deallocator function which should be used to deallocate block. NULL for "special" blocks (frame/stack boundaries) ptr pointer which should be passed to the deallocator. may be null (for zero-size block), DYN_BOTTOM or DYN_FRAME for "special" blocks (frame/stack boundaries). 
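
Illustrative usage sketch (not from the original sources; assumes "state" is
an ae_state prepared with ae_state_init()):

    ae_frame frame;
    ae_dyn_block block;
    ae_frame_make(&state, &frame);                // push frame marker
    ae_db_malloc(&block, 1024, &state, ae_true);  // make_automatic=ae_true, so
                                                  // the block joins the list
    // ... work with block.ptr ...
    ae_frame_leave(&state);                       // deallocates automatic blocks
                                                  // registered since ae_frame_make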
************************************************************************/ typedef struct ae_dyn_block { struct ae_dyn_block * volatile p_next; /* void *deallocator; */ void (*deallocator)(void*); void * volatile ptr; } ae_dyn_block; typedef void(*ae_deallocator)(void*); /************************************************************************ frame marker ************************************************************************/ typedef struct ae_frame { ae_dyn_block db_marker; } ae_frame; /************************************************************************ ALGLIB environment state ************************************************************************/ typedef struct ae_state { /* * endianness type: AE_LITTLE_ENDIAN or AE_BIG_ENDIAN */ ae_int_t endianness; /* * double value for NAN */ double v_nan; /* * double value for +INF */ double v_posinf; /* * double value for -INF */ double v_neginf; /* * pointer to the top block in a stack of frames * which hold dynamically allocated objects */ ae_dyn_block * volatile p_top_block; ae_dyn_block last_block; /* * jmp_buf for cases when C-style exception handling is used */ #ifndef AE_USE_CPP_ERROR_HANDLING jmp_buf * volatile break_jump; #endif /* * ae_error_type of the last error (filled when exception is thrown) */ ae_error_type volatile last_error; /* * human-readable message (filled when exception is thrown) */ const char* volatile error_msg; /* * threading information: * a) current thread pool * b) current worker thread * c) parent task (one we are solving right now) * d) thread exception handler (function which must be called * by ae_assert before raising exception). * * NOTE: we use void* to store pointers in order to avoid explicit dependency on smp.h */ void *worker_thread; void *parent_task; void (*thread_exception_handler)(void*); } ae_state; /************************************************************************ Serializer: * ae_stream_writer type is a function pointer for stream writer method; this pointer is used by X-core for out-of-core serialization (say, to serialize ALGLIB structure directly to managed C# stream). This function accepts two parameters: pointer to ANSI (7-bit) string and pointer-sized integer passed to serializer during initialization. String being passed is a part of the data stream; aux paramerer may be arbitrary value intended to be used by actual implementation of stream writer. String parameter may include spaces and linefeed symbols, it should be written to stream as is. Return value must be zero for success or non-zero for failure. * ae_stream_reader type is a function pointer for stream reader method; this pointer is used by X-core for out-of-core unserialization (say, to unserialize ALGLIB structure directly from managed C# stream). This function accepts three parameters: pointer-sized integer passed to serializer during initialization; number of symbols to read from stream; pointer to buffer used to store next token read from stream (ANSI encoding is used, buffer is large enough to store all symbols and trailing zero symbol). Number of symbols to read is always positive. After being called by X-core, this function must: * skip all space and linefeed characters from the current position at the stream and until first non-space non-linefeed character is found * read exactly cnt symbols from stream to buffer; check that all symbols being read are non-space non-linefeed ones * append trailing zero symbol to buffer * return value must be zero on success, non-zero if even one of the conditions above fails. 
When reader returns non-zero value, contents of buf is not used. ************************************************************************/ typedef char(*ae_stream_writer)(const char *p_string, ae_int_t aux); typedef char(*ae_stream_reader)(ae_int_t aux, ae_int_t cnt, char *p_buf); typedef struct { ae_int_t mode; ae_int_t entries_needed; ae_int_t entries_saved; ae_int_t bytes_asked; ae_int_t bytes_written; #ifdef AE_USE_CPP_SERIALIZATION std::string *out_cppstr; #endif char *out_str; /* pointer to the current position at the output buffer; advanced with each write operation */ const char *in_str; /* pointer to the current position at the input buffer; advanced with each read operation */ ae_int_t stream_aux; ae_stream_writer stream_writer; ae_stream_reader stream_reader; } ae_serializer; typedef struct ae_vector { /* * Number of elements in array, cnt>=0 */ ae_int_t cnt; /* * Either DT_BOOL, DT_INT, DT_REAL or DT_COMPLEX */ ae_datatype datatype; /* * If ptr points to memory owned and managed by ae_vector itself, * this field is ae_false. If vector was attached to x_vector structure * with ae_vector_attach_to_x(), this field is ae_true. */ ae_bool is_attached; /* * ae_dyn_block structure which manages data in ptr. This structure * is responsible for automatic deletion of object when its frame * is destroyed. */ ae_dyn_block data; /* * Pointer to data. * User usually works with this field. */ union { void *p_ptr; ae_bool *p_bool; ae_int_t *p_int; double *p_double; ae_complex *p_complex; } ptr; } ae_vector; typedef struct ae_matrix { ae_int_t rows; ae_int_t cols; ae_int_t stride; ae_datatype datatype; /* * If ptr points to memory owned and managed by ae_vector itself, * this field is ae_false. If vector was attached to x_vector structure * with ae_vector_attach_to_x(), this field is ae_true. */ ae_bool is_attached; ae_dyn_block data; union { void *p_ptr; void **pp_void; ae_bool **pp_bool; ae_int_t **pp_int; double **pp_double; ae_complex **pp_complex; } ptr; } ae_matrix; typedef struct ae_smart_ptr { /* pointer to subscriber; all changes in ptr are translated to subscriber */ void **subscriber; /* pointer to object */ void *ptr; /* whether smart pointer owns ptr */ ae_bool is_owner; /* whether object pointed by ptr is dynamic - clearing such object requires BOTH calling destructor function AND calling ae_free for memory occupied by object. */ ae_bool is_dynamic; /* destructor function for pointer; clears all dynamically allocated memory */ void (*destroy)(void*); /* frame entry; used to ensure automatic deallocation of smart pointer in case of exception/exit */ ae_dyn_block frame_entry; } ae_smart_ptr; /************************************************************************* Lock. This structure provides OS-independent non-reentrant lock: * under Windows/Posix systems it uses system-provided locks * under Boost it uses OS-independent lock provided by Boost package * when no OS is defined, it uses "fake lock" (just stub which is not thread-safe): a) "fake lock" can be in locked or free mode b) "fake lock" can be used only from one thread - one which created lock c) when thread acquires free lock, it immediately returns d) when thread acquires busy lock, program is terminated (because lock is already acquired and no one else can free it) *************************************************************************/ typedef struct { /* * Pointer to _lock structure. This pointer has type void* in order to * make header file OS-independent (lock declaration depends on OS). 
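     *
     * (illustrative usage of ae_lock, not from the original sources, built
     *  from the lock functions declared later in this header:
     *
     *      ae_lock lock;
     *      ae_init_lock(&lock);      // create the underlying lock object
     *      ae_acquire_lock(&lock);   // enter critical section
     *      ...protected work...
     *      ae_release_lock(&lock);
     *      ae_free_lock(&lock);      // release resources held by the lock
     * )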
*/ void *ptr; } ae_lock; /************************************************************************* Shared pool: data structure used to provide thread-safe access to pool of temporary variables. *************************************************************************/ typedef struct ae_shared_pool_entry { void * volatile obj; void * volatile next_entry; } ae_shared_pool_entry; typedef struct ae_shared_pool { /* lock object which protects pool */ ae_lock pool_lock; /* seed object (used to create new instances of temporaries) */ void * volatile seed_object; /* * list of recycled OBJECTS: * 1. entries in this list store pointers to recycled objects * 2. every time we retrieve object, we retrieve first entry from this list, * move it to recycled_entries and return its obj field to caller/ */ ae_shared_pool_entry * volatile recycled_objects; /* * list of recycled ENTRIES: * 1. this list holds entries which are not used to store recycled objects; * every time recycled object is retrieved, its entry is moved to this list. * 2. every time object is recycled, we try to fetch entry for him from this list * before allocating it with malloc() */ ae_shared_pool_entry * volatile recycled_entries; /* enumeration pointer, points to current recycled object*/ ae_shared_pool_entry * volatile enumeration_counter; /* size of object; this field is used when we call malloc() for new objects */ ae_int_t size_of_object; /* initializer function; accepts pointer to malloc'ed object, initializes its fields */ void (*init)(void* dst, ae_state* state); /* copy constructor; accepts pointer to malloc'ed, but not initialized object */ void (*init_copy)(void* dst, void* src, ae_state* state); /* destructor function; */ void (*destroy)(void* ptr); /* frame entry; contains pointer to the pool object itself */ ae_dyn_block frame_entry; } ae_shared_pool; ae_int_t ae_misalignment(const void *ptr, size_t alignment); void* ae_align(void *ptr, size_t alignment); void* aligned_malloc(size_t size, size_t alignment); void aligned_free(void *block); void* ae_malloc(size_t size, ae_state *state); void ae_free(void *p); ae_int_t ae_sizeof(ae_datatype datatype); void ae_touch_ptr(void *p); void ae_state_init(ae_state *state); void ae_state_clear(ae_state *state); #ifndef AE_USE_CPP_ERROR_HANDLING void ae_state_set_break_jump(ae_state *state, jmp_buf *buf); #endif void ae_break(ae_state *state, ae_error_type error_type, const char *msg); void ae_frame_make(ae_state *state, ae_frame *tmp); void ae_frame_leave(ae_state *state); void ae_db_attach(ae_dyn_block *block, ae_state *state); ae_bool ae_db_malloc(ae_dyn_block *block, ae_int_t size, ae_state *state, ae_bool make_automatic); ae_bool ae_db_realloc(ae_dyn_block *block, ae_int_t size, ae_state *state); void ae_db_free(ae_dyn_block *block); void ae_db_swap(ae_dyn_block *block1, ae_dyn_block *block2); void ae_vector_init(ae_vector *dst, ae_int_t size, ae_datatype datatype, ae_state *state); void ae_vector_init_copy(ae_vector *dst, ae_vector *src, ae_state *state); void ae_vector_init_from_x(ae_vector *dst, x_vector *src, ae_state *state); void ae_vector_attach_to_x(ae_vector *dst, x_vector *src, ae_state *state); ae_bool ae_vector_set_length(ae_vector *dst, ae_int_t newsize, ae_state *state); void ae_vector_clear(ae_vector *dst); void ae_vector_destroy(ae_vector *dst); void ae_swap_vectors(ae_vector *vec1, ae_vector *vec2); void ae_matrix_init(ae_matrix *dst, ae_int_t rows, ae_int_t cols, ae_datatype datatype, ae_state *state); void ae_matrix_init_copy(ae_matrix *dst, ae_matrix *src, 
ae_state *state); void ae_matrix_init_from_x(ae_matrix *dst, x_matrix *src, ae_state *state); void ae_matrix_attach_to_x(ae_matrix *dst, x_matrix *src, ae_state *state); ae_bool ae_matrix_set_length(ae_matrix *dst, ae_int_t rows, ae_int_t cols, ae_state *state); void ae_matrix_clear(ae_matrix *dst); void ae_matrix_destroy(ae_matrix *dst); void ae_swap_matrices(ae_matrix *mat1, ae_matrix *mat2); void ae_smart_ptr_init(ae_smart_ptr *dst, void **subscriber, ae_state *state); void ae_smart_ptr_clear(void *_dst); /* accepts ae_smart_ptr* */ void ae_smart_ptr_destroy(void *_dst); void ae_smart_ptr_assign(ae_smart_ptr *dst, void *new_ptr, ae_bool is_owner, ae_bool is_dynamic, void (*destroy)(void*)); void ae_smart_ptr_release(ae_smart_ptr *dst); void ae_yield(); void ae_init_lock(ae_lock *lock); void ae_acquire_lock(ae_lock *lock); void ae_release_lock(ae_lock *lock); void ae_free_lock(ae_lock *lock); void ae_shared_pool_init(void *_dst, ae_state *state); void ae_shared_pool_init_copy(void *_dst, void *_src, ae_state *state); void ae_shared_pool_clear(void *dst); void ae_shared_pool_destroy(void *dst); ae_bool ae_shared_pool_is_initialized(void *_dst); void ae_shared_pool_set_seed( ae_shared_pool *dst, void *seed_object, ae_int_t size_of_object, void (*init)(void* dst, ae_state* state), void (*init_copy)(void* dst, void* src, ae_state* state), void (*destroy)(void* ptr), ae_state *state); void ae_shared_pool_retrieve( ae_shared_pool *pool, ae_smart_ptr *pptr, ae_state *state); void ae_shared_pool_recycle( ae_shared_pool *pool, ae_smart_ptr *pptr, ae_state *state); void ae_shared_pool_clear_recycled( ae_shared_pool *pool, ae_state *state); void ae_shared_pool_first_recycled( ae_shared_pool *pool, ae_smart_ptr *pptr, ae_state *state); void ae_shared_pool_next_recycled( ae_shared_pool *pool, ae_smart_ptr *pptr, ae_state *state); void ae_shared_pool_reset( ae_shared_pool *pool, ae_state *state); void ae_x_set_vector(x_vector *dst, ae_vector *src, ae_state *state); void ae_x_set_matrix(x_matrix *dst, ae_matrix *src, ae_state *state); void ae_x_attach_to_vector(x_vector *dst, ae_vector *src); void ae_x_attach_to_matrix(x_matrix *dst, ae_matrix *src); void x_vector_clear(x_vector *dst); ae_bool x_is_symmetric(x_matrix *a); ae_bool x_is_hermitian(x_matrix *a); ae_bool x_force_symmetric(x_matrix *a); ae_bool x_force_hermitian(x_matrix *a); ae_bool ae_is_symmetric(ae_matrix *a); ae_bool ae_is_hermitian(ae_matrix *a); ae_bool ae_force_symmetric(ae_matrix *a); ae_bool ae_force_hermitian(ae_matrix *a); void ae_serializer_init(ae_serializer *serializer); void ae_serializer_clear(ae_serializer *serializer); void ae_serializer_alloc_start(ae_serializer *serializer); void ae_serializer_alloc_entry(ae_serializer *serializer); ae_int_t ae_serializer_get_alloc_size(ae_serializer *serializer); #ifdef AE_USE_CPP_SERIALIZATION void ae_serializer_sstart_str(ae_serializer *serializer, std::string *buf); void ae_serializer_ustart_str(ae_serializer *serializer, const std::string *buf); void ae_serializer_sstart_stream(ae_serializer *serializer, std::ostream *stream); void ae_serializer_ustart_stream(ae_serializer *serializer, const std::istream *stream); #endif void ae_serializer_sstart_str(ae_serializer *serializer, char *buf); void ae_serializer_ustart_str(ae_serializer *serializer, const char *buf); void ae_serializer_sstart_stream(ae_serializer *serializer, ae_stream_writer writer, ae_int_t aux); void ae_serializer_ustart_stream(ae_serializer *serializer, ae_stream_reader reader, ae_int_t aux); void 
ae_serializer_serialize_bool(ae_serializer *serializer, ae_bool v, ae_state *state); void ae_serializer_serialize_int(ae_serializer *serializer, ae_int_t v, ae_state *state); void ae_serializer_serialize_double(ae_serializer *serializer, double v, ae_state *state); void ae_serializer_unserialize_bool(ae_serializer *serializer, ae_bool *v, ae_state *state); void ae_serializer_unserialize_int(ae_serializer *serializer, ae_int_t *v, ae_state *state); void ae_serializer_unserialize_double(ae_serializer *serializer, double *v, ae_state *state); void ae_serializer_stop(ae_serializer *serializer, ae_state *state); /************************************************************************ Service functions ************************************************************************/ void ae_assert(ae_bool cond, const char *msg, ae_state *state); ae_int_t ae_cpuid(); /************************************************************************ Real math functions: * IEEE-compliant floating point comparisons * standard functions ************************************************************************/ ae_bool ae_fp_eq(double v1, double v2); ae_bool ae_fp_neq(double v1, double v2); ae_bool ae_fp_less(double v1, double v2); ae_bool ae_fp_less_eq(double v1, double v2); ae_bool ae_fp_greater(double v1, double v2); ae_bool ae_fp_greater_eq(double v1, double v2); ae_bool ae_isfinite_stateless(double x, ae_int_t endianness); ae_bool ae_isnan_stateless(double x, ae_int_t endianness); ae_bool ae_isinf_stateless(double x, ae_int_t endianness); ae_bool ae_isposinf_stateless(double x, ae_int_t endianness); ae_bool ae_isneginf_stateless(double x, ae_int_t endianness); ae_int_t ae_get_endianness(); ae_bool ae_isfinite(double x,ae_state *state); ae_bool ae_isnan(double x, ae_state *state); ae_bool ae_isinf(double x, ae_state *state); ae_bool ae_isposinf(double x,ae_state *state); ae_bool ae_isneginf(double x,ae_state *state); double ae_fabs(double x, ae_state *state); ae_int_t ae_iabs(ae_int_t x, ae_state *state); double ae_sqr(double x, ae_state *state); double ae_sqrt(double x, ae_state *state); ae_int_t ae_sign(double x, ae_state *state); ae_int_t ae_round(double x, ae_state *state); ae_int_t ae_trunc(double x, ae_state *state); ae_int_t ae_ifloor(double x, ae_state *state); ae_int_t ae_iceil(double x, ae_state *state); ae_int_t ae_maxint(ae_int_t m1, ae_int_t m2, ae_state *state); ae_int_t ae_minint(ae_int_t m1, ae_int_t m2, ae_state *state); double ae_maxreal(double m1, double m2, ae_state *state); double ae_minreal(double m1, double m2, ae_state *state); double ae_randomreal(ae_state *state); ae_int_t ae_randominteger(ae_int_t maxv, ae_state *state); double ae_sin(double x, ae_state *state); double ae_cos(double x, ae_state *state); double ae_tan(double x, ae_state *state); double ae_sinh(double x, ae_state *state); double ae_cosh(double x, ae_state *state); double ae_tanh(double x, ae_state *state); double ae_asin(double x, ae_state *state); double ae_acos(double x, ae_state *state); double ae_atan(double x, ae_state *state); double ae_atan2(double y, double x, ae_state *state); double ae_log(double x, ae_state *state); double ae_pow(double x, double y, ae_state *state); double ae_exp(double x, ae_state *state); /************************************************************************ Complex math functions: * basic arithmetic operations * standard functions ************************************************************************/ ae_complex ae_complex_from_i(ae_int_t v); ae_complex ae_complex_from_d(double v); 
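/* (illustrative example, not part of the original header: the ae_c_* helpers
   below implement complex arithmetic on the plain ae_complex struct, e.g.

       ae_complex a = ae_complex_from_d(2.0);        // 2 + 0i
       ae_complex b = ae_complex_from_i(3);          // 3 + 0i
       ae_complex c = ae_c_add(ae_c_mul(a, b), a);   // (2*3)+2 = 8 + 0i

   the C++ layer in namespace alglib wraps the same operations in the
   alglib::complex operators declared further below) */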
ae_complex ae_c_neg(ae_complex lhs); ae_bool ae_c_eq(ae_complex lhs, ae_complex rhs); ae_bool ae_c_neq(ae_complex lhs, ae_complex rhs); ae_complex ae_c_add(ae_complex lhs, ae_complex rhs); ae_complex ae_c_mul(ae_complex lhs, ae_complex rhs); ae_complex ae_c_sub(ae_complex lhs, ae_complex rhs); ae_complex ae_c_div(ae_complex lhs, ae_complex rhs); ae_bool ae_c_eq_d(ae_complex lhs, double rhs); ae_bool ae_c_neq_d(ae_complex lhs, double rhs); ae_complex ae_c_add_d(ae_complex lhs, double rhs); ae_complex ae_c_mul_d(ae_complex lhs, double rhs); ae_complex ae_c_sub_d(ae_complex lhs, double rhs); ae_complex ae_c_d_sub(double lhs, ae_complex rhs); ae_complex ae_c_div_d(ae_complex lhs, double rhs); ae_complex ae_c_d_div(double lhs, ae_complex rhs); ae_complex ae_c_conj(ae_complex lhs, ae_state *state); ae_complex ae_c_sqr(ae_complex lhs, ae_state *state); double ae_c_abs(ae_complex z, ae_state *state); /************************************************************************ Complex BLAS operations ************************************************************************/ ae_complex ae_v_cdotproduct(const ae_complex *v0, ae_int_t stride0, const char *conj0, const ae_complex *v1, ae_int_t stride1, const char *conj1, ae_int_t n); void ae_v_cmove(ae_complex *vdst, ae_int_t stride_dst, const ae_complex* vsrc, ae_int_t stride_src, const char *conj_src, ae_int_t n); void ae_v_cmoveneg(ae_complex *vdst, ae_int_t stride_dst, const ae_complex* vsrc, ae_int_t stride_src, const char *conj_src, ae_int_t n); void ae_v_cmoved(ae_complex *vdst, ae_int_t stride_dst, const ae_complex* vsrc, ae_int_t stride_src, const char *conj_src, ae_int_t n, double alpha); void ae_v_cmovec(ae_complex *vdst, ae_int_t stride_dst, const ae_complex* vsrc, ae_int_t stride_src, const char *conj_src, ae_int_t n, ae_complex alpha); void ae_v_cadd(ae_complex *vdst, ae_int_t stride_dst, const ae_complex *vsrc, ae_int_t stride_src, const char *conj_src, ae_int_t n); void ae_v_caddd(ae_complex *vdst, ae_int_t stride_dst, const ae_complex *vsrc, ae_int_t stride_src, const char *conj_src, ae_int_t n, double alpha); void ae_v_caddc(ae_complex *vdst, ae_int_t stride_dst, const ae_complex *vsrc, ae_int_t stride_src, const char *conj_src, ae_int_t n, ae_complex alpha); void ae_v_csub(ae_complex *vdst, ae_int_t stride_dst, const ae_complex *vsrc, ae_int_t stride_src, const char *conj_src, ae_int_t n); void ae_v_csubd(ae_complex *vdst, ae_int_t stride_dst, const ae_complex *vsrc, ae_int_t stride_src, const char *conj_src, ae_int_t n, double alpha); void ae_v_csubc(ae_complex *vdst, ae_int_t stride_dst, const ae_complex *vsrc, ae_int_t stride_src, const char *conj_src, ae_int_t n, ae_complex alpha); void ae_v_cmuld(ae_complex *vdst, ae_int_t stride_dst, ae_int_t n, double alpha); void ae_v_cmulc(ae_complex *vdst, ae_int_t stride_dst, ae_int_t n, ae_complex alpha); /************************************************************************ Real BLAS operations ************************************************************************/ double ae_v_dotproduct(const double *v0, ae_int_t stride0, const double *v1, ae_int_t stride1, ae_int_t n); void ae_v_move(double *vdst, ae_int_t stride_dst, const double* vsrc, ae_int_t stride_src, ae_int_t n); void ae_v_moveneg(double *vdst, ae_int_t stride_dst, const double* vsrc, ae_int_t stride_src, ae_int_t n); void ae_v_moved(double *vdst, ae_int_t stride_dst, const double* vsrc, ae_int_t stride_src, ae_int_t n, double alpha); void ae_v_add(double *vdst, ae_int_t stride_dst, const double *vsrc, ae_int_t stride_src, 
ae_int_t n); void ae_v_addd(double *vdst, ae_int_t stride_dst, const double *vsrc, ae_int_t stride_src, ae_int_t n, double alpha); void ae_v_sub(double *vdst, ae_int_t stride_dst, const double *vsrc, ae_int_t stride_src, ae_int_t n); void ae_v_subd(double *vdst, ae_int_t stride_dst, const double *vsrc, ae_int_t stride_src, ae_int_t n, double alpha); void ae_v_muld(double *vdst, ae_int_t stride_dst, ae_int_t n, double alpha); /************************************************************************ Other functions ************************************************************************/ ae_int_t ae_v_len(ae_int_t a, ae_int_t b); /* extern const double ae_machineepsilon; extern const double ae_maxrealnumber; extern const double ae_minrealnumber; extern const double ae_pi; */ #define ae_machineepsilon 5E-16 #define ae_maxrealnumber 1E300 #define ae_minrealnumber 1E-300 #define ae_pi 3.1415926535897932384626433832795 /************************************************************************ RComm functions ************************************************************************/ typedef struct rcommstate { int stage; ae_vector ia; ae_vector ba; ae_vector ra; ae_vector ca; } rcommstate; void _rcommstate_init(rcommstate* p, ae_state *_state); void _rcommstate_init_copy(rcommstate* dst, rcommstate* src, ae_state *_state); void _rcommstate_clear(rcommstate* p); void _rcommstate_destroy(rcommstate* p); /************************************************************************ Allocation counter, inactive by default. Turned on when needed for debugging purposes. ************************************************************************/ extern ae_int64_t _alloc_counter; extern ae_bool _use_alloc_counter; /************************************************************************ debug functions (must be turned on by preprocessor definitions): * tickcount(), which is wrapper around GetTickCount() * flushconsole(), fluches console * ae_debugrng(), returns random number generated with high-quality random numbers generator * ae_set_seed(), sets seed of the debug RNG (NON-THREAD-SAFE!!!) * ae_get_seed(), returns two seed values of the debug RNG (NON-THREAD-SAFE!!!) ************************************************************************/ #ifdef AE_DEBUG4WINDOWS #define flushconsole(s) fflush(stdout) #define tickcount(s) _tickcount() int _tickcount(); #endif #ifdef AE_DEBUG4POSIX #define flushconsole(s) fflush(stdout) #define tickcount(s) _tickcount() int _tickcount(); #endif } ///////////////////////////////////////////////////////////////////////// // // THIS SECTION CONTAINS DECLARATIONS FOR C++ RELATED FUNCTIONALITY // ///////////////////////////////////////////////////////////////////////// namespace alglib { typedef alglib_impl::ae_int_t ae_int_t; /******************************************************************** Class forwards ********************************************************************/ class complex; ae_int_t vlen(ae_int_t n1, ae_int_t n2); /******************************************************************** Exception class. ********************************************************************/ class ap_error { public: std::string msg; ap_error(); ap_error(const char *s); static void make_assertion(bool bClause); static void make_assertion(bool bClause, const char *p_msg); private: }; /******************************************************************** Complex number with double precision. 
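
Illustrative usage (not from the original sources), relying on the
constructors and operators declared below:

    alglib::complex z(1.0, 2.0);        // 1 + 2i
    alglib::complex w = z*z + 3.0;      // uses operator* and operator+
    double r = alglib::abscomplex(w);   // modulus of w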
********************************************************************/ class complex { public: complex(); complex(const double &_x); complex(const double &_x, const double &_y); complex(const complex &z); complex& operator= (const double& v); complex& operator+=(const double& v); complex& operator-=(const double& v); complex& operator*=(const double& v); complex& operator/=(const double& v); complex& operator= (const complex& z); complex& operator+=(const complex& z); complex& operator-=(const complex& z); complex& operator*=(const complex& z); complex& operator/=(const complex& z); alglib_impl::ae_complex* c_ptr(); const alglib_impl::ae_complex* c_ptr() const; std::string tostring(int dps) const; double x, y; }; const alglib::complex operator/(const alglib::complex& lhs, const alglib::complex& rhs); const bool operator==(const alglib::complex& lhs, const alglib::complex& rhs); const bool operator!=(const alglib::complex& lhs, const alglib::complex& rhs); const alglib::complex operator+(const alglib::complex& lhs); const alglib::complex operator-(const alglib::complex& lhs); const alglib::complex operator+(const alglib::complex& lhs, const alglib::complex& rhs); const alglib::complex operator+(const alglib::complex& lhs, const double& rhs); const alglib::complex operator+(const double& lhs, const alglib::complex& rhs); const alglib::complex operator-(const alglib::complex& lhs, const alglib::complex& rhs); const alglib::complex operator-(const alglib::complex& lhs, const double& rhs); const alglib::complex operator-(const double& lhs, const alglib::complex& rhs); const alglib::complex operator*(const alglib::complex& lhs, const alglib::complex& rhs); const alglib::complex operator*(const alglib::complex& lhs, const double& rhs); const alglib::complex operator*(const double& lhs, const alglib::complex& rhs); const alglib::complex operator/(const alglib::complex& lhs, const alglib::complex& rhs); const alglib::complex operator/(const double& lhs, const alglib::complex& rhs); const alglib::complex operator/(const alglib::complex& lhs, const double& rhs); double abscomplex(const alglib::complex &z); alglib::complex conj(const alglib::complex &z); alglib::complex csqr(const alglib::complex &z); void setnworkers(alglib::ae_int_t nworkers); /******************************************************************** Level 1 BLAS functions NOTES: * destination and source should NOT overlap * stride is assumed to be positive, but it is not assert'ed within function * conj_src parameter specifies whether complex source is conjugated before processing or not. Pass string which starts with 'N' or 'n' ("No conj", for example) to use unmodified parameter. All other values will result in conjugation of input, but it is recommended to use "Conj" in such cases. 
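An illustrative sketch of typical calls (the array names are hypothetical, not part of the header):

    // copy n complex elements with unit stride, conjugating the source on the fly
    alglib::vmove(&dst[0], 1, &src[0], 1, "Conj", n);

    // dot product of two real vectors with unit stride
    double s = alglib::vdotproduct(&a[0], 1, &b[0], 1, n);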
********************************************************************/ double vdotproduct(const double *v0, ae_int_t stride0, const double *v1, ae_int_t stride1, ae_int_t n); double vdotproduct(const double *v1, const double *v2, ae_int_t N); alglib::complex vdotproduct(const alglib::complex *v0, ae_int_t stride0, const char *conj0, const alglib::complex *v1, ae_int_t stride1, const char *conj1, ae_int_t n); alglib::complex vdotproduct(const alglib::complex *v1, const alglib::complex *v2, ae_int_t N); void vmove(double *vdst, ae_int_t stride_dst, const double* vsrc, ae_int_t stride_src, ae_int_t n); void vmove(double *vdst, const double* vsrc, ae_int_t N); void vmove(alglib::complex *vdst, ae_int_t stride_dst, const alglib::complex* vsrc, ae_int_t stride_src, const char *conj_src, ae_int_t n); void vmove(alglib::complex *vdst, const alglib::complex* vsrc, ae_int_t N); void vmoveneg(double *vdst, ae_int_t stride_dst, const double* vsrc, ae_int_t stride_src, ae_int_t n); void vmoveneg(double *vdst, const double *vsrc, ae_int_t N); void vmoveneg(alglib::complex *vdst, ae_int_t stride_dst, const alglib::complex* vsrc, ae_int_t stride_src, const char *conj_src, ae_int_t n); void vmoveneg(alglib::complex *vdst, const alglib::complex *vsrc, ae_int_t N); void vmove(double *vdst, ae_int_t stride_dst, const double* vsrc, ae_int_t stride_src, ae_int_t n, double alpha); void vmove(double *vdst, const double *vsrc, ae_int_t N, double alpha); void vmove(alglib::complex *vdst, ae_int_t stride_dst, const alglib::complex* vsrc, ae_int_t stride_src, const char *conj_src, ae_int_t n, double alpha); void vmove(alglib::complex *vdst, const alglib::complex *vsrc, ae_int_t N, double alpha); void vmove(alglib::complex *vdst, ae_int_t stride_dst, const alglib::complex* vsrc, ae_int_t stride_src, const char *conj_src, ae_int_t n, alglib::complex alpha); void vmove(alglib::complex *vdst, const alglib::complex *vsrc, ae_int_t N, alglib::complex alpha); void vadd(double *vdst, ae_int_t stride_dst, const double *vsrc, ae_int_t stride_src, ae_int_t n); void vadd(double *vdst, const double *vsrc, ae_int_t N); void vadd(alglib::complex *vdst, ae_int_t stride_dst, const alglib::complex *vsrc, ae_int_t stride_src, const char *conj_src, ae_int_t n); void vadd(alglib::complex *vdst, const alglib::complex *vsrc, ae_int_t N); void vadd(double *vdst, ae_int_t stride_dst, const double *vsrc, ae_int_t stride_src, ae_int_t n, double alpha); void vadd(double *vdst, const double *vsrc, ae_int_t N, double alpha); void vadd(alglib::complex *vdst, ae_int_t stride_dst, const alglib::complex *vsrc, ae_int_t stride_src, const char *conj_src, ae_int_t n, double alpha); void vadd(alglib::complex *vdst, const alglib::complex *vsrc, ae_int_t N, double alpha); void vadd(alglib::complex *vdst, ae_int_t stride_dst, const alglib::complex *vsrc, ae_int_t stride_src, const char *conj_src, ae_int_t n, alglib::complex alpha); void vadd(alglib::complex *vdst, const alglib::complex *vsrc, ae_int_t N, alglib::complex alpha); void vsub(double *vdst, ae_int_t stride_dst, const double *vsrc, ae_int_t stride_src, ae_int_t n); void vsub(double *vdst, const double *vsrc, ae_int_t N); void vsub(alglib::complex *vdst, ae_int_t stride_dst, const alglib::complex *vsrc, ae_int_t stride_src, const char *conj_src, ae_int_t n); void vsub(alglib::complex *vdst, const alglib::complex *vsrc, ae_int_t N); void vsub(double *vdst, ae_int_t stride_dst, const double *vsrc, ae_int_t stride_src, ae_int_t n, double alpha); void vsub(double *vdst, const double *vsrc, ae_int_t N, 
double alpha); void vsub(alglib::complex *vdst, ae_int_t stride_dst, const alglib::complex *vsrc, ae_int_t stride_src, const char *conj_src, ae_int_t n, double alpha); void vsub(alglib::complex *vdst, const alglib::complex *vsrc, ae_int_t N, double alpha); void vsub(alglib::complex *vdst, ae_int_t stride_dst, const alglib::complex *vsrc, ae_int_t stride_src, const char *conj_src, ae_int_t n, alglib::complex alpha); void vsub(alglib::complex *vdst, const alglib::complex *vsrc, ae_int_t N, alglib::complex alpha); void vmul(double *vdst, ae_int_t stride_dst, ae_int_t n, double alpha); void vmul(double *vdst, ae_int_t N, double alpha); void vmul(alglib::complex *vdst, ae_int_t stride_dst, ae_int_t n, double alpha); void vmul(alglib::complex *vdst, ae_int_t N, double alpha); void vmul(alglib::complex *vdst, ae_int_t stride_dst, ae_int_t n, alglib::complex alpha); void vmul(alglib::complex *vdst, ae_int_t N, alglib::complex alpha); /******************************************************************** string conversion functions !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! ********************************************************************/ /******************************************************************** 1- and 2-dimensional arrays ********************************************************************/ class ae_vector_wrapper { public: ae_vector_wrapper(); virtual ~ae_vector_wrapper(); void setlength(ae_int_t iLen); ae_int_t length() const; void attach_to(alglib_impl::ae_vector *ptr); void allocate_own(ae_int_t size, alglib_impl::ae_datatype datatype); const alglib_impl::ae_vector* c_ptr() const; alglib_impl::ae_vector* c_ptr(); private: ae_vector_wrapper(const ae_vector_wrapper &rhs); const ae_vector_wrapper& operator=(const ae_vector_wrapper &rhs); protected: // // Copies source vector RHS into current object. // // Current object is considered empty (this function should be // called from copy constructor). // void create(const ae_vector_wrapper &rhs); // // Copies array given by string into current object. Additional // parameter DATATYPE contains information about type of the data // in S and type of the array to create. // // Current object is considered empty (this function should be // called from copy constructor). // void create(const char *s, alglib_impl::ae_datatype datatype); // // Assigns RHS to current object. // // It has several branches depending on target object status: // * in case it is proxy object, data are copied into memory pointed by // proxy. Function checks that source has exactly same size as target // (exception is thrown on failure). // * in case it is non-proxy object, data allocated by object are cleared // and a copy of RHS is created in target. // // NOTE: this function correctly handles assignments of the object to itself. 
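//
// Illustrative consequence for the user-visible array classes built on this
// wrapper (a sketch, not part of the original comment): assigning one ordinary
// (non-proxy) real_1d_array to another reallocates the target to the source
// size, while assigning to a proxy array whose size differs from the source
// throws an exception.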
// void assign(const ae_vector_wrapper &rhs); alglib_impl::ae_vector *p_vec; alglib_impl::ae_vector vec; }; class boolean_1d_array : public ae_vector_wrapper { public: boolean_1d_array(); boolean_1d_array(const char *s); boolean_1d_array(const boolean_1d_array &rhs); boolean_1d_array(alglib_impl::ae_vector *p); const boolean_1d_array& operator=(const boolean_1d_array &rhs); virtual ~boolean_1d_array() ; const ae_bool& operator()(ae_int_t i) const; ae_bool& operator()(ae_int_t i); const ae_bool& operator[](ae_int_t i) const; ae_bool& operator[](ae_int_t i); void setcontent(ae_int_t iLen, const bool *pContent ); ae_bool* getcontent(); const ae_bool* getcontent() const; std::string tostring() const; }; class integer_1d_array : public ae_vector_wrapper { public: integer_1d_array(); integer_1d_array(const char *s); integer_1d_array(const integer_1d_array &rhs); integer_1d_array(alglib_impl::ae_vector *p); const integer_1d_array& operator=(const integer_1d_array &rhs); virtual ~integer_1d_array(); const ae_int_t& operator()(ae_int_t i) const; ae_int_t& operator()(ae_int_t i); const ae_int_t& operator[](ae_int_t i) const; ae_int_t& operator[](ae_int_t i); void setcontent(ae_int_t iLen, const ae_int_t *pContent ); ae_int_t* getcontent(); const ae_int_t* getcontent() const; std::string tostring() const; }; class real_1d_array : public ae_vector_wrapper { public: real_1d_array(); real_1d_array(const char *s); real_1d_array(const real_1d_array &rhs); real_1d_array(alglib_impl::ae_vector *p); const real_1d_array& operator=(const real_1d_array &rhs); virtual ~real_1d_array(); const double& operator()(ae_int_t i) const; double& operator()(ae_int_t i); const double& operator[](ae_int_t i) const; double& operator[](ae_int_t i); void setcontent(ae_int_t iLen, const double *pContent ); double* getcontent(); const double* getcontent() const; std::string tostring(int dps) const; }; class complex_1d_array : public ae_vector_wrapper { public: complex_1d_array(); complex_1d_array(const char *s); complex_1d_array(const complex_1d_array &rhs); complex_1d_array(alglib_impl::ae_vector *p); const complex_1d_array& operator=(const complex_1d_array &rhs); virtual ~complex_1d_array(); const alglib::complex& operator()(ae_int_t i) const; alglib::complex& operator()(ae_int_t i); const alglib::complex& operator[](ae_int_t i) const; alglib::complex& operator[](ae_int_t i); void setcontent(ae_int_t iLen, const alglib::complex *pContent ); alglib::complex* getcontent(); const alglib::complex* getcontent() const; std::string tostring(int dps) const; }; class ae_matrix_wrapper { public: ae_matrix_wrapper(); virtual ~ae_matrix_wrapper(); const ae_matrix_wrapper& operator=(const ae_matrix_wrapper &rhs); void setlength(ae_int_t rows, ae_int_t cols); ae_int_t rows() const; ae_int_t cols() const; bool isempty() const; ae_int_t getstride() const; void attach_to(alglib_impl::ae_matrix *ptr); void allocate_own(ae_int_t rows, ae_int_t cols, alglib_impl::ae_datatype datatype); const alglib_impl::ae_matrix* c_ptr() const; alglib_impl::ae_matrix* c_ptr(); private: ae_matrix_wrapper(const ae_matrix_wrapper &rhs); protected: // // Copies source matrix RHS into current object. // // Current object is considered empty (this function should be // called from copy constructor). // void create(const ae_matrix_wrapper &rhs); // // Copies array given by string into current object. Additional // parameter DATATYPE contains information about type of the data // in S and type of the array to create. 
// // Current object is considered empty (this function should be // called from copy constructor). // void create(const char *s, alglib_impl::ae_datatype datatype); // // Assigns RHS to current object. // // It has several branches depending on target object status: // * in case it is proxy object, data are copied into memory pointed by // proxy. Function checks that source has exactly same size as target // (exception is thrown on failure). // * in case it is non-proxy object, data allocated by object are cleared // and a copy of RHS is created in target. // // NOTE: this function correctly handles assignments of the object to itself. // void assign(const ae_matrix_wrapper &rhs); alglib_impl::ae_matrix *p_mat; alglib_impl::ae_matrix mat; }; class boolean_2d_array : public ae_matrix_wrapper { public: boolean_2d_array(); boolean_2d_array(const boolean_2d_array &rhs); boolean_2d_array(alglib_impl::ae_matrix *p); boolean_2d_array(const char *s); virtual ~boolean_2d_array(); const ae_bool& operator()(ae_int_t i, ae_int_t j) const; ae_bool& operator()(ae_int_t i, ae_int_t j); const ae_bool* operator[](ae_int_t i) const; ae_bool* operator[](ae_int_t i); void setcontent(ae_int_t irows, ae_int_t icols, const bool *pContent ); std::string tostring() const ; }; class integer_2d_array : public ae_matrix_wrapper { public: integer_2d_array(); integer_2d_array(const integer_2d_array &rhs); integer_2d_array(alglib_impl::ae_matrix *p); integer_2d_array(const char *s); virtual ~integer_2d_array(); const ae_int_t& operator()(ae_int_t i, ae_int_t j) const; ae_int_t& operator()(ae_int_t i, ae_int_t j); const ae_int_t* operator[](ae_int_t i) const; ae_int_t* operator[](ae_int_t i); void setcontent(ae_int_t irows, ae_int_t icols, const ae_int_t *pContent ); std::string tostring() const; }; class real_2d_array : public ae_matrix_wrapper { public: real_2d_array(); real_2d_array(const real_2d_array &rhs); real_2d_array(alglib_impl::ae_matrix *p); real_2d_array(const char *s); virtual ~real_2d_array(); const double& operator()(ae_int_t i, ae_int_t j) const; double& operator()(ae_int_t i, ae_int_t j); const double* operator[](ae_int_t i) const; double* operator[](ae_int_t i); void setcontent(ae_int_t irows, ae_int_t icols, const double *pContent ); std::string tostring(int dps) const; }; class complex_2d_array : public ae_matrix_wrapper { public: complex_2d_array(); complex_2d_array(const complex_2d_array &rhs); complex_2d_array(alglib_impl::ae_matrix *p); complex_2d_array(const char *s); virtual ~complex_2d_array(); const alglib::complex& operator()(ae_int_t i, ae_int_t j) const; alglib::complex& operator()(ae_int_t i, ae_int_t j); const alglib::complex* operator[](ae_int_t i) const; alglib::complex* operator[](ae_int_t i); void setcontent(ae_int_t irows, ae_int_t icols, const alglib::complex *pContent ); std::string tostring(int dps) const; }; /******************************************************************** CSV operations: reading CSV file to real matrix. This function reads CSV file and stores its contents to double precision 2D array. 
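A typical call might look like this (an illustrative sketch; the file name is hypothetical):

    alglib::real_2d_array arr;
    alglib::read_csv("mydata.csv", ',', alglib::CSV_SKIP_HEADERS, arr);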
Format of the data file must conform to RFC 4180 specification, with additional notes: * file size should be less than 2GB * ASCI encoding, UTF-8 without BOM (in header names) are supported * any character (comma/tab/space) may be used as field separator, as long as it is distinct from one used for decimal point * multiple subsequent field separators (say, two spaces) are treated as MULTIPLE separators, not one big separator * both comma and full stop may be used as decimal point. Parser will automatically determine specific character being used. Both fixed and exponential number formats are allowed. Thousand separators are NOT allowed. * line may end with \n (Unix style) or \r\n (Windows style), parser will automatically adapt to chosen convention * escaped fields (ones in double quotes) are not supported INPUT PARAMETERS: filename relative/absolute path separator character used to separate fields. May be ' ', ',', '\t'. Other separators are possible too. flags several values combined with bitwise OR: * alglib::CSV_SKIP_HEADERS - if present, first row contains headers and will be skipped. Its contents is used to determine fields count, and that's all. If no flags are specified, default value 0x0 (or alglib::CSV_DEFAULT, which is same) should be used. OUTPUT PARAMETERS: out 2D matrix, CSV file parsed with atof() HANDLING OF SPECIAL CASES: * file does not exist - alglib::ap_error exception is thrown * empty file - empty array is returned (no exception) * skip_first_row=true, only one row in file - empty array is returned * field contents is not recognized by atof() - field value is replaced by 0.0 ********************************************************************/ void read_csv(const char *filename, char separator, int flags, alglib::real_2d_array &out); /******************************************************************** dataset information. 
can store regression dataset, classification dataset, or non-labeled task: * nout==0 means non-labeled task (clustering, for example) * nout>0 && nclasses==0 means regression task * nout>0 && nclasses>0 means classification task ********************************************************************/ /*class dataset { public: dataset():nin(0), nout(0), nclasses(0), trnsize(0), valsize(0), tstsize(0), totalsize(0){}; int nin, nout, nclasses; int trnsize; int valsize; int tstsize; int totalsize; alglib::real_2d_array trn; alglib::real_2d_array val; alglib::real_2d_array tst; alglib::real_2d_array all; }; bool opendataset(std::string file, dataset *pdataset); // // internal functions // std::string strtolower(const std::string &s); bool readstrings(std::string file, std::list *pOutput); bool readstrings(std::string file, std::list *pOutput, std::string comment); void explodestring(std::string s, char sep, std::vector *pOutput); std::string xtrim(std::string s);*/ /******************************************************************** Constants and functions introduced for compatibility with AlgoPascal ********************************************************************/ extern const double machineepsilon; extern const double maxrealnumber; extern const double minrealnumber; extern const double fp_nan; extern const double fp_posinf; extern const double fp_neginf; extern const ae_int_t endianness; static const int CSV_DEFAULT = 0x0; static const int CSV_SKIP_HEADERS = 0x1; int sign(double x); double randomreal(); ae_int_t randominteger(ae_int_t maxv); int round(double x); int trunc(double x); int ifloor(double x); int iceil(double x); double pi(); double sqr(double x); int maxint(int m1, int m2); int minint(int m1, int m2); double maxreal(double m1, double m2); double minreal(double m1, double m2); bool fp_eq(double v1, double v2); bool fp_neq(double v1, double v2); bool fp_less(double v1, double v2); bool fp_less_eq(double v1, double v2); bool fp_greater(double v1, double v2); bool fp_greater_eq(double v1, double v2); bool fp_isnan(double x); bool fp_isposinf(double x); bool fp_isneginf(double x); bool fp_isinf(double x); bool fp_isfinite(double x); }//namespace alglib ///////////////////////////////////////////////////////////////////////// // // THIS SECTIONS CONTAINS DECLARATIONS FOR OPTIMIZED LINEAR ALGEBRA CODES // IT IS SHARED BETWEEN C++ AND PURE C LIBRARIES // ///////////////////////////////////////////////////////////////////////// namespace alglib_impl { #define ALGLIB_INTERCEPTS_ABLAS void _ialglib_vzero(ae_int_t n, double *p, ae_int_t stride); void _ialglib_vzero_complex(ae_int_t n, ae_complex *p, ae_int_t stride); void _ialglib_vcopy(ae_int_t n, const double *a, ae_int_t stridea, double *b, ae_int_t strideb); void _ialglib_vcopy_complex(ae_int_t n, const ae_complex *a, ae_int_t stridea, double *b, ae_int_t strideb, const char *conj); void _ialglib_vcopy_dcomplex(ae_int_t n, const double *a, ae_int_t stridea, double *b, ae_int_t strideb, const char *conj); void _ialglib_mcopyblock(ae_int_t m, ae_int_t n, const double *a, ae_int_t op, ae_int_t stride, double *b); void _ialglib_mcopyunblock(ae_int_t m, ae_int_t n, const double *a, ae_int_t op, double *b, ae_int_t stride); void _ialglib_mcopyblock_complex(ae_int_t m, ae_int_t n, const ae_complex *a, ae_int_t op, ae_int_t stride, double *b); void _ialglib_mcopyunblock_complex(ae_int_t m, ae_int_t n, const double *a, ae_int_t op, ae_complex* b, ae_int_t stride); ae_bool _ialglib_i_rmatrixgemmf(ae_int_t m, ae_int_t n, ae_int_t k, double 
alpha, ae_matrix *a, ae_int_t ia, ae_int_t ja, ae_int_t optypea, ae_matrix *b, ae_int_t ib, ae_int_t jb, ae_int_t optypeb, double beta, ae_matrix *c, ae_int_t ic, ae_int_t jc); ae_bool _ialglib_i_cmatrixgemmf(ae_int_t m, ae_int_t n, ae_int_t k, ae_complex alpha, ae_matrix *a, ae_int_t ia, ae_int_t ja, ae_int_t optypea, ae_matrix *b, ae_int_t ib, ae_int_t jb, ae_int_t optypeb, ae_complex beta, ae_matrix *c, ae_int_t ic, ae_int_t jc); ae_bool _ialglib_i_cmatrixrighttrsmf(ae_int_t m, ae_int_t n, ae_matrix *a, ae_int_t i1, ae_int_t j1, ae_bool isupper, ae_bool isunit, ae_int_t optype, ae_matrix *x, ae_int_t i2, ae_int_t j2); ae_bool _ialglib_i_rmatrixrighttrsmf(ae_int_t m, ae_int_t n, ae_matrix *a, ae_int_t i1, ae_int_t j1, ae_bool isupper, ae_bool isunit, ae_int_t optype, ae_matrix *x, ae_int_t i2, ae_int_t j2); ae_bool _ialglib_i_cmatrixlefttrsmf(ae_int_t m, ae_int_t n, ae_matrix *a, ae_int_t i1, ae_int_t j1, ae_bool isupper, ae_bool isunit, ae_int_t optype, ae_matrix *x, ae_int_t i2, ae_int_t j2); ae_bool _ialglib_i_rmatrixlefttrsmf(ae_int_t m, ae_int_t n, ae_matrix *a, ae_int_t i1, ae_int_t j1, ae_bool isupper, ae_bool isunit, ae_int_t optype, ae_matrix *x, ae_int_t i2, ae_int_t j2); ae_bool _ialglib_i_cmatrixherkf(ae_int_t n, ae_int_t k, double alpha, ae_matrix *a, ae_int_t ia, ae_int_t ja, ae_int_t optypea, double beta, ae_matrix *c, ae_int_t ic, ae_int_t jc, ae_bool isupper); ae_bool _ialglib_i_rmatrixsyrkf(ae_int_t n, ae_int_t k, double alpha, ae_matrix *a, ae_int_t ia, ae_int_t ja, ae_int_t optypea, double beta, ae_matrix *c, ae_int_t ic, ae_int_t jc, ae_bool isupper); ae_bool _ialglib_i_cmatrixrank1f(ae_int_t m, ae_int_t n, ae_matrix *a, ae_int_t ia, ae_int_t ja, ae_vector *u, ae_int_t uoffs, ae_vector *v, ae_int_t voffs); ae_bool _ialglib_i_rmatrixrank1f(ae_int_t m, ae_int_t n, ae_matrix *a, ae_int_t ia, ae_int_t ja, ae_vector *u, ae_int_t uoffs, ae_vector *v, ae_int_t voffs); } ///////////////////////////////////////////////////////////////////////// // // THIS SECTION CONTAINS PARALLEL SUBROUTINES // ///////////////////////////////////////////////////////////////////////// namespace alglib_impl { } #endif cpp/src/specialfunctions.h0000755000175000017500000017474713105126766015565 0ustar sergeysergey/************************************************************************* ALGLIB 3.11.0 (source code generated 2017-05-11) Copyright (c) Sergey Bochkanov (ALGLIB project). >>> SOURCE LICENSE >>> This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation (www.fsf.org); either version 2 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. 
A copy of the GNU General Public License is available at http://www.fsf.org/licensing/licenses >>> END OF LICENSE >>> *************************************************************************/ #ifndef _specialfunctions_pkg_h #define _specialfunctions_pkg_h #include "ap.h" #include "alglibinternal.h" ///////////////////////////////////////////////////////////////////////// // // THIS SECTION CONTAINS COMPUTATIONAL CORE DECLARATIONS (DATATYPES) // ///////////////////////////////////////////////////////////////////////// namespace alglib_impl { } ///////////////////////////////////////////////////////////////////////// // // THIS SECTION CONTAINS C++ INTERFACE // ///////////////////////////////////////////////////////////////////////// namespace alglib { /************************************************************************* Gamma function Input parameters: X - argument Domain: 0 < X < 171.6 -170 < X < 0, X is not an integer. Relative error: arithmetic domain # trials peak rms IEEE -170,-33 20000 2.3e-15 3.3e-16 IEEE -33, 33 20000 9.4e-16 2.2e-16 IEEE 33, 171.6 20000 2.3e-15 3.2e-16 Cephes Math Library Release 2.8: June, 2000 Original copyright 1984, 1987, 1989, 1992, 2000 by Stephen L. Moshier Translated to AlgoPascal by Bochkanov Sergey (2005, 2006, 2007). *************************************************************************/ double gammafunction(const double x); /************************************************************************* Natural logarithm of gamma function Input parameters: X - argument Result: logarithm of the absolute value of the Gamma(X). Output parameters: SgnGam - sign(Gamma(X)) Domain: 0 < X < 2.55e305 -2.55e305 < X < 0, X is not an integer. ACCURACY: arithmetic domain # trials peak rms IEEE 0, 3 28000 5.4e-16 1.1e-16 IEEE 2.718, 2.556e305 40000 3.5e-16 8.3e-17 The error criterion was relative when the function magnitude was greater than one but absolute when it was less than one. The following test used the relative error criterion, though at certain points the relative error could be much higher than indicated. IEEE -200, -4 10000 4.8e-16 1.3e-16 Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 1989, 1992, 2000 by Stephen L. Moshier Translated to AlgoPascal by Bochkanov Sergey (2005, 2006, 2007). *************************************************************************/ double lngamma(const double x, double &sgngam); /************************************************************************* Error function The integral is x - 2 | | 2 erf(x) = -------- | exp( - t ) dt. sqrt(pi) | | - 0 For 0 <= |x| < 1, erf(x) = x * P4(x**2)/Q5(x**2); otherwise erf(x) = 1 - erfc(x). ACCURACY: Relative error: arithmetic domain # trials peak rms IEEE 0,1 30000 3.7e-16 1.0e-16 Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 1988, 1992, 2000 by Stephen L. Moshier *************************************************************************/ double errorfunction(const double x); /************************************************************************* Complementary error function 1 - erf(x) = inf. - 2 | | 2 erfc(x) = -------- | exp( - t ) dt sqrt(pi) | | - x For small x, erfc(x) = 1 - erf(x); otherwise rational approximations are computed. ACCURACY: Relative error: arithmetic domain # trials peak rms IEEE 0,26.6417 30000 5.7e-14 1.5e-14 Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 1988, 1992, 2000 by Stephen L. 
Moshier *************************************************************************/ double errorfunctionc(const double x); /************************************************************************* Normal distribution function Returns the area under the Gaussian probability density function, integrated from minus infinity to x: x - 1 | | 2 ndtr(x) = --------- | exp( - t /2 ) dt sqrt(2pi) | | - -inf. = ( 1 + erf(z) ) / 2 = erfc(z) / 2 where z = x/sqrt(2). Computation is via the functions erf and erfc. ACCURACY: Relative error: arithmetic domain # trials peak rms IEEE -13,0 30000 3.4e-14 6.7e-15 Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 1988, 1992, 2000 by Stephen L. Moshier *************************************************************************/ double normaldistribution(const double x); /************************************************************************* Inverse of the error function Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 1988, 1992, 2000 by Stephen L. Moshier *************************************************************************/ double inverf(const double e); /************************************************************************* Inverse of Normal distribution function Returns the argument, x, for which the area under the Gaussian probability density function (integrated from minus infinity to x) is equal to y. For small arguments 0 < y < exp(-2), the program computes z = sqrt( -2.0 * log(y) ); then the approximation is x = z - log(z)/z - (1/z) P(1/z) / Q(1/z). There are two rational functions P/Q, one for 0 < y < exp(-32) and the other for y up to exp(-2). For larger arguments, w = y - 0.5, and x/sqrt(2pi) = w + w**3 R(w**2)/S(w**2)). ACCURACY: Relative error: arithmetic domain # trials peak rms IEEE 0.125, 1 20000 7.2e-16 1.3e-16 IEEE 3e-308, 0.135 50000 4.6e-16 9.8e-17 Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 1988, 1992, 2000 by Stephen L. Moshier *************************************************************************/ double invnormaldistribution(const double y0); /************************************************************************* Incomplete gamma integral The function is defined by x - 1 | | -t a-1 igam(a,x) = ----- | e t dt. - | | | (a) - 0 In this implementation both arguments must be positive. The integral is evaluated by either a power series or continued fraction expansion, depending on the relative values of a and x. ACCURACY: Relative error: arithmetic domain # trials peak rms IEEE 0,30 200000 3.6e-14 2.9e-15 IEEE 0,100 300000 9.9e-14 1.5e-14 Cephes Math Library Release 2.8: June, 2000 Copyright 1985, 1987, 2000 by Stephen L. Moshier *************************************************************************/ double incompletegamma(const double a, const double x); /************************************************************************* Complemented incomplete gamma integral The function is defined by igamc(a,x) = 1 - igam(a,x) inf. - 1 | | -t a-1 = ----- | e t dt. - | | | (a) - x In this implementation both arguments must be positive. The integral is evaluated by either a power series or continued fraction expansion, depending on the relative values of a and x. ACCURACY: Tested at random a, x. a x Relative error: arithmetic domain domain # trials peak rms IEEE 0.5,100 0,100 200000 1.9e-14 1.7e-15 IEEE 0.01,0.5 0,100 200000 1.4e-13 1.6e-15 Cephes Math Library Release 2.8: June, 2000 Copyright 1985, 1987, 2000 by Stephen L. 
Moshier *************************************************************************/ double incompletegammac(const double a, const double x); /************************************************************************* Inverse of complemented imcomplete gamma integral Given p, the function finds x such that igamc( a, x ) = p. Starting with the approximate value 3 x = a t where t = 1 - d - ndtri(p) sqrt(d) and d = 1/9a, the routine performs up to 10 Newton iterations to find the root of igamc(a,x) - p = 0. ACCURACY: Tested at random a, p in the intervals indicated. a p Relative error: arithmetic domain domain # trials peak rms IEEE 0.5,100 0,0.5 100000 1.0e-14 1.7e-15 IEEE 0.01,0.5 0,0.5 100000 9.0e-14 3.4e-15 IEEE 0.5,10000 0,0.5 20000 2.3e-13 3.8e-14 Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 1995, 2000 by Stephen L. Moshier *************************************************************************/ double invincompletegammac(const double a, const double y0); /************************************************************************* Complete elliptic integral of the first kind Approximates the integral pi/2 - | | | dt K(m) = | ------------------ | 2 | | sqrt( 1 - m sin t ) - 0 using the approximation P(x) - log x Q(x). ACCURACY: Relative error: arithmetic domain # trials peak rms IEEE 0,1 30000 2.5e-16 6.8e-17 Cephes Math Library, Release 2.8: June, 2000 Copyright 1984, 1987, 2000 by Stephen L. Moshier *************************************************************************/ double ellipticintegralk(const double m); /************************************************************************* Complete elliptic integral of the first kind Approximates the integral pi/2 - | | | dt K(m) = | ------------------ | 2 | | sqrt( 1 - m sin t ) - 0 where m = 1 - m1, using the approximation P(x) - log x Q(x). The argument m1 is used rather than m so that the logarithmic singularity at m = 1 will be shifted to the origin; this preserves maximum accuracy. K(0) = pi/2. ACCURACY: Relative error: arithmetic domain # trials peak rms IEEE 0,1 30000 2.5e-16 6.8e-17 Cephes Math Library, Release 2.8: June, 2000 Copyright 1984, 1987, 2000 by Stephen L. Moshier *************************************************************************/ double ellipticintegralkhighprecision(const double m1); /************************************************************************* Incomplete elliptic integral of the first kind F(phi|m) Approximates the integral phi - | | | dt F(phi_\m) = | ------------------ | 2 | | sqrt( 1 - m sin t ) - 0 of amplitude phi and modulus m, using the arithmetic - geometric mean algorithm. ACCURACY: Tested at random points with m in [0, 1] and phi as indicated. Relative error: arithmetic domain # trials peak rms IEEE -10,10 200000 7.4e-16 1.0e-16 Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 2000 by Stephen L. Moshier *************************************************************************/ double incompleteellipticintegralk(const double phi, const double m); /************************************************************************* Complete elliptic integral of the second kind Approximates the integral pi/2 - | | 2 E(m) = | sqrt( 1 - m sin t ) dt | | - 0 using the approximation P(x) - x log x Q(x). ACCURACY: Relative error: arithmetic domain # trials peak rms IEEE 0, 1 10000 2.1e-16 7.3e-17 Cephes Math Library, Release 2.8: June, 2000 Copyright 1984, 1987, 1989, 2000 by Stephen L. 
Moshier *************************************************************************/ double ellipticintegrale(const double m); /************************************************************************* Incomplete elliptic integral of the second kind Approximates the integral phi - | | | 2 E(phi_\m) = | sqrt( 1 - m sin t ) dt | | | - 0 of amplitude phi and modulus m, using the arithmetic - geometric mean algorithm. ACCURACY: Tested at random arguments with phi in [-10, 10] and m in [0, 1]. Relative error: arithmetic domain # trials peak rms IEEE -10,10 150000 3.3e-15 1.4e-16 Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 1993, 2000 by Stephen L. Moshier *************************************************************************/ double incompleteellipticintegrale(const double phi, const double m); /************************************************************************* Calculation of the value of the Hermite polynomial. Parameters: n - degree, n>=0 x - argument Result: the value of the Hermite polynomial Hn at x *************************************************************************/ double hermitecalculate(const ae_int_t n, const double x); /************************************************************************* Summation of Hermite polynomials using Clenshaw's recurrence formula. This routine calculates c[0]*H0(x) + c[1]*H1(x) + ... + c[N]*HN(x) Parameters: n - degree, n>=0 x - argument Result: the value of the Hermite polynomial at x *************************************************************************/ double hermitesum(const real_1d_array &c, const ae_int_t n, const double x); /************************************************************************* Representation of Hn as C[0] + C[1]*X + ... + C[N]*X^N Input parameters: N - polynomial degree, n>=0 Output parameters: C - coefficients *************************************************************************/ void hermitecoefficients(const ae_int_t n, real_1d_array &c); /************************************************************************* Dawson's Integral Approximates the integral x - 2 | | 2 dawsn(x) = exp( -x ) | exp( t ) dt | | - 0 Three different rational approximations are employed, for the intervals 0 to 3.25; 3.25 to 6.25; and 6.25 up. ACCURACY: Relative error: arithmetic domain # trials peak rms IEEE 0,10 10000 6.9e-16 1.0e-16 Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 1989, 2000 by Stephen L. Moshier *************************************************************************/ double dawsonintegral(const double x); /************************************************************************* Sine and cosine integrals Evaluates the integrals x - | cos t - 1 Ci(x) = eul + ln x + | --------- dt, | t - 0 x - | sin t Si(x) = | ----- dt | t - 0 where eul = 0.57721566490153286061 is Euler's constant. The integrals are approximated by rational functions. For x > 8 auxiliary functions f(x) and g(x) are employed such that Ci(x) = f(x) sin(x) - g(x) cos(x) Si(x) = pi/2 - f(x) cos(x) - g(x) sin(x) ACCURACY: Test interval = [0,50]. Absolute error, except relative when > 1: arithmetic function # trials peak rms IEEE Si 30000 4.4e-16 7.3e-17 IEEE Ci 30000 6.9e-16 5.1e-17 Cephes Math Library Release 2.1: January, 1989 Copyright 1984, 1987, 1989 by Stephen L. 
Moshier *************************************************************************/ void sinecosineintegrals(const double x, double &si, double &ci); /************************************************************************* Hyperbolic sine and cosine integrals Approximates the integrals x - | | cosh t - 1 Chi(x) = eul + ln x + | ----------- dt, | | t - 0 x - | | sinh t Shi(x) = | ------ dt | | t - 0 where eul = 0.57721566490153286061 is Euler's constant. The integrals are evaluated by power series for x < 8 and by Chebyshev expansions for x between 8 and 88. For large x, both functions approach exp(x)/2x. Arguments greater than 88 in magnitude return MAXNUM. ACCURACY: Test interval 0 to 88. Relative error: arithmetic function # trials peak rms IEEE Shi 30000 6.9e-16 1.6e-16 Absolute error, except relative when |Chi| > 1: IEEE Chi 30000 8.4e-16 1.4e-16 Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 2000 by Stephen L. Moshier *************************************************************************/ void hyperbolicsinecosineintegrals(const double x, double &shi, double &chi); /************************************************************************* Poisson distribution Returns the sum of the first k+1 terms of the Poisson distribution: k j -- -m m > e -- -- j! j=0 The terms are not summed directly; instead the incomplete gamma integral is employed, according to the relation y = pdtr( k, m ) = igamc( k+1, m ). The arguments must both be positive. ACCURACY: See incomplete gamma function Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 1995, 2000 by Stephen L. Moshier *************************************************************************/ double poissondistribution(const ae_int_t k, const double m); /************************************************************************* Complemented Poisson distribution Returns the sum of the terms k+1 to infinity of the Poisson distribution: inf. j -- -m m > e -- -- j! j=k+1 The terms are not summed directly; instead the incomplete gamma integral is employed, according to the formula y = pdtrc( k, m ) = igam( k+1, m ). The arguments must both be positive. ACCURACY: See incomplete gamma function Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 1995, 2000 by Stephen L. Moshier *************************************************************************/ double poissoncdistribution(const ae_int_t k, const double m); /************************************************************************* Inverse Poisson distribution Finds the Poisson variable x such that the integral from 0 to x of the Poisson density is equal to the given probability y. This is accomplished using the inverse gamma integral function and the relation m = igami( k+1, y ). ACCURACY: See inverse incomplete gamma function Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 1995, 2000 by Stephen L. Moshier *************************************************************************/ double invpoissondistribution(const ae_int_t k, const double y); /************************************************************************* Bessel function of order zero Returns Bessel function of order zero of the argument. The domain is divided into the intervals [0, 5] and (5, infinity). In the first interval the following rational approximation is used: 2 2 (w - r ) (w - r ) P (w) / Q (w) 1 2 3 8 2 where w = x and the two r's are zeros of the function. 
In the second interval, the Hankel asymptotic expansion is employed with two rational functions of degree 6/6 and 7/7. ACCURACY: Absolute error: arithmetic domain # trials peak rms IEEE 0, 30 60000 4.2e-16 1.1e-16 Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 1989, 2000 by Stephen L. Moshier *************************************************************************/ double besselj0(const double x); /************************************************************************* Bessel function of order one Returns Bessel function of order one of the argument. The domain is divided into the intervals [0, 8] and (8, infinity). In the first interval a 24 term Chebyshev expansion is used. In the second, the asymptotic trigonometric representation is employed using two rational functions of degree 5/5. ACCURACY: Absolute error: arithmetic domain # trials peak rms IEEE 0, 30 30000 2.6e-16 1.1e-16 Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 1989, 2000 by Stephen L. Moshier *************************************************************************/ double besselj1(const double x); /************************************************************************* Bessel function of integer order Returns Bessel function of order n, where n is a (possibly negative) integer. The ratio of jn(x) to j0(x) is computed by backward recurrence. First the ratio jn/jn-1 is found by a continued fraction expansion. Then the recurrence relating successive orders is applied until j0 or j1 is reached. If n = 0 or 1 the routine for j0 or j1 is called directly. ACCURACY: Absolute error: arithmetic range # trials peak rms IEEE 0, 30 5000 4.4e-16 7.9e-17 Not suitable for large n or x. Use jv() (fractional order) instead. Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 2000 by Stephen L. Moshier *************************************************************************/ double besseljn(const ae_int_t n, const double x); /************************************************************************* Bessel function of the second kind, order zero Returns Bessel function of the second kind, of order zero, of the argument. The domain is divided into the intervals [0, 5] and (5, infinity). In the first interval a rational approximation R(x) is employed to compute y0(x) = R(x) + 2 * log(x) * j0(x) / PI. Thus a call to j0() is required. In the second interval, the Hankel asymptotic expansion is employed with two rational functions of degree 6/6 and 7/7. ACCURACY: Absolute error, when y0(x) < 1; else relative error: arithmetic domain # trials peak rms IEEE 0, 30 30000 1.3e-15 1.6e-16 Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 1989, 2000 by Stephen L. Moshier *************************************************************************/ double bessely0(const double x); /************************************************************************* Bessel function of second kind of order one Returns Bessel function of the second kind of order one of the argument. The domain is divided into the intervals [0, 8] and (8, infinity). In the first interval a 25 term Chebyshev expansion is used, and a call to j1() is required. In the second, the asymptotic trigonometric representation is employed using two rational functions of degree 5/5. ACCURACY: Absolute error: arithmetic domain # trials peak rms IEEE 0, 30 30000 1.0e-15 1.3e-16 Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 1989, 2000 by Stephen L. 
Moshier *************************************************************************/ double bessely1(const double x); /************************************************************************* Bessel function of second kind of integer order Returns Bessel function of order n, where n is a (possibly negative) integer. The function is evaluated by forward recurrence on n, starting with values computed by the routines y0() and y1(). If n = 0 or 1 the routine for y0 or y1 is called directly. ACCURACY: Absolute error, except relative when y > 1: arithmetic domain # trials peak rms IEEE 0, 30 30000 3.4e-15 4.3e-16 Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 2000 by Stephen L. Moshier *************************************************************************/ double besselyn(const ae_int_t n, const double x); /************************************************************************* Modified Bessel function of order zero Returns modified Bessel function of order zero of the argument. The function is defined as i0(x) = j0( ix ). The range is partitioned into the two intervals [0,8] and (8, infinity). Chebyshev polynomial expansions are employed in each interval. ACCURACY: Relative error: arithmetic domain # trials peak rms IEEE 0,30 30000 5.8e-16 1.4e-16 Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 2000 by Stephen L. Moshier *************************************************************************/ double besseli0(const double x); /************************************************************************* Modified Bessel function of order one Returns modified Bessel function of order one of the argument. The function is defined as i1(x) = -i j1( ix ). The range is partitioned into the two intervals [0,8] and (8, infinity). Chebyshev polynomial expansions are employed in each interval. ACCURACY: Relative error: arithmetic domain # trials peak rms IEEE 0, 30 30000 1.9e-15 2.1e-16 Cephes Math Library Release 2.8: June, 2000 Copyright 1985, 1987, 2000 by Stephen L. Moshier *************************************************************************/ double besseli1(const double x); /************************************************************************* Modified Bessel function, second kind, order zero Returns modified Bessel function of the second kind of order zero of the argument. The range is partitioned into the two intervals [0,8] and (8, infinity). Chebyshev polynomial expansions are employed in each interval. ACCURACY: Tested at 2000 random points between 0 and 8. Peak absolute error (relative when K0 > 1) was 1.46e-14; rms, 4.26e-15. Relative error: arithmetic domain # trials peak rms IEEE 0, 30 30000 1.2e-15 1.6e-16 Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 2000 by Stephen L. Moshier *************************************************************************/ double besselk0(const double x); /************************************************************************* Modified Bessel function, second kind, order one Computes the modified Bessel function of the second kind of order one of the argument. The range is partitioned into the two intervals [0,2] and (2, infinity). Chebyshev polynomial expansions are employed in each interval. ACCURACY: Relative error: arithmetic domain # trials peak rms IEEE 0, 30 30000 1.2e-15 1.6e-16 Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 2000 by Stephen L. 
Moshier *************************************************************************/ double besselk1(const double x); /************************************************************************* Modified Bessel function, second kind, integer order Returns modified Bessel function of the second kind of order n of the argument. The range is partitioned into the two intervals [0,9.55] and (9.55, infinity). An ascending power series is used in the low range, and an asymptotic expansion in the high range. ACCURACY: Relative error: arithmetic domain # trials peak rms IEEE 0,30 90000 1.8e-8 3.0e-10 Error is high only near the crossover point x = 9.55 between the two expansions used. Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 1988, 2000 by Stephen L. Moshier *************************************************************************/ double besselkn(const ae_int_t nn, const double x); /************************************************************************* Incomplete beta integral Returns incomplete beta integral of the arguments, evaluated from zero to x. The function is defined as x - - | (a+b) | | a-1 b-1 ----------- | t (1-t) dt. - - | | | (a) | (b) - 0 The domain of definition is 0 <= x <= 1. In this implementation a and b are restricted to positive values. The integral from x to 1 may be obtained by the symmetry relation 1 - incbet( a, b, x ) = incbet( b, a, 1-x ). The integral is evaluated by a continued fraction expansion or, when b*x is small, by a power series. ACCURACY: Tested at uniformly distributed random points (a,b,x) with a and b in "domain" and x between 0 and 1. Relative error arithmetic domain # trials peak rms IEEE 0,5 10000 6.9e-15 4.5e-16 IEEE 0,85 250000 2.2e-13 1.7e-14 IEEE 0,1000 30000 5.3e-12 6.3e-13 IEEE 0,10000 250000 9.3e-11 7.1e-12 IEEE 0,100000 10000 8.7e-10 4.8e-11 Outputs smaller than the IEEE gradual underflow threshold were excluded from these statistics. Cephes Math Library, Release 2.8: June, 2000 Copyright 1984, 1995, 2000 by Stephen L. Moshier *************************************************************************/ double incompletebeta(const double a, const double b, const double x); /************************************************************************* Inverse of imcomplete beta integral Given y, the function finds x such that incbet( a, b, x ) = y . The routine performs interval halving or Newton iterations to find the root of incbet(a,b,x) - y = 0. ACCURACY: Relative error: x a,b arithmetic domain domain # trials peak rms IEEE 0,1 .5,10000 50000 5.8e-12 1.3e-13 IEEE 0,1 .25,100 100000 1.8e-13 3.9e-15 IEEE 0,1 0,5 50000 1.1e-12 5.5e-15 With a and b constrained to half-integer or integer values: IEEE 0,1 .5,10000 50000 5.8e-12 1.1e-13 IEEE 0,1 .5,100 100000 1.7e-14 7.9e-16 With a = .5, b constrained to half-integer or integer values: IEEE 0,1 .5,10000 10000 8.3e-11 1.0e-11 Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1996, 2000 by Stephen L. Moshier *************************************************************************/ double invincompletebeta(const double a, const double b, const double y); /************************************************************************* F distribution Returns the area from zero to x under the F density function (also known as Snedcor's density or the variance ratio density). This is the density of x = (u1/df1)/(u2/df2), where u1 and u2 are random variables having Chi square distributions with df1 and df2 degrees of freedom, respectively. 
The incomplete beta integral is used, according to the formula P(x) = incbet( df1/2, df2/2, (df1*x/(df2 + df1*x) ). The arguments a and b are greater than zero, and x is nonnegative. ACCURACY: Tested at random points (a,b,x). x a,b Relative error: arithmetic domain domain # trials peak rms IEEE 0,1 0,100 100000 9.8e-15 1.7e-15 IEEE 1,5 0,100 100000 6.5e-15 3.5e-16 IEEE 0,1 1,10000 100000 2.2e-11 3.3e-12 IEEE 1,5 1,10000 100000 1.1e-11 1.7e-13 Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 1995, 2000 by Stephen L. Moshier *************************************************************************/ double fdistribution(const ae_int_t a, const ae_int_t b, const double x); /************************************************************************* Complemented F distribution Returns the area from x to infinity under the F density function (also known as Snedcor's density or the variance ratio density). inf. - 1 | | a-1 b-1 1-P(x) = ------ | t (1-t) dt B(a,b) | | - x The incomplete beta integral is used, according to the formula P(x) = incbet( df2/2, df1/2, (df2/(df2 + df1*x) ). ACCURACY: Tested at random points (a,b,x) in the indicated intervals. x a,b Relative error: arithmetic domain domain # trials peak rms IEEE 0,1 1,100 100000 3.7e-14 5.9e-16 IEEE 1,5 1,100 100000 8.0e-15 1.6e-15 IEEE 0,1 1,10000 100000 1.8e-11 3.5e-13 IEEE 1,5 1,10000 100000 2.0e-11 3.0e-12 Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 1995, 2000 by Stephen L. Moshier *************************************************************************/ double fcdistribution(const ae_int_t a, const ae_int_t b, const double x); /************************************************************************* Inverse of complemented F distribution Finds the F density argument x such that the integral from x to infinity of the F density is equal to the given probability p. This is accomplished using the inverse beta integral function and the relations z = incbi( df2/2, df1/2, p ) x = df2 (1-z) / (df1 z). Note: the following relations hold for the inverse of the uncomplemented F distribution: z = incbi( df1/2, df2/2, p ) x = df2 z / (df1 (1-z)). ACCURACY: Tested at random points (a,b,p). a,b Relative error: arithmetic domain # trials peak rms For p between .001 and 1: IEEE 1,100 100000 8.3e-15 4.7e-16 IEEE 1,10000 100000 2.1e-11 1.4e-13 For p between 10^-6 and 10^-3: IEEE 1,100 50000 1.3e-12 8.4e-15 IEEE 1,10000 50000 3.0e-12 4.8e-14 Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 1995, 2000 by Stephen L. Moshier *************************************************************************/ double invfdistribution(const ae_int_t a, const ae_int_t b, const double y); /************************************************************************* Fresnel integral Evaluates the Fresnel integrals x - | | C(x) = | cos(pi/2 t**2) dt, | | - 0 x - | | S(x) = | sin(pi/2 t**2) dt. | | - 0 The integrals are evaluated by a power series for x < 1. For x >= 1 auxiliary functions f(x) and g(x) are employed such that C(x) = 0.5 + f(x) sin( pi/2 x**2 ) - g(x) cos( pi/2 x**2 ) S(x) = 0.5 - f(x) cos( pi/2 x**2 ) - g(x) sin( pi/2 x**2 ) ACCURACY: Relative error. Arithmetic function domain # trials peak rms IEEE S(x) 0, 10 10000 2.0e-15 3.2e-16 IEEE C(x) 0, 10 10000 1.8e-15 3.3e-16 Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 1989, 2000 by Stephen L. 
Moshier *************************************************************************/ void fresnelintegral(const double x, double &c, double &s); /************************************************************************* Jacobian Elliptic Functions Evaluates the Jacobian elliptic functions sn(u|m), cn(u|m), and dn(u|m) of parameter m between 0 and 1, and real argument u. These functions are periodic, with quarter-period on the real axis equal to the complete elliptic integral ellpk(1.0-m). Relation to incomplete elliptic integral: If u = ellik(phi,m), then sn(u|m) = sin(phi), and cn(u|m) = cos(phi). Phi is called the amplitude of u. Computation is by means of the arithmetic-geometric mean algorithm, except when m is within 1e-9 of 0 or 1. In the latter case with m close to 1, the approximation applies only for phi < pi/2. ACCURACY: Tested at random points with u between 0 and 10, m between 0 and 1. Absolute error (* = relative error): arithmetic function # trials peak rms IEEE phi 10000 9.2e-16* 1.4e-16* IEEE sn 50000 4.1e-15 4.6e-16 IEEE cn 40000 3.6e-15 4.4e-16 IEEE dn 10000 1.3e-12 1.8e-14 Peak error observed in consistency check using addition theorem for sn(u+v) was 4e-16 (absolute). Also tested by the above relation to the incomplete elliptic integral. Accuracy deteriorates when u is large. Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 2000 by Stephen L. Moshier *************************************************************************/ void jacobianellipticfunctions(const double u, const double m, double &sn, double &cn, double &dn, double &ph); /************************************************************************* Psi (digamma) function d - psi(x) = -- ln | (x) dx is the logarithmic derivative of the gamma function. For integer x, n-1 - psi(n) = -EUL + > 1/k. - k=1 This formula is used for 0 < n <= 10. If x is negative, it is transformed to a positive argument by the reflection formula psi(1-x) = psi(x) + pi cot(pi x). For general positive x, the argument is made greater than 10 using the recurrence psi(x+1) = psi(x) + 1/x. Then the following asymptotic expansion is applied: inf. B - 2k psi(x) = log(x) - 1/2x - > ------- - 2k k=1 2k x where the B2k are Bernoulli numbers. ACCURACY: Relative error (except absolute when |psi| < 1): arithmetic domain # trials peak rms IEEE 0,30 30000 1.3e-15 1.4e-16 IEEE -30,0 40000 1.5e-15 2.2e-16 Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 1992, 2000 by Stephen L. Moshier *************************************************************************/ double psi(const double x); /************************************************************************* Exponential integral Ei(x) x - t | | e Ei(x) = -|- --- dt . | | t - -inf Not defined for x <= 0. See also expn.c. ACCURACY: Relative error: arithmetic domain # trials peak rms IEEE 0,100 50000 8.6e-16 1.3e-16 Cephes Math Library Release 2.8: May, 1999 Copyright 1999 by Stephen L. Moshier *************************************************************************/ double exponentialintegralei(const double x); /************************************************************************* Exponential integral En(x) Evaluates the exponential integral inf. - | | -xt | e E (x) = | ---- dt. n | n | | t - 1 Both n and x must be nonnegative. The routine employs either a power series, a continued fraction, or an asymptotic formula depending on the relative values of n and x. 
ACCURACY: Relative error: arithmetic domain # trials peak rms IEEE 0, 30 10000 1.7e-15 3.6e-16 Cephes Math Library Release 2.8: June, 2000 Copyright 1985, 2000 by Stephen L. Moshier *************************************************************************/ double exponentialintegralen(const double x, const ae_int_t n); /************************************************************************* Calculation of the value of the Laguerre polynomial. Parameters: n - degree, n>=0 x - argument Result: the value of the Laguerre polynomial Ln at x *************************************************************************/ double laguerrecalculate(const ae_int_t n, const double x); /************************************************************************* Summation of Laguerre polynomials using Clenshaw's recurrence formula. This routine calculates c[0]*L0(x) + c[1]*L1(x) + ... + c[N]*LN(x) Parameters: n - degree, n>=0 x - argument Result: the value of the Laguerre polynomial at x *************************************************************************/ double laguerresum(const real_1d_array &c, const ae_int_t n, const double x); /************************************************************************* Representation of Ln as C[0] + C[1]*X + ... + C[N]*X^N Input parameters: N - polynomial degree, n>=0 Output parameters: C - coefficients *************************************************************************/ void laguerrecoefficients(const ae_int_t n, real_1d_array &c); /************************************************************************* Chi-square distribution Returns the area under the left hand tail (from 0 to x) of the Chi square probability density function with v degrees of freedom. x - 1 | | v/2-1 -t/2 P( x | v ) = ----------- | t e dt v/2 - | | 2 | (v/2) - 0 where x is the Chi-square variable. The incomplete gamma integral is used, according to the formula y = chdtr( v, x ) = igam( v/2.0, x/2.0 ). The arguments must both be positive. ACCURACY: See incomplete gamma function Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 2000 by Stephen L. Moshier *************************************************************************/ double chisquaredistribution(const double v, const double x); /************************************************************************* Complemented Chi-square distribution Returns the area under the right hand tail (from x to infinity) of the Chi square probability density function with v degrees of freedom: inf. - 1 | | v/2-1 -t/2 P( x | v ) = ----------- | t e dt v/2 - | | 2 | (v/2) - x where x is the Chi-square variable. The incomplete gamma integral is used, according to the formula y = chdtr( v, x ) = igamc( v/2.0, x/2.0 ). The arguments must both be positive. ACCURACY: See incomplete gamma function Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 2000 by Stephen L. Moshier *************************************************************************/ double chisquarecdistribution(const double v, const double x); /************************************************************************* Inverse of complemented Chi-square distribution Finds the Chi-square argument x such that the integral from x to infinity of the Chi-square density is equal to the given cumulative probability y. This is accomplished using the inverse gamma integral function and the relation x/2 = igami( df/2, y ); ACCURACY: See inverse incomplete gamma function Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 2000 by Stephen L. 
Moshier *************************************************************************/ double invchisquaredistribution(const double v, const double y); /************************************************************************* Calculation of the value of the Legendre polynomial Pn. Parameters: n - degree, n>=0 x - argument Result: the value of the Legendre polynomial Pn at x *************************************************************************/ double legendrecalculate(const ae_int_t n, const double x); /************************************************************************* Summation of Legendre polynomials using Clenshaw's recurrence formula. This routine calculates c[0]*P0(x) + c[1]*P1(x) + ... + c[N]*PN(x) Parameters: n - degree, n>=0 x - argument Result: the value of the Legendre polynomial at x *************************************************************************/ double legendresum(const real_1d_array &c, const ae_int_t n, const double x); /************************************************************************* Representation of Pn as C[0] + C[1]*X + ... + C[N]*X^N Input parameters: N - polynomial degree, n>=0 Output parameters: C - coefficients *************************************************************************/ void legendrecoefficients(const ae_int_t n, real_1d_array &c); /************************************************************************* Beta function - - | (a) | (b) beta( a, b ) = -----------. - | (a+b) For large arguments the logarithm of the function is evaluated using lgam(), then exponentiated. ACCURACY: Relative error: arithmetic domain # trials peak rms IEEE 0,30 30000 8.1e-14 1.1e-14 Cephes Math Library Release 2.0: April, 1987 Copyright 1984, 1987 by Stephen L. Moshier *************************************************************************/ double beta(const double a, const double b); /************************************************************************* Calculation of the value of the Chebyshev polynomials of the first and second kinds. Parameters: r - polynomial kind, either 1 or 2. n - degree, n>=0 x - argument, -1 <= x <= 1 Result: the value of the Chebyshev polynomial at x *************************************************************************/ double chebyshevcalculate(const ae_int_t r, const ae_int_t n, const double x); /************************************************************************* Summation of Chebyshev polynomials using Clenshaw's recurrence formula. This routine calculates c[0]*T0(x) + c[1]*T1(x) + ... + c[N]*TN(x) or c[0]*U0(x) + c[1]*U1(x) + ... + c[N]*UN(x) depending on the R. Parameters: r - polynomial kind, either 1 or 2. n - degree, n>=0 x - argument Result: the value of the Chebyshev polynomial at x *************************************************************************/ double chebyshevsum(const real_1d_array &c, const ae_int_t r, const ae_int_t n, const double x); /************************************************************************* Representation of Tn as C[0] + C[1]*X + ... + C[N]*X^N Input parameters: N - polynomial degree, n>=0 Output parameters: C - coefficients *************************************************************************/ void chebyshevcoefficients(const ae_int_t n, real_1d_array &c); /************************************************************************* Conversion of a series of Chebyshev polynomials to a power series. Represents A[0]*T0(x) + A[1]*T1(x) + ... + A[N]*Tn(x) as B[0] + B[1]*X + ... + B[N]*X^N. 
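For illustration, a small sketch (not part of the original documentation) combining ChebyshevCoefficients and FromChebyshev; it assumes the string constructor of real_1d_array provided by the C++ interface:

    #include "specialfunctions.h"
    #include <cstdio>

    int main()
    {
        // Power-basis coefficients of T3(x) = 4x^3 - 3x: expect c = [0, -3, 0, 4]
        alglib::real_1d_array c;
        alglib::chebyshevcoefficients(3, c);

        // Chebyshev series T0(x) + T2(x) = 1 + (2x^2 - 1) = 2x^2: expect b = [0, 0, 2]
        alglib::real_1d_array a = "[1, 0, 1]";
        alglib::real_1d_array b;
        alglib::fromchebyshev(a, 2, b);
        printf("%0.1f %0.1f\n", c[3], b[2]);   // 4.0 2.0
        return 0;
    }
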
Input parameters: A - Chebyshev series coefficients N - degree, N>=0 Output parameters B - power series coefficients *************************************************************************/ void fromchebyshev(const real_1d_array &a, const ae_int_t n, real_1d_array &b); /************************************************************************* Student's t distribution Computes the integral from minus infinity to t of the Student t distribution with integer k > 0 degrees of freedom: t - | | - | 2 -(k+1)/2 | ( (k+1)/2 ) | ( x ) ---------------------- | ( 1 + --- ) dx - | ( k ) sqrt( k pi ) | ( k/2 ) | | | - -inf. Relation to incomplete beta integral: 1 - stdtr(k,t) = 0.5 * incbet( k/2, 1/2, z ) where z = k/(k + t**2). For t < -2, this is the method of computation. For higher t, a direct method is derived from integration by parts. Since the function is symmetric about t=0, the area under the right tail of the density is found by calling the function with -t instead of t. ACCURACY: Tested at random 1 <= k <= 25. The "domain" refers to t. Relative error: arithmetic domain # trials peak rms IEEE -100,-2 50000 5.9e-15 1.4e-15 IEEE -2,100 500000 2.7e-15 4.9e-17 Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 1995, 2000 by Stephen L. Moshier *************************************************************************/ double studenttdistribution(const ae_int_t k, const double t); /************************************************************************* Functional inverse of Student's t distribution Given probability p, finds the argument t such that stdtr(k,t) is equal to p. ACCURACY: Tested at random 1 <= k <= 100. The "domain" refers to p: Relative error: arithmetic domain # trials peak rms IEEE .001,.999 25000 5.7e-15 8.0e-16 IEEE 10^-6,.001 25000 2.0e-12 2.9e-14 Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 1995, 2000 by Stephen L. Moshier *************************************************************************/ double invstudenttdistribution(const ae_int_t k, const double p); /************************************************************************* Binomial distribution Returns the sum of the terms 0 through k of the Binomial probability density: k -- ( n ) j n-j > ( ) p (1-p) -- ( j ) j=0 The terms are not summed directly; instead the incomplete beta integral is employed, according to the formula y = bdtr( k, n, p ) = incbet( n-k, k+1, 1-p ). The arguments must be positive, with p ranging from 0 to 1. ACCURACY: Tested at random points (a,b,p), with p between 0 and 1. a,b Relative error: arithmetic domain # trials peak rms For p between 0.001 and 1: IEEE 0,100 100000 4.3e-15 2.6e-16 Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 1995, 2000 by Stephen L. Moshier *************************************************************************/ double binomialdistribution(const ae_int_t k, const ae_int_t n, const double p); /************************************************************************* Complemented binomial distribution Returns the sum of the terms k+1 through n of the Binomial probability density: n -- ( n ) j n-j > ( ) p (1-p) -- ( j ) j=k+1 The terms are not summed directly; instead the incomplete beta integral is employed, according to the formula y = bdtrc( k, n, p ) = incbet( k+1, n-k, p ). The arguments must be positive, with p ranging from 0 to 1. ACCURACY: Tested at random points (a,b,p). 
a,b Relative error: arithmetic domain # trials peak rms For p between 0.001 and 1: IEEE 0,100 100000 6.7e-15 8.2e-16 For p between 0 and .001: IEEE 0,100 100000 1.5e-13 2.7e-15 Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 1995, 2000 by Stephen L. Moshier *************************************************************************/ double binomialcdistribution(const ae_int_t k, const ae_int_t n, const double p); /************************************************************************* Inverse binomial distribution Finds the event probability p such that the sum of the terms 0 through k of the Binomial probability density is equal to the given cumulative probability y. This is accomplished using the inverse beta integral function and the relation 1 - p = incbi( n-k, k+1, y ). ACCURACY: Tested at random points (a,b,p). a,b Relative error: arithmetic domain # trials peak rms For p between 0.001 and 1: IEEE 0,100 100000 2.3e-14 6.4e-16 IEEE 0,10000 100000 6.6e-12 1.2e-13 For p between 10^-6 and 0.001: IEEE 0,100 100000 2.0e-12 1.3e-14 IEEE 0,10000 100000 1.5e-12 3.2e-14 Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 1995, 2000 by Stephen L. Moshier *************************************************************************/ double invbinomialdistribution(const ae_int_t k, const ae_int_t n, const double y); /************************************************************************* Airy function Solution of the differential equation y"(x) = xy. The function returns the two independent solutions Ai, Bi and their first derivatives Ai'(x), Bi'(x). Evaluation is by power series summation for small x, by rational minimax approximations for large x. ACCURACY: Error criterion is absolute when function <= 1, relative when function > 1, except * denotes relative error criterion. For large negative x, the absolute error increases as x^1.5. For large positive x, the relative error increases as x^1.5. Arithmetic domain function # trials peak rms IEEE -10, 0 Ai 10000 1.6e-15 2.7e-16 IEEE 0, 10 Ai 10000 2.3e-14* 1.8e-15* IEEE -10, 0 Ai' 10000 4.6e-15 7.6e-16 IEEE 0, 10 Ai' 10000 1.8e-14* 1.5e-15* IEEE -10, 10 Bi 30000 4.2e-15 5.3e-16 IEEE -10, 10 Bi' 30000 4.9e-15 7.3e-16 Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 1989, 2000 by Stephen L. 
Moshier *************************************************************************/ void airy(const double x, double &ai, double &aip, double &bi, double &bip); } ///////////////////////////////////////////////////////////////////////// // // THIS SECTION CONTAINS COMPUTATIONAL CORE DECLARATIONS (FUNCTIONS) // ///////////////////////////////////////////////////////////////////////// namespace alglib_impl { double gammafunction(double x, ae_state *_state); double lngamma(double x, double* sgngam, ae_state *_state); double errorfunction(double x, ae_state *_state); double errorfunctionc(double x, ae_state *_state); double normaldistribution(double x, ae_state *_state); double inverf(double e, ae_state *_state); double invnormaldistribution(double y0, ae_state *_state); double incompletegamma(double a, double x, ae_state *_state); double incompletegammac(double a, double x, ae_state *_state); double invincompletegammac(double a, double y0, ae_state *_state); double ellipticintegralk(double m, ae_state *_state); double ellipticintegralkhighprecision(double m1, ae_state *_state); double incompleteellipticintegralk(double phi, double m, ae_state *_state); double ellipticintegrale(double m, ae_state *_state); double incompleteellipticintegrale(double phi, double m, ae_state *_state); double hermitecalculate(ae_int_t n, double x, ae_state *_state); double hermitesum(/* Real */ ae_vector* c, ae_int_t n, double x, ae_state *_state); void hermitecoefficients(ae_int_t n, /* Real */ ae_vector* c, ae_state *_state); double dawsonintegral(double x, ae_state *_state); void sinecosineintegrals(double x, double* si, double* ci, ae_state *_state); void hyperbolicsinecosineintegrals(double x, double* shi, double* chi, ae_state *_state); double poissondistribution(ae_int_t k, double m, ae_state *_state); double poissoncdistribution(ae_int_t k, double m, ae_state *_state); double invpoissondistribution(ae_int_t k, double y, ae_state *_state); double besselj0(double x, ae_state *_state); double besselj1(double x, ae_state *_state); double besseljn(ae_int_t n, double x, ae_state *_state); double bessely0(double x, ae_state *_state); double bessely1(double x, ae_state *_state); double besselyn(ae_int_t n, double x, ae_state *_state); double besseli0(double x, ae_state *_state); double besseli1(double x, ae_state *_state); double besselk0(double x, ae_state *_state); double besselk1(double x, ae_state *_state); double besselkn(ae_int_t nn, double x, ae_state *_state); double incompletebeta(double a, double b, double x, ae_state *_state); double invincompletebeta(double a, double b, double y, ae_state *_state); double fdistribution(ae_int_t a, ae_int_t b, double x, ae_state *_state); double fcdistribution(ae_int_t a, ae_int_t b, double x, ae_state *_state); double invfdistribution(ae_int_t a, ae_int_t b, double y, ae_state *_state); void fresnelintegral(double x, double* c, double* s, ae_state *_state); void jacobianellipticfunctions(double u, double m, double* sn, double* cn, double* dn, double* ph, ae_state *_state); double psi(double x, ae_state *_state); double exponentialintegralei(double x, ae_state *_state); double exponentialintegralen(double x, ae_int_t n, ae_state *_state); double laguerrecalculate(ae_int_t n, double x, ae_state *_state); double laguerresum(/* Real */ ae_vector* c, ae_int_t n, double x, ae_state *_state); void laguerrecoefficients(ae_int_t n, /* Real */ ae_vector* c, ae_state *_state); double chisquaredistribution(double v, double x, ae_state *_state); double chisquarecdistribution(double 
v, double x, ae_state *_state); double invchisquaredistribution(double v, double y, ae_state *_state); double legendrecalculate(ae_int_t n, double x, ae_state *_state); double legendresum(/* Real */ ae_vector* c, ae_int_t n, double x, ae_state *_state); void legendrecoefficients(ae_int_t n, /* Real */ ae_vector* c, ae_state *_state); double beta(double a, double b, ae_state *_state); double chebyshevcalculate(ae_int_t r, ae_int_t n, double x, ae_state *_state); double chebyshevsum(/* Real */ ae_vector* c, ae_int_t r, ae_int_t n, double x, ae_state *_state); void chebyshevcoefficients(ae_int_t n, /* Real */ ae_vector* c, ae_state *_state); void fromchebyshev(/* Real */ ae_vector* a, ae_int_t n, /* Real */ ae_vector* b, ae_state *_state); double studenttdistribution(ae_int_t k, double t, ae_state *_state); double invstudenttdistribution(ae_int_t k, double p, ae_state *_state); double binomialdistribution(ae_int_t k, ae_int_t n, double p, ae_state *_state); double binomialcdistribution(ae_int_t k, ae_int_t n, double p, ae_state *_state); double invbinomialdistribution(ae_int_t k, ae_int_t n, double y, ae_state *_state); void airy(double x, double* ai, double* aip, double* bi, double* bip, ae_state *_state); } #endif
cpp/src/interpolation.h
/************************************************************************* ALGLIB 3.11.0 (source code generated 2017-05-11) Copyright (c) Sergey Bochkanov (ALGLIB project). >>> SOURCE LICENSE >>> This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation (www.fsf.org); either version 2 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
A copy of the GNU General Public License is available at http://www.fsf.org/licensing/licenses >>> END OF LICENSE >>> *************************************************************************/ #ifndef _interpolation_pkg_h #define _interpolation_pkg_h #include "ap.h" #include "alglibinternal.h" #include "alglibmisc.h" #include "linalg.h" #include "solvers.h" #include "specialfunctions.h" #include "integration.h" #include "optimization.h" ///////////////////////////////////////////////////////////////////////// // // THIS SECTION CONTAINS COMPUTATIONAL CORE DECLARATIONS (DATATYPES) // ///////////////////////////////////////////////////////////////////////// namespace alglib_impl { typedef struct { ae_int_t n; ae_int_t nx; ae_int_t d; double r; ae_int_t nw; kdtree tree; ae_int_t modeltype; ae_matrix q; ae_vector xbuf; ae_vector tbuf; ae_vector rbuf; ae_matrix xybuf; ae_int_t debugsolverfailures; double debugworstrcond; double debugbestrcond; } idwinterpolant; typedef struct { ae_int_t n; double sy; ae_vector x; ae_vector y; ae_vector w; } barycentricinterpolant; typedef struct { ae_bool periodic; ae_int_t n; ae_int_t k; ae_int_t continuity; ae_vector x; ae_vector c; } spline1dinterpolant; typedef struct { ae_int_t n; ae_bool periodic; ae_vector p; spline1dinterpolant x; spline1dinterpolant y; } pspline2interpolant; typedef struct { ae_int_t n; ae_bool periodic; ae_vector p; spline1dinterpolant x; spline1dinterpolant y; spline1dinterpolant z; } pspline3interpolant; typedef struct { ae_int_t k; ae_int_t stype; ae_int_t n; ae_int_t m; ae_int_t l; ae_int_t d; ae_vector x; ae_vector y; ae_vector z; ae_vector f; } spline3dinterpolant; typedef struct { double taskrcond; double rmserror; double avgerror; double avgrelerror; double maxerror; } polynomialfitreport; typedef struct { double taskrcond; ae_int_t dbest; double rmserror; double avgerror; double avgrelerror; double maxerror; } barycentricfitreport; typedef struct { double taskrcond; double rmserror; double avgerror; double avgrelerror; double maxerror; } spline1dfitreport; typedef struct { double taskrcond; ae_int_t iterationscount; ae_int_t varidx; double rmserror; double avgerror; double avgrelerror; double maxerror; double wrmserror; ae_matrix covpar; ae_vector errpar; ae_vector errcurve; ae_vector noise; double r2; } lsfitreport; typedef struct { ae_int_t optalgo; ae_int_t m; ae_int_t k; double epsx; ae_int_t maxits; double stpmax; ae_bool xrep; ae_vector s; ae_vector bndl; ae_vector bndu; ae_matrix taskx; ae_vector tasky; ae_int_t npoints; ae_vector taskw; ae_int_t nweights; ae_int_t wkind; ae_int_t wits; double diffstep; double teststep; ae_matrix cleic; ae_int_t nec; ae_int_t nic; ae_bool xupdated; ae_bool needf; ae_bool needfg; ae_bool needfgh; ae_int_t pointindex; ae_vector x; ae_vector c; double f; ae_vector g; ae_matrix h; ae_vector wcur; ae_vector tmpct; ae_vector tmp; ae_vector tmpf; ae_matrix tmpjac; ae_matrix tmpjacw; double tmpnoise; matinvreport invrep; ae_int_t repiterationscount; ae_int_t repterminationtype; ae_int_t repvaridx; double reprmserror; double repavgerror; double repavgrelerror; double repmaxerror; double repwrmserror; lsfitreport rep; minlmstate optstate; minlmreport optrep; ae_int_t prevnpt; ae_int_t prevalgo; rcommstate rstate; } lsfitstate; typedef struct { ae_vector x; ae_vector curboxmin; ae_vector curboxmax; double curdist2; ae_vector x123; ae_vector y123; } rbfv2calcbuffer; typedef struct { ae_int_t ny; ae_int_t nx; ae_int_t bf; ae_int_t nh; ae_vector ri; ae_vector s; ae_vector kdroots; ae_vector kdnodes; 
ae_vector kdsplits; ae_vector kdboxmin; ae_vector kdboxmax; ae_vector cw; ae_matrix v; double lambdareg; ae_int_t maxits; double supportr; ae_int_t basisfunction; rbfv2calcbuffer calcbuf; } rbfv2model; typedef struct { rbfv2calcbuffer calcbuf; ae_vector cx; ae_vector rx; ae_vector ry; ae_vector tx; ae_vector ty; ae_vector rf; } rbfv2gridcalcbuffer; typedef struct { ae_int_t terminationtype; double maxerror; double rmserror; } rbfv2report; typedef struct { ae_int_t nfev; ae_int_t iterationscount; } nsfitinternalreport; typedef struct { ae_int_t k; ae_int_t stype; ae_int_t n; ae_int_t m; ae_int_t d; ae_vector x; ae_vector y; ae_vector f; } spline2dinterpolant; typedef struct { ae_vector calcbufxcx; ae_matrix calcbufx; ae_vector calcbuftags; kdtreerequestbuffer requestbuffer; } rbfv1calcbuffer; typedef struct { ae_int_t ny; ae_int_t nx; ae_int_t nc; ae_int_t nl; kdtree tree; ae_matrix xc; ae_matrix wr; double rmax; ae_matrix v; ae_vector calcbufxcx; ae_matrix calcbufx; ae_vector calcbuftags; } rbfv1model; typedef struct { ae_vector tx; ae_vector cx; ae_vector ty; ae_vector flag0; ae_vector flag1; ae_vector flag2; ae_vector flag12; ae_vector expbuf0; ae_vector expbuf1; ae_vector expbuf2; kdtreerequestbuffer requestbuf; ae_matrix calcbufx; ae_vector calcbuftags; } gridcalc3v1buf; typedef struct { ae_int_t arows; ae_int_t acols; ae_int_t annz; ae_int_t iterationscount; ae_int_t nmv; ae_int_t terminationtype; } rbfv1report; typedef struct { ae_int_t modelversion; rbfv1calcbuffer bufv1; rbfv2calcbuffer bufv2; } rbfcalcbuffer; typedef struct { ae_int_t nx; ae_int_t ny; ae_int_t modelversion; rbfv1model model1; rbfv2model model2; double lambdav; double radvalue; double radzvalue; ae_int_t nlayers; ae_int_t aterm; ae_int_t algorithmtype; double epsort; double epserr; ae_int_t maxits; ae_int_t nnmaxits; ae_int_t n; ae_matrix x; ae_matrix y; ae_bool hasscale; ae_vector s; } rbfmodel; typedef struct { double rmserror; double maxerror; ae_int_t arows; ae_int_t acols; ae_int_t annz; ae_int_t iterationscount; ae_int_t nmv; ae_int_t terminationtype; } rbfreport; } ///////////////////////////////////////////////////////////////////////// // // THIS SECTION CONTAINS C++ INTERFACE // ///////////////////////////////////////////////////////////////////////// namespace alglib { /************************************************************************* IDW interpolant. *************************************************************************/ class _idwinterpolant_owner { public: _idwinterpolant_owner(); _idwinterpolant_owner(const _idwinterpolant_owner &rhs); _idwinterpolant_owner& operator=(const _idwinterpolant_owner &rhs); virtual ~_idwinterpolant_owner(); alglib_impl::idwinterpolant* c_ptr(); alglib_impl::idwinterpolant* c_ptr() const; protected: alglib_impl::idwinterpolant *p_struct; }; class idwinterpolant : public _idwinterpolant_owner { public: idwinterpolant(); idwinterpolant(const idwinterpolant &rhs); idwinterpolant& operator=(const idwinterpolant &rhs); virtual ~idwinterpolant(); }; /************************************************************************* Barycentric interpolant. 
*************************************************************************/ class _barycentricinterpolant_owner { public: _barycentricinterpolant_owner(); _barycentricinterpolant_owner(const _barycentricinterpolant_owner &rhs); _barycentricinterpolant_owner& operator=(const _barycentricinterpolant_owner &rhs); virtual ~_barycentricinterpolant_owner(); alglib_impl::barycentricinterpolant* c_ptr(); alglib_impl::barycentricinterpolant* c_ptr() const; protected: alglib_impl::barycentricinterpolant *p_struct; }; class barycentricinterpolant : public _barycentricinterpolant_owner { public: barycentricinterpolant(); barycentricinterpolant(const barycentricinterpolant &rhs); barycentricinterpolant& operator=(const barycentricinterpolant &rhs); virtual ~barycentricinterpolant(); }; /************************************************************************* 1-dimensional spline interpolant *************************************************************************/ class _spline1dinterpolant_owner { public: _spline1dinterpolant_owner(); _spline1dinterpolant_owner(const _spline1dinterpolant_owner &rhs); _spline1dinterpolant_owner& operator=(const _spline1dinterpolant_owner &rhs); virtual ~_spline1dinterpolant_owner(); alglib_impl::spline1dinterpolant* c_ptr(); alglib_impl::spline1dinterpolant* c_ptr() const; protected: alglib_impl::spline1dinterpolant *p_struct; }; class spline1dinterpolant : public _spline1dinterpolant_owner { public: spline1dinterpolant(); spline1dinterpolant(const spline1dinterpolant &rhs); spline1dinterpolant& operator=(const spline1dinterpolant &rhs); virtual ~spline1dinterpolant(); }; /************************************************************************* Parametric spline inteprolant: 2-dimensional curve. You should not try to access its members directly - use PSpline2XXXXXXXX() functions instead. *************************************************************************/ class _pspline2interpolant_owner { public: _pspline2interpolant_owner(); _pspline2interpolant_owner(const _pspline2interpolant_owner &rhs); _pspline2interpolant_owner& operator=(const _pspline2interpolant_owner &rhs); virtual ~_pspline2interpolant_owner(); alglib_impl::pspline2interpolant* c_ptr(); alglib_impl::pspline2interpolant* c_ptr() const; protected: alglib_impl::pspline2interpolant *p_struct; }; class pspline2interpolant : public _pspline2interpolant_owner { public: pspline2interpolant(); pspline2interpolant(const pspline2interpolant &rhs); pspline2interpolant& operator=(const pspline2interpolant &rhs); virtual ~pspline2interpolant(); }; /************************************************************************* Parametric spline inteprolant: 3-dimensional curve. You should not try to access its members directly - use PSpline3XXXXXXXX() functions instead. 
*************************************************************************/ class _pspline3interpolant_owner { public: _pspline3interpolant_owner(); _pspline3interpolant_owner(const _pspline3interpolant_owner &rhs); _pspline3interpolant_owner& operator=(const _pspline3interpolant_owner &rhs); virtual ~_pspline3interpolant_owner(); alglib_impl::pspline3interpolant* c_ptr(); alglib_impl::pspline3interpolant* c_ptr() const; protected: alglib_impl::pspline3interpolant *p_struct; }; class pspline3interpolant : public _pspline3interpolant_owner { public: pspline3interpolant(); pspline3interpolant(const pspline3interpolant &rhs); pspline3interpolant& operator=(const pspline3interpolant &rhs); virtual ~pspline3interpolant(); }; /************************************************************************* 3-dimensional spline inteprolant *************************************************************************/ class _spline3dinterpolant_owner { public: _spline3dinterpolant_owner(); _spline3dinterpolant_owner(const _spline3dinterpolant_owner &rhs); _spline3dinterpolant_owner& operator=(const _spline3dinterpolant_owner &rhs); virtual ~_spline3dinterpolant_owner(); alglib_impl::spline3dinterpolant* c_ptr(); alglib_impl::spline3dinterpolant* c_ptr() const; protected: alglib_impl::spline3dinterpolant *p_struct; }; class spline3dinterpolant : public _spline3dinterpolant_owner { public: spline3dinterpolant(); spline3dinterpolant(const spline3dinterpolant &rhs); spline3dinterpolant& operator=(const spline3dinterpolant &rhs); virtual ~spline3dinterpolant(); }; /************************************************************************* Polynomial fitting report: TaskRCond reciprocal of task's condition number RMSError RMS error AvgError average error AvgRelError average relative error (for non-zero Y[I]) MaxError maximum error *************************************************************************/ class _polynomialfitreport_owner { public: _polynomialfitreport_owner(); _polynomialfitreport_owner(const _polynomialfitreport_owner &rhs); _polynomialfitreport_owner& operator=(const _polynomialfitreport_owner &rhs); virtual ~_polynomialfitreport_owner(); alglib_impl::polynomialfitreport* c_ptr(); alglib_impl::polynomialfitreport* c_ptr() const; protected: alglib_impl::polynomialfitreport *p_struct; }; class polynomialfitreport : public _polynomialfitreport_owner { public: polynomialfitreport(); polynomialfitreport(const polynomialfitreport &rhs); polynomialfitreport& operator=(const polynomialfitreport &rhs); virtual ~polynomialfitreport(); double &taskrcond; double &rmserror; double &avgerror; double &avgrelerror; double &maxerror; }; /************************************************************************* Barycentric fitting report: RMSError RMS error AvgError average error AvgRelError average relative error (for non-zero Y[I]) MaxError maximum error TaskRCond reciprocal of task's condition number *************************************************************************/ class _barycentricfitreport_owner { public: _barycentricfitreport_owner(); _barycentricfitreport_owner(const _barycentricfitreport_owner &rhs); _barycentricfitreport_owner& operator=(const _barycentricfitreport_owner &rhs); virtual ~_barycentricfitreport_owner(); alglib_impl::barycentricfitreport* c_ptr(); alglib_impl::barycentricfitreport* c_ptr() const; protected: alglib_impl::barycentricfitreport *p_struct; }; class barycentricfitreport : public _barycentricfitreport_owner { public: barycentricfitreport(); barycentricfitreport(const 
barycentricfitreport &rhs); barycentricfitreport& operator=(const barycentricfitreport &rhs); virtual ~barycentricfitreport(); double &taskrcond; ae_int_t &dbest; double &rmserror; double &avgerror; double &avgrelerror; double &maxerror; }; /************************************************************************* Spline fitting report: RMSError RMS error AvgError average error AvgRelError average relative error (for non-zero Y[I]) MaxError maximum error Fields below are filled by obsolete functions (Spline1DFitCubic, Spline1DFitHermite). Modern fitting functions do NOT fill these fields: TaskRCond reciprocal of task's condition number *************************************************************************/ class _spline1dfitreport_owner { public: _spline1dfitreport_owner(); _spline1dfitreport_owner(const _spline1dfitreport_owner &rhs); _spline1dfitreport_owner& operator=(const _spline1dfitreport_owner &rhs); virtual ~_spline1dfitreport_owner(); alglib_impl::spline1dfitreport* c_ptr(); alglib_impl::spline1dfitreport* c_ptr() const; protected: alglib_impl::spline1dfitreport *p_struct; }; class spline1dfitreport : public _spline1dfitreport_owner { public: spline1dfitreport(); spline1dfitreport(const spline1dfitreport &rhs); spline1dfitreport& operator=(const spline1dfitreport &rhs); virtual ~spline1dfitreport(); double &taskrcond; double &rmserror; double &avgerror; double &avgrelerror; double &maxerror; }; /************************************************************************* Least squares fitting report. This structure contains informational fields which are set by fitting functions provided by this unit. Different functions initialize different sets of fields, so you should read documentation on specific function you used in order to know which fields are initialized. TaskRCond reciprocal of task's condition number IterationsCount number of internal iterations VarIdx if user-supplied gradient contains errors which were detected by nonlinear fitter, this field is set to index of the first component of gradient which is suspected to be spoiled by bugs. RMSError RMS error AvgError average error AvgRelError average relative error (for non-zero Y[I]) MaxError maximum error WRMSError weighted RMS error CovPar covariance matrix for parameters, filled by some solvers ErrPar vector of errors in parameters, filled by some solvers ErrCurve vector of fit errors - variability of the best-fit curve, filled by some solvers. Noise vector of per-point noise estimates, filled by some solvers. R2 coefficient of determination (non-weighted, non-adjusted), filled by some solvers. *************************************************************************/ class _lsfitreport_owner { public: _lsfitreport_owner(); _lsfitreport_owner(const _lsfitreport_owner &rhs); _lsfitreport_owner& operator=(const _lsfitreport_owner &rhs); virtual ~_lsfitreport_owner(); alglib_impl::lsfitreport* c_ptr(); alglib_impl::lsfitreport* c_ptr() const; protected: alglib_impl::lsfitreport *p_struct; }; class lsfitreport : public _lsfitreport_owner { public: lsfitreport(); lsfitreport(const lsfitreport &rhs); lsfitreport& operator=(const lsfitreport &rhs); virtual ~lsfitreport(); double &taskrcond; ae_int_t &iterationscount; ae_int_t &varidx; double &rmserror; double &avgerror; double &avgrelerror; double &maxerror; double &wrmserror; real_2d_array covpar; real_1d_array errpar; real_1d_array errcurve; real_1d_array noise; double &r2; }; /************************************************************************* Nonlinear fitter. 
You should use ALGLIB functions to work with fitter. Never try to access its fields directly! *************************************************************************/ class _lsfitstate_owner { public: _lsfitstate_owner(); _lsfitstate_owner(const _lsfitstate_owner &rhs); _lsfitstate_owner& operator=(const _lsfitstate_owner &rhs); virtual ~_lsfitstate_owner(); alglib_impl::lsfitstate* c_ptr(); alglib_impl::lsfitstate* c_ptr() const; protected: alglib_impl::lsfitstate *p_struct; }; class lsfitstate : public _lsfitstate_owner { public: lsfitstate(); lsfitstate(const lsfitstate &rhs); lsfitstate& operator=(const lsfitstate &rhs); virtual ~lsfitstate(); ae_bool &needf; ae_bool &needfg; ae_bool &needfgh; ae_bool &xupdated; real_1d_array c; double &f; real_1d_array g; real_2d_array h; real_1d_array x; }; /************************************************************************* 2-dimensional spline inteprolant *************************************************************************/ class _spline2dinterpolant_owner { public: _spline2dinterpolant_owner(); _spline2dinterpolant_owner(const _spline2dinterpolant_owner &rhs); _spline2dinterpolant_owner& operator=(const _spline2dinterpolant_owner &rhs); virtual ~_spline2dinterpolant_owner(); alglib_impl::spline2dinterpolant* c_ptr(); alglib_impl::spline2dinterpolant* c_ptr() const; protected: alglib_impl::spline2dinterpolant *p_struct; }; class spline2dinterpolant : public _spline2dinterpolant_owner { public: spline2dinterpolant(); spline2dinterpolant(const spline2dinterpolant &rhs); spline2dinterpolant& operator=(const spline2dinterpolant &rhs); virtual ~spline2dinterpolant(); }; /************************************************************************* Buffer object which is used to perform nearest neighbor requests in the multithreaded mode (multiple threads working with same KD-tree object). This object should be created with KDTreeCreateBuffer(). *************************************************************************/ class _rbfcalcbuffer_owner { public: _rbfcalcbuffer_owner(); _rbfcalcbuffer_owner(const _rbfcalcbuffer_owner &rhs); _rbfcalcbuffer_owner& operator=(const _rbfcalcbuffer_owner &rhs); virtual ~_rbfcalcbuffer_owner(); alglib_impl::rbfcalcbuffer* c_ptr(); alglib_impl::rbfcalcbuffer* c_ptr() const; protected: alglib_impl::rbfcalcbuffer *p_struct; }; class rbfcalcbuffer : public _rbfcalcbuffer_owner { public: rbfcalcbuffer(); rbfcalcbuffer(const rbfcalcbuffer &rhs); rbfcalcbuffer& operator=(const rbfcalcbuffer &rhs); virtual ~rbfcalcbuffer(); }; /************************************************************************* RBF model. Never try to directly work with fields of this object - always use ALGLIB functions to use this object. *************************************************************************/ class _rbfmodel_owner { public: _rbfmodel_owner(); _rbfmodel_owner(const _rbfmodel_owner &rhs); _rbfmodel_owner& operator=(const _rbfmodel_owner &rhs); virtual ~_rbfmodel_owner(); alglib_impl::rbfmodel* c_ptr(); alglib_impl::rbfmodel* c_ptr() const; protected: alglib_impl::rbfmodel *p_struct; }; class rbfmodel : public _rbfmodel_owner { public: rbfmodel(); rbfmodel(const rbfmodel &rhs); rbfmodel& operator=(const rbfmodel &rhs); virtual ~rbfmodel(); }; /************************************************************************* RBF solution report: * TerminationType - termination type, positive values - success, non-positive - failure. 
Fields which are set by modern RBF solvers (hierarchical): * RMSError - root-mean-square error; NAN for old solvers (ML, QNN) * MaxError - maximum error; NAN for old solvers (ML, QNN) *************************************************************************/ class _rbfreport_owner { public: _rbfreport_owner(); _rbfreport_owner(const _rbfreport_owner &rhs); _rbfreport_owner& operator=(const _rbfreport_owner &rhs); virtual ~_rbfreport_owner(); alglib_impl::rbfreport* c_ptr(); alglib_impl::rbfreport* c_ptr() const; protected: alglib_impl::rbfreport *p_struct; }; class rbfreport : public _rbfreport_owner { public: rbfreport(); rbfreport(const rbfreport &rhs); rbfreport& operator=(const rbfreport &rhs); virtual ~rbfreport(); double &rmserror; double &maxerror; ae_int_t &arows; ae_int_t &acols; ae_int_t &annz; ae_int_t &iterationscount; ae_int_t &nmv; ae_int_t &terminationtype; }; /************************************************************************* IDW interpolation INPUT PARAMETERS: Z - IDW interpolant built with one of model building subroutines. X - array[0..NX-1], interpolation point Result: IDW interpolant Z(X) -- ALGLIB -- Copyright 02.03.2010 by Bochkanov Sergey *************************************************************************/ double idwcalc(const idwinterpolant &z, const real_1d_array &x); /************************************************************************* IDW interpolant using modified Shepard method for uniform point distributions. INPUT PARAMETERS: XY - X and Y values, array[0..N-1,0..NX]. First NX columns contain X-values, last column contain Y-values. N - number of nodes, N>0. NX - space dimension, NX>=1. D - nodal function type, either: * 0 constant model. Just for demonstration only, worst model ever. * 1 linear model, least squares fitting. Simpe model for datasets too small for quadratic models * 2 quadratic model, least squares fitting. Best model available (if your dataset is large enough). * -1 "fast" linear model, use with caution!!! It is significantly faster than linear/quadratic and better than constant model. But it is less robust (especially in the presence of noise). NQ - number of points used to calculate nodal functions (ignored for constant models). NQ should be LARGER than: * max(1.5*(1+NX),2^NX+1) for linear model, * max(3/4*(NX+2)*(NX+1),2^NX+1) for quadratic model. Values less than this threshold will be silently increased. NW - number of points used to calculate weights and to interpolate. Required: >=2^NX+1, values less than this threshold will be silently increased. Recommended value: about 2*NQ OUTPUT PARAMETERS: Z - IDW interpolant. NOTES: * best results are obtained with quadratic models, worst - with constant models * when N is large, NQ and NW must be significantly smaller than N both to obtain optimal performance and to obtain optimal accuracy. In 2 or 3-dimensional tasks NQ=15 and NW=25 are good values to start with. * NQ and NW may be greater than N. In such cases they will be automatically decreased. * this subroutine is always succeeds (as long as correct parameters are passed). * see 'Multivariate Interpolation of Large Sets of Scattered Data' by Robert J. Renka for more information on this algorithm. * this subroutine assumes that point distribution is uniform at the small scales. 
If it isn't - for example, points are concentrated along "lines", but "lines" distribution is uniform at the larger scale - then you should use IDWBuildModifiedShepardR() -- ALGLIB PROJECT -- Copyright 02.03.2010 by Bochkanov Sergey *************************************************************************/ void idwbuildmodifiedshepard(const real_2d_array &xy, const ae_int_t n, const ae_int_t nx, const ae_int_t d, const ae_int_t nq, const ae_int_t nw, idwinterpolant &z); /************************************************************************* IDW interpolant using modified Shepard method for non-uniform datasets. This type of model uses constant nodal functions and interpolates using all nodes which are closer than user-specified radius R. It may be used when points distribution is non-uniform at the small scale, but it is at the distances as large as R. INPUT PARAMETERS: XY - X and Y values, array[0..N-1,0..NX]. First NX columns contain X-values, last column contain Y-values. N - number of nodes, N>0. NX - space dimension, NX>=1. R - radius, R>0 OUTPUT PARAMETERS: Z - IDW interpolant. NOTES: * if there is less than IDWKMin points within R-ball, algorithm selects IDWKMin closest ones, so that continuity properties of interpolant are preserved even far from points. -- ALGLIB PROJECT -- Copyright 11.04.2010 by Bochkanov Sergey *************************************************************************/ void idwbuildmodifiedshepardr(const real_2d_array &xy, const ae_int_t n, const ae_int_t nx, const double r, idwinterpolant &z); /************************************************************************* IDW model for noisy data. This subroutine may be used to handle noisy data, i.e. data with noise in OUTPUT values. It differs from IDWBuildModifiedShepard() in the following aspects: * nodal functions are not constrained to pass through nodes: Qi(xi)<>yi, i.e. we have fitting instead of interpolation. * weights which are used during least squares fitting stage are all equal to 1.0 (independently of distance) * "fast"-linear or constant nodal functions are not supported (either not robust enough or too rigid) This problem require far more complex tuning than interpolation problems. Below you can find some recommendations regarding this problem: * focus on tuning NQ; it controls noise reduction. As for NW, you can just make it equal to 2*NQ. * you can use cross-validation to determine optimal NQ. * optimal NQ is a result of complex tradeoff between noise level (more noise = larger NQ required) and underlying function complexity (given fixed N, larger NQ means smoothing of compex features in the data). For example, NQ=N will reduce noise to the minimum level possible, but you will end up with just constant/linear/quadratic (depending on D) least squares model for the whole dataset. INPUT PARAMETERS: XY - X and Y values, array[0..N-1,0..NX]. First NX columns contain X-values, last column contain Y-values. N - number of nodes, N>0. NX - space dimension, NX>=1. D - nodal function degree, either: * 1 linear model, least squares fitting. Simpe model for datasets too small for quadratic models (or for very noisy problems). * 2 quadratic model, least squares fitting. Best model available (if your dataset is large enough). NQ - number of points used to calculate nodal functions. NQ should be significantly larger than 1.5 times the number of coefficients in a nodal function to overcome effects of noise: * larger than 1.5*(1+NX) for linear model, * larger than 3/4*(NX+2)*(NX+1) for quadratic model. 
Values less than this threshold will be silently increased. NW - number of points used to calculate weights and to interpolate. Required: >=2^NX+1, values less than this threshold will be silently increased. Recommended value: about 2*NQ or larger OUTPUT PARAMETERS: Z - IDW interpolant. NOTES: * best results are obtained with quadratic models, linear models are not recommended to use unless you are pretty sure that it is what you want * this subroutine is always succeeds (as long as correct parameters are passed). * see 'Multivariate Interpolation of Large Sets of Scattered Data' by Robert J. Renka for more information on this algorithm. -- ALGLIB PROJECT -- Copyright 02.03.2010 by Bochkanov Sergey *************************************************************************/ void idwbuildnoisy(const real_2d_array &xy, const ae_int_t n, const ae_int_t nx, const ae_int_t d, const ae_int_t nq, const ae_int_t nw, idwinterpolant &z); /************************************************************************* Rational interpolation using barycentric formula F(t) = SUM(i=0,n-1,w[i]*f[i]/(t-x[i])) / SUM(i=0,n-1,w[i]/(t-x[i])) Input parameters: B - barycentric interpolant built with one of model building subroutines. T - interpolation point Result: barycentric interpolant F(t) -- ALGLIB -- Copyright 17.08.2009 by Bochkanov Sergey *************************************************************************/ double barycentriccalc(const barycentricinterpolant &b, const double t); /************************************************************************* Differentiation of barycentric interpolant: first derivative. Algorithm used in this subroutine is very robust and should not fail until provided with values too close to MaxRealNumber (usually MaxRealNumber/N or greater will overflow). INPUT PARAMETERS: B - barycentric interpolant built with one of model building subroutines. T - interpolation point OUTPUT PARAMETERS: F - barycentric interpolant at T DF - first derivative NOTE -- ALGLIB -- Copyright 17.08.2009 by Bochkanov Sergey *************************************************************************/ void barycentricdiff1(const barycentricinterpolant &b, const double t, double &f, double &df); /************************************************************************* Differentiation of barycentric interpolant: first/second derivatives. INPUT PARAMETERS: B - barycentric interpolant built with one of model building subroutines. T - interpolation point OUTPUT PARAMETERS: F - barycentric interpolant at T DF - first derivative D2F - second derivative NOTE: this algorithm may fail due to overflow/underflor if used on data whose values are close to MaxRealNumber or MinRealNumber. Use more robust BarycentricDiff1() subroutine in such cases. -- ALGLIB -- Copyright 17.08.2009 by Bochkanov Sergey *************************************************************************/ void barycentricdiff2(const barycentricinterpolant &b, const double t, double &f, double &df, double &d2f); /************************************************************************* This subroutine performs linear transformation of the argument. 
INPUT PARAMETERS: B - rational interpolant in barycentric form CA, CB - transformation coefficients: x = CA*t + CB OUTPUT PARAMETERS: B - transformed interpolant with X replaced by T -- ALGLIB PROJECT -- Copyright 19.08.2009 by Bochkanov Sergey *************************************************************************/ void barycentriclintransx(const barycentricinterpolant &b, const double ca, const double cb); /************************************************************************* This subroutine performs linear transformation of the barycentric interpolant. INPUT PARAMETERS: B - rational interpolant in barycentric form CA, CB - transformation coefficients: B2(x) = CA*B(x) + CB OUTPUT PARAMETERS: B - transformed interpolant -- ALGLIB PROJECT -- Copyright 19.08.2009 by Bochkanov Sergey *************************************************************************/ void barycentriclintransy(const barycentricinterpolant &b, const double ca, const double cb); /************************************************************************* Extracts X/Y/W arrays from rational interpolant INPUT PARAMETERS: B - barycentric interpolant OUTPUT PARAMETERS: N - nodes count, N>0 X - interpolation nodes, array[0..N-1] F - function values, array[0..N-1] W - barycentric weights, array[0..N-1] -- ALGLIB -- Copyright 17.08.2009 by Bochkanov Sergey *************************************************************************/ void barycentricunpack(const barycentricinterpolant &b, ae_int_t &n, real_1d_array &x, real_1d_array &y, real_1d_array &w); /************************************************************************* Rational interpolant from X/Y/W arrays F(t) = SUM(i=0,n-1,w[i]*f[i]/(t-x[i])) / SUM(i=0,n-1,w[i]/(t-x[i])) INPUT PARAMETERS: X - interpolation nodes, array[0..N-1] F - function values, array[0..N-1] W - barycentric weights, array[0..N-1] N - nodes count, N>0 OUTPUT PARAMETERS: B - barycentric interpolant built from (X, Y, W) -- ALGLIB -- Copyright 17.08.2009 by Bochkanov Sergey *************************************************************************/ void barycentricbuildxyw(const real_1d_array &x, const real_1d_array &y, const real_1d_array &w, const ae_int_t n, barycentricinterpolant &b); /************************************************************************* Rational interpolant without poles The subroutine constructs the rational interpolating function without real poles (see 'Barycentric rational interpolation with no poles and high rates of approximation', Michael S. Floater. and Kai Hormann, for more information on this subject). Input parameters: X - interpolation nodes, array[0..N-1]. Y - function values, array[0..N-1]. N - number of nodes, N>0. D - order of the interpolation scheme, 0 <= D <= N-1. D<0 will cause an error. D>=N it will be replaced with D=N-1. if you don't know what D to choose, use small value about 3-5. Output parameters: B - barycentric interpolant. Note: this algorithm always succeeds and calculates the weights with close to machine precision. 
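A minimal sketch of typical usage (illustrative, not from the original documentation), building a Floater-Hormann interpolant and evaluating it with BarycentricCalc():

    #include "interpolation.h"
    #include <cstdio>

    int main()
    {
        // Samples of f(x) = 1/(1+x^2) on [-2, 2]
        alglib::real_1d_array x = "[-2, -1, 0, 1, 2]";
        alglib::real_1d_array y = "[0.2, 0.5, 1.0, 0.5, 0.2]";
        alglib::barycentricinterpolant b;
        alglib::barycentricbuildfloaterhormann(x, y, 5, 3, b);   // D=3 scheme
        printf("%0.4f\n", alglib::barycentriccalc(b, 0.5));      // approximates f(0.5) = 0.8
        return 0;
    }
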
-- ALGLIB PROJECT -- Copyright 17.06.2007 by Bochkanov Sergey *************************************************************************/ void barycentricbuildfloaterhormann(const real_1d_array &x, const real_1d_array &y, const ae_int_t n, const ae_int_t d, barycentricinterpolant &b); /************************************************************************* This subroutine builds linear spline interpolant INPUT PARAMETERS: X - spline nodes, array[0..N-1] Y - function values, array[0..N-1] N - points count (optional): * N>=2 * if given, only first N points are used to build spline * if not given, automatically detected from X/Y sizes (len(X) must be equal to len(Y)) OUTPUT PARAMETERS: C - spline interpolant ORDER OF POINTS Subroutine automatically sorts points, so caller may pass unsorted array. -- ALGLIB PROJECT -- Copyright 24.06.2007 by Bochkanov Sergey *************************************************************************/ void spline1dbuildlinear(const real_1d_array &x, const real_1d_array &y, const ae_int_t n, spline1dinterpolant &c); void spline1dbuildlinear(const real_1d_array &x, const real_1d_array &y, spline1dinterpolant &c); /************************************************************************* This subroutine builds cubic spline interpolant. INPUT PARAMETERS: X - spline nodes, array[0..N-1]. Y - function values, array[0..N-1]. OPTIONAL PARAMETERS: N - points count: * N>=2 * if given, only first N points are used to build spline * if not given, automatically detected from X/Y sizes (len(X) must be equal to len(Y)) BoundLType - boundary condition type for the left boundary BoundL - left boundary condition (first or second derivative, depending on the BoundLType) BoundRType - boundary condition type for the right boundary BoundR - right boundary condition (first or second derivative, depending on the BoundRType) OUTPUT PARAMETERS: C - spline interpolant ORDER OF POINTS Subroutine automatically sorts points, so caller may pass unsorted array. SETTING BOUNDARY VALUES: The BoundLType/BoundRType parameters can have the following values: * -1, which corresonds to the periodic (cyclic) boundary conditions. In this case: * both BoundLType and BoundRType must be equal to -1. * BoundL/BoundR are ignored * Y[last] is ignored (it is assumed to be equal to Y[first]). * 0, which corresponds to the parabolically terminated spline (BoundL and/or BoundR are ignored). * 1, which corresponds to the first derivative boundary condition * 2, which corresponds to the second derivative boundary condition * by default, BoundType=0 is used PROBLEMS WITH PERIODIC BOUNDARY CONDITIONS: Problems with periodic boundary conditions have Y[first_point]=Y[last_point]. However, this subroutine doesn't require you to specify equal values for the first and last points - it automatically forces them to be equal by copying Y[first_point] (corresponds to the leftmost, minimal X[]) to Y[last_point]. However it is recommended to pass consistent values of Y[], i.e. to make Y[first_point]=Y[last_point]. 
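Illustrative sketch of a periodic build (not part of the original documentation; evaluation uses Spline1DCalc(), which is declared elsewhere in this unit):

    #include "interpolation.h"
    #include <cstdio>

    int main()
    {
        // One period of a sampled signal; Y[last] equals Y[first]
        alglib::real_1d_array x = "[0, 1, 2, 3, 4]";
        alglib::real_1d_array y = "[0, 1, 0, -1, 0]";
        alglib::spline1dinterpolant s;

        // BoundLType = BoundRType = -1 selects periodic (cyclic) conditions,
        // so BoundL/BoundR (passed as 0.0 here) are ignored
        alglib::spline1dbuildcubic(x, y, 5, -1, 0.0, -1, 0.0, s);
        printf("%0.4f\n", alglib::spline1dcalc(s, 0.5));
        return 0;
    }
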
-- ALGLIB PROJECT -- Copyright 23.06.2007 by Bochkanov Sergey *************************************************************************/ void spline1dbuildcubic(const real_1d_array &x, const real_1d_array &y, const ae_int_t n, const ae_int_t boundltype, const double boundl, const ae_int_t boundrtype, const double boundr, spline1dinterpolant &c); void spline1dbuildcubic(const real_1d_array &x, const real_1d_array &y, spline1dinterpolant &c); /************************************************************************* This function solves following problem: given table y[] of function values at nodes x[], it calculates and returns table of function derivatives d[] (calculated at the same nodes x[]). This function yields same result as Spline1DBuildCubic() call followed by sequence of Spline1DDiff() calls, but it can be several times faster when called for ordered X[] and X2[]. INPUT PARAMETERS: X - spline nodes Y - function values OPTIONAL PARAMETERS: N - points count: * N>=2 * if given, only first N points are used * if not given, automatically detected from X/Y sizes (len(X) must be equal to len(Y)) BoundLType - boundary condition type for the left boundary BoundL - left boundary condition (first or second derivative, depending on the BoundLType) BoundRType - boundary condition type for the right boundary BoundR - right boundary condition (first or second derivative, depending on the BoundRType) OUTPUT PARAMETERS: D - derivative values at X[] ORDER OF POINTS Subroutine automatically sorts points, so caller may pass unsorted array. Derivative values are correctly reordered on return, so D[I] is always equal to S'(X[I]) independently of points order. SETTING BOUNDARY VALUES: The BoundLType/BoundRType parameters can have the following values: * -1, which corresonds to the periodic (cyclic) boundary conditions. In this case: * both BoundLType and BoundRType must be equal to -1. * BoundL/BoundR are ignored * Y[last] is ignored (it is assumed to be equal to Y[first]). * 0, which corresponds to the parabolically terminated spline (BoundL and/or BoundR are ignored). * 1, which corresponds to the first derivative boundary condition * 2, which corresponds to the second derivative boundary condition * by default, BoundType=0 is used PROBLEMS WITH PERIODIC BOUNDARY CONDITIONS: Problems with periodic boundary conditions have Y[first_point]=Y[last_point]. However, this subroutine doesn't require you to specify equal values for the first and last points - it automatically forces them to be equal by copying Y[first_point] (corresponds to the leftmost, minimal X[]) to Y[last_point]. However it is recommended to pass consistent values of Y[], i.e. to make Y[first_point]=Y[last_point]. -- ALGLIB PROJECT -- Copyright 03.09.2010 by Bochkanov Sergey *************************************************************************/ void spline1dgriddiffcubic(const real_1d_array &x, const real_1d_array &y, const ae_int_t n, const ae_int_t boundltype, const double boundl, const ae_int_t boundrtype, const double boundr, real_1d_array &d); void spline1dgriddiffcubic(const real_1d_array &x, const real_1d_array &y, real_1d_array &d); /************************************************************************* This function solves following problem: given table y[] of function values at nodes x[], it calculates and returns tables of first and second function derivatives d1[] and d2[] (calculated at the same nodes x[]). 
This function yields same result as Spline1DBuildCubic() call followed by sequence of Spline1DDiff() calls, but it can be several times faster when called for ordered X[] and X2[]. INPUT PARAMETERS: X - spline nodes Y - function values OPTIONAL PARAMETERS: N - points count: * N>=2 * if given, only first N points are used * if not given, automatically detected from X/Y sizes (len(X) must be equal to len(Y)) BoundLType - boundary condition type for the left boundary BoundL - left boundary condition (first or second derivative, depending on the BoundLType) BoundRType - boundary condition type for the right boundary BoundR - right boundary condition (first or second derivative, depending on the BoundRType) OUTPUT PARAMETERS: D1 - S' values at X[] D2 - S'' values at X[] ORDER OF POINTS Subroutine automatically sorts points, so caller may pass unsorted array. Derivative values are correctly reordered on return, so D[I] is always equal to S'(X[I]) independently of points order. SETTING BOUNDARY VALUES: The BoundLType/BoundRType parameters can have the following values: * -1, which corresonds to the periodic (cyclic) boundary conditions. In this case: * both BoundLType and BoundRType must be equal to -1. * BoundL/BoundR are ignored * Y[last] is ignored (it is assumed to be equal to Y[first]). * 0, which corresponds to the parabolically terminated spline (BoundL and/or BoundR are ignored). * 1, which corresponds to the first derivative boundary condition * 2, which corresponds to the second derivative boundary condition * by default, BoundType=0 is used PROBLEMS WITH PERIODIC BOUNDARY CONDITIONS: Problems with periodic boundary conditions have Y[first_point]=Y[last_point]. However, this subroutine doesn't require you to specify equal values for the first and last points - it automatically forces them to be equal by copying Y[first_point] (corresponds to the leftmost, minimal X[]) to Y[last_point]. However it is recommended to pass consistent values of Y[], i.e. to make Y[first_point]=Y[last_point]. -- ALGLIB PROJECT -- Copyright 03.09.2010 by Bochkanov Sergey *************************************************************************/ void spline1dgriddiff2cubic(const real_1d_array &x, const real_1d_array &y, const ae_int_t n, const ae_int_t boundltype, const double boundl, const ae_int_t boundrtype, const double boundr, real_1d_array &d1, real_1d_array &d2); void spline1dgriddiff2cubic(const real_1d_array &x, const real_1d_array &y, real_1d_array &d1, real_1d_array &d2); /************************************************************************* This function solves following problem: given table y[] of function values at old nodes x[] and new nodes x2[], it calculates and returns table of function values y2[] (calculated at x2[]). This function yields same result as Spline1DBuildCubic() call followed by sequence of Spline1DDiff() calls, but it can be several times faster when called for ordered X[] and X2[]. 
INPUT PARAMETERS: X - old spline nodes Y - function values X2 - new spline nodes OPTIONAL PARAMETERS: N - points count: * N>=2 * if given, only first N points from X/Y are used * if not given, automatically detected from X/Y sizes (len(X) must be equal to len(Y)) BoundLType - boundary condition type for the left boundary BoundL - left boundary condition (first or second derivative, depending on the BoundLType) BoundRType - boundary condition type for the right boundary BoundR - right boundary condition (first or second derivative, depending on the BoundRType) N2 - new points count: * N2>=2 * if given, only first N2 points from X2 are used * if not given, automatically detected from X2 size OUTPUT PARAMETERS: F2 - function values at X2[] ORDER OF POINTS Subroutine automatically sorts points, so caller may pass unsorted array. Function values are correctly reordered on return, so F2[I] is always equal to S(X2[I]) independently of points order. SETTING BOUNDARY VALUES: The BoundLType/BoundRType parameters can have the following values: * -1, which corresonds to the periodic (cyclic) boundary conditions. In this case: * both BoundLType and BoundRType must be equal to -1. * BoundL/BoundR are ignored * Y[last] is ignored (it is assumed to be equal to Y[first]). * 0, which corresponds to the parabolically terminated spline (BoundL and/or BoundR are ignored). * 1, which corresponds to the first derivative boundary condition * 2, which corresponds to the second derivative boundary condition * by default, BoundType=0 is used PROBLEMS WITH PERIODIC BOUNDARY CONDITIONS: Problems with periodic boundary conditions have Y[first_point]=Y[last_point]. However, this subroutine doesn't require you to specify equal values for the first and last points - it automatically forces them to be equal by copying Y[first_point] (corresponds to the leftmost, minimal X[]) to Y[last_point]. However it is recommended to pass consistent values of Y[], i.e. to make Y[first_point]=Y[last_point]. -- ALGLIB PROJECT -- Copyright 03.09.2010 by Bochkanov Sergey *************************************************************************/ void spline1dconvcubic(const real_1d_array &x, const real_1d_array &y, const ae_int_t n, const ae_int_t boundltype, const double boundl, const ae_int_t boundrtype, const double boundr, const real_1d_array &x2, const ae_int_t n2, real_1d_array &y2); void spline1dconvcubic(const real_1d_array &x, const real_1d_array &y, const real_1d_array &x2, real_1d_array &y2); /************************************************************************* This function solves following problem: given table y[] of function values at old nodes x[] and new nodes x2[], it calculates and returns table of function values y2[] and derivatives d2[] (calculated at x2[]). This function yields same result as Spline1DBuildCubic() call followed by sequence of Spline1DDiff() calls, but it can be several times faster when called for ordered X[] and X2[]. 
INPUT PARAMETERS: X - old spline nodes Y - function values X2 - new spline nodes OPTIONAL PARAMETERS: N - points count: * N>=2 * if given, only first N points from X/Y are used * if not given, automatically detected from X/Y sizes (len(X) must be equal to len(Y)) BoundLType - boundary condition type for the left boundary BoundL - left boundary condition (first or second derivative, depending on the BoundLType) BoundRType - boundary condition type for the right boundary BoundR - right boundary condition (first or second derivative, depending on the BoundRType) N2 - new points count: * N2>=2 * if given, only first N2 points from X2 are used * if not given, automatically detected from X2 size OUTPUT PARAMETERS: F2 - function values at X2[] D2 - first derivatives at X2[] ORDER OF POINTS Subroutine automatically sorts points, so caller may pass unsorted array. Function values are correctly reordered on return, so F2[I] is always equal to S(X2[I]) independently of points order. SETTING BOUNDARY VALUES: The BoundLType/BoundRType parameters can have the following values: * -1, which corresonds to the periodic (cyclic) boundary conditions. In this case: * both BoundLType and BoundRType must be equal to -1. * BoundL/BoundR are ignored * Y[last] is ignored (it is assumed to be equal to Y[first]). * 0, which corresponds to the parabolically terminated spline (BoundL and/or BoundR are ignored). * 1, which corresponds to the first derivative boundary condition * 2, which corresponds to the second derivative boundary condition * by default, BoundType=0 is used PROBLEMS WITH PERIODIC BOUNDARY CONDITIONS: Problems with periodic boundary conditions have Y[first_point]=Y[last_point]. However, this subroutine doesn't require you to specify equal values for the first and last points - it automatically forces them to be equal by copying Y[first_point] (corresponds to the leftmost, minimal X[]) to Y[last_point]. However it is recommended to pass consistent values of Y[], i.e. to make Y[first_point]=Y[last_point]. -- ALGLIB PROJECT -- Copyright 03.09.2010 by Bochkanov Sergey *************************************************************************/ void spline1dconvdiffcubic(const real_1d_array &x, const real_1d_array &y, const ae_int_t n, const ae_int_t boundltype, const double boundl, const ae_int_t boundrtype, const double boundr, const real_1d_array &x2, const ae_int_t n2, real_1d_array &y2, real_1d_array &d2); void spline1dconvdiffcubic(const real_1d_array &x, const real_1d_array &y, const real_1d_array &x2, real_1d_array &y2, real_1d_array &d2); /************************************************************************* This function solves following problem: given table y[] of function values at old nodes x[] and new nodes x2[], it calculates and returns table of function values y2[], first and second derivatives d2[] and dd2[] (calculated at x2[]). This function yields same result as Spline1DBuildCubic() call followed by sequence of Spline1DDiff() calls, but it can be several times faster when called for ordered X[] and X2[]. 
INPUT PARAMETERS: X - old spline nodes Y - function values X2 - new spline nodes OPTIONAL PARAMETERS: N - points count: * N>=2 * if given, only first N points from X/Y are used * if not given, automatically detected from X/Y sizes (len(X) must be equal to len(Y)) BoundLType - boundary condition type for the left boundary BoundL - left boundary condition (first or second derivative, depending on the BoundLType) BoundRType - boundary condition type for the right boundary BoundR - right boundary condition (first or second derivative, depending on the BoundRType) N2 - new points count: * N2>=2 * if given, only first N2 points from X2 are used * if not given, automatically detected from X2 size OUTPUT PARAMETERS: F2 - function values at X2[] D2 - first derivatives at X2[] DD2 - second derivatives at X2[] ORDER OF POINTS Subroutine automatically sorts points, so caller may pass unsorted array. Function values are correctly reordered on return, so F2[I] is always equal to S(X2[I]) independently of points order. SETTING BOUNDARY VALUES: The BoundLType/BoundRType parameters can have the following values: * -1, which corresponds to the periodic (cyclic) boundary conditions. In this case: * both BoundLType and BoundRType must be equal to -1. * BoundL/BoundR are ignored * Y[last] is ignored (it is assumed to be equal to Y[first]). * 0, which corresponds to the parabolically terminated spline (BoundL and/or BoundR are ignored). * 1, which corresponds to the first derivative boundary condition * 2, which corresponds to the second derivative boundary condition * by default, BoundType=0 is used PROBLEMS WITH PERIODIC BOUNDARY CONDITIONS: Problems with periodic boundary conditions have Y[first_point]=Y[last_point]. However, this subroutine doesn't require you to specify equal values for the first and last points - it automatically forces them to be equal by copying Y[first_point] (corresponds to the leftmost, minimal X[]) to Y[last_point]. However it is recommended to pass consistent values of Y[], i.e. to make Y[first_point]=Y[last_point]. -- ALGLIB PROJECT -- Copyright 03.09.2010 by Bochkanov Sergey *************************************************************************/ void spline1dconvdiff2cubic(const real_1d_array &x, const real_1d_array &y, const ae_int_t n, const ae_int_t boundltype, const double boundl, const ae_int_t boundrtype, const double boundr, const real_1d_array &x2, const ae_int_t n2, real_1d_array &y2, real_1d_array &d2, real_1d_array &dd2); void spline1dconvdiff2cubic(const real_1d_array &x, const real_1d_array &y, const real_1d_array &x2, real_1d_array &y2, real_1d_array &d2, real_1d_array &dd2); /************************************************************************* This subroutine builds Catmull-Rom spline interpolant. INPUT PARAMETERS: X - spline nodes, array[0..N-1]. Y - function values, array[0..N-1]. OPTIONAL PARAMETERS: N - points count: * N>=2 * if given, only first N points are used to build spline * if not given, automatically detected from X/Y sizes (len(X) must be equal to len(Y)) BoundType - boundary condition type: * -1 for periodic boundary condition * 0 for parabolically terminated spline (default) Tension - tension parameter: * tension=0 corresponds to classic Catmull-Rom spline (default) * 0<tension<1 corresponds to more general form - cardinal spline OUTPUT PARAMETERS: C - spline interpolant ORDER OF POINTS Subroutine automatically sorts points, so caller may pass unsorted array. -- ALGLIB PROJECT -- Copyright 23.06.2007 by Bochkanov Sergey *************************************************************************/ void spline1dbuildcatmullrom(const real_1d_array &x, const real_1d_array &y, const ae_int_t n, const ae_int_t boundtype, const double tension, spline1dinterpolant &c); void spline1dbuildcatmullrom(const real_1d_array &x, const real_1d_array &y, spline1dinterpolant &c); /************************************************************************* This subroutine builds Hermite spline interpolant. INPUT PARAMETERS: X - spline nodes, array[0..N-1] Y - function values, array[0..N-1] D - derivatives, array[0..N-1] N - points count (optional): * N>=2 * if given, only first N points are used to build spline * if not given, automatically detected from X/Y sizes (len(X) must be equal to len(Y)) OUTPUT PARAMETERS: C - spline interpolant.
ORDER OF POINTS Subroutine automatically sorts points, so caller may pass unsorted array. -- ALGLIB PROJECT -- Copyright 23.06.2007 by Bochkanov Sergey *************************************************************************/ void spline1dbuildhermite(const real_1d_array &x, const real_1d_array &y, const real_1d_array &d, const ae_int_t n, spline1dinterpolant &c); void spline1dbuildhermite(const real_1d_array &x, const real_1d_array &y, const real_1d_array &d, spline1dinterpolant &c); /************************************************************************* This subroutine builds Akima spline interpolant INPUT PARAMETERS: X - spline nodes, array[0..N-1] Y - function values, array[0..N-1] N - points count (optional): * N>=2 * if given, only first N points are used to build spline * if not given, automatically detected from X/Y sizes (len(X) must be equal to len(Y)) OUTPUT PARAMETERS: C - spline interpolant ORDER OF POINTS Subroutine automatically sorts points, so caller may pass unsorted array. -- ALGLIB PROJECT -- Copyright 24.06.2007 by Bochkanov Sergey *************************************************************************/ void spline1dbuildakima(const real_1d_array &x, const real_1d_array &y, const ae_int_t n, spline1dinterpolant &c); void spline1dbuildakima(const real_1d_array &x, const real_1d_array &y, spline1dinterpolant &c); /************************************************************************* This subroutine calculates the value of the spline at the given point X. INPUT PARAMETERS: C - spline interpolant X - point Result: S(x) -- ALGLIB PROJECT -- Copyright 23.06.2007 by Bochkanov Sergey *************************************************************************/ double spline1dcalc(const spline1dinterpolant &c, const double x); /************************************************************************* This subroutine differentiates the spline. INPUT PARAMETERS: C - spline interpolant. X - point Result: S - S(x) DS - S'(x) D2S - S''(x) -- ALGLIB PROJECT -- Copyright 24.06.2007 by Bochkanov Sergey *************************************************************************/ void spline1ddiff(const spline1dinterpolant &c, const double x, double &s, double &ds, double &d2s); /************************************************************************* This subroutine unpacks the spline into the coefficients table. INPUT PARAMETERS: C - spline interpolant. X - point OUTPUT PARAMETERS: Tbl - coefficients table, unpacked format, array[0..N-2, 0..5]. For I = 0...N-2: Tbl[I,0] = X[i] Tbl[I,1] = X[i+1] Tbl[I,2] = C0 Tbl[I,3] = C1 Tbl[I,4] = C2 Tbl[I,5] = C3 On [x[i], x[i+1]] spline is equals to: S(x) = C0 + C1*t + C2*t^2 + C3*t^3 t = x-x[i] NOTE: You can rebuild spline with Spline1DBuildHermite() function, which accepts as inputs function values and derivatives at nodes, which are easy to calculate when you have coefficients. -- ALGLIB PROJECT -- Copyright 29.06.2007 by Bochkanov Sergey *************************************************************************/ void spline1dunpack(const spline1dinterpolant &c, ae_int_t &n, real_2d_array &tbl); /************************************************************************* This subroutine performs linear transformation of the spline argument. INPUT PARAMETERS: C - spline interpolant. 
A, B - transformation coefficients: x = A*t + B Result: C - transformed spline -- ALGLIB PROJECT -- Copyright 30.06.2007 by Bochkanov Sergey *************************************************************************/ void spline1dlintransx(const spline1dinterpolant &c, const double a, const double b); /************************************************************************* This subroutine performs linear transformation of the spline. INPUT PARAMETERS: C - spline interpolant. A, B - transformation coefficients: S2(x) = A*S(x) + B Result: C - transformed spline -- ALGLIB PROJECT -- Copyright 30.06.2007 by Bochkanov Sergey *************************************************************************/ void spline1dlintransy(const spline1dinterpolant &c, const double a, const double b); /************************************************************************* This subroutine integrates the spline. INPUT PARAMETERS: C - spline interpolant. X - right bound of the integration interval [a, x], here 'a' denotes min(x[]) Result: integral(S(t)dt,a,x) -- ALGLIB PROJECT -- Copyright 23.06.2007 by Bochkanov Sergey *************************************************************************/ double spline1dintegrate(const spline1dinterpolant &c, const double x); /************************************************************************* This function builds monotone cubic Hermite interpolant. This interpolant is monotonic in [x(0),x(n-1)] and is constant outside of this interval. In case y[] forms a non-monotonic sequence, the interpolant is piecewise monotonic. Say, for x=(0,1,2,3,4) and y=(0,1,2,1,0) the interpolant will monotonically grow on [0..2] and monotonically decrease on [2..4]. INPUT PARAMETERS: X - spline nodes, array[0..N-1]. Subroutine automatically sorts points, so caller may pass unsorted array. Y - function values, array[0..N-1] N - the number of points (N>=2). OUTPUT PARAMETERS: C - spline interpolant. -- ALGLIB PROJECT -- Copyright 21.06.2012 by Bochkanov Sergey *************************************************************************/ void spline1dbuildmonotone(const real_1d_array &x, const real_1d_array &y, const ae_int_t n, spline1dinterpolant &c); void spline1dbuildmonotone(const real_1d_array &x, const real_1d_array &y, spline1dinterpolant &c); /************************************************************************* This function builds non-periodic 2-dimensional parametric spline which starts at (X[0],Y[0]) and ends at (X[N-1],Y[N-1]). INPUT PARAMETERS: XY - points, array[0..N-1,0..1]. XY[I,0:1] corresponds to the Ith point. Order of points is important! N - points count, N>=5 for Akima splines, N>=2 for other types of splines. ST - spline type: * 0 Akima spline * 1 parabolically terminated Catmull-Rom spline (Tension=0) * 2 parabolically terminated cubic spline PT - parameterization type: * 0 uniform * 1 chord length * 2 centripetal OUTPUT PARAMETERS: P - parametric spline interpolant NOTES: * this function assumes that all consecutive points are distinct. I.e. (x0,y0)<>(x1,y1), (x1,y1)<>(x2,y2), (x2,y2)<>(x3,y3) and so on. However, non-consecutive points may coincide, i.e. we can have (x0,y0)=(x2,y2).
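A minimal sketch of the parametric spline workflow (hypothetical points; interpolation.h header assumed): it builds a chord-length parameterized Catmull-Rom curve with pspline2build() and queries position and tangent with pspline2calc()/pspline2tangent(), which are documented further below.

    #include <cstdio>
    #include "interpolation.h"
    using namespace alglib;

    int main()
    {
        // hypothetical curve points; order matters, consecutive points must differ
        real_2d_array xy = "[[0,0],[1,1],[2,0],[3,1]]";

        pspline2interpolant p;
        // ST=1 -> parabolically terminated Catmull-Rom, PT=1 -> chord length
        pspline2build(xy, 4, 1, 1, p);

        double x, y, tx, ty;
        pspline2calc(p, 0.5, x, y);       // point at T=0.5
        pspline2tangent(p, 0.5, tx, ty);  // normalized tangent at T=0.5
        printf("P(0.5)=(%.3f,%.3f), tangent=(%.3f,%.3f)\n", x, y, tx, ty);
        return 0;
    }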
-- ALGLIB PROJECT -- Copyright 28.05.2010 by Bochkanov Sergey *************************************************************************/ void pspline2build(const real_2d_array &xy, const ae_int_t n, const ae_int_t st, const ae_int_t pt, pspline2interpolant &p); /************************************************************************* This function builds non-periodic 3-dimensional parametric spline which starts at (X[0],Y[0],Z[0]) and ends at (X[N-1],Y[N-1],Z[N-1]). Same as PSpline2Build() function, but for 3D, so we won't duplicate its description here. -- ALGLIB PROJECT -- Copyright 28.05.2010 by Bochkanov Sergey *************************************************************************/ void pspline3build(const real_2d_array &xy, const ae_int_t n, const ae_int_t st, const ae_int_t pt, pspline3interpolant &p); /************************************************************************* This function builds periodic 2-dimensional parametric spline which starts at (X[0],Y[0]), goes through all points to (X[N-1],Y[N-1]) and then back to (X[0],Y[0]). INPUT PARAMETERS: XY - points, array[0..N-1,0..1]. XY[I,0:1] corresponds to the Ith point. XY[N-1,0:1] must be different from XY[0,0:1]. Order of points is important! N - points count, N>=3. ST - spline type: * 1 Catmull-Rom spline (Tension=0) with cyclic boundary conditions * 2 cubic spline with cyclic boundary conditions PT - parameterization type: * 0 uniform * 1 chord length * 2 centripetal OUTPUT PARAMETERS: P - parametric spline interpolant NOTES: * this function assumes that all consecutive points are distinct. I.e. (x0,y0)<>(x1,y1), (x1,y1)<>(x2,y2), (x2,y2)<>(x3,y3) and so on. However, non-consecutive points may coincide, i.e. we can have (x0,y0)=(x2,y2). * last point of sequence is NOT equal to the first point. You shouldn't make curve "explicitly periodic" by making them equal. -- ALGLIB PROJECT -- Copyright 28.05.2010 by Bochkanov Sergey *************************************************************************/ void pspline2buildperiodic(const real_2d_array &xy, const ae_int_t n, const ae_int_t st, const ae_int_t pt, pspline2interpolant &p); /************************************************************************* This function builds periodic 3-dimensional parametric spline which starts at (X[0],Y[0],Z[0]), goes through all points to (X[N-1],Y[N-1],Z[N-1]) and then back to (X[0],Y[0],Z[0]). Same as PSpline2BuildPeriodic() function, but for 3D, so we won't duplicate its description here. -- ALGLIB PROJECT -- Copyright 28.05.2010 by Bochkanov Sergey *************************************************************************/ void pspline3buildperiodic(const real_2d_array &xy, const ae_int_t n, const ae_int_t st, const ae_int_t pt, pspline3interpolant &p); /************************************************************************* This function returns vector of parameter values corresponding to points. I.e. for P created from (X[0],Y[0])...(X[N-1],Y[N-1]) and U=TValues(P) we have (X[0],Y[0]) = PSpline2Calc(P,U[0]), (X[1],Y[1]) = PSpline2Calc(P,U[1]), (X[2],Y[2]) = PSpline2Calc(P,U[2]), ... INPUT PARAMETERS: P - parametric spline interpolant OUTPUT PARAMETERS: N - array size T - array[0..N-1] NOTES: * for non-periodic splines U[0]=0, U[0]<U[1]<...<U[N-1], U[N-1]=1 * for periodic splines U[0]=0, U[0]<U[1]<...<U[N-1], U[N-1]<1 -- ALGLIB PROJECT -- Copyright 28.05.2010 by Bochkanov Sergey *************************************************************************/ void pspline2parametervalues(const pspline2interpolant &p, ae_int_t &n, real_1d_array &t); /************************************************************************* This function calculates the value of the parametric spline for a given value of parameter T. INPUT PARAMETERS: P - parametric spline interpolant T - point: * T in [0,1] corresponds to interval spanned by points * for non-periodic splines T<0 (or T>1) correspond to parts of the curve before the first (after the last) point * for periodic splines T<0 (or T>1) are projected into [0,1] by making T=T-floor(T).
OUTPUT PARAMETERS: X - X-position Y - Y-position -- ALGLIB PROJECT -- Copyright 28.05.2010 by Bochkanov Sergey *************************************************************************/ void pspline2calc(const pspline2interpolant &p, const double t, double &x, double &y); /************************************************************************* This function calculates the value of the parametric spline for a given value of parameter T. INPUT PARAMETERS: P - parametric spline interpolant T - point: * T in [0,1] corresponds to interval spanned by points * for non-periodic splines T<0 (or T>1) correspond to parts of the curve before the first (after the last) point * for periodic splines T<0 (or T>1) are projected into [0,1] by making T=T-floor(T). OUTPUT PARAMETERS: X - X-position Y - Y-position Z - Z-position -- ALGLIB PROJECT -- Copyright 28.05.2010 by Bochkanov Sergey *************************************************************************/ void pspline3calc(const pspline3interpolant &p, const double t, double &x, double &y, double &z); /************************************************************************* This function calculates tangent vector for a given value of parameter T INPUT PARAMETERS: P - parametric spline interpolant T - point: * T in [0,1] corresponds to interval spanned by points * for non-periodic splines T<0 (or T>1) correspond to parts of the curve before the first (after the last) point * for periodic splines T<0 (or T>1) are projected into [0,1] by making T=T-floor(T). OUTPUT PARAMETERS: X - X-component of tangent vector (normalized) Y - Y-component of tangent vector (normalized) NOTE: X^2+Y^2 is either 1 (for non-zero tangent vector) or 0. -- ALGLIB PROJECT -- Copyright 28.05.2010 by Bochkanov Sergey *************************************************************************/ void pspline2tangent(const pspline2interpolant &p, const double t, double &x, double &y); /************************************************************************* This function calculates tangent vector for a given value of parameter T INPUT PARAMETERS: P - parametric spline interpolant T - point: * T in [0,1] corresponds to interval spanned by points * for non-periodic splines T<0 (or T>1) correspond to parts of the curve before the first (after the last) point * for periodic splines T<0 (or T>1) are projected into [0,1] by making T=T-floor(T). OUTPUT PARAMETERS: X - X-component of tangent vector (normalized) Y - Y-component of tangent vector (normalized) Z - Z-component of tangent vector (normalized) NOTE: X^2+Y^2+Z^2 is either 1 (for non-zero tangent vector) or 0. -- ALGLIB PROJECT -- Copyright 28.05.2010 by Bochkanov Sergey *************************************************************************/ void pspline3tangent(const pspline3interpolant &p, const double t, double &x, double &y, double &z); /************************************************************************* This function calculates derivative, i.e. it returns (dX/dT,dY/dT). INPUT PARAMETERS: P - parametric spline interpolant T - point: * T in [0,1] corresponds to interval spanned by points * for non-periodic splines T<0 (or T>1) correspond to parts of the curve before the first (after the last) point * for periodic splines T<0 (or T>1) are projected into [0,1] by making T=T-floor(T). 
OUTPUT PARAMETERS: X - X-value DX - X-derivative Y - Y-value DY - Y-derivative -- ALGLIB PROJECT -- Copyright 28.05.2010 by Bochkanov Sergey *************************************************************************/ void pspline2diff(const pspline2interpolant &p, const double t, double &x, double &dx, double &y, double &dy); /************************************************************************* This function calculates derivative, i.e. it returns (dX/dT,dY/dT,dZ/dT). INPUT PARAMETERS: P - parametric spline interpolant T - point: * T in [0,1] corresponds to interval spanned by points * for non-periodic splines T<0 (or T>1) correspond to parts of the curve before the first (after the last) point * for periodic splines T<0 (or T>1) are projected into [0,1] by making T=T-floor(T). OUTPUT PARAMETERS: X - X-value DX - X-derivative Y - Y-value DY - Y-derivative Z - Z-value DZ - Z-derivative -- ALGLIB PROJECT -- Copyright 28.05.2010 by Bochkanov Sergey *************************************************************************/ void pspline3diff(const pspline3interpolant &p, const double t, double &x, double &dx, double &y, double &dy, double &z, double &dz); /************************************************************************* This function calculates first and second derivative with respect to T. INPUT PARAMETERS: P - parametric spline interpolant T - point: * T in [0,1] corresponds to interval spanned by points * for non-periodic splines T<0 (or T>1) correspond to parts of the curve before the first (after the last) point * for periodic splines T<0 (or T>1) are projected into [0,1] by making T=T-floor(T). OUTPUT PARAMETERS: X - X-value DX - derivative D2X - second derivative Y - Y-value DY - derivative D2Y - second derivative -- ALGLIB PROJECT -- Copyright 28.05.2010 by Bochkanov Sergey *************************************************************************/ void pspline2diff2(const pspline2interpolant &p, const double t, double &x, double &dx, double &d2x, double &y, double &dy, double &d2y); /************************************************************************* This function calculates first and second derivative with respect to T. INPUT PARAMETERS: P - parametric spline interpolant T - point: * T in [0,1] corresponds to interval spanned by points * for non-periodic splines T<0 (or T>1) correspond to parts of the curve before the first (after the last) point * for periodic splines T<0 (or T>1) are projected into [0,1] by making T=T-floor(T). OUTPUT PARAMETERS: X - X-value DX - derivative D2X - second derivative Y - Y-value DY - derivative D2Y - second derivative Z - Z-value DZ - derivative D2Z - second derivative -- ALGLIB PROJECT -- Copyright 28.05.2010 by Bochkanov Sergey *************************************************************************/ void pspline3diff2(const pspline3interpolant &p, const double t, double &x, double &dx, double &d2x, double &y, double &dy, double &d2y, double &z, double &dz, double &d2z); /************************************************************************* This function calculates arc length, i.e. length of curve between t=a and t=b. 
INPUT PARAMETERS: P - parametric spline interpolant A,B - parameter values corresponding to arc ends: * B>A will result in positive length returned * B<A will result in negative length returned RESULT: length of arc starting at T=A and ending at T=B. -- ALGLIB PROJECT -- Copyright 30.05.2010 by Bochkanov Sergey *************************************************************************/ double pspline2arclength(const pspline2interpolant &p, const double a, const double b); /************************************************************************* This function calculates arc length, i.e. length of curve between t=a and t=b (3D version). INPUT PARAMETERS: P - parametric spline interpolant A,B - parameter values corresponding to arc ends: * B>A will result in positive length returned * B<A will result in negative length returned RESULT: length of arc starting at T=A and ending at T=B. -- ALGLIB PROJECT -- Copyright 30.05.2010 by Bochkanov Sergey *************************************************************************/ double pspline3arclength(const pspline3interpolant &p, const double a, const double b); /************************************************************************* This function performs trilinear resampling, i.e. it builds a trilinear spline from function values on the old grid and evaluates it on the new grid. INPUT PARAMETERS: A - array[0..OldXCount*OldYCount*OldZCount-1], function values at the old grid: A[0] x=0,y=0,z=0 A[1] x=1,y=0,z=0 A[..] ... A[..] x=oldxcount-1,y=0,z=0 A[..] x=0,y=1,z=0 A[..] ... ... OldZCount - old Z-count, OldZCount>1 OldYCount - old Y-count, OldYCount>1 OldXCount - old X-count, OldXCount>1 NewZCount - new Z-count, NewZCount>1 NewYCount - new Y-count, NewYCount>1 NewXCount - new X-count, NewXCount>1 OUTPUT PARAMETERS: B - array[0..NewXCount*NewYCount*NewZCount-1], function values at the new grid: B[0] x=0,y=0,z=0 B[1] x=1,y=0,z=0 B[..] ... B[..] x=newxcount-1,y=0,z=0 B[..] x=0,y=1,z=0 B[..] ... ... -- ALGLIB routine -- 26.04.2012 Copyright by Bochkanov Sergey *************************************************************************/ void spline3dresampletrilinear(const real_1d_array &a, const ae_int_t oldzcount, const ae_int_t oldycount, const ae_int_t oldxcount, const ae_int_t newzcount, const ae_int_t newycount, const ae_int_t newxcount, real_1d_array &b); /************************************************************************* This subroutine builds trilinear vector-valued spline. INPUT PARAMETERS: X - spline abscissas, array[0..N-1] Y - spline ordinates, array[0..M-1] Z - spline applicates, array[0..L-1] F - function values, array[0..M*N*L*D-1]: * first D elements store D values at (X[0],Y[0],Z[0]) * next D elements store D values at (X[1],Y[0],Z[0]) * next D elements store D values at (X[2],Y[0],Z[0]) * ... * next D elements store D values at (X[0],Y[1],Z[0]) * next D elements store D values at (X[1],Y[1],Z[0]) * next D elements store D values at (X[2],Y[1],Z[0]) * ... * next D elements store D values at (X[0],Y[0],Z[1]) * next D elements store D values at (X[1],Y[0],Z[1]) * next D elements store D values at (X[2],Y[0],Z[1]) * ... * general form - D function values at (X[I],Y[J],Z[K]) are stored at F[D*(N*(M*K+J)+I)...D*(N*(M*K+J)+I)+D-1]. M, N, L - grid size, M>=2, N>=2, L>=2 D - vector dimension, D>=1 OUTPUT PARAMETERS: C - spline interpolant -- ALGLIB PROJECT -- Copyright 26.04.2012 by Bochkanov Sergey *************************************************************************/ void spline3dbuildtrilinearv(const real_1d_array &x, const ae_int_t n, const real_1d_array &y, const ae_int_t m, const real_1d_array &z, const ae_int_t l, const real_1d_array &f, const ae_int_t d, spline3dinterpolant &c); /************************************************************************* This subroutine calculates trilinear or tricubic vector-valued spline at the given point (X,Y,Z). INPUT PARAMETERS: C - spline interpolant. X, Y, Z - point F - output buffer, possibly preallocated array. In case array size is large enough to store result, it is not reallocated. Array which is too short will be reallocated OUTPUT PARAMETERS: F - array[D] (or larger) which stores function values -- ALGLIB PROJECT -- Copyright 26.04.2012 by Bochkanov Sergey *************************************************************************/ void spline3dcalcvbuf(const spline3dinterpolant &c, const double x, const double y, const double z, real_1d_array &f); /************************************************************************* This subroutine calculates trilinear or tricubic vector-valued spline at the given point (X,Y,Z). INPUT PARAMETERS: C - spline interpolant. X, Y, Z - point OUTPUT PARAMETERS: F - array[D] which stores function values. F is out-parameter and it is reallocated after call to this function. In case you want to reuse previously allocated F, you may use Spline3DCalcVBuf(), which reallocates F only when it is too small.
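A minimal sketch of the vector-valued trilinear spline API above (hypothetical 2x2x2 grid with a single component, D=1; interpolation.h header assumed):

    #include <cstdio>
    #include "interpolation.h"
    using namespace alglib;

    int main()
    {
        real_1d_array x = "[0.0, 1.0]";
        real_1d_array y = "[0.0, 1.0]";
        real_1d_array z = "[0.0, 1.0]";
        // values stored x-fastest: f(0,0,0), f(1,0,0), f(0,1,0), f(1,1,0),
        //                          f(0,0,1), f(1,0,1), f(0,1,1), f(1,1,1)
        real_1d_array f = "[0, 1, 2, 3, 4, 5, 6, 7]";

        spline3dinterpolant c;
        spline3dbuildtrilinearv(x, 2, y, 2, z, 2, f, 1, c);

        real_1d_array v;
        spline3dcalcv(c, 0.5, 0.5, 0.5, v);   // v has D=1 element
        printf("S(0.5,0.5,0.5) = %.4f\n", v[0]);
        return 0;
    }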
-- ALGLIB PROJECT -- Copyright 26.04.2012 by Bochkanov Sergey *************************************************************************/ void spline3dcalcv(const spline3dinterpolant &c, const double x, const double y, const double z, real_1d_array &f); /************************************************************************* This subroutine unpacks tri-dimensional spline into the coefficients table. INPUT PARAMETERS: C - spline interpolant. Result: N - grid size (X) M - grid size (Y) L - grid size (Z) D - number of components SType- spline type. Currently, only one spline type is supported: trilinear spline, as indicated by SType=1. Tbl - spline coefficients: [0..(N-1)*(M-1)*(L-1)*D-1, 0..13]. For T=0..D-1 (component index), I = 0...N-2 (x index), J=0..M-2 (y index), K=0..L-2 (z index): Q := T + I*D + J*D*(N-1) + K*D*(N-1)*(M-1), Q-th row stores decomposition for T-th component of the vector-valued function Tbl[Q,0] = X[i] Tbl[Q,1] = X[i+1] Tbl[Q,2] = Y[j] Tbl[Q,3] = Y[j+1] Tbl[Q,4] = Z[k] Tbl[Q,5] = Z[k+1] Tbl[Q,6] = C000 Tbl[Q,7] = C100 Tbl[Q,8] = C010 Tbl[Q,9] = C110 Tbl[Q,10]= C001 Tbl[Q,11]= C101 Tbl[Q,12]= C011 Tbl[Q,13]= C111 On each grid cell the spline equals: S(x,y,z) = SUM(Cpqr*(t^p)*(u^q)*(v^r), p=0..1, q=0..1, r=0..1), where Cpqr are the coefficients listed above and t = x-X[i], u = y-Y[j], v = z-Z[k]. NOTE: format of Tbl is given for SType=1. Future versions of ALGLIB can use different formats for different values of SType. -- ALGLIB PROJECT -- Copyright 26.04.2012 by Bochkanov Sergey *************************************************************************/ void spline3dunpackv(const spline3dinterpolant &c, ae_int_t &n, ae_int_t &m, ae_int_t &l, ae_int_t &d, ae_int_t &stype, real_2d_array &tbl); /************************************************************************* Conversion from barycentric representation to Chebyshev basis. This function has O(N^2) complexity. INPUT PARAMETERS: P - polynomial in barycentric form A,B - base interval for Chebyshev polynomials (see below) A<>B OUTPUT PARAMETERS T - coefficients of Chebyshev representation; P(x) = sum { T[i]*Ti(2*(x-A)/(B-A)-1), i=0..N-1 }, where Ti - I-th Chebyshev polynomial. NOTES: barycentric interpolant passed as P may be either polynomial obtained from polynomial interpolation/fitting or rational function which is NOT polynomial. We can't distinguish between these two cases, and this algorithm just tries to work assuming that P IS a polynomial. If not, algorithm will return results, but they won't have any meaning. -- ALGLIB -- Copyright 30.09.2010 by Bochkanov Sergey *************************************************************************/ void polynomialbar2cheb(const barycentricinterpolant &p, const double a, const double b, real_1d_array &t); /************************************************************************* Conversion from Chebyshev basis to barycentric representation. This function has O(N^2) complexity. INPUT PARAMETERS: T - coefficients of Chebyshev representation; P(x) = sum { T[i]*Ti(2*(x-A)/(B-A)-1), i=0..N-1 }, where Ti - I-th Chebyshev polynomial. N - number of coefficients: * if given, only leading N elements of T are used * if not given, automatically determined from size of T A,B - base interval for Chebyshev polynomials (see above) A<B OUTPUT PARAMETERS P - polynomial in barycentric form -- ALGLIB -- Copyright 30.09.2010 by Bochkanov Sergey *************************************************************************/ void polynomialcheb2bar(const real_1d_array &t, const ae_int_t n, const double a, const double b, barycentricinterpolant &p); void polynomialcheb2bar(const real_1d_array &t, const double a, const double b, barycentricinterpolant &p); /************************************************************************* Conversion from barycentric representation to power basis. This function has O(N^2) complexity. INPUT PARAMETERS: P - polynomial in barycentric form C - offset (see below); 0.0 is used as default value. S - scale (see below); 1.0 is used as default value. S<>0. OUTPUT PARAMETERS A - coefficients, P(x) = sum { A[i]*((X-C)/S)^i, i=0..N-1 } N - number of coefficients (polynomial degree plus 1) NOTES: 1. this function accepts offset and scale, which can be set to improve numerical properties of polynomial.
For example, if P was obtained as result of interpolation on [-1,+1], you can set C=0 and S=1 and represent P as sum of 1, x, x^2, x^3 and so on. In most cases this is exactly what you need. However, if your interpolation model was built on [999,1001], you will see significant growth of numerical errors when using {1, x, x^2, x^3} as basis. Representing P as sum of 1, (x-1000), (x-1000)^2, (x-1000)^3 will be a better option. Such representation can be obtained by using 1000.0 as offset C and 1.0 as scale S. 2. power basis is ill-conditioned and tricks described above can't solve this problem completely. This function will return coefficients in any case, but for N>8 they will become unreliable. However, N's less than 5 are pretty safe. 3. barycentric interpolant passed as P may be either polynomial obtained from polynomial interpolation/fitting or rational function which is NOT polynomial. We can't distinguish between these two cases, and this algorithm just tries to work assuming that P IS a polynomial. If not, algorithm will return results, but they won't have any meaning. -- ALGLIB -- Copyright 30.09.2010 by Bochkanov Sergey *************************************************************************/ void polynomialbar2pow(const barycentricinterpolant &p, const double c, const double s, real_1d_array &a); void polynomialbar2pow(const barycentricinterpolant &p, real_1d_array &a); /************************************************************************* Conversion from power basis to barycentric representation. This function has O(N^2) complexity. INPUT PARAMETERS: A - coefficients, P(x) = sum { A[i]*((X-C)/S)^i, i=0..N-1 } N - number of coefficients (polynomial degree plus 1) * if given, only leading N elements of A are used * if not given, automatically determined from size of A C - offset (see below); 0.0 is used as default value. S - scale (see below); 1.0 is used as default value. S<>0. OUTPUT PARAMETERS P - polynomial in barycentric form NOTES: 1. this function accepts offset and scale, which can be set to improve numerical properties of polynomial. For example, if you interpolate on [-1,+1], you can set C=0 and S=1 and convert from sum of 1, x, x^2, x^3 and so on. In most cases this is exactly what you need. However, if your interpolation model was built on [999,1001], you will see significant growth of numerical errors when using {1, x, x^2, x^3} as input basis. Converting from sum of 1, (x-1000), (x-1000)^2, (x-1000)^3 will be a better option (you have to specify 1000.0 as offset C and 1.0 as scale S). 2. power basis is ill-conditioned and tricks described above can't solve this problem completely. This function will return barycentric model in any case, but for N>8 accuracy will degrade. However, N's less than 5 are pretty safe. -- ALGLIB -- Copyright 30.09.2010 by Bochkanov Sergey *************************************************************************/ void polynomialpow2bar(const real_1d_array &a, const ae_int_t n, const double c, const double s, barycentricinterpolant &p); void polynomialpow2bar(const real_1d_array &a, barycentricinterpolant &p); /************************************************************************* Lagrange interpolant: generation of the model on the general grid. This function has O(N^2) complexity.
INPUT PARAMETERS: X - abscissas, array[0..N-1] Y - function values, array[0..N-1] N - number of points, N>=1 OUTPUT PARAMETERS P - barycentric model which represents Lagrange interpolant (see ratint unit info and BarycentricCalc() description for more information). -- ALGLIB -- Copyright 02.12.2009 by Bochkanov Sergey *************************************************************************/ void polynomialbuild(const real_1d_array &x, const real_1d_array &y, const ae_int_t n, barycentricinterpolant &p); void polynomialbuild(const real_1d_array &x, const real_1d_array &y, barycentricinterpolant &p); /************************************************************************* Lagrange intepolant: generation of the model on equidistant grid. This function has O(N) complexity. INPUT PARAMETERS: A - left boundary of [A,B] B - right boundary of [A,B] Y - function values at the nodes, array[0..N-1] N - number of points, N>=1 for N=1 a constant model is constructed. OUTPUT PARAMETERS P - barycentric model which represents Lagrange interpolant (see ratint unit info and BarycentricCalc() description for more information). -- ALGLIB -- Copyright 03.12.2009 by Bochkanov Sergey *************************************************************************/ void polynomialbuildeqdist(const double a, const double b, const real_1d_array &y, const ae_int_t n, barycentricinterpolant &p); void polynomialbuildeqdist(const double a, const double b, const real_1d_array &y, barycentricinterpolant &p); /************************************************************************* Lagrange intepolant on Chebyshev grid (first kind). This function has O(N) complexity. INPUT PARAMETERS: A - left boundary of [A,B] B - right boundary of [A,B] Y - function values at the nodes, array[0..N-1], Y[I] = Y(0.5*(B+A) + 0.5*(B-A)*Cos(PI*(2*i+1)/(2*n))) N - number of points, N>=1 for N=1 a constant model is constructed. OUTPUT PARAMETERS P - barycentric model which represents Lagrange interpolant (see ratint unit info and BarycentricCalc() description for more information). -- ALGLIB -- Copyright 03.12.2009 by Bochkanov Sergey *************************************************************************/ void polynomialbuildcheb1(const double a, const double b, const real_1d_array &y, const ae_int_t n, barycentricinterpolant &p); void polynomialbuildcheb1(const double a, const double b, const real_1d_array &y, barycentricinterpolant &p); /************************************************************************* Lagrange intepolant on Chebyshev grid (second kind). This function has O(N) complexity. INPUT PARAMETERS: A - left boundary of [A,B] B - right boundary of [A,B] Y - function values at the nodes, array[0..N-1], Y[I] = Y(0.5*(B+A) + 0.5*(B-A)*Cos(PI*i/(n-1))) N - number of points, N>=1 for N=1 a constant model is constructed. OUTPUT PARAMETERS P - barycentric model which represents Lagrange interpolant (see ratint unit info and BarycentricCalc() description for more information). 
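A minimal sketch of the Lagrange interpolation functions documented above, shown for the general-grid builder (hypothetical nodes sampled from f(x)=x^2; interpolation.h header assumed; barycentriccalc() from the ratint subpackage is used to evaluate the returned barycentric model):

    #include <cstdio>
    #include "interpolation.h"
    using namespace alglib;

    int main()
    {
        real_1d_array x = "[-1.0, 0.0, 2.0]";
        real_1d_array y = "[ 1.0, 0.0, 4.0]";   // samples of f(x)=x^2

        barycentricinterpolant p;
        polynomialbuild(x, y, 3, p);

        // the interpolating polynomial of these samples is x^2, so this prints ~1.0
        printf("P(1.0) = %.4f\n", barycentriccalc(p, 1.0));
        return 0;
    }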
-- ALGLIB -- Copyright 03.12.2009 by Bochkanov Sergey *************************************************************************/ void polynomialbuildcheb2(const double a, const double b, const real_1d_array &y, const ae_int_t n, barycentricinterpolant &p); void polynomialbuildcheb2(const double a, const double b, const real_1d_array &y, barycentricinterpolant &p); /************************************************************************* Fast equidistant polynomial interpolation function with O(N) complexity INPUT PARAMETERS: A - left boundary of [A,B] B - right boundary of [A,B] F - function values, array[0..N-1] N - number of points on equidistant grid, N>=1 for N=1 a constant model is constructed. T - position where P(x) is calculated RESULT value of the Lagrange interpolant at T IMPORTANT this function provides fast interface which is not overflow-safe nor it is very precise. the best option is to use PolynomialBuildEqDist()/BarycentricCalc() subroutines unless you are pretty sure that your data will not result in overflow. -- ALGLIB -- Copyright 02.12.2009 by Bochkanov Sergey *************************************************************************/ double polynomialcalceqdist(const double a, const double b, const real_1d_array &f, const ae_int_t n, const double t); double polynomialcalceqdist(const double a, const double b, const real_1d_array &f, const double t); /************************************************************************* Fast polynomial interpolation function on Chebyshev points (first kind) with O(N) complexity. INPUT PARAMETERS: A - left boundary of [A,B] B - right boundary of [A,B] F - function values, array[0..N-1] N - number of points on Chebyshev grid (first kind), X[i] = 0.5*(B+A) + 0.5*(B-A)*Cos(PI*(2*i+1)/(2*n)) for N=1 a constant model is constructed. T - position where P(x) is calculated RESULT value of the Lagrange interpolant at T IMPORTANT this function provides fast interface which is not overflow-safe nor it is very precise. the best option is to use PolIntBuildCheb1()/BarycentricCalc() subroutines unless you are pretty sure that your data will not result in overflow. -- ALGLIB -- Copyright 02.12.2009 by Bochkanov Sergey *************************************************************************/ double polynomialcalccheb1(const double a, const double b, const real_1d_array &f, const ae_int_t n, const double t); double polynomialcalccheb1(const double a, const double b, const real_1d_array &f, const double t); /************************************************************************* Fast polynomial interpolation function on Chebyshev points (second kind) with O(N) complexity. INPUT PARAMETERS: A - left boundary of [A,B] B - right boundary of [A,B] F - function values, array[0..N-1] N - number of points on Chebyshev grid (second kind), X[i] = 0.5*(B+A) + 0.5*(B-A)*Cos(PI*i/(n-1)) for N=1 a constant model is constructed. T - position where P(x) is calculated RESULT value of the Lagrange interpolant at T IMPORTANT this function provides fast interface which is not overflow-safe nor it is very precise. the best option is to use PolIntBuildCheb2()/BarycentricCalc() subroutines unless you are pretty sure that your data will not result in overflow. 
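A minimal sketch contrasting the fast one-shot evaluators documented above with the safer build-then-evaluate route, on the same hypothetical equidistant samples (f(x)=x^2 on [0,3]; interpolation.h header and barycentriccalc() from the ratint subpackage assumed):

    #include <cstdio>
    #include "interpolation.h"
    using namespace alglib;

    int main()
    {
        real_1d_array f = "[0.0, 1.0, 4.0, 9.0]";   // values on equidistant grid over [0,3]
        double t = 1.5;

        // fast interface: O(N), but neither overflow-safe nor maximally precise
        double v1 = polynomialcalceqdist(0.0, 3.0, f, 4, t);

        // safer route: build the barycentric model once, then evaluate it
        barycentricinterpolant p;
        polynomialbuildeqdist(0.0, 3.0, f, 4, p);
        double v2 = barycentriccalc(p, t);

        printf("%.6f %.6f\n", v1, v2);   // both approximately 2.25
        return 0;
    }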
-- ALGLIB -- Copyright 02.12.2009 by Bochkanov Sergey *************************************************************************/ double polynomialcalccheb2(const double a, const double b, const real_1d_array &f, const ae_int_t n, const double t); double polynomialcalccheb2(const double a, const double b, const real_1d_array &f, const double t); /************************************************************************* This subroutine fits piecewise linear curve to points with Ramer-Douglas-Peucker algorithm, which stops after generating specified number of linear sections. IMPORTANT: * it does NOT perform least-squares fitting; it builds curve, but this curve does not minimize some least squares metric. See description of RDP algorithm (say, in Wikipedia) for more details on WHAT is performed. * this function does NOT work with parametric curves (i.e. curves which can be represented as {X(t),Y(t)}). It works with curves which can be represented as Y(X). Thus, it is impossible to model figures like circles with this function. If you want to work with parametric curves, you should use ParametricRDPFixed() function provided by "Parametric" subpackage of "Interpolation" package. INPUT PARAMETERS: X - array of X-coordinates: * at least N elements * can be unordered (points are automatically sorted) * this function may accept non-distinct X (see below for more information on handling of such inputs) Y - array of Y-coordinates: * at least N elements N - number of elements in X/Y M - desired number of sections: * at most M sections are generated by this function * less than M sections can be generated if we have N<M (or some X are non-distinct). OUTPUT PARAMETERS: X2 - X-values of corner points of the piecewise linear approximation Y2 - Y-values of corner points NSections - number of sections found by algorithm; NSections can be zero for degenerate datasets (N<=1 or all X[] are non-distinct) *************************************************************************/ void lstfitpiecewiselinearrdpfixed(const real_1d_array &x, const real_1d_array &y, const ae_int_t n, const ae_int_t m, real_1d_array &x2, real_1d_array &y2, ae_int_t &nsections); /************************************************************************* Fitting by polynomials in barycentric form. This function provides simple interface for unconstrained unweighted fitting. See PolynomialFitWC() if you need constrained fitting. Task is linear, so linear least squares solver is used. Complexity of this computational scheme is O(N*M^2), mostly dominated by least squares solver SEE ALSO: PolynomialFitWC() COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multithreading support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Speed-up provided by multithreading greatly depends on problem size ! - only large problems (number of coefficients is more than 500) can be ! efficiently multithreaded. ! ! Generally, commercial ALGLIB is several times faster than open-source ! generic C edition, and many times faster than open-source C# edition. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS: X - points, array[0..N-1]. Y - function values, array[0..N-1]. N - number of points, N>0 * if given, only leading N elements of X/Y are used * if not given, automatically determined from sizes of X/Y M - number of basis functions (= polynomial_degree + 1), M>=1 OUTPUT PARAMETERS: Info- same format as in LSFitLinearW() subroutine: * Info>0 task is solved * Info<=0 an error occurred: -4 means inconvergence of internal SVD P - interpolant in barycentric form. Rep - report, same format as in LSFitLinearW() subroutine. Following fields are set: * RMSError rms error on the (X,Y). * AvgError average error on the (X,Y). * AvgRelError average relative error on the non-zero Y * MaxError maximum error NON-WEIGHTED ERRORS ARE CALCULATED NOTES: you can convert P from barycentric form to the power or Chebyshev basis with PolynomialBar2Pow() or PolynomialBar2Cheb() functions from POLINT subpackage. -- ALGLIB PROJECT -- Copyright 10.12.2009 by Bochkanov Sergey *************************************************************************/ void polynomialfit(const real_1d_array &x, const real_1d_array &y, const ae_int_t n, const ae_int_t m, ae_int_t &info, barycentricinterpolant &p, polynomialfitreport &rep); void smp_polynomialfit(const real_1d_array &x, const real_1d_array &y, const ae_int_t n, const ae_int_t m, ae_int_t &info, barycentricinterpolant &p, polynomialfitreport &rep); void polynomialfit(const real_1d_array &x, const real_1d_array &y, const ae_int_t m, ae_int_t &info, barycentricinterpolant &p, polynomialfitreport &rep); void smp_polynomialfit(const real_1d_array &x, const real_1d_array &y, const ae_int_t m, ae_int_t &info, barycentricinterpolant &p, polynomialfitreport &rep); /************************************************************************* Weighted fitting by polynomials in barycentric form, with constraints on function values or first derivatives. Small regularizing term is used when solving constrained tasks (to improve stability).
Task is linear, so linear least squares solver is used. Complexity of this computational scheme is O(N*M^2), mostly dominated by least squares solver SEE ALSO: PolynomialFit() COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multithreading support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Speed-up provided by multithreading greatly depends on problem size ! - only large problems (number of coefficients is more than 500) can be ! efficiently multithreaded. ! ! Generally, commercial ALGLIB is several times faster than open-source ! generic C edition, and many times faster than open-source C# edition. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS: X - points, array[0..N-1]. Y - function values, array[0..N-1]. W - weights, array[0..N-1] Each summand in square sum of approximation deviations from given values is multiplied by the square of corresponding weight. Fill it by 1's if you don't want to solve weighted task. N - number of points, N>0. * if given, only leading N elements of X/Y/W are used * if not given, automatically determined from sizes of X/Y/W XC - points where polynomial values/derivatives are constrained, array[0..K-1]. YC - values of constraints, array[0..K-1] DC - array[0..K-1], types of constraints: * DC[i]=0 means that P(XC[i])=YC[i] * DC[i]=1 means that P'(XC[i])=YC[i] SEE BELOW FOR IMPORTANT INFORMATION ON CONSTRAINTS K - number of constraints, 0<=K<M. K=0 means no constraints (XC/YC/DC are not used in such cases) M - number of basis functions (= polynomial_degree + 1), M>=1 OUTPUT PARAMETERS: Info- same format as in LSFitLinearW() subroutine: * Info>0 task is solved * Info<=0 an error occurred: -4 means inconvergence of internal SVD -3 means inconsistent constraints P - interpolant in barycentric form. Rep - report, same format as in LSFitLinearW() subroutine. Following fields are set: * RMSError rms error on the (X,Y). * AvgError average error on the (X,Y). * AvgRelError average relative error on the non-zero Y * MaxError maximum error NON-WEIGHTED ERRORS ARE CALCULATED IMPORTANT: this subroutine doesn't calculate task's condition number for K<>0. NOTES: you can convert P from barycentric form to the power or Chebyshev basis with PolynomialBar2Pow() or PolynomialBar2Cheb() functions from POLINT subpackage. SETTING CONSTRAINTS - DANGERS AND OPPORTUNITIES: Setting constraints can lead to undesired results, like ill-conditioned behavior, or inconsistency being detected. On the other hand, it allows us to improve quality of the fit. Here we summarize our experience with constrained regression splines: * even simple constraints can be inconsistent, see Wikipedia article on this subject: http://en.wikipedia.org/wiki/Birkhoff_interpolation * the greater M is (for a fixed set of constraints), the higher the chances that the constraints are consistent * in the general case, consistency of constraints is NOT GUARANTEED. * in one special case, however, we can guarantee consistency. This case is: M>1 and constraints on the function values (NOT DERIVATIVES) Our final recommendation is to use constraints WHEN AND ONLY WHEN you can't solve your task without them.
Anything beyond special cases given above is not guaranteed and may result in inconsistency. -- ALGLIB PROJECT -- Copyright 10.12.2009 by Bochkanov Sergey *************************************************************************/ void polynomialfitwc(const real_1d_array &x, const real_1d_array &y, const real_1d_array &w, const ae_int_t n, const real_1d_array &xc, const real_1d_array &yc, const integer_1d_array &dc, const ae_int_t k, const ae_int_t m, ae_int_t &info, barycentricinterpolant &p, polynomialfitreport &rep); void smp_polynomialfitwc(const real_1d_array &x, const real_1d_array &y, const real_1d_array &w, const ae_int_t n, const real_1d_array &xc, const real_1d_array &yc, const integer_1d_array &dc, const ae_int_t k, const ae_int_t m, ae_int_t &info, barycentricinterpolant &p, polynomialfitreport &rep); void polynomialfitwc(const real_1d_array &x, const real_1d_array &y, const real_1d_array &w, const real_1d_array &xc, const real_1d_array &yc, const integer_1d_array &dc, const ae_int_t m, ae_int_t &info, barycentricinterpolant &p, polynomialfitreport &rep); void smp_polynomialfitwc(const real_1d_array &x, const real_1d_array &y, const real_1d_array &w, const real_1d_array &xc, const real_1d_array &yc, const integer_1d_array &dc, const ae_int_t m, ae_int_t &info, barycentricinterpolant &p, polynomialfitreport &rep); /************************************************************************* This function calculates value of four-parameter logistic (4PL) model at specified point X. 4PL model has following form: F(x|A,B,C,D) = D+(A-D)/(1+Power(x/C,B)) INPUT PARAMETERS: X - current point, X>=0: * zero X is correctly handled even for B<=0 * negative X results in exception. A, B, C, D- parameters of 4PL model: * A is unconstrained * B is unconstrained; zero or negative values are handled correctly. * C>0, non-positive value results in exception * D is unconstrained RESULT: model value at X NOTE: if B=0, denominator is assumed to be equal to 2.0 even for zero X (strictly speaking, 0^0 is undefined). NOTE: this function also throws exception if all input parameters are correct, but overflow was detected during calculations. NOTE: this function performs a lot of checks; if you need really high performance, consider evaluating model yourself, without checking for degenerate cases. -- ALGLIB PROJECT -- Copyright 14.05.2014 by Bochkanov Sergey *************************************************************************/ double logisticcalc4(const double x, const double a, const double b, const double c, const double d); /************************************************************************* This function calculates value of five-parameter logistic (5PL) model at specified point X. 5PL model has following form: F(x|A,B,C,D,G) = D+(A-D)/Power(1+Power(x/C,B),G) INPUT PARAMETERS: X - current point, X>=0: * zero X is correctly handled even for B<=0 * negative X results in exception. A, B, C, D, G- parameters of 5PL model: * A is unconstrained * B is unconstrained; zero or negative values are handled correctly. * C>0, non-positive value results in exception * D is unconstrained * G>0, non-positive value results in exception RESULT: model value at X NOTE: if B=0, denominator is assumed to be equal to Power(2.0,G) even for zero X (strictly speaking, 0^0 is undefined). NOTE: this function also throws exception if all input parameters are correct, but overflow was detected during calculations. 
NOTE: this function performs a lot of checks; if you need really high performance, consider evaluating model yourself, without checking for degenerate cases. -- ALGLIB PROJECT -- Copyright 14.05.2014 by Bochkanov Sergey *************************************************************************/ double logisticcalc5(const double x, const double a, const double b, const double c, const double d, const double g); /************************************************************************* This function fits four-parameter logistic (4PL) model to data provided by user. 4PL model has following form: F(x|A,B,C,D) = D+(A-D)/(1+Power(x/C,B)) Here: * A, D - unconstrained (see LogisticFit4EC() for constrained 4PL) * B>=0 * C>0 IMPORTANT: output of this function is constrained in such way that B>0. Because 4PL model is symmetric with respect to B, there is no need to explore B<0. Constraining B makes algorithm easier to stabilize and debug. Users who for some reason prefer to work with negative B's should transform output themselves (swap A and D, replace B by -B). 4PL fitting is implemented as follows: * we perform small number of restarts from random locations which helps to solve problem of bad local extrema. Locations are only partially random - we use input data to determine good initial guess, but we include controlled amount of randomness. * we perform Levenberg-Marquardt fitting with very tight constraints on parameters B and C - it allows us to find good initial guess for the second stage without risk of running into "flat spot". * second Levenberg-Marquardt round is performed without excessive constraints. Results from the previous round are used as initial guess. * after fitting is done, we compare results with best values found so far, rewrite "best solution" if needed, and move to next random location. Overall algorithm is very stable and is not prone to bad local extrema. Furthermore, it automatically scales when input data have very large or very small range. INPUT PARAMETERS: X - array[N], stores X-values. MUST include only non-negative numbers (but may include zero values). Can be unsorted. Y - array[N], values to fit. N - number of points. If N is less than length of X/Y, only leading N elements are used. OUTPUT PARAMETERS: A, B, C, D- parameters of 4PL model Rep - fitting report. This structure has many fields, but ONLY ONES LISTED BELOW ARE SET: * Rep.IterationsCount - number of iterations performed * Rep.RMSError - root-mean-square error * Rep.AvgError - average absolute error * Rep.AvgRelError - average relative error (calculated for non-zero Y-values) * Rep.MaxError - maximum absolute error * Rep.R2 - coefficient of determination, R-squared. This coefficient is calculated as R2=1-RSS/TSS (in case of nonlinear regression there are multiple ways to define R2, each of them giving different results). NOTE: after you obtained coefficients, you can evaluate model with LogisticCalc4() function. NOTE: if you need better control over fitting process than provided by this function, you may use LogisticFit45X(). NOTE: step is automatically scaled according to scale of parameters being fitted before we compare its length with EpsX. Thus, this function can be used to fit data with very small or very large values without changing EpsX. 
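A minimal sketch of 4PL fitting with the function described above (hypothetical dose-response data; interpolation.h header assumed; the report fields used are among those listed above):

    #include <cstdio>
    #include "interpolation.h"
    using namespace alglib;

    int main()
    {
        // hypothetical measurements; X must be non-negative, may be unsorted
        real_1d_array x = "[0.0, 1.0, 2.0, 4.0, 8.0, 16.0]";
        real_1d_array y = "[0.1, 0.3, 0.8, 1.6, 2.3, 2.7]";

        double a, b, c, d;
        lsfitreport rep;
        logisticfit4(x, y, 6, a, b, c, d, rep);

        printf("A=%.3f B=%.3f C=%.3f D=%.3f rms=%.3f\n", a, b, c, d, rep.rmserror);
        printf("model(3.0) = %.3f\n", logisticcalc4(3.0, a, b, c, d));
        return 0;
    }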
-- ALGLIB PROJECT -- Copyright 14.02.2014 by Bochkanov Sergey *************************************************************************/ void logisticfit4(const real_1d_array &x, const real_1d_array &y, const ae_int_t n, double &a, double &b, double &c, double &d, lsfitreport &rep); /************************************************************************* This function fits four-parameter logistic (4PL) model to data provided by user, with optional constraints on parameters A and D. 4PL model has following form: F(x|A,B,C,D) = D+(A-D)/(1+Power(x/C,B)) Here: * A, D - with optional equality constraints * B>=0 * C>0 IMPORTANT: output of this function is constrained in such way that B>0. Because 4PL model is symmetric with respect to B, there is no need to explore B<0. Constraining B makes algorithm easier to stabilize and debug. Users who for some reason prefer to work with negative B's should transform output themselves (swap A and D, replace B by -B). 4PL fitting is implemented as follows: * we perform small number of restarts from random locations which helps to solve problem of bad local extrema. Locations are only partially random - we use input data to determine good initial guess, but we include controlled amount of randomness. * we perform Levenberg-Marquardt fitting with very tight constraints on parameters B and C - it allows us to find good initial guess for the second stage without risk of running into "flat spot". * second Levenberg-Marquardt round is performed without excessive constraints. Results from the previous round are used as initial guess. * after fitting is done, we compare results with best values found so far, rewrite "best solution" if needed, and move to next random location. Overall algorithm is very stable and is not prone to bad local extrema. Furthermore, it automatically scales when input data have very large or very small range. INPUT PARAMETERS: X - array[N], stores X-values. MUST include only non-negative numbers (but may include zero values). Can be unsorted. Y - array[N], values to fit. N - number of points. If N is less than length of X/Y, only leading N elements are used. CnstrLeft- optional equality constraint for model value at the left boundary (at X=0). Specify NAN (Not-a-Number) if you do not need constraint on the model value at X=0 (in C++ you can pass alglib::fp_nan as parameter, in C# it will be Double.NaN). See below, section "EQUALITY CONSTRAINTS" for more information about constraints. CnstrRight- optional equality constraint for model value at X=infinity. Specify NAN (Not-a-Number) if you do not need constraint on the model value (in C++ you can pass alglib::fp_nan as parameter, in C# it will be Double.NaN). See below, section "EQUALITY CONSTRAINTS" for more information about constraints. OUTPUT PARAMETERS: A, B, C, D- parameters of 4PL model Rep - fitting report. This structure has many fields, but ONLY ONES LISTED BELOW ARE SET: * Rep.IterationsCount - number of iterations performed * Rep.RMSError - root-mean-square error * Rep.AvgError - average absolute error * Rep.AvgRelError - average relative error (calculated for non-zero Y-values) * Rep.MaxError - maximum absolute error * Rep.R2 - coefficient of determination, R-squared. This coefficient is calculated as R2=1-RSS/TSS (in case of nonlinear regression there are multiple ways to define R2, each of them giving different results). NOTE: after you obtained coefficients, you can evaluate model with LogisticCalc4() function. 
NOTE: if you need better control over fitting process than provided by this function, you may use LogisticFit45X(). NOTE: step is automatically scaled according to scale of parameters being fitted before we compare its length with EpsX. Thus, this function can be used to fit data with very small or very large values without changing EpsX. EQUALITY CONSTRAINTS ON PARAMETERS 4PL/5PL solver supports equality constraints on model values at the left boundary (X=0) and right boundary (X=infinity). These constraints are completely optional and you can specify both of them, only one - or no constraints at all. Parameter CnstrLeft contains left constraint (or NAN for unconstrained fitting), and CnstrRight contains right one. For 4PL, left constraint ALWAYS corresponds to parameter A, and right one is ALWAYS constraint on D. That's because 4PL model is normalized in such way that B>=0. -- ALGLIB PROJECT -- Copyright 14.02.2014 by Bochkanov Sergey *************************************************************************/ void logisticfit4ec(const real_1d_array &x, const real_1d_array &y, const ae_int_t n, const double cnstrleft, const double cnstrright, double &a, double &b, double &c, double &d, lsfitreport &rep); /************************************************************************* This function fits five-parameter logistic (5PL) model to data provided by user. 5PL model has following form: F(x|A,B,C,D,G) = D+(A-D)/Power(1+Power(x/C,B),G) Here: * A, D - unconstrained * B - unconstrained * C>0 * G>0 IMPORTANT: unlike in 4PL fitting, output of this function is NOT constrained in such way that B is guaranteed to be positive. Furthermore, unlike 4PL, 5PL model is NOT symmetric with respect to B, so you can NOT transform model to equivalent one, with B having desired sign (>0 or <0). 5PL fitting is implemented as follows: * we perform small number of restarts from random locations which helps to solve problem of bad local extrema. Locations are only partially random - we use input data to determine good initial guess, but we include controlled amount of randomness. * we perform Levenberg-Marquardt fitting with very tight constraints on parameters B and C - it allows us to find good initial guess for the second stage without risk of running into "flat spot". Parameter G is fixed at G=1. * second Levenberg-Marquardt round is performed without excessive constraints on B and C, but with G still equal to 1. Results from the previous round are used as initial guess. * third Levenberg-Marquardt round relaxes constraints on G and tries two different models - one with B>0 and one with B<0. * after fitting is done, we compare results with best values found so far, rewrite "best solution" if needed, and move to next random location. Overall algorithm is very stable and is not prone to bad local extrema. Furthermore, it automatically scales when input data have very large or very small range. INPUT PARAMETERS: X - array[N], stores X-values. MUST include only non-negative numbers (but may include zero values). Can be unsorted. Y - array[N], values to fit. N - number of points. If N is less than length of X/Y, only leading N elements are used. OUTPUT PARAMETERS: A,B,C,D,G- parameters of 5PL model Rep - fitting report. 
This structure has many fields, but ONLY ONES LISTED BELOW ARE SET: * Rep.IterationsCount - number of iterations performed * Rep.RMSError - root-mean-square error * Rep.AvgError - average absolute error * Rep.AvgRelError - average relative error (calculated for non-zero Y-values) * Rep.MaxError - maximum absolute error * Rep.R2 - coefficient of determination, R-squared. This coefficient is calculated as R2=1-RSS/TSS (in case of nonlinear regression there are multiple ways to define R2, each of them giving different results). NOTE: after you obtained coefficients, you can evaluate model with LogisticCalc5() function. NOTE: if you need better control over fitting process than provided by this function, you may use LogisticFit45X(). NOTE: step is automatically scaled according to scale of parameters being fitted before we compare its length with EpsX. Thus, this function can be used to fit data with very small or very large values without changing EpsX. -- ALGLIB PROJECT -- Copyright 14.02.2014 by Bochkanov Sergey *************************************************************************/ void logisticfit5(const real_1d_array &x, const real_1d_array &y, const ae_int_t n, double &a, double &b, double &c, double &d, double &g, lsfitreport &rep); /************************************************************************* This function fits five-parameter logistic (5PL) model to data provided by user, subject to optional equality constraints on parameters A and D. 5PL model has following form: F(x|A,B,C,D,G) = D+(A-D)/Power(1+Power(x/C,B),G) Here: * A, D - with optional equality constraints * B - unconstrained * C>0 * G>0 IMPORTANT: unlike in 4PL fitting, output of this function is NOT constrained in such way that B is guaranteed to be positive. Furthermore, unlike 4PL, 5PL model is NOT symmetric with respect to B, so you can NOT transform model to equivalent one, with B having desired sign (>0 or <0). 5PL fitting is implemented as follows: * we perform small number of restarts from random locations which helps to solve problem of bad local extrema. Locations are only partially random - we use input data to determine good initial guess, but we include controlled amount of randomness. * we perform Levenberg-Marquardt fitting with very tight constraints on parameters B and C - it allows us to find good initial guess for the second stage without risk of running into "flat spot". Parameter G is fixed at G=1. * second Levenberg-Marquardt round is performed without excessive constraints on B and C, but with G still equal to 1. Results from the previous round are used as initial guess. * third Levenberg-Marquardt round relaxes constraints on G and tries two different models - one with B>0 and one with B<0. * after fitting is done, we compare results with best values found so far, rewrite "best solution" if needed, and move to next random location. Overall algorithm is very stable and is not prone to bad local extrema. Furthermore, it automatically scales when input data have very large or very small range. INPUT PARAMETERS: X - array[N], stores X-values. MUST include only non-negative numbers (but may include zero values). Can be unsorted. Y - array[N], values to fit. N - number of points. If N is less than length of X/Y, only leading N elements are used. CnstrLeft- optional equality constraint for model value at the left boundary (at X=0). Specify NAN (Not-a-Number) if you do not need constraint on the model value at X=0 (in C++ you can pass alglib::fp_nan as parameter, in C# it will be Double.NaN). 
See below, section "EQUALITY CONSTRAINTS" for more information about constraints. CnstrRight- optional equality constraint for model value at X=infinity. Specify NAN (Not-a-Number) if you do not need constraint on the model value (in C++ you can pass alglib::fp_nan as parameter, in C# it will be Double.NaN). See below, section "EQUALITY CONSTRAINTS" for more information about constraints. OUTPUT PARAMETERS: A,B,C,D,G- parameters of 5PL model Rep - fitting report. This structure has many fields, but ONLY ONES LISTED BELOW ARE SET: * Rep.IterationsCount - number of iterations performed * Rep.RMSError - root-mean-square error * Rep.AvgError - average absolute error * Rep.AvgRelError - average relative error (calculated for non-zero Y-values) * Rep.MaxError - maximum absolute error * Rep.R2 - coefficient of determination, R-squared. This coefficient is calculated as R2=1-RSS/TSS (in case of nonlinear regression there are multiple ways to define R2, each of them giving different results). NOTE: after you obtained coefficients, you can evaluate model with LogisticCalc5() function. NOTE: if you need better control over fitting process than provided by this function, you may use LogisticFit45X(). NOTE: step is automatically scaled according to scale of parameters being fitted before we compare its length with EpsX. Thus, this function can be used to fit data with very small or very large values without changing EpsX. EQUALITY CONSTRAINTS ON PARAMETERS 5PL solver supports equality constraints on model values at the left boundary (X=0) and right boundary (X=infinity). These constraints are completely optional and you can specify both of them, only one - or no constraints at all. Parameter CnstrLeft contains left constraint (or NAN for unconstrained fitting), and CnstrRight contains right one. Unlike 4PL one, 5PL model is NOT symmetric with respect to change in sign of B. Thus, negative B's are possible, and left constraint may constrain parameter A (for positive B's) - or parameter D (for negative B's). Similarly changes meaning of right constraint. You do not have to decide what parameter to constrain - algorithm will automatically determine correct parameters as fitting progresses. However, question highlighted above is important when you interpret fitting results. -- ALGLIB PROJECT -- Copyright 14.02.2014 by Bochkanov Sergey *************************************************************************/ void logisticfit5ec(const real_1d_array &x, const real_1d_array &y, const ae_int_t n, const double cnstrleft, const double cnstrright, double &a, double &b, double &c, double &d, double &g, lsfitreport &rep); /************************************************************************* This is "expert" 4PL/5PL fitting function, which can be used if you need better control over fitting process than provided by LogisticFit4() or LogisticFit5(). This function fits model of the form F(x|A,B,C,D) = D+(A-D)/(1+Power(x/C,B)) (4PL model) or F(x|A,B,C,D,G) = D+(A-D)/Power(1+Power(x/C,B),G) (5PL model) Here: * A, D - unconstrained * B>=0 for 4PL, unconstrained for 5PL * C>0 * G>0 (if present) INPUT PARAMETERS: X - array[N], stores X-values. MUST include only non-negative numbers (but may include zero values). Can be unsorted. Y - array[N], values to fit. N - number of points. If N is less than length of X/Y, only leading N elements are used. CnstrLeft- optional equality constraint for model value at the left boundary (at X=0). 
Specify NAN (Not-a-Number) if you do not need constraint on the model value at X=0 (in C++ you can pass alglib::fp_nan as parameter, in C# it will be Double.NaN). See below, section "EQUALITY CONSTRAINTS" for more information about constraints. CnstrRight- optional equality constraint for model value at X=infinity. Specify NAN (Not-a-Number) if you do not need constraint on the model value (in C++ you can pass alglib::fp_nan as parameter, in C# it will be Double.NaN). See below, section "EQUALITY CONSTRAINTS" for more information about constraints. Is4PL - whether 4PL or 5PL models are fitted LambdaV - regularization coefficient, LambdaV>=0. Set it to zero unless you know what you are doing. EpsX - stopping condition (step size), EpsX>=0. Zero value means that small step is automatically chosen. See notes below for more information. RsCnt - number of repeated restarts from random points. 4PL/5PL models are prone to problem of bad local extrema. Utilizing multiple random restarts allows us to improve algorithm convergence. RsCnt>=0. Zero value means that function automatically choose small amount of restarts (recommended). OUTPUT PARAMETERS: A, B, C, D- parameters of 4PL model G - parameter of 5PL model; for Is4PL=True, G=1 is returned. Rep - fitting report. This structure has many fields, but ONLY ONES LISTED BELOW ARE SET: * Rep.IterationsCount - number of iterations performed * Rep.RMSError - root-mean-square error * Rep.AvgError - average absolute error * Rep.AvgRelError - average relative error (calculated for non-zero Y-values) * Rep.MaxError - maximum absolute error * Rep.R2 - coefficient of determination, R-squared. This coefficient is calculated as R2=1-RSS/TSS (in case of nonlinear regression there are multiple ways to define R2, each of them giving different results). NOTE: after you obtained coefficients, you can evaluate model with LogisticCalc5() function. NOTE: step is automatically scaled according to scale of parameters being fitted before we compare its length with EpsX. Thus, this function can be used to fit data with very small or very large values without changing EpsX. EQUALITY CONSTRAINTS ON PARAMETERS 4PL/5PL solver supports equality constraints on model values at the left boundary (X=0) and right boundary (X=infinity). These constraints are completely optional and you can specify both of them, only one - or no constraints at all. Parameter CnstrLeft contains left constraint (or NAN for unconstrained fitting), and CnstrRight contains right one. For 4PL, left constraint ALWAYS corresponds to parameter A, and right one is ALWAYS constraint on D. That's because 4PL model is normalized in such way that B>=0. For 5PL model things are different. Unlike 4PL one, 5PL model is NOT symmetric with respect to change in sign of B. Thus, negative B's are possible, and left constraint may constrain parameter A (for positive B's) - or parameter D (for negative B's). Similarly changes meaning of right constraint. You do not have to decide what parameter to constrain - algorithm will automatically determine correct parameters as fitting progresses. However, question highlighted above is important when you interpret fitting results. 
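To make the constraint conventions concrete, here is a hedged sketch of an expert-mode call: a 4PL fit with the left (X=0) value pinned to 1.0 and the right (X=infinity) value left unconstrained via alglib::fp_nan. The data are invented; default solver settings are requested by passing zeros.

    #include "interpolation.h"
    using namespace alglib;

    int main()
    {
        real_1d_array x = "[0.0, 1.0, 2.0, 4.0, 8.0, 16.0]";
        real_1d_array y = "[1.00, 0.86, 0.62, 0.33, 0.14, 0.05]";
        double a, b, c, d, g;
        lsfitreport rep;
        // Is4PL=true; CnstrLeft=1.0 pins the model value at X=0, CnstrRight=fp_nan leaves X=infinity free.
        // LambdaV=0, EpsX=0, RsCnt=0 let the solver choose default regularization, step tolerance and restarts.
        logisticfit45x(x, y, x.length(), 1.0, fp_nan, true, 0.0, 0.0, 0, a, b, c, d, g, rep);
        return 0;
    }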
-- ALGLIB PROJECT -- Copyright 14.02.2014 by Bochkanov Sergey *************************************************************************/ void logisticfit45x(const real_1d_array &x, const real_1d_array &y, const ae_int_t n, const double cnstrleft, const double cnstrright, const bool is4pl, const double lambdav, const double epsx, const ae_int_t rscnt, double &a, double &b, double &c, double &d, double &g, lsfitreport &rep); /************************************************************************* Weighted rational least squares fitting using Floater-Hormann rational functions with optimal D chosen from [0,9], with constraints and individual weights. Equidistant grid with M nodes on [min(x),max(x)] is used to build basis functions. Different values of D are tried, optimal D (least WEIGHTED root mean square error) is chosen. Task is linear, so linear least squares solver is used. Complexity of this computational scheme is O(N*M^2) (mostly dominated by the least squares solver). SEE ALSO * BarycentricFitFloaterHormann(), "lightweight" fitting without individual weights and constraints. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multithreading support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Speed-up provided by multithreading greatly depends on problem size ! - only large problems (number of coefficients is more than 500) can be ! efficiently multithreaded. ! ! Generally, commercial ALGLIB is several times faster than open-source ! generic C edition, and many times faster than open-source C# edition. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS: X - points, array[0..N-1]. Y - function values, array[0..N-1]. W - weights, array[0..N-1] Each summand in square sum of approximation deviations from given values is multiplied by the square of corresponding weight. Fill it by 1's if you don't want to solve weighted task. N - number of points, N>0. XC - points where function values/derivatives are constrained, array[0..K-1]. YC - values of constraints, array[0..K-1] DC - array[0..K-1], types of constraints: * DC[i]=0 means that S(XC[i])=YC[i] * DC[i]=1 means that S'(XC[i])=YC[i] SEE BELOW FOR IMPORTANT INFORMATION ON CONSTRAINTS K - number of constraints, 0<=K<M. K=0 means no constraints (XC/YC/DC are not used in such cases). M - number of basis functions ( = number_of_nodes), M>=2. OUTPUT PARAMETERS: Info- same format as in LSFitLinearWC() subroutine. * Info>0 task is solved * Info<=0 an error occurred: -4 means inconvergence of internal SVD -3 means inconsistent constraints -1 means other errors in parameters passed (N<=0, for example) B - barycentric interpolant. Rep - report, same format as in LSFitLinearWC() subroutine. Following fields are set: * DBest best value of the D parameter * RMSError rms error on the (X,Y). * AvgError average error on the (X,Y). * AvgRelError average relative error on the non-zero Y * MaxError maximum error NON-WEIGHTED ERRORS ARE CALCULATED IMPORTANT: this subroutine doesn't calculate task's condition number for K<>0.
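Before the discussion of constraint pitfalls below, a small sketch of a weighted Floater-Hormann fit with one value constraint at the left boundary (one of the guaranteed-consistent cases discussed next). The data, weights and M are arbitrary illustration values.

    #include <cstdio>
    #include "interpolation.h"
    using namespace alglib;

    int main()
    {
        real_1d_array x = "[0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0]";
        real_1d_array y = "[0.02, 0.48, 0.84, 1.00, 0.91, 0.60, 0.14]";
        real_1d_array w = "[1, 1, 1, 1, 1, 1, 1]";
        real_1d_array xc = "[0.0]";           // constrain the function VALUE at the left boundary
        real_1d_array yc = "[0.0]";
        integer_1d_array dc = "[0]";          // DC[0]=0 => S(XC[0])=YC[0]
        ae_int_t info;
        barycentricinterpolant b;
        barycentricfitreport rep;
        barycentricfitfloaterhormannwc(x, y, w, 7, xc, yc, dc, 1, 5, info, b, rep);  // N=7, K=1, M=5
        if( info>0 )
            printf("f(1.2)=%.4f\n", barycentriccalc(b, 1.2));   // evaluate the fitted interpolant
        return 0;
    }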
SETTING CONSTRAINTS - DANGERS AND OPPORTUNITIES: Setting constraints can lead to undesired results, like ill-conditioned behavior, or inconsistency being detected. From the other side, it allows us to improve quality of the fit. Here we summarize our experience with constrained barycentric interpolants: * excessive constraints can be inconsistent. Floater-Hormann basis functions aren't as flexible as splines (although they are very smooth). * the more evenly constraints are spread across [min(x),max(x)], the more chances that they will be consistent * the greater is M (given fixed constraints), the more chances that constraints will be consistent * in the general case, consistency of constraints IS NOT GUARANTEED. * in the several special cases, however, we CAN guarantee consistency. * one of this cases is constraints on the function VALUES at the interval boundaries. Note that consustency of the constraints on the function DERIVATIVES is NOT guaranteed (you can use in such cases cubic splines which are more flexible). * another special case is ONE constraint on the function value (OR, but not AND, derivative) anywhere in the interval Our final recommendation is to use constraints WHEN AND ONLY WHEN you can't solve your task without them. Anything beyond special cases given above is not guaranteed and may result in inconsistency. -- ALGLIB PROJECT -- Copyright 18.08.2009 by Bochkanov Sergey *************************************************************************/ void barycentricfitfloaterhormannwc(const real_1d_array &x, const real_1d_array &y, const real_1d_array &w, const ae_int_t n, const real_1d_array &xc, const real_1d_array &yc, const integer_1d_array &dc, const ae_int_t k, const ae_int_t m, ae_int_t &info, barycentricinterpolant &b, barycentricfitreport &rep); void smp_barycentricfitfloaterhormannwc(const real_1d_array &x, const real_1d_array &y, const real_1d_array &w, const ae_int_t n, const real_1d_array &xc, const real_1d_array &yc, const integer_1d_array &dc, const ae_int_t k, const ae_int_t m, ae_int_t &info, barycentricinterpolant &b, barycentricfitreport &rep); /************************************************************************* Rational least squares fitting using Floater-Hormann rational functions with optimal D chosen from [0,9]. Equidistant grid with M node on [min(x),max(x)] is used to build basis functions. Different values of D are tried, optimal D (least root mean square error) is chosen. Task is linear, so linear least squares solver is used. Complexity of this computational scheme is O(N*M^2) (mostly dominated by the least squares solver). COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multithreading support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Speed-up provided by multithreading greatly depends on problem size ! - only large problems (number of coefficients is more than 500) can be ! efficiently multithreaded. ! ! Generally, commercial ALGLIB is several times faster than open-source ! generic C edition, and many times faster than open-source C# edition. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! 
related features provided by commercial edition of ALGLIB. INPUT PARAMETERS: X - points, array[0..N-1]. Y - function values, array[0..N-1]. N - number of points, N>0. M - number of basis functions ( = number_of_nodes), M>=2. OUTPUT PARAMETERS: Info- same format as in LSFitLinearWC() subroutine. * Info>0 task is solved * Info<=0 an error occured: -4 means inconvergence of internal SVD -3 means inconsistent constraints B - barycentric interpolant. Rep - report, same format as in LSFitLinearWC() subroutine. Following fields are set: * DBest best value of the D parameter * RMSError rms error on the (X,Y). * AvgError average error on the (X,Y). * AvgRelError average relative error on the non-zero Y * MaxError maximum error NON-WEIGHTED ERRORS ARE CALCULATED -- ALGLIB PROJECT -- Copyright 18.08.2009 by Bochkanov Sergey *************************************************************************/ void barycentricfitfloaterhormann(const real_1d_array &x, const real_1d_array &y, const ae_int_t n, const ae_int_t m, ae_int_t &info, barycentricinterpolant &b, barycentricfitreport &rep); void smp_barycentricfitfloaterhormann(const real_1d_array &x, const real_1d_array &y, const ae_int_t n, const ae_int_t m, ae_int_t &info, barycentricinterpolant &b, barycentricfitreport &rep); /************************************************************************* Fitting by penalized cubic spline. Equidistant grid with M nodes on [min(x,xc),max(x,xc)] is used to build basis functions. Basis functions are cubic splines with natural boundary conditions. Problem is regularized by adding non-linearity penalty to the usual least squares penalty function: S(x) = arg min { LS + P }, where LS = SUM { w[i]^2*(y[i] - S(x[i]))^2 } - least squares penalty P = C*10^rho*integral{ S''(x)^2*dx } - non-linearity penalty rho - tunable constant given by user C - automatically determined scale parameter, makes penalty invariant with respect to scaling of X, Y, W. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multithreading support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Speed-up provided by multithreading greatly depends on problem size ! - only large problems (number of coefficients is more than 500) can be ! efficiently multithreaded. ! ! Generally, commercial ALGLIB is several times faster than open-source ! generic C edition, and many times faster than open-source C# edition. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS: X - points, array[0..N-1]. Y - function values, array[0..N-1]. N - number of points (optional): * N>0 * if given, only first N elements of X/Y are processed * if not given, automatically determined from X/Y sizes M - number of basis functions ( = number_of_nodes), M>=4. Rho - regularization constant passed by user. It penalizes nonlinearity in the regression spline. It is logarithmically scaled, i.e. actual value of regularization constant is calculated as 10^Rho. 
It is automatically scaled so that: * Rho=2.0 corresponds to moderate amount of nonlinearity * generally, it should be somewhere in the [-8.0,+8.0] If you do not want to penalize nonlinearity, pass small Rho. Values as low as -15 should work. OUTPUT PARAMETERS: Info- same format as in LSFitLinearWC() subroutine. * Info>0 task is solved * Info<=0 an error occurred: -4 means inconvergence of internal SVD or Cholesky decomposition; problem may be too ill-conditioned (very rare) S - spline interpolant. Rep - Following fields are set: * RMSError rms error on the (X,Y). * AvgError average error on the (X,Y). * AvgRelError average relative error on the non-zero Y * MaxError maximum error NON-WEIGHTED ERRORS ARE CALCULATED IMPORTANT: this subroutine doesn't calculate task's condition number for K<>0. NOTE 1: additional nodes are added to the spline outside of the fitting interval to force linearity when x<min(x,xc) or x>max(x,xc). It is done for consistency - we penalize non-linearity at [min(x,xc),max(x,xc)], so it is natural to force linearity outside of this interval. NOTE 2: function automatically sorts points, so caller may pass unsorted array. -- ALGLIB PROJECT -- Copyright 18.08.2009 by Bochkanov Sergey *************************************************************************/ void spline1dfitpenalized(const real_1d_array &x, const real_1d_array &y, const ae_int_t n, const ae_int_t m, const double rho, ae_int_t &info, spline1dinterpolant &s, spline1dfitreport &rep); void smp_spline1dfitpenalized(const real_1d_array &x, const real_1d_array &y, const ae_int_t n, const ae_int_t m, const double rho, ae_int_t &info, spline1dinterpolant &s, spline1dfitreport &rep); void spline1dfitpenalized(const real_1d_array &x, const real_1d_array &y, const ae_int_t m, const double rho, ae_int_t &info, spline1dinterpolant &s, spline1dfitreport &rep); void smp_spline1dfitpenalized(const real_1d_array &x, const real_1d_array &y, const ae_int_t m, const double rho, ae_int_t &info, spline1dinterpolant &s, spline1dfitreport &rep); /************************************************************************* Weighted fitting by penalized cubic spline. Equidistant grid with M nodes on [min(x,xc),max(x,xc)] is used to build basis functions. Basis functions are cubic splines with natural boundary conditions. Problem is regularized by adding non-linearity penalty to the usual least squares penalty function: S(x) = arg min { LS + P }, where LS = SUM { w[i]^2*(y[i] - S(x[i]))^2 } - least squares penalty P = C*10^rho*integral{ S''(x)^2*dx } - non-linearity penalty rho - tunable constant given by user C - automatically determined scale parameter, makes penalty invariant with respect to scaling of X, Y, W. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multithreading support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Speed-up provided by multithreading greatly depends on problem size ! - only large problems (number of coefficients is more than 500) can be ! efficiently multithreaded. ! ! Generally, commercial ALGLIB is several times faster than open-source ! generic C edition, and many times faster than open-source C# edition. ! ! 
We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS: X - points, array[0..N-1]. Y - function values, array[0..N-1]. W - weights, array[0..N-1] Each summand in square sum of approximation deviations from given values is multiplied by the square of corresponding weight. Fill it by 1's if you don't want to solve weighted problem. N - number of points (optional): * N>0 * if given, only first N elements of X/Y/W are processed * if not given, automatically determined from X/Y/W sizes M - number of basis functions ( = number_of_nodes), M>=4. Rho - regularization constant passed by user. It penalizes nonlinearity in the regression spline. It is logarithmically scaled, i.e. actual value of regularization constant is calculated as 10^Rho. It is automatically scaled so that: * Rho=2.0 corresponds to moderate amount of nonlinearity * generally, it should be somewhere in the [-8.0,+8.0] If you do not want to penalize nonlinearity, pass small Rho. Values as low as -15 should work. OUTPUT PARAMETERS: Info- same format as in LSFitLinearWC() subroutine. * Info>0 task is solved * Info<=0 an error occurred: -4 means inconvergence of internal SVD or Cholesky decomposition; problem may be too ill-conditioned (very rare) S - spline interpolant. Rep - Following fields are set: * RMSError rms error on the (X,Y). * AvgError average error on the (X,Y). * AvgRelError average relative error on the non-zero Y * MaxError maximum error NON-WEIGHTED ERRORS ARE CALCULATED IMPORTANT: this subroutine doesn't calculate task's condition number for K<>0. NOTE 1: additional nodes are added to the spline outside of the fitting interval to force linearity when x<min(x,xc) or x>max(x,xc). It is done for consistency - we penalize non-linearity at [min(x,xc),max(x,xc)], so it is natural to force linearity outside of this interval. NOTE 2: function automatically sorts points, so caller may pass unsorted array. -- ALGLIB PROJECT -- Copyright 19.10.2010 by Bochkanov Sergey *************************************************************************/ void spline1dfitpenalizedw(const real_1d_array &x, const real_1d_array &y, const real_1d_array &w, const ae_int_t n, const ae_int_t m, const double rho, ae_int_t &info, spline1dinterpolant &s, spline1dfitreport &rep); void smp_spline1dfitpenalizedw(const real_1d_array &x, const real_1d_array &y, const real_1d_array &w, const ae_int_t n, const ae_int_t m, const double rho, ae_int_t &info, spline1dinterpolant &s, spline1dfitreport &rep); void spline1dfitpenalizedw(const real_1d_array &x, const real_1d_array &y, const real_1d_array &w, const ae_int_t m, const double rho, ae_int_t &info, spline1dinterpolant &s, spline1dfitreport &rep); void smp_spline1dfitpenalizedw(const real_1d_array &x, const real_1d_array &y, const real_1d_array &w, const ae_int_t m, const double rho, ae_int_t &info, spline1dinterpolant &s, spline1dfitreport &rep); /************************************************************************* Weighted fitting by cubic spline, with constraints on function values or derivatives. Equidistant grid with M-2 nodes on [min(x,xc),max(x,xc)] is used to build basis functions. Basis functions are cubic splines with continuous second derivatives and non-fixed first derivatives at interval ends. Small regularizing term is used when solving constrained tasks (to improve stability). Task is linear, so linear least squares solver is used.
Complexity of this computational scheme is O(N*M^2), mostly dominated by least squares solver SEE ALSO Spline1DFitHermiteWC() - fitting by Hermite splines (more flexible, less smooth) Spline1DFitCubic() - "lightweight" fitting by cubic splines, without individual weights and constraints COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multithreading support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Speed-up provided by multithreading greatly depends on problem size ! - only large problems (number of coefficients is more than 500) can be ! efficiently multithreaded. ! ! Generally, commercial ALGLIB is several times faster than open-source ! generic C edition, and many times faster than open-source C# edition. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS: X - points, array[0..N-1]. Y - function values, array[0..N-1]. W - weights, array[0..N-1] Each summand in square sum of approximation deviations from given values is multiplied by the square of corresponding weight. Fill it by 1's if you don't want to solve weighted task. N - number of points (optional): * N>0 * if given, only first N elements of X/Y/W are processed * if not given, automatically determined from X/Y/W sizes XC - points where spline values/derivatives are constrained, array[0..K-1]. YC - values of constraints, array[0..K-1] DC - array[0..K-1], types of constraints: * DC[i]=0 means that S(XC[i])=YC[i] * DC[i]=1 means that S'(XC[i])=YC[i] SEE BELOW FOR IMPORTANT INFORMATION ON CONSTRAINTS K - number of constraints (optional): * 0<=K<M * K=0 means no constraints (XC/YC/DC are not used in such cases) * if given, only first K elements of XC/YC/DC are used * if not given, automatically determined from XC/YC/DC M - number of basis functions ( = number_of_nodes), M>=4. OUTPUT PARAMETERS: Info- same format as in LSFitLinearWC() subroutine. * Info>0 task is solved * Info<=0 an error occurred: -4 means inconvergence of internal SVD -3 means inconsistent constraints S - spline interpolant. Rep - report, same format as in LSFitLinearWC() subroutine. Following fields are set: * RMSError rms error on the (X,Y). * AvgError average error on the (X,Y). * AvgRelError average relative error on the non-zero Y * MaxError maximum error NON-WEIGHTED ERRORS ARE CALCULATED IMPORTANT: this subroutine doesn't calculate task's condition number for K<>0. ORDER OF POINTS Subroutine automatically sorts points, so caller may pass unsorted array. SETTING CONSTRAINTS - DANGERS AND OPPORTUNITIES: Setting constraints can lead to undesired results, like ill-conditioned behavior, or inconsistency being detected. From the other side, it allows us to improve quality of the fit. Here we summarize our experience with constrained regression splines: * excessive constraints can be inconsistent. Splines are piecewise cubic functions, and it is easy to create an example, where large number of constraints concentrated in small area will result in inconsistency. Just because spline is not flexible enough to satisfy all of them. And same constraints spread across the [min(x),max(x)] will be perfectly consistent.
* the more evenly constraints are spread across [min(x),max(x)], the more chances that they will be consistent * the greater is M (given fixed constraints), the more chances that constraints will be consistent * in the general case, consistency of constraints IS NOT GUARANTEED. * in the several special cases, however, we CAN guarantee consistency. * one of this cases is constraints on the function values AND/OR its derivatives at the interval boundaries. * another special case is ONE constraint on the function value (OR, but not AND, derivative) anywhere in the interval Our final recommendation is to use constraints WHEN AND ONLY WHEN you can't solve your task without them. Anything beyond special cases given above is not guaranteed and may result in inconsistency. -- ALGLIB PROJECT -- Copyright 18.08.2009 by Bochkanov Sergey *************************************************************************/ void spline1dfitcubicwc(const real_1d_array &x, const real_1d_array &y, const real_1d_array &w, const ae_int_t n, const real_1d_array &xc, const real_1d_array &yc, const integer_1d_array &dc, const ae_int_t k, const ae_int_t m, ae_int_t &info, spline1dinterpolant &s, spline1dfitreport &rep); void smp_spline1dfitcubicwc(const real_1d_array &x, const real_1d_array &y, const real_1d_array &w, const ae_int_t n, const real_1d_array &xc, const real_1d_array &yc, const integer_1d_array &dc, const ae_int_t k, const ae_int_t m, ae_int_t &info, spline1dinterpolant &s, spline1dfitreport &rep); void spline1dfitcubicwc(const real_1d_array &x, const real_1d_array &y, const real_1d_array &w, const real_1d_array &xc, const real_1d_array &yc, const integer_1d_array &dc, const ae_int_t m, ae_int_t &info, spline1dinterpolant &s, spline1dfitreport &rep); void smp_spline1dfitcubicwc(const real_1d_array &x, const real_1d_array &y, const real_1d_array &w, const real_1d_array &xc, const real_1d_array &yc, const integer_1d_array &dc, const ae_int_t m, ae_int_t &info, spline1dinterpolant &s, spline1dfitreport &rep); /************************************************************************* Weighted fitting by Hermite spline, with constraints on function values or first derivatives. Equidistant grid with M nodes on [min(x,xc),max(x,xc)] is used to build basis functions. Basis functions are Hermite splines. Small regularizing term is used when solving constrained tasks (to improve stability). Task is linear, so linear least squares solver is used. Complexity of this computational scheme is O(N*M^2), mostly dominated by least squares solver SEE ALSO Spline1DFitCubicWC() - fitting by Cubic splines (less flexible, more smooth) Spline1DFitHermite() - "lightweight" Hermite fitting, without invididual weights and constraints COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multithreading support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Speed-up provided by multithreading greatly depends on problem size ! - only large problems (number of coefficients is more than 500) can be ! efficiently multithreaded. ! ! Generally, commercial ALGLIB is several times faster than open-source ! generic C edition, and many times faster than open-source C# edition. ! ! 
We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS: X - points, array[0..N-1]. Y - function values, array[0..N-1]. W - weights, array[0..N-1] Each summand in square sum of approximation deviations from given values is multiplied by the square of corresponding weight. Fill it by 1's if you don't want to solve weighted task. N - number of points (optional): * N>0 * if given, only first N elements of X/Y/W are processed * if not given, automatically determined from X/Y/W sizes XC - points where spline values/derivatives are constrained, array[0..K-1]. YC - values of constraints, array[0..K-1] DC - array[0..K-1], types of constraints: * DC[i]=0 means that S(XC[i])=YC[i] * DC[i]=1 means that S'(XC[i])=YC[i] SEE BELOW FOR IMPORTANT INFORMATION ON CONSTRAINTS K - number of constraints (optional): * 0<=K<M * K=0 means no constraints (XC/YC/DC are not used in such cases) * if given, only first K elements of XC/YC/DC are used * if not given, automatically determined from XC/YC/DC M - number of basis functions ( = number_of_nodes), M>=4, M IS EVEN! OUTPUT PARAMETERS: Info- same format as in LSFitLinearW() subroutine: * Info>0 task is solved * Info<=0 an error occurred: -4 means inconvergence of internal SVD -3 means inconsistent constraints -2 means odd M was passed (which is not supported) -1 means other errors in parameters passed (N<=0, for example) S - spline interpolant. Rep - report, same format as in LSFitLinearW() subroutine. Following fields are set: * RMSError rms error on the (X,Y). * AvgError average error on the (X,Y). * AvgRelError average relative error on the non-zero Y * MaxError maximum error NON-WEIGHTED ERRORS ARE CALCULATED IMPORTANT: this subroutine doesn't calculate task's condition number for K<>0. IMPORTANT: this subroutine supports only even M's ORDER OF POINTS Subroutine automatically sorts points, so caller may pass unsorted array. SETTING CONSTRAINTS - DANGERS AND OPPORTUNITIES: Setting constraints can lead to undesired results, like ill-conditioned behavior, or inconsistency being detected. From the other side, it allows us to improve quality of the fit. Here we summarize our experience with constrained regression splines: * excessive constraints can be inconsistent. Splines are piecewise cubic functions, and it is easy to create an example, where large number of constraints concentrated in small area will result in inconsistency. Just because spline is not flexible enough to satisfy all of them. And same constraints spread across the [min(x),max(x)] will be perfectly consistent. * the more evenly constraints are spread across [min(x),max(x)], the more chances that they will be consistent * the greater is M (given fixed constraints), the more chances that constraints will be consistent * in the general case, consistency of constraints is NOT GUARANTEED. * in the several special cases, however, we can guarantee consistency. * one of these cases is M>=4 and constraints on the function value (AND/OR its derivative) at the interval boundaries. * another special case is M>=4 and ONE constraint on the function value (OR, BUT NOT AND, derivative) anywhere in [min(x),max(x)] Our final recommendation is to use constraints WHEN AND ONLY when you can't solve your task without them. Anything beyond special cases given above is not guaranteed and may result in inconsistency.
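For orientation, a minimal sketch of a weighted Hermite fit with a single value constraint at the left end, using the "friendly" overload that infers N and K from array sizes. The data and M=4 (even, >=4) are illustrative only.

    #include <cstdio>
    #include "interpolation.h"
    using namespace alglib;

    int main()
    {
        real_1d_array x  = "[0.0, 0.2, 0.4, 0.6, 0.8, 1.0]";
        real_1d_array y  = "[0.02, 0.18, 0.42, 0.58, 0.83, 0.97]";
        real_1d_array w  = "[1, 1, 1, 1, 1, 1]";
        real_1d_array xc = "[0.0]";
        real_1d_array yc = "[0.0]";
        integer_1d_array dc = "[0]";          // DC[0]=0 => constrain the spline VALUE: S(0)=0
        ae_int_t info;
        spline1dinterpolant s;
        spline1dfitreport rep;
        spline1dfithermitewc(x, y, w, xc, yc, dc, 4, info, s, rep);  // M=4 basis functions (even)
        if( info>0 )
            printf("S(0.5)=%.4f rms=%.4f\n", spline1dcalc(s, 0.5), rep.rmserror);
        return 0;
    }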
-- ALGLIB PROJECT -- Copyright 18.08.2009 by Bochkanov Sergey *************************************************************************/ void spline1dfithermitewc(const real_1d_array &x, const real_1d_array &y, const real_1d_array &w, const ae_int_t n, const real_1d_array &xc, const real_1d_array &yc, const integer_1d_array &dc, const ae_int_t k, const ae_int_t m, ae_int_t &info, spline1dinterpolant &s, spline1dfitreport &rep); void smp_spline1dfithermitewc(const real_1d_array &x, const real_1d_array &y, const real_1d_array &w, const ae_int_t n, const real_1d_array &xc, const real_1d_array &yc, const integer_1d_array &dc, const ae_int_t k, const ae_int_t m, ae_int_t &info, spline1dinterpolant &s, spline1dfitreport &rep); void spline1dfithermitewc(const real_1d_array &x, const real_1d_array &y, const real_1d_array &w, const real_1d_array &xc, const real_1d_array &yc, const integer_1d_array &dc, const ae_int_t m, ae_int_t &info, spline1dinterpolant &s, spline1dfitreport &rep); void smp_spline1dfithermitewc(const real_1d_array &x, const real_1d_array &y, const real_1d_array &w, const real_1d_array &xc, const real_1d_array &yc, const integer_1d_array &dc, const ae_int_t m, ae_int_t &info, spline1dinterpolant &s, spline1dfitreport &rep); /************************************************************************* Least squares fitting by cubic spline. This subroutine is "lightweight" alternative for more complex and feature- rich Spline1DFitCubicWC(). See Spline1DFitCubicWC() for more information about subroutine parameters (we don't duplicate it here because of length) COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multithreading support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Speed-up provided by multithreading greatly depends on problem size ! - only large problems (number of coefficients is more than 500) can be ! efficiently multithreaded. ! ! Generally, commercial ALGLIB is several times faster than open-source ! generic C edition, and many times faster than open-source C# edition. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. -- ALGLIB PROJECT -- Copyright 18.08.2009 by Bochkanov Sergey *************************************************************************/ void spline1dfitcubic(const real_1d_array &x, const real_1d_array &y, const ae_int_t n, const ae_int_t m, ae_int_t &info, spline1dinterpolant &s, spline1dfitreport &rep); void smp_spline1dfitcubic(const real_1d_array &x, const real_1d_array &y, const ae_int_t n, const ae_int_t m, ae_int_t &info, spline1dinterpolant &s, spline1dfitreport &rep); void spline1dfitcubic(const real_1d_array &x, const real_1d_array &y, const ae_int_t m, ae_int_t &info, spline1dinterpolant &s, spline1dfitreport &rep); void smp_spline1dfitcubic(const real_1d_array &x, const real_1d_array &y, const ae_int_t m, ae_int_t &info, spline1dinterpolant &s, spline1dfitreport &rep); /************************************************************************* Least squares fitting by Hermite spline. 
This subroutine is "lightweight" alternative for more complex and feature- rich Spline1DFitHermiteWC(). See Spline1DFitHermiteWC() description for more information about subroutine parameters (we don't duplicate it here because of length). COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multithreading support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Speed-up provided by multithreading greatly depends on problem size ! - only large problems (number of coefficients is more than 500) can be ! efficiently multithreaded. ! ! Generally, commercial ALGLIB is several times faster than open-source ! generic C edition, and many times faster than open-source C# edition. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. -- ALGLIB PROJECT -- Copyright 18.08.2009 by Bochkanov Sergey *************************************************************************/ void spline1dfithermite(const real_1d_array &x, const real_1d_array &y, const ae_int_t n, const ae_int_t m, ae_int_t &info, spline1dinterpolant &s, spline1dfitreport &rep); void smp_spline1dfithermite(const real_1d_array &x, const real_1d_array &y, const ae_int_t n, const ae_int_t m, ae_int_t &info, spline1dinterpolant &s, spline1dfitreport &rep); void spline1dfithermite(const real_1d_array &x, const real_1d_array &y, const ae_int_t m, ae_int_t &info, spline1dinterpolant &s, spline1dfitreport &rep); void smp_spline1dfithermite(const real_1d_array &x, const real_1d_array &y, const ae_int_t m, ae_int_t &info, spline1dinterpolant &s, spline1dfitreport &rep); /************************************************************************* Weighted linear least squares fitting. QR decomposition is used to reduce task to MxM, then triangular solver or SVD-based solver is used depending on condition number of the system. It allows to maximize speed and retain decent accuracy. IMPORTANT: if you want to perform polynomial fitting, it may be more convenient to use PolynomialFit() function. This function gives best results on polynomial problems and solves numerical stability issues which arise when you fit high-degree polynomials to your data. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multithreading support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Speed-up provided by multithreading greatly depends on problem size ! - only large problems (number of coefficients is more than 500) can be ! efficiently multithreaded. ! ! Generally, commercial ALGLIB is several times faster than open-source ! generic C edition, and many times faster than open-source C# edition. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! 
related features provided by commercial edition of ALGLIB. INPUT PARAMETERS: Y - array[0..N-1] Function values in N points. W - array[0..N-1] Weights corresponding to function values. Each summand in square sum of approximation deviations from given values is multiplied by the square of corresponding weight. FMatrix - a table of basis functions values, array[0..N-1, 0..M-1]. FMatrix[I, J] - value of J-th basis function in I-th point. N - number of points used. N>=1. M - number of basis functions, M>=1. OUTPUT PARAMETERS: Info - error code: * -4 internal SVD decomposition subroutine failed (very rare and for degenerate systems only) * -1 incorrect N/M were specified * 1 task is solved C - decomposition coefficients, array[0..M-1] Rep - fitting report. Following fields are set: * Rep.TaskRCond reciprocal of condition number * R2 non-adjusted coefficient of determination (non-weighted) * RMSError rms error on the (X,Y). * AvgError average error on the (X,Y). * AvgRelError average relative error on the non-zero Y * MaxError maximum error NON-WEIGHTED ERRORS ARE CALCULATED ERRORS IN PARAMETERS This solver also calculates different kinds of errors in parameters and fills corresponding fields of report: * Rep.CovPar covariance matrix for parameters, array[K,K]. * Rep.ErrPar errors in parameters, array[K], errpar = sqrt(diag(CovPar)) * Rep.ErrCurve vector of fit errors - standard deviations of empirical best-fit curve from "ideal" best-fit curve built with infinite number of samples, array[N]. errcurve = sqrt(diag(F*CovPar*F')), where F is functions matrix. * Rep.Noise vector of per-point estimates of noise, array[N] NOTE: noise in the data is estimated as follows: * for fitting without user-supplied weights all points are assumed to have same level of noise, which is estimated from the data * for fitting with user-supplied weights we assume that noise level in I-th point is inversely proportional to Ith weight. Coefficient of proportionality is estimated from the data. NOTE: we apply small amount of regularization when we invert squared Jacobian and calculate covariance matrix. It guarantees that algorithm won't divide by zero during inversion, but skews error estimates a bit (fractional error is about 10^-9). However, we believe that this difference is insignificant for all practical purposes except for the situation when you want to compare ALGLIB results with "reference" implementation up to the last significant digit. NOTE: covariance matrix is estimated using correction for degrees of freedom (covariances are divided by N-M instead of dividing by N). -- ALGLIB -- Copyright 17.08.2009 by Bochkanov Sergey *************************************************************************/ void lsfitlinearw(const real_1d_array &y, const real_1d_array &w, const real_2d_array &fmatrix, const ae_int_t n, const ae_int_t m, ae_int_t &info, real_1d_array &c, lsfitreport &rep); void smp_lsfitlinearw(const real_1d_array &y, const real_1d_array &w, const real_2d_array &fmatrix, const ae_int_t n, const ae_int_t m, ae_int_t &info, real_1d_array &c, lsfitreport &rep); void lsfitlinearw(const real_1d_array &y, const real_1d_array &w, const real_2d_array &fmatrix, ae_int_t &info, real_1d_array &c, lsfitreport &rep); void smp_lsfitlinearw(const real_1d_array &y, const real_1d_array &w, const real_2d_array &fmatrix, ae_int_t &info, real_1d_array &c, lsfitreport &rep); /************************************************************************* Weighted constained linear least squares fitting. 
This is variation of LSFitLinearW(), which searches for min|A*x-b| given that K additional constraints C*x=bc are satisfied. It reduces original task to modified one: min|B*y-d| WITHOUT constraints, then LSFitLinearW() is called. IMPORTANT: if you want to perform polynomial fitting, it may be more convenient to use PolynomialFit() function. This function gives best results on polynomial problems and solves numerical stability issues which arise when you fit high-degree polynomials to your data. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multithreading support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Speed-up provided by multithreading greatly depends on problem size ! - only large problems (number of coefficients is more than 500) can be ! efficiently multithreaded. ! ! Generally, commercial ALGLIB is several times faster than open-source ! generic C edition, and many times faster than open-source C# edition. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS: Y - array[0..N-1] Function values in N points. W - array[0..N-1] Weights corresponding to function values. Each summand in square sum of approximation deviations from given values is multiplied by the square of corresponding weight. FMatrix - a table of basis functions values, array[0..N-1, 0..M-1]. FMatrix[I,J] - value of J-th basis function in I-th point. CMatrix - a table of constraints, array[0..K-1,0..M]. I-th row of CMatrix corresponds to I-th linear constraint: CMatrix[I,0]*C[0] + ... + CMatrix[I,M-1]*C[M-1] = CMatrix[I,M] N - number of points used. N>=1. M - number of basis functions, M>=1. K - number of constraints, 0 <= K < M K=0 corresponds to absence of constraints. OUTPUT PARAMETERS: Info - error code: * -4 internal SVD decomposition subroutine failed (very rare and for degenerate systems only) * -3 either too many constraints (M or more), degenerate constraints (some constraints are repeated twice) or inconsistent constraints were specified. * 1 task is solved C - decomposition coefficients, array[0..M-1] Rep - fitting report. Following fields are set: * R2 non-adjusted coefficient of determination (non-weighted) * RMSError rms error on the (X,Y). * AvgError average error on the (X,Y). * AvgRelError average relative error on the non-zero Y * MaxError maximum error NON-WEIGHTED ERRORS ARE CALCULATED IMPORTANT: this subroutine doesn't calculate task's condition number for K<>0. ERRORS IN PARAMETERS This solver also calculates different kinds of errors in parameters and fills corresponding fields of report: * Rep.CovPar covariance matrix for parameters, array[K,K]. * Rep.ErrPar errors in parameters, array[K], errpar = sqrt(diag(CovPar)) * Rep.ErrCurve vector of fit errors - standard deviations of empirical best-fit curve from "ideal" best-fit curve built with infinite number of samples, array[N]. errcurve = sqrt(diag(F*CovPar*F')), where F is functions matrix.
* Rep.Noise vector of per-point estimates of noise, array[N] IMPORTANT: errors in parameters are calculated without taking into account boundary/linear constraints! Presence of constraints changes distribution of errors, but there is no easy way to account for constraints when you calculate covariance matrix. NOTE: noise in the data is estimated as follows: * for fitting without user-supplied weights all points are assumed to have same level of noise, which is estimated from the data * for fitting with user-supplied weights we assume that noise level in I-th point is inversely proportional to Ith weight. Coefficient of proportionality is estimated from the data. NOTE: we apply small amount of regularization when we invert squared Jacobian and calculate covariance matrix. It guarantees that algorithm won't divide by zero during inversion, but skews error estimates a bit (fractional error is about 10^-9). However, we believe that this difference is insignificant for all practical purposes except for the situation when you want to compare ALGLIB results with "reference" implementation up to the last significant digit. NOTE: covariance matrix is estimated using correction for degrees of freedom (covariances are divided by N-M instead of dividing by N). -- ALGLIB -- Copyright 07.09.2009 by Bochkanov Sergey *************************************************************************/ void lsfitlinearwc(const real_1d_array &y, const real_1d_array &w, const real_2d_array &fmatrix, const real_2d_array &cmatrix, const ae_int_t n, const ae_int_t m, const ae_int_t k, ae_int_t &info, real_1d_array &c, lsfitreport &rep); void smp_lsfitlinearwc(const real_1d_array &y, const real_1d_array &w, const real_2d_array &fmatrix, const real_2d_array &cmatrix, const ae_int_t n, const ae_int_t m, const ae_int_t k, ae_int_t &info, real_1d_array &c, lsfitreport &rep); void lsfitlinearwc(const real_1d_array &y, const real_1d_array &w, const real_2d_array &fmatrix, const real_2d_array &cmatrix, ae_int_t &info, real_1d_array &c, lsfitreport &rep); void smp_lsfitlinearwc(const real_1d_array &y, const real_1d_array &w, const real_2d_array &fmatrix, const real_2d_array &cmatrix, ae_int_t &info, real_1d_array &c, lsfitreport &rep); /************************************************************************* Linear least squares fitting. QR decomposition is used to reduce task to MxM, then triangular solver or SVD-based solver is used depending on condition number of the system. It allows to maximize speed and retain decent accuracy. IMPORTANT: if you want to perform polynomial fitting, it may be more convenient to use PolynomialFit() function. This function gives best results on polynomial problems and solves numerical stability issues which arise when you fit high-degree polynomials to your data. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multithreading support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Speed-up provided by multithreading greatly depends on problem size ! - only large problems (number of coefficients is more than 500) can be ! efficiently multithreaded. ! ! Generally, commercial ALGLIB is several times faster than open-source ! 
generic C edition, and many times faster than open-source C# edition. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS: Y - array[0..N-1] Function values in N points. FMatrix - a table of basis functions values, array[0..N-1, 0..M-1]. FMatrix[I, J] - value of J-th basis function in I-th point. N - number of points used. N>=1. M - number of basis functions, M>=1. OUTPUT PARAMETERS: Info - error code: * -4 internal SVD decomposition subroutine failed (very rare and for degenerate systems only) * 1 task is solved C - decomposition coefficients, array[0..M-1] Rep - fitting report. Following fields are set: * Rep.TaskRCond reciprocal of condition number * R2 non-adjusted coefficient of determination (non-weighted) * RMSError rms error on the (X,Y). * AvgError average error on the (X,Y). * AvgRelError average relative error on the non-zero Y * MaxError maximum error NON-WEIGHTED ERRORS ARE CALCULATED ERRORS IN PARAMETERS This solver also calculates different kinds of errors in parameters and fills corresponding fields of report: * Rep.CovPar covariance matrix for parameters, array[K,K]. * Rep.ErrPar errors in parameters, array[K], errpar = sqrt(diag(CovPar)) * Rep.ErrCurve vector of fit errors - standard deviations of empirical best-fit curve from "ideal" best-fit curve built with infinite number of samples, array[N]. errcurve = sqrt(diag(F*CovPar*F')), where F is functions matrix. * Rep.Noise vector of per-point estimates of noise, array[N] NOTE: noise in the data is estimated as follows: * for fitting without user-supplied weights all points are assumed to have same level of noise, which is estimated from the data * for fitting with user-supplied weights we assume that noise level in I-th point is inversely proportional to Ith weight. Coefficient of proportionality is estimated from the data. NOTE: we apply small amount of regularization when we invert squared Jacobian and calculate covariance matrix. It guarantees that algorithm won't divide by zero during inversion, but skews error estimates a bit (fractional error is about 10^-9). However, we believe that this difference is insignificant for all practical purposes except for the situation when you want to compare ALGLIB results with "reference" implementation up to the last significant digit. NOTE: covariance matrix is estimated using correction for degrees of freedom (covariances are divided by N-M instead of dividing by N). -- ALGLIB -- Copyright 17.08.2009 by Bochkanov Sergey *************************************************************************/ void lsfitlinear(const real_1d_array &y, const real_2d_array &fmatrix, const ae_int_t n, const ae_int_t m, ae_int_t &info, real_1d_array &c, lsfitreport &rep); void smp_lsfitlinear(const real_1d_array &y, const real_2d_array &fmatrix, const ae_int_t n, const ae_int_t m, ae_int_t &info, real_1d_array &c, lsfitreport &rep); void lsfitlinear(const real_1d_array &y, const real_2d_array &fmatrix, ae_int_t &info, real_1d_array &c, lsfitreport &rep); void smp_lsfitlinear(const real_1d_array &y, const real_2d_array &fmatrix, ae_int_t &info, real_1d_array &c, lsfitreport &rep); /************************************************************************* Constained linear least squares fitting. This is variation of LSFitLinear(), which searchs for min|A*x=b| given that K additional constaints C*x=bc are satisfied. 
It reduces original task to modified one: min|B*y-d| WITHOUT constraints, then LSFitLinear() is called. IMPORTANT: if you want to perform polynomial fitting, it may be more convenient to use PolynomialFit() function. This function gives best results on polynomial problems and solves numerical stability issues which arise when you fit high-degree polynomials to your data. COMMERCIAL EDITION OF ALGLIB: ! Commercial version of ALGLIB includes two important improvements of ! this function, which can be used from C++ and C#: ! * Intel MKL support (lightweight Intel MKL is shipped with ALGLIB) ! * multithreading support ! ! Intel MKL gives approximately constant (with respect to number of ! worker threads) acceleration factor which depends on CPU being used, ! problem size and "baseline" ALGLIB edition which is used for ! comparison. ! ! Speed-up provided by multithreading greatly depends on problem size ! - only large problems (number of coefficients is more than 500) can be ! efficiently multithreaded. ! ! Generally, commercial ALGLIB is several times faster than open-source ! generic C edition, and many times faster than open-source C# edition. ! ! We recommend you to read 'Working with commercial version' section of ! ALGLIB Reference Manual in order to find out how to use performance- ! related features provided by commercial edition of ALGLIB. INPUT PARAMETERS: Y - array[0..N-1] Function values in N points. FMatrix - a table of basis functions values, array[0..N-1, 0..M-1]. FMatrix[I,J] - value of J-th basis function in I-th point. CMatrix - a table of constaints, array[0..K-1,0..M]. I-th row of CMatrix corresponds to I-th linear constraint: CMatrix[I,0]*C[0] + ... + CMatrix[I,M-1]*C[M-1] = CMatrix[I,M] N - number of points used. N>=1. M - number of basis functions, M>=1. K - number of constraints, 0 <= K < M K=0 corresponds to absence of constraints. OUTPUT PARAMETERS: Info - error code: * -4 internal SVD decomposition subroutine failed (very rare and for degenerate systems only) * -3 either too many constraints (M or more), degenerate constraints (some constraints are repetead twice) or inconsistent constraints were specified. * 1 task is solved C - decomposition coefficients, array[0..M-1] Rep - fitting report. Following fields are set: * R2 non-adjusted coefficient of determination (non-weighted) * RMSError rms error on the (X,Y). * AvgError average error on the (X,Y). * AvgRelError average relative error on the non-zero Y * MaxError maximum error NON-WEIGHTED ERRORS ARE CALCULATED IMPORTANT: this subroitine doesn't calculate task's condition number for K<>0. ERRORS IN PARAMETERS This solver also calculates different kinds of errors in parameters and fills corresponding fields of report: * Rep.CovPar covariance matrix for parameters, array[K,K]. * Rep.ErrPar errors in parameters, array[K], errpar = sqrt(diag(CovPar)) * Rep.ErrCurve vector of fit errors - standard deviations of empirical best-fit curve from "ideal" best-fit curve built with infinite number of samples, array[N]. errcurve = sqrt(diag(F*CovPar*F')), where F is functions matrix. * Rep.Noise vector of per-point estimates of noise, array[N] IMPORTANT: errors in parameters are calculated without taking into account boundary/linear constraints! Presence of constraints changes distribution of errors, but there is no easy way to account for constraints when you calculate covariance matrix. 
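The following sketch (illustrative values, standard alglib namespace assumed) shows how a linear constraint is encoded in CMatrix for lsfitlinearc(): fitting c0 + c1*x while forcing the fit to pass through the point (0,1), i.e. 1*c0 + 0*c1 = 1:

    real_1d_array y = "[1.1, 1.9, 3.1, 3.9]";
    real_2d_array fmatrix = "[[1,0],[1,1],[1,2],[1,3]]"; // basis functions 1 and x at each point
    real_2d_array cmatrix = "[[1,0,1]]";                 // one row: CMatrix[0,0]*c0 + CMatrix[0,1]*c1 = CMatrix[0,2]
    ae_int_t info;
    real_1d_array c;
    lsfitreport rep;
    lsfitlinearc(y, fmatrix, cmatrix, info, c, rep);     // K deduced from cmatrix size in this overload
    // on exit c[0]=1 (constraint satisfied exactly), c[1] chosen in least squares sense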
NOTE: noise in the data is estimated as follows: * for fitting without user-supplied weights all points are assumed to have same level of noise, which is estimated from the data * for fitting with user-supplied weights we assume that noise level in I-th point is inversely proportional to Ith weight. Coefficient of proportionality is estimated from the data. NOTE: we apply small amount of regularization when we invert squared Jacobian and calculate covariance matrix. It guarantees that algorithm won't divide by zero during inversion, but skews error estimates a bit (fractional error is about 10^-9). However, we believe that this difference is insignificant for all practical purposes except for the situation when you want to compare ALGLIB results with "reference" implementation up to the last significant digit. NOTE: covariance matrix is estimated using correction for degrees of freedom (covariances are divided by N-M instead of dividing by N). -- ALGLIB -- Copyright 07.09.2009 by Bochkanov Sergey *************************************************************************/ void lsfitlinearc(const real_1d_array &y, const real_2d_array &fmatrix, const real_2d_array &cmatrix, const ae_int_t n, const ae_int_t m, const ae_int_t k, ae_int_t &info, real_1d_array &c, lsfitreport &rep); void smp_lsfitlinearc(const real_1d_array &y, const real_2d_array &fmatrix, const real_2d_array &cmatrix, const ae_int_t n, const ae_int_t m, const ae_int_t k, ae_int_t &info, real_1d_array &c, lsfitreport &rep); void lsfitlinearc(const real_1d_array &y, const real_2d_array &fmatrix, const real_2d_array &cmatrix, ae_int_t &info, real_1d_array &c, lsfitreport &rep); void smp_lsfitlinearc(const real_1d_array &y, const real_2d_array &fmatrix, const real_2d_array &cmatrix, ae_int_t &info, real_1d_array &c, lsfitreport &rep); /************************************************************************* Weighted nonlinear least squares fitting using function values only. Combination of numerical differentiation and secant updates is used to obtain function Jacobian. Nonlinear task min(F(c)) is solved, where F(c) = (w[0]*(f(c,x[0])-y[0]))^2 + ... + (w[n-1]*(f(c,x[n-1])-y[n-1]))^2, * N is a number of points, * M is a dimension of a space points belong to, * K is a dimension of a space of parameters being fitted, * w is an N-dimensional vector of weight coefficients, * x is a set of N points, each of them is an M-dimensional vector, * c is a K-dimensional vector of parameters being fitted This subroutine uses only f(c,x[i]). INPUT PARAMETERS: X - array[0..N-1,0..M-1], points (one row = one point) Y - array[0..N-1], function values. 
W - weights, array[0..N-1] C - array[0..K-1], initial approximation to the solution, N - number of points, N>1 M - dimension of space K - number of parameters being fitted DiffStep- numerical differentiation step; should not be very small or large; large = loss of accuracy small = growth of round-off errors OUTPUT PARAMETERS: State - structure which stores algorithm state -- ALGLIB -- Copyright 18.10.2008 by Bochkanov Sergey *************************************************************************/ void lsfitcreatewf(const real_2d_array &x, const real_1d_array &y, const real_1d_array &w, const real_1d_array &c, const ae_int_t n, const ae_int_t m, const ae_int_t k, const double diffstep, lsfitstate &state); void lsfitcreatewf(const real_2d_array &x, const real_1d_array &y, const real_1d_array &w, const real_1d_array &c, const double diffstep, lsfitstate &state); /************************************************************************* Nonlinear least squares fitting using function values only. Combination of numerical differentiation and secant updates is used to obtain function Jacobian. Nonlinear task min(F(c)) is solved, where F(c) = (f(c,x[0])-y[0])^2 + ... + (f(c,x[n-1])-y[n-1])^2, * N is a number of points, * M is a dimension of a space points belong to, * K is a dimension of a space of parameters being fitted, * w is an N-dimensional vector of weight coefficients, * x is a set of N points, each of them is an M-dimensional vector, * c is a K-dimensional vector of parameters being fitted This subroutine uses only f(c,x[i]). INPUT PARAMETERS: X - array[0..N-1,0..M-1], points (one row = one point) Y - array[0..N-1], function values. C - array[0..K-1], initial approximation to the solution, N - number of points, N>1 M - dimension of space K - number of parameters being fitted DiffStep- numerical differentiation step; should not be very small or large; large = loss of accuracy small = growth of round-off errors OUTPUT PARAMETERS: State - structure which stores algorithm state -- ALGLIB -- Copyright 18.10.2008 by Bochkanov Sergey *************************************************************************/ void lsfitcreatef(const real_2d_array &x, const real_1d_array &y, const real_1d_array &c, const ae_int_t n, const ae_int_t m, const ae_int_t k, const double diffstep, lsfitstate &state); void lsfitcreatef(const real_2d_array &x, const real_1d_array &y, const real_1d_array &c, const double diffstep, lsfitstate &state); /************************************************************************* Weighted nonlinear least squares fitting using gradient only. Nonlinear task min(F(c)) is solved, where F(c) = (w[0]*(f(c,x[0])-y[0]))^2 + ... + (w[n-1]*(f(c,x[n-1])-y[n-1]))^2, * N is a number of points, * M is a dimension of a space points belong to, * K is a dimension of a space of parameters being fitted, * w is an N-dimensional vector of weight coefficients, * x is a set of N points, each of them is an M-dimensional vector, * c is a K-dimensional vector of parameters being fitted This subroutine uses only f(c,x[i]) and its gradient. INPUT PARAMETERS: X - array[0..N-1,0..M-1], points (one row = one point) Y - array[0..N-1], function values. W - weights, array[0..N-1] C - array[0..K-1], initial approximation to the solution, N - number of points, N>1 M - dimension of space K - number of parameters being fitted CheapFG - boolean flag, which is: * True if both function and gradient calculation complexity are less than O(M^2). 
An improved algorithm can be used which corresponds to FGJ scheme from MINLM unit. * False otherwise. Standard Jacibian-bases Levenberg-Marquardt algo will be used (FJ scheme). OUTPUT PARAMETERS: State - structure which stores algorithm state See also: LSFitResults LSFitCreateFG (fitting without weights) LSFitCreateWFGH (fitting using Hessian) LSFitCreateFGH (fitting using Hessian, without weights) -- ALGLIB -- Copyright 17.08.2009 by Bochkanov Sergey *************************************************************************/ void lsfitcreatewfg(const real_2d_array &x, const real_1d_array &y, const real_1d_array &w, const real_1d_array &c, const ae_int_t n, const ae_int_t m, const ae_int_t k, const bool cheapfg, lsfitstate &state); void lsfitcreatewfg(const real_2d_array &x, const real_1d_array &y, const real_1d_array &w, const real_1d_array &c, const bool cheapfg, lsfitstate &state); /************************************************************************* Nonlinear least squares fitting using gradient only, without individual weights. Nonlinear task min(F(c)) is solved, where F(c) = ((f(c,x[0])-y[0]))^2 + ... + ((f(c,x[n-1])-y[n-1]))^2, * N is a number of points, * M is a dimension of a space points belong to, * K is a dimension of a space of parameters being fitted, * x is a set of N points, each of them is an M-dimensional vector, * c is a K-dimensional vector of parameters being fitted This subroutine uses only f(c,x[i]) and its gradient. INPUT PARAMETERS: X - array[0..N-1,0..M-1], points (one row = one point) Y - array[0..N-1], function values. C - array[0..K-1], initial approximation to the solution, N - number of points, N>1 M - dimension of space K - number of parameters being fitted CheapFG - boolean flag, which is: * True if both function and gradient calculation complexity are less than O(M^2). An improved algorithm can be used which corresponds to FGJ scheme from MINLM unit. * False otherwise. Standard Jacibian-bases Levenberg-Marquardt algo will be used (FJ scheme). OUTPUT PARAMETERS: State - structure which stores algorithm state -- ALGLIB -- Copyright 17.08.2009 by Bochkanov Sergey *************************************************************************/ void lsfitcreatefg(const real_2d_array &x, const real_1d_array &y, const real_1d_array &c, const ae_int_t n, const ae_int_t m, const ae_int_t k, const bool cheapfg, lsfitstate &state); void lsfitcreatefg(const real_2d_array &x, const real_1d_array &y, const real_1d_array &c, const bool cheapfg, lsfitstate &state); /************************************************************************* Weighted nonlinear least squares fitting using gradient/Hessian. Nonlinear task min(F(c)) is solved, where F(c) = (w[0]*(f(c,x[0])-y[0]))^2 + ... + (w[n-1]*(f(c,x[n-1])-y[n-1]))^2, * N is a number of points, * M is a dimension of a space points belong to, * K is a dimension of a space of parameters being fitted, * w is an N-dimensional vector of weight coefficients, * x is a set of N points, each of them is an M-dimensional vector, * c is a K-dimensional vector of parameters being fitted This subroutine uses f(c,x[i]), its gradient and its Hessian. INPUT PARAMETERS: X - array[0..N-1,0..M-1], points (one row = one point) Y - array[0..N-1], function values. 
W - weights, array[0..N-1] C - array[0..K-1], initial approximation to the solution, N - number of points, N>1 M - dimension of space K - number of parameters being fitted OUTPUT PARAMETERS: State - structure which stores algorithm state -- ALGLIB -- Copyright 17.08.2009 by Bochkanov Sergey *************************************************************************/ void lsfitcreatewfgh(const real_2d_array &x, const real_1d_array &y, const real_1d_array &w, const real_1d_array &c, const ae_int_t n, const ae_int_t m, const ae_int_t k, lsfitstate &state); void lsfitcreatewfgh(const real_2d_array &x, const real_1d_array &y, const real_1d_array &w, const real_1d_array &c, lsfitstate &state); /************************************************************************* Nonlinear least squares fitting using gradient/Hessian, without individual weights. Nonlinear task min(F(c)) is solved, where F(c) = ((f(c,x[0])-y[0]))^2 + ... + ((f(c,x[n-1])-y[n-1]))^2, * N is a number of points, * M is a dimension of a space points belong to, * K is a dimension of a space of parameters being fitted, * x is a set of N points, each of them is an M-dimensional vector, * c is a K-dimensional vector of parameters being fitted This subroutine uses f(c,x[i]), its gradient and its Hessian. INPUT PARAMETERS: X - array[0..N-1,0..M-1], points (one row = one point) Y - array[0..N-1], function values. C - array[0..K-1], initial approximation to the solution, N - number of points, N>1 M - dimension of space K - number of parameters being fitted OUTPUT PARAMETERS: State - structure which stores algorithm state -- ALGLIB -- Copyright 17.08.2009 by Bochkanov Sergey *************************************************************************/ void lsfitcreatefgh(const real_2d_array &x, const real_1d_array &y, const real_1d_array &c, const ae_int_t n, const ae_int_t m, const ae_int_t k, lsfitstate &state); void lsfitcreatefgh(const real_2d_array &x, const real_1d_array &y, const real_1d_array &c, lsfitstate &state); /************************************************************************* Stopping conditions for nonlinear least squares fitting. INPUT PARAMETERS: State - structure which stores algorithm state EpsX - >=0 The subroutine finishes its work if on k+1-th iteration the condition |v|<=EpsX is fulfilled, where: * |.| means Euclidean norm * v - scaled step vector, v[i]=dx[i]/s[i] * dx - step vector, dx=X(k+1)-X(k) * s - scaling coefficients set by LSFitSetScale() MaxIts - maximum number of iterations. If MaxIts=0, the number of iterations is unlimited. Only Levenberg-Marquardt iterations are counted (L-BFGS/CG iterations are NOT counted because their cost is very low compared to that of LM). NOTE Passing EpsX=0 and MaxIts=0 (simultaneously) will lead to automatic stopping criterion selection (according to the scheme used by MINLM unit). -- ALGLIB -- Copyright 17.08.2009 by Bochkanov Sergey *************************************************************************/ void lsfitsetcond(const lsfitstate &state, const double epsx, const ae_int_t maxits); /************************************************************************* This function sets maximum step length. INPUT PARAMETERS: State - structure which stores algorithm state StpMax - maximum step length, >=0. Set StpMax to 0.0, if you don't want to limit step length. Use this subroutine when you optimize target function which contains exp() or other fast growing functions, and optimization algorithm makes too large steps which lead to overflow. 
This function allows us to reject steps that are too large (and therefore expose us to the possible overflow) without actually calculating function value at the x+stp*d. NOTE: non-zero StpMax leads to moderate performance degradation because intermediate step of preconditioned L-BFGS optimization is incompatible with limits on step size. -- ALGLIB -- Copyright 02.04.2010 by Bochkanov Sergey *************************************************************************/ void lsfitsetstpmax(const lsfitstate &state, const double stpmax); /************************************************************************* This function turns on/off reporting. INPUT PARAMETERS: State - structure which stores algorithm state NeedXRep- whether iteration reports are needed or not When reports are needed, State.C (current parameters) and State.F (current value of fitting function) are reported. -- ALGLIB -- Copyright 15.08.2010 by Bochkanov Sergey *************************************************************************/ void lsfitsetxrep(const lsfitstate &state, const bool needxrep); /************************************************************************* This function sets scaling coefficients for underlying optimizer. ALGLIB optimizers use scaling matrices to test stopping conditions (step size and gradient are scaled before comparison with tolerances). Scale of the I-th variable is a translation invariant measure of: a) "how large" the variable is b) how large the step should be to make significant changes in the function Generally, scale is NOT considered to be a form of preconditioner. But LM optimizer is unique in that it uses scaling matrix both in the stopping condition tests and as Marquardt damping factor. Proper scaling is very important for the algorithm performance. It is less important for the quality of results, but still has some influence (it is easier to converge when variables are properly scaled, so premature stopping is possible when very badly scalled variables are combined with relaxed stopping conditions). INPUT PARAMETERS: State - structure stores algorithm state S - array[N], non-zero scaling coefficients S[i] may be negative, sign doesn't matter. -- ALGLIB -- Copyright 14.01.2011 by Bochkanov Sergey *************************************************************************/ void lsfitsetscale(const lsfitstate &state, const real_1d_array &s); /************************************************************************* This function sets boundary constraints for underlying optimizer Boundary constraints are inactive by default (after initial creation). They are preserved until explicitly turned off with another SetBC() call. INPUT PARAMETERS: State - structure stores algorithm state BndL - lower bounds, array[K]. If some (all) variables are unbounded, you may specify very small number or -INF (latter is recommended because it will allow solver to use better algorithm). BndU - upper bounds, array[K]. If some (all) variables are unbounded, you may specify very large number or +INF (latter is recommended because it will allow solver to use better algorithm). NOTE 1: it is possible to specify BndL[i]=BndU[i]. In this case I-th variable will be "frozen" at X[i]=BndL[i]=BndU[i]. 
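As an illustration of the box-constraint interface described above, the sketch below uses hypothetical finite bounds for a state previously created with one of the LSFitCreate* constructors for K=2 parameters; per the description above, -INF/+INF may be supplied for unbounded variables:

    // state is an lsfitstate prepared earlier, e.g. by lsfitcreatef(...)
    real_1d_array bndl = "[0.0, -5.0]";   // lower bounds for c[0], c[1]
    real_1d_array bndu = "[10.0, 5.0]";   // upper bounds for c[0], c[1]
    lsfitsetbc(state, bndl, bndu);
    // during fitting the function is evaluated only inside this box,
    // and the bound constraints are satisfied exactly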
NOTE 2: unlike other constrained optimization algorithms, this solver has following useful properties: * bound constraints are always satisfied exactly * function is evaluated only INSIDE area specified by bound constraints -- ALGLIB -- Copyright 14.01.2011 by Bochkanov Sergey *************************************************************************/ void lsfitsetbc(const lsfitstate &state, const real_1d_array &bndl, const real_1d_array &bndu); /************************************************************************* This function sets linear constraints for underlying optimizer Linear constraints are inactive by default (after initial creation). They are preserved until explicitly turned off with another SetLC() call. INPUT PARAMETERS: State - structure stores algorithm state C - linear constraints, array[K,N+1]. Each row of C represents one constraint, either equality or inequality (see below): * first N elements correspond to coefficients, * last element corresponds to the right part. All elements of C (including right part) must be finite. CT - type of constraints, array[K]: * if CT[i]>0, then I-th constraint is C[i,*]*x >= C[i,n+1] * if CT[i]=0, then I-th constraint is C[i,*]*x = C[i,n+1] * if CT[i]<0, then I-th constraint is C[i,*]*x <= C[i,n+1] K - number of equality/inequality constraints, K>=0: * if given, only leading K elements of C/CT are used * if not given, automatically determined from sizes of C/CT IMPORTANT: if you have linear constraints, it is strongly recommended to set scale of variables with lsfitsetscale(). QP solver which is used to calculate linearly constrained steps heavily relies on good scaling of input problems. NOTE: linear (non-box) constraints are satisfied only approximately - there always exists some violation due to numerical errors and algorithmic limitations. NOTE: general linear constraints add significant overhead to solution process. Although solver performs roughly same amount of iterations (when compared with similar box-only constrained problem), each iteration now involves solution of linearly constrained QP subproblem, which requires ~3-5 times more Cholesky decompositions. Thus, if you can reformulate your problem in such way this it has only box constraints, it may be beneficial to do so. -- ALGLIB -- Copyright 29.04.2017 by Bochkanov Sergey *************************************************************************/ void lsfitsetlc(const lsfitstate &state, const real_2d_array &c, const integer_1d_array &ct, const ae_int_t k); void lsfitsetlc(const lsfitstate &state, const real_2d_array &c, const integer_1d_array &ct); /************************************************************************* This function provides reverse communication interface Reverse communication interface is not documented or recommended to use. 
See below for functions which provide better documented API. *************************************************************************/ bool lsfititeration(const lsfitstate &state); /************************************************************************* This family of functions is used to launch iterations of nonlinear fitter. These functions accept the following parameters: state - algorithm state func - callback which calculates function (or merit function) value func at given point x grad - callback which calculates function (or merit function) value func and gradient grad at given point x hess - callback which calculates function (or merit function) value func, gradient grad and Hessian hess at given point x rep - optional callback which is called after each iteration can be NULL ptr - optional pointer which is passed to func/grad/hess/jac/rep can be NULL NOTES: 1. this algorithm is somewhat unusual because it works with parameterized function f(C,X), where X is a function argument (we have many points which are characterized by different argument values), and C is a parameter to fit. For example, if we want to do linear fit by f(c0,c1,x) = c0*x+c1, then x will be argument, and {c0,c1} will be parameters. It is important to understand that this algorithm finds minimum in the space of function PARAMETERS (not arguments), so it needs derivatives of f() with respect to C, not X. In the example above it will need f=c0*x+c1 and {df/dc0,df/dc1} = {x,1} instead of {df/dx} = {c0}. 2. Callback functions accept C as the first parameter, and X as the second. 3. If state was created with LSFitCreateFG(), algorithm needs just function and its gradient, but if state was created with LSFitCreateFGH(), algorithm will need function, gradient and Hessian. As stated above, there are several versions of this function, which accept different sets of callbacks. This flexibility opens the way to subtle errors - you may create state with LSFitCreateFGH() (optimization using Hessian), but call function which does not accept Hessian. So when the algorithm requests the Hessian, there will be no callback to call, and an exception will be thrown. Be careful to avoid such errors because there is no way to find them at compile time - you can see them at runtime only. -- ALGLIB -- Copyright 17.08.2009 by Bochkanov Sergey *************************************************************************/ void lsfitfit(lsfitstate &state, void (*func)(const real_1d_array &c, const real_1d_array &x, double &func, void *ptr), void (*rep)(const real_1d_array &c, double func, void *ptr) = NULL, void *ptr = NULL); void lsfitfit(lsfitstate &state, void (*func)(const real_1d_array &c, const real_1d_array &x, double &func, void *ptr), void (*grad)(const real_1d_array &c, const real_1d_array &x, double &func, real_1d_array &grad, void *ptr), void (*rep)(const real_1d_array &c, double func, void *ptr) = NULL, void *ptr = NULL); void lsfitfit(lsfitstate &state, void (*func)(const real_1d_array &c, const real_1d_array &x, double &func, void *ptr), void (*grad)(const real_1d_array &c, const real_1d_array &x, double &func, real_1d_array &grad, void *ptr), void (*hess)(const real_1d_array &c, const real_1d_array &x, double &func, real_1d_array &grad, real_2d_array &hess, void *ptr), void (*rep)(const real_1d_array &c, double func, void *ptr) = NULL, void *ptr = NULL); /************************************************************************* Nonlinear least squares fitting results. Called after return from LSFitFit(). 
INPUT PARAMETERS: State - algorithm state OUTPUT PARAMETERS: Info - completion code: * -7 gradient verification failed. See LSFitSetGradientCheck() for more information. * 2 relative step is no more than EpsX. * 5 MaxIts steps was taken * 7 stopping conditions are too stringent, further improvement is impossible C - array[0..K-1], solution Rep - optimization report. On success following fields are set: * R2 non-adjusted coefficient of determination (non-weighted) * RMSError rms error on the (X,Y). * AvgError average error on the (X,Y). * AvgRelError average relative error on the non-zero Y * MaxError maximum error NON-WEIGHTED ERRORS ARE CALCULATED * WRMSError weighted rms error on the (X,Y). ERRORS IN PARAMETERS This solver also calculates different kinds of errors in parameters and fills corresponding fields of report: * Rep.CovPar covariance matrix for parameters, array[K,K]. * Rep.ErrPar errors in parameters, array[K], errpar = sqrt(diag(CovPar)) * Rep.ErrCurve vector of fit errors - standard deviations of empirical best-fit curve from "ideal" best-fit curve built with infinite number of samples, array[N]. errcurve = sqrt(diag(J*CovPar*J')), where J is Jacobian matrix. * Rep.Noise vector of per-point estimates of noise, array[N] IMPORTANT: errors in parameters are calculated without taking into account boundary/linear constraints! Presence of constraints changes distribution of errors, but there is no easy way to account for constraints when you calculate covariance matrix. NOTE: noise in the data is estimated as follows: * for fitting without user-supplied weights all points are assumed to have same level of noise, which is estimated from the data * for fitting with user-supplied weights we assume that noise level in I-th point is inversely proportional to Ith weight. Coefficient of proportionality is estimated from the data. NOTE: we apply small amount of regularization when we invert squared Jacobian and calculate covariance matrix. It guarantees that algorithm won't divide by zero during inversion, but skews error estimates a bit (fractional error is about 10^-9). However, we believe that this difference is insignificant for all practical purposes except for the situation when you want to compare ALGLIB results with "reference" implementation up to the last significant digit. NOTE: covariance matrix is estimated using correction for degrees of freedom (covariances are divided by N-M instead of dividing by N). 
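The complete nonlinear fitting cycle described above (create state, set stopping conditions, run LSFitFit() with a callback, read LSFitResults()) can be sketched as follows; the model f(c,x)=exp(-c0*x) and the numeric values are illustrative assumptions only:

    #include <math.h>
    #include "interpolation.h"
    using namespace alglib;

    // callback: computes f(c,x)=exp(-c0*x0) only; the Jacobian is obtained by numerical differentiation
    void exp_model(const real_1d_array &c, const real_1d_array &x, double &func, void *ptr)
    {
        func = exp(-c[0]*x[0]);
    }

    int main()
    {
        real_2d_array x = "[[-1],[0],[1],[2]]";        // N=4 points, M=1 dimension
        real_1d_array y = "[2.72, 1.00, 0.37, 0.14]";  // samples of exp(-x), slightly rounded
        real_1d_array c = "[0.5]";                     // initial guess for the single parameter
        lsfitstate state;
        ae_int_t info;
        lsfitreport rep;
        lsfitcreatef(x, y, c, 0.0001, state);          // numerical differentiation, DiffStep=1e-4
        lsfitsetcond(state, 1.0e-9, 0);                // EpsX=1e-9, MaxIts unlimited
        lsfitfit(state, exp_model);
        lsfitresults(state, info, c, rep);             // expect info=2 and c[0] close to 1.0
        return 0;
    }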
-- ALGLIB -- Copyright 17.08.2009 by Bochkanov Sergey *************************************************************************/ void lsfitresults(const lsfitstate &state, ae_int_t &info, real_1d_array &c, lsfitreport &rep); /************************************************************************* This subroutine turns on verification of the user-supplied analytic gradient: * user calls this subroutine before fitting begins * LSFitFit() is called * prior to actual fitting, for each point in data set X_i and each component of parameters being fited C_j algorithm performs following steps: * two trial steps are made to C_j-TestStep*S[j] and C_j+TestStep*S[j], where C_j is j-th parameter and S[j] is a scale of j-th parameter * if needed, steps are bounded with respect to constraints on C[] * F(X_i|C) is evaluated at these trial points * we perform one more evaluation in the middle point of the interval * we build cubic model using function values and derivatives at trial points and we compare its prediction with actual value in the middle point * in case difference between prediction and actual value is higher than some predetermined threshold, algorithm stops with completion code -7; Rep.VarIdx is set to index of the parameter with incorrect derivative. * after verification is over, algorithm proceeds to the actual optimization. NOTE 1: verification needs N*K (points count * parameters count) gradient evaluations. It is very costly and you should use it only for low dimensional problems, when you want to be sure that you've correctly calculated analytic derivatives. You should not use it in the production code (unless you want to check derivatives provided by some third party). NOTE 2: you should carefully choose TestStep. Value which is too large (so large that function behaviour is significantly non-cubic) will lead to false alarms. You may use different step for different parameters by means of setting scale with LSFitSetScale(). NOTE 3: this function may lead to false positives. In case it reports that I-th derivative was calculated incorrectly, you may decrease test step and try one more time - maybe your function changes too sharply and your step is too large for such rapidly chanding function. NOTE 4: this function works only for optimizers created with LSFitCreateWFG() or LSFitCreateFG() constructors. INPUT PARAMETERS: State - structure used to store algorithm state TestStep - verification step: * TestStep=0 turns verification off * TestStep>0 activates verification -- ALGLIB -- Copyright 15.06.2012 by Bochkanov Sergey *************************************************************************/ void lsfitsetgradientcheck(const lsfitstate &state, const double teststep); /************************************************************************* Fits minimum circumscribed (MCC) circle (or NX-dimensional sphere) to data (a set of points in NX-dimensional space). INPUT PARAMETERS: XY - array[NPoints,NX] (or larger), contains dataset. One row = one point in NX-dimensional space. NPoints - dataset size, NPoints>0 NX - space dimensionality, NX>0 (1, 2, 3, 4, 5 and so on) OUTPUT PARAMETERS: CX - central point for a sphere RHi - radius NOTE: this function is an easy-to-use wrapper around more powerful "expert" function nsfitspherex(). This wrapper is optimized for ease of use and stability - at the cost of somewhat lower performance (we have to use very tight stopping criteria for inner optimizer because we want to make sure that it will converge on any dataset). 
If you are ready to experiment with settings of "expert" function, you can achieve ~2-4x speedup over standard "bulletproof" settings. -- ALGLIB -- Copyright 14.04.2017 by Bochkanov Sergey *************************************************************************/ void nsfitspheremcc(const real_2d_array &xy, const ae_int_t npoints, const ae_int_t nx, real_1d_array &cx, double &rhi); /************************************************************************* Fits maximum inscribed circle (or NX-dimensional sphere) to data (a set of points in NX-dimensional space). INPUT PARAMETERS: XY - array[NPoints,NX] (or larger), contains dataset. One row = one point in NX-dimensional space. NPoints - dataset size, NPoints>0 NX - space dimensionality, NX>0 (1, 2, 3, 4, 5 and so on) OUTPUT PARAMETERS: CX - central point for a sphere RLo - radius NOTE: this function is an easy-to-use wrapper around more powerful "expert" function nsfitspherex(). This wrapper is optimized for ease of use and stability - at the cost of somewhat lower performance (we have to use very tight stopping criteria for inner optimizer because we want to make sure that it will converge on any dataset). If you are ready to experiment with settings of "expert" function, you can achieve ~2-4x speedup over standard "bulletproof" settings. -- ALGLIB -- Copyright 14.04.2017 by Bochkanov Sergey *************************************************************************/ void nsfitspheremic(const real_2d_array &xy, const ae_int_t npoints, const ae_int_t nx, real_1d_array &cx, double &rlo); /************************************************************************* Fits minimum zone circle (or NX-dimensional sphere) to data (a set of points in NX-dimensional space). INPUT PARAMETERS: XY - array[NPoints,NX] (or larger), contains dataset. One row = one point in NX-dimensional space. NPoints - dataset size, NPoints>0 NX - space dimensionality, NX>0 (1, 2, 3, 4, 5 and so on) OUTPUT PARAMETERS: CX - central point for a sphere RLo - radius of inscribed circle RHo - radius of circumscribed circle NOTE: this function is an easy-to-use wrapper around more powerful "expert" function nsfitspherex(). This wrapper is optimized for ease of use and stability - at the cost of somewhat lower performance (we have to use very tight stopping criteria for inner optimizer because we want to make sure that it will converge on any dataset). If you are ready to experiment with settings of "expert" function, you can achieve ~2-4x speedup over standard "bulletproof" settings. -- ALGLIB -- Copyright 14.04.2017 by Bochkanov Sergey *************************************************************************/ void nsfitspheremzc(const real_2d_array &xy, const ae_int_t npoints, const ae_int_t nx, real_1d_array &cx, double &rlo, double &rhi); /************************************************************************* Fitting minimum circumscribed, maximum inscribed or minimum zone circles (or NX-dimensional spheres) to data (a set of points in NX-dimensional space). This is expert function which allows to tweak many parameters of underlying nonlinear solver: * stopping criteria for inner iterations * number of outer iterations * penalty coefficient used to handle nonlinear constraints (we convert unconstrained nonsmooth optimization problem ivolving max() and/or min() operations to quadratically constrained smooth one). 
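For the easy-to-use wrappers described above, usage reduces to a single call; the sketch below fits a minimum circumscribed circle to a few illustrative 2D points:

    real_2d_array xy = "[[0,0],[1,0],[0,1],[1,1]]";  // NPoints=4 points in NX=2 dimensions
    real_1d_array cx;
    double rhi;
    nsfitspheremcc(xy, 4, 2, cx, rhi);
    // cx ~ [0.5,0.5], rhi ~ 0.7071 (half of the square's diagonal) for this symmetric dataset

The expert function nsfitspherex(), described next, exposes the solver parameters listed above.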
You may tweak all these parameters or only some of them, leaving other ones at their default state - just specify zero value, and solver will fill it with appropriate default one. These comments also include some discussion of approach used to handle such unusual fitting problem, its stability, drawbacks of alternative methods, and convergence properties. INPUT PARAMETERS: XY - array[NPoints,NX] (or larger), contains dataset. One row = one point in NX-dimensional space. NPoints - dataset size, NPoints>0 NX - space dimensionality, NX>0 (1, 2, 3, 4, 5 and so on) ProblemType - used to encode problem type: * 1 for minimum circumscribed circle/sphere fitting (MCC) * 2 for maximum inscribed circle/sphere fitting (MIC) * 3 for minimum zone circle fitting (difference between Rhi and Rlo is minimized), denoted as MZC EpsX - stopping condition for NLC optimizer: * must be non-negative * use 0 to choose default value (1.0E-12 is used by default) * you may specify larger values, up to 1.0E-6, if you want to speed-up solver; NLC solver performs several preconditioned outer iterations, so final result typically has precision much better than EpsX. AULIts - number of outer iterations performed by NLC optimizer: * must be non-negative * use 0 to choose default value (20 is used by default) * you may specify values smaller than 20 if you want to speed up solver; 10 often results in good combination of precision and speed; sometimes you may get good results with just 6 outer iterations. Penalty - penalty coefficient for NLC optimizer: * must be non-negative * use 0 to choose default value (1.0E6 in current version) * it should be really large, 1.0E6...1.0E7 is a good value to start from; * generally, default value is good enough OUTPUT PARAMETERS: CX - central point for a sphere RLo - radius: * for ProblemType=2,3, radius of the inscribed sphere * for ProblemType=1 - zero RHi - radius: * for ProblemType=1,3, radius of the circumscribed sphere * for ProblemType=2 - zero NOTE: ON THE UNIQUENESS OF SOLUTIONS ALGLIB provides solution to several related circle fitting problems: MCC (minimum circumscribed), MIC (maximum inscribed) and MZC (minimum zone) fitting. It is important to note that among these problems only MCC is convex and has unique solution independently from starting point. As for MIC, it may (or may not, depending on dataset properties) have multiple solutions, and it always has one degenerate solution C=infinity which corresponds to infinitely large radius. Thus, there are no guarantees that solution to MIC returned by this solver will be the best one (and no one can provide you with such guarantee because problem is NP-hard). The only guarantee you have is that this solution is locally optimal, i.e. it can not be improved by infinitesimally small tweaks in the parameters. It is also possible to "run away" to infinity when started from bad initial point located outside of point cloud (or when point cloud does not span entire circumference/surface of the sphere). Finally, MZC (minimum zone circle) stands somewhere between MCC and MIC in stability. It is somewhat regularized by "circumscribed" term of the merit function; however, solutions to MZC may be non-unique, and in some unlucky cases it is also possible to "run away to infinity". NOTE: ON THE NONLINEARLY CONSTRAINED PROGRAMMING APPROACH The problem formulation for MCC (minimum circumscribed circle; for the sake of simplicity we omit MZC and MIC here) is: min(C) max(i) |XY[i]-C|^2 i.e. 
it is unconstrained nonsmooth optimization problem of finding "best" central point, with radius R being unambiguously determined from C. In order to move away from non-smoothness we use following reformulation: min(C,R) R subject to R>=0 and |XY[i]-C|^2 <= R^2 for all i, i.e. it becomes smooth quadratically constrained optimization problem with linear target function. Such problem statement is 100% equivalent to the original nonsmooth one, but much easier to approach. We solve it with MinNLC solver provided by ALGLIB. NOTE: ON INSTABILITY OF SEQUENTIAL LINEAR PROGRAMMING APPROACH ALGLIB has nonlinearly constrained solver which proved to be stable on such problems. However, some authors proposed to linearize constraints in the vicinity of current approximation (Ci,Ri) and to get next approximate solution (Ci+1,Ri+1) as solution to linear programming problem. Obviously, LP problems are easier than nonlinearly constrained ones. Indeed, SLP approach to MCC/MIC/MZC resulted in ~10-20x increase in performance (when compared with NLC solver). However, it turned out that in some cases linearized model fails to predict correct direction for next step and tells us that we converged to solution even when we are still 2-4 digits of precision away from it. It is important that it is not failure of LP solver - it is failure of the linear model; even when solved exactly, it fails to handle subtle nonlinearities which arise near the solution. We validated it by comparing results returned by ALGLIB linear solver with that of MATLAB. In our experiments with SLP solver: * MCC failed most often, at both realistic and synthetic datasets * MIC sometimes failed, but sometimes succeeded * MZC often succeeded; our guess is that presence of two independent sets of constraints (one set for Rlo and another one for Rhi) and two terms in the target function (Rlo and Rhi) regularizes task, so when linear model fails to handle nonlinearities from Rlo, it uses Rhi as a hint (and vice versa). Because SLP approach failed to achieve stable results, we do not include it in ALGLIB. -- ALGLIB -- Copyright 14.04.2017 by Bochkanov Sergey *************************************************************************/ void nsfitspherex(const real_2d_array &xy, const ae_int_t npoints, const ae_int_t nx, const ae_int_t problemtype, const double epsx, const ae_int_t aulits, const double penalty, real_1d_array &cx, double &rlo, double &rhi); /************************************************************************* This subroutine calculates the value of the bilinear or bicubic spline at the given point (X,Y). Input parameters: C - coefficients table. Built by BuildBilinearSpline or BuildBicubicSpline. X, Y- point Result: S(x,y) -- ALGLIB PROJECT -- Copyright 05.07.2007 by Bochkanov Sergey *************************************************************************/ double spline2dcalc(const spline2dinterpolant &c, const double x, const double y); /************************************************************************* This subroutine calculates the value of the bilinear or bicubic spline at the given point (X,Y) and its derivatives. Input parameters: C - spline interpolant. 
X, Y- point Output parameters: F - S(x,y) FX - dS(x,y)/dX FY - dS(x,y)/dY FXY - d2S(x,y)/dXdY -- ALGLIB PROJECT -- Copyright 05.07.2007 by Bochkanov Sergey *************************************************************************/ void spline2ddiff(const spline2dinterpolant &c, const double x, const double y, double &f, double &fx, double &fy, double &fxy); /************************************************************************* This subroutine performs linear transformation of the spline argument. Input parameters: C - spline interpolant AX, BX - transformation coefficients: x = A*t + B AY, BY - transformation coefficients: y = A*u + B Result: C - transformed spline -- ALGLIB PROJECT -- Copyright 30.06.2007 by Bochkanov Sergey *************************************************************************/ void spline2dlintransxy(const spline2dinterpolant &c, const double ax, const double bx, const double ay, const double by); /************************************************************************* This subroutine performs linear transformation of the spline. Input parameters: C - spline interpolant. A, B- transformation coefficients: S2(x,y) = A*S(x,y) + B Output parameters: C - transformed spline -- ALGLIB PROJECT -- Copyright 30.06.2007 by Bochkanov Sergey *************************************************************************/ void spline2dlintransf(const spline2dinterpolant &c, const double a, const double b); /************************************************************************* This subroutine makes the copy of the spline model. Input parameters: C - spline interpolant Output parameters: CC - spline copy -- ALGLIB PROJECT -- Copyright 29.06.2007 by Bochkanov Sergey *************************************************************************/ void spline2dcopy(const spline2dinterpolant &c, spline2dinterpolant &cc); /************************************************************************* Bicubic spline resampling Input parameters: A - function values at the old grid, array[0..OldHeight-1, 0..OldWidth-1] OldHeight - old grid height, OldHeight>1 OldWidth - old grid width, OldWidth>1 NewHeight - new grid height, NewHeight>1 NewWidth - new grid width, NewWidth>1 Output parameters: B - function values at the new grid, array[0..NewHeight-1, 0..NewWidth-1] -- ALGLIB routine -- 15 May, 2007 Copyright by Bochkanov Sergey *************************************************************************/ void spline2dresamplebicubic(const real_2d_array &a, const ae_int_t oldheight, const ae_int_t oldwidth, real_2d_array &b, const ae_int_t newheight, const ae_int_t newwidth); /************************************************************************* Bilinear spline resampling Input parameters: A - function values at the old grid, array[0..OldHeight-1, 0..OldWidth-1] OldHeight - old grid height, OldHeight>1 OldWidth - old grid width, OldWidth>1 NewHeight - new grid height, NewHeight>1 NewWidth - new grid width, NewWidth>1 Output parameters: B - function values at the new grid, array[0..NewHeight-1, 0..NewWidth-1] -- ALGLIB routine -- 09.07.2007 Copyright by Bochkanov Sergey *************************************************************************/ void spline2dresamplebilinear(const real_2d_array &a, const ae_int_t oldheight, const ae_int_t oldwidth, real_2d_array &b, const ae_int_t newheight, const ae_int_t newwidth); /************************************************************************* This subroutine builds bilinear vector-valued spline. 
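A short sketch of how such a spline is typically built and then evaluated with spline2dcalc() (scalar case D=1, illustrative grid; note the F storage order documented below):

    real_1d_array x = "[0, 1, 2]";               // N=3 abscissas
    real_1d_array y = "[0, 1]";                  // M=2 ordinates
    // D=1 values of f(x,y)=x+10*y, stored as F[D*(J*N+I)]: row j=0 first, then j=1
    real_1d_array f = "[0, 1, 2, 10, 11, 12]";
    spline2dinterpolant s;
    spline2dbuildbilinearv(x, 3, y, 2, f, 1, s);
    double v = spline2dcalc(s, 1.5, 0.5);        // bilinear interpolation; expect v=6.5 for this data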
Input parameters: X - spline abscissas, array[0..N-1] Y - spline ordinates, array[0..M-1] F - function values, array[0..M*N*D-1]: * first D elements store D values at (X[0],Y[0]) * next D elements store D values at (X[1],Y[0]) * general form - D function values at (X[i],Y[j]) are stored at F[D*(J*N+I)...D*(J*N+I)+D-1]. M,N - grid size, M>=2, N>=2 D - vector dimension, D>=1 Output parameters: C - spline interpolant -- ALGLIB PROJECT -- Copyright 16.04.2012 by Bochkanov Sergey *************************************************************************/ void spline2dbuildbilinearv(const real_1d_array &x, const ae_int_t n, const real_1d_array &y, const ae_int_t m, const real_1d_array &f, const ae_int_t d, spline2dinterpolant &c); /************************************************************************* This subroutine builds bicubic vector-valued spline. Input parameters: X - spline abscissas, array[0..N-1] Y - spline ordinates, array[0..M-1] F - function values, array[0..M*N*D-1]: * first D elements store D values at (X[0],Y[0]) * next D elements store D values at (X[1],Y[0]) * general form - D function values at (X[i],Y[j]) are stored at F[D*(J*N+I)...D*(J*N+I)+D-1]. M,N - grid size, M>=2, N>=2 D - vector dimension, D>=1 Output parameters: C - spline interpolant -- ALGLIB PROJECT -- Copyright 16.04.2012 by Bochkanov Sergey *************************************************************************/ void spline2dbuildbicubicv(const real_1d_array &x, const ae_int_t n, const real_1d_array &y, const ae_int_t m, const real_1d_array &f, const ae_int_t d, spline2dinterpolant &c); /************************************************************************* This subroutine calculates bilinear or bicubic vector-valued spline at the given point (X,Y). INPUT PARAMETERS: C - spline interpolant. X, Y- point F - output buffer, possibly preallocated array. In case array size is large enough to store result, it is not reallocated. Array which is too short will be reallocated OUTPUT PARAMETERS: F - array[D] (or larger) which stores function values -- ALGLIB PROJECT -- Copyright 16.04.2012 by Bochkanov Sergey *************************************************************************/ void spline2dcalcvbuf(const spline2dinterpolant &c, const double x, const double y, real_1d_array &f); /************************************************************************* This subroutine calculates bilinear or bicubic vector-valued spline at the given point (X,Y). INPUT PARAMETERS: C - spline interpolant. X, Y- point OUTPUT PARAMETERS: F - array[D] which stores function values. F is out-parameter and it is reallocated after call to this function. In case you want to reuse previously allocated F, you may use Spline2DCalcVBuf(), which reallocates F only when it is too small. -- ALGLIB PROJECT -- Copyright 16.04.2012 by Bochkanov Sergey *************************************************************************/ void spline2dcalcv(const spline2dinterpolant &c, const double x, const double y, real_1d_array &f); /************************************************************************* This subroutine unpacks two-dimensional spline into the coefficients table Input parameters: C - spline interpolant. Result: M, N- grid size (x-axis and y-axis) D - number of components Tbl - coefficients table, unpacked format, D - components: [0..(N-1)*(M-1)*D-1, 0..19]. 
For T=0..D-1 (component index), I = 0...N-2 (x index), J=0..M-2 (y index): K := T + I*D + J*D*(N-1) K-th row stores decomposition for T-th component of the vector-valued function Tbl[K,0] = X[i] Tbl[K,1] = X[i+1] Tbl[K,2] = Y[j] Tbl[K,3] = Y[j+1] Tbl[K,4] = C00 Tbl[K,5] = C01 Tbl[K,6] = C02 Tbl[K,7] = C03 Tbl[K,8] = C10 Tbl[K,9] = C11 ... Tbl[K,19] = C33 On each grid square spline is equals to: S(x) = SUM(c[i,j]*(t^i)*(u^j), i=0..3, j=0..3) t = x-x[j] u = y-y[i] -- ALGLIB PROJECT -- Copyright 16.04.2012 by Bochkanov Sergey *************************************************************************/ void spline2dunpackv(const spline2dinterpolant &c, ae_int_t &m, ae_int_t &n, ae_int_t &d, real_2d_array &tbl); /************************************************************************* This subroutine was deprecated in ALGLIB 3.6.0 We recommend you to switch to Spline2DBuildBilinearV(), which is more flexible and accepts its arguments in more convenient order. -- ALGLIB PROJECT -- Copyright 05.07.2007 by Bochkanov Sergey *************************************************************************/ void spline2dbuildbilinear(const real_1d_array &x, const real_1d_array &y, const real_2d_array &f, const ae_int_t m, const ae_int_t n, spline2dinterpolant &c); /************************************************************************* This subroutine was deprecated in ALGLIB 3.6.0 We recommend you to switch to Spline2DBuildBicubicV(), which is more flexible and accepts its arguments in more convenient order. -- ALGLIB PROJECT -- Copyright 05.07.2007 by Bochkanov Sergey *************************************************************************/ void spline2dbuildbicubic(const real_1d_array &x, const real_1d_array &y, const real_2d_array &f, const ae_int_t m, const ae_int_t n, spline2dinterpolant &c); /************************************************************************* This subroutine was deprecated in ALGLIB 3.6.0 We recommend you to switch to Spline2DUnpackV(), which is more flexible and accepts its arguments in more convenient order. -- ALGLIB PROJECT -- Copyright 29.06.2007 by Bochkanov Sergey *************************************************************************/ void spline2dunpack(const spline2dinterpolant &c, ae_int_t &m, ae_int_t &n, real_2d_array &tbl); /************************************************************************* This function serializes data structure to string. Important properties of s_out: * it contains alphanumeric characters, dots, underscores, minus signs * these symbols are grouped into words, which are separated by spaces and Windows-style (CR+LF) newlines * although serializer uses spaces and CR+LF as separators, you can replace any separator character by arbitrary combination of spaces, tabs, Windows or Unix newlines. It allows flexible reformatting of the string in case you want to include it into text or XML file. But you should not insert separators into the middle of the "words" nor you should change case of letters. * s_out can be freely moved between 32-bit and 64-bit systems, little and big endian machines, and so on. You can serialize structure on 32-bit machine and unserialize it on 64-bit one (or vice versa), or serialize it on SPARC and unserialize on x86. You can also serialize it in C++ version of ALGLIB and unserialize in C# one, and vice versa. 
*************************************************************************/ void rbfserialize(rbfmodel &obj, std::string &s_out); /************************************************************************* This function unserializes data structure from string. *************************************************************************/ void rbfunserialize(const std::string &s_in, rbfmodel &obj); /************************************************************************* This function serializes data structure to C++ stream. The data stream generated by this function is the same as the string representation generated by the string version of the serializer - alphanumeric characters, dots, underscores, minus signs, which are grouped into words separated by spaces and CR+LF. We recommend you to read comments on the string version of the serializer to find out more about serialization of ALGLIB objects. *************************************************************************/ void rbfserialize(rbfmodel &obj, std::ostream &s_out); /************************************************************************* This function unserializes data structure from stream. *************************************************************************/ void rbfunserialize(const std::istream &s_in, rbfmodel &obj); /************************************************************************* This function creates RBF model for a scalar (NY=1) or vector (NY>1) function in an NX-dimensional space (NX>=1). Newly created model is empty. It can be used for interpolation right after creation, but it just returns zeros. You have to add points to the model, tune interpolation settings, and then call model construction function rbfbuildmodel() which will update model according to your specification. USAGE: 1. User creates model with rbfcreate() 2. User adds dataset with rbfsetpoints() (points do NOT have to be on a regular grid) or rbfsetpointsandscales(). 3. (OPTIONAL) User chooses polynomial term by calling: * rbfsetlinterm() to set linear term * rbfsetconstterm() to set constant term * rbfsetzeroterm() to set zero term By default, linear term is used. 4. User tweaks algorithm properties with rbfsetalgohierarchical() method (or chooses one of the legacy algorithms - QNN (rbfsetalgoqnn) or ML (rbfsetalgomultilayer)). 5. User calls rbfbuildmodel() function which rebuilds model according to the specification 6. User may call rbfcalc() to calculate model value at the specified point, rbfgridcalc2v()/rbfgridcalc3v() to calculate model values at the points of a regular grid. User may extract model coefficients with rbfunpack() call. IMPORTANT: we recommend you to use latest model construction algorithm - hierarchical RBFs, which is activated by rbfsetalgohierarchical() function. This algorithm is the fastest one, and most memory-efficient. However, it is incompatible with older versions of ALGLIB (pre-3.11). So, if you serialize hierarchical model, you will be unable to load it in pre-3.11 ALGLIB. Other model types (QNN and RBF-ML) are still backward-compatible. INPUT PARAMETERS: NX - dimension of the space, NX>=1 NY - function dimension, NY>=1 OUTPUT PARAMETERS: S - RBF model (initially equals to zero) NOTE 1: memory requirements. RBF models require amount of memory which is proportional to the number of data points. Some additional memory is allocated during model construction, but most of this memory is freed after model coefficients are calculated. Amount of this additional memory depends on model construction algorithm being used. 
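The USAGE steps above can be illustrated by the following minimal sketch; the dataset, radius and layer count are purely illustrative assumptions, not recommendations:

    #include <cstdio>
    #include "interpolation.h"
    using namespace alglib;

    int main()
    {
        rbfmodel model;
        rbfreport rep;
        rbfcreate(2, 1, model);                        // step 1: NX=2, NY=1 (scalar function of two arguments)
        real_2d_array xy = "[[0,0,1],[1,0,2],[0,1,3],[1,1,4]]";   // rows: x0, x1, f(x0,x1)
        rbfsetpoints(model, xy);                       // step 2: attach dataset (no regular grid required)
        rbfsetalgohierarchical(model, 1.0, 5, 0.0);    // step 4: hierarchical RBF, RBase=1.0, 5 layers
        rbfbuildmodel(model, rep);                     // step 5: rep.terminationtype>0 on success
        printf("%.4f\n", rbfcalc2(model, 0.5, 0.5));   // step 6: evaluate at a point
        return 0;
    }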
NOTE 2: prior to ALGLIB version 3.11, RBF models supported only NX=2 or NX=3. Any attempt to create single-dimensional or more than 3-dimensional RBF model resulted in exception. ALGLIB 3.11 supports any NX>0, but models created with NX!=2 and NX!=3 are incompatible with (a) older versions of ALGLIB, (b) old model construction algorithms (QNN or RBF-ML). So, if you create a model with NX=2 or NX=3, then, depending on specific model construction algorithm being chosen, you will (QNN and RBF-ML) or will not (HierarchicalRBF) get backward compatibility with older versions of ALGLIB. You have a choice here. However, if you create a model with NX neither 2 nor 3, you have no backward compatibility from the start, and you are forced to use hierarchical RBFs and ALGLIB 3.11 or later. -- ALGLIB -- Copyright 13.12.2011, 20.06.2016 by Bochkanov Sergey *************************************************************************/ void rbfcreate(const ae_int_t nx, const ae_int_t ny, rbfmodel &s); /************************************************************************* This function creates buffer structure which can be used to perform parallel RBF model evaluations (with one RBF model instance being used from multiple threads, as long as different threads use different instances of buffer). This buffer object can be used with rbftscalcbuf() function (here "ts" stands for "thread-safe", "buf" is a suffix which denotes function which reuses previously allocated output space). How to use it: * create RBF model structure with rbfcreate() * load data, tune parameters * call rbfbuildmodel() * call rbfcreatecalcbuffer(), once per thread working with RBF model (you should call this function only AFTER call to rbfbuildmodel(), see below for more information) * call rbftscalcbuf() from different threads, with each thread working with its own copy of buffer object. INPUT PARAMETERS S - RBF model OUTPUT PARAMETERS Buf - external buffer. IMPORTANT: buffer object should be used only with RBF model object which was used to initialize buffer. Any attempt to use buffer with different object is dangerous - you may get memory violation error because sizes of internal arrays do not fit to dimensions of RBF structure. IMPORTANT: you should call this function only for model which was built with rbfbuildmodel() function, after successful invocation of rbfbuildmodel(). Sizes of some internal structures are determined only after model is built, so buffer object created before model construction stage will be useless (and any attempt to use it will result in exception). -- ALGLIB -- Copyright 02.04.2016 by Sergey Bochkanov *************************************************************************/ void rbfcreatecalcbuffer(const rbfmodel &s, rbfcalcbuffer &buf); /************************************************************************* This function adds dataset. This function overrides results of the previous calls, i.e. multiple calls of this function will result in only the last set being added. IMPORTANT: ALGLIB version 3.11 and later allows you to specify a set of per-dimension scales. Interpolation radii are multiplied by the scale vector. It may be useful if you have mixed spatio-temporal data (say, a set of 3D slices recorded at different times). You should call rbfsetpointsandscales() function to use this feature. INPUT PARAMETERS: S - RBF model, initialized by rbfcreate() call. XY - points, array[N,NX+NY]. One row corresponds to one point in the dataset. 
First NX elements are coordinates, next NY elements are function values. Array may be larger than specified, in this case only leading [N,NX+NY] elements will be used. N - number of points in the dataset After you've added dataset and (optionally) tuned algorithm settings you should call rbfbuildmodel() in order to build a model for you. NOTE: dataset added by this function is not saved during model serialization. MODEL ITSELF is serialized, but data used to build it are not. So, if you 1) add dataset to empty RBF model, 2) serialize and unserialize it, then you will get an empty RBF model with no dataset being attached. From the other side, if you call rbfbuildmodel() between (1) and (2), then after (2) you will get your fully constructed RBF model - but again with no dataset attached, so subsequent calls to rbfbuildmodel() will produce empty model. -- ALGLIB -- Copyright 13.12.2011 by Bochkanov Sergey *************************************************************************/ void rbfsetpoints(const rbfmodel &s, const real_2d_array &xy, const ae_int_t n); void rbfsetpoints(const rbfmodel &s, const real_2d_array &xy); /************************************************************************* This function adds dataset and a vector of per-dimension scales. It may be useful if you have mixed spatio-temporal data - say, a set of 3D slices recorded at different times. Such data typically require different RBF radii for spatial and temporal dimensions. ALGLIB solves this problem by specifying single RBF radius, which is (optionally) multiplied by the scale vector. This function overrides results of the previous calls, i.e. multiple calls of this function will result in only the last set being added. IMPORTANT: only HierarchicalRBF algorithm can work with scaled points. So, using this function results in RBF models which can be used in ALGLIB 3.11 or later. Previous versions of the library will be unable to unserialize models produced by HierarchicalRBF algo. Any attempt to use this function with RBF-ML or QNN algorithms will result in -3 error code being returned (incorrect algorithm). INPUT PARAMETERS: R - RBF model, initialized by rbfcreate() call. XY - points, array[N,NX+NY]. One row corresponds to one point in the dataset. First NX elements are coordinates, next NY elements are function values. Array may be larger than specified, in this case only leading [N,NX+NY] elements will be used. N - number of points in the dataset S - array[NX], scale vector, S[i]>0. After you've added dataset and (optionally) tuned algorithm settings you should call rbfbuildmodel() in order to build a model for you. NOTE: dataset added by this function is not saved during model serialization. MODEL ITSELF is serialized, but data used to build it are not. So, if you 1) add dataset to empty RBF model, 2) serialize and unserialize it, then you will get an empty RBF model with no dataset being attached. From the other side, if you call rbfbuildmodel() between (1) and (2), then after (2) you will get your fully constructed RBF model - but again with no dataset attached, so subsequent calls to rbfbuildmodel() will produce empty model. 
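A hedged sketch of the call sequence for scaled (spatio-temporal) data; the dataset and scale values below are illustrative assumptions only:

    alglib::rbfmodel model;
    alglib::rbfcreate(3, 1, model);                    // two spatial coordinates plus time, scalar values
    alglib::real_2d_array xy = "[[0.0,0.0,0.0,1.0],[1.0,0.5,10.0,2.0],[0.5,1.0,20.0,1.5]]";
    alglib::real_1d_array s = "[1.0,1.0,10.0]";        // larger scale for the (third) time dimension
    alglib::rbfsetpointsandscales(model, xy, s);       // works only with HierarchicalRBF (see note above)
    alglib::rbfsetalgohierarchical(model, 1.0, 5, 0.0);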
-- ALGLIB -- Copyright 20.06.2016 by Bochkanov Sergey *************************************************************************/ void rbfsetpointsandscales(const rbfmodel &r, const real_2d_array &xy, const ae_int_t n, const real_1d_array &s); void rbfsetpointsandscales(const rbfmodel &r, const real_2d_array &xy, const real_1d_array &s); /************************************************************************* DEPRECATED:since version 3.11 ALGLIB includes new RBF model construction algorithm, Hierarchical RBF. This algorithm is faster and requires less memory than QNN and RBF-ML. It is especially good for large-scale interpolation problems. So, we recommend you to consider Hierarchical RBF as default option. ========================================================================== This function sets RBF interpolation algorithm. ALGLIB supports several RBF algorithms with different properties. This algorithm is called RBF-QNN and it is good for point sets with following properties: a) all points are distinct b) all points are well separated. c) points distribution is approximately uniform. There is no "contour lines", clusters of points, or other small-scale structures. Algorithm description: 1) interpolation centers are allocated to data points 2) interpolation radii are calculated as distances to the nearest centers times Q coefficient (where Q is a value from [0.75,1.50]). 3) after performing (2) radii are transformed in order to avoid situation when single outlier has very large radius and influences many points across all dataset. Transformation has following form: new_r[i] = min(r[i],Z*median(r[])) where r[i] is I-th radius, median() is a median radius across entire dataset, Z is user-specified value which controls amount of deviation from median radius. When (a) is violated, we will be unable to build RBF model. When (b) or (c) are violated, model will be built, but interpolation quality will be low. See http://www.alglib.net/interpolation/ for more information on this subject. This algorithm is used by default. Additional Q parameter controls smoothness properties of the RBF basis: * Q<0.75 will give perfectly conditioned basis, but terrible smoothness properties (RBF interpolant will have sharp peaks around function values) * Q around 1.0 gives good balance between smoothness and condition number * Q>1.5 will lead to badly conditioned systems and slow convergence of the underlying linear solver (although smoothness will be very good) * Q>2.0 will effectively make optimizer useless because it won't converge within reasonable amount of iterations. It is possible to set such large Q, but it is advised not to do so. INPUT PARAMETERS: S - RBF model, initialized by RBFCreate() call Q - Q parameter, Q>0, recommended value - 1.0 Z - Z parameter, Z>0, recommended value - 5.0 NOTE: this function has some serialization-related subtleties. We recommend you to study serialization examples from ALGLIB Reference Manual if you want to perform serialization of your models. -- ALGLIB -- Copyright 13.12.2011 by Bochkanov Sergey *************************************************************************/ void rbfsetalgoqnn(const rbfmodel &s, const double q, const double z); void rbfsetalgoqnn(const rbfmodel &s); /************************************************************************* DEPRECATED:since version 3.11 ALGLIB includes new RBF model construction algorithm, Hierarchical RBF. This algorithm is faster and requires less memory than QNN and RBF-ML. 
It is especially good for large-scale interpolation problems. So, we recommend you to consider Hierarchical RBF as default option. ========================================================================== This function sets RBF interpolation algorithm. ALGLIB supports several RBF algorithms with different properties. This algorithm is called RBF-ML. It builds multilayer RBF model, i.e. model with subsequently decreasing radii, which allows us to combine smoothness (due to large radii of the first layers) with exactness (due to small radii of the last layers) and fast convergence. Internally RBF-ML uses many different means of acceleration, from sparse matrices to KD-trees, which results in algorithm whose working time is roughly proportional to N*log(N)*Density*RBase^2*NLayers, where N is a number of points, Density is an average density if points per unit of the interpolation space, RBase is an initial radius, NLayers is a number of layers. RBF-ML is good for following kinds of interpolation problems: 1. "exact" problems (perfect fit) with well separated points 2. least squares problems with arbitrary distribution of points (algorithm gives perfect fit where it is possible, and resorts to least squares fit in the hard areas). 3. noisy problems where we want to apply some controlled amount of smoothing. INPUT PARAMETERS: S - RBF model, initialized by RBFCreate() call RBase - RBase parameter, RBase>0 NLayers - NLayers parameter, NLayers>0, recommended value to start with - about 5. LambdaV - regularization value, can be useful when solving problem in the least squares sense. Optimal lambda is problem- dependent and require trial and error. In our experience, good lambda can be as large as 0.1, and you can use 0.001 as initial guess. Default value - 0.01, which is used when LambdaV is not given. You can specify zero value, but it is not recommended to do so. TUNING ALGORITHM In order to use this algorithm you have to choose three parameters: * initial radius RBase * number of layers in the model NLayers * regularization coefficient LambdaV Initial radius is easy to choose - you can pick any number several times larger than the average distance between points. Algorithm won't break down if you choose radius which is too large (model construction time will increase, but model will be built correctly). Choose such number of layers that RLast=RBase/2^(NLayers-1) (radius used by the last layer) will be smaller than the typical distance between points. In case model error is too large, you can increase number of layers. Having more layers will make model construction and evaluation proportionally slower, but it will allow you to have model which precisely fits your data. From the other side, if you want to suppress noise, you can DECREASE number of layers to make your model less flexible. Regularization coefficient LambdaV controls smoothness of the individual models built for each layer. We recommend you to use default value in case you don't want to tune this parameter, because having non-zero LambdaV accelerates and stabilizes internal iterative algorithm. In case you want to suppress noise you can use LambdaV as additional parameter (larger value = more smoothness) to tune. TYPICAL ERRORS 1. Using initial radius which is too large. Memory requirements of the RBF-ML are roughly proportional to N*Density*RBase^2 (where Density is an average density of points per unit of the interpolation space). 
In the extreme case of the very large RBase we will need O(N^2) units of memory - and many layers in order to decrease radius to some reasonably small value. 2. Using too small number of layers - RBF models with large radius are not flexible enough to reproduce small variations in the target function. You need many layers with different radii, from large to small, in order to have good model. 3. Using initial radius which is too small. You will get model with "holes" in the areas which are too far away from interpolation centers. However, algorithm will work correctly (and quickly) in this case. 4. Using too many layers - you will get too large and too slow model. This model will perfectly reproduce your function, but maybe you will be able to achieve similar results with less layers (and less memory). -- ALGLIB -- Copyright 02.03.2012 by Bochkanov Sergey *************************************************************************/ void rbfsetalgomultilayer(const rbfmodel &s, const double rbase, const ae_int_t nlayers, const double lambdav); void rbfsetalgomultilayer(const rbfmodel &s, const double rbase, const ae_int_t nlayers); /************************************************************************* This function sets RBF interpolation algorithm. ALGLIB supports several RBF algorithms with different properties. This algorithm is called Hierarchical RBF. It similar to its previous incarnation, RBF-ML, i.e. it also builds a sequence of models with decreasing radii. However, it uses more economical way of building upper layers (ones with large radii), which results in faster model construction and evaluation, as well as smaller memory footprint during construction. This algorithm has following important features: * ability to handle millions of points * controllable smoothing via nonlinearity penalization * support for NX-dimensional models with NX=1 or NX>3 (unlike QNN or RBF-ML) * support for specification of per-dimensional radii via scale vector, which is set by means of rbfsetpointsandscales() function. This feature is useful if you solve spatio-temporal interpolation problems, where different radii are required for spatial and temporal dimensions. Running times are roughly proportional to: * N*log(N)*NLayers - for model construction * N*NLayers - for model evaluation You may see that running time does not depend on search radius or points density, just on number of layers in the hierarchy. IMPORTANT: this model construction algorithm was introduced in ALGLIB 3.11 and produces models which are INCOMPATIBLE with previous versions of ALGLIB. You can not unserialize models produced with this function in ALGLIB 3.10 or earlier. INPUT PARAMETERS: S - RBF model, initialized by rbfcreate() call RBase - RBase parameter, RBase>0 NLayers - NLayers parameter, NLayers>0, recommended value to start with - about 5. LambdaNS- >=0, nonlinearity penalty coefficient, negative values are not allowed. This parameter adds controllable smoothing to the problem, which may reduce noise. Specification of non- zero lambda means that in addition to fitting error solver will also minimize LambdaNS*|S''(x)|^2 (appropriately generalized to multiple dimensions. Specification of exactly zero value means that no penalty is added (we do not even evaluate matrix of second derivatives which is necessary for smoothing). Calculation of nonlinearity penalty is costly - it results in several-fold increase of model construction time. Evaluation time remains the same. 
Optimal lambda is problem-dependent and requires trial and error. Good value to start from is 1e-5...1e-6, which corresponds to slightly noticeable smoothing of the function. Value 1e-2 usually means that quite heavy smoothing is applied. TUNING ALGORITHM In order to use this algorithm you have to choose three parameters: * initial radius RBase * number of layers in the model NLayers * penalty coefficient LambdaNS Initial radius is easy to choose - you can pick any number several times larger than the average distance between points. Algorithm won't break down if you choose radius which is too large (model construction time will increase, but model will be built correctly). Choose such number of layers that RLast=RBase/2^(NLayers-1) (radius used by the last layer) will be smaller than the typical distance between points. In case model error is too large, you can increase number of layers. Having more layers will make model construction and evaluation proportionally slower, but it will allow you to have model which precisely fits your data. From the other side, if you want to suppress noise, you can DECREASE number of layers to make your model less flexible (or specify non-zero LambdaNS). TYPICAL ERRORS 1. Using too small number of layers - RBF models with large radius are not flexible enough to reproduce small variations in the target function. You need many layers with different radii, from large to small, in order to have good model. 2. Using initial radius which is too small. You will get model with "holes" in the areas which are too far away from interpolation centers. However, algorithm will work correctly (and quickly) in this case. -- ALGLIB -- Copyright 20.06.2016 by Bochkanov Sergey *************************************************************************/ void rbfsetalgohierarchical(const rbfmodel &s, const double rbase, const ae_int_t nlayers, const double lambdans); /************************************************************************* This function sets linear term (model is a sum of radial basis functions plus linear polynomial). This function won't have effect until next call to RBFBuildModel(). INPUT PARAMETERS: S - RBF model, initialized by RBFCreate() call NOTE: this function has some serialization-related subtleties. We recommend you to study serialization examples from ALGLIB Reference Manual if you want to perform serialization of your models. -- ALGLIB -- Copyright 13.12.2011 by Bochkanov Sergey *************************************************************************/ void rbfsetlinterm(const rbfmodel &s); /************************************************************************* This function sets constant term (model is a sum of radial basis functions plus constant). This function won't have effect until next call to RBFBuildModel(). INPUT PARAMETERS: S - RBF model, initialized by RBFCreate() call NOTE: this function has some serialization-related subtleties. We recommend you to study serialization examples from ALGLIB Reference Manual if you want to perform serialization of your models. -- ALGLIB -- Copyright 13.12.2011 by Bochkanov Sergey *************************************************************************/ void rbfsetconstterm(const rbfmodel &s); /************************************************************************* This function sets zero term (model is a sum of radial basis functions without polynomial term). This function won't have effect until next call to RBFBuildModel(). 
INPUT PARAMETERS: S - RBF model, initialized by RBFCreate() call NOTE: this function has some serialization-related subtleties. We recommend you to study serialization examples from ALGLIB Reference Manual if you want to perform serialization of your models. -- ALGLIB -- Copyright 13.12.2011 by Bochkanov Sergey *************************************************************************/ void rbfsetzeroterm(const rbfmodel &s); /************************************************************************* This function sets basis function type, which can be: * 0 for classic Gaussian * 1 for fast and compact bell-like basis function, which becomes exactly zero at distance equal to 3*R (default option). INPUT PARAMETERS: S - RBF model, initialized by RBFCreate() call BF - basis function type: * 0 - classic Gaussian * 1 - fast and compact one -- ALGLIB -- Copyright 01.02.2017 by Bochkanov Sergey *************************************************************************/ void rbfsetv2bf(const rbfmodel &s, const ae_int_t bf); /************************************************************************* This function sets stopping criteria of the underlying linear solver for hierarchical (version 2) RBF constructor. INPUT PARAMETERS: S - RBF model, initialized by RBFCreate() call MaxIts - this criterion will stop algorithm after MaxIts iterations. Typically a few hundreds iterations is required, with 400 being a good default value to start experimentation. Zero value means that default value will be selected. -- ALGLIB -- Copyright 01.02.2017 by Bochkanov Sergey *************************************************************************/ void rbfsetv2its(const rbfmodel &s, const ae_int_t maxits); /************************************************************************* This function sets support radius parameter of hierarchical (version 2) RBF constructor. Hierarchical RBF model achieves great speed-up by removing from the model excessive (too dense) nodes. Say, if you have RBF radius equal to 1 meter, and two nodes are just 1 millimeter apart, you may remove one of them without reducing model quality. Support radius parameter is used to justify which points need removal, and which do not. If two points are less than SUPPORT_R*CUR_RADIUS units of distance apart, one of them is removed from the model. The larger support radius is, the faster model construction AND evaluation are. However, too large values result in "bumpy" models. INPUT PARAMETERS: S - RBF model, initialized by RBFCreate() call R - support radius coefficient, >=0. Recommended values are [0.1,0.4] range, with 0.1 being default value. -- ALGLIB -- Copyright 01.02.2017 by Bochkanov Sergey *************************************************************************/ void rbfsetv2supportr(const rbfmodel &s, const double r); /************************************************************************* This function builds RBF model and returns report (contains some information which can be used for evaluation of the algorithm properties). Call to this function modifies RBF model by calculating its centers/radii/ weights and saving them into RBFModel structure. Initially RBFModel contain zero coefficients, but after call to this function we will have coefficients which were calculated in order to fit our dataset. After you called this function you can call RBFCalc(), RBFGridCalc() and other model calculation functions. 
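Before the formal parameter description below, a minimal sketch of a typical build-and-check sequence (it assumes "model" was prepared with rbfcreate()/rbfsetpoints() as shown earlier, and that <cstdio> is available):

    alglib::rbfreport rep;
    alglib::rbfbuildmodel(model, rep);
    if( rep.terminationtype>0 )
        printf("built, rms error at nodes = %0.6f\n", double(rep.rmserror));   // NAN for legacy QNN/RBF-ML solvers
    else
        printf("build failed, terminationtype = %d\n", int(rep.terminationtype));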
INPUT PARAMETERS: S - RBF model, initialized by RBFCreate() call Rep - report: * Rep.TerminationType: * -5 - non-distinct basis function centers were detected, interpolation aborted; only QNN returns this error code, other algorithms can handle non-distinct nodes. * -4 - nonconvergence of the internal SVD solver * -3 - incorrect model construction algorithm was chosen: QNN or RBF-ML, combined with one of the incompatible features - NX=1 or NX>3; points with per-dimension scales. * 1 - successful termination Fields which are set only by modern RBF solvers (hierarchical or nonnegative; older solvers like QNN and ML initialize these fields by NANs): * rep.rmserror - root-mean-square error at nodes * rep.maxerror - maximum error at nodes Fields used for debugging purposes: * Rep.IterationsCount - iterations count of the LSQR solver * Rep.NMV - number of matrix-vector products * Rep.ARows - rows count for the system matrix * Rep.ACols - columns count for the system matrix * Rep.ANNZ - number of significantly non-zero elements (elements above some algorithm-determined threshold) NOTE: failure to build model will leave current state of the structure unchanged. -- ALGLIB -- Copyright 13.12.2011 by Bochkanov Sergey *************************************************************************/ void rbfbuildmodel(const rbfmodel &s, rbfreport &rep); /************************************************************************* This function calculates value of the RBF model at the given point. IMPORTANT: this function works only with modern (hierarchical) RBFs. It cannot be used with legacy (version 1) RBFs because older RBF code does not support 1-dimensional models. This function should be used when we have NY=1 (scalar function) and NX=1 (1-dimensional space). If you have 3-dimensional space, use rbfcalc3(). If you have 2-dimensional space, use rbfcalc2(). If you have general situation (NX-dimensional space, NY-dimensional function) you should use generic rbfcalc(). If you want to perform parallel model evaluation from multiple threads, use rbftscalcbuf() with per-thread buffer object. This function returns 0.0 when: * model is not initialized * NX<>1 * NY<>1 INPUT PARAMETERS: S - RBF model X0 - X-coordinate, finite number RESULT: value of the model or 0.0 (as defined above) -- ALGLIB -- Copyright 13.12.2011 by Bochkanov Sergey *************************************************************************/ double rbfcalc1(const rbfmodel &s, const double x0); /************************************************************************* This function calculates value of the RBF model at the given point. This function should be used when we have NY=1 (scalar function) and NX=2 (2-dimensional space). If you have 3-dimensional space, use rbfcalc3(). If you have general situation (NX-dimensional space, NY-dimensional function) you should use generic rbfcalc(). If you want to calculate function values many times, consider using rbfgridcalc2v(), which is far more efficient than many subsequent calls to rbfcalc2(). If you want to perform parallel model evaluation from multiple threads, use rbftscalcbuf() with per-thread buffer object. 
This function returns 0.0 when: * model is not initialized * NX<>2 * NY<>1 INPUT PARAMETERS: S - RBF model X0 - first coordinate, finite number X1 - second coordinate, finite number RESULT: value of the model or 0.0 (as defined above) -- ALGLIB -- Copyright 13.12.2011 by Bochkanov Sergey *************************************************************************/ double rbfcalc2(const rbfmodel &s, const double x0, const double x1); /************************************************************************* This function calculates value of the RBF model at the given point. This function should be used when we have NY=1 (scalar function) and NX=3 (3-dimensional space). If you have 2-dimensional space, use rbfcalc2(). If you have general situation (NX-dimensional space, NY-dimensional function) you should use generic rbfcalc(). If you want to calculate function values many times, consider using rbfgridcalc3v(), which is far more efficient than many subsequent calls to rbfcalc3(). If you want to perform parallel model evaluation from multiple threads, use rbftscalcbuf() with per-thread buffer object. This function returns 0.0 when: * model is not initialized * NX<>3 * NY<>1 INPUT PARAMETERS: S - RBF model X0 - first coordinate, finite number X1 - second coordinate, finite number X2 - third coordinate, finite number RESULT: value of the model or 0.0 (as defined above) -- ALGLIB -- Copyright 13.12.2011 by Bochkanov Sergey *************************************************************************/ double rbfcalc3(const rbfmodel &s, const double x0, const double x1, const double x2); /************************************************************************* This function calculates values of the RBF model at the given point. This is a general function which can be used for arbitrary NX (dimension of the space of arguments) and NY (dimension of the function itself). However, when you have NY=1 you may find it more convenient to use rbfcalc2() or rbfcalc3(). If you want to perform parallel model evaluation from multiple threads, use rbftscalcbuf() with per-thread buffer object. This function returns 0.0 when model is not initialized. INPUT PARAMETERS: S - RBF model X - coordinates, array[NX]. X may have more than NX elements, in this case only leading NX will be used. OUTPUT PARAMETERS: Y - function value, array[NY]. Y is out-parameter and reallocated after call to this function. In case you want to reuse previously allocated Y, you may use RBFCalcBuf(), which reallocates Y only when it is too small. -- ALGLIB -- Copyright 13.12.2011 by Bochkanov Sergey *************************************************************************/ void rbfcalc(const rbfmodel &s, const real_1d_array &x, real_1d_array &y); /************************************************************************* This function calculates values of the RBF model at the given point. Same as rbfcalc(), but does not reallocate Y when it is large enough to store function values. If you want to perform parallel model evaluation from multiple threads, use rbftscalcbuf() with per-thread buffer object. INPUT PARAMETERS: S - RBF model X - coordinates, array[NX]. X may have more than NX elements, in this case only leading NX will be used. Y - possibly preallocated array OUTPUT PARAMETERS: Y - function value, array[NY]. Y is not reallocated when it is larger than NY. 
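A hedged sketch of buffered and thread-local evaluation using rbfcalcbuf() above and the rbfcreatecalcbuffer()/rbftscalcbuf() pair described nearby ("model" is assumed to be a built 2-dimensional scalar model):

    alglib::real_1d_array x = "[0.5,0.5]";
    alglib::real_1d_array y;                           // reused between calls, reallocated only when too small
    alglib::rbfcalcbuf(model, x, y);                   // same result as rbfcalc(), but y is reused

    alglib::rbfcalcbuffer buf;                         // one buffer per thread, created AFTER rbfbuildmodel()
    alglib::rbfcreatecalcbuffer(model, buf);
    alglib::rbftscalcbuf(model, buf, x, y);            // model itself is shared read-only between threads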
-- ALGLIB -- Copyright 13.12.2011 by Bochkanov Sergey *************************************************************************/ void rbfcalcbuf(const rbfmodel &s, const real_1d_array &x, real_1d_array &y); /************************************************************************* This function calculates values of the RBF model at the given point, using external buffer object (internal temporaries of RBF model are not modified). This function allows to use same RBF model object in different threads, assuming that different threads use different instances of buffer structure. INPUT PARAMETERS: S - RBF model, may be shared between different threads Buf - buffer object created for this particular instance of RBF model with rbfcreatecalcbuffer(). X - coordinates, array[NX]. X may have more than NX elements, in this case only leading NX will be used. Y - possibly preallocated array OUTPUT PARAMETERS: Y - function value, array[NY]. Y is not reallocated when it is larger than NY. -- ALGLIB -- Copyright 13.12.2011 by Bochkanov Sergey *************************************************************************/ void rbftscalcbuf(const rbfmodel &s, const rbfcalcbuffer &buf, const real_1d_array &x, real_1d_array &y); /************************************************************************* This is legacy function for gridded calculation of RBF model. It is superseded by rbfgridcalc2v() and rbfgridcalc2vsubset() functions. -- ALGLIB -- Copyright 13.12.2011 by Bochkanov Sergey *************************************************************************/ void rbfgridcalc2(const rbfmodel &s, const real_1d_array &x0, const ae_int_t n0, const real_1d_array &x1, const ae_int_t n1, real_2d_array &y); /************************************************************************* This function calculates values of the RBF model at the regular grid, which has N0*N1 points, with Point[I,J] = (X0[I], X1[J]). Vector-valued RBF models are supported. This function returns 0.0 when: * model is not initialized * NX<>2 FOR USERS OF COMMERCIAL EDITION: ! Commercial version of ALGLIB includes multicore support (C++ and C# ! computational cores) for this function. ! ! Parallel processing gives close-to-linear speedup on multicore systems, ! assuming that problem is large enough to be divided between cores. ! You should solve at least 200x200 problem to get nearly-linear speed- ! up (high efficiency). ! ! Parallel processing is implemented only for modern (hierarchical) ! RBFs. Legacy version 1 RBFs (created by QNN or RBF-ML) are still ! processed serially. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! This note is given for users of commercial edition; if you use GPL ! edition, you still will be able to call smp-version of this function, ! but all computations will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. INPUT PARAMETERS: S - RBF model, used in read-only mode, can be shared between multiple invocations of this function from multiple threads. X0 - array of grid nodes, first coordinates, array[N0]. Must be ordered by ascending. Exception is generated if the array is not correctly ordered. N0 - grid size (number of nodes) in the first dimension X1 - array of grid nodes, second coordinates, array[N1] Must be ordered by ascending. 
Exception is generated if the array is not correctly ordered. N1 - grid size (number of nodes) in the second dimension OUTPUT PARAMETERS: Y - function values, array[NY*N0*N1], where NY is a number of "output" vector values (this function supports vector- valued RBF models). Y is out-variable and is reallocated by this function. Y[K+NY*(I0+I1*N0)]=F_k(X0[I0],X1[I1]), for: * K=0...NY-1 * I0=0...N0-1 * I1=0...N1-1 NOTE: this function supports weakly ordered grid nodes, i.e. you may have X[i]=X[i+1] for some i. It does not provide you any performance benefits due to duplication of points, just convenience and flexibility. NOTE: this function is re-entrant, i.e. you may use same rbfmodel structure in multiple threads calling this function for different grids. NOTE: if you need function values on some subset of regular grid, which may be described as "several compact and dense islands", you may use rbfgridcalc2vsubset(). -- ALGLIB -- Copyright 27.01.2017 by Bochkanov Sergey *************************************************************************/ void rbfgridcalc2v(const rbfmodel &s, const real_1d_array &x0, const ae_int_t n0, const real_1d_array &x1, const ae_int_t n1, real_1d_array &y); void smp_rbfgridcalc2v(const rbfmodel &s, const real_1d_array &x0, const ae_int_t n0, const real_1d_array &x1, const ae_int_t n1, real_1d_array &y); /************************************************************************* This function calculates values of the RBF model at some subset of regular grid: * grid has N0*N1 points, with Point[I,J] = (X0[I], X1[J]) * only values at some subset of this grid are required Vector-valued RBF models are supported. This function returns 0.0 when: * model is not initialized * NX<>2 FOR USERS OF COMMERCIAL EDITION: ! Commercial version of ALGLIB includes multicore support (C++ and C# ! computational cores) for this function. ! ! Parallel processing gives close-to-linear speedup on multicore systems, ! assuming that problem is large enough to be divided between cores. ! You should solve at least 200x200 problem to get nearly-linear speed- ! up (high efficiency). ! ! Parallel processing is implemented only for modern (hierarchical) ! RBFs. Legacy version 1 RBFs (created by QNN or RBF-ML) are still ! processed serially. ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! This note is given for users of commercial edition; if you use GPL ! edition, you still will be able to call smp-version of this function, ! but all computations will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. INPUT PARAMETERS: S - RBF model, used in read-only mode, can be shared between multiple invocations of this function from multiple threads. X0 - array of grid nodes, first coordinates, array[N0]. Must be ordered by ascending. Exception is generated if the array is not correctly ordered. N0 - grid size (number of nodes) in the first dimension X1 - array of grid nodes, second coordinates, array[N1] Must be ordered by ascending. Exception is generated if the array is not correctly ordered. N1 - grid size (number of nodes) in the second dimension FlagY - array[N0*N1]: * Y[I0+I1*N0] corresponds to node (X0[I0],X1[I1]) * it is a "bitmap" array which contains False for nodes which are NOT calculated, and True for nodes which are required. 
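A hedged sketch of grid evaluation with rbfgridcalc2v() (documented above) and of preparing the FlagY bitmap for the subset variant whose output parameters follow; grid sizes and flag values are illustrative assumptions:

    alglib::real_1d_array gx0 = "[0.0,0.5,1.0]";        // N0=3, ascending
    alglib::real_1d_array gx1 = "[0.0,1.0]";            // N1=2, ascending
    alglib::real_1d_array gy;
    alglib::rbfgridcalc2v(model, gx0, 3, gx1, 2, gy);   // gy[k+ny*(i0+i1*n0)] = F_k(gx0[i0],gx1[i1])

    alglib::boolean_1d_array flags = "[true,false,true,false,true,false]";   // one flag per grid node
    alglib::rbfgridcalc2vsubset(model, gx0, 3, gx1, 2, flags, gy);           // only flagged nodes are required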
OUTPUT PARAMETERS: Y - function values, array[NY*N0*N1], where NY is a number of "output" vector values (this function supports vector- valued RBF models): * Y[K+NY*(I0+I1*N0)]=F_k(X0[I0],X1[I1]), for K=0...NY-1, I0=0...N0-1, I1=0...N1-1. * elements of Y[] which correspond to FlagY[]=True are loaded by model values (which may be exactly zero for some nodes). * elements of Y[] which correspond to FlagY[]=False MAY be initialized by zeros OR may be calculated. This function processes grid as a hierarchy of nested blocks and micro-rows. If just one element of micro-row is required, entire micro-row (up to 8 nodes in the current version, but no promises) is calculated. NOTE: this function supports weakly ordered grid nodes, i.e. you may have X[i]=X[i+1] for some i. It does not provide you any performance benefits due to duplication of points, just convenience and flexibility. NOTE: this function is re-entrant, i.e. you may use same rbfmodel structure in multiple threads calling this function for different grids. -- ALGLIB -- Copyright 04.03.2016 by Bochkanov Sergey *************************************************************************/ void rbfgridcalc2vsubset(const rbfmodel &s, const real_1d_array &x0, const ae_int_t n0, const real_1d_array &x1, const ae_int_t n1, const boolean_1d_array &flagy, real_1d_array &y); void smp_rbfgridcalc2vsubset(const rbfmodel &s, const real_1d_array &x0, const ae_int_t n0, const real_1d_array &x1, const ae_int_t n1, const boolean_1d_array &flagy, real_1d_array &y); /************************************************************************* This function calculates values of the RBF model at the regular grid, which has N0*N1*N2 points, with Point[I,J,K] = (X0[I], X1[J], X2[K]). Vector-valued RBF models are supported. This function returns 0.0 when: * model is not initialized * NX<>3 FOR USERS OF COMMERCIAL EDITION: ! Commercial version of ALGLIB includes multicore support (C++ and C# ! computational cores) for this function. ! ! Parallel processing gives close-to-linear speedup on multicore systems, ! assuming that problem is large enough to be divided between cores. ! You should solve at least 50x50x50 problem to get nearly-linear speed- ! up (high efficiency). ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! This note is given for users of commercial edition; if you use GPL ! edition, you still will be able to call smp-version of this function, ! but all computations will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. INPUT PARAMETERS: S - RBF model, used in read-only mode, can be shared between multiple invocations of this function from multiple threads. X0 - array of grid nodes, first coordinates, array[N0]. Must be ordered by ascending. Exception is generated if the array is not correctly ordered. N0 - grid size (number of nodes) in the first dimension X1 - array of grid nodes, second coordinates, array[N1] Must be ordered by ascending. Exception is generated if the array is not correctly ordered. N1 - grid size (number of nodes) in the second dimension X2 - array of grid nodes, third coordinates, array[N2] Must be ordered by ascending. Exception is generated if the array is not correctly ordered. 
N2 - grid size (number of nodes) in the third dimension OUTPUT PARAMETERS: Y - function values, array[NY*N0*N1*N2], where NY is a number of "output" vector values (this function supports vector- valued RBF models). Y is out-variable and is reallocated by this function. Y[K+NY*(I0+I1*N0+I2*N0*N1)]=F_k(X0[I0],X1[I1],X2[I2]), for: * K=0...NY-1 * I0=0...N0-1 * I1=0...N1-1 * I2=0...N2-1 NOTE: this function supports weakly ordered grid nodes, i.e. you may have X[i]=X[i+1] for some i. It does not provide you any performance benefits due to duplication of points, just convenience and flexibility. NOTE: this function is re-entrant, i.e. you may use same rbfmodel structure in multiple threads calling this function for different grids. NOTE: if you need function values on some subset of regular grid, which may be described as "several compact and dense islands", you may use rbfgridcalc3vsubset(). -- ALGLIB -- Copyright 04.03.2016 by Bochkanov Sergey *************************************************************************/ void rbfgridcalc3v(const rbfmodel &s, const real_1d_array &x0, const ae_int_t n0, const real_1d_array &x1, const ae_int_t n1, const real_1d_array &x2, const ae_int_t n2, real_1d_array &y); void smp_rbfgridcalc3v(const rbfmodel &s, const real_1d_array &x0, const ae_int_t n0, const real_1d_array &x1, const ae_int_t n1, const real_1d_array &x2, const ae_int_t n2, real_1d_array &y); /************************************************************************* This function calculates values of the RBF model at some subset of regular grid: * grid has N0*N1*N2 points, with Point[I,J,K] = (X0[I], X1[J], X2[K]) * only values at some subset of this grid are required Vector-valued RBF models are supported. This function returns 0.0 when: * model is not initialized * NX<>3 FOR USERS OF COMMERCIAL EDITION: ! Commercial version of ALGLIB includes multicore support (C++ and C# ! computational cores) for this function. ! ! Parallel processing gives close-to-linear speedup on multicore systems, ! assuming that problem is large enough to be divided between cores. ! You should solve at least 50x50x50 problem to get nearly-linear speed- ! up (high efficiency). ! ! In order to use multicore features you have to: ! * use commercial version of ALGLIB ! * call this function with "smp_" prefix, which indicates that ! multicore code will be used (for multicore support) ! ! This note is given for users of commercial edition; if you use GPL ! edition, you still will be able to call smp-version of this function, ! but all computations will be done serially. ! ! We recommend you to carefully read ALGLIB Reference Manual, section ! called 'SMP support', before using parallel version of this function. INPUT PARAMETERS: S - RBF model, used in read-only mode, can be shared between multiple invocations of this function from multiple threads. X0 - array of grid nodes, first coordinates, array[N0]. Must be ordered by ascending. Exception is generated if the array is not correctly ordered. N0 - grid size (number of nodes) in the first dimension X1 - array of grid nodes, second coordinates, array[N1] Must be ordered by ascending. Exception is generated if the array is not correctly ordered. N1 - grid size (number of nodes) in the second dimension X2 - array of grid nodes, third coordinates, array[N2] Must be ordered by ascending. Exception is generated if the array is not correctly ordered. 
N2 - grid size (number of nodes) in the third dimension FlagY - array[N0*N1*N2]: * Y[I0+I1*N0+I2*N0*N1] corresponds to node (X0[I0],X1[I1],X2[I2]) * it is a "bitmap" array which contains False for nodes which are NOT calculated, and True for nodes which are required. OUTPUT PARAMETERS: Y - function values, array[NY*N0*N1*N2], where NY is a number of "output" vector values (this function supports vector- valued RBF models): * Y[K+NY*(I0+I1*N0+I2*N0*N1)]=F_k(X0[I0],X1[I1],X2[I2]), for K=0...NY-1, I0=0...N0-1, I1=0...N1-1, I2=0...N2-1. * elements of Y[] which correspond to FlagY[]=True are loaded by model values (which may be exactly zero for some nodes). * elements of Y[] which correspond to FlagY[]=False MAY be initialized by zeros OR may be calculated. This function processes grid as a hierarchy of nested blocks and micro-rows. If just one element of micro-row is required, entire micro-row (up to 8 nodes in the current version, but no promises) is calculated. NOTE: this function supports weakly ordered grid nodes, i.e. you may have X[i]=X[i+1] for some i. It does not provide you any performance benefits due to duplication of points, just convenience and flexibility. NOTE: this function is re-entrant, i.e. you may use same rbfmodel structure in multiple threads calling this function for different grids. -- ALGLIB -- Copyright 04.03.2016 by Bochkanov Sergey *************************************************************************/ void rbfgridcalc3vsubset(const rbfmodel &s, const real_1d_array &x0, const ae_int_t n0, const real_1d_array &x1, const ae_int_t n1, const real_1d_array &x2, const ae_int_t n2, const boolean_1d_array &flagy, real_1d_array &y); void smp_rbfgridcalc3vsubset(const rbfmodel &s, const real_1d_array &x0, const ae_int_t n0, const real_1d_array &x1, const ae_int_t n1, const real_1d_array &x2, const ae_int_t n2, const boolean_1d_array &flagy, real_1d_array &y); /************************************************************************* This function "unpacks" RBF model by extracting its coefficients. INPUT PARAMETERS: S - RBF model OUTPUT PARAMETERS: NX - dimensionality of argument NY - dimensionality of the target function XWR - model information, array[NC,NX+NY+1]. One row of the array corresponds to one basis function: * first NX columns - coordinates of the center * next NY columns - weights, one per dimension of the function being modelled For ModelVersion=1: * last column - radius, same for all dimensions of the function being modelled For ModelVersion=2: * last NX columns - radii, one per dimension NC - number of the centers V - polynomial term , array[NY,NX+1]. One row per one dimension of the function being modelled. First NX elements are linear coefficients, V[NX] is equal to the constant part. ModelVersion-version of the RBF model: * 1 - for models created by QNN and RBF-ML algorithms, compatible with ALGLIB 3.10 or earlier. * 2 - for models created by HierarchicalRBF, requires ALGLIB 3.11 or later -- ALGLIB -- Copyright 13.12.2011 by Bochkanov Sergey *************************************************************************/ void rbfunpack(const rbfmodel &s, ae_int_t &nx, ae_int_t &ny, real_2d_array &xwr, ae_int_t &nc, real_2d_array &v, ae_int_t &modelversion); /************************************************************************* This function returns model version. INPUT PARAMETERS: S - RBF model RESULT: * 1 - for models created by QNN and RBF-ML algorithms, compatible with ALGLIB 3.10 or earlier. 
* 2 - for models created by HierarchicalRBF, requires ALGLIB 3.11 or later -- ALGLIB -- Copyright 06.07.2016 by Bochkanov Sergey *************************************************************************/ ae_int_t rbfgetmodelversion(const rbfmodel &s); } ///////////////////////////////////////////////////////////////////////// // // THIS SECTION CONTAINS COMPUTATIONAL CORE DECLARATIONS (FUNCTIONS) // ///////////////////////////////////////////////////////////////////////// namespace alglib_impl { double idwcalc(idwinterpolant* z, /* Real */ ae_vector* x, ae_state *_state); void idwbuildmodifiedshepard(/* Real */ ae_matrix* xy, ae_int_t n, ae_int_t nx, ae_int_t d, ae_int_t nq, ae_int_t nw, idwinterpolant* z, ae_state *_state); void idwbuildmodifiedshepardr(/* Real */ ae_matrix* xy, ae_int_t n, ae_int_t nx, double r, idwinterpolant* z, ae_state *_state); void idwbuildnoisy(/* Real */ ae_matrix* xy, ae_int_t n, ae_int_t nx, ae_int_t d, ae_int_t nq, ae_int_t nw, idwinterpolant* z, ae_state *_state); void _idwinterpolant_init(void* _p, ae_state *_state); void _idwinterpolant_init_copy(void* _dst, void* _src, ae_state *_state); void _idwinterpolant_clear(void* _p); void _idwinterpolant_destroy(void* _p); double barycentriccalc(barycentricinterpolant* b, double t, ae_state *_state); void barycentricdiff1(barycentricinterpolant* b, double t, double* f, double* df, ae_state *_state); void barycentricdiff2(barycentricinterpolant* b, double t, double* f, double* df, double* d2f, ae_state *_state); void barycentriclintransx(barycentricinterpolant* b, double ca, double cb, ae_state *_state); void barycentriclintransy(barycentricinterpolant* b, double ca, double cb, ae_state *_state); void barycentricunpack(barycentricinterpolant* b, ae_int_t* n, /* Real */ ae_vector* x, /* Real */ ae_vector* y, /* Real */ ae_vector* w, ae_state *_state); void barycentricbuildxyw(/* Real */ ae_vector* x, /* Real */ ae_vector* y, /* Real */ ae_vector* w, ae_int_t n, barycentricinterpolant* b, ae_state *_state); void barycentricbuildfloaterhormann(/* Real */ ae_vector* x, /* Real */ ae_vector* y, ae_int_t n, ae_int_t d, barycentricinterpolant* b, ae_state *_state); void barycentriccopy(barycentricinterpolant* b, barycentricinterpolant* b2, ae_state *_state); void _barycentricinterpolant_init(void* _p, ae_state *_state); void _barycentricinterpolant_init_copy(void* _dst, void* _src, ae_state *_state); void _barycentricinterpolant_clear(void* _p); void _barycentricinterpolant_destroy(void* _p); void spline1dbuildlinear(/* Real */ ae_vector* x, /* Real */ ae_vector* y, ae_int_t n, spline1dinterpolant* c, ae_state *_state); void spline1dbuildcubic(/* Real */ ae_vector* x, /* Real */ ae_vector* y, ae_int_t n, ae_int_t boundltype, double boundl, ae_int_t boundrtype, double boundr, spline1dinterpolant* c, ae_state *_state); void spline1dgriddiffcubic(/* Real */ ae_vector* x, /* Real */ ae_vector* y, ae_int_t n, ae_int_t boundltype, double boundl, ae_int_t boundrtype, double boundr, /* Real */ ae_vector* d, ae_state *_state); void spline1dgriddiff2cubic(/* Real */ ae_vector* x, /* Real */ ae_vector* y, ae_int_t n, ae_int_t boundltype, double boundl, ae_int_t boundrtype, double boundr, /* Real */ ae_vector* d1, /* Real */ ae_vector* d2, ae_state *_state); void spline1dconvcubic(/* Real */ ae_vector* x, /* Real */ ae_vector* y, ae_int_t n, ae_int_t boundltype, double boundl, ae_int_t boundrtype, double boundr, /* Real */ ae_vector* x2, ae_int_t n2, /* Real */ ae_vector* y2, ae_state *_state); void spline1dconvdiffcubic(/* Real */ 
ae_vector* x, /* Real */ ae_vector* y, ae_int_t n, ae_int_t boundltype, double boundl, ae_int_t boundrtype, double boundr, /* Real */ ae_vector* x2, ae_int_t n2, /* Real */ ae_vector* y2, /* Real */ ae_vector* d2, ae_state *_state); void spline1dconvdiff2cubic(/* Real */ ae_vector* x, /* Real */ ae_vector* y, ae_int_t n, ae_int_t boundltype, double boundl, ae_int_t boundrtype, double boundr, /* Real */ ae_vector* x2, ae_int_t n2, /* Real */ ae_vector* y2, /* Real */ ae_vector* d2, /* Real */ ae_vector* dd2, ae_state *_state); void spline1dbuildcatmullrom(/* Real */ ae_vector* x, /* Real */ ae_vector* y, ae_int_t n, ae_int_t boundtype, double tension, spline1dinterpolant* c, ae_state *_state); void spline1dbuildhermite(/* Real */ ae_vector* x, /* Real */ ae_vector* y, /* Real */ ae_vector* d, ae_int_t n, spline1dinterpolant* c, ae_state *_state); void spline1dbuildakima(/* Real */ ae_vector* x, /* Real */ ae_vector* y, ae_int_t n, spline1dinterpolant* c, ae_state *_state); double spline1dcalc(spline1dinterpolant* c, double x, ae_state *_state); void spline1ddiff(spline1dinterpolant* c, double x, double* s, double* ds, double* d2s, ae_state *_state); void spline1dcopy(spline1dinterpolant* c, spline1dinterpolant* cc, ae_state *_state); void spline1dunpack(spline1dinterpolant* c, ae_int_t* n, /* Real */ ae_matrix* tbl, ae_state *_state); void spline1dlintransx(spline1dinterpolant* c, double a, double b, ae_state *_state); void spline1dlintransy(spline1dinterpolant* c, double a, double b, ae_state *_state); double spline1dintegrate(spline1dinterpolant* c, double x, ae_state *_state); void spline1dconvdiffinternal(/* Real */ ae_vector* xold, /* Real */ ae_vector* yold, /* Real */ ae_vector* dold, ae_int_t n, /* Real */ ae_vector* x2, ae_int_t n2, /* Real */ ae_vector* y, ae_bool needy, /* Real */ ae_vector* d1, ae_bool needd1, /* Real */ ae_vector* d2, ae_bool needd2, ae_state *_state); void spline1drootsandextrema(spline1dinterpolant* c, /* Real */ ae_vector* r, ae_int_t* nr, ae_bool* dr, /* Real */ ae_vector* e, /* Integer */ ae_vector* et, ae_int_t* ne, ae_bool* de, ae_state *_state); void heapsortdpoints(/* Real */ ae_vector* x, /* Real */ ae_vector* y, /* Real */ ae_vector* d, ae_int_t n, ae_state *_state); void solvepolinom2(double p0, double m0, double p1, double m1, double* x0, double* x1, ae_int_t* nr, ae_state *_state); void solvecubicpolinom(double pa, double ma, double pb, double mb, double a, double b, double* x0, double* x1, double* x2, double* ex0, double* ex1, ae_int_t* nr, ae_int_t* ne, /* Real */ ae_vector* tempdata, ae_state *_state); ae_int_t bisectmethod(double pa, double ma, double pb, double mb, double a, double b, double* x, ae_state *_state); void spline1dbuildmonotone(/* Real */ ae_vector* x, /* Real */ ae_vector* y, ae_int_t n, spline1dinterpolant* c, ae_state *_state); void _spline1dinterpolant_init(void* _p, ae_state *_state); void _spline1dinterpolant_init_copy(void* _dst, void* _src, ae_state *_state); void _spline1dinterpolant_clear(void* _p); void _spline1dinterpolant_destroy(void* _p); void pspline2build(/* Real */ ae_matrix* xy, ae_int_t n, ae_int_t st, ae_int_t pt, pspline2interpolant* p, ae_state *_state); void pspline3build(/* Real */ ae_matrix* xy, ae_int_t n, ae_int_t st, ae_int_t pt, pspline3interpolant* p, ae_state *_state); void pspline2buildperiodic(/* Real */ ae_matrix* xy, ae_int_t n, ae_int_t st, ae_int_t pt, pspline2interpolant* p, ae_state *_state); void pspline3buildperiodic(/* Real */ ae_matrix* xy, ae_int_t n, ae_int_t st, ae_int_t pt, 
pspline3interpolant* p, ae_state *_state); void pspline2parametervalues(pspline2interpolant* p, ae_int_t* n, /* Real */ ae_vector* t, ae_state *_state); void pspline3parametervalues(pspline3interpolant* p, ae_int_t* n, /* Real */ ae_vector* t, ae_state *_state); void pspline2calc(pspline2interpolant* p, double t, double* x, double* y, ae_state *_state); void pspline3calc(pspline3interpolant* p, double t, double* x, double* y, double* z, ae_state *_state); void pspline2tangent(pspline2interpolant* p, double t, double* x, double* y, ae_state *_state); void pspline3tangent(pspline3interpolant* p, double t, double* x, double* y, double* z, ae_state *_state); void pspline2diff(pspline2interpolant* p, double t, double* x, double* dx, double* y, double* dy, ae_state *_state); void pspline3diff(pspline3interpolant* p, double t, double* x, double* dx, double* y, double* dy, double* z, double* dz, ae_state *_state); void pspline2diff2(pspline2interpolant* p, double t, double* x, double* dx, double* d2x, double* y, double* dy, double* d2y, ae_state *_state); void pspline3diff2(pspline3interpolant* p, double t, double* x, double* dx, double* d2x, double* y, double* dy, double* d2y, double* z, double* dz, double* d2z, ae_state *_state); double pspline2arclength(pspline2interpolant* p, double a, double b, ae_state *_state); double pspline3arclength(pspline3interpolant* p, double a, double b, ae_state *_state); void parametricrdpfixed(/* Real */ ae_matrix* x, ae_int_t n, ae_int_t d, ae_int_t stopm, double stopeps, /* Real */ ae_matrix* x2, /* Integer */ ae_vector* idx2, ae_int_t* nsections, ae_state *_state); void _pspline2interpolant_init(void* _p, ae_state *_state); void _pspline2interpolant_init_copy(void* _dst, void* _src, ae_state *_state); void _pspline2interpolant_clear(void* _p); void _pspline2interpolant_destroy(void* _p); void _pspline3interpolant_init(void* _p, ae_state *_state); void _pspline3interpolant_init_copy(void* _dst, void* _src, ae_state *_state); void _pspline3interpolant_clear(void* _p); void _pspline3interpolant_destroy(void* _p); double spline3dcalc(spline3dinterpolant* c, double x, double y, double z, ae_state *_state); void spline3dlintransxyz(spline3dinterpolant* c, double ax, double bx, double ay, double by, double az, double bz, ae_state *_state); void spline3dlintransf(spline3dinterpolant* c, double a, double b, ae_state *_state); void spline3dcopy(spline3dinterpolant* c, spline3dinterpolant* cc, ae_state *_state); void spline3dresampletrilinear(/* Real */ ae_vector* a, ae_int_t oldzcount, ae_int_t oldycount, ae_int_t oldxcount, ae_int_t newzcount, ae_int_t newycount, ae_int_t newxcount, /* Real */ ae_vector* b, ae_state *_state); void spline3dbuildtrilinearv(/* Real */ ae_vector* x, ae_int_t n, /* Real */ ae_vector* y, ae_int_t m, /* Real */ ae_vector* z, ae_int_t l, /* Real */ ae_vector* f, ae_int_t d, spline3dinterpolant* c, ae_state *_state); void spline3dcalcvbuf(spline3dinterpolant* c, double x, double y, double z, /* Real */ ae_vector* f, ae_state *_state); void spline3dcalcv(spline3dinterpolant* c, double x, double y, double z, /* Real */ ae_vector* f, ae_state *_state); void spline3dunpackv(spline3dinterpolant* c, ae_int_t* n, ae_int_t* m, ae_int_t* l, ae_int_t* d, ae_int_t* stype, /* Real */ ae_matrix* tbl, ae_state *_state); void _spline3dinterpolant_init(void* _p, ae_state *_state); void _spline3dinterpolant_init_copy(void* _dst, void* _src, ae_state *_state); void _spline3dinterpolant_clear(void* _p); void _spline3dinterpolant_destroy(void* _p); void 
polynomialbar2cheb(barycentricinterpolant* p, double a, double b, /* Real */ ae_vector* t, ae_state *_state); void polynomialcheb2bar(/* Real */ ae_vector* t, ae_int_t n, double a, double b, barycentricinterpolant* p, ae_state *_state); void polynomialbar2pow(barycentricinterpolant* p, double c, double s, /* Real */ ae_vector* a, ae_state *_state); void polynomialpow2bar(/* Real */ ae_vector* a, ae_int_t n, double c, double s, barycentricinterpolant* p, ae_state *_state); void polynomialbuild(/* Real */ ae_vector* x, /* Real */ ae_vector* y, ae_int_t n, barycentricinterpolant* p, ae_state *_state); void polynomialbuildeqdist(double a, double b, /* Real */ ae_vector* y, ae_int_t n, barycentricinterpolant* p, ae_state *_state); void polynomialbuildcheb1(double a, double b, /* Real */ ae_vector* y, ae_int_t n, barycentricinterpolant* p, ae_state *_state); void polynomialbuildcheb2(double a, double b, /* Real */ ae_vector* y, ae_int_t n, barycentricinterpolant* p, ae_state *_state); double polynomialcalceqdist(double a, double b, /* Real */ ae_vector* f, ae_int_t n, double t, ae_state *_state); double polynomialcalccheb1(double a, double b, /* Real */ ae_vector* f, ae_int_t n, double t, ae_state *_state); double polynomialcalccheb2(double a, double b, /* Real */ ae_vector* f, ae_int_t n, double t, ae_state *_state); void lstfitpiecewiselinearrdpfixed(/* Real */ ae_vector* x, /* Real */ ae_vector* y, ae_int_t n, ae_int_t m, /* Real */ ae_vector* x2, /* Real */ ae_vector* y2, ae_int_t* nsections, ae_state *_state); void lstfitpiecewiselinearrdp(/* Real */ ae_vector* x, /* Real */ ae_vector* y, ae_int_t n, double eps, /* Real */ ae_vector* x2, /* Real */ ae_vector* y2, ae_int_t* nsections, ae_state *_state); void polynomialfit(/* Real */ ae_vector* x, /* Real */ ae_vector* y, ae_int_t n, ae_int_t m, ae_int_t* info, barycentricinterpolant* p, polynomialfitreport* rep, ae_state *_state); void _pexec_polynomialfit(/* Real */ ae_vector* x, /* Real */ ae_vector* y, ae_int_t n, ae_int_t m, ae_int_t* info, barycentricinterpolant* p, polynomialfitreport* rep, ae_state *_state); void polynomialfitwc(/* Real */ ae_vector* x, /* Real */ ae_vector* y, /* Real */ ae_vector* w, ae_int_t n, /* Real */ ae_vector* xc, /* Real */ ae_vector* yc, /* Integer */ ae_vector* dc, ae_int_t k, ae_int_t m, ae_int_t* info, barycentricinterpolant* p, polynomialfitreport* rep, ae_state *_state); void _pexec_polynomialfitwc(/* Real */ ae_vector* x, /* Real */ ae_vector* y, /* Real */ ae_vector* w, ae_int_t n, /* Real */ ae_vector* xc, /* Real */ ae_vector* yc, /* Integer */ ae_vector* dc, ae_int_t k, ae_int_t m, ae_int_t* info, barycentricinterpolant* p, polynomialfitreport* rep, ae_state *_state); double logisticcalc4(double x, double a, double b, double c, double d, ae_state *_state); double logisticcalc5(double x, double a, double b, double c, double d, double g, ae_state *_state); void logisticfit4(/* Real */ ae_vector* x, /* Real */ ae_vector* y, ae_int_t n, double* a, double* b, double* c, double* d, lsfitreport* rep, ae_state *_state); void logisticfit4ec(/* Real */ ae_vector* x, /* Real */ ae_vector* y, ae_int_t n, double cnstrleft, double cnstrright, double* a, double* b, double* c, double* d, lsfitreport* rep, ae_state *_state); void logisticfit5(/* Real */ ae_vector* x, /* Real */ ae_vector* y, ae_int_t n, double* a, double* b, double* c, double* d, double* g, lsfitreport* rep, ae_state *_state); void logisticfit5ec(/* Real */ ae_vector* x, /* Real */ ae_vector* y, ae_int_t n, double cnstrleft, double cnstrright, 
double* a, double* b, double* c, double* d, double* g, lsfitreport* rep, ae_state *_state); void logisticfit45x(/* Real */ ae_vector* x, /* Real */ ae_vector* y, ae_int_t n, double cnstrleft, double cnstrright, ae_bool is4pl, double lambdav, double epsx, ae_int_t rscnt, double* a, double* b, double* c, double* d, double* g, lsfitreport* rep, ae_state *_state); void barycentricfitfloaterhormannwc(/* Real */ ae_vector* x, /* Real */ ae_vector* y, /* Real */ ae_vector* w, ae_int_t n, /* Real */ ae_vector* xc, /* Real */ ae_vector* yc, /* Integer */ ae_vector* dc, ae_int_t k, ae_int_t m, ae_int_t* info, barycentricinterpolant* b, barycentricfitreport* rep, ae_state *_state); void _pexec_barycentricfitfloaterhormannwc(/* Real */ ae_vector* x, /* Real */ ae_vector* y, /* Real */ ae_vector* w, ae_int_t n, /* Real */ ae_vector* xc, /* Real */ ae_vector* yc, /* Integer */ ae_vector* dc, ae_int_t k, ae_int_t m, ae_int_t* info, barycentricinterpolant* b, barycentricfitreport* rep, ae_state *_state); void barycentricfitfloaterhormann(/* Real */ ae_vector* x, /* Real */ ae_vector* y, ae_int_t n, ae_int_t m, ae_int_t* info, barycentricinterpolant* b, barycentricfitreport* rep, ae_state *_state); void _pexec_barycentricfitfloaterhormann(/* Real */ ae_vector* x, /* Real */ ae_vector* y, ae_int_t n, ae_int_t m, ae_int_t* info, barycentricinterpolant* b, barycentricfitreport* rep, ae_state *_state); void spline1dfitpenalized(/* Real */ ae_vector* x, /* Real */ ae_vector* y, ae_int_t n, ae_int_t m, double rho, ae_int_t* info, spline1dinterpolant* s, spline1dfitreport* rep, ae_state *_state); void _pexec_spline1dfitpenalized(/* Real */ ae_vector* x, /* Real */ ae_vector* y, ae_int_t n, ae_int_t m, double rho, ae_int_t* info, spline1dinterpolant* s, spline1dfitreport* rep, ae_state *_state); void spline1dfitpenalizedw(/* Real */ ae_vector* x, /* Real */ ae_vector* y, /* Real */ ae_vector* w, ae_int_t n, ae_int_t m, double rho, ae_int_t* info, spline1dinterpolant* s, spline1dfitreport* rep, ae_state *_state); void _pexec_spline1dfitpenalizedw(/* Real */ ae_vector* x, /* Real */ ae_vector* y, /* Real */ ae_vector* w, ae_int_t n, ae_int_t m, double rho, ae_int_t* info, spline1dinterpolant* s, spline1dfitreport* rep, ae_state *_state); void spline1dfitcubicwc(/* Real */ ae_vector* x, /* Real */ ae_vector* y, /* Real */ ae_vector* w, ae_int_t n, /* Real */ ae_vector* xc, /* Real */ ae_vector* yc, /* Integer */ ae_vector* dc, ae_int_t k, ae_int_t m, ae_int_t* info, spline1dinterpolant* s, spline1dfitreport* rep, ae_state *_state); void _pexec_spline1dfitcubicwc(/* Real */ ae_vector* x, /* Real */ ae_vector* y, /* Real */ ae_vector* w, ae_int_t n, /* Real */ ae_vector* xc, /* Real */ ae_vector* yc, /* Integer */ ae_vector* dc, ae_int_t k, ae_int_t m, ae_int_t* info, spline1dinterpolant* s, spline1dfitreport* rep, ae_state *_state); void spline1dfithermitewc(/* Real */ ae_vector* x, /* Real */ ae_vector* y, /* Real */ ae_vector* w, ae_int_t n, /* Real */ ae_vector* xc, /* Real */ ae_vector* yc, /* Integer */ ae_vector* dc, ae_int_t k, ae_int_t m, ae_int_t* info, spline1dinterpolant* s, spline1dfitreport* rep, ae_state *_state); void _pexec_spline1dfithermitewc(/* Real */ ae_vector* x, /* Real */ ae_vector* y, /* Real */ ae_vector* w, ae_int_t n, /* Real */ ae_vector* xc, /* Real */ ae_vector* yc, /* Integer */ ae_vector* dc, ae_int_t k, ae_int_t m, ae_int_t* info, spline1dinterpolant* s, spline1dfitreport* rep, ae_state *_state); void spline1dfitcubic(/* Real */ ae_vector* x, /* Real */ ae_vector* y, ae_int_t n, 
ae_int_t m, ae_int_t* info, spline1dinterpolant* s, spline1dfitreport* rep, ae_state *_state); void _pexec_spline1dfitcubic(/* Real */ ae_vector* x, /* Real */ ae_vector* y, ae_int_t n, ae_int_t m, ae_int_t* info, spline1dinterpolant* s, spline1dfitreport* rep, ae_state *_state); void spline1dfithermite(/* Real */ ae_vector* x, /* Real */ ae_vector* y, ae_int_t n, ae_int_t m, ae_int_t* info, spline1dinterpolant* s, spline1dfitreport* rep, ae_state *_state); void _pexec_spline1dfithermite(/* Real */ ae_vector* x, /* Real */ ae_vector* y, ae_int_t n, ae_int_t m, ae_int_t* info, spline1dinterpolant* s, spline1dfitreport* rep, ae_state *_state); void lsfitlinearw(/* Real */ ae_vector* y, /* Real */ ae_vector* w, /* Real */ ae_matrix* fmatrix, ae_int_t n, ae_int_t m, ae_int_t* info, /* Real */ ae_vector* c, lsfitreport* rep, ae_state *_state); void _pexec_lsfitlinearw(/* Real */ ae_vector* y, /* Real */ ae_vector* w, /* Real */ ae_matrix* fmatrix, ae_int_t n, ae_int_t m, ae_int_t* info, /* Real */ ae_vector* c, lsfitreport* rep, ae_state *_state); void lsfitlinearwc(/* Real */ ae_vector* y, /* Real */ ae_vector* w, /* Real */ ae_matrix* fmatrix, /* Real */ ae_matrix* cmatrix, ae_int_t n, ae_int_t m, ae_int_t k, ae_int_t* info, /* Real */ ae_vector* c, lsfitreport* rep, ae_state *_state); void _pexec_lsfitlinearwc(/* Real */ ae_vector* y, /* Real */ ae_vector* w, /* Real */ ae_matrix* fmatrix, /* Real */ ae_matrix* cmatrix, ae_int_t n, ae_int_t m, ae_int_t k, ae_int_t* info, /* Real */ ae_vector* c, lsfitreport* rep, ae_state *_state); void lsfitlinear(/* Real */ ae_vector* y, /* Real */ ae_matrix* fmatrix, ae_int_t n, ae_int_t m, ae_int_t* info, /* Real */ ae_vector* c, lsfitreport* rep, ae_state *_state); void _pexec_lsfitlinear(/* Real */ ae_vector* y, /* Real */ ae_matrix* fmatrix, ae_int_t n, ae_int_t m, ae_int_t* info, /* Real */ ae_vector* c, lsfitreport* rep, ae_state *_state); void lsfitlinearc(/* Real */ ae_vector* y, /* Real */ ae_matrix* fmatrix, /* Real */ ae_matrix* cmatrix, ae_int_t n, ae_int_t m, ae_int_t k, ae_int_t* info, /* Real */ ae_vector* c, lsfitreport* rep, ae_state *_state); void _pexec_lsfitlinearc(/* Real */ ae_vector* y, /* Real */ ae_matrix* fmatrix, /* Real */ ae_matrix* cmatrix, ae_int_t n, ae_int_t m, ae_int_t k, ae_int_t* info, /* Real */ ae_vector* c, lsfitreport* rep, ae_state *_state); void lsfitcreatewf(/* Real */ ae_matrix* x, /* Real */ ae_vector* y, /* Real */ ae_vector* w, /* Real */ ae_vector* c, ae_int_t n, ae_int_t m, ae_int_t k, double diffstep, lsfitstate* state, ae_state *_state); void lsfitcreatef(/* Real */ ae_matrix* x, /* Real */ ae_vector* y, /* Real */ ae_vector* c, ae_int_t n, ae_int_t m, ae_int_t k, double diffstep, lsfitstate* state, ae_state *_state); void lsfitcreatewfg(/* Real */ ae_matrix* x, /* Real */ ae_vector* y, /* Real */ ae_vector* w, /* Real */ ae_vector* c, ae_int_t n, ae_int_t m, ae_int_t k, ae_bool cheapfg, lsfitstate* state, ae_state *_state); void lsfitcreatefg(/* Real */ ae_matrix* x, /* Real */ ae_vector* y, /* Real */ ae_vector* c, ae_int_t n, ae_int_t m, ae_int_t k, ae_bool cheapfg, lsfitstate* state, ae_state *_state); void lsfitcreatewfgh(/* Real */ ae_matrix* x, /* Real */ ae_vector* y, /* Real */ ae_vector* w, /* Real */ ae_vector* c, ae_int_t n, ae_int_t m, ae_int_t k, lsfitstate* state, ae_state *_state); void lsfitcreatefgh(/* Real */ ae_matrix* x, /* Real */ ae_vector* y, /* Real */ ae_vector* c, ae_int_t n, ae_int_t m, ae_int_t k, lsfitstate* state, ae_state *_state); void lsfitsetcond(lsfitstate* state, 
double epsx, ae_int_t maxits, ae_state *_state); void lsfitsetstpmax(lsfitstate* state, double stpmax, ae_state *_state); void lsfitsetxrep(lsfitstate* state, ae_bool needxrep, ae_state *_state); void lsfitsetscale(lsfitstate* state, /* Real */ ae_vector* s, ae_state *_state); void lsfitsetbc(lsfitstate* state, /* Real */ ae_vector* bndl, /* Real */ ae_vector* bndu, ae_state *_state); void lsfitsetlc(lsfitstate* state, /* Real */ ae_matrix* c, /* Integer */ ae_vector* ct, ae_int_t k, ae_state *_state); ae_bool lsfititeration(lsfitstate* state, ae_state *_state); void lsfitresults(lsfitstate* state, ae_int_t* info, /* Real */ ae_vector* c, lsfitreport* rep, ae_state *_state); void lsfitsetgradientcheck(lsfitstate* state, double teststep, ae_state *_state); void lsfitscalexy(/* Real */ ae_vector* x, /* Real */ ae_vector* y, /* Real */ ae_vector* w, ae_int_t n, /* Real */ ae_vector* xc, /* Real */ ae_vector* yc, /* Integer */ ae_vector* dc, ae_int_t k, double* xa, double* xb, double* sa, double* sb, /* Real */ ae_vector* xoriginal, /* Real */ ae_vector* yoriginal, ae_state *_state); void _polynomialfitreport_init(void* _p, ae_state *_state); void _polynomialfitreport_init_copy(void* _dst, void* _src, ae_state *_state); void _polynomialfitreport_clear(void* _p); void _polynomialfitreport_destroy(void* _p); void _barycentricfitreport_init(void* _p, ae_state *_state); void _barycentricfitreport_init_copy(void* _dst, void* _src, ae_state *_state); void _barycentricfitreport_clear(void* _p); void _barycentricfitreport_destroy(void* _p); void _spline1dfitreport_init(void* _p, ae_state *_state); void _spline1dfitreport_init_copy(void* _dst, void* _src, ae_state *_state); void _spline1dfitreport_clear(void* _p); void _spline1dfitreport_destroy(void* _p); void _lsfitreport_init(void* _p, ae_state *_state); void _lsfitreport_init_copy(void* _dst, void* _src, ae_state *_state); void _lsfitreport_clear(void* _p); void _lsfitreport_destroy(void* _p); void _lsfitstate_init(void* _p, ae_state *_state); void _lsfitstate_init_copy(void* _dst, void* _src, ae_state *_state); void _lsfitstate_clear(void* _p); void _lsfitstate_destroy(void* _p); void rbfv2create(ae_int_t nx, ae_int_t ny, rbfv2model* s, ae_state *_state); void rbfv2createcalcbuffer(rbfv2model* s, rbfv2calcbuffer* buf, ae_state *_state); void rbfv2buildhierarchical(/* Real */ ae_matrix* x, /* Real */ ae_matrix* y, ae_int_t n, /* Real */ ae_vector* scalevec, ae_int_t aterm, ae_int_t nh, double rbase, double lambdans, ae_bool nonnegative, ae_int_t nnmaxits, rbfv2model* s, rbfv2report* rep, ae_state *_state); void rbfv2alloc(ae_serializer* s, rbfv2model* model, ae_state *_state); void rbfv2serialize(ae_serializer* s, rbfv2model* model, ae_state *_state); void rbfv2unserialize(ae_serializer* s, rbfv2model* model, ae_state *_state); double rbfv2farradius(ae_int_t bf, ae_state *_state); double rbfv2nearradius(ae_int_t bf, ae_state *_state); double rbfv2basisfunc(ae_int_t bf, double d2, ae_state *_state); void rbfv2basisfuncdiff2(ae_int_t bf, double d2, double* f, double* df, double* d2f, ae_state *_state); double rbfv2calc1(rbfv2model* s, double x0, ae_state *_state); double rbfv2calc2(rbfv2model* s, double x0, double x1, ae_state *_state); double rbfv2calc3(rbfv2model* s, double x0, double x1, double x2, ae_state *_state); void rbfv2calcbuf(rbfv2model* s, /* Real */ ae_vector* x, /* Real */ ae_vector* y, ae_state *_state); void rbfv2tscalcbuf(rbfv2model* s, rbfv2calcbuffer* buf, /* Real */ ae_vector* x, /* Real */ ae_vector* y, ae_state *_state); void 
rbfv2gridcalc2(rbfv2model* s, /* Real */ ae_vector* x0, ae_int_t n0, /* Real */ ae_vector* x1, ae_int_t n1, /* Real */ ae_matrix* y, ae_state *_state); void rbfv2gridcalcvx(rbfv2model* s, /* Real */ ae_vector* x0, ae_int_t n0, /* Real */ ae_vector* x1, ae_int_t n1, /* Real */ ae_vector* x2, ae_int_t n2, /* Real */ ae_vector* x3, ae_int_t n3, /* Boolean */ ae_vector* flagy, ae_bool sparsey, /* Real */ ae_vector* y, ae_state *_state); void rbfv2partialgridcalcrec(rbfv2model* s, /* Real */ ae_vector* x0, ae_int_t n0, /* Real */ ae_vector* x1, ae_int_t n1, /* Real */ ae_vector* x2, ae_int_t n2, /* Real */ ae_vector* x3, ae_int_t n3, /* Integer */ ae_vector* blocks0, ae_int_t block0a, ae_int_t block0b, /* Integer */ ae_vector* blocks1, ae_int_t block1a, ae_int_t block1b, /* Integer */ ae_vector* blocks2, ae_int_t block2a, ae_int_t block2b, /* Integer */ ae_vector* blocks3, ae_int_t block3a, ae_int_t block3b, /* Boolean */ ae_vector* flagy, ae_bool sparsey, ae_int_t levelidx, double avgfuncpernode, ae_shared_pool* bufpool, /* Real */ ae_vector* y, ae_state *_state); void rbfv2unpack(rbfv2model* s, ae_int_t* nx, ae_int_t* ny, /* Real */ ae_matrix* xwr, ae_int_t* nc, /* Real */ ae_matrix* v, ae_state *_state); void _rbfv2calcbuffer_init(void* _p, ae_state *_state); void _rbfv2calcbuffer_init_copy(void* _dst, void* _src, ae_state *_state); void _rbfv2calcbuffer_clear(void* _p); void _rbfv2calcbuffer_destroy(void* _p); void _rbfv2model_init(void* _p, ae_state *_state); void _rbfv2model_init_copy(void* _dst, void* _src, ae_state *_state); void _rbfv2model_clear(void* _p); void _rbfv2model_destroy(void* _p); void _rbfv2gridcalcbuffer_init(void* _p, ae_state *_state); void _rbfv2gridcalcbuffer_init_copy(void* _dst, void* _src, ae_state *_state); void _rbfv2gridcalcbuffer_clear(void* _p); void _rbfv2gridcalcbuffer_destroy(void* _p); void _rbfv2report_init(void* _p, ae_state *_state); void _rbfv2report_init_copy(void* _dst, void* _src, ae_state *_state); void _rbfv2report_clear(void* _p); void _rbfv2report_destroy(void* _p); void nsfitspheremcc(/* Real */ ae_matrix* xy, ae_int_t npoints, ae_int_t nx, /* Real */ ae_vector* cx, double* rhi, ae_state *_state); void nsfitspheremic(/* Real */ ae_matrix* xy, ae_int_t npoints, ae_int_t nx, /* Real */ ae_vector* cx, double* rlo, ae_state *_state); void nsfitspheremzc(/* Real */ ae_matrix* xy, ae_int_t npoints, ae_int_t nx, /* Real */ ae_vector* cx, double* rlo, double* rhi, ae_state *_state); void nsfitspherex(/* Real */ ae_matrix* xy, ae_int_t npoints, ae_int_t nx, ae_int_t problemtype, double epsx, ae_int_t aulits, double penalty, /* Real */ ae_vector* cx, double* rlo, double* rhi, ae_state *_state); void nsfitsphereinternal(/* Real */ ae_matrix* xy, ae_int_t npoints, ae_int_t nx, ae_int_t problemtype, ae_int_t solvertype, double epsx, ae_int_t aulits, double penalty, /* Real */ ae_vector* cx, double* rlo, double* rhi, nsfitinternalreport* rep, ae_state *_state); void _nsfitinternalreport_init(void* _p, ae_state *_state); void _nsfitinternalreport_init_copy(void* _dst, void* _src, ae_state *_state); void _nsfitinternalreport_clear(void* _p); void _nsfitinternalreport_destroy(void* _p); double spline2dcalc(spline2dinterpolant* c, double x, double y, ae_state *_state); void spline2ddiff(spline2dinterpolant* c, double x, double y, double* f, double* fx, double* fy, double* fxy, ae_state *_state); void spline2dlintransxy(spline2dinterpolant* c, double ax, double bx, double ay, double by, ae_state *_state); void spline2dlintransf(spline2dinterpolant* c, double a, 
double b, ae_state *_state); void spline2dcopy(spline2dinterpolant* c, spline2dinterpolant* cc, ae_state *_state); void spline2dresamplebicubic(/* Real */ ae_matrix* a, ae_int_t oldheight, ae_int_t oldwidth, /* Real */ ae_matrix* b, ae_int_t newheight, ae_int_t newwidth, ae_state *_state); void spline2dresamplebilinear(/* Real */ ae_matrix* a, ae_int_t oldheight, ae_int_t oldwidth, /* Real */ ae_matrix* b, ae_int_t newheight, ae_int_t newwidth, ae_state *_state); void spline2dbuildbilinearv(/* Real */ ae_vector* x, ae_int_t n, /* Real */ ae_vector* y, ae_int_t m, /* Real */ ae_vector* f, ae_int_t d, spline2dinterpolant* c, ae_state *_state); void spline2dbuildbicubicv(/* Real */ ae_vector* x, ae_int_t n, /* Real */ ae_vector* y, ae_int_t m, /* Real */ ae_vector* f, ae_int_t d, spline2dinterpolant* c, ae_state *_state); void spline2dcalcvbuf(spline2dinterpolant* c, double x, double y, /* Real */ ae_vector* f, ae_state *_state); void spline2dcalcv(spline2dinterpolant* c, double x, double y, /* Real */ ae_vector* f, ae_state *_state); void spline2dunpackv(spline2dinterpolant* c, ae_int_t* m, ae_int_t* n, ae_int_t* d, /* Real */ ae_matrix* tbl, ae_state *_state); void spline2dbuildbilinear(/* Real */ ae_vector* x, /* Real */ ae_vector* y, /* Real */ ae_matrix* f, ae_int_t m, ae_int_t n, spline2dinterpolant* c, ae_state *_state); void spline2dbuildbicubic(/* Real */ ae_vector* x, /* Real */ ae_vector* y, /* Real */ ae_matrix* f, ae_int_t m, ae_int_t n, spline2dinterpolant* c, ae_state *_state); void spline2dunpack(spline2dinterpolant* c, ae_int_t* m, ae_int_t* n, /* Real */ ae_matrix* tbl, ae_state *_state); void _spline2dinterpolant_init(void* _p, ae_state *_state); void _spline2dinterpolant_init_copy(void* _dst, void* _src, ae_state *_state); void _spline2dinterpolant_clear(void* _p); void _spline2dinterpolant_destroy(void* _p); void rbfv1create(ae_int_t nx, ae_int_t ny, rbfv1model* s, ae_state *_state); void rbfv1createcalcbuffer(rbfv1model* s, rbfv1calcbuffer* buf, ae_state *_state); void rbfv1buildmodel(/* Real */ ae_matrix* x, /* Real */ ae_matrix* y, ae_int_t n, ae_int_t aterm, ae_int_t algorithmtype, ae_int_t nlayers, double radvalue, double radzvalue, double lambdav, double epsort, double epserr, ae_int_t maxits, rbfv1model* s, rbfv1report* rep, ae_state *_state); void rbfv1alloc(ae_serializer* s, rbfv1model* model, ae_state *_state); void rbfv1serialize(ae_serializer* s, rbfv1model* model, ae_state *_state); void rbfv1unserialize(ae_serializer* s, rbfv1model* model, ae_state *_state); double rbfv1calc2(rbfv1model* s, double x0, double x1, ae_state *_state); double rbfv1calc3(rbfv1model* s, double x0, double x1, double x2, ae_state *_state); void rbfv1calcbuf(rbfv1model* s, /* Real */ ae_vector* x, /* Real */ ae_vector* y, ae_state *_state); void rbfv1tscalcbuf(rbfv1model* s, rbfv1calcbuffer* buf, /* Real */ ae_vector* x, /* Real */ ae_vector* y, ae_state *_state); void rbfv1gridcalc2(rbfv1model* s, /* Real */ ae_vector* x0, ae_int_t n0, /* Real */ ae_vector* x1, ae_int_t n1, /* Real */ ae_matrix* y, ae_state *_state); void rbfv1gridcalc3vrec(rbfv1model* s, /* Real */ ae_vector* x0, ae_int_t n0, /* Real */ ae_vector* x1, ae_int_t n1, /* Real */ ae_vector* x2, ae_int_t n2, /* Integer */ ae_vector* blocks0, ae_int_t block0a, ae_int_t block0b, /* Integer */ ae_vector* blocks1, ae_int_t block1a, ae_int_t block1b, /* Integer */ ae_vector* blocks2, ae_int_t block2a, ae_int_t block2b, /* Boolean */ ae_vector* flagy, ae_bool sparsey, double searchradius, double avgfuncpernode, ae_shared_pool* 
bufpool, /* Real */ ae_vector* y, ae_state *_state); void rbfv1unpack(rbfv1model* s, ae_int_t* nx, ae_int_t* ny, /* Real */ ae_matrix* xwr, ae_int_t* nc, /* Real */ ae_matrix* v, ae_state *_state); void _rbfv1calcbuffer_init(void* _p, ae_state *_state); void _rbfv1calcbuffer_init_copy(void* _dst, void* _src, ae_state *_state); void _rbfv1calcbuffer_clear(void* _p); void _rbfv1calcbuffer_destroy(void* _p); void _rbfv1model_init(void* _p, ae_state *_state); void _rbfv1model_init_copy(void* _dst, void* _src, ae_state *_state); void _rbfv1model_clear(void* _p); void _rbfv1model_destroy(void* _p); void _gridcalc3v1buf_init(void* _p, ae_state *_state); void _gridcalc3v1buf_init_copy(void* _dst, void* _src, ae_state *_state); void _gridcalc3v1buf_clear(void* _p); void _gridcalc3v1buf_destroy(void* _p); void _rbfv1report_init(void* _p, ae_state *_state); void _rbfv1report_init_copy(void* _dst, void* _src, ae_state *_state); void _rbfv1report_clear(void* _p); void _rbfv1report_destroy(void* _p); void rbfcreate(ae_int_t nx, ae_int_t ny, rbfmodel* s, ae_state *_state); void rbfcreatecalcbuffer(rbfmodel* s, rbfcalcbuffer* buf, ae_state *_state); void rbfsetpoints(rbfmodel* s, /* Real */ ae_matrix* xy, ae_int_t n, ae_state *_state); void rbfsetpointsandscales(rbfmodel* r, /* Real */ ae_matrix* xy, ae_int_t n, /* Real */ ae_vector* s, ae_state *_state); void rbfsetalgoqnn(rbfmodel* s, double q, double z, ae_state *_state); void rbfsetalgomultilayer(rbfmodel* s, double rbase, ae_int_t nlayers, double lambdav, ae_state *_state); void rbfsetalgohierarchical(rbfmodel* s, double rbase, ae_int_t nlayers, double lambdans, ae_state *_state); void rbfsetlinterm(rbfmodel* s, ae_state *_state); void rbfsetconstterm(rbfmodel* s, ae_state *_state); void rbfsetzeroterm(rbfmodel* s, ae_state *_state); void rbfsetv2bf(rbfmodel* s, ae_int_t bf, ae_state *_state); void rbfsetv2its(rbfmodel* s, ae_int_t maxits, ae_state *_state); void rbfsetv2supportr(rbfmodel* s, double r, ae_state *_state); void rbfsetcond(rbfmodel* s, double epsort, double epserr, ae_int_t maxits, ae_state *_state); void rbfbuildmodel(rbfmodel* s, rbfreport* rep, ae_state *_state); double rbfcalc1(rbfmodel* s, double x0, ae_state *_state); double rbfcalc2(rbfmodel* s, double x0, double x1, ae_state *_state); double rbfcalc3(rbfmodel* s, double x0, double x1, double x2, ae_state *_state); void rbfcalc(rbfmodel* s, /* Real */ ae_vector* x, /* Real */ ae_vector* y, ae_state *_state); void rbfcalcbuf(rbfmodel* s, /* Real */ ae_vector* x, /* Real */ ae_vector* y, ae_state *_state); void rbftscalcbuf(rbfmodel* s, rbfcalcbuffer* buf, /* Real */ ae_vector* x, /* Real */ ae_vector* y, ae_state *_state); void rbfgridcalc2(rbfmodel* s, /* Real */ ae_vector* x0, ae_int_t n0, /* Real */ ae_vector* x1, ae_int_t n1, /* Real */ ae_matrix* y, ae_state *_state); void rbfgridcalc2v(rbfmodel* s, /* Real */ ae_vector* x0, ae_int_t n0, /* Real */ ae_vector* x1, ae_int_t n1, /* Real */ ae_vector* y, ae_state *_state); void _pexec_rbfgridcalc2v(rbfmodel* s, /* Real */ ae_vector* x0, ae_int_t n0, /* Real */ ae_vector* x1, ae_int_t n1, /* Real */ ae_vector* y, ae_state *_state); void rbfgridcalc2vsubset(rbfmodel* s, /* Real */ ae_vector* x0, ae_int_t n0, /* Real */ ae_vector* x1, ae_int_t n1, /* Boolean */ ae_vector* flagy, /* Real */ ae_vector* y, ae_state *_state); void _pexec_rbfgridcalc2vsubset(rbfmodel* s, /* Real */ ae_vector* x0, ae_int_t n0, /* Real */ ae_vector* x1, ae_int_t n1, /* Boolean */ ae_vector* flagy, /* Real */ ae_vector* y, ae_state *_state); void 
rbfgridcalc3v(rbfmodel* s, /* Real */ ae_vector* x0, ae_int_t n0, /* Real */ ae_vector* x1, ae_int_t n1, /* Real */ ae_vector* x2, ae_int_t n2, /* Real */ ae_vector* y, ae_state *_state); void _pexec_rbfgridcalc3v(rbfmodel* s, /* Real */ ae_vector* x0, ae_int_t n0, /* Real */ ae_vector* x1, ae_int_t n1, /* Real */ ae_vector* x2, ae_int_t n2, /* Real */ ae_vector* y, ae_state *_state); void rbfgridcalc3vsubset(rbfmodel* s, /* Real */ ae_vector* x0, ae_int_t n0, /* Real */ ae_vector* x1, ae_int_t n1, /* Real */ ae_vector* x2, ae_int_t n2, /* Boolean */ ae_vector* flagy, /* Real */ ae_vector* y, ae_state *_state); void _pexec_rbfgridcalc3vsubset(rbfmodel* s, /* Real */ ae_vector* x0, ae_int_t n0, /* Real */ ae_vector* x1, ae_int_t n1, /* Real */ ae_vector* x2, ae_int_t n2, /* Boolean */ ae_vector* flagy, /* Real */ ae_vector* y, ae_state *_state); void rbfgridcalc2vx(rbfmodel* s, /* Real */ ae_vector* x0, ae_int_t n0, /* Real */ ae_vector* x1, ae_int_t n1, /* Boolean */ ae_vector* flagy, ae_bool sparsey, /* Real */ ae_vector* y, ae_state *_state); void rbfgridcalc3vx(rbfmodel* s, /* Real */ ae_vector* x0, ae_int_t n0, /* Real */ ae_vector* x1, ae_int_t n1, /* Real */ ae_vector* x2, ae_int_t n2, /* Boolean */ ae_vector* flagy, ae_bool sparsey, /* Real */ ae_vector* y, ae_state *_state); void rbfunpack(rbfmodel* s, ae_int_t* nx, ae_int_t* ny, /* Real */ ae_matrix* xwr, ae_int_t* nc, /* Real */ ae_matrix* v, ae_int_t* modelversion, ae_state *_state); ae_int_t rbfgetmodelversion(rbfmodel* s, ae_state *_state); void rbfalloc(ae_serializer* s, rbfmodel* model, ae_state *_state); void rbfserialize(ae_serializer* s, rbfmodel* model, ae_state *_state); void rbfunserialize(ae_serializer* s, rbfmodel* model, ae_state *_state); void _rbfcalcbuffer_init(void* _p, ae_state *_state); void _rbfcalcbuffer_init_copy(void* _dst, void* _src, ae_state *_state); void _rbfcalcbuffer_clear(void* _p); void _rbfcalcbuffer_destroy(void* _p); void _rbfmodel_init(void* _p, ae_state *_state); void _rbfmodel_init_copy(void* _dst, void* _src, ae_state *_state); void _rbfmodel_clear(void* _p); void _rbfmodel_destroy(void* _p); void _rbfreport_init(void* _p, ae_state *_state); void _rbfreport_init_copy(void* _dst, void* _src, ae_state *_state); void _rbfreport_clear(void* _p); void _rbfreport_destroy(void* _p); } #endif cpp/src/diffequations.cpp0000755000175000017500000011500313105126765015363 0ustar sergeysergey/************************************************************************* ALGLIB 3.11.0 (source code generated 2017-05-11) Copyright (c) Sergey Bochkanov (ALGLIB project). >>> SOURCE LICENSE >>> This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation (www.fsf.org); either version 2 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. 
A copy of the GNU General Public License is available at http://www.fsf.org/licensing/licenses >>> END OF LICENSE >>> *************************************************************************/ #include "stdafx.h" #include "diffequations.h" // disable some irrelevant warnings #if (AE_COMPILER==AE_MSVC) #pragma warning(disable:4100) #pragma warning(disable:4127) #pragma warning(disable:4702) #pragma warning(disable:4996) #endif using namespace std; ///////////////////////////////////////////////////////////////////////// // // THIS SECTION CONTAINS IMPLEMENTATION OF C++ INTERFACE // ///////////////////////////////////////////////////////////////////////// namespace alglib { /************************************************************************* *************************************************************************/ _odesolverstate_owner::_odesolverstate_owner() { p_struct = (alglib_impl::odesolverstate*)alglib_impl::ae_malloc(sizeof(alglib_impl::odesolverstate), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_odesolverstate_init(p_struct, NULL); } _odesolverstate_owner::_odesolverstate_owner(const _odesolverstate_owner &rhs) { p_struct = (alglib_impl::odesolverstate*)alglib_impl::ae_malloc(sizeof(alglib_impl::odesolverstate), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_odesolverstate_init_copy(p_struct, const_cast(rhs.p_struct), NULL); } _odesolverstate_owner& _odesolverstate_owner::operator=(const _odesolverstate_owner &rhs) { if( this==&rhs ) return *this; alglib_impl::_odesolverstate_clear(p_struct); alglib_impl::_odesolverstate_init_copy(p_struct, const_cast(rhs.p_struct), NULL); return *this; } _odesolverstate_owner::~_odesolverstate_owner() { alglib_impl::_odesolverstate_clear(p_struct); ae_free(p_struct); } alglib_impl::odesolverstate* _odesolverstate_owner::c_ptr() { return p_struct; } alglib_impl::odesolverstate* _odesolverstate_owner::c_ptr() const { return const_cast(p_struct); } odesolverstate::odesolverstate() : _odesolverstate_owner() ,needdy(p_struct->needdy),y(&p_struct->y),dy(&p_struct->dy),x(p_struct->x) { } odesolverstate::odesolverstate(const odesolverstate &rhs):_odesolverstate_owner(rhs) ,needdy(p_struct->needdy),y(&p_struct->y),dy(&p_struct->dy),x(p_struct->x) { } odesolverstate& odesolverstate::operator=(const odesolverstate &rhs) { if( this==&rhs ) return *this; _odesolverstate_owner::operator=(rhs); return *this; } odesolverstate::~odesolverstate() { } /************************************************************************* *************************************************************************/ _odesolverreport_owner::_odesolverreport_owner() { p_struct = (alglib_impl::odesolverreport*)alglib_impl::ae_malloc(sizeof(alglib_impl::odesolverreport), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_odesolverreport_init(p_struct, NULL); } _odesolverreport_owner::_odesolverreport_owner(const _odesolverreport_owner &rhs) { p_struct = (alglib_impl::odesolverreport*)alglib_impl::ae_malloc(sizeof(alglib_impl::odesolverreport), NULL); if( p_struct==NULL ) throw ap_error("ALGLIB: malloc error"); alglib_impl::_odesolverreport_init_copy(p_struct, const_cast(rhs.p_struct), NULL); } _odesolverreport_owner& _odesolverreport_owner::operator=(const _odesolverreport_owner &rhs) { if( this==&rhs ) return *this; alglib_impl::_odesolverreport_clear(p_struct); alglib_impl::_odesolverreport_init_copy(p_struct, const_cast(rhs.p_struct), NULL); return *this; } 
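/*************************************************************************
EDITORIAL SKETCH (not part of the original ALGLIB sources): the owner
classes above only manage memory; typical user code drives the solver
through the friendly interface declared in diffequations.h, i.e.
odesolverrkck() -> odesolversolve() -> odesolverresults(). The example
below integrates the scalar ODE y'=-y on [0,1]; the names example_rhs /
example_ode_usage and the chosen tolerance are illustrative assumptions,
not taken from the manual.
*************************************************************************/
static void example_rhs(const real_1d_array &y, double x, real_1d_array &dy, void *ptr)
{
    // right-hand side of y' = -y  (x and ptr are unused in this illustration)
    dy[0] = -y[0];
}
static void example_ode_usage()
{
    real_1d_array y = "[1.0]";           // initial condition Y(X[0]) = 1
    real_1d_array x = "[0.0, 0.5, 1.0]"; // points where the solution is tabulated
    double eps = 0.00001;                // Eps>0 requests an ABSOLUTE error tolerance
    double h = 0.0;                      // H=0 lets the solver pick the initial step
    odesolverstate s;
    ae_int_t m;
    real_1d_array xtbl;
    real_2d_array ytbl;
    odesolverreport rep;
    odesolverrkck(y, x, eps, h, s);          // create solver state (friendly overload)
    odesolversolve(s, example_rhs, NULL);    // run reverse communication, calling example_rhs
    odesolverresults(s, m, xtbl, ytbl, rep); // rep.terminationtype==1 on success
    // after the call ytbl[i][0] approximates exp(-xtbl[i]) for i=0..m-1
}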
_odesolverreport_owner::~_odesolverreport_owner() { alglib_impl::_odesolverreport_clear(p_struct); ae_free(p_struct); } alglib_impl::odesolverreport* _odesolverreport_owner::c_ptr() { return p_struct; } alglib_impl::odesolverreport* _odesolverreport_owner::c_ptr() const { return const_cast(p_struct); } odesolverreport::odesolverreport() : _odesolverreport_owner() ,nfev(p_struct->nfev),terminationtype(p_struct->terminationtype) { } odesolverreport::odesolverreport(const odesolverreport &rhs):_odesolverreport_owner(rhs) ,nfev(p_struct->nfev),terminationtype(p_struct->terminationtype) { } odesolverreport& odesolverreport::operator=(const odesolverreport &rhs) { if( this==&rhs ) return *this; _odesolverreport_owner::operator=(rhs); return *this; } odesolverreport::~odesolverreport() { } /************************************************************************* Cash-Karp adaptive ODE solver. This subroutine solves ODE Y'=f(Y,x) with initial conditions Y(xs)=Ys (here Y may be single variable or vector of N variables). INPUT PARAMETERS: Y - initial conditions, array[0..N-1]. contains values of Y[] at X[0] N - system size X - points at which Y should be tabulated, array[0..M-1] integrations starts at X[0], ends at X[M-1], intermediate values at X[i] are returned too. SHOULD BE ORDERED BY ASCENDING OR BY DESCENDING! M - number of intermediate points + first point + last point: * M>2 means that you need both Y(X[M-1]) and M-2 values at intermediate points * M=2 means that you want just to integrate from X[0] to X[1] and don't interested in intermediate values. * M=1 means that you don't want to integrate :) it is degenerate case, but it will be handled correctly. * M<1 means error Eps - tolerance (absolute/relative error on each step will be less than Eps). When passing: * Eps>0, it means desired ABSOLUTE error * Eps<0, it means desired RELATIVE error. Relative errors are calculated with respect to maximum values of Y seen so far. Be careful to use this criterion when starting from Y[] that are close to zero. H - initial step lenth, it will be adjusted automatically after the first step. If H=0, step will be selected automatically (usualy it will be equal to 0.001 of min(x[i]-x[j])). OUTPUT PARAMETERS State - structure which stores algorithm state between subsequent calls of OdeSolverIteration. Used for reverse communication. This structure should be passed to the OdeSolverIteration subroutine. SEE ALSO AutoGKSmoothW, AutoGKSingular, AutoGKIteration, AutoGKResults. -- ALGLIB -- Copyright 01.09.2009 by Bochkanov Sergey *************************************************************************/ void odesolverrkck(const real_1d_array &y, const ae_int_t n, const real_1d_array &x, const ae_int_t m, const double eps, const double h, odesolverstate &state) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::odesolverrkck(const_cast(y.c_ptr()), n, const_cast(x.c_ptr()), m, eps, h, const_cast(state.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Cash-Karp adaptive ODE solver. This subroutine solves ODE Y'=f(Y,x) with initial conditions Y(xs)=Ys (here Y may be single variable or vector of N variables). INPUT PARAMETERS: Y - initial conditions, array[0..N-1]. 
contains values of Y[] at X[0] N - system size X - points at which Y should be tabulated, array[0..M-1] integrations starts at X[0], ends at X[M-1], intermediate values at X[i] are returned too. SHOULD BE ORDERED BY ASCENDING OR BY DESCENDING! M - number of intermediate points + first point + last point: * M>2 means that you need both Y(X[M-1]) and M-2 values at intermediate points * M=2 means that you want just to integrate from X[0] to X[1] and don't interested in intermediate values. * M=1 means that you don't want to integrate :) it is degenerate case, but it will be handled correctly. * M<1 means error Eps - tolerance (absolute/relative error on each step will be less than Eps). When passing: * Eps>0, it means desired ABSOLUTE error * Eps<0, it means desired RELATIVE error. Relative errors are calculated with respect to maximum values of Y seen so far. Be careful to use this criterion when starting from Y[] that are close to zero. H - initial step lenth, it will be adjusted automatically after the first step. If H=0, step will be selected automatically (usualy it will be equal to 0.001 of min(x[i]-x[j])). OUTPUT PARAMETERS State - structure which stores algorithm state between subsequent calls of OdeSolverIteration. Used for reverse communication. This structure should be passed to the OdeSolverIteration subroutine. SEE ALSO AutoGKSmoothW, AutoGKSingular, AutoGKIteration, AutoGKResults. -- ALGLIB -- Copyright 01.09.2009 by Bochkanov Sergey *************************************************************************/ void odesolverrkck(const real_1d_array &y, const real_1d_array &x, const double eps, const double h, odesolverstate &state) { alglib_impl::ae_state _alglib_env_state; ae_int_t n; ae_int_t m; n = y.length(); m = x.length(); alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::odesolverrkck(const_cast(y.c_ptr()), n, const_cast(x.c_ptr()), m, eps, h, const_cast(state.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* This function provides reverse communication interface Reverse communication interface is not documented or recommended to use. 
See below for functions which provide better documented API *************************************************************************/ bool odesolveriteration(const odesolverstate &state) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { ae_bool result = alglib_impl::odesolveriteration(const_cast(state.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } void odesolversolve(odesolverstate &state, void (*diff)(const real_1d_array &y, double x, real_1d_array &dy, void *ptr), void *ptr){ alglib_impl::ae_state _alglib_env_state; if( diff==NULL ) throw ap_error("ALGLIB: error in 'odesolversolve()' (diff is NULL)"); alglib_impl::ae_state_init(&_alglib_env_state); try { while( alglib_impl::odesolveriteration(state.c_ptr(), &_alglib_env_state) ) { if( state.needdy ) { diff(state.y, state.x, state.dy, ptr); continue; } throw ap_error("ALGLIB: unexpected error in 'odesolversolve'"); } alglib_impl::ae_state_clear(&_alglib_env_state); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* ODE solver results Called after OdeSolverIteration returned False. INPUT PARAMETERS: State - algorithm state (used by OdeSolverIteration). OUTPUT PARAMETERS: M - number of tabulated values, M>=1 XTbl - array[0..M-1], values of X YTbl - array[0..M-1,0..N-1], values of Y in X[i] Rep - solver report: * Rep.TerminationType completetion code: * -2 X is not ordered by ascending/descending or there are non-distinct X[], i.e. X[i]=X[i+1] * -1 incorrect parameters were specified * 1 task has been solved * Rep.NFEV contains number of function calculations -- ALGLIB -- Copyright 01.09.2009 by Bochkanov Sergey *************************************************************************/ void odesolverresults(const odesolverstate &state, ae_int_t &m, real_1d_array &xtbl, real_2d_array &ytbl, odesolverreport &rep) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::odesolverresults(const_cast(state.c_ptr()), &m, const_cast(xtbl.c_ptr()), const_cast(ytbl.c_ptr()), const_cast(rep.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } } ///////////////////////////////////////////////////////////////////////// // // THIS SECTION CONTAINS IMPLEMENTATION OF COMPUTATIONAL CORE // ///////////////////////////////////////////////////////////////////////// namespace alglib_impl { static double odesolver_odesolvermaxgrow = 3.0; static double odesolver_odesolvermaxshrink = 10.0; static void odesolver_odesolverinit(ae_int_t solvertype, /* Real */ ae_vector* y, ae_int_t n, /* Real */ ae_vector* x, ae_int_t m, double eps, double h, odesolverstate* state, ae_state *_state); /************************************************************************* Cash-Karp adaptive ODE solver. This subroutine solves ODE Y'=f(Y,x) with initial conditions Y(xs)=Ys (here Y may be single variable or vector of N variables). INPUT PARAMETERS: Y - initial conditions, array[0..N-1]. contains values of Y[] at X[0] N - system size X - points at which Y should be tabulated, array[0..M-1] integrations starts at X[0], ends at X[M-1], intermediate values at X[i] are returned too. 
SHOULD BE ORDERED BY ASCENDING OR BY DESCENDING! M - number of intermediate points + first point + last point: * M>2 means that you need both Y(X[M-1]) and M-2 values at intermediate points * M=2 means that you want just to integrate from X[0] to X[1] and don't interested in intermediate values. * M=1 means that you don't want to integrate :) it is degenerate case, but it will be handled correctly. * M<1 means error Eps - tolerance (absolute/relative error on each step will be less than Eps). When passing: * Eps>0, it means desired ABSOLUTE error * Eps<0, it means desired RELATIVE error. Relative errors are calculated with respect to maximum values of Y seen so far. Be careful to use this criterion when starting from Y[] that are close to zero. H - initial step lenth, it will be adjusted automatically after the first step. If H=0, step will be selected automatically (usualy it will be equal to 0.001 of min(x[i]-x[j])). OUTPUT PARAMETERS State - structure which stores algorithm state between subsequent calls of OdeSolverIteration. Used for reverse communication. This structure should be passed to the OdeSolverIteration subroutine. SEE ALSO AutoGKSmoothW, AutoGKSingular, AutoGKIteration, AutoGKResults. -- ALGLIB -- Copyright 01.09.2009 by Bochkanov Sergey *************************************************************************/ void odesolverrkck(/* Real */ ae_vector* y, ae_int_t n, /* Real */ ae_vector* x, ae_int_t m, double eps, double h, odesolverstate* state, ae_state *_state) { _odesolverstate_clear(state); ae_assert(n>=1, "ODESolverRKCK: N<1!", _state); ae_assert(m>=1, "ODESolverRKCK: M<1!", _state); ae_assert(y->cnt>=n, "ODESolverRKCK: Length(Y)cnt>=m, "ODESolverRKCK: Length(X)rstate.stage>=0 ) { n = state->rstate.ia.ptr.p_int[0]; m = state->rstate.ia.ptr.p_int[1]; i = state->rstate.ia.ptr.p_int[2]; j = state->rstate.ia.ptr.p_int[3]; k = state->rstate.ia.ptr.p_int[4]; klimit = state->rstate.ia.ptr.p_int[5]; gridpoint = state->rstate.ba.ptr.p_bool[0]; xc = state->rstate.ra.ptr.p_double[0]; v = state->rstate.ra.ptr.p_double[1]; h = state->rstate.ra.ptr.p_double[2]; h2 = state->rstate.ra.ptr.p_double[3]; err = state->rstate.ra.ptr.p_double[4]; maxgrowpow = state->rstate.ra.ptr.p_double[5]; } else { n = 359; m = -58; i = -919; j = -909; k = 81; klimit = 255; gridpoint = ae_false; xc = -788; v = 809; h = 205; h2 = -838; err = 939; maxgrowpow = -526; } if( state->rstate.stage==0 ) { goto lbl_0; } /* * Routine body */ /* * prepare */ if( state->repterminationtype!=0 ) { result = ae_false; return result; } n = state->n; m = state->m; h = state->h; maxgrowpow = ae_pow(odesolver_odesolvermaxgrow, (double)(5), _state); state->repnfev = 0; /* * some preliminary checks for internal errors * after this we assume that H>0 and M>1 */ ae_assert(ae_fp_greater(state->h,(double)(0)), "ODESolver: internal error", _state); ae_assert(m>1, "ODESolverIteration: internal error", _state); /* * choose solver */ if( state->solvertype!=0 ) { goto lbl_1; } /* * Cask-Karp solver * Prepare coefficients table. 
* Check it for errors */ ae_vector_set_length(&state->rka, 6, _state); state->rka.ptr.p_double[0] = (double)(0); state->rka.ptr.p_double[1] = (double)1/(double)5; state->rka.ptr.p_double[2] = (double)3/(double)10; state->rka.ptr.p_double[3] = (double)3/(double)5; state->rka.ptr.p_double[4] = (double)(1); state->rka.ptr.p_double[5] = (double)7/(double)8; ae_matrix_set_length(&state->rkb, 6, 5, _state); state->rkb.ptr.pp_double[1][0] = (double)1/(double)5; state->rkb.ptr.pp_double[2][0] = (double)3/(double)40; state->rkb.ptr.pp_double[2][1] = (double)9/(double)40; state->rkb.ptr.pp_double[3][0] = (double)3/(double)10; state->rkb.ptr.pp_double[3][1] = -(double)9/(double)10; state->rkb.ptr.pp_double[3][2] = (double)6/(double)5; state->rkb.ptr.pp_double[4][0] = -(double)11/(double)54; state->rkb.ptr.pp_double[4][1] = (double)5/(double)2; state->rkb.ptr.pp_double[4][2] = -(double)70/(double)27; state->rkb.ptr.pp_double[4][3] = (double)35/(double)27; state->rkb.ptr.pp_double[5][0] = (double)1631/(double)55296; state->rkb.ptr.pp_double[5][1] = (double)175/(double)512; state->rkb.ptr.pp_double[5][2] = (double)575/(double)13824; state->rkb.ptr.pp_double[5][3] = (double)44275/(double)110592; state->rkb.ptr.pp_double[5][4] = (double)253/(double)4096; ae_vector_set_length(&state->rkc, 6, _state); state->rkc.ptr.p_double[0] = (double)37/(double)378; state->rkc.ptr.p_double[1] = (double)(0); state->rkc.ptr.p_double[2] = (double)250/(double)621; state->rkc.ptr.p_double[3] = (double)125/(double)594; state->rkc.ptr.p_double[4] = (double)(0); state->rkc.ptr.p_double[5] = (double)512/(double)1771; ae_vector_set_length(&state->rkcs, 6, _state); state->rkcs.ptr.p_double[0] = (double)2825/(double)27648; state->rkcs.ptr.p_double[1] = (double)(0); state->rkcs.ptr.p_double[2] = (double)18575/(double)48384; state->rkcs.ptr.p_double[3] = (double)13525/(double)55296; state->rkcs.ptr.p_double[4] = (double)277/(double)14336; state->rkcs.ptr.p_double[5] = (double)1/(double)4; ae_matrix_set_length(&state->rkk, 6, n, _state); /* * Main cycle consists of two iterations: * * outer where we travel from X[i-1] to X[i] * * inner where we travel inside [X[i-1],X[i]] */ ae_matrix_set_length(&state->ytbl, m, n, _state); ae_vector_set_length(&state->escale, n, _state); ae_vector_set_length(&state->yn, n, _state); ae_vector_set_length(&state->yns, n, _state); xc = state->xg.ptr.p_double[0]; ae_v_move(&state->ytbl.ptr.pp_double[0][0], 1, &state->yc.ptr.p_double[0], 1, ae_v_len(0,n-1)); for(j=0; j<=n-1; j++) { state->escale.ptr.p_double[j] = (double)(0); } i = 1; lbl_3: if( i>m-1 ) { goto lbl_5; } /* * begin inner iteration */ lbl_6: if( ae_false ) { goto lbl_7; } /* * truncate step if needed (beyond right boundary). * determine should we store X or not */ if( ae_fp_greater_eq(xc+h,state->xg.ptr.p_double[i]) ) { h = state->xg.ptr.p_double[i]-xc; gridpoint = ae_true; } else { gridpoint = ae_false; } /* * Update error scale maximums * * These maximums are initialized by zeros, * then updated every iterations. */ for(j=0; j<=n-1; j++) { state->escale.ptr.p_double[j] = ae_maxreal(state->escale.ptr.p_double[j], ae_fabs(state->yc.ptr.p_double[j], _state), _state); } /* * make one step: * 1. calculate all info needed to do step * 2. update errors scale maximums using values/derivatives * obtained during (1) * * Take into account that we use scaling of X to reduce task * to the form where x[0] < x[1] < ... < x[n-1]. So X is * replaced by x=xscale*t, and dy/dx=f(y,x) is replaced * by dy/dt=xscale*f(y,xscale*t). 
*/ ae_v_move(&state->yn.ptr.p_double[0], 1, &state->yc.ptr.p_double[0], 1, ae_v_len(0,n-1)); ae_v_move(&state->yns.ptr.p_double[0], 1, &state->yc.ptr.p_double[0], 1, ae_v_len(0,n-1)); k = 0; lbl_8: if( k>5 ) { goto lbl_10; } /* * prepare data for the next update of YN/YNS */ state->x = state->xscale*(xc+state->rka.ptr.p_double[k]*h); ae_v_move(&state->y.ptr.p_double[0], 1, &state->yc.ptr.p_double[0], 1, ae_v_len(0,n-1)); for(j=0; j<=k-1; j++) { v = state->rkb.ptr.pp_double[k][j]; ae_v_addd(&state->y.ptr.p_double[0], 1, &state->rkk.ptr.pp_double[j][0], 1, ae_v_len(0,n-1), v); } state->needdy = ae_true; state->rstate.stage = 0; goto lbl_rcomm; lbl_0: state->needdy = ae_false; state->repnfev = state->repnfev+1; v = h*state->xscale; ae_v_moved(&state->rkk.ptr.pp_double[k][0], 1, &state->dy.ptr.p_double[0], 1, ae_v_len(0,n-1), v); /* * update YN/YNS */ v = state->rkc.ptr.p_double[k]; ae_v_addd(&state->yn.ptr.p_double[0], 1, &state->rkk.ptr.pp_double[k][0], 1, ae_v_len(0,n-1), v); v = state->rkcs.ptr.p_double[k]; ae_v_addd(&state->yns.ptr.p_double[0], 1, &state->rkk.ptr.pp_double[k][0], 1, ae_v_len(0,n-1), v); k = k+1; goto lbl_8; lbl_10: /* * estimate error */ err = (double)(0); for(j=0; j<=n-1; j++) { if( !state->fraceps ) { /* * absolute error is estimated */ err = ae_maxreal(err, ae_fabs(state->yn.ptr.p_double[j]-state->yns.ptr.p_double[j], _state), _state); } else { /* * Relative error is estimated */ v = state->escale.ptr.p_double[j]; if( ae_fp_eq(v,(double)(0)) ) { v = (double)(1); } err = ae_maxreal(err, ae_fabs(state->yn.ptr.p_double[j]-state->yns.ptr.p_double[j], _state)/v, _state); } } /* * calculate new step, restart if necessary */ if( ae_fp_less_eq(maxgrowpow*err,state->eps) ) { h2 = odesolver_odesolvermaxgrow*h; } else { h2 = h*ae_pow(state->eps/err, 0.2, _state); } if( ae_fp_less(h2,h/odesolver_odesolvermaxshrink) ) { h2 = h/odesolver_odesolvermaxshrink; } if( ae_fp_greater(err,state->eps) ) { h = h2; goto lbl_6; } /* * advance position */ xc = xc+h; ae_v_move(&state->yc.ptr.p_double[0], 1, &state->yn.ptr.p_double[0], 1, ae_v_len(0,n-1)); /* * update H */ h = h2; /* * break on grid point */ if( gridpoint ) { goto lbl_7; } goto lbl_6; lbl_7: /* * save result */ ae_v_move(&state->ytbl.ptr.pp_double[i][0], 1, &state->yc.ptr.p_double[0], 1, ae_v_len(0,n-1)); i = i+1; goto lbl_3; lbl_5: state->repterminationtype = 1; result = ae_false; return result; lbl_1: result = ae_false; return result; /* * Saving state */ lbl_rcomm: result = ae_true; state->rstate.ia.ptr.p_int[0] = n; state->rstate.ia.ptr.p_int[1] = m; state->rstate.ia.ptr.p_int[2] = i; state->rstate.ia.ptr.p_int[3] = j; state->rstate.ia.ptr.p_int[4] = k; state->rstate.ia.ptr.p_int[5] = klimit; state->rstate.ba.ptr.p_bool[0] = gridpoint; state->rstate.ra.ptr.p_double[0] = xc; state->rstate.ra.ptr.p_double[1] = v; state->rstate.ra.ptr.p_double[2] = h; state->rstate.ra.ptr.p_double[3] = h2; state->rstate.ra.ptr.p_double[4] = err; state->rstate.ra.ptr.p_double[5] = maxgrowpow; return result; } /************************************************************************* ODE solver results Called after OdeSolverIteration returned False. INPUT PARAMETERS: State - algorithm state (used by OdeSolverIteration). OUTPUT PARAMETERS: M - number of tabulated values, M>=1 XTbl - array[0..M-1], values of X YTbl - array[0..M-1,0..N-1], values of Y in X[i] Rep - solver report: * Rep.TerminationType completetion code: * -2 X is not ordered by ascending/descending or there are non-distinct X[], i.e. 
X[i]=X[i+1] * -1 incorrect parameters were specified * 1 task has been solved * Rep.NFEV contains number of function calculations -- ALGLIB -- Copyright 01.09.2009 by Bochkanov Sergey *************************************************************************/ void odesolverresults(odesolverstate* state, ae_int_t* m, /* Real */ ae_vector* xtbl, /* Real */ ae_matrix* ytbl, odesolverreport* rep, ae_state *_state) { double v; ae_int_t i; *m = 0; ae_vector_clear(xtbl); ae_matrix_clear(ytbl); _odesolverreport_clear(rep); rep->terminationtype = state->repterminationtype; if( rep->terminationtype>0 ) { *m = state->m; rep->nfev = state->repnfev; ae_vector_set_length(xtbl, state->m, _state); v = state->xscale; ae_v_moved(&xtbl->ptr.p_double[0], 1, &state->xg.ptr.p_double[0], 1, ae_v_len(0,state->m-1), v); ae_matrix_set_length(ytbl, state->m, state->n, _state); for(i=0; i<=state->m-1; i++) { ae_v_move(&ytbl->ptr.pp_double[i][0], 1, &state->ytbl.ptr.pp_double[i][0], 1, ae_v_len(0,state->n-1)); } } else { rep->nfev = 0; } } /************************************************************************* Internal initialization subroutine *************************************************************************/ static void odesolver_odesolverinit(ae_int_t solvertype, /* Real */ ae_vector* y, ae_int_t n, /* Real */ ae_vector* x, ae_int_t m, double eps, double h, odesolverstate* state, ae_state *_state) { ae_int_t i; double v; _odesolverstate_clear(state); /* * Prepare RComm */ ae_vector_set_length(&state->rstate.ia, 5+1, _state); ae_vector_set_length(&state->rstate.ba, 0+1, _state); ae_vector_set_length(&state->rstate.ra, 5+1, _state); state->rstate.stage = -1; state->needdy = ae_false; /* * check parameters. */ if( (n<=0||m<1)||ae_fp_eq(eps,(double)(0)) ) { state->repterminationtype = -1; return; } if( ae_fp_less(h,(double)(0)) ) { h = -h; } /* * quick exit if necessary. 
* after this block we assume that M>1 */ if( m==1 ) { state->repnfev = 0; state->repterminationtype = 1; ae_matrix_set_length(&state->ytbl, 1, n, _state); ae_v_move(&state->ytbl.ptr.pp_double[0][0], 1, &y->ptr.p_double[0], 1, ae_v_len(0,n-1)); ae_vector_set_length(&state->xg, m, _state); ae_v_move(&state->xg.ptr.p_double[0], 1, &x->ptr.p_double[0], 1, ae_v_len(0,m-1)); return; } /* * check again: correct order of X[] */ if( ae_fp_eq(x->ptr.p_double[1],x->ptr.p_double[0]) ) { state->repterminationtype = -2; return; } for(i=1; i<=m-1; i++) { if( (ae_fp_greater(x->ptr.p_double[1],x->ptr.p_double[0])&&ae_fp_less_eq(x->ptr.p_double[i],x->ptr.p_double[i-1]))||(ae_fp_less(x->ptr.p_double[1],x->ptr.p_double[0])&&ae_fp_greater_eq(x->ptr.p_double[i],x->ptr.p_double[i-1])) ) { state->repterminationtype = -2; return; } } /* * auto-select H if necessary */ if( ae_fp_eq(h,(double)(0)) ) { v = ae_fabs(x->ptr.p_double[1]-x->ptr.p_double[0], _state); for(i=2; i<=m-1; i++) { v = ae_minreal(v, ae_fabs(x->ptr.p_double[i]-x->ptr.p_double[i-1], _state), _state); } h = 0.001*v; } /* * store parameters */ state->n = n; state->m = m; state->h = h; state->eps = ae_fabs(eps, _state); state->fraceps = ae_fp_less(eps,(double)(0)); ae_vector_set_length(&state->xg, m, _state); ae_v_move(&state->xg.ptr.p_double[0], 1, &x->ptr.p_double[0], 1, ae_v_len(0,m-1)); if( ae_fp_greater(x->ptr.p_double[1],x->ptr.p_double[0]) ) { state->xscale = (double)(1); } else { state->xscale = (double)(-1); ae_v_muld(&state->xg.ptr.p_double[0], 1, ae_v_len(0,m-1), -1); } ae_vector_set_length(&state->yc, n, _state); ae_v_move(&state->yc.ptr.p_double[0], 1, &y->ptr.p_double[0], 1, ae_v_len(0,n-1)); state->solvertype = solvertype; state->repterminationtype = 0; /* * Allocate arrays */ ae_vector_set_length(&state->y, n, _state); ae_vector_set_length(&state->dy, n, _state); } void _odesolverstate_init(void* _p, ae_state *_state) { odesolverstate *p = (odesolverstate*)_p; ae_touch_ptr((void*)p); ae_vector_init(&p->yc, 0, DT_REAL, _state); ae_vector_init(&p->escale, 0, DT_REAL, _state); ae_vector_init(&p->xg, 0, DT_REAL, _state); ae_vector_init(&p->y, 0, DT_REAL, _state); ae_vector_init(&p->dy, 0, DT_REAL, _state); ae_matrix_init(&p->ytbl, 0, 0, DT_REAL, _state); ae_vector_init(&p->yn, 0, DT_REAL, _state); ae_vector_init(&p->yns, 0, DT_REAL, _state); ae_vector_init(&p->rka, 0, DT_REAL, _state); ae_vector_init(&p->rkc, 0, DT_REAL, _state); ae_vector_init(&p->rkcs, 0, DT_REAL, _state); ae_matrix_init(&p->rkb, 0, 0, DT_REAL, _state); ae_matrix_init(&p->rkk, 0, 0, DT_REAL, _state); _rcommstate_init(&p->rstate, _state); } void _odesolverstate_init_copy(void* _dst, void* _src, ae_state *_state) { odesolverstate *dst = (odesolverstate*)_dst; odesolverstate *src = (odesolverstate*)_src; dst->n = src->n; dst->m = src->m; dst->xscale = src->xscale; dst->h = src->h; dst->eps = src->eps; dst->fraceps = src->fraceps; ae_vector_init_copy(&dst->yc, &src->yc, _state); ae_vector_init_copy(&dst->escale, &src->escale, _state); ae_vector_init_copy(&dst->xg, &src->xg, _state); dst->solvertype = src->solvertype; dst->needdy = src->needdy; dst->x = src->x; ae_vector_init_copy(&dst->y, &src->y, _state); ae_vector_init_copy(&dst->dy, &src->dy, _state); ae_matrix_init_copy(&dst->ytbl, &src->ytbl, _state); dst->repterminationtype = src->repterminationtype; dst->repnfev = src->repnfev; ae_vector_init_copy(&dst->yn, &src->yn, _state); ae_vector_init_copy(&dst->yns, &src->yns, _state); ae_vector_init_copy(&dst->rka, &src->rka, _state); ae_vector_init_copy(&dst->rkc, &src->rkc, 
_state); ae_vector_init_copy(&dst->rkcs, &src->rkcs, _state); ae_matrix_init_copy(&dst->rkb, &src->rkb, _state); ae_matrix_init_copy(&dst->rkk, &src->rkk, _state); _rcommstate_init_copy(&dst->rstate, &src->rstate, _state); }
void _odesolverstate_clear(void* _p) { odesolverstate *p = (odesolverstate*)_p; ae_touch_ptr((void*)p); ae_vector_clear(&p->yc); ae_vector_clear(&p->escale); ae_vector_clear(&p->xg); ae_vector_clear(&p->y); ae_vector_clear(&p->dy); ae_matrix_clear(&p->ytbl); ae_vector_clear(&p->yn); ae_vector_clear(&p->yns); ae_vector_clear(&p->rka); ae_vector_clear(&p->rkc); ae_vector_clear(&p->rkcs); ae_matrix_clear(&p->rkb); ae_matrix_clear(&p->rkk); _rcommstate_clear(&p->rstate); }
void _odesolverstate_destroy(void* _p) { odesolverstate *p = (odesolverstate*)_p; ae_touch_ptr((void*)p); ae_vector_destroy(&p->yc); ae_vector_destroy(&p->escale); ae_vector_destroy(&p->xg); ae_vector_destroy(&p->y); ae_vector_destroy(&p->dy); ae_matrix_destroy(&p->ytbl); ae_vector_destroy(&p->yn); ae_vector_destroy(&p->yns); ae_vector_destroy(&p->rka); ae_vector_destroy(&p->rkc); ae_vector_destroy(&p->rkcs); ae_matrix_destroy(&p->rkb); ae_matrix_destroy(&p->rkk); _rcommstate_destroy(&p->rstate); }
void _odesolverreport_init(void* _p, ae_state *_state) { odesolverreport *p = (odesolverreport*)_p; ae_touch_ptr((void*)p); }
void _odesolverreport_init_copy(void* _dst, void* _src, ae_state *_state) { odesolverreport *dst = (odesolverreport*)_dst; odesolverreport *src = (odesolverreport*)_src; dst->nfev = src->nfev; dst->terminationtype = src->terminationtype; }
void _odesolverreport_clear(void* _p) { odesolverreport *p = (odesolverreport*)_p; ae_touch_ptr((void*)p); }
void _odesolverreport_destroy(void* _p) { odesolverreport *p = (odesolverreport*)_p; ae_touch_ptr((void*)p); }
}

cpp/src/specialfunctions.cpp

/*************************************************************************
ALGLIB 3.11.0 (source code generated 2017-05-11)
Copyright (c) Sergey Bochkanov (ALGLIB project).

>>> SOURCE LICENSE >>>
This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation (www.fsf.org); either version 2 of the License, or (at your option) any later version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
A copy of the GNU General Public License is available at http://www.fsf.org/licensing/licenses
>>> END OF LICENSE >>>
*************************************************************************/
#include "stdafx.h"
#include "specialfunctions.h"

// disable some irrelevant warnings
#if (AE_COMPILER==AE_MSVC)
#pragma warning(disable:4100)
#pragma warning(disable:4127)
#pragma warning(disable:4702)
#pragma warning(disable:4996)
#endif
using namespace std;

/////////////////////////////////////////////////////////////////////////
//
// THIS SECTION CONTAINS IMPLEMENTATION OF C++ INTERFACE
//
/////////////////////////////////////////////////////////////////////////
namespace alglib
{
/*************************************************************************
Gamma function

Input parameters:
    X   -   argument

Domain:
    0 < X < 171.6
    -170 < X < 0, X is not an integer.
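Example (an illustrative usage sketch added for this manual; it is not part of the original Cephes/ALGLIB sources and assumes only the 'alglib' C++ interface declared in specialfunctions.h):

    double g1 = alglib::gammafunction(5.0);    // Gamma(5) = 4! = 24
    double g2 = alglib::gammafunction(0.5);    // Gamma(1/2) = sqrt(pi), approx. 1.7724539

Both calls stay inside the domain given above; the expected values follow from Gamma(n) = (n-1)! and Gamma(1/2) = sqrt(pi).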
Relative error: arithmetic domain # trials peak rms IEEE -170,-33 20000 2.3e-15 3.3e-16 IEEE -33, 33 20000 9.4e-16 2.2e-16 IEEE 33, 171.6 20000 2.3e-15 3.2e-16 Cephes Math Library Release 2.8: June, 2000 Original copyright 1984, 1987, 1989, 1992, 2000 by Stephen L. Moshier Translated to AlgoPascal by Bochkanov Sergey (2005, 2006, 2007). *************************************************************************/ double gammafunction(const double x) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::gammafunction(x, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Natural logarithm of gamma function Input parameters: X - argument Result: logarithm of the absolute value of the Gamma(X). Output parameters: SgnGam - sign(Gamma(X)) Domain: 0 < X < 2.55e305 -2.55e305 < X < 0, X is not an integer. ACCURACY: arithmetic domain # trials peak rms IEEE 0, 3 28000 5.4e-16 1.1e-16 IEEE 2.718, 2.556e305 40000 3.5e-16 8.3e-17 The error criterion was relative when the function magnitude was greater than one but absolute when it was less than one. The following test used the relative error criterion, though at certain points the relative error could be much higher than indicated. IEEE -200, -4 10000 4.8e-16 1.3e-16 Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 1989, 1992, 2000 by Stephen L. Moshier Translated to AlgoPascal by Bochkanov Sergey (2005, 2006, 2007). *************************************************************************/ double lngamma(const double x, double &sgngam) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::lngamma(x, &sgngam, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Error function The integral is x - 2 | | 2 erf(x) = -------- | exp( - t ) dt. sqrt(pi) | | - 0 For 0 <= |x| < 1, erf(x) = x * P4(x**2)/Q5(x**2); otherwise erf(x) = 1 - erfc(x). ACCURACY: Relative error: arithmetic domain # trials peak rms IEEE 0,1 30000 3.7e-16 1.0e-16 Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 1988, 1992, 2000 by Stephen L. Moshier *************************************************************************/ double errorfunction(const double x) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::errorfunction(x, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Complementary error function 1 - erf(x) = inf. - 2 | | 2 erfc(x) = -------- | exp( - t ) dt sqrt(pi) | | - x For small x, erfc(x) = 1 - erf(x); otherwise rational approximations are computed. ACCURACY: Relative error: arithmetic domain # trials peak rms IEEE 0,26.6417 30000 5.7e-14 1.5e-14 Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 1988, 1992, 2000 by Stephen L. 
Moshier *************************************************************************/ double errorfunctionc(const double x) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::errorfunctionc(x, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Normal distribution function Returns the area under the Gaussian probability density function, integrated from minus infinity to x: x - 1 | | 2 ndtr(x) = --------- | exp( - t /2 ) dt sqrt(2pi) | | - -inf. = ( 1 + erf(z) ) / 2 = erfc(z) / 2 where z = x/sqrt(2). Computation is via the functions erf and erfc. ACCURACY: Relative error: arithmetic domain # trials peak rms IEEE -13,0 30000 3.4e-14 6.7e-15 Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 1988, 1992, 2000 by Stephen L. Moshier *************************************************************************/ double normaldistribution(const double x) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::normaldistribution(x, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Inverse of the error function Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 1988, 1992, 2000 by Stephen L. Moshier *************************************************************************/ double inverf(const double e) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::inverf(e, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Inverse of Normal distribution function Returns the argument, x, for which the area under the Gaussian probability density function (integrated from minus infinity to x) is equal to y. For small arguments 0 < y < exp(-2), the program computes z = sqrt( -2.0 * log(y) ); then the approximation is x = z - log(z)/z - (1/z) P(1/z) / Q(1/z). There are two rational functions P/Q, one for 0 < y < exp(-32) and the other for y up to exp(-2). For larger arguments, w = y - 0.5, and x/sqrt(2pi) = w + w**3 R(w**2)/S(w**2)). ACCURACY: Relative error: arithmetic domain # trials peak rms IEEE 0.125, 1 20000 7.2e-16 1.3e-16 IEEE 3e-308, 0.135 50000 4.6e-16 9.8e-17 Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 1988, 1992, 2000 by Stephen L. 
Moshier *************************************************************************/ double invnormaldistribution(const double y0) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::invnormaldistribution(y0, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Incomplete gamma integral The function is defined by x - 1 | | -t a-1 igam(a,x) = ----- | e t dt. - | | | (a) - 0 In this implementation both arguments must be positive. The integral is evaluated by either a power series or continued fraction expansion, depending on the relative values of a and x. ACCURACY: Relative error: arithmetic domain # trials peak rms IEEE 0,30 200000 3.6e-14 2.9e-15 IEEE 0,100 300000 9.9e-14 1.5e-14 Cephes Math Library Release 2.8: June, 2000 Copyright 1985, 1987, 2000 by Stephen L. Moshier *************************************************************************/ double incompletegamma(const double a, const double x) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::incompletegamma(a, x, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Complemented incomplete gamma integral The function is defined by igamc(a,x) = 1 - igam(a,x) inf. - 1 | | -t a-1 = ----- | e t dt. - | | | (a) - x In this implementation both arguments must be positive. The integral is evaluated by either a power series or continued fraction expansion, depending on the relative values of a and x. ACCURACY: Tested at random a, x. a x Relative error: arithmetic domain domain # trials peak rms IEEE 0.5,100 0,100 200000 1.9e-14 1.7e-15 IEEE 0.01,0.5 0,100 200000 1.4e-13 1.6e-15 Cephes Math Library Release 2.8: June, 2000 Copyright 1985, 1987, 2000 by Stephen L. Moshier *************************************************************************/ double incompletegammac(const double a, const double x) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::incompletegammac(a, x, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Inverse of complemented imcomplete gamma integral Given p, the function finds x such that igamc( a, x ) = p. Starting with the approximate value 3 x = a t where t = 1 - d - ndtri(p) sqrt(d) and d = 1/9a, the routine performs up to 10 Newton iterations to find the root of igamc(a,x) - p = 0. ACCURACY: Tested at random a, p in the intervals indicated. a p Relative error: arithmetic domain domain # trials peak rms IEEE 0.5,100 0,0.5 100000 1.0e-14 1.7e-15 IEEE 0.01,0.5 0,0.5 100000 9.0e-14 3.4e-15 IEEE 0.5,10000 0,0.5 20000 2.3e-13 3.8e-14 Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 1995, 2000 by Stephen L. 
Moshier *************************************************************************/ double invincompletegammac(const double a, const double y0) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::invincompletegammac(a, y0, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Complete elliptic integral of the first kind Approximates the integral pi/2 - | | | dt K(m) = | ------------------ | 2 | | sqrt( 1 - m sin t ) - 0 using the approximation P(x) - log x Q(x). ACCURACY: Relative error: arithmetic domain # trials peak rms IEEE 0,1 30000 2.5e-16 6.8e-17 Cephes Math Library, Release 2.8: June, 2000 Copyright 1984, 1987, 2000 by Stephen L. Moshier *************************************************************************/ double ellipticintegralk(const double m) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::ellipticintegralk(m, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Complete elliptic integral of the first kind Approximates the integral pi/2 - | | | dt K(m) = | ------------------ | 2 | | sqrt( 1 - m sin t ) - 0 where m = 1 - m1, using the approximation P(x) - log x Q(x). The argument m1 is used rather than m so that the logarithmic singularity at m = 1 will be shifted to the origin; this preserves maximum accuracy. K(0) = pi/2. ACCURACY: Relative error: arithmetic domain # trials peak rms IEEE 0,1 30000 2.5e-16 6.8e-17 Cephes Math Library, Release 2.8: June, 2000 Copyright 1984, 1987, 2000 by Stephen L. Moshier *************************************************************************/ double ellipticintegralkhighprecision(const double m1) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::ellipticintegralkhighprecision(m1, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Incomplete elliptic integral of the first kind F(phi|m) Approximates the integral phi - | | | dt F(phi_\m) = | ------------------ | 2 | | sqrt( 1 - m sin t ) - 0 of amplitude phi and modulus m, using the arithmetic - geometric mean algorithm. ACCURACY: Tested at random points with m in [0, 1] and phi as indicated. Relative error: arithmetic domain # trials peak rms IEEE -10,10 200000 7.4e-16 1.0e-16 Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 2000 by Stephen L. 
Moshier *************************************************************************/ double incompleteellipticintegralk(const double phi, const double m) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::incompleteellipticintegralk(phi, m, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Complete elliptic integral of the second kind Approximates the integral pi/2 - | | 2 E(m) = | sqrt( 1 - m sin t ) dt | | - 0 using the approximation P(x) - x log x Q(x). ACCURACY: Relative error: arithmetic domain # trials peak rms IEEE 0, 1 10000 2.1e-16 7.3e-17 Cephes Math Library, Release 2.8: June, 2000 Copyright 1984, 1987, 1989, 2000 by Stephen L. Moshier *************************************************************************/ double ellipticintegrale(const double m) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::ellipticintegrale(m, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Incomplete elliptic integral of the second kind Approximates the integral phi - | | | 2 E(phi_\m) = | sqrt( 1 - m sin t ) dt | | | - 0 of amplitude phi and modulus m, using the arithmetic - geometric mean algorithm. ACCURACY: Tested at random arguments with phi in [-10, 10] and m in [0, 1]. Relative error: arithmetic domain # trials peak rms IEEE -10,10 150000 3.3e-15 1.4e-16 Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 1993, 2000 by Stephen L. Moshier *************************************************************************/ double incompleteellipticintegrale(const double phi, const double m) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::incompleteellipticintegrale(phi, m, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Calculation of the value of the Hermite polynomial. Parameters: n - degree, n>=0 x - argument Result: the value of the Hermite polynomial Hn at x *************************************************************************/ double hermitecalculate(const ae_int_t n, const double x) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::hermitecalculate(n, x, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Summation of Hermite polynomials using Clenshaw's recurrence formula. This routine calculates c[0]*H0(x) + c[1]*H1(x) + ... 
+ c[N]*HN(x) Parameters: n - degree, n>=0 x - argument Result: the value of the Hermite polynomial at x *************************************************************************/ double hermitesum(const real_1d_array &c, const ae_int_t n, const double x) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::hermitesum(const_cast(c.c_ptr()), n, x, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Representation of Hn as C[0] + C[1]*X + ... + C[N]*X^N Input parameters: N - polynomial degree, n>=0 Output parameters: C - coefficients *************************************************************************/ void hermitecoefficients(const ae_int_t n, real_1d_array &c) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::hermitecoefficients(n, const_cast(c.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Dawson's Integral Approximates the integral x - 2 | | 2 dawsn(x) = exp( -x ) | exp( t ) dt | | - 0 Three different rational approximations are employed, for the intervals 0 to 3.25; 3.25 to 6.25; and 6.25 up. ACCURACY: Relative error: arithmetic domain # trials peak rms IEEE 0,10 10000 6.9e-16 1.0e-16 Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 1989, 2000 by Stephen L. Moshier *************************************************************************/ double dawsonintegral(const double x) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::dawsonintegral(x, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Sine and cosine integrals Evaluates the integrals x - | cos t - 1 Ci(x) = eul + ln x + | --------- dt, | t - 0 x - | sin t Si(x) = | ----- dt | t - 0 where eul = 0.57721566490153286061 is Euler's constant. The integrals are approximated by rational functions. For x > 8 auxiliary functions f(x) and g(x) are employed such that Ci(x) = f(x) sin(x) - g(x) cos(x) Si(x) = pi/2 - f(x) cos(x) - g(x) sin(x) ACCURACY: Test interval = [0,50]. Absolute error, except relative when > 1: arithmetic function # trials peak rms IEEE Si 30000 4.4e-16 7.3e-17 IEEE Ci 30000 6.9e-16 5.1e-17 Cephes Math Library Release 2.1: January, 1989 Copyright 1984, 1987, 1989 by Stephen L. 
Moshier *************************************************************************/ void sinecosineintegrals(const double x, double &si, double &ci) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::sinecosineintegrals(x, &si, &ci, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Hyperbolic sine and cosine integrals Approximates the integrals x - | | cosh t - 1 Chi(x) = eul + ln x + | ----------- dt, | | t - 0 x - | | sinh t Shi(x) = | ------ dt | | t - 0 where eul = 0.57721566490153286061 is Euler's constant. The integrals are evaluated by power series for x < 8 and by Chebyshev expansions for x between 8 and 88. For large x, both functions approach exp(x)/2x. Arguments greater than 88 in magnitude return MAXNUM. ACCURACY: Test interval 0 to 88. Relative error: arithmetic function # trials peak rms IEEE Shi 30000 6.9e-16 1.6e-16 Absolute error, except relative when |Chi| > 1: IEEE Chi 30000 8.4e-16 1.4e-16 Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 2000 by Stephen L. Moshier *************************************************************************/ void hyperbolicsinecosineintegrals(const double x, double &shi, double &chi) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::hyperbolicsinecosineintegrals(x, &shi, &chi, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Poisson distribution Returns the sum of the first k+1 terms of the Poisson distribution: k j -- -m m > e -- -- j! j=0 The terms are not summed directly; instead the incomplete gamma integral is employed, according to the relation y = pdtr( k, m ) = igamc( k+1, m ). The arguments must both be positive. ACCURACY: See incomplete gamma function Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 1995, 2000 by Stephen L. Moshier *************************************************************************/ double poissondistribution(const ae_int_t k, const double m) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::poissondistribution(k, m, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Complemented Poisson distribution Returns the sum of the terms k+1 to infinity of the Poisson distribution: inf. j -- -m m > e -- -- j! j=k+1 The terms are not summed directly; instead the incomplete gamma integral is employed, according to the formula y = pdtrc( k, m ) = igam( k+1, m ). The arguments must both be positive. ACCURACY: See incomplete gamma function Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 1995, 2000 by Stephen L. 
Moshier *************************************************************************/ double poissoncdistribution(const ae_int_t k, const double m) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::poissoncdistribution(k, m, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Inverse Poisson distribution Finds the Poisson variable x such that the integral from 0 to x of the Poisson density is equal to the given probability y. This is accomplished using the inverse gamma integral function and the relation m = igami( k+1, y ). ACCURACY: See inverse incomplete gamma function Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 1995, 2000 by Stephen L. Moshier *************************************************************************/ double invpoissondistribution(const ae_int_t k, const double y) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::invpoissondistribution(k, y, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Bessel function of order zero Returns Bessel function of order zero of the argument. The domain is divided into the intervals [0, 5] and (5, infinity). In the first interval the following rational approximation is used: 2 2 (w - r ) (w - r ) P (w) / Q (w) 1 2 3 8 2 where w = x and the two r's are zeros of the function. In the second interval, the Hankel asymptotic expansion is employed with two rational functions of degree 6/6 and 7/7. ACCURACY: Absolute error: arithmetic domain # trials peak rms IEEE 0, 30 60000 4.2e-16 1.1e-16 Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 1989, 2000 by Stephen L. Moshier *************************************************************************/ double besselj0(const double x) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::besselj0(x, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Bessel function of order one Returns Bessel function of order one of the argument. The domain is divided into the intervals [0, 8] and (8, infinity). In the first interval a 24 term Chebyshev expansion is used. In the second, the asymptotic trigonometric representation is employed using two rational functions of degree 5/5. ACCURACY: Absolute error: arithmetic domain # trials peak rms IEEE 0, 30 30000 2.6e-16 1.1e-16 Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 1989, 2000 by Stephen L. 
Moshier *************************************************************************/ double besselj1(const double x) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::besselj1(x, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Bessel function of integer order Returns Bessel function of order n, where n is a (possibly negative) integer. The ratio of jn(x) to j0(x) is computed by backward recurrence. First the ratio jn/jn-1 is found by a continued fraction expansion. Then the recurrence relating successive orders is applied until j0 or j1 is reached. If n = 0 or 1 the routine for j0 or j1 is called directly. ACCURACY: Absolute error: arithmetic range # trials peak rms IEEE 0, 30 5000 4.4e-16 7.9e-17 Not suitable for large n or x. Use jv() (fractional order) instead. Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 2000 by Stephen L. Moshier *************************************************************************/ double besseljn(const ae_int_t n, const double x) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::besseljn(n, x, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Bessel function of the second kind, order zero Returns Bessel function of the second kind, of order zero, of the argument. The domain is divided into the intervals [0, 5] and (5, infinity). In the first interval a rational approximation R(x) is employed to compute y0(x) = R(x) + 2 * log(x) * j0(x) / PI. Thus a call to j0() is required. In the second interval, the Hankel asymptotic expansion is employed with two rational functions of degree 6/6 and 7/7. ACCURACY: Absolute error, when y0(x) < 1; else relative error: arithmetic domain # trials peak rms IEEE 0, 30 30000 1.3e-15 1.6e-16 Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 1989, 2000 by Stephen L. Moshier *************************************************************************/ double bessely0(const double x) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::bessely0(x, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Bessel function of second kind of order one Returns Bessel function of the second kind of order one of the argument. The domain is divided into the intervals [0, 8] and (8, infinity). In the first interval a 25 term Chebyshev expansion is used, and a call to j1() is required. In the second, the asymptotic trigonometric representation is employed using two rational functions of degree 5/5. ACCURACY: Absolute error: arithmetic domain # trials peak rms IEEE 0, 30 30000 1.0e-15 1.3e-16 Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 1989, 2000 by Stephen L. 
Moshier *************************************************************************/ double bessely1(const double x) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::bessely1(x, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Bessel function of second kind of integer order Returns Bessel function of order n, where n is a (possibly negative) integer. The function is evaluated by forward recurrence on n, starting with values computed by the routines y0() and y1(). If n = 0 or 1 the routine for y0 or y1 is called directly. ACCURACY: Absolute error, except relative when y > 1: arithmetic domain # trials peak rms IEEE 0, 30 30000 3.4e-15 4.3e-16 Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 2000 by Stephen L. Moshier *************************************************************************/ double besselyn(const ae_int_t n, const double x) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::besselyn(n, x, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Modified Bessel function of order zero Returns modified Bessel function of order zero of the argument. The function is defined as i0(x) = j0( ix ). The range is partitioned into the two intervals [0,8] and (8, infinity). Chebyshev polynomial expansions are employed in each interval. ACCURACY: Relative error: arithmetic domain # trials peak rms IEEE 0,30 30000 5.8e-16 1.4e-16 Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 2000 by Stephen L. Moshier *************************************************************************/ double besseli0(const double x) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::besseli0(x, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Modified Bessel function of order one Returns modified Bessel function of order one of the argument. The function is defined as i1(x) = -i j1( ix ). The range is partitioned into the two intervals [0,8] and (8, infinity). Chebyshev polynomial expansions are employed in each interval. ACCURACY: Relative error: arithmetic domain # trials peak rms IEEE 0, 30 30000 1.9e-15 2.1e-16 Cephes Math Library Release 2.8: June, 2000 Copyright 1985, 1987, 2000 by Stephen L. 
Moshier *************************************************************************/ double besseli1(const double x) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::besseli1(x, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Modified Bessel function, second kind, order zero Returns modified Bessel function of the second kind of order zero of the argument. The range is partitioned into the two intervals [0,8] and (8, infinity). Chebyshev polynomial expansions are employed in each interval. ACCURACY: Tested at 2000 random points between 0 and 8. Peak absolute error (relative when K0 > 1) was 1.46e-14; rms, 4.26e-15. Relative error: arithmetic domain # trials peak rms IEEE 0, 30 30000 1.2e-15 1.6e-16 Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 2000 by Stephen L. Moshier *************************************************************************/ double besselk0(const double x) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::besselk0(x, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Modified Bessel function, second kind, order one Computes the modified Bessel function of the second kind of order one of the argument. The range is partitioned into the two intervals [0,2] and (2, infinity). Chebyshev polynomial expansions are employed in each interval. ACCURACY: Relative error: arithmetic domain # trials peak rms IEEE 0, 30 30000 1.2e-15 1.6e-16 Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 2000 by Stephen L. Moshier *************************************************************************/ double besselk1(const double x) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::besselk1(x, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Modified Bessel function, second kind, integer order Returns modified Bessel function of the second kind of order n of the argument. The range is partitioned into the two intervals [0,9.55] and (9.55, infinity). An ascending power series is used in the low range, and an asymptotic expansion in the high range. ACCURACY: Relative error: arithmetic domain # trials peak rms IEEE 0,30 90000 1.8e-8 3.0e-10 Error is high only near the crossover point x = 9.55 between the two expansions used. Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 1988, 2000 by Stephen L. 
Moshier *************************************************************************/ double besselkn(const ae_int_t nn, const double x) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::besselkn(nn, x, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Incomplete beta integral Returns incomplete beta integral of the arguments, evaluated from zero to x. The function is defined as x - - | (a+b) | | a-1 b-1 ----------- | t (1-t) dt. - - | | | (a) | (b) - 0 The domain of definition is 0 <= x <= 1. In this implementation a and b are restricted to positive values. The integral from x to 1 may be obtained by the symmetry relation 1 - incbet( a, b, x ) = incbet( b, a, 1-x ). The integral is evaluated by a continued fraction expansion or, when b*x is small, by a power series. ACCURACY: Tested at uniformly distributed random points (a,b,x) with a and b in "domain" and x between 0 and 1. Relative error arithmetic domain # trials peak rms IEEE 0,5 10000 6.9e-15 4.5e-16 IEEE 0,85 250000 2.2e-13 1.7e-14 IEEE 0,1000 30000 5.3e-12 6.3e-13 IEEE 0,10000 250000 9.3e-11 7.1e-12 IEEE 0,100000 10000 8.7e-10 4.8e-11 Outputs smaller than the IEEE gradual underflow threshold were excluded from these statistics. Cephes Math Library, Release 2.8: June, 2000 Copyright 1984, 1995, 2000 by Stephen L. Moshier *************************************************************************/ double incompletebeta(const double a, const double b, const double x) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::incompletebeta(a, b, x, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Inverse of imcomplete beta integral Given y, the function finds x such that incbet( a, b, x ) = y . The routine performs interval halving or Newton iterations to find the root of incbet(a,b,x) - y = 0. ACCURACY: Relative error: x a,b arithmetic domain domain # trials peak rms IEEE 0,1 .5,10000 50000 5.8e-12 1.3e-13 IEEE 0,1 .25,100 100000 1.8e-13 3.9e-15 IEEE 0,1 0,5 50000 1.1e-12 5.5e-15 With a and b constrained to half-integer or integer values: IEEE 0,1 .5,10000 50000 5.8e-12 1.1e-13 IEEE 0,1 .5,100 100000 1.7e-14 7.9e-16 With a = .5, b constrained to half-integer or integer values: IEEE 0,1 .5,10000 10000 8.3e-11 1.0e-11 Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1996, 2000 by Stephen L. 
Moshier *************************************************************************/ double invincompletebeta(const double a, const double b, const double y) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::invincompletebeta(a, b, y, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* F distribution Returns the area from zero to x under the F density function (also known as Snedcor's density or the variance ratio density). This is the density of x = (u1/df1)/(u2/df2), where u1 and u2 are random variables having Chi square distributions with df1 and df2 degrees of freedom, respectively. The incomplete beta integral is used, according to the formula P(x) = incbet( df1/2, df2/2, (df1*x/(df2 + df1*x) ). The arguments a and b are greater than zero, and x is nonnegative. ACCURACY: Tested at random points (a,b,x). x a,b Relative error: arithmetic domain domain # trials peak rms IEEE 0,1 0,100 100000 9.8e-15 1.7e-15 IEEE 1,5 0,100 100000 6.5e-15 3.5e-16 IEEE 0,1 1,10000 100000 2.2e-11 3.3e-12 IEEE 1,5 1,10000 100000 1.1e-11 1.7e-13 Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 1995, 2000 by Stephen L. Moshier *************************************************************************/ double fdistribution(const ae_int_t a, const ae_int_t b, const double x) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::fdistribution(a, b, x, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Complemented F distribution Returns the area from x to infinity under the F density function (also known as Snedcor's density or the variance ratio density). inf. - 1 | | a-1 b-1 1-P(x) = ------ | t (1-t) dt B(a,b) | | - x The incomplete beta integral is used, according to the formula P(x) = incbet( df2/2, df1/2, (df2/(df2 + df1*x) ). ACCURACY: Tested at random points (a,b,x) in the indicated intervals. x a,b Relative error: arithmetic domain domain # trials peak rms IEEE 0,1 1,100 100000 3.7e-14 5.9e-16 IEEE 1,5 1,100 100000 8.0e-15 1.6e-15 IEEE 0,1 1,10000 100000 1.8e-11 3.5e-13 IEEE 1,5 1,10000 100000 2.0e-11 3.0e-12 Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 1995, 2000 by Stephen L. Moshier *************************************************************************/ double fcdistribution(const ae_int_t a, const ae_int_t b, const double x) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::fcdistribution(a, b, x, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Inverse of complemented F distribution Finds the F density argument x such that the integral from x to infinity of the F density is equal to the given probability p. 
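Example (an illustrative round-trip sketch added for this manual, not part of the original sources; the degrees of freedom 3 and 10 and the probability 0.05 are arbitrary test values):

    double p = 0.05;
    double x = alglib::invfdistribution(3, 10, p);   // upper critical value for df1=3, df2=10
    double q = alglib::fcdistribution(3, 10, x);     // complemented F distribution at x; q reproduces p up to rounding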
This is accomplished using the inverse beta integral function and the relations z = incbi( df2/2, df1/2, p ) x = df2 (1-z) / (df1 z). Note: the following relations hold for the inverse of the uncomplemented F distribution: z = incbi( df1/2, df2/2, p ) x = df2 z / (df1 (1-z)). ACCURACY: Tested at random points (a,b,p). a,b Relative error: arithmetic domain # trials peak rms For p between .001 and 1: IEEE 1,100 100000 8.3e-15 4.7e-16 IEEE 1,10000 100000 2.1e-11 1.4e-13 For p between 10^-6 and 10^-3: IEEE 1,100 50000 1.3e-12 8.4e-15 IEEE 1,10000 50000 3.0e-12 4.8e-14 Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 1995, 2000 by Stephen L. Moshier *************************************************************************/ double invfdistribution(const ae_int_t a, const ae_int_t b, const double y) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::invfdistribution(a, b, y, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Fresnel integral Evaluates the Fresnel integrals x - | | C(x) = | cos(pi/2 t**2) dt, | | - 0 x - | | S(x) = | sin(pi/2 t**2) dt. | | - 0 The integrals are evaluated by a power series for x < 1. For x >= 1 auxiliary functions f(x) and g(x) are employed such that C(x) = 0.5 + f(x) sin( pi/2 x**2 ) - g(x) cos( pi/2 x**2 ) S(x) = 0.5 - f(x) cos( pi/2 x**2 ) - g(x) sin( pi/2 x**2 ) ACCURACY: Relative error. Arithmetic function domain # trials peak rms IEEE S(x) 0, 10 10000 2.0e-15 3.2e-16 IEEE C(x) 0, 10 10000 1.8e-15 3.3e-16 Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 1989, 2000 by Stephen L. Moshier *************************************************************************/ void fresnelintegral(const double x, double &c, double &s) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::fresnelintegral(x, &c, &s, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Jacobian Elliptic Functions Evaluates the Jacobian elliptic functions sn(u|m), cn(u|m), and dn(u|m) of parameter m between 0 and 1, and real argument u. These functions are periodic, with quarter-period on the real axis equal to the complete elliptic integral ellpk(1.0-m). Relation to incomplete elliptic integral: If u = ellik(phi,m), then sn(u|m) = sin(phi), and cn(u|m) = cos(phi). Phi is called the amplitude of u. Computation is by means of the arithmetic-geometric mean algorithm, except when m is within 1e-9 of 0 or 1. In the latter case with m close to 1, the approximation applies only for phi < pi/2. ACCURACY: Tested at random points with u between 0 and 10, m between 0 and 1. Absolute error (* = relative error): arithmetic function # trials peak rms IEEE phi 10000 9.2e-16* 1.4e-16* IEEE sn 50000 4.1e-15 4.6e-16 IEEE cn 40000 3.6e-15 4.4e-16 IEEE dn 10000 1.3e-12 1.8e-14 Peak error observed in consistency check using addition theorem for sn(u+v) was 4e-16 (absolute). Also tested by the above relation to the incomplete elliptic integral. Accuracy deteriorates when u is large. 
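Example (an illustrative sketch added for this manual, not part of the original sources; u=0.7 and m=0.3 are arbitrary test values). The outputs can be sanity-checked against the identities sn^2 + cn^2 = 1 and dn^2 + m sn^2 = 1:

    double sn, cn, dn, ph;
    alglib::jacobianellipticfunctions(0.7, 0.3, sn, cn, dn, ph);
    // sn*sn + cn*cn and dn*dn + 0.3*sn*sn both equal 1 up to rounding; ph receives the amplitude phi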
Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 2000 by Stephen L. Moshier *************************************************************************/ void jacobianellipticfunctions(const double u, const double m, double &sn, double &cn, double &dn, double &ph) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::jacobianellipticfunctions(u, m, &sn, &cn, &dn, &ph, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Psi (digamma) function d - psi(x) = -- ln | (x) dx is the logarithmic derivative of the gamma function. For integer x, n-1 - psi(n) = -EUL + > 1/k. - k=1 This formula is used for 0 < n <= 10. If x is negative, it is transformed to a positive argument by the reflection formula psi(1-x) = psi(x) + pi cot(pi x). For general positive x, the argument is made greater than 10 using the recurrence psi(x+1) = psi(x) + 1/x. Then the following asymptotic expansion is applied: inf. B - 2k psi(x) = log(x) - 1/2x - > ------- - 2k k=1 2k x where the B2k are Bernoulli numbers. ACCURACY: Relative error (except absolute when |psi| < 1): arithmetic domain # trials peak rms IEEE 0,30 30000 1.3e-15 1.4e-16 IEEE -30,0 40000 1.5e-15 2.2e-16 Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 1992, 2000 by Stephen L. Moshier *************************************************************************/ double psi(const double x) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::psi(x, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Exponential integral Ei(x) x - t | | e Ei(x) = -|- --- dt . | | t - -inf Not defined for x <= 0. See also expn.c. ACCURACY: Relative error: arithmetic domain # trials peak rms IEEE 0,100 50000 8.6e-16 1.3e-16 Cephes Math Library Release 2.8: May, 1999 Copyright 1999 by Stephen L. Moshier *************************************************************************/ double exponentialintegralei(const double x) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::exponentialintegralei(x, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Exponential integral En(x) Evaluates the exponential integral inf. - | | -xt | e E (x) = | ---- dt. n | n | | t - 1 Both n and x must be nonnegative. The routine employs either a power series, a continued fraction, or an asymptotic formula depending on the relative values of n and x. ACCURACY: Relative error: arithmetic domain # trials peak rms IEEE 0, 30 10000 1.7e-15 3.6e-16 Cephes Math Library Release 2.8: June, 2000 Copyright 1985, 2000 by Stephen L. 
Moshier *************************************************************************/ double exponentialintegralen(const double x, const ae_int_t n) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::exponentialintegralen(x, n, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Calculation of the value of the Laguerre polynomial. Parameters: n - degree, n>=0 x - argument Result: the value of the Laguerre polynomial Ln at x *************************************************************************/ double laguerrecalculate(const ae_int_t n, const double x) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::laguerrecalculate(n, x, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Summation of Laguerre polynomials using Clenshaw's recurrence formula. This routine calculates c[0]*L0(x) + c[1]*L1(x) + ... + c[N]*LN(x) Parameters: n - degree, n>=0 x - argument Result: the value of the Laguerre polynomial at x *************************************************************************/ double laguerresum(const real_1d_array &c, const ae_int_t n, const double x) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::laguerresum(const_cast(c.c_ptr()), n, x, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Representation of Ln as C[0] + C[1]*X + ... + C[N]*X^N Input parameters: N - polynomial degree, n>=0 Output parameters: C - coefficients *************************************************************************/ void laguerrecoefficients(const ae_int_t n, real_1d_array &c) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::laguerrecoefficients(n, const_cast(c.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Chi-square distribution Returns the area under the left hand tail (from 0 to x) of the Chi square probability density function with v degrees of freedom. x - 1 | | v/2-1 -t/2 P( x | v ) = ----------- | t e dt v/2 - | | 2 | (v/2) - 0 where x is the Chi-square variable. The incomplete gamma integral is used, according to the formula y = chdtr( v, x ) = igam( v/2.0, x/2.0 ). The arguments must both be positive. ACCURACY: See incomplete gamma function Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 2000 by Stephen L. 
Moshier *************************************************************************/ double chisquaredistribution(const double v, const double x) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::chisquaredistribution(v, x, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Complemented Chi-square distribution Returns the area under the right hand tail (from x to infinity) of the Chi square probability density function with v degrees of freedom: inf. - 1 | | v/2-1 -t/2 P( x | v ) = ----------- | t e dt v/2 - | | 2 | (v/2) - x where x is the Chi-square variable. The incomplete gamma integral is used, according to the formula y = chdtr( v, x ) = igamc( v/2.0, x/2.0 ). The arguments must both be positive. ACCURACY: See incomplete gamma function Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 2000 by Stephen L. Moshier *************************************************************************/ double chisquarecdistribution(const double v, const double x) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::chisquarecdistribution(v, x, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Inverse of complemented Chi-square distribution Finds the Chi-square argument x such that the integral from x to infinity of the Chi-square density is equal to the given cumulative probability y. This is accomplished using the inverse gamma integral function and the relation x/2 = igami( df/2, y ); ACCURACY: See inverse incomplete gamma function Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 2000 by Stephen L. Moshier *************************************************************************/ double invchisquaredistribution(const double v, const double y) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::invchisquaredistribution(v, y, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Calculation of the value of the Legendre polynomial Pn. Parameters: n - degree, n>=0 x - argument Result: the value of the Legendre polynomial Pn at x *************************************************************************/ double legendrecalculate(const ae_int_t n, const double x) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::legendrecalculate(n, x, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Summation of Legendre polynomials using Clenshaw's recurrence formula. This routine calculates c[0]*P0(x) + c[1]*P1(x) + ... 
+ c[N]*PN(x) Parameters: n - degree, n>=0 x - argument Result: the value of the Legendre polynomial at x *************************************************************************/ double legendresum(const real_1d_array &c, const ae_int_t n, const double x) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::legendresum(const_cast(c.c_ptr()), n, x, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Representation of Pn as C[0] + C[1]*X + ... + C[N]*X^N Input parameters: N - polynomial degree, n>=0 Output parameters: C - coefficients *************************************************************************/ void legendrecoefficients(const ae_int_t n, real_1d_array &c) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::legendrecoefficients(n, const_cast(c.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Beta function - - | (a) | (b) beta( a, b ) = -----------. - | (a+b) For large arguments the logarithm of the function is evaluated using lgam(), then exponentiated. ACCURACY: Relative error: arithmetic domain # trials peak rms IEEE 0,30 30000 8.1e-14 1.1e-14 Cephes Math Library Release 2.0: April, 1987 Copyright 1984, 1987 by Stephen L. Moshier *************************************************************************/ double beta(const double a, const double b) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::beta(a, b, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Calculation of the value of the Chebyshev polynomials of the first and second kinds. Parameters: r - polynomial kind, either 1 or 2. n - degree, n>=0 x - argument, -1 <= x <= 1 Result: the value of the Chebyshev polynomial at x *************************************************************************/ double chebyshevcalculate(const ae_int_t r, const ae_int_t n, const double x) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::chebyshevcalculate(r, n, x, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Summation of Chebyshev polynomials using Clenshaw's recurrence formula. This routine calculates c[0]*T0(x) + c[1]*T1(x) + ... + c[N]*TN(x) or c[0]*U0(x) + c[1]*U1(x) + ... + c[N]*UN(x) depending on the R. Parameters: r - polynomial kind, either 1 or 2. 
n - degree, n>=0 x - argument Result: the value of the Chebyshev polynomial at x *************************************************************************/ double chebyshevsum(const real_1d_array &c, const ae_int_t r, const ae_int_t n, const double x) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::chebyshevsum(const_cast(c.c_ptr()), r, n, x, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Representation of Tn as C[0] + C[1]*X + ... + C[N]*X^N Input parameters: N - polynomial degree, n>=0 Output parameters: C - coefficients *************************************************************************/ void chebyshevcoefficients(const ae_int_t n, real_1d_array &c) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::chebyshevcoefficients(n, const_cast(c.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Conversion of a series of Chebyshev polynomials to a power series. Represents A[0]*T0(x) + A[1]*T1(x) + ... + A[N]*Tn(x) as B[0] + B[1]*X + ... + B[N]*X^N. Input parameters: A - Chebyshev series coefficients N - degree, N>=0 Output parameters B - power series coefficients *************************************************************************/ void fromchebyshev(const real_1d_array &a, const ae_int_t n, real_1d_array &b) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::fromchebyshev(const_cast(a.c_ptr()), n, const_cast(b.c_ptr()), &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Student's t distribution Computes the integral from minus infinity to t of the Student t distribution with integer k > 0 degrees of freedom: t - | | - | 2 -(k+1)/2 | ( (k+1)/2 ) | ( x ) ---------------------- | ( 1 + --- ) dx - | ( k ) sqrt( k pi ) | ( k/2 ) | | | - -inf. Relation to incomplete beta integral: 1 - stdtr(k,t) = 0.5 * incbet( k/2, 1/2, z ) where z = k/(k + t**2). For t < -2, this is the method of computation. For higher t, a direct method is derived from integration by parts. Since the function is symmetric about t=0, the area under the right tail of the density is found by calling the function with -t instead of t. ACCURACY: Tested at random 1 <= k <= 25. The "domain" refers to t. Relative error: arithmetic domain # trials peak rms IEEE -100,-2 50000 5.9e-15 1.4e-15 IEEE -2,100 500000 2.7e-15 4.9e-17 Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 1995, 2000 by Stephen L. 
Moshier *************************************************************************/ double studenttdistribution(const ae_int_t k, const double t) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::studenttdistribution(k, t, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Functional inverse of Student's t distribution Given probability p, finds the argument t such that stdtr(k,t) is equal to p. ACCURACY: Tested at random 1 <= k <= 100. The "domain" refers to p: Relative error: arithmetic domain # trials peak rms IEEE .001,.999 25000 5.7e-15 8.0e-16 IEEE 10^-6,.001 25000 2.0e-12 2.9e-14 Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 1995, 2000 by Stephen L. Moshier *************************************************************************/ double invstudenttdistribution(const ae_int_t k, const double p) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::invstudenttdistribution(k, p, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Binomial distribution Returns the sum of the terms 0 through k of the Binomial probability density: k -- ( n ) j n-j > ( ) p (1-p) -- ( j ) j=0 The terms are not summed directly; instead the incomplete beta integral is employed, according to the formula y = bdtr( k, n, p ) = incbet( n-k, k+1, 1-p ). The arguments must be positive, with p ranging from 0 to 1. ACCURACY: Tested at random points (a,b,p), with p between 0 and 1. a,b Relative error: arithmetic domain # trials peak rms For p between 0.001 and 1: IEEE 0,100 100000 4.3e-15 2.6e-16 Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 1995, 2000 by Stephen L. Moshier *************************************************************************/ double binomialdistribution(const ae_int_t k, const ae_int_t n, const double p) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::binomialdistribution(k, n, p, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Complemented binomial distribution Returns the sum of the terms k+1 through n of the Binomial probability density: n -- ( n ) j n-j > ( ) p (1-p) -- ( j ) j=k+1 The terms are not summed directly; instead the incomplete beta integral is employed, according to the formula y = bdtrc( k, n, p ) = incbet( k+1, n-k, p ). The arguments must be positive, with p ranging from 0 to 1. ACCURACY: Tested at random points (a,b,p). a,b Relative error: arithmetic domain # trials peak rms For p between 0.001 and 1: IEEE 0,100 100000 6.7e-15 8.2e-16 For p between 0 and .001: IEEE 0,100 100000 1.5e-13 2.7e-15 Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 1995, 2000 by Stephen L. 
Moshier *************************************************************************/ double binomialcdistribution(const ae_int_t k, const ae_int_t n, const double p) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::binomialcdistribution(k, n, p, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Inverse binomial distribution Finds the event probability p such that the sum of the terms 0 through k of the Binomial probability density is equal to the given cumulative probability y. This is accomplished using the inverse beta integral function and the relation 1 - p = incbi( n-k, k+1, y ). ACCURACY: Tested at random points (a,b,p). a,b Relative error: arithmetic domain # trials peak rms For p between 0.001 and 1: IEEE 0,100 100000 2.3e-14 6.4e-16 IEEE 0,10000 100000 6.6e-12 1.2e-13 For p between 10^-6 and 0.001: IEEE 0,100 100000 2.0e-12 1.3e-14 IEEE 0,10000 100000 1.5e-12 3.2e-14 Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 1995, 2000 by Stephen L. Moshier *************************************************************************/ double invbinomialdistribution(const ae_int_t k, const ae_int_t n, const double y) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { double result = alglib_impl::invbinomialdistribution(k, n, y, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return *(reinterpret_cast(&result)); } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } /************************************************************************* Airy function Solution of the differential equation y"(x) = xy. The function returns the two independent solutions Ai, Bi and their first derivatives Ai'(x), Bi'(x). Evaluation is by power series summation for small x, by rational minimax approximations for large x. ACCURACY: Error criterion is absolute when function <= 1, relative when function > 1, except * denotes relative error criterion. For large negative x, the absolute error increases as x^1.5. For large positive x, the relative error increases as x^1.5. Arithmetic domain function # trials peak rms IEEE -10, 0 Ai 10000 1.6e-15 2.7e-16 IEEE 0, 10 Ai 10000 2.3e-14* 1.8e-15* IEEE -10, 0 Ai' 10000 4.6e-15 7.6e-16 IEEE 0, 10 Ai' 10000 1.8e-14* 1.5e-15* IEEE -10, 10 Bi 30000 4.2e-15 5.3e-16 IEEE -10, 10 Bi' 30000 4.9e-15 7.3e-16 Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 1989, 2000 by Stephen L. 
Moshier *************************************************************************/ void airy(const double x, double &ai, double &aip, double &bi, double &bip) { alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_state_init(&_alglib_env_state); try { alglib_impl::airy(x, &ai, &aip, &bi, &bip, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); return; } catch(alglib_impl::ae_error_type) { throw ap_error(_alglib_env_state.error_msg); } } } ///////////////////////////////////////////////////////////////////////// // // THIS SECTION CONTAINS IMPLEMENTATION OF COMPUTATIONAL CORE // ///////////////////////////////////////////////////////////////////////// namespace alglib_impl { static double gammafunc_gammastirf(double x, ae_state *_state); static void trigintegrals_chebiterationshichi(double x, double c, double* b0, double* b1, double* b2, ae_state *_state); static void bessel_besselmfirstcheb(double c, double* b0, double* b1, double* b2, ae_state *_state); static void bessel_besselmnextcheb(double x, double c, double* b0, double* b1, double* b2, ae_state *_state); static void bessel_besselm1firstcheb(double c, double* b0, double* b1, double* b2, ae_state *_state); static void bessel_besselm1nextcheb(double x, double c, double* b0, double* b1, double* b2, ae_state *_state); static void bessel_besselasympt0(double x, double* pzero, double* qzero, ae_state *_state); static void bessel_besselasympt1(double x, double* pzero, double* qzero, ae_state *_state); static double ibetaf_incompletebetafe(double a, double b, double x, double big, double biginv, ae_state *_state); static double ibetaf_incompletebetafe2(double a, double b, double x, double big, double biginv, ae_state *_state); static double ibetaf_incompletebetaps(double a, double b, double x, double maxgam, ae_state *_state); /************************************************************************* Gamma function Input parameters: X - argument Domain: 0 < X < 171.6 -170 < X < 0, X is not an integer. Relative error: arithmetic domain # trials peak rms IEEE -170,-33 20000 2.3e-15 3.3e-16 IEEE -33, 33 20000 9.4e-16 2.2e-16 IEEE 33, 171.6 20000 2.3e-15 3.2e-16 Cephes Math Library Release 2.8: June, 2000 Original copyright 1984, 1987, 1989, 1992, 2000 by Stephen L. Moshier Translated to AlgoPascal by Bochkanov Sergey (2005, 2006, 2007). 
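EXAMPLE: a minimal usage sketch, assuming the gammafunction wrapper exported by the alglib namespace and the specialfunctions.h header of the C++ distribution:

    #include <cstdio>
    #include "specialfunctions.h"   // assumed ALGLIB header exposing alglib::gammafunction

    int main()
    {
        // Gamma(n+1) = n!, so Gamma(4) should be very close to 6
        double g = alglib::gammafunction(4.0);
        printf("Gamma(4) = %.15f\n", g);
        return 0;
    }
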
*************************************************************************/ double gammafunction(double x, ae_state *_state) { #ifndef ALGLIB_INTERCEPTS_SPECFUNCS double p; double pp; double q; double qq; double z; ae_int_t i; double sgngam; double result; sgngam = (double)(1); q = ae_fabs(x, _state); if( ae_fp_greater(q,33.0) ) { if( ae_fp_less(x,0.0) ) { p = (double)(ae_ifloor(q, _state)); i = ae_round(p, _state); if( i%2==0 ) { sgngam = (double)(-1); } z = q-p; if( ae_fp_greater(z,0.5) ) { p = p+1; z = q-p; } z = q*ae_sin(ae_pi*z, _state); z = ae_fabs(z, _state); z = ae_pi/(z*gammafunc_gammastirf(q, _state)); } else { z = gammafunc_gammastirf(x, _state); } result = sgngam*z; return result; } z = (double)(1); while(ae_fp_greater_eq(x,(double)(3))) { x = x-1; z = z*x; } while(ae_fp_less(x,(double)(0))) { if( ae_fp_greater(x,-0.000000001) ) { result = z/((1+0.5772156649015329*x)*x); return result; } z = z/x; x = x+1; } while(ae_fp_less(x,(double)(2))) { if( ae_fp_less(x,0.000000001) ) { result = z/((1+0.5772156649015329*x)*x); return result; } z = z/x; x = x+1.0; } if( ae_fp_eq(x,(double)(2)) ) { result = z; return result; } x = x-2.0; pp = 1.60119522476751861407E-4; pp = 1.19135147006586384913E-3+x*pp; pp = 1.04213797561761569935E-2+x*pp; pp = 4.76367800457137231464E-2+x*pp; pp = 2.07448227648435975150E-1+x*pp; pp = 4.94214826801497100753E-1+x*pp; pp = 9.99999999999999996796E-1+x*pp; qq = -2.31581873324120129819E-5; qq = 5.39605580493303397842E-4+x*qq; qq = -4.45641913851797240494E-3+x*qq; qq = 1.18139785222060435552E-2+x*qq; qq = 3.58236398605498653373E-2+x*qq; qq = -2.34591795718243348568E-1+x*qq; qq = 7.14304917030273074085E-2+x*qq; qq = 1.00000000000000000320+x*qq; result = z*pp/qq; return result; #else return _ialglib_i_gammafunction(x); #endif } /************************************************************************* Natural logarithm of gamma function Input parameters: X - argument Result: logarithm of the absolute value of the Gamma(X). Output parameters: SgnGam - sign(Gamma(X)) Domain: 0 < X < 2.55e305 -2.55e305 < X < 0, X is not an integer. ACCURACY: arithmetic domain # trials peak rms IEEE 0, 3 28000 5.4e-16 1.1e-16 IEEE 2.718, 2.556e305 40000 3.5e-16 8.3e-17 The error criterion was relative when the function magnitude was greater than one but absolute when it was less than one. The following test used the relative error criterion, though at certain points the relative error could be much higher than indicated. IEEE -200, -4 10000 4.8e-16 1.3e-16 Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 1989, 1992, 2000 by Stephen L. Moshier Translated to AlgoPascal by Bochkanov Sergey (2005, 2006, 2007). 
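EXAMPLE: a minimal usage sketch, assuming the two-argument lngamma wrapper exported by the alglib namespace, which returns ln|Gamma(x)| and writes sign(Gamma(x)) into its reference argument:

    #include <cstdio>
    #include "specialfunctions.h"   // assumed ALGLIB header exposing alglib::lngamma

    int main()
    {
        double sgngam;
        // Gamma(10) = 9! = 362880, so ln|Gamma(10)| is approximately 12.8018
        double v = alglib::lngamma(10.0, sgngam);
        printf("lngamma(10) = %.6f, sign = %+.0f\n", v, sgngam);
        return 0;
    }
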
*************************************************************************/ double lngamma(double x, double* sgngam, ae_state *_state) { #ifndef ALGLIB_INTERCEPTS_SPECFUNCS double a; double b; double c; double p; double q; double u; double w; double z; ae_int_t i; double logpi; double ls2pi; double tmp; double result; *sgngam = 0; *sgngam = (double)(1); logpi = 1.14472988584940017414; ls2pi = 0.91893853320467274178; if( ae_fp_less(x,-34.0) ) { q = -x; w = lngamma(q, &tmp, _state); p = (double)(ae_ifloor(q, _state)); i = ae_round(p, _state); if( i%2==0 ) { *sgngam = (double)(-1); } else { *sgngam = (double)(1); } z = q-p; if( ae_fp_greater(z,0.5) ) { p = p+1; z = p-q; } z = q*ae_sin(ae_pi*z, _state); result = logpi-ae_log(z, _state)-w; return result; } if( ae_fp_less(x,(double)(13)) ) { z = (double)(1); p = (double)(0); u = x; while(ae_fp_greater_eq(u,(double)(3))) { p = p-1; u = x+p; z = z*u; } while(ae_fp_less(u,(double)(2))) { z = z/u; p = p+1; u = x+p; } if( ae_fp_less(z,(double)(0)) ) { *sgngam = (double)(-1); z = -z; } else { *sgngam = (double)(1); } if( ae_fp_eq(u,(double)(2)) ) { result = ae_log(z, _state); return result; } p = p-2; x = x+p; b = -1378.25152569120859100; b = -38801.6315134637840924+x*b; b = -331612.992738871184744+x*b; b = -1162370.97492762307383+x*b; b = -1721737.00820839662146+x*b; b = -853555.664245765465627+x*b; c = (double)(1); c = -351.815701436523470549+x*c; c = -17064.2106651881159223+x*c; c = -220528.590553854454839+x*c; c = -1139334.44367982507207+x*c; c = -2532523.07177582951285+x*c; c = -2018891.41433532773231+x*c; p = x*b/c; result = ae_log(z, _state)+p; return result; } q = (x-0.5)*ae_log(x, _state)-x+ls2pi; if( ae_fp_greater(x,(double)(100000000)) ) { result = q; return result; } p = 1/(x*x); if( ae_fp_greater_eq(x,1000.0) ) { q = q+((7.9365079365079365079365*0.0001*p-2.7777777777777777777778*0.001)*p+0.0833333333333333333333)/x; } else { a = 8.11614167470508450300*0.0001; a = -5.95061904284301438324*0.0001+p*a; a = 7.93650340457716943945*0.0001+p*a; a = -2.77777777730099687205*0.001+p*a; a = 8.33333333333331927722*0.01+p*a; q = q+a/x; } result = q; return result; #else return _ialglib_i_lngamma(x, sgngam); #endif } static double gammafunc_gammastirf(double x, ae_state *_state) { double y; double w; double v; double stir; double result; w = 1/x; stir = 7.87311395793093628397E-4; stir = -2.29549961613378126380E-4+w*stir; stir = -2.68132617805781232825E-3+w*stir; stir = 3.47222221605458667310E-3+w*stir; stir = 8.33333333333482257126E-2+w*stir; w = 1+w*stir; y = ae_exp(x, _state); if( ae_fp_greater(x,143.01608) ) { v = ae_pow(x, 0.5*x-0.25, _state); y = v*(v/y); } else { y = ae_pow(x, x-0.5, _state)/y; } result = 2.50662827463100050242*y*w; return result; } /************************************************************************* Error function The integral is x - 2 | | 2 erf(x) = -------- | exp( - t ) dt. sqrt(pi) | | - 0 For 0 <= |x| < 1, erf(x) = x * P4(x**2)/Q5(x**2); otherwise erf(x) = 1 - erfc(x). ACCURACY: Relative error: arithmetic domain # trials peak rms IEEE 0,1 30000 3.7e-16 1.0e-16 Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 1988, 1992, 2000 by Stephen L. 
Moshier *************************************************************************/ double errorfunction(double x, ae_state *_state) { double xsq; double s; double p; double q; double result; s = (double)(ae_sign(x, _state)); x = ae_fabs(x, _state); if( ae_fp_less(x,0.5) ) { xsq = x*x; p = 0.007547728033418631287834; p = -0.288805137207594084924010+xsq*p; p = 14.3383842191748205576712+xsq*p; p = 38.0140318123903008244444+xsq*p; p = 3017.82788536507577809226+xsq*p; p = 7404.07142710151470082064+xsq*p; p = 80437.3630960840172832162+xsq*p; q = 0.0; q = 1.00000000000000000000000+xsq*q; q = 38.0190713951939403753468+xsq*q; q = 658.070155459240506326937+xsq*q; q = 6379.60017324428279487120+xsq*q; q = 34216.5257924628539769006+xsq*q; q = 80437.3630960840172826266+xsq*q; result = s*1.1283791670955125738961589031*x*p/q; return result; } if( ae_fp_greater_eq(x,(double)(10)) ) { result = s; return result; } result = s*(1-errorfunctionc(x, _state)); return result; } /************************************************************************* Complementary error function 1 - erf(x) = inf. - 2 | | 2 erfc(x) = -------- | exp( - t ) dt sqrt(pi) | | - x For small x, erfc(x) = 1 - erf(x); otherwise rational approximations are computed. ACCURACY: Relative error: arithmetic domain # trials peak rms IEEE 0,26.6417 30000 5.7e-14 1.5e-14 Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 1988, 1992, 2000 by Stephen L. Moshier *************************************************************************/ double errorfunctionc(double x, ae_state *_state) { double p; double q; double result; if( ae_fp_less(x,(double)(0)) ) { result = 2-errorfunctionc(-x, _state); return result; } if( ae_fp_less(x,0.5) ) { result = 1.0-errorfunction(x, _state); return result; } if( ae_fp_greater_eq(x,(double)(10)) ) { result = (double)(0); return result; } p = 0.0; p = 0.5641877825507397413087057563+x*p; p = 9.675807882987265400604202961+x*p; p = 77.08161730368428609781633646+x*p; p = 368.5196154710010637133875746+x*p; p = 1143.262070703886173606073338+x*p; p = 2320.439590251635247384768711+x*p; p = 2898.0293292167655611275846+x*p; p = 1826.3348842295112592168999+x*p; q = 1.0; q = 17.14980943627607849376131193+x*q; q = 137.1255960500622202878443578+x*q; q = 661.7361207107653469211984771+x*q; q = 2094.384367789539593790281779+x*q; q = 4429.612803883682726711528526+x*q; q = 6089.5424232724435504633068+x*q; q = 4958.82756472114071495438422+x*q; q = 1826.3348842295112595576438+x*q; result = ae_exp(-ae_sqr(x, _state), _state)*p/q; return result; } /************************************************************************* Normal distribution function Returns the area under the Gaussian probability density function, integrated from minus infinity to x: x - 1 | | 2 ndtr(x) = --------- | exp( - t /2 ) dt sqrt(2pi) | | - -inf. = ( 1 + erf(z) ) / 2 = erfc(z) / 2 where z = x/sqrt(2). Computation is via the functions erf and erfc. ACCURACY: Relative error: arithmetic domain # trials peak rms IEEE -13,0 30000 3.4e-14 6.7e-15 Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 1988, 1992, 2000 by Stephen L. 
Moshier *************************************************************************/ double normaldistribution(double x, ae_state *_state) { double result; result = 0.5*(errorfunction(x/1.41421356237309504880, _state)+1); return result; } /************************************************************************* Inverse of the error function Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 1988, 1992, 2000 by Stephen L. Moshier *************************************************************************/ double inverf(double e, ae_state *_state) { double result; result = invnormaldistribution(0.5*(e+1), _state)/ae_sqrt((double)(2), _state); return result; } /************************************************************************* Inverse of Normal distribution function Returns the argument, x, for which the area under the Gaussian probability density function (integrated from minus infinity to x) is equal to y. For small arguments 0 < y < exp(-2), the program computes z = sqrt( -2.0 * log(y) ); then the approximation is x = z - log(z)/z - (1/z) P(1/z) / Q(1/z). There are two rational functions P/Q, one for 0 < y < exp(-32) and the other for y up to exp(-2). For larger arguments, w = y - 0.5, and x/sqrt(2pi) = w + w**3 R(w**2)/S(w**2)). ACCURACY: Relative error: arithmetic domain # trials peak rms IEEE 0.125, 1 20000 7.2e-16 1.3e-16 IEEE 3e-308, 0.135 50000 4.6e-16 9.8e-17 Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 1988, 1992, 2000 by Stephen L. Moshier *************************************************************************/ double invnormaldistribution(double y0, ae_state *_state) { double expm2; double s2pi; double x; double y; double z; double y2; double x0; double x1; ae_int_t code; double p0; double q0; double p1; double q1; double p2; double q2; double result; expm2 = 0.13533528323661269189; s2pi = 2.50662827463100050242; if( ae_fp_less_eq(y0,(double)(0)) ) { result = -ae_maxrealnumber; return result; } if( ae_fp_greater_eq(y0,(double)(1)) ) { result = ae_maxrealnumber; return result; } code = 1; y = y0; if( ae_fp_greater(y,1.0-expm2) ) { y = 1.0-y; code = 0; } if( ae_fp_greater(y,expm2) ) { y = y-0.5; y2 = y*y; p0 = -59.9633501014107895267; p0 = 98.0010754185999661536+y2*p0; p0 = -56.6762857469070293439+y2*p0; p0 = 13.9312609387279679503+y2*p0; p0 = -1.23916583867381258016+y2*p0; q0 = (double)(1); q0 = 1.95448858338141759834+y2*q0; q0 = 4.67627912898881538453+y2*q0; q0 = 86.3602421390890590575+y2*q0; q0 = -225.462687854119370527+y2*q0; q0 = 200.260212380060660359+y2*q0; q0 = -82.0372256168333339912+y2*q0; q0 = 15.9056225126211695515+y2*q0; q0 = -1.18331621121330003142+y2*q0; x = y+y*y2*p0/q0; x = x*s2pi; result = x; return result; } x = ae_sqrt(-2.0*ae_log(y, _state), _state); x0 = x-ae_log(x, _state)/x; z = 1.0/x; if( ae_fp_less(x,8.0) ) { p1 = 4.05544892305962419923; p1 = 31.5251094599893866154+z*p1; p1 = 57.1628192246421288162+z*p1; p1 = 44.0805073893200834700+z*p1; p1 = 14.6849561928858024014+z*p1; p1 = 2.18663306850790267539+z*p1; p1 = -1.40256079171354495875*0.1+z*p1; p1 = -3.50424626827848203418*0.01+z*p1; p1 = -8.57456785154685413611*0.0001+z*p1; q1 = (double)(1); q1 = 15.7799883256466749731+z*q1; q1 = 45.3907635128879210584+z*q1; q1 = 41.3172038254672030440+z*q1; q1 = 15.0425385692907503408+z*q1; q1 = 2.50464946208309415979+z*q1; q1 = -1.42182922854787788574*0.1+z*q1; q1 = -3.80806407691578277194*0.01+z*q1; q1 = -9.33259480895457427372*0.0001+z*q1; x1 = z*p1/q1; } else { p2 = 3.23774891776946035970; p2 = 6.91522889068984211695+z*p2; p2 = 
3.93881025292474443415+z*p2; p2 = 1.33303460815807542389+z*p2; p2 = 2.01485389549179081538*0.1+z*p2; p2 = 1.23716634817820021358*0.01+z*p2; p2 = 3.01581553508235416007*0.0001+z*p2; p2 = 2.65806974686737550832*0.000001+z*p2; p2 = 6.23974539184983293730*0.000000001+z*p2; q2 = (double)(1); q2 = 6.02427039364742014255+z*q2; q2 = 3.67983563856160859403+z*q2; q2 = 1.37702099489081330271+z*q2; q2 = 2.16236993594496635890*0.1+z*q2; q2 = 1.34204006088543189037*0.01+z*q2; q2 = 3.28014464682127739104*0.0001+z*q2; q2 = 2.89247864745380683936*0.000001+z*q2; q2 = 6.79019408009981274425*0.000000001+z*q2; x1 = z*p2/q2; } x = x0-x1; if( code!=0 ) { x = -x; } result = x; return result; } /************************************************************************* Incomplete gamma integral The function is defined by x - 1 | | -t a-1 igam(a,x) = ----- | e t dt. - | | | (a) - 0 In this implementation both arguments must be positive. The integral is evaluated by either a power series or continued fraction expansion, depending on the relative values of a and x. ACCURACY: Relative error: arithmetic domain # trials peak rms IEEE 0,30 200000 3.6e-14 2.9e-15 IEEE 0,100 300000 9.9e-14 1.5e-14 Cephes Math Library Release 2.8: June, 2000 Copyright 1985, 1987, 2000 by Stephen L. Moshier *************************************************************************/ double incompletegamma(double a, double x, ae_state *_state) { double igammaepsilon; double ans; double ax; double c; double r; double tmp; double result; igammaepsilon = 0.000000000000001; if( ae_fp_less_eq(x,(double)(0))||ae_fp_less_eq(a,(double)(0)) ) { result = (double)(0); return result; } if( ae_fp_greater(x,(double)(1))&&ae_fp_greater(x,a) ) { result = 1-incompletegammac(a, x, _state); return result; } ax = a*ae_log(x, _state)-x-lngamma(a, &tmp, _state); if( ae_fp_less(ax,-709.78271289338399) ) { result = (double)(0); return result; } ax = ae_exp(ax, _state); r = a; c = (double)(1); ans = (double)(1); do { r = r+1; c = c*x/r; ans = ans+c; } while(ae_fp_greater(c/ans,igammaepsilon)); result = ans*ax/a; return result; } /************************************************************************* Complemented incomplete gamma integral The function is defined by igamc(a,x) = 1 - igam(a,x) inf. - 1 | | -t a-1 = ----- | e t dt. - | | | (a) - x In this implementation both arguments must be positive. The integral is evaluated by either a power series or continued fraction expansion, depending on the relative values of a and x. ACCURACY: Tested at random a, x. a x Relative error: arithmetic domain domain # trials peak rms IEEE 0.5,100 0,100 200000 1.9e-14 1.7e-15 IEEE 0.01,0.5 0,100 200000 1.4e-13 1.6e-15 Cephes Math Library Release 2.8: June, 2000 Copyright 1985, 1987, 2000 by Stephen L. 
Moshier *************************************************************************/ double incompletegammac(double a, double x, ae_state *_state) { double igammaepsilon; double igammabignumber; double igammabignumberinv; double ans; double ax; double c; double yc; double r; double t; double y; double z; double pk; double pkm1; double pkm2; double qk; double qkm1; double qkm2; double tmp; double result; igammaepsilon = 0.000000000000001; igammabignumber = 4503599627370496.0; igammabignumberinv = 2.22044604925031308085*0.0000000000000001; if( ae_fp_less_eq(x,(double)(0))||ae_fp_less_eq(a,(double)(0)) ) { result = (double)(1); return result; } if( ae_fp_less(x,(double)(1))||ae_fp_less(x,a) ) { result = 1-incompletegamma(a, x, _state); return result; } ax = a*ae_log(x, _state)-x-lngamma(a, &tmp, _state); if( ae_fp_less(ax,-709.78271289338399) ) { result = (double)(0); return result; } ax = ae_exp(ax, _state); y = 1-a; z = x+y+1; c = (double)(0); pkm2 = (double)(1); qkm2 = x; pkm1 = x+1; qkm1 = z*x; ans = pkm1/qkm1; do { c = c+1; y = y+1; z = z+2; yc = y*c; pk = pkm1*z-pkm2*yc; qk = qkm1*z-qkm2*yc; if( ae_fp_neq(qk,(double)(0)) ) { r = pk/qk; t = ae_fabs((ans-r)/r, _state); ans = r; } else { t = (double)(1); } pkm2 = pkm1; pkm1 = pk; qkm2 = qkm1; qkm1 = qk; if( ae_fp_greater(ae_fabs(pk, _state),igammabignumber) ) { pkm2 = pkm2*igammabignumberinv; pkm1 = pkm1*igammabignumberinv; qkm2 = qkm2*igammabignumberinv; qkm1 = qkm1*igammabignumberinv; } } while(ae_fp_greater(t,igammaepsilon)); result = ans*ax; return result; }
/************************************************************************* Inverse of complemented incomplete gamma integral Given p, the function finds x such that igamc( a, x ) = p. Starting with the approximate value x = a*t^3, where t = 1 - d - ndtri(p)*sqrt(d) and d = 1/(9a), the routine performs up to 10 Newton iterations to find the root of igamc(a,x) - p = 0. ACCURACY: Tested at random a, p in the intervals indicated. a p Relative error: arithmetic domain domain # trials peak rms IEEE 0.5,100 0,0.5 100000 1.0e-14 1.7e-15 IEEE 0.01,0.5 0,0.5 100000 9.0e-14 3.4e-15 IEEE 0.5,10000 0,0.5 20000 2.3e-13 3.8e-14 Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 1995, 2000 by Stephen L.
Moshier *************************************************************************/ double invincompletegammac(double a, double y0, ae_state *_state) { double igammaepsilon; double iinvgammabignumber; double x0; double x1; double x; double yl; double yh; double y; double d; double lgm; double dithresh; ae_int_t i; ae_int_t dir; double tmp; double result; igammaepsilon = 0.000000000000001; iinvgammabignumber = 4503599627370496.0; x0 = iinvgammabignumber; yl = (double)(0); x1 = (double)(0); yh = (double)(1); dithresh = 5*igammaepsilon; d = 1/(9*a); y = 1-d-invnormaldistribution(y0, _state)*ae_sqrt(d, _state); x = a*y*y*y; lgm = lngamma(a, &tmp, _state); i = 0; while(i<10) { if( ae_fp_greater(x,x0)||ae_fp_less(x,x1) ) { d = 0.0625; break; } y = incompletegammac(a, x, _state); if( ae_fp_less(y,yl)||ae_fp_greater(y,yh) ) { d = 0.0625; break; } if( ae_fp_less(y,y0) ) { x0 = x; yl = y; } else { x1 = x; yh = y; } d = (a-1)*ae_log(x, _state)-x-lgm; if( ae_fp_less(d,-709.78271289338399) ) { d = 0.0625; break; } d = -ae_exp(d, _state); d = (y-y0)/d; if( ae_fp_less(ae_fabs(d/x, _state),igammaepsilon) ) { result = x; return result; } x = x-d; i = i+1; } if( ae_fp_eq(x0,iinvgammabignumber) ) { if( ae_fp_less_eq(x,(double)(0)) ) { x = (double)(1); } while(ae_fp_eq(x0,iinvgammabignumber)) { x = (1+d)*x; y = incompletegammac(a, x, _state); if( ae_fp_less(y,y0) ) { x0 = x; yl = y; break; } d = d+d; } } d = 0.5; dir = 0; i = 0; while(i<400) { x = x1+d*(x0-x1); y = incompletegammac(a, x, _state); lgm = (x0-x1)/(x1+x0); if( ae_fp_less(ae_fabs(lgm, _state),dithresh) ) { break; } lgm = (y-y0)/y0; if( ae_fp_less(ae_fabs(lgm, _state),dithresh) ) { break; } if( ae_fp_less_eq(x,0.0) ) { break; } if( ae_fp_greater_eq(y,y0) ) { x1 = x; yh = y; if( dir<0 ) { dir = 0; d = 0.5; } else { if( dir>1 ) { d = 0.5*d+0.5; } else { d = (y0-yl)/(yh-yl); } } dir = dir+1; } else { x0 = x; yl = y; if( dir>0 ) { dir = 0; d = 0.5; } else { if( dir<-1 ) { d = 0.5*d; } else { d = (y0-yl)/(yh-yl); } } dir = dir-1; } i = i+1; } result = x; return result; } /************************************************************************* Complete elliptic integral of the first kind Approximates the integral pi/2 - | | | dt K(m) = | ------------------ | 2 | | sqrt( 1 - m sin t ) - 0 using the approximation P(x) - log x Q(x). ACCURACY: Relative error: arithmetic domain # trials peak rms IEEE 0,1 30000 2.5e-16 6.8e-17 Cephes Math Library, Release 2.8: June, 2000 Copyright 1984, 1987, 2000 by Stephen L. Moshier *************************************************************************/ double ellipticintegralk(double m, ae_state *_state) { double result; result = ellipticintegralkhighprecision(1.0-m, _state); return result; } /************************************************************************* Complete elliptic integral of the first kind Approximates the integral pi/2 - | | | dt K(m) = | ------------------ | 2 | | sqrt( 1 - m sin t ) - 0 where m = 1 - m1, using the approximation P(x) - log x Q(x). The argument m1 is used rather than m so that the logarithmic singularity at m = 1 will be shifted to the origin; this preserves maximum accuracy. K(0) = pi/2. ACCURACY: Relative error: arithmetic domain # trials peak rms IEEE 0,1 30000 2.5e-16 6.8e-17 Cephes Math Library, Release 2.8: June, 2000 Copyright 1984, 1987, 2000 by Stephen L. 
Moshier *************************************************************************/ double ellipticintegralkhighprecision(double m1, ae_state *_state) { double p; double q; double result; if( ae_fp_less_eq(m1,ae_machineepsilon) ) { result = 1.3862943611198906188E0-0.5*ae_log(m1, _state); } else { p = 1.37982864606273237150E-4; p = p*m1+2.28025724005875567385E-3; p = p*m1+7.97404013220415179367E-3; p = p*m1+9.85821379021226008714E-3; p = p*m1+6.87489687449949877925E-3; p = p*m1+6.18901033637687613229E-3; p = p*m1+8.79078273952743772254E-3; p = p*m1+1.49380448916805252718E-2; p = p*m1+3.08851465246711995998E-2; p = p*m1+9.65735902811690126535E-2; p = p*m1+1.38629436111989062502E0; q = 2.94078955048598507511E-5; q = q*m1+9.14184723865917226571E-4; q = q*m1+5.94058303753167793257E-3; q = q*m1+1.54850516649762399335E-2; q = q*m1+2.39089602715924892727E-2; q = q*m1+3.01204715227604046988E-2; q = q*m1+3.73774314173823228969E-2; q = q*m1+4.88280347570998239232E-2; q = q*m1+7.03124996963957469739E-2; q = q*m1+1.24999999999870820058E-1; q = q*m1+4.99999999999999999821E-1; result = p-q*ae_log(m1, _state); } return result; } /************************************************************************* Incomplete elliptic integral of the first kind F(phi|m) Approximates the integral phi - | | | dt F(phi_\m) = | ------------------ | 2 | | sqrt( 1 - m sin t ) - 0 of amplitude phi and modulus m, using the arithmetic - geometric mean algorithm. ACCURACY: Tested at random points with m in [0, 1] and phi as indicated. Relative error: arithmetic domain # trials peak rms IEEE -10,10 200000 7.4e-16 1.0e-16 Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 2000 by Stephen L. Moshier *************************************************************************/ double incompleteellipticintegralk(double phi, double m, ae_state *_state) { double a; double b; double c; double e; double temp; double pio2; double t; double k; ae_int_t d; ae_int_t md; ae_int_t s; ae_int_t npio2; double result; pio2 = 1.57079632679489661923; if( ae_fp_eq(m,(double)(0)) ) { result = phi; return result; } a = 1-m; if( ae_fp_eq(a,(double)(0)) ) { result = ae_log(ae_tan(0.5*(pio2+phi), _state), _state); return result; } npio2 = ae_ifloor(phi/pio2, _state); if( npio2%2!=0 ) { npio2 = npio2+1; } if( npio2!=0 ) { k = ellipticintegralk(1-a, _state); phi = phi-npio2*pio2; } else { k = (double)(0); } if( ae_fp_less(phi,(double)(0)) ) { phi = -phi; s = -1; } else { s = 0; } b = ae_sqrt(a, _state); t = ae_tan(phi, _state); if( ae_fp_greater(ae_fabs(t, _state),(double)(10)) ) { e = 1.0/(b*t); if( ae_fp_less(ae_fabs(e, _state),(double)(10)) ) { e = ae_atan(e, _state); if( npio2==0 ) { k = ellipticintegralk(1-a, _state); } temp = k-incompleteellipticintegralk(e, m, _state); if( s<0 ) { temp = -temp; } result = temp+npio2*k; return result; } } a = 1.0; c = ae_sqrt(m, _state); d = 1; md = 0; while(ae_fp_greater(ae_fabs(c/a, _state),ae_machineepsilon)) { temp = b/a; phi = phi+ae_atan(t*temp, _state)+md*ae_pi; md = ae_trunc((phi+pio2)/ae_pi, _state); t = t*(1.0+temp)/(1.0-temp*t*t); c = 0.5*(a-b); temp = ae_sqrt(a*b, _state); a = 0.5*(a+b); b = temp; d = d+d; } temp = (ae_atan(t, _state)+md*ae_pi)/(d*a); if( s<0 ) { temp = -temp; } result = temp+npio2*k; return result; } /************************************************************************* Complete elliptic integral of the second kind Approximates the integral pi/2 - | | 2 E(m) = | sqrt( 1 - m sin t ) dt | | - 0 using the approximation P(x) - x log x Q(x). 
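EXAMPLE: a short usage sketch, assuming the one-argument ellipticintegrale wrapper exported by the alglib namespace; it checks the limiting values E(0) = pi/2 and E(1) = 1:

    #include <cstdio>
    #include "specialfunctions.h"   // assumed ALGLIB header exposing alglib::ellipticintegrale

    int main()
    {
        // E(0) should be close to pi/2 = 1.57079632..., E(1) should be close to 1
        printf("E(0) = %.9f\n", alglib::ellipticintegrale(0.0));
        printf("E(1) = %.9f\n", alglib::ellipticintegrale(1.0));
        return 0;
    }
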
ACCURACY: Relative error: arithmetic domain # trials peak rms IEEE 0, 1 10000 2.1e-16 7.3e-17 Cephes Math Library, Release 2.8: June, 2000 Copyright 1984, 1987, 1989, 2000 by Stephen L. Moshier *************************************************************************/ double ellipticintegrale(double m, ae_state *_state) { double p; double q; double result; ae_assert(ae_fp_greater_eq(m,(double)(0))&&ae_fp_less_eq(m,(double)(1)), "Domain error in EllipticIntegralE: m<0 or m>1", _state); m = 1-m; if( ae_fp_eq(m,(double)(0)) ) { result = (double)(1); return result; } p = 1.53552577301013293365E-4; p = p*m+2.50888492163602060990E-3; p = p*m+8.68786816565889628429E-3; p = p*m+1.07350949056076193403E-2; p = p*m+7.77395492516787092951E-3; p = p*m+7.58395289413514708519E-3; p = p*m+1.15688436810574127319E-2; p = p*m+2.18317996015557253103E-2; p = p*m+5.68051945617860553470E-2; p = p*m+4.43147180560990850618E-1; p = p*m+1.00000000000000000299E0; q = 3.27954898576485872656E-5; q = q*m+1.00962792679356715133E-3; q = q*m+6.50609489976927491433E-3; q = q*m+1.68862163993311317300E-2; q = q*m+2.61769742454493659583E-2; q = q*m+3.34833904888224918614E-2; q = q*m+4.27180926518931511717E-2; q = q*m+5.85936634471101055642E-2; q = q*m+9.37499997197644278445E-2; q = q*m+2.49999999999888314361E-1; result = p-q*m*ae_log(m, _state); return result; } /************************************************************************* Incomplete elliptic integral of the second kind Approximates the integral phi - | | | 2 E(phi_\m) = | sqrt( 1 - m sin t ) dt | | | - 0 of amplitude phi and modulus m, using the arithmetic - geometric mean algorithm. ACCURACY: Tested at random arguments with phi in [-10, 10] and m in [0, 1]. Relative error: arithmetic domain # trials peak rms IEEE -10,10 150000 3.3e-15 1.4e-16 Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 1993, 2000 by Stephen L. Moshier *************************************************************************/ double incompleteellipticintegrale(double phi, double m, ae_state *_state) { double pio2; double a; double b; double c; double e; double temp; double lphi; double t; double ebig; ae_int_t d; ae_int_t md; ae_int_t npio2; ae_int_t s; double result; pio2 = 1.57079632679489661923; if( ae_fp_eq(m,(double)(0)) ) { result = phi; return result; } lphi = phi; npio2 = ae_ifloor(lphi/pio2, _state); if( npio2%2!=0 ) { npio2 = npio2+1; } lphi = lphi-npio2*pio2; if( ae_fp_less(lphi,(double)(0)) ) { lphi = -lphi; s = -1; } else { s = 1; } a = 1.0-m; ebig = ellipticintegrale(m, _state); if( ae_fp_eq(a,(double)(0)) ) { temp = ae_sin(lphi, _state); if( s<0 ) { temp = -temp; } result = temp+npio2*ebig; return result; } t = ae_tan(lphi, _state); b = ae_sqrt(a, _state); /* * Thanks to Brian Fitzgerald * for pointing out an instability near odd multiples of pi/2 */ if( ae_fp_greater(ae_fabs(t, _state),(double)(10)) ) { /* * Transform the amplitude */ e = 1.0/(b*t); /* * ... but avoid multiple recursions. 
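* The |e| < 10 guard below keeps the transformed amplitude atan(e) small enough that its tangent is again below 10 in magnitude, so the recursive call cannot re-enter this branch; recursion depth is therefore limited to one.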
*/ if( ae_fp_less(ae_fabs(e, _state),(double)(10)) ) { e = ae_atan(e, _state); temp = ebig+m*ae_sin(lphi, _state)*ae_sin(e, _state)-incompleteellipticintegrale(e, m, _state); if( s<0 ) { temp = -temp; } result = temp+npio2*ebig; return result; } } c = ae_sqrt(m, _state); a = 1.0; d = 1; e = 0.0; md = 0; while(ae_fp_greater(ae_fabs(c/a, _state),ae_machineepsilon)) { temp = b/a; lphi = lphi+ae_atan(t*temp, _state)+md*ae_pi; md = ae_trunc((lphi+pio2)/ae_pi, _state); t = t*(1.0+temp)/(1.0-temp*t*t); c = 0.5*(a-b); temp = ae_sqrt(a*b, _state); a = 0.5*(a+b); b = temp; d = d+d; e = e+c*ae_sin(lphi, _state); } temp = ebig/ellipticintegralk(m, _state); temp = temp*((ae_atan(t, _state)+md*ae_pi)/(d*a)); temp = temp+e; if( s<0 ) { temp = -temp; } result = temp+npio2*ebig; return result; } /************************************************************************* Calculation of the value of the Hermite polynomial. Parameters: n - degree, n>=0 x - argument Result: the value of the Hermite polynomial Hn at x *************************************************************************/ double hermitecalculate(ae_int_t n, double x, ae_state *_state) { ae_int_t i; double a; double b; double result; result = (double)(0); /* * Prepare A and B */ a = (double)(1); b = 2*x; /* * Special cases: N=0 or N=1 */ if( n==0 ) { result = a; return result; } if( n==1 ) { result = b; return result; } /* * General case: N>=2 */ for(i=2; i<=n; i++) { result = 2*x*b-2*(i-1)*a; a = b; b = result; } return result; } /************************************************************************* Summation of Hermite polynomials using Clenshaw's recurrence formula. This routine calculates c[0]*H0(x) + c[1]*H1(x) + ... + c[N]*HN(x) Parameters: n - degree, n>=0 x - argument Result: the value of the Hermite polynomial at x *************************************************************************/ double hermitesum(/* Real */ ae_vector* c, ae_int_t n, double x, ae_state *_state) { double b1; double b2; ae_int_t i; double result; b1 = (double)(0); b2 = (double)(0); result = (double)(0); for(i=n; i>=0; i--) { result = 2*(x*b1-(i+1)*b2)+c->ptr.p_double[i]; b2 = b1; b1 = result; } return result; } /************************************************************************* Representation of Hn as C[0] + C[1]*X + ... + C[N]*X^N Input parameters: N - polynomial degree, n>=0 Output parameters: C - coefficients *************************************************************************/ void hermitecoefficients(ae_int_t n, /* Real */ ae_vector* c, ae_state *_state) { ae_int_t i; ae_vector_clear(c); ae_vector_set_length(c, n+1, _state); for(i=0; i<=n; i++) { c->ptr.p_double[i] = (double)(0); } c->ptr.p_double[n] = ae_exp(n*ae_log((double)(2), _state), _state); for(i=0; i<=n/2-1; i++) { c->ptr.p_double[n-2*(i+1)] = -c->ptr.p_double[n-2*i]*(n-2*i)*(n-2*i-1)/4/(i+1); } } /************************************************************************* Dawson's Integral Approximates the integral x - 2 | | 2 dawsn(x) = exp( -x ) | exp( t ) dt | | - 0 Three different rational approximations are employed, for the intervals 0 to 3.25; 3.25 to 6.25; and 6.25 up. ACCURACY: Relative error: arithmetic domain # trials peak rms IEEE 0,10 10000 6.9e-16 1.0e-16 Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 1989, 2000 by Stephen L. 
Moshier *************************************************************************/ double dawsonintegral(double x, ae_state *_state) { double x2; double y; ae_int_t sg; double an; double ad; double bn; double bd; double cn; double cd; double result; sg = 1; if( ae_fp_less(x,(double)(0)) ) { sg = -1; x = -x; } if( ae_fp_less(x,3.25) ) { x2 = x*x; an = 1.13681498971755972054E-11; an = an*x2+8.49262267667473811108E-10; an = an*x2+1.94434204175553054283E-8; an = an*x2+9.53151741254484363489E-7; an = an*x2+3.07828309874913200438E-6; an = an*x2+3.52513368520288738649E-4; an = an*x2+(-8.50149846724410912031E-4); an = an*x2+4.22618223005546594270E-2; an = an*x2+(-9.17480371773452345351E-2); an = an*x2+9.99999999999999994612E-1; ad = 2.40372073066762605484E-11; ad = ad*x2+1.48864681368493396752E-9; ad = ad*x2+5.21265281010541664570E-8; ad = ad*x2+1.27258478273186970203E-6; ad = ad*x2+2.32490249820789513991E-5; ad = ad*x2+3.25524741826057911661E-4; ad = ad*x2+3.48805814657162590916E-3; ad = ad*x2+2.79448531198828973716E-2; ad = ad*x2+1.58874241960120565368E-1; ad = ad*x2+5.74918629489320327824E-1; ad = ad*x2+1.00000000000000000539E0; y = x*an/ad; result = sg*y; return result; } x2 = 1.0/(x*x); if( ae_fp_less(x,6.25) ) { bn = 5.08955156417900903354E-1; bn = bn*x2-2.44754418142697847934E-1; bn = bn*x2+9.41512335303534411857E-2; bn = bn*x2-2.18711255142039025206E-2; bn = bn*x2+3.66207612329569181322E-3; bn = bn*x2-4.23209114460388756528E-4; bn = bn*x2+3.59641304793896631888E-5; bn = bn*x2-2.14640351719968974225E-6; bn = bn*x2+9.10010780076391431042E-8; bn = bn*x2-2.40274520828250956942E-9; bn = bn*x2+3.59233385440928410398E-11; bd = 1.00000000000000000000E0; bd = bd*x2-6.31839869873368190192E-1; bd = bd*x2+2.36706788228248691528E-1; bd = bd*x2-5.31806367003223277662E-2; bd = bd*x2+8.48041718586295374409E-3; bd = bd*x2-9.47996768486665330168E-4; bd = bd*x2+7.81025592944552338085E-5; bd = bd*x2-4.55875153252442634831E-6; bd = bd*x2+1.89100358111421846170E-7; bd = bd*x2-4.91324691331920606875E-9; bd = bd*x2+7.18466403235734541950E-11; y = 1.0/x+x2*bn/(bd*x); result = sg*0.5*y; return result; } if( ae_fp_greater(x,1.0E9) ) { result = sg*0.5/x; return result; } cn = -5.90592860534773254987E-1; cn = cn*x2+6.29235242724368800674E-1; cn = cn*x2-1.72858975380388136411E-1; cn = cn*x2+1.64837047825189632310E-2; cn = cn*x2-4.86827613020462700845E-4; cd = 1.00000000000000000000E0; cd = cd*x2-2.69820057197544900361E0; cd = cd*x2+1.73270799045947845857E0; cd = cd*x2-3.93708582281939493482E-1; cd = cd*x2+3.44278924041233391079E-2; cd = cd*x2-9.73655226040941223894E-4; y = 1.0/x+x2*cn/(cd*x); result = sg*0.5*y; return result; } /************************************************************************* Sine and cosine integrals Evaluates the integrals x - | cos t - 1 Ci(x) = eul + ln x + | --------- dt, | t - 0 x - | sin t Si(x) = | ----- dt | t - 0 where eul = 0.57721566490153286061 is Euler's constant. The integrals are approximated by rational functions. For x > 8 auxiliary functions f(x) and g(x) are employed such that Ci(x) = f(x) sin(x) - g(x) cos(x) Si(x) = pi/2 - f(x) cos(x) - g(x) sin(x) ACCURACY: Test interval = [0,50]. Absolute error, except relative when > 1: arithmetic function # trials peak rms IEEE Si 30000 4.4e-16 7.3e-17 IEEE Ci 30000 6.9e-16 5.1e-17 Cephes Math Library Release 2.1: January, 1989 Copyright 1984, 1987, 1989 by Stephen L. 
Moshier *************************************************************************/ void sinecosineintegrals(double x, double* si, double* ci, ae_state *_state) { double z; double c; double s; double f; double g; ae_int_t sg; double sn; double sd; double cn; double cd; double fn; double fd; double gn; double gd; *si = 0; *ci = 0; if( ae_fp_less(x,(double)(0)) ) { sg = -1; x = -x; } else { sg = 0; } if( ae_fp_eq(x,(double)(0)) ) { *si = (double)(0); *ci = -ae_maxrealnumber; return; } if( ae_fp_greater(x,1.0E9) ) { *si = 1.570796326794896619-ae_cos(x, _state)/x; *ci = ae_sin(x, _state)/x; return; } if( ae_fp_less_eq(x,(double)(4)) ) { z = x*x; sn = -8.39167827910303881427E-11; sn = sn*z+4.62591714427012837309E-8; sn = sn*z-9.75759303843632795789E-6; sn = sn*z+9.76945438170435310816E-4; sn = sn*z-4.13470316229406538752E-2; sn = sn*z+1.00000000000000000302E0; sd = 2.03269266195951942049E-12; sd = sd*z+1.27997891179943299903E-9; sd = sd*z+4.41827842801218905784E-7; sd = sd*z+9.96412122043875552487E-5; sd = sd*z+1.42085239326149893930E-2; sd = sd*z+9.99999999999999996984E-1; s = x*sn/sd; cn = 2.02524002389102268789E-11; cn = cn*z-1.35249504915790756375E-8; cn = cn*z+3.59325051419993077021E-6; cn = cn*z-4.74007206873407909465E-4; cn = cn*z+2.89159652607555242092E-2; cn = cn*z-1.00000000000000000080E0; cd = 4.07746040061880559506E-12; cd = cd*z+3.06780997581887812692E-9; cd = cd*z+1.23210355685883423679E-6; cd = cd*z+3.17442024775032769882E-4; cd = cd*z+5.10028056236446052392E-2; cd = cd*z+4.00000000000000000080E0; c = z*cn/cd; if( sg!=0 ) { s = -s; } *si = s; *ci = 0.57721566490153286061+ae_log(x, _state)+c; return; } s = ae_sin(x, _state); c = ae_cos(x, _state); z = 1.0/(x*x); if( ae_fp_less(x,(double)(8)) ) { fn = 4.23612862892216586994E0; fn = fn*z+5.45937717161812843388E0; fn = fn*z+1.62083287701538329132E0; fn = fn*z+1.67006611831323023771E-1; fn = fn*z+6.81020132472518137426E-3; fn = fn*z+1.08936580650328664411E-4; fn = fn*z+5.48900223421373614008E-7; fd = 1.00000000000000000000E0; fd = fd*z+8.16496634205391016773E0; fd = fd*z+7.30828822505564552187E0; fd = fd*z+1.86792257950184183883E0; fd = fd*z+1.78792052963149907262E-1; fd = fd*z+7.01710668322789753610E-3; fd = fd*z+1.10034357153915731354E-4; fd = fd*z+5.48900252756255700982E-7; f = fn/(x*fd); gn = 8.71001698973114191777E-2; gn = gn*z+6.11379109952219284151E-1; gn = gn*z+3.97180296392337498885E-1; gn = gn*z+7.48527737628469092119E-2; gn = gn*z+5.38868681462177273157E-3; gn = gn*z+1.61999794598934024525E-4; gn = gn*z+1.97963874140963632189E-6; gn = gn*z+7.82579040744090311069E-9; gd = 1.00000000000000000000E0; gd = gd*z+1.64402202413355338886E0; gd = gd*z+6.66296701268987968381E-1; gd = gd*z+9.88771761277688796203E-2; gd = gd*z+6.22396345441768420760E-3; gd = gd*z+1.73221081474177119497E-4; gd = gd*z+2.02659182086343991969E-6; gd = gd*z+7.82579218933534490868E-9; g = z*gn/gd; } else { fn = 4.55880873470465315206E-1; fn = fn*z+7.13715274100146711374E-1; fn = fn*z+1.60300158222319456320E-1; fn = fn*z+1.16064229408124407915E-2; fn = fn*z+3.49556442447859055605E-4; fn = fn*z+4.86215430826454749482E-6; fn = fn*z+3.20092790091004902806E-8; fn = fn*z+9.41779576128512936592E-11; fn = fn*z+9.70507110881952024631E-14; fd = 1.00000000000000000000E0; fd = fd*z+9.17463611873684053703E-1; fd = fd*z+1.78685545332074536321E-1; fd = fd*z+1.22253594771971293032E-2; fd = fd*z+3.58696481881851580297E-4; fd = fd*z+4.92435064317881464393E-6; fd = fd*z+3.21956939101046018377E-8; fd = fd*z+9.43720590350276732376E-11; fd = fd*z+9.70507110881952025725E-14; f = 
fn/(x*fd); gn = 6.97359953443276214934E-1; gn = gn*z+3.30410979305632063225E-1; gn = gn*z+3.84878767649974295920E-2; gn = gn*z+1.71718239052347903558E-3; gn = gn*z+3.48941165502279436777E-5; gn = gn*z+3.47131167084116673800E-7; gn = gn*z+1.70404452782044526189E-9; gn = gn*z+3.85945925430276600453E-12; gn = gn*z+3.14040098946363334640E-15; gd = 1.00000000000000000000E0; gd = gd*z+1.68548898811011640017E0; gd = gd*z+4.87852258695304967486E-1; gd = gd*z+4.67913194259625806320E-2; gd = gd*z+1.90284426674399523638E-3; gd = gd*z+3.68475504442561108162E-5; gd = gd*z+3.57043223443740838771E-7; gd = gd*z+1.72693748966316146736E-9; gd = gd*z+3.87830166023954706752E-12; gd = gd*z+3.14040098946363335242E-15; g = z*gn/gd; } *si = 1.570796326794896619-f*c-g*s; if( sg!=0 ) { *si = -*si; } *ci = f*s-g*c; } /************************************************************************* Hyperbolic sine and cosine integrals Approximates the integrals x - | | cosh t - 1 Chi(x) = eul + ln x + | ----------- dt, | | t - 0 x - | | sinh t Shi(x) = | ------ dt | | t - 0 where eul = 0.57721566490153286061 is Euler's constant. The integrals are evaluated by power series for x < 8 and by Chebyshev expansions for x between 8 and 88. For large x, both functions approach exp(x)/2x. Arguments greater than 88 in magnitude return MAXNUM. ACCURACY: Test interval 0 to 88. Relative error: arithmetic function # trials peak rms IEEE Shi 30000 6.9e-16 1.6e-16 Absolute error, except relative when |Chi| > 1: IEEE Chi 30000 8.4e-16 1.4e-16 Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 2000 by Stephen L. Moshier *************************************************************************/ void hyperbolicsinecosineintegrals(double x, double* shi, double* chi, ae_state *_state) { double k; double z; double c; double s; double a; ae_int_t sg; double b0; double b1; double b2; *shi = 0; *chi = 0; if( ae_fp_less(x,(double)(0)) ) { sg = -1; x = -x; } else { sg = 0; } if( ae_fp_eq(x,(double)(0)) ) { *shi = (double)(0); *chi = -ae_maxrealnumber; return; } if( ae_fp_less(x,8.0) ) { z = x*x; a = 1.0; s = 1.0; c = 0.0; k = 2.0; do { a = a*z/k; c = c+a/k; k = k+1.0; a = a/k; s = s+a/k; k = k+1.0; } while(ae_fp_greater_eq(ae_fabs(a/s, _state),ae_machineepsilon)); s = s*x; } else { if( ae_fp_less(x,18.0) ) { a = (576.0/x-52.0)/10.0; k = ae_exp(x, _state)/x; b0 = 1.83889230173399459482E-17; b1 = 0.0; trigintegrals_chebiterationshichi(a, -9.55485532279655569575E-17, &b0, &b1, &b2, _state); trigintegrals_chebiterationshichi(a, 2.04326105980879882648E-16, &b0, &b1, &b2, _state); trigintegrals_chebiterationshichi(a, 1.09896949074905343022E-15, &b0, &b1, &b2, _state); trigintegrals_chebiterationshichi(a, -1.31313534344092599234E-14, &b0, &b1, &b2, _state); trigintegrals_chebiterationshichi(a, 5.93976226264314278932E-14, &b0, &b1, &b2, _state); trigintegrals_chebiterationshichi(a, -3.47197010497749154755E-14, &b0, &b1, &b2, _state); trigintegrals_chebiterationshichi(a, -1.40059764613117131000E-12, &b0, &b1, &b2, _state); trigintegrals_chebiterationshichi(a, 9.49044626224223543299E-12, &b0, &b1, &b2, _state); trigintegrals_chebiterationshichi(a, -1.61596181145435454033E-11, &b0, &b1, &b2, _state); trigintegrals_chebiterationshichi(a, -1.77899784436430310321E-10, &b0, &b1, &b2, _state); trigintegrals_chebiterationshichi(a, 1.35455469767246947469E-9, &b0, &b1, &b2, _state); trigintegrals_chebiterationshichi(a, -1.03257121792819495123E-9, &b0, &b1, &b2, _state); trigintegrals_chebiterationshichi(a, -3.56699611114982536845E-8, &b0, &b1, &b2, 
_state); trigintegrals_chebiterationshichi(a, 1.44818877384267342057E-7, &b0, &b1, &b2, _state); trigintegrals_chebiterationshichi(a, 7.82018215184051295296E-7, &b0, &b1, &b2, _state); trigintegrals_chebiterationshichi(a, -5.39919118403805073710E-6, &b0, &b1, &b2, _state); trigintegrals_chebiterationshichi(a, -3.12458202168959833422E-5, &b0, &b1, &b2, _state); trigintegrals_chebiterationshichi(a, 8.90136741950727517826E-5, &b0, &b1, &b2, _state); trigintegrals_chebiterationshichi(a, 2.02558474743846862168E-3, &b0, &b1, &b2, _state); trigintegrals_chebiterationshichi(a, 2.96064440855633256972E-2, &b0, &b1, &b2, _state); trigintegrals_chebiterationshichi(a, 1.11847751047257036625E0, &b0, &b1, &b2, _state); s = k*0.5*(b0-b2); b0 = -8.12435385225864036372E-18; b1 = 0.0; trigintegrals_chebiterationshichi(a, 2.17586413290339214377E-17, &b0, &b1, &b2, _state); trigintegrals_chebiterationshichi(a, 5.22624394924072204667E-17, &b0, &b1, &b2, _state); trigintegrals_chebiterationshichi(a, -9.48812110591690559363E-16, &b0, &b1, &b2, _state); trigintegrals_chebiterationshichi(a, 5.35546311647465209166E-15, &b0, &b1, &b2, _state); trigintegrals_chebiterationshichi(a, -1.21009970113732918701E-14, &b0, &b1, &b2, _state); trigintegrals_chebiterationshichi(a, -6.00865178553447437951E-14, &b0, &b1, &b2, _state); trigintegrals_chebiterationshichi(a, 7.16339649156028587775E-13, &b0, &b1, &b2, _state); trigintegrals_chebiterationshichi(a, -2.93496072607599856104E-12, &b0, &b1, &b2, _state); trigintegrals_chebiterationshichi(a, -1.40359438136491256904E-12, &b0, &b1, &b2, _state); trigintegrals_chebiterationshichi(a, 8.76302288609054966081E-11, &b0, &b1, &b2, _state); trigintegrals_chebiterationshichi(a, -4.40092476213282340617E-10, &b0, &b1, &b2, _state); trigintegrals_chebiterationshichi(a, -1.87992075640569295479E-10, &b0, &b1, &b2, _state); trigintegrals_chebiterationshichi(a, 1.31458150989474594064E-8, &b0, &b1, &b2, _state); trigintegrals_chebiterationshichi(a, -4.75513930924765465590E-8, &b0, &b1, &b2, _state); trigintegrals_chebiterationshichi(a, -2.21775018801848880741E-7, &b0, &b1, &b2, _state); trigintegrals_chebiterationshichi(a, 1.94635531373272490962E-6, &b0, &b1, &b2, _state); trigintegrals_chebiterationshichi(a, 4.33505889257316408893E-6, &b0, &b1, &b2, _state); trigintegrals_chebiterationshichi(a, -6.13387001076494349496E-5, &b0, &b1, &b2, _state); trigintegrals_chebiterationshichi(a, -3.13085477492997465138E-4, &b0, &b1, &b2, _state); trigintegrals_chebiterationshichi(a, 4.97164789823116062801E-4, &b0, &b1, &b2, _state); trigintegrals_chebiterationshichi(a, 2.64347496031374526641E-2, &b0, &b1, &b2, _state); trigintegrals_chebiterationshichi(a, 1.11446150876699213025E0, &b0, &b1, &b2, _state); c = k*0.5*(b0-b2); } else { if( ae_fp_less_eq(x,88.0) ) { a = (6336.0/x-212.0)/70.0; k = ae_exp(x, _state)/x; b0 = -1.05311574154850938805E-17; b1 = 0.0; trigintegrals_chebiterationshichi(a, 2.62446095596355225821E-17, &b0, &b1, &b2, _state); trigintegrals_chebiterationshichi(a, 8.82090135625368160657E-17, &b0, &b1, &b2, _state); trigintegrals_chebiterationshichi(a, -3.38459811878103047136E-16, &b0, &b1, &b2, _state); trigintegrals_chebiterationshichi(a, -8.30608026366935789136E-16, &b0, &b1, &b2, _state); trigintegrals_chebiterationshichi(a, 3.93397875437050071776E-15, &b0, &b1, &b2, _state); trigintegrals_chebiterationshichi(a, 1.01765565969729044505E-14, &b0, &b1, &b2, _state); trigintegrals_chebiterationshichi(a, -4.21128170307640802703E-14, &b0, &b1, &b2, _state); trigintegrals_chebiterationshichi(a, 
-1.60818204519802480035E-13, &b0, &b1, &b2, _state); trigintegrals_chebiterationshichi(a, 3.34714954175994481761E-13, &b0, &b1, &b2, _state); trigintegrals_chebiterationshichi(a, 2.72600352129153073807E-12, &b0, &b1, &b2, _state); trigintegrals_chebiterationshichi(a, 1.66894954752839083608E-12, &b0, &b1, &b2, _state); trigintegrals_chebiterationshichi(a, -3.49278141024730899554E-11, &b0, &b1, &b2, _state); trigintegrals_chebiterationshichi(a, -1.58580661666482709598E-10, &b0, &b1, &b2, _state); trigintegrals_chebiterationshichi(a, -1.79289437183355633342E-10, &b0, &b1, &b2, _state); trigintegrals_chebiterationshichi(a, 1.76281629144264523277E-9, &b0, &b1, &b2, _state); trigintegrals_chebiterationshichi(a, 1.69050228879421288846E-8, &b0, &b1, &b2, _state); trigintegrals_chebiterationshichi(a, 1.25391771228487041649E-7, &b0, &b1, &b2, _state); trigintegrals_chebiterationshichi(a, 1.16229947068677338732E-6, &b0, &b1, &b2, _state); trigintegrals_chebiterationshichi(a, 1.61038260117376323993E-5, &b0, &b1, &b2, _state); trigintegrals_chebiterationshichi(a, 3.49810375601053973070E-4, &b0, &b1, &b2, _state); trigintegrals_chebiterationshichi(a, 1.28478065259647610779E-2, &b0, &b1, &b2, _state); trigintegrals_chebiterationshichi(a, 1.03665722588798326712E0, &b0, &b1, &b2, _state); s = k*0.5*(b0-b2); b0 = 8.06913408255155572081E-18; b1 = 0.0; trigintegrals_chebiterationshichi(a, -2.08074168180148170312E-17, &b0, &b1, &b2, _state); trigintegrals_chebiterationshichi(a, -5.98111329658272336816E-17, &b0, &b1, &b2, _state); trigintegrals_chebiterationshichi(a, 2.68533951085945765591E-16, &b0, &b1, &b2, _state); trigintegrals_chebiterationshichi(a, 4.52313941698904694774E-16, &b0, &b1, &b2, _state); trigintegrals_chebiterationshichi(a, -3.10734917335299464535E-15, &b0, &b1, &b2, _state); trigintegrals_chebiterationshichi(a, -4.42823207332531972288E-15, &b0, &b1, &b2, _state); trigintegrals_chebiterationshichi(a, 3.49639695410806959872E-14, &b0, &b1, &b2, _state); trigintegrals_chebiterationshichi(a, 6.63406731718911586609E-14, &b0, &b1, &b2, _state); trigintegrals_chebiterationshichi(a, -3.71902448093119218395E-13, &b0, &b1, &b2, _state); trigintegrals_chebiterationshichi(a, -1.27135418132338309016E-12, &b0, &b1, &b2, _state); trigintegrals_chebiterationshichi(a, 2.74851141935315395333E-12, &b0, &b1, &b2, _state); trigintegrals_chebiterationshichi(a, 2.33781843985453438400E-11, &b0, &b1, &b2, _state); trigintegrals_chebiterationshichi(a, 2.71436006377612442764E-11, &b0, &b1, &b2, _state); trigintegrals_chebiterationshichi(a, -2.56600180000355990529E-10, &b0, &b1, &b2, _state); trigintegrals_chebiterationshichi(a, -1.61021375163803438552E-9, &b0, &b1, &b2, _state); trigintegrals_chebiterationshichi(a, -4.72543064876271773512E-9, &b0, &b1, &b2, _state); trigintegrals_chebiterationshichi(a, -3.00095178028681682282E-9, &b0, &b1, &b2, _state); trigintegrals_chebiterationshichi(a, 7.79387474390914922337E-8, &b0, &b1, &b2, _state); trigintegrals_chebiterationshichi(a, 1.06942765566401507066E-6, &b0, &b1, &b2, _state); trigintegrals_chebiterationshichi(a, 1.59503164802313196374E-5, &b0, &b1, &b2, _state); trigintegrals_chebiterationshichi(a, 3.49592575153777996871E-4, &b0, &b1, &b2, _state); trigintegrals_chebiterationshichi(a, 1.28475387530065247392E-2, &b0, &b1, &b2, _state); trigintegrals_chebiterationshichi(a, 1.03665693917934275131E0, &b0, &b1, &b2, _state); c = k*0.5*(b0-b2); } else { if( sg!=0 ) { *shi = -ae_maxrealnumber; } else { *shi = ae_maxrealnumber; } *chi = ae_maxrealnumber; return; } } } if( sg!=0 
) { s = -s; } *shi = s; *chi = 0.57721566490153286061+ae_log(x, _state)+c; } static void trigintegrals_chebiterationshichi(double x, double c, double* b0, double* b1, double* b2, ae_state *_state) { *b2 = *b1; *b1 = *b0; *b0 = x*(*b1)-(*b2)+c; } /************************************************************************* Poisson distribution Returns the sum of the first k+1 terms of the Poisson distribution: k j -- -m m > e -- -- j! j=0 The terms are not summed directly; instead the incomplete gamma integral is employed, according to the relation y = pdtr( k, m ) = igamc( k+1, m ). The arguments must both be positive. ACCURACY: See incomplete gamma function Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 1995, 2000 by Stephen L. Moshier *************************************************************************/ double poissondistribution(ae_int_t k, double m, ae_state *_state) { double result; ae_assert(k>=0&&ae_fp_greater(m,(double)(0)), "Domain error in PoissonDistribution", _state); result = incompletegammac((double)(k+1), m, _state); return result; } /************************************************************************* Complemented Poisson distribution Returns the sum of the terms k+1 to infinity of the Poisson distribution: inf. j -- -m m > e -- -- j! j=k+1 The terms are not summed directly; instead the incomplete gamma integral is employed, according to the formula y = pdtrc( k, m ) = igam( k+1, m ). The arguments must both be positive. ACCURACY: See incomplete gamma function Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 1995, 2000 by Stephen L. Moshier *************************************************************************/ double poissoncdistribution(ae_int_t k, double m, ae_state *_state) { double result; ae_assert(k>=0&&ae_fp_greater(m,(double)(0)), "Domain error in PoissonDistributionC", _state); result = incompletegamma((double)(k+1), m, _state); return result; } /************************************************************************* Inverse Poisson distribution Finds the Poisson variable x such that the integral from 0 to x of the Poisson density is equal to the given probability y. This is accomplished using the inverse gamma integral function and the relation m = igami( k+1, y ). ACCURACY: See inverse incomplete gamma function Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 1995, 2000 by Stephen L. Moshier *************************************************************************/ double invpoissondistribution(ae_int_t k, double y, ae_state *_state) { double result; ae_assert((k>=0&&ae_fp_greater_eq(y,(double)(0)))&&ae_fp_less(y,(double)(1)), "Domain error in InvPoissonDistribution", _state); result = invincompletegammac((double)(k+1), y, _state); return result; } /************************************************************************* Bessel function of order zero Returns Bessel function of order zero of the argument. The domain is divided into the intervals [0, 5] and (5, infinity). In the first interval the following rational approximation is used: 2 2 (w - r ) (w - r ) P (w) / Q (w) 1 2 3 8 2 where w = x and the two r's are zeros of the function. In the second interval, the Hankel asymptotic expansion is employed with two rational functions of degree 6/6 and 7/7. ACCURACY: Absolute error: arithmetic domain # trials peak rms IEEE 0, 30 60000 4.2e-16 1.1e-16 Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 1989, 2000 by Stephen L. 
Moshier *************************************************************************/ double besselj0(double x, ae_state *_state) { double xsq; double nn; double pzero; double qzero; double p1; double q1; double result; if( ae_fp_less(x,(double)(0)) ) { x = -x; } if( ae_fp_greater(x,8.0) ) { bessel_besselasympt0(x, &pzero, &qzero, _state); nn = x-ae_pi/4; result = ae_sqrt(2/ae_pi/x, _state)*(pzero*ae_cos(nn, _state)-qzero*ae_sin(nn, _state)); return result; } xsq = ae_sqr(x, _state); p1 = 26857.86856980014981415848441; p1 = -40504123.71833132706360663322+xsq*p1; p1 = 25071582855.36881945555156435+xsq*p1; p1 = -8085222034853.793871199468171+xsq*p1; p1 = 1434354939140344.111664316553+xsq*p1; p1 = -136762035308817138.6865416609+xsq*p1; p1 = 6382059341072356562.289432465+xsq*p1; p1 = -117915762910761053603.8440800+xsq*p1; p1 = 493378725179413356181.6813446+xsq*p1; q1 = 1.0; q1 = 1363.063652328970604442810507+xsq*q1; q1 = 1114636.098462985378182402543+xsq*q1; q1 = 669998767.2982239671814028660+xsq*q1; q1 = 312304311494.1213172572469442+xsq*q1; q1 = 112775673967979.8507056031594+xsq*q1; q1 = 30246356167094626.98627330784+xsq*q1; q1 = 5428918384092285160.200195092+xsq*q1; q1 = 493378725179413356211.3278438+xsq*q1; result = p1/q1; return result; } /************************************************************************* Bessel function of order one Returns Bessel function of order one of the argument. The domain is divided into the intervals [0, 8] and (8, infinity). In the first interval a 24 term Chebyshev expansion is used. In the second, the asymptotic trigonometric representation is employed using two rational functions of degree 5/5. ACCURACY: Absolute error: arithmetic domain # trials peak rms IEEE 0, 30 30000 2.6e-16 1.1e-16 Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 1989, 2000 by Stephen L. Moshier *************************************************************************/ double besselj1(double x, ae_state *_state) { double s; double xsq; double nn; double pzero; double qzero; double p1; double q1; double result; s = (double)(ae_sign(x, _state)); if( ae_fp_less(x,(double)(0)) ) { x = -x; } if( ae_fp_greater(x,8.0) ) { bessel_besselasympt1(x, &pzero, &qzero, _state); nn = x-3*ae_pi/4; result = ae_sqrt(2/ae_pi/x, _state)*(pzero*ae_cos(nn, _state)-qzero*ae_sin(nn, _state)); if( ae_fp_less(s,(double)(0)) ) { result = -result; } return result; } xsq = ae_sqr(x, _state); p1 = 2701.122710892323414856790990; p1 = -4695753.530642995859767162166+xsq*p1; p1 = 3413234182.301700539091292655+xsq*p1; p1 = -1322983480332.126453125473247+xsq*p1; p1 = 290879526383477.5409737601689+xsq*p1; p1 = -35888175699101060.50743641413+xsq*p1; p1 = 2316433580634002297.931815435+xsq*p1; p1 = -66721065689249162980.20941484+xsq*p1; p1 = 581199354001606143928.050809+xsq*p1; q1 = 1.0; q1 = 1606.931573481487801970916749+xsq*q1; q1 = 1501793.594998585505921097578+xsq*q1; q1 = 1013863514.358673989967045588+xsq*q1; q1 = 524371026216.7649715406728642+xsq*q1; q1 = 208166122130760.7351240184229+xsq*q1; q1 = 60920613989175217.46105196863+xsq*q1; q1 = 11857707121903209998.37113348+xsq*q1; q1 = 1162398708003212287858.529400+xsq*q1; result = s*x*p1/q1; return result; } /************************************************************************* Bessel function of integer order Returns Bessel function of order n, where n is a (possibly negative) integer. The ratio of jn(x) to j0(x) is computed by backward recurrence. First the ratio jn/jn-1 is found by a continued fraction expansion. 
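The recurrence relating successive orders is j(k-1,x) = (2*k/x)*j(k,x) - j(k+1,x), applied downward in k (this is the loop over pkm2/pkm1/pk in the function body below).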
Then the recurrence relating successive orders is applied until j0 or j1 is reached. If n = 0 or 1 the routine for j0 or j1 is called directly. ACCURACY: Absolute error: arithmetic range # trials peak rms IEEE 0, 30 5000 4.4e-16 7.9e-17 Not suitable for large n or x. Use jv() (fractional order) instead. Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 2000 by Stephen L. Moshier *************************************************************************/ double besseljn(ae_int_t n, double x, ae_state *_state) { double pkm2; double pkm1; double pk; double xk; double r; double ans; ae_int_t k; ae_int_t sg; double result; if( n<0 ) { n = -n; if( n%2==0 ) { sg = 1; } else { sg = -1; } } else { sg = 1; } if( ae_fp_less(x,(double)(0)) ) { if( n%2!=0 ) { sg = -sg; } x = -x; } if( n==0 ) { result = sg*besselj0(x, _state); return result; } if( n==1 ) { result = sg*besselj1(x, _state); return result; } if( n==2 ) { if( ae_fp_eq(x,(double)(0)) ) { result = (double)(0); } else { result = sg*(2.0*besselj1(x, _state)/x-besselj0(x, _state)); } return result; } if( ae_fp_less(x,ae_machineepsilon) ) { result = (double)(0); return result; } k = 53; pk = (double)(2*(n+k)); ans = pk; xk = x*x; do { pk = pk-2.0; ans = pk-xk/ans; k = k-1; } while(k!=0); ans = x/ans; pk = 1.0; pkm1 = 1.0/ans; k = n-1; r = (double)(2*k); do { pkm2 = (pkm1*r-pk*x)/x; pk = pkm1; pkm1 = pkm2; r = r-2.0; k = k-1; } while(k!=0); if( ae_fp_greater(ae_fabs(pk, _state),ae_fabs(pkm1, _state)) ) { ans = besselj1(x, _state)/pk; } else { ans = besselj0(x, _state)/pkm1; } result = sg*ans; return result; } /************************************************************************* Bessel function of the second kind, order zero Returns Bessel function of the second kind, of order zero, of the argument. The domain is divided into the intervals [0, 5] and (5, infinity). In the first interval a rational approximation R(x) is employed to compute y0(x) = R(x) + 2 * log(x) * j0(x) / PI. Thus a call to j0() is required. In the second interval, the Hankel asymptotic expansion is employed with two rational functions of degree 6/6 and 7/7. ACCURACY: Absolute error, when y0(x) < 1; else relative error: arithmetic domain # trials peak rms IEEE 0, 30 30000 1.3e-15 1.6e-16 Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 1989, 2000 by Stephen L. 
Moshier *************************************************************************/ double bessely0(double x, ae_state *_state) { double nn; double xsq; double pzero; double qzero; double p4; double q4; double result; if( ae_fp_greater(x,8.0) ) { bessel_besselasympt0(x, &pzero, &qzero, _state); nn = x-ae_pi/4; result = ae_sqrt(2/ae_pi/x, _state)*(pzero*ae_sin(nn, _state)+qzero*ae_cos(nn, _state)); return result; } xsq = ae_sqr(x, _state); p4 = -41370.35497933148554125235152; p4 = 59152134.65686889654273830069+xsq*p4; p4 = -34363712229.79040378171030138+xsq*p4; p4 = 10255208596863.94284509167421+xsq*p4; p4 = -1648605817185729.473122082537+xsq*p4; p4 = 137562431639934407.8571335453+xsq*p4; p4 = -5247065581112764941.297350814+xsq*p4; p4 = 65874732757195549259.99402049+xsq*p4; p4 = -27502866786291095837.01933175+xsq*p4; q4 = 1.0; q4 = 1282.452772478993804176329391+xsq*q4; q4 = 1001702.641288906265666651753+xsq*q4; q4 = 579512264.0700729537480087915+xsq*q4; q4 = 261306575504.1081249568482092+xsq*q4; q4 = 91620380340751.85262489147968+xsq*q4; q4 = 23928830434997818.57439356652+xsq*q4; q4 = 4192417043410839973.904769661+xsq*q4; q4 = 372645883898616588198.9980+xsq*q4; result = p4/q4+2/ae_pi*besselj0(x, _state)*ae_log(x, _state); return result; } /************************************************************************* Bessel function of second kind of order one Returns Bessel function of the second kind of order one of the argument. The domain is divided into the intervals [0, 8] and (8, infinity). In the first interval a 25 term Chebyshev expansion is used, and a call to j1() is required. In the second, the asymptotic trigonometric representation is employed using two rational functions of degree 5/5. ACCURACY: Absolute error: arithmetic domain # trials peak rms IEEE 0, 30 30000 1.0e-15 1.3e-16 Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 1989, 2000 by Stephen L. Moshier *************************************************************************/ double bessely1(double x, ae_state *_state) { double nn; double xsq; double pzero; double qzero; double p4; double q4; double result; if( ae_fp_greater(x,8.0) ) { bessel_besselasympt1(x, &pzero, &qzero, _state); nn = x-3*ae_pi/4; result = ae_sqrt(2/ae_pi/x, _state)*(pzero*ae_sin(nn, _state)+qzero*ae_cos(nn, _state)); return result; } xsq = ae_sqr(x, _state); p4 = -2108847.540133123652824139923; p4 = 3639488548.124002058278999428+xsq*p4; p4 = -2580681702194.450950541426399+xsq*p4; p4 = 956993023992168.3481121552788+xsq*p4; p4 = -196588746272214065.8820322248+xsq*p4; p4 = 21931073399177975921.11427556+xsq*p4; p4 = -1212297555414509577913.561535+xsq*p4; p4 = 26554738314348543268942.48968+xsq*p4; p4 = -99637534243069222259967.44354+xsq*p4; q4 = 1.0; q4 = 1612.361029677000859332072312+xsq*q4; q4 = 1563282.754899580604737366452+xsq*q4; q4 = 1128686837.169442121732366891+xsq*q4; q4 = 646534088126.5275571961681500+xsq*q4; q4 = 297663212564727.6729292742282+xsq*q4; q4 = 108225825940881955.2553850180+xsq*q4; q4 = 29549879358971486742.90758119+xsq*q4; q4 = 5435310377188854170800.653097+xsq*q4; q4 = 508206736694124324531442.4152+xsq*q4; result = x*p4/q4+2/ae_pi*(besselj1(x, _state)*ae_log(x, _state)-1/x); return result; } /************************************************************************* Bessel function of second kind of integer order Returns Bessel function of order n, where n is a (possibly negative) integer. The function is evaluated by forward recurrence on n, starting with values computed by the routines y0() and y1(). 
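The forward recurrence used is y(i+1,x) = (2*i/x)*y(i,x) - y(i-1,x), i = 1..n-1. As an illustration only - a hypothetical standalone sketch, assuming the POSIX y0()/y1() routines (not part of ISO C) and given double x and int n >= 2 - the same loop reads:

    double a = y0(x), b = y1(x), tmp; int i;
    for(i=1; i<=n-1; i++) { tmp = b; b = 2*i/x*b-a; a = tmp; }

after which b holds yn(x), negated when the original order is negative and odd.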
If n = 0 or 1 the routine for y0 or y1 is called directly. ACCURACY: Absolute error, except relative when y > 1: arithmetic domain # trials peak rms IEEE 0, 30 30000 3.4e-15 4.3e-16 Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 2000 by Stephen L. Moshier *************************************************************************/ double besselyn(ae_int_t n, double x, ae_state *_state) { ae_int_t i; double a; double b; double tmp; double s; double result; s = (double)(1); if( n<0 ) { n = -n; if( n%2!=0 ) { s = (double)(-1); } } if( n==0 ) { result = bessely0(x, _state); return result; } if( n==1 ) { result = s*bessely1(x, _state); return result; } a = bessely0(x, _state); b = bessely1(x, _state); for(i=1; i<=n-1; i++) { tmp = b; b = 2*i/x*b-a; a = tmp; } result = s*b; return result; } /************************************************************************* Modified Bessel function of order zero Returns modified Bessel function of order zero of the argument. The function is defined as i0(x) = j0( ix ). The range is partitioned into the two intervals [0,8] and (8, infinity). Chebyshev polynomial expansions are employed in each interval. ACCURACY: Relative error: arithmetic domain # trials peak rms IEEE 0,30 30000 5.8e-16 1.4e-16 Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 2000 by Stephen L. Moshier *************************************************************************/ double besseli0(double x, ae_state *_state) { double y; double v; double z; double b0; double b1; double b2; double result; if( ae_fp_less(x,(double)(0)) ) { x = -x; } if( ae_fp_less_eq(x,8.0) ) { y = x/2.0-2.0; bessel_besselmfirstcheb(-4.41534164647933937950E-18, &b0, &b1, &b2, _state); bessel_besselmnextcheb(y, 3.33079451882223809783E-17, &b0, &b1, &b2, _state); bessel_besselmnextcheb(y, -2.43127984654795469359E-16, &b0, &b1, &b2, _state); bessel_besselmnextcheb(y, 1.71539128555513303061E-15, &b0, &b1, &b2, _state); bessel_besselmnextcheb(y, -1.16853328779934516808E-14, &b0, &b1, &b2, _state); bessel_besselmnextcheb(y, 7.67618549860493561688E-14, &b0, &b1, &b2, _state); bessel_besselmnextcheb(y, -4.85644678311192946090E-13, &b0, &b1, &b2, _state); bessel_besselmnextcheb(y, 2.95505266312963983461E-12, &b0, &b1, &b2, _state); bessel_besselmnextcheb(y, -1.72682629144155570723E-11, &b0, &b1, &b2, _state); bessel_besselmnextcheb(y, 9.67580903537323691224E-11, &b0, &b1, &b2, _state); bessel_besselmnextcheb(y, -5.18979560163526290666E-10, &b0, &b1, &b2, _state); bessel_besselmnextcheb(y, 2.65982372468238665035E-9, &b0, &b1, &b2, _state); bessel_besselmnextcheb(y, -1.30002500998624804212E-8, &b0, &b1, &b2, _state); bessel_besselmnextcheb(y, 6.04699502254191894932E-8, &b0, &b1, &b2, _state); bessel_besselmnextcheb(y, -2.67079385394061173391E-7, &b0, &b1, &b2, _state); bessel_besselmnextcheb(y, 1.11738753912010371815E-6, &b0, &b1, &b2, _state); bessel_besselmnextcheb(y, -4.41673835845875056359E-6, &b0, &b1, &b2, _state); bessel_besselmnextcheb(y, 1.64484480707288970893E-5, &b0, &b1, &b2, _state); bessel_besselmnextcheb(y, -5.75419501008210370398E-5, &b0, &b1, &b2, _state); bessel_besselmnextcheb(y, 1.88502885095841655729E-4, &b0, &b1, &b2, _state); bessel_besselmnextcheb(y, -5.76375574538582365885E-4, &b0, &b1, &b2, _state); bessel_besselmnextcheb(y, 1.63947561694133579842E-3, &b0, &b1, &b2, _state); bessel_besselmnextcheb(y, -4.32430999505057594430E-3, &b0, &b1, &b2, _state); bessel_besselmnextcheb(y, 1.05464603945949983183E-2, &b0, &b1, &b2, _state); bessel_besselmnextcheb(y, 
-2.37374148058994688156E-2, &b0, &b1, &b2, _state); bessel_besselmnextcheb(y, 4.93052842396707084878E-2, &b0, &b1, &b2, _state); bessel_besselmnextcheb(y, -9.49010970480476444210E-2, &b0, &b1, &b2, _state); bessel_besselmnextcheb(y, 1.71620901522208775349E-1, &b0, &b1, &b2, _state); bessel_besselmnextcheb(y, -3.04682672343198398683E-1, &b0, &b1, &b2, _state); bessel_besselmnextcheb(y, 6.76795274409476084995E-1, &b0, &b1, &b2, _state); v = 0.5*(b0-b2); result = ae_exp(x, _state)*v; return result; } z = 32.0/x-2.0; bessel_besselmfirstcheb(-7.23318048787475395456E-18, &b0, &b1, &b2, _state); bessel_besselmnextcheb(z, -4.83050448594418207126E-18, &b0, &b1, &b2, _state); bessel_besselmnextcheb(z, 4.46562142029675999901E-17, &b0, &b1, &b2, _state); bessel_besselmnextcheb(z, 3.46122286769746109310E-17, &b0, &b1, &b2, _state); bessel_besselmnextcheb(z, -2.82762398051658348494E-16, &b0, &b1, &b2, _state); bessel_besselmnextcheb(z, -3.42548561967721913462E-16, &b0, &b1, &b2, _state); bessel_besselmnextcheb(z, 1.77256013305652638360E-15, &b0, &b1, &b2, _state); bessel_besselmnextcheb(z, 3.81168066935262242075E-15, &b0, &b1, &b2, _state); bessel_besselmnextcheb(z, -9.55484669882830764870E-15, &b0, &b1, &b2, _state); bessel_besselmnextcheb(z, -4.15056934728722208663E-14, &b0, &b1, &b2, _state); bessel_besselmnextcheb(z, 1.54008621752140982691E-14, &b0, &b1, &b2, _state); bessel_besselmnextcheb(z, 3.85277838274214270114E-13, &b0, &b1, &b2, _state); bessel_besselmnextcheb(z, 7.18012445138366623367E-13, &b0, &b1, &b2, _state); bessel_besselmnextcheb(z, -1.79417853150680611778E-12, &b0, &b1, &b2, _state); bessel_besselmnextcheb(z, -1.32158118404477131188E-11, &b0, &b1, &b2, _state); bessel_besselmnextcheb(z, -3.14991652796324136454E-11, &b0, &b1, &b2, _state); bessel_besselmnextcheb(z, 1.18891471078464383424E-11, &b0, &b1, &b2, _state); bessel_besselmnextcheb(z, 4.94060238822496958910E-10, &b0, &b1, &b2, _state); bessel_besselmnextcheb(z, 3.39623202570838634515E-9, &b0, &b1, &b2, _state); bessel_besselmnextcheb(z, 2.26666899049817806459E-8, &b0, &b1, &b2, _state); bessel_besselmnextcheb(z, 2.04891858946906374183E-7, &b0, &b1, &b2, _state); bessel_besselmnextcheb(z, 2.89137052083475648297E-6, &b0, &b1, &b2, _state); bessel_besselmnextcheb(z, 6.88975834691682398426E-5, &b0, &b1, &b2, _state); bessel_besselmnextcheb(z, 3.36911647825569408990E-3, &b0, &b1, &b2, _state); bessel_besselmnextcheb(z, 8.04490411014108831608E-1, &b0, &b1, &b2, _state); v = 0.5*(b0-b2); result = ae_exp(x, _state)*v/ae_sqrt(x, _state); return result; } /************************************************************************* Modified Bessel function of order one Returns modified Bessel function of order one of the argument. The function is defined as i1(x) = -i j1( ix ). The range is partitioned into the two intervals [0,8] and (8, infinity). Chebyshev polynomial expansions are employed in each interval. ACCURACY: Relative error: arithmetic domain # trials peak rms IEEE 0, 30 30000 1.9e-15 2.1e-16 Cephes Math Library Release 2.8: June, 2000 Copyright 1985, 1987, 2000 by Stephen L. 
Moshier *************************************************************************/ double besseli1(double x, ae_state *_state) { double y; double z; double v; double b0; double b1; double b2; double result; z = ae_fabs(x, _state); if( ae_fp_less_eq(z,8.0) ) { y = z/2.0-2.0; bessel_besselm1firstcheb(2.77791411276104639959E-18, &b0, &b1, &b2, _state); bessel_besselm1nextcheb(y, -2.11142121435816608115E-17, &b0, &b1, &b2, _state); bessel_besselm1nextcheb(y, 1.55363195773620046921E-16, &b0, &b1, &b2, _state); bessel_besselm1nextcheb(y, -1.10559694773538630805E-15, &b0, &b1, &b2, _state); bessel_besselm1nextcheb(y, 7.60068429473540693410E-15, &b0, &b1, &b2, _state); bessel_besselm1nextcheb(y, -5.04218550472791168711E-14, &b0, &b1, &b2, _state); bessel_besselm1nextcheb(y, 3.22379336594557470981E-13, &b0, &b1, &b2, _state); bessel_besselm1nextcheb(y, -1.98397439776494371520E-12, &b0, &b1, &b2, _state); bessel_besselm1nextcheb(y, 1.17361862988909016308E-11, &b0, &b1, &b2, _state); bessel_besselm1nextcheb(y, -6.66348972350202774223E-11, &b0, &b1, &b2, _state); bessel_besselm1nextcheb(y, 3.62559028155211703701E-10, &b0, &b1, &b2, _state); bessel_besselm1nextcheb(y, -1.88724975172282928790E-9, &b0, &b1, &b2, _state); bessel_besselm1nextcheb(y, 9.38153738649577178388E-9, &b0, &b1, &b2, _state); bessel_besselm1nextcheb(y, -4.44505912879632808065E-8, &b0, &b1, &b2, _state); bessel_besselm1nextcheb(y, 2.00329475355213526229E-7, &b0, &b1, &b2, _state); bessel_besselm1nextcheb(y, -8.56872026469545474066E-7, &b0, &b1, &b2, _state); bessel_besselm1nextcheb(y, 3.47025130813767847674E-6, &b0, &b1, &b2, _state); bessel_besselm1nextcheb(y, -1.32731636560394358279E-5, &b0, &b1, &b2, _state); bessel_besselm1nextcheb(y, 4.78156510755005422638E-5, &b0, &b1, &b2, _state); bessel_besselm1nextcheb(y, -1.61760815825896745588E-4, &b0, &b1, &b2, _state); bessel_besselm1nextcheb(y, 5.12285956168575772895E-4, &b0, &b1, &b2, _state); bessel_besselm1nextcheb(y, -1.51357245063125314899E-3, &b0, &b1, &b2, _state); bessel_besselm1nextcheb(y, 4.15642294431288815669E-3, &b0, &b1, &b2, _state); bessel_besselm1nextcheb(y, -1.05640848946261981558E-2, &b0, &b1, &b2, _state); bessel_besselm1nextcheb(y, 2.47264490306265168283E-2, &b0, &b1, &b2, _state); bessel_besselm1nextcheb(y, -5.29459812080949914269E-2, &b0, &b1, &b2, _state); bessel_besselm1nextcheb(y, 1.02643658689847095384E-1, &b0, &b1, &b2, _state); bessel_besselm1nextcheb(y, -1.76416518357834055153E-1, &b0, &b1, &b2, _state); bessel_besselm1nextcheb(y, 2.52587186443633654823E-1, &b0, &b1, &b2, _state); v = 0.5*(b0-b2); z = v*z*ae_exp(z, _state); } else { y = 32.0/z-2.0; bessel_besselm1firstcheb(7.51729631084210481353E-18, &b0, &b1, &b2, _state); bessel_besselm1nextcheb(y, 4.41434832307170791151E-18, &b0, &b1, &b2, _state); bessel_besselm1nextcheb(y, -4.65030536848935832153E-17, &b0, &b1, &b2, _state); bessel_besselm1nextcheb(y, -3.20952592199342395980E-17, &b0, &b1, &b2, _state); bessel_besselm1nextcheb(y, 2.96262899764595013876E-16, &b0, &b1, &b2, _state); bessel_besselm1nextcheb(y, 3.30820231092092828324E-16, &b0, &b1, &b2, _state); bessel_besselm1nextcheb(y, -1.88035477551078244854E-15, &b0, &b1, &b2, _state); bessel_besselm1nextcheb(y, -3.81440307243700780478E-15, &b0, &b1, &b2, _state); bessel_besselm1nextcheb(y, 1.04202769841288027642E-14, &b0, &b1, &b2, _state); bessel_besselm1nextcheb(y, 4.27244001671195135429E-14, &b0, &b1, &b2, _state); bessel_besselm1nextcheb(y, -2.10154184277266431302E-14, &b0, &b1, &b2, _state); bessel_besselm1nextcheb(y, 
-4.08355111109219731823E-13, &b0, &b1, &b2, _state); bessel_besselm1nextcheb(y, -7.19855177624590851209E-13, &b0, &b1, &b2, _state); bessel_besselm1nextcheb(y, 2.03562854414708950722E-12, &b0, &b1, &b2, _state); bessel_besselm1nextcheb(y, 1.41258074366137813316E-11, &b0, &b1, &b2, _state); bessel_besselm1nextcheb(y, 3.25260358301548823856E-11, &b0, &b1, &b2, _state); bessel_besselm1nextcheb(y, -1.89749581235054123450E-11, &b0, &b1, &b2, _state); bessel_besselm1nextcheb(y, -5.58974346219658380687E-10, &b0, &b1, &b2, _state); bessel_besselm1nextcheb(y, -3.83538038596423702205E-9, &b0, &b1, &b2, _state); bessel_besselm1nextcheb(y, -2.63146884688951950684E-8, &b0, &b1, &b2, _state); bessel_besselm1nextcheb(y, -2.51223623787020892529E-7, &b0, &b1, &b2, _state); bessel_besselm1nextcheb(y, -3.88256480887769039346E-6, &b0, &b1, &b2, _state); bessel_besselm1nextcheb(y, -1.10588938762623716291E-4, &b0, &b1, &b2, _state); bessel_besselm1nextcheb(y, -9.76109749136146840777E-3, &b0, &b1, &b2, _state); bessel_besselm1nextcheb(y, 7.78576235018280120474E-1, &b0, &b1, &b2, _state); v = 0.5*(b0-b2); z = v*ae_exp(z, _state)/ae_sqrt(z, _state); } if( ae_fp_less(x,(double)(0)) ) { z = -z; } result = z; return result; } /************************************************************************* Modified Bessel function, second kind, order zero Returns modified Bessel function of the second kind of order zero of the argument. The range is partitioned into the two intervals [0,8] and (8, infinity). Chebyshev polynomial expansions are employed in each interval. ACCURACY: Tested at 2000 random points between 0 and 8. Peak absolute error (relative when K0 > 1) was 1.46e-14; rms, 4.26e-15. Relative error: arithmetic domain # trials peak rms IEEE 0, 30 30000 1.2e-15 1.6e-16 Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 2000 by Stephen L. 
Moshier *************************************************************************/ double besselk0(double x, ae_state *_state) { double y; double z; double v; double b0; double b1; double b2; double result; ae_assert(ae_fp_greater(x,(double)(0)), "Domain error in BesselK0: x<=0", _state); if( ae_fp_less_eq(x,(double)(2)) ) { y = x*x-2.0; bessel_besselmfirstcheb(1.37446543561352307156E-16, &b0, &b1, &b2, _state); bessel_besselmnextcheb(y, 4.25981614279661018399E-14, &b0, &b1, &b2, _state); bessel_besselmnextcheb(y, 1.03496952576338420167E-11, &b0, &b1, &b2, _state); bessel_besselmnextcheb(y, 1.90451637722020886025E-9, &b0, &b1, &b2, _state); bessel_besselmnextcheb(y, 2.53479107902614945675E-7, &b0, &b1, &b2, _state); bessel_besselmnextcheb(y, 2.28621210311945178607E-5, &b0, &b1, &b2, _state); bessel_besselmnextcheb(y, 1.26461541144692592338E-3, &b0, &b1, &b2, _state); bessel_besselmnextcheb(y, 3.59799365153615016266E-2, &b0, &b1, &b2, _state); bessel_besselmnextcheb(y, 3.44289899924628486886E-1, &b0, &b1, &b2, _state); bessel_besselmnextcheb(y, -5.35327393233902768720E-1, &b0, &b1, &b2, _state); v = 0.5*(b0-b2); v = v-ae_log(0.5*x, _state)*besseli0(x, _state); } else { z = 8.0/x-2.0; bessel_besselmfirstcheb(5.30043377268626276149E-18, &b0, &b1, &b2, _state); bessel_besselmnextcheb(z, -1.64758043015242134646E-17, &b0, &b1, &b2, _state); bessel_besselmnextcheb(z, 5.21039150503902756861E-17, &b0, &b1, &b2, _state); bessel_besselmnextcheb(z, -1.67823109680541210385E-16, &b0, &b1, &b2, _state); bessel_besselmnextcheb(z, 5.51205597852431940784E-16, &b0, &b1, &b2, _state); bessel_besselmnextcheb(z, -1.84859337734377901440E-15, &b0, &b1, &b2, _state); bessel_besselmnextcheb(z, 6.34007647740507060557E-15, &b0, &b1, &b2, _state); bessel_besselmnextcheb(z, -2.22751332699166985548E-14, &b0, &b1, &b2, _state); bessel_besselmnextcheb(z, 8.03289077536357521100E-14, &b0, &b1, &b2, _state); bessel_besselmnextcheb(z, -2.98009692317273043925E-13, &b0, &b1, &b2, _state); bessel_besselmnextcheb(z, 1.14034058820847496303E-12, &b0, &b1, &b2, _state); bessel_besselmnextcheb(z, -4.51459788337394416547E-12, &b0, &b1, &b2, _state); bessel_besselmnextcheb(z, 1.85594911495471785253E-11, &b0, &b1, &b2, _state); bessel_besselmnextcheb(z, -7.95748924447710747776E-11, &b0, &b1, &b2, _state); bessel_besselmnextcheb(z, 3.57739728140030116597E-10, &b0, &b1, &b2, _state); bessel_besselmnextcheb(z, -1.69753450938905987466E-9, &b0, &b1, &b2, _state); bessel_besselmnextcheb(z, 8.57403401741422608519E-9, &b0, &b1, &b2, _state); bessel_besselmnextcheb(z, -4.66048989768794782956E-8, &b0, &b1, &b2, _state); bessel_besselmnextcheb(z, 2.76681363944501510342E-7, &b0, &b1, &b2, _state); bessel_besselmnextcheb(z, -1.83175552271911948767E-6, &b0, &b1, &b2, _state); bessel_besselmnextcheb(z, 1.39498137188764993662E-5, &b0, &b1, &b2, _state); bessel_besselmnextcheb(z, -1.28495495816278026384E-4, &b0, &b1, &b2, _state); bessel_besselmnextcheb(z, 1.56988388573005337491E-3, &b0, &b1, &b2, _state); bessel_besselmnextcheb(z, -3.14481013119645005427E-2, &b0, &b1, &b2, _state); bessel_besselmnextcheb(z, 2.44030308206595545468E0, &b0, &b1, &b2, _state); v = 0.5*(b0-b2); v = v*ae_exp(-x, _state)/ae_sqrt(x, _state); } result = v; return result; } /************************************************************************* Modified Bessel function, second kind, order one Computes the modified Bessel function of the second kind of order one of the argument. The range is partitioned into the two intervals [0,2] and (2, infinity). 
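With v denoting the value of the corresponding Chebyshev series computed below, the result is assembled as log(x/2)*i1(x) + v/x on [0,2] and as exp(-x)*v/sqrt(x) on (2, infinity).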
Chebyshev polynomial expansions are employed in each interval. ACCURACY: Relative error: arithmetic domain # trials peak rms IEEE 0, 30 30000 1.2e-15 1.6e-16 Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 2000 by Stephen L. Moshier *************************************************************************/ double besselk1(double x, ae_state *_state) { double y; double z; double v; double b0; double b1; double b2; double result; z = 0.5*x; ae_assert(ae_fp_greater(z,(double)(0)), "Domain error in K1", _state); if( ae_fp_less_eq(x,(double)(2)) ) { y = x*x-2.0; bessel_besselm1firstcheb(-7.02386347938628759343E-18, &b0, &b1, &b2, _state); bessel_besselm1nextcheb(y, -2.42744985051936593393E-15, &b0, &b1, &b2, _state); bessel_besselm1nextcheb(y, -6.66690169419932900609E-13, &b0, &b1, &b2, _state); bessel_besselm1nextcheb(y, -1.41148839263352776110E-10, &b0, &b1, &b2, _state); bessel_besselm1nextcheb(y, -2.21338763073472585583E-8, &b0, &b1, &b2, _state); bessel_besselm1nextcheb(y, -2.43340614156596823496E-6, &b0, &b1, &b2, _state); bessel_besselm1nextcheb(y, -1.73028895751305206302E-4, &b0, &b1, &b2, _state); bessel_besselm1nextcheb(y, -6.97572385963986435018E-3, &b0, &b1, &b2, _state); bessel_besselm1nextcheb(y, -1.22611180822657148235E-1, &b0, &b1, &b2, _state); bessel_besselm1nextcheb(y, -3.53155960776544875667E-1, &b0, &b1, &b2, _state); bessel_besselm1nextcheb(y, 1.52530022733894777053E0, &b0, &b1, &b2, _state); v = 0.5*(b0-b2); result = ae_log(z, _state)*besseli1(x, _state)+v/x; } else { y = 8.0/x-2.0; bessel_besselm1firstcheb(-5.75674448366501715755E-18, &b0, &b1, &b2, _state); bessel_besselm1nextcheb(y, 1.79405087314755922667E-17, &b0, &b1, &b2, _state); bessel_besselm1nextcheb(y, -5.68946255844285935196E-17, &b0, &b1, &b2, _state); bessel_besselm1nextcheb(y, 1.83809354436663880070E-16, &b0, &b1, &b2, _state); bessel_besselm1nextcheb(y, -6.05704724837331885336E-16, &b0, &b1, &b2, _state); bessel_besselm1nextcheb(y, 2.03870316562433424052E-15, &b0, &b1, &b2, _state); bessel_besselm1nextcheb(y, -7.01983709041831346144E-15, &b0, &b1, &b2, _state); bessel_besselm1nextcheb(y, 2.47715442448130437068E-14, &b0, &b1, &b2, _state); bessel_besselm1nextcheb(y, -8.97670518232499435011E-14, &b0, &b1, &b2, _state); bessel_besselm1nextcheb(y, 3.34841966607842919884E-13, &b0, &b1, &b2, _state); bessel_besselm1nextcheb(y, -1.28917396095102890680E-12, &b0, &b1, &b2, _state); bessel_besselm1nextcheb(y, 5.13963967348173025100E-12, &b0, &b1, &b2, _state); bessel_besselm1nextcheb(y, -2.12996783842756842877E-11, &b0, &b1, &b2, _state); bessel_besselm1nextcheb(y, 9.21831518760500529508E-11, &b0, &b1, &b2, _state); bessel_besselm1nextcheb(y, -4.19035475934189648750E-10, &b0, &b1, &b2, _state); bessel_besselm1nextcheb(y, 2.01504975519703286596E-9, &b0, &b1, &b2, _state); bessel_besselm1nextcheb(y, -1.03457624656780970260E-8, &b0, &b1, &b2, _state); bessel_besselm1nextcheb(y, 5.74108412545004946722E-8, &b0, &b1, &b2, _state); bessel_besselm1nextcheb(y, -3.50196060308781257119E-7, &b0, &b1, &b2, _state); bessel_besselm1nextcheb(y, 2.40648494783721712015E-6, &b0, &b1, &b2, _state); bessel_besselm1nextcheb(y, -1.93619797416608296024E-5, &b0, &b1, &b2, _state); bessel_besselm1nextcheb(y, 1.95215518471351631108E-4, &b0, &b1, &b2, _state); bessel_besselm1nextcheb(y, -2.85781685962277938680E-3, &b0, &b1, &b2, _state); bessel_besselm1nextcheb(y, 1.03923736576817238437E-1, &b0, &b1, &b2, _state); bessel_besselm1nextcheb(y, 2.72062619048444266945E0, &b0, &b1, &b2, _state); v = 0.5*(b0-b2); result = ae_exp(-x, 
_state)*v/ae_sqrt(x, _state); } return result; } /************************************************************************* Modified Bessel function, second kind, integer order Returns modified Bessel function of the second kind of order n of the argument. The range is partitioned into the two intervals [0,9.55] and (9.55, infinity). An ascending power series is used in the low range, and an asymptotic expansion in the high range. ACCURACY: Relative error: arithmetic domain # trials peak rms IEEE 0,30 90000 1.8e-8 3.0e-10 Error is high only near the crossover point x = 9.55 between the two expansions used. Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 1988, 2000 by Stephen L. Moshier *************************************************************************/ double besselkn(ae_int_t nn, double x, ae_state *_state) { double k; double kf; double nk1f; double nkf; double zn; double t; double s; double z0; double z; double ans; double fn; double pn; double pk; double zmn; double tlg; double tox; ae_int_t i; ae_int_t n; double eul; double result; eul = 5.772156649015328606065e-1; if( nn<0 ) { n = -nn; } else { n = nn; } ae_assert(n<=31, "Overflow in BesselKN", _state); ae_assert(ae_fp_greater(x,(double)(0)), "Domain error in BesselKN", _state); if( ae_fp_less_eq(x,9.55) ) { ans = 0.0; z0 = 0.25*x*x; fn = 1.0; pn = 0.0; zmn = 1.0; tox = 2.0/x; if( n>0 ) { pn = -eul; k = 1.0; for(i=1; i<=n-1; i++) { pn = pn+1.0/k; k = k+1.0; fn = fn*k; } zmn = tox; if( n==1 ) { ans = 1.0/x; } else { nk1f = fn/n; kf = 1.0; s = nk1f; z = -z0; zn = 1.0; for(i=1; i<=n-1; i++) { nk1f = nk1f/(n-i); kf = kf*i; zn = zn*z; t = nk1f*zn/kf; s = s+t; ae_assert(ae_fp_greater(ae_maxrealnumber-ae_fabs(t, _state),ae_fabs(s, _state)), "Overflow in BesselKN", _state); ae_assert(!(ae_fp_greater(tox,1.0)&&ae_fp_less(ae_maxrealnumber/tox,zmn)), "Overflow in BesselKN", _state); zmn = zmn*tox; } s = s*0.5; t = ae_fabs(s, _state); ae_assert(!(ae_fp_greater(zmn,1.0)&&ae_fp_less(ae_maxrealnumber/zmn,t)), "Overflow in BesselKN", _state); ae_assert(!(ae_fp_greater(t,1.0)&&ae_fp_less(ae_maxrealnumber/t,zmn)), "Overflow in BesselKN", _state); ans = s*zmn; } } tlg = 2.0*ae_log(0.5*x, _state); pk = -eul; if( n==0 ) { pn = pk; t = 1.0; } else { pn = pn+1.0/n; t = 1.0/fn; } s = (pk+pn-tlg)*t; k = 1.0; do { t = t*(z0/(k*(k+n))); pk = pk+1.0/k; pn = pn+1.0/(k+n); s = s+(pk+pn-tlg)*t; k = k+1.0; } while(ae_fp_greater(ae_fabs(t/s, _state),ae_machineepsilon)); s = 0.5*s/zmn; if( n%2!=0 ) { s = -s; } ans = ans+s; result = ans; return result; } if( ae_fp_greater(x,ae_log(ae_maxrealnumber, _state)) ) { result = (double)(0); return result; } k = (double)(n); pn = 4.0*k*k; pk = 1.0; z0 = 8.0*x; fn = 1.0; t = 1.0; s = t; nkf = ae_maxrealnumber; i = 0; do { z = pn-pk*pk; t = t*z/(fn*z0); nk1f = ae_fabs(t, _state); if( i>=n&&ae_fp_greater(nk1f,nkf) ) { break; } nkf = nk1f; s = s+t; fn = fn+1.0; pk = pk+2.0; i = i+1; } while(ae_fp_greater(ae_fabs(t/s, _state),ae_machineepsilon)); result = ae_exp(-x, _state)*ae_sqrt(ae_pi/(2.0*x), _state)*s; return result; } /************************************************************************* Internal subroutine Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 2000 by Stephen L. 
Moshier *************************************************************************/ static void bessel_besselmfirstcheb(double c, double* b0, double* b1, double* b2, ae_state *_state) { *b0 = c; *b1 = 0.0; *b2 = 0.0; } /************************************************************************* Internal subroutine Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 2000 by Stephen L. Moshier *************************************************************************/ static void bessel_besselmnextcheb(double x, double c, double* b0, double* b1, double* b2, ae_state *_state) { *b2 = *b1; *b1 = *b0; *b0 = x*(*b1)-(*b2)+c; } /************************************************************************* Internal subroutine Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 2000 by Stephen L. Moshier *************************************************************************/ static void bessel_besselm1firstcheb(double c, double* b0, double* b1, double* b2, ae_state *_state) { *b0 = c; *b1 = 0.0; *b2 = 0.0; } /************************************************************************* Internal subroutine Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 2000 by Stephen L. Moshier *************************************************************************/ static void bessel_besselm1nextcheb(double x, double c, double* b0, double* b1, double* b2, ae_state *_state) { *b2 = *b1; *b1 = *b0; *b0 = x*(*b1)-(*b2)+c; } static void bessel_besselasympt0(double x, double* pzero, double* qzero, ae_state *_state) { double xsq; double p2; double q2; double p3; double q3; *pzero = 0; *qzero = 0; xsq = 64.0/(x*x); p2 = 0.0; p2 = 2485.271928957404011288128951+xsq*p2; p2 = 153982.6532623911470917825993+xsq*p2; p2 = 2016135.283049983642487182349+xsq*p2; p2 = 8413041.456550439208464315611+xsq*p2; p2 = 12332384.76817638145232406055+xsq*p2; p2 = 5393485.083869438325262122897+xsq*p2; q2 = 1.0; q2 = 2615.700736920839685159081813+xsq*q2; q2 = 156001.7276940030940592769933+xsq*q2; q2 = 2025066.801570134013891035236+xsq*q2; q2 = 8426449.050629797331554404810+xsq*q2; q2 = 12338310.22786324960844856182+xsq*q2; q2 = 5393485.083869438325560444960+xsq*q2; p3 = -0.0; p3 = -4.887199395841261531199129300+xsq*p3; p3 = -226.2630641933704113967255053+xsq*p3; p3 = -2365.956170779108192723612816+xsq*p3; p3 = -8239.066313485606568803548860+xsq*p3; p3 = -10381.41698748464093880530341+xsq*p3; p3 = -3984.617357595222463506790588+xsq*p3; q3 = 1.0; q3 = 408.7714673983499223402830260+xsq*q3; q3 = 15704.89191515395519392882766+xsq*q3; q3 = 156021.3206679291652539287109+xsq*q3; q3 = 533291.3634216897168722255057+xsq*q3; q3 = 666745.4239319826986004038103+xsq*q3; q3 = 255015.5108860942382983170882+xsq*q3; *pzero = p2/q2; *qzero = 8*p3/q3/x; } static void bessel_besselasympt1(double x, double* pzero, double* qzero, ae_state *_state) { double xsq; double p2; double q2; double p3; double q3; *pzero = 0; *qzero = 0; xsq = 64.0/(x*x); p2 = -1611.616644324610116477412898; p2 = -109824.0554345934672737413139+xsq*p2; p2 = -1523529.351181137383255105722+xsq*p2; p2 = -6603373.248364939109255245434+xsq*p2; p2 = -9942246.505077641195658377899+xsq*p2; p2 = -4435757.816794127857114720794+xsq*p2; q2 = 1.0; q2 = -1455.009440190496182453565068+xsq*q2; q2 = -107263.8599110382011903063867+xsq*q2; q2 = -1511809.506634160881644546358+xsq*q2; q2 = -6585339.479723087072826915069+xsq*q2; q2 = -9934124.389934585658967556309+xsq*q2; q2 = -4435757.816794127856828016962+xsq*q2; p3 = 35.26513384663603218592175580; p3 = 
1706.375429020768002061283546+xsq*p3; p3 = 18494.26287322386679652009819+xsq*p3; p3 = 66178.83658127083517939992166+xsq*p3; p3 = 85145.16067533570196555001171+xsq*p3; p3 = 33220.91340985722351859704442+xsq*p3; q3 = 1.0; q3 = 863.8367769604990967475517183+xsq*q3; q3 = 37890.22974577220264142952256+xsq*q3; q3 = 400294.4358226697511708610813+xsq*q3; q3 = 1419460.669603720892855755253+xsq*q3; q3 = 1819458.042243997298924553839+xsq*q3; q3 = 708712.8194102874357377502472+xsq*q3; *pzero = p2/q2; *qzero = 8*p3/q3/x; } /************************************************************************* Incomplete beta integral Returns incomplete beta integral of the arguments, evaluated from zero to x. The function is defined as x - - | (a+b) | | a-1 b-1 ----------- | t (1-t) dt. - - | | | (a) | (b) - 0 The domain of definition is 0 <= x <= 1. In this implementation a and b are restricted to positive values. The integral from x to 1 may be obtained by the symmetry relation 1 - incbet( a, b, x ) = incbet( b, a, 1-x ). The integral is evaluated by a continued fraction expansion or, when b*x is small, by a power series. ACCURACY: Tested at uniformly distributed random points (a,b,x) with a and b in "domain" and x between 0 and 1. Relative error arithmetic domain # trials peak rms IEEE 0,5 10000 6.9e-15 4.5e-16 IEEE 0,85 250000 2.2e-13 1.7e-14 IEEE 0,1000 30000 5.3e-12 6.3e-13 IEEE 0,10000 250000 9.3e-11 7.1e-12 IEEE 0,100000 10000 8.7e-10 4.8e-11 Outputs smaller than the IEEE gradual underflow threshold were excluded from these statistics. Cephes Math Library, Release 2.8: June, 2000 Copyright 1984, 1995, 2000 by Stephen L. Moshier *************************************************************************/ double incompletebeta(double a, double b, double x, ae_state *_state) { double t; double xc; double w; double y; ae_int_t flag; double sg; double big; double biginv; double maxgam; double minlog; double maxlog; double result; big = 4.503599627370496e15; biginv = 2.22044604925031308085e-16; maxgam = 171.624376956302725; minlog = ae_log(ae_minrealnumber, _state); maxlog = ae_log(ae_maxrealnumber, _state); ae_assert(ae_fp_greater(a,(double)(0))&&ae_fp_greater(b,(double)(0)), "Domain error in IncompleteBeta", _state); ae_assert(ae_fp_greater_eq(x,(double)(0))&&ae_fp_less_eq(x,(double)(1)), "Domain error in IncompleteBeta", _state); if( ae_fp_eq(x,(double)(0)) ) { result = (double)(0); return result; } if( ae_fp_eq(x,(double)(1)) ) { result = (double)(1); return result; } flag = 0; if( ae_fp_less_eq(b*x,1.0)&&ae_fp_less_eq(x,0.95) ) { result = ibetaf_incompletebetaps(a, b, x, maxgam, _state); return result; } w = 1.0-x; if( ae_fp_greater(x,a/(a+b)) ) { flag = 1; t = a; a = b; b = t; xc = x; x = w; } else { xc = w; } if( (flag==1&&ae_fp_less_eq(b*x,1.0))&&ae_fp_less_eq(x,0.95) ) { t = ibetaf_incompletebetaps(a, b, x, maxgam, _state); if( ae_fp_less_eq(t,ae_machineepsilon) ) { result = 1.0-ae_machineepsilon; } else { result = 1.0-t; } return result; } y = x*(a+b-2.0)-(a-1.0); if( ae_fp_less(y,0.0) ) { w = ibetaf_incompletebetafe(a, b, x, big, biginv, _state); } else { w = ibetaf_incompletebetafe2(a, b, x, big, biginv, _state)/xc; } y = a*ae_log(x, _state); t = b*ae_log(xc, _state); if( (ae_fp_less(a+b,maxgam)&&ae_fp_less(ae_fabs(y, _state),maxlog))&&ae_fp_less(ae_fabs(t, _state),maxlog) ) { t = ae_pow(xc, b, _state); t = t*ae_pow(x, a, _state); t = t/a; t = t*w; t = t*(gammafunction(a+b, _state)/(gammafunction(a, _state)*gammafunction(b, _state))); if( flag==1 ) { if( ae_fp_less_eq(t,ae_machineepsilon) ) { result = 
1.0-ae_machineepsilon; } else { result = 1.0-t; } } else { result = t; } return result; } y = y+t+lngamma(a+b, &sg, _state)-lngamma(a, &sg, _state)-lngamma(b, &sg, _state); y = y+ae_log(w/a, _state); if( ae_fp_less(y,minlog) ) { t = 0.0; } else { t = ae_exp(y, _state); } if( flag==1 ) { if( ae_fp_less_eq(t,ae_machineepsilon) ) { t = 1.0-ae_machineepsilon; } else { t = 1.0-t; } } result = t; return result; } /************************************************************************* Inverse of imcomplete beta integral Given y, the function finds x such that incbet( a, b, x ) = y . The routine performs interval halving or Newton iterations to find the root of incbet(a,b,x) - y = 0. ACCURACY: Relative error: x a,b arithmetic domain domain # trials peak rms IEEE 0,1 .5,10000 50000 5.8e-12 1.3e-13 IEEE 0,1 .25,100 100000 1.8e-13 3.9e-15 IEEE 0,1 0,5 50000 1.1e-12 5.5e-15 With a and b constrained to half-integer or integer values: IEEE 0,1 .5,10000 50000 5.8e-12 1.1e-13 IEEE 0,1 .5,100 100000 1.7e-14 7.9e-16 With a = .5, b constrained to half-integer or integer values: IEEE 0,1 .5,10000 10000 8.3e-11 1.0e-11 Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1996, 2000 by Stephen L. Moshier *************************************************************************/ double invincompletebeta(double a, double b, double y, ae_state *_state) { double aaa; double bbb; double y0; double d; double yyy; double x; double x0; double x1; double lgm; double yp; double di; double dithresh; double yl; double yh; double xt; ae_int_t i; ae_int_t rflg; ae_int_t dir; ae_int_t nflg; double s; ae_int_t mainlooppos; ae_int_t ihalve; ae_int_t ihalvecycle; ae_int_t newt; ae_int_t newtcycle; ae_int_t breaknewtcycle; ae_int_t breakihalvecycle; double result; i = 0; ae_assert(ae_fp_greater_eq(y,(double)(0))&&ae_fp_less_eq(y,(double)(1)), "Domain error in InvIncompleteBeta", _state); /* * special cases */ if( ae_fp_eq(y,(double)(0)) ) { result = (double)(0); return result; } if( ae_fp_eq(y,1.0) ) { result = (double)(1); return result; } /* * these initializations are not really necessary, * but without them compiler complains about 'possibly uninitialized variables'. 
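 * Note: the goto-based control flow of the original Cephes routine is emulated below by the integer
 * state variable mainlooppos, which steps through the ihalve, ihalvecycle, newt, newtcycle,
 * breaknewtcycle and breakihalvecycle phases (interval halving followed by Newton iterations).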
*/ dithresh = (double)(0); rflg = 0; aaa = (double)(0); bbb = (double)(0); y0 = (double)(0); x = (double)(0); yyy = (double)(0); lgm = (double)(0); dir = 0; di = (double)(0); /* * normal initializations */ x0 = 0.0; yl = 0.0; x1 = 1.0; yh = 1.0; nflg = 0; mainlooppos = 0; ihalve = 1; ihalvecycle = 2; newt = 3; newtcycle = 4; breaknewtcycle = 5; breakihalvecycle = 6; /* * main loop */ for(;;) { /* * start */ if( mainlooppos==0 ) { if( ae_fp_less_eq(a,1.0)||ae_fp_less_eq(b,1.0) ) { dithresh = 1.0e-6; rflg = 0; aaa = a; bbb = b; y0 = y; x = aaa/(aaa+bbb); yyy = incompletebeta(aaa, bbb, x, _state); mainlooppos = ihalve; continue; } else { dithresh = 1.0e-4; } yp = -invnormaldistribution(y, _state); if( ae_fp_greater(y,0.5) ) { rflg = 1; aaa = b; bbb = a; y0 = 1.0-y; yp = -yp; } else { rflg = 0; aaa = a; bbb = b; y0 = y; } lgm = (yp*yp-3.0)/6.0; x = 2.0/(1.0/(2.0*aaa-1.0)+1.0/(2.0*bbb-1.0)); d = yp*ae_sqrt(x+lgm, _state)/x-(1.0/(2.0*bbb-1.0)-1.0/(2.0*aaa-1.0))*(lgm+5.0/6.0-2.0/(3.0*x)); d = 2.0*d; if( ae_fp_less(d,ae_log(ae_minrealnumber, _state)) ) { x = (double)(0); break; } x = aaa/(aaa+bbb*ae_exp(d, _state)); yyy = incompletebeta(aaa, bbb, x, _state); yp = (yyy-y0)/y0; if( ae_fp_less(ae_fabs(yp, _state),0.2) ) { mainlooppos = newt; continue; } mainlooppos = ihalve; continue; } /* * ihalve */ if( mainlooppos==ihalve ) { dir = 0; di = 0.5; i = 0; mainlooppos = ihalvecycle; continue; } /* * ihalvecycle */ if( mainlooppos==ihalvecycle ) { if( i<=99 ) { if( i!=0 ) { x = x0+di*(x1-x0); if( ae_fp_eq(x,1.0) ) { x = 1.0-ae_machineepsilon; } if( ae_fp_eq(x,0.0) ) { di = 0.5; x = x0+di*(x1-x0); if( ae_fp_eq(x,0.0) ) { break; } } yyy = incompletebeta(aaa, bbb, x, _state); yp = (x1-x0)/(x1+x0); if( ae_fp_less(ae_fabs(yp, _state),dithresh) ) { mainlooppos = newt; continue; } yp = (yyy-y0)/y0; if( ae_fp_less(ae_fabs(yp, _state),dithresh) ) { mainlooppos = newt; continue; } } if( ae_fp_less(yyy,y0) ) { x0 = x; yl = yyy; if( dir<0 ) { dir = 0; di = 0.5; } else { if( dir>3 ) { di = 1.0-(1.0-di)*(1.0-di); } else { if( dir>1 ) { di = 0.5*di+0.5; } else { di = (y0-yyy)/(yh-yl); } } } dir = dir+1; if( ae_fp_greater(x0,0.75) ) { if( rflg==1 ) { rflg = 0; aaa = a; bbb = b; y0 = y; } else { rflg = 1; aaa = b; bbb = a; y0 = 1.0-y; } x = 1.0-x; yyy = incompletebeta(aaa, bbb, x, _state); x0 = 0.0; yl = 0.0; x1 = 1.0; yh = 1.0; mainlooppos = ihalve; continue; } } else { x1 = x; if( rflg==1&&ae_fp_less(x1,ae_machineepsilon) ) { x = 0.0; break; } yh = yyy; if( dir>0 ) { dir = 0; di = 0.5; } else { if( dir<-3 ) { di = di*di; } else { if( dir<-1 ) { di = 0.5*di; } else { di = (yyy-y0)/(yh-yl); } } } dir = dir-1; } i = i+1; mainlooppos = ihalvecycle; continue; } else { mainlooppos = breakihalvecycle; continue; } } /* * breakihalvecycle */ if( mainlooppos==breakihalvecycle ) { if( ae_fp_greater_eq(x0,1.0) ) { x = 1.0-ae_machineepsilon; break; } if( ae_fp_less_eq(x,0.0) ) { x = 0.0; break; } mainlooppos = newt; continue; } /* * newt */ if( mainlooppos==newt ) { if( nflg!=0 ) { break; } nflg = 1; lgm = lngamma(aaa+bbb, &s, _state)-lngamma(aaa, &s, _state)-lngamma(bbb, &s, _state); i = 0; mainlooppos = newtcycle; continue; } /* * newtcycle */ if( mainlooppos==newtcycle ) { if( i<=7 ) { if( i!=0 ) { yyy = incompletebeta(aaa, bbb, x, _state); } if( ae_fp_less(yyy,yl) ) { x = x0; yyy = yl; } else { if( ae_fp_greater(yyy,yh) ) { x = x1; yyy = yh; } else { if( ae_fp_less(yyy,y0) ) { x0 = x; yl = yyy; } else { x1 = x; yh = yyy; } } } if( ae_fp_eq(x,1.0)||ae_fp_eq(x,0.0) ) { mainlooppos = breaknewtcycle; continue; } d = 
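/* Newton step: the expression below is the logarithm of the Beta density x^(aaa-1)*(1-x)^(bbb-1)/B(aaa,bbb);
   the update further down is xt = x - (incbet(aaa,bbb,x)-y0)/density, pulled back toward the
   bracket [x0,x1] if it falls outside. */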
(aaa-1.0)*ae_log(x, _state)+(bbb-1.0)*ae_log(1.0-x, _state)+lgm; if( ae_fp_less(d,ae_log(ae_minrealnumber, _state)) ) { break; } if( ae_fp_greater(d,ae_log(ae_maxrealnumber, _state)) ) { mainlooppos = breaknewtcycle; continue; } d = ae_exp(d, _state); d = (yyy-y0)/d; xt = x-d; if( ae_fp_less_eq(xt,x0) ) { yyy = (x-x0)/(x1-x0); xt = x0+0.5*yyy*(x-x0); if( ae_fp_less_eq(xt,0.0) ) { mainlooppos = breaknewtcycle; continue; } } if( ae_fp_greater_eq(xt,x1) ) { yyy = (x1-x)/(x1-x0); xt = x1-0.5*yyy*(x1-x); if( ae_fp_greater_eq(xt,1.0) ) { mainlooppos = breaknewtcycle; continue; } } x = xt; if( ae_fp_less(ae_fabs(d/x, _state),128.0*ae_machineepsilon) ) { break; } i = i+1; mainlooppos = newtcycle; continue; } else { mainlooppos = breaknewtcycle; continue; } } /* * breaknewtcycle */ if( mainlooppos==breaknewtcycle ) { dithresh = 256.0*ae_machineepsilon; mainlooppos = ihalve; continue; } } /* * done */ if( rflg!=0 ) { if( ae_fp_less_eq(x,ae_machineepsilon) ) { x = 1.0-ae_machineepsilon; } else { x = 1.0-x; } } result = x; return result; } /************************************************************************* Continued fraction expansion #1 for incomplete beta integral Cephes Math Library, Release 2.8: June, 2000 Copyright 1984, 1995, 2000 by Stephen L. Moshier *************************************************************************/ static double ibetaf_incompletebetafe(double a, double b, double x, double big, double biginv, ae_state *_state) { double xk; double pk; double pkm1; double pkm2; double qk; double qkm1; double qkm2; double k1; double k2; double k3; double k4; double k5; double k6; double k7; double k8; double r; double t; double ans; double thresh; ae_int_t n; double result; k1 = a; k2 = a+b; k3 = a; k4 = a+1.0; k5 = 1.0; k6 = b-1.0; k7 = k4; k8 = a+2.0; pkm2 = 0.0; qkm2 = 1.0; pkm1 = 1.0; qkm1 = 1.0; ans = 1.0; r = 1.0; n = 0; thresh = 3.0*ae_machineepsilon; do { xk = -x*k1*k2/(k3*k4); pk = pkm1+pkm2*xk; qk = qkm1+qkm2*xk; pkm2 = pkm1; pkm1 = pk; qkm2 = qkm1; qkm1 = qk; xk = x*k5*k6/(k7*k8); pk = pkm1+pkm2*xk; qk = qkm1+qkm2*xk; pkm2 = pkm1; pkm1 = pk; qkm2 = qkm1; qkm1 = qk; if( ae_fp_neq(qk,(double)(0)) ) { r = pk/qk; } if( ae_fp_neq(r,(double)(0)) ) { t = ae_fabs((ans-r)/r, _state); ans = r; } else { t = 1.0; } if( ae_fp_less(t,thresh) ) { break; } k1 = k1+1.0; k2 = k2+1.0; k3 = k3+2.0; k4 = k4+2.0; k5 = k5+1.0; k6 = k6-1.0; k7 = k7+2.0; k8 = k8+2.0; if( ae_fp_greater(ae_fabs(qk, _state)+ae_fabs(pk, _state),big) ) { pkm2 = pkm2*biginv; pkm1 = pkm1*biginv; qkm2 = qkm2*biginv; qkm1 = qkm1*biginv; } if( ae_fp_less(ae_fabs(qk, _state),biginv)||ae_fp_less(ae_fabs(pk, _state),biginv) ) { pkm2 = pkm2*big; pkm1 = pkm1*big; qkm2 = qkm2*big; qkm1 = qkm1*big; } n = n+1; } while(n!=300); result = ans; return result; } /************************************************************************* Continued fraction expansion #2 for incomplete beta integral Cephes Math Library, Release 2.8: June, 2000 Copyright 1984, 1995, 2000 by Stephen L. 
Moshier *************************************************************************/ static double ibetaf_incompletebetafe2(double a, double b, double x, double big, double biginv, ae_state *_state) { double xk; double pk; double pkm1; double pkm2; double qk; double qkm1; double qkm2; double k1; double k2; double k3; double k4; double k5; double k6; double k7; double k8; double r; double t; double ans; double z; double thresh; ae_int_t n; double result; k1 = a; k2 = b-1.0; k3 = a; k4 = a+1.0; k5 = 1.0; k6 = a+b; k7 = a+1.0; k8 = a+2.0; pkm2 = 0.0; qkm2 = 1.0; pkm1 = 1.0; qkm1 = 1.0; z = x/(1.0-x); ans = 1.0; r = 1.0; n = 0; thresh = 3.0*ae_machineepsilon; do { xk = -z*k1*k2/(k3*k4); pk = pkm1+pkm2*xk; qk = qkm1+qkm2*xk; pkm2 = pkm1; pkm1 = pk; qkm2 = qkm1; qkm1 = qk; xk = z*k5*k6/(k7*k8); pk = pkm1+pkm2*xk; qk = qkm1+qkm2*xk; pkm2 = pkm1; pkm1 = pk; qkm2 = qkm1; qkm1 = qk; if( ae_fp_neq(qk,(double)(0)) ) { r = pk/qk; } if( ae_fp_neq(r,(double)(0)) ) { t = ae_fabs((ans-r)/r, _state); ans = r; } else { t = 1.0; } if( ae_fp_less(t,thresh) ) { break; } k1 = k1+1.0; k2 = k2-1.0; k3 = k3+2.0; k4 = k4+2.0; k5 = k5+1.0; k6 = k6+1.0; k7 = k7+2.0; k8 = k8+2.0; if( ae_fp_greater(ae_fabs(qk, _state)+ae_fabs(pk, _state),big) ) { pkm2 = pkm2*biginv; pkm1 = pkm1*biginv; qkm2 = qkm2*biginv; qkm1 = qkm1*biginv; } if( ae_fp_less(ae_fabs(qk, _state),biginv)||ae_fp_less(ae_fabs(pk, _state),biginv) ) { pkm2 = pkm2*big; pkm1 = pkm1*big; qkm2 = qkm2*big; qkm1 = qkm1*big; } n = n+1; } while(n!=300); result = ans; return result; } /************************************************************************* Power series for incomplete beta integral. Use when b*x is small and x not too close to 1. Cephes Math Library, Release 2.8: June, 2000 Copyright 1984, 1995, 2000 by Stephen L. Moshier *************************************************************************/ static double ibetaf_incompletebetaps(double a, double b, double x, double maxgam, ae_state *_state) { double s; double t; double u; double v; double n; double t1; double z; double ai; double sg; double result; ai = 1.0/a; u = (1.0-b)*x; v = u/(a+1.0); t1 = v; t = u; n = 2.0; s = 0.0; z = ae_machineepsilon*ai; while(ae_fp_greater(ae_fabs(v, _state),z)) { u = (n-b)*x/n; t = t*u; v = t/(a+n); s = s+v; n = n+1.0; } s = s+t1; s = s+ai; u = a*ae_log(x, _state); if( ae_fp_less(a+b,maxgam)&&ae_fp_less(ae_fabs(u, _state),ae_log(ae_maxrealnumber, _state)) ) { t = gammafunction(a+b, _state)/(gammafunction(a, _state)*gammafunction(b, _state)); s = s*t*ae_pow(x, a, _state); } else { t = lngamma(a+b, &sg, _state)-lngamma(a, &sg, _state)-lngamma(b, &sg, _state)+u+ae_log(s, _state); if( ae_fp_less(t,ae_log(ae_minrealnumber, _state)) ) { s = 0.0; } else { s = ae_exp(t, _state); } } result = s; return result; } /************************************************************************* F distribution Returns the area from zero to x under the F density function (also known as Snedcor's density or the variance ratio density). This is the density of x = (u1/df1)/(u2/df2), where u1 and u2 are random variables having Chi square distributions with df1 and df2 degrees of freedom, respectively. The incomplete beta integral is used, according to the formula P(x) = incbet( df1/2, df2/2, (df1*x/(df2 + df1*x) ). The arguments a and b are greater than zero, and x is nonnegative. ACCURACY: Tested at random points (a,b,x). 
x a,b Relative error: arithmetic domain domain # trials peak rms IEEE 0,1 0,100 100000 9.8e-15 1.7e-15 IEEE 1,5 0,100 100000 6.5e-15 3.5e-16 IEEE 0,1 1,10000 100000 2.2e-11 3.3e-12 IEEE 1,5 1,10000 100000 1.1e-11 1.7e-13 Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 1995, 2000 by Stephen L. Moshier *************************************************************************/ double fdistribution(ae_int_t a, ae_int_t b, double x, ae_state *_state) { double w; double result; ae_assert((a>=1&&b>=1)&&ae_fp_greater_eq(x,(double)(0)), "Domain error in FDistribution", _state); w = a*x; w = w/(b+w); result = incompletebeta(0.5*a, 0.5*b, w, _state); return result; } /************************************************************************* Complemented F distribution Returns the area from x to infinity under the F density function (also known as Snedcor's density or the variance ratio density). inf. - 1 | | a-1 b-1 1-P(x) = ------ | t (1-t) dt B(a,b) | | - x The incomplete beta integral is used, according to the formula P(x) = incbet( df2/2, df1/2, (df2/(df2 + df1*x) ). ACCURACY: Tested at random points (a,b,x) in the indicated intervals. x a,b Relative error: arithmetic domain domain # trials peak rms IEEE 0,1 1,100 100000 3.7e-14 5.9e-16 IEEE 1,5 1,100 100000 8.0e-15 1.6e-15 IEEE 0,1 1,10000 100000 1.8e-11 3.5e-13 IEEE 1,5 1,10000 100000 2.0e-11 3.0e-12 Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 1995, 2000 by Stephen L. Moshier *************************************************************************/ double fcdistribution(ae_int_t a, ae_int_t b, double x, ae_state *_state) { double w; double result; ae_assert((a>=1&&b>=1)&&ae_fp_greater_eq(x,(double)(0)), "Domain error in FCDistribution", _state); w = b/(b+a*x); result = incompletebeta(0.5*b, 0.5*a, w, _state); return result; } /************************************************************************* Inverse of complemented F distribution Finds the F density argument x such that the integral from x to infinity of the F density is equal to the given probability p. This is accomplished using the inverse beta integral function and the relations z = incbi( df2/2, df1/2, p ) x = df2 (1-z) / (df1 z). Note: the following relations hold for the inverse of the uncomplemented F distribution: z = incbi( df1/2, df2/2, p ) x = df2 z / (df1 (1-z)). ACCURACY: Tested at random points (a,b,p). a,b Relative error: arithmetic domain # trials peak rms For p between .001 and 1: IEEE 1,100 100000 8.3e-15 4.7e-16 IEEE 1,10000 100000 2.1e-11 1.4e-13 For p between 10^-6 and 10^-3: IEEE 1,100 50000 1.3e-12 8.4e-15 IEEE 1,10000 50000 3.0e-12 4.8e-14 Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 1995, 2000 by Stephen L. 
Moshier *************************************************************************/ double invfdistribution(ae_int_t a, ae_int_t b, double y, ae_state *_state) { double w; double result; ae_assert(((a>=1&&b>=1)&&ae_fp_greater(y,(double)(0)))&&ae_fp_less_eq(y,(double)(1)), "Domain error in InvFDistribution", _state); /* * Compute probability for x = 0.5 */ w = incompletebeta(0.5*b, 0.5*a, 0.5, _state); /* * If that is greater than y, then the solution w < .5 * Otherwise, solve at 1-y to remove cancellation in (b - b*w) */ if( ae_fp_greater(w,y)||ae_fp_less(y,0.001) ) { w = invincompletebeta(0.5*b, 0.5*a, y, _state); result = (b-b*w)/(a*w); } else { w = invincompletebeta(0.5*a, 0.5*b, 1.0-y, _state); result = b*w/(a*(1.0-w)); } return result; } /************************************************************************* Fresnel integral Evaluates the Fresnel integrals x - | | C(x) = | cos(pi/2 t**2) dt, | | - 0 x - | | S(x) = | sin(pi/2 t**2) dt. | | - 0 The integrals are evaluated by a power series for x < 1. For x >= 1 auxiliary functions f(x) and g(x) are employed such that C(x) = 0.5 + f(x) sin( pi/2 x**2 ) - g(x) cos( pi/2 x**2 ) S(x) = 0.5 - f(x) cos( pi/2 x**2 ) - g(x) sin( pi/2 x**2 ) ACCURACY: Relative error. Arithmetic function domain # trials peak rms IEEE S(x) 0, 10 10000 2.0e-15 3.2e-16 IEEE C(x) 0, 10 10000 1.8e-15 3.3e-16 Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 1989, 2000 by Stephen L. Moshier *************************************************************************/ void fresnelintegral(double x, double* c, double* s, ae_state *_state) { double xxa; double f; double g; double cc; double ss; double t; double u; double x2; double sn; double sd; double cn; double cd; double fn; double fd; double gn; double gd; double mpi; double mpio2; mpi = 3.14159265358979323846; mpio2 = 1.57079632679489661923; xxa = x; x = ae_fabs(xxa, _state); x2 = x*x; if( ae_fp_less(x2,2.5625) ) { t = x2*x2; sn = -2.99181919401019853726E3; sn = sn*t+7.08840045257738576863E5; sn = sn*t-6.29741486205862506537E7; sn = sn*t+2.54890880573376359104E9; sn = sn*t-4.42979518059697779103E10; sn = sn*t+3.18016297876567817986E11; sd = 1.00000000000000000000E0; sd = sd*t+2.81376268889994315696E2; sd = sd*t+4.55847810806532581675E4; sd = sd*t+5.17343888770096400730E6; sd = sd*t+4.19320245898111231129E8; sd = sd*t+2.24411795645340920940E10; sd = sd*t+6.07366389490084639049E11; cn = -4.98843114573573548651E-8; cn = cn*t+9.50428062829859605134E-6; cn = cn*t-6.45191435683965050962E-4; cn = cn*t+1.88843319396703850064E-2; cn = cn*t-2.05525900955013891793E-1; cn = cn*t+9.99999999999999998822E-1; cd = 3.99982968972495980367E-12; cd = cd*t+9.15439215774657478799E-10; cd = cd*t+1.25001862479598821474E-7; cd = cd*t+1.22262789024179030997E-5; cd = cd*t+8.68029542941784300606E-4; cd = cd*t+4.12142090722199792936E-2; cd = cd*t+1.00000000000000000118E0; *s = ae_sign(xxa, _state)*x*x2*sn/sd; *c = ae_sign(xxa, _state)*x*cn/cd; return; } if( ae_fp_greater(x,36974.0) ) { *c = ae_sign(xxa, _state)*0.5; *s = ae_sign(xxa, _state)*0.5; return; } x2 = x*x; t = mpi*x2; u = 1/(t*t); t = 1/t; fn = 4.21543555043677546506E-1; fn = fn*u+1.43407919780758885261E-1; fn = fn*u+1.15220955073585758835E-2; fn = fn*u+3.45017939782574027900E-4; fn = fn*u+4.63613749287867322088E-6; fn = fn*u+3.05568983790257605827E-8; fn = fn*u+1.02304514164907233465E-10; fn = fn*u+1.72010743268161828879E-13; fn = fn*u+1.34283276233062758925E-16; fn = fn*u+3.76329711269987889006E-20; fd = 1.00000000000000000000E0; fd = fd*u+7.51586398353378947175E-1; 
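/*
   Usage sketch for this routine (an aside, assuming the C++ wrapper
   alglib::fresnelintegral(x, c, s) exposed in specialfunctions.h without the
   ae_state argument; demo_fresnel is an illustrative name):

       #include <cstdio>
       #include "specialfunctions.h"

       void demo_fresnel()      // illustrative helper, not part of the library
       {
           double c, s;
           alglib::fresnelintegral(0.5, c, s);          // power-series branch (x*x < 2.5625)
           std::printf("C(0.5)=%.12f  S(0.5)=%.12f\n", c, s);
           alglib::fresnelintegral(50000.0, c, s);      // large-x branch: both integrals tend to 0.5
           std::printf("C=%.12f  S=%.12f\n", c, s);
       }
*/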
fd = fd*u+1.16888925859191382142E-1; fd = fd*u+6.44051526508858611005E-3; fd = fd*u+1.55934409164153020873E-4; fd = fd*u+1.84627567348930545870E-6; fd = fd*u+1.12699224763999035261E-8; fd = fd*u+3.60140029589371370404E-11; fd = fd*u+5.88754533621578410010E-14; fd = fd*u+4.52001434074129701496E-17; fd = fd*u+1.25443237090011264384E-20; gn = 5.04442073643383265887E-1; gn = gn*u+1.97102833525523411709E-1; gn = gn*u+1.87648584092575249293E-2; gn = gn*u+6.84079380915393090172E-4; gn = gn*u+1.15138826111884280931E-5; gn = gn*u+9.82852443688422223854E-8; gn = gn*u+4.45344415861750144738E-10; gn = gn*u+1.08268041139020870318E-12; gn = gn*u+1.37555460633261799868E-15; gn = gn*u+8.36354435630677421531E-19; gn = gn*u+1.86958710162783235106E-22; gd = 1.00000000000000000000E0; gd = gd*u+1.47495759925128324529E0; gd = gd*u+3.37748989120019970451E-1; gd = gd*u+2.53603741420338795122E-2; gd = gd*u+8.14679107184306179049E-4; gd = gd*u+1.27545075667729118702E-5; gd = gd*u+1.04314589657571990585E-7; gd = gd*u+4.60680728146520428211E-10; gd = gd*u+1.10273215066240270757E-12; gd = gd*u+1.38796531259578871258E-15; gd = gd*u+8.39158816283118707363E-19; gd = gd*u+1.86958710162783236342E-22; f = 1-u*fn/fd; g = t*gn/gd; t = mpio2*x2; cc = ae_cos(t, _state); ss = ae_sin(t, _state); t = mpi*x; *c = 0.5+(f*ss-g*cc)/t; *s = 0.5-(f*cc+g*ss)/t; *c = *c*ae_sign(xxa, _state); *s = *s*ae_sign(xxa, _state); } /************************************************************************* Jacobian Elliptic Functions Evaluates the Jacobian elliptic functions sn(u|m), cn(u|m), and dn(u|m) of parameter m between 0 and 1, and real argument u. These functions are periodic, with quarter-period on the real axis equal to the complete elliptic integral ellpk(1.0-m). Relation to incomplete elliptic integral: If u = ellik(phi,m), then sn(u|m) = sin(phi), and cn(u|m) = cos(phi). Phi is called the amplitude of u. Computation is by means of the arithmetic-geometric mean algorithm, except when m is within 1e-9 of 0 or 1. In the latter case with m close to 1, the approximation applies only for phi < pi/2. ACCURACY: Tested at random points with u between 0 and 10, m between 0 and 1. Absolute error (* = relative error): arithmetic function # trials peak rms IEEE phi 10000 9.2e-16* 1.4e-16* IEEE sn 50000 4.1e-15 4.6e-16 IEEE cn 40000 3.6e-15 4.4e-16 IEEE dn 10000 1.3e-12 1.8e-14 Peak error observed in consistency check using addition theorem for sn(u+v) was 4e-16 (absolute). Also tested by the above relation to the incomplete elliptic integral. Accuracy deteriorates when u is large. Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 2000 by Stephen L. 
Moshier *************************************************************************/ void jacobianellipticfunctions(double u, double m, double* sn, double* cn, double* dn, double* ph, ae_state *_state) { ae_frame _frame_block; double ai; double b; double phi; double t; double twon; ae_vector a; ae_vector c; ae_int_t i; ae_frame_make(_state, &_frame_block); *sn = 0; *cn = 0; *dn = 0; *ph = 0; ae_vector_init(&a, 0, DT_REAL, _state); ae_vector_init(&c, 0, DT_REAL, _state); ae_assert(ae_fp_greater_eq(m,(double)(0))&&ae_fp_less_eq(m,(double)(1)), "Domain error in JacobianEllipticFunctions: m<0 or m>1", _state); ae_vector_set_length(&a, 8+1, _state); ae_vector_set_length(&c, 8+1, _state); if( ae_fp_less(m,1.0e-9) ) { t = ae_sin(u, _state); b = ae_cos(u, _state); ai = 0.25*m*(u-t*b); *sn = t-ai*b; *cn = b+ai*t; *ph = u-ai; *dn = 1.0-0.5*m*t*t; ae_frame_leave(_state); return; } if( ae_fp_greater_eq(m,0.9999999999) ) { ai = 0.25*(1.0-m); b = ae_cosh(u, _state); t = ae_tanh(u, _state); phi = 1.0/b; twon = b*ae_sinh(u, _state); *sn = t+ai*(twon-u)/(b*b); *ph = 2.0*ae_atan(ae_exp(u, _state), _state)-1.57079632679489661923+ai*(twon-u)/b; ai = ai*t*phi; *cn = phi-ai*(twon-u); *dn = phi+ai*(twon+u); ae_frame_leave(_state); return; } a.ptr.p_double[0] = 1.0; b = ae_sqrt(1.0-m, _state); c.ptr.p_double[0] = ae_sqrt(m, _state); twon = 1.0; i = 0; while(ae_fp_greater(ae_fabs(c.ptr.p_double[i]/a.ptr.p_double[i], _state),ae_machineepsilon)) { if( i>7 ) { ae_assert(ae_false, "Overflow in JacobianEllipticFunctions", _state); break; } ai = a.ptr.p_double[i]; i = i+1; c.ptr.p_double[i] = 0.5*(ai-b); t = ae_sqrt(ai*b, _state); a.ptr.p_double[i] = 0.5*(ai+b); b = t; twon = twon*2.0; } phi = twon*a.ptr.p_double[i]*u; do { t = c.ptr.p_double[i]*ae_sin(phi, _state)/a.ptr.p_double[i]; b = phi; phi = (ae_asin(t, _state)+phi)/2.0; i = i-1; } while(i!=0); *sn = ae_sin(phi, _state); t = ae_cos(phi, _state); *cn = t; *dn = t/ae_cos(phi-b, _state); *ph = phi; ae_frame_leave(_state); } /************************************************************************* Psi (digamma) function d - psi(x) = -- ln | (x) dx is the logarithmic derivative of the gamma function. For integer x, n-1 - psi(n) = -EUL + > 1/k. - k=1 This formula is used for 0 < n <= 10. If x is negative, it is transformed to a positive argument by the reflection formula psi(1-x) = psi(x) + pi cot(pi x). For general positive x, the argument is made greater than 10 using the recurrence psi(x+1) = psi(x) + 1/x. Then the following asymptotic expansion is applied: inf. B - 2k psi(x) = log(x) - 1/2x - > ------- - 2k k=1 2k x where the B2k are Bernoulli numbers. ACCURACY: Relative error (except absolute when |psi| < 1): arithmetic domain # trials peak rms IEEE 0,30 30000 1.3e-15 1.4e-16 IEEE -30,0 40000 1.5e-15 2.2e-16 Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 1992, 2000 by Stephen L. 
Moshier *************************************************************************/ double psi(double x, ae_state *_state) { double p; double q; double nz; double s; double w; double y; double z; double polv; ae_int_t i; ae_int_t n; ae_int_t negative; double result; negative = 0; nz = 0.0; if( ae_fp_less_eq(x,(double)(0)) ) { negative = 1; q = x; p = (double)(ae_ifloor(q, _state)); if( ae_fp_eq(p,q) ) { ae_assert(ae_false, "Singularity in Psi(x)", _state); result = ae_maxrealnumber; return result; } nz = q-p; if( ae_fp_neq(nz,0.5) ) { if( ae_fp_greater(nz,0.5) ) { p = p+1.0; nz = q-p; } nz = ae_pi/ae_tan(ae_pi*nz, _state); } else { nz = 0.0; } x = 1.0-x; } if( ae_fp_less_eq(x,10.0)&&ae_fp_eq(x,(double)(ae_ifloor(x, _state))) ) { y = 0.0; n = ae_ifloor(x, _state); for(i=1; i<=n-1; i++) { w = (double)(i); y = y+1.0/w; } y = y-0.57721566490153286061; } else { s = x; w = 0.0; while(ae_fp_less(s,10.0)) { w = w+1.0/s; s = s+1.0; } if( ae_fp_less(s,1.0E17) ) { z = 1.0/(s*s); polv = 8.33333333333333333333E-2; polv = polv*z-2.10927960927960927961E-2; polv = polv*z+7.57575757575757575758E-3; polv = polv*z-4.16666666666666666667E-3; polv = polv*z+3.96825396825396825397E-3; polv = polv*z-8.33333333333333333333E-3; polv = polv*z+8.33333333333333333333E-2; y = z*polv; } else { y = 0.0; } y = ae_log(s, _state)-0.5/s-y-w; } if( negative!=0 ) { y = y-nz; } result = y; return result; } /************************************************************************* Exponential integral Ei(x) x - t | | e Ei(x) = -|- --- dt . | | t - -inf Not defined for x <= 0. See also expn.c. ACCURACY: Relative error: arithmetic domain # trials peak rms IEEE 0,100 50000 8.6e-16 1.3e-16 Cephes Math Library Release 2.8: May, 1999 Copyright 1999 by Stephen L. Moshier *************************************************************************/ double exponentialintegralei(double x, ae_state *_state) { double eul; double f; double f1; double f2; double w; double result; eul = 0.5772156649015328606065; if( ae_fp_less_eq(x,(double)(0)) ) { result = (double)(0); return result; } if( ae_fp_less(x,(double)(2)) ) { f1 = -5.350447357812542947283; f1 = f1*x+218.5049168816613393830; f1 = f1*x-4176.572384826693777058; f1 = f1*x+55411.76756393557601232; f1 = f1*x-331338.1331178144034309; f1 = f1*x+1592627.163384945414220; f2 = 1.000000000000000000000; f2 = f2*x-52.50547959112862969197; f2 = f2*x+1259.616186786790571525; f2 = f2*x-17565.49581973534652631; f2 = f2*x+149306.2117002725991967; f2 = f2*x-729494.9239640527645655; f2 = f2*x+1592627.163384945429726; f = f1/f2; result = eul+ae_log(x, _state)+x*f; return result; } if( ae_fp_less(x,(double)(4)) ) { w = 1/x; f1 = 1.981808503259689673238E-2; f1 = f1*w-1.271645625984917501326; f1 = f1*w-2.088160335681228318920; f1 = f1*w+2.755544509187936721172; f1 = f1*w-4.409507048701600257171E-1; f1 = f1*w+4.665623805935891391017E-2; f1 = f1*w-1.545042679673485262580E-3; f1 = f1*w+7.059980605299617478514E-5; f2 = 1.000000000000000000000; f2 = f2*w+1.476498670914921440652; f2 = f2*w+5.629177174822436244827E-1; f2 = f2*w+1.699017897879307263248E-1; f2 = f2*w+2.291647179034212017463E-2; f2 = f2*w+4.450150439728752875043E-3; f2 = f2*w+1.727439612206521482874E-4; f2 = f2*w+3.953167195549672482304E-5; f = f1/f2; result = ae_exp(x, _state)*w*(1+w*f); return result; } if( ae_fp_less(x,(double)(8)) ) { w = 1/x; f1 = -1.373215375871208729803; f1 = f1*w-7.084559133740838761406E-1; f1 = f1*w+1.580806855547941010501; f1 = f1*w-2.601500427425622944234E-1; f1 = f1*w+2.994674694113713763365E-2; f1 = 
f1*w-1.038086040188744005513E-3; f1 = f1*w+4.371064420753005429514E-5; f1 = f1*w+2.141783679522602903795E-6; f2 = 1.000000000000000000000; f2 = f2*w+8.585231423622028380768E-1; f2 = f2*w+4.483285822873995129957E-1; f2 = f2*w+7.687932158124475434091E-2; f2 = f2*w+2.449868241021887685904E-2; f2 = f2*w+8.832165941927796567926E-4; f2 = f2*w+4.590952299511353531215E-4; f2 = f2*w+(-4.729848351866523044863E-6); f2 = f2*w+2.665195537390710170105E-6; f = f1/f2; result = ae_exp(x, _state)*w*(1+w*f); return result; } if( ae_fp_less(x,(double)(16)) ) { w = 1/x; f1 = -2.106934601691916512584; f1 = f1*w+1.732733869664688041885; f1 = f1*w-2.423619178935841904839E-1; f1 = f1*w+2.322724180937565842585E-2; f1 = f1*w+2.372880440493179832059E-4; f1 = f1*w-8.343219561192552752335E-5; f1 = f1*w+1.363408795605250394881E-5; f1 = f1*w-3.655412321999253963714E-7; f1 = f1*w+1.464941733975961318456E-8; f1 = f1*w+6.176407863710360207074E-10; f2 = 1.000000000000000000000; f2 = f2*w-2.298062239901678075778E-1; f2 = f2*w+1.105077041474037862347E-1; f2 = f2*w-1.566542966630792353556E-2; f2 = f2*w+2.761106850817352773874E-3; f2 = f2*w-2.089148012284048449115E-4; f2 = f2*w+1.708528938807675304186E-5; f2 = f2*w-4.459311796356686423199E-7; f2 = f2*w+1.394634930353847498145E-8; f2 = f2*w+6.150865933977338354138E-10; f = f1/f2; result = ae_exp(x, _state)*w*(1+w*f); return result; } if( ae_fp_less(x,(double)(32)) ) { w = 1/x; f1 = -2.458119367674020323359E-1; f1 = f1*w-1.483382253322077687183E-1; f1 = f1*w+7.248291795735551591813E-2; f1 = f1*w-1.348315687380940523823E-2; f1 = f1*w+1.342775069788636972294E-3; f1 = f1*w-7.942465637159712264564E-5; f1 = f1*w+2.644179518984235952241E-6; f1 = f1*w-4.239473659313765177195E-8; f2 = 1.000000000000000000000; f2 = f2*w-1.044225908443871106315E-1; f2 = f2*w-2.676453128101402655055E-1; f2 = f2*w+9.695000254621984627876E-2; f2 = f2*w-1.601745692712991078208E-2; f2 = f2*w+1.496414899205908021882E-3; f2 = f2*w-8.462452563778485013756E-5; f2 = f2*w+2.728938403476726394024E-6; f2 = f2*w-4.239462431819542051337E-8; f = f1/f2; result = ae_exp(x, _state)*w*(1+w*f); return result; } if( ae_fp_less(x,(double)(64)) ) { w = 1/x; f1 = 1.212561118105456670844E-1; f1 = f1*w-5.823133179043894485122E-1; f1 = f1*w+2.348887314557016779211E-1; f1 = f1*w-3.040034318113248237280E-2; f1 = f1*w+1.510082146865190661777E-3; f1 = f1*w-2.523137095499571377122E-5; f2 = 1.000000000000000000000; f2 = f2*w-1.002252150365854016662; f2 = f2*w+2.928709694872224144953E-1; f2 = f2*w-3.337004338674007801307E-2; f2 = f2*w+1.560544881127388842819E-3; f2 = f2*w-2.523137093603234562648E-5; f = f1/f2; result = ae_exp(x, _state)*w*(1+w*f); return result; } w = 1/x; f1 = -7.657847078286127362028E-1; f1 = f1*w+6.886192415566705051750E-1; f1 = f1*w-2.132598113545206124553E-1; f1 = f1*w+3.346107552384193813594E-2; f1 = f1*w-3.076541477344756050249E-3; f1 = f1*w+1.747119316454907477380E-4; f1 = f1*w-6.103711682274170530369E-6; f1 = f1*w+1.218032765428652199087E-7; f1 = f1*w-1.086076102793290233007E-9; f2 = 1.000000000000000000000; f2 = f2*w-1.888802868662308731041; f2 = f2*w+1.066691687211408896850; f2 = f2*w-2.751915982306380647738E-1; f2 = f2*w+3.930852688233823569726E-2; f2 = f2*w-3.414684558602365085394E-3; f2 = f2*w+1.866844370703555398195E-4; f2 = f2*w-6.345146083130515357861E-6; f2 = f2*w+1.239754287483206878024E-7; f2 = f2*w-1.086076102793126632978E-9; f = f1/f2; result = ae_exp(x, _state)*w*(1+w*f); return result; } /************************************************************************* Exponential integral En(x) Evaluates the 
exponential integral inf. - | | -xt | e E (x) = | ---- dt. n | n | | t - 1 Both n and x must be nonnegative. The routine employs either a power series, a continued fraction, or an asymptotic formula depending on the relative values of n and x. ACCURACY: Relative error: arithmetic domain # trials peak rms IEEE 0, 30 10000 1.7e-15 3.6e-16 Cephes Math Library Release 2.8: June, 2000 Copyright 1985, 2000 by Stephen L. Moshier *************************************************************************/ double exponentialintegralen(double x, ae_int_t n, ae_state *_state) { double r; double t; double yk; double xk; double pk; double pkm1; double pkm2; double qk; double qkm1; double qkm2; double psi; double z; ae_int_t i; ae_int_t k; double big; double eul; double result; eul = 0.57721566490153286060; big = 1.44115188075855872*ae_pow((double)(10), (double)(17), _state); if( ((n<0||ae_fp_less(x,(double)(0)))||ae_fp_greater(x,(double)(170)))||(ae_fp_eq(x,(double)(0))&&n<2) ) { result = (double)(-1); return result; } if( ae_fp_eq(x,(double)(0)) ) { result = (double)1/(double)(n-1); return result; } if( n==0 ) { result = ae_exp(-x, _state)/x; return result; } if( n>5000 ) { xk = x+n; yk = 1/(xk*xk); t = (double)(n); result = yk*t*(6*x*x-8*t*x+t*t); result = yk*(result+t*(t-2.0*x)); result = yk*(result+t); result = (result+1)*ae_exp(-x, _state)/xk; return result; } if( ae_fp_less_eq(x,(double)(1)) ) { psi = -eul-ae_log(x, _state); for(i=1; i<=n-1; i++) { psi = psi+(double)1/(double)i; } z = -x; xk = (double)(0); yk = (double)(1); pk = (double)(1-n); if( n==1 ) { result = 0.0; } else { result = 1.0/pk; } do { xk = xk+1; yk = yk*z/xk; pk = pk+1; if( ae_fp_neq(pk,(double)(0)) ) { result = result+yk/pk; } if( ae_fp_neq(result,(double)(0)) ) { t = ae_fabs(yk/result, _state); } else { t = (double)(1); } } while(ae_fp_greater_eq(t,ae_machineepsilon)); t = (double)(1); for(i=1; i<=n-1; i++) { t = t*z/i; } result = psi*t-result; return result; } else { k = 1; pkm2 = (double)(1); qkm2 = x; pkm1 = 1.0; qkm1 = x+n; result = pkm1/qkm1; do { k = k+1; if( k%2==1 ) { yk = (double)(1); xk = n+(double)(k-1)/(double)2; } else { yk = x; xk = (double)k/(double)2; } pk = pkm1*yk+pkm2*xk; qk = qkm1*yk+qkm2*xk; if( ae_fp_neq(qk,(double)(0)) ) { r = pk/qk; t = ae_fabs((result-r)/r, _state); result = r; } else { t = (double)(1); } pkm2 = pkm1; pkm1 = pk; qkm2 = qkm1; qkm1 = qk; if( ae_fp_greater(ae_fabs(pk, _state),big) ) { pkm2 = pkm2/big; pkm1 = pkm1/big; qkm2 = qkm2/big; qkm1 = qkm1/big; } } while(ae_fp_greater_eq(t,ae_machineepsilon)); result = result*ae_exp(-x, _state); } return result; } /************************************************************************* Calculation of the value of the Laguerre polynomial. Parameters: n - degree, n>=0 x - argument Result: the value of the Laguerre polynomial Ln at x *************************************************************************/ double laguerrecalculate(ae_int_t n, double x, ae_state *_state) { double a; double b; double i; double result; result = (double)(1); a = (double)(1); b = 1-x; if( n==1 ) { result = b; } i = (double)(2); while(ae_fp_less_eq(i,(double)(n))) { result = ((2*i-1-x)*b-(i-1)*a)/i; a = b; b = result; i = i+1; } return result; } /************************************************************************* Summation of Laguerre polynomials using Clenshaw's recurrence formula. This routine calculates c[0]*L0(x) + c[1]*L1(x) + ... 
+ c[N]*LN(x) Parameters: n - degree, n>=0 x - argument Result: the value of the Laguerre polynomial at x *************************************************************************/ double laguerresum(/* Real */ ae_vector* c, ae_int_t n, double x, ae_state *_state) { double b1; double b2; ae_int_t i; double result; b1 = (double)(0); b2 = (double)(0); result = (double)(0); for(i=n; i>=0; i--) { result = (2*i+1-x)*b1/(i+1)-(i+1)*b2/(i+2)+c->ptr.p_double[i]; b2 = b1; b1 = result; } return result; } /************************************************************************* Representation of Ln as C[0] + C[1]*X + ... + C[N]*X^N Input parameters: N - polynomial degree, n>=0 Output parameters: C - coefficients *************************************************************************/ void laguerrecoefficients(ae_int_t n, /* Real */ ae_vector* c, ae_state *_state) { ae_int_t i; ae_vector_clear(c); ae_vector_set_length(c, n+1, _state); c->ptr.p_double[0] = (double)(1); for(i=0; i<=n-1; i++) { c->ptr.p_double[i+1] = -c->ptr.p_double[i]*(n-i)/(i+1)/(i+1); } } /************************************************************************* Chi-square distribution Returns the area under the left hand tail (from 0 to x) of the Chi square probability density function with v degrees of freedom. x - 1 | | v/2-1 -t/2 P( x | v ) = ----------- | t e dt v/2 - | | 2 | (v/2) - 0 where x is the Chi-square variable. The incomplete gamma integral is used, according to the formula y = chdtr( v, x ) = igam( v/2.0, x/2.0 ). The arguments must both be positive. ACCURACY: See incomplete gamma function Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 2000 by Stephen L. Moshier *************************************************************************/ double chisquaredistribution(double v, double x, ae_state *_state) { double result; ae_assert(ae_fp_greater_eq(x,(double)(0))&&ae_fp_greater_eq(v,(double)(1)), "Domain error in ChiSquareDistribution", _state); result = incompletegamma(v/2.0, x/2.0, _state); return result; } /************************************************************************* Complemented Chi-square distribution Returns the area under the right hand tail (from x to infinity) of the Chi square probability density function with v degrees of freedom: inf. - 1 | | v/2-1 -t/2 P( x | v ) = ----------- | t e dt v/2 - | | 2 | (v/2) - x where x is the Chi-square variable. The incomplete gamma integral is used, according to the formula y = chdtr( v, x ) = igamc( v/2.0, x/2.0 ). The arguments must both be positive. ACCURACY: See incomplete gamma function Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 2000 by Stephen L. Moshier *************************************************************************/ double chisquarecdistribution(double v, double x, ae_state *_state) { double result; ae_assert(ae_fp_greater_eq(x,(double)(0))&&ae_fp_greater_eq(v,(double)(1)), "Domain error in ChiSquareDistributionC", _state); result = incompletegammac(v/2.0, x/2.0, _state); return result; } /************************************************************************* Inverse of complemented Chi-square distribution Finds the Chi-square argument x such that the integral from x to infinity of the Chi-square density is equal to the given cumulative probability y. This is accomplished using the inverse gamma integral function and the relation x/2 = igami( df/2, y ); ACCURACY: See inverse incomplete gamma function Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 2000 by Stephen L. 
Moshier *************************************************************************/ double invchisquaredistribution(double v, double y, ae_state *_state) { double result; ae_assert((ae_fp_greater_eq(y,(double)(0))&&ae_fp_less_eq(y,(double)(1)))&&ae_fp_greater_eq(v,(double)(1)), "Domain error in InvChiSquareDistribution", _state); result = 2*invincompletegammac(0.5*v, y, _state); return result; } /************************************************************************* Calculation of the value of the Legendre polynomial Pn. Parameters: n - degree, n>=0 x - argument Result: the value of the Legendre polynomial Pn at x *************************************************************************/ double legendrecalculate(ae_int_t n, double x, ae_state *_state) { double a; double b; ae_int_t i; double result; result = (double)(1); a = (double)(1); b = x; if( n==0 ) { result = a; return result; } if( n==1 ) { result = b; return result; } for(i=2; i<=n; i++) { result = ((2*i-1)*x*b-(i-1)*a)/i; a = b; b = result; } return result; } /************************************************************************* Summation of Legendre polynomials using Clenshaw's recurrence formula. This routine calculates c[0]*P0(x) + c[1]*P1(x) + ... + c[N]*PN(x) Parameters: n - degree, n>=0 x - argument Result: the value of the Legendre polynomial at x *************************************************************************/ double legendresum(/* Real */ ae_vector* c, ae_int_t n, double x, ae_state *_state) { double b1; double b2; ae_int_t i; double result; b1 = (double)(0); b2 = (double)(0); result = (double)(0); for(i=n; i>=0; i--) { result = (2*i+1)*x*b1/(i+1)-(i+1)*b2/(i+2)+c->ptr.p_double[i]; b2 = b1; b1 = result; } return result; } /************************************************************************* Representation of Pn as C[0] + C[1]*X + ... + C[N]*X^N Input parameters: N - polynomial degree, n>=0 Output parameters: C - coefficients *************************************************************************/ void legendrecoefficients(ae_int_t n, /* Real */ ae_vector* c, ae_state *_state) { ae_int_t i; ae_vector_clear(c); ae_vector_set_length(c, n+1, _state); for(i=0; i<=n; i++) { c->ptr.p_double[i] = (double)(0); } c->ptr.p_double[n] = (double)(1); for(i=1; i<=n; i++) { c->ptr.p_double[n] = c->ptr.p_double[n]*(n+i)/2/i; } for(i=0; i<=n/2-1; i++) { c->ptr.p_double[n-2*(i+1)] = -c->ptr.p_double[n-2*i]*(n-2*i)*(n-2*i-1)/2/(i+1)/(2*(n-i)-1); } } /************************************************************************* Beta function - - | (a) | (b) beta( a, b ) = -----------. - | (a+b) For large arguments the logarithm of the function is evaluated using lgam(), then exponentiated. ACCURACY: Relative error: arithmetic domain # trials peak rms IEEE 0,30 30000 8.1e-14 1.1e-14 Cephes Math Library Release 2.0: April, 1987 Copyright 1984, 1987 by Stephen L. 
Moshier *************************************************************************/ double beta(double a, double b, ae_state *_state) { double y; double sg; double s; double result; sg = (double)(1); ae_assert(ae_fp_greater(a,(double)(0))||ae_fp_neq(a,(double)(ae_ifloor(a, _state))), "Overflow in Beta", _state); ae_assert(ae_fp_greater(b,(double)(0))||ae_fp_neq(b,(double)(ae_ifloor(b, _state))), "Overflow in Beta", _state); y = a+b; if( ae_fp_greater(ae_fabs(y, _state),171.624376956302725) ) { y = lngamma(y, &s, _state); sg = sg*s; y = lngamma(b, &s, _state)-y; sg = sg*s; y = lngamma(a, &s, _state)+y; sg = sg*s; ae_assert(ae_fp_less_eq(y,ae_log(ae_maxrealnumber, _state)), "Overflow in Beta", _state); result = sg*ae_exp(y, _state); return result; } y = gammafunction(y, _state); ae_assert(ae_fp_neq(y,(double)(0)), "Overflow in Beta", _state); if( ae_fp_greater(a,b) ) { y = gammafunction(a, _state)/y; y = y*gammafunction(b, _state); } else { y = gammafunction(b, _state)/y; y = y*gammafunction(a, _state); } result = y; return result; } /************************************************************************* Calculation of the value of the Chebyshev polynomials of the first and second kinds. Parameters: r - polynomial kind, either 1 or 2. n - degree, n>=0 x - argument, -1 <= x <= 1 Result: the value of the Chebyshev polynomial at x *************************************************************************/ double chebyshevcalculate(ae_int_t r, ae_int_t n, double x, ae_state *_state) { ae_int_t i; double a; double b; double result; result = (double)(0); /* * Prepare A and B */ if( r==1 ) { a = (double)(1); b = x; } else { a = (double)(1); b = 2*x; } /* * Special cases: N=0 or N=1 */ if( n==0 ) { result = a; return result; } if( n==1 ) { result = b; return result; } /* * General case: N>=2 */ for(i=2; i<=n; i++) { result = 2*x*b-a; a = b; b = result; } return result; } /************************************************************************* Summation of Chebyshev polynomials using Clenshaw's recurrence formula. This routine calculates c[0]*T0(x) + c[1]*T1(x) + ... + c[N]*TN(x) or c[0]*U0(x) + c[1]*U1(x) + ... + c[N]*UN(x) depending on the R. Parameters: r - polynomial kind, either 1 or 2. n - degree, n>=0 x - argument Result: the value of the Chebyshev polynomial at x *************************************************************************/ double chebyshevsum(/* Real */ ae_vector* c, ae_int_t r, ae_int_t n, double x, ae_state *_state) { double b1; double b2; ae_int_t i; double result; b1 = (double)(0); b2 = (double)(0); for(i=n; i>=1; i--) { result = 2*x*b1-b2+c->ptr.p_double[i]; b2 = b1; b1 = result; } if( r==1 ) { result = -b2+x*b1+c->ptr.p_double[0]; } else { result = -b2+2*x*b1+c->ptr.p_double[0]; } return result; } /************************************************************************* Representation of Tn as C[0] + C[1]*X + ... 
+ C[N]*X^N Input parameters: N - polynomial degree, n>=0 Output parameters: C - coefficients *************************************************************************/ void chebyshevcoefficients(ae_int_t n, /* Real */ ae_vector* c, ae_state *_state) { ae_int_t i; ae_vector_clear(c); ae_vector_set_length(c, n+1, _state); for(i=0; i<=n; i++) { c->ptr.p_double[i] = (double)(0); } if( n==0||n==1 ) { c->ptr.p_double[n] = (double)(1); } else { c->ptr.p_double[n] = ae_exp((n-1)*ae_log((double)(2), _state), _state); for(i=0; i<=n/2-1; i++) { c->ptr.p_double[n-2*(i+1)] = -c->ptr.p_double[n-2*i]*(n-2*i)*(n-2*i-1)/4/(i+1)/(n-i-1); } } } /************************************************************************* Conversion of a series of Chebyshev polynomials to a power series. Represents A[0]*T0(x) + A[1]*T1(x) + ... + A[N]*Tn(x) as B[0] + B[1]*X + ... + B[N]*X^N. Input parameters: A - Chebyshev series coefficients N - degree, N>=0 Output parameters B - power series coefficients *************************************************************************/ void fromchebyshev(/* Real */ ae_vector* a, ae_int_t n, /* Real */ ae_vector* b, ae_state *_state) { ae_int_t i; ae_int_t k; double e; double d; ae_vector_clear(b); ae_vector_set_length(b, n+1, _state); for(i=0; i<=n; i++) { b->ptr.p_double[i] = (double)(0); } d = (double)(0); i = 0; do { k = i; do { e = b->ptr.p_double[k]; b->ptr.p_double[k] = (double)(0); if( i<=1&&k==i ) { b->ptr.p_double[k] = (double)(1); } else { if( i!=0 ) { b->ptr.p_double[k] = 2*d; } if( k>i+1 ) { b->ptr.p_double[k] = b->ptr.p_double[k]-b->ptr.p_double[k-2]; } } d = e; k = k+1; } while(k<=n); d = b->ptr.p_double[i]; e = (double)(0); k = i; while(k<=n) { e = e+b->ptr.p_double[k]*a->ptr.p_double[k]; k = k+2; } b->ptr.p_double[i] = e; i = i+1; } while(i<=n); } /************************************************************************* Student's t distribution Computes the integral from minus infinity to t of the Student t distribution with integer k > 0 degrees of freedom: t - | | - | 2 -(k+1)/2 | ( (k+1)/2 ) | ( x ) ---------------------- | ( 1 + --- ) dx - | ( k ) sqrt( k pi ) | ( k/2 ) | | | - -inf. Relation to incomplete beta integral: 1 - stdtr(k,t) = 0.5 * incbet( k/2, 1/2, z ) where z = k/(k + t**2). For t < -2, this is the method of computation. For higher t, a direct method is derived from integration by parts. Since the function is symmetric about t=0, the area under the right tail of the density is found by calling the function with -t instead of t. ACCURACY: Tested at random 1 <= k <= 25. The "domain" refers to t. Relative error: arithmetic domain # trials peak rms IEEE -100,-2 50000 5.9e-15 1.4e-15 IEEE -2,100 500000 2.7e-15 4.9e-17 Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 1995, 2000 by Stephen L. 
Moshier *************************************************************************/ double studenttdistribution(ae_int_t k, double t, ae_state *_state) { double x; double rk; double z; double f; double tz; double p; double xsqk; ae_int_t j; double result; ae_assert(k>0, "Domain error in StudentTDistribution", _state); if( ae_fp_eq(t,(double)(0)) ) { result = 0.5; return result; } if( ae_fp_less(t,-2.0) ) { rk = (double)(k); z = rk/(rk+t*t); result = 0.5*incompletebeta(0.5*rk, 0.5, z, _state); return result; } if( ae_fp_less(t,(double)(0)) ) { x = -t; } else { x = t; } rk = (double)(k); z = 1.0+x*x/rk; if( k%2!=0 ) { xsqk = x/ae_sqrt(rk, _state); p = ae_atan(xsqk, _state); if( k>1 ) { f = 1.0; tz = 1.0; j = 3; while(j<=k-2&&ae_fp_greater(tz/f,ae_machineepsilon)) { tz = tz*((j-1)/(z*j)); f = f+tz; j = j+2; } p = p+f*xsqk/z; } p = p*2.0/ae_pi; } else { f = 1.0; tz = 1.0; j = 2; while(j<=k-2&&ae_fp_greater(tz/f,ae_machineepsilon)) { tz = tz*((j-1)/(z*j)); f = f+tz; j = j+2; } p = f*x/ae_sqrt(z*rk, _state); } if( ae_fp_less(t,(double)(0)) ) { p = -p; } result = 0.5+0.5*p; return result; } /************************************************************************* Functional inverse of Student's t distribution Given probability p, finds the argument t such that stdtr(k,t) is equal to p. ACCURACY: Tested at random 1 <= k <= 100. The "domain" refers to p: Relative error: arithmetic domain # trials peak rms IEEE .001,.999 25000 5.7e-15 8.0e-16 IEEE 10^-6,.001 25000 2.0e-12 2.9e-14 Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 1995, 2000 by Stephen L. Moshier *************************************************************************/ double invstudenttdistribution(ae_int_t k, double p, ae_state *_state) { double t; double rk; double z; ae_int_t rflg; double result; ae_assert((k>0&&ae_fp_greater(p,(double)(0)))&&ae_fp_less(p,(double)(1)), "Domain error in InvStudentTDistribution", _state); rk = (double)(k); if( ae_fp_greater(p,0.25)&&ae_fp_less(p,0.75) ) { if( ae_fp_eq(p,0.5) ) { result = (double)(0); return result; } z = 1.0-2.0*p; z = invincompletebeta(0.5, 0.5*rk, ae_fabs(z, _state), _state); t = ae_sqrt(rk*z/(1.0-z), _state); if( ae_fp_less(p,0.5) ) { t = -t; } result = t; return result; } rflg = -1; if( ae_fp_greater_eq(p,0.5) ) { p = 1.0-p; rflg = 1; } z = invincompletebeta(0.5*rk, 0.5, 2.0*p, _state); if( ae_fp_less(ae_maxrealnumber*z,rk) ) { result = rflg*ae_maxrealnumber; return result; } t = ae_sqrt(rk/z-rk, _state); result = rflg*t; return result; } /************************************************************************* Binomial distribution Returns the sum of the terms 0 through k of the Binomial probability density: k -- ( n ) j n-j > ( ) p (1-p) -- ( j ) j=0 The terms are not summed directly; instead the incomplete beta integral is employed, according to the formula y = bdtr( k, n, p ) = incbet( n-k, k+1, 1-p ). The arguments must be positive, with p ranging from 0 to 1. ACCURACY: Tested at random points (a,b,p), with p between 0 and 1. a,b Relative error: arithmetic domain # trials peak rms For p between 0.001 and 1: IEEE 0,100 100000 4.3e-15 2.6e-16 Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 1995, 2000 by Stephen L. 
Moshier *************************************************************************/ double binomialdistribution(ae_int_t k, ae_int_t n, double p, ae_state *_state) { double dk; double dn; double result; ae_assert(ae_fp_greater_eq(p,(double)(0))&&ae_fp_less_eq(p,(double)(1)), "Domain error in BinomialDistribution", _state); ae_assert(k>=-1&&k<=n, "Domain error in BinomialDistribution", _state); if( k==-1 ) { result = (double)(0); return result; } if( k==n ) { result = (double)(1); return result; } dn = (double)(n-k); if( k==0 ) { dk = ae_pow(1.0-p, dn, _state); } else { dk = (double)(k+1); dk = incompletebeta(dn, dk, 1.0-p, _state); } result = dk; return result; } /************************************************************************* Complemented binomial distribution Returns the sum of the terms k+1 through n of the Binomial probability density: n -- ( n ) j n-j > ( ) p (1-p) -- ( j ) j=k+1 The terms are not summed directly; instead the incomplete beta integral is employed, according to the formula y = bdtrc( k, n, p ) = incbet( k+1, n-k, p ). The arguments must be positive, with p ranging from 0 to 1. ACCURACY: Tested at random points (a,b,p). a,b Relative error: arithmetic domain # trials peak rms For p between 0.001 and 1: IEEE 0,100 100000 6.7e-15 8.2e-16 For p between 0 and .001: IEEE 0,100 100000 1.5e-13 2.7e-15 Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 1995, 2000 by Stephen L. Moshier *************************************************************************/ double binomialcdistribution(ae_int_t k, ae_int_t n, double p, ae_state *_state) { double dk; double dn; double result; ae_assert(ae_fp_greater_eq(p,(double)(0))&&ae_fp_less_eq(p,(double)(1)), "Domain error in BinomialDistributionC", _state); ae_assert(k>=-1&&k<=n, "Domain error in BinomialDistributionC", _state); if( k==-1 ) { result = (double)(1); return result; } if( k==n ) { result = (double)(0); return result; } dn = (double)(n-k); if( k==0 ) { if( ae_fp_less(p,0.01) ) { dk = -nuexpm1(dn*nulog1p(-p, _state), _state); } else { dk = 1.0-ae_pow(1.0-p, dn, _state); } } else { dk = (double)(k+1); dk = incompletebeta(dk, dn, p, _state); } result = dk; return result; } /************************************************************************* Inverse binomial distribution Finds the event probability p such that the sum of the terms 0 through k of the Binomial probability density is equal to the given cumulative probability y. This is accomplished using the inverse beta integral function and the relation 1 - p = incbi( n-k, k+1, y ). ACCURACY: Tested at random points (a,b,p). a,b Relative error: arithmetic domain # trials peak rms For p between 0.001 and 1: IEEE 0,100 100000 2.3e-14 6.4e-16 IEEE 0,10000 100000 6.6e-12 1.2e-13 For p between 10^-6 and 0.001: IEEE 0,100 100000 2.0e-12 1.3e-14 IEEE 0,10000 100000 1.5e-12 3.2e-14 Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 1995, 2000 by Stephen L. Moshier *************************************************************************/ double invbinomialdistribution(ae_int_t k, ae_int_t n, double y, ae_state *_state) { double dk; double dn; double p; double result; ae_assert(k>=0&&k 1, except * denotes relative error criterion. For large negative x, the absolute error increases as x^1.5. For large positive x, the relative error increases as x^1.5. 
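A quick cross-check of all four outputs (a usage sketch assuming the C++ wrapper alglib::airy(x, ai, aip, bi, bip) from specialfunctions.h; demo_airy is an illustrative name). The Wronskian identity Ai(x)*Bi'(x) - Ai'(x)*Bi(x) = 1/pi holds for every x, so it exercises Ai, Ai', Bi and Bi' together:

    #include <cstdio>
    #include <cmath>
    #include "specialfunctions.h"

    void demo_airy()             // illustrative helper, not part of the library
    {
        double ai, aip, bi, bip;
        alglib::airy(-3.0, ai, aip, bi, bip);           // oscillatory region x < -2.09
        double w = ai*bip - aip*bi;                     // Wronskian of Ai and Bi
        std::printf("W=%.15f  1/pi=%.15f\n", w, 1.0/(4.0*std::atan(1.0)));
    }
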
Arithmetic domain function # trials peak rms IEEE -10, 0 Ai 10000 1.6e-15 2.7e-16 IEEE 0, 10 Ai 10000 2.3e-14* 1.8e-15* IEEE -10, 0 Ai' 10000 4.6e-15 7.6e-16 IEEE 0, 10 Ai' 10000 1.8e-14* 1.5e-15* IEEE -10, 10 Bi 30000 4.2e-15 5.3e-16 IEEE -10, 10 Bi' 30000 4.9e-15 7.3e-16 Cephes Math Library Release 2.8: June, 2000 Copyright 1984, 1987, 1989, 2000 by Stephen L. Moshier *************************************************************************/ void airy(double x, double* ai, double* aip, double* bi, double* bip, ae_state *_state) { double z; double zz; double t; double f; double g; double uf; double ug; double k; double zeta; double theta; ae_int_t domflg; double c1; double c2; double sqrt3; double sqpii; double afn; double afd; double agn; double agd; double apfn; double apfd; double apgn; double apgd; double an; double ad; double apn; double apd; double bn16; double bd16; double bppn; double bppd; *ai = 0; *aip = 0; *bi = 0; *bip = 0; sqpii = 5.64189583547756286948E-1; c1 = 0.35502805388781723926; c2 = 0.258819403792806798405; sqrt3 = 1.732050807568877293527; domflg = 0; if( ae_fp_greater(x,25.77) ) { *ai = (double)(0); *aip = (double)(0); *bi = ae_maxrealnumber; *bip = ae_maxrealnumber; return; } if( ae_fp_less(x,-2.09) ) { domflg = 15; t = ae_sqrt(-x, _state); zeta = -2.0*x*t/3.0; t = ae_sqrt(t, _state); k = sqpii/t; z = 1.0/zeta; zz = z*z; afn = -1.31696323418331795333E-1; afn = afn*zz-6.26456544431912369773E-1; afn = afn*zz-6.93158036036933542233E-1; afn = afn*zz-2.79779981545119124951E-1; afn = afn*zz-4.91900132609500318020E-2; afn = afn*zz-4.06265923594885404393E-3; afn = afn*zz-1.59276496239262096340E-4; afn = afn*zz-2.77649108155232920844E-6; afn = afn*zz-1.67787698489114633780E-8; afd = 1.00000000000000000000E0; afd = afd*zz+1.33560420706553243746E1; afd = afd*zz+3.26825032795224613948E1; afd = afd*zz+2.67367040941499554804E1; afd = afd*zz+9.18707402907259625840E0; afd = afd*zz+1.47529146771666414581E0; afd = afd*zz+1.15687173795188044134E-1; afd = afd*zz+4.40291641615211203805E-3; afd = afd*zz+7.54720348287414296618E-5; afd = afd*zz+4.51850092970580378464E-7; uf = 1.0+zz*afn/afd; agn = 1.97339932091685679179E-2; agn = agn*zz+3.91103029615688277255E-1; agn = agn*zz+1.06579897599595591108E0; agn = agn*zz+9.39169229816650230044E-1; agn = agn*zz+3.51465656105547619242E-1; agn = agn*zz+6.33888919628925490927E-2; agn = agn*zz+5.85804113048388458567E-3; agn = agn*zz+2.82851600836737019778E-4; agn = agn*zz+6.98793669997260967291E-6; agn = agn*zz+8.11789239554389293311E-8; agn = agn*zz+3.41551784765923618484E-10; agd = 1.00000000000000000000E0; agd = agd*zz+9.30892908077441974853E0; agd = agd*zz+1.98352928718312140417E1; agd = agd*zz+1.55646628932864612953E1; agd = agd*zz+5.47686069422975497931E0; agd = agd*zz+9.54293611618961883998E-1; agd = agd*zz+8.64580826352392193095E-2; agd = agd*zz+4.12656523824222607191E-3; agd = agd*zz+1.01259085116509135510E-4; agd = agd*zz+1.17166733214413521882E-6; agd = agd*zz+4.91834570062930015649E-9; ug = z*agn/agd; theta = zeta+0.25*ae_pi; f = ae_sin(theta, _state); g = ae_cos(theta, _state); *ai = k*(f*uf-g*ug); *bi = k*(g*uf+f*ug); apfn = 1.85365624022535566142E-1; apfn = apfn*zz+8.86712188052584095637E-1; apfn = apfn*zz+9.87391981747398547272E-1; apfn = apfn*zz+4.01241082318003734092E-1; apfn = apfn*zz+7.10304926289631174579E-2; apfn = apfn*zz+5.90618657995661810071E-3; apfn = apfn*zz+2.33051409401776799569E-4; apfn = apfn*zz+4.08718778289035454598E-6; apfn = apfn*zz+2.48379932900442457853E-8; apfd = 1.00000000000000000000E0; apfd = 
apfd*zz+1.47345854687502542552E1; apfd = apfd*zz+3.75423933435489594466E1; apfd = apfd*zz+3.14657751203046424330E1; apfd = apfd*zz+1.09969125207298778536E1; apfd = apfd*zz+1.78885054766999417817E0; apfd = apfd*zz+1.41733275753662636873E-1; apfd = apfd*zz+5.44066067017226003627E-3; apfd = apfd*zz+9.39421290654511171663E-5; apfd = apfd*zz+5.65978713036027009243E-7; uf = 1.0+zz*apfn/apfd; apgn = -3.55615429033082288335E-2; apgn = apgn*zz-6.37311518129435504426E-1; apgn = apgn*zz-1.70856738884312371053E0; apgn = apgn*zz-1.50221872117316635393E0; apgn = apgn*zz-5.63606665822102676611E-1; apgn = apgn*zz-1.02101031120216891789E-1; apgn = apgn*zz-9.48396695961445269093E-3; apgn = apgn*zz-4.60325307486780994357E-4; apgn = apgn*zz-1.14300836484517375919E-5; apgn = apgn*zz-1.33415518685547420648E-7; apgn = apgn*zz-5.63803833958893494476E-10; apgd = 1.00000000000000000000E0; apgd = apgd*zz+9.85865801696130355144E0; apgd = apgd*zz+2.16401867356585941885E1; apgd = apgd*zz+1.73130776389749389525E1; apgd = apgd*zz+6.17872175280828766327E0; apgd = apgd*zz+1.08848694396321495475E0; apgd = apgd*zz+9.95005543440888479402E-2; apgd = apgd*zz+4.78468199683886610842E-3; apgd = apgd*zz+1.18159633322838625562E-4; apgd = apgd*zz+1.37480673554219441465E-6; apgd = apgd*zz+5.79912514929147598821E-9; ug = z*apgn/apgd; k = sqpii*t; *aip = -k*(g*uf+f*ug); *bip = k*(f*uf-g*ug); return; } if( ae_fp_greater_eq(x,2.09) ) { domflg = 5; t = ae_sqrt(x, _state); zeta = 2.0*x*t/3.0; g = ae_exp(zeta, _state); t = ae_sqrt(t, _state); k = 2.0*t*g; z = 1.0/zeta; an = 3.46538101525629032477E-1; an = an*z+1.20075952739645805542E1; an = an*z+7.62796053615234516538E1; an = an*z+1.68089224934630576269E2; an = an*z+1.59756391350164413639E2; an = an*z+7.05360906840444183113E1; an = an*z+1.40264691163389668864E1; an = an*z+9.99999999999999995305E-1; ad = 5.67594532638770212846E-1; ad = ad*z+1.47562562584847203173E1; ad = ad*z+8.45138970141474626562E1; ad = ad*z+1.77318088145400459522E2; ad = ad*z+1.64234692871529701831E2; ad = ad*z+7.14778400825575695274E1; ad = ad*z+1.40959135607834029598E1; ad = ad*z+1.00000000000000000470E0; f = an/ad; *ai = sqpii*f/k; k = -0.5*sqpii*t/g; apn = 6.13759184814035759225E-1; apn = apn*z+1.47454670787755323881E1; apn = apn*z+8.20584123476060982430E1; apn = apn*z+1.71184781360976385540E2; apn = apn*z+1.59317847137141783523E2; apn = apn*z+6.99778599330103016170E1; apn = apn*z+1.39470856980481566958E1; apn = apn*z+1.00000000000000000550E0; apd = 3.34203677749736953049E-1; apd = apd*z+1.11810297306158156705E1; apd = apd*z+7.11727352147859965283E1; apd = apd*z+1.58778084372838313640E2; apd = apd*z+1.53206427475809220834E2; apd = apd*z+6.86752304592780337944E1; apd = apd*z+1.38498634758259442477E1; apd = apd*z+9.99999999999999994502E-1; f = apn/apd; *aip = f*k; if( ae_fp_greater(x,8.3203353) ) { bn16 = -2.53240795869364152689E-1; bn16 = bn16*z+5.75285167332467384228E-1; bn16 = bn16*z-3.29907036873225371650E-1; bn16 = bn16*z+6.44404068948199951727E-2; bn16 = bn16*z-3.82519546641336734394E-3; bd16 = 1.00000000000000000000E0; bd16 = bd16*z-7.15685095054035237902E0; bd16 = bd16*z+1.06039580715664694291E1; bd16 = bd16*z-5.23246636471251500874E0; bd16 = bd16*z+9.57395864378383833152E-1; bd16 = bd16*z-5.50828147163549611107E-2; f = z*bn16/bd16; k = sqpii*g; *bi = k*(1.0+f)/t; bppn = 4.65461162774651610328E-1; bppn = bppn*z-1.08992173800493920734E0; bppn = bppn*z+6.38800117371827987759E-1; bppn = bppn*z-1.26844349553102907034E-1; bppn = bppn*z+7.62487844342109852105E-3; bppd = 1.00000000000000000000E0; bppd = 
bppd*z-8.70622787633159124240E0; bppd = bppd*z+1.38993162704553213172E1; bppd = bppd*z-7.14116144616431159572E0; bppd = bppd*z+1.34008595960680518666E0; bppd = bppd*z-7.84273211323341930448E-2; f = z*bppn/bppd; *bip = k*t*(1.0+f); return; } } f = 1.0; g = x; t = 1.0; uf = 1.0; ug = x; k = 1.0; z = x*x*x; while(ae_fp_greater(t,ae_machineepsilon)) { uf = uf*z; k = k+1.0; uf = uf/k; ug = ug*z; k = k+1.0; ug = ug/k; uf = uf/k; f = f+uf; k = k+1.0; ug = ug/k; g = g+ug; t = ae_fabs(uf/f, _state); } uf = c1*f; ug = c2*g; if( domflg%2==0 ) { *ai = uf-ug; } if( domflg/2%2==0 ) { *bi = sqrt3*(uf+ug); } k = 4.0; uf = x*x/2.0; ug = z/3.0; f = uf; g = 1.0+ug; uf = uf/3.0; t = 1.0; while(ae_fp_greater(t,ae_machineepsilon)) { uf = uf*z; ug = ug/k; k = k+1.0; ug = ug*z; uf = uf/k; f = f+uf; k = k+1.0; ug = ug/k; uf = uf/k; g = g+ug; k = k+1.0; t = ae_fabs(ug/g, _state); } uf = c1*f; ug = c2*g; if( domflg/4%2==0 ) { *aip = uf-ug; } if( domflg/8%2==0 ) { *bip = sqrt3*(uf+ug); } } } cpp/tests/0000755000175000017500000000000013105126766012367 5ustar sergeysergeycpp/tests/test_i.cpp0000755000175000017500000171275613105126766014410 0ustar sergeysergey#include "stdafx.h" #include #include "alglibinternal.h" #include "alglibmisc.h" #include "diffequations.h" #include "linalg.h" #include "optimization.h" #include "solvers.h" #include "statistics.h" #include "dataanalysis.h" #include "specialfunctions.h" #include "integration.h" #include "fasttransforms.h" #include "interpolation.h" using namespace alglib; bool doc_test_bool(bool v, bool t) { return (v && t) || (!v && !t); } bool doc_test_int(ae_int_t v, ae_int_t t) { return v==t; } bool doc_test_real(double v, double t, double _threshold) { double s = _threshold>=0 ? 1.0 : fabs(t); double threshold = fabs(_threshold); return fabs(v-t)/s<=threshold; } bool doc_test_complex(alglib::complex v, alglib::complex t, double _threshold) { double s = _threshold>=0 ? 1.0 : alglib::abscomplex(t); double threshold = fabs(_threshold); return abscomplex(v-t)/s<=threshold; } bool doc_test_bool_vector(const boolean_1d_array &v, const boolean_1d_array &t) { ae_int_t i; if( v.length()!=t.length() ) return false; for(i=0; i=0 ? 1.0 : fabs(t(i)); double threshold = fabs(_threshold); if( fabs(v(i)-t(i))/s>threshold ) return false; } return true; } bool doc_test_real_matrix(const real_2d_array &v, const real_2d_array &t, double _threshold) { ae_int_t i, j; if( v.rows()!=t.rows() ) return false; if( v.cols()!=t.cols() ) return false; for(i=0; i=0 ? 1.0 : fabs(t(i,j)); double threshold = fabs(_threshold); if( fabs(v(i,j)-t(i,j))/s>threshold ) return false; } return true; } bool doc_test_complex_vector(const complex_1d_array &v, const complex_1d_array &t, double _threshold) { ae_int_t i; if( v.length()!=t.length() ) return false; for(i=0; i=0 ? 1.0 : alglib::abscomplex(t(i)); double threshold = fabs(_threshold); if( abscomplex(v(i)-t(i))/s>threshold ) return false; } return true; } bool doc_test_complex_matrix(const complex_2d_array &v, const complex_2d_array &t, double _threshold) { ae_int_t i, j; if( v.rows()!=t.rows() ) return false; if( v.cols()!=t.cols() ) return false; for(i=0; i=0 ? 
1.0 : alglib::abscomplex(t(i,j)); double threshold = fabs(_threshold); if( abscomplex(v(i,j)-t(i,j))/s>threshold ) return false; } return true; } template void spoil_vector_by_adding_element(T &x) { ae_int_t i; T y = x; x.setlength(y.length()+1); for(i=0; i void spoil_vector_by_deleting_element(T &x) { ae_int_t i; T y = x; x.setlength(y.length()-1); for(i=0; i void spoil_matrix_by_adding_row(T &x) { ae_int_t i, j; T y = x; x.setlength(y.rows()+1, y.cols()); for(i=0; i void spoil_matrix_by_deleting_row(T &x) { ae_int_t i, j; T y = x; x.setlength(y.rows()-1, y.cols()); for(i=0; i void spoil_matrix_by_adding_col(T &x) { ae_int_t i, j; T y = x; x.setlength(y.rows(), y.cols()+1); for(i=0; i void spoil_matrix_by_deleting_col(T &x) { ae_int_t i, j; T y = x; x.setlength(y.rows(), y.cols()-1); for(i=0; i void spoil_vector_by_nan(T &x) { if( x.length()!=0 ) x(randominteger(x.length())) = fp_nan; } template void spoil_vector_by_posinf(T &x) { if( x.length()!=0 ) x(randominteger(x.length())) = fp_posinf; } template void spoil_vector_by_neginf(T &x) { if( x.length()!=0 ) x(randominteger(x.length())) = fp_neginf; } template void spoil_matrix_by_nan(T &x) { if( x.rows()!=0 && x.cols()!=0 ) x(randominteger(x.rows()),randominteger(x.cols())) = fp_nan; } template void spoil_matrix_by_posinf(T &x) { if( x.rows()!=0 && x.cols()!=0 ) x(randominteger(x.rows()),randominteger(x.cols())) = fp_posinf; } template void spoil_matrix_by_neginf(T &x) { if( x.rows()!=0 && x.cols()!=0 ) x(randominteger(x.rows()),randominteger(x.cols())) = fp_neginf; } void function1_func(const real_1d_array &x, double &func, void *ptr) { // // this callback calculates f(x0,x1) = 100*(x0+3)^4 + (x1-3)^4 // func = 100*pow(x[0]+3,4) + pow(x[1]-3,4); } void function1_grad(const real_1d_array &x, double &func, real_1d_array &grad, void *ptr) { // // this callback calculates f(x0,x1) = 100*(x0+3)^4 + (x1-3)^4 // and its derivatives df/d0 and df/dx1 // func = 100*pow(x[0]+3,4) + pow(x[1]-3,4); grad[0] = 400*pow(x[0]+3,3); grad[1] = 4*pow(x[1]-3,3); } void function1_hess(const real_1d_array &x, double &func, real_1d_array &grad, real_2d_array &hess, void *ptr) { // // this callback calculates f(x0,x1) = 100*(x0+3)^4 + (x1-3)^4 // its derivatives df/d0 and df/dx1 // and its Hessian. 
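/*
   Callbacks in this func/grad format are what the ALGLIB optimizers consume.
   A minimal sketch of driving function1_grad (it assumes the mincg* wrappers
   from optimization.h, which this file already includes; demo_drive_function1
   and the stopping tolerance are illustrative):

       void demo_drive_function1()   // illustrative helper, not part of the tests
       {
           real_1d_array x = "[0,0]";                   // starting point
           mincgstate state;
           mincgreport rep;
           mincgcreate(x, state);
           mincgsetcond(state, 1.0E-10, 0.0, 0.0, 0);   // stop once gradient norm falls below ~1e-10
           mincgoptimize(state, function1_grad);        // gradient callback defined above
           mincgresults(state, x, rep);
           printf("%s\n", x.tostring(2).c_str());       // minimizer of 100*(x0+3)^4+(x1-3)^4 is near [-3,+3]
       }
*/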
// func = 100*pow(x[0]+3,4) + pow(x[1]-3,4); grad[0] = 400*pow(x[0]+3,3); grad[1] = 4*pow(x[1]-3,3); hess[0][0] = 1200*pow(x[0]+3,2); hess[0][1] = 0; hess[1][0] = 0; hess[1][1] = 12*pow(x[1]-3,2); } void function1_fvec(const real_1d_array &x, real_1d_array &fi, void *ptr) { // // this callback calculates // f0(x0,x1) = 100*(x0+3)^4, // f1(x0,x1) = (x1-3)^4 // fi[0] = 10*pow(x[0]+3,2); fi[1] = pow(x[1]-3,2); } void function1_jac(const real_1d_array &x, real_1d_array &fi, real_2d_array &jac, void *ptr) { // // this callback calculates // f0(x0,x1) = 100*(x0+3)^4, // f1(x0,x1) = (x1-3)^4 // and Jacobian matrix J = [dfi/dxj] // fi[0] = 10*pow(x[0]+3,2); fi[1] = pow(x[1]-3,2); jac[0][0] = 20*(x[0]+3); jac[0][1] = 0; jac[1][0] = 0; jac[1][1] = 2*(x[1]-3); } void function2_func(const real_1d_array &x, double &func, void *ptr) { // // this callback calculates f(x0,x1) = (x0^2+1)^2 + (x1-1)^2 // func = pow(x[0]*x[0]+1,2) + pow(x[1]-1,2); } void function2_grad(const real_1d_array &x, double &func, real_1d_array &grad, void *ptr) { // // this callback calculates f(x0,x1) = (x0^2+1)^2 + (x1-1)^2 // and its derivatives df/d0 and df/dx1 // func = pow(x[0]*x[0]+1,2) + pow(x[1]-1,2); grad[0] = 4*(x[0]*x[0]+1)*x[0]; grad[1] = 2*(x[1]-1); } void function2_hess(const real_1d_array &x, double &func, real_1d_array &grad, real_2d_array &hess, void *ptr) { // // this callback calculates f(x0,x1) = (x0^2+1)^2 + (x1-1)^2 // its gradient and Hessian // func = pow(x[0]*x[0]+1,2) + pow(x[1]-1,2); grad[0] = 4*(x[0]*x[0]+1)*x[0]; grad[1] = 2*(x[1]-1); hess[0][0] = 12*x[0]*x[0]+4; hess[0][1] = 0; hess[1][0] = 0; hess[1][1] = 2; } void function2_fvec(const real_1d_array &x, real_1d_array &fi, void *ptr) { // // this callback calculates // f0(x0,x1) = x0^2+1 // f1(x0,x1) = x1-1 // fi[0] = x[0]*x[0]+1; fi[1] = x[1]-1; } void function2_jac(const real_1d_array &x, real_1d_array &fi, real_2d_array &jac, void *ptr) { // // this callback calculates // f0(x0,x1) = x0^2+1 // f1(x0,x1) = x1-1 // and Jacobian matrix J = [dfi/dxj] // fi[0] = x[0]*x[0]+1; fi[1] = x[1]-1; jac[0][0] = 2*x[0]; jac[0][1] = 0; jac[1][0] = 0; jac[1][1] = 1; } void nlcfunc1_jac(const real_1d_array &x, real_1d_array &fi, real_2d_array &jac, void *ptr) { // // this callback calculates // // f0(x0,x1) = -x0+x1 // f1(x0,x1) = x0^2+x1^2-1 // // and Jacobian matrix J = [dfi/dxj] // fi[0] = -x[0]+x[1]; fi[1] = x[0]*x[0] + x[1]*x[1] - 1.0; jac[0][0] = -1.0; jac[0][1] = +1.0; jac[1][0] = 2*x[0]; jac[1][1] = 2*x[1]; } void nlcfunc2_jac(const real_1d_array &x, real_1d_array &fi, real_2d_array &jac, void *ptr) { // // this callback calculates // // f0(x0,x1,x2) = x0+x1 // f1(x0,x1,x2) = x2-exp(x0) // f2(x0,x1,x2) = x0^2+x1^2-1 // // and Jacobian matrix J = [dfi/dxj] // fi[0] = x[0]+x[1]; fi[1] = x[2]-exp(x[0]); fi[2] = x[0]*x[0] + x[1]*x[1] - 1.0; jac[0][0] = 1.0; jac[0][1] = 1.0; jac[0][2] = 0.0; jac[1][0] = -exp(x[0]); jac[1][1] = 0.0; jac[1][2] = 1.0; jac[2][0] = 2*x[0]; jac[2][1] = 2*x[1]; jac[2][2] = 0.0; } void nsfunc1_jac(const real_1d_array &x, real_1d_array &fi, real_2d_array &jac, void *ptr) { // // this callback calculates // // f0(x0,x1) = 2*|x0|+x1 // // and Jacobian matrix J = [df0/dx0 df0/dx1] // fi[0] = 2*fabs(double(x[0]))+fabs(double(x[1])); jac[0][0] = 2*alglib::sign(x[0]); jac[0][1] = alglib::sign(x[1]); } void nsfunc1_fvec(const real_1d_array &x, real_1d_array &fi, void *ptr) { // // this callback calculates // // f0(x0,x1) = 2*|x0|+x1 // fi[0] = 2*fabs(double(x[0]))+fabs(double(x[1])); } void nsfunc2_jac(const real_1d_array &x, real_1d_array &fi, 
real_2d_array &jac, void *ptr) { // // this callback calculates function vector // // f0(x0,x1) = 2*|x0|+x1 // f1(x0,x1) = x0-1 // f2(x0,x1) = -x1-1 // // and Jacobian matrix J // // [ df0/dx0 df0/dx1 ] // J = [ df1/dx0 df1/dx1 ] // [ df2/dx0 df2/dx1 ] // fi[0] = 2*fabs(double(x[0]))+fabs(double(x[1])); jac[0][0] = 2*alglib::sign(x[0]); jac[0][1] = alglib::sign(x[1]); fi[1] = x[0]-1; jac[1][0] = 1; jac[1][1] = 0; fi[2] = -x[1]-1; jac[2][0] = 0; jac[2][1] = -1; } void bad_func(const real_1d_array &x, double &func, void *ptr) { // // this callback calculates 'bad' function, // i.e. function with incorrectly calculated derivatives // func = 100*pow(x[0]+3,4) + pow(x[1]-3,4); } void bad_grad(const real_1d_array &x, double &func, real_1d_array &grad, void *ptr) { // // this callback calculates 'bad' function, // i.e. function with incorrectly calculated derivatives // func = 100*pow(x[0]+3,4) + pow(x[1]-3,4); grad[0] = 40*pow(x[0]+3,3); grad[1] = 40*pow(x[1]-3,3); } void bad_hess(const real_1d_array &x, double &func, real_1d_array &grad, real_2d_array &hess, void *ptr) { // // this callback calculates 'bad' function, // i.e. function with incorrectly calculated derivatives // func = 100*pow(x[0]+3,4) + pow(x[1]-3,4); grad[0] = 40*pow(x[0]+3,3); grad[1] = 40*pow(x[1]-3,3); hess[0][0] = 120*pow(x[0]+3,2); hess[0][1] = 0; hess[1][0] = 0; hess[1][1] = 120*pow(x[1]-3,2); } void bad_fvec(const real_1d_array &x, real_1d_array &fi, void *ptr) { // // this callback calculates 'bad' function, // i.e. function with incorrectly calculated derivatives // fi[0] = 10*pow(x[0]+3,2); fi[1] = pow(x[1]-3,2); } void bad_jac(const real_1d_array &x, real_1d_array &fi, real_2d_array &jac, void *ptr) { // // this callback calculates 'bad' function, // i.e. function with incorrectly calculated derivatives // fi[0] = 10*pow(x[0]+3,2); fi[1] = pow(x[1]-3,2); jac[0][0] = 2*(x[0]+3); jac[0][1] = 1; jac[1][0] = 0; jac[1][1] = 20*(x[1]-3); } void function_cx_1_func(const real_1d_array &c, const real_1d_array &x, double &func, void *ptr) { // this callback calculates f(c,x)=exp(-c0*sqr(x0)) // where x is a position on X-axis and c is adjustable parameter func = exp(-c[0]*pow(x[0],2)); } void function_cx_1_grad(const real_1d_array &c, const real_1d_array &x, double &func, real_1d_array &grad, void *ptr) { // this callback calculates f(c,x)=exp(-c0*sqr(x0)) and gradient G={df/dc[i]} // where x is a position on X-axis and c is adjustable parameter. // IMPORTANT: gradient is calculated with respect to C, not to X func = exp(-c[0]*pow(x[0],2)); grad[0] = -pow(x[0],2)*func; } void function_cx_1_hess(const real_1d_array &c, const real_1d_array &x, double &func, real_1d_array &grad, real_2d_array &hess, void *ptr) { // this callback calculates f(c,x)=exp(-c0*sqr(x0)), gradient G={df/dc[i]} and Hessian H={d2f/(dc[i]*dc[j])} // where x is a position on X-axis and c is adjustable parameter. 
// IMPORTANT: gradient/Hessian are calculated with respect to C, not to X func = exp(-c[0]*pow(x[0],2)); grad[0] = -pow(x[0],2)*func; hess[0][0] = pow(x[0],4)*func; } void ode_function_1_diff(const real_1d_array &y, double x, real_1d_array &dy, void *ptr) { // this callback calculates f(y[],x)=-y[0] dy[0] = -y[0]; } void int_function_1_func(double x, double xminusa, double bminusx, double &y, void *ptr) { // this callback calculates f(x)=exp(x) y = exp(x); } void function_debt_func(const real_1d_array &c, const real_1d_array &x, double &func, void *ptr) { // // this callback calculates f(c,x)=c[0]*(1+c[1]*(pow(x[0]-1999,c[2])-1)) // func = c[0]*(1+c[1]*(pow(x[0]-1999,c[2])-1)); } void s1_grad(const real_1d_array &x, double &func, real_1d_array &grad, void *ptr) { // // this callback calculates f(x) = (1+x)^(-0.2) + (1-x)^(-0.3) + 1000*x and its gradient. // // function is trimmed when we calculate it near the singular points or outside of the [-1,+1]. // Note that we do NOT calculate gradient in this case. // if( (x[0]<=-0.999999999999) || (x[0]>=+0.999999999999) ) { func = 1.0E+300; return; } func = pow(1+x[0],-0.2) + pow(1-x[0],-0.3) + 1000*x[0]; grad[0] = -0.2*pow(1+x[0],-1.2) +0.3*pow(1-x[0],-1.3) + 1000; } int main() { bool _TotalResult = true; bool _TestResult; int _spoil_scenario; printf("C++ tests. Please wait...\n"); #ifdef AE_USE_ALLOC_COUNTER printf("Allocation counter activated...\n"); alglib_impl::_use_alloc_counter = ae_true; if( alglib_impl::_alloc_counter!=0 ) { _TotalResult = false; printf("FAILURE: alloc_counter is non-zero on start!\n"); } #endif try { // // TEST nneighbor_d_1 // Nearest neighbor search, KNN queries // printf("0/145\n"); _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<3; _spoil_scenario++) { try { real_2d_array a = "[[0,0],[0,1],[1,0],[1,1]]"; if( _spoil_scenario==0 ) spoil_matrix_by_nan(a); if( _spoil_scenario==1 ) spoil_matrix_by_posinf(a); if( _spoil_scenario==2 ) spoil_matrix_by_neginf(a); ae_int_t nx = 2; ae_int_t ny = 0; ae_int_t normtype = 2; kdtree kdt; real_1d_array x; real_2d_array r = "[[]]"; ae_int_t k; kdtreebuild(a, nx, ny, normtype, kdt); x = "[-1,0]"; k = kdtreequeryknn(kdt, x, 1); _TestResult = _TestResult && doc_test_int(k, 1); kdtreequeryresultsx(kdt, r); _TestResult = _TestResult && doc_test_real_matrix(r, "[[0,0]]", 0.05); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && (_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "nneighbor_d_1"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST nneighbor_t_2 // Subsequent queries; buffered functions must use previously allocated storage (if large enough), so buffer may contain some info from previous call // _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<3; _spoil_scenario++) { try { real_2d_array a = "[[0,0],[0,1],[1,0],[1,1]]"; if( _spoil_scenario==0 ) spoil_matrix_by_nan(a); if( _spoil_scenario==1 ) spoil_matrix_by_posinf(a); if( _spoil_scenario==2 ) spoil_matrix_by_neginf(a); ae_int_t nx = 2; ae_int_t ny = 0; ae_int_t normtype = 2; kdtree kdt; real_1d_array x; real_2d_array rx = "[[]]"; ae_int_t k; kdtreebuild(a, nx, ny, normtype, kdt); x = "[+2,0]"; k = kdtreequeryknn(kdt, x, 2, true); _TestResult = _TestResult && doc_test_int(k, 2); kdtreequeryresultsx(kdt, rx); _TestResult = _TestResult && doc_test_real_matrix(rx, "[[1,0],[1,1]]", 0.05); x = "[-2,0]"; k = kdtreequeryknn(kdt, x, 1, true); _TestResult = _TestResult && doc_test_int(k, 1); kdtreequeryresultsx(kdt, rx); 
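//
// NOTE: only one neighbor was requested this time, so only the first row of rx
//       is rewritten by the call above. Because buffered query functions reuse
//       previously allocated storage, rx keeps its 2x2 size and its second row
//       still holds [1,1] from the previous 2-NN query - which is exactly what
//       the expected matrix in the check below reflects.
//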
_TestResult = _TestResult && doc_test_real_matrix(rx, "[[0,0],[1,1]]", 0.05); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && (_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "nneighbor_t_2"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST nneighbor_d_2 // Serialization of KD-trees // _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<3; _spoil_scenario++) { try { real_2d_array a = "[[0,0],[0,1],[1,0],[1,1]]"; if( _spoil_scenario==0 ) spoil_matrix_by_nan(a); if( _spoil_scenario==1 ) spoil_matrix_by_posinf(a); if( _spoil_scenario==2 ) spoil_matrix_by_neginf(a); ae_int_t nx = 2; ae_int_t ny = 0; ae_int_t normtype = 2; kdtree kdt0; kdtree kdt1; std::string s; real_1d_array x; real_2d_array r0 = "[[]]"; real_2d_array r1 = "[[]]"; // // Build tree and serialize it // kdtreebuild(a, nx, ny, normtype, kdt0); alglib::kdtreeserialize(kdt0, s); alglib::kdtreeunserialize(s, kdt1); // // Compare results from KNN queries // x = "[-1,0]"; kdtreequeryknn(kdt0, x, 1); kdtreequeryresultsx(kdt0, r0); kdtreequeryknn(kdt1, x, 1); kdtreequeryresultsx(kdt1, r1); _TestResult = _TestResult && doc_test_real_matrix(r0, "[[0,0]]", 0.05); _TestResult = _TestResult && doc_test_real_matrix(r1, "[[0,0]]", 0.05); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && (_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "nneighbor_d_2"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST odesolver_d1 // Solving y'=-y with ODE solver // _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<13; _spoil_scenario++) { try { real_1d_array y = "[1]"; if( _spoil_scenario==0 ) spoil_vector_by_nan(y); if( _spoil_scenario==1 ) spoil_vector_by_posinf(y); if( _spoil_scenario==2 ) spoil_vector_by_neginf(y); if( _spoil_scenario==3 ) spoil_vector_by_deleting_element(y); real_1d_array x = "[0, 1, 2, 3]"; if( _spoil_scenario==4 ) spoil_vector_by_nan(x); if( _spoil_scenario==5 ) spoil_vector_by_posinf(x); if( _spoil_scenario==6 ) spoil_vector_by_neginf(x); double eps = 0.00001; if( _spoil_scenario==7 ) eps = fp_nan; if( _spoil_scenario==8 ) eps = fp_posinf; if( _spoil_scenario==9 ) eps = fp_neginf; double h = 0; if( _spoil_scenario==10 ) h = fp_nan; if( _spoil_scenario==11 ) h = fp_posinf; if( _spoil_scenario==12 ) h = fp_neginf; odesolverstate s; ae_int_t m; real_1d_array xtbl; real_2d_array ytbl; odesolverreport rep; odesolverrkck(y, x, eps, h, s); alglib::odesolversolve(s, ode_function_1_diff); odesolverresults(s, m, xtbl, ytbl, rep); _TestResult = _TestResult && doc_test_int(m, 4); _TestResult = _TestResult && doc_test_real_vector(xtbl, "[0, 1, 2, 3]", 0.005); _TestResult = _TestResult && doc_test_real_matrix(ytbl, "[[1], [0.367], [0.135], [0.050]]", 0.005); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && (_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "odesolver_d1"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST sparse_d_1 // Basic operations with sparse matrices // _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<1; _spoil_scenario++) { try { // // This example demonstrates creation/initialization of the sparse matrix // and matrix-vector multiplication. // // First, we have to create matrix and initialize it. Matrix is initially created // in the Hash-Table format, which allows convenient initialization. 
We can modify // Hash-Table matrix with sparseset() and sparseadd() functions. // // NOTE: Unlike CRS format, Hash-Table representation allows you to initialize // elements in the arbitrary order. You may see that we initialize a[0][0] first, // then move to the second row, and then move back to the first row. // sparsematrix s; sparsecreate(2, 2, s); sparseset(s, 0, 0, 2.0); sparseset(s, 1, 1, 1.0); sparseset(s, 0, 1, 1.0); sparseadd(s, 1, 1, 4.0); // // Now S is equal to // [ 2 1 ] // [ 5 ] // Lets check it by reading matrix contents with sparseget(). // You may see that with sparseget() you may read both non-zero // and zero elements. // double v; v = sparseget(s, 0, 0); _TestResult = _TestResult && doc_test_real(v, 2.0000, 0.005); v = sparseget(s, 0, 1); _TestResult = _TestResult && doc_test_real(v, 1.0000, 0.005); v = sparseget(s, 1, 0); _TestResult = _TestResult && doc_test_real(v, 0.0000, 0.005); v = sparseget(s, 1, 1); _TestResult = _TestResult && doc_test_real(v, 5.0000, 0.005); // // After successful creation we can use our matrix for linear operations. // // However, there is one more thing we MUST do before using S in linear // operations: we have to convert it from HashTable representation (used for // initialization and dynamic operations) to CRS format with sparseconverttocrs() // call. If you omit this call, ALGLIB will generate exception on the first // attempt to use S in linear operations. // sparseconverttocrs(s); // // Now S is in the CRS format and we are ready to do linear operations. // Lets calculate A*x for some x. // real_1d_array x = "[1,-1]"; if( _spoil_scenario==0 ) spoil_vector_by_deleting_element(x); real_1d_array y = "[]"; sparsemv(s, x, y); _TestResult = _TestResult && doc_test_real_vector(y, "[1.000,-5.000]", 0.005); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && (_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "sparse_d_1"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST sparse_d_crs // Advanced topic: creation in the CRS format. // _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<2; _spoil_scenario++) { try { // // This example demonstrates creation/initialization of the sparse matrix in the // CRS format. // // Hash-Table format used by default is very convenient (it allows easy // insertion of elements, automatic memory reallocation), but has // significant memory and performance overhead. Insertion of one element // costs hundreds of CPU cycles, and memory consumption is several times // higher than that of CRS. // // When you work with really large matrices and when you can tell in // advance how many elements EXACTLY you need, it can be beneficial to // create matrix in the CRS format from the very beginning. // // If you want to create matrix in the CRS format, you should: // * use sparsecreatecrs() function // * know row sizes in advance (number of non-zero entries in the each row) // * initialize matrix with sparseset() - another function, sparseadd(), is not allowed // * initialize elements from left to right, from top to bottom, each // element is initialized only once. 
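//
// In the example below each of the four rows of the matrix has 2, 2, 2 and 1
// non-zero elements respectively, which is exactly what is passed to
// sparsecreatecrs() as row_sizes.
//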
// sparsematrix s; integer_1d_array row_sizes = "[2,2,2,1]"; if( _spoil_scenario==0 ) spoil_vector_by_deleting_element(row_sizes); sparsecreatecrs(4, 4, row_sizes, s); sparseset(s, 0, 0, 2.0); sparseset(s, 0, 1, 1.0); sparseset(s, 1, 1, 4.0); sparseset(s, 1, 2, 2.0); sparseset(s, 2, 2, 3.0); sparseset(s, 2, 3, 1.0); sparseset(s, 3, 3, 9.0); // // Now S is equal to // [ 2 1 ] // [ 4 2 ] // [ 3 1 ] // [ 9 ] // // We should point that we have initialized S elements from left to right, // from top to bottom. CRS representation does NOT allow you to do so in // the different order. Try to change order of the sparseset() calls above, // and you will see that your program generates exception. // // We can check it by reading matrix contents with sparseget(). // However, you should remember that sparseget() is inefficient on // CRS matrices (it may have to pass through all elements of the row // until it finds element you need). // double v; v = sparseget(s, 0, 0); _TestResult = _TestResult && doc_test_real(v, 2.0000, 0.005); v = sparseget(s, 2, 3); _TestResult = _TestResult && doc_test_real(v, 1.0000, 0.005); // you may see that you can read zero elements (which are not stored) with sparseget() v = sparseget(s, 3, 2); _TestResult = _TestResult && doc_test_real(v, 0.0000, 0.005); // // After successful creation we can use our matrix for linear operations. // Lets calculate A*x for some x. // real_1d_array x = "[1,-1,1,-1]"; if( _spoil_scenario==1 ) spoil_vector_by_deleting_element(x); real_1d_array y = "[]"; sparsemv(s, x, y); _TestResult = _TestResult && doc_test_real_vector(y, "[1.000,-2.000,2.000,-9]", 0.005); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && (_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "sparse_d_crs"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST ablas_d_gemm // Matrix multiplication (single-threaded) // _TestResult = true; try { real_2d_array a = "[[2,1],[1,3]]"; real_2d_array b = "[[2,1],[0,1]]"; real_2d_array c = "[[0,0],[0,0]]"; // // rmatrixgemm() function allows us to calculate matrix product C:=A*B or // to perform more general operation, C:=alpha*op1(A)*op2(B)+beta*C, // where A, B, C are rectangular matrices, op(X) can be X or X^T, // alpha and beta are scalars. // // This function: // * can apply transposition and/or multiplication by scalar to operands // * can use arbitrary part of matrices A/B (given by submatrix offset) // * can store result into arbitrary part of C // * for performance reasons requires C to be preallocated // // Parameters of this function are: // * M, N, K - sizes of op1(A) (which is MxK), op2(B) (which // is KxN) and C (which is MxN) // * Alpha - coefficient before A*B // * A, IA, JA - matrix A and offset of the submatrix // * OpTypeA - transformation type: // 0 - no transformation // 1 - transposition // * B, IB, JB - matrix B and offset of the submatrix // * OpTypeB - transformation type: // 0 - no transformation // 1 - transposition // * Beta - coefficient before C // * C, IC, JC - preallocated matrix C and offset of the submatrix // // Below we perform simple product C:=A*B (alpha=1, beta=0) // // IMPORTANT: this function works with preallocated C, which must be large // enough to store multiplication result. 
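//
// For reference, the two products verified below are:
//     A*B   = [[2,1],[1,3]] * [[2,1],[0,1]] = [[4,3],[2,4]]
//     A*B^T = [[2,1],[1,3]] * [[2,0],[1,1]] = [[5,1],[5,3]]
//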
// ae_int_t m = 2; ae_int_t n = 2; ae_int_t k = 2; double alpha = 1.0; ae_int_t ia = 0; ae_int_t ja = 0; ae_int_t optypea = 0; ae_int_t ib = 0; ae_int_t jb = 0; ae_int_t optypeb = 0; double beta = 0.0; ae_int_t ic = 0; ae_int_t jc = 0; rmatrixgemm(m, n, k, alpha, a, ia, ja, optypea, b, ib, jb, optypeb, beta, c, ic, jc); _TestResult = _TestResult && doc_test_real_matrix(c, "[[4,3],[2,4]]", 0.0001); // // Now we try to apply some simple transformation to operands: C:=A*B^T // optypeb = 1; rmatrixgemm(m, n, k, alpha, a, ia, ja, optypea, b, ib, jb, optypeb, beta, c, ic, jc); _TestResult = _TestResult && doc_test_real_matrix(c, "[[5,1],[5,3]]", 0.0001); } catch(ap_error) { _TestResult = false; } if( !_TestResult) { printf("%-32s FAILED\n", "ablas_d_gemm"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST ablas_d_syrk // Symmetric rank-K update (single-threaded) // _TestResult = true; try { // // rmatrixsyrk() function allows us to calculate symmetric rank-K update // C := beta*C + alpha*A'*A, where C is square N*N matrix, A is square K*N // matrix, alpha and beta are scalars. It is also possible to update by // adding A*A' instead of A'*A. // // Parameters of this function are: // * N, K - matrix size // * Alpha - coefficient before A // * A, IA, JA - matrix and submatrix offsets // * OpTypeA - multiplication type: // * 0 - A*A^T is calculated // * 2 - A^T*A is calculated // * Beta - coefficient before C // * C, IC, JC - preallocated input/output matrix and submatrix offsets // * IsUpper - whether upper or lower triangle of C is updated; // this function updates only one half of C, leaving // other half unchanged (not referenced at all). // // Below we will show how to calculate simple product C:=A'*A // // NOTE: beta=0 and we do not use previous value of C, but still it // MUST be preallocated. // ae_int_t n = 2; ae_int_t k = 1; double alpha = 1.0; ae_int_t ia = 0; ae_int_t ja = 0; ae_int_t optypea = 2; double beta = 0.0; ae_int_t ic = 0; ae_int_t jc = 0; bool isupper = true; real_2d_array a = "[[1,2]]"; // preallocate space to store result real_2d_array c = "[[0,0],[0,0]]"; // calculate product, store result into upper part of c rmatrixsyrk(n, k, alpha, a, ia, ja, optypea, beta, c, ic, jc, isupper); // output result. // IMPORTANT: lower triangle of C was NOT updated! _TestResult = _TestResult && doc_test_real_matrix(c, "[[1,2],[0,4]]", 0.0001); } catch(ap_error) { _TestResult = false; } if( !_TestResult) { printf("%-32s FAILED\n", "ablas_d_syrk"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST ablas_smp_gemm // Matrix multiplication (multithreaded) // _TestResult = true; try { // // In this example we assume that you already know how to work with // rmatrixgemm() function. Below we concentrate on its multithreading // capabilities. // // SMP edition of ALGLIB includes smp_rmatrixgemm() - multithreaded // version of rmatrixgemm() function. In the basic edition of ALGLIB // (GPL edition or commercial version without SMP support) this function // just calls single-threaded stub. So, you may call this function from // ANY edition of ALGLIB, but only in SMP edition it will work in really // multithreaded mode. // // In order to use multithreading, you have to: // 1) Install SMP edition of ALGLIB. // 2) This step is specific for C++ users: you should activate OS-specific // capabilities of ALGLIB by defining AE_OS=AE_POSIX (for *nix systems) // or AE_OS=AE_WINDOWS (for Windows systems). 
// C# users do not have to perform this step because C# programs are // portable across different systems without OS-specific tuning. // 3) Allow ALGLIB to know about number of worker threads to use: // a) autodetection (C++, C#): // ALGLIB will automatically determine number of CPU cores and // (by default) will use all cores except for one. Say, on 4-core // system it will use three cores - unless you manually told it // to use more or less. It will keep your system responsive during // lengthy computations. // Such behavior may be changed with setnworkers() call: // * alglib::setnworkers(0) = use all cores // * alglib::setnworkers(-1) = leave one core unused // * alglib::setnworkers(-2) = leave two cores unused // * alglib::setnworkers(+2) = use 2 cores (even if you have more) // b) manual specification (C++, C#): // You may want to specify maximum number of worker threads during // compile time by means of preprocessor definition AE_NWORKERS. // For C++ it will be "AE_NWORKERS=X" where X can be any positive number. // For C# it is "AE_NWORKERSX", where X should be replaced by number of // workers (AE_NWORKERS2, AE_NWORKERS3, AE_NWORKERS4, ...). // You can add this definition to compiler command line or change // corresponding project settings in your IDE. // // After you installed and configured SMP edition of ALGLIB, you may choose // between serial and multithreaded versions of SMP-capable functions: // * serial version works as usual, in the context of the calling thread // * multithreaded version (with "smp_" prefix) creates (or wakes up) worker // threads, inserts task in the worker queue, and waits for completion of // the task. All processing is done in context of worker thread(s). // // NOTE: because starting/stopping worker threads costs thousands of CPU cycles, // you should not use multithreading for lightweight computational problems. // // NOTE: some old POSIX-compatible operating systems do not support // sysconf(_SC_NPROCESSORS_ONLN) system call which is required in order // to automatically determine number of active cores. On these systems // you should specify number of cores manually at compile time. // Without it ALGLIB will run in single-threaded mode. // // Now, back to our example. In this example we will show you: // * how to call SMP version of rmatrixgemm(). Because we work with tiny 2x2 // matrices, we won't expect to see ANY speedup from using multithreading. // The only purpose of this demo is to show how to call SMP functions. 
// * how to modify number of worker threads used by ALGLIB // real_2d_array a = "[[2,1],[1,3]]"; real_2d_array b = "[[2,1],[0,1]]"; real_2d_array c = "[[0,0],[0,0]]"; ae_int_t m = 2; ae_int_t n = 2; ae_int_t k = 2; double alpha = 1.0; ae_int_t ia = 0; ae_int_t ja = 0; ae_int_t optypea = 0; ae_int_t ib = 0; ae_int_t jb = 0; ae_int_t optypeb = 0; double beta = 0.0; ae_int_t ic = 0; ae_int_t jc = 0; // serial code c = "[[0,0],[0,0]]"; rmatrixgemm(m, n, k, alpha, a, ia, ja, optypea, b, ib, jb, optypeb, beta, c, ic, jc); // SMP code with default number of worker threads c = "[[0,0],[0,0]]"; smp_rmatrixgemm(m, n, k, alpha, a, ia, ja, optypea, b, ib, jb, optypeb, beta, c, ic, jc); _TestResult = _TestResult && doc_test_real_matrix(c, "[[4,3],[2,4]]", 0.0001); // override number of worker threads - use two cores alglib::setnworkers(+2); c = "[[0,0],[0,0]]"; smp_rmatrixgemm(m, n, k, alpha, a, ia, ja, optypea, b, ib, jb, optypeb, beta, c, ic, jc); _TestResult = _TestResult && doc_test_real_matrix(c, "[[4,3],[2,4]]", 0.0001); } catch(ap_error) { _TestResult = false; } if( !_TestResult) { printf("%-32s FAILED\n", "ablas_smp_gemm"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST ablas_smp_syrk // Symmetric rank-K update (multithreaded) // _TestResult = true; try { // // In this example we assume that you already know how to work with // rmatrixsyrk() function. Below we concentrate on its multithreading // capabilities. // // SMP edition of ALGLIB includes smp_rmatrixsyrk() - multithreaded // version of rmatrixsyrk() function. In the basic edition of ALGLIB // (GPL edition or commercial version without SMP support) this function // just calls single-threaded stub. So, you may call this function from // ANY edition of ALGLIB, but only in SMP edition it will work in really // multithreaded mode. // // In order to use multithreading, you have to: // 1) Install SMP edition of ALGLIB. // 2) This step is specific for C++ users: you should activate OS-specific // capabilities of ALGLIB by defining AE_OS=AE_POSIX (for *nix systems) // or AE_OS=AE_WINDOWS (for Windows systems). // C# users do not have to perform this step because C# programs are // portable across different systems without OS-specific tuning. // 3) Allow ALGLIB to know about number of worker threads to use: // a) autodetection (C++, C#): // ALGLIB will automatically determine number of CPU cores and // (by default) will use all cores except for one. Say, on 4-core // system it will use three cores - unless you manually told it // to use more or less. It will keep your system responsive during // lengthy computations. // Such behavior may be changed with setnworkers() call: // * alglib::setnworkers(0) = use all cores // * alglib::setnworkers(-1) = leave one core unused // * alglib::setnworkers(-2) = leave two cores unused // * alglib::setnworkers(+2) = use 2 cores (even if you have more) // b) manual specification (C++, C#): // You may want to specify maximum number of worker threads during // compile time by means of preprocessor definition AE_NWORKERS. // For C++ it will be "AE_NWORKERS=X" where X can be any positive number. // For C# it is "AE_NWORKERSX", where X should be replaced by number of // workers (AE_NWORKERS2, AE_NWORKERS3, AE_NWORKERS4, ...). // You can add this definition to compiler command line or change // corresponding project settings in your IDE. 
// // After you installed and configured SMP edition of ALGLIB, you may choose // between serial and multithreaded versions of SMP-capable functions: // * serial version works as usual, in the context of the calling thread // * multithreaded version (with "smp_" prefix) creates (or wakes up) worker // threads, inserts task in the worker queue, and waits for completion of // the task. All processing is done in context of worker thread(s). // // NOTE: because starting/stopping worker threads costs thousands of CPU cycles, // you should not use multithreading for lightweight computational problems. // // NOTE: some old POSIX-compatible operating systems do not support // sysconf(_SC_NPROCESSORS_ONLN) system call which is required in order // to automatically determine number of active cores. On these systems // you should specify number of cores manually at compile time. // Without it ALGLIB will run in single-threaded mode. // // Now, back to our example. In this example we will show you: // * how to call SMP version of rmatrixsyrk(). Because we work with tiny 2x2 // matrices, we won't expect to see ANY speedup from using multithreading. // The only purpose of this demo is to show how to call SMP functions. // * how to modify number of worker threads used by ALGLIB // ae_int_t n = 2; ae_int_t k = 1; double alpha = 1.0; ae_int_t ia = 0; ae_int_t ja = 0; ae_int_t optypea = 2; double beta = 0.0; ae_int_t ic = 0; ae_int_t jc = 0; bool isupper = true; real_2d_array a = "[[1,2]]"; real_2d_array c = "[[]]"; // // Default number of worker threads. // Preallocate space to store result, call multithreaded version, test. // // NOTE: this function updates only one triangular part of C. In our // example we choose to update upper triangle. // c = "[[0,0],[0,0]]"; smp_rmatrixsyrk(n, k, alpha, a, ia, ja, optypea, beta, c, ic, jc, isupper); _TestResult = _TestResult && doc_test_real_matrix(c, "[[1,2],[0,4]]", 0.0001); // // Override default number of worker threads (set to 2). // Preallocate space to store result, call multithreaded version, test. // // NOTE: this function updates only one triangular part of C. In our // example we choose to update upper triangle. 
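//
// With A = [[1,2]] and optypea=2 the update computes A^T*A = [[1,2],[2,4]];
// since isupper=true only the upper triangle of C is written, so the expected
// result below is [[1,2],[0,4]].
//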
// alglib::setnworkers(+2); c = "[[0,0],[0,0]]"; smp_rmatrixsyrk(n, k, alpha, a, ia, ja, optypea, beta, c, ic, jc, isupper); _TestResult = _TestResult && doc_test_real_matrix(c, "[[1,2],[0,4]]", 0.0001); } catch(ap_error) { _TestResult = false; } if( !_TestResult) { printf("%-32s FAILED\n", "ablas_smp_syrk"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST ablas_t_real // Basis test for real matrix functions (correctness and presence of SMP support) // _TestResult = true; try { real_2d_array a; real_2d_array b; real_2d_array c; // test rmatrixgemm() a = "[[2,1],[1,3]]"; b = "[[2,1],[0,1]]"; c = "[[0,0],[0,0]]"; rmatrixgemm(2, 2, 2, 1.0, a, 0, 0, 0, b, 0, 0, 0, 0.0, c, 0, 0); _TestResult = _TestResult && doc_test_real_matrix(c, "[[4,3],[2,4]]", 0.0001); smp_rmatrixgemm(2, 2, 2, 1.0, a, 0, 0, 0, b, 0, 0, 0, 1.0, c, 0, 0); _TestResult = _TestResult && doc_test_real_matrix(c, "[[8,6],[4,8]]", 0.0001); } catch(ap_error) { _TestResult = false; } if( !_TestResult) { printf("%-32s FAILED\n", "ablas_t_real"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST ablas_t_complex // Basis test for complex matrix functions (correctness and presence of SMP support) // _TestResult = true; try { complex_2d_array a; complex_2d_array b; complex_2d_array c; // test cmatrixgemm() a = "[[2i,1i],[1,3]]"; b = "[[2,1],[0,1]]"; c = "[[0,0],[0,0]]"; cmatrixgemm(2, 2, 2, 1.0, a, 0, 0, 0, b, 0, 0, 0, 0.0, c, 0, 0); _TestResult = _TestResult && doc_test_complex_matrix(c, "[[4i,3i],[2,4]]", 0.0001); smp_cmatrixgemm(2, 2, 2, 1.0, a, 0, 0, 0, b, 0, 0, 0, 1.0, c, 0, 0); _TestResult = _TestResult && doc_test_complex_matrix(c, "[[8i,6i],[4,8]]", 0.0001); } catch(ap_error) { _TestResult = false; } if( !_TestResult) { printf("%-32s FAILED\n", "ablas_t_complex"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST matinv_d_r1 // Real matrix inverse // _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<7; _spoil_scenario++) { try { real_2d_array a = "[[1,-1],[1,1]]"; if( _spoil_scenario==0 ) spoil_matrix_by_nan(a); if( _spoil_scenario==1 ) spoil_matrix_by_posinf(a); if( _spoil_scenario==2 ) spoil_matrix_by_neginf(a); if( _spoil_scenario==3 ) spoil_matrix_by_adding_row(a); if( _spoil_scenario==4 ) spoil_matrix_by_adding_col(a); if( _spoil_scenario==5 ) spoil_matrix_by_deleting_row(a); if( _spoil_scenario==6 ) spoil_matrix_by_deleting_col(a); ae_int_t info; matinvreport rep; rmatrixinverse(a, info, rep); _TestResult = _TestResult && doc_test_int(info, 1); _TestResult = _TestResult && doc_test_real_matrix(a, "[[0.5,0.5],[-0.5,0.5]]", 0.00005); _TestResult = _TestResult && doc_test_real(rep.r1, 0.5, 0.00005); _TestResult = _TestResult && doc_test_real(rep.rinf, 0.5, 0.00005); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && (_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "matinv_d_r1"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST matinv_d_c1 // Complex matrix inverse // _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<7; _spoil_scenario++) { try { complex_2d_array a = "[[1i,-1],[1i,1]]"; if( _spoil_scenario==0 ) spoil_matrix_by_nan(a); if( _spoil_scenario==1 ) spoil_matrix_by_posinf(a); if( _spoil_scenario==2 ) spoil_matrix_by_neginf(a); if( _spoil_scenario==3 ) spoil_matrix_by_adding_row(a); if( _spoil_scenario==4 ) spoil_matrix_by_adding_col(a); if( _spoil_scenario==5 ) spoil_matrix_by_deleting_row(a); if( _spoil_scenario==6 ) 
spoil_matrix_by_deleting_col(a); ae_int_t info; matinvreport rep; cmatrixinverse(a, info, rep); _TestResult = _TestResult && doc_test_int(info, 1); _TestResult = _TestResult && doc_test_complex_matrix(a, "[[-0.5i,-0.5i],[-0.5,0.5]]", 0.00005); _TestResult = _TestResult && doc_test_real(rep.r1, 0.5, 0.00005); _TestResult = _TestResult && doc_test_real(rep.rinf, 0.5, 0.00005); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && (_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "matinv_d_c1"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST matinv_d_spd1 // SPD matrix inverse // _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<7; _spoil_scenario++) { try { real_2d_array a = "[[2,1],[1,2]]"; if( _spoil_scenario==0 ) spoil_matrix_by_nan(a); if( _spoil_scenario==1 ) spoil_matrix_by_posinf(a); if( _spoil_scenario==2 ) spoil_matrix_by_neginf(a); if( _spoil_scenario==3 ) spoil_matrix_by_adding_row(a); if( _spoil_scenario==4 ) spoil_matrix_by_adding_col(a); if( _spoil_scenario==5 ) spoil_matrix_by_deleting_row(a); if( _spoil_scenario==6 ) spoil_matrix_by_deleting_col(a); ae_int_t info; matinvreport rep; spdmatrixinverse(a, info, rep); _TestResult = _TestResult && doc_test_int(info, 1); _TestResult = _TestResult && doc_test_real_matrix(a, "[[0.666666,-0.333333],[-0.333333,0.666666]]", 0.00005); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && (_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "matinv_d_spd1"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST matinv_d_hpd1 // HPD matrix inverse // _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<7; _spoil_scenario++) { try { complex_2d_array a = "[[2,1],[1,2]]"; if( _spoil_scenario==0 ) spoil_matrix_by_nan(a); if( _spoil_scenario==1 ) spoil_matrix_by_posinf(a); if( _spoil_scenario==2 ) spoil_matrix_by_neginf(a); if( _spoil_scenario==3 ) spoil_matrix_by_adding_row(a); if( _spoil_scenario==4 ) spoil_matrix_by_adding_col(a); if( _spoil_scenario==5 ) spoil_matrix_by_deleting_row(a); if( _spoil_scenario==6 ) spoil_matrix_by_deleting_col(a); ae_int_t info; matinvreport rep; hpdmatrixinverse(a, info, rep); _TestResult = _TestResult && doc_test_int(info, 1); _TestResult = _TestResult && doc_test_complex_matrix(a, "[[0.666666,-0.333333],[-0.333333,0.666666]]", 0.00005); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && (_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "matinv_d_hpd1"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST matinv_t_r1 // Real matrix inverse: singular matrix // _TestResult = true; try { real_2d_array a = "[[1,-1],[-2,2]]"; ae_int_t info; matinvreport rep; rmatrixinverse(a, info, rep); _TestResult = _TestResult && doc_test_int(info, -3); _TestResult = _TestResult && doc_test_real(rep.r1, 0.0, 0.00005); _TestResult = _TestResult && doc_test_real(rep.rinf, 0.0, 0.00005); } catch(ap_error) { _TestResult = false; } if( !_TestResult) { printf("%-32s FAILED\n", "matinv_t_r1"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST matinv_t_c1 // Complex matrix inverse: singular matrix // _TestResult = true; try { complex_2d_array a = "[[1i,-1i],[-2,2]]"; ae_int_t info; matinvreport rep; cmatrixinverse(a, info, rep); _TestResult = _TestResult && doc_test_int(info, -3); _TestResult = _TestResult && 
doc_test_real(rep.r1, 0.0, 0.00005); _TestResult = _TestResult && doc_test_real(rep.rinf, 0.0, 0.00005); } catch(ap_error) { _TestResult = false; } if( !_TestResult) { printf("%-32s FAILED\n", "matinv_t_c1"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST matinv_e_spd1 // Attempt to use SPD function on nonsymmetrix matrix // _TestResult = true; try { real_2d_array a = "[[1,0],[1,1]]"; ae_int_t info; matinvreport rep; spdmatrixinverse(a, info, rep); _TestResult = false; } catch(ap_error) {} if( !_TestResult) { printf("%-32s FAILED\n", "matinv_e_spd1"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST matinv_e_hpd1 // Attempt to use SPD function on nonsymmetrix matrix // _TestResult = true; try { complex_2d_array a = "[[1,0],[1,1]]"; ae_int_t info; matinvreport rep; hpdmatrixinverse(a, info, rep); _TestResult = false; } catch(ap_error) {} if( !_TestResult) { printf("%-32s FAILED\n", "matinv_e_hpd1"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST minlbfgs_d_1 // Nonlinear optimization by L-BFGS // _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<12; _spoil_scenario++) { try { // // This example demonstrates minimization of f(x,y) = 100*(x+3)^4+(y-3)^4 // using LBFGS method. // real_1d_array x = "[0,0]"; if( _spoil_scenario==0 ) spoil_vector_by_nan(x); if( _spoil_scenario==1 ) spoil_vector_by_posinf(x); if( _spoil_scenario==2 ) spoil_vector_by_neginf(x); double epsg = 0.0000000001; if( _spoil_scenario==3 ) epsg = fp_nan; if( _spoil_scenario==4 ) epsg = fp_posinf; if( _spoil_scenario==5 ) epsg = fp_neginf; double epsf = 0; if( _spoil_scenario==6 ) epsf = fp_nan; if( _spoil_scenario==7 ) epsf = fp_posinf; if( _spoil_scenario==8 ) epsf = fp_neginf; double epsx = 0; if( _spoil_scenario==9 ) epsx = fp_nan; if( _spoil_scenario==10 ) epsx = fp_posinf; if( _spoil_scenario==11 ) epsx = fp_neginf; ae_int_t maxits = 0; minlbfgsstate state; minlbfgsreport rep; minlbfgscreate(1, x, state); minlbfgssetcond(state, epsg, epsf, epsx, maxits); alglib::minlbfgsoptimize(state, function1_grad); minlbfgsresults(state, x, rep); _TestResult = _TestResult && doc_test_int(rep.terminationtype, 4); _TestResult = _TestResult && doc_test_real_vector(x, "[-3,3]", 0.005); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && (_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "minlbfgs_d_1"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST minlbfgs_d_2 // Nonlinear optimization with additional settings and restarts // _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<18; _spoil_scenario++) { try { // // This example demonstrates minimization of f(x,y) = 100*(x+3)^4+(y-3)^4 // using LBFGS method. 
// // Several advanced techniques are demonstrated: // * upper limit on step size // * restart from new point // real_1d_array x = "[0,0]"; if( _spoil_scenario==0 ) spoil_vector_by_nan(x); if( _spoil_scenario==1 ) spoil_vector_by_posinf(x); if( _spoil_scenario==2 ) spoil_vector_by_neginf(x); double epsg = 0.0000000001; if( _spoil_scenario==3 ) epsg = fp_nan; if( _spoil_scenario==4 ) epsg = fp_posinf; if( _spoil_scenario==5 ) epsg = fp_neginf; double epsf = 0; if( _spoil_scenario==6 ) epsf = fp_nan; if( _spoil_scenario==7 ) epsf = fp_posinf; if( _spoil_scenario==8 ) epsf = fp_neginf; double epsx = 0; if( _spoil_scenario==9 ) epsx = fp_nan; if( _spoil_scenario==10 ) epsx = fp_posinf; if( _spoil_scenario==11 ) epsx = fp_neginf; double stpmax = 0.1; if( _spoil_scenario==12 ) stpmax = fp_nan; if( _spoil_scenario==13 ) stpmax = fp_posinf; if( _spoil_scenario==14 ) stpmax = fp_neginf; ae_int_t maxits = 0; minlbfgsstate state; minlbfgsreport rep; // first run minlbfgscreate(1, x, state); minlbfgssetcond(state, epsg, epsf, epsx, maxits); minlbfgssetstpmax(state, stpmax); alglib::minlbfgsoptimize(state, function1_grad); minlbfgsresults(state, x, rep); _TestResult = _TestResult && doc_test_real_vector(x, "[-3,3]", 0.005); // second run - algorithm is restarted x = "[10,10]"; if( _spoil_scenario==15 ) spoil_vector_by_nan(x); if( _spoil_scenario==16 ) spoil_vector_by_posinf(x); if( _spoil_scenario==17 ) spoil_vector_by_neginf(x); minlbfgsrestartfrom(state, x); alglib::minlbfgsoptimize(state, function1_grad); minlbfgsresults(state, x, rep); _TestResult = _TestResult && doc_test_int(rep.terminationtype, 4); _TestResult = _TestResult && doc_test_real_vector(x, "[-3,3]", 0.005); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && (_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "minlbfgs_d_2"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST minlbfgs_numdiff // Nonlinear optimization by L-BFGS with numerical differentiation // _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<15; _spoil_scenario++) { try { // // This example demonstrates minimization of f(x,y) = 100*(x+3)^4+(y-3)^4 // using numerical differentiation to calculate gradient. 
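//
// Note that minlbfgscreatef() (instead of minlbfgscreate()) is used below and
// only the function-value callback function1_func() is passed to the optimizer;
// the gradient is approximated numerically with step diffstep.
//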
// real_1d_array x = "[0,0]"; if( _spoil_scenario==0 ) spoil_vector_by_nan(x); if( _spoil_scenario==1 ) spoil_vector_by_posinf(x); if( _spoil_scenario==2 ) spoil_vector_by_neginf(x); double epsg = 0.0000000001; if( _spoil_scenario==3 ) epsg = fp_nan; if( _spoil_scenario==4 ) epsg = fp_posinf; if( _spoil_scenario==5 ) epsg = fp_neginf; double epsf = 0; if( _spoil_scenario==6 ) epsf = fp_nan; if( _spoil_scenario==7 ) epsf = fp_posinf; if( _spoil_scenario==8 ) epsf = fp_neginf; double epsx = 0; if( _spoil_scenario==9 ) epsx = fp_nan; if( _spoil_scenario==10 ) epsx = fp_posinf; if( _spoil_scenario==11 ) epsx = fp_neginf; double diffstep = 1.0e-6; if( _spoil_scenario==12 ) diffstep = fp_nan; if( _spoil_scenario==13 ) diffstep = fp_posinf; if( _spoil_scenario==14 ) diffstep = fp_neginf; ae_int_t maxits = 0; minlbfgsstate state; minlbfgsreport rep; minlbfgscreatef(1, x, diffstep, state); minlbfgssetcond(state, epsg, epsf, epsx, maxits); alglib::minlbfgsoptimize(state, function1_func); minlbfgsresults(state, x, rep); _TestResult = _TestResult && doc_test_int(rep.terminationtype, 4); _TestResult = _TestResult && doc_test_real_vector(x, "[-3,3]", 0.005); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && (_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "minlbfgs_numdiff"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST minlbfgs_ftrim // Nonlinear optimization by LBFGS, function with singularities // _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<12; _spoil_scenario++) { try { // // This example demonstrates minimization of f(x) = (1+x)^(-0.2) + (1-x)^(-0.3) + 1000*x. // This function has singularities at the boundary of the [-1,+1], but technique called // "function trimming" allows us to solve this optimization problem. // // See http://www.alglib.net/optimization/tipsandtricks.php#ftrimming for more information // on this subject. // real_1d_array x = "[0]"; if( _spoil_scenario==0 ) spoil_vector_by_nan(x); if( _spoil_scenario==1 ) spoil_vector_by_posinf(x); if( _spoil_scenario==2 ) spoil_vector_by_neginf(x); double epsg = 1.0e-6; if( _spoil_scenario==3 ) epsg = fp_nan; if( _spoil_scenario==4 ) epsg = fp_posinf; if( _spoil_scenario==5 ) epsg = fp_neginf; double epsf = 0; if( _spoil_scenario==6 ) epsf = fp_nan; if( _spoil_scenario==7 ) epsf = fp_posinf; if( _spoil_scenario==8 ) epsf = fp_neginf; double epsx = 0; if( _spoil_scenario==9 ) epsx = fp_nan; if( _spoil_scenario==10 ) epsx = fp_posinf; if( _spoil_scenario==11 ) epsx = fp_neginf; ae_int_t maxits = 0; minlbfgsstate state; minlbfgsreport rep; minlbfgscreate(1, x, state); minlbfgssetcond(state, epsg, epsf, epsx, maxits); alglib::minlbfgsoptimize(state, s1_grad); minlbfgsresults(state, x, rep); _TestResult = _TestResult && doc_test_real_vector(x, "[-0.99917305]", 0.000005); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && (_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "minlbfgs_ftrim"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST linlsqr_d_1 // Solution of sparse linear systems with CG // _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<4; _spoil_scenario++) { try { // // This example illustrates solution of sparse linear least squares problem // with LSQR algorithm. 
// // Suppose that we have least squares problem min|A*x-b| with sparse A // represented by sparsematrix object // [ 1 1 ] // [ 1 1 ] // A = [ 2 1 ] // [ 1 ] // [ 1 ] // and right part b // [ 4 ] // [ 2 ] // b = [ 4 ] // [ 1 ] // [ 2 ] // and we want to solve this system in the least squares sense using // LSQR algorithm. In order to do so, we have to create left part // (sparsematrix object) and right part (dense array). // // Initially, sparse matrix is created in the Hash-Table format, // which allows easy initialization, but do not allow matrix to be // used in the linear solvers. So after construction you should convert // sparse matrix to CRS format (one suited for linear operations). // sparsematrix a; sparsecreate(5, 2, a); sparseset(a, 0, 0, 1.0); sparseset(a, 0, 1, 1.0); sparseset(a, 1, 0, 1.0); sparseset(a, 1, 1, 1.0); sparseset(a, 2, 0, 2.0); sparseset(a, 2, 1, 1.0); sparseset(a, 3, 0, 1.0); sparseset(a, 4, 1, 1.0); // // Now our matrix is fully initialized, but we have to do one more // step - convert it from Hash-Table format to CRS format (see // documentation on sparse matrices for more information about these // formats). // // If you omit this call, ALGLIB will generate exception on the first // attempt to use A in linear operations. // sparseconverttocrs(a); // // Initialization of the right part // real_1d_array b = "[4,2,4,1,2]"; if( _spoil_scenario==0 ) spoil_vector_by_nan(b); if( _spoil_scenario==1 ) spoil_vector_by_posinf(b); if( _spoil_scenario==2 ) spoil_vector_by_neginf(b); if( _spoil_scenario==3 ) spoil_vector_by_deleting_element(b); // // Now we have to create linear solver object and to use it for the // solution of the linear system. // linlsqrstate s; linlsqrreport rep; real_1d_array x; linlsqrcreate(5, 2, s); linlsqrsolvesparse(s, a, b); linlsqrresults(s, x, rep); _TestResult = _TestResult && doc_test_int(rep.terminationtype, 4); _TestResult = _TestResult && doc_test_real_vector(x, "[1.000,2.000]", 0.005); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && (_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "linlsqr_d_1"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST mincg_d_1 // Nonlinear optimization by CG // _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<12; _spoil_scenario++) { try { // // This example demonstrates minimization of f(x,y) = 100*(x+3)^4+(y-3)^4 // with nonlinear conjugate gradient method. 
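//
// The unconstrained minimum of this function is at (x,y)=(-3,+3), which is the
// point the result is checked against below.
//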
// real_1d_array x = "[0,0]"; if( _spoil_scenario==0 ) spoil_vector_by_nan(x); if( _spoil_scenario==1 ) spoil_vector_by_posinf(x); if( _spoil_scenario==2 ) spoil_vector_by_neginf(x); double epsg = 0.0000000001; if( _spoil_scenario==3 ) epsg = fp_nan; if( _spoil_scenario==4 ) epsg = fp_posinf; if( _spoil_scenario==5 ) epsg = fp_neginf; double epsf = 0; if( _spoil_scenario==6 ) epsf = fp_nan; if( _spoil_scenario==7 ) epsf = fp_posinf; if( _spoil_scenario==8 ) epsf = fp_neginf; double epsx = 0; if( _spoil_scenario==9 ) epsx = fp_nan; if( _spoil_scenario==10 ) epsx = fp_posinf; if( _spoil_scenario==11 ) epsx = fp_neginf; ae_int_t maxits = 0; mincgstate state; mincgreport rep; mincgcreate(x, state); mincgsetcond(state, epsg, epsf, epsx, maxits); alglib::mincgoptimize(state, function1_grad); mincgresults(state, x, rep); _TestResult = _TestResult && doc_test_int(rep.terminationtype, 4); _TestResult = _TestResult && doc_test_real_vector(x, "[-3,3]", 0.005); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && (_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "mincg_d_1"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST mincg_d_2 // Nonlinear optimization with additional settings and restarts // _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<18; _spoil_scenario++) { try { // // This example demonstrates minimization of f(x,y) = 100*(x+3)^4+(y-3)^4 // with nonlinear conjugate gradient method. // // Several advanced techniques are demonstrated: // * upper limit on step size // * restart from new point // real_1d_array x = "[0,0]"; if( _spoil_scenario==0 ) spoil_vector_by_nan(x); if( _spoil_scenario==1 ) spoil_vector_by_posinf(x); if( _spoil_scenario==2 ) spoil_vector_by_neginf(x); double epsg = 0.0000000001; if( _spoil_scenario==3 ) epsg = fp_nan; if( _spoil_scenario==4 ) epsg = fp_posinf; if( _spoil_scenario==5 ) epsg = fp_neginf; double epsf = 0; if( _spoil_scenario==6 ) epsf = fp_nan; if( _spoil_scenario==7 ) epsf = fp_posinf; if( _spoil_scenario==8 ) epsf = fp_neginf; double epsx = 0; if( _spoil_scenario==9 ) epsx = fp_nan; if( _spoil_scenario==10 ) epsx = fp_posinf; if( _spoil_scenario==11 ) epsx = fp_neginf; double stpmax = 0.1; if( _spoil_scenario==12 ) stpmax = fp_nan; if( _spoil_scenario==13 ) stpmax = fp_posinf; if( _spoil_scenario==14 ) stpmax = fp_neginf; ae_int_t maxits = 0; mincgstate state; mincgreport rep; // first run mincgcreate(x, state); mincgsetcond(state, epsg, epsf, epsx, maxits); mincgsetstpmax(state, stpmax); alglib::mincgoptimize(state, function1_grad); mincgresults(state, x, rep); _TestResult = _TestResult && doc_test_real_vector(x, "[-3,3]", 0.005); // second run - algorithm is restarted with mincgrestartfrom() x = "[10,10]"; if( _spoil_scenario==15 ) spoil_vector_by_nan(x); if( _spoil_scenario==16 ) spoil_vector_by_posinf(x); if( _spoil_scenario==17 ) spoil_vector_by_neginf(x); mincgrestartfrom(state, x); alglib::mincgoptimize(state, function1_grad); mincgresults(state, x, rep); _TestResult = _TestResult && doc_test_int(rep.terminationtype, 4); _TestResult = _TestResult && doc_test_real_vector(x, "[-3,3]", 0.005); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && (_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "mincg_d_2"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST mincg_numdiff // Nonlinear optimization by CG with numerical differentiation // _TestResult = 
true; for(_spoil_scenario=-1; _spoil_scenario<15; _spoil_scenario++) { try { // // This example demonstrates minimization of f(x,y) = 100*(x+3)^4+(y-3)^4 // using numerical differentiation to calculate gradient. // real_1d_array x = "[0,0]"; if( _spoil_scenario==0 ) spoil_vector_by_nan(x); if( _spoil_scenario==1 ) spoil_vector_by_posinf(x); if( _spoil_scenario==2 ) spoil_vector_by_neginf(x); double epsg = 0.0000000001; if( _spoil_scenario==3 ) epsg = fp_nan; if( _spoil_scenario==4 ) epsg = fp_posinf; if( _spoil_scenario==5 ) epsg = fp_neginf; double epsf = 0; if( _spoil_scenario==6 ) epsf = fp_nan; if( _spoil_scenario==7 ) epsf = fp_posinf; if( _spoil_scenario==8 ) epsf = fp_neginf; double epsx = 0; if( _spoil_scenario==9 ) epsx = fp_nan; if( _spoil_scenario==10 ) epsx = fp_posinf; if( _spoil_scenario==11 ) epsx = fp_neginf; double diffstep = 1.0e-6; if( _spoil_scenario==12 ) diffstep = fp_nan; if( _spoil_scenario==13 ) diffstep = fp_posinf; if( _spoil_scenario==14 ) diffstep = fp_neginf; ae_int_t maxits = 0; mincgstate state; mincgreport rep; mincgcreatef(x, diffstep, state); mincgsetcond(state, epsg, epsf, epsx, maxits); alglib::mincgoptimize(state, function1_func); mincgresults(state, x, rep); _TestResult = _TestResult && doc_test_int(rep.terminationtype, 4); _TestResult = _TestResult && doc_test_real_vector(x, "[-3,3]", 0.005); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && (_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "mincg_numdiff"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST mincg_ftrim // Nonlinear optimization by CG, function with singularities // _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<12; _spoil_scenario++) { try { // // This example demonstrates minimization of f(x) = (1+x)^(-0.2) + (1-x)^(-0.3) + 1000*x. // This function has singularities at the boundary of the [-1,+1], but technique called // "function trimming" allows us to solve this optimization problem. // // See http://www.alglib.net/optimization/tipsandtricks.php#ftrimming for more information // on this subject. 
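//
// Trimming is implemented in the s1_grad() callback defined above: outside of
// [-0.999999999999,+0.999999999999] it simply returns func=1.0E+300 and skips
// the gradient, which keeps the optimizer away from the singularities.
//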
// real_1d_array x = "[0]"; if( _spoil_scenario==0 ) spoil_vector_by_nan(x); if( _spoil_scenario==1 ) spoil_vector_by_posinf(x); if( _spoil_scenario==2 ) spoil_vector_by_neginf(x); double epsg = 1.0e-6; if( _spoil_scenario==3 ) epsg = fp_nan; if( _spoil_scenario==4 ) epsg = fp_posinf; if( _spoil_scenario==5 ) epsg = fp_neginf; double epsf = 0; if( _spoil_scenario==6 ) epsf = fp_nan; if( _spoil_scenario==7 ) epsf = fp_posinf; if( _spoil_scenario==8 ) epsf = fp_neginf; double epsx = 0; if( _spoil_scenario==9 ) epsx = fp_nan; if( _spoil_scenario==10 ) epsx = fp_posinf; if( _spoil_scenario==11 ) epsx = fp_neginf; ae_int_t maxits = 0; mincgstate state; mincgreport rep; mincgcreate(x, state); mincgsetcond(state, epsg, epsf, epsx, maxits); alglib::mincgoptimize(state, s1_grad); mincgresults(state, x, rep); _TestResult = _TestResult && doc_test_real_vector(x, "[-0.99917305]", 0.000005); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && (_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "mincg_ftrim"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST minbleic_d_1 // Nonlinear optimization with bound constraints // _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<16; _spoil_scenario++) { try { // // This example demonstrates minimization of f(x,y) = 100*(x+3)^4+(y-3)^4 // subject to bound constraints -1<=x<=+1, -1<=y<=+1, using BLEIC optimizer. // real_1d_array x = "[0,0]"; if( _spoil_scenario==0 ) spoil_vector_by_nan(x); if( _spoil_scenario==1 ) spoil_vector_by_posinf(x); if( _spoil_scenario==2 ) spoil_vector_by_neginf(x); real_1d_array bndl = "[-1,-1]"; if( _spoil_scenario==3 ) spoil_vector_by_nan(bndl); if( _spoil_scenario==4 ) spoil_vector_by_deleting_element(bndl); real_1d_array bndu = "[+1,+1]"; if( _spoil_scenario==5 ) spoil_vector_by_nan(bndu); if( _spoil_scenario==6 ) spoil_vector_by_deleting_element(bndu); minbleicstate state; minbleicreport rep; // // These variables define stopping conditions for the optimizer. // // We use very simple condition - |g|<=epsg // double epsg = 0.000001; if( _spoil_scenario==7 ) epsg = fp_nan; if( _spoil_scenario==8 ) epsg = fp_posinf; if( _spoil_scenario==9 ) epsg = fp_neginf; double epsf = 0; if( _spoil_scenario==10 ) epsf = fp_nan; if( _spoil_scenario==11 ) epsf = fp_posinf; if( _spoil_scenario==12 ) epsf = fp_neginf; double epsx = 0; if( _spoil_scenario==13 ) epsx = fp_nan; if( _spoil_scenario==14 ) epsx = fp_posinf; if( _spoil_scenario==15 ) epsx = fp_neginf; ae_int_t maxits = 0; // // Now we are ready to actually optimize something: // * first we create optimizer // * we add boundary constraints // * we tune stopping conditions // * and, finally, optimize and obtain results... 
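//
// Since 100*(x+3)^4 is increasing and (y-3)^4 is decreasing on [-1,+1], the
// constrained minimum is attained at the corner (x,y)=(-1,+1) - the point the
// result is checked against below.
//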
// minbleiccreate(x, state); minbleicsetbc(state, bndl, bndu); minbleicsetcond(state, epsg, epsf, epsx, maxits); alglib::minbleicoptimize(state, function1_grad); minbleicresults(state, x, rep); // // ...and evaluate these results // _TestResult = _TestResult && doc_test_int(rep.terminationtype, 4); _TestResult = _TestResult && doc_test_real_vector(x, "[-1,1]", 0.005); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && (_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "minbleic_d_1"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST minbleic_d_2 // Nonlinear optimization with linear inequality constraints // _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<18; _spoil_scenario++) { try { // // This example demonstrates minimization of f(x,y) = 100*(x+3)^4+(y-3)^4 // subject to inequality constraints: // * x>=2 (posed as general linear constraint), // * x+y>=6 // using BLEIC optimizer. // real_1d_array x = "[5,5]"; if( _spoil_scenario==0 ) spoil_vector_by_nan(x); if( _spoil_scenario==1 ) spoil_vector_by_posinf(x); if( _spoil_scenario==2 ) spoil_vector_by_neginf(x); real_2d_array c = "[[1,0,2],[1,1,6]]"; if( _spoil_scenario==3 ) spoil_matrix_by_nan(c); if( _spoil_scenario==4 ) spoil_matrix_by_posinf(c); if( _spoil_scenario==5 ) spoil_matrix_by_neginf(c); if( _spoil_scenario==6 ) spoil_matrix_by_deleting_row(c); if( _spoil_scenario==7 ) spoil_matrix_by_deleting_col(c); integer_1d_array ct = "[1,1]"; if( _spoil_scenario==8 ) spoil_vector_by_deleting_element(ct); minbleicstate state; minbleicreport rep; // // These variables define stopping conditions for the optimizer. // // We use very simple condition - |g|<=epsg // double epsg = 0.000001; if( _spoil_scenario==9 ) epsg = fp_nan; if( _spoil_scenario==10 ) epsg = fp_posinf; if( _spoil_scenario==11 ) epsg = fp_neginf; double epsf = 0; if( _spoil_scenario==12 ) epsf = fp_nan; if( _spoil_scenario==13 ) epsf = fp_posinf; if( _spoil_scenario==14 ) epsf = fp_neginf; double epsx = 0; if( _spoil_scenario==15 ) epsx = fp_nan; if( _spoil_scenario==16 ) epsx = fp_posinf; if( _spoil_scenario==17 ) epsx = fp_neginf; ae_int_t maxits = 0; // // Now we are ready to actually optimize something: // * first we create optimizer // * we add linear constraints // * we tune stopping conditions // * and, finally, optimize and obtain results... // minbleiccreate(x, state); minbleicsetlc(state, c, ct); minbleicsetcond(state, epsg, epsf, epsx, maxits); alglib::minbleicoptimize(state, function1_grad); minbleicresults(state, x, rep); // // ...and evaluate these results // _TestResult = _TestResult && doc_test_int(rep.terminationtype, 4); _TestResult = _TestResult && doc_test_real_vector(x, "[2,4]", 0.005); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && (_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "minbleic_d_2"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST minbleic_numdiff // Nonlinear optimization with bound constraints and numerical differentiation // _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<19; _spoil_scenario++) { try { // // This example demonstrates minimization of f(x,y) = 100*(x+3)^4+(y-3)^4 // subject to bound constraints -1<=x<=+1, -1<=y<=+1, using BLEIC optimizer. 
// real_1d_array x = "[0,0]"; if( _spoil_scenario==0 ) spoil_vector_by_nan(x); if( _spoil_scenario==1 ) spoil_vector_by_posinf(x); if( _spoil_scenario==2 ) spoil_vector_by_neginf(x); real_1d_array bndl = "[-1,-1]"; if( _spoil_scenario==3 ) spoil_vector_by_nan(bndl); if( _spoil_scenario==4 ) spoil_vector_by_deleting_element(bndl); real_1d_array bndu = "[+1,+1]"; if( _spoil_scenario==5 ) spoil_vector_by_nan(bndu); if( _spoil_scenario==6 ) spoil_vector_by_deleting_element(bndu); minbleicstate state; minbleicreport rep; // // These variables define stopping conditions for the optimizer. // // We use very simple condition - |g|<=epsg // double epsg = 0.000001; if( _spoil_scenario==7 ) epsg = fp_nan; if( _spoil_scenario==8 ) epsg = fp_posinf; if( _spoil_scenario==9 ) epsg = fp_neginf; double epsf = 0; if( _spoil_scenario==10 ) epsf = fp_nan; if( _spoil_scenario==11 ) epsf = fp_posinf; if( _spoil_scenario==12 ) epsf = fp_neginf; double epsx = 0; if( _spoil_scenario==13 ) epsx = fp_nan; if( _spoil_scenario==14 ) epsx = fp_posinf; if( _spoil_scenario==15 ) epsx = fp_neginf; ae_int_t maxits = 0; // // This variable contains differentiation step // double diffstep = 1.0e-6; if( _spoil_scenario==16 ) diffstep = fp_nan; if( _spoil_scenario==17 ) diffstep = fp_posinf; if( _spoil_scenario==18 ) diffstep = fp_neginf; // // Now we are ready to actually optimize something: // * first we create optimizer // * we add boundary constraints // * we tune stopping conditions // * and, finally, optimize and obtain results... // minbleiccreatef(x, diffstep, state); minbleicsetbc(state, bndl, bndu); minbleicsetcond(state, epsg, epsf, epsx, maxits); alglib::minbleicoptimize(state, function1_func); minbleicresults(state, x, rep); // // ...and evaluate these results // _TestResult = _TestResult && doc_test_int(rep.terminationtype, 4); _TestResult = _TestResult && doc_test_real_vector(x, "[-1,1]", 0.005); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && (_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "minbleic_numdiff"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST minbleic_ftrim // Nonlinear optimization by BLEIC, function with singularities // _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<12; _spoil_scenario++) { try { // // This example demonstrates minimization of f(x) = (1+x)^(-0.2) + (1-x)^(-0.3) + 1000*x. // // This function is undefined outside of (-1,+1) and has singularities at x=-1 and x=+1. // Special technique called "function trimming" allows us to solve this optimization problem // - without using boundary constraints! // // See http://www.alglib.net/optimization/tipsandtricks.php#ftrimming for more information // on this subject. 
// real_1d_array x = "[0]"; if( _spoil_scenario==0 ) spoil_vector_by_nan(x); if( _spoil_scenario==1 ) spoil_vector_by_posinf(x); if( _spoil_scenario==2 ) spoil_vector_by_neginf(x); double epsg = 1.0e-6; if( _spoil_scenario==3 ) epsg = fp_nan; if( _spoil_scenario==4 ) epsg = fp_posinf; if( _spoil_scenario==5 ) epsg = fp_neginf; double epsf = 0; if( _spoil_scenario==6 ) epsf = fp_nan; if( _spoil_scenario==7 ) epsf = fp_posinf; if( _spoil_scenario==8 ) epsf = fp_neginf; double epsx = 0; if( _spoil_scenario==9 ) epsx = fp_nan; if( _spoil_scenario==10 ) epsx = fp_posinf; if( _spoil_scenario==11 ) epsx = fp_neginf; ae_int_t maxits = 0; minbleicstate state; minbleicreport rep; minbleiccreate(x, state); minbleicsetcond(state, epsg, epsf, epsx, maxits); alglib::minbleicoptimize(state, s1_grad); minbleicresults(state, x, rep); _TestResult = _TestResult && doc_test_real_vector(x, "[-0.99917305]", 0.000005); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && (_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "minbleic_ftrim"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST minqp_d_u1 // Unconstrained dense quadratic programming // _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<17; _spoil_scenario++) { try { // // This example demonstrates minimization of F(x0,x1) = x0^2 + x1^2 -6*x0 - 4*x1 // // Exact solution is [x0,x1] = [3,2] // // We provide algorithm with starting point, although in this case // (dense matrix, no constraints) it can work without such information. // // Several QP solvers are tried: QuickQP, BLEIC, DENSE-AUL. // // IMPORTANT: this solver minimizes following function: // f(x) = 0.5*x'*A*x + b'*x. // Note that quadratic term has 0.5 before it. So if you want to minimize // quadratic function, you should rewrite it in such way that quadratic term // is multiplied by 0.5 too. // // For example, our function is f(x)=x0^2+x1^2+..., but we rewrite it as // f(x) = 0.5*(2*x0^2+2*x1^2) + .... // and pass diag(2,2) as quadratic term - NOT diag(1,1)! // real_2d_array a = "[[2,0],[0,2]]"; if( _spoil_scenario==0 ) spoil_matrix_by_nan(a); if( _spoil_scenario==1 ) spoil_matrix_by_posinf(a); if( _spoil_scenario==2 ) spoil_matrix_by_neginf(a); if( _spoil_scenario==3 ) spoil_matrix_by_deleting_row(a); if( _spoil_scenario==4 ) spoil_matrix_by_deleting_col(a); real_1d_array b = "[-6,-4]"; if( _spoil_scenario==5 ) spoil_vector_by_nan(b); if( _spoil_scenario==6 ) spoil_vector_by_posinf(b); if( _spoil_scenario==7 ) spoil_vector_by_neginf(b); if( _spoil_scenario==8 ) spoil_vector_by_deleting_element(b); real_1d_array x0 = "[0,1]"; if( _spoil_scenario==9 ) spoil_vector_by_nan(x0); if( _spoil_scenario==10 ) spoil_vector_by_posinf(x0); if( _spoil_scenario==11 ) spoil_vector_by_neginf(x0); if( _spoil_scenario==12 ) spoil_vector_by_deleting_element(x0); real_1d_array s = "[1,1]"; if( _spoil_scenario==13 ) spoil_vector_by_nan(s); if( _spoil_scenario==14 ) spoil_vector_by_posinf(s); if( _spoil_scenario==15 ) spoil_vector_by_neginf(s); if( _spoil_scenario==16 ) spoil_vector_by_deleting_element(s); real_1d_array x; minqpstate state; minqpreport rep; // create solver, set quadratic/linear terms minqpcreate(2, state); minqpsetquadraticterm(state, a); minqpsetlinearterm(state, b); minqpsetstartingpoint(state, x0); // Set scale of the parameters. // It is strongly recommended that you set scale of your variables. 
// Knowing their scales is essential for evaluation of stopping criteria // and for preconditioning of the algorithm steps. // You can find more information on scaling at http://www.alglib.net/optimization/scaling.php minqpsetscale(state, s); // // Solve problem with QuickQP solver. // // This solver is intended for medium and large-scale problems with box // constraints (general linear constraints are not supported), but it can // also be efficiently used on unconstrained problems. // // Default stopping criteria are used, Newton phase is active. // minqpsetalgoquickqp(state, 0.0, 0.0, 0.0, 0, true); minqpoptimize(state); minqpresults(state, x, rep); _TestResult = _TestResult && doc_test_real_vector(x, "[3,2]", 0.005); // // Solve problem with BLEIC-based QP solver. // // This solver is intended for problems with moderate (up to 50) number // of general linear constraints and unlimited number of box constraints. // Of course, unconstrained problems can be solved too. // // Default stopping criteria are used. // minqpsetalgobleic(state, 0.0, 0.0, 0.0, 0); minqpoptimize(state); minqpresults(state, x, rep); _TestResult = _TestResult && doc_test_real_vector(x, "[3,2]", 0.005); // // Solve problem with DENSE-AUL solver. // // This solver is optimized for problems with up to several thousands of // variables and large amount of general linear constraints. Problems with // less than 50 general linear constraints can be efficiently solved with // BLEIC, problems with box-only constraints can be solved with QuickQP. // However, DENSE-AUL will work in any (including unconstrained) case. // // Default stopping criteria are used. // minqpsetalgodenseaul(state, 1.0e-9, 1.0e+4, 5); minqpoptimize(state); minqpresults(state, x, rep); _TestResult = _TestResult && doc_test_real_vector(x, "[3,2]", 0.005); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && (_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "minqp_d_u1"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST minqp_d_bc1 // Bound constrained dense quadratic programming // _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<21; _spoil_scenario++) { try { // // This example demonstrates minimization of F(x0,x1) = x0^2 + x1^2 -6*x0 - 4*x1 // subject to bound constraints 0<=x0<=2.5, 0<=x1<=2.5 // // Exact solution is [x0,x1] = [2.5,2] // // We provide algorithm with starting point. With such small problem good starting // point is not really necessary, but with high-dimensional problem it can save us // a lot of time. // // Several QP solvers are tried: QuickQP, BLEIC, DENSE-AUL. // // IMPORTANT: this solver minimizes following function: // f(x) = 0.5*x'*A*x + b'*x. // Note that quadratic term has 0.5 before it. So if you want to minimize // quadratic function, you should rewrite it in such way that quadratic term // is multiplied by 0.5 too. // For example, our function is f(x)=x0^2+x1^2+..., but we rewrite it as // f(x) = 0.5*(2*x0^2+2*x1^2) + .... // and pass diag(2,2) as quadratic term - NOT diag(1,1)! 
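//
// NOTE (worked check): expanding the solver's objective with A=diag(2,2) and
// b=[-6,-4] confirms that the factor 0.5 is cancelled by doubling the diagonal:
//
//     0.5*x'*A*x + b'*x = 0.5*(2*x0^2 + 2*x1^2) - 6*x0 - 4*x1
//                       = x0^2 + x1^2 - 6*x0 - 4*x1
//
// which is exactly F(x0,x1) as stated above.
//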
// real_2d_array a = "[[2,0],[0,2]]"; if( _spoil_scenario==0 ) spoil_matrix_by_nan(a); if( _spoil_scenario==1 ) spoil_matrix_by_posinf(a); if( _spoil_scenario==2 ) spoil_matrix_by_neginf(a); if( _spoil_scenario==3 ) spoil_matrix_by_deleting_row(a); if( _spoil_scenario==4 ) spoil_matrix_by_deleting_col(a); real_1d_array b = "[-6,-4]"; if( _spoil_scenario==5 ) spoil_vector_by_nan(b); if( _spoil_scenario==6 ) spoil_vector_by_posinf(b); if( _spoil_scenario==7 ) spoil_vector_by_neginf(b); if( _spoil_scenario==8 ) spoil_vector_by_deleting_element(b); real_1d_array x0 = "[0,1]"; if( _spoil_scenario==9 ) spoil_vector_by_nan(x0); if( _spoil_scenario==10 ) spoil_vector_by_posinf(x0); if( _spoil_scenario==11 ) spoil_vector_by_neginf(x0); if( _spoil_scenario==12 ) spoil_vector_by_deleting_element(x0); real_1d_array s = "[1,1]"; if( _spoil_scenario==13 ) spoil_vector_by_nan(s); if( _spoil_scenario==14 ) spoil_vector_by_posinf(s); if( _spoil_scenario==15 ) spoil_vector_by_neginf(s); if( _spoil_scenario==16 ) spoil_vector_by_deleting_element(s); real_1d_array bndl = "[0.0,0.0]"; if( _spoil_scenario==17 ) spoil_vector_by_nan(bndl); if( _spoil_scenario==18 ) spoil_vector_by_deleting_element(bndl); real_1d_array bndu = "[2.5,2.5]"; if( _spoil_scenario==19 ) spoil_vector_by_nan(bndu); if( _spoil_scenario==20 ) spoil_vector_by_deleting_element(bndu); real_1d_array x; minqpstate state; minqpreport rep; // create solver, set quadratic/linear terms minqpcreate(2, state); minqpsetquadraticterm(state, a); minqpsetlinearterm(state, b); minqpsetstartingpoint(state, x0); minqpsetbc(state, bndl, bndu); // Set scale of the parameters. // It is strongly recommended that you set scale of your variables. // Knowing their scales is essential for evaluation of stopping criteria // and for preconditioning of the algorithm steps. // You can find more information on scaling at http://www.alglib.net/optimization/scaling.php minqpsetscale(state, s); // // Solve problem with QuickQP solver. // // This solver is intended for medium and large-scale problems with box // constraints (general linear constraints are not supported). // // Default stopping criteria are used, Newton phase is active. // minqpsetalgoquickqp(state, 0.0, 0.0, 0.0, 0, true); minqpoptimize(state); minqpresults(state, x, rep); _TestResult = _TestResult && doc_test_int(rep.terminationtype, 4); _TestResult = _TestResult && doc_test_real_vector(x, "[2.5,2]", 0.005); // // Solve problem with BLEIC-based QP solver. // // This solver is intended for problems with moderate (up to 50) number // of general linear constraints and unlimited number of box constraints. // // Default stopping criteria are used. // minqpsetalgobleic(state, 0.0, 0.0, 0.0, 0); minqpoptimize(state); minqpresults(state, x, rep); _TestResult = _TestResult && doc_test_real_vector(x, "[2.5,2]", 0.005); // // Solve problem with DENSE-AUL solver. // // This solver is optimized for problems with up to several thousands of // variables and large amount of general linear constraints. Problems with // less than 50 general linear constraints can be efficiently solved with // BLEIC, problems with box-only constraints can be solved with QuickQP. // However, DENSE-AUL will work in any (including unconstrained) case. // // Default stopping criteria are used. 
// minqpsetalgodenseaul(state, 1.0e-9, 1.0e+4, 5); minqpoptimize(state); minqpresults(state, x, rep); _TestResult = _TestResult && doc_test_real_vector(x, "[2.5,2]", 0.005); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && (_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "minqp_d_bc1"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST minqp_d_lc1 // Linearly constrained dense quadratic programming // _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<16; _spoil_scenario++) { try { // // This example demonstrates minimization of F(x0,x1) = x0^2 + x1^2 -6*x0 - 4*x1 // subject to linear constraint x0+x1<=2 // // Exact solution is [x0,x1] = [1.5,0.5] // // IMPORTANT: this solver minimizes following function: // f(x) = 0.5*x'*A*x + b'*x. // Note that quadratic term has 0.5 before it. So if you want to minimize // quadratic function, you should rewrite it in such way that quadratic term // is multiplied by 0.5 too. // For example, our function is f(x)=x0^2+x1^2+..., but we rewrite it as // f(x) = 0.5*(2*x0^2+2*x1^2) + .... // and pass diag(2,2) as quadratic term - NOT diag(1,1)! // real_2d_array a = "[[2,0],[0,2]]"; if( _spoil_scenario==0 ) spoil_matrix_by_nan(a); if( _spoil_scenario==1 ) spoil_matrix_by_posinf(a); if( _spoil_scenario==2 ) spoil_matrix_by_neginf(a); if( _spoil_scenario==3 ) spoil_matrix_by_deleting_row(a); if( _spoil_scenario==4 ) spoil_matrix_by_deleting_col(a); real_1d_array b = "[-6,-4]"; if( _spoil_scenario==5 ) spoil_vector_by_nan(b); if( _spoil_scenario==6 ) spoil_vector_by_posinf(b); if( _spoil_scenario==7 ) spoil_vector_by_neginf(b); if( _spoil_scenario==8 ) spoil_vector_by_deleting_element(b); real_1d_array s = "[1,1]"; if( _spoil_scenario==9 ) spoil_vector_by_nan(s); if( _spoil_scenario==10 ) spoil_vector_by_posinf(s); if( _spoil_scenario==11 ) spoil_vector_by_neginf(s); if( _spoil_scenario==12 ) spoil_vector_by_deleting_element(s); real_2d_array c = "[[1.0,1.0,2.0]]"; if( _spoil_scenario==13 ) spoil_matrix_by_nan(c); if( _spoil_scenario==14 ) spoil_matrix_by_posinf(c); if( _spoil_scenario==15 ) spoil_matrix_by_neginf(c); integer_1d_array ct = "[-1]"; real_1d_array x; minqpstate state; minqpreport rep; // create solver, set quadratic/linear terms minqpcreate(2, state); minqpsetquadraticterm(state, a); minqpsetlinearterm(state, b); minqpsetlc(state, c, ct); // Set scale of the parameters. // It is strongly recommended that you set scale of your variables. // Knowing their scales is essential for evaluation of stopping criteria // and for preconditioning of the algorithm steps. // You can find more information on scaling at http://www.alglib.net/optimization/scaling.php minqpsetscale(state, s); // // Solve problem with BLEIC-based QP solver. // // This solver is intended for problems with moderate (up to 50) number // of general linear constraints and unlimited number of box constraints. // // Default stopping criteria are used. // minqpsetalgobleic(state, 0.0, 0.0, 0.0, 0); minqpoptimize(state); minqpresults(state, x, rep); _TestResult = _TestResult && doc_test_real_vector(x, "[1.500,0.500]", 0.05); // // Solve problem with DENSE-AUL solver. // // This solver is optimized for problems with up to several thousands of // variables and large amount of general linear constraints. Problems with // less than 50 general linear constraints can be efficiently solved with // BLEIC, problems with box-only constraints can be solved with QuickQP. 
// However, DENSE-AUL will work in any (including unconstrained) case. // // Default stopping criteria are used. // minqpsetalgodenseaul(state, 1.0e-9, 1.0e+4, 5); minqpoptimize(state); minqpresults(state, x, rep); _TestResult = _TestResult && doc_test_real_vector(x, "[1.500,0.500]", 0.05); // // Solve problem with QuickQP solver. // // This solver is intended for medium and large-scale problems with box // constraints, and... // // ...Oops! It does not support general linear constraints, -5 returned as completion code! // minqpsetalgoquickqp(state, 0.0, 0.0, 0.0, 0, true); minqpoptimize(state); minqpresults(state, x, rep); _TestResult = _TestResult && doc_test_int(rep.terminationtype, -5); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && (_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "minqp_d_lc1"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST minqp_d_u2 // Unconstrained sparse quadratic programming // _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<12; _spoil_scenario++) { try { // // This example demonstrates minimization of F(x0,x1) = x0^2 + x1^2 -6*x0 - 4*x1, // with quadratic term given by sparse matrix structure. // // Exact solution is [x0,x1] = [3,2] // // We provide algorithm with starting point, although in this case // (dense matrix, no constraints) it can work without such information. // // IMPORTANT: this solver minimizes following function: // f(x) = 0.5*x'*A*x + b'*x. // Note that quadratic term has 0.5 before it. So if you want to minimize // quadratic function, you should rewrite it in such way that quadratic term // is multiplied by 0.5 too. // // For example, our function is f(x)=x0^2+x1^2+..., but we rewrite it as // f(x) = 0.5*(2*x0^2+2*x1^2) + .... // and pass diag(2,2) as quadratic term - NOT diag(1,1)! // sparsematrix a; real_1d_array b = "[-6,-4]"; if( _spoil_scenario==0 ) spoil_vector_by_nan(b); if( _spoil_scenario==1 ) spoil_vector_by_posinf(b); if( _spoil_scenario==2 ) spoil_vector_by_neginf(b); if( _spoil_scenario==3 ) spoil_vector_by_deleting_element(b); real_1d_array x0 = "[0,1]"; if( _spoil_scenario==4 ) spoil_vector_by_nan(x0); if( _spoil_scenario==5 ) spoil_vector_by_posinf(x0); if( _spoil_scenario==6 ) spoil_vector_by_neginf(x0); if( _spoil_scenario==7 ) spoil_vector_by_deleting_element(x0); real_1d_array s = "[1,1]"; if( _spoil_scenario==8 ) spoil_vector_by_nan(s); if( _spoil_scenario==9 ) spoil_vector_by_posinf(s); if( _spoil_scenario==10 ) spoil_vector_by_neginf(s); if( _spoil_scenario==11 ) spoil_vector_by_deleting_element(s); real_1d_array x; minqpstate state; minqpreport rep; // initialize sparsematrix structure sparsecreate(2, 2, 0, a); sparseset(a, 0, 0, 2.0); sparseset(a, 1, 1, 2.0); // create solver, set quadratic/linear terms minqpcreate(2, state); minqpsetquadratictermsparse(state, a, true); minqpsetlinearterm(state, b); minqpsetstartingpoint(state, x0); // Set scale of the parameters. // It is strongly recommended that you set scale of your variables. // Knowing their scales is essential for evaluation of stopping criteria // and for preconditioning of the algorithm steps. // You can find more information on scaling at http://www.alglib.net/optimization/scaling.php minqpsetscale(state, s); // // Solve problem with BLEIC-based QP solver. // // This solver is intended for problems with moderate (up to 50) number // of general linear constraints and unlimited number of box constraints. // It also supports sparse problems. 
// // Default stopping criteria are used. // minqpsetalgobleic(state, 0.0, 0.0, 0.0, 0); minqpoptimize(state); minqpresults(state, x, rep); _TestResult = _TestResult && doc_test_real_vector(x, "[3,2]", 0.005); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && (_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "minqp_d_u2"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST minqp_d_nonconvex // Nonconvex quadratic programming // _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<21; _spoil_scenario++) { try { // // This example demonstrates minimization of nonconvex function // F(x0,x1) = -(x0^2+x1^2) // subject to constraints x0,x1 in [1.0,2.0] // Exact solution is [x0,x1] = [2,2]. // // Non-convex problems are harder to solve than convex ones, and they // may have more than one local minimum. However, ALGLIB solvers may deal // with such problems (although they do not guarantee convergence to // global minimum). // // IMPORTANT: this solver minimizes following function: // f(x) = 0.5*x'*A*x + b'*x. // Note that quadratic term has 0.5 before it. So if you want to minimize // quadratic function, you should rewrite it in such way that quadratic term // is multiplied by 0.5 too. // // For example, our function is f(x)=-(x0^2+x1^2), but we rewrite it as // f(x) = 0.5*(-2*x0^2-2*x1^2) // and pass diag(-2,-2) as quadratic term - NOT diag(-1,-1)! // real_2d_array a = "[[-2,0],[0,-2]]"; if( _spoil_scenario==0 ) spoil_matrix_by_nan(a); if( _spoil_scenario==1 ) spoil_matrix_by_posinf(a); if( _spoil_scenario==2 ) spoil_matrix_by_neginf(a); if( _spoil_scenario==3 ) spoil_matrix_by_deleting_row(a); if( _spoil_scenario==4 ) spoil_matrix_by_deleting_col(a); real_1d_array x0 = "[1,1]"; if( _spoil_scenario==5 ) spoil_vector_by_nan(x0); if( _spoil_scenario==6 ) spoil_vector_by_posinf(x0); if( _spoil_scenario==7 ) spoil_vector_by_neginf(x0); if( _spoil_scenario==8 ) spoil_vector_by_deleting_element(x0); real_1d_array s = "[1,1]"; if( _spoil_scenario==9 ) spoil_vector_by_nan(s); if( _spoil_scenario==10 ) spoil_vector_by_posinf(s); if( _spoil_scenario==11 ) spoil_vector_by_neginf(s); if( _spoil_scenario==12 ) spoil_vector_by_deleting_element(s); real_1d_array bndl = "[1.0,1.0]"; if( _spoil_scenario==13 ) spoil_vector_by_nan(bndl); if( _spoil_scenario==14 ) spoil_vector_by_deleting_element(bndl); real_1d_array bndu = "[2.0,2.0]"; if( _spoil_scenario==15 ) spoil_vector_by_nan(bndu); if( _spoil_scenario==16 ) spoil_vector_by_deleting_element(bndu); real_1d_array x; minqpstate state; minqpreport rep; // create solver, set quadratic/linear terms, constraints minqpcreate(2, state); minqpsetquadraticterm(state, a); minqpsetstartingpoint(state, x0); minqpsetbc(state, bndl, bndu); // Set scale of the parameters. // It is strongly recommended that you set scale of your variables. // Knowing their scales is essential for evaluation of stopping criteria // and for preconditioning of the algorithm steps. // You can find more information on scaling at http://www.alglib.net/optimization/scaling.php minqpsetscale(state, s); // // Solve problem with BLEIC-based QP solver. // // This solver is intended for problems with moderate (up to 50) number // of general linear constraints and unlimited number of box constraints. // // It may solve non-convex problems as long as they are bounded from // below under constraints. // // Default stopping criteria are used. 
// minqpsetalgobleic(state, 0.0, 0.0, 0.0, 0); minqpoptimize(state); minqpresults(state, x, rep); _TestResult = _TestResult && doc_test_real_vector(x, "[2,2]", 0.005); // // Solve problem with DENSE-AUL solver. // // This solver is optimized for problems with up to several thousands of // variables and large amount of general linear constraints. Problems with // less than 50 general linear constraints can be efficiently solved with // BLEIC, problems with box-only constraints can be solved with QuickQP. // However, DENSE-AUL will work in any (including unconstrained) case. // // Algorithm convergence is guaranteed only for convex case, but you may // expect that it will work for non-convex problems too (because near the // solution they are locally convex). // // Default stopping criteria are used. // minqpsetalgodenseaul(state, 1.0e-9, 1.0e+4, 5); minqpoptimize(state); minqpresults(state, x, rep); _TestResult = _TestResult && doc_test_real_vector(x, "[2,2]", 0.005); // Hmm... this problem is bounded from below (has solution) only under constraints. // What if we remove them? // // You may see that BLEIC algorithm detects unboundedness of the problem, // -4 is returned as completion code. However, DENSE-AUL is unable to detect // such a situation and it will cycle forever (we do not test it here). real_1d_array nobndl = "[-inf,-inf]"; if( _spoil_scenario==17 ) spoil_vector_by_nan(nobndl); if( _spoil_scenario==18 ) spoil_vector_by_deleting_element(nobndl); real_1d_array nobndu = "[+inf,+inf]"; if( _spoil_scenario==19 ) spoil_vector_by_nan(nobndu); if( _spoil_scenario==20 ) spoil_vector_by_deleting_element(nobndu); minqpsetbc(state, nobndl, nobndu); minqpsetalgobleic(state, 0.0, 0.0, 0.0, 0); minqpoptimize(state); minqpresults(state, x, rep); _TestResult = _TestResult && doc_test_int(rep.terminationtype, -4); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && (_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "minqp_d_nonconvex"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST minnlc_d_inequality // Nonlinearly constrained optimization (inequality constraints) // _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<18; _spoil_scenario++) { try { // // This example demonstrates minimization of // // f(x0,x1) = -x0+x1 // // subject to boundary constraints // // x0>=0, x1>=0 // // and nonlinear inequality constraint // // x0^2 + x1^2 - 1 <= 0 // real_1d_array x0 = "[0,0]"; if( _spoil_scenario==0 ) spoil_vector_by_nan(x0); if( _spoil_scenario==1 ) spoil_vector_by_posinf(x0); if( _spoil_scenario==2 ) spoil_vector_by_neginf(x0); real_1d_array s = "[1,1]"; if( _spoil_scenario==3 ) spoil_vector_by_nan(s); if( _spoil_scenario==4 ) spoil_vector_by_posinf(s); if( _spoil_scenario==5 ) spoil_vector_by_neginf(s); double epsg = 0; if( _spoil_scenario==6 ) epsg = fp_nan; if( _spoil_scenario==7 ) epsg = fp_posinf; if( _spoil_scenario==8 ) epsg = fp_neginf; double epsf = 0; if( _spoil_scenario==9 ) epsf = fp_nan; if( _spoil_scenario==10 ) epsf = fp_posinf; if( _spoil_scenario==11 ) epsf = fp_neginf; double epsx = 0.000001; if( _spoil_scenario==12 ) epsx = fp_nan; if( _spoil_scenario==13 ) epsx = fp_posinf; if( _spoil_scenario==14 ) epsx = fp_neginf; ae_int_t maxits = 0; ae_int_t outerits = 5; ae_int_t updatefreq = 10; double rho = 1000; if( _spoil_scenario==15 ) rho = fp_nan; if( _spoil_scenario==16 ) rho = fp_posinf; if( _spoil_scenario==17 ) rho = fp_neginf; real_1d_array bndl = "[0,0]"; real_1d_array bndu 
= "[+inf,+inf]"; minnlcstate state; minnlcreport rep; real_1d_array x1; // // Create optimizer object, choose AUL algorithm and tune its settings: // * rho=1000 penalty coefficient // * outerits=5 number of outer iterations to tune Lagrange coefficients // * epsx=0.000001 stopping condition for inner iterations // * s=[1,1] all variables have unit scale // * exact low-rank preconditioner is used, updated after each 10 iterations // minnlccreate(2, x0, state); minnlcsetalgoaul(state, rho, outerits); minnlcsetcond(state, epsg, epsf, epsx, maxits); minnlcsetscale(state, s); minnlcsetprecexactlowrank(state, updatefreq); // // Set constraints: // // 1. boundary constraints are passed with minnlcsetbc() call // // 2. nonlinear constraints are more tricky - you can not "pack" general // nonlinear function into double precision array. That's why // minnlcsetnlc() does not accept constraints itself - only constraint // counts are passed: first parameter is number of equality constraints, // second one is number of inequality constraints. // // As for constraining functions - these functions are passed as part // of problem Jacobian (see below). // // NOTE: MinNLC optimizer supports arbitrary combination of boundary, general // linear and general nonlinear constraints. This example does not // show how to work with general linear constraints, but you can // easily find it in documentation on minnlcsetlc() function. // minnlcsetbc(state, bndl, bndu); minnlcsetnlc(state, 0, 1); // // Optimize and test results. // // Optimizer object accepts vector function and its Jacobian, with first // component (Jacobian row) being target function, and next components // (Jacobian rows) being nonlinear equality and inequality constraints. // // So, our vector function has form // // {f0,f1} = { -x0+x1 , x0^2+x1^2-1 } // // with Jacobian // // [ -1 +1 ] // J = [ ] // [ 2*x0 2*x1 ] // // with f0 being target function, f1 being constraining function. Number // of equality/inequality constraints is specified by minnlcsetnlc(), // with equality ones always being first, inequality ones being last. 
// alglib::minnlcoptimize(state, nlcfunc1_jac); minnlcresults(state, x1, rep); _TestResult = _TestResult && doc_test_real_vector(x1, "[1.0000,0.0000]", 0.005); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && (_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "minnlc_d_inequality"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST minnlc_d_equality // Nonlinearly constrained optimization (equality constraints) // _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<18; _spoil_scenario++) { try { // // This example demonstrates minimization of // // f(x0,x1) = -x0+x1 // // subject to nonlinear equality constraint // // x0^2 + x1^2 - 1 = 0 // real_1d_array x0 = "[0,0]"; if( _spoil_scenario==0 ) spoil_vector_by_nan(x0); if( _spoil_scenario==1 ) spoil_vector_by_posinf(x0); if( _spoil_scenario==2 ) spoil_vector_by_neginf(x0); real_1d_array s = "[1,1]"; if( _spoil_scenario==3 ) spoil_vector_by_nan(s); if( _spoil_scenario==4 ) spoil_vector_by_posinf(s); if( _spoil_scenario==5 ) spoil_vector_by_neginf(s); double epsg = 0; if( _spoil_scenario==6 ) epsg = fp_nan; if( _spoil_scenario==7 ) epsg = fp_posinf; if( _spoil_scenario==8 ) epsg = fp_neginf; double epsf = 0; if( _spoil_scenario==9 ) epsf = fp_nan; if( _spoil_scenario==10 ) epsf = fp_posinf; if( _spoil_scenario==11 ) epsf = fp_neginf; double epsx = 0.000001; if( _spoil_scenario==12 ) epsx = fp_nan; if( _spoil_scenario==13 ) epsx = fp_posinf; if( _spoil_scenario==14 ) epsx = fp_neginf; ae_int_t maxits = 0; ae_int_t outerits = 5; ae_int_t updatefreq = 10; double rho = 1000; if( _spoil_scenario==15 ) rho = fp_nan; if( _spoil_scenario==16 ) rho = fp_posinf; if( _spoil_scenario==17 ) rho = fp_neginf; minnlcstate state; minnlcreport rep; real_1d_array x1; // // Create optimizer object, choose AUL algorithm and tune its settings: // * rho=1000 penalty coefficient // * outerits=5 number of outer iterations to tune Lagrange coefficients // * epsx=0.000001 stopping condition for inner iterations // * s=[1,1] all variables have unit scale // * exact low-rank preconditioner is used, updated after each 10 iterations // minnlccreate(2, x0, state); minnlcsetalgoaul(state, rho, outerits); minnlcsetcond(state, epsg, epsf, epsx, maxits); minnlcsetscale(state, s); minnlcsetprecexactlowrank(state, updatefreq); // // Set constraints: // // Nonlinear constraints are tricky - you can not "pack" general // nonlinear function into double precision array. That's why // minnlcsetnlc() does not accept constraints itself - only constraint // counts are passed: first parameter is number of equality constraints, // second one is number of inequality constraints. // // As for constraining functions - these functions are passed as part // of problem Jacobian (see below). // // NOTE: MinNLC optimizer supports arbitrary combination of boundary, general // linear and general nonlinear constraints. This example does not // show how to work with general linear constraints, but you can // easily find it in documentation on minnlcsetbc() and // minnlcsetlc() functions. // minnlcsetnlc(state, 1, 0); // // Optimize and test results. // // Optimizer object accepts vector function and its Jacobian, with first // component (Jacobian row) being target function, and next components // (Jacobian rows) being nonlinear equality and inequality constraints. 
// // So, our vector function has form // // {f0,f1} = { -x0+x1 , x0^2+x1^2-1 } // // with Jacobian // // [ -1 +1 ] // J = [ ] // [ 2*x0 2*x1 ] // // with f0 being target function, f1 being constraining function. Number // of equality/inequality constraints is specified by minnlcsetnlc(), // with equality ones always being first, inequality ones being last. // alglib::minnlcoptimize(state, nlcfunc1_jac); minnlcresults(state, x1, rep); _TestResult = _TestResult && doc_test_real_vector(x1, "[0.70710,-0.70710]", 0.005); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && (_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "minnlc_d_equality"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST minnlc_d_mixed // Nonlinearly constrained optimization with mixed equality/inequality constraints // _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<18; _spoil_scenario++) { try { // // This example demonstrates minimization of // // f(x0,x1) = x0+x1 // // subject to nonlinear inequality constraint // // x0^2 + x1^2 - 1 <= 0 // // and nonlinear equality constraint // // x2-exp(x0) = 0 // real_1d_array x0 = "[0,0,0]"; if( _spoil_scenario==0 ) spoil_vector_by_nan(x0); if( _spoil_scenario==1 ) spoil_vector_by_posinf(x0); if( _spoil_scenario==2 ) spoil_vector_by_neginf(x0); real_1d_array s = "[1,1,1]"; if( _spoil_scenario==3 ) spoil_vector_by_nan(s); if( _spoil_scenario==4 ) spoil_vector_by_posinf(s); if( _spoil_scenario==5 ) spoil_vector_by_neginf(s); double epsg = 0; if( _spoil_scenario==6 ) epsg = fp_nan; if( _spoil_scenario==7 ) epsg = fp_posinf; if( _spoil_scenario==8 ) epsg = fp_neginf; double epsf = 0; if( _spoil_scenario==9 ) epsf = fp_nan; if( _spoil_scenario==10 ) epsf = fp_posinf; if( _spoil_scenario==11 ) epsf = fp_neginf; double epsx = 0.000001; if( _spoil_scenario==12 ) epsx = fp_nan; if( _spoil_scenario==13 ) epsx = fp_posinf; if( _spoil_scenario==14 ) epsx = fp_neginf; ae_int_t maxits = 0; ae_int_t outerits = 5; ae_int_t updatefreq = 10; double rho = 1000; if( _spoil_scenario==15 ) rho = fp_nan; if( _spoil_scenario==16 ) rho = fp_posinf; if( _spoil_scenario==17 ) rho = fp_neginf; minnlcstate state; minnlcreport rep; real_1d_array x1; // // Create optimizer object, choose AUL algorithm and tune its settings: // * rho=1000 penalty coefficient // * outerits=5 number of outer iterations to tune Lagrange coefficients // * epsx=0.000001 stopping condition for inner iterations // * s=[1,1] all variables have unit scale // * exact low-rank preconditioner is used, updated after each 10 iterations // * upper limit on step length is specified (to avoid probing locations where exp() is large) // minnlccreate(3, x0, state); minnlcsetalgoaul(state, rho, outerits); minnlcsetcond(state, epsg, epsf, epsx, maxits); minnlcsetscale(state, s); minnlcsetprecexactlowrank(state, updatefreq); minnlcsetstpmax(state, 10.0); // // Set constraints: // // Nonlinear constraints are tricky - you can not "pack" general // nonlinear function into double precision array. That's why // minnlcsetnlc() does not accept constraints itself - only constraint // counts are passed: first parameter is number of equality constraints, // second one is number of inequality constraints. // // As for constraining functions - these functions are passed as part // of problem Jacobian (see below). // // NOTE: MinNLC optimizer supports arbitrary combination of boundary, general // linear and general nonlinear constraints. 
This example does not // show how to work with boundary or general linear constraints, but you // can easily find it in documentation on minnlcsetbc() and // minnlcsetlc() functions. // minnlcsetnlc(state, 1, 1); // // Optimize and test results. // // Optimizer object accepts vector function and its Jacobian, with first // component (Jacobian row) being target function, and next components // (Jacobian rows) being nonlinear equality and inequality constraints. // // So, our vector function has form // // {f0,f1,f2} = { x0+x1 , x2-exp(x0) , x0^2+x1^2-1 } // // with Jacobian // // [ +1 +1 0 ] // J = [-exp(x0) 0 1 ] // [ 2*x0 2*x1 0 ] // // with f0 being target function, f1 being equality constraint "f1=0", // f2 being inequality constraint "f2<=0". Number of equality/inequality // constraints is specified by minnlcsetnlc(), with equality ones always // being first, inequality ones being last. // alglib::minnlcoptimize(state, nlcfunc2_jac); minnlcresults(state, x1, rep); _TestResult = _TestResult && doc_test_real_vector(x1, "[-0.70710,-0.70710,0.49306]", 0.005); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && (_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "minnlc_d_mixed"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST minbc_d_1 // Nonlinear optimization with box constraints // _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<16; _spoil_scenario++) { try { // // This example demonstrates minimization of f(x,y) = 100*(x+3)^4+(y-3)^4 // subject to bound constraints -1<=x<=+1, -1<=y<=+1, using MinBC optimizer. // real_1d_array x = "[0,0]"; if( _spoil_scenario==0 ) spoil_vector_by_nan(x); if( _spoil_scenario==1 ) spoil_vector_by_posinf(x); if( _spoil_scenario==2 ) spoil_vector_by_neginf(x); real_1d_array bndl = "[-1,-1]"; if( _spoil_scenario==3 ) spoil_vector_by_nan(bndl); if( _spoil_scenario==4 ) spoil_vector_by_deleting_element(bndl); real_1d_array bndu = "[+1,+1]"; if( _spoil_scenario==5 ) spoil_vector_by_nan(bndu); if( _spoil_scenario==6 ) spoil_vector_by_deleting_element(bndu); minbcstate state; minbcreport rep; // // These variables define stopping conditions for the optimizer. // // We use very simple condition - |g|<=epsg // double epsg = 0.000001; if( _spoil_scenario==7 ) epsg = fp_nan; if( _spoil_scenario==8 ) epsg = fp_posinf; if( _spoil_scenario==9 ) epsg = fp_neginf; double epsf = 0; if( _spoil_scenario==10 ) epsf = fp_nan; if( _spoil_scenario==11 ) epsf = fp_posinf; if( _spoil_scenario==12 ) epsf = fp_neginf; double epsx = 0; if( _spoil_scenario==13 ) epsx = fp_nan; if( _spoil_scenario==14 ) epsx = fp_posinf; if( _spoil_scenario==15 ) epsx = fp_neginf; ae_int_t maxits = 0; // // Now we are ready to actually optimize something: // * first we create optimizer // * we add boundary constraints // * we tune stopping conditions // * and, finally, optimize and obtain results... 
// minbccreate(x, state); minbcsetbc(state, bndl, bndu); minbcsetcond(state, epsg, epsf, epsx, maxits); alglib::minbcoptimize(state, function1_grad); minbcresults(state, x, rep); // // ...and evaluate these results // _TestResult = _TestResult && doc_test_int(rep.terminationtype, 4); _TestResult = _TestResult && doc_test_real_vector(x, "[-1,1]", 0.005); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && (_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "minbc_d_1"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST minbc_numdiff // Nonlinear optimization with bound constraints and numerical differentiation // _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<19; _spoil_scenario++) { try { // // This example demonstrates minimization of f(x,y) = 100*(x+3)^4+(y-3)^4 // subject to bound constraints -1<=x<=+1, -1<=y<=+1, using MinBC optimizer. // real_1d_array x = "[0,0]"; if( _spoil_scenario==0 ) spoil_vector_by_nan(x); if( _spoil_scenario==1 ) spoil_vector_by_posinf(x); if( _spoil_scenario==2 ) spoil_vector_by_neginf(x); real_1d_array bndl = "[-1,-1]"; if( _spoil_scenario==3 ) spoil_vector_by_nan(bndl); if( _spoil_scenario==4 ) spoil_vector_by_deleting_element(bndl); real_1d_array bndu = "[+1,+1]"; if( _spoil_scenario==5 ) spoil_vector_by_nan(bndu); if( _spoil_scenario==6 ) spoil_vector_by_deleting_element(bndu); minbcstate state; minbcreport rep; // // These variables define stopping conditions for the optimizer. // // We use very simple condition - |g|<=epsg // double epsg = 0.000001; if( _spoil_scenario==7 ) epsg = fp_nan; if( _spoil_scenario==8 ) epsg = fp_posinf; if( _spoil_scenario==9 ) epsg = fp_neginf; double epsf = 0; if( _spoil_scenario==10 ) epsf = fp_nan; if( _spoil_scenario==11 ) epsf = fp_posinf; if( _spoil_scenario==12 ) epsf = fp_neginf; double epsx = 0; if( _spoil_scenario==13 ) epsx = fp_nan; if( _spoil_scenario==14 ) epsx = fp_posinf; if( _spoil_scenario==15 ) epsx = fp_neginf; ae_int_t maxits = 0; // // This variable contains differentiation step // double diffstep = 1.0e-6; if( _spoil_scenario==16 ) diffstep = fp_nan; if( _spoil_scenario==17 ) diffstep = fp_posinf; if( _spoil_scenario==18 ) diffstep = fp_neginf; // // Now we are ready to actually optimize something: // * first we create optimizer // * we add boundary constraints // * we tune stopping conditions // * and, finally, optimize and obtain results... // minbccreatef(x, diffstep, state); minbcsetbc(state, bndl, bndu); minbcsetcond(state, epsg, epsf, epsx, maxits); alglib::minbcoptimize(state, function1_func); minbcresults(state, x, rep); // // ...and evaluate these results // _TestResult = _TestResult && doc_test_int(rep.terminationtype, 4); _TestResult = _TestResult && doc_test_real_vector(x, "[-1,1]", 0.005); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && (_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "minbc_numdiff"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST minns_d_unconstrained // Nonsmooth unconstrained optimization // _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<15; _spoil_scenario++) { try { // // This example demonstrates minimization of // // f(x0,x1) = 2*|x0|+|x1| // // using nonsmooth nonlinear optimizer. 
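//
// NOTE (illustrative sketch): the callback nsfunc1_jac used below is defined
// elsewhere in this file. Assuming the standard ALGLIB Jacobian callback
// signature, f(x0,x1) = 2*|x0|+|x1| and a valid (sub)gradient could be coded as:
//
//     void nsfunc1_jac(const real_1d_array &x, real_1d_array &fi, real_2d_array &jac, void *ptr)
//     {
//         fi[0] = 2*fabs(x[0]) + fabs(x[1]);
//         // at x[i]==0 any subgradient component from the valid range may be returned; 0 is used here
//         jac[0][0] = 2*(x[0]>0 ? +1.0 : (x[0]<0 ? -1.0 : 0.0));
//         jac[0][1] =   (x[1]>0 ? +1.0 : (x[1]<0 ? -1.0 : 0.0));
//     }
//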
// real_1d_array x0 = "[1,1]"; if( _spoil_scenario==0 ) spoil_vector_by_nan(x0); if( _spoil_scenario==1 ) spoil_vector_by_posinf(x0); if( _spoil_scenario==2 ) spoil_vector_by_neginf(x0); real_1d_array s = "[1,1]"; if( _spoil_scenario==3 ) spoil_vector_by_nan(s); if( _spoil_scenario==4 ) spoil_vector_by_posinf(s); if( _spoil_scenario==5 ) spoil_vector_by_neginf(s); double epsx = 0.00001; if( _spoil_scenario==6 ) epsx = fp_nan; if( _spoil_scenario==7 ) epsx = fp_posinf; if( _spoil_scenario==8 ) epsx = fp_neginf; double radius = 0.1; if( _spoil_scenario==9 ) radius = fp_nan; if( _spoil_scenario==10 ) radius = fp_posinf; if( _spoil_scenario==11 ) radius = fp_neginf; double rho = 0.0; if( _spoil_scenario==12 ) rho = fp_nan; if( _spoil_scenario==13 ) rho = fp_posinf; if( _spoil_scenario==14 ) rho = fp_neginf; ae_int_t maxits = 0; minnsstate state; minnsreport rep; real_1d_array x1; // // Create optimizer object, choose AGS algorithm and tune its settings: // * radius=0.1 good initial value; will be automatically decreased later. // * rho=0.0 penalty coefficient for nonlinear constraints; can be zero // because we do not have such constraints // * epsx=0.000001 stopping conditions // * s=[1,1] all variables have unit scale // minnscreate(2, x0, state); minnssetalgoags(state, radius, rho); minnssetcond(state, epsx, maxits); minnssetscale(state, s); // // Optimize and test results. // // Optimizer object accepts vector function and its Jacobian, with first // component (Jacobian row) being target function, and next components // (Jacobian rows) being nonlinear equality and inequality constraints // (box/linear ones are passed separately by means of minnssetbc() and // minnssetlc() calls). // // If you do not have nonlinear constraints (exactly our situation), then // you will have one-component function vector and 1xN Jacobian matrix. // // So, our vector function has form // // {f0} = { 2*|x0|+|x1| } // // with Jacobian // // [ ] // J = [ 2*sign(x0) sign(x1) ] // [ ] // // NOTE: nonsmooth optimizer requires considerably more function // evaluations than smooth solver - about 2N times more. Using // numerical differentiation introduces additional (multiplicative) // 2N speedup. // // It means that if smooth optimizer WITH user-supplied gradient // needs 100 function evaluations to solve 50-dimensional problem, // then AGS solver with user-supplied gradient will need about 10.000 // function evaluations, and with numerical gradient about 1.000.000 // function evaluations will be performed. // // NOTE: AGS solver used by us can handle nonsmooth and nonconvex // optimization problems. It has convergence guarantees, i.e. it will // converge to stationary point of the function after running for some // time. // // However, it is important to remember that "stationary point" is not // equal to "solution". If your problem is convex, everything is OK. // But nonconvex optimization problems may have "flat spots" - large // areas where gradient is exactly zero, but function value is far away // from optimal. Such areas are stationary points too, and optimizer // may be trapped here. // // "Flat spots" are nonsmooth equivalent of the saddle points, but with // orders of magnitude worse properties - they may be quite large and // hard to avoid. All nonsmooth optimizers are prone to this kind of the // problem, because it is impossible to automatically distinguish "flat // spot" from true solution. 
// // This note is here to warn you that you should be very careful when // you solve nonsmooth optimization problems. Visual inspection of // results is essential. // alglib::minnsoptimize(state, nsfunc1_jac); minnsresults(state, x1, rep); _TestResult = _TestResult && doc_test_real_vector(x1, "[0.0000,0.0000]", 0.005); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && (_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "minns_d_unconstrained"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST minns_d_diff // Nonsmooth unconstrained optimization with numerical differentiation // _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<18; _spoil_scenario++) { try { // // This example demonstrates minimization of // // f(x0,x1) = 2*|x0|+|x1| // // using nonsmooth nonlinear optimizer with numerical // differentiation provided by ALGLIB. // // NOTE: nonsmooth optimizer requires considerably more function // evaluations than smooth solver - about 2N times more. Using // numerical differentiation introduces additional (multiplicative) // 2N slowdown. // // It means that if smooth optimizer WITH user-supplied gradient // needs 100 function evaluations to solve 50-dimensional problem, // then AGS solver with user-supplied gradient will need about 10.000 // function evaluations, and with numerical gradient about 1.000.000 // function evaluations will be performed. // real_1d_array x0 = "[1,1]"; if( _spoil_scenario==0 ) spoil_vector_by_nan(x0); if( _spoil_scenario==1 ) spoil_vector_by_posinf(x0); if( _spoil_scenario==2 ) spoil_vector_by_neginf(x0); real_1d_array s = "[1,1]"; if( _spoil_scenario==3 ) spoil_vector_by_nan(s); if( _spoil_scenario==4 ) spoil_vector_by_posinf(s); if( _spoil_scenario==5 ) spoil_vector_by_neginf(s); double epsx = 0.00001; if( _spoil_scenario==6 ) epsx = fp_nan; if( _spoil_scenario==7 ) epsx = fp_posinf; if( _spoil_scenario==8 ) epsx = fp_neginf; double diffstep = 0.000001; if( _spoil_scenario==9 ) diffstep = fp_nan; if( _spoil_scenario==10 ) diffstep = fp_posinf; if( _spoil_scenario==11 ) diffstep = fp_neginf; double radius = 0.1; if( _spoil_scenario==12 ) radius = fp_nan; if( _spoil_scenario==13 ) radius = fp_posinf; if( _spoil_scenario==14 ) radius = fp_neginf; double rho = 0.0; if( _spoil_scenario==15 ) rho = fp_nan; if( _spoil_scenario==16 ) rho = fp_posinf; if( _spoil_scenario==17 ) rho = fp_neginf; ae_int_t maxits = 0; minnsstate state; minnsreport rep; real_1d_array x1; // // Create optimizer object, choose AGS algorithm and tune its settings: // * radius=0.1 good initial value; will be automatically decreased later. // * rho=0.0 penalty coefficient for nonlinear constraints; can be zero // because we do not have such constraints // * epsx=0.000001 stopping conditions // * s=[1,1] all variables have unit scale // minnscreatef(2, x0, diffstep, state); minnssetalgoags(state, radius, rho); minnssetcond(state, epsx, maxits); minnssetscale(state, s); // // Optimize and test results. // // Optimizer object accepts vector function, with first component // being target function, and next components being nonlinear equality // and inequality constraints (box/linear ones are passed separately // by means of minnssetbc() and minnssetlc() calls). // // If you do not have nonlinear constraints (exactly our situation), then // you will have one-component function vector. 
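//
// NOTE (illustrative sketch): with numerical differentiation the callback only
// fills the function vector. Assuming the standard ALGLIB function-vector
// callback signature, nsfunc1_fvec used below could look like:
//
//     void nsfunc1_fvec(const real_1d_array &x, real_1d_array &fi, void *ptr)
//     {
//         // single component: f0 = 2*|x0|+|x1|
//         fi[0] = 2*fabs(x[0]) + fabs(x[1]);
//     }
//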
// // So, our vector function has form // // {f0} = { 2*|x0|+|x1| } // alglib::minnsoptimize(state, nsfunc1_fvec); minnsresults(state, x1, rep); _TestResult = _TestResult && doc_test_real_vector(x1, "[0.0000,0.0000]", 0.005); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && (_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "minns_d_diff"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST minns_d_bc // Nonsmooth box constrained optimization // _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<17; _spoil_scenario++) { try { // // This example demonstrates minimization of // // f(x0,x1) = 2*|x0|+|x1| // // subject to box constraints // // 1 <= x0 < +INF // -INF <= x1 < +INF // // using nonsmooth nonlinear optimizer. // real_1d_array x0 = "[1,1]"; if( _spoil_scenario==0 ) spoil_vector_by_nan(x0); if( _spoil_scenario==1 ) spoil_vector_by_posinf(x0); if( _spoil_scenario==2 ) spoil_vector_by_neginf(x0); real_1d_array s = "[1,1]"; if( _spoil_scenario==3 ) spoil_vector_by_nan(s); if( _spoil_scenario==4 ) spoil_vector_by_posinf(s); if( _spoil_scenario==5 ) spoil_vector_by_neginf(s); real_1d_array bndl = "[1,-inf]"; if( _spoil_scenario==6 ) spoil_vector_by_nan(bndl); real_1d_array bndu = "[+inf,+inf]"; if( _spoil_scenario==7 ) spoil_vector_by_nan(bndu); double epsx = 0.00001; if( _spoil_scenario==8 ) epsx = fp_nan; if( _spoil_scenario==9 ) epsx = fp_posinf; if( _spoil_scenario==10 ) epsx = fp_neginf; double radius = 0.1; if( _spoil_scenario==11 ) radius = fp_nan; if( _spoil_scenario==12 ) radius = fp_posinf; if( _spoil_scenario==13 ) radius = fp_neginf; double rho = 0.0; if( _spoil_scenario==14 ) rho = fp_nan; if( _spoil_scenario==15 ) rho = fp_posinf; if( _spoil_scenario==16 ) rho = fp_neginf; ae_int_t maxits = 0; minnsstate state; minnsreport rep; real_1d_array x1; // // Create optimizer object, choose AGS algorithm and tune its settings: // * radius=0.1 good initial value; will be automatically decreased later. // * rho=0.0 penalty coefficient for nonlinear constraints; can be zero // because we do not have such constraints // * epsx=0.000001 stopping conditions // * s=[1,1] all variables have unit scale // minnscreate(2, x0, state); minnssetalgoags(state, radius, rho); minnssetcond(state, epsx, maxits); minnssetscale(state, s); // // Set box constraints. // // General linear constraints are set in similar way (see comments on // minnssetlc() function for more information). // // You may combine box, linear and nonlinear constraints in one optimization // problem. // minnssetbc(state, bndl, bndu); // // Optimize and test results. // // Optimizer object accepts vector function and its Jacobian, with first // component (Jacobian row) being target function, and next components // (Jacobian rows) being nonlinear equality and inequality constraints // (box/linear ones are passed separately by means of minnssetbc() and // minnssetlc() calls). // // If you do not have nonlinear constraints (exactly our situation), then // you will have one-component function vector and 1xN Jacobian matrix. // // So, our vector function has form // // {f0} = { 2*|x0|+|x1| } // // with Jacobian // // [ ] // J = [ 2*sign(x0) sign(x1) ] // [ ] // // NOTE: nonsmooth optimizer requires considerably more function // evaluations than smooth solver - about 2N times more. Using // numerical differentiation introduces additional (multiplicative) // 2N slowdown. 
// // It means that if smooth optimizer WITH user-supplied gradient // needs 100 function evaluations to solve 50-dimensional problem, // then AGS solver with user-supplied gradient will need about 10.000 // function evaluations, and with numerical gradient about 1.000.000 // function evaluations will be performed. // // NOTE: AGS solver used by us can handle nonsmooth and nonconvex // optimization problems. It has convergence guarantees, i.e. it will // converge to stationary point of the function after running for some // time. // // However, it is important to remember that "stationary point" is not // equal to "solution". If your problem is convex, everything is OK. // But nonconvex optimization problems may have "flat spots" - large // areas where gradient is exactly zero, but function value is far away // from optimal. Such areas are stationary points too, and optimizer // may be trapped here. // // "Flat spots" are nonsmooth equivalent of the saddle points, but with // orders of magnitude worse properties - they may be quite large and // hard to avoid. All nonsmooth optimizers are prone to this kind of the // problem, because it is impossible to automatically distinguish "flat // spot" from true solution. // // This note is here to warn you that you should be very careful when // you solve nonsmooth optimization problems. Visual inspection of // results is essential. // // alglib::minnsoptimize(state, nsfunc1_jac); minnsresults(state, x1, rep); _TestResult = _TestResult && doc_test_real_vector(x1, "[1.0000,0.0000]", 0.005); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && (_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "minns_d_bc"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST minns_d_nlc // Nonsmooth nonlinearly constrained optimization // _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<15; _spoil_scenario++) { try { // // This example demonstrates minimization of // // f(x0,x1) = 2*|x0|+|x1| // // subject to combination of equality and inequality constraints // // x0 = 1 // x1 >= -1 // // using nonsmooth nonlinear optimizer. Although these constraints // are linear, we treat them as general nonlinear ones in order to // demonstrate nonlinearly constrained optimization setup. // real_1d_array x0 = "[1,1]"; if( _spoil_scenario==0 ) spoil_vector_by_nan(x0); if( _spoil_scenario==1 ) spoil_vector_by_posinf(x0); if( _spoil_scenario==2 ) spoil_vector_by_neginf(x0); real_1d_array s = "[1,1]"; if( _spoil_scenario==3 ) spoil_vector_by_nan(s); if( _spoil_scenario==4 ) spoil_vector_by_posinf(s); if( _spoil_scenario==5 ) spoil_vector_by_neginf(s); double epsx = 0.00001; if( _spoil_scenario==6 ) epsx = fp_nan; if( _spoil_scenario==7 ) epsx = fp_posinf; if( _spoil_scenario==8 ) epsx = fp_neginf; double radius = 0.1; if( _spoil_scenario==9 ) radius = fp_nan; if( _spoil_scenario==10 ) radius = fp_posinf; if( _spoil_scenario==11 ) radius = fp_neginf; double rho = 50.0; if( _spoil_scenario==12 ) rho = fp_nan; if( _spoil_scenario==13 ) rho = fp_posinf; if( _spoil_scenario==14 ) rho = fp_neginf; ae_int_t maxits = 0; minnsstate state; minnsreport rep; real_1d_array x1; // // Create optimizer object, choose AGS algorithm and tune its settings: // * radius=0.1 good initial value; will be automatically decreased later. // * rho=50.0 penalty coefficient for nonlinear constraints. 
It is your // responsibility to choose good one - large enough that it // enforces constraints, but small enough in order to avoid // extreme slowdown due to ill-conditioning. // * epsx=0.000001 stopping conditions // * s=[1,1] all variables have unit scale // minnscreate(2, x0, state); minnssetalgoags(state, radius, rho); minnssetcond(state, epsx, maxits); minnssetscale(state, s); // // Set general nonlinear constraints. // // This part is more tricky than working with box/linear constraints - you // can not "pack" general nonlinear function into double precision array. // That's why minnssetnlc() does not accept constraints itself - only // constraint COUNTS are passed: first parameter is number of equality // constraints, second one is number of inequality constraints. // // As for constraining functions - these functions are passed as part // of problem Jacobian (see below). // // NOTE: MinNS optimizer supports arbitrary combination of boundary, general // linear and general nonlinear constraints. This example does not // show how to work with general linear constraints, but you can // easily find it in documentation on minnlcsetlc() function. // minnssetnlc(state, 1, 1); // // Optimize and test results. // // Optimizer object accepts vector function and its Jacobian, with first // component (Jacobian row) being target function, and next components // (Jacobian rows) being nonlinear equality and inequality constraints // (box/linear ones are passed separately by means of minnssetbc() and // minnssetlc() calls). // // Nonlinear equality constraints have form Gi(x)=0, inequality ones // have form Hi(x)<=0, so we may have to "normalize" constraints prior // to passing them to optimizer (right side is zero, constraints are // sorted, multiplied by -1 when needed). // // So, our vector function has form // // {f0,f1,f2} = { 2*|x0|+|x1|, x0-1, -x1-1 } // // with Jacobian // // [ 2*sign(x0) sign(x1) ] // J = [ 1 0 ] // [ 0 -1 ] // // which means that we have optimization problem // // min{f0} subject to f1=0, f2<=0 // // which is essentially same as // // min { 2*|x0|+|x1| } subject to x0=1, x1>=-1 // // NOTE: AGS solver used by us can handle nonsmooth and nonconvex // optimization problems. It has convergence guarantees, i.e. it will // converge to stationary point of the function after running for some // time. // // However, it is important to remember that "stationary point" is not // equal to "solution". If your problem is convex, everything is OK. // But nonconvex optimization problems may have "flat spots" - large // areas where gradient is exactly zero, but function value is far away // from optimal. Such areas are stationary points too, and optimizer // may be trapped here. // // "Flat spots" are nonsmooth equivalent of the saddle points, but with // orders of magnitude worse properties - they may be quite large and // hard to avoid. All nonsmooth optimizers are prone to this kind of the // problem, because it is impossible to automatically distinguish "flat // spot" from true solution. // // This note is here to warn you that you should be very careful when // you solve nonsmooth optimization problems. Visual inspection of // results is essential. 
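//
// NOTE (illustrative sketch): the callback nsfunc2_jac used below is defined
// elsewhere in this file. Assuming the standard ALGLIB Jacobian callback
// signature, the vector function {f0,f1,f2} = { 2*|x0|+|x1|, x0-1, -x1-1 }
// described above could be coded as:
//
//     void nsfunc2_jac(const real_1d_array &x, real_1d_array &fi, real_2d_array &jac, void *ptr)
//     {
//         fi[0] = 2*fabs(x[0]) + fabs(x[1]);   // target function
//         fi[1] = x[0] - 1;                    // equality constraint, f1=0
//         fi[2] = -x[1] - 1;                   // inequality constraint, f2<=0
//         jac[0][0] = 2*(x[0]>0 ? +1.0 : (x[0]<0 ? -1.0 : 0.0));
//         jac[0][1] =   (x[1]>0 ? +1.0 : (x[1]<0 ? -1.0 : 0.0));
//         jac[1][0] = 1;  jac[1][1] = 0;
//         jac[2][0] = 0;  jac[2][1] = -1;
//     }
//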
// alglib::minnsoptimize(state, nsfunc2_jac); minnsresults(state, x1, rep); _TestResult = _TestResult && doc_test_real_vector(x1, "[1.0000,0.0000]", 0.005); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && (_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "minns_d_nlc"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST minlm_d_v // Nonlinear least squares optimization using function vector only // _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<6; _spoil_scenario++) { try { // // This example demonstrates minimization of F(x0,x1) = f0^2+f1^2, where // // f0(x0,x1) = 10*(x0+3)^2 // f1(x0,x1) = (x1-3)^2 // // using "V" mode of the Levenberg-Marquardt optimizer. // // Optimization algorithm uses: // * function vector f[] = {f1,f2} // // No other information (Jacobian, gradient, etc.) is needed. // real_1d_array x = "[0,0]"; if( _spoil_scenario==0 ) spoil_vector_by_nan(x); if( _spoil_scenario==1 ) spoil_vector_by_posinf(x); if( _spoil_scenario==2 ) spoil_vector_by_neginf(x); double epsx = 0.0000000001; if( _spoil_scenario==3 ) epsx = fp_nan; if( _spoil_scenario==4 ) epsx = fp_posinf; if( _spoil_scenario==5 ) epsx = fp_neginf; ae_int_t maxits = 0; minlmstate state; minlmreport rep; minlmcreatev(2, x, 0.0001, state); minlmsetcond(state, epsx, maxits); alglib::minlmoptimize(state, function1_fvec); minlmresults(state, x, rep); _TestResult = _TestResult && doc_test_real_vector(x, "[-3,+3]", 0.005); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && (_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "minlm_d_v"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST minlm_d_vj // Nonlinear least squares optimization using function vector and Jacobian // _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<6; _spoil_scenario++) { try { // // This example demonstrates minimization of F(x0,x1) = f0^2+f1^2, where // // f0(x0,x1) = 10*(x0+3)^2 // f1(x0,x1) = (x1-3)^2 // // using "VJ" mode of the Levenberg-Marquardt optimizer. // // Optimization algorithm uses: // * function vector f[] = {f1,f2} // * Jacobian matrix J = {dfi/dxj}. // real_1d_array x = "[0,0]"; if( _spoil_scenario==0 ) spoil_vector_by_nan(x); if( _spoil_scenario==1 ) spoil_vector_by_posinf(x); if( _spoil_scenario==2 ) spoil_vector_by_neginf(x); double epsx = 0.0000000001; if( _spoil_scenario==3 ) epsx = fp_nan; if( _spoil_scenario==4 ) epsx = fp_posinf; if( _spoil_scenario==5 ) epsx = fp_neginf; ae_int_t maxits = 0; minlmstate state; minlmreport rep; minlmcreatevj(2, x, state); minlmsetcond(state, epsx, maxits); alglib::minlmoptimize(state, function1_fvec, function1_jac); minlmresults(state, x, rep); _TestResult = _TestResult && doc_test_real_vector(x, "[-3,+3]", 0.005); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && (_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "minlm_d_vj"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST minlm_d_fgh // Nonlinear Hessian-based optimization for general functions // _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<6; _spoil_scenario++) { try { // // This example demonstrates minimization of F(x0,x1) = 100*(x0+3)^4+(x1-3)^4 // using "FGH" mode of the Levenberg-Marquardt optimizer. // // F is treated like a monolitic function without internal structure, // i.e. 
we do NOT represent it as a sum of squares. // // Optimization algorithm uses: // * function value F(x0,x1) // * gradient G={dF/dxi} // * Hessian H={d2F/(dxi*dxj)} // real_1d_array x = "[0,0]"; if( _spoil_scenario==0 ) spoil_vector_by_nan(x); if( _spoil_scenario==1 ) spoil_vector_by_posinf(x); if( _spoil_scenario==2 ) spoil_vector_by_neginf(x); double epsx = 0.0000000001; if( _spoil_scenario==3 ) epsx = fp_nan; if( _spoil_scenario==4 ) epsx = fp_posinf; if( _spoil_scenario==5 ) epsx = fp_neginf; ae_int_t maxits = 0; minlmstate state; minlmreport rep; minlmcreatefgh(x, state); minlmsetcond(state, epsx, maxits); alglib::minlmoptimize(state, function1_func, function1_grad, function1_hess); minlmresults(state, x, rep); _TestResult = _TestResult && doc_test_real_vector(x, "[-3,+3]", 0.005); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && (_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "minlm_d_fgh"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST minlm_d_vb // Bound constrained nonlinear least squares optimization // printf("50/145\n"); _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<10; _spoil_scenario++) { try { // // This example demonstrates minimization of F(x0,x1) = f0^2+f1^2, where // // f0(x0,x1) = 10*(x0+3)^2 // f1(x0,x1) = (x1-3)^2 // // with boundary constraints // // -1 <= x0 <= +1 // -1 <= x1 <= +1 // // using "V" mode of the Levenberg-Marquardt optimizer. // // Optimization algorithm uses: // * function vector f[] = {f1,f2} // // No other information (Jacobian, gradient, etc.) is needed. // real_1d_array x = "[0,0]"; if( _spoil_scenario==0 ) spoil_vector_by_nan(x); if( _spoil_scenario==1 ) spoil_vector_by_posinf(x); if( _spoil_scenario==2 ) spoil_vector_by_neginf(x); real_1d_array bndl = "[-1,-1]"; if( _spoil_scenario==3 ) spoil_vector_by_nan(bndl); if( _spoil_scenario==4 ) spoil_vector_by_deleting_element(bndl); real_1d_array bndu = "[+1,+1]"; if( _spoil_scenario==5 ) spoil_vector_by_nan(bndu); if( _spoil_scenario==6 ) spoil_vector_by_deleting_element(bndu); double epsx = 0.0000000001; if( _spoil_scenario==7 ) epsx = fp_nan; if( _spoil_scenario==8 ) epsx = fp_posinf; if( _spoil_scenario==9 ) epsx = fp_neginf; ae_int_t maxits = 0; minlmstate state; minlmreport rep; minlmcreatev(2, x, 0.0001, state); minlmsetbc(state, bndl, bndu); minlmsetcond(state, epsx, maxits); alglib::minlmoptimize(state, function1_fvec); minlmresults(state, x, rep); _TestResult = _TestResult && doc_test_real_vector(x, "[-1,+1]", 0.005); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && (_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "minlm_d_vb"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST minlm_d_restarts // Efficient restarts of LM optimizer // _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<9; _spoil_scenario++) { try { // // This example demonstrates minimization of F(x0,x1) = f0^2+f1^2, where // // f0(x0,x1) = 10*(x0+3)^2 // f1(x0,x1) = (x1-3)^2 // // using several starting points and efficient restarts. 
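//
// NOTE: the callbacks function1_fvec and function2_fvec used by these LM
//       examples are defined elsewhere in this manual. A minimal sketch of
//       function1_fvec, consistent with f0=10*(x0+3)^2 and f1=(x1-3)^2 above:
//
//           void function1_fvec_sketch(const real_1d_array &x, real_1d_array &fi, void *ptr)
//           {
//               fi[0] = 10*pow(x[0]+3,2);
//               fi[1] = pow(x[1]-3,2);
//           }
//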
// real_1d_array x; double epsx = 0.0000000001; if( _spoil_scenario==0 ) epsx = fp_nan; if( _spoil_scenario==1 ) epsx = fp_posinf; if( _spoil_scenario==2 ) epsx = fp_neginf; ae_int_t maxits = 0; minlmstate state; minlmreport rep; // // create optimizer using minlmcreatev() // x = "[10,10]"; if( _spoil_scenario==3 ) spoil_vector_by_nan(x); if( _spoil_scenario==4 ) spoil_vector_by_posinf(x); if( _spoil_scenario==5 ) spoil_vector_by_neginf(x); minlmcreatev(2, x, 0.0001, state); minlmsetcond(state, epsx, maxits); alglib::minlmoptimize(state, function1_fvec); minlmresults(state, x, rep); _TestResult = _TestResult && doc_test_real_vector(x, "[-3,+3]", 0.005); // // restart optimizer using minlmrestartfrom() // // we can use different starting point, different function, // different stopping conditions, but problem size // must remain unchanged. // x = "[4,4]"; if( _spoil_scenario==6 ) spoil_vector_by_nan(x); if( _spoil_scenario==7 ) spoil_vector_by_posinf(x); if( _spoil_scenario==8 ) spoil_vector_by_neginf(x); minlmrestartfrom(state, x); alglib::minlmoptimize(state, function2_fvec); minlmresults(state, x, rep); _TestResult = _TestResult && doc_test_real_vector(x, "[0,1]", 0.005); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && (_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "minlm_d_restarts"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST minlm_t_1 // Nonlinear least squares optimization, FJ scheme (obsolete, but supported) // _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<6; _spoil_scenario++) { try { real_1d_array x = "[0,0]"; if( _spoil_scenario==0 ) spoil_vector_by_nan(x); if( _spoil_scenario==1 ) spoil_vector_by_posinf(x); if( _spoil_scenario==2 ) spoil_vector_by_neginf(x); double epsx = 0.0000000001; if( _spoil_scenario==3 ) epsx = fp_nan; if( _spoil_scenario==4 ) epsx = fp_posinf; if( _spoil_scenario==5 ) epsx = fp_neginf; ae_int_t maxits = 0; minlmstate state; minlmreport rep; minlmcreatefj(2, x, state); minlmsetcond(state, epsx, maxits); alglib::minlmoptimize(state, function1_func, function1_jac); minlmresults(state, x, rep); _TestResult = _TestResult && doc_test_real_vector(x, "[-3,+3]", 0.005); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && (_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "minlm_t_1"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST minlm_t_2 // Nonlinear least squares optimization, FGJ scheme (obsolete, but supported) // _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<6; _spoil_scenario++) { try { real_1d_array x = "[0,0]"; if( _spoil_scenario==0 ) spoil_vector_by_nan(x); if( _spoil_scenario==1 ) spoil_vector_by_posinf(x); if( _spoil_scenario==2 ) spoil_vector_by_neginf(x); double epsx = 0.0000000001; if( _spoil_scenario==3 ) epsx = fp_nan; if( _spoil_scenario==4 ) epsx = fp_posinf; if( _spoil_scenario==5 ) epsx = fp_neginf; ae_int_t maxits = 0; minlmstate state; minlmreport rep; minlmcreatefgj(2, x, state); minlmsetcond(state, epsx, maxits); alglib::minlmoptimize(state, function1_func, function1_grad, function1_jac); minlmresults(state, x, rep); _TestResult = _TestResult && doc_test_real_vector(x, "[-3,+3]", 0.005); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && (_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "minlm_t_2"); fflush(stdout); } _TotalResult = 
_TotalResult && _TestResult; // // TEST basestat_d_base // Basic functionality (moments, adev, median, percentile) // _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<6; _spoil_scenario++) { try { real_1d_array x = "[0,1,4,9,16,25,36,49,64,81]"; if( _spoil_scenario==0 ) spoil_vector_by_nan(x); if( _spoil_scenario==1 ) spoil_vector_by_posinf(x); if( _spoil_scenario==2 ) spoil_vector_by_neginf(x); double mean; double variance; double skewness; double kurtosis; double adev; double p; double v; // // Here we demonstrate calculation of sample moments // (mean, variance, skewness, kurtosis) // samplemoments(x, mean, variance, skewness, kurtosis); _TestResult = _TestResult && doc_test_real(mean, 28.5, 0.01); _TestResult = _TestResult && doc_test_real(variance, 801.1667, 0.01); _TestResult = _TestResult && doc_test_real(skewness, 0.5751, 0.01); _TestResult = _TestResult && doc_test_real(kurtosis, -1.2666, 0.01); // // Average deviation // sampleadev(x, adev); _TestResult = _TestResult && doc_test_real(adev, 23.2, 0.01); // // Median and percentile // samplemedian(x, v); _TestResult = _TestResult && doc_test_real(v, 20.5, 0.01); p = 0.5; if( _spoil_scenario==3 ) p = fp_nan; if( _spoil_scenario==4 ) p = fp_posinf; if( _spoil_scenario==5 ) p = fp_neginf; samplepercentile(x, p, v); _TestResult = _TestResult && doc_test_real(v, 20.5, 0.01); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && (_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "basestat_d_base"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST basestat_d_c2 // Correlation (covariance) between two random variables // _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<10; _spoil_scenario++) { try { // // We have two samples - x and y, and want to measure dependency between them // real_1d_array x = "[0,1,4,9,16,25,36,49,64,81]"; if( _spoil_scenario==0 ) spoil_vector_by_nan(x); if( _spoil_scenario==1 ) spoil_vector_by_posinf(x); if( _spoil_scenario==2 ) spoil_vector_by_neginf(x); if( _spoil_scenario==3 ) spoil_vector_by_adding_element(x); if( _spoil_scenario==4 ) spoil_vector_by_deleting_element(x); real_1d_array y = "[0,1,2,3,4,5,6,7,8,9]"; if( _spoil_scenario==5 ) spoil_vector_by_nan(y); if( _spoil_scenario==6 ) spoil_vector_by_posinf(y); if( _spoil_scenario==7 ) spoil_vector_by_neginf(y); if( _spoil_scenario==8 ) spoil_vector_by_adding_element(y); if( _spoil_scenario==9 ) spoil_vector_by_deleting_element(y); double v; // // Three dependency measures are calculated: // * covariation // * Pearson correlation // * Spearman rank correlation // v = cov2(x, y); _TestResult = _TestResult && doc_test_real(v, 82.5, 0.001); v = pearsoncorr2(x, y); _TestResult = _TestResult && doc_test_real(v, 0.9627, 0.001); v = spearmancorr2(x, y); _TestResult = _TestResult && doc_test_real(v, 1.000, 0.001); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && (_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "basestat_d_c2"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST basestat_d_cm // Correlation (covariance) between components of random vector // _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<3; _spoil_scenario++) { try { // // X is a sample matrix: // * I-th row corresponds to I-th observation // * J-th column corresponds to J-th variable // real_2d_array x = "[[1,0,1],[1,1,0],[-1,1,0],[-2,-1,1],[-1,0,9]]"; if( _spoil_scenario==0 ) 
spoil_matrix_by_nan(x); if( _spoil_scenario==1 ) spoil_matrix_by_posinf(x); if( _spoil_scenario==2 ) spoil_matrix_by_neginf(x); real_2d_array c; // // Three dependency measures are calculated: // * covariation // * Pearson correlation // * Spearman rank correlation // // Result is stored into C, with C[i,j] equal to correlation // (covariance) between I-th and J-th variables of X. // covm(x, c); _TestResult = _TestResult && doc_test_real_matrix(c, "[[1.80,0.60,-1.40],[0.60,0.70,-0.80],[-1.40,-0.80,14.70]]", 0.01); pearsoncorrm(x, c); _TestResult = _TestResult && doc_test_real_matrix(c, "[[1.000,0.535,-0.272],[0.535,1.000,-0.249],[-0.272,-0.249,1.000]]", 0.01); spearmancorrm(x, c); _TestResult = _TestResult && doc_test_real_matrix(c, "[[1.000,0.556,-0.306],[0.556,1.000,-0.750],[-0.306,-0.750,1.000]]", 0.01); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && (_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "basestat_d_cm"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST basestat_d_cm2 // Correlation (covariance) between two random vectors // _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<6; _spoil_scenario++) { try { // // X and Y are sample matrices: // * I-th row corresponds to I-th observation // * J-th column corresponds to J-th variable // real_2d_array x = "[[1,0,1],[1,1,0],[-1,1,0],[-2,-1,1],[-1,0,9]]"; if( _spoil_scenario==0 ) spoil_matrix_by_nan(x); if( _spoil_scenario==1 ) spoil_matrix_by_posinf(x); if( _spoil_scenario==2 ) spoil_matrix_by_neginf(x); real_2d_array y = "[[2,3],[2,1],[-1,6],[-9,9],[7,1]]"; if( _spoil_scenario==3 ) spoil_matrix_by_nan(y); if( _spoil_scenario==4 ) spoil_matrix_by_posinf(y); if( _spoil_scenario==5 ) spoil_matrix_by_neginf(y); real_2d_array c; // // Three dependency measures are calculated: // * covariation // * Pearson correlation // * Spearman rank correlation // // Result is stored into C, with C[i,j] equal to correlation // (covariance) between I-th variable of X and J-th variable of Y. 
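//
// The calls below use the short ('friendly') form, which infers sizes from
// the arrays themselves. Each function also has a full form with explicit
// sizes; for the 5x3 matrix X and 5x2 matrix Y of this example it would
// read (usage sketch):
//
//     covm2(x, y, 5, 3, 2, c);   // npoints=5, 3 variables in X, 2 in Y
//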
// covm2(x, y, c); _TestResult = _TestResult && doc_test_real_matrix(c, "[[4.100,-3.250],[2.450,-1.500],[13.450,-5.750]]", 0.01); pearsoncorrm2(x, y, c); _TestResult = _TestResult && doc_test_real_matrix(c, "[[0.519,-0.699],[0.497,-0.518],[0.596,-0.433]]", 0.01); spearmancorrm2(x, y, c); _TestResult = _TestResult && doc_test_real_matrix(c, "[[0.541,-0.649],[0.216,-0.433],[0.433,-0.135]]", 0.01); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && (_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "basestat_d_cm2"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST basestat_t_base // Tests ability to detect errors in inputs // _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<34; _spoil_scenario++) { try { double mean; double variance; double skewness; double kurtosis; double adev; double p; double v; // // first, we test short form of functions // real_1d_array x1 = "[0,1,4,9,16,25,36,49,64,81]"; if( _spoil_scenario==0 ) spoil_vector_by_nan(x1); if( _spoil_scenario==1 ) spoil_vector_by_posinf(x1); if( _spoil_scenario==2 ) spoil_vector_by_neginf(x1); samplemoments(x1, mean, variance, skewness, kurtosis); real_1d_array x2 = "[0,1,4,9,16,25,36,49,64,81]"; if( _spoil_scenario==3 ) spoil_vector_by_nan(x2); if( _spoil_scenario==4 ) spoil_vector_by_posinf(x2); if( _spoil_scenario==5 ) spoil_vector_by_neginf(x2); sampleadev(x2, adev); real_1d_array x3 = "[0,1,4,9,16,25,36,49,64,81]"; if( _spoil_scenario==6 ) spoil_vector_by_nan(x3); if( _spoil_scenario==7 ) spoil_vector_by_posinf(x3); if( _spoil_scenario==8 ) spoil_vector_by_neginf(x3); samplemedian(x3, v); real_1d_array x4 = "[0,1,4,9,16,25,36,49,64,81]"; if( _spoil_scenario==9 ) spoil_vector_by_nan(x4); if( _spoil_scenario==10 ) spoil_vector_by_posinf(x4); if( _spoil_scenario==11 ) spoil_vector_by_neginf(x4); p = 0.5; if( _spoil_scenario==12 ) p = fp_nan; if( _spoil_scenario==13 ) p = fp_posinf; if( _spoil_scenario==14 ) p = fp_neginf; samplepercentile(x4, p, v); // // and then we test full form // real_1d_array x5 = "[0,1,4,9,16,25,36,49,64,81]"; if( _spoil_scenario==15 ) spoil_vector_by_nan(x5); if( _spoil_scenario==16 ) spoil_vector_by_posinf(x5); if( _spoil_scenario==17 ) spoil_vector_by_neginf(x5); if( _spoil_scenario==18 ) spoil_vector_by_deleting_element(x5); samplemoments(x5, 10, mean, variance, skewness, kurtosis); real_1d_array x6 = "[0,1,4,9,16,25,36,49,64,81]"; if( _spoil_scenario==19 ) spoil_vector_by_nan(x6); if( _spoil_scenario==20 ) spoil_vector_by_posinf(x6); if( _spoil_scenario==21 ) spoil_vector_by_neginf(x6); if( _spoil_scenario==22 ) spoil_vector_by_deleting_element(x6); sampleadev(x6, 10, adev); real_1d_array x7 = "[0,1,4,9,16,25,36,49,64,81]"; if( _spoil_scenario==23 ) spoil_vector_by_nan(x7); if( _spoil_scenario==24 ) spoil_vector_by_posinf(x7); if( _spoil_scenario==25 ) spoil_vector_by_neginf(x7); if( _spoil_scenario==26 ) spoil_vector_by_deleting_element(x7); samplemedian(x7, 10, v); real_1d_array x8 = "[0,1,4,9,16,25,36,49,64,81]"; if( _spoil_scenario==27 ) spoil_vector_by_nan(x8); if( _spoil_scenario==28 ) spoil_vector_by_posinf(x8); if( _spoil_scenario==29 ) spoil_vector_by_neginf(x8); if( _spoil_scenario==30 ) spoil_vector_by_deleting_element(x8); p = 0.5; if( _spoil_scenario==31 ) p = fp_nan; if( _spoil_scenario==32 ) p = fp_posinf; if( _spoil_scenario==33 ) p = fp_neginf; samplepercentile(x8, 10, p, v); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && 
(_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "basestat_t_base"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST basestat_t_covcorr // Tests ability to detect errors in inputs // _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<126; _spoil_scenario++) { try { double v; real_2d_array c; // // 2-sample short-form cov/corr are tested // real_1d_array x1 = "[0,1,4,9,16,25,36,49,64,81]"; if( _spoil_scenario==0 ) spoil_vector_by_nan(x1); if( _spoil_scenario==1 ) spoil_vector_by_posinf(x1); if( _spoil_scenario==2 ) spoil_vector_by_neginf(x1); if( _spoil_scenario==3 ) spoil_vector_by_adding_element(x1); if( _spoil_scenario==4 ) spoil_vector_by_deleting_element(x1); real_1d_array y1 = "[0,1,2,3,4,5,6,7,8,9]"; if( _spoil_scenario==5 ) spoil_vector_by_nan(y1); if( _spoil_scenario==6 ) spoil_vector_by_posinf(y1); if( _spoil_scenario==7 ) spoil_vector_by_neginf(y1); if( _spoil_scenario==8 ) spoil_vector_by_adding_element(y1); if( _spoil_scenario==9 ) spoil_vector_by_deleting_element(y1); v = cov2(x1, y1); real_1d_array x2 = "[0,1,4,9,16,25,36,49,64,81]"; if( _spoil_scenario==10 ) spoil_vector_by_nan(x2); if( _spoil_scenario==11 ) spoil_vector_by_posinf(x2); if( _spoil_scenario==12 ) spoil_vector_by_neginf(x2); if( _spoil_scenario==13 ) spoil_vector_by_adding_element(x2); if( _spoil_scenario==14 ) spoil_vector_by_deleting_element(x2); real_1d_array y2 = "[0,1,2,3,4,5,6,7,8,9]"; if( _spoil_scenario==15 ) spoil_vector_by_nan(y2); if( _spoil_scenario==16 ) spoil_vector_by_posinf(y2); if( _spoil_scenario==17 ) spoil_vector_by_neginf(y2); if( _spoil_scenario==18 ) spoil_vector_by_adding_element(y2); if( _spoil_scenario==19 ) spoil_vector_by_deleting_element(y2); v = pearsoncorr2(x2, y2); real_1d_array x3 = "[0,1,4,9,16,25,36,49,64,81]"; if( _spoil_scenario==20 ) spoil_vector_by_nan(x3); if( _spoil_scenario==21 ) spoil_vector_by_posinf(x3); if( _spoil_scenario==22 ) spoil_vector_by_neginf(x3); if( _spoil_scenario==23 ) spoil_vector_by_adding_element(x3); if( _spoil_scenario==24 ) spoil_vector_by_deleting_element(x3); real_1d_array y3 = "[0,1,2,3,4,5,6,7,8,9]"; if( _spoil_scenario==25 ) spoil_vector_by_nan(y3); if( _spoil_scenario==26 ) spoil_vector_by_posinf(y3); if( _spoil_scenario==27 ) spoil_vector_by_neginf(y3); if( _spoil_scenario==28 ) spoil_vector_by_adding_element(y3); if( _spoil_scenario==29 ) spoil_vector_by_deleting_element(y3); v = spearmancorr2(x3, y3); // // 2-sample full-form cov/corr are tested // real_1d_array x1a = "[0,1,4,9,16,25,36,49,64,81]"; if( _spoil_scenario==30 ) spoil_vector_by_nan(x1a); if( _spoil_scenario==31 ) spoil_vector_by_posinf(x1a); if( _spoil_scenario==32 ) spoil_vector_by_neginf(x1a); if( _spoil_scenario==33 ) spoil_vector_by_deleting_element(x1a); real_1d_array y1a = "[0,1,2,3,4,5,6,7,8,9]"; if( _spoil_scenario==34 ) spoil_vector_by_nan(y1a); if( _spoil_scenario==35 ) spoil_vector_by_posinf(y1a); if( _spoil_scenario==36 ) spoil_vector_by_neginf(y1a); if( _spoil_scenario==37 ) spoil_vector_by_deleting_element(y1a); v = cov2(x1a, y1a, 10); real_1d_array x2a = "[0,1,4,9,16,25,36,49,64,81]"; if( _spoil_scenario==38 ) spoil_vector_by_nan(x2a); if( _spoil_scenario==39 ) spoil_vector_by_posinf(x2a); if( _spoil_scenario==40 ) spoil_vector_by_neginf(x2a); if( _spoil_scenario==41 ) spoil_vector_by_deleting_element(x2a); real_1d_array y2a = "[0,1,2,3,4,5,6,7,8,9]"; if( _spoil_scenario==42 ) spoil_vector_by_nan(y2a); if( _spoil_scenario==43 ) spoil_vector_by_posinf(y2a); if( _spoil_scenario==44 ) 
spoil_vector_by_neginf(y2a); if( _spoil_scenario==45 ) spoil_vector_by_deleting_element(y2a); v = pearsoncorr2(x2a, y2a, 10); real_1d_array x3a = "[0,1,4,9,16,25,36,49,64,81]"; if( _spoil_scenario==46 ) spoil_vector_by_nan(x3a); if( _spoil_scenario==47 ) spoil_vector_by_posinf(x3a); if( _spoil_scenario==48 ) spoil_vector_by_neginf(x3a); if( _spoil_scenario==49 ) spoil_vector_by_deleting_element(x3a); real_1d_array y3a = "[0,1,2,3,4,5,6,7,8,9]"; if( _spoil_scenario==50 ) spoil_vector_by_nan(y3a); if( _spoil_scenario==51 ) spoil_vector_by_posinf(y3a); if( _spoil_scenario==52 ) spoil_vector_by_neginf(y3a); if( _spoil_scenario==53 ) spoil_vector_by_deleting_element(y3a); v = spearmancorr2(x3a, y3a, 10); // // vector short-form cov/corr are tested. // real_2d_array x4 = "[[1,0,1],[1,1,0],[-1,1,0],[-2,-1,1],[-1,0,9]]"; if( _spoil_scenario==54 ) spoil_matrix_by_nan(x4); if( _spoil_scenario==55 ) spoil_matrix_by_posinf(x4); if( _spoil_scenario==56 ) spoil_matrix_by_neginf(x4); covm(x4, c); real_2d_array x5 = "[[1,0,1],[1,1,0],[-1,1,0],[-2,-1,1],[-1,0,9]]"; if( _spoil_scenario==57 ) spoil_matrix_by_nan(x5); if( _spoil_scenario==58 ) spoil_matrix_by_posinf(x5); if( _spoil_scenario==59 ) spoil_matrix_by_neginf(x5); pearsoncorrm(x5, c); real_2d_array x6 = "[[1,0,1],[1,1,0],[-1,1,0],[-2,-1,1],[-1,0,9]]"; if( _spoil_scenario==60 ) spoil_matrix_by_nan(x6); if( _spoil_scenario==61 ) spoil_matrix_by_posinf(x6); if( _spoil_scenario==62 ) spoil_matrix_by_neginf(x6); spearmancorrm(x6, c); // // vector full-form cov/corr are tested. // real_2d_array x7 = "[[1,0,1],[1,1,0],[-1,1,0],[-2,-1,1],[-1,0,9]]"; if( _spoil_scenario==63 ) spoil_matrix_by_nan(x7); if( _spoil_scenario==64 ) spoil_matrix_by_posinf(x7); if( _spoil_scenario==65 ) spoil_matrix_by_neginf(x7); if( _spoil_scenario==66 ) spoil_matrix_by_deleting_row(x7); if( _spoil_scenario==67 ) spoil_matrix_by_deleting_col(x7); covm(x7, 5, 3, c); real_2d_array x8 = "[[1,0,1],[1,1,0],[-1,1,0],[-2,-1,1],[-1,0,9]]"; if( _spoil_scenario==68 ) spoil_matrix_by_nan(x8); if( _spoil_scenario==69 ) spoil_matrix_by_posinf(x8); if( _spoil_scenario==70 ) spoil_matrix_by_neginf(x8); if( _spoil_scenario==71 ) spoil_matrix_by_deleting_row(x8); if( _spoil_scenario==72 ) spoil_matrix_by_deleting_col(x8); pearsoncorrm(x8, 5, 3, c); real_2d_array x9 = "[[1,0,1],[1,1,0],[-1,1,0],[-2,-1,1],[-1,0,9]]"; if( _spoil_scenario==73 ) spoil_matrix_by_nan(x9); if( _spoil_scenario==74 ) spoil_matrix_by_posinf(x9); if( _spoil_scenario==75 ) spoil_matrix_by_neginf(x9); if( _spoil_scenario==76 ) spoil_matrix_by_deleting_row(x9); if( _spoil_scenario==77 ) spoil_matrix_by_deleting_col(x9); spearmancorrm(x9, 5, 3, c); // // cross-vector short-form cov/corr are tested. 
// real_2d_array x10 = "[[1,0,1],[1,1,0],[-1,1,0],[-2,-1,1],[-1,0,9]]"; if( _spoil_scenario==78 ) spoil_matrix_by_nan(x10); if( _spoil_scenario==79 ) spoil_matrix_by_posinf(x10); if( _spoil_scenario==80 ) spoil_matrix_by_neginf(x10); real_2d_array y10 = "[[2,3],[2,1],[-1,6],[-9,9],[7,1]]"; if( _spoil_scenario==81 ) spoil_matrix_by_nan(y10); if( _spoil_scenario==82 ) spoil_matrix_by_posinf(y10); if( _spoil_scenario==83 ) spoil_matrix_by_neginf(y10); covm2(x10, y10, c); real_2d_array x11 = "[[1,0,1],[1,1,0],[-1,1,0],[-2,-1,1],[-1,0,9]]"; if( _spoil_scenario==84 ) spoil_matrix_by_nan(x11); if( _spoil_scenario==85 ) spoil_matrix_by_posinf(x11); if( _spoil_scenario==86 ) spoil_matrix_by_neginf(x11); real_2d_array y11 = "[[2,3],[2,1],[-1,6],[-9,9],[7,1]]"; if( _spoil_scenario==87 ) spoil_matrix_by_nan(y11); if( _spoil_scenario==88 ) spoil_matrix_by_posinf(y11); if( _spoil_scenario==89 ) spoil_matrix_by_neginf(y11); pearsoncorrm2(x11, y11, c); real_2d_array x12 = "[[1,0,1],[1,1,0],[-1,1,0],[-2,-1,1],[-1,0,9]]"; if( _spoil_scenario==90 ) spoil_matrix_by_nan(x12); if( _spoil_scenario==91 ) spoil_matrix_by_posinf(x12); if( _spoil_scenario==92 ) spoil_matrix_by_neginf(x12); real_2d_array y12 = "[[2,3],[2,1],[-1,6],[-9,9],[7,1]]"; if( _spoil_scenario==93 ) spoil_matrix_by_nan(y12); if( _spoil_scenario==94 ) spoil_matrix_by_posinf(y12); if( _spoil_scenario==95 ) spoil_matrix_by_neginf(y12); spearmancorrm2(x12, y12, c); // // cross-vector full-form cov/corr are tested. // real_2d_array x13 = "[[1,0,1],[1,1,0],[-1,1,0],[-2,-1,1],[-1,0,9]]"; if( _spoil_scenario==96 ) spoil_matrix_by_nan(x13); if( _spoil_scenario==97 ) spoil_matrix_by_posinf(x13); if( _spoil_scenario==98 ) spoil_matrix_by_neginf(x13); if( _spoil_scenario==99 ) spoil_matrix_by_deleting_row(x13); if( _spoil_scenario==100 ) spoil_matrix_by_deleting_col(x13); real_2d_array y13 = "[[2,3],[2,1],[-1,6],[-9,9],[7,1]]"; if( _spoil_scenario==101 ) spoil_matrix_by_nan(y13); if( _spoil_scenario==102 ) spoil_matrix_by_posinf(y13); if( _spoil_scenario==103 ) spoil_matrix_by_neginf(y13); if( _spoil_scenario==104 ) spoil_matrix_by_deleting_row(y13); if( _spoil_scenario==105 ) spoil_matrix_by_deleting_col(y13); covm2(x13, y13, 5, 3, 2, c); real_2d_array x14 = "[[1,0,1],[1,1,0],[-1,1,0],[-2,-1,1],[-1,0,9]]"; if( _spoil_scenario==106 ) spoil_matrix_by_nan(x14); if( _spoil_scenario==107 ) spoil_matrix_by_posinf(x14); if( _spoil_scenario==108 ) spoil_matrix_by_neginf(x14); if( _spoil_scenario==109 ) spoil_matrix_by_deleting_row(x14); if( _spoil_scenario==110 ) spoil_matrix_by_deleting_col(x14); real_2d_array y14 = "[[2,3],[2,1],[-1,6],[-9,9],[7,1]]"; if( _spoil_scenario==111 ) spoil_matrix_by_nan(y14); if( _spoil_scenario==112 ) spoil_matrix_by_posinf(y14); if( _spoil_scenario==113 ) spoil_matrix_by_neginf(y14); if( _spoil_scenario==114 ) spoil_matrix_by_deleting_row(y14); if( _spoil_scenario==115 ) spoil_matrix_by_deleting_col(y14); pearsoncorrm2(x14, y14, 5, 3, 2, c); real_2d_array x15 = "[[1,0,1],[1,1,0],[-1,1,0],[-2,-1,1],[-1,0,9]]"; if( _spoil_scenario==116 ) spoil_matrix_by_nan(x15); if( _spoil_scenario==117 ) spoil_matrix_by_posinf(x15); if( _spoil_scenario==118 ) spoil_matrix_by_neginf(x15); if( _spoil_scenario==119 ) spoil_matrix_by_deleting_row(x15); if( _spoil_scenario==120 ) spoil_matrix_by_deleting_col(x15); real_2d_array y15 = "[[2,3],[2,1],[-1,6],[-9,9],[7,1]]"; if( _spoil_scenario==121 ) spoil_matrix_by_nan(y15); if( _spoil_scenario==122 ) spoil_matrix_by_posinf(y15); if( _spoil_scenario==123 ) spoil_matrix_by_neginf(y15); if( 
_spoil_scenario==124 ) spoil_matrix_by_deleting_row(y15); if( _spoil_scenario==125 ) spoil_matrix_by_deleting_col(y15); spearmancorrm2(x15, y15, 5, 3, 2, c); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && (_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "basestat_t_covcorr"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST linreg_d_basic // Linear regression used to build the very basic model and unpack coefficients // _TestResult = true; try { // // In this example we demonstrate linear fitting by f(x|a) = a*exp(0.5*x). // // We have: // * xy - matrix of basic function values (exp(0.5*x)) and expected values // real_2d_array xy = "[[0.606531,1.133719],[0.670320,1.306522],[0.740818,1.504604],[0.818731,1.554663],[0.904837,1.884638],[1.000000,2.072436],[1.105171,2.257285],[1.221403,2.534068],[1.349859,2.622017],[1.491825,2.897713],[1.648721,3.219371]]"; ae_int_t info; ae_int_t nvars; linearmodel model; lrreport rep; real_1d_array c; lrbuildz(xy, 11, 1, info, model, rep); _TestResult = _TestResult && doc_test_int(info, 1); lrunpack(model, c, nvars); _TestResult = _TestResult && doc_test_real_vector(c, "[1.98650,0.00000]", 0.00005); } catch(ap_error) { _TestResult = false; } if( !_TestResult) { printf("%-32s FAILED\n", "linreg_d_basic"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST filters_d_sma // SMA(k) filter // _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<3; _spoil_scenario++) { try { // // Here we demonstrate SMA(k) filtering for time series. // real_1d_array x = "[5,6,7,8]"; if( _spoil_scenario==0 ) spoil_vector_by_nan(x); if( _spoil_scenario==1 ) spoil_vector_by_posinf(x); if( _spoil_scenario==2 ) spoil_vector_by_neginf(x); // // Apply filter. // We should get [5, 5.5, 6.5, 7.5] as result // filtersma(x, 2); _TestResult = _TestResult && doc_test_real_vector(x, "[5,5.5,6.5,7.5]", 0.00005); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && (_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "filters_d_sma"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST filters_d_ema // EMA(alpha) filter // _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<3; _spoil_scenario++) { try { // // Here we demonstrate EMA(0.5) filtering for time series. // real_1d_array x = "[5,6,7,8]"; if( _spoil_scenario==0 ) spoil_vector_by_nan(x); if( _spoil_scenario==1 ) spoil_vector_by_posinf(x); if( _spoil_scenario==2 ) spoil_vector_by_neginf(x); // // Apply filter. // We should get [5, 5.5, 6.25, 7.125] as result // filterema(x, 0.5); _TestResult = _TestResult && doc_test_real_vector(x, "[5,5.5,6.25,7.125]", 0.00005); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && (_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "filters_d_ema"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST filters_d_lrma // LRMA(k) filter // _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<3; _spoil_scenario++) { try { // // Here we demonstrate LRMA(3) filtering for time series. // real_1d_array x = "[7,8,8,9,12,12]"; if( _spoil_scenario==0 ) spoil_vector_by_nan(x); if( _spoil_scenario==1 ) spoil_vector_by_posinf(x); if( _spoil_scenario==2 ) spoil_vector_by_neginf(x); // // Apply filter. 
// We should get [7.0000, 8.0000, 8.1667, 8.8333, 11.6667, 12.5000] as result // filterlrma(x, 3); _TestResult = _TestResult && doc_test_real_vector(x, "[7.0000,8.0000,8.1667,8.8333,11.6667,12.5000]", 0.00005); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && (_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "filters_d_lrma"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST mcpd_simple1 // Simple unconstrained MCPD model (no entry/exit states) // _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<6; _spoil_scenario++) { try { // // The very simple MCPD example // // We have a loan portfolio. Our loans can be in one of two states: // * normal loans ("good" ones) // * past due loans ("bad" ones) // // We assume that: // * loans can transition from any state to any other state. In // particular, past due loan can become "good" one at any moment // with same (fixed) probability. Not realistic, but it is toy example :) // * portfolio size does not change over time // // Thus, we have following model // state_new = P*state_old // where // ( p00 p01 ) // P = ( ) // ( p10 p11 ) // // We want to model transitions between these two states using MCPD // approach (Markov Chains for Proportional/Population Data), i.e. // to restore hidden transition matrix P using actual portfolio data. // We have: // * poportional data, i.e. proportion of loans in the normal and past // due states (not portfolio size measured in some currency, although // it is possible to work with population data too) // * two tracks, i.e. two sequences which describe portfolio // evolution from two different starting states: [1,0] (all loans // are "good") and [0.8,0.2] (only 80% of portfolio is in the "good" // state) // mcpdstate s; mcpdreport rep; real_2d_array p; real_2d_array track0 = "[[1.00000,0.00000],[0.95000,0.05000],[0.92750,0.07250],[0.91738,0.08263],[0.91282,0.08718]]"; if( _spoil_scenario==0 ) spoil_matrix_by_nan(track0); if( _spoil_scenario==1 ) spoil_matrix_by_posinf(track0); if( _spoil_scenario==2 ) spoil_matrix_by_neginf(track0); real_2d_array track1 = "[[0.80000,0.20000],[0.86000,0.14000],[0.88700,0.11300],[0.89915,0.10085]]"; if( _spoil_scenario==3 ) spoil_matrix_by_nan(track1); if( _spoil_scenario==4 ) spoil_matrix_by_posinf(track1); if( _spoil_scenario==5 ) spoil_matrix_by_neginf(track1); mcpdcreate(2, s); mcpdaddtrack(s, track0); mcpdaddtrack(s, track1); mcpdsolve(s); mcpdresults(s, p, rep); // // Hidden matrix P is equal to // ( 0.95 0.50 ) // ( ) // ( 0.05 0.50 ) // which means that "good" loans can become "bad" with 5% probability, // while "bad" loans will return to good state with 50% probability. // _TestResult = _TestResult && doc_test_real_matrix(p, "[[0.95,0.50],[0.05,0.50]]", 0.005); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && (_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "mcpd_simple1"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST mcpd_simple2 // Simple MCPD model (no entry/exit states) with equality constraints // _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<6; _spoil_scenario++) { try { // // Simple MCPD example // // We have a loan portfolio. 
Our loans can be in one of three states: // * normal loans // * past due loans // * charged off loans // // We assume that: // * normal loan can stay normal or become past due (but not charged off) // * past due loan can stay past due, become normal or charged off // * charged off loan will stay charged off for the rest of eternity // * portfolio size does not change over time // Not realistic, but it is toy example :) // // Thus, we have following model // state_new = P*state_old // where // ( p00 p01 ) // P = ( p10 p11 ) // ( p21 1 ) // i.e. four elements of P are known a priori. // // Although it is possible (given enough data) to In order to enforce // this property we set equality constraints on these elements. // // We want to model transitions between these two states using MCPD // approach (Markov Chains for Proportional/Population Data), i.e. // to restore hidden transition matrix P using actual portfolio data. // We have: // * poportional data, i.e. proportion of loans in the current and past // due states (not portfolio size measured in some currency, although // it is possible to work with population data too) // * two tracks, i.e. two sequences which describe portfolio // evolution from two different starting states: [1,0,0] (all loans // are "good") and [0.8,0.2,0.0] (only 80% of portfolio is in the "good" // state) // mcpdstate s; mcpdreport rep; real_2d_array p; real_2d_array track0 = "[[1.000000,0.000000,0.000000],[0.950000,0.050000,0.000000],[0.927500,0.060000,0.012500],[0.911125,0.061375,0.027500],[0.896256,0.060900,0.042844]]"; if( _spoil_scenario==0 ) spoil_matrix_by_nan(track0); if( _spoil_scenario==1 ) spoil_matrix_by_posinf(track0); if( _spoil_scenario==2 ) spoil_matrix_by_neginf(track0); real_2d_array track1 = "[[0.800000,0.200000,0.000000],[0.860000,0.090000,0.050000],[0.862000,0.065500,0.072500],[0.851650,0.059475,0.088875],[0.838805,0.057451,0.103744]]"; if( _spoil_scenario==3 ) spoil_matrix_by_nan(track1); if( _spoil_scenario==4 ) spoil_matrix_by_posinf(track1); if( _spoil_scenario==5 ) spoil_matrix_by_neginf(track1); mcpdcreate(3, s); mcpdaddtrack(s, track0); mcpdaddtrack(s, track1); mcpdaddec(s, 0, 2, 0.0); mcpdaddec(s, 1, 2, 0.0); mcpdaddec(s, 2, 2, 1.0); mcpdaddec(s, 2, 0, 0.0); mcpdsolve(s); mcpdresults(s, p, rep); // // Hidden matrix P is equal to // ( 0.95 0.50 ) // ( 0.05 0.25 ) // ( 0.25 1.00 ) // which means that "good" loans can become past due with 5% probability, // while past due loans will become charged off with 25% probability or // return back to normal state with 50% probability. // _TestResult = _TestResult && doc_test_real_matrix(p, "[[0.95,0.50,0.00],[0.05,0.25,0.00],[0.00,0.25,1.00]]", 0.005); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && (_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "mcpd_simple2"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST nn_regr // Regression problem with one output (2=>1) // _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<3; _spoil_scenario++) { try { // // The very simple example on neural network: network is trained to reproduce // small 2x2 multiplication table. // // NOTE: we use network with excessive amount of neurons, which guarantees // almost exact reproduction of the training set. Generalization ability // of such network is rather low, but we are not concerned with such // questions in this basic demo. 
// mlptrainer trn; multilayerperceptron network; mlpreport rep; // // Training set: // * one row corresponds to one record A*B=C in the multiplication table // * first two columns store A and B, last column stores C // // [1 * 1 = 1] // [1 * 2 = 2] // [2 * 1 = 2] // [2 * 2 = 4] // real_2d_array xy = "[[1,1,1],[1,2,2],[2,1,2],[2,2,4]]"; if( _spoil_scenario==0 ) spoil_matrix_by_nan(xy); if( _spoil_scenario==1 ) spoil_matrix_by_posinf(xy); if( _spoil_scenario==2 ) spoil_matrix_by_neginf(xy); // // Network is created. // Trainer object is created. // Dataset is attached to trainer object. // mlpcreatetrainer(2, 1, trn); mlpcreate1(2, 5, 1, network); mlpsetdataset(trn, xy, 4); // // Network is trained with 5 restarts from random positions // mlptrainnetwork(trn, network, 5, rep); // // 2*2=? // real_1d_array x = "[2,2]"; real_1d_array y = "[0]"; mlpprocess(network, x, y); _TestResult = _TestResult && doc_test_real_vector(y, "[4.000]", 0.05); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && (_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "nn_regr"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST nn_regr_n // Regression problem with multiple outputs (2=>2) // _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<3; _spoil_scenario++) { try { // // Network with 2 inputs and 2 outputs is trained to reproduce vector function: // (x0,x1) => (x0+x1, x0*x1) // // Informally speaking, we want neural network to simultaneously calculate // both sum of two numbers and their product. // // NOTE: we use network with excessive amount of neurons, which guarantees // almost exact reproduction of the training set. Generalization ability // of such network is rather low, but we are not concerned with such // questions in this basic demo. // mlptrainer trn; multilayerperceptron network; mlpreport rep; // // Training set. One row corresponds to one record [A,B,A+B,A*B]. // // [ 1 1 1+1 1*1 ] // [ 1 2 1+2 1*2 ] // [ 2 1 2+1 2*1 ] // [ 2 2 2+2 2*2 ] // real_2d_array xy = "[[1,1,2,1],[1,2,3,2],[2,1,3,2],[2,2,4,4]]"; if( _spoil_scenario==0 ) spoil_matrix_by_nan(xy); if( _spoil_scenario==1 ) spoil_matrix_by_posinf(xy); if( _spoil_scenario==2 ) spoil_matrix_by_neginf(xy); // // Network is created. // Trainer object is created. // Dataset is attached to trainer object. // mlpcreatetrainer(2, 2, trn); mlpcreate1(2, 5, 2, network); mlpsetdataset(trn, xy, 4); // // Network is trained with 5 restarts from random positions // mlptrainnetwork(trn, network, 5, rep); // // 2+1=? // 2*1=? // real_1d_array x = "[2,1]"; real_1d_array y = "[0,0]"; mlpprocess(network, x, y); _TestResult = _TestResult && doc_test_real_vector(y, "[3.000,2.000]", 0.05); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && (_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "nn_regr_n"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST nn_cls2 // Binary classification problem // _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<3; _spoil_scenario++) { try { // // Suppose that we want to classify numbers as positive (class 0) and negative // (class 1). We have training set which includes several strictly positive // or negative numbers - and zero. // // The problem is that we are not sure how to classify zero, so from time to // time we mark it as positive or negative (with equal probability). 
Other // numbers are marked in pure deterministic setting. How will neural network // cope with such classification task? // // NOTE: we use network with excessive amount of neurons, which guarantees // almost exact reproduction of the training set. Generalization ability // of such network is rather low, but we are not concerned with such // questions in this basic demo. // mlptrainer trn; multilayerperceptron network; mlpreport rep; real_1d_array x = "[0]"; real_1d_array y = "[0,0]"; // // Training set. One row corresponds to one record [A => class(A)]. // // Classes are denoted by numbers from 0 to 1, where 0 corresponds to positive // numbers and 1 to negative numbers. // // [ +1 0] // [ +2 0] // [ -1 1] // [ -2 1] // [ 0 0] !! sometimes we classify 0 as positive, sometimes as negative // [ 0 1] !! // real_2d_array xy = "[[+1,0],[+2,0],[-1,1],[-2,1],[0,0],[0,1]]"; if( _spoil_scenario==0 ) spoil_matrix_by_nan(xy); if( _spoil_scenario==1 ) spoil_matrix_by_posinf(xy); if( _spoil_scenario==2 ) spoil_matrix_by_neginf(xy); // // // When we solve classification problems, everything is slightly different from // the regression ones: // // 1. Network is created. Because we solve classification problem, we use // mlpcreatec1() function instead of mlpcreate1(). This function creates // classifier network with SOFTMAX-normalized outputs. This network returns // vector of class membership probabilities which are normalized to be // non-negative and sum to 1.0 // // 2. We use mlpcreatetrainercls() function instead of mlpcreatetrainer() to // create trainer object. Trainer object process dataset and neural network // slightly differently to account for specifics of the classification // problems. // // 3. Dataset is attached to trainer object. Note that dataset format is slightly // different from one used for regression. // mlpcreatetrainercls(1, 2, trn); mlpcreatec1(1, 5, 2, network); mlpsetdataset(trn, xy, 6); // // Network is trained with 5 restarts from random positions // mlptrainnetwork(trn, network, 5, rep); // // Test our neural network on strictly positive and strictly negative numbers. // // IMPORTANT! Classifier network returns class membership probabilities instead // of class indexes. Network returns two values (probabilities) instead of one // (class index). // // Thus, for +1 we expect to get [P0,P1] = [1,0], where P0 is probability that // number is positive (belongs to class 0), and P1 is probability that number // is negative (belongs to class 1). // // For -1 we expect to get [P0,P1] = [0,1] // // Following properties are guaranteed by network architecture: // * P0>=0, P1>=0 non-negativity // * P0+P1=1 normalization // x = "[1]"; mlpprocess(network, x, y); _TestResult = _TestResult && doc_test_real_vector(y, "[1.000,0.000]", 0.05); x = "[-1]"; mlpprocess(network, x, y); _TestResult = _TestResult && doc_test_real_vector(y, "[0.000,1.000]", 0.05); // // But what our network will return for 0, which is between classes 0 and 1? // // In our dataset it has two different marks assigned (class 0 AND class 1). 
// So network will return something average between class 0 and class 1: // 0 => [0.5, 0.5] // x = "[0]"; mlpprocess(network, x, y); _TestResult = _TestResult && doc_test_real_vector(y, "[0.500,0.500]", 0.05); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && (_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "nn_cls2"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST nn_cls3 // Multiclass classification problem // _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<3; _spoil_scenario++) { try { // // Suppose that we want to classify numbers as positive (class 0) and negative // (class 1). We also have one more class for zero (class 2). // // NOTE: we use network with excessive amount of neurons, which guarantees // almost exact reproduction of the training set. Generalization ability // of such network is rather low, but we are not concerned with such // questions in this basic demo. // mlptrainer trn; multilayerperceptron network; mlpreport rep; real_1d_array x = "[0]"; real_1d_array y = "[0,0,0]"; // // Training set. One row corresponds to one record [A => class(A)]. // // Classes are denoted by numbers from 0 to 2, where 0 corresponds to positive // numbers, 1 to negative numbers, 2 to zero // // [ +1 0] // [ +2 0] // [ -1 1] // [ -2 1] // [ 0 2] // real_2d_array xy = "[[+1,0],[+2,0],[-1,1],[-2,1],[0,2]]"; if( _spoil_scenario==0 ) spoil_matrix_by_nan(xy); if( _spoil_scenario==1 ) spoil_matrix_by_posinf(xy); if( _spoil_scenario==2 ) spoil_matrix_by_neginf(xy); // // // When we solve classification problems, everything is slightly different from // the regression ones: // // 1. Network is created. Because we solve classification problem, we use // mlpcreatec1() function instead of mlpcreate1(). This function creates // classifier network with SOFTMAX-normalized outputs. This network returns // vector of class membership probabilities which are normalized to be // non-negative and sum to 1.0 // // 2. We use mlpcreatetrainercls() function instead of mlpcreatetrainer() to // create trainer object. Trainer object process dataset and neural network // slightly differently to account for specifics of the classification // problems. // // 3. Dataset is attached to trainer object. Note that dataset format is slightly // different from one used for regression. // mlpcreatetrainercls(1, 3, trn); mlpcreatec1(1, 5, 3, network); mlpsetdataset(trn, xy, 5); // // Network is trained with 5 restarts from random positions // mlptrainnetwork(trn, network, 5, rep); // // Test our neural network on strictly positive and strictly negative numbers. // // IMPORTANT! Classifier network returns class membership probabilities instead // of class indexes. Network returns three values (probabilities) instead of one // (class index). // // Thus, for +1 we expect to get [P0,P1,P2] = [1,0,0], // for -1 we expect to get [P0,P1,P2] = [0,1,0], // and for 0 we will get [P0,P1,P2] = [0,0,1]. 
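//
// If a single class label is needed rather than probabilities, the index of
// the largest output can be taken (an illustrative sketch, not part of the
// ALGLIB API):
//
//     ae_int_t label = 0;
//     for(ae_int_t k=1; k<y.length(); k++)
//         if( y[k]>y[label] )
//             label = k;
//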
// // Following properties are guaranteed by network architecture: // * P0>=0, P1>=0, P2>=0 non-negativity // * P0+P1+P2=1 normalization // x = "[1]"; mlpprocess(network, x, y); _TestResult = _TestResult && doc_test_real_vector(y, "[1.000,0.000,0.000]", 0.05); x = "[-1]"; mlpprocess(network, x, y); _TestResult = _TestResult && doc_test_real_vector(y, "[0.000,1.000,0.000]", 0.05); x = "[0]"; mlpprocess(network, x, y); _TestResult = _TestResult && doc_test_real_vector(y, "[0.000,0.000,1.000]", 0.05); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && (_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "nn_cls3"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST nn_trainerobject // Advanced example on trainer object // _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<6; _spoil_scenario++) { try { // // Trainer object is used to train network. It stores dataset, training settings, // and other information which is NOT part of neural network. You should use // trainer object as follows: // (1) you create trainer object and specify task type (classification/regression) // and number of inputs/outputs // (2) you add dataset to the trainer object // (3) you may change training settings (stopping criteria or weight decay) // (4) finally, you may train one or more networks // // You may interleave stages 2...4 and repeat them many times. Trainer object // remembers its internal state and can be used several times after its creation // and initialization. // mlptrainer trn; // // Stage 1: object creation. // // We have to specify number of inputs and outputs. Trainer object can be used // only for problems with same number of inputs/outputs as was specified during // its creation. // // In case you want to train SOFTMAX-normalized network which solves classification // problems, you must use another function to create trainer object: // mlpcreatetrainercls(). // // Below we create trainer object which can be used to train regression networks // with 2 inputs and 1 output. // mlpcreatetrainer(2, 1, trn); // // Stage 2: specification of the training set // // By default trainer object stores empty dataset. So to solve your non-empty problem // you have to set dataset by passing to trainer dense or sparse matrix. // // One row of the matrix corresponds to one record A*B=C in the multiplication table. // First two columns store A and B, last column stores C // // [1 * 1 = 1] [ 1 1 1 ] // [1 * 2 = 2] [ 1 2 2 ] // [2 * 1 = 2] = [ 2 1 2 ] // [2 * 2 = 4] [ 2 2 4 ] // real_2d_array xy = "[[1,1,1],[1,2,2],[2,1,2],[2,2,4]]"; if( _spoil_scenario==0 ) spoil_matrix_by_nan(xy); if( _spoil_scenario==1 ) spoil_matrix_by_posinf(xy); if( _spoil_scenario==2 ) spoil_matrix_by_neginf(xy); mlpsetdataset(trn, xy, 4); // // Stage 3: modification of the training parameters. // // You may modify parameters like weights decay or stopping criteria: // * we set moderate weight decay // * we choose iterations limit as stopping condition (another condition - step size - // is zero, which means than this condition is not active) // double wstep = 0.000; if( _spoil_scenario==3 ) wstep = fp_nan; if( _spoil_scenario==4 ) wstep = fp_posinf; if( _spoil_scenario==5 ) wstep = fp_neginf; ae_int_t maxits = 100; mlpsetdecay(trn, 0.01); mlpsetcond(trn, wstep, maxits); // // Stage 4: training. // // We will train several networks with different architecture using same trainer object. 
// We may change training parameters or even dataset, so different networks are trained // differently. But in this simple example we will train all networks with same settings. // // We create and train three networks: // * network 1 has 2x1 architecture (2 inputs, no hidden neurons, 1 output) // * network 2 has 2x5x1 architecture (2 inputs, 5 hidden neurons, 1 output) // * network 3 has 2x5x5x1 architecture (2 inputs, two hidden layers, 1 output) // // NOTE: these networks solve regression problems. For classification problems you // should use mlpcreatec0/c1/c2 to create neural networks which have SOFTMAX- // normalized outputs. // multilayerperceptron net1; multilayerperceptron net2; multilayerperceptron net3; mlpreport rep; mlpcreate0(2, 1, net1); mlpcreate1(2, 5, 1, net2); mlpcreate2(2, 5, 5, 1, net3); mlptrainnetwork(trn, net1, 5, rep); mlptrainnetwork(trn, net2, 5, rep); mlptrainnetwork(trn, net3, 5, rep); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && (_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "nn_trainerobject"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST nn_crossvalidation // Cross-validation // _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<3; _spoil_scenario++) { try { // // This example shows how to perform cross-validation with ALGLIB // mlptrainer trn; multilayerperceptron network; mlpreport rep; // // Training set: f(x)=1/(x^2+1) // One row corresponds to one record [x,f(x)] // real_2d_array xy = "[[-2.0,0.2],[-1.6,0.3],[-1.3,0.4],[-1,0.5],[-0.6,0.7],[-0.3,0.9],[0,1],[2.0,0.2],[1.6,0.3],[1.3,0.4],[1,0.5],[0.6,0.7],[0.3,0.9]]"; if( _spoil_scenario==0 ) spoil_matrix_by_nan(xy); if( _spoil_scenario==1 ) spoil_matrix_by_posinf(xy); if( _spoil_scenario==2 ) spoil_matrix_by_neginf(xy); // // Trainer object is created. // Dataset is attached to trainer object. // // NOTE: it is not good idea to perform cross-validation on sample // as small as ours (13 examples). It is done for demonstration // purposes only. Generalization error estimates won't be // precise enough for practical purposes. // mlpcreatetrainer(1, 1, trn); mlpsetdataset(trn, xy, 13); // // The key property of the cross-validation is that it estimates // generalization properties of neural ARCHITECTURE. It does NOT // estimates generalization error of some specific network which // is passed to the k-fold CV routine. // // In our example we create 1x4x1 neural network and pass it to // CV routine without training it. Original state of the network // is not used for cross-validation - each round is restarted from // random initial state. Only geometry of network matters. // // We perform 5 restarts from different random positions for each // of the 10 cross-validation rounds. // mlpcreate1(1, 4, 1, network); mlpkfoldcv(trn, network, 5, 10, rep); // // Cross-validation routine stores estimates of the generalization // error to MLP report structure. You may examine its fields and // see estimates of different errors (RMS, CE, Avg). // // Because cross-validation is non-deterministic, in our manual we // can not say what values will be stored to rep after call to // mlpkfoldcv(). Every CV round will return slightly different // estimates. 
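//
// For example, assuming the report exposes the RMS and average errors
// mentioned above as rep.rmserror and rep.avgerror, they could be printed
// as follows (usage sketch):
//
//     printf("CV RMS error: %.4f\n", double(rep.rmserror));
//     printf("CV avg error: %.4f\n", double(rep.avgerror));
//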
// _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && (_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "nn_crossvalidation"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST nn_ensembles_es // Early stopping ensembles // _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<3; _spoil_scenario++) { try { // // This example shows how to train early stopping ensebles. // mlptrainer trn; mlpensemble ensemble; mlpreport rep; // // Training set: f(x)=1/(x^2+1) // One row corresponds to one record [x,f(x)] // real_2d_array xy = "[[-2.0,0.2],[-1.6,0.3],[-1.3,0.4],[-1,0.5],[-0.6,0.7],[-0.3,0.9],[0,1],[2.0,0.2],[1.6,0.3],[1.3,0.4],[1,0.5],[0.6,0.7],[0.3,0.9]]"; if( _spoil_scenario==0 ) spoil_matrix_by_nan(xy); if( _spoil_scenario==1 ) spoil_matrix_by_posinf(xy); if( _spoil_scenario==2 ) spoil_matrix_by_neginf(xy); // // Trainer object is created. // Dataset is attached to trainer object. // // NOTE: it is not good idea to use early stopping ensemble on sample // as small as ours (13 examples). It is done for demonstration // purposes only. Ensemble training algorithm won't find good // solution on such small sample. // mlpcreatetrainer(1, 1, trn); mlpsetdataset(trn, xy, 13); // // Ensemble is created and trained. Each of 50 network is trained // with 5 restarts. // mlpecreate1(1, 4, 1, 50, ensemble); mlptrainensemblees(trn, ensemble, 5, rep); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && (_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "nn_ensembles_es"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST nn_parallel // Parallel training // _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<3; _spoil_scenario++) { try { // // This example shows how to use parallel functionality of ALGLIB. // We generate simple 1-dimensional regression problem and show how // to use parallel training, parallel cross-validation, parallel // training of neural ensembles. // // We assume that you already know how to use ALGLIB in serial mode // and concentrate on its parallel capabilities. // // NOTE: it is not good idea to use parallel features on sample as small // as ours (13 examples). It is done only for demonstration purposes. // mlptrainer trn; multilayerperceptron network; mlpensemble ensemble; mlpreport rep; real_2d_array xy = "[[-2.0,0.2],[-1.6,0.3],[-1.3,0.4],[-1,0.5],[-0.6,0.7],[-0.3,0.9],[0,1],[2.0,0.2],[1.6,0.3],[1.3,0.4],[1,0.5],[0.6,0.7],[0.3,0.9]]"; if( _spoil_scenario==0 ) spoil_matrix_by_nan(xy); if( _spoil_scenario==1 ) spoil_matrix_by_posinf(xy); if( _spoil_scenario==2 ) spoil_matrix_by_neginf(xy); mlpcreatetrainer(1, 1, trn); mlpsetdataset(trn, xy, 13); mlpcreate1(1, 4, 1, network); mlpecreate1(1, 4, 1, 50, ensemble); // // Below we demonstrate how to perform: // * parallel training of individual networks // * parallel cross-validation // * parallel training of neural ensembles // // In order to use multithreading, you have to: // 1) Install SMP edition of ALGLIB. // 2) This step is specific for C++ users: you should activate OS-specific // capabilities of ALGLIB by defining AE_OS=AE_POSIX (for *nix systems) // or AE_OS=AE_WINDOWS (for Windows systems). // C# users do not have to perform this step because C# programs are // portable across different systems without OS-specific tuning. 
// 3) Allow ALGLIB to know about number of worker threads to use: // a) autodetection (C++, C#): // ALGLIB will automatically determine number of CPU cores and // (by default) will use all cores except for one. Say, on 4-core // system it will use three cores - unless you manually told it // to use more or less. It will keep your system responsive during // lengthy computations. // Such behavior may be changed with setnworkers() call: // * alglib::setnworkers(0) = use all cores // * alglib::setnworkers(-1) = leave one core unused // * alglib::setnworkers(-2) = leave two cores unused // * alglib::setnworkers(+2) = use 2 cores (even if you have more) // b) manual specification (C++, C#): // You may want to specify maximum number of worker threads during // compile time by means of preprocessor definition AE_NWORKERS. // For C++ it will be "AE_NWORKERS=X" where X can be any positive number. // For C# it is "AE_NWORKERSX", where X should be replaced by number of // workers (AE_NWORKERS2, AE_NWORKERS3, AE_NWORKERS4, ...). // You can add this definition to compiler command line or change // corresponding project settings in your IDE. // // After you installed and configured SMP edition of ALGLIB, you may choose // between serial and multithreaded versions of SMP-capable functions: // * serial version works as usual, in the context of the calling thread // * multithreaded version (with "smp_" prefix) creates (or wakes up) worker // threads, inserts task in the worker queue, and waits for completion of // the task. All processing is done in context of worker thread(s). // // NOTE: because starting/stopping worker threads costs thousands of CPU cycles, // you should not use multithreading for lightweight computational problems. // // NOTE: some old POSIX-compatible operating systems do not support // sysconf(_SC_NPROCESSORS_ONLN) system call which is required in order // to automatically determine number of active cores. On these systems // you should specify number of cores manually at compile time. // Without it ALGLIB will run in single-threaded mode. // // // First, we perform parallel training of individual network with 5 // restarts from random positions. These 5 rounds of training are // executed in parallel manner, with best network chosen after // training. // // ALGLIB can use additional way to speed up computations - divide // dataset into smaller subsets and process these subsets // simultaneously. It allows us to efficiently parallelize even // single training round. This operation is performed automatically // for large datasets, but our toy dataset is too small. // smp_mlptrainnetwork(trn, network, 5, rep); // // Then, we perform parallel 10-fold cross-validation, with 5 random // restarts per each CV round. I.e., 5*10=50 networks are trained // in total. All these operations can be parallelized. // // NOTE: again, ALGLIB can parallelize calculation of gradient // over entire dataset - but our dataset is too small. // smp_mlpkfoldcv(trn, network, 5, 10, rep); // // Finally, we train early stopping ensemble of 50 neural networks, // each of them is trained with 5 random restarts. I.e., 5*50=250 // networks aretrained in total. 
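//
// As a minimal sketch of the thread-count control described above, one might
// bound the number of workers explicitly before calling the smp_ functions
// (the value +2 is just an illustration - pick whatever fits your machine):
//
alglib::setnworkers(+2);    // use two cores for the SMP-capable calls below
//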
// smp_mlptrainensemblees(trn, ensemble, 5, rep); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && (_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "nn_parallel"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST clst_ahc // Simple hierarchical clusterization with Euclidean distance function // _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<3; _spoil_scenario++) { try { // // The very simple clusterization example // // We have a set of points in 2D space: // (P0,P1,P2,P3,P4) = ((1,1),(1,2),(4,1),(2,3),(4,1.5)) // // | // | P3 // | // | P1 // | P4 // | P0 P2 // |------------------------- // // We want to perform Agglomerative Hierarchic Clusterization (AHC), // using complete linkage (default algorithm) and Euclidean distance // (default metric). // // In order to do that, we: // * create clusterizer with clusterizercreate() // * set points XY and metric (2=Euclidean) with clusterizersetpoints() // * run AHC algorithm with clusterizerrunahc // // You may see that clusterization itself is a minor part of the example, // most of which is dominated by comments :) // clusterizerstate s; ahcreport rep; real_2d_array xy = "[[1,1],[1,2],[4,1],[2,3],[4,1.5]]"; if( _spoil_scenario==0 ) spoil_matrix_by_nan(xy); if( _spoil_scenario==1 ) spoil_matrix_by_posinf(xy); if( _spoil_scenario==2 ) spoil_matrix_by_neginf(xy); clusterizercreate(s); clusterizersetpoints(s, xy, 2); clusterizerrunahc(s, rep); // // Now we've built our clusterization tree. Rep.z contains information which // is required to build dendrogram. I-th row of rep.z represents one merge // operation, with first cluster to merge having index rep.z[I,0] and second // one having index rep.z[I,1]. Merge result has index NPoints+I. // // Clusters with indexes less than NPoints are single-point initial clusters, // while ones with indexes from NPoints to 2*NPoints-2 are multi-point // clusters created during merges. // // In our example, Z=[[2,4], [0,1], [3,6], [5,7]] // // It means that: // * first, we merge C2=(P2) and C4=(P4), and create C5=(P2,P4) // * then, we merge C2=(P0) and C1=(P1), and create C6=(P0,P1) // * then, we merge C3=(P3) and C6=(P0,P1), and create C7=(P0,P1,P3) // * finally, we merge C5 and C7 and create C8=(P0,P1,P2,P3,P4) // // Thus, we have following dendrogram: // // ------8----- // | | // | ----7---- // | | | // ---5--- | ---6--- // | | | | | // P2 P4 P3 P0 P1 // _TestResult = _TestResult && doc_test_int_matrix(rep.z, "[[2,4],[0,1],[3,6],[5,7]]"); // // We've built dendrogram above by reordering our dataset. // // Without such reordering it would be impossible to build dendrogram without // intersections. Luckily, ahcreport structure contains two additional fields // which help to build dendrogram from your data: // * rep.p, which contains permutation applied to dataset // * rep.pm, which contains another representation of merges // // In our example we have: // * P=[3,4,0,2,1] // * PZ=[[0,0,1,1,0,0],[3,3,4,4,0,0],[2,2,3,4,0,1],[0,1,2,4,1,2]] // // Permutation array P tells us that P0 should be moved to position 3, // P1 moved to position 4, P2 moved to position 0 and so on: // // (P0 P1 P2 P3 P4) => (P2 P4 P3 P0 P1) // // Merges array PZ tells us how to perform merges on the sorted dataset. // One row of PZ corresponds to one merge operations, with first pair of // elements denoting first of the clusters to merge (start index, end // index) and next pair of elements denoting second of the clusters to // merge. 
Clusters being merged are always adjacent, with first one on // the left and second one on the right. // // For example, first row of PZ tells us that clusters [0,0] and [1,1] are // merged (single-point clusters, with first one containing P2 and second // one containing P4). Third row of PZ tells us that we merge one single- // point cluster [2,2] with one two-point cluster [3,4]. // // There are two more elements in each row of PZ. These are the helper // elements, which denote HEIGHT (not size) of left and right subdendrograms. // For example, according to PZ, first two merges are performed on clusterization // trees of height 0, while next two merges are performed on 0-1 and 1-2 // pairs of trees correspondingly. // _TestResult = _TestResult && doc_test_int_vector(rep.p, "[3,4,0,2,1]"); _TestResult = _TestResult && doc_test_int_matrix(rep.pm, "[[0,0,1,1,0,0],[3,3,4,4,0,0],[2,2,3,4,0,1],[0,1,2,4,1,2]]"); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && (_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "clst_ahc"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST clst_kmeans // Simple k-means clusterization // _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<3; _spoil_scenario++) { try { // // The very simple clusterization example // // We have a set of points in 2D space: // (P0,P1,P2,P3,P4) = ((1,1),(1,2),(4,1),(2,3),(4,1.5)) // // | // | P3 // | // | P1 // | P4 // | P0 P2 // |------------------------- // // We want to perform k-means++ clustering with K=2. // // In order to do that, we: // * create clusterizer with clusterizercreate() // * set points XY and metric (must be Euclidean, distype=2) with clusterizersetpoints() // * (optional) set number of restarts from random positions to 5 // * run k-means algorithm with clusterizerrunkmeans() // // You may see that clusterization itself is a minor part of the example, // most of which is dominated by comments :) // clusterizerstate s; kmeansreport rep; real_2d_array xy = "[[1,1],[1,2],[4,1],[2,3],[4,1.5]]"; if( _spoil_scenario==0 ) spoil_matrix_by_nan(xy); if( _spoil_scenario==1 ) spoil_matrix_by_posinf(xy); if( _spoil_scenario==2 ) spoil_matrix_by_neginf(xy); clusterizercreate(s); clusterizersetpoints(s, xy, 2); clusterizersetkmeanslimits(s, 5, 0); clusterizerrunkmeans(s, 2, rep); // // We've performed clusterization, and it succeeded (completion code is +1). // // Now first center is stored in the first row of rep.c, second one is stored // in the second row. rep.cidx can be used to determine which center is // closest to some specific point of the dataset. // _TestResult = _TestResult && doc_test_int(rep.terminationtype, 1); // We called clusterizersetpoints() with disttype=2 because k-means++ // algorithm does NOT support metrics other than Euclidean. But what if we // try to use some other metric? // // We change metric type by calling clusterizersetpoints() one more time, // and try to run k-means algo again. It fails. 
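//
// Before switching metrics, here is a minimal sketch of reading the
// assignments produced by the successful run above: rep.cidx[i] is the
// index of the center closest to point i (the dataset has 5 points):
//
for(int point_idx=0; point_idx<5; point_idx++)
    printf("point %d -> center %d\n", point_idx, int(rep.cidx[point_idx]));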
// clusterizersetpoints(s, xy, 0); clusterizerrunkmeans(s, 2, rep); _TestResult = _TestResult && doc_test_int(rep.terminationtype, -5); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && (_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "clst_kmeans"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST clst_linkage // Clusterization with different linkage types // _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<3; _spoil_scenario++) { try { // // We have a set of points in 1D space: // (P0,P1,P2,P3,P4) = (1, 3, 10, 16, 20) // // We want to perform Agglomerative Hierarchic Clusterization (AHC), // using either complete or single linkage and Euclidean distance // (default metric). // // First two steps merge P0/P1 and P3/P4 independently of the linkage type. // However, third step depends on linkage type being used: // * in case of complete linkage P2=10 is merged with [P0,P1] // * in case of single linkage P2=10 is merged with [P3,P4] // clusterizerstate s; ahcreport rep; real_2d_array xy = "[[1],[3],[10],[16],[20]]"; if( _spoil_scenario==0 ) spoil_matrix_by_nan(xy); if( _spoil_scenario==1 ) spoil_matrix_by_posinf(xy); if( _spoil_scenario==2 ) spoil_matrix_by_neginf(xy); integer_1d_array cidx; integer_1d_array cz; clusterizercreate(s); clusterizersetpoints(s, xy, 2); // use complete linkage, reduce set down to 2 clusters. // print clusterization with clusterizergetkclusters(2). // P2 must belong to [P0,P1] clusterizersetahcalgo(s, 0); clusterizerrunahc(s, rep); clusterizergetkclusters(rep, 2, cidx, cz); _TestResult = _TestResult && doc_test_int_vector(cidx, "[1,1,1,0,0]"); // use single linkage, reduce set down to 2 clusters. // print clusterization with clusterizergetkclusters(2). // P2 must belong to [P3,P4] clusterizersetahcalgo(s, 1); clusterizerrunahc(s, rep); clusterizergetkclusters(rep, 2, cidx, cz); _TestResult = _TestResult && doc_test_int_vector(cidx, "[0,0,1,1,1]"); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && (_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "clst_linkage"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST clst_distance // Clusterization with different metric types // _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<3; _spoil_scenario++) { try { // // We have three points in 4D space: // (P0,P1,P2) = ((1, 2, 1, 2), (6, 7, 6, 7), (7, 6, 7, 6)) // // We want to try clustering them with different distance functions. // Distance function is chosen when we add the dataset to the clusterizer. // We can choose several distance types - Euclidean, city block, Chebyshev, // several correlation measures or user-supplied distance matrix. // // Here we'll try three distances: Euclidean, Pearson correlation, // user-supplied distance matrix. Different distance functions lead // to different choices being made by the algorithm during clustering. 
// clusterizerstate s; ahcreport rep; ae_int_t disttype; real_2d_array xy = "[[1, 2, 1, 2], [6, 7, 6, 7], [7, 6, 7, 6]]"; if( _spoil_scenario==0 ) spoil_matrix_by_nan(xy); if( _spoil_scenario==1 ) spoil_matrix_by_posinf(xy); if( _spoil_scenario==2 ) spoil_matrix_by_neginf(xy); clusterizercreate(s); // With Euclidean distance function (disttype=2) two closest points // are P1 and P2, thus: // * first, we merge P1 and P2 to form C3=[P1,P2] // * second, we merge P0 and C3 to form C4=[P0,P1,P2] disttype = 2; clusterizersetpoints(s, xy, disttype); clusterizerrunahc(s, rep); _TestResult = _TestResult && doc_test_int_matrix(rep.z, "[[1,2],[0,3]]"); // With Pearson correlation distance function (disttype=10) situation // is different - distance between P0 and P1 is zero, thus: // * first, we merge P0 and P1 to form C3=[P0,P1] // * second, we merge P2 and C3 to form C4=[P0,P1,P2] disttype = 10; clusterizersetpoints(s, xy, disttype); clusterizerrunahc(s, rep); _TestResult = _TestResult && doc_test_int_matrix(rep.z, "[[0,1],[2,3]]"); // Finally, we try clustering with user-supplied distance matrix: // [ 0 3 1 ] // P = [ 3 0 3 ], where P[i,j] = dist(Pi,Pj) // [ 1 3 0 ] // // * first, we merge P0 and P2 to form C3=[P0,P2] // * second, we merge P1 and C3 to form C4=[P0,P1,P2] real_2d_array d = "[[0,3,1],[3,0,3],[1,3,0]]"; clusterizersetdistances(s, d, true); clusterizerrunahc(s, rep); _TestResult = _TestResult && doc_test_int_matrix(rep.z, "[[0,2],[1,3]]"); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && (_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "clst_distance"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST clst_kclusters // Obtaining K top clusters from clusterization tree // _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<3; _spoil_scenario++) { try { // // We have a set of points in 2D space: // (P0,P1,P2,P3,P4) = ((1,1),(1,2),(4,1),(2,3),(4,1.5)) // // | // | P3 // | // | P1 // | P4 // | P0 P2 // |------------------------- // // We perform Agglomerative Hierarchic Clusterization (AHC) and we want // to get top K clusters from clusterization tree for different K. // clusterizerstate s; ahcreport rep; real_2d_array xy = "[[1,1],[1,2],[4,1],[2,3],[4,1.5]]"; if( _spoil_scenario==0 ) spoil_matrix_by_nan(xy); if( _spoil_scenario==1 ) spoil_matrix_by_posinf(xy); if( _spoil_scenario==2 ) spoil_matrix_by_neginf(xy); integer_1d_array cidx; integer_1d_array cz; clusterizercreate(s); clusterizersetpoints(s, xy, 2); clusterizerrunahc(s, rep); // with K=5, every points is assigned to its own cluster: // C0=P0, C1=P1 and so on... 
clusterizergetkclusters(rep, 5, cidx, cz); _TestResult = _TestResult && doc_test_int_vector(cidx, "[0,1,2,3,4]"); // with K=1 we have one large cluster C0=[P0,P1,P2,P3,P4,P5] clusterizergetkclusters(rep, 1, cidx, cz); _TestResult = _TestResult && doc_test_int_vector(cidx, "[0,0,0,0,0]"); // with K=3 we have three clusters C0=[P3], C1=[P2,P4], C2=[P0,P1] clusterizergetkclusters(rep, 3, cidx, cz); _TestResult = _TestResult && doc_test_int_vector(cidx, "[2,2,1,0,1]"); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && (_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "clst_kclusters"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST autogk_d1 // Integrating f=exp(x) by adaptive integrator // _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<6; _spoil_scenario++) { try { // // This example demonstrates integration of f=exp(x) on [0,1]: // * first, autogkstate is initialized // * then we call integration function // * and finally we obtain results with autogkresults() call // double a = 0; if( _spoil_scenario==0 ) a = fp_nan; if( _spoil_scenario==1 ) a = fp_posinf; if( _spoil_scenario==2 ) a = fp_neginf; double b = 1; if( _spoil_scenario==3 ) b = fp_nan; if( _spoil_scenario==4 ) b = fp_posinf; if( _spoil_scenario==5 ) b = fp_neginf; autogkstate s; double v; autogkreport rep; autogksmooth(a, b, s); alglib::autogkintegrate(s, int_function_1_func); autogkresults(s, v, rep); _TestResult = _TestResult && doc_test_real(v, 1.7182, 0.005); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && (_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "autogk_d1"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST fft_complex_d1 // Complex FFT: simple example // _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<3; _spoil_scenario++) { try { // // first we demonstrate forward FFT: // [1i,1i,1i,1i] is converted to [4i, 0, 0, 0] // complex_1d_array z = "[1i,1i,1i,1i]"; if( _spoil_scenario==0 ) spoil_vector_by_nan(z); if( _spoil_scenario==1 ) spoil_vector_by_posinf(z); if( _spoil_scenario==2 ) spoil_vector_by_neginf(z); fftc1d(z); _TestResult = _TestResult && doc_test_complex_vector(z, "[4i,0,0,0]", 0.0001); // // now we convert [4i, 0, 0, 0] back to [1i,1i,1i,1i] // with backward FFT // fftc1dinv(z); _TestResult = _TestResult && doc_test_complex_vector(z, "[1i,1i,1i,1i]", 0.0001); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && (_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "fft_complex_d1"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST fft_complex_d2 // Complex FFT: advanced example // _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<3; _spoil_scenario++) { try { // // first we demonstrate forward FFT: // [0,1,0,1i] is converted to [1+1i, -1-1i, -1-1i, 1+1i] // complex_1d_array z = "[0,1,0,1i]"; if( _spoil_scenario==0 ) spoil_vector_by_nan(z); if( _spoil_scenario==1 ) spoil_vector_by_posinf(z); if( _spoil_scenario==2 ) spoil_vector_by_neginf(z); fftc1d(z); _TestResult = _TestResult && doc_test_complex_vector(z, "[1+1i, -1-1i, -1-1i, 1+1i]", 0.0001); // // now we convert result back with backward FFT // fftc1dinv(z); _TestResult = _TestResult && doc_test_complex_vector(z, "[0,1,0,1i]", 0.0001); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && 
(_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "fft_complex_d2"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST fft_real_d1 // Real FFT: simple example // _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<3; _spoil_scenario++) { try { // // first we demonstrate forward FFT: // [1,1,1,1] is converted to [4, 0, 0, 0] // real_1d_array x = "[1,1,1,1]"; if( _spoil_scenario==0 ) spoil_vector_by_nan(x); if( _spoil_scenario==1 ) spoil_vector_by_posinf(x); if( _spoil_scenario==2 ) spoil_vector_by_neginf(x); complex_1d_array f; real_1d_array x2; fftr1d(x, f); _TestResult = _TestResult && doc_test_complex_vector(f, "[4,0,0,0]", 0.0001); // // now we convert [4, 0, 0, 0] back to [1,1,1,1] // with backward FFT // fftr1dinv(f, x2); _TestResult = _TestResult && doc_test_real_vector(x2, "[1,1,1,1]", 0.0001); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && (_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "fft_real_d1"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST fft_real_d2 // Real FFT: advanced example // _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<3; _spoil_scenario++) { try { // // first we demonstrate forward FFT: // [1,2,3,4] is converted to [10, -2+2i, -2, -2-2i] // // note that output array is self-adjoint: // * f[0] = conj(f[0]) // * f[1] = conj(f[3]) // * f[2] = conj(f[2]) // real_1d_array x = "[1,2,3,4]"; if( _spoil_scenario==0 ) spoil_vector_by_nan(x); if( _spoil_scenario==1 ) spoil_vector_by_posinf(x); if( _spoil_scenario==2 ) spoil_vector_by_neginf(x); complex_1d_array f; real_1d_array x2; fftr1d(x, f); _TestResult = _TestResult && doc_test_complex_vector(f, "[10, -2+2i, -2, -2-2i]", 0.0001); // // now we convert [10, -2+2i, -2, -2-2i] back to [1,2,3,4] // fftr1dinv(f, x2); _TestResult = _TestResult && doc_test_real_vector(x2, "[1,2,3,4]", 0.0001); // // remember that F is self-adjoint? It means that we can pass just half // (slightly larger than half) of F to inverse real FFT and still get our result. // // I.e. instead [10, -2+2i, -2, -2-2i] we pass just [10, -2+2i, -2] and everything works! // // NOTE: in this case we should explicitly pass array length (which is 4) to ALGLIB; // if not, it will automatically use array length to determine FFT size and // will erroneously make half-length FFT. 
// f = "[10, -2+2i, -2]"; fftr1dinv(f, 4, x2); _TestResult = _TestResult && doc_test_real_vector(x2, "[1,2,3,4]", 0.0001); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && (_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "fft_real_d2"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST fft_complex_e1 // error detection in backward FFT // _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<3; _spoil_scenario++) { try { complex_1d_array z = "[0,2,0,-2]"; if( _spoil_scenario==0 ) spoil_vector_by_nan(z); if( _spoil_scenario==1 ) spoil_vector_by_posinf(z); if( _spoil_scenario==2 ) spoil_vector_by_neginf(z); fftc1dinv(z); _TestResult = _TestResult && doc_test_complex_vector(z, "[0,1i,0,-1i]", 0.0001); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && (_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "fft_complex_e1"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST spline1d_d_linear // Piecewise linear spline interpolation // _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<12; _spoil_scenario++) { try { // // We use piecewise linear spline to interpolate f(x)=x^2 sampled // at 5 equidistant nodes on [-1,+1]. // real_1d_array x = "[-1.0,-0.5,0.0,+0.5,+1.0]"; if( _spoil_scenario==0 ) spoil_vector_by_nan(x); if( _spoil_scenario==1 ) spoil_vector_by_posinf(x); if( _spoil_scenario==2 ) spoil_vector_by_neginf(x); if( _spoil_scenario==3 ) spoil_vector_by_adding_element(x); if( _spoil_scenario==4 ) spoil_vector_by_deleting_element(x); real_1d_array y = "[+1.0,0.25,0.0,0.25,+1.0]"; if( _spoil_scenario==5 ) spoil_vector_by_nan(y); if( _spoil_scenario==6 ) spoil_vector_by_posinf(y); if( _spoil_scenario==7 ) spoil_vector_by_neginf(y); if( _spoil_scenario==8 ) spoil_vector_by_adding_element(y); if( _spoil_scenario==9 ) spoil_vector_by_deleting_element(y); double t = 0.25; if( _spoil_scenario==10 ) t = fp_posinf; if( _spoil_scenario==11 ) t = fp_neginf; double v; spline1dinterpolant s; // build spline spline1dbuildlinear(x, y, s); // calculate S(0.25) - it is quite different from 0.25^2=0.0625 v = spline1dcalc(s, t); _TestResult = _TestResult && doc_test_real(v, 0.125, 0.00005); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && (_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "spline1d_d_linear"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST spline1d_d_cubic // Cubic spline interpolation // _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<10; _spoil_scenario++) { try { // // We use cubic spline to interpolate f(x)=x^2 sampled // at 5 equidistant nodes on [-1,+1]. // // First, we use default boundary conditions ("parabolically terminated // spline") because cubic spline built with such boundary conditions // will exactly reproduce any quadratic f(x). // // Then we try to use natural boundary conditions // d2S(-1)/dx^2 = 0.0 // d2S(+1)/dx^2 = 0.0 // and see that such spline interpolated f(x) with small error. 
// real_1d_array x = "[-1.0,-0.5,0.0,+0.5,+1.0]"; if( _spoil_scenario==0 ) spoil_vector_by_nan(x); if( _spoil_scenario==1 ) spoil_vector_by_posinf(x); if( _spoil_scenario==2 ) spoil_vector_by_neginf(x); if( _spoil_scenario==3 ) spoil_vector_by_deleting_element(x); real_1d_array y = "[+1.0,0.25,0.0,0.25,+1.0]"; if( _spoil_scenario==4 ) spoil_vector_by_nan(y); if( _spoil_scenario==5 ) spoil_vector_by_posinf(y); if( _spoil_scenario==6 ) spoil_vector_by_neginf(y); if( _spoil_scenario==7 ) spoil_vector_by_deleting_element(y); double t = 0.25; if( _spoil_scenario==8 ) t = fp_posinf; if( _spoil_scenario==9 ) t = fp_neginf; double v; spline1dinterpolant s; ae_int_t natural_bound_type = 2; // // Test exact boundary conditions: build S(x), calculate S(0.25) // (almost same as original function) // spline1dbuildcubic(x, y, s); v = spline1dcalc(s, t); _TestResult = _TestResult && doc_test_real(v, 0.0625, 0.00001); // // Test natural boundary conditions: build S(x), calculate S(0.25) // (small interpolation error) // spline1dbuildcubic(x, y, 5, natural_bound_type, 0.0, natural_bound_type, 0.0, s); v = spline1dcalc(s, t); _TestResult = _TestResult && doc_test_real(v, 0.0580, 0.0001); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && (_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "spline1d_d_cubic"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST spline1d_d_monotone // Monotone interpolation // _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<10; _spoil_scenario++) { try { // // Spline built with spline1dbuildcubic() can be non-monotone even when // Y-values form a monotone sequence. Say, for x=[0,1,2] and y=[0,1,1] // cubic spline will monotonically grow until x=1.5 and then start // decreasing. // // That's why ALGLIB provides a special spline construction function // which builds a spline that preserves monotonicity of the original // dataset. // // NOTE: in case the original dataset is non-monotonic, ALGLIB splits it // into monotone subsequences and builds piecewise monotonic spline. 
// real_1d_array x = "[0,1,2]"; if( _spoil_scenario==0 ) spoil_vector_by_nan(x); if( _spoil_scenario==1 ) spoil_vector_by_posinf(x); if( _spoil_scenario==2 ) spoil_vector_by_neginf(x); if( _spoil_scenario==3 ) spoil_vector_by_adding_element(x); if( _spoil_scenario==4 ) spoil_vector_by_deleting_element(x); real_1d_array y = "[0,1,1]"; if( _spoil_scenario==5 ) spoil_vector_by_nan(y); if( _spoil_scenario==6 ) spoil_vector_by_posinf(y); if( _spoil_scenario==7 ) spoil_vector_by_neginf(y); if( _spoil_scenario==8 ) spoil_vector_by_adding_element(y); if( _spoil_scenario==9 ) spoil_vector_by_deleting_element(y); spline1dinterpolant s; // build spline spline1dbuildmonotone(x, y, s); // calculate S at x = [-0.5, 0.0, 0.5, 1.0, 1.5, 2.0] // you may see that spline is really monotonic double v; v = spline1dcalc(s, -0.5); _TestResult = _TestResult && doc_test_real(v, 0.0000, 0.00005); v = spline1dcalc(s, 0.0); _TestResult = _TestResult && doc_test_real(v, 0.0000, 0.00005); v = spline1dcalc(s, +0.5); _TestResult = _TestResult && doc_test_real(v, 0.5000, 0.00005); v = spline1dcalc(s, 1.0); _TestResult = _TestResult && doc_test_real(v, 1.0000, 0.00005); v = spline1dcalc(s, 1.5); _TestResult = _TestResult && doc_test_real(v, 1.0000, 0.00005); v = spline1dcalc(s, 2.0); _TestResult = _TestResult && doc_test_real(v, 1.0000, 0.00005); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && (_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "spline1d_d_monotone"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST spline1d_d_griddiff // Differentiation on the grid using cubic splines // _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<10; _spoil_scenario++) { try { // // We use cubic spline to do grid differentiation, i.e. having // values of f(x)=x^2 sampled at 5 equidistant nodes on [-1,+1] // we calculate derivatives of cubic spline at nodes WITHOUT // CONSTRUCTION OF SPLINE OBJECT. // // There are efficient functions spline1dgriddiffcubic() and // spline1dgriddiff2cubic() for such calculations. // // We use default boundary conditions ("parabolically terminated // spline") because cubic spline built with such boundary conditions // will exactly reproduce any quadratic f(x). // // Actually, we could use natural conditions, but we feel that // spline which exactly reproduces f() will show us more // understandable results. // real_1d_array x = "[-1.0,-0.5,0.0,+0.5,+1.0]"; if( _spoil_scenario==0 ) spoil_vector_by_nan(x); if( _spoil_scenario==1 ) spoil_vector_by_posinf(x); if( _spoil_scenario==2 ) spoil_vector_by_neginf(x); if( _spoil_scenario==3 ) spoil_vector_by_adding_element(x); if( _spoil_scenario==4 ) spoil_vector_by_deleting_element(x); real_1d_array y = "[+1.0,0.25,0.0,0.25,+1.0]"; if( _spoil_scenario==5 ) spoil_vector_by_nan(y); if( _spoil_scenario==6 ) spoil_vector_by_posinf(y); if( _spoil_scenario==7 ) spoil_vector_by_neginf(y); if( _spoil_scenario==8 ) spoil_vector_by_adding_element(y); if( _spoil_scenario==9 ) spoil_vector_by_deleting_element(y); real_1d_array d1; real_1d_array d2; // // We calculate first derivatives: they must be equal to 2*x // spline1dgriddiffcubic(x, y, d1); _TestResult = _TestResult && doc_test_real_vector(d1, "[-2.0, -1.0, 0.0, +1.0, +2.0]", 0.0001); // // Now test griddiff2, which returns first AND second derivatives. 
// First derivative is 2*x, second is equal to 2.0 // spline1dgriddiff2cubic(x, y, d1, d2); _TestResult = _TestResult && doc_test_real_vector(d1, "[-2.0, -1.0, 0.0, +1.0, +2.0]", 0.0001); _TestResult = _TestResult && doc_test_real_vector(d2, "[ 2.0, 2.0, 2.0, 2.0, 2.0]", 0.0001); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && (_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "spline1d_d_griddiff"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST spline1d_d_convdiff // Resampling using cubic splines // _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<11; _spoil_scenario++) { try { // // We use cubic spline to do resampling, i.e. having // values of f(x)=x^2 sampled at 5 equidistant nodes on [-1,+1] // we calculate values/derivatives of cubic spline on // another grid (equidistant with 9 nodes on [-1,+1]) // WITHOUT CONSTRUCTION OF SPLINE OBJECT. // // There are efficient functions spline1dconvcubic(), // spline1dconvdiffcubic() and spline1dconvdiff2cubic() // for such calculations. // // We use default boundary conditions ("parabolically terminated // spline") because cubic spline built with such boundary conditions // will exactly reproduce any quadratic f(x). // // Actually, we could use natural conditions, but we feel that // spline which exactly reproduces f() will show us more // understandable results. // real_1d_array x_old = "[-1.0,-0.5,0.0,+0.5,+1.0]"; if( _spoil_scenario==0 ) spoil_vector_by_nan(x_old); if( _spoil_scenario==1 ) spoil_vector_by_posinf(x_old); if( _spoil_scenario==2 ) spoil_vector_by_neginf(x_old); if( _spoil_scenario==3 ) spoil_vector_by_deleting_element(x_old); real_1d_array y_old = "[+1.0,0.25,0.0,0.25,+1.0]"; if( _spoil_scenario==4 ) spoil_vector_by_nan(y_old); if( _spoil_scenario==5 ) spoil_vector_by_posinf(y_old); if( _spoil_scenario==6 ) spoil_vector_by_neginf(y_old); if( _spoil_scenario==7 ) spoil_vector_by_deleting_element(y_old); real_1d_array x_new = "[-1.00,-0.75,-0.50,-0.25,0.00,+0.25,+0.50,+0.75,+1.00]"; if( _spoil_scenario==8 ) spoil_vector_by_nan(x_new); if( _spoil_scenario==9 ) spoil_vector_by_posinf(x_new); if( _spoil_scenario==10 ) spoil_vector_by_neginf(x_new); real_1d_array y_new; real_1d_array d1_new; real_1d_array d2_new; // // First, conversion without differentiation. 
// // spline1dconvcubic(x_old, y_old, x_new, y_new); _TestResult = _TestResult && doc_test_real_vector(y_new, "[1.0000, 0.5625, 0.2500, 0.0625, 0.0000, 0.0625, 0.2500, 0.5625, 1.0000]", 0.0001); // // Then, conversion with differentiation (first derivatives only) // // spline1dconvdiffcubic(x_old, y_old, x_new, y_new, d1_new); _TestResult = _TestResult && doc_test_real_vector(y_new, "[1.0000, 0.5625, 0.2500, 0.0625, 0.0000, 0.0625, 0.2500, 0.5625, 1.0000]", 0.0001); _TestResult = _TestResult && doc_test_real_vector(d1_new, "[-2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0]", 0.0001); // // Finally, conversion with first and second derivatives // // spline1dconvdiff2cubic(x_old, y_old, x_new, y_new, d1_new, d2_new); _TestResult = _TestResult && doc_test_real_vector(y_new, "[1.0000, 0.5625, 0.2500, 0.0625, 0.0000, 0.0625, 0.2500, 0.5625, 1.0000]", 0.0001); _TestResult = _TestResult && doc_test_real_vector(d1_new, "[-2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0]", 0.0001); _TestResult = _TestResult && doc_test_real_vector(d2_new, "[2.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0]", 0.0001); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && (_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "spline1d_d_convdiff"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST parametric_rdp // Parametric Ramer-Douglas-Peucker approximation // _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<7; _spoil_scenario++) { try { // // We use RDP algorithm to approximate parametric 2D curve given by // locations in t=0,1,2,3 (see below), which form piecewise linear // trajectory through D-dimensional space (2-dimensional in our example). // // | // | // - * * X2................X3 // | . // | . // - * * . * * * * // | . // | . // - * X1 * * * * // | ..... // | .... // X0----|-----|-----|-----|-----|-----|--- // ae_int_t npoints = 4; ae_int_t ndimensions = 2; real_2d_array x = "[[0,0],[2,1],[3,3],[6,3]]"; if( _spoil_scenario==0 ) spoil_matrix_by_nan(x); if( _spoil_scenario==1 ) spoil_matrix_by_posinf(x); if( _spoil_scenario==2 ) spoil_matrix_by_neginf(x); if( _spoil_scenario==3 ) spoil_matrix_by_deleting_row(x); if( _spoil_scenario==4 ) spoil_matrix_by_deleting_col(x); // // Approximation of parametric curve is performed by another parametric curve // with lesser amount of points. It allows to work with "compressed" // representation, which needs smaller amount of memory. Say, in our example // (we allow points with error smaller than 0.8) approximation will have // just two sequential sections connecting X0 with X2, and X2 with X3. // // | // | // - * * X2................X3 // | . // | . // - * . * * * * // | . // | . // - . X1 * * * * // | . // | . 
// X0----|-----|-----|-----|-----|-----|--- // // real_2d_array y; integer_1d_array idxy; ae_int_t nsections; ae_int_t limitcnt = 0; double limiteps = 0.8; if( _spoil_scenario==5 ) limiteps = fp_posinf; if( _spoil_scenario==6 ) limiteps = fp_neginf; parametricrdpfixed(x, npoints, ndimensions, limitcnt, limiteps, y, idxy, nsections); _TestResult = _TestResult && doc_test_int(nsections, 2); _TestResult = _TestResult && doc_test_int_vector(idxy, "[0,2,3]"); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && (_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "parametric_rdp"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST spline3d_trilinear // Trilinear spline interpolation // _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<22; _spoil_scenario++) { try { // // We use trilinear spline to interpolate f(x,y,z)=x+xy+z sampled // at (x,y,z) from [0.0, 1.0] X [0.0, 1.0] X [0.0, 1.0]. // // We store x, y and z-values at local arrays with same names. // Function values are stored in the array F as follows: // f[0] (x,y,z) = (0,0,0) // f[1] (x,y,z) = (1,0,0) // f[2] (x,y,z) = (0,1,0) // f[3] (x,y,z) = (1,1,0) // f[4] (x,y,z) = (0,0,1) // f[5] (x,y,z) = (1,0,1) // f[6] (x,y,z) = (0,1,1) // f[7] (x,y,z) = (1,1,1) // real_1d_array x = "[0.0, 1.0]"; if( _spoil_scenario==0 ) spoil_vector_by_nan(x); if( _spoil_scenario==1 ) spoil_vector_by_posinf(x); if( _spoil_scenario==2 ) spoil_vector_by_neginf(x); if( _spoil_scenario==3 ) spoil_vector_by_deleting_element(x); real_1d_array y = "[0.0, 1.0]"; if( _spoil_scenario==4 ) spoil_vector_by_nan(y); if( _spoil_scenario==5 ) spoil_vector_by_posinf(y); if( _spoil_scenario==6 ) spoil_vector_by_neginf(y); if( _spoil_scenario==7 ) spoil_vector_by_deleting_element(y); real_1d_array z = "[0.0, 1.0]"; if( _spoil_scenario==8 ) spoil_vector_by_nan(z); if( _spoil_scenario==9 ) spoil_vector_by_posinf(z); if( _spoil_scenario==10 ) spoil_vector_by_neginf(z); if( _spoil_scenario==11 ) spoil_vector_by_deleting_element(z); real_1d_array f = "[0,1,0,2,1,2,1,3]"; if( _spoil_scenario==12 ) spoil_vector_by_nan(f); if( _spoil_scenario==13 ) spoil_vector_by_posinf(f); if( _spoil_scenario==14 ) spoil_vector_by_neginf(f); if( _spoil_scenario==15 ) spoil_vector_by_deleting_element(f); double vx = 0.50; if( _spoil_scenario==16 ) vx = fp_posinf; if( _spoil_scenario==17 ) vx = fp_neginf; double vy = 0.50; if( _spoil_scenario==18 ) vy = fp_posinf; if( _spoil_scenario==19 ) vy = fp_neginf; double vz = 0.50; if( _spoil_scenario==20 ) vz = fp_posinf; if( _spoil_scenario==21 ) vz = fp_neginf; double v; spline3dinterpolant s; // build spline spline3dbuildtrilinearv(x, 2, y, 2, z, 2, f, 1, s); // calculate S(0.5,0.5,0.5) v = spline3dcalc(s, vx, vy, vz); _TestResult = _TestResult && doc_test_real(v, 1.2500, 0.00005); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && (_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "spline3d_trilinear"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST spline3d_vector // Vector-valued trilinear spline interpolation // _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<22; _spoil_scenario++) { try { // // We use trilinear vector-valued spline to interpolate {f0,f1}={x+xy+z,x+xy+yz+z} // sampled at (x,y,z) from [0.0, 1.0] X [0.0, 1.0] X [0.0, 1.0]. // // We store x, y and z-values at local arrays with same names. 
// Function values are stored in the array F as follows: // f[0] f0, (x,y,z) = (0,0,0) // f[1] f1, (x,y,z) = (0,0,0) // f[2] f0, (x,y,z) = (1,0,0) // f[3] f1, (x,y,z) = (1,0,0) // f[4] f0, (x,y,z) = (0,1,0) // f[5] f1, (x,y,z) = (0,1,0) // f[6] f0, (x,y,z) = (1,1,0) // f[7] f1, (x,y,z) = (1,1,0) // f[8] f0, (x,y,z) = (0,0,1) // f[9] f1, (x,y,z) = (0,0,1) // f[10] f0, (x,y,z) = (1,0,1) // f[11] f1, (x,y,z) = (1,0,1) // f[12] f0, (x,y,z) = (0,1,1) // f[13] f1, (x,y,z) = (0,1,1) // f[14] f0, (x,y,z) = (1,1,1) // f[15] f1, (x,y,z) = (1,1,1) // real_1d_array x = "[0.0, 1.0]"; if( _spoil_scenario==0 ) spoil_vector_by_nan(x); if( _spoil_scenario==1 ) spoil_vector_by_posinf(x); if( _spoil_scenario==2 ) spoil_vector_by_neginf(x); if( _spoil_scenario==3 ) spoil_vector_by_deleting_element(x); real_1d_array y = "[0.0, 1.0]"; if( _spoil_scenario==4 ) spoil_vector_by_nan(y); if( _spoil_scenario==5 ) spoil_vector_by_posinf(y); if( _spoil_scenario==6 ) spoil_vector_by_neginf(y); if( _spoil_scenario==7 ) spoil_vector_by_deleting_element(y); real_1d_array z = "[0.0, 1.0]"; if( _spoil_scenario==8 ) spoil_vector_by_nan(z); if( _spoil_scenario==9 ) spoil_vector_by_posinf(z); if( _spoil_scenario==10 ) spoil_vector_by_neginf(z); if( _spoil_scenario==11 ) spoil_vector_by_deleting_element(z); real_1d_array f = "[0,0, 1,1, 0,0, 2,2, 1,1, 2,2, 1,2, 3,4]"; if( _spoil_scenario==12 ) spoil_vector_by_nan(f); if( _spoil_scenario==13 ) spoil_vector_by_posinf(f); if( _spoil_scenario==14 ) spoil_vector_by_neginf(f); if( _spoil_scenario==15 ) spoil_vector_by_deleting_element(f); double vx = 0.50; if( _spoil_scenario==16 ) vx = fp_posinf; if( _spoil_scenario==17 ) vx = fp_neginf; double vy = 0.50; if( _spoil_scenario==18 ) vy = fp_posinf; if( _spoil_scenario==19 ) vy = fp_neginf; double vz = 0.50; if( _spoil_scenario==20 ) vz = fp_posinf; if( _spoil_scenario==21 ) vz = fp_neginf; spline3dinterpolant s; // build spline spline3dbuildtrilinearv(x, 2, y, 2, z, 2, f, 2, s); // calculate S(0.5,0.5,0.5) - we have vector of values instead of single value real_1d_array v; spline3dcalcv(s, vx, vy, vz, v); _TestResult = _TestResult && doc_test_real_vector(v, "[1.2500,1.5000]", 0.00005); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && (_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "spline3d_vector"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST polint_d_calcdiff // Interpolation and differentiation using barycentric representation // _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<12; _spoil_scenario++) { try { // // Here we demonstrate polynomial interpolation and differentiation // of y=x^2-x sampled at [0,1,2]. Barycentric representation of polynomial is used. 
// real_1d_array x = "[0,1,2]"; if( _spoil_scenario==0 ) spoil_vector_by_nan(x); if( _spoil_scenario==1 ) spoil_vector_by_posinf(x); if( _spoil_scenario==2 ) spoil_vector_by_neginf(x); if( _spoil_scenario==3 ) spoil_vector_by_adding_element(x); if( _spoil_scenario==4 ) spoil_vector_by_deleting_element(x); real_1d_array y = "[0,0,2]"; if( _spoil_scenario==5 ) spoil_vector_by_nan(y); if( _spoil_scenario==6 ) spoil_vector_by_posinf(y); if( _spoil_scenario==7 ) spoil_vector_by_neginf(y); if( _spoil_scenario==8 ) spoil_vector_by_adding_element(y); if( _spoil_scenario==9 ) spoil_vector_by_deleting_element(y); double t = -1; if( _spoil_scenario==10 ) t = fp_posinf; if( _spoil_scenario==11 ) t = fp_neginf; double v; double dv; double d2v; barycentricinterpolant p; // barycentric model is created polynomialbuild(x, y, p); // barycentric interpolation is demonstrated v = barycentriccalc(p, t); _TestResult = _TestResult && doc_test_real(v, 2.0, 0.00005); // barycentric differentation is demonstrated barycentricdiff1(p, t, v, dv); _TestResult = _TestResult && doc_test_real(v, 2.0, 0.00005); _TestResult = _TestResult && doc_test_real(dv, -3.0, 0.00005); // second derivatives with barycentric representation barycentricdiff1(p, t, v, dv); _TestResult = _TestResult && doc_test_real(v, 2.0, 0.00005); _TestResult = _TestResult && doc_test_real(dv, -3.0, 0.00005); barycentricdiff2(p, t, v, dv, d2v); _TestResult = _TestResult && doc_test_real(v, 2.0, 0.00005); _TestResult = _TestResult && doc_test_real(dv, -3.0, 0.00005); _TestResult = _TestResult && doc_test_real(d2v, 2.0, 0.00005); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && (_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "polint_d_calcdiff"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST polint_d_conv // Conversion between power basis and barycentric representation // _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<5; _spoil_scenario++) { try { // // Here we demonstrate conversion of y=x^2-x // between power basis and barycentric representation. // real_1d_array a = "[0,-1,+1]"; if( _spoil_scenario==0 ) spoil_vector_by_nan(a); if( _spoil_scenario==1 ) spoil_vector_by_posinf(a); if( _spoil_scenario==2 ) spoil_vector_by_neginf(a); double t = 2; if( _spoil_scenario==3 ) t = fp_posinf; if( _spoil_scenario==4 ) t = fp_neginf; real_1d_array a2; double v; barycentricinterpolant p; // // a=[0,-1,+1] is decomposition of y=x^2-x in the power basis: // // y = 0 - 1*x + 1*x^2 // // We convert it to the barycentric form. 
// polynomialpow2bar(a, p); // now we have barycentric interpolation; we can use it for interpolation v = barycentriccalc(p, t); _TestResult = _TestResult && doc_test_real(v, 2.0, 0.005); // we can also convert back from barycentric representation to power basis polynomialbar2pow(p, a2); _TestResult = _TestResult && doc_test_real_vector(a2, "[0,-1,+1]", 0.005); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && (_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "polint_d_conv"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST polint_d_spec // Polynomial interpolation on special grids (equidistant, Chebyshev I/II) // _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<11; _spoil_scenario++) { try { // // Temporaries: // * values of y=x^2-x sampled at three special grids: // * equdistant grid spanning [0,2], x[i] = 2*i/(N-1), i=0..N-1 // * Chebyshev-I grid spanning [-1,+1], x[i] = 1 + Cos(PI*(2*i+1)/(2*n)), i=0..N-1 // * Chebyshev-II grid spanning [-1,+1], x[i] = 1 + Cos(PI*i/(n-1)), i=0..N-1 // * barycentric interpolants for these three grids // * vectors to store coefficients of quadratic representation // real_1d_array y_eqdist = "[0,0,2]"; if( _spoil_scenario==0 ) spoil_vector_by_nan(y_eqdist); if( _spoil_scenario==1 ) spoil_vector_by_posinf(y_eqdist); if( _spoil_scenario==2 ) spoil_vector_by_neginf(y_eqdist); real_1d_array y_cheb1 = "[-0.116025,0.000000,1.616025]"; if( _spoil_scenario==3 ) spoil_vector_by_nan(y_cheb1); if( _spoil_scenario==4 ) spoil_vector_by_posinf(y_cheb1); if( _spoil_scenario==5 ) spoil_vector_by_neginf(y_cheb1); real_1d_array y_cheb2 = "[0,0,2]"; if( _spoil_scenario==6 ) spoil_vector_by_nan(y_cheb2); if( _spoil_scenario==7 ) spoil_vector_by_posinf(y_cheb2); if( _spoil_scenario==8 ) spoil_vector_by_neginf(y_cheb2); barycentricinterpolant p_eqdist; barycentricinterpolant p_cheb1; barycentricinterpolant p_cheb2; real_1d_array a_eqdist; real_1d_array a_cheb1; real_1d_array a_cheb2; // // First, we demonstrate construction of barycentric interpolants on // special grids. We unpack power representation to ensure that // interpolant was built correctly. // // In all three cases we should get same quadratic function. // polynomialbuildeqdist(0.0, 2.0, y_eqdist, p_eqdist); polynomialbar2pow(p_eqdist, a_eqdist); _TestResult = _TestResult && doc_test_real_vector(a_eqdist, "[0,-1,+1]", 0.00005); polynomialbuildcheb1(-1, +1, y_cheb1, p_cheb1); polynomialbar2pow(p_cheb1, a_cheb1); _TestResult = _TestResult && doc_test_real_vector(a_cheb1, "[0,-1,+1]", 0.00005); polynomialbuildcheb2(-1, +1, y_cheb2, p_cheb2); polynomialbar2pow(p_cheb2, a_cheb2); _TestResult = _TestResult && doc_test_real_vector(a_cheb2, "[0,-1,+1]", 0.00005); // // Now we demonstrate polynomial interpolation without construction // of the barycentricinterpolant structure. // // We calculate interpolant value at x=-2. 
// In all three cases we should get same f=6 // double t = -2; if( _spoil_scenario==9 ) t = fp_posinf; if( _spoil_scenario==10 ) t = fp_neginf; double v; v = polynomialcalceqdist(0.0, 2.0, y_eqdist, t); _TestResult = _TestResult && doc_test_real(v, 6.0, 0.00005); v = polynomialcalccheb1(-1, +1, y_cheb1, t); _TestResult = _TestResult && doc_test_real(v, 6.0, 0.00005); v = polynomialcalccheb2(-1, +1, y_cheb2, t); _TestResult = _TestResult && doc_test_real(v, 6.0, 0.00005); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && (_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "polint_d_spec"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST polint_t_1 // Polynomial interpolation, full list of parameters. // _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<10; _spoil_scenario++) { try { real_1d_array x = "[0,1,2]"; if( _spoil_scenario==0 ) spoil_vector_by_nan(x); if( _spoil_scenario==1 ) spoil_vector_by_posinf(x); if( _spoil_scenario==2 ) spoil_vector_by_neginf(x); if( _spoil_scenario==3 ) spoil_vector_by_deleting_element(x); real_1d_array y = "[0,0,2]"; if( _spoil_scenario==4 ) spoil_vector_by_nan(y); if( _spoil_scenario==5 ) spoil_vector_by_posinf(y); if( _spoil_scenario==6 ) spoil_vector_by_neginf(y); if( _spoil_scenario==7 ) spoil_vector_by_deleting_element(y); double t = -1; if( _spoil_scenario==8 ) t = fp_posinf; if( _spoil_scenario==9 ) t = fp_neginf; barycentricinterpolant p; double v; polynomialbuild(x, y, 3, p); v = barycentriccalc(p, t); _TestResult = _TestResult && doc_test_real(v, 2.0, 0.00005); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && (_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "polint_t_1"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST polint_t_2 // Polynomial interpolation, full list of parameters. // _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<6; _spoil_scenario++) { try { real_1d_array y = "[0,0,2]"; if( _spoil_scenario==0 ) spoil_vector_by_nan(y); if( _spoil_scenario==1 ) spoil_vector_by_posinf(y); if( _spoil_scenario==2 ) spoil_vector_by_neginf(y); if( _spoil_scenario==3 ) spoil_vector_by_deleting_element(y); double t = -1; if( _spoil_scenario==4 ) t = fp_posinf; if( _spoil_scenario==5 ) t = fp_neginf; barycentricinterpolant p; double v; polynomialbuildeqdist(0.0, 2.0, y, 3, p); v = barycentriccalc(p, t); _TestResult = _TestResult && doc_test_real(v, 2.0, 0.00005); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && (_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "polint_t_2"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST polint_t_3 // Polynomial interpolation, full list of parameters. 
// _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<6; _spoil_scenario++) { try { real_1d_array y = "[-0.116025,0.000000,1.616025]"; if( _spoil_scenario==0 ) spoil_vector_by_nan(y); if( _spoil_scenario==1 ) spoil_vector_by_posinf(y); if( _spoil_scenario==2 ) spoil_vector_by_neginf(y); if( _spoil_scenario==3 ) spoil_vector_by_deleting_element(y); double t = -1; if( _spoil_scenario==4 ) t = fp_posinf; if( _spoil_scenario==5 ) t = fp_neginf; barycentricinterpolant p; double v; polynomialbuildcheb1(-1.0, +1.0, y, 3, p); v = barycentriccalc(p, t); _TestResult = _TestResult && doc_test_real(v, 2.0, 0.00005); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && (_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "polint_t_3"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST polint_t_4 // Polynomial interpolation, full list of parameters. // _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<12; _spoil_scenario++) { try { real_1d_array y = "[0,0,2]"; if( _spoil_scenario==0 ) spoil_vector_by_nan(y); if( _spoil_scenario==1 ) spoil_vector_by_posinf(y); if( _spoil_scenario==2 ) spoil_vector_by_neginf(y); if( _spoil_scenario==3 ) spoil_vector_by_deleting_element(y); double t = -2; if( _spoil_scenario==4 ) t = fp_posinf; if( _spoil_scenario==5 ) t = fp_neginf; double a = -1; if( _spoil_scenario==6 ) a = fp_nan; if( _spoil_scenario==7 ) a = fp_posinf; if( _spoil_scenario==8 ) a = fp_neginf; double b = +1; if( _spoil_scenario==9 ) b = fp_nan; if( _spoil_scenario==10 ) b = fp_posinf; if( _spoil_scenario==11 ) b = fp_neginf; barycentricinterpolant p; double v; polynomialbuildcheb2(a, b, y, 3, p); v = barycentriccalc(p, t); _TestResult = _TestResult && doc_test_real(v, 6.0, 0.00005); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && (_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "polint_t_4"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST polint_t_5 // Polynomial interpolation, full list of parameters. // printf("100/145\n"); _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<6; _spoil_scenario++) { try { real_1d_array y = "[0,0,2]"; if( _spoil_scenario==0 ) spoil_vector_by_nan(y); if( _spoil_scenario==1 ) spoil_vector_by_posinf(y); if( _spoil_scenario==2 ) spoil_vector_by_neginf(y); if( _spoil_scenario==3 ) spoil_vector_by_deleting_element(y); double t = -1; if( _spoil_scenario==4 ) t = fp_posinf; if( _spoil_scenario==5 ) t = fp_neginf; double v; v = polynomialcalceqdist(0.0, 2.0, y, 3, t); _TestResult = _TestResult && doc_test_real(v, 2.0, 0.00005); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && (_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "polint_t_5"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST polint_t_6 // Polynomial interpolation, full list of parameters. 
// _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<12; _spoil_scenario++) { try { real_1d_array y = "[-0.116025,0.000000,1.616025]"; if( _spoil_scenario==0 ) spoil_vector_by_nan(y); if( _spoil_scenario==1 ) spoil_vector_by_posinf(y); if( _spoil_scenario==2 ) spoil_vector_by_neginf(y); if( _spoil_scenario==3 ) spoil_vector_by_deleting_element(y); double t = -1; if( _spoil_scenario==4 ) t = fp_posinf; if( _spoil_scenario==5 ) t = fp_neginf; double a = -1; if( _spoil_scenario==6 ) a = fp_nan; if( _spoil_scenario==7 ) a = fp_posinf; if( _spoil_scenario==8 ) a = fp_neginf; double b = +1; if( _spoil_scenario==9 ) b = fp_nan; if( _spoil_scenario==10 ) b = fp_posinf; if( _spoil_scenario==11 ) b = fp_neginf; double v; v = polynomialcalccheb1(a, b, y, 3, t); _TestResult = _TestResult && doc_test_real(v, 2.0, 0.00005); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && (_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "polint_t_6"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST polint_t_7 // Polynomial interpolation, full list of parameters. // _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<12; _spoil_scenario++) { try { real_1d_array y = "[0,0,2]"; if( _spoil_scenario==0 ) spoil_vector_by_nan(y); if( _spoil_scenario==1 ) spoil_vector_by_posinf(y); if( _spoil_scenario==2 ) spoil_vector_by_neginf(y); if( _spoil_scenario==3 ) spoil_vector_by_deleting_element(y); double t = -2; if( _spoil_scenario==4 ) t = fp_posinf; if( _spoil_scenario==5 ) t = fp_neginf; double a = -1; if( _spoil_scenario==6 ) a = fp_nan; if( _spoil_scenario==7 ) a = fp_posinf; if( _spoil_scenario==8 ) a = fp_neginf; double b = +1; if( _spoil_scenario==9 ) b = fp_nan; if( _spoil_scenario==10 ) b = fp_posinf; if( _spoil_scenario==11 ) b = fp_neginf; double v; v = polynomialcalccheb2(a, b, y, 3, t); _TestResult = _TestResult && doc_test_real(v, 6.0, 0.00005); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && (_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "polint_t_7"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST polint_t_8 // Polynomial interpolation: y=x^2-x, equidistant grid, barycentric form // _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<5; _spoil_scenario++) { try { real_1d_array y = "[0,0,2]"; if( _spoil_scenario==0 ) spoil_vector_by_nan(y); if( _spoil_scenario==1 ) spoil_vector_by_posinf(y); if( _spoil_scenario==2 ) spoil_vector_by_neginf(y); double t = -1; if( _spoil_scenario==3 ) t = fp_posinf; if( _spoil_scenario==4 ) t = fp_neginf; barycentricinterpolant p; double v; polynomialbuildeqdist(0.0, 2.0, y, p); v = barycentriccalc(p, t); _TestResult = _TestResult && doc_test_real(v, 2.0, 0.00005); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && (_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "polint_t_8"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST polint_t_9 // Polynomial interpolation: y=x^2-x, Chebyshev grid (first kind), barycentric form // _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<11; _spoil_scenario++) { try { real_1d_array y = "[-0.116025,0.000000,1.616025]"; if( _spoil_scenario==0 ) spoil_vector_by_nan(y); if( _spoil_scenario==1 ) spoil_vector_by_posinf(y); if( _spoil_scenario==2 ) spoil_vector_by_neginf(y); double t = -1; if( _spoil_scenario==3 ) t = fp_posinf; 
if( _spoil_scenario==4 ) t = fp_neginf; double a = -1; if( _spoil_scenario==5 ) a = fp_nan; if( _spoil_scenario==6 ) a = fp_posinf; if( _spoil_scenario==7 ) a = fp_neginf; double b = +1; if( _spoil_scenario==8 ) b = fp_nan; if( _spoil_scenario==9 ) b = fp_posinf; if( _spoil_scenario==10 ) b = fp_neginf; barycentricinterpolant p; double v; polynomialbuildcheb1(a, b, y, p); v = barycentriccalc(p, t); _TestResult = _TestResult && doc_test_real(v, 2.0, 0.00005); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && (_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "polint_t_9"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST polint_t_10 // Polynomial interpolation: y=x^2-x, Chebyshev grid (second kind), barycentric form // _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<11; _spoil_scenario++) { try { real_1d_array y = "[0,0,2]"; if( _spoil_scenario==0 ) spoil_vector_by_nan(y); if( _spoil_scenario==1 ) spoil_vector_by_posinf(y); if( _spoil_scenario==2 ) spoil_vector_by_neginf(y); double t = -2; if( _spoil_scenario==3 ) t = fp_posinf; if( _spoil_scenario==4 ) t = fp_neginf; double a = -1; if( _spoil_scenario==5 ) a = fp_nan; if( _spoil_scenario==6 ) a = fp_posinf; if( _spoil_scenario==7 ) a = fp_neginf; double b = +1; if( _spoil_scenario==8 ) b = fp_nan; if( _spoil_scenario==9 ) b = fp_posinf; if( _spoil_scenario==10 ) b = fp_neginf; barycentricinterpolant p; double v; polynomialbuildcheb2(a, b, y, p); v = barycentriccalc(p, t); _TestResult = _TestResult && doc_test_real(v, 6.0, 0.00005); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && (_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "polint_t_10"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST polint_t_11 // Polynomial interpolation: y=x^2-x, equidistant grid // _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<5; _spoil_scenario++) { try { real_1d_array y = "[0,0,2]"; if( _spoil_scenario==0 ) spoil_vector_by_nan(y); if( _spoil_scenario==1 ) spoil_vector_by_posinf(y); if( _spoil_scenario==2 ) spoil_vector_by_neginf(y); double t = -1; if( _spoil_scenario==3 ) t = fp_posinf; if( _spoil_scenario==4 ) t = fp_neginf; double v; v = polynomialcalceqdist(0.0, 2.0, y, t); _TestResult = _TestResult && doc_test_real(v, 2.0, 0.00005); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && (_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "polint_t_11"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST polint_t_12 // Polynomial interpolation: y=x^2-x, Chebyshev grid (first kind) // _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<11; _spoil_scenario++) { try { real_1d_array y = "[-0.116025,0.000000,1.616025]"; if( _spoil_scenario==0 ) spoil_vector_by_nan(y); if( _spoil_scenario==1 ) spoil_vector_by_posinf(y); if( _spoil_scenario==2 ) spoil_vector_by_neginf(y); double t = -1; if( _spoil_scenario==3 ) t = fp_posinf; if( _spoil_scenario==4 ) t = fp_neginf; double a = -1; if( _spoil_scenario==5 ) a = fp_nan; if( _spoil_scenario==6 ) a = fp_posinf; if( _spoil_scenario==7 ) a = fp_neginf; double b = +1; if( _spoil_scenario==8 ) b = fp_nan; if( _spoil_scenario==9 ) b = fp_posinf; if( _spoil_scenario==10 ) b = fp_neginf; double v; v = polynomialcalccheb1(a, b, y, t); _TestResult = _TestResult && doc_test_real(v, 2.0, 0.00005); _TestResult = 
_TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && (_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "polint_t_12"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST polint_t_13 // Polynomial interpolation: y=x^2-x, Chebyshev grid (second kind) // _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<11; _spoil_scenario++) { try { real_1d_array y = "[0,0,2]"; if( _spoil_scenario==0 ) spoil_vector_by_nan(y); if( _spoil_scenario==1 ) spoil_vector_by_posinf(y); if( _spoil_scenario==2 ) spoil_vector_by_neginf(y); double t = -2; if( _spoil_scenario==3 ) t = fp_posinf; if( _spoil_scenario==4 ) t = fp_neginf; double a = -1; if( _spoil_scenario==5 ) a = fp_nan; if( _spoil_scenario==6 ) a = fp_posinf; if( _spoil_scenario==7 ) a = fp_neginf; double b = +1; if( _spoil_scenario==8 ) b = fp_nan; if( _spoil_scenario==9 ) b = fp_posinf; if( _spoil_scenario==10 ) b = fp_neginf; double v; v = polynomialcalccheb2(a, b, y, t); _TestResult = _TestResult && doc_test_real(v, 6.0, 0.00005); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && (_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "polint_t_13"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST lsfit_d_nlf // Nonlinear fitting using function value only // _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<24; _spoil_scenario++) { try { // // In this example we demonstrate exponential fitting // by f(x) = exp(-c*x^2) // using function value only. // // Gradient is estimated using combination of numerical differences // and secant updates. diffstep variable stores differentiation step // (we have to tell algorithm what step to use). 
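//
// NOTE: the callback function_cx_1_func (and the function_cx_1_grad /
// function_cx_1_hess variants used by the gradient/Hessian tests below) is
// defined elsewhere in this manual. As a minimal sketch - assuming ALGLIB's
// usual lsfitfit() callback convention and the f(x) = exp(-c*x^2) model
// described above - the function-value callback might look like this:
//
//     void function_cx_1_func(const real_1d_array &c, const real_1d_array &x,
//                             double &func, void *ptr)
//     {
//         // evaluate f(c,x) = exp(-c0*x0^2) for the current parameter vector c
//         func = exp(-c[0]*pow(x[0],2));
//     }
//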
// real_2d_array x = "[[-1],[-0.8],[-0.6],[-0.4],[-0.2],[0],[0.2],[0.4],[0.6],[0.8],[1.0]]"; if( _spoil_scenario==0 ) spoil_matrix_by_nan(x); if( _spoil_scenario==1 ) spoil_matrix_by_posinf(x); if( _spoil_scenario==2 ) spoil_matrix_by_neginf(x); if( _spoil_scenario==3 ) spoil_matrix_by_deleting_row(x); if( _spoil_scenario==4 ) spoil_matrix_by_deleting_col(x); real_1d_array y = "[0.223130, 0.382893, 0.582748, 0.786628, 0.941765, 1.000000, 0.941765, 0.786628, 0.582748, 0.382893, 0.223130]"; if( _spoil_scenario==5 ) spoil_vector_by_nan(y); if( _spoil_scenario==6 ) spoil_vector_by_posinf(y); if( _spoil_scenario==7 ) spoil_vector_by_neginf(y); if( _spoil_scenario==8 ) spoil_vector_by_adding_element(y); if( _spoil_scenario==9 ) spoil_vector_by_deleting_element(y); real_1d_array c = "[0.3]"; if( _spoil_scenario==10 ) spoil_vector_by_nan(c); if( _spoil_scenario==11 ) spoil_vector_by_posinf(c); if( _spoil_scenario==12 ) spoil_vector_by_neginf(c); double epsx = 0.000001; if( _spoil_scenario==13 ) epsx = fp_nan; if( _spoil_scenario==14 ) epsx = fp_posinf; if( _spoil_scenario==15 ) epsx = fp_neginf; ae_int_t maxits = 0; ae_int_t info; lsfitstate state; lsfitreport rep; double diffstep = 0.0001; if( _spoil_scenario==16 ) diffstep = fp_nan; if( _spoil_scenario==17 ) diffstep = fp_posinf; if( _spoil_scenario==18 ) diffstep = fp_neginf; // // Fitting without weights // lsfitcreatef(x, y, c, diffstep, state); lsfitsetcond(state, epsx, maxits); alglib::lsfitfit(state, function_cx_1_func); lsfitresults(state, info, c, rep); _TestResult = _TestResult && doc_test_int(info, 2); _TestResult = _TestResult && doc_test_real_vector(c, "[1.5]", 0.05); // // Fitting with weights // (you can change weights and see how it changes result) // real_1d_array w = "[1,1,1,1,1,1,1,1,1,1,1]"; if( _spoil_scenario==19 ) spoil_vector_by_nan(w); if( _spoil_scenario==20 ) spoil_vector_by_posinf(w); if( _spoil_scenario==21 ) spoil_vector_by_neginf(w); if( _spoil_scenario==22 ) spoil_vector_by_adding_element(w); if( _spoil_scenario==23 ) spoil_vector_by_deleting_element(w); lsfitcreatewf(x, y, w, c, diffstep, state); lsfitsetcond(state, epsx, maxits); alglib::lsfitfit(state, function_cx_1_func); lsfitresults(state, info, c, rep); _TestResult = _TestResult && doc_test_int(info, 2); _TestResult = _TestResult && doc_test_real_vector(c, "[1.5]", 0.05); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && (_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "lsfit_d_nlf"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST lsfit_d_nlfg // Nonlinear fitting using gradient // _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<21; _spoil_scenario++) { try { // // In this example we demonstrate exponential fitting // by f(x) = exp(-c*x^2) // using function value and gradient (with respect to c). 
// real_2d_array x = "[[-1],[-0.8],[-0.6],[-0.4],[-0.2],[0],[0.2],[0.4],[0.6],[0.8],[1.0]]"; if( _spoil_scenario==0 ) spoil_matrix_by_nan(x); if( _spoil_scenario==1 ) spoil_matrix_by_posinf(x); if( _spoil_scenario==2 ) spoil_matrix_by_neginf(x); if( _spoil_scenario==3 ) spoil_matrix_by_deleting_row(x); if( _spoil_scenario==4 ) spoil_matrix_by_deleting_col(x); real_1d_array y = "[0.223130, 0.382893, 0.582748, 0.786628, 0.941765, 1.000000, 0.941765, 0.786628, 0.582748, 0.382893, 0.223130]"; if( _spoil_scenario==5 ) spoil_vector_by_nan(y); if( _spoil_scenario==6 ) spoil_vector_by_posinf(y); if( _spoil_scenario==7 ) spoil_vector_by_neginf(y); if( _spoil_scenario==8 ) spoil_vector_by_adding_element(y); if( _spoil_scenario==9 ) spoil_vector_by_deleting_element(y); real_1d_array c = "[0.3]"; if( _spoil_scenario==10 ) spoil_vector_by_nan(c); if( _spoil_scenario==11 ) spoil_vector_by_posinf(c); if( _spoil_scenario==12 ) spoil_vector_by_neginf(c); double epsx = 0.000001; if( _spoil_scenario==13 ) epsx = fp_nan; if( _spoil_scenario==14 ) epsx = fp_posinf; if( _spoil_scenario==15 ) epsx = fp_neginf; ae_int_t maxits = 0; ae_int_t info; lsfitstate state; lsfitreport rep; // // Fitting without weights // lsfitcreatefg(x, y, c, true, state); lsfitsetcond(state, epsx, maxits); alglib::lsfitfit(state, function_cx_1_func, function_cx_1_grad); lsfitresults(state, info, c, rep); _TestResult = _TestResult && doc_test_int(info, 2); _TestResult = _TestResult && doc_test_real_vector(c, "[1.5]", 0.05); // // Fitting with weights // (you can change weights and see how it changes result) // real_1d_array w = "[1,1,1,1,1,1,1,1,1,1,1]"; if( _spoil_scenario==16 ) spoil_vector_by_nan(w); if( _spoil_scenario==17 ) spoil_vector_by_posinf(w); if( _spoil_scenario==18 ) spoil_vector_by_neginf(w); if( _spoil_scenario==19 ) spoil_vector_by_adding_element(w); if( _spoil_scenario==20 ) spoil_vector_by_deleting_element(w); lsfitcreatewfg(x, y, w, c, true, state); lsfitsetcond(state, epsx, maxits); alglib::lsfitfit(state, function_cx_1_func, function_cx_1_grad); lsfitresults(state, info, c, rep); _TestResult = _TestResult && doc_test_int(info, 2); _TestResult = _TestResult && doc_test_real_vector(c, "[1.5]", 0.05); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && (_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "lsfit_d_nlfg"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST lsfit_d_nlfgh // Nonlinear fitting using gradient and Hessian // _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<21; _spoil_scenario++) { try { // // In this example we demonstrate exponential fitting // by f(x) = exp(-c*x^2) // using function value, gradient and Hessian (with respect to c) // real_2d_array x = "[[-1],[-0.8],[-0.6],[-0.4],[-0.2],[0],[0.2],[0.4],[0.6],[0.8],[1.0]]"; if( _spoil_scenario==0 ) spoil_matrix_by_nan(x); if( _spoil_scenario==1 ) spoil_matrix_by_posinf(x); if( _spoil_scenario==2 ) spoil_matrix_by_neginf(x); if( _spoil_scenario==3 ) spoil_matrix_by_deleting_row(x); if( _spoil_scenario==4 ) spoil_matrix_by_deleting_col(x); real_1d_array y = "[0.223130, 0.382893, 0.582748, 0.786628, 0.941765, 1.000000, 0.941765, 0.786628, 0.582748, 0.382893, 0.223130]"; if( _spoil_scenario==5 ) spoil_vector_by_nan(y); if( _spoil_scenario==6 ) spoil_vector_by_posinf(y); if( _spoil_scenario==7 ) spoil_vector_by_neginf(y); if( _spoil_scenario==8 ) spoil_vector_by_adding_element(y); if( _spoil_scenario==9 ) spoil_vector_by_deleting_element(y); 
real_1d_array c = "[0.3]"; if( _spoil_scenario==10 ) spoil_vector_by_nan(c); if( _spoil_scenario==11 ) spoil_vector_by_posinf(c); if( _spoil_scenario==12 ) spoil_vector_by_neginf(c); double epsx = 0.000001; if( _spoil_scenario==13 ) epsx = fp_nan; if( _spoil_scenario==14 ) epsx = fp_posinf; if( _spoil_scenario==15 ) epsx = fp_neginf; ae_int_t maxits = 0; ae_int_t info; lsfitstate state; lsfitreport rep; // // Fitting without weights // lsfitcreatefgh(x, y, c, state); lsfitsetcond(state, epsx, maxits); alglib::lsfitfit(state, function_cx_1_func, function_cx_1_grad, function_cx_1_hess); lsfitresults(state, info, c, rep); _TestResult = _TestResult && doc_test_int(info, 2); _TestResult = _TestResult && doc_test_real_vector(c, "[1.5]", 0.05); // // Fitting with weights // (you can change weights and see how it changes result) // real_1d_array w = "[1,1,1,1,1,1,1,1,1,1,1]"; if( _spoil_scenario==16 ) spoil_vector_by_nan(w); if( _spoil_scenario==17 ) spoil_vector_by_posinf(w); if( _spoil_scenario==18 ) spoil_vector_by_neginf(w); if( _spoil_scenario==19 ) spoil_vector_by_adding_element(w); if( _spoil_scenario==20 ) spoil_vector_by_deleting_element(w); lsfitcreatewfgh(x, y, w, c, state); lsfitsetcond(state, epsx, maxits); alglib::lsfitfit(state, function_cx_1_func, function_cx_1_grad, function_cx_1_hess); lsfitresults(state, info, c, rep); _TestResult = _TestResult && doc_test_int(info, 2); _TestResult = _TestResult && doc_test_real_vector(c, "[1.5]", 0.05); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && (_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "lsfit_d_nlfgh"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST lsfit_d_nlfb // Bound constrained nonlinear fitting using function value only // _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<23; _spoil_scenario++) { try { // // In this example we demonstrate exponential fitting by // f(x) = exp(-c*x^2) // subject to bound constraints // 0.0 <= c <= 1.0 // using function value only. // // Gradient is estimated using combination of numerical differences // and secant updates. diffstep variable stores differentiation step // (we have to tell algorithm what step to use). // // Unconstrained solution is c=1.5, but because of constraints we should // get c=1.0 (at the boundary).
// real_2d_array x = "[[-1],[-0.8],[-0.6],[-0.4],[-0.2],[0],[0.2],[0.4],[0.6],[0.8],[1.0]]"; if( _spoil_scenario==0 ) spoil_matrix_by_nan(x); if( _spoil_scenario==1 ) spoil_matrix_by_posinf(x); if( _spoil_scenario==2 ) spoil_matrix_by_neginf(x); if( _spoil_scenario==3 ) spoil_matrix_by_deleting_row(x); if( _spoil_scenario==4 ) spoil_matrix_by_deleting_col(x); real_1d_array y = "[0.223130, 0.382893, 0.582748, 0.786628, 0.941765, 1.000000, 0.941765, 0.786628, 0.582748, 0.382893, 0.223130]"; if( _spoil_scenario==5 ) spoil_vector_by_nan(y); if( _spoil_scenario==6 ) spoil_vector_by_posinf(y); if( _spoil_scenario==7 ) spoil_vector_by_neginf(y); if( _spoil_scenario==8 ) spoil_vector_by_adding_element(y); if( _spoil_scenario==9 ) spoil_vector_by_deleting_element(y); real_1d_array c = "[0.3]"; if( _spoil_scenario==10 ) spoil_vector_by_nan(c); if( _spoil_scenario==11 ) spoil_vector_by_posinf(c); if( _spoil_scenario==12 ) spoil_vector_by_neginf(c); real_1d_array bndl = "[0.0]"; if( _spoil_scenario==13 ) spoil_vector_by_nan(bndl); if( _spoil_scenario==14 ) spoil_vector_by_deleting_element(bndl); real_1d_array bndu = "[1.0]"; if( _spoil_scenario==15 ) spoil_vector_by_nan(bndu); if( _spoil_scenario==16 ) spoil_vector_by_deleting_element(bndu); double epsx = 0.000001; if( _spoil_scenario==17 ) epsx = fp_nan; if( _spoil_scenario==18 ) epsx = fp_posinf; if( _spoil_scenario==19 ) epsx = fp_neginf; ae_int_t maxits = 0; ae_int_t info; lsfitstate state; lsfitreport rep; double diffstep = 0.0001; if( _spoil_scenario==20 ) diffstep = fp_nan; if( _spoil_scenario==21 ) diffstep = fp_posinf; if( _spoil_scenario==22 ) diffstep = fp_neginf; lsfitcreatef(x, y, c, diffstep, state); lsfitsetbc(state, bndl, bndu); lsfitsetcond(state, epsx, maxits); alglib::lsfitfit(state, function_cx_1_func); lsfitresults(state, info, c, rep); _TestResult = _TestResult && doc_test_real_vector(c, "[1.0]", 0.05); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && (_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "lsfit_d_nlfb"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST lsfit_d_nlscale // Nonlinear fitting with custom scaling and bound constraints // _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<27; _spoil_scenario++) { try { // // In this example we demonstrate fitting by // f(x) = c[0]*(1+c[1]*((x-1999)^c[2]-1)) // subject to bound constraints // -INF < c[0] < +INF // -10 <= c[1] <= +10 // 0.1 <= c[2] <= 2.0 // Data we want to fit are time series of Japan national debt // collected from 2000 to 2008 measured in USD (dollars, not // millions of dollars). // // Our variables are: // c[0] - debt value at initial moment (2000), // c[1] - direction coefficient (growth or decrease), // c[2] - curvature coefficient. // You may see that our variables are badly scaled - first one // is order of 10^12, and next two are somewhere about 1 in // magnitude. Such problem is difficult to solve without some // kind of scaling. // That is exactly where lsfitsetscale() function can be used. // We set scale of our variables to [1.0E12, 1, 1], which allows // us to easily solve this problem. // // You can try commenting out lsfitsetscale() call - and you will // see that algorithm will fail to converge. 
// real_2d_array x = "[[2000],[2001],[2002],[2003],[2004],[2005],[2006],[2007],[2008]]"; if( _spoil_scenario==0 ) spoil_matrix_by_nan(x); if( _spoil_scenario==1 ) spoil_matrix_by_posinf(x); if( _spoil_scenario==2 ) spoil_matrix_by_neginf(x); if( _spoil_scenario==3 ) spoil_matrix_by_deleting_row(x); if( _spoil_scenario==4 ) spoil_matrix_by_deleting_col(x); real_1d_array y = "[4323239600000.0, 4560913100000.0, 5564091500000.0, 6743189300000.0, 7284064600000.0, 7050129600000.0, 7092221500000.0, 8483907600000.0, 8625804400000.0]"; if( _spoil_scenario==5 ) spoil_vector_by_nan(y); if( _spoil_scenario==6 ) spoil_vector_by_posinf(y); if( _spoil_scenario==7 ) spoil_vector_by_neginf(y); if( _spoil_scenario==8 ) spoil_vector_by_adding_element(y); if( _spoil_scenario==9 ) spoil_vector_by_deleting_element(y); real_1d_array c = "[1.0e+13, 1, 1]"; if( _spoil_scenario==10 ) spoil_vector_by_nan(c); if( _spoil_scenario==11 ) spoil_vector_by_posinf(c); if( _spoil_scenario==12 ) spoil_vector_by_neginf(c); double epsx = 1.0e-5; if( _spoil_scenario==13 ) epsx = fp_nan; if( _spoil_scenario==14 ) epsx = fp_posinf; if( _spoil_scenario==15 ) epsx = fp_neginf; real_1d_array bndl = "[-inf, -10, 0.1]"; if( _spoil_scenario==16 ) spoil_vector_by_nan(bndl); if( _spoil_scenario==17 ) spoil_vector_by_deleting_element(bndl); real_1d_array bndu = "[+inf, +10, 2.0]"; if( _spoil_scenario==18 ) spoil_vector_by_nan(bndu); if( _spoil_scenario==19 ) spoil_vector_by_deleting_element(bndu); real_1d_array s = "[1.0e+12, 1, 1]"; if( _spoil_scenario==20 ) spoil_vector_by_nan(s); if( _spoil_scenario==21 ) spoil_vector_by_posinf(s); if( _spoil_scenario==22 ) spoil_vector_by_neginf(s); if( _spoil_scenario==23 ) spoil_vector_by_deleting_element(s); ae_int_t maxits = 0; ae_int_t info; lsfitstate state; lsfitreport rep; double diffstep = 1.0e-5; if( _spoil_scenario==24 ) diffstep = fp_nan; if( _spoil_scenario==25 ) diffstep = fp_posinf; if( _spoil_scenario==26 ) diffstep = fp_neginf; lsfitcreatef(x, y, c, diffstep, state); lsfitsetcond(state, epsx, maxits); lsfitsetbc(state, bndl, bndu); lsfitsetscale(state, s); alglib::lsfitfit(state, function_debt_func); lsfitresults(state, info, c, rep); _TestResult = _TestResult && doc_test_int(info, 2); _TestResult = _TestResult && doc_test_real_vector(c, "[4.142560e+12, 0.434240, 0.565376]", -0.005); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && (_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "lsfit_d_nlscale"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST lsfit_d_lin // Unconstrained (general) linear least squares fitting with and without weights // _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<13; _spoil_scenario++) { try { // // In this example we demonstrate linear fitting by f(x|a) = a*exp(0.5*x). // // We have: // * y - vector of experimental data // * fmatrix - matrix of basis functions calculated at sample points // Actually, we have only one basis function F0 = exp(0.5*x). 
// real_2d_array fmatrix = "[[0.606531],[0.670320],[0.740818],[0.818731],[0.904837],[1.000000],[1.105171],[1.221403],[1.349859],[1.491825],[1.648721]]"; if( _spoil_scenario==0 ) spoil_matrix_by_nan(fmatrix); if( _spoil_scenario==1 ) spoil_matrix_by_posinf(fmatrix); if( _spoil_scenario==2 ) spoil_matrix_by_neginf(fmatrix); real_1d_array y = "[1.133719, 1.306522, 1.504604, 1.554663, 1.884638, 2.072436, 2.257285, 2.534068, 2.622017, 2.897713, 3.219371]"; if( _spoil_scenario==3 ) spoil_vector_by_nan(y); if( _spoil_scenario==4 ) spoil_vector_by_posinf(y); if( _spoil_scenario==5 ) spoil_vector_by_neginf(y); if( _spoil_scenario==6 ) spoil_vector_by_adding_element(y); if( _spoil_scenario==7 ) spoil_vector_by_deleting_element(y); ae_int_t info; real_1d_array c; lsfitreport rep; // // Linear fitting without weights // lsfitlinear(y, fmatrix, info, c, rep); _TestResult = _TestResult && doc_test_int(info, 1); _TestResult = _TestResult && doc_test_real_vector(c, "[1.98650]", 0.00005); // // Linear fitting with individual weights. // Slightly different result is returned. // real_1d_array w = "[1.414213, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]"; if( _spoil_scenario==8 ) spoil_vector_by_nan(w); if( _spoil_scenario==9 ) spoil_vector_by_posinf(w); if( _spoil_scenario==10 ) spoil_vector_by_neginf(w); if( _spoil_scenario==11 ) spoil_vector_by_adding_element(w); if( _spoil_scenario==12 ) spoil_vector_by_deleting_element(w); lsfitlinearw(y, w, fmatrix, info, c, rep); _TestResult = _TestResult && doc_test_int(info, 1); _TestResult = _TestResult && doc_test_real_vector(c, "[1.983354]", 0.00005); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && (_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "lsfit_d_lin"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST lsfit_d_linc // Constrained (general) linear least squares fitting with and without weights // _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<20; _spoil_scenario++) { try { // // In this example we demonstrate linear fitting by f(x|a,b) = a*x+b // with simple constraint f(0)=0. // // We have: // * y - vector of experimental data // * fmatrix - matrix of basis functions sampled at [0,1] with step 0.2: // [ 1.0 0.0 ] // [ 1.0 0.2 ] // [ 1.0 0.4 ] // [ 1.0 0.6 ] // [ 1.0 0.8 ] // [ 1.0 1.0 ] // first column contains value of first basis function (constant term) // second column contains second basis function (linear term) // * cmatrix - matrix of linear constraints: // [ 1.0 0.0 0.0 ] // first two columns contain coefficients before basis functions, // last column contains desired value of their sum. 
// So [1,0,0] means "1*constant_term + 0*linear_term = 0" // real_1d_array y = "[0.072436,0.246944,0.491263,0.522300,0.714064,0.921929]"; if( _spoil_scenario==0 ) spoil_vector_by_nan(y); if( _spoil_scenario==1 ) spoil_vector_by_posinf(y); if( _spoil_scenario==2 ) spoil_vector_by_neginf(y); if( _spoil_scenario==3 ) spoil_vector_by_adding_element(y); if( _spoil_scenario==4 ) spoil_vector_by_deleting_element(y); real_2d_array fmatrix = "[[1,0.0],[1,0.2],[1,0.4],[1,0.6],[1,0.8],[1,1.0]]"; if( _spoil_scenario==5 ) spoil_matrix_by_nan(fmatrix); if( _spoil_scenario==6 ) spoil_matrix_by_posinf(fmatrix); if( _spoil_scenario==7 ) spoil_matrix_by_neginf(fmatrix); if( _spoil_scenario==8 ) spoil_matrix_by_adding_row(fmatrix); if( _spoil_scenario==9 ) spoil_matrix_by_adding_col(fmatrix); if( _spoil_scenario==10 ) spoil_matrix_by_deleting_row(fmatrix); if( _spoil_scenario==11 ) spoil_matrix_by_deleting_col(fmatrix); real_2d_array cmatrix = "[[1,0,0]]"; if( _spoil_scenario==12 ) spoil_matrix_by_nan(cmatrix); if( _spoil_scenario==13 ) spoil_matrix_by_posinf(cmatrix); if( _spoil_scenario==14 ) spoil_matrix_by_neginf(cmatrix); ae_int_t info; real_1d_array c; lsfitreport rep; // // Constrained fitting without weights // lsfitlinearc(y, fmatrix, cmatrix, info, c, rep); _TestResult = _TestResult && doc_test_int(info, 1); _TestResult = _TestResult && doc_test_real_vector(c, "[0,0.932933]", 0.0005); // // Constrained fitting with individual weights // real_1d_array w = "[1, 1.414213, 1, 1, 1, 1]"; if( _spoil_scenario==15 ) spoil_vector_by_nan(w); if( _spoil_scenario==16 ) spoil_vector_by_posinf(w); if( _spoil_scenario==17 ) spoil_vector_by_neginf(w); if( _spoil_scenario==18 ) spoil_vector_by_adding_element(w); if( _spoil_scenario==19 ) spoil_vector_by_deleting_element(w); lsfitlinearwc(y, w, fmatrix, cmatrix, info, c, rep); _TestResult = _TestResult && doc_test_int(info, 1); _TestResult = _TestResult && doc_test_real_vector(c, "[0,0.938322]", 0.0005); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && (_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "lsfit_d_linc"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST lsfit_d_pol // Unconstrained polynomial fitting // _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<20; _spoil_scenario++) { try { // // This example demonstrates polynomial fitting. // // Fitting is done by two (M=2) functions from polynomial basis: // f0 = 1 // f1 = x // Basically, it just a linear fit; more complex polynomials may be used // (e.g. parabolas with M=3, cubic with M=4), but even such simple fit allows // us to demonstrate polynomialfit() function in action. // // We have: // * x set of abscissas // * y experimental data // // Additionally we demonstrate weighted fitting, where second point has // more weight than other ones. 
// real_1d_array x = "[0.0,0.1,0.2,0.3,0.4,0.5,0.6,0.7,0.8,0.9,1.0]"; if( _spoil_scenario==0 ) spoil_vector_by_nan(x); if( _spoil_scenario==1 ) spoil_vector_by_posinf(x); if( _spoil_scenario==2 ) spoil_vector_by_neginf(x); if( _spoil_scenario==3 ) spoil_vector_by_adding_element(x); if( _spoil_scenario==4 ) spoil_vector_by_deleting_element(x); real_1d_array y = "[0.00,0.05,0.26,0.32,0.33,0.43,0.60,0.60,0.77,0.98,1.02]"; if( _spoil_scenario==5 ) spoil_vector_by_nan(y); if( _spoil_scenario==6 ) spoil_vector_by_posinf(y); if( _spoil_scenario==7 ) spoil_vector_by_neginf(y); if( _spoil_scenario==8 ) spoil_vector_by_adding_element(y); if( _spoil_scenario==9 ) spoil_vector_by_deleting_element(y); ae_int_t m = 2; double t = 2; if( _spoil_scenario==10 ) t = fp_posinf; if( _spoil_scenario==11 ) t = fp_neginf; ae_int_t info; barycentricinterpolant p; polynomialfitreport rep; double v; // // Fitting without individual weights // // NOTE: result is returned as barycentricinterpolant structure. // if you want to get representation in the power basis, // you can use barycentricbar2pow() function to convert // from barycentric to power representation (see docs for // POLINT subpackage for more info). // polynomialfit(x, y, m, info, p, rep); v = barycentriccalc(p, t); _TestResult = _TestResult && doc_test_real(v, 2.011, 0.002); // // Fitting with individual weights // // NOTE: slightly different result is returned // real_1d_array w = "[1,1.414213562,1,1,1,1,1,1,1,1,1]"; if( _spoil_scenario==12 ) spoil_vector_by_nan(w); if( _spoil_scenario==13 ) spoil_vector_by_posinf(w); if( _spoil_scenario==14 ) spoil_vector_by_neginf(w); if( _spoil_scenario==15 ) spoil_vector_by_adding_element(w); if( _spoil_scenario==16 ) spoil_vector_by_deleting_element(w); real_1d_array xc = "[]"; if( _spoil_scenario==17 ) spoil_vector_by_adding_element(xc); real_1d_array yc = "[]"; if( _spoil_scenario==18 ) spoil_vector_by_adding_element(yc); integer_1d_array dc = "[]"; if( _spoil_scenario==19 ) spoil_vector_by_adding_element(dc); polynomialfitwc(x, y, w, xc, yc, dc, m, info, p, rep); v = barycentriccalc(p, t); _TestResult = _TestResult && doc_test_real(v, 2.023, 0.002); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && (_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "lsfit_d_pol"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST lsfit_d_polc // Constrained polynomial fitting // _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<29; _spoil_scenario++) { try { // // This example demonstrates polynomial fitting. // // Fitting is done by two (M=2) functions from polynomial basis: // f0 = 1 // f1 = x // with simple constraint on function value // f(0) = 0 // Basically, it just a linear fit; more complex polynomials may be used // (e.g. parabolas with M=3, cubic with M=4), but even such simple fit allows // us to demonstrate polynomialfit() function in action. 
// // We have: // * x set of abscissas // * y experimental data // * xc points where constraints are placed // * yc constraints on derivatives // * dc derivative indices // (0 means function itself, 1 means first derivative) // real_1d_array x = "[1.0,1.0]"; if( _spoil_scenario==0 ) spoil_vector_by_nan(x); if( _spoil_scenario==1 ) spoil_vector_by_posinf(x); if( _spoil_scenario==2 ) spoil_vector_by_neginf(x); if( _spoil_scenario==3 ) spoil_vector_by_adding_element(x); if( _spoil_scenario==4 ) spoil_vector_by_deleting_element(x); real_1d_array y = "[0.9,1.1]"; if( _spoil_scenario==5 ) spoil_vector_by_nan(y); if( _spoil_scenario==6 ) spoil_vector_by_posinf(y); if( _spoil_scenario==7 ) spoil_vector_by_neginf(y); if( _spoil_scenario==8 ) spoil_vector_by_adding_element(y); if( _spoil_scenario==9 ) spoil_vector_by_deleting_element(y); real_1d_array w = "[1,1]"; if( _spoil_scenario==10 ) spoil_vector_by_nan(w); if( _spoil_scenario==11 ) spoil_vector_by_posinf(w); if( _spoil_scenario==12 ) spoil_vector_by_neginf(w); if( _spoil_scenario==13 ) spoil_vector_by_adding_element(w); if( _spoil_scenario==14 ) spoil_vector_by_deleting_element(w); real_1d_array xc = "[0]"; if( _spoil_scenario==15 ) spoil_vector_by_nan(xc); if( _spoil_scenario==16 ) spoil_vector_by_posinf(xc); if( _spoil_scenario==17 ) spoil_vector_by_neginf(xc); if( _spoil_scenario==18 ) spoil_vector_by_adding_element(xc); if( _spoil_scenario==19 ) spoil_vector_by_deleting_element(xc); real_1d_array yc = "[0]"; if( _spoil_scenario==20 ) spoil_vector_by_nan(yc); if( _spoil_scenario==21 ) spoil_vector_by_posinf(yc); if( _spoil_scenario==22 ) spoil_vector_by_neginf(yc); if( _spoil_scenario==23 ) spoil_vector_by_adding_element(yc); if( _spoil_scenario==24 ) spoil_vector_by_deleting_element(yc); integer_1d_array dc = "[0]"; if( _spoil_scenario==25 ) spoil_vector_by_adding_element(dc); if( _spoil_scenario==26 ) spoil_vector_by_deleting_element(dc); double t = 2; if( _spoil_scenario==27 ) t = fp_posinf; if( _spoil_scenario==28 ) t = fp_neginf; ae_int_t m = 2; ae_int_t info; barycentricinterpolant p; polynomialfitreport rep; double v; polynomialfitwc(x, y, w, xc, yc, dc, m, info, p, rep); v = barycentriccalc(p, t); _TestResult = _TestResult && doc_test_real(v, 2.000, 0.001); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && (_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "lsfit_d_polc"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST lsfit_d_spline // Unconstrained fitting by penalized regression spline // _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<19; _spoil_scenario++) { try { // // In this example we demonstrate penalized spline fitting of noisy data // // We have: // * x - abscissas // * y - vector of experimental data, straight line with small noise // real_1d_array x = "[0.00,0.10,0.20,0.30,0.40,0.50,0.60,0.70,0.80,0.90]"; if( _spoil_scenario==0 ) spoil_vector_by_nan(x); if( _spoil_scenario==1 ) spoil_vector_by_posinf(x); if( _spoil_scenario==2 ) spoil_vector_by_neginf(x); if( _spoil_scenario==3 ) spoil_vector_by_adding_element(x); if( _spoil_scenario==4 ) spoil_vector_by_deleting_element(x); real_1d_array y = "[0.10,0.00,0.30,0.40,0.30,0.40,0.62,0.68,0.75,0.95]"; if( _spoil_scenario==5 ) spoil_vector_by_nan(y); if( _spoil_scenario==6 ) spoil_vector_by_posinf(y); if( _spoil_scenario==7 ) spoil_vector_by_neginf(y); if( _spoil_scenario==8 ) spoil_vector_by_adding_element(y); if( _spoil_scenario==9 ) 
spoil_vector_by_deleting_element(y); ae_int_t info; double v; spline1dinterpolant s; spline1dfitreport rep; double rho; // // Fit with VERY small amount of smoothing (rho = -5.0) // and large number of basis functions (M=50). // // With such small regularization penalized spline almost fully reproduces function values // rho = -5.0; if( _spoil_scenario==10 ) rho = fp_nan; if( _spoil_scenario==11 ) rho = fp_posinf; if( _spoil_scenario==12 ) rho = fp_neginf; spline1dfitpenalized(x, y, 50, rho, info, s, rep); _TestResult = _TestResult && doc_test_int(info, 1); v = spline1dcalc(s, 0.0); _TestResult = _TestResult && doc_test_real(v, 0.10, 0.01); // // Fit with VERY large amount of smoothing (rho = 10.0) // and large number of basis functions (M=50). // // With such regularization our spline should become close to the straight line fit. // We will compare its value in x=1.0 with results obtained from such fit. // rho = +10.0; if( _spoil_scenario==13 ) rho = fp_nan; if( _spoil_scenario==14 ) rho = fp_posinf; if( _spoil_scenario==15 ) rho = fp_neginf; spline1dfitpenalized(x, y, 50, rho, info, s, rep); _TestResult = _TestResult && doc_test_int(info, 1); v = spline1dcalc(s, 1.0); _TestResult = _TestResult && doc_test_real(v, 0.969, 0.001); // // In real life applications you may need some moderate degree of fitting, // so we try to fit once more with rho=3.0. // rho = +3.0; if( _spoil_scenario==16 ) rho = fp_nan; if( _spoil_scenario==17 ) rho = fp_posinf; if( _spoil_scenario==18 ) rho = fp_neginf; spline1dfitpenalized(x, y, 50, rho, info, s, rep); _TestResult = _TestResult && doc_test_int(info, 1); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && (_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "lsfit_d_spline"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST lsfit_t_polfit_1 // Polynomial fitting, full list of parameters. // _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<10; _spoil_scenario++) { try { real_1d_array x = "[0.0,0.1,0.2,0.3,0.4,0.5,0.6,0.7,0.8,0.9,1.0]"; if( _spoil_scenario==0 ) spoil_vector_by_nan(x); if( _spoil_scenario==1 ) spoil_vector_by_posinf(x); if( _spoil_scenario==2 ) spoil_vector_by_neginf(x); if( _spoil_scenario==3 ) spoil_vector_by_deleting_element(x); real_1d_array y = "[0.00,0.05,0.26,0.32,0.33,0.43,0.60,0.60,0.77,0.98,1.02]"; if( _spoil_scenario==4 ) spoil_vector_by_nan(y); if( _spoil_scenario==5 ) spoil_vector_by_posinf(y); if( _spoil_scenario==6 ) spoil_vector_by_neginf(y); if( _spoil_scenario==7 ) spoil_vector_by_deleting_element(y); ae_int_t m = 2; double t = 2; if( _spoil_scenario==8 ) t = fp_posinf; if( _spoil_scenario==9 ) t = fp_neginf; ae_int_t info; barycentricinterpolant p; polynomialfitreport rep; double v; polynomialfit(x, y, 11, m, info, p, rep); v = barycentriccalc(p, t); _TestResult = _TestResult && doc_test_real(v, 2.011, 0.002); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && (_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "lsfit_t_polfit_1"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST lsfit_t_polfit_2 // Polynomial fitting, full list of parameters. 
// _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<14; _spoil_scenario++) { try { real_1d_array x = "[0.0,0.1,0.2,0.3,0.4,0.5,0.6,0.7,0.8,0.9,1.0]"; if( _spoil_scenario==0 ) spoil_vector_by_nan(x); if( _spoil_scenario==1 ) spoil_vector_by_posinf(x); if( _spoil_scenario==2 ) spoil_vector_by_neginf(x); if( _spoil_scenario==3 ) spoil_vector_by_deleting_element(x); real_1d_array y = "[0.00,0.05,0.26,0.32,0.33,0.43,0.60,0.60,0.77,0.98,1.02]"; if( _spoil_scenario==4 ) spoil_vector_by_nan(y); if( _spoil_scenario==5 ) spoil_vector_by_posinf(y); if( _spoil_scenario==6 ) spoil_vector_by_neginf(y); if( _spoil_scenario==7 ) spoil_vector_by_deleting_element(y); real_1d_array w = "[1,1.414213562,1,1,1,1,1,1,1,1,1]"; if( _spoil_scenario==8 ) spoil_vector_by_nan(w); if( _spoil_scenario==9 ) spoil_vector_by_posinf(w); if( _spoil_scenario==10 ) spoil_vector_by_neginf(w); if( _spoil_scenario==11 ) spoil_vector_by_deleting_element(w); real_1d_array xc = "[]"; real_1d_array yc = "[]"; integer_1d_array dc = "[]"; ae_int_t m = 2; double t = 2; if( _spoil_scenario==12 ) t = fp_posinf; if( _spoil_scenario==13 ) t = fp_neginf; ae_int_t info; barycentricinterpolant p; polynomialfitreport rep; double v; polynomialfitwc(x, y, w, 11, xc, yc, dc, 0, m, info, p, rep); v = barycentriccalc(p, t); _TestResult = _TestResult && doc_test_real(v, 2.023, 0.002); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && (_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "lsfit_t_polfit_2"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST lsfit_t_polfit_3 // Polynomial fitting, full list of parameters. // _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<23; _spoil_scenario++) { try { real_1d_array x = "[1.0,1.0]"; if( _spoil_scenario==0 ) spoil_vector_by_nan(x); if( _spoil_scenario==1 ) spoil_vector_by_posinf(x); if( _spoil_scenario==2 ) spoil_vector_by_neginf(x); if( _spoil_scenario==3 ) spoil_vector_by_deleting_element(x); real_1d_array y = "[0.9,1.1]"; if( _spoil_scenario==4 ) spoil_vector_by_nan(y); if( _spoil_scenario==5 ) spoil_vector_by_posinf(y); if( _spoil_scenario==6 ) spoil_vector_by_neginf(y); if( _spoil_scenario==7 ) spoil_vector_by_deleting_element(y); real_1d_array w = "[1,1]"; if( _spoil_scenario==8 ) spoil_vector_by_nan(w); if( _spoil_scenario==9 ) spoil_vector_by_posinf(w); if( _spoil_scenario==10 ) spoil_vector_by_neginf(w); if( _spoil_scenario==11 ) spoil_vector_by_deleting_element(w); real_1d_array xc = "[0]"; if( _spoil_scenario==12 ) spoil_vector_by_nan(xc); if( _spoil_scenario==13 ) spoil_vector_by_posinf(xc); if( _spoil_scenario==14 ) spoil_vector_by_neginf(xc); if( _spoil_scenario==15 ) spoil_vector_by_deleting_element(xc); real_1d_array yc = "[0]"; if( _spoil_scenario==16 ) spoil_vector_by_nan(yc); if( _spoil_scenario==17 ) spoil_vector_by_posinf(yc); if( _spoil_scenario==18 ) spoil_vector_by_neginf(yc); if( _spoil_scenario==19 ) spoil_vector_by_deleting_element(yc); integer_1d_array dc = "[0]"; if( _spoil_scenario==20 ) spoil_vector_by_deleting_element(dc); ae_int_t m = 2; double t = 2; if( _spoil_scenario==21 ) t = fp_posinf; if( _spoil_scenario==22 ) t = fp_neginf; ae_int_t info; barycentricinterpolant p; polynomialfitreport rep; double v; polynomialfitwc(x, y, w, 2, xc, yc, dc, 1, m, info, p, rep); v = barycentriccalc(p, t); _TestResult = _TestResult && doc_test_real(v, 2.000, 0.001); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && 
(_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "lsfit_t_polfit_3"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST lsfit_t_4pl // 4-parameter logistic fitting // _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<8; _spoil_scenario++) { try { real_1d_array x = "[1,2,3,4,5,6,7,8]"; if( _spoil_scenario==0 ) spoil_vector_by_nan(x); if( _spoil_scenario==1 ) spoil_vector_by_posinf(x); if( _spoil_scenario==2 ) spoil_vector_by_neginf(x); if( _spoil_scenario==3 ) spoil_vector_by_deleting_element(x); real_1d_array y = "[0.06313223,0.44552624,0.61838364,0.71385108,0.77345838,0.81383140,0.84280033,0.86449822]"; if( _spoil_scenario==4 ) spoil_vector_by_nan(y); if( _spoil_scenario==5 ) spoil_vector_by_posinf(y); if( _spoil_scenario==6 ) spoil_vector_by_neginf(y); if( _spoil_scenario==7 ) spoil_vector_by_deleting_element(y); ae_int_t n = 8; double a; double b; double c; double d; lsfitreport rep; // // Test logisticfit4() on carefully designed data with a priori known answer. // logisticfit4(x, y, n, a, b, c, d, rep); _TestResult = _TestResult && doc_test_real(a, -1.000, 0.01); _TestResult = _TestResult && doc_test_real(b, 1.200, 0.01); _TestResult = _TestResult && doc_test_real(c, 0.900, 0.01); _TestResult = _TestResult && doc_test_real(d, 1.000, 0.01); // // Evaluate model at point x=0.5 // double v; v = logisticcalc4(0.5, a, b, c, d); _TestResult = _TestResult && doc_test_real(v, -0.33874308, 0.001); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && (_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "lsfit_t_4pl"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST lsfit_t_5pl // 5-parameter logistic fitting // _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<8; _spoil_scenario++) { try { real_1d_array x = "[1,2,3,4,5,6,7,8]"; if( _spoil_scenario==0 ) spoil_vector_by_nan(x); if( _spoil_scenario==1 ) spoil_vector_by_posinf(x); if( _spoil_scenario==2 ) spoil_vector_by_neginf(x); if( _spoil_scenario==3 ) spoil_vector_by_deleting_element(x); real_1d_array y = "[0.1949776139,0.5710060208,0.726002637,0.8060434158,0.8534547965,0.8842071579,0.9054773317,0.9209088299]"; if( _spoil_scenario==4 ) spoil_vector_by_nan(y); if( _spoil_scenario==5 ) spoil_vector_by_posinf(y); if( _spoil_scenario==6 ) spoil_vector_by_neginf(y); if( _spoil_scenario==7 ) spoil_vector_by_deleting_element(y); ae_int_t n = 8; double a; double b; double c; double d; double g; lsfitreport rep; // // Test logisticfit5() on carefully designed data with a priori known answer. 
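//
// NOTE: logisticfit4()/logisticfit5() fit the standard 4PL/5PL logistic curves,
//     4PL:  f(x) = D + (A-D)/(1 + (x/C)^B)
//     5PL:  f(x) = D + (A-D)/(1 + (x/C)^B)^G
// which is the parameterization consistent with the reference values checked
// via logisticcalc4()/logisticcalc5() in these tests.
//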
// logisticfit5(x, y, n, a, b, c, d, g, rep); _TestResult = _TestResult && doc_test_real(a, -1.000, 0.01); _TestResult = _TestResult && doc_test_real(b, 1.200, 0.01); _TestResult = _TestResult && doc_test_real(c, 0.900, 0.01); _TestResult = _TestResult && doc_test_real(d, 1.000, 0.01); _TestResult = _TestResult && doc_test_real(g, 1.200, 0.01); // // Evaluate model at point x=0.5 // double v; v = logisticcalc5(0.5, a, b, c, d, g); _TestResult = _TestResult && doc_test_real(v, -0.2354656824, 0.001); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && (_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "lsfit_t_5pl"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST spline2d_bilinear // Bilinear spline interpolation // _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<16; _spoil_scenario++) { try { // // We use bilinear spline to interpolate f(x,y)=x^2+2*y^2 sampled // at (x,y) from [0.0, 0.5, 1.0] X [0.0, 1.0]. // real_1d_array x = "[0.0, 0.5, 1.0]"; if( _spoil_scenario==0 ) spoil_vector_by_nan(x); if( _spoil_scenario==1 ) spoil_vector_by_posinf(x); if( _spoil_scenario==2 ) spoil_vector_by_neginf(x); if( _spoil_scenario==3 ) spoil_vector_by_deleting_element(x); real_1d_array y = "[0.0, 1.0]"; if( _spoil_scenario==4 ) spoil_vector_by_nan(y); if( _spoil_scenario==5 ) spoil_vector_by_posinf(y); if( _spoil_scenario==6 ) spoil_vector_by_neginf(y); if( _spoil_scenario==7 ) spoil_vector_by_deleting_element(y); real_1d_array f = "[0.00,0.25,1.00,2.00,2.25,3.00]"; if( _spoil_scenario==8 ) spoil_vector_by_nan(f); if( _spoil_scenario==9 ) spoil_vector_by_posinf(f); if( _spoil_scenario==10 ) spoil_vector_by_neginf(f); if( _spoil_scenario==11 ) spoil_vector_by_deleting_element(f); double vx = 0.25; if( _spoil_scenario==12 ) vx = fp_posinf; if( _spoil_scenario==13 ) vx = fp_neginf; double vy = 0.50; if( _spoil_scenario==14 ) vy = fp_posinf; if( _spoil_scenario==15 ) vy = fp_neginf; double v; spline2dinterpolant s; // build spline spline2dbuildbilinearv(x, 3, y, 2, f, 1, s); // calculate S(0.25,0.50) v = spline2dcalc(s, vx, vy); _TestResult = _TestResult && doc_test_real(v, 1.1250, 0.00005); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && (_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "spline2d_bilinear"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST spline2d_bicubic // Bicubic spline interpolation // _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<16; _spoil_scenario++) { try { // // We use bicubic spline to interpolate f(x,y)=x^2+2*y^2 sampled // at (x,y) from [0.0, 0.5, 1.0] X [0.0, 1.0].
// real_1d_array x = "[0.0, 0.5, 1.0]"; if( _spoil_scenario==0 ) spoil_vector_by_nan(x); if( _spoil_scenario==1 ) spoil_vector_by_posinf(x); if( _spoil_scenario==2 ) spoil_vector_by_neginf(x); if( _spoil_scenario==3 ) spoil_vector_by_deleting_element(x); real_1d_array y = "[0.0, 1.0]"; if( _spoil_scenario==4 ) spoil_vector_by_nan(y); if( _spoil_scenario==5 ) spoil_vector_by_posinf(y); if( _spoil_scenario==6 ) spoil_vector_by_neginf(y); if( _spoil_scenario==7 ) spoil_vector_by_deleting_element(y); real_1d_array f = "[0.00,0.25,1.00,2.00,2.25,3.00]"; if( _spoil_scenario==8 ) spoil_vector_by_nan(f); if( _spoil_scenario==9 ) spoil_vector_by_posinf(f); if( _spoil_scenario==10 ) spoil_vector_by_neginf(f); if( _spoil_scenario==11 ) spoil_vector_by_deleting_element(f); double vx = 0.25; if( _spoil_scenario==12 ) vx = fp_posinf; if( _spoil_scenario==13 ) vx = fp_neginf; double vy = 0.50; if( _spoil_scenario==14 ) vy = fp_posinf; if( _spoil_scenario==15 ) vy = fp_neginf; double v; double dx; double dy; double dxy; spline2dinterpolant s; // build spline spline2dbuildbicubicv(x, 3, y, 2, f, 1, s); // calculate S(0.25,0.50) v = spline2dcalc(s, vx, vy); _TestResult = _TestResult && doc_test_real(v, 1.0625, 0.00005); // calculate derivatives spline2ddiff(s, vx, vy, v, dx, dy, dxy); _TestResult = _TestResult && doc_test_real(v, 1.0625, 0.00005); _TestResult = _TestResult && doc_test_real(dx, 0.5000, 0.00005); _TestResult = _TestResult && doc_test_real(dy, 2.0000, 0.00005); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && (_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "spline2d_bicubic"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST spline2d_unpack // Unpacking bilinear spline // _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<12; _spoil_scenario++) { try { // // We build bilinear spline for f(x,y)=x+2*y+3*xy for (x,y) in [0,1]. // Then we demonstrate how to unpack it. // real_1d_array x = "[0.0, 1.0]"; if( _spoil_scenario==0 ) spoil_vector_by_nan(x); if( _spoil_scenario==1 ) spoil_vector_by_posinf(x); if( _spoil_scenario==2 ) spoil_vector_by_neginf(x); if( _spoil_scenario==3 ) spoil_vector_by_deleting_element(x); real_1d_array y = "[0.0, 1.0]"; if( _spoil_scenario==4 ) spoil_vector_by_nan(y); if( _spoil_scenario==5 ) spoil_vector_by_posinf(y); if( _spoil_scenario==6 ) spoil_vector_by_neginf(y); if( _spoil_scenario==7 ) spoil_vector_by_deleting_element(y); real_1d_array f = "[0.00,1.00,2.00,6.00]"; if( _spoil_scenario==8 ) spoil_vector_by_nan(f); if( _spoil_scenario==9 ) spoil_vector_by_posinf(f); if( _spoil_scenario==10 ) spoil_vector_by_neginf(f); if( _spoil_scenario==11 ) spoil_vector_by_deleting_element(f); real_2d_array c; ae_int_t m; ae_int_t n; ae_int_t d; spline2dinterpolant s; // build spline spline2dbuildbilinearv(x, 2, y, 2, f, 1, s); // unpack and test spline2dunpackv(s, m, n, d, c); _TestResult = _TestResult && doc_test_real_matrix(c, "[[0, 1, 0, 1, 0,2,0,0, 1,3,0,0, 0,0,0,0, 0,0,0,0 ]]", 0.00005); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && (_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "spline2d_unpack"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST spline2d_copytrans // Copy and transform // _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<16; _spoil_scenario++) { try { // // We build bilinear spline for f(x,y)=x+2*y for (x,y) in [0,1]. 
// Then we apply several transformations to this spline. // real_1d_array x = "[0.0, 1.0]"; if( _spoil_scenario==0 ) spoil_vector_by_nan(x); if( _spoil_scenario==1 ) spoil_vector_by_posinf(x); if( _spoil_scenario==2 ) spoil_vector_by_neginf(x); if( _spoil_scenario==3 ) spoil_vector_by_deleting_element(x); real_1d_array y = "[0.0, 1.0]"; if( _spoil_scenario==4 ) spoil_vector_by_nan(y); if( _spoil_scenario==5 ) spoil_vector_by_posinf(y); if( _spoil_scenario==6 ) spoil_vector_by_neginf(y); if( _spoil_scenario==7 ) spoil_vector_by_deleting_element(y); real_1d_array f = "[0.00,1.00,2.00,3.00]"; if( _spoil_scenario==8 ) spoil_vector_by_nan(f); if( _spoil_scenario==9 ) spoil_vector_by_posinf(f); if( _spoil_scenario==10 ) spoil_vector_by_neginf(f); if( _spoil_scenario==11 ) spoil_vector_by_deleting_element(f); spline2dinterpolant s; spline2dinterpolant snew; double v; spline2dbuildbilinearv(x, 2, y, 2, f, 1, s); // copy spline, apply transformation x:=2*xnew, y:=4*ynew // evaluate at (xnew,ynew) = (0.25,0.25) - should be same as (x,y)=(0.5,1.0) spline2dcopy(s, snew); spline2dlintransxy(snew, 2.0, 0.0, 4.0, 0.0); v = spline2dcalc(snew, 0.25, 0.25); _TestResult = _TestResult && doc_test_real(v, 2.500, 0.00005); // copy spline, apply transformation SNew:=2*S+3 spline2dcopy(s, snew); spline2dlintransf(snew, 2.0, 3.0); v = spline2dcalc(snew, 0.5, 1.0); _TestResult = _TestResult && doc_test_real(v, 8.000, 0.00005); // // Same example, but for vector spline (f0,f1) = {x+2*y, 2*x+y} // real_1d_array f2 = "[0.00,0.00, 1.00,2.00, 2.00,1.00, 3.00,3.00]"; if( _spoil_scenario==12 ) spoil_vector_by_nan(f2); if( _spoil_scenario==13 ) spoil_vector_by_posinf(f2); if( _spoil_scenario==14 ) spoil_vector_by_neginf(f2); if( _spoil_scenario==15 ) spoil_vector_by_deleting_element(f2); real_1d_array vr; spline2dbuildbilinearv(x, 2, y, 2, f2, 2, s); // copy spline, apply transformation x:=2*xnew, y:=4*ynew spline2dcopy(s, snew); spline2dlintransxy(snew, 2.0, 0.0, 4.0, 0.0); spline2dcalcv(snew, 0.25, 0.25, vr); _TestResult = _TestResult && doc_test_real_vector(vr, "[2.500,2.000]", 0.00005); // copy spline, apply transformation SNew:=2*S+3 spline2dcopy(s, snew); spline2dlintransf(snew, 2.0, 3.0); spline2dcalcv(snew, 0.5, 1.0, vr); _TestResult = _TestResult && doc_test_real_vector(vr, "[8.000,7.000]", 0.00005); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && (_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "spline2d_copytrans"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST spline2d_vector // Copy and transform // _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<12; _spoil_scenario++) { try { // // We build bilinear vector-valued spline (f0,f1) = {x+2*y, 2*x+y} // Spline is built using function values at 2x2 grid: (x,y)=[0,1]*[0,1] // Then we perform evaluation at (x,y)=(0.1,0.3) // real_1d_array x = "[0.0, 1.0]"; if( _spoil_scenario==0 ) spoil_vector_by_nan(x); if( _spoil_scenario==1 ) spoil_vector_by_posinf(x); if( _spoil_scenario==2 ) spoil_vector_by_neginf(x); if( _spoil_scenario==3 ) spoil_vector_by_deleting_element(x); real_1d_array y = "[0.0, 1.0]"; if( _spoil_scenario==4 ) spoil_vector_by_nan(y); if( _spoil_scenario==5 ) spoil_vector_by_posinf(y); if( _spoil_scenario==6 ) spoil_vector_by_neginf(y); if( _spoil_scenario==7 ) spoil_vector_by_deleting_element(y); real_1d_array f = "[0.00,0.00, 1.00,2.00, 2.00,1.00, 3.00,3.00]"; if( _spoil_scenario==8 ) spoil_vector_by_nan(f); if( _spoil_scenario==9 ) 
spoil_vector_by_posinf(f); if( _spoil_scenario==10 ) spoil_vector_by_neginf(f); if( _spoil_scenario==11 ) spoil_vector_by_deleting_element(f); spline2dinterpolant s; real_1d_array vr; spline2dbuildbilinearv(x, 2, y, 2, f, 2, s); spline2dcalcv(s, 0.1, 0.3, vr); _TestResult = _TestResult && doc_test_real_vector(vr, "[0.700,0.500]", 0.00005); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && (_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "spline2d_vector"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST rbf_d_hrbf // Simple model built with HRBF algorithm // _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<3; _spoil_scenario++) { try { // // This example illustrates basic concepts of the RBF models: creation, modification, // evaluation. // // Suppose that we have set of 2-dimensional points with associated // scalar function values, and we want to build a RBF model using // our data. // // NOTE: we can work with 3D models too :) // // Typical sequence of steps is given below: // 1. we create RBF model object // 2. we attach our dataset to the RBF model and tune algorithm settings // 3. we rebuild RBF model using QNN algorithm on new data // 4. we use RBF model (evaluate, serialize, etc.) // double v; // // Step 1: RBF model creation. // // We have to specify dimensionality of the space (2 or 3) and // dimensionality of the function (scalar or vector). // // New model is empty - it can be evaluated, // but we just get zero value at any point. // rbfmodel model; rbfcreate(2, 1, model); v = rbfcalc2(model, 0.0, 0.0); _TestResult = _TestResult && doc_test_real(v, 0.000, 0.005); // // Step 2: we add dataset. // // XY contains two points - x0=(-1,0) and x1=(+1,0) - // and two function values f(x0)=2, f(x1)=3. // // We added points, but model was not rebuild yet. // If we call rbfcalc2(), we still will get 0.0 as result. // real_2d_array xy = "[[-1,0,2],[+1,0,3]]"; if( _spoil_scenario==0 ) spoil_matrix_by_nan(xy); if( _spoil_scenario==1 ) spoil_matrix_by_posinf(xy); if( _spoil_scenario==2 ) spoil_matrix_by_neginf(xy); rbfsetpoints(model, xy); v = rbfcalc2(model, 0.0, 0.0); _TestResult = _TestResult && doc_test_real(v, 0.000, 0.005); // // Step 3: rebuild model // // After we've configured model, we should rebuild it - // it will change coefficients stored internally in the // rbfmodel structure. // // We use hierarchical RBF algorithm with following parameters: // * RBase - set to 1.0 // * NLayers - three layers are used (although such simple problem // does not need more than 1 layer) // * LambdaReg - is set to zero value, no smoothing is required // rbfreport rep; rbfsetalgohierarchical(model, 1.0, 3, 0.0); rbfbuildmodel(model, rep); _TestResult = _TestResult && doc_test_int(rep.terminationtype, 1); // // Step 4: model was built // // After call of rbfbuildmodel(), rbfcalc2() will return // value of the new model. 
// v = rbfcalc2(model, 0.0, 0.0); _TestResult = _TestResult && doc_test_real(v, 2.500, 0.005); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && (_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "rbf_d_hrbf"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST rbf_d_vector // Working with vector functions // _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<6; _spoil_scenario++) { try { // // Suppose that we have set of 2-dimensional points with associated VECTOR // function values, and we want to build a RBF model using our data. // // Typical sequence of steps is given below: // 1. we create RBF model object // 2. we attach our dataset to the RBF model and tune algorithm settings // 3. we rebuild RBF model using new data // 4. we use RBF model (evaluate, serialize, etc.) // real_1d_array x; real_1d_array y; // // Step 1: RBF model creation. // // We have to specify dimensionality of the space (equal to 2) and // dimensionality of the function (2-dimensional vector function). // // New model is empty - it can be evaluated, // but we just get zero value at any point. // rbfmodel model; rbfcreate(2, 2, model); x = "[+1,+1]"; if( _spoil_scenario==0 ) spoil_vector_by_nan(x); if( _spoil_scenario==1 ) spoil_vector_by_posinf(x); if( _spoil_scenario==2 ) spoil_vector_by_neginf(x); rbfcalc(model, x, y); _TestResult = _TestResult && doc_test_real_vector(y, "[0.000,0.000]", 0.005); // // Step 2: we add dataset. // // XY arrays containt four points: // * (x0,y0) = (+1,+1), f(x0,y0)=(0,-1) // * (x1,y1) = (+1,-1), f(x1,y1)=(-1,0) // * (x2,y2) = (-1,-1), f(x2,y2)=(0,+1) // * (x3,y3) = (-1,+1), f(x3,y3)=(+1,0) // real_2d_array xy = "[[+1,+1,0,-1],[+1,-1,-1,0],[-1,-1,0,+1],[-1,+1,+1,0]]"; if( _spoil_scenario==3 ) spoil_matrix_by_nan(xy); if( _spoil_scenario==4 ) spoil_matrix_by_posinf(xy); if( _spoil_scenario==5 ) spoil_matrix_by_neginf(xy); rbfsetpoints(model, xy); // We added points, but model was not rebuild yet. // If we call rbfcalc(), we still will get 0.0 as result. rbfcalc(model, x, y); _TestResult = _TestResult && doc_test_real_vector(y, "[0.000,0.000]", 0.005); // // Step 3: rebuild model // // We use hierarchical RBF algorithm with following parameters: // * RBase - set to 1.0 // * NLayers - three layers are used (although such simple problem // does not need more than 1 layer) // * LambdaReg - is set to zero value, no smoothing is required // // After we've configured model, we should rebuild it - // it will change coefficients stored internally in the // rbfmodel structure. // rbfreport rep; rbfsetalgohierarchical(model, 1.0, 3, 0.0); rbfbuildmodel(model, rep); _TestResult = _TestResult && doc_test_int(rep.terminationtype, 1); // // Step 4: model was built // // After call of rbfbuildmodel(), rbfcalc() will return // value of the new model. 
// rbfcalc(model, x, y); _TestResult = _TestResult && doc_test_real_vector(y, "[0.000,-1.000]", 0.005); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && (_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "rbf_d_vector"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST rbf_d_polterm // RBF models - working with polynomial term // _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<3; _spoil_scenario++) { try { // // This example show how to work with polynomial term // // Suppose that we have set of 2-dimensional points with associated // scalar function values, and we want to build a RBF model using // our data. // // We use hierarchical RBF algorithm with following parameters: // * RBase - set to 1.0 // * NLayers - three layers are used (although such simple problem // does not need more than 1 layer) // * LambdaReg - is set to zero value, no smoothing is required // double v; rbfmodel model; real_2d_array xy = "[[-1,0,2],[+1,0,3]]"; if( _spoil_scenario==0 ) spoil_matrix_by_nan(xy); if( _spoil_scenario==1 ) spoil_matrix_by_posinf(xy); if( _spoil_scenario==2 ) spoil_matrix_by_neginf(xy); rbfreport rep; rbfcreate(2, 1, model); rbfsetpoints(model, xy); rbfsetalgohierarchical(model, 1.0, 3, 0.0); // // By default, RBF model uses linear term. It means that model // looks like // f(x,y) = SUM(RBF[i]) + a*x + b*y + c // where RBF[i] is I-th radial basis function and a*x+by+c is a // linear term. Having linear terms in a model gives us: // (1) improved extrapolation properties // (2) linearity of the model when data can be perfectly fitted // by the linear function // (3) linear asymptotic behavior // // Our simple dataset can be modelled by the linear function // f(x,y) = 0.5*x + 2.5 // and rbfbuildmodel() with default settings should preserve this // linearity. // ae_int_t nx; ae_int_t ny; ae_int_t nc; ae_int_t modelversion; real_2d_array xwr; real_2d_array c; rbfbuildmodel(model, rep); _TestResult = _TestResult && doc_test_int(rep.terminationtype, 1); rbfunpack(model, nx, ny, xwr, nc, c, modelversion); _TestResult = _TestResult && doc_test_real_matrix(c, "[[0.500,0.000,2.500]]", 0.005); // asymptotic behavior of our function is linear v = rbfcalc2(model, 1000.0, 0.0); _TestResult = _TestResult && doc_test_real(v, 502.50, 0.05); // // Instead of linear term we can use constant term. In this case // we will get model which has form // f(x,y) = SUM(RBF[i]) + c // where RBF[i] is I-th radial basis function and c is a constant, // which is equal to the average function value on the dataset. // // Because we've already attached dataset to the model the only // thing we have to do is to call rbfsetconstterm() and then // rebuild model with rbfbuildmodel(). // rbfsetconstterm(model); rbfbuildmodel(model, rep); _TestResult = _TestResult && doc_test_int(rep.terminationtype, 1); rbfunpack(model, nx, ny, xwr, nc, c, modelversion); _TestResult = _TestResult && doc_test_real_matrix(c, "[[0.000,0.000,2.500]]", 0.005); // asymptotic behavior of our function is constant v = rbfcalc2(model, 1000.0, 0.0); _TestResult = _TestResult && doc_test_real(v, 2.500, 0.005); // // Finally, we can use zero term. Just plain RBF without polynomial // part: // f(x,y) = SUM(RBF[i]) // where RBF[i] is I-th radial basis function. 
// rbfsetzeroterm(model); rbfbuildmodel(model, rep); _TestResult = _TestResult && doc_test_int(rep.terminationtype, 1); rbfunpack(model, nx, ny, xwr, nc, c, modelversion); _TestResult = _TestResult && doc_test_real_matrix(c, "[[0.000,0.000,0.000]]", 0.005); // asymptotic behavior of our function is just zero constant v = rbfcalc2(model, 1000.0, 0.0); _TestResult = _TestResult && doc_test_real(v, 0.000, 0.005); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && (_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "rbf_d_polterm"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST rbf_d_serialize // Serialization/unserialization // _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<3; _spoil_scenario++) { try { // // This example show how to serialize and unserialize RBF model // // Suppose that we have set of 2-dimensional points with associated // scalar function values, and we want to build a RBF model using // our data. Then we want to serialize it to string and to unserialize // from string, loading to another instance of RBF model. // // Here we assume that you already know how to create RBF models. // std::string s; double v; rbfmodel model0; rbfmodel model1; real_2d_array xy = "[[-1,0,2],[+1,0,3]]"; if( _spoil_scenario==0 ) spoil_matrix_by_nan(xy); if( _spoil_scenario==1 ) spoil_matrix_by_posinf(xy); if( _spoil_scenario==2 ) spoil_matrix_by_neginf(xy); rbfreport rep; // model initialization rbfcreate(2, 1, model0); rbfsetpoints(model0, xy); rbfsetalgohierarchical(model0, 1.0, 3, 0.0); rbfbuildmodel(model0, rep); _TestResult = _TestResult && doc_test_int(rep.terminationtype, 1); // // Serialization - it looks easy, // but you should carefully read next section. // alglib::rbfserialize(model0, s); alglib::rbfunserialize(s, model1); // both models return same value v = rbfcalc2(model0, 0.0, 0.0); _TestResult = _TestResult && doc_test_real(v, 2.500, 0.005); v = rbfcalc2(model1, 0.0, 0.0); _TestResult = _TestResult && doc_test_real(v, 2.500, 0.005); // // Previous section shows that model state is saved/restored during // serialization. However, some properties are NOT serialized. // // Serialization saves/restores RBF model, but it does NOT saves/restores // settings which were used to build current model. In particular, dataset // which was used to build model, is not preserved. // // What does it mean in for us? // // Do you remember this sequence: rbfcreate-rbfsetpoints-rbfbuildmodel? // First step creates model, second step adds dataset and tunes model // settings, third step builds model using current dataset and model // construction settings. // // If you call rbfbuildmodel() without calling rbfsetpoints() first, you // will get empty (zero) RBF model. In our example, model0 contains // dataset which was added by rbfsetpoints() call. However, model1 does // NOT contain dataset - because dataset is NOT serialized. // // This, if we call rbfbuildmodel(model0,rep), we will get same model, // which returns 2.5 at (x,y)=(0,0). However, after same call model1 will // return zero - because it contains RBF model (coefficients), but does NOT // contain dataset which was used to build this model. // // Basically, it means that: // * serialization of the RBF model preserves anything related to the model // EVALUATION // * but it does NOT creates perfect copy of the original object. 
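        //
        // Illustrative sketch (added to this listing, not part of the original
        // example): since everything needed for EVALUATION survives
        // serialization, a second serialize/unserialize round trip on the
        // restored model yields yet another model which returns the same
        // values. The names s2/model2/v2 are introduced only for this
        // illustration.
        //
        {
            std::string s2;
            rbfmodel model2;
            alglib::rbfserialize(model1, s2);
            alglib::rbfunserialize(s2, model2);
            double v2 = rbfcalc2(model2, 0.0, 0.0);   // 2.5 again, same as model0 and model1
        }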
// rbfbuildmodel(model0, rep); v = rbfcalc2(model0, 0.0, 0.0); _TestResult = _TestResult && doc_test_real(v, 2.500, 0.005); rbfbuildmodel(model1, rep); v = rbfcalc2(model1, 0.0, 0.0); _TestResult = _TestResult && doc_test_real(v, 0.000, 0.005); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && (_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "rbf_d_serialize"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST matdet_d_1 // Determinant calculation, real matrix, short form // _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<7; _spoil_scenario++) { try { real_2d_array b = "[[1,2],[2,1]]"; if( _spoil_scenario==0 ) spoil_matrix_by_nan(b); if( _spoil_scenario==1 ) spoil_matrix_by_posinf(b); if( _spoil_scenario==2 ) spoil_matrix_by_neginf(b); if( _spoil_scenario==3 ) spoil_matrix_by_adding_row(b); if( _spoil_scenario==4 ) spoil_matrix_by_adding_col(b); if( _spoil_scenario==5 ) spoil_matrix_by_deleting_row(b); if( _spoil_scenario==6 ) spoil_matrix_by_deleting_col(b); double a; a = rmatrixdet(b); _TestResult = _TestResult && doc_test_real(a, -3, 0.0001); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && (_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "matdet_d_1"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST matdet_d_2 // Determinant calculation, real matrix, full form // _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<5; _spoil_scenario++) { try { real_2d_array b = "[[5,4],[4,5]]"; if( _spoil_scenario==0 ) spoil_matrix_by_nan(b); if( _spoil_scenario==1 ) spoil_matrix_by_posinf(b); if( _spoil_scenario==2 ) spoil_matrix_by_neginf(b); if( _spoil_scenario==3 ) spoil_matrix_by_deleting_row(b); if( _spoil_scenario==4 ) spoil_matrix_by_deleting_col(b); double a; a = rmatrixdet(b, 2); _TestResult = _TestResult && doc_test_real(a, 9, 0.0001); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && (_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "matdet_d_2"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST matdet_d_3 // Determinant calculation, complex matrix, short form // _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<7; _spoil_scenario++) { try { complex_2d_array b = "[[1+1i,2],[2,1-1i]]"; if( _spoil_scenario==0 ) spoil_matrix_by_nan(b); if( _spoil_scenario==1 ) spoil_matrix_by_posinf(b); if( _spoil_scenario==2 ) spoil_matrix_by_neginf(b); if( _spoil_scenario==3 ) spoil_matrix_by_adding_row(b); if( _spoil_scenario==4 ) spoil_matrix_by_adding_col(b); if( _spoil_scenario==5 ) spoil_matrix_by_deleting_row(b); if( _spoil_scenario==6 ) spoil_matrix_by_deleting_col(b); alglib::complex a; a = cmatrixdet(b); _TestResult = _TestResult && doc_test_complex(a, -2, 0.0001); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && (_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "matdet_d_3"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST matdet_d_4 // Determinant calculation, complex matrix, full form // _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<5; _spoil_scenario++) { try { alglib::complex a; complex_2d_array b = "[[5i,4],[4i,5]]"; if( _spoil_scenario==0 ) spoil_matrix_by_nan(b); if( _spoil_scenario==1 ) spoil_matrix_by_posinf(b); if( _spoil_scenario==2 ) 
spoil_matrix_by_neginf(b); if( _spoil_scenario==3 ) spoil_matrix_by_deleting_row(b); if( _spoil_scenario==4 ) spoil_matrix_by_deleting_col(b); a = cmatrixdet(b, 2); _TestResult = _TestResult && doc_test_complex(a, alglib::complex(0,9), 0.0001); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && (_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "matdet_d_4"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST matdet_d_5 // Determinant calculation, complex matrix with zero imaginary part, short form // _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<7; _spoil_scenario++) { try { alglib::complex a; complex_2d_array b = "[[9,1],[2,1]]"; if( _spoil_scenario==0 ) spoil_matrix_by_nan(b); if( _spoil_scenario==1 ) spoil_matrix_by_posinf(b); if( _spoil_scenario==2 ) spoil_matrix_by_neginf(b); if( _spoil_scenario==3 ) spoil_matrix_by_adding_row(b); if( _spoil_scenario==4 ) spoil_matrix_by_adding_col(b); if( _spoil_scenario==5 ) spoil_matrix_by_deleting_row(b); if( _spoil_scenario==6 ) spoil_matrix_by_deleting_col(b); a = cmatrixdet(b); _TestResult = _TestResult && doc_test_complex(a, 7, 0.0001); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && (_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "matdet_d_5"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST matdet_t_0 // Determinant calculation, real matrix, full form // _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<5; _spoil_scenario++) { try { double a; real_2d_array b = "[[3,4],[-4,3]]"; if( _spoil_scenario==0 ) spoil_matrix_by_nan(b); if( _spoil_scenario==1 ) spoil_matrix_by_posinf(b); if( _spoil_scenario==2 ) spoil_matrix_by_neginf(b); if( _spoil_scenario==3 ) spoil_matrix_by_deleting_row(b); if( _spoil_scenario==4 ) spoil_matrix_by_deleting_col(b); a = rmatrixdet(b, 2); _TestResult = _TestResult && doc_test_real(a, 25, 0.0001); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && (_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "matdet_t_0"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST matdet_t_1 // Determinant calculation, real matrix, LU, short form // _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<9; _spoil_scenario++) { try { double a; real_2d_array b = "[[1,2],[2,5]]"; if( _spoil_scenario==0 ) spoil_matrix_by_nan(b); if( _spoil_scenario==1 ) spoil_matrix_by_posinf(b); if( _spoil_scenario==2 ) spoil_matrix_by_neginf(b); if( _spoil_scenario==3 ) spoil_matrix_by_adding_row(b); if( _spoil_scenario==4 ) spoil_matrix_by_adding_col(b); if( _spoil_scenario==5 ) spoil_matrix_by_deleting_row(b); if( _spoil_scenario==6 ) spoil_matrix_by_deleting_col(b); integer_1d_array p = "[1,1]"; if( _spoil_scenario==7 ) spoil_vector_by_adding_element(p); if( _spoil_scenario==8 ) spoil_vector_by_deleting_element(p); a = rmatrixludet(b, p); _TestResult = _TestResult && doc_test_real(a, -5, 0.0001); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && (_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "matdet_t_1"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST matdet_t_2 // Determinant calculation, real matrix, LU, full form // _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<6; _spoil_scenario++) { try { double a; real_2d_array b = 
"[[5,4],[4,5]]"; if( _spoil_scenario==0 ) spoil_matrix_by_nan(b); if( _spoil_scenario==1 ) spoil_matrix_by_posinf(b); if( _spoil_scenario==2 ) spoil_matrix_by_neginf(b); if( _spoil_scenario==3 ) spoil_matrix_by_deleting_row(b); if( _spoil_scenario==4 ) spoil_matrix_by_deleting_col(b); integer_1d_array p = "[0,1]"; if( _spoil_scenario==5 ) spoil_vector_by_deleting_element(p); a = rmatrixludet(b, p, 2); _TestResult = _TestResult && doc_test_real(a, 25, 0.0001); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && (_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "matdet_t_2"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST matdet_t_3 // Determinant calculation, complex matrix, full form // _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<5; _spoil_scenario++) { try { alglib::complex a; complex_2d_array b = "[[5i,4],[-4,5i]]"; if( _spoil_scenario==0 ) spoil_matrix_by_nan(b); if( _spoil_scenario==1 ) spoil_matrix_by_posinf(b); if( _spoil_scenario==2 ) spoil_matrix_by_neginf(b); if( _spoil_scenario==3 ) spoil_matrix_by_deleting_row(b); if( _spoil_scenario==4 ) spoil_matrix_by_deleting_col(b); a = cmatrixdet(b, 2); _TestResult = _TestResult && doc_test_complex(a, -9, 0.0001); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && (_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "matdet_t_3"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST matdet_t_4 // Determinant calculation, complex matrix, LU, short form // _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<9; _spoil_scenario++) { try { alglib::complex a; complex_2d_array b = "[[1,2],[2,5i]]"; if( _spoil_scenario==0 ) spoil_matrix_by_nan(b); if( _spoil_scenario==1 ) spoil_matrix_by_posinf(b); if( _spoil_scenario==2 ) spoil_matrix_by_neginf(b); if( _spoil_scenario==3 ) spoil_matrix_by_adding_row(b); if( _spoil_scenario==4 ) spoil_matrix_by_adding_col(b); if( _spoil_scenario==5 ) spoil_matrix_by_deleting_row(b); if( _spoil_scenario==6 ) spoil_matrix_by_deleting_col(b); integer_1d_array p = "[1,1]"; if( _spoil_scenario==7 ) spoil_vector_by_adding_element(p); if( _spoil_scenario==8 ) spoil_vector_by_deleting_element(p); a = cmatrixludet(b, p); _TestResult = _TestResult && doc_test_complex(a, alglib::complex(0,-5), 0.0001); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && (_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "matdet_t_4"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; // // TEST matdet_t_5 // Determinant calculation, complex matrix, LU, full form // _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<6; _spoil_scenario++) { try { alglib::complex a; complex_2d_array b = "[[5,4i],[4,5]]"; if( _spoil_scenario==0 ) spoil_matrix_by_nan(b); if( _spoil_scenario==1 ) spoil_matrix_by_posinf(b); if( _spoil_scenario==2 ) spoil_matrix_by_neginf(b); if( _spoil_scenario==3 ) spoil_matrix_by_deleting_row(b); if( _spoil_scenario==4 ) spoil_matrix_by_deleting_col(b); integer_1d_array p = "[0,1]"; if( _spoil_scenario==5 ) spoil_vector_by_deleting_element(p); a = cmatrixludet(b, p, 2); _TestResult = _TestResult && doc_test_complex(a, 25, 0.0001); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && (_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "matdet_t_5"); fflush(stdout); } 
_TotalResult = _TotalResult && _TestResult; // // TEST lincg_d_1 // Solution of sparse linear systems with CG // _TestResult = true; for(_spoil_scenario=-1; _spoil_scenario<4; _spoil_scenario++) { try { // // This example illustrates solution of sparse linear systems with // conjugate gradient method. // // Suppose that we have linear system A*x=b with sparse symmetric // positive definite A (represented by sparsematrix object) // [ 5 1 ] // [ 1 7 2 ] // A = [ 2 8 1 ] // [ 1 4 1 ] // [ 1 4 ] // and right part b // [ 7 ] // [ 17 ] // b = [ 14 ] // [ 10 ] // [ 6 ] // and we want to solve this system using sparse linear CG. In order // to do so, we have to create left part (sparsematrix object) and // right part (dense array). // // Initially, sparse matrix is created in the Hash-Table format, // which allows easy initialization, but do not allow matrix to be // used in the linear solvers. So after construction you should convert // sparse matrix to CRS format (one suited for linear operations). // // It is important to note that in our example we initialize full // matrix A, both lower and upper triangles. However, it is symmetric // and sparse solver needs just one half of the matrix. So you may // save about half of the space by filling only one of the triangles. // sparsematrix a; sparsecreate(5, 5, a); sparseset(a, 0, 0, 5.0); sparseset(a, 0, 1, 1.0); sparseset(a, 1, 0, 1.0); sparseset(a, 1, 1, 7.0); sparseset(a, 1, 2, 2.0); sparseset(a, 2, 1, 2.0); sparseset(a, 2, 2, 8.0); sparseset(a, 2, 3, 1.0); sparseset(a, 3, 2, 1.0); sparseset(a, 3, 3, 4.0); sparseset(a, 3, 4, 1.0); sparseset(a, 4, 3, 1.0); sparseset(a, 4, 4, 4.0); // // Now our matrix is fully initialized, but we have to do one more // step - convert it from Hash-Table format to CRS format (see // documentation on sparse matrices for more information about these // formats). // // If you omit this call, ALGLIB will generate exception on the first // attempt to use A in linear operations. // sparseconverttocrs(a); // // Initialization of the right part // real_1d_array b = "[7,17,14,10,6]"; if( _spoil_scenario==0 ) spoil_vector_by_nan(b); if( _spoil_scenario==1 ) spoil_vector_by_posinf(b); if( _spoil_scenario==2 ) spoil_vector_by_neginf(b); if( _spoil_scenario==3 ) spoil_vector_by_deleting_element(b); // // Now we have to create linear solver object and to use it for the // solution of the linear system. // // NOTE: lincgsolvesparse() accepts additional parameter which tells // what triangle of the symmetric matrix should be used - upper // or lower. Because we've filled both parts of the matrix, we // can use any part - upper or lower. // lincgstate s; lincgreport rep; real_1d_array x; lincgcreate(5, s); lincgsolvesparse(s, a, true, b); lincgresults(s, x, rep); _TestResult = _TestResult && doc_test_int(rep.terminationtype, 1); _TestResult = _TestResult && doc_test_real_vector(x, "[1.000,2.000,1.000,2.000,1.000]", 0.005); _TestResult = _TestResult && (_spoil_scenario==-1); } catch(ap_error) { _TestResult = _TestResult && (_spoil_scenario!=-1); } } if( !_TestResult) { printf("%-32s FAILED\n", "lincg_d_1"); fflush(stdout); } _TotalResult = _TotalResult && _TestResult; printf("145/145\n"); } catch(...) { printf("Unhandled exception was raised!\n"); return 1; } #ifdef AE_USE_ALLOC_COUNTER printf("Allocation counter checked... "); if( alglib_impl::_alloc_counter!=0 ) { _TotalResult = false; printf("FAILURE: alloc_counter is non-zero on end!\n"); } else printf("OK\n"); #endif return _TotalResult ? 
0 : 1; } cpp/tests/test_c.cpp0000755000175000017500001632167213105126766014401 0ustar sergeysergey#include #include #include "ap.h" // disable some irrelevant warnings #if (AE_COMPILER==AE_MSVC) #pragma warning(disable:4100) #pragma warning(disable:4127) #pragma warning(disable:4702) #pragma warning(disable:4996) #endif #include "alglibinternal.h" #include "alglibmisc.h" #include "diffequations.h" #include "linalg.h" #include "optimization.h" #include "solvers.h" #include "statistics.h" #include "dataanalysis.h" #include "specialfunctions.h" #include "integration.h" #include "fasttransforms.h" #include "interpolation.h" using namespace alglib_impl; /************************************************************************* Testing tag sort *************************************************************************/ ae_bool testtsort(ae_bool silent, ae_state *_state); ae_bool _pexec_testtsort(ae_bool silent, ae_state *_state); /************************************************************************* Testing Nearest Neighbor Search *************************************************************************/ ae_bool testnearestneighbor(ae_bool silent, ae_state *_state); ae_bool _pexec_testnearestneighbor(ae_bool silent, ae_state *_state); ae_bool testhqrnd(ae_bool silent, ae_state *_state); ae_bool _pexec_testhqrnd(ae_bool silent, ae_state *_state); /************************************************************************* Function for test HQRNDContinuous function *************************************************************************/ ae_bool hqrndcontinuoustest(ae_bool silent, ae_state *_state); /************************************************************************* Function for test HQRNDContinuous function *************************************************************************/ ae_bool hqrnddiscretetest(ae_bool silent, ae_state *_state); /************************************************************************* Test *************************************************************************/ ae_bool testodesolver(ae_bool silent, ae_state *_state); ae_bool _pexec_testodesolver(ae_bool silent, ae_state *_state); typedef struct { ae_int_t n; ae_int_t m; ae_int_t matkind; ae_int_t triangle; ae_matrix bufa; hqrndstate rs; rcommstate rcs; } sparsegenerator; ae_bool testsparse(ae_bool silent, ae_state *_state); ae_bool _pexec_testsparse(ae_bool silent, ae_state *_state); /************************************************************************* Function for testing basic SKS functional. Returns True on errors, False on success. -- ALGLIB PROJECT -- Copyright 16.01.1014 by Bochkanov Sergey *************************************************************************/ ae_bool skstest(ae_state *_state); /************************************************************************* Function for testing basic functional -- ALGLIB PROJECT -- Copyright 14.10.2011 by Bochkanov Sergey *************************************************************************/ ae_bool basicfunctest(ae_state *_state); /************************************************************************* Function for testing Level 2 unsymmetric linear algebra functions. Additionally it tests SparseGet() for several matrix formats. Returns True on failure. 
-- ALGLIB PROJECT -- Copyright 20.01.2014 by Bochkanov Sergey *************************************************************************/ ae_bool testlevel2unsymmetric(ae_state *_state); /************************************************************************* Function for testing Level 3 unsymmetric linear algebra functions. Additionally it tests SparseGet() for several matrix formats. Returns True on failure. -- ALGLIB PROJECT -- Copyright 20.01.2014 by Bochkanov Sergey *************************************************************************/ ae_bool testlevel3unsymmetric(ae_state *_state); /************************************************************************* Function for testing Level 2 symmetric linear algebra functions. Additionally it tests SparseGet() for several matrix formats. Returns True on failure. -- ALGLIB PROJECT -- Copyright 20.01.2014 by Bochkanov Sergey *************************************************************************/ ae_bool testlevel2symmetric(ae_state *_state); /************************************************************************* Function for testing Level 2 symmetric linear algebra functions. Additionally it tests SparseGet() for several matrix formats. Returns True on failure. -- ALGLIB PROJECT -- Copyright 14.10.2011 by Bochkanov Sergey *************************************************************************/ ae_bool testlevel3symmetric(ae_state *_state); /************************************************************************* Function for testing Level 2 triangular linear algebra functions. Returns True on failure. -- ALGLIB PROJECT -- Copyright 20.01.2014 by Bochkanov Sergey *************************************************************************/ ae_bool testlevel2triangular(ae_state *_state); /************************************************************************* Function for testing basic functional -- ALGLIB PROJECT -- Copyright 14.10.2011 by Bochkanov Sergey *************************************************************************/ ae_bool basicfuncrandomtest(ae_state *_state); /************************************************************************* Function for testing multyplication matrix with vector -- ALGLIB PROJECT -- Copyright 14.10.2011 by Bochkanov Sergey *************************************************************************/ ae_bool linearfunctionstest(ae_state *_state); /************************************************************************* Function for testing multyplication for simmetric matrix with vector -- ALGLIB PROJECT -- Copyright 14.10.2011 by Bochkanov Sergey *************************************************************************/ ae_bool linearfunctionsstest(ae_state *_state); /************************************************************************* Function for testing multyplication sparse matrix with nerrow dense matrix -- ALGLIB PROJECT -- Copyright 14.10.2011 by Bochkanov Sergey *************************************************************************/ ae_bool linearfunctionsmmtest(ae_state *_state); /************************************************************************* Function for testing multyplication for simmetric sparse matrix with narrow dense matrix -- ALGLIB PROJECT -- Copyright 14.10.2011 by Bochkanov Sergey *************************************************************************/ ae_bool linearfunctionssmmtest(ae_state *_state); /************************************************************************* Function for basic test SparseCopy -- ALGLIB PROJECT -- Copyright 14.10.2011 
by Bochkanov Sergey *************************************************************************/ ae_bool basiccopyfunctest(ae_bool silent, ae_state *_state); /************************************************************************* Function for testing SparseCopy -- ALGLIB PROJECT -- Copyright 14.10.2011 by Bochkanov Sergey *************************************************************************/ ae_bool copyfunctest(ae_bool silent, ae_state *_state); void _sparsegenerator_init(void* _p, ae_state *_state); void _sparsegenerator_init_copy(void* _dst, void* _src, ae_state *_state); void _sparsegenerator_clear(void* _p); void _sparsegenerator_destroy(void* _p); ae_bool testreflections(ae_bool silent, ae_state *_state); ae_bool _pexec_testreflections(ae_bool silent, ae_state *_state); ae_bool testcreflections(ae_bool silent, ae_state *_state); ae_bool _pexec_testcreflections(ae_bool silent, ae_state *_state); ae_bool testmatgen(ae_bool silent, ae_state *_state); ae_bool _pexec_testmatgen(ae_bool silent, ae_state *_state); ae_bool testablas(ae_bool silent, ae_state *_state); ae_bool _pexec_testablas(ae_bool silent, ae_state *_state); ae_bool testtrfac(ae_bool silent, ae_state *_state); ae_bool _pexec_testtrfac(ae_bool silent, ae_state *_state); /************************************************************************* Function for testing sparse real Cholesky. Returns True on errors, False on success. -- ALGLIB PROJECT -- Copyright 16.01.1014 by Bochkanov Sergey *************************************************************************/ ae_bool sparserealcholeskytest(ae_state *_state); /************************************************************************* Main unittest subroutine *************************************************************************/ ae_bool testtrlinsolve(ae_bool silent, ae_state *_state); ae_bool _pexec_testtrlinsolve(ae_bool silent, ae_state *_state); /************************************************************************* Main unittest subroutine *************************************************************************/ ae_bool testsafesolve(ae_bool silent, ae_state *_state); ae_bool _pexec_testsafesolve(ae_bool silent, ae_state *_state); ae_bool testrcond(ae_bool silent, ae_state *_state); ae_bool _pexec_testrcond(ae_bool silent, ae_state *_state); /************************************************************************* Test *************************************************************************/ ae_bool testmatinv(ae_bool silent, ae_state *_state); ae_bool _pexec_testmatinv(ae_bool silent, ae_state *_state); ae_bool testhblas(ae_bool silent, ae_state *_state); ae_bool _pexec_testhblas(ae_bool silent, ae_state *_state); ae_bool testsblas(ae_bool silent, ae_state *_state); ae_bool _pexec_testsblas(ae_bool silent, ae_state *_state); /************************************************************************* Main unittest subroutine *************************************************************************/ ae_bool testortfac(ae_bool silent, ae_state *_state); ae_bool _pexec_testortfac(ae_bool silent, ae_state *_state); /************************************************************************* Testing *************************************************************************/ ae_bool testfbls(ae_bool silent, ae_state *_state); ae_bool _pexec_testfbls(ae_bool silent, ae_state *_state); ae_bool testcqmodels(ae_bool silent, ae_state *_state); ae_bool _pexec_testcqmodels(ae_bool silent, ae_state *_state); 
/************************************************************************* Testing bidiagonal SVD decomposition subroutine *************************************************************************/ ae_bool testbdsvd(ae_bool silent, ae_state *_state); ae_bool _pexec_testbdsvd(ae_bool silent, ae_state *_state); ae_bool testblas(ae_bool silent, ae_state *_state); ae_bool _pexec_testblas(ae_bool silent, ae_state *_state); /************************************************************************* Testing SVD decomposition subroutine *************************************************************************/ ae_bool testsvd(ae_bool silent, ae_state *_state); ae_bool _pexec_testsvd(ae_bool silent, ae_state *_state); ae_bool testoptserv(ae_bool silent, ae_state *_state); ae_bool _pexec_testoptserv(ae_bool silent, ae_state *_state); ae_bool testsnnls(ae_bool silent, ae_state *_state); ae_bool _pexec_testsnnls(ae_bool silent, ae_state *_state); ae_bool testsactivesets(ae_bool silent, ae_state *_state); ae_bool _pexec_testsactivesets(ae_bool silent, ae_state *_state); ae_bool testlinmin(ae_bool silent, ae_state *_state); ae_bool _pexec_testlinmin(ae_bool silent, ae_state *_state); ae_bool testminlbfgs(ae_bool silent, ae_state *_state); ae_bool _pexec_testminlbfgs(ae_bool silent, ae_state *_state); ae_bool testxblas(ae_bool silent, ae_state *_state); ae_bool _pexec_testxblas(ae_bool silent, ae_state *_state); /************************************************************************* Test *************************************************************************/ ae_bool testdensesolver(ae_bool silent, ae_state *_state); ae_bool _pexec_testdensesolver(ae_bool silent, ae_state *_state); ae_bool testnormestimator(ae_bool silent, ae_state *_state); ae_bool _pexec_testnormestimator(ae_bool silent, ae_state *_state); ae_bool testlinlsqr(ae_bool silent, ae_state *_state); ae_bool _pexec_testlinlsqr(ae_bool silent, ae_state *_state); ae_bool testmincg(ae_bool silent, ae_state *_state); ae_bool _pexec_testmincg(ae_bool silent, ae_state *_state); /************************************************************************* Other properties *************************************************************************/ void testother(ae_bool* err, ae_state *_state); ae_bool testminbleic(ae_bool silent, ae_state *_state); ae_bool _pexec_testminbleic(ae_bool silent, ae_state *_state); ae_bool testminqp(ae_bool silent, ae_state *_state); ae_bool _pexec_testminqp(ae_bool silent, ae_state *_state); /************************************************************************* Function to test: 'MinQPCreate', 'MinQPSetQuadraticTerm', 'MinQPSetBC', 'MinQPSetOrigin', 'MinQPSetStartingPoint', 'MinQPOptimize', 'MinQPResults'. Test problem: A = diag(aii), aii>0 (random) b = 0 random bounds (either no bounds, one bound, two bounds a 0 without bounds random start point dimension - from 1 to 5. *************************************************************************/ ae_bool functest1(ae_state *_state); /************************************************************************* Function to test: 'MinQPCreate', 'MinQPSetLinearTerm', 'MinQPSetQuadraticTerm', 'MinQPSetBC', 'MinQPSetOrigin', 'MinQPSetStartingPoint', 'MinQPOptimize', 'MinQPResults'. Test problem: A = positive-definite matrix, obtained by 'SPDMatrixRndCond' function b <> 0 boundary constraints random start point dimension - from 1 to 5. 
*************************************************************************/ ae_bool functest2(ae_state *_state); /************************************************************************* ConsoleTest. *************************************************************************/ ae_bool consoletest(ae_state *_state); /************************************************************************* This function performs tests specific for Cholesky solver Returns True on success, False on failure. *************************************************************************/ ae_bool choleskytests(ae_state *_state); /************************************************************************* This function performs tests specific for QuickQP solver Returns True on failure. *************************************************************************/ ae_bool quickqptests(ae_state *_state); /************************************************************************* This function performs tests specific for BLEIC solver Returns True on error, False on success. *************************************************************************/ ae_bool bleictests(ae_state *_state); ae_bool testminnlc(ae_bool silent, ae_state *_state); ae_bool _pexec_testminnlc(ae_bool silent, ae_state *_state); ae_bool testminbc(ae_bool silent, ae_state *_state); ae_bool _pexec_testminbc(ae_bool silent, ae_state *_state); ae_bool testminns(ae_bool silent, ae_state *_state); ae_bool _pexec_testminns(ae_bool silent, ae_state *_state); ae_bool testminlm(ae_bool silent, ae_state *_state); ae_bool _pexec_testminlm(ae_bool silent, ae_state *_state); /************************************************************************* Testing symmetric EVD subroutine *************************************************************************/ ae_bool testevd(ae_bool silent, ae_state *_state); ae_bool _pexec_testevd(ae_bool silent, ae_state *_state); ae_bool testbasestat(ae_bool silent, ae_state *_state); ae_bool _pexec_testbasestat(ae_bool silent, ae_state *_state); ae_bool testpca(ae_bool silent, ae_state *_state); ae_bool _pexec_testpca(ae_bool silent, ae_state *_state); /************************************************************************* Testing BDSS operations *************************************************************************/ ae_bool testbdss(ae_bool silent, ae_state *_state); ae_bool _pexec_testbdss(ae_bool silent, ae_state *_state); ae_bool testmlpbase(ae_bool silent, ae_state *_state); ae_bool _pexec_testmlpbase(ae_bool silent, ae_state *_state); ae_bool testlda(ae_bool silent, ae_state *_state); ae_bool _pexec_testlda(ae_bool silent, ae_state *_state); ae_bool testgammafunc(ae_bool silent, ae_state *_state); ae_bool _pexec_testgammafunc(ae_bool silent, ae_state *_state); ae_bool testlinreg(ae_bool silent, ae_state *_state); ae_bool _pexec_testlinreg(ae_bool silent, ae_state *_state); ae_bool testfilters(ae_bool silent, ae_state *_state); ae_bool _pexec_testfilters(ae_bool silent, ae_state *_state); /************************************************************************* This function tests SMA(k) filter. It returns True on error. Additional IsSilent parameter controls detailed error reporting. *************************************************************************/ ae_bool testsma(ae_bool issilent, ae_state *_state); /************************************************************************* This function tests EMA(alpha) filter. It returns True on error. Additional IsSilent parameter controls detailed error reporting. 
*************************************************************************/ ae_bool testema(ae_bool issilent, ae_state *_state); /************************************************************************* This function tests LRMA(k) filter. It returns True on error. Additional IsSilent parameter controls detailed error reporting. *************************************************************************/ ae_bool testlrma(ae_bool issilent, ae_state *_state); ae_bool testmcpd(ae_bool silent, ae_state *_state); ae_bool _pexec_testmcpd(ae_bool silent, ae_state *_state); ae_bool testmlpe(ae_bool silent, ae_state *_state); ae_bool _pexec_testmlpe(ae_bool silent, ae_state *_state); ae_bool testmlptrain(ae_bool silent, ae_state *_state); ae_bool _pexec_testmlptrain(ae_bool silent, ae_state *_state); /************************************************************************* Testing clustering *************************************************************************/ ae_bool testclustering(ae_bool silent, ae_state *_state); ae_bool _pexec_testclustering(ae_bool silent, ae_state *_state); ae_bool testdforest(ae_bool silent, ae_state *_state); ae_bool _pexec_testdforest(ae_bool silent, ae_state *_state); /************************************************************************* Test *************************************************************************/ ae_bool testgq(ae_bool silent, ae_state *_state); ae_bool _pexec_testgq(ae_bool silent, ae_state *_state); /************************************************************************* Test *************************************************************************/ ae_bool testgkq(ae_bool silent, ae_state *_state); ae_bool _pexec_testgkq(ae_bool silent, ae_state *_state); /************************************************************************* Test *************************************************************************/ ae_bool testautogk(ae_bool silent, ae_state *_state); ae_bool _pexec_testautogk(ae_bool silent, ae_state *_state); /************************************************************************* Test *************************************************************************/ ae_bool testfft(ae_bool silent, ae_state *_state); ae_bool _pexec_testfft(ae_bool silent, ae_state *_state); /************************************************************************* Test *************************************************************************/ ae_bool testfht(ae_bool silent, ae_state *_state); ae_bool _pexec_testfht(ae_bool silent, ae_state *_state); /************************************************************************* Test *************************************************************************/ ae_bool testconv(ae_bool silent, ae_state *_state); ae_bool _pexec_testconv(ae_bool silent, ae_state *_state); /************************************************************************* Test *************************************************************************/ ae_bool testcorr(ae_bool silent, ae_state *_state); ae_bool _pexec_testcorr(ae_bool silent, ae_state *_state); /************************************************************************* Testing IDW interpolation *************************************************************************/ ae_bool testidwint(ae_bool silent, ae_state *_state); ae_bool _pexec_testidwint(ae_bool silent, ae_state *_state); ae_bool testratint(ae_bool silent, ae_state *_state); ae_bool _pexec_testratint(ae_bool silent, ae_state *_state); ae_bool testspline1d(ae_bool silent, ae_state *_state); ae_bool 
_pexec_testspline1d(ae_bool silent, ae_state *_state); ae_bool testparametric(ae_bool silent, ae_state *_state); ae_bool _pexec_testparametric(ae_bool silent, ae_state *_state); ae_bool testspline3d(ae_bool silence, ae_state *_state); ae_bool _pexec_testspline3d(ae_bool silence, ae_state *_state); /************************************************************************* Unit test *************************************************************************/ ae_bool testpolint(ae_bool silent, ae_state *_state); ae_bool _pexec_testpolint(ae_bool silent, ae_state *_state); ae_bool testlsfit(ae_bool silent, ae_state *_state); ae_bool _pexec_testlsfit(ae_bool silent, ae_state *_state); ae_bool testnsfit(ae_bool silent, ae_state *_state); ae_bool _pexec_testnsfit(ae_bool silent, ae_state *_state); ae_bool testspline2d(ae_bool silent, ae_state *_state); ae_bool _pexec_testspline2d(ae_bool silent, ae_state *_state); ae_bool testrbf(ae_bool silent, ae_state *_state); ae_bool _pexec_testrbf(ae_bool silent, ae_state *_state); /************************************************************************* The test has to check, that algorithm can solve problems of matrix are degenerate. * used model with linear term; * points locate in a subspace of dimension less than an original space. -- ALGLIB -- Copyright 13.12.2011 by Bochkanov Sergey *************************************************************************/ ae_bool sqrdegmatrixrbftest(ae_bool silent, ae_state *_state); /************************************************************************* Function for testing basic functionality of RBF module on regular grids with multi-layer algorithm in 1D. -- ALGLIB -- Copyright 2.03.2012 by Bochkanov Sergey *************************************************************************/ ae_bool basicmultilayerrbf1dtest(ae_state *_state); ae_bool testhermite(ae_bool silent, ae_state *_state); ae_bool _pexec_testhermite(ae_bool silent, ae_state *_state); ae_bool testlaguerre(ae_bool silent, ae_state *_state); ae_bool _pexec_testlaguerre(ae_bool silent, ae_state *_state); ae_bool testlegendre(ae_bool silent, ae_state *_state); ae_bool _pexec_testlegendre(ae_bool silent, ae_state *_state); ae_bool testchebyshev(ae_bool silent, ae_state *_state); ae_bool _pexec_testchebyshev(ae_bool silent, ae_state *_state); ae_bool testwsr(ae_bool silent, ae_state *_state); ae_bool _pexec_testwsr(ae_bool silent, ae_state *_state); ae_bool teststest(ae_bool silent, ae_state *_state); ae_bool _pexec_teststest(ae_bool silent, ae_state *_state); ae_bool teststudentttests(ae_bool silent, ae_state *_state); ae_bool _pexec_teststudentttests(ae_bool silent, ae_state *_state); ae_bool testmannwhitneyu(ae_bool silent, ae_state *_state); ae_bool _pexec_testmannwhitneyu(ae_bool silent, ae_state *_state); /************************************************************************* Testing Schur decomposition subroutine *************************************************************************/ ae_bool testschur(ae_bool silent, ae_state *_state); ae_bool _pexec_testschur(ae_bool silent, ae_state *_state); /************************************************************************* Testing bidiagonal SVD decomposition subroutine *************************************************************************/ ae_bool testspdgevd(ae_bool silent, ae_state *_state); ae_bool _pexec_testspdgevd(ae_bool silent, ae_state *_state); ae_bool testinverseupdate(ae_bool silent, ae_state *_state); ae_bool _pexec_testinverseupdate(ae_bool silent, ae_state *_state); 
/************************************************************************* Test *************************************************************************/ ae_bool testpolynomialsolver(ae_bool silent, ae_state *_state); ae_bool _pexec_testpolynomialsolver(ae_bool silent, ae_state *_state); ae_bool testnleq(ae_bool silent, ae_state *_state); ae_bool _pexec_testnleq(ae_bool silent, ae_state *_state); ae_bool testlincg(ae_bool silent, ae_state *_state); ae_bool _pexec_testlincg(ae_bool silent, ae_state *_state); typedef struct { ae_bool bfield; double rfield; ae_int_t ifield; ae_complex cfield; ae_vector b1field; ae_vector r1field; ae_vector i1field; ae_vector c1field; ae_matrix b2field; ae_matrix r2field; ae_matrix i2field; ae_matrix c2field; } rec1; typedef struct { ae_vector b; ae_vector i; ae_vector r; } rec4serialization; typedef struct { ae_complex cval; double rval; ae_int_t ival; ae_bool bval; ae_vector i1val; } poolrec1; typedef struct { ae_bool bval; poolrec1 recval; ae_shared_pool pool; } poolrec2; typedef struct { ae_int_t val; } poolsummand; void rec4serializationalloc(ae_serializer* s, rec4serialization* v, ae_state *_state); void rec4serializationserialize(ae_serializer* s, rec4serialization* v, ae_state *_state); void rec4serializationunserialize(ae_serializer* s, rec4serialization* v, ae_state *_state); ae_bool testalglibbasics(ae_bool silent, ae_state *_state); ae_bool _pexec_testalglibbasics(ae_bool silent, ae_state *_state); void _rec1_init(void* _p, ae_state *_state); void _rec1_init_copy(void* _dst, void* _src, ae_state *_state); void _rec1_clear(void* _p); void _rec1_destroy(void* _p); void _rec4serialization_init(void* _p, ae_state *_state); void _rec4serialization_init_copy(void* _dst, void* _src, ae_state *_state); void _rec4serialization_clear(void* _p); void _rec4serialization_destroy(void* _p); void _poolrec1_init(void* _p, ae_state *_state); void _poolrec1_init_copy(void* _dst, void* _src, ae_state *_state); void _poolrec1_clear(void* _p); void _poolrec1_destroy(void* _p); void _poolrec2_init(void* _p, ae_state *_state); void _poolrec2_init_copy(void* _dst, void* _src, ae_state *_state); void _poolrec2_clear(void* _p); void _poolrec2_destroy(void* _p); void _poolsummand_init(void* _p, ae_state *_state); void _poolsummand_init_copy(void* _dst, void* _src, ae_state *_state); void _poolsummand_clear(void* _p); void _poolsummand_destroy(void* _p); static void testtsortunit_unset1di(/* Integer */ ae_vector* a, ae_state *_state); static void testtsortunit_testsortresults(/* Real */ ae_vector* asorted, /* Integer */ ae_vector* p1, /* Integer */ ae_vector* p2, /* Real */ ae_vector* aoriginal, ae_int_t n, ae_bool* waserrors, ae_state *_state); /************************************************************************* Testing tag sort *************************************************************************/ ae_bool testtsort(ae_bool silent, ae_state *_state) { ae_frame _frame_block; ae_bool waserrors; ae_int_t n; ae_int_t i; ae_int_t m; ae_int_t offs; ae_int_t pass; ae_int_t passcount; ae_int_t maxn; ae_vector a; ae_vector a0; ae_vector a1; ae_vector a2; ae_vector a3; ae_vector i1; ae_vector i2; ae_vector i3; ae_vector a4; ae_vector pa4; ae_vector ar; ae_vector ar2; ae_vector ai; ae_vector p1; ae_vector p2; ae_vector bufr1; ae_vector bufr2; ae_vector bufi1; ae_bool distinctvals; ae_bool result; ae_frame_make(_state, &_frame_block); ae_vector_init(&a, 0, DT_REAL, _state); ae_vector_init(&a0, 0, DT_REAL, _state); ae_vector_init(&a1, 0, DT_REAL, _state); ae_vector_init(&a2, 
0, DT_REAL, _state); ae_vector_init(&a3, 0, DT_REAL, _state); ae_vector_init(&i1, 0, DT_INT, _state); ae_vector_init(&i2, 0, DT_INT, _state); ae_vector_init(&i3, 0, DT_INT, _state); ae_vector_init(&a4, 0, DT_INT, _state); ae_vector_init(&pa4, 0, DT_INT, _state); ae_vector_init(&ar, 0, DT_REAL, _state); ae_vector_init(&ar2, 0, DT_REAL, _state); ae_vector_init(&ai, 0, DT_INT, _state); ae_vector_init(&p1, 0, DT_INT, _state); ae_vector_init(&p2, 0, DT_INT, _state); ae_vector_init(&bufr1, 0, DT_REAL, _state); ae_vector_init(&bufr2, 0, DT_REAL, _state); ae_vector_init(&bufi1, 0, DT_INT, _state); waserrors = ae_false; maxn = 100; passcount = 10; /* * Test tagsort */ for(n=1; n<=maxn; n++) { for(pass=1; pass<=passcount; pass++) { /* * Pprobably distinct sort: * * generate array of integer random numbers. * Because of birthday paradox, random numbers have to be VERY large * in order to avoid situation when we have distinct values. * * sort A0 using TagSort and test sort results * * now we can use A0 as reference point and test other functions */ testtsortunit_unset1di(&p1, _state); testtsortunit_unset1di(&p2, _state); ae_vector_set_length(&a, n, _state); ae_vector_set_length(&a0, n, _state); ae_vector_set_length(&a1, n, _state); ae_vector_set_length(&a2, n, _state); ae_vector_set_length(&a3, n, _state); ae_vector_set_length(&a4, n, _state); ae_vector_set_length(&ar, n, _state); ae_vector_set_length(&ar2, n, _state); ae_vector_set_length(&ai, n, _state); for(i=0; i<=n-1; i++) { a.ptr.p_double[i] = (double)(ae_randominteger(100000000, _state)); a0.ptr.p_double[i] = a.ptr.p_double[i]; a1.ptr.p_double[i] = a.ptr.p_double[i]; a2.ptr.p_double[i] = a.ptr.p_double[i]; a3.ptr.p_double[i] = a.ptr.p_double[i]; a4.ptr.p_int[i] = ae_round(a.ptr.p_double[i], _state); ar.ptr.p_double[i] = (double)(i); ar2.ptr.p_double[i] = (double)(i); ai.ptr.p_int[i] = i; } tagsort(&a0, n, &p1, &p2, _state); testtsortunit_testsortresults(&a0, &p1, &p2, &a, n, &waserrors, _state); distinctvals = ae_true; for(i=1; i<=n-1; i++) { distinctvals = distinctvals&&ae_fp_neq(a0.ptr.p_double[i],a0.ptr.p_double[i-1]); } if( distinctvals ) { tagsortfasti(&a1, &ai, &bufr1, &bufi1, n, _state); for(i=0; i<=n-1; i++) { waserrors = (waserrors||ae_fp_neq(a1.ptr.p_double[i],a0.ptr.p_double[i]))||ai.ptr.p_int[i]!=p1.ptr.p_int[i]; } tagsortfastr(&a2, &ar, &bufr1, &bufr2, n, _state); for(i=0; i<=n-1; i++) { waserrors = (waserrors||ae_fp_neq(a2.ptr.p_double[i],a0.ptr.p_double[i]))||ae_fp_neq(ar.ptr.p_double[i],(double)(p1.ptr.p_int[i])); } tagsortfast(&a3, &bufr1, n, _state); for(i=0; i<=n-1; i++) { waserrors = waserrors||ae_fp_neq(a3.ptr.p_double[i],a0.ptr.p_double[i]); } tagsortmiddleir(&a4, &ar2, 0, n, _state); for(i=0; i<=n-1; i++) { waserrors = (waserrors||ae_fp_neq((double)(a4.ptr.p_int[i]),a0.ptr.p_double[i]))||ae_fp_neq(ar2.ptr.p_double[i],(double)(p1.ptr.p_int[i])); } } /* * Non-distinct sort. * We test that keys are correctly reordered, but do NOT test order of values. 
*/ testtsortunit_unset1di(&p1, _state); testtsortunit_unset1di(&p2, _state); ae_vector_set_length(&a, n, _state); ae_vector_set_length(&a0, n, _state); ae_vector_set_length(&a1, n, _state); ae_vector_set_length(&a2, n, _state); ae_vector_set_length(&a3, n, _state); ae_vector_set_length(&a4, n, _state); ae_vector_set_length(&ar, n, _state); ae_vector_set_length(&ar2, n, _state); ae_vector_set_length(&ai, n, _state); for(i=0; i<=n-1; i++) { a.ptr.p_double[i] = (double)((n-i)/2); a0.ptr.p_double[i] = a.ptr.p_double[i]; a1.ptr.p_double[i] = a.ptr.p_double[i]; a2.ptr.p_double[i] = a.ptr.p_double[i]; a3.ptr.p_double[i] = a.ptr.p_double[i]; a4.ptr.p_int[i] = ae_round(a.ptr.p_double[i], _state); ar.ptr.p_double[i] = (double)(i); ar2.ptr.p_double[i] = (double)(i); ai.ptr.p_int[i] = i; } tagsort(&a0, n, &p1, &p2, _state); testtsortunit_testsortresults(&a0, &p1, &p2, &a, n, &waserrors, _state); tagsortfasti(&a1, &ai, &bufr1, &bufi1, n, _state); for(i=0; i<=n-1; i++) { waserrors = waserrors||ae_fp_neq(a1.ptr.p_double[i],a0.ptr.p_double[i]); } tagsortfastr(&a2, &ar, &bufr1, &bufr2, n, _state); for(i=0; i<=n-1; i++) { waserrors = waserrors||ae_fp_neq(a2.ptr.p_double[i],a0.ptr.p_double[i]); } tagsortfast(&a3, &bufr1, n, _state); for(i=0; i<=n-1; i++) { waserrors = waserrors||ae_fp_neq(a3.ptr.p_double[i],a0.ptr.p_double[i]); } tagsortmiddleir(&a4, &ar2, 0, n, _state); for(i=0; i<=n-1; i++) { waserrors = waserrors||ae_fp_neq((double)(a4.ptr.p_int[i]),a0.ptr.p_double[i]); } /* * 'All same' sort * We test that keys are correctly reordered, but do NOT test order of values. */ testtsortunit_unset1di(&p1, _state); testtsortunit_unset1di(&p2, _state); ae_vector_set_length(&a, n, _state); ae_vector_set_length(&a0, n, _state); ae_vector_set_length(&a1, n, _state); ae_vector_set_length(&a2, n, _state); ae_vector_set_length(&a3, n, _state); ae_vector_set_length(&a4, n, _state); ae_vector_set_length(&ar, n, _state); ae_vector_set_length(&ar2, n, _state); ae_vector_set_length(&ai, n, _state); for(i=0; i<=n-1; i++) { a.ptr.p_double[i] = (double)(0); a0.ptr.p_double[i] = a.ptr.p_double[i]; a1.ptr.p_double[i] = a.ptr.p_double[i]; a2.ptr.p_double[i] = a.ptr.p_double[i]; a3.ptr.p_double[i] = a.ptr.p_double[i]; a4.ptr.p_int[i] = ae_round(a.ptr.p_double[i], _state); ar.ptr.p_double[i] = (double)(i); ar2.ptr.p_double[i] = (double)(i); ai.ptr.p_int[i] = i; } tagsort(&a0, n, &p1, &p2, _state); testtsortunit_testsortresults(&a0, &p1, &p2, &a, n, &waserrors, _state); tagsortfasti(&a1, &ai, &bufr1, &bufi1, n, _state); for(i=0; i<=n-1; i++) { waserrors = waserrors||ae_fp_neq(a1.ptr.p_double[i],a0.ptr.p_double[i]); } tagsortfastr(&a2, &ar, &bufr1, &bufr2, n, _state); for(i=0; i<=n-1; i++) { waserrors = waserrors||ae_fp_neq(a2.ptr.p_double[i],a0.ptr.p_double[i]); } tagsortfast(&a3, &bufr1, n, _state); for(i=0; i<=n-1; i++) { waserrors = waserrors||ae_fp_neq(a3.ptr.p_double[i],a0.ptr.p_double[i]); } tagsortmiddleir(&a4, &ar2, 0, n, _state); for(i=0; i<=n-1; i++) { waserrors = waserrors||ae_fp_neq((double)(a4.ptr.p_int[i]),a0.ptr.p_double[i]); } /* * 0-1 sort * We test that keys are correctly reordered, but do NOT test order of values. 
*/ testtsortunit_unset1di(&p1, _state); testtsortunit_unset1di(&p2, _state); ae_vector_set_length(&a, n, _state); ae_vector_set_length(&a0, n, _state); ae_vector_set_length(&a1, n, _state); ae_vector_set_length(&a2, n, _state); ae_vector_set_length(&a3, n, _state); ae_vector_set_length(&a4, n, _state); ae_vector_set_length(&ar, n, _state); ae_vector_set_length(&ar2, n, _state); ae_vector_set_length(&ai, n, _state); for(i=0; i<=n-1; i++) { a.ptr.p_double[i] = (double)(ae_randominteger(2, _state)); a0.ptr.p_double[i] = a.ptr.p_double[i]; a1.ptr.p_double[i] = a.ptr.p_double[i]; a2.ptr.p_double[i] = a.ptr.p_double[i]; a3.ptr.p_double[i] = a.ptr.p_double[i]; a4.ptr.p_int[i] = ae_round(a.ptr.p_double[i], _state); ar.ptr.p_double[i] = (double)(i); ar2.ptr.p_double[i] = (double)(i); ai.ptr.p_int[i] = i; } tagsort(&a0, n, &p1, &p2, _state); testtsortunit_testsortresults(&a0, &p1, &p2, &a, n, &waserrors, _state); tagsortfasti(&a1, &ai, &bufr1, &bufi1, n, _state); for(i=0; i<=n-1; i++) { waserrors = waserrors||ae_fp_neq(a1.ptr.p_double[i],a0.ptr.p_double[i]); } tagsortfastr(&a2, &ar, &bufr1, &bufr2, n, _state); for(i=0; i<=n-1; i++) { waserrors = waserrors||ae_fp_neq(a2.ptr.p_double[i],a0.ptr.p_double[i]); } tagsortfast(&a3, &bufr1, n, _state); for(i=0; i<=n-1; i++) { waserrors = waserrors||ae_fp_neq(a3.ptr.p_double[i],a0.ptr.p_double[i]); } tagsortmiddleir(&a4, &ar2, 0, n, _state); for(i=0; i<=n-1; i++) { waserrors = waserrors||ae_fp_neq((double)(a4.ptr.p_int[i]),a0.ptr.p_double[i]); } /* * Special test for TagSortMiddleIR: sorting in the middle gives same results * as sorting in the beginning of the array */ m = 3*n; offs = ae_randominteger(n, _state); ae_vector_set_length(&i1, m, _state); ae_vector_set_length(&i2, m, _state); ae_vector_set_length(&i3, m, _state); ae_vector_set_length(&ar, m, _state); ae_vector_set_length(&ar2, m, _state); for(i=0; i<=m-1; i++) { i1.ptr.p_int[i] = ae_randominteger(100000000, _state); i2.ptr.p_int[i] = i1.ptr.p_int[i]; i3.ptr.p_int[i] = i1.ptr.p_int[i]; ar.ptr.p_double[i] = (double)(i); ar2.ptr.p_double[i] = (double)(i); } for(i=0; i<=n-1; i++) { i1.ptr.p_int[i] = i1.ptr.p_int[offs+i]; ar.ptr.p_double[i] = ar.ptr.p_double[offs+i]; } tagsortmiddleir(&i1, &ar, 0, n, _state); for(i=1; i<=n-1; i++) { distinctvals = distinctvals&&i1.ptr.p_int[i]!=i1.ptr.p_int[i-1]; } if( distinctvals ) { tagsortmiddleir(&i2, &ar2, offs, n, _state); for(i=0; i<=n-1; i++) { waserrors = (waserrors||i2.ptr.p_int[offs+i]!=i1.ptr.p_int[i])||ae_fp_neq(ar2.ptr.p_double[offs+i],ar.ptr.p_double[i]); } } } } /* * report */ if( !silent ) { printf("TESTING TAGSORT\n"); if( waserrors ) { printf("TEST FAILED\n"); } else { printf("TEST PASSED\n"); } printf("\n\n"); } result = !waserrors; ae_frame_leave(_state); return result; } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. *************************************************************************/ ae_bool _pexec_testtsort(ae_bool silent, ae_state *_state) { return testtsort(silent, _state); } /************************************************************************* Unsets 1D array. 
*************************************************************************/ static void testtsortunit_unset1di(/* Integer */ ae_vector* a, ae_state *_state) { ae_vector_set_length(a, 0+1, _state); a->ptr.p_int[0] = ae_randominteger(3, _state)-1; } static void testtsortunit_testsortresults(/* Real */ ae_vector* asorted, /* Integer */ ae_vector* p1, /* Integer */ ae_vector* p2, /* Real */ ae_vector* aoriginal, ae_int_t n, ae_bool* waserrors, ae_state *_state) { ae_frame _frame_block; ae_int_t i; ae_vector a2; double t; ae_vector f; ae_frame_make(_state, &_frame_block); ae_vector_init(&a2, 0, DT_REAL, _state); ae_vector_init(&f, 0, DT_INT, _state); ae_vector_set_length(&a2, n-1+1, _state); ae_vector_set_length(&f, n-1+1, _state); /* * is set ordered? */ for(i=0; i<=n-2; i++) { *waserrors = *waserrors||ae_fp_greater(asorted->ptr.p_double[i],asorted->ptr.p_double[i+1]); } /* * P1 correctness */ for(i=0; i<=n-1; i++) { *waserrors = *waserrors||ae_fp_neq(asorted->ptr.p_double[i],aoriginal->ptr.p_double[p1->ptr.p_int[i]]); } for(i=0; i<=n-1; i++) { f.ptr.p_int[i] = 0; } for(i=0; i<=n-1; i++) { f.ptr.p_int[p1->ptr.p_int[i]] = f.ptr.p_int[p1->ptr.p_int[i]]+1; } for(i=0; i<=n-1; i++) { *waserrors = *waserrors||f.ptr.p_int[i]!=1; } /* * P2 correctness */ for(i=0; i<=n-1; i++) { a2.ptr.p_double[i] = aoriginal->ptr.p_double[i]; } for(i=0; i<=n-1; i++) { if( p2->ptr.p_int[i]!=i ) { t = a2.ptr.p_double[i]; a2.ptr.p_double[i] = a2.ptr.p_double[p2->ptr.p_int[i]]; a2.ptr.p_double[p2->ptr.p_int[i]] = t; } } for(i=0; i<=n-1; i++) { *waserrors = *waserrors||ae_fp_neq(asorted->ptr.p_double[i],a2.ptr.p_double[i]); } ae_frame_leave(_state); } static ae_bool testnearestneighborunit_kdtresultsdifferent(/* Real */ ae_matrix* refxy, ae_int_t ntotal, /* Real */ ae_matrix* qx, /* Real */ ae_matrix* qxy, /* Integer */ ae_vector* qt, ae_int_t n, ae_int_t nx, ae_int_t ny, ae_state *_state); static double testnearestneighborunit_vnorm(/* Real */ ae_vector* x, ae_int_t n, ae_int_t normtype, ae_state *_state); static void testnearestneighborunit_testkdtuniform(/* Real */ ae_matrix* xy, ae_int_t n, ae_int_t nx, ae_int_t ny, ae_int_t normtype, ae_bool* kdterrors, ae_state *_state); static void testnearestneighborunit_testkdtreeserialization(ae_bool* err, ae_state *_state); static ae_bool testnearestneighborunit_testspecialcases(ae_state *_state); /************************************************************************* Testing Nearest Neighbor Search *************************************************************************/ ae_bool testnearestneighbor(ae_bool silent, ae_state *_state) { ae_frame _frame_block; ae_matrix xy; ae_int_t i; ae_int_t j; double v; ae_int_t normtype; ae_int_t nx; ae_int_t ny; ae_int_t n; ae_int_t smalln; ae_int_t largen; ae_int_t passcount; ae_int_t pass; ae_bool waserrors; ae_bool kdterrors; ae_bool result; ae_frame_make(_state, &_frame_block); ae_matrix_init(&xy, 0, 0, DT_REAL, _state); kdterrors = ae_false; passcount = 2; smalln = 256; largen = 2048; ny = 3; /* * */ testnearestneighborunit_testkdtreeserialization(&kdterrors, _state); for(pass=1; pass<=passcount; pass++) { for(normtype=0; normtype<=2; normtype++) { for(nx=1; nx<=3; nx++) { /* * Test in hypercube */ ae_matrix_set_length(&xy, largen, nx+ny, _state); for(i=0; i<=largen-1; i++) { for(j=0; j<=nx+ny-1; j++) { xy.ptr.pp_double[i][j] = 10*ae_randomreal(_state)-5; } } for(n=1; n<=10; n++) { testnearestneighborunit_testkdtuniform(&xy, n, nx, ae_randominteger(ny+1, _state), normtype, &kdterrors, _state); } 
testnearestneighborunit_testkdtuniform(&xy, largen, nx, ae_randominteger(ny+1, _state), normtype, &kdterrors, _state); /* * Test clustered (2*N points, pairs of equal points) */ ae_matrix_set_length(&xy, 2*smalln, nx+ny, _state); for(i=0; i<=smalln-1; i++) { for(j=0; j<=nx+ny-1; j++) { xy.ptr.pp_double[2*i+0][j] = 10*ae_randomreal(_state)-5; xy.ptr.pp_double[2*i+1][j] = xy.ptr.pp_double[2*i+0][j]; } } testnearestneighborunit_testkdtuniform(&xy, 2*smalln, nx, ae_randominteger(ny+1, _state), normtype, &kdterrors, _state); /* * Test degenerate case: all points are same except for one */ ae_matrix_set_length(&xy, smalln, nx+ny, _state); v = ae_randomreal(_state); for(i=0; i<=smalln-2; i++) { for(j=0; j<=nx+ny-1; j++) { xy.ptr.pp_double[i][j] = v; } } for(j=0; j<=nx+ny-1; j++) { xy.ptr.pp_double[smalln-1][j] = 10*ae_randomreal(_state)-5; } testnearestneighborunit_testkdtuniform(&xy, smalln, nx, ae_randominteger(ny+1, _state), normtype, &kdterrors, _state); } } } kdterrors = kdterrors||testnearestneighborunit_testspecialcases(_state); /* * report */ waserrors = kdterrors; if( !silent ) { printf("TESTING NEAREST NEIGHBOR SEARCH\n"); printf("* KD TREES: "); if( !kdterrors ) { printf("OK\n"); } else { printf("FAILED\n"); } if( waserrors ) { printf("TEST FAILED\n"); } else { printf("TEST PASSED\n"); } printf("\n\n"); } result = !waserrors; ae_frame_leave(_state); return result; } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. *************************************************************************/ ae_bool _pexec_testnearestneighbor(ae_bool silent, ae_state *_state) { return testnearestneighbor(silent, _state); } /************************************************************************* Compare results from different queries: * X just X-values * XY X-values and Y-values * XT X-values and tag values *************************************************************************/ static ae_bool testnearestneighborunit_kdtresultsdifferent(/* Real */ ae_matrix* refxy, ae_int_t ntotal, /* Real */ ae_matrix* qx, /* Real */ ae_matrix* qxy, /* Integer */ ae_vector* qt, ae_int_t n, ae_int_t nx, ae_int_t ny, ae_state *_state) { ae_int_t i; ae_int_t j; ae_bool result; result = ae_false; for(i=0; i<=n-1; i++) { if( qt->ptr.p_int[i]<0||qt->ptr.p_int[i]>=ntotal ) { result = ae_true; return result; } for(j=0; j<=nx-1; j++) { result = result||ae_fp_neq(qx->ptr.pp_double[i][j],refxy->ptr.pp_double[qt->ptr.p_int[i]][j]); result = result||ae_fp_neq(qxy->ptr.pp_double[i][j],refxy->ptr.pp_double[qt->ptr.p_int[i]][j]); } for(j=0; j<=ny-1; j++) { result = result||ae_fp_neq(qxy->ptr.pp_double[i][nx+j],refxy->ptr.pp_double[qt->ptr.p_int[i]][nx+j]); } } return result; } /************************************************************************* Returns norm *************************************************************************/ static double testnearestneighborunit_vnorm(/* Real */ ae_vector* x, ae_int_t n, ae_int_t normtype, ae_state *_state) { ae_int_t i; double result; result = ae_randomreal(_state); if( normtype==0 ) { result = (double)(0); for(i=0; i<=n-1; i++) { result = ae_maxreal(result, ae_fabs(x->ptr.p_double[i], _state), _state); } return result; } if( normtype==1 ) { result = (double)(0); for(i=0; i<=n-1; i++) { result = result+ae_fabs(x->ptr.p_double[i], _state); } return result; } if( normtype==2 ) { result = (double)(0); for(i=0; i<=n-1; i++) { result = result+ae_sqr(x->ptr.p_double[i], _state); } result = 
ae_sqrt(result, _state); return result; } return result; } /************************************************************************* Testing Nearest Neighbor Search on uniformly distributed hypercube NormType: 0, 1, 2 D: space dimension N: points count *************************************************************************/ static void testnearestneighborunit_testkdtuniform(/* Real */ ae_matrix* xy, ae_int_t n, ae_int_t nx, ae_int_t ny, ae_int_t normtype, ae_bool* kdterrors, ae_state *_state) { ae_frame _frame_block; double errtol; ae_vector tags; ae_vector ptx; ae_vector tmpx; ae_vector tmpb; kdtree treex; kdtree treexy; kdtree treext; kdtreerequestbuffer bufx; kdtreerequestbuffer bufxy; kdtreerequestbuffer bufxt; ae_matrix qx; ae_matrix qxy; ae_vector qtags; ae_vector qr; ae_vector boxmin; ae_vector boxmax; ae_vector qmin; ae_vector qmax; double spread; ae_int_t kx; ae_int_t kxy; ae_int_t kt; double eps; ae_int_t i; ae_int_t j; ae_int_t k; ae_int_t task; ae_bool isequal; double r; ae_int_t q; ae_int_t qcount; double v; ae_bool inthebox; ae_frame_make(_state, &_frame_block); ae_vector_init(&tags, 0, DT_INT, _state); ae_vector_init(&ptx, 0, DT_REAL, _state); ae_vector_init(&tmpx, 0, DT_REAL, _state); ae_vector_init(&tmpb, 0, DT_BOOL, _state); _kdtree_init(&treex, _state); _kdtree_init(&treexy, _state); _kdtree_init(&treext, _state); _kdtreerequestbuffer_init(&bufx, _state); _kdtreerequestbuffer_init(&bufxy, _state); _kdtreerequestbuffer_init(&bufxt, _state); ae_matrix_init(&qx, 0, 0, DT_REAL, _state); ae_matrix_init(&qxy, 0, 0, DT_REAL, _state); ae_vector_init(&qtags, 0, DT_INT, _state); ae_vector_init(&qr, 0, DT_REAL, _state); ae_vector_init(&boxmin, 0, DT_REAL, _state); ae_vector_init(&boxmax, 0, DT_REAL, _state); ae_vector_init(&qmin, 0, DT_REAL, _state); ae_vector_init(&qmax, 0, DT_REAL, _state); qcount = 10; ae_assert(n>0, "Assertion failed", _state); /* * Tol - roundoff error tolerance (for '>=' comparisons) */ errtol = 100000*ae_machineepsilon; /* * Evaluate bounding box and spread. */ ae_vector_set_length(&boxmin, nx, _state); ae_vector_set_length(&boxmax, nx, _state); for(j=0; j<=nx-1; j++) { boxmin.ptr.p_double[j] = xy->ptr.pp_double[0][j]; boxmax.ptr.p_double[j] = xy->ptr.pp_double[0][j]; } for(i=0; i<=n-1; i++) { for(j=0; j<=nx-1; j++) { boxmin.ptr.p_double[j] = ae_minreal(boxmin.ptr.p_double[j], xy->ptr.pp_double[i][j], _state); boxmax.ptr.p_double[j] = ae_maxreal(boxmax.ptr.p_double[j], xy->ptr.pp_double[i][j], _state); } } spread = (double)(0); for(j=0; j<=nx-1; j++) { spread = ae_maxreal(spread, boxmax.ptr.p_double[j]-boxmin.ptr.p_double[j], _state); } if( ae_fp_eq(spread,(double)(0)) ) { spread = (double)(1); } /* * fill tags */ ae_vector_set_length(&tags, n, _state); for(i=0; i<=n-1; i++) { tags.ptr.p_int[i] = i; } /* * build trees */ kdtreebuild(xy, n, nx, 0, normtype, &treex, _state); kdtreebuild(xy, n, nx, ny, normtype, &treexy, _state); kdtreebuildtagged(xy, &tags, n, nx, 0, normtype, &treext, _state); /* * allocate arrays */ ae_vector_set_length(&tmpx, nx, _state); ae_vector_set_length(&tmpb, n, _state); ae_matrix_set_length(&qx, n, nx, _state); ae_matrix_set_length(&qxy, n, nx+ny, _state); ae_vector_set_length(&qtags, n, _state); ae_vector_set_length(&qr, n, _state); ae_vector_set_length(&ptx, nx, _state); /* * test general K-NN queries (with self-matches): * * compare results from different trees (must be equal) and * check that correct (value,tag) pairs are returned * * test results from XT tree - let R be radius of query result. 
* then all points not in result must be not closer than R. */ for(q=1; q<=qcount; q++) { /* * Select K: 1..N */ if( ae_fp_greater(ae_randomreal(_state),0.5) ) { k = 1+ae_randominteger(n, _state); } else { k = 1; } /* * Select point (either one of the points, or random) */ if( ae_fp_greater(ae_randomreal(_state),0.5) ) { i = ae_randominteger(n, _state); ae_v_move(&ptx.ptr.p_double[0], 1, &xy->ptr.pp_double[i][0], 1, ae_v_len(0,nx-1)); } else { for(i=0; i<=nx-1; i++) { ptx.ptr.p_double[i] = 2*ae_randomreal(_state)-1; } } /* * Test functions which use internal buffer: * * consistency of results from different queries * * points in query are IN the R-sphere (or at the boundary), * and points not in query are outside of the R-sphere (or at the boundary) * * distances are correct and are ordered */ kx = kdtreequeryknn(&treex, &ptx, k, ae_true, _state); kxy = kdtreequeryknn(&treexy, &ptx, k, ae_true, _state); kt = kdtreequeryknn(&treext, &ptx, k, ae_true, _state); if( (kx!=k||kxy!=k)||kt!=k ) { *kdterrors = ae_true; ae_frame_leave(_state); return; } unsetrealmatrix(&qx, _state); unsetrealmatrix(&qxy, _state); unsetintegerarray(&qtags, _state); unsetrealarray(&qr, _state); kdtreequeryresultsxi(&treex, &qx, _state); kdtreequeryresultsxyi(&treexy, &qxy, _state); kdtreequeryresultstagsi(&treext, &qtags, _state); kdtreequeryresultsdistancesi(&treext, &qr, _state); *kdterrors = *kdterrors||testnearestneighborunit_kdtresultsdifferent(xy, n, &qx, &qxy, &qtags, k, nx, ny, _state); unsetrealmatrix(&qx, _state); unsetrealmatrix(&qxy, _state); unsetintegerarray(&qtags, _state); unsetrealarray(&qr, _state); kdtreequeryresultsx(&treex, &qx, _state); kdtreequeryresultsxy(&treexy, &qxy, _state); kdtreequeryresultstags(&treext, &qtags, _state); kdtreequeryresultsdistances(&treext, &qr, _state); *kdterrors = *kdterrors||testnearestneighborunit_kdtresultsdifferent(xy, n, &qx, &qxy, &qtags, k, nx, ny, _state); for(i=0; i<=n-1; i++) { tmpb.ptr.p_bool[i] = ae_true; } r = (double)(0); for(i=0; i<=k-1; i++) { tmpb.ptr.p_bool[qtags.ptr.p_int[i]] = ae_false; ae_v_move(&tmpx.ptr.p_double[0], 1, &ptx.ptr.p_double[0], 1, ae_v_len(0,nx-1)); ae_v_sub(&tmpx.ptr.p_double[0], 1, &qx.ptr.pp_double[i][0], 1, ae_v_len(0,nx-1)); r = ae_maxreal(r, testnearestneighborunit_vnorm(&tmpx, nx, normtype, _state), _state); } for(i=0; i<=n-1; i++) { if( tmpb.ptr.p_bool[i] ) { ae_v_move(&tmpx.ptr.p_double[0], 1, &ptx.ptr.p_double[0], 1, ae_v_len(0,nx-1)); ae_v_sub(&tmpx.ptr.p_double[0], 1, &xy->ptr.pp_double[i][0], 1, ae_v_len(0,nx-1)); *kdterrors = *kdterrors||ae_fp_less(testnearestneighborunit_vnorm(&tmpx, nx, normtype, _state),r*(1-errtol)); } } for(i=0; i<=k-2; i++) { *kdterrors = *kdterrors||ae_fp_greater(qr.ptr.p_double[i],qr.ptr.p_double[i+1]); } for(i=0; i<=k-1; i++) { ae_v_move(&tmpx.ptr.p_double[0], 1, &ptx.ptr.p_double[0], 1, ae_v_len(0,nx-1)); ae_v_sub(&tmpx.ptr.p_double[0], 1, &xy->ptr.pp_double[qtags.ptr.p_int[i]][0], 1, ae_v_len(0,nx-1)); *kdterrors = *kdterrors||ae_fp_greater(ae_fabs(testnearestneighborunit_vnorm(&tmpx, nx, normtype, _state)-qr.ptr.p_double[i], _state),errtol); } /* * Test functions which use external buffer: * * create external request buffer, perform buffered request * * reset status of internal buffer by performing 1-NN query, it is essential * to test the fact that external buffer is used * * check consistency of results from different queries * * check that points in query are IN the R-sphere (or at the boundary), * and points not in query are outside of the R-sphere (or at the boundary) * * check that 
distances are correct and are ordered */ kdtreecreaterequestbuffer(&treex, &bufx, _state); kdtreecreaterequestbuffer(&treexy, &bufxy, _state); kdtreecreaterequestbuffer(&treext, &bufxt, _state); kx = kdtreetsqueryknn(&treex, &bufx, &ptx, k, ae_true, _state); kxy = kdtreetsqueryknn(&treexy, &bufxy, &ptx, k, ae_true, _state); kt = kdtreetsqueryknn(&treext, &bufxt, &ptx, k, ae_true, _state); kdtreequeryknn(&treex, &ptx, 1, ae_true, _state); kdtreequeryknn(&treexy, &ptx, 1, ae_true, _state); kdtreequeryknn(&treext, &ptx, 1, ae_true, _state); if( (kx!=k||kxy!=k)||kt!=k ) { *kdterrors = ae_true; ae_frame_leave(_state); return; } unsetrealmatrix(&qx, _state); unsetrealmatrix(&qxy, _state); unsetintegerarray(&qtags, _state); unsetrealarray(&qr, _state); kdtreetsqueryresultsx(&treex, &bufx, &qx, _state); kdtreetsqueryresultsxy(&treexy, &bufxy, &qxy, _state); kdtreetsqueryresultstags(&treext, &bufxt, &qtags, _state); kdtreetsqueryresultsdistances(&treext, &bufxt, &qr, _state); *kdterrors = *kdterrors||testnearestneighborunit_kdtresultsdifferent(xy, n, &qx, &qxy, &qtags, k, nx, ny, _state); for(i=0; i<=n-1; i++) { tmpb.ptr.p_bool[i] = ae_true; } r = (double)(0); for(i=0; i<=k-1; i++) { tmpb.ptr.p_bool[qtags.ptr.p_int[i]] = ae_false; ae_v_move(&tmpx.ptr.p_double[0], 1, &ptx.ptr.p_double[0], 1, ae_v_len(0,nx-1)); ae_v_sub(&tmpx.ptr.p_double[0], 1, &qx.ptr.pp_double[i][0], 1, ae_v_len(0,nx-1)); r = ae_maxreal(r, testnearestneighborunit_vnorm(&tmpx, nx, normtype, _state), _state); } for(i=0; i<=n-1; i++) { if( tmpb.ptr.p_bool[i] ) { ae_v_move(&tmpx.ptr.p_double[0], 1, &ptx.ptr.p_double[0], 1, ae_v_len(0,nx-1)); ae_v_sub(&tmpx.ptr.p_double[0], 1, &xy->ptr.pp_double[i][0], 1, ae_v_len(0,nx-1)); *kdterrors = *kdterrors||ae_fp_less(testnearestneighborunit_vnorm(&tmpx, nx, normtype, _state),r*(1-errtol)); } } for(i=0; i<=k-2; i++) { *kdterrors = *kdterrors||ae_fp_greater(qr.ptr.p_double[i],qr.ptr.p_double[i+1]); } for(i=0; i<=k-1; i++) { ae_v_move(&tmpx.ptr.p_double[0], 1, &ptx.ptr.p_double[0], 1, ae_v_len(0,nx-1)); ae_v_sub(&tmpx.ptr.p_double[0], 1, &xy->ptr.pp_double[qtags.ptr.p_int[i]][0], 1, ae_v_len(0,nx-1)); *kdterrors = *kdterrors||ae_fp_greater(ae_fabs(testnearestneighborunit_vnorm(&tmpx, nx, normtype, _state)-qr.ptr.p_double[i], _state),errtol); } /* * Test reallocation properties: functions must automatically * resize array which is too small, but leave unchanged array which is * too large. */ if( n>=2 ) { /* * First step: array is too small, two elements are required */ k = 2; kx = kdtreequeryknn(&treex, &ptx, k, ae_true, _state); kxy = kdtreequeryknn(&treexy, &ptx, k, ae_true, _state); kt = kdtreequeryknn(&treext, &ptx, k, ae_true, _state); if( (kx!=k||kxy!=k)||kt!=k ) { *kdterrors = ae_true; ae_frame_leave(_state); return; } ae_matrix_set_length(&qx, 1, 1, _state); ae_matrix_set_length(&qxy, 1, 1, _state); ae_vector_set_length(&qtags, 1, _state); ae_vector_set_length(&qr, 1, _state); kdtreequeryresultsx(&treex, &qx, _state); kdtreequeryresultsxy(&treexy, &qxy, _state); kdtreequeryresultstags(&treext, &qtags, _state); kdtreequeryresultsdistances(&treext, &qr, _state); *kdterrors = *kdterrors||testnearestneighborunit_kdtresultsdifferent(xy, n, &qx, &qxy, &qtags, k, nx, ny, _state); /* * Second step: array is one row larger than needed, so only first * row is overwritten. Test it. 
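 * (The extra row is pre-filled below with NaN values and a sentinel tag of 999;
 * after the 1-NN query these cells must still hold the sentinels, which shows
 * that the query left rows beyond the first one untouched.)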
*/ k = 1; kx = kdtreequeryknn(&treex, &ptx, k, ae_true, _state); kxy = kdtreequeryknn(&treexy, &ptx, k, ae_true, _state); kt = kdtreequeryknn(&treext, &ptx, k, ae_true, _state); if( (kx!=k||kxy!=k)||kt!=k ) { *kdterrors = ae_true; ae_frame_leave(_state); return; } for(i=0; i<=nx-1; i++) { qx.ptr.pp_double[1][i] = _state->v_nan; } for(i=0; i<=nx+ny-1; i++) { qxy.ptr.pp_double[1][i] = _state->v_nan; } qtags.ptr.p_int[1] = 999; qr.ptr.p_double[1] = _state->v_nan; kdtreequeryresultsx(&treex, &qx, _state); kdtreequeryresultsxy(&treexy, &qxy, _state); kdtreequeryresultstags(&treext, &qtags, _state); kdtreequeryresultsdistances(&treext, &qr, _state); *kdterrors = *kdterrors||testnearestneighborunit_kdtresultsdifferent(xy, n, &qx, &qxy, &qtags, k, nx, ny, _state); for(i=0; i<=nx-1; i++) { *kdterrors = *kdterrors||!ae_isnan(qx.ptr.pp_double[1][i], _state); } for(i=0; i<=nx+ny-1; i++) { *kdterrors = *kdterrors||!ae_isnan(qxy.ptr.pp_double[1][i], _state); } *kdterrors = *kdterrors||!(qtags.ptr.p_int[1]==999); *kdterrors = *kdterrors||!ae_isnan(qr.ptr.p_double[1], _state); } /* * Test reallocation properties: 'interactive' functions must allocate * new array on each call. */ if( n>=2 ) { /* * On input array is either too small or too large */ for(k=1; k<=2; k++) { ae_assert(k==1||k==2, "KNN: internal error (unexpected K)!", _state); kx = kdtreequeryknn(&treex, &ptx, k, ae_true, _state); kxy = kdtreequeryknn(&treexy, &ptx, k, ae_true, _state); kt = kdtreequeryknn(&treext, &ptx, k, ae_true, _state); if( (kx!=k||kxy!=k)||kt!=k ) { *kdterrors = ae_true; ae_frame_leave(_state); return; } ae_matrix_set_length(&qx, 3-k, 3-k, _state); ae_matrix_set_length(&qxy, 3-k, 3-k, _state); ae_vector_set_length(&qtags, 3-k, _state); ae_vector_set_length(&qr, 3-k, _state); kdtreequeryresultsxi(&treex, &qx, _state); kdtreequeryresultsxyi(&treexy, &qxy, _state); kdtreequeryresultstagsi(&treext, &qtags, _state); kdtreequeryresultsdistancesi(&treext, &qr, _state); *kdterrors = *kdterrors||testnearestneighborunit_kdtresultsdifferent(xy, n, &qx, &qxy, &qtags, k, nx, ny, _state); *kdterrors = (*kdterrors||qx.rows!=k)||qx.cols!=nx; *kdterrors = (*kdterrors||qxy.rows!=k)||qxy.cols!=nx+ny; *kdterrors = *kdterrors||qtags.cnt!=k; *kdterrors = *kdterrors||qr.cnt!=k; } } } /* * test general approximate K-NN queries (with self-matches): * * compare results from different trees (must be equal) and * check that correct (value,tag) pairs are returned * * test results from XT tree - let R be radius of query result. * then all points not in result must be not closer than R/(1+Eps). 
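 * (In other words, an Eps-approximate query is allowed to return neighbors whose
 * distances are up to (1+Eps) times larger than the true nearest distances, which
 * is why the checks below use the relaxed bound R/(1+Eps) instead of R.)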
*/ for(q=1; q<=qcount; q++) { /* * Select K: 1..N */ if( ae_fp_greater(ae_randomreal(_state),0.5) ) { k = 1+ae_randominteger(n, _state); } else { k = 1; } /* * Select Eps */ eps = 0.5+ae_randomreal(_state); /* * Select point (either one of the points, or random) */ if( ae_fp_greater(ae_randomreal(_state),0.5) ) { i = ae_randominteger(n, _state); ae_v_move(&ptx.ptr.p_double[0], 1, &xy->ptr.pp_double[i][0], 1, ae_v_len(0,nx-1)); } else { for(i=0; i<=nx-1; i++) { ptx.ptr.p_double[i] = 2*ae_randomreal(_state)-1; } } /* * Test functions which use internal buffer: * * consistency of results from different queries * * points in query are IN the R-sphere (or at the boundary), * and points not in query are outside of the R-sphere (or at the boundary) * * distances are correct and are ordered */ kx = kdtreequeryaknn(&treex, &ptx, k, ae_true, eps, _state); kxy = kdtreequeryaknn(&treexy, &ptx, k, ae_true, eps, _state); kt = kdtreequeryaknn(&treext, &ptx, k, ae_true, eps, _state); if( (kx!=k||kxy!=k)||kt!=k ) { *kdterrors = ae_true; ae_frame_leave(_state); return; } unsetrealmatrix(&qx, _state); unsetrealmatrix(&qxy, _state); unsetintegerarray(&qtags, _state); unsetrealarray(&qr, _state); kdtreequeryresultsxi(&treex, &qx, _state); kdtreequeryresultsxyi(&treexy, &qxy, _state); kdtreequeryresultstagsi(&treext, &qtags, _state); kdtreequeryresultsdistancesi(&treext, &qr, _state); *kdterrors = *kdterrors||testnearestneighborunit_kdtresultsdifferent(xy, n, &qx, &qxy, &qtags, k, nx, ny, _state); unsetrealmatrix(&qx, _state); unsetrealmatrix(&qxy, _state); unsetintegerarray(&qtags, _state); unsetrealarray(&qr, _state); kdtreequeryresultsx(&treex, &qx, _state); kdtreequeryresultsxy(&treexy, &qxy, _state); kdtreequeryresultstags(&treext, &qtags, _state); kdtreequeryresultsdistances(&treext, &qr, _state); *kdterrors = *kdterrors||testnearestneighborunit_kdtresultsdifferent(xy, n, &qx, &qxy, &qtags, k, nx, ny, _state); for(i=0; i<=n-1; i++) { tmpb.ptr.p_bool[i] = ae_true; } r = (double)(0); for(i=0; i<=k-1; i++) { tmpb.ptr.p_bool[qtags.ptr.p_int[i]] = ae_false; ae_v_move(&tmpx.ptr.p_double[0], 1, &ptx.ptr.p_double[0], 1, ae_v_len(0,nx-1)); ae_v_sub(&tmpx.ptr.p_double[0], 1, &qx.ptr.pp_double[i][0], 1, ae_v_len(0,nx-1)); r = ae_maxreal(r, testnearestneighborunit_vnorm(&tmpx, nx, normtype, _state), _state); } for(i=0; i<=n-1; i++) { if( tmpb.ptr.p_bool[i] ) { ae_v_move(&tmpx.ptr.p_double[0], 1, &ptx.ptr.p_double[0], 1, ae_v_len(0,nx-1)); ae_v_sub(&tmpx.ptr.p_double[0], 1, &xy->ptr.pp_double[i][0], 1, ae_v_len(0,nx-1)); *kdterrors = *kdterrors||ae_fp_less(testnearestneighborunit_vnorm(&tmpx, nx, normtype, _state),r*(1-errtol)/(1+eps)); } } for(i=0; i<=k-2; i++) { *kdterrors = *kdterrors||ae_fp_greater(qr.ptr.p_double[i],qr.ptr.p_double[i+1]); } for(i=0; i<=k-1; i++) { ae_v_move(&tmpx.ptr.p_double[0], 1, &ptx.ptr.p_double[0], 1, ae_v_len(0,nx-1)); ae_v_sub(&tmpx.ptr.p_double[0], 1, &xy->ptr.pp_double[qtags.ptr.p_int[i]][0], 1, ae_v_len(0,nx-1)); *kdterrors = *kdterrors||ae_fp_greater(ae_fabs(testnearestneighborunit_vnorm(&tmpx, nx, normtype, _state)-qr.ptr.p_double[i], _state),errtol); } /* * Test functions which use external buffer: * * create external request buffer, perform buffered request * * reset status of internal buffer by performing 1-NN query, it is essential * to test the fact that external buffer is used * * check consistency of results from different queries * * check that points in query are IN the R-sphere (or at the boundary), * and points not in query are outside of the R-sphere (or at the boundary) 
* * check that distances are correct and are ordered */ kdtreecreaterequestbuffer(&treex, &bufx, _state); kdtreecreaterequestbuffer(&treexy, &bufxy, _state); kdtreecreaterequestbuffer(&treext, &bufxt, _state); kx = kdtreetsqueryaknn(&treex, &bufx, &ptx, k, ae_true, eps, _state); kxy = kdtreetsqueryaknn(&treexy, &bufxy, &ptx, k, ae_true, eps, _state); kt = kdtreetsqueryaknn(&treext, &bufxt, &ptx, k, ae_true, eps, _state); kdtreequeryknn(&treex, &ptx, 1, ae_true, _state); kdtreequeryknn(&treexy, &ptx, 1, ae_true, _state); kdtreequeryknn(&treext, &ptx, 1, ae_true, _state); if( (kx!=k||kxy!=k)||kt!=k ) { *kdterrors = ae_true; ae_frame_leave(_state); return; } unsetrealmatrix(&qx, _state); unsetrealmatrix(&qxy, _state); unsetintegerarray(&qtags, _state); unsetrealarray(&qr, _state); kdtreetsqueryresultsx(&treex, &bufx, &qx, _state); kdtreetsqueryresultsxy(&treexy, &bufxy, &qxy, _state); kdtreetsqueryresultstags(&treext, &bufxt, &qtags, _state); kdtreetsqueryresultsdistances(&treext, &bufxt, &qr, _state); *kdterrors = *kdterrors||testnearestneighborunit_kdtresultsdifferent(xy, n, &qx, &qxy, &qtags, k, nx, ny, _state); for(i=0; i<=n-1; i++) { tmpb.ptr.p_bool[i] = ae_true; } r = (double)(0); for(i=0; i<=k-1; i++) { tmpb.ptr.p_bool[qtags.ptr.p_int[i]] = ae_false; ae_v_move(&tmpx.ptr.p_double[0], 1, &ptx.ptr.p_double[0], 1, ae_v_len(0,nx-1)); ae_v_sub(&tmpx.ptr.p_double[0], 1, &qx.ptr.pp_double[i][0], 1, ae_v_len(0,nx-1)); r = ae_maxreal(r, testnearestneighborunit_vnorm(&tmpx, nx, normtype, _state), _state); } for(i=0; i<=n-1; i++) { if( tmpb.ptr.p_bool[i] ) { ae_v_move(&tmpx.ptr.p_double[0], 1, &ptx.ptr.p_double[0], 1, ae_v_len(0,nx-1)); ae_v_sub(&tmpx.ptr.p_double[0], 1, &xy->ptr.pp_double[i][0], 1, ae_v_len(0,nx-1)); *kdterrors = *kdterrors||ae_fp_less(testnearestneighborunit_vnorm(&tmpx, nx, normtype, _state),r*(1-errtol)/(1+eps)); } } for(i=0; i<=k-2; i++) { *kdterrors = *kdterrors||ae_fp_greater(qr.ptr.p_double[i],qr.ptr.p_double[i+1]); } for(i=0; i<=k-1; i++) { ae_v_move(&tmpx.ptr.p_double[0], 1, &ptx.ptr.p_double[0], 1, ae_v_len(0,nx-1)); ae_v_sub(&tmpx.ptr.p_double[0], 1, &xy->ptr.pp_double[qtags.ptr.p_int[i]][0], 1, ae_v_len(0,nx-1)); *kdterrors = *kdterrors||ae_fp_greater(ae_fabs(testnearestneighborunit_vnorm(&tmpx, nx, normtype, _state)-qr.ptr.p_double[i], _state),errtol); } } /* * test general R-NN queries (with self-matches): * * compare results from different trees (must be equal) and * check that correct (value,tag) pairs are returned * * test results from XT tree - let R be radius of query result. * then all points not in result must be not closer than R. 
*/ for(q=1; q<=qcount; q++) { /* * Select R */ if( ae_fp_greater(ae_randomreal(_state),0.3) ) { r = ae_maxreal(ae_randomreal(_state), ae_machineepsilon, _state); } else { r = ae_machineepsilon; } /* * Select point (either one of the points, or random) */ if( ae_fp_greater(ae_randomreal(_state),0.5) ) { i = ae_randominteger(n, _state); ae_v_move(&ptx.ptr.p_double[0], 1, &xy->ptr.pp_double[i][0], 1, ae_v_len(0,nx-1)); } else { for(i=0; i<=nx-1; i++) { ptx.ptr.p_double[i] = 2*ae_randomreal(_state)-1; } } /* * Test functions which use internal buffer: * * consistency of results from different queries * * points in query are IN the R-sphere (or at the boundary), * and points not in query are outside of the R-sphere (or at the boundary) * * distances are correct and are ordered */ kx = kdtreequeryrnn(&treex, &ptx, r, ae_true, _state); kxy = kdtreequeryrnn(&treexy, &ptx, r, ae_true, _state); kt = kdtreequeryrnn(&treext, &ptx, r, ae_true, _state); if( kxy!=kx||kt!=kx ) { *kdterrors = ae_true; ae_frame_leave(_state); return; } unsetrealmatrix(&qx, _state); unsetrealmatrix(&qxy, _state); unsetintegerarray(&qtags, _state); unsetrealarray(&qr, _state); kdtreequeryresultsxi(&treex, &qx, _state); kdtreequeryresultsxyi(&treexy, &qxy, _state); kdtreequeryresultstagsi(&treext, &qtags, _state); kdtreequeryresultsdistancesi(&treext, &qr, _state); *kdterrors = *kdterrors||testnearestneighborunit_kdtresultsdifferent(xy, n, &qx, &qxy, &qtags, kx, nx, ny, _state); unsetrealmatrix(&qx, _state); unsetrealmatrix(&qxy, _state); unsetintegerarray(&qtags, _state); unsetrealarray(&qr, _state); kdtreequeryresultsx(&treex, &qx, _state); kdtreequeryresultsxy(&treexy, &qxy, _state); kdtreequeryresultstags(&treext, &qtags, _state); kdtreequeryresultsdistances(&treext, &qr, _state); *kdterrors = *kdterrors||testnearestneighborunit_kdtresultsdifferent(xy, n, &qx, &qxy, &qtags, kx, nx, ny, _state); for(i=0; i<=n-1; i++) { tmpb.ptr.p_bool[i] = ae_true; } for(i=0; i<=kx-1; i++) { tmpb.ptr.p_bool[qtags.ptr.p_int[i]] = ae_false; } for(i=0; i<=n-1; i++) { ae_v_move(&tmpx.ptr.p_double[0], 1, &ptx.ptr.p_double[0], 1, ae_v_len(0,nx-1)); ae_v_sub(&tmpx.ptr.p_double[0], 1, &xy->ptr.pp_double[i][0], 1, ae_v_len(0,nx-1)); if( tmpb.ptr.p_bool[i] ) { *kdterrors = *kdterrors||ae_fp_less(testnearestneighborunit_vnorm(&tmpx, nx, normtype, _state),r*(1-errtol)); } else { *kdterrors = *kdterrors||ae_fp_greater(testnearestneighborunit_vnorm(&tmpx, nx, normtype, _state),r*(1+errtol)); } } for(i=0; i<=kx-2; i++) { *kdterrors = *kdterrors||ae_fp_greater(qr.ptr.p_double[i],qr.ptr.p_double[i+1]); } /* * Test functions which use external buffer: * * create external request buffer, perform buffered request * * reset status of internal buffer by performing 1-NN query, it is essential * to test the fact that external buffer is used * * check consistency of results from different queries * * check that points in query are IN the R-sphere (or at the boundary), * and points not in query are outside of the R-sphere (or at the boundary) * * check that distances are correct and are ordered */ kdtreecreaterequestbuffer(&treex, &bufx, _state); kdtreecreaterequestbuffer(&treexy, &bufxy, _state); kdtreecreaterequestbuffer(&treext, &bufxt, _state); kx = kdtreetsqueryrnn(&treex, &bufx, &ptx, r, ae_true, _state); kxy = kdtreetsqueryrnn(&treexy, &bufxy, &ptx, r, ae_true, _state); kt = kdtreetsqueryrnn(&treext, &bufxt, &ptx, r, ae_true, _state); kdtreequeryknn(&treex, &ptx, 1, ae_true, _state); kdtreequeryknn(&treexy, &ptx, 1, ae_true, _state); kdtreequeryknn(&treext, 
&ptx, 1, ae_true, _state); if( kxy!=kx||kt!=kx ) { *kdterrors = ae_true; ae_frame_leave(_state); return; } unsetrealmatrix(&qx, _state); unsetrealmatrix(&qxy, _state); unsetintegerarray(&qtags, _state); unsetrealarray(&qr, _state); kdtreetsqueryresultsx(&treex, &bufx, &qx, _state); kdtreetsqueryresultsxy(&treexy, &bufxy, &qxy, _state); kdtreetsqueryresultstags(&treext, &bufxt, &qtags, _state); kdtreetsqueryresultsdistances(&treext, &bufxt, &qr, _state); *kdterrors = *kdterrors||testnearestneighborunit_kdtresultsdifferent(xy, n, &qx, &qxy, &qtags, kx, nx, ny, _state); for(i=0; i<=n-1; i++) { tmpb.ptr.p_bool[i] = ae_true; } for(i=0; i<=kx-1; i++) { tmpb.ptr.p_bool[qtags.ptr.p_int[i]] = ae_false; } for(i=0; i<=n-1; i++) { ae_v_move(&tmpx.ptr.p_double[0], 1, &ptx.ptr.p_double[0], 1, ae_v_len(0,nx-1)); ae_v_sub(&tmpx.ptr.p_double[0], 1, &xy->ptr.pp_double[i][0], 1, ae_v_len(0,nx-1)); if( tmpb.ptr.p_bool[i] ) { *kdterrors = *kdterrors||ae_fp_less(testnearestneighborunit_vnorm(&tmpx, nx, normtype, _state),r*(1-errtol)); } else { *kdterrors = *kdterrors||ae_fp_greater(testnearestneighborunit_vnorm(&tmpx, nx, normtype, _state),r*(1+errtol)); } } for(i=0; i<=kx-2; i++) { *kdterrors = *kdterrors||ae_fp_greater(qr.ptr.p_double[i],qr.ptr.p_double[i+1]); } } /* * test box queries */ ae_vector_set_length(&qmin, nx, _state); ae_vector_set_length(&qmax, nx, _state); for(q=1; q<=qcount; q++) { /* * Test for box exactly equal to one of the points. * More than one exactly equal point may be found. * Only thread-safe version is tested. */ kdtreecreaterequestbuffer(&treext, &bufxt, _state); k = ae_randominteger(n, _state); for(j=0; j<=nx-1; j++) { qmin.ptr.p_double[j] = xy->ptr.pp_double[k][j]; qmax.ptr.p_double[j] = xy->ptr.pp_double[k][j]; } kx = kdtreetsquerybox(&treext, &bufxt, &qmin, &qmax, _state); if( kx<1 ) { *kdterrors = ae_true; ae_frame_leave(_state); return; } unsetrealmatrix(&qx, _state); unsetintegerarray(&qtags, _state); unsetrealarray(&qr, _state); kdtreetsqueryresultsx(&treext, &bufxt, &qx, _state); kdtreetsqueryresultstags(&treext, &bufxt, &qtags, _state); kdtreetsqueryresultsdistances(&treext, &bufxt, &qr, _state); for(i=0; i<=kx-1; i++) { seterrorflag(kdterrors, ae_fp_neq(qr.ptr.p_double[i],(double)(0)), _state); } for(i=0; i<=kx-1; i++) { for(j=0; j<=nx-1; j++) { seterrorflag(kdterrors, ae_fp_neq(qx.ptr.pp_double[i][j],xy->ptr.pp_double[k][j]), _state); seterrorflag(kdterrors, ae_fp_neq(qx.ptr.pp_double[i][j],xy->ptr.pp_double[qtags.ptr.p_int[i]][j]), _state); } } /* * Test for randomly generated box (thread-safe version) */ for(j=0; j<=nx-1; j++) { qmin.ptr.p_double[j] = boxmin.ptr.p_double[j]+ae_randomreal(_state)*(boxmax.ptr.p_double[j]-boxmin.ptr.p_double[j]); qmax.ptr.p_double[j] = qmin.ptr.p_double[j]; v = spread*ae_pow((double)(10), -2*ae_randomreal(_state), _state); qmin.ptr.p_double[j] = qmin.ptr.p_double[j]-v; qmax.ptr.p_double[j] = qmax.ptr.p_double[j]+v; } kx = kdtreetsquerybox(&treext, &bufxt, &qmin, &qmax, _state); unsetrealmatrix(&qx, _state); unsetintegerarray(&qtags, _state); unsetrealarray(&qr, _state); kdtreetsqueryresultsx(&treext, &bufxt, &qx, _state); kdtreetsqueryresultstags(&treext, &bufxt, &qtags, _state); kdtreetsqueryresultsdistances(&treext, &bufxt, &qr, _state); for(i=0; i<=kx-1; i++) { seterrorflag(kdterrors, ae_fp_neq(qr.ptr.p_double[i],(double)(0)), _state); } for(i=0; i<=n-1; i++) { tmpb.ptr.p_bool[i] = ae_false; } for(i=0; i<=kx-1; i++) { tmpb.ptr.p_bool[qtags.ptr.p_int[i]] = ae_true; } for(i=0; i<=n-1; i++) { inthebox = ae_true; for(j=0; j<=nx-1; j++) { 
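/* accumulate the box-membership test coordinate by coordinate: point I lies in the box iff QMin[J]<=XY[I,J]<=QMax[J] holds for every J */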
inthebox = inthebox&&ae_fp_greater_eq(xy->ptr.pp_double[i][j],qmin.ptr.p_double[j]); inthebox = inthebox&&ae_fp_less_eq(xy->ptr.pp_double[i][j],qmax.ptr.p_double[j]); } if( tmpb.ptr.p_bool[i] ) { seterrorflag(kdterrors, !inthebox, _state); } else { seterrorflag(kdterrors, inthebox, _state); } } /* * Test for randomly generated box (non-thread-safe version) */ for(j=0; j<=nx-1; j++) { qmin.ptr.p_double[j] = boxmin.ptr.p_double[j]+ae_randomreal(_state)*(boxmax.ptr.p_double[j]-boxmin.ptr.p_double[j]); qmax.ptr.p_double[j] = qmin.ptr.p_double[j]; v = spread*ae_pow((double)(10), -2*ae_randomreal(_state), _state); qmin.ptr.p_double[j] = qmin.ptr.p_double[j]-v; qmax.ptr.p_double[j] = qmax.ptr.p_double[j]+v; } kx = kdtreequerybox(&treext, &qmin, &qmax, _state); unsetrealmatrix(&qx, _state); unsetintegerarray(&qtags, _state); unsetrealarray(&qr, _state); kdtreequeryresultsx(&treext, &qx, _state); kdtreequeryresultstags(&treext, &qtags, _state); kdtreequeryresultsdistances(&treext, &qr, _state); for(i=0; i<=kx-1; i++) { seterrorflag(kdterrors, ae_fp_neq(qr.ptr.p_double[i],(double)(0)), _state); } for(i=0; i<=n-1; i++) { tmpb.ptr.p_bool[i] = ae_false; } for(i=0; i<=kx-1; i++) { tmpb.ptr.p_bool[qtags.ptr.p_int[i]] = ae_true; } for(i=0; i<=n-1; i++) { inthebox = ae_true; for(j=0; j<=nx-1; j++) { inthebox = inthebox&&ae_fp_greater_eq(xy->ptr.pp_double[i][j],qmin.ptr.p_double[j]); inthebox = inthebox&&ae_fp_less_eq(xy->ptr.pp_double[i][j],qmax.ptr.p_double[j]); } if( tmpb.ptr.p_bool[i] ) { seterrorflag(kdterrors, !inthebox, _state); } else { seterrorflag(kdterrors, inthebox, _state); } } } /* * Test self-matching: * * self-match - nearest neighbor of each point in XY is the point itself * * no self-match - nearest neighbor is NOT the point itself */ if( n>1 ) { /* * test for N=1 have non-general form, but it is not really needed */ for(task=0; task<=1; task++) { for(i=0; i<=n-1; i++) { ae_v_move(&ptx.ptr.p_double[0], 1, &xy->ptr.pp_double[i][0], 1, ae_v_len(0,nx-1)); kx = kdtreequeryknn(&treex, &ptx, 1, task==0, _state); kdtreequeryresultsxi(&treex, &qx, _state); if( kx!=1 ) { *kdterrors = ae_true; ae_frame_leave(_state); return; } isequal = ae_true; for(j=0; j<=nx-1; j++) { isequal = isequal&&ae_fp_eq(qx.ptr.pp_double[0][j],ptx.ptr.p_double[j]); } if( task==0 ) { *kdterrors = *kdterrors||!isequal; } else { *kdterrors = *kdterrors||isequal; } } } } ae_frame_leave(_state); } /************************************************************************* Testing serialization of KD trees This function sets Err to True on errors, but leaves it unchanged on success *************************************************************************/ static void testnearestneighborunit_testkdtreeserialization(ae_bool* err, ae_state *_state) { ae_frame _frame_block; ae_int_t n; ae_int_t nx; ae_int_t ny; ae_int_t normtype; ae_int_t i; ae_int_t j; ae_int_t k; ae_int_t q; ae_matrix xy; ae_vector x; ae_vector tags; ae_vector qsizes; double threshold; kdtree tree0; kdtree tree1; ae_int_t k0; ae_int_t k1; ae_matrix xy0; ae_matrix xy1; ae_vector tags0; ae_vector tags1; ae_frame_make(_state, &_frame_block); ae_matrix_init(&xy, 0, 0, DT_REAL, _state); ae_vector_init(&x, 0, DT_REAL, _state); ae_vector_init(&tags, 0, DT_INT, _state); ae_vector_init(&qsizes, 0, DT_INT, _state); _kdtree_init(&tree0, _state); _kdtree_init(&tree1, _state); ae_matrix_init(&xy0, 0, 0, DT_REAL, _state); ae_matrix_init(&xy1, 0, 0, DT_REAL, _state); ae_vector_init(&tags0, 0, DT_INT, _state); ae_vector_init(&tags1, 0, DT_INT, _state); threshold = 
100*ae_machineepsilon; /* * different N, NX, NY, NormType */ n = 1; while(n<=51) { /* * prepare array with query sizes */ ae_vector_set_length(&qsizes, 4, _state); qsizes.ptr.p_int[0] = 1; qsizes.ptr.p_int[1] = ae_minint(2, n, _state); qsizes.ptr.p_int[2] = ae_minint(4, n, _state); qsizes.ptr.p_int[3] = n; /* * different NX/NY/NormType */ for(nx=1; nx<=2; nx++) { for(ny=0; ny<=2; ny++) { for(normtype=0; normtype<=2; normtype++) { /* * Prepare data */ ae_matrix_set_length(&xy, n, nx+ny, _state); ae_vector_set_length(&tags, n, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=nx+ny-1; j++) { xy.ptr.pp_double[i][j] = ae_randomreal(_state); } tags.ptr.p_int[i] = ae_randominteger(100, _state); } /* * Build tree, pass it through serializer */ kdtreebuildtagged(&xy, &tags, n, nx, ny, normtype, &tree0, _state); { /* * This code passes data structure through serializers * (serializes it to string and loads back) */ ae_serializer _local_serializer; ae_int_t _local_ssize; ae_frame _local_frame_block; ae_dyn_block _local_dynamic_block; ae_frame_make(_state, &_local_frame_block); ae_serializer_init(&_local_serializer); ae_serializer_alloc_start(&_local_serializer); kdtreealloc(&_local_serializer, &tree0, _state); _local_ssize = ae_serializer_get_alloc_size(&_local_serializer); ae_db_malloc(&_local_dynamic_block, _local_ssize+1, _state, ae_true); ae_serializer_sstart_str(&_local_serializer, (char*)_local_dynamic_block.ptr); kdtreeserialize(&_local_serializer, &tree0, _state); ae_serializer_stop(&_local_serializer, _state); ae_serializer_clear(&_local_serializer); ae_serializer_init(&_local_serializer); ae_serializer_ustart_str(&_local_serializer, (char*)_local_dynamic_block.ptr); kdtreeunserialize(&_local_serializer, &tree1, _state); ae_serializer_stop(&_local_serializer, _state); ae_serializer_clear(&_local_serializer); ae_frame_leave(_state); } /* * For each point of XY we make queries with different sizes */ ae_vector_set_length(&x, nx, _state); for(k=0; k<=n-1; k++) { for(q=0; q<=qsizes.cnt-1; q++) { ae_v_move(&x.ptr.p_double[0], 1, &xy.ptr.pp_double[k][0], 1, ae_v_len(0,nx-1)); k0 = kdtreequeryknn(&tree0, &x, qsizes.ptr.p_int[q], ae_true, _state); k1 = kdtreequeryknn(&tree1, &x, qsizes.ptr.p_int[q], ae_true, _state); if( k0!=k1 ) { *err = ae_true; ae_frame_leave(_state); return; } kdtreequeryresultsxy(&tree0, &xy0, _state); kdtreequeryresultsxy(&tree1, &xy1, _state); for(i=0; i<=k0-1; i++) { for(j=0; j<=nx+ny-1; j++) { if( ae_fp_greater(ae_fabs(xy0.ptr.pp_double[i][j]-xy1.ptr.pp_double[i][j], _state),threshold) ) { *err = ae_true; ae_frame_leave(_state); return; } } } kdtreequeryresultstags(&tree0, &tags0, _state); kdtreequeryresultstags(&tree1, &tags1, _state); for(i=0; i<=k0-1; i++) { if( tags0.ptr.p_int[i]!=tags1.ptr.p_int[i] ) { *err = ae_true; ae_frame_leave(_state); return; } } } } } } } /* * Next N */ n = n+25; } ae_frame_leave(_state); } /************************************************************************* This function tests different special cases: * Kd-tree for a zero number of points * Kd-tree for array with a lot of duplicates (early versions of ALGLIB raised stack overflow on such datasets) It returns True on errors, False on success. 
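(The overflow presumably occurred because median splits cannot separate exactly
coincident points, so a naive recursive builder degenerates on such data; the
test below stresses this with a 100000-point dataset consisting almost entirely
of duplicates.)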
*************************************************************************/ static ae_bool testnearestneighborunit_testspecialcases(ae_state *_state) { ae_frame _frame_block; kdtree kdt; ae_matrix xy; ae_vector tags; ae_vector x; ae_int_t n; ae_int_t nk; ae_int_t nx; ae_int_t ny; ae_int_t normtype; ae_int_t i; ae_int_t j; double v; ae_bool result; ae_frame_make(_state, &_frame_block); _kdtree_init(&kdt, _state); ae_matrix_init(&xy, 0, 0, DT_REAL, _state); ae_vector_init(&tags, 0, DT_INT, _state); ae_vector_init(&x, 0, DT_REAL, _state); result = ae_false; for(nx=1; nx<=3; nx++) { for(ny=0; ny<=3; ny++) { for(normtype=0; normtype<=2; normtype++) { /* * Build tree */ if( ae_fp_greater(ae_randomreal(_state),0.5) ) { kdtreebuildtagged(&xy, &tags, 0, nx, ny, normtype, &kdt, _state); } else { kdtreebuild(&xy, 0, nx, ny, normtype, &kdt, _state); } /* * Test different queries */ ae_vector_set_length(&x, nx, _state); for(i=0; i<=nx-1; i++) { x.ptr.p_double[i] = ae_randomreal(_state); } result = result||kdtreequeryknn(&kdt, &x, 1, ae_true, _state)>0; result = result||kdtreequeryrnn(&kdt, &x, 1.0E6, ae_true, _state)>0; result = result||kdtreequeryaknn(&kdt, &x, 1, ae_true, 2.0, _state)>0; } } } /* * Ability to handle array with a lot of duplicates without causing * stack overflow. * * Two situations are handled: * * array where ALL N elements are duplicates * * array where there are NK distinct elements and N-NK duplicates */ nx = 2; ny = 1; n = 100000; nk = 100; v = ae_randomreal(_state); ae_matrix_set_length(&xy, n, nx+ny, _state); ae_vector_set_length(&x, nx, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=nx+ny-1; j++) { xy.ptr.pp_double[i][j] = v; } } kdtreebuild(&xy, n, nx, ny, 2, &kdt, _state); for(j=0; j<=nx-1; j++) { x.ptr.p_double[j] = v; } result = result||kdtreequeryrnn(&kdt, &x, 0.0001, ae_true, _state)!=n; for(i=0; i<=nk-1; i++) { for(j=0; j<=nx+ny-1; j++) { xy.ptr.pp_double[i][j] = ae_randomreal(_state); } } kdtreebuild(&xy, n, nx, ny, 2, &kdt, _state); result = result||kdtreequeryrnn(&kdt, &x, 0.0001, ae_true, _state)=bins.cnt ) { k = bins.cnt-1; } bins.ptr.p_int[k] = bins.ptr.p_int[k]+1; } for(i=0; i<=bins.cnt-1; i++) { seterrorflag(&unit2errors, ae_fp_less((double)(bins.ptr.p_int[i]),0.9*n/bins.cnt)||ae_fp_greater((double)(bins.ptr.p_int[i]),1.1*n/bins.cnt), _state); } /* * Test exponential */ testhqrndunit_unsetstate(&state, _state); hqrndrandomize(&state, _state); expsigmaerr = (double)(0); lambdav = 2+5*ae_randomreal(_state); for(i=0; i<=samplesize-1; i++) { x.ptr.p_double[i] = hqrndexponential(&state, lambdav, _state); } for(i=0; i<=samplesize-1; i++) { uierrors = uierrors||ae_fp_less(x.ptr.p_double[i],(double)(0)); } testhqrndunit_calculatemv(&x, samplesize, &mean, &means, &stddev, &stddevs, _state); if( ae_fp_neq(means,(double)(0)) ) { expsigmaerr = ae_maxreal(expsigmaerr, ae_fabs((mean-1.0/lambdav)/means, _state), _state); } else { experrors = ae_true; } if( ae_fp_neq(stddevs,(double)(0)) ) { expsigmaerr = ae_maxreal(expsigmaerr, ae_fabs((stddev-1.0/lambdav)/stddevs, _state), _state); } else { experrors = ae_true; } experrors = experrors||ae_fp_greater(expsigmaerr,sigmathreshold); /* *Discrete/Continuous tests */ discreteerr = hqrnddiscretetest(ae_true, _state); continuouserr = hqrndcontinuoustest(ae_true, _state); /* * Final report */ waserrors = ((((((seederrors||urerrors)||uierrors)||normerrors)||unit2errors)||experrors)||discreteerr)||continuouserr; if( !silent ) { printf("RNG TEST\n"); printf("SEED TEST: "); if( !seederrors ) { printf("OK\n"); } else { printf("FAILED\n"); } 
printf("UNIFORM CONTINUOUS: "); if( !urerrors ) { printf("OK\n"); } else { printf("FAILED\n"); } printf("UNIFORM INTEGER: "); if( !uierrors ) { printf("OK\n"); } else { printf("FAILED\n"); } printf("NORMAL: "); if( !normerrors ) { printf("OK\n"); } else { printf("FAILED\n"); } printf("UNIT2: "); if( !unit2errors ) { printf("OK\n"); } else { printf("FAILED\n"); } printf("EXPONENTIAL: "); if( !experrors ) { printf("OK\n"); } else { printf("FAILED\n"); } printf("DISCRETE: "); if( !discreteerr ) { printf("OK\n"); } else { printf("FAILED\n"); } printf("CONTINUOUS: "); if( !continuouserr ) { printf("OK\n"); } else { printf("FAILED\n"); } if( waserrors ) { printf("TEST SUMMARY: FAILED\n"); } else { printf("TEST SUMMARY: PASSED\n"); } printf("\n\n"); } result = !waserrors; ae_frame_leave(_state); return result; } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. *************************************************************************/ ae_bool _pexec_testhqrnd(ae_bool silent, ae_state *_state) { return testhqrnd(silent, _state); } /************************************************************************* Function for test HQRNDContinuous function *************************************************************************/ ae_bool hqrndcontinuoustest(ae_bool silent, ae_state *_state) { ae_frame _frame_block; ae_vector sample; ae_vector bins; ae_vector binbounds; ae_int_t nb; ae_int_t samplesize; hqrndstate state; ae_int_t xp; ae_int_t i; ae_int_t j; double v; double sigma; double sigmamax; ae_bool result; ae_frame_make(_state, &_frame_block); ae_vector_init(&sample, 0, DT_REAL, _state); ae_vector_init(&bins, 0, DT_INT, _state); ae_vector_init(&binbounds, 0, DT_REAL, _state); _hqrndstate_init(&state, _state); result = ae_false; /* * Test for sample size equal to 1 */ ae_vector_set_length(&sample, 1, _state); sample.ptr.p_double[0] = ae_randomreal(_state); hqrndrandomize(&state, _state); result = result||ae_fp_neq(hqrndcontinuous(&state, &sample, 1, _state),sample.ptr.p_double[0]); /* * Test for larger samples */ xp = 100000; sigmamax = 10.0; for(samplesize=2; samplesize<=5; samplesize++) { /* * 1. Generate random sample with SampleSize points * 2. Generate NB=3*(SampleSize-1) bins, with bounds as prescribed by (BinBounds[I],BinBounds[I+1]). * Bin bounds are generated in such a way that value can fall into any bin with same probability * 3. Generate many random values * 4. Calculate number of values which fall into each bin * 5. 
Bins[I] should have binomial distribution with mean XP/NB and * variance XP*(1/NB)*(1-1/NB) */ nb = 3*(samplesize-1); sigma = ae_sqrt(xp*((double)1/(double)nb)*(1-(double)1/(double)nb), _state); ae_vector_set_length(&sample, samplesize, _state); sample.ptr.p_double[0] = 2*ae_randomreal(_state)-1; for(i=0; i<=samplesize-2; i++) { sample.ptr.p_double[i+1] = sample.ptr.p_double[i]+0.1+ae_randomreal(_state); } ae_vector_set_length(&bins, nb, _state); ae_vector_set_length(&binbounds, nb+1, _state); for(i=0; i<=samplesize-2; i++) { bins.ptr.p_int[3*i+0] = 0; bins.ptr.p_int[3*i+1] = 0; bins.ptr.p_int[3*i+2] = 0; binbounds.ptr.p_double[3*i+0] = sample.ptr.p_double[i]; binbounds.ptr.p_double[3*i+1] = sample.ptr.p_double[i]+(sample.ptr.p_double[i+1]-sample.ptr.p_double[i])/3; binbounds.ptr.p_double[3*i+2] = sample.ptr.p_double[i]+(sample.ptr.p_double[i+1]-sample.ptr.p_double[i])*2/3; } binbounds.ptr.p_double[nb] = sample.ptr.p_double[samplesize-1]; hqrndrandomize(&state, _state); for(i=0; i<=xp-1; i++) { v = hqrndcontinuous(&state, &sample, samplesize, _state); for(j=0; j<=nb-1; j++) { if( ae_fp_greater(v,binbounds.ptr.p_double[j])&&ae_fp_less(v,binbounds.ptr.p_double[j+1]) ) { bins.ptr.p_int[j] = bins.ptr.p_int[j]+1; break; } } } for(i=0; i<=nb-1; i++) { result = result||ae_fp_greater(ae_fabs(bins.ptr.p_int[i]-(double)xp/(double)nb, _state),sigma*sigmamax); } } ae_frame_leave(_state); return result; } /************************************************************************* Function for testing HQRNDDiscrete function *************************************************************************/ ae_bool hqrnddiscretetest(ae_bool silent, ae_state *_state) { ae_frame _frame_block; ae_vector sample; double sigma; double sigmathreshold; double tsample; double max; double min; ae_int_t i; ae_int_t j; ae_int_t s1; ae_int_t s2; ae_int_t binscount; ae_int_t xp; ae_vector nn; hqrndstate state; ae_bool result; ae_frame_make(_state, &_frame_block); ae_vector_init(&sample, 0, DT_REAL, _state); ae_vector_init(&nn, 0, DT_INT, _state); _hqrndstate_init(&state, _state); /* * We test that all values from the discrete sample are generated with the same probability. * To do this, we generate random values many times, then we calculate actual probabilities * and compare them with theoretical ones. 
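 * Each of the BinsCount sample values should be drawn about XP/BinsCount times;
 * the observed counter NN[I] is compared against this expectation with tolerance
 * SigmaThreshold*Sigma, where Sigma=sqrt(XP*(1/BinsCount)*(1-1/BinsCount)) is the
 * standard deviation of the corresponding binomial counter.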
*/ max = (double)(100); min = (double)(-100); xp = 100000; sigmathreshold = 10.0; for(binscount=1; binscount<=5; binscount++) { sigma = ae_sqrt(xp*((double)1/(double)binscount)*(1-(double)1/(double)binscount), _state); ae_vector_set_length(&nn, binscount, _state); for(i=0; i<=binscount-1; i++) { nn.ptr.p_int[i] = 0; } ae_vector_set_length(&sample, binscount, _state); sample.ptr.p_double[0] = (max-min)*ae_randomreal(_state)+min; for(i=1; i<=binscount-1; i++) { sample.ptr.p_double[i] = sample.ptr.p_double[i-1]+max*ae_randomreal(_state)+0.001; } s1 = 1+ae_randominteger(32000, _state); s2 = 1+ae_randominteger(32000, _state); hqrndseed(s1, s2, &state, _state); for(i=0; i<=xp-1; i++) { tsample = hqrnddiscrete(&state, &sample, binscount, _state); for(j=0; j<=binscount-1; j++) { if( ae_fp_eq(tsample,sample.ptr.p_double[j]) ) { nn.ptr.p_int[j] = nn.ptr.p_int[j]+1; break; } } } for(i=0; i<=binscount-1; i++) { if( ae_fp_less((double)(nn.ptr.p_int[i]),(double)xp/(double)binscount-sigmathreshold*sigma)||ae_fp_greater((double)(nn.ptr.p_int[i]),(double)xp/(double)binscount+sigmathreshold*sigma) ) { if( !silent ) { printf("HQRNDDiscreteTest::ErrorReport::\n"); printf("nn[%0d]=%0d;\n xp/BinsCount=%0.5f;\n C*sigma=%0.5f\n", (int)(i), (int)(nn.ptr.p_int[i]), (double)((double)xp/(double)binscount), (double)(sigmathreshold*sigma)); printf("HQRNDDiscreteTest: test is FAILED!\n"); } result = ae_true; ae_frame_leave(_state); return result; } } if( !silent ) { printf("HQRNDDiscreteTest: test is OK.\n"); } } result = ae_false; ae_frame_leave(_state); return result; } static void testhqrndunit_calculatemv(/* Real */ ae_vector* x, ae_int_t n, double* mean, double* means, double* stddev, double* stddevs, ae_state *_state) { ae_int_t i; double v1; double v2; double variance; *mean = 0; *means = 0; *stddev = 0; *stddevs = 0; *mean = (double)(0); *means = (double)(1); *stddev = (double)(0); *stddevs = (double)(1); variance = (double)(0); if( n<=1 ) { return; } /* * Mean */ for(i=0; i<=n-1; i++) { *mean = *mean+x->ptr.p_double[i]; } *mean = *mean/n; /* * Variance (using corrected two-pass algorithm) */ if( n!=1 ) { v1 = (double)(0); for(i=0; i<=n-1; i++) { v1 = v1+ae_sqr(x->ptr.p_double[i]-(*mean), _state); } v2 = (double)(0); for(i=0; i<=n-1; i++) { v2 = v2+(x->ptr.p_double[i]-(*mean)); } v2 = ae_sqr(v2, _state)/n; variance = (v1-v2)/(n-1); if( ae_fp_less(variance,(double)(0)) ) { variance = (double)(0); } *stddev = ae_sqrt(variance, _state); } /* * Errors */ *means = *stddev/ae_sqrt((double)(n), _state); *stddevs = *stddev*ae_sqrt((double)(2), _state)/ae_sqrt((double)(n-1), _state); } /************************************************************************* Unsets HQRNDState structure *************************************************************************/ static void testhqrndunit_unsetstate(hqrndstate* state, ae_state *_state) { state->s1 = 0; state->s2 = 0; state->magicv = 0; } /************************************************************************* Test *************************************************************************/ ae_bool testodesolver(ae_bool silent, ae_state *_state) { ae_frame _frame_block; ae_int_t passcount; ae_bool curerrors; ae_bool rkckerrors; ae_bool waserrors; ae_vector xtbl; ae_matrix ytbl; odesolverreport rep; ae_vector xg; ae_vector y; double h; double eps; ae_int_t solver; ae_int_t pass; ae_int_t mynfev; double v; ae_int_t m; ae_int_t m2; ae_int_t i; double err; odesolverstate state; ae_bool result; ae_frame_make(_state, &_frame_block); ae_vector_init(&xtbl, 0, DT_REAL, _state); 
ae_matrix_init(&ytbl, 0, 0, DT_REAL, _state); _odesolverreport_init(&rep, _state); ae_vector_init(&xg, 0, DT_REAL, _state); ae_vector_init(&y, 0, DT_REAL, _state); _odesolverstate_init(&state, _state); rkckerrors = ae_false; waserrors = ae_false; passcount = 10; /* * simple test: just A*sin(x)+B*cos(x) */ ae_assert(passcount>=2, "Assertion failed", _state); for(pass=0; pass<=passcount-1; pass++) { for(solver=0; solver<=0; solver++) { /* * prepare */ h = 1.0E-2; eps = 1.0E-5; if( pass%2==0 ) { eps = -eps; } ae_vector_set_length(&y, 2, _state); for(i=0; i<=1; i++) { y.ptr.p_double[i] = 2*ae_randomreal(_state)-1; } m = 2+ae_randominteger(10, _state); ae_vector_set_length(&xg, m, _state); xg.ptr.p_double[0] = (m-1)*ae_randomreal(_state); for(i=1; i<=m-1; i++) { xg.ptr.p_double[i] = xg.ptr.p_double[i-1]+ae_randomreal(_state); } v = 2*ae_pi/(xg.ptr.p_double[m-1]-xg.ptr.p_double[0]); ae_v_muld(&xg.ptr.p_double[0], 1, ae_v_len(0,m-1), v); if( ae_fp_greater(ae_randomreal(_state),0.5) ) { ae_v_muld(&xg.ptr.p_double[0], 1, ae_v_len(0,m-1), -1); } mynfev = 0; /* * choose solver */ if( solver==0 ) { odesolverrkck(&y, 2, &xg, m, eps, h, &state, _state); } /* * solve */ while(odesolveriteration(&state, _state)) { state.dy.ptr.p_double[0] = state.y.ptr.p_double[1]; state.dy.ptr.p_double[1] = -state.y.ptr.p_double[0]; mynfev = mynfev+1; } odesolverresults(&state, &m2, &xtbl, &ytbl, &rep, _state); /* * check results */ curerrors = ae_false; if( rep.terminationtype<=0 ) { curerrors = ae_true; } else { curerrors = curerrors||m2!=m; err = (double)(0); for(i=0; i<=m-1; i++) { err = ae_maxreal(err, ae_fabs(ytbl.ptr.pp_double[i][0]-(y.ptr.p_double[0]*ae_cos(xtbl.ptr.p_double[i]-xtbl.ptr.p_double[0], _state)+y.ptr.p_double[1]*ae_sin(xtbl.ptr.p_double[i]-xtbl.ptr.p_double[0], _state)), _state), _state); err = ae_maxreal(err, ae_fabs(ytbl.ptr.pp_double[i][1]-(-y.ptr.p_double[0]*ae_sin(xtbl.ptr.p_double[i]-xtbl.ptr.p_double[0], _state)+y.ptr.p_double[1]*ae_cos(xtbl.ptr.p_double[i]-xtbl.ptr.p_double[0], _state)), _state), _state); } curerrors = curerrors||ae_fp_greater(err,10*ae_fabs(eps, _state)); curerrors = curerrors||mynfev!=rep.nfev; } if( solver==0 ) { rkckerrors = rkckerrors||curerrors; } } } /* * another test: * * y(0) = 0 * dy/dx = f(x,y) * f(x,y) = 0, x<1 * x-1, x>=1 * * with BOTH absolute and fractional tolerances. * Starting from zero will be real challenge for * fractional tolerance. 
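 * The reference solution used for checking is y(x) = max(x-1,0)^2/2, which is
 * obtained by integrating dy/dx = max(x-1,0) with y(0)=0.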
*/ ae_assert(passcount>=2, "Assertion failed", _state); for(pass=0; pass<=passcount-1; pass++) { h = 1.0E-4; eps = 1.0E-4; if( pass%2==0 ) { eps = -eps; } ae_vector_set_length(&y, 1, _state); y.ptr.p_double[0] = (double)(0); m = 21; ae_vector_set_length(&xg, m, _state); for(i=0; i<=m-1; i++) { xg.ptr.p_double[i] = (double)(2*i)/(double)(m-1); } mynfev = 0; odesolverrkck(&y, 1, &xg, m, eps, h, &state, _state); while(odesolveriteration(&state, _state)) { state.dy.ptr.p_double[0] = ae_maxreal(state.x-1, (double)(0), _state); mynfev = mynfev+1; } odesolverresults(&state, &m2, &xtbl, &ytbl, &rep, _state); if( rep.terminationtype<=0 ) { rkckerrors = ae_true; } else { rkckerrors = rkckerrors||m2!=m; err = (double)(0); for(i=0; i<=m-1; i++) { err = ae_maxreal(err, ae_fabs(ytbl.ptr.pp_double[i][0]-ae_sqr(ae_maxreal(xg.ptr.p_double[i]-1, (double)(0), _state), _state)/2, _state), _state); } rkckerrors = rkckerrors||ae_fp_greater(err,ae_fabs(eps, _state)); rkckerrors = rkckerrors||mynfev!=rep.nfev; } } /* * end */ waserrors = rkckerrors; if( !silent ) { printf("TESTING ODE SOLVER\n"); printf("* RK CASH-KARP: "); if( rkckerrors ) { printf("FAILED\n"); } else { printf("OK\n"); } if( waserrors ) { printf("TEST FAILED\n"); } else { printf("TEST PASSED\n"); } } result = !waserrors; ae_frame_leave(_state); return result; } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. *************************************************************************/ ae_bool _pexec_testodesolver(ae_bool silent, ae_state *_state) { return testodesolver(silent, _state); } static void testsparseunit_initgenerator(ae_int_t m, ae_int_t n, ae_int_t matkind, ae_int_t triangle, sparsegenerator* g, ae_state *_state); static ae_bool testsparseunit_generatenext(sparsegenerator* g, /* Real */ ae_matrix* da, sparsematrix* sa, ae_state *_state); static void testsparseunit_createrandom(ae_int_t m, ae_int_t n, ae_int_t pkind, ae_int_t ckind, ae_int_t p0, ae_int_t p1, /* Real */ ae_matrix* da, sparsematrix* sa, ae_state *_state); static ae_bool testsparseunit_enumeratetest(ae_state *_state); static ae_bool testsparseunit_rewriteexistingtest(ae_state *_state); static void testsparseunit_testgetrow(ae_bool* err, ae_state *_state); static ae_bool testsparseunit_testconvertsm(ae_state *_state); static ae_bool testsparseunit_testgcmatrixtype(ae_state *_state); ae_bool testsparse(ae_bool silent, ae_state *_state) { ae_bool waserrors; ae_bool basicerrors; ae_bool linearerrors; ae_bool basicrnderrors; ae_bool level2unsymmetricerrors; ae_bool level2symmetricerrors; ae_bool level2triangularerrors; ae_bool level3unsymmetricerrors; ae_bool level3symmetricerrors; ae_bool linearserrors; ae_bool linearmmerrors; ae_bool linearsmmerrors; ae_bool getrowerrors; ae_bool copyerrors; ae_bool basiccopyerrors; ae_bool enumerateerrors; ae_bool rewriteexistingerr; ae_bool skserrors; ae_bool result; getrowerrors = ae_false; skserrors = skstest(_state); basicerrors = basicfunctest(_state)||testsparseunit_testgcmatrixtype(_state); basicrnderrors = basicfuncrandomtest(_state); linearerrors = linearfunctionstest(_state); level2unsymmetricerrors = testlevel2unsymmetric(_state); level2symmetricerrors = testlevel2symmetric(_state); level2triangularerrors = testlevel2triangular(_state); level3unsymmetricerrors = testlevel3unsymmetric(_state); level3symmetricerrors = testlevel3symmetric(_state); linearserrors = linearfunctionsstest(_state); linearmmerrors = linearfunctionsmmtest(_state); 
linearsmmerrors = linearfunctionssmmtest(_state); copyerrors = copyfunctest(ae_true, _state)||testsparseunit_testconvertsm(_state); basiccopyerrors = basiccopyfunctest(ae_true, _state); enumerateerrors = testsparseunit_enumeratetest(_state); rewriteexistingerr = testsparseunit_rewriteexistingtest(_state); testsparseunit_testgetrow(&getrowerrors, _state); /* * report */ waserrors = (((((((((((((((skserrors||getrowerrors)||basicerrors)||linearerrors)||basicrnderrors)||level2unsymmetricerrors)||level2symmetricerrors)||level2triangularerrors)||level3unsymmetricerrors)||level3symmetricerrors)||linearserrors)||linearmmerrors)||linearsmmerrors)||copyerrors)||basiccopyerrors)||enumerateerrors)||rewriteexistingerr; if( !silent ) { printf("TESTING SPARSE\n"); printf("STORAGE FORMATS:\n"); printf("* SKS: "); if( !skserrors ) { printf("OK\n"); } else { printf("FAILED\n"); } printf("OPERATIONS:\n"); printf("* GETROW: "); if( !getrowerrors ) { printf("OK\n"); } else { printf("FAILED\n"); } printf("BLAS:\n"); printf("* LEVEL 2 GENERAL: "); if( !level2unsymmetricerrors ) { printf("OK\n"); } else { printf("FAILED\n"); } printf("* LEVEL 2 SYMMETRIC: "); if( !level2symmetricerrors ) { printf("OK\n"); } else { printf("FAILED\n"); } printf("* LEVEL 2 TRIANGULAR: "); if( !level2triangularerrors ) { printf("OK\n"); } else { printf("FAILED\n"); } printf("* LEVEL 3 GENERAL: "); if( !level3unsymmetricerrors ) { printf("OK\n"); } else { printf("FAILED\n"); } printf("* LEVEL 3 SYMMETRIC: "); if( !level3symmetricerrors ) { printf("OK\n"); } else { printf("FAILED\n"); } printf("BASIC TEST: "); if( !basicerrors ) { printf("OK\n"); } else { printf("FAILED\n"); } printf("COPY TEST: "); if( !copyerrors ) { printf("OK\n"); } else { printf("FAILED\n"); } printf("BASIC_COPY TEST: "); if( !basiccopyerrors ) { printf("OK\n"); } else { printf("FAILED\n"); } printf("BASIC_RND TEST: "); if( !basicrnderrors ) { printf("OK\n"); } else { printf("FAILED\n"); } printf("LINEAR TEST: "); if( !linearerrors ) { printf("OK\n"); } else { printf("FAILED\n"); } printf("LINEAR TEST FOR SYMMETRIC MATRICES: "); if( !linearserrors ) { printf("OK\n"); } else { printf("FAILED\n"); } printf("LINEAR MxM TEST: "); if( !linearmmerrors ) { printf("OK\n"); } else { printf("FAILED\n"); } printf("LINEAR MxM TEST FOR SYMMETRIC MATRICES: "); if( !linearsmmerrors ) { printf("OK\n"); } else { printf("FAILED\n"); } printf("ENUMERATE TEST: "); if( !enumerateerrors ) { printf("OK\n"); } else { printf("FAILED\n"); } printf("REWRITE EXISTING TEST: "); if( !rewriteexistingerr ) { printf("OK\n"); } else { printf("FAILED\n"); } if( waserrors ) { printf("TEST FAILED\n"); } else { printf("TEST PASSED\n"); } printf("\n\n"); } result = !waserrors; return result; } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. *************************************************************************/ ae_bool _pexec_testsparse(ae_bool silent, ae_state *_state) { return testsparse(silent, _state); } /************************************************************************* Function for testing basic SKS functional. Returns True on errors, False on success. 
-- ALGLIB PROJECT -- Copyright 16.01.1014 by Bochkanov Sergey *************************************************************************/ ae_bool skstest(ae_state *_state) { ae_frame _frame_block; sparsematrix s0; sparsematrix s1; sparsematrix s2; sparsematrix s3; sparsematrix s4; sparsematrix s5; sparsematrix s6; ae_int_t n; ae_int_t nz; double pnz; ae_int_t i; ae_int_t j; ae_int_t t0; ae_int_t t1; ae_matrix a; ae_matrix wasenumerated; ae_vector d; ae_vector u; hqrndstate rs; double v0; double v1; ae_int_t uppercnt; ae_int_t lowercnt; ae_bool result; ae_frame_make(_state, &_frame_block); _sparsematrix_init(&s0, _state); _sparsematrix_init(&s1, _state); _sparsematrix_init(&s2, _state); _sparsematrix_init(&s3, _state); _sparsematrix_init(&s4, _state); _sparsematrix_init(&s5, _state); _sparsematrix_init(&s6, _state); ae_matrix_init(&a, 0, 0, DT_REAL, _state); ae_matrix_init(&wasenumerated, 0, 0, DT_BOOL, _state); ae_vector_init(&d, 0, DT_INT, _state); ae_vector_init(&u, 0, DT_INT, _state); _hqrndstate_init(&rs, _state); result = ae_false; hqrndrandomize(&rs, _state); for(n=1; n<=20; n++) { nz = n*n-n; for(;;) { /* * Generate N*N matrix where probability of non-diagonal element * being non-zero is PNZ. We also generate D and U - subdiagonal * and superdiagonal profile sizes. */ if( n>1 ) { pnz = (double)nz/(double)(n*n-n); } else { pnz = 1.0; } ae_vector_set_length(&d, n, _state); ae_vector_set_length(&u, n, _state); for(i=0; i<=n-1; i++) { d.ptr.p_int[i] = 0; u.ptr.p_int[i] = 0; } ae_matrix_set_length(&a, n, n, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { if( i==j||ae_fp_less_eq(hqrnduniformr(&rs, _state),pnz) ) { a.ptr.pp_double[i][j] = hqrnduniformr(&rs, _state)-0.5; if( ji1 ) { inc(&uppercnt, _state); } if( j1N * * with 50% probability to CRS or SKS, if M=N */ if( m!=n||ae_fp_less(hqrnduniformr(&rs, _state),0.5) ) { sparsecopytocrs(&sa, &s0, _state); } else { sparsecopytosks(&sa, &s0, _state); } /* * Test SparseGet() for SA and S0 against matrix returned in A */ for(i=0; i<=m-1; i++) { for(j=0; j<=n-1; j++) { seterrorflag(&result, ae_fp_greater(ae_fabs(sparseget(&sa, i, j, _state)-a.ptr.pp_double[i][j], _state),eps), _state); seterrorflag(&result, ae_fp_greater(ae_fabs(sparseget(&s0, i, j, _state)-a.ptr.pp_double[i][j], _state),eps), _state); } } /* * Test SparseMV */ ae_vector_set_length(&x0, n, _state); ae_vector_set_length(&x1, n, _state); for(j=0; j<=n-1; j++) { x0.ptr.p_double[j] = hqrnduniformr(&rs, _state)-0.5; x1.ptr.p_double[j] = x0.ptr.p_double[j]; } sparsemv(&s0, &x0, &y0, _state); seterrorflag(&result, y0.cntN * * with 50% probability to CRS or SKS, if M=N */ if( m!=n||ae_fp_less(hqrnduniformr(&rs, _state),0.5) ) { sparsecopytocrs(&sa, &s0, _state); } else { sparsecopytosks(&sa, &s0, _state); } /* * Test SparseGet() for SA and S0 against matrix returned in A */ for(i=0; i<=m-1; i++) { for(j=0; j<=n-1; j++) { seterrorflag(&result, ae_fp_neq(sparseget(&sa, i, j, _state),a.ptr.pp_double[i][j]), _state); seterrorflag(&result, ae_fp_neq(sparseget(&s0, i, j, _state),a.ptr.pp_double[i][j]), _state); } } /* * Test SparseMV */ ae_matrix_set_length(&x0, n, k, _state); ae_matrix_set_length(&x1, n, k, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=k-1; j++) { x0.ptr.pp_double[i][j] = hqrnduniformr(&rs, _state)-0.5; x1.ptr.pp_double[i][j] = x0.ptr.pp_double[i][j]; } } sparsemm(&s0, &x0, k, &y0, _state); seterrorflag(&result, y0.rows0 ) { isupper = ae_true; } testsparseunit_initgenerator(n, n, 0, triangletype, &g, _state); while(testsparseunit_generatenext(&g, &a, &sa, _state)) { 
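/*
 * GenerateNext is a reverse-communication generator: each successful call
 * yields one more test matrix, simultaneously as a dense array A and as a
 * sparse hash-table matrix SA, with the target number of non-zero elements
 * halved on every iteration until the matrix becomes empty (see
 * testsparseunit_generatenext below).
 */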
/* * Convert SA to desired storage format: * * S0 stores unmodified copy * * S1 stores copy with unmodified triangle corresponding * to IsUpper and another triangle being spoiled by random * trash */ sparsecopytohash(&sa, &s1, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { if( (ji&&!isupper) ) { sparseset(&s1, i, j, hqrnduniformr(&rs, _state), _state); } } } if( ae_fp_less(hqrnduniformr(&rs, _state),0.5) ) { sparsecopytocrs(&sa, &s0, _state); sparseconverttocrs(&s1, _state); } else { sparsecopytosks(&sa, &s0, _state); sparseconverttosks(&s1, _state); } /* * Test SparseGet() for SA and S0 against matrix returned in A */ for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { seterrorflag(&result, ae_fp_greater(ae_fabs(sparseget(&sa, i, j, _state)-a.ptr.pp_double[i][j], _state),eps), _state); seterrorflag(&result, ae_fp_greater(ae_fabs(sparseget(&s0, i, j, _state)-a.ptr.pp_double[i][j], _state),eps), _state); seterrorflag(&result, (ji&&triangletype==-1)&&ae_fp_neq(sparseget(&s0, i, j, _state),(double)(0)), _state); } } /* * Before we proceed with testing, update empty triangle of A * with its copy from another part of the matrix. */ for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { if( (ji&&!isupper) ) { a.ptr.pp_double[i][j] = a.ptr.pp_double[j][i]; } } } /* * Test SparseSMV */ ae_vector_set_length(&x0, n, _state); ae_vector_set_length(&x1, n, _state); for(j=0; j<=n-1; j++) { x0.ptr.p_double[j] = hqrnduniformr(&rs, _state)-0.5; x1.ptr.p_double[j] = x0.ptr.p_double[j]; } sparsesmv(&s0, isupper, &x0, &y0, _state); seterrorflag(&result, y0.cnt0 ) { isupper = ae_true; } testsparseunit_initgenerator(n, n, 0, triangletype, &g, _state); while(testsparseunit_generatenext(&g, &a, &sa, _state)) { /* * Choose matrix width K */ k = 1+hqrnduniformi(&rs, 20, _state); /* * Convert SA to desired storage format: * * S0 stores unmodified copy * * S1 stores copy with unmodified triangle corresponding * to IsUpper and another triangle being spoiled by random * trash */ sparsecopytohash(&sa, &s1, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { if( (ji&&!isupper) ) { sparseset(&s1, i, j, hqrnduniformr(&rs, _state), _state); } } } if( ae_fp_less(hqrnduniformr(&rs, _state),0.5) ) { sparsecopytocrs(&sa, &s0, _state); sparseconverttocrs(&s1, _state); } else { sparsecopytosks(&sa, &s0, _state); sparseconverttosks(&s1, _state); } /* * Test SparseGet() for SA and S0 against matrix returned in A */ for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { seterrorflag(&result, ae_fp_greater(ae_fabs(sparseget(&sa, i, j, _state)-a.ptr.pp_double[i][j], _state),eps), _state); seterrorflag(&result, ae_fp_greater(ae_fabs(sparseget(&s0, i, j, _state)-a.ptr.pp_double[i][j], _state),eps), _state); seterrorflag(&result, (ji&&triangletype==-1)&&ae_fp_neq(sparseget(&s0, i, j, _state),(double)(0)), _state); } } /* * Before we proceed with testing, update empty triangle of A * with its copy from another part of the matrix. 
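 * (SparseSMM/SparseSMV treat S0 as symmetric and read only the IsUpper
 * triangle, so the dense reference A has to be mirrored first to make the
 * dense and sparse products comparable.)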
*/ for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { if( (ji&&!isupper) ) { a.ptr.pp_double[i][j] = a.ptr.pp_double[j][i]; } } } /* * Test SparseSMM */ ae_matrix_set_length(&x0, n, k, _state); ae_matrix_set_length(&x1, n, k, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=k-1; j++) { x0.ptr.pp_double[i][j] = hqrnduniformr(&rs, _state)-0.5; x1.ptr.pp_double[i][j] = x0.ptr.pp_double[i][j]; } } sparsesmm(&s0, isupper, &x0, k, &y0, _state); seterrorflag(&result, y0.rows0 ) { isupper = ae_true; } testsparseunit_initgenerator(n, n, 0, triangletype, &g, _state); while(testsparseunit_generatenext(&g, &a, &sa, _state)) { /* * Settings (IsUpper was already set, handle the rest) */ isunit = ae_fp_less(hqrnduniformr(&rs, _state),0.5); optype = hqrnduniformi(&rs, 2, _state); /* * Convert SA to desired storage format: * * S0 stores unmodified copy * * S1 stores copy with unmodified triangle corresponding * to IsUpper and another triangle being spoiled by random * trash */ sparsecopytohash(&sa, &s1, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { if( (ji&&!isupper) ) { sparseset(&s1, i, j, hqrnduniformr(&rs, _state), _state); } } } if( ae_fp_less(hqrnduniformr(&rs, _state),0.5) ) { sparsecopytocrs(&sa, &s0, _state); sparseconverttocrs(&s1, _state); } else { sparsecopytosks(&sa, &s0, _state); sparseconverttosks(&s1, _state); } /* * Generate "effective A" */ ae_matrix_set_length(&ea, n, n, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { ea.ptr.pp_double[i][j] = (double)(0); } } for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { if( (j>=i&&isupper)||(j<=i&&!isupper) ) { i1 = i; j1 = j; if( optype==1 ) { swapi(&i1, &j1, _state); } ea.ptr.pp_double[i1][j1] = a.ptr.pp_double[i][j]; if( isunit&&i1==j1 ) { ea.ptr.pp_double[i1][j1] = 1.0; } } } } /* * Test SparseTRMV */ ae_vector_set_length(&x0, n, _state); ae_vector_set_length(&x1, n, _state); for(j=0; j<=n-1; j++) { x0.ptr.p_double[j] = hqrnduniformr(&rs, _state)-0.5; x1.ptr.p_double[j] = x0.ptr.p_double[j]; } sparsetrmv(&s0, isupper, isunit, optype, &x0, &y0, _state); seterrorflag(&result, y0.cnti&&!isupper) ) { sparseset(&s1, i, j, hqrnduniformr(&rs, _state), _state); } } } if( ae_fp_less(hqrnduniformr(&rs, _state),0.5) ) { sparsecopytocrs(&sa, &s0, _state); sparseconverttocrs(&s1, _state); } else { sparsecopytosks(&sa, &s0, _state); sparseconverttosks(&s1, _state); } /* * Generate "effective A" and EY = inv(EA)*x0 */ ae_matrix_set_length(&ea, n, n, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { ea.ptr.pp_double[i][j] = (double)(0); } } for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { if( (j>=i&&isupper)||(j<=i&&!isupper) ) { i1 = i; j1 = j; if( optype==1 ) { swapi(&i1, &j1, _state); } ea.ptr.pp_double[i1][j1] = a.ptr.pp_double[i][j]; if( isunit&&i1==j1 ) { ea.ptr.pp_double[i1][j1] = 1.0; } } } } ae_vector_set_length(&ey, n, _state); for(i=0; i<=n-1; i++) { ey.ptr.p_double[i] = hqrnduniformr(&rs, _state)-0.5; } ae_vector_set_length(&x0, n, _state); ae_vector_set_length(&x1, n, _state); for(i=0; i<=n-1; i++) { v = ae_v_dotproduct(&ea.ptr.pp_double[i][0], 1, &ey.ptr.p_double[0], 1, ae_v_len(0,n-1)); x0.ptr.p_double[i] = v; x1.ptr.p_double[i] = v; } /* * Test SparseTRSV */ sparsetrsv(&s0, isupper, isunit, optype, &x0, _state); seterrorflag(&result, x0.cnti1&&j1<=i1+2 ) { a.ptr.pp_double[i1][j1] = (double)(i1+j1+1); sparseset(&s, i1, j1, a.ptr.pp_double[i1][j1], _state); sparseadd(&s, i1, j1, (double)(0), _state); sparseset(&sss, i1, j1, a.ptr.pp_double[i1][j1], _state); } else { a.ptr.pp_double[i1][j1] = (double)(0); sparseset(&s, i1, j1, 
a.ptr.pp_double[i1][j1], _state); sparseadd(&s, i1, j1, (double)(0), _state); } /* * Check for SparseCreate */ sparsecopy(&s, &ss, _state); a0 = sparseget(&s, i1, j1, _state); a1 = sparseget(&ss, i1, j1, _state); if( ae_fp_neq(a0,a1) ) { if( !silent ) { printf("BasicCopyFuncTest::Report::SparseGet\n"); printf("S::[%0d,%0d]=%0.5f\n", (int)(i1), (int)(j1), (double)(a0)); printf("SS::[%0d,%0d]=%0.5f\n", (int)(i1), (int)(j1), (double)(a1)); printf(" TEST FAILED.\n"); } result = ae_true; ae_frame_leave(_state); return result; } } } /* * Check for SparseCreateCRS */ for(i1=0; i1<=i-1; i1++) { for(j1=0; j1<=j-1; j1++) { sparsecopy(&sss, &ss, _state); a0 = sparseget(&sss, i1, j1, _state); a1 = sparseget(&ss, i1, j1, _state); if( ae_fp_neq(a0,a1) ) { if( !silent ) { printf("BasicCopyFuncTest::Report::SparseGet\n"); printf("S::[%0d,%0d]=%0.5f\n", (int)(i1), (int)(j1), (double)(a0)); printf("SS::[%0d,%0d]=%0.5f\n", (int)(i1), (int)(j1), (double)(a1)); printf(" TEST FAILED.\n"); } result = ae_true; ae_frame_leave(_state); return result; } } } /* * Check for Matrix with CRS type */ sparseconverttocrs(&s, _state); sparsecopy(&s, &ss, _state); for(i1=0; i1<=i-1; i1++) { for(j1=0; j1<=j-1; j1++) { a0 = sparseget(&s, i1, j1, _state); a1 = sparseget(&ss, i1, j1, _state); if( ae_fp_neq(a0,a1) ) { if( !silent ) { printf("BasicCopyFuncTest::Report::SparseGet\n"); printf("S::[%0d,%0d]=%0.5f\n", (int)(i1), (int)(j1), (double)(a0)); printf("SS::[%0d,%0d]=%0.5f\n", (int)(i1), (int)(j1), (double)(a1)); printf(" TEST FAILED.\n"); } result = ae_true; ae_frame_leave(_state); return result; } } } } } if( !silent ) { printf(" TEST IS PASSED.\n"); } result = ae_false; ae_frame_leave(_state); return result; } /************************************************************************* Function for testing SparseCopy -- ALGLIB PROJECT -- Copyright 14.10.2011 by Bochkanov Sergey *************************************************************************/ ae_bool copyfunctest(ae_bool silent, ae_state *_state) { ae_frame _frame_block; sparsematrix s; sparsematrix ss; ae_int_t n; ae_int_t m; ae_int_t mtype; ae_int_t i; ae_int_t j; ae_int_t i1; ae_int_t j1; double lb; double rb; ae_matrix a; ae_vector x0; ae_vector x1; ae_vector ty; ae_vector tyt; ae_vector y; ae_vector yt; ae_vector y0; ae_vector yt0; ae_vector cpy; ae_vector cpyt; ae_vector cpy0; ae_vector cpyt0; double eps; double a0; double a1; ae_bool result; ae_frame_make(_state, &_frame_block); _sparsematrix_init(&s, _state); _sparsematrix_init(&ss, _state); ae_matrix_init(&a, 0, 0, DT_REAL, _state); ae_vector_init(&x0, 0, DT_REAL, _state); ae_vector_init(&x1, 0, DT_REAL, _state); ae_vector_init(&ty, 0, DT_REAL, _state); ae_vector_init(&tyt, 0, DT_REAL, _state); ae_vector_init(&y, 0, DT_REAL, _state); ae_vector_init(&yt, 0, DT_REAL, _state); ae_vector_init(&y0, 0, DT_REAL, _state); ae_vector_init(&yt0, 0, DT_REAL, _state); ae_vector_init(&cpy, 0, DT_REAL, _state); ae_vector_init(&cpyt, 0, DT_REAL, _state); ae_vector_init(&cpy0, 0, DT_REAL, _state); ae_vector_init(&cpyt0, 0, DT_REAL, _state); /* * Accuracy */ eps = 1000*ae_machineepsilon; /* * Size of the matrix (m*n) */ n = 30; m = 30; /* * Left and right borders, limiting matrix values */ lb = (double)(-10); rb = (double)(10); /* * Test linear algebra functions for: * a) sparse matrix converted to CRS from Hash-Table * b) sparse matrix initially created as CRS */ for(i=1; i<=m-1; i++) { for(j=1; j<=n-1; j++) { for(mtype=0; mtype<=1; mtype++) { /* * Prepare test problem */ testsparseunit_createrandom(i, j, -1, mtype, -1, -1, 
&a, &s, _state); sparsecopy(&s, &ss, _state); /* * Initialize temporaries */ ae_vector_set_length(&ty, i, _state); ae_vector_set_length(&tyt, j, _state); for(i1=0; i1<=i-1; i1++) { ty.ptr.p_double[i1] = (double)(0); } for(i1=0; i1<=j-1; i1++) { tyt.ptr.p_double[i1] = (double)(0); } ae_vector_set_length(&x0, j, _state); ae_vector_set_length(&x1, i, _state); for(i1=0; i1<=j-1; i1++) { x0.ptr.p_double[i1] = (rb-lb)*ae_randomreal(_state)+lb; } for(i1=0; i1<=i-1; i1++) { x1.ptr.p_double[i1] = (rb-lb)*ae_randomreal(_state)+lb; } /* * Consider two cases: square matrix, and non-square matrix */ if( i!=j ) { /* * Searching true result */ for(i1=0; i1<=i-1; i1++) { for(j1=0; j1<=j-1; j1++) { ty.ptr.p_double[i1] = ty.ptr.p_double[i1]+a.ptr.pp_double[i1][j1]*x0.ptr.p_double[j1]; tyt.ptr.p_double[j1] = tyt.ptr.p_double[j1]+a.ptr.pp_double[i1][j1]*x1.ptr.p_double[i1]; } } /* * Multiplication */ sparsemv(&s, &x0, &y, _state); sparsemtv(&s, &x1, &yt, _state); sparsemv(&ss, &x0, &cpy, _state); sparsemtv(&ss, &x1, &cpyt, _state); /* * Check for MV-result */ for(i1=0; i1<=i-1; i1++) { if( (ae_fp_greater_eq(ae_fabs(y.ptr.p_double[i1]-ty.ptr.p_double[i1], _state),eps)||ae_fp_greater_eq(ae_fabs(cpy.ptr.p_double[i1]-ty.ptr.p_double[i1], _state),eps))||ae_fp_neq(cpy.ptr.p_double[i1]-y.ptr.p_double[i1],(double)(0)) ) { if( !silent ) { printf("CopyFuncTest::Report::RES_MV\n"); printf("Y[%0d]=%0.5f; tY[%0d]=%0.5f\n", (int)(i1), (double)(y.ptr.p_double[i1]), (int)(i1), (double)(ty.ptr.p_double[i1])); printf("cpY[%0d]=%0.5f;\n", (int)(i1), (double)(cpy.ptr.p_double[i1])); printf(" TEST FAILED.\n"); } result = ae_true; ae_frame_leave(_state); return result; } } /* * Check for MTV-result */ for(i1=0; i1<=j-1; i1++) { if( (ae_fp_greater_eq(ae_fabs(yt.ptr.p_double[i1]-tyt.ptr.p_double[i1], _state),eps)||ae_fp_greater_eq(ae_fabs(cpyt.ptr.p_double[i1]-tyt.ptr.p_double[i1], _state),eps))||ae_fp_neq(cpyt.ptr.p_double[i1]-yt.ptr.p_double[i1],(double)(0)) ) { if( !silent ) { printf("CopyFuncTest::Report::RES_MTV\n"); printf("Yt[%0d]=%0.5f; tYt[%0d]=%0.5f\n", (int)(i1), (double)(yt.ptr.p_double[i1]), (int)(i1), (double)(tyt.ptr.p_double[i1])); printf("cpYt[%0d]=%0.5f;\n", (int)(i1), (double)(cpyt.ptr.p_double[i1])); printf(" TEST FAILED.\n"); } result = ae_true; ae_frame_leave(_state); return result; } } sparsecopy(&s, &ss, _state); for(i1=0; i1<=i-1; i1++) { for(j1=0; j1<=j-1; j1++) { a0 = sparseget(&s, i1, j1, _state); a1 = sparseget(&ss, i1, j1, _state); if( ae_fp_neq(a0,a1) ) { if( !silent ) { printf("CopyFuncTest::Report::SparseGet\n"); printf("S::[%0d,%0d]=%0.5f\n", (int)(i1), (int)(j1), (double)(a0)); printf("SS::[%0d,%0d]=%0.5f\n", (int)(i1), (int)(j1), (double)(a1)); printf(" TEST FAILED.\n"); } result = ae_true; ae_frame_leave(_state); return result; } } } } else { /* * Searching true result */ for(i1=0; i1<=i-1; i1++) { for(j1=0; j1<=j-1; j1++) { ty.ptr.p_double[i1] = ty.ptr.p_double[i1]+a.ptr.pp_double[i1][j1]*x0.ptr.p_double[j1]; tyt.ptr.p_double[j1] = tyt.ptr.p_double[j1]+a.ptr.pp_double[i1][j1]*x0.ptr.p_double[i1]; } } /* * Multiplication */ sparsemv(&s, &x0, &y, _state); sparsemtv(&s, &x0, &yt, _state); sparsemv2(&s, &x0, &y0, &yt0, _state); sparsemv(&ss, &x0, &cpy, _state); sparsemtv(&ss, &x0, &cpyt, _state); sparsemv2(&ss, &x0, &cpy0, &cpyt0, _state); /* * Check for MV2-result */ for(i1=0; i1<=i-1; i1++) { if( ((((ae_fp_greater_eq(ae_fabs(y0.ptr.p_double[i1]-ty.ptr.p_double[i1], _state),eps)||ae_fp_greater_eq(ae_fabs(yt0.ptr.p_double[i1]-tyt.ptr.p_double[i1], 
_state),eps))||ae_fp_greater_eq(ae_fabs(cpy0.ptr.p_double[i1]-ty.ptr.p_double[i1], _state),eps))||ae_fp_greater_eq(ae_fabs(cpyt0.ptr.p_double[i1]-tyt.ptr.p_double[i1], _state),eps))||ae_fp_neq(cpy0.ptr.p_double[i1]-y0.ptr.p_double[i1],(double)(0)))||ae_fp_neq(cpyt0.ptr.p_double[i1]-yt0.ptr.p_double[i1],(double)(0)) ) { if( !silent ) { printf("CopyFuncTest::Report::RES_MV2\n"); printf("Y0[%0d]=%0.5f; tY[%0d]=%0.5f\n", (int)(i1), (double)(y0.ptr.p_double[i1]), (int)(i1), (double)(ty.ptr.p_double[i1])); printf("Yt0[%0d]=%0.5f; tYt[%0d]=%0.5f\n", (int)(i1), (double)(yt0.ptr.p_double[i1]), (int)(i1), (double)(tyt.ptr.p_double[i1])); printf("cpY0[%0d]=%0.5f;\n", (int)(i1), (double)(cpy0.ptr.p_double[i1])); printf("cpYt0[%0d]=%0.5f;\n", (int)(i1), (double)(cpyt0.ptr.p_double[i1])); printf(" TEST FAILED.\n"); } result = ae_true; ae_frame_leave(_state); return result; } } /* * Check for MV- and MTV-result by help MV2 */ for(i1=0; i1<=i-1; i1++) { if( ((ae_fp_greater(ae_fabs(y0.ptr.p_double[i1]-y.ptr.p_double[i1], _state),eps)||ae_fp_greater(ae_fabs(yt0.ptr.p_double[i1]-yt.ptr.p_double[i1], _state),eps))||ae_fp_greater(ae_fabs(cpy0.ptr.p_double[i1]-cpy.ptr.p_double[i1], _state),eps))||ae_fp_greater(ae_fabs(cpyt0.ptr.p_double[i1]-cpyt.ptr.p_double[i1], _state),eps) ) { if( !silent ) { printf("CopyFuncTest::Report::RES_MV_MVT\n"); printf("Y0[%0d]=%0.5f; Y[%0d]=%0.5f\n", (int)(i1), (double)(y0.ptr.p_double[i1]), (int)(i1), (double)(y.ptr.p_double[i1])); printf("Yt0[%0d]=%0.5f; Yt[%0d]=%0.5f\n", (int)(i1), (double)(yt0.ptr.p_double[i1]), (int)(i1), (double)(yt.ptr.p_double[i1])); printf("cpY0[%0d]=%0.5f; cpY[%0d]=%0.5f\n", (int)(i1), (double)(cpy0.ptr.p_double[i1]), (int)(i1), (double)(cpy.ptr.p_double[i1])); printf("cpYt0[%0d]=%0.5f; cpYt[%0d]=%0.5f\n", (int)(i1), (double)(cpyt0.ptr.p_double[i1]), (int)(i1), (double)(cpyt.ptr.p_double[i1])); printf(" TEST FAILED.\n"); } result = ae_true; ae_frame_leave(_state); return result; } } sparsecopy(&s, &ss, _state); for(i1=0; i1<=i-1; i1++) { for(j1=0; j1<=j-1; j1++) { a0 = sparseget(&s, i1, j1, _state); a1 = sparseget(&ss, i1, j1, _state); if( ae_fp_neq(a0,a1) ) { if( !silent ) { printf("CopyFuncTest::Report::SparseGet\n"); printf("S::[%0d,%0d]=%0.5f\n", (int)(i1), (int)(j1), (double)(a0)); printf("SS::[%0d,%0d]=%0.5f\n", (int)(i1), (int)(j1), (double)(a1)); printf(" TEST FAILED.\n"); } result = ae_true; ae_frame_leave(_state); return result; } } } } } } } if( !silent ) { printf(" TEST IS PASSED.\n"); } result = ae_false; ae_frame_leave(_state); return result; } /************************************************************************* This function initializes sparse matrix generator, which is used to generate a set of matrices with sequentially increasing sparsity. PARAMETERS: M, N - matrix size. If M=0, then matrix is square N*N. N and M must be small enough to store N*M dense matrix. 
MatKind - matrix properties: * 0 - general sparse (no structure) * 1 - general sparse, but diagonal is always present and non-zero * 2 - diagonally dominant, SPD Triangle - triangle being returned: * +1 - upper triangle * -1 - lower triangle * 0 - full matrix is returned OUTPUT PARAMETERS: G - generator A - matrix A in dense format SA - matrix A in sparse format (hash-table storage) *************************************************************************/ static void testsparseunit_initgenerator(ae_int_t m, ae_int_t n, ae_int_t matkind, ae_int_t triangle, sparsegenerator* g, ae_state *_state) { _sparsegenerator_clear(g); g->n = n; g->m = m; g->matkind = matkind; g->triangle = triangle; hqrndrandomize(&g->rs, _state); ae_vector_set_length(&g->rcs.ia, 5+1, _state); ae_vector_set_length(&g->rcs.ra, 1+1, _state); g->rcs.stage = -1; } static ae_bool testsparseunit_generatenext(sparsegenerator* g, /* Real */ ae_matrix* da, sparsematrix* sa, ae_state *_state) { ae_int_t n; ae_int_t m; ae_int_t nz; ae_int_t nzd; double pnz; ae_int_t i; ae_int_t j; double v; ae_bool result; ae_matrix_clear(da); _sparsematrix_clear(sa); /* * Reverse communication preparations * I know it looks ugly, but it works the same way * anywhere from C++ to Python. * * This code initializes locals by: * * random values determined during code * generation - on first subroutine call * * values from previous call - on subsequent calls */ if( g->rcs.stage>=0 ) { n = g->rcs.ia.ptr.p_int[0]; m = g->rcs.ia.ptr.p_int[1]; nz = g->rcs.ia.ptr.p_int[2]; nzd = g->rcs.ia.ptr.p_int[3]; i = g->rcs.ia.ptr.p_int[4]; j = g->rcs.ia.ptr.p_int[5]; pnz = g->rcs.ra.ptr.p_double[0]; v = g->rcs.ra.ptr.p_double[1]; } else { n = 359; m = -58; nz = -919; nzd = -909; i = 81; j = 255; pnz = 74; v = -788; } if( g->rcs.stage==0 ) { goto lbl_0; } if( g->rcs.stage==1 ) { goto lbl_1; } /* * Routine body */ n = g->n; if( g->m==0 ) { m = n; } else { m = g->m; } ae_assert(m>0&&n>0, "GenerateNext: incorrect N/M", _state); /* * Generate general sparse matrix */ if( g->matkind!=0 ) { goto lbl_2; } nz = n*m; lbl_4: if( ae_false ) { goto lbl_5; } /* * Generate dense N*N matrix where probability of element * being non-zero is PNZ. */ pnz = (double)nz/(double)(n*m); ae_matrix_set_length(&g->bufa, m, n, _state); for(i=0; i<=m-1; i++) { for(j=0; j<=n-1; j++) { if( ae_fp_less_eq(hqrnduniformr(&g->rs, _state),pnz) ) { g->bufa.ptr.pp_double[i][j] = hqrnduniformr(&g->rs, _state)-0.5; } else { g->bufa.ptr.pp_double[i][j] = 0.0; } } } /* * Output matrix and RComm */ ae_matrix_set_length(da, m, n, _state); sparsecreate(m, n, ae_round(pnz*m*n, _state), sa, _state); for(i=0; i<=m-1; i++) { for(j=0; j<=n-1; j++) { if( (j<=i&&g->triangle<=0)||(j>=i&&g->triangle>=0) ) { da->ptr.pp_double[i][j] = g->bufa.ptr.pp_double[i][j]; sparseset(sa, i, j, g->bufa.ptr.pp_double[i][j], _state); } else { da->ptr.pp_double[i][j] = 0.0; } } } g->rcs.stage = 0; goto lbl_rcomm; lbl_0: /* * Increase problem sparcity and try one more time. * Stop after testing NZ=0. */ if( nz==0 ) { goto lbl_5; } nz = nz/2; goto lbl_4; lbl_5: result = ae_false; return result; lbl_2: /* * Generate general sparse matrix with non-zero diagonal */ if( g->matkind!=1 ) { goto lbl_6; } ae_assert(n==m, "GenerateNext: non-square matrix for MatKind=1", _state); nz = n*n-n; lbl_8: if( ae_false ) { goto lbl_9; } /* * Generate dense N*N matrix where probability of non-diagonal element * being non-zero is PNZ. 
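 * NZ starts at N*N-N and is halved after every generated matrix, so PNZ
 * sweeps from 1.0 down to 0 and the last matrix produced by this branch has
 * an empty off-diagonal part (only the forced non-zero diagonal remains).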
*/ if( n>1 ) { pnz = (double)nz/(double)(n*n-n); } else { pnz = (double)(1); } ae_matrix_set_length(&g->bufa, n, n, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { if( i==j ) { do { g->bufa.ptr.pp_double[i][i] = hqrnduniformr(&g->rs, _state)-0.5; } while(ae_fp_eq(g->bufa.ptr.pp_double[i][i],(double)(0))); g->bufa.ptr.pp_double[i][i] = g->bufa.ptr.pp_double[i][i]+1.5*ae_sign(g->bufa.ptr.pp_double[i][i], _state); continue; } if( ae_fp_less_eq(hqrnduniformr(&g->rs, _state),pnz) ) { g->bufa.ptr.pp_double[i][j] = hqrnduniformr(&g->rs, _state)-0.5; } else { g->bufa.ptr.pp_double[i][j] = 0.0; } } } /* * Output matrix and RComm */ ae_matrix_set_length(da, n, n, _state); sparsecreate(n, n, ae_round(pnz*(n*n-n)+n, _state), sa, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { if( (j<=i&&g->triangle<=0)||(j>=i&&g->triangle>=0) ) { da->ptr.pp_double[i][j] = g->bufa.ptr.pp_double[i][j]; sparseset(sa, i, j, g->bufa.ptr.pp_double[i][j], _state); } else { da->ptr.pp_double[i][j] = 0.0; } } } g->rcs.stage = 1; goto lbl_rcomm; lbl_1: /* * Increase problem sparcity and try one more time. * Stop after testing NZ=0. */ if( nz==0 ) { goto lbl_9; } nz = nz/2; goto lbl_8; lbl_9: result = ae_false; return result; lbl_6: ae_assert(ae_false, "Assertion failed", _state); result = ae_false; return result; /* * Saving state */ lbl_rcomm: result = ae_true; g->rcs.ia.ptr.p_int[0] = n; g->rcs.ia.ptr.p_int[1] = m; g->rcs.ia.ptr.p_int[2] = nz; g->rcs.ia.ptr.p_int[3] = nzd; g->rcs.ia.ptr.p_int[4] = i; g->rcs.ia.ptr.p_int[5] = j; g->rcs.ra.ptr.p_double[0] = pnz; g->rcs.ra.ptr.p_double[1] = v; return result; } /************************************************************************* This function creates random sparse matrix with some prescribed pattern. INPUT PARAMETERS: M - number of rows N - number of columns PKind - sparsity pattern: *-1 = pattern is chosen at random as well as P0/P1 * 0 = matrix with up to P0 non-zero elements at random locations (however, actual number of non-zero elements can be less than P0, and in fact can be zero) * 1 = band matrix with P0 non-zero elements below diagonal and P1 non-zero element above diagonal * 2 = matrix with random number of contiguous non-zero elements in the each row CKind - creation type: *-1 = CKind is chosen at random * 0 = matrix is created in Hash-Table format and converted to CRS representation * 1 = matrix is created in CRS format OUTPUT PARAMETERS: DA - dense representation of A, array[M,N] SA - sparse representation of A, in CRS format -- ALGLIB PROJECT -- Copyright 31.10.2011 by Bochkanov Sergey *************************************************************************/ static void testsparseunit_createrandom(ae_int_t m, ae_int_t n, ae_int_t pkind, ae_int_t ckind, ae_int_t p0, ae_int_t p1, /* Real */ ae_matrix* da, sparsematrix* sa, ae_state *_state) { ae_frame _frame_block; ae_int_t maxpkind; ae_int_t maxckind; ae_int_t i; ae_int_t j; ae_int_t k; double v; ae_vector c0; ae_vector c1; ae_vector rowsizes; ae_frame_make(_state, &_frame_block); ae_matrix_clear(da); _sparsematrix_clear(sa); ae_vector_init(&c0, 0, DT_INT, _state); ae_vector_init(&c1, 0, DT_INT, _state); ae_vector_init(&rowsizes, 0, DT_INT, _state); maxpkind = 2; maxckind = 1; ae_assert(m>=1, "CreateRandom: incorrect parameters", _state); ae_assert(n>=1, "CreateRandom: incorrect parameters", _state); ae_assert(pkind>=-1&&pkind<=maxpkind, "CreateRandom: incorrect parameters", _state); ae_assert(ckind>=-1&&ckind<=maxckind, "CreateRandom: incorrect parameters", _state); if( pkind==-1 ) { pkind = 
ae_randominteger(maxpkind+1, _state); if( pkind==0 ) { p0 = ae_randominteger(m*n, _state); } if( pkind==1 ) { p0 = ae_randominteger(ae_minint(m, n, _state), _state); p1 = ae_randominteger(ae_minint(m, n, _state), _state); } } if( ckind==-1 ) { ckind = ae_randominteger(maxckind+1, _state); } if( pkind==0 ) { /* * Matrix with elements at random locations */ ae_matrix_set_length(da, m, n, _state); for(i=0; i<=m-1; i++) { for(j=0; j<=n-1; j++) { da->ptr.pp_double[i][j] = (double)(0); } } if( ckind==0 ) { /* * Create matrix in Hash format, convert to CRS */ sparsecreate(m, n, 1, sa, _state); for(k=0; k<=p0-1; k++) { i = ae_randominteger(m, _state); j = ae_randominteger(n, _state); v = (double)(ae_randominteger(17, _state)-8)/(double)8; if( ae_fp_greater(ae_randomreal(_state),0.5) ) { da->ptr.pp_double[i][j] = v; sparseset(sa, i, j, v, _state); } else { da->ptr.pp_double[i][j] = da->ptr.pp_double[i][j]+v; sparseadd(sa, i, j, v, _state); } } sparseconverttocrs(sa, _state); ae_frame_leave(_state); return; } if( ckind==1 ) { /* * Create matrix in CRS format */ for(k=0; k<=p0-1; k++) { i = ae_randominteger(m, _state); j = ae_randominteger(n, _state); v = (double)(ae_randominteger(17, _state)-8)/(double)8; da->ptr.pp_double[i][j] = v; } ae_vector_set_length(&rowsizes, m, _state); for(i=0; i<=m-1; i++) { rowsizes.ptr.p_int[i] = 0; for(j=0; j<=n-1; j++) { if( ae_fp_neq(da->ptr.pp_double[i][j],(double)(0)) ) { rowsizes.ptr.p_int[i] = rowsizes.ptr.p_int[i]+1; } } } sparsecreatecrs(m, n, &rowsizes, sa, _state); for(i=0; i<=m-1; i++) { for(j=0; j<=n-1; j++) { if( ae_fp_neq(da->ptr.pp_double[i][j],(double)(0)) ) { sparseset(sa, i, j, da->ptr.pp_double[i][j], _state); } } } ae_frame_leave(_state); return; } ae_assert(ae_false, "CreateRandom: internal error", _state); } if( pkind==1 ) { /* * Band matrix */ ae_matrix_set_length(da, m, n, _state); ae_vector_set_length(&rowsizes, m, _state); for(i=0; i<=m-1; i++) { for(j=0; j<=n-1; j++) { da->ptr.pp_double[i][j] = (double)(0); } } for(i=0; i<=m-1; i++) { for(j=ae_maxint(i-p0, 0, _state); j<=ae_minint(i+p1, n-1, _state); j++) { do { da->ptr.pp_double[i][j] = (double)(ae_randominteger(17, _state)-8)/(double)8; } while(ae_fp_eq(da->ptr.pp_double[i][j],(double)(0))); } rowsizes.ptr.p_int[i] = ae_maxint(ae_minint(i+p1, n-1, _state)-ae_maxint(i-p0, 0, _state)+1, 0, _state); } if( ckind==0 ) { sparsecreate(m, n, 1, sa, _state); } if( ckind==1 ) { sparsecreatecrs(m, n, &rowsizes, sa, _state); } for(i=0; i<=m-1; i++) { for(j=0; j<=n-1; j++) { if( ae_fp_neq(da->ptr.pp_double[i][j],(double)(0)) ) { sparseset(sa, i, j, da->ptr.pp_double[i][j], _state); } } } sparseconverttocrs(sa, _state); ae_frame_leave(_state); return; } if( pkind==2 ) { /* * Matrix with one contiguous sequence of non-zero elements per row */ ae_matrix_set_length(da, m, n, _state); ae_vector_set_length(&rowsizes, m, _state); ae_vector_set_length(&c0, m, _state); ae_vector_set_length(&c1, m, _state); for(i=0; i<=m-1; i++) { for(j=0; j<=n-1; j++) { da->ptr.pp_double[i][j] = (double)(0); } } for(i=0; i<=m-1; i++) { c0.ptr.p_int[i] = ae_randominteger(n, _state); c1.ptr.p_int[i] = c0.ptr.p_int[i]+ae_randominteger(n-c0.ptr.p_int[i]+1, _state); rowsizes.ptr.p_int[i] = c1.ptr.p_int[i]-c0.ptr.p_int[i]; } for(i=0; i<=m-1; i++) { for(j=c0.ptr.p_int[i]; j<=c1.ptr.p_int[i]-1; j++) { do { da->ptr.pp_double[i][j] = (double)(ae_randominteger(17, _state)-8)/(double)8; } while(ae_fp_eq(da->ptr.pp_double[i][j],(double)(0))); } } if( ckind==0 ) { sparsecreate(m, n, 1, sa, _state); } if( ckind==1 ) { sparsecreatecrs(m, n, 
&rowsizes, sa, _state); } for(i=0; i<=m-1; i++) { for(j=0; j<=n-1; j++) { if( ae_fp_neq(da->ptr.pp_double[i][j],(double)(0)) ) { sparseset(sa, i, j, da->ptr.pp_double[i][j], _state); } } } sparseconverttocrs(sa, _state); ae_frame_leave(_state); return; } ae_frame_leave(_state); } /************************************************************************* This function does test for SparseEnumerate function. -- ALGLIB PROJECT -- Copyright 14.03.2012 by Bochkanov Sergey *************************************************************************/ static ae_bool testsparseunit_enumeratetest(ae_state *_state) { ae_frame _frame_block; sparsematrix spa; ae_matrix a; ae_matrix ta; ae_int_t m; ae_int_t n; double r; double v; ae_int_t ne; ae_int_t t0; ae_int_t t1; ae_int_t counter; ae_int_t c; ae_int_t hashcrs; ae_int_t i; ae_int_t j; ae_bool result; ae_frame_make(_state, &_frame_block); _sparsematrix_init(&spa, _state); ae_matrix_init(&a, 0, 0, DT_REAL, _state); ae_matrix_init(&ta, 0, 0, DT_BOOL, _state); r = 10.5; for(m=1; m<=30; m++) { for(n=1; n<=30; n++) { ne = 0; /* * Create matrix with non-zero elements inside the region: * 0<=In ) { seterrorflag(err, ae_true, _state); ae_frame_leave(_state); return; } for(j=0; j<=n-1; j++) { wasreturned.ptr.p_bool[j] = ae_false; } for(j=0; j<=nz-1; j++) { if( colidx.ptr.p_int[j]<0||colidx.ptr.p_int[j]>n ) { seterrorflag(err, ae_true, _state); ae_frame_leave(_state); return; } seterrorflag(err, j>0&&colidx.ptr.p_int[j]<=colidx.ptr.p_int[j-1], _state); seterrorflag(err, ae_fp_neq(vals.ptr.p_double[j],a.ptr.pp_double[i][colidx.ptr.p_int[j]])||ae_fp_neq(vals.ptr.p_double[j],sparseget(&s, i, colidx.ptr.p_int[j], _state)), _state); wasreturned.ptr.p_bool[colidx.ptr.p_int[j]] = ae_true; } for(j=0; j<=n-1; j++) { seterrorflag(err, ae_fp_neq(a.ptr.pp_double[i][j],(double)(0))&&!wasreturned.ptr.p_bool[j], _state); } } } } } ae_frame_leave(_state); } /************************************************************************* Test for SparseConvert functions(isn't tested ConvertToCRS function). The function create random dense and sparse matrices in CRS format. Then convert sparse matrix to some format by CONVERT_TO/COPY_TO functions, then it does some modification in matrices and compares that marices are identical. NOTE: Result of the function assigned to variable CopyErrors in unit test. 
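
In outline, the round-trip checked below is:
    sparsecreatecrs(m, n, &ner, &s, ...)    - create CRS matrix S and dense A
    sparsecopytohash / sparsecopytocrs      - produce copy CS in the target format
    sparseset / sparseadd on CS and A       - modify copy and dense reference together
    sparseget(&cs, i, j, ...) vs A[i][j]    - element-by-element comparison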
-- ALGLIB PROJECT -- Copyright 23.07.2012 by Bochkanov Sergey *************************************************************************/ static ae_bool testsparseunit_testconvertsm(ae_state *_state) { ae_frame _frame_block; sparsematrix s; sparsematrix cs; ae_matrix a; ae_int_t m; ae_int_t n; ae_int_t msize; ae_int_t nsize; ae_vector ner; double tmp; ae_int_t i; ae_int_t j; ae_int_t vartf; ae_bool result; ae_frame_make(_state, &_frame_block); _sparsematrix_init(&s, _state); _sparsematrix_init(&cs, _state); ae_matrix_init(&a, 0, 0, DT_REAL, _state); ae_vector_init(&ner, 0, DT_INT, _state); msize = 15; nsize = 15; for(m=1; m<=msize; m++) { for(n=1; n<=nsize; n++) { for(vartf=0; vartf<=2; vartf++) { ae_matrix_set_length(&a, m, n, _state); ae_vector_set_length(&ner, m, _state); for(i=0; i<=m-1; i++) { ner.ptr.p_int[i] = 0; for(j=0; j<=n-1; j++) { if( ae_randominteger(5, _state)==3 ) { ner.ptr.p_int[i] = ner.ptr.p_int[i]+1; a.ptr.pp_double[i][j] = 2*ae_randomreal(_state)-1; } else { a.ptr.pp_double[i][j] = (double)(0); } } } /* * Create sparse matrix */ sparsecreatecrs(m, n, &ner, &s, _state); for(i=0; i<=m-1; i++) { for(j=0; j<=n-1; j++) { if( ae_fp_neq(a.ptr.pp_double[i][j],(double)(0)) ) { a.ptr.pp_double[i][j] = 2*ae_randomreal(_state)-1; sparseset(&s, i, j, a.ptr.pp_double[i][j], _state); } } } /* * Set matrix type(we have to be sure that all formats * converted correctly) */ i = ae_randominteger(2, _state); if( i==0 ) { sparseconverttohash(&s, _state); } if( i==1 ) { sparseconverttocrs(&s, _state); } /* * Start test */ if( vartf==0 ) { sparseconverttohash(&s, _state); sparsecopy(&s, &cs, _state); } if( vartf==1 ) { sparsecopytohash(&s, &cs, _state); } if( vartf==2 ) { sparsecopytocrs(&s, &cs, _state); } /* * Change some elements in row */ if( vartf!=2 ) { for(i=0; i<=m-1; i++) { tmp = 2*ae_randomreal(_state)-1; j = ae_randominteger(n, _state); a.ptr.pp_double[i][j] = tmp; sparseset(&cs, i, j, tmp, _state); tmp = 2*ae_randomreal(_state)-1; j = ae_randominteger(n, _state); a.ptr.pp_double[i][j] = a.ptr.pp_double[i][j]+tmp; sparseadd(&cs, i, j, tmp, _state); } } /* * Check that A is identical to S */ for(i=0; i<=m-1; i++) { for(j=0; j<=n-1; j++) { if( ae_fp_neq(a.ptr.pp_double[i][j],sparseget(&cs, i, j, _state)) ) { result = ae_true; ae_frame_leave(_state); return result; } } } } } } result = ae_false; ae_frame_leave(_state); return result; } /************************************************************************* Test for check/get type functions. The function create sparse matrix, converts it to desired type then check this type. NOTE: Result of the function assigned to variable BasicErrors in unit test. 
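
The checks below rely on the storage type codes returned by
SparseGetMatrixType: 0 for hash-table storage and 1 for CRS storage. The
predicates SparseIsHash/SparseIsCRS must agree with these codes after every
conversion or copy call (SparseConvertToCRS, SparseConvertToHash,
SparseCopyToCRS, SparseCopyToHash).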
-- ALGLIB PROJECT -- Copyright 23.07.2012 by Bochkanov Sergey *************************************************************************/ static ae_bool testsparseunit_testgcmatrixtype(ae_state *_state) { ae_frame _frame_block; sparsematrix s; sparsematrix cs; ae_int_t m; ae_int_t n; ae_int_t msize; ae_int_t nsize; ae_bool result; ae_frame_make(_state, &_frame_block); _sparsematrix_init(&s, _state); _sparsematrix_init(&cs, _state); msize = 5; nsize = 5; for(m=1; m<=msize; m++) { for(n=1; n<=nsize; n++) { sparsecreate(m, n, 1, &s, _state); sparseconverttocrs(&s, _state); if( (sparseishash(&s, _state)||!sparseiscrs(&s, _state))||sparsegetmatrixtype(&s, _state)!=1 ) { result = ae_true; ae_frame_leave(_state); return result; } sparseconverttohash(&s, _state); if( (!sparseishash(&s, _state)||sparseiscrs(&s, _state))||sparsegetmatrixtype(&s, _state)!=0 ) { result = ae_true; ae_frame_leave(_state); return result; } sparsecopytocrs(&s, &cs, _state); if( (sparseishash(&cs, _state)||!sparseiscrs(&cs, _state))||sparsegetmatrixtype(&cs, _state)!=1 ) { result = ae_true; ae_frame_leave(_state); return result; } sparsecopytohash(&cs, &s, _state); if( (!sparseishash(&s, _state)||sparseiscrs(&s, _state))||sparsegetmatrixtype(&s, _state)!=0 ) { result = ae_true; ae_frame_leave(_state); return result; } } } result = ae_false; ae_frame_leave(_state); return result; } void _sparsegenerator_init(void* _p, ae_state *_state) { sparsegenerator *p = (sparsegenerator*)_p; ae_touch_ptr((void*)p); ae_matrix_init(&p->bufa, 0, 0, DT_REAL, _state); _hqrndstate_init(&p->rs, _state); _rcommstate_init(&p->rcs, _state); } void _sparsegenerator_init_copy(void* _dst, void* _src, ae_state *_state) { sparsegenerator *dst = (sparsegenerator*)_dst; sparsegenerator *src = (sparsegenerator*)_src; dst->n = src->n; dst->m = src->m; dst->matkind = src->matkind; dst->triangle = src->triangle; ae_matrix_init_copy(&dst->bufa, &src->bufa, _state); _hqrndstate_init_copy(&dst->rs, &src->rs, _state); _rcommstate_init_copy(&dst->rcs, &src->rcs, _state); } void _sparsegenerator_clear(void* _p) { sparsegenerator *p = (sparsegenerator*)_p; ae_touch_ptr((void*)p); ae_matrix_clear(&p->bufa); _hqrndstate_clear(&p->rs); _rcommstate_clear(&p->rcs); } void _sparsegenerator_destroy(void* _p) { sparsegenerator *p = (sparsegenerator*)_p; ae_touch_ptr((void*)p); ae_matrix_destroy(&p->bufa); _hqrndstate_destroy(&p->rs); _rcommstate_destroy(&p->rcs); } ae_bool testreflections(ae_bool silent, ae_state *_state) { ae_frame _frame_block; ae_int_t i; ae_int_t j; ae_int_t n; ae_int_t m; ae_int_t maxmn; ae_vector x; ae_vector v; ae_vector work; ae_matrix h; ae_matrix a; ae_matrix b; ae_matrix c; double tmp; double beta; double tau; double err; double mer; double mel; double meg; ae_int_t pass; ae_int_t passcount; double threshold; ae_int_t tasktype; double xscale; ae_bool result; ae_frame_make(_state, &_frame_block); ae_vector_init(&x, 0, DT_REAL, _state); ae_vector_init(&v, 0, DT_REAL, _state); ae_vector_init(&work, 0, DT_REAL, _state); ae_matrix_init(&h, 0, 0, DT_REAL, _state); ae_matrix_init(&a, 0, 0, DT_REAL, _state); ae_matrix_init(&b, 0, 0, DT_REAL, _state); ae_matrix_init(&c, 0, 0, DT_REAL, _state); passcount = 10; threshold = 100*ae_machineepsilon; mer = (double)(0); mel = (double)(0); meg = (double)(0); for(pass=1; pass<=passcount; pass++) { for(n=1; n<=10; n++) { for(m=1; m<=10; m++) { /* * Task */ n = 1+ae_randominteger(10, _state); m = 1+ae_randominteger(10, _state); maxmn = ae_maxint(m, n, _state); /* * Initialize */ ae_vector_set_length(&x, maxmn+1, 
_state); ae_vector_set_length(&v, maxmn+1, _state); ae_vector_set_length(&work, maxmn+1, _state); ae_matrix_set_length(&h, maxmn+1, maxmn+1, _state); ae_matrix_set_length(&a, maxmn+1, maxmn+1, _state); ae_matrix_set_length(&b, maxmn+1, maxmn+1, _state); ae_matrix_set_length(&c, maxmn+1, maxmn+1, _state); /* * GenerateReflection, three tasks are possible: * * random X * * zero X * * non-zero X[1], all other are zeros * * random X, near underflow scale * * random X, near overflow scale */ for(tasktype=0; tasktype<=4; tasktype++) { xscale = (double)(1); if( tasktype==0 ) { for(i=1; i<=n; i++) { x.ptr.p_double[i] = 2*ae_randomreal(_state)-1; } } if( tasktype==1 ) { for(i=1; i<=n; i++) { x.ptr.p_double[i] = (double)(0); } } if( tasktype==2 ) { x.ptr.p_double[1] = 2*ae_randomreal(_state)-1; for(i=2; i<=n; i++) { x.ptr.p_double[i] = (double)(0); } } if( tasktype==3 ) { for(i=1; i<=n; i++) { x.ptr.p_double[i] = (ae_randominteger(21, _state)-10)*ae_minrealnumber; } xscale = 10*ae_minrealnumber; } if( tasktype==4 ) { for(i=1; i<=n; i++) { x.ptr.p_double[i] = (2*ae_randomreal(_state)-1)*ae_maxrealnumber; } xscale = ae_maxrealnumber; } ae_v_move(&v.ptr.p_double[1], 1, &x.ptr.p_double[1], 1, ae_v_len(1,n)); generatereflection(&v, n, &tau, _state); beta = v.ptr.p_double[1]; v.ptr.p_double[1] = (double)(1); for(i=1; i<=n; i++) { for(j=1; j<=n; j++) { if( i==j ) { h.ptr.pp_double[i][j] = 1-tau*v.ptr.p_double[i]*v.ptr.p_double[j]; } else { h.ptr.pp_double[i][j] = -tau*v.ptr.p_double[i]*v.ptr.p_double[j]; } } } err = (double)(0); for(i=1; i<=n; i++) { tmp = ae_v_dotproduct(&h.ptr.pp_double[i][1], 1, &x.ptr.p_double[1], 1, ae_v_len(1,n)); if( i==1 ) { err = ae_maxreal(err, ae_fabs(tmp-beta, _state), _state); } else { err = ae_maxreal(err, ae_fabs(tmp, _state), _state); } } meg = ae_maxreal(meg, err/xscale, _state); } /* * ApplyReflectionFromTheLeft */ for(i=1; i<=m; i++) { x.ptr.p_double[i] = 2*ae_randomreal(_state)-1; v.ptr.p_double[i] = x.ptr.p_double[i]; } for(i=1; i<=m; i++) { for(j=1; j<=n; j++) { a.ptr.pp_double[i][j] = 2*ae_randomreal(_state)-1; b.ptr.pp_double[i][j] = a.ptr.pp_double[i][j]; } } generatereflection(&v, m, &tau, _state); beta = v.ptr.p_double[1]; v.ptr.p_double[1] = (double)(1); applyreflectionfromtheleft(&b, tau, &v, 1, m, 1, n, &work, _state); for(i=1; i<=m; i++) { for(j=1; j<=m; j++) { if( i==j ) { h.ptr.pp_double[i][j] = 1-tau*v.ptr.p_double[i]*v.ptr.p_double[j]; } else { h.ptr.pp_double[i][j] = -tau*v.ptr.p_double[i]*v.ptr.p_double[j]; } } } for(i=1; i<=m; i++) { for(j=1; j<=n; j++) { tmp = ae_v_dotproduct(&h.ptr.pp_double[i][1], 1, &a.ptr.pp_double[1][j], a.stride, ae_v_len(1,m)); c.ptr.pp_double[i][j] = tmp; } } err = (double)(0); for(i=1; i<=m; i++) { for(j=1; j<=n; j++) { err = ae_maxreal(err, ae_fabs(b.ptr.pp_double[i][j]-c.ptr.pp_double[i][j], _state), _state); } } mel = ae_maxreal(mel, err, _state); /* * ApplyReflectionFromTheRight */ for(i=1; i<=n; i++) { x.ptr.p_double[i] = 2*ae_randomreal(_state)-1; v.ptr.p_double[i] = x.ptr.p_double[i]; } for(i=1; i<=m; i++) { for(j=1; j<=n; j++) { a.ptr.pp_double[i][j] = 2*ae_randomreal(_state)-1; b.ptr.pp_double[i][j] = a.ptr.pp_double[i][j]; } } generatereflection(&v, n, &tau, _state); beta = v.ptr.p_double[1]; v.ptr.p_double[1] = (double)(1); applyreflectionfromtheright(&b, tau, &v, 1, m, 1, n, &work, _state); for(i=1; i<=n; i++) { for(j=1; j<=n; j++) { if( i==j ) { h.ptr.pp_double[i][j] = 1-tau*v.ptr.p_double[i]*v.ptr.p_double[j]; } else { h.ptr.pp_double[i][j] = -tau*v.ptr.p_double[i]*v.ptr.p_double[j]; } } } for(i=1; i<=m; i++) { 
for(j=1; j<=n; j++) { tmp = ae_v_dotproduct(&a.ptr.pp_double[i][1], 1, &h.ptr.pp_double[1][j], h.stride, ae_v_len(1,n)); c.ptr.pp_double[i][j] = tmp; } } err = (double)(0); for(i=1; i<=m; i++) { for(j=1; j<=n; j++) { err = ae_maxreal(err, ae_fabs(b.ptr.pp_double[i][j]-c.ptr.pp_double[i][j], _state), _state); } } mer = ae_maxreal(mer, err, _state); } } } /* * Overflow crash test */ ae_vector_set_length(&x, 10+1, _state); ae_vector_set_length(&v, 10+1, _state); for(i=1; i<=10; i++) { v.ptr.p_double[i] = ae_maxrealnumber*0.01*(2*ae_randomreal(_state)-1); } generatereflection(&v, 10, &tau, _state); result = (ae_fp_less_eq(meg,threshold)&&ae_fp_less_eq(mel,threshold))&&ae_fp_less_eq(mer,threshold); if( !silent ) { printf("TESTING REFLECTIONS\n"); printf("Pass count is %0d\n", (int)(passcount)); printf("Generate absolute error is %5.3e\n", (double)(meg)); printf("Apply(Left) absolute error is %5.3e\n", (double)(mel)); printf("Apply(Right) absolute error is %5.3e\n", (double)(mer)); printf("Overflow crash test passed\n"); if( result ) { printf("TEST PASSED\n"); } else { printf("TEST FAILED\n"); } } ae_frame_leave(_state); return result; } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. *************************************************************************/ ae_bool _pexec_testreflections(ae_bool silent, ae_state *_state) { return testreflections(silent, _state); } ae_bool testcreflections(ae_bool silent, ae_state *_state) { ae_frame _frame_block; ae_int_t i; ae_int_t j; ae_int_t n; ae_int_t m; ae_int_t maxmn; ae_vector x; ae_vector v; ae_vector work; ae_matrix h; ae_matrix a; ae_matrix b; ae_matrix c; ae_complex tmp; ae_complex beta; ae_complex tau; double err; double mer; double mel; double meg; ae_int_t pass; ae_int_t passcount; ae_bool waserrors; double threshold; ae_bool result; ae_frame_make(_state, &_frame_block); ae_vector_init(&x, 0, DT_COMPLEX, _state); ae_vector_init(&v, 0, DT_COMPLEX, _state); ae_vector_init(&work, 0, DT_COMPLEX, _state); ae_matrix_init(&h, 0, 0, DT_COMPLEX, _state); ae_matrix_init(&a, 0, 0, DT_COMPLEX, _state); ae_matrix_init(&b, 0, 0, DT_COMPLEX, _state); ae_matrix_init(&c, 0, 0, DT_COMPLEX, _state); threshold = 1000*ae_machineepsilon; passcount = 1000; mer = (double)(0); mel = (double)(0); meg = (double)(0); for(pass=1; pass<=passcount; pass++) { /* * Task */ n = 1+ae_randominteger(10, _state); m = 1+ae_randominteger(10, _state); maxmn = ae_maxint(m, n, _state); /* * Initialize */ ae_vector_set_length(&x, maxmn+1, _state); ae_vector_set_length(&v, maxmn+1, _state); ae_vector_set_length(&work, maxmn+1, _state); ae_matrix_set_length(&h, maxmn+1, maxmn+1, _state); ae_matrix_set_length(&a, maxmn+1, maxmn+1, _state); ae_matrix_set_length(&b, maxmn+1, maxmn+1, _state); ae_matrix_set_length(&c, maxmn+1, maxmn+1, _state); /* * GenerateReflection */ for(i=1; i<=n; i++) { x.ptr.p_complex[i].x = 2*ae_randomreal(_state)-1; x.ptr.p_complex[i].y = 2*ae_randomreal(_state)-1; v.ptr.p_complex[i] = x.ptr.p_complex[i]; } complexgeneratereflection(&v, n, &tau, _state); beta = v.ptr.p_complex[1]; v.ptr.p_complex[1] = ae_complex_from_i(1); for(i=1; i<=n; i++) { for(j=1; j<=n; j++) { if( i==j ) { h.ptr.pp_complex[i][j] = ae_c_d_sub(1,ae_c_mul(ae_c_mul(tau,v.ptr.p_complex[i]),ae_c_conj(v.ptr.p_complex[j], _state))); } else { h.ptr.pp_complex[i][j] = ae_c_neg(ae_c_mul(ae_c_mul(tau,v.ptr.p_complex[i]),ae_c_conj(v.ptr.p_complex[j], _state))); } } } err = (double)(0); for(i=1; i<=n; i++) { tmp = 
ae_v_cdotproduct(&h.ptr.pp_complex[1][i], h.stride, "Conj", &x.ptr.p_complex[1], 1, "N", ae_v_len(1,n)); if( i==1 ) { err = ae_maxreal(err, ae_c_abs(ae_c_sub(tmp,beta), _state), _state); } else { err = ae_maxreal(err, ae_c_abs(tmp, _state), _state); } } err = ae_maxreal(err, ae_fabs(beta.y, _state), _state); meg = ae_maxreal(meg, err, _state); /* * ApplyReflectionFromTheLeft */ for(i=1; i<=m; i++) { x.ptr.p_complex[i].x = 2*ae_randomreal(_state)-1; x.ptr.p_complex[i].y = 2*ae_randomreal(_state)-1; v.ptr.p_complex[i] = x.ptr.p_complex[i]; } for(i=1; i<=m; i++) { for(j=1; j<=n; j++) { a.ptr.pp_complex[i][j].x = 2*ae_randomreal(_state)-1; a.ptr.pp_complex[i][j].y = 2*ae_randomreal(_state)-1; b.ptr.pp_complex[i][j] = a.ptr.pp_complex[i][j]; } } complexgeneratereflection(&v, m, &tau, _state); beta = v.ptr.p_complex[1]; v.ptr.p_complex[1] = ae_complex_from_i(1); complexapplyreflectionfromtheleft(&b, tau, &v, 1, m, 1, n, &work, _state); for(i=1; i<=m; i++) { for(j=1; j<=m; j++) { if( i==j ) { h.ptr.pp_complex[i][j] = ae_c_d_sub(1,ae_c_mul(ae_c_mul(tau,v.ptr.p_complex[i]),ae_c_conj(v.ptr.p_complex[j], _state))); } else { h.ptr.pp_complex[i][j] = ae_c_neg(ae_c_mul(ae_c_mul(tau,v.ptr.p_complex[i]),ae_c_conj(v.ptr.p_complex[j], _state))); } } } for(i=1; i<=m; i++) { for(j=1; j<=n; j++) { tmp = ae_v_cdotproduct(&h.ptr.pp_complex[i][1], 1, "N", &a.ptr.pp_complex[1][j], a.stride, "N", ae_v_len(1,m)); c.ptr.pp_complex[i][j] = tmp; } } err = (double)(0); for(i=1; i<=m; i++) { for(j=1; j<=n; j++) { err = ae_maxreal(err, ae_c_abs(ae_c_sub(b.ptr.pp_complex[i][j],c.ptr.pp_complex[i][j]), _state), _state); } } mel = ae_maxreal(mel, err, _state); /* * ApplyReflectionFromTheRight */ for(i=1; i<=n; i++) { x.ptr.p_complex[i] = ae_complex_from_d(2*ae_randomreal(_state)-1); v.ptr.p_complex[i] = x.ptr.p_complex[i]; } for(i=1; i<=m; i++) { for(j=1; j<=n; j++) { a.ptr.pp_complex[i][j] = ae_complex_from_d(2*ae_randomreal(_state)-1); b.ptr.pp_complex[i][j] = a.ptr.pp_complex[i][j]; } } complexgeneratereflection(&v, n, &tau, _state); beta = v.ptr.p_complex[1]; v.ptr.p_complex[1] = ae_complex_from_i(1); complexapplyreflectionfromtheright(&b, tau, &v, 1, m, 1, n, &work, _state); for(i=1; i<=n; i++) { for(j=1; j<=n; j++) { if( i==j ) { h.ptr.pp_complex[i][j] = ae_c_d_sub(1,ae_c_mul(ae_c_mul(tau,v.ptr.p_complex[i]),ae_c_conj(v.ptr.p_complex[j], _state))); } else { h.ptr.pp_complex[i][j] = ae_c_neg(ae_c_mul(ae_c_mul(tau,v.ptr.p_complex[i]),ae_c_conj(v.ptr.p_complex[j], _state))); } } } for(i=1; i<=m; i++) { for(j=1; j<=n; j++) { tmp = ae_v_cdotproduct(&a.ptr.pp_complex[i][1], 1, "N", &h.ptr.pp_complex[1][j], h.stride, "N", ae_v_len(1,n)); c.ptr.pp_complex[i][j] = tmp; } } err = (double)(0); for(i=1; i<=m; i++) { for(j=1; j<=n; j++) { err = ae_maxreal(err, ae_c_abs(ae_c_sub(b.ptr.pp_complex[i][j],c.ptr.pp_complex[i][j]), _state), _state); } } mer = ae_maxreal(mer, err, _state); } /* * Overflow crash test */ ae_vector_set_length(&x, 10+1, _state); ae_vector_set_length(&v, 10+1, _state); for(i=1; i<=10; i++) { v.ptr.p_complex[i] = ae_complex_from_d(ae_maxrealnumber*0.01*(2*ae_randomreal(_state)-1)); } complexgeneratereflection(&v, 10, &tau, _state); /* * report */ waserrors = (ae_fp_greater(meg,threshold)||ae_fp_greater(mel,threshold))||ae_fp_greater(mer,threshold); if( !silent ) { printf("TESTING COMPLEX REFLECTIONS\n"); printf("Generate error: %5.3e\n", (double)(meg)); printf("Apply(L) error: %5.3e\n", (double)(mel)); printf("Apply(R) error: %5.3e\n", (double)(mer)); printf("Threshold: %5.3e\n", (double)(threshold)); 
printf("Overflow crash test: PASSED\n"); if( waserrors ) { printf("TEST FAILED\n"); } else { printf("TEST PASSED\n"); } printf("\n\n"); } result = !waserrors; ae_frame_leave(_state); return result; } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. *************************************************************************/ ae_bool _pexec_testcreflections(ae_bool silent, ae_state *_state) { return testcreflections(silent, _state); } static ae_int_t testmatgenunit_maxsvditerations = 60; static void testmatgenunit_unset2d(/* Real */ ae_matrix* a, ae_state *_state); static void testmatgenunit_unset2dc(/* Complex */ ae_matrix* a, ae_state *_state); static ae_bool testmatgenunit_isspd(/* Real */ ae_matrix* a, ae_int_t n, ae_bool isupper, ae_state *_state); static ae_bool testmatgenunit_ishpd(/* Complex */ ae_matrix* a, ae_int_t n, ae_state *_state); static ae_bool testmatgenunit_testeult(ae_state *_state); static double testmatgenunit_svdcond(/* Real */ ae_matrix* a, ae_int_t n, ae_state *_state); static ae_bool testmatgenunit_obsoletesvddecomposition(/* Real */ ae_matrix* a, ae_int_t m, ae_int_t n, /* Real */ ae_vector* w, /* Real */ ae_matrix* v, ae_state *_state); static double testmatgenunit_extsign(double a, double b, ae_state *_state); static double testmatgenunit_mymax(double a, double b, ae_state *_state); static double testmatgenunit_pythag(double a, double b, ae_state *_state); ae_bool testmatgen(ae_bool silent, ae_state *_state) { ae_frame _frame_block; ae_matrix a; ae_matrix b; ae_matrix u; ae_matrix v; ae_matrix ca; ae_matrix cb; ae_matrix r1; ae_matrix r2; ae_matrix c1; ae_matrix c2; ae_vector w; ae_int_t n; ae_int_t maxn; ae_int_t i; ae_int_t j; ae_int_t pass; ae_int_t passcount; ae_bool waserrors; double cond; double threshold; double vt; ae_complex ct; double minw; double maxw; ae_bool serr; ae_bool herr; ae_bool spderr; ae_bool hpderr; ae_bool rerr; ae_bool cerr; ae_bool eulerr; ae_bool result; ae_frame_make(_state, &_frame_block); ae_matrix_init(&a, 0, 0, DT_REAL, _state); ae_matrix_init(&b, 0, 0, DT_REAL, _state); ae_matrix_init(&u, 0, 0, DT_REAL, _state); ae_matrix_init(&v, 0, 0, DT_REAL, _state); ae_matrix_init(&ca, 0, 0, DT_COMPLEX, _state); ae_matrix_init(&cb, 0, 0, DT_COMPLEX, _state); ae_matrix_init(&r1, 0, 0, DT_REAL, _state); ae_matrix_init(&r2, 0, 0, DT_REAL, _state); ae_matrix_init(&c1, 0, 0, DT_COMPLEX, _state); ae_matrix_init(&c2, 0, 0, DT_COMPLEX, _state); ae_vector_init(&w, 0, DT_REAL, _state); rerr = ae_false; cerr = ae_false; serr = ae_false; herr = ae_false; spderr = ae_false; hpderr = ae_false; eulerr = ae_false; waserrors = ae_false; maxn = 20; passcount = 15; threshold = 1000*ae_machineepsilon; /* * Testing orthogonal */ for(n=1; n<=maxn; n++) { for(pass=1; pass<=passcount; pass++) { ae_matrix_set_length(&r1, n-1+1, 2*n-1+1, _state); ae_matrix_set_length(&r2, 2*n-1+1, n-1+1, _state); ae_matrix_set_length(&c1, n-1+1, 2*n-1+1, _state); ae_matrix_set_length(&c2, 2*n-1+1, n-1+1, _state); /* * Random orthogonal, real */ testmatgenunit_unset2d(&a, _state); testmatgenunit_unset2d(&b, _state); rmatrixrndorthogonal(n, &a, _state); rmatrixrndorthogonal(n, &b, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { /* * orthogonality test */ vt = ae_v_dotproduct(&a.ptr.pp_double[i][0], 1, &a.ptr.pp_double[j][0], 1, ae_v_len(0,n-1)); if( i==j ) { rerr = rerr||ae_fp_greater(ae_fabs(vt-1, _state),threshold); } else { rerr = rerr||ae_fp_greater(ae_fabs(vt, _state),threshold); } 
vt = ae_v_dotproduct(&b.ptr.pp_double[i][0], 1, &b.ptr.pp_double[j][0], 1, ae_v_len(0,n-1)); if( i==j ) { rerr = rerr||ae_fp_greater(ae_fabs(vt-1, _state),threshold); } else { rerr = rerr||ae_fp_greater(ae_fabs(vt, _state),threshold); } /* * test for difference in A and B */ if( n>=2 ) { rerr = rerr||ae_fp_eq(a.ptr.pp_double[i][j],b.ptr.pp_double[i][j]); } } } /* * Random orthogonal, complex */ testmatgenunit_unset2dc(&ca, _state); testmatgenunit_unset2dc(&cb, _state); cmatrixrndorthogonal(n, &ca, _state); cmatrixrndorthogonal(n, &cb, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { /* * orthogonality test */ ct = ae_v_cdotproduct(&ca.ptr.pp_complex[i][0], 1, "N", &ca.ptr.pp_complex[j][0], 1, "Conj", ae_v_len(0,n-1)); if( i==j ) { cerr = cerr||ae_fp_greater(ae_c_abs(ae_c_sub_d(ct,1), _state),threshold); } else { cerr = cerr||ae_fp_greater(ae_c_abs(ct, _state),threshold); } ct = ae_v_cdotproduct(&cb.ptr.pp_complex[i][0], 1, "N", &cb.ptr.pp_complex[j][0], 1, "Conj", ae_v_len(0,n-1)); if( i==j ) { cerr = cerr||ae_fp_greater(ae_c_abs(ae_c_sub_d(ct,1), _state),threshold); } else { cerr = cerr||ae_fp_greater(ae_c_abs(ct, _state),threshold); } /* * test for difference in A and B */ if( n>=2 ) { cerr = cerr||ae_c_eq(ca.ptr.pp_complex[i][j],cb.ptr.pp_complex[i][j]); } } } /* * From the right real tests: * 1. E*Q is orthogonal * 2. Q1<>Q2 (routine result is changing) * 3. (E E)'*Q = (Q' Q')' (correct handling of non-square matrices) */ testmatgenunit_unset2d(&a, _state); testmatgenunit_unset2d(&b, _state); ae_matrix_set_length(&a, n-1+1, n-1+1, _state); ae_matrix_set_length(&b, n-1+1, n-1+1, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { a.ptr.pp_double[i][j] = (double)(0); b.ptr.pp_double[i][j] = (double)(0); } a.ptr.pp_double[i][i] = (double)(1); b.ptr.pp_double[i][i] = (double)(1); } rmatrixrndorthogonalfromtheright(&a, n, n, _state); rmatrixrndorthogonalfromtheright(&b, n, n, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { /* * orthogonality test */ vt = ae_v_dotproduct(&a.ptr.pp_double[i][0], 1, &a.ptr.pp_double[j][0], 1, ae_v_len(0,n-1)); if( i==j ) { rerr = rerr||ae_fp_greater(ae_fabs(vt-1, _state),threshold); } else { rerr = rerr||ae_fp_greater(ae_fabs(vt, _state),threshold); } vt = ae_v_dotproduct(&b.ptr.pp_double[i][0], 1, &b.ptr.pp_double[j][0], 1, ae_v_len(0,n-1)); if( i==j ) { rerr = rerr||ae_fp_greater(ae_fabs(vt-1, _state),threshold); } else { rerr = rerr||ae_fp_greater(ae_fabs(vt, _state),threshold); } /* * test for difference in A and B */ if( n>=2 ) { rerr = rerr||ae_fp_eq(a.ptr.pp_double[i][j],b.ptr.pp_double[i][j]); } } } for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { r2.ptr.pp_double[i][j] = 2*ae_randomreal(_state)-1; r2.ptr.pp_double[i+n][j] = r2.ptr.pp_double[i][j]; } } rmatrixrndorthogonalfromtheright(&r2, 2*n, n, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { rerr = rerr||ae_fp_greater(ae_fabs(r2.ptr.pp_double[i+n][j]-r2.ptr.pp_double[i][j], _state),threshold); } } /* * From the left real tests: * 1. Q*E is orthogonal * 2. Q1<>Q2 (routine result is changing) * 3. 
Q*(E E) = (Q Q) (correct handling of non-square matrices) */ testmatgenunit_unset2d(&a, _state); testmatgenunit_unset2d(&b, _state); ae_matrix_set_length(&a, n-1+1, n-1+1, _state); ae_matrix_set_length(&b, n-1+1, n-1+1, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { a.ptr.pp_double[i][j] = (double)(0); b.ptr.pp_double[i][j] = (double)(0); } a.ptr.pp_double[i][i] = (double)(1); b.ptr.pp_double[i][i] = (double)(1); } rmatrixrndorthogonalfromtheleft(&a, n, n, _state); rmatrixrndorthogonalfromtheleft(&b, n, n, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { /* * orthogonality test */ vt = ae_v_dotproduct(&a.ptr.pp_double[i][0], 1, &a.ptr.pp_double[j][0], 1, ae_v_len(0,n-1)); if( i==j ) { rerr = rerr||ae_fp_greater(ae_fabs(vt-1, _state),threshold); } else { rerr = rerr||ae_fp_greater(ae_fabs(vt, _state),threshold); } vt = ae_v_dotproduct(&b.ptr.pp_double[i][0], 1, &b.ptr.pp_double[j][0], 1, ae_v_len(0,n-1)); if( i==j ) { rerr = rerr||ae_fp_greater(ae_fabs(vt-1, _state),threshold); } else { rerr = rerr||ae_fp_greater(ae_fabs(vt, _state),threshold); } /* * test for difference in A and B */ if( n>=2 ) { rerr = rerr||ae_fp_eq(a.ptr.pp_double[i][j],b.ptr.pp_double[i][j]); } } } for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { r1.ptr.pp_double[i][j] = 2*ae_randomreal(_state)-1; r1.ptr.pp_double[i][j+n] = r1.ptr.pp_double[i][j]; } } rmatrixrndorthogonalfromtheleft(&r1, n, 2*n, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { rerr = rerr||ae_fp_greater(ae_fabs(r1.ptr.pp_double[i][j]-r1.ptr.pp_double[i][j+n], _state),threshold); } } /* * From the right complex tests: * 1. E*Q is orthogonal * 2. Q1<>Q2 (routine result is changing) * 3. (E E)'*Q = (Q' Q')' (correct handling of non-square matrices) */ testmatgenunit_unset2dc(&ca, _state); testmatgenunit_unset2dc(&cb, _state); ae_matrix_set_length(&ca, n-1+1, n-1+1, _state); ae_matrix_set_length(&cb, n-1+1, n-1+1, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { ca.ptr.pp_complex[i][j] = ae_complex_from_i(0); cb.ptr.pp_complex[i][j] = ae_complex_from_i(0); } ca.ptr.pp_complex[i][i] = ae_complex_from_i(1); cb.ptr.pp_complex[i][i] = ae_complex_from_i(1); } cmatrixrndorthogonalfromtheright(&ca, n, n, _state); cmatrixrndorthogonalfromtheright(&cb, n, n, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { /* * orthogonality test */ ct = ae_v_cdotproduct(&ca.ptr.pp_complex[i][0], 1, "N", &ca.ptr.pp_complex[j][0], 1, "Conj", ae_v_len(0,n-1)); if( i==j ) { cerr = cerr||ae_fp_greater(ae_c_abs(ae_c_sub_d(ct,1), _state),threshold); } else { cerr = cerr||ae_fp_greater(ae_c_abs(ct, _state),threshold); } ct = ae_v_cdotproduct(&cb.ptr.pp_complex[i][0], 1, "N", &cb.ptr.pp_complex[j][0], 1, "Conj", ae_v_len(0,n-1)); if( i==j ) { cerr = cerr||ae_fp_greater(ae_c_abs(ae_c_sub_d(ct,1), _state),threshold); } else { cerr = cerr||ae_fp_greater(ae_c_abs(ct, _state),threshold); } /* * test for difference in A and B */ cerr = cerr||ae_c_eq(ca.ptr.pp_complex[i][j],cb.ptr.pp_complex[i][j]); } } for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { c2.ptr.pp_complex[i][j] = ae_complex_from_d(2*ae_randomreal(_state)-1); c2.ptr.pp_complex[i+n][j] = c2.ptr.pp_complex[i][j]; } } cmatrixrndorthogonalfromtheright(&c2, 2*n, n, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { cerr = cerr||ae_fp_greater(ae_c_abs(ae_c_sub(c2.ptr.pp_complex[i+n][j],c2.ptr.pp_complex[i][j]), _state),threshold); } } /* * From the left complex tests: * 1. Q*E is orthogonal * 2. Q1<>Q2 (routine result is changing) * 3. 
Q*(E E) = (Q Q) (correct handling of non-square matrices) */ testmatgenunit_unset2dc(&ca, _state); testmatgenunit_unset2dc(&cb, _state); ae_matrix_set_length(&ca, n-1+1, n-1+1, _state); ae_matrix_set_length(&cb, n-1+1, n-1+1, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { ca.ptr.pp_complex[i][j] = ae_complex_from_i(0); cb.ptr.pp_complex[i][j] = ae_complex_from_i(0); } ca.ptr.pp_complex[i][i] = ae_complex_from_i(1); cb.ptr.pp_complex[i][i] = ae_complex_from_i(1); } cmatrixrndorthogonalfromtheleft(&ca, n, n, _state); cmatrixrndorthogonalfromtheleft(&cb, n, n, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { /* * orthogonality test */ ct = ae_v_cdotproduct(&ca.ptr.pp_complex[i][0], 1, "N", &ca.ptr.pp_complex[j][0], 1, "Conj", ae_v_len(0,n-1)); if( i==j ) { cerr = cerr||ae_fp_greater(ae_c_abs(ae_c_sub_d(ct,1), _state),threshold); } else { cerr = cerr||ae_fp_greater(ae_c_abs(ct, _state),threshold); } ct = ae_v_cdotproduct(&cb.ptr.pp_complex[i][0], 1, "N", &cb.ptr.pp_complex[j][0], 1, "Conj", ae_v_len(0,n-1)); if( i==j ) { cerr = cerr||ae_fp_greater(ae_c_abs(ae_c_sub_d(ct,1), _state),threshold); } else { cerr = cerr||ae_fp_greater(ae_c_abs(ct, _state),threshold); } /* * test for difference in A and B */ cerr = cerr||ae_c_eq(ca.ptr.pp_complex[i][j],cb.ptr.pp_complex[i][j]); } } for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { c1.ptr.pp_complex[i][j] = ae_complex_from_d(2*ae_randomreal(_state)-1); c1.ptr.pp_complex[i][j+n] = c1.ptr.pp_complex[i][j]; } } cmatrixrndorthogonalfromtheleft(&c1, n, 2*n, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { cerr = cerr||ae_fp_greater(ae_c_abs(ae_c_sub(c1.ptr.pp_complex[i][j],c1.ptr.pp_complex[i][j+n]), _state),threshold); } } } } /* * Testing GCond */ for(n=2; n<=maxn; n++) { for(pass=1; pass<=passcount; pass++) { /* * real test */ testmatgenunit_unset2d(&a, _state); cond = ae_exp(ae_log((double)(1000), _state)*ae_randomreal(_state), _state); rmatrixrndcond(n, cond, &a, _state); ae_matrix_set_length(&b, n+1, n+1, _state); for(i=1; i<=n; i++) { for(j=1; j<=n; j++) { b.ptr.pp_double[i][j] = a.ptr.pp_double[i-1][j-1]; } } if( testmatgenunit_obsoletesvddecomposition(&b, n, n, &w, &v, _state) ) { maxw = w.ptr.p_double[1]; minw = w.ptr.p_double[1]; for(i=2; i<=n; i++) { if( ae_fp_greater(w.ptr.p_double[i],maxw) ) { maxw = w.ptr.p_double[i]; } if( ae_fp_less(w.ptr.p_double[i],minw) ) { minw = w.ptr.p_double[i]; } } vt = maxw/minw/cond; if( ae_fp_greater(ae_fabs(ae_log(vt, _state), _state),ae_log(1+threshold, _state)) ) { rerr = ae_true; } } } } /* * Symmetric/SPD * N = 2 .. 
30 */ for(n=2; n<=maxn; n++) { /* * SPD matrices */ for(pass=1; pass<=passcount; pass++) { /* * Generate A */ testmatgenunit_unset2d(&a, _state); cond = ae_exp(ae_log((double)(1000), _state)*ae_randomreal(_state), _state); spdmatrixrndcond(n, cond, &a, _state); /* * test condition number */ spderr = spderr||ae_fp_greater(testmatgenunit_svdcond(&a, n, _state)/cond-1,threshold); /* * test SPD */ spderr = spderr||!testmatgenunit_isspd(&a, n, ae_true, _state); /* * test that A is symmetic */ for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { spderr = spderr||ae_fp_greater(ae_fabs(a.ptr.pp_double[i][j]-a.ptr.pp_double[j][i], _state),threshold); } } /* * test for difference between A and B (subsequent matrix) */ testmatgenunit_unset2d(&b, _state); spdmatrixrndcond(n, cond, &b, _state); if( n>=2 ) { for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { spderr = spderr||ae_fp_eq(a.ptr.pp_double[i][j],b.ptr.pp_double[i][j]); } } } } /* * HPD matrices */ for(pass=1; pass<=passcount; pass++) { /* * Generate A */ testmatgenunit_unset2dc(&ca, _state); cond = ae_exp(ae_log((double)(1000), _state)*ae_randomreal(_state), _state); hpdmatrixrndcond(n, cond, &ca, _state); /* * test HPD */ hpderr = hpderr||!testmatgenunit_ishpd(&ca, n, _state); /* * test that A is Hermitian */ for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { hpderr = hpderr||ae_fp_greater(ae_c_abs(ae_c_sub(ca.ptr.pp_complex[i][j],ae_c_conj(ca.ptr.pp_complex[j][i], _state)), _state),threshold); } } /* * test for difference between A and B (subsequent matrix) */ testmatgenunit_unset2dc(&cb, _state); hpdmatrixrndcond(n, cond, &cb, _state); if( n>=2 ) { for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { hpderr = hpderr||ae_c_eq(ca.ptr.pp_complex[i][j],cb.ptr.pp_complex[i][j]); } } } } /* * Symmetric matrices */ for(pass=1; pass<=passcount; pass++) { /* * test condition number */ testmatgenunit_unset2d(&a, _state); cond = ae_exp(ae_log((double)(1000), _state)*ae_randomreal(_state), _state); smatrixrndcond(n, cond, &a, _state); serr = serr||ae_fp_greater(testmatgenunit_svdcond(&a, n, _state)/cond-1,threshold); /* * test for difference between A and B */ testmatgenunit_unset2d(&b, _state); smatrixrndcond(n, cond, &b, _state); if( n>=2 ) { for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { serr = serr||ae_fp_eq(a.ptr.pp_double[i][j],b.ptr.pp_double[i][j]); } } } } /* * Hermitian matrices */ for(pass=1; pass<=passcount; pass++) { /* * Generate A */ testmatgenunit_unset2dc(&ca, _state); cond = ae_exp(ae_log((double)(1000), _state)*ae_randomreal(_state), _state); hmatrixrndcond(n, cond, &ca, _state); /* * test that A is Hermitian */ for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { herr = herr||ae_fp_greater(ae_c_abs(ae_c_sub(ca.ptr.pp_complex[i][j],ae_c_conj(ca.ptr.pp_complex[j][i], _state)), _state),threshold); } } /* * test for difference between A and B (subsequent matrix) */ testmatgenunit_unset2dc(&cb, _state); hmatrixrndcond(n, cond, &cb, _state); if( n>=2 ) { for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { herr = herr||ae_c_eq(ca.ptr.pp_complex[i][j],cb.ptr.pp_complex[i][j]); } } } } } /* * Test for symmetric matrices */ eulerr = testmatgenunit_testeult(_state); /* * report */ waserrors = (((((rerr||cerr)||serr)||spderr)||herr)||hpderr)||eulerr; if( !silent ) { printf("TESTING MATRIX GENERATOR\n"); printf("REAL TEST: "); if( !rerr ) { printf("OK\n"); } else { printf("FAILED\n"); } printf("COMPLEX TEST: "); if( !cerr ) { printf("OK\n"); } else { printf("FAILED\n"); } printf("SYMMETRIC TEST: "); if( !serr ) { printf("OK\n"); } else { printf("FAILED\n"); } 
printf("HERMITIAN TEST: "); if( !herr ) { printf("OK\n"); } else { printf("FAILED\n"); } printf("SPD TEST: "); if( !spderr ) { printf("OK\n"); } else { printf("FAILED\n"); } printf("HPD TEST: "); if( !hpderr ) { printf("OK\n"); } else { printf("FAILED\n"); } printf("TEST FOR SYMMETRIC MATRICES: "); if( !eulerr ) { printf("OK\n"); } else { printf("FAILED\n"); } if( waserrors ) { printf("TEST FAILED\n"); } else { printf("TEST PASSED\n"); } printf("\n\n"); } result = !waserrors; ae_frame_leave(_state); return result; } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. *************************************************************************/ ae_bool _pexec_testmatgen(ae_bool silent, ae_state *_state) { return testmatgen(silent, _state); } /************************************************************************* Unsets 2D array. *************************************************************************/ static void testmatgenunit_unset2d(/* Real */ ae_matrix* a, ae_state *_state) { ae_matrix_set_length(a, 0+1, 0+1, _state); a->ptr.pp_double[0][0] = 2*ae_randomreal(_state)-1; } /************************************************************************* Unsets 2D array. *************************************************************************/ static void testmatgenunit_unset2dc(/* Complex */ ae_matrix* a, ae_state *_state) { ae_matrix_set_length(a, 0+1, 0+1, _state); a->ptr.pp_complex[0][0] = ae_complex_from_d(2*ae_randomreal(_state)-1); } /************************************************************************* Test whether matrix is SPD *************************************************************************/ static ae_bool testmatgenunit_isspd(/* Real */ ae_matrix* a, ae_int_t n, ae_bool isupper, ae_state *_state) { ae_frame _frame_block; ae_matrix _a; ae_int_t i; ae_int_t j; double ajj; double v; ae_bool result; ae_frame_make(_state, &_frame_block); ae_matrix_init_copy(&_a, a, _state); a = &_a; /* * Test the input parameters. */ ae_assert(n>=0, "Error in SMatrixCholesky: incorrect function arguments", _state); /* * Quick return if possible */ result = ae_true; if( n<=0 ) { ae_frame_leave(_state); return result; } if( isupper ) { /* * Compute the Cholesky factorization A = U'*U. */ for(j=0; j<=n-1; j++) { /* * Compute U(J,J) and test for non-positive-definiteness. */ v = ae_v_dotproduct(&a->ptr.pp_double[0][j], a->stride, &a->ptr.pp_double[0][j], a->stride, ae_v_len(0,j-1)); ajj = a->ptr.pp_double[j][j]-v; if( ae_fp_less_eq(ajj,(double)(0)) ) { result = ae_false; ae_frame_leave(_state); return result; } ajj = ae_sqrt(ajj, _state); a->ptr.pp_double[j][j] = ajj; /* * Compute elements J+1:N of row J. */ if( jptr.pp_double[0][i], a->stride, &a->ptr.pp_double[0][j], a->stride, ae_v_len(0,j-1)); a->ptr.pp_double[j][i] = a->ptr.pp_double[j][i]-v; } v = 1/ajj; ae_v_muld(&a->ptr.pp_double[j][j+1], 1, ae_v_len(j+1,n-1), v); } } } else { /* * Compute the Cholesky factorization A = L*L'. */ for(j=0; j<=n-1; j++) { /* * Compute L(J,J) and test for non-positive-definiteness. */ v = ae_v_dotproduct(&a->ptr.pp_double[j][0], 1, &a->ptr.pp_double[j][0], 1, ae_v_len(0,j-1)); ajj = a->ptr.pp_double[j][j]-v; if( ae_fp_less_eq(ajj,(double)(0)) ) { result = ae_false; ae_frame_leave(_state); return result; } ajj = ae_sqrt(ajj, _state); a->ptr.pp_double[j][j] = ajj; /* * Compute elements J+1:N of column J. 
*/ if( jptr.pp_double[i][0], 1, &a->ptr.pp_double[j][0], 1, ae_v_len(0,j-1)); a->ptr.pp_double[i][j] = a->ptr.pp_double[i][j]-v; } v = 1/ajj; ae_v_muld(&a->ptr.pp_double[j+1][j], a->stride, ae_v_len(j+1,n-1), v); } } } ae_frame_leave(_state); return result; } /************************************************************************* Tests whether A is HPD *************************************************************************/ static ae_bool testmatgenunit_ishpd(/* Complex */ ae_matrix* a, ae_int_t n, ae_state *_state) { ae_frame _frame_block; ae_matrix _a; ae_int_t j; double ajj; ae_complex v; double r; ae_vector t; ae_vector t2; ae_vector t3; ae_int_t i; ae_matrix a1; ae_bool result; ae_frame_make(_state, &_frame_block); ae_matrix_init_copy(&_a, a, _state); a = &_a; ae_vector_init(&t, 0, DT_COMPLEX, _state); ae_vector_init(&t2, 0, DT_COMPLEX, _state); ae_vector_init(&t3, 0, DT_COMPLEX, _state); ae_matrix_init(&a1, 0, 0, DT_COMPLEX, _state); ae_vector_set_length(&t, n-1+1, _state); ae_vector_set_length(&t2, n-1+1, _state); ae_vector_set_length(&t3, n-1+1, _state); result = ae_true; /* * Compute the Cholesky factorization A = U'*U. */ for(j=0; j<=n-1; j++) { /* * Compute U(J,J) and test for non-positive-definiteness. */ v = ae_v_cdotproduct(&a->ptr.pp_complex[0][j], a->stride, "Conj", &a->ptr.pp_complex[0][j], a->stride, "N", ae_v_len(0,j-1)); ajj = ae_c_sub(a->ptr.pp_complex[j][j],v).x; if( ae_fp_less_eq(ajj,(double)(0)) ) { a->ptr.pp_complex[j][j] = ae_complex_from_d(ajj); result = ae_false; ae_frame_leave(_state); return result; } ajj = ae_sqrt(ajj, _state); a->ptr.pp_complex[j][j] = ae_complex_from_d(ajj); /* * Compute elements J+1:N-1 of row J. */ if( jptr.pp_complex[0][j], a->stride, "Conj", ae_v_len(0,j-1)); ae_v_cmove(&t3.ptr.p_complex[j+1], 1, &a->ptr.pp_complex[j][j+1], 1, "N", ae_v_len(j+1,n-1)); for(i=j+1; i<=n-1; i++) { v = ae_v_cdotproduct(&a->ptr.pp_complex[0][i], a->stride, "N", &t2.ptr.p_complex[0], 1, "N", ae_v_len(0,j-1)); t3.ptr.p_complex[i] = ae_c_sub(t3.ptr.p_complex[i],v); } ae_v_cmove(&a->ptr.pp_complex[j][j+1], 1, &t3.ptr.p_complex[j+1], 1, "N", ae_v_len(j+1,n-1)); r = 1/ajj; ae_v_cmuld(&a->ptr.pp_complex[j][j+1], 1, ae_v_len(j+1,n-1), r); } } ae_frame_leave(_state); return result; } /************************************************************************* The function check, that upper triangle from symmetric matrix is equal to lower triangle. 
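Specifically, every matrix produced by the generators exercised below (smatrixrndcond, spdmatrixrndcond, hmatrixrndcond, hpdmatrixrndcond, smatrixrndmultiply, hmatrixrndmultiply) must coincide with its own transpose (conjugate transpose in the complex case) to within 2*machine epsilon; any larger mismatch makes the test fail.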
*************************************************************************/ static ae_bool testmatgenunit_testeult(ae_state *_state) { ae_frame _frame_block; ae_matrix a; ae_matrix b; double c; double range; double eps; ae_int_t n; ae_int_t i; ae_int_t j; ae_bool result; ae_frame_make(_state, &_frame_block); ae_matrix_init(&a, 0, 0, DT_REAL, _state); ae_matrix_init(&b, 0, 0, DT_COMPLEX, _state); eps = 2*ae_machineepsilon; range = 100*(2*ae_randomreal(_state)-1); for(n=1; n<=15; n++) { c = 900*ae_randomreal(_state)+100; /* * Generate symmetric matrix and check it */ smatrixrndcond(n, c, &a, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { if( ae_fp_greater(ae_fabs(a.ptr.pp_double[i][j]-a.ptr.pp_double[j][i], _state),eps) ) { result = ae_true; ae_frame_leave(_state); return result; } } } spdmatrixrndcond(n, c, &a, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { if( ae_fp_greater(ae_fabs(a.ptr.pp_double[i][j]-a.ptr.pp_double[j][i], _state),eps) ) { result = ae_true; ae_frame_leave(_state); return result; } } } hmatrixrndcond(n, c, &b, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { if( ae_fp_greater(ae_fabs(b.ptr.pp_complex[i][j].x-b.ptr.pp_complex[j][i].x, _state),eps)||ae_fp_greater(ae_fabs(b.ptr.pp_complex[i][j].y+b.ptr.pp_complex[j][i].y, _state),eps) ) { result = ae_true; ae_frame_leave(_state); return result; } } } hpdmatrixrndcond(n, c, &b, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { if( ae_fp_greater(ae_fabs(b.ptr.pp_complex[i][j].x-b.ptr.pp_complex[j][i].x, _state),eps)||ae_fp_greater(ae_fabs(b.ptr.pp_complex[i][j].y+b.ptr.pp_complex[j][i].y, _state),eps) ) { result = ae_true; ae_frame_leave(_state); return result; } } } /* * Prepare symmetric matrix with real values */ for(i=0; i<=n-1; i++) { for(j=i; j<=n-1; j++) { a.ptr.pp_double[i][j] = range*(2*ae_randomreal(_state)-1); } } for(i=0; i<=n-2; i++) { for(j=i+1; j<=n-1; j++) { a.ptr.pp_double[j][i] = a.ptr.pp_double[i][j]; } } smatrixrndmultiply(&a, n, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { if( ae_fp_greater(ae_fabs(a.ptr.pp_double[i][j]-a.ptr.pp_double[j][i], _state),eps) ) { result = ae_true; ae_frame_leave(_state); return result; } } } /* * Prepare symmetric matrix with complex values */ for(i=0; i<=n-1; i++) { for(j=i; j<=n-1; j++) { b.ptr.pp_complex[i][j].x = range*(2*ae_randomreal(_state)-1); if( i!=j ) { b.ptr.pp_complex[i][j].y = range*(2*ae_randomreal(_state)-1); } else { b.ptr.pp_complex[i][j].y = (double)(0); } } } for(i=0; i<=n-1; i++) { for(j=i+1; j<=n-1; j++) { b.ptr.pp_complex[i][j].x = b.ptr.pp_complex[j][i].x; b.ptr.pp_complex[i][j].y = -b.ptr.pp_complex[j][i].y; } } hmatrixrndmultiply(&b, n, _state); for(i=0; i<=n-1; i++) { b.ptr.pp_complex[i][i].y = (double)(0); } for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { if( ae_fp_greater(ae_fabs(b.ptr.pp_complex[i][j].x-b.ptr.pp_complex[j][i].x, _state),eps)||ae_fp_greater(ae_fabs(b.ptr.pp_complex[i][j].y+b.ptr.pp_complex[j][i].y, _state),eps) ) { result = ae_true; ae_frame_leave(_state); return result; } } } } result = ae_false; ae_frame_leave(_state); return result; } /************************************************************************* SVD condition number *************************************************************************/ static double testmatgenunit_svdcond(/* Real */ ae_matrix* a, ae_int_t n, ae_state *_state) { ae_frame _frame_block; ae_matrix a1; ae_matrix v; ae_vector w; ae_int_t i; ae_int_t j; double minw; double maxw; double result; ae_frame_make(_state, &_frame_block); ae_matrix_init(&a1, 0, 0, 
DT_REAL, _state); ae_matrix_init(&v, 0, 0, DT_REAL, _state); ae_vector_init(&w, 0, DT_REAL, _state); ae_matrix_set_length(&a1, n+1, n+1, _state); for(i=1; i<=n; i++) { for(j=1; j<=n; j++) { a1.ptr.pp_double[i][j] = a->ptr.pp_double[i-1][j-1]; } } if( !testmatgenunit_obsoletesvddecomposition(&a1, n, n, &w, &v, _state) ) { result = (double)(0); ae_frame_leave(_state); return result; } minw = w.ptr.p_double[1]; maxw = w.ptr.p_double[1]; for(i=2; i<=n; i++) { if( ae_fp_less(w.ptr.p_double[i],minw) ) { minw = w.ptr.p_double[i]; } if( ae_fp_greater(w.ptr.p_double[i],maxw) ) { maxw = w.ptr.p_double[i]; } } result = maxw/minw; ae_frame_leave(_state); return result; } static ae_bool testmatgenunit_obsoletesvddecomposition(/* Real */ ae_matrix* a, ae_int_t m, ae_int_t n, /* Real */ ae_vector* w, /* Real */ ae_matrix* v, ae_state *_state) { ae_frame _frame_block; ae_int_t nm; ae_int_t minmn; ae_int_t l; ae_int_t k; ae_int_t j; ae_int_t jj; ae_int_t its; ae_int_t i; double z; double y; double x; double vscale; double s; double h; double g; double f; double c; double anorm; ae_vector rv1; ae_bool flag; ae_bool result; ae_frame_make(_state, &_frame_block); ae_vector_clear(w); ae_matrix_clear(v); ae_vector_init(&rv1, 0, DT_REAL, _state); ae_vector_set_length(&rv1, n+1, _state); ae_vector_set_length(w, n+1, _state); ae_matrix_set_length(v, n+1, n+1, _state); result = ae_true; if( mptr.pp_double[k][i], _state); } if( ae_fp_neq(vscale,0.0) ) { for(k=i; k<=m; k++) { a->ptr.pp_double[k][i] = a->ptr.pp_double[k][i]/vscale; s = s+a->ptr.pp_double[k][i]*a->ptr.pp_double[k][i]; } f = a->ptr.pp_double[i][i]; g = -testmatgenunit_extsign(ae_sqrt(s, _state), f, _state); h = f*g-s; a->ptr.pp_double[i][i] = f-g; if( i!=n ) { for(j=l; j<=n; j++) { s = 0.0; for(k=i; k<=m; k++) { s = s+a->ptr.pp_double[k][i]*a->ptr.pp_double[k][j]; } f = s/h; for(k=i; k<=m; k++) { a->ptr.pp_double[k][j] = a->ptr.pp_double[k][j]+f*a->ptr.pp_double[k][i]; } } } for(k=i; k<=m; k++) { a->ptr.pp_double[k][i] = vscale*a->ptr.pp_double[k][i]; } } } w->ptr.p_double[i] = vscale*g; g = 0.0; s = 0.0; vscale = 0.0; if( i<=m&&i!=n ) { for(k=l; k<=n; k++) { vscale = vscale+ae_fabs(a->ptr.pp_double[i][k], _state); } if( ae_fp_neq(vscale,0.0) ) { for(k=l; k<=n; k++) { a->ptr.pp_double[i][k] = a->ptr.pp_double[i][k]/vscale; s = s+a->ptr.pp_double[i][k]*a->ptr.pp_double[i][k]; } f = a->ptr.pp_double[i][l]; g = -testmatgenunit_extsign(ae_sqrt(s, _state), f, _state); h = f*g-s; a->ptr.pp_double[i][l] = f-g; for(k=l; k<=n; k++) { rv1.ptr.p_double[k] = a->ptr.pp_double[i][k]/h; } if( i!=m ) { for(j=l; j<=m; j++) { s = 0.0; for(k=l; k<=n; k++) { s = s+a->ptr.pp_double[j][k]*a->ptr.pp_double[i][k]; } for(k=l; k<=n; k++) { a->ptr.pp_double[j][k] = a->ptr.pp_double[j][k]+s*rv1.ptr.p_double[k]; } } } for(k=l; k<=n; k++) { a->ptr.pp_double[i][k] = vscale*a->ptr.pp_double[i][k]; } } } anorm = testmatgenunit_mymax(anorm, ae_fabs(w->ptr.p_double[i], _state)+ae_fabs(rv1.ptr.p_double[i], _state), _state); } for(i=n; i>=1; i--) { if( iptr.pp_double[j][i] = a->ptr.pp_double[i][j]/a->ptr.pp_double[i][l]/g; } for(j=l; j<=n; j++) { s = 0.0; for(k=l; k<=n; k++) { s = s+a->ptr.pp_double[i][k]*v->ptr.pp_double[k][j]; } for(k=l; k<=n; k++) { v->ptr.pp_double[k][j] = v->ptr.pp_double[k][j]+s*v->ptr.pp_double[k][i]; } } } for(j=l; j<=n; j++) { v->ptr.pp_double[i][j] = 0.0; v->ptr.pp_double[j][i] = 0.0; } } v->ptr.pp_double[i][i] = 1.0; g = rv1.ptr.p_double[i]; l = i; } for(i=minmn; i>=1; i--) { l = i+1; g = w->ptr.p_double[i]; if( iptr.pp_double[i][j] = 0.0; } } if( 
ae_fp_neq(g,0.0) ) { g = 1.0/g; if( i!=n ) { for(j=l; j<=n; j++) { s = 0.0; for(k=l; k<=m; k++) { s = s+a->ptr.pp_double[k][i]*a->ptr.pp_double[k][j]; } f = s/a->ptr.pp_double[i][i]*g; for(k=i; k<=m; k++) { a->ptr.pp_double[k][j] = a->ptr.pp_double[k][j]+f*a->ptr.pp_double[k][i]; } } } for(j=i; j<=m; j++) { a->ptr.pp_double[j][i] = a->ptr.pp_double[j][i]*g; } } else { for(j=i; j<=m; j++) { a->ptr.pp_double[j][i] = 0.0; } } a->ptr.pp_double[i][i] = a->ptr.pp_double[i][i]+1.0; } nm = 0; for(k=n; k>=1; k--) { for(its=1; its<=testmatgenunit_maxsvditerations; its++) { flag = ae_true; for(l=k; l>=1; l--) { nm = l-1; if( ae_fp_eq(ae_fabs(rv1.ptr.p_double[l], _state)+anorm,anorm) ) { flag = ae_false; break; } if( ae_fp_eq(ae_fabs(w->ptr.p_double[nm], _state)+anorm,anorm) ) { break; } } if( flag ) { c = 0.0; s = 1.0; for(i=l; i<=k; i++) { f = s*rv1.ptr.p_double[i]; if( ae_fp_neq(ae_fabs(f, _state)+anorm,anorm) ) { g = w->ptr.p_double[i]; h = testmatgenunit_pythag(f, g, _state); w->ptr.p_double[i] = h; h = 1.0/h; c = g*h; s = -f*h; for(j=1; j<=m; j++) { y = a->ptr.pp_double[j][nm]; z = a->ptr.pp_double[j][i]; a->ptr.pp_double[j][nm] = y*c+z*s; a->ptr.pp_double[j][i] = -y*s+z*c; } } } } z = w->ptr.p_double[k]; if( l==k ) { if( ae_fp_less(z,0.0) ) { w->ptr.p_double[k] = -z; for(j=1; j<=n; j++) { v->ptr.pp_double[j][k] = -v->ptr.pp_double[j][k]; } } break; } if( its==testmatgenunit_maxsvditerations ) { result = ae_false; ae_frame_leave(_state); return result; } x = w->ptr.p_double[l]; nm = k-1; y = w->ptr.p_double[nm]; g = rv1.ptr.p_double[nm]; h = rv1.ptr.p_double[k]; f = ((y-z)*(y+z)+(g-h)*(g+h))/(2.0*h*y); g = testmatgenunit_pythag(f, (double)(1), _state); f = ((x-z)*(x+z)+h*(y/(f+testmatgenunit_extsign(g, f, _state))-h))/x; c = 1.0; s = 1.0; for(j=l; j<=nm; j++) { i = j+1; g = rv1.ptr.p_double[i]; y = w->ptr.p_double[i]; h = s*g; g = c*g; z = testmatgenunit_pythag(f, h, _state); rv1.ptr.p_double[j] = z; c = f/z; s = h/z; f = x*c+g*s; g = -x*s+g*c; h = y*s; y = y*c; for(jj=1; jj<=n; jj++) { x = v->ptr.pp_double[jj][j]; z = v->ptr.pp_double[jj][i]; v->ptr.pp_double[jj][j] = x*c+z*s; v->ptr.pp_double[jj][i] = -x*s+z*c; } z = testmatgenunit_pythag(f, h, _state); w->ptr.p_double[j] = z; if( ae_fp_neq(z,0.0) ) { z = 1.0/z; c = f*z; s = h*z; } f = c*g+s*y; x = -s*g+c*y; for(jj=1; jj<=m; jj++) { y = a->ptr.pp_double[jj][j]; z = a->ptr.pp_double[jj][i]; a->ptr.pp_double[jj][j] = y*c+z*s; a->ptr.pp_double[jj][i] = -y*s+z*c; } } rv1.ptr.p_double[l] = 0.0; rv1.ptr.p_double[k] = f; w->ptr.p_double[k] = x; } } ae_frame_leave(_state); return result; } static double testmatgenunit_extsign(double a, double b, ae_state *_state) { double result; if( ae_fp_greater_eq(b,(double)(0)) ) { result = ae_fabs(a, _state); } else { result = -ae_fabs(a, _state); } return result; } static double testmatgenunit_mymax(double a, double b, ae_state *_state) { double result; if( ae_fp_greater(a,b) ) { result = a; } else { result = b; } return result; } static double testmatgenunit_pythag(double a, double b, ae_state *_state) { double result; if( ae_fp_less(ae_fabs(a, _state),ae_fabs(b, _state)) ) { result = ae_fabs(b, _state)*ae_sqrt(1+ae_sqr(a/b, _state), _state); } else { result = ae_fabs(a, _state)*ae_sqrt(1+ae_sqr(b/a, _state), _state); } return result; } static ae_bool testablasunit_testtrsm(ae_int_t minn, ae_int_t maxn, ae_state *_state); static ae_bool testablasunit_testsyrk(ae_int_t minn, ae_int_t maxn, ae_state *_state); static ae_bool testablasunit_testgemm(ae_int_t minn, ae_int_t maxn, ae_state *_state); static ae_bool 
testablasunit_testtrans(ae_int_t minn, ae_int_t maxn, ae_state *_state); static ae_bool testablasunit_testrank1(ae_int_t minn, ae_int_t maxn, ae_state *_state); static ae_bool testablasunit_testmv(ae_int_t minn, ae_int_t maxn, ae_state *_state); static void testablasunit_spectest(ae_bool* errorflag, ae_state *_state); static ae_bool testablasunit_testcopy(ae_int_t minn, ae_int_t maxn, ae_state *_state); static void testablasunit_refcmatrixrighttrsm(ae_int_t m, ae_int_t n, /* Complex */ ae_matrix* a, ae_int_t i1, ae_int_t j1, ae_bool isupper, ae_bool isunit, ae_int_t optype, /* Complex */ ae_matrix* x, ae_int_t i2, ae_int_t j2, ae_state *_state); static void testablasunit_refcmatrixlefttrsm(ae_int_t m, ae_int_t n, /* Complex */ ae_matrix* a, ae_int_t i1, ae_int_t j1, ae_bool isupper, ae_bool isunit, ae_int_t optype, /* Complex */ ae_matrix* x, ae_int_t i2, ae_int_t j2, ae_state *_state); static void testablasunit_refrmatrixrighttrsm(ae_int_t m, ae_int_t n, /* Real */ ae_matrix* a, ae_int_t i1, ae_int_t j1, ae_bool isupper, ae_bool isunit, ae_int_t optype, /* Real */ ae_matrix* x, ae_int_t i2, ae_int_t j2, ae_state *_state); static void testablasunit_refrmatrixlefttrsm(ae_int_t m, ae_int_t n, /* Real */ ae_matrix* a, ae_int_t i1, ae_int_t j1, ae_bool isupper, ae_bool isunit, ae_int_t optype, /* Real */ ae_matrix* x, ae_int_t i2, ae_int_t j2, ae_state *_state); static ae_bool testablasunit_internalcmatrixtrinverse(/* Complex */ ae_matrix* a, ae_int_t n, ae_bool isupper, ae_bool isunittriangular, ae_state *_state); static ae_bool testablasunit_internalrmatrixtrinverse(/* Real */ ae_matrix* a, ae_int_t n, ae_bool isupper, ae_bool isunittriangular, ae_state *_state); static void testablasunit_refcmatrixherk(ae_int_t n, ae_int_t k, double alpha, /* Complex */ ae_matrix* a, ae_int_t ia, ae_int_t ja, ae_int_t optypea, double beta, /* Complex */ ae_matrix* c, ae_int_t ic, ae_int_t jc, ae_bool isupper, ae_state *_state); static void testablasunit_refrmatrixsyrk(ae_int_t n, ae_int_t k, double alpha, /* Real */ ae_matrix* a, ae_int_t ia, ae_int_t ja, ae_int_t optypea, double beta, /* Real */ ae_matrix* c, ae_int_t ic, ae_int_t jc, ae_bool isupper, ae_state *_state); static void testablasunit_refcmatrixgemm(ae_int_t m, ae_int_t n, ae_int_t k, ae_complex alpha, /* Complex */ ae_matrix* a, ae_int_t ia, ae_int_t ja, ae_int_t optypea, /* Complex */ ae_matrix* b, ae_int_t ib, ae_int_t jb, ae_int_t optypeb, ae_complex beta, /* Complex */ ae_matrix* c, ae_int_t ic, ae_int_t jc, ae_state *_state); static void testablasunit_refrmatrixgemm(ae_int_t m, ae_int_t n, ae_int_t k, double alpha, /* Real */ ae_matrix* a, ae_int_t ia, ae_int_t ja, ae_int_t optypea, /* Real */ ae_matrix* b, ae_int_t ib, ae_int_t jb, ae_int_t optypeb, double beta, /* Real */ ae_matrix* c, ae_int_t ic, ae_int_t jc, ae_state *_state); ae_bool testablas(ae_bool silent, ae_state *_state) { ae_frame _frame_block; ae_bool trsmerrors; ae_bool syrkerrors; ae_bool gemmerrors; ae_bool transerrors; ae_bool rank1errors; ae_bool mverrors; ae_bool copyerrors; ae_bool specerrors; ae_bool waserrors; ae_matrix ra; ae_bool result; ae_frame_make(_state, &_frame_block); ae_matrix_init(&ra, 0, 0, DT_REAL, _state); trsmerrors = ae_false; syrkerrors = ae_false; gemmerrors = ae_false; transerrors = ae_false; rank1errors = ae_false; mverrors = ae_false; copyerrors = ae_false; waserrors = ae_false; specerrors = ae_false; testablasunit_spectest(&specerrors, _state); trsmerrors = trsmerrors||testablasunit_testtrsm(1, 3*ablasblocksize(&ra, _state)+1, _state); 
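/*
 * Note: the TRSM/SYRK/GEMM sub-tests compare the optimized ALGLIB kernels against
 * straightforward reference implementations (testablasunit_ref*), while the remaining
 * sub-tests check results element by element against directly constructed expected values.
 * Problem sizes run from 1 up to 3*ablasblocksize()+1, so sizes both below and above the
 * internal block size are exercised; GEMM is re-checked below for sizes near 8*ablasblocksize().
 */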
syrkerrors = syrkerrors||testablasunit_testsyrk(1, 3*ablasblocksize(&ra, _state)+1, _state); gemmerrors = gemmerrors||testablasunit_testgemm(1, 3*ablasblocksize(&ra, _state)+1, _state); transerrors = transerrors||testablasunit_testtrans(1, 3*ablasblocksize(&ra, _state)+1, _state); rank1errors = rank1errors||testablasunit_testrank1(1, 3*ablasblocksize(&ra, _state)+1, _state); mverrors = mverrors||testablasunit_testmv(1, 3*ablasblocksize(&ra, _state)+1, _state); copyerrors = copyerrors||testablasunit_testcopy(1, 3*ablasblocksize(&ra, _state)+1, _state); gemmerrors = gemmerrors||testablasunit_testgemm(8*ablasblocksize(&ra, _state)-1, 8*ablasblocksize(&ra, _state)+1, _state); /* * report */ waserrors = ((((((trsmerrors||syrkerrors)||gemmerrors)||transerrors)||rank1errors)||mverrors)||copyerrors)||specerrors; if( !silent ) { printf("TESTING ABLAS\n"); printf("* TRSM: "); if( trsmerrors ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("* SYRK: "); if( syrkerrors ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("* GEMM: "); if( gemmerrors ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("* TRANS: "); if( transerrors ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("* RANK1: "); if( rank1errors ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("* MV: "); if( mverrors ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("* COPY: "); if( copyerrors ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("* SPECIAL TESTS: "); if( specerrors ) { printf("FAILED\n"); } else { printf("OK\n"); } if( waserrors ) { printf("TEST FAILED\n"); } else { printf("TEST PASSED\n"); } printf("\n\n"); } result = !waserrors; ae_frame_leave(_state); return result; } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. 
*************************************************************************/ ae_bool _pexec_testablas(ae_bool silent, ae_state *_state) { return testablas(silent, _state); } /************************************************************************* ?Matrix????TRSM tests Returns False for passed test, True - for failed *************************************************************************/ static ae_bool testablasunit_testtrsm(ae_int_t minn, ae_int_t maxn, ae_state *_state) { ae_frame _frame_block; ae_int_t n; ae_int_t m; ae_int_t mx; ae_int_t i; ae_int_t j; ae_int_t optype; ae_int_t uppertype; ae_int_t unittype; ae_int_t xoffsi; ae_int_t xoffsj; ae_int_t aoffsitype; ae_int_t aoffsjtype; ae_int_t aoffsi; ae_int_t aoffsj; ae_matrix refra; ae_matrix refrxl; ae_matrix refrxr; ae_matrix refca; ae_matrix refcxl; ae_matrix refcxr; ae_matrix ra; ae_matrix ca; ae_matrix rxr1; ae_matrix rxl1; ae_matrix cxr1; ae_matrix cxl1; ae_matrix rxr2; ae_matrix rxl2; ae_matrix cxr2; ae_matrix cxl2; double threshold; ae_bool result; ae_frame_make(_state, &_frame_block); ae_matrix_init(&refra, 0, 0, DT_REAL, _state); ae_matrix_init(&refrxl, 0, 0, DT_REAL, _state); ae_matrix_init(&refrxr, 0, 0, DT_REAL, _state); ae_matrix_init(&refca, 0, 0, DT_COMPLEX, _state); ae_matrix_init(&refcxl, 0, 0, DT_COMPLEX, _state); ae_matrix_init(&refcxr, 0, 0, DT_COMPLEX, _state); ae_matrix_init(&ra, 0, 0, DT_REAL, _state); ae_matrix_init(&ca, 0, 0, DT_COMPLEX, _state); ae_matrix_init(&rxr1, 0, 0, DT_REAL, _state); ae_matrix_init(&rxl1, 0, 0, DT_REAL, _state); ae_matrix_init(&cxr1, 0, 0, DT_COMPLEX, _state); ae_matrix_init(&cxl1, 0, 0, DT_COMPLEX, _state); ae_matrix_init(&rxr2, 0, 0, DT_REAL, _state); ae_matrix_init(&rxl2, 0, 0, DT_REAL, _state); ae_matrix_init(&cxr2, 0, 0, DT_COMPLEX, _state); ae_matrix_init(&cxl2, 0, 0, DT_COMPLEX, _state); threshold = ae_sqr((double)(maxn), _state)*100*ae_machineepsilon; result = ae_false; for(mx=minn; mx<=maxn; mx++) { /* * Select random M/N in [1,MX] such that max(M,N)=MX */ m = 1+ae_randominteger(mx, _state); n = 1+ae_randominteger(mx, _state); if( ae_fp_greater(ae_randomreal(_state),0.5) ) { m = mx; } else { n = mx; } /* * Initialize RefRA/RefCA by random matrices whose upper * and lower triangle submatrices are non-degenerate * well-conditioned matrices. 
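* (Diagonal dominance is used for this: the code below gives each diagonal entry a magnitude of roughly 2*M while keeping off-diagonal entries within [-0.1,0.1], so both triangles are far from singular.)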
* * Matrix size is 2Mx2M (four copies of same MxM matrix * to test different offsets) */ ae_matrix_set_length(&refra, 2*m, 2*m, _state); for(i=0; i<=m-1; i++) { for(j=0; j<=m-1; j++) { refra.ptr.pp_double[i][j] = 0.2*ae_randomreal(_state)-0.1; } } for(i=0; i<=m-1; i++) { refra.ptr.pp_double[i][i] = (2*ae_randominteger(1, _state)-1)*(2*m+ae_randomreal(_state)); } for(i=0; i<=m-1; i++) { for(j=0; j<=m-1; j++) { refra.ptr.pp_double[i+m][j] = refra.ptr.pp_double[i][j]; refra.ptr.pp_double[i][j+m] = refra.ptr.pp_double[i][j]; refra.ptr.pp_double[i+m][j+m] = refra.ptr.pp_double[i][j]; } } ae_matrix_set_length(&refca, 2*m, 2*m, _state); for(i=0; i<=m-1; i++) { for(j=0; j<=m-1; j++) { refca.ptr.pp_complex[i][j].x = 0.2*ae_randomreal(_state)-0.1; refca.ptr.pp_complex[i][j].y = 0.2*ae_randomreal(_state)-0.1; } } for(i=0; i<=m-1; i++) { refca.ptr.pp_complex[i][i].x = (2*ae_randominteger(2, _state)-1)*(2*m+ae_randomreal(_state)); refca.ptr.pp_complex[i][i].y = (2*ae_randominteger(2, _state)-1)*(2*m+ae_randomreal(_state)); } for(i=0; i<=m-1; i++) { for(j=0; j<=m-1; j++) { refca.ptr.pp_complex[i+m][j] = refca.ptr.pp_complex[i][j]; refca.ptr.pp_complex[i][j+m] = refca.ptr.pp_complex[i][j]; refca.ptr.pp_complex[i+m][j+m] = refca.ptr.pp_complex[i][j]; } } /* * Generate random XL/XR. * * XR is NxM matrix (matrix for 'Right' subroutines) * XL is MxN matrix (matrix for 'Left' subroutines) */ ae_matrix_set_length(&refrxr, n, m, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=m-1; j++) { refrxr.ptr.pp_double[i][j] = 2*ae_randomreal(_state)-1; } } ae_matrix_set_length(&refrxl, m, n, _state); for(i=0; i<=m-1; i++) { for(j=0; j<=n-1; j++) { refrxl.ptr.pp_double[i][j] = 2*ae_randomreal(_state)-1; } } ae_matrix_set_length(&refcxr, n, m, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=m-1; j++) { refcxr.ptr.pp_complex[i][j].x = 2*ae_randomreal(_state)-1; refcxr.ptr.pp_complex[i][j].y = 2*ae_randomreal(_state)-1; } } ae_matrix_set_length(&refcxl, m, n, _state); for(i=0; i<=m-1; i++) { for(j=0; j<=n-1; j++) { refcxl.ptr.pp_complex[i][j].x = 2*ae_randomreal(_state)-1; refcxl.ptr.pp_complex[i][j].y = 2*ae_randomreal(_state)-1; } } /* * test different types of operations, offsets, and so on... * * to avoid unnecessary slowdown we don't test ALL possible * combinations of operation types. We just generate one random * set of parameters and test it. 
*/ ae_matrix_set_length(&ra, 2*m, 2*m, _state); ae_matrix_set_length(&rxr1, n, m, _state); ae_matrix_set_length(&rxr2, n, m, _state); ae_matrix_set_length(&rxl1, m, n, _state); ae_matrix_set_length(&rxl2, m, n, _state); ae_matrix_set_length(&ca, 2*m, 2*m, _state); ae_matrix_set_length(&cxr1, n, m, _state); ae_matrix_set_length(&cxr2, n, m, _state); ae_matrix_set_length(&cxl1, m, n, _state); ae_matrix_set_length(&cxl2, m, n, _state); optype = ae_randominteger(3, _state); uppertype = ae_randominteger(2, _state); unittype = ae_randominteger(2, _state); xoffsi = ae_randominteger(2, _state); xoffsj = ae_randominteger(2, _state); aoffsitype = ae_randominteger(2, _state); aoffsjtype = ae_randominteger(2, _state); aoffsi = m*aoffsitype; aoffsj = m*aoffsjtype; /* * copy A, XR, XL (fill unused parts with random garbage) */ for(i=0; i<=2*m-1; i++) { for(j=0; j<=2*m-1; j++) { if( ((i>=aoffsi&&i=aoffsj)&&j=xoffsi&&j>=xoffsj ) { cxr1.ptr.pp_complex[i][j] = refcxr.ptr.pp_complex[i][j]; cxr2.ptr.pp_complex[i][j] = refcxr.ptr.pp_complex[i][j]; rxr1.ptr.pp_double[i][j] = refrxr.ptr.pp_double[i][j]; rxr2.ptr.pp_double[i][j] = refrxr.ptr.pp_double[i][j]; } else { cxr1.ptr.pp_complex[i][j] = ae_complex_from_d(ae_randomreal(_state)); cxr2.ptr.pp_complex[i][j] = cxr1.ptr.pp_complex[i][j]; rxr1.ptr.pp_double[i][j] = ae_randomreal(_state); rxr2.ptr.pp_double[i][j] = rxr1.ptr.pp_double[i][j]; } } } for(i=0; i<=m-1; i++) { for(j=0; j<=n-1; j++) { if( i>=xoffsi&&j>=xoffsj ) { cxl1.ptr.pp_complex[i][j] = refcxl.ptr.pp_complex[i][j]; cxl2.ptr.pp_complex[i][j] = refcxl.ptr.pp_complex[i][j]; rxl1.ptr.pp_double[i][j] = refrxl.ptr.pp_double[i][j]; rxl2.ptr.pp_double[i][j] = refrxl.ptr.pp_double[i][j]; } else { cxl1.ptr.pp_complex[i][j] = ae_complex_from_d(ae_randomreal(_state)); cxl2.ptr.pp_complex[i][j] = cxl1.ptr.pp_complex[i][j]; rxl1.ptr.pp_double[i][j] = ae_randomreal(_state); rxl2.ptr.pp_double[i][j] = rxl1.ptr.pp_double[i][j]; } } } /* * Test CXR */ cmatrixrighttrsm(n-xoffsi, m-xoffsj, &ca, aoffsi, aoffsj, uppertype==0, unittype==0, optype, &cxr1, xoffsi, xoffsj, _state); testablasunit_refcmatrixrighttrsm(n-xoffsi, m-xoffsj, &ca, aoffsi, aoffsj, uppertype==0, unittype==0, optype, &cxr2, xoffsi, xoffsj, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=m-1; j++) { result = result||ae_fp_greater(ae_c_abs(ae_c_sub(cxr1.ptr.pp_complex[i][j],cxr2.ptr.pp_complex[i][j]), _state),threshold); } } /* * Test CXL */ cmatrixlefttrsm(m-xoffsi, n-xoffsj, &ca, aoffsi, aoffsj, uppertype==0, unittype==0, optype, &cxl1, xoffsi, xoffsj, _state); testablasunit_refcmatrixlefttrsm(m-xoffsi, n-xoffsj, &ca, aoffsi, aoffsj, uppertype==0, unittype==0, optype, &cxl2, xoffsi, xoffsj, _state); for(i=0; i<=m-1; i++) { for(j=0; j<=n-1; j++) { result = result||ae_fp_greater(ae_c_abs(ae_c_sub(cxl1.ptr.pp_complex[i][j],cxl2.ptr.pp_complex[i][j]), _state),threshold); } } if( optype<2 ) { /* * Test RXR */ rmatrixrighttrsm(n-xoffsi, m-xoffsj, &ra, aoffsi, aoffsj, uppertype==0, unittype==0, optype, &rxr1, xoffsi, xoffsj, _state); testablasunit_refrmatrixrighttrsm(n-xoffsi, m-xoffsj, &ra, aoffsi, aoffsj, uppertype==0, unittype==0, optype, &rxr2, xoffsi, xoffsj, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=m-1; j++) { result = result||ae_fp_greater(ae_fabs(rxr1.ptr.pp_double[i][j]-rxr2.ptr.pp_double[i][j], _state),threshold); } } /* * Test RXL */ rmatrixlefttrsm(m-xoffsi, n-xoffsj, &ra, aoffsi, aoffsj, uppertype==0, unittype==0, optype, &rxl1, xoffsi, xoffsj, _state); testablasunit_refrmatrixlefttrsm(m-xoffsi, n-xoffsj, &ra, aoffsi, aoffsj, uppertype==0, 
unittype==0, optype, &rxl2, xoffsi, xoffsj, _state); for(i=0; i<=m-1; i++) { for(j=0; j<=n-1; j++) { result = result||ae_fp_greater(ae_fabs(rxl1.ptr.pp_double[i][j]-rxl2.ptr.pp_double[i][j], _state),threshold); } } } } ae_frame_leave(_state); return result; } /************************************************************************* SYRK tests Returns False for passed test, True - for failed *************************************************************************/ static ae_bool testablasunit_testsyrk(ae_int_t minn, ae_int_t maxn, ae_state *_state) { ae_frame _frame_block; ae_int_t n; ae_int_t k; ae_int_t mx; ae_int_t i; ae_int_t j; ae_int_t uppertype; ae_int_t xoffsi; ae_int_t xoffsj; ae_int_t aoffsitype; ae_int_t aoffsjtype; ae_int_t aoffsi; ae_int_t aoffsj; ae_int_t alphatype; ae_int_t betatype; ae_matrix refra; ae_matrix refrc; ae_matrix refca; ae_matrix refcc; double alpha; double beta; ae_matrix ra1; ae_matrix ra2; ae_matrix ca1; ae_matrix ca2; ae_matrix rc; ae_matrix rct; ae_matrix cc; ae_matrix cct; double threshold; ae_bool result; ae_frame_make(_state, &_frame_block); ae_matrix_init(&refra, 0, 0, DT_REAL, _state); ae_matrix_init(&refrc, 0, 0, DT_REAL, _state); ae_matrix_init(&refca, 0, 0, DT_COMPLEX, _state); ae_matrix_init(&refcc, 0, 0, DT_COMPLEX, _state); ae_matrix_init(&ra1, 0, 0, DT_REAL, _state); ae_matrix_init(&ra2, 0, 0, DT_REAL, _state); ae_matrix_init(&ca1, 0, 0, DT_COMPLEX, _state); ae_matrix_init(&ca2, 0, 0, DT_COMPLEX, _state); ae_matrix_init(&rc, 0, 0, DT_REAL, _state); ae_matrix_init(&rct, 0, 0, DT_REAL, _state); ae_matrix_init(&cc, 0, 0, DT_COMPLEX, _state); ae_matrix_init(&cct, 0, 0, DT_COMPLEX, _state); threshold = maxn*100*ae_machineepsilon; result = ae_false; for(mx=minn; mx<=maxn; mx++) { /* * Select random M/N in [1,MX] such that max(M,N)=MX */ k = 1+ae_randominteger(mx, _state); n = 1+ae_randominteger(mx, _state); if( ae_fp_greater(ae_randomreal(_state),0.5) ) { k = mx; } else { n = mx; } /* * Initialize RefRA/RefCA by random Hermitian matrices, * RefRC/RefCC by random matrices * * RA/CA size is 2Nx2N (four copies of same NxN matrix * to test different offsets) */ ae_matrix_set_length(&refra, 2*n, 2*n, _state); ae_matrix_set_length(&refca, 2*n, 2*n, _state); for(i=0; i<=n-1; i++) { refra.ptr.pp_double[i][i] = 2*ae_randomreal(_state)-1; refca.ptr.pp_complex[i][i] = ae_complex_from_d(2*ae_randomreal(_state)-1); for(j=i+1; j<=n-1; j++) { refra.ptr.pp_double[i][j] = 2*ae_randomreal(_state)-1; refca.ptr.pp_complex[i][j].x = 2*ae_randomreal(_state)-1; refca.ptr.pp_complex[i][j].y = 2*ae_randomreal(_state)-1; refra.ptr.pp_double[j][i] = refra.ptr.pp_double[i][j]; refca.ptr.pp_complex[j][i] = ae_c_conj(refca.ptr.pp_complex[i][j], _state); } } for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { refra.ptr.pp_double[i+n][j] = refra.ptr.pp_double[i][j]; refra.ptr.pp_double[i][j+n] = refra.ptr.pp_double[i][j]; refra.ptr.pp_double[i+n][j+n] = refra.ptr.pp_double[i][j]; refca.ptr.pp_complex[i+n][j] = refca.ptr.pp_complex[i][j]; refca.ptr.pp_complex[i][j+n] = refca.ptr.pp_complex[i][j]; refca.ptr.pp_complex[i+n][j+n] = refca.ptr.pp_complex[i][j]; } } ae_matrix_set_length(&refrc, n, k, _state); ae_matrix_set_length(&refcc, n, k, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=k-1; j++) { refrc.ptr.pp_double[i][j] = 2*ae_randomreal(_state)-1; refcc.ptr.pp_complex[i][j].x = 2*ae_randomreal(_state)-1; refcc.ptr.pp_complex[i][j].y = 2*ae_randomreal(_state)-1; } } /* * test different types of operations, offsets, and so on... 
* * to avoid unnecessary slowdown we don't test ALL possible * combinations of operation types. We just generate one random * set of parameters and test it. */ ae_matrix_set_length(&ra1, 2*n, 2*n, _state); ae_matrix_set_length(&ra2, 2*n, 2*n, _state); ae_matrix_set_length(&ca1, 2*n, 2*n, _state); ae_matrix_set_length(&ca2, 2*n, 2*n, _state); ae_matrix_set_length(&rc, n, k, _state); ae_matrix_set_length(&rct, k, n, _state); ae_matrix_set_length(&cc, n, k, _state); ae_matrix_set_length(&cct, k, n, _state); uppertype = ae_randominteger(2, _state); xoffsi = ae_randominteger(2, _state); xoffsj = ae_randominteger(2, _state); aoffsitype = ae_randominteger(2, _state); aoffsjtype = ae_randominteger(2, _state); alphatype = ae_randominteger(2, _state); betatype = ae_randominteger(2, _state); aoffsi = n*aoffsitype; aoffsj = n*aoffsjtype; alpha = alphatype*(2*ae_randomreal(_state)-1); beta = betatype*(2*ae_randomreal(_state)-1); /* * copy A, C (fill unused parts with random garbage) */ for(i=0; i<=2*n-1; i++) { for(j=0; j<=2*n-1; j++) { if( ((i>=aoffsi&&i=aoffsj)&&j=xoffsi&&j>=xoffsj ) { rc.ptr.pp_double[i][j] = refrc.ptr.pp_double[i][j]; rct.ptr.pp_double[j][i] = refrc.ptr.pp_double[i][j]; cc.ptr.pp_complex[i][j] = refcc.ptr.pp_complex[i][j]; cct.ptr.pp_complex[j][i] = refcc.ptr.pp_complex[i][j]; } else { rc.ptr.pp_double[i][j] = ae_randomreal(_state); rct.ptr.pp_double[j][i] = rc.ptr.pp_double[i][j]; cc.ptr.pp_complex[i][j] = ae_complex_from_d(ae_randomreal(_state)); cct.ptr.pp_complex[j][i] = cct.ptr.pp_complex[j][i]; } } } /* * Test complex * Only one of transform types is selected and tested */ if( ae_fp_greater(ae_randomreal(_state),0.5) ) { cmatrixherk(n-xoffsi, k-xoffsj, alpha, &cc, xoffsi, xoffsj, 0, beta, &ca1, aoffsi, aoffsj, uppertype==0, _state); testablasunit_refcmatrixherk(n-xoffsi, k-xoffsj, alpha, &cc, xoffsi, xoffsj, 0, beta, &ca2, aoffsi, aoffsj, uppertype==0, _state); } else { cmatrixherk(n-xoffsi, k-xoffsj, alpha, &cct, xoffsj, xoffsi, 2, beta, &ca1, aoffsi, aoffsj, uppertype==0, _state); testablasunit_refcmatrixherk(n-xoffsi, k-xoffsj, alpha, &cct, xoffsj, xoffsi, 2, beta, &ca2, aoffsi, aoffsj, uppertype==0, _state); } for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { result = result||ae_fp_greater(ae_c_abs(ae_c_sub(ca1.ptr.pp_complex[i][j],ca2.ptr.pp_complex[i][j]), _state),threshold); } } /* * Test old version of HERK (named SYRK) * Only one of transform types is selected and tested */ if( ae_fp_greater(ae_randomreal(_state),0.5) ) { cmatrixsyrk(n-xoffsi, k-xoffsj, alpha, &cc, xoffsi, xoffsj, 0, beta, &ca1, aoffsi, aoffsj, uppertype==0, _state); testablasunit_refcmatrixherk(n-xoffsi, k-xoffsj, alpha, &cc, xoffsi, xoffsj, 0, beta, &ca2, aoffsi, aoffsj, uppertype==0, _state); } else { cmatrixsyrk(n-xoffsi, k-xoffsj, alpha, &cct, xoffsj, xoffsi, 2, beta, &ca1, aoffsi, aoffsj, uppertype==0, _state); testablasunit_refcmatrixherk(n-xoffsi, k-xoffsj, alpha, &cct, xoffsj, xoffsi, 2, beta, &ca2, aoffsi, aoffsj, uppertype==0, _state); } for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { result = result||ae_fp_greater(ae_c_abs(ae_c_sub(ca1.ptr.pp_complex[i][j],ca2.ptr.pp_complex[i][j]), _state),threshold); } } /* * Test real * Only one of transform types is selected and tested */ if( ae_fp_greater(ae_randomreal(_state),0.5) ) { rmatrixsyrk(n-xoffsi, k-xoffsj, alpha, &rc, xoffsi, xoffsj, 0, beta, &ra1, aoffsi, aoffsj, uppertype==0, _state); testablasunit_refrmatrixsyrk(n-xoffsi, k-xoffsj, alpha, &rc, xoffsi, xoffsj, 0, beta, &ra2, aoffsi, aoffsj, uppertype==0, _state); } else { 
rmatrixsyrk(n-xoffsi, k-xoffsj, alpha, &rct, xoffsj, xoffsi, 1, beta, &ra1, aoffsi, aoffsj, uppertype==0, _state); testablasunit_refrmatrixsyrk(n-xoffsi, k-xoffsj, alpha, &rct, xoffsj, xoffsi, 1, beta, &ra2, aoffsi, aoffsj, uppertype==0, _state); } for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { result = result||ae_fp_greater(ae_fabs(ra1.ptr.pp_double[i][j]-ra2.ptr.pp_double[i][j], _state),threshold); } } } ae_frame_leave(_state); return result; } /************************************************************************* GEMM tests Returns False for passed test, True - for failed *************************************************************************/ static ae_bool testablasunit_testgemm(ae_int_t minn, ae_int_t maxn, ae_state *_state) { ae_frame _frame_block; ae_int_t m; ae_int_t n; ae_int_t k; ae_int_t mx; ae_int_t i; ae_int_t j; ae_int_t aoffsi; ae_int_t aoffsj; ae_int_t aoptype; ae_int_t aoptyper; ae_int_t boffsi; ae_int_t boffsj; ae_int_t boptype; ae_int_t boptyper; ae_int_t coffsi; ae_int_t coffsj; ae_matrix refra; ae_matrix refrb; ae_matrix refrc; ae_matrix refca; ae_matrix refcb; ae_matrix refcc; double alphar; double betar; ae_complex alphac; ae_complex betac; ae_matrix rc1; ae_matrix rc2; ae_matrix cc1; ae_matrix cc2; double threshold; ae_bool result; ae_frame_make(_state, &_frame_block); ae_matrix_init(&refra, 0, 0, DT_REAL, _state); ae_matrix_init(&refrb, 0, 0, DT_REAL, _state); ae_matrix_init(&refrc, 0, 0, DT_REAL, _state); ae_matrix_init(&refca, 0, 0, DT_COMPLEX, _state); ae_matrix_init(&refcb, 0, 0, DT_COMPLEX, _state); ae_matrix_init(&refcc, 0, 0, DT_COMPLEX, _state); ae_matrix_init(&rc1, 0, 0, DT_REAL, _state); ae_matrix_init(&rc2, 0, 0, DT_REAL, _state); ae_matrix_init(&cc1, 0, 0, DT_COMPLEX, _state); ae_matrix_init(&cc2, 0, 0, DT_COMPLEX, _state); threshold = maxn*100*ae_machineepsilon; result = ae_false; for(mx=minn; mx<=maxn; mx++) { /* * Select random M/N/K in [1,MX] such that max(M,N,K)=MX */ m = 1+ae_randominteger(mx, _state); n = 1+ae_randominteger(mx, _state); k = 1+ae_randominteger(mx, _state); i = ae_randominteger(3, _state); if( i==0 ) { m = mx; } if( i==1 ) { n = mx; } if( i==2 ) { k = mx; } /* * Initialize A/B/C by random matrices with size (MaxN+1)*(MaxN+1) */ ae_matrix_set_length(&refra, maxn+1, maxn+1, _state); ae_matrix_set_length(&refrb, maxn+1, maxn+1, _state); ae_matrix_set_length(&refrc, maxn+1, maxn+1, _state); ae_matrix_set_length(&refca, maxn+1, maxn+1, _state); ae_matrix_set_length(&refcb, maxn+1, maxn+1, _state); ae_matrix_set_length(&refcc, maxn+1, maxn+1, _state); for(i=0; i<=maxn; i++) { for(j=0; j<=maxn; j++) { refra.ptr.pp_double[i][j] = 2*ae_randomreal(_state)-1; refrb.ptr.pp_double[i][j] = 2*ae_randomreal(_state)-1; refrc.ptr.pp_double[i][j] = 2*ae_randomreal(_state)-1; refca.ptr.pp_complex[i][j].x = 2*ae_randomreal(_state)-1; refca.ptr.pp_complex[i][j].y = 2*ae_randomreal(_state)-1; refcb.ptr.pp_complex[i][j].x = 2*ae_randomreal(_state)-1; refcb.ptr.pp_complex[i][j].y = 2*ae_randomreal(_state)-1; refcc.ptr.pp_complex[i][j].x = 2*ae_randomreal(_state)-1; refcc.ptr.pp_complex[i][j].y = 2*ae_randomreal(_state)-1; } } /* * test different types of operations, offsets, and so on... * * to avoid unnecessary slowdown we don't test ALL possible * combinations of operation types. We just generate one random * set of parameters and test it. 
*/ ae_matrix_set_length(&rc1, maxn+1, maxn+1, _state); ae_matrix_set_length(&rc2, maxn+1, maxn+1, _state); ae_matrix_set_length(&cc1, maxn+1, maxn+1, _state); ae_matrix_set_length(&cc2, maxn+1, maxn+1, _state); aoffsi = ae_randominteger(2, _state); aoffsj = ae_randominteger(2, _state); aoptype = ae_randominteger(3, _state); aoptyper = ae_randominteger(2, _state); boffsi = ae_randominteger(2, _state); boffsj = ae_randominteger(2, _state); boptype = ae_randominteger(3, _state); boptyper = ae_randominteger(2, _state); coffsi = ae_randominteger(2, _state); coffsj = ae_randominteger(2, _state); alphar = ae_randominteger(2, _state)*(2*ae_randomreal(_state)-1); betar = ae_randominteger(2, _state)*(2*ae_randomreal(_state)-1); if( ae_fp_greater(ae_randomreal(_state),0.5) ) { alphac.x = 2*ae_randomreal(_state)-1; alphac.y = 2*ae_randomreal(_state)-1; } else { alphac = ae_complex_from_i(0); } if( ae_fp_greater(ae_randomreal(_state),0.5) ) { betac.x = 2*ae_randomreal(_state)-1; betac.y = 2*ae_randomreal(_state)-1; } else { betac = ae_complex_from_i(0); } /* * copy C */ for(i=0; i<=maxn; i++) { for(j=0; j<=maxn; j++) { rc1.ptr.pp_double[i][j] = refrc.ptr.pp_double[i][j]; rc2.ptr.pp_double[i][j] = refrc.ptr.pp_double[i][j]; cc1.ptr.pp_complex[i][j] = refcc.ptr.pp_complex[i][j]; cc2.ptr.pp_complex[i][j] = refcc.ptr.pp_complex[i][j]; } } /* * Test complex */ cmatrixgemm(m, n, k, alphac, &refca, aoffsi, aoffsj, aoptype, &refcb, boffsi, boffsj, boptype, betac, &cc1, coffsi, coffsj, _state); testablasunit_refcmatrixgemm(m, n, k, alphac, &refca, aoffsi, aoffsj, aoptype, &refcb, boffsi, boffsj, boptype, betac, &cc2, coffsi, coffsj, _state); for(i=0; i<=maxn; i++) { for(j=0; j<=maxn; j++) { result = result||ae_fp_greater(ae_c_abs(ae_c_sub(cc1.ptr.pp_complex[i][j],cc2.ptr.pp_complex[i][j]), _state),threshold); } } /* * Test real */ rmatrixgemm(m, n, k, alphar, &refra, aoffsi, aoffsj, aoptyper, &refrb, boffsi, boffsj, boptyper, betar, &rc1, coffsi, coffsj, _state); testablasunit_refrmatrixgemm(m, n, k, alphar, &refra, aoffsi, aoffsj, aoptyper, &refrb, boffsi, boffsj, boptyper, betar, &rc2, coffsi, coffsj, _state); for(i=0; i<=maxn; i++) { for(j=0; j<=maxn; j++) { result = result||ae_fp_greater(ae_fabs(rc1.ptr.pp_double[i][j]-rc2.ptr.pp_double[i][j], _state),threshold); } } } ae_frame_leave(_state); return result; } /************************************************************************* transpose tests Returns False for passed test, True - for failed *************************************************************************/ static ae_bool testablasunit_testtrans(ae_int_t minn, ae_int_t maxn, ae_state *_state) { ae_frame _frame_block; ae_int_t m; ae_int_t n; ae_int_t mx; ae_int_t i; ae_int_t j; ae_int_t aoffsi; ae_int_t aoffsj; ae_int_t boffsi; ae_int_t boffsj; double v1; double v2; double threshold; ae_matrix refra; ae_matrix refrb; ae_matrix refca; ae_matrix refcb; ae_bool result; ae_frame_make(_state, &_frame_block); ae_matrix_init(&refra, 0, 0, DT_REAL, _state); ae_matrix_init(&refrb, 0, 0, DT_REAL, _state); ae_matrix_init(&refca, 0, 0, DT_COMPLEX, _state); ae_matrix_init(&refcb, 0, 0, DT_COMPLEX, _state); result = ae_false; threshold = 1000*ae_machineepsilon; for(mx=minn; mx<=maxn; mx++) { /* * Select random M/N in [1,MX] such that max(M,N)=MX * Generate random V1 and V2 which are used to fill * RefRB/RefCB with control values. 
*/ m = 1+ae_randominteger(mx, _state); n = 1+ae_randominteger(mx, _state); if( ae_randominteger(2, _state)==0 ) { m = mx; } else { n = mx; } v1 = ae_randomreal(_state); v2 = ae_randomreal(_state); /* * Initialize A by random matrix with size (MaxN+1)*(MaxN+1) * Fill B with control values */ ae_matrix_set_length(&refra, maxn+1, maxn+1, _state); ae_matrix_set_length(&refrb, maxn+1, maxn+1, _state); ae_matrix_set_length(&refca, maxn+1, maxn+1, _state); ae_matrix_set_length(&refcb, maxn+1, maxn+1, _state); for(i=0; i<=maxn; i++) { for(j=0; j<=maxn; j++) { refra.ptr.pp_double[i][j] = 2*ae_randomreal(_state)-1; refca.ptr.pp_complex[i][j].x = 2*ae_randomreal(_state)-1; refca.ptr.pp_complex[i][j].y = 2*ae_randomreal(_state)-1; refrb.ptr.pp_double[i][j] = i*v1+j*v2; refcb.ptr.pp_complex[i][j] = ae_complex_from_d(i*v1+j*v2); } } /* * test different offsets (zero or one) * * to avoid unnecessary slowdown we don't test ALL possible * combinations of operation types. We just generate one random * set of parameters and test it. */ aoffsi = ae_randominteger(2, _state); aoffsj = ae_randominteger(2, _state); boffsi = ae_randominteger(2, _state); boffsj = ae_randominteger(2, _state); rmatrixtranspose(m, n, &refra, aoffsi, aoffsj, &refrb, boffsi, boffsj, _state); for(i=0; i<=maxn; i++) { for(j=0; j<=maxn; j++) { if( ((i<boffsi||i>=boffsi+n)||j<boffsj)||j>=boffsj+m ) { result = result||ae_fp_greater(ae_fabs(refrb.ptr.pp_double[i][j]-(v1*i+v2*j), _state),threshold); } else { result = result||ae_fp_greater(ae_fabs(refrb.ptr.pp_double[i][j]-refra.ptr.pp_double[aoffsi+j-boffsj][aoffsj+i-boffsi], _state),threshold); } } } cmatrixtranspose(m, n, &refca, aoffsi, aoffsj, &refcb, boffsi, boffsj, _state); for(i=0; i<=maxn; i++) { for(j=0; j<=maxn; j++) { if( ((i<boffsi||i>=boffsi+n)||j<boffsj)||j>=boffsj+m ) { result = result||ae_fp_greater(ae_c_abs(ae_c_sub_d(refcb.ptr.pp_complex[i][j],v1*i+v2*j), _state),threshold); } else { result = result||ae_fp_greater(ae_c_abs(ae_c_sub(refcb.ptr.pp_complex[i][j],refca.ptr.pp_complex[aoffsi+j-boffsj][aoffsj+i-boffsi]), _state),threshold); } } } } ae_frame_leave(_state); return result; } /************************************************************************* rank-1 tests Returns False for passed test, True - for failed *************************************************************************/ static ae_bool testablasunit_testrank1(ae_int_t minn, ae_int_t maxn, ae_state *_state) { ae_frame _frame_block; ae_int_t m; ae_int_t n; ae_int_t mx; ae_int_t i; ae_int_t j; ae_int_t aoffsi; ae_int_t aoffsj; ae_int_t uoffs; ae_int_t voffs; double threshold; ae_matrix refra; ae_matrix refrb; ae_matrix refca; ae_matrix refcb; ae_vector ru; ae_vector rv; ae_vector cu; ae_vector cv; ae_bool result; ae_frame_make(_state, &_frame_block); ae_matrix_init(&refra, 0, 0, DT_REAL, _state); ae_matrix_init(&refrb, 0, 0, DT_REAL, _state); ae_matrix_init(&refca, 0, 0, DT_COMPLEX, _state); ae_matrix_init(&refcb, 0, 0, DT_COMPLEX, _state); ae_vector_init(&ru, 0, DT_REAL, _state); ae_vector_init(&rv, 0, DT_REAL, _state); ae_vector_init(&cu, 0, DT_COMPLEX, _state); ae_vector_init(&cv, 0, DT_COMPLEX, _state); result = ae_false; threshold = 1000*ae_machineepsilon; for(mx=minn; mx<=maxn; mx++) { /* * Select random M/N in [1,MX] such that max(M,N)=MX */ m = 1+ae_randominteger(mx, _state); n = 1+ae_randominteger(mx, _state); if( ae_randominteger(2, _state)==0 ) { m = mx; } else { n = mx; } /* * Initialize A by random matrix with size (MaxN+MaxN)*(MaxN+MaxN) * Fill B with control values */ ae_matrix_set_length(&refra, maxn+maxn, maxn+maxn, _state); 
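/*
 * RefRA/RefCA will be updated in place by rmatrixrank1/cmatrixrank1; RefRB/RefCB keep
 * untouched copies that serve as the reference when the updated elements are checked
 * below against B[i][j] + u[i-aoffsi+uoffs]*v[j-aoffsj+voffs].
 */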
ae_matrix_set_length(&refrb, maxn+maxn, maxn+maxn, _state); ae_matrix_set_length(&refca, maxn+maxn, maxn+maxn, _state); ae_matrix_set_length(&refcb, maxn+maxn, maxn+maxn, _state); for(i=0; i<=2*maxn-1; i++) { for(j=0; j<=2*maxn-1; j++) { refra.ptr.pp_double[i][j] = 2*ae_randomreal(_state)-1; refca.ptr.pp_complex[i][j].x = 2*ae_randomreal(_state)-1; refca.ptr.pp_complex[i][j].y = 2*ae_randomreal(_state)-1; refrb.ptr.pp_double[i][j] = refra.ptr.pp_double[i][j]; refcb.ptr.pp_complex[i][j] = refca.ptr.pp_complex[i][j]; } } ae_vector_set_length(&ru, 2*m, _state); ae_vector_set_length(&cu, 2*m, _state); for(i=0; i<=2*m-1; i++) { ru.ptr.p_double[i] = 2*ae_randomreal(_state)-1; cu.ptr.p_complex[i].x = 2*ae_randomreal(_state)-1; cu.ptr.p_complex[i].y = 2*ae_randomreal(_state)-1; } ae_vector_set_length(&rv, 2*n, _state); ae_vector_set_length(&cv, 2*n, _state); for(i=0; i<=2*n-1; i++) { rv.ptr.p_double[i] = 2*ae_randomreal(_state)-1; cv.ptr.p_complex[i].x = 2*ae_randomreal(_state)-1; cv.ptr.p_complex[i].y = 2*ae_randomreal(_state)-1; } /* * test different offsets (zero or one) * * to avoid unnecessary slowdown we don't test ALL possible * combinations of operation types. We just generate one random * set of parameters and test it. */ aoffsi = ae_randominteger(maxn, _state); aoffsj = ae_randominteger(maxn, _state); uoffs = ae_randominteger(m, _state); voffs = ae_randominteger(n, _state); cmatrixrank1(m, n, &refca, aoffsi, aoffsj, &cu, uoffs, &cv, voffs, _state); for(i=0; i<=2*maxn-1; i++) { for(j=0; j<=2*maxn-1; j++) { if( ((i=aoffsi+m)||j=aoffsj+n ) { result = result||ae_fp_greater(ae_c_abs(ae_c_sub(refca.ptr.pp_complex[i][j],refcb.ptr.pp_complex[i][j]), _state),threshold); } else { result = result||ae_fp_greater(ae_c_abs(ae_c_sub(refca.ptr.pp_complex[i][j],ae_c_add(refcb.ptr.pp_complex[i][j],ae_c_mul(cu.ptr.p_complex[i-aoffsi+uoffs],cv.ptr.p_complex[j-aoffsj+voffs]))), _state),threshold); } } } rmatrixrank1(m, n, &refra, aoffsi, aoffsj, &ru, uoffs, &rv, voffs, _state); for(i=0; i<=2*maxn-1; i++) { for(j=0; j<=2*maxn-1; j++) { if( ((i=aoffsi+m)||j=aoffsj+n ) { result = result||ae_fp_greater(ae_fabs(refra.ptr.pp_double[i][j]-refrb.ptr.pp_double[i][j], _state),threshold); } else { result = result||ae_fp_greater(ae_fabs(refra.ptr.pp_double[i][j]-(refrb.ptr.pp_double[i][j]+ru.ptr.p_double[i-aoffsi+uoffs]*rv.ptr.p_double[j-aoffsj+voffs]), _state),threshold); } } } } ae_frame_leave(_state); return result; } /************************************************************************* MV tests Returns False for passed test, True - for failed *************************************************************************/ static ae_bool testablasunit_testmv(ae_int_t minn, ae_int_t maxn, ae_state *_state) { ae_frame _frame_block; ae_int_t m; ae_int_t n; ae_int_t mx; ae_int_t i; ae_int_t j; ae_int_t aoffsi; ae_int_t aoffsj; ae_int_t xoffs; ae_int_t yoffs; ae_int_t opca; ae_int_t opra; double threshold; double rv1; double rv2; ae_complex cv1; ae_complex cv2; ae_matrix refra; ae_matrix refca; ae_vector rx; ae_vector ry; ae_vector cx; ae_vector cy; ae_bool result; ae_frame_make(_state, &_frame_block); ae_matrix_init(&refra, 0, 0, DT_REAL, _state); ae_matrix_init(&refca, 0, 0, DT_COMPLEX, _state); ae_vector_init(&rx, 0, DT_REAL, _state); ae_vector_init(&ry, 0, DT_REAL, _state); ae_vector_init(&cx, 0, DT_COMPLEX, _state); ae_vector_init(&cy, 0, DT_COMPLEX, _state); result = ae_false; threshold = 1000*ae_machineepsilon; for(mx=minn; mx<=maxn; mx++) { /* * Select random M/N in [1,MX] such that max(M,N)=MX */ m = 
1+ae_randominteger(mx, _state); n = 1+ae_randominteger(mx, _state); if( ae_randominteger(2, _state)==0 ) { m = mx; } else { n = mx; } /* * Initialize A by random matrix with size (MaxN+MaxN)*(MaxN+MaxN) * Initialize X by random vector with size (MaxN+MaxN) * Fill Y by control values */ ae_matrix_set_length(&refra, maxn+maxn, maxn+maxn, _state); ae_matrix_set_length(&refca, maxn+maxn, maxn+maxn, _state); for(i=0; i<=2*maxn-1; i++) { for(j=0; j<=2*maxn-1; j++) { refra.ptr.pp_double[i][j] = 2*ae_randomreal(_state)-1; refca.ptr.pp_complex[i][j].x = 2*ae_randomreal(_state)-1; refca.ptr.pp_complex[i][j].y = 2*ae_randomreal(_state)-1; } } ae_vector_set_length(&rx, 2*maxn, _state); ae_vector_set_length(&cx, 2*maxn, _state); ae_vector_set_length(&ry, 2*maxn, _state); ae_vector_set_length(&cy, 2*maxn, _state); for(i=0; i<=2*maxn-1; i++) { rx.ptr.p_double[i] = 2*ae_randomreal(_state)-1; cx.ptr.p_complex[i].x = 2*ae_randomreal(_state)-1; cx.ptr.p_complex[i].y = 2*ae_randomreal(_state)-1; ry.ptr.p_double[i] = (double)(i); cy.ptr.p_complex[i] = ae_complex_from_i(i); } /* * test different offsets (zero or one) * * to avoid unnecessary slowdown we don't test ALL possible * combinations of operation types. We just generate one random * set of parameters and test it. */ aoffsi = ae_randominteger(maxn, _state); aoffsj = ae_randominteger(maxn, _state); xoffs = ae_randominteger(maxn, _state); yoffs = ae_randominteger(maxn, _state); opca = ae_randominteger(3, _state); opra = ae_randominteger(2, _state); cmatrixmv(m, n, &refca, aoffsi, aoffsj, opca, &cx, xoffs, &cy, yoffs, _state); for(i=0; i<=2*maxn-1; i++) { if( i=yoffs+m ) { result = result||ae_c_neq_d(cy.ptr.p_complex[i],(double)(i)); } else { cv1 = cy.ptr.p_complex[i]; cv2 = ae_complex_from_d(0.0); if( opca==0 ) { cv2 = ae_v_cdotproduct(&refca.ptr.pp_complex[aoffsi+i-yoffs][aoffsj], 1, "N", &cx.ptr.p_complex[xoffs], 1, "N", ae_v_len(aoffsj,aoffsj+n-1)); } if( opca==1 ) { cv2 = ae_v_cdotproduct(&refca.ptr.pp_complex[aoffsi][aoffsj+i-yoffs], refca.stride, "N", &cx.ptr.p_complex[xoffs], 1, "N", ae_v_len(aoffsi,aoffsi+n-1)); } if( opca==2 ) { cv2 = ae_v_cdotproduct(&refca.ptr.pp_complex[aoffsi][aoffsj+i-yoffs], refca.stride, "Conj", &cx.ptr.p_complex[xoffs], 1, "N", ae_v_len(aoffsi,aoffsi+n-1)); } result = result||ae_fp_greater(ae_c_abs(ae_c_sub(cv1,cv2), _state),threshold); } } rmatrixmv(m, n, &refra, aoffsi, aoffsj, opra, &rx, xoffs, &ry, yoffs, _state); for(i=0; i<=2*maxn-1; i++) { if( i=yoffs+m ) { result = result||ae_fp_neq(ry.ptr.p_double[i],(double)(i)); } else { rv1 = ry.ptr.p_double[i]; rv2 = (double)(0); if( opra==0 ) { rv2 = ae_v_dotproduct(&refra.ptr.pp_double[aoffsi+i-yoffs][aoffsj], 1, &rx.ptr.p_double[xoffs], 1, ae_v_len(aoffsj,aoffsj+n-1)); } if( opra==1 ) { rv2 = ae_v_dotproduct(&refra.ptr.pp_double[aoffsi][aoffsj+i-yoffs], refra.stride, &rx.ptr.p_double[xoffs], 1, ae_v_len(aoffsi,aoffsi+n-1)); } result = result||ae_fp_greater(ae_fabs(rv1-rv2, _state),threshold); } } } ae_frame_leave(_state); return result; } /************************************************************************* Special test. On failure sets error flag, on success does not change it. 
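
The test below calls RMatrixGEMM/CMatrixGEMM, RMatrixSYRK/CMatrixHERK and the
four TRSM routines with degenerate sizes (K=0, Alpha=0 or M=N=0) while passing
completely unallocated matrices with very large row/column offsets. With such
parameters a correct implementation has no reason to read those operands at
all, so only an implementation which wrongly dereferences them will crash here.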
*************************************************************************/ static void testablasunit_spectest(ae_bool* errorflag, ae_state *_state) { ae_frame _frame_block; ae_matrix emptyr2; ae_matrix emptyc2; ae_matrix outputr2; ae_matrix outputc2; ae_int_t i; ae_int_t j; ae_int_t n; ae_int_t pass; ae_frame_make(_state, &_frame_block); ae_matrix_init(&emptyr2, 0, 0, DT_REAL, _state); ae_matrix_init(&emptyc2, 0, 0, DT_COMPLEX, _state); ae_matrix_init(&outputr2, 0, 0, DT_REAL, _state); ae_matrix_init(&outputc2, 0, 0, DT_COMPLEX, _state); /* * Test that SYRK, GEMM and TRSM does not reference empty argument at all. * * In order to perform this test we pass empty (unallocated) matrix * with large offset; incorrect implementation will crash on such data. */ n = 128+ae_randominteger(65, _state)-32; ae_matrix_set_length(&outputr2, n, n, _state); ae_matrix_set_length(&outputc2, n, n, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { outputr2.ptr.pp_double[i][j] = (double)(0); outputc2.ptr.pp_complex[i][j] = ae_complex_from_i(0); } } for(pass=0; pass<=10; pass++) { rmatrixgemm(n, n, 0, 1.0, &emptyr2, 35345, 23453, ae_randominteger(2, _state), &emptyr2, 74764, 26845, ae_randominteger(2, _state), 1.0+ae_randominteger(2, _state), &outputr2, 0, 0, _state); rmatrixgemm(n, n, n, 0.0, &emptyr2, 35345, 23453, ae_randominteger(2, _state), &emptyr2, 74764, 26845, ae_randominteger(2, _state), 1.0+ae_randominteger(2, _state), &outputr2, 0, 0, _state); cmatrixgemm(n, n, 0, ae_complex_from_d(1.0), &emptyc2, 35345, 23453, ae_randominteger(3, _state), &emptyc2, 74764, 26845, ae_randominteger(3, _state), ae_complex_from_d(1.0+ae_randominteger(2, _state)), &outputc2, 0, 0, _state); cmatrixgemm(n, n, n, ae_complex_from_d(0.0), &emptyc2, 35345, 23453, ae_randominteger(3, _state), &emptyc2, 74764, 26845, ae_randominteger(3, _state), ae_complex_from_d(1.0+ae_randominteger(2, _state)), &outputc2, 0, 0, _state); rmatrixsyrk(n, 0, 1.0, &emptyr2, 54674, 34657, 2*ae_randominteger(2, _state), 1.0+ae_randominteger(2, _state), &outputr2, 0, 0, ae_fp_greater(ae_randomreal(_state),0.5), _state); rmatrixsyrk(n, n, 0.0, &emptyr2, 54674, 34657, 2*ae_randominteger(2, _state), 1.0+ae_randominteger(2, _state), &outputr2, 0, 0, ae_fp_greater(ae_randomreal(_state),0.5), _state); cmatrixherk(n, 0, 1.0, &emptyc2, 54674, 34657, 2*ae_randominteger(2, _state), 1.0+ae_randominteger(2, _state), &outputc2, 0, 0, ae_fp_greater(ae_randomreal(_state),0.5), _state); cmatrixherk(n, n, 0.0, &emptyc2, 54674, 34657, 2*ae_randominteger(2, _state), 1.0+ae_randominteger(2, _state), &outputc2, 0, 0, ae_fp_greater(ae_randomreal(_state),0.5), _state); rmatrixrighttrsm(0, 0, &emptyr2, 63463, 36345, ae_fp_greater(ae_randomreal(_state),0.5), ae_fp_greater(ae_randomreal(_state),0.5), ae_randominteger(2, _state), &outputr2, 0, 0, _state); rmatrixlefttrsm(0, 0, &emptyr2, 63463, 36345, ae_fp_greater(ae_randomreal(_state),0.5), ae_fp_greater(ae_randomreal(_state),0.5), ae_randominteger(2, _state), &outputr2, 0, 0, _state); cmatrixrighttrsm(0, 0, &emptyc2, 63463, 36345, ae_fp_greater(ae_randomreal(_state),0.5), ae_fp_greater(ae_randomreal(_state),0.5), ae_randominteger(3, _state), &outputc2, 0, 0, _state); cmatrixlefttrsm(0, 0, &emptyc2, 63463, 36345, ae_fp_greater(ae_randomreal(_state),0.5), ae_fp_greater(ae_randomreal(_state),0.5), ae_randominteger(3, _state), &outputc2, 0, 0, _state); } ae_frame_leave(_state); } /************************************************************************* COPY tests Returns False for passed test, True - for failed 
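
The checks below verify two things for both the real and the complex variant:
the M*N block of B starting at (BOffsI,BOffsJ) must become an exact copy of the
corresponding block of A, and every element of B outside that block must keep
its control value 1+2*i+3*j.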
*************************************************************************/ static ae_bool testablasunit_testcopy(ae_int_t minn, ae_int_t maxn, ae_state *_state) { ae_frame _frame_block; ae_int_t m; ae_int_t n; ae_int_t mx; ae_int_t i; ae_int_t j; ae_int_t aoffsi; ae_int_t aoffsj; ae_int_t boffsi; ae_int_t boffsj; double threshold; ae_matrix ra; ae_matrix rb; ae_matrix ca; ae_matrix cb; ae_bool result; ae_frame_make(_state, &_frame_block); ae_matrix_init(&ra, 0, 0, DT_REAL, _state); ae_matrix_init(&rb, 0, 0, DT_REAL, _state); ae_matrix_init(&ca, 0, 0, DT_COMPLEX, _state); ae_matrix_init(&cb, 0, 0, DT_COMPLEX, _state); result = ae_false; threshold = 1000*ae_machineepsilon; for(mx=minn; mx<=maxn; mx++) { /* * Select random M/N in [1,MX] such that max(M,N)=MX */ m = 1+ae_randominteger(mx, _state); n = 1+ae_randominteger(mx, _state); if( ae_randominteger(2, _state)==0 ) { m = mx; } else { n = mx; } /* * Initialize A by random matrix with size (MaxN+MaxN)*(MaxN+MaxN) * Initialize X by random vector with size (MaxN+MaxN) * Fill Y by control values */ ae_matrix_set_length(&ra, maxn+maxn, maxn+maxn, _state); ae_matrix_set_length(&ca, maxn+maxn, maxn+maxn, _state); ae_matrix_set_length(&rb, maxn+maxn, maxn+maxn, _state); ae_matrix_set_length(&cb, maxn+maxn, maxn+maxn, _state); for(i=0; i<=2*maxn-1; i++) { for(j=0; j<=2*maxn-1; j++) { ra.ptr.pp_double[i][j] = 2*ae_randomreal(_state)-1; ca.ptr.pp_complex[i][j].x = 2*ae_randomreal(_state)-1; ca.ptr.pp_complex[i][j].y = 2*ae_randomreal(_state)-1; rb.ptr.pp_double[i][j] = (double)(1+2*i+3*j); cb.ptr.pp_complex[i][j] = ae_complex_from_i(1+2*i+3*j); } } /* * test different offsets (zero or one) * * to avoid unnecessary slowdown we don't test ALL possible * combinations of operation types. We just generate one random * set of parameters and test it. 
*/ aoffsi = ae_randominteger(maxn, _state); aoffsj = ae_randominteger(maxn, _state); boffsi = ae_randominteger(maxn, _state); boffsj = ae_randominteger(maxn, _state); cmatrixcopy(m, n, &ca, aoffsi, aoffsj, &cb, boffsi, boffsj, _state); for(i=0; i<=2*maxn-1; i++) { for(j=0; j<=2*maxn-1; j++) { if( ((i=boffsi+m)||j=boffsj+n ) { result = result||ae_c_neq_d(cb.ptr.pp_complex[i][j],(double)(1+2*i+3*j)); } else { result = result||ae_fp_greater(ae_c_abs(ae_c_sub(ca.ptr.pp_complex[aoffsi+i-boffsi][aoffsj+j-boffsj],cb.ptr.pp_complex[i][j]), _state),threshold); } } } rmatrixcopy(m, n, &ra, aoffsi, aoffsj, &rb, boffsi, boffsj, _state); for(i=0; i<=2*maxn-1; i++) { for(j=0; j<=2*maxn-1; j++) { if( ((i=boffsi+m)||j=boffsj+n ) { result = result||ae_fp_neq(rb.ptr.pp_double[i][j],(double)(1+2*i+3*j)); } else { result = result||ae_fp_greater(ae_fabs(ra.ptr.pp_double[aoffsi+i-boffsi][aoffsj+j-boffsj]-rb.ptr.pp_double[i][j], _state),threshold); } } } } ae_frame_leave(_state); return result; } /************************************************************************* Reference implementation -- ALGLIB routine -- 15.12.2009 Bochkanov Sergey *************************************************************************/ static void testablasunit_refcmatrixrighttrsm(ae_int_t m, ae_int_t n, /* Complex */ ae_matrix* a, ae_int_t i1, ae_int_t j1, ae_bool isupper, ae_bool isunit, ae_int_t optype, /* Complex */ ae_matrix* x, ae_int_t i2, ae_int_t j2, ae_state *_state) { ae_frame _frame_block; ae_matrix a1; ae_matrix a2; ae_vector tx; ae_int_t i; ae_int_t j; ae_complex vc; ae_bool rupper; ae_frame_make(_state, &_frame_block); ae_matrix_init(&a1, 0, 0, DT_COMPLEX, _state); ae_matrix_init(&a2, 0, 0, DT_COMPLEX, _state); ae_vector_init(&tx, 0, DT_COMPLEX, _state); if( n*m==0 ) { ae_frame_leave(_state); return; } ae_matrix_set_length(&a1, n, n, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { a1.ptr.pp_complex[i][j] = ae_complex_from_i(0); } } if( isupper ) { for(i=0; i<=n-1; i++) { for(j=i; j<=n-1; j++) { a1.ptr.pp_complex[i][j] = a->ptr.pp_complex[i1+i][j1+j]; } } } else { for(i=0; i<=n-1; i++) { for(j=0; j<=i; j++) { a1.ptr.pp_complex[i][j] = a->ptr.pp_complex[i1+i][j1+j]; } } } rupper = isupper; if( isunit ) { for(i=0; i<=n-1; i++) { a1.ptr.pp_complex[i][i] = ae_complex_from_i(1); } } ae_matrix_set_length(&a2, n, n, _state); if( optype==0 ) { for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { a2.ptr.pp_complex[i][j] = a1.ptr.pp_complex[i][j]; } } } if( optype==1 ) { for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { a2.ptr.pp_complex[i][j] = a1.ptr.pp_complex[j][i]; } } rupper = !rupper; } if( optype==2 ) { for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { a2.ptr.pp_complex[i][j] = ae_c_conj(a1.ptr.pp_complex[j][i], _state); } } rupper = !rupper; } testablasunit_internalcmatrixtrinverse(&a2, n, rupper, ae_false, _state); ae_vector_set_length(&tx, n, _state); for(i=0; i<=m-1; i++) { ae_v_cmove(&tx.ptr.p_complex[0], 1, &x->ptr.pp_complex[i2+i][j2], 1, "N", ae_v_len(0,n-1)); for(j=0; j<=n-1; j++) { vc = ae_v_cdotproduct(&tx.ptr.p_complex[0], 1, "N", &a2.ptr.pp_complex[0][j], a2.stride, "N", ae_v_len(0,n-1)); x->ptr.pp_complex[i2+i][j2+j] = vc; } } ae_frame_leave(_state); } /************************************************************************* Reference implementation -- ALGLIB routine -- 15.12.2009 Bochkanov Sergey *************************************************************************/ static void testablasunit_refcmatrixlefttrsm(ae_int_t m, ae_int_t n, /* Complex */ ae_matrix* a, ae_int_t i1, ae_int_t j1, ae_bool 
isupper, ae_bool isunit, ae_int_t optype, /* Complex */ ae_matrix* x, ae_int_t i2, ae_int_t j2, ae_state *_state) { ae_frame _frame_block; ae_matrix a1; ae_matrix a2; ae_vector tx; ae_int_t i; ae_int_t j; ae_complex vc; ae_bool rupper; ae_frame_make(_state, &_frame_block); ae_matrix_init(&a1, 0, 0, DT_COMPLEX, _state); ae_matrix_init(&a2, 0, 0, DT_COMPLEX, _state); ae_vector_init(&tx, 0, DT_COMPLEX, _state); if( n*m==0 ) { ae_frame_leave(_state); return; } ae_matrix_set_length(&a1, m, m, _state); for(i=0; i<=m-1; i++) { for(j=0; j<=m-1; j++) { a1.ptr.pp_complex[i][j] = ae_complex_from_i(0); } } if( isupper ) { for(i=0; i<=m-1; i++) { for(j=i; j<=m-1; j++) { a1.ptr.pp_complex[i][j] = a->ptr.pp_complex[i1+i][j1+j]; } } } else { for(i=0; i<=m-1; i++) { for(j=0; j<=i; j++) { a1.ptr.pp_complex[i][j] = a->ptr.pp_complex[i1+i][j1+j]; } } } rupper = isupper; if( isunit ) { for(i=0; i<=m-1; i++) { a1.ptr.pp_complex[i][i] = ae_complex_from_i(1); } } ae_matrix_set_length(&a2, m, m, _state); if( optype==0 ) { for(i=0; i<=m-1; i++) { for(j=0; j<=m-1; j++) { a2.ptr.pp_complex[i][j] = a1.ptr.pp_complex[i][j]; } } } if( optype==1 ) { for(i=0; i<=m-1; i++) { for(j=0; j<=m-1; j++) { a2.ptr.pp_complex[i][j] = a1.ptr.pp_complex[j][i]; } } rupper = !rupper; } if( optype==2 ) { for(i=0; i<=m-1; i++) { for(j=0; j<=m-1; j++) { a2.ptr.pp_complex[i][j] = ae_c_conj(a1.ptr.pp_complex[j][i], _state); } } rupper = !rupper; } testablasunit_internalcmatrixtrinverse(&a2, m, rupper, ae_false, _state); ae_vector_set_length(&tx, m, _state); for(j=0; j<=n-1; j++) { ae_v_cmove(&tx.ptr.p_complex[0], 1, &x->ptr.pp_complex[i2][j2+j], x->stride, "N", ae_v_len(0,m-1)); for(i=0; i<=m-1; i++) { vc = ae_v_cdotproduct(&a2.ptr.pp_complex[i][0], 1, "N", &tx.ptr.p_complex[0], 1, "N", ae_v_len(0,m-1)); x->ptr.pp_complex[i2+i][j2+j] = vc; } } ae_frame_leave(_state); } /************************************************************************* Reference implementation -- ALGLIB routine -- 15.12.2009 Bochkanov Sergey *************************************************************************/ static void testablasunit_refrmatrixrighttrsm(ae_int_t m, ae_int_t n, /* Real */ ae_matrix* a, ae_int_t i1, ae_int_t j1, ae_bool isupper, ae_bool isunit, ae_int_t optype, /* Real */ ae_matrix* x, ae_int_t i2, ae_int_t j2, ae_state *_state) { ae_frame _frame_block; ae_matrix a1; ae_matrix a2; ae_vector tx; ae_int_t i; ae_int_t j; double vr; ae_bool rupper; ae_frame_make(_state, &_frame_block); ae_matrix_init(&a1, 0, 0, DT_REAL, _state); ae_matrix_init(&a2, 0, 0, DT_REAL, _state); ae_vector_init(&tx, 0, DT_REAL, _state); if( n*m==0 ) { ae_frame_leave(_state); return; } ae_matrix_set_length(&a1, n, n, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { a1.ptr.pp_double[i][j] = (double)(0); } } if( isupper ) { for(i=0; i<=n-1; i++) { for(j=i; j<=n-1; j++) { a1.ptr.pp_double[i][j] = a->ptr.pp_double[i1+i][j1+j]; } } } else { for(i=0; i<=n-1; i++) { for(j=0; j<=i; j++) { a1.ptr.pp_double[i][j] = a->ptr.pp_double[i1+i][j1+j]; } } } rupper = isupper; if( isunit ) { for(i=0; i<=n-1; i++) { a1.ptr.pp_double[i][i] = (double)(1); } } ae_matrix_set_length(&a2, n, n, _state); if( optype==0 ) { for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { a2.ptr.pp_double[i][j] = a1.ptr.pp_double[i][j]; } } } if( optype==1 ) { for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { a2.ptr.pp_double[i][j] = a1.ptr.pp_double[j][i]; } } rupper = !rupper; } testablasunit_internalrmatrixtrinverse(&a2, n, rupper, ae_false, _state); ae_vector_set_length(&tx, n, _state); for(i=0; i<=m-1; i++) { 
ae_v_move(&tx.ptr.p_double[0], 1, &x->ptr.pp_double[i2+i][j2], 1, ae_v_len(0,n-1)); for(j=0; j<=n-1; j++) { vr = ae_v_dotproduct(&tx.ptr.p_double[0], 1, &a2.ptr.pp_double[0][j], a2.stride, ae_v_len(0,n-1)); x->ptr.pp_double[i2+i][j2+j] = vr; } } ae_frame_leave(_state); } /************************************************************************* Reference implementation -- ALGLIB routine -- 15.12.2009 Bochkanov Sergey *************************************************************************/ static void testablasunit_refrmatrixlefttrsm(ae_int_t m, ae_int_t n, /* Real */ ae_matrix* a, ae_int_t i1, ae_int_t j1, ae_bool isupper, ae_bool isunit, ae_int_t optype, /* Real */ ae_matrix* x, ae_int_t i2, ae_int_t j2, ae_state *_state) { ae_frame _frame_block; ae_matrix a1; ae_matrix a2; ae_vector tx; ae_int_t i; ae_int_t j; double vr; ae_bool rupper; ae_frame_make(_state, &_frame_block); ae_matrix_init(&a1, 0, 0, DT_REAL, _state); ae_matrix_init(&a2, 0, 0, DT_REAL, _state); ae_vector_init(&tx, 0, DT_REAL, _state); if( n*m==0 ) { ae_frame_leave(_state); return; } ae_matrix_set_length(&a1, m, m, _state); for(i=0; i<=m-1; i++) { for(j=0; j<=m-1; j++) { a1.ptr.pp_double[i][j] = (double)(0); } } if( isupper ) { for(i=0; i<=m-1; i++) { for(j=i; j<=m-1; j++) { a1.ptr.pp_double[i][j] = a->ptr.pp_double[i1+i][j1+j]; } } } else { for(i=0; i<=m-1; i++) { for(j=0; j<=i; j++) { a1.ptr.pp_double[i][j] = a->ptr.pp_double[i1+i][j1+j]; } } } rupper = isupper; if( isunit ) { for(i=0; i<=m-1; i++) { a1.ptr.pp_double[i][i] = (double)(1); } } ae_matrix_set_length(&a2, m, m, _state); if( optype==0 ) { for(i=0; i<=m-1; i++) { for(j=0; j<=m-1; j++) { a2.ptr.pp_double[i][j] = a1.ptr.pp_double[i][j]; } } } if( optype==1 ) { for(i=0; i<=m-1; i++) { for(j=0; j<=m-1; j++) { a2.ptr.pp_double[i][j] = a1.ptr.pp_double[j][i]; } } rupper = !rupper; } testablasunit_internalrmatrixtrinverse(&a2, m, rupper, ae_false, _state); ae_vector_set_length(&tx, m, _state); for(j=0; j<=n-1; j++) { ae_v_move(&tx.ptr.p_double[0], 1, &x->ptr.pp_double[i2][j2+j], x->stride, ae_v_len(0,m-1)); for(i=0; i<=m-1; i++) { vr = ae_v_dotproduct(&a2.ptr.pp_double[i][0], 1, &tx.ptr.p_double[0], 1, ae_v_len(0,m-1)); x->ptr.pp_double[i2+i][j2+j] = vr; } } ae_frame_leave(_state); } /************************************************************************* Internal subroutine. Triangular matrix inversion -- LAPACK routine (version 3.0) -- Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., Courant Institute, Argonne National Lab, and Rice University February 29, 1992 *************************************************************************/ static ae_bool testablasunit_internalcmatrixtrinverse(/* Complex */ ae_matrix* a, ae_int_t n, ae_bool isupper, ae_bool isunittriangular, ae_state *_state) { ae_frame _frame_block; ae_bool nounit; ae_int_t i; ae_int_t j; ae_complex v; ae_complex ajj; ae_vector t; ae_bool result; ae_frame_make(_state, &_frame_block); ae_vector_init(&t, 0, DT_COMPLEX, _state); result = ae_true; ae_vector_set_length(&t, n-1+1, _state); /* * Test the input parameters. */ nounit = !isunittriangular; if( isupper ) { /* * Compute inverse of upper triangular matrix. */ for(j=0; j<=n-1; j++) { if( nounit ) { if( ae_c_eq_d(a->ptr.pp_complex[j][j],(double)(0)) ) { result = ae_false; ae_frame_leave(_state); return result; } a->ptr.pp_complex[j][j] = ae_c_d_div(1,a->ptr.pp_complex[j][j]); ajj = ae_c_neg(a->ptr.pp_complex[j][j]); } else { ajj = ae_complex_from_i(-1); } /* * Compute elements 1:j-1 of j-th column. 
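* (Column j of the inverse equals -inv(A[0:j-1,0:j-1])*A[0:j-1,j]/A[j,j];
* the already inverted leading block computed on previous iterations is
* reused here, following the standard LAPACK xTRTRI recurrence.)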
*/ if( j>0 ) { ae_v_cmove(&t.ptr.p_complex[0], 1, &a->ptr.pp_complex[0][j], a->stride, "N", ae_v_len(0,j-1)); for(i=0; i<=j-1; i++) { if( i+1ptr.pp_complex[i][i+1], 1, "N", &t.ptr.p_complex[i+1], 1, "N", ae_v_len(i+1,j-1)); } else { v = ae_complex_from_i(0); } if( nounit ) { a->ptr.pp_complex[i][j] = ae_c_add(v,ae_c_mul(a->ptr.pp_complex[i][i],t.ptr.p_complex[i])); } else { a->ptr.pp_complex[i][j] = ae_c_add(v,t.ptr.p_complex[i]); } } ae_v_cmulc(&a->ptr.pp_complex[0][j], a->stride, ae_v_len(0,j-1), ajj); } } } else { /* * Compute inverse of lower triangular matrix. */ for(j=n-1; j>=0; j--) { if( nounit ) { if( ae_c_eq_d(a->ptr.pp_complex[j][j],(double)(0)) ) { result = ae_false; ae_frame_leave(_state); return result; } a->ptr.pp_complex[j][j] = ae_c_d_div(1,a->ptr.pp_complex[j][j]); ajj = ae_c_neg(a->ptr.pp_complex[j][j]); } else { ajj = ae_complex_from_i(-1); } if( j+1ptr.pp_complex[j+1][j], a->stride, "N", ae_v_len(j+1,n-1)); for(i=j+1; i<=n-1; i++) { if( i>j+1 ) { v = ae_v_cdotproduct(&a->ptr.pp_complex[i][j+1], 1, "N", &t.ptr.p_complex[j+1], 1, "N", ae_v_len(j+1,i-1)); } else { v = ae_complex_from_i(0); } if( nounit ) { a->ptr.pp_complex[i][j] = ae_c_add(v,ae_c_mul(a->ptr.pp_complex[i][i],t.ptr.p_complex[i])); } else { a->ptr.pp_complex[i][j] = ae_c_add(v,t.ptr.p_complex[i]); } } ae_v_cmulc(&a->ptr.pp_complex[j+1][j], a->stride, ae_v_len(j+1,n-1), ajj); } } } ae_frame_leave(_state); return result; } /************************************************************************* Internal subroutine. Triangular matrix inversion -- LAPACK routine (version 3.0) -- Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., Courant Institute, Argonne National Lab, and Rice University February 29, 1992 *************************************************************************/ static ae_bool testablasunit_internalrmatrixtrinverse(/* Real */ ae_matrix* a, ae_int_t n, ae_bool isupper, ae_bool isunittriangular, ae_state *_state) { ae_frame _frame_block; ae_bool nounit; ae_int_t i; ae_int_t j; double v; double ajj; ae_vector t; ae_bool result; ae_frame_make(_state, &_frame_block); ae_vector_init(&t, 0, DT_REAL, _state); result = ae_true; ae_vector_set_length(&t, n-1+1, _state); /* * Test the input parameters. */ nounit = !isunittriangular; if( isupper ) { /* * Compute inverse of upper triangular matrix. */ for(j=0; j<=n-1; j++) { if( nounit ) { if( ae_fp_eq(a->ptr.pp_double[j][j],(double)(0)) ) { result = ae_false; ae_frame_leave(_state); return result; } a->ptr.pp_double[j][j] = 1/a->ptr.pp_double[j][j]; ajj = -a->ptr.pp_double[j][j]; } else { ajj = (double)(-1); } /* * Compute elements 1:j-1 of j-th column. */ if( j>0 ) { ae_v_move(&t.ptr.p_double[0], 1, &a->ptr.pp_double[0][j], a->stride, ae_v_len(0,j-1)); for(i=0; i<=j-1; i++) { if( iptr.pp_double[i][i+1], 1, &t.ptr.p_double[i+1], 1, ae_v_len(i+1,j-1)); } else { v = (double)(0); } if( nounit ) { a->ptr.pp_double[i][j] = v+a->ptr.pp_double[i][i]*t.ptr.p_double[i]; } else { a->ptr.pp_double[i][j] = v+t.ptr.p_double[i]; } } ae_v_muld(&a->ptr.pp_double[0][j], a->stride, ae_v_len(0,j-1), ajj); } } } else { /* * Compute inverse of lower triangular matrix. 
*/ for(j=n-1; j>=0; j--) { if( nounit ) { if( ae_fp_eq(a->ptr.pp_double[j][j],(double)(0)) ) { result = ae_false; ae_frame_leave(_state); return result; } a->ptr.pp_double[j][j] = 1/a->ptr.pp_double[j][j]; ajj = -a->ptr.pp_double[j][j]; } else { ajj = (double)(-1); } if( jptr.pp_double[j+1][j], a->stride, ae_v_len(j+1,n-1)); for(i=j+1; i<=n-1; i++) { if( i>j+1 ) { v = ae_v_dotproduct(&a->ptr.pp_double[i][j+1], 1, &t.ptr.p_double[j+1], 1, ae_v_len(j+1,i-1)); } else { v = (double)(0); } if( nounit ) { a->ptr.pp_double[i][j] = v+a->ptr.pp_double[i][i]*t.ptr.p_double[i]; } else { a->ptr.pp_double[i][j] = v+t.ptr.p_double[i]; } } ae_v_muld(&a->ptr.pp_double[j+1][j], a->stride, ae_v_len(j+1,n-1), ajj); } } } ae_frame_leave(_state); return result; } /************************************************************************* Reference SYRK subroutine. -- ALGLIB routine -- 16.12.2009 Bochkanov Sergey *************************************************************************/ static void testablasunit_refcmatrixherk(ae_int_t n, ae_int_t k, double alpha, /* Complex */ ae_matrix* a, ae_int_t ia, ae_int_t ja, ae_int_t optypea, double beta, /* Complex */ ae_matrix* c, ae_int_t ic, ae_int_t jc, ae_bool isupper, ae_state *_state) { ae_frame _frame_block; ae_matrix ae; ae_int_t i; ae_int_t j; ae_complex vc; ae_frame_make(_state, &_frame_block); ae_matrix_init(&ae, 0, 0, DT_COMPLEX, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { if( (isupper&&j>=i)||(!isupper&&j<=i) ) { if( ae_fp_eq(beta,(double)(0)) ) { c->ptr.pp_complex[i+ic][j+jc] = ae_complex_from_i(0); } else { c->ptr.pp_complex[i+ic][j+jc] = ae_c_mul_d(c->ptr.pp_complex[i+ic][j+jc],beta); } } } } if( ae_fp_eq(alpha,(double)(0)) ) { ae_frame_leave(_state); return; } if( n*k>0 ) { ae_matrix_set_length(&ae, n, k, _state); } for(i=0; i<=n-1; i++) { for(j=0; j<=k-1; j++) { if( optypea==0 ) { ae.ptr.pp_complex[i][j] = a->ptr.pp_complex[ia+i][ja+j]; } if( optypea==2 ) { ae.ptr.pp_complex[i][j] = ae_c_conj(a->ptr.pp_complex[ia+j][ja+i], _state); } } } for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { vc = ae_complex_from_i(0); if( k>0 ) { vc = ae_v_cdotproduct(&ae.ptr.pp_complex[i][0], 1, "N", &ae.ptr.pp_complex[j][0], 1, "Conj", ae_v_len(0,k-1)); } vc = ae_c_mul_d(vc,alpha); if( isupper&&j>=i ) { c->ptr.pp_complex[ic+i][jc+j] = ae_c_add(vc,c->ptr.pp_complex[ic+i][jc+j]); } if( !isupper&&j<=i ) { c->ptr.pp_complex[ic+i][jc+j] = ae_c_add(vc,c->ptr.pp_complex[ic+i][jc+j]); } } } ae_frame_leave(_state); } /************************************************************************* Reference SYRK subroutine. 
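
Straightforward implementation of C := Alpha*op(A)*op(A)' + Beta*C, where
op(A)=A for OpTypeA=0 and op(A)=A' for OpTypeA=1; only the triangle of C
selected by IsUpper is updated. It serves as a reference against which the
optimized RMatrixSYRK is checked.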
-- ALGLIB routine -- 16.12.2009 Bochkanov Sergey *************************************************************************/ static void testablasunit_refrmatrixsyrk(ae_int_t n, ae_int_t k, double alpha, /* Real */ ae_matrix* a, ae_int_t ia, ae_int_t ja, ae_int_t optypea, double beta, /* Real */ ae_matrix* c, ae_int_t ic, ae_int_t jc, ae_bool isupper, ae_state *_state) { ae_frame _frame_block; ae_matrix ae; ae_int_t i; ae_int_t j; double vr; ae_frame_make(_state, &_frame_block); ae_matrix_init(&ae, 0, 0, DT_REAL, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { if( (isupper&&j>=i)||(!isupper&&j<=i) ) { if( ae_fp_eq(beta,(double)(0)) ) { c->ptr.pp_double[i+ic][j+jc] = (double)(0); } else { c->ptr.pp_double[i+ic][j+jc] = c->ptr.pp_double[i+ic][j+jc]*beta; } } } } if( ae_fp_eq(alpha,(double)(0)) ) { ae_frame_leave(_state); return; } if( n*k>0 ) { ae_matrix_set_length(&ae, n, k, _state); } for(i=0; i<=n-1; i++) { for(j=0; j<=k-1; j++) { if( optypea==0 ) { ae.ptr.pp_double[i][j] = a->ptr.pp_double[ia+i][ja+j]; } if( optypea==1 ) { ae.ptr.pp_double[i][j] = a->ptr.pp_double[ia+j][ja+i]; } } } for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { vr = (double)(0); if( k>0 ) { vr = ae_v_dotproduct(&ae.ptr.pp_double[i][0], 1, &ae.ptr.pp_double[j][0], 1, ae_v_len(0,k-1)); } vr = alpha*vr; if( isupper&&j>=i ) { c->ptr.pp_double[ic+i][jc+j] = vr+c->ptr.pp_double[ic+i][jc+j]; } if( !isupper&&j<=i ) { c->ptr.pp_double[ic+i][jc+j] = vr+c->ptr.pp_double[ic+i][jc+j]; } } } ae_frame_leave(_state); } /************************************************************************* Reference GEMM, ALGLIB subroutine *************************************************************************/ static void testablasunit_refcmatrixgemm(ae_int_t m, ae_int_t n, ae_int_t k, ae_complex alpha, /* Complex */ ae_matrix* a, ae_int_t ia, ae_int_t ja, ae_int_t optypea, /* Complex */ ae_matrix* b, ae_int_t ib, ae_int_t jb, ae_int_t optypeb, ae_complex beta, /* Complex */ ae_matrix* c, ae_int_t ic, ae_int_t jc, ae_state *_state) { ae_frame _frame_block; ae_matrix ae; ae_matrix be; ae_int_t i; ae_int_t j; ae_complex vc; ae_frame_make(_state, &_frame_block); ae_matrix_init(&ae, 0, 0, DT_COMPLEX, _state); ae_matrix_init(&be, 0, 0, DT_COMPLEX, _state); ae_matrix_set_length(&ae, m, k, _state); for(i=0; i<=m-1; i++) { for(j=0; j<=k-1; j++) { if( optypea==0 ) { ae.ptr.pp_complex[i][j] = a->ptr.pp_complex[ia+i][ja+j]; } if( optypea==1 ) { ae.ptr.pp_complex[i][j] = a->ptr.pp_complex[ia+j][ja+i]; } if( optypea==2 ) { ae.ptr.pp_complex[i][j] = ae_c_conj(a->ptr.pp_complex[ia+j][ja+i], _state); } } } ae_matrix_set_length(&be, k, n, _state); for(i=0; i<=k-1; i++) { for(j=0; j<=n-1; j++) { if( optypeb==0 ) { be.ptr.pp_complex[i][j] = b->ptr.pp_complex[ib+i][jb+j]; } if( optypeb==1 ) { be.ptr.pp_complex[i][j] = b->ptr.pp_complex[ib+j][jb+i]; } if( optypeb==2 ) { be.ptr.pp_complex[i][j] = ae_c_conj(b->ptr.pp_complex[ib+j][jb+i], _state); } } } for(i=0; i<=m-1; i++) { for(j=0; j<=n-1; j++) { vc = ae_v_cdotproduct(&ae.ptr.pp_complex[i][0], 1, "N", &be.ptr.pp_complex[0][j], be.stride, "N", ae_v_len(0,k-1)); vc = ae_c_mul(alpha,vc); if( ae_c_neq_d(beta,(double)(0)) ) { vc = ae_c_add(vc,ae_c_mul(beta,c->ptr.pp_complex[ic+i][jc+j])); } c->ptr.pp_complex[ic+i][jc+j] = vc; } } ae_frame_leave(_state); } /************************************************************************* Reference GEMM, ALGLIB subroutine *************************************************************************/ static void testablasunit_refrmatrixgemm(ae_int_t m, ae_int_t n, ae_int_t k, 
double alpha, /* Real */ ae_matrix* a, ae_int_t ia, ae_int_t ja, ae_int_t optypea, /* Real */ ae_matrix* b, ae_int_t ib, ae_int_t jb, ae_int_t optypeb, double beta, /* Real */ ae_matrix* c, ae_int_t ic, ae_int_t jc, ae_state *_state) { ae_frame _frame_block; ae_matrix ae; ae_matrix be; ae_int_t i; ae_int_t j; double vc; ae_frame_make(_state, &_frame_block); ae_matrix_init(&ae, 0, 0, DT_REAL, _state); ae_matrix_init(&be, 0, 0, DT_REAL, _state); ae_matrix_set_length(&ae, m, k, _state); for(i=0; i<=m-1; i++) { for(j=0; j<=k-1; j++) { if( optypea==0 ) { ae.ptr.pp_double[i][j] = a->ptr.pp_double[ia+i][ja+j]; } if( optypea==1 ) { ae.ptr.pp_double[i][j] = a->ptr.pp_double[ia+j][ja+i]; } } } ae_matrix_set_length(&be, k, n, _state); for(i=0; i<=k-1; i++) { for(j=0; j<=n-1; j++) { if( optypeb==0 ) { be.ptr.pp_double[i][j] = b->ptr.pp_double[ib+i][jb+j]; } if( optypeb==1 ) { be.ptr.pp_double[i][j] = b->ptr.pp_double[ib+j][jb+i]; } } } for(i=0; i<=m-1; i++) { for(j=0; j<=n-1; j++) { vc = ae_v_dotproduct(&ae.ptr.pp_double[i][0], 1, &be.ptr.pp_double[0][j], be.stride, ae_v_len(0,k-1)); vc = alpha*vc; if( ae_fp_neq(beta,(double)(0)) ) { vc = vc+beta*c->ptr.pp_double[ic+i][jc+j]; } c->ptr.pp_double[ic+i][jc+j] = vc; } } ae_frame_leave(_state); } static void testtrfacunit_testcluproblem(/* Complex */ ae_matrix* a, ae_int_t m, ae_int_t n, double threshold, ae_bool* err, ae_bool* properr, ae_state *_state); static void testtrfacunit_testrluproblem(/* Real */ ae_matrix* a, ae_int_t m, ae_int_t n, double threshold, ae_bool* err, ae_bool* properr, ae_state *_state); static void testtrfacunit_testdensecholeskyupdates(ae_bool* spdupderrorflag, ae_state *_state); ae_bool testtrfac(ae_bool silent, ae_state *_state) { ae_frame _frame_block; ae_matrix ra; ae_matrix ral; ae_matrix rau; ae_matrix ca; ae_matrix cal; ae_matrix cau; ae_int_t m; ae_int_t n; ae_int_t mx; ae_int_t maxmn; ae_int_t largemn; ae_int_t i; ae_int_t j; ae_complex vc; double vr; ae_bool waserrors; ae_bool dspderr; ae_bool sspderr; ae_bool hpderr; ae_bool rerr; ae_bool cerr; ae_bool properr; ae_bool dspdupderr; double threshold; ae_bool result; ae_frame_make(_state, &_frame_block); ae_matrix_init(&ra, 0, 0, DT_REAL, _state); ae_matrix_init(&ral, 0, 0, DT_REAL, _state); ae_matrix_init(&rau, 0, 0, DT_REAL, _state); ae_matrix_init(&ca, 0, 0, DT_COMPLEX, _state); ae_matrix_init(&cal, 0, 0, DT_COMPLEX, _state); ae_matrix_init(&cau, 0, 0, DT_COMPLEX, _state); rerr = ae_false; dspderr = ae_false; sspderr = ae_false; cerr = ae_false; hpderr = ae_false; properr = ae_false; dspdupderr = ae_false; waserrors = ae_false; maxmn = 4*ablasblocksize(&ra, _state)+1; largemn = 256; threshold = 1000*ae_machineepsilon*maxmn; /* * Sparse Cholesky */ sspderr = sparserealcholeskytest(_state); /* * Cholesky updates */ testtrfacunit_testdensecholeskyupdates(&dspdupderr, _state); /* * test LU: * * first, test on small-scale matrices * * then, perform several large-scale tests */ for(mx=1; mx<=maxmn; mx++) { /* * Initialize N/M, both are <=MX, * at least one of them is exactly equal to MX */ n = 1+ae_randominteger(mx, _state); m = 1+ae_randominteger(mx, _state); if( ae_fp_greater(ae_randomreal(_state),0.5) ) { n = mx; } else { m = mx; } /* * First, test on zero matrix */ ae_matrix_set_length(&ra, m, n, _state); ae_matrix_set_length(&ca, m, n, _state); for(i=0; i<=m-1; i++) { for(j=0; j<=n-1; j++) { ra.ptr.pp_double[i][j] = (double)(0); ca.ptr.pp_complex[i][j] = ae_complex_from_i(0); } } testtrfacunit_testcluproblem(&ca, m, n, threshold, &cerr, &properr, _state); 
testtrfacunit_testrluproblem(&ra, m, n, threshold, &rerr, &properr, _state); /* * Second, random matrix with moderate condition number */ ae_matrix_set_length(&ra, m, n, _state); ae_matrix_set_length(&ca, m, n, _state); for(i=0; i<=m-1; i++) { for(j=0; j<=n-1; j++) { ra.ptr.pp_double[i][j] = (double)(0); ca.ptr.pp_complex[i][j] = ae_complex_from_i(0); } } for(i=0; i<=ae_minint(m, n, _state)-1; i++) { ra.ptr.pp_double[i][i] = 1+10*ae_randomreal(_state); ca.ptr.pp_complex[i][i] = ae_complex_from_d(1+10*ae_randomreal(_state)); } cmatrixrndorthogonalfromtheleft(&ca, m, n, _state); cmatrixrndorthogonalfromtheright(&ca, m, n, _state); rmatrixrndorthogonalfromtheleft(&ra, m, n, _state); rmatrixrndorthogonalfromtheright(&ra, m, n, _state); testtrfacunit_testcluproblem(&ca, m, n, threshold, &cerr, &properr, _state); testtrfacunit_testrluproblem(&ra, m, n, threshold, &rerr, &properr, _state); } for(m=largemn-1; m<=largemn+1; m++) { for(n=largemn-1; n<=largemn+1; n++) { /* * Random matrix with moderate condition number */ ae_matrix_set_length(&ra, m, n, _state); ae_matrix_set_length(&ca, m, n, _state); for(i=0; i<=m-1; i++) { for(j=0; j<=n-1; j++) { ra.ptr.pp_double[i][j] = (double)(0); ca.ptr.pp_complex[i][j] = ae_complex_from_i(0); } } for(i=0; i<=ae_minint(m, n, _state)-1; i++) { ra.ptr.pp_double[i][i] = 1+10*ae_randomreal(_state); ca.ptr.pp_complex[i][i] = ae_complex_from_d(1+10*ae_randomreal(_state)); } cmatrixrndorthogonalfromtheleft(&ca, m, n, _state); cmatrixrndorthogonalfromtheright(&ca, m, n, _state); rmatrixrndorthogonalfromtheleft(&ra, m, n, _state); rmatrixrndorthogonalfromtheright(&ra, m, n, _state); testtrfacunit_testcluproblem(&ca, m, n, threshold, &cerr, &properr, _state); testtrfacunit_testrluproblem(&ra, m, n, threshold, &rerr, &properr, _state); } } /* * Test Cholesky */ for(n=1; n<=maxmn; n++) { /* * Load CA (HPD matrix with low condition number), * CAL and CAU - its lower and upper triangles */ hpdmatrixrndcond(n, 1+50*ae_randomreal(_state), &ca, _state); ae_matrix_set_length(&cal, n, n, _state); ae_matrix_set_length(&cau, n, n, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { cal.ptr.pp_complex[i][j] = ae_complex_from_i(i); cau.ptr.pp_complex[i][j] = ae_complex_from_i(j); } } for(i=0; i<=n-1; i++) { ae_v_cmove(&cal.ptr.pp_complex[i][0], 1, &ca.ptr.pp_complex[i][0], 1, "N", ae_v_len(0,i)); ae_v_cmove(&cau.ptr.pp_complex[i][i], 1, &ca.ptr.pp_complex[i][i], 1, "N", ae_v_len(i,n-1)); } /* * Test HPDMatrixCholesky: * 1. it must leave upper (lower) part unchanged * 2. 
max(A-L*L^H) must be small */ if( hpdmatrixcholesky(&cal, n, ae_false, _state) ) { for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { if( j>i ) { hpderr = hpderr||ae_c_neq_d(cal.ptr.pp_complex[i][j],(double)(i)); } else { vc = ae_v_cdotproduct(&cal.ptr.pp_complex[i][0], 1, "N", &cal.ptr.pp_complex[j][0], 1, "Conj", ae_v_len(0,j)); hpderr = hpderr||ae_fp_greater(ae_c_abs(ae_c_sub(ca.ptr.pp_complex[i][j],vc), _state),threshold); } } } } else { hpderr = ae_true; } if( hpdmatrixcholesky(&cau, n, ae_true, _state) ) { for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { if( ji ) { dspderr = dspderr||ae_fp_neq(ral.ptr.pp_double[i][j],(double)(i)); } else { vr = ae_v_dotproduct(&ral.ptr.pp_double[i][0], 1, &ral.ptr.pp_double[j][0], 1, ae_v_len(0,j)); dspderr = dspderr||ae_fp_greater(ae_fabs(ra.ptr.pp_double[i][j]-vr, _state),threshold); } } } } else { dspderr = ae_true; } if( spdmatrixcholesky(&rau, n, ae_true, _state) ) { for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { if( j1 ) { pnz = (double)nz/(double)(n*n-n); } else { pnz = 1.0; } ae_matrix_set_length(&a, n, n, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=i; j++) { if( i==j ) { a.ptr.pp_double[i][i] = 1+hqrnduniformr(&rs, _state); continue; } if( ae_fp_less_eq(hqrnduniformr(&rs, _state),pnz) ) { a.ptr.pp_double[i][j] = offscale*(hqrnduniformr(&rs, _state)-0.5); a.ptr.pp_double[j][i] = a.ptr.pp_double[i][j]; } else { a.ptr.pp_double[i][j] = 0.0; a.ptr.pp_double[j][i] = 0.0; } } } /* * Problem statement */ isupper = ae_fp_greater(ae_randomreal(_state),0.5); cfmt = ae_randominteger(maxfmt+1, _state); cord = ae_randominteger(maxord+1-minord, _state)+minord; /* * Create matrix is hash-based storage format, convert it to random storage format. */ sparsecreate(n, n, 0, &sa, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { if( (j<=i&&!isupper)||(j>=i&&isupper) ) { sparseset(&sa, i, j, a.ptr.pp_double[i][j], _state); } } } sparseconvertto(&sa, hqrnduniformi(&rs, maxfmt+1, _state), _state); /* * Perform sparse Cholesky and make several tests: * * correctness of P0 and P1 (they are correct permutations and one is inverse of another) * * format of SC matches CFmt * * SC has correct size (exactly N*N) * * check that correct triangle is returned */ if( !sparsecholeskyx(&sa, n, isupper, &p0, &p1, cord, ae_randominteger(3, _state), cfmt, &sbuf, &sc, _state) ) { seterrorflag(&result, ae_true, _state); ae_frame_leave(_state); return result; } seterrorflag(&result, p0.cnt=n, _state); seterrorflag(&result, p1.ptr.p_int[i]>=n, _state); if( result ) { ae_frame_leave(_state); return result; } seterrorflag(&result, b1.ptr.p_bool[p0.ptr.p_int[i]], _state); b1.ptr.p_bool[p0.ptr.p_int[i]] = ae_true; seterrorflag(&result, p1.ptr.p_int[p0.ptr.p_int[i]]!=i, _state); } seterrorflag(&result, sparsegetmatrixtype(&sc, _state)!=cfmt, _state); seterrorflag(&result, sparsegetncols(&sc, _state)!=n, _state); seterrorflag(&result, sparsegetnrows(&sc, _state)!=n, _state); t0 = 0; t1 = 0; while(sparseenumerate(&sc, &t0, &t1, &i, &j, &v, _state)) { seterrorflag(&result, ji&&!isupper, _state); } /* * Now, test correctness of Cholesky decomposition itself. * We calculate U'*U (or L*L') and check at against permutation * of A given by P0. * * NOTE: we expect that only one triangle of SC is filled, * and another one is exactly zero. 
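* For the upper-triangular variant we accumulate (U'*U)[i,j] element by
* element via SparseGet and compare it against A[P0[i],P0[j]]; the
* lower-triangular variant performs the same check with L*L'.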
*/ if( isupper ) { for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { v = 0.0; for(k=0; k<=n-1; k++) { v = v+sparseget(&sc, k, j, _state)*sparseget(&sc, k, i, _state); } seterrorflag(&result, ae_fp_greater(ae_fabs(a.ptr.pp_double[p0.ptr.p_int[i]][p0.ptr.p_int[j]]-v, _state),tol), _state); } } } else { for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { v = 0.0; for(k=0; k<=n-1; k++) { v = v+sparseget(&sc, j, k, _state)*sparseget(&sc, i, k, _state); } seterrorflag(&result, ae_fp_greater(ae_fabs(a.ptr.pp_double[p0.ptr.p_int[i]][p0.ptr.p_int[j]]-v, _state),tol), _state); } } } /* * Increase problem sparcity and try one more time. * Stop after testing NZ=0. */ if( nz==0 ) { break; } nz = nz/2; } } /* * SparseCholeskySkyline test: performed for matrices * of all sizes in 1..20 and all sparcity percentages. */ for(n=1; n<=20; n++) { nz = n*n-n; for(;;) { /* * Choose IsUpper - main triangle to work with. * * Generate A - symmetric N*N matrix where probability of non-diagonal * element being non-zero is PNZ. Off-diagonal elements are set to * very small values, so positive definiteness is guaranteed. Full matrix * is generated. * * Additionally, we create A1 - same as A, but one of the triangles is * asymmetrically spoiled. If IsUpper is True, we spoil lower one, or vice versa. */ isupper = ae_fp_greater(ae_randomreal(_state),0.5); if( n>1 ) { pnz = (double)nz/(double)(n*n-n); } else { pnz = 1.0; } ae_matrix_set_length(&a, n, n, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=i; j++) { if( i==j ) { a.ptr.pp_double[i][i] = 1+hqrnduniformr(&rs, _state); continue; } if( ae_fp_less_eq(hqrnduniformr(&rs, _state),pnz) ) { a.ptr.pp_double[i][j] = offscale*(hqrnduniformr(&rs, _state)-0.5); a.ptr.pp_double[j][i] = a.ptr.pp_double[i][j]; } else { a.ptr.pp_double[i][j] = 0.0; a.ptr.pp_double[j][i] = 0.0; } } } ae_matrix_set_length(&a1, n, n, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { if( (j<=i&&!isupper)||(j>=i&&isupper) ) { /* * Copy one triangle */ a1.ptr.pp_double[i][j] = a.ptr.pp_double[i][j]; } else { /* * Form another sparse pattern in different triangle. */ if( ae_fp_less_eq(hqrnduniformr(&rs, _state),pnz) ) { a1.ptr.pp_double[i][j] = offscale*(hqrnduniformr(&rs, _state)-0.5); } else { a1.ptr.pp_double[i][j] = 0.0; } } } } /* * Create copies of A and A1 in hash-based storage format. * Only one triangle of A is copied, but A1 is copied fully. * Convert them to SKS */ sparsecreate(n, n, 0, &sa, _state); sparsecreate(n, n, 0, &sa1, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { if( (j<=i&&!isupper)||(j>=i&&isupper) ) { sparseset(&sa, i, j, a.ptr.pp_double[i][j], _state); } sparseset(&sa1, i, j, a1.ptr.pp_double[i][j], _state); } } sparseconverttosks(&sa, _state); sparseconverttosks(&sa1, _state); /* * Call SparseCholeskySkyline() for SA and make several tests: * * check that it is still SKS * * check that it has correct size (exactly N*N) * * check that correct triangle is returned (and another one is unchanged - zero) * * check that it is correct Cholesky decomposition. * We calculate U'*U (or L*L') and check at against A. We expect * that only one triangle of SA is filled, and another one is * exactly zero. 
*/ if( !sparsecholeskyskyline(&sa, n, isupper, _state) ) { seterrorflag(&result, ae_true, _state); ae_frame_leave(_state); return result; } seterrorflag(&result, !sparseissks(&sa, _state), _state); seterrorflag(&result, sparsegetncols(&sa, _state)!=n, _state); seterrorflag(&result, sparsegetnrows(&sa, _state)!=n, _state); t0 = 0; t1 = 0; while(sparseenumerate(&sa, &t0, &t1, &i, &j, &v, _state)) { seterrorflag(&result, ji&&!isupper, _state); } if( isupper ) { for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { v = 0.0; for(k=0; k<=n-1; k++) { v = v+sparseget(&sa, k, j, _state)*sparseget(&sa, k, i, _state); } seterrorflag(&result, ae_fp_greater(ae_fabs(a.ptr.pp_double[i][j]-v, _state),tol), _state); } } } else { for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { v = 0.0; for(k=0; k<=n-1; k++) { v = v+sparseget(&sa, j, k, _state)*sparseget(&sa, i, k, _state); } seterrorflag(&result, ae_fp_greater(ae_fabs(a.ptr.pp_double[i][j]-v, _state),tol), _state); } } } /* * Call SparseCholeskySkyline() for SA1 and make several tests: * * check that it is still SKS * * check that it has correct size (exactly N*N) * * check that factorized triangle matches contents of SA, * and another triangle was unchanged (matches contents of A1). */ if( !sparsecholeskyskyline(&sa1, n, isupper, _state) ) { seterrorflag(&result, ae_true, _state); ae_frame_leave(_state); return result; } seterrorflag(&result, !sparseissks(&sa1, _state), _state); seterrorflag(&result, sparsegetncols(&sa1, _state)!=n, _state); seterrorflag(&result, sparsegetnrows(&sa1, _state)!=n, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { if( (j<=i&&!isupper)||(j>=i&&isupper) ) { seterrorflag(&result, ae_fp_greater(ae_fabs(sparseget(&sa1, i, j, _state)-sparseget(&sa, i, j, _state), _state),10*ae_machineepsilon), _state); } else { seterrorflag(&result, ae_fp_greater(ae_fabs(sparseget(&sa1, i, j, _state)-a1.ptr.pp_double[i][j], _state),10*ae_machineepsilon), _state); } } } /* * Increase problem sparcity and try one more time. * Stop after testing NZ=0. 
*/ if( nz==0 ) { break; } nz = nz/2; } } ae_frame_leave(_state); return result; } static void testtrfacunit_testcluproblem(/* Complex */ ae_matrix* a, ae_int_t m, ae_int_t n, double threshold, ae_bool* err, ae_bool* properr, ae_state *_state) { ae_frame _frame_block; ae_matrix ca; ae_matrix cl; ae_matrix cu; ae_matrix ca2; ae_vector ct; ae_int_t i; ae_int_t j; ae_int_t minmn; ae_complex v; ae_vector p; ae_frame_make(_state, &_frame_block); ae_matrix_init(&ca, 0, 0, DT_COMPLEX, _state); ae_matrix_init(&cl, 0, 0, DT_COMPLEX, _state); ae_matrix_init(&cu, 0, 0, DT_COMPLEX, _state); ae_matrix_init(&ca2, 0, 0, DT_COMPLEX, _state); ae_vector_init(&ct, 0, DT_COMPLEX, _state); ae_vector_init(&p, 0, DT_INT, _state); minmn = ae_minint(m, n, _state); /* * PLU test */ ae_matrix_set_length(&ca, m, n, _state); for(i=0; i<=m-1; i++) { ae_v_cmove(&ca.ptr.pp_complex[i][0], 1, &a->ptr.pp_complex[i][0], 1, "N", ae_v_len(0,n-1)); } cmatrixplu(&ca, m, n, &p, _state); for(i=0; i<=minmn-1; i++) { if( p.ptr.p_int[i]=m ) { *properr = ae_true; ae_frame_leave(_state); return; } } ae_matrix_set_length(&cl, m, minmn, _state); for(j=0; j<=minmn-1; j++) { for(i=0; i<=j-1; i++) { cl.ptr.pp_complex[i][j] = ae_complex_from_d(0.0); } cl.ptr.pp_complex[j][j] = ae_complex_from_d(1.0); for(i=j+1; i<=m-1; i++) { cl.ptr.pp_complex[i][j] = ca.ptr.pp_complex[i][j]; } } ae_matrix_set_length(&cu, minmn, n, _state); for(i=0; i<=minmn-1; i++) { for(j=0; j<=i-1; j++) { cu.ptr.pp_complex[i][j] = ae_complex_from_d(0.0); } for(j=i; j<=n-1; j++) { cu.ptr.pp_complex[i][j] = ca.ptr.pp_complex[i][j]; } } ae_matrix_set_length(&ca2, m, n, _state); for(i=0; i<=m-1; i++) { for(j=0; j<=n-1; j++) { v = ae_v_cdotproduct(&cl.ptr.pp_complex[i][0], 1, "N", &cu.ptr.pp_complex[0][j], cu.stride, "N", ae_v_len(0,minmn-1)); ca2.ptr.pp_complex[i][j] = v; } } ae_vector_set_length(&ct, n, _state); for(i=minmn-1; i>=0; i--) { if( i!=p.ptr.p_int[i] ) { ae_v_cmove(&ct.ptr.p_complex[0], 1, &ca2.ptr.pp_complex[i][0], 1, "N", ae_v_len(0,n-1)); ae_v_cmove(&ca2.ptr.pp_complex[i][0], 1, &ca2.ptr.pp_complex[p.ptr.p_int[i]][0], 1, "N", ae_v_len(0,n-1)); ae_v_cmove(&ca2.ptr.pp_complex[p.ptr.p_int[i]][0], 1, &ct.ptr.p_complex[0], 1, "N", ae_v_len(0,n-1)); } } for(i=0; i<=m-1; i++) { for(j=0; j<=n-1; j++) { *err = *err||ae_fp_greater(ae_c_abs(ae_c_sub(a->ptr.pp_complex[i][j],ca2.ptr.pp_complex[i][j]), _state),threshold); } } /* * LUP test */ ae_matrix_set_length(&ca, m, n, _state); for(i=0; i<=m-1; i++) { ae_v_cmove(&ca.ptr.pp_complex[i][0], 1, &a->ptr.pp_complex[i][0], 1, "N", ae_v_len(0,n-1)); } cmatrixlup(&ca, m, n, &p, _state); for(i=0; i<=minmn-1; i++) { if( p.ptr.p_int[i]=n ) { *properr = ae_true; ae_frame_leave(_state); return; } } ae_matrix_set_length(&cl, m, minmn, _state); for(j=0; j<=minmn-1; j++) { for(i=0; i<=j-1; i++) { cl.ptr.pp_complex[i][j] = ae_complex_from_d(0.0); } for(i=j; i<=m-1; i++) { cl.ptr.pp_complex[i][j] = ca.ptr.pp_complex[i][j]; } } ae_matrix_set_length(&cu, minmn, n, _state); for(i=0; i<=minmn-1; i++) { for(j=0; j<=i-1; j++) { cu.ptr.pp_complex[i][j] = ae_complex_from_d(0.0); } cu.ptr.pp_complex[i][i] = ae_complex_from_d(1.0); for(j=i+1; j<=n-1; j++) { cu.ptr.pp_complex[i][j] = ca.ptr.pp_complex[i][j]; } } ae_matrix_set_length(&ca2, m, n, _state); for(i=0; i<=m-1; i++) { for(j=0; j<=n-1; j++) { v = ae_v_cdotproduct(&cl.ptr.pp_complex[i][0], 1, "N", &cu.ptr.pp_complex[0][j], cu.stride, "N", ae_v_len(0,minmn-1)); ca2.ptr.pp_complex[i][j] = v; } } ae_vector_set_length(&ct, m, _state); for(i=minmn-1; i>=0; i--) { if( i!=p.ptr.p_int[i] ) { 
ae_v_cmove(&ct.ptr.p_complex[0], 1, &ca2.ptr.pp_complex[0][i], ca2.stride, "N", ae_v_len(0,m-1)); ae_v_cmove(&ca2.ptr.pp_complex[0][i], ca2.stride, &ca2.ptr.pp_complex[0][p.ptr.p_int[i]], ca2.stride, "N", ae_v_len(0,m-1)); ae_v_cmove(&ca2.ptr.pp_complex[0][p.ptr.p_int[i]], ca2.stride, &ct.ptr.p_complex[0], 1, "N", ae_v_len(0,m-1)); } } for(i=0; i<=m-1; i++) { for(j=0; j<=n-1; j++) { *err = *err||ae_fp_greater(ae_c_abs(ae_c_sub(a->ptr.pp_complex[i][j],ca2.ptr.pp_complex[i][j]), _state),threshold); } } ae_frame_leave(_state); } static void testtrfacunit_testrluproblem(/* Real */ ae_matrix* a, ae_int_t m, ae_int_t n, double threshold, ae_bool* err, ae_bool* properr, ae_state *_state) { ae_frame _frame_block; ae_matrix ca; ae_matrix cl; ae_matrix cu; ae_matrix ca2; ae_vector ct; ae_int_t i; ae_int_t j; ae_int_t minmn; double v; ae_vector p; ae_frame_make(_state, &_frame_block); ae_matrix_init(&ca, 0, 0, DT_REAL, _state); ae_matrix_init(&cl, 0, 0, DT_REAL, _state); ae_matrix_init(&cu, 0, 0, DT_REAL, _state); ae_matrix_init(&ca2, 0, 0, DT_REAL, _state); ae_vector_init(&ct, 0, DT_REAL, _state); ae_vector_init(&p, 0, DT_INT, _state); minmn = ae_minint(m, n, _state); /* * PLU test */ ae_matrix_set_length(&ca, m, n, _state); for(i=0; i<=m-1; i++) { ae_v_move(&ca.ptr.pp_double[i][0], 1, &a->ptr.pp_double[i][0], 1, ae_v_len(0,n-1)); } rmatrixplu(&ca, m, n, &p, _state); for(i=0; i<=minmn-1; i++) { if( p.ptr.p_int[i]=m ) { *properr = ae_true; ae_frame_leave(_state); return; } } ae_matrix_set_length(&cl, m, minmn, _state); for(j=0; j<=minmn-1; j++) { for(i=0; i<=j-1; i++) { cl.ptr.pp_double[i][j] = 0.0; } cl.ptr.pp_double[j][j] = 1.0; for(i=j+1; i<=m-1; i++) { cl.ptr.pp_double[i][j] = ca.ptr.pp_double[i][j]; } } ae_matrix_set_length(&cu, minmn, n, _state); for(i=0; i<=minmn-1; i++) { for(j=0; j<=i-1; j++) { cu.ptr.pp_double[i][j] = 0.0; } for(j=i; j<=n-1; j++) { cu.ptr.pp_double[i][j] = ca.ptr.pp_double[i][j]; } } ae_matrix_set_length(&ca2, m, n, _state); for(i=0; i<=m-1; i++) { for(j=0; j<=n-1; j++) { v = ae_v_dotproduct(&cl.ptr.pp_double[i][0], 1, &cu.ptr.pp_double[0][j], cu.stride, ae_v_len(0,minmn-1)); ca2.ptr.pp_double[i][j] = v; } } ae_vector_set_length(&ct, n, _state); for(i=minmn-1; i>=0; i--) { if( i!=p.ptr.p_int[i] ) { ae_v_move(&ct.ptr.p_double[0], 1, &ca2.ptr.pp_double[i][0], 1, ae_v_len(0,n-1)); ae_v_move(&ca2.ptr.pp_double[i][0], 1, &ca2.ptr.pp_double[p.ptr.p_int[i]][0], 1, ae_v_len(0,n-1)); ae_v_move(&ca2.ptr.pp_double[p.ptr.p_int[i]][0], 1, &ct.ptr.p_double[0], 1, ae_v_len(0,n-1)); } } for(i=0; i<=m-1; i++) { for(j=0; j<=n-1; j++) { *err = *err||ae_fp_greater(ae_fabs(a->ptr.pp_double[i][j]-ca2.ptr.pp_double[i][j], _state),threshold); } } /* * LUP test */ ae_matrix_set_length(&ca, m, n, _state); for(i=0; i<=m-1; i++) { ae_v_move(&ca.ptr.pp_double[i][0], 1, &a->ptr.pp_double[i][0], 1, ae_v_len(0,n-1)); } rmatrixlup(&ca, m, n, &p, _state); for(i=0; i<=minmn-1; i++) { if( p.ptr.p_int[i]=n ) { *properr = ae_true; ae_frame_leave(_state); return; } } ae_matrix_set_length(&cl, m, minmn, _state); for(j=0; j<=minmn-1; j++) { for(i=0; i<=j-1; i++) { cl.ptr.pp_double[i][j] = 0.0; } for(i=j; i<=m-1; i++) { cl.ptr.pp_double[i][j] = ca.ptr.pp_double[i][j]; } } ae_matrix_set_length(&cu, minmn, n, _state); for(i=0; i<=minmn-1; i++) { for(j=0; j<=i-1; j++) { cu.ptr.pp_double[i][j] = 0.0; } cu.ptr.pp_double[i][i] = 1.0; for(j=i+1; j<=n-1; j++) { cu.ptr.pp_double[i][j] = ca.ptr.pp_double[i][j]; } } ae_matrix_set_length(&ca2, m, n, _state); for(i=0; i<=m-1; i++) { for(j=0; j<=n-1; j++) { v = 
ae_v_dotproduct(&cl.ptr.pp_double[i][0], 1, &cu.ptr.pp_double[0][j], cu.stride, ae_v_len(0,minmn-1)); ca2.ptr.pp_double[i][j] = v; } } ae_vector_set_length(&ct, m, _state); for(i=minmn-1; i>=0; i--) { if( i!=p.ptr.p_int[i] ) { ae_v_move(&ct.ptr.p_double[0], 1, &ca2.ptr.pp_double[0][i], ca2.stride, ae_v_len(0,m-1)); ae_v_move(&ca2.ptr.pp_double[0][i], ca2.stride, &ca2.ptr.pp_double[0][p.ptr.p_int[i]], ca2.stride, ae_v_len(0,m-1)); ae_v_move(&ca2.ptr.pp_double[0][p.ptr.p_int[i]], ca2.stride, &ct.ptr.p_double[0], 1, ae_v_len(0,m-1)); } } for(i=0; i<=m-1; i++) { for(j=0; j<=n-1; j++) { *err = *err||ae_fp_greater(ae_fabs(a->ptr.pp_double[i][j]-ca2.ptr.pp_double[i][j], _state),threshold); } } ae_frame_leave(_state); } /************************************************************************* Function for testing dense Cholesky updates Sets error flag to True on errors, does not change it on success. -- ALGLIB PROJECT -- Copyright 16.01.1014 by Bochkanov Sergey *************************************************************************/ static void testtrfacunit_testdensecholeskyupdates(ae_bool* spdupderrorflag, ae_state *_state) { ae_frame _frame_block; ae_int_t n; double pfix; ae_matrix a0; ae_matrix a1; ae_vector u; ae_vector fix; ae_int_t i; ae_int_t j; ae_bool isupper; double tol; ae_vector bufr; hqrndstate rs; ae_frame_make(_state, &_frame_block); ae_matrix_init(&a0, 0, 0, DT_REAL, _state); ae_matrix_init(&a1, 0, 0, DT_REAL, _state); ae_vector_init(&u, 0, DT_REAL, _state); ae_vector_init(&fix, 0, DT_BOOL, _state); ae_vector_init(&bufr, 0, DT_REAL, _state); _hqrndstate_init(&rs, _state); hqrndrandomize(&rs, _state); /* * Settings */ tol = 1.0E-8; /* * Test rank-1 updates * * For each matrix size in 1..30 select sparse update vector with probability of element * being non-zero equal to 1/2. */ for(n=1; n<=30; n++) { /* * Generate two matrices A0=A1, fill one triangle with SPD matrix, * another one with trash. Prepare vector U. 
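* The comparison below relies on the fact that updating the factor of A0 with
* SPDMatrixCholeskyUpdateAdd1 (or its buffered variant) must reproduce, up to
* the tolerance Tol, the factor obtained by explicitly forming A1 = A0 + U*U'
* and running full SPDMatrixCholesky on it.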
*/
isupper = ae_fp_less(hqrnduniformr(&rs, _state),0.5); spdmatrixrndcond(n, 1.0E4, &a0, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { if( (j<i&&isupper)||(j>i&&!isupper) ) { a0.ptr.pp_double[i][j] = hqrnduniformr(&rs, _state)-0.5; } } } ae_matrix_set_length(&a1, n, n, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { a1.ptr.pp_double[i][j] = a0.ptr.pp_double[i][j]; } } ae_vector_set_length(&u, n, _state); for(i=0; i<=n-1; i++) { if( ae_fp_less_eq(hqrnduniformr(&rs, _state),0.5) ) { u.ptr.p_double[i] = hqrnduniformr(&rs, _state)-0.5; } else { u.ptr.p_double[i] = (double)(0); } }
/* * Factorize and compare: * * A0 is factorized as follows: first with full Cholesky, then * we call SPDMatrixCholeskyUpdateAdd1 * * A1 is transformed explicitly before factorization with full Cholesky * * We randomly test either SPDMatrixCholeskyUpdateAdd1() or its * buffered version, SPDMatrixCholeskyUpdateAdd1Buf() */
seterrorflag(spdupderrorflag, !spdmatrixcholesky(&a0, n, isupper, _state), _state); if( *spdupderrorflag ) { ae_frame_leave(_state); return; } if( ae_fp_less(hqrnduniformr(&rs, _state),0.5) ) { spdmatrixcholeskyupdateadd1(&a0, n, isupper, &u, _state); } else { spdmatrixcholeskyupdateadd1buf(&a0, n, isupper, &u, &bufr, _state); } for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { if( (j>=i&&isupper)||(j<=i&&!isupper) ) { a1.ptr.pp_double[i][j] = a1.ptr.pp_double[i][j]+u.ptr.p_double[i]*u.ptr.p_double[j]; } } } seterrorflag(spdupderrorflag, !spdmatrixcholesky(&a1, n, isupper, _state), _state); if( *spdupderrorflag ) { ae_frame_leave(_state); return; } for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { seterrorflag(spdupderrorflag, ae_fp_greater(ae_fabs(a0.ptr.pp_double[i][j]-a1.ptr.pp_double[i][j], _state),tol), _state); } } }
/* * Test variable fixing functions. * * For each matrix size in 1..30 select PFix - probability of each variable being fixed, * and perform test. */
for(n=1; n<=30; n++) {
/* * Generate two matrices A0=A1, fill one triangle with SPD matrix, * another one with trash. Prepare vector Fix.
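* Fixing variable I amounts to replacing row and column I of the matrix by the
* I-th unit vector (one on the diagonal, zeros elsewhere); A1 is modified this
* way explicitly and re-factored, and the result must match the factor of A0
* updated in place by SPDMatrixCholeskyUpdateFix/SPDMatrixCholeskyUpdateFixBuf.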
*/
pfix = (double)hqrnduniformi(&rs, n+1, _state)/(double)n; isupper = ae_fp_less(hqrnduniformr(&rs, _state),0.5); spdmatrixrndcond(n, 1.0E4, &a0, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { if( (j<i&&isupper)||(j>i&&!isupper) ) { a0.ptr.pp_double[i][j] = hqrnduniformr(&rs, _state)-0.5; } } } ae_matrix_set_length(&a1, n, n, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { a1.ptr.pp_double[i][j] = a0.ptr.pp_double[i][j]; } } ae_vector_set_length(&fix, n, _state); for(i=0; i<=n-1; i++) { fix.ptr.p_bool[i] = ae_fp_less_eq(hqrnduniformr(&rs, _state),pfix); }
/* * Factorize and compare: * * A0 is factorized as follows: first with full Cholesky, then * variables are fixed with SPDMatrixCholeskyUpdateFix * * A1 is fixed explicitly before factorization with full Cholesky * * We randomly test either SPDMatrixCholeskyUpdateFix() or its * buffered version, SPDMatrixCholeskyUpdateFixBuf() */
seterrorflag(spdupderrorflag, !spdmatrixcholesky(&a0, n, isupper, _state), _state); if( *spdupderrorflag ) { ae_frame_leave(_state); return; } if( ae_fp_less(hqrnduniformr(&rs, _state),0.5) ) { spdmatrixcholeskyupdatefixbuf(&a0, n, isupper, &fix, &bufr, _state); } else { spdmatrixcholeskyupdatefix(&a0, n, isupper, &fix, _state); } for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { if( (j>=i&&isupper)||(j<=i&&!isupper) ) { if( fix.ptr.p_bool[i]||fix.ptr.p_bool[j] ) { if( i==j ) { a1.ptr.pp_double[i][j] = (double)(1); } else { a1.ptr.pp_double[i][j] = (double)(0); } } } } } seterrorflag(spdupderrorflag, !spdmatrixcholesky(&a1, n, isupper, _state), _state); if( *spdupderrorflag ) { ae_frame_leave(_state); return; } for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { seterrorflag(spdupderrorflag, ae_fp_greater(ae_fabs(a0.ptr.pp_double[i][j]-a1.ptr.pp_double[i][j], _state),tol), _state); } } } ae_frame_leave(_state); }
/************************************************************************* Main unittest subroutine *************************************************************************/
ae_bool testtrlinsolve(ae_bool silent, ae_state *_state) { ae_frame _frame_block; ae_int_t maxmn; ae_int_t passcount; double threshold; ae_matrix aeffective; ae_matrix aparam; ae_vector xe; ae_vector b; ae_int_t n; ae_int_t pass; ae_int_t i; ae_int_t j; ae_int_t cnts; ae_int_t cntu; ae_int_t cntt; ae_int_t cntm; ae_bool waserrors; ae_bool isupper; ae_bool istrans; ae_bool isunit; double v; double s; ae_bool result; ae_frame_make(_state, &_frame_block); ae_matrix_init(&aeffective, 0, 0, DT_REAL, _state); ae_matrix_init(&aparam, 0, 0, DT_REAL, _state); ae_vector_init(&xe, 0, DT_REAL, _state); ae_vector_init(&b, 0, DT_REAL, _state); waserrors = ae_false; maxmn = 15; passcount = 15; threshold = 1000*ae_machineepsilon;
/* * Different problems */
for(n=1; n<=maxmn; n++) { ae_matrix_set_length(&aeffective, n-1+1, n-1+1, _state); ae_matrix_set_length(&aparam, n-1+1, n-1+1, _state); ae_vector_set_length(&xe, n-1+1, _state); ae_vector_set_length(&b, n-1+1, _state); for(pass=1; pass<=passcount; pass++) { for(cnts=0; cnts<=1; cnts++) { for(cntu=0; cntu<=1; cntu++) { for(cntt=0; cntt<=1; cntt++) { for(cntm=0; cntm<=2; cntm++) { isupper = cnts==0; isunit = cntu==0; istrans = cntt==0;
/* * Skip meaningless combinations of parameters: * (matrix is singular) AND (matrix is unit diagonal) */
if( cntm==2&&isunit ) { continue; }
/* * Clear matrices */
for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { aeffective.ptr.pp_double[i][j] = (double)(0); aparam.ptr.pp_double[i][j] = (double)(0); } }
/* * Prepare matrices */
if( isupper ) { for(i=0; i<=n-1; i++) {
for(j=i; j<=n-1; j++) { aeffective.ptr.pp_double[i][j] = 0.9*(2*ae_randomreal(_state)-1); aparam.ptr.pp_double[i][j] = aeffective.ptr.pp_double[i][j]; } aeffective.ptr.pp_double[i][i] = (2*ae_randominteger(2, _state)-1)*(0.8+ae_randomreal(_state)); aparam.ptr.pp_double[i][i] = aeffective.ptr.pp_double[i][i]; } } else { for(i=0; i<=n-1; i++) { for(j=0; j<=i; j++) { aeffective.ptr.pp_double[i][j] = 0.9*(2*ae_randomreal(_state)-1); aparam.ptr.pp_double[i][j] = aeffective.ptr.pp_double[i][j]; } aeffective.ptr.pp_double[i][i] = (2*ae_randominteger(2, _state)-1)*(0.8+ae_randomreal(_state)); aparam.ptr.pp_double[i][i] = aeffective.ptr.pp_double[i][i]; } } if( isunit ) { for(i=0; i<=n-1; i++) { aeffective.ptr.pp_double[i][i] = (double)(1); aparam.ptr.pp_double[i][i] = (double)(0); } } if( istrans ) { if( isupper ) { for(i=0; i<=n-1; i++) { for(j=i+1; j<=n-1; j++) { aeffective.ptr.pp_double[j][i] = aeffective.ptr.pp_double[i][j]; aeffective.ptr.pp_double[i][j] = (double)(0); } } } else { for(i=0; i<=n-1; i++) { for(j=i+1; j<=n-1; j++) { aeffective.ptr.pp_double[i][j] = aeffective.ptr.pp_double[j][i]; aeffective.ptr.pp_double[j][i] = (double)(0); } } } } /* * Prepare task, solve, compare */ for(i=0; i<=n-1; i++) { xe.ptr.p_double[i] = 2*ae_randomreal(_state)-1; } for(i=0; i<=n-1; i++) { v = ae_v_dotproduct(&aeffective.ptr.pp_double[i][0], 1, &xe.ptr.p_double[0], 1, ae_v_len(0,n-1)); b.ptr.p_double[i] = v; } rmatrixtrsafesolve(&aparam, n, &b, &s, isupper, istrans, isunit, _state); ae_v_muld(&xe.ptr.p_double[0], 1, ae_v_len(0,n-1), s); ae_v_sub(&xe.ptr.p_double[0], 1, &b.ptr.p_double[0], 1, ae_v_len(0,n-1)); v = ae_v_dotproduct(&xe.ptr.p_double[0], 1, &xe.ptr.p_double[0], 1, ae_v_len(0,n-1)); v = ae_sqrt(v, _state); waserrors = waserrors||ae_fp_greater(v,threshold); } } } } } } /* * report */ if( !silent ) { printf("TESTING RMatrixTRSafeSolve\n"); if( waserrors ) { printf("TEST FAILED\n"); } else { printf("TEST PASSED\n"); } printf("\n\n"); } result = !waserrors; ae_frame_leave(_state); return result; } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. 
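A note on the test above, using the names from the code: B is built as A*XE, RMatrixTRSafeSolve overwrites B with the computed solution and also returns a scale factor S (scaling is what keeps the solver safe against overflow), so the computed solution corresponds to the scaled right part S*b. The test therefore rescales the exact solution before measuring the residual, roughly

    X = B after the call, S = returned scale factor
    error reported when sqrt(sum((S*XE - X)^2)) > Threshold, with Threshold = 1000*MachineEpsilon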
*************************************************************************/ ae_bool _pexec_testtrlinsolve(ae_bool silent, ae_state *_state) { return testtrlinsolve(silent, _state); } static void testsafesolveunit_rmatrixmakeacopy(/* Real */ ae_matrix* a, ae_int_t m, ae_int_t n, /* Real */ ae_matrix* b, ae_state *_state); static void testsafesolveunit_cmatrixmakeacopy(/* Complex */ ae_matrix* a, ae_int_t m, ae_int_t n, /* Complex */ ae_matrix* b, ae_state *_state); /************************************************************************* Main unittest subroutine *************************************************************************/ ae_bool testsafesolve(ae_bool silent, ae_state *_state) { ae_frame _frame_block; ae_int_t maxmn; double threshold; ae_bool rerrors; ae_bool cerrors; ae_bool waserrors; ae_bool isupper; ae_int_t trans; ae_bool isunit; double scalea; double growth; ae_int_t i; ae_int_t j; ae_int_t n; ae_int_t j1; ae_int_t j2; ae_complex cv; ae_matrix ca; ae_matrix cea; ae_matrix ctmpa; ae_vector cxs; ae_vector cxe; double rv; ae_matrix ra; ae_matrix rea; ae_matrix rtmpa; ae_vector rxs; ae_vector rxe; ae_bool result; ae_frame_make(_state, &_frame_block); ae_matrix_init(&ca, 0, 0, DT_COMPLEX, _state); ae_matrix_init(&cea, 0, 0, DT_COMPLEX, _state); ae_matrix_init(&ctmpa, 0, 0, DT_COMPLEX, _state); ae_vector_init(&cxs, 0, DT_COMPLEX, _state); ae_vector_init(&cxe, 0, DT_COMPLEX, _state); ae_matrix_init(&ra, 0, 0, DT_REAL, _state); ae_matrix_init(&rea, 0, 0, DT_REAL, _state); ae_matrix_init(&rtmpa, 0, 0, DT_REAL, _state); ae_vector_init(&rxs, 0, DT_REAL, _state); ae_vector_init(&rxe, 0, DT_REAL, _state); maxmn = 30; threshold = 100000*ae_machineepsilon; rerrors = ae_false; cerrors = ae_false; waserrors = ae_false; /* * Different problems: general tests */ for(n=1; n<=maxmn; n++) { /* * test complex solver with well-conditioned matrix: * 1. generate A: fill off-diagonal elements with small values, * diagonal elements are filled with larger values * 2. generate 'effective' A * 3. 
prepare task (exact X is stored in CXE, right part - in CXS), * solve and compare CXS and CXE */ isupper = ae_fp_greater(ae_randomreal(_state),0.5); trans = ae_randominteger(3, _state); isunit = ae_fp_greater(ae_randomreal(_state),0.5); scalea = ae_randomreal(_state)+0.5; ae_matrix_set_length(&ca, n, n, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { if( i==j ) { ca.ptr.pp_complex[i][j].x = (2*ae_randominteger(2, _state)-1)*(5+ae_randomreal(_state)); ca.ptr.pp_complex[i][j].y = (2*ae_randominteger(2, _state)-1)*(5+ae_randomreal(_state)); } else { ca.ptr.pp_complex[i][j].x = 0.2*ae_randomreal(_state)-0.1; ca.ptr.pp_complex[i][j].y = 0.2*ae_randomreal(_state)-0.1; } } } testsafesolveunit_cmatrixmakeacopy(&ca, n, n, &ctmpa, _state); for(i=0; i<=n-1; i++) { if( isupper ) { j1 = 0; j2 = i-1; } else { j1 = i+1; j2 = n-1; } for(j=j1; j<=j2; j++) { ctmpa.ptr.pp_complex[i][j] = ae_complex_from_i(0); } if( isunit ) { ctmpa.ptr.pp_complex[i][i] = ae_complex_from_i(1); } } ae_matrix_set_length(&cea, n, n, _state); for(i=0; i<=n-1; i++) { if( trans==0 ) { ae_v_cmoved(&cea.ptr.pp_complex[i][0], 1, &ctmpa.ptr.pp_complex[i][0], 1, "N", ae_v_len(0,n-1), scalea); } if( trans==1 ) { ae_v_cmoved(&cea.ptr.pp_complex[0][i], cea.stride, &ctmpa.ptr.pp_complex[i][0], 1, "N", ae_v_len(0,n-1), scalea); } if( trans==2 ) { ae_v_cmoved(&cea.ptr.pp_complex[0][i], cea.stride, &ctmpa.ptr.pp_complex[i][0], 1, "Conj", ae_v_len(0,n-1), scalea); } } ae_vector_set_length(&cxe, n, _state); for(i=0; i<=n-1; i++) { cxe.ptr.p_complex[i].x = 2*ae_randomreal(_state)-1; cxe.ptr.p_complex[i].y = 2*ae_randomreal(_state)-1; } ae_vector_set_length(&cxs, n, _state); for(i=0; i<=n-1; i++) { cv = ae_v_cdotproduct(&cea.ptr.pp_complex[i][0], 1, "N", &cxe.ptr.p_complex[0], 1, "N", ae_v_len(0,n-1)); cxs.ptr.p_complex[i] = cv; } if( cmatrixscaledtrsafesolve(&ca, scalea, n, &cxs, isupper, trans, isunit, ae_sqrt(ae_maxrealnumber, _state), _state) ) { for(i=0; i<=n-1; i++) { cerrors = cerrors||ae_fp_greater(ae_c_abs(ae_c_sub(cxs.ptr.p_complex[i],cxe.ptr.p_complex[i]), _state),threshold); } } else { cerrors = ae_true; } /* * same with real */ isupper = ae_fp_greater(ae_randomreal(_state),0.5); trans = ae_randominteger(2, _state); isunit = ae_fp_greater(ae_randomreal(_state),0.5); scalea = ae_randomreal(_state)+0.5; ae_matrix_set_length(&ra, n, n, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { if( i==j ) { ra.ptr.pp_double[i][j] = (2*ae_randominteger(2, _state)-1)*(5+ae_randomreal(_state)); } else { ra.ptr.pp_double[i][j] = 0.2*ae_randomreal(_state)-0.1; } } } testsafesolveunit_rmatrixmakeacopy(&ra, n, n, &rtmpa, _state); for(i=0; i<=n-1; i++) { if( isupper ) { j1 = 0; j2 = i-1; } else { j1 = i+1; j2 = n-1; } for(j=j1; j<=j2; j++) { rtmpa.ptr.pp_double[i][j] = (double)(0); } if( isunit ) { rtmpa.ptr.pp_double[i][i] = (double)(1); } } ae_matrix_set_length(&rea, n, n, _state); for(i=0; i<=n-1; i++) { if( trans==0 ) { ae_v_moved(&rea.ptr.pp_double[i][0], 1, &rtmpa.ptr.pp_double[i][0], 1, ae_v_len(0,n-1), scalea); } if( trans==1 ) { ae_v_moved(&rea.ptr.pp_double[0][i], rea.stride, &rtmpa.ptr.pp_double[i][0], 1, ae_v_len(0,n-1), scalea); } } ae_vector_set_length(&rxe, n, _state); for(i=0; i<=n-1; i++) { rxe.ptr.p_double[i] = 2*ae_randomreal(_state)-1; } ae_vector_set_length(&rxs, n, _state); for(i=0; i<=n-1; i++) { rv = ae_v_dotproduct(&rea.ptr.pp_double[i][0], 1, &rxe.ptr.p_double[0], 1, ae_v_len(0,n-1)); rxs.ptr.p_double[i] = rv; } if( rmatrixscaledtrsafesolve(&ra, scalea, n, &rxs, isupper, trans, isunit, ae_sqrt(ae_maxrealnumber, 
_state), _state) ) { for(i=0; i<=n-1; i++) { rerrors = rerrors||ae_fp_greater(ae_fabs(rxs.ptr.p_double[i]-rxe.ptr.p_double[i], _state),threshold); } } else { rerrors = ae_true; } } /* * Special test with diagonal ill-conditioned matrix: * * ability to solve it when resulting growth is less than threshold * * ability to stop solve when resulting growth is greater than threshold * * A = diag(1, 1/growth) * b = (1, 0.5) */ n = 2; growth = (double)(10); ae_matrix_set_length(&ca, n, n, _state); ca.ptr.pp_complex[0][0] = ae_complex_from_i(1); ca.ptr.pp_complex[0][1] = ae_complex_from_i(0); ca.ptr.pp_complex[1][0] = ae_complex_from_i(0); ca.ptr.pp_complex[1][1] = ae_complex_from_d(1/growth); ae_vector_set_length(&cxs, n, _state); cxs.ptr.p_complex[0] = ae_complex_from_d(1.0); cxs.ptr.p_complex[1] = ae_complex_from_d(0.5); cerrors = cerrors||!cmatrixscaledtrsafesolve(&ca, 1.0, n, &cxs, ae_fp_greater(ae_randomreal(_state),0.5), ae_randominteger(3, _state), ae_false, 1.05*ae_maxreal(ae_c_abs(cxs.ptr.p_complex[1], _state)*growth, 1.0, _state), _state); cerrors = cerrors||!cmatrixscaledtrsafesolve(&ca, 1.0, n, &cxs, ae_fp_greater(ae_randomreal(_state),0.5), ae_randominteger(3, _state), ae_false, 0.95*ae_maxreal(ae_c_abs(cxs.ptr.p_complex[1], _state)*growth, 1.0, _state), _state); ae_matrix_set_length(&ra, n, n, _state); ra.ptr.pp_double[0][0] = (double)(1); ra.ptr.pp_double[0][1] = (double)(0); ra.ptr.pp_double[1][0] = (double)(0); ra.ptr.pp_double[1][1] = 1/growth; ae_vector_set_length(&rxs, n, _state); rxs.ptr.p_double[0] = 1.0; rxs.ptr.p_double[1] = 0.5; rerrors = rerrors||!rmatrixscaledtrsafesolve(&ra, 1.0, n, &rxs, ae_fp_greater(ae_randomreal(_state),0.5), ae_randominteger(2, _state), ae_false, 1.05*ae_maxreal(ae_fabs(rxs.ptr.p_double[1], _state)*growth, 1.0, _state), _state); rerrors = rerrors||!rmatrixscaledtrsafesolve(&ra, 1.0, n, &rxs, ae_fp_greater(ae_randomreal(_state),0.5), ae_randominteger(2, _state), ae_false, 0.95*ae_maxreal(ae_fabs(rxs.ptr.p_double[1], _state)*growth, 1.0, _state), _state); /* * Special test with diagonal degenerate matrix: * * ability to solve it when resulting growth is less than threshold * * ability to stop solve when resulting growth is greater than threshold * * A = diag(1, 0) * b = (1, 0.5) */ n = 2; ae_matrix_set_length(&ca, n, n, _state); ca.ptr.pp_complex[0][0] = ae_complex_from_i(1); ca.ptr.pp_complex[0][1] = ae_complex_from_i(0); ca.ptr.pp_complex[1][0] = ae_complex_from_i(0); ca.ptr.pp_complex[1][1] = ae_complex_from_i(0); ae_vector_set_length(&cxs, n, _state); cxs.ptr.p_complex[0] = ae_complex_from_d(1.0); cxs.ptr.p_complex[1] = ae_complex_from_d(0.5); cerrors = cerrors||cmatrixscaledtrsafesolve(&ca, 1.0, n, &cxs, ae_fp_greater(ae_randomreal(_state),0.5), ae_randominteger(3, _state), ae_false, ae_sqrt(ae_maxrealnumber, _state), _state); ae_matrix_set_length(&ra, n, n, _state); ra.ptr.pp_double[0][0] = (double)(1); ra.ptr.pp_double[0][1] = (double)(0); ra.ptr.pp_double[1][0] = (double)(0); ra.ptr.pp_double[1][1] = (double)(0); ae_vector_set_length(&rxs, n, _state); rxs.ptr.p_double[0] = 1.0; rxs.ptr.p_double[1] = 0.5; rerrors = rerrors||rmatrixscaledtrsafesolve(&ra, 1.0, n, &rxs, ae_fp_greater(ae_randomreal(_state),0.5), ae_randominteger(2, _state), ae_false, ae_sqrt(ae_maxrealnumber, _state), _state); /* * report */ waserrors = rerrors||cerrors; if( !silent ) { printf("TESTING SAFE TR SOLVER\n"); printf("REAL: "); if( !rerrors ) { printf("OK\n"); } else { printf("FAILED\n"); } printf("COMPLEX: "); if( !cerrors ) { printf("OK\n"); } else { 
printf("FAILED\n"); } if( waserrors ) { printf("TEST FAILED\n"); } else { printf("TEST PASSED\n"); } printf("\n\n"); } result = !waserrors; ae_frame_leave(_state); return result; } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. *************************************************************************/ ae_bool _pexec_testsafesolve(ae_bool silent, ae_state *_state) { return testsafesolve(silent, _state); } /************************************************************************* Copy *************************************************************************/ static void testsafesolveunit_rmatrixmakeacopy(/* Real */ ae_matrix* a, ae_int_t m, ae_int_t n, /* Real */ ae_matrix* b, ae_state *_state) { ae_int_t i; ae_int_t j; ae_matrix_clear(b); ae_matrix_set_length(b, m-1+1, n-1+1, _state); for(i=0; i<=m-1; i++) { for(j=0; j<=n-1; j++) { b->ptr.pp_double[i][j] = a->ptr.pp_double[i][j]; } } } /************************************************************************* Copy *************************************************************************/ static void testsafesolveunit_cmatrixmakeacopy(/* Complex */ ae_matrix* a, ae_int_t m, ae_int_t n, /* Complex */ ae_matrix* b, ae_state *_state) { ae_int_t i; ae_int_t j; ae_matrix_clear(b); ae_matrix_set_length(b, m-1+1, n-1+1, _state); for(i=0; i<=m-1; i++) { for(j=0; j<=n-1; j++) { b->ptr.pp_complex[i][j] = a->ptr.pp_complex[i][j]; } } } static double testrcondunit_threshold50 = 0.25; static double testrcondunit_threshold90 = 0.10; static void testrcondunit_rmatrixmakeacopy(/* Real */ ae_matrix* a, ae_int_t m, ae_int_t n, /* Real */ ae_matrix* b, ae_state *_state); static void testrcondunit_rmatrixdrophalf(/* Real */ ae_matrix* a, ae_int_t n, ae_bool droplower, ae_state *_state); static void testrcondunit_cmatrixdrophalf(/* Complex */ ae_matrix* a, ae_int_t n, ae_bool droplower, ae_state *_state); static void testrcondunit_rmatrixgenzero(/* Real */ ae_matrix* a0, ae_int_t n, ae_state *_state); static ae_bool testrcondunit_rmatrixinvmattr(/* Real */ ae_matrix* a, ae_int_t n, ae_bool isupper, ae_bool isunittriangular, ae_state *_state); static ae_bool testrcondunit_rmatrixinvmatlu(/* Real */ ae_matrix* a, /* Integer */ ae_vector* pivots, ae_int_t n, ae_state *_state); static ae_bool testrcondunit_rmatrixinvmat(/* Real */ ae_matrix* a, ae_int_t n, ae_state *_state); static void testrcondunit_rmatrixrefrcond(/* Real */ ae_matrix* a, ae_int_t n, double* rc1, double* rcinf, ae_state *_state); static void testrcondunit_cmatrixmakeacopy(/* Complex */ ae_matrix* a, ae_int_t m, ae_int_t n, /* Complex */ ae_matrix* b, ae_state *_state); static void testrcondunit_cmatrixgenzero(/* Complex */ ae_matrix* a0, ae_int_t n, ae_state *_state); static ae_bool testrcondunit_cmatrixinvmattr(/* Complex */ ae_matrix* a, ae_int_t n, ae_bool isupper, ae_bool isunittriangular, ae_state *_state); static ae_bool testrcondunit_cmatrixinvmatlu(/* Complex */ ae_matrix* a, /* Integer */ ae_vector* pivots, ae_int_t n, ae_state *_state); static ae_bool testrcondunit_cmatrixinvmat(/* Complex */ ae_matrix* a, ae_int_t n, ae_state *_state); static void testrcondunit_cmatrixrefrcond(/* Complex */ ae_matrix* a, ae_int_t n, double* rc1, double* rcinf, ae_state *_state); static ae_bool testrcondunit_testrmatrixtrrcond(ae_int_t maxn, ae_int_t passcount, ae_state *_state); static ae_bool testrcondunit_testcmatrixtrrcond(ae_int_t maxn, ae_int_t passcount, ae_state *_state); static ae_bool 
testrcondunit_testrmatrixrcond(ae_int_t maxn, ae_int_t passcount, ae_state *_state); static ae_bool testrcondunit_testspdmatrixrcond(ae_int_t maxn, ae_int_t passcount, ae_state *_state); static ae_bool testrcondunit_testcmatrixrcond(ae_int_t maxn, ae_int_t passcount, ae_state *_state); static ae_bool testrcondunit_testhpdmatrixrcond(ae_int_t maxn, ae_int_t passcount, ae_state *_state); ae_bool testrcond(ae_bool silent, ae_state *_state) { ae_int_t maxn; ae_int_t passcount; ae_bool waserrors; ae_bool rtrerr; ae_bool ctrerr; ae_bool rerr; ae_bool cerr; ae_bool spderr; ae_bool hpderr; ae_bool result; maxn = 10; passcount = 100; /* * report */ rtrerr = !testrcondunit_testrmatrixtrrcond(maxn, passcount, _state); ctrerr = !testrcondunit_testcmatrixtrrcond(maxn, passcount, _state); rerr = !testrcondunit_testrmatrixrcond(maxn, passcount, _state); cerr = !testrcondunit_testcmatrixrcond(maxn, passcount, _state); spderr = !testrcondunit_testspdmatrixrcond(maxn, passcount, _state); hpderr = !testrcondunit_testhpdmatrixrcond(maxn, passcount, _state); waserrors = ((((rtrerr||ctrerr)||rerr)||cerr)||spderr)||hpderr; if( !silent ) { printf("TESTING RCOND\n"); printf("REAL TRIANGULAR: "); if( !rtrerr ) { printf("OK\n"); } else { printf("FAILED\n"); } printf("COMPLEX TRIANGULAR: "); if( !ctrerr ) { printf("OK\n"); } else { printf("FAILED\n"); } printf("REAL: "); if( !rerr ) { printf("OK\n"); } else { printf("FAILED\n"); } printf("SPD: "); if( !spderr ) { printf("OK\n"); } else { printf("FAILED\n"); } printf("HPD: "); if( !hpderr ) { printf("OK\n"); } else { printf("FAILED\n"); } printf("COMPLEX: "); if( !cerr ) { printf("OK\n"); } else { printf("FAILED\n"); } if( waserrors ) { printf("TEST FAILED\n"); } else { printf("TEST PASSED\n"); } printf("\n\n"); } result = !waserrors; return result; } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. 
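A note on the acceptance criterion shared by the testrcondunit_* drivers that follow: the reference condition numbers ERC1/ERCInf are computed directly by the RefRCond helpers, each estimate under test is converted back to a condition number as V = 1/rcond, and every random pass is scored along the lines of

    v = 1/rmatrixtrrcond1(&a, n, isupper, isunit, _state);
    counted towards Q50 when v >= 0.25*erc1   (Threshold50)
    counted towards Q90 when v >= 0.10*erc1   (Threshold90)

A matrix size passes only if at least 50% of the passes reach the Q50 level and at least 90% reach the Q90 level; estimates above 1.001*ERC1 are flagged as ErrLess, and the zero-matrix and (near-)degenerate special cases are tracked by ErrSpec.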
*************************************************************************/ ae_bool _pexec_testrcond(ae_bool silent, ae_state *_state) { return testrcond(silent, _state); } /************************************************************************* Copy *************************************************************************/ static void testrcondunit_rmatrixmakeacopy(/* Real */ ae_matrix* a, ae_int_t m, ae_int_t n, /* Real */ ae_matrix* b, ae_state *_state) { ae_int_t i; ae_int_t j; ae_matrix_clear(b); ae_matrix_set_length(b, m-1+1, n-1+1, _state); for(i=0; i<=m-1; i++) { for(j=0; j<=n-1; j++) { b->ptr.pp_double[i][j] = a->ptr.pp_double[i][j]; } } } /************************************************************************* Drops upper or lower half of the matrix - fills it by special pattern which may be used later to ensure that this part wasn't changed *************************************************************************/ static void testrcondunit_rmatrixdrophalf(/* Real */ ae_matrix* a, ae_int_t n, ae_bool droplower, ae_state *_state) { ae_int_t i; ae_int_t j; for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { if( (droplower&&i>j)||(!droplower&&i<j) ) { a->ptr.pp_double[i][j] = (double)(1+2*i+3*j); } } } } /************************************************************************* Drops upper or lower half of the matrix - fills it by special pattern which may be used later to ensure that this part wasn't changed *************************************************************************/ static void testrcondunit_cmatrixdrophalf(/* Complex */ ae_matrix* a, ae_int_t n, ae_bool droplower, ae_state *_state) { ae_int_t i; ae_int_t j; for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { if( (droplower&&i>j)||(!droplower&&i<j) ) { a->ptr.pp_complex[i][j] = ae_complex_from_i(1+2*i+3*j); } } } } /************************************************************************* Generate matrix with given condition number C (2-norm) *************************************************************************/ static void testrcondunit_rmatrixgenzero(/* Real */ ae_matrix* a0, ae_int_t n, ae_state *_state) { ae_int_t i; ae_int_t j; ae_matrix_set_length(a0, n, n, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { a0->ptr.pp_double[i][j] = (double)(0); } } } /************************************************************************* triangular inverse *************************************************************************/ static ae_bool testrcondunit_rmatrixinvmattr(/* Real */ ae_matrix* a, ae_int_t n, ae_bool isupper, ae_bool isunittriangular, ae_state *_state) { ae_frame _frame_block; ae_bool nounit; ae_int_t i; ae_int_t j; double v; double ajj; ae_vector t; ae_bool result; ae_frame_make(_state, &_frame_block); ae_vector_init(&t, 0, DT_REAL, _state); result = ae_true; ae_vector_set_length(&t, n-1+1, _state); /* * Test the input parameters. */ nounit = !isunittriangular; if( isupper ) { /* * Compute inverse of upper triangular matrix. */ for(j=0; j<=n-1; j++) { if( nounit ) { if( ae_fp_eq(a->ptr.pp_double[j][j],(double)(0)) ) { result = ae_false; ae_frame_leave(_state); return result; } a->ptr.pp_double[j][j] = 1/a->ptr.pp_double[j][j]; ajj = -a->ptr.pp_double[j][j]; } else { ajj = (double)(-1); } /* * Compute elements 1:j-1 of j-th column. 
*/ if( j>0 ) { ae_v_move(&t.ptr.p_double[0], 1, &a->ptr.pp_double[0][j], a->stride, ae_v_len(0,j-1)); for(i=0; i<=j-1; i++) { if( i<j-1 ) { v = ae_v_dotproduct(&a->ptr.pp_double[i][i+1], 1, &t.ptr.p_double[i+1], 1, ae_v_len(i+1,j-1)); } else { v = (double)(0); } if( nounit ) { a->ptr.pp_double[i][j] = v+a->ptr.pp_double[i][i]*t.ptr.p_double[i]; } else { a->ptr.pp_double[i][j] = v+t.ptr.p_double[i]; } } ae_v_muld(&a->ptr.pp_double[0][j], a->stride, ae_v_len(0,j-1), ajj); } } } else { /* * Compute inverse of lower triangular matrix. */ for(j=n-1; j>=0; j--) { if( nounit ) { if( ae_fp_eq(a->ptr.pp_double[j][j],(double)(0)) ) { result = ae_false; ae_frame_leave(_state); return result; } a->ptr.pp_double[j][j] = 1/a->ptr.pp_double[j][j]; ajj = -a->ptr.pp_double[j][j]; } else { ajj = (double)(-1); } if( j<n-1 ) { ae_v_move(&t.ptr.p_double[j+1], 1, &a->ptr.pp_double[j+1][j], a->stride, ae_v_len(j+1,n-1)); for(i=j+1; i<=n-1; i++) { if( i>j+1 ) { v = ae_v_dotproduct(&a->ptr.pp_double[i][j+1], 1, &t.ptr.p_double[j+1], 1, ae_v_len(j+1,i-1)); } else { v = (double)(0); } if( nounit ) { a->ptr.pp_double[i][j] = v+a->ptr.pp_double[i][i]*t.ptr.p_double[i]; } else { a->ptr.pp_double[i][j] = v+t.ptr.p_double[i]; } } ae_v_muld(&a->ptr.pp_double[j+1][j], a->stride, ae_v_len(j+1,n-1), ajj); } } } ae_frame_leave(_state); return result; } /************************************************************************* LU inverse *************************************************************************/ static ae_bool testrcondunit_rmatrixinvmatlu(/* Real */ ae_matrix* a, /* Integer */ ae_vector* pivots, ae_int_t n, ae_state *_state) { ae_frame _frame_block; ae_vector work; ae_int_t i; ae_int_t j; ae_int_t jp; double v; ae_bool result; ae_frame_make(_state, &_frame_block); ae_vector_init(&work, 0, DT_REAL, _state); result = ae_true; /* * Quick return if possible */ if( n==0 ) { ae_frame_leave(_state); return result; } ae_vector_set_length(&work, n-1+1, _state); /* * Form inv(U) */ if( !testrcondunit_rmatrixinvmattr(a, n, ae_true, ae_false, _state) ) { result = ae_false; ae_frame_leave(_state); return result; } /* * Solve the equation inv(A)*L = inv(U) for inv(A). */ for(j=n-1; j>=0; j--) { /* * Copy current column of L to WORK and replace with zeros. */ for(i=j+1; i<=n-1; i++) { work.ptr.p_double[i] = a->ptr.pp_double[i][j]; a->ptr.pp_double[i][j] = (double)(0); } /* * Compute current column of inv(A). */ if( j<n-1 ) { for(i=0; i<=n-1; i++) { v = ae_v_dotproduct(&a->ptr.pp_double[i][j+1], 1, &work.ptr.p_double[j+1], 1, ae_v_len(j+1,n-1)); a->ptr.pp_double[i][j] = a->ptr.pp_double[i][j]-v; } } } /* * Apply column interchanges. 
*/ for(j=n-2; j>=0; j--) { jp = pivots->ptr.p_int[j]; if( jp!=j ) { ae_v_move(&work.ptr.p_double[0], 1, &a->ptr.pp_double[0][j], a->stride, ae_v_len(0,n-1)); ae_v_move(&a->ptr.pp_double[0][j], a->stride, &a->ptr.pp_double[0][jp], a->stride, ae_v_len(0,n-1)); ae_v_move(&a->ptr.pp_double[0][jp], a->stride, &work.ptr.p_double[0], 1, ae_v_len(0,n-1)); } } ae_frame_leave(_state); return result; } /************************************************************************* Matrix inverse *************************************************************************/ static ae_bool testrcondunit_rmatrixinvmat(/* Real */ ae_matrix* a, ae_int_t n, ae_state *_state) { ae_frame _frame_block; ae_vector pivots; ae_bool result; ae_frame_make(_state, &_frame_block); ae_vector_init(&pivots, 0, DT_INT, _state); rmatrixlu(a, n, n, &pivots, _state); result = testrcondunit_rmatrixinvmatlu(a, &pivots, n, _state); ae_frame_leave(_state); return result; } /************************************************************************* reference RCond *************************************************************************/ static void testrcondunit_rmatrixrefrcond(/* Real */ ae_matrix* a, ae_int_t n, double* rc1, double* rcinf, ae_state *_state) { ae_frame _frame_block; ae_matrix inva; double nrm1a; double nrminfa; double nrm1inva; double nrminfinva; double v; ae_int_t k; ae_int_t i; ae_frame_make(_state, &_frame_block); *rc1 = 0; *rcinf = 0; ae_matrix_init(&inva, 0, 0, DT_REAL, _state); /* * inv A */ testrcondunit_rmatrixmakeacopy(a, n, n, &inva, _state); if( !testrcondunit_rmatrixinvmat(&inva, n, _state) ) { *rc1 = (double)(0); *rcinf = (double)(0); ae_frame_leave(_state); return; } /* * norm A */ nrm1a = (double)(0); nrminfa = (double)(0); for(k=0; k<=n-1; k++) { v = (double)(0); for(i=0; i<=n-1; i++) { v = v+ae_fabs(a->ptr.pp_double[i][k], _state); } nrm1a = ae_maxreal(nrm1a, v, _state); v = (double)(0); for(i=0; i<=n-1; i++) { v = v+ae_fabs(a->ptr.pp_double[k][i], _state); } nrminfa = ae_maxreal(nrminfa, v, _state); } /* * norm inv A */ nrm1inva = (double)(0); nrminfinva = (double)(0); for(k=0; k<=n-1; k++) { v = (double)(0); for(i=0; i<=n-1; i++) { v = v+ae_fabs(inva.ptr.pp_double[i][k], _state); } nrm1inva = ae_maxreal(nrm1inva, v, _state); v = (double)(0); for(i=0; i<=n-1; i++) { v = v+ae_fabs(inva.ptr.pp_double[k][i], _state); } nrminfinva = ae_maxreal(nrminfinva, v, _state); } /* * result */ *rc1 = nrm1inva*nrm1a; *rcinf = nrminfinva*nrminfa; ae_frame_leave(_state); } /************************************************************************* Copy *************************************************************************/ static void testrcondunit_cmatrixmakeacopy(/* Complex */ ae_matrix* a, ae_int_t m, ae_int_t n, /* Complex */ ae_matrix* b, ae_state *_state) { ae_int_t i; ae_int_t j; ae_matrix_clear(b); ae_matrix_set_length(b, m-1+1, n-1+1, _state); for(i=0; i<=m-1; i++) { for(j=0; j<=n-1; j++) { b->ptr.pp_complex[i][j] = a->ptr.pp_complex[i][j]; } } } /************************************************************************* Generate matrix with given condition number C (2-norm) *************************************************************************/ static void testrcondunit_cmatrixgenzero(/* Complex */ ae_matrix* a0, ae_int_t n, ae_state *_state) { ae_int_t i; ae_int_t j; ae_matrix_set_length(a0, n, n, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { a0->ptr.pp_complex[i][j] = ae_complex_from_i(0); } } } /************************************************************************* triangular inverse 
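(For reference: the RefRCond helpers in this unit derive the exact values from an explicit inverse, RC1 = ||A||_1 * ||inv(A)||_1 and RCInf = ||A||_inf * ||inv(A)||_inf, using the triangular and LU-based inverses defined here; when the explicit inversion fails, both reference values are returned as 0.)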
*************************************************************************/ static ae_bool testrcondunit_cmatrixinvmattr(/* Complex */ ae_matrix* a, ae_int_t n, ae_bool isupper, ae_bool isunittriangular, ae_state *_state) { ae_frame _frame_block; ae_bool nounit; ae_int_t i; ae_int_t j; ae_complex v; ae_complex ajj; ae_vector t; ae_bool result; ae_frame_make(_state, &_frame_block); ae_vector_init(&t, 0, DT_COMPLEX, _state); result = ae_true; ae_vector_set_length(&t, n-1+1, _state); /* * Test the input parameters. */ nounit = !isunittriangular; if( isupper ) { /* * Compute inverse of upper triangular matrix. */ for(j=0; j<=n-1; j++) { if( nounit ) { if( ae_c_eq_d(a->ptr.pp_complex[j][j],(double)(0)) ) { result = ae_false; ae_frame_leave(_state); return result; } a->ptr.pp_complex[j][j] = ae_c_d_div(1,a->ptr.pp_complex[j][j]); ajj = ae_c_neg(a->ptr.pp_complex[j][j]); } else { ajj = ae_complex_from_i(-1); } /* * Compute elements 1:j-1 of j-th column. */ if( j>0 ) { ae_v_cmove(&t.ptr.p_complex[0], 1, &a->ptr.pp_complex[0][j], a->stride, "N", ae_v_len(0,j-1)); for(i=0; i<=j-1; i++) { if( i<j-1 ) { v = ae_v_cdotproduct(&a->ptr.pp_complex[i][i+1], 1, "N", &t.ptr.p_complex[i+1], 1, "N", ae_v_len(i+1,j-1)); } else { v = ae_complex_from_i(0); } if( nounit ) { a->ptr.pp_complex[i][j] = ae_c_add(v,ae_c_mul(a->ptr.pp_complex[i][i],t.ptr.p_complex[i])); } else { a->ptr.pp_complex[i][j] = ae_c_add(v,t.ptr.p_complex[i]); } } ae_v_cmulc(&a->ptr.pp_complex[0][j], a->stride, ae_v_len(0,j-1), ajj); } } } else { /* * Compute inverse of lower triangular matrix. */ for(j=n-1; j>=0; j--) { if( nounit ) { if( ae_c_eq_d(a->ptr.pp_complex[j][j],(double)(0)) ) { result = ae_false; ae_frame_leave(_state); return result; } a->ptr.pp_complex[j][j] = ae_c_d_div(1,a->ptr.pp_complex[j][j]); ajj = ae_c_neg(a->ptr.pp_complex[j][j]); } else { ajj = ae_complex_from_i(-1); } if( j<n-1 ) { ae_v_cmove(&t.ptr.p_complex[j+1], 1, &a->ptr.pp_complex[j+1][j], a->stride, "N", ae_v_len(j+1,n-1)); for(i=j+1; i<=n-1; i++) { if( i>j+1 ) { v = ae_v_cdotproduct(&a->ptr.pp_complex[i][j+1], 1, "N", &t.ptr.p_complex[j+1], 1, "N", ae_v_len(j+1,i-1)); } else { v = ae_complex_from_i(0); } if( nounit ) { a->ptr.pp_complex[i][j] = ae_c_add(v,ae_c_mul(a->ptr.pp_complex[i][i],t.ptr.p_complex[i])); } else { a->ptr.pp_complex[i][j] = ae_c_add(v,t.ptr.p_complex[i]); } } ae_v_cmulc(&a->ptr.pp_complex[j+1][j], a->stride, ae_v_len(j+1,n-1), ajj); } } } ae_frame_leave(_state); return result; } /************************************************************************* LU inverse *************************************************************************/ static ae_bool testrcondunit_cmatrixinvmatlu(/* Complex */ ae_matrix* a, /* Integer */ ae_vector* pivots, ae_int_t n, ae_state *_state) { ae_frame _frame_block; ae_vector work; ae_int_t i; ae_int_t j; ae_int_t jp; ae_complex v; ae_bool result; ae_frame_make(_state, &_frame_block); ae_vector_init(&work, 0, DT_COMPLEX, _state); result = ae_true; /* * Quick return if possible */ if( n==0 ) { ae_frame_leave(_state); return result; } ae_vector_set_length(&work, n-1+1, _state); /* * Form inv(U) */ if( !testrcondunit_cmatrixinvmattr(a, n, ae_true, ae_false, _state) ) { result = ae_false; ae_frame_leave(_state); return result; } /* * Solve the equation inv(A)*L = inv(U) for inv(A). */ for(j=n-1; j>=0; j--) { /* * Copy current column of L to WORK and replace with zeros. */ for(i=j+1; i<=n-1; i++) { work.ptr.p_complex[i] = a->ptr.pp_complex[i][j]; a->ptr.pp_complex[i][j] = ae_complex_from_i(0); } /* * Compute current column of inv(A). 
*/ if( j<n-1 ) { for(i=0; i<=n-1; i++) { v = ae_v_cdotproduct(&a->ptr.pp_complex[i][j+1], 1, "N", &work.ptr.p_complex[j+1], 1, "N", ae_v_len(j+1,n-1)); a->ptr.pp_complex[i][j] = ae_c_sub(a->ptr.pp_complex[i][j],v); } } } /* * Apply column interchanges. */ for(j=n-2; j>=0; j--) { jp = pivots->ptr.p_int[j]; if( jp!=j ) { ae_v_cmove(&work.ptr.p_complex[0], 1, &a->ptr.pp_complex[0][j], a->stride, "N", ae_v_len(0,n-1)); ae_v_cmove(&a->ptr.pp_complex[0][j], a->stride, &a->ptr.pp_complex[0][jp], a->stride, "N", ae_v_len(0,n-1)); ae_v_cmove(&a->ptr.pp_complex[0][jp], a->stride, &work.ptr.p_complex[0], 1, "N", ae_v_len(0,n-1)); } } ae_frame_leave(_state); return result; } /************************************************************************* Matrix inverse *************************************************************************/ static ae_bool testrcondunit_cmatrixinvmat(/* Complex */ ae_matrix* a, ae_int_t n, ae_state *_state) { ae_frame _frame_block; ae_vector pivots; ae_bool result; ae_frame_make(_state, &_frame_block); ae_vector_init(&pivots, 0, DT_INT, _state); cmatrixlu(a, n, n, &pivots, _state); result = testrcondunit_cmatrixinvmatlu(a, &pivots, n, _state); ae_frame_leave(_state); return result; } /************************************************************************* reference RCond *************************************************************************/ static void testrcondunit_cmatrixrefrcond(/* Complex */ ae_matrix* a, ae_int_t n, double* rc1, double* rcinf, ae_state *_state) { ae_frame _frame_block; ae_matrix inva; double nrm1a; double nrminfa; double nrm1inva; double nrminfinva; double v; ae_int_t k; ae_int_t i; ae_frame_make(_state, &_frame_block); *rc1 = 0; *rcinf = 0; ae_matrix_init(&inva, 0, 0, DT_COMPLEX, _state); /* * inv A */ testrcondunit_cmatrixmakeacopy(a, n, n, &inva, _state); if( !testrcondunit_cmatrixinvmat(&inva, n, _state) ) { *rc1 = (double)(0); *rcinf = (double)(0); ae_frame_leave(_state); return; } /* * norm A */ nrm1a = (double)(0); nrminfa = (double)(0); for(k=0; k<=n-1; k++) { v = (double)(0); for(i=0; i<=n-1; i++) { v = v+ae_c_abs(a->ptr.pp_complex[i][k], _state); } nrm1a = ae_maxreal(nrm1a, v, _state); v = (double)(0); for(i=0; i<=n-1; i++) { v = v+ae_c_abs(a->ptr.pp_complex[k][i], _state); } nrminfa = ae_maxreal(nrminfa, v, _state); } /* * norm inv A */ nrm1inva = (double)(0); nrminfinva = (double)(0); for(k=0; k<=n-1; k++) { v = (double)(0); for(i=0; i<=n-1; i++) { v = v+ae_c_abs(inva.ptr.pp_complex[i][k], _state); } nrm1inva = ae_maxreal(nrm1inva, v, _state); v = (double)(0); for(i=0; i<=n-1; i++) { v = v+ae_c_abs(inva.ptr.pp_complex[k][i], _state); } nrminfinva = ae_maxreal(nrminfinva, v, _state); } /* * result */ *rc1 = nrm1inva*nrm1a; *rcinf = nrminfinva*nrminfa; ae_frame_leave(_state); } /************************************************************************* Returns True for successful test, False - for failed test *************************************************************************/ static ae_bool testrcondunit_testrmatrixtrrcond(ae_int_t maxn, ae_int_t passcount, ae_state *_state) { ae_frame _frame_block; ae_matrix a; ae_matrix ea; ae_vector p; ae_int_t n; ae_int_t i; ae_int_t j; ae_int_t j1; ae_int_t j2; ae_int_t pass; ae_bool err50; ae_bool err90; ae_bool errspec; ae_bool errless; double erc1; double ercinf; ae_vector q50; ae_vector q90; double v; ae_bool isupper; ae_bool isunit; ae_bool result; ae_frame_make(_state, &_frame_block); ae_matrix_init(&a, 0, 0, DT_REAL, _state); ae_matrix_init(&ea, 0, 0, DT_REAL, _state); ae_vector_init(&p, 0, DT_INT, _state); ae_vector_init(&q50, 0, DT_REAL, 
_state); ae_vector_init(&q90, 0, DT_REAL, _state); err50 = ae_false; err90 = ae_false; errless = ae_false; errspec = ae_false; ae_vector_set_length(&q50, 2, _state); ae_vector_set_length(&q90, 2, _state); for(n=1; n<=maxn; n++) { /* * special test for zero matrix */ testrcondunit_rmatrixgenzero(&a, n, _state); errspec = errspec||ae_fp_neq(rmatrixtrrcond1(&a, n, ae_fp_greater(ae_randomreal(_state),0.5), ae_false, _state),(double)(0)); errspec = errspec||ae_fp_neq(rmatrixtrrcondinf(&a, n, ae_fp_greater(ae_randomreal(_state),0.5), ae_false, _state),(double)(0)); /* * general test */ ae_matrix_set_length(&a, n, n, _state); for(i=0; i<=1; i++) { q50.ptr.p_double[i] = (double)(0); q90.ptr.p_double[i] = (double)(0); } for(pass=1; pass<=passcount; pass++) { isupper = ae_fp_greater(ae_randomreal(_state),0.5); isunit = ae_fp_greater(ae_randomreal(_state),0.5); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { a.ptr.pp_double[i][j] = ae_randomreal(_state)-0.5; } } for(i=0; i<=n-1; i++) { a.ptr.pp_double[i][i] = 1+ae_randomreal(_state); } testrcondunit_rmatrixmakeacopy(&a, n, n, &ea, _state); for(i=0; i<=n-1; i++) { if( isupper ) { j1 = 0; j2 = i-1; } else { j1 = i+1; j2 = n-1; } for(j=j1; j<=j2; j++) { ea.ptr.pp_double[i][j] = (double)(0); } if( isunit ) { ea.ptr.pp_double[i][i] = (double)(1); } } testrcondunit_rmatrixrefrcond(&ea, n, &erc1, &ercinf, _state); /* * 1-norm */ v = 1/rmatrixtrrcond1(&a, n, isupper, isunit, _state); if( ae_fp_greater_eq(v,testrcondunit_threshold50*erc1) ) { q50.ptr.p_double[0] = q50.ptr.p_double[0]+(double)1/(double)passcount; } if( ae_fp_greater_eq(v,testrcondunit_threshold90*erc1) ) { q90.ptr.p_double[0] = q90.ptr.p_double[0]+(double)1/(double)passcount; } errless = errless||ae_fp_greater(v,erc1*1.001); /* * Inf-norm */ v = 1/rmatrixtrrcondinf(&a, n, isupper, isunit, _state); if( ae_fp_greater_eq(v,testrcondunit_threshold50*ercinf) ) { q50.ptr.p_double[1] = q50.ptr.p_double[1]+(double)1/(double)passcount; } if( ae_fp_greater_eq(v,testrcondunit_threshold90*ercinf) ) { q90.ptr.p_double[1] = q90.ptr.p_double[1]+(double)1/(double)passcount; } errless = errless||ae_fp_greater(v,ercinf*1.001); } for(i=0; i<=1; i++) { err50 = err50||ae_fp_less(q50.ptr.p_double[i],0.50); err90 = err90||ae_fp_less(q90.ptr.p_double[i],0.90); } /* * degenerate matrix test */ if( n>=3 ) { ae_matrix_set_length(&a, n, n, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { a.ptr.pp_double[i][j] = 0.0; } } a.ptr.pp_double[0][0] = (double)(1); a.ptr.pp_double[n-1][n-1] = (double)(1); errspec = errspec||ae_fp_neq(rmatrixtrrcond1(&a, n, ae_fp_greater(ae_randomreal(_state),0.5), ae_false, _state),(double)(0)); errspec = errspec||ae_fp_neq(rmatrixtrrcondinf(&a, n, ae_fp_greater(ae_randomreal(_state),0.5), ae_false, _state),(double)(0)); } /* * near-degenerate matrix test */ if( n>=2 ) { ae_matrix_set_length(&a, n, n, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { a.ptr.pp_double[i][j] = 0.0; } } for(i=0; i<=n-1; i++) { a.ptr.pp_double[i][i] = (double)(1); } i = ae_randominteger(n, _state); a.ptr.pp_double[i][i] = 0.1*ae_maxrealnumber; errspec = errspec||ae_fp_neq(rmatrixtrrcond1(&a, n, ae_fp_greater(ae_randomreal(_state),0.5), ae_false, _state),(double)(0)); errspec = errspec||ae_fp_neq(rmatrixtrrcondinf(&a, n, ae_fp_greater(ae_randomreal(_state),0.5), ae_false, _state),(double)(0)); } } /* * report */ result = !(((err50||err90)||errless)||errspec); ae_frame_leave(_state); return result; } /************************************************************************* Returns True for successful 
test, False - for failed test *************************************************************************/ static ae_bool testrcondunit_testcmatrixtrrcond(ae_int_t maxn, ae_int_t passcount, ae_state *_state) { ae_frame _frame_block; ae_matrix a; ae_matrix ea; ae_vector p; ae_int_t n; ae_int_t i; ae_int_t j; ae_int_t j1; ae_int_t j2; ae_int_t pass; ae_bool err50; ae_bool err90; ae_bool errspec; ae_bool errless; double erc1; double ercinf; ae_vector q50; ae_vector q90; double v; ae_bool isupper; ae_bool isunit; ae_bool result; ae_frame_make(_state, &_frame_block); ae_matrix_init(&a, 0, 0, DT_COMPLEX, _state); ae_matrix_init(&ea, 0, 0, DT_COMPLEX, _state); ae_vector_init(&p, 0, DT_INT, _state); ae_vector_init(&q50, 0, DT_REAL, _state); ae_vector_init(&q90, 0, DT_REAL, _state); err50 = ae_false; err90 = ae_false; errless = ae_false; errspec = ae_false; ae_vector_set_length(&q50, 2, _state); ae_vector_set_length(&q90, 2, _state); for(n=1; n<=maxn; n++) { /* * special test for zero matrix */ testrcondunit_cmatrixgenzero(&a, n, _state); errspec = errspec||ae_fp_neq(cmatrixtrrcond1(&a, n, ae_fp_greater(ae_randomreal(_state),0.5), ae_false, _state),(double)(0)); errspec = errspec||ae_fp_neq(cmatrixtrrcondinf(&a, n, ae_fp_greater(ae_randomreal(_state),0.5), ae_false, _state),(double)(0)); /* * general test */ ae_matrix_set_length(&a, n, n, _state); for(i=0; i<=1; i++) { q50.ptr.p_double[i] = (double)(0); q90.ptr.p_double[i] = (double)(0); } for(pass=1; pass<=passcount; pass++) { isupper = ae_fp_greater(ae_randomreal(_state),0.5); isunit = ae_fp_greater(ae_randomreal(_state),0.5); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { a.ptr.pp_complex[i][j].x = ae_randomreal(_state)-0.5; a.ptr.pp_complex[i][j].y = ae_randomreal(_state)-0.5; } } for(i=0; i<=n-1; i++) { a.ptr.pp_complex[i][i].x = 1+ae_randomreal(_state); a.ptr.pp_complex[i][i].y = 1+ae_randomreal(_state); } testrcondunit_cmatrixmakeacopy(&a, n, n, &ea, _state); for(i=0; i<=n-1; i++) { if( isupper ) { j1 = 0; j2 = i-1; } else { j1 = i+1; j2 = n-1; } for(j=j1; j<=j2; j++) { ea.ptr.pp_complex[i][j] = ae_complex_from_i(0); } if( isunit ) { ea.ptr.pp_complex[i][i] = ae_complex_from_i(1); } } testrcondunit_cmatrixrefrcond(&ea, n, &erc1, &ercinf, _state); /* * 1-norm */ v = 1/cmatrixtrrcond1(&a, n, isupper, isunit, _state); if( ae_fp_greater_eq(v,testrcondunit_threshold50*erc1) ) { q50.ptr.p_double[0] = q50.ptr.p_double[0]+(double)1/(double)passcount; } if( ae_fp_greater_eq(v,testrcondunit_threshold90*erc1) ) { q90.ptr.p_double[0] = q90.ptr.p_double[0]+(double)1/(double)passcount; } errless = errless||ae_fp_greater(v,erc1*1.001); /* * Inf-norm */ v = 1/cmatrixtrrcondinf(&a, n, isupper, isunit, _state); if( ae_fp_greater_eq(v,testrcondunit_threshold50*ercinf) ) { q50.ptr.p_double[1] = q50.ptr.p_double[1]+(double)1/(double)passcount; } if( ae_fp_greater_eq(v,testrcondunit_threshold90*ercinf) ) { q90.ptr.p_double[1] = q90.ptr.p_double[1]+(double)1/(double)passcount; } errless = errless||ae_fp_greater(v,ercinf*1.001); } for(i=0; i<=1; i++) { err50 = err50||ae_fp_less(q50.ptr.p_double[i],0.50); err90 = err90||ae_fp_less(q90.ptr.p_double[i],0.90); } /* * degenerate matrix test */ if( n>=3 ) { ae_matrix_set_length(&a, n, n, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { a.ptr.pp_complex[i][j] = ae_complex_from_d(0.0); } } a.ptr.pp_complex[0][0] = ae_complex_from_i(1); a.ptr.pp_complex[n-1][n-1] = ae_complex_from_i(1); errspec = errspec||ae_fp_neq(cmatrixtrrcond1(&a, n, ae_fp_greater(ae_randomreal(_state),0.5), ae_false, _state),(double)(0)); 
errspec = errspec||ae_fp_neq(cmatrixtrrcondinf(&a, n, ae_fp_greater(ae_randomreal(_state),0.5), ae_false, _state),(double)(0)); } /* * near-degenerate matrix test */ if( n>=2 ) { ae_matrix_set_length(&a, n, n, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { a.ptr.pp_complex[i][j] = ae_complex_from_d(0.0); } } for(i=0; i<=n-1; i++) { a.ptr.pp_complex[i][i] = ae_complex_from_i(1); } i = ae_randominteger(n, _state); a.ptr.pp_complex[i][i] = ae_complex_from_d(0.1*ae_maxrealnumber); errspec = errspec||ae_fp_neq(cmatrixtrrcond1(&a, n, ae_fp_greater(ae_randomreal(_state),0.5), ae_false, _state),(double)(0)); errspec = errspec||ae_fp_neq(cmatrixtrrcondinf(&a, n, ae_fp_greater(ae_randomreal(_state),0.5), ae_false, _state),(double)(0)); } } /* * report */ result = !(((err50||err90)||errless)||errspec); ae_frame_leave(_state); return result; } /************************************************************************* Returns True for successful test, False - for failed test *************************************************************************/ static ae_bool testrcondunit_testrmatrixrcond(ae_int_t maxn, ae_int_t passcount, ae_state *_state) { ae_frame _frame_block; ae_matrix a; ae_matrix lua; ae_vector p; ae_int_t n; ae_int_t i; ae_int_t j; ae_int_t pass; ae_bool err50; ae_bool err90; ae_bool errspec; ae_bool errless; double erc1; double ercinf; ae_vector q50; ae_vector q90; double v; ae_bool result; ae_frame_make(_state, &_frame_block); ae_matrix_init(&a, 0, 0, DT_REAL, _state); ae_matrix_init(&lua, 0, 0, DT_REAL, _state); ae_vector_init(&p, 0, DT_INT, _state); ae_vector_init(&q50, 0, DT_REAL, _state); ae_vector_init(&q90, 0, DT_REAL, _state); err50 = ae_false; err90 = ae_false; errless = ae_false; errspec = ae_false; ae_vector_set_length(&q50, 3+1, _state); ae_vector_set_length(&q90, 3+1, _state); for(n=1; n<=maxn; n++) { /* * special test for zero matrix */ testrcondunit_rmatrixgenzero(&a, n, _state); testrcondunit_rmatrixmakeacopy(&a, n, n, &lua, _state); rmatrixlu(&lua, n, n, &p, _state); errspec = errspec||ae_fp_neq(rmatrixrcond1(&a, n, _state),(double)(0)); errspec = errspec||ae_fp_neq(rmatrixrcondinf(&a, n, _state),(double)(0)); errspec = errspec||ae_fp_neq(rmatrixlurcond1(&lua, n, _state),(double)(0)); errspec = errspec||ae_fp_neq(rmatrixlurcondinf(&lua, n, _state),(double)(0)); /* * general test */ ae_matrix_set_length(&a, n-1+1, n-1+1, _state); for(i=0; i<=3; i++) { q50.ptr.p_double[i] = (double)(0); q90.ptr.p_double[i] = (double)(0); } for(pass=1; pass<=passcount; pass++) { rmatrixrndcond(n, ae_exp(ae_randomreal(_state)*ae_log((double)(1000), _state), _state), &a, _state); testrcondunit_rmatrixmakeacopy(&a, n, n, &lua, _state); rmatrixlu(&lua, n, n, &p, _state); testrcondunit_rmatrixrefrcond(&a, n, &erc1, &ercinf, _state); /* * 1-norm, normal */ v = 1/rmatrixrcond1(&a, n, _state); if( ae_fp_greater_eq(v,testrcondunit_threshold50*erc1) ) { q50.ptr.p_double[0] = q50.ptr.p_double[0]+(double)1/(double)passcount; } if( ae_fp_greater_eq(v,testrcondunit_threshold90*erc1) ) { q90.ptr.p_double[0] = q90.ptr.p_double[0]+(double)1/(double)passcount; } errless = errless||ae_fp_greater(v,erc1*1.001); /* * 1-norm, LU */ v = 1/rmatrixlurcond1(&lua, n, _state); if( ae_fp_greater_eq(v,testrcondunit_threshold50*erc1) ) { q50.ptr.p_double[1] = q50.ptr.p_double[1]+(double)1/(double)passcount; } if( ae_fp_greater_eq(v,testrcondunit_threshold90*erc1) ) { q90.ptr.p_double[1] = q90.ptr.p_double[1]+(double)1/(double)passcount; } errless = errless||ae_fp_greater(v,erc1*1.001); /* * Inf-norm, normal */ 
v = 1/rmatrixrcondinf(&a, n, _state); if( ae_fp_greater_eq(v,testrcondunit_threshold50*ercinf) ) { q50.ptr.p_double[2] = q50.ptr.p_double[2]+(double)1/(double)passcount; } if( ae_fp_greater_eq(v,testrcondunit_threshold90*ercinf) ) { q90.ptr.p_double[2] = q90.ptr.p_double[2]+(double)1/(double)passcount; } errless = errless||ae_fp_greater(v,ercinf*1.001); /* * Inf-norm, LU */ v = 1/rmatrixlurcondinf(&lua, n, _state); if( ae_fp_greater_eq(v,testrcondunit_threshold50*ercinf) ) { q50.ptr.p_double[3] = q50.ptr.p_double[3]+(double)1/(double)passcount; } if( ae_fp_greater_eq(v,testrcondunit_threshold90*ercinf) ) { q90.ptr.p_double[3] = q90.ptr.p_double[3]+(double)1/(double)passcount; } errless = errless||ae_fp_greater(v,ercinf*1.001); } for(i=0; i<=3; i++) { err50 = err50||ae_fp_less(q50.ptr.p_double[i],0.50); err90 = err90||ae_fp_less(q90.ptr.p_double[i],0.90); } /* * degenerate matrix test */ if( n>=3 ) { ae_matrix_set_length(&a, n, n, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { a.ptr.pp_double[i][j] = 0.0; } } a.ptr.pp_double[0][0] = (double)(1); a.ptr.pp_double[n-1][n-1] = (double)(1); errspec = errspec||ae_fp_neq(rmatrixrcond1(&a, n, _state),(double)(0)); errspec = errspec||ae_fp_neq(rmatrixrcondinf(&a, n, _state),(double)(0)); errspec = errspec||ae_fp_neq(rmatrixlurcond1(&a, n, _state),(double)(0)); errspec = errspec||ae_fp_neq(rmatrixlurcondinf(&a, n, _state),(double)(0)); } /* * near-degenerate matrix test */ if( n>=2 ) { ae_matrix_set_length(&a, n, n, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { a.ptr.pp_double[i][j] = 0.0; } } for(i=0; i<=n-1; i++) { a.ptr.pp_double[i][i] = (double)(1); } i = ae_randominteger(n, _state); a.ptr.pp_double[i][i] = 0.1*ae_maxrealnumber; errspec = errspec||ae_fp_neq(rmatrixrcond1(&a, n, _state),(double)(0)); errspec = errspec||ae_fp_neq(rmatrixrcondinf(&a, n, _state),(double)(0)); errspec = errspec||ae_fp_neq(rmatrixlurcond1(&a, n, _state),(double)(0)); errspec = errspec||ae_fp_neq(rmatrixlurcondinf(&a, n, _state),(double)(0)); } } /* * report */ result = !(((err50||err90)||errless)||errspec); ae_frame_leave(_state); return result; } /************************************************************************* Returns True for successful test, False - for failed test *************************************************************************/ static ae_bool testrcondunit_testspdmatrixrcond(ae_int_t maxn, ae_int_t passcount, ae_state *_state) { ae_frame _frame_block; ae_matrix a; ae_matrix cha; ae_vector p; ae_int_t n; ae_int_t i; ae_int_t j; ae_int_t pass; ae_bool err50; ae_bool err90; ae_bool errspec; ae_bool errless; ae_bool isupper; double erc1; double ercinf; ae_vector q50; ae_vector q90; double v; ae_bool result; ae_frame_make(_state, &_frame_block); ae_matrix_init(&a, 0, 0, DT_REAL, _state); ae_matrix_init(&cha, 0, 0, DT_REAL, _state); ae_vector_init(&p, 0, DT_INT, _state); ae_vector_init(&q50, 0, DT_REAL, _state); ae_vector_init(&q90, 0, DT_REAL, _state); err50 = ae_false; err90 = ae_false; errless = ae_false; errspec = ae_false; ae_vector_set_length(&q50, 2, _state); ae_vector_set_length(&q90, 2, _state); for(n=1; n<=maxn; n++) { isupper = ae_fp_greater(ae_randomreal(_state),0.5); /* * general test */ ae_matrix_set_length(&a, n, n, _state); for(i=0; i<=1; i++) { q50.ptr.p_double[i] = (double)(0); q90.ptr.p_double[i] = (double)(0); } for(pass=1; pass<=passcount; pass++) { spdmatrixrndcond(n, ae_exp(ae_randomreal(_state)*ae_log((double)(1000), _state), _state), &a, _state); testrcondunit_rmatrixrefrcond(&a, n, &erc1, &ercinf, 
_state); testrcondunit_rmatrixdrophalf(&a, n, isupper, _state); testrcondunit_rmatrixmakeacopy(&a, n, n, &cha, _state); spdmatrixcholesky(&cha, n, isupper, _state); /* * normal */ v = 1/spdmatrixrcond(&a, n, isupper, _state); if( ae_fp_greater_eq(v,testrcondunit_threshold50*erc1) ) { q50.ptr.p_double[0] = q50.ptr.p_double[0]+(double)1/(double)passcount; } if( ae_fp_greater_eq(v,testrcondunit_threshold90*erc1) ) { q90.ptr.p_double[0] = q90.ptr.p_double[0]+(double)1/(double)passcount; } errless = errless||ae_fp_greater(v,erc1*1.001); /* * Cholesky */ v = 1/spdmatrixcholeskyrcond(&cha, n, isupper, _state); if( ae_fp_greater_eq(v,testrcondunit_threshold50*erc1) ) { q50.ptr.p_double[1] = q50.ptr.p_double[1]+(double)1/(double)passcount; } if( ae_fp_greater_eq(v,testrcondunit_threshold90*erc1) ) { q90.ptr.p_double[1] = q90.ptr.p_double[1]+(double)1/(double)passcount; } errless = errless||ae_fp_greater(v,erc1*1.001); } for(i=0; i<=1; i++) { err50 = err50||ae_fp_less(q50.ptr.p_double[i],0.50); err90 = err90||ae_fp_less(q90.ptr.p_double[i],0.90); } /* * degenerate matrix test */ if( n>=3 ) { ae_matrix_set_length(&a, n, n, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { a.ptr.pp_double[i][j] = 0.0; } } a.ptr.pp_double[0][0] = (double)(1); a.ptr.pp_double[n-1][n-1] = (double)(1); errspec = errspec||ae_fp_neq(spdmatrixrcond(&a, n, isupper, _state),(double)(-1)); errspec = errspec||ae_fp_neq(spdmatrixcholeskyrcond(&a, n, isupper, _state),(double)(0)); } /* * near-degenerate matrix test */ if( n>=2 ) { ae_matrix_set_length(&a, n, n, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { a.ptr.pp_double[i][j] = 0.0; } } for(i=0; i<=n-1; i++) { a.ptr.pp_double[i][i] = (double)(1); } i = ae_randominteger(n, _state); a.ptr.pp_double[i][i] = 0.1*ae_maxrealnumber; errspec = errspec||ae_fp_neq(spdmatrixrcond(&a, n, isupper, _state),(double)(0)); errspec = errspec||ae_fp_neq(spdmatrixcholeskyrcond(&a, n, isupper, _state),(double)(0)); } } /* * report */ result = !(((err50||err90)||errless)||errspec); ae_frame_leave(_state); return result; } /************************************************************************* Returns True for successful test, False - for failed test *************************************************************************/ static ae_bool testrcondunit_testcmatrixrcond(ae_int_t maxn, ae_int_t passcount, ae_state *_state) { ae_frame _frame_block; ae_matrix a; ae_matrix lua; ae_vector p; ae_int_t n; ae_int_t i; ae_int_t j; ae_int_t pass; ae_bool err50; ae_bool err90; ae_bool errless; ae_bool errspec; double erc1; double ercinf; ae_vector q50; ae_vector q90; double v; ae_bool result; ae_frame_make(_state, &_frame_block); ae_matrix_init(&a, 0, 0, DT_COMPLEX, _state); ae_matrix_init(&lua, 0, 0, DT_COMPLEX, _state); ae_vector_init(&p, 0, DT_INT, _state); ae_vector_init(&q50, 0, DT_REAL, _state); ae_vector_init(&q90, 0, DT_REAL, _state); ae_vector_set_length(&q50, 3+1, _state); ae_vector_set_length(&q90, 3+1, _state); err50 = ae_false; err90 = ae_false; errless = ae_false; errspec = ae_false; /* * process */ for(n=1; n<=maxn; n++) { /* * special test for zero matrix */ testrcondunit_cmatrixgenzero(&a, n, _state); testrcondunit_cmatrixmakeacopy(&a, n, n, &lua, _state); cmatrixlu(&lua, n, n, &p, _state); errspec = errspec||ae_fp_neq(cmatrixrcond1(&a, n, _state),(double)(0)); errspec = errspec||ae_fp_neq(cmatrixrcondinf(&a, n, _state),(double)(0)); errspec = errspec||ae_fp_neq(cmatrixlurcond1(&lua, n, _state),(double)(0)); errspec = errspec||ae_fp_neq(cmatrixlurcondinf(&lua, n, 
_state),(double)(0)); /* * general test */ ae_matrix_set_length(&a, n-1+1, n-1+1, _state); for(i=0; i<=3; i++) { q50.ptr.p_double[i] = (double)(0); q90.ptr.p_double[i] = (double)(0); } for(pass=1; pass<=passcount; pass++) { cmatrixrndcond(n, ae_exp(ae_randomreal(_state)*ae_log((double)(1000), _state), _state), &a, _state); testrcondunit_cmatrixmakeacopy(&a, n, n, &lua, _state); cmatrixlu(&lua, n, n, &p, _state); testrcondunit_cmatrixrefrcond(&a, n, &erc1, &ercinf, _state); /* * 1-norm, normal */ v = 1/cmatrixrcond1(&a, n, _state); if( ae_fp_greater_eq(v,testrcondunit_threshold50*erc1) ) { q50.ptr.p_double[0] = q50.ptr.p_double[0]+(double)1/(double)passcount; } if( ae_fp_greater_eq(v,testrcondunit_threshold90*erc1) ) { q90.ptr.p_double[0] = q90.ptr.p_double[0]+(double)1/(double)passcount; } errless = errless||ae_fp_greater(v,erc1*1.001); /* * 1-norm, LU */ v = 1/cmatrixlurcond1(&lua, n, _state); if( ae_fp_greater_eq(v,testrcondunit_threshold50*erc1) ) { q50.ptr.p_double[1] = q50.ptr.p_double[1]+(double)1/(double)passcount; } if( ae_fp_greater_eq(v,testrcondunit_threshold90*erc1) ) { q90.ptr.p_double[1] = q90.ptr.p_double[1]+(double)1/(double)passcount; } errless = errless||ae_fp_greater(v,erc1*1.001); /* * Inf-norm, normal */ v = 1/cmatrixrcondinf(&a, n, _state); if( ae_fp_greater_eq(v,testrcondunit_threshold50*ercinf) ) { q50.ptr.p_double[2] = q50.ptr.p_double[2]+(double)1/(double)passcount; } if( ae_fp_greater_eq(v,testrcondunit_threshold90*ercinf) ) { q90.ptr.p_double[2] = q90.ptr.p_double[2]+(double)1/(double)passcount; } errless = errless||ae_fp_greater(v,ercinf*1.001); /* * Inf-norm, LU */ v = 1/cmatrixlurcondinf(&lua, n, _state); if( ae_fp_greater_eq(v,testrcondunit_threshold50*ercinf) ) { q50.ptr.p_double[3] = q50.ptr.p_double[3]+(double)1/(double)passcount; } if( ae_fp_greater_eq(v,testrcondunit_threshold90*ercinf) ) { q90.ptr.p_double[3] = q90.ptr.p_double[3]+(double)1/(double)passcount; } errless = errless||ae_fp_greater(v,ercinf*1.001); } for(i=0; i<=3; i++) { err50 = err50||ae_fp_less(q50.ptr.p_double[i],0.50); err90 = err90||ae_fp_less(q90.ptr.p_double[i],0.90); } /* * degenerate matrix test */ if( n>=3 ) { ae_matrix_set_length(&a, n, n, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { a.ptr.pp_complex[i][j] = ae_complex_from_d(0.0); } } a.ptr.pp_complex[0][0] = ae_complex_from_i(1); a.ptr.pp_complex[n-1][n-1] = ae_complex_from_i(1); errspec = errspec||ae_fp_neq(cmatrixrcond1(&a, n, _state),(double)(0)); errspec = errspec||ae_fp_neq(cmatrixrcondinf(&a, n, _state),(double)(0)); errspec = errspec||ae_fp_neq(cmatrixlurcond1(&a, n, _state),(double)(0)); errspec = errspec||ae_fp_neq(cmatrixlurcondinf(&a, n, _state),(double)(0)); } /* * near-degenerate matrix test */ if( n>=2 ) { ae_matrix_set_length(&a, n, n, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { a.ptr.pp_complex[i][j] = ae_complex_from_d(0.0); } } for(i=0; i<=n-1; i++) { a.ptr.pp_complex[i][i] = ae_complex_from_i(1); } i = ae_randominteger(n, _state); a.ptr.pp_complex[i][i] = ae_complex_from_d(0.1*ae_maxrealnumber); errspec = errspec||ae_fp_neq(cmatrixrcond1(&a, n, _state),(double)(0)); errspec = errspec||ae_fp_neq(cmatrixrcondinf(&a, n, _state),(double)(0)); errspec = errspec||ae_fp_neq(cmatrixlurcond1(&a, n, _state),(double)(0)); errspec = errspec||ae_fp_neq(cmatrixlurcondinf(&a, n, _state),(double)(0)); } } /* * report */ result = !(((err50||err90)||errless)||errspec); ae_frame_leave(_state); return result; } /************************************************************************* Returns True for 
successful test, False - for failed test *************************************************************************/ static ae_bool testrcondunit_testhpdmatrixrcond(ae_int_t maxn, ae_int_t passcount, ae_state *_state) { ae_frame _frame_block; ae_matrix a; ae_matrix cha; ae_vector p; ae_int_t n; ae_int_t i; ae_int_t j; ae_int_t pass; ae_bool err50; ae_bool err90; ae_bool errspec; ae_bool errless; ae_bool isupper; double erc1; double ercinf; ae_vector q50; ae_vector q90; double v; ae_bool result; ae_frame_make(_state, &_frame_block); ae_matrix_init(&a, 0, 0, DT_COMPLEX, _state); ae_matrix_init(&cha, 0, 0, DT_COMPLEX, _state); ae_vector_init(&p, 0, DT_INT, _state); ae_vector_init(&q50, 0, DT_REAL, _state); ae_vector_init(&q90, 0, DT_REAL, _state); err50 = ae_false; err90 = ae_false; errless = ae_false; errspec = ae_false; ae_vector_set_length(&q50, 2, _state); ae_vector_set_length(&q90, 2, _state); for(n=1; n<=maxn; n++) { isupper = ae_fp_greater(ae_randomreal(_state),0.5); /* * general test */ ae_matrix_set_length(&a, n, n, _state); for(i=0; i<=1; i++) { q50.ptr.p_double[i] = (double)(0); q90.ptr.p_double[i] = (double)(0); } for(pass=1; pass<=passcount; pass++) { hpdmatrixrndcond(n, ae_exp(ae_randomreal(_state)*ae_log((double)(1000), _state), _state), &a, _state); testrcondunit_cmatrixrefrcond(&a, n, &erc1, &ercinf, _state); testrcondunit_cmatrixdrophalf(&a, n, isupper, _state); testrcondunit_cmatrixmakeacopy(&a, n, n, &cha, _state); hpdmatrixcholesky(&cha, n, isupper, _state); /* * normal */ v = 1/hpdmatrixrcond(&a, n, isupper, _state); if( ae_fp_greater_eq(v,testrcondunit_threshold50*erc1) ) { q50.ptr.p_double[0] = q50.ptr.p_double[0]+(double)1/(double)passcount; } if( ae_fp_greater_eq(v,testrcondunit_threshold90*erc1) ) { q90.ptr.p_double[0] = q90.ptr.p_double[0]+(double)1/(double)passcount; } errless = errless||ae_fp_greater(v,erc1*1.001); /* * Cholesky */ v = 1/hpdmatrixcholeskyrcond(&cha, n, isupper, _state); if( ae_fp_greater_eq(v,testrcondunit_threshold50*erc1) ) { q50.ptr.p_double[1] = q50.ptr.p_double[1]+(double)1/(double)passcount; } if( ae_fp_greater_eq(v,testrcondunit_threshold90*erc1) ) { q90.ptr.p_double[1] = q90.ptr.p_double[1]+(double)1/(double)passcount; } errless = errless||ae_fp_greater(v,erc1*1.001); } for(i=0; i<=1; i++) { err50 = err50||ae_fp_less(q50.ptr.p_double[i],0.50); err90 = err90||ae_fp_less(q90.ptr.p_double[i],0.90); } /* * degenerate matrix test */ if( n>=3 ) { ae_matrix_set_length(&a, n, n, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { a.ptr.pp_complex[i][j] = ae_complex_from_d(0.0); } } a.ptr.pp_complex[0][0] = ae_complex_from_i(1); a.ptr.pp_complex[n-1][n-1] = ae_complex_from_i(1); errspec = errspec||ae_fp_neq(hpdmatrixrcond(&a, n, isupper, _state),(double)(-1)); errspec = errspec||ae_fp_neq(hpdmatrixcholeskyrcond(&a, n, isupper, _state),(double)(0)); } /* * near-degenerate matrix test */ if( n>=2 ) { ae_matrix_set_length(&a, n, n, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { a.ptr.pp_complex[i][j] = ae_complex_from_d(0.0); } } for(i=0; i<=n-1; i++) { a.ptr.pp_complex[i][i] = ae_complex_from_i(1); } i = ae_randominteger(n, _state); a.ptr.pp_complex[i][i] = ae_complex_from_d(0.1*ae_maxrealnumber); errspec = errspec||ae_fp_neq(hpdmatrixrcond(&a, n, isupper, _state),(double)(0)); errspec = errspec||ae_fp_neq(hpdmatrixcholeskyrcond(&a, n, isupper, _state),(double)(0)); } } /* * report */ result = !(((err50||err90)||errless)||errspec); ae_frame_leave(_state); return result; } static void testmatinvunit_rmatrixmakeacopy(/* Real */ 
ae_matrix* a, ae_int_t m, ae_int_t n, /* Real */ ae_matrix* b, ae_state *_state); static void testmatinvunit_cmatrixmakeacopy(/* Complex */ ae_matrix* a, ae_int_t m, ae_int_t n, /* Complex */ ae_matrix* b, ae_state *_state); static ae_bool testmatinvunit_rmatrixcheckinverse(/* Real */ ae_matrix* a, /* Real */ ae_matrix* inva, ae_int_t n, double threshold, ae_int_t info, matinvreport* rep, ae_state *_state); static ae_bool testmatinvunit_spdmatrixcheckinverse(/* Real */ ae_matrix* a, /* Real */ ae_matrix* inva, ae_bool isupper, ae_int_t n, double threshold, ae_int_t info, matinvreport* rep, ae_state *_state); static ae_bool testmatinvunit_hpdmatrixcheckinverse(/* Complex */ ae_matrix* a, /* Complex */ ae_matrix* inva, ae_bool isupper, ae_int_t n, double threshold, ae_int_t info, matinvreport* rep, ae_state *_state); static ae_bool testmatinvunit_rmatrixcheckinversesingular(/* Real */ ae_matrix* inva, ae_int_t n, double threshold, ae_int_t info, matinvreport* rep, ae_state *_state); static ae_bool testmatinvunit_cmatrixcheckinverse(/* Complex */ ae_matrix* a, /* Complex */ ae_matrix* inva, ae_int_t n, double threshold, ae_int_t info, matinvreport* rep, ae_state *_state); static ae_bool testmatinvunit_cmatrixcheckinversesingular(/* Complex */ ae_matrix* inva, ae_int_t n, double threshold, ae_int_t info, matinvreport* rep, ae_state *_state); static void testmatinvunit_rmatrixdrophalf(/* Real */ ae_matrix* a, ae_int_t n, ae_bool droplower, ae_state *_state); static void testmatinvunit_cmatrixdrophalf(/* Complex */ ae_matrix* a, ae_int_t n, ae_bool droplower, ae_state *_state); static void testmatinvunit_testrtrinv(ae_int_t minn, ae_int_t maxn, ae_int_t passcount, double threshold, ae_bool* rtrerrors, ae_state *_state); static void testmatinvunit_testctrinv(ae_int_t minn, ae_int_t maxn, ae_int_t passcount, double threshold, ae_bool* ctrerrors, ae_state *_state); static void testmatinvunit_testrinv(ae_int_t minn, ae_int_t maxn, ae_int_t passcount, double threshold, ae_bool* rerrors, ae_state *_state); static void testmatinvunit_testcinv(ae_int_t minn, ae_int_t maxn, ae_int_t passcount, double threshold, ae_bool* cerrors, ae_state *_state); static void testmatinvunit_testspdinv(ae_int_t minn, ae_int_t maxn, ae_int_t passcount, double threshold, ae_bool* spderrors, ae_state *_state); static void testmatinvunit_testhpdinv(ae_int_t minn, ae_int_t maxn, ae_int_t passcount, double threshold, ae_bool* hpderrors, ae_state *_state); static void testmatinvunit_unset2d(/* Real */ ae_matrix* x, ae_state *_state); static void testmatinvunit_cunset2d(/* Complex */ ae_matrix* x, ae_state *_state); static void testmatinvunit_unsetrep(matinvreport* r, ae_state *_state); /************************************************************************* Test *************************************************************************/ ae_bool testmatinv(ae_bool silent, ae_state *_state) { ae_frame _frame_block; ae_int_t maxrn; ae_int_t maxcn; ae_int_t largen; ae_int_t passcount; double threshold; ae_bool rtrerrors; ae_bool ctrerrors; ae_bool rerrors; ae_bool cerrors; ae_bool spderrors; ae_bool hpderrors; ae_bool waserrors; ae_matrix emptyra; ae_matrix emptyca; ae_bool result; ae_frame_make(_state, &_frame_block); ae_matrix_init(&emptyra, 0, 0, DT_REAL, _state); ae_matrix_init(&emptyca, 0, 0, DT_REAL, _state); maxrn = 3*ablasblocksize(&emptyra, _state)+1; maxcn = 3*ablasblocksize(&emptyca, _state)+1; largen = 256; passcount = 1; threshold = 10000*ae_machineepsilon; rtrerrors = ae_false; ctrerrors = ae_false; rerrors = 
ae_false; cerrors = ae_false; spderrors = ae_false; hpderrors = ae_false; testmatinvunit_testrtrinv(1, maxrn, passcount, threshold, &rtrerrors, _state); testmatinvunit_testctrinv(1, maxcn, passcount, threshold, &ctrerrors, _state); testmatinvunit_testrinv(1, maxrn, passcount, threshold, &rerrors, _state); testmatinvunit_testspdinv(1, maxrn, passcount, threshold, &spderrors, _state); testmatinvunit_testcinv(1, maxcn, passcount, threshold, &cerrors, _state); testmatinvunit_testhpdinv(1, maxcn, passcount, threshold, &hpderrors, _state); testmatinvunit_testrtrinv(largen, largen, passcount, threshold, &rtrerrors, _state); testmatinvunit_testctrinv(largen, largen, passcount, threshold, &ctrerrors, _state); testmatinvunit_testrinv(largen, largen, passcount, threshold, &rerrors, _state); testmatinvunit_testspdinv(largen, largen, passcount, threshold, &spderrors, _state); testmatinvunit_testcinv(largen, largen, passcount, threshold, &cerrors, _state); testmatinvunit_testhpdinv(largen, largen, passcount, threshold, &hpderrors, _state); waserrors = ((((rtrerrors||ctrerrors)||rerrors)||cerrors)||spderrors)||hpderrors; if( !silent ) { printf("TESTING MATINV\n"); printf("* REAL TRIANGULAR: "); if( rtrerrors ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("* COMPLEX TRIANGULAR: "); if( ctrerrors ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("* REAL: "); if( rerrors ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("* COMPLEX: "); if( cerrors ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("* SPD: "); if( spderrors ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("* HPD: "); if( hpderrors ) { printf("FAILED\n"); } else { printf("OK\n"); } if( waserrors ) { printf("TEST FAILED\n"); } else { printf("TEST PASSED\n"); } } result = !waserrors; ae_frame_leave(_state); return result; } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. *************************************************************************/ ae_bool _pexec_testmatinv(ae_bool silent, ae_state *_state) { return testmatinv(silent, _state); } /************************************************************************* Copy *************************************************************************/ static void testmatinvunit_rmatrixmakeacopy(/* Real */ ae_matrix* a, ae_int_t m, ae_int_t n, /* Real */ ae_matrix* b, ae_state *_state) { ae_int_t i; ae_int_t j; ae_matrix_clear(b); ae_matrix_set_length(b, m-1+1, n-1+1, _state); for(i=0; i<=m-1; i++) { for(j=0; j<=n-1; j++) { b->ptr.pp_double[i][j] = a->ptr.pp_double[i][j]; } } } /************************************************************************* Copy *************************************************************************/ static void testmatinvunit_cmatrixmakeacopy(/* Complex */ ae_matrix* a, ae_int_t m, ae_int_t n, /* Complex */ ae_matrix* b, ae_state *_state) { ae_int_t i; ae_int_t j; ae_matrix_clear(b); ae_matrix_set_length(b, m-1+1, n-1+1, _state); for(i=0; i<=m-1; i++) { for(j=0; j<=n-1; j++) { b->ptr.pp_complex[i][j] = a->ptr.pp_complex[i][j]; } } } /************************************************************************* Checks whether inverse is correct Returns True on success. 
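The check below accepts InvA only when Info>0, when the reported reciprocal condition estimates Rep.R1 and Rep.RInf lie roughly between 100*Eps and 1+1000*Eps, and when every entry of A*InvA differs from the identity matrix by no more than Threshold.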
*************************************************************************/ static ae_bool testmatinvunit_rmatrixcheckinverse(/* Real */ ae_matrix* a, /* Real */ ae_matrix* inva, ae_int_t n, double threshold, ae_int_t info, matinvreport* rep, ae_state *_state) { ae_int_t i; ae_int_t j; double v; ae_bool result; result = ae_true; if( info<=0 ) { result = ae_false; } else { result = result&&!(ae_fp_less(rep->r1,100*ae_machineepsilon)||ae_fp_greater(rep->r1,1+1000*ae_machineepsilon)); result = result&&!(ae_fp_less(rep->rinf,100*ae_machineepsilon)||ae_fp_greater(rep->rinf,1+1000*ae_machineepsilon)); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { v = ae_v_dotproduct(&a->ptr.pp_double[i][0], 1, &inva->ptr.pp_double[0][j], inva->stride, ae_v_len(0,n-1)); if( i==j ) { v = v-1; } result = result&&ae_fp_less_eq(ae_fabs(v, _state),threshold); } } } return result; } /************************************************************************* Checks whether inverse is correct Returns True on success. *************************************************************************/ static ae_bool testmatinvunit_spdmatrixcheckinverse(/* Real */ ae_matrix* a, /* Real */ ae_matrix* inva, ae_bool isupper, ae_int_t n, double threshold, ae_int_t info, matinvreport* rep, ae_state *_state) { ae_frame _frame_block; ae_matrix _a; ae_matrix _inva; ae_int_t i; ae_int_t j; double v; ae_bool result; ae_frame_make(_state, &_frame_block); ae_matrix_init_copy(&_a, a, _state); a = &_a; ae_matrix_init_copy(&_inva, inva, _state); inva = &_inva; for(i=0; i<=n-2; i++) { if( isupper ) { ae_v_move(&a->ptr.pp_double[i+1][i], a->stride, &a->ptr.pp_double[i][i+1], 1, ae_v_len(i+1,n-1)); ae_v_move(&inva->ptr.pp_double[i+1][i], inva->stride, &inva->ptr.pp_double[i][i+1], 1, ae_v_len(i+1,n-1)); } else { ae_v_move(&a->ptr.pp_double[i][i+1], 1, &a->ptr.pp_double[i+1][i], a->stride, ae_v_len(i+1,n-1)); ae_v_move(&inva->ptr.pp_double[i][i+1], 1, &inva->ptr.pp_double[i+1][i], inva->stride, ae_v_len(i+1,n-1)); } } result = ae_true; if( info<=0 ) { result = ae_false; } else { result = result&&!(ae_fp_less(rep->r1,100*ae_machineepsilon)||ae_fp_greater(rep->r1,1+1000*ae_machineepsilon)); result = result&&!(ae_fp_less(rep->rinf,100*ae_machineepsilon)||ae_fp_greater(rep->rinf,1+1000*ae_machineepsilon)); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { v = ae_v_dotproduct(&a->ptr.pp_double[i][0], 1, &inva->ptr.pp_double[0][j], inva->stride, ae_v_len(0,n-1)); if( i==j ) { v = v-1; } result = result&&ae_fp_less_eq(ae_fabs(v, _state),threshold); } } } ae_frame_leave(_state); return result; } /************************************************************************* Checks whether inverse is correct Returns True on success. 
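Only the IsUpper triangle of A and InvA is meaningful for the HPD routines, so the helper below first copies both matrices and mirrors the stored triangle into the other half (conjugating the copied entries), and only then applies the same Info / Rep.R1 / Rep.RInf sanity checks and the element-wise A*InvA-vs-identity test.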
*************************************************************************/ static ae_bool testmatinvunit_hpdmatrixcheckinverse(/* Complex */ ae_matrix* a, /* Complex */ ae_matrix* inva, ae_bool isupper, ae_int_t n, double threshold, ae_int_t info, matinvreport* rep, ae_state *_state) { ae_frame _frame_block; ae_matrix _a; ae_matrix _inva; ae_int_t i; ae_int_t j; ae_complex v; ae_bool result; ae_frame_make(_state, &_frame_block); ae_matrix_init_copy(&_a, a, _state); a = &_a; ae_matrix_init_copy(&_inva, inva, _state); inva = &_inva; for(i=0; i<=n-2; i++) { if( isupper ) { ae_v_cmove(&a->ptr.pp_complex[i+1][i], a->stride, &a->ptr.pp_complex[i][i+1], 1, "Conj", ae_v_len(i+1,n-1)); ae_v_cmove(&inva->ptr.pp_complex[i+1][i], inva->stride, &inva->ptr.pp_complex[i][i+1], 1, "Conj", ae_v_len(i+1,n-1)); } else { ae_v_cmove(&a->ptr.pp_complex[i][i+1], 1, &a->ptr.pp_complex[i+1][i], a->stride, "Conj", ae_v_len(i+1,n-1)); ae_v_cmove(&inva->ptr.pp_complex[i][i+1], 1, &inva->ptr.pp_complex[i+1][i], inva->stride, "Conj", ae_v_len(i+1,n-1)); } } result = ae_true; if( info<=0 ) { result = ae_false; } else { result = result&&!(ae_fp_less(rep->r1,100*ae_machineepsilon)||ae_fp_greater(rep->r1,1+1000*ae_machineepsilon)); result = result&&!(ae_fp_less(rep->rinf,100*ae_machineepsilon)||ae_fp_greater(rep->rinf,1+1000*ae_machineepsilon)); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { v = ae_v_cdotproduct(&a->ptr.pp_complex[i][0], 1, "N", &inva->ptr.pp_complex[0][j], inva->stride, "N", ae_v_len(0,n-1)); if( i==j ) { v = ae_c_sub_d(v,1); } result = result&&ae_fp_less_eq(ae_c_abs(v, _state),threshold); } } } ae_frame_leave(_state); return result; } /************************************************************************* Checks whether inversion result indicate singular matrix Returns True on success. *************************************************************************/ static ae_bool testmatinvunit_rmatrixcheckinversesingular(/* Real */ ae_matrix* inva, ae_int_t n, double threshold, ae_int_t info, matinvreport* rep, ae_state *_state) { ae_int_t i; ae_int_t j; ae_bool result; result = ae_true; if( info!=-3&&info!=1 ) { result = ae_false; } else { result = result&&!(ae_fp_less(rep->r1,(double)(0))||ae_fp_greater(rep->r1,1000*ae_machineepsilon)); result = result&&!(ae_fp_less(rep->rinf,(double)(0))||ae_fp_greater(rep->rinf,1000*ae_machineepsilon)); if( info==-3 ) { for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { result = result&&ae_fp_eq(inva->ptr.pp_double[i][j],(double)(0)); } } } } return result; } /************************************************************************* Checks whether inverse is correct Returns True on success. 
*************************************************************************/ static ae_bool testmatinvunit_cmatrixcheckinverse(/* Complex */ ae_matrix* a, /* Complex */ ae_matrix* inva, ae_int_t n, double threshold, ae_int_t info, matinvreport* rep, ae_state *_state) { ae_int_t i; ae_int_t j; ae_complex v; ae_bool result; result = ae_true; if( info<=0 ) { result = ae_false; } else { result = result&&!(ae_fp_less(rep->r1,100*ae_machineepsilon)||ae_fp_greater(rep->r1,1+1000*ae_machineepsilon)); result = result&&!(ae_fp_less(rep->rinf,100*ae_machineepsilon)||ae_fp_greater(rep->rinf,1+1000*ae_machineepsilon)); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { v = ae_v_cdotproduct(&a->ptr.pp_complex[i][0], 1, "N", &inva->ptr.pp_complex[0][j], inva->stride, "N", ae_v_len(0,n-1)); if( i==j ) { v = ae_c_sub_d(v,1); } result = result&&ae_fp_less_eq(ae_c_abs(v, _state),threshold); } } } return result; } /************************************************************************* Checks whether inversion result indicate singular matrix Returns True on success. *************************************************************************/ static ae_bool testmatinvunit_cmatrixcheckinversesingular(/* Complex */ ae_matrix* inva, ae_int_t n, double threshold, ae_int_t info, matinvreport* rep, ae_state *_state) { ae_int_t i; ae_int_t j; ae_bool result; result = ae_true; if( info!=-3&&info!=1 ) { result = ae_false; } else { result = result&&!(ae_fp_less(rep->r1,(double)(0))||ae_fp_greater(rep->r1,1000*ae_machineepsilon)); result = result&&!(ae_fp_less(rep->rinf,(double)(0))||ae_fp_greater(rep->rinf,1000*ae_machineepsilon)); if( info==-3 ) { for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { result = result&&ae_c_eq_d(inva->ptr.pp_complex[i][j],(double)(0)); } } } } return result; } /************************************************************************* Drops upper or lower half of the matrix - fills it by special pattern which may be used later to ensure that this part wasn't changed *************************************************************************/ static void testmatinvunit_rmatrixdrophalf(/* Real */ ae_matrix* a, ae_int_t n, ae_bool droplower, ae_state *_state) { ae_int_t i; ae_int_t j; for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { if( (droplower&&i>j)||(!droplower&&iptr.pp_double[i][j] = (double)(1+2*i+3*j); } } } } /************************************************************************* Drops upper or lower half of the matrix - fills it by special pattern which may be used later to ensure that this part wasn't changed *************************************************************************/ static void testmatinvunit_cmatrixdrophalf(/* Complex */ ae_matrix* a, ae_int_t n, ae_bool droplower, ae_state *_state) { ae_int_t i; ae_int_t j; for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { if( (droplower&&i>j)||(!droplower&&iptr.pp_complex[i][j] = ae_complex_from_i(1+2*i+3*j); } } } } /************************************************************************* Real TR inverse *************************************************************************/ static void testmatinvunit_testrtrinv(ae_int_t minn, ae_int_t maxn, ae_int_t passcount, double threshold, ae_bool* rtrerrors, ae_state *_state) { ae_frame _frame_block; ae_matrix a; ae_matrix b; ae_int_t n; ae_int_t pass; ae_int_t i; ae_int_t j; ae_int_t task; ae_bool isupper; ae_bool isunit; double v; ae_int_t info; matinvreport rep; ae_frame_make(_state, &_frame_block); ae_matrix_init(&a, 0, 0, DT_REAL, _state); ae_matrix_init(&b, 0, 0, DT_REAL, _state); 
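/*
 * Outline of the triangular-inverse test below: for every N and all four
 * IsUpper/IsUnit combinations we (1) generate a random matrix with diagonal
 * entries in [1,2) and off-diagonal entries in [-0.1,0.1), of which only the
 * selected triangle is referenced, (2) invert the copy B with RMatrixTRInverse,
 * (3) check structurally that the triangle which the inversion must not touch
 * (and, in unit-diagonal mode, the diagonal) of B is returned unchanged, and
 * (4) zero the unused triangle, restore unit diagonals if needed, and verify
 * that A*B reproduces the identity to within Threshold.
 */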
_matinvreport_init(&rep, _state); /* * Test */ for(n=minn; n<=maxn; n++) { ae_matrix_set_length(&a, n, n, _state); ae_matrix_set_length(&b, n, n, _state); for(task=0; task<=3; task++) { for(pass=1; pass<=passcount; pass++) { /* * Determine task */ isupper = task%2==0; isunit = task/2%2==0; /* * Generate matrix */ for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { if( i==j ) { a.ptr.pp_double[i][i] = 1+ae_randomreal(_state); } else { a.ptr.pp_double[i][j] = 0.2*ae_randomreal(_state)-0.1; } b.ptr.pp_double[i][j] = a.ptr.pp_double[i][j]; } } /* * Inverse */ rmatrixtrinverse(&b, n, isupper, isunit, &info, &rep, _state); if( info<=0 ) { *rtrerrors = ae_true; ae_frame_leave(_state); return; } /* * Structural test */ if( isunit ) { for(i=0; i<=n-1; i++) { *rtrerrors = *rtrerrors||ae_fp_neq(a.ptr.pp_double[i][i],b.ptr.pp_double[i][i]); } } if( isupper ) { for(i=0; i<=n-1; i++) { for(j=0; j<=i-1; j++) { *rtrerrors = *rtrerrors||ae_fp_neq(a.ptr.pp_double[i][j],b.ptr.pp_double[i][j]); } } } else { for(i=0; i<=n-1; i++) { for(j=i+1; j<=n-1; j++) { *rtrerrors = *rtrerrors||ae_fp_neq(a.ptr.pp_double[i][j],b.ptr.pp_double[i][j]); } } } /* * Inverse test */ for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { if( (ji&&!isupper) ) { a.ptr.pp_double[i][j] = (double)(0); b.ptr.pp_double[i][j] = (double)(0); } } } if( isunit ) { for(i=0; i<=n-1; i++) { a.ptr.pp_double[i][i] = (double)(1); b.ptr.pp_double[i][i] = (double)(1); } } for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { v = ae_v_dotproduct(&a.ptr.pp_double[i][0], 1, &b.ptr.pp_double[0][j], b.stride, ae_v_len(0,n-1)); if( j!=i ) { *rtrerrors = *rtrerrors||ae_fp_greater(ae_fabs(v, _state),threshold); } else { *rtrerrors = *rtrerrors||ae_fp_greater(ae_fabs(v-1, _state),threshold); } } } } } } ae_frame_leave(_state); } /************************************************************************* Complex TR inverse *************************************************************************/ static void testmatinvunit_testctrinv(ae_int_t minn, ae_int_t maxn, ae_int_t passcount, double threshold, ae_bool* ctrerrors, ae_state *_state) { ae_frame _frame_block; ae_matrix a; ae_matrix b; ae_int_t n; ae_int_t pass; ae_int_t i; ae_int_t j; ae_int_t task; ae_bool isupper; ae_bool isunit; ae_complex v; ae_int_t info; matinvreport rep; ae_frame_make(_state, &_frame_block); ae_matrix_init(&a, 0, 0, DT_COMPLEX, _state); ae_matrix_init(&b, 0, 0, DT_COMPLEX, _state); _matinvreport_init(&rep, _state); /* * Test */ for(n=minn; n<=maxn; n++) { ae_matrix_set_length(&a, n, n, _state); ae_matrix_set_length(&b, n, n, _state); for(task=0; task<=3; task++) { for(pass=1; pass<=passcount; pass++) { /* * Determine task */ isupper = task%2==0; isunit = task/2%2==0; /* * Generate matrix */ for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { if( i==j ) { a.ptr.pp_complex[i][i].x = 1+ae_randomreal(_state); a.ptr.pp_complex[i][i].y = 1+ae_randomreal(_state); } else { a.ptr.pp_complex[i][j].x = 0.2*ae_randomreal(_state)-0.1; a.ptr.pp_complex[i][j].y = 0.2*ae_randomreal(_state)-0.1; } b.ptr.pp_complex[i][j] = a.ptr.pp_complex[i][j]; } } /* * Inverse */ cmatrixtrinverse(&b, n, isupper, isunit, &info, &rep, _state); if( info<=0 ) { *ctrerrors = ae_true; ae_frame_leave(_state); return; } /* * Structural test */ if( isunit ) { for(i=0; i<=n-1; i++) { *ctrerrors = *ctrerrors||ae_c_neq(a.ptr.pp_complex[i][i],b.ptr.pp_complex[i][i]); } } if( isupper ) { for(i=0; i<=n-1; i++) { for(j=0; j<=i-1; j++) { *ctrerrors = *ctrerrors||ae_c_neq(a.ptr.pp_complex[i][j],b.ptr.pp_complex[i][j]); } } } else { for(i=0; i<=n-1; i++) 
{ for(j=i+1; j<=n-1; j++) { *ctrerrors = *ctrerrors||ae_c_neq(a.ptr.pp_complex[i][j],b.ptr.pp_complex[i][j]); } } } /* * Inverse test */ for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { if( (ji&&!isupper) ) { a.ptr.pp_complex[i][j] = ae_complex_from_i(0); b.ptr.pp_complex[i][j] = ae_complex_from_i(0); } } } if( isunit ) { for(i=0; i<=n-1; i++) { a.ptr.pp_complex[i][i] = ae_complex_from_i(1); b.ptr.pp_complex[i][i] = ae_complex_from_i(1); } } for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { v = ae_v_cdotproduct(&a.ptr.pp_complex[i][0], 1, "N", &b.ptr.pp_complex[0][j], b.stride, "N", ae_v_len(0,n-1)); if( j!=i ) { *ctrerrors = *ctrerrors||ae_fp_greater(ae_c_abs(v, _state),threshold); } else { *ctrerrors = *ctrerrors||ae_fp_greater(ae_c_abs(ae_c_sub_d(v,1), _state),threshold); } } } } } } ae_frame_leave(_state); } /************************************************************************* Real test *************************************************************************/ static void testmatinvunit_testrinv(ae_int_t minn, ae_int_t maxn, ae_int_t passcount, double threshold, ae_bool* rerrors, ae_state *_state) { ae_frame _frame_block; ae_matrix a; ae_matrix lua; ae_matrix inva; ae_matrix invlua; ae_vector p; ae_int_t i; ae_int_t j; ae_int_t k; ae_int_t n; ae_int_t pass; ae_int_t taskkind; ae_int_t info; matinvreport rep; ae_frame_make(_state, &_frame_block); ae_matrix_init(&a, 0, 0, DT_REAL, _state); ae_matrix_init(&lua, 0, 0, DT_REAL, _state); ae_matrix_init(&inva, 0, 0, DT_REAL, _state); ae_matrix_init(&invlua, 0, 0, DT_REAL, _state); ae_vector_init(&p, 0, DT_INT, _state); _matinvreport_init(&rep, _state); /* * General square matrices: * * test general solvers * * test least squares solver */ for(pass=1; pass<=passcount; pass++) { for(n=minn; n<=maxn; n++) { /* * ******************************************************** * WELL CONDITIONED TASKS * ability to find correct solution is tested * ******************************************************** * * 1. generate random well conditioned matrix A. * 2. generate random solution vector xe * 3. generate right part b=A*xe * 4. test different methods on original A */ rmatrixrndcond(n, (double)(1000), &a, _state); testmatinvunit_rmatrixmakeacopy(&a, n, n, &lua, _state); rmatrixlu(&lua, n, n, &p, _state); testmatinvunit_rmatrixmakeacopy(&a, n, n, &inva, _state); testmatinvunit_rmatrixmakeacopy(&lua, n, n, &invlua, _state); info = 0; testmatinvunit_unsetrep(&rep, _state); rmatrixinverse(&inva, n, &info, &rep, _state); *rerrors = *rerrors||!testmatinvunit_rmatrixcheckinverse(&a, &inva, n, threshold, info, &rep, _state); info = 0; testmatinvunit_unsetrep(&rep, _state); rmatrixluinverse(&invlua, &p, n, &info, &rep, _state); *rerrors = *rerrors||!testmatinvunit_rmatrixcheckinverse(&a, &invlua, n, threshold, info, &rep, _state); /* * ******************************************************** * EXACTLY SINGULAR MATRICES * ability to detect singularity is tested * ******************************************************** * * 1. generate different types of singular matrices: * * zero * * with zero columns * * with zero rows * * with equal rows/columns * 2. 
test different methods */ for(taskkind=0; taskkind<=4; taskkind++) { testmatinvunit_unset2d(&a, _state); if( taskkind==0 ) { /* * all zeros */ ae_matrix_set_length(&a, n, n, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { a.ptr.pp_double[i][j] = (double)(0); } } } if( taskkind==1 ) { /* * there is zero column */ ae_matrix_set_length(&a, n, n, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { a.ptr.pp_double[i][j] = 2*ae_randomreal(_state)-1; } } k = ae_randominteger(n, _state); ae_v_muld(&a.ptr.pp_double[0][k], a.stride, ae_v_len(0,n-1), 0); } if( taskkind==2 ) { /* * there is zero row */ ae_matrix_set_length(&a, n, n, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { a.ptr.pp_double[i][j] = 2*ae_randomreal(_state)-1; } } k = ae_randominteger(n, _state); ae_v_muld(&a.ptr.pp_double[k][0], 1, ae_v_len(0,n-1), 0); } if( taskkind==3 ) { /* * equal columns */ if( n<2 ) { continue; } ae_matrix_set_length(&a, n, n, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { a.ptr.pp_double[i][j] = 2*ae_randomreal(_state)-1; } } k = 1+ae_randominteger(n-1, _state); ae_v_move(&a.ptr.pp_double[0][0], a.stride, &a.ptr.pp_double[0][k], a.stride, ae_v_len(0,n-1)); } if( taskkind==4 ) { /* * equal rows */ if( n<2 ) { continue; } ae_matrix_set_length(&a, n, n, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { a.ptr.pp_double[i][j] = 2*ae_randomreal(_state)-1; } } k = 1+ae_randominteger(n-1, _state); ae_v_move(&a.ptr.pp_double[0][0], 1, &a.ptr.pp_double[k][0], 1, ae_v_len(0,n-1)); } testmatinvunit_rmatrixmakeacopy(&a, n, n, &lua, _state); rmatrixlu(&lua, n, n, &p, _state); info = 0; testmatinvunit_unsetrep(&rep, _state); rmatrixinverse(&a, n, &info, &rep, _state); *rerrors = *rerrors||!testmatinvunit_rmatrixcheckinversesingular(&a, n, threshold, info, &rep, _state); info = 0; testmatinvunit_unsetrep(&rep, _state); rmatrixluinverse(&lua, &p, n, &info, &rep, _state); *rerrors = *rerrors||!testmatinvunit_rmatrixcheckinversesingular(&lua, n, threshold, info, &rep, _state); } } } ae_frame_leave(_state); } /************************************************************************* Complex test *************************************************************************/ static void testmatinvunit_testcinv(ae_int_t minn, ae_int_t maxn, ae_int_t passcount, double threshold, ae_bool* cerrors, ae_state *_state) { ae_frame _frame_block; ae_matrix a; ae_matrix lua; ae_matrix inva; ae_matrix invlua; ae_vector p; ae_int_t i; ae_int_t j; ae_int_t k; ae_int_t n; ae_int_t pass; ae_int_t taskkind; ae_int_t info; matinvreport rep; ae_frame_make(_state, &_frame_block); ae_matrix_init(&a, 0, 0, DT_COMPLEX, _state); ae_matrix_init(&lua, 0, 0, DT_COMPLEX, _state); ae_matrix_init(&inva, 0, 0, DT_COMPLEX, _state); ae_matrix_init(&invlua, 0, 0, DT_COMPLEX, _state); ae_vector_init(&p, 0, DT_INT, _state); _matinvreport_init(&rep, _state); /* * General square matrices: * * test general solvers * * test least squares solver */ for(pass=1; pass<=passcount; pass++) { for(n=minn; n<=maxn; n++) { /* * ******************************************************** * WELL CONDITIONED TASKS * ability to find correct solution is tested * ******************************************************** * * 1. generate random well conditioned matrix A. * 2. generate random solution vector xe * 3. generate right part b=A*xe * 4. 
test different methods on original A */ cmatrixrndcond(n, (double)(1000), &a, _state); testmatinvunit_cmatrixmakeacopy(&a, n, n, &lua, _state); cmatrixlu(&lua, n, n, &p, _state); testmatinvunit_cmatrixmakeacopy(&a, n, n, &inva, _state); testmatinvunit_cmatrixmakeacopy(&lua, n, n, &invlua, _state); info = 0; testmatinvunit_unsetrep(&rep, _state); cmatrixinverse(&inva, n, &info, &rep, _state); *cerrors = *cerrors||!testmatinvunit_cmatrixcheckinverse(&a, &inva, n, threshold, info, &rep, _state); info = 0; testmatinvunit_unsetrep(&rep, _state); cmatrixluinverse(&invlua, &p, n, &info, &rep, _state); *cerrors = *cerrors||!testmatinvunit_cmatrixcheckinverse(&a, &invlua, n, threshold, info, &rep, _state); /* * ******************************************************** * EXACTLY SINGULAR MATRICES * ability to detect singularity is tested * ******************************************************** * * 1. generate different types of singular matrices: * * zero * * with zero columns * * with zero rows * * with equal rows/columns * 2. test different methods */ for(taskkind=0; taskkind<=4; taskkind++) { testmatinvunit_cunset2d(&a, _state); if( taskkind==0 ) { /* * all zeros */ ae_matrix_set_length(&a, n, n, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { a.ptr.pp_complex[i][j] = ae_complex_from_i(0); } } } if( taskkind==1 ) { /* * there is zero column */ ae_matrix_set_length(&a, n, n, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { a.ptr.pp_complex[i][j].x = 2*ae_randomreal(_state)-1; a.ptr.pp_complex[i][j].y = 2*ae_randomreal(_state)-1; } } k = ae_randominteger(n, _state); ae_v_cmuld(&a.ptr.pp_complex[0][k], a.stride, ae_v_len(0,n-1), 0); } if( taskkind==2 ) { /* * there is zero row */ ae_matrix_set_length(&a, n, n, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { a.ptr.pp_complex[i][j].x = 2*ae_randomreal(_state)-1; a.ptr.pp_complex[i][j].y = 2*ae_randomreal(_state)-1; } } k = ae_randominteger(n, _state); ae_v_cmuld(&a.ptr.pp_complex[k][0], 1, ae_v_len(0,n-1), 0); } if( taskkind==3 ) { /* * equal columns */ if( n<2 ) { continue; } ae_matrix_set_length(&a, n, n, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { a.ptr.pp_complex[i][j].x = 2*ae_randomreal(_state)-1; a.ptr.pp_complex[i][j].y = 2*ae_randomreal(_state)-1; } } k = 1+ae_randominteger(n-1, _state); ae_v_cmove(&a.ptr.pp_complex[0][0], a.stride, &a.ptr.pp_complex[0][k], a.stride, "N", ae_v_len(0,n-1)); } if( taskkind==4 ) { /* * equal rows */ if( n<2 ) { continue; } ae_matrix_set_length(&a, n, n, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { a.ptr.pp_complex[i][j].x = 2*ae_randomreal(_state)-1; a.ptr.pp_complex[i][j].y = 2*ae_randomreal(_state)-1; } } k = 1+ae_randominteger(n-1, _state); ae_v_cmove(&a.ptr.pp_complex[0][0], 1, &a.ptr.pp_complex[k][0], 1, "N", ae_v_len(0,n-1)); } testmatinvunit_cmatrixmakeacopy(&a, n, n, &lua, _state); cmatrixlu(&lua, n, n, &p, _state); info = 0; testmatinvunit_unsetrep(&rep, _state); cmatrixinverse(&a, n, &info, &rep, _state); *cerrors = *cerrors||!testmatinvunit_cmatrixcheckinversesingular(&a, n, threshold, info, &rep, _state); info = 0; testmatinvunit_unsetrep(&rep, _state); cmatrixluinverse(&lua, &p, n, &info, &rep, _state); *cerrors = *cerrors||!testmatinvunit_cmatrixcheckinversesingular(&lua, n, threshold, info, &rep, _state); } } } ae_frame_leave(_state); } /************************************************************************* SPD test *************************************************************************/ static void testmatinvunit_testspdinv(ae_int_t minn, 
ae_int_t maxn, ae_int_t passcount, double threshold, ae_bool* spderrors, ae_state *_state) { ae_frame _frame_block; ae_matrix a; ae_matrix cha; ae_matrix inva; ae_matrix invcha; ae_bool isupper; ae_int_t i; ae_int_t j; ae_int_t k; ae_int_t n; ae_int_t pass; ae_int_t taskkind; ae_int_t info; matinvreport rep; ae_frame_make(_state, &_frame_block); ae_matrix_init(&a, 0, 0, DT_REAL, _state); ae_matrix_init(&cha, 0, 0, DT_REAL, _state); ae_matrix_init(&inva, 0, 0, DT_REAL, _state); ae_matrix_init(&invcha, 0, 0, DT_REAL, _state); _matinvreport_init(&rep, _state); /* * General square matrices: * * test general solvers * * test least squares solver */ for(pass=1; pass<=passcount; pass++) { for(n=minn; n<=maxn; n++) { isupper = ae_fp_greater(ae_randomreal(_state),0.5); /* * ******************************************************** * WELL CONDITIONED TASKS * ability to find correct solution is tested * ******************************************************** * * 1. generate random well conditioned matrix A. * 2. generate random solution vector xe * 3. generate right part b=A*xe * 4. test different methods on original A */ spdmatrixrndcond(n, (double)(1000), &a, _state); testmatinvunit_rmatrixdrophalf(&a, n, isupper, _state); testmatinvunit_rmatrixmakeacopy(&a, n, n, &cha, _state); if( !spdmatrixcholesky(&cha, n, isupper, _state) ) { continue; } testmatinvunit_rmatrixmakeacopy(&a, n, n, &inva, _state); testmatinvunit_rmatrixmakeacopy(&cha, n, n, &invcha, _state); info = 0; testmatinvunit_unsetrep(&rep, _state); spdmatrixinverse(&inva, n, isupper, &info, &rep, _state); *spderrors = *spderrors||!testmatinvunit_spdmatrixcheckinverse(&a, &inva, isupper, n, threshold, info, &rep, _state); info = 0; testmatinvunit_unsetrep(&rep, _state); spdmatrixcholeskyinverse(&invcha, n, isupper, &info, &rep, _state); *spderrors = *spderrors||!testmatinvunit_spdmatrixcheckinverse(&a, &invcha, isupper, n, threshold, info, &rep, _state); /* * ******************************************************** * EXACTLY SINGULAR MATRICES * ability to detect singularity is tested * ******************************************************** * * 1. generate different types of singular matrices: * * zero * * with zero columns * * with zero rows * 2. 
test different methods */ for(taskkind=0; taskkind<=2; taskkind++) { testmatinvunit_unset2d(&a, _state); if( taskkind==0 ) { /* * all zeros */ ae_matrix_set_length(&a, n, n, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { a.ptr.pp_double[i][j] = (double)(0); } } } if( taskkind==1 ) { /* * there is zero column */ ae_matrix_set_length(&a, n, n, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { a.ptr.pp_double[i][j] = 2*ae_randomreal(_state)-1; } } k = ae_randominteger(n, _state); ae_v_muld(&a.ptr.pp_double[0][k], a.stride, ae_v_len(0,n-1), 0); } if( taskkind==2 ) { /* * there is zero row */ ae_matrix_set_length(&a, n, n, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { a.ptr.pp_double[i][j] = 2*ae_randomreal(_state)-1; } } k = ae_randominteger(n, _state); ae_v_muld(&a.ptr.pp_double[k][0], 1, ae_v_len(0,n-1), 0); } info = 0; testmatinvunit_unsetrep(&rep, _state); spdmatrixcholeskyinverse(&a, n, isupper, &info, &rep, _state); if( info!=-3&&info!=1 ) { *spderrors = ae_true; } else { *spderrors = (*spderrors||ae_fp_less(rep.r1,(double)(0)))||ae_fp_greater(rep.r1,1000*ae_machineepsilon); *spderrors = (*spderrors||ae_fp_less(rep.rinf,(double)(0)))||ae_fp_greater(rep.rinf,1000*ae_machineepsilon); } } } } ae_frame_leave(_state); } /************************************************************************* HPD test *************************************************************************/ static void testmatinvunit_testhpdinv(ae_int_t minn, ae_int_t maxn, ae_int_t passcount, double threshold, ae_bool* hpderrors, ae_state *_state) { ae_frame _frame_block; ae_matrix a; ae_matrix cha; ae_matrix inva; ae_matrix invcha; ae_bool isupper; ae_int_t i; ae_int_t j; ae_int_t k; ae_int_t n; ae_int_t pass; ae_int_t taskkind; ae_int_t info; matinvreport rep; ae_frame_make(_state, &_frame_block); ae_matrix_init(&a, 0, 0, DT_COMPLEX, _state); ae_matrix_init(&cha, 0, 0, DT_COMPLEX, _state); ae_matrix_init(&inva, 0, 0, DT_COMPLEX, _state); ae_matrix_init(&invcha, 0, 0, DT_COMPLEX, _state); _matinvreport_init(&rep, _state); /* * General square matrices: * * test general solvers * * test least squares solver */ for(pass=1; pass<=passcount; pass++) { for(n=minn; n<=maxn; n++) { isupper = ae_fp_greater(ae_randomreal(_state),0.5); /* * ******************************************************** * WELL CONDITIONED TASKS * ability to find correct solution is tested * ******************************************************** * * 1. generate random well conditioned matrix A. * 2. generate random solution vector xe * 3. generate right part b=A*xe * 4. 
test different methods on original A */ hpdmatrixrndcond(n, (double)(1000), &a, _state); testmatinvunit_cmatrixdrophalf(&a, n, isupper, _state); testmatinvunit_cmatrixmakeacopy(&a, n, n, &cha, _state); if( !hpdmatrixcholesky(&cha, n, isupper, _state) ) { continue; } testmatinvunit_cmatrixmakeacopy(&a, n, n, &inva, _state); testmatinvunit_cmatrixmakeacopy(&cha, n, n, &invcha, _state); info = 0; testmatinvunit_unsetrep(&rep, _state); hpdmatrixinverse(&inva, n, isupper, &info, &rep, _state); *hpderrors = *hpderrors||!testmatinvunit_hpdmatrixcheckinverse(&a, &inva, isupper, n, threshold, info, &rep, _state); info = 0; testmatinvunit_unsetrep(&rep, _state); hpdmatrixcholeskyinverse(&invcha, n, isupper, &info, &rep, _state); *hpderrors = *hpderrors||!testmatinvunit_hpdmatrixcheckinverse(&a, &invcha, isupper, n, threshold, info, &rep, _state); /* * ******************************************************** * EXACTLY SINGULAR MATRICES * ability to detect singularity is tested * ******************************************************** * * 1. generate different types of singular matrices: * * zero * * with zero columns * * with zero rows * 2. test different methods */ for(taskkind=0; taskkind<=2; taskkind++) { testmatinvunit_cunset2d(&a, _state); if( taskkind==0 ) { /* * all zeros */ ae_matrix_set_length(&a, n, n, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { a.ptr.pp_complex[i][j] = ae_complex_from_i(0); } } } if( taskkind==1 ) { /* * there is zero column */ ae_matrix_set_length(&a, n, n, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { a.ptr.pp_complex[i][j].x = 2*ae_randomreal(_state)-1; a.ptr.pp_complex[i][j].y = 2*ae_randomreal(_state)-1; } } k = ae_randominteger(n, _state); ae_v_cmuld(&a.ptr.pp_complex[0][k], a.stride, ae_v_len(0,n-1), 0); ae_v_cmuld(&a.ptr.pp_complex[k][0], 1, ae_v_len(0,n-1), 0); } if( taskkind==2 ) { /* * there is zero row */ ae_matrix_set_length(&a, n, n, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { a.ptr.pp_complex[i][j].x = 2*ae_randomreal(_state)-1; a.ptr.pp_complex[i][j].y = 2*ae_randomreal(_state)-1; } } k = ae_randominteger(n, _state); ae_v_cmuld(&a.ptr.pp_complex[k][0], 1, ae_v_len(0,n-1), 0); ae_v_cmuld(&a.ptr.pp_complex[0][k], a.stride, ae_v_len(0,n-1), 0); } info = 0; testmatinvunit_unsetrep(&rep, _state); hpdmatrixcholeskyinverse(&a, n, isupper, &info, &rep, _state); if( info!=-3&&info!=1 ) { *hpderrors = ae_true; } else { *hpderrors = (*hpderrors||ae_fp_less(rep.r1,(double)(0)))||ae_fp_greater(rep.r1,1000*ae_machineepsilon); *hpderrors = (*hpderrors||ae_fp_less(rep.rinf,(double)(0)))||ae_fp_greater(rep.rinf,1000*ae_machineepsilon); } } } } ae_frame_leave(_state); } /************************************************************************* Unsets real matrix *************************************************************************/ static void testmatinvunit_unset2d(/* Real */ ae_matrix* x, ae_state *_state) { ae_matrix_set_length(x, 1, 1, _state); x->ptr.pp_double[0][0] = 2*ae_randomreal(_state)-1; } /************************************************************************* Unsets real matrix *************************************************************************/ static void testmatinvunit_cunset2d(/* Complex */ ae_matrix* x, ae_state *_state) { ae_matrix_set_length(x, 1, 1, _state); x->ptr.pp_complex[0][0] = ae_complex_from_d(2*ae_randomreal(_state)-1); } /************************************************************************* Unsets report *************************************************************************/ static void 
testmatinvunit_unsetrep(matinvreport* r, ae_state *_state) { r->r1 = (double)(-1); r->rinf = (double)(-1); } ae_bool testhblas(ae_bool silent, ae_state *_state) { ae_frame _frame_block; ae_matrix a; ae_matrix ua; ae_matrix la; ae_vector x; ae_vector y1; ae_vector y2; ae_vector y3; ae_int_t n; ae_int_t maxn; ae_int_t i; ae_int_t j; ae_int_t i1; ae_int_t i2; ae_bool waserrors; double mverr; double threshold; ae_complex alpha; ae_complex v; ae_bool result; ae_frame_make(_state, &_frame_block); ae_matrix_init(&a, 0, 0, DT_COMPLEX, _state); ae_matrix_init(&ua, 0, 0, DT_COMPLEX, _state); ae_matrix_init(&la, 0, 0, DT_COMPLEX, _state); ae_vector_init(&x, 0, DT_COMPLEX, _state); ae_vector_init(&y1, 0, DT_COMPLEX, _state); ae_vector_init(&y2, 0, DT_COMPLEX, _state); ae_vector_init(&y3, 0, DT_COMPLEX, _state); mverr = (double)(0); waserrors = ae_false; maxn = 10; threshold = 1000*ae_machineepsilon; /* * Test MV */ for(n=2; n<=maxn; n++) { ae_matrix_set_length(&a, n+1, n+1, _state); ae_matrix_set_length(&ua, n+1, n+1, _state); ae_matrix_set_length(&la, n+1, n+1, _state); ae_vector_set_length(&x, n+1, _state); ae_vector_set_length(&y1, n+1, _state); ae_vector_set_length(&y2, n+1, _state); ae_vector_set_length(&y3, n+1, _state); /* * fill A, UA, LA */ for(i=1; i<=n; i++) { a.ptr.pp_complex[i][i].x = 2*ae_randomreal(_state)-1; a.ptr.pp_complex[i][i].y = (double)(0); for(j=i+1; j<=n; j++) { a.ptr.pp_complex[i][j].x = 2*ae_randomreal(_state)-1; a.ptr.pp_complex[i][j].y = 2*ae_randomreal(_state)-1; a.ptr.pp_complex[j][i] = ae_c_conj(a.ptr.pp_complex[i][j], _state); } } for(i=1; i<=n; i++) { for(j=1; j<=n; j++) { ua.ptr.pp_complex[i][j] = ae_complex_from_i(0); } } for(i=1; i<=n; i++) { for(j=i; j<=n; j++) { ua.ptr.pp_complex[i][j] = a.ptr.pp_complex[i][j]; } } for(i=1; i<=n; i++) { for(j=1; j<=n; j++) { la.ptr.pp_complex[i][j] = ae_complex_from_i(0); } } for(i=1; i<=n; i++) { for(j=1; j<=i; j++) { la.ptr.pp_complex[i][j] = a.ptr.pp_complex[i][j]; } } /* * test on different I1, I2 */ for(i1=1; i1<=n; i1++) { for(i2=i1; i2<=n; i2++) { /* * Fill X, choose Alpha */ for(i=1; i<=i2-i1+1; i++) { x.ptr.p_complex[i].x = 2*ae_randomreal(_state)-1; x.ptr.p_complex[i].y = 2*ae_randomreal(_state)-1; } alpha.x = 2*ae_randomreal(_state)-1; alpha.y = 2*ae_randomreal(_state)-1; /* * calculate A*x, UA*x, LA*x */ for(i=i1; i<=i2; i++) { v = ae_v_cdotproduct(&a.ptr.pp_complex[i][i1], 1, "N", &x.ptr.p_complex[1], 1, "N", ae_v_len(i1,i2)); y1.ptr.p_complex[i-i1+1] = ae_c_mul(alpha,v); } hermitianmatrixvectormultiply(&ua, ae_true, i1, i2, &x, alpha, &y2, _state); hermitianmatrixvectormultiply(&la, ae_false, i1, i2, &x, alpha, &y3, _state); /* * Calculate error */ ae_v_csub(&y2.ptr.p_complex[1], 1, &y1.ptr.p_complex[1], 1, "N", ae_v_len(1,i2-i1+1)); v = ae_v_cdotproduct(&y2.ptr.p_complex[1], 1, "N", &y2.ptr.p_complex[1], 1, "Conj", ae_v_len(1,i2-i1+1)); mverr = ae_maxreal(mverr, ae_sqrt(ae_c_abs(v, _state), _state), _state); ae_v_csub(&y3.ptr.p_complex[1], 1, &y1.ptr.p_complex[1], 1, "N", ae_v_len(1,i2-i1+1)); v = ae_v_cdotproduct(&y3.ptr.p_complex[1], 1, "N", &y3.ptr.p_complex[1], 1, "Conj", ae_v_len(1,i2-i1+1)); mverr = ae_maxreal(mverr, ae_sqrt(ae_c_abs(v, _state), _state), _state); } } } /* * report */ waserrors = ae_fp_greater(mverr,threshold); if( !silent ) { printf("TESTING HERMITIAN BLAS\n"); printf("MV error: %5.3e\n", (double)(mverr)); printf("Threshold: %5.3e\n", (double)(threshold)); if( waserrors ) { printf("TEST FAILED\n"); } else { printf("TEST PASSED\n"); } printf("\n\n"); } result = !waserrors; 
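/*
 * Final verdict: MVErr above accumulated the largest 2-norm difference between
 * HermitianMatrixVectorMultiply (upper- and lower-storage calls) and the
 * reference Alpha*A*x computed with plain dot products; the test fails when
 * that error exceeds Threshold = 1000*machine epsilon.
 */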
ae_frame_leave(_state); return result; } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. *************************************************************************/ ae_bool _pexec_testhblas(ae_bool silent, ae_state *_state) { return testhblas(silent, _state); } ae_bool testsblas(ae_bool silent, ae_state *_state) { ae_frame _frame_block; ae_matrix a; ae_matrix ua; ae_matrix la; ae_vector x; ae_vector y1; ae_vector y2; ae_vector y3; ae_int_t n; ae_int_t maxn; ae_int_t i; ae_int_t j; ae_int_t i1; ae_int_t i2; ae_bool waserrors; double mverr; double threshold; double alpha; double v; ae_bool result; ae_frame_make(_state, &_frame_block); ae_matrix_init(&a, 0, 0, DT_REAL, _state); ae_matrix_init(&ua, 0, 0, DT_REAL, _state); ae_matrix_init(&la, 0, 0, DT_REAL, _state); ae_vector_init(&x, 0, DT_REAL, _state); ae_vector_init(&y1, 0, DT_REAL, _state); ae_vector_init(&y2, 0, DT_REAL, _state); ae_vector_init(&y3, 0, DT_REAL, _state); mverr = (double)(0); waserrors = ae_false; maxn = 10; threshold = 1000*ae_machineepsilon; /* * Test MV */ for(n=2; n<=maxn; n++) { ae_matrix_set_length(&a, n+1, n+1, _state); ae_matrix_set_length(&ua, n+1, n+1, _state); ae_matrix_set_length(&la, n+1, n+1, _state); ae_vector_set_length(&x, n+1, _state); ae_vector_set_length(&y1, n+1, _state); ae_vector_set_length(&y2, n+1, _state); ae_vector_set_length(&y3, n+1, _state); /* * fill A, UA, LA */ for(i=1; i<=n; i++) { a.ptr.pp_double[i][i] = 2*ae_randomreal(_state)-1; for(j=i+1; j<=n; j++) { a.ptr.pp_double[i][j] = 2*ae_randomreal(_state)-1; a.ptr.pp_double[j][i] = a.ptr.pp_double[i][j]; } } for(i=1; i<=n; i++) { for(j=1; j<=n; j++) { ua.ptr.pp_double[i][j] = (double)(0); } } for(i=1; i<=n; i++) { for(j=i; j<=n; j++) { ua.ptr.pp_double[i][j] = a.ptr.pp_double[i][j]; } } for(i=1; i<=n; i++) { for(j=1; j<=n; j++) { la.ptr.pp_double[i][j] = (double)(0); } } for(i=1; i<=n; i++) { for(j=1; j<=i; j++) { la.ptr.pp_double[i][j] = a.ptr.pp_double[i][j]; } } /* * test on different I1, I2 */ for(i1=1; i1<=n; i1++) { for(i2=i1; i2<=n; i2++) { /* * Fill X, choose Alpha */ for(i=1; i<=i2-i1+1; i++) { x.ptr.p_double[i] = 2*ae_randomreal(_state)-1; } alpha = 2*ae_randomreal(_state)-1; /* * calculate A*x, UA*x, LA*x */ for(i=i1; i<=i2; i++) { v = ae_v_dotproduct(&a.ptr.pp_double[i][i1], 1, &x.ptr.p_double[1], 1, ae_v_len(i1,i2)); y1.ptr.p_double[i-i1+1] = alpha*v; } symmetricmatrixvectormultiply(&ua, ae_true, i1, i2, &x, alpha, &y2, _state); symmetricmatrixvectormultiply(&la, ae_false, i1, i2, &x, alpha, &y3, _state); /* * Calculate error */ ae_v_sub(&y2.ptr.p_double[1], 1, &y1.ptr.p_double[1], 1, ae_v_len(1,i2-i1+1)); v = ae_v_dotproduct(&y2.ptr.p_double[1], 1, &y2.ptr.p_double[1], 1, ae_v_len(1,i2-i1+1)); mverr = ae_maxreal(mverr, ae_sqrt(v, _state), _state); ae_v_sub(&y3.ptr.p_double[1], 1, &y1.ptr.p_double[1], 1, ae_v_len(1,i2-i1+1)); v = ae_v_dotproduct(&y3.ptr.p_double[1], 1, &y3.ptr.p_double[1], 1, ae_v_len(1,i2-i1+1)); mverr = ae_maxreal(mverr, ae_sqrt(v, _state), _state); } } } /* * report */ waserrors = ae_fp_greater(mverr,threshold); if( !silent ) { printf("TESTING SYMMETRIC BLAS\n"); printf("MV error: %5.3e\n", (double)(mverr)); printf("Threshold: %5.3e\n", (double)(threshold)); if( waserrors ) { printf("TEST FAILED\n"); } else { printf("TEST PASSED\n"); } printf("\n\n"); } result = !waserrors; ae_frame_leave(_state); return result; } /************************************************************************* Single-threaded stub. 
HPC ALGLIB replaces it by multithreaded code. *************************************************************************/ ae_bool _pexec_testsblas(ae_bool silent, ae_state *_state) { return testsblas(silent, _state); } static double testortfacunit_rmatrixdiff(/* Real */ ae_matrix* a, /* Real */ ae_matrix* b, ae_int_t m, ae_int_t n, ae_state *_state); static void testortfacunit_rmatrixmakeacopy(/* Real */ ae_matrix* a, ae_int_t m, ae_int_t n, /* Real */ ae_matrix* b, ae_state *_state); static void testortfacunit_cmatrixmakeacopy(/* Complex */ ae_matrix* a, ae_int_t m, ae_int_t n, /* Complex */ ae_matrix* b, ae_state *_state); static void testortfacunit_rmatrixfillsparsea(/* Real */ ae_matrix* a, ae_int_t m, ae_int_t n, double sparcity, ae_state *_state); static void testortfacunit_cmatrixfillsparsea(/* Complex */ ae_matrix* a, ae_int_t m, ae_int_t n, double sparcity, ae_state *_state); static void testortfacunit_internalmatrixmatrixmultiply(/* Real */ ae_matrix* a, ae_int_t ai1, ae_int_t ai2, ae_int_t aj1, ae_int_t aj2, ae_bool transa, /* Real */ ae_matrix* b, ae_int_t bi1, ae_int_t bi2, ae_int_t bj1, ae_int_t bj2, ae_bool transb, /* Real */ ae_matrix* c, ae_int_t ci1, ae_int_t ci2, ae_int_t cj1, ae_int_t cj2, ae_state *_state); static void testortfacunit_testrqrproblem(/* Real */ ae_matrix* a, ae_int_t m, ae_int_t n, double threshold, ae_bool* qrerrors, ae_state *_state); static void testortfacunit_testcqrproblem(/* Complex */ ae_matrix* a, ae_int_t m, ae_int_t n, double threshold, ae_bool* qrerrors, ae_state *_state); static void testortfacunit_testrlqproblem(/* Real */ ae_matrix* a, ae_int_t m, ae_int_t n, double threshold, ae_bool* lqerrors, ae_state *_state); static void testortfacunit_testclqproblem(/* Complex */ ae_matrix* a, ae_int_t m, ae_int_t n, double threshold, ae_bool* lqerrors, ae_state *_state); static void testortfacunit_testrbdproblem(/* Real */ ae_matrix* a, ae_int_t m, ae_int_t n, double threshold, ae_bool* bderrors, ae_state *_state); static void testortfacunit_testrhessproblem(/* Real */ ae_matrix* a, ae_int_t n, double threshold, ae_bool* hesserrors, ae_state *_state); static void testortfacunit_testrtdproblem(/* Real */ ae_matrix* a, ae_int_t n, double threshold, ae_bool* tderrors, ae_state *_state); static void testortfacunit_testctdproblem(/* Complex */ ae_matrix* a, ae_int_t n, double threshold, ae_bool* tderrors, ae_state *_state); /************************************************************************* Main unittest subroutine *************************************************************************/ ae_bool testortfac(ae_bool silent, ae_state *_state) { ae_frame _frame_block; ae_int_t maxmn; double threshold; ae_int_t passcount; ae_int_t mx; ae_matrix ra; ae_matrix ca; ae_int_t m; ae_int_t n; ae_int_t pass; ae_int_t i; ae_int_t j; ae_bool rqrerrors; ae_bool rlqerrors; ae_bool cqrerrors; ae_bool clqerrors; ae_bool rbderrors; ae_bool rhesserrors; ae_bool rtderrors; ae_bool ctderrors; ae_bool waserrors; ae_bool result; ae_frame_make(_state, &_frame_block); ae_matrix_init(&ra, 0, 0, DT_REAL, _state); ae_matrix_init(&ca, 0, 0, DT_COMPLEX, _state); waserrors = ae_false; rqrerrors = ae_false; rlqerrors = ae_false; cqrerrors = ae_false; clqerrors = ae_false; rbderrors = ae_false; rhesserrors = ae_false; rtderrors = ae_false; ctderrors = ae_false; maxmn = 3*ablasblocksize(&ra, _state)+1; passcount = 1; threshold = 5*1000*ae_machineepsilon; /* * Different problems */ for(mx=1; mx<=maxmn; mx++) { for(pass=1; pass<=passcount; pass++) { /* * Rectangular factorizations: QR, 
LQ, bidiagonal * Matrix types: zero, dense, sparse */ n = 1+ae_randominteger(mx, _state); m = 1+ae_randominteger(mx, _state); if( ae_fp_greater(ae_randomreal(_state),0.5) ) { n = mx; } else { m = mx; } ae_matrix_set_length(&ra, m, n, _state); ae_matrix_set_length(&ca, m, n, _state); for(i=0; i<=m-1; i++) { for(j=0; j<=n-1; j++) { ra.ptr.pp_double[i][j] = (double)(0); ca.ptr.pp_complex[i][j] = ae_complex_from_i(0); } } testortfacunit_testrqrproblem(&ra, m, n, threshold, &rqrerrors, _state); testortfacunit_testrlqproblem(&ra, m, n, threshold, &rlqerrors, _state); testortfacunit_testcqrproblem(&ca, m, n, threshold, &cqrerrors, _state); testortfacunit_testclqproblem(&ca, m, n, threshold, &clqerrors, _state); testortfacunit_testrbdproblem(&ra, m, n, threshold, &rbderrors, _state); for(i=0; i<=m-1; i++) { for(j=0; j<=n-1; j++) { ra.ptr.pp_double[i][j] = 2*ae_randomreal(_state)-1; ca.ptr.pp_complex[i][j].x = 2*ae_randomreal(_state)-1; ca.ptr.pp_complex[i][j].y = 2*ae_randomreal(_state)-1; } } testortfacunit_testrqrproblem(&ra, m, n, threshold, &rqrerrors, _state); testortfacunit_testrlqproblem(&ra, m, n, threshold, &rlqerrors, _state); testortfacunit_testcqrproblem(&ca, m, n, threshold, &cqrerrors, _state); testortfacunit_testclqproblem(&ca, m, n, threshold, &clqerrors, _state); testortfacunit_testrbdproblem(&ra, m, n, threshold, &rbderrors, _state); testortfacunit_rmatrixfillsparsea(&ra, m, n, 0.95, _state); testortfacunit_cmatrixfillsparsea(&ca, m, n, 0.95, _state); testortfacunit_testrqrproblem(&ra, m, n, threshold, &rqrerrors, _state); testortfacunit_testrlqproblem(&ra, m, n, threshold, &rlqerrors, _state); testortfacunit_testcqrproblem(&ca, m, n, threshold, &cqrerrors, _state); testortfacunit_testclqproblem(&ca, m, n, threshold, &clqerrors, _state); testortfacunit_testrbdproblem(&ra, m, n, threshold, &rbderrors, _state); /* * Square factorizations: Hessenberg, tridiagonal * Matrix types: zero, dense, sparse */ ae_matrix_set_length(&ra, mx, mx, _state); ae_matrix_set_length(&ca, mx, mx, _state); for(i=0; i<=mx-1; i++) { for(j=0; j<=mx-1; j++) { ra.ptr.pp_double[i][j] = (double)(0); ca.ptr.pp_complex[i][j] = ae_complex_from_i(0); } } testortfacunit_testrhessproblem(&ra, mx, threshold, &rhesserrors, _state); for(i=0; i<=mx-1; i++) { for(j=0; j<=mx-1; j++) { ra.ptr.pp_double[i][j] = 2*ae_randomreal(_state)-1; ca.ptr.pp_complex[i][j].x = 2*ae_randomreal(_state)-1; ca.ptr.pp_complex[i][j].y = 2*ae_randomreal(_state)-1; } } testortfacunit_testrhessproblem(&ra, mx, threshold, &rhesserrors, _state); testortfacunit_rmatrixfillsparsea(&ra, mx, mx, 0.95, _state); testortfacunit_cmatrixfillsparsea(&ca, mx, mx, 0.95, _state); testortfacunit_testrhessproblem(&ra, mx, threshold, &rhesserrors, _state); /* * Symetric factorizations: tridiagonal * Matrix types: zero, dense, sparse */ ae_matrix_set_length(&ra, mx, mx, _state); ae_matrix_set_length(&ca, mx, mx, _state); for(i=0; i<=mx-1; i++) { for(j=0; j<=mx-1; j++) { ra.ptr.pp_double[i][j] = (double)(0); ca.ptr.pp_complex[i][j] = ae_complex_from_i(0); } } testortfacunit_testrtdproblem(&ra, mx, threshold, &rtderrors, _state); testortfacunit_testctdproblem(&ca, mx, threshold, &ctderrors, _state); for(i=0; i<=mx-1; i++) { for(j=i; j<=mx-1; j++) { ra.ptr.pp_double[i][j] = 2*ae_randomreal(_state)-1; ca.ptr.pp_complex[i][j].x = 2*ae_randomreal(_state)-1; ca.ptr.pp_complex[i][j].y = 2*ae_randomreal(_state)-1; ra.ptr.pp_double[j][i] = ra.ptr.pp_double[i][j]; ca.ptr.pp_complex[j][i] = ae_c_conj(ca.ptr.pp_complex[i][j], _state); } } for(i=0; i<=mx-1; i++) { 
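/* use a purely real diagonal so that CA stays Hermitian for the complex tridiagonal test below */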
ca.ptr.pp_complex[i][i] = ae_complex_from_d(2*ae_randomreal(_state)-1); } testortfacunit_testrtdproblem(&ra, mx, threshold, &rtderrors, _state); testortfacunit_testctdproblem(&ca, mx, threshold, &ctderrors, _state); testortfacunit_rmatrixfillsparsea(&ra, mx, mx, 0.95, _state); testortfacunit_cmatrixfillsparsea(&ca, mx, mx, 0.95, _state); for(i=0; i<=mx-1; i++) { for(j=i; j<=mx-1; j++) { ra.ptr.pp_double[j][i] = ra.ptr.pp_double[i][j]; ca.ptr.pp_complex[j][i] = ae_c_conj(ca.ptr.pp_complex[i][j], _state); } } for(i=0; i<=mx-1; i++) { ca.ptr.pp_complex[i][i] = ae_complex_from_d(2*ae_randomreal(_state)-1); } testortfacunit_testrtdproblem(&ra, mx, threshold, &rtderrors, _state); testortfacunit_testctdproblem(&ca, mx, threshold, &ctderrors, _state); } } /* * report */ waserrors = ((((((rqrerrors||rlqerrors)||cqrerrors)||clqerrors)||rbderrors)||rhesserrors)||rtderrors)||ctderrors; if( !silent ) { printf("TESTING ORTFAC UNIT\n"); printf("RQR ERRORS: "); if( !rqrerrors ) { printf("OK\n"); } else { printf("FAILED\n"); } printf("RLQ ERRORS: "); if( !rlqerrors ) { printf("OK\n"); } else { printf("FAILED\n"); } printf("CQR ERRORS: "); if( !cqrerrors ) { printf("OK\n"); } else { printf("FAILED\n"); } printf("CLQ ERRORS: "); if( !clqerrors ) { printf("OK\n"); } else { printf("FAILED\n"); } printf("RBD ERRORS: "); if( !rbderrors ) { printf("OK\n"); } else { printf("FAILED\n"); } printf("RHESS ERRORS: "); if( !rhesserrors ) { printf("OK\n"); } else { printf("FAILED\n"); } printf("RTD ERRORS: "); if( !rtderrors ) { printf("OK\n"); } else { printf("FAILED\n"); } printf("CTD ERRORS: "); if( !ctderrors ) { printf("OK\n"); } else { printf("FAILED\n"); } if( waserrors ) { printf("TEST FAILED\n"); } else { printf("TEST PASSED\n"); } printf("\n\n"); } result = !waserrors; ae_frame_leave(_state); return result; } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. 
*************************************************************************/ ae_bool _pexec_testortfac(ae_bool silent, ae_state *_state) { return testortfac(silent, _state); } /************************************************************************* Diff *************************************************************************/ static double testortfacunit_rmatrixdiff(/* Real */ ae_matrix* a, /* Real */ ae_matrix* b, ae_int_t m, ae_int_t n, ae_state *_state) { ae_int_t i; ae_int_t j; double result; result = (double)(0); for(i=0; i<=m-1; i++) { for(j=0; j<=n-1; j++) { result = ae_maxreal(result, ae_fabs(b->ptr.pp_double[i][j]-a->ptr.pp_double[i][j], _state), _state); } } return result; } /************************************************************************* Copy *************************************************************************/ static void testortfacunit_rmatrixmakeacopy(/* Real */ ae_matrix* a, ae_int_t m, ae_int_t n, /* Real */ ae_matrix* b, ae_state *_state) { ae_int_t i; ae_int_t j; ae_matrix_clear(b); ae_matrix_set_length(b, m-1+1, n-1+1, _state); for(i=0; i<=m-1; i++) { for(j=0; j<=n-1; j++) { b->ptr.pp_double[i][j] = a->ptr.pp_double[i][j]; } } } /************************************************************************* Copy *************************************************************************/ static void testortfacunit_cmatrixmakeacopy(/* Complex */ ae_matrix* a, ae_int_t m, ae_int_t n, /* Complex */ ae_matrix* b, ae_state *_state) { ae_int_t i; ae_int_t j; ae_matrix_clear(b); ae_matrix_set_length(b, m-1+1, n-1+1, _state); for(i=0; i<=m-1; i++) { for(j=0; j<=n-1; j++) { b->ptr.pp_complex[i][j] = a->ptr.pp_complex[i][j]; } } } /************************************************************************* Sparse fill *************************************************************************/ static void testortfacunit_rmatrixfillsparsea(/* Real */ ae_matrix* a, ae_int_t m, ae_int_t n, double sparcity, ae_state *_state) { ae_int_t i; ae_int_t j; for(i=0; i<=m-1; i++) { for(j=0; j<=n-1; j++) { if( ae_fp_greater_eq(ae_randomreal(_state),sparcity) ) { a->ptr.pp_double[i][j] = 2*ae_randomreal(_state)-1; } else { a->ptr.pp_double[i][j] = (double)(0); } } } } /************************************************************************* Sparse fill *************************************************************************/ static void testortfacunit_cmatrixfillsparsea(/* Complex */ ae_matrix* a, ae_int_t m, ae_int_t n, double sparcity, ae_state *_state) { ae_int_t i; ae_int_t j; for(i=0; i<=m-1; i++) { for(j=0; j<=n-1; j++) { if( ae_fp_greater_eq(ae_randomreal(_state),sparcity) ) { a->ptr.pp_complex[i][j].x = 2*ae_randomreal(_state)-1; a->ptr.pp_complex[i][j].y = 2*ae_randomreal(_state)-1; } else { a->ptr.pp_complex[i][j] = ae_complex_from_i(0); } } } } /************************************************************************* Matrix multiplication *************************************************************************/ static void testortfacunit_internalmatrixmatrixmultiply(/* Real */ ae_matrix* a, ae_int_t ai1, ae_int_t ai2, ae_int_t aj1, ae_int_t aj2, ae_bool transa, /* Real */ ae_matrix* b, ae_int_t bi1, ae_int_t bi2, ae_int_t bj1, ae_int_t bj2, ae_bool transb, /* Real */ ae_matrix* c, ae_int_t ci1, ae_int_t ci2, ae_int_t cj1, ae_int_t cj2, ae_state *_state) { ae_frame _frame_block; ae_int_t arows; ae_int_t acols; ae_int_t brows; ae_int_t bcols; ae_int_t crows; ae_int_t i; ae_int_t j; ae_int_t k; ae_int_t l; ae_int_t r; double v; ae_vector work; double beta; double alpha; 
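/*
 * Reference (unblocked) multiply: the code below computes
 * C := Alpha*op(A)*op(B) + Beta*C over the given index ranges, with Alpha=1 and
 * Beta=0 fixed in the pre-setup, dispatching on TransA/TransB and using row
 * dot products with ae_v_addd/ae_v_move updates instead of a blocked kernel.
 */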
ae_frame_make(_state, &_frame_block); ae_vector_init(&work, 0, DT_REAL, _state); /* * Pre-setup */ k = ae_maxint(ai2-ai1+1, aj2-aj1+1, _state); k = ae_maxint(k, bi2-bi1+1, _state); k = ae_maxint(k, bj2-bj1+1, _state); ae_vector_set_length(&work, k+1, _state); beta = (double)(0); alpha = (double)(1);
/* * Setup */ if( !transa ) { arows = ai2-ai1+1; acols = aj2-aj1+1; } else { arows = aj2-aj1+1; acols = ai2-ai1+1; } if( !transb ) { brows = bi2-bi1+1; bcols = bj2-bj1+1; } else { brows = bj2-bj1+1; bcols = bi2-bi1+1; } ae_assert(acols==brows, "MatrixMatrixMultiply: incorrect matrix sizes!", _state); if( ((arows<=0||acols<=0)||brows<=0)||bcols<=0 ) { ae_frame_leave(_state); return; } crows = arows;
/* * Test WORK */ i = ae_maxint(arows, acols, _state); i = ae_maxint(brows, i, _state); i = ae_maxint(i, bcols, _state); work.ptr.p_double[1] = (double)(0); work.ptr.p_double[i] = (double)(0);
/* * Prepare C */ if( ae_fp_eq(beta,(double)(0)) ) { for(i=ci1; i<=ci2; i++) { for(j=cj1; j<=cj2; j++) { c->ptr.pp_double[i][j] = (double)(0); } } } else { for(i=ci1; i<=ci2; i++) { ae_v_muld(&c->ptr.pp_double[i][cj1], 1, ae_v_len(cj1,cj2), beta); } }
/* * A*B */ if( !transa&&!transb ) { for(l=ai1; l<=ai2; l++) { for(r=bi1; r<=bi2; r++) { v = alpha*a->ptr.pp_double[l][aj1+r-bi1]; k = ci1+l-ai1; ae_v_addd(&c->ptr.pp_double[k][cj1], 1, &b->ptr.pp_double[r][bj1], 1, ae_v_len(cj1,cj2), v); } } ae_frame_leave(_state); return; }
/* * A*B' */ if( !transa&&transb ) { if( arows*acols<brows*bcols ) { for(r=bi1; r<=bi2; r++) { for(l=ai1; l<=ai2; l++) { v = ae_v_dotproduct(&a->ptr.pp_double[l][aj1], 1, &b->ptr.pp_double[r][bj1], 1, ae_v_len(aj1,aj2)); c->ptr.pp_double[ci1+l-ai1][cj1+r-bi1] = c->ptr.pp_double[ci1+l-ai1][cj1+r-bi1]+alpha*v; } } ae_frame_leave(_state); return; } else { for(l=ai1; l<=ai2; l++) { for(r=bi1; r<=bi2; r++) { v = ae_v_dotproduct(&a->ptr.pp_double[l][aj1], 1, &b->ptr.pp_double[r][bj1], 1, ae_v_len(aj1,aj2)); c->ptr.pp_double[ci1+l-ai1][cj1+r-bi1] = c->ptr.pp_double[ci1+l-ai1][cj1+r-bi1]+alpha*v; } } ae_frame_leave(_state); return; } }
/* * A'*B */ if( transa&&!transb ) { for(l=aj1; l<=aj2; l++) { for(r=bi1; r<=bi2; r++) { v = alpha*a->ptr.pp_double[ai1+r-bi1][l]; k = ci1+l-aj1; ae_v_addd(&c->ptr.pp_double[k][cj1], 1, &b->ptr.pp_double[r][bj1], 1, ae_v_len(cj1,cj2), v); } } ae_frame_leave(_state); return; }
/* * A'*B' */ if( transa&&transb ) { if( arows*acols<brows*bcols ) { for(r=bi1; r<=bi2; r++) { for(i=1; i<=crows; i++) { work.ptr.p_double[i] = 0.0; } for(l=ai1; l<=ai2; l++) { v = alpha*b->ptr.pp_double[r][bj1+l-ai1]; k = cj1+r-bi1; ae_v_addd(&work.ptr.p_double[1], 1, &a->ptr.pp_double[l][aj1], 1, ae_v_len(1,crows), v); } ae_v_add(&c->ptr.pp_double[ci1][k], c->stride, &work.ptr.p_double[1], 1, ae_v_len(ci1,ci2)); } ae_frame_leave(_state); return; } else { for(l=aj1; l<=aj2; l++) { k = ai2-ai1+1; ae_v_move(&work.ptr.p_double[1], 1, &a->ptr.pp_double[ai1][l], a->stride, ae_v_len(1,k)); for(r=bi1; r<=bi2; r++) { v = ae_v_dotproduct(&work.ptr.p_double[1], 1, &b->ptr.pp_double[r][bj1], 1, ae_v_len(1,k)); c->ptr.pp_double[ci1+l-aj1][cj1+r-bi1] = c->ptr.pp_double[ci1+l-aj1][cj1+r-bi1]+alpha*v; } } ae_frame_leave(_state); return; } } ae_frame_leave(_state); }
/************************************************************************* Problem testing *************************************************************************/ static void testortfacunit_testrqrproblem(/* Real */ ae_matrix* a, ae_int_t m, ae_int_t n, double threshold, ae_bool* qrerrors, ae_state *_state) { ae_frame _frame_block; ae_int_t i; ae_int_t j; ae_int_t k; ae_matrix b; ae_vector taub; ae_matrix q; ae_matrix r; ae_matrix q2; double v; ae_frame_make(_state, &_frame_block); ae_matrix_init(&b, 0, 0, DT_REAL, _state); ae_vector_init(&taub, 0, DT_REAL, _state);
ae_matrix_init(&q, 0, 0, DT_REAL, _state); ae_matrix_init(&r, 0, 0, DT_REAL, _state); ae_matrix_init(&q2, 0, 0, DT_REAL, _state); /* * Test decompose-and-unpack error */ testortfacunit_rmatrixmakeacopy(a, m, n, &b, _state); rmatrixqr(&b, m, n, &taub, _state); rmatrixqrunpackq(&b, m, n, &taub, m, &q, _state); rmatrixqrunpackr(&b, m, n, &r, _state); for(i=0; i<=m-1; i++) { for(j=0; j<=n-1; j++) { v = ae_v_dotproduct(&q.ptr.pp_double[i][0], 1, &r.ptr.pp_double[0][j], r.stride, ae_v_len(0,m-1)); *qrerrors = *qrerrors||ae_fp_greater(ae_fabs(v-a->ptr.pp_double[i][j], _state),threshold); } } for(i=0; i<=m-1; i++) { for(j=0; j<=ae_minint(i, n-1, _state)-1; j++) { *qrerrors = *qrerrors||ae_fp_neq(r.ptr.pp_double[i][j],(double)(0)); } } for(i=0; i<=m-1; i++) { for(j=0; j<=m-1; j++) { v = ae_v_dotproduct(&q.ptr.pp_double[i][0], 1, &q.ptr.pp_double[j][0], 1, ae_v_len(0,m-1)); if( i==j ) { v = v-1; } *qrerrors = *qrerrors||ae_fp_greater_eq(ae_fabs(v, _state),threshold); } } /* * Test for other errors */ k = 1+ae_randominteger(m, _state); rmatrixqrunpackq(&b, m, n, &taub, k, &q2, _state); for(i=0; i<=m-1; i++) { for(j=0; j<=k-1; j++) { *qrerrors = *qrerrors||ae_fp_greater(ae_fabs(q2.ptr.pp_double[i][j]-q.ptr.pp_double[i][j], _state),10*ae_machineepsilon); } } ae_frame_leave(_state); } /************************************************************************* Problem testing *************************************************************************/ static void testortfacunit_testcqrproblem(/* Complex */ ae_matrix* a, ae_int_t m, ae_int_t n, double threshold, ae_bool* qrerrors, ae_state *_state) { ae_frame _frame_block; ae_int_t i; ae_int_t j; ae_int_t k; ae_matrix b; ae_vector taub; ae_matrix q; ae_matrix r; ae_matrix q2; ae_complex v; ae_frame_make(_state, &_frame_block); ae_matrix_init(&b, 0, 0, DT_COMPLEX, _state); ae_vector_init(&taub, 0, DT_COMPLEX, _state); ae_matrix_init(&q, 0, 0, DT_COMPLEX, _state); ae_matrix_init(&r, 0, 0, DT_COMPLEX, _state); ae_matrix_init(&q2, 0, 0, DT_COMPLEX, _state); /* * Test decompose-and-unpack error */ testortfacunit_cmatrixmakeacopy(a, m, n, &b, _state); cmatrixqr(&b, m, n, &taub, _state); cmatrixqrunpackq(&b, m, n, &taub, m, &q, _state); cmatrixqrunpackr(&b, m, n, &r, _state); for(i=0; i<=m-1; i++) { for(j=0; j<=n-1; j++) { v = ae_v_cdotproduct(&q.ptr.pp_complex[i][0], 1, "N", &r.ptr.pp_complex[0][j], r.stride, "N", ae_v_len(0,m-1)); *qrerrors = *qrerrors||ae_fp_greater(ae_c_abs(ae_c_sub(v,a->ptr.pp_complex[i][j]), _state),threshold); } } for(i=0; i<=m-1; i++) { for(j=0; j<=ae_minint(i, n-1, _state)-1; j++) { *qrerrors = *qrerrors||ae_c_neq_d(r.ptr.pp_complex[i][j],(double)(0)); } } for(i=0; i<=m-1; i++) { for(j=0; j<=m-1; j++) { v = ae_v_cdotproduct(&q.ptr.pp_complex[i][0], 1, "N", &q.ptr.pp_complex[j][0], 1, "Conj", ae_v_len(0,m-1)); if( i==j ) { v = ae_c_sub_d(v,1); } *qrerrors = *qrerrors||ae_fp_greater_eq(ae_c_abs(v, _state),threshold); } } /* * Test for other errors */ k = 1+ae_randominteger(m, _state); cmatrixqrunpackq(&b, m, n, &taub, k, &q2, _state); for(i=0; i<=m-1; i++) { for(j=0; j<=k-1; j++) { *qrerrors = *qrerrors||ae_fp_greater(ae_c_abs(ae_c_sub(q2.ptr.pp_complex[i][j],q.ptr.pp_complex[i][j]), _state),10*ae_machineepsilon); } } ae_frame_leave(_state); } /************************************************************************* Problem testing *************************************************************************/ static void testortfacunit_testrlqproblem(/* Real */ ae_matrix* a, ae_int_t m, ae_int_t n, double threshold, ae_bool* lqerrors, ae_state 
*_state) { ae_frame _frame_block; ae_int_t i; ae_int_t j; ae_int_t k; ae_matrix b; ae_vector taub; ae_matrix q; ae_matrix l; ae_matrix q2; double v; ae_frame_make(_state, &_frame_block); ae_matrix_init(&b, 0, 0, DT_REAL, _state); ae_vector_init(&taub, 0, DT_REAL, _state); ae_matrix_init(&q, 0, 0, DT_REAL, _state); ae_matrix_init(&l, 0, 0, DT_REAL, _state); ae_matrix_init(&q2, 0, 0, DT_REAL, _state); /* * Test decompose-and-unpack error */ testortfacunit_rmatrixmakeacopy(a, m, n, &b, _state); rmatrixlq(&b, m, n, &taub, _state); rmatrixlqunpackq(&b, m, n, &taub, n, &q, _state); rmatrixlqunpackl(&b, m, n, &l, _state); for(i=0; i<=m-1; i++) { for(j=0; j<=n-1; j++) { v = ae_v_dotproduct(&l.ptr.pp_double[i][0], 1, &q.ptr.pp_double[0][j], q.stride, ae_v_len(0,n-1)); *lqerrors = *lqerrors||ae_fp_greater_eq(ae_fabs(v-a->ptr.pp_double[i][j], _state),threshold); } } for(i=0; i<=m-1; i++) { for(j=ae_minint(i, n-1, _state)+1; j<=n-1; j++) { *lqerrors = *lqerrors||ae_fp_neq(l.ptr.pp_double[i][j],(double)(0)); } } for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { v = ae_v_dotproduct(&q.ptr.pp_double[i][0], 1, &q.ptr.pp_double[j][0], 1, ae_v_len(0,n-1)); if( i==j ) { v = v-1; } *lqerrors = *lqerrors||ae_fp_greater_eq(ae_fabs(v, _state),threshold); } } /* * Test for other errors */ k = 1+ae_randominteger(n, _state); rmatrixlqunpackq(&b, m, n, &taub, k, &q2, _state); for(i=0; i<=k-1; i++) { for(j=0; j<=n-1; j++) { *lqerrors = *lqerrors||ae_fp_greater(ae_fabs(q2.ptr.pp_double[i][j]-q.ptr.pp_double[i][j], _state),10*ae_machineepsilon); } } ae_frame_leave(_state); } /************************************************************************* Problem testing *************************************************************************/ static void testortfacunit_testclqproblem(/* Complex */ ae_matrix* a, ae_int_t m, ae_int_t n, double threshold, ae_bool* lqerrors, ae_state *_state) { ae_frame _frame_block; ae_int_t i; ae_int_t j; ae_int_t k; ae_matrix b; ae_vector taub; ae_matrix q; ae_matrix l; ae_matrix q2; ae_complex v; ae_frame_make(_state, &_frame_block); ae_matrix_init(&b, 0, 0, DT_COMPLEX, _state); ae_vector_init(&taub, 0, DT_COMPLEX, _state); ae_matrix_init(&q, 0, 0, DT_COMPLEX, _state); ae_matrix_init(&l, 0, 0, DT_COMPLEX, _state); ae_matrix_init(&q2, 0, 0, DT_COMPLEX, _state); /* * Test decompose-and-unpack error */ testortfacunit_cmatrixmakeacopy(a, m, n, &b, _state); cmatrixlq(&b, m, n, &taub, _state); cmatrixlqunpackq(&b, m, n, &taub, n, &q, _state); cmatrixlqunpackl(&b, m, n, &l, _state); for(i=0; i<=m-1; i++) { for(j=0; j<=n-1; j++) { v = ae_v_cdotproduct(&l.ptr.pp_complex[i][0], 1, "N", &q.ptr.pp_complex[0][j], q.stride, "N", ae_v_len(0,n-1)); *lqerrors = *lqerrors||ae_fp_greater_eq(ae_c_abs(ae_c_sub(v,a->ptr.pp_complex[i][j]), _state),threshold); } } for(i=0; i<=m-1; i++) { for(j=ae_minint(i, n-1, _state)+1; j<=n-1; j++) { *lqerrors = *lqerrors||ae_c_neq_d(l.ptr.pp_complex[i][j],(double)(0)); } } for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { v = ae_v_cdotproduct(&q.ptr.pp_complex[i][0], 1, "N", &q.ptr.pp_complex[j][0], 1, "Conj", ae_v_len(0,n-1)); if( i==j ) { v = ae_c_sub_d(v,1); } *lqerrors = *lqerrors||ae_fp_greater_eq(ae_c_abs(v, _state),threshold); } } /* * Test for other errors */ k = 1+ae_randominteger(n, _state); cmatrixlqunpackq(&b, m, n, &taub, k, &q2, _state); for(i=0; i<=k-1; i++) { for(j=0; j<=n-1; j++) { *lqerrors = *lqerrors||ae_fp_greater(ae_c_abs(ae_c_sub(q2.ptr.pp_complex[i][j],q.ptr.pp_complex[i][j]), _state),10*ae_machineepsilon); } } ae_frame_leave(_state); } 
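/*************************************************************************
Usage sketch (not part of the test suite): the tests above exercise the
usual decompose-then-unpack pattern, e.g. for the real QR case

    rmatrixqr(&a, m, n, &tau, _state);
    rmatrixqrunpackq(&a, m, n, &tau, m, &q, _state);
    rmatrixqrunpackr(&a, m, n, &r, _state);

after which Q*R must reproduce the original A to within roundoff, R must
be zero below its diagonal, and Q must be orthogonal; these are exactly
the properties checked by testortfacunit_testrqrproblem above, and the
LQ/complex variants apply the same criteria to their factors.
*************************************************************************/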
/************************************************************************* Problem testing *************************************************************************/ static void testortfacunit_testrbdproblem(/* Real */ ae_matrix* a, ae_int_t m, ae_int_t n, double threshold, ae_bool* bderrors, ae_state *_state) { ae_frame _frame_block; ae_int_t i; ae_int_t j; ae_int_t k; ae_matrix t; ae_matrix pt; ae_matrix q; ae_matrix r; ae_matrix bd; ae_matrix x; ae_matrix r1; ae_matrix r2; ae_vector taup; ae_vector tauq; ae_vector d; ae_vector e; ae_bool up; double v; ae_int_t mtsize; ae_frame_make(_state, &_frame_block); ae_matrix_init(&t, 0, 0, DT_REAL, _state); ae_matrix_init(&pt, 0, 0, DT_REAL, _state); ae_matrix_init(&q, 0, 0, DT_REAL, _state); ae_matrix_init(&r, 0, 0, DT_REAL, _state); ae_matrix_init(&bd, 0, 0, DT_REAL, _state); ae_matrix_init(&x, 0, 0, DT_REAL, _state); ae_matrix_init(&r1, 0, 0, DT_REAL, _state); ae_matrix_init(&r2, 0, 0, DT_REAL, _state); ae_vector_init(&taup, 0, DT_REAL, _state); ae_vector_init(&tauq, 0, DT_REAL, _state); ae_vector_init(&d, 0, DT_REAL, _state); ae_vector_init(&e, 0, DT_REAL, _state); /* * Bidiagonal decomposition error */ testortfacunit_rmatrixmakeacopy(a, m, n, &t, _state); rmatrixbd(&t, m, n, &tauq, &taup, _state); rmatrixbdunpackq(&t, m, n, &tauq, m, &q, _state); rmatrixbdunpackpt(&t, m, n, &taup, n, &pt, _state); rmatrixbdunpackdiagonals(&t, m, n, &up, &d, &e, _state); ae_matrix_set_length(&bd, m, n, _state); for(i=0; i<=m-1; i++) { for(j=0; j<=n-1; j++) { bd.ptr.pp_double[i][j] = (double)(0); } } for(i=0; i<=ae_minint(m, n, _state)-1; i++) { bd.ptr.pp_double[i][i] = d.ptr.p_double[i]; } if( up ) { for(i=0; i<=ae_minint(m, n, _state)-2; i++) { bd.ptr.pp_double[i][i+1] = e.ptr.p_double[i]; } } else { for(i=0; i<=ae_minint(m, n, _state)-2; i++) { bd.ptr.pp_double[i+1][i] = e.ptr.p_double[i]; } } ae_matrix_set_length(&r, m, n, _state); for(i=0; i<=m-1; i++) { for(j=0; j<=n-1; j++) { v = ae_v_dotproduct(&q.ptr.pp_double[i][0], 1, &bd.ptr.pp_double[0][j], bd.stride, ae_v_len(0,m-1)); r.ptr.pp_double[i][j] = v; } } for(i=0; i<=m-1; i++) { for(j=0; j<=n-1; j++) { v = ae_v_dotproduct(&r.ptr.pp_double[i][0], 1, &pt.ptr.pp_double[0][j], pt.stride, ae_v_len(0,n-1)); *bderrors = *bderrors||ae_fp_greater(ae_fabs(v-a->ptr.pp_double[i][j], _state),threshold); } } /* * Orthogonality test for Q/PT */ for(i=0; i<=m-1; i++) { for(j=0; j<=m-1; j++) { v = ae_v_dotproduct(&q.ptr.pp_double[0][i], q.stride, &q.ptr.pp_double[0][j], q.stride, ae_v_len(0,m-1)); if( i==j ) { *bderrors = *bderrors||ae_fp_greater(ae_fabs(v-1, _state),threshold); } else { *bderrors = *bderrors||ae_fp_greater(ae_fabs(v, _state),threshold); } } } for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { v = ae_v_dotproduct(&pt.ptr.pp_double[i][0], 1, &pt.ptr.pp_double[j][0], 1, ae_v_len(0,n-1)); if( i==j ) { *bderrors = *bderrors||ae_fp_greater(ae_fabs(v-1, _state),threshold); } else { *bderrors = *bderrors||ae_fp_greater(ae_fabs(v, _state),threshold); } } } /* * Partial unpacking test */ k = 1+ae_randominteger(m, _state); rmatrixbdunpackq(&t, m, n, &tauq, k, &r, _state); for(i=0; i<=m-1; i++) { for(j=0; j<=k-1; j++) { *bderrors = *bderrors||ae_fp_greater(ae_fabs(r.ptr.pp_double[i][j]-q.ptr.pp_double[i][j], _state),10*ae_machineepsilon); } } k = 1+ae_randominteger(n, _state); rmatrixbdunpackpt(&t, m, n, &taup, k, &r, _state); for(i=0; i<=k-1; i++) { for(j=0; j<=n-1; j++) { *bderrors = *bderrors||ae_fp_neq(r.ptr.pp_double[i][j]-pt.ptr.pp_double[i][j],(double)(0)); } } /* * Multiplication test */ 
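/* * The test below compares rmatrixbdmultiplybyq/rmatrixbdmultiplybyp against explicit multiplication by the unpacked Q and P': a random matrix X is multiplied from the left and from the right, with and without transposition, and each pair of results must agree to within Threshold. */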
ae_matrix_set_length(&x, ae_maxint(m, n, _state)-1+1, ae_maxint(m, n, _state)-1+1, _state); ae_matrix_set_length(&r, ae_maxint(m, n, _state)-1+1, ae_maxint(m, n, _state)-1+1, _state); ae_matrix_set_length(&r1, ae_maxint(m, n, _state)-1+1, ae_maxint(m, n, _state)-1+1, _state); ae_matrix_set_length(&r2, ae_maxint(m, n, _state)-1+1, ae_maxint(m, n, _state)-1+1, _state); for(i=0; i<=ae_maxint(m, n, _state)-1; i++) { for(j=0; j<=ae_maxint(m, n, _state)-1; j++) { x.ptr.pp_double[i][j] = 2*ae_randomreal(_state)-1; } } mtsize = 1+ae_randominteger(ae_maxint(m, n, _state), _state); testortfacunit_rmatrixmakeacopy(&x, mtsize, m, &r, _state); testortfacunit_internalmatrixmatrixmultiply(&r, 0, mtsize-1, 0, m-1, ae_false, &q, 0, m-1, 0, m-1, ae_false, &r1, 0, mtsize-1, 0, m-1, _state); testortfacunit_rmatrixmakeacopy(&x, mtsize, m, &r2, _state); rmatrixbdmultiplybyq(&t, m, n, &tauq, &r2, mtsize, m, ae_true, ae_false, _state); *bderrors = *bderrors||ae_fp_greater(testortfacunit_rmatrixdiff(&r1, &r2, mtsize, m, _state),threshold); testortfacunit_rmatrixmakeacopy(&x, mtsize, m, &r, _state); testortfacunit_internalmatrixmatrixmultiply(&r, 0, mtsize-1, 0, m-1, ae_false, &q, 0, m-1, 0, m-1, ae_true, &r1, 0, mtsize-1, 0, m-1, _state); testortfacunit_rmatrixmakeacopy(&x, mtsize, m, &r2, _state); rmatrixbdmultiplybyq(&t, m, n, &tauq, &r2, mtsize, m, ae_true, ae_true, _state); *bderrors = *bderrors||ae_fp_greater(testortfacunit_rmatrixdiff(&r1, &r2, mtsize, m, _state),threshold); testortfacunit_rmatrixmakeacopy(&x, m, mtsize, &r, _state); testortfacunit_internalmatrixmatrixmultiply(&q, 0, m-1, 0, m-1, ae_false, &r, 0, m-1, 0, mtsize-1, ae_false, &r1, 0, m-1, 0, mtsize-1, _state); testortfacunit_rmatrixmakeacopy(&x, m, mtsize, &r2, _state); rmatrixbdmultiplybyq(&t, m, n, &tauq, &r2, m, mtsize, ae_false, ae_false, _state); *bderrors = *bderrors||ae_fp_greater(testortfacunit_rmatrixdiff(&r1, &r2, m, mtsize, _state),threshold); testortfacunit_rmatrixmakeacopy(&x, m, mtsize, &r, _state); testortfacunit_internalmatrixmatrixmultiply(&q, 0, m-1, 0, m-1, ae_true, &r, 0, m-1, 0, mtsize-1, ae_false, &r1, 0, m-1, 0, mtsize-1, _state); testortfacunit_rmatrixmakeacopy(&x, m, mtsize, &r2, _state); rmatrixbdmultiplybyq(&t, m, n, &tauq, &r2, m, mtsize, ae_false, ae_true, _state); *bderrors = *bderrors||ae_fp_greater(testortfacunit_rmatrixdiff(&r1, &r2, m, mtsize, _state),threshold); testortfacunit_rmatrixmakeacopy(&x, mtsize, n, &r, _state); testortfacunit_internalmatrixmatrixmultiply(&r, 0, mtsize-1, 0, n-1, ae_false, &pt, 0, n-1, 0, n-1, ae_true, &r1, 0, mtsize-1, 0, n-1, _state); testortfacunit_rmatrixmakeacopy(&x, mtsize, n, &r2, _state); rmatrixbdmultiplybyp(&t, m, n, &taup, &r2, mtsize, n, ae_true, ae_false, _state); *bderrors = *bderrors||ae_fp_greater(testortfacunit_rmatrixdiff(&r1, &r2, mtsize, n, _state),threshold); testortfacunit_rmatrixmakeacopy(&x, mtsize, n, &r, _state); testortfacunit_internalmatrixmatrixmultiply(&r, 0, mtsize-1, 0, n-1, ae_false, &pt, 0, n-1, 0, n-1, ae_false, &r1, 0, mtsize-1, 0, n-1, _state); testortfacunit_rmatrixmakeacopy(&x, mtsize, n, &r2, _state); rmatrixbdmultiplybyp(&t, m, n, &taup, &r2, mtsize, n, ae_true, ae_true, _state); *bderrors = *bderrors||ae_fp_greater(testortfacunit_rmatrixdiff(&r1, &r2, mtsize, n, _state),threshold); testortfacunit_rmatrixmakeacopy(&x, n, mtsize, &r, _state); testortfacunit_internalmatrixmatrixmultiply(&pt, 0, n-1, 0, n-1, ae_true, &r, 0, n-1, 0, mtsize-1, ae_false, &r1, 0, n-1, 0, mtsize-1, _state); testortfacunit_rmatrixmakeacopy(&x, n, mtsize, &r2, _state); 
rmatrixbdmultiplybyp(&t, m, n, &taup, &r2, n, mtsize, ae_false, ae_false, _state); *bderrors = *bderrors||ae_fp_greater(testortfacunit_rmatrixdiff(&r1, &r2, n, mtsize, _state),threshold); testortfacunit_rmatrixmakeacopy(&x, n, mtsize, &r, _state); testortfacunit_internalmatrixmatrixmultiply(&pt, 0, n-1, 0, n-1, ae_false, &r, 0, n-1, 0, mtsize-1, ae_false, &r1, 0, n-1, 0, mtsize-1, _state); testortfacunit_rmatrixmakeacopy(&x, n, mtsize, &r2, _state); rmatrixbdmultiplybyp(&t, m, n, &taup, &r2, n, mtsize, ae_false, ae_true, _state); *bderrors = *bderrors||ae_fp_greater(testortfacunit_rmatrixdiff(&r1, &r2, n, mtsize, _state),threshold); ae_frame_leave(_state); } /************************************************************************* Problem testing *************************************************************************/ static void testortfacunit_testrhessproblem(/* Real */ ae_matrix* a, ae_int_t n, double threshold, ae_bool* hesserrors, ae_state *_state) { ae_frame _frame_block; ae_matrix b; ae_matrix h; ae_matrix q; ae_matrix t1; ae_matrix t2; ae_vector tau; ae_int_t i; ae_int_t j; double v; ae_frame_make(_state, &_frame_block); ae_matrix_init(&b, 0, 0, DT_REAL, _state); ae_matrix_init(&h, 0, 0, DT_REAL, _state); ae_matrix_init(&q, 0, 0, DT_REAL, _state); ae_matrix_init(&t1, 0, 0, DT_REAL, _state); ae_matrix_init(&t2, 0, 0, DT_REAL, _state); ae_vector_init(&tau, 0, DT_REAL, _state); testortfacunit_rmatrixmakeacopy(a, n, n, &b, _state); /* * Decomposition */ rmatrixhessenberg(&b, n, &tau, _state); rmatrixhessenbergunpackq(&b, n, &tau, &q, _state); rmatrixhessenbergunpackh(&b, n, &h, _state); /* * Matrix properties */ for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { v = ae_v_dotproduct(&q.ptr.pp_double[0][i], q.stride, &q.ptr.pp_double[0][j], q.stride, ae_v_len(0,n-1)); if( i==j ) { v = v-1; } *hesserrors = *hesserrors||ae_fp_greater(ae_fabs(v, _state),threshold); } } for(i=0; i<=n-1; i++) { for(j=0; j<=i-2; j++) { *hesserrors = *hesserrors||ae_fp_neq(h.ptr.pp_double[i][j],(double)(0)); } } /* * Decomposition error */ ae_matrix_set_length(&t1, n, n, _state); ae_matrix_set_length(&t2, n, n, _state); testortfacunit_internalmatrixmatrixmultiply(&q, 0, n-1, 0, n-1, ae_false, &h, 0, n-1, 0, n-1, ae_false, &t1, 0, n-1, 0, n-1, _state); testortfacunit_internalmatrixmatrixmultiply(&t1, 0, n-1, 0, n-1, ae_false, &q, 0, n-1, 0, n-1, ae_true, &t2, 0, n-1, 0, n-1, _state); *hesserrors = *hesserrors||ae_fp_greater(testortfacunit_rmatrixdiff(&t2, a, n, n, _state),threshold); ae_frame_leave(_state); } /************************************************************************* Tridiagonal tester *************************************************************************/ static void testortfacunit_testrtdproblem(/* Real */ ae_matrix* a, ae_int_t n, double threshold, ae_bool* tderrors, ae_state *_state) { ae_frame _frame_block; ae_int_t i; ae_int_t j; ae_matrix ua; ae_matrix la; ae_matrix t; ae_matrix q; ae_matrix t2; ae_matrix t3; ae_vector tau; ae_vector d; ae_vector e; double v; ae_frame_make(_state, &_frame_block); ae_matrix_init(&ua, 0, 0, DT_REAL, _state); ae_matrix_init(&la, 0, 0, DT_REAL, _state); ae_matrix_init(&t, 0, 0, DT_REAL, _state); ae_matrix_init(&q, 0, 0, DT_REAL, _state); ae_matrix_init(&t2, 0, 0, DT_REAL, _state); ae_matrix_init(&t3, 0, 0, DT_REAL, _state); ae_vector_init(&tau, 0, DT_REAL, _state); ae_vector_init(&d, 0, DT_REAL, _state); ae_vector_init(&e, 0, DT_REAL, _state); ae_matrix_set_length(&ua, n-1+1, n-1+1, _state); ae_matrix_set_length(&la, n-1+1, n-1+1, _state); 
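/* * UA and LA hold the upper- and lower-triangle copies of the symmetric matrix A; each copy is reduced to tridiagonal form below, after which Q'*A*Q is compared against the tridiagonal matrix T assembled from D and E, and Q is checked for orthogonality. */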
ae_matrix_set_length(&t, n-1+1, n-1+1, _state); ae_matrix_set_length(&q, n-1+1, n-1+1, _state); ae_matrix_set_length(&t2, n-1+1, n-1+1, _state); ae_matrix_set_length(&t3, n-1+1, n-1+1, _state); /* * fill */ for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { ua.ptr.pp_double[i][j] = (double)(0); } } for(i=0; i<=n-1; i++) { for(j=i; j<=n-1; j++) { ua.ptr.pp_double[i][j] = a->ptr.pp_double[i][j]; } } for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { la.ptr.pp_double[i][j] = (double)(0); } } for(i=0; i<=n-1; i++) { for(j=0; j<=i; j++) { la.ptr.pp_double[i][j] = a->ptr.pp_double[i][j]; } } /* * Test 2tridiagonal: upper */ smatrixtd(&ua, n, ae_true, &tau, &d, &e, _state); smatrixtdunpackq(&ua, n, ae_true, &tau, &q, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { t.ptr.pp_double[i][j] = (double)(0); } } for(i=0; i<=n-1; i++) { t.ptr.pp_double[i][i] = d.ptr.p_double[i]; } for(i=0; i<=n-2; i++) { t.ptr.pp_double[i][i+1] = e.ptr.p_double[i]; t.ptr.pp_double[i+1][i] = e.ptr.p_double[i]; } for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { v = ae_v_dotproduct(&q.ptr.pp_double[0][i], q.stride, &a->ptr.pp_double[0][j], a->stride, ae_v_len(0,n-1)); t2.ptr.pp_double[i][j] = v; } } for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { v = ae_v_dotproduct(&t2.ptr.pp_double[i][0], 1, &q.ptr.pp_double[0][j], q.stride, ae_v_len(0,n-1)); t3.ptr.pp_double[i][j] = v; } } for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { *tderrors = *tderrors||ae_fp_greater(ae_fabs(t3.ptr.pp_double[i][j]-t.ptr.pp_double[i][j], _state),threshold); } } for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { v = ae_v_dotproduct(&q.ptr.pp_double[i][0], 1, &q.ptr.pp_double[j][0], 1, ae_v_len(0,n-1)); if( i==j ) { v = v-1; } *tderrors = *tderrors||ae_fp_greater(ae_fabs(v, _state),threshold); } } /* * Test 2tridiagonal: lower */ smatrixtd(&la, n, ae_false, &tau, &d, &e, _state); smatrixtdunpackq(&la, n, ae_false, &tau, &q, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { t.ptr.pp_double[i][j] = (double)(0); } } for(i=0; i<=n-1; i++) { t.ptr.pp_double[i][i] = d.ptr.p_double[i]; } for(i=0; i<=n-2; i++) { t.ptr.pp_double[i][i+1] = e.ptr.p_double[i]; t.ptr.pp_double[i+1][i] = e.ptr.p_double[i]; } for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { v = ae_v_dotproduct(&q.ptr.pp_double[0][i], q.stride, &a->ptr.pp_double[0][j], a->stride, ae_v_len(0,n-1)); t2.ptr.pp_double[i][j] = v; } } for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { v = ae_v_dotproduct(&t2.ptr.pp_double[i][0], 1, &q.ptr.pp_double[0][j], q.stride, ae_v_len(0,n-1)); t3.ptr.pp_double[i][j] = v; } } for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { *tderrors = *tderrors||ae_fp_greater(ae_fabs(t3.ptr.pp_double[i][j]-t.ptr.pp_double[i][j], _state),threshold); } } for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { v = ae_v_dotproduct(&q.ptr.pp_double[i][0], 1, &q.ptr.pp_double[j][0], 1, ae_v_len(0,n-1)); if( i==j ) { v = v-1; } *tderrors = *tderrors||ae_fp_greater(ae_fabs(v, _state),threshold); } } ae_frame_leave(_state); } /************************************************************************* Hermitian problem tester *************************************************************************/ static void testortfacunit_testctdproblem(/* Complex */ ae_matrix* a, ae_int_t n, double threshold, ae_bool* tderrors, ae_state *_state) { ae_frame _frame_block; ae_int_t i; ae_int_t j; ae_matrix ua; ae_matrix la; ae_matrix t; ae_matrix q; ae_matrix t2; ae_matrix t3; ae_vector tau; ae_vector d; ae_vector e; ae_complex v; ae_frame_make(_state, &_frame_block); ae_matrix_init(&ua, 0, 0, DT_COMPLEX, _state); 
ae_matrix_init(&la, 0, 0, DT_COMPLEX, _state); ae_matrix_init(&t, 0, 0, DT_COMPLEX, _state); ae_matrix_init(&q, 0, 0, DT_COMPLEX, _state); ae_matrix_init(&t2, 0, 0, DT_COMPLEX, _state); ae_matrix_init(&t3, 0, 0, DT_COMPLEX, _state); ae_vector_init(&tau, 0, DT_COMPLEX, _state); ae_vector_init(&d, 0, DT_REAL, _state); ae_vector_init(&e, 0, DT_REAL, _state); ae_matrix_set_length(&ua, n-1+1, n-1+1, _state); ae_matrix_set_length(&la, n-1+1, n-1+1, _state); ae_matrix_set_length(&t, n-1+1, n-1+1, _state); ae_matrix_set_length(&q, n-1+1, n-1+1, _state); ae_matrix_set_length(&t2, n-1+1, n-1+1, _state); ae_matrix_set_length(&t3, n-1+1, n-1+1, _state); /* * fill */ for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { ua.ptr.pp_complex[i][j] = ae_complex_from_i(0); } } for(i=0; i<=n-1; i++) { for(j=i; j<=n-1; j++) { ua.ptr.pp_complex[i][j] = a->ptr.pp_complex[i][j]; } } for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { la.ptr.pp_complex[i][j] = ae_complex_from_i(0); } } for(i=0; i<=n-1; i++) { for(j=0; j<=i; j++) { la.ptr.pp_complex[i][j] = a->ptr.pp_complex[i][j]; } } /* * Test 2tridiagonal: upper */ hmatrixtd(&ua, n, ae_true, &tau, &d, &e, _state); hmatrixtdunpackq(&ua, n, ae_true, &tau, &q, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { t.ptr.pp_complex[i][j] = ae_complex_from_i(0); } } for(i=0; i<=n-1; i++) { t.ptr.pp_complex[i][i] = ae_complex_from_d(d.ptr.p_double[i]); } for(i=0; i<=n-2; i++) { t.ptr.pp_complex[i][i+1] = ae_complex_from_d(e.ptr.p_double[i]); t.ptr.pp_complex[i+1][i] = ae_complex_from_d(e.ptr.p_double[i]); } for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { v = ae_v_cdotproduct(&q.ptr.pp_complex[0][i], q.stride, "Conj", &a->ptr.pp_complex[0][j], a->stride, "N", ae_v_len(0,n-1)); t2.ptr.pp_complex[i][j] = v; } } for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { v = ae_v_cdotproduct(&t2.ptr.pp_complex[i][0], 1, "N", &q.ptr.pp_complex[0][j], q.stride, "N", ae_v_len(0,n-1)); t3.ptr.pp_complex[i][j] = v; } } for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { *tderrors = *tderrors||ae_fp_greater(ae_c_abs(ae_c_sub(t3.ptr.pp_complex[i][j],t.ptr.pp_complex[i][j]), _state),threshold); } } for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { v = ae_v_cdotproduct(&q.ptr.pp_complex[i][0], 1, "N", &q.ptr.pp_complex[j][0], 1, "Conj", ae_v_len(0,n-1)); if( i==j ) { v = ae_c_sub_d(v,1); } *tderrors = *tderrors||ae_fp_greater(ae_c_abs(v, _state),threshold); } } /* * Test 2tridiagonal: lower */ hmatrixtd(&la, n, ae_false, &tau, &d, &e, _state); hmatrixtdunpackq(&la, n, ae_false, &tau, &q, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { t.ptr.pp_complex[i][j] = ae_complex_from_i(0); } } for(i=0; i<=n-1; i++) { t.ptr.pp_complex[i][i] = ae_complex_from_d(d.ptr.p_double[i]); } for(i=0; i<=n-2; i++) { t.ptr.pp_complex[i][i+1] = ae_complex_from_d(e.ptr.p_double[i]); t.ptr.pp_complex[i+1][i] = ae_complex_from_d(e.ptr.p_double[i]); } for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { v = ae_v_cdotproduct(&q.ptr.pp_complex[0][i], q.stride, "Conj", &a->ptr.pp_complex[0][j], a->stride, "N", ae_v_len(0,n-1)); t2.ptr.pp_complex[i][j] = v; } } for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { v = ae_v_cdotproduct(&t2.ptr.pp_complex[i][0], 1, "N", &q.ptr.pp_complex[0][j], q.stride, "N", ae_v_len(0,n-1)); t3.ptr.pp_complex[i][j] = v; } } for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { *tderrors = *tderrors||ae_fp_greater(ae_c_abs(ae_c_sub(t3.ptr.pp_complex[i][j],t.ptr.pp_complex[i][j]), _state),threshold); } } for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { v = ae_v_cdotproduct(&q.ptr.pp_complex[i][0], 1, "N", 
&q.ptr.pp_complex[j][0], 1, "Conj", ae_v_len(0,n-1)); if( i==j ) { v = ae_c_sub_d(v,1); } *tderrors = *tderrors||ae_fp_greater(ae_c_abs(v, _state),threshold); } } ae_frame_leave(_state); } /************************************************************************* Testing *************************************************************************/ ae_bool testfbls(ae_bool silent, ae_state *_state) { ae_frame _frame_block; ae_int_t n; ae_int_t m; ae_int_t mx; ae_int_t i; ae_int_t j; ae_bool waserrors; ae_bool cgerrors; ae_bool lserrors; double eps; double v; double v1; double v2; ae_vector tmp0; ae_vector tmp1; ae_vector tmp2; ae_matrix a; ae_vector b; ae_vector x; ae_vector xe; ae_vector buf; double alpha; double e1; double e2; fblslincgstate cgstate; ae_bool result; ae_frame_make(_state, &_frame_block); ae_vector_init(&tmp0, 0, DT_REAL, _state); ae_vector_init(&tmp1, 0, DT_REAL, _state); ae_vector_init(&tmp2, 0, DT_REAL, _state); ae_matrix_init(&a, 0, 0, DT_REAL, _state); ae_vector_init(&b, 0, DT_REAL, _state); ae_vector_init(&x, 0, DT_REAL, _state); ae_vector_init(&xe, 0, DT_REAL, _state); ae_vector_init(&buf, 0, DT_REAL, _state); _fblslincgstate_init(&cgstate, _state); mx = 10; waserrors = ae_false; cgerrors = ae_false; lserrors = ae_false; /* * Test CG solver: * * generate problem (A, B, Alpha, XE - exact solution) and initial approximation X * * E1 = ||A'A*x-b|| * * solve * * E2 = ||A'A*x-b|| * * test that E2<0.001*E1 */ for(n=1; n<=mx; n++) { for(m=1; m<=mx; m++) { ae_matrix_set_length(&a, m, n, _state); ae_vector_set_length(&b, n, _state); ae_vector_set_length(&x, n, _state); ae_vector_set_length(&xe, n, _state); ae_vector_set_length(&tmp1, m, _state); ae_vector_set_length(&tmp2, n, _state); /* * init A, alpha, B, X (initial approximation), XE (exact solution) * X is initialized in such way that is has no chances to be equal to XE. 
*/ for(i=0; i<=m-1; i++) { for(j=0; j<=n-1; j++) { a.ptr.pp_double[i][j] = 2*ae_randomreal(_state)-1; } } alpha = ae_randomreal(_state)+0.1; for(i=0; i<=n-1; i++) { b.ptr.p_double[i] = 2*ae_randomreal(_state)-1; xe.ptr.p_double[i] = 2*ae_randomreal(_state)-1; x.ptr.p_double[i] = (2*ae_randominteger(2, _state)-1)*(2+ae_randomreal(_state)); } /* * Test dense CG (which solves A'A*x=b and accepts dense A) */ for(i=0; i<=n-1; i++) { x.ptr.p_double[i] = (2*ae_randominteger(2, _state)-1)*(2+ae_randomreal(_state)); } rmatrixmv(m, n, &a, 0, 0, 0, &x, 0, &tmp1, 0, _state); rmatrixmv(n, m, &a, 0, 0, 1, &tmp1, 0, &tmp2, 0, _state); ae_v_addd(&tmp2.ptr.p_double[0], 1, &x.ptr.p_double[0], 1, ae_v_len(0,n-1), alpha); ae_v_sub(&tmp2.ptr.p_double[0], 1, &b.ptr.p_double[0], 1, ae_v_len(0,n-1)); v = ae_v_dotproduct(&tmp2.ptr.p_double[0], 1, &tmp2.ptr.p_double[0], 1, ae_v_len(0,n-1)); e1 = ae_sqrt(v, _state); fblssolvecgx(&a, m, n, alpha, &b, &x, &buf, _state); rmatrixmv(m, n, &a, 0, 0, 0, &x, 0, &tmp1, 0, _state); rmatrixmv(n, m, &a, 0, 0, 1, &tmp1, 0, &tmp2, 0, _state); ae_v_addd(&tmp2.ptr.p_double[0], 1, &x.ptr.p_double[0], 1, ae_v_len(0,n-1), alpha); ae_v_sub(&tmp2.ptr.p_double[0], 1, &b.ptr.p_double[0], 1, ae_v_len(0,n-1)); v = ae_v_dotproduct(&tmp2.ptr.p_double[0], 1, &tmp2.ptr.p_double[0], 1, ae_v_len(0,n-1)); e2 = ae_sqrt(v, _state); cgerrors = cgerrors||ae_fp_greater(e2,0.001*e1); /* * Test sparse CG (which relies on reverse communication) */ for(i=0; i<=n-1; i++) { x.ptr.p_double[i] = (2*ae_randominteger(2, _state)-1)*(2+ae_randomreal(_state)); } rmatrixmv(m, n, &a, 0, 0, 0, &x, 0, &tmp1, 0, _state); rmatrixmv(n, m, &a, 0, 0, 1, &tmp1, 0, &tmp2, 0, _state); ae_v_addd(&tmp2.ptr.p_double[0], 1, &x.ptr.p_double[0], 1, ae_v_len(0,n-1), alpha); ae_v_sub(&tmp2.ptr.p_double[0], 1, &b.ptr.p_double[0], 1, ae_v_len(0,n-1)); v = ae_v_dotproduct(&tmp2.ptr.p_double[0], 1, &tmp2.ptr.p_double[0], 1, ae_v_len(0,n-1)); e1 = ae_sqrt(v, _state); fblscgcreate(&x, &b, n, &cgstate, _state); while(fblscgiteration(&cgstate, _state)) { rmatrixmv(m, n, &a, 0, 0, 0, &cgstate.x, 0, &tmp1, 0, _state); rmatrixmv(n, m, &a, 0, 0, 1, &tmp1, 0, &cgstate.ax, 0, _state); ae_v_addd(&cgstate.ax.ptr.p_double[0], 1, &cgstate.x.ptr.p_double[0], 1, ae_v_len(0,n-1), alpha); v1 = ae_v_dotproduct(&tmp1.ptr.p_double[0], 1, &tmp1.ptr.p_double[0], 1, ae_v_len(0,m-1)); v2 = ae_v_dotproduct(&cgstate.x.ptr.p_double[0], 1, &cgstate.x.ptr.p_double[0], 1, ae_v_len(0,n-1)); cgstate.xax = v1+alpha*v2; } rmatrixmv(m, n, &a, 0, 0, 0, &cgstate.xk, 0, &tmp1, 0, _state); rmatrixmv(n, m, &a, 0, 0, 1, &tmp1, 0, &tmp2, 0, _state); ae_v_addd(&tmp2.ptr.p_double[0], 1, &cgstate.xk.ptr.p_double[0], 1, ae_v_len(0,n-1), alpha); ae_v_sub(&tmp2.ptr.p_double[0], 1, &b.ptr.p_double[0], 1, ae_v_len(0,n-1)); v = ae_v_dotproduct(&tmp2.ptr.p_double[0], 1, &tmp2.ptr.p_double[0], 1, ae_v_len(0,n-1)); e2 = ae_sqrt(v, _state); cgerrors = cgerrors||ae_fp_greater(ae_fabs(e1-cgstate.e1, _state),100*ae_machineepsilon*e1); cgerrors = cgerrors||ae_fp_greater(ae_fabs(e2-cgstate.e2, _state),100*ae_machineepsilon*e1); cgerrors = cgerrors||ae_fp_greater(e2,0.001*e1); } } /* * Test linear least squares: * * try N=1..5, M=N..2*N * [ B ] * * generate MxN matrix A = [ ], where (M-N)xN submatrix B contains * [ C ] * random values from [-1,+1], and NxN submatrix C is diagonally dominant * (diagonal of C is equal to 1.0, and magnitude of off-diagonal elements * is smaller than 0.01). Such matrix is guaranteed to be non-degenerate. 
* * generate random known solution xe, set right part b=A*xe * * check that results of FBLSSolveLS agree with xe */ eps = 1.0E-6; for(n=1; n<=5; n++) { for(m=n; m<=2*n; m++) { ae_matrix_set_length(&a, m, n, _state); for(i=0; i<=m-n-1; i++) { for(j=0; j<=n-1; j++) { a.ptr.pp_double[i][j] = 2*ae_randomreal(_state)-1; } } for(i=m-n; i<=m-1; i++) { for(j=0; j<=n-1; j++) { a.ptr.pp_double[i][j] = 0.01*(2*ae_randomreal(_state)-1); } a.ptr.pp_double[i][i-(m-n)] = 1.0; } ae_vector_set_length(&xe, n, _state); for(i=0; i<=n-1; i++) { xe.ptr.p_double[i] = 2*ae_randomreal(_state)-1; } ae_vector_set_length(&b, m, _state); for(i=0; i<=m-1; i++) { v = ae_v_dotproduct(&a.ptr.pp_double[i][0], 1, &xe.ptr.p_double[0], 1, ae_v_len(0,n-1)); b.ptr.p_double[i] = v; } fblssolvels(&a, &b, m, n, &tmp0, &tmp1, &tmp2, _state); for(i=0; i<=n-1; i++) { lserrors = lserrors||ae_fp_greater(ae_fabs(b.ptr.p_double[i]-xe.ptr.p_double[i], _state),eps); } } }
/* * report */ waserrors = cgerrors||lserrors; if( !silent ) { printf("TESTING FBLS\n"); printf("CG ERRORS: "); if( cgerrors ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("LS ERRORS: "); if( lserrors ) { printf("FAILED\n"); } else { printf("OK\n"); } if( waserrors ) { printf("TEST FAILED\n"); } else { printf("TEST PASSED\n"); } printf("\n\n"); } result = !waserrors; ae_frame_leave(_state); return result; }
/************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. *************************************************************************/ ae_bool _pexec_testfbls(ae_bool silent, ae_state *_state) { return testfbls(silent, _state); }
ae_bool testcqmodels(ae_bool silent, ae_state *_state) { ae_frame _frame_block; ae_bool eval0errors; ae_bool eval1errors; ae_bool eval2errors; ae_bool newton0errors; ae_bool newton1errors; ae_bool newton2errors; ae_bool waserrors; convexquadraticmodel s; ae_int_t nkind; ae_int_t kmax; ae_int_t n; ae_int_t k; ae_int_t i; ae_int_t pass; ae_int_t j; double alpha; double theta; double tau; double v; double v2; double h; double f0; double mkind; double xtadx2; double noise; ae_matrix a; ae_matrix q; ae_vector b; ae_vector r; ae_vector x; ae_vector x0; ae_vector xc; ae_vector d; ae_vector ge; ae_vector gt; ae_vector tmp0; ae_vector adx; ae_vector adxe; ae_vector activeset; ae_bool result; ae_frame_make(_state, &_frame_block); _convexquadraticmodel_init(&s, _state); ae_matrix_init(&a, 0, 0, DT_REAL, _state); ae_matrix_init(&q, 0, 0, DT_REAL, _state); ae_vector_init(&b, 0, DT_REAL, _state); ae_vector_init(&r, 0, DT_REAL, _state); ae_vector_init(&x, 0, DT_REAL, _state); ae_vector_init(&x0, 0, DT_REAL, _state); ae_vector_init(&xc, 0, DT_REAL, _state); ae_vector_init(&d, 0, DT_REAL, _state); ae_vector_init(&ge, 0, DT_REAL, _state); ae_vector_init(&gt, 0, DT_REAL, _state); ae_vector_init(&tmp0, 0, DT_REAL, _state); ae_vector_init(&adx, 0, DT_REAL, _state); ae_vector_init(&adxe, 0, DT_REAL, _state); ae_vector_init(&activeset, 0, DT_BOOL, _state); waserrors = ae_false;
/* * Eval0 test: unconstrained model evaluation */ eval0errors = ae_false; for(n=1; n<=5; n++) { for(k=0; k<=2*n; k++) { /* * Allocate place */ ae_matrix_set_length(&a, n, n, _state); ae_vector_set_length(&b, n, _state); ae_vector_set_length(&x, n, _state); ae_vector_set_length(&d, n, _state); ae_vector_set_length(&ge, n, _state); ae_vector_set_length(&gt, n, _state); if( k>0 ) { ae_matrix_set_length(&q, k, n, _state); ae_vector_set_length(&r, k, _state); } /* * Generate problem */ alpha =
ae_randomreal(_state)+1.0; theta = ae_randomreal(_state)+1.0; tau = ae_randomreal(_state)*ae_randominteger(2, _state); for(i=0; i<=n-1; i++) { a.ptr.pp_double[i][i] = 10*(1+ae_randomreal(_state)); b.ptr.p_double[i] = 2*ae_randomreal(_state)-1; d.ptr.p_double[i] = ae_randomreal(_state)+1; for(j=i+1; j<=n-1; j++) { v = 0.1*ae_randomreal(_state)-0.05; a.ptr.pp_double[i][j] = v; a.ptr.pp_double[j][i] = v; } for(j=0; j<=k-1; j++) { q.ptr.pp_double[j][i] = 2*ae_randomreal(_state)-1; } } for(i=0; i<=k-1; i++) { r.ptr.p_double[i] = 2*ae_randomreal(_state)-1; }
/* * Build model */ cqminit(n, &s, _state); cqmseta(&s, &a, ae_fp_greater(ae_randomreal(_state),0.5), alpha, _state); cqmsetb(&s, &b, _state); cqmsetq(&s, &q, &r, k, theta, _state); cqmsetd(&s, &d, tau, _state);
/* * Evaluate and compare: * * X - random point * * GE - "exact" gradient * * XTADX2 - x'*(alpha*A+tau*D)*x/2 * * ADXE - (alpha*A+tau*D)*x * * V - model value at X */ for(i=0; i<=n-1; i++) { x.ptr.p_double[i] = 2*ae_randomreal(_state)-1; ge.ptr.p_double[i] = 0.0; } v = 0.0; xtadx2 = 0.0; ae_vector_set_length(&adxe, n, _state); for(i=0; i<=n-1; i++) { adxe.ptr.p_double[i] = (double)(0); } for(i=0; i<=n-1; i++) { v = v+x.ptr.p_double[i]*b.ptr.p_double[i]; ge.ptr.p_double[i] = ge.ptr.p_double[i]+b.ptr.p_double[i]; v = v+0.5*ae_sqr(x.ptr.p_double[i], _state)*tau*d.ptr.p_double[i]; ge.ptr.p_double[i] = ge.ptr.p_double[i]+x.ptr.p_double[i]*tau*d.ptr.p_double[i]; adxe.ptr.p_double[i] = adxe.ptr.p_double[i]+x.ptr.p_double[i]*tau*d.ptr.p_double[i]; xtadx2 = xtadx2+0.5*ae_sqr(x.ptr.p_double[i], _state)*tau*d.ptr.p_double[i]; for(j=0; j<=n-1; j++) { v = v+0.5*alpha*x.ptr.p_double[i]*a.ptr.pp_double[i][j]*x.ptr.p_double[j]; ge.ptr.p_double[i] = ge.ptr.p_double[i]+alpha*a.ptr.pp_double[i][j]*x.ptr.p_double[j]; adxe.ptr.p_double[i] = adxe.ptr.p_double[i]+alpha*a.ptr.pp_double[i][j]*x.ptr.p_double[j]; xtadx2 = xtadx2+0.5*alpha*x.ptr.p_double[i]*a.ptr.pp_double[i][j]*x.ptr.p_double[j]; } } for(i=0; i<=k-1; i++) { v2 = ae_v_dotproduct(&q.ptr.pp_double[i][0], 1, &x.ptr.p_double[0], 1, ae_v_len(0,n-1)); v = v+0.5*theta*ae_sqr(v2-r.ptr.p_double[i], _state); for(j=0; j<=n-1; j++) { ge.ptr.p_double[j] = ge.ptr.p_double[j]+theta*(v2-r.ptr.p_double[i])*q.ptr.pp_double[i][j]; } } v2 = cqmeval(&s, &x, _state); eval0errors = eval0errors||ae_fp_greater(ae_fabs(v-v2, _state),10000*ae_machineepsilon); cqmevalx(&s, &x, &v2, &noise, _state); eval0errors = eval0errors||ae_fp_greater(ae_fabs(v-v2, _state),10000*ae_machineepsilon); eval0errors = (eval0errors||ae_fp_less(noise,(double)(0)))||ae_fp_greater(noise,10000*ae_machineepsilon); v2 = cqmxtadx2(&s, &x, _state); eval0errors = eval0errors||ae_fp_greater(ae_fabs(xtadx2-v2, _state),10000*ae_machineepsilon); cqmgradunconstrained(&s, &x, &gt, _state); for(i=0; i<=n-1; i++) { eval0errors = eval0errors||ae_fp_greater(ae_fabs(ge.ptr.p_double[i]-gt.ptr.p_double[i], _state),10000*ae_machineepsilon); } cqmadx(&s, &x, &adx, _state); for(i=0; i<=n-1; i++) { eval0errors = eval0errors||ae_fp_greater(ae_fabs(adx.ptr.p_double[i]-adxe.ptr.p_double[i], _state),10000*ae_machineepsilon); } } } waserrors = waserrors||eval0errors;
/* * Eval1 test: constrained model evaluation */ eval1errors = ae_false; for(n=1; n<=5; n++) { for(k=0; k<=2*n; k++) { /* * Allocate place */ ae_matrix_set_length(&a, n, n, _state); ae_vector_set_length(&b, n, _state); ae_vector_set_length(&x, n, _state); ae_vector_set_length(&xc, n, _state); ae_vector_set_length(&activeset, n, _state); if( k>0 ) { ae_matrix_set_length(&q, k, n, _state);
ae_vector_set_length(&r, k, _state); } /* * Generate problem */ alpha = ae_randomreal(_state)+1.0; theta = ae_randomreal(_state)+1.0; for(i=0; i<=n-1; i++) { a.ptr.pp_double[i][i] = 10*(1+ae_randomreal(_state)); b.ptr.p_double[i] = 2*ae_randomreal(_state)-1; xc.ptr.p_double[i] = 2*ae_randomreal(_state)-1; activeset.ptr.p_bool[i] = ae_fp_greater(ae_randomreal(_state),0.5); for(j=i+1; j<=n-1; j++) { v = 0.1*ae_randomreal(_state)-0.05; a.ptr.pp_double[i][j] = v; a.ptr.pp_double[j][i] = v; } for(j=0; j<=k-1; j++) { q.ptr.pp_double[j][i] = 2*ae_randomreal(_state)-1; } } for(i=0; i<=k-1; i++) { r.ptr.p_double[i] = 2*ae_randomreal(_state)-1; } /* * Build model, evaluate at random point X, compare */ cqminit(n, &s, _state); cqmseta(&s, &a, ae_fp_greater(ae_randomreal(_state),0.5), alpha, _state); cqmsetb(&s, &b, _state); cqmsetq(&s, &q, &r, k, theta, _state); cqmsetactiveset(&s, &xc, &activeset, _state); for(i=0; i<=n-1; i++) { x.ptr.p_double[i] = 2*ae_randomreal(_state)-1; if( !activeset.ptr.p_bool[i] ) { xc.ptr.p_double[i] = x.ptr.p_double[i]; } } v = 0.0; for(i=0; i<=n-1; i++) { v = v+xc.ptr.p_double[i]*b.ptr.p_double[i]; for(j=0; j<=n-1; j++) { v = v+0.5*alpha*xc.ptr.p_double[i]*a.ptr.pp_double[i][j]*xc.ptr.p_double[j]; } } for(i=0; i<=k-1; i++) { v2 = ae_v_dotproduct(&q.ptr.pp_double[i][0], 1, &xc.ptr.p_double[0], 1, ae_v_len(0,n-1)); v = v+0.5*theta*ae_sqr(v2-r.ptr.p_double[i], _state); } eval1errors = eval1errors||ae_fp_greater(ae_fabs(v-cqmeval(&s, &xc, _state), _state),10000*ae_machineepsilon); eval1errors = eval1errors||ae_fp_greater(ae_fabs(v-cqmdebugconstrainedevalt(&s, &x, _state), _state),10000*ae_machineepsilon); eval1errors = eval1errors||ae_fp_greater(ae_fabs(v-cqmdebugconstrainedevale(&s, &x, _state), _state),10000*ae_machineepsilon); } } waserrors = waserrors||eval1errors; /* * Eval2 test: we generate empty problem and apply sequence of random transformations, * re-evaluating and re-checking model after each modification. * * The purpose of such test is to ensure that our caching strategy works correctly. */ eval2errors = ae_false; for(n=1; n<=5; n++) { kmax = 2*n; ae_matrix_set_length(&a, n, n, _state); ae_vector_set_length(&b, n, _state); ae_vector_set_length(&d, n, _state); ae_vector_set_length(&x, n, _state); ae_vector_set_length(&xc, n, _state); ae_matrix_set_length(&q, kmax, n, _state); ae_vector_set_length(&r, kmax, _state); ae_vector_set_length(&activeset, n, _state); ae_vector_set_length(&tmp0, n, _state); alpha = 0.0; theta = 0.0; k = 0; tau = 1.0+ae_randomreal(_state); for(i=0; i<=n-1; i++) { d.ptr.p_double[i] = 1.0; b.ptr.p_double[i] = 0.0; xc.ptr.p_double[i] = 2*ae_randomreal(_state)-1; for(j=0; j<=n-1; j++) { a.ptr.pp_double[i][j] = (double)(0); } } cqminit(n, &s, _state); cqmsetd(&s, &d, tau, _state); for(pass=1; pass<=100; pass++) { /* * Select random modification type, apply modification. * * MKind is a random integer in [0,7] - number of specific * modification to apply. */ mkind = (double)(ae_randominteger(8, _state)); if( ae_fp_eq(mkind,(double)(0)) ) { /* * Set non-zero D */ tau = 1.0+ae_randomreal(_state); for(i=0; i<=n-1; i++) { d.ptr.p_double[i] = 2*ae_randomreal(_state)+1; } cqmsetd(&s, &d, tau, _state); } else { if( ae_fp_eq(mkind,(double)(1)) ) { /* * Set zero D. * In case Alpha=0, set non-zero A. 
*/ if( ae_fp_eq(alpha,(double)(0)) ) { alpha = 1.0+ae_randomreal(_state); for(i=0; i<=n-1; i++) { for(j=i+1; j<=n-1; j++) { a.ptr.pp_double[i][j] = 0.2*ae_randomreal(_state)-0.1; a.ptr.pp_double[j][i] = a.ptr.pp_double[i][j]; } } for(i=0; i<=n-1; i++) { a.ptr.pp_double[i][i] = 4+2*ae_randomreal(_state); } cqmseta(&s, &a, ae_fp_greater(ae_randomreal(_state),0.5), alpha, _state); } tau = 0.0; for(i=0; i<=n-1; i++) { d.ptr.p_double[i] = (double)(0); } cqmsetd(&s, &d, 0.0, _state); } else { if( ae_fp_eq(mkind,(double)(2)) ) { /* * Set non-zero A */ alpha = 1.0+ae_randomreal(_state); for(i=0; i<=n-1; i++) { for(j=i+1; j<=n-1; j++) { a.ptr.pp_double[i][j] = 0.2*ae_randomreal(_state)-0.1; a.ptr.pp_double[j][i] = a.ptr.pp_double[i][j]; } } for(i=0; i<=n-1; i++) { a.ptr.pp_double[i][i] = 4+2*ae_randomreal(_state); } cqmseta(&s, &a, ae_fp_greater(ae_randomreal(_state),0.5), alpha, _state); } else { if( ae_fp_eq(mkind,(double)(3)) ) { /* * Set zero A. * In case Tau=0, set non-zero D. */ if( ae_fp_eq(tau,(double)(0)) ) { tau = 1.0+ae_randomreal(_state); for(i=0; i<=n-1; i++) { d.ptr.p_double[i] = 2*ae_randomreal(_state)+1; } cqmsetd(&s, &d, tau, _state); } alpha = 0.0; for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { a.ptr.pp_double[i][j] = (double)(0); } } cqmseta(&s, &a, ae_fp_greater(ae_randomreal(_state),0.5), alpha, _state); } else { if( ae_fp_eq(mkind,(double)(4)) ) { /* * Set B. */ for(i=0; i<=n-1; i++) { b.ptr.p_double[i] = 2*ae_randomreal(_state)-1; } cqmsetb(&s, &b, _state); } else { if( ae_fp_eq(mkind,(double)(5)) ) { /* * Set Q. */ k = ae_randominteger(kmax+1, _state); theta = 1.0+ae_randomreal(_state); for(i=0; i<=k-1; i++) { r.ptr.p_double[i] = 2*ae_randomreal(_state)-1; for(j=0; j<=n-1; j++) { q.ptr.pp_double[i][j] = 2*ae_randomreal(_state)-1; } } cqmsetq(&s, &q, &r, k, theta, _state); } else { if( ae_fp_eq(mkind,(double)(6)) ) { /* * Set active set */ for(i=0; i<=n-1; i++) { activeset.ptr.p_bool[i] = ae_fp_greater(ae_randomreal(_state),0.5); xc.ptr.p_double[i] = 2*ae_randomreal(_state)-1; } cqmsetactiveset(&s, &xc, &activeset, _state); } else { if( ae_fp_eq(mkind,(double)(7)) ) { /* * Rewrite main diagonal */ if( ae_fp_eq(alpha,(double)(0)) ) { alpha = 1.0; } for(i=0; i<=n-1; i++) { tmp0.ptr.p_double[i] = 1+ae_randomreal(_state); a.ptr.pp_double[i][i] = tmp0.ptr.p_double[i]/alpha; } cqmrewritedensediagonal(&s, &tmp0, _state); } } } } } } } } /* * generate random point with respect to constraints, * test model at this point */ for(i=0; i<=n-1; i++) { x.ptr.p_double[i] = 2*ae_randomreal(_state)-1; if( activeset.ptr.p_bool[i] ) { x.ptr.p_double[i] = xc.ptr.p_double[i]; } } v = 0.0; for(i=0; i<=n-1; i++) { v = v+x.ptr.p_double[i]*b.ptr.p_double[i]; } if( ae_fp_greater(tau,(double)(0)) ) { for(i=0; i<=n-1; i++) { v = v+0.5*tau*d.ptr.p_double[i]*ae_sqr(x.ptr.p_double[i], _state); } } if( ae_fp_greater(alpha,(double)(0)) ) { for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { v = v+0.5*alpha*x.ptr.p_double[i]*a.ptr.pp_double[i][j]*x.ptr.p_double[j]; } } } if( ae_fp_greater(theta,(double)(0)) ) { for(i=0; i<=k-1; i++) { v2 = ae_v_dotproduct(&q.ptr.pp_double[i][0], 1, &x.ptr.p_double[0], 1, ae_v_len(0,n-1)); v = v+0.5*theta*ae_sqr(v2-r.ptr.p_double[i], _state); } } v2 = cqmeval(&s, &x, _state); eval2errors = eval2errors||ae_fp_greater(ae_fabs(v-v2, _state),10000*ae_machineepsilon); v2 = cqmdebugconstrainedevalt(&s, &x, _state); eval2errors = eval2errors||ae_fp_greater(ae_fabs(v-v2, _state),10000*ae_machineepsilon); v2 = cqmdebugconstrainedevale(&s, &x, _state); eval2errors = 
eval2errors||ae_fp_greater(ae_fabs(v-v2, _state),10000*ae_machineepsilon); } } waserrors = waserrors||eval2errors; /* * Newton0 test: unconstrained optimization */ newton0errors = ae_false; for(n=1; n<=5; n++) { for(k=0; k<=2*n; k++) { /* * Allocate place */ ae_matrix_set_length(&a, n, n, _state); ae_vector_set_length(&b, n, _state); ae_vector_set_length(&x, n, _state); ae_vector_set_length(&x0, n, _state); if( k>0 ) { ae_matrix_set_length(&q, k, n, _state); ae_vector_set_length(&r, k, _state); } /* * Generate problem with known solution x0: * min f(x), * f(x) = 0.5*(x-x0)'*A*(x-x0) * = 0.5*x'*A*x + (-x0'*A)*x + 0.5*x0'*A*x0' */ alpha = ae_randomreal(_state)+1.0; for(i=0; i<=n-1; i++) { x0.ptr.p_double[i] = 2*ae_randomreal(_state)-1; a.ptr.pp_double[i][i] = 10*(1+ae_randomreal(_state)); for(j=i+1; j<=n-1; j++) { v = 0.1*ae_randomreal(_state)-0.05; a.ptr.pp_double[i][j] = v; a.ptr.pp_double[j][i] = v; } } for(i=0; i<=n-1; i++) { v = ae_v_dotproduct(&a.ptr.pp_double[i][0], 1, &x0.ptr.p_double[0], 1, ae_v_len(0,n-1)); b.ptr.p_double[i] = -alpha*v; } theta = ae_randomreal(_state)+1.0; for(i=0; i<=k-1; i++) { for(j=0; j<=n-1; j++) { q.ptr.pp_double[i][j] = 2*ae_randomreal(_state)-1; } v = ae_v_dotproduct(&q.ptr.pp_double[i][0], 1, &x0.ptr.p_double[0], 1, ae_v_len(0,n-1)); r.ptr.p_double[i] = v; } /* * Build model, evaluate at random point X, compare */ cqminit(n, &s, _state); cqmseta(&s, &a, ae_fp_greater(ae_randomreal(_state),0.5), alpha, _state); cqmsetb(&s, &b, _state); cqmsetq(&s, &q, &r, k, theta, _state); cqmconstrainedoptimum(&s, &x, _state); for(i=0; i<=n-1; i++) { newton0errors = newton0errors||ae_fp_greater(ae_fabs(x.ptr.p_double[i]-x0.ptr.p_double[i], _state),1.0E6*ae_machineepsilon); } } } waserrors = waserrors||newton0errors; /* * Newton1 test: constrained optimization */ newton1errors = ae_false; h = 1.0E-3; for(n=1; n<=5; n++) { for(k=0; k<=2*n; k++) { /* * Allocate place */ ae_matrix_set_length(&a, n, n, _state); ae_vector_set_length(&b, n, _state); ae_vector_set_length(&x, n, _state); ae_vector_set_length(&xc, n, _state); ae_vector_set_length(&activeset, n, _state); if( k>0 ) { ae_matrix_set_length(&q, k, n, _state); ae_vector_set_length(&r, k, _state); } /* * Generate test problem with unknown solution. 
*/ alpha = ae_randomreal(_state)+1.0; for(i=0; i<=n-1; i++) { a.ptr.pp_double[i][i] = 10*(1+ae_randomreal(_state)); b.ptr.p_double[i] = 2*ae_randomreal(_state)-1; xc.ptr.p_double[i] = 2*ae_randomreal(_state)-1; activeset.ptr.p_bool[i] = ae_fp_greater(ae_randomreal(_state),0.5); for(j=i+1; j<=n-1; j++) { v = 0.1*ae_randomreal(_state)-0.05; a.ptr.pp_double[i][j] = v; a.ptr.pp_double[j][i] = v; } } theta = ae_randomreal(_state)+1.0; for(i=0; i<=k-1; i++) { for(j=0; j<=n-1; j++) { q.ptr.pp_double[i][j] = 2*ae_randomreal(_state)-1; } r.ptr.p_double[i] = 2*ae_randomreal(_state)-1; } /* * Build model, find solution */ cqminit(n, &s, _state); cqmseta(&s, &a, ae_fp_greater(ae_randomreal(_state),0.5), alpha, _state); cqmsetb(&s, &b, _state); cqmsetq(&s, &q, &r, k, theta, _state); cqmsetactiveset(&s, &xc, &activeset, _state); if( cqmconstrainedoptimum(&s, &x, _state) ) { /* * Check that constraints are satisfied, * and that solution is true optimum */ f0 = cqmeval(&s, &x, _state); for(i=0; i<=n-1; i++) { newton1errors = newton1errors||(activeset.ptr.p_bool[i]&&ae_fp_neq(x.ptr.p_double[i],xc.ptr.p_double[i])); if( !activeset.ptr.p_bool[i] ) { v = x.ptr.p_double[i]; x.ptr.p_double[i] = v+h; v2 = cqmeval(&s, &x, _state); newton1errors = newton1errors||ae_fp_less(v2,f0); x.ptr.p_double[i] = v-h; v2 = cqmeval(&s, &x, _state); newton1errors = newton1errors||ae_fp_less(v2,f0); x.ptr.p_double[i] = v; } } } else { newton1errors = ae_true; } } } waserrors = waserrors||newton1errors; /* * Newton2 test: we test ability to work with diagonal matrices, including * very large ones (up to 100.000 elements). This test checks that: * a) we can work with Alpha=0, i.e. when we have strictly diagonal A * b) diagonal problems are handled efficiently, i.e. algorithm will * successfully solve problem with N=100.000 * * Test problem: * * diagonal term D and rank-K term Q * * known solution X0, * * about 50% of constraints are active and equal to components of X0 */ newton2errors = ae_false; for(nkind=0; nkind<=5; nkind++) { for(k=0; k<=5; k++) { n = ae_round(ae_pow((double)(n), (double)(nkind), _state), _state); /* * generate problem */ ae_vector_set_length(&d, n, _state); ae_vector_set_length(&b, n, _state); ae_vector_set_length(&x, n, _state); ae_vector_set_length(&x0, n, _state); ae_vector_set_length(&activeset, n, _state); if( k>0 ) { ae_matrix_set_length(&q, k, n, _state); ae_vector_set_length(&r, k, _state); } tau = 1+ae_randomreal(_state); theta = 1+ae_randomreal(_state); for(i=0; i<=n-1; i++) { x0.ptr.p_double[i] = 2*ae_randomreal(_state)-1; d.ptr.p_double[i] = 1+ae_randomreal(_state); b.ptr.p_double[i] = -x0.ptr.p_double[i]*tau*d.ptr.p_double[i]; activeset.ptr.p_bool[i] = ae_fp_greater(ae_randomreal(_state),0.5); } for(i=0; i<=k-1; i++) { v = 0.0; for(j=0; j<=n-1; j++) { q.ptr.pp_double[i][j] = 2*ae_randomreal(_state)-1; v = v+q.ptr.pp_double[i][j]*x0.ptr.p_double[j]; } r.ptr.p_double[i] = v; } /* * Solve, test */ cqminit(n, &s, _state); cqmsetb(&s, &b, _state); cqmsetd(&s, &d, tau, _state); cqmsetq(&s, &q, &r, k, theta, _state); cqmsetactiveset(&s, &x0, &activeset, _state); if( cqmconstrainedoptimum(&s, &x, _state) ) { /* * Check that constraints are satisfied, * and that solution is true optimum */ f0 = cqmeval(&s, &x, _state); for(i=0; i<=n-1; i++) { newton2errors = newton2errors||(activeset.ptr.p_bool[i]&&ae_fp_neq(x.ptr.p_double[i],x0.ptr.p_double[i])); newton2errors = newton2errors||(!activeset.ptr.p_bool[i]&&ae_fp_greater(ae_fabs(x.ptr.p_double[i]-x0.ptr.p_double[i], _state),1000*ae_machineepsilon)); } /* * 
Check that constrained evaluation at some point gives correct results */ for(i=0; i<=n-1; i++) { if( activeset.ptr.p_bool[i] ) { x.ptr.p_double[i] = x0.ptr.p_double[i]; } else { x.ptr.p_double[i] = 2*ae_randomreal(_state)-1; } } v = 0.0; for(i=0; i<=n-1; i++) { v = v+0.5*tau*d.ptr.p_double[i]*ae_sqr(x.ptr.p_double[i], _state)+x.ptr.p_double[i]*b.ptr.p_double[i]; } for(i=0; i<=k-1; i++) { v2 = ae_v_dotproduct(&q.ptr.pp_double[i][0], 1, &x.ptr.p_double[0], 1, ae_v_len(0,n-1)); v = v+0.5*theta*ae_sqr(v2-r.ptr.p_double[i], _state); } v2 = cqmeval(&s, &x, _state); newton2errors = (newton2errors||!ae_isfinite(v2, _state))||ae_fp_greater(ae_fabs(v-v2, _state),10000*ae_machineepsilon); v2 = cqmdebugconstrainedevalt(&s, &x, _state); newton2errors = (newton2errors||!ae_isfinite(v2, _state))||ae_fp_greater(ae_fabs(v-v2, _state),10000*ae_machineepsilon); v2 = cqmdebugconstrainedevale(&s, &x, _state); newton2errors = (newton2errors||!ae_isfinite(v2, _state))||ae_fp_greater(ae_fabs(v-v2, _state),10000*ae_machineepsilon); } else { newton2errors = ae_true; } } } waserrors = waserrors||newton2errors; /* * report */ if( !silent ) { printf("TESTING CONVEX QUADRATIC MODELS\n"); printf("Eval0 test: "); if( eval0errors ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("Eval1 test: "); if( eval1errors ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("Eval2 test: "); if( eval2errors ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("Newton0 test: "); if( newton0errors ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("Newton1 test: "); if( newton1errors ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("Newton2 test: "); if( newton2errors ) { printf("FAILED\n"); } else { printf("OK\n"); } if( waserrors ) { printf("TEST FAILED\n"); } else { printf("TEST PASSED\n"); } printf("\n\n"); } result = !waserrors; ae_frame_leave(_state); return result; } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. 
*************************************************************************/ ae_bool _pexec_testcqmodels(ae_bool silent, ae_state *_state) { return testcqmodels(silent, _state); } static void testbdsvdunit_fillidentity(/* Real */ ae_matrix* a, ae_int_t n, ae_state *_state); static void testbdsvdunit_fillsparsede(/* Real */ ae_vector* d, /* Real */ ae_vector* e, ae_int_t n, double sparcity, ae_state *_state); static void testbdsvdunit_getbdsvderror(/* Real */ ae_vector* d, /* Real */ ae_vector* e, ae_int_t n, ae_bool isupper, /* Real */ ae_matrix* u, /* Real */ ae_matrix* c, /* Real */ ae_vector* w, /* Real */ ae_matrix* vt, double* materr, double* orterr, ae_bool* wsorted, ae_state *_state); static void testbdsvdunit_checksvdmultiplication(/* Real */ ae_vector* d, /* Real */ ae_vector* e, ae_int_t n, ae_bool isupper, /* Real */ ae_matrix* u, /* Real */ ae_matrix* c, /* Real */ ae_vector* w, /* Real */ ae_matrix* vt, double* err, ae_state *_state); static void testbdsvdunit_testbdsvdproblem(/* Real */ ae_vector* d, /* Real */ ae_vector* e, ae_int_t n, double* materr, double* orterr, ae_bool* wsorted, ae_bool* wfailed, ae_int_t* failcount, ae_int_t* succcount, ae_state *_state); /************************************************************************* Testing bidiagonal SVD decomposition subroutine *************************************************************************/ ae_bool testbdsvd(ae_bool silent, ae_state *_state) { ae_frame _frame_block; ae_vector d; ae_vector e; ae_matrix mempty; ae_int_t n; ae_int_t maxn; ae_int_t i; ae_int_t pass; ae_bool waserrors; ae_bool wsorted; ae_bool wfailed; double materr; double orterr; double threshold; double failr; ae_int_t failcount; ae_int_t succcount; ae_bool result; ae_frame_make(_state, &_frame_block); ae_vector_init(&d, 0, DT_REAL, _state); ae_vector_init(&e, 0, DT_REAL, _state); ae_matrix_init(&mempty, 0, 0, DT_REAL, _state); failcount = 0; succcount = 0; materr = (double)(0); orterr = (double)(0); wsorted = ae_true; wfailed = ae_false; waserrors = ae_false; maxn = 15; threshold = 5*100*ae_machineepsilon; ae_vector_set_length(&d, maxn-1+1, _state); ae_vector_set_length(&e, maxn-2+1, _state); /* * special case: zero divide matrix * unfixed LAPACK routine should fail on this problem */ n = 7; d.ptr.p_double[0] = -6.96462904751731892700e-01; d.ptr.p_double[1] = 0.00000000000000000000e+00; d.ptr.p_double[2] = -5.73827770385971991400e-01; d.ptr.p_double[3] = -6.62562624399371191700e-01; d.ptr.p_double[4] = 5.82737148001782223600e-01; d.ptr.p_double[5] = 3.84825263580925003300e-01; d.ptr.p_double[6] = 9.84087420830525472200e-01; e.ptr.p_double[0] = -7.30307931760612871800e-02; e.ptr.p_double[1] = -2.30079042939542843800e-01; e.ptr.p_double[2] = -6.87824621739351216300e-01; e.ptr.p_double[3] = -1.77306437707837570600e-02; e.ptr.p_double[4] = 1.78285126526551632000e-15; e.ptr.p_double[5] = -4.89434737751289969400e-02; rmatrixbdsvd(&d, &e, n, ae_true, ae_false, &mempty, 0, &mempty, 0, &mempty, 0, _state); /* * zero matrix, several cases */ for(i=0; i<=maxn-1; i++) { d.ptr.p_double[i] = (double)(0); } for(i=0; i<=maxn-2; i++) { e.ptr.p_double[i] = (double)(0); } for(n=1; n<=maxn; n++) { testbdsvdunit_testbdsvdproblem(&d, &e, n, &materr, &orterr, &wsorted, &wfailed, &failcount, &succcount, _state); } /* * Dense matrix */ for(n=1; n<=maxn; n++) { for(pass=1; pass<=10; pass++) { for(i=0; i<=maxn-1; i++) { d.ptr.p_double[i] = 2*ae_randomreal(_state)-1; } for(i=0; i<=maxn-2; i++) { e.ptr.p_double[i] = 2*ae_randomreal(_state)-1; } 
testbdsvdunit_testbdsvdproblem(&d, &e, n, &materr, &orterr, &wsorted, &wfailed, &failcount, &succcount, _state); } } /* * Sparse matrices, very sparse matrices, incredible sparse matrices */ for(n=1; n<=maxn; n++) { for(pass=1; pass<=10; pass++) { testbdsvdunit_fillsparsede(&d, &e, n, 0.5, _state); testbdsvdunit_testbdsvdproblem(&d, &e, n, &materr, &orterr, &wsorted, &wfailed, &failcount, &succcount, _state); testbdsvdunit_fillsparsede(&d, &e, n, 0.8, _state); testbdsvdunit_testbdsvdproblem(&d, &e, n, &materr, &orterr, &wsorted, &wfailed, &failcount, &succcount, _state); testbdsvdunit_fillsparsede(&d, &e, n, 0.9, _state); testbdsvdunit_testbdsvdproblem(&d, &e, n, &materr, &orterr, &wsorted, &wfailed, &failcount, &succcount, _state); testbdsvdunit_fillsparsede(&d, &e, n, 0.95, _state); testbdsvdunit_testbdsvdproblem(&d, &e, n, &materr, &orterr, &wsorted, &wfailed, &failcount, &succcount, _state); } } /* * report */ failr = (double)failcount/(double)(succcount+failcount); waserrors = ((wfailed||ae_fp_greater(materr,threshold))||ae_fp_greater(orterr,threshold))||!wsorted; if( !silent ) { printf("TESTING BIDIAGONAL SVD DECOMPOSITION\n"); printf("SVD decomposition error: %5.3e\n", (double)(materr)); printf("SVD orthogonality error: %5.3e\n", (double)(orterr)); printf("Singular values order: "); if( wsorted ) { printf("OK\n"); } else { printf("FAILED\n"); } printf("Always converged: "); if( !wfailed ) { printf("YES\n"); } else { printf("NO\n"); printf("Fail ratio: %5.3f\n", (double)(failr)); } printf("Threshold: %5.3e\n", (double)(threshold)); if( waserrors ) { printf("TEST FAILED\n"); } else { printf("TEST PASSED\n"); } printf("\n\n"); } result = !waserrors; ae_frame_leave(_state); return result; } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. 
*************************************************************************/ ae_bool _pexec_testbdsvd(ae_bool silent, ae_state *_state) { return testbdsvd(silent, _state); } static void testbdsvdunit_fillidentity(/* Real */ ae_matrix* a, ae_int_t n, ae_state *_state) { ae_int_t i; ae_int_t j; ae_matrix_set_length(a, n-1+1, n-1+1, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { if( i==j ) { a->ptr.pp_double[i][j] = (double)(1); } else { a->ptr.pp_double[i][j] = (double)(0); } } } } static void testbdsvdunit_fillsparsede(/* Real */ ae_vector* d, /* Real */ ae_vector* e, ae_int_t n, double sparcity, ae_state *_state) { ae_int_t i; ae_vector_set_length(d, n-1+1, _state); ae_vector_set_length(e, ae_maxint(0, n-2, _state)+1, _state); for(i=0; i<=n-1; i++) { if( ae_fp_greater_eq(ae_randomreal(_state),sparcity) ) { d->ptr.p_double[i] = 2*ae_randomreal(_state)-1; } else { d->ptr.p_double[i] = (double)(0); } } for(i=0; i<=n-2; i++) { if( ae_fp_greater_eq(ae_randomreal(_state),sparcity) ) { e->ptr.p_double[i] = 2*ae_randomreal(_state)-1; } else { e->ptr.p_double[i] = (double)(0); } } } static void testbdsvdunit_getbdsvderror(/* Real */ ae_vector* d, /* Real */ ae_vector* e, ae_int_t n, ae_bool isupper, /* Real */ ae_matrix* u, /* Real */ ae_matrix* c, /* Real */ ae_vector* w, /* Real */ ae_matrix* vt, double* materr, double* orterr, ae_bool* wsorted, ae_state *_state) { ae_int_t i; ae_int_t j; ae_int_t k; double locerr; double sm; /* * decomposition error */ locerr = (double)(0); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { sm = (double)(0); for(k=0; k<=n-1; k++) { sm = sm+w->ptr.p_double[k]*u->ptr.pp_double[i][k]*vt->ptr.pp_double[k][j]; } if( isupper ) { if( i==j ) { locerr = ae_maxreal(locerr, ae_fabs(d->ptr.p_double[i]-sm, _state), _state); } else { if( i==j-1 ) { locerr = ae_maxreal(locerr, ae_fabs(e->ptr.p_double[i]-sm, _state), _state); } else { locerr = ae_maxreal(locerr, ae_fabs(sm, _state), _state); } } } else { if( i==j ) { locerr = ae_maxreal(locerr, ae_fabs(d->ptr.p_double[i]-sm, _state), _state); } else { if( i-1==j ) { locerr = ae_maxreal(locerr, ae_fabs(e->ptr.p_double[j]-sm, _state), _state); } else { locerr = ae_maxreal(locerr, ae_fabs(sm, _state), _state); } } } } } *materr = ae_maxreal(*materr, locerr, _state); /* * check for C = U' * we consider it as decomposition error */ locerr = (double)(0); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { locerr = ae_maxreal(locerr, ae_fabs(u->ptr.pp_double[i][j]-c->ptr.pp_double[j][i], _state), _state); } } *materr = ae_maxreal(*materr, locerr, _state); /* * orthogonality error */ locerr = (double)(0); for(i=0; i<=n-1; i++) { for(j=i; j<=n-1; j++) { sm = ae_v_dotproduct(&u->ptr.pp_double[0][i], u->stride, &u->ptr.pp_double[0][j], u->stride, ae_v_len(0,n-1)); if( i!=j ) { locerr = ae_maxreal(locerr, ae_fabs(sm, _state), _state); } else { locerr = ae_maxreal(locerr, ae_fabs(sm-1, _state), _state); } sm = ae_v_dotproduct(&vt->ptr.pp_double[i][0], 1, &vt->ptr.pp_double[j][0], 1, ae_v_len(0,n-1)); if( i!=j ) { locerr = ae_maxreal(locerr, ae_fabs(sm, _state), _state); } else { locerr = ae_maxreal(locerr, ae_fabs(sm-1, _state), _state); } } } *orterr = ae_maxreal(*orterr, locerr, _state); /* * values order error */ for(i=1; i<=n-1; i++) { if( ae_fp_greater(w->ptr.p_double[i],w->ptr.p_double[i-1]) ) { *wsorted = ae_false; } } } static void testbdsvdunit_checksvdmultiplication(/* Real */ ae_vector* d, /* Real */ ae_vector* e, ae_int_t n, ae_bool isupper, /* Real */ ae_matrix* u, /* Real */ ae_matrix* c, /* Real */ ae_vector* w, /* Real */ 
ae_matrix* vt, double* err, ae_state *_state) { ae_frame _frame_block; ae_int_t i; ae_int_t j; ae_vector wt; ae_matrix u2; ae_matrix c2; ae_matrix vt2; ae_matrix u1; ae_matrix c1; ae_matrix vt1; ae_int_t nru; ae_int_t ncc; ae_int_t ncvt; ae_int_t pass; hqrndstate rs; double v; ae_frame_make(_state, &_frame_block); ae_vector_init(&wt, 0, DT_REAL, _state); ae_matrix_init(&u2, 0, 0, DT_REAL, _state); ae_matrix_init(&c2, 0, 0, DT_REAL, _state); ae_matrix_init(&vt2, 0, 0, DT_REAL, _state); ae_matrix_init(&u1, 0, 0, DT_REAL, _state); ae_matrix_init(&c1, 0, 0, DT_REAL, _state); ae_matrix_init(&vt1, 0, 0, DT_REAL, _state); _hqrndstate_init(&rs, _state); hqrndrandomize(&rs, _state); ae_vector_set_length(&wt, n, _state); /* * Perform nonsquare SVD */ for(pass=1; pass<=20; pass++) { /* * Problem size */ nru = hqrnduniformi(&rs, 2*n, _state); ncc = hqrnduniformi(&rs, 2*n, _state); ncvt = hqrnduniformi(&rs, 2*n, _state); /* * Reference matrices (copy 1) and working matrices (copy 2) */ for(i=0; i<=n-1; i++) { wt.ptr.p_double[i] = d->ptr.p_double[i]; } if( nru>0 ) { /* * init U1/U2 */ ae_matrix_set_length(&u1, nru, n, _state); ae_matrix_set_length(&u2, nru, n, _state); for(i=0; i<=u1.rows-1; i++) { for(j=0; j<=u1.cols-1; j++) { u1.ptr.pp_double[i][j] = hqrnduniformr(&rs, _state)-0.5; u2.ptr.pp_double[i][j] = u1.ptr.pp_double[i][j]; } } } else { /* * Set U1/U2 to 1x1 matrices; working with 1x1 matrices allows * to test correctness of code which passes them to MKL. */ ae_matrix_set_length(&u1, 1, 1, _state); ae_matrix_set_length(&u2, 1, 1, _state); } if( ncc>0 ) { ae_matrix_set_length(&c1, n, ncc, _state); ae_matrix_set_length(&c2, n, ncc, _state); for(i=0; i<=c1.rows-1; i++) { for(j=0; j<=c1.cols-1; j++) { c1.ptr.pp_double[i][j] = hqrnduniformr(&rs, _state)-0.5; c2.ptr.pp_double[i][j] = c1.ptr.pp_double[i][j]; } } } else { /* * Set C1/C1 to 1x1 matrices; working with 1x1 matrices allows * to test correctness of code which passes them to MKL. */ ae_matrix_set_length(&c1, 1, 1, _state); ae_matrix_set_length(&c2, 1, 1, _state); } if( ncvt>0 ) { ae_matrix_set_length(&vt1, n, ncvt, _state); ae_matrix_set_length(&vt2, n, ncvt, _state); for(i=0; i<=vt1.rows-1; i++) { for(j=0; j<=vt1.cols-1; j++) { vt1.ptr.pp_double[i][j] = hqrnduniformr(&rs, _state)-0.5; vt2.ptr.pp_double[i][j] = vt1.ptr.pp_double[i][j]; } } } else { /* * Set VT1/VT1 to 1x1 matrices; working with 1x1 matrices allows * to test correctness of code which passes them to MKL. 
*/ ae_matrix_set_length(&vt1, 1, 1, _state); ae_matrix_set_length(&vt2, 1, 1, _state); } /* * SVD with non-square U/C/VT */ if( !rmatrixbdsvd(&wt, e, n, isupper, ae_fp_greater(hqrnduniformr(&rs, _state),(double)(0)), &u2, nru, &c2, ncc, &vt2, ncvt, _state) ) { *err = 1.0; ae_frame_leave(_state); return; } for(i=0; i<=nru-1; i++) { for(j=0; j<=u2.cols-1; j++) { v = ae_v_dotproduct(&u1.ptr.pp_double[i][0], 1, &u->ptr.pp_double[0][j], u->stride, ae_v_len(0,n-1)); *err = ae_maxreal(*err, ae_fabs(v-u2.ptr.pp_double[i][j], _state), _state); } } for(i=0; i<=c2.rows-1; i++) { for(j=0; j<=ncc-1; j++) { v = ae_v_dotproduct(&c->ptr.pp_double[i][0], 1, &c1.ptr.pp_double[0][j], c1.stride, ae_v_len(0,n-1)); *err = ae_maxreal(*err, ae_fabs(v-c2.ptr.pp_double[i][j], _state), _state); } } for(i=0; i<=vt2.rows-1; i++) { for(j=0; j<=ncvt-1; j++) { v = ae_v_dotproduct(&vt->ptr.pp_double[i][0], 1, &vt1.ptr.pp_double[0][j], vt1.stride, ae_v_len(0,n-1)); *err = ae_maxreal(*err, ae_fabs(v-vt2.ptr.pp_double[i][j], _state), _state); } } } ae_frame_leave(_state); } static void testbdsvdunit_testbdsvdproblem(/* Real */ ae_vector* d, /* Real */ ae_vector* e, ae_int_t n, double* materr, double* orterr, ae_bool* wsorted, ae_bool* wfailed, ae_int_t* failcount, ae_int_t* succcount, ae_state *_state) { ae_frame _frame_block; ae_matrix u; ae_matrix vt; ae_matrix c; ae_vector w; ae_int_t i; double mx; ae_frame_make(_state, &_frame_block); ae_matrix_init(&u, 0, 0, DT_REAL, _state); ae_matrix_init(&vt, 0, 0, DT_REAL, _state); ae_matrix_init(&c, 0, 0, DT_REAL, _state); ae_vector_init(&w, 0, DT_REAL, _state); mx = (double)(0); for(i=0; i<=n-1; i++) { if( ae_fp_greater(ae_fabs(d->ptr.p_double[i], _state),mx) ) { mx = ae_fabs(d->ptr.p_double[i], _state); } } for(i=0; i<=n-2; i++) { if( ae_fp_greater(ae_fabs(e->ptr.p_double[i], _state),mx) ) { mx = ae_fabs(e->ptr.p_double[i], _state); } } if( ae_fp_eq(mx,(double)(0)) ) { mx = (double)(1); } /* * Upper BDSVD tests */ ae_vector_set_length(&w, n-1+1, _state); testbdsvdunit_fillidentity(&u, n, _state); testbdsvdunit_fillidentity(&vt, n, _state); testbdsvdunit_fillidentity(&c, n, _state); for(i=0; i<=n-1; i++) { w.ptr.p_double[i] = d->ptr.p_double[i]; } if( !rmatrixbdsvd(&w, e, n, ae_true, ae_false, &u, n, &c, n, &vt, n, _state) ) { *failcount = *failcount+1; *wfailed = ae_true; ae_frame_leave(_state); return; } testbdsvdunit_getbdsvderror(d, e, n, ae_true, &u, &c, &w, &vt, materr, orterr, wsorted, _state); testbdsvdunit_checksvdmultiplication(d, e, n, ae_true, &u, &c, &w, &vt, materr, _state); testbdsvdunit_fillidentity(&u, n, _state); testbdsvdunit_fillidentity(&vt, n, _state); testbdsvdunit_fillidentity(&c, n, _state); for(i=0; i<=n-1; i++) { w.ptr.p_double[i] = d->ptr.p_double[i]; } if( !rmatrixbdsvd(&w, e, n, ae_true, ae_true, &u, n, &c, n, &vt, n, _state) ) { *failcount = *failcount+1; *wfailed = ae_true; ae_frame_leave(_state); return; } testbdsvdunit_getbdsvderror(d, e, n, ae_true, &u, &c, &w, &vt, materr, orterr, wsorted, _state); testbdsvdunit_checksvdmultiplication(d, e, n, ae_true, &u, &c, &w, &vt, materr, _state); /* * Lower BDSVD tests */ ae_vector_set_length(&w, n-1+1, _state); testbdsvdunit_fillidentity(&u, n, _state); testbdsvdunit_fillidentity(&vt, n, _state); testbdsvdunit_fillidentity(&c, n, _state); for(i=0; i<=n-1; i++) { w.ptr.p_double[i] = d->ptr.p_double[i]; } if( !rmatrixbdsvd(&w, e, n, ae_false, ae_false, &u, n, &c, n, &vt, n, _state) ) { *failcount = *failcount+1; *wfailed = ae_true; ae_frame_leave(_state); return; } testbdsvdunit_getbdsvderror(d, e, n, 
ae_false, &u, &c, &w, &vt, materr, orterr, wsorted, _state); testbdsvdunit_checksvdmultiplication(d, e, n, ae_false, &u, &c, &w, &vt, materr, _state); testbdsvdunit_fillidentity(&u, n, _state); testbdsvdunit_fillidentity(&vt, n, _state); testbdsvdunit_fillidentity(&c, n, _state); for(i=0; i<=n-1; i++) { w.ptr.p_double[i] = d->ptr.p_double[i]; } if( !rmatrixbdsvd(&w, e, n, ae_false, ae_true, &u, n, &c, n, &vt, n, _state) ) { *failcount = *failcount+1; *wfailed = ae_true; ae_frame_leave(_state); return; } testbdsvdunit_getbdsvderror(d, e, n, ae_false, &u, &c, &w, &vt, materr, orterr, wsorted, _state); testbdsvdunit_checksvdmultiplication(d, e, n, ae_false, &u, &c, &w, &vt, materr, _state); /* * update counter */ *succcount = *succcount+1; ae_frame_leave(_state); } static void testblasunit_naivematrixmatrixmultiply(/* Real */ ae_matrix* a, ae_int_t ai1, ae_int_t ai2, ae_int_t aj1, ae_int_t aj2, ae_bool transa, /* Real */ ae_matrix* b, ae_int_t bi1, ae_int_t bi2, ae_int_t bj1, ae_int_t bj2, ae_bool transb, double alpha, /* Real */ ae_matrix* c, ae_int_t ci1, ae_int_t ci2, ae_int_t cj1, ae_int_t cj2, double beta, ae_state *_state); ae_bool testblas(ae_bool silent, ae_state *_state) { ae_frame _frame_block; ae_int_t pass; ae_int_t passcount; ae_int_t n; ae_int_t i; ae_int_t i1; ae_int_t i2; ae_int_t j; ae_int_t j1; ae_int_t j2; ae_int_t l; ae_int_t k; ae_int_t r; ae_int_t i3; ae_int_t j3; ae_int_t col1; ae_int_t col2; ae_int_t row1; ae_int_t row2; ae_vector x1; ae_vector x2; ae_matrix a; ae_matrix b; ae_matrix c1; ae_matrix c2; double err; double e1; double e2; double e3; double v; double scl1; double scl2; double scl3; ae_bool was1; ae_bool was2; ae_bool trans1; ae_bool trans2; double threshold; ae_bool n2errors; ae_bool hsnerrors; ae_bool amaxerrors; ae_bool mverrors; ae_bool iterrors; ae_bool cterrors; ae_bool mmerrors; ae_bool waserrors; ae_bool result; ae_frame_make(_state, &_frame_block); ae_vector_init(&x1, 0, DT_REAL, _state); ae_vector_init(&x2, 0, DT_REAL, _state); ae_matrix_init(&a, 0, 0, DT_REAL, _state); ae_matrix_init(&b, 0, 0, DT_REAL, _state); ae_matrix_init(&c1, 0, 0, DT_REAL, _state); ae_matrix_init(&c2, 0, 0, DT_REAL, _state); n2errors = ae_false; amaxerrors = ae_false; hsnerrors = ae_false; mverrors = ae_false; iterrors = ae_false; cterrors = ae_false; mmerrors = ae_false; waserrors = ae_false; threshold = 10000*ae_machineepsilon; /* * Test Norm2 */ passcount = 1000; e1 = (double)(0); e2 = (double)(0); e3 = (double)(0); scl2 = 0.5*ae_maxrealnumber; scl3 = 2*ae_minrealnumber; for(pass=1; pass<=passcount; pass++) { n = 1+ae_randominteger(1000, _state); i1 = ae_randominteger(10, _state); i2 = n+i1-1; ae_vector_set_length(&x1, i2+1, _state); ae_vector_set_length(&x2, i2+1, _state); for(i=i1; i<=i2; i++) { x1.ptr.p_double[i] = 2*ae_randomreal(_state)-1; } v = (double)(0); for(i=i1; i<=i2; i++) { v = v+ae_sqr(x1.ptr.p_double[i], _state); } v = ae_sqrt(v, _state); e1 = ae_maxreal(e1, ae_fabs(v-vectornorm2(&x1, i1, i2, _state), _state), _state); for(i=i1; i<=i2; i++) { x2.ptr.p_double[i] = scl2*x1.ptr.p_double[i]; } e2 = ae_maxreal(e2, ae_fabs(v*scl2-vectornorm2(&x2, i1, i2, _state), _state), _state); for(i=i1; i<=i2; i++) { x2.ptr.p_double[i] = scl3*x1.ptr.p_double[i]; } e3 = ae_maxreal(e3, ae_fabs(v*scl3-vectornorm2(&x2, i1, i2, _state), _state), _state); } e2 = e2/scl2; e3 = e3/scl3; n2errors = (ae_fp_greater_eq(e1,threshold)||ae_fp_greater_eq(e2,threshold))||ae_fp_greater_eq(e3,threshold); /* * Testing VectorAbsMax, Column/Row AbsMax */ ae_vector_set_length(&x1, 5+1, _state); 
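/*
 * Note on the Norm2 test above: the same vector is also passed to
 * VectorNorm2() after scaling by 0.5*MaxRealNumber and by 2*MinRealNumber,
 * and the scaled results are divided back by the scale factor, so the test
 * also fails if the implementation overflows or underflows on extreme inputs.
 *
 * The AbsMax checks below rely on VectorIdxAbsMax(X,I1,I2) returning the
 * index of the entry with the largest absolute value on [I1,I2]: with the
 * data filled in next this is 5 (|-3.0|) for [1,5], 1 (|2.0|) for [1,4]
 * and 3 (|-1.3|) for [2,4].
 */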
x1.ptr.p_double[1] = 2.0; x1.ptr.p_double[2] = 0.2; x1.ptr.p_double[3] = -1.3; x1.ptr.p_double[4] = 0.7; x1.ptr.p_double[5] = -3.0; amaxerrors = (vectoridxabsmax(&x1, 1, 5, _state)!=5||vectoridxabsmax(&x1, 1, 4, _state)!=1)||vectoridxabsmax(&x1, 2, 4, _state)!=3; n = 30; ae_vector_set_length(&x1, n+1, _state); ae_matrix_set_length(&a, n+1, n+1, _state); for(i=1; i<=n; i++) { for(j=1; j<=n; j++) { a.ptr.pp_double[i][j] = 2*ae_randomreal(_state)-1; } } was1 = ae_false; was2 = ae_false; for(pass=1; pass<=1000; pass++) { j = 1+ae_randominteger(n, _state); i1 = 1+ae_randominteger(n, _state); i2 = i1+ae_randominteger(n+1-i1, _state); ae_v_move(&x1.ptr.p_double[i1], 1, &a.ptr.pp_double[i1][j], a.stride, ae_v_len(i1,i2)); if( vectoridxabsmax(&x1, i1, i2, _state)!=columnidxabsmax(&a, i1, i2, j, _state) ) { was1 = ae_true; } i = 1+ae_randominteger(n, _state); j1 = 1+ae_randominteger(n, _state); j2 = j1+ae_randominteger(n+1-j1, _state); ae_v_move(&x1.ptr.p_double[j1], 1, &a.ptr.pp_double[i][j1], 1, ae_v_len(j1,j2)); if( vectoridxabsmax(&x1, j1, j2, _state)!=rowidxabsmax(&a, j1, j2, i, _state) ) { was2 = ae_true; } } amaxerrors = (amaxerrors||was1)||was2; /* * Testing upper Hessenberg 1-norm */ ae_matrix_set_length(&a, 3+1, 3+1, _state); ae_vector_set_length(&x1, 3+1, _state); a.ptr.pp_double[1][1] = (double)(2); a.ptr.pp_double[1][2] = (double)(3); a.ptr.pp_double[1][3] = (double)(1); a.ptr.pp_double[2][1] = (double)(4); a.ptr.pp_double[2][2] = (double)(-5); a.ptr.pp_double[2][3] = (double)(8); a.ptr.pp_double[3][1] = (double)(99); a.ptr.pp_double[3][2] = (double)(3); a.ptr.pp_double[3][3] = (double)(1); hsnerrors = ae_fp_greater(ae_fabs(upperhessenberg1norm(&a, 1, 3, 1, 3, &x1, _state)-11, _state),threshold); /* * Testing MatrixVectorMultiply */ ae_matrix_set_length(&a, 3+1, 5+1, _state); ae_vector_set_length(&x1, 3+1, _state); ae_vector_set_length(&x2, 2+1, _state); a.ptr.pp_double[2][3] = (double)(2); a.ptr.pp_double[2][4] = (double)(-1); a.ptr.pp_double[2][5] = (double)(-1); a.ptr.pp_double[3][3] = (double)(1); a.ptr.pp_double[3][4] = (double)(-2); a.ptr.pp_double[3][5] = (double)(2); x1.ptr.p_double[1] = (double)(1); x1.ptr.p_double[2] = (double)(2); x1.ptr.p_double[3] = (double)(1); x2.ptr.p_double[1] = (double)(-1); x2.ptr.p_double[2] = (double)(-1); matrixvectormultiply(&a, 2, 3, 3, 5, ae_false, &x1, 1, 3, 1.0, &x2, 1, 2, 1.0, _state); matrixvectormultiply(&a, 2, 3, 3, 5, ae_true, &x2, 1, 2, 1.0, &x1, 1, 3, 1.0, _state); e1 = ae_fabs(x1.ptr.p_double[1]+5, _state)+ae_fabs(x1.ptr.p_double[2]-8, _state)+ae_fabs(x1.ptr.p_double[3]+1, _state)+ae_fabs(x2.ptr.p_double[1]+2, _state)+ae_fabs(x2.ptr.p_double[2]+2, _state); x1.ptr.p_double[1] = (double)(1); x1.ptr.p_double[2] = (double)(2); x1.ptr.p_double[3] = (double)(1); x2.ptr.p_double[1] = (double)(-1); x2.ptr.p_double[2] = (double)(-1); matrixvectormultiply(&a, 2, 3, 3, 5, ae_false, &x1, 1, 3, 1.0, &x2, 1, 2, 0.0, _state); matrixvectormultiply(&a, 2, 3, 3, 5, ae_true, &x2, 1, 2, 1.0, &x1, 1, 3, 0.0, _state); e2 = ae_fabs(x1.ptr.p_double[1]+3, _state)+ae_fabs(x1.ptr.p_double[2]-3, _state)+ae_fabs(x1.ptr.p_double[3]+1, _state)+ae_fabs(x2.ptr.p_double[1]+1, _state)+ae_fabs(x2.ptr.p_double[2]+1, _state); mverrors = ae_fp_greater_eq(e1+e2,threshold); /* * testing inplace transpose */ n = 10; ae_matrix_set_length(&a, n+1, n+1, _state); ae_matrix_set_length(&b, n+1, n+1, _state); ae_vector_set_length(&x1, n-1+1, _state); for(i=1; i<=n; i++) { for(j=1; j<=n; j++) { a.ptr.pp_double[i][j] = ae_randomreal(_state); } } passcount = 10000; was1 = ae_false; 
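/*
 * Each pass below picks a random square block A[I1..I2, J1..J2]
 * (with I2-I1 = J2-J1), copies it into B, transposes B in place and then
 * checks element-by-element that B[I1+(J-J1), J1+(I-I1)] equals A[I,J].
 */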
for(pass=1; pass<=passcount; pass++) { i1 = 1+ae_randominteger(n, _state); i2 = i1+ae_randominteger(n-i1+1, _state); j1 = 1+ae_randominteger(n-(i2-i1), _state); j2 = j1+(i2-i1); copymatrix(&a, i1, i2, j1, j2, &b, i1, i2, j1, j2, _state); inplacetranspose(&b, i1, i2, j1, j2, &x1, _state); for(i=i1; i<=i2; i++) { for(j=j1; j<=j2; j++) { if( ae_fp_neq(a.ptr.pp_double[i][j],b.ptr.pp_double[i1+(j-j1)][j1+(i-i1)]) ) { was1 = ae_true; } } } } iterrors = was1; /* * testing copy and transpose */ n = 10; ae_matrix_set_length(&a, n+1, n+1, _state); ae_matrix_set_length(&b, n+1, n+1, _state); for(i=1; i<=n; i++) { for(j=1; j<=n; j++) { a.ptr.pp_double[i][j] = ae_randomreal(_state); } } passcount = 10000; was1 = ae_false; for(pass=1; pass<=passcount; pass++) { i1 = 1+ae_randominteger(n, _state); i2 = i1+ae_randominteger(n-i1+1, _state); j1 = 1+ae_randominteger(n, _state); j2 = j1+ae_randominteger(n-j1+1, _state); copyandtranspose(&a, i1, i2, j1, j2, &b, j1, j2, i1, i2, _state); for(i=i1; i<=i2; i++) { for(j=j1; j<=j2; j++) { if( ae_fp_neq(a.ptr.pp_double[i][j],b.ptr.pp_double[j][i]) ) { was1 = ae_true; } } } } cterrors = was1; /* * Testing MatrixMatrixMultiply */ n = 10; ae_matrix_set_length(&a, 2*n+1, 2*n+1, _state); ae_matrix_set_length(&b, 2*n+1, 2*n+1, _state); ae_matrix_set_length(&c1, 2*n+1, 2*n+1, _state); ae_matrix_set_length(&c2, 2*n+1, 2*n+1, _state); ae_vector_set_length(&x1, n+1, _state); ae_vector_set_length(&x2, n+1, _state); for(i=1; i<=2*n; i++) { for(j=1; j<=2*n; j++) { a.ptr.pp_double[i][j] = ae_randomreal(_state); b.ptr.pp_double[i][j] = ae_randomreal(_state); } } passcount = 1000; was1 = ae_false; for(pass=1; pass<=passcount; pass++) { for(i=1; i<=2*n; i++) { for(j=1; j<=2*n; j++) { c1.ptr.pp_double[i][j] = 2.1*i+3.1*j; c2.ptr.pp_double[i][j] = c1.ptr.pp_double[i][j]; } } l = 1+ae_randominteger(n, _state); k = 1+ae_randominteger(n, _state); r = 1+ae_randominteger(n, _state); i1 = 1+ae_randominteger(n, _state); j1 = 1+ae_randominteger(n, _state); i2 = 1+ae_randominteger(n, _state); j2 = 1+ae_randominteger(n, _state); i3 = 1+ae_randominteger(n, _state); j3 = 1+ae_randominteger(n, _state); trans1 = ae_fp_greater(ae_randomreal(_state),0.5); trans2 = ae_fp_greater(ae_randomreal(_state),0.5); if( trans1 ) { col1 = l; row1 = k; } else { col1 = k; row1 = l; } if( trans2 ) { col2 = k; row2 = r; } else { col2 = r; row2 = k; } scl1 = ae_randomreal(_state); scl2 = ae_randomreal(_state); matrixmatrixmultiply(&a, i1, i1+row1-1, j1, j1+col1-1, trans1, &b, i2, i2+row2-1, j2, j2+col2-1, trans2, scl1, &c1, i3, i3+l-1, j3, j3+r-1, scl2, &x1, _state); testblasunit_naivematrixmatrixmultiply(&a, i1, i1+row1-1, j1, j1+col1-1, trans1, &b, i2, i2+row2-1, j2, j2+col2-1, trans2, scl1, &c2, i3, i3+l-1, j3, j3+r-1, scl2, _state); err = (double)(0); for(i=1; i<=l; i++) { for(j=1; j<=r; j++) { err = ae_maxreal(err, ae_fabs(c1.ptr.pp_double[i3+i-1][j3+j-1]-c2.ptr.pp_double[i3+i-1][j3+j-1], _state), _state); } } if( ae_fp_greater(err,threshold) ) { was1 = ae_true; break; } } mmerrors = was1; /* * report */ waserrors = (((((n2errors||amaxerrors)||hsnerrors)||mverrors)||iterrors)||cterrors)||mmerrors; if( !silent ) { printf("TESTING BLAS\n"); printf("VectorNorm2: "); if( n2errors ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("AbsMax (vector/row/column): "); if( amaxerrors ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("UpperHessenberg1Norm: "); if( hsnerrors ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("MatrixVectorMultiply: "); if( mverrors ) { printf("FAILED\n"); } else { 
printf("OK\n"); } printf("InplaceTranspose: "); if( iterrors ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("CopyAndTranspose: "); if( cterrors ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("MatrixMatrixMultiply: "); if( mmerrors ) { printf("FAILED\n"); } else { printf("OK\n"); } if( waserrors ) { printf("TEST FAILED\n"); } else { printf("TEST PASSED\n"); } printf("\n\n"); } result = !waserrors; ae_frame_leave(_state); return result; } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. *************************************************************************/ ae_bool _pexec_testblas(ae_bool silent, ae_state *_state) { return testblas(silent, _state); } static void testblasunit_naivematrixmatrixmultiply(/* Real */ ae_matrix* a, ae_int_t ai1, ae_int_t ai2, ae_int_t aj1, ae_int_t aj2, ae_bool transa, /* Real */ ae_matrix* b, ae_int_t bi1, ae_int_t bi2, ae_int_t bj1, ae_int_t bj2, ae_bool transb, double alpha, /* Real */ ae_matrix* c, ae_int_t ci1, ae_int_t ci2, ae_int_t cj1, ae_int_t cj2, double beta, ae_state *_state) { ae_frame _frame_block; ae_int_t arows; ae_int_t acols; ae_int_t brows; ae_int_t bcols; ae_int_t i; ae_int_t j; ae_int_t k; ae_int_t l; ae_int_t r; double v; ae_vector x1; ae_vector x2; ae_frame_make(_state, &_frame_block); ae_vector_init(&x1, 0, DT_REAL, _state); ae_vector_init(&x2, 0, DT_REAL, _state); /* * Setup */ if( !transa ) { arows = ai2-ai1+1; acols = aj2-aj1+1; } else { arows = aj2-aj1+1; acols = ai2-ai1+1; } if( !transb ) { brows = bi2-bi1+1; bcols = bj2-bj1+1; } else { brows = bj2-bj1+1; bcols = bi2-bi1+1; } ae_assert(acols==brows, "NaiveMatrixMatrixMultiply: incorrect matrix sizes!", _state); if( ((arows<=0||acols<=0)||brows<=0)||bcols<=0 ) { ae_frame_leave(_state); return; } l = arows; r = bcols; k = acols; ae_vector_set_length(&x1, k+1, _state); ae_vector_set_length(&x2, k+1, _state); for(i=1; i<=l; i++) { for(j=1; j<=r; j++) { if( !transa ) { if( !transb ) { v = ae_v_dotproduct(&b->ptr.pp_double[bi1][bj1+j-1], b->stride, &a->ptr.pp_double[ai1+i-1][aj1], 1, ae_v_len(bi1,bi2)); } else { v = ae_v_dotproduct(&b->ptr.pp_double[bi1+j-1][bj1], 1, &a->ptr.pp_double[ai1+i-1][aj1], 1, ae_v_len(bj1,bj2)); } } else { if( !transb ) { v = ae_v_dotproduct(&b->ptr.pp_double[bi1][bj1+j-1], b->stride, &a->ptr.pp_double[ai1][aj1+i-1], a->stride, ae_v_len(bi1,bi2)); } else { v = ae_v_dotproduct(&b->ptr.pp_double[bi1+j-1][bj1], 1, &a->ptr.pp_double[ai1][aj1+i-1], a->stride, ae_v_len(bj1,bj2)); } } if( ae_fp_eq(beta,(double)(0)) ) { c->ptr.pp_double[ci1+i-1][cj1+j-1] = alpha*v; } else { c->ptr.pp_double[ci1+i-1][cj1+j-1] = beta*c->ptr.pp_double[ci1+i-1][cj1+j-1]+alpha*v; } } } ae_frame_leave(_state); } static void testsvdunit_fillsparsea(/* Real */ ae_matrix* a, ae_int_t m, ae_int_t n, double sparcity, ae_state *_state); static void testsvdunit_getsvderror(/* Real */ ae_matrix* a, ae_int_t m, ae_int_t n, /* Real */ ae_matrix* u, /* Real */ ae_vector* w, /* Real */ ae_matrix* vt, double* materr, double* orterr, ae_bool* wsorted, ae_state *_state); static void testsvdunit_testsvdproblem(/* Real */ ae_matrix* a, ae_int_t m, ae_int_t n, double* materr, double* orterr, double* othererr, ae_bool* wsorted, ae_bool* wfailed, ae_int_t* failcount, ae_int_t* succcount, ae_state *_state); /************************************************************************* Testing SVD decomposition subroutine *************************************************************************/ ae_bool 
testsvd(ae_bool silent, ae_state *_state) { ae_frame _frame_block; ae_matrix a; ae_int_t m; ae_int_t n; ae_int_t maxmn; ae_int_t i; ae_int_t j; ae_int_t gpass; ae_int_t pass; ae_bool waserrors; ae_bool wsorted; ae_bool wfailed; double materr; double orterr; double othererr; double threshold; double failr; ae_int_t failcount; ae_int_t succcount; ae_bool result; ae_frame_make(_state, &_frame_block); ae_matrix_init(&a, 0, 0, DT_REAL, _state); failcount = 0; succcount = 0; materr = (double)(0); orterr = (double)(0); othererr = (double)(0); wsorted = ae_true; wfailed = ae_false; waserrors = ae_false; maxmn = 30; threshold = 5*100*ae_machineepsilon; ae_matrix_set_length(&a, maxmn-1+1, maxmn-1+1, _state); /* * TODO: div by zero fail, convergence fail */ for(gpass=1; gpass<=1; gpass++) { /* * zero matrix, several cases */ for(i=0; i<=maxmn-1; i++) { for(j=0; j<=maxmn-1; j++) { a.ptr.pp_double[i][j] = (double)(0); } } for(i=1; i<=ae_minint(5, maxmn, _state); i++) { for(j=1; j<=ae_minint(5, maxmn, _state); j++) { testsvdunit_testsvdproblem(&a, i, j, &materr, &orterr, &othererr, &wsorted, &wfailed, &failcount, &succcount, _state); } } /* * Long dense matrix */ for(i=0; i<=maxmn-1; i++) { for(j=0; j<=ae_minint(5, maxmn, _state)-1; j++) { a.ptr.pp_double[i][j] = 2*ae_randomreal(_state)-1; } } for(i=1; i<=maxmn; i++) { for(j=1; j<=ae_minint(5, maxmn, _state); j++) { testsvdunit_testsvdproblem(&a, i, j, &materr, &orterr, &othererr, &wsorted, &wfailed, &failcount, &succcount, _state); } } for(i=0; i<=ae_minint(5, maxmn, _state)-1; i++) { for(j=0; j<=maxmn-1; j++) { a.ptr.pp_double[i][j] = 2*ae_randomreal(_state)-1; } } for(i=1; i<=ae_minint(5, maxmn, _state); i++) { for(j=1; j<=maxmn; j++) { testsvdunit_testsvdproblem(&a, i, j, &materr, &orterr, &othererr, &wsorted, &wfailed, &failcount, &succcount, _state); } } /* * Dense matrices */ for(m=1; m<=ae_minint(10, maxmn, _state); m++) { for(n=1; n<=ae_minint(10, maxmn, _state); n++) { for(i=0; i<=m-1; i++) { for(j=0; j<=n-1; j++) { a.ptr.pp_double[i][j] = 2*ae_randomreal(_state)-1; } } testsvdunit_testsvdproblem(&a, m, n, &materr, &orterr, &othererr, &wsorted, &wfailed, &failcount, &succcount, _state); } } /* * Sparse matrices, very sparse matrices, incredible sparse matrices */ for(m=1; m<=10; m++) { for(n=1; n<=10; n++) { for(pass=1; pass<=2; pass++) { testsvdunit_fillsparsea(&a, m, n, 0.8, _state); testsvdunit_testsvdproblem(&a, m, n, &materr, &orterr, &othererr, &wsorted, &wfailed, &failcount, &succcount, _state); testsvdunit_fillsparsea(&a, m, n, 0.9, _state); testsvdunit_testsvdproblem(&a, m, n, &materr, &orterr, &othererr, &wsorted, &wfailed, &failcount, &succcount, _state); testsvdunit_fillsparsea(&a, m, n, 0.95, _state); testsvdunit_testsvdproblem(&a, m, n, &materr, &orterr, &othererr, &wsorted, &wfailed, &failcount, &succcount, _state); } } } } /* * report */ failr = (double)failcount/(double)(succcount+failcount); waserrors = (((wfailed||ae_fp_greater(materr,threshold))||ae_fp_greater(orterr,threshold))||ae_fp_greater(othererr,threshold))||!wsorted; if( !silent ) { printf("TESTING SVD DECOMPOSITION\n"); printf("SVD decomposition error: %5.3e\n", (double)(materr)); printf("SVD orthogonality error: %5.3e\n", (double)(orterr)); printf("SVD with different parameters error: %5.3e\n", (double)(othererr)); printf("Singular values order: "); if( wsorted ) { printf("OK\n"); } else { printf("FAILED\n"); } printf("Always converged: "); if( !wfailed ) { printf("YES\n"); } else { printf("NO\n"); printf("Fail ratio: %5.3f\n", (double)(failr)); } 
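/*
 * Pass criteria: every problem must converge, the decomposition,
 * orthogonality and "different parameters" errors must stay below
 * Threshold = 500*MachineEpsilon, and the singular values must be reported
 * in non-increasing order. FailR is the fraction of problems on which
 * RMatrixSVD reported failure.
 */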
printf("Threshold: %5.3e\n", (double)(threshold)); if( waserrors ) { printf("TEST FAILED\n"); } else { printf("TEST PASSED\n"); } printf("\n\n"); } result = !waserrors; ae_frame_leave(_state); return result; } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. *************************************************************************/ ae_bool _pexec_testsvd(ae_bool silent, ae_state *_state) { return testsvd(silent, _state); } static void testsvdunit_fillsparsea(/* Real */ ae_matrix* a, ae_int_t m, ae_int_t n, double sparcity, ae_state *_state) { ae_int_t i; ae_int_t j; for(i=0; i<=m-1; i++) { for(j=0; j<=n-1; j++) { if( ae_fp_greater_eq(ae_randomreal(_state),sparcity) ) { a->ptr.pp_double[i][j] = 2*ae_randomreal(_state)-1; } else { a->ptr.pp_double[i][j] = (double)(0); } } } } static void testsvdunit_getsvderror(/* Real */ ae_matrix* a, ae_int_t m, ae_int_t n, /* Real */ ae_matrix* u, /* Real */ ae_vector* w, /* Real */ ae_matrix* vt, double* materr, double* orterr, ae_bool* wsorted, ae_state *_state) { ae_int_t i; ae_int_t j; ae_int_t k; ae_int_t minmn; double locerr; double sm; minmn = ae_minint(m, n, _state); /* * decomposition error */ locerr = (double)(0); for(i=0; i<=m-1; i++) { for(j=0; j<=n-1; j++) { sm = (double)(0); for(k=0; k<=minmn-1; k++) { sm = sm+w->ptr.p_double[k]*u->ptr.pp_double[i][k]*vt->ptr.pp_double[k][j]; } locerr = ae_maxreal(locerr, ae_fabs(a->ptr.pp_double[i][j]-sm, _state), _state); } } *materr = ae_maxreal(*materr, locerr, _state); /* * orthogonality error */ locerr = (double)(0); for(i=0; i<=minmn-1; i++) { for(j=i; j<=minmn-1; j++) { sm = ae_v_dotproduct(&u->ptr.pp_double[0][i], u->stride, &u->ptr.pp_double[0][j], u->stride, ae_v_len(0,m-1)); if( i!=j ) { locerr = ae_maxreal(locerr, ae_fabs(sm, _state), _state); } else { locerr = ae_maxreal(locerr, ae_fabs(sm-1, _state), _state); } sm = ae_v_dotproduct(&vt->ptr.pp_double[i][0], 1, &vt->ptr.pp_double[j][0], 1, ae_v_len(0,n-1)); if( i!=j ) { locerr = ae_maxreal(locerr, ae_fabs(sm, _state), _state); } else { locerr = ae_maxreal(locerr, ae_fabs(sm-1, _state), _state); } } } *orterr = ae_maxreal(*orterr, locerr, _state); /* * values order error */ for(i=1; i<=minmn-1; i++) { if( ae_fp_greater(w->ptr.p_double[i],w->ptr.p_double[i-1]) ) { *wsorted = ae_false; } } } static void testsvdunit_testsvdproblem(/* Real */ ae_matrix* a, ae_int_t m, ae_int_t n, double* materr, double* orterr, double* othererr, ae_bool* wsorted, ae_bool* wfailed, ae_int_t* failcount, ae_int_t* succcount, ae_state *_state) { ae_frame _frame_block; ae_matrix u; ae_matrix vt; ae_matrix u2; ae_matrix vt2; ae_vector w; ae_vector w2; ae_int_t i; ae_int_t j; ae_int_t ujob; ae_int_t vtjob; ae_int_t memjob; ae_int_t ucheck; ae_int_t vtcheck; ae_frame_make(_state, &_frame_block); ae_matrix_init(&u, 0, 0, DT_REAL, _state); ae_matrix_init(&vt, 0, 0, DT_REAL, _state); ae_matrix_init(&u2, 0, 0, DT_REAL, _state); ae_matrix_init(&vt2, 0, 0, DT_REAL, _state); ae_vector_init(&w, 0, DT_REAL, _state); ae_vector_init(&w2, 0, DT_REAL, _state); /* * Main SVD test */ if( !rmatrixsvd(a, m, n, 2, 2, 2, &w, &u, &vt, _state) ) { *failcount = *failcount+1; *wfailed = ae_true; ae_frame_leave(_state); return; } testsvdunit_getsvderror(a, m, n, &u, &w, &vt, materr, orterr, wsorted, _state); /* * Additional SVD tests */ for(ujob=0; ujob<=2; ujob++) { for(vtjob=0; vtjob<=2; vtjob++) { for(memjob=0; memjob<=2; memjob++) { if( !rmatrixsvd(a, m, n, ujob, vtjob, memjob, &w2, &u2, &vt2, _state) 
) { *failcount = *failcount+1; *wfailed = ae_true; ae_frame_leave(_state); return; } ucheck = 0; if( ujob==1 ) { ucheck = ae_minint(m, n, _state); } if( ujob==2 ) { ucheck = m; } vtcheck = 0; if( vtjob==1 ) { vtcheck = ae_minint(m, n, _state); } if( vtjob==2 ) { vtcheck = n; } for(i=0; i<=m-1; i++) { for(j=0; j<=ucheck-1; j++) { *othererr = ae_maxreal(*othererr, ae_fabs(u.ptr.pp_double[i][j]-u2.ptr.pp_double[i][j], _state), _state); } } for(i=0; i<=vtcheck-1; i++) { for(j=0; j<=n-1; j++) { *othererr = ae_maxreal(*othererr, ae_fabs(vt.ptr.pp_double[i][j]-vt2.ptr.pp_double[i][j], _state), _state); } } for(i=0; i<=ae_minint(m, n, _state)-1; i++) { *othererr = ae_maxreal(*othererr, ae_fabs(w.ptr.p_double[i]-w2.ptr.p_double[i], _state), _state); } } } } /* * update counter */ *succcount = *succcount+1; ae_frame_leave(_state); } static void testoptservunit_testprec(ae_bool* wereerrors, ae_state *_state); ae_bool testoptserv(ae_bool silent, ae_state *_state) { ae_bool precerrors; ae_bool wereerrors; ae_bool result; precerrors = ae_false; testoptservunit_testprec(&precerrors, _state); /* * report */ wereerrors = precerrors; if( !silent ) { printf("TESTING OPTSERV\n"); printf("TESTS: \n"); printf("* PRECONDITIONERS "); if( precerrors ) { printf("FAILED\n"); } else { printf("OK\n"); } if( wereerrors ) { printf("TEST FAILED\n"); } else { printf("TEST PASSED\n"); } printf("\n\n"); } result = !wereerrors; return result; } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. *************************************************************************/ ae_bool _pexec_testoptserv(ae_bool silent, ae_state *_state) { return testoptserv(silent, _state); } /************************************************************************* This function checks preconditioning functions On failure sets error flag. *************************************************************************/ static void testoptservunit_testprec(ae_bool* wereerrors, ae_state *_state) { ae_frame _frame_block; ae_int_t n; ae_int_t k; ae_int_t i; ae_int_t j; ae_int_t i0; ae_int_t j0; ae_int_t j1; double v; double rho; double theta; double tolg; ae_matrix va; ae_vector vc; ae_vector vd; ae_vector vb; ae_vector s0; ae_vector s1; ae_vector s2; ae_vector g; precbuflbfgs buf; precbuflowrank lowrankbuf; ae_vector norms; ae_matrix sk; ae_matrix yk; ae_matrix bk; ae_vector bksk; ae_vector tmp; matinvreport rep; hqrndstate rs; ae_frame_make(_state, &_frame_block); ae_matrix_init(&va, 0, 0, DT_REAL, _state); ae_vector_init(&vc, 0, DT_REAL, _state); ae_vector_init(&vd, 0, DT_REAL, _state); ae_vector_init(&vb, 0, DT_REAL, _state); ae_vector_init(&s0, 0, DT_REAL, _state); ae_vector_init(&s1, 0, DT_REAL, _state); ae_vector_init(&s2, 0, DT_REAL, _state); ae_vector_init(&g, 0, DT_REAL, _state); _precbuflbfgs_init(&buf, _state); _precbuflowrank_init(&lowrankbuf, _state); ae_vector_init(&norms, 0, DT_REAL, _state); ae_matrix_init(&sk, 0, 0, DT_REAL, _state); ae_matrix_init(&yk, 0, 0, DT_REAL, _state); ae_matrix_init(&bk, 0, 0, DT_REAL, _state); ae_vector_init(&bksk, 0, DT_REAL, _state); ae_vector_init(&tmp, 0, DT_REAL, _state); _matinvreport_init(&rep, _state); _hqrndstate_init(&rs, _state); hqrndrandomize(&rs, _state); /* * Test for inexact L-BFGS preconditioner. * * We generate QP problem 0.5*x'*H*x, with random H=D+V'*C*V. * Different K's, from 0 to N, are tried. 
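* (The reference implementation below builds the approximate Hessian
* explicitly: starting from B = diag(D), the textbook BFGS update
*     B := B + (y*y')/(y'*s) - (B*s)*(B*s)'/(s'*B*s)
* is applied once per update pair (s,y), and inv(B) is then applied to the
* trial vector.)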
We test preconditioner * code which uses compact L-BFGS update against reference implementation * which uses non-compact BFGS scheme. * * For each K we perform two tests: first for KxN non-zero matrix V, * second one for NxN matrix V with last N-K rows set to zero. Last test * checks algorithm's ability to handle zero updates. */ tolg = 1.0E-9; for(n=1; n<=10; n++) { for(k=0; k<=n; k++) { /* * Prepare problem: * * VD, VC, VA, with VC/VA reordered by ascending of VC[i]*norm(VA[i,...])^2 * * trial vector S (copies are stored to S0,S1,S2) */ ae_vector_set_length(&vd, n, _state); ae_vector_set_length(&s0, n, _state); ae_vector_set_length(&s1, n, _state); ae_vector_set_length(&s2, n, _state); for(i=0; i<=n-1; i++) { vd.ptr.p_double[i] = ae_exp(hqrndnormal(&rs, _state), _state); s0.ptr.p_double[i] = hqrndnormal(&rs, _state); s1.ptr.p_double[i] = s0.ptr.p_double[i]; s2.ptr.p_double[i] = s0.ptr.p_double[i]; } rmatrixrndcond(n, 1.0E2, &va, _state); rvectorsetlengthatleast(&vc, n, _state); for(i=0; i<=k-1; i++) { vc.ptr.p_double[i] = ae_exp(hqrndnormal(&rs, _state), _state); } for(i=k; i<=n-1; i++) { vc.ptr.p_double[i] = (double)(0); for(j=0; j<=n-1; j++) { va.ptr.pp_double[i][j] = 0.0; } } ae_vector_set_length(&norms, k, _state); for(i=0; i<=k-1; i++) { v = ae_v_dotproduct(&va.ptr.pp_double[i][0], 1, &va.ptr.pp_double[i][0], 1, ae_v_len(0,n-1)); norms.ptr.p_double[i] = v*vc.ptr.p_double[i]; } for(i=0; i<=k-1; i++) { for(j=0; j<=k-2; j++) { if( ae_fp_greater(norms.ptr.p_double[j],norms.ptr.p_double[j+1]) ) { /* * Swap elements J and J+1 */ v = norms.ptr.p_double[j]; norms.ptr.p_double[j] = norms.ptr.p_double[j+1]; norms.ptr.p_double[j+1] = v; v = vc.ptr.p_double[j]; vc.ptr.p_double[j] = vc.ptr.p_double[j+1]; vc.ptr.p_double[j+1] = v; for(j0=0; j0<=n-1; j0++) { v = va.ptr.pp_double[j][j0]; va.ptr.pp_double[j][j0] = va.ptr.pp_double[j+1][j0]; va.ptr.pp_double[j+1][j0] = v; } } } } /* * Generate reference model and apply it to S2: * * generate approximate Hessian Bk * * calculate inv(Bk) * * calculate inv(Bk)*S2, store to S2 */ rmatrixsetlengthatleast(&sk, k, n, _state); rmatrixsetlengthatleast(&yk, k, n, _state); ae_matrix_set_length(&bk, n, n, _state); ae_vector_set_length(&bksk, n, _state); ae_vector_set_length(&tmp, n, _state); for(i=0; i<=k-1; i++) { ae_v_move(&sk.ptr.pp_double[i][0], 1, &va.ptr.pp_double[i][0], 1, ae_v_len(0,n-1)); v = ae_v_dotproduct(&va.ptr.pp_double[i][0], 1, &sk.ptr.pp_double[i][0], 1, ae_v_len(0,n-1)); v = v*vc.ptr.p_double[i]; for(j=0; j<=n-1; j++) { yk.ptr.pp_double[i][j] = vd.ptr.p_double[j]*sk.ptr.pp_double[i][j]+va.ptr.pp_double[i][j]*v; } } for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { if( i==j ) { bk.ptr.pp_double[i][i] = vd.ptr.p_double[i]; } else { bk.ptr.pp_double[i][j] = 0.0; } } } for(i=0; i<=k-1; i++) { theta = 0.0; for(j0=0; j0<=n-1; j0++) { bksk.ptr.p_double[j0] = (double)(0); for(j1=0; j1<=n-1; j1++) { theta = theta+sk.ptr.pp_double[i][j0]*bk.ptr.pp_double[j0][j1]*sk.ptr.pp_double[i][j1]; bksk.ptr.p_double[j0] = bksk.ptr.p_double[j0]+bk.ptr.pp_double[j0][j1]*sk.ptr.pp_double[i][j1]; } } theta = 1/theta; rho = ae_v_dotproduct(&sk.ptr.pp_double[i][0], 1, &yk.ptr.pp_double[i][0], 1, ae_v_len(0,n-1)); rho = 1/rho; for(j0=0; j0<=n-1; j0++) { for(j1=0; j1<=n-1; j1++) { bk.ptr.pp_double[j0][j1] = bk.ptr.pp_double[j0][j1]+rho*yk.ptr.pp_double[i][j0]*yk.ptr.pp_double[i][j1]; } } for(j0=0; j0<=n-1; j0++) { for(j1=0; j1<=n-1; j1++) { bk.ptr.pp_double[j0][j1] = bk.ptr.pp_double[j0][j1]-theta*bksk.ptr.p_double[j0]*bksk.ptr.p_double[j1]; } } } rmatrixinverse(&bk, n, &j0, 
&rep, _state); for(i=0; i<=n-1; i++) { v = ae_v_dotproduct(&bk.ptr.pp_double[i][0], 1, &s2.ptr.p_double[0], 1, ae_v_len(0,n-1)); tmp.ptr.p_double[i] = v; } for(i=0; i<=n-1; i++) { s2.ptr.p_double[i] = tmp.ptr.p_double[i]; } /* * First test for non-zero V: * * apply preconditioner to X0 * * compare reference model against implementation being tested */ inexactlbfgspreconditioner(&s0, n, &vd, &vc, &va, k, &buf, _state); for(i=0; i<=n-1; i++) { seterrorflag(wereerrors, ae_fp_greater(ae_fabs(s2.ptr.p_double[i]-s0.ptr.p_double[i], _state),tolg), _state); } /* * Second test - N-K zero rows appended to V and rows are * randomly reordered. Doing so should not change result, * algorithm must be able to order rows according to second derivative * and skip zero updates. */ for(i=0; i<=n-1; i++) { i0 = i+hqrnduniformi(&rs, n-i, _state); v = vc.ptr.p_double[i]; vc.ptr.p_double[i] = vc.ptr.p_double[i0]; vc.ptr.p_double[i0] = v; for(j=0; j<=n-1; j++) { v = va.ptr.pp_double[i][j]; va.ptr.pp_double[i][j] = va.ptr.pp_double[i0][j]; va.ptr.pp_double[i0][j] = v; } } inexactlbfgspreconditioner(&s1, n, &vd, &vc, &va, n, &buf, _state); for(i=0; i<=n-1; i++) { seterrorflag(wereerrors, ae_fp_greater(ae_fabs(s2.ptr.p_double[i]-s1.ptr.p_double[i], _state),tolg), _state); } } } /* * Test for exact low-rank preconditioner. * * We generate QP problem 0.5*x'*H*x, with random H=D+V'*C*V. * Different K's, from 0 to N, are tried. We test preconditioner * code which uses Woodbury update against reference implementation * which performs straightforward matrix inversion. * * For each K we perform two tests: first for KxN non-zero matrix V, * second one for NxN matrix V with randomly appended N-K zero rows. * Last test checks algorithm's ability to handle zero updates. */ tolg = 1.0E-9; for(n=1; n<=10; n++) { for(k=0; k<=n; k++) { /* * Prepare problem: * * VD, VC, VA * * trial vector S (copies are stored to S0,S1,S2) */ ae_vector_set_length(&vd, n, _state); ae_vector_set_length(&s0, n, _state); ae_vector_set_length(&s1, n, _state); ae_vector_set_length(&s2, n, _state); for(i=0; i<=n-1; i++) { vd.ptr.p_double[i] = ae_exp(hqrndnormal(&rs, _state), _state); s0.ptr.p_double[i] = hqrndnormal(&rs, _state); s1.ptr.p_double[i] = s0.ptr.p_double[i]; s2.ptr.p_double[i] = s0.ptr.p_double[i]; } rmatrixrndcond(n, 1.0E2, &va, _state); rvectorsetlengthatleast(&vc, n, _state); for(i=0; i<=k-1; i++) { vc.ptr.p_double[i] = ae_exp(hqrndnormal(&rs, _state), _state); } for(i=k; i<=n-1; i++) { vc.ptr.p_double[i] = (double)(0); for(j=0; j<=n-1; j++) { va.ptr.pp_double[i][j] = 0.0; } } /* * Generate reference model and apply it to S2 */ ae_matrix_set_length(&bk, n, n, _state); ae_vector_set_length(&tmp, n, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { if( i==j ) { v = vd.ptr.p_double[i]; } else { v = 0.0; } for(j1=0; j1<=k-1; j1++) { v = v+va.ptr.pp_double[j1][i]*vc.ptr.p_double[j1]*va.ptr.pp_double[j1][j]; } bk.ptr.pp_double[i][j] = v; } } rmatrixinverse(&bk, n, &j, &rep, _state); ae_assert(j>0, "Assertion failed", _state); for(i=0; i<=n-1; i++) { v = 0.0; for(j=0; j<=n-1; j++) { v = v+bk.ptr.pp_double[i][j]*s2.ptr.p_double[j]; } tmp.ptr.p_double[i] = v; } for(i=0; i<=n-1; i++) { s2.ptr.p_double[i] = tmp.ptr.p_double[i]; } /* * First test for non-zero V: * * apply preconditioner to X0 * * compare reference model against implementation being tested */ preparelowrankpreconditioner(&vd, &vc, &va, n, k, &lowrankbuf, _state); applylowrankpreconditioner(&s0, &lowrankbuf, _state); for(i=0; i<=n-1; i++) { seterrorflag(wereerrors, 
ae_fp_greater(ae_fabs(s2.ptr.p_double[i]-s0.ptr.p_double[i], _state),tolg), _state); } /* * Second test - N-K zero rows appended to V and rows are * randomly reordered. Doing so should not change result, * algorithm must be able to order rows according to second derivative * and skip zero updates. */ for(i=0; i<=n-1; i++) { i0 = i+hqrnduniformi(&rs, n-i, _state); v = vc.ptr.p_double[i]; vc.ptr.p_double[i] = vc.ptr.p_double[i0]; vc.ptr.p_double[i0] = v; for(j=0; j<=n-1; j++) { v = va.ptr.pp_double[i][j]; va.ptr.pp_double[i][j] = va.ptr.pp_double[i0][j]; va.ptr.pp_double[i0][j] = v; } } preparelowrankpreconditioner(&vd, &vc, &va, n, n, &lowrankbuf, _state); applylowrankpreconditioner(&s1, &lowrankbuf, _state); for(i=0; i<=n-1; i++) { seterrorflag(wereerrors, ae_fp_greater(ae_fabs(s2.ptr.p_double[i]-s1.ptr.p_double[i], _state),tolg), _state); } } } ae_frame_leave(_state); } ae_bool testsnnls(ae_bool silent, ae_state *_state) { ae_frame _frame_block; ae_bool test0errors; ae_bool test1errors; ae_bool test2errors; ae_bool testnewtonerrors; ae_bool waserrors; double eps; ae_int_t i; ae_int_t j; ae_int_t k; double v; ae_int_t ns; ae_int_t nd; ae_int_t nr; ae_matrix densea; ae_matrix effectivea; ae_vector isconstrained; ae_vector g; ae_vector b; ae_vector x; ae_vector xs; snnlssolver s; hqrndstate rs; double rho; double xtol; ae_int_t nmax; ae_bool result; ae_frame_make(_state, &_frame_block); ae_matrix_init(&densea, 0, 0, DT_REAL, _state); ae_matrix_init(&effectivea, 0, 0, DT_REAL, _state); ae_vector_init(&isconstrained, 0, DT_BOOL, _state); ae_vector_init(&g, 0, DT_REAL, _state); ae_vector_init(&b, 0, DT_REAL, _state); ae_vector_init(&x, 0, DT_REAL, _state); ae_vector_init(&xs, 0, DT_REAL, _state); _snnlssolver_init(&s, _state); _hqrndstate_init(&rs, _state); test0errors = ae_false; test1errors = ae_false; test2errors = ae_false; testnewtonerrors = ae_false; waserrors = ae_false; hqrndrandomize(&rs, _state); nmax = 10; /* * Test 2 (comes first because it is very basic): * * NS=0 * * ND in [1,NMAX] * * NR=ND * * DenseA is diagonal with positive entries * * B is random * * random constraints * Exact solution is known and can be tested */ eps = 1.0E-12; for(nd=1; nd<=nmax; nd++) { /* * Generate problem */ ns = 0; nr = nd; ae_matrix_set_length(&densea, nd, nd, _state); ae_vector_set_length(&b, nd, _state); ae_vector_set_length(&isconstrained, nd, _state); for(i=0; i<=nd-1; i++) { for(j=0; j<=nd-1; j++) { densea.ptr.pp_double[i][j] = (double)(0); } densea.ptr.pp_double[i][i] = (double)(1+hqrnduniformi(&rs, 2, _state)); b.ptr.p_double[i] = (double)((1+hqrnduniformi(&rs, 2, _state))*(2*hqrnduniformi(&rs, 2, _state)-1)); isconstrained.ptr.p_bool[i] = ae_fp_greater(hqrnduniformr(&rs, _state),0.5); } /* * Solve with SNNLS solver */ snnlsinit(0, 0, 0, &s, _state); snnlssetproblem(&s, &densea, &b, 0, nd, nd, _state); for(i=0; i<=nd-1; i++) { if( !isconstrained.ptr.p_bool[i] ) { snnlsdropnnc(&s, i, _state); } } snnlssolve(&s, &x, _state); /* * Check */ for(i=0; i<=nd-1; i++) { if( isconstrained.ptr.p_bool[i] ) { seterrorflag(&test2errors, ae_fp_greater(ae_fabs(x.ptr.p_double[i]-ae_maxreal(b.ptr.p_double[i]/densea.ptr.pp_double[i][i], 0.0, _state), _state),eps), _state); seterrorflag(&test2errors, ae_fp_less(x.ptr.p_double[i],0.0), _state); } else { seterrorflag(&test2errors, ae_fp_greater(ae_fabs(x.ptr.p_double[i]-b.ptr.p_double[i]/densea.ptr.pp_double[i][i], _state),eps), _state); } } } /* * Test 0: * * NS in [0,NMAX] * * ND in [0,NMAX] * * NR in [NS,NS+ND+NMAX] * * NS+ND>0, NR>0 * * about 50% of variables 
are constrained * * we check that constrained gradient is small at the solution */ eps = 1.0E-5; for(ns=0; ns<=nmax; ns++) { for(nd=0; nd<=nmax; nd++) { for(nr=ns; nr<=ns+nd+nmax; nr++) { /* * Skip NS+ND=0, NR=0 */ if( ns+nd==0 ) { continue; } if( nr==0 ) { continue; } /* * Generate problem: * * DenseA, array[NR,ND] * * EffectiveA, array[NR,NS+ND] * * B, array[NR] * * IsConstrained, array[NS+ND] */ if( nd>0 ) { ae_matrix_set_length(&densea, nr, nd, _state); for(i=0; i<=nr-1; i++) { for(j=0; j<=nd-1; j++) { densea.ptr.pp_double[i][j] = 2*ae_randomreal(_state)-1; } } } ae_matrix_set_length(&effectivea, nr, ns+nd, _state); for(i=0; i<=nr-1; i++) { for(j=0; j<=ns+nd-1; j++) { effectivea.ptr.pp_double[i][j] = 0.0; } } for(i=0; i<=ns-1; i++) { effectivea.ptr.pp_double[i][i] = 1.0; } for(i=0; i<=nr-1; i++) { for(j=0; j<=nd-1; j++) { effectivea.ptr.pp_double[i][ns+j] = densea.ptr.pp_double[i][j]; } } ae_vector_set_length(&b, nr, _state); for(i=0; i<=nr-1; i++) { b.ptr.p_double[i] = 2*ae_randomreal(_state)-1; } ae_vector_set_length(&isconstrained, ns+nd, _state); for(i=0; i<=ns+nd-1; i++) { isconstrained.ptr.p_bool[i] = ae_fp_greater(ae_randomreal(_state),0.5); } /* * Solve with SNNLS solver */ snnlsinit(0, 0, 0, &s, _state); snnlssetproblem(&s, &densea, &b, ns, nd, nr, _state); for(i=0; i<=ns+nd-1; i++) { if( !isconstrained.ptr.p_bool[i] ) { snnlsdropnnc(&s, i, _state); } } snnlssolve(&s, &x, _state); /* * Check non-negativity */ for(i=0; i<=ns+nd-1; i++) { seterrorflag(&test0errors, isconstrained.ptr.p_bool[i]&&ae_fp_less(x.ptr.p_double[i],(double)(0)), _state); } /* * Calculate gradient A'*A*x-b'*A. * Check projected gradient (each component must be less than Eps). */ ae_vector_set_length(&g, ns+nd, _state); for(i=0; i<=ns+nd-1; i++) { v = ae_v_dotproduct(&b.ptr.p_double[0], 1, &effectivea.ptr.pp_double[0][i], effectivea.stride, ae_v_len(0,nr-1)); g.ptr.p_double[i] = -v; } for(i=0; i<=nr-1; i++) { v = ae_v_dotproduct(&effectivea.ptr.pp_double[i][0], 1, &x.ptr.p_double[0], 1, ae_v_len(0,ns+nd-1)); ae_v_addd(&g.ptr.p_double[0], 1, &effectivea.ptr.pp_double[i][0], 1, ae_v_len(0,ns+nd-1), v); } for(i=0; i<=ns+nd-1; i++) { if( !isconstrained.ptr.p_bool[i]||ae_fp_greater(x.ptr.p_double[i],(double)(0)) ) { seterrorflag(&test0errors, ae_fp_greater(ae_fabs(g.ptr.p_double[i], _state),eps), _state); } else { seterrorflag(&test0errors, ae_fp_less(g.ptr.p_double[i],-eps), _state); } } } } } /* * Test 1: ability of the solver to take very short steps. * * We solve problem similar to one solver in test 0, but with * progressively decreased magnitude of variables. We generate * problem with already-known solution and compare results against it. 
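* Construction used below: the reference solution XS has entries of magnitude
* proportional to Rho, only components with XS[i]>0 keep their non-negativity
* constraint, and the right-hand side is chosen as B = EffectiveA*XS, so XS
* solves the problem exactly; the solver output must then match XS to within
* Rho*XTol.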
*/ xtol = 1.0E-7; for(ns=0; ns<=nmax; ns++) { for(nd=0; nd<=nmax; nd++) { for(nr=ns; nr<=ns+nd+nmax; nr++) { for(k=0; k<=20; k++) { /* * Skip NS+ND=0, NR=0 * * Skip degenerate problems (NR0 ) { ae_matrix_set_length(&densea, nr, nd, _state); for(i=0; i<=nr-1; i++) { for(j=0; j<=nd-1; j++) { densea.ptr.pp_double[i][j] = 2*hqrnduniformr(&rs, _state)-1; } } } ae_matrix_set_length(&effectivea, nr, ns+nd, _state); for(i=0; i<=nr-1; i++) { for(j=0; j<=ns+nd-1; j++) { effectivea.ptr.pp_double[i][j] = 0.0; } } for(i=0; i<=ns-1; i++) { effectivea.ptr.pp_double[i][i] = 1.0; } for(i=0; i<=nr-1; i++) { for(j=0; j<=nd-1; j++) { effectivea.ptr.pp_double[i][ns+j] = densea.ptr.pp_double[i][j]; } } ae_vector_set_length(&xs, ns+nd, _state); ae_vector_set_length(&isconstrained, ns+nd, _state); for(i=0; i<=ns+nd-1; i++) { xs.ptr.p_double[i] = rho*(hqrnduniformr(&rs, _state)-0.5); isconstrained.ptr.p_bool[i] = ae_fp_greater(xs.ptr.p_double[i],0.0); } ae_vector_set_length(&b, nr, _state); for(i=0; i<=nr-1; i++) { v = 0.0; for(j=0; j<=ns+nd-1; j++) { v = v+effectivea.ptr.pp_double[i][j]*xs.ptr.p_double[j]; } b.ptr.p_double[i] = v; } /* * Solve with SNNLS solver */ snnlsinit(0, 0, 0, &s, _state); snnlssetproblem(&s, &densea, &b, ns, nd, nr, _state); for(i=0; i<=ns+nd-1; i++) { if( !isconstrained.ptr.p_bool[i] ) { snnlsdropnnc(&s, i, _state); } } snnlssolve(&s, &x, _state); /* * Check non-negativity */ for(i=0; i<=ns+nd-1; i++) { seterrorflag(&test1errors, isconstrained.ptr.p_bool[i]&&ae_fp_less(x.ptr.p_double[i],(double)(0)), _state); } /* * Compare with true solution */ for(i=0; i<=ns+nd-1; i++) { seterrorflag(&test1errors, ae_fp_greater(ae_fabs(xs.ptr.p_double[i]-x.ptr.p_double[i], _state),rho*xtol), _state); } } } } } /* * Test for Newton phase: * * NS in [0,NMAX] * * ND in [0,NMAX] * * NR in [NS,NS+ND+NMAX] * * NS+ND>0, NR>0 * * all variables are unconstrained * * S.DebugMaxNewton is set to 1, S.RefinementIts is set to 1, * i.e. algorithm is terminated after one Newton iteration, and no * iterative refinement is used. * * we test that gradient is small at solution, i.e. one Newton iteration * on unconstrained problem is enough to find solution. In case of buggy * Newton solver one iteration won't move us to the solution - it may * decrease function value, but won't find exact solution. * * This test is intended to catch subtle bugs in the Newton solver which * do NOT prevent algorithm from converging to the solution, but slow it * down (convergence becomes linear or even slower). 
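* Since the unconstrained problem is a linear least squares problem (a convex
* quadratic), one full Newton step is expected to reach a stationary point
* regardless of the starting point; the check below therefore requires every
* component of the gradient A'*(A*x-b) to be smaller than Eps after a single
* iteration.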
*/ eps = 1.0E-4; for(ns=0; ns<=nmax; ns++) { for(nd=0; nd<=nmax; nd++) { for(nr=ns; nr<=ns+nd+nmax; nr++) { /* * Skip NS+ND=0, NR=0 */ if( ns+nd==0 ) { continue; } if( nr==0 ) { continue; } /* * Generate problem: * * DenseA, array[NR,ND] * * EffectiveA, array[NR,NS+ND] * * B, array[NR] * * IsConstrained, array[NS+ND] */ if( nd>0 ) { ae_matrix_set_length(&densea, nr, nd, _state); for(i=0; i<=nr-1; i++) { for(j=0; j<=nd-1; j++) { densea.ptr.pp_double[i][j] = hqrndnormal(&rs, _state); } } } ae_matrix_set_length(&effectivea, nr, ns+nd, _state); for(i=0; i<=nr-1; i++) { for(j=0; j<=ns+nd-1; j++) { effectivea.ptr.pp_double[i][j] = 0.0; } } for(i=0; i<=ns-1; i++) { effectivea.ptr.pp_double[i][i] = 1.0; } for(i=0; i<=nr-1; i++) { for(j=0; j<=nd-1; j++) { effectivea.ptr.pp_double[i][ns+j] = densea.ptr.pp_double[i][j]; } } ae_vector_set_length(&b, nr, _state); for(i=0; i<=nr-1; i++) { b.ptr.p_double[i] = hqrndnormal(&rs, _state); } /* * Solve with SNNLS solver */ snnlsinit(0, 0, 0, &s, _state); snnlssetproblem(&s, &densea, &b, ns, nd, nr, _state); for(i=0; i<=ns+nd-1; i++) { snnlsdropnnc(&s, i, _state); } s.debugmaxinnerits = 1; snnlssolve(&s, &x, _state); /* * Calculate gradient A'*A*x-b'*A. * Check projected gradient (each component must be less than Eps). */ ae_vector_set_length(&g, ns+nd, _state); for(i=0; i<=ns+nd-1; i++) { v = ae_v_dotproduct(&b.ptr.p_double[0], 1, &effectivea.ptr.pp_double[0][i], effectivea.stride, ae_v_len(0,nr-1)); g.ptr.p_double[i] = -v; } for(i=0; i<=nr-1; i++) { v = ae_v_dotproduct(&effectivea.ptr.pp_double[i][0], 1, &x.ptr.p_double[0], 1, ae_v_len(0,ns+nd-1)); ae_v_addd(&g.ptr.p_double[0], 1, &effectivea.ptr.pp_double[i][0], 1, ae_v_len(0,ns+nd-1), v); } for(i=0; i<=ns+nd-1; i++) { seterrorflag(&testnewtonerrors, ae_fp_greater(ae_fabs(g.ptr.p_double[i], _state),eps), _state); } } } } /* * report */ waserrors = ((test0errors||test1errors)||test2errors)||testnewtonerrors; if( !silent ) { printf("TESTING SPECIAL NNLS SOLVER\n"); printf("TEST 0: "); if( test0errors ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("TEST 1: "); if( test1errors ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("TEST 2: "); if( test2errors ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("NEWTON PHASE: "); if( testnewtonerrors ) { printf("FAILED\n"); } else { printf("OK\n"); } if( waserrors ) { printf("TEST FAILED\n"); } else { printf("TEST PASSED\n"); } printf("\n\n"); } result = !waserrors; ae_frame_leave(_state); return result; } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. *************************************************************************/ ae_bool _pexec_testsnnls(ae_bool silent, ae_state *_state) { return testsnnls(silent, _state); } static void testsactivesetsunit_testspecproperties(ae_bool* err, ae_state *_state); ae_bool testsactivesets(ae_bool silent, ae_state *_state) { ae_bool waserrors; ae_bool specerr; ae_bool result; specerr = ae_false; testsactivesetsunit_testspecproperties(&specerr, _state); /* * report */ waserrors = specerr; if( !silent ) { printf("TESTING ACTIVE SETS\n"); printf("* SPECIAL PROPERTIES "); if( specerr ) { printf("FAILED\n"); } else { printf("OK\n"); } if( waserrors ) { printf("TEST FAILED\n"); } else { printf("TEST PASSED\n"); } printf("\n\n"); } result = !waserrors; return result; } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. 
*************************************************************************/ ae_bool _pexec_testsactivesets(ae_bool silent, ae_state *_state) { return testsactivesets(silent, _state); } /************************************************************************* This function tests special properties. On failure sets Err to True (leaves it unchanged otherwise) *************************************************************************/ static void testsactivesetsunit_testspecproperties(ae_bool* err, ae_state *_state) { ae_frame _frame_block; ae_int_t i; ae_int_t j; ae_int_t n; ae_int_t nec; ae_int_t nic; double v; double vv; sactiveset state; hqrndstate rs; ae_vector bl; ae_vector bu; ae_vector x; ae_vector s; ae_matrix c; ae_vector ct; ae_int_t scaletype; ae_int_t pass; ae_int_t distortidx; double distortmag; ae_frame_make(_state, &_frame_block); _sactiveset_init(&state, _state); _hqrndstate_init(&rs, _state); ae_vector_init(&bl, 0, DT_REAL, _state); ae_vector_init(&bu, 0, DT_REAL, _state); ae_vector_init(&x, 0, DT_REAL, _state); ae_vector_init(&s, 0, DT_REAL, _state); ae_matrix_init(&c, 0, 0, DT_REAL, _state); ae_vector_init(&ct, 0, DT_INT, _state); hqrndrandomize(&rs, _state); /* * N-dimensional problem with Ne equality and Ni inequality constraints. * * Check that SActiveSet object uses efficient algorithm * to determine initial point: it avoids expensive (N+Ni)-dimensional * QP subproblem when initial point is feasible w.r.t. constraints. * * In order to do so we try to find initial point for a problem with * 2 equality constraints and 1000000 inequality constraints (+box * constraints). Inefficient algorithm will simply fail to allocate * enough memory, so we do not have to perform any checks here. */ n = 5; nec = 2; nic = 1000000; ae_vector_set_length(&x, n, _state); ae_vector_set_length(&bl, n, _state); ae_vector_set_length(&bu, n, _state); for(scaletype=0; scaletype<=1; scaletype++) { /* * Generate problem */ for(i=0; i<=n-1; i++) { x.ptr.p_double[i] = hqrnduniformr(&rs, _state); bl.ptr.p_double[i] = x.ptr.p_double[i]-hqrnduniformr(&rs, _state)*hqrnduniformi(&rs, 2, _state); bu.ptr.p_double[i] = x.ptr.p_double[i]+hqrnduniformr(&rs, _state)*hqrnduniformi(&rs, 2, _state); } ae_matrix_set_length(&c, nec+nic, n+1, _state); ae_vector_set_length(&ct, nec+nic, _state); for(i=0; i<=nec+nic-1; i++) { v = (double)(0); for(j=0; j<=n-1; j++) { c.ptr.pp_double[i][j] = hqrndnormal(&rs, _state); v = v+c.ptr.pp_double[i][j]*x.ptr.p_double[j]; } c.ptr.pp_double[i][n] = v; if( ix.ptr.p_double[0],(double)(100)) ) { if( state->needf||state->needfg ) { state->f = ae_sqr(ae_exp(state->x.ptr.p_double[0], _state)-2, _state)+ae_sqr(ae_sqr(state->x.ptr.p_double[1], _state), _state)+ae_sqr(state->x.ptr.p_double[2]-state->x.ptr.p_double[0], _state); } if( state->needfg ) { state->g.ptr.p_double[0] = 2*(ae_exp(state->x.ptr.p_double[0], _state)-2)*ae_exp(state->x.ptr.p_double[0], _state)+2*(state->x.ptr.p_double[0]-state->x.ptr.p_double[2]); state->g.ptr.p_double[1] = 4*state->x.ptr.p_double[1]*ae_sqr(state->x.ptr.p_double[1], _state); state->g.ptr.p_double[2] = 2*(state->x.ptr.p_double[2]-state->x.ptr.p_double[0]); } } else { if( state->needf||state->needfg ) { state->f = ae_sqrt(ae_maxrealnumber, _state); } if( state->needfg ) { state->g.ptr.p_double[0] = ae_sqrt(ae_maxrealnumber, _state); state->g.ptr.p_double[1] = (double)(0); state->g.ptr.p_double[2] = (double)(0); } } } /************************************************************************* Calculate test function #3 Simple variation of #1, much more 
nonlinear, with non-zero value at minimum. It achieve two goals: * makes unlikely premature convergence of algorithm . * solves some issues with EpsF stopping condition which arise when F(minimum) is zero *************************************************************************/ static void testminlbfgsunit_testfunc3(minlbfgsstate* state, ae_state *_state) { double s; s = 0.001; if( ae_fp_less(state->x.ptr.p_double[0],(double)(100)) ) { if( state->needf||state->needfg ) { state->f = ae_sqr(ae_exp(state->x.ptr.p_double[0], _state)-2, _state)+ae_sqr(ae_sqr(state->x.ptr.p_double[1], _state)+s, _state)+ae_sqr(state->x.ptr.p_double[2]-state->x.ptr.p_double[0], _state); } if( state->needfg ) { state->g.ptr.p_double[0] = 2*(ae_exp(state->x.ptr.p_double[0], _state)-2)*ae_exp(state->x.ptr.p_double[0], _state)+2*(state->x.ptr.p_double[0]-state->x.ptr.p_double[2]); state->g.ptr.p_double[1] = 2*(ae_sqr(state->x.ptr.p_double[1], _state)+s)*2*state->x.ptr.p_double[1]; state->g.ptr.p_double[2] = 2*(state->x.ptr.p_double[2]-state->x.ptr.p_double[0]); } } else { if( state->needf||state->needfg ) { state->f = ae_sqrt(ae_maxrealnumber, _state); } if( state->needfg ) { state->g.ptr.p_double[0] = ae_sqrt(ae_maxrealnumber, _state); state->g.ptr.p_double[1] = (double)(0); state->g.ptr.p_double[2] = (double)(0); } } } /************************************************************************* Calculate test function IIP2 f(x) = sum( ((i*i+1)*x[i])^2, i=0..N-1) It has high condition number which makes fast convergence unlikely without good preconditioner. *************************************************************************/ static void testminlbfgsunit_calciip2(minlbfgsstate* state, ae_int_t n, ae_state *_state) { ae_int_t i; if( state->needf||state->needfg ) { state->f = (double)(0); } for(i=0; i<=n-1; i++) { if( state->needf||state->needfg ) { state->f = state->f+ae_sqr((double)(i*i+1), _state)*ae_sqr(state->x.ptr.p_double[i], _state); } if( state->needfg ) { state->g.ptr.p_double[i] = ae_sqr((double)(i*i+1), _state)*2*state->x.ptr.p_double[i]; } } } /************************************************************************* This function tests preconditioning On failure sets Err to True (leaves it unchanged otherwise) *************************************************************************/ static void testminlbfgsunit_testpreconditioning(ae_bool* err, ae_state *_state) { ae_frame _frame_block; ae_int_t pass; ae_int_t n; ae_int_t m; ae_int_t i; ae_int_t j; ae_int_t k; ae_int_t cntb1; ae_int_t cntb2; ae_int_t cntg1; ae_int_t cntg2; ae_int_t pkind; minlbfgsstate state; minlbfgsreport rep; ae_vector x; ae_vector s; ae_matrix a; ae_vector diagh; ae_frame_make(_state, &_frame_block); _minlbfgsstate_init(&state, _state); _minlbfgsreport_init(&rep, _state); ae_vector_init(&x, 0, DT_REAL, _state); ae_vector_init(&s, 0, DT_REAL, _state); ae_matrix_init(&a, 0, 0, DT_REAL, _state); ae_vector_init(&diagh, 0, DT_REAL, _state); m = 1; k = 50; /* * Preconditioner test1. * * If * * B1 is default preconditioner * * B2 is Cholesky preconditioner with unit diagonal * * G1 is Cholesky preconditioner based on exact Hessian with perturbations * * G2 is diagonal precomditioner based on approximate diagonal of Hessian matrix * then "bad" preconditioners (B1/B2/..) are worse than "good" ones (G1/G2/..). * "Worse" means more iterations to converge. * * We test it using f(x) = sum( ((i*i+1)*x[i])^2, i=0..N-1) and L-BFGS * optimizer with deliberately small M=1. 
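For the test function f(x) = sum(((i*i+1)*x[i])^2, i=0..N-1) used above the Hessian is diagonal with entries 2*(i*i+1)^2, which is exactly what the "good" diagonal preconditioner supplies (up to a small random perturbation in the test). The sketch below (plain C, not ALGLIB code; the starting point is chosen arbitrarily) illustrates why such a preconditioner collapses the iteration count: a single preconditioned gradient step lands exactly at the minimizer of this quadratic.

/* Illustrative sketch (not ALGLIB code): for f(x) = sum(((i*i+1)*x[i])^2) the
   Hessian is diagonal with H[i][i] = 2*(i*i+1)^2, so one preconditioned
   gradient step x := x - inv(D)*grad(x) with D = diag(H) reaches the minimum
   exactly, which is why the "good" preconditioners need far fewer L-BFGS
   iterations than the default/unit ones. */
#include <stdio.h>

#define N 10

int main(void)
{
    double x[N], g, d;
    int i;
    for(i=0; i<N; i++)
        x[i] = 1.0;                           /* start from all ones       */
    for(i=0; i<N; i++)
    {
        d = (double)(i*i+1);
        g = 2.0*d*d*x[i];                     /* gradient component        */
        x[i] -= g/(2.0*d*d);                  /* preconditioned step       */
    }
    for(i=0; i<N; i++)
        printf("x[%d] = %.1e\n", i, x[i]);    /* all exactly zero          */
    return 0;
}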
* * N - problem size * PKind - zero for upper triangular preconditioner, one for lower triangular. * K - number of repeated passes (should be large enough to average out random factors) */ for(n=10; n<=15; n++) { pkind = ae_randominteger(2, _state); ae_vector_set_length(&x, n, _state); for(i=0; i<=n-1; i++) { x.ptr.p_double[i] = (double)(0); } minlbfgscreate(n, m, &x, &state, _state); /* * Test it with default preconditioner */ minlbfgssetprecdefault(&state, _state); cntb1 = 0; for(pass=0; pass<=k-1; pass++) { for(i=0; i<=n-1; i++) { x.ptr.p_double[i] = 2*ae_randomreal(_state)-1; } minlbfgsrestartfrom(&state, &x, _state); while(minlbfgsiteration(&state, _state)) { testminlbfgsunit_calciip2(&state, n, _state); } minlbfgsresults(&state, &x, &rep, _state); cntb1 = cntb1+rep.iterationscount; *err = *err||rep.terminationtype<=0; } /* * Test it with unit preconditioner */ ae_matrix_set_length(&a, n, n, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { if( i==j ) { a.ptr.pp_double[i][i] = (double)(1); } else { a.ptr.pp_double[i][j] = (double)(0); } } } minlbfgssetpreccholesky(&state, &a, pkind==0, _state); cntb2 = 0; for(pass=0; pass<=k-1; pass++) { for(i=0; i<=n-1; i++) { x.ptr.p_double[i] = 2*ae_randomreal(_state)-1; } minlbfgsrestartfrom(&state, &x, _state); while(minlbfgsiteration(&state, _state)) { testminlbfgsunit_calciip2(&state, n, _state); } minlbfgsresults(&state, &x, &rep, _state); cntb2 = cntb2+rep.iterationscount; *err = *err||rep.terminationtype<=0; } /* * Test it with perturbed Hessian preconditioner */ ae_matrix_set_length(&a, n, n, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { if( i==j ) { a.ptr.pp_double[i][i] = (i*i+1)*(0.8+0.4*ae_randomreal(_state)); } else { if( (pkind==0&&j>i)||(pkind==1&&jv_nan; } } } } minlbfgssetpreccholesky(&state, &a, pkind==0, _state); cntg1 = 0; for(pass=0; pass<=k-1; pass++) { for(i=0; i<=n-1; i++) { x.ptr.p_double[i] = 2*ae_randomreal(_state)-1; } minlbfgsrestartfrom(&state, &x, _state); while(minlbfgsiteration(&state, _state)) { testminlbfgsunit_calciip2(&state, n, _state); } minlbfgsresults(&state, &x, &rep, _state); cntg1 = cntg1+rep.iterationscount; *err = *err||rep.terminationtype<=0; } /* * Test it with perturbed diagonal preconditioner */ ae_vector_set_length(&diagh, n, _state); for(i=0; i<=n-1; i++) { diagh.ptr.p_double[i] = 2*ae_sqr((double)(i*i+1), _state)*(0.8+0.4*ae_randomreal(_state)); } minlbfgssetprecdiag(&state, &diagh, _state); cntg2 = 0; for(pass=0; pass<=k-1; pass++) { for(i=0; i<=n-1; i++) { x.ptr.p_double[i] = 2*ae_randomreal(_state)-1; } minlbfgsrestartfrom(&state, &x, _state); while(minlbfgsiteration(&state, _state)) { testminlbfgsunit_calciip2(&state, n, _state); } minlbfgsresults(&state, &x, &rep, _state); cntg2 = cntg2+rep.iterationscount; *err = *err||rep.terminationtype<=0; } /* * Compare */ *err = *err||cntb1=0.999999 * * where c is either 1.0 or 1.0E+6, M is either 1.0E8, 1.0E20 or +INF * (we try different combinations) */ for(ckind=0; ckind<=1; ckind++) { for(mkind=0; mkind<=2; mkind++) { /* * Choose c and M */ vc = (double)(1); vm = (double)(1); if( ckind==0 ) { vc = 1.0; } if( ckind==1 ) { vc = 1.0E+6; } if( mkind==0 ) { vm = 1.0E+8; } if( mkind==1 ) { vm = 1.0E+20; } if( mkind==2 ) { vm = _state->v_posinf; } /* * Create optimizer, solve optimization problem */ epsg = 1.0E-6*vc; ae_vector_set_length(&x, 1, _state); x.ptr.p_double[0] = 0.0; minlbfgscreate(1, 1, &x, &state, _state); minlbfgssetcond(&state, epsg, (double)(0), (double)(0), 0, _state); while(minlbfgsiteration(&state, _state)) { if( 
state.needfg ) { if( ae_fp_less(-0.999999,state.x.ptr.p_double[0])&&ae_fp_less(state.x.ptr.p_double[0],0.999999) ) { state.f = 1/(1-state.x.ptr.p_double[0])+1/(1+state.x.ptr.p_double[0])+vc*state.x.ptr.p_double[0]; state.g.ptr.p_double[0] = 1/ae_sqr(1-state.x.ptr.p_double[0], _state)-1/ae_sqr(1+state.x.ptr.p_double[0], _state)+vc; } else { state.f = vm; } } } minlbfgsresults(&state, &x, &rep, _state); if( rep.terminationtype<=0 ) { *err = ae_true; ae_frame_leave(_state); return; } *err = *err||ae_fp_greater(ae_fabs(1/ae_sqr(1-x.ptr.p_double[0], _state)-1/ae_sqr(1+x.ptr.p_double[0], _state)+vc, _state),epsg); } } /* * Test integrity checks for NAN/INF: * * algorithm solves optimization problem, which is normal for some time (quadratic) * * after 5-th step we choose random component of gradient and consistently spoil * it by NAN or INF. * * we check that correct termination code is returned (-8) */ n = 100; for(pass=1; pass<=10; pass++) { spoiliteration = 5; stopiteration = 8; if( ae_fp_greater(hqrndnormal(&rs, _state),(double)(0)) ) { /* * Gradient can be spoiled by +INF, -INF, NAN */ spoilvar = hqrnduniformi(&rs, n, _state); i = hqrnduniformi(&rs, 3, _state); spoilval = _state->v_nan; if( i==0 ) { spoilval = _state->v_neginf; } if( i==1 ) { spoilval = _state->v_posinf; } } else { /* * Function value can be spoiled only by NAN * (+INF can be recognized as legitimate value during optimization) */ spoilvar = -1; spoilval = _state->v_nan; } spdmatrixrndcond(n, 1.0E5, &fulla, _state); ae_vector_set_length(&b, n, _state); ae_vector_set_length(&x0, n, _state); for(i=0; i<=n-1; i++) { b.ptr.p_double[i] = hqrndnormal(&rs, _state); x0.ptr.p_double[i] = hqrndnormal(&rs, _state); } minlbfgscreate(n, 1, &x0, &state, _state); minlbfgssetcond(&state, 0.0, 0.0, 0.0, stopiteration, _state); minlbfgssetxrep(&state, ae_true, _state); k = -1; while(minlbfgsiteration(&state, _state)) { if( state.needfg ) { state.f = (double)(0); for(i=0; i<=n-1; i++) { state.f = state.f+b.ptr.p_double[i]*state.x.ptr.p_double[i]; state.g.ptr.p_double[i] = b.ptr.p_double[i]; for(j=0; j<=n-1; j++) { state.f = state.f+0.5*state.x.ptr.p_double[i]*fulla.ptr.pp_double[i][j]*state.x.ptr.p_double[j]; state.g.ptr.p_double[i] = state.g.ptr.p_double[i]+fulla.ptr.pp_double[i][j]*state.x.ptr.p_double[j]; } } if( k>=spoiliteration ) { if( spoilvar<0 ) { state.f = spoilval; } else { state.g.ptr.p_double[spoilvar] = spoilval; } } continue; } if( state.xupdated ) { inc(&k, _state); continue; } ae_assert(ae_false, "Assertion failed", _state); } minlbfgsresults(&state, &x1, &rep, _state); seterrorflag(err, rep.terminationtype!=-8, _state); } /* * Check algorithm ability to handle request for termination: * * to terminate with correct return code = 8 * * to return point which was "current" at the moment of termination */ for(pass=1; pass<=50; pass++) { n = 3; ss = (double)(100); ae_vector_set_length(&x, n, _state); ae_vector_set_length(&xlast, n, _state); for(i=0; i<=n-1; i++) { x.ptr.p_double[i] = 6+ae_randomreal(_state); } stopcallidx = ae_randominteger(20, _state); maxits = 25; minlbfgscreate(n, 1, &x, &state, _state); minlbfgssetcond(&state, (double)(0), (double)(0), (double)(0), maxits, _state); minlbfgssetxrep(&state, ae_true, _state); callidx = 0; terminationrequested = ae_false; ae_v_move(&xlast.ptr.p_double[0], 1, &x.ptr.p_double[0], 1, ae_v_len(0,n-1)); while(minlbfgsiteration(&state, _state)) { if( state.needfg ) { state.f = ss*ae_sqr(ae_exp(state.x.ptr.p_double[0], _state)-2, _state)+ae_sqr(state.x.ptr.p_double[1], 
_state)+ae_sqr(state.x.ptr.p_double[2]-state.x.ptr.p_double[0], _state); state.g.ptr.p_double[0] = 2*ss*(ae_exp(state.x.ptr.p_double[0], _state)-2)*ae_exp(state.x.ptr.p_double[0], _state)+2*(state.x.ptr.p_double[2]-state.x.ptr.p_double[0])*(-1); state.g.ptr.p_double[1] = 2*state.x.ptr.p_double[1]; state.g.ptr.p_double[2] = 2*(state.x.ptr.p_double[2]-state.x.ptr.p_double[0]); if( callidx==stopcallidx ) { minlbfgsrequesttermination(&state, _state); terminationrequested = ae_true; } inc(&callidx, _state); continue; } if( state.xupdated ) { if( !terminationrequested ) { ae_v_move(&xlast.ptr.p_double[0], 1, &state.x.ptr.p_double[0], 1, ae_v_len(0,n-1)); } continue; } ae_assert(ae_false, "Assertion failed", _state); } minlbfgsresults(&state, &x, &rep, _state); seterrorflag(err, rep.terminationtype!=8, _state); for(i=0; i<=n-1; i++) { seterrorflag(err, ae_fp_neq(x.ptr.p_double[i],xlast.ptr.p_double[i]), _state); } } ae_frame_leave(_state); } /************************************************************************* This function tests, that gradient verified correctly. *************************************************************************/ static ae_bool testminlbfgsunit_gradientchecktest(ae_state *_state) { ae_frame _frame_block; minlbfgsstate state; minlbfgsreport rep; ae_int_t m; ae_int_t n; double a; double b; double c; double d; double x0; double x1; double x2; ae_vector x; double teststep; double noise; ae_int_t nbrcomp; ae_int_t func; ae_int_t pass; ae_int_t passcount; ae_int_t i; ae_bool result; ae_frame_make(_state, &_frame_block); _minlbfgsstate_init(&state, _state); _minlbfgsreport_init(&rep, _state); ae_vector_init(&x, 0, DT_REAL, _state); passcount = 35; teststep = 0.01; n = 3; m = 2; ae_vector_set_length(&x, n, _state); for(pass=1; pass<=passcount; pass++) { /* * Prepare test's parameters */ func = ae_randominteger(3, _state)+1; nbrcomp = ae_randominteger(n, _state); noise = (double)(10*(2*ae_randominteger(2, _state)-1)); /* * Prepare function's parameters */ for(i=0; i<=n-1; i++) { x.ptr.p_double[i] = 5*randomnormal(_state); } a = 5*ae_randomreal(_state)+1; b = 5*ae_randomreal(_state)+1; c = 5*ae_randomreal(_state)+1; d = 5*ae_randomreal(_state)+1; x0 = 5*(2*ae_randomreal(_state)-1); x1 = 5*(2*ae_randomreal(_state)-1); x2 = 5*(2*ae_randomreal(_state)-1); minlbfgscreate(n, m, &x, &state, _state); minlbfgssetcond(&state, (double)(0), (double)(0), (double)(0), 0, _state); minlbfgssetgradientcheck(&state, teststep, _state); /* * Check that the criterion passes a derivative if it is correct */ while(minlbfgsiteration(&state, _state)) { if( state.needfg ) { testminlbfgsunit_funcderiv(a, b, c, d, x0, x1, x2, &state.x, func, &state.f, &state.g, _state); } } minlbfgsresults(&state, &x, &rep, _state); /* * Check that error code does not equal to -7 and parameter .VarIdx * equal to -1. */ if( rep.terminationtype==-7||rep.varidx!=-1 ) { result = ae_true; ae_frame_leave(_state); return result; } for(i=0; i<=n-1; i++) { x.ptr.p_double[i] = 5*randomnormal(_state); } minlbfgsrestartfrom(&state, &x, _state); /* * Check that the criterion does not miss a derivative if * it is incorrect */ while(minlbfgsiteration(&state, _state)) { if( state.needfg ) { testminlbfgsunit_funcderiv(a, b, c, d, x0, x1, x2, &state.x, func, &state.f, &state.g, _state); state.g.ptr.p_double[nbrcomp] = state.g.ptr.p_double[nbrcomp]+noise; } } minlbfgsresults(&state, &x, &rep, _state); /* * Check that error code equal to -7 and parameter .VarIdx * equal to number of incorrect component. 
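The gradient-verification test works by spoiling one analytic gradient component with a large constant and then expecting the optimizer to stop with termination code -7 and VarIdx pointing at that component. Conceptually the detector compares the reported derivative with a finite difference taken with step TestStep; the standalone sketch below (plain C, illustration values only, not ALGLIB's actual detector) shows that comparison for a deliberately wrong derivative.

/* Illustrative sketch (independent of ALGLIB): flag a suspicious analytic
   derivative by comparing it with a central finite difference. The unit test
   above does essentially this through the gradient check machinery: it spoils
   one gradient component with a large constant and expects it to be reported. */
#include <math.h>
#include <stdio.h>

static double f(double x)  { return (x-1.0)*(x-1.0); }
static double df(double x) { return 2.0*(x-1.0) + 10.0; }   /* deliberately wrong */

int main(void)
{
    double x = 0.3, h = 0.01;                /* h plays the role of TestStep */
    double fd = (f(x+h)-f(x-h))/(2.0*h);     /* central difference           */
    double err = fabs(fd-df(x));
    double tol = 1.0E-3*(1.0+fabs(fd));      /* crude relative tolerance     */
    printf(err>tol ? "derivative looks wrong\n" : "derivative looks ok\n");
    return 0;
}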
*/ if( rep.terminationtype!=-7||rep.varidx!=nbrcomp ) { result = ae_true; ae_frame_leave(_state); return result; } } result = ae_false; ae_frame_leave(_state); return result; } /************************************************************************* This function return function value and it derivatives. Function dimension is 3. Function's list: * funcType=1: F(X)=A*(X-X0)^2+B*(Y-Y0)^2+C*(Z-Z0)^2+D; * funcType=2: F(X)=A*sin(X-X0)^2+B*sin(Y-Y0)^2+C*sin(Z-Z0)^2+D; * funcType=3: F(X)=A*(X-X0)^2+B*(Y-Y0)^2+C*((Z-Z0)-(X-X0))^2+D. *************************************************************************/ static void testminlbfgsunit_funcderiv(double a, double b, double c, double d, double x0, double x1, double x2, /* Real */ ae_vector* x, ae_int_t functype, double* f, /* Real */ ae_vector* g, ae_state *_state) { ae_assert(((ae_isfinite(a, _state)&&ae_isfinite(b, _state))&&ae_isfinite(c, _state))&&ae_isfinite(d, _state), "FuncDeriv: A, B, C or D contains NaN or Infinite.", _state); ae_assert((ae_isfinite(x0, _state)&&ae_isfinite(x1, _state))&&ae_isfinite(x2, _state), "FuncDeriv: X0, X1 or X2 contains NaN or Infinite.", _state); ae_assert(functype>=1&&functype<=3, "FuncDeriv: incorrect funcType(funcType<1 or funcType>3).", _state); if( functype==1 ) { *f = a*ae_sqr(x->ptr.p_double[0]-x0, _state)+b*ae_sqr(x->ptr.p_double[1]-x1, _state)+c*ae_sqr(x->ptr.p_double[2]-x2, _state)+d; g->ptr.p_double[0] = 2*a*(x->ptr.p_double[0]-x0); g->ptr.p_double[1] = 2*b*(x->ptr.p_double[1]-x1); g->ptr.p_double[2] = 2*c*(x->ptr.p_double[2]-x2); return; } if( functype==2 ) { *f = a*ae_sqr(ae_sin(x->ptr.p_double[0]-x0, _state), _state)+b*ae_sqr(ae_sin(x->ptr.p_double[1]-x1, _state), _state)+c*ae_sqr(ae_sin(x->ptr.p_double[2]-x2, _state), _state)+d; g->ptr.p_double[0] = 2*a*ae_sin(x->ptr.p_double[0]-x0, _state)*ae_cos(x->ptr.p_double[0]-x0, _state); g->ptr.p_double[1] = 2*b*ae_sin(x->ptr.p_double[1]-x1, _state)*ae_cos(x->ptr.p_double[1]-x1, _state); g->ptr.p_double[2] = 2*c*ae_sin(x->ptr.p_double[2]-x2, _state)*ae_cos(x->ptr.p_double[2]-x2, _state); return; } if( functype==3 ) { *f = a*ae_sqr(x->ptr.p_double[0]-x0, _state)+b*ae_sqr(x->ptr.p_double[1]-x1, _state)+c*ae_sqr(x->ptr.p_double[2]-x2-(x->ptr.p_double[0]-x0), _state)+d; g->ptr.p_double[0] = 2*a*(x->ptr.p_double[0]-x0)+2*c*(x->ptr.p_double[0]-x->ptr.p_double[2]-x0+x2); g->ptr.p_double[1] = 2*b*(x->ptr.p_double[1]-x1); g->ptr.p_double[2] = 2*c*(x->ptr.p_double[2]-x->ptr.p_double[0]-x2+x0); return; } } ae_bool testxblas(ae_bool silent, ae_state *_state) { ae_frame _frame_block; ae_bool approxerrors; ae_bool exactnesserrors; ae_bool waserrors; double approxthreshold; ae_int_t maxn; ae_int_t passcount; ae_int_t n; ae_int_t i; ae_int_t pass; double rv1; double rv2; double rv2err; ae_complex cv1; ae_complex cv2; double cv2err; ae_vector rx; ae_vector ry; ae_vector cx; ae_vector cy; ae_vector temp; double s; ae_bool result; ae_frame_make(_state, &_frame_block); ae_vector_init(&rx, 0, DT_REAL, _state); ae_vector_init(&ry, 0, DT_REAL, _state); ae_vector_init(&cx, 0, DT_COMPLEX, _state); ae_vector_init(&cy, 0, DT_COMPLEX, _state); ae_vector_init(&temp, 0, DT_REAL, _state); approxerrors = ae_false; exactnesserrors = ae_false; waserrors = ae_false; approxthreshold = 1000*ae_machineepsilon; maxn = 1000; passcount = 10; /* * tests: * 1. ability to calculate dot product * 2. 
higher precision */ for(n=1; n<=maxn; n++) { for(pass=1; pass<=passcount; pass++) { /* * ability to approximately calculate real dot product */ ae_vector_set_length(&rx, n, _state); ae_vector_set_length(&ry, n, _state); ae_vector_set_length(&temp, n, _state); for(i=0; i<=n-1; i++) { if( ae_fp_greater(ae_randomreal(_state),0.2) ) { rx.ptr.p_double[i] = 2*ae_randomreal(_state)-1; } else { rx.ptr.p_double[i] = (double)(0); } if( ae_fp_greater(ae_randomreal(_state),0.2) ) { ry.ptr.p_double[i] = 2*ae_randomreal(_state)-1; } else { ry.ptr.p_double[i] = (double)(0); } } rv1 = ae_v_dotproduct(&rx.ptr.p_double[0], 1, &ry.ptr.p_double[0], 1, ae_v_len(0,n-1)); xdot(&rx, &ry, n, &temp, &rv2, &rv2err, _state); approxerrors = approxerrors||ae_fp_greater(ae_fabs(rv1-rv2, _state),approxthreshold); /* * ability to approximately calculate complex dot product */ ae_vector_set_length(&cx, n, _state); ae_vector_set_length(&cy, n, _state); ae_vector_set_length(&temp, 2*n, _state); for(i=0; i<=n-1; i++) { if( ae_fp_greater(ae_randomreal(_state),0.2) ) { cx.ptr.p_complex[i].x = 2*ae_randomreal(_state)-1; cx.ptr.p_complex[i].y = 2*ae_randomreal(_state)-1; } else { cx.ptr.p_complex[i] = ae_complex_from_i(0); } if( ae_fp_greater(ae_randomreal(_state),0.2) ) { cy.ptr.p_complex[i].x = 2*ae_randomreal(_state)-1; cy.ptr.p_complex[i].y = 2*ae_randomreal(_state)-1; } else { cy.ptr.p_complex[i] = ae_complex_from_i(0); } } cv1 = ae_v_cdotproduct(&cx.ptr.p_complex[0], 1, "N", &cy.ptr.p_complex[0], 1, "N", ae_v_len(0,n-1)); xcdot(&cx, &cy, n, &temp, &cv2, &cv2err, _state); approxerrors = approxerrors||ae_fp_greater(ae_c_abs(ae_c_sub(cv1,cv2), _state),approxthreshold); } } /* * test of precision: real */ n = 50000; ae_vector_set_length(&rx, n, _state); ae_vector_set_length(&ry, n, _state); ae_vector_set_length(&temp, n, _state); for(pass=0; pass<=passcount-1; pass++) { ae_assert(n%2==0, "Assertion failed", _state); /* * First test: X + X + ... + X - X - X - ... - X = 1*X */ s = ae_exp((double)(ae_maxint(pass, 50, _state)), _state); if( pass==passcount-1&&pass>1 ) { s = ae_maxrealnumber; } ry.ptr.p_double[0] = (2*ae_randomreal(_state)-1)*s*ae_sqrt(2*ae_randomreal(_state), _state); for(i=1; i<=n-1; i++) { ry.ptr.p_double[i] = ry.ptr.p_double[0]; } for(i=0; i<=n/2-1; i++) { rx.ptr.p_double[i] = (double)(1); } for(i=n/2; i<=n-2; i++) { rx.ptr.p_double[i] = (double)(-1); } rx.ptr.p_double[n-1] = (double)(0); xdot(&rx, &ry, n, &temp, &rv2, &rv2err, _state); exactnesserrors = exactnesserrors||ae_fp_less(rv2err,(double)(0)); exactnesserrors = exactnesserrors||ae_fp_greater(rv2err,4*ae_machineepsilon*ae_fabs(ry.ptr.p_double[0], _state)); exactnesserrors = exactnesserrors||ae_fp_greater(ae_fabs(rv2-ry.ptr.p_double[0], _state),rv2err); /* * First test: X + X + ... 
+ X = N*X */ s = ae_exp((double)(ae_maxint(pass, 50, _state)), _state); if( pass==passcount-1&&pass>1 ) { s = ae_maxrealnumber; } ry.ptr.p_double[0] = (2*ae_randomreal(_state)-1)*s*ae_sqrt(2*ae_randomreal(_state), _state); for(i=1; i<=n-1; i++) { ry.ptr.p_double[i] = ry.ptr.p_double[0]; } for(i=0; i<=n-1; i++) { rx.ptr.p_double[i] = (double)(1); } xdot(&rx, &ry, n, &temp, &rv2, &rv2err, _state); exactnesserrors = exactnesserrors||ae_fp_less(rv2err,(double)(0)); exactnesserrors = exactnesserrors||ae_fp_greater(rv2err,4*ae_machineepsilon*ae_fabs(ry.ptr.p_double[0], _state)*n); exactnesserrors = exactnesserrors||ae_fp_greater(ae_fabs(rv2-n*ry.ptr.p_double[0], _state),rv2err); } /* * test of precision: complex */ n = 50000; ae_vector_set_length(&cx, n, _state); ae_vector_set_length(&cy, n, _state); ae_vector_set_length(&temp, 2*n, _state); for(pass=0; pass<=passcount-1; pass++) { ae_assert(n%2==0, "Assertion failed", _state); /* * First test: X + X + ... + X - X - X - ... - X = 1*X */ s = ae_exp((double)(ae_maxint(pass, 50, _state)), _state); if( pass==passcount-1&&pass>1 ) { s = ae_maxrealnumber; } cy.ptr.p_complex[0].x = (2*ae_randomreal(_state)-1)*s*ae_sqrt(2*ae_randomreal(_state), _state); cy.ptr.p_complex[0].y = (2*ae_randomreal(_state)-1)*s*ae_sqrt(2*ae_randomreal(_state), _state); for(i=1; i<=n-1; i++) { cy.ptr.p_complex[i] = cy.ptr.p_complex[0]; } for(i=0; i<=n/2-1; i++) { cx.ptr.p_complex[i] = ae_complex_from_i(1); } for(i=n/2; i<=n-2; i++) { cx.ptr.p_complex[i] = ae_complex_from_i(-1); } cx.ptr.p_complex[n-1] = ae_complex_from_i(0); xcdot(&cx, &cy, n, &temp, &cv2, &cv2err, _state); exactnesserrors = exactnesserrors||ae_fp_less(cv2err,(double)(0)); exactnesserrors = exactnesserrors||ae_fp_greater(cv2err,4*ae_machineepsilon*ae_c_abs(cy.ptr.p_complex[0], _state)); exactnesserrors = exactnesserrors||ae_fp_greater(ae_c_abs(ae_c_sub(cv2,cy.ptr.p_complex[0]), _state),cv2err); /* * First test: X + X + ... + X = N*X */ s = ae_exp((double)(ae_maxint(pass, 50, _state)), _state); if( pass==passcount-1&&pass>1 ) { s = ae_maxrealnumber; } cy.ptr.p_complex[0] = ae_complex_from_d((2*ae_randomreal(_state)-1)*s*ae_sqrt(2*ae_randomreal(_state), _state)); for(i=1; i<=n-1; i++) { cy.ptr.p_complex[i] = cy.ptr.p_complex[0]; } for(i=0; i<=n-1; i++) { cx.ptr.p_complex[i] = ae_complex_from_i(1); } xcdot(&cx, &cy, n, &temp, &cv2, &cv2err, _state); exactnesserrors = exactnesserrors||ae_fp_less(cv2err,(double)(0)); exactnesserrors = exactnesserrors||ae_fp_greater(cv2err,4*ae_machineepsilon*ae_c_abs(cy.ptr.p_complex[0], _state)*n); exactnesserrors = exactnesserrors||ae_fp_greater(ae_c_abs(ae_c_sub(cv2,ae_c_mul_d(cy.ptr.p_complex[0],1.0*n)), _state),cv2err); } /* * report */ waserrors = approxerrors||exactnesserrors; if( !silent ) { printf("TESTING XBLAS\n"); printf("APPROX.TESTS: "); if( approxerrors ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("EXACT TESTS: "); if( exactnesserrors ) { printf("FAILED\n"); } else { printf("OK\n"); } if( waserrors ) { printf("TEST FAILED\n"); } else { printf("TEST PASSED\n"); } printf("\n\n"); } /* * end */ result = !waserrors; ae_frame_leave(_state); return result; } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. 
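XDot/XCDot return the dot product together with an estimate of its rounding error, and the precision tests above verify that this estimate is non-negative, reasonably tight, and actually bounds the true error. One common way to obtain such extra accuracy is compensated (Kahan-style) summation of the products; the sketch below shows only that generic technique and is not ALGLIB's actual algorithm.

/* Illustrative sketch: a compensated (Kahan/Neumaier) dot product that is
   typically much more accurate than the naive loop. Generic technique shown
   for context; ALGLIB's XDot uses its own scheme and also returns an error
   bound, which the tests above validate. */
#include <math.h>
#include <stdio.h>

static double kahan_dot(const double *x, const double *y, int n)
{
    double s = 0.0, c = 0.0;     /* running sum and compensation term */
    int i;
    for(i=0; i<n; i++)
    {
        double p = x[i]*y[i];
        double t = s + p;
        if( fabs(s)>=fabs(p) )
            c += (s-t)+p;        /* recover low-order bits lost in s+p */
        else
            c += (p-t)+s;
        s = t;
    }
    return s+c;
}

int main(void)
{
    /* exact dot product is 1: the huge terms cancel, the small one remains */
    double x[3] = { 1.0E16, 1.0, -1.0E16 };
    double y[3] = { 1.0,    1.0,  1.0    };
    printf("naive: %.1f\n", x[0]*y[0]+x[1]*y[1]+x[2]*y[2]);   /* prints 0.0 */
    printf("kahan: %.1f\n", kahan_dot(x, y, 3));              /* prints 1.0 */
    return 0;
}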
*************************************************************************/ ae_bool _pexec_testxblas(ae_bool silent, ae_state *_state) { return testxblas(silent, _state); } static ae_bool testdensesolverunit_rmatrixchecksolutionm(/* Real */ ae_matrix* xe, ae_int_t n, ae_int_t m, double threshold, ae_int_t info, densesolverreport* rep, /* Real */ ae_matrix* xs, ae_state *_state); static ae_bool testdensesolverunit_rmatrixchecksolutionmfast(/* Real */ ae_matrix* xe, ae_int_t n, ae_int_t m, double threshold, ae_int_t info, /* Real */ ae_matrix* xs, ae_state *_state); static ae_bool testdensesolverunit_rmatrixchecksolution(/* Real */ ae_matrix* xe, ae_int_t n, double threshold, ae_int_t info, densesolverreport* rep, /* Real */ ae_vector* xs, ae_state *_state); static ae_bool testdensesolverunit_rmatrixchecksolutionfast(/* Real */ ae_matrix* xe, ae_int_t n, double threshold, ae_int_t info, /* Real */ ae_vector* xs, ae_state *_state); static ae_bool testdensesolverunit_rmatrixchecksingularm(ae_int_t n, ae_int_t m, ae_int_t info, densesolverreport* rep, /* Real */ ae_matrix* xs, ae_state *_state); static ae_bool testdensesolverunit_rmatrixchecksingularmfast(ae_int_t n, ae_int_t m, ae_int_t info, /* Real */ ae_matrix* xs, ae_state *_state); static ae_bool testdensesolverunit_rmatrixchecksingular(ae_int_t n, ae_int_t info, densesolverreport* rep, /* Real */ ae_vector* xs, ae_state *_state); static ae_bool testdensesolverunit_rmatrixchecksingularfast(ae_int_t n, ae_int_t info, /* Real */ ae_vector* xs, ae_state *_state); static ae_bool testdensesolverunit_cmatrixchecksolutionm(/* Complex */ ae_matrix* xe, ae_int_t n, ae_int_t m, double threshold, ae_int_t info, densesolverreport* rep, /* Complex */ ae_matrix* xs, ae_state *_state); static ae_bool testdensesolverunit_cmatrixchecksolutionmfast(/* Complex */ ae_matrix* xe, ae_int_t n, ae_int_t m, double threshold, ae_int_t info, /* Complex */ ae_matrix* xs, ae_state *_state); static ae_bool testdensesolverunit_cmatrixchecksolution(/* Complex */ ae_matrix* xe, ae_int_t n, double threshold, ae_int_t info, densesolverreport* rep, /* Complex */ ae_vector* xs, ae_state *_state); static ae_bool testdensesolverunit_cmatrixchecksolutionfast(/* Complex */ ae_matrix* xe, ae_int_t n, double threshold, ae_int_t info, /* Complex */ ae_vector* xs, ae_state *_state); static ae_bool testdensesolverunit_cmatrixchecksingularm(ae_int_t n, ae_int_t m, ae_int_t info, densesolverreport* rep, /* Complex */ ae_matrix* xs, ae_state *_state); static ae_bool testdensesolverunit_cmatrixchecksingularmfast(ae_int_t n, ae_int_t m, ae_int_t info, /* Complex */ ae_matrix* xs, ae_state *_state); static ae_bool testdensesolverunit_cmatrixchecksingular(ae_int_t n, ae_int_t info, densesolverreport* rep, /* Complex */ ae_vector* xs, ae_state *_state); static ae_bool testdensesolverunit_cmatrixchecksingularfast(ae_int_t n, ae_int_t info, /* Complex */ ae_vector* xs, ae_state *_state); static void testdensesolverunit_rmatrixmakeacopy(/* Real */ ae_matrix* a, ae_int_t m, ae_int_t n, /* Real */ ae_matrix* b, ae_state *_state); static void testdensesolverunit_cmatrixmakeacopy(/* Complex */ ae_matrix* a, ae_int_t m, ae_int_t n, /* Complex */ ae_matrix* b, ae_state *_state); static void testdensesolverunit_rmatrixdrophalf(/* Real */ ae_matrix* a, ae_int_t n, ae_bool droplower, ae_state *_state); static void testdensesolverunit_cmatrixdrophalf(/* Complex */ ae_matrix* a, ae_int_t n, ae_bool droplower, ae_state *_state); static void testdensesolverunit_testrsolver(ae_int_t maxn, ae_int_t maxm, 
ae_int_t passcount, double threshold, ae_bool* rerrors, ae_bool* rfserrors, ae_state *_state); static void testdensesolverunit_testspdsolver(ae_int_t maxn, ae_int_t maxm, ae_int_t passcount, double threshold, ae_bool* spderrors, ae_bool* rfserrors, ae_state *_state); static void testdensesolverunit_testcsolver(ae_int_t maxn, ae_int_t maxm, ae_int_t passcount, double threshold, ae_bool* cerrors, ae_bool* rfserrors, ae_state *_state); static void testdensesolverunit_testhpdsolver(ae_int_t maxn, ae_int_t maxm, ae_int_t passcount, double threshold, ae_bool* hpderrors, ae_bool* rfserrors, ae_state *_state); static void testdensesolverunit_unset2d(/* Real */ ae_matrix* x, ae_state *_state); static void testdensesolverunit_unset1d(/* Real */ ae_vector* x, ae_state *_state); static void testdensesolverunit_cunset2d(/* Complex */ ae_matrix* x, ae_state *_state); static void testdensesolverunit_cunset1d(/* Complex */ ae_vector* x, ae_state *_state); static void testdensesolverunit_unsetrep(densesolverreport* r, ae_state *_state); static void testdensesolverunit_unsetlsrep(densesolverlsreport* r, ae_state *_state); /************************************************************************* Test *************************************************************************/ ae_bool testdensesolver(ae_bool silent, ae_state *_state) { ae_int_t maxn; ae_int_t maxm; ae_int_t passcount; double threshold; ae_bool rerrors; ae_bool cerrors; ae_bool spderrors; ae_bool hpderrors; ae_bool rfserrors; ae_bool waserrors; ae_bool result; maxn = 10; maxm = 5; passcount = 5; threshold = 10000*ae_machineepsilon; rfserrors = ae_false; rerrors = ae_false; cerrors = ae_false; spderrors = ae_false; hpderrors = ae_false; testdensesolverunit_testrsolver(maxn, maxm, passcount, threshold, &rerrors, &rfserrors, _state); testdensesolverunit_testspdsolver(maxn, maxm, passcount, threshold, &spderrors, &rfserrors, _state); testdensesolverunit_testcsolver(maxn, maxm, passcount, threshold, &cerrors, &rfserrors, _state); testdensesolverunit_testhpdsolver(maxn, maxm, passcount, threshold, &hpderrors, &rfserrors, _state); waserrors = (((rerrors||cerrors)||spderrors)||hpderrors)||rfserrors; if( !silent ) { printf("TESTING DENSE SOLVER\n"); printf("* REAL: "); if( rerrors ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("* COMPLEX: "); if( cerrors ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("* SPD: "); if( spderrors ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("* HPD: "); if( hpderrors ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("* ITERATIVE IMPROVEMENT: "); if( rfserrors ) { printf("FAILED\n"); } else { printf("OK\n"); } if( waserrors ) { printf("TEST FAILED\n"); } else { printf("TEST PASSED\n"); } } result = !waserrors; return result; } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. *************************************************************************/ ae_bool _pexec_testdensesolver(ae_bool silent, ae_state *_state) { return testdensesolver(silent, _state); } /************************************************************************* Checks whether solver results are correct solution. Returns True on success. 
*************************************************************************/ static ae_bool testdensesolverunit_rmatrixchecksolutionm(/* Real */ ae_matrix* xe, ae_int_t n, ae_int_t m, double threshold, ae_int_t info, densesolverreport* rep, /* Real */ ae_matrix* xs, ae_state *_state) { ae_int_t i; ae_int_t j; ae_bool result; result = ae_true; if( info<=0 ) { result = ae_false; } else { result = result&&!(ae_fp_less(rep->r1,100*ae_machineepsilon)||ae_fp_greater(rep->r1,1+1000*ae_machineepsilon)); result = result&&!(ae_fp_less(rep->rinf,100*ae_machineepsilon)||ae_fp_greater(rep->rinf,1+1000*ae_machineepsilon)); for(i=0; i<=n-1; i++) { for(j=0; j<=m-1; j++) { result = result&&ae_fp_less_eq(ae_fabs(xe->ptr.pp_double[i][j]-xs->ptr.pp_double[i][j], _state),threshold); } } } return result; } /************************************************************************* Checks whether solver results are correct solution. Returns True on success. *************************************************************************/ static ae_bool testdensesolverunit_rmatrixchecksolutionmfast(/* Real */ ae_matrix* xe, ae_int_t n, ae_int_t m, double threshold, ae_int_t info, /* Real */ ae_matrix* xs, ae_state *_state) { ae_int_t i; ae_int_t j; ae_bool result; result = ae_true; if( info<=0 ) { result = ae_false; } else { for(i=0; i<=n-1; i++) { for(j=0; j<=m-1; j++) { result = result&&ae_fp_less_eq(ae_fabs(xe->ptr.pp_double[i][j]-xs->ptr.pp_double[i][j], _state),threshold); } } } return result; } /************************************************************************* Checks whether solver results are correct solution. Returns True on success. *************************************************************************/ static ae_bool testdensesolverunit_rmatrixchecksolution(/* Real */ ae_matrix* xe, ae_int_t n, double threshold, ae_int_t info, densesolverreport* rep, /* Real */ ae_vector* xs, ae_state *_state) { ae_frame _frame_block; ae_matrix xsm; ae_bool result; ae_frame_make(_state, &_frame_block); ae_matrix_init(&xsm, 0, 0, DT_REAL, _state); ae_matrix_set_length(&xsm, n, 1, _state); ae_v_move(&xsm.ptr.pp_double[0][0], xsm.stride, &xs->ptr.p_double[0], 1, ae_v_len(0,n-1)); result = testdensesolverunit_rmatrixchecksolutionm(xe, n, 1, threshold, info, rep, &xsm, _state); ae_frame_leave(_state); return result; } /************************************************************************* Checks whether solver results are correct solution. Returns True on success. *************************************************************************/ static ae_bool testdensesolverunit_rmatrixchecksolutionfast(/* Real */ ae_matrix* xe, ae_int_t n, double threshold, ae_int_t info, /* Real */ ae_vector* xs, ae_state *_state) { ae_int_t i; ae_bool result; result = ae_true; if( info<=0 ) { result = ae_false; } else { for(i=0; i<=n-1; i++) { result = result&&ae_fp_less_eq(ae_fabs(xe->ptr.pp_double[i][0]-xs->ptr.p_double[i], _state),threshold); } } return result; } /************************************************************************* Checks whether solver results indicate singular matrix. Returns True on success. 
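The checker functions above accept a computed solution only if the reported reciprocal condition estimates R1/RInf fall within a plausible range and every component agrees with the known exact solution XE to within Threshold. When no exact solution is available, a common alternative is a normwise backward-error test on the residual; a minimal sketch of that criterion (plain C, tolerance chosen for illustration, not part of the ALGLIB tests) follows.

/* Illustrative sketch: accept a computed solution x of A*x=b using a
   normwise backward-error test,
       |A*x-b|_inf <= tol * ( |A|_inf*|x|_inf + |b|_inf ).
   Generic C, independent of ALGLIB structures. */
#include <math.h>
#include <stdio.h>

#define N 2

static int residual_ok(const double a[N][N], const double b[N],
                       const double x[N], double tol)
{
    double rmax = 0.0, anorm = 0.0, xnorm = 0.0, bnorm = 0.0;
    int i, j;
    for(i=0; i<N; i++)
    {
        double r = -b[i], rowsum = 0.0;
        for(j=0; j<N; j++)
        {
            r += a[i][j]*x[j];
            rowsum += fabs(a[i][j]);
        }
        if( fabs(r)>rmax )      rmax  = fabs(r);
        if( rowsum>anorm )      anorm = rowsum;
        if( fabs(x[i])>xnorm )  xnorm = fabs(x[i]);
        if( fabs(b[i])>bnorm )  bnorm = fabs(b[i]);
    }
    return rmax<=tol*(anorm*xnorm+bnorm);
}

int main(void)
{
    double a[N][N] = { {2,1}, {1,3} };
    double b[N]    = { 5, 10 };
    double x[N]    = { 1, 3 };               /* exact solution of A*x=b */
    printf("accepted: %d\n", residual_ok(a, b, x, 1.0E-12));
    return 0;
}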
*************************************************************************/ static ae_bool testdensesolverunit_rmatrixchecksingularm(ae_int_t n, ae_int_t m, ae_int_t info, densesolverreport* rep, /* Real */ ae_matrix* xs, ae_state *_state) { ae_int_t i; ae_int_t j; ae_bool result; result = ae_true; if( info!=-3&&info!=1 ) { result = ae_false; } else { result = result&&!(ae_fp_less(rep->r1,(double)(0))||ae_fp_greater(rep->r1,1000*ae_machineepsilon)); result = result&&!(ae_fp_less(rep->rinf,(double)(0))||ae_fp_greater(rep->rinf,1000*ae_machineepsilon)); if( info==-3 ) { for(i=0; i<=n-1; i++) { for(j=0; j<=m-1; j++) { result = result&&ae_fp_eq(xs->ptr.pp_double[i][j],(double)(0)); } } } } return result; } /************************************************************************* Checks whether solver results indicate singular matrix. Returns True on success. *************************************************************************/ static ae_bool testdensesolverunit_rmatrixchecksingularmfast(ae_int_t n, ae_int_t m, ae_int_t info, /* Real */ ae_matrix* xs, ae_state *_state) { ae_int_t i; ae_int_t j; ae_bool result; result = ae_true; if( info!=-3 ) { result = ae_false; } else { for(i=0; i<=n-1; i++) { for(j=0; j<=m-1; j++) { result = result&&ae_fp_eq(xs->ptr.pp_double[i][j],(double)(0)); } } } return result; } /************************************************************************* Checks whether solver results indicate singular matrix. Returns True on success. *************************************************************************/ static ae_bool testdensesolverunit_rmatrixchecksingular(ae_int_t n, ae_int_t info, densesolverreport* rep, /* Real */ ae_vector* xs, ae_state *_state) { ae_frame _frame_block; ae_matrix xsm; ae_bool result; ae_frame_make(_state, &_frame_block); ae_matrix_init(&xsm, 0, 0, DT_REAL, _state); ae_matrix_set_length(&xsm, n, 1, _state); ae_v_move(&xsm.ptr.pp_double[0][0], xsm.stride, &xs->ptr.p_double[0], 1, ae_v_len(0,n-1)); result = testdensesolverunit_rmatrixchecksingularm(n, 1, info, rep, &xsm, _state); ae_frame_leave(_state); return result; } /************************************************************************* Checks whether solver results indicate singular matrix. Returns True on success. *************************************************************************/ static ae_bool testdensesolverunit_rmatrixchecksingularfast(ae_int_t n, ae_int_t info, /* Real */ ae_vector* xs, ae_state *_state) { ae_int_t i; ae_bool result; result = ae_true; if( info!=-3 ) { result = ae_false; } else { for(i=0; i<=n-1; i++) { result = result&&ae_fp_eq(xs->ptr.p_double[i],(double)(0)); } } return result; } /************************************************************************* Checks whether solver results are correct solution. Returns True on success. 
*************************************************************************/ static ae_bool testdensesolverunit_cmatrixchecksolutionm(/* Complex */ ae_matrix* xe, ae_int_t n, ae_int_t m, double threshold, ae_int_t info, densesolverreport* rep, /* Complex */ ae_matrix* xs, ae_state *_state) { ae_int_t i; ae_int_t j; ae_bool result; result = ae_true; if( info<=0 ) { result = ae_false; } else { result = result&&!(ae_fp_less(rep->r1,100*ae_machineepsilon)||ae_fp_greater(rep->r1,1+1000*ae_machineepsilon)); result = result&&!(ae_fp_less(rep->rinf,100*ae_machineepsilon)||ae_fp_greater(rep->rinf,1+1000*ae_machineepsilon)); for(i=0; i<=n-1; i++) { for(j=0; j<=m-1; j++) { result = result&&ae_fp_less_eq(ae_c_abs(ae_c_sub(xe->ptr.pp_complex[i][j],xs->ptr.pp_complex[i][j]), _state),threshold); } } } return result; } /************************************************************************* Checks whether solver results are correct solution. Returns True on success. *************************************************************************/ static ae_bool testdensesolverunit_cmatrixchecksolutionmfast(/* Complex */ ae_matrix* xe, ae_int_t n, ae_int_t m, double threshold, ae_int_t info, /* Complex */ ae_matrix* xs, ae_state *_state) { ae_int_t i; ae_int_t j; ae_bool result; result = ae_true; if( info<=0 ) { result = ae_false; return result; } for(i=0; i<=n-1; i++) { for(j=0; j<=m-1; j++) { result = result&&ae_fp_less_eq(ae_c_abs(ae_c_sub(xe->ptr.pp_complex[i][j],xs->ptr.pp_complex[i][j]), _state),threshold); } } return result; } /************************************************************************* Checks whether solver results are correct solution. Returns True on success. *************************************************************************/ static ae_bool testdensesolverunit_cmatrixchecksolution(/* Complex */ ae_matrix* xe, ae_int_t n, double threshold, ae_int_t info, densesolverreport* rep, /* Complex */ ae_vector* xs, ae_state *_state) { ae_frame _frame_block; ae_matrix xsm; ae_bool result; ae_frame_make(_state, &_frame_block); ae_matrix_init(&xsm, 0, 0, DT_COMPLEX, _state); ae_matrix_set_length(&xsm, n, 1, _state); ae_v_cmove(&xsm.ptr.pp_complex[0][0], xsm.stride, &xs->ptr.p_complex[0], 1, "N", ae_v_len(0,n-1)); result = testdensesolverunit_cmatrixchecksolutionm(xe, n, 1, threshold, info, rep, &xsm, _state); ae_frame_leave(_state); return result; } /************************************************************************* Checks whether solver results are correct solution. Returns True on success. *************************************************************************/ static ae_bool testdensesolverunit_cmatrixchecksolutionfast(/* Complex */ ae_matrix* xe, ae_int_t n, double threshold, ae_int_t info, /* Complex */ ae_vector* xs, ae_state *_state) { ae_frame _frame_block; ae_matrix xsm; ae_bool result; ae_frame_make(_state, &_frame_block); ae_matrix_init(&xsm, 0, 0, DT_COMPLEX, _state); ae_matrix_set_length(&xsm, n, 1, _state); ae_v_cmove(&xsm.ptr.pp_complex[0][0], xsm.stride, &xs->ptr.p_complex[0], 1, "N", ae_v_len(0,n-1)); result = testdensesolverunit_cmatrixchecksolutionmfast(xe, n, 1, threshold, info, &xsm, _state); ae_frame_leave(_state); return result; } /************************************************************************* Checks whether solver results indicate singular matrix. Returns True on success. 
*************************************************************************/ static ae_bool testdensesolverunit_cmatrixchecksingularm(ae_int_t n, ae_int_t m, ae_int_t info, densesolverreport* rep, /* Complex */ ae_matrix* xs, ae_state *_state) { ae_int_t i; ae_int_t j; ae_bool result; result = ae_true; if( info!=-3&&info!=1 ) { result = ae_false; return result; } result = result&&!(ae_fp_less(rep->r1,(double)(0))||ae_fp_greater(rep->r1,1000*ae_machineepsilon)); result = result&&!(ae_fp_less(rep->rinf,(double)(0))||ae_fp_greater(rep->rinf,1000*ae_machineepsilon)); if( info==-3 ) { for(i=0; i<=n-1; i++) { for(j=0; j<=m-1; j++) { result = result&&ae_c_eq_d(xs->ptr.pp_complex[i][j],(double)(0)); } } } return result; } /************************************************************************* Checks whether solver results indicate singular matrix. Returns True on success. *************************************************************************/ static ae_bool testdensesolverunit_cmatrixchecksingularmfast(ae_int_t n, ae_int_t m, ae_int_t info, /* Complex */ ae_matrix* xs, ae_state *_state) { ae_int_t i; ae_int_t j; ae_bool result; result = ae_true; if( info!=-3 ) { result = ae_false; return result; } if( info==-3 ) { for(i=0; i<=n-1; i++) { for(j=0; j<=m-1; j++) { result = result&&ae_c_eq_d(xs->ptr.pp_complex[i][j],(double)(0)); } } } return result; } /************************************************************************* Checks whether solver results indicate singular matrix. Returns True on success. *************************************************************************/ static ae_bool testdensesolverunit_cmatrixchecksingular(ae_int_t n, ae_int_t info, densesolverreport* rep, /* Complex */ ae_vector* xs, ae_state *_state) { ae_frame _frame_block; ae_matrix xsm; ae_bool result; ae_frame_make(_state, &_frame_block); ae_matrix_init(&xsm, 0, 0, DT_COMPLEX, _state); ae_matrix_set_length(&xsm, n, 1, _state); ae_v_cmove(&xsm.ptr.pp_complex[0][0], xsm.stride, &xs->ptr.p_complex[0], 1, "N", ae_v_len(0,n-1)); result = testdensesolverunit_cmatrixchecksingularm(n, 1, info, rep, &xsm, _state); ae_frame_leave(_state); return result; } /************************************************************************* Checks whether solver results indicate singular matrix. Returns True on success. 
*************************************************************************/ static ae_bool testdensesolverunit_cmatrixchecksingularfast(ae_int_t n, ae_int_t info, /* Complex */ ae_vector* xs, ae_state *_state) { ae_frame _frame_block; ae_matrix xsm; ae_bool result; ae_frame_make(_state, &_frame_block); ae_matrix_init(&xsm, 0, 0, DT_COMPLEX, _state); ae_matrix_set_length(&xsm, n, 1, _state); ae_v_cmove(&xsm.ptr.pp_complex[0][0], xsm.stride, &xs->ptr.p_complex[0], 1, "N", ae_v_len(0,n-1)); result = testdensesolverunit_cmatrixchecksingularmfast(n, 1, info, &xsm, _state); ae_frame_leave(_state); return result; }
/************************************************************************* Copy *************************************************************************/ static void testdensesolverunit_rmatrixmakeacopy(/* Real */ ae_matrix* a, ae_int_t m, ae_int_t n, /* Real */ ae_matrix* b, ae_state *_state) { ae_int_t i; ae_int_t j; ae_matrix_clear(b); ae_matrix_set_length(b, m-1+1, n-1+1, _state); for(i=0; i<=m-1; i++) { for(j=0; j<=n-1; j++) { b->ptr.pp_double[i][j] = a->ptr.pp_double[i][j]; } } }
/************************************************************************* Copy *************************************************************************/ static void testdensesolverunit_cmatrixmakeacopy(/* Complex */ ae_matrix* a, ae_int_t m, ae_int_t n, /* Complex */ ae_matrix* b, ae_state *_state) { ae_int_t i; ae_int_t j; ae_matrix_clear(b); ae_matrix_set_length(b, m-1+1, n-1+1, _state); for(i=0; i<=m-1; i++) { for(j=0; j<=n-1; j++) { b->ptr.pp_complex[i][j] = a->ptr.pp_complex[i][j]; } } }
/************************************************************************* Drops upper or lower half of the matrix - fills it by special pattern which may be used later to ensure that this part wasn't changed *************************************************************************/ static void testdensesolverunit_rmatrixdrophalf(/* Real */ ae_matrix* a, ae_int_t n, ae_bool droplower, ae_state *_state) { ae_int_t i; ae_int_t j; for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { if( (droplower&&i>j)||(!droplower&&i<j) ) { a->ptr.pp_double[i][j] = (double)(1+2*i+3*j); } } } }
/************************************************************************* Drops upper or lower half of the matrix - fills it by special pattern which may be used later to ensure that this part wasn't changed *************************************************************************/ static void testdensesolverunit_cmatrixdrophalf(/* Complex */ ae_matrix* a, ae_int_t n, ae_bool droplower, ae_state *_state) { ae_int_t i; ae_int_t j; for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { if( (droplower&&i>j)||(!droplower&&i<j) ) { a->ptr.pp_complex[i][j] = ae_complex_from_i(1+2*i+3*j); } } } }
/************************************************************************* Real test *************************************************************************/ static void testdensesolverunit_testrsolver(ae_int_t maxn, ae_int_t maxm, ae_int_t passcount, double threshold, ae_bool* rerrors, ae_bool* rfserrors, ae_state *_state) { ae_frame _frame_block; ae_matrix a; ae_matrix lua; ae_matrix atmp; ae_vector p; ae_matrix xe; ae_matrix b; ae_vector bv; ae_int_t i; ae_int_t j; ae_int_t k; ae_int_t n; ae_int_t m; ae_int_t pass; ae_int_t taskkind; double v; double verr; ae_int_t info; densesolverreport rep; densesolverlsreport repls; ae_matrix x; ae_vector xv; ae_vector y; ae_vector tx; ae_frame_make(_state, &_frame_block); ae_matrix_init(&a, 0, 0, DT_REAL, _state); ae_matrix_init(&lua, 0, 0,
DT_REAL, _state); ae_matrix_init(&atmp, 0, 0, DT_REAL, _state); ae_vector_init(&p, 0, DT_INT, _state); ae_matrix_init(&xe, 0, 0, DT_REAL, _state); ae_matrix_init(&b, 0, 0, DT_REAL, _state); ae_vector_init(&bv, 0, DT_REAL, _state); _densesolverreport_init(&rep, _state); _densesolverlsreport_init(&repls, _state); ae_matrix_init(&x, 0, 0, DT_REAL, _state); ae_vector_init(&xv, 0, DT_REAL, _state); ae_vector_init(&y, 0, DT_REAL, _state); ae_vector_init(&tx, 0, DT_REAL, _state); /* * General square matrices: * * test general solvers * * test least squares solver */ for(pass=1; pass<=passcount; pass++) { for(n=1; n<=maxn; n++) { for(m=1; m<=maxm; m++) { /* * ******************************************************** * WELL CONDITIONED TASKS * ability to find correct solution is tested * ******************************************************** * * 1. generate random well conditioned matrix A. * 2. generate random solution vector xe * 3. generate right part b=A*xe * 4. test different methods on original A */ rmatrixrndcond(n, (double)(1000), &a, _state); testdensesolverunit_rmatrixmakeacopy(&a, n, n, &lua, _state); rmatrixlu(&lua, n, n, &p, _state); ae_matrix_set_length(&xe, n, m, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=m-1; j++) { xe.ptr.pp_double[i][j] = 2*ae_randomreal(_state)-1; } } ae_matrix_set_length(&b, n, m, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=m-1; j++) { v = ae_v_dotproduct(&a.ptr.pp_double[i][0], 1, &xe.ptr.pp_double[0][j], xe.stride, ae_v_len(0,n-1)); b.ptr.pp_double[i][j] = v; } } /* * Test solvers */ info = 0; testdensesolverunit_unsetrep(&rep, _state); testdensesolverunit_unset2d(&x, _state); rmatrixsolvem(&a, n, &b, m, ae_fp_greater(ae_randomreal(_state),0.5), &info, &rep, &x, _state); *rerrors = *rerrors||!testdensesolverunit_rmatrixchecksolutionm(&xe, n, m, threshold, info, &rep, &x, _state); info = 0; ae_matrix_set_length(&x, n, m, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=m-1; j++) { x.ptr.pp_double[i][j] = b.ptr.pp_double[i][j]; } } rmatrixsolvemfast(&a, n, &x, m, &info, _state); *rerrors = *rerrors||!testdensesolverunit_rmatrixchecksolutionmfast(&xe, n, m, threshold, info, &x, _state); info = 0; testdensesolverunit_unsetrep(&rep, _state); testdensesolverunit_unset1d(&xv, _state); ae_vector_set_length(&bv, n, _state); ae_v_move(&bv.ptr.p_double[0], 1, &b.ptr.pp_double[0][0], b.stride, ae_v_len(0,n-1)); rmatrixsolve(&a, n, &bv, &info, &rep, &xv, _state); *rerrors = *rerrors||!testdensesolverunit_rmatrixchecksolution(&xe, n, threshold, info, &rep, &xv, _state); info = 0; ae_vector_set_length(&bv, n, _state); ae_v_move(&bv.ptr.p_double[0], 1, &b.ptr.pp_double[0][0], b.stride, ae_v_len(0,n-1)); rmatrixsolvefast(&a, n, &bv, &info, _state); *rerrors = *rerrors||!testdensesolverunit_rmatrixchecksolutionfast(&xe, n, threshold, info, &bv, _state); info = 0; testdensesolverunit_unsetrep(&rep, _state); testdensesolverunit_unset2d(&x, _state); rmatrixlusolvem(&lua, &p, n, &b, m, &info, &rep, &x, _state); *rerrors = *rerrors||!testdensesolverunit_rmatrixchecksolutionm(&xe, n, m, threshold, info, &rep, &x, _state); info = 0; ae_matrix_set_length(&x, n, m, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=m-1; j++) { x.ptr.pp_double[i][j] = b.ptr.pp_double[i][j]; } } rmatrixlusolvemfast(&lua, &p, n, &x, m, &info, _state); seterrorflag(rerrors, !testdensesolverunit_rmatrixchecksolutionmfast(&xe, n, m, threshold, info, &x, _state), _state); info = 0; testdensesolverunit_unsetrep(&rep, _state); testdensesolverunit_unset1d(&xv, _state); ae_vector_set_length(&bv, n, _state); 
ae_v_move(&bv.ptr.p_double[0], 1, &b.ptr.pp_double[0][0], b.stride, ae_v_len(0,n-1)); rmatrixlusolve(&lua, &p, n, &bv, &info, &rep, &xv, _state); *rerrors = *rerrors||!testdensesolverunit_rmatrixchecksolution(&xe, n, threshold, info, &rep, &xv, _state); info = 0; ae_vector_set_length(&bv, n, _state); ae_v_move(&xv.ptr.p_double[0], 1, &b.ptr.pp_double[0][0], b.stride, ae_v_len(0,n-1)); rmatrixlusolvefast(&lua, &p, n, &xv, &info, _state); seterrorflag(rerrors, !testdensesolverunit_rmatrixchecksolutionfast(&xe, n, threshold, info, &xv, _state), _state); info = 0; testdensesolverunit_unsetrep(&rep, _state); testdensesolverunit_unset2d(&x, _state); rmatrixmixedsolvem(&a, &lua, &p, n, &b, m, &info, &rep, &x, _state); *rerrors = *rerrors||!testdensesolverunit_rmatrixchecksolutionm(&xe, n, m, threshold, info, &rep, &x, _state); info = 0; testdensesolverunit_unsetrep(&rep, _state); testdensesolverunit_unset1d(&xv, _state); ae_vector_set_length(&bv, n, _state); ae_v_move(&bv.ptr.p_double[0], 1, &b.ptr.pp_double[0][0], b.stride, ae_v_len(0,n-1)); rmatrixmixedsolve(&a, &lua, &p, n, &bv, &info, &rep, &xv, _state); *rerrors = *rerrors||!testdensesolverunit_rmatrixchecksolution(&xe, n, threshold, info, &rep, &xv, _state); /* * Test DenseSolverRLS(): * * test on original system A*x = b * * test on overdetermined system with the same solution: (A' A')'*x = (b' b')' * * test on underdetermined system with the same solution: (A 0 0 0 ) * z = b */ info = 0; testdensesolverunit_unsetlsrep(&repls, _state); testdensesolverunit_unset1d(&xv, _state); ae_vector_set_length(&bv, n, _state); ae_v_move(&bv.ptr.p_double[0], 1, &b.ptr.pp_double[0][0], b.stride, ae_v_len(0,n-1)); rmatrixsolvels(&a, n, n, &bv, 0.0, &info, &repls, &xv, _state); if( info<=0 ) { *rerrors = ae_true; } else { *rerrors = (*rerrors||ae_fp_less(repls.r2,100*ae_machineepsilon))||ae_fp_greater(repls.r2,1+1000*ae_machineepsilon); *rerrors = (*rerrors||repls.n!=n)||repls.k!=0; for(i=0; i<=n-1; i++) { *rerrors = *rerrors||ae_fp_greater(ae_fabs(xe.ptr.pp_double[i][0]-xv.ptr.p_double[i], _state),threshold); } } info = 0; testdensesolverunit_unsetlsrep(&repls, _state); testdensesolverunit_unset1d(&xv, _state); ae_vector_set_length(&bv, 2*n, _state); ae_v_move(&bv.ptr.p_double[0], 1, &b.ptr.pp_double[0][0], b.stride, ae_v_len(0,n-1)); ae_v_move(&bv.ptr.p_double[n], 1, &b.ptr.pp_double[0][0], b.stride, ae_v_len(n,2*n-1)); ae_matrix_set_length(&atmp, 2*n, n, _state); copymatrix(&a, 0, n-1, 0, n-1, &atmp, 0, n-1, 0, n-1, _state); copymatrix(&a, 0, n-1, 0, n-1, &atmp, n, 2*n-1, 0, n-1, _state); rmatrixsolvels(&atmp, 2*n, n, &bv, 0.0, &info, &repls, &xv, _state); if( info<=0 ) { *rerrors = ae_true; } else { *rerrors = (*rerrors||ae_fp_less(repls.r2,100*ae_machineepsilon))||ae_fp_greater(repls.r2,1+1000*ae_machineepsilon); *rerrors = (*rerrors||repls.n!=n)||repls.k!=0; for(i=0; i<=n-1; i++) { *rerrors = *rerrors||ae_fp_greater(ae_fabs(xe.ptr.pp_double[i][0]-xv.ptr.p_double[i], _state),threshold); } } info = 0; testdensesolverunit_unsetlsrep(&repls, _state); testdensesolverunit_unset1d(&xv, _state); ae_vector_set_length(&bv, n, _state); ae_v_move(&bv.ptr.p_double[0], 1, &b.ptr.pp_double[0][0], b.stride, ae_v_len(0,n-1)); ae_matrix_set_length(&atmp, n, 2*n, _state); copymatrix(&a, 0, n-1, 0, n-1, &atmp, 0, n-1, 0, n-1, _state); for(i=0; i<=n-1; i++) { for(j=n; j<=2*n-1; j++) { atmp.ptr.pp_double[i][j] = (double)(0); } } rmatrixsolvels(&atmp, n, 2*n, &bv, 0.0, &info, &repls, &xv, _state); if( info<=0 ) { *rerrors = ae_true; } else { *rerrors = 
*rerrors||ae_fp_neq(repls.r2,(double)(0)); *rerrors = (*rerrors||repls.n!=2*n)||repls.k!=n; for(i=0; i<=n-1; i++) { *rerrors = *rerrors||ae_fp_greater(ae_fabs(xe.ptr.pp_double[i][0]-xv.ptr.p_double[i], _state),threshold); } for(i=n; i<=2*n-1; i++) { *rerrors = *rerrors||ae_fp_greater(ae_fabs(xv.ptr.p_double[i], _state),threshold); } } /* * ******************************************************** * EXACTLY SINGULAR MATRICES * ability to detect singularity is tested * ******************************************************** * * 1. generate different types of singular matrices: * * zero (TaskKind=0) * * with zero columns (TaskKind=1) * * with zero rows (TaskKind=2) * * with equal rows/columns (TaskKind=2 or 3) * 2. generate random solution vector xe * 3. generate right part b=A*xe * 4. test different methods */ for(taskkind=0; taskkind<=4; taskkind++) { testdensesolverunit_unset2d(&a, _state); if( taskkind==0 ) { /* * all zeros */ ae_matrix_set_length(&a, n, n, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { a.ptr.pp_double[i][j] = (double)(0); } } } if( taskkind==1 ) { /* * there is zero column */ ae_matrix_set_length(&a, n, n, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { a.ptr.pp_double[i][j] = 2*ae_randomreal(_state)-1; } } k = ae_randominteger(n, _state); ae_v_muld(&a.ptr.pp_double[0][k], a.stride, ae_v_len(0,n-1), 0); } if( taskkind==2 ) { /* * there is zero row */ ae_matrix_set_length(&a, n, n, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { a.ptr.pp_double[i][j] = 2*ae_randomreal(_state)-1; } } k = ae_randominteger(n, _state); ae_v_muld(&a.ptr.pp_double[k][0], 1, ae_v_len(0,n-1), 0); } if( taskkind==3 ) { /* * equal columns */ if( n<2 ) { continue; } ae_matrix_set_length(&a, n, n, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { a.ptr.pp_double[i][j] = 2*ae_randomreal(_state)-1; } } k = 1+ae_randominteger(n-1, _state); ae_v_move(&a.ptr.pp_double[0][0], a.stride, &a.ptr.pp_double[0][k], a.stride, ae_v_len(0,n-1)); } if( taskkind==4 ) { /* * equal rows */ if( n<2 ) { continue; } ae_matrix_set_length(&a, n, n, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { a.ptr.pp_double[i][j] = 2*ae_randomreal(_state)-1; } } k = 1+ae_randominteger(n-1, _state); ae_v_move(&a.ptr.pp_double[0][0], 1, &a.ptr.pp_double[k][0], 1, ae_v_len(0,n-1)); } ae_matrix_set_length(&xe, n, m, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=m-1; j++) { xe.ptr.pp_double[i][j] = 2*ae_randomreal(_state)-1; } } ae_matrix_set_length(&b, n, m, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=m-1; j++) { v = ae_v_dotproduct(&a.ptr.pp_double[i][0], 1, &xe.ptr.pp_double[0][j], xe.stride, ae_v_len(0,n-1)); b.ptr.pp_double[i][j] = v; } } testdensesolverunit_rmatrixmakeacopy(&a, n, n, &lua, _state); rmatrixlu(&lua, n, n, &p, _state); /* * Test RMatrixSolveM() */ info = 0; testdensesolverunit_unsetrep(&rep, _state); testdensesolverunit_unset2d(&x, _state); rmatrixsolvem(&a, n, &b, m, ae_fp_greater(ae_randomreal(_state),0.5), &info, &rep, &x, _state); *rerrors = *rerrors||!testdensesolverunit_rmatrixchecksingularm(n, m, info, &rep, &x, _state); /* * Test RMatrixSolveMFast(); performed only for matrices * with zero rows or columns */ if( (taskkind==0||taskkind==1)||taskkind==2 ) { info = 0; ae_matrix_set_length(&x, n, m, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=m-1; j++) { x.ptr.pp_double[i][j] = b.ptr.pp_double[i][j]; } } rmatrixsolvemfast(&a, n, &x, m, &info, _state); *rerrors = *rerrors||!testdensesolverunit_rmatrixchecksingularmfast(n, m, info, &x, _state); } /* * Test RMatrixSolve() */ 
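For the exactly singular inputs generated above (zero matrix, zero row or column, duplicated rows or columns) the solvers are expected to report failure (Info=-3) and return a zero solution. A hedged sketch of the underlying detection idea - Gaussian elimination with partial pivoting encountering a (near-)zero pivot on such a matrix - is shown below in plain C; the 1.0E-15 cutoff is an illustration value, not ALGLIB's criterion.

/* Illustrative sketch: matrices with duplicated rows (or zero rows/columns)
   force Gaussian elimination with partial pivoting into a zero pivot, which
   is how singularity is commonly detected. Generic C, not ALGLIB code. */
#include <math.h>
#include <stdio.h>

#define N 3

static int is_numerically_singular(double a[N][N])
{
    int i, j, k, p;
    for(k=0; k<N; k++)
    {
        p = k;                                   /* partial pivoting        */
        for(i=k+1; i<N; i++)
            if( fabs(a[i][k])>fabs(a[p][k]) )
                p = i;
        for(j=0; j<N; j++)
        {
            double t = a[k][j]; a[k][j] = a[p][j]; a[p][j] = t;
        }
        if( fabs(a[k][k])<1.0E-15 )
            return 1;                            /* zero pivot -> singular  */
        for(i=k+1; i<N; i++)
        {
            double m = a[i][k]/a[k][k];
            for(j=k; j<N; j++)
                a[i][j] -= m*a[k][j];
        }
    }
    return 0;
}

int main(void)
{
    double a[N][N] = { {1,2,3}, {4,5,6}, {1,2,3} };   /* row 2 equals row 0 */
    printf("singular: %d\n", is_numerically_singular(a));
    return 0;
}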
info = 0; testdensesolverunit_unsetrep(&rep, _state); testdensesolverunit_unset2d(&x, _state); ae_vector_set_length(&bv, n, _state); ae_v_move(&bv.ptr.p_double[0], 1, &b.ptr.pp_double[0][0], b.stride, ae_v_len(0,n-1)); rmatrixsolve(&a, n, &bv, &info, &rep, &xv, _state); *rerrors = *rerrors||!testdensesolverunit_rmatrixchecksingular(n, info, &rep, &xv, _state); /* * Test RMatrixSolveFast(); performed only for matrices * with zero rows or columns */ if( (taskkind==0||taskkind==1)||taskkind==2 ) { info = 0; ae_vector_set_length(&bv, n, _state); ae_v_move(&bv.ptr.p_double[0], 1, &b.ptr.pp_double[0][0], b.stride, ae_v_len(0,n-1)); rmatrixsolvefast(&a, n, &bv, &info, _state); seterrorflag(rerrors, !testdensesolverunit_rmatrixchecksingularfast(n, info, &bv, _state), _state); } /* * Test RMatrixLUSolveM() */ info = 0; testdensesolverunit_unsetrep(&rep, _state); testdensesolverunit_unset2d(&x, _state); rmatrixlusolvem(&lua, &p, n, &b, m, &info, &rep, &x, _state); *rerrors = *rerrors||!testdensesolverunit_rmatrixchecksingularm(n, m, info, &rep, &x, _state); /* * Test RMatrixLUSolveMFast(); performed only for matrices * with zero rows or columns */ if( (taskkind==0||taskkind==1)||taskkind==2 ) { info = 0; ae_matrix_set_length(&x, n, m, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=m-1; j++) { x.ptr.pp_double[i][j] = b.ptr.pp_double[i][j]; } } rmatrixlusolvemfast(&lua, &p, n, &x, m, &info, _state); seterrorflag(rerrors, !testdensesolverunit_rmatrixchecksingularmfast(n, m, info, &x, _state), _state); } /* * Test RMatrixLUSolve() */ info = 0; testdensesolverunit_unsetrep(&rep, _state); testdensesolverunit_unset2d(&x, _state); ae_vector_set_length(&bv, n, _state); ae_v_move(&bv.ptr.p_double[0], 1, &b.ptr.pp_double[0][0], b.stride, ae_v_len(0,n-1)); rmatrixlusolve(&lua, &p, n, &bv, &info, &rep, &xv, _state); *rerrors = *rerrors||!testdensesolverunit_rmatrixchecksingular(n, info, &rep, &xv, _state); /* * Test RMatrixLUSolveFast(); performed only for matrices * with zero rows or columns */ if( (taskkind==0||taskkind==1)||taskkind==2 ) { info = 0; ae_vector_set_length(&bv, n, _state); ae_v_move(&bv.ptr.p_double[0], 1, &b.ptr.pp_double[0][0], b.stride, ae_v_len(0,n-1)); rmatrixlusolvefast(&lua, &p, n, &bv, &info, _state); seterrorflag(rerrors, !testdensesolverunit_rmatrixchecksingularfast(n, info, &bv, _state), _state); } /* * Test RMatrixMixedSolveM() */ info = 0; testdensesolverunit_unsetrep(&rep, _state); testdensesolverunit_unset2d(&x, _state); rmatrixmixedsolvem(&a, &lua, &p, n, &b, m, &info, &rep, &x, _state); *rerrors = *rerrors||!testdensesolverunit_rmatrixchecksingularm(n, m, info, &rep, &x, _state); /* * Test RMatrixMixedSolve() */ info = 0; testdensesolverunit_unsetrep(&rep, _state); testdensesolverunit_unset2d(&x, _state); ae_vector_set_length(&bv, n, _state); ae_v_move(&bv.ptr.p_double[0], 1, &b.ptr.pp_double[0][0], b.stride, ae_v_len(0,n-1)); rmatrixmixedsolve(&a, &lua, &p, n, &bv, &info, &rep, &xv, _state); *rerrors = *rerrors||!testdensesolverunit_rmatrixchecksingular(n, info, &rep, &xv, _state); } } } } /* * test iterative improvement */ for(pass=1; pass<=passcount; pass++) { /* * Test iterative improvement matrices * * A matrix/right part are constructed such that both matrix * and solution components are within (-1,+1). Such matrix/right part * have nice properties - system can be solved using iterative * improvement with |A*x-b| about several ulps of max(1,|b|). 
*/ n = 100; ae_matrix_set_length(&a, n, n, _state); ae_matrix_set_length(&b, n, 1, _state); ae_vector_set_length(&bv, n, _state); ae_vector_set_length(&tx, n, _state); ae_vector_set_length(&xv, n, _state); ae_vector_set_length(&y, n, _state); for(i=0; i<=n-1; i++) { xv.ptr.p_double[i] = 2*ae_randomreal(_state)-1; } for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { a.ptr.pp_double[i][j] = 2*ae_randomreal(_state)-1; } ae_v_move(&y.ptr.p_double[0], 1, &a.ptr.pp_double[i][0], 1, ae_v_len(0,n-1)); xdot(&y, &xv, n, &tx, &v, &verr, _state); bv.ptr.p_double[i] = v; } ae_v_move(&b.ptr.pp_double[0][0], b.stride, &bv.ptr.p_double[0], 1, ae_v_len(0,n-1)); /* * Test RMatrixSolveM() */ testdensesolverunit_unset2d(&x, _state); rmatrixsolvem(&a, n, &b, 1, ae_true, &info, &rep, &x, _state); if( info<=0 ) { *rfserrors = ae_true; } else { ae_vector_set_length(&xv, n, _state); ae_v_move(&xv.ptr.p_double[0], 1, &x.ptr.pp_double[0][0], x.stride, ae_v_len(0,n-1)); for(i=0; i<=n-1; i++) { ae_v_move(&y.ptr.p_double[0], 1, &a.ptr.pp_double[i][0], 1, ae_v_len(0,n-1)); xdot(&y, &xv, n, &tx, &v, &verr, _state); *rfserrors = *rfserrors||ae_fp_greater(ae_fabs(v-b.ptr.pp_double[i][0], _state),8*ae_machineepsilon*ae_maxreal((double)(1), ae_fabs(b.ptr.pp_double[i][0], _state), _state)); } } /* * Test RMatrixSolve() */ testdensesolverunit_unset1d(&xv, _state); rmatrixsolve(&a, n, &bv, &info, &rep, &xv, _state); if( info<=0 ) { *rfserrors = ae_true; } else { for(i=0; i<=n-1; i++) { ae_v_move(&y.ptr.p_double[0], 1, &a.ptr.pp_double[i][0], 1, ae_v_len(0,n-1)); xdot(&y, &xv, n, &tx, &v, &verr, _state); *rfserrors = *rfserrors||ae_fp_greater(ae_fabs(v-bv.ptr.p_double[i], _state),8*ae_machineepsilon*ae_maxreal((double)(1), ae_fabs(bv.ptr.p_double[i], _state), _state)); } } /* * Test LS-solver on the same matrix */ rmatrixsolvels(&a, n, n, &bv, 0.0, &info, &repls, &xv, _state); if( info<=0 ) { *rfserrors = ae_true; } else { for(i=0; i<=n-1; i++) { ae_v_move(&y.ptr.p_double[0], 1, &a.ptr.pp_double[i][0], 1, ae_v_len(0,n-1)); xdot(&y, &xv, n, &tx, &v, &verr, _state); *rfserrors = *rfserrors||ae_fp_greater(ae_fabs(v-bv.ptr.p_double[i], _state),8*ae_machineepsilon*ae_maxreal((double)(1), ae_fabs(bv.ptr.p_double[i], _state), _state)); } } } ae_frame_leave(_state); } /************************************************************************* SPD test *************************************************************************/ static void testdensesolverunit_testspdsolver(ae_int_t maxn, ae_int_t maxm, ae_int_t passcount, double threshold, ae_bool* spderrors, ae_bool* rfserrors, ae_state *_state) { ae_frame _frame_block; ae_matrix a; ae_matrix cha; ae_matrix atmp; ae_vector p; ae_matrix xe; ae_matrix b; ae_vector bv; ae_int_t i; ae_int_t j; ae_int_t k; ae_int_t n; ae_int_t m; ae_int_t pass; ae_int_t taskkind; double v; ae_bool isupper; ae_int_t info; densesolverreport rep; densesolverlsreport repls; ae_matrix x; ae_vector xv; ae_vector y; ae_vector tx; ae_frame_make(_state, &_frame_block); ae_matrix_init(&a, 0, 0, DT_REAL, _state); ae_matrix_init(&cha, 0, 0, DT_REAL, _state); ae_matrix_init(&atmp, 0, 0, DT_REAL, _state); ae_vector_init(&p, 0, DT_INT, _state); ae_matrix_init(&xe, 0, 0, DT_REAL, _state); ae_matrix_init(&b, 0, 0, DT_REAL, _state); ae_vector_init(&bv, 0, DT_REAL, _state); _densesolverreport_init(&rep, _state); _densesolverlsreport_init(&repls, _state); ae_matrix_init(&x, 0, 0, DT_REAL, _state); ae_vector_init(&xv, 0, DT_REAL, _state); ae_vector_init(&y, 0, DT_REAL, _state); ae_vector_init(&tx, 0, DT_REAL, _state); /* * General 
square matrices: * * test general solvers * * test least squares solver */ for(pass=1; pass<=passcount; pass++) { for(n=1; n<=maxn; n++) { for(m=1; m<=maxm; m++) { /* * ******************************************************** * WELL CONDITIONED TASKS * ability to find correct solution is tested * ******************************************************** * * 1. generate random well conditioned matrix A. * 2. generate random solution vector xe * 3. generate right part b=A*xe * 4. test different methods on original A */ isupper = ae_fp_greater(ae_randomreal(_state),0.5); spdmatrixrndcond(n, (double)(1000), &a, _state); testdensesolverunit_rmatrixmakeacopy(&a, n, n, &cha, _state); if( !spdmatrixcholesky(&cha, n, isupper, _state) ) { *spderrors = ae_true; ae_frame_leave(_state); return; } ae_matrix_set_length(&xe, n, m, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=m-1; j++) { xe.ptr.pp_double[i][j] = 2*ae_randomreal(_state)-1; } } ae_matrix_set_length(&b, n, m, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=m-1; j++) { v = ae_v_dotproduct(&a.ptr.pp_double[i][0], 1, &xe.ptr.pp_double[0][j], xe.stride, ae_v_len(0,n-1)); b.ptr.pp_double[i][j] = v; } } testdensesolverunit_rmatrixdrophalf(&a, n, isupper, _state); testdensesolverunit_rmatrixdrophalf(&cha, n, isupper, _state); /* * Test solvers */ info = 0; testdensesolverunit_unsetrep(&rep, _state); testdensesolverunit_unset2d(&x, _state); spdmatrixsolvem(&a, n, isupper, &b, m, &info, &rep, &x, _state); *spderrors = *spderrors||!testdensesolverunit_rmatrixchecksolutionm(&xe, n, m, threshold, info, &rep, &x, _state); info = 0; ae_matrix_set_length(&x, n, m, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=m-1; j++) { x.ptr.pp_double[i][j] = b.ptr.pp_double[i][j]; } } spdmatrixsolvemfast(&a, n, isupper, &x, m, &info, _state); seterrorflag(spderrors, !testdensesolverunit_rmatrixchecksolutionmfast(&xe, n, m, threshold, info, &x, _state), _state); info = 0; testdensesolverunit_unsetrep(&rep, _state); testdensesolverunit_unset1d(&xv, _state); ae_vector_set_length(&bv, n, _state); ae_v_move(&bv.ptr.p_double[0], 1, &b.ptr.pp_double[0][0], b.stride, ae_v_len(0,n-1)); spdmatrixsolve(&a, n, isupper, &bv, &info, &rep, &xv, _state); *spderrors = *spderrors||!testdensesolverunit_rmatrixchecksolution(&xe, n, threshold, info, &rep, &xv, _state); info = 0; ae_vector_set_length(&bv, n, _state); ae_v_move(&bv.ptr.p_double[0], 1, &b.ptr.pp_double[0][0], b.stride, ae_v_len(0,n-1)); spdmatrixsolvefast(&a, n, isupper, &bv, &info, _state); seterrorflag(spderrors, !testdensesolverunit_rmatrixchecksolutionfast(&xe, n, threshold, info, &bv, _state), _state); info = 0; testdensesolverunit_unsetrep(&rep, _state); testdensesolverunit_unset2d(&x, _state); spdmatrixcholeskysolvem(&cha, n, isupper, &b, m, &info, &rep, &x, _state); *spderrors = *spderrors||!testdensesolverunit_rmatrixchecksolutionm(&xe, n, m, threshold, info, &rep, &x, _state); info = 0; ae_matrix_set_length(&x, n, m, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=m-1; j++) { x.ptr.pp_double[i][j] = b.ptr.pp_double[i][j]; } } spdmatrixcholeskysolvemfast(&cha, n, isupper, &x, m, &info, _state); seterrorflag(spderrors, !testdensesolverunit_rmatrixchecksolutionmfast(&xe, n, m, threshold, info, &x, _state), _state); info = 0; testdensesolverunit_unsetrep(&rep, _state); testdensesolverunit_unset1d(&xv, _state); ae_vector_set_length(&bv, n, _state); ae_v_move(&bv.ptr.p_double[0], 1, &b.ptr.pp_double[0][0], b.stride, ae_v_len(0,n-1)); spdmatrixcholeskysolve(&cha, n, isupper, &bv, &info, &rep, &xv, _state); *spderrors = 
*spderrors||!testdensesolverunit_rmatrixchecksolution(&xe, n, threshold, info, &rep, &xv, _state); info = 0; ae_vector_set_length(&bv, n, _state); ae_v_move(&bv.ptr.p_double[0], 1, &b.ptr.pp_double[0][0], b.stride, ae_v_len(0,n-1)); spdmatrixcholeskysolvefast(&cha, n, isupper, &bv, &info, _state); seterrorflag(spderrors, !testdensesolverunit_rmatrixchecksolutionfast(&xe, n, threshold, info, &bv, _state), _state); /* * ******************************************************** * EXACTLY SINGULAR MATRICES * ability to detect singularity is tested * ******************************************************** * * 1. generate different types of singular matrices: * * zero * * with zero columns * * with zero rows * * with equal rows/columns * 2. generate random solution vector xe * 3. generate right part b=A*xe * 4. test different methods */ for(taskkind=0; taskkind<=3; taskkind++) { testdensesolverunit_unset2d(&a, _state); if( taskkind==0 ) { /* * all zeros */ ae_matrix_set_length(&a, n, n, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { a.ptr.pp_double[i][j] = (double)(0); } } } if( taskkind==1 ) { /* * there is zero column */ ae_matrix_set_length(&a, n, n, _state); for(i=0; i<=n-1; i++) { for(j=i; j<=n-1; j++) { a.ptr.pp_double[i][j] = 2*ae_randomreal(_state)-1; a.ptr.pp_double[j][i] = a.ptr.pp_double[i][j]; } } k = ae_randominteger(n, _state); ae_v_muld(&a.ptr.pp_double[0][k], a.stride, ae_v_len(0,n-1), 0); ae_v_muld(&a.ptr.pp_double[k][0], 1, ae_v_len(0,n-1), 0); } if( taskkind==2 ) { /* * there is zero row */ ae_matrix_set_length(&a, n, n, _state); for(i=0; i<=n-1; i++) { for(j=i; j<=n-1; j++) { a.ptr.pp_double[i][j] = 2*ae_randomreal(_state)-1; a.ptr.pp_double[j][i] = a.ptr.pp_double[i][j]; } } k = ae_randominteger(n, _state); ae_v_muld(&a.ptr.pp_double[k][0], 1, ae_v_len(0,n-1), 0); ae_v_muld(&a.ptr.pp_double[0][k], a.stride, ae_v_len(0,n-1), 0); } if( taskkind==3 ) { /* * equal columns/rows */ if( n<2 ) { continue; } ae_matrix_set_length(&a, n, n, _state); for(i=0; i<=n-1; i++) { for(j=i; j<=n-1; j++) { a.ptr.pp_double[i][j] = 2*ae_randomreal(_state)-1; a.ptr.pp_double[j][i] = a.ptr.pp_double[i][j]; } } k = 1+ae_randominteger(n-1, _state); ae_v_move(&a.ptr.pp_double[0][0], a.stride, &a.ptr.pp_double[0][k], a.stride, ae_v_len(0,n-1)); ae_v_move(&a.ptr.pp_double[0][0], 1, &a.ptr.pp_double[k][0], 1, ae_v_len(0,n-1)); } ae_matrix_set_length(&xe, n, m, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=m-1; j++) { xe.ptr.pp_double[i][j] = 2*ae_randomreal(_state)-1; } } ae_matrix_set_length(&b, n, m, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=m-1; j++) { v = ae_v_dotproduct(&a.ptr.pp_double[i][0], 1, &xe.ptr.pp_double[0][j], xe.stride, ae_v_len(0,n-1)); b.ptr.pp_double[i][j] = v; } } testdensesolverunit_rmatrixmakeacopy(&a, n, n, &cha, _state); testdensesolverunit_rmatrixdrophalf(&a, n, isupper, _state); testdensesolverunit_rmatrixdrophalf(&cha, n, isupper, _state); /* * Test SPDMatrixSolveM() */ info = 0; testdensesolverunit_unsetrep(&rep, _state); testdensesolverunit_unset2d(&x, _state); spdmatrixsolvem(&a, n, isupper, &b, m, &info, &rep, &x, _state); *spderrors = *spderrors||!testdensesolverunit_rmatrixchecksingularm(n, m, info, &rep, &x, _state); /* * Test SPDMatrixSolveMFast() */ if( (taskkind==0||taskkind==1)||taskkind==2 ) { info = 0; ae_matrix_set_length(&x, n, m, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=m-1; j++) { x.ptr.pp_double[i][j] = b.ptr.pp_double[i][j]; } } spdmatrixsolvemfast(&a, n, isupper, &x, m, &info, _state); seterrorflag(spderrors, 
!testdensesolverunit_rmatrixchecksingularmfast(n, m, info, &x, _state), _state); } /* * Test SPDMatrixSolve() */ info = 0; testdensesolverunit_unsetrep(&rep, _state); testdensesolverunit_unset2d(&x, _state); ae_vector_set_length(&bv, n, _state); ae_v_move(&bv.ptr.p_double[0], 1, &b.ptr.pp_double[0][0], b.stride, ae_v_len(0,n-1)); spdmatrixsolve(&a, n, isupper, &bv, &info, &rep, &xv, _state); *spderrors = *spderrors||!testdensesolverunit_rmatrixchecksingular(n, info, &rep, &xv, _state); /* * Test SPDMatrixSolveFast() */ info = 0; ae_vector_set_length(&bv, n, _state); ae_v_move(&bv.ptr.p_double[0], 1, &b.ptr.pp_double[0][0], b.stride, ae_v_len(0,n-1)); spdmatrixsolvefast(&a, n, isupper, &bv, &info, _state); seterrorflag(spderrors, !testdensesolverunit_rmatrixchecksingular(n, info, &rep, &bv, _state), _state); /* * 'equal columns/rows' are degenerate, but * Cholesky matrix with equal columns/rows IS NOT degenerate, * so it is not used for testing purposes. */ if( taskkind!=3 ) { /* * Test SPDMatrixLUSolveM() (and fast version) */ info = 0; testdensesolverunit_unsetrep(&rep, _state); testdensesolverunit_unset2d(&x, _state); spdmatrixcholeskysolvem(&cha, n, isupper, &b, m, &info, &rep, &x, _state); *spderrors = *spderrors||!testdensesolverunit_rmatrixchecksingularm(n, m, info, &rep, &x, _state); if( (taskkind==0||taskkind==1)||taskkind==2 ) { info = 0; ae_matrix_set_length(&x, n, m, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=m-1; j++) { x.ptr.pp_double[i][j] = b.ptr.pp_double[i][j]; } } spdmatrixcholeskysolvemfast(&a, n, isupper, &x, m, &info, _state); seterrorflag(spderrors, !testdensesolverunit_rmatrixchecksingularmfast(n, m, info, &x, _state), _state); } /* * Test SPDMatrixLUSolve() */ info = 0; testdensesolverunit_unsetrep(&rep, _state); testdensesolverunit_unset2d(&x, _state); ae_vector_set_length(&bv, n, _state); ae_v_move(&bv.ptr.p_double[0], 1, &b.ptr.pp_double[0][0], b.stride, ae_v_len(0,n-1)); spdmatrixcholeskysolve(&cha, n, isupper, &bv, &info, &rep, &xv, _state); *spderrors = *spderrors||!testdensesolverunit_rmatrixchecksingular(n, info, &rep, &xv, _state); if( (taskkind==0||taskkind==1)||taskkind==2 ) { info = 0; ae_vector_set_length(&bv, n, _state); ae_v_move(&bv.ptr.p_double[0], 1, &b.ptr.pp_double[0][0], b.stride, ae_v_len(0,n-1)); spdmatrixcholeskysolvefast(&a, n, isupper, &bv, &info, _state); seterrorflag(spderrors, !testdensesolverunit_rmatrixchecksingularfast(n, info, &bv, _state), _state); } } } } } } ae_frame_leave(_state); } /************************************************************************* Real test *************************************************************************/ static void testdensesolverunit_testcsolver(ae_int_t maxn, ae_int_t maxm, ae_int_t passcount, double threshold, ae_bool* cerrors, ae_bool* rfserrors, ae_state *_state) { ae_frame _frame_block; ae_matrix a; ae_matrix lua; ae_matrix atmp; ae_vector p; ae_matrix xe; ae_matrix b; ae_vector bv; ae_int_t i; ae_int_t j; ae_int_t k; ae_int_t n; ae_int_t m; ae_int_t pass; ae_int_t taskkind; double verr; ae_complex v; ae_int_t info; densesolverreport rep; densesolverlsreport repls; ae_matrix x; ae_vector xv; ae_vector y; ae_vector tx; ae_frame_make(_state, &_frame_block); ae_matrix_init(&a, 0, 0, DT_COMPLEX, _state); ae_matrix_init(&lua, 0, 0, DT_COMPLEX, _state); ae_matrix_init(&atmp, 0, 0, DT_COMPLEX, _state); ae_vector_init(&p, 0, DT_INT, _state); ae_matrix_init(&xe, 0, 0, DT_COMPLEX, _state); ae_matrix_init(&b, 0, 0, DT_COMPLEX, _state); ae_vector_init(&bv, 0, DT_COMPLEX, _state); 
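/*
 * TestCSolver() mirrors the real solver test above: well conditioned complex
 * systems are solved first, then exactly singular ones, and finally iterative
 * improvement is checked on a 100x100 system.
 */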
_densesolverreport_init(&rep, _state); _densesolverlsreport_init(&repls, _state); ae_matrix_init(&x, 0, 0, DT_COMPLEX, _state); ae_vector_init(&xv, 0, DT_COMPLEX, _state); ae_vector_init(&y, 0, DT_COMPLEX, _state); ae_vector_init(&tx, 0, DT_REAL, _state); /* * General square matrices: * * test general solvers * * test least squares solver */ for(pass=1; pass<=passcount; pass++) { for(n=1; n<=maxn; n++) { for(m=1; m<=maxm; m++) { /* * ******************************************************** * WELL CONDITIONED TASKS * ability to find correct solution is tested * ******************************************************** * * 1. generate random well conditioned matrix A. * 2. generate random solution vector xe * 3. generate right part b=A*xe * 4. test different methods on original A */ cmatrixrndcond(n, (double)(1000), &a, _state); testdensesolverunit_cmatrixmakeacopy(&a, n, n, &lua, _state); cmatrixlu(&lua, n, n, &p, _state); ae_matrix_set_length(&xe, n, m, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=m-1; j++) { xe.ptr.pp_complex[i][j].x = 2*ae_randomreal(_state)-1; xe.ptr.pp_complex[i][j].y = 2*ae_randomreal(_state)-1; } } ae_matrix_set_length(&b, n, m, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=m-1; j++) { v = ae_v_cdotproduct(&a.ptr.pp_complex[i][0], 1, "N", &xe.ptr.pp_complex[0][j], xe.stride, "N", ae_v_len(0,n-1)); b.ptr.pp_complex[i][j] = v; } } /* * Test solvers */ info = 0; testdensesolverunit_unsetrep(&rep, _state); testdensesolverunit_cunset2d(&x, _state); cmatrixsolvem(&a, n, &b, m, ae_fp_greater(ae_randomreal(_state),0.5), &info, &rep, &x, _state); *cerrors = *cerrors||!testdensesolverunit_cmatrixchecksolutionm(&xe, n, m, threshold, info, &rep, &x, _state); info = 0; ae_matrix_set_length(&x, n, m, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=m-1; j++) { x.ptr.pp_complex[i][j] = b.ptr.pp_complex[i][j]; } } cmatrixsolvemfast(&a, n, &x, m, &info, _state); *cerrors = *cerrors||!testdensesolverunit_cmatrixchecksolutionmfast(&xe, n, m, threshold, info, &x, _state); info = 0; testdensesolverunit_unsetrep(&rep, _state); testdensesolverunit_cunset1d(&xv, _state); ae_vector_set_length(&bv, n, _state); ae_v_cmove(&bv.ptr.p_complex[0], 1, &b.ptr.pp_complex[0][0], b.stride, "N", ae_v_len(0,n-1)); cmatrixsolve(&a, n, &bv, &info, &rep, &xv, _state); *cerrors = *cerrors||!testdensesolverunit_cmatrixchecksolution(&xe, n, threshold, info, &rep, &xv, _state); info = 0; ae_vector_set_length(&bv, n, _state); ae_v_cmove(&bv.ptr.p_complex[0], 1, &b.ptr.pp_complex[0][0], b.stride, "N", ae_v_len(0,n-1)); cmatrixsolvefast(&a, n, &bv, &info, _state); *cerrors = *cerrors||!testdensesolverunit_cmatrixchecksolutionfast(&xe, n, threshold, info, &bv, _state); info = 0; testdensesolverunit_unsetrep(&rep, _state); testdensesolverunit_cunset2d(&x, _state); cmatrixlusolvem(&lua, &p, n, &b, m, &info, &rep, &x, _state); *cerrors = *cerrors||!testdensesolverunit_cmatrixchecksolutionm(&xe, n, m, threshold, info, &rep, &x, _state); info = 0; ae_matrix_set_length(&x, n, m, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=m-1; j++) { x.ptr.pp_complex[i][j] = b.ptr.pp_complex[i][j]; } } cmatrixlusolvemfast(&lua, &p, n, &x, m, &info, _state); seterrorflag(cerrors, !testdensesolverunit_cmatrixchecksolutionmfast(&xe, n, m, threshold, info, &x, _state), _state); info = 0; testdensesolverunit_unsetrep(&rep, _state); testdensesolverunit_cunset1d(&xv, _state); ae_vector_set_length(&bv, n, _state); ae_v_cmove(&bv.ptr.p_complex[0], 1, &b.ptr.pp_complex[0][0], b.stride, "N", ae_v_len(0,n-1)); cmatrixlusolve(&lua, &p, n, &bv, &info, 
&rep, &xv, _state); *cerrors = *cerrors||!testdensesolverunit_cmatrixchecksolution(&xe, n, threshold, info, &rep, &xv, _state); info = 0; ae_vector_set_length(&bv, n, _state); ae_v_cmove(&bv.ptr.p_complex[0], 1, &b.ptr.pp_complex[0][0], b.stride, "N", ae_v_len(0,n-1)); cmatrixlusolvefast(&lua, &p, n, &bv, &info, _state); seterrorflag(cerrors, !testdensesolverunit_cmatrixchecksolutionfast(&xe, n, threshold, info, &bv, _state), _state); info = 0; testdensesolverunit_unsetrep(&rep, _state); testdensesolverunit_cunset2d(&x, _state); cmatrixmixedsolvem(&a, &lua, &p, n, &b, m, &info, &rep, &x, _state); *cerrors = *cerrors||!testdensesolverunit_cmatrixchecksolutionm(&xe, n, m, threshold, info, &rep, &x, _state); info = 0; testdensesolverunit_unsetrep(&rep, _state); testdensesolverunit_cunset1d(&xv, _state); ae_vector_set_length(&bv, n, _state); ae_v_cmove(&bv.ptr.p_complex[0], 1, &b.ptr.pp_complex[0][0], b.stride, "N", ae_v_len(0,n-1)); cmatrixmixedsolve(&a, &lua, &p, n, &bv, &info, &rep, &xv, _state); *cerrors = *cerrors||!testdensesolverunit_cmatrixchecksolution(&xe, n, threshold, info, &rep, &xv, _state); /* * ******************************************************** * EXACTLY SINGULAR MATRICES * ability to detect singularity is tested * ******************************************************** * * 1. generate different types of singular matrices: * * zero * * with zero columns * * with zero rows * * with equal rows/columns * 2. generate random solution vector xe * 3. generate right part b=A*xe * 4. test different methods */ for(taskkind=0; taskkind<=4; taskkind++) { testdensesolverunit_cunset2d(&a, _state); if( taskkind==0 ) { /* * all zeros */ ae_matrix_set_length(&a, n, n, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { a.ptr.pp_complex[i][j] = ae_complex_from_i(0); } } } if( taskkind==1 ) { /* * there is zero column */ ae_matrix_set_length(&a, n, n, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { a.ptr.pp_complex[i][j].x = 2*ae_randomreal(_state)-1; a.ptr.pp_complex[i][j].y = 2*ae_randomreal(_state)-1; } } k = ae_randominteger(n, _state); ae_v_cmuld(&a.ptr.pp_complex[0][k], a.stride, ae_v_len(0,n-1), 0); } if( taskkind==2 ) { /* * there is zero row */ ae_matrix_set_length(&a, n, n, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { a.ptr.pp_complex[i][j].x = 2*ae_randomreal(_state)-1; a.ptr.pp_complex[i][j].y = 2*ae_randomreal(_state)-1; } } k = ae_randominteger(n, _state); ae_v_cmuld(&a.ptr.pp_complex[k][0], 1, ae_v_len(0,n-1), 0); } if( taskkind==3 ) { /* * equal columns */ if( n<2 ) { continue; } ae_matrix_set_length(&a, n, n, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { a.ptr.pp_complex[i][j].x = 2*ae_randomreal(_state)-1; a.ptr.pp_complex[i][j].y = 2*ae_randomreal(_state)-1; } } k = 1+ae_randominteger(n-1, _state); ae_v_cmove(&a.ptr.pp_complex[0][0], a.stride, &a.ptr.pp_complex[0][k], a.stride, "N", ae_v_len(0,n-1)); } if( taskkind==4 ) { /* * equal rows */ if( n<2 ) { continue; } ae_matrix_set_length(&a, n, n, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { a.ptr.pp_complex[i][j].x = 2*ae_randomreal(_state)-1; a.ptr.pp_complex[i][j].y = 2*ae_randomreal(_state)-1; } } k = 1+ae_randominteger(n-1, _state); ae_v_cmove(&a.ptr.pp_complex[0][0], 1, &a.ptr.pp_complex[k][0], 1, "N", ae_v_len(0,n-1)); } ae_matrix_set_length(&xe, n, m, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=m-1; j++) { xe.ptr.pp_complex[i][j] = ae_complex_from_d(2*ae_randomreal(_state)-1); } } ae_matrix_set_length(&b, n, m, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=m-1; j++) { 
v = ae_v_cdotproduct(&a.ptr.pp_complex[i][0], 1, "N", &xe.ptr.pp_complex[0][j], xe.stride, "N", ae_v_len(0,n-1)); b.ptr.pp_complex[i][j] = v; } } testdensesolverunit_cmatrixmakeacopy(&a, n, n, &lua, _state); cmatrixlu(&lua, n, n, &p, _state); /* * Test CMatrixSolveM() */ info = 0; testdensesolverunit_unsetrep(&rep, _state); testdensesolverunit_cunset2d(&x, _state); cmatrixsolvem(&a, n, &b, m, ae_fp_greater(ae_randomreal(_state),0.5), &info, &rep, &x, _state); *cerrors = *cerrors||!testdensesolverunit_cmatrixchecksingularm(n, m, info, &rep, &x, _state); /* * Test CMatrixSolveMFast(); performed only for matrices * with zero rows or columns */ if( (taskkind==0||taskkind==1)||taskkind==2 ) { info = 0; ae_matrix_set_length(&x, n, m, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=m-1; j++) { x.ptr.pp_complex[i][j] = b.ptr.pp_complex[i][j]; } } cmatrixsolvemfast(&a, n, &x, m, &info, _state); seterrorflag(cerrors, !testdensesolverunit_cmatrixchecksingularmfast(n, m, info, &x, _state), _state); } /* * Test CMatrixSolve() */ info = 0; testdensesolverunit_unsetrep(&rep, _state); testdensesolverunit_cunset2d(&x, _state); ae_vector_set_length(&bv, n, _state); ae_v_cmove(&bv.ptr.p_complex[0], 1, &b.ptr.pp_complex[0][0], b.stride, "N", ae_v_len(0,n-1)); cmatrixsolve(&a, n, &bv, &info, &rep, &xv, _state); *cerrors = *cerrors||!testdensesolverunit_cmatrixchecksingular(n, info, &rep, &xv, _state); if( (taskkind==0||taskkind==1)||taskkind==2 ) { info = 0; ae_vector_set_length(&bv, n, _state); ae_v_cmove(&bv.ptr.p_complex[0], 1, &b.ptr.pp_complex[0][0], b.stride, "N", ae_v_len(0,n-1)); cmatrixsolvefast(&a, n, &bv, &info, _state); seterrorflag(cerrors, !testdensesolverunit_cmatrixchecksingularfast(n, info, &bv, _state), _state); } /* * Test CMatrixLUSolveM() */ info = 0; testdensesolverunit_unsetrep(&rep, _state); testdensesolverunit_cunset2d(&x, _state); cmatrixlusolvem(&lua, &p, n, &b, m, &info, &rep, &x, _state); *cerrors = *cerrors||!testdensesolverunit_cmatrixchecksingularm(n, m, info, &rep, &x, _state); /* * Test CMatrixLUSolveMFast() */ if( (taskkind==0||taskkind==1)||taskkind==2 ) { info = 0; ae_matrix_set_length(&x, n, m, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=m-1; j++) { x.ptr.pp_complex[i][j] = b.ptr.pp_complex[i][j]; } } cmatrixlusolvemfast(&lua, &p, n, &x, m, &info, _state); seterrorflag(cerrors, !testdensesolverunit_cmatrixchecksingularmfast(n, m, info, &x, _state), _state); } /* * Test CMatrixLUSolve() */ info = 0; testdensesolverunit_unsetrep(&rep, _state); testdensesolverunit_cunset2d(&x, _state); ae_vector_set_length(&bv, n, _state); ae_v_cmove(&bv.ptr.p_complex[0], 1, &b.ptr.pp_complex[0][0], b.stride, "N", ae_v_len(0,n-1)); cmatrixlusolve(&lua, &p, n, &bv, &info, &rep, &xv, _state); *cerrors = *cerrors||!testdensesolverunit_cmatrixchecksingular(n, info, &rep, &xv, _state); /* * Test CMatrixLUSolveFast() */ if( (taskkind==0||taskkind==1)||taskkind==2 ) { info = 0; ae_vector_set_length(&bv, n, _state); ae_v_cmove(&bv.ptr.p_complex[0], 1, &b.ptr.pp_complex[0][0], b.stride, "N", ae_v_len(0,n-1)); cmatrixlusolvefast(&lua, &p, n, &bv, &info, _state); seterrorflag(cerrors, !testdensesolverunit_cmatrixchecksingularfast(n, info, &bv, _state), _state); } /* * Test CMatrixMixedSolveM() */ info = 0; testdensesolverunit_unsetrep(&rep, _state); testdensesolverunit_cunset2d(&x, _state); cmatrixmixedsolvem(&a, &lua, &p, n, &b, m, &info, &rep, &x, _state); *cerrors = *cerrors||!testdensesolverunit_cmatrixchecksingularm(n, m, info, &rep, &x, _state); /* * Test CMatrixMixedSolve() */ info = 0; 
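/*
 * The mixed solvers (CMatrixMixedSolveM above, CMatrixMixedSolve below)
 * receive both the original matrix A and its LU factorization (LUA, P);
 * even with this extra information a singular input must still be detected
 * and reported through Info.
 */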
testdensesolverunit_unsetrep(&rep, _state); testdensesolverunit_cunset2d(&x, _state); ae_vector_set_length(&bv, n, _state); ae_v_cmove(&bv.ptr.p_complex[0], 1, &b.ptr.pp_complex[0][0], b.stride, "N", ae_v_len(0,n-1)); cmatrixmixedsolve(&a, &lua, &p, n, &bv, &info, &rep, &xv, _state); *cerrors = *cerrors||!testdensesolverunit_cmatrixchecksingular(n, info, &rep, &xv, _state); } } } } /* * test iterative improvement */ for(pass=1; pass<=passcount; pass++) { /* * Test iterative improvement matrices * * A matrix/right part are constructed such that both matrix * and solution components magnitudes are within (-1,+1). * Such matrix/right part have nice properties - system can * be solved using iterative improvement with |A*x-b| about * several ulps of max(1,|b|). */ n = 100; ae_matrix_set_length(&a, n, n, _state); ae_matrix_set_length(&b, n, 1, _state); ae_vector_set_length(&bv, n, _state); ae_vector_set_length(&tx, 2*n, _state); ae_vector_set_length(&xv, n, _state); ae_vector_set_length(&y, n, _state); for(i=0; i<=n-1; i++) { xv.ptr.p_complex[i].x = 2*ae_randomreal(_state)-1; xv.ptr.p_complex[i].y = 2*ae_randomreal(_state)-1; } for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { a.ptr.pp_complex[i][j].x = 2*ae_randomreal(_state)-1; a.ptr.pp_complex[i][j].y = 2*ae_randomreal(_state)-1; } ae_v_cmove(&y.ptr.p_complex[0], 1, &a.ptr.pp_complex[i][0], 1, "N", ae_v_len(0,n-1)); xcdot(&y, &xv, n, &tx, &v, &verr, _state); bv.ptr.p_complex[i] = v; } ae_v_cmove(&b.ptr.pp_complex[0][0], b.stride, &bv.ptr.p_complex[0], 1, "N", ae_v_len(0,n-1)); /* * Test CMatrixSolveM() */ testdensesolverunit_cunset2d(&x, _state); cmatrixsolvem(&a, n, &b, 1, ae_true, &info, &rep, &x, _state); if( info<=0 ) { *rfserrors = ae_true; } else { ae_vector_set_length(&xv, n, _state); ae_v_cmove(&xv.ptr.p_complex[0], 1, &x.ptr.pp_complex[0][0], x.stride, "N", ae_v_len(0,n-1)); for(i=0; i<=n-1; i++) { ae_v_cmove(&y.ptr.p_complex[0], 1, &a.ptr.pp_complex[i][0], 1, "N", ae_v_len(0,n-1)); xcdot(&y, &xv, n, &tx, &v, &verr, _state); *rfserrors = *rfserrors||ae_fp_greater(ae_c_abs(ae_c_sub(v,b.ptr.pp_complex[i][0]), _state),8*ae_machineepsilon*ae_maxreal((double)(1), ae_c_abs(b.ptr.pp_complex[i][0], _state), _state)); } } /* * Test CMatrixSolve() */ testdensesolverunit_cunset1d(&xv, _state); cmatrixsolve(&a, n, &bv, &info, &rep, &xv, _state); if( info<=0 ) { *rfserrors = ae_true; } else { for(i=0; i<=n-1; i++) { ae_v_cmove(&y.ptr.p_complex[0], 1, &a.ptr.pp_complex[i][0], 1, "N", ae_v_len(0,n-1)); xcdot(&y, &xv, n, &tx, &v, &verr, _state); *rfserrors = *rfserrors||ae_fp_greater(ae_c_abs(ae_c_sub(v,bv.ptr.p_complex[i]), _state),8*ae_machineepsilon*ae_maxreal((double)(1), ae_c_abs(bv.ptr.p_complex[i], _state), _state)); } } /* * TODO: Test LS-solver on the same matrix */ } ae_frame_leave(_state); } /************************************************************************* HPD test *************************************************************************/ static void testdensesolverunit_testhpdsolver(ae_int_t maxn, ae_int_t maxm, ae_int_t passcount, double threshold, ae_bool* hpderrors, ae_bool* rfserrors, ae_state *_state) { ae_frame _frame_block; ae_matrix a; ae_matrix cha; ae_matrix atmp; ae_vector p; ae_matrix xe; ae_matrix b; ae_vector bv; ae_int_t i; ae_int_t j; ae_int_t k; ae_int_t n; ae_int_t m; ae_int_t pass; ae_int_t taskkind; ae_complex v; ae_bool isupper; ae_int_t info; densesolverreport rep; densesolverlsreport repls; ae_matrix x; ae_vector xv; ae_vector y; ae_vector tx; ae_frame_make(_state, &_frame_block); ae_matrix_init(&a, 0, 0, 
DT_COMPLEX, _state); ae_matrix_init(&cha, 0, 0, DT_COMPLEX, _state); ae_matrix_init(&atmp, 0, 0, DT_COMPLEX, _state); ae_vector_init(&p, 0, DT_INT, _state); ae_matrix_init(&xe, 0, 0, DT_COMPLEX, _state); ae_matrix_init(&b, 0, 0, DT_COMPLEX, _state); ae_vector_init(&bv, 0, DT_COMPLEX, _state); _densesolverreport_init(&rep, _state); _densesolverlsreport_init(&repls, _state); ae_matrix_init(&x, 0, 0, DT_COMPLEX, _state); ae_vector_init(&xv, 0, DT_COMPLEX, _state); ae_vector_init(&y, 0, DT_COMPLEX, _state); ae_vector_init(&tx, 0, DT_COMPLEX, _state); /* * General square matrices: * * test general solvers * * test least squares solver */ for(pass=1; pass<=passcount; pass++) { for(n=1; n<=maxn; n++) { for(m=1; m<=maxm; m++) { /* * ******************************************************** * WELL CONDITIONED TASKS * ability to find correct solution is tested * ******************************************************** * * 1. generate random well conditioned matrix A. * 2. generate random solution vector xe * 3. generate right part b=A*xe * 4. test different methods on original A */ isupper = ae_fp_greater(ae_randomreal(_state),0.5); hpdmatrixrndcond(n, (double)(1000), &a, _state); testdensesolverunit_cmatrixmakeacopy(&a, n, n, &cha, _state); if( !hpdmatrixcholesky(&cha, n, isupper, _state) ) { *hpderrors = ae_true; ae_frame_leave(_state); return; } ae_matrix_set_length(&xe, n, m, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=m-1; j++) { xe.ptr.pp_complex[i][j].x = 2*ae_randomreal(_state)-1; xe.ptr.pp_complex[i][j].y = 2*ae_randomreal(_state)-1; } } ae_matrix_set_length(&b, n, m, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=m-1; j++) { v = ae_v_cdotproduct(&a.ptr.pp_complex[i][0], 1, "N", &xe.ptr.pp_complex[0][j], xe.stride, "N", ae_v_len(0,n-1)); b.ptr.pp_complex[i][j] = v; } } testdensesolverunit_cmatrixdrophalf(&a, n, isupper, _state); testdensesolverunit_cmatrixdrophalf(&cha, n, isupper, _state); /* * Test solvers */ info = 0; testdensesolverunit_unsetrep(&rep, _state); testdensesolverunit_cunset2d(&x, _state); hpdmatrixsolvem(&a, n, isupper, &b, m, &info, &rep, &x, _state); *hpderrors = *hpderrors||!testdensesolverunit_cmatrixchecksolutionm(&xe, n, m, threshold, info, &rep, &x, _state); info = 0; ae_matrix_set_length(&x, n, m, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=m-1; j++) { x.ptr.pp_complex[i][j] = b.ptr.pp_complex[i][j]; } } hpdmatrixsolvemfast(&a, n, isupper, &x, m, &info, _state); seterrorflag(hpderrors, !testdensesolverunit_cmatrixchecksolutionmfast(&xe, n, m, threshold, info, &x, _state), _state); info = 0; testdensesolverunit_unsetrep(&rep, _state); testdensesolverunit_cunset1d(&xv, _state); ae_vector_set_length(&bv, n, _state); ae_v_cmove(&bv.ptr.p_complex[0], 1, &b.ptr.pp_complex[0][0], b.stride, "N", ae_v_len(0,n-1)); hpdmatrixsolve(&a, n, isupper, &bv, &info, &rep, &xv, _state); *hpderrors = *hpderrors||!testdensesolverunit_cmatrixchecksolution(&xe, n, threshold, info, &rep, &xv, _state); info = 0; ae_vector_set_length(&bv, n, _state); ae_v_cmove(&bv.ptr.p_complex[0], 1, &b.ptr.pp_complex[0][0], b.stride, "N", ae_v_len(0,n-1)); hpdmatrixsolvefast(&a, n, isupper, &bv, &info, _state); *hpderrors = *hpderrors||!testdensesolverunit_cmatrixchecksolution(&xe, n, threshold, info, &rep, &bv, _state); info = 0; testdensesolverunit_unsetrep(&rep, _state); testdensesolverunit_cunset2d(&x, _state); hpdmatrixcholeskysolvem(&cha, n, isupper, &b, m, &info, &rep, &x, _state); *hpderrors = *hpderrors||!testdensesolverunit_cmatrixchecksolutionm(&xe, n, m, threshold, info, &rep, &x, _state); info 
= 0; ae_matrix_set_length(&x, n, m, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=m-1; j++) { x.ptr.pp_complex[i][j] = b.ptr.pp_complex[i][j]; } } hpdmatrixcholeskysolvemfast(&cha, n, isupper, &x, m, &info, _state); seterrorflag(hpderrors, !testdensesolverunit_cmatrixchecksolutionmfast(&xe, n, m, threshold, info, &x, _state), _state); info = 0; testdensesolverunit_unsetrep(&rep, _state); testdensesolverunit_cunset1d(&xv, _state); ae_vector_set_length(&bv, n, _state); ae_v_cmove(&bv.ptr.p_complex[0], 1, &b.ptr.pp_complex[0][0], b.stride, "N", ae_v_len(0,n-1)); hpdmatrixcholeskysolve(&cha, n, isupper, &bv, &info, &rep, &xv, _state); *hpderrors = *hpderrors||!testdensesolverunit_cmatrixchecksolution(&xe, n, threshold, info, &rep, &xv, _state); info = 0; ae_vector_set_length(&bv, n, _state); ae_v_cmove(&bv.ptr.p_complex[0], 1, &b.ptr.pp_complex[0][0], b.stride, "N", ae_v_len(0,n-1)); hpdmatrixcholeskysolvefast(&cha, n, isupper, &bv, &info, _state); seterrorflag(hpderrors, !testdensesolverunit_cmatrixchecksolutionfast(&xe, n, threshold, info, &bv, _state), _state); /* * ******************************************************** * EXACTLY SINGULAR MATRICES * ability to detect singularity is tested * ******************************************************** * * 1. generate different types of singular matrices: * * zero * * with zero columns * * with zero rows * * with equal rows/columns * 2. generate random solution vector xe * 3. generate right part b=A*xe * 4. test different methods */ for(taskkind=0; taskkind<=3; taskkind++) { testdensesolverunit_cunset2d(&a, _state); if( taskkind==0 ) { /* * all zeros */ ae_matrix_set_length(&a, n, n, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { a.ptr.pp_complex[i][j] = ae_complex_from_i(0); } } } if( taskkind==1 ) { /* * there is zero column */ ae_matrix_set_length(&a, n, n, _state); for(i=0; i<=n-1; i++) { for(j=i; j<=n-1; j++) { a.ptr.pp_complex[i][j].x = 2*ae_randomreal(_state)-1; a.ptr.pp_complex[i][j].y = 2*ae_randomreal(_state)-1; if( i==j ) { a.ptr.pp_complex[i][j].y = (double)(0); } a.ptr.pp_complex[j][i] = a.ptr.pp_complex[i][j]; } } k = ae_randominteger(n, _state); ae_v_cmuld(&a.ptr.pp_complex[0][k], a.stride, ae_v_len(0,n-1), 0); ae_v_cmuld(&a.ptr.pp_complex[k][0], 1, ae_v_len(0,n-1), 0); } if( taskkind==2 ) { /* * there is zero row */ ae_matrix_set_length(&a, n, n, _state); for(i=0; i<=n-1; i++) { for(j=i; j<=n-1; j++) { a.ptr.pp_complex[i][j].x = 2*ae_randomreal(_state)-1; a.ptr.pp_complex[i][j].y = 2*ae_randomreal(_state)-1; if( i==j ) { a.ptr.pp_complex[i][j].y = (double)(0); } a.ptr.pp_complex[j][i] = a.ptr.pp_complex[i][j]; } } k = ae_randominteger(n, _state); ae_v_cmuld(&a.ptr.pp_complex[k][0], 1, ae_v_len(0,n-1), 0); ae_v_cmuld(&a.ptr.pp_complex[0][k], a.stride, ae_v_len(0,n-1), 0); } if( taskkind==3 ) { /* * equal columns/rows */ if( n<2 ) { continue; } ae_matrix_set_length(&a, n, n, _state); for(i=0; i<=n-1; i++) { for(j=i; j<=n-1; j++) { a.ptr.pp_complex[i][j].x = 2*ae_randomreal(_state)-1; a.ptr.pp_complex[i][j].y = 2*ae_randomreal(_state)-1; if( i==j ) { a.ptr.pp_complex[i][j].y = (double)(0); } a.ptr.pp_complex[j][i] = a.ptr.pp_complex[i][j]; } } k = 1+ae_randominteger(n-1, _state); ae_v_cmove(&a.ptr.pp_complex[0][0], a.stride, &a.ptr.pp_complex[0][k], a.stride, "N", ae_v_len(0,n-1)); ae_v_cmove(&a.ptr.pp_complex[0][0], 1, &a.ptr.pp_complex[k][0], 1, "N", ae_v_len(0,n-1)); } ae_matrix_set_length(&xe, n, m, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=m-1; j++) { xe.ptr.pp_complex[i][j] = 
ae_complex_from_d(2*ae_randomreal(_state)-1); } } ae_matrix_set_length(&b, n, m, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=m-1; j++) { v = ae_v_cdotproduct(&a.ptr.pp_complex[i][0], 1, "N", &xe.ptr.pp_complex[0][j], xe.stride, "N", ae_v_len(0,n-1)); b.ptr.pp_complex[i][j] = v; } } testdensesolverunit_cmatrixmakeacopy(&a, n, n, &cha, _state); testdensesolverunit_cmatrixdrophalf(&a, n, isupper, _state); testdensesolverunit_cmatrixdrophalf(&cha, n, isupper, _state); /* * Test SPDMatrixSolveM() (and fast version) */ info = 0; testdensesolverunit_unsetrep(&rep, _state); testdensesolverunit_cunset2d(&x, _state); hpdmatrixsolvem(&a, n, isupper, &b, m, &info, &rep, &x, _state); *hpderrors = *hpderrors||!testdensesolverunit_cmatrixchecksingularm(n, m, info, &rep, &x, _state); if( (taskkind==0||taskkind==1)||taskkind==2 ) { info = 0; ae_matrix_set_length(&x, n, m, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=m-1; j++) { x.ptr.pp_complex[i][j] = b.ptr.pp_complex[i][j]; } } hpdmatrixsolvemfast(&a, n, isupper, &x, m, &info, _state); seterrorflag(hpderrors, !testdensesolverunit_cmatrixchecksingularmfast(n, m, info, &x, _state), _state); } /* * Test SPDMatrixSolve() */ info = 0; testdensesolverunit_unsetrep(&rep, _state); testdensesolverunit_cunset2d(&x, _state); ae_vector_set_length(&bv, n, _state); ae_v_cmove(&bv.ptr.p_complex[0], 1, &b.ptr.pp_complex[0][0], b.stride, "N", ae_v_len(0,n-1)); hpdmatrixsolve(&a, n, isupper, &bv, &info, &rep, &xv, _state); *hpderrors = *hpderrors||!testdensesolverunit_cmatrixchecksingular(n, info, &rep, &xv, _state); if( (taskkind==0||taskkind==1)||taskkind==2 ) { info = 0; ae_vector_set_length(&bv, n, _state); ae_v_cmove(&bv.ptr.p_complex[0], 1, &b.ptr.pp_complex[0][0], b.stride, "N", ae_v_len(0,n-1)); hpdmatrixsolvefast(&a, n, isupper, &bv, &info, _state); seterrorflag(hpderrors, !testdensesolverunit_cmatrixchecksingularfast(n, info, &bv, _state), _state); } /* * 'equal columns/rows' are degenerate, but * Cholesky matrix with equal columns/rows IS NOT degenerate, * so it is not used for testing purposes. 
*/ if( taskkind!=3 ) { /* * Test SPDMatrixCholeskySolveM()/fast */ info = 0; testdensesolverunit_unsetrep(&rep, _state); testdensesolverunit_cunset2d(&x, _state); hpdmatrixcholeskysolvem(&cha, n, isupper, &b, m, &info, &rep, &x, _state); *hpderrors = *hpderrors||!testdensesolverunit_cmatrixchecksingularm(n, m, info, &rep, &x, _state); if( (taskkind==0||taskkind==1)||taskkind==2 ) { info = 0; ae_matrix_set_length(&x, n, m, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=m-1; j++) { x.ptr.pp_complex[i][j] = b.ptr.pp_complex[i][j]; } } hpdmatrixcholeskysolvemfast(&cha, n, isupper, &x, m, &info, _state); seterrorflag(hpderrors, !testdensesolverunit_cmatrixchecksingularmfast(n, m, info, &x, _state), _state); } /* * Test HPDMatrixCholeskySolve() (fast) */ info = 0; testdensesolverunit_unsetrep(&rep, _state); testdensesolverunit_cunset2d(&x, _state); ae_vector_set_length(&bv, n, _state); ae_v_cmove(&bv.ptr.p_complex[0], 1, &b.ptr.pp_complex[0][0], b.stride, "N", ae_v_len(0,n-1)); hpdmatrixcholeskysolve(&cha, n, isupper, &bv, &info, &rep, &xv, _state); *hpderrors = *hpderrors||!testdensesolverunit_cmatrixchecksingular(n, info, &rep, &xv, _state); if( (taskkind==0||taskkind==1)||taskkind==2 ) { ae_vector_set_length(&bv, n, _state); ae_v_cmove(&bv.ptr.p_complex[0], 1, &b.ptr.pp_complex[0][0], b.stride, "N", ae_v_len(0,n-1)); hpdmatrixcholeskysolvefast(&cha, n, isupper, &bv, &info, _state); seterrorflag(hpderrors, !testdensesolverunit_cmatrixchecksingularfast(n, info, &bv, _state), _state); } } } } } } ae_frame_leave(_state); } /************************************************************************* Unsets real matrix *************************************************************************/ static void testdensesolverunit_unset2d(/* Real */ ae_matrix* x, ae_state *_state) { ae_matrix_set_length(x, 1, 1, _state); x->ptr.pp_double[0][0] = 2*ae_randomreal(_state)-1; } /************************************************************************* Unsets real vector *************************************************************************/ static void testdensesolverunit_unset1d(/* Real */ ae_vector* x, ae_state *_state) { ae_vector_set_length(x, 1, _state); x->ptr.p_double[0] = 2*ae_randomreal(_state)-1; } /************************************************************************* Unsets real matrix *************************************************************************/ static void testdensesolverunit_cunset2d(/* Complex */ ae_matrix* x, ae_state *_state) { ae_matrix_set_length(x, 1, 1, _state); x->ptr.pp_complex[0][0] = ae_complex_from_d(2*ae_randomreal(_state)-1); } /************************************************************************* Unsets real vector *************************************************************************/ static void testdensesolverunit_cunset1d(/* Complex */ ae_vector* x, ae_state *_state) { ae_vector_set_length(x, 1, _state); x->ptr.p_complex[0] = ae_complex_from_d(2*ae_randomreal(_state)-1); } /************************************************************************* Unsets report *************************************************************************/ static void testdensesolverunit_unsetrep(densesolverreport* r, ae_state *_state) { r->r1 = (double)(-1); r->rinf = (double)(-1); } /************************************************************************* Unsets report *************************************************************************/ static void testdensesolverunit_unsetlsrep(densesolverlsreport* r, ae_state *_state) { r->r2 = (double)(-1); r->n = -1; r->k = -1; 
testdensesolverunit_unset2d(&r->cx, _state); } ae_bool testnormestimator(ae_bool silent, ae_state *_state) { ae_frame _frame_block; double tol; ae_int_t maxmn; ae_int_t m; ae_int_t n; ae_int_t pass; ae_int_t passcount; ae_matrix a; ae_vector rowsizes; sparsematrix s; double snorm; double enorm; double enorm2; ae_int_t nbetter; double sigma; ae_int_t i; ae_int_t j; normestimatorstate e; normestimatorstate e2; ae_bool waserrors; ae_bool result; ae_frame_make(_state, &_frame_block); ae_matrix_init(&a, 0, 0, DT_REAL, _state); ae_vector_init(&rowsizes, 0, DT_INT, _state); _sparsematrix_init(&s, _state); _normestimatorstate_init(&e, _state); _normestimatorstate_init(&e2, _state); tol = 0.01; maxmn = 5; waserrors = ae_false; /* * First test: algorithm must correctly determine matrix norm */ for(m=1; m<=maxmn; m++) { for(n=1; n<=maxmn; n++) { /* * Create estimator with quite large NStart and NIts. * It should guarantee that we converge to the correct solution. */ normestimatorcreate(m, n, 15, 15, &e, _state); /* * Try with zero A first */ sparsecreate(m, n, 1, &s, _state); sparseconverttocrs(&s, _state); normestimatorestimatesparse(&e, &s, _state); normestimatorresults(&e, &enorm, _state); waserrors = waserrors||ae_fp_neq(enorm,(double)(0)); /* * Choose random norm, try with non-zero matrix * with specified norm. */ snorm = ae_exp(10*ae_randomreal(_state)-5, _state); sparsecreate(m, n, 1, &s, _state); if( m>=n ) { /* * Generate random orthogonal M*M matrix, * use N leading columns as columns of A */ rmatrixrndorthogonal(m, &a, _state); for(i=0; i<=m-1; i++) { for(j=0; j<=n-1; j++) { sparseset(&s, i, j, snorm*a.ptr.pp_double[i][j], _state); } } } else { /* * Generate random orthogonal N*N matrix, * use M leading rows as rows of A */ rmatrixrndorthogonal(n, &a, _state); for(i=0; i<=m-1; i++) { for(j=0; j<=n-1; j++) { sparseset(&s, i, j, snorm*a.ptr.pp_double[i][j], _state); } } } sparseconverttocrs(&s, _state); normestimatorestimatesparse(&e, &s, _state); normestimatorresults(&e, &enorm, _state); waserrors = (waserrors||ae_fp_greater(enorm,snorm*(1+tol)))||ae_fp_less(enorm,snorm*(1-tol)); } } /* * NStart=10 should give statistically better results than NStart=1. * In order to test it we perform PassCount attempts to solve random * problem by means of two estimators: one with NStart=10 and another * one with NStart=1. Every time we compare two estimates and choose * better one. * * Random variable NBetter is a number of cases when NStart=10 was better. * Under null hypothesis (no difference) it is binomially distributed * with mean PassCount/2 and variance PassCount/4. However, we expect * to have significant deviation to the right, in the area of larger * values. * * NOTE: we use fixed N because this test is independent of problem size. 
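 * With PassCount=2000 and Sigma=5.0 the acceptance threshold used below is
 * 0.5*2000 + 5*sqrt(0.25*2000) = 1000 + 5*sqrt(500), roughly 1112, i.e.
 * NStart=10 has to produce the better estimate in at least ~1112 of the
 * 2000 passes.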
*/ n = 3; normestimatorcreate(n, n, 1, 1, &e, _state); normestimatorcreate(n, n, 10, 1, &e2, _state); normestimatorsetseed(&e, 0, _state); normestimatorsetseed(&e2, 0, _state); nbetter = 0; passcount = 2000; sigma = 5.0; for(pass=1; pass<=passcount; pass++) { snorm = ae_pow(10.0, 2*ae_randomreal(_state)-1, _state); sparsecreate(n, n, 1, &s, _state); rmatrixrndcond(n, 2.0, &a, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { sparseset(&s, i, j, snorm*a.ptr.pp_double[i][j], _state); } } sparseconverttocrs(&s, _state); normestimatorestimatesparse(&e, &s, _state); normestimatorresults(&e, &enorm, _state); normestimatorestimatesparse(&e2, &s, _state); normestimatorresults(&e2, &enorm2, _state); if( ae_fp_less(ae_fabs(enorm2-snorm, _state),ae_fabs(enorm-snorm, _state)) ) { nbetter = nbetter+1; } } waserrors = waserrors||ae_fp_less((double)(nbetter),0.5*passcount+sigma*ae_sqrt(0.25*passcount, _state)); /* * Same as previous one (for NStart), but tests dependence on NIts. */ n = 3; normestimatorcreate(n, n, 1, 1, &e, _state); normestimatorcreate(n, n, 1, 10, &e2, _state); normestimatorsetseed(&e, 0, _state); normestimatorsetseed(&e2, 0, _state); nbetter = 0; passcount = 2000; sigma = 5.0; for(pass=1; pass<=passcount; pass++) { snorm = ae_pow(10.0, 2*ae_randomreal(_state)-1, _state); sparsecreate(n, n, 1, &s, _state); rmatrixrndcond(n, 2.0, &a, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { sparseset(&s, i, j, snorm*a.ptr.pp_double[i][j], _state); } } sparseconverttocrs(&s, _state); normestimatorestimatesparse(&e, &s, _state); normestimatorresults(&e, &enorm, _state); normestimatorestimatesparse(&e2, &s, _state); normestimatorresults(&e2, &enorm2, _state); if( ae_fp_less(ae_fabs(enorm2-snorm, _state),ae_fabs(enorm-snorm, _state)) ) { nbetter = nbetter+1; } } waserrors = waserrors||ae_fp_less((double)(nbetter),0.5*passcount+sigma*ae_sqrt(0.25*passcount, _state)); /* * report */ if( !silent ) { printf("TESTING NORM ESTIMATOR\n"); printf("TEST: "); if( !waserrors ) { printf("OK\n"); } else { printf("FAILED\n"); } if( waserrors ) { printf("TEST FAILED\n"); } else { printf("TEST PASSED\n"); } printf("\n\n"); } result = !waserrors; ae_frame_leave(_state); return result; } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. 
*************************************************************************/ ae_bool _pexec_testnormestimator(ae_bool silent, ae_state *_state) { return testnormestimator(silent, _state); } static double testlinlsqrunit_e0 = 1.0E-6; static double testlinlsqrunit_tolort = 1.0E-4; static double testlinlsqrunit_emergencye0 = 1.0E-12; static ae_bool testlinlsqrunit_svdtest(ae_bool silent, ae_state *_state); static ae_bool testlinlsqrunit_mwcranksvdtest(ae_bool silent, ae_state *_state); static ae_bool testlinlsqrunit_mwicranksvdtest(ae_bool silent, ae_state *_state); static ae_bool testlinlsqrunit_bidiagonaltest(ae_bool silent, ae_state *_state); static ae_bool testlinlsqrunit_zeromatrixtest(ae_bool silent, ae_state *_state); static ae_bool testlinlsqrunit_reportcorrectnesstest(ae_bool silent, ae_state *_state); static ae_bool testlinlsqrunit_stoppingcriteriatest(ae_bool silent, ae_state *_state); static ae_bool testlinlsqrunit_analytictest(ae_bool silent, ae_state *_state); static ae_bool testlinlsqrunit_isitgoodsolution(/* Real */ ae_matrix* a, /* Real */ ae_vector* b, ae_int_t m, ae_int_t n, double lambdav, /* Real */ ae_vector* x, double epserr, double epsort, ae_state *_state); static ae_bool testlinlsqrunit_preconditionertest(ae_state *_state); ae_bool testlinlsqr(ae_bool silent, ae_state *_state) { ae_bool svdtesterrors; ae_bool mwcranksvderr; ae_bool mwicranksvderr; ae_bool bidiagonalerr; ae_bool zeromatrixerr; ae_bool reportcorrectnesserr; ae_bool stoppingcriteriaerr; ae_bool analytictesterrors; ae_bool prectesterrors; ae_bool waserrors; ae_bool result; svdtesterrors = testlinlsqrunit_svdtest(ae_true, _state); mwcranksvderr = testlinlsqrunit_mwcranksvdtest(ae_true, _state); mwicranksvderr = testlinlsqrunit_mwicranksvdtest(ae_true, _state); bidiagonalerr = testlinlsqrunit_bidiagonaltest(ae_true, _state); zeromatrixerr = testlinlsqrunit_zeromatrixtest(ae_true, _state); reportcorrectnesserr = testlinlsqrunit_reportcorrectnesstest(ae_true, _state); stoppingcriteriaerr = testlinlsqrunit_stoppingcriteriatest(ae_true, _state); analytictesterrors = testlinlsqrunit_analytictest(ae_true, _state); prectesterrors = testlinlsqrunit_preconditionertest(_state); /* * report */ waserrors = (((((((svdtesterrors||mwcranksvderr)||mwicranksvderr)||bidiagonalerr)||zeromatrixerr)||reportcorrectnesserr)||stoppingcriteriaerr)||analytictesterrors)||prectesterrors; if( !silent ) { printf("TESTING LinLSQR\n"); printf("SVDTest: "); if( svdtesterrors ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("MWCRankSVDTest: "); if( mwcranksvderr ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("MWICRankSVDTest: "); if( mwicranksvderr ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("BidiagonalTest: "); if( bidiagonalerr ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("ZeroMatrixTest: "); if( zeromatrixerr ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("ReportCorrectnessTest: "); if( reportcorrectnesserr ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("StoppingCriteriaTest: "); if( stoppingcriteriaerr ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("Analytic properties: "); if( analytictesterrors ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("Preconditioner test: "); if( prectesterrors ) { printf("FAILED\n"); } else { printf("OK\n"); } /* *was errors? 
*/ if( waserrors ) { printf("TEST FAILED\n"); } else { printf("TEST PASSED\n"); } printf("\n\n"); } result = !waserrors; return result; } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. *************************************************************************/ ae_bool _pexec_testlinlsqr(ae_bool silent, ae_state *_state) { return testlinlsqr(silent, _state); } /************************************************************************* This function generates random MxN problem, solves it with LSQR and compares with results obtained from SVD solver. Matrix A is generated as MxN matrix with uniformly distributed random entries, i.e. it has no special properties (like conditioning or separation of singular values). We apply random amount regularization to our problem (from zero to large) in order to test regularizer. Default stopping criteria are used. Preconditioning is turned off because it skews results for rank-deficient problems. INPUT: Silent - if true then function output report -- ALGLIB -- Copyright 30.11.2011 by Bochkanov Sergey *************************************************************************/ static ae_bool testlinlsqrunit_svdtest(ae_bool silent, ae_state *_state) { ae_frame _frame_block; linlsqrstate s; linlsqrreport rep; sparsematrix spa; ae_matrix a; ae_vector b; ae_vector x0; ae_int_t szn; ae_int_t szm; ae_int_t n; ae_int_t m; double lambdai; ae_int_t i; ae_int_t j; ae_bool result; ae_frame_make(_state, &_frame_block); _linlsqrstate_init(&s, _state); _linlsqrreport_init(&rep, _state); _sparsematrix_init(&spa, _state); ae_matrix_init(&a, 0, 0, DT_REAL, _state); ae_vector_init(&b, 0, DT_REAL, _state); ae_vector_init(&x0, 0, DT_REAL, _state); szm = 5; szn = 5; for(m=1; m<=szm; m++) { for(n=1; n<=szn; n++) { /* * Prepare MxN matrix A, right part B, lambda */ lambdai = ae_randomreal(_state); ae_matrix_set_length(&a, m, n, _state); for(i=0; i<=m-1; i++) { for(j=0; j<=n-1; j++) { a.ptr.pp_double[i][j] = 2*ae_randomreal(_state)-1; } } sparsecreate(m, n, 1, &spa, _state); for(i=0; i<=m-1; i++) { for(j=0; j<=n-1; j++) { sparseset(&spa, i, j, a.ptr.pp_double[i][j], _state); } } sparseconverttocrs(&spa, _state); ae_vector_set_length(&b, m, _state); for(i=0; i<=m-1; i++) { b.ptr.p_double[i] = 2*ae_randomreal(_state)-1; } /* * Solve by calling LinLSQRIteration */ linlsqrcreate(m, n, &s, _state); linlsqrsetb(&s, &b, _state); linlsqrsetlambdai(&s, lambdai, _state); linlsqrsetprecunit(&s, _state); while(linlsqriteration(&s, _state)) { if( s.needmv ) { for(i=0; i<=m-1; i++) { s.mv.ptr.p_double[i] = (double)(0); for(j=0; j<=n-1; j++) { s.mv.ptr.p_double[i] = s.mv.ptr.p_double[i]+a.ptr.pp_double[i][j]*s.x.ptr.p_double[j]; } } } if( s.needmtv ) { for(i=0; i<=n-1; i++) { s.mtv.ptr.p_double[i] = (double)(0); for(j=0; j<=m-1; j++) { s.mtv.ptr.p_double[i] = s.mtv.ptr.p_double[i]+a.ptr.pp_double[j][i]*s.x.ptr.p_double[j]; } } } } linlsqrresults(&s, &x0, &rep, _state); if( !testlinlsqrunit_isitgoodsolution(&a, &b, m, n, lambdai, &x0, testlinlsqrunit_e0, testlinlsqrunit_tolort, _state) ) { result = ae_true; ae_frame_leave(_state); return result; } /* *test LinLSQRRestart and LinLSQRSolveSparse */ linlsqrrestart(&s, _state); linlsqrsolvesparse(&s, &spa, &b, _state); linlsqrresults(&s, &x0, &rep, _state); if( !testlinlsqrunit_isitgoodsolution(&a, &b, m, n, lambdai, &x0, testlinlsqrunit_e0, testlinlsqrunit_tolort, _state) ) { result = ae_true; ae_frame_leave(_state); return result; } } } if( !silent ) { 
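/*
 * Every random system generated above was solved twice: once through the
 * reverse communication loop (the caller supplies A*x when NeedMV is set and
 * A^T*x when NeedMTV is set), and once again via LinLSQRRestart() followed by
 * LinLSQRSolveSparse() on the CRS copy of A; both runs had to pass
 * IsItGoodSolution() to reach this point.
 */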
printf("SVDTest::Ok\n"); } result = ae_false; ae_frame_leave(_state); return result; } /************************************************************************* The test checks that algorithm can solve MxN (with N<=M) well-conditioned problems - and can do so within exactly N iterations. We use moderate condition numbers, from 1.0 to 10.0, because larger condition number may require several additional iterations to converge. We try different scalings of the A and B. INPUT: Silent - if true then function does not outputs results to console -- ALGLIB -- Copyright 30.11.2011 by Bochkanov Sergey *************************************************************************/ static ae_bool testlinlsqrunit_mwcranksvdtest(ae_bool silent, ae_state *_state) { ae_frame _frame_block; linlsqrstate s; linlsqrreport rep; ae_matrix a; ae_vector b; double bnorm; ae_vector x0; ae_int_t szm; ae_int_t n; ae_int_t m; ae_int_t ns0; ae_int_t ns1; ae_int_t nlambdai; double s0; double s1; double lambdai; double c; ae_int_t i; ae_int_t j; ae_bool result; ae_frame_make(_state, &_frame_block); _linlsqrstate_init(&s, _state); _linlsqrreport_init(&rep, _state); ae_matrix_init(&a, 0, 0, DT_REAL, _state); ae_vector_init(&b, 0, DT_REAL, _state); ae_vector_init(&x0, 0, DT_REAL, _state); szm = 5; for(m=1; m<=szm; m++) { for(n=1; n<=m; n++) { for(nlambdai=0; nlambdai<=3; nlambdai++) { for(ns0=-1; ns0<=1; ns0++) { for(ns1=-1; ns1<=1; ns1++) { /* * Generate problem: * * scale factors s0, s1 * * MxN well conditioned A (with condition number C in [1,10] and norm s0) * * regularization coefficient LambdaI * * right part b, with |b|=s1 */ s0 = ae_pow((double)(10), (double)(10*ns0), _state); s1 = ae_pow((double)(10), (double)(10*ns1), _state); lambdai = (double)(0); if( nlambdai==0 ) { lambdai = (double)(0); } if( nlambdai==1 ) { lambdai = s0/1000; } if( nlambdai==2 ) { lambdai = s0; } if( nlambdai==3 ) { lambdai = s0*1000; } c = (10-1)*ae_randomreal(_state)+1; rmatrixrndcond(m, c, &a, _state); for(i=0; i<=m-1; i++) { for(j=0; j<=n-1; j++) { a.ptr.pp_double[i][j] = s0*a.ptr.pp_double[i][j]; } } ae_vector_set_length(&b, m, _state); do { bnorm = (double)(0); for(i=0; i<=m-1; i++) { b.ptr.p_double[i] = 2*ae_randomreal(_state)-1; bnorm = bnorm+b.ptr.p_double[i]*b.ptr.p_double[i]; } bnorm = ae_sqrt(bnorm, _state); } while(ae_fp_less_eq(bnorm,testlinlsqrunit_e0)); for(i=0; i<=m-1; i++) { b.ptr.p_double[i] = b.ptr.p_double[i]*s1/bnorm; } /* * Solve by LSQR method */ linlsqrcreate(m, n, &s, _state); linlsqrsetb(&s, &b, _state); linlsqrsetcond(&s, (double)(0), (double)(0), n, _state); linlsqrsetlambdai(&s, lambdai, _state); while(linlsqriteration(&s, _state)) { if( s.needmv ) { for(i=0; i<=m-1; i++) { s.mv.ptr.p_double[i] = (double)(0); for(j=0; j<=n-1; j++) { s.mv.ptr.p_double[i] = s.mv.ptr.p_double[i]+a.ptr.pp_double[i][j]*s.x.ptr.p_double[j]; } } } if( s.needmtv ) { for(i=0; i<=n-1; i++) { s.mtv.ptr.p_double[i] = (double)(0); for(j=0; j<=m-1; j++) { s.mtv.ptr.p_double[i] = s.mtv.ptr.p_double[i]+a.ptr.pp_double[j][i]*s.x.ptr.p_double[j]; } } } } linlsqrresults(&s, &x0, &rep, _state); if( !testlinlsqrunit_isitgoodsolution(&a, &b, m, n, lambdai, &x0, testlinlsqrunit_e0, testlinlsqrunit_tolort, _state) ) { result = ae_true; ae_frame_leave(_state); return result; } } } } } } result = ae_false; ae_frame_leave(_state); return result; } /************************************************************************* The test checks that algorithm can find a solution with minimum norm for a singular rectangular problem. 
System matrix has special property - singular values are either zero or well separated from zero. INPUT: Silent - if true then function output report -- ALGLIB -- Copyright 30.11.2011 by Bochkanov Sergey *************************************************************************/ static ae_bool testlinlsqrunit_mwicranksvdtest(ae_bool silent, ae_state *_state) { ae_frame _frame_block; linlsqrstate s; linlsqrreport rep; sparsematrix spa; ae_vector b; double bnorm; ae_vector x0; ae_int_t szm; ae_int_t n; ae_int_t m; ae_int_t nz; ae_int_t ns0; ae_int_t ns1; ae_int_t nlambdai; double s0; double s1; double lambdai; ae_int_t i; ae_int_t j; ae_matrix a; ae_bool result; ae_frame_make(_state, &_frame_block); _linlsqrstate_init(&s, _state); _linlsqrreport_init(&rep, _state); _sparsematrix_init(&spa, _state); ae_vector_init(&b, 0, DT_REAL, _state); ae_vector_init(&x0, 0, DT_REAL, _state); ae_matrix_init(&a, 0, 0, DT_REAL, _state); result = ae_false; szm = 5; for(m=1; m<=szm; m++) { for(n=1; n<=m; n++) { for(nlambdai=0; nlambdai<=2; nlambdai++) { for(ns0=-1; ns0<=1; ns0++) { for(ns1=-1; ns1<=1; ns1++) { for(nz=0; nz<=n-1; nz++) { /* * Generate problem: * * scale coefficients s0, s1 * * regularization coefficient LambdaI * * MxN matrix A, norm(A)=s0, with NZ zero singular values and N-NZ nonzero ones * * right part b with norm(b)=s1 */ s0 = ae_pow((double)(10), (double)(10*ns0), _state); s1 = ae_pow((double)(10), (double)(10*ns1), _state); lambdai = (double)(0); if( nlambdai==0 ) { lambdai = (double)(0); } if( nlambdai==1 ) { lambdai = s0; } if( nlambdai==2 ) { lambdai = s0*1000; } ae_matrix_set_length(&a, m, n, _state); for(i=0; i<=m-1; i++) { for(j=0; j<=n-1; j++) { a.ptr.pp_double[i][j] = (double)(0); } } for(i=0; i<=n-nz-1; i++) { a.ptr.pp_double[i][i] = s0*(0.1+0.9*ae_randomreal(_state)); } rmatrixrndorthogonalfromtheleft(&a, m, n, _state); rmatrixrndorthogonalfromtheright(&a, m, n, _state); ae_vector_set_length(&b, m, _state); do { bnorm = (double)(0); for(i=0; i<=m-1; i++) { b.ptr.p_double[i] = 2*ae_randomreal(_state)-1; bnorm = bnorm+b.ptr.p_double[i]*b.ptr.p_double[i]; } bnorm = ae_sqrt(bnorm, _state); } while(ae_fp_less_eq(bnorm,testlinlsqrunit_e0)); for(i=0; i<=m-1; i++) { b.ptr.p_double[i] = b.ptr.p_double[i]*s1/bnorm; } /* * Solve by LSQR method */ linlsqrcreate(m, n, &s, _state); linlsqrsetb(&s, &b, _state); linlsqrsetcond(&s, testlinlsqrunit_emergencye0, testlinlsqrunit_emergencye0, n, _state); linlsqrsetlambdai(&s, lambdai, _state); while(linlsqriteration(&s, _state)) { if( s.needmv ) { for(i=0; i<=m-1; i++) { s.mv.ptr.p_double[i] = (double)(0); for(j=0; j<=n-1; j++) { s.mv.ptr.p_double[i] = s.mv.ptr.p_double[i]+a.ptr.pp_double[i][j]*s.x.ptr.p_double[j]; } } } if( s.needmtv ) { for(i=0; i<=n-1; i++) { s.mtv.ptr.p_double[i] = (double)(0); for(j=0; j<=m-1; j++) { s.mtv.ptr.p_double[i] = s.mtv.ptr.p_double[i]+a.ptr.pp_double[j][i]*s.x.ptr.p_double[j]; } } } } linlsqrresults(&s, &x0, &rep, _state); /* * Check */ if( !testlinlsqrunit_isitgoodsolution(&a, &b, m, n, lambdai, &x0, testlinlsqrunit_e0, testlinlsqrunit_tolort, _state) ) { result = ae_true; ae_frame_leave(_state); return result; } } } } } } } ae_frame_leave(_state); return result; } /************************************************************************* The test does check, that algorithm can find a solution with minimum norm, if a problem has bidiagonal matrix on diagonals of a lot of zeros. This problem has to lead to case when State.Alpha and State.Beta are zero, and we we can be sure that the algorithm correctly handles it. 
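(Alpha and Beta are, presumably, the Golub-Kahan bidiagonalization
coefficients maintained by LSQR; a zero value means that the recurrence
breaks down early and has to be handled as a special case.)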
We do not use iteration count as stopping condition, because problem can be degenerate and we may need more than N iterations to converge. INPUT: Silent - if true then function output report -- ALGLIB -- Copyright 30.11.2011 by Bochkanov Sergey *************************************************************************/ static ae_bool testlinlsqrunit_bidiagonaltest(ae_bool silent, ae_state *_state) { ae_frame _frame_block; linlsqrstate s; linlsqrreport rep; ae_matrix a; ae_vector b; double bnorm; ae_vector x0; ae_int_t sz; ae_int_t n; ae_int_t m; ae_int_t minmn; ae_int_t ns0; ae_int_t ns1; double s0; double s1; ae_int_t i; ae_int_t j; ae_int_t p; ae_int_t diag; double pz; ae_bool result; ae_frame_make(_state, &_frame_block); _linlsqrstate_init(&s, _state); _linlsqrreport_init(&rep, _state); ae_matrix_init(&a, 0, 0, DT_REAL, _state); ae_vector_init(&b, 0, DT_REAL, _state); ae_vector_init(&x0, 0, DT_REAL, _state); sz = 5; for(m=1; m<=sz; m++) { for(n=1; n<=sz; n++) { minmn = ae_minint(m, n, _state); for(p=0; p<=2; p++) { for(ns0=-1; ns0<=1; ns0++) { for(ns1=-1; ns1<=1; ns1++) { for(diag=0; diag<=1; diag++) { /* * Generate problem: * * scaling coefficients s0, s1 * * bidiagonal A, with probability of having zero element at diagonal equal to PZ */ s0 = ae_pow((double)(10), (double)(10*ns0), _state); s1 = ae_pow((double)(10), (double)(10*ns1), _state); pz = 0.0; if( p==0 ) { pz = 0.25; } if( p==1 ) { pz = 0.5; } if( p==2 ) { pz = 0.75; } ae_matrix_set_length(&a, m, n, _state); for(i=0; i<=m-1; i++) { for(j=0; j<=n-1; j++) { a.ptr.pp_double[i][j] = (double)(0); } } for(i=0; i<=minmn-1; i++) { if( ae_fp_greater_eq(ae_randomreal(_state),pz) ) { a.ptr.pp_double[i][i] = 2*ae_randomreal(_state)-1; } } for(i=1; i<=minmn-1; i++) { if( ae_fp_greater_eq(ae_randomreal(_state),pz) ) { if( diag==0 ) { a.ptr.pp_double[i-1][i] = 2*ae_randomreal(_state)-1; } if( diag==1 ) { a.ptr.pp_double[i][i-1] = 2*ae_randomreal(_state)-1; } } } for(i=0; i<=m-1; i++) { for(j=0; j<=n-1; j++) { a.ptr.pp_double[i][j] = s0*a.ptr.pp_double[i][j]; } } ae_vector_set_length(&b, m, _state); do { bnorm = (double)(0); for(i=0; i<=m-1; i++) { b.ptr.p_double[i] = 2*ae_randomreal(_state)-1; bnorm = bnorm+b.ptr.p_double[i]*b.ptr.p_double[i]; } bnorm = ae_sqrt(bnorm, _state); } while(ae_fp_less_eq(bnorm,testlinlsqrunit_e0)); for(i=0; i<=m-1; i++) { b.ptr.p_double[i] = b.ptr.p_double[i]*s1/bnorm; } /* * LSQR solution */ linlsqrcreate(m, n, &s, _state); linlsqrsetb(&s, &b, _state); linlsqrsetcond(&s, testlinlsqrunit_e0, testlinlsqrunit_e0, 0, _state); while(linlsqriteration(&s, _state)) { if( s.needmv ) { for(i=0; i<=m-1; i++) { s.mv.ptr.p_double[i] = (double)(0); for(j=0; j<=n-1; j++) { s.mv.ptr.p_double[i] = s.mv.ptr.p_double[i]+a.ptr.pp_double[i][j]*s.x.ptr.p_double[j]; } } } if( s.needmtv ) { for(i=0; i<=n-1; i++) { s.mtv.ptr.p_double[i] = (double)(0); for(j=0; j<=m-1; j++) { s.mtv.ptr.p_double[i] = s.mtv.ptr.p_double[i]+a.ptr.pp_double[j][i]*s.x.ptr.p_double[j]; } } } } linlsqrresults(&s, &x0, &rep, _state); /* * Check */ if( !testlinlsqrunit_isitgoodsolution(&a, &b, m, n, 0.0, &x0, testlinlsqrunit_e0, testlinlsqrunit_tolort, _state) ) { result = ae_true; ae_frame_leave(_state); return result; } } } } } } } result = ae_false; ae_frame_leave(_state); return result; } /************************************************************************* The test does check, that algorithm correctly solves a problem in cases: 1. A=0, B<>0; 2. A<>0, B=0; 3. A=0, B=0. If some part is not zero then it filled with ones. 
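NOTE: in all three cases either A=0 or B=0, so the minimum-norm least squares solution is the zero vector (with A=0 the residual does not depend on x at all, and with B=0 the zero vector is already an exact solution); a correct solver is therefore expected to return x with (near-)zero norm.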
INPUT: Silent - if true then function output report -- ALGLIB -- Copyright 30.11.2011 by Bochkanov Sergey *************************************************************************/ static ae_bool testlinlsqrunit_zeromatrixtest(ae_bool silent, ae_state *_state) { ae_frame _frame_block; linlsqrstate s; linlsqrreport rep; ae_matrix a; ae_vector b; ae_vector x0; ae_int_t sz; ae_int_t n; ae_int_t m; ae_int_t i; ae_int_t j; ae_int_t nzeropart; ae_bool result; ae_frame_make(_state, &_frame_block); _linlsqrstate_init(&s, _state); _linlsqrreport_init(&rep, _state); ae_matrix_init(&a, 0, 0, DT_REAL, _state); ae_vector_init(&b, 0, DT_REAL, _state); ae_vector_init(&x0, 0, DT_REAL, _state); sz = 5; result = ae_false; for(m=1; m<=sz; m++) { for(n=1; n<=sz; n++) { for(nzeropart=0; nzeropart<=2; nzeropart++) { /* * Initialize A, b */ ae_matrix_set_length(&a, m, n, _state); if( nzeropart==0||nzeropart==2 ) { for(i=0; i<=m-1; i++) { for(j=0; j<=n-1; j++) { a.ptr.pp_double[i][j] = (double)(0); } } } else { for(i=0; i<=m-1; i++) { for(j=0; j<=n-1; j++) { a.ptr.pp_double[i][j] = (double)(1); } } } ae_vector_set_length(&b, m, _state); if( nzeropart==1||nzeropart==2 ) { for(i=0; i<=m-1; i++) { b.ptr.p_double[i] = (double)(0); } } else { for(i=0; i<=m-1; i++) { b.ptr.p_double[i] = (double)(1); } } /* * LSQR */ linlsqrcreate(m, n, &s, _state); linlsqrsetb(&s, &b, _state); linlsqrsetcond(&s, (double)(0), (double)(0), n, _state); while(linlsqriteration(&s, _state)) { if( s.needmv ) { for(i=0; i<=m-1; i++) { s.mv.ptr.p_double[i] = (double)(0); for(j=0; j<=n-1; j++) { s.mv.ptr.p_double[i] = s.mv.ptr.p_double[i]+a.ptr.pp_double[i][j]*s.x.ptr.p_double[j]; } } } if( s.needmtv ) { for(i=0; i<=n-1; i++) { s.mtv.ptr.p_double[i] = (double)(0); for(j=0; j<=m-1; j++) { s.mtv.ptr.p_double[i] = s.mtv.ptr.p_double[i]+a.ptr.pp_double[j][i]*s.x.ptr.p_double[j]; } } } } linlsqrresults(&s, &x0, &rep, _state); /* * Check */ if( !testlinlsqrunit_isitgoodsolution(&a, &b, m, n, 0.0, &x0, testlinlsqrunit_e0, testlinlsqrunit_tolort, _state) ) { result = ae_true; ae_frame_leave(_state); return result; } } } } ae_frame_leave(_state); return result; } /************************************************************************* The test does check, that algorithm correctly displays a progress report. 
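NOTE: progress reports are delivered through the reverse communication loop - after reporting is switched on with linlsqrsetxrep(), linlsqriteration() periodically returns with the XUpdated flag set and the current iterate stored in State.X (the analytic test below uses this mechanism to record trial points).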
INPUT: Silent - if true then function output report -- ALGLIB -- Copyright 30.11.2011 by Bochkanov Sergey *************************************************************************/ static ae_bool testlinlsqrunit_reportcorrectnesstest(ae_bool silent, ae_state *_state) { ae_frame _frame_block; linlsqrstate s; linlsqrreport rep; ae_matrix a; ae_matrix u; ae_matrix v; ae_vector w; ae_vector b; ae_vector x0; ae_vector firstx; ae_vector lastx; double rnorm; double tnorm; ae_int_t sz; ae_int_t n; ae_int_t m; ae_int_t lambdai; double mn; double mx; double c; ae_int_t i; ae_int_t j; ae_int_t its; double tmp; double eps; ae_bool result; ae_frame_make(_state, &_frame_block); _linlsqrstate_init(&s, _state); _linlsqrreport_init(&rep, _state); ae_matrix_init(&a, 0, 0, DT_REAL, _state); ae_matrix_init(&u, 0, 0, DT_REAL, _state); ae_matrix_init(&v, 0, 0, DT_REAL, _state); ae_vector_init(&w, 0, DT_REAL, _state); ae_vector_init(&b, 0, DT_REAL, _state); ae_vector_init(&x0, 0, DT_REAL, _state); ae_vector_init(&firstx, 0, DT_REAL, _state); ae_vector_init(&lastx, 0, DT_REAL, _state); eps = 0.001; sz = 5; mn = (double)(-100); mx = (double)(100); c = (double)(100); for(m=1; m<=sz; m++) { for(n=1; n<=m; n++) { for(lambdai=0; lambdai<=1; lambdai++) { its = -1; /* *initialize matrix A */ spdmatrixrndcond(m+n, c, &a, _state); for(i=m; i<=m+n-1; i++) { for(j=0; j<=n-1; j++) { if( i-m==j ) { a.ptr.pp_double[i][j] = (double)(lambdai); } else { a.ptr.pp_double[i][j] = (double)(0); } } } /* *initialize b */ ae_vector_set_length(&b, m+n, _state); rnorm = (double)(0); for(i=0; i<=m+n-1; i++) { if( imaxits||rep.terminationtype<=0 ) { if( !silent ) { printf("StoppingCriteriaTest::Fail\n"); printf("N=%0d\n", (int)(n)); printf("IterationsCount=%0d\n", (int)(rep.iterationscount)); printf("NMV=%0d\n", (int)(rep.nmv)); printf("TerminationType=%0d\n", (int)(rep.terminationtype)); } result = ae_true; ae_frame_leave(_state); return result; } /* * Test EpsB. * Set EpsB=eps, check that |r|=2) and using first N-1 columns as rectangular * system matrix, and sum of all columns with random non-zero coefficients * as right part. */ for(n=2; n<=sz; n++) { for(k0=-1; k0<=1; k0++) { for(k1=-1; k1<=1; k1++) { /* * Initialize A with non-unit norm 10^(10*K0), b with non-unit norm 10^(10*K1) */ anorm = ae_pow((double)(10), (double)(10*k0), _state); rmatrixrndorthogonal(n, &a, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { a.ptr.pp_double[i][j] = anorm*a.ptr.pp_double[i][j]; } } ae_vector_set_length(&b, n, _state); for(j=0; j<=n-1; j++) { b.ptr.p_double[j] = (double)(0); } for(i=0; i<=n-1; i++) { tmp = 1+ae_randomreal(_state); ae_v_addd(&b.ptr.p_double[0], 1, &a.ptr.pp_double[0][i], a.stride, ae_v_len(0,n-1), tmp); } tmp = (double)(0); for(i=0; i<=n-1; i++) { tmp = tmp+ae_sqr(b.ptr.p_double[i], _state); } tmp = ae_pow((double)(10), (double)(10*k1), _state)/ae_sqrt(tmp, _state); ae_v_muld(&b.ptr.p_double[0], 1, ae_v_len(0,n-1), tmp); /* * Test EpsA * * NOTE: it is guaranteed that algorithm will terminate with correct * TerminationType because other stopping criteria (EpsB) won't be satisfied * on such system. 
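NOTE: the EpsA criterion stops iterations as soon as |A^T*r|/(|A|*|r|)<=EpsA, where r=A*x-b; this is exactly the ratio ArkNorm/(ANorm*RkNorm) recomputed by the check below, and successful termination on this criterion is reported as TerminationType=4.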
*/ eps = ae_pow((double)(10), (double)(-(2+ae_randominteger(3, _state))), _state); epsmod = 1.1*eps; linlsqrcreate(n, n-1, &s, _state); linlsqrsetb(&s, &b, _state); linlsqrsetcond(&s, eps, (double)(0), 0, _state); while(linlsqriteration(&s, _state)) { if( s.needmv ) { for(i=0; i<=n-1; i++) { s.mv.ptr.p_double[i] = (double)(0); for(j=0; j<=n-2; j++) { s.mv.ptr.p_double[i] = s.mv.ptr.p_double[i]+a.ptr.pp_double[i][j]*s.x.ptr.p_double[j]; } } } if( s.needmtv ) { for(i=0; i<=n-2; i++) { s.mtv.ptr.p_double[i] = (double)(0); for(j=0; j<=n-1; j++) { s.mtv.ptr.p_double[i] = s.mtv.ptr.p_double[i]+a.ptr.pp_double[j][i]*s.x.ptr.p_double[j]; } } } } linlsqrresults(&s, &x0, &rep, _state); /* * Check condition */ ae_vector_set_length(&rk, n, _state); ae_vector_set_length(&ark, n-1, _state); rknorm = (double)(0); for(i=0; i<=n-1; i++) { rk.ptr.p_double[i] = b.ptr.p_double[i]; for(j=0; j<=n-2; j++) { rk.ptr.p_double[i] = rk.ptr.p_double[i]-a.ptr.pp_double[i][j]*x0.ptr.p_double[j]; } rknorm = rknorm+ae_sqr(rk.ptr.p_double[i], _state); } rknorm = ae_sqrt(rknorm, _state); arknorm = (double)(0); for(i=0; i<=n-2; i++) { ark.ptr.p_double[i] = (double)(0); for(j=0; j<=n-1; j++) { ark.ptr.p_double[i] = ark.ptr.p_double[i]+a.ptr.pp_double[j][i]*rk.ptr.p_double[j]; } arknorm = arknorm+ae_sqr(ark.ptr.p_double[i], _state); } arknorm = ae_sqrt(arknorm, _state); if( ae_fp_greater(arknorm/(anorm*rknorm),epsmod)||rep.terminationtype!=4 ) { if( !silent ) { printf("StoppingCriteriaTest::Fail\n"); printf("N=%0d\n", (int)(n)); printf("IterationsCount=%0d\n", (int)(rep.iterationscount)); printf("NMV=%0d\n", (int)(rep.nmv)); printf("TerminationType=%0d\n", (int)(rep.terminationtype)); } result = ae_true; ae_frame_leave(_state); return result; } } } } if( !silent ) { printf("StoppingCriteriaTest::Ok\n"); } result = ae_false; ae_frame_leave(_state); return result; } /************************************************************************* This test compares LSQR for original system A*x=b against CG for a modified system (A'*A)x = A*b. Both algorithms should give same sequences of trial points (under exact arithmetics, or for very good conditioned systems). 
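NOTE: the modified system referred to above is the system of normal equations (A^T*A)*x = A^T*b; in exact arithmetic LSQR applied to A*x=b and CG applied to the normal equations produce identical sequences of iterates, which is what this test exploits.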
INPUT: Silent - if true then function does not output report -- ALGLIB -- Copyright 30.11.2011 by Bochkanov Sergey *************************************************************************/ static ae_bool testlinlsqrunit_analytictest(ae_bool silent, ae_state *_state) { ae_frame _frame_block; linlsqrstate s; linlsqrreport rep; ae_matrix a; ae_matrix xk; ae_matrix ap; ae_matrix r; ae_vector b; ae_vector tmp; ae_int_t n; ae_int_t m; ae_int_t i; ae_int_t j; ae_int_t k; ae_int_t smallk; ae_int_t pointsstored; double v; double tol; ae_bool result; ae_frame_make(_state, &_frame_block); _linlsqrstate_init(&s, _state); _linlsqrreport_init(&rep, _state); ae_matrix_init(&a, 0, 0, DT_REAL, _state); ae_matrix_init(&xk, 0, 0, DT_REAL, _state); ae_matrix_init(&ap, 0, 0, DT_REAL, _state); ae_matrix_init(&r, 0, 0, DT_REAL, _state); ae_vector_init(&b, 0, DT_REAL, _state); ae_vector_init(&tmp, 0, DT_REAL, _state); /* * Set: * * SmallK - number of steps to check, must be small number in order * to reduce influence of the rounding errors * * Tol - error tolerance for orthogonality/conjugacy criteria */ result = ae_false; smallk = 4; tol = 1.0E-7; for(m=smallk; m<=smallk+5; m++) { for(n=smallk; n<=m; n++) { /* * Prepare problem: * * MxN matrix A, Mx1 vector B * * A is filled with random values from [-1,+1] * * diagonal elements are filled with large positive values * (should make system better conditioned) */ ae_matrix_set_length(&a, m, n, _state); ae_vector_set_length(&b, m, _state); for(i=0; i<=m-1; i++) { for(j=0; j<=n-1; j++) { a.ptr.pp_double[i][j] = 2*ae_randomreal(_state)-1; } b.ptr.p_double[i] = 2*ae_randomreal(_state)-1; } for(i=0; i<=n-1; i++) { a.ptr.pp_double[i][i] = 10*(1+ae_randomreal(_state)); } /* * Solve with LSQR, save trial points into XK[] array */ ae_matrix_set_length(&xk, smallk+1, n, _state); linlsqrcreate(m, n, &s, _state); linlsqrsetb(&s, &b, _state); linlsqrsetcond(&s, (double)(0), (double)(0), smallk, _state); linlsqrsetxrep(&s, ae_true, _state); pointsstored = 0; while(linlsqriteration(&s, _state)) { if( s.needmv ) { for(i=0; i<=m-1; i++) { s.mv.ptr.p_double[i] = (double)(0); for(j=0; j<=n-1; j++) { s.mv.ptr.p_double[i] = s.mv.ptr.p_double[i]+a.ptr.pp_double[i][j]*s.x.ptr.p_double[j]; } } } if( s.needmtv ) { for(i=0; i<=n-1; i++) { s.mtv.ptr.p_double[i] = (double)(0); for(j=0; j<=m-1; j++) { s.mtv.ptr.p_double[i] = s.mtv.ptr.p_double[i]+a.ptr.pp_double[j][i]*s.x.ptr.p_double[j]; } } } if( s.xupdated ) { ae_assert(pointsstoredj * * (p[i]^T)*A'*A*p[j]=0 for i<>j * where r[i]=(A'*A)*x[i]-A'*b is I-th residual , p[i] is I-th step. * * In order to test these criteria we generate two matrices: * * (PointsStored-1)*M matrix AP (matrix of A*p products) * * (PointsStored-1)*N matrix R (matrix of residuals) * Then, we check that each matrix has orthogonal rows. 
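Written out, the relations being verified are r[i]^T*r[j]=0 and (A*p[i])^T*(A*p[j])=0 for i<>j, where p[i]=x[i+1]-x[i] is the i-th step and r[i] is the normal-equations residual at x[i]; since the rows of R and AP hold exactly these vectors, pairwise orthogonality of the rows of each matrix is equivalent to the two relations above.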
*/ ae_matrix_set_length(&ap, pointsstored-1, m, _state); ae_matrix_set_length(&r, pointsstored-1, n, _state); ae_vector_set_length(&tmp, m, _state); for(k=0; k<=pointsstored-2; k++) { /* * Calculate K-th row of AP */ for(i=0; i<=m-1; i++) { ap.ptr.pp_double[k][i] = 0.0; for(j=0; j<=n-1; j++) { ap.ptr.pp_double[k][i] = ap.ptr.pp_double[k][i]+a.ptr.pp_double[i][j]*(xk.ptr.pp_double[k+1][j]-xk.ptr.pp_double[k][j]); } } /* * Calculate K-th row of R */ for(i=0; i<=m-1; i++) { v = ae_v_dotproduct(&a.ptr.pp_double[i][0], 1, &xk.ptr.pp_double[k][0], 1, ae_v_len(0,n-1)); tmp.ptr.p_double[i] = v-b.ptr.p_double[i]; } for(j=0; j<=n-1; j++) { v = ae_v_dotproduct(&a.ptr.pp_double[0][j], a.stride, &tmp.ptr.p_double[0], 1, ae_v_len(0,m-1)); r.ptr.pp_double[k][j] = v; } } for(i=0; i<=pointsstored-2; i++) { for(j=0; j<=pointsstored-2; j++) { if( i!=j ) { v = ae_v_dotproduct(&ap.ptr.pp_double[i][0], 1, &ap.ptr.pp_double[j][0], 1, ae_v_len(0,m-1)); result = result||ae_fp_greater(ae_fabs(v, _state),tol); v = ae_v_dotproduct(&r.ptr.pp_double[i][0], 1, &r.ptr.pp_double[j][0], 1, ae_v_len(0,n-1)); result = result||ae_fp_greater(ae_fabs(v, _state),tol); } } } } } ae_frame_leave(_state); return result; } /************************************************************************* This function compares solution calculated by LSQR with one calculated by SVD solver. Following comparisons are performed: 1. either: 1.a) residual norm |Rk| for LSQR solution is at most epsErr*|B| 1.b) |A^T*Rk|/(|A|*|Rk|)<=EpsOrt 2. norm(LSQR_solution) is at most 1.2*norm(SVD_solution) Test (1) verifies that LSQR found good solution, test (2) verifies that LSQR finds solution with close-to-minimum norm. We use factor as large as 1.2 to test deviation from SVD solution because LSQR is not very good at solving degenerate problems. 
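NOTE: the reference solution is obtained by applying SVD to the augmented (M+N)xN matrix [A; LambdaV*I] with right part [b; 0] and discarding singular values not exceeding sqrt(machine epsilon) times the largest one, which yields the minimum-norm solution of the regularized problem.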
INPUT PARAMETERS: A - array[M,N], system matrix B - right part M, N - problem size LambdaV - regularization value for the problem, >=0 X - array[N], solution found by LSQR EpsErr - tolerance for |A*x-b| EpsOrt - tolerance for |A^T*Rk|/(|A|*|Rk|) RESULT True, for solution which passess all the tests *************************************************************************/ static ae_bool testlinlsqrunit_isitgoodsolution(/* Real */ ae_matrix* a, /* Real */ ae_vector* b, ae_int_t m, ae_int_t n, double lambdav, /* Real */ ae_vector* x, double epserr, double epsort, ae_state *_state) { ae_frame _frame_block; ae_matrix svda; ae_matrix u; ae_matrix vt; ae_vector w; ae_vector svdx; ae_vector tmparr; ae_vector r; ae_int_t i; ae_int_t j; ae_int_t minmn; ae_bool svdresult; double v; double rnorm; double bnorm; double anorm; double atrnorm; double xnorm; double svdxnorm; ae_bool clause1holds; ae_bool clause2holds; ae_bool result; ae_frame_make(_state, &_frame_block); ae_matrix_init(&svda, 0, 0, DT_REAL, _state); ae_matrix_init(&u, 0, 0, DT_REAL, _state); ae_matrix_init(&vt, 0, 0, DT_REAL, _state); ae_vector_init(&w, 0, DT_REAL, _state); ae_vector_init(&svdx, 0, DT_REAL, _state); ae_vector_init(&tmparr, 0, DT_REAL, _state); ae_vector_init(&r, 0, DT_REAL, _state); /* * Solve regularized problem with SVD solver */ ae_matrix_set_length(&svda, m+n, n, _state); for(i=0; i<=m-1; i++) { for(j=0; j<=n-1; j++) { svda.ptr.pp_double[i][j] = a->ptr.pp_double[i][j]; } } for(i=m; i<=m+n-1; i++) { for(j=0; j<=n-1; j++) { if( i-m==j ) { svda.ptr.pp_double[i][j] = lambdav; } else { svda.ptr.pp_double[i][j] = (double)(0); } } } svdresult = rmatrixsvd(&svda, m+n, n, 1, 1, 0, &w, &u, &vt, _state); ae_assert(svdresult, "LINLSQR: internal error in unit test (SVD failed)", _state); minmn = ae_minint(m, n, _state); ae_vector_set_length(&svdx, n, _state); ae_vector_set_length(&tmparr, minmn, _state); for(i=0; i<=minmn-1; i++) { tmparr.ptr.p_double[i] = (double)(0); for(j=0; j<=m-1; j++) { tmparr.ptr.p_double[i] = tmparr.ptr.p_double[i]+u.ptr.pp_double[j][i]*b->ptr.p_double[j]; } if( ae_fp_less_eq(w.ptr.p_double[i],ae_sqrt(ae_machineepsilon, _state)*w.ptr.p_double[0]) ) { tmparr.ptr.p_double[i] = (double)(0); } else { tmparr.ptr.p_double[i] = tmparr.ptr.p_double[i]/w.ptr.p_double[i]; } } for(i=0; i<=n-1; i++) { svdx.ptr.p_double[i] = (double)(0); for(j=0; j<=minmn-1; j++) { svdx.ptr.p_double[i] = svdx.ptr.p_double[i]+vt.ptr.pp_double[j][i]*tmparr.ptr.p_double[j]; } } /* * Calculate residual, perform checks 1.a and 1.b: * * first, we check 1.a * * in case 1.a fails we check 1.b */ ae_vector_set_length(&r, m+n, _state); for(i=0; i<=m+n-1; i++) { v = ae_v_dotproduct(&svda.ptr.pp_double[i][0], 1, &x->ptr.p_double[0], 1, ae_v_len(0,n-1)); r.ptr.p_double[i] = v; if( iptr.p_double[i]; } } v = ae_v_dotproduct(&r.ptr.p_double[0], 1, &r.ptr.p_double[0], 1, ae_v_len(0,m+n-1)); rnorm = ae_sqrt(v, _state); v = ae_v_dotproduct(&b->ptr.p_double[0], 1, &b->ptr.p_double[0], 1, ae_v_len(0,m-1)); bnorm = ae_sqrt(v, _state); if( ae_fp_less_eq(rnorm,epserr*bnorm) ) { /* * 1.a is true, no further checks is needed */ clause1holds = ae_true; } else { /* * 1.a is false, we have to check 1.b * * In order to do so, we calculate ||A|| and ||A^T*Rk||. We do * not store product of A and Rk to some array, all we need is * just one component of product at time, stored in V. 
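Here |A| is accumulated as the Frobenius norm of the augmented matrix, and clause 1.b reads |A^T*Rk|/(|A|*|Rk|)<=EpsOrt, with the degenerate case |A|*|Rk|=0 counted as a pass.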
* */ anorm = (double)(0); atrnorm = (double)(0); for(i=0; i<=n-1; i++) { v = (double)(0); for(j=0; j<=m+n-1; j++) { v = v+svda.ptr.pp_double[j][i]*r.ptr.p_double[j]; anorm = anorm+ae_sqr(svda.ptr.pp_double[j][i], _state); } atrnorm = atrnorm+ae_sqr(v, _state); } anorm = ae_sqrt(anorm, _state); atrnorm = ae_sqrt(atrnorm, _state); clause1holds = ae_fp_eq(anorm*rnorm,(double)(0))||ae_fp_less_eq(atrnorm/(anorm*rnorm),epsort); } /* * Check (2). * Here we assume that Result=True when we enter this block. */ v = ae_v_dotproduct(&x->ptr.p_double[0], 1, &x->ptr.p_double[0], 1, ae_v_len(0,n-1)); xnorm = ae_sqrt(v, _state); v = ae_v_dotproduct(&svdx.ptr.p_double[0], 1, &svdx.ptr.p_double[0], 1, ae_v_len(0,n-1)); svdxnorm = ae_sqrt(v, _state); clause2holds = ae_fp_less_eq(xnorm,1.2*svdxnorm); /* * End */ result = clause1holds&&clause2holds; ae_frame_leave(_state); return result; } /************************************************************************* Function for testing preconditioned LSQR method. -- ALGLIB -- Copyright 14.11.2011 by Bochkanov Sergey *************************************************************************/ static ae_bool testlinlsqrunit_preconditionertest(ae_state *_state) { ae_frame _frame_block; linlsqrstate s; linlsqrreport rep; ae_matrix a; ae_matrix ta; sparsematrix sa; ae_vector b; ae_vector d; ae_vector xe; ae_vector x0; ae_bool bflag; ae_int_t i; ae_int_t j; ae_int_t n; ae_bool result; ae_frame_make(_state, &_frame_block); _linlsqrstate_init(&s, _state); _linlsqrreport_init(&rep, _state); ae_matrix_init(&a, 0, 0, DT_REAL, _state); ae_matrix_init(&ta, 0, 0, DT_REAL, _state); _sparsematrix_init(&sa, _state); ae_vector_init(&b, 0, DT_REAL, _state); ae_vector_init(&d, 0, DT_REAL, _state); ae_vector_init(&xe, 0, DT_REAL, _state); ae_vector_init(&x0, 0, DT_REAL, _state); /* * Test 1. * * We test automatic diagonal preconditioning used by SolveSparse. * In order to do so we: * 1. generate 20*20 matrix A0 with condition number equal to 1.0E1 * 2. generate random "exact" solution xe and right part b=A0*xe * 3. generate random ill-conditioned diagonal scaling matrix D with * condition number equal to 1.0E50: * 4. transform A*x=b into badly scaled problem: * A0*x0=b0 * (A0*D)*(inv(D)*x0)=b0 * finally we got new problem A*x=b with A=A0*D, b=b0, x=inv(D)*x0 * * Then we solve A*x=b: * 1. with default preconditioner * 2. with explicitly activayed diagonal preconditioning * 3. with unit preconditioner. * 1st and 2nd solutions must be close to xe, 3rd solution must be very * far from the true one. 
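NOTE: diagonal preconditioning is, roughly, a change of variables x=inv(P)*y with diagonal P, so the solver effectively works with the column-rescaled matrix A*inv(P); a P built from the column scales of A undoes the ill-conditioned scaling D introduced above, whereas the unit preconditioner leaves the badly scaled columns untouched, which is why it is expected to produce a poor solution here.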
*/ n = 20; rmatrixrndcond(n, 1.0E1, &ta, _state); ae_vector_set_length(&xe, n, _state); for(i=0; i<=n-1; i++) { xe.ptr.p_double[i] = randomnormal(_state); } ae_vector_set_length(&b, n, _state); for(i=0; i<=n-1; i++) { b.ptr.p_double[i] = (double)(0); for(j=0; j<=n-1; j++) { b.ptr.p_double[i] = b.ptr.p_double[i]+ta.ptr.pp_double[i][j]*xe.ptr.p_double[j]; } } ae_vector_set_length(&d, n, _state); for(i=0; i<=n-1; i++) { d.ptr.p_double[i] = ae_pow((double)(10), 100*ae_randomreal(_state)-50, _state); } ae_matrix_set_length(&a, n, n, _state); sparsecreate(n, n, n*n, &sa, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { a.ptr.pp_double[i][j] = ta.ptr.pp_double[i][j]*d.ptr.p_double[j]; sparseset(&sa, i, j, a.ptr.pp_double[i][j], _state); } xe.ptr.p_double[i] = xe.ptr.p_double[i]/d.ptr.p_double[i]; } sparseconverttocrs(&sa, _state); linlsqrcreate(n, n, &s, _state); linlsqrsetcond(&s, (double)(0), (double)(0), 2*n, _state); linlsqrsolvesparse(&s, &sa, &b, _state); linlsqrresults(&s, &x0, &rep, _state); if( rep.terminationtype<=0 ) { result = ae_true; ae_frame_leave(_state); return result; } for(i=0; i<=n-1; i++) { if( ae_fp_greater(ae_fabs(xe.ptr.p_double[i]-x0.ptr.p_double[i], _state),5.0E-2/d.ptr.p_double[i]) ) { result = ae_true; ae_frame_leave(_state); return result; } } linlsqrsetprecunit(&s, _state); linlsqrsolvesparse(&s, &sa, &b, _state); linlsqrresults(&s, &x0, &rep, _state); if( rep.terminationtype>0 ) { bflag = ae_false; for(i=0; i<=n-1; i++) { bflag = bflag||ae_fp_greater(ae_fabs(xe.ptr.p_double[i]-x0.ptr.p_double[i], _state),5.0E-2/d.ptr.p_double[i]); } if( !bflag ) { result = ae_true; ae_frame_leave(_state); return result; } } linlsqrsetprecdiag(&s, _state); linlsqrsolvesparse(&s, &sa, &b, _state); linlsqrresults(&s, &x0, &rep, _state); if( rep.terminationtype<=0 ) { result = ae_true; ae_frame_leave(_state); return result; } for(i=0; i<=n-1; i++) { if( ae_fp_greater(ae_fabs(xe.ptr.p_double[i]-x0.ptr.p_double[i], _state),5.0E-2/d.ptr.p_double[i]) ) { result = ae_true; ae_frame_leave(_state); return result; } } /* *test has been passed */ result = ae_false; ae_frame_leave(_state); return result; } static void testmincgunit_testfunc2(mincgstate* state, ae_state *_state); static void testmincgunit_testfunc3(mincgstate* state, ae_state *_state); static void testmincgunit_calciip2(mincgstate* state, ae_int_t n, ae_state *_state); static void testmincgunit_calclowrank(mincgstate* state, ae_int_t n, ae_int_t vcnt, /* Real */ ae_vector* d, /* Real */ ae_matrix* v, /* Real */ ae_vector* vd, /* Real */ ae_vector* x0, ae_state *_state); static void testmincgunit_testpreconditioning(ae_bool* err, ae_state *_state); static ae_bool testmincgunit_gradientchecktest(ae_state *_state); static void testmincgunit_funcderiv(double a, double b, double c, double d, double x0, double x1, double x2, /* Real */ ae_vector* x, ae_int_t functype, double* f, /* Real */ ae_vector* g, ae_state *_state); ae_bool testmincg(ae_bool silent, ae_state *_state) { ae_frame _frame_block; ae_bool waserrors; ae_bool referror; ae_bool eqerror; ae_bool linerror1; ae_bool linerror2; ae_bool restartserror; ae_bool precerror; ae_bool converror; ae_bool othererrors; ae_bool graderrors; ae_int_t n; ae_vector x; ae_vector xe; ae_vector b; ae_vector xlast; ae_int_t i; ae_int_t j; double v; ae_matrix a; ae_vector diagh; mincgstate state; mincgreport rep; ae_int_t cgtype; ae_int_t difftype; double diffstep; ae_bool result; ae_frame_make(_state, &_frame_block); ae_vector_init(&x, 0, DT_REAL, _state); ae_vector_init(&xe, 0, DT_REAL, 
_state); ae_vector_init(&b, 0, DT_REAL, _state); ae_vector_init(&xlast, 0, DT_REAL, _state); ae_matrix_init(&a, 0, 0, DT_REAL, _state); ae_vector_init(&diagh, 0, DT_REAL, _state); _mincgstate_init(&state, _state); _mincgreport_init(&rep, _state); waserrors = ae_false; referror = ae_false; linerror1 = ae_false; linerror2 = ae_false; eqerror = ae_false; converror = ae_false; restartserror = ae_false; othererrors = ae_false; precerror = ae_false; testmincgunit_testpreconditioning(&precerror, _state); testother(&othererrors, _state); for(difftype=0; difftype<=1; difftype++) { for(cgtype=-1; cgtype<=1; cgtype++) { /* * Reference problem */ ae_vector_set_length(&x, 2+1, _state); n = 3; diffstep = 1.0E-6; x.ptr.p_double[0] = 100*ae_randomreal(_state)-50; x.ptr.p_double[1] = 100*ae_randomreal(_state)-50; x.ptr.p_double[2] = 100*ae_randomreal(_state)-50; if( difftype==0 ) { mincgcreate(n, &x, &state, _state); } if( difftype==1 ) { mincgcreatef(n, &x, diffstep, &state, _state); } mincgsetcgtype(&state, cgtype, _state); while(mincgiteration(&state, _state)) { if( state.needf||state.needfg ) { state.f = ae_sqr(state.x.ptr.p_double[0]-2, _state)+ae_sqr(state.x.ptr.p_double[1], _state)+ae_sqr(state.x.ptr.p_double[2]-state.x.ptr.p_double[0], _state); } if( state.needfg ) { state.g.ptr.p_double[0] = 2*(state.x.ptr.p_double[0]-2)+2*(state.x.ptr.p_double[0]-state.x.ptr.p_double[2]); state.g.ptr.p_double[1] = 2*state.x.ptr.p_double[1]; state.g.ptr.p_double[2] = 2*(state.x.ptr.p_double[2]-state.x.ptr.p_double[0]); } } mincgresults(&state, &x, &rep, _state); referror = (((referror||rep.terminationtype<=0)||ae_fp_greater(ae_fabs(x.ptr.p_double[0]-2, _state),0.001))||ae_fp_greater(ae_fabs(x.ptr.p_double[1], _state),0.001))||ae_fp_greater(ae_fabs(x.ptr.p_double[2]-2, _state),0.001); /* * F2 problem with restarts: * * make several iterations and restart BEFORE termination * * iterate and restart AFTER termination * * NOTE: step is bounded from above to avoid premature convergence */ ae_vector_set_length(&x, 3, _state); n = 3; diffstep = 1.0E-6; x.ptr.p_double[0] = 10+10*ae_randomreal(_state); x.ptr.p_double[1] = 10+10*ae_randomreal(_state); x.ptr.p_double[2] = 10+10*ae_randomreal(_state); if( difftype==0 ) { mincgcreate(n, &x, &state, _state); } if( difftype==1 ) { mincgcreatef(n, &x, diffstep, &state, _state); } mincgsetcgtype(&state, cgtype, _state); mincgsetstpmax(&state, 0.1, _state); mincgsetcond(&state, 0.0000001, 0.0, 0.0, 0, _state); for(i=0; i<=10; i++) { if( !mincgiteration(&state, _state) ) { break; } testmincgunit_testfunc2(&state, _state); } x.ptr.p_double[0] = 10+10*ae_randomreal(_state); x.ptr.p_double[1] = 10+10*ae_randomreal(_state); x.ptr.p_double[2] = 10+10*ae_randomreal(_state); mincgrestartfrom(&state, &x, _state); while(mincgiteration(&state, _state)) { testmincgunit_testfunc2(&state, _state); } mincgresults(&state, &x, &rep, _state); restartserror = (((restartserror||rep.terminationtype<=0)||ae_fp_greater(ae_fabs(x.ptr.p_double[0]-ae_log((double)(2), _state), _state),0.01))||ae_fp_greater(ae_fabs(x.ptr.p_double[1], _state),0.01))||ae_fp_greater(ae_fabs(x.ptr.p_double[2]-ae_log((double)(2), _state), _state),0.01); x.ptr.p_double[0] = 10+10*ae_randomreal(_state); x.ptr.p_double[1] = 10+10*ae_randomreal(_state); x.ptr.p_double[2] = 10+10*ae_randomreal(_state); mincgrestartfrom(&state, &x, _state); while(mincgiteration(&state, _state)) { testmincgunit_testfunc2(&state, _state); } mincgresults(&state, &x, &rep, _state); restartserror = 
(((restartserror||rep.terminationtype<=0)||ae_fp_greater(ae_fabs(x.ptr.p_double[0]-ae_log((double)(2), _state), _state),0.01))||ae_fp_greater(ae_fabs(x.ptr.p_double[1], _state),0.01))||ae_fp_greater(ae_fabs(x.ptr.p_double[2]-ae_log((double)(2), _state), _state),0.01); /* * 1D problem #1 */ ae_vector_set_length(&x, 0+1, _state); n = 1; diffstep = 1.0E-6; x.ptr.p_double[0] = 100*ae_randomreal(_state)-50; if( difftype==0 ) { mincgcreate(n, &x, &state, _state); } if( difftype==1 ) { mincgcreatef(n, &x, diffstep, &state, _state); } mincgsetcgtype(&state, cgtype, _state); while(mincgiteration(&state, _state)) { if( state.needf||state.needfg ) { state.f = -ae_cos(state.x.ptr.p_double[0], _state); } if( state.needfg ) { state.g.ptr.p_double[0] = ae_sin(state.x.ptr.p_double[0], _state); } } mincgresults(&state, &x, &rep, _state); linerror1 = (linerror1||rep.terminationtype<=0)||ae_fp_greater(ae_fabs(x.ptr.p_double[0]/ae_pi-ae_round(x.ptr.p_double[0]/ae_pi, _state), _state),0.001); /* * 1D problem #2 */ ae_vector_set_length(&x, 0+1, _state); n = 1; diffstep = 1.0E-6; x.ptr.p_double[0] = 100*ae_randomreal(_state)-50; if( difftype==0 ) { mincgcreate(n, &x, &state, _state); } if( difftype==1 ) { mincgcreatef(n, &x, diffstep, &state, _state); } mincgsetcgtype(&state, cgtype, _state); while(mincgiteration(&state, _state)) { if( state.needf||state.needfg ) { state.f = ae_sqr(state.x.ptr.p_double[0], _state)/(1+ae_sqr(state.x.ptr.p_double[0], _state)); } if( state.needfg ) { state.g.ptr.p_double[0] = (2*state.x.ptr.p_double[0]*(1+ae_sqr(state.x.ptr.p_double[0], _state))-ae_sqr(state.x.ptr.p_double[0], _state)*2*state.x.ptr.p_double[0])/ae_sqr(1+ae_sqr(state.x.ptr.p_double[0], _state), _state); } } mincgresults(&state, &x, &rep, _state); linerror2 = (linerror2||rep.terminationtype<=0)||ae_fp_greater(ae_fabs(x.ptr.p_double[0], _state),0.001); /* * Linear equations */ diffstep = 1.0E-6; for(n=1; n<=10; n++) { /* * Prepare task */ ae_matrix_set_length(&a, n-1+1, n-1+1, _state); ae_vector_set_length(&x, n-1+1, _state); ae_vector_set_length(&xe, n-1+1, _state); ae_vector_set_length(&b, n-1+1, _state); for(i=0; i<=n-1; i++) { xe.ptr.p_double[i] = 2*ae_randomreal(_state)-1; } for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { a.ptr.pp_double[i][j] = 2*ae_randomreal(_state)-1; } a.ptr.pp_double[i][i] = a.ptr.pp_double[i][i]+3*ae_sign(a.ptr.pp_double[i][i], _state); } for(i=0; i<=n-1; i++) { v = ae_v_dotproduct(&a.ptr.pp_double[i][0], 1, &xe.ptr.p_double[0], 1, ae_v_len(0,n-1)); b.ptr.p_double[i] = v; } /* * Solve task */ for(i=0; i<=n-1; i++) { x.ptr.p_double[i] = 2*ae_randomreal(_state)-1; } if( difftype==0 ) { mincgcreate(n, &x, &state, _state); } if( difftype==1 ) { mincgcreatef(n, &x, diffstep, &state, _state); } mincgsetcgtype(&state, cgtype, _state); while(mincgiteration(&state, _state)) { if( state.needf||state.needfg ) { state.f = (double)(0); } if( state.needfg ) { for(i=0; i<=n-1; i++) { state.g.ptr.p_double[i] = (double)(0); } } for(i=0; i<=n-1; i++) { v = ae_v_dotproduct(&a.ptr.pp_double[i][0], 1, &state.x.ptr.p_double[0], 1, ae_v_len(0,n-1)); if( state.needf||state.needfg ) { state.f = state.f+ae_sqr(v-b.ptr.p_double[i], _state); } if( state.needfg ) { for(j=0; j<=n-1; j++) { state.g.ptr.p_double[j] = state.g.ptr.p_double[j]+2*(v-b.ptr.p_double[i])*a.ptr.pp_double[i][j]; } } } } mincgresults(&state, &x, &rep, _state); eqerror = eqerror||rep.terminationtype<=0; for(i=0; i<=n-1; i++) { eqerror = eqerror||ae_fp_greater(ae_fabs(x.ptr.p_double[i]-xe.ptr.p_double[i], _state),0.001); } } /* * Testing convergence 
properties */ diffstep = 1.0E-6; n = 3; ae_vector_set_length(&x, n, _state); for(i=0; i<=n-1; i++) { x.ptr.p_double[i] = 6*ae_randomreal(_state)-3; } if( difftype==0 ) { mincgcreate(n, &x, &state, _state); } if( difftype==1 ) { mincgcreatef(n, &x, diffstep, &state, _state); } mincgsetcond(&state, 0.001, 0.0, 0.0, 0, _state); mincgsetcgtype(&state, cgtype, _state); while(mincgiteration(&state, _state)) { testmincgunit_testfunc3(&state, _state); } mincgresults(&state, &x, &rep, _state); converror = converror||rep.terminationtype!=4; for(i=0; i<=n-1; i++) { x.ptr.p_double[i] = 6*ae_randomreal(_state)-3; } if( difftype==0 ) { mincgcreate(n, &x, &state, _state); } if( difftype==1 ) { mincgcreatef(n, &x, diffstep, &state, _state); } mincgsetcond(&state, 0.0, 0.001, 0.0, 0, _state); mincgsetcgtype(&state, cgtype, _state); while(mincgiteration(&state, _state)) { testmincgunit_testfunc3(&state, _state); } mincgresults(&state, &x, &rep, _state); converror = converror||rep.terminationtype!=1; for(i=0; i<=n-1; i++) { x.ptr.p_double[i] = 6*ae_randomreal(_state)-3; } if( difftype==0 ) { mincgcreate(n, &x, &state, _state); } if( difftype==1 ) { mincgcreatef(n, &x, diffstep, &state, _state); } mincgsetcond(&state, 0.0, 0.0, 0.001, 0, _state); mincgsetcgtype(&state, cgtype, _state); while(mincgiteration(&state, _state)) { testmincgunit_testfunc3(&state, _state); } mincgresults(&state, &x, &rep, _state); converror = converror||rep.terminationtype!=2; for(i=0; i<=n-1; i++) { x.ptr.p_double[i] = 2*ae_randomreal(_state)-1; } if( difftype==0 ) { mincgcreate(n, &x, &state, _state); } if( difftype==1 ) { mincgcreatef(n, &x, diffstep, &state, _state); } mincgsetcond(&state, 0.0, 0.0, 0.0, 10, _state); mincgsetcgtype(&state, cgtype, _state); while(mincgiteration(&state, _state)) { testmincgunit_testfunc3(&state, _state); } mincgresults(&state, &x, &rep, _state); converror = converror||!((rep.terminationtype==5&&rep.iterationscount==10)||rep.terminationtype==7); } } /* * Test for MinCGGradientCheck */ graderrors = testmincgunit_gradientchecktest(_state); /* * end */ waserrors = (((((((referror||eqerror)||linerror1)||linerror2)||converror)||othererrors)||restartserror)||precerror)||graderrors; if( !silent ) { printf("TESTING CG OPTIMIZATION\n"); printf("REFERENCE PROBLEM: "); if( referror ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("LIN-1 PROBLEM: "); if( linerror1 ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("LIN-2 PROBLEM: "); if( linerror2 ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("LINEAR EQUATIONS: "); if( eqerror ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("RESTARTS: "); if( restartserror ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("PRECONDITIONING: "); if( precerror ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("CONVERGENCE PROPERTIES: "); if( converror ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("OTHER PROPERTIES: "); if( othererrors ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("TEST FOR VERIFICATION OF THE GRADIENT: "); if( graderrors ) { printf("FAILED\n"); } else { printf("OK\n"); } if( waserrors ) { printf("TEST FAILED\n"); } else { printf("TEST PASSED\n"); } printf("\n\n"); } result = !waserrors; ae_frame_leave(_state); return result; } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. 
*************************************************************************/ ae_bool _pexec_testmincg(ae_bool silent, ae_state *_state) { return testmincg(silent, _state); } /************************************************************************* Other properties *************************************************************************/ void testother(ae_bool* err, ae_state *_state) { ae_frame _frame_block; ae_int_t n; ae_vector x; ae_vector s; ae_vector a; ae_vector b; ae_vector h; ae_vector x0; ae_vector x1; ae_vector xlast; ae_matrix fulla; double fprev; double xprev; double stpmax; ae_int_t i; ae_int_t j; ae_int_t k; mincgstate state; mincgreport rep; ae_int_t cgtype; double tmpeps; double epsg; double v; double r; ae_bool hasxlast; double lastscaledstep; ae_int_t pkind; ae_int_t ckind; ae_int_t mkind; ae_int_t dkind; double diffstep; double vc; double vm; ae_bool wasf; ae_bool wasfg; hqrndstate rs; ae_int_t spoiliteration; ae_int_t stopiteration; ae_int_t spoilvar; double spoilval; ae_int_t pass; double ss; ae_int_t callidx; ae_int_t stopcallidx; ae_int_t maxits; ae_bool terminationrequested; ae_frame_make(_state, &_frame_block); ae_vector_init(&x, 0, DT_REAL, _state); ae_vector_init(&s, 0, DT_REAL, _state); ae_vector_init(&a, 0, DT_REAL, _state); ae_vector_init(&b, 0, DT_REAL, _state); ae_vector_init(&h, 0, DT_REAL, _state); ae_vector_init(&x0, 0, DT_REAL, _state); ae_vector_init(&x1, 0, DT_REAL, _state); ae_vector_init(&xlast, 0, DT_REAL, _state); ae_matrix_init(&fulla, 0, 0, DT_REAL, _state); _mincgstate_init(&state, _state); _mincgreport_init(&rep, _state); _hqrndstate_init(&rs, _state); hqrndrandomize(&rs, _state); for(cgtype=-1; cgtype<=1; cgtype++) { /* * Test reports (F should form monotone sequence) */ n = 50; ae_vector_set_length(&x, n, _state); ae_vector_set_length(&xlast, n, _state); for(i=0; i<=n-1; i++) { x.ptr.p_double[i] = (double)(1); } mincgcreate(n, &x, &state, _state); mincgsetcond(&state, (double)(0), (double)(0), (double)(0), 100, _state); mincgsetxrep(&state, ae_true, _state); fprev = ae_maxrealnumber; while(mincgiteration(&state, _state)) { if( state.needfg ) { state.f = (double)(0); for(i=0; i<=n-1; i++) { state.f = state.f+ae_sqr((1+i)*state.x.ptr.p_double[i], _state); state.g.ptr.p_double[i] = 2*(1+i)*state.x.ptr.p_double[i]; } } if( state.xupdated ) { *err = *err||ae_fp_greater(state.f,fprev); if( ae_fp_eq(fprev,ae_maxrealnumber) ) { for(i=0; i<=n-1; i++) { *err = *err||ae_fp_neq(state.x.ptr.p_double[i],x.ptr.p_double[i]); } } fprev = state.f; ae_v_move(&xlast.ptr.p_double[0], 1, &state.x.ptr.p_double[0], 1, ae_v_len(0,n-1)); } } mincgresults(&state, &x, &rep, _state); for(i=0; i<=n-1; i++) { *err = *err||ae_fp_neq(x.ptr.p_double[i],xlast.ptr.p_double[i]); } /* * Test differentiation vs. 
analytic gradient * (first one issues NeedF requests, second one issues NeedFG requests) */ n = 50; diffstep = 1.0E-6; for(dkind=0; dkind<=1; dkind++) { ae_vector_set_length(&x, n, _state); ae_vector_set_length(&xlast, n, _state); for(i=0; i<=n-1; i++) { x.ptr.p_double[i] = (double)(1); } if( dkind==0 ) { mincgcreate(n, &x, &state, _state); } if( dkind==1 ) { mincgcreatef(n, &x, diffstep, &state, _state); } mincgsetcond(&state, (double)(0), (double)(0), (double)(0), n/2, _state); wasf = ae_false; wasfg = ae_false; while(mincgiteration(&state, _state)) { if( state.needf||state.needfg ) { state.f = (double)(0); } for(i=0; i<=n-1; i++) { if( state.needf||state.needfg ) { state.f = state.f+ae_sqr((1+i)*state.x.ptr.p_double[i], _state); } if( state.needfg ) { state.g.ptr.p_double[i] = 2*(1+i)*state.x.ptr.p_double[i]; } } wasf = wasf||state.needf; wasfg = wasfg||state.needfg; } mincgresults(&state, &x, &rep, _state); if( dkind==0 ) { *err = (*err||wasf)||!wasfg; } if( dkind==1 ) { *err = (*err||!wasf)||wasfg; } } /* * Test that numerical differentiation uses scaling. * * In order to test that we solve simple optimization * problem: min(x^2) with initial x equal to 0.0. * * We choose random DiffStep and S, then we check that * optimizer evaluates function at +-DiffStep*S only. */ ae_vector_set_length(&x, 1, _state); ae_vector_set_length(&s, 1, _state); diffstep = ae_randomreal(_state)*1.0E-6; s.ptr.p_double[0] = ae_exp(ae_randomreal(_state)*4-2, _state); x.ptr.p_double[0] = (double)(0); mincgcreatef(1, &x, diffstep, &state, _state); mincgsetcond(&state, 1.0E-6, (double)(0), (double)(0), 0, _state); mincgsetscale(&state, &s, _state); v = (double)(0); while(mincgiteration(&state, _state)) { state.f = ae_sqr(state.x.ptr.p_double[0], _state); v = ae_maxreal(v, ae_fabs(state.x.ptr.p_double[0], _state), _state); } mincgresults(&state, &x, &rep, _state); r = v/(s.ptr.p_double[0]*diffstep); *err = *err||ae_fp_greater(ae_fabs(ae_log(r, _state), _state),ae_log(1+1000*ae_machineepsilon, _state)); /* * Test maximum step */ n = 1; ae_vector_set_length(&x, n, _state); x.ptr.p_double[0] = (double)(100); stpmax = 0.05+0.05*ae_randomreal(_state); mincgcreate(n, &x, &state, _state); mincgsetcond(&state, 1.0E-9, (double)(0), (double)(0), 0, _state); mincgsetstpmax(&state, stpmax, _state); mincgsetxrep(&state, ae_true, _state); xprev = x.ptr.p_double[0]; while(mincgiteration(&state, _state)) { if( state.needfg ) { state.f = ae_exp(state.x.ptr.p_double[0], _state)+ae_exp(-state.x.ptr.p_double[0], _state); state.g.ptr.p_double[0] = ae_exp(state.x.ptr.p_double[0], _state)-ae_exp(-state.x.ptr.p_double[0], _state); *err = *err||ae_fp_greater(ae_fabs(state.x.ptr.p_double[0]-xprev, _state),(1+ae_sqrt(ae_machineepsilon, _state))*stpmax); } if( state.xupdated ) { *err = *err||ae_fp_greater(ae_fabs(state.x.ptr.p_double[0]-xprev, _state),(1+ae_sqrt(ae_machineepsilon, _state))*stpmax); xprev = state.x.ptr.p_double[0]; } } /* * Test correctness of the scaling: * * initial point is random point from [+1,+2]^N * * f(x) = SUM(A[i]*x[i]^4), C[i] is random from [0.01,100] * * we use random scaling matrix * * we test different variants of the preconditioning: * 0) unit preconditioner * 1) random diagonal from [0.01,100] * 2) scale preconditioner * * we set stringent stopping conditions (we try EpsG and EpsX) * * and we test that in the extremum stopping conditions are * satisfied subject to the current scaling coefficients. 
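Concretely, the scaled gradient condition checked below is sqrt(sum((s[i]*g[i])^2))<=EpsG and the scaled step condition is sqrt(sum(((x[i]-xprev[i])/s[i])^2))<=EpsX, where s[] is the vector passed to mincgsetscale().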
*/ tmpeps = 1.0E-10; for(n=1; n<=10; n++) { for(pkind=0; pkind<=2; pkind++) { ae_vector_set_length(&x, n, _state); ae_vector_set_length(&xlast, n, _state); ae_vector_set_length(&a, n, _state); ae_vector_set_length(&s, n, _state); ae_vector_set_length(&h, n, _state); for(i=0; i<=n-1; i++) { x.ptr.p_double[i] = ae_randomreal(_state)+1; a.ptr.p_double[i] = ae_exp(ae_log((double)(100), _state)*(2*ae_randomreal(_state)-1), _state); s.ptr.p_double[i] = ae_exp(ae_log((double)(100), _state)*(2*ae_randomreal(_state)-1), _state); h.ptr.p_double[i] = ae_exp(ae_log((double)(100), _state)*(2*ae_randomreal(_state)-1), _state); } mincgcreate(n, &x, &state, _state); mincgsetscale(&state, &s, _state); mincgsetxrep(&state, ae_true, _state); if( pkind==1 ) { mincgsetprecdiag(&state, &h, _state); } if( pkind==2 ) { mincgsetprecscale(&state, _state); } /* * Test gradient-based stopping condition */ for(i=0; i<=n-1; i++) { x.ptr.p_double[i] = ae_randomreal(_state)+1; } mincgsetcond(&state, tmpeps, (double)(0), (double)(0), 0, _state); mincgrestartfrom(&state, &x, _state); while(mincgiteration(&state, _state)) { if( state.needfg ) { state.f = (double)(0); for(i=0; i<=n-1; i++) { state.f = state.f+a.ptr.p_double[i]*ae_pow(state.x.ptr.p_double[i], (double)(4), _state); state.g.ptr.p_double[i] = 4*a.ptr.p_double[i]*ae_pow(state.x.ptr.p_double[i], (double)(3), _state); } } } mincgresults(&state, &x, &rep, _state); if( rep.terminationtype<=0 ) { *err = ae_true; ae_frame_leave(_state); return; } v = (double)(0); for(i=0; i<=n-1; i++) { v = v+ae_sqr(s.ptr.p_double[i]*4*a.ptr.p_double[i]*ae_pow(x.ptr.p_double[i], (double)(3), _state), _state); } v = ae_sqrt(v, _state); *err = *err||ae_fp_greater(v,tmpeps); /* * Test step-based stopping condition */ for(i=0; i<=n-1; i++) { x.ptr.p_double[i] = ae_randomreal(_state)+1; } hasxlast = ae_false; mincgsetcond(&state, (double)(0), (double)(0), tmpeps, 0, _state); mincgrestartfrom(&state, &x, _state); lastscaledstep = (double)(0); while(mincgiteration(&state, _state)) { if( state.needfg ) { state.f = (double)(0); for(i=0; i<=n-1; i++) { state.f = state.f+a.ptr.p_double[i]*ae_pow(state.x.ptr.p_double[i], (double)(4), _state); state.g.ptr.p_double[i] = 4*a.ptr.p_double[i]*ae_pow(state.x.ptr.p_double[i], (double)(3), _state); } } if( state.xupdated ) { if( hasxlast ) { lastscaledstep = (double)(0); for(i=0; i<=n-1; i++) { lastscaledstep = lastscaledstep+ae_sqr(state.x.ptr.p_double[i]-xlast.ptr.p_double[i], _state)/ae_sqr(s.ptr.p_double[i], _state); } lastscaledstep = ae_sqrt(lastscaledstep, _state); } else { lastscaledstep = (double)(0); } ae_v_move(&xlast.ptr.p_double[0], 1, &state.x.ptr.p_double[0], 1, ae_v_len(0,n-1)); hasxlast = ae_true; } } mincgresults(&state, &x, &rep, _state); if( rep.terminationtype<=0 ) { *err = ae_true; ae_frame_leave(_state); return; } *err = *err||ae_fp_greater(lastscaledstep,tmpeps); } } /* * Check correctness of the "trimming". * * Trimming is a technique which is used to help algorithm * cope with unbounded functions. 
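Roughly speaking, trimming clips abnormally large function values observed during the line search, so that a single step into the region where the objective blows up does not destroy the search direction; in the problem below this region is |x|>=0.999999, where the objective is replaced by the large constant M, while inside the interval the smooth part 1/(1-x)+1/(1+x)+c*x attains its minimum strictly between -1 and +1.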
In order to check this * technique we will try to solve following optimization * problem: * * min f(x) subject to no constraints on X * { 1/(1-x) + 1/(1+x) + c*x, if -0.999999=0.999999 * * where c is either 1.0 or 1.0E+6, M is either 1.0E8, 1.0E20 or +INF * (we try different combinations) */ for(ckind=0; ckind<=1; ckind++) { for(mkind=0; mkind<=2; mkind++) { /* * Choose c and M */ vc = 1.0; vm = 1.0E+8; if( ckind==1 ) { vc = 1.0E+6; } if( mkind==1 ) { vm = 1.0E+20; } if( mkind==2 ) { vm = _state->v_posinf; } /* * Create optimizer, solve optimization problem */ epsg = 1.0E-6*vc; ae_vector_set_length(&x, 1, _state); x.ptr.p_double[0] = 0.0; mincgcreate(1, &x, &state, _state); mincgsetcond(&state, epsg, (double)(0), (double)(0), 0, _state); mincgsetcgtype(&state, cgtype, _state); while(mincgiteration(&state, _state)) { if( state.needfg ) { if( ae_fp_less(-0.999999,state.x.ptr.p_double[0])&&ae_fp_less(state.x.ptr.p_double[0],0.999999) ) { state.f = 1/(1-state.x.ptr.p_double[0])+1/(1+state.x.ptr.p_double[0])+vc*state.x.ptr.p_double[0]; state.g.ptr.p_double[0] = 1/ae_sqr(1-state.x.ptr.p_double[0], _state)-1/ae_sqr(1+state.x.ptr.p_double[0], _state)+vc; } else { state.f = vm; } } } mincgresults(&state, &x, &rep, _state); if( rep.terminationtype<=0 ) { *err = ae_true; ae_frame_leave(_state); return; } *err = *err||ae_fp_greater(ae_fabs(1/ae_sqr(1-x.ptr.p_double[0], _state)-1/ae_sqr(1+x.ptr.p_double[0], _state)+vc, _state),epsg); } } } /* * Test integrity checks for NAN/INF: * * algorithm solves optimization problem, which is normal for some time (quadratic) * * after 5-th step we choose random component of gradient and consistently spoil * it by NAN or INF. * * we check that correct termination code is returned (-8) */ n = 100; for(pass=1; pass<=10; pass++) { spoiliteration = 5; stopiteration = 8; if( ae_fp_greater(hqrndnormal(&rs, _state),(double)(0)) ) { /* * Gradient can be spoiled by +INF, -INF, NAN */ spoilvar = hqrnduniformi(&rs, n, _state); i = hqrnduniformi(&rs, 3, _state); spoilval = _state->v_nan; if( i==0 ) { spoilval = _state->v_neginf; } if( i==1 ) { spoilval = _state->v_posinf; } } else { /* * Function value can be spoiled only by NAN * (+INF can be recognized as legitimate value during optimization) */ spoilvar = -1; spoilval = _state->v_nan; } spdmatrixrndcond(n, 1.0E5, &fulla, _state); ae_vector_set_length(&b, n, _state); ae_vector_set_length(&x0, n, _state); for(i=0; i<=n-1; i++) { b.ptr.p_double[i] = hqrndnormal(&rs, _state); x0.ptr.p_double[i] = hqrndnormal(&rs, _state); } mincgcreate(n, &x0, &state, _state); mincgsetcond(&state, 0.0, 0.0, 0.0, stopiteration, _state); mincgsetxrep(&state, ae_true, _state); k = -1; while(mincgiteration(&state, _state)) { if( state.needfg ) { state.f = (double)(0); for(i=0; i<=n-1; i++) { state.f = state.f+b.ptr.p_double[i]*state.x.ptr.p_double[i]; state.g.ptr.p_double[i] = b.ptr.p_double[i]; for(j=0; j<=n-1; j++) { state.f = state.f+0.5*state.x.ptr.p_double[i]*fulla.ptr.pp_double[i][j]*state.x.ptr.p_double[j]; state.g.ptr.p_double[i] = state.g.ptr.p_double[i]+fulla.ptr.pp_double[i][j]*state.x.ptr.p_double[j]; } } if( k>=spoiliteration ) { if( spoilvar<0 ) { state.f = spoilval; } else { state.g.ptr.p_double[spoilvar] = spoilval; } } continue; } if( state.xupdated ) { inc(&k, _state); continue; } ae_assert(ae_false, "Assertion failed", _state); } mincgresults(&state, &x1, &rep, _state); seterrorflag(err, rep.terminationtype!=-8, _state); } /* * Check algorithm ability to handle request for termination: * * to terminate with correct return code = 8 
* * to return point which was "current" at the moment of termination */ for(pass=1; pass<=50; pass++) { n = 3; ss = (double)(100); ae_vector_set_length(&x, n, _state); ae_vector_set_length(&xlast, n, _state); for(i=0; i<=n-1; i++) { x.ptr.p_double[i] = 6+ae_randomreal(_state); } stopcallidx = ae_randominteger(20, _state); maxits = 25; mincgcreate(n, &x, &state, _state); mincgsetcond(&state, (double)(0), (double)(0), (double)(0), maxits, _state); mincgsetxrep(&state, ae_true, _state); callidx = 0; terminationrequested = ae_false; ae_v_move(&xlast.ptr.p_double[0], 1, &x.ptr.p_double[0], 1, ae_v_len(0,n-1)); while(mincgiteration(&state, _state)) { if( state.needfg ) { state.f = ss*ae_sqr(ae_exp(state.x.ptr.p_double[0], _state)-2, _state)+ae_sqr(state.x.ptr.p_double[1], _state)+ae_sqr(state.x.ptr.p_double[2]-state.x.ptr.p_double[0], _state); state.g.ptr.p_double[0] = 2*ss*(ae_exp(state.x.ptr.p_double[0], _state)-2)*ae_exp(state.x.ptr.p_double[0], _state)+2*(state.x.ptr.p_double[2]-state.x.ptr.p_double[0])*(-1); state.g.ptr.p_double[1] = 2*state.x.ptr.p_double[1]; state.g.ptr.p_double[2] = 2*(state.x.ptr.p_double[2]-state.x.ptr.p_double[0]); if( callidx==stopcallidx ) { mincgrequesttermination(&state, _state); terminationrequested = ae_true; } inc(&callidx, _state); continue; } if( state.xupdated ) { if( !terminationrequested ) { ae_v_move(&xlast.ptr.p_double[0], 1, &state.x.ptr.p_double[0], 1, ae_v_len(0,n-1)); } continue; } ae_assert(ae_false, "Assertion failed", _state); } mincgresults(&state, &x, &rep, _state); seterrorflag(err, rep.terminationtype!=8, _state); for(i=0; i<=n-1; i++) { seterrorflag(err, ae_fp_neq(x.ptr.p_double[i],xlast.ptr.p_double[i]), _state); } } ae_frame_leave(_state); } /************************************************************************* Calculate test function #2 Simple variation of #1, much more nonlinear, which makes unlikely premature convergence of algorithm . *************************************************************************/ static void testmincgunit_testfunc2(mincgstate* state, ae_state *_state) { if( ae_fp_less(state->x.ptr.p_double[0],(double)(100)) ) { if( state->needf||state->needfg ) { state->f = ae_sqr(ae_exp(state->x.ptr.p_double[0], _state)-2, _state)+ae_sqr(ae_sqr(state->x.ptr.p_double[1], _state), _state)+ae_sqr(state->x.ptr.p_double[2]-state->x.ptr.p_double[0], _state); } if( state->needfg ) { state->g.ptr.p_double[0] = 2*(ae_exp(state->x.ptr.p_double[0], _state)-2)*ae_exp(state->x.ptr.p_double[0], _state)+2*(state->x.ptr.p_double[0]-state->x.ptr.p_double[2]); state->g.ptr.p_double[1] = 4*state->x.ptr.p_double[1]*ae_sqr(state->x.ptr.p_double[1], _state); state->g.ptr.p_double[2] = 2*(state->x.ptr.p_double[2]-state->x.ptr.p_double[0]); } } else { if( state->needf||state->needfg ) { state->f = ae_sqrt(ae_maxrealnumber, _state); } if( state->needfg ) { state->g.ptr.p_double[0] = ae_sqrt(ae_maxrealnumber, _state); state->g.ptr.p_double[1] = (double)(0); state->g.ptr.p_double[2] = (double)(0); } } } /************************************************************************* Calculate test function #3 Simple variation of #1, much more nonlinear, with non-zero value at minimum. It achieve two goals: * makes unlikely premature convergence of algorithm . 
* solves some issues with EpsF stopping condition which arise when F(minimum) is zero *************************************************************************/ static void testmincgunit_testfunc3(mincgstate* state, ae_state *_state) { double s; s = 0.001; if( ae_fp_less(state->x.ptr.p_double[0],(double)(100)) ) { if( state->needf||state->needfg ) { state->f = ae_sqr(ae_exp(state->x.ptr.p_double[0], _state)-2, _state)+ae_sqr(ae_sqr(state->x.ptr.p_double[1], _state)+s, _state)+ae_sqr(state->x.ptr.p_double[2]-state->x.ptr.p_double[0], _state); } if( state->needfg ) { state->g.ptr.p_double[0] = 2*(ae_exp(state->x.ptr.p_double[0], _state)-2)*ae_exp(state->x.ptr.p_double[0], _state)+2*(state->x.ptr.p_double[0]-state->x.ptr.p_double[2]); state->g.ptr.p_double[1] = 2*(ae_sqr(state->x.ptr.p_double[1], _state)+s)*2*state->x.ptr.p_double[1]; state->g.ptr.p_double[2] = 2*(state->x.ptr.p_double[2]-state->x.ptr.p_double[0]); } } else { if( state->needf||state->needfg ) { state->f = ae_sqrt(ae_maxrealnumber, _state); } if( state->needfg ) { state->g.ptr.p_double[0] = ae_sqrt(ae_maxrealnumber, _state); state->g.ptr.p_double[1] = (double)(0); state->g.ptr.p_double[2] = (double)(0); } } } /************************************************************************* Calculate test function IIP2 f(x) = sum( ((i*i+1)*x[i])^2, i=0..N-1) It has high condition number which makes fast convergence unlikely without good preconditioner. *************************************************************************/ static void testmincgunit_calciip2(mincgstate* state, ae_int_t n, ae_state *_state) { ae_int_t i; if( state->needf||state->needfg ) { state->f = (double)(0); } for(i=0; i<=n-1; i++) { if( state->needf||state->needfg ) { state->f = state->f+ae_sqr((double)(i*i+1), _state)*ae_sqr(state->x.ptr.p_double[i], _state); } if( state->needfg ) { state->g.ptr.p_double[i] = ae_sqr((double)(i*i+1), _state)*2*state->x.ptr.p_double[i]; } } } /************************************************************************* Calculate test function f(x) = 0.5*(x-x0)'*A*(x-x0), A = D+V'*Vd*V *************************************************************************/ static void testmincgunit_calclowrank(mincgstate* state, ae_int_t n, ae_int_t vcnt, /* Real */ ae_vector* d, /* Real */ ae_matrix* v, /* Real */ ae_vector* vd, /* Real */ ae_vector* x0, ae_state *_state) { ae_int_t i; ae_int_t j; double dx; double t; double t2; state->f = (double)(0); for(i=0; i<=n-1; i++) { state->g.ptr.p_double[i] = (double)(0); } for(i=0; i<=n-1; i++) { dx = state->x.ptr.p_double[i]-x0->ptr.p_double[i]; state->f = state->f+0.5*dx*d->ptr.p_double[i]*dx; state->g.ptr.p_double[i] = state->g.ptr.p_double[i]+d->ptr.p_double[i]*dx; } for(i=0; i<=vcnt-1; i++) { t = (double)(0); for(j=0; j<=n-1; j++) { t = t+v->ptr.pp_double[i][j]*(state->x.ptr.p_double[j]-x0->ptr.p_double[j]); } state->f = state->f+0.5*t*vd->ptr.p_double[i]*t; t2 = t*vd->ptr.p_double[i]; ae_v_addd(&state->g.ptr.p_double[0], 1, &v->ptr.pp_double[i][0], 1, ae_v_len(0,n-1), t2); } } /************************************************************************* This function tests preconditioning On failure sets Err to True (leaves it unchanged otherwise) *************************************************************************/ static void testmincgunit_testpreconditioning(ae_bool* err, ae_state *_state) { ae_frame _frame_block; ae_int_t pass; ae_int_t n; ae_vector x; ae_vector x0; ae_int_t i; ae_int_t j; ae_int_t k; ae_int_t vs; ae_matrix v; ae_vector vd; ae_vector d; ae_vector s; ae_int_t cntb1; 
ae_int_t cntg1; ae_int_t cntb2; ae_int_t cntg2; ae_vector diagh; mincgstate state; mincgreport rep; ae_int_t cgtype; ae_frame_make(_state, &_frame_block); ae_vector_init(&x, 0, DT_REAL, _state); ae_vector_init(&x0, 0, DT_REAL, _state); ae_matrix_init(&v, 0, 0, DT_REAL, _state); ae_vector_init(&vd, 0, DT_REAL, _state); ae_vector_init(&d, 0, DT_REAL, _state); ae_vector_init(&s, 0, DT_REAL, _state); ae_vector_init(&diagh, 0, DT_REAL, _state); _mincgstate_init(&state, _state); _mincgreport_init(&rep, _state); k = 50; for(cgtype=-1; cgtype<=1; cgtype++) { /* * Preconditioner test 1. * * If * * B1 is default preconditioner * * G1 is diagonal precomditioner based on approximate diagonal of Hessian matrix * then "bad" preconditioner is worse than "good" one. * "Worse" means more iterations to converge. * * * We test it using f(x) = sum( ((i*i+1)*x[i])^2, i=0..N-1). * * N - problem size * K - number of repeated passes (should be large enough to average out random factors) */ for(n=10; n<=15; n++) { ae_vector_set_length(&x, n, _state); for(i=0; i<=n-1; i++) { x.ptr.p_double[i] = (double)(0); } mincgcreate(n, &x, &state, _state); mincgsetcgtype(&state, cgtype, _state); /* * Test it with default preconditioner */ mincgsetprecdefault(&state, _state); cntb1 = 0; for(pass=0; pass<=k-1; pass++) { for(i=0; i<=n-1; i++) { x.ptr.p_double[i] = 2*ae_randomreal(_state)-1; } mincgrestartfrom(&state, &x, _state); while(mincgiteration(&state, _state)) { testmincgunit_calciip2(&state, n, _state); } mincgresults(&state, &x, &rep, _state); cntb1 = cntb1+rep.iterationscount; *err = *err||rep.terminationtype<=0; } /* * Test it with perturbed diagonal preconditioner */ ae_vector_set_length(&diagh, n, _state); for(i=0; i<=n-1; i++) { diagh.ptr.p_double[i] = 2*ae_sqr((double)(i*i+1), _state)*(0.8+0.4*ae_randomreal(_state)); } mincgsetprecdiag(&state, &diagh, _state); cntg1 = 0; for(pass=0; pass<=k-1; pass++) { for(i=0; i<=n-1; i++) { x.ptr.p_double[i] = 2*ae_randomreal(_state)-1; } mincgrestartfrom(&state, &x, _state); while(mincgiteration(&state, _state)) { testmincgunit_calciip2(&state, n, _state); } mincgresults(&state, &x, &rep, _state); cntg1 = cntg1+rep.iterationscount; *err = *err||rep.terminationtype<=0; } /* * Compare */ *err = *err||cntb10 ) { ae_matrix_set_length(&v, vs, n, _state); ae_vector_set_length(&vd, vs, _state); for(i=0; i<=vs-1; i++) { for(j=0; j<=n-1; j++) { v.ptr.pp_double[i][j] = 2*ae_randomreal(_state)-1; } vd.ptr.p_double[i] = ae_exp(2*ae_randomreal(_state), _state); } } mincgcreate(n, &x, &state, _state); mincgsetcgtype(&state, cgtype, _state); /* * Test it with default preconditioner */ mincgsetprecdefault(&state, _state); cntb1 = 0; for(pass=0; pass<=k-1; pass++) { for(i=0; i<=n-1; i++) { x.ptr.p_double[i] = 2*ae_randomreal(_state)-1; } mincgrestartfrom(&state, &x, _state); while(mincgiteration(&state, _state)) { testmincgunit_calclowrank(&state, n, vs, &d, &v, &vd, &x0, _state); } mincgresults(&state, &x, &rep, _state); cntb1 = cntb1+rep.iterationscount; *err = *err||rep.terminationtype<=0; } /* * Test it with low rank preconditioner */ mincgsetpreclowrankfast(&state, &d, &vd, &v, vs, _state); cntg1 = 0; for(pass=0; pass<=k-1; pass++) { for(i=0; i<=n-1; i++) { x.ptr.p_double[i] = 2*ae_randomreal(_state)-1; } mincgrestartfrom(&state, &x, _state); while(mincgiteration(&state, _state)) { testmincgunit_calclowrank(&state, n, vs, &d, &v, &vd, &x0, _state); } mincgresults(&state, &x, &rep, _state); cntg1 = cntg1+rep.iterationscount; *err = *err||rep.terminationtype<=0; } /* * Compare */ *err = 
*err||cntb1=1&&functype<=3, "FuncDeriv: incorrect funcType(funcType<1 or funcType>3).", _state); if( functype==1 ) { *f = a*ae_sqr(x->ptr.p_double[0]-x0, _state)+b*ae_sqr(x->ptr.p_double[1]-x1, _state)+c*ae_sqr(x->ptr.p_double[2]-x2, _state)+d; g->ptr.p_double[0] = 2*a*(x->ptr.p_double[0]-x0); g->ptr.p_double[1] = 2*b*(x->ptr.p_double[1]-x1); g->ptr.p_double[2] = 2*c*(x->ptr.p_double[2]-x2); return; } if( functype==2 ) { *f = a*ae_sqr(ae_sin(x->ptr.p_double[0]-x0, _state), _state)+b*ae_sqr(ae_sin(x->ptr.p_double[1]-x1, _state), _state)+c*ae_sqr(ae_sin(x->ptr.p_double[2]-x2, _state), _state)+d; g->ptr.p_double[0] = 2*a*ae_sin(x->ptr.p_double[0]-x0, _state)*ae_cos(x->ptr.p_double[0]-x0, _state); g->ptr.p_double[1] = 2*b*ae_sin(x->ptr.p_double[1]-x1, _state)*ae_cos(x->ptr.p_double[1]-x1, _state); g->ptr.p_double[2] = 2*c*ae_sin(x->ptr.p_double[2]-x2, _state)*ae_cos(x->ptr.p_double[2]-x2, _state); return; } if( functype==3 ) { *f = a*ae_sqr(x->ptr.p_double[0]-x0, _state)+b*ae_sqr(x->ptr.p_double[1]-x1, _state)+c*ae_sqr(x->ptr.p_double[2]-x2-(x->ptr.p_double[0]-x0), _state)+d; g->ptr.p_double[0] = 2*a*(x->ptr.p_double[0]-x0)+2*c*(x->ptr.p_double[0]-x->ptr.p_double[2]-x0+x2); g->ptr.p_double[1] = 2*b*(x->ptr.p_double[1]-x1); g->ptr.p_double[2] = 2*c*(x->ptr.p_double[2]-x->ptr.p_double[0]-x2+x0); return; } } static void testminbleicunit_calciip2(minbleicstate* state, ae_int_t n, ae_int_t fk, ae_state *_state); static void testminbleicunit_testfeasibility(ae_bool* feaserr, ae_bool* converr, ae_bool* interr, ae_state *_state); static void testminbleicunit_testother(ae_bool* err, ae_state *_state); static void testminbleicunit_testconv(ae_bool* err, ae_state *_state); static void testminbleicunit_testpreconditioning(ae_bool* err, ae_state *_state); static void testminbleicunit_setrandompreconditioner(minbleicstate* state, ae_int_t n, ae_int_t preckind, ae_state *_state); static void testminbleicunit_testgradientcheck(ae_bool* testg, ae_state *_state); static void testminbleicunit_testbugs(ae_bool* err, ae_state *_state); static void testminbleicunit_funcderiv(double a, double b, double c, double d, double x0, double x1, double x2, /* Real */ ae_vector* x, ae_int_t functype, double* f, /* Real */ ae_vector* g, ae_state *_state); ae_bool testminbleic(ae_bool silent, ae_state *_state) { ae_bool waserrors; ae_bool feasibilityerrors; ae_bool othererrors; ae_bool precerrors; ae_bool interrors; ae_bool converrors; ae_bool graderrors; ae_bool bugs; ae_bool result; waserrors = ae_false; feasibilityerrors = ae_false; othererrors = ae_false; precerrors = ae_false; interrors = ae_false; converrors = ae_false; graderrors = ae_false; bugs = ae_false; testminbleicunit_testfeasibility(&feasibilityerrors, &converrors, &interrors, _state); testminbleicunit_testother(&othererrors, _state); testminbleicunit_testconv(&converrors, _state); testminbleicunit_testbugs(&bugs, _state); testminbleicunit_testpreconditioning(&precerrors, _state); testminbleicunit_testgradientcheck(&graderrors, _state); /* * end */ waserrors = (((((feasibilityerrors||othererrors)||converrors)||interrors)||precerrors)||graderrors)||bugs; if( !silent ) { printf("TESTING BLEIC OPTIMIZATION\n"); printf("FEASIBILITY PROPERTIES: "); if( feasibilityerrors ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("PRECONDITIONING: "); if( precerrors ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("OTHER PROPERTIES: "); if( othererrors ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("CONVERGENCE PROPERTIES: "); if( converrors ) { 
printf("FAILED\n"); } else { printf("OK\n"); } printf("INTERNAL ERRORS: "); if( interrors ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("TEST FOR VERIFICATION OF THE GRADIENT: "); if( graderrors ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("FIXED BUGS: "); if( bugs ) { printf("FAILED\n"); } else { printf("OK\n"); } if( waserrors ) { printf("TEST FAILED\n"); } else { printf("TEST PASSED\n"); } printf("\n\n"); } result = !waserrors; return result; } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. *************************************************************************/ ae_bool _pexec_testminbleic(ae_bool silent, ae_state *_state) { return testminbleic(silent, _state); } /************************************************************************* Calculate test function IIP2 f(x) = sum( ((i*i+1)^FK*x[i])^2, i=0..N-1) It has high condition number which makes fast convergence unlikely without good preconditioner. *************************************************************************/ static void testminbleicunit_calciip2(minbleicstate* state, ae_int_t n, ae_int_t fk, ae_state *_state) { ae_int_t i; if( state->needfg ) { state->f = (double)(0); } for(i=0; i<=n-1; i++) { if( state->needfg ) { state->f = state->f+ae_pow((double)(i*i+1), (double)(2*fk), _state)*ae_sqr(state->x.ptr.p_double[i], _state); state->g.ptr.p_double[i] = ae_pow((double)(i*i+1), (double)(2*fk), _state)*2*state->x.ptr.p_double[i]; } } } /************************************************************************* This function test feasibility properties. It launches a sequence of problems and examines their solutions. Most of the attention is directed towards feasibility properties, although we make some quick checks to ensure that actual solution is found. On failure sets FeasErr (or ConvErr, depending on failure type) to True, or leaves it unchanged otherwise. IntErr is set to True on internal errors (errors in the control flow). 
*************************************************************************/ static void testminbleicunit_testfeasibility(ae_bool* feaserr, ae_bool* converr, ae_bool* interr, ae_state *_state) { ae_frame _frame_block; ae_int_t pkind; ae_int_t preckind; ae_int_t passcount; ae_int_t pass; ae_int_t n; ae_int_t nmax; ae_int_t i; ae_int_t j; ae_int_t k; ae_int_t p; double v; double v2; double v3; double vv; ae_vector bl; ae_vector bu; ae_vector x; ae_vector g; ae_vector x0; ae_vector xc; ae_vector xs; ae_vector svdw; ae_matrix c; ae_matrix svdu; ae_matrix svdvt; ae_vector ct; minbleicstate state; double epsx; double epsfeas; double weakepsg; minbleicreport rep; ae_int_t dkind; double diffstep; ae_frame_make(_state, &_frame_block); ae_vector_init(&bl, 0, DT_REAL, _state); ae_vector_init(&bu, 0, DT_REAL, _state); ae_vector_init(&x, 0, DT_REAL, _state); ae_vector_init(&g, 0, DT_REAL, _state); ae_vector_init(&x0, 0, DT_REAL, _state); ae_vector_init(&xc, 0, DT_REAL, _state); ae_vector_init(&xs, 0, DT_REAL, _state); ae_vector_init(&svdw, 0, DT_REAL, _state); ae_matrix_init(&c, 0, 0, DT_REAL, _state); ae_matrix_init(&svdu, 0, 0, DT_REAL, _state); ae_matrix_init(&svdvt, 0, 0, DT_REAL, _state); ae_vector_init(&ct, 0, DT_INT, _state); _minbleicstate_init(&state, _state); _minbleicreport_init(&rep, _state); nmax = 5; weakepsg = 1.0E-4; epsx = 1.0E-4; epsfeas = 1.0E-6; passcount = 10; for(pass=1; pass<=passcount; pass++) { /* * Test problem 1: * * no boundary and inequality constraints * * randomly generated plane as equality constraint * * random point (not necessarily on the plane) * * f = |x|^P, P = {2, 4} is used as target function * * preconditioner is chosen at random (we just want to be * sure that preconditioning won't prevent us from converging * to the feasible point): * * unit preconditioner * * random diagonal-based preconditioner * * random scale-based preconditioner * * either analytic gradient or numerical differentiation are used * * we check that after work is over we are on the plane and * that we are in the stationary point of constrained F */ diffstep = 1.0E-6; for(dkind=0; dkind<=1; dkind++) { for(preckind=0; preckind<=2; preckind++) { for(pkind=1; pkind<=2; pkind++) { for(n=1; n<=nmax; n++) { /* * Generate X, BL, BU, CT and left part of C. * * Right part of C is generated using somewhat complex algo: * * we generate random vector and multiply it by C. * * result is used as the right part. * * calculations are done on the fly, vector itself is not stored * We use such algo to be sure that our system is consistent. 
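* (in other words, for some random vector V we set C[0][N] = sum(C[0][i]*V[i], i=0..N-1),
* so the point V satisfies the constraint exactly and the constrained problem is
* guaranteed to be solvable)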
*/ p = 2*pkind; ae_vector_set_length(&x, n, _state); ae_vector_set_length(&g, n, _state); ae_matrix_set_length(&c, 1, n+1, _state); ae_vector_set_length(&ct, 1, _state); c.ptr.pp_double[0][n] = (double)(0); for(i=0; i<=n-1; i++) { x.ptr.p_double[i] = 2*ae_randomreal(_state)-1; c.ptr.pp_double[0][i] = 2*ae_randomreal(_state)-1; v = 2*ae_randomreal(_state)-1; c.ptr.pp_double[0][n] = c.ptr.pp_double[0][n]+c.ptr.pp_double[0][i]*v; } ct.ptr.p_int[0] = 0; /* * Create and optimize */ if( dkind==0 ) { minbleiccreate(n, &x, &state, _state); } if( dkind==1 ) { minbleiccreatef(n, &x, diffstep, &state, _state); } minbleicsetlc(&state, &c, &ct, 1, _state); minbleicsetcond(&state, weakepsg, 0.0, 0.0, 0, _state); testminbleicunit_setrandompreconditioner(&state, n, preckind, _state); while(minbleiciteration(&state, _state)) { if( state.needf||state.needfg ) { state.f = (double)(0); } for(i=0; i<=n-1; i++) { if( state.needf||state.needfg ) { state.f = state.f+ae_pow(state.x.ptr.p_double[i], (double)(p), _state); } if( state.needfg ) { state.g.ptr.p_double[i] = p*ae_pow(state.x.ptr.p_double[i], (double)(p-1), _state); } } } minbleicresults(&state, &x, &rep, _state); if( rep.terminationtype<=0 ) { *converr = ae_true; ae_frame_leave(_state); return; } /* * Test feasibility of solution */ v = ae_v_dotproduct(&c.ptr.pp_double[0][0], 1, &x.ptr.p_double[0], 1, ae_v_len(0,n-1)); *feaserr = *feaserr||ae_fp_greater(ae_fabs(v-c.ptr.pp_double[0][n], _state),epsfeas); /* * if C is nonzero, test that result is * a stationary point of constrained F. * * NOTE: this check is done only if C is nonzero */ vv = ae_v_dotproduct(&c.ptr.pp_double[0][0], 1, &c.ptr.pp_double[0][0], 1, ae_v_len(0,n-1)); if( ae_fp_neq(vv,(double)(0)) ) { /* * Calculate gradient at the result * Project gradient into C * Check projected norm */ for(i=0; i<=n-1; i++) { g.ptr.p_double[i] = p*ae_pow(x.ptr.p_double[i], (double)(p-1), _state); } v2 = ae_v_dotproduct(&c.ptr.pp_double[0][0], 1, &c.ptr.pp_double[0][0], 1, ae_v_len(0,n-1)); v = ae_v_dotproduct(&c.ptr.pp_double[0][0], 1, &g.ptr.p_double[0], 1, ae_v_len(0,n-1)); vv = v/v2; ae_v_subd(&g.ptr.p_double[0], 1, &c.ptr.pp_double[0][0], 1, ae_v_len(0,n-1), vv); v3 = ae_v_dotproduct(&g.ptr.p_double[0], 1, &g.ptr.p_double[0], 1, ae_v_len(0,n-1)); *converr = *converr||ae_fp_greater(ae_sqrt(v3, _state),weakepsg); } } } } } /* * Test problem 2 (multiple equality constraints): * * 1<=N<=NMax, 1<=K<=N * * no boundary constraints * * N-dimensional space * * randomly generated point xs * * K randomly generated hyperplanes which all pass through xs * define K equality constraints: (a[k],x)=b[k] * * equality constraints are checked for being well conditioned * * preconditioner is chosen at random (we just want to be * sure that preconditioning won't prevent us from converging * to the feasible point): * * unit preconditioner * * random diagonal-based preconditioner * * random scale-based preconditioner * * f(x) = |x-x0|^2, x0 = xs+a[0] * * either analytic gradient or numerical differentiation are used * * extremum of f(x) is exactly xs because: * * xs is the closest point in the plane defined by (a[0],x)=b[0] * * xs is feasible by definition */ diffstep = 1.0E-6; for(dkind=0; dkind<=1; dkind++) { for(preckind=0; preckind<=2; preckind++) { for(n=2; n<=nmax; n++) { for(k=1; k<=n; k++) { /* * Generate X, X0, XS, BL, BU, CT and left part of C. * * Right part of C is generated using somewhat complex algo: * * we generate random vector and multiply it by C. * * result is used as the right part. 
* * calculations are done on the fly, vector itself is not stored * We use such algo to be sure that our system is consistent. */ ae_vector_set_length(&x, n, _state); ae_vector_set_length(&x0, n, _state); ae_vector_set_length(&xs, n, _state); ae_vector_set_length(&g, n, _state); ae_matrix_set_length(&c, k, n+1, _state); ae_vector_set_length(&ct, k, _state); c.ptr.pp_double[0][n] = (double)(0); for(i=0; i<=n-1; i++) { x.ptr.p_double[i] = 2*ae_randomreal(_state)-1; xs.ptr.p_double[i] = 2*ae_randomreal(_state)-1; } do { for(i=0; i<=k-1; i++) { for(j=0; j<=n-1; j++) { c.ptr.pp_double[i][j] = 2*ae_randomreal(_state)-1; } v = ae_v_dotproduct(&c.ptr.pp_double[i][0], 1, &xs.ptr.p_double[0], 1, ae_v_len(0,n-1)); c.ptr.pp_double[i][n] = v; ct.ptr.p_int[i] = 0; } seterrorflag(feaserr, !rmatrixsvd(&c, k, n, 0, 0, 0, &svdw, &svdu, &svdvt, _state), _state); } while(!(ae_fp_greater(svdw.ptr.p_double[0],(double)(0))&&ae_fp_greater(svdw.ptr.p_double[k-1],0.001*svdw.ptr.p_double[0]))); ae_v_move(&x0.ptr.p_double[0], 1, &xs.ptr.p_double[0], 1, ae_v_len(0,n-1)); ae_v_add(&x0.ptr.p_double[0], 1, &c.ptr.pp_double[0][0], 1, ae_v_len(0,n-1)); /* * Create and optimize */ if( dkind==0 ) { minbleiccreate(n, &x, &state, _state); } if( dkind==1 ) { minbleiccreatef(n, &x, diffstep, &state, _state); } minbleicsetlc(&state, &c, &ct, k, _state); minbleicsetcond(&state, weakepsg, 0.0, 0.0, 0, _state); testminbleicunit_setrandompreconditioner(&state, n, preckind, _state); while(minbleiciteration(&state, _state)) { if( state.needf||state.needfg ) { state.f = (double)(0); } for(i=0; i<=n-1; i++) { if( state.needf||state.needfg ) { state.f = state.f+ae_sqr(state.x.ptr.p_double[i]-x0.ptr.p_double[i], _state); } if( state.needfg ) { state.g.ptr.p_double[i] = 2*(state.x.ptr.p_double[i]-x0.ptr.p_double[i]); } } } minbleicresults(&state, &x, &rep, _state); if( rep.terminationtype<=0 ) { *converr = ae_true; ae_frame_leave(_state); return; } /* * check feasiblity properties */ for(i=0; i<=k-1; i++) { v = ae_v_dotproduct(&c.ptr.pp_double[i][0], 1, &x.ptr.p_double[0], 1, ae_v_len(0,n-1)); *feaserr = *feaserr||ae_fp_greater(ae_fabs(v-c.ptr.pp_double[i][n], _state),epsx); } /* * Compare with XS */ v = (double)(0); for(i=0; i<=n-1; i++) { v = v+ae_sqr(x.ptr.p_double[i]-xs.ptr.p_double[i], _state); } v = ae_sqrt(v, _state); *converr = *converr||ae_fp_greater(ae_fabs(v, _state),0.001); } } } } /* * Another simple problem: * * bound constraints 0 <= x[i] <= 1 * * no linear constraints * * preconditioner is chosen at random (we just want to be * sure that preconditioning won't prevent us from converging * to the feasible point): * * unit preconditioner * * random diagonal-based preconditioner * * random scale-based preconditioner * * F(x) = |x-x0|^P, where P={2,4} and x0 is randomly selected from [-1,+2]^N * * with such simple boundaries and function it is easy to find * analytic form of solution: S[i] = bound(x0[i], 0, 1) * * we also check that both final solution and subsequent iterates * are strictly feasible */ diffstep = 1.0E-6; for(dkind=0; dkind<=1; dkind++) { for(preckind=0; preckind<=2; preckind++) { for(pkind=1; pkind<=2; pkind++) { for(n=1; n<=nmax; n++) { /* * Generate X, BL, BU. 
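* (BL[i]=0 and BU[i]=1 for all variables; the starting point X is uniform in [0,1]^N,
* and the unconstrained minimizer X0 is uniform in [-1,+2]^N, so on average about 2/3
* of its components violate the box and are clipped by the analytic solution
* S[i] = bound(x0[i], 0, 1))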
*/ p = 2*pkind; ae_vector_set_length(&bl, n, _state); ae_vector_set_length(&bu, n, _state); ae_vector_set_length(&x, n, _state); ae_vector_set_length(&x0, n, _state); for(i=0; i<=n-1; i++) { bl.ptr.p_double[i] = (double)(0); bu.ptr.p_double[i] = (double)(1); x.ptr.p_double[i] = ae_randomreal(_state); x0.ptr.p_double[i] = 3*ae_randomreal(_state)-1; } /* * Create and optimize */ if( dkind==0 ) { minbleiccreate(n, &x, &state, _state); } if( dkind==1 ) { minbleiccreatef(n, &x, diffstep, &state, _state); } minbleicsetbc(&state, &bl, &bu, _state); minbleicsetcond(&state, weakepsg, 0.0, 0.0, 0, _state); testminbleicunit_setrandompreconditioner(&state, n, preckind, _state); while(minbleiciteration(&state, _state)) { if( state.needf||state.needfg ) { state.f = (double)(0); } for(i=0; i<=n-1; i++) { if( state.needf||state.needfg ) { state.f = state.f+ae_pow(state.x.ptr.p_double[i]-x0.ptr.p_double[i], (double)(p), _state); } if( state.needfg ) { state.g.ptr.p_double[i] = p*ae_pow(state.x.ptr.p_double[i]-x0.ptr.p_double[i], (double)(p-1), _state); } *feaserr = *feaserr||ae_fp_less(state.x.ptr.p_double[i],0.0); *feaserr = *feaserr||ae_fp_greater(state.x.ptr.p_double[i],1.0); } } minbleicresults(&state, &x, &rep, _state); if( rep.terminationtype<=0 ) { *converr = ae_true; ae_frame_leave(_state); return; } /* * * compare solution with analytic one * * check feasibility */ v = 0.0; for(i=0; i<=n-1; i++) { if( ae_fp_greater(x.ptr.p_double[i],(double)(0))&&ae_fp_less(x.ptr.p_double[i],(double)(1)) ) { v = v+ae_sqr(p*ae_pow(x.ptr.p_double[i]-x0.ptr.p_double[i], (double)(p-1), _state), _state); } *feaserr = *feaserr||ae_fp_less(x.ptr.p_double[i],0.0); *feaserr = *feaserr||ae_fp_greater(x.ptr.p_double[i],1.0); } *converr = *converr||ae_fp_greater(ae_sqrt(v, _state),weakepsg); } } } } /* * Same as previous problem, but with minor modifications: * * some bound constraints are 0<=x[i]<=1, some are Ci=x[i]=Ci * * no linear constraints * * preconditioner is chosen at random (we just want to be * sure that preconditioning won't prevent us from converging * to the feasible point): * * unit preconditioner * * random diagonal-based preconditioner * * random scale-based preconditioner * * F(x) = |x-x0|^P, where P={2,4} and x0 is randomly selected from [-1,+2]^N * * with such simple boundaries and function it is easy to find * analytic form of solution: S[i] = bound(x0[i], 0, 1) * * we also check that both final solution and subsequent iterates * are strictly feasible */ diffstep = 1.0E-6; for(dkind=0; dkind<=1; dkind++) { for(preckind=0; preckind<=2; preckind++) { for(pkind=1; pkind<=2; pkind++) { for(n=1; n<=nmax; n++) { /* * Generate X, BL, BU. 
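* (each variable independently gets either the box 0<=x[i]<=1 or a fixed value
* BL[i]=BU[i] drawn uniformly from [0,1], with probability 1/2 for each case)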
*/ p = 2*pkind; ae_vector_set_length(&bl, n, _state); ae_vector_set_length(&bu, n, _state); ae_vector_set_length(&x, n, _state); ae_vector_set_length(&x0, n, _state); for(i=0; i<=n-1; i++) { if( ae_fp_greater(ae_randomreal(_state),0.5) ) { bl.ptr.p_double[i] = (double)(0); bu.ptr.p_double[i] = (double)(1); } else { bl.ptr.p_double[i] = ae_randomreal(_state); bu.ptr.p_double[i] = bl.ptr.p_double[i]; } x.ptr.p_double[i] = ae_randomreal(_state); x0.ptr.p_double[i] = 3*ae_randomreal(_state)-1; } /* * Create and optimize */ if( dkind==0 ) { minbleiccreate(n, &x, &state, _state); } if( dkind==1 ) { minbleiccreatef(n, &x, diffstep, &state, _state); } minbleicsetbc(&state, &bl, &bu, _state); minbleicsetcond(&state, weakepsg, 0.0, 0.0, 0, _state); testminbleicunit_setrandompreconditioner(&state, n, preckind, _state); while(minbleiciteration(&state, _state)) { if( state.needf||state.needfg ) { state.f = (double)(0); } for(i=0; i<=n-1; i++) { if( state.needf||state.needfg ) { state.f = state.f+ae_pow(state.x.ptr.p_double[i]-x0.ptr.p_double[i], (double)(p), _state); } if( state.needfg ) { state.g.ptr.p_double[i] = p*ae_pow(state.x.ptr.p_double[i]-x0.ptr.p_double[i], (double)(p-1), _state); } *feaserr = *feaserr||ae_fp_less(state.x.ptr.p_double[i],bl.ptr.p_double[i]); *feaserr = *feaserr||ae_fp_greater(state.x.ptr.p_double[i],bu.ptr.p_double[i]); } } minbleicresults(&state, &x, &rep, _state); if( rep.terminationtype<=0 ) { *converr = ae_true; ae_frame_leave(_state); return; } /* * * compare solution with analytic one * * check feasibility */ v = 0.0; for(i=0; i<=n-1; i++) { if( ae_fp_greater(x.ptr.p_double[i],bl.ptr.p_double[i])&&ae_fp_less(x.ptr.p_double[i],bu.ptr.p_double[i]) ) { v = v+ae_sqr(p*ae_pow(x.ptr.p_double[i]-x0.ptr.p_double[i], (double)(p-1), _state), _state); } *feaserr = *feaserr||ae_fp_less(x.ptr.p_double[i],bl.ptr.p_double[i]); *feaserr = *feaserr||ae_fp_greater(x.ptr.p_double[i],bu.ptr.p_double[i]); } *converr = *converr||ae_fp_greater(ae_sqrt(v, _state),weakepsg); } } } } /* * Same as previous one, but with bound constraints posed * as general linear ones: * * no bound constraints * * 2*N linear constraints 0 <= x[i] <= 1 * * preconditioner is chosen at random (we just want to be * sure that preconditioning won't prevent us from converging * to the feasible point): * * unit preconditioner * * random diagonal-based preconditioner * * random scale-based preconditioner * * F(x) = |x-x0|^P, where P={2,4} and x0 is randomly selected from [-1,+2]^N * * with such simple constraints and function it is easy to find * analytic form of solution: S[i] = bound(x0[i], 0, 1). * * however, we can't guarantee that solution is strictly feasible * with respect to nonlinearity constraint, so we check * for approximate feasibility. */ for(preckind=0; preckind<=2; preckind++) { for(pkind=1; pkind<=2; pkind++) { for(n=1; n<=nmax; n++) { /* * Generate X, BL, BU. 
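* (no box constraints are passed to the optimizer in this problem; instead the box
* 0<=x[i]<=1 is encoded as 2*N general linear constraints x[i]>=0 and x[i]<=1)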
*/ p = 2*pkind; ae_vector_set_length(&x, n, _state); ae_vector_set_length(&x0, n, _state); ae_matrix_set_length(&c, 2*n, n+1, _state); ae_vector_set_length(&ct, 2*n, _state); for(i=0; i<=n-1; i++) { x.ptr.p_double[i] = ae_randomreal(_state); x0.ptr.p_double[i] = 3*ae_randomreal(_state)-1; for(j=0; j<=n; j++) { c.ptr.pp_double[2*i+0][j] = (double)(0); c.ptr.pp_double[2*i+1][j] = (double)(0); } c.ptr.pp_double[2*i+0][i] = (double)(1); c.ptr.pp_double[2*i+0][n] = (double)(0); ct.ptr.p_int[2*i+0] = 1; c.ptr.pp_double[2*i+1][i] = (double)(1); c.ptr.pp_double[2*i+1][n] = (double)(1); ct.ptr.p_int[2*i+1] = -1; } /* * Create and optimize */ minbleiccreate(n, &x, &state, _state); minbleicsetlc(&state, &c, &ct, 2*n, _state); minbleicsetcond(&state, weakepsg, 0.0, 0.0, 0, _state); testminbleicunit_setrandompreconditioner(&state, n, preckind, _state); while(minbleiciteration(&state, _state)) { if( state.needfg ) { state.f = (double)(0); for(i=0; i<=n-1; i++) { state.f = state.f+ae_pow(state.x.ptr.p_double[i]-x0.ptr.p_double[i], (double)(p), _state); state.g.ptr.p_double[i] = p*ae_pow(state.x.ptr.p_double[i]-x0.ptr.p_double[i], (double)(p-1), _state); } continue; } /* * Unknown protocol specified */ *interr = ae_true; ae_frame_leave(_state); return; } minbleicresults(&state, &x, &rep, _state); if( rep.terminationtype<=0 ) { *converr = ae_true; ae_frame_leave(_state); return; } /* * * compare solution with analytic one * * check feasibility */ v = 0.0; for(i=0; i<=n-1; i++) { if( ae_fp_greater(x.ptr.p_double[i],0.02)&&ae_fp_less(x.ptr.p_double[i],0.98) ) { v = v+ae_sqr(p*ae_pow(x.ptr.p_double[i]-x0.ptr.p_double[i], (double)(p-1), _state), _state); } *feaserr = *feaserr||ae_fp_less(x.ptr.p_double[i],0.0-epsfeas); *feaserr = *feaserr||ae_fp_greater(x.ptr.p_double[i],1.0+epsfeas); } *converr = *converr||ae_fp_greater(ae_sqrt(v, _state),weakepsg); } } } /* * Feasibility problem: * * bound constraints 0<=x[i]<=1 * * starting point xs with xs[i] in [-1,+2] * * random point xc from [0,1] is used to generate K<=N * random linear equality/inequality constraints of the form * (c,x-xc)=0.0 (or, alternatively, >= or <=), where * c is a random vector. * * preconditioner is chosen at random (we just want to be * sure that preconditioning won't prevent us from converging * to the feasible point): * * unit preconditioner * * random diagonal-based preconditioner * * random scale-based preconditioner * * F(x) = |x-x0|^P, where P={2,4} and x0 is randomly selected from [-1,+2]^N * * we do not know analytic form of the solution, and, if fact, we do not * check for solution correctness. We just check that algorithm converges * to the feasible points. */ for(preckind=0; preckind<=2; preckind++) { for(pkind=1; pkind<=2; pkind++) { for(n=1; n<=nmax; n++) { for(k=1; k<=n; k++) { /* * Generate X, BL, BU. 
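* (XC, the point through which all K linear constraints pass, is drawn from
* [0.1,0.9]^N, i.e. strictly inside the box, so the feasible set is guaranteed
* to be nonempty)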
*/ p = 2*pkind; ae_vector_set_length(&x0, n, _state); ae_vector_set_length(&xc, n, _state); ae_vector_set_length(&xs, n, _state); ae_matrix_set_length(&c, k, n+1, _state); ae_vector_set_length(&ct, k, _state); ae_vector_set_length(&bl, n, _state); ae_vector_set_length(&bu, n, _state); for(i=0; i<=n-1; i++) { x0.ptr.p_double[i] = 3*ae_randomreal(_state)-1; xs.ptr.p_double[i] = 3*ae_randomreal(_state)-1; xc.ptr.p_double[i] = 0.1+0.8*ae_randomreal(_state); bl.ptr.p_double[i] = (double)(0); bu.ptr.p_double[i] = (double)(1); } for(i=0; i<=k-1; i++) { c.ptr.pp_double[i][n] = (double)(0); for(j=0; j<=n-1; j++) { c.ptr.pp_double[i][j] = 2*ae_randomreal(_state)-1; c.ptr.pp_double[i][n] = c.ptr.pp_double[i][n]+c.ptr.pp_double[i][j]*xc.ptr.p_double[j]; } ct.ptr.p_int[i] = ae_randominteger(3, _state)-1; } /* * Create and optimize */ minbleiccreate(n, &xs, &state, _state); minbleicsetbc(&state, &bl, &bu, _state); minbleicsetlc(&state, &c, &ct, k, _state); minbleicsetcond(&state, weakepsg, 0.0, 0.0, 0, _state); testminbleicunit_setrandompreconditioner(&state, n, preckind, _state); while(minbleiciteration(&state, _state)) { if( state.needfg ) { state.f = (double)(0); for(i=0; i<=n-1; i++) { state.f = state.f+ae_pow(state.x.ptr.p_double[i]-x0.ptr.p_double[i], (double)(p), _state); state.g.ptr.p_double[i] = p*ae_pow(state.x.ptr.p_double[i]-x0.ptr.p_double[i], (double)(p-1), _state); } continue; } /* * Unknown protocol specified */ *interr = ae_true; ae_frame_leave(_state); return; } minbleicresults(&state, &x, &rep, _state); if( rep.terminationtype<=0 ) { *converr = ae_true; ae_frame_leave(_state); return; } /* * Check feasibility */ for(i=0; i<=n-1; i++) { *feaserr = *feaserr||ae_fp_less(x.ptr.p_double[i],0.0); *feaserr = *feaserr||ae_fp_greater(x.ptr.p_double[i],1.0); } for(i=0; i<=k-1; i++) { v = ae_v_dotproduct(&c.ptr.pp_double[i][0], 1, &x.ptr.p_double[0], 1, ae_v_len(0,n-1)); v = v-c.ptr.pp_double[i][n]; if( ct.ptr.p_int[i]==0 ) { *feaserr = *feaserr||ae_fp_greater(ae_fabs(v, _state),epsfeas); } if( ct.ptr.p_int[i]<0 ) { *feaserr = *feaserr||ae_fp_greater(v,epsfeas); } if( ct.ptr.p_int[i]>0 ) { *feaserr = *feaserr||ae_fp_less(v,-epsfeas); } } } } } } /* * Infeasible problem: * * all bound constraints are 0 <= x[i] <= 1 except for one * * that one is 0 >= x[i] >= 1 * * no linear constraints * * preconditioner is chosen at random (we just want to be * sure that preconditioning won't prevent us from detecting * infeasible point): * * unit preconditioner * * random diagonal-based preconditioner * * random scale-based preconditioner * * F(x) = |x-x0|^P, where P={2,4} and x0 is randomly selected from [-1,+2]^N * * algorithm must return correct error code on such problem */ for(preckind=0; preckind<=2; preckind++) { for(pkind=1; pkind<=2; pkind++) { for(n=1; n<=nmax; n++) { /* * Generate X, BL, BU. 
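* (all variables first get consistent bounds 0<=x[i]<=1, after which one randomly
* chosen index is given the contradictory pair BL[i]=1, BU[i]=0)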
*/ p = 2*pkind; ae_vector_set_length(&bl, n, _state); ae_vector_set_length(&bu, n, _state); ae_vector_set_length(&x, n, _state); ae_vector_set_length(&x0, n, _state); for(i=0; i<=n-1; i++) { bl.ptr.p_double[i] = (double)(0); bu.ptr.p_double[i] = (double)(1); x.ptr.p_double[i] = ae_randomreal(_state); x0.ptr.p_double[i] = 3*ae_randomreal(_state)-1; } i = ae_randominteger(n, _state); bl.ptr.p_double[i] = (double)(1); bu.ptr.p_double[i] = (double)(0); /* * Create and optimize */ minbleiccreate(n, &x, &state, _state); minbleicsetbc(&state, &bl, &bu, _state); minbleicsetcond(&state, weakepsg, 0.0, 0.0, 0, _state); testminbleicunit_setrandompreconditioner(&state, n, preckind, _state); while(minbleiciteration(&state, _state)) { if( state.needfg ) { state.f = (double)(0); for(i=0; i<=n-1; i++) { state.f = state.f+ae_pow(state.x.ptr.p_double[i]-x0.ptr.p_double[i], (double)(p), _state); state.g.ptr.p_double[i] = p*ae_pow(state.x.ptr.p_double[i]-x0.ptr.p_double[i], (double)(p-1), _state); } continue; } /* * Unknown protocol specified */ *interr = ae_true; ae_frame_leave(_state); return; } minbleicresults(&state, &x, &rep, _state); *feaserr = *feaserr||rep.terminationtype!=-3; } } } /* * Infeasible problem (2): * * no bound and inequality constraints * * 1<=K<=N arbitrary equality constraints * * (K+1)th constraint which is equal to the first constraint a*x=c, * but with c:=c+1. I.e. we have both a*x=c and a*x=c+1, which can't * be true (other constraints may be inconsistent too, but we don't * have to check it). * * preconditioner is chosen at random (we just want to be * sure that preconditioning won't prevent us from detecting * infeasible point): * * unit preconditioner * * random diagonal-based preconditioner * * random scale-based preconditioner * * F(x) = |x|^P, where P={2,4} * * algorithm must return correct error code on such problem */ for(preckind=0; preckind<=2; preckind++) { for(pkind=1; pkind<=2; pkind++) { for(n=1; n<=nmax; n++) { for(k=1; k<=n; k++) { /* * Generate X, BL, BU. */ p = 2*pkind; ae_vector_set_length(&x, n, _state); ae_matrix_set_length(&c, k+1, n+1, _state); ae_vector_set_length(&ct, k+1, _state); for(i=0; i<=n-1; i++) { x.ptr.p_double[i] = ae_randomreal(_state); } for(i=0; i<=k-1; i++) { for(j=0; j<=n; j++) { c.ptr.pp_double[i][j] = 2*ae_randomreal(_state)-1; } ct.ptr.p_int[i] = 0; } ct.ptr.p_int[k] = 0; ae_v_move(&c.ptr.pp_double[k][0], 1, &c.ptr.pp_double[0][0], 1, ae_v_len(0,n-1)); c.ptr.pp_double[k][n] = c.ptr.pp_double[0][n]+1; /* * Create and optimize */ minbleiccreate(n, &x, &state, _state); minbleicsetlc(&state, &c, &ct, k+1, _state); minbleicsetcond(&state, weakepsg, 0.0, 0.0, 0, _state); testminbleicunit_setrandompreconditioner(&state, n, preckind, _state); while(minbleiciteration(&state, _state)) { if( state.needfg ) { state.f = (double)(0); for(i=0; i<=n-1; i++) { state.f = state.f+ae_pow(state.x.ptr.p_double[i], (double)(p), _state); state.g.ptr.p_double[i] = p*ae_pow(state.x.ptr.p_double[i], (double)(p-1), _state); } continue; } /* * Unknown protocol specified */ *interr = ae_true; ae_frame_leave(_state); return; } minbleicresults(&state, &x, &rep, _state); *feaserr = *feaserr||rep.terminationtype!=-3; } } } } } ae_frame_leave(_state); } /************************************************************************* This function additional properties. 
On failure sets Err to True (leaves it unchanged otherwise) *************************************************************************/ static void testminbleicunit_testother(ae_bool* err, ae_state *_state) { ae_frame _frame_block; ae_int_t passcount; ae_int_t pass; ae_int_t n; ae_int_t i; ae_int_t j; ae_int_t k; ae_vector bl; ae_vector bu; ae_vector x; ae_vector xf; ae_vector x0; ae_vector x1; ae_vector b; ae_vector xlast; ae_vector a; ae_vector s; ae_vector h; ae_matrix c; ae_matrix fulla; ae_vector ct; double fprev; double xprev; double stpmax; double v; ae_int_t nec; ae_int_t nic; ae_int_t pkind; ae_int_t ckind; ae_int_t mkind; double vc; double vm; minbleicstate state; double epsx; double epsg; double eps; double tmpeps; minbleicreport rep; double diffstep; ae_int_t dkind; ae_bool wasf; ae_bool wasfg; double r; hqrndstate rs; ae_int_t spoiliteration; ae_int_t stopiteration; ae_int_t spoilvar; double spoilval; double ss; ae_int_t stopcallidx; ae_int_t callidx; ae_int_t maxits; ae_bool terminationrequested; ae_int_t scaletype; ae_frame_make(_state, &_frame_block); ae_vector_init(&bl, 0, DT_REAL, _state); ae_vector_init(&bu, 0, DT_REAL, _state); ae_vector_init(&x, 0, DT_REAL, _state); ae_vector_init(&xf, 0, DT_REAL, _state); ae_vector_init(&x0, 0, DT_REAL, _state); ae_vector_init(&x1, 0, DT_REAL, _state); ae_vector_init(&b, 0, DT_REAL, _state); ae_vector_init(&xlast, 0, DT_REAL, _state); ae_vector_init(&a, 0, DT_REAL, _state); ae_vector_init(&s, 0, DT_REAL, _state); ae_vector_init(&h, 0, DT_REAL, _state); ae_matrix_init(&c, 0, 0, DT_REAL, _state); ae_matrix_init(&fulla, 0, 0, DT_REAL, _state); ae_vector_init(&ct, 0, DT_INT, _state); _minbleicstate_init(&state, _state); _minbleicreport_init(&rep, _state); _hqrndstate_init(&rs, _state); hqrndrandomize(&rs, _state); epsx = 1.0E-4; epsg = 1.0E-8; passcount = 10; /* * Try to reproduce bug 570 (optimizer hangs on problems where it is required * to perform very small step - less than 1E-50 - in order to activate constraints). * * The problem being solved is: * * min x[0]+x[1]+...+x[n-1] * * subject to * * x[i]>=0, for i=0..n-1 * * with initial point * * x[0] = 1.0E-100, x[1]=x[2]=...=0.5 * * We try to reproduce this problem in different settings: * * boundary-only constraints - we test that completion code is positive, * and all x[] are EXACTLY zero * * boundary constraints posed as general linear ones - we test that * completion code is positive, and all x[] are APPROXIMATELY zero. 
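* (the target is linear, f(x)=x[0]+...+x[n-1], minimized over the nonnegative
* orthant, so the exact solution has every component at its lower bound 0; the
* tiny first component x[0]=1.0E-100 of the starting point is what forces the
* optimizer to handle extremely short steps when activating that constraint)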
*/ n = 10; ae_vector_set_length(&x, n, _state); ae_vector_set_length(&bl, n, _state); ae_vector_set_length(&bu, n, _state); ae_matrix_set_length(&c, n, n+1, _state); ae_vector_set_length(&ct, n, _state); for(i=0; i<=n-1; i++) { x.ptr.p_double[i] = 0.5; bl.ptr.p_double[i] = 0.0; bu.ptr.p_double[i] = _state->v_posinf; ct.ptr.p_int[i] = 1; for(j=0; j<=n; j++) { c.ptr.pp_double[i][j] = 0.0; } c.ptr.pp_double[i][i] = 1.0; } x.ptr.p_double[0] = 1.0E-100; minbleiccreate(n, &x, &state, _state); minbleicsetbc(&state, &bl, &bu, _state); minbleicsetcond(&state, (double)(0), (double)(0), (double)(0), 2*n, _state); while(minbleiciteration(&state, _state)) { if( state.needfg ) { state.f = (double)(0); for(i=0; i<=n-1; i++) { state.f = state.f+state.x.ptr.p_double[i]; state.g.ptr.p_double[i] = 1.0; } } } minbleicresults(&state, &xf, &rep, _state); seterrorflag(err, rep.terminationtype<=0, _state); if( rep.terminationtype>0 ) { for(i=0; i<=n-1; i++) { seterrorflag(err, ae_fp_neq(xf.ptr.p_double[i],(double)(0)), _state); } } minbleiccreate(n, &x, &state, _state); minbleicsetlc(&state, &c, &ct, n, _state); minbleicsetcond(&state, 1.0E-64, (double)(0), (double)(0), 10, _state); while(minbleiciteration(&state, _state)) { if( state.needfg ) { state.f = (double)(0); for(i=0; i<=n-1; i++) { state.f = state.f+state.x.ptr.p_double[i]; state.g.ptr.p_double[i] = 1.0; } } } minbleicresults(&state, &xf, &rep, _state); seterrorflag(err, rep.terminationtype<=0, _state); if( rep.terminationtype>0 ) { for(i=0; i<=n-1; i++) { seterrorflag(err, ae_fp_greater(ae_fabs(xf.ptr.p_double[i], _state),1.0E-10), _state); } } /* * Test reports: * * first value must be starting point * * last value must be last point */ for(pass=1; pass<=passcount; pass++) { n = 50; ae_vector_set_length(&x, n, _state); ae_vector_set_length(&xlast, n, _state); ae_vector_set_length(&bl, n, _state); ae_vector_set_length(&bu, n, _state); for(i=0; i<=n-1; i++) { x.ptr.p_double[i] = (double)(10); bl.ptr.p_double[i] = 2*ae_randomreal(_state)-1; bu.ptr.p_double[i] = _state->v_posinf; } minbleiccreate(n, &x, &state, _state); minbleicsetbc(&state, &bl, &bu, _state); minbleicsetcond(&state, 1.0E-64, (double)(0), (double)(0), 10, _state); minbleicsetxrep(&state, ae_true, _state); fprev = ae_maxrealnumber; while(minbleiciteration(&state, _state)) { if( state.needfg ) { state.f = (double)(0); for(i=0; i<=n-1; i++) { state.f = state.f+ae_sqr((1+i)*state.x.ptr.p_double[i], _state); state.g.ptr.p_double[i] = 2*(1+i)*state.x.ptr.p_double[i]; } } if( state.xupdated ) { if( ae_fp_eq(fprev,ae_maxrealnumber) ) { for(i=0; i<=n-1; i++) { *err = *err||ae_fp_neq(state.x.ptr.p_double[i],x.ptr.p_double[i]); } } fprev = state.f; ae_v_move(&xlast.ptr.p_double[0], 1, &state.x.ptr.p_double[0], 1, ae_v_len(0,n-1)); } } minbleicresults(&state, &x, &rep, _state); for(i=0; i<=n-1; i++) { *err = *err||ae_fp_neq(x.ptr.p_double[i],xlast.ptr.p_double[i]); } } /* * Test differentiation vs. 
analytic gradient * (first one issues NeedF requests, second one issues NeedFG requests) */ for(pass=1; pass<=passcount; pass++) { n = 10; diffstep = 1.0E-6; for(dkind=0; dkind<=1; dkind++) { ae_vector_set_length(&x, n, _state); ae_vector_set_length(&xlast, n, _state); for(i=0; i<=n-1; i++) { x.ptr.p_double[i] = (double)(1); } if( dkind==0 ) { minbleiccreate(n, &x, &state, _state); } if( dkind==1 ) { minbleiccreatef(n, &x, diffstep, &state, _state); } minbleicsetcond(&state, 1.0E-6, (double)(0), epsx, 0, _state); wasf = ae_false; wasfg = ae_false; while(minbleiciteration(&state, _state)) { if( state.needf||state.needfg ) { state.f = (double)(0); } for(i=0; i<=n-1; i++) { if( state.needf||state.needfg ) { state.f = state.f+ae_sqr((1+i)*state.x.ptr.p_double[i], _state); } if( state.needfg ) { state.g.ptr.p_double[i] = 2*(1+i)*state.x.ptr.p_double[i]; } } wasf = wasf||state.needf; wasfg = wasfg||state.needfg; } minbleicresults(&state, &x, &rep, _state); if( dkind==0 ) { *err = (*err||wasf)||!wasfg; } if( dkind==1 ) { *err = (*err||!wasf)||wasfg; } } } /* * Test that numerical differentiation uses scaling. * * In order to test that we solve simple optimization * problem: min(x^2) with initial x equal to 0.0. * * We choose random DiffStep and S, then we check that * optimizer evaluates function at +-DiffStep*S only. */ for(pass=1; pass<=passcount; pass++) { ae_vector_set_length(&x, 1, _state); ae_vector_set_length(&s, 1, _state); diffstep = ae_randomreal(_state)*1.0E-6; s.ptr.p_double[0] = ae_exp(ae_randomreal(_state)*4-2, _state); x.ptr.p_double[0] = (double)(0); minbleiccreatef(1, &x, diffstep, &state, _state); minbleicsetcond(&state, 1.0E-6, (double)(0), epsx, 0, _state); minbleicsetscale(&state, &s, _state); v = (double)(0); while(minbleiciteration(&state, _state)) { state.f = ae_sqr(state.x.ptr.p_double[0], _state); v = ae_maxreal(v, ae_fabs(state.x.ptr.p_double[0], _state), _state); } minbleicresults(&state, &x, &rep, _state); r = v/(s.ptr.p_double[0]*diffstep); *err = *err||ae_fp_greater(ae_fabs(ae_log(r, _state), _state),ae_log(1+1000*ae_machineepsilon, _state)); } /* * Test stpmax */ for(pass=1; pass<=passcount; pass++) { n = 1; ae_vector_set_length(&x, n, _state); ae_vector_set_length(&bl, n, _state); ae_vector_set_length(&bu, n, _state); x.ptr.p_double[0] = (double)(100); bl.ptr.p_double[0] = 2*ae_randomreal(_state)-1; bu.ptr.p_double[0] = _state->v_posinf; stpmax = 0.05+0.05*ae_randomreal(_state); minbleiccreate(n, &x, &state, _state); minbleicsetbc(&state, &bl, &bu, _state); minbleicsetcond(&state, epsg, (double)(0), epsx, 0, _state); minbleicsetxrep(&state, ae_true, _state); minbleicsetstpmax(&state, stpmax, _state); xprev = x.ptr.p_double[0]; while(minbleiciteration(&state, _state)) { if( state.needfg ) { state.f = ae_exp(state.x.ptr.p_double[0], _state)+ae_exp(-state.x.ptr.p_double[0], _state); state.g.ptr.p_double[0] = ae_exp(state.x.ptr.p_double[0], _state)-ae_exp(-state.x.ptr.p_double[0], _state); *err = *err||ae_fp_greater(ae_fabs(state.x.ptr.p_double[0]-xprev, _state),(1+ae_sqrt(ae_machineepsilon, _state))*stpmax); } if( state.xupdated ) { *err = *err||ae_fp_greater(ae_fabs(state.x.ptr.p_double[0]-xprev, _state),(1+ae_sqrt(ae_machineepsilon, _state))*stpmax); xprev = state.x.ptr.p_double[0]; } } } /* * Ability to solve problems with function which is unbounded from below */ for(pass=1; pass<=passcount; pass++) { n = 1; ae_vector_set_length(&x, n, _state); ae_vector_set_length(&bl, n, _state); ae_vector_set_length(&bu, n, _state); bl.ptr.p_double[0] = 
4*ae_randomreal(_state)+1; bu.ptr.p_double[0] = bl.ptr.p_double[0]+1; x.ptr.p_double[0] = 0.5*(bl.ptr.p_double[0]+bu.ptr.p_double[0]); minbleiccreate(n, &x, &state, _state); minbleicsetbc(&state, &bl, &bu, _state); minbleicsetcond(&state, epsg, (double)(0), epsx, 0, _state); while(minbleiciteration(&state, _state)) { if( state.needfg ) { state.f = -1.0E8*ae_sqr(state.x.ptr.p_double[0], _state); state.g.ptr.p_double[0] = -2.0E8*state.x.ptr.p_double[0]; } } minbleicresults(&state, &x, &rep, _state); *err = *err||ae_fp_greater(ae_fabs(x.ptr.p_double[0]-bu.ptr.p_double[0], _state),epsx); } /* * Test correctness of the scaling: * * initial point is random point from [+1,+2]^N * * f(x) = SUM(A[i]*x[i]^4), C[i] is random from [0.01,100] * * function is EFFECTIVELY unconstrained; it has formal constraints, * but they are inactive at the solution; we try different variants * in order to explore different control paths of the optimizer: * 0) absense of constraints * 1) bound constraints -100000<=x[i]<=100000 * 2) one linear constraint 0*x=0 * 3) combination of (1) and (2) * * we use random scaling matrix * * we test different variants of the preconditioning: * 0) unit preconditioner * 1) random diagonal from [0.01,100] * 2) scale preconditioner * * we set very stringent stopping conditions * * and we test that in the extremum stopping conditions are * satisfied subject to the current scaling coefficients. */ for(pass=1; pass<=passcount; pass++) { tmpeps = 1.0E-5; for(n=1; n<=10; n++) { for(ckind=0; ckind<=3; ckind++) { for(pkind=0; pkind<=2; pkind++) { ae_vector_set_length(&x, n, _state); ae_vector_set_length(&a, n, _state); ae_vector_set_length(&s, n, _state); ae_vector_set_length(&h, n, _state); ae_vector_set_length(&bl, n, _state); ae_vector_set_length(&bu, n, _state); ae_matrix_set_length(&c, 1, n+1, _state); ae_vector_set_length(&ct, 1, _state); ct.ptr.p_int[0] = 0; c.ptr.pp_double[0][n] = (double)(0); for(i=0; i<=n-1; i++) { x.ptr.p_double[i] = ae_randomreal(_state)+1; bl.ptr.p_double[i] = (double)(-100000); bu.ptr.p_double[i] = (double)(100000); c.ptr.pp_double[0][i] = (double)(0); a.ptr.p_double[i] = ae_exp(ae_log((double)(10), _state)*(2*ae_randomreal(_state)-1), _state); s.ptr.p_double[i] = ae_exp(ae_log((double)(10), _state)*(2*ae_randomreal(_state)-1), _state); h.ptr.p_double[i] = ae_exp(ae_log((double)(10), _state)*(2*ae_randomreal(_state)-1), _state); } minbleiccreate(n, &x, &state, _state); if( ckind==1||ckind==3 ) { minbleicsetbc(&state, &bl, &bu, _state); } if( ckind==2||ckind==3 ) { minbleicsetlc(&state, &c, &ct, 1, _state); } if( pkind==1 ) { minbleicsetprecdiag(&state, &h, _state); } if( pkind==2 ) { minbleicsetprecscale(&state, _state); } minbleicsetcond(&state, tmpeps, (double)(0), (double)(0), 0, _state); minbleicsetscale(&state, &s, _state); while(minbleiciteration(&state, _state)) { if( state.needfg ) { state.f = (double)(0); for(i=0; i<=n-1; i++) { state.f = state.f+a.ptr.p_double[i]*ae_sqr(state.x.ptr.p_double[i], _state); state.g.ptr.p_double[i] = 2*a.ptr.p_double[i]*state.x.ptr.p_double[i]; } } } minbleicresults(&state, &x, &rep, _state); if( rep.terminationtype<=0 ) { *err = ae_true; ae_frame_leave(_state); return; } v = (double)(0); for(i=0; i<=n-1; i++) { v = v+ae_sqr(s.ptr.p_double[i]*2*a.ptr.p_double[i]*x.ptr.p_double[i], _state); } v = ae_sqrt(v, _state); seterrorflag(err, ae_fp_greater(v,tmpeps), _state); } } } } /* * Check correctness of the "trimming". * * Trimming is a technique which is used to help algorithm * cope with unbounded functions. 
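* (roughly speaking, trimming clips function values which are vastly larger than
* values already seen at good points before feeding them to the line search, so a
* single huge value such as M cannot destroy step length selection; this is an
* informal description of the idea, not of the exact rule used by the implementation)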
In order to check this * technique we will try to solve following optimization * problem: * * min f(x) subject to no constraints on X * { 1/(1-x) + 1/(1+x) + c*x, if -0.999999<x<0.999999 * f(x) = { * { M, if x<=-0.999999 or x>=0.999999 * * where c is either 1.0 or 1.0E+4, M is either 1.0E8, 1.0E20 or +INF * (we try different combinations) */ for(pass=1; pass<=passcount; pass++) { for(ckind=0; ckind<=1; ckind++) { for(mkind=0; mkind<=2; mkind++) { /* * Choose c and M */ vc = (double)(1); vm = (double)(1); if( ckind==0 ) { vc = 1.0; } if( ckind==1 ) { vc = 1.0E+4; } if( mkind==0 ) { vm = 1.0E+8; } if( mkind==1 ) { vm = 1.0E+20; } if( mkind==2 ) { vm = _state->v_posinf; } /* * Create optimizer, solve optimization problem */ epsg = 1.0E-6*vc; ae_vector_set_length(&x, 1, _state); x.ptr.p_double[0] = 0.0; minbleiccreate(1, &x, &state, _state); minbleicsetcond(&state, epsg, (double)(0), (double)(0), 0, _state); while(minbleiciteration(&state, _state)) { if( state.needfg ) { if( ae_fp_less(-0.999999,state.x.ptr.p_double[0])&&ae_fp_less(state.x.ptr.p_double[0],0.999999) ) { state.f = 1/(1-state.x.ptr.p_double[0])+1/(1+state.x.ptr.p_double[0])+vc*state.x.ptr.p_double[0]; state.g.ptr.p_double[0] = 1/ae_sqr(1-state.x.ptr.p_double[0], _state)-1/ae_sqr(1+state.x.ptr.p_double[0], _state)+vc; } else { state.f = vm; state.g.ptr.p_double[0] = (double)(0); } } } minbleicresults(&state, &x, &rep, _state); if( rep.terminationtype<=0 ) { *err = ae_true; ae_frame_leave(_state); return; } *err = *err||ae_fp_greater(ae_fabs(1/ae_sqr(1-x.ptr.p_double[0], _state)-1/ae_sqr(1+x.ptr.p_double[0], _state)+vc, _state),epsg); } } } /* * Test behaviour on noisy functions. * * Consider following problem: * * f(x,y) = (x+1)^2 + (y+1)^2 + 10000*MachineEpsilon*RandomReal() * * boundary constraints x>=0, y>=0 * * starting point (x0,y0)=(10*MachineEpsilon,1.0) * * Such problem contains small numerical noise. Without noise its * solution is (xs,ys)=(0,0), which is easy to find. However, presence * of the noise makes it hard to solve: * * noisy f(x,y) is monotonically decreasing only when we perform * steps orders of magnitude larger than 10000*MachineEpsilon * * at small scales f(x,y) is non-monotonic and non-convex * * however, our first step must be done towards * (x1,y1) = (0,1-some_small_value), and length of such step is * many times SMALLER than 10000*MachineEpsilon * * second step, from (x1,y1) to (xs,ys), will be large enough to * ignore numerical noise, so the only problem is to perform * first step * * Naive implementation of BLEIC should fail sometimes (sometimes - * due to non-deterministic nature of noise) on such problem. However, * our improved implementation should solve it correctly. We test * several variations of inner stopping criteria.
*/ for(pass=1; pass<=passcount; pass++) { eps = 1.0E-9; ae_vector_set_length(&x, 2, _state); ae_vector_set_length(&bl, 2, _state); ae_vector_set_length(&bu, 2, _state); x.ptr.p_double[0] = 10*ae_machineepsilon; x.ptr.p_double[1] = 1.0; bl.ptr.p_double[0] = 0.0; bu.ptr.p_double[0] = _state->v_posinf; bl.ptr.p_double[1] = 0.0; bu.ptr.p_double[1] = _state->v_posinf; for(ckind=0; ckind<=2; ckind++) { minbleiccreate(2, &x, &state, _state); minbleicsetbc(&state, &bl, &bu, _state); if( ckind==0 ) { minbleicsetcond(&state, eps, (double)(0), (double)(0), 0, _state); } if( ckind==1 ) { minbleicsetcond(&state, (double)(0), eps, (double)(0), 0, _state); } if( ckind==2 ) { minbleicsetcond(&state, (double)(0), (double)(0), eps, 0, _state); } while(minbleiciteration(&state, _state)) { if( state.needfg ) { state.f = ae_sqr(state.x.ptr.p_double[0]+1, _state)+ae_sqr(state.x.ptr.p_double[1]+1, _state)+10000*ae_machineepsilon*ae_randomreal(_state); state.g.ptr.p_double[0] = 2*(state.x.ptr.p_double[0]+1); state.g.ptr.p_double[1] = 2*(state.x.ptr.p_double[1]+1); } } minbleicresults(&state, &xf, &rep, _state); if( (rep.terminationtype<=0||ae_fp_neq(xf.ptr.p_double[0],(double)(0)))||ae_fp_neq(xf.ptr.p_double[1],(double)(0)) ) { *err = ae_true; ae_frame_leave(_state); return; } } } /* * Deterministic variation of the previous problem. * * Consider following problem: * * boundary constraints x>=0, y>=0 * * starting point (x0,y0)=(10*MachineEpsilon,1.0) * / (x+1)^2 + (y+1)^2, for (x,y)<>(x0,y0) * * f(x,y) = | * \ (x+1)^2 + (y+1)^2 - 0.1, for (x,y)=(x0,y0) * * Such problem contains deterministic numerical noise (-0.1 at * starting point). Without noise its solution is easy to find. * However, presence of the noise makes it hard to solve: * * our first step must be done towards (x1,y1) = (0,1-some_small_value), * but such step will increase function valye by approximately 0.1 - * instead of decreasing it. * * Naive implementation of BLEIC should fail on such problem. However, * our improved implementation should solve it correctly. We test * several variations of inner stopping criteria. 
*/ for(pass=1; pass<=passcount; pass++) { eps = 1.0E-9; ae_vector_set_length(&x, 2, _state); ae_vector_set_length(&bl, 2, _state); ae_vector_set_length(&bu, 2, _state); x.ptr.p_double[0] = 10*ae_machineepsilon; x.ptr.p_double[1] = 1.0; bl.ptr.p_double[0] = 0.0; bu.ptr.p_double[0] = _state->v_posinf; bl.ptr.p_double[1] = 0.0; bu.ptr.p_double[1] = _state->v_posinf; for(ckind=0; ckind<=2; ckind++) { minbleiccreate(2, &x, &state, _state); minbleicsetbc(&state, &bl, &bu, _state); if( ckind==0 ) { minbleicsetcond(&state, eps, (double)(0), (double)(0), 0, _state); } if( ckind==1 ) { minbleicsetcond(&state, (double)(0), eps, (double)(0), 0, _state); } if( ckind==2 ) { minbleicsetcond(&state, (double)(0), (double)(0), eps, 0, _state); } while(minbleiciteration(&state, _state)) { if( state.needfg ) { state.f = ae_sqr(state.x.ptr.p_double[0]+1, _state)+ae_sqr(state.x.ptr.p_double[1]+1, _state); if( ae_fp_eq(state.x.ptr.p_double[0],x.ptr.p_double[0])&&ae_fp_eq(state.x.ptr.p_double[1],x.ptr.p_double[1]) ) { state.f = state.f-0.1; } state.g.ptr.p_double[0] = 2*(state.x.ptr.p_double[0]+1); state.g.ptr.p_double[1] = 2*(state.x.ptr.p_double[1]+1); } } minbleicresults(&state, &xf, &rep, _state); if( (rep.terminationtype<=0||ae_fp_neq(xf.ptr.p_double[0],(double)(0)))||ae_fp_neq(xf.ptr.p_double[1],(double)(0)) ) { *err = ae_true; ae_frame_leave(_state); return; } } } /* * Test integrity checks for NAN/INF: * * algorithm solves optimization problem, which is normal for some time (quadratic) * * after 5-th step we choose random component of gradient and consistently spoil * it by NAN or INF. * * we check that correct termination code is returned (-8) */ n = 100; for(pass=1; pass<=10; pass++) { spoiliteration = 5; stopiteration = 8; if( ae_fp_greater(hqrndnormal(&rs, _state),(double)(0)) ) { /* * Gradient can be spoiled by +INF, -INF, NAN */ spoilvar = hqrnduniformi(&rs, n, _state); i = hqrnduniformi(&rs, 3, _state); spoilval = _state->v_nan; if( i==0 ) { spoilval = _state->v_neginf; } if( i==1 ) { spoilval = _state->v_posinf; } } else { /* * Function value can be spoiled only by NAN * (+INF can be recognized as legitimate value during optimization) */ spoilvar = -1; spoilval = _state->v_nan; } spdmatrixrndcond(n, 1.0E5, &fulla, _state); ae_vector_set_length(&b, n, _state); ae_vector_set_length(&x0, n, _state); for(i=0; i<=n-1; i++) { b.ptr.p_double[i] = hqrndnormal(&rs, _state); x0.ptr.p_double[i] = hqrndnormal(&rs, _state); } minbleiccreate(n, &x0, &state, _state); minbleicsetcond(&state, 0.0, 0.0, 0.0, stopiteration, _state); minbleicsetxrep(&state, ae_true, _state); k = -1; while(minbleiciteration(&state, _state)) { if( state.needfg ) { state.f = (double)(0); for(i=0; i<=n-1; i++) { state.f = state.f+b.ptr.p_double[i]*state.x.ptr.p_double[i]; state.g.ptr.p_double[i] = b.ptr.p_double[i]; for(j=0; j<=n-1; j++) { state.f = state.f+0.5*state.x.ptr.p_double[i]*fulla.ptr.pp_double[i][j]*state.x.ptr.p_double[j]; state.g.ptr.p_double[i] = state.g.ptr.p_double[i]+fulla.ptr.pp_double[i][j]*state.x.ptr.p_double[j]; } } if( k>=spoiliteration ) { if( spoilvar<0 ) { state.f = spoilval; } else { state.g.ptr.p_double[spoilvar] = spoilval; } } continue; } if( state.xupdated ) { inc(&k, _state); continue; } ae_assert(ae_false, "Assertion failed", _state); } minbleicresults(&state, &x1, &rep, _state); seterrorflag(err, rep.terminationtype!=-8, _state); } /* * Check algorithm ability to handle request for termination: * * to terminate with correct return code = 8 * * to return point which was "current" at the moment of 
termination * * NOTE: we solve problem with "corrupted" preconditioner which makes it hard * to converge in less than StopCallIdx iterations */ for(pass=1; pass<=50; pass++) { n = 3; ss = (double)(100); ae_vector_set_length(&x, n, _state); ae_vector_set_length(&xlast, n, _state); for(i=0; i<=n-1; i++) { x.ptr.p_double[i] = 6+ae_randomreal(_state); } ae_vector_set_length(&s, 3, _state); s.ptr.p_double[0] = 0.00001; s.ptr.p_double[1] = 0.00001; s.ptr.p_double[2] = 10000.0; stopcallidx = ae_randominteger(20, _state); maxits = 25; minbleiccreate(n, &x, &state, _state); minbleicsetcond(&state, (double)(0), (double)(0), (double)(0), maxits, _state); minbleicsetxrep(&state, ae_true, _state); minbleicsetprecdiag(&state, &s, _state); callidx = 0; terminationrequested = ae_false; ae_v_move(&xlast.ptr.p_double[0], 1, &x.ptr.p_double[0], 1, ae_v_len(0,n-1)); while(minbleiciteration(&state, _state)) { if( state.needfg ) { state.f = ss*ae_sqr(ae_exp(state.x.ptr.p_double[0], _state)-2, _state)+ae_sqr(state.x.ptr.p_double[1], _state)+ae_sqr(state.x.ptr.p_double[2]-state.x.ptr.p_double[0], _state); state.g.ptr.p_double[0] = 2*ss*(ae_exp(state.x.ptr.p_double[0], _state)-2)*ae_exp(state.x.ptr.p_double[0], _state)+2*(state.x.ptr.p_double[2]-state.x.ptr.p_double[0])*(-1); state.g.ptr.p_double[1] = 2*state.x.ptr.p_double[1]; state.g.ptr.p_double[2] = 2*(state.x.ptr.p_double[2]-state.x.ptr.p_double[0]); if( callidx==stopcallidx ) { minbleicrequesttermination(&state, _state); terminationrequested = ae_true; } inc(&callidx, _state); continue; } if( state.xupdated ) { if( !terminationrequested ) { ae_v_move(&xlast.ptr.p_double[0], 1, &state.x.ptr.p_double[0], 1, ae_v_len(0,n-1)); } continue; } ae_assert(ae_false, "Assertion failed", _state); } minbleicresults(&state, &x, &rep, _state); seterrorflag(err, rep.terminationtype!=8, _state); for(i=0; i<=n-1; i++) { seterrorflag(err, ae_fp_neq(x.ptr.p_double[i],xlast.ptr.p_double[i]), _state); } } /* * N-dimensional problem with Ne equality and Ni inequality constraints. * * Check that internal SActiveSet object uses efficient algorithm * to determine initial point: it avoids expensive (N+Ni)-dimensional * QP subproblem when initial point is feasible w.r.t. constraints. * * In order to do so we try to solve 5-dimensional QP problem with * 2 equality constraints and 1000000 inequality constraints (+box * constraints). Inefficient algorithm will simply fail to allocate * enough memory, so we do not have to perform any checks here. 
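* (for scale: a dense matrix for a QP subproblem in N+Ni, i.e. about 10^6, variables
* would need on the order of (10^6)^2*8 bytes, roughly 8 terabytes, so an inefficient
* implementation fails at the allocation stage long before any numerical work is done)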
*/ n = 5; nec = 2; nic = 1000000; ae_vector_set_length(&x, n, _state); ae_vector_set_length(&bl, n, _state); ae_vector_set_length(&bu, n, _state); for(scaletype=0; scaletype<=1; scaletype++) { /* * Generate problem */ for(i=0; i<=n-1; i++) { x.ptr.p_double[i] = hqrnduniformr(&rs, _state); bl.ptr.p_double[i] = x.ptr.p_double[i]-hqrnduniformr(&rs, _state)*hqrnduniformi(&rs, 2, _state); bu.ptr.p_double[i] = x.ptr.p_double[i]+hqrnduniformr(&rs, _state)*hqrnduniformi(&rs, 2, _state); } ae_matrix_set_length(&c, nec+nic, n+1, _state); ae_vector_set_length(&ct, nec+nic, _state); for(i=0; i<=nec+nic-1; i++) { v = (double)(0); for(j=0; j<=n-1; j++) { c.ptr.pp_double[i][j] = hqrndnormal(&rs, _state); v = v+c.ptr.pp_double[i][j]*x.ptr.p_double[j]; } c.ptr.pp_double[i][n] = v; if( i=c'*x0, with c=A*x0 * * run BLEIC algorithm from initial point x0 for target function f=0.5*x'*A*x * and check that it stops at x0 (less than 1E-12 away from it) */ n = 20; for(pass=0; pass<=20000; pass++) { spdmatrixrndcond(n, 1.0E3, &fulla, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { fulla.ptr.pp_double[i][j] = fulla.ptr.pp_double[i][j]*1.0E9; } } ae_vector_set_length(&x0, n, _state); v = (double)(0); for(i=0; i<=n-1; i++) { x0.ptr.p_double[i] = hqrnduniformr(&rs, _state); v = v+ae_sqr(x0.ptr.p_double[i], _state); } ae_assert(ae_fp_greater(v,(double)(0)), "MinBLEIC: integrity check failed in the unit test", _state); for(i=0; i<=n-1; i++) { x0.ptr.p_double[i] = x0.ptr.p_double[i]/ae_sqrt(v, _state); } ae_matrix_set_length(&c, 1, n+1, _state); ae_vector_set_length(&ct, 1, _state); ct.ptr.p_int[0] = 1; c.ptr.pp_double[0][n] = (double)(0); for(i=0; i<=n-1; i++) { v = ae_v_dotproduct(&fulla.ptr.pp_double[i][0], 1, &x0.ptr.p_double[0], 1, ae_v_len(0,n-1)); c.ptr.pp_double[0][i] = v; c.ptr.pp_double[0][n] = c.ptr.pp_double[0][n]+v*x0.ptr.p_double[i]; } ae_assert(ae_fp_greater(c.ptr.pp_double[0][n],(double)(0)), "MinBLEIC: integrity check failed in the unit test", _state); minbleiccreate(n, &x0, &state, _state); minbleicsetlc(&state, &c, &ct, 1, _state); minbleicsetcond(&state, 0.0, (double)(0), 1.0E-15, 0, _state); while(minbleiciteration(&state, _state)) { if( state.needfg ) { state.f = (double)(0); for(i=0; i<=n-1; i++) { v = ae_v_dotproduct(&fulla.ptr.pp_double[i][0], 1, &state.x.ptr.p_double[0], 1, ae_v_len(0,n-1)); state.f = state.f+0.5*v*state.x.ptr.p_double[i]; state.g.ptr.p_double[i] = v; } } } minbleicresults(&state, &xf, &rep, _state); seterrorflag(err, rep.terminationtype<=0, _state); for(i=0; i<=n-1; i++) { seterrorflag(err, ae_fp_greater(ae_fabs(x0.ptr.p_double[i]-xf.ptr.p_double[i], _state),1.0E-12), _state); } } ae_frame_leave(_state); } /************************************************************************* This function tests convergence properties. 
We solve several simple problems with different combinations of constraints On failure sets Err to True (leaves it unchanged otherwise) *************************************************************************/ static void testminbleicunit_testconv(ae_bool* err, ae_state *_state) { ae_frame _frame_block; ae_int_t passcount; ae_int_t pass; ae_vector bl; ae_vector bu; ae_vector x; ae_vector b; ae_vector tmp; ae_vector g; ae_vector xf; ae_vector xs0; ae_vector xs1; ae_matrix a; ae_matrix c; ae_matrix ce; ae_vector ct; ae_vector nonnegative; minbleicstate state; double epsg; double epsfeas; double tol; minbleicreport rep; snnlssolver nnls; ae_int_t m; ae_int_t n; ae_int_t k; ae_int_t i; ae_int_t j; double v; double vv; ae_int_t preckind; ae_int_t akind; ae_int_t shiftkind; ae_int_t bscale; double tolconstr; double f0; double f1; ae_int_t ccnt; hqrndstate rs; ae_frame_make(_state, &_frame_block); ae_vector_init(&bl, 0, DT_REAL, _state); ae_vector_init(&bu, 0, DT_REAL, _state); ae_vector_init(&x, 0, DT_REAL, _state); ae_vector_init(&b, 0, DT_REAL, _state); ae_vector_init(&tmp, 0, DT_REAL, _state); ae_vector_init(&g, 0, DT_REAL, _state); ae_vector_init(&xf, 0, DT_REAL, _state); ae_vector_init(&xs0, 0, DT_REAL, _state); ae_vector_init(&xs1, 0, DT_REAL, _state); ae_matrix_init(&a, 0, 0, DT_REAL, _state); ae_matrix_init(&c, 0, 0, DT_REAL, _state); ae_matrix_init(&ce, 0, 0, DT_REAL, _state); ae_vector_init(&ct, 0, DT_INT, _state); ae_vector_init(&nonnegative, 0, DT_BOOL, _state); _minbleicstate_init(&state, _state); _minbleicreport_init(&rep, _state); _snnlssolver_init(&nnls, _state); _hqrndstate_init(&rs, _state); hqrndrandomize(&rs, _state); epsg = 1.0E-8; epsfeas = 1.0E-8; tol = 0.001; passcount = 10; /* * Three closely connected problems: * * 2-dimensional space * * octagonal area bounded by: * * -1<=x<=+1 * * -1<=y<=+1 * * x+y<=1.5 * * x-y<=1.5 * * -x+y<=1.5 * * -x-y<=1.5 * * several target functions: * * f0=x+0.001*y, minimum at x=-1, y=-0.5 * * f1=(x+10)^2+y^2, minimum at x=-1, y=0 * * f2=(x+10)^2+(y-0.6)^2, minimum at x=-1, y=0.5 */ ae_vector_set_length(&x, 2, _state); ae_vector_set_length(&bl, 2, _state); ae_vector_set_length(&bu, 2, _state); ae_matrix_set_length(&c, 4, 3, _state); ae_vector_set_length(&ct, 4, _state); bl.ptr.p_double[0] = (double)(-1); bl.ptr.p_double[1] = (double)(-1); bu.ptr.p_double[0] = (double)(1); bu.ptr.p_double[1] = (double)(1); c.ptr.pp_double[0][0] = (double)(1); c.ptr.pp_double[0][1] = (double)(1); c.ptr.pp_double[0][2] = 1.5; ct.ptr.p_int[0] = -1; c.ptr.pp_double[1][0] = (double)(1); c.ptr.pp_double[1][1] = (double)(-1); c.ptr.pp_double[1][2] = 1.5; ct.ptr.p_int[1] = -1; c.ptr.pp_double[2][0] = (double)(-1); c.ptr.pp_double[2][1] = (double)(1); c.ptr.pp_double[2][2] = 1.5; ct.ptr.p_int[2] = -1; c.ptr.pp_double[3][0] = (double)(-1); c.ptr.pp_double[3][1] = (double)(-1); c.ptr.pp_double[3][2] = 1.5; ct.ptr.p_int[3] = -1; for(pass=1; pass<=passcount; pass++) { /* * f0 */ x.ptr.p_double[0] = 0.2*ae_randomreal(_state)-0.1; x.ptr.p_double[1] = 0.2*ae_randomreal(_state)-0.1; minbleiccreate(2, &x, &state, _state); minbleicsetbc(&state, &bl, &bu, _state); minbleicsetlc(&state, &c, &ct, 4, _state); minbleicsetcond(&state, epsg, 0.0, 0.0, 0, _state); while(minbleiciteration(&state, _state)) { if( state.needfg ) { state.f = state.x.ptr.p_double[0]+0.001*state.x.ptr.p_double[1]; state.g.ptr.p_double[0] = (double)(1); state.g.ptr.p_double[1] = 0.001; } } minbleicresults(&state, &x, &rep, _state); if( rep.terminationtype>0 ) { *err = 
*err||ae_fp_greater(ae_fabs(x.ptr.p_double[0]+1, _state),tol); *err = *err||ae_fp_greater(ae_fabs(x.ptr.p_double[1]+0.5, _state),tol); } else { *err = ae_true; } /* * f1 */ x.ptr.p_double[0] = 0.2*ae_randomreal(_state)-0.1; x.ptr.p_double[1] = 0.2*ae_randomreal(_state)-0.1; minbleiccreate(2, &x, &state, _state); minbleicsetbc(&state, &bl, &bu, _state); minbleicsetlc(&state, &c, &ct, 4, _state); minbleicsetcond(&state, epsg, 0.0, 0.0, 0, _state); while(minbleiciteration(&state, _state)) { if( state.needfg ) { state.f = ae_sqr(state.x.ptr.p_double[0]+10, _state)+ae_sqr(state.x.ptr.p_double[1], _state); state.g.ptr.p_double[0] = 2*(state.x.ptr.p_double[0]+10); state.g.ptr.p_double[1] = 2*state.x.ptr.p_double[1]; } } minbleicresults(&state, &x, &rep, _state); if( rep.terminationtype>0 ) { *err = *err||ae_fp_greater(ae_fabs(x.ptr.p_double[0]+1, _state),tol); *err = *err||ae_fp_greater(ae_fabs(x.ptr.p_double[1], _state),tol); } else { *err = ae_true; } /* * f2 */ x.ptr.p_double[0] = 0.2*ae_randomreal(_state)-0.1; x.ptr.p_double[1] = 0.2*ae_randomreal(_state)-0.1; minbleiccreate(2, &x, &state, _state); minbleicsetbc(&state, &bl, &bu, _state); minbleicsetlc(&state, &c, &ct, 4, _state); minbleicsetcond(&state, epsg, 0.0, 0.0, 0, _state); while(minbleiciteration(&state, _state)) { if( state.needfg ) { state.f = ae_sqr(state.x.ptr.p_double[0]+10, _state)+ae_sqr(state.x.ptr.p_double[1]-0.6, _state); state.g.ptr.p_double[0] = 2*(state.x.ptr.p_double[0]+10); state.g.ptr.p_double[1] = 2*(state.x.ptr.p_double[1]-0.6); } } minbleicresults(&state, &x, &rep, _state); if( rep.terminationtype>0 ) { *err = *err||ae_fp_greater(ae_fabs(x.ptr.p_double[0]+1, _state),tol); *err = *err||ae_fp_greater(ae_fabs(x.ptr.p_double[1]-0.5, _state),tol); } else { *err = ae_true; } } /* * Degenerate optimization problem with excessive constraints. * * * N=3..10, M=N div 3, K = 2*N * * f(x) = 0.5*|A*x-b|^2, where A is MxN random matrix, b is Mx1 random vector * * bound constraint: * a) Ci=x[i]=Ci for i=0..M-1 * b) 0<=x[i]<=1 for i=M..N-1 * * linear constraints (for fixed feasible xf and random ai): * a) ai*x = ai*xf for i=0..M-1 * b) ai*x <= ai*xf+random(0.1,1.0) for i=M..K-1 * * preconditioner is chosen at random (we just want to be * sure that preconditioning won't prevent us from detecting * infeasible point): * a) unit preconditioner * b) random diagonal-based preconditioner * c) random scale-based preconditioner * * we choose two random initial points from interior of the area * given by bound constraints. * * We do not know analytic solution of this problem, and we do not need * to solve it :) we just perform two restarts from two different initial * points and check that both solutions give approximately same function * value. 
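 *
 * A brief justification of this acceptance rule (a sketch; the exact
 * tolerance used in the code below may differ): the objective
 * f(x) = 0.5*|A*x-b|^2 is convex and the feasible set (box constraints
 * intersected with linear equalities/inequalities) is convex as well, so
 * the optimal VALUE is unique even when the minimizer is not. It is
 * therefore enough to require something like
 *     |f(xs0)-f(xs1)| <= tol*max(1,|f(xs0)|)
 * for the two solutions xs0, xs1 obtained from the two starting points.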
*/ for(preckind=0; preckind<=2; preckind++) { for(n=3; n<=10; n++) { /* * Generate problem */ m = n/3; k = 2*n; ae_vector_set_length(&bl, n, _state); ae_vector_set_length(&bu, n, _state); ae_vector_set_length(&x, n, _state); ae_vector_set_length(&xs0, n, _state); ae_vector_set_length(&xs1, n, _state); ae_vector_set_length(&xf, n, _state); for(i=0; i<=n-1; i++) { if( i=-2; bscale--) { /* * Generate A, B and initial point */ ae_matrix_set_length(&a, n, n, _state); ae_vector_set_length(&b, n, _state); ae_vector_set_length(&x, n, _state); for(i=0; i<=n-1; i++) { b.ptr.p_double[i] = ae_pow((double)(10), (double)(bscale), _state)*hqrndnormal(&rs, _state); x.ptr.p_double[i] = hqrnduniformr(&rs, _state)-0.5; } for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { a.ptr.pp_double[i][j] = 0.0; } } if( akind==1 ) { /* * Dense well conditioned SPD */ spdmatrixrndcond(n, 50.0, &a, _state); } if( akind==2 ) { /* * Dense well conditioned indefinite */ smatrixrndcond(n, 50.0, &a, _state); } if( akind==3 ) { /* * Low rank */ ae_vector_set_length(&tmp, n, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { a.ptr.pp_double[i][j] = 0.0; } } for(k=1; k<=ae_minint(3, n-1, _state); k++) { for(i=0; i<=n-1; i++) { tmp.ptr.p_double[i] = hqrndnormal(&rs, _state); } v = hqrndnormal(&rs, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { a.ptr.pp_double[i][j] = a.ptr.pp_double[i][j]+v*tmp.ptr.p_double[i]*tmp.ptr.p_double[j]; } } } } /* * Generate constraints */ ae_vector_set_length(&bl, n, _state); ae_vector_set_length(&bu, n, _state); for(i=0; i<=n-1; i++) { bl.ptr.p_double[i] = -1.0; bu.ptr.p_double[i] = 1.0; } ccnt = ae_round(ae_pow((double)(2), (double)(n), _state), _state); ae_matrix_set_length(&c, ccnt, n+1, _state); ae_vector_set_length(&ct, ccnt, _state); for(i=0; i<=ccnt-1; i++) { ct.ptr.p_int[i] = -1; k = i; c.ptr.pp_double[i][n] = ae_sign((double)(shiftkind), _state)*ae_pow((double)(10), ae_fabs((double)(shiftkind), _state), _state)*ae_machineepsilon; for(j=0; j<=n-1; j++) { c.ptr.pp_double[i][j] = (double)(2*(k%2)-1); c.ptr.pp_double[i][n] = c.ptr.pp_double[i][n]+c.ptr.pp_double[i][j]*c.ptr.pp_double[i][j]; k = k/2; } } /* * Create and optimize */ minbleiccreate(n, &x, &state, _state); minbleicsetbc(&state, &bl, &bu, _state); minbleicsetlc(&state, &c, &ct, ccnt, _state); minbleicsetcond(&state, 1.0E-9, 0.0, 0.0, 0, _state); while(minbleiciteration(&state, _state)) { ae_assert(state.needfg, "Assertion failed", _state); state.f = (double)(0); for(i=0; i<=n-1; i++) { state.f = state.f+state.x.ptr.p_double[i]*b.ptr.p_double[i]; state.g.ptr.p_double[i] = b.ptr.p_double[i]; } for(i=0; i<=n-1; i++) { v = ae_v_dotproduct(&a.ptr.pp_double[i][0], 1, &state.x.ptr.p_double[0], 1, ae_v_len(0,n-1)); state.f = state.f+0.5*state.x.ptr.p_double[i]*v; state.g.ptr.p_double[i] = state.g.ptr.p_double[i]+v; } } minbleicresults(&state, &xs0, &rep, _state); seterrorflag(err, rep.terminationtype<=0, _state); if( *err ) { ae_frame_leave(_state); return; } /* * Evaluate gradient at solution and test */ vv = 0.0; for(i=0; i<=n-1; i++) { v = ae_v_dotproduct(&a.ptr.pp_double[i][0], 1, &xs0.ptr.p_double[0], 1, ae_v_len(0,n-1)); v = v+b.ptr.p_double[i]; if( ae_fp_less_eq(xs0.ptr.p_double[i],bl.ptr.p_double[i]+tolconstr)&&ae_fp_greater(v,(double)(0)) ) { v = 0.0; } if( ae_fp_greater_eq(xs0.ptr.p_double[i],bu.ptr.p_double[i]-tolconstr)&&ae_fp_less(v,(double)(0)) ) { v = 0.0; } vv = vv+ae_sqr(v, _state); } vv = ae_sqrt(vv, _state); seterrorflag(err, ae_fp_greater(vv,1.0E-5), _state); } } } } /* * Convex/nonconvex optimization problem 
with combination of * box and linear constraints: * * * N=2..8 * * f = 0.5*x'*A*x+b'*x * * b has normally distributed entries with scale 10^BScale * * several kinds of A are tried: zero, well conditioned SPD, * well conditioned indefinite, low rank * * box constraints: x[i] in [-1,+1] * * initial point x0 = [0 0 ... 0 0] * * CCnt=min(3,N-1) general linear constraints of form (c,x)=0. * random mix of equality/inequality constraints is tried. * x0 is guaranteed to be feasible. * * We check that constrained gradient is close to zero at solution. * Inequality constraint is considered active if distance to boundary * is less than TolConstr. We use nonnegative least squares solver * in order to compute constrained gradient. */ tolconstr = 1.0E-8; for(n=2; n<=8; n++) { for(akind=0; akind<=3; akind++) { for(bscale=0; bscale>=-2; bscale--) { /* * Generate A, B and initial point */ ae_matrix_set_length(&a, n, n, _state); ae_vector_set_length(&b, n, _state); ae_vector_set_length(&x, n, _state); for(i=0; i<=n-1; i++) { b.ptr.p_double[i] = ae_pow((double)(10), (double)(bscale), _state)*hqrndnormal(&rs, _state); x.ptr.p_double[i] = 0.0; } for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { a.ptr.pp_double[i][j] = 0.0; } } if( akind==1 ) { /* * Dense well conditioned SPD */ spdmatrixrndcond(n, 50.0, &a, _state); } if( akind==2 ) { /* * Dense well conditioned indefinite */ smatrixrndcond(n, 50.0, &a, _state); } if( akind==3 ) { /* * Low rank */ ae_vector_set_length(&tmp, n, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { a.ptr.pp_double[i][j] = 0.0; } } for(k=1; k<=ae_minint(3, n-1, _state); k++) { for(i=0; i<=n-1; i++) { tmp.ptr.p_double[i] = hqrndnormal(&rs, _state); } v = hqrndnormal(&rs, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { a.ptr.pp_double[i][j] = a.ptr.pp_double[i][j]+v*tmp.ptr.p_double[i]*tmp.ptr.p_double[j]; } } } } /* * Generate constraints */ ae_vector_set_length(&bl, n, _state); ae_vector_set_length(&bu, n, _state); for(i=0; i<=n-1; i++) { bl.ptr.p_double[i] = -1.0; bu.ptr.p_double[i] = 1.0; } ccnt = ae_minint(3, n-1, _state); ae_matrix_set_length(&c, ccnt, n+1, _state); ae_vector_set_length(&ct, ccnt, _state); for(i=0; i<=ccnt-1; i++) { ct.ptr.p_int[i] = hqrnduniformi(&rs, 3, _state)-1; c.ptr.pp_double[i][n] = 0.0; for(j=0; j<=n-1; j++) { c.ptr.pp_double[i][j] = hqrnduniformr(&rs, _state)-0.5; } } /* * Create and optimize */ minbleiccreate(n, &x, &state, _state); minbleicsetbc(&state, &bl, &bu, _state); minbleicsetlc(&state, &c, &ct, ccnt, _state); minbleicsetcond(&state, 1.0E-9, 0.0, 0.0, 0, _state); while(minbleiciteration(&state, _state)) { ae_assert(state.needfg, "Assertion failed", _state); state.f = (double)(0); for(i=0; i<=n-1; i++) { state.f = state.f+state.x.ptr.p_double[i]*b.ptr.p_double[i]; state.g.ptr.p_double[i] = b.ptr.p_double[i]; } for(i=0; i<=n-1; i++) { v = ae_v_dotproduct(&a.ptr.pp_double[i][0], 1, &state.x.ptr.p_double[0], 1, ae_v_len(0,n-1)); state.f = state.f+0.5*state.x.ptr.p_double[i]*v; state.g.ptr.p_double[i] = state.g.ptr.p_double[i]+v; } } minbleicresults(&state, &xs0, &rep, _state); seterrorflag(err, rep.terminationtype<=0, _state); if( *err ) { ae_frame_leave(_state); return; } /* * 1. evaluate unconstrained gradient at solution * * 2. calculate constrained gradient (NNLS solver is used * to evaluate gradient subject to active constraints). * In order to do this we form CE matrix, matrix of active * constraints (columns store constraint vectors). 
Then * we try to approximate gradient vector by columns of CE, * subject to non-negativity restriction placed on variables * corresponding to inequality constraints. * * Residual from such regression is a constrained gradient vector. */ ae_vector_set_length(&g, n, _state); for(i=0; i<=n-1; i++) { v = ae_v_dotproduct(&a.ptr.pp_double[i][0], 1, &xs0.ptr.p_double[0], 1, ae_v_len(0,n-1)); g.ptr.p_double[i] = v+b.ptr.p_double[i]; } ae_matrix_set_length(&ce, n, n+ccnt, _state); ae_vector_set_length(&nonnegative, n+ccnt, _state); k = 0; for(i=0; i<=n-1; i++) { seterrorflag(err, ae_fp_less(xs0.ptr.p_double[i],bl.ptr.p_double[i]), _state); seterrorflag(err, ae_fp_greater(xs0.ptr.p_double[i],bu.ptr.p_double[i]), _state); if( ae_fp_less_eq(xs0.ptr.p_double[i],bl.ptr.p_double[i]+tolconstr) ) { for(j=0; j<=n-1; j++) { ce.ptr.pp_double[j][k] = 0.0; } ce.ptr.pp_double[i][k] = 1.0; nonnegative.ptr.p_bool[k] = ae_true; inc(&k, _state); continue; } if( ae_fp_greater_eq(xs0.ptr.p_double[i],bu.ptr.p_double[i]-tolconstr) ) { for(j=0; j<=n-1; j++) { ce.ptr.pp_double[j][k] = 0.0; } ce.ptr.pp_double[i][k] = -1.0; nonnegative.ptr.p_bool[k] = ae_true; inc(&k, _state); continue; } } for(i=0; i<=ccnt-1; i++) { v = ae_v_dotproduct(&c.ptr.pp_double[i][0], 1, &xs0.ptr.p_double[0], 1, ae_v_len(0,n-1)); v = v-c.ptr.pp_double[i][n]; seterrorflag(err, ct.ptr.p_int[i]==0&&ae_fp_greater(ae_fabs(v, _state),tolconstr), _state); seterrorflag(err, ct.ptr.p_int[i]>0&&ae_fp_less(v,-tolconstr), _state); seterrorflag(err, ct.ptr.p_int[i]<0&&ae_fp_greater(v,tolconstr), _state); if( ct.ptr.p_int[i]==0 ) { for(j=0; j<=n-1; j++) { ce.ptr.pp_double[j][k] = c.ptr.pp_double[i][j]; } nonnegative.ptr.p_bool[k] = ae_false; inc(&k, _state); continue; } if( (ct.ptr.p_int[i]>0&&ae_fp_less_eq(v,tolconstr))||(ct.ptr.p_int[i]<0&&ae_fp_greater_eq(v,-tolconstr)) ) { for(j=0; j<=n-1; j++) { ce.ptr.pp_double[j][k] = ae_sign((double)(ct.ptr.p_int[i]), _state)*c.ptr.pp_double[i][j]; } nonnegative.ptr.p_bool[k] = ae_true; inc(&k, _state); continue; } } snnlsinit(0, 0, 0, &nnls, _state); snnlssetproblem(&nnls, &ce, &g, 0, k, n, _state); for(i=0; i<=k-1; i++) { if( !nonnegative.ptr.p_bool[i] ) { snnlsdropnnc(&nnls, i, _state); } } snnlssolve(&nnls, &tmp, _state); for(i=0; i<=k-1; i++) { for(j=0; j<=n-1; j++) { g.ptr.p_double[j] = g.ptr.p_double[j]-tmp.ptr.p_double[i]*ce.ptr.pp_double[j][i]; } } vv = ae_v_dotproduct(&g.ptr.p_double[0], 1, &g.ptr.p_double[0], 1, ae_v_len(0,n-1)); vv = ae_sqrt(vv, _state); seterrorflag(err, ae_fp_greater(vv,1.0E-5), _state); } } } ae_frame_leave(_state); } /************************************************************************* This function tests preconditioning On failure sets Err to True (leaves it unchanged otherwise) *************************************************************************/ static void testminbleicunit_testpreconditioning(ae_bool* err, ae_state *_state) { ae_frame _frame_block; ae_int_t pass; ae_int_t n; ae_vector x; ae_vector x0; ae_int_t i; ae_int_t k; ae_matrix v; ae_matrix c; ae_vector ct; ae_vector bl; ae_vector bu; ae_vector vd; ae_vector d; ae_vector units; ae_vector s; ae_int_t cntb1; ae_int_t cntb2; ae_int_t cntg1; ae_int_t cntg2; double epsg; ae_vector diagh; minbleicstate state; minbleicreport rep; ae_int_t ckind; ae_int_t fk; ae_frame_make(_state, &_frame_block); ae_vector_init(&x, 0, DT_REAL, _state); ae_vector_init(&x0, 0, DT_REAL, _state); ae_matrix_init(&v, 0, 0, DT_REAL, _state); ae_matrix_init(&c, 0, 0, DT_REAL, _state); ae_vector_init(&ct, 0, DT_INT, _state); ae_vector_init(&bl, 0, 
DT_REAL, _state); ae_vector_init(&bu, 0, DT_REAL, _state); ae_vector_init(&vd, 0, DT_REAL, _state); ae_vector_init(&d, 0, DT_REAL, _state); ae_vector_init(&units, 0, DT_REAL, _state); ae_vector_init(&s, 0, DT_REAL, _state); ae_vector_init(&diagh, 0, DT_REAL, _state); _minbleicstate_init(&state, _state); _minbleicreport_init(&rep, _state); /* * Preconditioner test 1. * * If * * B1 is default preconditioner with unit scale * * G1 is diagonal preconditioner based on approximate diagonal of Hessian matrix * * B2 is default preconditioner with non-unit scale S[i]=1/sqrt(h[i]) * * G2 is scale-based preconditioner with non-unit scale S[i]=1/sqrt(h[i]) * then B1 is worse than G1, B2 is worse than G2. * "Worse" means more iterations to converge. * * Test problem setup: * * f(x) = sum( ((i*i+1)*x[i])^2, i=0..N-1) * * constraints: * 0) absent * 1) boundary only * 2) linear equality only * 3) combination of boundary and linear equality constraints * * N - problem size * K - number of repeated passes (should be large enough to average out random factors) */ k = 100; epsg = 1.0E-8; for(n=10; n<=10; n++) { for(ckind=0; ckind<=3; ckind++) { fk = 1; ae_vector_set_length(&x, n, _state); ae_vector_set_length(&units, n, _state); for(i=0; i<=n-1; i++) { x.ptr.p_double[i] = (double)(0); units.ptr.p_double[i] = (double)(1); } minbleiccreate(n, &x, &state, _state); minbleicsetcond(&state, epsg, 0.0, 0.0, 0, _state); if( ckind==1||ckind==3 ) { ae_vector_set_length(&bl, n, _state); ae_vector_set_length(&bu, n, _state); for(i=0; i<=n-1; i++) { bl.ptr.p_double[i] = (double)(-1); bu.ptr.p_double[i] = (double)(1); } minbleicsetbc(&state, &bl, &bu, _state); } if( ckind==2||ckind==3 ) { ae_matrix_set_length(&c, 1, n+1, _state); ae_vector_set_length(&ct, 1, _state); ct.ptr.p_int[0] = ae_randominteger(3, _state)-1; for(i=0; i<=n-1; i++) { c.ptr.pp_double[0][i] = 2*ae_randomreal(_state)-1; } c.ptr.pp_double[0][n] = (double)(0); minbleicsetlc(&state, &c, &ct, 1, _state); } /* * Test it with default preconditioner VS. 
perturbed diagonal preconditioner */ minbleicsetprecdefault(&state, _state); minbleicsetscale(&state, &units, _state); cntb1 = 0; for(pass=0; pass<=k-1; pass++) { for(i=0; i<=n-1; i++) { x.ptr.p_double[i] = 2*ae_randomreal(_state)-1; } minbleicrestartfrom(&state, &x, _state); while(minbleiciteration(&state, _state)) { testminbleicunit_calciip2(&state, n, fk, _state); } minbleicresults(&state, &x, &rep, _state); cntb1 = cntb1+rep.inneriterationscount; *err = *err||rep.terminationtype<=0; } ae_vector_set_length(&diagh, n, _state); for(i=0; i<=n-1; i++) { diagh.ptr.p_double[i] = 2*ae_pow((double)(i*i+1), (double)(2*fk), _state)*(0.8+0.4*ae_randomreal(_state)); } minbleicsetprecdiag(&state, &diagh, _state); minbleicsetscale(&state, &units, _state); cntg1 = 0; for(pass=0; pass<=k-1; pass++) { for(i=0; i<=n-1; i++) { x.ptr.p_double[i] = 2*ae_randomreal(_state)-1; } minbleicrestartfrom(&state, &x, _state); while(minbleiciteration(&state, _state)) { testminbleicunit_calciip2(&state, n, fk, _state); } minbleicresults(&state, &x, &rep, _state); cntg1 = cntg1+rep.inneriterationscount; *err = *err||rep.terminationtype<=0; } *err = *err||cntb1v_neginf; } infcomp = ae_randominteger(n+1, _state); if( infcompv_posinf; } minbleiccreate(n, &x, &state, _state); minbleicsetgradientcheck(&state, teststep, _state); minbleicsetbc(&state, &bl, &bu, _state); /* * Check that the criterion passes a derivative if it is correct */ while(minbleiciteration(&state, _state)) { if( state.needfg ) { /* * Check that .X within the boundaries */ for(i=0; i<=n-1; i++) { if( (ae_isfinite(bl.ptr.p_double[i], _state)&&ae_fp_less(state.x.ptr.p_double[i],bl.ptr.p_double[i]))||(ae_isfinite(bu.ptr.p_double[i], _state)&&ae_fp_greater(state.x.ptr.p_double[i],bu.ptr.p_double[i])) ) { *testg = ae_true; ae_frame_leave(_state); return; } } testminbleicunit_funcderiv(a, b, c, d, x0, x1, x2, &state.x, func, &state.f, &state.g, _state); } } minbleicresults(&state, &x, &rep, _state); /* * Check that error code does not equal to -7 and parameter .VarIdx * equal to -1. */ if( rep.terminationtype==-7||rep.varidx!=-1 ) { *testg = ae_true; ae_frame_leave(_state); return; } for(i=0; i<=n-1; i++) { x.ptr.p_double[i] = 5*randomnormal(_state); } minbleicrestartfrom(&state, &x, _state); /* * Check that the criterion does not miss a derivative if * it is incorrect */ while(minbleiciteration(&state, _state)) { if( state.needfg ) { for(i=0; i<=n-1; i++) { if( (ae_isfinite(bl.ptr.p_double[i], _state)&&ae_fp_less(state.x.ptr.p_double[i],bl.ptr.p_double[i]))||(ae_isfinite(bu.ptr.p_double[i], _state)&&ae_fp_greater(state.x.ptr.p_double[i],bu.ptr.p_double[i])) ) { *testg = ae_true; ae_frame_leave(_state); return; } } testminbleicunit_funcderiv(a, b, c, d, x0, x1, x2, &state.x, func, &state.f, &state.g, _state); state.g.ptr.p_double[nbrcomp] = state.g.ptr.p_double[nbrcomp]+noise; } } minbleicresults(&state, &x, &rep, _state); /* * Check that error code equal to -7 and parameter .VarIdx * equal to number of incorrect component. */ if( rep.terminationtype!=-7||rep.varidx!=nbrcomp ) { *testg = ae_true; ae_frame_leave(_state); return; } } *testg = ae_false; ae_frame_leave(_state); } /************************************************************************* This function tests problems which caused bugs in the past. 
On failure sets Err to True (leaves it unchanged otherwise) *************************************************************************/ static void testminbleicunit_testbugs(ae_bool* err, ae_state *_state) { ae_frame _frame_block; ae_int_t i; ae_int_t j; ae_int_t k; double v; ae_vector bl; ae_vector bu; ae_vector x; ae_vector x1; ae_vector h; ae_vector prior; ae_vector w; ae_matrix a; ae_matrix c; ae_matrix xy; ae_vector ct; minbleicstate state; minbleicreport rep; hqrndstate rs; ae_int_t pass; double tolx; double regterm; ae_int_t n; ae_int_t ckind; ae_frame_make(_state, &_frame_block); ae_vector_init(&bl, 0, DT_REAL, _state); ae_vector_init(&bu, 0, DT_REAL, _state); ae_vector_init(&x, 0, DT_REAL, _state); ae_vector_init(&x1, 0, DT_REAL, _state); ae_vector_init(&h, 0, DT_REAL, _state); ae_vector_init(&prior, 0, DT_REAL, _state); ae_vector_init(&w, 0, DT_REAL, _state); ae_matrix_init(&a, 0, 0, DT_REAL, _state); ae_matrix_init(&c, 0, 0, DT_REAL, _state); ae_matrix_init(&xy, 0, 0, DT_REAL, _state); ae_vector_init(&ct, 0, DT_INT, _state); _minbleicstate_init(&state, _state); _minbleicreport_init(&rep, _state); _hqrndstate_init(&rs, _state); hqrndrandomize(&rs, _state); /* * Reproduce situation: optimizer sometimes hangs when starts with * gradient orthogonal to the only linear constraint. In most cases * it is solved successfully, but sometimes leads to infinite loop * in one of the early optimizer versions. * * The problem is: * * f(x)= x'*x + c'*x * * linear constraint c'*x=0 * * initial point is x=0 * * there are two ways to choose coefficient vector c: * * its components can be long binary fractions * * or they can be either 0 or 1 * both ways test different scenarios for accumulation of rounding errors * * If test fails, it usually hangs */ tolx = 1.0E-10; for(pass=1; pass<=10; pass++) { for(ckind=0; ckind<=1; ckind++) { for(n=2; n<=10; n++) { ae_vector_set_length(&x, n, _state); ae_matrix_set_length(&c, 1, n+1, _state); ae_vector_set_length(&ct, 1, _state); for(i=0; i<=n-1; i++) { x.ptr.p_double[i] = 0.0; if( ckind==0 ) { c.ptr.pp_double[0][i] = ae_sqrt(hqrnduniformr(&rs, _state), _state); } else { c.ptr.pp_double[0][i] = (double)(hqrnduniformi(&rs, 2, _state)); } } c.ptr.pp_double[0][n] = 0.0; ct.ptr.p_int[0] = 0; minbleiccreate(n, &x, &state, _state); minbleicsetlc(&state, &c, &ct, 1, _state); minbleicsetcond(&state, 0.0, 0.0, 0.0, 99, _state); while(minbleiciteration(&state, _state)) { ae_assert(state.needfg, "Assertion failed", _state); if( state.needfg ) { state.f = 0.0; for(i=0; i<=n-1; i++) { state.f = state.f+ae_sqr(state.x.ptr.p_double[i], _state)+state.x.ptr.p_double[i]*c.ptr.pp_double[0][i]; state.g.ptr.p_double[i] = 2*state.x.ptr.p_double[i]+c.ptr.pp_double[0][i]; } } } minbleicresultsbuf(&state, &x1, &rep, _state); seterrorflag(err, rep.terminationtype<=0, _state); for(i=0; i<=n-1; i++) { seterrorflag(err, ae_fp_greater(ae_fabs(x1.ptr.p_double[i], _state),tolx), _state); } } } } /* * Reproduce optimization problem which caused bugs (optimizer hangs) * when BLEIC was used from MCPD unit. We perform test on specific * 9-dimensional problem, no need to try general-case methods. * * This test hangs if bug is present. 
Thus, we do not test completion * code returned by optimizer - we just test that it was returned :) */ tolx = 1.0E-8; regterm = 1.0E-8; for(pass=1; pass<=1000; pass++) { /* * Prepare constraints: * * [0,1] box constraints on all variables * * 5 linear constraints, first one is random equality; * second one is random inequality; other ones are "sum-to-one" constraints * for x0-x2, x3-x5, x6-x8. */ ae_vector_set_length(&bl, 9, _state); ae_vector_set_length(&bu, 9, _state); for(i=0; i<=9-1; i++) { bl.ptr.p_double[i] = 0.0; bu.ptr.p_double[i] = 1.0; } ae_matrix_set_length(&c, 5, 10, _state); ae_vector_set_length(&ct, 5, _state); for(i=0; i<=1; i++) { c.ptr.pp_double[i][9] = (double)(0); for(j=0; j<=9-1; j++) { c.ptr.pp_double[i][j] = 2*ae_randomreal(_state)-1; c.ptr.pp_double[i][9] = c.ptr.pp_double[i][9]+c.ptr.pp_double[i][j]*((double)1/(double)9); } } ct.ptr.p_int[0] = 0; ct.ptr.p_int[1] = 1; c.ptr.pp_double[1][9] = c.ptr.pp_double[1][9]-0.1; for(i=0; i<=3-1; i++) { for(k=0; k<=9-1; k++) { c.ptr.pp_double[2+i][k] = (double)(0); } for(k=0; k<=3-1; k++) { c.ptr.pp_double[2+i][k*3+i] = (double)(1); } c.ptr.pp_double[2+i][9] = 1.0; ct.ptr.p_int[2+i] = 0; } /* * Prepare weights */ ae_vector_set_length(&w, 3, _state); for(i=0; i<=w.cnt-1; i++) { w.ptr.p_double[i] = 1.0; } /* * Prepare preconditioner H */ ae_vector_set_length(&h, 9, _state); for(i=0; i<=h.cnt-1; i++) { h.ptr.p_double[i] = 1.0; } /* * Prepare prior value for regularization */ ae_vector_set_length(&prior, 9, _state); for(i=0; i<=prior.cnt-1; i++) { prior.ptr.p_double[i] = (double)(0); } prior.ptr.p_double[0] = 1.0; prior.ptr.p_double[4] = 1.0; prior.ptr.p_double[8] = 1.0; /* * Prepare dataset XY */ ae_matrix_set_length(&xy, 6, 3, _state); for(i=0; i<=xy.rows-1; i++) { for(j=0; j<=xy.cols-1; j++) { xy.ptr.pp_double[i][j] = ae_randomreal(_state); } } /* * Optimize */ ae_vector_set_length(&x, 9, _state); for(i=0; i<=9-1; i++) { x.ptr.p_double[i] = (double)1/(double)9; } minbleiccreate(9, &x, &state, _state); minbleicsetbc(&state, &bl, &bu, _state); minbleicsetlc(&state, &c, &ct, 5, _state); minbleicsetcond(&state, 0.0, 0.0, tolx, 0, _state); minbleicsetprecdiag(&state, &h, _state); while(minbleiciteration(&state, _state)) { ae_assert(state.needfg, "Assertion failed", _state); if( state.needfg ) { /* * Calculate regularization term */ state.f = 0.0; for(i=0; i<=9-1; i++) { state.f = state.f+regterm*ae_sqr(state.x.ptr.p_double[i]-prior.ptr.p_double[i], _state); state.g.ptr.p_double[i] = 2*regterm*(state.x.ptr.p_double[i]-prior.ptr.p_double[i]); } /* * calculate prediction error/gradient for K-th pair */ for(k=0; k<=xy.rows-2; k++) { for(i=0; i<=3-1; i++) { v = ae_v_dotproduct(&state.x.ptr.p_double[i*3], 1, &xy.ptr.pp_double[k][0], 1, ae_v_len(i*3,i*3+3-1)); state.f = state.f+ae_sqr(w.ptr.p_double[i]*(v-xy.ptr.pp_double[k+1][i]), _state); for(j=0; j<=3-1; j++) { state.g.ptr.p_double[i*3+j] = state.g.ptr.p_double[i*3+j]+2*w.ptr.p_double[i]*w.ptr.p_double[i]*(v-xy.ptr.pp_double[k+1][i])*xy.ptr.pp_double[k][j]; } } } } } minbleicresultsbuf(&state, &x, &rep, _state); } ae_frame_leave(_state); } /************************************************************************* This function return function value and it derivatives. Function dimension is 3. Function's list: * funcType=1: F(X)=A*(X-X0)^2+B*(Y-Y0)^2+C*(Z-Z0)^2+D; * funcType=2: F(X)=A*sin(X-X0)^2+B*sin(Y-Y0)^2+C*sin(Z-Z0)^2+D; * funcType=3: F(X)=A*(X-X0)^2+B*(Y-Y0)^2+C*((Z-Z0)-(X-X0))^2+D. 
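Gradients returned by this function (they follow directly from the formulas
above and match the code below):
* funcType=1: dF/dX=2*A*(X-X0), dF/dY=2*B*(Y-Y0), dF/dZ=2*C*(Z-Z0);
* funcType=2: dF/dX=2*A*sin(X-X0)*cos(X-X0), and similarly for Y and Z;
* funcType=3: dF/dX=2*A*(X-X0)+2*C*((X-X0)-(Z-Z0)), dF/dY=2*B*(Y-Y0),
              dF/dZ=2*C*((Z-Z0)-(X-X0)).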
*************************************************************************/ static void testminbleicunit_funcderiv(double a, double b, double c, double d, double x0, double x1, double x2, /* Real */ ae_vector* x, ae_int_t functype, double* f, /* Real */ ae_vector* g, ae_state *_state) { ae_assert(((ae_isfinite(a, _state)&&ae_isfinite(b, _state))&&ae_isfinite(c, _state))&&ae_isfinite(d, _state), "FuncDeriv: A, B, C or D contains NaN or Infinite.", _state); ae_assert((ae_isfinite(x0, _state)&&ae_isfinite(x1, _state))&&ae_isfinite(x2, _state), "FuncDeriv: X0, X1 or X2 contains NaN or Infinite.", _state); ae_assert(functype>=1&&functype<=3, "FuncDeriv: incorrect funcType(funcType<1 or funcType>3).", _state); if( functype==1 ) { *f = a*ae_sqr(x->ptr.p_double[0]-x0, _state)+b*ae_sqr(x->ptr.p_double[1]-x1, _state)+c*ae_sqr(x->ptr.p_double[2]-x2, _state)+d; g->ptr.p_double[0] = 2*a*(x->ptr.p_double[0]-x0); g->ptr.p_double[1] = 2*b*(x->ptr.p_double[1]-x1); g->ptr.p_double[2] = 2*c*(x->ptr.p_double[2]-x2); return; } if( functype==2 ) { *f = a*ae_sqr(ae_sin(x->ptr.p_double[0]-x0, _state), _state)+b*ae_sqr(ae_sin(x->ptr.p_double[1]-x1, _state), _state)+c*ae_sqr(ae_sin(x->ptr.p_double[2]-x2, _state), _state)+d; g->ptr.p_double[0] = 2*a*ae_sin(x->ptr.p_double[0]-x0, _state)*ae_cos(x->ptr.p_double[0]-x0, _state); g->ptr.p_double[1] = 2*b*ae_sin(x->ptr.p_double[1]-x1, _state)*ae_cos(x->ptr.p_double[1]-x1, _state); g->ptr.p_double[2] = 2*c*ae_sin(x->ptr.p_double[2]-x2, _state)*ae_cos(x->ptr.p_double[2]-x2, _state); return; } if( functype==3 ) { *f = a*ae_sqr(x->ptr.p_double[0]-x0, _state)+b*ae_sqr(x->ptr.p_double[1]-x1, _state)+c*ae_sqr(x->ptr.p_double[2]-x2-(x->ptr.p_double[0]-x0), _state)+d; g->ptr.p_double[0] = 2*a*(x->ptr.p_double[0]-x0)+2*c*(x->ptr.p_double[0]-x->ptr.p_double[2]-x0+x2); g->ptr.p_double[1] = 2*b*(x->ptr.p_double[1]-x1); g->ptr.p_double[2] = 2*c*(x->ptr.p_double[2]-x->ptr.p_double[0]-x2+x0); return; } } static void testminqpunit_bcqptest(ae_bool* wereerrors, ae_state *_state); static ae_bool testminqpunit_ecqptest(ae_state *_state); static void testminqpunit_icqptest(ae_bool* err, ae_state *_state); static void testminqpunit_generallcqptest(ae_bool* errorflag, ae_state *_state); static ae_bool testminqpunit_specialicqptests(ae_state *_state); static void testminqpunit_denseaultests(ae_bool* errorflag, ae_state *_state); static double testminqpunit_projectedantigradnorm(ae_int_t n, /* Real */ ae_vector* x, /* Real */ ae_vector* g, /* Real */ ae_vector* bndl, /* Real */ ae_vector* bndu, ae_state *_state); static void testminqpunit_testbcgradandfeasibility(/* Real */ ae_matrix* a, /* Real */ ae_vector* b, /* Real */ ae_vector* bndl, /* Real */ ae_vector* bndu, ae_int_t n, /* Real */ ae_vector* x, double eps, ae_bool* errorflag, ae_state *_state); static void testminqpunit_setrandomalgoallmodern(minqpstate* s, double* bctol, double* lctol, ae_state *_state); static void testminqpunit_setrandomalgononconvex(minqpstate* s, ae_state *_state); static void testminqpunit_setrandomalgosemidefinite(minqpstate* s, ae_state *_state); static void testminqpunit_setrandomalgobc(minqpstate* s, ae_state *_state); static void testminqpunit_setrandomalgoconvexlc(minqpstate* s, ae_state *_state); static void testminqpunit_setrandomalgononconvexlc(minqpstate* s, ae_state *_state); static void testminqpunit_densetosparse(/* Real */ ae_matrix* a, ae_int_t n, sparsematrix* s, ae_state *_state); static void testminqpunit_randomlysplitlc(/* Real */ ae_matrix* rawc, /* Integer */ ae_vector* rawct, ae_int_t rawccnt, 
ae_int_t n, /* Real */ ae_matrix* densec, /* Integer */ ae_vector* densect, sparsematrix* sparsec, /* Integer */ ae_vector* sparsect, ae_int_t* denseccnt, ae_int_t* sparseccnt, hqrndstate* rs, ae_state *_state); static void testminqpunit_randomlysplitandsetlc(/* Real */ ae_matrix* rawc, /* Integer */ ae_vector* rawct, ae_int_t rawccnt, ae_int_t n, minqpstate* state, hqrndstate* rs, ae_state *_state); static void testminqpunit_randomlyselectconvertandsetquadraticterm(/* Real */ ae_matrix* a, ae_int_t n, minqpstate* state, hqrndstate* rs, ae_state *_state); static double testminqpunit_getconstraintrcond(/* Real */ ae_matrix* c, ae_int_t k, ae_int_t n, ae_state *_state); ae_bool testminqp(ae_bool silent, ae_state *_state) { ae_bool simpleerrors; ae_bool func1errors; ae_bool func2errors; ae_bool bcqperrors; ae_bool ecqperrors; ae_bool icqperrors; ae_bool lcqperrors; ae_bool cholerrors; ae_bool quickqperrors; ae_bool bleicerrors; ae_bool denseaulerrors; ae_bool waserrors; ae_bool result; /* * The VERY basic tests for Cholesky and BLEIC */ simpleerrors = simpletest(_state); func1errors = functest1(_state); func2errors = functest2(_state); /* * Solver-specific tests */ denseaulerrors = ae_false; cholerrors = choleskytests(_state); quickqperrors = quickqptests(_state); bleicerrors = bleictests(_state); testminqpunit_denseaultests(&denseaulerrors, _state); /* * */ icqperrors = ae_false; lcqperrors = ae_false; bcqperrors = ae_false; testminqpunit_bcqptest(&bcqperrors, _state); ecqperrors = testminqpunit_ecqptest(_state); testminqpunit_icqptest(&icqperrors, _state); icqperrors = icqperrors||testminqpunit_specialicqptests(_state); testminqpunit_generallcqptest(&lcqperrors, _state); /* * report */ waserrors = (((((((((simpleerrors||func1errors)||func2errors)||bcqperrors)||ecqperrors)||icqperrors)||lcqperrors)||quickqperrors)||cholerrors)||bleicerrors)||denseaulerrors; if( !silent ) { printf("TESTING MinQP\n"); printf("BASIC TESTS:\n"); printf("* SimpleTest: "); if( simpleerrors ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("* Func1Test: "); if( func1errors ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("* Func2Test: "); if( func2errors ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("GENERIC QP TESTS:\n"); printf("* box constrained: "); if( bcqperrors ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("* linearly constrained: "); if( (ecqperrors||icqperrors)||lcqperrors ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("SOLVER-SPECIFIC TESTS:\n"); printf("* QuickQP solver tests: "); if( quickqperrors ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("* BLEIC solver tests: "); if( bleicerrors ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("* DENSE-AUL solver tests: "); if( denseaulerrors ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("* Cholesky solver tests: "); if( cholerrors ) { printf("FAILED\n"); } else { printf("OK\n"); } if( waserrors ) { printf("TEST FAILED\n"); } else { printf("TEST PASSED\n"); } printf("\n\n"); } result = !waserrors; return result; } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. 
*************************************************************************/ ae_bool _pexec_testminqp(ae_bool silent, ae_state *_state) { return testminqp(silent, _state); } /************************************************************************* Function to test: 'MinQPCreate', 'MinQPSetQuadraticTerm', 'MinQPSetBC', 'MinQPSetOrigin', 'MinQPSetStartingPoint', 'MinQPOptimize', 'MinQPResults'. Test problem: A = diag(aii), aii>0 (random) b = 0 random bounds (either no bounds, one bound, two bounds av_neginf; ub.ptr.p_double[j] = _state->v_posinf; } else { if( infd==1 ) { db.ptr.p_double[j] = _state->v_neginf; ub.ptr.p_double[j] = (maxnb-minnb)*ae_randomreal(_state)+minnb; } else { if( infd==2 ) { db.ptr.p_double[j] = (maxnb-minnb)*ae_randomreal(_state)+minnb; ub.ptr.p_double[j] = _state->v_posinf; } else { if( infd==3 ) { db.ptr.p_double[j] = (maxnb-minnb)*ae_randomreal(_state)+minnb; ub.ptr.p_double[j] = db.ptr.p_double[j]+maxstb*ae_randomreal(_state)+0.01; } else { db.ptr.p_double[j] = (maxnb-minnb)*ae_randomreal(_state)+minnb; ub.ptr.p_double[j] = db.ptr.p_double[j]; } } } } } minqpsetbc(&state, &db, &ub, _state); /* *initialization for shifting *initial value for 'XORi' *and searching true results */ for(j=0; j<=sn-1; j++) { xori.ptr.p_double[j] = (maxnb-minnb)*ae_randomreal(_state)+minnb; tx.ptr.p_double[j] = boundval(xori.ptr.p_double[j], db.ptr.p_double[j], ub.ptr.p_double[j], _state); } minqpsetorigin(&state, &xori, _state); /* *initialization for starting point */ for(j=0; j<=sn-1; j++) { stx.ptr.p_double[j] = (maxnb-minnb)*ae_randomreal(_state)+minnb; } minqpsetstartingpoint(&state, &stx, _state); /* *optimize and get result */ minqpoptimize(&state, _state); minqpresults(&state, &x, &rep, _state); for(j=0; j<=sn-1; j++) { if( ae_fp_greater(ae_fabs(tx.ptr.p_double[j]-x.ptr.p_double[j], _state),eps)||(ae_fp_less(x.ptr.p_double[j],db.ptr.p_double[j])||ae_fp_greater(x.ptr.p_double[j],ub.ptr.p_double[j])) ) { result = ae_true; ae_frame_leave(_state); return result; } } } } result = ae_false; ae_frame_leave(_state); return result; } /************************************************************************* Function to test: 'MinQPCreate', 'MinQPSetLinearTerm', 'MinQPSetQuadraticTerm', 'MinQPSetOrigin', 'MinQPSetStartingPoint', 'MinQPOptimize', 'MinQPResults'. Test problem: A = positive-definite matrix, obtained by 'SPDMatrixRndCond' function b <> 0 without bounds random start point dimension - from 1 to 5. 
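Expected solution (a short derivation, assuming the usual MinQPSetOrigin
semantics where the quadratic model is built around X-Origin): the test sets
b = -A*XORiC, so the model is
    f(x) = 0.5*(x-XORi)'*A*(x-XORi) + b'*(x-XORi).
Setting the gradient A*(x-XORi) + b to zero gives x = XORi + XORiC, which is
exactly the reference vector TX that the solver output is compared against,
to within eps.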
*************************************************************************/ ae_bool functest1(ae_state *_state) { ae_frame _frame_block; minqpstate state; ae_int_t nexp; ae_int_t msn; ae_int_t sn; ae_int_t i; ae_int_t j; ae_int_t k; ae_matrix a; ae_vector ub; ae_vector db; ae_vector x; ae_vector tx; ae_vector stx; ae_vector xori; ae_vector xoric; minqpreport rep; double eps; ae_vector b; ae_int_t c2; ae_bool result; ae_frame_make(_state, &_frame_block); _minqpstate_init(&state, _state); ae_matrix_init(&a, 0, 0, DT_REAL, _state); ae_vector_init(&ub, 0, DT_REAL, _state); ae_vector_init(&db, 0, DT_REAL, _state); ae_vector_init(&x, 0, DT_REAL, _state); ae_vector_init(&tx, 0, DT_REAL, _state); ae_vector_init(&stx, 0, DT_REAL, _state); ae_vector_init(&xori, 0, DT_REAL, _state); ae_vector_init(&xoric, 0, DT_REAL, _state); _minqpreport_init(&rep, _state); ae_vector_init(&b, 0, DT_REAL, _state); eps = 0.001; msn = 5; c2 = 1000; nexp = 1000; for(sn=1; sn<=msn; sn++) { ae_vector_set_length(&b, sn, _state); ae_vector_set_length(&tx, sn, _state); ae_vector_set_length(&xori, sn, _state); ae_vector_set_length(&xoric, sn, _state); ae_vector_set_length(&stx, sn, _state); for(i=0; i<=nexp; i++) { /* *create simmetric matrix 'A' */ spdmatrixrndcond(sn, ae_exp(ae_randomreal(_state)*ae_log((double)(c2), _state), _state), &a, _state); minqpcreate(sn, &state, _state); testminqpunit_setrandomalgobc(&state, _state); minqpsetquadraticterm(&state, &a, ae_false, _state); for(j=0; j<=sn-1; j++) { xoric.ptr.p_double[j] = 2*ae_randomreal(_state)-1; } /* *create linear part */ for(j=0; j<=sn-1; j++) { b.ptr.p_double[j] = (double)(0); for(k=0; k<=sn-1; k++) { b.ptr.p_double[j] = b.ptr.p_double[j]-xoric.ptr.p_double[k]*a.ptr.pp_double[k][j]; } } minqpsetlinearterm(&state, &b, _state); /* *initialization for shifting *initial value for 'XORi' *and searching true results */ for(j=0; j<=sn-1; j++) { xori.ptr.p_double[j] = 2*ae_randomreal(_state)-1; tx.ptr.p_double[j] = xori.ptr.p_double[j]+xoric.ptr.p_double[j]; } minqpsetorigin(&state, &xori, _state); /* *initialization for starting point */ for(j=0; j<=sn-1; j++) { stx.ptr.p_double[j] = 2*ae_randomreal(_state)-1; } minqpsetstartingpoint(&state, &stx, _state); /* *optimize and get result */ minqpoptimize(&state, _state); minqpresults(&state, &x, &rep, _state); for(j=0; j<=sn-1; j++) { if( ae_fp_greater(ae_fabs(tx.ptr.p_double[j]-x.ptr.p_double[j], _state),eps) ) { result = ae_true; ae_frame_leave(_state); return result; } } } } result = ae_false; ae_frame_leave(_state); return result; } /************************************************************************* Function to test: 'MinQPCreate', 'MinQPSetLinearTerm', 'MinQPSetQuadraticTerm', 'MinQPSetBC', 'MinQPSetOrigin', 'MinQPSetStartingPoint', 'MinQPOptimize', 'MinQPResults'. Test problem: A = positive-definite matrix, obtained by 'SPDMatrixRndCond' function b <> 0 boundary constraints random start point dimension - from 1 to 5. 
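Verification approach (restated for clarity): with random box constraints the
constrained minimizer has no simple closed form, so instead of comparing with
an analytic answer the test checks first-order optimality at the returned
point: it assembles the gradient g = A*(x-XORi) + b (in the code below this is
y0 + c + b, with c = -A*XORi) and, via the ProjectedAntiGradNorm helper,
requires the norm of the anti-gradient projected onto the feasible directions
of the box to be at most eps.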
*************************************************************************/ ae_bool functest2(ae_state *_state) { ae_frame _frame_block; minqpstate state; ae_int_t nexp; ae_int_t msn; ae_int_t sn; ae_int_t i; ae_int_t j; ae_int_t k; ae_matrix a; ae_vector ub; ae_vector db; ae_vector x; ae_vector tmpx; double maxstb; ae_vector stx; ae_vector xori; ae_vector xoric; ae_int_t infd; minqpreport rep; double maxnb; double minnb; double eps; ae_vector b; ae_vector g; ae_vector c; ae_vector y0; ae_vector y1; ae_int_t c2; double anti; ae_bool result; ae_frame_make(_state, &_frame_block); _minqpstate_init(&state, _state); ae_matrix_init(&a, 0, 0, DT_REAL, _state); ae_vector_init(&ub, 0, DT_REAL, _state); ae_vector_init(&db, 0, DT_REAL, _state); ae_vector_init(&x, 0, DT_REAL, _state); ae_vector_init(&tmpx, 0, DT_REAL, _state); ae_vector_init(&stx, 0, DT_REAL, _state); ae_vector_init(&xori, 0, DT_REAL, _state); ae_vector_init(&xoric, 0, DT_REAL, _state); _minqpreport_init(&rep, _state); ae_vector_init(&b, 0, DT_REAL, _state); ae_vector_init(&g, 0, DT_REAL, _state); ae_vector_init(&c, 0, DT_REAL, _state); ae_vector_init(&y0, 0, DT_REAL, _state); ae_vector_init(&y1, 0, DT_REAL, _state); eps = 0.001; msn = 5; c2 = 1000; maxstb = (double)(10); nexp = 1000; maxnb = (double)(1000); minnb = (double)(-1000); for(sn=1; sn<=msn; sn++) { ae_vector_set_length(&tmpx, sn, _state); ae_vector_set_length(&b, sn, _state); ae_vector_set_length(&c, sn, _state); ae_vector_set_length(&g, sn, _state); ae_vector_set_length(&xori, sn, _state); ae_vector_set_length(&xoric, sn, _state); ae_vector_set_length(&stx, sn, _state); ae_vector_set_length(&db, sn, _state); ae_vector_set_length(&ub, sn, _state); ae_vector_set_length(&y0, sn, _state); ae_vector_set_length(&y1, sn, _state); for(i=0; i<=nexp; i++) { /* *create simmetric matrix 'A' */ spdmatrixrndcond(sn, ae_exp(ae_randomreal(_state)*ae_log((double)(c2), _state), _state), &a, _state); minqpcreate(sn, &state, _state); testminqpunit_setrandomalgobc(&state, _state); minqpsetquadraticterm(&state, &a, ae_false, _state); for(j=0; j<=sn-1; j++) { xoric.ptr.p_double[j] = (maxnb-minnb)*ae_randomreal(_state)+minnb; } /* *create linear part */ for(j=0; j<=sn-1; j++) { b.ptr.p_double[j] = (double)(0); for(k=0; k<=sn-1; k++) { b.ptr.p_double[j] = b.ptr.p_double[j]-xoric.ptr.p_double[k]*a.ptr.pp_double[k][j]; } } minqpsetlinearterm(&state, &b, _state); for(j=0; j<=sn-1; j++) { infd = ae_randominteger(4, _state); if( infd==0 ) { db.ptr.p_double[j] = _state->v_neginf; ub.ptr.p_double[j] = _state->v_posinf; } else { if( infd==1 ) { db.ptr.p_double[j] = _state->v_neginf; ub.ptr.p_double[j] = (maxnb-minnb)*ae_randomreal(_state)+minnb; } else { if( infd==2 ) { db.ptr.p_double[j] = (maxnb-minnb)*ae_randomreal(_state)+minnb; ub.ptr.p_double[j] = _state->v_posinf; } else { db.ptr.p_double[j] = (maxnb-minnb)*ae_randomreal(_state)+minnb; ub.ptr.p_double[j] = db.ptr.p_double[j]+maxstb*ae_randomreal(_state)+0.01; } } } } minqpsetbc(&state, &db, &ub, _state); /* *initialization for shifting *initial value for 'XORi' *and searching true results */ for(j=0; j<=sn-1; j++) { xori.ptr.p_double[j] = (maxnb-minnb)*ae_randomreal(_state)+minnb; } minqpsetorigin(&state, &xori, _state); for(j=0; j<=sn-1; j++) { c.ptr.p_double[j] = (double)(0); for(k=0; k<=sn-1; k++) { c.ptr.p_double[j] = c.ptr.p_double[j]-xori.ptr.p_double[k]*a.ptr.pp_double[k][j]; } } /* *initialization for starting point */ for(j=0; j<=sn-1; j++) { stx.ptr.p_double[j] = (maxnb-minnb)*ae_randomreal(_state)+minnb; } minqpsetstartingpoint(&state, 
&stx, _state); /* *optimize and get result */ minqpoptimize(&state, _state); minqpresults(&state, &x, &rep, _state); rmatrixmv(sn, sn, &a, 0, 0, 0, &x, 0, &y0, 0, _state); for(j=0; j<=sn-1; j++) { g.ptr.p_double[j] = y0.ptr.p_double[j]+c.ptr.p_double[j]+b.ptr.p_double[j]; } anti = testminqpunit_projectedantigradnorm(sn, &x, &g, &db, &ub, _state); for(j=0; j<=sn-1; j++) { if( ae_fp_greater(ae_fabs(anti, _state),eps) ) { result = ae_true; ae_frame_leave(_state); return result; } } } } result = ae_false; ae_frame_leave(_state); return result; } /************************************************************************* ConsoleTest. *************************************************************************/ ae_bool consoletest(ae_state *_state) { ae_frame _frame_block; minqpstate state; ae_int_t nexp; ae_int_t msn; ae_int_t sn; ae_int_t i; ae_int_t j; ae_int_t k; ae_matrix a; ae_vector ub; ae_vector db; ae_vector x; ae_vector stx; ae_vector xori; ae_vector xoric; minqpreport rep; double eps; ae_vector b; ae_vector g; ae_vector y0; ae_vector y1; double c; double anti; ae_bool result; ae_frame_make(_state, &_frame_block); _minqpstate_init(&state, _state); ae_matrix_init(&a, 0, 0, DT_REAL, _state); ae_vector_init(&ub, 0, DT_REAL, _state); ae_vector_init(&db, 0, DT_REAL, _state); ae_vector_init(&x, 0, DT_REAL, _state); ae_vector_init(&stx, 0, DT_REAL, _state); ae_vector_init(&xori, 0, DT_REAL, _state); ae_vector_init(&xoric, 0, DT_REAL, _state); _minqpreport_init(&rep, _state); ae_vector_init(&b, 0, DT_REAL, _state); ae_vector_init(&g, 0, DT_REAL, _state); ae_vector_init(&y0, 0, DT_REAL, _state); ae_vector_init(&y1, 0, DT_REAL, _state); eps = 0.001; msn = 2; nexp = 0; for(sn=2; sn<=msn; sn++) { ae_vector_set_length(&b, sn, _state); ae_vector_set_length(&g, sn, _state); ae_vector_set_length(&xori, sn, _state); ae_vector_set_length(&xoric, sn, _state); ae_vector_set_length(&stx, sn, _state); ae_vector_set_length(&db, sn, _state); ae_vector_set_length(&ub, sn, _state); ae_vector_set_length(&y0, sn, _state); ae_vector_set_length(&y1, sn, _state); for(i=0; i<=nexp; i++) { /* *create simmetric matrix 'A' */ ae_matrix_set_length(&a, sn, sn, _state); for(j=0; j<=sn-1; j++) { for(k=0; k<=sn-1; k++) { if( j==k ) { a.ptr.pp_double[j][k] = (double)(1); } else { a.ptr.pp_double[j][k] = (double)(0); } printf("%0.5f ", (double)(a.ptr.pp_double[j][k])); } printf("\n"); } minqpcreate(sn, &state, _state); testminqpunit_setrandomalgobc(&state, _state); minqpsetquadraticterm(&state, &a, ae_false, _state); for(j=0; j<=sn-1; j++) { xoric.ptr.p_double[j] = (double)(1); printf("XoriC=%0.5f \n", (double)(xoric.ptr.p_double[j])); } /* *create linear part */ for(j=0; j<=sn-1; j++) { b.ptr.p_double[j] = (double)(0); for(k=0; k<=sn-1; k++) { b.ptr.p_double[j] = b.ptr.p_double[j]-xoric.ptr.p_double[k]*a.ptr.pp_double[k][j]; } printf("B[%0d]=%0.5f\n", (int)(j), (double)(b.ptr.p_double[j])); } minqpsetlinearterm(&state, &b, _state); for(j=0; j<=sn-1; j++) { db.ptr.p_double[j] = (double)(10); ub.ptr.p_double[j] = (double)(20); } minqpsetbc(&state, &db, &ub, _state); /* *initialization for shifting *initial value for 'XORi' *and searching true results */ for(j=0; j<=sn-1; j++) { xori.ptr.p_double[j] = (double)(1); } minqpsetorigin(&state, &xori, _state); /* *optimize and get result */ minqpoptimize(&state, _state); minqpresults(&state, &x, &rep, _state); rmatrixmv(sn, sn, &a, 0, 0, 0, &x, 0, &y0, 0, _state); rmatrixmv(sn, sn, &a, 0, 0, 0, &x, 0, &y1, 0, _state); for(j=0; j<=sn-1; j++) { c = (double)(0); for(k=0; k<=sn-1; k++) { c = 
c-xori.ptr.p_double[k]*a.ptr.pp_double[k][j]; } g.ptr.p_double[j] = b.ptr.p_double[j]+c+y0.ptr.p_double[j]+y1.ptr.p_double[j]; } anti = testminqpunit_projectedantigradnorm(sn, &x, &b, &db, &ub, _state); printf("SN=%0d\n", (int)(sn)); printf("NEXP=%0d\n", (int)(i)); printf("TermType=%0d\n", (int)(rep.terminationtype)); for(j=0; j<=sn-1; j++) { printf("X[%0d]=%0.5f;\n", (int)(j), (double)(x.ptr.p_double[j])); printf("DB[%0d]=%0.5f; UB[%0d]=%0.5f\n", (int)(j), (double)(db.ptr.p_double[j]), (int)(j), (double)(ub.ptr.p_double[j])); printf("XORi[%0d]=%0.5f; XORiC[%0d]=%0.5f;\n", (int)(j), (double)(xori.ptr.p_double[j]), (int)(j), (double)(xoric.ptr.p_double[j])); printf("Anti[%0d]=%0.5f;\n", (int)(j), (double)(anti)); if( ae_fp_greater(ae_fabs(anti, _state),eps) ) { result = ae_true; ae_frame_leave(_state); return result; } } } } result = ae_false; ae_frame_leave(_state); return result; } /************************************************************************* This function performs tests specific for Cholesky solver Returns True on success, False on failure. *************************************************************************/ ae_bool choleskytests(ae_state *_state) { ae_frame _frame_block; minqpstate state; minqpreport rep; sparsematrix sa; ae_matrix a; ae_int_t n; ae_int_t i; ae_int_t j; ae_vector bndl; ae_vector bndu; ae_vector x; ae_vector xend; ae_vector xend0; ae_bool result; ae_frame_make(_state, &_frame_block); _minqpstate_init(&state, _state); _minqpreport_init(&rep, _state); _sparsematrix_init(&sa, _state); ae_matrix_init(&a, 0, 0, DT_REAL, _state); ae_vector_init(&bndl, 0, DT_REAL, _state); ae_vector_init(&bndu, 0, DT_REAL, _state); ae_vector_init(&x, 0, DT_REAL, _state); ae_vector_init(&xend, 0, DT_REAL, _state); ae_vector_init(&xend0, 0, DT_REAL, _state); result = ae_false; /* * TEST: Cholesky solver should return -5 on sparse matrices. */ n = 5; sparsecreate(n, n, 0, &sa, _state); for(i=0; i<=n-1; i++) { sparseset(&sa, i, i, 1.0, _state); } minqpcreate(n, &state, _state); minqpsetalgocholesky(&state, _state); minqpsetquadratictermsparse(&state, &sa, ae_true, _state); minqpoptimize(&state, _state); minqpresults(&state, &xend, &rep, _state); seterrorflag(&result, rep.terminationtype!=-5, _state); /* * Test CQP solver on non-convex problems, * which are bounded from below on the feasible set: * * min -||x||^2 s.t. x[i] in [-1,+1] * * We test ability of the solver to detect such problems * and report failure. */ n = 20; ae_matrix_set_length(&a, n, n, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { a.ptr.pp_double[i][j] = (double)(0); } } for(i=0; i<=n-1; i++) { a.ptr.pp_double[i][i] = -1.0; } ae_vector_set_length(&bndl, n, _state); ae_vector_set_length(&bndu, n, _state); ae_vector_set_length(&x, n, _state); for(i=0; i<=n-1; i++) { bndl.ptr.p_double[i] = (double)(-1); bndu.ptr.p_double[i] = (double)(1); x.ptr.p_double[i] = ae_randomreal(_state)-0.5; } minqpcreate(n, &state, _state); minqpsetalgocholesky(&state, _state); minqpsetquadraticterm(&state, &a, ae_true, _state); minqpsetbc(&state, &bndl, &bndu, _state); minqpsetstartingpoint(&state, &x, _state); minqpoptimize(&state, _state); minqpresults(&state, &xend0, &rep, _state); seterrorflag(&result, rep.terminationtype!=-5, _state); /* * Test CQP solver on non-convex problems, * which are unbounded from below: * * min -||x||^2 * * We test ability of the solver to detect such problems * and report failure. 
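 *
 * Why failure is expected here (sketch): no linear term is set, so the
 * objective is f(x) = 0.5*x'*A*x with A = -I, i.e. -0.5*||x||^2, which
 * decreases without bound along any nonzero ray t*x0. Without constraints
 * there is no minimizer, and the convex (Cholesky-based) solver is expected
 * to report this through TerminationType=-5, which is what the assertion
 * below checks.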
*/ n = 20; ae_matrix_set_length(&a, n, n, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { a.ptr.pp_double[i][j] = (double)(0); } } for(i=0; i<=n-1; i++) { a.ptr.pp_double[i][i] = -1.0; } ae_vector_set_length(&x, n, _state); for(i=0; i<=n-1; i++) { x.ptr.p_double[i] = ae_randomreal(_state)-0.5; } minqpcreate(n, &state, _state); minqpsetalgocholesky(&state, _state); minqpsetquadraticterm(&state, &a, ae_true, _state); minqpsetstartingpoint(&state, &x, _state); minqpoptimize(&state, _state); minqpresults(&state, &xend0, &rep, _state); seterrorflag(&result, rep.terminationtype!=-5, _state); ae_frame_leave(_state); return result; } /************************************************************************* This function performs tests specific for QuickQP solver Returns True on failure. *************************************************************************/ ae_bool quickqptests(ae_state *_state) { ae_frame _frame_block; minqpstate state; minqpreport rep; ae_int_t n; ae_int_t pass; ae_int_t i; ae_int_t j; ae_int_t k; double v; double g; double gnorm; ae_bool flag; ae_int_t origintype; ae_int_t scaletype; ae_bool isupper; ae_bool issparse; ae_int_t itscnt; ae_vector nlist; ae_int_t nidx; ae_matrix a; ae_matrix za; ae_matrix fulla; ae_matrix halfa; ae_matrix c; sparsematrix sa; ae_vector ct; ae_vector b; ae_vector zb; ae_vector bndl; ae_vector bndu; ae_vector x0; ae_vector x1; ae_vector xend0; ae_vector xend1; ae_vector xori; ae_vector xz; ae_vector s; double eps; hqrndstate rs; ae_bool result; ae_frame_make(_state, &_frame_block); _minqpstate_init(&state, _state); _minqpreport_init(&rep, _state); ae_vector_init(&nlist, 0, DT_INT, _state); ae_matrix_init(&a, 0, 0, DT_REAL, _state); ae_matrix_init(&za, 0, 0, DT_REAL, _state); ae_matrix_init(&fulla, 0, 0, DT_REAL, _state); ae_matrix_init(&halfa, 0, 0, DT_REAL, _state); ae_matrix_init(&c, 0, 0, DT_REAL, _state); _sparsematrix_init(&sa, _state); ae_vector_init(&ct, 0, DT_INT, _state); ae_vector_init(&b, 0, DT_REAL, _state); ae_vector_init(&zb, 0, DT_REAL, _state); ae_vector_init(&bndl, 0, DT_REAL, _state); ae_vector_init(&bndu, 0, DT_REAL, _state); ae_vector_init(&x0, 0, DT_REAL, _state); ae_vector_init(&x1, 0, DT_REAL, _state); ae_vector_init(&xend0, 0, DT_REAL, _state); ae_vector_init(&xend1, 0, DT_REAL, _state); ae_vector_init(&xori, 0, DT_REAL, _state); ae_vector_init(&xz, 0, DT_REAL, _state); ae_vector_init(&s, 0, DT_REAL, _state); _hqrndstate_init(&rs, _state); result = ae_false; hqrndrandomize(&rs, _state); /* * Convex unconstrained test: * * N dimensions * * positive-definite A * * algorithm randomly choose dense or sparse A, and for * sparse matrix it randomly choose format. 
* * random B with normal entries * * initial point is random, feasible * * random origin (zero or non-zero) and scale (unit or * non-unit) are generated */ eps = 1.0E-5; for(n=1; n<=10; n++) { for(pass=1; pass<=10; pass++) { /* * Generate problem */ origintype = hqrnduniformi(&rs, 2, _state); scaletype = hqrnduniformi(&rs, 2, _state); isupper = ae_fp_less(hqrnduniformr(&rs, _state),0.5); issparse = ae_fp_less(hqrnduniformr(&rs, _state),0.5); spdmatrixrndcond(n, 1.0E3, &fulla, _state); ae_matrix_set_length(&halfa, n, n, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { if( (j>=i&&isupper)||(j<=i&&!isupper) ) { halfa.ptr.pp_double[i][j] = fulla.ptr.pp_double[i][j]; } else { halfa.ptr.pp_double[i][j] = hqrnduniformr(&rs, _state)-0.5; } } } testminqpunit_densetosparse(&halfa, n, &sa, _state); ae_vector_set_length(&b, n, _state); ae_vector_set_length(&x0, n, _state); ae_vector_set_length(&xori, n, _state); ae_vector_set_length(&s, n, _state); for(i=0; i<=n-1; i++) { b.ptr.p_double[i] = hqrndnormal(&rs, _state); x0.ptr.p_double[i] = hqrndnormal(&rs, _state); if( origintype==0 ) { xori.ptr.p_double[i] = (double)(0); } else { xori.ptr.p_double[i] = hqrndnormal(&rs, _state); } if( scaletype==0 ) { s.ptr.p_double[i] = (double)(1); } else { s.ptr.p_double[i] = ae_exp(hqrndnormal(&rs, _state), _state); } } /* * Solve problem */ minqpcreate(n, &state, _state); minqpsetalgoquickqp(&state, 0.0, 0.0, 0.0, 0, ae_fp_greater(hqrnduniformr(&rs, _state),0.5), _state); minqpsetlinearterm(&state, &b, _state); if( issparse ) { minqpsetquadratictermsparse(&state, &sa, isupper, _state); } else { minqpsetquadraticterm(&state, &halfa, isupper, _state); } if( origintype!=0 ) { minqpsetorigin(&state, &xori, _state); } if( scaletype!=0 ) { minqpsetscale(&state, &s, _state); } minqpoptimize(&state, _state); minqpresults(&state, &x1, &rep, _state); seterrorflag(&result, rep.terminationtype<=0, _state); if( rep.terminationtype<=0 ) { ae_frame_leave(_state); return result; } /* * Test - calculate constrained gradient at solution, * check its norm. */ gnorm = 0.0; for(i=0; i<=n-1; i++) { g = b.ptr.p_double[i]; for(j=0; j<=n-1; j++) { g = g+fulla.ptr.pp_double[i][j]*(x1.ptr.p_double[j]-xori.ptr.p_double[j]); } gnorm = gnorm+ae_sqr(g, _state); } gnorm = ae_sqrt(gnorm, _state); seterrorflag(&result, ae_fp_greater(gnorm,eps), _state); } } /* * Convex test: * * N dimensions * * random number (0..N) of random boundary constraints * * positive-definite A * * algorithm randomly choose dense or sparse A, and for * sparse matrix it randomly choose format. 
* * random B with normal entries * * initial point is random, feasible * * random origin (zero or non-zero) and scale (unit or * non-unit) are generated */ eps = 1.0E-5; for(n=1; n<=10; n++) { for(pass=1; pass<=10; pass++) { /* * Generate problem */ origintype = hqrnduniformi(&rs, 2, _state); scaletype = hqrnduniformi(&rs, 2, _state); isupper = ae_fp_less(hqrnduniformr(&rs, _state),0.5); issparse = ae_fp_less(hqrnduniformr(&rs, _state),0.5); spdmatrixrndcond(n, 1.0E3, &fulla, _state); ae_matrix_set_length(&halfa, n, n, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { if( (j>=i&&isupper)||(j<=i&&!isupper) ) { halfa.ptr.pp_double[i][j] = fulla.ptr.pp_double[i][j]; } else { halfa.ptr.pp_double[i][j] = hqrnduniformr(&rs, _state)-0.5; } } } testminqpunit_densetosparse(&halfa, n, &sa, _state); ae_vector_set_length(&b, n, _state); ae_vector_set_length(&bndl, n, _state); ae_vector_set_length(&bndu, n, _state); ae_vector_set_length(&x0, n, _state); ae_vector_set_length(&xori, n, _state); ae_vector_set_length(&s, n, _state); for(i=0; i<=n-1; i++) { b.ptr.p_double[i] = hqrndnormal(&rs, _state); bndl.ptr.p_double[i] = _state->v_neginf; bndu.ptr.p_double[i] = _state->v_posinf; x0.ptr.p_double[i] = hqrndnormal(&rs, _state); if( origintype==0 ) { xori.ptr.p_double[i] = (double)(0); } else { xori.ptr.p_double[i] = hqrndnormal(&rs, _state); } if( scaletype==0 ) { s.ptr.p_double[i] = (double)(1); } else { s.ptr.p_double[i] = ae_exp(hqrndnormal(&rs, _state), _state); } j = hqrnduniformi(&rs, 5, _state); if( j==0 ) { bndl.ptr.p_double[i] = (double)(0); x0.ptr.p_double[i] = ae_fabs(x0.ptr.p_double[i], _state); } if( j==1 ) { bndu.ptr.p_double[i] = (double)(0); x0.ptr.p_double[i] = -ae_fabs(x0.ptr.p_double[i], _state); } if( j==2 ) { bndl.ptr.p_double[i] = hqrndnormal(&rs, _state); bndu.ptr.p_double[i] = bndl.ptr.p_double[i]; x0.ptr.p_double[i] = bndl.ptr.p_double[i]; } if( j==3 ) { bndl.ptr.p_double[i] = -0.1; bndu.ptr.p_double[i] = 0.1; x0.ptr.p_double[i] = 0.2*hqrnduniformr(&rs, _state)-0.1; } } /* * Solve problem */ minqpcreate(n, &state, _state); minqpsetalgoquickqp(&state, 0.0, 0.0, 0.0, 0, ae_fp_greater(hqrnduniformr(&rs, _state),0.5), _state); minqpsetlinearterm(&state, &b, _state); if( issparse ) { minqpsetquadratictermsparse(&state, &sa, isupper, _state); } else { minqpsetquadraticterm(&state, &halfa, isupper, _state); } if( origintype!=0 ) { minqpsetorigin(&state, &xori, _state); } if( scaletype!=0 ) { minqpsetscale(&state, &s, _state); } minqpsetbc(&state, &bndl, &bndu, _state); minqpoptimize(&state, _state); minqpresults(&state, &x1, &rep, _state); seterrorflag(&result, rep.terminationtype<=0, _state); if( rep.terminationtype<=0 ) { ae_frame_leave(_state); return result; } /* * Test - calculate constrained gradient at solution, * check its norm. 
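 *
 * (Note on the optimality test implemented below: at a minimizer of a convex
 * box-constrained QP the gradient g[i] = b[i] + (A*(x-xori))[i] must vanish
 * on free components; on a component clamped at its lower bound g[i] may be
 * positive, and at its upper bound it may be negative, because the descent
 * direction would leave the box. The code therefore zeroes such components
 * before accumulating the norm, requires the remaining norm to be at most
 * eps, and also checks that the returned point is feasible.)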
*/ gnorm = 0.0; for(i=0; i<=n-1; i++) { g = b.ptr.p_double[i]; for(j=0; j<=n-1; j++) { g = g+fulla.ptr.pp_double[i][j]*(x1.ptr.p_double[j]-xori.ptr.p_double[j]); } if( ae_fp_eq(x1.ptr.p_double[i],bndl.ptr.p_double[i])&&ae_fp_greater(g,(double)(0)) ) { g = (double)(0); } if( ae_fp_eq(x1.ptr.p_double[i],bndu.ptr.p_double[i])&&ae_fp_less(g,(double)(0)) ) { g = (double)(0); } gnorm = gnorm+ae_sqr(g, _state); seterrorflag(&result, ae_fp_less(x1.ptr.p_double[i],bndl.ptr.p_double[i]), _state); seterrorflag(&result, ae_fp_greater(x1.ptr.p_double[i],bndu.ptr.p_double[i]), _state); } gnorm = ae_sqrt(gnorm, _state); seterrorflag(&result, ae_fp_greater(gnorm,eps), _state); } } /* * Strongly non-convex test: * * N dimensions, N>=2 * * box constraints, x[i] in [-1,+1] * * A = A0-0.5*I, where A0 is SPD with unit norm and smallest * singular value equal to 1.0E-3, I is identity matrix * * random B with normal entries * * initial point is random, feasible * * We perform two tests: * * unconstrained problem must be recognized as unbounded * * constrained problem can be successfully solved * * NOTE: it is important to have N>=2, because formula for A * can be applied only to matrix with at least two * singular values */ eps = 1.0E-5; for(n=2; n<=10; n++) { for(pass=1; pass<=10; pass++) { /* * Generate problem */ spdmatrixrndcond(n, 1.0E3, &fulla, _state); for(i=0; i<=n-1; i++) { fulla.ptr.pp_double[i][i] = fulla.ptr.pp_double[i][i]-0.5; } isupper = ae_fp_less(hqrnduniformr(&rs, _state),0.5); ae_matrix_set_length(&halfa, n, n, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { if( (j>=i&&isupper)||(j<=i&&!isupper) ) { halfa.ptr.pp_double[i][j] = fulla.ptr.pp_double[i][j]; } else { halfa.ptr.pp_double[i][j] = hqrnduniformr(&rs, _state)-0.5; } } } testminqpunit_densetosparse(&halfa, n, &sa, _state); ae_vector_set_length(&b, n, _state); ae_vector_set_length(&bndl, n, _state); ae_vector_set_length(&bndu, n, _state); ae_vector_set_length(&x0, n, _state); for(i=0; i<=n-1; i++) { b.ptr.p_double[i] = hqrndnormal(&rs, _state); bndl.ptr.p_double[i] = (double)(-1); bndu.ptr.p_double[i] = (double)(1); x0.ptr.p_double[i] = 2*hqrnduniformr(&rs, _state)-1; } /* * Solve problem: * * without constraints we expect failure * * with constraints algorithm must succeed */ minqpcreate(n, &state, _state); minqpsetalgoquickqp(&state, 0.0, 0.0, 0.0, 0, ae_fp_greater(hqrnduniformr(&rs, _state),0.5), _state); minqpsetlinearterm(&state, &b, _state); if( ae_fp_greater(hqrndnormal(&rs, _state),(double)(0)) ) { minqpsetquadraticterm(&state, &halfa, isupper, _state); } else { minqpsetquadratictermsparse(&state, &sa, isupper, _state); } minqpoptimize(&state, _state); minqpresults(&state, &x1, &rep, _state); seterrorflag(&result, rep.terminationtype!=-4, _state); minqpsetbc(&state, &bndl, &bndu, _state); minqpoptimize(&state, _state); minqpresults(&state, &x1, &rep, _state); seterrorflag(&result, rep.terminationtype<=0, _state); if( rep.terminationtype<=0 ) { ae_frame_leave(_state); return result; } /* * Test - calculate constrained gradient at solution, * check its norm. 
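 *
 * (Here A is indefinite, so zero projected gradient plus feasibility is only
 * a first-order stationarity check, not a certificate of global optimality;
 * it is the strongest cheap test available. Boundedness of the constrained
 * problem follows from compactness of the box [-1,+1]^N, which is why the
 * constrained run above must finish with TerminationType>0 while the
 * unconstrained run must be flagged as unbounded, TerminationType=-4.)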
*/ gnorm = 0.0; for(i=0; i<=n-1; i++) { v = ae_v_dotproduct(&fulla.ptr.pp_double[i][0], 1, &x1.ptr.p_double[0], 1, ae_v_len(0,n-1)); g = v+b.ptr.p_double[i]; if( ae_fp_eq(x1.ptr.p_double[i],bndl.ptr.p_double[i])&&ae_fp_greater(g,(double)(0)) ) { g = (double)(0); } if( ae_fp_eq(x1.ptr.p_double[i],bndu.ptr.p_double[i])&&ae_fp_less(g,(double)(0)) ) { g = (double)(0); } gnorm = gnorm+ae_sqr(g, _state); seterrorflag(&result, ae_fp_less(x1.ptr.p_double[i],bndl.ptr.p_double[i]), _state); seterrorflag(&result, ae_fp_greater(x1.ptr.p_double[i],bndu.ptr.p_double[i]), _state); } gnorm = ae_sqrt(gnorm, _state); seterrorflag(&result, ae_fp_greater(gnorm,eps), _state); } } /* * Basic semi-definite test: * * N dimensions, N>=2 * * box constraints, x[i] in [-1,+1] * [ 1 1 ... 1 1 ] * * A = [ ... ... ... ], with one (random) diagonal entry set to -1 * [ 1 1 ... 1 1 ] * * random B with normal entries * * initial point is random, feasible * * We perform two tests: * * unconstrained problem must be recognized as unbounded * * constrained problem must be recognized as bounded and * successfully solved * * Both problems require subtle programming when we work * with semidefinite QP. * * NOTE: unlike BLEIC-QP algorthm, QQP may detect unboundedness * of the problem when started from any x0, with any b. * BLEIC-based solver requires carefully chosen x0 and b * to find direction of zero curvature, but this solver * can find it from any point. */ ae_vector_set_length(&nlist, 12, _state); nlist.ptr.p_int[0] = 2; nlist.ptr.p_int[1] = 3; nlist.ptr.p_int[2] = 4; nlist.ptr.p_int[3] = 5; nlist.ptr.p_int[4] = 6; nlist.ptr.p_int[5] = 7; nlist.ptr.p_int[6] = 8; nlist.ptr.p_int[7] = 9; nlist.ptr.p_int[8] = 10; nlist.ptr.p_int[9] = 20; nlist.ptr.p_int[10] = 40; nlist.ptr.p_int[11] = 80; eps = 1.0E-5; for(nidx=0; nidx<=nlist.cnt-1; nidx++) { for(pass=1; pass<=10; pass++) { /* * Generate problem */ n = nlist.ptr.p_int[nidx]; ae_vector_set_length(&b, n, _state); ae_vector_set_length(&bndl, n, _state); ae_vector_set_length(&bndu, n, _state); ae_vector_set_length(&x0, n, _state); for(i=0; i<=n-1; i++) { do { b.ptr.p_double[i] = hqrndnormal(&rs, _state); } while(ae_fp_eq(b.ptr.p_double[i],(double)(0))); bndl.ptr.p_double[i] = (double)(-1); bndu.ptr.p_double[i] = (double)(1); x0.ptr.p_double[i] = 2*hqrnduniformr(&rs, _state)-1; } ae_matrix_set_length(&a, n, n, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { a.ptr.pp_double[i][j] = 1.0; } } j = hqrnduniformi(&rs, n, _state); a.ptr.pp_double[j][j] = -1.0; testminqpunit_densetosparse(&a, n, &sa, _state); /* * Solve problem: * * without constraints we expect failure * * with constraints algorithm must succeed */ minqpcreate(n, &state, _state); minqpsetalgoquickqp(&state, 0.0, 0.0, 0.0, 0, ae_fp_greater(hqrnduniformr(&rs, _state),0.5), _state); minqpsetlinearterm(&state, &b, _state); if( ae_fp_greater(hqrndnormal(&rs, _state),(double)(0)) ) { minqpsetquadraticterm(&state, &a, ae_true, _state); } else { minqpsetquadratictermsparse(&state, &sa, ae_true, _state); } minqpoptimize(&state, _state); minqpresults(&state, &x1, &rep, _state); seterrorflag(&result, rep.terminationtype!=-4, _state); minqpsetbc(&state, &bndl, &bndu, _state); minqpoptimize(&state, _state); minqpresults(&state, &x1, &rep, _state); seterrorflag(&result, rep.terminationtype<=0, _state); if( rep.terminationtype<=0 ) { ae_frame_leave(_state); return result; } /* * Test - calculate constrained gradient at solution, * check its norm. 
*/ gnorm = 0.0; for(i=0; i<=n-1; i++) { g = b.ptr.p_double[i]; for(j=0; j<=n-1; j++) { g = g+a.ptr.pp_double[i][j]*x1.ptr.p_double[j]; } if( ae_fp_eq(x1.ptr.p_double[i],bndl.ptr.p_double[i])&&ae_fp_greater(g,(double)(0)) ) { g = (double)(0); } if( ae_fp_eq(x1.ptr.p_double[i],bndu.ptr.p_double[i])&&ae_fp_less(g,(double)(0)) ) { g = (double)(0); } gnorm = gnorm+ae_sqr(g, _state); seterrorflag(&result, ae_fp_less(x1.ptr.p_double[i],bndl.ptr.p_double[i]), _state); seterrorflag(&result, ae_fp_greater(x1.ptr.p_double[i],bndu.ptr.p_double[i]), _state); } gnorm = ae_sqrt(gnorm, _state); seterrorflag(&result, ae_fp_greater(gnorm,eps), _state); } } /* * Linear (zero-quadratic) test: * * N dimensions, N>=1 * * box constraints, x[i] in [-1,+1] * * A = 0 * * random B with normal entries * * initial point is random, feasible * * We perform two tests: * * unconstrained problem must be recognized as unbounded * * constrained problem can be successfully solved * * NOTE: we may explicitly set zero A, or assume that by * default it is zero. During test we will try both * ways. */ eps = 1.0E-5; for(n=1; n<=10; n++) { for(pass=1; pass<=10; pass++) { /* * Generate problem */ ae_vector_set_length(&b, n, _state); ae_vector_set_length(&bndl, n, _state); ae_vector_set_length(&bndu, n, _state); ae_vector_set_length(&x0, n, _state); for(i=0; i<=n-1; i++) { do { b.ptr.p_double[i] = hqrndnormal(&rs, _state); } while(ae_fp_eq(b.ptr.p_double[i],(double)(0))); bndl.ptr.p_double[i] = (double)(-1); bndu.ptr.p_double[i] = (double)(1); x0.ptr.p_double[i] = 2*hqrnduniformr(&rs, _state)-1; } /* * Solve problem: * * without constraints we expect failure * * with constraints algorithm must succeed */ minqpcreate(n, &state, _state); minqpsetalgoquickqp(&state, 0.0, 0.0, 0.0, 0, ae_fp_greater(hqrnduniformr(&rs, _state),0.5), _state); minqpsetlinearterm(&state, &b, _state); if( ae_fp_greater(hqrndnormal(&rs, _state),(double)(0)) ) { ae_matrix_set_length(&a, n, n, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { a.ptr.pp_double[i][j] = (double)(0); } } minqpsetquadraticterm(&state, &a, ae_true, _state); } minqpoptimize(&state, _state); minqpresults(&state, &x1, &rep, _state); seterrorflag(&result, rep.terminationtype!=-4, _state); minqpsetbc(&state, &bndl, &bndu, _state); minqpoptimize(&state, _state); minqpresults(&state, &x1, &rep, _state); seterrorflag(&result, rep.terminationtype<=0, _state); if( rep.terminationtype<=0 ) { ae_frame_leave(_state); return result; } /* * Test - calculate constrained gradient at solution, * check its norm. */ for(i=0; i<=n-1; i++) { seterrorflag(&result, ae_fp_greater(b.ptr.p_double[i],(double)(0))&&ae_fp_greater(x1.ptr.p_double[i],bndl.ptr.p_double[i]), _state); seterrorflag(&result, ae_fp_less(b.ptr.p_double[i],(double)(0))&&ae_fp_less(x1.ptr.p_double[i],bndu.ptr.p_double[i]), _state); } } } /* * Test for Newton phase of QQP algorithm - we test that Newton * phase can find good solution within one step. 
In order to do * so we: * * solve convex QP problem (dense or sparse) * * with K<=N equality-only constraints ai=x=bi * * with number of outer iterations limited to just 1 * * and with CG phase turned off (we modify internal structures * of the QQP solver in order to make it) */ eps = 1.0E-5; for(pass=1; pass<=10; pass++) { /* * Generate problem */ n = 50+hqrnduniformi(&rs, 51, _state); spdmatrixrndcond(n, 1.0E3, &a, _state); testminqpunit_densetosparse(&a, n, &sa, _state); ae_vector_set_length(&b, n, _state); ae_vector_set_length(&bndl, n, _state); ae_vector_set_length(&bndu, n, _state); ae_vector_set_length(&x0, n, _state); for(i=0; i<=n-1; i++) { b.ptr.p_double[i] = hqrndnormal(&rs, _state); x0.ptr.p_double[i] = hqrndnormal(&rs, _state); if( ae_fp_greater(hqrndnormal(&rs, _state),(double)(0)) ) { bndl.ptr.p_double[i] = _state->v_neginf; bndu.ptr.p_double[i] = _state->v_posinf; } else { bndl.ptr.p_double[i] = hqrndnormal(&rs, _state); bndu.ptr.p_double[i] = bndl.ptr.p_double[i]; } } /* * Solve problem * * NOTE: we modify internal structures of QQP solver in order * to deactivate CG phase */ minqpcreate(n, &state, _state); minqpsetalgoquickqp(&state, 0.0, 0.0, 0.0, 1, ae_true, _state); state.qqpsettingsuser.cgphase = ae_false; minqpsetlinearterm(&state, &b, _state); if( ae_fp_greater(hqrndnormal(&rs, _state),(double)(0)) ) { minqpsetquadraticterm(&state, &a, ae_fp_greater(hqrndnormal(&rs, _state),(double)(0)), _state); } else { minqpsetquadratictermsparse(&state, &sa, ae_fp_greater(hqrndnormal(&rs, _state),(double)(0)), _state); } minqpsetbc(&state, &bndl, &bndu, _state); minqpoptimize(&state, _state); minqpresults(&state, &x1, &rep, _state); seterrorflag(&result, rep.terminationtype<=0, _state); if( rep.terminationtype<=0 ) { ae_frame_leave(_state); return result; } /* * Test - calculate constrained gradient at solution, * check its norm. */ gnorm = 0.0; for(i=0; i<=n-1; i++) { g = b.ptr.p_double[i]; for(j=0; j<=n-1; j++) { g = g+a.ptr.pp_double[i][j]*x1.ptr.p_double[j]; } if( ae_fp_eq(x1.ptr.p_double[i],bndl.ptr.p_double[i])&&ae_fp_greater(g,(double)(0)) ) { g = (double)(0); } if( ae_fp_eq(x1.ptr.p_double[i],bndu.ptr.p_double[i])&&ae_fp_less(g,(double)(0)) ) { g = (double)(0); } gnorm = gnorm+ae_sqr(g, _state); seterrorflag(&result, ae_fp_less(x1.ptr.p_double[i],bndl.ptr.p_double[i]), _state); seterrorflag(&result, ae_fp_greater(x1.ptr.p_double[i],bndu.ptr.p_double[i]), _state); } gnorm = ae_sqrt(gnorm, _state); seterrorflag(&result, ae_fp_greater(gnorm,eps), _state); } /* * Test for Newton phase of QQP algorithm - we test that Newton * updates work correctly, i.e. that CNewtonUpdate() internal * function correctly updates inverse Hessian matrix. * * To test it we: * * solve ill conditioned convex QP problem * * with unconstrained solution XZ whose components are within [-0.5,+0.5] * * with one inequality constraint X[k]>=5 * * with initial point such that: * * X0[i] = 100 for i<>k * * X0[k] = 5+1.0E-5 * * with number of outer iterations limited to just 1 * * and with CG phase turned off (we modify internal structures * of the QQP solver in order to make it) * * The idea is that single Newton step is not enough to find solution, * but with just one update we can move exactly to the solution. 
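*
* (Roughly, the construction works as follows: the first Newton step drives
*  x[k] onto its bound x[k]>=5, which changes the active set by exactly one
*  constraint; if CNewtonUpdate() correctly applies the corresponding
*  low-rank correction to the stored inverse Hessian, the very next step
*  solves the remaining reduced convex problem exactly, so one update is
*  enough to reach the solution.)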
* * We perform two tests: * * first one with State.QQP.NewtMaxIts set to 1, in order to * make sure that algorithm fails with just one iteration * * second one with State.QQP.NewtMaxIts set to 2, in order to * make sure that algorithm converges when it can perform update */ eps = 1.0E-5; for(pass=1; pass<=10; pass++) { /* * Generate problem */ n = 20+hqrnduniformi(&rs, 20, _state); spdmatrixrndcond(n, 1.0E5, &a, _state); testminqpunit_densetosparse(&a, n, &sa, _state); sparseconverttocrs(&sa, _state); ae_vector_set_length(&b, n, _state); ae_vector_set_length(&bndl, n, _state); ae_vector_set_length(&bndu, n, _state); ae_vector_set_length(&x0, n, _state); ae_vector_set_length(&xz, n, _state); for(i=0; i<=n-1; i++) { xz.ptr.p_double[i] = hqrnduniformr(&rs, _state)-0.5; x0.ptr.p_double[i] = (double)(100); bndl.ptr.p_double[i] = _state->v_neginf; bndu.ptr.p_double[i] = _state->v_posinf; } k = hqrnduniformi(&rs, n, _state); x0.ptr.p_double[k] = 5.00001; bndl.ptr.p_double[k] = 5.0; sparsemv(&sa, &xz, &b, _state); for(i=0; i<=n-1; i++) { b.ptr.p_double[i] = -b.ptr.p_double[i]; } /* * Create solver */ minqpcreate(n, &state, _state); minqpsetalgoquickqp(&state, 0.0, 0.0, 0.0, 1, ae_true, _state); minqpsetlinearterm(&state, &b, _state); minqpsetquadraticterm(&state, &a, ae_fp_greater(hqrndnormal(&rs, _state),(double)(0)), _state); minqpsetbc(&state, &bndl, &bndu, _state); minqpsetstartingpoint(&state, &x0, _state); /* * Solve problem. First time, with no Newton updates. * It must fail. * * NOTE: we modify internal structures of QQP solver in order * to deactivate CG phase and turn off Newton updates. */ state.qqpsettingsuser.cgphase = ae_false; state.qqpsettingsuser.cnphase = ae_true; state.qqpsettingsuser.cnmaxupdates = 0; minqpoptimize(&state, _state); minqpresults(&state, &x1, &rep, _state); seterrorflag(&result, rep.terminationtype<=0, _state); if( result ) { ae_frame_leave(_state); return result; } flag = ae_false; testminqpunit_testbcgradandfeasibility(&a, &b, &bndl, &bndu, n, &x1, eps, &flag, _state); seterrorflag(&result, !flag, _state); /* * Now with Newton updates - it must succeeed. */ state.qqpsettingsuser.cgphase = ae_false; state.qqpsettingsuser.cnmaxupdates = n; minqpoptimize(&state, _state); minqpresults(&state, &x1, &rep, _state); seterrorflag(&result, rep.terminationtype<=0, _state); if( result ) { ae_frame_leave(_state); return result; } flag = ae_false; testminqpunit_testbcgradandfeasibility(&a, &b, &bndl, &bndu, n, &x1, eps, &flag, _state); seterrorflag(&result, flag, _state); } /* * Check that problem with general constraints results in * correct error code (-5 should be returned). */ ae_matrix_set_length(&c, 1, 3, _state); ae_vector_set_length(&ct, 1, _state); c.ptr.pp_double[0][0] = 1.0; c.ptr.pp_double[0][1] = 1.0; c.ptr.pp_double[0][2] = 2.0; ct.ptr.p_int[0] = 0; minqpcreate(2, &state, _state); minqpsetalgoquickqp(&state, 0.0, 0.0, 0.0, 0, ae_fp_greater(hqrnduniformr(&rs, _state),0.5), _state); minqpsetlc(&state, &c, &ct, 1, _state); minqpoptimize(&state, _state); minqpresults(&state, &x1, &rep, _state); seterrorflag(&result, rep.terminationtype!=-5, _state); /* * Test sparse functionality. QQP solver must perform * same steps independently of matrix type (dense or sparse). * * We generate random unconstrained test problem and solve it * twice - first time we solve dense version, second time - * sparse version is solved. 
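*
* (Illustrative restatement of the two call paths compared below - not an
*  addition to the test itself. Apart from the storage format of A the two
*  runs are configured identically:
*
*      minqpsetquadraticterm(&state, &a, isupper, _state);          dense A
*      minqpsetquadratictermsparse(&state, &sa, isupper, _state);   sparse A
*
*  so any per-component difference between the two answers larger than eps
*  indicates that the dense and sparse code paths take different steps.)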
* * During this test we: * * use stringent stopping criteria (one outer iteration) * * turn off Newton phase of the algorithm to slow down * convergence */ eps = 1.0E-3; itscnt = 1; n = 20; isupper = ae_fp_greater(ae_randomreal(_state),0.5); spdmatrixrndcond(n, 1.0E3, &za, _state); sparsecreate(n, n, 0, &sa, _state); ae_matrix_set_length(&a, n, n, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { if( j>=i&&isupper ) { sparseset(&sa, i, j, za.ptr.pp_double[i][j], _state); a.ptr.pp_double[i][j] = za.ptr.pp_double[i][j]; } if( j<=i&&!isupper ) { sparseset(&sa, i, j, za.ptr.pp_double[i][j], _state); a.ptr.pp_double[i][j] = za.ptr.pp_double[i][j]; } } } ae_vector_set_length(&b, n, _state); ae_vector_set_length(&s, n, _state); for(i=0; i<=n-1; i++) { b.ptr.p_double[i] = randomnormal(_state); s.ptr.p_double[i] = ae_pow(10.0, randomnormal(_state)/10, _state); } minqpcreate(n, &state, _state); minqpsetalgoquickqp(&state, 0.0, 0.0, 0.0, itscnt, ae_false, _state); minqpsetscale(&state, &s, _state); minqpsetlinearterm(&state, &b, _state); minqpsetquadraticterm(&state, &a, isupper, _state); minqpoptimize(&state, _state); minqpresults(&state, &xend0, &rep, _state); minqpcreate(n, &state, _state); minqpsetalgoquickqp(&state, 0.0, 0.0, 0.0, itscnt, ae_false, _state); minqpsetscale(&state, &s, _state); minqpsetlinearterm(&state, &b, _state); minqpsetquadratictermsparse(&state, &sa, isupper, _state); minqpoptimize(&state, _state); minqpresults(&state, &xend1, &rep, _state); for(i=0; i<=n-1; i++) { seterrorflag(&result, ae_fp_greater(ae_fabs(xend0.ptr.p_double[i]-xend1.ptr.p_double[i], _state),eps), _state); } /* * Test scale-invariance. QQP performs same steps on scaled and * unscaled problems (assuming that scale of the variables is known). * * We generate random scale matrix S and random well-conditioned and * well scaled matrix A. Then we solve two problems: * * (1) f = 0.5*x'*A*x+b'*x * (identity scale matrix is used) * * and * * (2) f = 0.5*y'*(inv(S)*A*inv(S))*y + (inv(S)*b)'*y * (scale matrix S is used) * * Solution process is started from X=0, we perform ItsCnt=1 outer * iterations with Newton phase turned off (to slow down convergence; * we want to prevent algorithm from converging to exact solution which * is exactly same for both problems; the idea is to test that same * intermediate tests are taken). 
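*
* (Why S*x=y is the expected relation: substituting y=S*x into problem (2)
*  above gives
*
*      0.5*(S*x)'*inv(S)*A*inv(S)*(S*x) + (inv(S)*b)'*(S*x) = 0.5*x'*A*x + b'*x,
*
*  i.e. (2) is just (1) rewritten in the scaled variables y=S*x. A properly
*  scale-aware solver therefore produces iterates related by y_k=S*x_k, and
*  the deliberately premature stopping point must still satisfy S*x=y.)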
* * As result, we must get S*x=y */ eps = 1.0E-3; itscnt = 1; n = 100; ae_vector_set_length(&s, n, _state); for(i=0; i<=n-1; i++) { s.ptr.p_double[i] = ae_pow(10.0, randomnormal(_state)/10, _state); } spdmatrixrndcond(n, 1.0E3, &a, _state); ae_matrix_set_length(&za, n, n, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { za.ptr.pp_double[i][j] = a.ptr.pp_double[i][j]/(s.ptr.p_double[i]*s.ptr.p_double[j]); } } ae_vector_set_length(&b, n, _state); ae_vector_set_length(&zb, n, _state); for(i=0; i<=n-1; i++) { b.ptr.p_double[i] = randomnormal(_state); zb.ptr.p_double[i] = b.ptr.p_double[i]/s.ptr.p_double[i]; } minqpcreate(n, &state, _state); minqpsetalgoquickqp(&state, 0.0, 0.0, 0.0, itscnt, ae_false, _state); minqpsetlinearterm(&state, &b, _state); minqpsetquadraticterm(&state, &a, ae_true, _state); minqpoptimize(&state, _state); minqpresults(&state, &xend0, &rep, _state); minqpcreate(n, &state, _state); minqpsetalgoquickqp(&state, 0.0, 0.0, 0.0, itscnt, ae_false, _state); minqpsetlinearterm(&state, &zb, _state); minqpsetquadraticterm(&state, &za, ae_true, _state); minqpsetscale(&state, &s, _state); minqpoptimize(&state, _state); minqpresults(&state, &xend1, &rep, _state); for(i=0; i<=n-1; i++) { seterrorflag(&result, ae_fp_greater(ae_fabs(s.ptr.p_double[i]*xend0.ptr.p_double[i]-xend1.ptr.p_double[i], _state),eps), _state); } /* * Test that QQP can efficiently use sparse matrices (i.e. it is * not disguised version of some dense QP solver). In order to test * it we create very large and very sparse problem (diagonal matrix * with N=40.000) and perform 10 iterations of QQP solver. * * In case QP solver uses some form of dense linear algebra to solve * this problem, it will take TOO much time to solve it. And we will * notice it by EXTREME slowdown during testing. */ n = 40000; sparsecreate(n, n, 0, &sa, _state); for(i=0; i<=n-1; i++) { sparseset(&sa, i, i, ae_pow(10.0, -3*ae_randomreal(_state), _state), _state); } ae_vector_set_length(&b, n, _state); for(i=0; i<=n-1; i++) { b.ptr.p_double[i] = randomnormal(_state); } minqpcreate(n, &state, _state); minqpsetalgoquickqp(&state, 0.0, 0.0, 0.0, 10, ae_fp_greater(hqrnduniformr(&rs, _state),0.5), _state); minqpsetlinearterm(&state, &b, _state); minqpsetquadratictermsparse(&state, &sa, ae_true, _state); minqpoptimize(&state, _state); minqpresults(&state, &xend0, &rep, _state); ae_frame_leave(_state); return result; } /************************************************************************* This function performs tests specific for BLEIC solver Returns True on error, False on success. 
*************************************************************************/ ae_bool bleictests(ae_state *_state) { ae_frame _frame_block; minqpstate state; minqpreport rep; ae_vector nlist; ae_int_t nidx; ae_matrix a; ae_matrix za; ae_matrix c; ae_vector b; ae_vector zb; ae_vector bndl; ae_vector bndu; ae_vector s; ae_vector x; ae_vector ct; sparsematrix sa; ae_int_t n; ae_vector x0; ae_vector x1; hqrndstate rs; ae_int_t i; ae_int_t j; ae_int_t pass; ae_vector xend0; ae_vector xend1; double eps; double v; double g; double gnorm; ae_int_t itscnt; ae_bool isupper; ae_bool result; ae_frame_make(_state, &_frame_block); _minqpstate_init(&state, _state); _minqpreport_init(&rep, _state); ae_vector_init(&nlist, 0, DT_INT, _state); ae_matrix_init(&a, 0, 0, DT_REAL, _state); ae_matrix_init(&za, 0, 0, DT_REAL, _state); ae_matrix_init(&c, 0, 0, DT_REAL, _state); ae_vector_init(&b, 0, DT_REAL, _state); ae_vector_init(&zb, 0, DT_REAL, _state); ae_vector_init(&bndl, 0, DT_REAL, _state); ae_vector_init(&bndu, 0, DT_REAL, _state); ae_vector_init(&s, 0, DT_REAL, _state); ae_vector_init(&x, 0, DT_REAL, _state); ae_vector_init(&ct, 0, DT_INT, _state); _sparsematrix_init(&sa, _state); ae_vector_init(&x0, 0, DT_REAL, _state); ae_vector_init(&x1, 0, DT_REAL, _state); _hqrndstate_init(&rs, _state); ae_vector_init(&xend0, 0, DT_REAL, _state); ae_vector_init(&xend1, 0, DT_REAL, _state); result = ae_false; hqrndrandomize(&rs, _state); /* * Test sparse functionality. BLEIC-based solver must perform * same steps independently of matrix type (dense or sparse). * * We generate random unconstrained test problem and solve it * twice - first time we solve dense version, second time - * sparse version is solved. */ eps = 1.0E-3; itscnt = 5; n = 20; isupper = ae_fp_greater(ae_randomreal(_state),0.5); spdmatrixrndcond(n, 1.0E3, &za, _state); sparsecreate(n, n, 0, &sa, _state); ae_matrix_set_length(&a, n, n, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { if( j>=i&&isupper ) { sparseset(&sa, i, j, za.ptr.pp_double[i][j], _state); a.ptr.pp_double[i][j] = za.ptr.pp_double[i][j]; } if( j<=i&&!isupper ) { sparseset(&sa, i, j, za.ptr.pp_double[i][j], _state); a.ptr.pp_double[i][j] = za.ptr.pp_double[i][j]; } } } ae_vector_set_length(&b, n, _state); ae_vector_set_length(&s, n, _state); for(i=0; i<=n-1; i++) { b.ptr.p_double[i] = randomnormal(_state); s.ptr.p_double[i] = ae_pow(10.0, randomnormal(_state)/10, _state); } minqpcreate(n, &state, _state); minqpsetalgobleic(&state, 0.0, 0.0, 0.0, itscnt, _state); minqpsetlinearterm(&state, &b, _state); minqpsetquadraticterm(&state, &a, isupper, _state); minqpoptimize(&state, _state); minqpresults(&state, &xend0, &rep, _state); minqpcreate(n, &state, _state); minqpsetalgobleic(&state, 0.0, 0.0, 0.0, itscnt, _state); minqpsetlinearterm(&state, &b, _state); minqpsetquadratictermsparse(&state, &sa, isupper, _state); minqpoptimize(&state, _state); minqpresults(&state, &xend1, &rep, _state); for(i=0; i<=n-1; i++) { seterrorflag(&result, ae_fp_greater(ae_fabs(xend0.ptr.p_double[i]-xend1.ptr.p_double[i], _state),eps), _state); } /* * Test scale-invariance. BLEIC performs same steps on scaled and * unscaled problems (assuming that scale of the variables is known). * * We generate random scale matrix S and random well-conditioned and * well scaled matrix A. 
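*
* (In the code below S is diagonal, so the scaled problem described next is
*  built elementwise: za[i][j] = a[i][j]/(s[i]*s[j]) plays the role of
*  inv(S)*A*inv(S) and zb[i] = b[i]/s[i] plays the role of inv(S)*b.)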
Then we solve two problems: * * (1) f = 0.5*x'*A*x+b'*x * (identity scale matrix is used) * * and * * (2) f = 0.5*y'*(inv(S)*A*inv(S))*y + (inv(S)*b)'*y * (scale matrix S is used) * * Solution process is started from X=0, we perform ItsCnt=5 steps. * As result, we must get S*x=y */ eps = 1.0E-3; itscnt = 5; n = 20; ae_vector_set_length(&s, n, _state); for(i=0; i<=n-1; i++) { s.ptr.p_double[i] = ae_pow(10.0, randomnormal(_state)/10, _state); } spdmatrixrndcond(n, 1.0E3, &a, _state); ae_matrix_set_length(&za, n, n, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { za.ptr.pp_double[i][j] = a.ptr.pp_double[i][j]/(s.ptr.p_double[i]*s.ptr.p_double[j]); } } ae_vector_set_length(&b, n, _state); ae_vector_set_length(&zb, n, _state); for(i=0; i<=n-1; i++) { b.ptr.p_double[i] = randomnormal(_state); zb.ptr.p_double[i] = b.ptr.p_double[i]/s.ptr.p_double[i]; } minqpcreate(n, &state, _state); minqpsetalgobleic(&state, 0.0, 0.0, 0.0, itscnt, _state); minqpsetlinearterm(&state, &b, _state); minqpsetquadraticterm(&state, &a, ae_true, _state); minqpoptimize(&state, _state); minqpresults(&state, &xend0, &rep, _state); minqpcreate(n, &state, _state); minqpsetalgobleic(&state, 0.0, 0.0, 0.0, itscnt, _state); minqpsetlinearterm(&state, &zb, _state); minqpsetquadraticterm(&state, &za, ae_true, _state); minqpsetscale(&state, &s, _state); minqpoptimize(&state, _state); minqpresults(&state, &xend1, &rep, _state); for(i=0; i<=n-1; i++) { seterrorflag(&result, ae_fp_greater(ae_fabs(s.ptr.p_double[i]*xend0.ptr.p_double[i]-xend1.ptr.p_double[i], _state),eps), _state); } /* * Test that BLEIC can efficiently use sparse matrices (i.e. it is * not disguised version of some dense QP solver). In order to test * it we create very large and very sparse problem (diagonal matrix * with N=20.000) and perform 10 iterations of BLEIC-based QP solver. * * In case QP solver uses some form of dense linear algebra to solve * this problem, it will take TOO much time to solve it. And we will * notice it by EXTREME slowdown during testing. */ n = 20000; sparsecreate(n, n, 0, &sa, _state); for(i=0; i<=n-1; i++) { sparseset(&sa, i, i, ae_pow(10.0, -3*ae_randomreal(_state), _state), _state); } ae_vector_set_length(&b, n, _state); for(i=0; i<=n-1; i++) { b.ptr.p_double[i] = randomnormal(_state); } minqpcreate(n, &state, _state); minqpsetalgobleic(&state, 0.0, 0.0, 0.0, 10, _state); minqpsetlinearterm(&state, &b, _state); minqpsetquadratictermsparse(&state, &sa, ae_true, _state); minqpoptimize(&state, _state); minqpresults(&state, &xend0, &rep, _state); /* * Special semi-definite test: * * N dimensions, N>=2 (important!) * * box constraints, x[i] in [-1,+1] * [ 1 1 ... 1 1 ] * * A = [ ... ... ... ] * [ 1 1 ... 1 1 ] * * random B such that SUM(b[i])=0.0 (important!) * * initial point x0 is chosen in such way that SUM(x[i])=0.0 * (important!) * * We perform two tests: * * unconstrained problem must be recognized as unbounded * (when starting from x0!) * * constrained problem must be recognized as bounded * and successfully solved * * Both problems require subtle programming when we work * with semidefinite QP. * * NOTE: it is very important to have N>=2 (otherwise problem * will be bounded from below even without boundary * constraints) and to have x0/b0 such that sum of * components is zero (such x0 is exact minimum of x'*A*x, * which allows algorithm to find direction of zero curvature * at the very first step). 
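*
* (Since every entry of A equals 1, x'*A*x = (sum_i x[i])^2. Any x0 with
*  sum_i x0[i] = 0 therefore already minimizes the quadratic term, and every
*  direction e with sum_i e[i] = 0 has exactly zero curvature; because
*  sum_i b[i] = 0 as well, some such e also decreases the linear term, which
*  is what lets the solver recognize the unconstrained problem as unbounded
*  from this particular starting point.)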
If x0/b are chosen in other way, * algorithm may be unable to find direction of zero * curvature and will cycle forever, slowly decreasing * function value at each iteration. * This is major difference from similar test for QQP solver - * QQP can find direction of zero curvature from almost any * point due to internal CG solver which favors such directions. * BLEIC uses LBFGS, which is less able to find direction of * zero curvature. */ ae_vector_set_length(&nlist, 12, _state); nlist.ptr.p_int[0] = 2; nlist.ptr.p_int[1] = 3; nlist.ptr.p_int[2] = 4; nlist.ptr.p_int[3] = 5; nlist.ptr.p_int[4] = 6; nlist.ptr.p_int[5] = 7; nlist.ptr.p_int[6] = 8; nlist.ptr.p_int[7] = 9; nlist.ptr.p_int[8] = 10; nlist.ptr.p_int[9] = 20; nlist.ptr.p_int[10] = 40; nlist.ptr.p_int[11] = 80; eps = 1.0E-5; for(nidx=0; nidx<=nlist.cnt-1; nidx++) { for(pass=1; pass<=10; pass++) { /* * Generate problem */ n = nlist.ptr.p_int[nidx]; ae_vector_set_length(&b, n, _state); ae_vector_set_length(&bndl, n, _state); ae_vector_set_length(&bndu, n, _state); ae_vector_set_length(&x0, n, _state); for(i=0; i<=n-1; i++) { do { b.ptr.p_double[i] = hqrndnormal(&rs, _state); } while(ae_fp_eq(b.ptr.p_double[i],(double)(0))); bndl.ptr.p_double[i] = (double)(-1); bndu.ptr.p_double[i] = (double)(1); x0.ptr.p_double[i] = 2*hqrnduniformr(&rs, _state)-1; } v = 0.0; for(i=0; i<=n-1; i++) { v = v+x0.ptr.p_double[i]; } for(i=0; i<=n-1; i++) { x0.ptr.p_double[i] = x0.ptr.p_double[i]-v/n; } v = 0.0; for(i=0; i<=n-1; i++) { v = v+b.ptr.p_double[i]; } for(i=0; i<=n-1; i++) { b.ptr.p_double[i] = b.ptr.p_double[i]-v/n; } ae_matrix_set_length(&a, n, n, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { a.ptr.pp_double[i][j] = 1.0; } } testminqpunit_densetosparse(&a, n, &sa, _state); /* * Solve problem: * * without constraints we expect failure * * with constraints algorithm must succeed */ minqpcreate(n, &state, _state); minqpsetalgobleic(&state, 0.0, 0.0, 0.0, 0, _state); minqpsetlinearterm(&state, &b, _state); minqpsetstartingpoint(&state, &x0, _state); if( ae_fp_greater(hqrndnormal(&rs, _state),(double)(0)) ) { minqpsetquadraticterm(&state, &a, ae_true, _state); } else { minqpsetquadratictermsparse(&state, &sa, ae_true, _state); } minqpoptimize(&state, _state); minqpresults(&state, &x1, &rep, _state); seterrorflag(&result, rep.terminationtype!=-4, _state); minqpsetbc(&state, &bndl, &bndu, _state); minqpoptimize(&state, _state); minqpresults(&state, &x1, &rep, _state); seterrorflag(&result, rep.terminationtype<=0, _state); if( rep.terminationtype<=0 ) { ae_frame_leave(_state); return result; } /* * Test - calculate constrained gradient at solution, * check its norm. */ gnorm = 0.0; for(i=0; i<=n-1; i++) { g = b.ptr.p_double[i]; for(j=0; j<=n-1; j++) { g = g+a.ptr.pp_double[i][j]*x1.ptr.p_double[j]; } if( ae_fp_eq(x1.ptr.p_double[i],bndl.ptr.p_double[i])&&ae_fp_greater(g,(double)(0)) ) { g = (double)(0); } if( ae_fp_eq(x1.ptr.p_double[i],bndu.ptr.p_double[i])&&ae_fp_less(g,(double)(0)) ) { g = (double)(0); } gnorm = gnorm+ae_sqr(g, _state); seterrorflag(&result, ae_fp_less(x1.ptr.p_double[i],bndl.ptr.p_double[i]), _state); seterrorflag(&result, ae_fp_greater(x1.ptr.p_double[i],bndu.ptr.p_double[i]), _state); } gnorm = ae_sqrt(gnorm, _state); seterrorflag(&result, ae_fp_greater(gnorm,eps), _state); } } /* * Test that BLEIC-based QP solver can solve non-convex problems * which are bounded from below on the feasible set: * * min -||x||^2 s.t. 
x[i] in [-1,+1] * * We also test ability of the solver to detect unbounded problems * (we remove one of the constraints and repeat solution process). */ n = 20; eps = 1.0E-14; sparsecreate(n, n, 0, &sa, _state); for(i=0; i<=n-1; i++) { sparseset(&sa, i, i, (double)(-1), _state); } ae_vector_set_length(&bndl, n, _state); ae_vector_set_length(&bndu, n, _state); ae_vector_set_length(&x, n, _state); for(i=0; i<=n-1; i++) { bndl.ptr.p_double[i] = (double)(-1); bndu.ptr.p_double[i] = (double)(1); x.ptr.p_double[i] = ae_randomreal(_state)-0.5; } minqpcreate(n, &state, _state); minqpsetalgobleic(&state, eps, 0.0, 0.0, 0, _state); minqpsetquadratictermsparse(&state, &sa, ae_true, _state); minqpsetbc(&state, &bndl, &bndu, _state); minqpsetstartingpoint(&state, &x, _state); minqpoptimize(&state, _state); minqpresults(&state, &xend0, &rep, _state); seterrorflag(&result, rep.terminationtype<=0, _state); if( rep.terminationtype>0 ) { for(i=0; i<=n-1; i++) { seterrorflag(&result, ae_fp_neq(xend0.ptr.p_double[i],(double)(-1))&&ae_fp_neq(xend0.ptr.p_double[i],(double)(1)), _state); } } i = ae_randominteger(n, _state); bndl.ptr.p_double[i] = _state->v_neginf; bndu.ptr.p_double[i] = _state->v_posinf; minqpsetbc(&state, &bndl, &bndu, _state); minqpoptimize(&state, _state); minqpresults(&state, &xend0, &rep, _state); seterrorflag(&result, rep.terminationtype!=-4, _state); /* * Test that BLEIC-based QP solver can solve non-convex problems * which are bounded from below on the feasible set: * * min -||x||^2 s.t. x[i] in [-1,+1], * with inequality constraints handled as general linear ones * * We also test ability of the solver to detect unbounded problems * (we remove last pair of constraints and try to solve modified * problem). */ n = 20; eps = 1.0E-14; sparsecreate(n, n, 0, &sa, _state); for(i=0; i<=n-1; i++) { sparseset(&sa, i, i, (double)(-1), _state); } ae_matrix_set_length(&c, 2*n, n+1, _state); ae_vector_set_length(&ct, 2*n, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n; j++) { c.ptr.pp_double[2*i+0][j] = (double)(0); c.ptr.pp_double[2*i+1][j] = (double)(0); } c.ptr.pp_double[2*i+0][i] = 1.0; c.ptr.pp_double[2*i+0][n] = 1.0; ct.ptr.p_int[2*i+0] = -1; c.ptr.pp_double[2*i+1][i] = 1.0; c.ptr.pp_double[2*i+1][n] = -1.0; ct.ptr.p_int[2*i+1] = 1; } ae_vector_set_length(&x, n, _state); for(i=0; i<=n-1; i++) { x.ptr.p_double[i] = ae_randomreal(_state)-0.5; } minqpcreate(n, &state, _state); minqpsetalgobleic(&state, eps, 0.0, 0.0, 0, _state); minqpsetquadratictermsparse(&state, &sa, ae_true, _state); minqpsetlc(&state, &c, &ct, 2*n, _state); minqpsetstartingpoint(&state, &x, _state); minqpoptimize(&state, _state); minqpresults(&state, &xend0, &rep, _state); seterrorflag(&result, rep.terminationtype<=0, _state); if( rep.terminationtype>0 ) { for(i=0; i<=n-1; i++) { seterrorflag(&result, ae_fp_greater(ae_fabs(xend0.ptr.p_double[i]+1, _state),100*ae_machineepsilon)&&ae_fp_greater(ae_fabs(xend0.ptr.p_double[i]-1, _state),100*ae_machineepsilon), _state); } } minqpsetlc(&state, &c, &ct, 2*(n-1), _state); minqpoptimize(&state, _state); minqpresults(&state, &xend0, &rep, _state); seterrorflag(&result, rep.terminationtype!=-4, _state); /* * Test that BLEIC-based QP solver can solve QP problems with * zero quadratic term: * * min b'*x s.t. x[i] in [-1,+1] * * It means that QP solver can be used as linear programming solver * (altough performance of such solver is worse than that of specialized * LP solver). 
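*
* (For this LP the solution is known in closed form: minimizing b'*x over
*  the box [-1,+1]^N is separable, so xend[i] must equal bndl[i] = -1
*  whenever b[i] > 0 and bndu[i] = +1 whenever b[i] < 0, which is exactly
*  what the checks below verify.)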
* * NOTE: we perform this test twice - first time without explicitly setting * quadratic term (we test that default quadratic term is zero), and * second time - with explicitly set quadratic term. */ n = 20; sparsecreate(n, n, 0, &sa, _state); ae_vector_set_length(&bndl, n, _state); ae_vector_set_length(&bndu, n, _state); ae_vector_set_length(&b, n, _state); for(i=0; i<=n-1; i++) { bndl.ptr.p_double[i] = (double)(-1); bndu.ptr.p_double[i] = (double)(1); b.ptr.p_double[i] = randomnormal(_state); } minqpcreate(n, &state, _state); minqpsetalgobleic(&state, eps, 0.0, 0.0, 0, _state); minqpsetlinearterm(&state, &b, _state); minqpsetbc(&state, &bndl, &bndu, _state); minqpoptimize(&state, _state); minqpresults(&state, &xend0, &rep, _state); seterrorflag(&result, rep.terminationtype<=0, _state); if( rep.terminationtype>0 ) { for(i=0; i<=n-1; i++) { seterrorflag(&result, ae_fp_greater(b.ptr.p_double[i],(double)(0))&&ae_fp_neq(xend0.ptr.p_double[i],bndl.ptr.p_double[i]), _state); seterrorflag(&result, ae_fp_less(b.ptr.p_double[i],(double)(0))&&ae_fp_neq(xend0.ptr.p_double[i],bndu.ptr.p_double[i]), _state); } } minqpcreate(n, &state, _state); minqpsetalgobleic(&state, eps, 0.0, 0.0, 0, _state); minqpsetlinearterm(&state, &b, _state); minqpsetbc(&state, &bndl, &bndu, _state); minqpsetquadratictermsparse(&state, &sa, ae_true, _state); minqpoptimize(&state, _state); minqpresults(&state, &xend0, &rep, _state); seterrorflag(&result, rep.terminationtype<=0, _state); if( rep.terminationtype>0 ) { for(i=0; i<=n-1; i++) { seterrorflag(&result, ae_fp_greater(b.ptr.p_double[i],(double)(0))&&ae_fp_neq(xend0.ptr.p_double[i],bndl.ptr.p_double[i]), _state); seterrorflag(&result, ae_fp_less(b.ptr.p_double[i],(double)(0))&&ae_fp_neq(xend0.ptr.p_double[i],bndu.ptr.p_double[i]), _state); } } /* * Test specific problem sent by V.Semenenko, which resulted in * the initinite loop in FindFeasiblePoint (before fix). We do * not test results returned by solver - simply being able to * stop is enough for this test. * * NOTE: it is important that modifications to problem are applied * sequentially. Test fails after 100-5000 such modifications. * One modification is not enough to cause failure. 
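*
* (Concretely, each of the 10000 passes below perturbs every entry of b and
*  the right-hand sides of the first six linear constraints by +/-0.1 and
*  re-solves the problem from the same starting point. No checks are made
*  on the returned points; merely reaching the end of the loop, i.e.
*  FindFeasiblePoint terminating on every perturbed instance, is the pass
*  criterion.)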
*/ n = 3; ae_matrix_set_length(&a, n, n, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { a.ptr.pp_double[i][j] = 0.0; } } a.ptr.pp_double[0][0] = 1.222990; a.ptr.pp_double[1][1] = 1.934900; a.ptr.pp_double[2][2] = 0.603924; ae_vector_set_length(&b, n, _state); b.ptr.p_double[0] = -4.97245; b.ptr.p_double[1] = -9.09039; b.ptr.p_double[2] = -4.63856; ae_matrix_set_length(&c, 8, n+1, _state); for(i=0; i<=c.rows-1; i++) { for(j=0; j<=n-1; j++) { c.ptr.pp_double[i][j] = 0.0; } } c.ptr.pp_double[0][0] = (double)(1); c.ptr.pp_double[0][n] = 4.94298; c.ptr.pp_double[1][0] = (double)(1); c.ptr.pp_double[1][n] = 4.79981; c.ptr.pp_double[2][1] = (double)(1); c.ptr.pp_double[2][n] = -0.4848; c.ptr.pp_double[3][1] = (double)(1); c.ptr.pp_double[3][n] = -0.73804; c.ptr.pp_double[4][2] = (double)(1); c.ptr.pp_double[4][n] = 0.575729; c.ptr.pp_double[5][2] = (double)(1); c.ptr.pp_double[5][n] = 0.458645; c.ptr.pp_double[6][0] = (double)(1); c.ptr.pp_double[6][2] = (double)(-1); c.ptr.pp_double[6][n] = -0.0546574; c.ptr.pp_double[7][0] = (double)(1); c.ptr.pp_double[7][2] = (double)(-1); c.ptr.pp_double[7][n] = -0.5900440; ae_vector_set_length(&ct, 8, _state); ct.ptr.p_int[0] = -1; ct.ptr.p_int[1] = 1; ct.ptr.p_int[2] = -1; ct.ptr.p_int[3] = 1; ct.ptr.p_int[4] = -1; ct.ptr.p_int[5] = 1; ct.ptr.p_int[6] = -1; ct.ptr.p_int[7] = 1; ae_vector_set_length(&s, n, _state); s.ptr.p_double[0] = 0.143171; s.ptr.p_double[1] = 0.253240; s.ptr.p_double[2] = 0.117084; ae_vector_set_length(&x0, n, _state); x0.ptr.p_double[0] = 3.51126; x0.ptr.p_double[1] = 4.05731; x0.ptr.p_double[2] = 6.63307; for(pass=1; pass<=10000; pass++) { /* * Apply random distortion */ for(j=0; j<=n-1; j++) { b.ptr.p_double[j] = b.ptr.p_double[j]+(2*hqrnduniformi(&rs, 2, _state)-1)*0.1; } for(j=0; j<=6-1; j++) { c.ptr.pp_double[j][n] = c.ptr.pp_double[j][n]+(2*hqrnduniformi(&rs, 2, _state)-1)*0.1; } /* * Solve */ minqpcreate(3, &state, _state); minqpsetquadraticterm(&state, &a, ae_true, _state); minqpsetlinearterm(&state, &b, _state); minqpsetlc(&state, &c, &ct, 8, _state); minqpsetalgobleic(&state, 0.0, 0.0, 0.0, 0, _state); minqpsetstartingpoint(&state, &x0, _state); minqpoptimize(&state, _state); minqpresults(&state, &x1, &rep, _state); } ae_frame_leave(_state); return result; } /************************************************************************* This function tests bound constrained quadratic programming algorithm. On failure sets error flag. 
*************************************************************************/ static void testminqpunit_bcqptest(ae_bool* wereerrors, ae_state *_state) { ae_frame _frame_block; minqpstate state; minqpreport rep; ae_int_t n; ae_int_t pass; ae_int_t i; ae_int_t j; double v; double g; double gnorm; ae_int_t origintype; ae_int_t scaletype; ae_bool isupper; ae_bool issparse; double bctol; double lctol; ae_matrix a; ae_matrix fulla; ae_matrix halfa; ae_matrix c; sparsematrix sa; ae_vector ct; ae_vector b; ae_vector bndl; ae_vector bndu; ae_vector x0; ae_vector x1; ae_vector xori; ae_vector xz; ae_vector s; double eps; hqrndstate rs; ae_frame_make(_state, &_frame_block); _minqpstate_init(&state, _state); _minqpreport_init(&rep, _state); ae_matrix_init(&a, 0, 0, DT_REAL, _state); ae_matrix_init(&fulla, 0, 0, DT_REAL, _state); ae_matrix_init(&halfa, 0, 0, DT_REAL, _state); ae_matrix_init(&c, 0, 0, DT_REAL, _state); _sparsematrix_init(&sa, _state); ae_vector_init(&ct, 0, DT_INT, _state); ae_vector_init(&b, 0, DT_REAL, _state); ae_vector_init(&bndl, 0, DT_REAL, _state); ae_vector_init(&bndu, 0, DT_REAL, _state); ae_vector_init(&x0, 0, DT_REAL, _state); ae_vector_init(&x1, 0, DT_REAL, _state); ae_vector_init(&xori, 0, DT_REAL, _state); ae_vector_init(&xz, 0, DT_REAL, _state); ae_vector_init(&s, 0, DT_REAL, _state); _hqrndstate_init(&rs, _state); hqrndrandomize(&rs, _state); /* * Convex test: * * N dimensions * * random number (0..N) of random boundary constraints * * positive-definite A * * algorithm randomly choose dense or sparse A, and for * sparse matrix it randomly choose format. * * random B with normal entries * * initial point is random, feasible * * random origin (zero or non-zero) and scale (unit or * non-unit) are generated */ eps = 1.0E-4; for(n=1; n<=10; n++) { for(pass=1; pass<=10; pass++) { /* * Generate problem */ origintype = hqrnduniformi(&rs, 2, _state); scaletype = hqrnduniformi(&rs, 2, _state); isupper = ae_fp_less(hqrnduniformr(&rs, _state),0.5); issparse = ae_fp_less(hqrnduniformr(&rs, _state),0.5); spdmatrixrndcond(n, 1.0E3, &fulla, _state); ae_matrix_set_length(&halfa, n, n, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { if( (j>=i&&isupper)||(j<=i&&!isupper) ) { halfa.ptr.pp_double[i][j] = fulla.ptr.pp_double[i][j]; } else { halfa.ptr.pp_double[i][j] = hqrnduniformr(&rs, _state)-0.5; } } } testminqpunit_densetosparse(&halfa, n, &sa, _state); ae_vector_set_length(&b, n, _state); ae_vector_set_length(&bndl, n, _state); ae_vector_set_length(&bndu, n, _state); ae_vector_set_length(&x0, n, _state); ae_vector_set_length(&xori, n, _state); ae_vector_set_length(&s, n, _state); for(i=0; i<=n-1; i++) { b.ptr.p_double[i] = hqrndnormal(&rs, _state); bndl.ptr.p_double[i] = _state->v_neginf; bndu.ptr.p_double[i] = _state->v_posinf; x0.ptr.p_double[i] = hqrndnormal(&rs, _state); if( origintype==0 ) { xori.ptr.p_double[i] = (double)(0); } else { xori.ptr.p_double[i] = hqrndnormal(&rs, _state); } if( scaletype==0 ) { s.ptr.p_double[i] = (double)(1); } else { s.ptr.p_double[i] = ae_exp(hqrndnormal(&rs, _state), _state); } j = hqrnduniformi(&rs, 5, _state); if( j==0 ) { bndl.ptr.p_double[i] = (double)(0); x0.ptr.p_double[i] = ae_fabs(x0.ptr.p_double[i], _state); } if( j==1 ) { bndu.ptr.p_double[i] = (double)(0); x0.ptr.p_double[i] = -ae_fabs(x0.ptr.p_double[i], _state); } if( j==2 ) { bndl.ptr.p_double[i] = hqrndnormal(&rs, _state); bndu.ptr.p_double[i] = bndl.ptr.p_double[i]; x0.ptr.p_double[i] = bndl.ptr.p_double[i]; } if( j==3 ) { bndl.ptr.p_double[i] = -0.1; bndu.ptr.p_double[i] = 
0.1; x0.ptr.p_double[i] = 0.2*hqrnduniformr(&rs, _state)-0.1; } } /* * Solve problem */ minqpcreate(n, &state, _state); testminqpunit_setrandomalgoallmodern(&state, &bctol, &lctol, _state); minqpsetlinearterm(&state, &b, _state); minqpsetstartingpoint(&state, &x0, _state); if( issparse ) { minqpsetquadratictermsparse(&state, &sa, isupper, _state); } else { minqpsetquadraticterm(&state, &halfa, isupper, _state); } if( origintype!=0 ) { minqpsetorigin(&state, &xori, _state); } if( scaletype!=0 ) { minqpsetscale(&state, &s, _state); } minqpsetbc(&state, &bndl, &bndu, _state); minqpoptimize(&state, _state); minqpresults(&state, &x1, &rep, _state); seterrorflag(wereerrors, rep.terminationtype<=0, _state); if( rep.terminationtype<=0 ) { ae_frame_leave(_state); return; } /* * Test - calculate constrained gradient at solution, * check its norm. */ gnorm = 0.0; for(i=0; i<=n-1; i++) { g = b.ptr.p_double[i]; for(j=0; j<=n-1; j++) { g = g+fulla.ptr.pp_double[i][j]*(x1.ptr.p_double[j]-xori.ptr.p_double[j]); } if( ae_fp_less_eq(x1.ptr.p_double[i],bndl.ptr.p_double[i]+bctol)&&ae_fp_greater(g,(double)(0)) ) { g = (double)(0); } if( ae_fp_greater_eq(x1.ptr.p_double[i],bndu.ptr.p_double[i]-bctol)&&ae_fp_less(g,(double)(0)) ) { g = (double)(0); } gnorm = gnorm+ae_sqr(g, _state); seterrorflag(wereerrors, ae_fp_less(x1.ptr.p_double[i],bndl.ptr.p_double[i]), _state); seterrorflag(wereerrors, ae_fp_greater(x1.ptr.p_double[i],bndu.ptr.p_double[i]), _state); } gnorm = ae_sqrt(gnorm, _state); seterrorflag(wereerrors, ae_fp_greater(gnorm,eps), _state); } } /* * Semidefinite test: * * N dimensions * * nonnegativity constraints * * A = [ 1 1 ... 1 1 ; 1 1 ... 1 1 ; .... ; 1 1 ... 1 1 ] * * algorithm randomly choose dense or sparse A, and for * sparse matrix it randomly choose format. * * random B with normal entries * * initial point is random, feasible */ eps = 1.0E-4; for(n=1; n<=10; n++) { for(pass=1; pass<=10; pass++) { /* * Generate problem */ isupper = ae_fp_less(hqrnduniformr(&rs, _state),0.5); issparse = ae_fp_less(hqrnduniformr(&rs, _state),0.5); ae_matrix_set_length(&fulla, n, n, _state); ae_matrix_set_length(&halfa, n, n, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { fulla.ptr.pp_double[i][j] = 1.0; if( (j>=i&&isupper)||(j<=i&&!isupper) ) { halfa.ptr.pp_double[i][j] = fulla.ptr.pp_double[i][j]; } else { halfa.ptr.pp_double[i][j] = hqrnduniformr(&rs, _state)-0.5; } } } testminqpunit_densetosparse(&halfa, n, &sa, _state); ae_vector_set_length(&b, n, _state); ae_vector_set_length(&bndl, n, _state); ae_vector_set_length(&bndu, n, _state); ae_vector_set_length(&x0, n, _state); for(i=0; i<=n-1; i++) { b.ptr.p_double[i] = hqrndnormal(&rs, _state); bndl.ptr.p_double[i] = 0.0; bndu.ptr.p_double[i] = _state->v_posinf; x0.ptr.p_double[i] = (double)(hqrnduniformi(&rs, 2, _state)); } /* * Solve problem */ minqpcreate(n, &state, _state); testminqpunit_setrandomalgosemidefinite(&state, _state); minqpsetstartingpoint(&state, &x0, _state); minqpsetlinearterm(&state, &b, _state); if( issparse ) { minqpsetquadratictermsparse(&state, &sa, isupper, _state); } else { minqpsetquadraticterm(&state, &halfa, isupper, _state); } minqpsetbc(&state, &bndl, &bndu, _state); minqpoptimize(&state, _state); minqpresults(&state, &x1, &rep, _state); seterrorflag(wereerrors, rep.terminationtype<=0, _state); if( rep.terminationtype<=0 ) { ae_frame_leave(_state); return; } /* * Test - calculate constrained gradient at solution, * check its norm. 
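*
* Unlike the exact-activity checks used earlier, the test below treats a
* bound as active whenever x1[i] lies within BCTol of it (x1[i]<=bndl[i]+bctol
* or x1[i]>=bndu[i]-bctol). BCTol is the bound-activity tolerance returned by
* testminqpunit_setrandomalgoallmodern() above, presumably because some of the
* randomly selected modern solvers satisfy activity conditions only to within
* a small tolerance.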
*/ gnorm = 0.0; for(i=0; i<=n-1; i++) { g = b.ptr.p_double[i]; for(j=0; j<=n-1; j++) { g = g+fulla.ptr.pp_double[i][j]*x1.ptr.p_double[j]; } if( ae_fp_eq(x1.ptr.p_double[i],bndl.ptr.p_double[i])&&ae_fp_greater(g,(double)(0)) ) { g = (double)(0); } if( ae_fp_eq(x1.ptr.p_double[i],bndu.ptr.p_double[i])&&ae_fp_less(g,(double)(0)) ) { g = (double)(0); } gnorm = gnorm+ae_sqr(g, _state); seterrorflag(wereerrors, ae_fp_less(x1.ptr.p_double[i],bndl.ptr.p_double[i]), _state); seterrorflag(wereerrors, ae_fp_greater(x1.ptr.p_double[i],bndu.ptr.p_double[i]), _state); } gnorm = ae_sqrt(gnorm, _state); seterrorflag(wereerrors, ae_fp_greater(gnorm,eps), _state); } } /* * Non-convex test: * * N dimensions, N>=2 * * box constraints, x[i] in [-1,+1] * * A = A0-0.5*I, where A0 is SPD with unit norm and smallest * singular value equal to 1.0E-3, I is identity matrix * * random B with normal entries * * initial point is random, feasible * * We perform two tests: * * unconstrained problem must be recognized as unbounded * * constrained problem can be successfully solved * * NOTE: it is important to have N>=2, because formula for A * can be applied only to matrix with at least two * singular values */ eps = 1.0E-4; for(n=2; n<=10; n++) { for(pass=1; pass<=10; pass++) { /* * Generate problem */ spdmatrixrndcond(n, 1.0E3, &fulla, _state); for(i=0; i<=n-1; i++) { fulla.ptr.pp_double[i][i] = fulla.ptr.pp_double[i][i]-0.5; } isupper = ae_fp_less(hqrnduniformr(&rs, _state),0.5); ae_matrix_set_length(&halfa, n, n, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { if( (j>=i&&isupper)||(j<=i&&!isupper) ) { halfa.ptr.pp_double[i][j] = fulla.ptr.pp_double[i][j]; } else { halfa.ptr.pp_double[i][j] = hqrnduniformr(&rs, _state)-0.5; } } } testminqpunit_densetosparse(&halfa, n, &sa, _state); ae_vector_set_length(&b, n, _state); ae_vector_set_length(&bndl, n, _state); ae_vector_set_length(&bndu, n, _state); ae_vector_set_length(&x0, n, _state); for(i=0; i<=n-1; i++) { b.ptr.p_double[i] = hqrndnormal(&rs, _state); bndl.ptr.p_double[i] = (double)(-1); bndu.ptr.p_double[i] = (double)(1); x0.ptr.p_double[i] = 2*hqrnduniformr(&rs, _state)-1; } /* * Solve problem: * * without constraints we expect failure * * with constraints algorithm must succeed */ minqpcreate(n, &state, _state); testminqpunit_setrandomalgononconvex(&state, _state); minqpsetstartingpoint(&state, &x0, _state); minqpsetlinearterm(&state, &b, _state); if( ae_fp_greater(hqrndnormal(&rs, _state),(double)(0)) ) { minqpsetquadraticterm(&state, &halfa, isupper, _state); } else { minqpsetquadratictermsparse(&state, &sa, isupper, _state); } minqpoptimize(&state, _state); minqpresults(&state, &x1, &rep, _state); seterrorflag(wereerrors, rep.terminationtype!=-4, _state); minqpsetbc(&state, &bndl, &bndu, _state); minqpoptimize(&state, _state); minqpresults(&state, &x1, &rep, _state); seterrorflag(wereerrors, rep.terminationtype<=0, _state); if( rep.terminationtype<=0 ) { ae_frame_leave(_state); return; } /* * Test - calculate constrained gradient at solution, * check its norm. 
*/ gnorm = 0.0; for(i=0; i<=n-1; i++) { v = ae_v_dotproduct(&fulla.ptr.pp_double[i][0], 1, &x1.ptr.p_double[0], 1, ae_v_len(0,n-1)); g = v+b.ptr.p_double[i]; if( ae_fp_eq(x1.ptr.p_double[i],bndl.ptr.p_double[i])&&ae_fp_greater(g,(double)(0)) ) { g = (double)(0); } if( ae_fp_eq(x1.ptr.p_double[i],bndu.ptr.p_double[i])&&ae_fp_less(g,(double)(0)) ) { g = (double)(0); } gnorm = gnorm+ae_sqr(g, _state); seterrorflag(wereerrors, ae_fp_less(x1.ptr.p_double[i],bndl.ptr.p_double[i]), _state); seterrorflag(wereerrors, ae_fp_greater(x1.ptr.p_double[i],bndu.ptr.p_double[i]), _state); } gnorm = ae_sqrt(gnorm, _state); seterrorflag(wereerrors, ae_fp_greater(gnorm,eps), _state); } } /* * Linear (zero-quadratic) test: * * N dimensions, N>=1 * * box constraints, x[i] in [-1,+1] * * A = 0 * * random B with normal entries * * initial point is random, feasible * * We perform two tests: * * unconstrained problem must be recognized as unbounded * * constrained problem can be successfully solved * * NOTE: we may explicitly set zero A, or assume that by * default it is zero. During test we will try both * ways. */ eps = 1.0E-4; for(n=1; n<=10; n++) { for(pass=1; pass<=10; pass++) { /* * Generate problem */ ae_vector_set_length(&b, n, _state); ae_vector_set_length(&bndl, n, _state); ae_vector_set_length(&bndu, n, _state); ae_vector_set_length(&x0, n, _state); for(i=0; i<=n-1; i++) { do { b.ptr.p_double[i] = hqrndnormal(&rs, _state); } while(ae_fp_eq(b.ptr.p_double[i],(double)(0))); bndl.ptr.p_double[i] = (double)(-1); bndu.ptr.p_double[i] = (double)(1); x0.ptr.p_double[i] = 2*hqrnduniformr(&rs, _state)-1; } /* * Solve problem: * * without constraints we expect failure * * with constraints algorithm must succeed */ minqpcreate(n, &state, _state); testminqpunit_setrandomalgononconvex(&state, _state); minqpsetlinearterm(&state, &b, _state); minqpsetstartingpoint(&state, &x0, _state); if( ae_fp_greater(hqrndnormal(&rs, _state),(double)(0)) ) { ae_matrix_set_length(&a, n, n, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { a.ptr.pp_double[i][j] = (double)(0); } } minqpsetquadraticterm(&state, &a, ae_true, _state); } minqpoptimize(&state, _state); minqpresults(&state, &x1, &rep, _state); seterrorflag(wereerrors, rep.terminationtype!=-4, _state); minqpsetbc(&state, &bndl, &bndu, _state); minqpoptimize(&state, _state); minqpresults(&state, &x1, &rep, _state); seterrorflag(wereerrors, rep.terminationtype<=0, _state); if( rep.terminationtype<=0 ) { ae_frame_leave(_state); return; } /* * Test - calculate constrained gradient at solution, * check its norm. */ for(i=0; i<=n-1; i++) { seterrorflag(wereerrors, ae_fp_greater(b.ptr.p_double[i],(double)(0))&&ae_fp_greater(x1.ptr.p_double[i],bndl.ptr.p_double[i]), _state); seterrorflag(wereerrors, ae_fp_less(b.ptr.p_double[i],(double)(0))&&ae_fp_less(x1.ptr.p_double[i],bndu.ptr.p_double[i]), _state); } } } ae_frame_leave(_state); } /************************************************************************* This function tests equality constrained quadratic programming algorithm. Returns True on errors. 
*************************************************************************/ static ae_bool testminqpunit_ecqptest(ae_state *_state) { ae_frame _frame_block; ae_int_t n; ae_int_t k; ae_matrix a; ae_matrix q; ae_matrix c; ae_matrix a2; ae_vector b; ae_vector b2; ae_vector xstart; ae_vector xstart2; ae_vector xend; ae_vector xend2; ae_vector x0; ae_vector x1; ae_vector xd; ae_vector xs; ae_vector tmp; ae_vector g; ae_vector bndl; ae_vector bndu; ae_vector xorigin; ae_vector ct; double eps; double theta; double f0; double f1; minqpstate state; minqpstate state2; minqpreport rep; ae_int_t i; ae_int_t j; ae_int_t pass; ae_int_t rk; double v; ae_int_t aulits; ae_bool waserrors; ae_bool result; ae_frame_make(_state, &_frame_block); ae_matrix_init(&a, 0, 0, DT_REAL, _state); ae_matrix_init(&q, 0, 0, DT_REAL, _state); ae_matrix_init(&c, 0, 0, DT_REAL, _state); ae_matrix_init(&a2, 0, 0, DT_REAL, _state); ae_vector_init(&b, 0, DT_REAL, _state); ae_vector_init(&b2, 0, DT_REAL, _state); ae_vector_init(&xstart, 0, DT_REAL, _state); ae_vector_init(&xstart2, 0, DT_REAL, _state); ae_vector_init(&xend, 0, DT_REAL, _state); ae_vector_init(&xend2, 0, DT_REAL, _state); ae_vector_init(&x0, 0, DT_REAL, _state); ae_vector_init(&x1, 0, DT_REAL, _state); ae_vector_init(&xd, 0, DT_REAL, _state); ae_vector_init(&xs, 0, DT_REAL, _state); ae_vector_init(&tmp, 0, DT_REAL, _state); ae_vector_init(&g, 0, DT_REAL, _state); ae_vector_init(&bndl, 0, DT_REAL, _state); ae_vector_init(&bndu, 0, DT_REAL, _state); ae_vector_init(&xorigin, 0, DT_REAL, _state); ae_vector_init(&ct, 0, DT_INT, _state); _minqpstate_init(&state, _state); _minqpstate_init(&state2, _state); _minqpreport_init(&rep, _state); waserrors = ae_false; /* * First test: * * N*N identity A * * K=0, where q is random unit vector * * optimization problem has form 0.5*x'*A*x-(x1*A)*x, * where x1 is some random vector * * either: * a) x1 is feasible => we must stop at x1 * b) x1 is infeasible => we must stop at the boundary q'*x=0 and * projection of gradient onto q*x=0 must be zero * * NOTE: we make several passes because some specific kind of errors is rarely * caught by this test, so we need several repetitions. 
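*
* (The infeasible-x1 branch below is a first-order optimality check on the
*  active constraint q'*x=0: at the constrained minimum the gradient
*  g = A*xend + b must be parallel to the constraint normal q, so after
*  removing its component along q,
*
*      g := g - (q'*g)*q,
*
*  the remaining norm must not exceed eps; approximate feasibility
*  q'*xend >= -eps is checked as well.)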
*/ eps = 1.0E-4; for(n=2; n<=6; n++) { for(pass=0; pass<=4; pass++) { /* * Generate problem: A, b, CMatrix, x0, XStart */ spdmatrixrndcond(n, ae_pow(10.0, 3*ae_randomreal(_state), _state), &a, _state); ae_vector_set_length(&b, n, _state); ae_vector_set_length(&x1, n, _state); ae_vector_set_length(&xstart, n, _state); ae_matrix_set_length(&c, 1, n+1, _state); ae_vector_set_length(&ct, 1, _state); for(i=0; i<=n-1; i++) { x1.ptr.p_double[i] = 2*ae_randomreal(_state)-1; xstart.ptr.p_double[i] = 2*ae_randomreal(_state)-1; } do { v = (double)(0); for(i=0; i<=n-1; i++) { c.ptr.pp_double[0][i] = 2*ae_randomreal(_state)-1; v = v+ae_sqr(c.ptr.pp_double[0][i], _state); } v = ae_sqrt(v, _state); } while(ae_fp_eq(v,(double)(0))); for(i=0; i<=n-1; i++) { c.ptr.pp_double[0][i] = c.ptr.pp_double[0][i]/v; } c.ptr.pp_double[0][n] = (double)(0); ct.ptr.p_int[0] = 1; for(i=0; i<=n-1; i++) { v = ae_v_dotproduct(&a.ptr.pp_double[i][0], 1, &x1.ptr.p_double[0], 1, ae_v_len(0,n-1)); b.ptr.p_double[i] = -v; } /* * Create optimizer, solve */ minqpcreate(n, &state, _state); testminqpunit_setrandomalgoconvexlc(&state, _state); minqpsetlinearterm(&state, &b, _state); minqpsetquadraticterm(&state, &a, ae_fp_greater(ae_randomreal(_state),0.5), _state); minqpsetstartingpoint(&state, &xstart, _state); minqpsetlc(&state, &c, &ct, 1, _state); minqpoptimize(&state, _state); minqpresults(&state, &xend, &rep, _state); /* * Test */ if( rep.terminationtype<=0 ) { seterrorflag(err, ae_true, _state); continue; } v = ae_v_dotproduct(&x1.ptr.p_double[0], 1, &c.ptr.pp_double[0][0], 1, ae_v_len(0,n-1)); if( ae_fp_greater_eq(v,(double)(0)) ) { /* * X1 is feasible */ for(i=0; i<=n-1; i++) { seterrorflag(err, ae_fp_greater(ae_fabs(xend.ptr.p_double[i]-x1.ptr.p_double[i], _state),eps), _state); } } else { /* * X1 is infeasible: * * XEnd must be approximately feasible * * gradient projection onto c'*x=0 must be zero */ v = ae_v_dotproduct(&xend.ptr.p_double[0], 1, &c.ptr.pp_double[0][0], 1, ae_v_len(0,n-1)); seterrorflag(err, ae_fp_less(v,-eps), _state); ae_vector_set_length(&g, n, _state); ae_v_move(&g.ptr.p_double[0], 1, &b.ptr.p_double[0], 1, ae_v_len(0,n-1)); for(i=0; i<=n-1; i++) { v = ae_v_dotproduct(&a.ptr.pp_double[i][0], 1, &xend.ptr.p_double[0], 1, ae_v_len(0,n-1)); g.ptr.p_double[i] = g.ptr.p_double[i]+v; } v = ae_v_dotproduct(&g.ptr.p_double[0], 1, &c.ptr.pp_double[0][0], 1, ae_v_len(0,n-1)); ae_v_subd(&g.ptr.p_double[0], 1, &c.ptr.pp_double[0][0], 1, ae_v_len(0,n-1), v); v = ae_v_dotproduct(&g.ptr.p_double[0], 1, &g.ptr.p_double[0], 1, ae_v_len(0,n-1)); seterrorflag(err, ae_fp_greater(ae_sqrt(v, _state),eps), _state); } } } /* * Boundary and linear equality/inequality constrained QP problem, * test for correct handling of non-zero XOrigin: * * N*N SPD A with moderate condtion number (up to 100) * * boundary constraints 0<=x[i]<=1 * * K0 ) { seterrorflag(err, ae_fp_less(v,c.ptr.pp_double[i][n]-eps), _state); } if( ct.ptr.p_int[i]<0 ) { seterrorflag(err, ae_fp_greater(v,c.ptr.pp_double[i][n]+eps), _state); } } for(i=0; i<=n-1; i++) { seterrorflag(err, ae_fp_greater(ae_fabs(xend.ptr.p_double[i]-xend2.ptr.p_double[i], _state),eps), _state); seterrorflag(err, ae_fp_less(xend.ptr.p_double[i],(double)(0)), _state); seterrorflag(err, ae_fp_greater(xend.ptr.p_double[i],(double)(1)), _state); } } } /* * Boundary constraints vs linear ones: * * N*N SPD A * * optimization problem has form 0.5*x'*A*x-(x1*A)*x, * where x1 is some random vector from [-1,+1] * * K=2*N constraints of the form ai<=x[i] or x[i]<=b[i], * with ai in [-1.0,-0.1], bi 
in [+0.1,+1.0] * * initial point xstart is from [-1,+2] * * we solve two related QP problems: * a) one with constraints posed as boundary ones * b) another one with same constraints posed as general linear ones * both problems must have same solution. * Here we test that boundary constrained and linear inequality constrained * solvers give same results. */ eps = 1.0E-3; for(n=1; n<=6; n++) { /* * Generate problem: A, b, x0, XStart, C, CT */ spdmatrixrndcond(n, ae_pow(10.0, 3*ae_randomreal(_state), _state), &a, _state); ae_vector_set_length(&b, n, _state); ae_vector_set_length(&x1, n, _state); ae_vector_set_length(&xstart, n, _state); ae_matrix_set_length(&c, 2*n, n+1, _state); ae_vector_set_length(&ct, 2*n, _state); ae_vector_set_length(&bndl, n, _state); ae_vector_set_length(&bndu, n, _state); for(i=0; i<=n-1; i++) { x1.ptr.p_double[i] = 2*ae_randomreal(_state)-1; xstart.ptr.p_double[i] = 3*ae_randomreal(_state)-1; bndl.ptr.p_double[i] = -(0.1+0.9*ae_randomreal(_state)); bndu.ptr.p_double[i] = 0.1+0.9*ae_randomreal(_state); for(j=0; j<=n-1; j++) { c.ptr.pp_double[2*i+0][j] = (double)(0); c.ptr.pp_double[2*i+1][j] = (double)(0); } c.ptr.pp_double[2*i+0][i] = (double)(1); c.ptr.pp_double[2*i+0][n] = bndl.ptr.p_double[i]; ct.ptr.p_int[2*i+0] = 1; c.ptr.pp_double[2*i+1][i] = (double)(1); c.ptr.pp_double[2*i+1][n] = bndu.ptr.p_double[i]; ct.ptr.p_int[2*i+1] = -1; } for(i=0; i<=n-1; i++) { v = ae_v_dotproduct(&a.ptr.pp_double[i][0], 1, &x1.ptr.p_double[0], 1, ae_v_len(0,n-1)); b.ptr.p_double[i] = -v; } /* * Solve linear inequality constrained problem */ minqpcreate(n, &state, _state); testminqpunit_setrandomalgoconvexlc(&state, _state); minqpsetlinearterm(&state, &b, _state); minqpsetquadraticterm(&state, &a, ae_fp_greater(ae_randomreal(_state),0.5), _state); minqpsetstartingpoint(&state, &xstart, _state); minqpsetlc(&state, &c, &ct, 2*n, _state); minqpoptimize(&state, _state); minqpresults(&state, &xend, &rep, _state); /* * Solve boundary constrained problem */ minqpcreate(n, &state2, _state); testminqpunit_setrandomalgoconvexlc(&state, _state); minqpsetlinearterm(&state2, &b, _state); minqpsetquadraticterm(&state2, &a, ae_fp_greater(ae_randomreal(_state),0.5), _state); minqpsetstartingpoint(&state2, &xstart, _state); minqpsetbc(&state2, &bndl, &bndu, _state); minqpoptimize(&state2, _state); minqpresults(&state2, &xend2, &rep2, _state); /* * Calculate gradient, check projection */ if( rep.terminationtype<=0||rep2.terminationtype<=0 ) { seterrorflag(err, ae_true, _state); continue; } for(i=0; i<=n-1; i++) { seterrorflag(err, ae_fp_less(xend.ptr.p_double[i],bndl.ptr.p_double[i]-eps), _state); seterrorflag(err, ae_fp_greater(xend.ptr.p_double[i],bndu.ptr.p_double[i]+eps), _state); seterrorflag(err, ae_fp_greater(ae_fabs(xend.ptr.p_double[i]-xend2.ptr.p_double[i], _state),eps), _state); } } /* * Boundary constraints posed as general linear ones: * * no bound constraints * * 2*N linear constraints 0 <= x[i] <= 1 * * preconditioner is chosen at random (we just want to be * sure that preconditioning won't prevent us from converging * to the feasible point): * * unit preconditioner * * random diagonal-based preconditioner * * random scale-based preconditioner * * F(x) = |x-x0|^P, where P={2,4} and x0 is randomly selected from [-1,+2]^N * * with such simple constraints and function it is easy to find * analytic form of solution: S[i] = bound(x0[i], 0, 1). 
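*
* (In the quadratic case actually generated below, A is the identity and
*  b = -x0, so up to an additive constant the objective is 0.5*|x-x0|^2;
*  its minimizer over the box 0<=x[i]<=1 is the componentwise projection
*  bound(x0[i],0,1), which is the analytic solution the code compares
*  against.)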
* * however, we can't guarantee that solution is strictly feasible * with respect to nonlinearity constraint, so we check * for approximate feasibility. */ for(n=1; n<=5; n++) { /* * Generate X, BL, BU. */ ae_matrix_set_length(&a, n, n, _state); ae_vector_set_length(&b, n, _state); ae_vector_set_length(&xstart, n, _state); ae_vector_set_length(&x0, n, _state); ae_matrix_set_length(&c, 2*n, n+1, _state); ae_vector_set_length(&ct, 2*n, _state); for(i=0; i<=n-1; i++) { xstart.ptr.p_double[i] = ae_randomreal(_state); x0.ptr.p_double[i] = 3*ae_randomreal(_state)-1; b.ptr.p_double[i] = -x0.ptr.p_double[i]; for(j=0; j<=n; j++) { c.ptr.pp_double[2*i+0][j] = (double)(0); c.ptr.pp_double[2*i+1][j] = (double)(0); } c.ptr.pp_double[2*i+0][i] = (double)(1); c.ptr.pp_double[2*i+0][n] = (double)(0); ct.ptr.p_int[2*i+0] = 1; c.ptr.pp_double[2*i+1][i] = (double)(1); c.ptr.pp_double[2*i+1][n] = (double)(1); ct.ptr.p_int[2*i+1] = -1; } for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { if( i==j ) { a.ptr.pp_double[i][j] = (double)(1); } else { a.ptr.pp_double[i][j] = (double)(0); } } } /* * Create and optimize */ minqpcreate(n, &state, _state); testminqpunit_setrandomalgoconvexlc(&state, _state); minqpsetlc(&state, &c, &ct, 2*n, _state); minqpsetlinearterm(&state, &b, _state); minqpsetquadraticterm(&state, &a, ae_fp_greater(ae_randomreal(_state),0.5), _state); minqpsetstartingpoint(&state, &xstart, _state); minqpoptimize(&state, _state); minqpresults(&state, &xend, &rep, _state); if( rep.terminationtype<=0 ) { seterrorflag(err, ae_true, _state); continue; } /* * * compare solution with analytic one * * check feasibility */ for(i=0; i<=n-1; i++) { seterrorflag(err, ae_fp_greater(ae_fabs(xend.ptr.p_double[i]-boundval(x0.ptr.p_double[i], 0.0, 1.0, _state), _state),0.05), _state); seterrorflag(err, ae_fp_less(xend.ptr.p_double[i],0.0-1.0E-6), _state); seterrorflag(err, ae_fp_greater(xend.ptr.p_double[i],1.0+1.0E-6), _state); } } /* * Boundary and linear equality/inequality constrained QP problem with * excessive constraints: * * N*N SPD A with moderate condtion number (up to 100) * * boundary constraints 0<=x[i]<=1 * * K=2*N equality/inequality constraints Q*x = Q*x0, where Q is random matrix, * x0 is some random vector from the feasible hypercube (0.1<=x0[i]<=0.9) * * optimization problem has form 0.5*x'*A*x-b*x, * where b is some random vector * * because constraints are excessive, the main problem is to find * feasible point; usually, the only existing feasible point is solution, * so we have to check only feasibility */ eps = 1.0E-4; for(n=1; n<=6; n++) { /* * Generate problem: A, b, BndL, BndU, CMatrix, x0, x1, XStart */ k = 2*n; spdmatrixrndcond(n, ae_pow(10.0, 3*ae_randomreal(_state), _state), &a, _state); ae_vector_set_length(&b, n, _state); ae_vector_set_length(&bndl, n, _state); ae_vector_set_length(&bndu, n, _state); ae_vector_set_length(&x0, n, _state); ae_vector_set_length(&x1, n, _state); ae_vector_set_length(&xstart, n, _state); ae_matrix_set_length(&c, k, n+1, _state); ae_vector_set_length(&ct, k, _state); for(i=0; i<=n-1; i++) { x0.ptr.p_double[i] = 0.1+0.8*ae_randomreal(_state); x1.ptr.p_double[i] = 2*ae_randomreal(_state)-1; bndl.ptr.p_double[i] = 0.0; bndu.ptr.p_double[i] = 1.0; xstart.ptr.p_double[i] = (double)(ae_randominteger(2, _state)); } for(i=0; i<=k-1; i++) { for(j=0; j<=n-1; j++) { c.ptr.pp_double[i][j] = 2*ae_randomreal(_state)-1; } v = ae_v_dotproduct(&c.ptr.pp_double[i][0], 1, &x0.ptr.p_double[0], 1, ae_v_len(0,n-1)); ct.ptr.p_int[i] = ae_randominteger(3, _state)-1; if( 
ct.ptr.p_int[i]==0 ) { c.ptr.pp_double[i][n] = v; } if( ct.ptr.p_int[i]>0 ) { c.ptr.pp_double[i][n] = v-1.0E-3; } if( ct.ptr.p_int[i]<0 ) { c.ptr.pp_double[i][n] = v+1.0E-3; } } for(i=0; i<=n-1; i++) { b.ptr.p_double[i] = 2*ae_randomreal(_state)-1; } /* * Create optimizer, solve */ minqpcreate(n, &state, _state); testminqpunit_setrandomalgoconvexlc(&state, _state); minqpsetlinearterm(&state, &b, _state); minqpsetquadraticterm(&state, &a, ae_fp_greater(ae_randomreal(_state),0.5), _state); minqpsetstartingpoint(&state, &xstart, _state); minqpsetbc(&state, &bndl, &bndu, _state); minqpsetlc(&state, &c, &ct, k, _state); minqpoptimize(&state, _state); minqpresults(&state, &xend, &rep, _state); /* * Check feasibility properties of the solution */ if( rep.terminationtype<=0 ) { seterrorflag(err, ae_true, _state); continue; } for(i=0; i<=k-1; i++) { v = ae_v_dotproduct(&xend.ptr.p_double[0], 1, &c.ptr.pp_double[i][0], 1, ae_v_len(0,n-1)); if( ct.ptr.p_int[i]==0 ) { seterrorflag(err, ae_fp_greater(ae_fabs(v-c.ptr.pp_double[i][n], _state),eps), _state); } if( ct.ptr.p_int[i]>0 ) { seterrorflag(err, ae_fp_less(v,c.ptr.pp_double[i][n]-eps), _state); } if( ct.ptr.p_int[i]<0 ) { seterrorflag(err, ae_fp_greater(v,c.ptr.pp_double[i][n]+eps), _state); } } } /* * General inequality constrained problem: * * N*N SPD diagonal A with moderate condtion number * * no boundary constraints * * K=N inequality constraints C*x >= C*x0, where C is N*N well conditioned * matrix, x0 is some random vector [-1,+1] * * optimization problem has form 0.5*x'*A*x-b'*x, * where b is random vector from [-1,+1] * * using duality, we can obtain solution of QP problem as follows: * a) intermediate problem min(0.5*y'*B*y + d'*y) s.t. y>=0 * is solved, where B = C*inv(A)*C', d = -(C*inv(A)*b + C*x0) * b) after we got dual solution ys, we calculate primal solution * xs = inv(A)*(C'*ys-b) */ eps = 1.0E-3; for(n=1; n<=6; n++) { /* * Generate problem */ ae_vector_set_length(&da, n, _state); ae_matrix_set_length(&a, n, n, _state); rmatrixrndcond(n, ae_pow(10.0, 2*ae_randomreal(_state), _state), &t2, _state); ae_vector_set_length(&b, n, _state); ae_vector_set_length(&x0, n, _state); ae_vector_set_length(&xstart, n, _state); ae_matrix_set_length(&c, n, n+1, _state); ae_vector_set_length(&ct, n, _state); for(i=0; i<=n-1; i++) { da.ptr.p_double[i] = ae_exp(8*ae_randomreal(_state)-4, _state); for(j=0; j<=n-1; j++) { a.ptr.pp_double[i][j] = (double)(0); } a.ptr.pp_double[i][i] = da.ptr.p_double[i]; } for(i=0; i<=n-1; i++) { x0.ptr.p_double[i] = 2*ae_randomreal(_state)-1; b.ptr.p_double[i] = 2*ae_randomreal(_state)-1; xstart.ptr.p_double[i] = 2*ae_randomreal(_state)-1; } for(i=0; i<=n-1; i++) { ae_v_move(&c.ptr.pp_double[i][0], 1, &t2.ptr.pp_double[i][0], 1, ae_v_len(0,n-1)); v = ae_v_dotproduct(&c.ptr.pp_double[i][0], 1, &x0.ptr.p_double[0], 1, ae_v_len(0,n-1)); c.ptr.pp_double[i][n] = v; ct.ptr.p_int[i] = 1; } /* * Solve primal problem, check feasibility */ minqpcreate(n, &state, _state); testminqpunit_setrandomalgoconvexlc(&state, _state); minqpsetlinearterm(&state, &b, _state); minqpsetquadraticterm(&state, &a, ae_fp_greater(ae_randomreal(_state),0.5), _state); minqpsetstartingpoint(&state, &xstart, _state); minqpsetlc(&state, &c, &ct, n, _state); minqpoptimize(&state, _state); minqpresults(&state, &xend, &rep, _state); if( rep.terminationtype<=0 ) { seterrorflag(err, ae_true, _state); continue; } for(i=0; i<=n-1; i++) { v = ae_v_dotproduct(&xend.ptr.p_double[0], 1, &c.ptr.pp_double[i][0], 1, ae_v_len(0,n-1)); seterrorflag(err, 
ae_fp_less(v,c.ptr.pp_double[i][n]-eps), _state); } /* * Generate dual problem: * * A2 stores new quadratic term * * B2 stores new linear term * * BndL/BndU store boundary constraints */ ae_matrix_set_length(&t3, n, n, _state); ae_matrix_set_length(&a2, n, n, _state); rmatrixtranspose(n, n, &c, 0, 0, &t3, 0, 0, _state); for(i=0; i<=n-1; i++) { v = 1/ae_sqrt(da.ptr.p_double[i], _state); ae_v_muld(&t3.ptr.pp_double[i][0], 1, ae_v_len(0,n-1), v); } rmatrixsyrk(n, n, 1.0, &t3, 0, 0, 2, 0.0, &a2, 0, 0, ae_true, _state); ae_vector_set_length(&tmp0, n, _state); ae_v_move(&tmp0.ptr.p_double[0], 1, &b.ptr.p_double[0], 1, ae_v_len(0,n-1)); for(i=0; i<=n-1; i++) { tmp0.ptr.p_double[i] = tmp0.ptr.p_double[i]/da.ptr.p_double[i]; } ae_vector_set_length(&b2, n, _state); for(i=0; i<=n-1; i++) { v = ae_v_dotproduct(&c.ptr.pp_double[i][0], 1, &tmp0.ptr.p_double[0], 1, ae_v_len(0,n-1)); b2.ptr.p_double[i] = -(v+c.ptr.pp_double[i][n]); } ae_vector_set_length(&bndl, n, _state); ae_vector_set_length(&bndu, n, _state); for(i=0; i<=n-1; i++) { bndl.ptr.p_double[i] = 0.0; bndu.ptr.p_double[i] = _state->v_posinf; } minqpcreate(n, &state2, _state); testminqpunit_setrandomalgoconvexlc(&state, _state); minqpsetlinearterm(&state2, &b2, _state); minqpsetquadraticterm(&state2, &a2, ae_true, _state); minqpsetbc(&state2, &bndl, &bndu, _state); minqpoptimize(&state2, _state); minqpresults(&state2, &xend2, &rep2, _state); if( rep2.terminationtype<=0 ) { seterrorflag(err, ae_true, _state); continue; } for(i=0; i<=n-1; i++) { v = ae_v_dotproduct(&c.ptr.pp_double[0][i], c.stride, &xend2.ptr.p_double[0], 1, ae_v_len(0,n-1)); tmp0.ptr.p_double[i] = v-b.ptr.p_double[i]; } for(i=0; i<=n-1; i++) { tmp0.ptr.p_double[i] = tmp0.ptr.p_double[i]/da.ptr.p_double[i]; } for(i=0; i<=n-1; i++) { seterrorflag(err, ae_fp_greater(ae_fabs(tmp0.ptr.p_double[i]-xend.ptr.p_double[i], _state),eps*ae_maxreal(ae_fabs(tmp0.ptr.p_double[i], _state), 1.0, _state)), _state); } } /* * Boundary and linear equality/inequality constrained QP problem, * test checks that different starting points yield same final point: * * random N from [1..6], random K from [1..2*N] * * N*N SPD A with moderate condtion number (up to 100) * * boundary constraints 0<=x[i]<=1 * * K<2*N linear inequality constraints Q*x <= Q*x0, where * Q is random K*N matrix, x0 is some random vector from the * inner area of the feasible hypercube (0.1<=x0[i]<=0.9) * * optimization problem has form 0.5*x'*A*x+b*x, * where b is some random vector with -5<=b[i]<=+5 * * every component of the initial point XStart is random from [-2,+2] * * we perform two starts from random different XStart and compare values * of the target function (although final points may be slightly different, * function values should match each other) */ eps = 1.0E-4; for(pass=1; pass<=50; pass++) { /* * Generate problem: N, K, A, b, BndL, BndU, CMatrix, x0, x1, XStart. 
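* NOTE: the two runs are compared by objective value rather than by the
* returned points because near the solution f(x)-f(xs) behaves like
* 0.5*(x-xs)'*A*(x-xs), i.e. an O(delta) displacement of the point changes
* the value only by O(delta^2); for ill-conditioned A the point may drift
* noticeably along nearly flat directions while the value stays put.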
*/ n = ae_randominteger(5, _state)+2; k = ae_randominteger(2*n, _state)+1; spdmatrixrndcond(n, ae_pow(10.0, 2*ae_randomreal(_state), _state), &a, _state); ae_vector_set_length(&b, n, _state); ae_vector_set_length(&b2, n, _state); ae_vector_set_length(&bndl, n, _state); ae_vector_set_length(&bndu, n, _state); ae_vector_set_length(&x0, n, _state); ae_vector_set_length(&xstart, n, _state); ae_vector_set_length(&xstart2, n, _state); ae_matrix_set_length(&c, k, n+1, _state); ae_vector_set_length(&ct, k, _state); for(i=0; i<=n-1; i++) { x0.ptr.p_double[i] = 0.1+0.8*ae_randomreal(_state); b.ptr.p_double[i] = 2*ae_randomreal(_state)-1; bndl.ptr.p_double[i] = 0.0; bndu.ptr.p_double[i] = 1.0; xstart.ptr.p_double[i] = 4*ae_randomreal(_state)-2; xstart2.ptr.p_double[i] = 4*ae_randomreal(_state)-2; } for(i=0; i<=k-1; i++) { for(j=0; j<=n-1; j++) { c.ptr.pp_double[i][j] = 2*ae_randomreal(_state)-1; } v = ae_v_dotproduct(&c.ptr.pp_double[i][0], 1, &x0.ptr.p_double[0], 1, ae_v_len(0,n-1)); c.ptr.pp_double[i][n] = v; ct.ptr.p_int[i] = ae_randominteger(3, _state)-1; } /* * Solve with XStart */ minqpcreate(n, &state, _state); testminqpunit_setrandomalgoconvexlc(&state, _state); minqpsetlinearterm(&state, &b, _state); minqpsetquadraticterm(&state, &a, ae_fp_greater(ae_randomreal(_state),0.5), _state); minqpsetstartingpoint(&state, &xstart, _state); minqpsetbc(&state, &bndl, &bndu, _state); minqpsetlc(&state, &c, &ct, k, _state); minqpoptimize(&state, _state); minqpresults(&state, &xend, &rep, _state); if( rep.terminationtype<=0 ) { seterrorflag(err, ae_true, _state); continue; } /* * Solve with XStart2 */ minqpsetstartingpoint(&state, &xstart2, _state); minqpoptimize(&state, _state); minqpresults(&state, &xend2, &rep, _state); if( rep.terminationtype<=0 ) { seterrorflag(err, ae_true, _state); continue; } /* * Calculate function value and XEnd and XEnd2, compare solutions */ f0 = 0.0; f1 = 0.0; for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { f0 = f0+0.5*xend.ptr.p_double[i]*a.ptr.pp_double[i][j]*xend.ptr.p_double[j]; f1 = f1+0.5*xend2.ptr.p_double[i]*a.ptr.pp_double[i][j]*xend2.ptr.p_double[j]; } f0 = f0+xend.ptr.p_double[i]*b.ptr.p_double[i]; f1 = f1+xend2.ptr.p_double[i]*b.ptr.p_double[i]; } seterrorflag(err, ae_fp_greater(ae_fabs(f0-f1, _state),eps), _state); } /* * Convex/nonconvex optimization problem with excessive * (degenerate constraints): * * * N=2..5 * * f = 0.5*x'*A*x+b'*x * * b has normally distributed entries with scale 10^BScale * * several kinds of A are tried: zero, well conditioned SPD, well conditioned indefinite, low rank * * box constraints: x[i] in [-1,+1] * * 2^N "excessive" general linear constraints (v_k,x)<=(v_k,v_k)+v_shift, * where v_k is one of 2^N vertices of feasible hypercube, v_shift is * a shift parameter: * * with zero v_shift such constraints are degenerate (each vertex has * N box constraints and one "redundant" linear constraint) * * with positive v_shift linear constraint is always inactive * * with small (about machine epsilon) but negative v_shift, * constraint is close to degenerate - but not exactly * * We check that constrained gradient is close to zero at solution. * Box constraint is considered active if distance to boundary is less * than TolConstr. 
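* (for example, with N=2 and vertex v_k=(+1,+1) the extra row reads
* x[0]+x[1] <= 2+v_shift; at the corner (1,1) it merely duplicates the two
* box constraints x[0]<=1 and x[1]<=1, which is exactly what makes the
* active set degenerate when v_shift=0)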
* * NOTE: TolConstr must be large enough so it won't conflict with * perturbation introduced by v_shift */ tolconstr = 1.0E-8; for(n=2; n<=8; n++) { for(akind=0; akind<=3; akind++) { for(shiftkind=-5; shiftkind<=1; shiftkind++) { for(bscale=0; bscale>=-2; bscale--) { /* * Generate A, B and initial point */ ae_matrix_set_length(&a, n, n, _state); ae_vector_set_length(&b, n, _state); ae_vector_set_length(&x, n, _state); for(i=0; i<=n-1; i++) { b.ptr.p_double[i] = ae_pow((double)(10), (double)(bscale), _state)*hqrndnormal(&rs, _state); x.ptr.p_double[i] = hqrnduniformr(&rs, _state)-0.5; } for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { a.ptr.pp_double[i][j] = 0.0; } } if( akind==1 ) { /* * Dense well conditioned SPD */ spdmatrixrndcond(n, 50.0, &a, _state); } if( akind==2 ) { /* * Dense well conditioned indefinite */ smatrixrndcond(n, 50.0, &a, _state); } if( akind==3 ) { /* * Low rank */ ae_vector_set_length(&tmp, n, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { a.ptr.pp_double[i][j] = 0.0; } } for(k=1; k<=ae_minint(3, n-1, _state); k++) { for(i=0; i<=n-1; i++) { tmp.ptr.p_double[i] = hqrndnormal(&rs, _state); } v = hqrndnormal(&rs, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { a.ptr.pp_double[i][j] = a.ptr.pp_double[i][j]+v*tmp.ptr.p_double[i]*tmp.ptr.p_double[j]; } } } } /* * Generate constraints */ ae_vector_set_length(&bl, n, _state); ae_vector_set_length(&bu, n, _state); for(i=0; i<=n-1; i++) { bl.ptr.p_double[i] = -1.0; bu.ptr.p_double[i] = 1.0; } ccnt = ae_round(ae_pow((double)(2), (double)(n), _state), _state); ae_matrix_set_length(&c, ccnt, n+1, _state); ae_vector_set_length(&ct, ccnt, _state); for(i=0; i<=ccnt-1; i++) { ct.ptr.p_int[i] = -1; k = i; c.ptr.pp_double[i][n] = ae_sign((double)(shiftkind), _state)*ae_pow((double)(10), ae_fabs((double)(shiftkind), _state), _state)*ae_machineepsilon; for(j=0; j<=n-1; j++) { c.ptr.pp_double[i][j] = (double)(2*(k%2)-1); c.ptr.pp_double[i][n] = c.ptr.pp_double[i][n]+c.ptr.pp_double[i][j]*c.ptr.pp_double[i][j]; k = k/2; } } /* * Create and optimize */ minqpcreate(n, &state, _state); minqpsetstartingpoint(&state, &x, _state); testminqpunit_setrandomalgononconvexlc(&state, _state); minqpsetbc(&state, &bl, &bu, _state); minqpsetlc(&state, &c, &ct, ccnt, _state); minqpsetlinearterm(&state, &b, _state); minqpsetquadraticterm(&state, &a, ae_fp_greater(ae_randomreal(_state),0.5), _state); minqpoptimize(&state, _state); minqpresults(&state, &xs0, &rep, _state); seterrorflag(err, rep.terminationtype<=0, _state); if( *err ) { ae_frame_leave(_state); return; } /* * Evaluate gradient at solution and test */ vv = 0.0; for(i=0; i<=n-1; i++) { v = ae_v_dotproduct(&a.ptr.pp_double[i][0], 1, &xs0.ptr.p_double[0], 1, ae_v_len(0,n-1)); v = v+b.ptr.p_double[i]; if( ae_fp_less_eq(xs0.ptr.p_double[i],bl.ptr.p_double[i]+tolconstr)&&ae_fp_greater(v,(double)(0)) ) { v = 0.0; } if( ae_fp_greater_eq(xs0.ptr.p_double[i],bu.ptr.p_double[i]-tolconstr)&&ae_fp_less(v,(double)(0)) ) { v = 0.0; } vv = vv+ae_sqr(v, _state); } vv = ae_sqrt(vv, _state); seterrorflag(err, ae_fp_greater(vv,1.0E-5), _state); } } } } /* * Convex/nonconvex optimization problem with combination of * box and linear constraints: * * * N=2..8 * * f = 0.5*x'*A*x+b'*x * * b has normally distributed entries with scale 10^BScale * * several kinds of A are tried: zero, well conditioned SPD, * well conditioned indefinite, low rank * * box constraints: x[i] in [-1,+1] * * initial point x0 = [0 0 ... 0 0] * * CCnt=min(3,N-1) general linear constraints of form (c,x)=0. 
* random mix of equality/inequality constraints is tried. * x0 is guaranteed to be feasible. * * We check that constrained gradient is close to zero at solution. * Inequality constraint is considered active if distance to boundary * is less than TolConstr. We use nonnegative least squares solver * in order to compute constrained gradient. */ tolconstr = 1.0E-8; for(n=2; n<=8; n++) { for(akind=0; akind<=3; akind++) { for(bscale=0; bscale>=-2; bscale--) { /* * Generate A, B and initial point */ ae_matrix_set_length(&a, n, n, _state); ae_vector_set_length(&b, n, _state); ae_vector_set_length(&x, n, _state); for(i=0; i<=n-1; i++) { b.ptr.p_double[i] = ae_pow((double)(10), (double)(bscale), _state)*hqrndnormal(&rs, _state); x.ptr.p_double[i] = 0.0; } for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { a.ptr.pp_double[i][j] = 0.0; } } if( akind==1 ) { /* * Dense well conditioned SPD */ spdmatrixrndcond(n, 50.0, &a, _state); } if( akind==2 ) { /* * Dense well conditioned indefinite */ smatrixrndcond(n, 50.0, &a, _state); } if( akind==3 ) { /* * Low rank */ ae_vector_set_length(&tmp, n, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { a.ptr.pp_double[i][j] = 0.0; } } for(k=1; k<=ae_minint(3, n-1, _state); k++) { for(i=0; i<=n-1; i++) { tmp.ptr.p_double[i] = hqrndnormal(&rs, _state); } v = hqrndnormal(&rs, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { a.ptr.pp_double[i][j] = a.ptr.pp_double[i][j]+v*tmp.ptr.p_double[i]*tmp.ptr.p_double[j]; } } } } /* * Generate constraints */ ae_vector_set_length(&bl, n, _state); ae_vector_set_length(&bu, n, _state); for(i=0; i<=n-1; i++) { bl.ptr.p_double[i] = -1.0; bu.ptr.p_double[i] = 1.0; } ccnt = ae_minint(3, n-1, _state); ae_matrix_set_length(&c, ccnt, n+1, _state); ae_vector_set_length(&ct, ccnt, _state); for(i=0; i<=ccnt-1; i++) { ct.ptr.p_int[i] = hqrnduniformi(&rs, 3, _state)-1; c.ptr.pp_double[i][n] = 0.0; for(j=0; j<=n-1; j++) { c.ptr.pp_double[i][j] = hqrnduniformr(&rs, _state)-0.5; } } /* * Create and optimize */ minqpcreate(n, &state, _state); minqpsetstartingpoint(&state, &x, _state); testminqpunit_setrandomalgononconvexlc(&state, _state); minqpsetbc(&state, &bl, &bu, _state); minqpsetlc(&state, &c, &ct, ccnt, _state); minqpsetlinearterm(&state, &b, _state); minqpsetquadraticterm(&state, &a, ae_fp_greater(ae_randomreal(_state),0.5), _state); minqpoptimize(&state, _state); minqpresults(&state, &xs0, &rep, _state); seterrorflag(err, rep.terminationtype<=0, _state); if( *err ) { ae_frame_leave(_state); return; } /* * 1. evaluate unconstrained gradient at solution * * 2. calculate constrained gradient (NNLS solver is used * to evaluate gradient subject to active constraints). * In order to do this we form CE matrix, matrix of active * constraints (columns store constraint vectors). Then * we try to approximate gradient vector by columns of CE, * subject to non-negativity restriction placed on variables * corresponding to inequality constraints. * * Residual from such regression is a constrained gradient vector. 
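* In other words, at a first-order optimal point the unconstrained gradient
* admits a representation g = CE*y with y[i]>=0 for columns coming from
* active inequality constraints (y[i] of arbitrary sign for equality ones),
* so the residual of the NNLS fit min|CE*y-g| over such y must be nearly
* zero; a large residual would mean that a feasible descent direction still
* exists at the returned point.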
*/ ae_vector_set_length(&g, n, _state); for(i=0; i<=n-1; i++) { v = ae_v_dotproduct(&a.ptr.pp_double[i][0], 1, &xs0.ptr.p_double[0], 1, ae_v_len(0,n-1)); g.ptr.p_double[i] = v+b.ptr.p_double[i]; } ae_matrix_set_length(&ce, n, n+ccnt, _state); ae_vector_set_length(&nonnegative, n+ccnt, _state); k = 0; for(i=0; i<=n-1; i++) { seterrorflag(err, ae_fp_less(xs0.ptr.p_double[i],bl.ptr.p_double[i]), _state); seterrorflag(err, ae_fp_greater(xs0.ptr.p_double[i],bu.ptr.p_double[i]), _state); if( ae_fp_less_eq(xs0.ptr.p_double[i],bl.ptr.p_double[i]+tolconstr) ) { for(j=0; j<=n-1; j++) { ce.ptr.pp_double[j][k] = 0.0; } ce.ptr.pp_double[i][k] = 1.0; nonnegative.ptr.p_bool[k] = ae_true; inc(&k, _state); continue; } if( ae_fp_greater_eq(xs0.ptr.p_double[i],bu.ptr.p_double[i]-tolconstr) ) { for(j=0; j<=n-1; j++) { ce.ptr.pp_double[j][k] = 0.0; } ce.ptr.pp_double[i][k] = -1.0; nonnegative.ptr.p_bool[k] = ae_true; inc(&k, _state); continue; } } for(i=0; i<=ccnt-1; i++) { v = ae_v_dotproduct(&c.ptr.pp_double[i][0], 1, &xs0.ptr.p_double[0], 1, ae_v_len(0,n-1)); v = v-c.ptr.pp_double[i][n]; seterrorflag(err, ct.ptr.p_int[i]==0&&ae_fp_greater(ae_fabs(v, _state),tolconstr), _state); seterrorflag(err, ct.ptr.p_int[i]>0&&ae_fp_less(v,-tolconstr), _state); seterrorflag(err, ct.ptr.p_int[i]<0&&ae_fp_greater(v,tolconstr), _state); if( ct.ptr.p_int[i]==0 ) { for(j=0; j<=n-1; j++) { ce.ptr.pp_double[j][k] = c.ptr.pp_double[i][j]; } nonnegative.ptr.p_bool[k] = ae_false; inc(&k, _state); continue; } if( (ct.ptr.p_int[i]>0&&ae_fp_less_eq(v,tolconstr))||(ct.ptr.p_int[i]<0&&ae_fp_greater_eq(v,-tolconstr)) ) { for(j=0; j<=n-1; j++) { ce.ptr.pp_double[j][k] = ae_sign((double)(ct.ptr.p_int[i]), _state)*c.ptr.pp_double[i][j]; } nonnegative.ptr.p_bool[k] = ae_true; inc(&k, _state); continue; } } snnlsinit(0, 0, 0, &nnls, _state); snnlssetproblem(&nnls, &ce, &g, 0, k, n, _state); for(i=0; i<=k-1; i++) { if( !nonnegative.ptr.p_bool[i] ) { snnlsdropnnc(&nnls, i, _state); } } snnlssolve(&nnls, &tmp, _state); for(i=0; i<=k-1; i++) { for(j=0; j<=n-1; j++) { g.ptr.p_double[j] = g.ptr.p_double[j]-tmp.ptr.p_double[i]*ce.ptr.pp_double[j][i]; } } vv = ae_v_dotproduct(&g.ptr.p_double[0], 1, &g.ptr.p_double[0], 1, ae_v_len(0,n-1)); vv = ae_sqrt(vv, _state); seterrorflag(err, ae_fp_greater(vv,1.0E-5), _state); } } } ae_frame_leave(_state); } /************************************************************************* This function tests linearly constrained QP solvers. On failure sets Err to True; on success leaves it unchanged. 
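The test iterates over SolverType=0 (BLEIC-based QP solver) and SolverType=1
(Dense-AUL solver). For each solver the quadratic term is passed in a randomly
chosen dense or sparse format and the linear constraints are randomly split
between dense and sparse representations (see the helper functions
testminqpunit_randomlyselectconvertandsetquadraticterm() and
testminqpunit_randomlysplitandsetlc() used below); small-scale randomized
problems are followed by a few large-scale ones.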
*************************************************************************/ static void testminqpunit_generallcqptest(ae_bool* errorflag, ae_state *_state) { ae_frame _frame_block; hqrndstate rs; ae_int_t n; ae_int_t i; ae_int_t j; ae_int_t k; ae_int_t solvertype; double v; double vv; ae_bool bflag; ae_int_t pass; ae_matrix rawa; ae_matrix a; sparsematrix sa; ae_vector b; ae_vector b2; ae_vector xs; ae_vector x0; ae_vector x1; ae_vector x2; ae_vector xorigin; ae_vector s; ae_vector da; ae_vector xstart; ae_vector g; ae_vector gs; ae_vector tmp; ae_vector tmp0; ae_vector tmp1; ae_vector tmp2; ae_vector lagrange; ae_vector bndl; ae_vector bndu; ae_matrix rawc; ae_vector rawct; ae_int_t rawccnt; ae_matrix densec; ae_vector densect; ae_int_t denseccnt; sparsematrix sparsec; ae_vector sparsect; ae_int_t sparseccnt; ae_matrix activeset; ae_vector activeeq; ae_int_t nactive; snnlssolver nnls; double constraintsrcond; ae_vector svdw; ae_matrix svdu; ae_matrix svdvt; minqpstate state; minqpreport rep; minqpstate state2; minqpreport rep2; double f0; double f1; double xtol; double ftol; double gtol; double tolconstr; ae_int_t bscale; ae_int_t akind; ae_matrix ce; ae_matrix q; ae_matrix t2; ae_matrix t3; ae_matrix a2; ae_vector nonnegative; double mx; ae_int_t shiftkind; double bleicepsx; double aulepsx; double aulrho; ae_int_t aulits; ae_matrix kkt; ae_vector kktright; ae_frame_make(_state, &_frame_block); _hqrndstate_init(&rs, _state); ae_matrix_init(&rawa, 0, 0, DT_REAL, _state); ae_matrix_init(&a, 0, 0, DT_REAL, _state); _sparsematrix_init(&sa, _state); ae_vector_init(&b, 0, DT_REAL, _state); ae_vector_init(&b2, 0, DT_REAL, _state); ae_vector_init(&xs, 0, DT_REAL, _state); ae_vector_init(&x0, 0, DT_REAL, _state); ae_vector_init(&x1, 0, DT_REAL, _state); ae_vector_init(&x2, 0, DT_REAL, _state); ae_vector_init(&xorigin, 0, DT_REAL, _state); ae_vector_init(&s, 0, DT_REAL, _state); ae_vector_init(&da, 0, DT_REAL, _state); ae_vector_init(&xstart, 0, DT_REAL, _state); ae_vector_init(&g, 0, DT_REAL, _state); ae_vector_init(&gs, 0, DT_REAL, _state); ae_vector_init(&tmp, 0, DT_REAL, _state); ae_vector_init(&tmp0, 0, DT_REAL, _state); ae_vector_init(&tmp1, 0, DT_REAL, _state); ae_vector_init(&tmp2, 0, DT_REAL, _state); ae_vector_init(&lagrange, 0, DT_REAL, _state); ae_vector_init(&bndl, 0, DT_REAL, _state); ae_vector_init(&bndu, 0, DT_REAL, _state); ae_matrix_init(&rawc, 0, 0, DT_REAL, _state); ae_vector_init(&rawct, 0, DT_INT, _state); ae_matrix_init(&densec, 0, 0, DT_REAL, _state); ae_vector_init(&densect, 0, DT_INT, _state); _sparsematrix_init(&sparsec, _state); ae_vector_init(&sparsect, 0, DT_INT, _state); ae_matrix_init(&activeset, 0, 0, DT_REAL, _state); ae_vector_init(&activeeq, 0, DT_BOOL, _state); _snnlssolver_init(&nnls, _state); ae_vector_init(&svdw, 0, DT_REAL, _state); ae_matrix_init(&svdu, 0, 0, DT_REAL, _state); ae_matrix_init(&svdvt, 0, 0, DT_REAL, _state); _minqpstate_init(&state, _state); _minqpreport_init(&rep, _state); _minqpstate_init(&state2, _state); _minqpreport_init(&rep2, _state); ae_matrix_init(&ce, 0, 0, DT_REAL, _state); ae_matrix_init(&q, 0, 0, DT_REAL, _state); ae_matrix_init(&t2, 0, 0, DT_REAL, _state); ae_matrix_init(&t3, 0, 0, DT_REAL, _state); ae_matrix_init(&a2, 0, 0, DT_REAL, _state); ae_vector_init(&nonnegative, 0, DT_BOOL, _state); ae_matrix_init(&kkt, 0, 0, DT_REAL, _state); ae_vector_init(&kktright, 0, DT_REAL, _state); hqrndrandomize(&rs, _state); bleicepsx = 1.0E-9; aulepsx = 1.0E-12; aulrho = 1.0E3; aulits = 15; /* * SMALL-SCALE TESTS: many tests for small N's */ 
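/*
 * Each test below is run once per SolverType: SolverType=0 configures the
 * optimizer with minqpsetalgobleic(&state, 0.0, 0.0, bleicepsx, 0, _state),
 * SolverType=1 with minqpsetalgodenseaul(&state, aulepsx, aulrho, aulits, _state),
 * using the tolerances initialized above (BLEICEpsX=1.0E-9, AULEpsX=1.0E-12,
 * AULRho=1.0E3, AULIts=15); the Dense-AUL branch is given looser accuracy
 * thresholds in some of the checks that follow.
 */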
for(solvertype=0; solvertype<=1; solvertype++) { /* * Test random linearly constrained QP problem with known answer: * * generate random A and b * * generate random solution XS * * calculate unconstrained gradient GS at XS * * generate random box/linear constraints C, with some of them being * active at XS, and some being inactive. Calculate residual gradient * GP after projection of GS onto active set, add one more constraint * equal to +-(GP-GS). * * We test here BLEIC and Dense-AUL solvers, with A being passed * in dense or sparse format, and linear constraints C being passed * as dense, sparse or mixed ones. */ for(n=1; n<=10; n++) { /* * Generate random A, b and xs */ spdmatrixrndcond(n, ae_pow(10.0, 2*ae_randomreal(_state), _state), &rawa, _state); ae_vector_set_length(&b, n, _state); ae_vector_set_length(&xs, n, _state); for(i=0; i<=n-1; i++) { b.ptr.p_double[i] = hqrndnormal(&rs, _state); xs.ptr.p_double[i] = hqrndnormal(&rs, _state); } /* * Generate well conditioned "raw" constraints: * * generate random box and CCnt-1 linear constraints * * determine active set, calculate its condition number * * repeat until condition number is good enough * (better than 1E2; larger values sometimes result in * spurious failures) */ ae_vector_set_length(&gs, n, _state); ae_vector_set_length(&tmp, n, _state); rmatrixmv(n, n, &rawa, 0, 0, 0, &xs, 0, &gs, 0, _state); for(i=0; i<=n-1; i++) { gs.ptr.p_double[i] = gs.ptr.p_double[i]+b.ptr.p_double[i]; } rawccnt = 1+hqrnduniformi(&rs, n, _state); ae_vector_set_length(&bndl, n, _state); ae_vector_set_length(&bndu, n, _state); ae_matrix_set_length(&activeset, n, n+rawccnt, _state); ae_vector_set_length(&activeeq, n+rawccnt, _state); ae_matrix_set_length(&rawc, rawccnt, n+1, _state); ae_vector_set_length(&rawct, rawccnt, _state); do { nactive = 0; for(i=0; i<=n-1; i++) { bndl.ptr.p_double[i] = xs.ptr.p_double[i]-1-hqrnduniformr(&rs, _state); bndu.ptr.p_double[i] = xs.ptr.p_double[i]+1+hqrnduniformr(&rs, _state); if( ae_fp_less(hqrnduniformr(&rs, _state),0.66) ) { /* * I-th box constraint is inactive */ continue; } if( ae_fp_greater(hqrnduniformr(&rs, _state),0.50) ) { /* * I-th box constraint is equality one */ bndl.ptr.p_double[i] = xs.ptr.p_double[i]; bndu.ptr.p_double[i] = xs.ptr.p_double[i]; for(j=0; j<=n-1; j++) { activeset.ptr.pp_double[j][nactive] = (double)(0); } activeset.ptr.pp_double[i][nactive] = (double)(1); activeeq.ptr.p_bool[nactive] = ae_true; nactive = nactive+1; } else { /* * I-th box constraint is inequality one */ for(j=0; j<=n-1; j++) { activeset.ptr.pp_double[j][nactive] = (double)(0); } if( ae_fp_greater(gs.ptr.p_double[i],(double)(0)) ) { bndl.ptr.p_double[i] = xs.ptr.p_double[i]; activeset.ptr.pp_double[i][nactive] = (double)(-1); if( ae_fp_greater(hqrnduniformr(&rs, _state),0.50) ) { bndu.ptr.p_double[i] = _state->v_posinf; } } else { bndu.ptr.p_double[i] = xs.ptr.p_double[i]; activeset.ptr.pp_double[i][nactive] = (double)(1); if( ae_fp_greater(hqrnduniformr(&rs, _state),0.50) ) { bndl.ptr.p_double[i] = _state->v_neginf; } } activeeq.ptr.p_bool[nactive] = ae_false; nactive = nactive+1; } } for(i=0; i<=rawccnt-2; i++) { for(j=0; j<=n-1; j++) { if( ae_fp_less(hqrnduniformr(&rs, _state),0.50) ) { rawc.ptr.pp_double[i][j] = hqrndnormal(&rs, _state); } else { rawc.ptr.pp_double[i][j] = (double)(0); } } rawc.ptr.pp_double[i][hqrnduniformi(&rs, n, _state)] = hqrndnormal(&rs, _state); rawc.ptr.pp_double[i][n] = (double)(0); for(j=0; j<=n-1; j++) { rawc.ptr.pp_double[i][n] = 
rawc.ptr.pp_double[i][n]+rawc.ptr.pp_double[i][j]*xs.ptr.p_double[j]; } rawct.ptr.p_int[i] = -1; if( ae_fp_less(hqrnduniformr(&rs, _state),0.66) ) { /* * I-th box constraint is inactive */ rawc.ptr.pp_double[i][n] = rawc.ptr.pp_double[i][n]+(1+hqrnduniformr(&rs, _state)); if( ae_fp_greater(hqrnduniformr(&rs, _state),0.50) ) { ae_v_muld(&rawc.ptr.pp_double[i][0], 1, ae_v_len(0,n), -1); rawct.ptr.p_int[i] = -rawct.ptr.p_int[i]; } continue; } if( ae_fp_greater(hqrnduniformr(&rs, _state),0.50) ) { /* * I-th box constraint is equality one */ rawct.ptr.p_int[i] = 0; for(j=0; j<=n-1; j++) { activeset.ptr.pp_double[j][nactive] = rawc.ptr.pp_double[i][j]; } activeeq.ptr.p_bool[nactive] = ae_true; nactive = nactive+1; } else { /* * I-th box constraint is inequality one */ v = ae_v_dotproduct(&rawc.ptr.pp_double[i][0], 1, &gs.ptr.p_double[0], 1, ae_v_len(0,n-1)); if( ae_fp_greater(v,(double)(0)) ) { rawct.ptr.p_int[i] = 1; for(j=0; j<=n-1; j++) { activeset.ptr.pp_double[j][nactive] = -rawc.ptr.pp_double[i][j]; } } else { rawct.ptr.p_int[i] = -1; for(j=0; j<=n-1; j++) { activeset.ptr.pp_double[j][nactive] = rawc.ptr.pp_double[i][j]; } } activeeq.ptr.p_bool[nactive] = ae_false; nactive = nactive+1; } } ae_v_moveneg(&tmp.ptr.p_double[0], 1, &gs.ptr.p_double[0], 1, ae_v_len(0,n-1)); snnlsinit(0, 0, 0, &nnls, _state); snnlssetproblem(&nnls, &activeset, &tmp, 0, nactive, n, _state); for(i=0; i<=nactive-1; i++) { if( activeeq.ptr.p_bool[i] ) { snnlsdropnnc(&nnls, i, _state); } } snnlssolve(&nnls, &lagrange, _state); ae_v_moveneg(&tmp.ptr.p_double[0], 1, &gs.ptr.p_double[0], 1, ae_v_len(0,n-1)); for(i=0; i<=nactive-1; i++) { v = lagrange.ptr.p_double[i]; ae_v_subd(&tmp.ptr.p_double[0], 1, &activeset.ptr.pp_double[0][i], activeset.stride, ae_v_len(0,n-1), v); } ae_v_move(&rawc.ptr.pp_double[rawccnt-1][0], 1, &tmp.ptr.p_double[0], 1, ae_v_len(0,n-1)); v = ae_v_dotproduct(&tmp.ptr.p_double[0], 1, &xs.ptr.p_double[0], 1, ae_v_len(0,n-1)); rawc.ptr.pp_double[rawccnt-1][n] = v; rawct.ptr.p_int[rawccnt-1] = -1; /* * Calculate reciprocal condition number */ if( nactive>0 ) { bflag = rmatrixsvd(&activeset, n, nactive, 0, 0, 0, &svdw, &svdu, &svdvt, _state); ae_assert(bflag, "MinQPTest: integrity failure", _state); constraintsrcond = svdw.ptr.p_double[ae_minint(nactive, n, _state)-1]/svdw.ptr.p_double[0]; } else { constraintsrcond = (double)(1); } /* * Check RCond */ } while(ae_fp_less(constraintsrcond,0.01)); /* * Create optimizer, solve */ minqpcreate(n, &state, _state); if( solvertype==0 ) { minqpsetalgobleic(&state, 0.0, 0.0, bleicepsx, 0, _state); } else { if( solvertype==1 ) { minqpsetalgodenseaul(&state, aulepsx, aulrho, aulits, _state); } else { ae_assert(ae_false, "unexpected solver type", _state); } } minqpsetlinearterm(&state, &b, _state); testminqpunit_randomlyselectconvertandsetquadraticterm(&rawa, n, &state, &rs, _state); minqpsetbc(&state, &bndl, &bndu, _state); testminqpunit_randomlysplitandsetlc(&rawc, &rawct, rawccnt, n, &state, &rs, _state); minqpoptimize(&state, _state); minqpresults(&state, &x1, &rep, _state); /* * Test * * Because constrained problems are often ill-conditioned, * we do NOT compare X1 with XS directly. 
Instead, we: * a) compare function values at X1 and XS with good precision * b) check constraint violation with good precision * c) perform comparison for |X1-XS| with LOW precision */ if( rep.terminationtype<=0 ) { seterrorflag(errorflag, ae_true, _state); ae_frame_leave(_state); return; } f0 = (double)(0); f1 = (double)(0); for(i=0; i<=n-1; i++) { f0 = f0+b.ptr.p_double[i]*xs.ptr.p_double[i]; f1 = f1+b.ptr.p_double[i]*x1.ptr.p_double[i]; for(j=0; j<=n-1; j++) { f0 = f0+0.5*xs.ptr.p_double[i]*rawa.ptr.pp_double[i][j]*xs.ptr.p_double[j]; f1 = f1+0.5*x1.ptr.p_double[i]*rawa.ptr.pp_double[i][j]*x1.ptr.p_double[j]; } } seterrorflag(errorflag, ae_fp_greater(ae_fabs(f0-f1, _state),1.0E-3), _state); for(i=0; i<=n-1; i++) { seterrorflag(errorflag, ae_fp_greater(ae_fabs(x1.ptr.p_double[i]-xs.ptr.p_double[i], _state),1.0E-2), _state); } } /* * Inequality constrained problem: * * N*N diagonal A * * one inequality constraint q'*x>=0, where q is random unit vector * * optimization problem has form 0.5*x'*A*x-(xs*A)*x, * where xs is some random vector * * either: * a) xs is feasible => we must stop at xs * b) xs is infeasible => we must stop at the boundary q'*x=0 and * projection of gradient onto q*x=0 must be zero * * NOTE: we make several passes because some specific kind of errors is rarely * caught by this test, so we need several repetitions. */ xtol = 1.0E-4; gtol = 1.0E-4; for(n=2; n<=6; n++) { for(pass=0; pass<=4; pass++) { /* * Generate problem: A, b, CMatrix, x0, XStart */ spdmatrixrndcond(n, ae_pow(10.0, 3*ae_randomreal(_state), _state), &a, _state); ae_vector_set_length(&b, n, _state); ae_vector_set_length(&xs, n, _state); ae_vector_set_length(&xstart, n, _state); ae_matrix_set_length(&rawc, 1, n+1, _state); ae_vector_set_length(&rawct, 1, _state); for(i=0; i<=n-1; i++) { xs.ptr.p_double[i] = 2*ae_randomreal(_state)-1; xstart.ptr.p_double[i] = 2*ae_randomreal(_state)-1; } do { v = (double)(0); for(i=0; i<=n-1; i++) { rawc.ptr.pp_double[0][i] = 2*ae_randomreal(_state)-1; v = v+ae_sqr(rawc.ptr.pp_double[0][i], _state); } v = ae_sqrt(v, _state); } while(ae_fp_eq(v,(double)(0))); for(i=0; i<=n-1; i++) { rawc.ptr.pp_double[0][i] = rawc.ptr.pp_double[0][i]/v; } rawc.ptr.pp_double[0][n] = (double)(0); rawct.ptr.p_int[0] = 1; for(i=0; i<=n-1; i++) { v = ae_v_dotproduct(&a.ptr.pp_double[i][0], 1, &xs.ptr.p_double[0], 1, ae_v_len(0,n-1)); b.ptr.p_double[i] = -v; } /* * Create optimizer, solve */ minqpcreate(n, &state, _state); if( solvertype==0 ) { minqpsetalgobleic(&state, 0.0, 0.0, bleicepsx, 0, _state); } else { if( solvertype==1 ) { minqpsetalgodenseaul(&state, aulepsx, aulrho, aulits, _state); } else { ae_assert(ae_false, "unexpected solver type", _state); } } minqpsetlinearterm(&state, &b, _state); testminqpunit_randomlyselectconvertandsetquadraticterm(&a, n, &state, &rs, _state); minqpsetstartingpoint(&state, &xstart, _state); testminqpunit_randomlysplitandsetlc(&rawc, &rawct, 1, n, &state, &rs, _state); minqpoptimize(&state, _state); minqpresults(&state, &x1, &rep, _state); /* * Test */ if( rep.terminationtype<=0 ) { seterrorflag(errorflag, ae_true, _state); ae_frame_leave(_state); return; } v = ae_v_dotproduct(&xs.ptr.p_double[0], 1, &rawc.ptr.pp_double[0][0], 1, ae_v_len(0,n-1)); if( ae_fp_greater_eq(v,(double)(0)) ) { /* * XS is feasible */ for(i=0; i<=n-1; i++) { seterrorflag(errorflag, ae_fp_greater(ae_fabs(x1.ptr.p_double[i]-xs.ptr.p_double[i], _state),xtol), _state); } } else { /* * XS is infeasible: * * X1 must be approximately feasible * * gradient projection onto c'*x=0 must be 
zero */ v = ae_v_dotproduct(&x1.ptr.p_double[0], 1, &rawc.ptr.pp_double[0][0], 1, ae_v_len(0,n-1)); seterrorflag(errorflag, ae_fp_less(v,-xtol), _state); ae_vector_set_length(&g, n, _state); ae_v_move(&g.ptr.p_double[0], 1, &b.ptr.p_double[0], 1, ae_v_len(0,n-1)); for(i=0; i<=n-1; i++) { v = ae_v_dotproduct(&a.ptr.pp_double[i][0], 1, &x1.ptr.p_double[0], 1, ae_v_len(0,n-1)); g.ptr.p_double[i] = g.ptr.p_double[i]+v; } v = ae_v_dotproduct(&g.ptr.p_double[0], 1, &rawc.ptr.pp_double[0][0], 1, ae_v_len(0,n-1)); ae_v_subd(&g.ptr.p_double[0], 1, &rawc.ptr.pp_double[0][0], 1, ae_v_len(0,n-1), v); v = ae_v_dotproduct(&g.ptr.p_double[0], 1, &g.ptr.p_double[0], 1, ae_v_len(0,n-1)); seterrorflag(errorflag, ae_fp_greater(ae_sqrt(v, _state),gtol), _state); } } } /* * Box constraints vs linear ones: * * N*N SPD A * * optimization problem has form 0.5*x'*A*x-(x1*A)*x, * where x1 is some random vector from [-1,+1] * * K=2*N constraints of the form ai<=x[i] or x[i]<=b[i], * with ai in [-1.0,-0.1], bi in [+0.1,+1.0] * * initial point xstart is from [-1,+2] * * we solve two related QP problems: * a) one with constraints posed as boundary ones * b) another one with same constraints posed as general linear ones * both problems must have same solution. * Here we test that boundary constrained and linear inequality constrained * solvers give same results. */ xtol = 1.0E-5; for(n=1; n<=6; n++) { /* * Generate problem: A, b, x0, XStart, C, CT */ spdmatrixrndcond(n, ae_pow(10.0, 3*ae_randomreal(_state), _state), &a, _state); ae_vector_set_length(&b, n, _state); ae_vector_set_length(&x1, n, _state); ae_vector_set_length(&xstart, n, _state); ae_matrix_set_length(&rawc, 2*n, n+1, _state); ae_vector_set_length(&rawct, 2*n, _state); ae_vector_set_length(&bndl, n, _state); ae_vector_set_length(&bndu, n, _state); for(i=0; i<=n-1; i++) { x1.ptr.p_double[i] = 2*ae_randomreal(_state)-1; xstart.ptr.p_double[i] = 3*ae_randomreal(_state)-1; bndl.ptr.p_double[i] = -(0.1+0.9*ae_randomreal(_state)); bndu.ptr.p_double[i] = 0.1+0.9*ae_randomreal(_state); for(j=0; j<=n-1; j++) { rawc.ptr.pp_double[2*i+0][j] = (double)(0); rawc.ptr.pp_double[2*i+1][j] = (double)(0); } rawc.ptr.pp_double[2*i+0][i] = (double)(1); rawc.ptr.pp_double[2*i+0][n] = bndl.ptr.p_double[i]; rawct.ptr.p_int[2*i+0] = 1; rawc.ptr.pp_double[2*i+1][i] = (double)(1); rawc.ptr.pp_double[2*i+1][n] = bndu.ptr.p_double[i]; rawct.ptr.p_int[2*i+1] = -1; } for(i=0; i<=n-1; i++) { v = ae_v_dotproduct(&a.ptr.pp_double[i][0], 1, &x1.ptr.p_double[0], 1, ae_v_len(0,n-1)); b.ptr.p_double[i] = -v; } /* * Solve linear inequality constrained problem */ minqpcreate(n, &state, _state); if( solvertype==0 ) { minqpsetalgobleic(&state, 0.0, 0.0, bleicepsx, 0, _state); } else { if( solvertype==1 ) { minqpsetalgodenseaul(&state, aulepsx, aulrho, aulits, _state); } else { ae_assert(ae_false, "unexpected solver type", _state); } } minqpsetlinearterm(&state, &b, _state); testminqpunit_randomlyselectconvertandsetquadraticterm(&a, n, &state, &rs, _state); minqpsetstartingpoint(&state, &xstart, _state); testminqpunit_randomlysplitandsetlc(&rawc, &rawct, 2*n, n, &state, &rs, _state); minqpoptimize(&state, _state); minqpresults(&state, &x1, &rep, _state); /* * Solve boundary constrained problem */ minqpcreate(n, &state2, _state); if( solvertype==0 ) { minqpsetalgobleic(&state2, 0.0, 0.0, bleicepsx, 0, _state); } else { if( solvertype==1 ) { minqpsetalgodenseaul(&state2, aulepsx, aulrho, aulits, _state); } else { ae_assert(ae_false, "unexpected solver type", _state); } } minqpsetlinearterm(&state2, 
&b, _state); testminqpunit_randomlyselectconvertandsetquadraticterm(&a, n, &state2, &rs, _state); minqpsetstartingpoint(&state2, &xstart, _state); minqpsetbc(&state2, &bndl, &bndu, _state); minqpoptimize(&state2, _state); minqpresults(&state2, &x2, &rep2, _state); /* * Calculate gradient, check projection */ if( rep.terminationtype<=0||rep2.terminationtype<=0 ) { seterrorflag(errorflag, ae_true, _state); ae_frame_leave(_state); return; } for(i=0; i<=n-1; i++) { seterrorflag(errorflag, ae_fp_less(x1.ptr.p_double[i],bndl.ptr.p_double[i]-xtol), _state); seterrorflag(errorflag, ae_fp_greater(x1.ptr.p_double[i],bndu.ptr.p_double[i]+xtol), _state); seterrorflag(errorflag, ae_fp_greater(ae_fabs(x1.ptr.p_double[i]-x2.ptr.p_double[i], _state),xtol), _state); } } /* * Convex/nonconvex optimization problem with combination of * box and linear constraints: * * * N=2..8 * * f = 0.5*x'*A*x+b'*x * * b has normally distributed entries with scale 10^BScale * * several kinds of A are tried: zero, well conditioned SPD, * well conditioned indefinite, low rank * * box constraints: x[i] in [-1,+1] * * initial point x0 = [0 0 ... 0 0] * * CCnt=min(3,N-1) general linear constraints of form (c,x)=0. * random mix of equality/inequality constraints is tried, moderate * condition number is guaranteed, x0 is guaranteed to be feasible. * * We check that constrained gradient is close to zero at solution. * Inequality constraint is considered active if distance to boundary * is less than TolConstr. We use nonnegative least squares solver * in order to compute constrained gradient. */ for(n=2; n<=8; n++) { for(akind=0; akind<=3; akind++) { for(bscale=1; bscale>=-1; bscale--) { /* * Dense-AUL solver has lower precision on rank-deficient * problems, so we skip AKind=3 */ if( solvertype==1&&akind==3 ) { continue; } /* * Set up tolerances */ if( solvertype==0 ) { tolconstr = 1.0E-8; } else { if( solvertype==1 ) { tolconstr = 1.0E-3; if( akind==3 ) { tolconstr = tolconstr*5; } } else { ae_assert(ae_false, "unexpected solver type", _state); } } gtol = 1.0E-4; /* * Generate A, B and initial point */ ae_matrix_set_length(&a, n, n, _state); ae_vector_set_length(&b, n, _state); ae_vector_set_length(&xstart, n, _state); for(i=0; i<=n-1; i++) { b.ptr.p_double[i] = ae_pow((double)(10), (double)(bscale), _state)*hqrndnormal(&rs, _state); xstart.ptr.p_double[i] = 0.0; } for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { a.ptr.pp_double[i][j] = 0.0; } } if( akind==1 ) { /* * Dense well conditioned SPD */ spdmatrixrndcond(n, 50.0, &a, _state); } if( akind==2 ) { /* * Dense well conditioned indefinite */ smatrixrndcond(n, 50.0, &a, _state); } if( akind==3 ) { /* * Low rank */ ae_vector_set_length(&tmp, n, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { a.ptr.pp_double[i][j] = 0.0; } } for(k=1; k<=ae_minint(3, n-1, _state); k++) { for(i=0; i<=n-1; i++) { tmp.ptr.p_double[i] = hqrndnormal(&rs, _state); } v = hqrndnormal(&rs, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { a.ptr.pp_double[i][j] = a.ptr.pp_double[i][j]+v*tmp.ptr.p_double[i]*tmp.ptr.p_double[j]; } } } } /* * Generate constraints */ ae_vector_set_length(&bndl, n, _state); ae_vector_set_length(&bndu, n, _state); for(i=0; i<=n-1; i++) { bndl.ptr.p_double[i] = -1.0; bndu.ptr.p_double[i] = 1.0; } rawccnt = ae_minint(3, n-1, _state); ae_matrix_set_length(&rawc, rawccnt, n+1, _state); ae_vector_set_length(&rawct, rawccnt, _state); do { for(i=0; i<=rawccnt-1; i++) { rawct.ptr.p_int[i] = hqrnduniformi(&rs, 3, _state)-1; rawc.ptr.pp_double[i][n] = 0.0; for(j=0; j<=n-1; 
j++) { rawc.ptr.pp_double[i][j] = hqrnduniformr(&rs, _state)-0.5; } } } while(ae_fp_less_eq(testminqpunit_getconstraintrcond(&rawc, rawccnt, n, _state),0.01)); /* * Create and optimize */ minqpcreate(n, &state, _state); minqpsetstartingpoint(&state, &xstart, _state); if( solvertype==0 ) { minqpsetalgobleic(&state, 0.0, 0.0, bleicepsx, 0, _state); } else { if( solvertype==1 ) { minqpsetalgodenseaul(&state, aulepsx, aulrho, aulits, _state); } else { ae_assert(ae_false, "unexpected solver type", _state); } } minqpsetbc(&state, &bndl, &bndu, _state); testminqpunit_randomlysplitandsetlc(&rawc, &rawct, rawccnt, n, &state, &rs, _state); minqpsetlinearterm(&state, &b, _state); testminqpunit_randomlyselectconvertandsetquadraticterm(&a, n, &state, &rs, _state); minqpoptimize(&state, _state); minqpresults(&state, &x1, &rep, _state); seterrorflag(errorflag, rep.terminationtype<=0, _state); if( *errorflag ) { ae_frame_leave(_state); return; } /* * 1. evaluate unconstrained gradient at solution * * 2. calculate constrained gradient (NNLS solver is used * to evaluate gradient subject to active constraints). * In order to do this we form CE matrix, matrix of active * constraints (columns store constraint vectors). Then * we try to approximate gradient vector by columns of CE, * subject to non-negativity restriction placed on variables * corresponding to inequality constraints. * * Residual from such regression is a constrained gradient vector. */ ae_vector_set_length(&g, n, _state); for(i=0; i<=n-1; i++) { v = ae_v_dotproduct(&a.ptr.pp_double[i][0], 1, &x1.ptr.p_double[0], 1, ae_v_len(0,n-1)); g.ptr.p_double[i] = v+b.ptr.p_double[i]; } ae_matrix_set_length(&ce, n, n+rawccnt, _state); ae_vector_set_length(&nonnegative, n+rawccnt, _state); k = 0; for(i=0; i<=n-1; i++) { seterrorflag(errorflag, ae_fp_less(x1.ptr.p_double[i],bndl.ptr.p_double[i]), _state); seterrorflag(errorflag, ae_fp_greater(x1.ptr.p_double[i],bndu.ptr.p_double[i]), _state); if( ae_fp_less_eq(x1.ptr.p_double[i],bndl.ptr.p_double[i]+tolconstr) ) { for(j=0; j<=n-1; j++) { ce.ptr.pp_double[j][k] = 0.0; } ce.ptr.pp_double[i][k] = 1.0; nonnegative.ptr.p_bool[k] = ae_true; inc(&k, _state); continue; } if( ae_fp_greater_eq(x1.ptr.p_double[i],bndu.ptr.p_double[i]-tolconstr) ) { for(j=0; j<=n-1; j++) { ce.ptr.pp_double[j][k] = 0.0; } ce.ptr.pp_double[i][k] = -1.0; nonnegative.ptr.p_bool[k] = ae_true; inc(&k, _state); continue; } } for(i=0; i<=rawccnt-1; i++) { v = ae_v_dotproduct(&rawc.ptr.pp_double[i][0], 1, &x1.ptr.p_double[0], 1, ae_v_len(0,n-1)); v = v-rawc.ptr.pp_double[i][n]; seterrorflag(errorflag, rawct.ptr.p_int[i]==0&&ae_fp_greater(ae_fabs(v, _state),tolconstr), _state); seterrorflag(errorflag, rawct.ptr.p_int[i]>0&&ae_fp_less(v,-tolconstr), _state); seterrorflag(errorflag, rawct.ptr.p_int[i]<0&&ae_fp_greater(v,tolconstr), _state); if( rawct.ptr.p_int[i]==0 ) { for(j=0; j<=n-1; j++) { ce.ptr.pp_double[j][k] = rawc.ptr.pp_double[i][j]; } nonnegative.ptr.p_bool[k] = ae_false; inc(&k, _state); continue; } if( (rawct.ptr.p_int[i]>0&&ae_fp_less_eq(v,tolconstr))||(rawct.ptr.p_int[i]<0&&ae_fp_greater_eq(v,-tolconstr)) ) { for(j=0; j<=n-1; j++) { ce.ptr.pp_double[j][k] = ae_sign((double)(rawct.ptr.p_int[i]), _state)*rawc.ptr.pp_double[i][j]; } nonnegative.ptr.p_bool[k] = ae_true; inc(&k, _state); continue; } } snnlsinit(0, 0, 0, &nnls, _state); snnlssetproblem(&nnls, &ce, &g, 0, k, n, _state); for(i=0; i<=k-1; i++) { if( !nonnegative.ptr.p_bool[i] ) { snnlsdropnnc(&nnls, i, _state); } } snnlssolve(&nnls, &tmp, _state); for(i=0; i<=k-1; i++) { 
for(j=0; j<=n-1; j++) { g.ptr.p_double[j] = g.ptr.p_double[j]-tmp.ptr.p_double[i]*ce.ptr.pp_double[j][i]; } } vv = ae_v_dotproduct(&g.ptr.p_double[0], 1, &g.ptr.p_double[0], 1, ae_v_len(0,n-1)); vv = ae_sqrt(vv, _state); seterrorflag(errorflag, ae_fp_greater(vv,gtol), _state); } } } /* * Boundary and linear equality/inequality constrained QP problem, * test for correct handling of non-zero XOrigin: * * N*N SPD A with moderate condtion number (up to 100) * * boundary constraints 0<=x[i]<=1 * * K= C*x0, where C is N*N well conditioned * matrix, x0 is some random vector [-1,+1] * * optimization problem has form 0.5*x'*A*x-b'*x, * where b is random vector from [-1,+1] * * using duality, we can obtain solution of QP problem as follows: * a) intermediate problem min(0.5*y'*B*y + d'*y) s.t. y>=0 * is solved, where B = C*inv(A)*C', d = -(C*inv(A)*b + C*x0) * b) after we got dual solution ys, we calculate primal solution * xs = inv(A)*(C'*ys-b) */ for(n=1; n<=6; n++) { /* * Set up tolerances */ if( solvertype==0 ) { xtol = 1.0E-5; } else { if( solvertype==1 ) { xtol = 1.0E-3; } else { ae_assert(ae_false, "unexpected solver type", _state); } } /* * Generate problem */ ae_vector_set_length(&da, n, _state); ae_matrix_set_length(&a, n, n, _state); rmatrixrndcond(n, ae_pow(10.0, 2*ae_randomreal(_state), _state), &t2, _state); ae_vector_set_length(&b, n, _state); ae_vector_set_length(&x0, n, _state); ae_vector_set_length(&xstart, n, _state); ae_matrix_set_length(&rawc, n, n+1, _state); ae_vector_set_length(&rawct, n, _state); for(i=0; i<=n-1; i++) { da.ptr.p_double[i] = ae_exp(6*ae_randomreal(_state)-3, _state); for(j=0; j<=n-1; j++) { a.ptr.pp_double[i][j] = (double)(0); } a.ptr.pp_double[i][i] = da.ptr.p_double[i]; } for(i=0; i<=n-1; i++) { x0.ptr.p_double[i] = 2*ae_randomreal(_state)-1; b.ptr.p_double[i] = 2*ae_randomreal(_state)-1; xstart.ptr.p_double[i] = 2*ae_randomreal(_state)-1; } for(i=0; i<=n-1; i++) { ae_v_move(&rawc.ptr.pp_double[i][0], 1, &t2.ptr.pp_double[i][0], 1, ae_v_len(0,n-1)); v = ae_v_dotproduct(&rawc.ptr.pp_double[i][0], 1, &x0.ptr.p_double[0], 1, ae_v_len(0,n-1)); rawc.ptr.pp_double[i][n] = v; rawct.ptr.p_int[i] = 1; } /* * Solve primal problem, check feasibility */ minqpcreate(n, &state, _state); if( solvertype==0 ) { minqpsetalgobleic(&state, 0.0, 0.0, bleicepsx, 0, _state); } else { if( solvertype==1 ) { minqpsetalgodenseaul(&state, aulepsx, aulrho, aulits, _state); } else { ae_assert(ae_false, "unexpected solver type", _state); } } minqpsetlinearterm(&state, &b, _state); testminqpunit_randomlyselectconvertandsetquadraticterm(&a, n, &state, &rs, _state); minqpsetstartingpoint(&state, &xstart, _state); minqpsetlc(&state, &rawc, &rawct, n, _state); minqpoptimize(&state, _state); minqpresults(&state, &x1, &rep, _state); if( rep.terminationtype<=0 ) { seterrorflag(errorflag, ae_true, _state); continue; } for(i=0; i<=n-1; i++) { v = ae_v_dotproduct(&x1.ptr.p_double[0], 1, &rawc.ptr.pp_double[i][0], 1, ae_v_len(0,n-1)); seterrorflag(errorflag, ae_fp_less(v,rawc.ptr.pp_double[i][n]-xtol), _state); } /* * Generate dual problem: * * A2 stores new quadratic term * * B2 stores new linear term * * BndL/BndU store boundary constraints */ ae_matrix_set_length(&t3, n, n, _state); ae_matrix_set_length(&a2, n, n, _state); rmatrixtranspose(n, n, &rawc, 0, 0, &t3, 0, 0, _state); for(i=0; i<=n-1; i++) { v = 1/ae_sqrt(da.ptr.p_double[i], _state); ae_v_muld(&t3.ptr.pp_double[i][0], 1, ae_v_len(0,n-1), v); } rmatrixsyrk(n, n, 1.0, &t3, 0, 0, 2, 0.0, &a2, 0, 0, ae_true, _state); 
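/*
 * Sketch of the dual construction performed here: with A=diag(DA) the
 * Lagrangian of  min 0.5*x'*A*x+b'*x  s.t.  C*x>=C*x0  is
 *     L(x,y) = 0.5*x'*A*x + b'*x - y'*(C*x-C*x0),   y>=0,
 * and dL/dx=0 gives x = inv(A)*(C'*y-b). Substituting x back and dropping
 * the constant term leaves the dual  min 0.5*y'*B*y + d'*y  s.t. y>=0  with
 *     B = C*inv(A)*C',   d = -(C*inv(A)*b + C*x0).
 * Since inv(A)=diag(1/DA), B is formed above as T3'*T3 with
 * T3 = diag(1/sqrt(DA))*C' (the rmatrixsyrk() call), and d is assembled
 * below in B2.
 */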
ae_vector_set_length(&tmp0, n, _state); ae_v_move(&tmp0.ptr.p_double[0], 1, &b.ptr.p_double[0], 1, ae_v_len(0,n-1)); for(i=0; i<=n-1; i++) { tmp0.ptr.p_double[i] = tmp0.ptr.p_double[i]/da.ptr.p_double[i]; } ae_vector_set_length(&b2, n, _state); for(i=0; i<=n-1; i++) { v = ae_v_dotproduct(&rawc.ptr.pp_double[i][0], 1, &tmp0.ptr.p_double[0], 1, ae_v_len(0,n-1)); b2.ptr.p_double[i] = -(v+rawc.ptr.pp_double[i][n]); } ae_vector_set_length(&bndl, n, _state); ae_vector_set_length(&bndu, n, _state); for(i=0; i<=n-1; i++) { bndl.ptr.p_double[i] = 0.0; bndu.ptr.p_double[i] = _state->v_posinf; } minqpcreate(n, &state2, _state); minqpsetalgoquickqp(&state2, 0.0, 0.0, 1.0E-9, 0, ae_true, _state); minqpsetlinearterm(&state2, &b2, _state); minqpsetquadraticterm(&state2, &a2, ae_true, _state); minqpsetbc(&state2, &bndl, &bndu, _state); minqpoptimize(&state2, _state); minqpresults(&state2, &x2, &rep2, _state); if( rep2.terminationtype<=0 ) { seterrorflag(errorflag, ae_true, _state); continue; } for(i=0; i<=n-1; i++) { v = ae_v_dotproduct(&rawc.ptr.pp_double[0][i], rawc.stride, &x2.ptr.p_double[0], 1, ae_v_len(0,n-1)); tmp0.ptr.p_double[i] = v-b.ptr.p_double[i]; } for(i=0; i<=n-1; i++) { tmp0.ptr.p_double[i] = tmp0.ptr.p_double[i]/da.ptr.p_double[i]; } for(i=0; i<=n-1; i++) { seterrorflag(errorflag, ae_fp_greater(ae_fabs(tmp0.ptr.p_double[i]-x1.ptr.p_double[i], _state),xtol*ae_maxreal(ae_fabs(tmp0.ptr.p_double[i], _state), 1.0, _state)), _state); } } /* * Boundary and linear equality/inequality constrained QP problem with * excessive constraints: * * N*N SPD A with moderate condtion number (up to 100) * * boundary constraints 0<=x[i]<=1 * * K=2*N equality/inequality constraints Q*x = Q*x0, where Q is random matrix, * x0 is some random vector from the feasible hypercube (0.1<=x0[i]<=0.9) * * optimization problem has form 0.5*x'*A*x-b*x, * where b is some random vector * * because constraints are excessive, the main problem is to find * feasible point; usually, the only existing feasible point is solution, * so we have to check only feasibility */ for(n=1; n<=6; n++) { /* * Set up tolerances */ if( solvertype==0 ) { xtol = 1.0E-5; } else { if( solvertype==1 ) { xtol = 5.0E-3; } else { ae_assert(ae_false, "unexpected solver type", _state); } } /* * Generate problem: A, b, BndL, BndU, CMatrix, x0, x1, XStart */ k = 2*n; spdmatrixrndcond(n, ae_pow(10.0, 3*ae_randomreal(_state), _state), &a, _state); ae_vector_set_length(&b, n, _state); ae_vector_set_length(&bndl, n, _state); ae_vector_set_length(&bndu, n, _state); ae_vector_set_length(&x0, n, _state); ae_vector_set_length(&x1, n, _state); ae_vector_set_length(&xstart, n, _state); ae_matrix_set_length(&rawc, k, n+1, _state); ae_vector_set_length(&rawct, k, _state); for(i=0; i<=n-1; i++) { x0.ptr.p_double[i] = 0.1+0.8*ae_randomreal(_state); x1.ptr.p_double[i] = 2*ae_randomreal(_state)-1; bndl.ptr.p_double[i] = 0.0; bndu.ptr.p_double[i] = 1.0; xstart.ptr.p_double[i] = (double)(ae_randominteger(2, _state)); } for(i=0; i<=k-1; i++) { for(j=0; j<=n-1; j++) { rawc.ptr.pp_double[i][j] = 2*ae_randomreal(_state)-1; } v = ae_v_dotproduct(&rawc.ptr.pp_double[i][0], 1, &x0.ptr.p_double[0], 1, ae_v_len(0,n-1)); rawct.ptr.p_int[i] = ae_randominteger(3, _state)-1; if( rawct.ptr.p_int[i]==0 ) { rawc.ptr.pp_double[i][n] = v; } if( rawct.ptr.p_int[i]>0 ) { rawc.ptr.pp_double[i][n] = v-50*xtol; } if( rawct.ptr.p_int[i]<0 ) { rawc.ptr.pp_double[i][n] = v+50*xtol; } } for(i=0; i<=n-1; i++) { b.ptr.p_double[i] = 2*ae_randomreal(_state)-1; } /* * Create optimizer, solve */ 
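/*
 * (the +-50*xtol offsets applied to the right-hand sides above keep X0
 * strictly feasible with respect to every inequality row and exactly
 * feasible with respect to the equality rows, so a feasible point is
 * guaranteed to exist even though the constraint set is excessive)
 */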
minqpcreate(n, &state, _state); if( solvertype==0 ) { minqpsetalgobleic(&state, 0.0, 0.0, bleicepsx, 0, _state); } else { if( solvertype==1 ) { minqpsetalgodenseaul(&state, aulepsx, aulrho, aulits, _state); } else { ae_assert(ae_false, "unexpected solver type", _state); } } minqpsetlinearterm(&state, &b, _state); testminqpunit_randomlyselectconvertandsetquadraticterm(&a, n, &state, &rs, _state); minqpsetstartingpoint(&state, &xstart, _state); minqpsetbc(&state, &bndl, &bndu, _state); testminqpunit_randomlysplitandsetlc(&rawc, &rawct, k, n, &state, &rs, _state); minqpoptimize(&state, _state); minqpresults(&state, &x1, &rep, _state); /* * Check feasibility properties of the solution */ if( rep.terminationtype<=0 ) { seterrorflag(errorflag, ae_true, _state); continue; } for(i=0; i<=k-1; i++) { v = ae_v_dotproduct(&x1.ptr.p_double[0], 1, &rawc.ptr.pp_double[i][0], 1, ae_v_len(0,n-1)); if( rawct.ptr.p_int[i]==0 ) { seterrorflag(errorflag, ae_fp_greater(ae_fabs(v-rawc.ptr.pp_double[i][n], _state),xtol), _state); } if( rawct.ptr.p_int[i]>0 ) { seterrorflag(errorflag, ae_fp_less(v,rawc.ptr.pp_double[i][n]-xtol), _state); } if( rawct.ptr.p_int[i]<0 ) { seterrorflag(errorflag, ae_fp_greater(v,rawc.ptr.pp_double[i][n]+xtol), _state); } } } /* * Boundary constraints posed as general linear ones: * * no bound constraints * * 2*N linear constraints 0 <= x[i] <= 1 * * F(x) = |x-x0|^P, where P={2,4} and x0 is randomly selected from [-1,+2]^N * * with such simple constraints and function it is easy to find * analytic form of solution: S[i] = bound(x0[i], 0, 1). * * however, we can't guarantee that solution is strictly feasible * with respect to nonlinearity constraint, so we check * for approximate feasibility. */ for(n=1; n<=5; n++) { /* * Set up tolerances */ if( solvertype==0 ) { xtol = 1.0E-3; } else { if( solvertype==1 ) { xtol = 1.0E-3; } else { ae_assert(ae_false, "unexpected solver type", _state); } } /* * Generate X, BL, BU. 
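* NOTE: A is set to the identity and B to -X0 below, so the objective equals
* 0.5*|x-x0|^2 minus a constant (the P=2 case of the description above), and
* its minimizer over 0<=x[i]<=1 is the componentwise clamp bound(x0[i],0,1).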
*/ ae_matrix_set_length(&a, n, n, _state); ae_vector_set_length(&b, n, _state); ae_vector_set_length(&xstart, n, _state); ae_vector_set_length(&x0, n, _state); ae_matrix_set_length(&rawc, 2*n, n+1, _state); ae_vector_set_length(&rawct, 2*n, _state); k = 2*n; for(i=0; i<=n-1; i++) { xstart.ptr.p_double[i] = ae_randomreal(_state); x0.ptr.p_double[i] = 3*ae_randomreal(_state)-1; b.ptr.p_double[i] = -x0.ptr.p_double[i]; for(j=0; j<=n; j++) { rawc.ptr.pp_double[2*i+0][j] = (double)(0); rawc.ptr.pp_double[2*i+1][j] = (double)(0); } rawc.ptr.pp_double[2*i+0][i] = (double)(1); rawc.ptr.pp_double[2*i+0][n] = (double)(0); rawct.ptr.p_int[2*i+0] = 1; rawc.ptr.pp_double[2*i+1][i] = (double)(1); rawc.ptr.pp_double[2*i+1][n] = (double)(1); rawct.ptr.p_int[2*i+1] = -1; } for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { if( i==j ) { a.ptr.pp_double[i][j] = (double)(1); } else { a.ptr.pp_double[i][j] = (double)(0); } } } /* * Create and optimize */ minqpcreate(n, &state, _state); if( solvertype==0 ) { minqpsetalgobleic(&state, 0.0, 0.0, bleicepsx, 0, _state); } else { if( solvertype==1 ) { minqpsetalgodenseaul(&state, aulepsx, aulrho, aulits, _state); } else { ae_assert(ae_false, "unexpected solver type", _state); } } testminqpunit_randomlysplitandsetlc(&rawc, &rawct, k, n, &state, &rs, _state); minqpsetlinearterm(&state, &b, _state); testminqpunit_randomlyselectconvertandsetquadraticterm(&a, n, &state, &rs, _state); minqpsetstartingpoint(&state, &xstart, _state); minqpoptimize(&state, _state); minqpresults(&state, &x1, &rep, _state); if( rep.terminationtype<=0 ) { seterrorflag(errorflag, ae_true, _state); continue; } /* * * compare solution with analytic one * * check feasibility */ for(i=0; i<=n-1; i++) { seterrorflag(errorflag, ae_fp_greater(ae_fabs(x1.ptr.p_double[i]-boundval(x0.ptr.p_double[i], 0.0, 1.0, _state), _state),xtol), _state); seterrorflag(errorflag, ae_fp_less(x1.ptr.p_double[i],0.0-xtol), _state); seterrorflag(errorflag, ae_fp_greater(x1.ptr.p_double[i],1.0+xtol), _state); } } /* * Convex/nonconvex optimization problem with excessive * (degenerate constraints): * * * N=2..5 * * f = 0.5*x'*A*x+b'*x * * b has normally distributed entries * * A is diagonal with log-normally distributed entries * * box constraints: x[i] in [-1,+1] * * 2^N "excessive" general linear constraints (v_k,x)<=(v_k,v_k)+v_shift, * where v_k is one of 2^N vertices of feasible hypercube, v_shift is * a shift parameter: * * with zero v_shift such constraints are degenerate (each vertex has * N box constraints and one "redundant" linear constraint) * * with positive v_shift linear constraint is always inactive * * with small (about machine epsilon) but negative v_shift, * constraint is close to degenerate - but not exactly * * Because A is diagonal, we can easily find out solution analytically. 
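* (with a[i][i]>0 the unconstrained minimizer is x[i]=-b[i]/a[i][i]; whenever
* the vertex rows stay inactive, the exact solution is simply its
* componentwise clamp to [-1,+1])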
* * NOTE: TolConstr must be large enough so it won't conflict with * perturbation introduced by v_shift */ for(n=2; n<=8; n++) { for(shiftkind=-5; shiftkind<=1; shiftkind++) { /* * Set up tolerances */ if( solvertype==0 ) { tolconstr = 1.0E-6; gtol = 1.0E-6; } else { if( solvertype==1 ) { tolconstr = 1.0E-4; gtol = 1.0E-4; } else { ae_assert(ae_false, "unexpected solver type", _state); } } /* * Generate A, B and initial point */ ae_matrix_set_length(&a, n, n, _state); ae_vector_set_length(&b, n, _state); ae_vector_set_length(&xstart, n, _state); for(i=0; i<=n-1; i++) { b.ptr.p_double[i] = hqrndnormal(&rs, _state); xstart.ptr.p_double[i] = hqrnduniformr(&rs, _state)-0.5; } for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { a.ptr.pp_double[i][j] = 0.0; } a.ptr.pp_double[i][i] = ae_pow((double)(2), hqrndnormal(&rs, _state), _state); } /* * Generate constraints */ ae_vector_set_length(&bndl, n, _state); ae_vector_set_length(&bndu, n, _state); for(i=0; i<=n-1; i++) { bndl.ptr.p_double[i] = -1.0; bndu.ptr.p_double[i] = 1.0; } rawccnt = ae_round(ae_pow((double)(2), (double)(n), _state), _state); ae_matrix_set_length(&rawc, rawccnt, n+1, _state); ae_vector_set_length(&rawct, rawccnt, _state); for(i=0; i<=rawccnt-1; i++) { rawct.ptr.p_int[i] = -1; k = i; rawc.ptr.pp_double[i][n] = ae_sign((double)(shiftkind), _state)*ae_pow((double)(10), ae_fabs((double)(shiftkind), _state), _state)*ae_machineepsilon; for(j=0; j<=n-1; j++) { rawc.ptr.pp_double[i][j] = (double)(2*(k%2)-1); rawc.ptr.pp_double[i][n] = rawc.ptr.pp_double[i][n]+rawc.ptr.pp_double[i][j]*rawc.ptr.pp_double[i][j]; k = k/2; } } /* * Create and optimize */ minqpcreate(n, &state, _state); minqpsetstartingpoint(&state, &xstart, _state); if( solvertype==0 ) { minqpsetalgobleic(&state, 0.0, 0.0, bleicepsx, 0, _state); } else { if( solvertype==1 ) { minqpsetalgodenseaul(&state, aulepsx, aulrho, aulits, _state); } else { ae_assert(ae_false, "unexpected solver type", _state); } } minqpsetbc(&state, &bndl, &bndu, _state); testminqpunit_randomlysplitandsetlc(&rawc, &rawct, rawccnt, n, &state, &rs, _state); minqpsetlinearterm(&state, &b, _state); testminqpunit_randomlyselectconvertandsetquadraticterm(&a, n, &state, &rs, _state); minqpoptimize(&state, _state); minqpresults(&state, &x1, &rep, _state); seterrorflag(errorflag, rep.terminationtype<=0, _state); if( *errorflag ) { ae_frame_leave(_state); return; } /* * Evaluate gradient at solution and test */ vv = 0.0; for(i=0; i<=n-1; i++) { v = ae_v_dotproduct(&a.ptr.pp_double[i][0], 1, &x1.ptr.p_double[0], 1, ae_v_len(0,n-1)); v = v+b.ptr.p_double[i]; if( ae_fp_less_eq(x1.ptr.p_double[i],bndl.ptr.p_double[i]+tolconstr)&&ae_fp_greater(v,(double)(0)) ) { v = 0.0; } if( ae_fp_greater_eq(x1.ptr.p_double[i],bndu.ptr.p_double[i]-tolconstr)&&ae_fp_less(v,(double)(0)) ) { v = 0.0; } vv = vv+ae_sqr(v, _state); } vv = ae_sqrt(vv, _state); seterrorflag(errorflag, ae_fp_greater(vv,gtol), _state); } } } /* * Large-scale tests: a few selected tests for large N's */ for(solvertype=0; solvertype<=1; solvertype++) { /* * General equality constrained problem: * * N*N SPD (non-diagonal) A with moderate condition number * * no box constraints * * K= C*x0, where C is N*N well conditioned * matrix, x0 is some random vector [-1,+1] * * optimization problem has form 0.5*x'*A*x-b'*x, * where b is random vector from [-1,+1] * * using duality, we can obtain solution of QP problem as follows: * a) intermediate problem min(0.5*y'*B*y + d'*y) s.t. 
y>=0 * is solved, where B = C*inv(A)*C', d = -(C*inv(A)*b + C*x0) * b) after we got dual solution ys, we calculate primal solution * xs = inv(A)*(C'*ys-b) */ n = 60; rawccnt = 40; if( solvertype==0 ) { xtol = 1.0E-3; } else { if( solvertype==1 ) { xtol = 1.0E-3; } else { ae_assert(ae_false, "unexpected solver type", _state); } } ae_vector_set_length(&da, n, _state); ae_matrix_set_length(&a, n, n, _state); rmatrixrndcond(n, ae_pow(10.0, 2*ae_randomreal(_state), _state), &t2, _state); ae_vector_set_length(&b, n, _state); ae_vector_set_length(&x0, n, _state); ae_vector_set_length(&xstart, n, _state); ae_matrix_set_length(&rawc, rawccnt, n+1, _state); ae_vector_set_length(&rawct, rawccnt, _state); for(i=0; i<=n-1; i++) { da.ptr.p_double[i] = ae_exp(8*ae_randomreal(_state)-4, _state); for(j=0; j<=n-1; j++) { a.ptr.pp_double[i][j] = (double)(0); } a.ptr.pp_double[i][i] = da.ptr.p_double[i]; } for(i=0; i<=n-1; i++) { x0.ptr.p_double[i] = 2*ae_randomreal(_state)-1; b.ptr.p_double[i] = 2*ae_randomreal(_state)-1; xstart.ptr.p_double[i] = 2*ae_randomreal(_state)-1; } for(i=0; i<=rawccnt-1; i++) { ae_v_move(&rawc.ptr.pp_double[i][0], 1, &t2.ptr.pp_double[i][0], 1, ae_v_len(0,n-1)); v = ae_v_dotproduct(&rawc.ptr.pp_double[i][0], 1, &x0.ptr.p_double[0], 1, ae_v_len(0,n-1)); rawc.ptr.pp_double[i][n] = v; rawct.ptr.p_int[i] = 1; } minqpcreate(n, &state, _state); if( solvertype==0 ) { minqpsetalgobleic(&state, 0.0, 0.0, bleicepsx, 0, _state); } else { if( solvertype==1 ) { minqpsetalgodenseaul(&state, aulepsx, aulrho, aulits, _state); } else { ae_assert(ae_false, "unexpected solver type", _state); } } minqpsetlinearterm(&state, &b, _state); testminqpunit_randomlyselectconvertandsetquadraticterm(&a, n, &state, &rs, _state); minqpsetstartingpoint(&state, &xstart, _state); testminqpunit_randomlysplitandsetlc(&rawc, &rawct, rawccnt, n, &state, &rs, _state); minqpoptimize(&state, _state); minqpresults(&state, &x1, &rep, _state); if( rep.terminationtype<=0 ) { seterrorflag(errorflag, ae_true, _state); continue; } for(i=0; i<=rawccnt-1; i++) { v = ae_v_dotproduct(&x1.ptr.p_double[0], 1, &rawc.ptr.pp_double[i][0], 1, ae_v_len(0,n-1)); seterrorflag(errorflag, ae_fp_less(v,rawc.ptr.pp_double[i][n]-xtol), _state); } ae_matrix_set_length(&t3, n, rawccnt, _state); ae_matrix_set_length(&a2, rawccnt, rawccnt, _state); rmatrixtranspose(rawccnt, n, &rawc, 0, 0, &t3, 0, 0, _state); for(i=0; i<=n-1; i++) { v = 1/ae_sqrt(da.ptr.p_double[i], _state); ae_v_muld(&t3.ptr.pp_double[i][0], 1, ae_v_len(0,rawccnt-1), v); } rmatrixsyrk(rawccnt, n, 1.0, &t3, 0, 0, 2, 0.0, &a2, 0, 0, ae_true, _state); ae_vector_set_length(&tmp0, n, _state); ae_v_move(&tmp0.ptr.p_double[0], 1, &b.ptr.p_double[0], 1, ae_v_len(0,n-1)); for(i=0; i<=n-1; i++) { tmp0.ptr.p_double[i] = tmp0.ptr.p_double[i]/da.ptr.p_double[i]; } ae_vector_set_length(&b2, rawccnt, _state); for(i=0; i<=rawccnt-1; i++) { v = ae_v_dotproduct(&rawc.ptr.pp_double[i][0], 1, &tmp0.ptr.p_double[0], 1, ae_v_len(0,n-1)); b2.ptr.p_double[i] = -(v+rawc.ptr.pp_double[i][n]); } ae_vector_set_length(&bndl, rawccnt, _state); ae_vector_set_length(&bndu, rawccnt, _state); for(i=0; i<=rawccnt-1; i++) { bndl.ptr.p_double[i] = 0.0; bndu.ptr.p_double[i] = _state->v_posinf; } minqpcreate(rawccnt, &state2, _state); minqpsetalgoquickqp(&state2, 0.0, 0.0, 1.0E-9, 0, ae_true, _state); minqpsetlinearterm(&state2, &b2, _state); minqpsetquadraticterm(&state2, &a2, ae_true, _state); minqpsetbc(&state2, &bndl, &bndu, _state); minqpoptimize(&state2, _state); minqpresults(&state2, &x2, &rep2, _state); if( 
rep2.terminationtype<=0 ) { seterrorflag(errorflag, ae_true, _state); continue; } for(i=0; i<=n-1; i++) { v = ae_v_dotproduct(&rawc.ptr.pp_double[0][i], rawc.stride, &x2.ptr.p_double[0], 1, ae_v_len(0,rawccnt-1)); tmp0.ptr.p_double[i] = v-b.ptr.p_double[i]; } for(i=0; i<=n-1; i++) { tmp0.ptr.p_double[i] = tmp0.ptr.p_double[i]/da.ptr.p_double[i]; } for(i=0; i<=n-1; i++) { seterrorflag(errorflag, ae_fp_greater(ae_fabs(tmp0.ptr.p_double[i]-x1.ptr.p_double[i], _state),xtol*ae_maxreal(ae_fabs(tmp0.ptr.p_double[i], _state), 1.0, _state)), _state); } } ae_frame_leave(_state); } /************************************************************************* This function tests special inequality constrained QP problems. Returns True on errors. *************************************************************************/ static ae_bool testminqpunit_specialicqptests(ae_state *_state) { ae_frame _frame_block; ae_matrix a; ae_matrix c; ae_vector xstart; ae_vector xend; ae_vector xexact; ae_vector b; ae_vector bndl; ae_vector bndu; ae_vector ct; minqpstate state; minqpreport rep; ae_bool waserrors; ae_int_t i; ae_int_t j; ae_bool result; ae_frame_make(_state, &_frame_block); ae_matrix_init(&a, 0, 0, DT_REAL, _state); ae_matrix_init(&c, 0, 0, DT_REAL, _state); ae_vector_init(&xstart, 0, DT_REAL, _state); ae_vector_init(&xend, 0, DT_REAL, _state); ae_vector_init(&xexact, 0, DT_REAL, _state); ae_vector_init(&b, 0, DT_REAL, _state); ae_vector_init(&bndl, 0, DT_REAL, _state); ae_vector_init(&bndu, 0, DT_REAL, _state); ae_vector_init(&ct, 0, DT_INT, _state); _minqpstate_init(&state, _state); _minqpreport_init(&rep, _state); waserrors = ae_false; /* * Test 1: reported by Vanderlande Industries. * Tests algorithm ability to handle degenerate constraints. */ ae_matrix_set_length(&a, 3, 3, _state); for(i=0; i<=2; i++) { for(j=0; j<=2; j++) { a.ptr.pp_double[i][j] = (double)(0); } } for(i=0; i<=2; i++) { a.ptr.pp_double[i][i] = (double)(1); } ae_vector_set_length(&b, 3, _state); b.ptr.p_double[0] = (double)(-50); b.ptr.p_double[1] = (double)(-50); b.ptr.p_double[2] = (double)(-75); ae_vector_set_length(&bndl, 3, _state); bndl.ptr.p_double[0] = (double)(0); bndl.ptr.p_double[1] = (double)(0); bndl.ptr.p_double[2] = (double)(0); ae_vector_set_length(&bndu, 3, _state); bndu.ptr.p_double[0] = (double)(100); bndu.ptr.p_double[1] = (double)(100); bndu.ptr.p_double[2] = (double)(150); ae_vector_set_length(&xstart, 3, _state); xstart.ptr.p_double[0] = (double)(0); xstart.ptr.p_double[1] = (double)(100); xstart.ptr.p_double[2] = (double)(0); ae_vector_set_length(&xexact, 3, _state); xexact.ptr.p_double[0] = (double)(0); xexact.ptr.p_double[1] = (double)(100); xexact.ptr.p_double[2] = (double)(50); ae_matrix_set_length(&c, 3, 4, _state); c.ptr.pp_double[0][0] = (double)(1); c.ptr.pp_double[0][1] = (double)(-1); c.ptr.pp_double[0][2] = (double)(0); c.ptr.pp_double[0][3] = (double)(-100); c.ptr.pp_double[1][0] = (double)(1); c.ptr.pp_double[1][1] = (double)(0); c.ptr.pp_double[1][2] = (double)(-1); c.ptr.pp_double[1][3] = (double)(0); c.ptr.pp_double[2][0] = (double)(-1); c.ptr.pp_double[2][1] = (double)(0); c.ptr.pp_double[2][2] = (double)(1); c.ptr.pp_double[2][3] = (double)(50); ae_vector_set_length(&ct, 3, _state); ct.ptr.p_int[0] = -1; ct.ptr.p_int[1] = -1; ct.ptr.p_int[2] = -1; minqpcreate(3, &state, _state); testminqpunit_setrandomalgoconvexlc(&state, _state); minqpsetlinearterm(&state, &b, _state); minqpsetquadraticterm(&state, &a, ae_true, _state); minqpsetstartingpoint(&state, &xstart, _state); minqpsetbc(&state, &bndl, 
&bndu, _state); minqpsetlc(&state, &c, &ct, 3, _state); minqpoptimize(&state, _state); minqpresults(&state, &xend, &rep, _state); if( rep.terminationtype>0 ) { for(i=0; i<=2; i++) { waserrors = waserrors||ae_fp_greater(ae_fabs(xend.ptr.p_double[i]-xexact.ptr.p_double[i], _state),1.0E6*ae_machineepsilon); } } else { waserrors = ae_true; } /* * Test 2: reported by Vanderlande Industries. * Tests algorithm ability to handle degenerate constraints. */ ae_matrix_set_length(&a, 3, 3, _state); for(i=0; i<=2; i++) { for(j=0; j<=2; j++) { a.ptr.pp_double[i][j] = (double)(0); } } for(i=0; i<=2; i++) { a.ptr.pp_double[i][i] = (double)(1); } ae_vector_set_length(&b, 3, _state); b.ptr.p_double[0] = (double)(-50); b.ptr.p_double[1] = (double)(-50); b.ptr.p_double[2] = (double)(-75); ae_vector_set_length(&bndl, 3, _state); bndl.ptr.p_double[0] = (double)(0); bndl.ptr.p_double[1] = (double)(0); bndl.ptr.p_double[2] = (double)(0); ae_vector_set_length(&bndu, 3, _state); bndu.ptr.p_double[0] = (double)(100); bndu.ptr.p_double[1] = (double)(100); bndu.ptr.p_double[2] = (double)(150); ae_vector_set_length(&xstart, 3, _state); xstart.ptr.p_double[0] = (double)(0); xstart.ptr.p_double[1] = (double)(100); xstart.ptr.p_double[2] = (double)(150); ae_vector_set_length(&xexact, 3, _state); xexact.ptr.p_double[0] = (double)(0); xexact.ptr.p_double[1] = (double)(100); xexact.ptr.p_double[2] = (double)(100); ae_matrix_set_length(&c, 3, 4, _state); c.ptr.pp_double[0][0] = (double)(1); c.ptr.pp_double[0][1] = (double)(-1); c.ptr.pp_double[0][2] = (double)(0); c.ptr.pp_double[0][3] = (double)(-100); c.ptr.pp_double[1][0] = (double)(0); c.ptr.pp_double[1][1] = (double)(1); c.ptr.pp_double[1][2] = (double)(-1); c.ptr.pp_double[1][3] = (double)(0); c.ptr.pp_double[2][0] = (double)(0); c.ptr.pp_double[2][1] = (double)(-1); c.ptr.pp_double[2][2] = (double)(1); c.ptr.pp_double[2][3] = (double)(50); ae_vector_set_length(&ct, 3, _state); ct.ptr.p_int[0] = -1; ct.ptr.p_int[1] = -1; ct.ptr.p_int[2] = -1; minqpcreate(3, &state, _state); testminqpunit_setrandomalgoconvexlc(&state, _state); minqpsetlinearterm(&state, &b, _state); minqpsetquadraticterm(&state, &a, ae_true, _state); minqpsetstartingpoint(&state, &xstart, _state); minqpsetbc(&state, &bndl, &bndu, _state); minqpsetlc(&state, &c, &ct, 3, _state); minqpoptimize(&state, _state); minqpresults(&state, &xend, &rep, _state); if( rep.terminationtype>0 ) { for(i=0; i<=2; i++) { waserrors = waserrors||ae_fp_greater(ae_fabs(xend.ptr.p_double[i]-xexact.ptr.p_double[i], _state),1.0E6*ae_machineepsilon); } } else { waserrors = ae_true; } result = waserrors; ae_frame_leave(_state); return result; } /************************************************************************* This function tests linearly constrained DENSE-AUL solver On failure sets Err to True; on success leaves it unchanged. 
*************************************************************************/ static void testminqpunit_denseaultests(ae_bool* errorflag, ae_state *_state) { ae_frame _frame_block; hqrndstate rs; ae_int_t n; ae_int_t i; ae_int_t j; ae_int_t k; double v; double vv; ae_int_t scaletype; ae_matrix rawa; ae_matrix z; ae_vector bndl; ae_vector bndu; ae_matrix rawc; ae_vector rawct; ae_int_t rawccnt; ae_vector b; ae_vector x0; ae_vector xf; ae_vector r; ae_vector xsol; ae_vector s; minqpstate state; minqpreport rep; double epsx; double xtol; double rho; ae_int_t outerits; densesolverreport svrep; ae_frame_make(_state, &_frame_block); _hqrndstate_init(&rs, _state); ae_matrix_init(&rawa, 0, 0, DT_REAL, _state); ae_matrix_init(&z, 0, 0, DT_REAL, _state); ae_vector_init(&bndl, 0, DT_REAL, _state); ae_vector_init(&bndu, 0, DT_REAL, _state); ae_matrix_init(&rawc, 0, 0, DT_REAL, _state); ae_vector_init(&rawct, 0, DT_INT, _state); ae_vector_init(&b, 0, DT_REAL, _state); ae_vector_init(&x0, 0, DT_REAL, _state); ae_vector_init(&xf, 0, DT_REAL, _state); ae_vector_init(&r, 0, DT_REAL, _state); ae_vector_init(&xsol, 0, DT_REAL, _state); ae_vector_init(&s, 0, DT_REAL, _state); _minqpstate_init(&state, _state); _minqpreport_init(&rep, _state); _densesolverreport_init(&svrep, _state); hqrndrandomize(&rs, _state); /* * Test that unconstrained problem is solved with high precision, * independently of Rho and/or outer iterations count. * * 50% of problems are rescaled wildly (with scale being passed to * the solver). */ epsx = 1.0E-12; xtol = 1.0E-7; for(n=1; n<=10; n++) { for(scaletype=0; scaletype<=1; scaletype++) { /* * Generate random A, b, X0 and XSOL */ spdmatrixrndcond(n, ae_pow(10.0, 3*hqrnduniformr(&rs, _state), _state), &rawa, _state); ae_vector_set_length(&b, n, _state); ae_vector_set_length(&xsol, n, _state); ae_vector_set_length(&x0, n, _state); for(i=0; i<=n-1; i++) { x0.ptr.p_double[i] = hqrndnormal(&rs, _state); xsol.ptr.p_double[i] = hqrndnormal(&rs, _state); } for(i=0; i<=n-1; i++) { v = ae_v_dotproduct(&rawa.ptr.pp_double[i][0], 1, &xsol.ptr.p_double[0], 1, ae_v_len(0,n-1)); b.ptr.p_double[i] = -v; } /* * Generate scale vector, apply it */ ae_vector_set_length(&s, n, _state); for(i=0; i<=n-1; i++) { if( scaletype>0 ) { s.ptr.p_double[i] = ae_pow((double)(10), hqrnduniformr(&rs, _state)*20-10, _state); } else { s.ptr.p_double[i] = (double)(1); } } for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { rawa.ptr.pp_double[i][j] = rawa.ptr.pp_double[i][j]/(s.ptr.p_double[i]*s.ptr.p_double[j]); } b.ptr.p_double[i] = b.ptr.p_double[i]/s.ptr.p_double[i]; x0.ptr.p_double[i] = x0.ptr.p_double[i]*s.ptr.p_double[i]; xsol.ptr.p_double[i] = xsol.ptr.p_double[i]*s.ptr.p_double[i]; } /* * Create optimizer, solve */ rho = ae_pow((double)(10), 2*hqrnduniformr(&rs, _state), _state); outerits = hqrnduniformi(&rs, 5, _state); minqpcreate(n, &state, _state); minqpsetalgodenseaul(&state, epsx, rho, outerits, _state); minqpsetlinearterm(&state, &b, _state); testminqpunit_randomlyselectconvertandsetquadraticterm(&rawa, n, &state, &rs, _state); minqpsetscale(&state, &s, _state); minqpsetstartingpoint(&state, &x0, _state); minqpoptimize(&state, _state); minqpresults(&state, &xf, &rep, _state); /* * Compare against analytically known solution */ if( rep.terminationtype<=0 ) { seterrorflag(errorflag, ae_true, _state); ae_frame_leave(_state); return; } for(i=0; i<=n-1; i++) { seterrorflag(errorflag, ae_fp_greater(ae_fabs(xf.ptr.p_double[i]-xsol.ptr.p_double[i], _state)/s.ptr.p_double[i],xtol), _state); } } } /* * Test that problem 
with zero constraint matrix can be solved * (with high precision). We do not perform any additional "tweaks" * like scaling of variables, just want to test ability to handle * zero matrices. */ epsx = 1.0E-12; xtol = 1.0E-7; for(n=1; n<=10; n++) { /* * Generate random A, b, X0 and XSOL */ spdmatrixrndcond(n, ae_pow(10.0, 3*hqrnduniformr(&rs, _state), _state), &rawa, _state); ae_vector_set_length(&b, n, _state); ae_vector_set_length(&xsol, n, _state); ae_vector_set_length(&x0, n, _state); for(i=0; i<=n-1; i++) { x0.ptr.p_double[i] = hqrndnormal(&rs, _state); xsol.ptr.p_double[i] = hqrndnormal(&rs, _state); } for(i=0; i<=n-1; i++) { v = ae_v_dotproduct(&rawa.ptr.pp_double[i][0], 1, &xsol.ptr.p_double[0], 1, ae_v_len(0,n-1)); b.ptr.p_double[i] = -v; } rawccnt = hqrnduniformi(&rs, 2*n, _state); ae_matrix_set_length(&rawc, rawccnt, n+1, _state); ae_vector_set_length(&rawct, rawccnt, _state); for(i=0; i<=rawccnt-1; i++) { for(j=0; j<=n; j++) { rawc.ptr.pp_double[i][j] = (double)(0); } rawct.ptr.p_int[i] = hqrnduniformi(&rs, 3, _state)-1; } /* * Create optimizer, solve */ rho = ae_pow((double)(10), 2*hqrnduniformr(&rs, _state), _state); outerits = hqrnduniformi(&rs, 5, _state); minqpcreate(n, &state, _state); minqpsetalgodenseaul(&state, epsx, rho, outerits, _state); minqpsetlinearterm(&state, &b, _state); testminqpunit_randomlyselectconvertandsetquadraticterm(&rawa, n, &state, &rs, _state); minqpsetlc(&state, &rawc, &rawct, rawccnt, _state); minqpsetstartingpoint(&state, &x0, _state); minqpoptimize(&state, _state); minqpresults(&state, &xf, &rep, _state); /* * Compare against analytically known solution */ if( rep.terminationtype<=0 ) { seterrorflag(errorflag, ae_true, _state); ae_frame_leave(_state); return; } for(i=0; i<=n-1; i++) { seterrorflag(errorflag, ae_fp_greater(ae_fabs(xf.ptr.p_double[i]-xsol.ptr.p_double[i], _state),xtol), _state); } } /* * Test that box/linearly inequality constrained problem with ALL constraints * being inactive at BOTH initial and final points is solved with high precision. * * 50% of problems are rescaled wildly (with scale being passed to * the solver). */ epsx = 1.0E-12; xtol = 1.0E-7; for(n=1; n<=10; n++) { for(scaletype=0; scaletype<=1; scaletype++) { /* * Generate random A, b, X0 and XSOL */ spdmatrixrndcond(n, ae_pow(10.0, 3*hqrnduniformr(&rs, _state), _state), &rawa, _state); ae_vector_set_length(&b, n, _state); ae_vector_set_length(&xsol, n, _state); ae_vector_set_length(&x0, n, _state); for(i=0; i<=n-1; i++) { x0.ptr.p_double[i] = hqrndnormal(&rs, _state); xsol.ptr.p_double[i] = hqrndnormal(&rs, _state); } for(i=0; i<=n-1; i++) { v = ae_v_dotproduct(&rawa.ptr.pp_double[i][0], 1, &xsol.ptr.p_double[0], 1, ae_v_len(0,n-1)); b.ptr.p_double[i] = -v; } /* * Generate such set of inequality constraints that ALL * constraints are inactive at both X0 and XSOL. 
*/ ae_vector_set_length(&bndl, n, _state); ae_vector_set_length(&bndu, n, _state); for(i=0; i<=n-1; i++) { bndl.ptr.p_double[i] = ae_minreal(x0.ptr.p_double[i], xsol.ptr.p_double[i], _state)-1-hqrnduniformr(&rs, _state); bndu.ptr.p_double[i] = ae_maxreal(x0.ptr.p_double[i], xsol.ptr.p_double[i], _state)+1+hqrnduniformr(&rs, _state); } rawccnt = hqrnduniformi(&rs, 2*n, _state); ae_matrix_set_length(&rawc, rawccnt, n+1, _state); ae_vector_set_length(&rawct, rawccnt, _state); for(i=0; i<=rawccnt-1; i++) { v = (double)(0); vv = (double)(0); for(j=0; j<=n-1; j++) { rawc.ptr.pp_double[i][j] = hqrndnormal(&rs, _state); v = v+rawc.ptr.pp_double[i][j]*x0.ptr.p_double[j]; vv = vv+rawc.ptr.pp_double[i][j]*xsol.ptr.p_double[j]; } if( ae_fp_greater(hqrndnormal(&rs, _state),(double)(0)) ) { rawct.ptr.p_int[i] = 1; rawc.ptr.pp_double[i][n] = ae_minreal(v, vv, _state)-1-hqrnduniformr(&rs, _state); } else { rawct.ptr.p_int[i] = -1; rawc.ptr.pp_double[i][n] = ae_maxreal(v, vv, _state)+1+hqrnduniformr(&rs, _state); } } /* * Generate scale vector, apply it */ ae_vector_set_length(&s, n, _state); for(i=0; i<=n-1; i++) { if( scaletype>0 ) { s.ptr.p_double[i] = ae_pow((double)(10), hqrnduniformr(&rs, _state)*20-10, _state); } else { s.ptr.p_double[i] = (double)(1); } } for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { rawa.ptr.pp_double[i][j] = rawa.ptr.pp_double[i][j]/(s.ptr.p_double[i]*s.ptr.p_double[j]); } b.ptr.p_double[i] = b.ptr.p_double[i]/s.ptr.p_double[i]; x0.ptr.p_double[i] = x0.ptr.p_double[i]*s.ptr.p_double[i]; xsol.ptr.p_double[i] = xsol.ptr.p_double[i]*s.ptr.p_double[i]; bndl.ptr.p_double[i] = bndl.ptr.p_double[i]*s.ptr.p_double[i]; bndu.ptr.p_double[i] = bndu.ptr.p_double[i]*s.ptr.p_double[i]; } for(i=0; i<=rawccnt-1; i++) { for(j=0; j<=n-1; j++) { rawc.ptr.pp_double[i][j] = rawc.ptr.pp_double[i][j]/s.ptr.p_double[j]; } } /* * Create optimizer, solve */ rho = 100.0; outerits = 1; minqpcreate(n, &state, _state); minqpsetstartingpoint(&state, &x0, _state); minqpsetalgodenseaul(&state, epsx, rho, outerits, _state); minqpsetlinearterm(&state, &b, _state); testminqpunit_randomlyselectconvertandsetquadraticterm(&rawa, n, &state, &rs, _state); minqpsetbc(&state, &bndl, &bndu, _state); testminqpunit_randomlysplitandsetlc(&rawc, &rawct, rawccnt, n, &state, &rs, _state); minqpsetscale(&state, &s, _state); minqpoptimize(&state, _state); minqpresults(&state, &xf, &rep, _state); /* * Compare against analytically known solution */ if( rep.terminationtype<=0 ) { seterrorflag(errorflag, ae_true, _state); ae_frame_leave(_state); return; } for(i=0; i<=n-1; i++) { seterrorflag(errorflag, ae_fp_greater(ae_fabs(xf.ptr.p_double[i]-xsol.ptr.p_double[i], _state)/s.ptr.p_double[i],xtol), _state); } } } /* * Test that linear equality constrained problem is solved with high precision. * * 50% of problems are rescaled wildly (variable scaling, with scale being * passed to the solver). 
*/ epsx = 1.0E-12; xtol = 1.0E-6; for(n=1; n<=10; n++) { for(scaletype=0; scaletype<=1; scaletype++) { /* * Generate random A, b, X0, constraints */ spdmatrixrndcond(n, ae_pow(10.0, 3*hqrnduniformr(&rs, _state), _state), &rawa, _state); ae_vector_set_length(&b, n, _state); ae_vector_set_length(&xsol, n, _state); ae_vector_set_length(&x0, n, _state); for(i=0; i<=n-1; i++) { x0.ptr.p_double[i] = hqrndnormal(&rs, _state); b.ptr.p_double[i] = hqrndnormal(&rs, _state); } rmatrixrndcond(n, (double)(10), &z, _state); rawccnt = ae_maxint(n-2, 0, _state); if( rawccnt>0 ) { ae_matrix_set_length(&rawc, rawccnt, n+1, _state); ae_vector_set_length(&rawct, rawccnt, _state); for(i=0; i<=rawccnt-1; i++) { for(j=0; j<=n-1; j++) { rawc.ptr.pp_double[i][j] = z.ptr.pp_double[i][j]; } rawct.ptr.p_int[i] = 0; rawc.ptr.pp_double[i][n] = hqrndnormal(&rs, _state); } } /* * Generate scale vector, apply it */ ae_vector_set_length(&s, n, _state); for(i=0; i<=n-1; i++) { if( scaletype>0 ) { s.ptr.p_double[i] = ae_pow((double)(10), hqrnduniformr(&rs, _state)*20-10, _state); } else { s.ptr.p_double[i] = (double)(1); } } for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { rawa.ptr.pp_double[i][j] = rawa.ptr.pp_double[i][j]/(s.ptr.p_double[i]*s.ptr.p_double[j]); } b.ptr.p_double[i] = b.ptr.p_double[i]/s.ptr.p_double[i]; x0.ptr.p_double[i] = x0.ptr.p_double[i]*s.ptr.p_double[i]; xsol.ptr.p_double[i] = xsol.ptr.p_double[i]*s.ptr.p_double[i]; } for(i=0; i<=rawccnt-1; i++) { for(j=0; j<=n-1; j++) { rawc.ptr.pp_double[i][j] = rawc.ptr.pp_double[i][j]/s.ptr.p_double[j]; } } /* * Create optimizer, solve */ rho = 100.0; outerits = 3; minqpcreate(n, &state, _state); minqpsetalgodenseaul(&state, epsx, rho, outerits, _state); minqpsetlinearterm(&state, &b, _state); testminqpunit_randomlyselectconvertandsetquadraticterm(&rawa, n, &state, &rs, _state); testminqpunit_randomlysplitandsetlc(&rawc, &rawct, rawccnt, n, &state, &rs, _state); minqpsetscale(&state, &s, _state); minqpoptimize(&state, _state); minqpresults(&state, &xf, &rep, _state); /* * Solve problem analytically using Lagrangian approach */ ae_matrix_set_length(&z, n+rawccnt, n+rawccnt, _state); ae_vector_set_length(&r, n+rawccnt, _state); for(i=0; i<=n+rawccnt-1; i++) { for(j=0; j<=n+rawccnt-1; j++) { z.ptr.pp_double[i][j] = (double)(0); } r.ptr.p_double[i] = (double)(0); } for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { z.ptr.pp_double[i][j] = rawa.ptr.pp_double[i][j]; } } for(i=0; i<=n-1; i++) { r.ptr.p_double[i] = -b.ptr.p_double[i]; } for(i=0; i<=rawccnt-1; i++) { for(j=0; j<=n-1; j++) { z.ptr.pp_double[n+i][j] = rawc.ptr.pp_double[i][j]; z.ptr.pp_double[j][n+i] = rawc.ptr.pp_double[i][j]; } r.ptr.p_double[n+i] = rawc.ptr.pp_double[i][n]; } rmatrixsolve(&z, n+rawccnt, &r, &k, &svrep, &xsol, _state); ae_assert(k>0, "MinQPTest: integrity check failed", _state); /* * Compare against analytically known solution */ if( rep.terminationtype<=0 ) { seterrorflag(errorflag, ae_true, _state); ae_frame_leave(_state); return; } for(i=0; i<=n-1; i++) { seterrorflag(errorflag, ae_fp_greater(ae_fabs(xf.ptr.p_double[i]-xsol.ptr.p_double[i], _state)/s.ptr.p_double[i],xtol), _state); } } } /* * MEDIUM-SCALE VERSION OF PREVIOUS TEST. * * Test that linear equality constrained problem is solved with high precision. * * 50% of problems are rescaled wildly (variable scaling, with scale being * passed to the solver). 
*/ epsx = 1.0E-12; xtol = 1.0E-6; for(n=99; n<=101; n++) { for(scaletype=0; scaletype<=1; scaletype++) { /* * Generate random A, b, X0, constraints */ spdmatrixrndcond(n, ae_pow(10.0, 3*hqrnduniformr(&rs, _state), _state), &rawa, _state); ae_vector_set_length(&b, n, _state); ae_vector_set_length(&xsol, n, _state); ae_vector_set_length(&x0, n, _state); for(i=0; i<=n-1; i++) { x0.ptr.p_double[i] = hqrndnormal(&rs, _state); b.ptr.p_double[i] = hqrndnormal(&rs, _state); } rmatrixrndcond(n, (double)(10), &z, _state); rawccnt = ae_maxint(n-2, 0, _state); if( rawccnt>0 ) { ae_matrix_set_length(&rawc, rawccnt, n+1, _state); ae_vector_set_length(&rawct, rawccnt, _state); for(i=0; i<=rawccnt-1; i++) { for(j=0; j<=n-1; j++) { rawc.ptr.pp_double[i][j] = z.ptr.pp_double[i][j]; } rawct.ptr.p_int[i] = 0; rawc.ptr.pp_double[i][n] = hqrndnormal(&rs, _state); } } /* * Generate scale vector, apply it */ ae_vector_set_length(&s, n, _state); for(i=0; i<=n-1; i++) { if( scaletype>0 ) { s.ptr.p_double[i] = ae_pow((double)(10), hqrnduniformr(&rs, _state)*20-10, _state); } else { s.ptr.p_double[i] = (double)(1); } } for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { rawa.ptr.pp_double[i][j] = rawa.ptr.pp_double[i][j]/(s.ptr.p_double[i]*s.ptr.p_double[j]); } b.ptr.p_double[i] = b.ptr.p_double[i]/s.ptr.p_double[i]; x0.ptr.p_double[i] = x0.ptr.p_double[i]*s.ptr.p_double[i]; xsol.ptr.p_double[i] = xsol.ptr.p_double[i]*s.ptr.p_double[i]; } for(i=0; i<=rawccnt-1; i++) { for(j=0; j<=n-1; j++) { rawc.ptr.pp_double[i][j] = rawc.ptr.pp_double[i][j]/s.ptr.p_double[j]; } } /* * Create optimizer, solve */ rho = 100.0; outerits = 3; minqpcreate(n, &state, _state); minqpsetalgodenseaul(&state, epsx, rho, outerits, _state); minqpsetlinearterm(&state, &b, _state); testminqpunit_randomlyselectconvertandsetquadraticterm(&rawa, n, &state, &rs, _state); testminqpunit_randomlysplitandsetlc(&rawc, &rawct, rawccnt, n, &state, &rs, _state); minqpsetscale(&state, &s, _state); minqpoptimize(&state, _state); minqpresults(&state, &xf, &rep, _state); /* * Solve problem analytically using Lagrangian approach */ ae_matrix_set_length(&z, n+rawccnt, n+rawccnt, _state); ae_vector_set_length(&r, n+rawccnt, _state); for(i=0; i<=n+rawccnt-1; i++) { for(j=0; j<=n+rawccnt-1; j++) { z.ptr.pp_double[i][j] = (double)(0); } r.ptr.p_double[i] = (double)(0); } for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { z.ptr.pp_double[i][j] = rawa.ptr.pp_double[i][j]; } } for(i=0; i<=n-1; i++) { r.ptr.p_double[i] = -b.ptr.p_double[i]; } for(i=0; i<=rawccnt-1; i++) { for(j=0; j<=n-1; j++) { z.ptr.pp_double[n+i][j] = rawc.ptr.pp_double[i][j]; z.ptr.pp_double[j][n+i] = rawc.ptr.pp_double[i][j]; } r.ptr.p_double[n+i] = rawc.ptr.pp_double[i][n]; } rmatrixsolve(&z, n+rawccnt, &r, &k, &svrep, &xsol, _state); ae_assert(k>0, "MinQPTest: integrity check failed", _state); /* * Compare against analytically known solution */ if( rep.terminationtype<=0 ) { seterrorflag(errorflag, ae_true, _state); ae_frame_leave(_state); return; } for(i=0; i<=n-1; i++) { seterrorflag(errorflag, ae_fp_greater(ae_fabs(xf.ptr.p_double[i]-xsol.ptr.p_double[i], _state)/s.ptr.p_double[i],xtol), _state); } } } /* * Test that constraints are automatically scaled to adapt to problem curvature * (that multiplication of A and b by some large/small number does not affect * solver). * * We generate random well-scaled problem, and multiply A/b by some large/small number, * and test that problem is still solved with high precision. 
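* (Multiplying both A and b by the same positive factor only scales the
* objective 0.5*x'*A*x+b'*x, so the constrained minimizer, and hence the
* reference solution computed below before A and b are multiplied, is
* unchanged; this is why the test can reuse that reference solution.)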
* * NOTE: just to make things worse, we rescale variables randomly, but primary * idea of this test is to check for multiplication of A/B */ epsx = 1.0E-12; xtol = 1.0E-6; for(n=1; n<=10; n++) { for(scaletype=-1; scaletype<=1; scaletype++) { /* * Generate random A, b, X0, constraints */ spdmatrixrndcond(n, ae_pow(10.0, 3*hqrnduniformr(&rs, _state), _state), &rawa, _state); ae_vector_set_length(&b, n, _state); ae_vector_set_length(&xsol, n, _state); ae_vector_set_length(&x0, n, _state); for(i=0; i<=n-1; i++) { x0.ptr.p_double[i] = hqrndnormal(&rs, _state); b.ptr.p_double[i] = hqrndnormal(&rs, _state); } rmatrixrndcond(n, (double)(10), &z, _state); rawccnt = ae_maxint(n-2, 0, _state); if( rawccnt>0 ) { ae_matrix_set_length(&rawc, rawccnt, n+1, _state); ae_vector_set_length(&rawct, rawccnt, _state); for(i=0; i<=rawccnt-1; i++) { for(j=0; j<=n-1; j++) { rawc.ptr.pp_double[i][j] = z.ptr.pp_double[i][j]; } rawct.ptr.p_int[i] = 0; rawc.ptr.pp_double[i][n] = hqrndnormal(&rs, _state); } } /* * Solve problem analytically using Lagrangian approach */ ae_matrix_set_length(&z, n+rawccnt, n+rawccnt, _state); ae_vector_set_length(&r, n+rawccnt, _state); for(i=0; i<=n+rawccnt-1; i++) { for(j=0; j<=n+rawccnt-1; j++) { z.ptr.pp_double[i][j] = (double)(0); } r.ptr.p_double[i] = (double)(0); } for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { z.ptr.pp_double[i][j] = rawa.ptr.pp_double[i][j]; } } for(i=0; i<=n-1; i++) { r.ptr.p_double[i] = -b.ptr.p_double[i]; } for(i=0; i<=rawccnt-1; i++) { for(j=0; j<=n-1; j++) { z.ptr.pp_double[n+i][j] = rawc.ptr.pp_double[i][j]; z.ptr.pp_double[j][n+i] = rawc.ptr.pp_double[i][j]; } r.ptr.p_double[n+i] = rawc.ptr.pp_double[i][n]; } rmatrixsolve(&z, n+rawccnt, &r, &k, &svrep, &xsol, _state); ae_assert(k>0, "MinQPTest: integrity check failed", _state); /* * Generate scale vector, apply it */ ae_vector_set_length(&s, n, _state); for(i=0; i<=n-1; i++) { s.ptr.p_double[i] = ae_pow((double)(10), hqrnduniformr(&rs, _state)*20-10, _state); } for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { rawa.ptr.pp_double[i][j] = rawa.ptr.pp_double[i][j]/(s.ptr.p_double[i]*s.ptr.p_double[j]); } b.ptr.p_double[i] = b.ptr.p_double[i]/s.ptr.p_double[i]; x0.ptr.p_double[i] = x0.ptr.p_double[i]*s.ptr.p_double[i]; xsol.ptr.p_double[i] = xsol.ptr.p_double[i]*s.ptr.p_double[i]; } for(i=0; i<=rawccnt-1; i++) { for(j=0; j<=n-1; j++) { rawc.ptr.pp_double[i][j] = rawc.ptr.pp_double[i][j]/s.ptr.p_double[j]; } } /* * Multiply A/B by some coefficient with wild magnitude */ v = ae_pow((double)(10), (double)(10*scaletype), _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { rawa.ptr.pp_double[i][j] = rawa.ptr.pp_double[i][j]*v; } b.ptr.p_double[i] = b.ptr.p_double[i]*v; } /* * Create optimizer, solve */ rho = 100.0; outerits = 3; minqpcreate(n, &state, _state); minqpsetalgodenseaul(&state, epsx, rho, outerits, _state); minqpsetlinearterm(&state, &b, _state); testminqpunit_randomlyselectconvertandsetquadraticterm(&rawa, n, &state, &rs, _state); testminqpunit_randomlysplitandsetlc(&rawc, &rawct, rawccnt, n, &state, &rs, _state); minqpsetscale(&state, &s, _state); minqpoptimize(&state, _state); minqpresults(&state, &xf, &rep, _state); /* * Compare against analytically known solution */ if( rep.terminationtype<=0 ) { seterrorflag(errorflag, ae_true, _state); ae_frame_leave(_state); return; } for(i=0; i<=n-1; i++) { seterrorflag(errorflag, ae_fp_greater(ae_fabs(xf.ptr.p_double[i]-xsol.ptr.p_double[i], _state)/s.ptr.p_double[i],xtol), _state); } } } ae_frame_leave(_state); } 
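/*
 * The "Lagrangian approach" used by the equality-constrained tests above is,
 * restated, the first-order (KKT) system of
 *
 *     minimize    0.5*x'*A*x + b'*x
 *     subject to  C*x = c
 *
 * where C holds the coefficients RawC[i][0..N-1] and c the right parts
 * RawC[i][N]. This is exactly the linear system assembled into Z and R above:
 *
 *     [ A   C' ] [   x    ]   [ -b ]
 *     [ C   0  ] [ lambda ] = [  c ]
 *
 * The first N components of its solution give the reference point XSol
 * against which the solver output XF is compared.
 */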
/*************************************************************************
Returns norm of the projected anti-gradient at point X: components for
which X lies on a bound and the anti-gradient points outside of the
feasible box are excluded from the norm.
*************************************************************************/
static double testminqpunit_projectedantigradnorm(ae_int_t n,
     /* Real */ ae_vector* x,
     /* Real */ ae_vector* g,
     /* Real */ ae_vector* bndl,
     /* Real */ ae_vector* bndu,
     ae_state *_state)
{
    ae_int_t i;
    double r;
    double result;

    r = (double)(0);
    for(i=0; i<=n-1; i++)
    {
        ae_assert(ae_fp_greater_eq(x->ptr.p_double[i],bndl->ptr.p_double[i])&&ae_fp_less_eq(x->ptr.p_double[i],bndu->ptr.p_double[i]), "ProjectedAntiGradNormal: boundary constraints violation", _state);
        if( ((ae_fp_greater(x->ptr.p_double[i],bndl->ptr.p_double[i])&&ae_fp_less(x->ptr.p_double[i],bndu->ptr.p_double[i]))||(ae_fp_eq(x->ptr.p_double[i],bndl->ptr.p_double[i])&&ae_fp_greater(-g->ptr.p_double[i],(double)(0))))||(ae_fp_eq(x->ptr.p_double[i],bndu->ptr.p_double[i])&&ae_fp_less(-g->ptr.p_double[i],(double)(0))) )
        {
            r = r+g->ptr.p_double[i]*g->ptr.p_double[i];
        }
    }
    result = ae_sqrt(r, _state);
    return result;
}

/*************************************************************************
This function tests that norm of bound-constrained gradient at point X is
less than Eps:
* unconstrained gradient is A*x+b
* if I-th component is at the boundary, and antigradient points outside
  of the feasible area, I-th component of constrained gradient is zero

This function accepts QP terms A and B, bound constraints, current point,
and performs test. Additionally, it checks that point is feasible w.r.t.
boundary constraints.

In case of failure, error flag is set. Otherwise, it is not modified.

IMPORTANT: this function does NOT use SetErrorFlag() to modify flag. If
you want to use SetErrorFlag() for easier tracking of errors, you should
store flag returned by this function into separate variable TmpFlag and
call SetErrorFlag(ErrorFlag, TmpFlag) yourself.
*************************************************************************/
static void testminqpunit_testbcgradandfeasibility(/* Real */ ae_matrix* a,
     /* Real */ ae_vector* b,
     /* Real */ ae_vector* bndl,
     /* Real */ ae_vector* bndu,
     ae_int_t n,
     /* Real */ ae_vector* x,
     double eps,
     ae_bool* errorflag,
     ae_state *_state)
{
    ae_int_t i;
    ae_int_t j;
    double g;
    double gnorm;

    gnorm = 0.0;
    for(i=0; i<=n-1; i++)
    {
        g = b->ptr.p_double[i];
        for(j=0; j<=n-1; j++)
        {
            g = g+a->ptr.pp_double[i][j]*x->ptr.p_double[j];
        }
        if( ae_fp_eq(x->ptr.p_double[i],bndl->ptr.p_double[i])&&ae_fp_greater(g,(double)(0)) )
        {
            g = (double)(0);
        }
        if( ae_fp_eq(x->ptr.p_double[i],bndu->ptr.p_double[i])&&ae_fp_less(g,(double)(0)) )
        {
            g = (double)(0);
        }
        gnorm = gnorm+ae_sqr(g, _state);
        if( ae_fp_less(x->ptr.p_double[i],bndl->ptr.p_double[i]) )
        {
            *errorflag = ae_true;
        }
        if( ae_fp_greater(x->ptr.p_double[i],bndu->ptr.p_double[i]) )
        {
            *errorflag = ae_true;
        }
    }
    gnorm = ae_sqrt(gnorm, _state);
    if( ae_fp_greater(gnorm,eps) )
    {
        *errorflag = ae_true;
    }
}

/*************************************************************************
Set random type of the QP solver. All "modern" solvers can be chosen.

OUTPUT PARAMETERS:
    BCTol - expected precision of box constraints handling assuming unit
            scale of variables.
    LCTol - expected precision of linear constraints handling assuming
            unit scale of variables.

BCTol and LCTol have the following meaning - if some constraint is
active, it means that we should be at most TOL units away from boundary.
It is possible that zero value is returned.
From definition it follows that if we stopped at more than TOL units away from the boundary, gradient in corresponding direction is nearly zero. *************************************************************************/ static void testminqpunit_setrandomalgoallmodern(minqpstate* s, double* bctol, double* lctol, ae_state *_state) { ae_int_t i; *bctol = 0; *lctol = 0; i = 1+ae_randominteger(3, _state); if( i==1 ) { minqpsetalgobleic(s, 1.0E-12, 0.0, 0.0, 0, _state); *bctol = (double)(0); *lctol = 1.0E-8; } if( i==2 ) { minqpsetalgoquickqp(s, 1.0E-12, 0.0, 0.0, 0, ae_fp_greater(ae_randomreal(_state),0.5), _state); *bctol = (double)(0); *lctol = (double)(0); } if( i==3 ) { minqpsetalgodenseaul(s, 1.0E-12, 1000.0, 10, _state); *bctol = 1.0E-3; *lctol = 1.0E-3; } } /************************************************************************* set random type of theQP solver *************************************************************************/ static void testminqpunit_setrandomalgononconvex(minqpstate* s, ae_state *_state) { ae_int_t i; i = 1+ae_randominteger(2, _state); if( i==1 ) { minqpsetalgobleic(s, 1.0E-12, 0.0, 0.0, 0, _state); } if( i==2 ) { minqpsetalgoquickqp(s, 1.0E-12, 0.0, 0.0, 0, ae_fp_greater(ae_randomreal(_state),0.5), _state); } } /************************************************************************* set random type of theQP solver *************************************************************************/ static void testminqpunit_setrandomalgosemidefinite(minqpstate* s, ae_state *_state) { ae_int_t i; i = 1+ae_randominteger(2, _state); if( i==1 ) { minqpsetalgobleic(s, 1.0E-12, 0.0, 0.0, 0, _state); } if( i==2 ) { minqpsetalgoquickqp(s, 1.0E-12, 0.0, 0.0, 0, ae_fp_greater(ae_randomreal(_state),0.5), _state); } } /************************************************************************* set random type of the QP solver, must support boundary constraints *************************************************************************/ static void testminqpunit_setrandomalgobc(minqpstate* s, ae_state *_state) { ae_int_t i; i = ae_randominteger(2, _state); if( i==0 ) { minqpsetalgocholesky(s, _state); } if( i==1 ) { minqpsetalgobleic(s, 1.0E-12, 0.0, 0.0, 0, _state); } } /************************************************************************* set random type of the QP solver, must support convex problems with boundary/linear constraints *************************************************************************/ static void testminqpunit_setrandomalgoconvexlc(minqpstate* s, ae_state *_state) { ae_int_t i; i = ae_randominteger(2, _state); if( i==0 ) { minqpsetalgocholesky(s, _state); } if( i==1 ) { minqpsetalgobleic(s, 0.0, 0.0, 1.0E-12, 0, _state); } } /************************************************************************* set random type of the QP solver, must support nonconvex problems with boundary/linear constraints *************************************************************************/ static void testminqpunit_setrandomalgononconvexlc(minqpstate* s, ae_state *_state) { ae_int_t i; i = ae_randominteger(1, _state); if( i==0 ) { minqpsetalgobleic(s, 1.0E-12, 0.0, 0.0, 0, _state); } } /************************************************************************* Convert dense matrix to sparse matrix using random format *************************************************************************/ static void testminqpunit_densetosparse(/* Real */ ae_matrix* a, ae_int_t n, sparsematrix* s, ae_state *_state) { ae_frame _frame_block; ae_int_t i; ae_int_t j; sparsematrix s0; 
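    /*
     * Assemble the matrix in the default hash-table storage first, then copy
     * it into a randomly chosen storage format (random format index passed to
     * sparsecopytobuf), so that callers receive different sparse
     * representations on different invocations.
     */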
ae_frame_make(_state, &_frame_block); _sparsematrix_clear(s); _sparsematrix_init(&s0, _state); sparsecreate(n, n, n*n, &s0, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { sparseset(&s0, i, j, a->ptr.pp_double[i][j], _state); } } sparsecopytobuf(&s0, ae_randominteger(3, _state), s, _state); ae_frame_leave(_state); } /************************************************************************* Randomly split constraints into dense and sparse parts *************************************************************************/ static void testminqpunit_randomlysplitlc(/* Real */ ae_matrix* rawc, /* Integer */ ae_vector* rawct, ae_int_t rawccnt, ae_int_t n, /* Real */ ae_matrix* densec, /* Integer */ ae_vector* densect, sparsematrix* sparsec, /* Integer */ ae_vector* sparsect, ae_int_t* denseccnt, ae_int_t* sparseccnt, hqrndstate* rs, ae_state *_state) { ae_int_t i; ae_int_t j; ae_matrix_clear(densec); ae_vector_clear(densect); _sparsematrix_clear(sparsec); ae_vector_clear(sparsect); *denseccnt = 0; *sparseccnt = 0; /* * Split "raw" constraints into dense and sparse parts */ *denseccnt = hqrnduniformi(rs, rawccnt+1, _state); *sparseccnt = rawccnt-(*denseccnt); if( *denseccnt>0 ) { ae_matrix_set_length(densec, *denseccnt, n+1, _state); ae_vector_set_length(densect, *denseccnt, _state); for(i=0; i<=*denseccnt-1; i++) { for(j=0; j<=n; j++) { densec->ptr.pp_double[i][j] = rawc->ptr.pp_double[i][j]; } densect->ptr.p_int[i] = rawct->ptr.p_int[i]; } } if( *sparseccnt>0 ) { sparsecreate(*sparseccnt, n+1, 0, sparsec, _state); ae_vector_set_length(sparsect, *sparseccnt, _state); for(i=0; i<=*sparseccnt-1; i++) { for(j=0; j<=n; j++) { sparseset(sparsec, i, j, rawc->ptr.pp_double[*denseccnt+i][j], _state); } sparsect->ptr.p_int[i] = rawct->ptr.p_int[*denseccnt+i]; } } } /************************************************************************* Randomly split constraints into dense and sparse parts and set them *************************************************************************/ static void testminqpunit_randomlysplitandsetlc(/* Real */ ae_matrix* rawc, /* Integer */ ae_vector* rawct, ae_int_t rawccnt, ae_int_t n, minqpstate* state, hqrndstate* rs, ae_state *_state) { ae_frame _frame_block; ae_matrix densec; ae_vector densect; sparsematrix sparsec; ae_vector sparsect; ae_int_t denseccnt; ae_int_t sparseccnt; ae_frame_make(_state, &_frame_block); ae_matrix_init(&densec, 0, 0, DT_REAL, _state); ae_vector_init(&densect, 0, DT_INT, _state); _sparsematrix_init(&sparsec, _state); ae_vector_init(&sparsect, 0, DT_INT, _state); testminqpunit_randomlysplitlc(rawc, rawct, rawccnt, n, &densec, &densect, &sparsec, &sparsect, &denseccnt, &sparseccnt, rs, _state); if( ae_fp_greater(hqrnduniformr(rs, _state),0.5)||denseccnt*sparseccnt>0 ) { minqpsetlcmixed(state, &densec, &densect, denseccnt, &sparsec, &sparsect, sparseccnt, _state); } else { if( denseccnt>0 ) { minqpsetlc(state, &densec, &densect, denseccnt, _state); } if( sparseccnt>0 ) { minqpsetlcsparse(state, &sparsec, &sparsect, sparseccnt, _state); } } ae_frame_leave(_state); } /************************************************************************* Randomly selects triangle of full symmetric matrix, converts it to one of the matrix storage formats (dense or sparse) and sets. 
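Four combinations are exercised: upper or lower triangle, stored either as a
dense matrix (passed via minqpsetquadraticterm) or as a sparse matrix (passed
via minqpsetquadratictermsparse).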
*************************************************************************/ static void testminqpunit_randomlyselectconvertandsetquadraticterm(/* Real */ ae_matrix* a, ae_int_t n, minqpstate* state, hqrndstate* rs, ae_state *_state) { ae_frame _frame_block; ae_int_t i; ae_int_t j; ae_matrix densea; sparsematrix sparsea; ae_bool isupper; ae_bool isdense; ae_frame_make(_state, &_frame_block); ae_matrix_init(&densea, 0, 0, DT_REAL, _state); _sparsematrix_init(&sparsea, _state); isupper = ae_fp_greater(hqrnduniformr(rs, _state),0.5); isdense = ae_fp_greater(hqrnduniformr(rs, _state),0.5); if( isupper&&isdense ) { ae_matrix_set_length(&densea, n, n, _state); for(i=0; i<=n-1; i++) { for(j=i; j<=n-1; j++) { densea.ptr.pp_double[i][j] = a->ptr.pp_double[i][j]; } } minqpsetquadraticterm(state, &densea, isupper, _state); ae_frame_leave(_state); return; } if( !isupper&&isdense ) { ae_matrix_set_length(&densea, n, n, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=i; j++) { densea.ptr.pp_double[i][j] = a->ptr.pp_double[i][j]; } } minqpsetquadraticterm(state, &densea, isupper, _state); ae_frame_leave(_state); return; } if( isupper&&!isdense ) { sparsecreate(n, n, 0, &sparsea, _state); for(i=0; i<=n-1; i++) { for(j=i; j<=n-1; j++) { sparseset(&sparsea, i, j, a->ptr.pp_double[i][j], _state); } } minqpsetquadratictermsparse(state, &sparsea, isupper, _state); ae_frame_leave(_state); return; } if( !isupper&&!isdense ) { sparsecreate(n, n, 0, &sparsea, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=i; j++) { sparseset(&sparsea, i, j, a->ptr.pp_double[i][j], _state); } } minqpsetquadratictermsparse(state, &sparsea, isupper, _state); ae_frame_leave(_state); return; } ae_frame_leave(_state); } /************************************************************************* This function returns reciprocal of condition number of general linear constraints. 
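It is computed as the ratio of the smallest to the largest singular value of
the KxN constraint matrix (obtained with rmatrixsvd below); for an identically
zero constraint matrix the value 1 is returned.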
*************************************************************************/ static double testminqpunit_getconstraintrcond(/* Real */ ae_matrix* c, ae_int_t k, ae_int_t n, ae_state *_state) { ae_frame _frame_block; ae_bool bflag; ae_vector svdw; ae_matrix svdu; ae_matrix svdvt; double result; ae_frame_make(_state, &_frame_block); ae_vector_init(&svdw, 0, DT_REAL, _state); ae_matrix_init(&svdu, 0, 0, DT_REAL, _state); ae_matrix_init(&svdvt, 0, 0, DT_REAL, _state); bflag = rmatrixsvd(c, k, n, 0, 0, 0, &svdw, &svdu, &svdvt, _state); ae_assert(bflag, "MinQPTest: integrity failure", _state); if( ae_fp_greater(svdw.ptr.p_double[0],(double)(0)) ) { result = svdw.ptr.p_double[ae_minint(k, n, _state)-1]/svdw.ptr.p_double[0]; } else { result = (double)(1); } ae_frame_leave(_state); return result; } static void testminnlcunit_testbc(ae_bool* wereerrors, ae_state *_state); static void testminnlcunit_testlc(ae_bool* wereerrors, ae_state *_state); static void testminnlcunit_testnlc(ae_bool* wereerrors, ae_state *_state); static void testminnlcunit_testother(ae_bool* wereerrors, ae_state *_state); static void testminnlcunit_testbugs(ae_bool* wereerrors, ae_state *_state); ae_bool testminnlc(ae_bool silent, ae_state *_state) { ae_bool waserrors; ae_bool bcerr; ae_bool lcerr; ae_bool nlcerr; ae_bool othererr; ae_bool bugs; ae_bool result; waserrors = ae_false; bcerr = ae_false; lcerr = ae_false; nlcerr = ae_false; othererr = ae_false; bugs = ae_false; testminnlcunit_testbugs(&bugs, _state); testminnlcunit_testbc(&bcerr, _state); testminnlcunit_testlc(&lcerr, _state); testminnlcunit_testnlc(&nlcerr, _state); testminnlcunit_testother(&othererr, _state); /* * end */ waserrors = (((bcerr||lcerr)||nlcerr)||othererr)||bugs; if( !silent ) { printf("TESTING MINNLC OPTIMIZATION\n"); printf("TESTS:\n"); printf("* BOUND CONSTRAINED "); if( bcerr ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("* LINEARLY CONSTRAINED "); if( lcerr ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("* NONLINEARLY CONSTRAINED "); if( nlcerr ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("* OTHER PROPERTIES "); if( othererr ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("* FIXED BUGS: "); if( bugs ) { printf("FAILED\n"); } else { printf("OK\n"); } if( waserrors ) { printf("TEST FAILED\n"); } else { printf("TEST PASSED\n"); } printf("\n\n"); } result = !waserrors; return result; } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. *************************************************************************/ ae_bool _pexec_testminnlc(ae_bool silent, ae_state *_state) { return testminnlc(silent, _state); } /************************************************************************* This function tests bound constrained quadratic programming algorithm. On failure sets error flag. 
*************************************************************************/ static void testminnlcunit_testbc(ae_bool* wereerrors, ae_state *_state) { ae_frame _frame_block; minnlcstate state; minnlcreport rep; ae_int_t n; ae_int_t pass; ae_int_t i; ae_int_t j; ae_int_t aulits; double tolx; double tolg; ae_int_t scaletype; double rho; ae_matrix fulla; ae_vector b; ae_vector bndl; ae_vector bndu; ae_vector s; ae_vector x0; ae_vector x1; double gnorm; double g; ae_int_t prectype; hqrndstate rs; ae_frame_make(_state, &_frame_block); _minnlcstate_init(&state, _state); _minnlcreport_init(&rep, _state); ae_matrix_init(&fulla, 0, 0, DT_REAL, _state); ae_vector_init(&b, 0, DT_REAL, _state); ae_vector_init(&bndl, 0, DT_REAL, _state); ae_vector_init(&bndu, 0, DT_REAL, _state); ae_vector_init(&s, 0, DT_REAL, _state); ae_vector_init(&x0, 0, DT_REAL, _state); ae_vector_init(&x1, 0, DT_REAL, _state); _hqrndstate_init(&rs, _state); hqrndrandomize(&rs, _state); /* * Convex test: * * N dimensions * * random number (0..N) of random boundary constraints * * positive-definite quadratic programming problem * * initial point is random (maybe infeasible!) * * random scale (unit or non-unit) */ aulits = 50; rho = 200.0; tolx = 0.0005; tolg = 0.01; for(n=1; n<=10; n++) { for(pass=1; pass<=10; pass++) { for(prectype=-1; prectype<=2; prectype++) { /* * Generate well-conditioned problem with unit scale */ spdmatrixrndcond(n, 1.0E2, &fulla, _state); ae_vector_set_length(&b, n, _state); ae_vector_set_length(&bndl, n, _state); ae_vector_set_length(&bndu, n, _state); ae_vector_set_length(&x0, n, _state); for(i=0; i<=n-1; i++) { b.ptr.p_double[i] = hqrndnormal(&rs, _state); bndl.ptr.p_double[i] = _state->v_neginf; bndu.ptr.p_double[i] = _state->v_posinf; x0.ptr.p_double[i] = hqrndnormal(&rs, _state); j = hqrnduniformi(&rs, 5, _state); if( j==0 ) { bndl.ptr.p_double[i] = (double)(0); } if( j==1 ) { bndu.ptr.p_double[i] = (double)(0); } if( j==2 ) { bndl.ptr.p_double[i] = hqrndnormal(&rs, _state); bndu.ptr.p_double[i] = bndl.ptr.p_double[i]; } if( j==3 ) { bndl.ptr.p_double[i] = -0.1; bndu.ptr.p_double[i] = 0.1; } } /* * Apply scaling to quadratic/linear term, so problem becomes * well-conditioned in the scaled coordinates. 
*/ scaletype = hqrnduniformi(&rs, 2, _state); ae_vector_set_length(&s, n, _state); for(i=0; i<=n-1; i++) { if( scaletype==0 ) { s.ptr.p_double[i] = (double)(1); } else { s.ptr.p_double[i] = ae_exp(20*hqrndnormal(&rs, _state), _state); } } for(i=0; i<=n-1; i++) { x0.ptr.p_double[i] = x0.ptr.p_double[i]*s.ptr.p_double[i]; bndl.ptr.p_double[i] = bndl.ptr.p_double[i]*s.ptr.p_double[i]; bndu.ptr.p_double[i] = bndu.ptr.p_double[i]*s.ptr.p_double[i]; b.ptr.p_double[i] = b.ptr.p_double[i]/s.ptr.p_double[i]; for(j=0; j<=n-1; j++) { fulla.ptr.pp_double[i][j] = fulla.ptr.pp_double[i][j]/(s.ptr.p_double[i]*s.ptr.p_double[j]); } } /* * Solve problem */ minnlccreate(n, &x0, &state, _state); minnlcsetalgoaul(&state, rho, aulits, _state); if( scaletype!=0 ) { minnlcsetscale(&state, &s, _state); } if( prectype==0 ) { minnlcsetprecinexact(&state, _state); } if( prectype==1 ) { minnlcsetprecexactlowrank(&state, 0, _state); } if( prectype==2 ) { minnlcsetprecexactrobust(&state, 0, _state); } minnlcsetbc(&state, &bndl, &bndu, _state); minnlcsetcond(&state, 0.0, 0.0, 1.0E-9, 0, _state); while(minnlciteration(&state, _state)) { if( state.needfij ) { state.fi.ptr.p_double[0] = (double)(0); for(i=0; i<=n-1; i++) { state.fi.ptr.p_double[0] = state.fi.ptr.p_double[0]+b.ptr.p_double[i]*state.x.ptr.p_double[i]; state.j.ptr.pp_double[0][i] = b.ptr.p_double[i]; for(j=0; j<=n-1; j++) { state.fi.ptr.p_double[0] = state.fi.ptr.p_double[0]+0.5*state.x.ptr.p_double[i]*fulla.ptr.pp_double[i][j]*state.x.ptr.p_double[j]; state.j.ptr.pp_double[0][i] = state.j.ptr.pp_double[0][i]+fulla.ptr.pp_double[i][j]*state.x.ptr.p_double[j]; } } continue; } ae_assert(ae_false, "Assertion failed", _state); } minnlcresults(&state, &x1, &rep, _state); seterrorflag(wereerrors, !isfinitevector(&x1, n, _state), _state); seterrorflag(wereerrors, rep.terminationtype<=0, _state); if( *wereerrors ) { ae_frame_leave(_state); return; } /* * Check feasibility properties */ for(i=0; i<=n-1; i++) { seterrorflag(wereerrors, ae_isfinite(bndl.ptr.p_double[i], _state)&&ae_fp_less_eq(x1.ptr.p_double[i],bndl.ptr.p_double[i]-tolx*s.ptr.p_double[i]), _state); seterrorflag(wereerrors, ae_isfinite(bndu.ptr.p_double[i], _state)&&ae_fp_greater_eq(x1.ptr.p_double[i],bndu.ptr.p_double[i]+tolx*s.ptr.p_double[i]), _state); } /* * Test - calculate scaled constrained gradient at solution, * check its norm. */ gnorm = 0.0; for(i=0; i<=n-1; i++) { g = b.ptr.p_double[i]; for(j=0; j<=n-1; j++) { g = g+fulla.ptr.pp_double[i][j]*x1.ptr.p_double[j]; } g = s.ptr.p_double[i]*g; if( (ae_isfinite(bndl.ptr.p_double[i], _state)&&ae_fp_less(ae_fabs(x1.ptr.p_double[i]-bndl.ptr.p_double[i], _state),tolx*s.ptr.p_double[i]))&&ae_fp_greater(g,(double)(0)) ) { g = (double)(0); } if( (ae_isfinite(bndu.ptr.p_double[i], _state)&&ae_fp_less(ae_fabs(x1.ptr.p_double[i]-bndu.ptr.p_double[i], _state),tolx*s.ptr.p_double[i]))&&ae_fp_less(g,(double)(0)) ) { g = (double)(0); } gnorm = gnorm+ae_sqr(g, _state); } gnorm = ae_sqrt(gnorm, _state); seterrorflag(wereerrors, ae_fp_greater(gnorm,tolg), _state); } } } /* * Non-convex test: * * N dimensions, N>=2 * * box constraints, x[i] in [-1,+1] * * A is symmetric indefinite with condition number 50.0 * * random B with normal entries * * initial point is random, feasible * * scale is always unit * * We check that constrained problem can be successfully solved. * We do not check ability to detect unboundedness of unconstrained * problem because there is such functionality in MinNLC. 
*/ aulits = 50; rho = 200.0; tolx = 0.0005; tolg = 0.01; for(n=2; n<=10; n++) { for(pass=1; pass<=10; pass++) { for(prectype=-1; prectype<=2; prectype++) { /* * Generate problem */ for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { fulla.ptr.pp_double[i][j] = 0.0; } } for(i=0; i<=n-1; i++) { fulla.ptr.pp_double[i][i] = -1-hqrnduniformr(&rs, _state); } ae_vector_set_length(&b, n, _state); ae_vector_set_length(&bndl, n, _state); ae_vector_set_length(&bndu, n, _state); ae_vector_set_length(&x0, n, _state); for(i=0; i<=n-1; i++) { b.ptr.p_double[i] = 0.05*hqrndnormal(&rs, _state); bndl.ptr.p_double[i] = (double)(-1); bndu.ptr.p_double[i] = (double)(1); x0.ptr.p_double[i] = 2*hqrnduniformr(&rs, _state)-1; } /* * Solve problem: * * without constraints we expect failure * * with constraints algorithm must succeed */ minnlccreate(n, &x0, &state, _state); minnlcsetalgoaul(&state, rho, aulits, _state); minnlcsetbc(&state, &bndl, &bndu, _state); if( prectype==0 ) { minnlcsetprecinexact(&state, _state); } if( prectype==1 ) { minnlcsetprecexactlowrank(&state, 0, _state); } if( prectype==2 ) { minnlcsetprecexactrobust(&state, 0, _state); } minnlcsetcond(&state, 0.0, 0.0, 1.0E-9, 0, _state); while(minnlciteration(&state, _state)) { if( state.needfij ) { state.fi.ptr.p_double[0] = (double)(0); for(i=0; i<=n-1; i++) { state.fi.ptr.p_double[0] = state.fi.ptr.p_double[0]+b.ptr.p_double[i]*state.x.ptr.p_double[i]; state.j.ptr.pp_double[0][i] = b.ptr.p_double[i]; for(j=0; j<=n-1; j++) { state.fi.ptr.p_double[0] = state.fi.ptr.p_double[0]+0.5*state.x.ptr.p_double[i]*fulla.ptr.pp_double[i][j]*state.x.ptr.p_double[j]; state.j.ptr.pp_double[0][i] = state.j.ptr.pp_double[0][i]+fulla.ptr.pp_double[i][j]*state.x.ptr.p_double[j]; } } continue; } ae_assert(ae_false, "Assertion failed", _state); } minnlcresults(&state, &x1, &rep, _state); seterrorflag(wereerrors, !isfinitevector(&x1, n, _state), _state); seterrorflag(wereerrors, rep.terminationtype<=0, _state); if( *wereerrors ) { ae_frame_leave(_state); return; } /* * Check feasibility properties */ for(i=0; i<=n-1; i++) { seterrorflag(wereerrors, ae_isfinite(bndl.ptr.p_double[i], _state)&&ae_fp_less_eq(x1.ptr.p_double[i],bndl.ptr.p_double[i]-tolx), _state); seterrorflag(wereerrors, ae_isfinite(bndu.ptr.p_double[i], _state)&&ae_fp_greater_eq(x1.ptr.p_double[i],bndu.ptr.p_double[i]+tolx), _state); } /* * Test - calculate scaled constrained gradient at solution, * check its norm. */ gnorm = 0.0; for(i=0; i<=n-1; i++) { g = b.ptr.p_double[i]; for(j=0; j<=n-1; j++) { g = g+fulla.ptr.pp_double[i][j]*x1.ptr.p_double[j]; } g = g; if( (ae_isfinite(bndl.ptr.p_double[i], _state)&&ae_fp_less(ae_fabs(x1.ptr.p_double[i]-bndl.ptr.p_double[i], _state),tolx))&&ae_fp_greater(g,(double)(0)) ) { g = (double)(0); } if( (ae_isfinite(bndu.ptr.p_double[i], _state)&&ae_fp_less(ae_fabs(x1.ptr.p_double[i]-bndu.ptr.p_double[i], _state),tolx))&&ae_fp_less(g,(double)(0)) ) { g = (double)(0); } gnorm = gnorm+ae_sqr(g, _state); } gnorm = ae_sqrt(gnorm, _state); seterrorflag(wereerrors, ae_fp_greater(gnorm,tolg), _state); } } } ae_frame_leave(_state); } /************************************************************************* This function tests linearly constrained quadratic programming algorithm. Sets error flag on failure. 
*************************************************************************/ static void testminnlcunit_testlc(ae_bool* wereerrors, ae_state *_state) { ae_frame _frame_block; ae_int_t n; ae_int_t k; ae_int_t i; ae_int_t j; ae_int_t pass; ae_matrix q; ae_matrix fulla; double v; double vv; ae_vector tmp; ae_vector bl; ae_vector bu; ae_vector b; ae_vector xs0; ae_vector xstart; ae_vector x; ae_vector x0; ae_vector x1; ae_vector x2; ae_vector xm; ae_vector s; ae_vector g; ae_vector bndl; ae_vector bndu; ae_matrix a; ae_matrix c; ae_matrix ce; ae_vector ct; ae_vector nonnegative; double tolx; double tolg; double tolf; ae_int_t aulits; double rho; minnlcstate state; minnlcreport rep; ae_int_t scaletype; double f0; double f1; double tolconstr; ae_int_t bscale; ae_int_t akind; ae_int_t ccnt; ae_int_t shiftkind; ae_int_t prectype; hqrndstate rs; snnlssolver nnls; ae_frame_make(_state, &_frame_block); ae_matrix_init(&q, 0, 0, DT_REAL, _state); ae_matrix_init(&fulla, 0, 0, DT_REAL, _state); ae_vector_init(&tmp, 0, DT_REAL, _state); ae_vector_init(&bl, 0, DT_REAL, _state); ae_vector_init(&bu, 0, DT_REAL, _state); ae_vector_init(&b, 0, DT_REAL, _state); ae_vector_init(&xs0, 0, DT_REAL, _state); ae_vector_init(&xstart, 0, DT_REAL, _state); ae_vector_init(&x, 0, DT_REAL, _state); ae_vector_init(&x0, 0, DT_REAL, _state); ae_vector_init(&x1, 0, DT_REAL, _state); ae_vector_init(&x2, 0, DT_REAL, _state); ae_vector_init(&xm, 0, DT_REAL, _state); ae_vector_init(&s, 0, DT_REAL, _state); ae_vector_init(&g, 0, DT_REAL, _state); ae_vector_init(&bndl, 0, DT_REAL, _state); ae_vector_init(&bndu, 0, DT_REAL, _state); ae_matrix_init(&a, 0, 0, DT_REAL, _state); ae_matrix_init(&c, 0, 0, DT_REAL, _state); ae_matrix_init(&ce, 0, 0, DT_REAL, _state); ae_vector_init(&ct, 0, DT_INT, _state); ae_vector_init(&nonnegative, 0, DT_BOOL, _state); _minnlcstate_init(&state, _state); _minnlcreport_init(&rep, _state); _hqrndstate_init(&rs, _state); _snnlssolver_init(&nnls, _state); hqrndrandomize(&rs, _state); /* * First test: * * K=0, where q is random unit vector * * optimization problem has form 0.5*x'*A*x-(x1*A)*x, * where x1 is some random vector * * either: * a) x1 is feasible => we must stop at x1 * b) x1 is infeasible => we must stop at the boundary q'*x=0 and * projection of gradient onto q*x=0 must be zero * * NOTE: we make several passes because some specific kind of errors is rarely * caught by this test, so we need several repetitions. 
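* 
* NOTE: the linear term is generated as b = -A*x1 (x1 is denoted XM in the
*       code), so the unconstrained minimizer of 0.5*x'*A*x+b'*x is exactly
*       x1; this is what makes cases (a) and (b) above easy to verify.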
*/ rho = 200.0; tolx = 0.0005; tolg = 0.01; aulits = 50; for(n=2; n<=6; n++) { for(pass=0; pass<=4; pass++) { /* * Generate problem: A, b, CMatrix, x0, XStart */ spdmatrixrndcond(n, 1.0E2, &fulla, _state); ae_vector_set_length(&b, n, _state); ae_vector_set_length(&xm, n, _state); ae_vector_set_length(&xstart, n, _state); ae_matrix_set_length(&c, 1, n+1, _state); ae_vector_set_length(&ct, 1, _state); for(i=0; i<=n-1; i++) { xm.ptr.p_double[i] = 2*ae_randomreal(_state)-1; xstart.ptr.p_double[i] = 2*ae_randomreal(_state)-1; } do { v = (double)(0); for(i=0; i<=n-1; i++) { c.ptr.pp_double[0][i] = 2*ae_randomreal(_state)-1; v = v+ae_sqr(c.ptr.pp_double[0][i], _state); } v = ae_sqrt(v, _state); } while(ae_fp_eq(v,(double)(0))); for(i=0; i<=n-1; i++) { c.ptr.pp_double[0][i] = c.ptr.pp_double[0][i]/v; } c.ptr.pp_double[0][n] = (double)(0); ct.ptr.p_int[0] = 1; for(i=0; i<=n-1; i++) { v = ae_v_dotproduct(&fulla.ptr.pp_double[i][0], 1, &xm.ptr.p_double[0], 1, ae_v_len(0,n-1)); b.ptr.p_double[i] = -v; } /* * Apply scaling to linear term and known solution, * so problem becomes well-conditioned in the scaled coordinates. */ scaletype = hqrnduniformi(&rs, 2, _state); ae_vector_set_length(&s, n, _state); for(i=0; i<=n-1; i++) { if( scaletype==0 ) { s.ptr.p_double[i] = (double)(1); } else { s.ptr.p_double[i] = ae_exp(20*hqrndnormal(&rs, _state), _state); } } for(i=0; i<=n-1; i++) { xm.ptr.p_double[i] = xm.ptr.p_double[i]*s.ptr.p_double[i]; xstart.ptr.p_double[i] = xstart.ptr.p_double[i]*s.ptr.p_double[i]; b.ptr.p_double[i] = b.ptr.p_double[i]/s.ptr.p_double[i]; for(j=0; j<=n-1; j++) { fulla.ptr.pp_double[i][j] = fulla.ptr.pp_double[i][j]/(s.ptr.p_double[i]*s.ptr.p_double[j]); } } for(j=0; j<=n-1; j++) { c.ptr.pp_double[0][j] = c.ptr.pp_double[0][j]/s.ptr.p_double[j]; } /* * Create optimizer, solve */ minnlccreate(n, &xstart, &state, _state); minnlcsetalgoaul(&state, rho, aulits, _state); minnlcsetlc(&state, &c, &ct, 1, _state); minnlcsetscale(&state, &s, _state); while(minnlciteration(&state, _state)) { if( state.needfij ) { state.fi.ptr.p_double[0] = (double)(0); for(i=0; i<=n-1; i++) { state.fi.ptr.p_double[0] = state.fi.ptr.p_double[0]+b.ptr.p_double[i]*state.x.ptr.p_double[i]; state.j.ptr.pp_double[0][i] = b.ptr.p_double[i]; for(j=0; j<=n-1; j++) { state.fi.ptr.p_double[0] = state.fi.ptr.p_double[0]+0.5*state.x.ptr.p_double[i]*fulla.ptr.pp_double[i][j]*state.x.ptr.p_double[j]; state.j.ptr.pp_double[0][i] = state.j.ptr.pp_double[0][i]+fulla.ptr.pp_double[i][j]*state.x.ptr.p_double[j]; } } continue; } ae_assert(ae_false, "Assertion failed", _state); } minnlcresults(&state, &x1, &rep, _state); seterrorflag(wereerrors, !isfinitevector(&x1, n, _state), _state); seterrorflag(wereerrors, rep.terminationtype<=0, _state); if( *wereerrors ) { ae_frame_leave(_state); return; } /* * Test solution */ ae_vector_set_length(&g, n, _state); ae_v_move(&g.ptr.p_double[0], 1, &b.ptr.p_double[0], 1, ae_v_len(0,n-1)); for(i=0; i<=n-1; i++) { v = ae_v_dotproduct(&fulla.ptr.pp_double[i][0], 1, &x1.ptr.p_double[0], 1, ae_v_len(0,n-1)); g.ptr.p_double[i] = g.ptr.p_double[i]+v; } v = ae_v_dotproduct(&x1.ptr.p_double[0], 1, &c.ptr.pp_double[0][0], 1, ae_v_len(0,n-1)); seterrorflag(wereerrors, ae_fp_less(v,-tolx), _state); if( ae_fp_less(v,tolx) ) { /* * Point at the boundary, project gradient into * equality-constrained subspace. 
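* The projection computed below is g := g - ((g,c)/(c,c))*c, i.e. the
* component of the gradient along the constraint normal c is removed before
* the scaled gradient norm is compared with TolG.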
*/ v = 0.0; vv = 0.0; for(i=0; i<=n-1; i++) { v = v+g.ptr.p_double[i]*c.ptr.pp_double[0][i]; vv = vv+c.ptr.pp_double[0][i]*c.ptr.pp_double[0][i]; } v = v/vv; ae_v_subd(&g.ptr.p_double[0], 1, &c.ptr.pp_double[0][0], 1, ae_v_len(0,n-1), v); } v = 0.0; for(i=0; i<=n-1; i++) { v = v+ae_sqr(g.ptr.p_double[i]*s.ptr.p_double[i], _state); } seterrorflag(wereerrors, ae_fp_greater(ae_sqrt(v, _state),tolg), _state); } } /* * Equality-constrained test: * * N*N SPD A * * K0 ) { c.ptr.pp_double[i][n] = v; } if( ct.ptr.p_int[i]<0 ) { c.ptr.pp_double[i][n] = v; } } /* * Create optimizer, solve */ minnlccreate(n, &x0, &state, _state); minnlcsetalgoaul(&state, rho, aulits, _state); minnlcsetbc(&state, &bndl, &bndu, _state); minnlcsetlc(&state, &c, &ct, k, _state); minnlcsetcond(&state, 0.0, 0.0, 1.0E-9, 0, _state); while(minnlciteration(&state, _state)) { if( state.needfij ) { state.fi.ptr.p_double[0] = (double)(0); for(i=0; i<=n-1; i++) { state.fi.ptr.p_double[0] = state.fi.ptr.p_double[0]+b.ptr.p_double[i]*state.x.ptr.p_double[i]; state.j.ptr.pp_double[0][i] = b.ptr.p_double[i]; for(j=0; j<=n-1; j++) { state.fi.ptr.p_double[0] = state.fi.ptr.p_double[0]+0.5*state.x.ptr.p_double[i]*fulla.ptr.pp_double[i][j]*state.x.ptr.p_double[j]; state.j.ptr.pp_double[0][i] = state.j.ptr.pp_double[0][i]+fulla.ptr.pp_double[i][j]*state.x.ptr.p_double[j]; } } continue; } ae_assert(ae_false, "Assertion failed", _state); } minnlcresults(&state, &x1, &rep, _state); seterrorflag(wereerrors, !isfinitevector(&x1, n, _state), _state); seterrorflag(wereerrors, rep.terminationtype<=0, _state); if( *wereerrors ) { ae_frame_leave(_state); return; } for(i=0; i<=k-1; i++) { v = ae_v_dotproduct(&x1.ptr.p_double[0], 1, &c.ptr.pp_double[i][0], 1, ae_v_len(0,n-1)); if( ct.ptr.p_int[i]==0 ) { seterrorflag(wereerrors, ae_fp_greater(ae_fabs(v-c.ptr.pp_double[i][n], _state),tolx), _state); } if( ct.ptr.p_int[i]>0 ) { seterrorflag(wereerrors, ae_fp_less(v,c.ptr.pp_double[i][n]-tolx), _state); } if( ct.ptr.p_int[i]<0 ) { seterrorflag(wereerrors, ae_fp_greater(v,c.ptr.pp_double[i][n]+tolx), _state); } } } /* * Boundary and linear equality constrained QP problem, * test checks that different starting points yield same final point: * * random N from [1..6], random K from [1..N-1] * * N*N SPD A with moderate condtion number (important!) 
* * boundary constraints 0<=x[i]<=1 * * K=-2; bscale--) { /* * Generate A, B and initial point */ ae_matrix_set_length(&a, n, n, _state); ae_vector_set_length(&b, n, _state); ae_vector_set_length(&x, n, _state); for(i=0; i<=n-1; i++) { b.ptr.p_double[i] = ae_pow((double)(10), (double)(bscale), _state)*hqrndnormal(&rs, _state); x.ptr.p_double[i] = hqrnduniformr(&rs, _state)-0.5; } for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { a.ptr.pp_double[i][j] = 0.0; } } if( akind==1 ) { /* * Dense well conditioned SPD */ spdmatrixrndcond(n, 50.0, &a, _state); } if( akind==2 ) { /* * Dense well conditioned indefinite */ smatrixrndcond(n, 50.0, &a, _state); } if( akind==3 ) { /* * Low rank */ ae_vector_set_length(&tmp, n, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { a.ptr.pp_double[i][j] = 0.0; } } for(k=1; k<=ae_minint(3, n-1, _state); k++) { for(i=0; i<=n-1; i++) { tmp.ptr.p_double[i] = hqrndnormal(&rs, _state); } v = hqrndnormal(&rs, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { a.ptr.pp_double[i][j] = a.ptr.pp_double[i][j]+v*tmp.ptr.p_double[i]*tmp.ptr.p_double[j]; } } } } /* * Generate constraints */ ae_vector_set_length(&bl, n, _state); ae_vector_set_length(&bu, n, _state); for(i=0; i<=n-1; i++) { bl.ptr.p_double[i] = -1.0; bu.ptr.p_double[i] = 1.0; } ccnt = ae_round(ae_pow((double)(2), (double)(n), _state), _state); ae_matrix_set_length(&c, ccnt, n+1, _state); ae_vector_set_length(&ct, ccnt, _state); for(i=0; i<=ccnt-1; i++) { ct.ptr.p_int[i] = -1; k = i; c.ptr.pp_double[i][n] = ae_sign((double)(shiftkind), _state)*ae_pow((double)(10), ae_fabs((double)(shiftkind), _state), _state)*ae_machineepsilon; for(j=0; j<=n-1; j++) { c.ptr.pp_double[i][j] = (double)(2*(k%2)-1); c.ptr.pp_double[i][n] = c.ptr.pp_double[i][n]+c.ptr.pp_double[i][j]*c.ptr.pp_double[i][j]; k = k/2; } } /* * Create and optimize */ minnlccreate(n, &x, &state, _state); minnlcsetbc(&state, &bl, &bu, _state); minnlcsetlc(&state, &c, &ct, ccnt, _state); minnlcsetcond(&state, 1.0E-12, 0.0, 0.0, 0, _state); minnlcsetalgoaul(&state, 1000.0, 10, _state); while(minnlciteration(&state, _state)) { ae_assert(state.needfij, "Assertion failed", _state); state.fi.ptr.p_double[0] = (double)(0); for(i=0; i<=n-1; i++) { state.fi.ptr.p_double[0] = state.fi.ptr.p_double[0]+state.x.ptr.p_double[i]*b.ptr.p_double[i]; state.j.ptr.pp_double[0][i] = b.ptr.p_double[i]; } for(i=0; i<=n-1; i++) { v = ae_v_dotproduct(&a.ptr.pp_double[i][0], 1, &state.x.ptr.p_double[0], 1, ae_v_len(0,n-1)); state.fi.ptr.p_double[0] = state.fi.ptr.p_double[0]+0.5*state.x.ptr.p_double[i]*v; state.j.ptr.pp_double[0][i] = state.j.ptr.pp_double[0][i]+v; } } minnlcresults(&state, &xs0, &rep, _state); seterrorflag(wereerrors, rep.terminationtype<=0, _state); if( *wereerrors ) { ae_frame_leave(_state); return; } /* * Evaluate gradient at solution and test */ vv = 0.0; for(i=0; i<=n-1; i++) { v = ae_v_dotproduct(&a.ptr.pp_double[i][0], 1, &xs0.ptr.p_double[0], 1, ae_v_len(0,n-1)); v = v+b.ptr.p_double[i]; if( ae_fp_less_eq(xs0.ptr.p_double[i],bl.ptr.p_double[i]+tolconstr)&&ae_fp_greater(v,(double)(0)) ) { v = 0.0; } if( ae_fp_greater_eq(xs0.ptr.p_double[i],bu.ptr.p_double[i]-tolconstr)&&ae_fp_less(v,(double)(0)) ) { v = 0.0; } vv = vv+ae_sqr(v, _state); } vv = ae_sqrt(vv, _state); seterrorflag(wereerrors, ae_fp_greater(vv,1.0E-3), _state); } } } } /* * Linear/convex optimization problem with combination of * box and linear constraints: * * * N=2..8 * * f = 0.5*x'*A*x+b'*x * * b has normally distributed entries with scale 10^BScale * * several kinds of A are 
tried: zero, well conditioned SPD * * box constraints: x[i] in [-1,+1] * * initial point x0 = [0 0 ... 0 0] * * CCnt=min(3,N-1) general linear constraints of form (c,x)=0. * random mix of equality/inequality constraints is tried. * x0 is guaranteed to be feasible. * * We check that constrained gradient is close to zero at solution. * Inequality constraint is considered active if distance to boundary * is less than TolConstr. We use nonnegative least squares solver * in order to compute constrained gradient. */ tolconstr = 1.0E-2; for(n=2; n<=8; n++) { for(akind=0; akind<=1; akind++) { for(bscale=0; bscale>=-2; bscale--) { /* * Generate A, B and initial point */ ae_matrix_set_length(&a, n, n, _state); ae_vector_set_length(&b, n, _state); ae_vector_set_length(&x, n, _state); for(i=0; i<=n-1; i++) { b.ptr.p_double[i] = ae_pow((double)(10), (double)(bscale), _state)*hqrndnormal(&rs, _state); x.ptr.p_double[i] = 0.0; } for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { a.ptr.pp_double[i][j] = 0.0; } } if( akind==1 ) { /* * Dense well conditioned SPD */ spdmatrixrndcond(n, 50.0, &a, _state); } /* * Generate constraints */ ae_vector_set_length(&bl, n, _state); ae_vector_set_length(&bu, n, _state); for(i=0; i<=n-1; i++) { bl.ptr.p_double[i] = -1.0; bu.ptr.p_double[i] = 1.0; } ccnt = ae_minint(3, n-1, _state); ae_matrix_set_length(&c, ccnt, n+1, _state); ae_vector_set_length(&ct, ccnt, _state); for(i=0; i<=ccnt-1; i++) { ct.ptr.p_int[i] = hqrnduniformi(&rs, 3, _state)-1; c.ptr.pp_double[i][n] = 0.0; for(j=0; j<=n-1; j++) { c.ptr.pp_double[i][j] = hqrnduniformr(&rs, _state)-0.5; } } /* * Create and optimize */ minnlccreate(n, &x, &state, _state); minnlcsetbc(&state, &bl, &bu, _state); minnlcsetlc(&state, &c, &ct, ccnt, _state); minnlcsetcond(&state, 1.0E-9, 0.0, 0.0, 0, _state); minnlcsetalgoaul(&state, 1000.0, 10, _state); while(minnlciteration(&state, _state)) { ae_assert(state.needfij, "Assertion failed", _state); state.fi.ptr.p_double[0] = (double)(0); for(i=0; i<=n-1; i++) { state.fi.ptr.p_double[0] = state.fi.ptr.p_double[0]+state.x.ptr.p_double[i]*b.ptr.p_double[i]; state.j.ptr.pp_double[0][i] = b.ptr.p_double[i]; } for(i=0; i<=n-1; i++) { v = ae_v_dotproduct(&a.ptr.pp_double[i][0], 1, &state.x.ptr.p_double[0], 1, ae_v_len(0,n-1)); state.fi.ptr.p_double[0] = state.fi.ptr.p_double[0]+0.5*state.x.ptr.p_double[i]*v; state.j.ptr.pp_double[0][i] = state.j.ptr.pp_double[0][i]+v; } } minnlcresults(&state, &xs0, &rep, _state); seterrorflag(wereerrors, rep.terminationtype<=0, _state); if( *wereerrors ) { ae_frame_leave(_state); return; } /* * 1. evaluate unconstrained gradient at solution * * 2. calculate constrained gradient (NNLS solver is used * to evaluate gradient subject to active constraints). * In order to do this we form CE matrix, matrix of active * constraints (columns store constraint vectors). Then * we try to approximate gradient vector by columns of CE, * subject to non-negativity restriction placed on variables * corresponding to inequality constraints. * * Residual from such regression is a constrained gradient vector. 
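* 
* In other words, we solve min |g-CE*y| with y[i]>=0 for columns which
* correspond to inequality constraints (non-negativity for equality columns
* is dropped via SNNLSDropNNC), and the residual g-CE*y of this regression
* is the constrained gradient whose norm must be small at the solution.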
*/ ae_vector_set_length(&g, n, _state); for(i=0; i<=n-1; i++) { v = ae_v_dotproduct(&a.ptr.pp_double[i][0], 1, &xs0.ptr.p_double[0], 1, ae_v_len(0,n-1)); g.ptr.p_double[i] = v+b.ptr.p_double[i]; } ae_matrix_set_length(&ce, n, n+ccnt, _state); ae_vector_set_length(&nonnegative, n+ccnt, _state); k = 0; for(i=0; i<=n-1; i++) { seterrorflag(wereerrors, ae_fp_less(xs0.ptr.p_double[i],bl.ptr.p_double[i]-tolconstr), _state); seterrorflag(wereerrors, ae_fp_greater(xs0.ptr.p_double[i],bu.ptr.p_double[i]+tolconstr), _state); if( ae_fp_less_eq(xs0.ptr.p_double[i],bl.ptr.p_double[i]+tolconstr) ) { for(j=0; j<=n-1; j++) { ce.ptr.pp_double[j][k] = 0.0; } ce.ptr.pp_double[i][k] = 1.0; nonnegative.ptr.p_bool[k] = ae_true; inc(&k, _state); continue; } if( ae_fp_greater_eq(xs0.ptr.p_double[i],bu.ptr.p_double[i]-tolconstr) ) { for(j=0; j<=n-1; j++) { ce.ptr.pp_double[j][k] = 0.0; } ce.ptr.pp_double[i][k] = -1.0; nonnegative.ptr.p_bool[k] = ae_true; inc(&k, _state); continue; } } for(i=0; i<=ccnt-1; i++) { v = ae_v_dotproduct(&c.ptr.pp_double[i][0], 1, &xs0.ptr.p_double[0], 1, ae_v_len(0,n-1)); v = v-c.ptr.pp_double[i][n]; seterrorflag(wereerrors, ct.ptr.p_int[i]==0&&ae_fp_greater(ae_fabs(v, _state),tolconstr), _state); seterrorflag(wereerrors, ct.ptr.p_int[i]>0&&ae_fp_less(v,-tolconstr), _state); seterrorflag(wereerrors, ct.ptr.p_int[i]<0&&ae_fp_greater(v,tolconstr), _state); if( ct.ptr.p_int[i]==0 ) { for(j=0; j<=n-1; j++) { ce.ptr.pp_double[j][k] = c.ptr.pp_double[i][j]; } nonnegative.ptr.p_bool[k] = ae_false; inc(&k, _state); continue; } if( (ct.ptr.p_int[i]>0&&ae_fp_less_eq(v,tolconstr))||(ct.ptr.p_int[i]<0&&ae_fp_greater_eq(v,-tolconstr)) ) { for(j=0; j<=n-1; j++) { ce.ptr.pp_double[j][k] = ae_sign((double)(ct.ptr.p_int[i]), _state)*c.ptr.pp_double[i][j]; } nonnegative.ptr.p_bool[k] = ae_true; inc(&k, _state); continue; } } snnlsinit(0, 0, 0, &nnls, _state); snnlssetproblem(&nnls, &ce, &g, 0, k, n, _state); for(i=0; i<=k-1; i++) { if( !nonnegative.ptr.p_bool[i] ) { snnlsdropnnc(&nnls, i, _state); } } snnlssolve(&nnls, &tmp, _state); for(i=0; i<=k-1; i++) { for(j=0; j<=n-1; j++) { g.ptr.p_double[j] = g.ptr.p_double[j]-tmp.ptr.p_double[i]*ce.ptr.pp_double[j][i]; } } vv = ae_v_dotproduct(&g.ptr.p_double[0], 1, &g.ptr.p_double[0], 1, ae_v_len(0,n-1)); vv = ae_sqrt(vv, _state); seterrorflag(wereerrors, ae_fp_greater(vv,1.0E-3), _state); } } } ae_frame_leave(_state); } /************************************************************************* This function tests nonlinearly constrained quadratic programming algorithm. Sets error flag on failure. 
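The first (basic) test minimizes (x0-1)^2+(x1-1)^2 subject to the constraint
x0^2+x1^2-1, posed first as an inequality and then as an equality; in both
cases the known solution (sqrt(2)/2, sqrt(2)/2) must be recovered within TolX.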
*************************************************************************/ static void testminnlcunit_testnlc(ae_bool* wereerrors, ae_state *_state) { ae_frame _frame_block; ae_int_t n; ae_int_t n2; double tolx; double tolg; ae_int_t aulits; double rho; minnlcstate state; minnlcreport rep; ae_int_t scaletype; ae_vector x0; ae_vector x1; ae_vector b; ae_vector bndl; ae_vector bndu; ae_vector s; ae_vector g; ae_vector ckind; ae_matrix fulla; ae_matrix c; ae_vector ct; ae_int_t cntbc; ae_int_t cntlc; ae_int_t cntnlec; ae_int_t cntnlic; ae_int_t i; ae_int_t j; ae_int_t k; ae_int_t pass; ae_int_t klc; ae_int_t knlec; ae_int_t knlic; double v; double vv; double vx; double vy; double gnorm; hqrndstate rs; ae_int_t prectype; ae_frame_make(_state, &_frame_block); _minnlcstate_init(&state, _state); _minnlcreport_init(&rep, _state); ae_vector_init(&x0, 0, DT_REAL, _state); ae_vector_init(&x1, 0, DT_REAL, _state); ae_vector_init(&b, 0, DT_REAL, _state); ae_vector_init(&bndl, 0, DT_REAL, _state); ae_vector_init(&bndu, 0, DT_REAL, _state); ae_vector_init(&s, 0, DT_REAL, _state); ae_vector_init(&g, 0, DT_REAL, _state); ae_vector_init(&ckind, 0, DT_INT, _state); ae_matrix_init(&fulla, 0, 0, DT_REAL, _state); ae_matrix_init(&c, 0, 0, DT_REAL, _state); ae_vector_init(&ct, 0, DT_INT, _state); _hqrndstate_init(&rs, _state); hqrndrandomize(&rs, _state); /* * Basic test: * * 2-dimensional problem * * target function F(x0,x1) = (x0-1)^2 + (x1-1)^2 * * one nonlinear constraint Z(x0,x1) = x0^2+x1^2-1, * which is tried as equality and inequality one */ rho = 200.0; tolx = 0.0005; aulits = 50; n = 2; ae_vector_set_length(&x0, n, _state); for(i=0; i<=n-1; i++) { x0.ptr.p_double[i] = 2*ae_randomreal(_state)-1; } minnlccreate(n, &x0, &state, _state); minnlcsetalgoaul(&state, rho, aulits, _state); minnlcsetcond(&state, 0.0, 0.0, 1.0E-9, 0, _state); minnlcsetnlc(&state, 0, 1, _state); while(minnlciteration(&state, _state)) { if( state.needfij ) { state.fi.ptr.p_double[0] = ae_sqr(state.x.ptr.p_double[0]-1, _state)+ae_sqr(state.x.ptr.p_double[1]-1, _state); state.j.ptr.pp_double[0][0] = 2*(state.x.ptr.p_double[0]-1); state.j.ptr.pp_double[0][1] = 2*(state.x.ptr.p_double[1]-1); state.fi.ptr.p_double[1] = ae_sqr(state.x.ptr.p_double[0], _state)+ae_sqr(state.x.ptr.p_double[1], _state)-1; state.j.ptr.pp_double[1][0] = 2*state.x.ptr.p_double[0]; state.j.ptr.pp_double[1][1] = 2*state.x.ptr.p_double[1]; continue; } ae_assert(ae_false, "Assertion failed", _state); } minnlcresults(&state, &x1, &rep, _state); seterrorflag(wereerrors, !isfinitevector(&x1, n, _state), _state); seterrorflag(wereerrors, rep.terminationtype<=0, _state); if( *wereerrors ) { ae_frame_leave(_state); return; } seterrorflag(wereerrors, ae_fp_greater(ae_fabs(x1.ptr.p_double[0]-ae_sqrt((double)(2), _state)/2, _state),tolx), _state); seterrorflag(wereerrors, ae_fp_greater(ae_fabs(x1.ptr.p_double[1]-ae_sqrt((double)(2), _state)/2, _state),tolx), _state); minnlcsetnlc(&state, 1, 0, _state); minnlcrestartfrom(&state, &x0, _state); while(minnlciteration(&state, _state)) { if( state.needfij ) { state.fi.ptr.p_double[0] = ae_sqr(state.x.ptr.p_double[0]-1, _state)+ae_sqr(state.x.ptr.p_double[1]-1, _state); state.j.ptr.pp_double[0][0] = 2*(state.x.ptr.p_double[0]-1); state.j.ptr.pp_double[0][1] = 2*(state.x.ptr.p_double[1]-1); state.fi.ptr.p_double[1] = ae_sqr(state.x.ptr.p_double[0], _state)+ae_sqr(state.x.ptr.p_double[1], _state)-1; state.j.ptr.pp_double[1][0] = 2*state.x.ptr.p_double[0]; state.j.ptr.pp_double[1][1] = 2*state.x.ptr.p_double[1]; continue; } 
ae_assert(ae_false, "Assertion failed", _state); } minnlcresults(&state, &x1, &rep, _state); seterrorflag(wereerrors, !isfinitevector(&x1, n, _state), _state); seterrorflag(wereerrors, rep.terminationtype<=0, _state); if( *wereerrors ) { ae_frame_leave(_state); return; } seterrorflag(wereerrors, ae_fp_greater(ae_fabs(x1.ptr.p_double[0]-ae_sqrt((double)(2), _state)/2, _state),tolx), _state); seterrorflag(wereerrors, ae_fp_greater(ae_fabs(x1.ptr.p_double[1]-ae_sqrt((double)(2), _state)/2, _state),tolx), _state); /* * This test checks correctness of scaling being applied to nonlinear * constraints. We solve bound constrained scaled problem and check * that solution is correct. */ aulits = 50; rho = 200.0; tolx = 0.0005; tolg = 0.01; for(n=1; n<=10; n++) { for(pass=1; pass<=10; pass++) { /* * Generate well-conditioned problem with unit scale */ spdmatrixrndcond(n, 1.0E2, &fulla, _state); ae_vector_set_length(&b, n, _state); ae_vector_set_length(&bndl, n, _state); ae_vector_set_length(&bndu, n, _state); ae_vector_set_length(&x0, n, _state); for(i=0; i<=n-1; i++) { b.ptr.p_double[i] = hqrndnormal(&rs, _state); x0.ptr.p_double[i] = hqrndnormal(&rs, _state); } cntnlec = hqrnduniformi(&rs, n, _state); cntnlic = n-cntnlec; for(i=0; i<=cntnlec-1; i++) { bndl.ptr.p_double[i] = hqrndnormal(&rs, _state); bndu.ptr.p_double[i] = bndl.ptr.p_double[i]; } for(i=cntnlec; i<=n-1; i++) { bndl.ptr.p_double[i] = hqrndnormal(&rs, _state); bndu.ptr.p_double[i] = bndl.ptr.p_double[i]+0.5; } /* * Apply scaling to quadratic/linear term, so problem becomes * well-conditioned in the scaled coordinates. */ scaletype = hqrnduniformi(&rs, 2, _state); ae_vector_set_length(&s, n, _state); for(i=0; i<=n-1; i++) { if( scaletype==0 ) { s.ptr.p_double[i] = (double)(1); } else { s.ptr.p_double[i] = ae_exp(20*hqrndnormal(&rs, _state), _state); } } for(i=0; i<=n-1; i++) { x0.ptr.p_double[i] = x0.ptr.p_double[i]*s.ptr.p_double[i]; bndl.ptr.p_double[i] = bndl.ptr.p_double[i]*s.ptr.p_double[i]; bndu.ptr.p_double[i] = bndu.ptr.p_double[i]*s.ptr.p_double[i]; b.ptr.p_double[i] = b.ptr.p_double[i]/s.ptr.p_double[i]; for(j=0; j<=n-1; j++) { fulla.ptr.pp_double[i][j] = fulla.ptr.pp_double[i][j]/(s.ptr.p_double[i]*s.ptr.p_double[j]); } } /* * Solve problem with boundary constraints posed as nonlinear ones */ minnlccreate(n, &x0, &state, _state); minnlcsetalgoaul(&state, rho, aulits, _state); minnlcsetscale(&state, &s, _state); minnlcsetnlc(&state, cntnlec, 2*cntnlic, _state); minnlcsetcond(&state, 0.0, 0.0, 1.0E-9, 0, _state); while(minnlciteration(&state, _state)) { if( state.needfij ) { for(i=0; i<=cntnlec+2*cntnlic; i++) { state.fi.ptr.p_double[i] = (double)(0); for(j=0; j<=n-1; j++) { state.j.ptr.pp_double[i][j] = (double)(0); } } /* * Function itself */ for(i=0; i<=n-1; i++) { state.fi.ptr.p_double[0] = state.fi.ptr.p_double[0]+b.ptr.p_double[i]*state.x.ptr.p_double[i]; state.j.ptr.pp_double[0][i] = b.ptr.p_double[i]; for(j=0; j<=n-1; j++) { state.fi.ptr.p_double[0] = state.fi.ptr.p_double[0]+0.5*state.x.ptr.p_double[i]*fulla.ptr.pp_double[i][j]*state.x.ptr.p_double[j]; state.j.ptr.pp_double[0][i] = state.j.ptr.pp_double[0][i]+fulla.ptr.pp_double[i][j]*state.x.ptr.p_double[j]; } } /* * Equality constraints */ for(i=0; i<=cntnlec-1; i++) { state.fi.ptr.p_double[1+i] = (state.x.ptr.p_double[i]-bndl.ptr.p_double[i])/s.ptr.p_double[i]; state.j.ptr.pp_double[1+i][i] = 1/s.ptr.p_double[i]; } /* * Inequality constraints */ for(i=0; i<=cntnlic-1; i++) { k = cntnlec+i; state.fi.ptr.p_double[1+cntnlec+2*i+0] = 
(bndl.ptr.p_double[k]-state.x.ptr.p_double[k])/s.ptr.p_double[k]; state.j.ptr.pp_double[1+cntnlec+2*i+0][k] = -1/s.ptr.p_double[k]; state.fi.ptr.p_double[1+cntnlec+2*i+1] = (state.x.ptr.p_double[k]-bndu.ptr.p_double[k])/s.ptr.p_double[k]; state.j.ptr.pp_double[1+cntnlec+2*i+1][k] = 1/s.ptr.p_double[k]; } continue; } ae_assert(ae_false, "Assertion failed", _state); } minnlcresults(&state, &x1, &rep, _state); seterrorflag(wereerrors, !isfinitevector(&x1, n, _state), _state); seterrorflag(wereerrors, rep.terminationtype<=0, _state); if( *wereerrors ) { ae_frame_leave(_state); return; } /* * Check feasibility properties */ for(i=0; i<=n-1; i++) { seterrorflag(wereerrors, ae_isfinite(bndl.ptr.p_double[i], _state)&&ae_fp_less_eq(x1.ptr.p_double[i],bndl.ptr.p_double[i]-tolx*s.ptr.p_double[i]), _state); seterrorflag(wereerrors, ae_isfinite(bndu.ptr.p_double[i], _state)&&ae_fp_greater_eq(x1.ptr.p_double[i],bndu.ptr.p_double[i]+tolx*s.ptr.p_double[i]), _state); } /* * Test - calculate scaled constrained gradient at solution, * check its norm. */ ae_vector_set_length(&g, n, _state); gnorm = 0.0; for(i=0; i<=n-1; i++) { g.ptr.p_double[i] = b.ptr.p_double[i]; for(j=0; j<=n-1; j++) { g.ptr.p_double[i] = g.ptr.p_double[i]+fulla.ptr.pp_double[i][j]*x1.ptr.p_double[j]; } g.ptr.p_double[i] = s.ptr.p_double[i]*g.ptr.p_double[i]; if( (ae_isfinite(bndl.ptr.p_double[i], _state)&&ae_fp_less(ae_fabs(x1.ptr.p_double[i]-bndl.ptr.p_double[i], _state),tolx*s.ptr.p_double[i]))&&ae_fp_greater(g.ptr.p_double[i],(double)(0)) ) { g.ptr.p_double[i] = (double)(0); } if( (ae_isfinite(bndu.ptr.p_double[i], _state)&&ae_fp_less(ae_fabs(x1.ptr.p_double[i]-bndu.ptr.p_double[i], _state),tolx*s.ptr.p_double[i]))&&ae_fp_less(g.ptr.p_double[i],(double)(0)) ) { g.ptr.p_double[i] = (double)(0); } gnorm = gnorm+ae_sqr(g.ptr.p_double[i], _state); } gnorm = ae_sqrt(gnorm, _state); seterrorflag(wereerrors, ae_fp_greater(gnorm,tolg), _state); } } /* * Complex problem with mix of boundary, linear and nonlinear constraints: * * quadratic target function f(x) = 0.5*x'*A*x + b'*x * * unit scaling is used * * problem size N is even * * all variables are divided into pairs: x[0] and x[1], x[2] and x[3], ... * * constraints are set for pairs of variables, i.e. 
each constraint involves * only pair of adjacent variables (x0/x1, x2/x3, x4/x5 and so on), and each * pair of variables has at most one constraint which binds them * * for variables u and v following kinds of constraints can be randomly set: * * CKind=0 no constraint * * CKind=1 boundary equality constraint: u=a, v=b * * CKind=2 boundary inequality constraint: a0<=u<=b0, a1<=v<=b1 * * CKind=3 linear equality constraint: a*u+b*v = c * * CKind=4 linear inequality constraint: a*u+b*v <= c * * CKind=5 nonlinear equality constraint: u^2+v^2 = 1 * * CKind=6 nonlinear inequality constraint: u^2+v^2 <= 1 * * it is relatively easy to calculated projected gradient for such problem */ aulits = 50; rho = 200.0; tolx = 0.0005; tolg = 0.01; n = 20; n2 = n/2; for(pass=1; pass<=50; pass++) { for(prectype=-1; prectype<=2; prectype++) { /* * Generate well-conditioned problem with unit scale */ spdmatrixrndcond(n, 1.0E2, &fulla, _state); ae_vector_set_length(&b, n, _state); ae_vector_set_length(&bndl, n, _state); ae_vector_set_length(&bndu, n, _state); ae_matrix_set_length(&c, n, n+1, _state); ae_vector_set_length(&ct, n, _state); ae_vector_set_length(&x0, n, _state); ae_vector_set_length(&ckind, n2, _state); cntbc = 0; cntlc = 0; cntnlec = 0; cntnlic = 0; for(i=0; i<=n-1; i++) { bndl.ptr.p_double[i] = _state->v_neginf; bndu.ptr.p_double[i] = _state->v_posinf; x0.ptr.p_double[i] = hqrndnormal(&rs, _state); b.ptr.p_double[i] = hqrndnormal(&rs, _state); } for(i=0; i<=n2-1; i++) { ckind.ptr.p_int[i] = hqrnduniformi(&rs, 7, _state); if( ckind.ptr.p_int[i]==0 ) { /* * Unconstrained */ continue; } if( ckind.ptr.p_int[i]==1 ) { /* * Bound equality constrained */ bndl.ptr.p_double[2*i+0] = hqrnduniformr(&rs, _state)-0.5; bndu.ptr.p_double[2*i+0] = bndl.ptr.p_double[2*i+0]; bndl.ptr.p_double[2*i+1] = hqrnduniformr(&rs, _state)-0.5; bndu.ptr.p_double[2*i+1] = bndl.ptr.p_double[2*i+1]; inc(&cntbc, _state); continue; } if( ckind.ptr.p_int[i]==2 ) { /* * Bound inequality constrained */ bndl.ptr.p_double[2*i+0] = hqrnduniformr(&rs, _state)-0.5; bndu.ptr.p_double[2*i+0] = bndl.ptr.p_double[2*i+0]+0.5; bndl.ptr.p_double[2*i+1] = hqrnduniformr(&rs, _state)-0.5; bndu.ptr.p_double[2*i+1] = bndl.ptr.p_double[2*i+1]+0.5; inc(&cntbc, _state); continue; } if( ckind.ptr.p_int[i]==3 ) { /* * Linear equality constrained */ for(j=0; j<=n; j++) { c.ptr.pp_double[cntlc][j] = 0.0; } vx = hqrnduniformr(&rs, _state)-0.5; vy = hqrnduniformr(&rs, _state)-0.5; c.ptr.pp_double[cntlc][2*i+0] = vx; c.ptr.pp_double[cntlc][2*i+1] = vy; c.ptr.pp_double[cntlc][n] = hqrnduniformr(&rs, _state)-0.5; ct.ptr.p_int[cntlc] = 0; inc(&cntlc, _state); continue; } if( ckind.ptr.p_int[i]==4 ) { /* * Linear inequality constrained */ for(j=0; j<=n; j++) { c.ptr.pp_double[cntlc][j] = 0.0; } vx = hqrnduniformr(&rs, _state)-0.5; vy = hqrnduniformr(&rs, _state)-0.5; c.ptr.pp_double[cntlc][2*i+0] = vx; c.ptr.pp_double[cntlc][2*i+1] = vy; c.ptr.pp_double[cntlc][n] = hqrnduniformr(&rs, _state)-0.5; ct.ptr.p_int[cntlc] = -1; inc(&cntlc, _state); continue; } if( ckind.ptr.p_int[i]==5 ) { /* * Nonlinear equality constrained */ inc(&cntnlec, _state); continue; } if( ckind.ptr.p_int[i]==6 ) { /* * Nonlinear inequality constrained */ inc(&cntnlic, _state); continue; } ae_assert(ae_false, "Assertion failed", _state); } /* * Solve problem */ minnlccreate(n, &x0, &state, _state); minnlcsetalgoaul(&state, rho, aulits, _state); minnlcsetbc(&state, &bndl, &bndu, _state); minnlcsetlc(&state, &c, &ct, cntlc, _state); minnlcsetnlc(&state, cntnlec, cntnlic, _state); if( prectype==0 ) 
{ minnlcsetprecinexact(&state, _state); } if( prectype==1 ) { minnlcsetprecexactlowrank(&state, 0, _state); } if( prectype==2 ) { minnlcsetprecexactrobust(&state, 0, _state); } minnlcsetcond(&state, 0.0, 0.0, 1.0E-9, 0, _state); while(minnlciteration(&state, _state)) { if( state.needfij ) { /* * Evaluate target function */ state.fi.ptr.p_double[0] = (double)(0); for(i=0; i<=n-1; i++) { state.fi.ptr.p_double[0] = state.fi.ptr.p_double[0]+b.ptr.p_double[i]*state.x.ptr.p_double[i]; state.j.ptr.pp_double[0][i] = b.ptr.p_double[i]; for(j=0; j<=n-1; j++) { state.fi.ptr.p_double[0] = state.fi.ptr.p_double[0]+0.5*state.x.ptr.p_double[i]*fulla.ptr.pp_double[i][j]*state.x.ptr.p_double[j]; state.j.ptr.pp_double[0][i] = state.j.ptr.pp_double[0][i]+fulla.ptr.pp_double[i][j]*state.x.ptr.p_double[j]; } } /* * Evaluate constraint functions */ knlec = 1; knlic = 1+cntnlec; for(i=0; i<=n2-1; i++) { if( ckind.ptr.p_int[i]==5 ) { state.fi.ptr.p_double[knlec] = (double)(0); for(j=0; j<=n-1; j++) { state.j.ptr.pp_double[knlec][j] = 0.0; } state.fi.ptr.p_double[knlec] = ae_sqr(state.x.ptr.p_double[2*i+0], _state)+ae_sqr(state.x.ptr.p_double[2*i+1], _state)-1; state.j.ptr.pp_double[knlec][2*i+0] = 2*state.x.ptr.p_double[2*i+0]; state.j.ptr.pp_double[knlec][2*i+1] = 2*state.x.ptr.p_double[2*i+1]; inc(&knlec, _state); continue; } if( ckind.ptr.p_int[i]==6 ) { state.fi.ptr.p_double[knlic] = (double)(0); for(j=0; j<=n-1; j++) { state.j.ptr.pp_double[knlic][j] = 0.0; } state.fi.ptr.p_double[knlic] = ae_sqr(state.x.ptr.p_double[2*i+0], _state)+ae_sqr(state.x.ptr.p_double[2*i+1], _state)-1; state.j.ptr.pp_double[knlic][2*i+0] = 2*state.x.ptr.p_double[2*i+0]; state.j.ptr.pp_double[knlic][2*i+1] = 2*state.x.ptr.p_double[2*i+1]; inc(&knlic, _state); continue; } } ae_assert(knlec==1+cntnlec, "Assertion failed", _state); ae_assert(knlic==1+cntnlec+cntnlic, "Assertion failed", _state); continue; } ae_assert(ae_false, "Assertion failed", _state); } minnlcresults(&state, &x1, &rep, _state); seterrorflag(wereerrors, !isfinitevector(&x1, n, _state), _state); seterrorflag(wereerrors, rep.terminationtype<=0, _state); if( *wereerrors ) { ae_frame_leave(_state); return; } /* * Check feasibility properties */ klc = 0; for(i=0; i<=n2-1; i++) { if( ckind.ptr.p_int[i]==0 ) { /* * Unconstrained */ continue; } if( ckind.ptr.p_int[i]==1 ) { /* * Bound equality constrained */ seterrorflag(wereerrors, ae_fp_greater(ae_fabs(x1.ptr.p_double[2*i+0]-bndl.ptr.p_double[2*i+0], _state),tolx), _state); seterrorflag(wereerrors, ae_fp_greater(ae_fabs(x1.ptr.p_double[2*i+1]-bndl.ptr.p_double[2*i+1], _state),tolx), _state); continue; } if( ckind.ptr.p_int[i]==2 ) { /* * Bound inequality constrained */ seterrorflag(wereerrors, ae_fp_less(x1.ptr.p_double[2*i+0],bndl.ptr.p_double[2*i+0]-tolx), _state); seterrorflag(wereerrors, ae_fp_greater(x1.ptr.p_double[2*i+0],bndu.ptr.p_double[2*i+0]+tolx), _state); seterrorflag(wereerrors, ae_fp_less(x1.ptr.p_double[2*i+1],bndl.ptr.p_double[2*i+1]-tolx), _state); seterrorflag(wereerrors, ae_fp_greater(x1.ptr.p_double[2*i+1],bndu.ptr.p_double[2*i+1]+tolx), _state); continue; } if( ckind.ptr.p_int[i]==3 ) { /* * Linear equality constrained */ v = x1.ptr.p_double[2*i+0]*c.ptr.pp_double[klc][2*i+0]+x1.ptr.p_double[2*i+1]*c.ptr.pp_double[klc][2*i+1]-c.ptr.pp_double[klc][n]; seterrorflag(wereerrors, ae_fp_greater(ae_fabs(v, _state),tolx), _state); inc(&klc, _state); continue; } if( ckind.ptr.p_int[i]==4 ) { /* * Linear inequality constrained */ v = 
x1.ptr.p_double[2*i+0]*c.ptr.pp_double[klc][2*i+0]+x1.ptr.p_double[2*i+1]*c.ptr.pp_double[klc][2*i+1]-c.ptr.pp_double[klc][n]; seterrorflag(wereerrors, ae_fp_greater(v,tolx), _state); inc(&klc, _state); continue; } if( ckind.ptr.p_int[i]==5 ) { /* * Nonlinear equality constrained */ v = ae_sqr(x1.ptr.p_double[2*i+0], _state)+ae_sqr(x1.ptr.p_double[2*i+1], _state)-1; seterrorflag(wereerrors, ae_fp_greater(ae_fabs(v, _state),tolx), _state); continue; } if( ckind.ptr.p_int[i]==6 ) { /* * Nonlinear inequality constrained */ v = ae_sqr(x1.ptr.p_double[2*i+0], _state)+ae_sqr(x1.ptr.p_double[2*i+1], _state)-1; seterrorflag(wereerrors, ae_fp_greater(v,tolx), _state); continue; } ae_assert(ae_false, "Assertion failed", _state); } /* * Test - calculate scaled constrained gradient at solution, * check its norm. */ ae_vector_set_length(&g, n, _state); for(i=0; i<=n-1; i++) { v = b.ptr.p_double[i]; for(j=0; j<=n-1; j++) { v = v+fulla.ptr.pp_double[i][j]*x1.ptr.p_double[j]; } g.ptr.p_double[i] = v; } klc = 0; for(i=0; i<=n2-1; i++) { if( ckind.ptr.p_int[i]==0 ) { /* * Unconstrained */ continue; } if( ckind.ptr.p_int[i]==1 ) { /* * Bound equality constrained, unconditionally set gradient to zero */ g.ptr.p_double[2*i+0] = 0.0; g.ptr.p_double[2*i+1] = 0.0; continue; } if( ckind.ptr.p_int[i]==2 ) { /* * Bound inequality constrained, conditionally set gradient to zero * (when constraint is active) */ if( ae_fp_less(x1.ptr.p_double[2*i+0],bndl.ptr.p_double[2*i+0]+tolx)||ae_fp_greater(x1.ptr.p_double[2*i+0],bndu.ptr.p_double[2*i+0]-tolx) ) { g.ptr.p_double[2*i+0] = 0.0; } if( ae_fp_less(x1.ptr.p_double[2*i+1],bndl.ptr.p_double[2*i+1]+tolx)||ae_fp_greater(x1.ptr.p_double[2*i+1],bndu.ptr.p_double[2*i+1]-tolx) ) { g.ptr.p_double[2*i+1] = 0.0; } continue; } if( ckind.ptr.p_int[i]==3 ) { /* * Linear equality constrained, unconditionally project gradient into * equality constrained subspace */ v = g.ptr.p_double[2*i+0]*c.ptr.pp_double[klc][2*i+0]+g.ptr.p_double[2*i+1]*c.ptr.pp_double[klc][2*i+1]; vv = ae_sqr(c.ptr.pp_double[klc][2*i+0], _state)+ae_sqr(c.ptr.pp_double[klc][2*i+1], _state); g.ptr.p_double[2*i+0] = g.ptr.p_double[2*i+0]-c.ptr.pp_double[klc][2*i+0]*(v/vv); g.ptr.p_double[2*i+1] = g.ptr.p_double[2*i+1]-c.ptr.pp_double[klc][2*i+1]*(v/vv); inc(&klc, _state); continue; } if( ckind.ptr.p_int[i]==4 ) { /* * Linear inequality constrained, conditionally project gradient * (when constraint is active) */ v = x1.ptr.p_double[2*i+0]*c.ptr.pp_double[klc][2*i+0]+x1.ptr.p_double[2*i+1]*c.ptr.pp_double[klc][2*i+1]-c.ptr.pp_double[klc][n]; if( ae_fp_greater(v,-tolx) ) { v = g.ptr.p_double[2*i+0]*c.ptr.pp_double[klc][2*i+0]+g.ptr.p_double[2*i+1]*c.ptr.pp_double[klc][2*i+1]; vv = ae_sqr(c.ptr.pp_double[klc][2*i+0], _state)+ae_sqr(c.ptr.pp_double[klc][2*i+1], _state); g.ptr.p_double[2*i+0] = g.ptr.p_double[2*i+0]-c.ptr.pp_double[klc][2*i+0]*(v/vv); g.ptr.p_double[2*i+1] = g.ptr.p_double[2*i+1]-c.ptr.pp_double[klc][2*i+1]*(v/vv); } inc(&klc, _state); continue; } if( ckind.ptr.p_int[i]==5 ) { /* * Nonlinear equality constrained, unconditionally project gradient * * NOTE: here we rely on the fact that corresponding components of X * sum to one. 
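*       (more precisely, the squares of these components sum to one: the
*       constraint gradient (2u,2v) is parallel to (u,v), which has unit norm
*       on the circle u^2+v^2=1, so g := g - (g,x)*x removes exactly the
*       normal component of the gradient)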
*/ v = g.ptr.p_double[2*i+0]*x1.ptr.p_double[2*i+0]+g.ptr.p_double[2*i+1]*x1.ptr.p_double[2*i+1]; g.ptr.p_double[2*i+0] = g.ptr.p_double[2*i+0]-x1.ptr.p_double[2*i+0]*v; g.ptr.p_double[2*i+1] = g.ptr.p_double[2*i+1]-x1.ptr.p_double[2*i+1]*v; continue; } if( ckind.ptr.p_int[i]==6 ) { /* * Nonlinear inequality constrained, conditionally project gradient * (when constraint is active) * * NOTE: here we rely on the fact that corresponding components of X * sum to one. */ v = ae_sqr(x1.ptr.p_double[2*i+0], _state)+ae_sqr(x1.ptr.p_double[2*i+1], _state)-1; if( ae_fp_greater(v,-tolx) ) { v = g.ptr.p_double[2*i+0]*x1.ptr.p_double[2*i+0]+g.ptr.p_double[2*i+1]*x1.ptr.p_double[2*i+1]; g.ptr.p_double[2*i+0] = g.ptr.p_double[2*i+0]-x1.ptr.p_double[2*i+0]*v; g.ptr.p_double[2*i+1] = g.ptr.p_double[2*i+1]-x1.ptr.p_double[2*i+1]*v; } continue; } ae_assert(ae_false, "Assertion failed", _state); } gnorm = 0.0; for(i=0; i<=n-1; i++) { gnorm = gnorm+ae_sqr(g.ptr.p_double[i], _state); } gnorm = ae_sqrt(gnorm, _state); seterrorflag(wereerrors, ae_fp_greater(gnorm,tolg), _state); } } ae_frame_leave(_state); } /************************************************************************* This function performs additional tests On failure sets error flag. *************************************************************************/ static void testminnlcunit_testother(ae_bool* wereerrors, ae_state *_state) { ae_frame _frame_block; hqrndstate rs; double v; double h; double fl; double fr; double fl2; double fr2; double dfl; double dfr; double dfl2; double dfr2; double d2fl; double d2fr; double d2fl2; double d2fr2; double f0; double df; double d2f; double ndf; double nd2f; double dtol; double diffstep; minnlcstate state; minnlcreport rep; double rho; ae_int_t aulits; double tolx; ae_int_t i; ae_int_t j; ae_int_t k; ae_int_t n; ae_vector b; ae_vector x0; ae_vector x1; ae_vector x2; ae_vector x3; ae_vector xlast; ae_vector bndl; ae_vector bndu; double condv; ae_matrix a; ae_matrix c; ae_matrix fulla; ae_vector ct; ae_int_t nlbfgs; ae_int_t nexactlowrank; ae_int_t nexactrobust; ae_int_t nnone; ae_int_t prectype; ae_int_t ctype; ae_int_t trialidx; ae_int_t blocksize; ae_int_t blockcnt; ae_int_t maxits; ae_int_t spoiliteration; ae_int_t stopiteration; ae_int_t spoilvar; double spoilval; ae_int_t pass; ae_frame_make(_state, &_frame_block); _hqrndstate_init(&rs, _state); _minnlcstate_init(&state, _state); _minnlcreport_init(&rep, _state); ae_vector_init(&b, 0, DT_REAL, _state); ae_vector_init(&x0, 0, DT_REAL, _state); ae_vector_init(&x1, 0, DT_REAL, _state); ae_vector_init(&x2, 0, DT_REAL, _state); ae_vector_init(&x3, 0, DT_REAL, _state); ae_vector_init(&xlast, 0, DT_REAL, _state); ae_vector_init(&bndl, 0, DT_REAL, _state); ae_vector_init(&bndu, 0, DT_REAL, _state); ae_matrix_init(&a, 0, 0, DT_REAL, _state); ae_matrix_init(&c, 0, 0, DT_REAL, _state); ae_matrix_init(&fulla, 0, 0, DT_REAL, _state); ae_vector_init(&ct, 0, DT_INT, _state); hqrndrandomize(&rs, _state); /* * Test equality penalty function (correctly calculated and smooth) */ h = 1.0E-4; v = -0.98; dtol = 1.0E-3; while(ae_fp_less_eq(v,0.98)) { /* * Test numerical derivative; this test also checks continuity of the * function */ minnlcequalitypenaltyfunction(v-2*h, &fl2, &dfl2, &d2fl2, _state); minnlcequalitypenaltyfunction(v-h, &fl, &dfl, &d2fl, _state); minnlcequalitypenaltyfunction(v+h, &fr, &dfr, &d2fr, _state); minnlcequalitypenaltyfunction(v+2*h, &fr2, &dfr2, &d2fr2, _state); minnlcequalitypenaltyfunction(v, &f0, &df, &d2f, _state); ndf = (-fr2+8*fr-8*fl+fl2)/(12*h); 
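/*
 * NDF is the 4th order central difference estimate of the first derivative,
 * f'(v) ~ (-f(v+2h)+8*f(v+h)-8*f(v-h)+f(v-2h))/(12*h); the same stencil is
 * applied to the analytic first derivatives below to obtain ND2F, the
 * estimate of the second derivative.
 */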
seterrorflag(wereerrors, ae_fp_greater(ae_fabs(ndf-df, _state),dtol*ae_maxreal(ae_fabs(ndf, _state), (double)(1), _state)), _state); nd2f = (-dfr2+8*dfr-8*dfl+dfl2)/(12*h); seterrorflag(wereerrors, ae_fp_greater(ae_fabs(nd2f-d2f, _state),dtol*ae_maxreal(ae_fabs(nd2f, _state), (double)(1), _state)), _state); /* * Next point */ v = v+h; } minnlcequalitypenaltyfunction(0.0, &f0, &df, &d2f, _state); seterrorflag(wereerrors, ae_fp_neq(f0,(double)(0)), _state); seterrorflag(wereerrors, ae_fp_neq(df,(double)(0)), _state); /* * Test inequality penalty function (correctly calculated and smooth) */ h = 1.0E-4; v = 0.02; dtol = 1.0E-3; while(ae_fp_less_eq(v,2.00)) { /* * Test numerical derivative; this test also checks continuity of the * function */ minnlcinequalityshiftfunction(v-2*h, &fl2, &dfl2, &d2fl2, _state); minnlcinequalityshiftfunction(v-h, &fl, &dfl, &d2fl, _state); minnlcinequalityshiftfunction(v+h, &fr, &dfr, &d2fr, _state); minnlcinequalityshiftfunction(v+2*h, &fr2, &dfr2, &d2fr2, _state); minnlcinequalityshiftfunction(v, &f0, &df, &d2f, _state); ndf = (-fr2+8*fr-8*fl+fl2)/(12*h); seterrorflag(wereerrors, ae_fp_greater(ae_fabs(ndf-df, _state),dtol*ae_maxreal(ae_fabs(ndf, _state), (double)(1), _state)), _state); nd2f = (-dfr2+8*dfr-8*dfl+dfl2)/(12*h); seterrorflag(wereerrors, ae_fp_greater(ae_fabs(nd2f-d2f, _state),dtol*ae_maxreal(ae_fabs(nd2f, _state), (double)(1), _state)), _state); /* * Next point */ v = v+h; } minnlcinequalityshiftfunction(1.0, &f0, &df, &d2f, _state); seterrorflag(wereerrors, ae_fp_greater(ae_fabs(f0, _state),1.0E-6), _state); seterrorflag(wereerrors, ae_fp_greater(ae_fabs(df+1, _state),1.0E-6), _state); /* * Test location reports */ aulits = 50; rho = 200.0; tolx = 0.0005; n = 2; ae_vector_set_length(&x0, n, _state); ae_vector_set_length(&xlast, n, _state); x0.ptr.p_double[0] = 0.1; x0.ptr.p_double[1] = 0.2; xlast.ptr.p_double[0] = (double)(0); xlast.ptr.p_double[1] = (double)(0); minnlccreate(n, &x0, &state, _state); minnlcsetalgoaul(&state, rho, aulits, _state); minnlcsetcond(&state, 0.0, 0.0, 1.0E-9, 0, _state); minnlcsetnlc(&state, 0, 1, _state); minnlcsetxrep(&state, ae_true, _state); while(minnlciteration(&state, _state)) { if( state.needfij ) { state.fi.ptr.p_double[0] = ae_sqr(state.x.ptr.p_double[0]-1, _state)+ae_sqr(state.x.ptr.p_double[1]-1, _state); state.j.ptr.pp_double[0][0] = 2*(state.x.ptr.p_double[0]-1); state.j.ptr.pp_double[0][1] = 2*(state.x.ptr.p_double[1]-1); state.fi.ptr.p_double[1] = ae_sqr(state.x.ptr.p_double[0], _state)+ae_sqr(state.x.ptr.p_double[1], _state)-1; state.j.ptr.pp_double[1][0] = 2*state.x.ptr.p_double[0]; state.j.ptr.pp_double[1][1] = 2*state.x.ptr.p_double[1]; continue; } if( state.xupdated ) { /* * Save last point */ xlast.ptr.p_double[0] = state.x.ptr.p_double[0]; xlast.ptr.p_double[1] = state.x.ptr.p_double[1]; continue; } ae_assert(ae_false, "Assertion failed", _state); } minnlcresults(&state, &x1, &rep, _state); seterrorflag(wereerrors, !isfinitevector(&x1, n, _state), _state); seterrorflag(wereerrors, rep.terminationtype<=0, _state); if( *wereerrors ) { ae_frame_leave(_state); return; } seterrorflag(wereerrors, ae_fp_greater(ae_fabs(x1.ptr.p_double[0]-xlast.ptr.p_double[0], _state),1.0E4*ae_machineepsilon), _state); seterrorflag(wereerrors, ae_fp_greater(ae_fabs(x1.ptr.p_double[1]-xlast.ptr.p_double[1], _state),1.0E4*ae_machineepsilon), _state); /* * Test numerical differentiation */ aulits = 50; rho = 200.0; tolx = 0.0001; diffstep = 0.001; n = 2; ae_vector_set_length(&x0, n, _state); x0.ptr.p_double[0] = 0.1; 
x0.ptr.p_double[1] = 0.2; minnlccreatef(n, &x0, diffstep, &state, _state); minnlcsetalgoaul(&state, rho, aulits, _state); minnlcsetcond(&state, 0.0, 0.0, 1.0E-9, 0, _state); minnlcsetnlc(&state, 0, 1, _state); while(minnlciteration(&state, _state)) { if( state.needfi ) { state.fi.ptr.p_double[0] = ae_sqr(state.x.ptr.p_double[0]-1, _state)+ae_sqr(state.x.ptr.p_double[1]-1, _state); state.fi.ptr.p_double[1] = ae_sqr(state.x.ptr.p_double[0], _state)+ae_sqr(state.x.ptr.p_double[1], _state)-1; continue; } ae_assert(ae_false, "Assertion failed", _state); } minnlcresults(&state, &x1, &rep, _state); seterrorflag(wereerrors, !isfinitevector(&x1, n, _state), _state); seterrorflag(wereerrors, rep.terminationtype<=0, _state); if( *wereerrors ) { ae_frame_leave(_state); return; } seterrorflag(wereerrors, ae_fp_greater(ae_fabs(x1.ptr.p_double[0]-ae_sqrt((double)(2), _state)/2, _state),tolx), _state); seterrorflag(wereerrors, ae_fp_greater(ae_fabs(x1.ptr.p_double[1]-ae_sqrt((double)(2), _state)/2, _state),tolx), _state); /* * Test gradient checking */ aulits = 50; rho = 200.0; tolx = 0.0001; diffstep = 0.001; n = 2; ae_vector_set_length(&x0, n, _state); x0.ptr.p_double[0] = 0.1; x0.ptr.p_double[1] = 0.2; minnlccreate(n, &x0, &state, _state); minnlcsetalgoaul(&state, rho, aulits, _state); minnlcsetgradientcheck(&state, diffstep, _state); minnlcsetcond(&state, 0.0, 0.0, 1.0E-9, 0, _state); minnlcsetnlc(&state, 0, 1, _state); while(minnlciteration(&state, _state)) { if( state.needfij ) { state.fi.ptr.p_double[0] = ae_sqr(state.x.ptr.p_double[0]-1, _state)+ae_sqr(state.x.ptr.p_double[1]-1, _state); state.j.ptr.pp_double[0][0] = 2*(state.x.ptr.p_double[0]-1); state.j.ptr.pp_double[0][1] = 2*(state.x.ptr.p_double[1]-1); state.fi.ptr.p_double[1] = ae_sqr(state.x.ptr.p_double[0], _state)+ae_sqr(state.x.ptr.p_double[1], _state)-1; state.j.ptr.pp_double[1][0] = 2*state.x.ptr.p_double[0]; state.j.ptr.pp_double[1][1] = (double)(0); continue; } ae_assert(ae_false, "Assertion failed", _state); } minnlcresults(&state, &x1, &rep, _state); seterrorflag(wereerrors, rep.terminationtype!=-7, _state); seterrorflag(wereerrors, rep.varidx!=1, _state); seterrorflag(wereerrors, rep.funcidx!=1, _state); /* * Check handling of general linear constraints: solve linearly * constrained twice, first time with constraints posed as linear * ones, second time with constraints posed as nonlinear ones. * * Linear constraints are normalized because we know that optimizer * normalizes them internally. * * We perform small amount of inner iterations - just 3 steps. * Only one outer iteration is performed. Such small number of * iterations allows to reduce influence of round-off errors * and compare results returned by different control paths within * optimizer (control path for linear constraints and one for * nonlinear constraints). * * We test following kinds of preconditioners: * * "none" * * "exact low rank", restart frequency is 1 * * "exact robust", restart frequency is 1 * Inexact LBFGS-based preconditioner is not tested because its * behavior greatly depends on order of equations. 
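* 
* Acceptance criterion: after the same small iteration budget the point X1
* returned with constraints posed as linear ones and the point X2 returned
* with the same constraints posed as nonlinear ones must agree to within
* 1.0E-4 in every component.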
*/ n = 30; k = 5; rho = 1.0E3; aulits = 1; maxits = 3; ae_vector_set_length(&x0, n, _state); ae_matrix_set_length(&c, k, n+1, _state); ae_vector_set_length(&ct, k, _state); for(prectype=0; prectype<=2; prectype++) { for(i=0; i<=n-1; i++) { x0.ptr.p_double[i] = hqrndnormal(&rs, _state); } for(i=0; i<=k-1; i++) { v = 0.0; for(j=0; j<=n-1; j++) { c.ptr.pp_double[i][j] = hqrndnormal(&rs, _state); v = v+ae_sqr(c.ptr.pp_double[i][j], _state); } v = 1/ae_sqrt(v, _state); ae_v_muld(&c.ptr.pp_double[i][0], 1, ae_v_len(0,n-1), v); v = ae_v_dotproduct(&c.ptr.pp_double[i][0], 1, &x0.ptr.p_double[0], 1, ae_v_len(0,n-1)); c.ptr.pp_double[i][n] = v; ct.ptr.p_int[i] = 0; } minnlccreate(n, &x0, &state, _state); minnlcsetalgoaul(&state, rho, aulits, _state); minnlcsetcond(&state, 0.0, 0.0, 0.0, maxits, _state); if( prectype==0 ) { minnlcsetprecnone(&state, _state); } if( prectype==1 ) { minnlcsetprecexactlowrank(&state, 1, _state); } if( prectype==2 ) { minnlcsetprecexactrobust(&state, 1, _state); } minnlcsetlc(&state, &c, &ct, k, _state); while(minnlciteration(&state, _state)) { if( state.needfij ) { state.fi.ptr.p_double[0] = (double)(0); for(i=0; i<=n-1; i++) { state.fi.ptr.p_double[0] = state.fi.ptr.p_double[0]+ae_sqr(state.x.ptr.p_double[i], _state); state.j.ptr.pp_double[0][i] = 2*state.x.ptr.p_double[i]; } continue; } ae_assert(ae_false, "Assertion failed", _state); } minnlcresults(&state, &x1, &rep, _state); seterrorflag(wereerrors, !isfinitevector(&x1, n, _state), _state); seterrorflag(wereerrors, rep.terminationtype<=0, _state); if( *wereerrors ) { ae_frame_leave(_state); return; } minnlccreate(n, &x0, &state, _state); minnlcsetalgoaul(&state, rho, aulits, _state); minnlcsetcond(&state, 0.0, 0.0, 0.0, maxits, _state); if( prectype==0 ) { minnlcsetprecnone(&state, _state); } if( prectype==1 ) { minnlcsetprecexactlowrank(&state, 1, _state); } if( prectype==2 ) { minnlcsetprecexactrobust(&state, 1, _state); } minnlcsetnlc(&state, k, 0, _state); while(minnlciteration(&state, _state)) { if( state.needfij ) { state.fi.ptr.p_double[0] = (double)(0); for(i=0; i<=n-1; i++) { state.fi.ptr.p_double[0] = state.fi.ptr.p_double[0]+ae_sqr(state.x.ptr.p_double[i], _state); state.j.ptr.pp_double[0][i] = 2*state.x.ptr.p_double[i]; } for(i=0; i<=k-1; i++) { v = ae_v_dotproduct(&c.ptr.pp_double[i][0], 1, &state.x.ptr.p_double[0], 1, ae_v_len(0,n-1)); state.fi.ptr.p_double[1+i] = v-c.ptr.pp_double[i][n]; ae_v_move(&state.j.ptr.pp_double[1+i][0], 1, &c.ptr.pp_double[i][0], 1, ae_v_len(0,n-1)); } continue; } ae_assert(ae_false, "Assertion failed", _state); } minnlcresults(&state, &x2, &rep, _state); seterrorflag(wereerrors, !isfinitevector(&x2, n, _state), _state); seterrorflag(wereerrors, rep.terminationtype<=0, _state); if( *wereerrors ) { ae_frame_leave(_state); return; } for(i=0; i<=n-1; i++) { seterrorflag(wereerrors, ae_fp_greater(ae_fabs(x1.ptr.p_double[i]-x2.ptr.p_double[i], _state),1.0E-4), _state); } } /* * Test preconditioning: * * compare number of iterations required to solve problem with * different preconditioners (LBFGS, exact, none) * * a set of trials is performed (100 trials) * * each trial is a solution of boundary/linearly constrained problem * (linear constraints may be posed as nonlinear ones) with normalized * constraint matrix. Normalization is essential for reproducibility * of results . 
* * Outer loop checks handling of different types of constraints * (posed as linear or nonlinear ones) */ n = 30; blocksize = 3; blockcnt = 3; rho = 1.0E3; aulits = 5; condv = 1.0E2; ae_vector_set_length(&x0, n, _state); ae_vector_set_length(&bndl, n, _state); ae_vector_set_length(&bndu, n, _state); ae_matrix_set_length(&c, blocksize*blockcnt, n+1, _state); ae_vector_set_length(&ct, blocksize*blockcnt, _state); for(ctype=0; ctype<=1; ctype++) { /* * First, initialize iteration counters */ nlbfgs = 0; nexactlowrank = 0; nexactrobust = 0; nnone = 0; /* * Perform trials */ for(trialidx=0; trialidx<=99; trialidx++) { /* * Generate: * * boundary constraints BndL/BndU and initial point X0 * * block-diagonal matrix of linear constraints C such * that X0 is feasible w.r.t. constraints given by C */ for(i=0; i<=n-1; i++) { if( ae_fp_greater(hqrndnormal(&rs, _state),(double)(0)) ) { bndl.ptr.p_double[i] = (double)(0); bndu.ptr.p_double[i] = _state->v_posinf; x0.ptr.p_double[i] = hqrnduniformr(&rs, _state); } else { bndl.ptr.p_double[i] = (double)(0); bndu.ptr.p_double[i] = (double)(0); x0.ptr.p_double[i] = (double)(0); } } for(i=0; i<=blocksize*blockcnt-1; i++) { for(j=0; j<=n-1; j++) { c.ptr.pp_double[i][j] = 0.0; } } for(k=0; k<=blockcnt-1; k++) { rmatrixrndcond(blocksize, condv, &a, _state); for(i=0; i<=blocksize-1; i++) { for(j=0; j<=blocksize-1; j++) { c.ptr.pp_double[k*blocksize+i][k*blocksize+j] = a.ptr.pp_double[i][j]; } } } for(i=0; i<=blocksize*blockcnt-1; i++) { v = ae_v_dotproduct(&c.ptr.pp_double[i][0], 1, &c.ptr.pp_double[i][0], 1, ae_v_len(0,n-1)); v = 1/ae_sqrt(v, _state); ae_v_muld(&c.ptr.pp_double[i][0], 1, ae_v_len(0,n-1), v); v = ae_v_dotproduct(&c.ptr.pp_double[i][0], 1, &x0.ptr.p_double[0], 1, ae_v_len(0,n-1)); c.ptr.pp_double[i][n] = v; ct.ptr.p_int[i] = hqrnduniformi(&rs, 3, _state)-1; } /* * Test unpreconditioned iteration */ minnlccreate(n, &x0, &state, _state); minnlcsetalgoaul(&state, rho, aulits, _state); minnlcsetcond(&state, 0.0, 0.0, 1.0E-9, 0, _state); if( ctype==0 ) { minnlcsetlc(&state, &c, &ct, blocksize*blockcnt, _state); } else { minnlcsetnlc(&state, blocksize*blockcnt, 0, _state); } minnlcsetprecnone(&state, _state); while(minnlciteration(&state, _state)) { if( state.needfij ) { state.fi.ptr.p_double[0] = (double)(0); for(i=0; i<=n-1; i++) { state.fi.ptr.p_double[0] = state.fi.ptr.p_double[0]+ae_sqr(state.x.ptr.p_double[i], _state); state.j.ptr.pp_double[0][i] = 2*state.x.ptr.p_double[i]; } if( ctype==1 ) { for(i=0; i<=blocksize*blockcnt-1; i++) { v = ae_v_dotproduct(&c.ptr.pp_double[i][0], 1, &state.x.ptr.p_double[0], 1, ae_v_len(0,n-1)); state.fi.ptr.p_double[1+i] = v-c.ptr.pp_double[i][n]; ae_v_move(&state.j.ptr.pp_double[1+i][0], 1, &c.ptr.pp_double[i][0], 1, ae_v_len(0,n-1)); } } continue; } ae_assert(ae_false, "Assertion failed", _state); } minnlcresults(&state, &x1, &rep, _state); seterrorflag(wereerrors, !isfinitevector(&x1, n, _state), _state); seterrorflag(wereerrors, rep.terminationtype<=0, _state); if( *wereerrors ) { ae_frame_leave(_state); return; } nnone = nnone+rep.iterationscount; /* * Test LBFGS preconditioned iteration */ minnlccreate(n, &x0, &state, _state); minnlcsetalgoaul(&state, rho, aulits, _state); minnlcsetcond(&state, 0.0, 0.0, 1.0E-9, 0, _state); if( ctype==0 ) { minnlcsetlc(&state, &c, &ct, blocksize*blockcnt, _state); } else { minnlcsetnlc(&state, blocksize*blockcnt, 0, _state); } minnlcsetprecinexact(&state, _state); while(minnlciteration(&state, _state)) { if( state.needfij ) { state.fi.ptr.p_double[0] = (double)(0); for(i=0; 
i<=n-1; i++) { state.fi.ptr.p_double[0] = state.fi.ptr.p_double[0]+ae_sqr(state.x.ptr.p_double[i], _state); state.j.ptr.pp_double[0][i] = 2*state.x.ptr.p_double[i]; } if( ctype==1 ) { for(i=0; i<=blocksize*blockcnt-1; i++) { v = ae_v_dotproduct(&c.ptr.pp_double[i][0], 1, &state.x.ptr.p_double[0], 1, ae_v_len(0,n-1)); state.fi.ptr.p_double[1+i] = v-c.ptr.pp_double[i][n]; ae_v_move(&state.j.ptr.pp_double[1+i][0], 1, &c.ptr.pp_double[i][0], 1, ae_v_len(0,n-1)); } } continue; } ae_assert(ae_false, "Assertion failed", _state); } minnlcresults(&state, &x1, &rep, _state); seterrorflag(wereerrors, !isfinitevector(&x1, n, _state), _state); seterrorflag(wereerrors, rep.terminationtype<=0, _state); if( *wereerrors ) { ae_frame_leave(_state); return; } nlbfgs = nlbfgs+rep.iterationscount; /* * Test exact low rank preconditioner */ minnlccreate(n, &x0, &state, _state); minnlcsetalgoaul(&state, rho, aulits, _state); minnlcsetcond(&state, 0.0, 0.0, 1.0E-9, 0, _state); if( ctype==0 ) { minnlcsetlc(&state, &c, &ct, blocksize*blockcnt, _state); } else { minnlcsetnlc(&state, blocksize*blockcnt, 0, _state); } minnlcsetprecexactlowrank(&state, 3, _state); while(minnlciteration(&state, _state)) { if( state.needfij ) { state.fi.ptr.p_double[0] = (double)(0); for(i=0; i<=n-1; i++) { state.fi.ptr.p_double[0] = state.fi.ptr.p_double[0]+ae_sqr(state.x.ptr.p_double[i], _state); state.j.ptr.pp_double[0][i] = 2*state.x.ptr.p_double[i]; } if( ctype==1 ) { for(i=0; i<=blocksize*blockcnt-1; i++) { v = ae_v_dotproduct(&c.ptr.pp_double[i][0], 1, &state.x.ptr.p_double[0], 1, ae_v_len(0,n-1)); state.fi.ptr.p_double[1+i] = v-c.ptr.pp_double[i][n]; ae_v_move(&state.j.ptr.pp_double[1+i][0], 1, &c.ptr.pp_double[i][0], 1, ae_v_len(0,n-1)); } } continue; } ae_assert(ae_false, "Assertion failed", _state); } minnlcresults(&state, &x1, &rep, _state); seterrorflag(wereerrors, !isfinitevector(&x1, n, _state), _state); seterrorflag(wereerrors, rep.terminationtype<=0, _state); if( *wereerrors ) { ae_frame_leave(_state); return; } nexactlowrank = nexactlowrank+rep.iterationscount; /* * Test exact robust preconditioner */ minnlccreate(n, &x0, &state, _state); minnlcsetalgoaul(&state, rho, aulits, _state); minnlcsetcond(&state, 0.0, 0.0, 1.0E-9, 0, _state); if( ctype==0 ) { minnlcsetlc(&state, &c, &ct, blocksize*blockcnt, _state); } else { minnlcsetnlc(&state, blocksize*blockcnt, 0, _state); } minnlcsetprecexactrobust(&state, 3, _state); while(minnlciteration(&state, _state)) { if( state.needfij ) { state.fi.ptr.p_double[0] = (double)(0); for(i=0; i<=n-1; i++) { state.fi.ptr.p_double[0] = state.fi.ptr.p_double[0]+ae_sqr(state.x.ptr.p_double[i], _state); state.j.ptr.pp_double[0][i] = 2*state.x.ptr.p_double[i]; } if( ctype==1 ) { for(i=0; i<=blocksize*blockcnt-1; i++) { v = ae_v_dotproduct(&c.ptr.pp_double[i][0], 1, &state.x.ptr.p_double[0], 1, ae_v_len(0,n-1)); state.fi.ptr.p_double[1+i] = v-c.ptr.pp_double[i][n]; ae_v_move(&state.j.ptr.pp_double[1+i][0], 1, &c.ptr.pp_double[i][0], 1, ae_v_len(0,n-1)); } } continue; } ae_assert(ae_false, "Assertion failed", _state); } minnlcresults(&state, &x1, &rep, _state); seterrorflag(wereerrors, !isfinitevector(&x1, n, _state), _state); seterrorflag(wereerrors, rep.terminationtype<=0, _state); if( *wereerrors ) { ae_frame_leave(_state); return; } nexactrobust = nexactrobust+rep.iterationscount; } /* * Compare. * * Preconditioners must be significantly different, * with exact being best one, inexact being second, * "none" being worst option. 
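* 
* Concretely, the accumulated iteration counts must satisfy
* NExactLowRank<0.9*NLBFGS, NExactRobust<0.9*NLBFGS and NLBFGS<0.9*NNone,
* i.e. each better preconditioner has to save at least 10% of the iterations
* used by the next best option.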
*/ seterrorflag(wereerrors, !ae_fp_less((double)(nexactlowrank),0.9*nlbfgs), _state); seterrorflag(wereerrors, !ae_fp_less((double)(nexactrobust),0.9*nlbfgs), _state); seterrorflag(wereerrors, !ae_fp_less((double)(nlbfgs),0.9*nnone), _state); } /* * Test integrity checks for NAN/INF: * * algorithm solves optimization problem, which is normal for some time (quadratic) * * after 5-th step we choose random component of gradient and consistently spoil * it by NAN or INF. * * we check that correct termination code is returned (-8) */ n = 100; for(pass=1; pass<=10; pass++) { spoiliteration = 5; stopiteration = 8; if( ae_fp_greater(hqrndnormal(&rs, _state),(double)(0)) ) { /* * Gradient can be spoiled by +INF, -INF, NAN */ spoilvar = hqrnduniformi(&rs, n, _state); i = hqrnduniformi(&rs, 3, _state); spoilval = _state->v_nan; if( i==0 ) { spoilval = _state->v_neginf; } if( i==1 ) { spoilval = _state->v_posinf; } } else { /* * Function value can be spoiled only by NAN * (+INF can be recognized as legitimate value during optimization) */ spoilvar = -1; spoilval = _state->v_nan; } spdmatrixrndcond(n, 1.0E5, &fulla, _state); ae_vector_set_length(&b, n, _state); ae_vector_set_length(&x0, n, _state); for(i=0; i<=n-1; i++) { b.ptr.p_double[i] = hqrndnormal(&rs, _state); x0.ptr.p_double[i] = hqrndnormal(&rs, _state); } minnlccreate(n, &x0, &state, _state); minnlcsetcond(&state, 0.0, 0.0, 0.0, stopiteration, _state); minnlcsetxrep(&state, ae_true, _state); k = -1; while(minnlciteration(&state, _state)) { if( state.needfij ) { state.fi.ptr.p_double[0] = (double)(0); for(i=0; i<=n-1; i++) { state.fi.ptr.p_double[0] = state.fi.ptr.p_double[0]+b.ptr.p_double[i]*state.x.ptr.p_double[i]; state.j.ptr.pp_double[0][i] = b.ptr.p_double[i]; for(j=0; j<=n-1; j++) { state.fi.ptr.p_double[0] = state.f+0.5*state.x.ptr.p_double[i]*fulla.ptr.pp_double[i][j]*state.x.ptr.p_double[j]; state.j.ptr.pp_double[0][i] = state.j.ptr.pp_double[0][i]+fulla.ptr.pp_double[i][j]*state.x.ptr.p_double[j]; } } if( k>=spoiliteration ) { if( spoilvar<0 ) { state.fi.ptr.p_double[0] = spoilval; } else { state.j.ptr.pp_double[0][spoilvar] = spoilval; } } continue; } if( state.xupdated ) { inc(&k, _state); continue; } ae_assert(ae_false, "Assertion failed", _state); } minnlcresults(&state, &x1, &rep, _state); seterrorflag(wereerrors, rep.terminationtype!=-8, _state); } ae_frame_leave(_state); } /************************************************************************* This function performs tests for fixed bugs On failure sets error flag. 
*************************************************************************/ static void testminnlcunit_testbugs(ae_bool* wereerrors, ae_state *_state) { ae_frame _frame_block; hqrndstate rs; ae_int_t n; ae_int_t aulits; ae_int_t maxits; double rho; ae_int_t ckind; minnlcstate state; minnlcreport rep; ae_vector x0; ae_vector x1; ae_vector bndl; ae_vector bndu; ae_vector ct; ae_matrix c; ae_frame_make(_state, &_frame_block); _hqrndstate_init(&rs, _state); _minnlcstate_init(&state, _state); _minnlcreport_init(&rep, _state); ae_vector_init(&x0, 0, DT_REAL, _state); ae_vector_init(&x1, 0, DT_REAL, _state); ae_vector_init(&bndl, 0, DT_REAL, _state); ae_vector_init(&bndu, 0, DT_REAL, _state); ae_vector_init(&ct, 0, DT_INT, _state); ae_matrix_init(&c, 0, 0, DT_REAL, _state); hqrndrandomize(&rs, _state); /* * Bug description (fixed): sometimes on non-convex problems, when * Lagrange coefficient for inequality constraint becomes small, * algorithm performs VERY deep step into infeasible area (step is 1E50), * which de-stabilizes it and prevents from converging back to feasible area. * * Very rare situation, but must be fixed with additional "convexifying" term. * This test reproduces situation with convexified term turned off, then * checks that introduction of term solves issue. * * We perform three kinds of tests: * * with box inequality constraint * * with linear inequality constraint * * with nonlinear inequality constraint * * In all three cases we: * * first time solve non-convex problem with artificially moved stabilizing * point and decreased initial value of Lagrange multiplier. * * second time we solve problem with good stabilizing point, but zero Lagrange multiplier * * last time solve same problem, but with default settings */ aulits = 1; maxits = 1; rho = 100.0; n = 1; ae_vector_set_length(&x0, n, _state); x0.ptr.p_double[0] = 0.0; ae_vector_set_length(&bndl, n, _state); ae_vector_set_length(&bndu, n, _state); bndl.ptr.p_double[0] = 0.0; bndu.ptr.p_double[0] = _state->v_posinf; ae_matrix_set_length(&c, 1, 2, _state); ae_vector_set_length(&ct, 1, _state); c.ptr.pp_double[0][0] = 1.0; c.ptr.pp_double[0][1] = 0.0; ct.ptr.p_int[0] = 1; for(ckind=0; ckind<=2; ckind++) { minnlccreate(n, &x0, &state, _state); state.stabilizingpoint = -1.0E300; state.initialinequalitymultiplier = 1.0E-12; minnlcsetalgoaul(&state, rho, aulits, _state); minnlcsetcond(&state, 0.0, 0.0, 0.0, maxits, _state); if( ckind==0 ) { minnlcsetbc(&state, &bndl, &bndu, _state); } if( ckind==1 ) { minnlcsetlc(&state, &c, &ct, 1, _state); } if( ckind==2 ) { minnlcsetnlc(&state, 0, 1, _state); } while(minnlciteration(&state, _state)) { if( state.needfij ) { state.fi.ptr.p_double[0] = state.x.ptr.p_double[0]-ae_sqr(state.x.ptr.p_double[0], _state); state.j.ptr.pp_double[0][0] = 1-2*state.x.ptr.p_double[0]; if( ckind==2 ) { state.fi.ptr.p_double[1] = -state.x.ptr.p_double[0]; state.j.ptr.pp_double[1][0] = (double)(-1); } continue; } ae_assert(ae_false, "Assertion failed", _state); } minnlcresults(&state, &x1, &rep, _state); seterrorflag(wereerrors, rep.terminationtype<=0, _state); if( *wereerrors ) { ae_frame_leave(_state); return; } seterrorflag(wereerrors, ae_fp_greater(x1.ptr.p_double[0],-1.0E6), _state); minnlccreate(n, &x0, &state, _state); state.stabilizingpoint = -1.0E2; state.initialinequalitymultiplier = 1.0E-12; minnlcsetalgoaul(&state, rho, aulits, _state); minnlcsetcond(&state, 0.0, 0.0, 0.0, maxits, _state); if( ckind==0 ) { minnlcsetbc(&state, &bndl, &bndu, _state); } if( ckind==1 ) { minnlcsetlc(&state, &c, &ct, 1, 
_state); } if( ckind==2 ) { minnlcsetnlc(&state, 0, 1, _state); } while(minnlciteration(&state, _state)) { if( state.needfij ) { state.fi.ptr.p_double[0] = state.x.ptr.p_double[0]-ae_sqr(state.x.ptr.p_double[0], _state); state.j.ptr.pp_double[0][0] = 1-2*state.x.ptr.p_double[0]; if( ckind==2 ) { state.fi.ptr.p_double[1] = -state.x.ptr.p_double[0]; state.j.ptr.pp_double[1][0] = (double)(-1); } continue; } ae_assert(ae_false, "Assertion failed", _state); } minnlcresults(&state, &x1, &rep, _state); seterrorflag(wereerrors, rep.terminationtype<=0, _state); if( *wereerrors ) { ae_frame_leave(_state); return; } seterrorflag(wereerrors, ae_fp_less(x1.ptr.p_double[0],3*state.stabilizingpoint), _state); minnlccreate(n, &x0, &state, _state); minnlcsetalgoaul(&state, rho, aulits, _state); minnlcsetcond(&state, 0.0, 0.0, 0.0, maxits, _state); if( ckind==0 ) { minnlcsetbc(&state, &bndl, &bndu, _state); } if( ckind==1 ) { minnlcsetlc(&state, &c, &ct, 1, _state); } if( ckind==2 ) { minnlcsetnlc(&state, 0, 1, _state); } while(minnlciteration(&state, _state)) { if( state.needfij ) { state.fi.ptr.p_double[0] = state.x.ptr.p_double[0]-ae_sqr(state.x.ptr.p_double[0], _state); state.j.ptr.pp_double[0][0] = 1-2*state.x.ptr.p_double[0]; if( ckind==2 ) { state.fi.ptr.p_double[1] = -state.x.ptr.p_double[0]; state.j.ptr.pp_double[1][0] = (double)(-1); } continue; } ae_assert(ae_false, "Assertion failed", _state); } minnlcresults(&state, &x1, &rep, _state); seterrorflag(wereerrors, rep.terminationtype<=0, _state); if( *wereerrors ) { ae_frame_leave(_state); return; } seterrorflag(wereerrors, ae_fp_less(x1.ptr.p_double[0],3*state.stabilizingpoint), _state); } ae_frame_leave(_state); } static void testminbcunit_calciip2(minbcstate* state, ae_int_t n, ae_int_t fk, ae_state *_state); static void testminbcunit_testfeasibility(ae_bool* feaserr, ae_bool* converr, ae_bool* interr, ae_state *_state); static void testminbcunit_testother(ae_bool* err, ae_state *_state); static void testminbcunit_testpreconditioning(ae_bool* err, ae_state *_state); static void testminbcunit_setrandompreconditioner(minbcstate* state, ae_int_t n, ae_int_t preckind, ae_state *_state); static void testminbcunit_testgradientcheck(ae_bool* testg, ae_state *_state); static void testminbcunit_funcderiv(double a, double b, double c, double d, double x0, double x1, double x2, /* Real */ ae_vector* x, ae_int_t functype, double* f, /* Real */ ae_vector* g, ae_state *_state); ae_bool testminbc(ae_bool silent, ae_state *_state) { ae_bool waserrors; ae_bool feasibilityerrors; ae_bool othererrors; ae_bool precerrors; ae_bool interrors; ae_bool converrors; ae_bool graderrors; ae_bool result; waserrors = ae_false; feasibilityerrors = ae_false; othererrors = ae_false; precerrors = ae_false; interrors = ae_false; converrors = ae_false; graderrors = ae_false; testminbcunit_testfeasibility(&feasibilityerrors, &converrors, &interrors, _state); testminbcunit_testother(&othererrors, _state); testminbcunit_testpreconditioning(&precerrors, _state); testminbcunit_testgradientcheck(&graderrors, _state); /* * end */ waserrors = ((((feasibilityerrors||othererrors)||converrors)||interrors)||precerrors)||graderrors; if( !silent ) { printf("TESTING BC OPTIMIZATION\n"); printf("FEASIBILITY PROPERTIES: "); if( feasibilityerrors ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("PRECONDITIONING: "); if( precerrors ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("OTHER PROPERTIES: "); if( othererrors ) { printf("FAILED\n"); } else { printf("OK\n"); } 
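/*
 * The sub-tests whose results are printed by this routine
 * (testminbcunit_testfeasibility, _testother, _testpreconditioning,
 * _testgradientcheck; all defined below) drive MinBC through ALGLIB's
 * reverse-communication protocol. The following condensed sketch is modeled
 * on the tests in this unit - every call in it also appears in the surrounding
 * code, nothing new is introduced. It shows the pattern for an analytic-gradient
 * problem f(x) = sum(((1+i)*x[i])^2) with box constraints BndL<=x<=BndU:
 *
 *     minbccreate(n, &x, &state, _state);                 // start from X
 *     minbcsetbc(&state, &bndl, &bndu, _state);           // box constraints
 *     minbcsetcond(&state, epsg, 0.0, epsx, 0, _state);   // EpsG/EpsF/EpsX/MaxIts
 *     while(minbciteration(&state, _state))
 *     {
 *         if( state.needfg )
 *         {
 *             // solver requests F and grad F at State.X
 *             state.f = 0.0;
 *             for(i=0; i<=n-1; i++)
 *             {
 *                 state.f = state.f+ae_sqr((1+i)*state.x.ptr.p_double[i], _state);
 *                 state.g.ptr.p_double[i] = 2*(1+i)*state.x.ptr.p_double[i];
 *             }
 *             continue;
 *         }
 *         if( state.xupdated )
 *             continue;                                    // progress report (XRep mode)
 *     }
 *     minbcresults(&state, &x, &rep, _state);              // Rep.TerminationType>0 on success
 */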
printf("CONVERGENCE PROPERTIES: "); if( converrors ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("INTERNAL ERRORS: "); if( interrors ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("TEST FOR VERIFICATION OF THE GRADIENT: "); if( graderrors ) { printf("FAILED\n"); } else { printf("OK\n"); } if( waserrors ) { printf("TEST FAILED\n"); } else { printf("TEST PASSED\n"); } printf("\n\n"); } result = !waserrors; return result; } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. *************************************************************************/ ae_bool _pexec_testminbc(ae_bool silent, ae_state *_state) { return testminbc(silent, _state); } /************************************************************************* Calculate test function IIP2 f(x) = sum( ((i*i+1)^FK*x[i])^2, i=0..N-1) It has high condition number which makes fast convergence unlikely without good preconditioner. *************************************************************************/ static void testminbcunit_calciip2(minbcstate* state, ae_int_t n, ae_int_t fk, ae_state *_state) { ae_int_t i; if( state->needfg ) { state->f = (double)(0); } for(i=0; i<=n-1; i++) { if( state->needfg ) { state->f = state->f+ae_pow((double)(i*i+1), (double)(2*fk), _state)*ae_sqr(state->x.ptr.p_double[i], _state); state->g.ptr.p_double[i] = ae_pow((double)(i*i+1), (double)(2*fk), _state)*2*state->x.ptr.p_double[i]; } } } /************************************************************************* This function test feasibility properties. It launches a sequence of problems and examines their solutions. Most of the attention is directed towards feasibility properties, although we make some quick checks to ensure that actual solution is found. On failure sets FeasErr (or ConvErr, depending on failure type) to True, or leaves it unchanged otherwise. IntErr is set to True on internal errors (errors in the control flow). 
*************************************************************************/ static void testminbcunit_testfeasibility(ae_bool* feaserr, ae_bool* converr, ae_bool* interr, ae_state *_state) { ae_frame _frame_block; ae_int_t pkind; ae_int_t preckind; ae_int_t passcount; ae_int_t pass; ae_int_t n; ae_int_t nmax; ae_int_t i; ae_int_t p; double v; ae_vector bl; ae_vector bu; ae_vector x; ae_vector g; ae_vector x0; ae_vector xc; ae_vector xs; ae_vector svdw; ae_matrix csvdu; ae_matrix svdvt; minbcstate state; double weakepsg; minbcreport rep; ae_int_t dkind; double diffstep; ae_frame_make(_state, &_frame_block); ae_vector_init(&bl, 0, DT_REAL, _state); ae_vector_init(&bu, 0, DT_REAL, _state); ae_vector_init(&x, 0, DT_REAL, _state); ae_vector_init(&g, 0, DT_REAL, _state); ae_vector_init(&x0, 0, DT_REAL, _state); ae_vector_init(&xc, 0, DT_REAL, _state); ae_vector_init(&xs, 0, DT_REAL, _state); ae_vector_init(&svdw, 0, DT_REAL, _state); ae_matrix_init(&csvdu, 0, 0, DT_REAL, _state); ae_matrix_init(&svdvt, 0, 0, DT_REAL, _state); _minbcstate_init(&state, _state); _minbcreport_init(&rep, _state); nmax = 5; weakepsg = 1.0E-4; passcount = 10; for(pass=1; pass<=passcount; pass++) { /* * Another simple problem: * * bound constraints 0 <= x[i] <= 1 * * no linear constraints * * preconditioner is chosen at random (we just want to be * sure that preconditioning won't prevent us from converging * to the feasible point): * * unit preconditioner * * random diagonal-based preconditioner * * random scale-based preconditioner * * F(x) = |x-x0|^P, where P={2,4} and x0 is randomly selected from [-1,+2]^N * * with such simple boundaries and function it is easy to find * analytic form of solution: S[i] = bound(x0[i], 0, 1) * * we also check that both final solution and subsequent iterates * are strictly feasible */ diffstep = 1.0E-6; for(dkind=0; dkind<=1; dkind++) { for(preckind=0; preckind<=2; preckind++) { for(pkind=1; pkind<=2; pkind++) { for(n=1; n<=nmax; n++) { /* * Generate X, BL, BU. 
*/ p = 2*pkind; ae_vector_set_length(&bl, n, _state); ae_vector_set_length(&bu, n, _state); ae_vector_set_length(&x, n, _state); ae_vector_set_length(&x0, n, _state); for(i=0; i<=n-1; i++) { bl.ptr.p_double[i] = (double)(0); bu.ptr.p_double[i] = (double)(1); x.ptr.p_double[i] = ae_randomreal(_state); x0.ptr.p_double[i] = 3*ae_randomreal(_state)-1; } /* * Create and optimize */ if( dkind==0 ) { minbccreate(n, &x, &state, _state); } if( dkind==1 ) { minbccreatef(n, &x, diffstep, &state, _state); } minbcsetbc(&state, &bl, &bu, _state); minbcsetcond(&state, weakepsg, 0.0, 0.0, 0, _state); testminbcunit_setrandompreconditioner(&state, n, preckind, _state); while(minbciteration(&state, _state)) { if( state.needf||state.needfg ) { state.f = (double)(0); } for(i=0; i<=n-1; i++) { if( state.needf||state.needfg ) { state.f = state.f+ae_pow(state.x.ptr.p_double[i]-x0.ptr.p_double[i], (double)(p), _state); } if( state.needfg ) { state.g.ptr.p_double[i] = p*ae_pow(state.x.ptr.p_double[i]-x0.ptr.p_double[i], (double)(p-1), _state); } *feaserr = *feaserr||ae_fp_less(state.x.ptr.p_double[i],0.0); *feaserr = *feaserr||ae_fp_greater(state.x.ptr.p_double[i],1.0); } } minbcresults(&state, &x, &rep, _state); if( rep.terminationtype<=0 ) { *converr = ae_true; ae_frame_leave(_state); return; } /* * * compare solution with analytic one * * check feasibility */ v = 0.0; for(i=0; i<=n-1; i++) { if( ae_fp_greater(x.ptr.p_double[i],(double)(0))&&ae_fp_less(x.ptr.p_double[i],(double)(1)) ) { v = v+ae_sqr(p*ae_pow(x.ptr.p_double[i]-x0.ptr.p_double[i], (double)(p-1), _state), _state); } *feaserr = *feaserr||ae_fp_less(x.ptr.p_double[i],0.0); *feaserr = *feaserr||ae_fp_greater(x.ptr.p_double[i],1.0); } *converr = *converr||ae_fp_greater(ae_sqrt(v, _state),weakepsg); } } } } /* * Same as previous problem, but with minor modifications: * * some bound constraints are 0<=x[i]<=1, some are Ci=x[i]=Ci * * no linear constraints * * preconditioner is chosen at random (we just want to be * sure that preconditioning won't prevent us from converging * to the feasible point): * * unit preconditioner * * random diagonal-based preconditioner * * random scale-based preconditioner * * F(x) = |x-x0|^P, where P={2,4} and x0 is randomly selected from [-1,+2]^N * * with such simple boundaries and function it is easy to find * analytic form of solution: S[i] = bound(x0[i], 0, 1) * * we also check that both final solution and subsequent iterates * are strictly feasible */ diffstep = 1.0E-6; for(dkind=0; dkind<=1; dkind++) { for(preckind=0; preckind<=2; preckind++) { for(pkind=1; pkind<=2; pkind++) { for(n=1; n<=nmax; n++) { /* * Generate X, BL, BU. 
*/ p = 2*pkind; ae_vector_set_length(&bl, n, _state); ae_vector_set_length(&bu, n, _state); ae_vector_set_length(&x, n, _state); ae_vector_set_length(&x0, n, _state); for(i=0; i<=n-1; i++) { if( ae_fp_greater(ae_randomreal(_state),0.5) ) { bl.ptr.p_double[i] = (double)(0); bu.ptr.p_double[i] = (double)(1); } else { bl.ptr.p_double[i] = ae_randomreal(_state); bu.ptr.p_double[i] = bl.ptr.p_double[i]; } x.ptr.p_double[i] = ae_randomreal(_state); x0.ptr.p_double[i] = 3*ae_randomreal(_state)-1; } /* * Create and optimize */ if( dkind==0 ) { minbccreate(n, &x, &state, _state); } if( dkind==1 ) { minbccreatef(n, &x, diffstep, &state, _state); } minbcsetbc(&state, &bl, &bu, _state); minbcsetcond(&state, weakepsg, 0.0, 0.0, 0, _state); testminbcunit_setrandompreconditioner(&state, n, preckind, _state); while(minbciteration(&state, _state)) { if( state.needf||state.needfg ) { state.f = (double)(0); } for(i=0; i<=n-1; i++) { if( state.needf||state.needfg ) { state.f = state.f+ae_pow(state.x.ptr.p_double[i]-x0.ptr.p_double[i], (double)(p), _state); } if( state.needfg ) { state.g.ptr.p_double[i] = p*ae_pow(state.x.ptr.p_double[i]-x0.ptr.p_double[i], (double)(p-1), _state); } *feaserr = *feaserr||ae_fp_less(state.x.ptr.p_double[i],bl.ptr.p_double[i]); *feaserr = *feaserr||ae_fp_greater(state.x.ptr.p_double[i],bu.ptr.p_double[i]); } } minbcresults(&state, &x, &rep, _state); if( rep.terminationtype<=0 ) { *converr = ae_true; ae_frame_leave(_state); return; } /* * * compare solution with analytic one * * check feasibility */ v = 0.0; for(i=0; i<=n-1; i++) { if( ae_fp_greater(x.ptr.p_double[i],bl.ptr.p_double[i])&&ae_fp_less(x.ptr.p_double[i],bu.ptr.p_double[i]) ) { v = v+ae_sqr(p*ae_pow(x.ptr.p_double[i]-x0.ptr.p_double[i], (double)(p-1), _state), _state); } *feaserr = *feaserr||ae_fp_less(x.ptr.p_double[i],bl.ptr.p_double[i]); *feaserr = *feaserr||ae_fp_greater(x.ptr.p_double[i],bu.ptr.p_double[i]); } *converr = *converr||ae_fp_greater(ae_sqrt(v, _state),weakepsg); } } } } /* * Infeasible problem: * * all bound constraints are 0 <= x[i] <= 1 except for one * * that one is 0 >= x[i] >= 1 * * no linear constraints * * preconditioner is chosen at random (we just want to be * sure that preconditioning won't prevent us from detecting * infeasible point): * * unit preconditioner * * random diagonal-based preconditioner * * random scale-based preconditioner * * F(x) = |x-x0|^P, where P={2,4} and x0 is randomly selected from [-1,+2]^N * * algorithm must return correct error code on such problem */ for(preckind=0; preckind<=2; preckind++) { for(pkind=1; pkind<=2; pkind++) { for(n=1; n<=nmax; n++) { /* * Generate X, BL, BU. 
*/ p = 2*pkind; ae_vector_set_length(&bl, n, _state); ae_vector_set_length(&bu, n, _state); ae_vector_set_length(&x, n, _state); ae_vector_set_length(&x0, n, _state); for(i=0; i<=n-1; i++) { bl.ptr.p_double[i] = (double)(0); bu.ptr.p_double[i] = (double)(1); x.ptr.p_double[i] = ae_randomreal(_state); x0.ptr.p_double[i] = 3*ae_randomreal(_state)-1; } i = ae_randominteger(n, _state); bl.ptr.p_double[i] = (double)(1); bu.ptr.p_double[i] = (double)(0); /* * Create and optimize */ minbccreate(n, &x, &state, _state); minbcsetbc(&state, &bl, &bu, _state); minbcsetcond(&state, weakepsg, 0.0, 0.0, 0, _state); testminbcunit_setrandompreconditioner(&state, n, preckind, _state); while(minbciteration(&state, _state)) { if( state.needfg ) { state.f = (double)(0); for(i=0; i<=n-1; i++) { state.f = state.f+ae_pow(state.x.ptr.p_double[i]-x0.ptr.p_double[i], (double)(p), _state); state.g.ptr.p_double[i] = p*ae_pow(state.x.ptr.p_double[i]-x0.ptr.p_double[i], (double)(p-1), _state); } continue; } /* * Unknown protocol specified */ *interr = ae_true; ae_frame_leave(_state); return; } minbcresults(&state, &x, &rep, _state); *feaserr = *feaserr||rep.terminationtype!=-3; } } } } ae_frame_leave(_state); } /************************************************************************* This function additional properties. On failure sets Err to True (leaves it unchanged otherwise) *************************************************************************/ static void testminbcunit_testother(ae_bool* err, ae_state *_state) { ae_frame _frame_block; ae_int_t passcount; ae_int_t pass; ae_int_t n; ae_int_t i; ae_int_t j; ae_int_t k; ae_vector bl; ae_vector bu; ae_vector x; ae_vector xf; ae_vector x0; ae_vector x1; ae_vector b; ae_vector xlast; ae_vector a; ae_vector s; ae_vector h; ae_matrix fulla; double fprev; double xprev; double stpmax; double v; ae_int_t pkind; ae_int_t ckind; ae_int_t mkind; double vc; double vm; minbcstate state; double epsx; double epsg; double eps; double tmpeps; minbcreport rep; double diffstep; ae_int_t dkind; ae_bool wasf; ae_bool wasfg; double r; hqrndstate rs; ae_int_t spoiliteration; ae_int_t stopiteration; ae_int_t spoilvar; double spoilval; double ss; ae_int_t stopcallidx; ae_int_t callidx; ae_int_t maxits; ae_bool terminationrequested; ae_frame_make(_state, &_frame_block); ae_vector_init(&bl, 0, DT_REAL, _state); ae_vector_init(&bu, 0, DT_REAL, _state); ae_vector_init(&x, 0, DT_REAL, _state); ae_vector_init(&xf, 0, DT_REAL, _state); ae_vector_init(&x0, 0, DT_REAL, _state); ae_vector_init(&x1, 0, DT_REAL, _state); ae_vector_init(&b, 0, DT_REAL, _state); ae_vector_init(&xlast, 0, DT_REAL, _state); ae_vector_init(&a, 0, DT_REAL, _state); ae_vector_init(&s, 0, DT_REAL, _state); ae_vector_init(&h, 0, DT_REAL, _state); ae_matrix_init(&fulla, 0, 0, DT_REAL, _state); _minbcstate_init(&state, _state); _minbcreport_init(&rep, _state); _hqrndstate_init(&rs, _state); hqrndrandomize(&rs, _state); epsx = 1.0E-4; epsg = 1.0E-8; passcount = 10; /* * Try to reproduce bug 570 (optimizer hangs on problems where it is required * to perform very small step - less than 1E-50 - in order to activate constraints). 
* * The problem being solved is: * * min x[0]+x[1]+...+x[n-1] * * subject to * * x[i]>=0, for i=0..n-1 * * with initial point * * x[0] = 1.0E-100, x[1]=x[2]=...=0.5 * * We try to reproduce this problem in different settings: * * boundary-only constraints - we test that completion code is positive, * and all x[] are EXACTLY zero * * boundary constraints posed as general linear ones - we test that * completion code is positive, and all x[] are APPROXIMATELY zero. */ n = 10; ae_vector_set_length(&x, n, _state); ae_vector_set_length(&bl, n, _state); ae_vector_set_length(&bu, n, _state); for(i=0; i<=n-1; i++) { x.ptr.p_double[i] = 0.5; bl.ptr.p_double[i] = 0.0; bu.ptr.p_double[i] = _state->v_posinf; } x.ptr.p_double[0] = 1.0E-100; minbccreate(n, &x, &state, _state); minbcsetbc(&state, &bl, &bu, _state); minbcsetcond(&state, (double)(0), (double)(0), (double)(0), 2*n, _state); while(minbciteration(&state, _state)) { if( state.needfg ) { state.f = (double)(0); for(i=0; i<=n-1; i++) { state.f = state.f+state.x.ptr.p_double[i]; state.g.ptr.p_double[i] = 1.0; } } } minbcresults(&state, &xf, &rep, _state); seterrorflag(err, rep.terminationtype<=0, _state); if( rep.terminationtype>0 ) { for(i=0; i<=n-1; i++) { seterrorflag(err, ae_fp_neq(xf.ptr.p_double[i],(double)(0)), _state); } } /* * Test reports: * * first value must be starting point * * last value must be last point */ for(pass=1; pass<=passcount; pass++) { n = 50; ae_vector_set_length(&x, n, _state); ae_vector_set_length(&xlast, n, _state); ae_vector_set_length(&bl, n, _state); ae_vector_set_length(&bu, n, _state); for(i=0; i<=n-1; i++) { x.ptr.p_double[i] = (double)(10); bl.ptr.p_double[i] = 2*ae_randomreal(_state)-1; bu.ptr.p_double[i] = _state->v_posinf; } minbccreate(n, &x, &state, _state); minbcsetbc(&state, &bl, &bu, _state); minbcsetcond(&state, 1.0E-64, (double)(0), (double)(0), 10, _state); minbcsetxrep(&state, ae_true, _state); fprev = ae_maxrealnumber; while(minbciteration(&state, _state)) { if( state.needfg ) { state.f = (double)(0); for(i=0; i<=n-1; i++) { state.f = state.f+ae_sqr((1+i)*state.x.ptr.p_double[i], _state); state.g.ptr.p_double[i] = 2*(1+i)*state.x.ptr.p_double[i]; } } if( state.xupdated ) { if( ae_fp_eq(fprev,ae_maxrealnumber) ) { for(i=0; i<=n-1; i++) { seterrorflag(err, ae_fp_neq(state.x.ptr.p_double[i],x.ptr.p_double[i]), _state); } } fprev = state.f; ae_v_move(&xlast.ptr.p_double[0], 1, &state.x.ptr.p_double[0], 1, ae_v_len(0,n-1)); } } minbcresults(&state, &x, &rep, _state); for(i=0; i<=n-1; i++) { seterrorflag(err, ae_fp_neq(x.ptr.p_double[i],xlast.ptr.p_double[i]), _state); } } /* * Test differentiation vs. 
analytic gradient * (first one issues NeedF requests, second one issues NeedFG requests) */ for(pass=1; pass<=passcount; pass++) { n = 10; diffstep = 1.0E-6; for(dkind=0; dkind<=1; dkind++) { ae_vector_set_length(&x, n, _state); ae_vector_set_length(&xlast, n, _state); for(i=0; i<=n-1; i++) { x.ptr.p_double[i] = (double)(1); } if( dkind==0 ) { minbccreate(n, &x, &state, _state); } if( dkind==1 ) { minbccreatef(n, &x, diffstep, &state, _state); } minbcsetcond(&state, 1.0E-6, (double)(0), epsx, 0, _state); wasf = ae_false; wasfg = ae_false; while(minbciteration(&state, _state)) { if( state.needf||state.needfg ) { state.f = (double)(0); } for(i=0; i<=n-1; i++) { if( state.needf||state.needfg ) { state.f = state.f+ae_sqr((1+i)*state.x.ptr.p_double[i], _state); } if( state.needfg ) { state.g.ptr.p_double[i] = 2*(1+i)*state.x.ptr.p_double[i]; } } wasf = wasf||state.needf; wasfg = wasfg||state.needfg; } minbcresults(&state, &x, &rep, _state); if( dkind==0 ) { seterrorflag(err, wasf||!wasfg, _state); } if( dkind==1 ) { seterrorflag(err, !wasf||wasfg, _state); } } } /* * Test that numerical differentiation uses scaling. * * In order to test that we solve simple optimization * problem: min(x^2) with initial x equal to 0.0. * * We choose random DiffStep and S, then we check that * optimizer evaluates function at +-DiffStep*S only. */ for(pass=1; pass<=passcount; pass++) { ae_vector_set_length(&x, 1, _state); ae_vector_set_length(&s, 1, _state); diffstep = ae_randomreal(_state)*1.0E-6; s.ptr.p_double[0] = ae_exp(ae_randomreal(_state)*4-2, _state); x.ptr.p_double[0] = (double)(0); minbccreatef(1, &x, diffstep, &state, _state); minbcsetcond(&state, 1.0E-6, (double)(0), epsx, 0, _state); minbcsetscale(&state, &s, _state); v = (double)(0); while(minbciteration(&state, _state)) { state.f = ae_sqr(state.x.ptr.p_double[0], _state); v = ae_maxreal(v, ae_fabs(state.x.ptr.p_double[0], _state), _state); } minbcresults(&state, &x, &rep, _state); r = v/(s.ptr.p_double[0]*diffstep); seterrorflag(err, ae_fp_greater(ae_fabs(ae_log(r, _state), _state),ae_log(1+1000*ae_machineepsilon, _state)), _state); } /* * Test stpmax */ for(pass=1; pass<=passcount; pass++) { n = 1; ae_vector_set_length(&x, n, _state); ae_vector_set_length(&bl, n, _state); ae_vector_set_length(&bu, n, _state); x.ptr.p_double[0] = (double)(100); bl.ptr.p_double[0] = 2*ae_randomreal(_state)-1; bu.ptr.p_double[0] = _state->v_posinf; stpmax = 0.05+0.05*ae_randomreal(_state); minbccreate(n, &x, &state, _state); minbcsetbc(&state, &bl, &bu, _state); minbcsetcond(&state, epsg, (double)(0), epsx, 0, _state); minbcsetxrep(&state, ae_true, _state); minbcsetstpmax(&state, stpmax, _state); xprev = x.ptr.p_double[0]; while(minbciteration(&state, _state)) { if( state.needfg ) { state.f = ae_exp(state.x.ptr.p_double[0], _state)+ae_exp(-state.x.ptr.p_double[0], _state); state.g.ptr.p_double[0] = ae_exp(state.x.ptr.p_double[0], _state)-ae_exp(-state.x.ptr.p_double[0], _state); seterrorflag(err, ae_fp_greater(ae_fabs(state.x.ptr.p_double[0]-xprev, _state),(1+ae_sqrt(ae_machineepsilon, _state))*stpmax), _state); } if( state.xupdated ) { seterrorflag(err, ae_fp_greater(ae_fabs(state.x.ptr.p_double[0]-xprev, _state),(1+ae_sqrt(ae_machineepsilon, _state))*stpmax), _state); xprev = state.x.ptr.p_double[0]; } } } /* * Ability to solve problems with function which is unbounded from below */ for(pass=1; pass<=passcount; pass++) { n = 1; ae_vector_set_length(&x, n, _state); ae_vector_set_length(&bl, n, _state); ae_vector_set_length(&bu, n, _state); bl.ptr.p_double[0] = 
4*ae_randomreal(_state)+1; bu.ptr.p_double[0] = bl.ptr.p_double[0]+1; x.ptr.p_double[0] = 0.5*(bl.ptr.p_double[0]+bu.ptr.p_double[0]); minbccreate(n, &x, &state, _state); minbcsetbc(&state, &bl, &bu, _state); minbcsetcond(&state, epsg, (double)(0), epsx, 0, _state); while(minbciteration(&state, _state)) { if( state.needfg ) { state.f = -1.0E8*ae_sqr(state.x.ptr.p_double[0], _state); state.g.ptr.p_double[0] = -2.0E8*state.x.ptr.p_double[0]; } } minbcresults(&state, &x, &rep, _state); seterrorflag(err, ae_fp_greater(ae_fabs(x.ptr.p_double[0]-bu.ptr.p_double[0], _state),epsx), _state); } /* * Test correctness of the scaling: * * initial point is random point from [+1,+2]^N * * f(x) = SUM(A[i]*x[i]^4), C[i] is random from [0.01,100] * * function is EFFECTIVELY unconstrained; it has formal constraints, * but they are inactive at the solution; we try different variants * in order to explore different control paths of the optimizer: * 0) absense of constraints * 1) bound constraints -100000<=x[i]<=100000 * * we use random scaling matrix * * we test different variants of the preconditioning: * 0) unit preconditioner * 1) random diagonal from [0.01,100] * 2) scale preconditioner * * we set very stringent stopping conditions * * and we test that in the extremum stopping conditions are * satisfied subject to the current scaling coefficients. */ for(pass=1; pass<=passcount; pass++) { tmpeps = 1.0E-5; for(n=1; n<=10; n++) { for(ckind=0; ckind<=1; ckind++) { for(pkind=0; pkind<=2; pkind++) { ae_vector_set_length(&x, n, _state); ae_vector_set_length(&a, n, _state); ae_vector_set_length(&s, n, _state); ae_vector_set_length(&h, n, _state); ae_vector_set_length(&bl, n, _state); ae_vector_set_length(&bu, n, _state); for(i=0; i<=n-1; i++) { x.ptr.p_double[i] = ae_randomreal(_state)+1; bl.ptr.p_double[i] = (double)(-100000); bu.ptr.p_double[i] = (double)(100000); a.ptr.p_double[i] = ae_exp(ae_log((double)(10), _state)*(2*ae_randomreal(_state)-1), _state); s.ptr.p_double[i] = ae_exp(ae_log((double)(10), _state)*(2*ae_randomreal(_state)-1), _state); h.ptr.p_double[i] = ae_exp(ae_log((double)(10), _state)*(2*ae_randomreal(_state)-1), _state); } minbccreate(n, &x, &state, _state); if( ckind==1 ) { minbcsetbc(&state, &bl, &bu, _state); } if( pkind==1 ) { minbcsetprecdiag(&state, &h, _state); } if( pkind==2 ) { minbcsetprecscale(&state, _state); } minbcsetcond(&state, tmpeps, (double)(0), (double)(0), 0, _state); minbcsetscale(&state, &s, _state); while(minbciteration(&state, _state)) { if( state.needfg ) { state.f = (double)(0); for(i=0; i<=n-1; i++) { state.f = state.f+a.ptr.p_double[i]*ae_sqr(state.x.ptr.p_double[i], _state); state.g.ptr.p_double[i] = 2*a.ptr.p_double[i]*state.x.ptr.p_double[i]; } } } minbcresults(&state, &x, &rep, _state); if( rep.terminationtype<=0 ) { seterrorflag(err, ae_true, _state); ae_frame_leave(_state); return; } v = (double)(0); for(i=0; i<=n-1; i++) { v = v+ae_sqr(s.ptr.p_double[i]*2*a.ptr.p_double[i]*x.ptr.p_double[i], _state); } v = ae_sqrt(v, _state); seterrorflag(err, ae_fp_greater(v,tmpeps), _state); } } } } /* * Check correctness of the "trimming". * * Trimming is a technique which is used to help algorithm * cope with unbounded functions. 
In order to check this * technique we will try to solve following optimization * problem: * * min f(x) subject to no constraints on X * { 1/(1-x) + 1/(1+x) + c*x, if -0.999999=0.999999 * * where c is either 1.0 or 1.0E+4, M is either 1.0E8, 1.0E20 or +INF * (we try different combinations) */ for(pass=1; pass<=passcount; pass++) { for(ckind=0; ckind<=1; ckind++) { for(mkind=0; mkind<=2; mkind++) { /* * Choose c and M */ vc = (double)(1); vm = (double)(1); if( ckind==0 ) { vc = 1.0; } if( ckind==1 ) { vc = 1.0E+4; } if( mkind==0 ) { vm = 1.0E+8; } if( mkind==1 ) { vm = 1.0E+20; } if( mkind==2 ) { vm = _state->v_posinf; } /* * Create optimizer, solve optimization problem */ epsg = 1.0E-6*vc; ae_vector_set_length(&x, 1, _state); x.ptr.p_double[0] = 0.0; minbccreate(1, &x, &state, _state); minbcsetcond(&state, epsg, (double)(0), (double)(0), 0, _state); while(minbciteration(&state, _state)) { if( state.needfg ) { if( ae_fp_less(-0.999999,state.x.ptr.p_double[0])&&ae_fp_less(state.x.ptr.p_double[0],0.999999) ) { state.f = 1/(1-state.x.ptr.p_double[0])+1/(1+state.x.ptr.p_double[0])+vc*state.x.ptr.p_double[0]; state.g.ptr.p_double[0] = 1/ae_sqr(1-state.x.ptr.p_double[0], _state)-1/ae_sqr(1+state.x.ptr.p_double[0], _state)+vc; } else { state.f = vm; state.g.ptr.p_double[0] = (double)(0); } } } minbcresults(&state, &x, &rep, _state); if( rep.terminationtype<=0 ) { seterrorflag(err, ae_true, _state); ae_frame_leave(_state); return; } seterrorflag(err, ae_fp_greater(ae_fabs(1/ae_sqr(1-x.ptr.p_double[0], _state)-1/ae_sqr(1+x.ptr.p_double[0], _state)+vc, _state),epsg), _state); } } } /* * Test behaviour on noisy functions. * * Consider following problem: * * f(x,y) = (x+1)^2 + (y+1)^2 + 10000*MachineEpsilon*RandomReal() * * boundary constraints x>=0, y>=0 * * starting point (x0,y0)=(10*MachineEpsilon,1.0) * * Such problem contains small numerical noise. Without noise its * solution is (xs,ys)=(0,0), which is easy to find. However, presence * of the noise makes it hard to solve: * * noisy f(x,y) is monotonically decreasing only when we perform * steps orders of magnitude larger than 10000*MachineEpsilon * * at small scales f(x,y) is non-monotonic and non-convex * * however, our first step must be done towards * (x1,y1) = (0,1-some_small_value), and length of such step is * many times SMALLER than 10000*MachineEpsilon * * second step, from (x1,y1) to (xs,ys), will be large enough to * ignore numerical noise, so the only problem is to perform * first step * * Naive implementation of BC should fail sometimes (sometimes - * due to non-deterministic nature of noise) on such problem. However, * our improved implementation should solve it correctly. We test * several variations of inner stopping criteria. 
*/ for(pass=1; pass<=passcount; pass++) { eps = 1.0E-9; ae_vector_set_length(&x, 2, _state); ae_vector_set_length(&bl, 2, _state); ae_vector_set_length(&bu, 2, _state); x.ptr.p_double[0] = 10*ae_machineepsilon; x.ptr.p_double[1] = 1.0; bl.ptr.p_double[0] = 0.0; bu.ptr.p_double[0] = _state->v_posinf; bl.ptr.p_double[1] = 0.0; bu.ptr.p_double[1] = _state->v_posinf; for(ckind=0; ckind<=2; ckind++) { minbccreate(2, &x, &state, _state); minbcsetbc(&state, &bl, &bu, _state); if( ckind==0 ) { minbcsetcond(&state, eps, (double)(0), (double)(0), 0, _state); } if( ckind==1 ) { minbcsetcond(&state, (double)(0), eps, (double)(0), 0, _state); } if( ckind==2 ) { minbcsetcond(&state, (double)(0), (double)(0), eps, 0, _state); } while(minbciteration(&state, _state)) { if( state.needfg ) { state.f = ae_sqr(state.x.ptr.p_double[0]+1, _state)+ae_sqr(state.x.ptr.p_double[1]+1, _state)+10000*ae_machineepsilon*ae_randomreal(_state); state.g.ptr.p_double[0] = 2*(state.x.ptr.p_double[0]+1); state.g.ptr.p_double[1] = 2*(state.x.ptr.p_double[1]+1); } } minbcresults(&state, &xf, &rep, _state); if( (rep.terminationtype<=0||ae_fp_neq(xf.ptr.p_double[0],(double)(0)))||ae_fp_neq(xf.ptr.p_double[1],(double)(0)) ) { seterrorflag(err, ae_true, _state); ae_frame_leave(_state); return; } } } /* * Deterministic variation of the previous problem. * * Consider following problem: * * boundary constraints x>=0, y>=0 * * starting point (x0,y0)=(10*MachineEpsilon,1.0) * / (x+1)^2 + (y+1)^2, for (x,y)<>(x0,y0) * * f(x,y) = | * \ (x+1)^2 + (y+1)^2 - 0.1, for (x,y)=(x0,y0) * * Such problem contains deterministic numerical noise (-0.1 at * starting point). Without noise its solution is easy to find. * However, presence of the noise makes it hard to solve: * * our first step must be done towards (x1,y1) = (0,1-some_small_value), * but such step will increase function valye by approximately 0.1 - * instead of decreasing it. * * Naive implementation of BC should fail on such problem. However, * our improved implementation should solve it correctly. We test * several variations of inner stopping criteria. 
*/ for(pass=1; pass<=passcount; pass++) { eps = 1.0E-9; ae_vector_set_length(&x, 2, _state); ae_vector_set_length(&bl, 2, _state); ae_vector_set_length(&bu, 2, _state); x.ptr.p_double[0] = 10*ae_machineepsilon; x.ptr.p_double[1] = 1.0; bl.ptr.p_double[0] = 0.0; bu.ptr.p_double[0] = _state->v_posinf; bl.ptr.p_double[1] = 0.0; bu.ptr.p_double[1] = _state->v_posinf; for(ckind=0; ckind<=2; ckind++) { minbccreate(2, &x, &state, _state); minbcsetbc(&state, &bl, &bu, _state); if( ckind==0 ) { minbcsetcond(&state, eps, (double)(0), (double)(0), 0, _state); } if( ckind==1 ) { minbcsetcond(&state, (double)(0), eps, (double)(0), 0, _state); } if( ckind==2 ) { minbcsetcond(&state, (double)(0), (double)(0), eps, 0, _state); } while(minbciteration(&state, _state)) { if( state.needfg ) { state.f = ae_sqr(state.x.ptr.p_double[0]+1, _state)+ae_sqr(state.x.ptr.p_double[1]+1, _state); if( ae_fp_eq(state.x.ptr.p_double[0],x.ptr.p_double[0])&&ae_fp_eq(state.x.ptr.p_double[1],x.ptr.p_double[1]) ) { state.f = state.f-0.1; } state.g.ptr.p_double[0] = 2*(state.x.ptr.p_double[0]+1); state.g.ptr.p_double[1] = 2*(state.x.ptr.p_double[1]+1); } } minbcresults(&state, &xf, &rep, _state); if( (rep.terminationtype<=0||ae_fp_neq(xf.ptr.p_double[0],(double)(0)))||ae_fp_neq(xf.ptr.p_double[1],(double)(0)) ) { seterrorflag(err, ae_true, _state); ae_frame_leave(_state); return; } } } /* * Test integrity checks for NAN/INF: * * algorithm solves optimization problem, which is normal for some time (quadratic) * * after 5-th step we choose random component of gradient and consistently spoil * it by NAN or INF. * * we check that correct termination code is returned (-8) */ n = 100; for(pass=1; pass<=10; pass++) { spoiliteration = 5; stopiteration = 8; if( ae_fp_greater(hqrndnormal(&rs, _state),(double)(0)) ) { /* * Gradient can be spoiled by +INF, -INF, NAN */ spoilvar = hqrnduniformi(&rs, n, _state); i = hqrnduniformi(&rs, 3, _state); spoilval = _state->v_nan; if( i==0 ) { spoilval = _state->v_neginf; } if( i==1 ) { spoilval = _state->v_posinf; } } else { /* * Function value can be spoiled only by NAN * (+INF can be recognized as legitimate value during optimization) */ spoilvar = -1; spoilval = _state->v_nan; } spdmatrixrndcond(n, 1.0E5, &fulla, _state); ae_vector_set_length(&b, n, _state); ae_vector_set_length(&x0, n, _state); for(i=0; i<=n-1; i++) { b.ptr.p_double[i] = hqrndnormal(&rs, _state); x0.ptr.p_double[i] = hqrndnormal(&rs, _state); } minbccreate(n, &x0, &state, _state); minbcsetcond(&state, 0.0, 0.0, 0.0, stopiteration, _state); minbcsetxrep(&state, ae_true, _state); k = -1; while(minbciteration(&state, _state)) { if( state.needfg ) { state.f = (double)(0); for(i=0; i<=n-1; i++) { state.f = state.f+b.ptr.p_double[i]*state.x.ptr.p_double[i]; state.g.ptr.p_double[i] = b.ptr.p_double[i]; for(j=0; j<=n-1; j++) { state.f = state.f+0.5*state.x.ptr.p_double[i]*fulla.ptr.pp_double[i][j]*state.x.ptr.p_double[j]; state.g.ptr.p_double[i] = state.g.ptr.p_double[i]+fulla.ptr.pp_double[i][j]*state.x.ptr.p_double[j]; } } if( k>=spoiliteration ) { if( spoilvar<0 ) { state.f = spoilval; } else { state.g.ptr.p_double[spoilvar] = spoilval; } } continue; } if( state.xupdated ) { inc(&k, _state); continue; } ae_assert(ae_false, "Assertion failed", _state); } minbcresults(&state, &x1, &rep, _state); seterrorflag(err, rep.terminationtype!=-8, _state); } /* * Check algorithm ability to handle request for termination: * * to terminate with correct return code = 8 * * to return point which was "current" at the moment of termination * * NOTE: we 
solve problem with "corrupted" preconditioner which makes it hard * to converge in less than StopCallIdx iterations */ for(pass=1; pass<=50; pass++) { n = 3; ss = (double)(100); ae_vector_set_length(&x, n, _state); ae_vector_set_length(&xlast, n, _state); for(i=0; i<=n-1; i++) { x.ptr.p_double[i] = 6+ae_randomreal(_state); } ae_vector_set_length(&s, 3, _state); s.ptr.p_double[0] = 0.00001; s.ptr.p_double[1] = 0.00001; s.ptr.p_double[2] = 10000.0; stopcallidx = ae_randominteger(20, _state); maxits = 25; minbccreate(n, &x, &state, _state); minbcsetcond(&state, (double)(0), (double)(0), (double)(0), maxits, _state); minbcsetxrep(&state, ae_true, _state); minbcsetprecdiag(&state, &s, _state); callidx = 0; terminationrequested = ae_false; ae_v_move(&xlast.ptr.p_double[0], 1, &x.ptr.p_double[0], 1, ae_v_len(0,n-1)); while(minbciteration(&state, _state)) { if( state.needfg ) { state.f = ss*ae_sqr(ae_exp(state.x.ptr.p_double[0], _state)-2, _state)+ae_sqr(state.x.ptr.p_double[1], _state)+ae_sqr(state.x.ptr.p_double[2]-state.x.ptr.p_double[0], _state); state.g.ptr.p_double[0] = 2*ss*(ae_exp(state.x.ptr.p_double[0], _state)-2)*ae_exp(state.x.ptr.p_double[0], _state)+2*(state.x.ptr.p_double[2]-state.x.ptr.p_double[0])*(-1); state.g.ptr.p_double[1] = 2*state.x.ptr.p_double[1]; state.g.ptr.p_double[2] = 2*(state.x.ptr.p_double[2]-state.x.ptr.p_double[0]); if( callidx==stopcallidx ) { minbcrequesttermination(&state, _state); terminationrequested = ae_true; } inc(&callidx, _state); continue; } if( state.xupdated ) { if( !terminationrequested ) { ae_v_move(&xlast.ptr.p_double[0], 1, &state.x.ptr.p_double[0], 1, ae_v_len(0,n-1)); } continue; } ae_assert(ae_false, "Assertion failed", _state); } minbcresults(&state, &x, &rep, _state); seterrorflag(err, rep.terminationtype!=8, _state); for(i=0; i<=n-1; i++) { seterrorflag(err, ae_fp_neq(x.ptr.p_double[i],xlast.ptr.p_double[i]), _state); } } ae_frame_leave(_state); } /************************************************************************* This function tests preconditioning On failure sets Err to True (leaves it unchanged otherwise) *************************************************************************/ static void testminbcunit_testpreconditioning(ae_bool* err, ae_state *_state) { ae_frame _frame_block; ae_int_t pass; ae_int_t n; ae_vector x; ae_vector x0; ae_int_t i; ae_int_t k; ae_matrix v; ae_vector bl; ae_vector bu; ae_vector vd; ae_vector d; ae_vector units; ae_vector s; ae_int_t cntb1; ae_int_t cntb2; ae_int_t cntg1; ae_int_t cntg2; double epsg; ae_vector diagh; minbcstate state; minbcreport rep; ae_int_t ckind; ae_int_t fk; ae_frame_make(_state, &_frame_block); ae_vector_init(&x, 0, DT_REAL, _state); ae_vector_init(&x0, 0, DT_REAL, _state); ae_matrix_init(&v, 0, 0, DT_REAL, _state); ae_vector_init(&bl, 0, DT_REAL, _state); ae_vector_init(&bu, 0, DT_REAL, _state); ae_vector_init(&vd, 0, DT_REAL, _state); ae_vector_init(&d, 0, DT_REAL, _state); ae_vector_init(&units, 0, DT_REAL, _state); ae_vector_init(&s, 0, DT_REAL, _state); ae_vector_init(&diagh, 0, DT_REAL, _state); _minbcstate_init(&state, _state); _minbcreport_init(&rep, _state); /* * Preconditioner test 1. * * If * * B1 is default preconditioner with unit scale * * G1 is diagonal preconditioner based on approximate diagonal of Hessian matrix * * B2 is default preconditioner with non-unit scale S[i]=1/sqrt(h[i]) * * G2 is scale-based preconditioner with non-unit scale S[i]=1/sqrt(h[i]) * then B1 is worse than G1, B2 is worse than G2. * "Worse" means more iterations to converge. 
* * Test problem setup: * * f(x) = sum( ((i*i+1)*x[i])^2, i=0..N-1) * * constraints: * 0) absent * 1) box * * N - problem size * K - number of repeated passes (should be large enough to average out random factors) */ k = 100; epsg = 1.0E-8; for(n=10; n<=10; n++) { for(ckind=0; ckind<=1; ckind++) { fk = 1; ae_vector_set_length(&x, n, _state); ae_vector_set_length(&units, n, _state); for(i=0; i<=n-1; i++) { x.ptr.p_double[i] = (double)(0); units.ptr.p_double[i] = (double)(1); } minbccreate(n, &x, &state, _state); minbcsetcond(&state, epsg, 0.0, 0.0, 0, _state); if( ckind==1 ) { ae_vector_set_length(&bl, n, _state); ae_vector_set_length(&bu, n, _state); for(i=0; i<=n-1; i++) { bl.ptr.p_double[i] = (double)(-1); bu.ptr.p_double[i] = (double)(1); } minbcsetbc(&state, &bl, &bu, _state); } /* * Test it with default preconditioner VS. perturbed diagonal preconditioner */ minbcsetprecdefault(&state, _state); minbcsetscale(&state, &units, _state); cntb1 = 0; for(pass=0; pass<=k-1; pass++) { for(i=0; i<=n-1; i++) { x.ptr.p_double[i] = 2*ae_randomreal(_state)-1; } minbcrestartfrom(&state, &x, _state); while(minbciteration(&state, _state)) { testminbcunit_calciip2(&state, n, fk, _state); } minbcresults(&state, &x, &rep, _state); cntb1 = cntb1+rep.iterationscount; *err = *err||rep.terminationtype<=0; } ae_vector_set_length(&diagh, n, _state); for(i=0; i<=n-1; i++) { diagh.ptr.p_double[i] = 2*ae_pow((double)(i*i+1), (double)(2*fk), _state)*(0.8+0.4*ae_randomreal(_state)); } minbcsetprecdiag(&state, &diagh, _state); minbcsetscale(&state, &units, _state); cntg1 = 0; for(pass=0; pass<=k-1; pass++) { for(i=0; i<=n-1; i++) { x.ptr.p_double[i] = 2*ae_randomreal(_state)-1; } minbcrestartfrom(&state, &x, _state); while(minbciteration(&state, _state)) { testminbcunit_calciip2(&state, n, fk, _state); } minbcresults(&state, &x, &rep, _state); cntg1 = cntg1+rep.iterationscount; *err = *err||rep.terminationtype<=0; } *err = *err||cntb1v_neginf; } infcomp = ae_randominteger(n+1, _state); if( infcompv_posinf; } minbccreate(n, &x, &state, _state); minbcsetgradientcheck(&state, teststep, _state); minbcsetbc(&state, &bl, &bu, _state); /* * Check that the criterion passes a derivative if it is correct */ while(minbciteration(&state, _state)) { if( state.needfg ) { /* * Check that .X within the boundaries */ for(i=0; i<=n-1; i++) { if( (ae_isfinite(bl.ptr.p_double[i], _state)&&ae_fp_less(state.x.ptr.p_double[i],bl.ptr.p_double[i]))||(ae_isfinite(bu.ptr.p_double[i], _state)&&ae_fp_greater(state.x.ptr.p_double[i],bu.ptr.p_double[i])) ) { *testg = ae_true; ae_frame_leave(_state); return; } } testminbcunit_funcderiv(a, b, c, d, x0, x1, x2, &state.x, func, &state.f, &state.g, _state); } } minbcresults(&state, &x, &rep, _state); /* * Check that error code does not equal to -7 and parameter .VarIdx * equal to -1. 
*/ if( rep.terminationtype==-7||rep.varidx!=-1 ) { *testg = ae_true; ae_frame_leave(_state); return; } for(i=0; i<=n-1; i++) { x.ptr.p_double[i] = 5*randomnormal(_state); } minbcrestartfrom(&state, &x, _state); /* * Check that the criterion does not miss a derivative if * it is incorrect */ while(minbciteration(&state, _state)) { if( state.needfg ) { for(i=0; i<=n-1; i++) { if( (ae_isfinite(bl.ptr.p_double[i], _state)&&ae_fp_less(state.x.ptr.p_double[i],bl.ptr.p_double[i]))||(ae_isfinite(bu.ptr.p_double[i], _state)&&ae_fp_greater(state.x.ptr.p_double[i],bu.ptr.p_double[i])) ) { *testg = ae_true; ae_frame_leave(_state); return; } } testminbcunit_funcderiv(a, b, c, d, x0, x1, x2, &state.x, func, &state.f, &state.g, _state); state.g.ptr.p_double[nbrcomp] = state.g.ptr.p_double[nbrcomp]+noise; } } minbcresults(&state, &x, &rep, _state); /* * Check that error code equal to -7 and parameter .VarIdx * equal to number of incorrect component. */ if( rep.terminationtype!=-7||rep.varidx!=nbrcomp ) { *testg = ae_true; ae_frame_leave(_state); return; } } *testg = ae_false; ae_frame_leave(_state); } /************************************************************************* This function return function value and it derivatives. Function dimension is 3. Function's list: * funcType=1: F(X)=A*(X-X0)^2+B*(Y-Y0)^2+C*(Z-Z0)^2+D; * funcType=2: F(X)=A*sin(X-X0)^2+B*sin(Y-Y0)^2+C*sin(Z-Z0)^2+D; * funcType=3: F(X)=A*(X-X0)^2+B*(Y-Y0)^2+C*((Z-Z0)-(X-X0))^2+D. *************************************************************************/ static void testminbcunit_funcderiv(double a, double b, double c, double d, double x0, double x1, double x2, /* Real */ ae_vector* x, ae_int_t functype, double* f, /* Real */ ae_vector* g, ae_state *_state) { ae_assert(((ae_isfinite(a, _state)&&ae_isfinite(b, _state))&&ae_isfinite(c, _state))&&ae_isfinite(d, _state), "FuncDeriv: A, B, C or D contains NaN or Infinite.", _state); ae_assert((ae_isfinite(x0, _state)&&ae_isfinite(x1, _state))&&ae_isfinite(x2, _state), "FuncDeriv: X0, X1 or X2 contains NaN or Infinite.", _state); ae_assert(functype>=1&&functype<=3, "FuncDeriv: incorrect funcType(funcType<1 or funcType>3).", _state); if( functype==1 ) { *f = a*ae_sqr(x->ptr.p_double[0]-x0, _state)+b*ae_sqr(x->ptr.p_double[1]-x1, _state)+c*ae_sqr(x->ptr.p_double[2]-x2, _state)+d; g->ptr.p_double[0] = 2*a*(x->ptr.p_double[0]-x0); g->ptr.p_double[1] = 2*b*(x->ptr.p_double[1]-x1); g->ptr.p_double[2] = 2*c*(x->ptr.p_double[2]-x2); return; } if( functype==2 ) { *f = a*ae_sqr(ae_sin(x->ptr.p_double[0]-x0, _state), _state)+b*ae_sqr(ae_sin(x->ptr.p_double[1]-x1, _state), _state)+c*ae_sqr(ae_sin(x->ptr.p_double[2]-x2, _state), _state)+d; g->ptr.p_double[0] = 2*a*ae_sin(x->ptr.p_double[0]-x0, _state)*ae_cos(x->ptr.p_double[0]-x0, _state); g->ptr.p_double[1] = 2*b*ae_sin(x->ptr.p_double[1]-x1, _state)*ae_cos(x->ptr.p_double[1]-x1, _state); g->ptr.p_double[2] = 2*c*ae_sin(x->ptr.p_double[2]-x2, _state)*ae_cos(x->ptr.p_double[2]-x2, _state); return; } if( functype==3 ) { *f = a*ae_sqr(x->ptr.p_double[0]-x0, _state)+b*ae_sqr(x->ptr.p_double[1]-x1, _state)+c*ae_sqr(x->ptr.p_double[2]-x2-(x->ptr.p_double[0]-x0), _state)+d; g->ptr.p_double[0] = 2*a*(x->ptr.p_double[0]-x0)+2*c*(x->ptr.p_double[0]-x->ptr.p_double[2]-x0+x2); g->ptr.p_double[1] = 2*b*(x->ptr.p_double[1]-x1); g->ptr.p_double[2] = 2*c*(x->ptr.p_double[2]-x->ptr.p_double[0]-x2+x0); return; } } static double testminnsunit_scalingtesttol = 1.0E-6; static ae_int_t testminnsunit_scalingtestcnt = 5; static void testminnsunit_basictest0uc(ae_bool* errors, 
ae_state *_state); static void testminnsunit_basictest1uc(ae_bool* errors, ae_state *_state); static void testminnsunit_basictest0bc(ae_bool* errors, ae_state *_state); static void testminnsunit_basictest1bc(ae_bool* errors, ae_state *_state); static void testminnsunit_basictest0lc(ae_bool* errors, ae_state *_state); static void testminnsunit_basictest1lc(ae_bool* errors, ae_state *_state); static void testminnsunit_basictest0nlc(ae_bool* errors, ae_state *_state); static void testminnsunit_testuc(ae_bool* primaryerrors, ae_bool* othererrors, ae_state *_state); static void testminnsunit_testbc(ae_bool* primaryerrors, ae_bool* othererrors, ae_state *_state); static void testminnsunit_testlc(ae_bool* primaryerrors, ae_bool* othererrors, ae_state *_state); static void testminnsunit_testnlc(ae_bool* primaryerrors, ae_bool* othererrors, ae_state *_state); static void testminnsunit_testother(ae_bool* othererrors, ae_state *_state); ae_bool testminns(ae_bool silent, ae_state *_state) { ae_bool wereerrors; ae_bool ucerrors; ae_bool bcerrors; ae_bool lcerrors; ae_bool nlcerrors; ae_bool othererrors; ae_bool result; wereerrors = ae_false; ucerrors = ae_false; bcerrors = ae_false; lcerrors = ae_false; nlcerrors = ae_false; othererrors = ae_false; /* * Basic tests */ testminnsunit_basictest0nlc(&nlcerrors, _state); testminnsunit_basictest0uc(&ucerrors, _state); testminnsunit_basictest1uc(&ucerrors, _state); testminnsunit_basictest0bc(&bcerrors, _state); testminnsunit_basictest1bc(&bcerrors, _state); testminnsunit_basictest0lc(&lcerrors, _state); testminnsunit_basictest1lc(&lcerrors, _state); /* * Special tests */ testminnsunit_testother(&othererrors, _state); /* * Full scale tests */ testminnsunit_testuc(&ucerrors, &othererrors, _state); testminnsunit_testbc(&bcerrors, &othererrors, _state); testminnsunit_testlc(&lcerrors, &othererrors, _state); testminnsunit_testnlc(&nlcerrors, &othererrors, _state); /* * end */ wereerrors = (((ucerrors||bcerrors)||lcerrors)||nlcerrors)||othererrors; if( !silent ) { printf("TESTING MINNS OPTIMIZATION\n"); printf("TESTS:\n"); printf("* UNCONSTRAINED "); if( ucerrors ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("* BOUND CONSTRAINED "); if( bcerrors ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("* LINEARLY CONSTRAINED "); if( lcerrors ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("* NONLINEARLY CONSTRAINED "); if( nlcerrors ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("* OTHER PROPERTIES "); if( othererrors ) { printf("FAILED\n"); } else { printf("OK\n"); } if( wereerrors ) { printf("TEST FAILED\n"); } else { printf("TEST PASSED\n"); } printf("\n\n"); } result = !wereerrors; return result; } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. 
*************************************************************************/ ae_bool _pexec_testminns(ae_bool silent, ae_state *_state) { return testminns(silent, _state); } /************************************************************************* Basic unconstrained test *************************************************************************/ static void testminnsunit_basictest0uc(ae_bool* errors, ae_state *_state) { ae_frame _frame_block; ae_int_t n; ae_int_t i; ae_vector x0; ae_vector x1; ae_vector d; minnsstate s; minnsreport rep; double sumits; double sumnfev; ae_int_t pass; ae_int_t passcount; ae_frame_make(_state, &_frame_block); ae_vector_init(&x0, 0, DT_REAL, _state); ae_vector_init(&x1, 0, DT_REAL, _state); ae_vector_init(&d, 0, DT_REAL, _state); _minnsstate_init(&s, _state); _minnsreport_init(&rep, _state); n = 5; passcount = 10; sumits = (double)(0); sumnfev = (double)(0); ae_vector_set_length(&x0, n, _state); ae_vector_set_length(&d, n, _state); for(pass=1; pass<=10; pass++) { for(i=0; i<=n-1; i++) { x0.ptr.p_double[i] = 2*ae_randomreal(_state)-1; d.ptr.p_double[i] = ae_pow((double)(10), 2*ae_randomreal(_state)-1, _state); } minnscreate(n, &x0, &s, _state); minnssetalgoags(&s, 0.1, 0.0, _state); while(minnsiteration(&s, _state)) { if( s.needfij ) { s.fi.ptr.p_double[0] = 0.0; for(i=0; i<=n-1; i++) { s.fi.ptr.p_double[0] = s.fi.ptr.p_double[0]+d.ptr.p_double[i]*ae_fabs(s.x.ptr.p_double[i], _state); s.j.ptr.pp_double[0][i] = d.ptr.p_double[i]*ae_sign(s.x.ptr.p_double[i], _state); } continue; } ae_assert(ae_false, "Assertion failed", _state); } minnsresults(&s, &x1, &rep, _state); seterrorflag(errors, rep.terminationtype<=0, _state); if( *errors ) { ae_frame_leave(_state); return; } for(i=0; i<=n-1; i++) { seterrorflag(errors, !ae_isfinite(x1.ptr.p_double[i], _state)||ae_fp_greater(ae_fabs(x1.ptr.p_double[i], _state),0.001), _state); } sumits = sumits+(double)rep.iterationscount/(double)passcount; sumnfev = sumnfev+(double)rep.nfev/(double)passcount; } ae_frame_leave(_state); } /************************************************************************* Basic unconstrained test: nonsmooth Rosenbrock posed as unconstrained problem. 
[ ] minimize [ 10*|x0^2-x1| + (1-x0)^2 + 100*max(sqrt(2)*x0-1,0) + 100*max(2*x1-1,0) ] [ ] It's exact solution is x0=1/sqrt(2), x1=1/2 *************************************************************************/ static void testminnsunit_basictest1uc(ae_bool* errors, ae_state *_state) { ae_frame _frame_block; ae_int_t n; double v0; double v1; ae_vector x0; ae_vector x1; minnsstate s; minnsreport rep; ae_frame_make(_state, &_frame_block); ae_vector_init(&x0, 0, DT_REAL, _state); ae_vector_init(&x1, 0, DT_REAL, _state); _minnsstate_init(&s, _state); _minnsreport_init(&rep, _state); n = 2; ae_vector_set_length(&x0, n, _state); x0.ptr.p_double[0] = (double)(0); x0.ptr.p_double[1] = (double)(0); minnscreate(n, &x0, &s, _state); minnssetalgoags(&s, 0.1, 0.0, _state); while(minnsiteration(&s, _state)) { if( s.needfij ) { v0 = s.x.ptr.p_double[0]; v1 = s.x.ptr.p_double[1]; s.fi.ptr.p_double[0] = 10*ae_fabs(ae_sqr(v0, _state)-v1, _state)+ae_sqr(v0-1, _state); s.j.ptr.pp_double[0][0] = 10*ae_sign(ae_sqr(v0, _state)-v1, _state)*2*v0+2*(v0-1); s.j.ptr.pp_double[0][1] = (double)(10*ae_sign(ae_sqr(v0, _state)-v1, _state)*(-1)); if( ae_fp_greater(ae_sqrt((double)(2), _state)*v0-1,0.0) ) { s.fi.ptr.p_double[0] = s.fi.ptr.p_double[0]+100*(ae_sqrt((double)(2), _state)*v0-1); s.j.ptr.pp_double[0][0] = s.j.ptr.pp_double[0][0]+100*ae_sqrt((double)(2), _state); } if( ae_fp_greater(2*v1-1,0.0) ) { s.fi.ptr.p_double[0] = s.fi.ptr.p_double[0]+100*(2*v1-1); s.j.ptr.pp_double[0][1] = s.j.ptr.pp_double[0][1]+100*2; } continue; } ae_assert(ae_false, "Assertion failed", _state); } minnsresults(&s, &x1, &rep, _state); seterrorflag(errors, rep.terminationtype<=0, _state); if( *errors ) { ae_frame_leave(_state); return; } seterrorflag(errors, !ae_isfinite(x1.ptr.p_double[0], _state)||ae_fp_greater(ae_fabs(x1.ptr.p_double[0]-1/ae_sqrt((double)(2), _state), _state),0.001), _state); seterrorflag(errors, !ae_isfinite(x1.ptr.p_double[1], _state)||ae_fp_greater(ae_fabs(x1.ptr.p_double[1]-(double)1/(double)2, _state),0.001), _state); ae_frame_leave(_state); } /************************************************************************* Basic box constrained test *************************************************************************/ static void testminnsunit_basictest0bc(ae_bool* errors, ae_state *_state) { ae_frame _frame_block; ae_int_t n; ae_int_t i; ae_vector x0; ae_vector x1; ae_vector d; ae_vector bl; ae_vector bu; minnsstate s; minnsreport rep; double sumits; double sumnfev; ae_int_t pass; ae_int_t passcount; double v0; double v1; ae_frame_make(_state, &_frame_block); ae_vector_init(&x0, 0, DT_REAL, _state); ae_vector_init(&x1, 0, DT_REAL, _state); ae_vector_init(&d, 0, DT_REAL, _state); ae_vector_init(&bl, 0, DT_REAL, _state); ae_vector_init(&bu, 0, DT_REAL, _state); _minnsstate_init(&s, _state); _minnsreport_init(&rep, _state); n = 5; passcount = 10; sumits = (double)(0); sumnfev = (double)(0); ae_vector_set_length(&x0, n, _state); ae_vector_set_length(&bl, n, _state); ae_vector_set_length(&bu, n, _state); ae_vector_set_length(&d, n, _state); for(pass=1; pass<=10; pass++) { for(i=0; i<=n-1; i++) { x0.ptr.p_double[i] = 2*ae_randomreal(_state)-1; d.ptr.p_double[i] = ae_pow((double)(10), 2*ae_randomreal(_state)-1, _state); v0 = 2*ae_randomreal(_state)-1; v1 = 2*ae_randomreal(_state)-1; bl.ptr.p_double[i] = ae_minreal(v0, v1, _state); bu.ptr.p_double[i] = ae_maxreal(v0, v1, _state); } minnscreate(n, &x0, &s, _state); minnssetalgoags(&s, 0.1, 0.0, _state); minnssetbc(&s, &bl, &bu, _state); while(minnsiteration(&s, _state)) { 
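/*
 * Reverse communication with the AGS solver: on a NeedFiJ request the test
 * stores the (nonsmooth) target in Fi[0] and one of its subgradients in row 0
 * of J. For f(x) = sum(d[i]*|x[i]|) the subgradient d[i]*sign(x[i]) is supplied,
 * as can be seen just below. With box constraints bl[i]<=x[i]<=bu[i] the exact
 * minimizer is the projection of the unconstrained solution x=0 onto the box,
 * x*[i] = boundval(0, bl[i], bu[i]), which is what the post-optimization check
 * compares against with 0.001 tolerance. For instance (illustrative numbers),
 * with bl[i]=-0.3, bu[i]=0.2 the expected component is 0, while with
 * bl[i]=0.1, bu[i]=0.7 it is clipped to the lower bound 0.1.
 */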
if( s.needfij ) { s.fi.ptr.p_double[0] = 0.0; for(i=0; i<=n-1; i++) { s.fi.ptr.p_double[0] = s.fi.ptr.p_double[0]+d.ptr.p_double[i]*ae_fabs(s.x.ptr.p_double[i], _state); s.j.ptr.pp_double[0][i] = d.ptr.p_double[i]*ae_sign(s.x.ptr.p_double[i], _state); } continue; } ae_assert(ae_false, "Assertion failed", _state); } minnsresults(&s, &x1, &rep, _state); seterrorflag(errors, rep.terminationtype<=0, _state); if( *errors ) { ae_frame_leave(_state); return; } for(i=0; i<=n-1; i++) { seterrorflag(errors, !ae_isfinite(x1.ptr.p_double[i], _state)||ae_fp_greater(ae_fabs(x1.ptr.p_double[i]-boundval(0.0, bl.ptr.p_double[i], bu.ptr.p_double[i], _state), _state),0.001), _state); } sumits = sumits+(double)rep.iterationscount/(double)passcount; sumnfev = sumnfev+(double)rep.nfev/(double)passcount; } ae_frame_leave(_state); } /************************************************************************* Basic constrained test: nonsmooth Rosenbrock posed as box constrained problem. [ ] minimize [ 10*|x0^2-x1| + (1-x0)^2 ] [ ] s.t. x0<=1/sqrt(2), x1<=0.5 It's exact solution is x0=1/sqrt(2), x1=1/2 *************************************************************************/ static void testminnsunit_basictest1bc(ae_bool* errors, ae_state *_state) { ae_frame _frame_block; ae_int_t n; double v0; double v1; ae_vector x0; ae_vector x1; ae_vector bndl; ae_vector bndu; minnsstate s; minnsreport rep; ae_frame_make(_state, &_frame_block); ae_vector_init(&x0, 0, DT_REAL, _state); ae_vector_init(&x1, 0, DT_REAL, _state); ae_vector_init(&bndl, 0, DT_REAL, _state); ae_vector_init(&bndu, 0, DT_REAL, _state); _minnsstate_init(&s, _state); _minnsreport_init(&rep, _state); n = 2; ae_vector_set_length(&x0, n, _state); ae_vector_set_length(&bndl, n, _state); ae_vector_set_length(&bndu, n, _state); x0.ptr.p_double[0] = (double)(0); x0.ptr.p_double[1] = (double)(0); bndl.ptr.p_double[0] = _state->v_neginf; bndl.ptr.p_double[1] = _state->v_neginf; bndu.ptr.p_double[0] = 1/ae_sqrt((double)(2), _state); bndu.ptr.p_double[1] = (double)1/(double)2; minnscreate(n, &x0, &s, _state); minnssetbc(&s, &bndl, &bndu, _state); minnssetalgoags(&s, 0.1, 0.0, _state); while(minnsiteration(&s, _state)) { if( s.needfij ) { v0 = s.x.ptr.p_double[0]; v1 = s.x.ptr.p_double[1]; s.fi.ptr.p_double[0] = 10*ae_fabs(ae_sqr(v0, _state)-v1, _state)+ae_sqr(v0-1, _state); s.j.ptr.pp_double[0][0] = 10*ae_sign(ae_sqr(v0, _state)-v1, _state)*2*v0+2*(v0-1); s.j.ptr.pp_double[0][1] = (double)(10*ae_sign(ae_sqr(v0, _state)-v1, _state)*(-1)); continue; } ae_assert(ae_false, "Assertion failed", _state); } minnsresults(&s, &x1, &rep, _state); seterrorflag(errors, rep.terminationtype<=0, _state); if( *errors ) { ae_frame_leave(_state); return; } seterrorflag(errors, !ae_isfinite(x1.ptr.p_double[0], _state)||ae_fp_greater(ae_fabs(x1.ptr.p_double[0]-1/ae_sqrt((double)(2), _state), _state),0.001), _state); seterrorflag(errors, !ae_isfinite(x1.ptr.p_double[1], _state)||ae_fp_greater(ae_fabs(x1.ptr.p_double[1]-(double)1/(double)2, _state),0.001), _state); ae_frame_leave(_state); } /************************************************************************* Basic linearly constrained test *************************************************************************/ static void testminnsunit_basictest0lc(ae_bool* errors, ae_state *_state) { ae_frame _frame_block; ae_int_t n; ae_int_t i; ae_int_t j; ae_vector x0; ae_vector x1; ae_matrix c; ae_vector ct; double d; minnsstate s; minnsreport rep; double sumits; double sumnfev; ae_int_t pass; ae_int_t passcount; ae_int_t nc; 
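    /*
     * Informal summary of this test, inferred from the construction below: for every
     * variable x[i] we either add one equality row (CT=0) or a pair of inequality rows
     * with types CT=+1 ('>=') and CT=-1 ('<='), all with a zero right-hand side, so the
     * feasible set collapses to the origin. The target is nonconvex (D=-10), hence the
     * solver must rely on the linear constraints alone, and the returned point is
     * required to satisfy |X1[i]| <= 0.001.
     */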
ae_frame_make(_state, &_frame_block); ae_vector_init(&x0, 0, DT_REAL, _state); ae_vector_init(&x1, 0, DT_REAL, _state); ae_matrix_init(&c, 0, 0, DT_REAL, _state); ae_vector_init(&ct, 0, DT_INT, _state); _minnsstate_init(&s, _state); _minnsreport_init(&rep, _state); d = -10.0; n = 5; passcount = 10; sumits = (double)(0); sumnfev = (double)(0); ae_vector_set_length(&x0, n, _state); ae_matrix_set_length(&c, 2*n, n+1, _state); ae_vector_set_length(&ct, 2*n, _state); for(pass=1; pass<=10; pass++) { nc = 0; for(i=0; i<=n-1; i++) { x0.ptr.p_double[i] = 2*ae_randomreal(_state)-1; if( ae_fp_less(ae_randomreal(_state),0.5) ) { for(j=0; j<=n; j++) { c.ptr.pp_double[nc][j] = 0.0; } c.ptr.pp_double[nc][i] = 1.0+ae_randomreal(_state); ct.ptr.p_int[nc] = 0; inc(&nc, _state); } else { for(j=0; j<=n; j++) { c.ptr.pp_double[nc+0][j] = 0.0; c.ptr.pp_double[nc+1][j] = 0.0; } c.ptr.pp_double[nc+0][i] = 1.0+ae_randomreal(_state); c.ptr.pp_double[nc+1][i] = 1.0+ae_randomreal(_state); ct.ptr.p_int[nc+0] = 1; ct.ptr.p_int[nc+1] = -1; nc = nc+2; } } minnscreate(n, &x0, &s, _state); minnssetalgoags(&s, 0.1, 0.0, _state); minnssetlc(&s, &c, &ct, nc, _state); while(minnsiteration(&s, _state)) { if( s.needfij ) { s.fi.ptr.p_double[0] = 0.0; for(i=0; i<=n-1; i++) { s.fi.ptr.p_double[0] = d*ae_sqr(s.x.ptr.p_double[i], _state); s.j.ptr.pp_double[0][i] = d*2*s.x.ptr.p_double[i]; } continue; } ae_assert(ae_false, "Assertion failed", _state); } minnsresults(&s, &x1, &rep, _state); seterrorflag(errors, rep.terminationtype<=0, _state); if( *errors ) { ae_frame_leave(_state); return; } for(i=0; i<=n-1; i++) { seterrorflag(errors, !ae_isfinite(x1.ptr.p_double[i], _state)||ae_fp_greater(ae_fabs(x1.ptr.p_double[i], _state),0.001), _state); } sumits = sumits+(double)rep.iterationscount/(double)passcount; sumnfev = sumnfev+(double)rep.nfev/(double)passcount; } ae_frame_leave(_state); } /************************************************************************* Basic constrained test: nonsmooth Rosenbrock posed as linearly constrained problem. [ ] minimize [ 10*|x0^2-x1| + (1-x0)^2 ] [ ] s.t. 
x0<=1/sqrt(2), x1<=0.5 It's exact solution is x0=1/sqrt(2), x1=1/2 *************************************************************************/ static void testminnsunit_basictest1lc(ae_bool* errors, ae_state *_state) { ae_frame _frame_block; ae_int_t n; double v0; double v1; ae_vector x0; ae_vector x1; ae_matrix c; ae_vector ct; minnsstate s; minnsreport rep; ae_frame_make(_state, &_frame_block); ae_vector_init(&x0, 0, DT_REAL, _state); ae_vector_init(&x1, 0, DT_REAL, _state); ae_matrix_init(&c, 0, 0, DT_REAL, _state); ae_vector_init(&ct, 0, DT_INT, _state); _minnsstate_init(&s, _state); _minnsreport_init(&rep, _state); n = 2; ae_vector_set_length(&x0, n, _state); ae_matrix_set_length(&c, 2, n+1, _state); ae_vector_set_length(&ct, 2, _state); x0.ptr.p_double[0] = (double)(0); x0.ptr.p_double[1] = (double)(0); c.ptr.pp_double[0][0] = 1.0; c.ptr.pp_double[0][1] = 0.0; c.ptr.pp_double[0][2] = 1/ae_sqrt((double)(2), _state); c.ptr.pp_double[1][0] = 0.0; c.ptr.pp_double[1][1] = 1.0; c.ptr.pp_double[1][2] = (double)1/(double)2; ct.ptr.p_int[0] = -1; ct.ptr.p_int[1] = -1; minnscreate(n, &x0, &s, _state); minnssetlc(&s, &c, &ct, 2, _state); minnssetalgoags(&s, 0.1, 0.0, _state); while(minnsiteration(&s, _state)) { if( s.needfij ) { v0 = s.x.ptr.p_double[0]; v1 = s.x.ptr.p_double[1]; s.fi.ptr.p_double[0] = 10*ae_fabs(ae_sqr(v0, _state)-v1, _state)+ae_sqr(v0-1, _state); s.j.ptr.pp_double[0][0] = 10*ae_sign(ae_sqr(v0, _state)-v1, _state)*2*v0+2*(v0-1); s.j.ptr.pp_double[0][1] = (double)(10*ae_sign(ae_sqr(v0, _state)-v1, _state)*(-1)); continue; } ae_assert(ae_false, "Assertion failed", _state); } minnsresults(&s, &x1, &rep, _state); seterrorflag(errors, rep.terminationtype<=0, _state); if( *errors ) { ae_frame_leave(_state); return; } seterrorflag(errors, !ae_isfinite(x1.ptr.p_double[0], _state)||ae_fp_greater(ae_fabs(x1.ptr.p_double[0]-1/ae_sqrt((double)(2), _state), _state),0.001), _state); seterrorflag(errors, !ae_isfinite(x1.ptr.p_double[1], _state)||ae_fp_greater(ae_fabs(x1.ptr.p_double[1]-(double)1/(double)2, _state),0.001), _state); ae_frame_leave(_state); } /************************************************************************* Basic nonlinearly constrained test *************************************************************************/ static void testminnsunit_basictest0nlc(ae_bool* errors, ae_state *_state) { ae_frame _frame_block; ae_int_t n; ae_int_t i; ae_int_t j; ae_vector x0; ae_vector x1; ae_matrix ec; ae_matrix ic; ae_int_t nec; ae_int_t nic; double d; minnsstate s; minnsreport rep; double sumits; double sumnfev; ae_int_t pass; ae_int_t passcount; ae_frame_make(_state, &_frame_block); ae_vector_init(&x0, 0, DT_REAL, _state); ae_vector_init(&x1, 0, DT_REAL, _state); ae_matrix_init(&ec, 0, 0, DT_REAL, _state); ae_matrix_init(&ic, 0, 0, DT_REAL, _state); _minnsstate_init(&s, _state); _minnsreport_init(&rep, _state); d = -10.0; n = 5; passcount = 10; sumits = (double)(0); sumnfev = (double)(0); ae_vector_set_length(&x0, n, _state); ae_matrix_set_length(&ec, 2*n, n+1, _state); ae_matrix_set_length(&ic, 2*n, n+1, _state); for(pass=1; pass<=10; pass++) { nec = 0; nic = 0; for(i=0; i<=n-1; i++) { x0.ptr.p_double[i] = 2*ae_randomreal(_state)-1; if( ae_fp_less(ae_randomreal(_state),0.5) ) { for(j=0; j<=n; j++) { ec.ptr.pp_double[nec][j] = 0.0; } ec.ptr.pp_double[nec][i] = 1.0+ae_randomreal(_state); inc(&nec, _state); } else { for(j=0; j<=n; j++) { ic.ptr.pp_double[nic+0][j] = 0.0; ic.ptr.pp_double[nic+1][j] = 0.0; } ic.ptr.pp_double[nic+0][i] = 1.0+ae_randomreal(_state); 
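                /*
                 * Informal note: together with the row filled just above, the row filled
                 * below uses a coefficient of the opposite sign, so the two inequality
                 * constraints (reported to MinNS as FI[1+NEC+i] <= 0 in the callback) pin
                 * x[i] to zero, mirroring the equality branch above.
                 */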
ic.ptr.pp_double[nic+1][i] = -1.0-ae_randomreal(_state); nic = nic+2; } } minnscreate(n, &x0, &s, _state); minnssetalgoags(&s, 0.1, 100.0, _state); minnssetnlc(&s, nec, nic, _state); while(minnsiteration(&s, _state)) { if( s.needfij ) { s.fi.ptr.p_double[0] = 0.0; for(j=0; j<=n-1; j++) { s.fi.ptr.p_double[0] = d*ae_sqr(s.x.ptr.p_double[j], _state); s.j.ptr.pp_double[0][j] = d*2*s.x.ptr.p_double[j]; } for(i=0; i<=nec-1; i++) { s.fi.ptr.p_double[1+i] = -ec.ptr.pp_double[i][n]; for(j=0; j<=n-1; j++) { s.fi.ptr.p_double[1+i] = s.fi.ptr.p_double[1+i]+s.x.ptr.p_double[j]*ec.ptr.pp_double[i][j]; s.j.ptr.pp_double[1+i][j] = ec.ptr.pp_double[i][j]; } } for(i=0; i<=nic-1; i++) { s.fi.ptr.p_double[1+nec+i] = -ic.ptr.pp_double[i][n]; for(j=0; j<=n-1; j++) { s.fi.ptr.p_double[1+nec+i] = s.fi.ptr.p_double[1+nec+i]+s.x.ptr.p_double[j]*ic.ptr.pp_double[i][j]; s.j.ptr.pp_double[1+nec+i][j] = ic.ptr.pp_double[i][j]; } } continue; } ae_assert(ae_false, "Assertion failed", _state); } minnsresults(&s, &x1, &rep, _state); seterrorflag(errors, rep.terminationtype<=0, _state); if( *errors ) { ae_frame_leave(_state); return; } for(i=0; i<=n-1; i++) { seterrorflag(errors, !ae_isfinite(x1.ptr.p_double[i], _state)||ae_fp_greater(ae_fabs(x1.ptr.p_double[i], _state),0.001), _state); } sumits = sumits+(double)rep.iterationscount/(double)passcount; sumnfev = sumnfev+(double)rep.nfev/(double)passcount; } ae_frame_leave(_state); } /************************************************************************* Unconstrained test *************************************************************************/ static void testminnsunit_testuc(ae_bool* primaryerrors, ae_bool* othererrors, ae_state *_state) { ae_frame _frame_block; ae_int_t n; ae_int_t i; ae_vector x0; ae_vector x0s; ae_vector x1; ae_vector x1s; ae_vector d; ae_vector xc; ae_vector s; ae_vector xrfirst; ae_vector xrlast; minnsstate state; minnsreport rep; double v; ae_int_t pass; ae_bool requirexrep; double epsrad; ae_bool werexreports; double repferr; double xtol; ae_frame_make(_state, &_frame_block); ae_vector_init(&x0, 0, DT_REAL, _state); ae_vector_init(&x0s, 0, DT_REAL, _state); ae_vector_init(&x1, 0, DT_REAL, _state); ae_vector_init(&x1s, 0, DT_REAL, _state); ae_vector_init(&d, 0, DT_REAL, _state); ae_vector_init(&xc, 0, DT_REAL, _state); ae_vector_init(&s, 0, DT_REAL, _state); ae_vector_init(&xrfirst, 0, DT_REAL, _state); ae_vector_init(&xrlast, 0, DT_REAL, _state); _minnsstate_init(&state, _state); _minnsreport_init(&rep, _state); for(pass=1; pass<=10; pass++) { for(n=1; n<=5; n++) { /* * First test: * * test that problem is successfully solved * * test that X-reports are performed correctly - present * when requested, return first and last points correctly, * not present by default, function value is reported * correctly. 
* * we use non-unit scale, randomly chosen one, which results * in badly conditioned problems (to check robustness) */ ae_vector_set_length(&x0, n, _state); ae_vector_set_length(&xc, n, _state); ae_vector_set_length(&d, n, _state); ae_vector_set_length(&s, n, _state); ae_vector_set_length(&xrfirst, n, _state); ae_vector_set_length(&xrlast, n, _state); for(i=0; i<=n-1; i++) { x0.ptr.p_double[i] = 10*(2*ae_randomreal(_state)-1); xc.ptr.p_double[i] = 2*ae_randomreal(_state)-1; d.ptr.p_double[i] = ae_pow((double)(10), 2*(2*ae_randomreal(_state)-1), _state); s.ptr.p_double[i] = ae_pow((double)(10), 2*(2*ae_randomreal(_state)-1), _state); } requirexrep = ae_fp_greater(ae_randomreal(_state),0.5); epsrad = 0.01*ae_pow((double)(10), -2*ae_randomreal(_state), _state); xtol = 15.0*epsrad; minnscreate(n, &x0, &state, _state); minnssetalgoags(&state, 0.1, 0.0, _state); minnssetcond(&state, epsrad, 0, _state); minnssetscale(&state, &s, _state); if( requirexrep ) { minnssetxrep(&state, ae_true, _state); } werexreports = ae_false; repferr = 0.0; while(minnsiteration(&state, _state)) { if( state.needfij ) { state.fi.ptr.p_double[0] = 0.0; for(i=0; i<=n-1; i++) { state.fi.ptr.p_double[0] = state.fi.ptr.p_double[0]+d.ptr.p_double[i]*ae_fabs(state.x.ptr.p_double[i]-xc.ptr.p_double[i], _state); state.j.ptr.pp_double[0][i] = d.ptr.p_double[i]*ae_sign(state.x.ptr.p_double[i]-xc.ptr.p_double[i], _state); } continue; } if( state.xupdated ) { if( !werexreports ) { ae_v_move(&xrfirst.ptr.p_double[0], 1, &state.x.ptr.p_double[0], 1, ae_v_len(0,n-1)); } ae_v_move(&xrlast.ptr.p_double[0], 1, &state.x.ptr.p_double[0], 1, ae_v_len(0,n-1)); werexreports = ae_true; v = 0.0; for(i=0; i<=n-1; i++) { v = v+d.ptr.p_double[i]*ae_fabs(state.x.ptr.p_double[i]-xc.ptr.p_double[i], _state); } repferr = ae_maxreal(repferr, ae_fabs(v-state.f, _state), _state); continue; } ae_assert(ae_false, "Assertion failed", _state); } minnsresults(&state, &x1, &rep, _state); seterrorflag(primaryerrors, rep.terminationtype<=0, _state); seterrorflag(othererrors, werexreports&&!requirexrep, _state); seterrorflag(othererrors, requirexrep&&!werexreports, _state); seterrorflag(othererrors, ae_fp_greater(repferr,10000*ae_machineepsilon), _state); if( *primaryerrors||(*othererrors) ) { ae_frame_leave(_state); return; } for(i=0; i<=n-1; i++) { seterrorflag(primaryerrors, !ae_isfinite(x1.ptr.p_double[i], _state)||ae_fp_greater(ae_fabs(x1.ptr.p_double[i]-xc.ptr.p_double[i], _state)/s.ptr.p_double[i],xtol), _state); if( requirexrep ) { seterrorflag(othererrors, !ae_isfinite(xrfirst.ptr.p_double[i], _state)||ae_fp_greater(ae_fabs(x0.ptr.p_double[i]-xrfirst.ptr.p_double[i], _state),100*ae_machineepsilon), _state); seterrorflag(othererrors, !ae_isfinite(xrlast.ptr.p_double[i], _state)||ae_fp_greater(ae_fabs(x1.ptr.p_double[i]-xrlast.ptr.p_double[i], _state),100*ae_machineepsilon), _state); } } /* * Test numerical differentiation: * * test that problem is successfully solved * * test that correct function value is reported */ ae_vector_set_length(&x0, n, _state); ae_vector_set_length(&xc, n, _state); ae_vector_set_length(&d, n, _state); ae_vector_set_length(&s, n, _state); ae_vector_set_length(&xrlast, n, _state); for(i=0; i<=n-1; i++) { x0.ptr.p_double[i] = 10*(2*ae_randomreal(_state)-1); xc.ptr.p_double[i] = 2*ae_randomreal(_state)-1; d.ptr.p_double[i] = ae_pow((double)(10), 2*(2*ae_randomreal(_state)-1), _state); s.ptr.p_double[i] = ae_pow((double)(10), 2*(2*ae_randomreal(_state)-1), _state); } epsrad = 0.01*ae_pow((double)(10), -2*ae_randomreal(_state), 
_state); xtol = 15.0*epsrad; minnscreatef(n, &x0, epsrad/100, &state, _state); minnssetalgoags(&state, 0.1, 0.0, _state); minnssetcond(&state, epsrad, 0, _state); minnssetscale(&state, &s, _state); minnssetxrep(&state, ae_true, _state); repferr = 0.0; while(minnsiteration(&state, _state)) { if( state.needfi ) { state.fi.ptr.p_double[0] = 0.0; for(i=0; i<=n-1; i++) { state.fi.ptr.p_double[0] = state.fi.ptr.p_double[0]+d.ptr.p_double[i]*ae_fabs(state.x.ptr.p_double[i]-xc.ptr.p_double[i], _state); } continue; } if( state.xupdated ) { v = 0.0; for(i=0; i<=n-1; i++) { v = v+d.ptr.p_double[i]*ae_fabs(state.x.ptr.p_double[i]-xc.ptr.p_double[i], _state); } repferr = ae_maxreal(repferr, ae_fabs(v-state.f, _state), _state); continue; } ae_assert(ae_false, "Assertion failed", _state); } minnsresults(&state, &x1, &rep, _state); seterrorflag(primaryerrors, rep.terminationtype<=0, _state); seterrorflag(othererrors, ae_fp_greater(repferr,10000*ae_machineepsilon), _state); if( *primaryerrors||(*othererrors) ) { ae_frame_leave(_state); return; } for(i=0; i<=n-1; i++) { seterrorflag(primaryerrors, !ae_isfinite(x1.ptr.p_double[i], _state)||ae_fp_greater(ae_fabs(x1.ptr.p_double[i]-xc.ptr.p_double[i], _state)/s.ptr.p_double[i],xtol), _state); } /* * Test scaling: we perform several steps on unit-scale problem, * then we perform same amount of steps on re-scaled problem, * starting from same point (but scaled according to chosen scale). * * Correctly written optimizer should perform essentially same steps * (up to scale) on both problems. At least, it holds within first * several steps, before rounding errors start to accumulate. * * NOTE: we also check that correctly scaled points are reported. * And, as side effect, we check MinNSRestartFrom(). * * NOTE: we use moderate scale and diagonal coefficients in order * to have well-conditioned system. We test correctness of * formulae here, not robustness of algorithm. 
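            *
            * Informal sketch of the invariance being tested: with diagonal scaling S,
            * the second run starts from X0S = S*X0 and minimizes g(y) = f(y/S), whose
            * (sub)gradient is (1/S) times that of f at y/S. A correctly scaled
            * optimizer then produces iterates y_k = S*x_k, so the final check compares
            * X1 against X1S/S componentwise, within a small tolerance.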
*/ ae_vector_set_length(&x0, n, _state); ae_vector_set_length(&xc, n, _state); ae_vector_set_length(&x0s, n, _state); ae_vector_set_length(&d, n, _state); ae_vector_set_length(&s, n, _state); ae_vector_set_length(&xrlast, n, _state); for(i=0; i<=n-1; i++) { s.ptr.p_double[i] = ae_pow((double)(10), 2*ae_randomreal(_state)-1, _state); d.ptr.p_double[i] = ae_pow((double)(10), 2*ae_randomreal(_state)-1, _state); x0.ptr.p_double[i] = 2*ae_randomreal(_state)-1; xc.ptr.p_double[i] = 2*ae_randomreal(_state)-1; x0s.ptr.p_double[i] = x0.ptr.p_double[i]*s.ptr.p_double[i]; } minnscreate(n, &x0, &state, _state); minnssetalgoags(&state, 0.1, 0.0, _state); minnssetcond(&state, 0.0, testminnsunit_scalingtestcnt, _state); minnssetxrep(&state, ae_false, _state); while(minnsiteration(&state, _state)) { if( state.needfij ) { state.fi.ptr.p_double[0] = 0.0; for(i=0; i<=n-1; i++) { state.fi.ptr.p_double[0] = state.fi.ptr.p_double[0]+d.ptr.p_double[i]*ae_fabs(state.x.ptr.p_double[i]-xc.ptr.p_double[i], _state); state.j.ptr.pp_double[0][i] = d.ptr.p_double[i]*ae_sign(state.x.ptr.p_double[i]-xc.ptr.p_double[i], _state); } continue; } ae_assert(ae_false, "Assertion failed", _state); } minnsresults(&state, &x1, &rep, _state); seterrorflag(primaryerrors, rep.terminationtype<=0, _state); if( *primaryerrors||(*othererrors) ) { ae_frame_leave(_state); return; } minnssetscale(&state, &s, _state); minnssetxrep(&state, ae_true, _state); minnsrestartfrom(&state, &x0s, _state); werexreports = ae_false; while(minnsiteration(&state, _state)) { if( state.needfij ) { state.fi.ptr.p_double[0] = 0.0; for(i=0; i<=n-1; i++) { state.fi.ptr.p_double[0] = state.fi.ptr.p_double[0]+d.ptr.p_double[i]*ae_fabs(state.x.ptr.p_double[i]/s.ptr.p_double[i]-xc.ptr.p_double[i], _state); state.j.ptr.pp_double[0][i] = d.ptr.p_double[i]*ae_sign(state.x.ptr.p_double[i]/s.ptr.p_double[i]-xc.ptr.p_double[i], _state)/s.ptr.p_double[i]; } continue; } if( state.xupdated ) { ae_v_move(&xrlast.ptr.p_double[0], 1, &state.x.ptr.p_double[0], 1, ae_v_len(0,n-1)); werexreports = ae_true; continue; } ae_assert(ae_false, "Assertion failed", _state); } minnsresults(&state, &x1s, &rep, _state); seterrorflag(primaryerrors, rep.terminationtype<=0, _state); if( *primaryerrors||(*othererrors) ) { ae_frame_leave(_state); return; } for(i=0; i<=n-1; i++) { seterrorflag(primaryerrors, (!ae_isfinite(x1.ptr.p_double[i], _state)||!ae_isfinite(x1s.ptr.p_double[i], _state))||ae_fp_greater(ae_fabs(x1.ptr.p_double[i]-x1s.ptr.p_double[i]/s.ptr.p_double[i], _state),1.0E-4), _state); seterrorflag(othererrors, !ae_isfinite(xrlast.ptr.p_double[i], _state)||ae_fp_greater(ae_fabs(x1s.ptr.p_double[i]-xrlast.ptr.p_double[i], _state),testminnsunit_scalingtesttol), _state); } } } ae_frame_leave(_state); } /************************************************************************* Box constrained test *************************************************************************/ static void testminnsunit_testbc(ae_bool* primaryerrors, ae_bool* othererrors, ae_state *_state) { ae_frame _frame_block; ae_int_t n; ae_int_t i; ae_int_t j; ae_int_t k; ae_vector x0; ae_vector x0s; ae_vector x1; ae_vector x1s; ae_vector b; ae_vector d; ae_vector xc; ae_vector s; ae_vector bndl; ae_vector bndu; ae_vector scaledbndl; ae_vector scaledbndu; ae_vector xrfirst; ae_vector xrlast; ae_matrix a; minnsstate state; minnsreport rep; double v; double v0; double v1; ae_int_t pass; ae_int_t passcount; ae_bool requirexrep; double epsrad; ae_bool werexreports; double repferr; double xtol; ae_int_t maxn; double conda; double 
gnorm; ae_frame_make(_state, &_frame_block); ae_vector_init(&x0, 0, DT_REAL, _state); ae_vector_init(&x0s, 0, DT_REAL, _state); ae_vector_init(&x1, 0, DT_REAL, _state); ae_vector_init(&x1s, 0, DT_REAL, _state); ae_vector_init(&b, 0, DT_REAL, _state); ae_vector_init(&d, 0, DT_REAL, _state); ae_vector_init(&xc, 0, DT_REAL, _state); ae_vector_init(&s, 0, DT_REAL, _state); ae_vector_init(&bndl, 0, DT_REAL, _state); ae_vector_init(&bndu, 0, DT_REAL, _state); ae_vector_init(&scaledbndl, 0, DT_REAL, _state); ae_vector_init(&scaledbndu, 0, DT_REAL, _state); ae_vector_init(&xrfirst, 0, DT_REAL, _state); ae_vector_init(&xrlast, 0, DT_REAL, _state); ae_matrix_init(&a, 0, 0, DT_REAL, _state); _minnsstate_init(&state, _state); _minnsreport_init(&rep, _state); passcount = 10; maxn = 5; /* * First test: * * sparse function * * test that problem is successfully solved * * non-unit scale is used, which results in badly conditioned problem * * check that all iterates are feasible (box-constrained) */ for(pass=1; pass<=passcount; pass++) { for(n=1; n<=maxn; n++) { ae_vector_set_length(&x0, n, _state); ae_vector_set_length(&bndl, n, _state); ae_vector_set_length(&bndu, n, _state); ae_vector_set_length(&xc, n, _state); ae_vector_set_length(&d, n, _state); ae_vector_set_length(&s, n, _state); ae_vector_set_length(&xrfirst, n, _state); ae_vector_set_length(&xrlast, n, _state); for(i=0; i<=n-1; i++) { x0.ptr.p_double[i] = 10*(2*ae_randomreal(_state)-1); xc.ptr.p_double[i] = 2*ae_randomreal(_state)-1; d.ptr.p_double[i] = ae_pow((double)(10), 2*(2*ae_randomreal(_state)-1), _state); s.ptr.p_double[i] = ae_pow((double)(10), 2*(2*ae_randomreal(_state)-1), _state); bndl.ptr.p_double[i] = _state->v_neginf; bndu.ptr.p_double[i] = _state->v_posinf; k = ae_randominteger(5, _state); if( k==1 ) { bndl.ptr.p_double[i] = 2*ae_randomreal(_state)-1; } if( k==2 ) { bndu.ptr.p_double[i] = 2*ae_randomreal(_state)-1; } if( k==3 ) { v0 = 2*ae_randomreal(_state)-1; v1 = 2*ae_randomreal(_state)-1; bndl.ptr.p_double[i] = ae_minreal(v0, v1, _state); bndu.ptr.p_double[i] = ae_maxreal(v0, v1, _state); } if( k==4 ) { bndl.ptr.p_double[i] = 2*ae_randomreal(_state)-1; bndu.ptr.p_double[i] = bndl.ptr.p_double[i]; } } requirexrep = ae_fp_greater(ae_randomreal(_state),0.5); epsrad = 0.01*ae_pow((double)(10), -2*ae_randomreal(_state), _state); xtol = 15.0*epsrad; minnscreate(n, &x0, &state, _state); minnssetalgoags(&state, 0.1, 0.0, _state); minnssetcond(&state, epsrad, 0, _state); minnssetbc(&state, &bndl, &bndu, _state); minnssetscale(&state, &s, _state); if( requirexrep ) { minnssetxrep(&state, ae_true, _state); } werexreports = ae_false; repferr = 0.0; while(minnsiteration(&state, _state)) { if( state.needfij ) { state.fi.ptr.p_double[0] = 0.0; for(i=0; i<=n-1; i++) { state.fi.ptr.p_double[0] = state.fi.ptr.p_double[0]+d.ptr.p_double[i]*ae_fabs(state.x.ptr.p_double[i]-xc.ptr.p_double[i], _state); state.j.ptr.pp_double[0][i] = d.ptr.p_double[i]*ae_sign(state.x.ptr.p_double[i]-xc.ptr.p_double[i], _state); } continue; } if( state.xupdated ) { if( !werexreports ) { ae_v_move(&xrfirst.ptr.p_double[0], 1, &state.x.ptr.p_double[0], 1, ae_v_len(0,n-1)); } ae_v_move(&xrlast.ptr.p_double[0], 1, &state.x.ptr.p_double[0], 1, ae_v_len(0,n-1)); werexreports = ae_true; v = 0.0; for(i=0; i<=n-1; i++) { v = v+d.ptr.p_double[i]*ae_fabs(state.x.ptr.p_double[i]-xc.ptr.p_double[i], _state); } repferr = ae_maxreal(repferr, ae_fabs(v-state.f, _state), _state); for(i=0; i<=n-1; i++) { seterrorflag(primaryerrors, 
ae_fp_less(state.x.ptr.p_double[i],bndl.ptr.p_double[i]), _state); seterrorflag(primaryerrors, ae_fp_greater(state.x.ptr.p_double[i],bndu.ptr.p_double[i]), _state); } continue; } ae_assert(ae_false, "Assertion failed", _state); } minnsresults(&state, &x1, &rep, _state); seterrorflag(primaryerrors, rep.terminationtype<=0, _state); seterrorflag(othererrors, werexreports&&!requirexrep, _state); seterrorflag(othererrors, requirexrep&&!werexreports, _state); seterrorflag(othererrors, ae_fp_greater(repferr,10000*ae_machineepsilon), _state); if( *primaryerrors||(*othererrors) ) { ae_frame_leave(_state); return; } for(i=0; i<=n-1; i++) { seterrorflag(primaryerrors, !ae_isfinite(x1.ptr.p_double[i], _state)||ae_fp_greater(ae_fabs(x1.ptr.p_double[i]-boundval(xc.ptr.p_double[i], bndl.ptr.p_double[i], bndu.ptr.p_double[i], _state), _state)/s.ptr.p_double[i],xtol), _state); seterrorflag(primaryerrors, ae_fp_less(x1.ptr.p_double[i],bndl.ptr.p_double[i]), _state); seterrorflag(primaryerrors, ae_fp_greater(x1.ptr.p_double[i],bndu.ptr.p_double[i]), _state); if( requirexrep ) { seterrorflag(othererrors, !ae_isfinite(xrfirst.ptr.p_double[i], _state)||ae_fp_greater(ae_fabs(boundval(x0.ptr.p_double[i], bndl.ptr.p_double[i], bndu.ptr.p_double[i], _state)-xrfirst.ptr.p_double[i], _state),100*ae_machineepsilon), _state); seterrorflag(othererrors, !ae_isfinite(xrlast.ptr.p_double[i], _state)||ae_fp_greater(ae_fabs(x1.ptr.p_double[i]-xrlast.ptr.p_double[i], _state),100*ae_machineepsilon), _state); } } } } /* * A bit harder test: * * dense quadratic function (smooth), may be prone to different * rounding-related issues * * non-negativity box constraints * * unit scale is used * * extreme stopping criteria (EpsX=1.0E-12) * * single pass for each problem size * * check that constrained gradient at solution is small */ conda = 1.0E3; epsrad = 1.0E-12; for(n=1; n<=10; n++) { ae_vector_set_length(&x0, n, _state); ae_vector_set_length(&bndl, n, _state); ae_vector_set_length(&bndu, n, _state); ae_vector_set_length(&b, n, _state); for(i=0; i<=n-1; i++) { x0.ptr.p_double[i] = 1.0; b.ptr.p_double[i] = ae_randomreal(_state)-0.5; bndl.ptr.p_double[i] = 0.0; bndu.ptr.p_double[i] = _state->v_posinf; } spdmatrixrndcond(n, conda, &a, _state); minnscreate(n, &x0, &state, _state); minnssetalgoags(&state, 0.1, 0.0, _state); minnssetcond(&state, epsrad, 0, _state); minnssetbc(&state, &bndl, &bndu, _state); while(minnsiteration(&state, _state)) { if( state.needfij ) { state.fi.ptr.p_double[0] = 0.0; for(i=0; i<=n-1; i++) { state.j.ptr.pp_double[0][i] = 0.0; } for(i=0; i<=n-1; i++) { state.fi.ptr.p_double[0] = state.fi.ptr.p_double[0]+b.ptr.p_double[i]*state.x.ptr.p_double[i]; for(j=0; j<=n-1; j++) { state.fi.ptr.p_double[0] = state.fi.ptr.p_double[0]+0.5*state.x.ptr.p_double[i]*a.ptr.pp_double[i][j]*state.x.ptr.p_double[j]; } } for(i=0; i<=n-1; i++) { state.j.ptr.pp_double[0][i] = state.j.ptr.pp_double[0][i]+b.ptr.p_double[i]; for(j=0; j<=n-1; j++) { state.j.ptr.pp_double[0][i] = state.j.ptr.pp_double[0][i]+a.ptr.pp_double[i][j]*state.x.ptr.p_double[j]; } } continue; } ae_assert(ae_false, "Assertion failed", _state); } minnsresults(&state, &x1, &rep, _state); seterrorflag(primaryerrors, rep.terminationtype<=0, _state); if( *primaryerrors||(*othererrors) ) { ae_frame_leave(_state); return; } gnorm = 0.0; for(i=0; i<=n-1; i++) { v = b.ptr.p_double[i]; for(j=0; j<=n-1; j++) { v = v+a.ptr.pp_double[i][j]*x1.ptr.p_double[j]; } if( ae_fp_eq(x1.ptr.p_double[i],bndl.ptr.p_double[i])&&ae_fp_greater(v,(double)(0)) ) { v = (double)(0); } if( 
ae_fp_eq(x1.ptr.p_double[i],bndu.ptr.p_double[i])&&ae_fp_less(v,(double)(0)) ) { v = (double)(0); } gnorm = gnorm+ae_sqr(v, _state); seterrorflag(primaryerrors, ae_fp_less(x1.ptr.p_double[i],bndl.ptr.p_double[i]), _state); seterrorflag(primaryerrors, ae_fp_greater(x1.ptr.p_double[i],bndu.ptr.p_double[i]), _state); } gnorm = ae_sqrt(gnorm, _state); seterrorflag(primaryerrors, ae_fp_greater(gnorm,1.0E-5), _state); } /* * Test on HIGHLY nonconvex bound constrained problem. * Algorithm should be able to stop. * * NOTE: because algorithm can be attracted to saddle points, * x[i] may be -1, +1 or approximately zero. */ for(pass=1; pass<=passcount; pass++) { for(n=1; n<=maxn; n++) { ae_vector_set_length(&x0, n, _state); ae_vector_set_length(&bndl, n, _state); ae_vector_set_length(&bndu, n, _state); for(i=0; i<=n-1; i++) { x0.ptr.p_double[i] = ae_randomreal(_state)-0.5; bndl.ptr.p_double[i] = -1.0; bndu.ptr.p_double[i] = 1.0; } epsrad = 0.0001; xtol = 15.0*epsrad; minnscreate(n, &x0, &state, _state); minnssetalgoags(&state, 0.1, 0.0, _state); minnssetcond(&state, epsrad, 0, _state); minnssetbc(&state, &bndl, &bndu, _state); v = -1000.0; while(minnsiteration(&state, _state)) { if( state.needfij ) { state.fi.ptr.p_double[0] = 0.0; for(i=0; i<=n-1; i++) { v0 = ae_fabs(state.x.ptr.p_double[i], _state); v1 = (double)(ae_sign(state.x.ptr.p_double[i], _state)); state.fi.ptr.p_double[0] = state.fi.ptr.p_double[0]+v*(v0+v0*v0); state.j.ptr.pp_double[0][i] = v*(v1+2*v0*v1); } continue; } ae_assert(ae_false, "Assertion failed", _state); } minnsresults(&state, &x1, &rep, _state); seterrorflag(primaryerrors, rep.terminationtype<=0, _state); for(i=0; i<=n-1; i++) { v = ae_fabs(x1.ptr.p_double[i], _state); seterrorflag(primaryerrors, !ae_isfinite(x1.ptr.p_double[i], _state), _state); seterrorflag(primaryerrors, ae_fp_neq(v,1.0)&&ae_fp_greater(v,xtol), _state); } } } /* * Test numerical differentiation: * * test that problem is successfully solved * * test that correct function value is reported * * test that all iterates are within bound-constrained area */ for(pass=1; pass<=passcount; pass++) { for(n=1; n<=maxn; n++) { ae_vector_set_length(&x0, n, _state); ae_vector_set_length(&xc, n, _state); ae_vector_set_length(&bndl, n, _state); ae_vector_set_length(&bndu, n, _state); ae_vector_set_length(&d, n, _state); ae_vector_set_length(&s, n, _state); ae_vector_set_length(&xrlast, n, _state); for(i=0; i<=n-1; i++) { x0.ptr.p_double[i] = 10*(2*ae_randomreal(_state)-1); xc.ptr.p_double[i] = 2*ae_randomreal(_state)-1; d.ptr.p_double[i] = ae_pow((double)(10), 2*(2*ae_randomreal(_state)-1), _state); s.ptr.p_double[i] = ae_pow((double)(10), 2*(2*ae_randomreal(_state)-1), _state); bndl.ptr.p_double[i] = _state->v_neginf; bndu.ptr.p_double[i] = _state->v_posinf; k = ae_randominteger(5, _state); if( k==1 ) { bndl.ptr.p_double[i] = 2*ae_randomreal(_state)-1; } if( k==2 ) { bndu.ptr.p_double[i] = 2*ae_randomreal(_state)-1; } if( k==3 ) { v0 = 2*ae_randomreal(_state)-1; v1 = 2*ae_randomreal(_state)-1; bndl.ptr.p_double[i] = ae_minreal(v0, v1, _state); bndu.ptr.p_double[i] = ae_maxreal(v0, v1, _state); } if( k==4 ) { bndl.ptr.p_double[i] = 2*ae_randomreal(_state)-1; bndu.ptr.p_double[i] = bndl.ptr.p_double[i]; } } epsrad = 0.01*ae_pow((double)(10), -2*ae_randomreal(_state), _state); xtol = 15.0*epsrad; minnscreatef(n, &x0, epsrad/100, &state, _state); minnssetalgoags(&state, 0.1, 0.0, _state); minnssetcond(&state, epsrad, 0, _state); minnssetscale(&state, &s, _state); minnssetbc(&state, &bndl, &bndu, _state); minnssetxrep(&state, 
ae_true, _state); repferr = 0.0; while(minnsiteration(&state, _state)) { if( state.needfi ) { state.fi.ptr.p_double[0] = 0.0; for(i=0; i<=n-1; i++) { state.fi.ptr.p_double[0] = state.fi.ptr.p_double[0]+d.ptr.p_double[i]*ae_fabs(state.x.ptr.p_double[i]-xc.ptr.p_double[i], _state); seterrorflag(primaryerrors, ae_fp_less(state.x.ptr.p_double[i],bndl.ptr.p_double[i]), _state); seterrorflag(primaryerrors, ae_fp_greater(state.x.ptr.p_double[i],bndu.ptr.p_double[i]), _state); } continue; } if( state.xupdated ) { v = 0.0; for(i=0; i<=n-1; i++) { v = v+d.ptr.p_double[i]*ae_fabs(state.x.ptr.p_double[i]-xc.ptr.p_double[i], _state); seterrorflag(primaryerrors, ae_fp_less(state.x.ptr.p_double[i],bndl.ptr.p_double[i]), _state); seterrorflag(primaryerrors, ae_fp_greater(state.x.ptr.p_double[i],bndu.ptr.p_double[i]), _state); } repferr = ae_maxreal(repferr, ae_fabs(v-state.f, _state), _state); continue; } ae_assert(ae_false, "Assertion failed", _state); } minnsresults(&state, &x1, &rep, _state); seterrorflag(primaryerrors, rep.terminationtype<=0, _state); seterrorflag(othererrors, ae_fp_greater(repferr,10000*ae_machineepsilon), _state); if( *primaryerrors||(*othererrors) ) { ae_frame_leave(_state); return; } for(i=0; i<=n-1; i++) { seterrorflag(primaryerrors, !ae_isfinite(x1.ptr.p_double[i], _state)||ae_fp_greater(ae_fabs(x1.ptr.p_double[i]-boundval(xc.ptr.p_double[i], bndl.ptr.p_double[i], bndu.ptr.p_double[i], _state), _state)/s.ptr.p_double[i],xtol), _state); } } } /* * Test scaling: we perform several steps on unit-scale problem, * then we perform same amount of steps on re-scaled problem, * starting from same point (but scaled according to chosen scale). * * Correctly written optimizer should perform essentially same steps * (up to scale) on both problems. At least, it holds within first * several steps, before rounding errors start to accumulate. * * NOTE: we also check that correctly scaled points are reported. * And, as side effect, we check MinNSRestartFrom(). * * NOTE: we use very low scale and diagonal coefficients in order * to have well-conditioned system. We test correctness of * formulae here, not robustness of algorithm. 
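    *
    * Informal note: in the box constrained variant the bounds have to be rescaled
    * together with the variables, i.e. ScaledBndL = S*BndL and ScaledBndU = S*BndU
    * (see the code below), so that the feasible box of the second run is exactly
    * the image of the original box under y = S*x.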
*/ for(pass=1; pass<=passcount; pass++) { for(n=1; n<=maxn; n++) { ae_vector_set_length(&x0, n, _state); ae_vector_set_length(&xc, n, _state); ae_vector_set_length(&x0s, n, _state); ae_vector_set_length(&d, n, _state); ae_vector_set_length(&s, n, _state); ae_vector_set_length(&xrlast, n, _state); ae_vector_set_length(&bndl, n, _state); ae_vector_set_length(&bndu, n, _state); ae_vector_set_length(&scaledbndl, n, _state); ae_vector_set_length(&scaledbndu, n, _state); for(i=0; i<=n-1; i++) { s.ptr.p_double[i] = ae_pow((double)(10), ae_randomreal(_state)-0.5, _state); d.ptr.p_double[i] = ae_pow((double)(10), ae_randomreal(_state)-0.5, _state); x0.ptr.p_double[i] = 2*ae_randomreal(_state)-1; xc.ptr.p_double[i] = 2*ae_randomreal(_state)-1; x0s.ptr.p_double[i] = x0.ptr.p_double[i]*s.ptr.p_double[i]; bndl.ptr.p_double[i] = _state->v_neginf; bndu.ptr.p_double[i] = _state->v_posinf; k = ae_randominteger(5, _state); if( k==1 ) { bndl.ptr.p_double[i] = 2*ae_randomreal(_state)-1; } if( k==2 ) { bndu.ptr.p_double[i] = 2*ae_randomreal(_state)-1; } if( k==3 ) { v0 = 2*ae_randomreal(_state)-1; v1 = 2*ae_randomreal(_state)-1; bndl.ptr.p_double[i] = ae_minreal(v0, v1, _state); bndu.ptr.p_double[i] = ae_maxreal(v0, v1, _state); } if( k==4 ) { bndl.ptr.p_double[i] = 2*ae_randomreal(_state)-1; bndu.ptr.p_double[i] = bndl.ptr.p_double[i]; } scaledbndl.ptr.p_double[i] = bndl.ptr.p_double[i]*s.ptr.p_double[i]; scaledbndu.ptr.p_double[i] = bndu.ptr.p_double[i]*s.ptr.p_double[i]; } minnscreate(n, &x0, &state, _state); minnssetalgoags(&state, 0.01, 0.0, _state); minnssetcond(&state, 0.0, testminnsunit_scalingtestcnt, _state); minnssetbc(&state, &bndl, &bndu, _state); minnssetxrep(&state, ae_false, _state); while(minnsiteration(&state, _state)) { if( state.needfij ) { state.fi.ptr.p_double[0] = 0.0; for(i=0; i<=n-1; i++) { state.fi.ptr.p_double[0] = state.fi.ptr.p_double[0]+d.ptr.p_double[i]*ae_fabs(state.x.ptr.p_double[i]-xc.ptr.p_double[i], _state); state.j.ptr.pp_double[0][i] = d.ptr.p_double[i]*ae_sign(state.x.ptr.p_double[i]-xc.ptr.p_double[i], _state); } continue; } ae_assert(ae_false, "Assertion failed", _state); } minnsresults(&state, &x1, &rep, _state); seterrorflag(primaryerrors, rep.terminationtype<=0, _state); if( *primaryerrors||(*othererrors) ) { ae_frame_leave(_state); return; } minnssetscale(&state, &s, _state); minnssetbc(&state, &scaledbndl, &scaledbndu, _state); minnsrestartfrom(&state, &x0s, _state); while(minnsiteration(&state, _state)) { if( state.needfij ) { state.fi.ptr.p_double[0] = 0.0; for(i=0; i<=n-1; i++) { state.fi.ptr.p_double[0] = state.fi.ptr.p_double[0]+d.ptr.p_double[i]*ae_fabs(state.x.ptr.p_double[i]/s.ptr.p_double[i]-xc.ptr.p_double[i], _state); state.j.ptr.pp_double[0][i] = d.ptr.p_double[i]*ae_sign(state.x.ptr.p_double[i]/s.ptr.p_double[i]-xc.ptr.p_double[i], _state)/s.ptr.p_double[i]; } continue; } ae_assert(ae_false, "Assertion failed", _state); } minnsresults(&state, &x1s, &rep, _state); seterrorflag(primaryerrors, rep.terminationtype<=0, _state); if( *primaryerrors||(*othererrors) ) { ae_frame_leave(_state); return; } for(i=0; i<=n-1; i++) { seterrorflag(primaryerrors, (!ae_isfinite(x1.ptr.p_double[i], _state)||!ae_isfinite(x1s.ptr.p_double[i], _state))||ae_fp_greater(ae_fabs(x1.ptr.p_double[i]-x1s.ptr.p_double[i]/s.ptr.p_double[i], _state),testminnsunit_scalingtesttol), _state); } } } ae_frame_leave(_state); } /************************************************************************* Linearly constrained test 
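(Informal summary: this routine solves a smooth separable quadratic subject to
random linear constraints and cross-checks the constrained optimum against a
penalized unconstrained run, then exercises a highly nonconvex problem whose
linear constraints act as bounds, and finally repeats the scaling-consistency
check used in the unconstrained and box constrained tests.)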
*************************************************************************/ static void testminnsunit_testlc(ae_bool* primaryerrors, ae_bool* othererrors, ae_state *_state) { ae_frame _frame_block; ae_int_t n; ae_int_t i; ae_int_t j; ae_int_t k; ae_int_t nc; ae_vector x0; ae_vector x0s; ae_vector x1; ae_vector x2; ae_vector x1s; ae_vector d; ae_vector xc; ae_vector s; ae_vector bndl; ae_vector bndu; ae_matrix c; ae_matrix scaledc; ae_vector ct; ae_vector scaledbndl; ae_vector scaledbndu; ae_vector xrfirst; ae_vector xrlast; minnsstate state; minnsreport rep; double v; double v0; double v1; double vv; double flast0; double flast1; ae_int_t pass; double epsrad; double repferr; double xtol; double ftol; double rho; ae_frame_make(_state, &_frame_block); ae_vector_init(&x0, 0, DT_REAL, _state); ae_vector_init(&x0s, 0, DT_REAL, _state); ae_vector_init(&x1, 0, DT_REAL, _state); ae_vector_init(&x2, 0, DT_REAL, _state); ae_vector_init(&x1s, 0, DT_REAL, _state); ae_vector_init(&d, 0, DT_REAL, _state); ae_vector_init(&xc, 0, DT_REAL, _state); ae_vector_init(&s, 0, DT_REAL, _state); ae_vector_init(&bndl, 0, DT_REAL, _state); ae_vector_init(&bndu, 0, DT_REAL, _state); ae_matrix_init(&c, 0, 0, DT_REAL, _state); ae_matrix_init(&scaledc, 0, 0, DT_REAL, _state); ae_vector_init(&ct, 0, DT_INT, _state); ae_vector_init(&scaledbndl, 0, DT_REAL, _state); ae_vector_init(&scaledbndu, 0, DT_REAL, _state); ae_vector_init(&xrfirst, 0, DT_REAL, _state); ae_vector_init(&xrlast, 0, DT_REAL, _state); _minnsstate_init(&state, _state); _minnsreport_init(&rep, _state); for(pass=1; pass<=10; pass++) { for(n=1; n<=5; n++) { /* * First test: * * smooth problem * * subject to random linear constraints * * with non-unit scale * * We: * * compare function value at constrained solution with function * value for penalized unconstrained problem. We do not compare * actual X-values returned, because they are highly unstable - * function values at minimum show better stability. 
* * check that correct function values are reported */ ae_vector_set_length(&x0, n, _state); ae_vector_set_length(&xc, n, _state); ae_vector_set_length(&d, n, _state); ae_vector_set_length(&s, n, _state); for(i=0; i<=n-1; i++) { x0.ptr.p_double[i] = 10*(2*ae_randomreal(_state)-1); xc.ptr.p_double[i] = 2*ae_randomreal(_state)-1; d.ptr.p_double[i] = 1+ae_randomreal(_state); s.ptr.p_double[i] = 1+ae_randomreal(_state); } nc = ae_randominteger((n+1)/2, _state); if( nc>0 ) { ae_matrix_set_length(&c, nc, n+1, _state); ae_vector_set_length(&ct, nc, _state); for(i=0; i<=nc-1; i++) { ct.ptr.p_int[i] = ae_randominteger(3, _state)-1; for(j=0; j<=n; j++) { c.ptr.pp_double[i][j] = ae_randomreal(_state)-0.5; } } } epsrad = 0.00001; ftol = 0.01; minnscreate(n, &x0, &state, _state); minnssetalgoags(&state, 0.1, 0.0, _state); minnssetcond(&state, epsrad, 0, _state); minnssetscale(&state, &s, _state); minnssetxrep(&state, ae_true, _state); minnssetlc(&state, &c, &ct, nc, _state); repferr = 0.0; flast0 = _state->v_nan; while(minnsiteration(&state, _state)) { if( state.needfij ) { state.fi.ptr.p_double[0] = 0.0; for(i=0; i<=n-1; i++) { state.fi.ptr.p_double[0] = state.fi.ptr.p_double[0]+d.ptr.p_double[i]*ae_sqr(state.x.ptr.p_double[i]-xc.ptr.p_double[i], _state); state.j.ptr.pp_double[0][i] = d.ptr.p_double[i]*(2*(state.x.ptr.p_double[i]-xc.ptr.p_double[i])); } continue; } if( state.xupdated ) { flast0 = 0.0; for(i=0; i<=n-1; i++) { flast0 = flast0+d.ptr.p_double[i]*ae_sqr(state.x.ptr.p_double[i]-xc.ptr.p_double[i], _state); } repferr = ae_maxreal(repferr, ae_fabs(flast0-state.f, _state), _state); continue; } ae_assert(ae_false, "Assertion failed", _state); } minnsresults(&state, &x1, &rep, _state); seterrorflag(primaryerrors, rep.terminationtype<=0, _state); seterrorflag(primaryerrors, !ae_isfinite(flast0, _state), _state); seterrorflag(othererrors, ae_fp_greater(repferr,10000*ae_machineepsilon), _state); if( *primaryerrors||(*othererrors) ) { ae_frame_leave(_state); return; } minnssetlc(&state, &c, &ct, 0, _state); minnsrestartfrom(&state, &x0, _state); rho = 1000.0; repferr = 0.0; flast1 = _state->v_nan; while(minnsiteration(&state, _state)) { if( state.needfij ) { state.fi.ptr.p_double[0] = 0.0; for(i=0; i<=n-1; i++) { state.fi.ptr.p_double[0] = state.fi.ptr.p_double[0]+d.ptr.p_double[i]*ae_sqr(state.x.ptr.p_double[i]-xc.ptr.p_double[i], _state); state.j.ptr.pp_double[0][i] = d.ptr.p_double[i]*(2*(state.x.ptr.p_double[i]-xc.ptr.p_double[i])); } for(i=0; i<=nc-1; i++) { v = ae_v_dotproduct(&state.x.ptr.p_double[0], 1, &c.ptr.pp_double[i][0], 1, ae_v_len(0,n-1)); v = v-c.ptr.pp_double[i][n]; vv = 0.0; if( ct.ptr.p_int[i]<0 ) { vv = (double)(ae_sign(ae_maxreal(v, 0.0, _state), _state)); v = ae_maxreal(v, 0.0, _state); } if( ct.ptr.p_int[i]==0 ) { vv = (double)(ae_sign(v, _state)); v = ae_fabs(v, _state); } if( ct.ptr.p_int[i]>0 ) { vv = (double)(-ae_sign(ae_maxreal(-v, 0.0, _state), _state)); v = ae_maxreal(-v, 0.0, _state); } state.fi.ptr.p_double[0] = state.fi.ptr.p_double[0]+rho*v; for(j=0; j<=n-1; j++) { state.j.ptr.pp_double[0][j] = state.j.ptr.pp_double[0][j]+rho*vv*c.ptr.pp_double[i][j]; } } continue; } if( state.xupdated ) { flast1 = 0.0; for(i=0; i<=n-1; i++) { flast1 = flast1+d.ptr.p_double[i]*ae_sqr(state.x.ptr.p_double[i]-xc.ptr.p_double[i], _state); } continue; } ae_assert(ae_false, "Assertion failed", _state); } minnsresults(&state, &x2, &rep, _state); seterrorflag(primaryerrors, rep.terminationtype<=0, _state); seterrorflag(primaryerrors, !ae_isfinite(flast1, _state), _state); if( 
*primaryerrors||(*othererrors) ) { ae_frame_leave(_state); return; } seterrorflag(primaryerrors, ae_fp_greater(ae_fabs(flast0-flast1, _state),ftol), _state); /* * Test on HIGHLY nonconvex linearly constrained problem. * Algorithm should be able to stop at the bounds. */ ae_vector_set_length(&x0, n, _state); ae_matrix_set_length(&c, 2*n, n+1, _state); ae_vector_set_length(&ct, 2*n, _state); for(i=0; i<=n-1; i++) { x0.ptr.p_double[i] = ae_randomreal(_state)-0.5; for(j=0; j<=n-1; j++) { c.ptr.pp_double[2*i+0][j] = 0.0; c.ptr.pp_double[2*i+1][j] = 0.0; } c.ptr.pp_double[2*i+0][i] = 1.0; c.ptr.pp_double[2*i+0][n] = -1.0; ct.ptr.p_int[2*i+0] = 1; c.ptr.pp_double[2*i+1][i] = 1.0; c.ptr.pp_double[2*i+1][n] = 1.0; ct.ptr.p_int[2*i+1] = -1; } epsrad = 0.0001; xtol = 15.0*epsrad; minnscreate(n, &x0, &state, _state); minnssetalgoags(&state, 0.1, 0.0, _state); minnssetcond(&state, epsrad, 0, _state); minnssetlc(&state, &c, &ct, 2*n, _state); v = -1000.0; while(minnsiteration(&state, _state)) { if( state.needfij ) { state.fi.ptr.p_double[0] = 0.0; for(i=0; i<=n-1; i++) { v0 = ae_fabs(state.x.ptr.p_double[i], _state); v1 = (double)(ae_sign(state.x.ptr.p_double[i], _state)); state.fi.ptr.p_double[0] = state.fi.ptr.p_double[0]+v*(v0+v0*v0); state.j.ptr.pp_double[0][i] = v*(v1+2*v0*v1); } continue; } ae_assert(ae_false, "Assertion failed", _state); } minnsresults(&state, &x1, &rep, _state); seterrorflag(primaryerrors, rep.terminationtype<=0, _state); for(i=0; i<=n-1; i++) { seterrorflag(primaryerrors, !ae_isfinite(x1.ptr.p_double[i], _state), _state); seterrorflag(primaryerrors, (ae_fp_greater(ae_fabs(x1.ptr.p_double[i]-1, _state),xtol)&&ae_fp_greater(ae_fabs(x1.ptr.p_double[i], _state),xtol))&&ae_fp_greater(ae_fabs(x1.ptr.p_double[i]+1, _state),xtol), _state); } /* * Test scaling: we perform several steps on unit-scale problem, * then we perform same amount of steps on re-scaled problem, * starting from same point (but scaled according to chosen scale). * * Correctly written optimizer should perform essentially same steps * (up to scale) on both problems. At least, it holds within first * several steps, before rounding errors start to accumulate. * * NOTE: we also check that correctly scaled points are reported. * And, as side effect, we check MinNSRestartFrom(). * * NOTE: we use moderate scale and diagonal coefficients in order * to have well-conditioned system. We test correctness of * formulae here, not robustness of algorithm. 
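            *
            * Informal note on how the constraints are rescaled below: a row of C stores
            * coefficients C[i][0..n-1] and the right-hand side C[i][n]; under y = S*x
            * the constraint sum_j C[i][j]*x[j] (=, >= or <=) C[i][n] becomes
            * sum_j (C[i][j]/S[j])*y[j] (=, >= or <=) C[i][n], which is exactly how
            * ScaledC is built: coefficients divided by S, right-hand side unchanged.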
*/ ae_vector_set_length(&x0, n, _state); ae_vector_set_length(&xc, n, _state); ae_vector_set_length(&x0s, n, _state); ae_vector_set_length(&d, n, _state); ae_vector_set_length(&s, n, _state); ae_vector_set_length(&xrlast, n, _state); ae_matrix_set_length(&c, 2*n, n+1, _state); ae_matrix_set_length(&scaledc, 2*n, n+1, _state); ae_vector_set_length(&ct, 2*n, _state); for(i=0; i<=2*n-1; i++) { ct.ptr.p_int[i] = 0; for(j=0; j<=n; j++) { c.ptr.pp_double[i][j] = (double)(0); } } for(i=0; i<=n-1; i++) { s.ptr.p_double[i] = ae_pow((double)(10), 2*ae_randomreal(_state)-1, _state); d.ptr.p_double[i] = ae_pow((double)(10), 2*ae_randomreal(_state)-1, _state); x0.ptr.p_double[i] = 2*ae_randomreal(_state)-1; xc.ptr.p_double[i] = 2*ae_randomreal(_state)-1; x0s.ptr.p_double[i] = x0.ptr.p_double[i]*s.ptr.p_double[i]; k = ae_randominteger(5, _state); if( k==1 ) { c.ptr.pp_double[2*i+0][i] = 1.0; c.ptr.pp_double[2*i+0][n] = 2*ae_randomreal(_state)-1; ct.ptr.p_int[2*i+0] = 1; } if( k==2 ) { c.ptr.pp_double[2*i+0][i] = 1.0; c.ptr.pp_double[2*i+0][n] = 2*ae_randomreal(_state)-1; ct.ptr.p_int[2*i+0] = -1; } if( k==3 ) { v0 = 2*ae_randomreal(_state)-1; v1 = 2*ae_randomreal(_state)-1; c.ptr.pp_double[2*i+0][i] = 1.0; c.ptr.pp_double[2*i+0][n] = ae_minreal(v0, v1, _state); c.ptr.pp_double[2*i+1][i] = 1.0; c.ptr.pp_double[2*i+1][n] = ae_maxreal(v0, v1, _state); ct.ptr.p_int[2*i+0] = 1; ct.ptr.p_int[2*i+1] = -1; } if( k==4 ) { c.ptr.pp_double[2*i+0][i] = 1.0; c.ptr.pp_double[2*i+0][n] = 2*ae_randomreal(_state)-1; ct.ptr.p_int[2*i+0] = 0; } } for(i=0; i<=2*n-1; i++) { for(j=0; j<=n-1; j++) { scaledc.ptr.pp_double[i][j] = c.ptr.pp_double[i][j]/s.ptr.p_double[j]; } scaledc.ptr.pp_double[i][n] = c.ptr.pp_double[i][n]; } minnscreate(n, &x0, &state, _state); minnssetalgoags(&state, 0.1, 0.0, _state); minnssetcond(&state, 0.0, testminnsunit_scalingtestcnt, _state); minnssetlc(&state, &c, &ct, 2*n, _state); minnssetxrep(&state, ae_false, _state); while(minnsiteration(&state, _state)) { if( state.needfij ) { state.fi.ptr.p_double[0] = 0.0; for(i=0; i<=n-1; i++) { state.fi.ptr.p_double[0] = state.fi.ptr.p_double[0]+d.ptr.p_double[i]*ae_fabs(state.x.ptr.p_double[i]-xc.ptr.p_double[i], _state); state.j.ptr.pp_double[0][i] = d.ptr.p_double[i]*ae_sign(state.x.ptr.p_double[i]-xc.ptr.p_double[i], _state); } continue; } ae_assert(ae_false, "Assertion failed", _state); } minnsresults(&state, &x1, &rep, _state); seterrorflag(primaryerrors, rep.terminationtype<=0, _state); if( *primaryerrors||(*othererrors) ) { ae_frame_leave(_state); return; } minnssetscale(&state, &s, _state); minnssetlc(&state, &scaledc, &ct, 2*n, _state); minnsrestartfrom(&state, &x0s, _state); while(minnsiteration(&state, _state)) { if( state.needfij ) { state.fi.ptr.p_double[0] = 0.0; for(i=0; i<=n-1; i++) { state.fi.ptr.p_double[0] = state.fi.ptr.p_double[0]+d.ptr.p_double[i]*ae_fabs(state.x.ptr.p_double[i]/s.ptr.p_double[i]-xc.ptr.p_double[i], _state); state.j.ptr.pp_double[0][i] = d.ptr.p_double[i]*ae_sign(state.x.ptr.p_double[i]/s.ptr.p_double[i]-xc.ptr.p_double[i], _state)/s.ptr.p_double[i]; } continue; } ae_assert(ae_false, "Assertion failed", _state); } minnsresults(&state, &x1s, &rep, _state); seterrorflag(primaryerrors, rep.terminationtype<=0, _state); if( *primaryerrors||(*othererrors) ) { ae_frame_leave(_state); return; } for(i=0; i<=n-1; i++) { seterrorflag(primaryerrors, (!ae_isfinite(x1.ptr.p_double[i], _state)||!ae_isfinite(x1s.ptr.p_double[i], _state))||ae_fp_greater(ae_fabs(x1.ptr.p_double[i]-x1s.ptr.p_double[i]/s.ptr.p_double[i], 
_state),testminnsunit_scalingtesttol), _state); } } } ae_frame_leave(_state); } /************************************************************************* Nonlinearly constrained test *************************************************************************/ static void testminnsunit_testnlc(ae_bool* primaryerrors, ae_bool* othererrors, ae_state *_state) { ae_frame _frame_block; ae_int_t n; ae_int_t i; ae_int_t j; ae_int_t nc; ae_int_t nec; ae_vector x0; ae_vector x0s; ae_vector x1; ae_vector x2; ae_vector x1s; ae_vector d; ae_vector xc; ae_vector s; ae_vector bndl; ae_vector bndu; ae_vector b; ae_vector r; ae_matrix c; ae_matrix scaledc; ae_vector ct; ae_vector scaledbndl; ae_vector scaledbndu; ae_vector xrfirst; ae_vector xrlast; minnsstate state; minnsreport rep; double v; ae_int_t pass; ae_int_t passcount; double epsrad; double xtol; double rho; ae_int_t maxn; double diffstep; ae_frame_make(_state, &_frame_block); ae_vector_init(&x0, 0, DT_REAL, _state); ae_vector_init(&x0s, 0, DT_REAL, _state); ae_vector_init(&x1, 0, DT_REAL, _state); ae_vector_init(&x2, 0, DT_REAL, _state); ae_vector_init(&x1s, 0, DT_REAL, _state); ae_vector_init(&d, 0, DT_REAL, _state); ae_vector_init(&xc, 0, DT_REAL, _state); ae_vector_init(&s, 0, DT_REAL, _state); ae_vector_init(&bndl, 0, DT_REAL, _state); ae_vector_init(&bndu, 0, DT_REAL, _state); ae_vector_init(&b, 0, DT_REAL, _state); ae_vector_init(&r, 0, DT_REAL, _state); ae_matrix_init(&c, 0, 0, DT_REAL, _state); ae_matrix_init(&scaledc, 0, 0, DT_REAL, _state); ae_vector_init(&ct, 0, DT_INT, _state); ae_vector_init(&scaledbndl, 0, DT_REAL, _state); ae_vector_init(&scaledbndu, 0, DT_REAL, _state); ae_vector_init(&xrfirst, 0, DT_REAL, _state); ae_vector_init(&xrlast, 0, DT_REAL, _state); _minnsstate_init(&state, _state); _minnsreport_init(&rep, _state); passcount = 10; maxn = 5; rho = 100.0; /* * First test: * * simple problem * * subject to random nonlinear constraints of form r[i]*x[i] OPERATION 0.0, * where OPERATION is <= or = * * with non-unit scale * * We: * * compare numerical solution with analytic one, which can be * easily calculated */ for(pass=1; pass<=passcount; pass++) { for(n=1; n<=maxn; n++) { for(nc=1; nc<=n; nc++) { for(nec=0; nec<=nc; nec++) { ae_vector_set_length(&x0, n, _state); ae_vector_set_length(&xc, n, _state); ae_vector_set_length(&d, n, _state); ae_vector_set_length(&r, n, _state); ae_vector_set_length(&s, n, _state); for(i=0; i<=n-1; i++) { x0.ptr.p_double[i] = 2*ae_randomreal(_state)-1; xc.ptr.p_double[i] = 2*ae_randomreal(_state)-1; d.ptr.p_double[i] = ae_pow((double)(10), ae_randomreal(_state)-0.5, _state); s.ptr.p_double[i] = ae_pow((double)(10), ae_randomreal(_state)-0.5, _state); r.ptr.p_double[i] = (2*ae_randominteger(2, _state)-1)*(0.1+ae_randomreal(_state)); } epsrad = 0.001; xtol = 0.01; minnscreate(n, &x0, &state, _state); minnssetalgoags(&state, 0.1, rho, _state); minnssetcond(&state, epsrad, 0, _state); minnssetscale(&state, &s, _state); minnssetnlc(&state, nec, nc-nec, _state); while(minnsiteration(&state, _state)) { if( state.needfij ) { state.fi.ptr.p_double[0] = 0.0; for(i=0; i<=n-1; i++) { state.fi.ptr.p_double[0] = state.fi.ptr.p_double[0]+d.ptr.p_double[i]*ae_fabs(state.x.ptr.p_double[i]-xc.ptr.p_double[i], _state); state.j.ptr.pp_double[0][i] = d.ptr.p_double[i]*ae_sign(state.x.ptr.p_double[i]-xc.ptr.p_double[i], _state); } for(i=1; i<=nc; i++) { state.fi.ptr.p_double[i] = state.x.ptr.p_double[i-1]*r.ptr.p_double[i-1]; for(j=0; j<=n-1; j++) { state.j.ptr.pp_double[i][j] = 0.0; } state.j.ptr.pp_double[i][i-1] = 
r.ptr.p_double[i-1]; } continue; } ae_assert(ae_false, "Assertion failed", _state); } minnsresults(&state, &x1, &rep, _state); seterrorflag(primaryerrors, rep.terminationtype<=0, _state); if( *primaryerrors||(*othererrors) ) { ae_frame_leave(_state); return; } for(i=0; i<=n-1; i++) { v = xc.ptr.p_double[i]; if( i=nec&&i=nec&&i0 ) { for(i=0; i<=n-1; i++) { seterrorflag(errorflag, ae_fp_greater(ae_fabs(x.ptr.p_double[i]-boundval(xe.ptr.p_double[i], bl.ptr.p_double[i], bu.ptr.p_double[i], _state), _state),5.0E-2), _state); } } else { seterrorflag(errorflag, ae_true, _state); } } } /* * Minimize * * [ [ ]2 ] * SUM_i[ SUM_j[ power(x_j,3)*c_ij ] ] * [ [ ] ] * * subject to non-negativity constraints on x_j */ epsx = 1.0E-9; for(tmpkind=0; tmpkind<=1; tmpkind++) { for(n=1; n<=20; n++) { m = n+hqrnduniformi(&rs, n, _state); ae_vector_set_length(&bl, n, _state); ae_vector_set_length(&bu, n, _state); ae_vector_set_length(&x0, n, _state); for(i=0; i<=n-1; i++) { bl.ptr.p_double[i] = (double)(0); bu.ptr.p_double[i] = _state->v_posinf; x0.ptr.p_double[i] = 1+hqrnduniformr(&rs, _state); } ae_matrix_set_length(&c, m, n+1, _state); for(i=0; i<=m-1; i++) { for(j=0; j<=n; j++) { c.ptr.pp_double[i][j] = hqrndnormal(&rs, _state); } } if( tmpkind==0 ) { minlmcreatev(n, m, &x0, 10*epsx, &state, _state); } if( tmpkind==1 ) { minlmcreatevj(n, m, &x0, &state, _state); } minlmsetcond(&state, epsx, 0, _state); minlmsetbc(&state, &bl, &bu, _state); while(minlmiteration(&state, _state)) { for(i=0; i<=n-1; i++) { seterrorflag(errorflag, ae_fp_less(state.x.ptr.p_double[i],bl.ptr.p_double[i]), _state); seterrorflag(errorflag, ae_fp_greater(state.x.ptr.p_double[i],bu.ptr.p_double[i]), _state); } if( state.needfi ) { testminlmunit_testfunc1(n, m, &c, &state.x, &v, ae_false, &state.fi, ae_true, &state.j, ae_false, _state); continue; } if( state.needfij ) { testminlmunit_testfunc1(n, m, &c, &state.x, &v, ae_false, &state.fi, ae_true, &state.j, ae_true, _state); continue; } ae_assert(ae_false, "minlm test: integrity check failed", _state); } minlmresults(&state, &x, &rep, _state); seterrorflag(errorflag, rep.terminationtype<=0, _state); if( *errorflag ) { ae_frame_leave(_state); return; } testminlmunit_testfunc1(n, m, &c, &x, &f0, ae_true, &state.fi, ae_false, &state.j, ae_false, _state); ae_vector_set_length(&x1, n, _state); h = 0.001; for(i=0; i<=n-1; i++) { seterrorflag(errorflag, ae_fp_less(x.ptr.p_double[i],bl.ptr.p_double[i]), _state); seterrorflag(errorflag, ae_fp_greater(x.ptr.p_double[i],bu.ptr.p_double[i]), _state); if( ae_fp_greater_eq(x.ptr.p_double[i]+h,bl.ptr.p_double[i]) ) { for(j=0; j<=n-1; j++) { x1.ptr.p_double[j] = x.ptr.p_double[j]; } x1.ptr.p_double[i] = x.ptr.p_double[i]+h; testminlmunit_testfunc1(n, m, &c, &x1, &f1, ae_true, &state.fi, ae_false, &state.j, ae_false, _state); seterrorflag(errorflag, ae_fp_less(f1,f0), _state); } if( ae_fp_greater_eq(x.ptr.p_double[i]-h,bl.ptr.p_double[i]) ) { for(j=0; j<=n-1; j++) { x1.ptr.p_double[j] = x.ptr.p_double[j]; } x1.ptr.p_double[i] = x.ptr.p_double[i]-h; testminlmunit_testfunc1(n, m, &c, &x1, &f1, ae_true, &state.fi, ae_false, &state.j, ae_false, _state); seterrorflag(errorflag, ae_fp_less(f1,f0), _state); } } } } /* * Minimize * * [ [ ]2 ] * SUM_i[ SUM_j[ power(x_j,3)*c_ij ] ] * [ [ ] ] * * subject to random box constraints on x_j */ epsx = 1.0E-9; for(tmpkind=0; tmpkind<=1; tmpkind++) { for(n=1; n<=20; n++) { m = n+hqrnduniformi(&rs, n, _state); ae_vector_set_length(&bl, n, _state); ae_vector_set_length(&bu, n, _state); ae_vector_set_length(&x0, n, _state); 
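            /*
             * Informal note, inferred from the checks below: after MinLM returns, local
             * optimality is verified directly - each coordinate of the result is
             * perturbed by +/-H (H=0.001), perturbations that leave the box [BL,BU] are
             * skipped, and the target recomputed at the perturbed point must not be
             * smaller than at the returned point.
             */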
for(i=0; i<=n-1; i++) { bl.ptr.p_double[i] = hqrndnormal(&rs, _state); bu.ptr.p_double[i] = bl.ptr.p_double[i]+hqrnduniformr(&rs, _state); x0.ptr.p_double[i] = 1+hqrnduniformr(&rs, _state); } ae_matrix_set_length(&c, m, n+1, _state); for(i=0; i<=m-1; i++) { for(j=0; j<=n; j++) { c.ptr.pp_double[i][j] = hqrndnormal(&rs, _state); } } if( tmpkind==0 ) { minlmcreatev(n, m, &x0, 10*epsx, &state, _state); } if( tmpkind==1 ) { minlmcreatevj(n, m, &x0, &state, _state); } minlmsetcond(&state, epsx, 0, _state); minlmsetbc(&state, &bl, &bu, _state); while(minlmiteration(&state, _state)) { for(i=0; i<=n-1; i++) { seterrorflag(errorflag, ae_fp_less(state.x.ptr.p_double[i],bl.ptr.p_double[i]), _state); seterrorflag(errorflag, ae_fp_greater(state.x.ptr.p_double[i],bu.ptr.p_double[i]), _state); } if( state.needfi ) { testminlmunit_testfunc1(n, m, &c, &state.x, &v, ae_false, &state.fi, ae_true, &state.j, ae_false, _state); continue; } if( state.needfij ) { testminlmunit_testfunc1(n, m, &c, &state.x, &v, ae_false, &state.fi, ae_true, &state.j, ae_true, _state); continue; } ae_assert(ae_false, "minlm test: integrity check failed", _state); } minlmresults(&state, &x, &rep, _state); seterrorflag(errorflag, rep.terminationtype<=0, _state); if( *errorflag ) { ae_frame_leave(_state); return; } testminlmunit_testfunc1(n, m, &c, &x, &f0, ae_true, &state.fi, ae_false, &state.j, ae_false, _state); ae_vector_set_length(&x1, n, _state); h = 0.001; for(i=0; i<=n-1; i++) { seterrorflag(errorflag, ae_fp_less(x.ptr.p_double[i],bl.ptr.p_double[i]), _state); seterrorflag(errorflag, ae_fp_greater(x.ptr.p_double[i],bu.ptr.p_double[i]), _state); if( ae_fp_greater_eq(x.ptr.p_double[i]+h,bl.ptr.p_double[i])&&ae_fp_less_eq(x.ptr.p_double[i]+h,bu.ptr.p_double[i]) ) { for(j=0; j<=n-1; j++) { x1.ptr.p_double[j] = x.ptr.p_double[j]; } x1.ptr.p_double[i] = x.ptr.p_double[i]+h; testminlmunit_testfunc1(n, m, &c, &x1, &f1, ae_true, &state.fi, ae_false, &state.j, ae_false, _state); seterrorflag(errorflag, ae_fp_less(f1,f0), _state); } if( ae_fp_greater_eq(x.ptr.p_double[i]-h,bl.ptr.p_double[i])&&ae_fp_less_eq(x.ptr.p_double[i]-h,bu.ptr.p_double[i]) ) { for(j=0; j<=n-1; j++) { x1.ptr.p_double[j] = x.ptr.p_double[j]; } x1.ptr.p_double[i] = x.ptr.p_double[i]-h; testminlmunit_testfunc1(n, m, &c, &x1, &f1, ae_true, &state.fi, ae_false, &state.j, ae_false, _state); seterrorflag(errorflag, ae_fp_less(f1,f0), _state); } } } } ae_frame_leave(_state); } /************************************************************************* Test for linearly constrained problems. On failure sets error flag, leaves it unchanged on success. 
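Linear constraints are passed in the same row format as elsewhere in ALGLIB:
row i of RawC holds N coefficients followed by the right-hand side in column N,
and RawCT[i] selects the constraint type (positive for '>=', negative for '<=',
zero for '='); this test only generates the two inequality types.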
*************************************************************************/ static void testminlmunit_testlc(ae_bool* errorflag, ae_state *_state) { ae_frame _frame_block; minlmstate state; minlmreport rep; ae_vector bl; ae_vector bu; ae_int_t n; ae_int_t m; ae_int_t m1; ae_int_t m2; ae_vector x0; ae_vector x; ae_vector xe; ae_vector x1; ae_vector x12; ae_vector d; ae_matrix rawc; ae_vector rawct; ae_int_t rawccnt; ae_matrix c; ae_matrix c12; ae_matrix z; ae_int_t trialidx; ae_int_t i; ae_int_t j; double v; double h; ae_int_t optkind; hqrndstate rs; double epsx; double xtol; double f0; double f1; ae_bool bflag; ae_frame_make(_state, &_frame_block); _minlmstate_init(&state, _state); _minlmreport_init(&rep, _state); ae_vector_init(&bl, 0, DT_REAL, _state); ae_vector_init(&bu, 0, DT_REAL, _state); ae_vector_init(&x0, 0, DT_REAL, _state); ae_vector_init(&x, 0, DT_REAL, _state); ae_vector_init(&xe, 0, DT_REAL, _state); ae_vector_init(&x1, 0, DT_REAL, _state); ae_vector_init(&x12, 0, DT_REAL, _state); ae_vector_init(&d, 0, DT_REAL, _state); ae_matrix_init(&rawc, 0, 0, DT_REAL, _state); ae_vector_init(&rawct, 0, DT_INT, _state); ae_matrix_init(&c, 0, 0, DT_REAL, _state); ae_matrix_init(&c12, 0, 0, DT_REAL, _state); ae_matrix_init(&z, 0, 0, DT_REAL, _state); _hqrndstate_init(&rs, _state); hqrndrandomize(&rs, _state); /* * Minimize * * [ [ ]2 ] * SUM_i[ SUM_j[ (0.1*x_j+power(x_j,3))*c_ij ] ] * [ [ ] ] * * subject to mix of box and linear inequality constraints on x_j * * We check correctness of solution by sampling a few random points * around one returned by optimizer, and comparing function value * with target. Sampling is performed with respect to inequality * constraints. */ epsx = 1.0E-12; xtol = 1.0E-8; for(optkind=0; optkind<=1; optkind++) { for(n=5; n<=20; n++) { /* * Generate problem */ m = n+hqrnduniformi(&rs, n, _state); ae_vector_set_length(&bl, n, _state); ae_vector_set_length(&bu, n, _state); ae_vector_set_length(&x0, n, _state); for(i=0; i<=n-1; i++) { bl.ptr.p_double[i] = hqrndnormal(&rs, _state); bu.ptr.p_double[i] = bl.ptr.p_double[i]+hqrnduniformr(&rs, _state); x0.ptr.p_double[i] = bl.ptr.p_double[i]+(bu.ptr.p_double[i]-bl.ptr.p_double[i])*hqrnduniformr(&rs, _state); } ae_matrix_set_length(&c, m, n+1, _state); for(i=0; i<=m-1; i++) { for(j=0; j<=n; j++) { c.ptr.pp_double[i][j] = hqrndnormal(&rs, _state); } } ae_assert(n>=5, "Assertion failed", _state); rawccnt = 3; ae_matrix_set_length(&rawc, rawccnt, n+1, _state); ae_vector_set_length(&rawct, rawccnt, _state); for(i=0; i<=rawccnt-1; i++) { v = (double)(0); for(j=0; j<=n-1; j++) { rawc.ptr.pp_double[i][j] = hqrndnormal(&rs, _state); v = v+x0.ptr.p_double[j]*rawc.ptr.pp_double[i][j]; } rawc.ptr.pp_double[i][n] = v; rawct.ptr.p_int[i] = 2*hqrnduniformi(&rs, 2, _state)-1; } /* * Solve */ if( optkind==0 ) { minlmcreatev(n, m, &x0, 10*epsx, &state, _state); } if( optkind==1 ) { minlmcreatevj(n, m, &x0, &state, _state); } minlmsetcond(&state, epsx, 0, _state); minlmsetbc(&state, &bl, &bu, _state); minlmsetlc(&state, &rawc, &rawct, rawccnt, _state); while(minlmiteration(&state, _state)) { for(i=0; i<=n-1; i++) { seterrorflag(errorflag, ae_fp_less(state.x.ptr.p_double[i],bl.ptr.p_double[i]), _state); seterrorflag(errorflag, ae_fp_greater(state.x.ptr.p_double[i],bu.ptr.p_double[i]), _state); } if( state.needfi ) { testminlmunit_testfunc1(n, m, &c, &state.x, &v, ae_false, &state.fi, ae_true, &state.j, ae_false, _state); continue; } if( state.needfij ) { testminlmunit_testfunc1(n, m, &c, &state.x, &v, ae_false, &state.fi, ae_true, &state.j, 
ae_true, _state); continue; } ae_assert(ae_false, "minlm test: integrity check failed", _state); } minlmresults(&state, &x, &rep, _state); seterrorflag(errorflag, rep.terminationtype<=0, _state); if( *errorflag ) { ae_frame_leave(_state); return; } /* * Test feasibility w.r.t. box and linear inequality constraints */ for(i=0; i<=n-1; i++) { seterrorflag(errorflag, ae_fp_less(x.ptr.p_double[i],bl.ptr.p_double[i]), _state); seterrorflag(errorflag, ae_fp_greater(x.ptr.p_double[i],bu.ptr.p_double[i]), _state); } for(i=0; i<=rawccnt-1; i++) { v = ae_v_dotproduct(&rawc.ptr.pp_double[i][0], 1, &x.ptr.p_double[0], 1, ae_v_len(0,n-1)); v = v-rawc.ptr.pp_double[i][n]; if( rawct.ptr.p_int[i]>0 ) { seterrorflag(errorflag, ae_fp_less(v,-xtol), _state); } if( rawct.ptr.p_int[i]<0 ) { seterrorflag(errorflag, ae_fp_greater(v,xtol), _state); } } /* * Make several random trial steps and: * 0) generate small random trial step * 1) if step is infeasible, skip to next trial * 2) compare function value in the trial point against one in other points */ testminlmunit_testfunc1(n, m, &c, &x, &f0, ae_true, &state.fi, ae_false, &state.j, ae_false, _state); ae_vector_set_length(&x1, n, _state); for(trialidx=0; trialidx<=10*n; trialidx++) { h = 0.001; for(i=0; i<=n-1; i++) { do { x1.ptr.p_double[i] = x.ptr.p_double[i]+(hqrnduniformr(&rs, _state)*2-1)*h; } while(!(ae_fp_greater_eq(x1.ptr.p_double[i],bl.ptr.p_double[i])&&ae_fp_less_eq(x1.ptr.p_double[i],bu.ptr.p_double[i]))); } for(i=0; i<=rawccnt-1; i++) { ae_assert(rawct.ptr.p_int[i]!=0, "Assertion failed", _state); v = ae_v_dotproduct(&rawc.ptr.pp_double[i][0], 1, &x1.ptr.p_double[0], 1, ae_v_len(0,n-1)); v = v-rawc.ptr.pp_double[i][n]; bflag = bflag||(rawct.ptr.p_int[i]>0&&ae_fp_less(v,(double)(0))); bflag = bflag||(rawct.ptr.p_int[i]<0&&ae_fp_greater(v,(double)(0))); } if( bflag ) { continue; } testminlmunit_testfunc1(n, m, &c, &x1, &f1, ae_true, &state.fi, ae_false, &state.j, ae_false, _state); seterrorflag(errorflag, ae_fp_less(f1,f0), _state); } } } /* * Minimize * * [ [ ]2 ] * SUM_i[ SUM_j[ (0.1*x_j+power(x_j,3))*c_ij ] ] * [ [ ] ] * * subject to linear EQUALITY constraints on x_j. * * We check correctness of solution by sampling a few random points * around one returned by optimizer, and comparing function value * with target. Sampling is performed with respect to equality * constraints. In order to simplify algorithm we use orthogonal * equality constraints. * * NOTE: we solve problem using VJ mode (analytic Jacobian) because * roundoff errors from numerical differentiation sometimes * prevent us from converging with good precision. 
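*
* NOTE: constraint rows are taken from a random orthogonal matrix, so they
*       are orthonormal; this is what allows the trial-step loop below to
*       project a random step onto the feasible subspace simply by
*       subtracting its dot products with the constraint rows.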
*/ epsx = 1.0E-12; xtol = 1.0E-8; optkind = 1; for(n=10; n<=20; n++) { /* * Generate problem */ m = n+hqrnduniformi(&rs, n, _state); ae_vector_set_length(&x0, n, _state); for(i=0; i<=n-1; i++) { x0.ptr.p_double[i] = hqrndnormal(&rs, _state); } ae_matrix_set_length(&c, m, n+1, _state); for(i=0; i<=m-1; i++) { for(j=0; j<=n; j++) { c.ptr.pp_double[i][j] = hqrndnormal(&rs, _state); } } ae_assert(n>=5, "Assertion failed", _state); rawccnt = 1+hqrnduniformi(&rs, 5, _state); rmatrixrndorthogonal(n, &z, _state); ae_matrix_set_length(&rawc, rawccnt, n+1, _state); ae_vector_set_length(&rawct, rawccnt, _state); for(i=0; i<=rawccnt-1; i++) { v = (double)(0); for(j=0; j<=n-1; j++) { rawc.ptr.pp_double[i][j] = z.ptr.pp_double[i][j]; v = v+x0.ptr.p_double[j]*rawc.ptr.pp_double[i][j]; } rawc.ptr.pp_double[i][n] = v; rawct.ptr.p_int[i] = 0; } /* * Solve */ if( optkind==0 ) { minlmcreatev(n, m, &x0, 1.0E-12, &state, _state); } if( optkind==1 ) { minlmcreatevj(n, m, &x0, &state, _state); } minlmsetcond(&state, epsx, 0, _state); minlmsetlc(&state, &rawc, &rawct, rawccnt, _state); while(minlmiteration(&state, _state)) { if( state.needfi ) { testminlmunit_testfunc1(n, m, &c, &state.x, &v, ae_false, &state.fi, ae_true, &state.j, ae_false, _state); continue; } if( state.needfij ) { testminlmunit_testfunc1(n, m, &c, &state.x, &v, ae_false, &state.fi, ae_true, &state.j, ae_true, _state); continue; } ae_assert(ae_false, "minlm test: integrity check failed", _state); } minlmresults(&state, &x, &rep, _state); seterrorflag(errorflag, rep.terminationtype<=0, _state); if( *errorflag ) { ae_frame_leave(_state); return; } /* * Test feasibility w.r.t. linear equality constraints */ for(i=0; i<=rawccnt-1; i++) { v = ae_v_dotproduct(&rawc.ptr.pp_double[i][0], 1, &x.ptr.p_double[0], 1, ae_v_len(0,n-1)); v = v-rawc.ptr.pp_double[i][n]; seterrorflag(errorflag, ae_fp_greater(ae_fabs(v, _state),xtol), _state); } /* * Make several random trial steps and: * 0) generate small random trial step * 1) project it onto equality constrained subspace * 2) compare function value in the trial point against one in other points */ testminlmunit_testfunc1(n, m, &c, &x, &f0, ae_true, &state.fi, ae_false, &state.j, ae_false, _state); ae_vector_set_length(&x1, n, _state); for(trialidx=0; trialidx<=10*n; trialidx++) { h = 0.001; for(i=0; i<=n-1; i++) { x1.ptr.p_double[i] = hqrndnormal(&rs, _state); } for(i=0; i<=rawccnt-1; i++) { v = ae_v_dotproduct(&rawc.ptr.pp_double[i][0], 1, &x1.ptr.p_double[0], 1, ae_v_len(0,n-1)); ae_v_subd(&x1.ptr.p_double[0], 1, &rawc.ptr.pp_double[i][0], 1, ae_v_len(0,n-1), v); } v = ae_v_dotproduct(&x1.ptr.p_double[0], 1, &x1.ptr.p_double[0], 1, ae_v_len(0,n-1)); ae_assert(ae_fp_greater(v,(double)(0)), "Assertion failed", _state); v = h/ae_sqrt(v, _state); ae_v_muld(&x1.ptr.p_double[0], 1, ae_v_len(0,n-1), v); ae_v_add(&x1.ptr.p_double[0], 1, &x.ptr.p_double[0], 1, ae_v_len(0,n-1)); testminlmunit_testfunc1(n, m, &c, &x1, &f1, ae_true, &state.fi, ae_false, &state.j, ae_false, _state); seterrorflag(errorflag, ae_fp_less(f1,f0), _state); } } /* * Minimize * * [ [ ]2 ] [ [ ]2 ] * SUM_i[ SUM_j[ (0.1*x_j+power(x0_j,3))*c0_ij ] ] + SUM_i[ SUM_j[ (0.1*x_j+power(x1_j,3))*c1_ij ] ] * [ [ ] ] [ [ ] ] * * for two sets of unknowns (x0_j and x1_j) and two sets of * coefficients (c0_ij and c1_ij, M1*N and M2*N matrices) subject * to equality constraint * * x0_j=x1_j for all j * * Such optimization problem arises when we fit same model to * two distinct datasets and want to share SOME of coefficients * between fits. 
If we share ALL coefficients, it is equal to * fitting one model to combination of two datasets. * * Our test checks that such "combined" 2N-dimensional problem * solved with general linear constraints which "glue" two datasets * together returns same answer as N-dimensional problem on (M1+M2)-point * dataset. * * NOTE: we solve problem using VJ mode (analytic Jacobian) because * roundoff errors from numerical differentiation prevent us * from converging with good precision. */ epsx = 1.0E-12; for(n=5; n<=20; n++) { /* * Generate problems */ m1 = n+hqrnduniformi(&rs, n, _state); m2 = n+hqrnduniformi(&rs, n, _state); ae_vector_set_length(&x0, n, _state); ae_vector_set_length(&x12, 2*n, _state); for(i=0; i<=n-1; i++) { x0.ptr.p_double[i] = hqrndnormal(&rs, _state); x12.ptr.p_double[2*i+0] = hqrndnormal(&rs, _state); x12.ptr.p_double[2*i+1] = hqrndnormal(&rs, _state); } ae_matrix_set_length(&c, m1+m2, n+1, _state); ae_matrix_set_length(&c12, m1+m2, 2*n+1, _state); for(i=0; i<=m1+m2-1; i++) { for(j=0; j<=2*n; j++) { c12.ptr.pp_double[i][j] = (double)(0); } } for(i=0; i<=m1+m2-1; i++) { for(j=0; j<=n-1; j++) { c.ptr.pp_double[i][j] = hqrndnormal(&rs, _state); if( iv_nan; } if( i==1 ) { v = _state->v_posinf; } if( i==2 ) { v = _state->v_neginf; } if( ae_fp_greater(ae_randomreal(_state),0.5) ) { state.fi.ptr.p_double[ae_randominteger(m, _state)] = v; } else { state.j.ptr.pp_double[ae_randominteger(m, _state)][ae_randominteger(n, _state)] = v; } inc(&spoilcnt, _state); } continue; } if( state.needfi ) { for(i=0; i<=n-2; i++) { state.fi.ptr.p_double[2*i+0] = s*(state.x.ptr.p_double[i+1]-ae_sqr(state.x.ptr.p_double[i], _state)); state.fi.ptr.p_double[2*i+1] = 1-state.x.ptr.p_double[i]; } mx = (double)(0); for(i=0; i<=n-1; i++) { mx = ae_maxreal(mx, ae_fabs(state.x.ptr.p_double[i]-1, _state), _state); } if( ae_fp_less(mx,1.0E-2) ) { i = ae_randominteger(3, _state); if( i==0 ) { v = _state->v_nan; } if( i==1 ) { v = _state->v_posinf; } if( i==2 ) { v = _state->v_neginf; } state.fi.ptr.p_double[ae_randominteger(m, _state)] = v; inc(&spoilcnt, _state); } continue; } ae_assert(ae_false, "Assertion failed", _state); } minlmresults(&state, &x, &rep, _state); seterrorflag(errorflag, rep.terminationtype!=-8, _state); seterrorflag(errorflag, spoilcnt!=1, _state); ae_frame_leave(_state); } /************************************************************************* Asserts that State fields are consistent with RKind. Returns False otherwise. 
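It also verifies that exactly one of the request fields (NeedFI, NeedF, NeedFG, NeedFIJ, NeedFGH, XUpdated) is set at a time.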
RKind is an algorithm selector: * -2 = V, AccType=1 * -1 = V, AccType=0 * 0 = FJ * 1 = FGJ * 2 = FGH * 3 = VJ, AccType=0 * 4 = VJ, AccType=1 * 5 = VJ, AccType=2 *************************************************************************/ static ae_bool testminlmunit_rkindvsstatecheck(ae_int_t rkind, minlmstate* state, ae_state *_state) { ae_int_t nset; ae_bool result; nset = 0; if( state->needfi ) { nset = nset+1; } if( state->needf ) { nset = nset+1; } if( state->needfg ) { nset = nset+1; } if( state->needfij ) { nset = nset+1; } if( state->needfgh ) { nset = nset+1; } if( state->xupdated ) { nset = nset+1; } if( nset!=1 ) { result = ae_false; return result; } if( rkind==-2 ) { result = state->needfi||state->xupdated; return result; } if( rkind==-1 ) { result = state->needfi||state->xupdated; return result; } if( rkind==0 ) { result = (state->needf||state->needfij)||state->xupdated; return result; } if( rkind==1 ) { result = ((state->needf||state->needfij)||state->needfg)||state->xupdated; return result; } if( rkind==2 ) { result = ((state->needf||state->needfg)||state->needfgh)||state->xupdated; return result; } if( rkind==3 ) { result = (state->needfi||state->needfij)||state->xupdated; return result; } if( rkind==4 ) { result = (state->needfi||state->needfij)||state->xupdated; return result; } if( rkind==5 ) { result = (state->needfi||state->needfij)||state->xupdated; return result; } result = ae_false; return result; } /************************************************************************* Calculates FI/F/G/H for problem min(||Ax-b||) *************************************************************************/ static void testminlmunit_axmb(minlmstate* state, /* Real */ ae_matrix* a, /* Real */ ae_vector* b, ae_int_t n, ae_state *_state) { ae_int_t i; ae_int_t j; ae_int_t k; double v; if( (state->needf||state->needfg)||state->needfgh ) { state->f = (double)(0); } if( state->needfg||state->needfgh ) { for(i=0; i<=n-1; i++) { state->g.ptr.p_double[i] = (double)(0); } } if( state->needfgh ) { for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { state->h.ptr.pp_double[i][j] = (double)(0); } } } for(i=0; i<=n-1; i++) { v = ae_v_dotproduct(&a->ptr.pp_double[i][0], 1, &state->x.ptr.p_double[0], 1, ae_v_len(0,n-1)); if( (state->needf||state->needfg)||state->needfgh ) { state->f = state->f+ae_sqr(v-b->ptr.p_double[i], _state); } if( state->needfg||state->needfgh ) { for(j=0; j<=n-1; j++) { state->g.ptr.p_double[j] = state->g.ptr.p_double[j]+2*(v-b->ptr.p_double[i])*a->ptr.pp_double[i][j]; } } if( state->needfgh ) { for(j=0; j<=n-1; j++) { for(k=0; k<=n-1; k++) { state->h.ptr.pp_double[j][k] = state->h.ptr.pp_double[j][k]+2*a->ptr.pp_double[i][j]*a->ptr.pp_double[i][k]; } } } if( state->needfi ) { state->fi.ptr.p_double[i] = v-b->ptr.p_double[i]; } if( state->needfij ) { state->fi.ptr.p_double[i] = v-b->ptr.p_double[i]; ae_v_move(&state->j.ptr.pp_double[i][0], 1, &a->ptr.pp_double[i][0], 1, ae_v_len(0,n-1)); } } } /************************************************************************* This function tries to reproduce previously fixed bugs; in case of bug being present sets Err to True; leaves it unchanged otherwise. 
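The case below covers a bound-constrained V-mode (numerical differentiation) problem started from an infeasible point; the returned point must satisfy the box constraints.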
*************************************************************************/ static void testminlmunit_tryreproducefixedbugs(ae_bool* err, ae_state *_state) { ae_frame _frame_block; minlmstate s; minlmreport rep; ae_vector bl; ae_vector bu; ae_vector x; ae_frame_make(_state, &_frame_block); _minlmstate_init(&s, _state); _minlmreport_init(&rep, _state); ae_vector_init(&bl, 0, DT_REAL, _state); ae_vector_init(&bu, 0, DT_REAL, _state); ae_vector_init(&x, 0, DT_REAL, _state); /* * Reproduce bug reported by ISS: * when solving bound constrained problem with numerical differentiation * and starting from infeasible point, we won't stop at the feasible point */ ae_vector_set_length(&x, 2, _state); ae_vector_set_length(&bl, 2, _state); ae_vector_set_length(&bu, 2, _state); x.ptr.p_double[0] = 2.0; bl.ptr.p_double[0] = -1.0; bu.ptr.p_double[0] = 1.0; x.ptr.p_double[1] = 2.0; bl.ptr.p_double[1] = -1.0; bu.ptr.p_double[1] = 1.0; minlmcreatev(2, 2, &x, 0.001, &s, _state); minlmsetbc(&s, &bl, &bu, _state); while(minlmiteration(&s, _state)) { if( s.needfi ) { s.fi.ptr.p_double[0] = ae_sqr(s.x.ptr.p_double[0], _state); s.fi.ptr.p_double[1] = ae_sqr(s.x.ptr.p_double[1], _state); } } minlmresults(&s, &x, &rep, _state); seterrorflag(err, ((ae_fp_less(x.ptr.p_double[0],bl.ptr.p_double[0])||ae_fp_greater(x.ptr.p_double[0],bu.ptr.p_double[0]))||ae_fp_less(x.ptr.p_double[1],bl.ptr.p_double[1]))||ae_fp_greater(x.ptr.p_double[1],bu.ptr.p_double[1]), _state); ae_frame_leave(_state); } /************************************************************************* This function tests, that gradient verified correctly. *************************************************************************/ static ae_bool testminlmunit_gradientchecktest(ae_state *_state) { ae_frame _frame_block; minlmstate state; minlmreport rep; ae_int_t n; ae_int_t m; ae_vector a; ae_vector x0; ae_vector x; ae_vector bl; ae_vector bu; ae_int_t infcomp; double teststep; double noise; double rndconst; ae_int_t nbrfunc; ae_int_t nbrcomp; ae_int_t func; ae_int_t pass; ae_int_t passcount; ae_int_t i; ae_bool result; ae_frame_make(_state, &_frame_block); _minlmstate_init(&state, _state); _minlmreport_init(&rep, _state); ae_vector_init(&a, 0, DT_REAL, _state); ae_vector_init(&x0, 0, DT_REAL, _state); ae_vector_init(&x, 0, DT_REAL, _state); ae_vector_init(&bl, 0, DT_REAL, _state); ae_vector_init(&bu, 0, DT_REAL, _state); passcount = 35; teststep = 0.01; for(pass=1; pass<=passcount; pass++) { n = ae_randominteger(10, _state)+1; m = ae_randominteger(10, _state)+1; ae_vector_set_length(&x, n, _state); ae_vector_set_length(&x0, n, _state); ae_vector_set_length(&a, n, _state); ae_vector_set_length(&bl, n, _state); ae_vector_set_length(&bu, n, _state); /* * Prepare test's parameters */ func = ae_randominteger(3, _state)+1; nbrfunc = ae_randominteger(m, _state); nbrcomp = ae_randominteger(n, _state); noise = (double)(2*ae_randominteger(2, _state)-1); rndconst = 2*ae_randomreal(_state)-1; /* * Prepare function's parameters */ for(i=0; i<=n-1; i++) { x.ptr.p_double[i] = 5*randomnormal(_state); a.ptr.p_double[i] = 5*ae_randomreal(_state)+1; x0.ptr.p_double[i] = 5*(2*ae_randomreal(_state)-1); } /* * Prepare boundary parameters */ for(i=0; i<=n-1; i++) { bl.ptr.p_double[i] = -3*ae_randomreal(_state)-0.1; bu.ptr.p_double[i] = 3*ae_randomreal(_state)+0.1; } infcomp = ae_randominteger(n+1, _state); if( infcompv_neginf; } infcomp = ae_randominteger(n+1, _state); if( infcompv_posinf; } minlmcreatevj(n, m, &x, &state, _state); minlmsetcond(&state, (double)(0), 0, _state); 
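/*
 * Enable verification of the user-supplied Jacobian with
 * differentiation step TestStep. The first run below reports a
 * correct Jacobian and must not be flagged (TerminationType must not
 * be -7, FuncIdx=VarIdx=-1); the second run adds Noise to one
 * Jacobian entry and must terminate with TerminationType=-7 and
 * FuncIdx/VarIdx pointing at the spoiled entry.
 */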
minlmsetgradientcheck(&state, teststep, _state); minlmsetbc(&state, &bl, &bu, _state); /* * Check that the criterion passes a derivative if it is correct */ while(minlmiteration(&state, _state)) { if( state.needfij ) { /* * Check hat .X within the boundaries */ for(i=0; i<=n-1; i++) { if( (ae_isfinite(bl.ptr.p_double[i], _state)&&ae_fp_less(state.x.ptr.p_double[i],bl.ptr.p_double[i]))||(ae_isfinite(bu.ptr.p_double[i], _state)&&ae_fp_greater(state.x.ptr.p_double[i],bu.ptr.p_double[i])) ) { result = ae_true; ae_frame_leave(_state); return result; } } testminlmunit_funcderiv(&a, &x0, &state.x, m, n, rndconst, func, &state.fi, &state.j, _state); } } minlmresults(&state, &x, &rep, _state); /* * Check that error code does not equal to -7 and parameter .VarIdx * equal to -1. */ if( (rep.terminationtype==-7||rep.funcidx!=-1)||rep.varidx!=-1 ) { result = ae_true; ae_frame_leave(_state); return result; } for(i=0; i<=n-1; i++) { x.ptr.p_double[i] = 5*randomnormal(_state); } minlmrestartfrom(&state, &x, _state); /* * Check that the criterion does not miss a derivative if * it is incorrect */ while(minlmiteration(&state, _state)) { if( state.needfij ) { for(i=0; i<=n-1; i++) { if( (ae_isfinite(bl.ptr.p_double[i], _state)&&ae_fp_less(state.x.ptr.p_double[i],bl.ptr.p_double[i]))||(ae_isfinite(bu.ptr.p_double[i], _state)&&ae_fp_greater(state.x.ptr.p_double[i],bu.ptr.p_double[i])) ) { result = ae_true; ae_frame_leave(_state); return result; } } testminlmunit_funcderiv(&a, &x0, &state.x, m, n, rndconst, func, &state.fi, &state.j, _state); state.j.ptr.pp_double[nbrfunc][nbrcomp] = state.j.ptr.pp_double[nbrfunc][nbrcomp]+noise; } } minlmresults(&state, &x, &rep, _state); /* * Check that error code equal to -7 and parameter .VarIdx * equal to number of incorrect component. */ if( (rep.terminationtype!=-7||rep.funcidx!=nbrfunc)||rep.varidx!=nbrcomp ) { result = ae_true; ae_frame_leave(_state); return result; } } result = ae_false; ae_frame_leave(_state); return result; } /************************************************************************* This function return function value and it derivatives. The number of functions is M, dimension for each of functions is N. F(XI)=SUM(fi(XI)); (XI={x,y,z}; i=0..M-1); Function's list: * funcType=1: fi(X)=(Aj*Xj-X0j)^2; * funcType=2: fi(X)=Aj*sin(Xj-X0j); * funcType=3: fi(X)=Aj*Xj-X0j; fM-1(X)=A(M-1)*((X(M-1)-X0(M-1))-(X0-X00)). 
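For M>N the extra components (i>=N) are filled with the constant AnyConst, so their derivatives are zero.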
*************************************************************************/ static void testminlmunit_funcderiv(/* Real */ ae_vector* a, /* Real */ ae_vector* x0, /* Real */ ae_vector* x, ae_int_t m, ae_int_t n, double anyconst, ae_int_t functype, /* Real */ ae_vector* f, /* Real */ ae_matrix* j, ae_state *_state) { ae_int_t i0; ae_int_t j0; ae_assert(functype>=1&&functype<=3, "FuncDeriv: incorrect funcType(funcType<1 or funcType>3).", _state); ae_assert(n>0, "FuncDeriv: N<=0", _state); ae_assert(m>0, "FuncDeriv: M<=0", _state); ae_assert(x->cnt>=n, "FuncDeriv: Length(X)cnt>=n, "FuncDeriv: Length(X0)cnt>=n, "FuncDeriv: Length(X)ptr.p_double[i0] = a->ptr.p_double[i0]*x->ptr.p_double[i0]-x0->ptr.p_double[i0]; } else { f->ptr.p_double[i0] = anyconst; } } for(i0=0; i0<=m-1; i0++) { for(j0=0; j0<=n-1; j0++) { if( i0==j0 ) { j->ptr.pp_double[i0][j0] = a->ptr.p_double[j0]; } else { j->ptr.pp_double[i0][j0] = (double)(0); } } } return; } if( functype==2 ) { for(i0=0; i0<=m-1; i0++) { if( i0ptr.p_double[i0] = a->ptr.p_double[i0]*ae_sin(x->ptr.p_double[i0]-x0->ptr.p_double[i0], _state); } else { f->ptr.p_double[i0] = anyconst; } } for(i0=0; i0<=m-1; i0++) { for(j0=0; j0<=n-1; j0++) { if( i0==j0 ) { j->ptr.pp_double[i0][j0] = a->ptr.p_double[j0]*ae_cos(x->ptr.p_double[j0]-x0->ptr.p_double[j0], _state); } else { j->ptr.pp_double[i0][j0] = (double)(0); } } } return; } if( functype==3 ) { for(i0=0; i0<=m-1; i0++) { if( i0ptr.p_double[i0] = a->ptr.p_double[i0]*x->ptr.p_double[i0]-x0->ptr.p_double[i0]; } else { f->ptr.p_double[i0] = anyconst; } } for(i0=0; i0<=m-1; i0++) { for(j0=0; j0<=n-1; j0++) { if( i0==j0 ) { j->ptr.pp_double[i0][j0] = a->ptr.p_double[j0]; } else { j->ptr.pp_double[i0][j0] = (double)(0); } } } if( m>n&&n>1 ) { f->ptr.p_double[n-1] = a->ptr.p_double[n-1]*(x->ptr.p_double[n-1]-x0->ptr.p_double[n-1]-(x->ptr.p_double[0]-x0->ptr.p_double[0])); j->ptr.pp_double[n-1][0] = -a->ptr.p_double[n-1]; j->ptr.pp_double[n-1][n-1] = a->ptr.p_double[n-1]; for(i0=1; i0<=n-2; i0++) { j->ptr.pp_double[n-1][i0] = (double)(0); } } return; } } /************************************************************************* Test function 1: F(N, M, C, X) = SUM( f_i^2 ) f_i = SUM( (power(x_j,3)+alpha*x_j)*c_ij ) *************************************************************************/ static void testminlmunit_testfunc1(ae_int_t n, ae_int_t m, /* Real */ ae_matrix* c, /* Real */ ae_vector* x, double* f, ae_bool needf, /* Real */ ae_vector* fi, ae_bool needfi, /* Real */ ae_matrix* jac, ae_bool needjac, ae_state *_state) { ae_int_t i; ae_int_t j; double v; double alpha; alpha = 0.01; if( needf ) { *f = (double)(0); } for(i=0; i<=m-1; i++) { v = c->ptr.pp_double[i][n]; for(j=0; j<=n-1; j++) { v = v+(alpha*x->ptr.p_double[j]+ae_pow(x->ptr.p_double[j], (double)(3), _state))*c->ptr.pp_double[i][j]; if( needjac ) { jac->ptr.pp_double[i][j] = (alpha+3*ae_pow(x->ptr.p_double[j], (double)(2), _state))*c->ptr.pp_double[i][j]; } } if( needfi ) { fi->ptr.p_double[i] = v; } if( needf ) { *f = *f+v*v; } } } static void testevdunit_rmatrixfillsparsea(/* Real */ ae_matrix* a, ae_int_t m, ae_int_t n, double sparcity, double diagmag, ae_state *_state); static void testevdunit_cmatrixfillsparsea(/* Complex */ ae_matrix* a, ae_int_t m, ae_int_t n, double sparcity, double diagmag, ae_state *_state); static void testevdunit_rmatrixsymmetricsplit(/* Real */ ae_matrix* a, ae_int_t n, /* Real */ ae_matrix* al, /* Real */ ae_matrix* au, ae_state *_state); static void testevdunit_cmatrixhermitiansplit(/* Complex */ ae_matrix* a, ae_int_t n, 
/* Complex */ ae_matrix* al, /* Complex */ ae_matrix* au, ae_state *_state); static void testevdunit_unset2d(/* Real */ ae_matrix* a, ae_state *_state); static void testevdunit_cunset2d(/* Complex */ ae_matrix* a, ae_state *_state); static void testevdunit_unset1d(/* Real */ ae_vector* a, ae_state *_state); static double testevdunit_tdtestproduct(/* Real */ ae_vector* d, /* Real */ ae_vector* e, ae_int_t n, /* Real */ ae_matrix* z, /* Real */ ae_vector* lambdav, ae_state *_state); static double testevdunit_testproduct(/* Real */ ae_matrix* a, ae_int_t n, /* Real */ ae_matrix* z, /* Real */ ae_vector* lambdav, ae_state *_state); static double testevdunit_testort(/* Real */ ae_matrix* z, ae_int_t n, ae_state *_state); static double testevdunit_testcproduct(/* Complex */ ae_matrix* a, ae_int_t n, /* Complex */ ae_matrix* z, /* Real */ ae_vector* lambdav, ae_state *_state); static double testevdunit_testcort(/* Complex */ ae_matrix* z, ae_int_t n, ae_state *_state); static void testevdunit_testsevdproblem(/* Real */ ae_matrix* a, /* Real */ ae_matrix* al, /* Real */ ae_matrix* au, ae_int_t n, double threshold, ae_bool* serrors, ae_int_t* failc, ae_int_t* runs, ae_state *_state); static void testevdunit_testhevdproblem(/* Complex */ ae_matrix* a, /* Complex */ ae_matrix* al, /* Complex */ ae_matrix* au, ae_int_t n, double threshold, ae_bool* herrors, ae_int_t* failc, ae_int_t* runs, ae_state *_state); static void testevdunit_testsevdbiproblem(/* Real */ ae_matrix* afull, /* Real */ ae_matrix* al, /* Real */ ae_matrix* au, ae_int_t n, ae_bool distvals, double threshold, ae_bool* serrors, ae_int_t* failc, ae_int_t* runs, ae_state *_state); static void testevdunit_testhevdbiproblem(/* Complex */ ae_matrix* afull, /* Complex */ ae_matrix* al, /* Complex */ ae_matrix* au, ae_int_t n, ae_bool distvals, double threshold, ae_bool* herrors, ae_int_t* failc, ae_int_t* runs, ae_state *_state); static void testevdunit_testtdevdproblem(/* Real */ ae_vector* d, /* Real */ ae_vector* e, ae_int_t n, double threshold, ae_bool* tderrors, ae_state *_state); static void testevdunit_testtdevdbiproblem(/* Real */ ae_vector* d, /* Real */ ae_vector* e, ae_int_t n, ae_bool distvals, double threshold, ae_bool* serrors, ae_int_t* failc, ae_int_t* runs, ae_state *_state); static void testevdunit_testnsevdproblem(/* Real */ ae_matrix* a, ae_int_t n, double threshold, ae_bool* nserrors, ae_state *_state); static void testevdunit_testevdset(ae_int_t n, double threshold, double bithreshold, ae_int_t* failc, ae_int_t* runs, ae_bool* nserrors, ae_bool* serrors, ae_bool* herrors, ae_bool* tderrors, ae_bool* sbierrors, ae_bool* hbierrors, ae_bool* tdbierrors, ae_state *_state); static void testevdunit_testsisymm(ae_bool* errorflag, ae_state *_state); /************************************************************************* Testing symmetric EVD subroutine *************************************************************************/ ae_bool testevd(ae_bool silent, ae_state *_state) { ae_frame _frame_block; ae_matrix ra; ae_int_t n; ae_int_t j; ae_int_t failc; ae_int_t runs; double failthreshold; double threshold; double bithreshold; ae_bool waserrors; ae_bool nserrors; ae_bool serrors; ae_bool herrors; ae_bool tderrors; ae_bool sbierrors; ae_bool hbierrors; ae_bool tdbierrors; ae_bool sisymmerrors; ae_bool wfailed; ae_bool result; ae_frame_make(_state, &_frame_block); ae_matrix_init(&ra, 0, 0, DT_REAL, _state); failthreshold = 0.005; threshold = 1.0E-8; bithreshold = 1.0E-6; nserrors = ae_false; serrors = ae_false; herrors = 
ae_false; tderrors = ae_false; sbierrors = ae_false; hbierrors = ae_false; tdbierrors = ae_false; sisymmerrors = ae_false; failc = 0; runs = 0; /* * Test subspace iteration solver */ testevdunit_testsisymm(&sisymmerrors, _state); /* * Test dense solvers */ for(n=1; n<=ablasblocksize(&ra, _state); n++) { testevdunit_testevdset(n, threshold, bithreshold, &failc, &runs, &nserrors, &serrors, &herrors, &tderrors, &sbierrors, &hbierrors, &tdbierrors, _state); } for(j=2; j<=3; j++) { for(n=j*ablasblocksize(&ra, _state)-1; n<=j*ablasblocksize(&ra, _state)+1; n++) { testevdunit_testevdset(n, threshold, bithreshold, &failc, &runs, &nserrors, &serrors, &herrors, &tderrors, &sbierrors, &hbierrors, &tdbierrors, _state); } } /* * report */ wfailed = ae_fp_greater((double)failc/(double)runs,failthreshold); waserrors = (((((((nserrors||serrors)||herrors)||tderrors)||sbierrors)||hbierrors)||tdbierrors)||wfailed)||sisymmerrors; if( !silent ) { printf("TESTING EVD UNIT\n"); printf("DENSE DIRECT SOLVERS:\n"); printf("* NS "); if( !nserrors ) { printf("OK\n"); } else { printf("FAILED\n"); } printf("* S "); if( !serrors ) { printf("OK\n"); } else { printf("FAILED\n"); } printf("* H "); if( !herrors ) { printf("OK\n"); } else { printf("FAILED\n"); } printf("* TD "); if( !tderrors ) { printf("OK\n"); } else { printf("FAILED\n"); } printf("* SBI "); if( !sbierrors ) { printf("OK\n"); } else { printf("FAILED\n"); } printf("* HBI "); if( !hbierrors ) { printf("OK\n"); } else { printf("FAILED\n"); } printf("* TDBI "); if( !tdbierrors ) { printf("OK\n"); } else { printf("FAILED\n"); } printf("* FAILURE THRESHOLD "); if( !wfailed ) { printf("OK\n"); } else { printf("FAILED\n"); } printf("ITERATIVE SOLVERS:\n"); printf("* SUBSPACE ITERATION (S) "); if( !sisymmerrors ) { printf("OK\n"); } else { printf("FAILED\n"); } if( waserrors ) { printf("TEST FAILED\n"); } else { printf("TEST PASSED\n"); } printf("\n\n"); } result = !waserrors; ae_frame_leave(_state); return result; } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. *************************************************************************/ ae_bool _pexec_testevd(ae_bool silent, ae_state *_state) { return testevd(silent, _state); } /************************************************************************* Sparse fill Sparcity - sparcity level, in [0,1] (0=dense matrix). DiagMAg - magnitude of dense diagonal entries; zero value means that diagonal is sparse too, non-zero value means that diagonal is dense *************************************************************************/ static void testevdunit_rmatrixfillsparsea(/* Real */ ae_matrix* a, ae_int_t m, ae_int_t n, double sparcity, double diagmag, ae_state *_state) { ae_int_t i; ae_int_t j; for(i=0; i<=m-1; i++) { for(j=0; j<=n-1; j++) { if( ae_fp_greater_eq(ae_randomreal(_state),sparcity) ) { a->ptr.pp_double[i][j] = 2*ae_randomreal(_state)-1; } else { a->ptr.pp_double[i][j] = (double)(0); } } } if( ae_fp_greater(diagmag,(double)(0)) ) { for(i=0; i<=ae_minint(m, n, _state)-1; i++) { a->ptr.pp_double[i][i] = diagmag*(2*ae_randomreal(_state)-1); } } } /************************************************************************* Sparse fill Sparcity - sparcity level, in [0,1] (0=dense matrix). 
DiagMAg - magnitude of dense diagonal entries; zero value means that diagonal is sparse too, non-zero value means that diagonal is dense *************************************************************************/ static void testevdunit_cmatrixfillsparsea(/* Complex */ ae_matrix* a, ae_int_t m, ae_int_t n, double sparcity, double diagmag, ae_state *_state) { ae_int_t i; ae_int_t j; for(i=0; i<=m-1; i++) { for(j=0; j<=n-1; j++) { if( ae_fp_greater_eq(ae_randomreal(_state),sparcity) ) { a->ptr.pp_complex[i][j].x = 2*ae_randomreal(_state)-1; a->ptr.pp_complex[i][j].y = 2*ae_randomreal(_state)-1; } else { a->ptr.pp_complex[i][j] = ae_complex_from_i(0); } } } if( ae_fp_greater(diagmag,(double)(0)) ) { for(i=0; i<=ae_minint(m, n, _state)-1; i++) { a->ptr.pp_complex[i][i].x = diagmag*(2*ae_randomreal(_state)-1); a->ptr.pp_complex[i][i].y = diagmag*(2*ae_randomreal(_state)-1); } } } /************************************************************************* Copies A to AL (lower half) and AU (upper half), filling unused parts by random garbage. *************************************************************************/ static void testevdunit_rmatrixsymmetricsplit(/* Real */ ae_matrix* a, ae_int_t n, /* Real */ ae_matrix* al, /* Real */ ae_matrix* au, ae_state *_state) { ae_int_t i; ae_int_t j; for(i=0; i<=n-1; i++) { for(j=i+1; j<=n-1; j++) { al->ptr.pp_double[i][j] = 2*ae_randomreal(_state)-1; al->ptr.pp_double[j][i] = a->ptr.pp_double[i][j]; au->ptr.pp_double[i][j] = a->ptr.pp_double[i][j]; au->ptr.pp_double[j][i] = 2*ae_randomreal(_state)-1; } al->ptr.pp_double[i][i] = a->ptr.pp_double[i][i]; au->ptr.pp_double[i][i] = a->ptr.pp_double[i][i]; } } /************************************************************************* Copies A to AL (lower half) and AU (upper half), filling unused parts by random garbage. *************************************************************************/ static void testevdunit_cmatrixhermitiansplit(/* Complex */ ae_matrix* a, ae_int_t n, /* Complex */ ae_matrix* al, /* Complex */ ae_matrix* au, ae_state *_state) { ae_int_t i; ae_int_t j; for(i=0; i<=n-1; i++) { for(j=i+1; j<=n-1; j++) { al->ptr.pp_complex[i][j] = ae_complex_from_d(2*ae_randomreal(_state)-1); al->ptr.pp_complex[j][i] = ae_c_conj(a->ptr.pp_complex[i][j], _state); au->ptr.pp_complex[i][j] = a->ptr.pp_complex[i][j]; au->ptr.pp_complex[j][i] = ae_complex_from_d(2*ae_randomreal(_state)-1); } al->ptr.pp_complex[i][i] = a->ptr.pp_complex[i][i]; au->ptr.pp_complex[i][i] = a->ptr.pp_complex[i][i]; } } /************************************************************************* Unsets 2D array. *************************************************************************/ static void testevdunit_unset2d(/* Real */ ae_matrix* a, ae_state *_state) { ae_matrix_clear(a); if( a->rows*a->cols>0 ) { ae_matrix_set_length(a, 1, 1, _state); } } /************************************************************************* Unsets 2D array. *************************************************************************/ static void testevdunit_cunset2d(/* Complex */ ae_matrix* a, ae_state *_state) { ae_matrix_set_length(a, 0+1, 0+1, _state); a->ptr.pp_complex[0][0] = ae_complex_from_d(2*ae_randomreal(_state)-1); } /************************************************************************* Unsets 1D array. 
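In the EVD tests below it is called together with Unset2D/CUnset2D to invalidate output arguments before each solver call.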
*************************************************************************/ static void testevdunit_unset1d(/* Real */ ae_vector* a, ae_state *_state) { ae_vector_clear(a); if( a->cnt>0 ) { ae_vector_set_length(a, 1, _state); } } /************************************************************************* Tests Z*Lambda*Z' against tridiag(D,E). Returns relative error. *************************************************************************/ static double testevdunit_tdtestproduct(/* Real */ ae_vector* d, /* Real */ ae_vector* e, ae_int_t n, /* Real */ ae_matrix* z, /* Real */ ae_vector* lambdav, ae_state *_state) { ae_int_t i; ae_int_t j; ae_int_t k; double v; double mx; double result; result = (double)(0); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { /* * Calculate V = A[i,j], A = Z*Lambda*Z' */ v = (double)(0); for(k=0; k<=n-1; k++) { v = v+z->ptr.pp_double[i][k]*lambdav->ptr.p_double[k]*z->ptr.pp_double[j][k]; } /* * Compare */ if( ae_iabs(i-j, _state)==0 ) { result = ae_maxreal(result, ae_fabs(v-d->ptr.p_double[i], _state), _state); } if( ae_iabs(i-j, _state)==1 ) { result = ae_maxreal(result, ae_fabs(v-e->ptr.p_double[ae_minint(i, j, _state)], _state), _state); } if( ae_iabs(i-j, _state)>1 ) { result = ae_maxreal(result, ae_fabs(v, _state), _state); } } } mx = (double)(0); for(i=0; i<=n-1; i++) { mx = ae_maxreal(mx, ae_fabs(d->ptr.p_double[i], _state), _state); } for(i=0; i<=n-2; i++) { mx = ae_maxreal(mx, ae_fabs(e->ptr.p_double[i], _state), _state); } if( ae_fp_eq(mx,(double)(0)) ) { mx = (double)(1); } result = result/mx; return result; } /************************************************************************* Tests Z*Lambda*Z' against A Returns relative error. *************************************************************************/ static double testevdunit_testproduct(/* Real */ ae_matrix* a, ae_int_t n, /* Real */ ae_matrix* z, /* Real */ ae_vector* lambdav, ae_state *_state) { ae_int_t i; ae_int_t j; ae_int_t k; double v; double mx; double result; result = (double)(0); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { /* * Calculate V = A[i,j], A = Z*Lambda*Z' */ v = (double)(0); for(k=0; k<=n-1; k++) { v = v+z->ptr.pp_double[i][k]*lambdav->ptr.p_double[k]*z->ptr.pp_double[j][k]; } /* * Compare */ result = ae_maxreal(result, ae_fabs(v-a->ptr.pp_double[i][j], _state), _state); } } mx = (double)(0); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { mx = ae_maxreal(mx, ae_fabs(a->ptr.pp_double[i][j], _state), _state); } } if( ae_fp_eq(mx,(double)(0)) ) { mx = (double)(1); } result = result/mx; return result; } /************************************************************************* Tests Z*Z' against diag(1...1) Returns absolute error. *************************************************************************/ static double testevdunit_testort(/* Real */ ae_matrix* z, ae_int_t n, ae_state *_state) { ae_int_t i; ae_int_t j; double v; double result; result = (double)(0); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { v = ae_v_dotproduct(&z->ptr.pp_double[0][i], z->stride, &z->ptr.pp_double[0][j], z->stride, ae_v_len(0,n-1)); if( i==j ) { v = v-1; } result = ae_maxreal(result, ae_fabs(v, _state), _state); } } return result; } /************************************************************************* Tests Z*Lambda*Z' against A Returns relative error. 
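The returned value is max|Z*diag(Lambda)*Z^H - A| over all entries, divided by max|A| (the denominator is replaced by 1 when A is exactly zero).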
*************************************************************************/ static double testevdunit_testcproduct(/* Complex */ ae_matrix* a, ae_int_t n, /* Complex */ ae_matrix* z, /* Real */ ae_vector* lambdav, ae_state *_state) { ae_int_t i; ae_int_t j; ae_int_t k; ae_complex v; double mx; double result; result = (double)(0); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { /* * Calculate V = A[i,j], A = Z*Lambda*Z' */ v = ae_complex_from_i(0); for(k=0; k<=n-1; k++) { v = ae_c_add(v,ae_c_mul(ae_c_mul_d(z->ptr.pp_complex[i][k],lambdav->ptr.p_double[k]),ae_c_conj(z->ptr.pp_complex[j][k], _state))); } /* * Compare */ result = ae_maxreal(result, ae_c_abs(ae_c_sub(v,a->ptr.pp_complex[i][j]), _state), _state); } } mx = (double)(0); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { mx = ae_maxreal(mx, ae_c_abs(a->ptr.pp_complex[i][j], _state), _state); } } if( ae_fp_eq(mx,(double)(0)) ) { mx = (double)(1); } result = result/mx; return result; } /************************************************************************* Tests Z*Z' against diag(1...1) Returns absolute error. *************************************************************************/ static double testevdunit_testcort(/* Complex */ ae_matrix* z, ae_int_t n, ae_state *_state) { ae_int_t i; ae_int_t j; ae_complex v; double result; result = (double)(0); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { v = ae_v_cdotproduct(&z->ptr.pp_complex[0][i], z->stride, "N", &z->ptr.pp_complex[0][j], z->stride, "Conj", ae_v_len(0,n-1)); if( i==j ) { v = ae_c_sub_d(v,1); } result = ae_maxreal(result, ae_c_abs(v, _state), _state); } } return result; } /************************************************************************* Tests SEVD problem *************************************************************************/ static void testevdunit_testsevdproblem(/* Real */ ae_matrix* a, /* Real */ ae_matrix* al, /* Real */ ae_matrix* au, ae_int_t n, double threshold, ae_bool* serrors, ae_int_t* failc, ae_int_t* runs, ae_state *_state) { ae_frame _frame_block; ae_vector lambdav; ae_vector lambdaref; ae_matrix z; ae_int_t i; ae_frame_make(_state, &_frame_block); ae_vector_init(&lambdav, 0, DT_REAL, _state); ae_vector_init(&lambdaref, 0, DT_REAL, _state); ae_matrix_init(&z, 0, 0, DT_REAL, _state); /* * Test simple EVD: values and full vectors, lower A */ testevdunit_unset1d(&lambdaref, _state); testevdunit_unset2d(&z, _state); *runs = *runs+1; if( !smatrixevd(al, n, 1, ae_false, &lambdaref, &z, _state) ) { *failc = *failc+1; ae_frame_leave(_state); return; } *serrors = *serrors||ae_fp_greater(testevdunit_testproduct(a, n, &z, &lambdaref, _state),threshold); *serrors = *serrors||ae_fp_greater(testevdunit_testort(&z, n, _state),threshold); for(i=0; i<=n-2; i++) { if( ae_fp_less(lambdaref.ptr.p_double[i+1],lambdaref.ptr.p_double[i]) ) { *serrors = ae_true; ae_frame_leave(_state); return; } } /* * Test simple EVD: values and full vectors, upper A */ testevdunit_unset1d(&lambdav, _state); testevdunit_unset2d(&z, _state); *runs = *runs+1; if( !smatrixevd(au, n, 1, ae_true, &lambdav, &z, _state) ) { *failc = *failc+1; ae_frame_leave(_state); return; } *serrors = *serrors||ae_fp_greater(testevdunit_testproduct(a, n, &z, &lambdav, _state),threshold); *serrors = *serrors||ae_fp_greater(testevdunit_testort(&z, n, _state),threshold); for(i=0; i<=n-2; i++) { if( ae_fp_less(lambdav.ptr.p_double[i+1],lambdav.ptr.p_double[i]) ) { *serrors = ae_true; ae_frame_leave(_state); return; } } /* * Test simple EVD: values only, lower A */ testevdunit_unset1d(&lambdav, _state); 
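/*
 * Values-only calls (ZNeeded=0) below are checked against LambdaRef
 * computed by the full EVD above.
 */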
testevdunit_unset2d(&z, _state); *runs = *runs+1; if( !smatrixevd(al, n, 0, ae_false, &lambdav, &z, _state) ) { *failc = *failc+1; ae_frame_leave(_state); return; } for(i=0; i<=n-1; i++) { *serrors = *serrors||ae_fp_greater(ae_fabs(lambdav.ptr.p_double[i]-lambdaref.ptr.p_double[i], _state),threshold); } /* * Test simple EVD: values only, upper A */ testevdunit_unset1d(&lambdav, _state); testevdunit_unset2d(&z, _state); *runs = *runs+1; if( !smatrixevd(au, n, 0, ae_true, &lambdav, &z, _state) ) { *failc = *failc+1; ae_frame_leave(_state); return; } for(i=0; i<=n-1; i++) { *serrors = *serrors||ae_fp_greater(ae_fabs(lambdav.ptr.p_double[i]-lambdaref.ptr.p_double[i], _state),threshold); } ae_frame_leave(_state); } /************************************************************************* Tests SEVD problem *************************************************************************/ static void testevdunit_testhevdproblem(/* Complex */ ae_matrix* a, /* Complex */ ae_matrix* al, /* Complex */ ae_matrix* au, ae_int_t n, double threshold, ae_bool* herrors, ae_int_t* failc, ae_int_t* runs, ae_state *_state) { ae_frame _frame_block; ae_vector lambdav; ae_vector lambdaref; ae_matrix z; ae_int_t i; ae_frame_make(_state, &_frame_block); ae_vector_init(&lambdav, 0, DT_REAL, _state); ae_vector_init(&lambdaref, 0, DT_REAL, _state); ae_matrix_init(&z, 0, 0, DT_COMPLEX, _state); /* * Test simple EVD: values and full vectors, lower A */ testevdunit_unset1d(&lambdaref, _state); testevdunit_cunset2d(&z, _state); *runs = *runs+1; if( !hmatrixevd(al, n, 1, ae_false, &lambdaref, &z, _state) ) { *failc = *failc+1; ae_frame_leave(_state); return; } *herrors = *herrors||ae_fp_greater(testevdunit_testcproduct(a, n, &z, &lambdaref, _state),threshold); *herrors = *herrors||ae_fp_greater(testevdunit_testcort(&z, n, _state),threshold); for(i=0; i<=n-2; i++) { if( ae_fp_less(lambdaref.ptr.p_double[i+1],lambdaref.ptr.p_double[i]) ) { *herrors = ae_true; ae_frame_leave(_state); return; } } /* * Test simple EVD: values and full vectors, upper A */ testevdunit_unset1d(&lambdav, _state); testevdunit_cunset2d(&z, _state); *runs = *runs+1; if( !hmatrixevd(au, n, 1, ae_true, &lambdav, &z, _state) ) { *failc = *failc+1; ae_frame_leave(_state); return; } *herrors = *herrors||ae_fp_greater(testevdunit_testcproduct(a, n, &z, &lambdav, _state),threshold); *herrors = *herrors||ae_fp_greater(testevdunit_testcort(&z, n, _state),threshold); for(i=0; i<=n-2; i++) { if( ae_fp_less(lambdav.ptr.p_double[i+1],lambdav.ptr.p_double[i]) ) { *herrors = ae_true; ae_frame_leave(_state); return; } } /* * Test simple EVD: values only, lower A */ testevdunit_unset1d(&lambdav, _state); testevdunit_cunset2d(&z, _state); *runs = *runs+1; if( !hmatrixevd(al, n, 0, ae_false, &lambdav, &z, _state) ) { *failc = *failc+1; ae_frame_leave(_state); return; } for(i=0; i<=n-1; i++) { *herrors = *herrors||ae_fp_greater(ae_fabs(lambdav.ptr.p_double[i]-lambdaref.ptr.p_double[i], _state),threshold); } /* * Test simple EVD: values only, upper A */ testevdunit_unset1d(&lambdav, _state); testevdunit_cunset2d(&z, _state); *runs = *runs+1; if( !hmatrixevd(au, n, 0, ae_true, &lambdav, &z, _state) ) { *failc = *failc+1; ae_frame_leave(_state); return; } for(i=0; i<=n-1; i++) { *herrors = *herrors||ae_fp_greater(ae_fabs(lambdav.ptr.p_double[i]-lambdaref.ptr.p_double[i], _state),threshold); } ae_frame_leave(_state); } /************************************************************************* Tests EVD problem DistVals - is True, when eigenvalues are distinct. 
Is False, when we are solving sparse task with lots of zero eigenvalues. In such cases some tests related to the eigenvectors are not performed. *************************************************************************/ static void testevdunit_testsevdbiproblem(/* Real */ ae_matrix* afull, /* Real */ ae_matrix* al, /* Real */ ae_matrix* au, ae_int_t n, ae_bool distvals, double threshold, ae_bool* serrors, ae_int_t* failc, ae_int_t* runs, ae_state *_state) { ae_frame _frame_block; ae_vector lambdav; ae_vector lambdaref; ae_matrix z; ae_matrix zref; ae_matrix a1; ae_matrix a2; ae_matrix ar; ae_int_t i; ae_int_t j; ae_int_t k; ae_int_t m; ae_int_t i1; ae_int_t i2; double v; double a; double b; ae_frame_make(_state, &_frame_block); ae_vector_init(&lambdav, 0, DT_REAL, _state); ae_vector_init(&lambdaref, 0, DT_REAL, _state); ae_matrix_init(&z, 0, 0, DT_REAL, _state); ae_matrix_init(&zref, 0, 0, DT_REAL, _state); ae_matrix_init(&a1, 0, 0, DT_REAL, _state); ae_matrix_init(&a2, 0, 0, DT_REAL, _state); ae_matrix_init(&ar, 0, 0, DT_REAL, _state); ae_vector_set_length(&lambdaref, n-1+1, _state); ae_matrix_set_length(&zref, n-1+1, n-1+1, _state); ae_matrix_set_length(&a1, n-1+1, n-1+1, _state); ae_matrix_set_length(&a2, n-1+1, n-1+1, _state); /* * Reference EVD */ *runs = *runs+1; if( !smatrixevd(afull, n, 1, ae_true, &lambdaref, &zref, _state) ) { *failc = *failc+1; ae_frame_leave(_state); return; } /* * Select random interval boundaries. * If there are non-distinct eigenvalues at the boundaries, * we move indexes further until values splits. It is done to * avoid situations where we can't get definite answer. */ i1 = ae_randominteger(n, _state); i2 = i1+ae_randominteger(n-i1, _state); while(i1>0) { if( ae_fp_greater(ae_fabs(lambdaref.ptr.p_double[i1-1]-lambdaref.ptr.p_double[i1], _state),10*threshold) ) { break; } i1 = i1-1; } while(i20 ) { a = 0.5*(lambdaref.ptr.p_double[i1]+lambdaref.ptr.p_double[i1-1]); } else { a = lambdaref.ptr.p_double[0]-1; } if( i20) { if( ae_fp_greater(ae_fabs(lambdaref.ptr.p_double[i1-1]-lambdaref.ptr.p_double[i1], _state),10*threshold) ) { break; } i1 = i1-1; } while(i20 ) { a = 0.5*(lambdaref.ptr.p_double[i1]+lambdaref.ptr.p_double[i1-1]); } else { a = lambdaref.ptr.p_double[0]-1; } if( i21 ) { ae_vector_set_length(&ee, n-2+1, _state); } /* * Test simple EVD: values and full vectors */ for(i=0; i<=n-1; i++) { lambdav.ptr.p_double[i] = d->ptr.p_double[i]; } for(i=0; i<=n-2; i++) { ee.ptr.p_double[i] = e->ptr.p_double[i]; } testevdunit_unset2d(&z, _state); wsucc = smatrixtdevd(&lambdav, &ee, n, 2, &z, _state); if( !wsucc ) { seterrorflag(tderrors, ae_true, _state); ae_frame_leave(_state); return; } seterrorflag(tderrors, ae_fp_greater(testevdunit_tdtestproduct(d, e, n, &z, &lambdav, _state),threshold), _state); seterrorflag(tderrors, ae_fp_greater(testevdunit_testort(&z, n, _state),threshold), _state); for(i=0; i<=n-2; i++) { if( ae_fp_less(lambdav.ptr.p_double[i+1],lambdav.ptr.p_double[i]) ) { seterrorflag(tderrors, ae_true, _state); ae_frame_leave(_state); return; } } for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { zref.ptr.pp_double[i][j] = z.ptr.pp_double[i][j]; } } /* * Test values only variant */ for(i=0; i<=n-1; i++) { lambda2.ptr.p_double[i] = d->ptr.p_double[i]; } for(i=0; i<=n-2; i++) { ee.ptr.p_double[i] = e->ptr.p_double[i]; } testevdunit_unset2d(&z, _state); wsucc = smatrixtdevd(&lambda2, &ee, n, 0, &z, _state); if( !wsucc ) { seterrorflag(tderrors, ae_true, _state); ae_frame_leave(_state); return; } for(i=0; i<=n-1; i++) { seterrorflag(tderrors, 
ae_fp_greater(ae_fabs(lambda2.ptr.p_double[i]-lambdav.ptr.p_double[i], _state),threshold), _state); } /* * Test multiplication variant */ for(i=0; i<=n-1; i++) { lambda2.ptr.p_double[i] = d->ptr.p_double[i]; } for(i=0; i<=n-2; i++) { ee.ptr.p_double[i] = e->ptr.p_double[i]; } for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { a1.ptr.pp_double[i][j] = 2*ae_randomreal(_state)-1; a2.ptr.pp_double[i][j] = a1.ptr.pp_double[i][j]; } } wsucc = smatrixtdevd(&lambda2, &ee, n, 1, &a1, _state); if( !wsucc ) { seterrorflag(tderrors, ae_true, _state); ae_frame_leave(_state); return; } for(i=0; i<=n-1; i++) { seterrorflag(tderrors, ae_fp_greater(ae_fabs(lambda2.ptr.p_double[i]-lambdav.ptr.p_double[i], _state),threshold), _state); } for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { v = ae_v_dotproduct(&a2.ptr.pp_double[i][0], 1, &zref.ptr.pp_double[0][j], zref.stride, ae_v_len(0,n-1)); /* * next line is a bit complicated because * depending on algorithm used we can get either * z or -z as eigenvector. so we compare result * with both A*ZRef and -A*ZRef */ seterrorflag(tderrors, ae_fp_greater(ae_fabs(v-a1.ptr.pp_double[i][j], _state),threshold)&&ae_fp_greater(ae_fabs(v+a1.ptr.pp_double[i][j], _state),threshold), _state); } } /* * Test first row variant. * * NOTE: this test is special because ZNeeded=3 is ALGLIB-specific feature * which is NOT supported by Intel MKL. Thus, MKL-capable version of * ALGLIB will use different algorithms for ZNeeded=3 and for ZNeeded<3. * * In most cases it is OK, but when problem happened to be degenerate * (two close eigenvalues), Z computed by ALGLIB may be different from * Z computed by MKL (up to arbitrary rotation), which will lead to * failure of the test, because ZNeeded=2 is used as reference value * for ZNeeded=3. * * That's why this test is performed only for well-separated matrices, * and with custom threshold. */ requiredseparation = 1.0E-6; specialthreshold = 1.0E-6; worstseparation = ae_maxrealnumber; for(i=0; i<=n-2; i++) { worstseparation = ae_minreal(worstseparation, ae_fabs(lambdav.ptr.p_double[i+1]-lambdav.ptr.p_double[i], _state), _state); } if( ae_fp_greater(worstseparation,requiredseparation) ) { for(i=0; i<=n-1; i++) { lambda2.ptr.p_double[i] = d->ptr.p_double[i]; } for(i=0; i<=n-2; i++) { ee.ptr.p_double[i] = e->ptr.p_double[i]; } testevdunit_unset2d(&z, _state); wsucc = smatrixtdevd(&lambda2, &ee, n, 3, &z, _state); if( !wsucc ) { seterrorflag(tderrors, ae_true, _state); ae_frame_leave(_state); return; } for(i=0; i<=n-1; i++) { seterrorflag(tderrors, ae_fp_greater(ae_fabs(lambda2.ptr.p_double[i]-lambdav.ptr.p_double[i], _state),threshold), _state); /* * next line is a bit complicated because * depending on algorithm used we can get either * z or -z as eigenvector. so we compare result * with both z and -z */ seterrorflag(tderrors, ae_fp_greater(ae_fabs(z.ptr.pp_double[0][i]-zref.ptr.pp_double[0][i], _state),specialthreshold)&&ae_fp_greater(ae_fabs(z.ptr.pp_double[0][i]+zref.ptr.pp_double[0][i], _state),specialthreshold), _state); } } ae_frame_leave(_state); } /************************************************************************* Tests EVD problem DistVals - is True, when eigenvalues are distinct. Is False, when we are solving sparse task with lots of zero eigenvalues. In such cases some tests related to the eigenvectors are not performed. 
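Eigenvector checks are skipped in that case because with repeated eigenvalues the eigenvectors are determined only up to an orthogonal transformation of the corresponding invariant subspace, which makes direct comparison against reference vectors meaningless.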
*************************************************************************/ static void testevdunit_testtdevdbiproblem(/* Real */ ae_vector* d, /* Real */ ae_vector* e, ae_int_t n, ae_bool distvals, double threshold, ae_bool* serrors, ae_int_t* failc, ae_int_t* runs, ae_state *_state) { ae_frame _frame_block; ae_vector lambdav; ae_vector lambdaref; ae_matrix z; ae_matrix zref; ae_matrix a1; ae_matrix a2; ae_matrix ar; ae_int_t i; ae_int_t j; ae_int_t k; ae_int_t m; ae_int_t i1; ae_int_t i2; double v; double a; double b; ae_frame_make(_state, &_frame_block); ae_vector_init(&lambdav, 0, DT_REAL, _state); ae_vector_init(&lambdaref, 0, DT_REAL, _state); ae_matrix_init(&z, 0, 0, DT_REAL, _state); ae_matrix_init(&zref, 0, 0, DT_REAL, _state); ae_matrix_init(&a1, 0, 0, DT_REAL, _state); ae_matrix_init(&a2, 0, 0, DT_REAL, _state); ae_matrix_init(&ar, 0, 0, DT_REAL, _state); ae_vector_set_length(&lambdaref, n-1+1, _state); ae_matrix_set_length(&zref, n-1+1, n-1+1, _state); ae_matrix_set_length(&a1, n-1+1, n-1+1, _state); ae_matrix_set_length(&a2, n-1+1, n-1+1, _state); /* * Reference EVD */ ae_vector_set_length(&lambdaref, n, _state); ae_v_move(&lambdaref.ptr.p_double[0], 1, &d->ptr.p_double[0], 1, ae_v_len(0,n-1)); *runs = *runs+1; if( !smatrixtdevd(&lambdaref, e, n, 2, &zref, _state) ) { *failc = *failc+1; ae_frame_leave(_state); return; } /* * Select random interval boundaries. * If there are non-distinct eigenvalues at the boundaries, * we move indexes further until values splits. It is done to * avoid situations where we can't get definite answer. */ i1 = ae_randominteger(n, _state); i2 = i1+ae_randominteger(n-i1, _state); while(i1>0) { if( ae_fp_greater(ae_fabs(lambdaref.ptr.p_double[i1-1]-lambdaref.ptr.p_double[i1], _state),10*threshold) ) { break; } i1 = i1-1; } while(i20 ) { a = 0.5*(lambdaref.ptr.p_double[i1]+lambdaref.ptr.p_double[i1-1]); } else { a = lambdaref.ptr.p_double[0]-1; } if( i2ptr.p_double[i]; } *runs = *runs+1; if( !smatrixtdevdr(&lambdav, e, n, 0, a, b, &m, &z, _state) ) { *failc = *failc+1; ae_frame_leave(_state); return; } if( m!=i2-i1+1 ) { *failc = *failc+1; ae_frame_leave(_state); return; } for(k=0; k<=m-1; k++) { *serrors = *serrors||ae_fp_greater(ae_fabs(lambdav.ptr.p_double[k]-lambdaref.ptr.p_double[i1+k], _state),threshold); } /* * Test indexes, no vectors */ ae_vector_set_length(&lambdav, n-1+1, _state); for(i=0; i<=n-1; i++) { lambdav.ptr.p_double[i] = d->ptr.p_double[i]; } *runs = *runs+1; if( !smatrixtdevdi(&lambdav, e, n, 0, i1, i2, &z, _state) ) { *failc = *failc+1; ae_frame_leave(_state); return; } m = i2-i1+1; for(k=0; k<=m-1; k++) { *serrors = *serrors||ae_fp_greater(ae_fabs(lambdav.ptr.p_double[k]-lambdaref.ptr.p_double[i1+k], _state),threshold); } /* * Test interval, transform vectors */ ae_vector_set_length(&lambdav, n-1+1, _state); for(i=0; i<=n-1; i++) { lambdav.ptr.p_double[i] = d->ptr.p_double[i]; } ae_matrix_set_length(&a1, n-1+1, n-1+1, _state); ae_matrix_set_length(&a2, n-1+1, n-1+1, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { a1.ptr.pp_double[i][j] = 2*ae_randomreal(_state)-1; a2.ptr.pp_double[i][j] = a1.ptr.pp_double[i][j]; } } *runs = *runs+1; if( !smatrixtdevdr(&lambdav, e, n, 1, a, b, &m, &a1, _state) ) { *failc = *failc+1; ae_frame_leave(_state); return; } if( m!=i2-i1+1 ) { *failc = *failc+1; ae_frame_leave(_state); return; } for(k=0; k<=m-1; k++) { *serrors = *serrors||ae_fp_greater(ae_fabs(lambdav.ptr.p_double[k]-lambdaref.ptr.p_double[i1+k], _state),threshold); } if( distvals ) { ae_matrix_set_length(&ar, n-1+1, m-1+1, 
_state); for(i=0; i<=n-1; i++) { for(j=0; j<=m-1; j++) { v = ae_v_dotproduct(&a2.ptr.pp_double[i][0], 1, &zref.ptr.pp_double[0][i1+j], zref.stride, ae_v_len(0,n-1)); ar.ptr.pp_double[i][j] = v; } } for(j=0; j<=m-1; j++) { v = ae_v_dotproduct(&a1.ptr.pp_double[0][j], a1.stride, &ar.ptr.pp_double[0][j], ar.stride, ae_v_len(0,n-1)); if( ae_fp_less(v,(double)(0)) ) { ae_v_muld(&ar.ptr.pp_double[0][j], ar.stride, ae_v_len(0,n-1), -1); } } for(i=0; i<=n-1; i++) { for(j=0; j<=m-1; j++) { *serrors = *serrors||ae_fp_greater(ae_fabs(a1.ptr.pp_double[i][j]-ar.ptr.pp_double[i][j], _state),threshold); } } } /* * Test indexes, transform vectors */ ae_vector_set_length(&lambdav, n-1+1, _state); for(i=0; i<=n-1; i++) { lambdav.ptr.p_double[i] = d->ptr.p_double[i]; } ae_matrix_set_length(&a1, n-1+1, n-1+1, _state); ae_matrix_set_length(&a2, n-1+1, n-1+1, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { a1.ptr.pp_double[i][j] = 2*ae_randomreal(_state)-1; a2.ptr.pp_double[i][j] = a1.ptr.pp_double[i][j]; } } *runs = *runs+1; if( !smatrixtdevdi(&lambdav, e, n, 1, i1, i2, &a1, _state) ) { *failc = *failc+1; ae_frame_leave(_state); return; } m = i2-i1+1; for(k=0; k<=m-1; k++) { *serrors = *serrors||ae_fp_greater(ae_fabs(lambdav.ptr.p_double[k]-lambdaref.ptr.p_double[i1+k], _state),threshold); } if( distvals ) { ae_matrix_set_length(&ar, n-1+1, m-1+1, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=m-1; j++) { v = ae_v_dotproduct(&a2.ptr.pp_double[i][0], 1, &zref.ptr.pp_double[0][i1+j], zref.stride, ae_v_len(0,n-1)); ar.ptr.pp_double[i][j] = v; } } for(j=0; j<=m-1; j++) { v = ae_v_dotproduct(&a1.ptr.pp_double[0][j], a1.stride, &ar.ptr.pp_double[0][j], ar.stride, ae_v_len(0,n-1)); if( ae_fp_less(v,(double)(0)) ) { ae_v_muld(&ar.ptr.pp_double[0][j], ar.stride, ae_v_len(0,n-1), -1); } } for(i=0; i<=n-1; i++) { for(j=0; j<=m-1; j++) { *serrors = *serrors||ae_fp_greater(ae_fabs(a1.ptr.pp_double[i][j]-ar.ptr.pp_double[i][j], _state),threshold); } } } /* * Test interval, do not transform vectors */ ae_vector_set_length(&lambdav, n-1+1, _state); for(i=0; i<=n-1; i++) { lambdav.ptr.p_double[i] = d->ptr.p_double[i]; } ae_matrix_set_length(&z, 0+1, 0+1, _state); *runs = *runs+1; if( !smatrixtdevdr(&lambdav, e, n, 2, a, b, &m, &z, _state) ) { *failc = *failc+1; ae_frame_leave(_state); return; } if( m!=i2-i1+1 ) { *failc = *failc+1; ae_frame_leave(_state); return; } for(k=0; k<=m-1; k++) { *serrors = *serrors||ae_fp_greater(ae_fabs(lambdav.ptr.p_double[k]-lambdaref.ptr.p_double[i1+k], _state),threshold); } if( distvals ) { for(j=0; j<=m-1; j++) { v = ae_v_dotproduct(&z.ptr.pp_double[0][j], z.stride, &zref.ptr.pp_double[0][i1+j], zref.stride, ae_v_len(0,n-1)); if( ae_fp_less(v,(double)(0)) ) { ae_v_muld(&z.ptr.pp_double[0][j], z.stride, ae_v_len(0,n-1), -1); } } for(i=0; i<=n-1; i++) { for(j=0; j<=m-1; j++) { *serrors = *serrors||ae_fp_greater(ae_fabs(z.ptr.pp_double[i][j]-zref.ptr.pp_double[i][i1+j], _state),threshold); } } } /* * Test indexes, do not transform vectors */ ae_vector_set_length(&lambdav, n-1+1, _state); for(i=0; i<=n-1; i++) { lambdav.ptr.p_double[i] = d->ptr.p_double[i]; } ae_matrix_set_length(&z, 0+1, 0+1, _state); *runs = *runs+1; if( !smatrixtdevdi(&lambdav, e, n, 2, i1, i2, &z, _state) ) { *failc = *failc+1; ae_frame_leave(_state); return; } m = i2-i1+1; for(k=0; k<=m-1; k++) { *serrors = *serrors||ae_fp_greater(ae_fabs(lambdav.ptr.p_double[k]-lambdaref.ptr.p_double[i1+k], _state),threshold); } if( distvals ) { for(j=0; j<=m-1; j++) { v = ae_v_dotproduct(&z.ptr.pp_double[0][j], z.stride, 
&zref.ptr.pp_double[0][i1+j], zref.stride, ae_v_len(0,n-1)); if( ae_fp_less(v,(double)(0)) ) { ae_v_muld(&z.ptr.pp_double[0][j], z.stride, ae_v_len(0,n-1), -1); } } for(i=0; i<=n-1; i++) { for(j=0; j<=m-1; j++) { *serrors = *serrors||ae_fp_greater(ae_fabs(z.ptr.pp_double[i][j]-zref.ptr.pp_double[i][i1+j], _state),threshold); } } } ae_frame_leave(_state); } /************************************************************************* Non-symmetric problem *************************************************************************/ static void testevdunit_testnsevdproblem(/* Real */ ae_matrix* a, ae_int_t n, double threshold, ae_bool* nserrors, ae_state *_state) { ae_frame _frame_block; double mx; ae_int_t i; ae_int_t j; ae_int_t k; ae_int_t vjob; ae_bool needl; ae_bool needr; ae_vector wr0; ae_vector wi0; ae_vector wr1; ae_vector wi1; ae_vector wr0s; ae_vector wi0s; ae_vector wr1s; ae_vector wi1s; ae_matrix vl; ae_matrix vr; ae_vector vec1r; ae_vector vec1i; ae_vector vec2r; ae_vector vec2i; ae_vector vec3r; ae_vector vec3i; double curwr; double curwi; double vt; double tmp; double vnorm; ae_frame_make(_state, &_frame_block); ae_vector_init(&wr0, 0, DT_REAL, _state); ae_vector_init(&wi0, 0, DT_REAL, _state); ae_vector_init(&wr1, 0, DT_REAL, _state); ae_vector_init(&wi1, 0, DT_REAL, _state); ae_vector_init(&wr0s, 0, DT_REAL, _state); ae_vector_init(&wi0s, 0, DT_REAL, _state); ae_vector_init(&wr1s, 0, DT_REAL, _state); ae_vector_init(&wi1s, 0, DT_REAL, _state); ae_matrix_init(&vl, 0, 0, DT_REAL, _state); ae_matrix_init(&vr, 0, 0, DT_REAL, _state); ae_vector_init(&vec1r, 0, DT_REAL, _state); ae_vector_init(&vec1i, 0, DT_REAL, _state); ae_vector_init(&vec2r, 0, DT_REAL, _state); ae_vector_init(&vec2i, 0, DT_REAL, _state); ae_vector_init(&vec3r, 0, DT_REAL, _state); ae_vector_init(&vec3i, 0, DT_REAL, _state); ae_vector_set_length(&vec1r, n-1+1, _state); ae_vector_set_length(&vec2r, n-1+1, _state); ae_vector_set_length(&vec3r, n-1+1, _state); ae_vector_set_length(&vec1i, n-1+1, _state); ae_vector_set_length(&vec2i, n-1+1, _state); ae_vector_set_length(&vec3i, n-1+1, _state); ae_vector_set_length(&wr0s, n-1+1, _state); ae_vector_set_length(&wr1s, n-1+1, _state); ae_vector_set_length(&wi0s, n-1+1, _state); ae_vector_set_length(&wi1s, n-1+1, _state); mx = (double)(0); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { if( ae_fp_greater(ae_fabs(a->ptr.pp_double[i][j], _state),mx) ) { mx = ae_fabs(a->ptr.pp_double[i][j], _state); } } } if( ae_fp_eq(mx,(double)(0)) ) { mx = (double)(1); } /* * Load values-only */ if( !rmatrixevd(a, n, 0, &wr0, &wi0, &vl, &vr, _state) ) { seterrorflag(nserrors, ae_true, _state); ae_frame_leave(_state); return; } /* * Test different jobs */ for(vjob=1; vjob<=3; vjob++) { needr = vjob==1||vjob==3; needl = vjob==2||vjob==3; if( !rmatrixevd(a, n, vjob, &wr1, &wi1, &vl, &vr, _state) ) { seterrorflag(nserrors, ae_true, _state); ae_frame_leave(_state); return; } /* * Test values: * 1. sort by real part * 2. 
test */ ae_v_move(&wr0s.ptr.p_double[0], 1, &wr0.ptr.p_double[0], 1, ae_v_len(0,n-1)); ae_v_move(&wi0s.ptr.p_double[0], 1, &wi0.ptr.p_double[0], 1, ae_v_len(0,n-1)); for(i=0; i<=n-1; i++) { for(j=0; j<=n-2-i; j++) { if( ae_fp_greater(wr0s.ptr.p_double[j],wr0s.ptr.p_double[j+1]) ) { tmp = wr0s.ptr.p_double[j]; wr0s.ptr.p_double[j] = wr0s.ptr.p_double[j+1]; wr0s.ptr.p_double[j+1] = tmp; tmp = wi0s.ptr.p_double[j]; wi0s.ptr.p_double[j] = wi0s.ptr.p_double[j+1]; wi0s.ptr.p_double[j+1] = tmp; } } } ae_v_move(&wr1s.ptr.p_double[0], 1, &wr1.ptr.p_double[0], 1, ae_v_len(0,n-1)); ae_v_move(&wi1s.ptr.p_double[0], 1, &wi1.ptr.p_double[0], 1, ae_v_len(0,n-1)); for(i=0; i<=n-1; i++) { for(j=0; j<=n-2-i; j++) { if( ae_fp_greater(wr1s.ptr.p_double[j],wr1s.ptr.p_double[j+1]) ) { tmp = wr1s.ptr.p_double[j]; wr1s.ptr.p_double[j] = wr1s.ptr.p_double[j+1]; wr1s.ptr.p_double[j+1] = tmp; tmp = wi1s.ptr.p_double[j]; wi1s.ptr.p_double[j] = wi1s.ptr.p_double[j+1]; wi1s.ptr.p_double[j+1] = tmp; } } } for(i=0; i<=n-1; i++) { seterrorflag(nserrors, ae_fp_greater(ae_fabs(wr0s.ptr.p_double[i]-wr1s.ptr.p_double[i], _state),threshold), _state); seterrorflag(nserrors, ae_fp_greater(ae_fabs(wi0s.ptr.p_double[i]-wi1s.ptr.p_double[i], _state),threshold), _state); } /* * Test right vectors */ if( needr ) { k = 0; while(k<=n-1) { curwr = (double)(0); curwi = (double)(0); if( ae_fp_eq(wi1.ptr.p_double[k],(double)(0)) ) { ae_v_move(&vec1r.ptr.p_double[0], 1, &vr.ptr.pp_double[0][k], vr.stride, ae_v_len(0,n-1)); for(i=0; i<=n-1; i++) { vec1i.ptr.p_double[i] = (double)(0); } curwr = wr1.ptr.p_double[k]; curwi = (double)(0); } if( ae_fp_greater(wi1.ptr.p_double[k],(double)(0)) ) { ae_v_move(&vec1r.ptr.p_double[0], 1, &vr.ptr.pp_double[0][k], vr.stride, ae_v_len(0,n-1)); ae_v_move(&vec1i.ptr.p_double[0], 1, &vr.ptr.pp_double[0][k+1], vr.stride, ae_v_len(0,n-1)); curwr = wr1.ptr.p_double[k]; curwi = wi1.ptr.p_double[k]; } if( ae_fp_less(wi1.ptr.p_double[k],(double)(0)) ) { ae_v_move(&vec1r.ptr.p_double[0], 1, &vr.ptr.pp_double[0][k-1], vr.stride, ae_v_len(0,n-1)); ae_v_moveneg(&vec1i.ptr.p_double[0], 1, &vr.ptr.pp_double[0][k], vr.stride, ae_v_len(0,n-1)); curwr = wr1.ptr.p_double[k]; curwi = wi1.ptr.p_double[k]; } vnorm = 0.0; for(i=0; i<=n-1; i++) { vt = ae_v_dotproduct(&a->ptr.pp_double[i][0], 1, &vec1r.ptr.p_double[0], 1, ae_v_len(0,n-1)); vec2r.ptr.p_double[i] = vt; vt = ae_v_dotproduct(&a->ptr.pp_double[i][0], 1, &vec1i.ptr.p_double[0], 1, ae_v_len(0,n-1)); vec2i.ptr.p_double[i] = vt; vnorm = vnorm+ae_sqr(vec1r.ptr.p_double[i], _state)+ae_sqr(vec1i.ptr.p_double[i], _state); } vnorm = ae_sqrt(vnorm, _state); ae_v_moved(&vec3r.ptr.p_double[0], 1, &vec1r.ptr.p_double[0], 1, ae_v_len(0,n-1), curwr); ae_v_subd(&vec3r.ptr.p_double[0], 1, &vec1i.ptr.p_double[0], 1, ae_v_len(0,n-1), curwi); ae_v_moved(&vec3i.ptr.p_double[0], 1, &vec1r.ptr.p_double[0], 1, ae_v_len(0,n-1), curwi); ae_v_addd(&vec3i.ptr.p_double[0], 1, &vec1i.ptr.p_double[0], 1, ae_v_len(0,n-1), curwr); seterrorflag(nserrors, ae_fp_less(vnorm,1.0E-3)||!ae_isfinite(vnorm, _state), _state); for(i=0; i<=n-1; i++) { seterrorflag(nserrors, ae_fp_greater(ae_fabs(vec2r.ptr.p_double[i]-vec3r.ptr.p_double[i], _state),threshold), _state); seterrorflag(nserrors, ae_fp_greater(ae_fabs(vec2i.ptr.p_double[i]-vec3i.ptr.p_double[i], _state),threshold), _state); } k = k+1; } } /* * Test left vectors */ curwr = (double)(0); curwi = (double)(0); if( needl ) { k = 0; while(k<=n-1) { if( ae_fp_eq(wi1.ptr.p_double[k],(double)(0)) ) { ae_v_move(&vec1r.ptr.p_double[0], 1, &vl.ptr.pp_double[0][k], 
vl.stride, ae_v_len(0,n-1)); for(i=0; i<=n-1; i++) { vec1i.ptr.p_double[i] = (double)(0); } curwr = wr1.ptr.p_double[k]; curwi = (double)(0); } if( ae_fp_greater(wi1.ptr.p_double[k],(double)(0)) ) { ae_v_move(&vec1r.ptr.p_double[0], 1, &vl.ptr.pp_double[0][k], vl.stride, ae_v_len(0,n-1)); ae_v_move(&vec1i.ptr.p_double[0], 1, &vl.ptr.pp_double[0][k+1], vl.stride, ae_v_len(0,n-1)); curwr = wr1.ptr.p_double[k]; curwi = wi1.ptr.p_double[k]; } if( ae_fp_less(wi1.ptr.p_double[k],(double)(0)) ) { ae_v_move(&vec1r.ptr.p_double[0], 1, &vl.ptr.pp_double[0][k-1], vl.stride, ae_v_len(0,n-1)); ae_v_moveneg(&vec1i.ptr.p_double[0], 1, &vl.ptr.pp_double[0][k], vl.stride, ae_v_len(0,n-1)); curwr = wr1.ptr.p_double[k]; curwi = wi1.ptr.p_double[k]; } vnorm = 0.0; for(j=0; j<=n-1; j++) { vt = ae_v_dotproduct(&vec1r.ptr.p_double[0], 1, &a->ptr.pp_double[0][j], a->stride, ae_v_len(0,n-1)); vec2r.ptr.p_double[j] = vt; vt = ae_v_dotproduct(&vec1i.ptr.p_double[0], 1, &a->ptr.pp_double[0][j], a->stride, ae_v_len(0,n-1)); vec2i.ptr.p_double[j] = -vt; vnorm = vnorm+ae_sqr(vec1r.ptr.p_double[j], _state)+ae_sqr(vec1i.ptr.p_double[j], _state); } vnorm = ae_sqrt(vnorm, _state); ae_v_moved(&vec3r.ptr.p_double[0], 1, &vec1r.ptr.p_double[0], 1, ae_v_len(0,n-1), curwr); ae_v_addd(&vec3r.ptr.p_double[0], 1, &vec1i.ptr.p_double[0], 1, ae_v_len(0,n-1), curwi); ae_v_moved(&vec3i.ptr.p_double[0], 1, &vec1r.ptr.p_double[0], 1, ae_v_len(0,n-1), curwi); ae_v_addd(&vec3i.ptr.p_double[0], 1, &vec1i.ptr.p_double[0], 1, ae_v_len(0,n-1), -curwr); seterrorflag(nserrors, ae_fp_less(vnorm,1.0E-3)||!ae_isfinite(vnorm, _state), _state); for(i=0; i<=n-1; i++) { seterrorflag(nserrors, ae_fp_greater(ae_fabs(vec2r.ptr.p_double[i]-vec3r.ptr.p_double[i], _state),threshold), _state); seterrorflag(nserrors, ae_fp_greater(ae_fabs(vec2i.ptr.p_double[i]-vec3i.ptr.p_double[i], _state),threshold), _state); } k = k+1; } } } ae_frame_leave(_state); } /************************************************************************* Testing EVD subroutines for one N NOTES: * BIThreshold is a threshold for bisection-and-inverse-iteration subroutines. special threshold is needed because these subroutines may have much more larger error than QR-based algorithms. 
*************************************************************************/ static void testevdunit_testevdset(ae_int_t n, double threshold, double bithreshold, ae_int_t* failc, ae_int_t* runs, ae_bool* nserrors, ae_bool* serrors, ae_bool* herrors, ae_bool* tderrors, ae_bool* sbierrors, ae_bool* hbierrors, ae_bool* tdbierrors, ae_state *_state) { ae_frame _frame_block; ae_matrix ra; ae_matrix ral; ae_matrix rau; ae_matrix ca; ae_matrix cal; ae_matrix cau; ae_vector d; ae_vector e; ae_int_t i; ae_int_t j; ae_int_t mkind; ae_frame_make(_state, &_frame_block); ae_matrix_init(&ra, 0, 0, DT_REAL, _state); ae_matrix_init(&ral, 0, 0, DT_REAL, _state); ae_matrix_init(&rau, 0, 0, DT_REAL, _state); ae_matrix_init(&ca, 0, 0, DT_COMPLEX, _state); ae_matrix_init(&cal, 0, 0, DT_COMPLEX, _state); ae_matrix_init(&cau, 0, 0, DT_COMPLEX, _state); ae_vector_init(&d, 0, DT_REAL, _state); ae_vector_init(&e, 0, DT_REAL, _state); /* * Test symmetric problems */ ae_matrix_set_length(&ra, n, n, _state); ae_matrix_set_length(&ral, n, n, _state); ae_matrix_set_length(&rau, n, n, _state); ae_matrix_set_length(&ca, n, n, _state); ae_matrix_set_length(&cal, n, n, _state); ae_matrix_set_length(&cau, n, n, _state); /* * Zero matrices */ for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { ra.ptr.pp_double[i][j] = (double)(0); ca.ptr.pp_complex[i][j] = ae_complex_from_i(0); } } testevdunit_rmatrixsymmetricsplit(&ra, n, &ral, &rau, _state); testevdunit_cmatrixhermitiansplit(&ca, n, &cal, &cau, _state); testevdunit_testsevdproblem(&ra, &ral, &rau, n, threshold, serrors, failc, runs, _state); testevdunit_testhevdproblem(&ca, &cal, &cau, n, threshold, herrors, failc, runs, _state); testevdunit_testsevdbiproblem(&ra, &ral, &rau, n, ae_false, bithreshold, sbierrors, failc, runs, _state); testevdunit_testhevdbiproblem(&ca, &cal, &cau, n, ae_false, bithreshold, hbierrors, failc, runs, _state); /* * Random matrix */ for(i=0; i<=n-1; i++) { for(j=i+1; j<=n-1; j++) { ra.ptr.pp_double[i][j] = 2*ae_randomreal(_state)-1; ca.ptr.pp_complex[i][j].x = 2*ae_randomreal(_state)-1; ca.ptr.pp_complex[i][j].y = 2*ae_randomreal(_state)-1; ra.ptr.pp_double[j][i] = ra.ptr.pp_double[i][j]; ca.ptr.pp_complex[j][i] = ae_c_conj(ca.ptr.pp_complex[i][j], _state); } ra.ptr.pp_double[i][i] = 2*ae_randomreal(_state)-1; ca.ptr.pp_complex[i][i] = ae_complex_from_d(2*ae_randomreal(_state)-1); } testevdunit_rmatrixsymmetricsplit(&ra, n, &ral, &rau, _state); testevdunit_cmatrixhermitiansplit(&ca, n, &cal, &cau, _state); testevdunit_testsevdproblem(&ra, &ral, &rau, n, threshold, serrors, failc, runs, _state); testevdunit_testhevdproblem(&ca, &cal, &cau, n, threshold, herrors, failc, runs, _state); /* * Random diagonally dominant matrix with distinct eigenvalues */ for(i=0; i<=n-1; i++) { for(j=i+1; j<=n-1; j++) { ra.ptr.pp_double[i][j] = 0.1*(2*ae_randomreal(_state)-1)/n; ca.ptr.pp_complex[i][j].x = 0.1*(2*ae_randomreal(_state)-1)/n; ca.ptr.pp_complex[i][j].y = 0.1*(2*ae_randomreal(_state)-1)/n; ra.ptr.pp_double[j][i] = ra.ptr.pp_double[i][j]; ca.ptr.pp_complex[j][i] = ae_c_conj(ca.ptr.pp_complex[i][j], _state); } ra.ptr.pp_double[i][i] = 0.1*(2*ae_randomreal(_state)-1)+i; ca.ptr.pp_complex[i][i] = ae_complex_from_d(0.1*(2*ae_randomreal(_state)-1)+i); } testevdunit_rmatrixsymmetricsplit(&ra, n, &ral, &rau, _state); testevdunit_cmatrixhermitiansplit(&ca, n, &cal, &cau, _state); testevdunit_testsevdproblem(&ra, &ral, &rau, n, threshold, serrors, failc, runs, _state); testevdunit_testhevdproblem(&ca, &cal, &cau, n, threshold, herrors, failc, runs, _state); 
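/* The bisection-based solvers are also exercised on this matrix; ae_true is passed below because the diagonally dominant construction above produces distinct eigenvalues (compare the ae_false used for the zero and sparse matrices). */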
testevdunit_testsevdbiproblem(&ra, &ral, &rau, n, ae_true, bithreshold, sbierrors, failc, runs, _state); testevdunit_testhevdbiproblem(&ca, &cal, &cau, n, ae_true, bithreshold, hbierrors, failc, runs, _state); /* * Sparse matrices */ testevdunit_rmatrixfillsparsea(&ra, n, n, 0.995, (double)(0), _state); testevdunit_cmatrixfillsparsea(&ca, n, n, 0.995, (double)(0), _state); for(i=0; i<=n-1; i++) { for(j=i+1; j<=n-1; j++) { ra.ptr.pp_double[j][i] = ra.ptr.pp_double[i][j]; ca.ptr.pp_complex[j][i] = ae_c_conj(ca.ptr.pp_complex[i][j], _state); } ca.ptr.pp_complex[i][i].y = (double)(0); } testevdunit_rmatrixsymmetricsplit(&ra, n, &ral, &rau, _state); testevdunit_cmatrixhermitiansplit(&ca, n, &cal, &cau, _state); testevdunit_testsevdproblem(&ra, &ral, &rau, n, threshold, serrors, failc, runs, _state); testevdunit_testhevdproblem(&ca, &cal, &cau, n, threshold, herrors, failc, runs, _state); testevdunit_testsevdbiproblem(&ra, &ral, &rau, n, ae_false, bithreshold, sbierrors, failc, runs, _state); testevdunit_testhevdbiproblem(&ca, &cal, &cau, n, ae_false, bithreshold, hbierrors, failc, runs, _state); /* * testing tridiagonal problems */ for(mkind=0; mkind<=7; mkind++) { ae_vector_set_length(&d, n, _state); if( n>1 ) { ae_vector_set_length(&e, n-1, _state); } if( mkind==0 ) { /* * Zero matrix */ for(i=0; i<=n-1; i++) { d.ptr.p_double[i] = (double)(0); } for(i=0; i<=n-2; i++) { e.ptr.p_double[i] = (double)(0); } } if( mkind==1 ) { /* * Diagonal matrix */ for(i=0; i<=n-1; i++) { d.ptr.p_double[i] = 2*ae_randomreal(_state)-1; } for(i=0; i<=n-2; i++) { e.ptr.p_double[i] = (double)(0); } } if( mkind==2 ) { /* * Off-diagonal matrix */ for(i=0; i<=n-1; i++) { d.ptr.p_double[i] = (double)(0); } for(i=0; i<=n-2; i++) { e.ptr.p_double[i] = 2*ae_randomreal(_state)-1; } } if( mkind==3 ) { /* * Dense matrix with blocks */ for(i=0; i<=n-1; i++) { d.ptr.p_double[i] = 2*ae_randomreal(_state)-1; } for(i=0; i<=n-2; i++) { e.ptr.p_double[i] = 2*ae_randomreal(_state)-1; } j = 1; i = 2; while(j<=n-2) { e.ptr.p_double[j] = (double)(0); j = j+i; i = i+1; } } if( mkind==4 ) { /* * dense matrix */ for(i=0; i<=n-1; i++) { d.ptr.p_double[i] = 2*ae_randomreal(_state)-1; } for(i=0; i<=n-2; i++) { e.ptr.p_double[i] = 2*ae_randomreal(_state)-1; } } if( mkind==5 ) { /* * Diagonal matrix with distinct eigenvalues */ for(i=0; i<=n-1; i++) { d.ptr.p_double[i] = 0.1*(2*ae_randomreal(_state)-1)+i; } for(i=0; i<=n-2; i++) { e.ptr.p_double[i] = (double)(0); } } if( mkind==6 ) { /* * Off-diagonal matrix with distinct eigenvalues */ for(i=0; i<=n-1; i++) { d.ptr.p_double[i] = (double)(0); } for(i=0; i<=n-2; i++) { e.ptr.p_double[i] = 0.1*(2*ae_randomreal(_state)-1)+i+1; } } if( mkind==7 ) { /* * dense matrix with distinct eigenvalues */ for(i=0; i<=n-1; i++) { d.ptr.p_double[i] = 0.1*(2*ae_randomreal(_state)-1)+i+1; } for(i=0; i<=n-2; i++) { e.ptr.p_double[i] = 0.1*(2*ae_randomreal(_state)-1); } } testevdunit_testtdevdproblem(&d, &e, n, threshold, tderrors, _state); testevdunit_testtdevdbiproblem(&d, &e, n, (mkind==5||mkind==6)||mkind==7, bithreshold, tdbierrors, failc, runs, _state); } /* * Test non-symmetric problems */ /* * Test non-symmetric problems: zero, random, sparse matrices. 
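 * Each matrix is passed to TestNSEVDProblem, which calls RMatrixEVD with all VJob settings,
 * checks that eigenvalues agree across jobs, and verifies that the returned left/right
 * eigenvectors satisfy their defining relations to within Threshold.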
*/ ae_matrix_set_length(&ra, n, n, _state); ae_matrix_set_length(&ca, n, n, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { ra.ptr.pp_double[i][j] = (double)(0); ca.ptr.pp_complex[i][j] = ae_complex_from_i(0); } } testevdunit_testnsevdproblem(&ra, n, threshold, nserrors, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { ra.ptr.pp_double[i][j] = 2*ae_randomreal(_state)-1; ca.ptr.pp_complex[i][j].x = 2*ae_randomreal(_state)-1; ca.ptr.pp_complex[i][j].y = 2*ae_randomreal(_state)-1; } } testevdunit_testnsevdproblem(&ra, n, threshold, nserrors, _state); testevdunit_rmatrixfillsparsea(&ra, n, n, 0.995, (double)(0), _state); testevdunit_cmatrixfillsparsea(&ca, n, n, 0.995, (double)(0), _state); testevdunit_testnsevdproblem(&ra, n, threshold, nserrors, _state); ae_frame_leave(_state); } /************************************************************************* Testing symmetric mode of subspace iteration solver. On failure sets error flag, on success flag is not changed. *************************************************************************/ static void testevdunit_testsisymm(ae_bool* errorflag, ae_state *_state) { ae_frame _frame_block; ae_int_t n; ae_int_t m; ae_int_t smode; ae_int_t i; ae_int_t j; ae_int_t k; hqrndstate rs; double sgn; ae_vector diaga; double decaya; ae_bool isupper; ae_matrix qa; ae_matrix densea; ae_matrix halfa; ae_matrix ra; ae_matrix rb; sparsematrix halfsa; ae_int_t sformat; double v; double mx; eigsubspacestate s; eigsubspacereport rep; ae_vector sw; ae_matrix sz; double tollambda; double tolvector; ae_int_t itscount; ae_int_t callcount; ae_int_t requestsize; ae_int_t requesttype; ae_frame_make(_state, &_frame_block); _hqrndstate_init(&rs, _state); ae_vector_init(&diaga, 0, DT_REAL, _state); ae_matrix_init(&qa, 0, 0, DT_REAL, _state); ae_matrix_init(&densea, 0, 0, DT_REAL, _state); ae_matrix_init(&halfa, 0, 0, DT_REAL, _state); ae_matrix_init(&ra, 0, 0, DT_REAL, _state); ae_matrix_init(&rb, 0, 0, DT_REAL, _state); _sparsematrix_init(&halfsa, _state); _eigsubspacestate_init(&s, _state); _eigsubspacereport_init(&rep, _state); ae_vector_init(&sw, 0, DT_REAL, _state); ae_matrix_init(&sz, 0, 0, DT_REAL, _state); hqrndrandomize(&rs, _state); /* * Problem with weakly separated eigenvalues (but not too weak), * either with alternating sign - or all positive. EPS-based * stopping condition is used because we can not tell how many * iterations is required to solve it. 
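 * The spectrum is a slightly perturbed geometric progression with ratio about 1/1.05
 * (DecayA below), optionally alternating in sign, so neighbouring eigenvalues differ
 * in magnitude by only about 5% - separated, but not strongly.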
*/ tollambda = 1.0E-9; tolvector = 1.0E-4; for(n=1; n<=25; n++) { for(m=1; m<=n; m++) { for(smode=0; smode<=2; smode++) { /* * Generate eigenproblem */ sgn = (double)(2*hqrnduniformi(&rs, 2, _state)-1); decaya = 1.05; ae_vector_set_length(&diaga, n, _state); diaga.ptr.p_double[0] = ae_pow((double)(10), 2*hqrnduniformr(&rs, _state)-1, _state); for(i=1; i<=n-1; i++) { diaga.ptr.p_double[i] = diaga.ptr.p_double[i-1]*sgn/(decaya*(1+0.01*hqrnduniformr(&rs, _state))); } rmatrixrndorthogonal(n, &qa, _state); ae_matrix_set_length(&densea, n, n, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { v = (double)(0); for(k=0; k<=n-1; k++) { v = v+qa.ptr.pp_double[k][i]*diaga.ptr.p_double[k]*qa.ptr.pp_double[k][j]; } densea.ptr.pp_double[i][j] = v; } } isupper = ae_fp_greater(hqrnduniformr(&rs, _state),0.5); ae_matrix_set_length(&halfa, n, n, _state); sparsecreate(n, n, 0, &halfsa, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { halfa.ptr.pp_double[i][j] = hqrnduniformr(&rs, _state); if( j>=i&&isupper ) { halfa.ptr.pp_double[i][j] = densea.ptr.pp_double[i][j]; sparseset(&halfsa, i, j, densea.ptr.pp_double[i][j], _state); } if( j<=i&&!isupper ) { halfa.ptr.pp_double[i][j] = densea.ptr.pp_double[i][j]; sparseset(&halfsa, i, j, densea.ptr.pp_double[i][j], _state); } } } sformat = hqrnduniformi(&rs, 2, _state); if( sformat==0 ) { sparseconverttocrs(&halfsa, _state); } if( sformat==1 ) { sparseconverttosks(&halfsa, _state); } /* * Solve with eigensolver operating in dense mode */ eigsubspacecreate(n, m, &s, _state); eigsubspacesetcond(&s, tollambda/10, 0, _state); if( smode==0 ) { /* * Dense mode */ eigsubspacesolvedenses(&s, &halfa, isupper, &sw, &sz, &rep, _state); } else { if( smode==1 ) { /* * Sparse mode */ eigsubspacesolvesparses(&s, &halfsa, isupper, &sw, &sz, &rep, _state); } else { if( smode==2 ) { /* * Out-of-core mode, symmetric version */ eigsubspaceoocstart(&s, 0, _state); while(eigsubspaceooccontinue(&s, _state)) { eigsubspaceoocgetrequestinfo(&s, &requesttype, &requestsize, _state); ae_assert(requesttype==0, "EVDI: integrity check failed in unit test", _state); ae_assert(requestsize>0, "EVDI: integrity check failed in unit test", _state); eigsubspaceoocgetrequestdata(&s, &ra, _state); rmatrixsetlengthatleast(&rb, n, requestsize, _state); rmatrixgemm(n, requestsize, n, 1.0, &densea, 0, 0, 0, &ra, 0, 0, 0, 0.0, &rb, 0, 0, _state); eigsubspaceoocsendresult(&s, &rb, _state); } eigsubspaceoocstop(&s, &sw, &sz, &rep, _state); } else { ae_assert(ae_false, "unittest: integrity check failed", _state); } } } /* * Compare against reference values */ for(i=0; i<=m-1; i++) { seterrorflag(errorflag, ae_fp_greater(ae_fabs(sw.ptr.p_double[i]-diaga.ptr.p_double[i], _state),tollambda), _state); v = ae_v_dotproduct(&qa.ptr.pp_double[i][0], 1, &sz.ptr.pp_double[0][i], sz.stride, ae_v_len(0,n-1)); mx = (double)(0); for(j=0; j<=n-1; j++) { mx = ae_maxreal(mx, ae_fabs(sz.ptr.pp_double[j][i]*ae_sign(v, _state)-qa.ptr.pp_double[i][j], _state), _state); } seterrorflag(errorflag, ae_fp_greater(mx,tolvector), _state); } } } } /* * Problem with highly separated eigenvalues (either with alternating * sign - or all positive). Only a few iterations is performed, we * want to check convergence properties on such problems. 
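 * Eigenvalues with indexes M..N-1 are additionally divided by a factor of roughly 100-110,
 * so the wanted M-dimensional subspace dominates strongly and ItsCount=5 iterations should
 * be enough; the out-of-core branch also verifies that no more than ItsCount+2 product
 * requests are issued.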
*/ tollambda = 1.0E-9; tolvector = 1.0E-4; itscount = 5; for(n=1; n<=25; n++) { for(m=1; m<=n; m++) { for(smode=0; smode<=2; smode++) { /* * Generate eigenproblem */ sgn = (double)(2*hqrnduniformi(&rs, 2, _state)-1); decaya = 1.05; ae_vector_set_length(&diaga, n, _state); diaga.ptr.p_double[0] = ae_pow((double)(10), 2*hqrnduniformr(&rs, _state)-1, _state); for(i=1; i<=n-1; i++) { diaga.ptr.p_double[i] = diaga.ptr.p_double[i-1]*sgn/(decaya*(1+0.01*hqrnduniformr(&rs, _state))); } for(i=m; i<=n-1; i++) { diaga.ptr.p_double[i] = diaga.ptr.p_double[i]/(100+10*hqrnduniformr(&rs, _state)); } rmatrixrndorthogonal(n, &qa, _state); ae_matrix_set_length(&densea, n, n, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { v = (double)(0); for(k=0; k<=n-1; k++) { v = v+qa.ptr.pp_double[k][i]*diaga.ptr.p_double[k]*qa.ptr.pp_double[k][j]; } densea.ptr.pp_double[i][j] = v; } } isupper = ae_fp_greater(hqrnduniformr(&rs, _state),0.5); ae_matrix_set_length(&halfa, n, n, _state); sparsecreate(n, n, 0, &halfsa, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { halfa.ptr.pp_double[i][j] = hqrnduniformr(&rs, _state); if( j>=i&&isupper ) { halfa.ptr.pp_double[i][j] = densea.ptr.pp_double[i][j]; sparseset(&halfsa, i, j, densea.ptr.pp_double[i][j], _state); } if( j<=i&&!isupper ) { halfa.ptr.pp_double[i][j] = densea.ptr.pp_double[i][j]; sparseset(&halfsa, i, j, densea.ptr.pp_double[i][j], _state); } } } sformat = hqrnduniformi(&rs, 2, _state); if( sformat==0 ) { sparseconverttocrs(&halfsa, _state); } if( sformat==1 ) { sparseconverttosks(&halfsa, _state); } /* * Solve with eigensolver operating in dense mode */ eigsubspacecreate(n, m, &s, _state); eigsubspacesetcond(&s, (double)(0), itscount, _state); if( smode==0 ) { /* * Dense mode */ eigsubspacesolvedenses(&s, &halfa, isupper, &sw, &sz, &rep, _state); } else { if( smode==1 ) { /* * Sparse mode */ eigsubspacesolvesparses(&s, &halfsa, isupper, &sw, &sz, &rep, _state); } else { if( smode==2 ) { /* * Out-of-core mode, symmetric version * * NOTE: we check that solver performs no more than ItsCount+2 calls */ callcount = 0; eigsubspaceoocstart(&s, 0, _state); while(eigsubspaceooccontinue(&s, _state)) { eigsubspaceoocgetrequestinfo(&s, &requesttype, &requestsize, _state); ae_assert(requesttype==0, "EVDI: integrity check failed in unit test", _state); ae_assert(requestsize>0, "EVDI: integrity check failed in unit test", _state); eigsubspaceoocgetrequestdata(&s, &ra, _state); rmatrixsetlengthatleast(&rb, n, requestsize, _state); rmatrixgemm(n, requestsize, n, 1.0, &densea, 0, 0, 0, &ra, 0, 0, 0, 0.0, &rb, 0, 0, _state); eigsubspaceoocsendresult(&s, &rb, _state); callcount = callcount+1; } eigsubspaceoocstop(&s, &sw, &sz, &rep, _state); seterrorflag(errorflag, callcount>itscount+2, _state); } else { ae_assert(ae_false, "unittest: integrity check failed", _state); } } } /* * Compare against reference values */ seterrorflag(errorflag, rep.iterationscount>itscount, _state); for(i=0; i<=m-1; i++) { seterrorflag(errorflag, ae_fp_greater(ae_fabs(sw.ptr.p_double[i]-diaga.ptr.p_double[i], _state),tollambda), _state); v = ae_v_dotproduct(&qa.ptr.pp_double[i][0], 1, &sz.ptr.pp_double[0][i], sz.stride, ae_v_len(0,n-1)); mx = (double)(0); for(j=0; j<=n-1; j++) { mx = ae_maxreal(mx, ae_fabs(sz.ptr.pp_double[j][i]*ae_sign(v, _state)-qa.ptr.pp_double[i][j], _state), _state); } seterrorflag(errorflag, ae_fp_greater(mx,tolvector), _state); } } } } /* * Problem with numerically zero matrix. We check ability to stop * and to return orthogonal vectors. 
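 * All returned eigenvalue estimates must be exactly zero, and the returned vectors must
 * remain orthonormal to within 1.0E3*MachineEpsilon; both properties are checked for the
 * iteration-count-based and for the EPS-based stopping condition.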
*/ tollambda = 1.0E-9; itscount = 5; for(n=1; n<=25; n++) { for(m=1; m<=n; m++) { /* * Generate eigenproblem */ ae_matrix_set_length(&densea, n, n, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { densea.ptr.pp_double[i][j] = (double)(0); } } /* * Solve with eigensolver operating in dense mode, * iteration count based stopping condition. */ eigsubspacecreate(n, m, &s, _state); eigsubspacesetcond(&s, (double)(0), itscount, _state); eigsubspacesolvedenses(&s, &densea, ae_true, &sw, &sz, &rep, _state); seterrorflag(errorflag, rep.iterationscount>itscount, _state); for(i=0; i<=m-1; i++) { for(j=i; j<=m-1; j++) { v = ae_v_dotproduct(&sz.ptr.pp_double[0][i], sz.stride, &sz.ptr.pp_double[0][j], sz.stride, ae_v_len(0,n-1)); if( j==i ) { v = v-1; } seterrorflag(errorflag, ae_fp_greater(ae_fabs(v, _state),1.0E3*ae_machineepsilon), _state); } seterrorflag(errorflag, ae_fp_neq(sw.ptr.p_double[i],(double)(0)), _state); } /* * Solve with eigensolver operating in dense mode, * eps-based stopping condition. */ eigsubspacecreate(n, m, &s, _state); eigsubspacesetcond(&s, tollambda, 0, _state); eigsubspacesolvedenses(&s, &densea, ae_true, &sw, &sz, &rep, _state); for(i=0; i<=m-1; i++) { for(j=i; j<=m-1; j++) { v = ae_v_dotproduct(&sz.ptr.pp_double[0][i], sz.stride, &sz.ptr.pp_double[0][j], sz.stride, ae_v_len(0,n-1)); if( j==i ) { v = v-1; } seterrorflag(errorflag, ae_fp_greater(ae_fabs(v, _state),1.0E3*ae_machineepsilon), _state); } seterrorflag(errorflag, ae_fp_neq(sw.ptr.p_double[i],(double)(0)), _state); } } } ae_frame_leave(_state); } static void testbasestatunit_testranking(ae_bool* err, ae_state *_state); ae_bool testbasestat(ae_bool silent, ae_state *_state) { ae_frame _frame_block; ae_bool waserrors; ae_bool s1errors; ae_bool covcorrerrors; ae_bool rankerrors; double threshold; ae_int_t i; ae_int_t j; ae_int_t n; ae_int_t kx; ae_int_t ky; ae_int_t ctype; ae_int_t cidxx; ae_int_t cidxy; ae_vector x; ae_vector y; ae_matrix mx; ae_matrix my; ae_matrix cc; ae_matrix cp; ae_matrix cs; double mean; double variance; double skewness; double kurtosis; double adev; double median; double pv; double v; double tmean; double tvariance; double tskewness; double tkurtosis; ae_bool result; ae_frame_make(_state, &_frame_block); ae_vector_init(&x, 0, DT_REAL, _state); ae_vector_init(&y, 0, DT_REAL, _state); ae_matrix_init(&mx, 0, 0, DT_REAL, _state); ae_matrix_init(&my, 0, 0, DT_REAL, _state); ae_matrix_init(&cc, 0, 0, DT_REAL, _state); ae_matrix_init(&cp, 0, 0, DT_REAL, _state); ae_matrix_init(&cs, 0, 0, DT_REAL, _state); /* * Primary settings */ waserrors = ae_false; s1errors = ae_false; covcorrerrors = ae_false; rankerrors = ae_false; threshold = 1000*ae_machineepsilon; /* * Ranking */ testbasestatunit_testranking(&rankerrors, _state); /* * * prepare X and Y - two test samples * * test 1-sample coefficients * * test for SampleMean, SampleVariance, * SampleSkewness, SampleKurtosis. 
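 * The sample is X[i]=i^2, i=0..9; the hard-coded reference values below
 * (mean 28.5, variance 801.1667, and so on) were precomputed for this fixed sample.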
*/ n = 10; ae_vector_set_length(&x, n, _state); for(i=0; i<=n-1; i++) { x.ptr.p_double[i] = ae_sqr((double)(i), _state); } samplemoments(&x, n, &mean, &variance, &skewness, &kurtosis, _state); s1errors = s1errors||ae_fp_greater(ae_fabs(mean-28.5, _state),0.001); s1errors = s1errors||ae_fp_greater(ae_fabs(variance-801.1667, _state),0.001); s1errors = s1errors||ae_fp_greater(ae_fabs(skewness-0.5751, _state),0.001); s1errors = s1errors||ae_fp_greater(ae_fabs(kurtosis+1.2666, _state),0.001); tmean = samplemean(&x, n, _state); tvariance = samplevariance(&x, n, _state); tskewness = sampleskewness(&x, n, _state); tkurtosis = samplekurtosis(&x, n, _state); s1errors = s1errors||ae_fp_greater(ae_fabs(mean-tmean, _state),1.0E-15); s1errors = s1errors||ae_fp_greater(ae_fabs(variance-tvariance, _state),1.0E-15); s1errors = s1errors||ae_fp_greater(ae_fabs(skewness-tskewness, _state),1.0E-15); s1errors = s1errors||ae_fp_greater(ae_fabs(kurtosis-tkurtosis, _state),1.0E-15); sampleadev(&x, n, &adev, _state); s1errors = s1errors||ae_fp_greater(ae_fabs(adev-23.2000, _state),0.001); samplemedian(&x, n, &median, _state); s1errors = s1errors||ae_fp_greater(ae_fabs(median-0.5*(16+25), _state),0.001); for(i=0; i<=n-1; i++) { samplepercentile(&x, n, (double)i/(double)(n-1), &pv, _state); s1errors = s1errors||ae_fp_greater(ae_fabs(pv-x.ptr.p_double[i], _state),0.001); } samplepercentile(&x, n, 0.5, &pv, _state); s1errors = s1errors||ae_fp_greater(ae_fabs(pv-0.5*(16+25), _state),0.001); /* * test covariance/correlation: * * 2-sample coefficients * * We generate random matrices MX and MY */ n = 10; ae_vector_set_length(&x, n, _state); ae_vector_set_length(&y, n, _state); for(i=0; i<=n-1; i++) { x.ptr.p_double[i] = ae_sqr((double)(i), _state); y.ptr.p_double[i] = (double)(i); } covcorrerrors = covcorrerrors||ae_fp_greater(ae_fabs(pearsoncorr2(&x, &y, n, _state)-0.9627, _state),0.0001); covcorrerrors = covcorrerrors||ae_fp_greater(ae_fabs(spearmancorr2(&x, &y, n, _state)-1.0000, _state),0.0001); covcorrerrors = covcorrerrors||ae_fp_greater(ae_fabs(cov2(&x, &y, n, _state)-82.5000, _state),0.0001); for(i=0; i<=n-1; i++) { x.ptr.p_double[i] = ae_sqr(i-0.5*n, _state); y.ptr.p_double[i] = (double)(i); } covcorrerrors = covcorrerrors||ae_fp_greater(ae_fabs(pearsoncorr2(&x, &y, n, _state)+0.3676, _state),0.0001); covcorrerrors = covcorrerrors||ae_fp_greater(ae_fabs(spearmancorr2(&x, &y, n, _state)+0.2761, _state),0.0001); covcorrerrors = covcorrerrors||ae_fp_greater(ae_fabs(cov2(&x, &y, n, _state)+9.1667, _state),0.0001); /* * test covariance/correlation: * * matrix covariance/correlation * * matrix cross-covariance/cross-correlation * * We generate random matrices MX and MY which contain KX (KY) * columns, all except one are random, one of them is constant. * We test that function (a) do not crash on constant column, * and (b) return variances/correlations that are exactly zero * for this column. * * CType control variable controls type of constant: 0 - no constant * column, 1 - zero column, 2 - nonzero column with value whose * binary representation contains many non-zero bits. Using such * type of constant column we are able to ensure than even in the * presense of roundoff error functions correctly detect constant * columns. 
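 * With CType=2 the constant equals sqrt((RandomInteger(KX)+1)/KX), which generally has a
 * long nonzero mantissa, so the functions must recognize the column as constant despite
 * the fact that sums of such values are subject to rounding error.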
*/ for(n=0; n<=10; n++) { if( n>0 ) { ae_vector_set_length(&x, n, _state); ae_vector_set_length(&y, n, _state); } for(ctype=0; ctype<=2; ctype++) { for(kx=1; kx<=10; kx++) { for(ky=1; ky<=10; ky++) { /* * Fill matrices, add constant column (when CType=1 or =2) */ cidxx = -1; cidxy = -1; if( n>0 ) { ae_matrix_set_length(&mx, n, kx, _state); ae_matrix_set_length(&my, n, ky, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=kx-1; j++) { mx.ptr.pp_double[i][j] = 2*ae_randomreal(_state)-1; } for(j=0; j<=ky-1; j++) { my.ptr.pp_double[i][j] = 2*ae_randomreal(_state)-1; } } if( ctype==1 ) { cidxx = ae_randominteger(kx, _state); cidxy = ae_randominteger(ky, _state); for(i=0; i<=n-1; i++) { mx.ptr.pp_double[i][cidxx] = 0.0; my.ptr.pp_double[i][cidxy] = 0.0; } } if( ctype==2 ) { cidxx = ae_randominteger(kx, _state); cidxy = ae_randominteger(ky, _state); v = ae_sqrt((double)(ae_randominteger(kx, _state)+1)/(double)kx, _state); for(i=0; i<=n-1; i++) { mx.ptr.pp_double[i][cidxx] = v; my.ptr.pp_double[i][cidxy] = v; } } } /* * test covariance/correlation matrix using * 2-sample functions as reference point. * * We also test that coefficients for constant variables * are exactly zero. */ covm(&mx, n, kx, &cc, _state); pearsoncorrm(&mx, n, kx, &cp, _state); spearmancorrm(&mx, n, kx, &cs, _state); for(i=0; i<=kx-1; i++) { for(j=0; j<=kx-1; j++) { if( n>0 ) { ae_v_move(&x.ptr.p_double[0], 1, &mx.ptr.pp_double[0][i], mx.stride, ae_v_len(0,n-1)); ae_v_move(&y.ptr.p_double[0], 1, &mx.ptr.pp_double[0][j], mx.stride, ae_v_len(0,n-1)); } covcorrerrors = covcorrerrors||ae_fp_greater(ae_fabs(cov2(&x, &y, n, _state)-cc.ptr.pp_double[i][j], _state),threshold); covcorrerrors = covcorrerrors||ae_fp_greater(ae_fabs(pearsoncorr2(&x, &y, n, _state)-cp.ptr.pp_double[i][j], _state),threshold); covcorrerrors = covcorrerrors||ae_fp_greater(ae_fabs(spearmancorr2(&x, &y, n, _state)-cs.ptr.pp_double[i][j], _state),threshold); } } if( ctype!=0&&n>0 ) { for(i=0; i<=kx-1; i++) { covcorrerrors = covcorrerrors||ae_fp_neq(cc.ptr.pp_double[i][cidxx],(double)(0)); covcorrerrors = covcorrerrors||ae_fp_neq(cc.ptr.pp_double[cidxx][i],(double)(0)); covcorrerrors = covcorrerrors||ae_fp_neq(cp.ptr.pp_double[i][cidxx],(double)(0)); covcorrerrors = covcorrerrors||ae_fp_neq(cp.ptr.pp_double[cidxx][i],(double)(0)); covcorrerrors = covcorrerrors||ae_fp_neq(cs.ptr.pp_double[i][cidxx],(double)(0)); covcorrerrors = covcorrerrors||ae_fp_neq(cs.ptr.pp_double[cidxx][i],(double)(0)); } } /* * test cross-covariance/cross-correlation matrix using * 2-sample functions as reference point. * * We also test that coefficients for constant variables * are exactly zero. 
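 * When CType!=0, column CIdxY and row CIdxX of the cross-covariance/cross-correlation
 * matrices must contain exact zeros; this is verified below.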
*/ covm2(&mx, &my, n, kx, ky, &cc, _state); pearsoncorrm2(&mx, &my, n, kx, ky, &cp, _state); spearmancorrm2(&mx, &my, n, kx, ky, &cs, _state); for(i=0; i<=kx-1; i++) { for(j=0; j<=ky-1; j++) { if( n>0 ) { ae_v_move(&x.ptr.p_double[0], 1, &mx.ptr.pp_double[0][i], mx.stride, ae_v_len(0,n-1)); ae_v_move(&y.ptr.p_double[0], 1, &my.ptr.pp_double[0][j], my.stride, ae_v_len(0,n-1)); } covcorrerrors = covcorrerrors||ae_fp_greater(ae_fabs(cov2(&x, &y, n, _state)-cc.ptr.pp_double[i][j], _state),threshold); covcorrerrors = covcorrerrors||ae_fp_greater(ae_fabs(pearsoncorr2(&x, &y, n, _state)-cp.ptr.pp_double[i][j], _state),threshold); covcorrerrors = covcorrerrors||ae_fp_greater(ae_fabs(spearmancorr2(&x, &y, n, _state)-cs.ptr.pp_double[i][j], _state),threshold); } } if( ctype!=0&&n>0 ) { for(i=0; i<=kx-1; i++) { covcorrerrors = covcorrerrors||ae_fp_neq(cc.ptr.pp_double[i][cidxy],(double)(0)); covcorrerrors = covcorrerrors||ae_fp_neq(cp.ptr.pp_double[i][cidxy],(double)(0)); covcorrerrors = covcorrerrors||ae_fp_neq(cs.ptr.pp_double[i][cidxy],(double)(0)); } for(j=0; j<=ky-1; j++) { covcorrerrors = covcorrerrors||ae_fp_neq(cc.ptr.pp_double[cidxx][j],(double)(0)); covcorrerrors = covcorrerrors||ae_fp_neq(cp.ptr.pp_double[cidxx][j],(double)(0)); covcorrerrors = covcorrerrors||ae_fp_neq(cs.ptr.pp_double[cidxx][j],(double)(0)); } } } } } } /* * Final report */ waserrors = (s1errors||covcorrerrors)||rankerrors; if( !silent ) { printf("DESC.STAT TEST\n"); printf("TOTAL RESULTS: "); if( !waserrors ) { printf("OK\n"); } else { printf("FAILED\n"); } printf("* 1-SAMPLE FUNCTIONALITY: "); if( !s1errors ) { printf("OK\n"); } else { printf("FAILED\n"); } printf("* CORRELATION/COVARIATION: "); if( !covcorrerrors ) { printf("OK\n"); } else { printf("FAILED\n"); } printf("* RANKING: "); if( !rankerrors ) { printf("OK\n"); } else { printf("FAILED\n"); } if( waserrors ) { printf("TEST SUMMARY: FAILED\n"); } else { printf("TEST SUMMARY: PASSED\n"); } printf("\n\n"); } result = !waserrors; ae_frame_leave(_state); return result; } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. *************************************************************************/ ae_bool _pexec_testbasestat(ae_bool silent, ae_state *_state) { return testbasestat(silent, _state); } /************************************************************************* This function tests ranking functionality. In case of failure it sets Err parameter to True; this parameter is left unchanged otherwise. *************************************************************************/ static void testbasestatunit_testranking(ae_bool* err, ae_state *_state) { ae_frame _frame_block; ae_int_t testk; ae_int_t npoints; ae_int_t nfeatures; ae_int_t i; ae_int_t j; ae_int_t k; ae_matrix xy0; ae_matrix xy1; ae_matrix xy2; double v; ae_frame_make(_state, &_frame_block); ae_matrix_init(&xy0, 0, 0, DT_REAL, _state); ae_matrix_init(&xy1, 0, 0, DT_REAL, _state); ae_matrix_init(&xy2, 0, 0, DT_REAL, _state); /* * Test 1 - large array, unique ranks, each row is obtained as follows: * * we generate X[i=0..N-1] = I * * we add random noise: X[i] := X[i] + 0.2*randomreal()-0.1 * * we perform random permutation * * Such dataset has following properties: * * all data are unique within their rows * * rank(X[i]) = round(X[i]) * * We perform several tests with different NPoints/NFeatures. 
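 * Since the noise amplitude is at most 0.1 and the underlying values J differ by 1,
 * rounding recovers J exactly, which is also the rank of the element within its row -
 * hence the rank(X[i])=round(X[i]) identity used below.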
*/ for(testk=0; testk<=1; testk++) { /* * Select problem size */ if( testk==0 ) { npoints = 200; nfeatures = 1000; } else { npoints = 1000; nfeatures = 200; } /* * Generate XY0, XY1, XY2 */ ae_matrix_set_length(&xy0, npoints, nfeatures, _state); ae_matrix_set_length(&xy1, npoints, nfeatures, _state); ae_matrix_set_length(&xy2, npoints, nfeatures, _state); for(i=0; i<=npoints-1; i++) { for(j=0; j<=nfeatures-1; j++) { xy0.ptr.pp_double[i][j] = j+0.2*ae_randomreal(_state)-0.1; } for(j=0; j<=nfeatures-2; j++) { k = ae_randominteger(nfeatures-j, _state); if( k!=0 ) { v = xy0.ptr.pp_double[i][j]; xy0.ptr.pp_double[i][j] = xy0.ptr.pp_double[i][j+k]; xy0.ptr.pp_double[i][j+k] = v; } } for(j=0; j<=nfeatures-1; j++) { xy1.ptr.pp_double[i][j] = xy0.ptr.pp_double[i][j]; xy2.ptr.pp_double[i][j] = xy0.ptr.pp_double[i][j]; } } /* * Test uncentered ranks */ rankdata(&xy0, npoints, nfeatures, _state); for(i=0; i<=npoints-1; i++) { for(j=0; j<=nfeatures-1; j++) { if( ae_fp_neq(xy0.ptr.pp_double[i][j],(double)(ae_round(xy2.ptr.pp_double[i][j], _state))) ) { *err = ae_true; } } } /* * Test centered ranks: * they must be equal to uncentered ranks minus (NFeatures-1)/2 */ rankdatacentered(&xy1, npoints, nfeatures, _state); for(i=0; i<=npoints-1; i++) { for(j=0; j<=nfeatures-1; j++) { if( ae_fp_neq(xy1.ptr.pp_double[i][j],ae_round(xy2.ptr.pp_double[i][j], _state)-(double)(nfeatures-1)/(double)2) ) { *err = ae_true; } } } } /* * Test correct handling of tied ranks */ npoints = 3; nfeatures = 4; ae_matrix_set_length(&xy0, npoints, nfeatures, _state); ae_matrix_set_length(&xy1, npoints, nfeatures, _state); xy0.ptr.pp_double[0][0] = 2.25; xy0.ptr.pp_double[0][1] = 3.75; xy0.ptr.pp_double[0][2] = 3.25; xy0.ptr.pp_double[0][3] = 2.25; xy0.ptr.pp_double[1][0] = (double)(2); xy0.ptr.pp_double[1][1] = (double)(2); xy0.ptr.pp_double[1][2] = (double)(2); xy0.ptr.pp_double[1][3] = (double)(7); xy0.ptr.pp_double[2][0] = (double)(9); xy0.ptr.pp_double[2][1] = (double)(9); xy0.ptr.pp_double[2][2] = (double)(9); xy0.ptr.pp_double[2][3] = (double)(9); for(i=0; i<=npoints-1; i++) { for(j=0; j<=nfeatures-1; j++) { xy1.ptr.pp_double[i][j] = xy0.ptr.pp_double[i][j]; } } rankdata(&xy0, npoints, nfeatures, _state); if( ae_fp_greater(ae_fabs(xy0.ptr.pp_double[0][0]-0.5, _state),10*ae_machineepsilon) ) { *err = ae_true; } if( ae_fp_greater(ae_fabs(xy0.ptr.pp_double[0][1]-3.0, _state),10*ae_machineepsilon) ) { *err = ae_true; } if( ae_fp_greater(ae_fabs(xy0.ptr.pp_double[0][2]-2.0, _state),10*ae_machineepsilon) ) { *err = ae_true; } if( ae_fp_greater(ae_fabs(xy0.ptr.pp_double[0][3]-0.5, _state),10*ae_machineepsilon) ) { *err = ae_true; } if( ae_fp_greater(ae_fabs(xy0.ptr.pp_double[1][0]-1.0, _state),10*ae_machineepsilon) ) { *err = ae_true; } if( ae_fp_greater(ae_fabs(xy0.ptr.pp_double[1][1]-1.0, _state),10*ae_machineepsilon) ) { *err = ae_true; } if( ae_fp_greater(ae_fabs(xy0.ptr.pp_double[1][2]-1.0, _state),10*ae_machineepsilon) ) { *err = ae_true; } if( ae_fp_greater(ae_fabs(xy0.ptr.pp_double[1][3]-3.0, _state),10*ae_machineepsilon) ) { *err = ae_true; } if( ae_fp_greater(ae_fabs(xy0.ptr.pp_double[2][0]-1.5, _state),10*ae_machineepsilon) ) { *err = ae_true; } if( ae_fp_greater(ae_fabs(xy0.ptr.pp_double[2][1]-1.5, _state),10*ae_machineepsilon) ) { *err = ae_true; } if( ae_fp_greater(ae_fabs(xy0.ptr.pp_double[2][2]-1.5, _state),10*ae_machineepsilon) ) { *err = ae_true; } if( ae_fp_greater(ae_fabs(xy0.ptr.pp_double[2][3]-1.5, _state),10*ae_machineepsilon) ) { *err = ae_true; } rankdatacentered(&xy1, npoints, nfeatures, _state); if( 
ae_fp_greater(ae_fabs(xy1.ptr.pp_double[0][0]+1.0, _state),10*ae_machineepsilon) ) { *err = ae_true; } if( ae_fp_greater(ae_fabs(xy1.ptr.pp_double[0][1]-1.5, _state),10*ae_machineepsilon) ) { *err = ae_true; } if( ae_fp_greater(ae_fabs(xy1.ptr.pp_double[0][2]-0.5, _state),10*ae_machineepsilon) ) { *err = ae_true; } if( ae_fp_greater(ae_fabs(xy1.ptr.pp_double[0][3]+1.0, _state),10*ae_machineepsilon) ) { *err = ae_true; } if( ae_fp_greater(ae_fabs(xy1.ptr.pp_double[1][0]+0.5, _state),10*ae_machineepsilon) ) { *err = ae_true; } if( ae_fp_greater(ae_fabs(xy1.ptr.pp_double[1][1]+0.5, _state),10*ae_machineepsilon) ) { *err = ae_true; } if( ae_fp_greater(ae_fabs(xy1.ptr.pp_double[1][2]+0.5, _state),10*ae_machineepsilon) ) { *err = ae_true; } if( ae_fp_greater(ae_fabs(xy1.ptr.pp_double[1][3]-1.5, _state),10*ae_machineepsilon) ) { *err = ae_true; } if( ae_fp_neq(xy1.ptr.pp_double[2][0],(double)(0)) ) { *err = ae_true; } if( ae_fp_neq(xy1.ptr.pp_double[2][1],(double)(0)) ) { *err = ae_true; } if( ae_fp_neq(xy1.ptr.pp_double[2][2],(double)(0)) ) { *err = ae_true; } if( ae_fp_neq(xy1.ptr.pp_double[2][3],(double)(0)) ) { *err = ae_true; } ae_frame_leave(_state); } static void testpcaunit_calculatemv(/* Real */ ae_vector* x, ae_int_t n, double* mean, double* means, double* stddev, double* stddevs, ae_state *_state); ae_bool testpca(ae_bool silent, ae_state *_state) { ae_frame _frame_block; ae_int_t maxn; ae_int_t maxm; double threshold; ae_int_t m; ae_int_t n; ae_int_t i; ae_int_t j; ae_int_t k; ae_int_t info; ae_vector means; ae_vector s; ae_vector s2; ae_vector t2; ae_vector t3; ae_matrix v; ae_matrix v2; ae_matrix x; double t; double h; double tmean; double tmeans; double tstddev; double tstddevs; double tmean2; double tmeans2; double tstddev2; double tstddevs2; hqrndstate rs; ae_int_t requested; double tolreduced; double varcomplete; double varreduced; ae_int_t pass; ae_bool pcaconverrors; ae_bool pcaorterrors; ae_bool pcavarerrors; ae_bool pcaopterrors; ae_bool pcasubspaceerrors; ae_bool waserrors; ae_bool result; ae_frame_make(_state, &_frame_block); ae_vector_init(&means, 0, DT_REAL, _state); ae_vector_init(&s, 0, DT_REAL, _state); ae_vector_init(&s2, 0, DT_REAL, _state); ae_vector_init(&t2, 0, DT_REAL, _state); ae_vector_init(&t3, 0, DT_REAL, _state); ae_matrix_init(&v, 0, 0, DT_REAL, _state); ae_matrix_init(&v2, 0, 0, DT_REAL, _state); ae_matrix_init(&x, 0, 0, DT_REAL, _state); _hqrndstate_init(&rs, _state); hqrndrandomize(&rs, _state); /* * Primary settings */ maxm = 10; maxn = 100; threshold = 1000*ae_machineepsilon; waserrors = ae_false; pcaconverrors = ae_false; pcaorterrors = ae_false; pcavarerrors = ae_false; pcaopterrors = ae_false; pcasubspaceerrors = ae_false; /* * Test 1: N random points in M-dimensional space, full PCA. 
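 * For each generated task the code below checks orthonormality of the returned basis,
 * agreement of the reported variances with directly computed projection variances,
 * that the variances are non-increasing, and that small perturbations of the first
 * principal vector cannot increase the projection variance.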
*/ for(m=1; m<=maxm; m++) { for(n=1; n<=maxn; n++) { /* * Generate task */ ae_matrix_set_length(&x, n-1+1, m-1+1, _state); ae_vector_set_length(&means, m-1+1, _state); for(j=0; j<=m-1; j++) { means.ptr.p_double[j] = 1.5*ae_randomreal(_state)-0.75; } for(i=0; i<=n-1; i++) { for(j=0; j<=m-1; j++) { x.ptr.pp_double[i][j] = means.ptr.p_double[j]+(2*ae_randomreal(_state)-1); } } /* * Solve */ pcabuildbasis(&x, n, m, &info, &s, &v, _state); if( info!=1 ) { pcaconverrors = ae_true; continue; } /* * Orthogonality test */ for(i=0; i<=m-1; i++) { for(j=0; j<=m-1; j++) { t = ae_v_dotproduct(&v.ptr.pp_double[0][i], v.stride, &v.ptr.pp_double[0][j], v.stride, ae_v_len(0,m-1)); if( i==j ) { t = t-1; } pcaorterrors = pcaorterrors||ae_fp_greater(ae_fabs(t, _state),threshold); } } /* * Variance test */ ae_vector_set_length(&t2, n-1+1, _state); for(k=0; k<=m-1; k++) { for(i=0; i<=n-1; i++) { t = ae_v_dotproduct(&x.ptr.pp_double[i][0], 1, &v.ptr.pp_double[0][k], v.stride, ae_v_len(0,m-1)); t2.ptr.p_double[i] = t; } testpcaunit_calculatemv(&t2, n, &tmean, &tmeans, &tstddev, &tstddevs, _state); if( n!=1 ) { t = ae_sqr(tstddev, _state)*n/(n-1); } else { t = (double)(0); } pcavarerrors = pcavarerrors||ae_fp_greater(ae_fabs(t-s.ptr.p_double[k], _state),threshold); } for(k=0; k<=m-2; k++) { pcavarerrors = pcavarerrors||ae_fp_less(s.ptr.p_double[k],s.ptr.p_double[k+1]); } /* * Optimality: different perturbations in V[..,0] can't * increase variance of projection - can only decrease. */ ae_vector_set_length(&t2, n-1+1, _state); ae_vector_set_length(&t3, n-1+1, _state); for(i=0; i<=n-1; i++) { t = ae_v_dotproduct(&x.ptr.pp_double[i][0], 1, &v.ptr.pp_double[0][0], v.stride, ae_v_len(0,m-1)); t2.ptr.p_double[i] = t; } testpcaunit_calculatemv(&t2, n, &tmean, &tmeans, &tstddev, &tstddevs, _state); for(k=0; k<=2*m-1; k++) { h = 0.001; if( k%2!=0 ) { h = -h; } ae_v_move(&t3.ptr.p_double[0], 1, &t2.ptr.p_double[0], 1, ae_v_len(0,n-1)); ae_v_addd(&t3.ptr.p_double[0], 1, &x.ptr.pp_double[0][k/2], x.stride, ae_v_len(0,n-1), h); t = (double)(0); for(j=0; j<=m-1; j++) { if( j!=k/2 ) { t = t+ae_sqr(v.ptr.pp_double[j][0], _state); } else { t = t+ae_sqr(v.ptr.pp_double[j][0]+h, _state); } } t = 1/ae_sqrt(t, _state); ae_v_muld(&t3.ptr.p_double[0], 1, ae_v_len(0,n-1), t); testpcaunit_calculatemv(&t3, n, &tmean2, &tmeans2, &tstddev2, &tstddevs2, _state); pcaopterrors = pcaopterrors||ae_fp_greater(tstddev2,tstddev+threshold); } } } /* * Special test for N=0 */ for(m=1; m<=maxm; m++) { /* * Solve */ pcabuildbasis(&x, 0, m, &info, &s, &v, _state); if( info!=1 ) { pcaconverrors = ae_true; continue; } /* * Orthogonality test */ for(i=0; i<=m-1; i++) { for(j=0; j<=m-1; j++) { t = ae_v_dotproduct(&v.ptr.pp_double[0][i], v.stride, &v.ptr.pp_double[0][j], v.stride, ae_v_len(0,m-1)); if( i==j ) { t = t-1; } pcaorterrors = pcaorterrors||ae_fp_greater(ae_fabs(t, _state),threshold); } } } /* * Test reduced subspace PCA on randomly generated matrix. Because * matrix is random, it may have bad spectral properties, so large * number of iterations is performed. * * Following properties are tested: * * principal vectors are orthogonal * * variance values are computed correctly * * variance values are decreasing * * variance by inexact reduced PCA deviates from variance explaned * by top REQUESTED vectors of complete PCA by at most TolReduced. 
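 * TolReduced=1.0E-3 is a relative tolerance: the check at the end of each pass compares
 * (VarReduced-VarComplete)/VarComplete against it.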
*/ tolreduced = 1.0E-3; for(m=1; m<=maxm; m++) { for(n=1; n<=maxn; n++) { /* * Generate task */ requested = 1+hqrnduniformi(&rs, m, _state); ae_matrix_set_length(&x, n, m, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=m-1; j++) { x.ptr.pp_double[i][j] = hqrndnormal(&rs, _state); } } /* * Solve */ pcatruncatedsubspace(&x, n, m, requested, 1.0E-9, 0, &s, &v, _state); /* * Orthogonality test */ for(i=0; i<=requested-1; i++) { for(j=0; j<=requested-1; j++) { t = ae_v_dotproduct(&v.ptr.pp_double[0][i], v.stride, &v.ptr.pp_double[0][j], v.stride, ae_v_len(0,m-1)); if( i==j ) { t = t-1; } seterrorflag(&pcasubspaceerrors, ae_fp_greater(ae_fabs(t, _state),threshold), _state); } } /* * Variance test */ ae_vector_set_length(&t2, n, _state); for(k=0; k<=requested-1; k++) { for(i=0; i<=n-1; i++) { t = ae_v_dotproduct(&x.ptr.pp_double[i][0], 1, &v.ptr.pp_double[0][k], v.stride, ae_v_len(0,m-1)); t2.ptr.p_double[i] = t; } testpcaunit_calculatemv(&t2, n, &tmean, &tmeans, &tstddev, &tstddevs, _state); if( n!=1 ) { t = ae_sqr(tstddev, _state)*n/(n-1); } else { t = (double)(0); } seterrorflag(&pcasubspaceerrors, ae_fp_greater(ae_fabs(t-s.ptr.p_double[k], _state),threshold), _state); } for(k=0; k<=requested-2; k++) { seterrorflag(&pcasubspaceerrors, ae_fp_less(ae_fabs(s.ptr.p_double[k], _state),ae_fabs(s.ptr.p_double[k+1], _state)), _state); } for(k=0; k<=requested-1; k++) { seterrorflag(&pcasubspaceerrors, ae_fp_less_eq(s.ptr.p_double[k],(double)(0))&&ae_fp_greater(ae_fabs(s.ptr.p_double[k], _state),1000*ae_machineepsilon*ae_fabs(s.ptr.p_double[0], _state)), _state); } /* * Compare variance explained by top REQUESTED vectors from * full PCA and variance explained by reduced PCA. */ pcabuildbasis(&x, n, m, &info, &s2, &v2, _state); ae_assert(info>0, "PCA error: solver nonconvergence", _state); varcomplete = 0.0; varreduced = 0.0; for(k=0; k<=requested-1; k++) { varreduced = varreduced+s.ptr.p_double[k]; varcomplete = varcomplete+s2.ptr.p_double[k]; } seterrorflag(&pcasubspaceerrors, ae_fp_less((varreduced-varcomplete)/varcomplete,-tolreduced), _state); seterrorflag(&pcasubspaceerrors, ae_fp_greater((varreduced-varcomplete)/varcomplete,1.0E3*ae_machineepsilon), _state); } } /* * Test subspace reduced PCA on specially designed problem with good * spectral properties. Only limited number of iterations is performed, * and we expect fast convergence. 
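 * The data matrix is a diagonal matrix with entries decaying as (1/1.05)^i, multiplied
 * from the left and from the right by random orthogonal matrices, which gives a known,
 * well-separated singular spectrum.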
*/ for(pass=0; pass<=3; pass++) { /* * Generate task */ m = -1; requested = -1; if( pass==0 ) { m = 50+hqrnduniformi(&rs, 50, _state); requested = 1+hqrnduniformi(&rs, 10, _state); } if( pass==1 ) { m = 100+hqrnduniformi(&rs, 50, _state); requested = 1+hqrnduniformi(&rs, 10, _state); } if( pass==2 ) { m = 100+hqrnduniformi(&rs, 50, _state); requested = 25+hqrnduniformi(&rs, 10, _state); } if( pass==3 ) { m = 200+hqrnduniformi(&rs, 200, _state); requested = 1+hqrnduniformi(&rs, 10, _state); } ae_assert(m>0&&requested>0, "PCA test: integrity failure", _state); n = 2*m; tolreduced = 1.0E-3; ae_matrix_set_length(&x, n, m, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=m-1; j++) { x.ptr.pp_double[i][j] = (double)(0); } } x.ptr.pp_double[0][0] = 1.0; for(i=1; i<=m-1; i++) { x.ptr.pp_double[i][i] = x.ptr.pp_double[i-1][i-1]*(1/1.05); } rmatrixrndorthogonalfromtheleft(&x, n, m, _state); rmatrixrndorthogonalfromtheright(&x, n, m, _state); /* * Solve */ if( ae_fp_greater(hqrnduniformr(&rs, _state),0.5) ) { pcatruncatedsubspace(&x, n, m, requested, 0.0, 20, &s, &v, _state); } else { pcatruncatedsubspace(&x, n, m, requested, 0.0, 0, &s, &v, _state); } /* * Orthogonality test */ for(i=0; i<=requested-1; i++) { for(j=0; j<=requested-1; j++) { t = ae_v_dotproduct(&v.ptr.pp_double[0][i], v.stride, &v.ptr.pp_double[0][j], v.stride, ae_v_len(0,m-1)); if( i==j ) { t = t-1; } seterrorflag(&pcasubspaceerrors, ae_fp_greater(ae_fabs(t, _state),threshold), _state); } } /* * Variance test */ ae_vector_set_length(&t2, n, _state); for(k=0; k<=requested-1; k++) { for(i=0; i<=n-1; i++) { t = ae_v_dotproduct(&x.ptr.pp_double[i][0], 1, &v.ptr.pp_double[0][k], v.stride, ae_v_len(0,m-1)); t2.ptr.p_double[i] = t; } testpcaunit_calculatemv(&t2, n, &tmean, &tmeans, &tstddev, &tstddevs, _state); if( n!=1 ) { t = ae_sqr(tstddev, _state)*n/(n-1); } else { t = (double)(0); } seterrorflag(&pcasubspaceerrors, ae_fp_greater(ae_fabs(t-s.ptr.p_double[k], _state),threshold), _state); } for(k=0; k<=requested-2; k++) { seterrorflag(&pcasubspaceerrors, ae_fp_less(s.ptr.p_double[k],s.ptr.p_double[k+1]), _state); } /* * Compare variance explained by top REQUESTED vectors from * full PCA and variance explained by reduced PCA. 
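 * VarReduced is allowed to fall short of VarComplete by at most TolReduced in relative
 * terms, but must not exceed it by more than roundoff (1.0E3*MachineEpsilon).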
*/ pcabuildbasis(&x, n, m, &info, &s2, &v2, _state); ae_assert(info>0, "PCA error: solver nonconvergence", _state); varcomplete = 0.0; varreduced = 0.0; for(k=0; k<=requested-1; k++) { varreduced = varreduced+s.ptr.p_double[k]; varcomplete = varcomplete+s2.ptr.p_double[k]; } seterrorflag(&pcasubspaceerrors, ae_fp_less((varreduced-varcomplete)/varcomplete,-tolreduced), _state); seterrorflag(&pcasubspaceerrors, ae_fp_greater((varreduced-varcomplete)/varcomplete,1.0E3*ae_machineepsilon), _state); } /* * Final report */ waserrors = (((pcaconverrors||pcaorterrors)||pcavarerrors)||pcaopterrors)||pcasubspaceerrors; if( !silent ) { printf("PCA TEST\n"); printf("TOTAL RESULTS: "); if( !waserrors ) { printf("OK\n"); } else { printf("FAILED\n"); } printf("* CONVERGENCE "); if( !pcaconverrors ) { printf("OK\n"); } else { printf("FAILED\n"); } printf("* ORTOGONALITY "); if( !pcaorterrors ) { printf("OK\n"); } else { printf("FAILED\n"); } printf("* VARIANCE REPORT "); if( !pcavarerrors ) { printf("OK\n"); } else { printf("FAILED\n"); } printf("* OPTIMALITY "); if( !pcaopterrors ) { printf("OK\n"); } else { printf("FAILED\n"); } printf("* REDUCED (SUBSPACE) "); if( !pcasubspaceerrors ) { printf("OK\n"); } else { printf("FAILED\n"); } if( waserrors ) { printf("TEST SUMMARY: FAILED\n"); } else { printf("TEST SUMMARY: PASSED\n"); } printf("\n\n"); } result = !waserrors; ae_frame_leave(_state); return result; } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. *************************************************************************/ ae_bool _pexec_testpca(ae_bool silent, ae_state *_state) { return testpca(silent, _state); } /************************************************************************* Moments estimates and their errors *************************************************************************/ static void testpcaunit_calculatemv(/* Real */ ae_vector* x, ae_int_t n, double* mean, double* means, double* stddev, double* stddevs, ae_state *_state) { ae_int_t i; double v1; double v2; double variance; *mean = 0; *means = 0; *stddev = 0; *stddevs = 0; *mean = (double)(0); *means = (double)(1); *stddev = (double)(0); *stddevs = (double)(1); variance = (double)(0); if( n<=1 ) { return; } /* * Mean */ for(i=0; i<=n-1; i++) { *mean = *mean+x->ptr.p_double[i]; } *mean = *mean/n; /* * Variance (using corrected two-pass algorithm) */ if( n!=1 ) { v1 = (double)(0); for(i=0; i<=n-1; i++) { v1 = v1+ae_sqr(x->ptr.p_double[i]-(*mean), _state); } v2 = (double)(0); for(i=0; i<=n-1; i++) { v2 = v2+(x->ptr.p_double[i]-(*mean)); } v2 = ae_sqr(v2, _state)/n; variance = (v1-v2)/n; if( ae_fp_less(variance,(double)(0)) ) { variance = (double)(0); } *stddev = ae_sqrt(variance, _state); } /* * Errors */ *means = *stddev/ae_sqrt((double)(n), _state); *stddevs = *stddev*ae_sqrt((double)(2), _state)/ae_sqrt((double)(n-1), _state); } static void testbdssunit_unset1di(/* Integer */ ae_vector* a, ae_state *_state); /************************************************************************* Testing BDSS operations *************************************************************************/ ae_bool testbdss(ae_bool silent, ae_state *_state) { ae_frame _frame_block; ae_int_t n; ae_int_t i; ae_int_t j; ae_int_t pass; ae_int_t passcount; ae_int_t maxn; ae_int_t maxnq; ae_vector a; ae_vector a0; ae_vector at; ae_matrix p; ae_vector thresholds; ae_int_t ni; ae_vector c; ae_vector p1; ae_vector p2; ae_vector ties; ae_vector pt1; ae_vector pt2; 
ae_int_t tiecount; ae_int_t c1; ae_int_t c0; ae_int_t nc; ae_vector tmp; ae_vector sortrbuf; ae_vector sortrbuf2; ae_vector sortibuf; double pal; double pbl; double par; double pbr; double cve; double cvr; ae_int_t info; double threshold; ae_vector tiebuf; ae_vector cntbuf; double rms; double cvrms; ae_bool waserrors; ae_bool tieserrors; ae_bool split2errors; ae_bool optimalsplitkerrors; ae_bool splitkerrors; ae_bool result; ae_frame_make(_state, &_frame_block); ae_vector_init(&a, 0, DT_REAL, _state); ae_vector_init(&a0, 0, DT_REAL, _state); ae_vector_init(&at, 0, DT_REAL, _state); ae_matrix_init(&p, 0, 0, DT_REAL, _state); ae_vector_init(&thresholds, 0, DT_REAL, _state); ae_vector_init(&c, 0, DT_INT, _state); ae_vector_init(&p1, 0, DT_INT, _state); ae_vector_init(&p2, 0, DT_INT, _state); ae_vector_init(&ties, 0, DT_INT, _state); ae_vector_init(&pt1, 0, DT_INT, _state); ae_vector_init(&pt2, 0, DT_INT, _state); ae_vector_init(&tmp, 0, DT_REAL, _state); ae_vector_init(&sortrbuf, 0, DT_REAL, _state); ae_vector_init(&sortrbuf2, 0, DT_REAL, _state); ae_vector_init(&sortibuf, 0, DT_INT, _state); ae_vector_init(&tiebuf, 0, DT_INT, _state); ae_vector_init(&cntbuf, 0, DT_INT, _state); waserrors = ae_false; tieserrors = ae_false; split2errors = ae_false; splitkerrors = ae_false; optimalsplitkerrors = ae_false; maxn = 100; maxnq = 49; passcount = 10; /* * Test ties */ for(n=1; n<=maxn; n++) { for(pass=1; pass<=passcount; pass++) { /* * untied data, test DSTie */ testbdssunit_unset1di(&p1, _state); testbdssunit_unset1di(&p2, _state); testbdssunit_unset1di(&pt1, _state); testbdssunit_unset1di(&pt2, _state); ae_vector_set_length(&a, n-1+1, _state); ae_vector_set_length(&a0, n-1+1, _state); ae_vector_set_length(&at, n-1+1, _state); ae_vector_set_length(&tmp, n-1+1, _state); a.ptr.p_double[0] = 2*ae_randomreal(_state)-1; tmp.ptr.p_double[0] = ae_randomreal(_state); for(i=1; i<=n-1; i++) { /* * A is randomly permuted */ a.ptr.p_double[i] = a.ptr.p_double[i-1]+0.1*ae_randomreal(_state)+0.1; tmp.ptr.p_double[i] = ae_randomreal(_state); } tagsortfastr(&tmp, &a, &sortrbuf, &sortrbuf2, n, _state); for(i=0; i<=n-1; i++) { a0.ptr.p_double[i] = a.ptr.p_double[i]; at.ptr.p_double[i] = a.ptr.p_double[i]; } dstie(&a0, n, &ties, &tiecount, &p1, &p2, _state); tagsort(&at, n, &pt1, &pt2, _state); for(i=0; i<=n-1; i++) { tieserrors = tieserrors||p1.ptr.p_int[i]!=pt1.ptr.p_int[i]; tieserrors = tieserrors||p2.ptr.p_int[i]!=pt2.ptr.p_int[i]; } tieserrors = tieserrors||tiecount!=n; if( tiecount==n ) { for(i=0; i<=n; i++) { tieserrors = tieserrors||ties.ptr.p_int[i]!=i; } } /* * tied data, test DSTie */ testbdssunit_unset1di(&p1, _state); testbdssunit_unset1di(&p2, _state); testbdssunit_unset1di(&pt1, _state); testbdssunit_unset1di(&pt2, _state); ae_vector_set_length(&a, n-1+1, _state); ae_vector_set_length(&a0, n-1+1, _state); ae_vector_set_length(&at, n-1+1, _state); c1 = 0; c0 = 0; for(i=0; i<=n-1; i++) { a.ptr.p_double[i] = (double)(ae_randominteger(2, _state)); if( ae_fp_eq(a.ptr.p_double[i],(double)(0)) ) { c0 = c0+1; } else { c1 = c1+1; } a0.ptr.p_double[i] = a.ptr.p_double[i]; at.ptr.p_double[i] = a.ptr.p_double[i]; } dstie(&a0, n, &ties, &tiecount, &p1, &p2, _state); tagsort(&at, n, &pt1, &pt2, _state); for(i=0; i<=n-1; i++) { tieserrors = tieserrors||p1.ptr.p_int[i]!=pt1.ptr.p_int[i]; tieserrors = tieserrors||p2.ptr.p_int[i]!=pt2.ptr.p_int[i]; } if( c0==0||c1==0 ) { tieserrors = tieserrors||tiecount!=1; if( tiecount==1 ) { tieserrors = tieserrors||ties.ptr.p_int[0]!=0; tieserrors = tieserrors||ties.ptr.p_int[1]!=n; 
} } else { tieserrors = tieserrors||tiecount!=2; if( tiecount==2 ) { tieserrors = tieserrors||ties.ptr.p_int[0]!=0; tieserrors = tieserrors||ties.ptr.p_int[1]!=c0; tieserrors = tieserrors||ties.ptr.p_int[2]!=n; } } } } /* * split-2 */ /* * General tests for different N's */ for(n=1; n<=maxn; n++) { ae_vector_set_length(&a, n-1+1, _state); ae_vector_set_length(&c, n-1+1, _state); /* * one-tie test */ if( n%2==0 ) { for(i=0; i<=n-1; i++) { a.ptr.p_double[i] = (double)(n); c.ptr.p_int[i] = i%2; } dsoptimalsplit2(&a, &c, n, &info, &threshold, &pal, &pbl, &par, &pbr, &cve, _state); if( info!=-3 ) { split2errors = ae_true; continue; } } /* * two-tie test */ /* * test #1 */ if( n>1 ) { for(i=0; i<=n-1; i++) { a.ptr.p_double[i] = (double)(i/((n+1)/2)); c.ptr.p_int[i] = i/((n+1)/2); } dsoptimalsplit2(&a, &c, n, &info, &threshold, &pal, &pbl, &par, &pbr, &cve, _state); if( info!=1 ) { split2errors = ae_true; continue; } split2errors = split2errors||ae_fp_greater(ae_fabs(threshold-0.5, _state),100*ae_machineepsilon); split2errors = split2errors||ae_fp_greater(ae_fabs(pal-1, _state),100*ae_machineepsilon); split2errors = split2errors||ae_fp_greater(ae_fabs(pbl-0, _state),100*ae_machineepsilon); split2errors = split2errors||ae_fp_greater(ae_fabs(par-0, _state),100*ae_machineepsilon); split2errors = split2errors||ae_fp_greater(ae_fabs(pbr-1, _state),100*ae_machineepsilon); } } /* * Special "CREDIT"-test (transparency coefficient) */ n = 110; ae_vector_set_length(&a, n-1+1, _state); ae_vector_set_length(&c, n-1+1, _state); a.ptr.p_double[0] = 0.000; c.ptr.p_int[0] = 0; a.ptr.p_double[1] = 0.000; c.ptr.p_int[1] = 0; a.ptr.p_double[2] = 0.000; c.ptr.p_int[2] = 0; a.ptr.p_double[3] = 0.000; c.ptr.p_int[3] = 0; a.ptr.p_double[4] = 0.000; c.ptr.p_int[4] = 0; a.ptr.p_double[5] = 0.000; c.ptr.p_int[5] = 0; a.ptr.p_double[6] = 0.000; c.ptr.p_int[6] = 0; a.ptr.p_double[7] = 0.000; c.ptr.p_int[7] = 1; a.ptr.p_double[8] = 0.000; c.ptr.p_int[8] = 0; a.ptr.p_double[9] = 0.000; c.ptr.p_int[9] = 1; a.ptr.p_double[10] = 0.000; c.ptr.p_int[10] = 0; a.ptr.p_double[11] = 0.000; c.ptr.p_int[11] = 0; a.ptr.p_double[12] = 0.000; c.ptr.p_int[12] = 0; a.ptr.p_double[13] = 0.000; c.ptr.p_int[13] = 0; a.ptr.p_double[14] = 0.000; c.ptr.p_int[14] = 0; a.ptr.p_double[15] = 0.000; c.ptr.p_int[15] = 0; a.ptr.p_double[16] = 0.000; c.ptr.p_int[16] = 0; a.ptr.p_double[17] = 0.000; c.ptr.p_int[17] = 0; a.ptr.p_double[18] = 0.000; c.ptr.p_int[18] = 0; a.ptr.p_double[19] = 0.000; c.ptr.p_int[19] = 0; a.ptr.p_double[20] = 0.000; c.ptr.p_int[20] = 0; a.ptr.p_double[21] = 0.000; c.ptr.p_int[21] = 0; a.ptr.p_double[22] = 0.000; c.ptr.p_int[22] = 1; a.ptr.p_double[23] = 0.000; c.ptr.p_int[23] = 0; a.ptr.p_double[24] = 0.000; c.ptr.p_int[24] = 0; a.ptr.p_double[25] = 0.000; c.ptr.p_int[25] = 0; a.ptr.p_double[26] = 0.000; c.ptr.p_int[26] = 0; a.ptr.p_double[27] = 0.000; c.ptr.p_int[27] = 1; a.ptr.p_double[28] = 0.000; c.ptr.p_int[28] = 0; a.ptr.p_double[29] = 0.000; c.ptr.p_int[29] = 1; a.ptr.p_double[30] = 0.000; c.ptr.p_int[30] = 0; a.ptr.p_double[31] = 0.000; c.ptr.p_int[31] = 1; a.ptr.p_double[32] = 0.000; c.ptr.p_int[32] = 0; a.ptr.p_double[33] = 0.000; c.ptr.p_int[33] = 1; a.ptr.p_double[34] = 0.000; c.ptr.p_int[34] = 0; a.ptr.p_double[35] = 0.030; c.ptr.p_int[35] = 0; a.ptr.p_double[36] = 0.030; c.ptr.p_int[36] = 0; a.ptr.p_double[37] = 0.050; c.ptr.p_int[37] = 0; a.ptr.p_double[38] = 0.070; c.ptr.p_int[38] = 1; a.ptr.p_double[39] = 0.110; c.ptr.p_int[39] = 0; a.ptr.p_double[40] = 0.110; c.ptr.p_int[40] = 1; a.ptr.p_double[41] = 0.120; 
c.ptr.p_int[41] = 0; a.ptr.p_double[42] = 0.130; c.ptr.p_int[42] = 0; a.ptr.p_double[43] = 0.140; c.ptr.p_int[43] = 0; a.ptr.p_double[44] = 0.140; c.ptr.p_int[44] = 0; a.ptr.p_double[45] = 0.140; c.ptr.p_int[45] = 0; a.ptr.p_double[46] = 0.150; c.ptr.p_int[46] = 0; a.ptr.p_double[47] = 0.150; c.ptr.p_int[47] = 0; a.ptr.p_double[48] = 0.170; c.ptr.p_int[48] = 0; a.ptr.p_double[49] = 0.190; c.ptr.p_int[49] = 1; a.ptr.p_double[50] = 0.200; c.ptr.p_int[50] = 0; a.ptr.p_double[51] = 0.200; c.ptr.p_int[51] = 0; a.ptr.p_double[52] = 0.250; c.ptr.p_int[52] = 0; a.ptr.p_double[53] = 0.250; c.ptr.p_int[53] = 0; a.ptr.p_double[54] = 0.260; c.ptr.p_int[54] = 0; a.ptr.p_double[55] = 0.270; c.ptr.p_int[55] = 0; a.ptr.p_double[56] = 0.280; c.ptr.p_int[56] = 0; a.ptr.p_double[57] = 0.310; c.ptr.p_int[57] = 0; a.ptr.p_double[58] = 0.310; c.ptr.p_int[58] = 0; a.ptr.p_double[59] = 0.330; c.ptr.p_int[59] = 0; a.ptr.p_double[60] = 0.330; c.ptr.p_int[60] = 0; a.ptr.p_double[61] = 0.340; c.ptr.p_int[61] = 0; a.ptr.p_double[62] = 0.340; c.ptr.p_int[62] = 0; a.ptr.p_double[63] = 0.370; c.ptr.p_int[63] = 0; a.ptr.p_double[64] = 0.380; c.ptr.p_int[64] = 1; a.ptr.p_double[65] = 0.380; c.ptr.p_int[65] = 0; a.ptr.p_double[66] = 0.410; c.ptr.p_int[66] = 0; a.ptr.p_double[67] = 0.460; c.ptr.p_int[67] = 0; a.ptr.p_double[68] = 0.520; c.ptr.p_int[68] = 0; a.ptr.p_double[69] = 0.530; c.ptr.p_int[69] = 0; a.ptr.p_double[70] = 0.540; c.ptr.p_int[70] = 0; a.ptr.p_double[71] = 0.560; c.ptr.p_int[71] = 0; a.ptr.p_double[72] = 0.560; c.ptr.p_int[72] = 0; a.ptr.p_double[73] = 0.570; c.ptr.p_int[73] = 0; a.ptr.p_double[74] = 0.600; c.ptr.p_int[74] = 0; a.ptr.p_double[75] = 0.600; c.ptr.p_int[75] = 0; a.ptr.p_double[76] = 0.620; c.ptr.p_int[76] = 0; a.ptr.p_double[77] = 0.650; c.ptr.p_int[77] = 0; a.ptr.p_double[78] = 0.660; c.ptr.p_int[78] = 0; a.ptr.p_double[79] = 0.680; c.ptr.p_int[79] = 0; a.ptr.p_double[80] = 0.700; c.ptr.p_int[80] = 0; a.ptr.p_double[81] = 0.750; c.ptr.p_int[81] = 0; a.ptr.p_double[82] = 0.770; c.ptr.p_int[82] = 0; a.ptr.p_double[83] = 0.770; c.ptr.p_int[83] = 0; a.ptr.p_double[84] = 0.770; c.ptr.p_int[84] = 0; a.ptr.p_double[85] = 0.790; c.ptr.p_int[85] = 0; a.ptr.p_double[86] = 0.810; c.ptr.p_int[86] = 0; a.ptr.p_double[87] = 0.840; c.ptr.p_int[87] = 0; a.ptr.p_double[88] = 0.860; c.ptr.p_int[88] = 0; a.ptr.p_double[89] = 0.870; c.ptr.p_int[89] = 0; a.ptr.p_double[90] = 0.890; c.ptr.p_int[90] = 0; a.ptr.p_double[91] = 0.900; c.ptr.p_int[91] = 1; a.ptr.p_double[92] = 0.900; c.ptr.p_int[92] = 0; a.ptr.p_double[93] = 0.910; c.ptr.p_int[93] = 0; a.ptr.p_double[94] = 0.940; c.ptr.p_int[94] = 0; a.ptr.p_double[95] = 0.950; c.ptr.p_int[95] = 0; a.ptr.p_double[96] = 0.952; c.ptr.p_int[96] = 0; a.ptr.p_double[97] = 0.970; c.ptr.p_int[97] = 0; a.ptr.p_double[98] = 0.970; c.ptr.p_int[98] = 0; a.ptr.p_double[99] = 0.980; c.ptr.p_int[99] = 0; a.ptr.p_double[100] = 1.000; c.ptr.p_int[100] = 0; a.ptr.p_double[101] = 1.000; c.ptr.p_int[101] = 0; a.ptr.p_double[102] = 1.000; c.ptr.p_int[102] = 0; a.ptr.p_double[103] = 1.000; c.ptr.p_int[103] = 0; a.ptr.p_double[104] = 1.000; c.ptr.p_int[104] = 0; a.ptr.p_double[105] = 1.020; c.ptr.p_int[105] = 0; a.ptr.p_double[106] = 1.090; c.ptr.p_int[106] = 0; a.ptr.p_double[107] = 1.130; c.ptr.p_int[107] = 0; a.ptr.p_double[108] = 1.840; c.ptr.p_int[108] = 0; a.ptr.p_double[109] = 2.470; c.ptr.p_int[109] = 0; dsoptimalsplit2(&a, &c, n, &info, &threshold, &pal, &pbl, &par, &pbr, &cve, _state); if( info!=1 ) { split2errors = ae_true; } else { split2errors = 
split2errors||ae_fp_greater(ae_fabs(threshold-0.195, _state),100*ae_machineepsilon); split2errors = split2errors||ae_fp_greater(ae_fabs(pal-0.80, _state),0.02); split2errors = split2errors||ae_fp_greater(ae_fabs(pbl-0.20, _state),0.02); split2errors = split2errors||ae_fp_greater(ae_fabs(par-0.97, _state),0.02); split2errors = split2errors||ae_fp_greater(ae_fabs(pbr-0.03, _state),0.02); } /* * split-2 fast */ /* * General tests for different N's */ for(n=1; n<=maxn; n++) { ae_vector_set_length(&a, n-1+1, _state); ae_vector_set_length(&c, n-1+1, _state); ae_vector_set_length(&tiebuf, n+1, _state); ae_vector_set_length(&cntbuf, 3+1, _state); /* * one-tie test */ if( n%2==0 ) { for(i=0; i<=n-1; i++) { a.ptr.p_double[i] = (double)(n); c.ptr.p_int[i] = i%2; } dsoptimalsplit2fast(&a, &c, &tiebuf, &cntbuf, &sortrbuf, &sortibuf, n, 2, 0.00, &info, &threshold, &rms, &cvrms, _state); if( info!=-3 ) { split2errors = ae_true; continue; } } /* * two-tie test */ /* * test #1 */ if( n>1 ) { for(i=0; i<=n-1; i++) { a.ptr.p_double[i] = (double)(i/((n+1)/2)); c.ptr.p_int[i] = i/((n+1)/2); } dsoptimalsplit2fast(&a, &c, &tiebuf, &cntbuf, &sortrbuf, &sortibuf, n, 2, 0.00, &info, &threshold, &rms, &cvrms, _state); if( info!=1 ) { split2errors = ae_true; continue; } split2errors = split2errors||ae_fp_greater(ae_fabs(threshold-0.5, _state),100*ae_machineepsilon); split2errors = split2errors||ae_fp_greater(ae_fabs(rms-0, _state),100*ae_machineepsilon); if( n==2 ) { split2errors = split2errors||ae_fp_greater(ae_fabs(cvrms-0.5, _state),100*ae_machineepsilon); } else { if( n==3 ) { split2errors = split2errors||ae_fp_greater(ae_fabs(cvrms-ae_sqrt((2*0+2*0+2*0.25)/6, _state), _state),100*ae_machineepsilon); } else { split2errors = split2errors||ae_fp_greater(ae_fabs(cvrms, _state),100*ae_machineepsilon); } } } } /* * special tests */ n = 10; ae_vector_set_length(&a, n-1+1, _state); ae_vector_set_length(&c, n-1+1, _state); ae_vector_set_length(&tiebuf, n+1, _state); ae_vector_set_length(&cntbuf, 2*3-1+1, _state); for(i=0; i<=n-1; i++) { a.ptr.p_double[i] = (double)(i); if( i<=n-3 ) { c.ptr.p_int[i] = 0; } else { c.ptr.p_int[i] = i-(n-3); } } dsoptimalsplit2fast(&a, &c, &tiebuf, &cntbuf, &sortrbuf, &sortibuf, n, 3, 0.00, &info, &threshold, &rms, &cvrms, _state); if( info!=1 ) { split2errors = ae_true; } else { split2errors = split2errors||ae_fp_greater(ae_fabs(threshold-(n-2.5), _state),100*ae_machineepsilon); split2errors = split2errors||ae_fp_greater(ae_fabs(rms-ae_sqrt((0.25+0.25+0.25+0.25)/(3*n), _state), _state),100*ae_machineepsilon); split2errors = split2errors||ae_fp_greater(ae_fabs(cvrms-ae_sqrt((double)(1+1+1+1)/(double)(3*n), _state), _state),100*ae_machineepsilon); } /* * Optimal split-K */ /* * General tests for different N's */ for(n=1; n<=maxnq; n++) { ae_vector_set_length(&a, n-1+1, _state); ae_vector_set_length(&c, n-1+1, _state); /* * one-tie test */ if( n%2==0 ) { for(i=0; i<=n-1; i++) { a.ptr.p_double[i] = (double)(n); c.ptr.p_int[i] = i%2; } dsoptimalsplitk(&a, &c, n, 2, 2+ae_randominteger(5, _state), &info, &thresholds, &ni, &cve, _state); if( info!=-3 ) { optimalsplitkerrors = ae_true; continue; } } /* * two-tie test */ /* * test #1 */ if( n>1 ) { c0 = 0; c1 = 0; for(i=0; i<=n-1; i++) { a.ptr.p_double[i] = (double)(i/((n+1)/2)); c.ptr.p_int[i] = i/((n+1)/2); if( c.ptr.p_int[i]==0 ) { c0 = c0+1; } if( c.ptr.p_int[i]==1 ) { c1 = c1+1; } } dsoptimalsplitk(&a, &c, n, 2, 2+ae_randominteger(5, _state), &info, &thresholds, &ni, &cve, _state); if( info!=1 ) { optimalsplitkerrors = ae_true; continue; } 
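/* expected outcome for this two-tie dataset: NI=2 bins with the threshold at 0.5, and CVE equal to -C0*ln(C0/(C0+1))-C1*ln(C1/(C1+1)); the checks below verify these values to within 100*MachineEpsilon */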
optimalsplitkerrors = optimalsplitkerrors||ni!=2; optimalsplitkerrors = optimalsplitkerrors||ae_fp_greater(ae_fabs(thresholds.ptr.p_double[0]-0.5, _state),100*ae_machineepsilon); optimalsplitkerrors = optimalsplitkerrors||ae_fp_greater(ae_fabs(cve-(-c0*ae_log((double)c0/(double)(c0+1), _state)-c1*ae_log((double)c1/(double)(c1+1), _state)), _state),100*ae_machineepsilon); } /* * test #2 */ if( n>2 ) { c0 = 1+ae_randominteger(n-1, _state); c1 = n-c0; for(i=0; i<=n-1; i++) { if( i=16 ) { /* * Multi-tie test. * * First NC-1 ties have C0 entries, remaining NC-th tie * have C1 entries. */ nc = ae_round(ae_sqrt((double)(n), _state), _state); c0 = n/nc; c1 = n-c0*(nc-1); for(i=0; i<=nc-2; i++) { for(j=c0*i; j<=c0*(i+1)-1; j++) { a.ptr.p_double[j] = (double)(j); c.ptr.p_int[j] = i; } } for(j=c0*(nc-1); j<=n-1; j++) { a.ptr.p_double[j] = (double)(j); c.ptr.p_int[j] = nc-1; } dsoptimalsplitk(&a, &c, n, nc, nc+ae_randominteger(nc, _state), &info, &thresholds, &ni, &cve, _state); if( info!=1 ) { optimalsplitkerrors = ae_true; continue; } optimalsplitkerrors = optimalsplitkerrors||ni!=nc; if( ni==nc ) { for(i=0; i<=nc-2; i++) { optimalsplitkerrors = optimalsplitkerrors||ae_fp_greater(ae_fabs(thresholds.ptr.p_double[i]-(c0*(i+1)-1+0.5), _state),100*ae_machineepsilon); } cvr = -((nc-1)*c0*ae_log((double)c0/(double)(c0+nc-1), _state)+c1*ae_log((double)c1/(double)(c1+nc-1), _state)); optimalsplitkerrors = optimalsplitkerrors||ae_fp_greater(ae_fabs(cve-cvr, _state),100*ae_machineepsilon); } } } /* * Non-optimal split-K */ /* * General tests for different N's */ for(n=1; n<=maxnq; n++) { ae_vector_set_length(&a, n-1+1, _state); ae_vector_set_length(&c, n-1+1, _state); /* * one-tie test */ if( n%2==0 ) { for(i=0; i<=n-1; i++) { a.ptr.p_double[i] = (double)(99); c.ptr.p_int[i] = i%2; } dssplitk(&a, &c, n, 2, 2+ae_randominteger(5, _state), &info, &thresholds, &ni, &cve, _state); if( info!=-3 ) { splitkerrors = ae_true; continue; } } /* * two-tie test */ /* * test #1 */ if( n>1 ) { c0 = 0; c1 = 0; for(i=0; i<=n-1; i++) { a.ptr.p_double[i] = (double)(i/((n+1)/2)); c.ptr.p_int[i] = i/((n+1)/2); if( c.ptr.p_int[i]==0 ) { c0 = c0+1; } if( c.ptr.p_int[i]==1 ) { c1 = c1+1; } } dssplitk(&a, &c, n, 2, 2+ae_randominteger(5, _state), &info, &thresholds, &ni, &cve, _state); if( info!=1 ) { splitkerrors = ae_true; continue; } splitkerrors = splitkerrors||ni!=2; if( ni==2 ) { splitkerrors = splitkerrors||ae_fp_greater(ae_fabs(thresholds.ptr.p_double[0]-0.5, _state),100*ae_machineepsilon); splitkerrors = splitkerrors||ae_fp_greater(ae_fabs(cve-(-c0*ae_log((double)c0/(double)(c0+1), _state)-c1*ae_log((double)c1/(double)(c1+1), _state)), _state),100*ae_machineepsilon); } } /* * test #2 */ if( n>2 ) { c0 = 1+ae_randominteger(n-1, _state); c1 = n-c0; for(i=0; i<=n-1; i++) { if( i1 ) { nc = n/c0; for(i=0; i<=nc-1; i++) { for(j=c0*i; j<=c0*(i+1)-1; j++) { a.ptr.p_double[j] = (double)(j); c.ptr.p_int[j] = i; } } dssplitk(&a, &c, n, nc, nc+ae_randominteger(nc, _state), &info, &thresholds, &ni, &cve, _state); if( info!=1 ) { splitkerrors = ae_true; continue; } splitkerrors = splitkerrors||ni!=nc; if( ni==nc ) { for(i=0; i<=nc-2; i++) { splitkerrors = splitkerrors||ae_fp_greater(ae_fabs(thresholds.ptr.p_double[i]-(c0*(i+1)-1+0.5), _state),100*ae_machineepsilon); } cvr = -nc*c0*ae_log((double)c0/(double)(c0+nc-1), _state); splitkerrors = splitkerrors||ae_fp_greater(ae_fabs(cve-cvr, _state),100*ae_machineepsilon); } } } } /* * report */ waserrors = ((tieserrors||split2errors)||optimalsplitkerrors)||splitkerrors; if( !silent ) { 
printf("TESTING BASIC DATASET SUBROUTINES\n"); printf("TIES: "); if( !tieserrors ) { printf("OK\n"); } else { printf("FAILED\n"); } printf("SPLIT-2: "); if( !split2errors ) { printf("OK\n"); } else { printf("FAILED\n"); } printf("OPTIMAL SPLIT-K: "); if( !optimalsplitkerrors ) { printf("OK\n"); } else { printf("FAILED\n"); } printf("SPLIT-K: "); if( !splitkerrors ) { printf("OK\n"); } else { printf("FAILED\n"); } if( waserrors ) { printf("TEST FAILED\n"); } else { printf("TEST PASSED\n"); } printf("\n\n"); } result = !waserrors; ae_frame_leave(_state); return result; } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. *************************************************************************/ ae_bool _pexec_testbdss(ae_bool silent, ae_state *_state) { return testbdss(silent, _state); } /************************************************************************* Unsets 1D array. *************************************************************************/ static void testbdssunit_unset1di(/* Integer */ ae_vector* a, ae_state *_state) { ae_vector_set_length(a, 0+1, _state); a->ptr.p_int[0] = ae_randominteger(3, _state)-1; } static double testmlpbaseunit_vectordiff(/* Real */ ae_vector* g0, /* Real */ ae_vector* g1, ae_int_t n, double s, ae_state *_state); static void testmlpbaseunit_createnetwork(multilayerperceptron* network, ae_int_t nkind, double a1, double a2, ae_int_t nin, ae_int_t nhid1, ae_int_t nhid2, ae_int_t nout, ae_state *_state); static void testmlpbaseunit_unsetnetwork(multilayerperceptron* network, ae_state *_state); static void testmlpbaseunit_testinformational(ae_int_t nkind, ae_int_t nin, ae_int_t nhid1, ae_int_t nhid2, ae_int_t nout, ae_int_t passcount, ae_bool* err, ae_state *_state); static void testmlpbaseunit_testprocessing(ae_int_t nkind, ae_int_t nin, ae_int_t nhid1, ae_int_t nhid2, ae_int_t nout, ae_int_t passcount, ae_bool* err, ae_state *_state); static void testmlpbaseunit_testgradient(ae_int_t nkind, ae_int_t nin, ae_int_t nhid1, ae_int_t nhid2, ae_int_t nout, ae_int_t passcount, ae_int_t sizemin, ae_int_t sizemax, ae_bool* err, ae_state *_state); static void testmlpbaseunit_testhessian(ae_int_t nkind, ae_int_t nin, ae_int_t nhid1, ae_int_t nhid2, ae_int_t nout, ae_int_t passcount, ae_bool* err, ae_state *_state); static void testmlpbaseunit_testerr(ae_int_t nkind, ae_int_t nin, ae_int_t nhid1, ae_int_t nhid2, ae_int_t nout, ae_int_t passcount, ae_int_t sizemin, ae_int_t sizemax, ae_bool* err, ae_state *_state); static void testmlpbaseunit_spectests(ae_bool* inferrors, ae_bool* procerrors, ae_bool* graderrors, ae_bool* hesserrors, ae_bool* errerrors, ae_state *_state); static ae_bool testmlpbaseunit_testmlpgbsubset(ae_state *_state); ae_bool testmlpbase(ae_bool silent, ae_state *_state) { ae_frame _frame_block; ae_bool waserrors; ae_int_t passcount; ae_int_t maxn; ae_int_t maxhid; ae_int_t sizemin; ae_int_t sizemax; ae_int_t nf; ae_int_t nl; ae_int_t nhid1; ae_int_t nhid2; ae_int_t nkind; multilayerperceptron network; multilayerperceptron network2; ae_matrix xy; ae_matrix valxy; ae_bool inferrors; ae_bool procerrors; ae_bool graderrors; ae_bool hesserrors; ae_bool errerrors; ae_bool result; ae_frame_make(_state, &_frame_block); _multilayerperceptron_init(&network, _state); _multilayerperceptron_init(&network2, _state); ae_matrix_init(&xy, 0, 0, DT_REAL, _state); ae_matrix_init(&valxy, 0, 0, DT_REAL, _state); waserrors = ae_false; inferrors = ae_false; procerrors = ae_false; 
graderrors = ae_false; hesserrors = ae_false; errerrors = ae_false; passcount = 5; maxn = 3; maxhid = 3; /* * Special tests */ testmlpbaseunit_spectests(&inferrors, &procerrors, &graderrors, &hesserrors, &errerrors, _state); /* * General multilayer network tests. * These tests are performed with small dataset, whose size is in [0,10]. * We test correctness of functions on small sets, but do not test code * which splits large dataset into smaller chunks. */ sizemin = 0; sizemax = 10; for(nf=1; nf<=maxn; nf++) { for(nl=1; nl<=maxn; nl++) { for(nhid1=0; nhid1<=maxhid; nhid1++) { for(nhid2=0; nhid2<=maxhid; nhid2++) { for(nkind=0; nkind<=3; nkind++) { /* * Skip meaningless parameters combinations */ if( nkind==1&&nl<2 ) { continue; } if( nhid1==0&&nhid2!=0 ) { continue; } /* * Tests */ testmlpbaseunit_testinformational(nkind, nf, nhid1, nhid2, nl, passcount, &inferrors, _state); testmlpbaseunit_testprocessing(nkind, nf, nhid1, nhid2, nl, passcount, &procerrors, _state); testmlpbaseunit_testgradient(nkind, nf, nhid1, nhid2, nl, passcount, sizemin, sizemax, &graderrors, _state); testmlpbaseunit_testhessian(nkind, nf, nhid1, nhid2, nl, passcount, &hesserrors, _state); testmlpbaseunit_testerr(nkind, nf, nhid1, nhid2, nl, passcount, sizemin, sizemax, &errerrors, _state); } } } } } /* * Special tests on large datasets: test ability to correctly split * work into smaller chunks. */ nf = 2; nhid1 = 30; nhid2 = 30; nl = 2; sizemin = 1000; sizemax = 1000; testmlpbaseunit_testerr(0, nf, nhid1, nhid2, nl, 1, sizemin, sizemax, &errerrors, _state); testmlpbaseunit_testgradient(0, nf, nhid1, nhid2, nl, 1, sizemin, sizemax, &graderrors, _state); /* * Test for MLPGradBatch____Subset() */ graderrors = graderrors||testmlpbaseunit_testmlpgbsubset(_state); /* * Final report */ waserrors = (((inferrors||procerrors)||graderrors)||hesserrors)||errerrors; if( !silent ) { printf("MLP TEST\n"); printf("INFORMATIONAL FUNCTIONS: "); if( !inferrors ) { printf("OK\n"); } else { printf("FAILED\n"); } printf("BASIC PROCESSING: "); if( !procerrors ) { printf("OK\n"); } else { printf("FAILED\n"); } printf("GRADIENT CALCULATION: "); if( !graderrors ) { printf("OK\n"); } else { printf("FAILED\n"); } printf("HESSIAN CALCULATION: "); if( !hesserrors ) { printf("OK\n"); } else { printf("FAILED\n"); } printf("ERROR FUNCTIONS: "); if( !errerrors ) { printf("OK\n"); } else { printf("FAILED\n"); } if( waserrors ) { printf("TEST SUMMARY: FAILED\n"); } else { printf("TEST SUMMARY: PASSED\n"); } printf("\n\n"); } result = !waserrors; ae_frame_leave(_state); return result; } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. *************************************************************************/ ae_bool _pexec_testmlpbase(ae_bool silent, ae_state *_state) { return testmlpbase(silent, _state); } /************************************************************************* This function compares vectors G0 and G1 and returns ||G0-G1||/max(||G0||,||G1||,S) For zero G0, G1 and S (all three quantities are zero) it returns zero. 
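For example, for G0=(1,0), G1=(0,1) and S=0 we have ||G0-G1||=sqrt(2) and max(||G0||,||G1||,S)=1, so the function returns sqrt(2).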
*************************************************************************/ static double testmlpbaseunit_vectordiff(/* Real */ ae_vector* g0, /* Real */ ae_vector* g1, ae_int_t n, double s, ae_state *_state) { ae_int_t i; double norm0; double norm1; double diff; double result; norm0 = (double)(0); norm1 = (double)(0); diff = (double)(0); for(i=0; i<=n-1; i++) { norm0 = norm0+ae_sqr(g0->ptr.p_double[i], _state); norm1 = norm1+ae_sqr(g1->ptr.p_double[i], _state); diff = diff+ae_sqr(g0->ptr.p_double[i]-g1->ptr.p_double[i], _state); } norm0 = ae_sqrt(norm0, _state); norm1 = ae_sqrt(norm1, _state); diff = ae_sqrt(diff, _state); if( (ae_fp_neq(norm0,(double)(0))||ae_fp_neq(norm1,(double)(0)))||ae_fp_neq(s,(double)(0)) ) { diff = diff/ae_maxreal(ae_maxreal(norm0, norm1, _state), s, _state); } else { diff = (double)(0); } result = diff; return result; } /************************************************************************* Network creation This function creates network with desired structure. Network is created using one of the three methods: a) straightforward creation using MLPCreate???() b) MLPCreate???() for proxy object, which is copied with PassThroughSerializer() c) MLPCreate???() for proxy object, which is copied with MLPCopy() One of these methods is chosen at random. *************************************************************************/ static void testmlpbaseunit_createnetwork(multilayerperceptron* network, ae_int_t nkind, double a1, double a2, ae_int_t nin, ae_int_t nhid1, ae_int_t nhid2, ae_int_t nout, ae_state *_state) { ae_frame _frame_block; ae_int_t mkind; multilayerperceptron tmp; ae_frame_make(_state, &_frame_block); _multilayerperceptron_init(&tmp, _state); ae_assert(((nin>0&&nhid1>=0)&&nhid2>=0)&&nout>0, "CreateNetwork error", _state); ae_assert(nhid1!=0||nhid2==0, "CreateNetwork error", _state); ae_assert(nkind!=1||nout>=2, "CreateNetwork error", _state); mkind = ae_randominteger(3, _state); if( nhid1==0 ) { /* * No hidden layers */ if( nkind==0 ) { if( mkind==0 ) { mlpcreate0(nin, nout, network, _state); } if( mkind==1 ) { mlpcreate0(nin, nout, &tmp, _state); { /* * This code passes data structure through serializers * (serializes it to string and loads back) */ ae_serializer _local_serializer; ae_int_t _local_ssize; ae_frame _local_frame_block; ae_dyn_block _local_dynamic_block; ae_frame_make(_state, &_local_frame_block); ae_serializer_init(&_local_serializer); ae_serializer_alloc_start(&_local_serializer); mlpalloc(&_local_serializer, &tmp, _state); _local_ssize = ae_serializer_get_alloc_size(&_local_serializer); ae_db_malloc(&_local_dynamic_block, _local_ssize+1, _state, ae_true); ae_serializer_sstart_str(&_local_serializer, (char*)_local_dynamic_block.ptr); mlpserialize(&_local_serializer, &tmp, _state); ae_serializer_stop(&_local_serializer, _state); ae_serializer_clear(&_local_serializer); ae_serializer_init(&_local_serializer); ae_serializer_ustart_str(&_local_serializer, (char*)_local_dynamic_block.ptr); mlpunserialize(&_local_serializer, network, _state); ae_serializer_stop(&_local_serializer, _state); ae_serializer_clear(&_local_serializer); ae_frame_leave(_state); } } if( mkind==2 ) { mlpcreate0(nin, nout, &tmp, _state); mlpcopy(&tmp, network, _state); } } else { if( nkind==1 ) { if( mkind==0 ) { mlpcreatec0(nin, nout, network, _state); } if( mkind==1 ) { mlpcreatec0(nin, nout, &tmp, _state); { /* * This code passes data structure through serializers * (serializes it to string and loads back) */ ae_serializer _local_serializer; ae_int_t _local_ssize; 
ae_frame _local_frame_block; ae_dyn_block _local_dynamic_block; ae_frame_make(_state, &_local_frame_block); ae_serializer_init(&_local_serializer); ae_serializer_alloc_start(&_local_serializer); mlpalloc(&_local_serializer, &tmp, _state); _local_ssize = ae_serializer_get_alloc_size(&_local_serializer); ae_db_malloc(&_local_dynamic_block, _local_ssize+1, _state, ae_true); ae_serializer_sstart_str(&_local_serializer, (char*)_local_dynamic_block.ptr); mlpserialize(&_local_serializer, &tmp, _state); ae_serializer_stop(&_local_serializer, _state); ae_serializer_clear(&_local_serializer); ae_serializer_init(&_local_serializer); ae_serializer_ustart_str(&_local_serializer, (char*)_local_dynamic_block.ptr); mlpunserialize(&_local_serializer, network, _state); ae_serializer_stop(&_local_serializer, _state); ae_serializer_clear(&_local_serializer); ae_frame_leave(_state); } } if( mkind==2 ) { mlpcreatec0(nin, nout, &tmp, _state); mlpcopy(&tmp, network, _state); } } else { if( nkind==2 ) { if( mkind==0 ) { mlpcreateb0(nin, nout, a1, a2, network, _state); } if( mkind==1 ) { mlpcreateb0(nin, nout, a1, a2, &tmp, _state); { /* * This code passes data structure through serializers * (serializes it to string and loads back) */ ae_serializer _local_serializer; ae_int_t _local_ssize; ae_frame _local_frame_block; ae_dyn_block _local_dynamic_block; ae_frame_make(_state, &_local_frame_block); ae_serializer_init(&_local_serializer); ae_serializer_alloc_start(&_local_serializer); mlpalloc(&_local_serializer, &tmp, _state); _local_ssize = ae_serializer_get_alloc_size(&_local_serializer); ae_db_malloc(&_local_dynamic_block, _local_ssize+1, _state, ae_true); ae_serializer_sstart_str(&_local_serializer, (char*)_local_dynamic_block.ptr); mlpserialize(&_local_serializer, &tmp, _state); ae_serializer_stop(&_local_serializer, _state); ae_serializer_clear(&_local_serializer); ae_serializer_init(&_local_serializer); ae_serializer_ustart_str(&_local_serializer, (char*)_local_dynamic_block.ptr); mlpunserialize(&_local_serializer, network, _state); ae_serializer_stop(&_local_serializer, _state); ae_serializer_clear(&_local_serializer); ae_frame_leave(_state); } } if( mkind==2 ) { mlpcreateb0(nin, nout, a1, a2, &tmp, _state); mlpcopy(&tmp, network, _state); } } else { if( nkind==3 ) { if( mkind==0 ) { mlpcreater0(nin, nout, a1, a2, network, _state); } if( mkind==1 ) { mlpcreater0(nin, nout, a1, a2, &tmp, _state); { /* * This code passes data structure through serializers * (serializes it to string and loads back) */ ae_serializer _local_serializer; ae_int_t _local_ssize; ae_frame _local_frame_block; ae_dyn_block _local_dynamic_block; ae_frame_make(_state, &_local_frame_block); ae_serializer_init(&_local_serializer); ae_serializer_alloc_start(&_local_serializer); mlpalloc(&_local_serializer, &tmp, _state); _local_ssize = ae_serializer_get_alloc_size(&_local_serializer); ae_db_malloc(&_local_dynamic_block, _local_ssize+1, _state, ae_true); ae_serializer_sstart_str(&_local_serializer, (char*)_local_dynamic_block.ptr); mlpserialize(&_local_serializer, &tmp, _state); ae_serializer_stop(&_local_serializer, _state); ae_serializer_clear(&_local_serializer); ae_serializer_init(&_local_serializer); ae_serializer_ustart_str(&_local_serializer, (char*)_local_dynamic_block.ptr); mlpunserialize(&_local_serializer, network, _state); ae_serializer_stop(&_local_serializer, _state); ae_serializer_clear(&_local_serializer); ae_frame_leave(_state); } } if( mkind==2 ) { mlpcreater0(nin, nout, a1, a2, &tmp, _state); mlpcopy(&tmp, network, _state); 
} } } } } mlprandomizefull(network, _state); ae_frame_leave(_state); return; } if( nhid2==0 ) { /* * One hidden layer */ if( nkind==0 ) { if( mkind==0 ) { mlpcreate1(nin, nhid1, nout, network, _state); } if( mkind==1 ) { mlpcreate1(nin, nhid1, nout, &tmp, _state); { /* * This code passes data structure through serializers * (serializes it to string and loads back) */ ae_serializer _local_serializer; ae_int_t _local_ssize; ae_frame _local_frame_block; ae_dyn_block _local_dynamic_block; ae_frame_make(_state, &_local_frame_block); ae_serializer_init(&_local_serializer); ae_serializer_alloc_start(&_local_serializer); mlpalloc(&_local_serializer, &tmp, _state); _local_ssize = ae_serializer_get_alloc_size(&_local_serializer); ae_db_malloc(&_local_dynamic_block, _local_ssize+1, _state, ae_true); ae_serializer_sstart_str(&_local_serializer, (char*)_local_dynamic_block.ptr); mlpserialize(&_local_serializer, &tmp, _state); ae_serializer_stop(&_local_serializer, _state); ae_serializer_clear(&_local_serializer); ae_serializer_init(&_local_serializer); ae_serializer_ustart_str(&_local_serializer, (char*)_local_dynamic_block.ptr); mlpunserialize(&_local_serializer, network, _state); ae_serializer_stop(&_local_serializer, _state); ae_serializer_clear(&_local_serializer); ae_frame_leave(_state); } } if( mkind==2 ) { mlpcreate1(nin, nhid1, nout, &tmp, _state); mlpcopy(&tmp, network, _state); } } else { if( nkind==1 ) { if( mkind==0 ) { mlpcreatec1(nin, nhid1, nout, network, _state); } if( mkind==1 ) { mlpcreatec1(nin, nhid1, nout, &tmp, _state); { /* * This code passes data structure through serializers * (serializes it to string and loads back) */ ae_serializer _local_serializer; ae_int_t _local_ssize; ae_frame _local_frame_block; ae_dyn_block _local_dynamic_block; ae_frame_make(_state, &_local_frame_block); ae_serializer_init(&_local_serializer); ae_serializer_alloc_start(&_local_serializer); mlpalloc(&_local_serializer, &tmp, _state); _local_ssize = ae_serializer_get_alloc_size(&_local_serializer); ae_db_malloc(&_local_dynamic_block, _local_ssize+1, _state, ae_true); ae_serializer_sstart_str(&_local_serializer, (char*)_local_dynamic_block.ptr); mlpserialize(&_local_serializer, &tmp, _state); ae_serializer_stop(&_local_serializer, _state); ae_serializer_clear(&_local_serializer); ae_serializer_init(&_local_serializer); ae_serializer_ustart_str(&_local_serializer, (char*)_local_dynamic_block.ptr); mlpunserialize(&_local_serializer, network, _state); ae_serializer_stop(&_local_serializer, _state); ae_serializer_clear(&_local_serializer); ae_frame_leave(_state); } } if( mkind==2 ) { mlpcreatec1(nin, nhid1, nout, &tmp, _state); mlpcopy(&tmp, network, _state); } } else { if( nkind==2 ) { if( mkind==0 ) { mlpcreateb1(nin, nhid1, nout, a1, a2, network, _state); } if( mkind==1 ) { mlpcreateb1(nin, nhid1, nout, a1, a2, &tmp, _state); { /* * This code passes data structure through serializers * (serializes it to string and loads back) */ ae_serializer _local_serializer; ae_int_t _local_ssize; ae_frame _local_frame_block; ae_dyn_block _local_dynamic_block; ae_frame_make(_state, &_local_frame_block); ae_serializer_init(&_local_serializer); ae_serializer_alloc_start(&_local_serializer); mlpalloc(&_local_serializer, &tmp, _state); _local_ssize = ae_serializer_get_alloc_size(&_local_serializer); ae_db_malloc(&_local_dynamic_block, _local_ssize+1, _state, ae_true); ae_serializer_sstart_str(&_local_serializer, (char*)_local_dynamic_block.ptr); mlpserialize(&_local_serializer, &tmp, _state); 
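/* the serialized image is now stored in the string buffer; it is read back below with ae_serializer_ustart_str() and mlpunserialize(), restoring an equivalent network */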
ae_serializer_stop(&_local_serializer, _state); ae_serializer_clear(&_local_serializer); ae_serializer_init(&_local_serializer); ae_serializer_ustart_str(&_local_serializer, (char*)_local_dynamic_block.ptr); mlpunserialize(&_local_serializer, network, _state); ae_serializer_stop(&_local_serializer, _state); ae_serializer_clear(&_local_serializer); ae_frame_leave(_state); } } if( mkind==2 ) { mlpcreateb1(nin, nhid1, nout, a1, a2, &tmp, _state); mlpcopy(&tmp, network, _state); } } else { if( nkind==3 ) { if( mkind==0 ) { mlpcreater1(nin, nhid1, nout, a1, a2, network, _state); } if( mkind==1 ) { mlpcreater1(nin, nhid1, nout, a1, a2, &tmp, _state); { /* * This code passes data structure through serializers * (serializes it to string and loads back) */ ae_serializer _local_serializer; ae_int_t _local_ssize; ae_frame _local_frame_block; ae_dyn_block _local_dynamic_block; ae_frame_make(_state, &_local_frame_block); ae_serializer_init(&_local_serializer); ae_serializer_alloc_start(&_local_serializer); mlpalloc(&_local_serializer, &tmp, _state); _local_ssize = ae_serializer_get_alloc_size(&_local_serializer); ae_db_malloc(&_local_dynamic_block, _local_ssize+1, _state, ae_true); ae_serializer_sstart_str(&_local_serializer, (char*)_local_dynamic_block.ptr); mlpserialize(&_local_serializer, &tmp, _state); ae_serializer_stop(&_local_serializer, _state); ae_serializer_clear(&_local_serializer); ae_serializer_init(&_local_serializer); ae_serializer_ustart_str(&_local_serializer, (char*)_local_dynamic_block.ptr); mlpunserialize(&_local_serializer, network, _state); ae_serializer_stop(&_local_serializer, _state); ae_serializer_clear(&_local_serializer); ae_frame_leave(_state); } } if( mkind==2 ) { mlpcreater1(nin, nhid1, nout, a1, a2, &tmp, _state); mlpcopy(&tmp, network, _state); } } } } } mlprandomizefull(network, _state); ae_frame_leave(_state); return; } /* * Two hidden layers */ if( nkind==0 ) { if( mkind==0 ) { mlpcreate2(nin, nhid1, nhid2, nout, network, _state); } if( mkind==1 ) { mlpcreate2(nin, nhid1, nhid2, nout, &tmp, _state); { /* * This code passes data structure through serializers * (serializes it to string and loads back) */ ae_serializer _local_serializer; ae_int_t _local_ssize; ae_frame _local_frame_block; ae_dyn_block _local_dynamic_block; ae_frame_make(_state, &_local_frame_block); ae_serializer_init(&_local_serializer); ae_serializer_alloc_start(&_local_serializer); mlpalloc(&_local_serializer, &tmp, _state); _local_ssize = ae_serializer_get_alloc_size(&_local_serializer); ae_db_malloc(&_local_dynamic_block, _local_ssize+1, _state, ae_true); ae_serializer_sstart_str(&_local_serializer, (char*)_local_dynamic_block.ptr); mlpserialize(&_local_serializer, &tmp, _state); ae_serializer_stop(&_local_serializer, _state); ae_serializer_clear(&_local_serializer); ae_serializer_init(&_local_serializer); ae_serializer_ustart_str(&_local_serializer, (char*)_local_dynamic_block.ptr); mlpunserialize(&_local_serializer, network, _state); ae_serializer_stop(&_local_serializer, _state); ae_serializer_clear(&_local_serializer); ae_frame_leave(_state); } } if( mkind==2 ) { mlpcreate2(nin, nhid1, nhid2, nout, &tmp, _state); mlpcopy(&tmp, network, _state); } } else { if( nkind==1 ) { if( mkind==0 ) { mlpcreatec2(nin, nhid1, nhid2, nout, network, _state); } if( mkind==1 ) { mlpcreatec2(nin, nhid1, nhid2, nout, &tmp, _state); { /* * This code passes data structure through serializers * (serializes it to string and loads back) */ ae_serializer _local_serializer; ae_int_t _local_ssize; ae_frame 
_local_frame_block; ae_dyn_block _local_dynamic_block; ae_frame_make(_state, &_local_frame_block); ae_serializer_init(&_local_serializer); ae_serializer_alloc_start(&_local_serializer); mlpalloc(&_local_serializer, &tmp, _state); _local_ssize = ae_serializer_get_alloc_size(&_local_serializer); ae_db_malloc(&_local_dynamic_block, _local_ssize+1, _state, ae_true); ae_serializer_sstart_str(&_local_serializer, (char*)_local_dynamic_block.ptr); mlpserialize(&_local_serializer, &tmp, _state); ae_serializer_stop(&_local_serializer, _state); ae_serializer_clear(&_local_serializer); ae_serializer_init(&_local_serializer); ae_serializer_ustart_str(&_local_serializer, (char*)_local_dynamic_block.ptr); mlpunserialize(&_local_serializer, network, _state); ae_serializer_stop(&_local_serializer, _state); ae_serializer_clear(&_local_serializer); ae_frame_leave(_state); } } if( mkind==2 ) { mlpcreatec2(nin, nhid1, nhid2, nout, &tmp, _state); mlpcopy(&tmp, network, _state); } } else { if( nkind==2 ) { if( mkind==0 ) { mlpcreateb2(nin, nhid1, nhid2, nout, a1, a2, network, _state); } if( mkind==1 ) { mlpcreateb2(nin, nhid1, nhid2, nout, a1, a2, &tmp, _state); { /* * This code passes data structure through serializers * (serializes it to string and loads back) */ ae_serializer _local_serializer; ae_int_t _local_ssize; ae_frame _local_frame_block; ae_dyn_block _local_dynamic_block; ae_frame_make(_state, &_local_frame_block); ae_serializer_init(&_local_serializer); ae_serializer_alloc_start(&_local_serializer); mlpalloc(&_local_serializer, &tmp, _state); _local_ssize = ae_serializer_get_alloc_size(&_local_serializer); ae_db_malloc(&_local_dynamic_block, _local_ssize+1, _state, ae_true); ae_serializer_sstart_str(&_local_serializer, (char*)_local_dynamic_block.ptr); mlpserialize(&_local_serializer, &tmp, _state); ae_serializer_stop(&_local_serializer, _state); ae_serializer_clear(&_local_serializer); ae_serializer_init(&_local_serializer); ae_serializer_ustart_str(&_local_serializer, (char*)_local_dynamic_block.ptr); mlpunserialize(&_local_serializer, network, _state); ae_serializer_stop(&_local_serializer, _state); ae_serializer_clear(&_local_serializer); ae_frame_leave(_state); } } if( mkind==2 ) { mlpcreateb2(nin, nhid1, nhid2, nout, a1, a2, &tmp, _state); mlpcopy(&tmp, network, _state); } } else { if( nkind==3 ) { if( mkind==0 ) { mlpcreater2(nin, nhid1, nhid2, nout, a1, a2, network, _state); } if( mkind==1 ) { mlpcreater2(nin, nhid1, nhid2, nout, a1, a2, &tmp, _state); { /* * This code passes data structure through serializers * (serializes it to string and loads back) */ ae_serializer _local_serializer; ae_int_t _local_ssize; ae_frame _local_frame_block; ae_dyn_block _local_dynamic_block; ae_frame_make(_state, &_local_frame_block); ae_serializer_init(&_local_serializer); ae_serializer_alloc_start(&_local_serializer); mlpalloc(&_local_serializer, &tmp, _state); _local_ssize = ae_serializer_get_alloc_size(&_local_serializer); ae_db_malloc(&_local_dynamic_block, _local_ssize+1, _state, ae_true); ae_serializer_sstart_str(&_local_serializer, (char*)_local_dynamic_block.ptr); mlpserialize(&_local_serializer, &tmp, _state); ae_serializer_stop(&_local_serializer, _state); ae_serializer_clear(&_local_serializer); ae_serializer_init(&_local_serializer); ae_serializer_ustart_str(&_local_serializer, (char*)_local_dynamic_block.ptr); mlpunserialize(&_local_serializer, network, _state); ae_serializer_stop(&_local_serializer, _state); ae_serializer_clear(&_local_serializer); ae_frame_leave(_state); } } if( mkind==2 ) { 
mlpcreater2(nin, nhid1, nhid2, nout, a1, a2, &tmp, _state); mlpcopy(&tmp, network, _state); } } } } } mlprandomizefull(network, _state); ae_frame_leave(_state); } /************************************************************************* Unsets network (initialize it to smallest network possible *************************************************************************/ static void testmlpbaseunit_unsetnetwork(multilayerperceptron* network, ae_state *_state) { mlpcreate0(1, 1, network, _state); } /************************************************************************* Informational functions test *************************************************************************/ static void testmlpbaseunit_testinformational(ae_int_t nkind, ae_int_t nin, ae_int_t nhid1, ae_int_t nhid2, ae_int_t nout, ae_int_t passcount, ae_bool* err, ae_state *_state) { ae_frame _frame_block; multilayerperceptron network; ae_int_t n1; ae_int_t n2; ae_int_t wcount; ae_int_t i; ae_int_t j; ae_int_t k; double threshold; ae_int_t nlayers; ae_int_t nmax; ae_matrix neurons; ae_vector x; ae_vector y; double mean; double sigma; ae_int_t fkind; double c; double f; double df; double d2f; double s; ae_frame_make(_state, &_frame_block); _multilayerperceptron_init(&network, _state); ae_matrix_init(&neurons, 0, 0, DT_REAL, _state); ae_vector_init(&x, 0, DT_REAL, _state); ae_vector_init(&y, 0, DT_REAL, _state); threshold = 100000*ae_machineepsilon; testmlpbaseunit_createnetwork(&network, nkind, 0.0, 0.0, nin, nhid1, nhid2, nout, _state); /* * test MLPProperties() */ mlpproperties(&network, &n1, &n2, &wcount, _state); *err = ((*err||n1!=nin)||n2!=nout)||wcount<=0; *err = ((*err||mlpgetinputscount(&network, _state)!=nin)||mlpgetoutputscount(&network, _state)!=nout)||mlpgetweightscount(&network, _state)!=wcount; /* * Test network geometry functions * * In order to do this we calculate neural network output using * informational functions only, and compare results with ones * obtained with MLPProcess(): * 1. we allocate 2-dimensional array of neurons and fill it by zeros * 2. we full first layer of neurons by input values * 3. we move through array, calculating values of subsequent layers * 4. if we have classification network, we SOFTMAX-normalize output layer * 5. we apply scaling to the outputs * 6. we compare results with ones obtained by MLPProcess() * * NOTE: it is important to do (4) before (5), because on SOFTMAX network * MLPGetOutputScaling() must return Mean=0 and Sigma=1. In order * to test it implicitly, we apply it to the classifier results * (already normalized). If one of the coefficients deviates from * expected values, we will get error during (6). 
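 *
 * NOTE: in the code below input scaling is applied as X[j] := (X[j]-Mean)/Sigma
 *       (coefficients from MLPGetInputScaling), and output scaling is applied as
 *       Y[j] := Y[j]*Sigma+Mean (coefficients from MLPGetOutputScaling).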
*/ nlayers = 2; nmax = ae_maxint(nin, nout, _state); if( nhid1!=0 ) { nlayers = 3; nmax = ae_maxint(nmax, nhid1, _state); } if( nhid2!=0 ) { nlayers = 4; nmax = ae_maxint(nmax, nhid2, _state); } ae_matrix_set_length(&neurons, nlayers, nmax, _state); for(i=0; i<=nlayers-1; i++) { for(j=0; j<=nmax-1; j++) { neurons.ptr.pp_double[i][j] = (double)(0); } } ae_vector_set_length(&x, nin, _state); for(i=0; i<=nin-1; i++) { x.ptr.p_double[i] = 2*ae_randomreal(_state)-1; } ae_vector_set_length(&y, nout, _state); for(i=0; i<=nout-1; i++) { y.ptr.p_double[i] = 2*ae_randomreal(_state)-1; } for(j=0; j<=nin-1; j++) { mlpgetinputscaling(&network, j, &mean, &sigma, _state); neurons.ptr.pp_double[0][j] = (x.ptr.p_double[j]-mean)/sigma; } for(i=1; i<=nlayers-1; i++) { for(j=0; j<=mlpgetlayersize(&network, i, _state)-1; j++) { for(k=0; k<=mlpgetlayersize(&network, i-1, _state)-1; k++) { neurons.ptr.pp_double[i][j] = neurons.ptr.pp_double[i][j]+mlpgetweight(&network, i-1, k, i, j, _state)*neurons.ptr.pp_double[i-1][k]; } mlpgetneuroninfo(&network, i, j, &fkind, &c, _state); mlpactivationfunction(neurons.ptr.pp_double[i][j]-c, fkind, &f, &df, &d2f, _state); neurons.ptr.pp_double[i][j] = f; } } if( nkind==1 ) { s = (double)(0); for(j=0; j<=nout-1; j++) { s = s+ae_exp(neurons.ptr.pp_double[nlayers-1][j], _state); } for(j=0; j<=nout-1; j++) { neurons.ptr.pp_double[nlayers-1][j] = ae_exp(neurons.ptr.pp_double[nlayers-1][j], _state)/s; } } for(j=0; j<=nout-1; j++) { mlpgetoutputscaling(&network, j, &mean, &sigma, _state); neurons.ptr.pp_double[nlayers-1][j] = neurons.ptr.pp_double[nlayers-1][j]*sigma+mean; } mlpprocess(&network, &x, &y, _state); for(j=0; j<=nout-1; j++) { *err = *err||ae_fp_greater(ae_fabs(neurons.ptr.pp_double[nlayers-1][j]-y.ptr.p_double[j], _state),threshold); } ae_frame_leave(_state); } /************************************************************************* Processing functions test *************************************************************************/ static void testmlpbaseunit_testprocessing(ae_int_t nkind, ae_int_t nin, ae_int_t nhid1, ae_int_t nhid2, ae_int_t nout, ae_int_t passcount, ae_bool* err, ae_state *_state) { ae_frame _frame_block; multilayerperceptron network; multilayerperceptron network2; sparsematrix sparsexy; ae_matrix densexy; ae_int_t npoints; ae_int_t subnp; ae_bool iscls; ae_int_t n1; ae_int_t n2; ae_int_t wcount; ae_bool zeronet; double a1; double a2; ae_int_t pass; ae_int_t i; ae_int_t j; ae_bool allsame; ae_vector x1; ae_vector x2; ae_vector y1; ae_vector y2; ae_vector p0; ae_vector p1; ae_int_t pcount; double v; ae_frame_make(_state, &_frame_block); _multilayerperceptron_init(&network, _state); _multilayerperceptron_init(&network2, _state); _sparsematrix_init(&sparsexy, _state); ae_matrix_init(&densexy, 0, 0, DT_REAL, _state); ae_vector_init(&x1, 0, DT_REAL, _state); ae_vector_init(&x2, 0, DT_REAL, _state); ae_vector_init(&y1, 0, DT_REAL, _state); ae_vector_init(&y2, 0, DT_REAL, _state); ae_vector_init(&p0, 0, DT_REAL, _state); ae_vector_init(&p1, 0, DT_REAL, _state); ae_assert(passcount>=2, "PassCount<2!", _state); /* * Prepare network */ a1 = (double)(0); a2 = (double)(0); if( nkind==2 ) { a1 = 1000*ae_randomreal(_state)-500; a2 = 2*ae_randomreal(_state)-1; } if( nkind==3 ) { a1 = 1000*ae_randomreal(_state)-500; a2 = a1+(2*ae_randominteger(2, _state)-1)*(0.1+0.9*ae_randomreal(_state)); } testmlpbaseunit_createnetwork(&network, nkind, a1, a2, nin, nhid1, nhid2, nout, _state); mlpproperties(&network, &n1, &n2, &wcount, _state); iscls = mlpissoftmax(&network, 
_state); /* * Initialize arrays */ ae_vector_set_length(&x1, nin, _state); ae_vector_set_length(&x2, nin, _state); ae_vector_set_length(&y1, nout, _state); ae_vector_set_length(&y2, nout, _state); /* * Initialize sets */ npoints = ae_randominteger(11, _state)+10; if( iscls ) { ae_matrix_set_length(&densexy, npoints, nin+1, _state); sparsecreate(npoints, nin+1, npoints, &sparsexy, _state); } else { ae_matrix_set_length(&densexy, npoints, nin+nout, _state); sparsecreate(npoints, nin+nout, npoints, &sparsexy, _state); } sparseconverttocrs(&sparsexy, _state); /* * Main cycle */ for(pass=1; pass<=passcount; pass++) { /* * Last run is made on zero network */ mlprandomizefull(&network, _state); zeronet = ae_false; if( pass==passcount ) { ae_v_muld(&network.weights.ptr.p_double[0], 1, ae_v_len(0,wcount-1), 0); zeronet = ae_true; } /* * Same inputs leads to same outputs */ for(i=0; i<=nin-1; i++) { x1.ptr.p_double[i] = 2*ae_randomreal(_state)-1; x2.ptr.p_double[i] = x1.ptr.p_double[i]; } for(i=0; i<=nout-1; i++) { y1.ptr.p_double[i] = 2*ae_randomreal(_state)-1; y2.ptr.p_double[i] = 2*ae_randomreal(_state)-1; } mlpprocess(&network, &x1, &y1, _state); mlpprocess(&network, &x2, &y2, _state); seterrorflag(err, ae_fp_neq(testmlpbaseunit_vectordiff(&y1, &y2, nout, 1.0, _state),0.0), _state); /* * Same inputs on original network leads to same outputs * on copy created: * * using MLPCopy * * using MLPCopyShared */ for(i=0; i<=nin-1; i++) { x1.ptr.p_double[i] = 2*ae_randomreal(_state)-1; x2.ptr.p_double[i] = x1.ptr.p_double[i]; } for(i=0; i<=nout-1; i++) { y1.ptr.p_double[i] = 2*ae_randomreal(_state)-1; } for(i=0; i<=nout-1; i++) { y2.ptr.p_double[i] = 2*ae_randomreal(_state)-1; } testmlpbaseunit_unsetnetwork(&network2, _state); mlpcopy(&network, &network2, _state); mlpprocess(&network, &x1, &y1, _state); mlpprocess(&network2, &x2, &y2, _state); seterrorflag(err, ae_fp_neq(testmlpbaseunit_vectordiff(&y1, &y2, nout, 1.0, _state),0.0), _state); for(i=0; i<=nout-1; i++) { y2.ptr.p_double[i] = 2*ae_randomreal(_state)-1; } testmlpbaseunit_unsetnetwork(&network2, _state); mlpcopyshared(&network, &network2, _state); mlpprocess(&network, &x1, &y1, _state); mlpprocess(&network2, &x2, &y2, _state); seterrorflag(err, ae_fp_neq(testmlpbaseunit_vectordiff(&y1, &y2, nout, 1.0, _state),0.0), _state); /* * Additionally we tests functions for copying of tunable * parameters by: * * copying network using MLPCopy * * randomizing tunable parameters with MLPRandomizeFull() * * copying tunable parameters with: * a) MLPCopyTunableParameters * b) combination of MLPExportTunableParameters and * MLPImportTunableParameters - we export parameters * to P1, copy PCount elements to P2, then test import. 
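 * (in the code below the export target is the array P0, and the copy passed to
 * MLPImportTunableParameters is P1)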
*/ for(i=0; i<=nin-1; i++) { x1.ptr.p_double[i] = 2*ae_randomreal(_state)-1; x2.ptr.p_double[i] = x1.ptr.p_double[i]; } for(i=0; i<=nout-1; i++) { y1.ptr.p_double[i] = 2*ae_randomreal(_state)-1; } for(i=0; i<=nout-1; i++) { y2.ptr.p_double[i] = 2*ae_randomreal(_state)-1; } testmlpbaseunit_unsetnetwork(&network2, _state); mlpcopy(&network, &network2, _state); mlprandomizefull(&network2, _state); mlpcopytunableparameters(&network, &network2, _state); mlpprocess(&network, &x1, &y1, _state); mlpprocess(&network2, &x2, &y2, _state); seterrorflag(err, ae_fp_neq(testmlpbaseunit_vectordiff(&y1, &y2, nout, 1.0, _state),0.0), _state); for(i=0; i<=nout-1; i++) { y2.ptr.p_double[i] = 2*ae_randomreal(_state)-1; } testmlpbaseunit_unsetnetwork(&network2, _state); mlpcopy(&network, &network2, _state); mlprandomizefull(&network2, _state); mlpexporttunableparameters(&network, &p0, &pcount, _state); ae_vector_set_length(&p1, pcount, _state); for(i=0; i<=pcount-1; i++) { p1.ptr.p_double[i] = p0.ptr.p_double[i]; } mlpimporttunableparameters(&network2, &p1, _state); mlpprocess(&network, &x1, &y1, _state); mlpprocess(&network2, &x2, &y2, _state); seterrorflag(err, ae_fp_neq(testmlpbaseunit_vectordiff(&y1, &y2, nout, 1.0, _state),0.0), _state); /* * Same inputs on original network leads to same outputs * on copy created using MLPSerialize */ testmlpbaseunit_unsetnetwork(&network2, _state); { /* * This code passes data structure through serializers * (serializes it to string and loads back) */ ae_serializer _local_serializer; ae_int_t _local_ssize; ae_frame _local_frame_block; ae_dyn_block _local_dynamic_block; ae_frame_make(_state, &_local_frame_block); ae_serializer_init(&_local_serializer); ae_serializer_alloc_start(&_local_serializer); mlpalloc(&_local_serializer, &network, _state); _local_ssize = ae_serializer_get_alloc_size(&_local_serializer); ae_db_malloc(&_local_dynamic_block, _local_ssize+1, _state, ae_true); ae_serializer_sstart_str(&_local_serializer, (char*)_local_dynamic_block.ptr); mlpserialize(&_local_serializer, &network, _state); ae_serializer_stop(&_local_serializer, _state); ae_serializer_clear(&_local_serializer); ae_serializer_init(&_local_serializer); ae_serializer_ustart_str(&_local_serializer, (char*)_local_dynamic_block.ptr); mlpunserialize(&_local_serializer, &network2, _state); ae_serializer_stop(&_local_serializer, _state); ae_serializer_clear(&_local_serializer); ae_frame_leave(_state); } for(i=0; i<=nin-1; i++) { x1.ptr.p_double[i] = 2*ae_randomreal(_state)-1; x2.ptr.p_double[i] = x1.ptr.p_double[i]; } for(i=0; i<=nout-1; i++) { y1.ptr.p_double[i] = 2*ae_randomreal(_state)-1; y2.ptr.p_double[i] = 2*ae_randomreal(_state)-1; } mlpprocess(&network, &x1, &y1, _state); mlpprocess(&network2, &x2, &y2, _state); allsame = ae_true; for(i=0; i<=nout-1; i++) { allsame = allsame&&ae_fp_eq(y1.ptr.p_double[i],y2.ptr.p_double[i]); } *err = *err||!allsame; /* * Different inputs leads to different outputs (non-zero network) */ if( !zeronet ) { for(i=0; i<=nin-1; i++) { x1.ptr.p_double[i] = 2*ae_randomreal(_state)-1; x2.ptr.p_double[i] = 2*ae_randomreal(_state)-1; } for(i=0; i<=nout-1; i++) { y1.ptr.p_double[i] = 2*ae_randomreal(_state)-1; y2.ptr.p_double[i] = y1.ptr.p_double[i]; } mlpprocess(&network, &x1, &y1, _state); mlpprocess(&network, &x2, &y2, _state); allsame = ae_true; for(i=0; i<=nout-1; i++) { allsame = allsame&&ae_fp_eq(y1.ptr.p_double[i],y2.ptr.p_double[i]); } *err = *err||allsame; } /* * Randomization changes outputs (when inputs are unchanged, non-zero network) */ if( !zeronet ) { 
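/* the copy NETWORK2 is re-randomized with MLPRandomize(); for a non-zero source network the test expects the randomized copy to produce outputs different from the original for the same input X1 */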
for(i=0; i<=nin-1; i++) { x1.ptr.p_double[i] = 2*ae_randomreal(_state)-1; x2.ptr.p_double[i] = 2*ae_randomreal(_state)-1; } for(i=0; i<=nout-1; i++) { y1.ptr.p_double[i] = 2*ae_randomreal(_state)-1; y2.ptr.p_double[i] = y1.ptr.p_double[i]; } mlpcopy(&network, &network2, _state); mlprandomize(&network2, _state); mlpprocess(&network, &x1, &y1, _state); mlpprocess(&network2, &x1, &y2, _state); allsame = ae_true; for(i=0; i<=nout-1; i++) { allsame = allsame&&ae_fp_eq(y1.ptr.p_double[i],y2.ptr.p_double[i]); } *err = *err||allsame; } /* * Full randomization changes outputs (when inputs are unchanged, non-zero network) */ if( !zeronet ) { for(i=0; i<=nin-1; i++) { x1.ptr.p_double[i] = 2*ae_randomreal(_state)-1; x2.ptr.p_double[i] = 2*ae_randomreal(_state)-1; } for(i=0; i<=nout-1; i++) { y1.ptr.p_double[i] = 2*ae_randomreal(_state)-1; y2.ptr.p_double[i] = y1.ptr.p_double[i]; } mlpcopy(&network, &network2, _state); mlprandomizefull(&network2, _state); mlpprocess(&network, &x1, &y1, _state); mlpprocess(&network2, &x1, &y2, _state); allsame = ae_true; for(i=0; i<=nout-1; i++) { allsame = allsame&&ae_fp_eq(y1.ptr.p_double[i],y2.ptr.p_double[i]); } *err = *err||allsame; } /* * Normalization properties */ if( nkind==1 ) { /* * Classifier network outputs are normalized */ for(i=0; i<=nin-1; i++) { x1.ptr.p_double[i] = 2*ae_randomreal(_state)-1; } mlpprocess(&network, &x1, &y1, _state); v = (double)(0); for(i=0; i<=nout-1; i++) { v = v+y1.ptr.p_double[i]; *err = *err||ae_fp_less(y1.ptr.p_double[i],(double)(0)); } *err = *err||ae_fp_greater(ae_fabs(v-1, _state),1000*ae_machineepsilon); } if( nkind==2 ) { /* * B-type network outputs are bounded from above/below */ for(i=0; i<=nin-1; i++) { x1.ptr.p_double[i] = 2*ae_randomreal(_state)-1; } mlpprocess(&network, &x1, &y1, _state); for(i=0; i<=nout-1; i++) { if( ae_fp_greater_eq(a2,(double)(0)) ) { *err = *err||ae_fp_less(y1.ptr.p_double[i],a1); } else { *err = *err||ae_fp_greater(y1.ptr.p_double[i],a1); } } } if( nkind==3 ) { /* * R-type network outputs are within [A1,A2] (or [A2,A1]) */ for(i=0; i<=nin-1; i++) { x1.ptr.p_double[i] = 2*ae_randomreal(_state)-1; } mlpprocess(&network, &x1, &y1, _state); for(i=0; i<=nout-1; i++) { *err = (*err||ae_fp_less(y1.ptr.p_double[i],ae_minreal(a1, a2, _state)))||ae_fp_greater(y1.ptr.p_double[i],ae_maxreal(a1, a2, _state)); } } /* * Comperison MLPInitPreprocessor results with * MLPInitPreprocessorSparse results */ sparseconverttohash(&sparsexy, _state); if( iscls ) { for(i=0; i<=npoints-1; i++) { for(j=0; j<=nin-1; j++) { densexy.ptr.pp_double[i][j] = 2*ae_randomreal(_state)-1; sparseset(&sparsexy, i, j, densexy.ptr.pp_double[i][j], _state); } densexy.ptr.pp_double[i][nin] = (double)(ae_randominteger(nout, _state)); sparseset(&sparsexy, i, j, densexy.ptr.pp_double[i][nin], _state); } } else { for(i=0; i<=npoints-1; i++) { for(j=0; j<=nin+nout-1; j++) { densexy.ptr.pp_double[i][j] = 2*ae_randomreal(_state)-1; sparseset(&sparsexy, i, j, densexy.ptr.pp_double[i][j], _state); } } } sparseconverttocrs(&sparsexy, _state); mlpcopy(&network, &network2, _state); mlpinitpreprocessor(&network, &densexy, npoints, _state); mlpinitpreprocessorsparse(&network2, &sparsexy, npoints, _state); subnp = ae_randominteger(npoints, _state); for(i=0; i<=subnp-1; i++) { for(j=0; j<=nin-1; j++) { x1.ptr.p_double[j] = 2*ae_randomreal(_state)-1; } mlpprocess(&network, &x1, &y1, _state); mlpprocess(&network2, &x1, &y2, _state); for(j=0; j<=nout-1; j++) { *err = *err||ae_fp_greater(ae_fabs(y1.ptr.p_double[j]-y2.ptr.p_double[j], _state),1.0E-6); } } } 
ae_frame_leave(_state); } /************************************************************************* Gradient functions test *************************************************************************/ static void testmlpbaseunit_testgradient(ae_int_t nkind, ae_int_t nin, ae_int_t nhid1, ae_int_t nhid2, ae_int_t nout, ae_int_t passcount, ae_int_t sizemin, ae_int_t sizemax, ae_bool* err, ae_state *_state) { ae_frame _frame_block; multilayerperceptron network; sparsematrix sparsexy; sparsematrix sparsexy2; ae_int_t n1; ae_int_t n2; ae_int_t wcount; double h; double etol; double escale; double gscale; double nonstricttolerance; double a1; double a2; ae_int_t pass; ae_int_t i; ae_int_t j; ae_int_t k; ae_int_t ssize; ae_int_t subsetsize; ae_int_t rowsize; ae_matrix xy; ae_matrix xy2; ae_vector grad1; ae_vector grad2; ae_vector gradsp; ae_vector x; ae_vector y; ae_vector x1; ae_vector x2; ae_vector y1; ae_vector y2; ae_vector idx; double v; double e; double e1; double e2; double esp; double v1; double v2; double v3; double v4; double wprev; double referencee; ae_vector referenceg; ae_frame_make(_state, &_frame_block); _multilayerperceptron_init(&network, _state); _sparsematrix_init(&sparsexy, _state); _sparsematrix_init(&sparsexy2, _state); ae_matrix_init(&xy, 0, 0, DT_REAL, _state); ae_matrix_init(&xy2, 0, 0, DT_REAL, _state); ae_vector_init(&grad1, 0, DT_REAL, _state); ae_vector_init(&grad2, 0, DT_REAL, _state); ae_vector_init(&gradsp, 0, DT_REAL, _state); ae_vector_init(&x, 0, DT_REAL, _state); ae_vector_init(&y, 0, DT_REAL, _state); ae_vector_init(&x1, 0, DT_REAL, _state); ae_vector_init(&x2, 0, DT_REAL, _state); ae_vector_init(&y1, 0, DT_REAL, _state); ae_vector_init(&y2, 0, DT_REAL, _state); ae_vector_init(&idx, 0, DT_INT, _state); ae_vector_init(&referenceg, 0, DT_REAL, _state); a1 = (double)(0); a2 = (double)(0); if( nkind==2 ) { a1 = 1000*ae_randomreal(_state)-500; a2 = 2*ae_randomreal(_state)-1; } if( nkind==3 ) { a1 = 1000*ae_randomreal(_state)-500; a2 = a1+(2*ae_randominteger(2, _state)-1)*(0.1+0.9*ae_randomreal(_state)); } testmlpbaseunit_createnetwork(&network, nkind, a1, a2, nin, nhid1, nhid2, nout, _state); mlpproperties(&network, &n1, &n2, &wcount, _state); h = 0.0001; etol = 1.0E-2; escale = 1.0E-2; gscale = 1.0E-2; nonstricttolerance = 0.01; /* * Initialize */ ae_vector_set_length(&x, nin, _state); ae_vector_set_length(&x1, nin, _state); ae_vector_set_length(&x2, nin, _state); ae_vector_set_length(&y, nout, _state); ae_vector_set_length(&y1, nout, _state); ae_vector_set_length(&y2, nout, _state); ae_vector_set_length(&referenceg, wcount, _state); ae_vector_set_length(&grad1, wcount, _state); ae_vector_set_length(&grad2, wcount, _state); /* * Process */ for(pass=1; pass<=passcount; pass++) { /* * Randomize network, then re-randomaze weights manually. * * NOTE: weights magnitude is chosen to be small, about 0.1, * which allows us to avoid oversaturated network. * In 10% of cases we use zero weights. 
*/ mlprandomizefull(&network, _state); if( ae_fp_less_eq(ae_randomreal(_state),0.1) ) { for(i=0; i<=wcount-1; i++) { network.weights.ptr.p_double[i] = 0.0; } } else { for(i=0; i<=wcount-1; i++) { network.weights.ptr.p_double[i] = 0.2*ae_randomreal(_state)-0.1; } } /* * Test MLPError(), MLPErrorSparse(), MLPGrad() for single-element dataset: * * generate input X, output Y, combine them in dataset XY * * calculate "reference" error on dataset manually (call MLPProcess and evaluate sum-of-squared errors) * * calculate "reference" gradient by performing numerical differentiation of "reference" error * using 4-point differentiation formula * * test error/gradient returned by MLPGrad(), MLPError(), MLPErrorSparse() */ ae_matrix_set_length(&xy, 1, nin+nout, _state); sparsecreate(1, nin+nout, nin+nout, &sparsexy, _state); for(i=0; i<=nin-1; i++) { x.ptr.p_double[i] = 4*ae_randomreal(_state)-2; } ae_v_move(&xy.ptr.pp_double[0][0], 1, &x.ptr.p_double[0], 1, ae_v_len(0,nin-1)); for(i=0; i<=nin-1; i++) { sparseset(&sparsexy, 0, i, x.ptr.p_double[i], _state); } if( mlpissoftmax(&network, _state) ) { for(i=0; i<=nout-1; i++) { y.ptr.p_double[i] = (double)(0); } xy.ptr.pp_double[0][nin] = (double)(ae_randominteger(nout, _state)); sparseset(&sparsexy, 0, nin, xy.ptr.pp_double[0][nin], _state); y.ptr.p_double[ae_round(xy.ptr.pp_double[0][nin], _state)] = (double)(1); } else { for(i=0; i<=nout-1; i++) { y.ptr.p_double[i] = 4*ae_randomreal(_state)-2; sparseset(&sparsexy, 0, nin+i, y.ptr.p_double[i], _state); } ae_v_move(&xy.ptr.pp_double[0][nin], 1, &y.ptr.p_double[0], 1, ae_v_len(nin,nin+nout-1)); } sparseconverttocrs(&sparsexy, _state); mlpprocess(&network, &x, &y2, _state); ae_v_sub(&y2.ptr.p_double[0], 1, &y.ptr.p_double[0], 1, ae_v_len(0,nout-1)); referencee = ae_v_dotproduct(&y2.ptr.p_double[0], 1, &y2.ptr.p_double[0], 1, ae_v_len(0,nout-1)); referencee = referencee/2; for(i=0; i<=wcount-1; i++) { wprev = network.weights.ptr.p_double[i]; network.weights.ptr.p_double[i] = wprev-2*h; mlpprocess(&network, &x, &y1, _state); ae_v_sub(&y1.ptr.p_double[0], 1, &y.ptr.p_double[0], 1, ae_v_len(0,nout-1)); v1 = ae_v_dotproduct(&y1.ptr.p_double[0], 1, &y1.ptr.p_double[0], 1, ae_v_len(0,nout-1)); v1 = v1/2; network.weights.ptr.p_double[i] = wprev-h; mlpprocess(&network, &x, &y1, _state); ae_v_sub(&y1.ptr.p_double[0], 1, &y.ptr.p_double[0], 1, ae_v_len(0,nout-1)); v2 = ae_v_dotproduct(&y1.ptr.p_double[0], 1, &y1.ptr.p_double[0], 1, ae_v_len(0,nout-1)); v2 = v2/2; network.weights.ptr.p_double[i] = wprev+h; mlpprocess(&network, &x, &y1, _state); ae_v_sub(&y1.ptr.p_double[0], 1, &y.ptr.p_double[0], 1, ae_v_len(0,nout-1)); v3 = ae_v_dotproduct(&y1.ptr.p_double[0], 1, &y1.ptr.p_double[0], 1, ae_v_len(0,nout-1)); v3 = v3/2; network.weights.ptr.p_double[i] = wprev+2*h; mlpprocess(&network, &x, &y1, _state); ae_v_sub(&y1.ptr.p_double[0], 1, &y.ptr.p_double[0], 1, ae_v_len(0,nout-1)); v4 = ae_v_dotproduct(&y1.ptr.p_double[0], 1, &y1.ptr.p_double[0], 1, ae_v_len(0,nout-1)); v4 = v4/2; network.weights.ptr.p_double[i] = wprev; referenceg.ptr.p_double[i] = (v1-8*v2+8*v3-v4)/(12*h); } mlpgrad(&network, &x, &y, &e, &grad2, _state); seterrorflagdiff(err, e, referencee, etol, escale, _state); seterrorflagdiff(err, mlperror(&network, &xy, 1, _state), referencee, etol, escale, _state); seterrorflagdiff(err, mlperrorsparse(&network, &sparsexy, 1, _state), referencee, etol, escale, _state); seterrorflag(err, ae_fp_greater(testmlpbaseunit_vectordiff(&referenceg, &grad2, wcount, gscale, _state),etol), _state); /* * Test MLPErrorN(), 
MLPGradN() for single-element dataset: * * generate input X, output Y, combine them in dataset XY * * calculate "reference" error on dataset manually (call MLPProcess and evaluate sum-of-squared errors) * * calculate "reference" gradient by performing numerical differentiation of "reference" error * * test error/gradient returned by MLPGradN(), MLPErrorN() * * NOTE: because we use inexact 2-point formula, we perform gradient test with NonStrictTolerance */ ae_matrix_set_length(&xy, 1, nin+nout, _state); for(i=0; i<=nin-1; i++) { x.ptr.p_double[i] = 4*ae_randomreal(_state)-2; } ae_v_move(&xy.ptr.pp_double[0][0], 1, &x.ptr.p_double[0], 1, ae_v_len(0,nin-1)); if( mlpissoftmax(&network, _state) ) { for(i=0; i<=nout-1; i++) { y.ptr.p_double[i] = (double)(0); } xy.ptr.pp_double[0][nin] = (double)(ae_randominteger(nout, _state)); y.ptr.p_double[ae_round(xy.ptr.pp_double[0][nin], _state)] = (double)(1); } else { for(i=0; i<=nout-1; i++) { y.ptr.p_double[i] = 4*ae_randomreal(_state)-2; } ae_v_move(&xy.ptr.pp_double[0][nin], 1, &y.ptr.p_double[0], 1, ae_v_len(nin,nin+nout-1)); } mlpprocess(&network, &x, &y2, _state); referencee = (double)(0); if( nkind!=1 ) { for(i=0; i<=nout-1; i++) { referencee = referencee+0.5*ae_sqr(y2.ptr.p_double[i]-y.ptr.p_double[i], _state); } } else { for(i=0; i<=nout-1; i++) { if( ae_fp_neq(y.ptr.p_double[i],(double)(0)) ) { if( ae_fp_eq(y2.ptr.p_double[i],(double)(0)) ) { referencee = referencee+y.ptr.p_double[i]*ae_log(ae_maxrealnumber, _state); } else { referencee = referencee+y.ptr.p_double[i]*ae_log(y.ptr.p_double[i]/y2.ptr.p_double[i], _state); } } } } for(i=0; i<=wcount-1; i++) { wprev = network.weights.ptr.p_double[i]; network.weights.ptr.p_double[i] = wprev+h; mlpprocess(&network, &x, &y2, _state); network.weights.ptr.p_double[i] = wprev-h; mlpprocess(&network, &x, &y1, _state); network.weights.ptr.p_double[i] = wprev; v = (double)(0); if( nkind!=1 ) { for(j=0; j<=nout-1; j++) { v = v+0.5*(ae_sqr(y2.ptr.p_double[j]-y.ptr.p_double[j], _state)-ae_sqr(y1.ptr.p_double[j]-y.ptr.p_double[j], _state))/(2*h); } } else { for(j=0; j<=nout-1; j++) { if( ae_fp_neq(y.ptr.p_double[j],(double)(0)) ) { if( ae_fp_eq(y2.ptr.p_double[j],(double)(0)) ) { v = v+y.ptr.p_double[j]*ae_log(ae_maxrealnumber, _state); } else { v = v+y.ptr.p_double[j]*ae_log(y.ptr.p_double[j]/y2.ptr.p_double[j], _state); } if( ae_fp_eq(y1.ptr.p_double[j],(double)(0)) ) { v = v-y.ptr.p_double[j]*ae_log(ae_maxrealnumber, _state); } else { v = v-y.ptr.p_double[j]*ae_log(y.ptr.p_double[j]/y1.ptr.p_double[j], _state); } } } v = v/(2*h); } referenceg.ptr.p_double[i] = v; } mlpgradn(&network, &x, &y, &e, &grad2, _state); seterrorflagdiff(err, e, referencee, etol, escale, _state); seterrorflagdiff(err, mlperrorn(&network, &xy, 1, _state), referencee, etol, escale, _state); seterrorflag(err, ae_fp_greater(testmlpbaseunit_vectordiff(&referenceg, &grad2, wcount, gscale, _state),nonstricttolerance), _state); /* * Test that gradient calculation functions automatically allocate * space for gradient, if needed. * * NOTE: we perform test with empty dataset. 
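* Grad1 is deliberately truncated to length 1 before each call, so MLPGradBatch/MLPGradBatchSparse/MLPGradBatchSubset/MLPGradBatchSparseSubset must reallocate it; the checks below compare Grad1.Cnt with WCount.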
*/ sparsecreate(1, nin+nout, 0, &sparsexy, _state); sparseconverttocrs(&sparsexy, _state); ae_vector_set_length(&grad1, 1, _state); mlpgradbatch(&network, &xy, 0, &e1, &grad1, _state); seterrorflag(err, grad1.cnt!=wcount, _state); ae_vector_set_length(&grad1, 1, _state); mlpgradbatchsparse(&network, &sparsexy, 0, &e1, &grad1, _state); seterrorflag(err, grad1.cnt!=wcount, _state); ae_vector_set_length(&grad1, 1, _state); mlpgradbatchsubset(&network, &xy, 0, &idx, 0, &e1, &grad1, _state); seterrorflag(err, grad1.cnt!=wcount, _state); ae_vector_set_length(&grad1, 1, _state); mlpgradbatchsparsesubset(&network, &sparsexy, 0, &idx, 0, &e1, &grad1, _state); seterrorflag(err, grad1.cnt!=wcount, _state); /* * Test MLPError(), MLPErrorSparse(), MLPGradBatch(), MLPGradBatchSparse() for many-element dataset: * * generate random dataset XY * * calculate "reference" error/gradient using MLPGrad(), which was tested in previous * section and is assumed to work correctly * * test results returned by MLPGradBatch/MLPGradBatchSparse against reference ones * * NOTE: about 10% of tests are performed with zero SSize */ ssize = sizemin+ae_randominteger(sizemax-sizemin+1, _state); ae_matrix_set_length(&xy, ae_maxint(ssize, 1, _state), nin+nout, _state); sparsecreate(ae_maxint(ssize, 1, _state), nin+nout, ssize*(nin+nout), &sparsexy, _state); for(i=0; i<=wcount-1; i++) { referenceg.ptr.p_double[i] = (double)(0); } referencee = (double)(0); for(i=0; i<=ssize-1; i++) { for(j=0; j<=nin-1; j++) { x1.ptr.p_double[j] = 4*ae_randomreal(_state)-2; sparseset(&sparsexy, i, j, x1.ptr.p_double[j], _state); } ae_v_move(&xy.ptr.pp_double[i][0], 1, &x1.ptr.p_double[0], 1, ae_v_len(0,nin-1)); if( mlpissoftmax(&network, _state) ) { for(j=0; j<=nout-1; j++) { y1.ptr.p_double[j] = (double)(0); } xy.ptr.pp_double[i][nin] = (double)(ae_randominteger(nout, _state)); sparseset(&sparsexy, i, nin, xy.ptr.pp_double[i][nin], _state); y1.ptr.p_double[ae_round(xy.ptr.pp_double[i][nin], _state)] = (double)(1); } else { for(j=0; j<=nout-1; j++) { y1.ptr.p_double[j] = 4*ae_randomreal(_state)-2; sparseset(&sparsexy, i, nin+j, y1.ptr.p_double[j], _state); } ae_v_move(&xy.ptr.pp_double[i][nin], 1, &y1.ptr.p_double[0], 1, ae_v_len(nin,nin+nout-1)); } mlpgrad(&network, &x1, &y1, &v, &grad2, _state); referencee = referencee+v; ae_v_add(&referenceg.ptr.p_double[0], 1, &grad2.ptr.p_double[0], 1, ae_v_len(0,wcount-1)); } sparseconverttocrs(&sparsexy, _state); e2 = mlperror(&network, &xy, ssize, _state); seterrorflagdiff(err, e2, referencee, etol, escale, _state); e2 = mlperrorsparse(&network, &sparsexy, ssize, _state); seterrorflagdiff(err, e2, referencee, etol, escale, _state); mlpgradbatch(&network, &xy, ssize, &e2, &grad2, _state); seterrorflagdiff(err, e2, referencee, etol, escale, _state); seterrorflag(err, ae_fp_greater(testmlpbaseunit_vectordiff(&referenceg, &grad2, wcount, gscale, _state),etol), _state); mlpgradbatchsparse(&network, &sparsexy, ssize, &esp, &gradsp, _state); seterrorflagdiff(err, esp, referencee, etol, escale, _state); seterrorflag(err, ae_fp_greater(testmlpbaseunit_vectordiff(&referenceg, &gradsp, wcount, gscale, _state),etol), _state); /* * Test MLPErrorSubset(), MLPGradBatchSubset(), MLPErrorSparseSubset(), MLPGradBatchSparseSubset() * for many-element dataset with different types of subsets: * * generate random dataset XY * * "reference" error/gradient are calculated with MLPGradBatch(), * which was tested in previous section and is assumed to work correctly * * we perform tests for different subsets: * * SubsetSize<0 - subset 
is a full dataset * * SubsetSize=0 - subset is empty * * SubsetSize>0 - random subset */ ssize = sizemin+ae_randominteger(sizemax-sizemin+1, _state); ae_matrix_set_length(&xy, ae_maxint(ssize, 1, _state), nin+nout, _state); sparsecreate(ae_maxint(ssize, 1, _state), nin+nout, ssize*(nin+nout), &sparsexy, _state); for(i=0; i<=ssize-1; i++) { for(j=0; j<=nin-1; j++) { x1.ptr.p_double[j] = 4*ae_randomreal(_state)-2; sparseset(&sparsexy, i, j, x1.ptr.p_double[j], _state); } ae_v_move(&xy.ptr.pp_double[i][0], 1, &x1.ptr.p_double[0], 1, ae_v_len(0,nin-1)); if( mlpissoftmax(&network, _state) ) { for(j=0; j<=nout-1; j++) { y1.ptr.p_double[j] = (double)(0); } xy.ptr.pp_double[i][nin] = (double)(ae_randominteger(nout, _state)); sparseset(&sparsexy, i, nin, xy.ptr.pp_double[i][nin], _state); y1.ptr.p_double[ae_round(xy.ptr.pp_double[i][nin], _state)] = (double)(1); } else { for(j=0; j<=nout-1; j++) { y1.ptr.p_double[j] = 4*ae_randomreal(_state)-2; sparseset(&sparsexy, i, nin+j, y1.ptr.p_double[j], _state); } ae_v_move(&xy.ptr.pp_double[i][nin], 1, &y1.ptr.p_double[0], 1, ae_v_len(nin,nin+nout-1)); } } sparseconverttocrs(&sparsexy, _state); if( ssize>0 ) { subsetsize = 1+ae_randominteger(10, _state); ae_matrix_set_length(&xy2, subsetsize, nin+nout, _state); ae_vector_set_length(&idx, subsetsize, _state); sparsecreate(subsetsize, nin+nout, subsetsize*(nin+nout), &sparsexy2, _state); if( mlpissoftmax(&network, _state) ) { rowsize = nin+1; } else { rowsize = nin+nout; } for(i=0; i<=subsetsize-1; i++) { k = ae_randominteger(ssize, _state); idx.ptr.p_int[i] = k; for(j=0; j<=rowsize-1; j++) { xy2.ptr.pp_double[i][j] = xy.ptr.pp_double[k][j]; sparseset(&sparsexy2, i, j, sparseget(&sparsexy, k, j, _state), _state); } } sparseconverttocrs(&sparsexy2, _state); } else { subsetsize = 0; ae_matrix_set_length(&xy2, 0, 0, _state); ae_vector_set_length(&idx, 0, _state); sparsecreate(1, nin+nout, 0, &sparsexy2, _state); sparseconverttocrs(&sparsexy2, _state); } mlpgradbatch(&network, &xy, ssize, &referencee, &referenceg, _state); e2 = mlperrorsubset(&network, &xy, ssize, &idx, -1, _state); esp = mlperrorsparsesubset(&network, &sparsexy, ssize, &idx, -1, _state); seterrorflagdiff(err, e2, referencee, etol, escale, _state); seterrorflagdiff(err, esp, referencee, etol, escale, _state); mlpgradbatchsubset(&network, &xy, ssize, &idx, -1, &e2, &grad2, _state); mlpgradbatchsparsesubset(&network, &sparsexy, ssize, &idx, -1, &esp, &gradsp, _state); seterrorflagdiff(err, e2, referencee, etol, escale, _state); seterrorflagdiff(err, esp, referencee, etol, escale, _state); seterrorflag(err, ae_fp_greater(testmlpbaseunit_vectordiff(&referenceg, &grad2, wcount, gscale, _state),etol), _state); seterrorflag(err, ae_fp_greater(testmlpbaseunit_vectordiff(&referenceg, &gradsp, wcount, gscale, _state),etol), _state); mlpgradbatch(&network, &xy, 0, &referencee, &referenceg, _state); e2 = mlperrorsubset(&network, &xy, ssize, &idx, 0, _state); esp = mlperrorsparsesubset(&network, &sparsexy, ssize, &idx, 0, _state); seterrorflagdiff(err, e2, referencee, etol, escale, _state); seterrorflagdiff(err, esp, referencee, etol, escale, _state); mlpgradbatchsubset(&network, &xy, ssize, &idx, 0, &e2, &grad2, _state); mlpgradbatchsparsesubset(&network, &sparsexy, ssize, &idx, 0, &esp, &gradsp, _state); seterrorflagdiff(err, e2, referencee, etol, escale, _state); seterrorflagdiff(err, esp, referencee, etol, escale, _state); seterrorflag(err, ae_fp_greater(testmlpbaseunit_vectordiff(&referenceg, &grad2, wcount, gscale, _state),etol), _state); 
seterrorflag(err, ae_fp_greater(testmlpbaseunit_vectordiff(&referenceg, &gradsp, wcount, gscale, _state),etol), _state); mlpgradbatch(&network, &xy2, subsetsize, &referencee, &referenceg, _state); e2 = mlperrorsubset(&network, &xy, ssize, &idx, subsetsize, _state); esp = mlperrorsparsesubset(&network, &sparsexy, ssize, &idx, subsetsize, _state); seterrorflagdiff(err, e2, referencee, etol, escale, _state); seterrorflagdiff(err, esp, referencee, etol, escale, _state); mlpgradbatchsubset(&network, &xy, ssize, &idx, subsetsize, &e2, &grad2, _state); mlpgradbatchsparsesubset(&network, &sparsexy, ssize, &idx, subsetsize, &esp, &gradsp, _state); seterrorflagdiff(err, e2, referencee, etol, escale, _state); seterrorflagdiff(err, esp, referencee, etol, escale, _state); seterrorflag(err, ae_fp_greater(testmlpbaseunit_vectordiff(&referenceg, &grad2, wcount, gscale, _state),etol), _state); seterrorflag(err, ae_fp_greater(testmlpbaseunit_vectordiff(&referenceg, &gradsp, wcount, gscale, _state),etol), _state); /* * Test MLPGradNBatch() for many-element dataset: * * generate random dataset XY * * calculate "reference" error/gradient using MLPGrad(), which was tested in previous * section and is assumed to work correctly * * test results returned by MLPGradNBatch against reference ones */ ssize = sizemin+ae_randominteger(sizemax-sizemin+1, _state); ae_matrix_set_length(&xy, ssize, nin+nout, _state); for(i=0; i<=wcount-1; i++) { referenceg.ptr.p_double[i] = (double)(0); } referencee = (double)(0); for(i=0; i<=ssize-1; i++) { for(j=0; j<=nin-1; j++) { x1.ptr.p_double[j] = 4*ae_randomreal(_state)-2; } ae_v_move(&xy.ptr.pp_double[i][0], 1, &x1.ptr.p_double[0], 1, ae_v_len(0,nin-1)); if( mlpissoftmax(&network, _state) ) { for(j=0; j<=nout-1; j++) { y1.ptr.p_double[j] = (double)(0); } xy.ptr.pp_double[i][nin] = (double)(ae_randominteger(nout, _state)); y1.ptr.p_double[ae_round(xy.ptr.pp_double[i][nin], _state)] = (double)(1); } else { for(j=0; j<=nout-1; j++) { y1.ptr.p_double[j] = 4*ae_randomreal(_state)-2; } ae_v_move(&xy.ptr.pp_double[i][nin], 1, &y1.ptr.p_double[0], 1, ae_v_len(nin,nin+nout-1)); } mlpgradn(&network, &x1, &y1, &v, &grad2, _state); referencee = referencee+v; ae_v_add(&referenceg.ptr.p_double[0], 1, &grad2.ptr.p_double[0], 1, ae_v_len(0,wcount-1)); } mlpgradnbatch(&network, &xy, ssize, &e2, &grad2, _state); seterrorflagdiff(err, e2, referencee, etol, escale, _state); seterrorflag(err, ae_fp_greater(testmlpbaseunit_vectordiff(&referenceg, &grad2, wcount, gscale, _state),etol), _state); } ae_frame_leave(_state); } /************************************************************************* Hessian functions test *************************************************************************/ static void testmlpbaseunit_testhessian(ae_int_t nkind, ae_int_t nin, ae_int_t nhid1, ae_int_t nhid2, ae_int_t nout, ae_int_t passcount, ae_bool* err, ae_state *_state) { ae_frame _frame_block; multilayerperceptron network; ae_int_t hkind; ae_int_t n1; ae_int_t n2; ae_int_t wcount; double h; double etol; ae_int_t pass; ae_int_t i; ae_int_t j; ae_int_t ssize; double a1; double a2; ae_matrix xy; ae_matrix h1; ae_matrix h2; ae_vector grad1; ae_vector grad2; ae_vector grad3; ae_vector x; ae_vector y; ae_vector x1; ae_vector x2; ae_vector y1; ae_vector y2; double v; double e1; double e2; double wprev; ae_frame_make(_state, &_frame_block); _multilayerperceptron_init(&network, _state); ae_matrix_init(&xy, 0, 0, DT_REAL, _state); ae_matrix_init(&h1, 0, 0, DT_REAL, _state); ae_matrix_init(&h2, 0, 0, DT_REAL, _state); 
ae_vector_init(&grad1, 0, DT_REAL, _state); ae_vector_init(&grad2, 0, DT_REAL, _state); ae_vector_init(&grad3, 0, DT_REAL, _state); ae_vector_init(&x, 0, DT_REAL, _state); ae_vector_init(&y, 0, DT_REAL, _state); ae_vector_init(&x1, 0, DT_REAL, _state); ae_vector_init(&x2, 0, DT_REAL, _state); ae_vector_init(&y1, 0, DT_REAL, _state); ae_vector_init(&y2, 0, DT_REAL, _state); ae_assert(passcount>=2, "PassCount<2!", _state); a1 = (double)(0); a2 = (double)(0); if( nkind==2 ) { a1 = 1000*ae_randomreal(_state)-500; a2 = 2*ae_randomreal(_state)-1; } if( nkind==3 ) { a1 = 1000*ae_randomreal(_state)-500; a2 = a1+(2*ae_randominteger(2, _state)-1)*(0.1+0.9*ae_randomreal(_state)); } testmlpbaseunit_createnetwork(&network, nkind, a1, a2, nin, nhid1, nhid2, nout, _state); mlpproperties(&network, &n1, &n2, &wcount, _state); h = 0.0001; etol = 0.05; /* * Initialize */ ae_vector_set_length(&x, nin-1+1, _state); ae_vector_set_length(&x1, nin-1+1, _state); ae_vector_set_length(&x2, nin-1+1, _state); ae_vector_set_length(&y, nout-1+1, _state); ae_vector_set_length(&y1, nout-1+1, _state); ae_vector_set_length(&y2, nout-1+1, _state); ae_vector_set_length(&grad1, wcount-1+1, _state); ae_vector_set_length(&grad2, wcount-1+1, _state); ae_vector_set_length(&grad3, wcount-1+1, _state); ae_matrix_set_length(&h1, wcount-1+1, wcount-1+1, _state); ae_matrix_set_length(&h2, wcount-1+1, wcount-1+1, _state); /* * Process */ for(pass=1; pass<=passcount; pass++) { mlprandomizefull(&network, _state); /* * Test hessian calculation . * E1 contains total error (calculated using MLPGrad/MLPGradN) * Grad1 contains total gradient (calculated using MLPGrad/MLPGradN) * H1 contains Hessian calculated using differences of gradients * * E2, Grad2 and H2 contains corresponing values calculated using MLPHessianBatch/MLPHessianNBatch */ for(hkind=0; hkind<=1; hkind++) { ssize = 1+ae_randominteger(10, _state); ae_matrix_set_length(&xy, ssize-1+1, nin+nout-1+1, _state); for(i=0; i<=wcount-1; i++) { grad1.ptr.p_double[i] = (double)(0); } for(i=0; i<=wcount-1; i++) { for(j=0; j<=wcount-1; j++) { h1.ptr.pp_double[i][j] = (double)(0); } } e1 = (double)(0); for(i=0; i<=ssize-1; i++) { /* * X, Y */ for(j=0; j<=nin-1; j++) { x1.ptr.p_double[j] = 4*ae_randomreal(_state)-2; } ae_v_move(&xy.ptr.pp_double[i][0], 1, &x1.ptr.p_double[0], 1, ae_v_len(0,nin-1)); if( mlpissoftmax(&network, _state) ) { for(j=0; j<=nout-1; j++) { y1.ptr.p_double[j] = (double)(0); } xy.ptr.pp_double[i][nin] = (double)(ae_randominteger(nout, _state)); y1.ptr.p_double[ae_round(xy.ptr.pp_double[i][nin], _state)] = (double)(1); } else { for(j=0; j<=nout-1; j++) { y1.ptr.p_double[j] = 4*ae_randomreal(_state)-2; } ae_v_move(&xy.ptr.pp_double[i][nin], 1, &y1.ptr.p_double[0], 1, ae_v_len(nin,nin+nout-1)); } /* * E1, Grad1 */ if( hkind==0 ) { mlpgrad(&network, &x1, &y1, &v, &grad2, _state); } else { mlpgradn(&network, &x1, &y1, &v, &grad2, _state); } e1 = e1+v; ae_v_add(&grad1.ptr.p_double[0], 1, &grad2.ptr.p_double[0], 1, ae_v_len(0,wcount-1)); /* * H1 */ for(j=0; j<=wcount-1; j++) { wprev = network.weights.ptr.p_double[j]; network.weights.ptr.p_double[j] = wprev-2*h; if( hkind==0 ) { mlpgrad(&network, &x1, &y1, &v, &grad2, _state); } else { mlpgradn(&network, &x1, &y1, &v, &grad2, _state); } network.weights.ptr.p_double[j] = wprev-h; if( hkind==0 ) { mlpgrad(&network, &x1, &y1, &v, &grad3, _state); } else { mlpgradn(&network, &x1, &y1, &v, &grad3, _state); } ae_v_subd(&grad2.ptr.p_double[0], 1, &grad3.ptr.p_double[0], 1, ae_v_len(0,wcount-1), 8); network.weights.ptr.p_double[j] = 
wprev+h; if( hkind==0 ) { mlpgrad(&network, &x1, &y1, &v, &grad3, _state); } else { mlpgradn(&network, &x1, &y1, &v, &grad3, _state); } ae_v_addd(&grad2.ptr.p_double[0], 1, &grad3.ptr.p_double[0], 1, ae_v_len(0,wcount-1), 8); network.weights.ptr.p_double[j] = wprev+2*h; if( hkind==0 ) { mlpgrad(&network, &x1, &y1, &v, &grad3, _state); } else { mlpgradn(&network, &x1, &y1, &v, &grad3, _state); } ae_v_sub(&grad2.ptr.p_double[0], 1, &grad3.ptr.p_double[0], 1, ae_v_len(0,wcount-1)); v = 1/(12*h); ae_v_addd(&h1.ptr.pp_double[j][0], 1, &grad2.ptr.p_double[0], 1, ae_v_len(0,wcount-1), v); network.weights.ptr.p_double[j] = wprev; } } if( hkind==0 ) { mlphessianbatch(&network, &xy, ssize, &e2, &grad2, &h2, _state); } else { mlphessiannbatch(&network, &xy, ssize, &e2, &grad2, &h2, _state); } *err = *err||ae_fp_greater(ae_fabs(e1-e2, _state)/e1,etol); for(i=0; i<=wcount-1; i++) { if( ae_fp_greater(ae_fabs(grad1.ptr.p_double[i], _state),1.0E-2) ) { *err = *err||ae_fp_greater(ae_fabs((grad2.ptr.p_double[i]-grad1.ptr.p_double[i])/grad1.ptr.p_double[i], _state),etol); } else { *err = *err||ae_fp_greater(ae_fabs(grad2.ptr.p_double[i]-grad1.ptr.p_double[i], _state),etol); } } for(i=0; i<=wcount-1; i++) { for(j=0; j<=wcount-1; j++) { if( ae_fp_greater(ae_fabs(h1.ptr.pp_double[i][j], _state),5.0E-2) ) { *err = *err||ae_fp_greater(ae_fabs((h1.ptr.pp_double[i][j]-h2.ptr.pp_double[i][j])/h1.ptr.pp_double[i][j], _state),etol); } else { *err = *err||ae_fp_greater(ae_fabs(h2.ptr.pp_double[i][j]-h1.ptr.pp_double[i][j], _state),etol); } } } } } ae_frame_leave(_state); } /************************************************************************* Error functions (other than MLPError and MLPErrorN) test. Network of type NKind is created, with NIn inputs, NHid1*NHid2 hidden layers (one layer if NHid2=0), NOut outputs. PassCount random passes is performed. Dataset has random size in [SizeMin,SizeMax]. 
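Reference values of RMS, average, average relative, cross-entropy and classification errors are computed by processing the dataset row by row with MLPProcess, and then compared with the values reported by MLPRMSError/MLPAvgCE/MLPAvgError/MLPAvgRelError (dense and sparse variants) and by MLPAllErrorsSubset/MLPAllErrorsSparseSubset.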
*************************************************************************/ static void testmlpbaseunit_testerr(ae_int_t nkind, ae_int_t nin, ae_int_t nhid1, ae_int_t nhid2, ae_int_t nout, ae_int_t passcount, ae_int_t sizemin, ae_int_t sizemax, ae_bool* err, ae_state *_state) { ae_frame _frame_block; multilayerperceptron network; sparsematrix sparsexy; ae_int_t n1; ae_int_t n2; ae_int_t wcount; double etol; double escale; double a1; double a2; ae_int_t pass; ae_int_t i; ae_int_t j; ae_int_t k; ae_int_t ssize; ae_int_t subsetsize; ae_matrix xy; ae_vector y; ae_vector x1; ae_vector y1; ae_vector idx; ae_vector dummy; double refrmserror; double refclserror; double refrelclserror; double refavgce; double refavgerror; double refavgrelerror; ae_int_t avgrelcnt; modelerrors allerrors; ae_int_t nnmax; ae_int_t dsmax; double relclstolerance; ae_frame_make(_state, &_frame_block); _multilayerperceptron_init(&network, _state); _sparsematrix_init(&sparsexy, _state); ae_matrix_init(&xy, 0, 0, DT_REAL, _state); ae_vector_init(&y, 0, DT_REAL, _state); ae_vector_init(&x1, 0, DT_REAL, _state); ae_vector_init(&y1, 0, DT_REAL, _state); ae_vector_init(&idx, 0, DT_INT, _state); ae_vector_init(&dummy, 0, DT_INT, _state); _modelerrors_init(&allerrors, _state); a1 = (double)(0); a2 = (double)(0); if( nkind==2 ) { a1 = 1000*ae_randomreal(_state)-500; a2 = 2*ae_randomreal(_state)-1; } if( nkind==3 ) { a1 = 1000*ae_randomreal(_state)-500; a2 = a1+(2*ae_randominteger(2, _state)-1)*(0.1+0.9*ae_randomreal(_state)); } testmlpbaseunit_createnetwork(&network, nkind, a1, a2, nin, nhid1, nhid2, nout, _state); mlpproperties(&network, &n1, &n2, &wcount, _state); etol = 1.0E-4; escale = 1.0E-2; /* * Initialize */ ae_vector_set_length(&x1, nin, _state); ae_vector_set_length(&y, nout, _state); ae_vector_set_length(&y1, nout, _state); /* * Process */ for(pass=1; pass<=passcount; pass++) { /* * Randomize network, then re-randomaze weights manually. * * NOTE: weights magnitude is chosen to be small, about 0.1, * which allows us to avoid oversaturated network. * In 10% of cases we use zero weights. */ mlprandomizefull(&network, _state); if( ae_fp_less_eq(ae_randomreal(_state),0.1) ) { for(i=0; i<=wcount-1; i++) { network.weights.ptr.p_double[i] = 0.0; } } else { for(i=0; i<=wcount-1; i++) { network.weights.ptr.p_double[i] = 0.2*ae_randomreal(_state)-0.1; } } /* * Generate random dataset. * Calculate reference errors. 
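* (reference RMS error = sqrt(sum of squared residuals/(SSize*NOut)), average error = sum of |residuals|/(SSize*NOut), cross-entropy = sum of ln(1/p_correct)/(SSize*ln(2)), classification error = number of rows where the predicted class differs from the dataset class)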
* * NOTE: about 10% of tests are performed with zero SSize */ ssize = sizemin+ae_randominteger(sizemax-sizemin+1, _state); if( mlpissoftmax(&network, _state) ) { ae_matrix_set_length(&xy, ae_maxint(ssize, 1, _state), nin+1, _state); sparsecreate(ae_maxint(ssize, 1, _state), nin+1, 0, &sparsexy, _state); } else { ae_matrix_set_length(&xy, ae_maxint(ssize, 1, _state), nin+nout, _state); sparsecreate(ae_maxint(ssize, 1, _state), nin+nout, 0, &sparsexy, _state); } refrmserror = 0.0; refclserror = 0.0; refavgce = 0.0; refavgerror = 0.0; refavgrelerror = 0.0; avgrelcnt = 0; for(i=0; i<=ssize-1; i++) { /* * Fill I-th row */ for(j=0; j<=nin-1; j++) { x1.ptr.p_double[j] = 4*ae_randomreal(_state)-2; sparseset(&sparsexy, i, j, x1.ptr.p_double[j], _state); } ae_v_move(&xy.ptr.pp_double[i][0], 1, &x1.ptr.p_double[0], 1, ae_v_len(0,nin-1)); if( mlpissoftmax(&network, _state) ) { for(j=0; j<=nout-1; j++) { y1.ptr.p_double[j] = (double)(0); } xy.ptr.pp_double[i][nin] = (double)(ae_randominteger(nout, _state)); sparseset(&sparsexy, i, nin, xy.ptr.pp_double[i][nin], _state); y1.ptr.p_double[ae_round(xy.ptr.pp_double[i][nin], _state)] = (double)(1); } else { for(j=0; j<=nout-1; j++) { y1.ptr.p_double[j] = 4*ae_randomreal(_state)-2; sparseset(&sparsexy, i, nin+j, y1.ptr.p_double[j], _state); } ae_v_move(&xy.ptr.pp_double[i][nin], 1, &y1.ptr.p_double[0], 1, ae_v_len(nin,nin+nout-1)); } /* * Process */ mlpprocess(&network, &x1, &y, _state); /* * Update reference errors */ nnmax = 0; if( mlpissoftmax(&network, _state) ) { if( ae_fp_greater(y.ptr.p_double[ae_round(xy.ptr.pp_double[i][nin], _state)],(double)(0)) ) { refavgce = refavgce+ae_log(1/y.ptr.p_double[ae_round(xy.ptr.pp_double[i][nin], _state)], _state); } else { refavgce = refavgce+ae_log(ae_maxrealnumber, _state); } } if( mlpissoftmax(&network, _state) ) { dsmax = ae_round(xy.ptr.pp_double[i][nin], _state); } else { dsmax = 0; } for(j=0; j<=nout-1; j++) { refrmserror = refrmserror+ae_sqr(y.ptr.p_double[j]-y1.ptr.p_double[j], _state); refavgerror = refavgerror+ae_fabs(y.ptr.p_double[j]-y1.ptr.p_double[j], _state); if( ae_fp_neq(y1.ptr.p_double[j],(double)(0)) ) { refavgrelerror = refavgrelerror+ae_fabs(y.ptr.p_double[j]-y1.ptr.p_double[j], _state)/ae_fabs(y1.ptr.p_double[j], _state); avgrelcnt = avgrelcnt+1; } if( ae_fp_greater(y.ptr.p_double[j],y.ptr.p_double[nnmax]) ) { nnmax = j; } if( !mlpissoftmax(&network, _state)&&ae_fp_greater(y1.ptr.p_double[j],y1.ptr.p_double[dsmax]) ) { dsmax = j; } } if( nnmax!=dsmax ) { refclserror = refclserror+1; } } sparseconverttocrs(&sparsexy, _state); if( ssize>0 ) { refrmserror = ae_sqrt(refrmserror/(ssize*nout), _state); refavgerror = refavgerror/(ssize*nout); refrelclserror = refclserror/ssize; refavgce = refavgce/(ssize*ae_log((double)(2), _state)); } else { refrelclserror = 0.0; } if( avgrelcnt>0 ) { refavgrelerror = refavgrelerror/avgrelcnt; } /* * Test "continuous" errors on full dataset */ seterrorflagdiff(err, mlprmserror(&network, &xy, ssize, _state), refrmserror, etol, escale, _state); seterrorflagdiff(err, mlpavgce(&network, &xy, ssize, _state), refavgce, etol, escale, _state); seterrorflagdiff(err, mlpavgerror(&network, &xy, ssize, _state), refavgerror, etol, escale, _state); seterrorflagdiff(err, mlpavgrelerror(&network, &xy, ssize, _state), refavgrelerror, etol, escale, _state); seterrorflagdiff(err, mlprmserrorsparse(&network, &sparsexy, ssize, _state), refrmserror, etol, escale, _state); seterrorflagdiff(err, mlpavgcesparse(&network, &sparsexy, ssize, _state), refavgce, etol, escale, _state); 
seterrorflagdiff(err, mlpavgerrorsparse(&network, &sparsexy, ssize, _state), refavgerror, etol, escale, _state); seterrorflagdiff(err, mlpavgrelerrorsparse(&network, &sparsexy, ssize, _state), refavgrelerror, etol, escale, _state); mlpallerrorssubset(&network, &xy, ssize, &dummy, -1, &allerrors, _state); seterrorflagdiff(err, allerrors.avgce, refavgce, etol, escale, _state); seterrorflagdiff(err, allerrors.rmserror, refrmserror, etol, escale, _state); seterrorflagdiff(err, allerrors.avgerror, refavgerror, etol, escale, _state); seterrorflagdiff(err, allerrors.avgrelerror, refavgrelerror, etol, escale, _state); mlpallerrorssparsesubset(&network, &sparsexy, ssize, &dummy, -1, &allerrors, _state); seterrorflagdiff(err, allerrors.avgce, refavgce, etol, escale, _state); seterrorflagdiff(err, allerrors.rmserror, refrmserror, etol, escale, _state); seterrorflagdiff(err, allerrors.avgerror, refavgerror, etol, escale, _state); seterrorflagdiff(err, allerrors.avgrelerror, refavgrelerror, etol, escale, _state); /* * Test errors on dataset given by subset. * We perform only limited test for RMS error, assuming that either all errors * are calculated correctly (subject to subset given by Idx) - or none of them. */ if( ssize>0 ) { subsetsize = ae_randominteger(10, _state); } else { subsetsize = 0; } ae_vector_set_length(&idx, subsetsize, _state); refrmserror = 0.0; for(i=0; i<=subsetsize-1; i++) { k = ae_randominteger(ssize, _state); idx.ptr.p_int[i] = k; ae_v_move(&x1.ptr.p_double[0], 1, &xy.ptr.pp_double[k][0], 1, ae_v_len(0,nin-1)); if( mlpissoftmax(&network, _state) ) { for(j=0; j<=nout-1; j++) { y1.ptr.p_double[j] = (double)(0); } y1.ptr.p_double[ae_round(xy.ptr.pp_double[k][nin], _state)] = (double)(1); } else { for(j=0; j<=nout-1; j++) { y1.ptr.p_double[j] = xy.ptr.pp_double[k][nin+j]; } } mlpprocess(&network, &x1, &y, _state); for(j=0; j<=nout-1; j++) { refrmserror = refrmserror+ae_sqr(y.ptr.p_double[j]-y1.ptr.p_double[j], _state); } } if( subsetsize>0 ) { refrmserror = ae_sqrt(refrmserror/(subsetsize*nout), _state); } mlpallerrorssubset(&network, &xy, ssize, &idx, subsetsize, &allerrors, _state); seterrorflagdiff(err, allerrors.rmserror, refrmserror, etol, escale, _state); mlpallerrorssparsesubset(&network, &sparsexy, ssize, &idx, subsetsize, &allerrors, _state); seterrorflagdiff(err, allerrors.rmserror, refrmserror, etol, escale, _state); /* * Test "discontinuous" error function. * Even slight changes in the network output may force these functions * to change by 1. So, we test them with relaxed criteria, corresponding to * difference in classification of two samples. 
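* (RelClsTolerance=2.5/SSize below corresponds to a disagreement of roughly two samples out of SSize)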
*/ if( ssize>0 ) { relclstolerance = 2.5/ssize; seterrorflag(err, ae_fp_greater(ae_fabs(mlpclserror(&network, &xy, ssize, _state)-refclserror, _state),ssize*relclstolerance), _state); seterrorflag(err, ae_fp_greater(ae_fabs(mlprelclserror(&network, &xy, ssize, _state)-refrelclserror, _state),relclstolerance), _state); seterrorflag(err, ae_fp_greater(ae_fabs(mlprelclserrorsparse(&network, &sparsexy, ssize, _state)-refrelclserror, _state),relclstolerance), _state); } } ae_frame_leave(_state); } /************************************************************************* Special tests *************************************************************************/ static void testmlpbaseunit_spectests(ae_bool* inferrors, ae_bool* procerrors, ae_bool* graderrors, ae_bool* hesserrors, ae_bool* errerrors, ae_state *_state) { ae_frame _frame_block; multilayerperceptron net; ae_matrix xy; double f; ae_vector g; ae_int_t i; ae_frame_make(_state, &_frame_block); _multilayerperceptron_init(&net, _state); ae_matrix_init(&xy, 0, 0, DT_REAL, _state); ae_vector_init(&g, 0, DT_REAL, _state); /* * Special test for overflow in TanH: * * create 1x1x1 linear network * * create dataset with 1 item: [x, y] = [0, 1] * * set network weights to [10000000, 10000000, 10000000, 10000000] * * check that error function is finite * * check that gradient is finite */ mlpcreate1(1, 1, 1, &net, _state); ae_matrix_set_length(&xy, 1, 2, _state); xy.ptr.pp_double[0][0] = (double)(0); xy.ptr.pp_double[0][1] = 1.0; for(i=0; i<=mlpgetweightscount(&net, _state)-1; i++) { net.weights.ptr.p_double[i] = 10000000.0; } mlpgradbatch(&net, &xy, 1, &f, &g, _state); seterrorflag(graderrors, !ae_isfinite(f, _state), _state); seterrorflag(graderrors, !ae_isfinite(mlperror(&net, &xy, 1, _state), _state), _state); for(i=0; i<=mlpgetweightscount(&net, _state)-1; i++) { seterrorflag(graderrors, !ae_isfinite(g.ptr.p_double[i], _state), _state); } /* * Special test for overflow in SOFTMAX layer: * * create 1x1x2 classifier network * * create dataset with 1 item: [x, y] = [0, 1] * * set network weights to [10000000, 10000000, 10000000, 10000000] * * check that error function is finite * * check that gradient is finite */ mlpcreatec1(1, 1, 2, &net, _state); ae_matrix_set_length(&xy, 1, 2, _state); xy.ptr.pp_double[0][0] = (double)(0); xy.ptr.pp_double[0][1] = (double)(1); for(i=0; i<=mlpgetweightscount(&net, _state)-1; i++) { net.weights.ptr.p_double[i] = 10000000.0; } mlpgradbatch(&net, &xy, 1, &f, &g, _state); seterrorflag(graderrors, !ae_isfinite(f, _state), _state); seterrorflag(graderrors, !ae_isfinite(mlperror(&net, &xy, 1, _state), _state), _state); for(i=0; i<=mlpgetweightscount(&net, _state)-1; i++) { seterrorflag(graderrors, !ae_isfinite(g.ptr.p_double[i], _state), _state); } ae_frame_leave(_state); } /************************************************************************* The function test functions MLPGradBatchMasked and MLPGradBatchSparseMasked. 
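For dataset sizes 0..96 and randomly generated networks the subset-based gradients (dense and sparse) are compared against gradients computed over an explicitly extracted copy of the selected rows; any difference above 1.0E-6 in error or gradient components makes the function return True.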
*************************************************************************/ static ae_bool testmlpbaseunit_testmlpgbsubset(ae_state *_state) { ae_frame _frame_block; multilayerperceptron net; ae_matrix a; ae_matrix parta; sparsematrix sa; sparsematrix partsa; ae_vector idx; double e1; double e2; ae_vector grad1; ae_vector grad2; ae_int_t nin; ae_int_t nout; ae_int_t w; ae_int_t wcount; ae_int_t nhid1; ae_int_t nhid2; ae_int_t nkind; double a1; double a2; ae_int_t n1; ae_int_t n2; ae_int_t ssize; ae_int_t maxssize; ae_int_t sbsize; ae_int_t nvar; ae_int_t variant; ae_int_t i; ae_int_t j; ae_bool result; ae_frame_make(_state, &_frame_block); _multilayerperceptron_init(&net, _state); ae_matrix_init(&a, 0, 0, DT_REAL, _state); ae_matrix_init(&parta, 0, 0, DT_REAL, _state); _sparsematrix_init(&sa, _state); _sparsematrix_init(&partsa, _state); ae_vector_init(&idx, 0, DT_INT, _state); ae_vector_init(&grad1, 0, DT_REAL, _state); ae_vector_init(&grad2, 0, DT_REAL, _state); /* * Variant: * * 1 - there are all rows; * * 2 - there are no one rows; * * 3 - there are some random rows. */ nvar = 3; maxssize = 96; for(ssize=0; ssize<=maxssize; ssize++) { ae_vector_set_length(&idx, ssize, _state); nkind = ae_randominteger(4, _state); a1 = (double)(0); a2 = (double)(0); if( nkind==2 ) { a1 = 1000*ae_randomreal(_state)-500; a2 = 2*ae_randomreal(_state)-1; } if( nkind==3 ) { a1 = 1000*ae_randomreal(_state)-500; a2 = a1+(2*ae_randominteger(2, _state)-1)*(0.1+0.9*ae_randomreal(_state)); } nin = ae_randominteger(20, _state)+1; nhid1 = ae_randominteger(5, _state); if( nhid1==0 ) { nhid2 = 0; } else { nhid2 = ae_randominteger(5, _state); } nout = ae_randominteger(20, _state)+2; testmlpbaseunit_createnetwork(&net, nkind, a1, a2, nin, nhid1, nhid2, nout, _state); mlpproperties(&net, &n1, &n2, &wcount, _state); if( mlpissoftmax(&net, _state) ) { w = nin+1; if( ssize>0 ) { ae_matrix_set_length(&a, ssize, w, _state); sparsecreate(ssize, w, ssize*w, &sa, _state); } else { ae_matrix_set_length(&a, 0, 0, _state); sparsecreate(1, 1, 0, &sa, _state); } for(i=0; i<=ssize-1; i++) { for(j=0; j<=w-1; j++) { a.ptr.pp_double[i][j] = 2*ae_randomreal(_state)-1; sparseset(&sa, i, j, a.ptr.pp_double[i][j], _state); } } for(i=0; i<=ssize-1; i++) { a.ptr.pp_double[i][nin] = (double)(ae_randominteger(nout, _state)); sparseset(&sa, i, nin, a.ptr.pp_double[i][nin], _state); } } else { w = nin+nout; if( ssize>0 ) { ae_matrix_set_length(&a, ssize, w, _state); sparsecreate(ssize, w, ssize*w, &sa, _state); } else { ae_matrix_set_length(&a, 0, 0, _state); sparsecreate(1, 1, 0, &sa, _state); } for(i=0; i<=ssize-1; i++) { for(j=0; j<=w-1; j++) { a.ptr.pp_double[i][j] = 2*ae_randomreal(_state)-1; sparseset(&sa, i, j, a.ptr.pp_double[i][j], _state); } } } sparseconverttocrs(&sa, _state); for(variant=1; variant<=nvar; variant++) { sbsize = -1; if( variant==1 ) { sbsize = ssize; for(i=0; i<=sbsize-1; i++) { idx.ptr.p_int[i] = i; } } if( variant==2 ) { sbsize = 0; } if( variant==3 ) { if( ssize==0 ) { sbsize = 0; } else { sbsize = ae_randominteger(ssize, _state); } for(i=0; i<=sbsize-1; i++) { idx.ptr.p_int[i] = ae_randominteger(ssize, _state); } } ae_assert(sbsize>=0, "mlpbase test: integrity check failed", _state); if( sbsize!=0 ) { ae_matrix_set_length(&parta, sbsize, w, _state); sparsecreate(sbsize, w, sbsize*w, &partsa, _state); } else { ae_matrix_set_length(&parta, 0, 0, _state); sparsecreate(1, 1, 0, &partsa, _state); } for(i=0; i<=sbsize-1; i++) { ae_v_move(&parta.ptr.pp_double[i][0], 1, &a.ptr.pp_double[idx.ptr.p_int[i]][0], 1, 
ae_v_len(0,w-1)); for(j=0; j<=w-1; j++) { sparseset(&partsa, i, j, parta.ptr.pp_double[i][j], _state); } } sparseconverttocrs(&partsa, _state); mlpgradbatch(&net, &parta, sbsize, &e1, &grad1, _state); mlpgradbatchsubset(&net, &a, ssize, &idx, sbsize, &e2, &grad2, _state); /* * Test for dense matrix */ if( ae_fp_greater(ae_fabs(e1-e2, _state),1.0E-6) ) { result = ae_true; ae_frame_leave(_state); return result; } for(i=0; i<=wcount-1; i++) { if( ae_fp_greater(ae_fabs(grad1.ptr.p_double[i]-grad2.ptr.p_double[i], _state),1.0E-6) ) { result = ae_true; ae_frame_leave(_state); return result; } } /* * Test for sparse matrix */ mlpgradbatchsparse(&net, &partsa, sbsize, &e1, &grad1, _state); mlpgradbatchsparsesubset(&net, &sa, ssize, &idx, sbsize, &e2, &grad2, _state); if( ae_fp_greater(ae_fabs(e1-e2, _state),1.0E-6) ) { result = ae_true; ae_frame_leave(_state); return result; } for(i=0; i<=wcount-1; i++) { if( ae_fp_greater(ae_fabs(grad1.ptr.p_double[i]-grad2.ptr.p_double[i], _state),1.0E-6) ) { result = ae_true; ae_frame_leave(_state); return result; } } } } result = ae_false; ae_frame_leave(_state); return result; } static void testldaunit_gensimpleset(ae_int_t nfeatures, ae_int_t nclasses, ae_int_t nsamples, ae_int_t axis, /* Real */ ae_matrix* xy, ae_state *_state); static void testldaunit_gendeg1set(ae_int_t nfeatures, ae_int_t nclasses, ae_int_t nsamples, ae_int_t axis, /* Real */ ae_matrix* xy, ae_state *_state); static double testldaunit_generatenormal(double mean, double sigma, ae_state *_state); static ae_bool testldaunit_testwn(/* Real */ ae_matrix* xy, /* Real */ ae_matrix* wn, ae_int_t ns, ae_int_t nf, ae_int_t nc, ae_int_t ndeg, ae_state *_state); static double testldaunit_calcj(ae_int_t nf, /* Real */ ae_matrix* st, /* Real */ ae_matrix* sw, /* Real */ ae_vector* w, double* p, double* q, ae_state *_state); static void testldaunit_fishers(/* Real */ ae_matrix* xy, ae_int_t npoints, ae_int_t nfeatures, ae_int_t nclasses, /* Real */ ae_matrix* st, /* Real */ ae_matrix* sw, ae_state *_state); ae_bool testlda(ae_bool silent, ae_state *_state) { ae_frame _frame_block; ae_int_t maxnf; ae_int_t maxns; ae_int_t maxnc; ae_int_t passcount; ae_bool ldanerrors; ae_bool lda1errors; ae_bool waserrors; ae_int_t nf; ae_int_t nc; ae_int_t ns; ae_int_t i; ae_int_t info; ae_int_t pass; ae_int_t axis; ae_matrix xy; ae_matrix wn; ae_vector w1; ae_bool result; ae_frame_make(_state, &_frame_block); ae_matrix_init(&xy, 0, 0, DT_REAL, _state); ae_matrix_init(&wn, 0, 0, DT_REAL, _state); ae_vector_init(&w1, 0, DT_REAL, _state); /* * Primary settings */ maxnf = 10; maxns = 1000; maxnc = 5; passcount = 1; waserrors = ae_false; ldanerrors = ae_false; lda1errors = ae_false; /* * General tests */ for(nf=1; nf<=maxnf; nf++) { for(nc=2; nc<=maxnc; nc++) { for(pass=1; pass<=passcount; pass++) { /* * Simple test for LDA-N/LDA-1 */ axis = ae_randominteger(nf, _state); ns = maxns/2+ae_randominteger(maxns/2, _state); testldaunit_gensimpleset(nf, nc, ns, axis, &xy, _state); fisherldan(&xy, ns, nf, nc, &info, &wn, _state); if( info!=1 ) { ldanerrors = ae_true; continue; } ldanerrors = ldanerrors||!testldaunit_testwn(&xy, &wn, ns, nf, nc, 0, _state); ldanerrors = ldanerrors||ae_fp_less_eq(ae_fabs(wn.ptr.pp_double[axis][0], _state),0.75); fisherlda(&xy, ns, nf, nc, &info, &w1, _state); for(i=0; i<=nf-1; i++) { lda1errors = lda1errors||ae_fp_neq(w1.ptr.p_double[i],wn.ptr.pp_double[i][0]); } /* * Degenerate test for LDA-N */ if( nf>=3 ) { ns = maxns/2+ae_randominteger(maxns/2, _state); /* * there are two duplicate features, * 
axis is oriented along non-duplicate feature */ axis = ae_randominteger(nf-2, _state); testldaunit_gendeg1set(nf, nc, ns, axis, &xy, _state); fisherldan(&xy, ns, nf, nc, &info, &wn, _state); if( info!=2 ) { ldanerrors = ae_true; continue; } ldanerrors = ldanerrors||ae_fp_less_eq(wn.ptr.pp_double[axis][0],0.75); fisherlda(&xy, ns, nf, nc, &info, &w1, _state); for(i=0; i<=nf-1; i++) { lda1errors = lda1errors||ae_fp_neq(w1.ptr.p_double[i],wn.ptr.pp_double[i][0]); } } } } /* * Final report */ waserrors = ldanerrors||lda1errors; if( !silent ) { printf("LDA TEST\n"); printf("FISHER LDA-N: "); if( !ldanerrors ) { printf("OK\n"); } else { printf("FAILED\n"); } printf("FISHER LDA-1: "); if( !lda1errors ) { printf("OK\n"); } else { printf("FAILED\n"); } if( waserrors ) { printf("TEST SUMMARY: FAILED\n"); } else { printf("TEST SUMMARY: PASSED\n"); } printf("\n\n"); } result = !waserrors; ae_frame_leave(_state); return result; } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. *************************************************************************/ ae_bool _pexec_testlda(ae_bool silent, ae_state *_state) { return testlda(silent, _state); } /************************************************************************* Generates 'simple' set - a sequence of unit 'balls' at (0,0), (1,0), (2,0) and so on. *************************************************************************/ static void testldaunit_gensimpleset(ae_int_t nfeatures, ae_int_t nclasses, ae_int_t nsamples, ae_int_t axis, /* Real */ ae_matrix* xy, ae_state *_state) { ae_int_t i; ae_int_t j; ae_int_t c; ae_matrix_clear(xy); ae_assert(axis>=0&&axis<nfeatures, "GenSimpleSet: wrong Axis!", _state); ae_matrix_set_length(xy, nsamples-1+1, nfeatures+1, _state); for(i=0; i<=nsamples-1; i++) { for(j=0; j<=nfeatures-1; j++) { xy->ptr.pp_double[i][j] = testldaunit_generatenormal(0.0, 1.0, _state); } c = i%nclasses; xy->ptr.pp_double[i][axis] = xy->ptr.pp_double[i][axis]+c; xy->ptr.pp_double[i][nfeatures] = (double)(c); } } /************************************************************************* Generates 'degenerate' set #1. NFeatures>=3.
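The last feature is an exact duplicate of the previous one, so the scatter matrices are rank-deficient; the caller expects FisherLDAN to return Info=2 on such a dataset.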
*************************************************************************/ static void testldaunit_gendeg1set(ae_int_t nfeatures, ae_int_t nclasses, ae_int_t nsamples, ae_int_t axis, /* Real */ ae_matrix* xy, ae_state *_state) { ae_int_t i; ae_int_t j; ae_int_t c; ae_matrix_clear(xy); ae_assert(axis>=0&&axis<nfeatures, "GenDeg1Set: wrong Axis!", _state); ae_assert(nfeatures>=3, "GenDeg1Set: wrong NFeatures!", _state); ae_matrix_set_length(xy, nsamples-1+1, nfeatures+1, _state); if( axis>=nfeatures-2 ) { axis = nfeatures-3; } for(i=0; i<=nsamples-1; i++) { for(j=0; j<=nfeatures-2; j++) { xy->ptr.pp_double[i][j] = testldaunit_generatenormal(0.0, 1.0, _state); } xy->ptr.pp_double[i][nfeatures-1] = xy->ptr.pp_double[i][nfeatures-2]; c = i%nclasses; xy->ptr.pp_double[i][axis] = xy->ptr.pp_double[i][axis]+c; xy->ptr.pp_double[i][nfeatures] = (double)(c); } } /************************************************************************* Normal random number (Marsaglia polar method) *************************************************************************/ static double testldaunit_generatenormal(double mean, double sigma, ae_state *_state) { double u; double v; double sum; double result; result = mean; for(;;) { u = (2*ae_randominteger(2, _state)-1)*ae_randomreal(_state); v = (2*ae_randominteger(2, _state)-1)*ae_randomreal(_state); sum = u*u+v*v; if( ae_fp_less(sum,(double)(1))&&ae_fp_greater(sum,(double)(0)) ) { sum = ae_sqrt(-2*ae_log(sum, _state)/sum, _state); result = sigma*u*sum+mean; return result; } } return result; } /************************************************************************* Tests WN for correctness *************************************************************************/ static ae_bool testldaunit_testwn(/* Real */ ae_matrix* xy, /* Real */ ae_matrix* wn, ae_int_t ns, ae_int_t nf, ae_int_t nc, ae_int_t ndeg, ae_state *_state) { ae_frame _frame_block; ae_matrix st; ae_matrix sw; ae_matrix a; ae_matrix z; ae_vector tx; ae_vector jp; ae_vector jq; ae_vector work; ae_int_t i; ae_int_t j; double v; double wprev; double tol; double p; double q; ae_bool result; ae_frame_make(_state, &_frame_block); ae_matrix_init(&st, 0, 0, DT_REAL, _state); ae_matrix_init(&sw, 0, 0, DT_REAL, _state); ae_matrix_init(&a, 0, 0, DT_REAL, _state); ae_matrix_init(&z, 0, 0, DT_REAL, _state); ae_vector_init(&tx, 0, DT_REAL, _state); ae_vector_init(&jp, 0, DT_REAL, _state); ae_vector_init(&jq, 0, DT_REAL, _state); ae_vector_init(&work, 0, DT_REAL, _state); tol = (double)(10000); result = ae_true; testldaunit_fishers(xy, ns, nf, nc, &st, &sw, _state); /* * Test for decreasing of J */ ae_vector_set_length(&tx, nf-1+1, _state); ae_vector_set_length(&jp, nf-1+1, _state); ae_vector_set_length(&jq, nf-1+1, _state); for(j=0; j<=nf-1; j++) { ae_v_move(&tx.ptr.p_double[0], 1, &wn->ptr.pp_double[0][j], wn->stride, ae_v_len(0,nf-1)); v = testldaunit_calcj(nf, &st, &sw, &tx, &p, &q, _state); jp.ptr.p_double[j] = p; jq.ptr.p_double[j] = q; } for(i=1; i<=nf-1-ndeg; i++) { result = result&&ae_fp_greater_eq(jp.ptr.p_double[i-1]/jq.ptr.p_double[i-1],(1-tol*ae_machineepsilon)*jp.ptr.p_double[i]/jq.ptr.p_double[i]); } for(i=nf-1-ndeg+1; i<=nf-1; i++) { result = result&&ae_fp_less_eq(jp.ptr.p_double[i],tol*ae_machineepsilon*jp.ptr.p_double[0]); } /* * Test for J optimality */ ae_v_move(&tx.ptr.p_double[0], 1, &wn->ptr.pp_double[0][0], wn->stride, ae_v_len(0,nf-1)); v = testldaunit_calcj(nf, &st, &sw, &tx, &p, &q, _state); for(i=0; i<=nf-1; i++) { wprev = tx.ptr.p_double[i]; tx.ptr.p_double[i] = wprev+0.01; result = result&&ae_fp_greater_eq(v,(1-tol*ae_machineepsilon)*testldaunit_calcj(nf, &st, &sw, &tx, &p, &q,
_state)); tx.ptr.p_double[i] = wprev-0.01; result = result&&ae_fp_greater_eq(v,(1-tol*ae_machineepsilon)*testldaunit_calcj(nf, &st, &sw, &tx, &p, &q, _state)); tx.ptr.p_double[i] = wprev; } /* * Test for linear independence of W */ ae_vector_set_length(&work, nf+1, _state); ae_matrix_set_length(&a, nf-1+1, nf-1+1, _state); matrixmatrixmultiply(wn, 0, nf-1, 0, nf-1, ae_false, wn, 0, nf-1, 0, nf-1, ae_true, 1.0, &a, 0, nf-1, 0, nf-1, 0.0, &work, _state); if( smatrixevd(&a, nf, 1, ae_true, &tx, &z, _state) ) { result = result&&ae_fp_greater(tx.ptr.p_double[0],tx.ptr.p_double[nf-1]*1000*ae_machineepsilon); } /* * Test for other properties */ for(j=0; j<=nf-1; j++) { v = ae_v_dotproduct(&wn->ptr.pp_double[0][j], wn->stride, &wn->ptr.pp_double[0][j], wn->stride, ae_v_len(0,nf-1)); v = ae_sqrt(v, _state); result = result&&ae_fp_less_eq(ae_fabs(v-1, _state),1000*ae_machineepsilon); v = (double)(0); for(i=0; i<=nf-1; i++) { v = v+wn->ptr.pp_double[i][j]; } result = result&&ae_fp_greater_eq(v,(double)(0)); } ae_frame_leave(_state); return result; } /************************************************************************* Calculates J *************************************************************************/ static double testldaunit_calcj(ae_int_t nf, /* Real */ ae_matrix* st, /* Real */ ae_matrix* sw, /* Real */ ae_vector* w, double* p, double* q, ae_state *_state) { ae_frame _frame_block; ae_vector tx; ae_int_t i; double v; double result; ae_frame_make(_state, &_frame_block); *p = 0; *q = 0; ae_vector_init(&tx, 0, DT_REAL, _state); ae_vector_set_length(&tx, nf-1+1, _state); for(i=0; i<=nf-1; i++) { v = ae_v_dotproduct(&st->ptr.pp_double[i][0], 1, &w->ptr.p_double[0], 1, ae_v_len(0,nf-1)); tx.ptr.p_double[i] = v; } v = ae_v_dotproduct(&w->ptr.p_double[0], 1, &tx.ptr.p_double[0], 1, ae_v_len(0,nf-1)); *p = v; for(i=0; i<=nf-1; i++) { v = ae_v_dotproduct(&sw->ptr.pp_double[i][0], 1, &w->ptr.p_double[0], 1, ae_v_len(0,nf-1)); tx.ptr.p_double[i] = v; } v = ae_v_dotproduct(&w->ptr.p_double[0], 1, &tx.ptr.p_double[0], 1, ae_v_len(0,nf-1)); *q = v; result = *p/(*q); ae_frame_leave(_state); return result; } /************************************************************************* Calculates ST/SW *************************************************************************/ static void testldaunit_fishers(/* Real */ ae_matrix* xy, ae_int_t npoints, ae_int_t nfeatures, ae_int_t nclasses, /* Real */ ae_matrix* st, /* Real */ ae_matrix* sw, ae_state *_state) { ae_frame _frame_block; ae_int_t i; ae_int_t j; ae_int_t k; double v; ae_vector c; ae_vector mu; ae_matrix muc; ae_vector nc; ae_vector tf; ae_vector work; ae_frame_make(_state, &_frame_block); ae_matrix_clear(st); ae_matrix_clear(sw); ae_vector_init(&c, 0, DT_INT, _state); ae_vector_init(&mu, 0, DT_REAL, _state); ae_matrix_init(&muc, 0, 0, DT_REAL, _state); ae_vector_init(&nc, 0, DT_INT, _state); ae_vector_init(&tf, 0, DT_REAL, _state); ae_vector_init(&work, 0, DT_REAL, _state); /* * Prepare temporaries */ ae_vector_set_length(&tf, nfeatures-1+1, _state); ae_vector_set_length(&work, nfeatures+1, _state); /* * Convert class labels from reals to integers (just for convenience) */ ae_vector_set_length(&c, npoints-1+1, _state); for(i=0; i<=npoints-1; i++) { c.ptr.p_int[i] = ae_round(xy->ptr.pp_double[i][nfeatures], _state); } /* * Calculate class sizes and means */ ae_vector_set_length(&mu, nfeatures-1+1, _state); ae_matrix_set_length(&muc, nclasses-1+1, nfeatures-1+1, _state); ae_vector_set_length(&nc, nclasses-1+1, _state); for(j=0; j<=nfeatures-1; j++) { 
mu.ptr.p_double[j] = (double)(0); } for(i=0; i<=nclasses-1; i++) { nc.ptr.p_int[i] = 0; for(j=0; j<=nfeatures-1; j++) { muc.ptr.pp_double[i][j] = (double)(0); } } for(i=0; i<=npoints-1; i++) { ae_v_add(&mu.ptr.p_double[0], 1, &xy->ptr.pp_double[i][0], 1, ae_v_len(0,nfeatures-1)); ae_v_add(&muc.ptr.pp_double[c.ptr.p_int[i]][0], 1, &xy->ptr.pp_double[i][0], 1, ae_v_len(0,nfeatures-1)); nc.ptr.p_int[c.ptr.p_int[i]] = nc.ptr.p_int[c.ptr.p_int[i]]+1; } for(i=0; i<=nclasses-1; i++) { v = (double)1/(double)nc.ptr.p_int[i]; ae_v_muld(&muc.ptr.pp_double[i][0], 1, ae_v_len(0,nfeatures-1), v); } v = (double)1/(double)npoints; ae_v_muld(&mu.ptr.p_double[0], 1, ae_v_len(0,nfeatures-1), v); /* * Create ST matrix */ ae_matrix_set_length(st, nfeatures-1+1, nfeatures-1+1, _state); for(i=0; i<=nfeatures-1; i++) { for(j=0; j<=nfeatures-1; j++) { st->ptr.pp_double[i][j] = (double)(0); } } for(k=0; k<=npoints-1; k++) { ae_v_move(&tf.ptr.p_double[0], 1, &xy->ptr.pp_double[k][0], 1, ae_v_len(0,nfeatures-1)); ae_v_sub(&tf.ptr.p_double[0], 1, &mu.ptr.p_double[0], 1, ae_v_len(0,nfeatures-1)); for(i=0; i<=nfeatures-1; i++) { v = tf.ptr.p_double[i]; ae_v_addd(&st->ptr.pp_double[i][0], 1, &tf.ptr.p_double[0], 1, ae_v_len(0,nfeatures-1), v); } } /* * Create SW matrix */ ae_matrix_set_length(sw, nfeatures-1+1, nfeatures-1+1, _state); for(i=0; i<=nfeatures-1; i++) { for(j=0; j<=nfeatures-1; j++) { sw->ptr.pp_double[i][j] = (double)(0); } } for(k=0; k<=npoints-1; k++) { ae_v_move(&tf.ptr.p_double[0], 1, &xy->ptr.pp_double[k][0], 1, ae_v_len(0,nfeatures-1)); ae_v_sub(&tf.ptr.p_double[0], 1, &muc.ptr.pp_double[c.ptr.p_int[k]][0], 1, ae_v_len(0,nfeatures-1)); for(i=0; i<=nfeatures-1; i++) { v = tf.ptr.p_double[i]; ae_v_addd(&sw->ptr.pp_double[i][0], 1, &tf.ptr.p_double[0], 1, ae_v_len(0,nfeatures-1), v); } } ae_frame_leave(_state); } ae_bool testgammafunc(ae_bool silent, ae_state *_state) { double threshold; double v; double s; ae_bool waserrors; ae_bool gammaerrors; ae_bool lngammaerrors; ae_bool result; gammaerrors = ae_false; lngammaerrors = ae_false; waserrors = ae_false; threshold = 100*ae_machineepsilon; /* * */ gammaerrors = gammaerrors||ae_fp_greater(ae_fabs(gammafunction(0.5, _state)-ae_sqrt(ae_pi, _state), _state),threshold); gammaerrors = gammaerrors||ae_fp_greater(ae_fabs(gammafunction(1.5, _state)-0.5*ae_sqrt(ae_pi, _state), _state),threshold); v = lngamma(0.5, &s, _state); lngammaerrors = (lngammaerrors||ae_fp_greater(ae_fabs(v-ae_log(ae_sqrt(ae_pi, _state), _state), _state),threshold))||ae_fp_neq(s,(double)(1)); v = lngamma(1.5, &s, _state); lngammaerrors = (lngammaerrors||ae_fp_greater(ae_fabs(v-ae_log(0.5*ae_sqrt(ae_pi, _state), _state), _state),threshold))||ae_fp_neq(s,(double)(1)); /* * report */ waserrors = gammaerrors||lngammaerrors; if( !silent ) { printf("TESTING GAMMA FUNCTION\n"); printf("GAMMA: "); if( gammaerrors ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("LN GAMMA: "); if( lngammaerrors ) { printf("FAILED\n"); } else { printf("OK\n"); } if( waserrors ) { printf("TEST FAILED\n"); } else { printf("TEST PASSED\n"); } printf("\n\n"); } /* * end */ result = !waserrors; return result; } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. 
*************************************************************************/ ae_bool _pexec_testgammafunc(ae_bool silent, ae_state *_state) { return testgammafunc(silent, _state); } static void testlinregunit_generaterandomtask(double xl, double xr, ae_bool randomx, double ymin, double ymax, double smin, double smax, ae_int_t n, /* Real */ ae_matrix* xy, /* Real */ ae_vector* s, ae_state *_state); static void testlinregunit_generatetask(double a, double b, double xl, double xr, ae_bool randomx, double smin, double smax, ae_int_t n, /* Real */ ae_matrix* xy, /* Real */ ae_vector* s, ae_state *_state); static void testlinregunit_filltaskwithy(double a, double b, ae_int_t n, /* Real */ ae_matrix* xy, /* Real */ ae_vector* s, ae_state *_state); static double testlinregunit_generatenormal(double mean, double sigma, ae_state *_state); static void testlinregunit_calculatemv(/* Real */ ae_vector* x, ae_int_t n, double* mean, double* means, double* stddev, double* stddevs, ae_state *_state); static void testlinregunit_unsetlr(linearmodel* lr, ae_state *_state); ae_bool testlinreg(ae_bool silent, ae_state *_state) { ae_frame _frame_block; double sigmathreshold; ae_int_t maxn; ae_int_t passcount; ae_int_t estpasscount; double threshold; ae_int_t n; ae_int_t i; ae_int_t j; ae_int_t k; ae_int_t tmpi; ae_int_t pass; ae_int_t epass; ae_int_t m; ae_int_t tasktype; ae_int_t modeltype; ae_int_t m1; ae_int_t m2; ae_int_t n1; ae_int_t n2; ae_int_t info; ae_int_t info2; ae_matrix xy; ae_matrix xy2; ae_vector s; ae_vector s2; ae_vector w2; ae_vector x; ae_vector ta; ae_vector tb; ae_vector tc; ae_vector xy0; ae_vector tmpweights; linearmodel w; linearmodel wt; linearmodel wt2; ae_vector x1; ae_vector x2; double y1; double y2; ae_bool allsame; double ea; double eb; double varatested; double varbtested; double a; double b; double vara; double varb; double a2; double b2; double covab; double corrab; double p; ae_int_t qcnt; ae_vector qtbl; ae_vector qvals; ae_vector qsigma; lrreport ar; lrreport ar2; double f; double fp; double fm; double v; double vv; double cvrmserror; double cvavgerror; double cvavgrelerror; double rmserror; double avgerror; double avgrelerror; ae_bool nondefect; double sinshift; double tasklevel; double noiselevel; double hstep; double sigma; double mean; double means; double stddev; double stddevs; ae_bool slcerrors; ae_bool slerrors; ae_bool grcoverrors; ae_bool gropterrors; ae_bool gresterrors; ae_bool grothererrors; ae_bool grconverrors; ae_bool waserrors; ae_bool result; ae_frame_make(_state, &_frame_block); ae_matrix_init(&xy, 0, 0, DT_REAL, _state); ae_matrix_init(&xy2, 0, 0, DT_REAL, _state); ae_vector_init(&s, 0, DT_REAL, _state); ae_vector_init(&s2, 0, DT_REAL, _state); ae_vector_init(&w2, 0, DT_REAL, _state); ae_vector_init(&x, 0, DT_REAL, _state); ae_vector_init(&ta, 0, DT_REAL, _state); ae_vector_init(&tb, 0, DT_REAL, _state); ae_vector_init(&tc, 0, DT_REAL, _state); ae_vector_init(&xy0, 0, DT_REAL, _state); ae_vector_init(&tmpweights, 0, DT_REAL, _state); _linearmodel_init(&w, _state); _linearmodel_init(&wt, _state); _linearmodel_init(&wt2, _state); ae_vector_init(&x1, 0, DT_REAL, _state); ae_vector_init(&x2, 0, DT_REAL, _state); ae_vector_init(&qtbl, 0, DT_REAL, _state); ae_vector_init(&qvals, 0, DT_REAL, _state); ae_vector_init(&qsigma, 0, DT_REAL, _state); _lrreport_init(&ar, _state); _lrreport_init(&ar2, _state); /* * Primary settings */ maxn = 40; passcount = 3; estpasscount = 1000; sigmathreshold = (double)(7); threshold = 1000000*ae_machineepsilon; slerrors = ae_false; 
slcerrors = ae_false; grcoverrors = ae_false; gropterrors = ae_false; gresterrors = ae_false; grothererrors = ae_false; grconverrors = ae_false; waserrors = ae_false; /* * Quantiles table setup */ qcnt = 5; ae_vector_set_length(&qtbl, qcnt-1+1, _state); ae_vector_set_length(&qvals, qcnt-1+1, _state); ae_vector_set_length(&qsigma, qcnt-1+1, _state); qtbl.ptr.p_double[0] = 0.5; qtbl.ptr.p_double[1] = 0.25; qtbl.ptr.p_double[2] = 0.10; qtbl.ptr.p_double[3] = 0.05; qtbl.ptr.p_double[4] = 0.025; for(i=0; i<=qcnt-1; i++) { qsigma.ptr.p_double[i] = ae_sqrt(qtbl.ptr.p_double[i]*(1-qtbl.ptr.p_double[i])/estpasscount, _state); } /* * Other setup */ ae_vector_set_length(&ta, estpasscount-1+1, _state); ae_vector_set_length(&tb, estpasscount-1+1, _state); /* * Test straight line regression */ for(n=2; n<=maxn; n++) { /* * Fail/pass test */ testlinregunit_generaterandomtask((double)(-1), (double)(1), ae_false, (double)(-1), (double)(1), (double)(1), (double)(2), n, &xy, &s, _state); lrlines(&xy, &s, n, &info, &a, &b, &vara, &varb, &covab, &corrab, &p, _state); slcerrors = slcerrors||info!=1; testlinregunit_generaterandomtask((double)(1), (double)(1), ae_false, (double)(-1), (double)(1), (double)(1), (double)(2), n, &xy, &s, _state); lrlines(&xy, &s, n, &info, &a, &b, &vara, &varb, &covab, &corrab, &p, _state); slcerrors = slcerrors||info!=-3; testlinregunit_generaterandomtask((double)(-1), (double)(1), ae_false, (double)(-1), (double)(1), (double)(-1), (double)(-1), n, &xy, &s, _state); lrlines(&xy, &s, n, &info, &a, &b, &vara, &varb, &covab, &corrab, &p, _state); slcerrors = slcerrors||info!=-2; testlinregunit_generaterandomtask((double)(-1), (double)(1), ae_false, (double)(-1), (double)(1), (double)(2), (double)(1), 2, &xy, &s, _state); lrlines(&xy, &s, 1, &info, &a, &b, &vara, &varb, &covab, &corrab, &p, _state); slcerrors = slcerrors||info!=-1; /* * Multipass tests */ for(pass=1; pass<=passcount; pass++) { /* * Test S variant against non-S variant */ ea = 2*ae_randomreal(_state)-1; eb = 2*ae_randomreal(_state)-1; testlinregunit_generatetask(ea, eb, -5*ae_randomreal(_state), 5*ae_randomreal(_state), ae_fp_greater(ae_randomreal(_state),0.5), (double)(1), (double)(1), n, &xy, &s, _state); lrlines(&xy, &s, n, &info, &a, &b, &vara, &varb, &covab, &corrab, &p, _state); lrline(&xy, n, &info2, &a2, &b2, _state); if( info!=1||info2!=1 ) { slcerrors = ae_true; } else { slerrors = (slerrors||ae_fp_greater(ae_fabs(a-a2, _state),threshold))||ae_fp_greater(ae_fabs(b-b2, _state),threshold); } /* * Test for A/B * * Generate task with exact, non-perturbed y[i], * then make non-zero s[i] */ ea = 2*ae_randomreal(_state)-1; eb = 2*ae_randomreal(_state)-1; testlinregunit_generatetask(ea, eb, -5*ae_randomreal(_state), 5*ae_randomreal(_state), n>4, 0.0, 0.0, n, &xy, &s, _state); for(i=0; i<=n-1; i++) { s.ptr.p_double[i] = 1+ae_randomreal(_state); } lrlines(&xy, &s, n, &info, &a, &b, &vara, &varb, &covab, &corrab, &p, _state); if( info!=1 ) { slcerrors = ae_true; } else { slerrors = (slerrors||ae_fp_greater(ae_fabs(a-ea, _state),0.001))||ae_fp_greater(ae_fabs(b-eb, _state),0.001); } /* * Test for VarA, VarB, P (P is being tested only for N>2) */ for(i=0; i<=qcnt-1; i++) { qvals.ptr.p_double[i] = (double)(0); } ea = 2*ae_randomreal(_state)-1; eb = 2*ae_randomreal(_state)-1; testlinregunit_generatetask(ea, eb, -5*ae_randomreal(_state), 5*ae_randomreal(_state), n>4, 1.0, 2.0, n, &xy, &s, _state); lrlines(&xy, &s, n, &info, &a, &b, &vara, &varb, &covab, &corrab, &p, _state); if( info!=1 ) { slcerrors = ae_true; continue; } 
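/*
 * The block below is a Monte-Carlo check of the variances and p-values reported by LRLineS:
 * the same task is re-sampled EstPassCount times, the empirical mean and standard deviation of
 * the estimated A and B are compared with EA/EB and with sqrt(VarA)/sqrt(VarB), and a deviation
 * larger than SigmaThreshold standard errors (as estimated by the CalculateMV helper) counts as
 * a failure. The observed frequency of p<=q is compared with q itself using the binomial
 * standard error sqrt(q*(1-q)/EstPassCount) stored in QSigma.
 */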
varatested = vara; varbtested = varb; for(epass=0; epass<=estpasscount-1; epass++) { /* * Generate */ testlinregunit_filltaskwithy(ea, eb, n, &xy, &s, _state); lrlines(&xy, &s, n, &info, &a, &b, &vara, &varb, &covab, &corrab, &p, _state); if( info!=1 ) { slcerrors = ae_true; continue; } /* * A, B, P * (P is being tested for uniformity, additional p-tests are below) */ ta.ptr.p_double[epass] = a; tb.ptr.p_double[epass] = b; for(i=0; i<=qcnt-1; i++) { if( ae_fp_less_eq(p,qtbl.ptr.p_double[i]) ) { qvals.ptr.p_double[i] = qvals.ptr.p_double[i]+(double)1/(double)estpasscount; } } } testlinregunit_calculatemv(&ta, estpasscount, &mean, &means, &stddev, &stddevs, _state); slerrors = slerrors||ae_fp_greater_eq(ae_fabs(mean-ea, _state)/means,sigmathreshold); slerrors = slerrors||ae_fp_greater_eq(ae_fabs(stddev-ae_sqrt(varatested, _state), _state)/stddevs,sigmathreshold); testlinregunit_calculatemv(&tb, estpasscount, &mean, &means, &stddev, &stddevs, _state); slerrors = slerrors||ae_fp_greater_eq(ae_fabs(mean-eb, _state)/means,sigmathreshold); slerrors = slerrors||ae_fp_greater_eq(ae_fabs(stddev-ae_sqrt(varbtested, _state), _state)/stddevs,sigmathreshold); if( n>2 ) { for(i=0; i<=qcnt-1; i++) { if( ae_fp_greater(ae_fabs(qtbl.ptr.p_double[i]-qvals.ptr.p_double[i], _state)/qsigma.ptr.p_double[i],sigmathreshold) ) { slerrors = ae_true; } } } /* * Additional tests for P: correlation with fit quality */ if( n>2 ) { testlinregunit_generatetask(ea, eb, -5*ae_randomreal(_state), 5*ae_randomreal(_state), ae_false, 0.0, 0.0, n, &xy, &s, _state); for(i=0; i<=n-1; i++) { s.ptr.p_double[i] = 1+ae_randomreal(_state); } lrlines(&xy, &s, n, &info, &a, &b, &vara, &varb, &covab, &corrab, &p, _state); if( info!=1 ) { slcerrors = ae_true; continue; } slerrors = slerrors||ae_fp_less(p,0.999); testlinregunit_generatetask((double)(0), (double)(0), -5*ae_randomreal(_state), 5*ae_randomreal(_state), ae_false, 1.0, 1.0, n, &xy, &s, _state); for(i=0; i<=n-1; i++) { if( i%2==0 ) { xy.ptr.pp_double[i][1] = 5.0; } else { xy.ptr.pp_double[i][1] = -5.0; } } if( n%2!=0 ) { xy.ptr.pp_double[n-1][1] = (double)(0); } lrlines(&xy, &s, n, &info, &a, &b, &vara, &varb, &covab, &corrab, &p, _state); if( info!=1 ) { slcerrors = ae_true; continue; } slerrors = slerrors||ae_fp_greater(p,0.001); } } } /* * General regression tests: */ /* * Simple linear tests (small sample, optimum point, covariance) */ for(n=3; n<=maxn; n++) { ae_vector_set_length(&s, n-1+1, _state); /* * Linear tests: * a. random points, sigmas * b. 
no sigmas */ ae_matrix_set_length(&xy, n-1+1, 1+1, _state); for(i=0; i<=n-1; i++) { xy.ptr.pp_double[i][0] = 2*ae_randomreal(_state)-1; xy.ptr.pp_double[i][1] = 2*ae_randomreal(_state)-1; s.ptr.p_double[i] = 1+ae_randomreal(_state); } lrbuilds(&xy, &s, n, 1, &info, &wt, &ar, _state); if( info!=1 ) { grconverrors = ae_true; continue; } lrunpack(&wt, &tmpweights, &tmpi, _state); lrlines(&xy, &s, n, &info2, &a, &b, &vara, &varb, &covab, &corrab, &p, _state); gropterrors = gropterrors||ae_fp_greater(ae_fabs(a-tmpweights.ptr.p_double[1], _state),threshold); gropterrors = gropterrors||ae_fp_greater(ae_fabs(b-tmpweights.ptr.p_double[0], _state),threshold); grcoverrors = grcoverrors||ae_fp_greater(ae_fabs(vara-ar.c.ptr.pp_double[1][1], _state),threshold); grcoverrors = grcoverrors||ae_fp_greater(ae_fabs(varb-ar.c.ptr.pp_double[0][0], _state),threshold); grcoverrors = grcoverrors||ae_fp_greater(ae_fabs(covab-ar.c.ptr.pp_double[1][0], _state),threshold); grcoverrors = grcoverrors||ae_fp_greater(ae_fabs(covab-ar.c.ptr.pp_double[0][1], _state),threshold); lrbuild(&xy, n, 1, &info, &wt, &ar, _state); if( info!=1 ) { grconverrors = ae_true; continue; } lrunpack(&wt, &tmpweights, &tmpi, _state); lrline(&xy, n, &info2, &a, &b, _state); gropterrors = gropterrors||ae_fp_greater(ae_fabs(a-tmpweights.ptr.p_double[1], _state),threshold); gropterrors = gropterrors||ae_fp_greater(ae_fabs(b-tmpweights.ptr.p_double[0], _state),threshold); } /* * S covariance versus S-less covariance. * Slightly skewed task, large sample size. * Will S-less subroutine estimate covariance matrix good enough? */ n = 1000+ae_randominteger(3000, _state); sigma = 0.1+ae_randomreal(_state)*1.9; ae_matrix_set_length(&xy, n-1+1, 1+1, _state); ae_vector_set_length(&s, n-1+1, _state); for(i=0; i<=n-1; i++) { xy.ptr.pp_double[i][0] = 1.5*ae_randomreal(_state)-0.5; xy.ptr.pp_double[i][1] = 1.2*xy.ptr.pp_double[i][0]-0.3+testlinregunit_generatenormal((double)(0), sigma, _state); s.ptr.p_double[i] = sigma; } lrbuild(&xy, n, 1, &info, &wt, &ar, _state); lrlines(&xy, &s, n, &info2, &a, &b, &vara, &varb, &covab, &corrab, &p, _state); if( info!=1||info2!=1 ) { grconverrors = ae_true; } else { grcoverrors = grcoverrors||ae_fp_greater(ae_fabs(ae_log(ar.c.ptr.pp_double[0][0]/varb, _state), _state),ae_log(1.2, _state)); grcoverrors = grcoverrors||ae_fp_greater(ae_fabs(ae_log(ar.c.ptr.pp_double[1][1]/vara, _state), _state),ae_log(1.2, _state)); grcoverrors = grcoverrors||ae_fp_greater(ae_fabs(ae_log(ar.c.ptr.pp_double[0][1]/covab, _state), _state),ae_log(1.2, _state)); grcoverrors = grcoverrors||ae_fp_greater(ae_fabs(ae_log(ar.c.ptr.pp_double[1][0]/covab, _state), _state),ae_log(1.2, _state)); } /* * General tests: * * basis functions - up to cubic * * task types: * * data set is noisy sine half-period with random shift * * tests: * unpacking/packing * optimality * error estimates * * tasks: * 0 = noised sine * 1 = degenerate task with 1-of-n encoded categorical variables * 2 = random task with large variation (for 1-type models) * 3 = random task with small variation (for 1-type models) * * Additional tasks TODO * specially designed task with defective vectors which leads to * the failure of the fast CV formula. 
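* Note: in task 1 each of the three categorical factors is encoded 1-of-3: the code below
* puts exactly one unit element into positions 0..2, one into 3..5 and one into 6..8, so a
* factor taking value k becomes a sub-vector with 1 in position k and 0 elsewhere. Each such
* block of columns sums to the all-ones vector, so the design columns are linearly dependent,
* which is why the task is described as degenerate above.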
* */ m1 = 0; m2 = -1; n1 = 0; n2 = -1; for(modeltype=0; modeltype<=1; modeltype++) { for(tasktype=0; tasktype<=3; tasktype++) { if( tasktype==0 ) { m1 = 1; m2 = 3; } if( tasktype==1 ) { m1 = 9; m2 = 9; } if( tasktype==2||tasktype==3 ) { m1 = 9; m2 = 9; } for(m=m1; m<=m2; m++) { if( tasktype==0 ) { n1 = m+3; n2 = m+20; } if( tasktype==1 ) { n1 = 70+ae_randominteger(70, _state); n2 = n1; } if( tasktype==2||tasktype==3 ) { n1 = 100; n2 = n1; } for(n=n1; n<=n2; n++) { ae_matrix_set_length(&xy, n-1+1, m+1, _state); ae_vector_set_length(&xy0, n-1+1, _state); ae_vector_set_length(&s, n-1+1, _state); hstep = 0.001; noiselevel = 0.2; /* * Prepare task */ if( tasktype==0 ) { for(i=0; i<=n-1; i++) { xy.ptr.pp_double[i][0] = 2*ae_randomreal(_state)-1; } for(i=0; i<=n-1; i++) { for(j=1; j<=m-1; j++) { xy.ptr.pp_double[i][j] = xy.ptr.pp_double[i][0]*xy.ptr.pp_double[i][j-1]; } } sinshift = ae_randomreal(_state)*ae_pi; for(i=0; i<=n-1; i++) { xy0.ptr.p_double[i] = ae_sin(sinshift+ae_pi*0.5*(xy.ptr.pp_double[i][0]+1), _state); xy.ptr.pp_double[i][m] = xy0.ptr.p_double[i]+noiselevel*testlinregunit_generatenormal((double)(0), (double)(1), _state); } } if( tasktype==1 ) { ae_assert(m==9, "Assertion failed", _state); ae_vector_set_length(&ta, 8+1, _state); ta.ptr.p_double[0] = (double)(1); ta.ptr.p_double[1] = (double)(2); ta.ptr.p_double[2] = (double)(3); ta.ptr.p_double[3] = 0.25; ta.ptr.p_double[4] = 0.5; ta.ptr.p_double[5] = 0.75; ta.ptr.p_double[6] = 0.06; ta.ptr.p_double[7] = 0.12; ta.ptr.p_double[8] = 0.18; for(i=0; i<=n-1; i++) { for(j=0; j<=m-1; j++) { xy.ptr.pp_double[i][j] = (double)(0); } xy.ptr.pp_double[i][0+i%3] = (double)(1); xy.ptr.pp_double[i][3+i/3%3] = (double)(1); xy.ptr.pp_double[i][6+i/9%3] = (double)(1); v = ae_v_dotproduct(&xy.ptr.pp_double[i][0], 1, &ta.ptr.p_double[0], 1, ae_v_len(0,8)); xy0.ptr.p_double[i] = v; xy.ptr.pp_double[i][m] = v+noiselevel*testlinregunit_generatenormal((double)(0), (double)(1), _state); } } if( tasktype==2||tasktype==3 ) { ae_assert(m==9, "Assertion failed", _state); ae_vector_set_length(&ta, 8+1, _state); ta.ptr.p_double[0] = (double)(1); ta.ptr.p_double[1] = (double)(-2); ta.ptr.p_double[2] = (double)(3); ta.ptr.p_double[3] = 0.25; ta.ptr.p_double[4] = -0.5; ta.ptr.p_double[5] = 0.75; ta.ptr.p_double[6] = -0.06; ta.ptr.p_double[7] = 0.12; ta.ptr.p_double[8] = -0.18; for(i=0; i<=n-1; i++) { for(j=0; j<=m-1; j++) { if( tasktype==2 ) { xy.ptr.pp_double[i][j] = 1+testlinregunit_generatenormal((double)(0), (double)(3), _state); } else { xy.ptr.pp_double[i][j] = 1+testlinregunit_generatenormal((double)(0), 0.05, _state); } } v = ae_v_dotproduct(&xy.ptr.pp_double[i][0], 1, &ta.ptr.p_double[0], 1, ae_v_len(0,8)); xy0.ptr.p_double[i] = v; xy.ptr.pp_double[i][m] = v+noiselevel*testlinregunit_generatenormal((double)(0), (double)(1), _state); } } for(i=0; i<=n-1; i++) { s.ptr.p_double[i] = 1+ae_randomreal(_state); } /* * Solve (using S-variant, non-S-variant is not tested) */ if( modeltype==0 ) { lrbuilds(&xy, &s, n, m, &info, &wt, &ar, _state); } else { lrbuildzs(&xy, &s, n, m, &info, &wt, &ar, _state); } if( info!=1 ) { grconverrors = ae_true; continue; } lrunpack(&wt, &tmpweights, &tmpi, _state); /* * LRProcess test */ ae_vector_set_length(&x, m-1+1, _state); v = tmpweights.ptr.p_double[m]; for(i=0; i<=m-1; i++) { x.ptr.p_double[i] = 2*ae_randomreal(_state)-1; v = v+tmpweights.ptr.p_double[i]*x.ptr.p_double[i]; } grothererrors = grothererrors||ae_fp_greater(ae_fabs(v-lrprocess(&wt, &x, _state), _state)/ae_maxreal(ae_fabs(v, _state), (double)(1), 
_state),threshold); /* * LRPack test */ lrpack(&tmpweights, m, &wt2, _state); ae_vector_set_length(&x, m-1+1, _state); for(i=0; i<=m-1; i++) { x.ptr.p_double[i] = 2*ae_randomreal(_state)-1; } v = lrprocess(&wt, &x, _state); grothererrors = grothererrors||ae_fp_greater(ae_fabs(v-lrprocess(&wt2, &x, _state), _state)/ae_fabs(v, _state),threshold); /* * Optimality test */ for(k=0; k<=m; k++) { if( modeltype==1&&k==m ) { /* * 0-type models (with non-zero constant term) * are tested for optimality of all coefficients. * * 1-type models (with zero constant term) * are tested for optimality of non-constant terms only. */ continue; } f = (double)(0); fp = (double)(0); fm = (double)(0); for(i=0; i<=n-1; i++) { v = tmpweights.ptr.p_double[m]; for(j=0; j<=m-1; j++) { v = v+xy.ptr.pp_double[i][j]*tmpweights.ptr.p_double[j]; } f = f+ae_sqr((v-xy.ptr.pp_double[i][m])/s.ptr.p_double[i], _state); /* ... (remainder of TestLinReg omitted) ... */ /************************************************************************* Task generation. *************************************************************************/ static void testlinregunit_generaterandomtask(double xl, double xr, ae_bool randomx, double ymin, double ymax, double smin, double smax, ae_int_t n, /* Real */ ae_matrix* xy, /* Real */ ae_vector* s, ae_state *_state) { ae_int_t i; ae_matrix_set_length(xy, n-1+1, 1+1, _state); ae_vector_set_length(s, n-1+1, _state); for(i=0; i<=n-1; i++) { if( randomx ) { xy->ptr.pp_double[i][0] = xl+(xr-xl)*ae_randomreal(_state); } else { xy->ptr.pp_double[i][0] = xl+(xr-xl)*i/(n-1); } xy->ptr.pp_double[i][1] = ymin+(ymax-ymin)*ae_randomreal(_state); s->ptr.p_double[i] = smin+(smax-smin)*ae_randomreal(_state); } } /************************************************************************* Task generation. *************************************************************************/ static void testlinregunit_generatetask(double a, double b, double xl, double xr, ae_bool randomx, double smin, double smax, ae_int_t n, /* Real */ ae_matrix* xy, /* Real */ ae_vector* s, ae_state *_state) { ae_int_t i; ae_matrix_set_length(xy, n-1+1, 1+1, _state); ae_vector_set_length(s, n-1+1, _state); for(i=0; i<=n-1; i++) { if( randomx ) { xy->ptr.pp_double[i][0] = xl+(xr-xl)*ae_randomreal(_state); } else { xy->ptr.pp_double[i][0] = xl+(xr-xl)*i/(n-1); } s->ptr.p_double[i] = smin+(smax-smin)*ae_randomreal(_state); xy->ptr.pp_double[i][1] = a+b*xy->ptr.pp_double[i][0]+testlinregunit_generatenormal((double)(0), s->ptr.p_double[i], _state); } } /************************************************************************* Task generation. 
y[i] are filled based on A, B, X[I], S[I] *************************************************************************/ static void testlinregunit_filltaskwithy(double a, double b, ae_int_t n, /* Real */ ae_matrix* xy, /* Real */ ae_vector* s, ae_state *_state) { ae_int_t i; for(i=0; i<=n-1; i++) { xy->ptr.pp_double[i][1] = a+b*xy->ptr.pp_double[i][0]+testlinregunit_generatenormal((double)(0), s->ptr.p_double[i], _state); } } /************************************************************************* Normal random numbers *************************************************************************/ static double testlinregunit_generatenormal(double mean, double sigma, ae_state *_state) { double u; double v; double sum; double result; result = mean; for(;;) { u = (2*ae_randominteger(2, _state)-1)*ae_randomreal(_state); v = (2*ae_randominteger(2, _state)-1)*ae_randomreal(_state); sum = u*u+v*v; if( ae_fp_less(sum,(double)(1))&&ae_fp_greater(sum,(double)(0)) ) { sum = ae_sqrt(-2*ae_log(sum, _state)/sum, _state); result = sigma*u*sum+mean; return result; } } return result; } /************************************************************************* Moments estimates and their errors *************************************************************************/ static void testlinregunit_calculatemv(/* Real */ ae_vector* x, ae_int_t n, double* mean, double* means, double* stddev, double* stddevs, ae_state *_state) { ae_int_t i; double v1; double v2; double variance; *mean = 0; *means = 0; *stddev = 0; *stddevs = 0; *mean = (double)(0); *means = (double)(1); *stddev = (double)(0); *stddevs = (double)(1); variance = (double)(0); if( n<=1 ) { return; } /* * Mean */ for(i=0; i<=n-1; i++) { *mean = *mean+x->ptr.p_double[i]; } *mean = *mean/n; /* * Variance (using corrected two-pass algorithm) */ if( n!=1 ) { v1 = (double)(0); for(i=0; i<=n-1; i++) { v1 = v1+ae_sqr(x->ptr.p_double[i]-(*mean), _state); } v2 = (double)(0); for(i=0; i<=n-1; i++) { v2 = v2+(x->ptr.p_double[i]-(*mean)); } v2 = ae_sqr(v2, _state)/n; variance = (v1-v2)/(n-1); if( ae_fp_less(variance,(double)(0)) ) { variance = (double)(0); } *stddev = ae_sqrt(variance, _state); } /* * Errors */ *means = *stddev/ae_sqrt((double)(n), _state); *stddevs = *stddev*ae_sqrt((double)(2), _state)/ae_sqrt((double)(n-1), _state); } /************************************************************************* Unsets LR *************************************************************************/ static void testlinregunit_unsetlr(linearmodel* lr, ae_state *_state) { ae_frame _frame_block; ae_matrix xy; ae_int_t info; lrreport rep; ae_int_t i; ae_frame_make(_state, &_frame_block); ae_matrix_init(&xy, 0, 0, DT_REAL, _state); _lrreport_init(&rep, _state); ae_matrix_set_length(&xy, 5+1, 1+1, _state); for(i=0; i<=5; i++) { xy.ptr.pp_double[i][0] = (double)(0); xy.ptr.pp_double[i][1] = (double)(0); } lrbuild(&xy, 6, 1, &info, lr, &rep, _state); ae_assert(info>0, "Assertion failed", _state); ae_frame_leave(_state); } ae_bool testfilters(ae_bool silent, ae_state *_state) { ae_bool waserrors; ae_bool smaerrors; ae_bool emaerrors; ae_bool lrmaerrors; ae_bool result; smaerrors = testsma(ae_true, _state); emaerrors = testema(ae_true, _state); lrmaerrors = testlrma(ae_true, _state); /* * Final report */ waserrors = (smaerrors||emaerrors)||lrmaerrors; if( !silent ) { printf("FILTERS TEST\n"); printf("* SMA: "); if( !smaerrors ) { printf("OK\n"); } else { printf("FAILED\n"); } printf("* EMA: "); if( !emaerrors ) { printf("OK\n"); } else { printf("FAILED\n"); } printf("* LRMA: "); if( 
!lrmaerrors ) { printf("OK\n"); } else { printf("FAILED\n"); } if( waserrors ) { printf("TEST SUMMARY: FAILED\n"); } else { printf("TEST SUMMARY: PASSED\n"); } printf("\n\n"); } result = !waserrors; return result; } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. *************************************************************************/ ae_bool _pexec_testfilters(ae_bool silent, ae_state *_state) { return testfilters(silent, _state); } /************************************************************************* This function tests SMA(k) filter. It returns True on error. Additional IsSilent parameter controls detailed error reporting. *************************************************************************/ ae_bool testsma(ae_bool issilent, ae_state *_state) { ae_frame _frame_block; ae_vector x; ae_bool precomputederrors; ae_bool zerohandlingerrors; double threshold; ae_bool result; ae_frame_make(_state, &_frame_block); ae_vector_init(&x, 0, DT_REAL, _state); threshold = 1000*ae_machineepsilon; if( !issilent ) { printf("SMA(K) TEST\n"); } /* * Test several pre-computed problems. * * NOTE: tests below rely on the fact that floating point * additions and subtractions are exact when dealing * with integer values. */ precomputederrors = ae_false; ae_vector_set_length(&x, 1, _state); x.ptr.p_double[0] = (double)(7); filtersma(&x, 1, 1, _state); precomputederrors = precomputederrors||ae_fp_neq(x.ptr.p_double[0],(double)(7)); ae_vector_set_length(&x, 3, _state); x.ptr.p_double[0] = (double)(7); x.ptr.p_double[1] = (double)(8); x.ptr.p_double[2] = (double)(9); filtersma(&x, 3, 1, _state); precomputederrors = ((precomputederrors||ae_fp_neq(x.ptr.p_double[0],(double)(7)))||ae_fp_neq(x.ptr.p_double[1],(double)(8)))||ae_fp_neq(x.ptr.p_double[2],(double)(9)); filtersma(&x, 3, 2, _state); precomputederrors = ((precomputederrors||ae_fp_neq(x.ptr.p_double[0],(double)(7)))||ae_fp_neq(x.ptr.p_double[1],7.5))||ae_fp_neq(x.ptr.p_double[2],8.5); ae_vector_set_length(&x, 3, _state); x.ptr.p_double[0] = (double)(7); x.ptr.p_double[1] = (double)(8); x.ptr.p_double[2] = (double)(9); filtersma(&x, 3, 4, _state); precomputederrors = ((precomputederrors||ae_fp_neq(x.ptr.p_double[0],(double)(7)))||ae_fp_neq(x.ptr.p_double[1],7.5))||ae_fp_neq(x.ptr.p_double[2],(double)(8)); /* * Test zero-handling: * a) when we have non-zero sequence (N1 elements) followed by zero sequence * (N2 elements), then first N1+K-1 elements of the processed sequence are * non-zero, but elements since (N1+K)th must be exactly zero. * b) similar property holds for zero sequence followed by non-zero one * * Naive implementation of SMA does not have such property. * * NOTE: it is important to initialize X with non-integer elements with long * binary mantissas, because this test tries to test behaviour in the presence * of roundoff errors, and it will be useless when used with integer inputs. 
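* For example, a naive running-sum implementation of SMA(k) would update
* S(i) = S(i-1)+x[i]-x[i-k]; with inputs such as sqrt(2) that subtraction is inexact, so the
* running sum never returns to exactly zero once the input does, whereas the implementation
* under test must produce exact zeros there.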
*/ zerohandlingerrors = ae_false; ae_vector_set_length(&x, 10, _state); x.ptr.p_double[0] = ae_sqrt((double)(2), _state); x.ptr.p_double[1] = ae_sqrt((double)(3), _state); x.ptr.p_double[2] = ae_sqrt((double)(5), _state); x.ptr.p_double[3] = ae_sqrt((double)(6), _state); x.ptr.p_double[4] = ae_sqrt((double)(7), _state); x.ptr.p_double[5] = (double)(0); x.ptr.p_double[6] = (double)(0); x.ptr.p_double[7] = (double)(0); x.ptr.p_double[8] = (double)(0); x.ptr.p_double[9] = (double)(0); filtersma(&x, 10, 3, _state); zerohandlingerrors = zerohandlingerrors||ae_fp_greater(ae_fabs(x.ptr.p_double[0]-ae_sqrt((double)(2), _state), _state),threshold); zerohandlingerrors = zerohandlingerrors||ae_fp_greater(ae_fabs(x.ptr.p_double[1]-(ae_sqrt((double)(2), _state)+ae_sqrt((double)(3), _state))/2, _state),threshold); zerohandlingerrors = zerohandlingerrors||ae_fp_greater(ae_fabs(x.ptr.p_double[2]-(ae_sqrt((double)(2), _state)+ae_sqrt((double)(3), _state)+ae_sqrt((double)(5), _state))/3, _state),threshold); zerohandlingerrors = zerohandlingerrors||ae_fp_greater(ae_fabs(x.ptr.p_double[3]-(ae_sqrt((double)(3), _state)+ae_sqrt((double)(5), _state)+ae_sqrt((double)(6), _state))/3, _state),threshold); zerohandlingerrors = zerohandlingerrors||ae_fp_greater(ae_fabs(x.ptr.p_double[4]-(ae_sqrt((double)(5), _state)+ae_sqrt((double)(6), _state)+ae_sqrt((double)(7), _state))/3, _state),threshold); zerohandlingerrors = zerohandlingerrors||ae_fp_greater(ae_fabs(x.ptr.p_double[5]-(ae_sqrt((double)(6), _state)+ae_sqrt((double)(7), _state))/3, _state),threshold); zerohandlingerrors = zerohandlingerrors||ae_fp_greater(ae_fabs(x.ptr.p_double[6]-ae_sqrt((double)(7), _state)/3, _state),threshold); zerohandlingerrors = zerohandlingerrors||ae_fp_neq(x.ptr.p_double[7],(double)(0)); zerohandlingerrors = zerohandlingerrors||ae_fp_neq(x.ptr.p_double[8],(double)(0)); zerohandlingerrors = zerohandlingerrors||ae_fp_neq(x.ptr.p_double[9],(double)(0)); x.ptr.p_double[0] = (double)(0); x.ptr.p_double[1] = (double)(0); x.ptr.p_double[2] = (double)(0); x.ptr.p_double[3] = (double)(0); x.ptr.p_double[4] = (double)(0); x.ptr.p_double[5] = ae_sqrt((double)(2), _state); x.ptr.p_double[6] = ae_sqrt((double)(3), _state); x.ptr.p_double[7] = ae_sqrt((double)(5), _state); x.ptr.p_double[8] = ae_sqrt((double)(6), _state); x.ptr.p_double[9] = ae_sqrt((double)(7), _state); filtersma(&x, 10, 3, _state); zerohandlingerrors = zerohandlingerrors||ae_fp_neq(x.ptr.p_double[0],(double)(0)); zerohandlingerrors = zerohandlingerrors||ae_fp_neq(x.ptr.p_double[1],(double)(0)); zerohandlingerrors = zerohandlingerrors||ae_fp_neq(x.ptr.p_double[2],(double)(0)); zerohandlingerrors = zerohandlingerrors||ae_fp_neq(x.ptr.p_double[3],(double)(0)); zerohandlingerrors = zerohandlingerrors||ae_fp_neq(x.ptr.p_double[4],(double)(0)); zerohandlingerrors = zerohandlingerrors||ae_fp_greater(ae_fabs(x.ptr.p_double[5]-ae_sqrt((double)(2), _state)/3, _state),threshold); zerohandlingerrors = zerohandlingerrors||ae_fp_greater(ae_fabs(x.ptr.p_double[6]-(ae_sqrt((double)(2), _state)+ae_sqrt((double)(3), _state))/3, _state),threshold); zerohandlingerrors = zerohandlingerrors||ae_fp_greater(ae_fabs(x.ptr.p_double[7]-(ae_sqrt((double)(2), _state)+ae_sqrt((double)(3), _state)+ae_sqrt((double)(5), _state))/3, _state),threshold); zerohandlingerrors = zerohandlingerrors||ae_fp_greater(ae_fabs(x.ptr.p_double[8]-(ae_sqrt((double)(3), _state)+ae_sqrt((double)(5), _state)+ae_sqrt((double)(6), _state))/3, _state),threshold); zerohandlingerrors = 
zerohandlingerrors||ae_fp_greater(ae_fabs(x.ptr.p_double[9]-(ae_sqrt((double)(5), _state)+ae_sqrt((double)(6), _state)+ae_sqrt((double)(7), _state))/3, _state),threshold); /* * Final result */ result = precomputederrors||zerohandlingerrors; ae_frame_leave(_state); return result; } /************************************************************************* This function tests EMA(alpha) filter. It returns True on error. Additional IsSilent parameter controls detailed error reporting. *************************************************************************/ ae_bool testema(ae_bool issilent, ae_state *_state) { ae_frame _frame_block; ae_vector x; ae_bool precomputederrors; ae_bool result; ae_frame_make(_state, &_frame_block); ae_vector_init(&x, 0, DT_REAL, _state); if( !issilent ) { printf("EMA(alpha) TEST\n"); } /* * Test several pre-computed problems. * * NOTE: tests below rely on the fact that floating point * additions and subtractions are exact when dealing * with integer values. */ precomputederrors = ae_false; ae_vector_set_length(&x, 1, _state); x.ptr.p_double[0] = (double)(7); filterema(&x, 1, 1.0, _state); precomputederrors = precomputederrors||ae_fp_neq(x.ptr.p_double[0],(double)(7)); filterema(&x, 1, 0.5, _state); precomputederrors = precomputederrors||ae_fp_neq(x.ptr.p_double[0],(double)(7)); ae_vector_set_length(&x, 3, _state); x.ptr.p_double[0] = (double)(7); x.ptr.p_double[1] = (double)(8); x.ptr.p_double[2] = (double)(9); filterema(&x, 3, 1.0, _state); precomputederrors = ((precomputederrors||ae_fp_neq(x.ptr.p_double[0],(double)(7)))||ae_fp_neq(x.ptr.p_double[1],(double)(8)))||ae_fp_neq(x.ptr.p_double[2],(double)(9)); filterema(&x, 3, 0.5, _state); precomputederrors = ((precomputederrors||ae_fp_neq(x.ptr.p_double[0],(double)(7)))||ae_fp_neq(x.ptr.p_double[1],7.5))||ae_fp_neq(x.ptr.p_double[2],8.25); /* * Final result */ result = precomputederrors; ae_frame_leave(_state); return result; } /************************************************************************* This function tests LRMA(k) filter. It returns True on error. Additional IsSilent parameter controls detailed error reporting. 
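LRMA(k) here denotes a moving average built on linear regression: each output element is the value, at the current position, of a least squares straight line fitted to the last k input elements (shorter windows are used while fewer than k elements are available). For k=1 and k=2 such a line reproduces the points exactly, which is why the first checks below expect the data to pass through unchanged; for k=3 the expected values can be verified by hand, e.g. the line fitted to (0,7), (1,8), (2,8) takes the value 8.1666... at the last point, matching the precomputed answer.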
*************************************************************************/ ae_bool testlrma(ae_bool issilent, ae_state *_state) { ae_frame _frame_block; ae_vector x; ae_bool precomputederrors; double threshold; ae_bool result; ae_frame_make(_state, &_frame_block); ae_vector_init(&x, 0, DT_REAL, _state); threshold = 1000*ae_machineepsilon; if( !issilent ) { printf("LRMA(K) TEST\n"); } precomputederrors = ae_false; /* * First, check that filter does not changes points for K=1 or K=2 */ ae_vector_set_length(&x, 1, _state); x.ptr.p_double[0] = (double)(7); filterlrma(&x, 1, 1, _state); precomputederrors = precomputederrors||ae_fp_neq(x.ptr.p_double[0],(double)(7)); ae_vector_set_length(&x, 6, _state); x.ptr.p_double[0] = (double)(7); x.ptr.p_double[1] = (double)(8); x.ptr.p_double[2] = (double)(9); x.ptr.p_double[3] = (double)(10); x.ptr.p_double[4] = (double)(11); x.ptr.p_double[5] = (double)(12); filterlrma(&x, 6, 1, _state); precomputederrors = (((((precomputederrors||ae_fp_neq(x.ptr.p_double[0],(double)(7)))||ae_fp_neq(x.ptr.p_double[1],(double)(8)))||ae_fp_neq(x.ptr.p_double[2],(double)(9)))||ae_fp_neq(x.ptr.p_double[3],(double)(10)))||ae_fp_neq(x.ptr.p_double[4],(double)(11)))||ae_fp_neq(x.ptr.p_double[5],(double)(12)); filterlrma(&x, 6, 2, _state); precomputederrors = (((((precomputederrors||ae_fp_neq(x.ptr.p_double[0],(double)(7)))||ae_fp_neq(x.ptr.p_double[1],(double)(8)))||ae_fp_neq(x.ptr.p_double[2],(double)(9)))||ae_fp_neq(x.ptr.p_double[3],(double)(10)))||ae_fp_neq(x.ptr.p_double[4],(double)(11)))||ae_fp_neq(x.ptr.p_double[5],(double)(12)); /* * Check several precomputed problems */ ae_vector_set_length(&x, 6, _state); x.ptr.p_double[0] = (double)(7); x.ptr.p_double[1] = (double)(8); x.ptr.p_double[2] = (double)(9); x.ptr.p_double[3] = (double)(10); x.ptr.p_double[4] = (double)(11); x.ptr.p_double[5] = (double)(12); filterlrma(&x, 6, 3, _state); precomputederrors = precomputederrors||ae_fp_greater(ae_fabs(x.ptr.p_double[0]-7, _state),threshold); precomputederrors = precomputederrors||ae_fp_greater(ae_fabs(x.ptr.p_double[1]-8, _state),threshold); precomputederrors = precomputederrors||ae_fp_greater(ae_fabs(x.ptr.p_double[2]-9, _state),threshold); precomputederrors = precomputederrors||ae_fp_greater(ae_fabs(x.ptr.p_double[3]-10, _state),threshold); precomputederrors = precomputederrors||ae_fp_greater(ae_fabs(x.ptr.p_double[4]-11, _state),threshold); precomputederrors = precomputederrors||ae_fp_greater(ae_fabs(x.ptr.p_double[5]-12, _state),threshold); ae_vector_set_length(&x, 6, _state); x.ptr.p_double[0] = (double)(7); x.ptr.p_double[1] = (double)(8); x.ptr.p_double[2] = (double)(8); x.ptr.p_double[3] = (double)(9); x.ptr.p_double[4] = (double)(12); x.ptr.p_double[5] = (double)(12); filterlrma(&x, 6, 3, _state); precomputederrors = precomputederrors||ae_fp_greater(ae_fabs(x.ptr.p_double[0]-7.0000000000, _state),1.0E-5); precomputederrors = precomputederrors||ae_fp_greater(ae_fabs(x.ptr.p_double[1]-8.0000000000, _state),1.0E-5); precomputederrors = precomputederrors||ae_fp_greater(ae_fabs(x.ptr.p_double[2]-8.1666666667, _state),1.0E-5); precomputederrors = precomputederrors||ae_fp_greater(ae_fabs(x.ptr.p_double[3]-8.8333333333, _state),1.0E-5); precomputederrors = precomputederrors||ae_fp_greater(ae_fabs(x.ptr.p_double[4]-11.6666666667, _state),1.0E-5); precomputederrors = precomputederrors||ae_fp_greater(ae_fabs(x.ptr.p_double[5]-12.5000000000, _state),1.0E-5); /* * Final result */ result = precomputederrors; ae_frame_leave(_state); return result; } static void 
testmcpdunit_testsimple(ae_bool* err, ae_state *_state); static void testmcpdunit_testentryexit(ae_bool* err, ae_state *_state); static void testmcpdunit_testec(ae_bool* err, ae_state *_state); static void testmcpdunit_testbc(ae_bool* err, ae_state *_state); static void testmcpdunit_testlc(ae_bool* err, ae_state *_state); static void testmcpdunit_createee(ae_int_t n, ae_int_t entrystate, ae_int_t exitstate, mcpdstate* s, ae_state *_state); ae_bool testmcpd(ae_bool silent, ae_state *_state) { ae_bool waserrors; ae_bool simpleerrors; ae_bool entryexiterrors; ae_bool ecerrors; ae_bool bcerrors; ae_bool lcerrors; ae_bool othererrors; ae_bool result; /* * Init */ waserrors = ae_false; othererrors = ae_false; simpleerrors = ae_false; entryexiterrors = ae_false; ecerrors = ae_false; bcerrors = ae_false; lcerrors = ae_false; /* * Test */ testmcpdunit_testsimple(&simpleerrors, _state); testmcpdunit_testentryexit(&entryexiterrors, _state); testmcpdunit_testec(&ecerrors, _state); testmcpdunit_testbc(&bcerrors, _state); testmcpdunit_testlc(&lcerrors, _state); /* * Final report */ waserrors = ((((othererrors||simpleerrors)||entryexiterrors)||ecerrors)||bcerrors)||lcerrors; if( !silent ) { printf("MCPD TEST\n"); printf("TOTAL RESULTS: "); if( !waserrors ) { printf("OK\n"); } else { printf("FAILED\n"); } printf("* SIMPLE: "); if( !simpleerrors ) { printf("OK\n"); } else { printf("FAILED\n"); } printf("* ENTRY/EXIT: "); if( !entryexiterrors ) { printf("OK\n"); } else { printf("FAILED\n"); } printf("* EQUALITY CONSTRAINTS: "); if( !ecerrors ) { printf("OK\n"); } else { printf("FAILED\n"); } printf("* BOUND CONSTRAINTS: "); if( !bcerrors ) { printf("OK\n"); } else { printf("FAILED\n"); } printf("* LINEAR CONSTRAINTS: "); if( !lcerrors ) { printf("OK\n"); } else { printf("FAILED\n"); } printf("* OTHER PROPERTIES: "); if( !othererrors ) { printf("OK\n"); } else { printf("FAILED\n"); } if( waserrors ) { printf("TEST SUMMARY: FAILED\n"); } else { printf("TEST SUMMARY: PASSED\n"); } printf("\n\n"); } result = !waserrors; return result; } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. *************************************************************************/ ae_bool _pexec_testmcpd(ae_bool silent, ae_state *_state) { return testmcpd(silent, _state); } /************************************************************************* Simple test with no "entry"/"exit" states On failure sets Err to True (leaves it unchanged otherwise) *************************************************************************/ static void testmcpdunit_testsimple(ae_bool* err, ae_state *_state) { ae_frame _frame_block; ae_int_t n; ae_matrix pexact; ae_matrix xy; double threshold; ae_int_t i; ae_int_t j; double v; double v0; ae_matrix p; mcpdstate s; mcpdreport rep; double offdiagonal; ae_frame_make(_state, &_frame_block); ae_matrix_init(&pexact, 0, 0, DT_REAL, _state); ae_matrix_init(&xy, 0, 0, DT_REAL, _state); ae_matrix_init(&p, 0, 0, DT_REAL, _state); _mcpdstate_init(&s, _state); _mcpdreport_init(&rep, _state); threshold = 1.0E-2; /* * First test: * * N-dimensional problem * * proportional data * * no "entry"/"exit" states * * N tracks, each includes only two states * * first record in I-th track is [0 ... 1 ... 
0] with 1 in the I-th position * * all tracks are modelled using randomly generated transition matrix P */ for(n=1; n<=5; n++) { /* * Initialize "exact" P: * * fill by random values * * make sure that each column sums to non-zero value * * normalize */ ae_matrix_set_length(&pexact, n, n, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { pexact.ptr.pp_double[i][j] = ae_randomreal(_state); } } for(j=0; j<=n-1; j++) { i = ae_randominteger(n, _state); pexact.ptr.pp_double[i][j] = pexact.ptr.pp_double[i][j]+0.1; } for(j=0; j<=n-1; j++) { v = (double)(0); for(i=0; i<=n-1; i++) { v = v+pexact.ptr.pp_double[i][j]; } for(i=0; i<=n-1; i++) { pexact.ptr.pp_double[i][j] = pexact.ptr.pp_double[i][j]/v; } } /* * Initialize solver: * * create object * * add tracks */ mcpdcreate(n, &s, _state); for(i=0; i<=n-1; i++) { ae_matrix_set_length(&xy, 2, n, _state); for(j=0; j<=n-1; j++) { xy.ptr.pp_double[0][j] = (double)(0); } xy.ptr.pp_double[0][i] = (double)(1); for(j=0; j<=n-1; j++) { xy.ptr.pp_double[1][j] = pexact.ptr.pp_double[j][i]; } mcpdaddtrack(&s, &xy, 2, _state); } /* * Solve and test */ mcpdsolve(&s, _state); mcpdresults(&s, &p, &rep, _state); if( rep.terminationtype>0 ) { for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { *err = *err||ae_fp_greater(ae_fabs(p.ptr.pp_double[i][j]-pexact.ptr.pp_double[i][j], _state),threshold); } } } else { *err = ae_true; } } /* * Second test: * * N-dimensional problem * * proportional data * * no "entry"/"exit" states * * N tracks, each includes only two states * * first record in I-th track is [0 ...0.1 0.8 0.1 ... 0] with 0.8 in the I-th position * * all tracks are modelled using randomly generated transition matrix P */ offdiagonal = 0.1; for(n=1; n<=5; n++) { /* * Initialize "exact" P: * * fill by random values * * make sure that each column sums to non-zero value * * normalize */ ae_matrix_set_length(&pexact, n, n, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { pexact.ptr.pp_double[i][j] = ae_randomreal(_state); } } for(j=0; j<=n-1; j++) { i = ae_randominteger(n, _state); pexact.ptr.pp_double[i][j] = pexact.ptr.pp_double[i][j]+0.1; } for(j=0; j<=n-1; j++) { v = (double)(0); for(i=0; i<=n-1; i++) { v = v+pexact.ptr.pp_double[i][j]; } for(i=0; i<=n-1; i++) { pexact.ptr.pp_double[i][j] = pexact.ptr.pp_double[i][j]/v; } } /* * Initialize solver: * * create object * * add tracks */ mcpdcreate(n, &s, _state); for(i=0; i<=n-1; i++) { ae_matrix_set_length(&xy, 2, n, _state); for(j=0; j<=n-1; j++) { xy.ptr.pp_double[0][j] = (double)(0); } /* * "main" element */ xy.ptr.pp_double[0][i] = 1.0-2*offdiagonal; for(j=0; j<=n-1; j++) { xy.ptr.pp_double[1][j] = (1.0-2*offdiagonal)*pexact.ptr.pp_double[j][i]; } /* * off-diagonal ones */ if( i>0 ) { xy.ptr.pp_double[0][i-1] = offdiagonal; for(j=0; j<=n-1; j++) { xy.ptr.pp_double[1][j] = xy.ptr.pp_double[1][j]+offdiagonal*pexact.ptr.pp_double[j][i-1]; } } if( i<n-1 ) { /* off-diagonal element at I+1 (mirrors the I-1 case above) */ xy.ptr.pp_double[0][i+1] = offdiagonal; for(j=0; j<=n-1; j++) { xy.ptr.pp_double[1][j] = xy.ptr.pp_double[1][j]+offdiagonal*pexact.ptr.pp_double[j][i+1]; } } mcpdaddtrack(&s, &xy, 2, _state); } /* * Solve and test */ mcpdsolve(&s, _state); mcpdresults(&s, &p, &rep, _state); if( rep.terminationtype>0 ) { for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { *err = *err||ae_fp_greater(ae_fabs(p.ptr.pp_double[i][j]-pexact.ptr.pp_double[i][j], _state),threshold); } } } else { *err = ae_true; } } /* * Third test: * * N-dimensional problem * * population data * * no "entry"/"exit" states * * N tracks, each includes only two states * * first record in I-th track is V*[0 ...0.1 0.8 0.1 ... 
0] with 0.8 in the I-th position, V in [1,10] * * all tracks are modelled using randomly generated transition matrix P */ offdiagonal = 0.1; for(n=1; n<=5; n++) { /* * Initialize "exact" P: * * fill by random values * * make sure that each column sums to non-zero value * * normalize */ ae_matrix_set_length(&pexact, n, n, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { pexact.ptr.pp_double[i][j] = ae_randomreal(_state); } } for(j=0; j<=n-1; j++) { i = ae_randominteger(n, _state); pexact.ptr.pp_double[i][j] = pexact.ptr.pp_double[i][j]+0.1; } for(j=0; j<=n-1; j++) { v = (double)(0); for(i=0; i<=n-1; i++) { v = v+pexact.ptr.pp_double[i][j]; } for(i=0; i<=n-1; i++) { pexact.ptr.pp_double[i][j] = pexact.ptr.pp_double[i][j]/v; } } /* * Initialize solver: * * create object * * add tracks */ mcpdcreate(n, &s, _state); for(i=0; i<=n-1; i++) { ae_matrix_set_length(&xy, 2, n, _state); for(j=0; j<=n-1; j++) { xy.ptr.pp_double[0][j] = (double)(0); } /* * "main" element */ v0 = 9*ae_randomreal(_state)+1; xy.ptr.pp_double[0][i] = v0*(1.0-2*offdiagonal); for(j=0; j<=n-1; j++) { xy.ptr.pp_double[1][j] = v0*(1.0-2*offdiagonal)*pexact.ptr.pp_double[j][i]; } /* * off-diagonal ones */ if( i>0 ) { xy.ptr.pp_double[0][i-1] = v0*offdiagonal; for(j=0; j<=n-1; j++) { xy.ptr.pp_double[1][j] = xy.ptr.pp_double[1][j]+v0*offdiagonal*pexact.ptr.pp_double[j][i-1]; } } if( i<n-1 ) { /* off-diagonal element at I+1 (mirrors the I-1 case above) */ xy.ptr.pp_double[0][i+1] = v0*offdiagonal; for(j=0; j<=n-1; j++) { xy.ptr.pp_double[1][j] = xy.ptr.pp_double[1][j]+v0*offdiagonal*pexact.ptr.pp_double[j][i+1]; } } mcpdaddtrack(&s, &xy, 2, _state); } /* * Solve and test */ mcpdsolve(&s, _state); mcpdresults(&s, &p, &rep, _state); if( rep.terminationtype>0 ) { for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { *err = *err||ae_fp_greater(ae_fabs(p.ptr.pp_double[i][j]-pexact.ptr.pp_double[i][j], _state),threshold); } } } else { *err = ae_true; } } ae_frame_leave(_state); } /************************************************************************* Test for different combinations of "entry"/"exit" models On failure sets Err to True (leaves it unchanged otherwise) *************************************************************************/ static void testmcpdunit_testentryexit(ae_bool* err, ae_state *_state) { ae_frame _frame_block; ae_int_t n; ae_matrix p; ae_matrix pexact; ae_matrix xy; double threshold; ae_int_t entrystate; ae_int_t exitstate; ae_int_t entrykind; ae_int_t exitkind; ae_int_t popkind; ae_int_t i; ae_int_t j; ae_int_t k; double v; mcpdstate s; mcpdreport rep; ae_frame_make(_state, &_frame_block); ae_matrix_init(&p, 0, 0, DT_REAL, _state); ae_matrix_init(&pexact, 0, 0, DT_REAL, _state); ae_matrix_init(&xy, 0, 0, DT_REAL, _state); _mcpdstate_init(&s, _state); _mcpdreport_init(&rep, _state); threshold = 1.0E-3; /* * */ for(n=2; n<=5; n++) { for(entrykind=0; entrykind<=1; entrykind++) { for(exitkind=0; exitkind<=1; exitkind++) { for(popkind=0; popkind<=1; popkind++) { /* * Generate EntryState/ExitState such that one of the following is True: * * EntryState<>ExitState * * EntryState=-1 or ExitState=-1 */ do { if( entrykind==0 ) { entrystate = -1; } else { entrystate = ae_randominteger(n, _state); } if( exitkind==0 ) { exitstate = -1; } else { exitstate = ae_randominteger(n, _state); } } while(!((entrystate==-1||exitstate==-1)||entrystate!=exitstate)); /* * Generate transition matrix P such that: * * columns corresponding to non-exit states sum to 1.0 * * columns corresponding to exit states sum to 0.0 * * rows corresponding to entry states are zero */ ae_matrix_set_length(&pexact, n, n, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { pexact.ptr.pp_double[i][j] = (double)(1+ae_randominteger(5, _state)); if( i==entrystate ) { pexact.ptr.pp_double[i][j] = 0.0; } if( j==exitstate ) { pexact.ptr.pp_double[i][j] = 0.0; } } } for(j=0; j<=n-1; j++) { v = 0.0; for(i=0; i<=n-1; i++) { v = 
v+pexact.ptr.pp_double[i][j]; } if( ae_fp_neq(v,(double)(0)) ) { for(i=0; i<=n-1; i++) { pexact.ptr.pp_double[i][j] = pexact.ptr.pp_double[i][j]/v; } } } /* * Create MCPD solver */ if( entrystate<0&&exitstate<0 ) { mcpdcreate(n, &s, _state); } if( entrystate>=0&&exitstate<0 ) { mcpdcreateentry(n, entrystate, &s, _state); } if( entrystate<0&&exitstate>=0 ) { mcpdcreateexit(n, exitstate, &s, _state); } if( entrystate>=0&&exitstate>=0 ) { mcpdcreateentryexit(n, entrystate, exitstate, &s, _state); } /* * Add N tracks. * * K-th track starts from vector with large value of * K-th component and small random noise in other components. * * Track contains from 2 to 4 elements. * * Tracks contain proportional (normalized) or * population data, depending on PopKind variable. */ for(k=0; k<=n-1; k++) { /* * Generate track whose length is in 2..4 */ ae_matrix_set_length(&xy, 2+ae_randominteger(3, _state), n, _state); for(j=0; j<=n-1; j++) { xy.ptr.pp_double[0][j] = 0.05*ae_randomreal(_state); } xy.ptr.pp_double[0][k] = 1+ae_randomreal(_state); for(i=1; i<=xy.rows-1; i++) { for(j=0; j<=n-1; j++) { if( j!=entrystate ) { v = ae_v_dotproduct(&pexact.ptr.pp_double[j][0], 1, &xy.ptr.pp_double[i-1][0], 1, ae_v_len(0,n-1)); xy.ptr.pp_double[i][j] = v; } else { xy.ptr.pp_double[i][j] = ae_randomreal(_state); } } } /* * Normalize, if needed */ if( popkind==1 ) { for(i=0; i<=xy.rows-1; i++) { v = 0.0; for(j=0; j<=n-1; j++) { v = v+xy.ptr.pp_double[i][j]; } if( ae_fp_greater(v,(double)(0)) ) { for(j=0; j<=n-1; j++) { xy.ptr.pp_double[i][j] = xy.ptr.pp_double[i][j]/v; } } } } /* * Add track */ mcpdaddtrack(&s, &xy, xy.rows, _state); } /* * Solve and test */ mcpdsolve(&s, _state); mcpdresults(&s, &p, &rep, _state); if( rep.terminationtype>0 ) { for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { *err = *err||ae_fp_greater(ae_fabs(p.ptr.pp_double[i][j]-pexact.ptr.pp_double[i][j], _state),threshold); } } } else { *err = ae_true; } } } } } ae_frame_leave(_state); } /************************************************************************* Test equality constraints. 
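MCPD fits a transition matrix P such that consecutive records of every track satisfy x(t+1) ~ P*x(t), with each column of P corresponding to a non-exit state constrained to sum to 1. The checks below impose element-wise equality constraints P[i][j]=v through MCPDAddEC and MCPDSetEC and verify that a successful solution reproduces them exactly, while constraints which contradict the sum-to-one columns or the entry/exit structure must make the solver stop with termination code -3.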
On failure sets Err to True (leaves it unchanged otherwise) *************************************************************************/ static void testmcpdunit_testec(ae_bool* err, ae_state *_state) { ae_frame _frame_block; ae_int_t n; ae_matrix p; ae_matrix ec; ae_matrix xy; ae_int_t entrystate; ae_int_t exitstate; ae_int_t entrykind; ae_int_t exitkind; ae_int_t i; ae_int_t j; ae_int_t ic; ae_int_t jc; double vc; mcpdstate s; mcpdreport rep; ae_frame_make(_state, &_frame_block); ae_matrix_init(&p, 0, 0, DT_REAL, _state); ae_matrix_init(&ec, 0, 0, DT_REAL, _state); ae_matrix_init(&xy, 0, 0, DT_REAL, _state); _mcpdstate_init(&s, _state); _mcpdreport_init(&rep, _state); /* * We try different problems with following properties: * * N is large enough - we won't have problems with inconsistent constraints * * first state is either "entry" or "normal" * * last state is either "exit" or "normal" * * we have one long random track * * We test several properties which are described in comments below */ for(n=4; n<=6; n++) { for(entrykind=0; entrykind<=1; entrykind++) { for(exitkind=0; exitkind<=1; exitkind++) { /* * Prepare problem */ if( entrykind==0 ) { entrystate = -1; } else { entrystate = 0; } if( exitkind==0 ) { exitstate = -1; } else { exitstate = n-1; } ae_matrix_set_length(&xy, 2*n, n, _state); for(i=0; i<=xy.rows-1; i++) { for(j=0; j<=xy.cols-1; j++) { xy.ptr.pp_double[i][j] = ae_randomreal(_state); } } /* * Test that single equality constraint on non-entry * non-exit elements of P is satisfied. * * NOTE: this test needs N>=4 because smaller values * can give us inconsistent constraints */ ae_assert(n>=4, "TestEC: expectation failed", _state); ic = 1+ae_randominteger(n-2, _state); jc = 1+ae_randominteger(n-2, _state); vc = ae_randomreal(_state); testmcpdunit_createee(n, entrystate, exitstate, &s, _state); mcpdaddtrack(&s, &xy, xy.rows, _state); mcpdaddec(&s, ic, jc, vc, _state); mcpdsolve(&s, _state); mcpdresults(&s, &p, &rep, _state); if( rep.terminationtype>0 ) { *err = *err||ae_fp_neq(p.ptr.pp_double[ic][jc],vc); } else { *err = ae_true; } /* * Test interaction with default "sum-to-one" constraint * on columns of P. * * We set N-1 equality constraints on random non-exit column * of P, which are inconsistent with this default constraint * (sum will be greater that 1.0). * * Algorithm must detect inconsistency. * * NOTE: * 1. we do not set constraints for the first element of * the column, because this element may be constrained by * "exit state" constraint. * 2. this test needs N>=3 */ ae_assert(n>=3, "TestEC: expectation failed", _state); jc = ae_randominteger(n-1, _state); vc = 0.95; testmcpdunit_createee(n, entrystate, exitstate, &s, _state); mcpdaddtrack(&s, &xy, xy.rows, _state); for(i=1; i<=n-1; i++) { mcpdaddec(&s, i, jc, vc, _state); } mcpdsolve(&s, _state); mcpdresults(&s, &p, &rep, _state); *err = *err||rep.terminationtype!=-3; /* * Test interaction with constrains on entry states. * * When model has entry state, corresponding row of P * must be zero. 
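* (a zero row for the entry state means that the model itself never transfers mass into
* that state; in the generated tracks its value is filled in externally, so only a
* zero-valued equality constraint on elements of this row can be consistent)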
We try to set two kinds of constraints * on random element of this row: * * zero equality constraint, which must be consistent * * non-zero equality constraint, which must be inconsistent */ if( entrystate>=0 ) { jc = ae_randominteger(n, _state); testmcpdunit_createee(n, entrystate, exitstate, &s, _state); mcpdaddtrack(&s, &xy, xy.rows, _state); mcpdaddec(&s, entrystate, jc, 0.0, _state); mcpdsolve(&s, _state); mcpdresults(&s, &p, &rep, _state); *err = *err||rep.terminationtype<=0; testmcpdunit_createee(n, entrystate, exitstate, &s, _state); mcpdaddtrack(&s, &xy, xy.rows, _state); mcpdaddec(&s, entrystate, jc, 0.5, _state); mcpdsolve(&s, _state); mcpdresults(&s, &p, &rep, _state); *err = *err||rep.terminationtype!=-3; } /* * Test interaction with constrains on exit states. * * When model has exit state, corresponding column of P * must be zero. We try to set two kinds of constraints * on random element of this column: * * zero equality constraint, which must be consistent * * non-zero equality constraint, which must be inconsistent */ if( exitstate>=0 ) { ic = ae_randominteger(n, _state); testmcpdunit_createee(n, entrystate, exitstate, &s, _state); mcpdaddtrack(&s, &xy, xy.rows, _state); mcpdaddec(&s, ic, exitstate, 0.0, _state); mcpdsolve(&s, _state); mcpdresults(&s, &p, &rep, _state); *err = *err||rep.terminationtype<=0; testmcpdunit_createee(n, entrystate, exitstate, &s, _state); mcpdaddtrack(&s, &xy, xy.rows, _state); mcpdaddec(&s, ic, exitstate, 0.5, _state); mcpdsolve(&s, _state); mcpdresults(&s, &p, &rep, _state); *err = *err||rep.terminationtype!=-3; } /* * Test SetEC() call - we constrain subset of non-entry * non-exit elements and test it. */ ae_assert(n>=4, "TestEC: expectation failed", _state); ae_matrix_set_length(&ec, n, n, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { ec.ptr.pp_double[i][j] = _state->v_nan; } } for(j=1; j<=n-2; j++) { ec.ptr.pp_double[1+ae_randominteger(n-2, _state)][j] = 0.1+0.1*ae_randomreal(_state); } testmcpdunit_createee(n, entrystate, exitstate, &s, _state); mcpdaddtrack(&s, &xy, xy.rows, _state); mcpdsetec(&s, &ec, _state); mcpdsolve(&s, _state); mcpdresults(&s, &p, &rep, _state); if( rep.terminationtype>0 ) { for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { if( ae_isfinite(ec.ptr.pp_double[i][j], _state) ) { *err = *err||ae_fp_neq(p.ptr.pp_double[i][j],ec.ptr.pp_double[i][j]); } } } } else { *err = ae_true; } } } } ae_frame_leave(_state); } /************************************************************************* Test bound constraints. 
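Bound constraints are box constraints VL <= P[i][j] <= VU, set either one element at a time with MCPDAddBC or as whole matrices BndL/BndU with MCPDSetBC; an infinite bound leaves the corresponding side unconstrained. As with equality constraints, bounds which cannot be satisfied together with the sum-to-one columns or the entry/exit structure must be reported as inconsistent (termination code -3).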
On failure sets Err to True (leaves it unchanged otherwise) *************************************************************************/ static void testmcpdunit_testbc(ae_bool* err, ae_state *_state) { ae_frame _frame_block; ae_int_t n; ae_matrix p; ae_matrix bndl; ae_matrix bndu; ae_matrix xy; ae_int_t entrystate; ae_int_t exitstate; ae_int_t entrykind; ae_int_t exitkind; ae_int_t i; ae_int_t j; ae_int_t ic; ae_int_t jc; double vl; double vu; mcpdstate s; mcpdreport rep; ae_frame_make(_state, &_frame_block); ae_matrix_init(&p, 0, 0, DT_REAL, _state); ae_matrix_init(&bndl, 0, 0, DT_REAL, _state); ae_matrix_init(&bndu, 0, 0, DT_REAL, _state); ae_matrix_init(&xy, 0, 0, DT_REAL, _state); _mcpdstate_init(&s, _state); _mcpdreport_init(&rep, _state); /* * We try different problems with following properties: * * N is large enough - we won't have problems with inconsistent constraints * * first state is either "entry" or "normal" * * last state is either "exit" or "normal" * * we have one long random track * * We test several properties which are described in comments below */ for(n=4; n<=6; n++) { for(entrykind=0; entrykind<=1; entrykind++) { for(exitkind=0; exitkind<=1; exitkind++) { /* * Prepare problem */ if( entrykind==0 ) { entrystate = -1; } else { entrystate = 0; } if( exitkind==0 ) { exitstate = -1; } else { exitstate = n-1; } ae_matrix_set_length(&xy, 2*n, n, _state); for(i=0; i<=xy.rows-1; i++) { for(j=0; j<=xy.cols-1; j++) { xy.ptr.pp_double[i][j] = ae_randomreal(_state); } } /* * Test that single bound constraint on non-entry * non-exit elements of P is satisfied. * * NOTE 1: this test needs N>=4 because smaller values * can give us inconsistent constraints */ ae_assert(n>=4, "TestBC: expectation failed", _state); ic = 1+ae_randominteger(n-2, _state); jc = 1+ae_randominteger(n-2, _state); if( ae_fp_greater(ae_randomreal(_state),0.5) ) { vl = 0.3*ae_randomreal(_state); } else { vl = _state->v_neginf; } if( ae_fp_greater(ae_randomreal(_state),0.5) ) { vu = 0.5+0.3*ae_randomreal(_state); } else { vu = _state->v_posinf; } testmcpdunit_createee(n, entrystate, exitstate, &s, _state); mcpdaddtrack(&s, &xy, xy.rows, _state); mcpdaddbc(&s, ic, jc, vl, vu, _state); mcpdsolve(&s, _state); mcpdresults(&s, &p, &rep, _state); if( rep.terminationtype>0 ) { *err = *err||ae_fp_less(p.ptr.pp_double[ic][jc],vl); *err = *err||ae_fp_greater(p.ptr.pp_double[ic][jc],vu); } else { *err = ae_true; } /* * Test interaction with default "sum-to-one" constraint * on columns of P. * * We set N-1 bound constraints on random non-exit column * of P, which are inconsistent with this default constraint * (sum will be greater that 1.0). * * Algorithm must detect inconsistency. * * NOTE: * 1. we do not set constraints for the first element of * the column, because this element may be constrained by * "exit state" constraint. * 2. this test needs N>=3 */ ae_assert(n>=3, "TestEC: expectation failed", _state); jc = ae_randominteger(n-1, _state); vl = 0.85; vu = 0.95; testmcpdunit_createee(n, entrystate, exitstate, &s, _state); mcpdaddtrack(&s, &xy, xy.rows, _state); for(i=1; i<=n-1; i++) { mcpdaddbc(&s, i, jc, vl, vu, _state); } mcpdsolve(&s, _state); mcpdresults(&s, &p, &rep, _state); *err = *err||rep.terminationtype!=-3; /* * Test interaction with constrains on entry states. * * When model has entry state, corresponding row of P * must be zero. 
We try to set two kinds of constraints * on random element of this row: * * bound constraint with zero lower bound, which must be consistent * * bound constraint with non-zero lower bound, which must be inconsistent */ if( entrystate>=0 ) { jc = ae_randominteger(n, _state); testmcpdunit_createee(n, entrystate, exitstate, &s, _state); mcpdaddtrack(&s, &xy, xy.rows, _state); mcpdaddbc(&s, entrystate, jc, 0.0, 1.0, _state); mcpdsolve(&s, _state); mcpdresults(&s, &p, &rep, _state); *err = *err||rep.terminationtype<=0; testmcpdunit_createee(n, entrystate, exitstate, &s, _state); mcpdaddtrack(&s, &xy, xy.rows, _state); mcpdaddbc(&s, entrystate, jc, 0.5, 1.0, _state); mcpdsolve(&s, _state); mcpdresults(&s, &p, &rep, _state); *err = *err||rep.terminationtype!=-3; } /* * Test interaction with constrains on exit states. * * When model has exit state, corresponding column of P * must be zero. We try to set two kinds of constraints * on random element of this column: * * bound constraint with zero lower bound, which must be consistent * * bound constraint with non-zero lower bound, which must be inconsistent */ if( exitstate>=0 ) { ic = ae_randominteger(n, _state); testmcpdunit_createee(n, entrystate, exitstate, &s, _state); mcpdaddtrack(&s, &xy, xy.rows, _state); mcpdaddbc(&s, ic, exitstate, 0.0, 1.0, _state); mcpdsolve(&s, _state); mcpdresults(&s, &p, &rep, _state); *err = *err||rep.terminationtype<=0; testmcpdunit_createee(n, entrystate, exitstate, &s, _state); mcpdaddtrack(&s, &xy, xy.rows, _state); mcpdaddbc(&s, ic, exitstate, 0.5, 1.0, _state); mcpdsolve(&s, _state); mcpdresults(&s, &p, &rep, _state); *err = *err||rep.terminationtype!=-3; } /* * Test SetBC() call - we constrain subset of non-entry * non-exit elements and test it. */ ae_assert(n>=4, "TestBC: expectation failed", _state); ae_matrix_set_length(&bndl, n, n, _state); ae_matrix_set_length(&bndu, n, n, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { bndl.ptr.pp_double[i][j] = _state->v_neginf; bndu.ptr.pp_double[i][j] = _state->v_posinf; } } for(j=1; j<=n-2; j++) { i = 1+ae_randominteger(n-2, _state); bndl.ptr.pp_double[i][j] = 0.5-0.1*ae_randomreal(_state); bndu.ptr.pp_double[i][j] = 0.5+0.1*ae_randomreal(_state); } testmcpdunit_createee(n, entrystate, exitstate, &s, _state); mcpdaddtrack(&s, &xy, xy.rows, _state); mcpdsetbc(&s, &bndl, &bndu, _state); mcpdsolve(&s, _state); mcpdresults(&s, &p, &rep, _state); if( rep.terminationtype>0 ) { for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { *err = *err||ae_fp_less(p.ptr.pp_double[i][j],bndl.ptr.pp_double[i][j]); *err = *err||ae_fp_greater(p.ptr.pp_double[i][j],bndu.ptr.pp_double[i][j]); } } } else { *err = ae_true; } } } } ae_frame_leave(_state); } /************************************************************************* Test bound constraints. 
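This block exercises the general linear constraints set with MCPDSetLC: constraint k is stored as row k of a K x (N*N+1) matrix C, whose first N*N entries are coefficients applied to P[i][j] at position i*N+j and whose last entry is the right part, while CT[k] selects the relation, -1 for less-or-equal, 0 for equality, +1 for greater-or-equal.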
On failure sets Err to True (leaves it unchanged otherwise) *************************************************************************/ static void testmcpdunit_testlc(ae_bool* err, ae_state *_state) { ae_frame _frame_block; ae_int_t n; ae_matrix p; ae_matrix c; ae_matrix xy; ae_vector ct; ae_int_t entrystate; ae_int_t exitstate; ae_int_t entrykind; ae_int_t exitkind; ae_int_t i; ae_int_t j; ae_int_t k; ae_int_t t; ae_int_t jc; double v; double threshold; mcpdstate s; mcpdreport rep; ae_frame_make(_state, &_frame_block); ae_matrix_init(&p, 0, 0, DT_REAL, _state); ae_matrix_init(&c, 0, 0, DT_REAL, _state); ae_matrix_init(&xy, 0, 0, DT_REAL, _state); ae_vector_init(&ct, 0, DT_INT, _state); _mcpdstate_init(&s, _state); _mcpdreport_init(&rep, _state); threshold = 1.0E5*ae_machineepsilon; /* * We try different problems with following properties: * * N is large enough - we won't have problems with inconsistent constraints * * first state is either "entry" or "normal" * * last state is either "exit" or "normal" * * we have one long random track * * We test several properties which are described in comments below */ for(n=4; n<=6; n++) { for(entrykind=0; entrykind<=1; entrykind++) { for(exitkind=0; exitkind<=1; exitkind++) { /* * Prepare problem */ if( entrykind==0 ) { entrystate = -1; } else { entrystate = 0; } if( exitkind==0 ) { exitstate = -1; } else { exitstate = n-1; } ae_matrix_set_length(&xy, 2*n, n, _state); for(i=0; i<=xy.rows-1; i++) { for(j=0; j<=xy.cols-1; j++) { xy.ptr.pp_double[i][j] = ae_randomreal(_state); } } /* * Test that single linear equality/inequality constraint * on non-entry non-exit elements of P is satisfied. * * NOTE 1: this test needs N>=4 because smaller values * can give us inconsistent constraints * NOTE 2: Constraints are generated is such a way that P=(1/N ... 1/N) * is always feasible. It guarantees that there always exists * at least one feasible point * NOTE 3: If we have inequality constraint, we "shift" right part * in order to make feasible some neighborhood of P=(1/N ... 1/N). */ ae_assert(n>=4, "TestLC: expectation failed", _state); ae_matrix_set_length(&c, 1, n*n+1, _state); ae_vector_set_length(&ct, 1, _state); v = (double)(0); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { if( ((i==0||i==n-1)||j==0)||j==n-1 ) { c.ptr.pp_double[0][i*n+j] = (double)(0); } else { c.ptr.pp_double[0][i*n+j] = ae_randomreal(_state); v = v+c.ptr.pp_double[0][i*n+j]*((double)1/(double)n); } } } c.ptr.pp_double[0][n*n] = v; ct.ptr.p_int[0] = ae_randominteger(3, _state)-1; if( ct.ptr.p_int[0]<0 ) { c.ptr.pp_double[0][n*n] = c.ptr.pp_double[0][n*n]+0.1; } if( ct.ptr.p_int[0]>0 ) { c.ptr.pp_double[0][n*n] = c.ptr.pp_double[0][n*n]-0.1; } testmcpdunit_createee(n, entrystate, exitstate, &s, _state); mcpdaddtrack(&s, &xy, xy.rows, _state); mcpdsetlc(&s, &c, &ct, 1, _state); mcpdsolve(&s, _state); mcpdresults(&s, &p, &rep, _state); if( rep.terminationtype>0 ) { v = (double)(0); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { v = v+p.ptr.pp_double[i][j]*c.ptr.pp_double[0][i*n+j]; } } if( ct.ptr.p_int[0]<0 ) { *err = *err||ae_fp_greater_eq(v,c.ptr.pp_double[0][n*n]+threshold); } if( ct.ptr.p_int[0]==0 ) { *err = *err||ae_fp_greater_eq(ae_fabs(v-c.ptr.pp_double[0][n*n], _state),threshold); } if( ct.ptr.p_int[0]>0 ) { *err = *err||ae_fp_less_eq(v,c.ptr.pp_double[0][n*n]-threshold); } } else { *err = ae_true; } /* * Test interaction with default "sum-to-one" constraint * on columns of P. * * We set linear constraint which has for "sum-to-X" on * on random non-exit column of P. 
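* (in coefficient form such a "sum-to-X" constraint has C[0][N*i+jc]=1 for every row i of P,
* zero coefficients elsewhere, right part X and type code 0, i.e. an equality)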
This constraint can be * either consistent (X=1.0) or inconsistent (X<>1.0) with * this default constraint. * * Algorithm must detect inconsistency. * * NOTE: * 1. this test needs N>=2 */ ae_assert(n>=2, "TestLC: expectation failed", _state); jc = ae_randominteger(n-1, _state); ae_matrix_set_length(&c, 1, n*n+1, _state); ae_vector_set_length(&ct, 1, _state); for(i=0; i<=n*n-1; i++) { c.ptr.pp_double[0][i] = 0.0; } for(i=0; i<=n-1; i++) { c.ptr.pp_double[0][n*i+jc] = 1.0; } c.ptr.pp_double[0][n*n] = 1.0; ct.ptr.p_int[0] = 0; testmcpdunit_createee(n, entrystate, exitstate, &s, _state); mcpdaddtrack(&s, &xy, xy.rows, _state); mcpdsetlc(&s, &c, &ct, 1, _state); mcpdsolve(&s, _state); mcpdresults(&s, &p, &rep, _state); *err = *err||rep.terminationtype<=0; c.ptr.pp_double[0][n*n] = 2.0; testmcpdunit_createee(n, entrystate, exitstate, &s, _state); mcpdaddtrack(&s, &xy, xy.rows, _state); mcpdsetlc(&s, &c, &ct, 1, _state); mcpdsolve(&s, _state); mcpdresults(&s, &p, &rep, _state); *err = *err||rep.terminationtype!=-3; /* * Test interaction with constrains on entry states. * * When model has entry state, corresponding row of P * must be zero. We try to set two kinds of constraints * on elements of this row: * * sums-to-zero constraint, which must be consistent * * sums-to-one constraint, which must be inconsistent */ if( entrystate>=0 ) { ae_matrix_set_length(&c, 1, n*n+1, _state); ae_vector_set_length(&ct, 1, _state); for(i=0; i<=n*n-1; i++) { c.ptr.pp_double[0][i] = 0.0; } for(j=0; j<=n-1; j++) { c.ptr.pp_double[0][n*entrystate+j] = 1.0; } ct.ptr.p_int[0] = 0; c.ptr.pp_double[0][n*n] = 0.0; testmcpdunit_createee(n, entrystate, exitstate, &s, _state); mcpdaddtrack(&s, &xy, xy.rows, _state); mcpdsetlc(&s, &c, &ct, 1, _state); mcpdsolve(&s, _state); mcpdresults(&s, &p, &rep, _state); *err = *err||rep.terminationtype<=0; c.ptr.pp_double[0][n*n] = 1.0; testmcpdunit_createee(n, entrystate, exitstate, &s, _state); mcpdaddtrack(&s, &xy, xy.rows, _state); mcpdsetlc(&s, &c, &ct, 1, _state); mcpdsolve(&s, _state); mcpdresults(&s, &p, &rep, _state); *err = *err||rep.terminationtype!=-3; } /* * Test interaction with constrains on exit states. * * When model has exit state, corresponding column of P * must be zero. We try to set two kinds of constraints * on elements of this column: * * sums-to-zero constraint, which must be consistent * * sums-to-one constraint, which must be inconsistent */ if( exitstate>=0 ) { ae_matrix_set_length(&c, 1, n*n+1, _state); ae_vector_set_length(&ct, 1, _state); for(i=0; i<=n*n-1; i++) { c.ptr.pp_double[0][i] = 0.0; } for(i=0; i<=n-1; i++) { c.ptr.pp_double[0][n*i+exitstate] = 1.0; } ct.ptr.p_int[0] = 0; c.ptr.pp_double[0][n*n] = 0.0; testmcpdunit_createee(n, entrystate, exitstate, &s, _state); mcpdaddtrack(&s, &xy, xy.rows, _state); mcpdsetlc(&s, &c, &ct, 1, _state); mcpdsolve(&s, _state); mcpdresults(&s, &p, &rep, _state); *err = *err||rep.terminationtype<=0; c.ptr.pp_double[0][n*n] = 1.0; testmcpdunit_createee(n, entrystate, exitstate, &s, _state); mcpdaddtrack(&s, &xy, xy.rows, _state); mcpdsetlc(&s, &c, &ct, 1, _state); mcpdsolve(&s, _state); mcpdresults(&s, &p, &rep, _state); *err = *err||rep.terminationtype!=-3; } } } } /* * Final test - we generate several random constraints and * test SetLC() function. * * NOTES: * * 1. Constraints are generated is such a way that P=(1/N ... 1/N) * is always feasible. It guarantees that there always exists * at least one feasible point * 2. 
For simplicity of the test we do not use entry/exit states * in our model */ for(n=1; n<=4; n++) { for(k=1; k<=2*n; k++) { /* * Generate track */ ae_matrix_set_length(&xy, 2*n, n, _state); for(i=0; i<=xy.rows-1; i++) { for(j=0; j<=xy.cols-1; j++) { xy.ptr.pp_double[i][j] = ae_randomreal(_state); } } /* * Generate random constraints */ ae_matrix_set_length(&c, k, n*n+1, _state); ae_vector_set_length(&ct, k, _state); for(i=0; i<=k-1; i++) { /* * Generate constraint and its right part */ c.ptr.pp_double[i][n*n] = (double)(0); for(j=0; j<=n*n-1; j++) { c.ptr.pp_double[i][j] = 2*ae_randomreal(_state)-1; c.ptr.pp_double[i][n*n] = c.ptr.pp_double[i][n*n]+c.ptr.pp_double[i][j]*((double)1/(double)n); } ct.ptr.p_int[i] = ae_randominteger(3, _state)-1; /* * If we have inequality constraint, we "shift" right part * in order to make feasible some neighborhood of P=(1/N ... 1/N). */ if( ct.ptr.p_int[i]<0 ) { c.ptr.pp_double[i][n*n] = c.ptr.pp_double[i][n*n]+0.1; } if( ct.ptr.p_int[i]>0 ) { c.ptr.pp_double[i][n*n] = c.ptr.pp_double[i][n*n]-0.1; } } /* * Test */ testmcpdunit_createee(n, -1, -1, &s, _state); mcpdaddtrack(&s, &xy, xy.rows, _state); mcpdsetlc(&s, &c, &ct, k, _state); mcpdsolve(&s, _state); mcpdresults(&s, &p, &rep, _state); if( rep.terminationtype>0 ) { for(t=0; t<=k-1; t++) { v = (double)(0); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { v = v+p.ptr.pp_double[i][j]*c.ptr.pp_double[t][i*n+j]; } } if( ct.ptr.p_int[t]<0 ) { *err = *err||ae_fp_greater_eq(v,c.ptr.pp_double[t][n*n]+threshold); } if( ct.ptr.p_int[t]==0 ) { *err = *err||ae_fp_greater_eq(ae_fabs(v-c.ptr.pp_double[t][n*n], _state),threshold); } if( ct.ptr.p_int[t]>0 ) { *err = *err||ae_fp_less_eq(v,c.ptr.pp_double[t][n*n]-threshold); } } } else { *err = ae_true; } } } ae_frame_leave(_state); } /************************************************************************* This function is used to create MCPD object with arbitrary combination of entry and exit states *************************************************************************/ static void testmcpdunit_createee(ae_int_t n, ae_int_t entrystate, ae_int_t exitstate, mcpdstate* s, ae_state *_state) { _mcpdstate_clear(s); if( entrystate<0&&exitstate<0 ) { mcpdcreate(n, s, _state); } if( entrystate>=0&&exitstate<0 ) { mcpdcreateentry(n, entrystate, s, _state); } if( entrystate<0&&exitstate>=0 ) { mcpdcreateexit(n, exitstate, s, _state); } if( entrystate>=0&&exitstate>=0 ) { mcpdcreateentryexit(n, entrystate, exitstate, s, _state); } } static void testmlpeunit_createensemble(mlpensemble* ensemble, ae_int_t nkind, double a1, double a2, ae_int_t nin, ae_int_t nhid1, ae_int_t nhid2, ae_int_t nout, ae_int_t ec, ae_state *_state); static void testmlpeunit_unsetensemble(mlpensemble* ensemble, ae_state *_state); static void testmlpeunit_testinformational(ae_int_t nkind, ae_int_t nin, ae_int_t nhid1, ae_int_t nhid2, ae_int_t nout, ae_int_t ec, ae_int_t passcount, ae_bool* err, ae_state *_state); static void testmlpeunit_testprocessing(ae_int_t nkind, ae_int_t nin, ae_int_t nhid1, ae_int_t nhid2, ae_int_t nout, ae_int_t ec, ae_int_t passcount, ae_bool* err, ae_state *_state); static void testmlpeunit_testerr(ae_int_t nkind, ae_int_t nin, ae_int_t nhid1, ae_int_t nhid2, ae_int_t nout, ae_int_t ec, ae_int_t passcount, ae_int_t sizemin, ae_int_t sizemax, ae_bool* err, ae_state *_state); ae_bool testmlpe(ae_bool silent, ae_state *_state) { ae_bool waserrors; ae_bool inferrors; ae_bool procerrors; ae_bool errerrors; ae_int_t passcount; ae_int_t maxn; ae_int_t maxhid; ae_int_t nf; ae_int_t nl; 
ae_int_t nhid1; ae_int_t nhid2; ae_int_t ec; ae_int_t nkind; ae_int_t sizemin; ae_int_t sizemax; ae_bool result; waserrors = ae_false; inferrors = ae_false; procerrors = ae_false; errerrors = ae_false; passcount = 5; maxn = 3; maxhid = 3; /* * General MLP ensembles tests * These tests are performed with small dataset, whose size is in [0,10]. * We test correctness of functions on small sets, but do not test code * which splits large dataset into smaller chunks. */ sizemin = 0; sizemax = 10; for(nf=1; nf<=maxn; nf++) { for(nl=1; nl<=maxn; nl++) { for(nhid1=0; nhid1<=maxhid; nhid1++) { for(nhid2=0; nhid2<=maxhid; nhid2++) { for(nkind=0; nkind<=3; nkind++) { for(ec=1; ec<=3; ec++) { /* * Skip meaningless parameters combinations */ if( nkind==1&&nl<2 ) { continue; } if( nhid1==0&&nhid2!=0 ) { continue; } /* * Tests */ testmlpeunit_testinformational(nkind, nf, nhid1, nhid2, nl, ec, passcount, &inferrors, _state); testmlpeunit_testprocessing(nkind, nf, nhid1, nhid2, nl, ec, passcount, &procerrors, _state); testmlpeunit_testerr(nkind, nf, nhid1, nhid2, nl, ec, passcount, sizemin, sizemax, &errerrors, _state); } } } } } } /* * Special tests on large datasets: test ability to correctly split * work into smaller chunks. */ nf = 2; nhid1 = 10; nhid2 = 10; nl = 2; ec = 10; sizemin = 1000; sizemax = 1000; testmlpeunit_testerr(0, nf, nhid1, nhid2, nl, ec, 1, sizemin, sizemax, &errerrors, _state); /* * Final report */ waserrors = (inferrors||procerrors)||errerrors; if( !silent ) { printf("MLP ENSEMBLE TEST\n"); printf("INFORMATIONAL FUNCTIONS: "); if( !inferrors ) { printf("OK\n"); } else { printf("FAILED\n"); } printf("BASIC PROCESSING: "); if( !procerrors ) { printf("OK\n"); } else { printf("FAILED\n"); } printf("ERROR FUNCTIONS: "); if( !errerrors ) { printf("OK\n"); } else { printf("FAILED\n"); } if( waserrors ) { printf("TEST SUMMARY: FAILED\n"); } else { printf("TEST SUMMARY: PASSED\n"); } printf("\n\n"); } result = !waserrors; return result; } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. 
*************************************************************************/ ae_bool _pexec_testmlpe(ae_bool silent, ae_state *_state) { return testmlpe(silent, _state); } /************************************************************************* Network creation *************************************************************************/ static void testmlpeunit_createensemble(mlpensemble* ensemble, ae_int_t nkind, double a1, double a2, ae_int_t nin, ae_int_t nhid1, ae_int_t nhid2, ae_int_t nout, ae_int_t ec, ae_state *_state) { ae_assert(((nin>0&&nhid1>=0)&&nhid2>=0)&&nout>0, "CreateNetwork error", _state); ae_assert(nhid1!=0||nhid2==0, "CreateNetwork error", _state); ae_assert(nkind!=1||nout>=2, "CreateNetwork error", _state); if( nhid1==0 ) { /* * No hidden layers */ if( nkind==0 ) { mlpecreate0(nin, nout, ec, ensemble, _state); } else { if( nkind==1 ) { mlpecreatec0(nin, nout, ec, ensemble, _state); } else { if( nkind==2 ) { mlpecreateb0(nin, nout, a1, a2, ec, ensemble, _state); } else { if( nkind==3 ) { mlpecreater0(nin, nout, a1, a2, ec, ensemble, _state); } } } } return; } if( nhid2==0 ) { /* * One hidden layer */ if( nkind==0 ) { mlpecreate1(nin, nhid1, nout, ec, ensemble, _state); } else { if( nkind==1 ) { mlpecreatec1(nin, nhid1, nout, ec, ensemble, _state); } else { if( nkind==2 ) { mlpecreateb1(nin, nhid1, nout, a1, a2, ec, ensemble, _state); } else { if( nkind==3 ) { mlpecreater1(nin, nhid1, nout, a1, a2, ec, ensemble, _state); } } } } return; } /* * Two hidden layers */ if( nkind==0 ) { mlpecreate2(nin, nhid1, nhid2, nout, ec, ensemble, _state); } else { if( nkind==1 ) { mlpecreatec2(nin, nhid1, nhid2, nout, ec, ensemble, _state); } else { if( nkind==2 ) { mlpecreateb2(nin, nhid1, nhid2, nout, a1, a2, ec, ensemble, _state); } else { if( nkind==3 ) { mlpecreater2(nin, nhid1, nhid2, nout, a1, a2, ec, ensemble, _state); } } } } } /************************************************************************* Unsets network (initialize it to smallest network possible *************************************************************************/ static void testmlpeunit_unsetensemble(mlpensemble* ensemble, ae_state *_state) { mlpecreate0(1, 1, 1, ensemble, _state); } /************************************************************************* Iformational functions test *************************************************************************/ static void testmlpeunit_testinformational(ae_int_t nkind, ae_int_t nin, ae_int_t nhid1, ae_int_t nhid2, ae_int_t nout, ae_int_t ec, ae_int_t passcount, ae_bool* err, ae_state *_state) { ae_frame _frame_block; mlpensemble ensemble; ae_int_t n1; ae_int_t n2; ae_frame_make(_state, &_frame_block); _mlpensemble_init(&ensemble, _state); testmlpeunit_createensemble(&ensemble, nkind, -1.0, 1.0, nin, nhid1, nhid2, nout, ec, _state); mlpeproperties(&ensemble, &n1, &n2, _state); *err = (*err||n1!=nin)||n2!=nout; ae_frame_leave(_state); } /************************************************************************* Processing functions test *************************************************************************/ static void testmlpeunit_testprocessing(ae_int_t nkind, ae_int_t nin, ae_int_t nhid1, ae_int_t nhid2, ae_int_t nout, ae_int_t ec, ae_int_t passcount, ae_bool* err, ae_state *_state) { ae_frame _frame_block; mlpensemble ensemble; mlpensemble ensemble2; double a1; double a2; ae_int_t pass; ae_int_t rkind; ae_int_t i; ae_bool allsame; ae_vector x1; ae_vector x2; ae_vector y1; ae_vector y2; ae_vector ra; ae_vector ra2; double v; ae_frame_make(_state, 
&_frame_block); _mlpensemble_init(&ensemble, _state); _mlpensemble_init(&ensemble2, _state); ae_vector_init(&x1, 0, DT_REAL, _state); ae_vector_init(&x2, 0, DT_REAL, _state); ae_vector_init(&y1, 0, DT_REAL, _state); ae_vector_init(&y2, 0, DT_REAL, _state); ae_vector_init(&ra, 0, DT_REAL, _state); ae_vector_init(&ra2, 0, DT_REAL, _state); /* * Prepare network */ a1 = (double)(0); a2 = (double)(0); if( nkind==2 ) { a1 = 1000*ae_randomreal(_state)-500; a2 = 2*ae_randomreal(_state)-1; } if( nkind==3 ) { a1 = 1000*ae_randomreal(_state)-500; a2 = a1+(2*ae_randominteger(2, _state)-1)*(0.1+0.9*ae_randomreal(_state)); } /* * Initialize arrays */ ae_vector_set_length(&x1, nin-1+1, _state); ae_vector_set_length(&x2, nin-1+1, _state); ae_vector_set_length(&y1, nout-1+1, _state); ae_vector_set_length(&y2, nout-1+1, _state); /* * Main cycle: * * Pass is a number of repeated test * * RKind is a "replication kind": * * RKind=0 means that we work with original ensemble * * RKind=1 means that we work with replica created with MLPECopy() * * RKind=2 means that we work with replica created with serialization/unserialization */ for(pass=1; pass<=passcount; pass++) { for(rkind=0; rkind<=2; rkind++) { /* * Create network, pass through replication in order to test that replicated network works correctly. */ testmlpeunit_createensemble(&ensemble, nkind, a1, a2, nin, nhid1, nhid2, nout, ec, _state); if( rkind==1 ) { mlpecopy(&ensemble, &ensemble2, _state); testmlpeunit_unsetensemble(&ensemble, _state); mlpecopy(&ensemble2, &ensemble, _state); testmlpeunit_unsetensemble(&ensemble2, _state); } if( rkind==2 ) { { /* * This code passes data structure through serializers * (serializes it to string and loads back) */ ae_serializer _local_serializer; ae_int_t _local_ssize; ae_frame _local_frame_block; ae_dyn_block _local_dynamic_block; ae_frame_make(_state, &_local_frame_block); ae_serializer_init(&_local_serializer); ae_serializer_alloc_start(&_local_serializer); mlpealloc(&_local_serializer, &ensemble, _state); _local_ssize = ae_serializer_get_alloc_size(&_local_serializer); ae_db_malloc(&_local_dynamic_block, _local_ssize+1, _state, ae_true); ae_serializer_sstart_str(&_local_serializer, (char*)_local_dynamic_block.ptr); mlpeserialize(&_local_serializer, &ensemble, _state); ae_serializer_stop(&_local_serializer, _state); ae_serializer_clear(&_local_serializer); ae_serializer_init(&_local_serializer); ae_serializer_ustart_str(&_local_serializer, (char*)_local_dynamic_block.ptr); mlpeunserialize(&_local_serializer, &ensemble2, _state); ae_serializer_stop(&_local_serializer, _state); ae_serializer_clear(&_local_serializer); ae_frame_leave(_state); } testmlpeunit_unsetensemble(&ensemble, _state); { /* * This code passes data structure through serializers * (serializes it to string and loads back) */ ae_serializer _local_serializer; ae_int_t _local_ssize; ae_frame _local_frame_block; ae_dyn_block _local_dynamic_block; ae_frame_make(_state, &_local_frame_block); ae_serializer_init(&_local_serializer); ae_serializer_alloc_start(&_local_serializer); mlpealloc(&_local_serializer, &ensemble2, _state); _local_ssize = ae_serializer_get_alloc_size(&_local_serializer); ae_db_malloc(&_local_dynamic_block, _local_ssize+1, _state, ae_true); ae_serializer_sstart_str(&_local_serializer, (char*)_local_dynamic_block.ptr); mlpeserialize(&_local_serializer, &ensemble2, _state); ae_serializer_stop(&_local_serializer, _state); ae_serializer_clear(&_local_serializer); ae_serializer_init(&_local_serializer); 
ae_serializer_ustart_str(&_local_serializer, (char*)_local_dynamic_block.ptr); mlpeunserialize(&_local_serializer, &ensemble, _state); ae_serializer_stop(&_local_serializer, _state); ae_serializer_clear(&_local_serializer); ae_frame_leave(_state); } testmlpeunit_unsetensemble(&ensemble2, _state); } /* * Same inputs leads to same outputs */ for(i=0; i<=nin-1; i++) { x1.ptr.p_double[i] = 2*ae_randomreal(_state)-1; x2.ptr.p_double[i] = x1.ptr.p_double[i]; } for(i=0; i<=nout-1; i++) { y1.ptr.p_double[i] = 2*ae_randomreal(_state)-1; y2.ptr.p_double[i] = 2*ae_randomreal(_state)-1; } mlpeprocess(&ensemble, &x1, &y1, _state); mlpeprocess(&ensemble, &x2, &y2, _state); allsame = ae_true; for(i=0; i<=nout-1; i++) { allsame = allsame&&ae_fp_eq(y1.ptr.p_double[i],y2.ptr.p_double[i]); } *err = *err||!allsame; /* * Same inputs on original network leads to same outputs * on copy created using MLPCopy */ testmlpeunit_unsetensemble(&ensemble2, _state); mlpecopy(&ensemble, &ensemble2, _state); for(i=0; i<=nin-1; i++) { x1.ptr.p_double[i] = 2*ae_randomreal(_state)-1; x2.ptr.p_double[i] = x1.ptr.p_double[i]; } for(i=0; i<=nout-1; i++) { y1.ptr.p_double[i] = 2*ae_randomreal(_state)-1; y2.ptr.p_double[i] = 2*ae_randomreal(_state)-1; } mlpeprocess(&ensemble, &x1, &y1, _state); mlpeprocess(&ensemble2, &x2, &y2, _state); allsame = ae_true; for(i=0; i<=nout-1; i++) { allsame = allsame&&ae_fp_eq(y1.ptr.p_double[i],y2.ptr.p_double[i]); } *err = *err||!allsame; /* * Same inputs on original network leads to same outputs * on copy created using MLPSerialize */ { /* * This code passes data structure through serializers * (serializes it to string and loads back) */ ae_serializer _local_serializer; ae_int_t _local_ssize; ae_frame _local_frame_block; ae_dyn_block _local_dynamic_block; ae_frame_make(_state, &_local_frame_block); ae_serializer_init(&_local_serializer); ae_serializer_alloc_start(&_local_serializer); mlpealloc(&_local_serializer, &ensemble, _state); _local_ssize = ae_serializer_get_alloc_size(&_local_serializer); ae_db_malloc(&_local_dynamic_block, _local_ssize+1, _state, ae_true); ae_serializer_sstart_str(&_local_serializer, (char*)_local_dynamic_block.ptr); mlpeserialize(&_local_serializer, &ensemble, _state); ae_serializer_stop(&_local_serializer, _state); ae_serializer_clear(&_local_serializer); ae_serializer_init(&_local_serializer); ae_serializer_ustart_str(&_local_serializer, (char*)_local_dynamic_block.ptr); mlpeunserialize(&_local_serializer, &ensemble2, _state); ae_serializer_stop(&_local_serializer, _state); ae_serializer_clear(&_local_serializer); ae_frame_leave(_state); } for(i=0; i<=nin-1; i++) { x1.ptr.p_double[i] = 2*ae_randomreal(_state)-1; x2.ptr.p_double[i] = x1.ptr.p_double[i]; } for(i=0; i<=nout-1; i++) { y1.ptr.p_double[i] = 2*ae_randomreal(_state)-1; y2.ptr.p_double[i] = 2*ae_randomreal(_state)-1; } mlpeprocess(&ensemble, &x1, &y1, _state); mlpeprocess(&ensemble2, &x2, &y2, _state); allsame = ae_true; for(i=0; i<=nout-1; i++) { allsame = allsame&&ae_fp_eq(y1.ptr.p_double[i],y2.ptr.p_double[i]); } *err = *err||!allsame; /* * Different inputs leads to different outputs (non-zero network) */ for(i=0; i<=nin-1; i++) { x1.ptr.p_double[i] = 2*ae_randomreal(_state)-1; x2.ptr.p_double[i] = 2*ae_randomreal(_state)-1; } for(i=0; i<=nout-1; i++) { y1.ptr.p_double[i] = 2*ae_randomreal(_state)-1; y2.ptr.p_double[i] = y1.ptr.p_double[i]; } mlpeprocess(&ensemble, &x1, &y1, _state); mlpeprocess(&ensemble, &x2, &y2, _state); allsame = ae_true; for(i=0; i<=nout-1; i++) { allsame = 
allsame&&ae_fp_eq(y1.ptr.p_double[i],y2.ptr.p_double[i]); } *err = *err||allsame; /* * Randomization changes outputs (when inputs are unchanged, non-zero network) */ for(i=0; i<=nin-1; i++) { x1.ptr.p_double[i] = 2*ae_randomreal(_state)-1; x2.ptr.p_double[i] = 2*ae_randomreal(_state)-1; } for(i=0; i<=nout-1; i++) { y1.ptr.p_double[i] = 2*ae_randomreal(_state)-1; y2.ptr.p_double[i] = y1.ptr.p_double[i]; } mlpecopy(&ensemble, &ensemble2, _state); mlperandomize(&ensemble2, _state); mlpeprocess(&ensemble, &x1, &y1, _state); mlpeprocess(&ensemble2, &x1, &y2, _state); allsame = ae_true; for(i=0; i<=nout-1; i++) { allsame = allsame&&ae_fp_eq(y1.ptr.p_double[i],y2.ptr.p_double[i]); } *err = *err||allsame; /* * Normalization properties */ if( nkind==1 ) { /* * Classifier network outputs are normalized */ for(i=0; i<=nin-1; i++) { x1.ptr.p_double[i] = 2*ae_randomreal(_state)-1; } mlpeprocess(&ensemble, &x1, &y1, _state); v = (double)(0); for(i=0; i<=nout-1; i++) { v = v+y1.ptr.p_double[i]; *err = *err||ae_fp_less(y1.ptr.p_double[i],(double)(0)); } *err = *err||ae_fp_greater(ae_fabs(v-1, _state),1000*ae_machineepsilon); } if( nkind==2 ) { /* * B-type network outputs are bounded from above/below */ for(i=0; i<=nin-1; i++) { x1.ptr.p_double[i] = 2*ae_randomreal(_state)-1; } mlpeprocess(&ensemble, &x1, &y1, _state); for(i=0; i<=nout-1; i++) { if( ae_fp_greater_eq(a2,(double)(0)) ) { *err = *err||ae_fp_less(y1.ptr.p_double[i],a1); } else { *err = *err||ae_fp_greater(y1.ptr.p_double[i],a1); } } } if( nkind==3 ) { /* * R-type network outputs are within [A1,A2] (or [A2,A1]) */ for(i=0; i<=nin-1; i++) { x1.ptr.p_double[i] = 2*ae_randomreal(_state)-1; } mlpeprocess(&ensemble, &x1, &y1, _state); for(i=0; i<=nout-1; i++) { *err = (*err||ae_fp_less(y1.ptr.p_double[i],ae_minreal(a1, a2, _state)))||ae_fp_greater(y1.ptr.p_double[i],ae_maxreal(a1, a2, _state)); } } } } ae_frame_leave(_state); } /************************************************************************* Error functions Ensemble of type NKind is created, with NIn inputs, NHid1*NHid2 hidden layers (one layer if NHid2=0), NOut outputs. PassCount random passes is performed. Dataset has random size in [SizeMin,SizeMax]. 
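NOTE: reference error values are computed by direct summation over the dataset
      and compared against mlpermserror(), mlpeavgce(), mlpeavgerror() and
      mlpeavgrelerror(). In effect:
      * RMS error     = sqrt( sum((y-t)^2) / (SSize*NOut) )
      * average CE    = sum( log2(1/y[class]) ) / SSize      (classifiers only)
      * average error = sum( |y-t| ) / (SSize*NOut)
      * avg.rel.error = sum( |y-t|/|t| ) / (number of nonzero targets)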
*************************************************************************/ static void testmlpeunit_testerr(ae_int_t nkind, ae_int_t nin, ae_int_t nhid1, ae_int_t nhid2, ae_int_t nout, ae_int_t ec, ae_int_t passcount, ae_int_t sizemin, ae_int_t sizemax, ae_bool* err, ae_state *_state) { ae_frame _frame_block; mlpensemble ensemble; sparsematrix sparsexy; sparsematrix sparsexy2; ae_int_t n1; ae_int_t n2; ae_int_t wcount; double etol; double escale; double a1; double a2; ae_int_t pass; ae_int_t i; ae_int_t j; ae_int_t ssize; ae_matrix xy; ae_matrix xy2; ae_vector y; ae_vector x1; ae_vector y1; ae_vector idx; ae_vector dummy; double refrmserror; double refclserror; double refavgce; double refavgerror; double refavgrelerror; ae_int_t avgrelcnt; modelerrors allerrors; ae_int_t nnmax; ae_int_t dsmax; ae_frame_make(_state, &_frame_block); _mlpensemble_init(&ensemble, _state); _sparsematrix_init(&sparsexy, _state); _sparsematrix_init(&sparsexy2, _state); ae_matrix_init(&xy, 0, 0, DT_REAL, _state); ae_matrix_init(&xy2, 0, 0, DT_REAL, _state); ae_vector_init(&y, 0, DT_REAL, _state); ae_vector_init(&x1, 0, DT_REAL, _state); ae_vector_init(&y1, 0, DT_REAL, _state); ae_vector_init(&idx, 0, DT_INT, _state); ae_vector_init(&dummy, 0, DT_INT, _state); _modelerrors_init(&allerrors, _state); a1 = (double)(0); a2 = (double)(0); if( nkind==2 ) { a1 = 1000*ae_randomreal(_state)-500; a2 = 2*ae_randomreal(_state)-1; } if( nkind==3 ) { a1 = 1000*ae_randomreal(_state)-500; a2 = a1+(2*ae_randominteger(2, _state)-1)*(0.1+0.9*ae_randomreal(_state)); } testmlpeunit_createensemble(&ensemble, nkind, a1, a2, nin, nhid1, nhid2, nout, ec, _state); mlpproperties(&ensemble.network, &n1, &n2, &wcount, _state); etol = 1.0E-4; escale = 1.0E-2; /* * Initialize */ ae_vector_set_length(&x1, nin, _state); ae_vector_set_length(&y, nout, _state); ae_vector_set_length(&y1, nout, _state); /* * Process */ for(pass=1; pass<=passcount; pass++) { /* * Randomize Ensemble, then re-randomaze weights manually. * * NOTE: weights magnitude is chosen to be small, about 0.1, * which allows us to avoid oversaturated Ensemble. * In 10% of cases we use zero weights. */ mlperandomize(&ensemble, _state); if( ae_fp_less_eq(ae_randomreal(_state),0.1) ) { for(i=0; i<=wcount*ec-1; i++) { ensemble.weights.ptr.p_double[i] = 0.0; } } else { for(i=0; i<=wcount*ec-1; i++) { ensemble.weights.ptr.p_double[i] = 0.2*ae_randomreal(_state)-0.1; } } /* * Generate random dataset. * Calculate reference errors. 
* * NOTE: about 10% of tests are performed with zero SSize */ ssize = sizemin+ae_randominteger(sizemax-sizemin+1, _state); if( mlpeissoftmax(&ensemble, _state) ) { ae_matrix_set_length(&xy, ae_maxint(ssize, 1, _state), nin+1, _state); sparsecreate(ae_maxint(ssize, 1, _state), nin+1, 0, &sparsexy, _state); } else { ae_matrix_set_length(&xy, ae_maxint(ssize, 1, _state), nin+nout, _state); sparsecreate(ae_maxint(ssize, 1, _state), nin+nout, 0, &sparsexy, _state); } refrmserror = 0.0; refclserror = 0.0; refavgce = 0.0; refavgerror = 0.0; refavgrelerror = 0.0; avgrelcnt = 0; for(i=0; i<=ssize-1; i++) { /* * Fill I-th row */ for(j=0; j<=nin-1; j++) { x1.ptr.p_double[j] = 4*ae_randomreal(_state)-2; sparseset(&sparsexy, i, j, x1.ptr.p_double[j], _state); } ae_v_move(&xy.ptr.pp_double[i][0], 1, &x1.ptr.p_double[0], 1, ae_v_len(0,nin-1)); if( mlpeissoftmax(&ensemble, _state) ) { for(j=0; j<=nout-1; j++) { y1.ptr.p_double[j] = (double)(0); } xy.ptr.pp_double[i][nin] = (double)(ae_randominteger(nout, _state)); sparseset(&sparsexy, i, nin, xy.ptr.pp_double[i][nin], _state); y1.ptr.p_double[ae_round(xy.ptr.pp_double[i][nin], _state)] = (double)(1); } else { for(j=0; j<=nout-1; j++) { y1.ptr.p_double[j] = 4*ae_randomreal(_state)-2; sparseset(&sparsexy, i, nin+j, y1.ptr.p_double[j], _state); } ae_v_move(&xy.ptr.pp_double[i][nin], 1, &y1.ptr.p_double[0], 1, ae_v_len(nin,nin+nout-1)); } /* * Process */ mlpeprocess(&ensemble, &x1, &y, _state); /* * Update reference errors */ nnmax = 0; if( mlpeissoftmax(&ensemble, _state) ) { if( ae_fp_greater(y.ptr.p_double[ae_round(xy.ptr.pp_double[i][nin], _state)],(double)(0)) ) { refavgce = refavgce+ae_log(1/y.ptr.p_double[ae_round(xy.ptr.pp_double[i][nin], _state)], _state); } else { refavgce = refavgce+ae_log(ae_maxrealnumber, _state); } } if( mlpeissoftmax(&ensemble, _state) ) { dsmax = ae_round(xy.ptr.pp_double[i][nin], _state); } else { dsmax = 0; } for(j=0; j<=nout-1; j++) { refrmserror = refrmserror+ae_sqr(y.ptr.p_double[j]-y1.ptr.p_double[j], _state); refavgerror = refavgerror+ae_fabs(y.ptr.p_double[j]-y1.ptr.p_double[j], _state); if( ae_fp_neq(y1.ptr.p_double[j],(double)(0)) ) { refavgrelerror = refavgrelerror+ae_fabs(y.ptr.p_double[j]-y1.ptr.p_double[j], _state)/ae_fabs(y1.ptr.p_double[j], _state); avgrelcnt = avgrelcnt+1; } if( ae_fp_greater(y.ptr.p_double[j],y.ptr.p_double[nnmax]) ) { nnmax = j; } if( !mlpeissoftmax(&ensemble, _state)&&ae_fp_greater(y1.ptr.p_double[j],y1.ptr.p_double[dsmax]) ) { dsmax = j; } } if( nnmax!=dsmax ) { refclserror = refclserror+1; } } sparseconverttocrs(&sparsexy, _state); if( ssize>0 ) { refrmserror = ae_sqrt(refrmserror/(ssize*nout), _state); refavgerror = refavgerror/(ssize*nout); refavgce = refavgce/(ssize*ae_log((double)(2), _state)); } if( avgrelcnt>0 ) { refavgrelerror = refavgrelerror/avgrelcnt; } /* * Test "continuous" errors on full dataset */ seterrorflagdiff(err, mlpermserror(&ensemble, &xy, ssize, _state), refrmserror, etol, escale, _state); seterrorflagdiff(err, mlpeavgce(&ensemble, &xy, ssize, _state), refavgce, etol, escale, _state); seterrorflagdiff(err, mlpeavgerror(&ensemble, &xy, ssize, _state), refavgerror, etol, escale, _state); seterrorflagdiff(err, mlpeavgrelerror(&ensemble, &xy, ssize, _state), refavgrelerror, etol, escale, _state); } ae_frame_leave(_state); } static ae_bool testmlptrainunit_testmlptraines(ae_state *_state); static ae_bool testmlptrainunit_testmlptrainregr(ae_state *_state); static ae_bool testmlptrainunit_testmlpxorregr(ae_state *_state); static ae_bool 
testmlptrainunit_testmlptrainclass(ae_state *_state); static ae_bool testmlptrainunit_testmlpxorcls(ae_state *_state); static ae_bool testmlptrainunit_testmlpzeroweights(ae_state *_state); static ae_bool testmlptrainunit_testmlprestarts(ae_state *_state); static ae_bool testmlptrainunit_testmlpcverror(ae_state *_state); static ae_bool testmlptrainunit_testmlptrainens(ae_state *_state); static ae_bool testmlptrainunit_testmlptrainensregr(ae_state *_state); static ae_bool testmlptrainunit_testmlptrainenscls(ae_state *_state); ae_bool testmlptrain(ae_bool silent, ae_state *_state) { ae_frame _frame_block; ae_bool waserrors; ae_int_t info; multilayerperceptron network; multilayerperceptron network2; mlpreport rep; mlpcvreport cvrep; ae_matrix xy; ae_matrix valxy; ae_bool trnerrors; ae_bool mlpcverrorerr; ae_bool mlptrainregrerr; ae_bool mlptrainclasserr; ae_bool mlprestartserr; ae_bool mlpxorregrerr; ae_bool mlpxorclserr; ae_bool mlptrainenserrors; ae_bool result; ae_frame_make(_state, &_frame_block); _multilayerperceptron_init(&network, _state); _multilayerperceptron_init(&network2, _state); _mlpreport_init(&rep, _state); _mlpcvreport_init(&cvrep, _state); ae_matrix_init(&xy, 0, 0, DT_REAL, _state); ae_matrix_init(&valxy, 0, 0, DT_REAL, _state); waserrors = ae_false; trnerrors = ae_false; mlpcverrorerr = ae_false; mlptrainregrerr = ae_false; mlptrainclasserr = ae_false; mlprestartserr = ae_false; mlpxorregrerr = ae_false; mlpxorclserr = ae_false; mlptrainenserrors = ae_false; /* * Test network training on simple XOR problem */ ae_matrix_set_length(&xy, 3+1, 2+1, _state); xy.ptr.pp_double[0][0] = (double)(-1); xy.ptr.pp_double[0][1] = (double)(-1); xy.ptr.pp_double[0][2] = (double)(-1); xy.ptr.pp_double[1][0] = (double)(1); xy.ptr.pp_double[1][1] = (double)(-1); xy.ptr.pp_double[1][2] = (double)(1); xy.ptr.pp_double[2][0] = (double)(-1); xy.ptr.pp_double[2][1] = (double)(1); xy.ptr.pp_double[2][2] = (double)(1); xy.ptr.pp_double[3][0] = (double)(1); xy.ptr.pp_double[3][1] = (double)(1); xy.ptr.pp_double[3][2] = (double)(-1); mlpcreate1(2, 2, 1, &network, _state); mlptrainlm(&network, &xy, 4, 0.001, 10, &info, &rep, _state); trnerrors = trnerrors||ae_fp_greater(mlprmserror(&network, &xy, 4, _state),0.1); /* * Test early stopping */ trnerrors = trnerrors||testmlptrainunit_testmlptraines(_state); /* * Test for function MLPFoldCV() */ mlpcverrorerr = testmlptrainunit_testmlpcverror(_state); /* * Test for training functions */ mlptrainregrerr = testmlptrainunit_testmlptrainregr(_state)||testmlptrainunit_testmlpzeroweights(_state); mlptrainclasserr = testmlptrainunit_testmlptrainclass(_state); mlprestartserr = testmlptrainunit_testmlprestarts(_state); mlpxorregrerr = testmlptrainunit_testmlpxorregr(_state); mlpxorclserr = testmlptrainunit_testmlpxorcls(_state); /* * Training for ensembles */ mlptrainenserrors = (testmlptrainunit_testmlptrainens(_state)||testmlptrainunit_testmlptrainensregr(_state))||testmlptrainunit_testmlptrainenscls(_state); /* * Final report */ waserrors = ((((((trnerrors||mlptrainregrerr)||mlptrainclasserr)||mlprestartserr)||mlpxorregrerr)||mlpxorclserr)||mlpcverrorerr)||mlptrainenserrors; if( !silent ) { printf("MLP TEST\n"); printf("CROSS-VALIDATION ERRORS: "); if( !mlpcverrorerr ) { printf("OK\n"); } else { printf("FAILED\n"); } printf("TRAINING: "); if( !trnerrors ) { printf("OK\n"); } else { printf("FAILED\n"); } printf("TRAIN -LM -LBFGS FOR REGRESSION: "); if( mlptrainregrerr ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("TRAIN -LM -LBFGS FOR CLASSIFIER: "); 
if( mlptrainclasserr ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("PARAMETER RESTARTS IN TRAIN -LBFGS: "); if( mlprestartserr ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("TRAINIG WITH TRAINER FOR REGRESSION: "); if( mlpxorregrerr ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("TRAINIG WITH TRAINER FOR CLASSIFIER: "); if( mlpxorclserr ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("TRAINING ENSEMBLES: "); if( mlptrainenserrors ) { printf("FAILED\n"); } else { printf("OK\n"); } if( waserrors ) { printf("TEST SUMMARY: FAILED\n"); } else { printf("TEST SUMMARY: PASSED\n"); } printf("\n\n"); } result = !waserrors; ae_frame_leave(_state); return result; } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. *************************************************************************/ ae_bool _pexec_testmlptrain(ae_bool silent, ae_state *_state) { return testmlptrain(silent, _state); } /************************************************************************* This function tests MLPTrainES(). It returns True in case of errors, False when no errors were detected *************************************************************************/ static ae_bool testmlptrainunit_testmlptraines(ae_state *_state) { ae_frame _frame_block; ae_int_t pass; ae_int_t passcount; multilayerperceptron network; ae_matrix trnxy; ae_matrix valxy; ae_vector x; ae_vector y; ae_int_t n; ae_int_t i; ae_int_t j; ae_int_t nrestarts; ae_int_t info; mlpreport rep; ae_bool result; ae_frame_make(_state, &_frame_block); _multilayerperceptron_init(&network, _state); ae_matrix_init(&trnxy, 0, 0, DT_REAL, _state); ae_matrix_init(&valxy, 0, 0, DT_REAL, _state); ae_vector_init(&x, 0, DT_REAL, _state); ae_vector_init(&y, 0, DT_REAL, _state); _mlpreport_init(&rep, _state); result = ae_false; /* * First test checks that MLPTrainES() - when training set is equal to the validation * set, MLPTrainES() behaves just like a "normal" training algorithm. * * Test sequence: * * generate training set - 100 random points from 2D square [-1,+1]*[-1,+1] * * generate network with 2 inputs, no hidden layers, nonlinear output layer, * use its outputs as target values for the test set * * randomize network * * train with MLPTrainES(), using original set as both training and validation set * * trained network must reproduce training set with good precision * * NOTE: it is important to test algorithm on nonlinear network because linear * problems converge too fast. Slow convergence is important to detect * some kinds of bugs. * * NOTE: it is important to have NRestarts at least equal to 5, because with just * one restart algorithm fails test about once in several thousands of passes. 
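NOTE: the concrete acceptance criterion used below is
      sqrt(MLPError(Network,ValXY,N)/N)<=0.01, i.e. RMS error on the
      validation set must not exceed 0.01.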
*/ passcount = 10; nrestarts = 5; for(pass=1; pass<=passcount; pass++) { /* * Create network, generate training/validation sets */ mlpcreater0(2, 1, -2.0, 2.0, &network, _state); mlprandomize(&network, _state); n = 100; ae_matrix_set_length(&trnxy, n, 3, _state); ae_matrix_set_length(&valxy, n, 3, _state); ae_vector_set_length(&x, 2, _state); ae_vector_set_length(&y, 1, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=1; j++) { trnxy.ptr.pp_double[i][j] = 2*ae_randomreal(_state)-1; valxy.ptr.pp_double[i][j] = trnxy.ptr.pp_double[i][j]; x.ptr.p_double[j] = trnxy.ptr.pp_double[i][j]; } mlpprocess(&network, &x, &y, _state); trnxy.ptr.pp_double[i][2] = y.ptr.p_double[0]; valxy.ptr.pp_double[i][2] = y.ptr.p_double[0]; } mlprandomize(&network, _state); mlptraines(&network, &trnxy, n, &valxy, n, 0.0001, nrestarts, &info, &rep, _state); if( info<=0 ) { result = ae_true; ae_frame_leave(_state); return result; } if( ae_fp_greater(ae_sqrt(mlperror(&network, &valxy, n, _state)/n, _state),0.01) ) { result = ae_true; ae_frame_leave(_state); return result; } } ae_frame_leave(_state); return result; } /************************************************************************* This function tests MLPTrainLM, MLPTrainLBFGS and MLPTrainNetwork functions for regression. It check that train functions work correctly. Test use Create1 with 10 neurons. Test function is f(x,y)=X^2+cos(3*Pi*y). *************************************************************************/ static ae_bool testmlptrainunit_testmlptrainregr(ae_state *_state) { ae_frame _frame_block; multilayerperceptron net; mlptrainer trainer; mlpreport rep; ae_int_t info; ae_matrix xy; sparsematrix sm; ae_vector x; ae_vector y; ae_int_t n; ae_int_t sn; ae_int_t nneurons; double vdecay; double averr; double st; double eps; double traineps; ae_int_t nneedrest; ae_int_t trainits; ae_int_t shift; ae_int_t i; ae_int_t j; ae_int_t vtrain; ae_bool result; ae_frame_make(_state, &_frame_block); _multilayerperceptron_init(&net, _state); _mlptrainer_init(&trainer, _state); _mlpreport_init(&rep, _state); ae_matrix_init(&xy, 0, 0, DT_REAL, _state); _sparsematrix_init(&sm, _state); ae_vector_init(&x, 0, DT_REAL, _state); ae_vector_init(&y, 0, DT_REAL, _state); eps = 0.01; vdecay = 0.001; nneurons = 10; nneedrest = 5; traineps = 1.0E-3; trainits = 0; sn = 5; n = sn*sn; st = (double)2/(double)(sn-1); sparsecreate(n, 3, n*3, &sm, _state); ae_matrix_set_length(&xy, n, 3, _state); ae_vector_set_length(&x, 2, _state); for(vtrain=0; vtrain<=3; vtrain++) { averr = (double)(0); /* * Create a train set(uniformly distributed set of points). */ for(i=0; i<=sn-1; i++) { for(j=0; j<=sn-1; j++) { shift = i*sn+j; xy.ptr.pp_double[shift][0] = i*st-1.0; xy.ptr.pp_double[shift][1] = j*st-1.0; xy.ptr.pp_double[shift][2] = xy.ptr.pp_double[shift][0]*xy.ptr.pp_double[shift][0]+ae_cos(3*ae_pi*xy.ptr.pp_double[shift][1], _state); } } /* * Create and train a neural network */ mlpcreate1(2, nneurons, 1, &net, _state); if( vtrain==0 ) { mlptrainlm(&net, &xy, n, vdecay, nneedrest, &info, &rep, _state); } if( vtrain==1 ) { mlptrainlbfgs(&net, &xy, n, vdecay, nneedrest, traineps, trainits, &info, &rep, _state); } /* * Train with trainer, using: * * dense matrix; */ if( vtrain==2 ) { mlpcreatetrainer(2, 1, &trainer, _state); mlpsetdataset(&trainer, &xy, n, _state); mlpsetdecay(&trainer, vdecay, _state); mlpsetcond(&trainer, traineps, trainits, _state); mlptrainnetwork(&trainer, &net, nneedrest, &rep, _state); } /* * * sparse matrix. 
*/ if( vtrain==3 ) { for(i=0; i<=n-1; i++) { for(j=0; j<=2; j++) { sparseset(&sm, i, j, xy.ptr.pp_double[i][j], _state); } } mlpcreatetrainer(2, 1, &trainer, _state); mlpsetsparsedataset(&trainer, &sm, n, _state); mlpsetdecay(&trainer, vdecay, _state); mlpsetcond(&trainer, traineps, trainits, _state); mlptrainnetwork(&trainer, &net, nneedrest, &rep, _state); } /* * Check that network is trained correctly */ for(i=0; i<=n-1; i++) { x.ptr.p_double[0] = xy.ptr.pp_double[i][0]; x.ptr.p_double[1] = xy.ptr.pp_double[i][1]; mlpprocess(&net, &x, &y, _state); /* * Calculate average error */ averr = averr+ae_fabs(y.ptr.p_double[0]-xy.ptr.pp_double[i][2], _state); } if( ae_fp_greater(averr/n,eps) ) { result = ae_true; ae_frame_leave(_state); return result; } } result = ae_false; ae_frame_leave(_state); return result; } /************************************************************************* This function tests MLPTrainNetwork/MLPStartTraining/MLPContinueTraining functions for regression. It check that train functions work correctly. Test use Create1 with 2 neurons. Test function is XOR(x,y). *************************************************************************/ static ae_bool testmlptrainunit_testmlpxorregr(ae_state *_state) { ae_frame _frame_block; multilayerperceptron net; mlptrainer trainer; mlpreport rep; ae_matrix xy; sparsematrix sm; ae_vector x; ae_vector y; ae_int_t n; ae_int_t sn; ae_int_t nneurons; double vdecay; double averr; double eps; ae_int_t numxp; double traineps; ae_int_t nneedrest; ae_int_t trainits; ae_int_t shift; ae_int_t i; ae_int_t j; ae_int_t vtrain; ae_int_t xp; ae_bool result; ae_frame_make(_state, &_frame_block); _multilayerperceptron_init(&net, _state); _mlptrainer_init(&trainer, _state); _mlpreport_init(&rep, _state); ae_matrix_init(&xy, 0, 0, DT_REAL, _state); _sparsematrix_init(&sm, _state); ae_vector_init(&x, 0, DT_REAL, _state); ae_vector_init(&y, 0, DT_REAL, _state); eps = 0.01; numxp = 15; vdecay = 0.001; nneurons = 3; nneedrest = 1; traineps = 1.0E-4; trainits = 0; sn = 2; n = sn*sn; sparsecreate(n, 3, n*3, &sm, _state); ae_matrix_set_length(&xy, n, 3, _state); ae_vector_set_length(&x, 2, _state); for(xp=1; xp<=numxp; xp++) { for(vtrain=0; vtrain<=3; vtrain++) { averr = (double)(0); /* * Create a train set(uniformly distributed set of points). */ for(i=0; i<=sn-1; i++) { for(j=0; j<=sn-1; j++) { shift = i*sn+j; xy.ptr.pp_double[shift][0] = (double)(i); xy.ptr.pp_double[shift][1] = (double)(j); if( ae_fp_eq(xy.ptr.pp_double[shift][0],xy.ptr.pp_double[shift][1]) ) { xy.ptr.pp_double[shift][2] = (double)(0); } else { xy.ptr.pp_double[shift][2] = (double)(1); } } } /* * Create and train a neural network */ mlpcreate1(2, nneurons, 1, &net, _state); /* * Train with trainer, using: * * dense matrix; */ if( vtrain==0 ) { mlpcreatetrainer(2, 1, &trainer, _state); mlpsetdataset(&trainer, &xy, n, _state); mlpsetdecay(&trainer, vdecay, _state); mlpsetcond(&trainer, traineps, trainits, _state); mlptrainnetwork(&trainer, &net, nneedrest, &rep, _state); } if( vtrain==1 ) { mlpcreatetrainer(2, 1, &trainer, _state); mlpsetdataset(&trainer, &xy, n, _state); mlpsetdecay(&trainer, vdecay, _state); mlpsetcond(&trainer, traineps, trainits, _state); mlpstarttraining(&trainer, &net, ae_true, _state); while(mlpcontinuetraining(&trainer, &net, _state)) { } } /* * * sparse matrix. 
*/ if( vtrain==2 ) { for(i=0; i<=n-1; i++) { for(j=0; j<=2; j++) { sparseset(&sm, i, j, xy.ptr.pp_double[i][j], _state); } } mlpcreatetrainer(2, 1, &trainer, _state); mlpsetsparsedataset(&trainer, &sm, n, _state); mlpsetdecay(&trainer, vdecay, _state); mlpsetcond(&trainer, traineps, trainits, _state); mlptrainnetwork(&trainer, &net, nneedrest, &rep, _state); } if( vtrain==3 ) { for(i=0; i<=n-1; i++) { for(j=0; j<=2; j++) { sparseset(&sm, i, j, xy.ptr.pp_double[i][j], _state); } } mlpcreatetrainer(2, 1, &trainer, _state); mlpsetsparsedataset(&trainer, &sm, n, _state); mlpsetdecay(&trainer, vdecay, _state); mlpsetcond(&trainer, traineps, trainits, _state); mlpstarttraining(&trainer, &net, ae_true, _state); while(mlpcontinuetraining(&trainer, &net, _state)) { } } /* * Check that network is trained correctly */ for(i=0; i<=n-1; i++) { x.ptr.p_double[0] = xy.ptr.pp_double[i][0]; x.ptr.p_double[1] = xy.ptr.pp_double[i][1]; mlpprocess(&net, &x, &y, _state); /* * Calculate average error */ averr = averr+ae_fabs(y.ptr.p_double[0]-xy.ptr.pp_double[i][2], _state); } if( ae_fp_greater(averr/n,eps) ) { result = ae_true; ae_frame_leave(_state); return result; } } } result = ae_false; ae_frame_leave(_state); return result; } /************************************************************************* This function tests MLPTrainLM, MLPTrainLBFGS and MLPTrainNetwork functions for classification problems. It check that train functions work correctly when is used CreateC1 function. Here the network tries to distinguish positive from negative numbers. *************************************************************************/ static ae_bool testmlptrainunit_testmlptrainclass(ae_state *_state) { ae_frame _frame_block; multilayerperceptron net; mlptrainer trainer; mlpreport rep; ae_int_t info; ae_matrix xy; sparsematrix sm; ae_vector x; ae_vector y; ae_int_t n; double vdecay; double traineps; ae_int_t nneedrest; ae_int_t trainits; double tmp; double mnc; double mxc; ae_int_t nxp; ae_int_t i; ae_int_t rndind; ae_int_t vtrain; ae_int_t xp; ae_bool result; ae_frame_make(_state, &_frame_block); _multilayerperceptron_init(&net, _state); _mlptrainer_init(&trainer, _state); _mlpreport_init(&rep, _state); ae_matrix_init(&xy, 0, 0, DT_REAL, _state); _sparsematrix_init(&sm, _state); ae_vector_init(&x, 0, DT_REAL, _state); ae_vector_init(&y, 0, DT_REAL, _state); mnc = (double)(10); mxc = (double)(11); nxp = 15; vdecay = 0.001; nneedrest = 10; traineps = 1.0E-4; trainits = 0; n = 100; sparsecreate(n, 2, n*2, &sm, _state); ae_vector_set_length(&x, 1, _state); ae_matrix_set_length(&xy, n, 2, _state); for(xp=1; xp<=nxp; xp++) { for(vtrain=0; vtrain<=3; vtrain++) { /* * Initialization: * * create negative part of the set; */ for(i=0; i<=n/2-1; i++) { xy.ptr.pp_double[i][0] = -1*((mxc-mnc)*ae_randomreal(_state)+mnc); xy.ptr.pp_double[i][1] = (double)(0); } /* * * create positive part of the set; */ for(i=n/2; i<=n-1; i++) { xy.ptr.pp_double[i][0] = (mxc-mnc)*ae_randomreal(_state)+mnc; xy.ptr.pp_double[i][1] = (double)(1); } /* * * mix two parts. 
*/ for(i=0; i<=n-1; i++) { do { rndind = ae_randominteger(n, _state); } while(rndind==i); tmp = xy.ptr.pp_double[i][0]; xy.ptr.pp_double[i][0] = xy.ptr.pp_double[rndind][0]; xy.ptr.pp_double[rndind][0] = tmp; tmp = xy.ptr.pp_double[i][1]; xy.ptr.pp_double[i][1] = xy.ptr.pp_double[rndind][1]; xy.ptr.pp_double[rndind][1] = tmp; } /* * Create and train a neural network */ mlpcreatec0(1, 2, &net, _state); if( vtrain==0 ) { mlptrainlm(&net, &xy, n, vdecay, nneedrest, &info, &rep, _state); } if( vtrain==1 ) { mlptrainlbfgs(&net, &xy, n, vdecay, nneedrest, traineps, trainits, &info, &rep, _state); } /* * Train with trainer, using: * * dense matrix; */ if( vtrain==2 ) { mlpcreatetrainercls(1, 2, &trainer, _state); mlpsetdataset(&trainer, &xy, n, _state); mlpsetdecay(&trainer, vdecay, _state); mlpsetcond(&trainer, traineps, trainits, _state); mlptrainnetwork(&trainer, &net, nneedrest, &rep, _state); } /* * * sparse matrix. */ if( vtrain==3 ) { for(i=0; i<=n-1; i++) { sparseset(&sm, i, 0, xy.ptr.pp_double[i][0], _state); sparseset(&sm, i, 1, xy.ptr.pp_double[i][1], _state); } mlpcreatetrainercls(1, 2, &trainer, _state); mlpsetsparsedataset(&trainer, &sm, n, _state); mlpsetdecay(&trainer, vdecay, _state); mlpsetcond(&trainer, traineps, trainits, _state); mlptrainnetwork(&trainer, &net, nneedrest, &rep, _state); } /* * Test on training set */ for(i=0; i<=n-1; i++) { x.ptr.p_double[0] = xy.ptr.pp_double[i][0]; mlpprocess(&net, &x, &y, _state); /* * Negative number has to be negative and * positive number has to be positive. */ if( ((ae_fp_less(x.ptr.p_double[0],(double)(0))&&ae_fp_less(y.ptr.p_double[0],0.95))&&ae_fp_greater(y.ptr.p_double[1],0.05))||((ae_fp_greater_eq(x.ptr.p_double[0],(double)(0))&&ae_fp_greater(y.ptr.p_double[0],0.05))&&ae_fp_less(y.ptr.p_double[1],0.95)) ) { result = ae_true; ae_frame_leave(_state); return result; } } /* * Test on random set */ for(i=0; i<=n-1; i++) { x.ptr.p_double[0] = ae_pow((double)(-1), (double)(ae_randominteger(2, _state)), _state)*((mxc-mnc)*ae_randomreal(_state)+mnc); mlpprocess(&net, &x, &y, _state); if( ((ae_fp_less(x.ptr.p_double[0],(double)(0))&&ae_fp_less(y.ptr.p_double[0],0.95))&&ae_fp_greater(y.ptr.p_double[1],0.05))||((ae_fp_greater_eq(x.ptr.p_double[0],(double)(0))&&ae_fp_greater(y.ptr.p_double[0],0.05))&&ae_fp_less(y.ptr.p_double[1],0.95)) ) { result = ae_true; ae_frame_leave(_state); return result; } } } } result = ae_false; ae_frame_leave(_state); return result; } /************************************************************************* This function tests MLPTrainNetwork/MLPStartTraining/MLPContinueTraining functions for classification problems. It check that train functions work correctly when is used CreateC1 function. Here the network tries to distinguish positive from negative numbers. 
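NOTE: the dataset constructed below is actually the XOR truth table on
      {0,1}x{0,1} encoded as a two-class problem (class 0 when both inputs are
      equal, class 1 otherwise); the network is created with MLPCreateC1.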
*************************************************************************/ static ae_bool testmlptrainunit_testmlpxorcls(ae_state *_state) { ae_frame _frame_block; multilayerperceptron net; mlptrainer trainer; mlpreport rep; ae_matrix xy; sparsematrix sm; ae_vector x; ae_vector y; ae_int_t n; ae_int_t nin; ae_int_t nout; ae_int_t wcount; double e; double ebest; double v; ae_vector wbest; double vdecay; double traineps; ae_int_t nneurons; ae_int_t nneedrest; ae_int_t trainits; ae_int_t nxp; ae_int_t i; ae_int_t vtrain; ae_int_t xp; ae_bool result; ae_frame_make(_state, &_frame_block); _multilayerperceptron_init(&net, _state); _mlptrainer_init(&trainer, _state); _mlpreport_init(&rep, _state); ae_matrix_init(&xy, 0, 0, DT_REAL, _state); _sparsematrix_init(&sm, _state); ae_vector_init(&x, 0, DT_REAL, _state); ae_vector_init(&y, 0, DT_REAL, _state); ae_vector_init(&wbest, 0, DT_REAL, _state); nxp = 15; nneurons = 3; vdecay = 0.001; nneedrest = 3; traineps = 1.0E-4; trainits = 0; n = 4; sparsecreate(n, 3, n*3, &sm, _state); ae_vector_set_length(&x, 2, _state); ae_matrix_set_length(&xy, n, 3, _state); /* * Initialization: */ xy.ptr.pp_double[0][0] = (double)(0); xy.ptr.pp_double[0][1] = (double)(0); xy.ptr.pp_double[0][2] = (double)(0); xy.ptr.pp_double[1][0] = (double)(0); xy.ptr.pp_double[1][1] = (double)(1); xy.ptr.pp_double[1][2] = (double)(1); xy.ptr.pp_double[2][0] = (double)(1); xy.ptr.pp_double[2][1] = (double)(0); xy.ptr.pp_double[2][2] = (double)(1); xy.ptr.pp_double[3][0] = (double)(1); xy.ptr.pp_double[3][1] = (double)(1); xy.ptr.pp_double[3][2] = (double)(0); /* * Create a neural network */ mlpcreatec1(2, nneurons, 2, &net, _state); mlpproperties(&net, &nin, &nout, &wcount, _state); ae_vector_set_length(&wbest, wcount, _state); /* * Test */ for(xp=1; xp<=nxp; xp++) { for(vtrain=0; vtrain<=3; vtrain++) { /* * Train with trainer, using: * * dense matrix; */ if( vtrain==0 ) { mlpcreatetrainercls(2, 2, &trainer, _state); mlpsetdataset(&trainer, &xy, n, _state); mlpsetdecay(&trainer, vdecay, _state); mlpsetcond(&trainer, traineps, trainits, _state); mlptrainnetwork(&trainer, &net, nneedrest, &rep, _state); } if( vtrain==1 ) { mlpcreatetrainercls(2, 2, &trainer, _state); mlpsetdataset(&trainer, &xy, n, _state); mlpsetdecay(&trainer, vdecay, _state); mlpsetcond(&trainer, traineps, trainits, _state); ebest = ae_maxrealnumber; for(i=1; i<=nneedrest; i++) { mlpstarttraining(&trainer, &net, ae_true, _state); while(mlpcontinuetraining(&trainer, &net, _state)) { } v = ae_v_dotproduct(&net.weights.ptr.p_double[0], 1, &net.weights.ptr.p_double[0], 1, ae_v_len(0,wcount-1)); e = mlperror(&net, &xy, n, _state)+0.5*vdecay*v; /* * Compare with the best answer. */ if( ae_fp_less(e,ebest) ) { ae_v_move(&wbest.ptr.p_double[0], 1, &net.weights.ptr.p_double[0], 1, ae_v_len(0,wcount-1)); ebest = e; } } /* * The best result */ ae_v_move(&net.weights.ptr.p_double[0], 1, &wbest.ptr.p_double[0], 1, ae_v_len(0,wcount-1)); } /* * * sparse matrix. 
*/ if( vtrain==2 ) { for(i=0; i<=n-1; i++) { sparseset(&sm, i, 0, xy.ptr.pp_double[i][0], _state); sparseset(&sm, i, 1, xy.ptr.pp_double[i][1], _state); sparseset(&sm, i, 2, xy.ptr.pp_double[i][2], _state); } mlpcreatetrainercls(2, 2, &trainer, _state); mlpsetsparsedataset(&trainer, &sm, n, _state); mlpsetdecay(&trainer, vdecay, _state); mlpsetcond(&trainer, traineps, trainits, _state); mlptrainnetwork(&trainer, &net, nneedrest, &rep, _state); } if( vtrain==3 ) { for(i=0; i<=n-1; i++) { sparseset(&sm, i, 0, xy.ptr.pp_double[i][0], _state); sparseset(&sm, i, 1, xy.ptr.pp_double[i][1], _state); sparseset(&sm, i, 2, xy.ptr.pp_double[i][2], _state); } mlpcreatetrainercls(2, 2, &trainer, _state); mlpsetsparsedataset(&trainer, &sm, n, _state); mlpsetdecay(&trainer, vdecay, _state); mlpsetcond(&trainer, traineps, trainits, _state); ebest = ae_maxrealnumber; for(i=1; i<=nneedrest; i++) { mlpstarttraining(&trainer, &net, ae_true, _state); while(mlpcontinuetraining(&trainer, &net, _state)) { } v = ae_v_dotproduct(&net.weights.ptr.p_double[0], 1, &net.weights.ptr.p_double[0], 1, ae_v_len(0,wcount-1)); e = mlperror(&net, &xy, n, _state)+0.5*vdecay*v; /* * Compare with the best answer. */ if( ae_fp_less(e,ebest) ) { ae_v_move(&wbest.ptr.p_double[0], 1, &net.weights.ptr.p_double[0], 1, ae_v_len(0,wcount-1)); ebest = e; } } /* * The best result */ ae_v_move(&net.weights.ptr.p_double[0], 1, &wbest.ptr.p_double[0], 1, ae_v_len(0,wcount-1)); } /* * Test on training set */ for(i=0; i<=n-1; i++) { x.ptr.p_double[0] = xy.ptr.pp_double[i][0]; x.ptr.p_double[1] = xy.ptr.pp_double[i][1]; mlpprocess(&net, &x, &y, _state); if( ((ae_fp_eq(x.ptr.p_double[0],x.ptr.p_double[1])&&ae_fp_less(y.ptr.p_double[0],0.95))&&ae_fp_greater(y.ptr.p_double[1],0.05))||((ae_fp_neq(x.ptr.p_double[0],x.ptr.p_double[1])&&ae_fp_greater(y.ptr.p_double[0],0.05))&&ae_fp_less(y.ptr.p_double[1],0.95)) ) { result = ae_true; ae_frame_leave(_state); return result; } } } } result = ae_false; ae_frame_leave(_state); return result; } /************************************************************************* The test check, that all weights are zero after training with trainer using empty dataset(either zero size or is't used MLPSetDataSet function). Test on regression and classification problems given by dense or sparse matrix. NOTE: Result of the function is written in MLPTrainRegrErr variable in unit test. 
*************************************************************************/ static ae_bool testmlptrainunit_testmlpzeroweights(ae_state *_state) { ae_frame _frame_block; mlptrainer trainer; multilayerperceptron net; mlpreport rep; ae_int_t nin; ae_int_t nout; ae_int_t wcount; ae_int_t mxnin; ae_int_t mxnout; double vdecay; double traineps; ae_int_t trainits; ae_int_t nneedrest; ae_matrix dds; sparsematrix sds; ae_bool iscls; ae_bool issparse; ae_int_t c; ae_int_t n; ae_int_t xp; ae_int_t nxp; ae_bool result; ae_frame_make(_state, &_frame_block); _mlptrainer_init(&trainer, _state); _multilayerperceptron_init(&net, _state); _mlpreport_init(&rep, _state); ae_matrix_init(&dds, 0, 0, DT_REAL, _state); _sparsematrix_init(&sds, _state); mxnin = 10; mxnout = 10; vdecay = 1.0E-3; nneedrest = 1; traineps = 1.0E-3; trainits = 0; sparsecreate(1, 1, 0, &sds, _state); sparseconverttocrs(&sds, _state); nxp = 10; for(xp=1; xp<=nxp; xp++) { c = ae_randominteger(2, _state); iscls = c==1; c = ae_randominteger(2, _state); issparse = c==1; /* * Create trainer and network */ if( !iscls ) { /* * Regression */ nin = ae_randominteger(mxnin, _state)+1; nout = ae_randominteger(mxnout, _state)+1; mlpcreatetrainer(nin, nout, &trainer, _state); mlpcreate0(nin, nout, &net, _state); } else { /* * Classification */ nin = ae_randominteger(mxnin, _state)+1; nout = ae_randominteger(mxnout, _state)+2; mlpcreatetrainercls(nin, nout, &trainer, _state); mlpcreatec0(nin, nout, &net, _state); } n = ae_randominteger(2, _state)-1; if( n==0 ) { if( !issparse ) { mlpsetdataset(&trainer, &dds, n, _state); } else { mlpsetsparsedataset(&trainer, &sds, n, _state); } } mlpsetdecay(&trainer, vdecay, _state); mlpsetcond(&trainer, traineps, trainits, _state); c = ae_randominteger(2, _state); if( c==0 ) { mlpstarttraining(&trainer, &net, ae_true, _state); while(mlpcontinuetraining(&trainer, &net, _state)) { } } if( c==1 ) { mlptrainnetwork(&trainer, &net, nneedrest, &rep, _state); } /* * Check weights */ mlpproperties(&net, &nin, &nout, &wcount, _state); for(c=0; c<=wcount-1; c++) { if( ae_fp_neq(net.weights.ptr.p_double[c],(double)(0)) ) { result = ae_true; ae_frame_leave(_state); return result; } } } result = ae_false; ae_frame_leave(_state); return result; } /************************************************************************* This function tests that increasing numbers of restarts lead to statistical improvement quality of solution. Neural network created by Create1(10 neurons) and trained by MLPTrainLBFGS. TEST's DISCRIPTION: Net0 - network trained with one restart (denoted as R1) Net1 - network trained with more than one restart (denoted as Rn) We must refuse hypothesis that R1 equivalent to Rn. Here Mean = N/2, Sigma = Sqrt(N)/2. _ | 0 - R1 worse than Rn; ri = | |_1 - Rn same or worse then R1. If Sum(ri)1 restarts. */ mlpproperties(&net1, &nin, &nout, &wcount1, _state); e1 = ae_v_dotproduct(&net1.weights.ptr.p_double[0], 1, &net1.weights.ptr.p_double[0], 1, ae_v_len(0,wcount1-1)); e1 = mlperrorn(&net1, &xy, n, _state)+0.5*vdecay*e1; if( ae_fp_less_eq(e0,e1) ) { avval = avval+1; } } if( ae_fp_less(mean-numsigma,avval) ) { result = ae_true; ae_frame_leave(_state); return result; } } result = ae_false; ae_frame_leave(_state); return result; } /************************************************************************* The function test function MLPKFoldCV. 
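The test below repeats NXP independent experiments. In each experiment a
network is trained on a small random dataset, its errors are measured on a
large independent test set, and MLPKFoldCV() is run on the same training data.
Counters R0..R4 record how often the in-sample training report (Rep) is closer
to the test-set error than the cross-validation report (CVRep). Because Rep is
measured on the training data itself, it is expected to be optimistically
biased, so each counter should stay well below NXP/2; the test fails when a
counter reaches Mean-NumSigma = NXP/2 - 5*Sqrt(NXP)/2 (a 5-sigma bound for a
Binomial(NXP,1/2) count). Finally, MLPKFoldCV() must return all-zero reports
when no dataset was set, or when the dataset has zero or one point.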
*************************************************************************/ static ae_bool testmlptrainunit_testmlpcverror(ae_state *_state) { ae_frame _frame_block; multilayerperceptron net; mlptrainer trainer; mlpreport rep; mlpreport cvrep; ae_int_t nin; ae_int_t nout; ae_int_t nneurons; ae_int_t rowsz; double decay; double wstep; ae_int_t maxits; ae_int_t foldscount; ae_int_t nneedrest; sparsematrix sptrainingset; ae_matrix trainingset; ae_matrix testset; ae_int_t npoints; ae_int_t ntstpoints; double mean; double numsigma; double diffms; double tstrelclserror; double tstavgce; double tstrmserror; double tstavgerror; double tstavgrelerror; ae_int_t r0; ae_int_t r1; ae_int_t r2; ae_int_t r3; ae_int_t r4; ae_int_t ntest; ae_int_t xp; ae_int_t nxp; ae_bool isregr; ae_int_t issparse; ae_int_t i; ae_int_t j; ae_bool result; ae_frame_make(_state, &_frame_block); _multilayerperceptron_init(&net, _state); _mlptrainer_init(&trainer, _state); _mlpreport_init(&rep, _state); _mlpreport_init(&cvrep, _state); _sparsematrix_init(&sptrainingset, _state); ae_matrix_init(&trainingset, 0, 0, DT_REAL, _state); ae_matrix_init(&testset, 0, 0, DT_REAL, _state); decay = 1.0E-6; wstep = 0.0; foldscount = 5; nneedrest = 1; ntest = ae_randominteger(4, _state); nxp = 1000; maxits = 50; nin = 1; nout = 1; npoints = 5; ntstpoints = 100; isregr = ae_true; nneurons = 3; if( ntest==1 ) { nxp = 1000; maxits = 50; nin = 1; nout = 10; npoints = 5; ntstpoints = 100; isregr = ae_true; } if( ntest==2 ) { nxp = 1000; maxits = 50; nin = 10; nout = 1; npoints = 20; ntstpoints = 100; isregr = ae_true; } if( ntest==3 ) { nxp = 2000; maxits = 10; nin = 1; nneurons = 3; nout = 3; npoints = 10; ntstpoints = 100; isregr = ae_false; } mean = nxp/2.0; numsigma = 5.0*ae_sqrt((double)(nxp), _state)/2.0; diffms = mean-numsigma; issparse = ae_randominteger(2, _state); if( isregr ) { mlpcreate0(nin, nout, &net, _state); mlpcreatetrainer(nin, nout, &trainer, _state); } else { mlpcreatec1(nin, nneurons, nout, &net, _state); mlpcreatetrainercls(nin, nout, &trainer, _state); } mlpsetcond(&trainer, wstep, maxits, _state); mlpsetdecay(&trainer, decay, _state); if( isregr ) { rowsz = nin+nout; } else { rowsz = nin+1; } r0 = 0; r1 = 0; r2 = 0; r3 = 0; r4 = 0; for(xp=1; xp<=nxp; xp++) { /* * Dense matrix */ if( issparse==0 ) { rmatrixsetlengthatleast(&trainingset, npoints, rowsz, _state); /* * Create training set */ for(i=0; i<=npoints-1; i++) { for(j=0; j<=nin-1; j++) { trainingset.ptr.pp_double[i][j] = 2*ae_randomreal(_state)-1; } } if( isregr ) { for(i=0; i<=npoints-1; i++) { for(j=nin; j<=rowsz-1; j++) { trainingset.ptr.pp_double[i][j] = 2*ae_randomreal(_state)+1; } } } else { for(i=0; i<=npoints-1; i++) { for(j=nin; j<=rowsz-1; j++) { trainingset.ptr.pp_double[i][j] = (double)(ae_randominteger(nout, _state)); } } } mlpsetdataset(&trainer, &trainingset, npoints, _state); } /* * Sparse matrix */ if( issparse==1 ) { sparsecreate(npoints, rowsz, npoints*rowsz, &sptrainingset, _state); /* * Create training set */ for(i=0; i<=npoints-1; i++) { for(j=0; j<=nin-1; j++) { sparseset(&sptrainingset, i, j, 2*ae_randomreal(_state)-1, _state); } } if( isregr ) { for(i=0; i<=npoints-1; i++) { for(j=nin; j<=rowsz-1; j++) { sparseset(&sptrainingset, i, j, 2*ae_randomreal(_state)+1, _state); } } } else { for(i=0; i<=npoints-1; i++) { for(j=nin; j<=rowsz-1; j++) { sparseset(&sptrainingset, i, j, (double)(ae_randominteger(nout, _state)), _state); } } } sparseconverttocrs(&sptrainingset, _state); mlpsetsparsedataset(&trainer, &sptrainingset, npoints, _state); } 
rmatrixsetlengthatleast(&testset, ntstpoints, rowsz, _state); /* * Create test set */ for(i=0; i<=ntstpoints-1; i++) { for(j=0; j<=nin-1; j++) { testset.ptr.pp_double[i][j] = 2*ae_randomreal(_state)-1; } } if( isregr ) { for(i=0; i<=ntstpoints-1; i++) { for(j=nin; j<=rowsz-1; j++) { testset.ptr.pp_double[i][j] = 2*ae_randomreal(_state)+1; } } } else { for(i=0; i<=ntstpoints-1; i++) { for(j=nin; j<=rowsz-1; j++) { testset.ptr.pp_double[i][j] = (double)(ae_randominteger(nout, _state)); } } } mlptrainnetwork(&trainer, &net, nneedrest, &rep, _state); tstrelclserror = (double)(0); tstavgce = (double)(0); tstrmserror = (double)(0); tstavgerror = (double)(0); tstavgrelerror = (double)(0); if( !isregr ) { tstrelclserror = mlprelclserror(&net, &testset, ntstpoints, _state); tstavgce = mlpavgce(&net, &testset, ntstpoints, _state); } tstrmserror = mlprmserror(&net, &testset, ntstpoints, _state); tstavgerror = mlpavgerror(&net, &testset, ntstpoints, _state); tstavgrelerror = mlpavgrelerror(&net, &testset, ntstpoints, _state); /* * Cross-validation */ mlpkfoldcv(&trainer, &net, nneedrest, foldscount, &cvrep, _state); if( !isregr ) { if( ae_fp_less(ae_fabs(tstrelclserror-rep.relclserror, _state),ae_fabs(tstrelclserror-cvrep.relclserror, _state)) ) { r0 = r0+1; } if( ae_fp_less(ae_fabs(tstavgce-rep.avgce, _state),ae_fabs(tstavgce-cvrep.avgce, _state)) ) { r1 = r1+1; } } if( ae_fp_less(ae_fabs(tstrmserror-rep.rmserror, _state),ae_fabs(tstrmserror-cvrep.rmserror, _state)) ) { r2 = r2+1; } if( ae_fp_less(ae_fabs(tstavgerror-rep.avgerror, _state),ae_fabs(tstavgerror-cvrep.avgerror, _state)) ) { r3 = r3+1; } if( ae_fp_less(ae_fabs(tstavgrelerror-rep.avgrelerror, _state),ae_fabs(tstavgrelerror-cvrep.avgrelerror, _state)) ) { r4 = r4+1; } } if( !isregr ) { if( ae_fp_less_eq(diffms,(double)(r0))||ae_fp_less_eq(diffms,(double)(r1)) ) { result = ae_true; ae_frame_leave(_state); return result; } } if( (ae_fp_less_eq(diffms,(double)(r2))||ae_fp_less_eq(diffms,(double)(r3)))||ae_fp_less_eq(diffms,(double)(r4)) ) { result = ae_true; ae_frame_leave(_state); return result; } /* * Test FoldCV when no dataset was specified with * MLPSetDataset/SetSparseDataset(), or subset with * only one point was given. * * NPoints values: * * -1 - don't set dataset with using MLPSetDataset..; * * 0 - zero dataset; * * 1 - dataset with one point. */ for(npoints=-1; npoints<=1; npoints++) { if( isregr ) { mlpcreatetrainer(nin, nout, &trainer, _state); } else { mlpcreatetrainercls(nin, nout, &trainer, _state); } if( npoints>-1 ) { if( issparse==0 ) { mlpsetdataset(&trainer, &trainingset, npoints, _state); } if( issparse==1 ) { mlpsetsparsedataset(&trainer, &sptrainingset, npoints, _state); } } mlpkfoldcv(&trainer, &net, nneedrest, foldscount, &cvrep, _state); if( ((((((ae_fp_neq(cvrep.relclserror,(double)(0))||ae_fp_neq(cvrep.avgce,(double)(0)))||ae_fp_neq(cvrep.rmserror,(double)(0)))||ae_fp_neq(cvrep.avgerror,(double)(0)))||ae_fp_neq(cvrep.avgrelerror,(double)(0)))||cvrep.ngrad!=0)||cvrep.nhess!=0)||cvrep.ncholesky!=0 ) { result = ae_true; ae_frame_leave(_state); return result; } } result = ae_false; ae_frame_leave(_state); return result; } /************************************************************************* The function tests functions for training ensembles: MLPEBaggingLM, MLPEBaggingLBFGS. 
*************************************************************************/ static ae_bool testmlptrainunit_testmlptrainens(ae_state *_state) { ae_frame _frame_block; mlpensemble ensemble; mlpreport rep; mlpcvreport oobrep; ae_int_t info; ae_matrix xy; ae_int_t nin; ae_int_t nout; ae_int_t npoints; ae_int_t nhid; ae_int_t algtype; ae_int_t tasktype; ae_int_t pass; double e; ae_int_t nless; ae_int_t nall; ae_int_t nclasses; ae_int_t i; ae_int_t j; ae_bool result; ae_frame_make(_state, &_frame_block); _mlpensemble_init(&ensemble, _state); _mlpreport_init(&rep, _state); _mlpcvreport_init(&oobrep, _state); ae_matrix_init(&xy, 0, 0, DT_REAL, _state); result = ae_false; /* * network training must reduce error * test on random regression task */ nin = 3; nout = 2; nhid = 5; npoints = 100; nless = 0; nall = 0; for(pass=1; pass<=10; pass++) { for(algtype=0; algtype<=1; algtype++) { for(tasktype=0; tasktype<=1; tasktype++) { if( tasktype==0 ) { ae_matrix_set_length(&xy, npoints, nin+nout, _state); for(i=0; i<=npoints-1; i++) { for(j=0; j<=nin+nout-1; j++) { xy.ptr.pp_double[i][j] = 2*ae_randomreal(_state)-1; } } mlpecreate1(nin, nhid, nout, 1+ae_randominteger(3, _state), &ensemble, _state); } else { ae_matrix_set_length(&xy, npoints, nin+1, _state); nclasses = 2+ae_randominteger(2, _state); for(i=0; i<=npoints-1; i++) { for(j=0; j<=nin-1; j++) { xy.ptr.pp_double[i][j] = 2*ae_randomreal(_state)-1; } xy.ptr.pp_double[i][nin] = (double)(ae_randominteger(nclasses, _state)); } mlpecreatec1(nin, nhid, nclasses, 1+ae_randominteger(3, _state), &ensemble, _state); } e = mlpermserror(&ensemble, &xy, npoints, _state); if( algtype==0 ) { mlpebagginglm(&ensemble, &xy, npoints, 0.001, 1, &info, &rep, &oobrep, _state); } else { mlpebagginglbfgs(&ensemble, &xy, npoints, 0.001, 1, 0.01, 0, &info, &rep, &oobrep, _state); } if( info<0 ) { result = ae_true; } else { if( ae_fp_less(mlpermserror(&ensemble, &xy, npoints, _state),e) ) { nless = nless+1; } } nall = nall+1; } } } result = result||ae_fp_greater((double)(nall-nless),0.3*nall); ae_frame_leave(_state); return result; } /************************************************************************* Testing for functions MLPETrainES and MLPTrainEnsembleES on regression problems. Returns TRUE for errors, FALSE for success. *************************************************************************/ static ae_bool testmlptrainunit_testmlptrainensregr(ae_state *_state) { ae_frame _frame_block; mlptrainer trainer; mlpensemble netens; mlpreport rep; modelerrors repx; ae_int_t info; sparsematrix xytrainsp; ae_matrix xytrain; ae_matrix xytest; ae_int_t nin; ae_int_t nout; ae_int_t nneurons; ae_vector x; ae_vector y; double decay; double wstep; ae_int_t maxits; ae_int_t nneedrest; ae_int_t enssize; double mnval; double mxval; ae_int_t ntrain; ae_int_t ntest; double avgerr; ae_int_t issparse; ae_int_t withtrainer; double eps; ae_int_t xp; ae_int_t i; ae_int_t j; ae_bool result; ae_frame_make(_state, &_frame_block); _mlptrainer_init(&trainer, _state); _mlpensemble_init(&netens, _state); _mlpreport_init(&rep, _state); _modelerrors_init(&repx, _state); _sparsematrix_init(&xytrainsp, _state); ae_matrix_init(&xytrain, 0, 0, DT_REAL, _state); ae_matrix_init(&xytest, 0, 0, DT_REAL, _state); ae_vector_init(&x, 0, DT_REAL, _state); ae_vector_init(&y, 0, DT_REAL, _state); result = ae_false; /* * This test checks ability to train ensemble on simple regression * problem "f(x0,x1,x2,...) = x0 + x1 + x2 + ...". 
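*
* For example, with NIn=2 every dataset row has the form (x0, x1, x0+x1),
* with x0 and x1 drawn uniformly from [MnVal,MxVal] = [-1,+1]. The trained
* ensemble must reproduce the sum with average absolute error below Eps on
* the training set and below 2*Eps on the test set.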
*/ eps = 5.0E-2; mnval = (double)(-1); mxval = (double)(1); ntrain = 40; ntest = 20; decay = 1.0E-3; wstep = 1.0E-3; maxits = 20; nneedrest = 1; nneurons = 20; nout = 1; enssize = 100; for(xp=1; xp<=2; xp++) { nin = ae_randominteger(3, _state)+1; rvectorsetlengthatleast(&x, nin, _state); mlpcreatetrainer(nin, nout, &trainer, _state); mlpsetdecay(&trainer, decay, _state); mlpsetcond(&trainer, wstep, maxits, _state); rmatrixsetlengthatleast(&xytrain, ntrain, nin+nout, _state); rmatrixsetlengthatleast(&xytest, ntest, nin+nout, _state); withtrainer = ae_randominteger(2, _state); issparse = 0; if( withtrainer==0 ) { issparse = 0; } if( withtrainer==1 ) { issparse = ae_randominteger(2, _state); } /* * Training set */ for(i=0; i<=ntrain-1; i++) { for(j=0; j<=nin-1; j++) { xytrain.ptr.pp_double[i][j] = (mxval-mnval)*ae_randomreal(_state)+mnval; } xytrain.ptr.pp_double[i][nin] = (double)(0); for(j=0; j<=nin-1; j++) { xytrain.ptr.pp_double[i][nin] = xytrain.ptr.pp_double[i][nin]+xytrain.ptr.pp_double[i][j]; } } if( withtrainer==1 ) { /* * Dense matrix */ if( issparse==0 ) { mlpsetdataset(&trainer, &xytrain, ntrain, _state); } /* * Sparse matrix */ if( issparse==1 ) { sparsecreate(ntrain, nin+nout, ntrain*(nin+nout), &xytrainsp, _state); /* * Just copy dense matrix to sparse matrix(using SparseGet() is too expensive). */ for(i=0; i<=ntrain-1; i++) { for(j=0; j<=nin+nout-1; j++) { sparseset(&xytrainsp, i, j, xytrain.ptr.pp_double[i][j], _state); } } sparseconverttocrs(&xytrainsp, _state); mlpsetsparsedataset(&trainer, &xytrainsp, ntrain, _state); } } /* * Test set */ for(i=0; i<=ntest-1; i++) { for(j=0; j<=nin-1; j++) { xytest.ptr.pp_double[i][j] = (mxval-mnval)*ae_randomreal(_state)+mnval; } xytest.ptr.pp_double[i][nin] = (double)(0); for(j=0; j<=nin-1; j++) { xytest.ptr.pp_double[i][nin] = xytest.ptr.pp_double[i][nin]+xytest.ptr.pp_double[i][j]; } } /* * Create ensemble */ mlpecreate1(nin, nneurons, nout, enssize, &netens, _state); /* * Train ensembles: * * without trainer; */ if( withtrainer==0 ) { mlpetraines(&netens, &xytrain, ntrain, decay, nneedrest, &info, &rep, _state); } /* * * with trainer. */ if( withtrainer==1 ) { mlptrainensemblees(&trainer, &netens, nneedrest, &rep, _state); } /* * Test that Rep contains correct error values */ mlpeallerrorsx(&netens, &xytrain, &xytrainsp, ntrain, 0, &netens.network.dummyidx, 0, ntrain, 0, &netens.network.buf, &repx, _state); seterrorflagdiff(&result, rep.relclserror, repx.relclserror, 1.0E-4, 1.0E-2, _state); seterrorflagdiff(&result, rep.avgce, repx.avgce, 1.0E-4, 1.0E-2, _state); seterrorflagdiff(&result, rep.rmserror, repx.rmserror, 1.0E-4, 1.0E-2, _state); seterrorflagdiff(&result, rep.avgerror, repx.avgerror, 1.0E-4, 1.0E-2, _state); seterrorflagdiff(&result, rep.avgrelerror, repx.avgrelerror, 1.0E-4, 1.0E-2, _state); /* * Test that network fits data well. Calculate average error: * * on training dataset; * * on test dataset. (here we reduce the accuracy * requirements - average error is compared with 2*Eps). 
*/ avgerr = (double)(0); for(i=0; i<=ntrain-1; i++) { if( issparse==0 ) { ae_v_move(&x.ptr.p_double[0], 1, &xytrain.ptr.pp_double[i][0], 1, ae_v_len(0,nin-1)); } if( issparse==1 ) { sparsegetrow(&xytrainsp, i, &x, _state); } mlpeprocess(&netens, &x, &y, _state); avgerr = avgerr+ae_fabs(y.ptr.p_double[0]-xytrain.ptr.pp_double[i][nin], _state); } avgerr = avgerr/ntrain; seterrorflag(&result, ae_fp_greater(avgerr,eps), _state); avgerr = (double)(0); for(i=0; i<=ntest-1; i++) { ae_v_move(&x.ptr.p_double[0], 1, &xytest.ptr.pp_double[i][0], 1, ae_v_len(0,nin-1)); mlpeprocess(&netens, &x, &y, _state); avgerr = avgerr+ae_fabs(y.ptr.p_double[0]-xytest.ptr.pp_double[i][nin], _state); } avgerr = avgerr/ntest; seterrorflag(&result, ae_fp_greater(avgerr,2*eps), _state); } /* * Catch bug in implementation of MLPTrainEnsembleX: * test ensemble training on empty dataset. * * Unfixed version should crash with violation of array * bounds (at least in C#). */ nin = 2; nout = 2; nneurons = 3; enssize = 3; nneedrest = 2; wstep = 0.001; maxits = 2; mlpcreatetrainer(nin, nout, &trainer, _state); mlpsetcond(&trainer, wstep, maxits, _state); mlpecreate1(nin, nneurons, nout, enssize, &netens, _state); mlptrainensemblees(&trainer, &netens, nneedrest, &rep, _state); ae_frame_leave(_state); return result; } /************************************************************************* Testing for functions MLPETrainES and MLPTrainEnsembleES on classification problems. *************************************************************************/ static ae_bool testmlptrainunit_testmlptrainenscls(ae_state *_state) { ae_frame _frame_block; mlptrainer trainer; mlpensemble netens; mlpreport rep; ae_int_t info; sparsematrix xytrainsp; ae_matrix xytrain; ae_matrix xytest; ae_int_t nin; ae_int_t nout; ae_vector x; ae_vector y; double decay; double wstep; ae_int_t maxits; ae_int_t nneedrest; ae_int_t enssize; ae_int_t val; ae_int_t ntrain; ae_int_t ntest; double avgerr; double eps; double delta; ae_int_t issparse; ae_int_t withtrainer; ae_int_t xp; ae_int_t nxp; ae_int_t i; ae_int_t j; ae_bool result; ae_frame_make(_state, &_frame_block); _mlptrainer_init(&trainer, _state); _mlpensemble_init(&netens, _state); _mlpreport_init(&rep, _state); _sparsematrix_init(&xytrainsp, _state); ae_matrix_init(&xytrain, 0, 0, DT_REAL, _state); ae_matrix_init(&xytest, 0, 0, DT_REAL, _state); ae_vector_init(&x, 0, DT_REAL, _state); ae_vector_init(&y, 0, DT_REAL, _state); eps = 5.0E-2; delta = 0.1; ntrain = 90; ntest = 90; nin = 3; nout = 3; rvectorsetlengthatleast(&x, nin, _state); rmatrixsetlengthatleast(&xytrain, ntrain, nin+1, _state); rmatrixsetlengthatleast(&xytest, ntest, nin+1, _state); decay = 1.0E-3; wstep = 1.0E-3; maxits = 100; nneedrest = 1; mlpcreatetrainercls(nin, nout, &trainer, _state); mlpsetdecay(&trainer, decay, _state); mlpsetcond(&trainer, wstep, maxits, _state); nxp = 5; for(xp=1; xp<=nxp; xp++) { enssize = ae_round(ae_pow((double)(10), (double)(ae_randominteger(2, _state)+1), _state), _state); withtrainer = ae_randominteger(2, _state); issparse = 0; if( withtrainer==0 ) { issparse = 0; } if( withtrainer==1 ) { issparse = ae_randominteger(2, _state); } for(i=0; i<=ntrain-1; i++) { val = i%nin; for(j=0; j<=nin-1; j++) { xytrain.ptr.pp_double[i][j] = delta*(ae_randomreal(_state)-1); } xytrain.ptr.pp_double[i][val] = xytrain.ptr.pp_double[i][val]+1; xytrain.ptr.pp_double[i][nin] = (double)(val); } /* * Set dense dataset in trainer */ if( issparse==0 ) { mlpsetdataset(&trainer, &xytrain, ntrain, _state); } /* * * Sparse dataset(create 
it with using dense dataset). */ if( issparse==1 ) { sparsecreate(ntrain, nin+1, ntrain*(nin+1), &xytrainsp, _state); for(i=0; i<=ntrain-1; i++) { for(j=0; j<=nin-1; j++) { sparseset(&xytrainsp, i, j, xytrain.ptr.pp_double[i][j], _state); } sparseset(&xytrainsp, i, nin, xytrain.ptr.pp_double[i][nin], _state); } sparseconverttocrs(&xytrainsp, _state); /* * Set sparse dataset in trainer */ mlpsetsparsedataset(&trainer, &xytrainsp, ntrain, _state); } /* * Create test set */ for(i=0; i<=ntest-1; i++) { val = ae_randominteger(nin, _state); for(j=0; j<=nin-1; j++) { xytest.ptr.pp_double[i][j] = delta*(ae_randomreal(_state)-1); } xytest.ptr.pp_double[i][val] = xytest.ptr.pp_double[i][val]+1; xytest.ptr.pp_double[i][nin] = (double)(val); } /* * Create ensemble */ mlpecreatec0(nin, nout, enssize, &netens, _state); /* * Train ensembles: * * without trainer; */ if( withtrainer==0 ) { mlpetraines(&netens, &xytrain, ntrain, decay, nneedrest, &info, &rep, _state); } /* * * with trainer. */ if( withtrainer==1 ) { mlptrainensemblees(&trainer, &netens, nneedrest, &rep, _state); } /* * Calculate average error: * * on training dataset; */ avgerr = (double)(0); for(i=0; i<=ntrain-1; i++) { if( issparse==0 ) { ae_v_move(&x.ptr.p_double[0], 1, &xytrain.ptr.pp_double[i][0], 1, ae_v_len(0,nin-1)); } if( issparse==1 ) { sparsegetrow(&xytrainsp, i, &x, _state); } mlpeprocess(&netens, &x, &y, _state); for(j=0; j<=nout-1; j++) { if( ae_fp_neq((double)(j),xytrain.ptr.pp_double[i][nin]) ) { avgerr = avgerr+y.ptr.p_double[j]; } else { avgerr = avgerr+(1-y.ptr.p_double[j]); } } } avgerr = avgerr/(ntrain*nout); if( ae_fp_greater(avgerr,eps) ) { result = ae_true; ae_frame_leave(_state); return result; } /* * * on test dataset. */ avgerr = (double)(0); for(i=0; i<=ntest-1; i++) { ae_v_move(&x.ptr.p_double[0], 1, &xytest.ptr.pp_double[i][0], 1, ae_v_len(0,nin-1)); mlpeprocess(&netens, &x, &y, _state); for(j=0; j<=nout-1; j++) { if( ae_fp_neq((double)(j),xytest.ptr.pp_double[i][nin]) ) { avgerr = avgerr+y.ptr.p_double[j]; } else { avgerr = avgerr+(1-y.ptr.p_double[j]); } } } avgerr = avgerr/(ntest*nout); if( ae_fp_greater(avgerr,eps) ) { result = ae_true; ae_frame_leave(_state); return result; } } result = ae_false; ae_frame_leave(_state); return result; } static ae_bool testclusteringunit_basicahctests(ae_state *_state); static ae_bool testclusteringunit_advancedahctests(ae_state *_state); static void testclusteringunit_kmeanssimpletest1(ae_int_t nvars, ae_int_t nc, ae_int_t passcount, ae_bool* converrors, ae_bool* othererrors, ae_bool* simpleerrors, ae_state *_state); static void testclusteringunit_kmeansspecialtests(ae_bool* othererrors, ae_state *_state); static void testclusteringunit_kmeansinfinitelooptest(ae_bool* othererrors, ae_state *_state); static void testclusteringunit_kmeansrestartstest(ae_bool* converrors, ae_bool* restartserrors, ae_state *_state); static double testclusteringunit_rnormal(ae_state *_state); static void testclusteringunit_rsphere(/* Real */ ae_matrix* xy, ae_int_t n, ae_int_t i, ae_state *_state); static double testclusteringunit_distfunc(/* Real */ ae_vector* x0, /* Real */ ae_vector* x1, ae_int_t d, ae_int_t disttype, ae_state *_state); static ae_bool testclusteringunit_errorsinmerges(/* Real */ ae_matrix* d, /* Real */ ae_matrix* xy, ae_int_t npoints, ae_int_t nf, ahcreport* rep, ae_int_t ahcalgo, ae_state *_state); static void testclusteringunit_kmeansreferenceupdatedistances(/* Real */ ae_matrix* xy, ae_int_t npoints, ae_int_t nvars, /* Real */ ae_matrix* ct, ae_int_t k, /* Integer */ 
ae_vector* xyc, /* Real */ ae_vector* xydist2, ae_state *_state); /************************************************************************* Testing clustering *************************************************************************/ ae_bool testclustering(ae_bool silent, ae_state *_state) { ae_bool waserrors; ae_bool basicahcerrors; ae_bool ahcerrors; ae_bool kmeansconverrors; ae_bool kmeanssimpleerrors; ae_bool kmeansothererrors; ae_bool kmeansrestartserrors; ae_int_t passcount; ae_int_t nf; ae_int_t nc; ae_bool result; /* * AHC tests */ basicahcerrors = testclusteringunit_basicahctests(_state); ahcerrors = testclusteringunit_advancedahctests(_state); /* * k-means tests */ passcount = 10; kmeansconverrors = ae_false; kmeansothererrors = ae_false; kmeanssimpleerrors = ae_false; kmeansrestartserrors = ae_false; testclusteringunit_kmeansspecialtests(&kmeansothererrors, _state); testclusteringunit_kmeansinfinitelooptest(&kmeansothererrors, _state); testclusteringunit_kmeansrestartstest(&kmeansconverrors, &kmeansrestartserrors, _state); for(nf=1; nf<=5; nf++) { for(nc=1; nc<=5; nc++) { testclusteringunit_kmeanssimpletest1(nf, nc, passcount, &kmeansconverrors, &kmeansothererrors, &kmeanssimpleerrors, _state); } } /* * Results */ waserrors = ae_false; waserrors = waserrors||(basicahcerrors||ahcerrors); waserrors = waserrors||(((kmeansconverrors||kmeansothererrors)||kmeanssimpleerrors)||kmeansrestartserrors); if( !silent ) { printf("TESTING CLUSTERING\n"); printf("AHC: \n"); printf("* BASIC TESTS "); if( !basicahcerrors ) { printf("OK\n"); } else { printf("FAILED\n"); } printf("* GENERAL TESTS "); if( !ahcerrors ) { printf("OK\n"); } else { printf("FAILED\n"); } printf("K-MEANS: \n"); printf("* CONVERGENCE "); if( !kmeansconverrors ) { printf("OK\n"); } else { printf("FAILED\n"); } printf("* SIMPLE TASKS "); if( !kmeanssimpleerrors ) { printf("OK\n"); } else { printf("FAILED\n"); } printf("* OTHER PROPERTIES "); if( !kmeansothererrors ) { printf("OK\n"); } else { printf("FAILED\n"); } printf("* RESTARTS PROPERTIES "); if( !kmeansrestartserrors ) { printf("OK\n"); } else { printf("FAILED\n"); } if( waserrors ) { printf("TEST FAILED\n"); } else { printf("TEST PASSED\n"); } printf("\n\n"); } result = !waserrors; return result; } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. *************************************************************************/ ae_bool _pexec_testclustering(ae_bool silent, ae_state *_state) { return testclustering(silent, _state); } /************************************************************************* Basic agglomerative hierarchical clustering tests: returns True on failure, False on success. Basic tests study algorithm behavior on simple, hand-made datasets with small number of points (1..10). 
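
NOTE: the AhcAlgo codes passed to ClusterizerSetAHCAlgo() throughout these tests are
assumed to follow the ALGLIB convention documented for that function: 0 - complete
linkage, 1 - single linkage, 2 - unweighted average linkage, 3 - weighted average
linkage, 4 - Ward's method.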
*************************************************************************/ static ae_bool testclusteringunit_basicahctests(ae_state *_state) { ae_frame _frame_block; clusterizerstate s; ahcreport rep; ae_matrix xy; ae_matrix d; ae_matrix c; ae_bool berr; ae_int_t ahcalgo; ae_int_t i; ae_int_t j; ae_int_t npoints; ae_int_t k; ae_vector cidx; ae_vector cz; ae_vector cidx2; ae_vector cz2; ae_bool result; ae_frame_make(_state, &_frame_block); _clusterizerstate_init(&s, _state); _ahcreport_init(&rep, _state); ae_matrix_init(&xy, 0, 0, DT_REAL, _state); ae_matrix_init(&d, 0, 0, DT_REAL, _state); ae_matrix_init(&c, 0, 0, DT_REAL, _state); ae_vector_init(&cidx, 0, DT_INT, _state); ae_vector_init(&cz, 0, DT_INT, _state); ae_vector_init(&cidx2, 0, DT_INT, _state); ae_vector_init(&cz2, 0, DT_INT, _state); result = ae_true; /* * Test on empty problem */ clusterizercreate(&s, _state); clusterizerrunahc(&s, &rep, _state); if( rep.npoints!=0 ) { ae_frame_leave(_state); return result; } /* * Test on problem with one point */ ae_matrix_set_length(&xy, 1, 2, _state); xy.ptr.pp_double[0][0] = ae_randomreal(_state); xy.ptr.pp_double[0][1] = ae_randomreal(_state); clusterizercreate(&s, _state); clusterizersetpoints(&s, &xy, 1, 2, 0, _state); clusterizerrunahc(&s, &rep, _state); if( rep.npoints!=1 ) { ae_frame_leave(_state); return result; } /* * Test on problem with two points */ ae_matrix_set_length(&xy, 2, 2, _state); xy.ptr.pp_double[0][0] = ae_randomreal(_state); xy.ptr.pp_double[0][1] = ae_randomreal(_state); xy.ptr.pp_double[1][0] = ae_randomreal(_state); xy.ptr.pp_double[1][1] = ae_randomreal(_state); clusterizercreate(&s, _state); clusterizersetpoints(&s, &xy, 2, 2, 0, _state); clusterizerrunahc(&s, &rep, _state); if( (rep.npoints!=2||rep.z.rows!=1)||rep.z.cols!=2 ) { ae_frame_leave(_state); return result; } if( rep.z.ptr.pp_int[0][0]!=0||rep.z.ptr.pp_int[0][1]!=1 ) { ae_frame_leave(_state); return result; } /* * Test on specially designed problem which should have * following dendrogram: * * ------ * | | * ---- ---- * | | | | * 0 1 2 3 * * ...with first merge performed on 0 and 1, second merge * performed on 2 and 3. Complete linkage is used. * * Additionally we test ClusterizerSeparatedByDist() on this * problem for different distances. Test is performed by * comparing function result with ClusterizerGetKClusters() * for known K. 
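*
* For the dataset below (points 0.0, 1.0, 3.0, 4.1) the merge distances are
* 1.0 (points 0 and 1), 1.1 (points 2 and 3) and 4.1 (complete-linkage
* distance between the two pairs). This is why the thresholds 0.5, 1.05 and
* 1.15 used below must produce 4, 3 and 2 clusters respectively.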
*/ ae_matrix_set_length(&xy, 4, 1, _state); xy.ptr.pp_double[0][0] = 0.0; xy.ptr.pp_double[1][0] = 1.0; xy.ptr.pp_double[2][0] = 3.0; xy.ptr.pp_double[3][0] = 4.1; clusterizercreate(&s, _state); clusterizersetpoints(&s, &xy, 4, 1, 0, _state); clusterizersetahcalgo(&s, 0, _state); clusterizerrunahc(&s, &rep, _state); if( (((rep.npoints!=4||rep.z.rows!=3)||rep.z.cols!=2)||rep.pz.rows!=3)||rep.pz.cols!=2 ) { ae_frame_leave(_state); return result; } berr = ae_false; berr = (berr||rep.z.ptr.pp_int[0][0]!=0)||rep.z.ptr.pp_int[0][1]!=1; berr = (berr||rep.z.ptr.pp_int[1][0]!=2)||rep.z.ptr.pp_int[1][1]!=3; berr = (berr||rep.z.ptr.pp_int[2][0]!=4)||rep.z.ptr.pp_int[2][1]!=5; berr = (((berr||rep.p.ptr.p_int[0]!=0)||rep.p.ptr.p_int[1]!=1)||rep.p.ptr.p_int[2]!=2)||rep.p.ptr.p_int[3]!=3; berr = (berr||rep.pz.ptr.pp_int[0][0]!=0)||rep.pz.ptr.pp_int[0][1]!=1; berr = (berr||rep.pz.ptr.pp_int[1][0]!=2)||rep.pz.ptr.pp_int[1][1]!=3; berr = (berr||rep.pz.ptr.pp_int[2][0]!=4)||rep.pz.ptr.pp_int[2][1]!=5; berr = (((berr||rep.pm.ptr.pp_int[0][0]!=0)||rep.pm.ptr.pp_int[0][1]!=0)||rep.pm.ptr.pp_int[0][2]!=1)||rep.pm.ptr.pp_int[0][3]!=1; berr = (((berr||rep.pm.ptr.pp_int[1][0]!=2)||rep.pm.ptr.pp_int[1][1]!=2)||rep.pm.ptr.pp_int[1][2]!=3)||rep.pm.ptr.pp_int[1][3]!=3; berr = (((berr||rep.pm.ptr.pp_int[2][0]!=0)||rep.pm.ptr.pp_int[2][1]!=1)||rep.pm.ptr.pp_int[2][2]!=2)||rep.pm.ptr.pp_int[2][3]!=3; if( berr ) { ae_frame_leave(_state); return result; } clusterizerseparatedbydist(&rep, 0.5, &k, &cidx, &cz, _state); clusterizergetkclusters(&rep, 4, &cidx2, &cz2, _state); if( k!=4 ) { ae_frame_leave(_state); return result; } if( ((cidx.ptr.p_int[0]!=cidx2.ptr.p_int[0]||cidx.ptr.p_int[1]!=cidx2.ptr.p_int[1])||cidx.ptr.p_int[2]!=cidx2.ptr.p_int[2])||cidx.ptr.p_int[3]!=cidx2.ptr.p_int[3] ) { ae_frame_leave(_state); return result; } if( ((cz.ptr.p_int[0]!=cz2.ptr.p_int[0]||cz.ptr.p_int[1]!=cz2.ptr.p_int[1])||cz.ptr.p_int[2]!=cz2.ptr.p_int[2])||cz.ptr.p_int[3]!=cz2.ptr.p_int[3] ) { ae_frame_leave(_state); return result; } clusterizerseparatedbydist(&rep, 1.05, &k, &cidx, &cz, _state); clusterizergetkclusters(&rep, 3, &cidx2, &cz2, _state); if( k!=3 ) { ae_frame_leave(_state); return result; } if( ((cidx.ptr.p_int[0]!=cidx2.ptr.p_int[0]||cidx.ptr.p_int[1]!=cidx2.ptr.p_int[1])||cidx.ptr.p_int[2]!=cidx2.ptr.p_int[2])||cidx.ptr.p_int[3]!=cidx2.ptr.p_int[3] ) { ae_frame_leave(_state); return result; } if( (cz.ptr.p_int[0]!=cz2.ptr.p_int[0]||cz.ptr.p_int[1]!=cz2.ptr.p_int[1])||cz.ptr.p_int[2]!=cz2.ptr.p_int[2] ) { ae_frame_leave(_state); return result; } clusterizerseparatedbydist(&rep, 1.15, &k, &cidx, &cz, _state); clusterizergetkclusters(&rep, 2, &cidx2, &cz2, _state); if( k!=2 ) { ae_frame_leave(_state); return result; } if( ((cidx.ptr.p_int[0]!=cidx2.ptr.p_int[0]||cidx.ptr.p_int[1]!=cidx2.ptr.p_int[1])||cidx.ptr.p_int[2]!=cidx2.ptr.p_int[2])||cidx.ptr.p_int[3]!=cidx2.ptr.p_int[3] ) { ae_frame_leave(_state); return result; } if( cz.ptr.p_int[0]!=cz2.ptr.p_int[0]||cz.ptr.p_int[1]!=cz2.ptr.p_int[1] ) { ae_frame_leave(_state); return result; } /* * Test on specially designed problem with Pearson distance * which should have following dendrogram: * * ------ * | | * ---- ---- * | | | | * 0 1 2 3 * * This problem is used to test ClusterizerSeparatedByDist(). * The test is performed by comparing function result with * ClusterizerGetKClusters() for known K. 
* * NOTE: * * corr(a0,a1) = 0.866 * * corr(a2,a3) = 0.990 * * corr(a0/a1, a2/a3)<=0.5 */ ae_matrix_set_length(&xy, 4, 3, _state); xy.ptr.pp_double[0][0] = 0.3; xy.ptr.pp_double[0][1] = 0.5; xy.ptr.pp_double[0][2] = 0.3; xy.ptr.pp_double[1][0] = 0.3; xy.ptr.pp_double[1][1] = 0.5; xy.ptr.pp_double[1][2] = 0.4; xy.ptr.pp_double[2][0] = 0.1; xy.ptr.pp_double[2][1] = 0.5; xy.ptr.pp_double[2][2] = 0.9; xy.ptr.pp_double[3][0] = 0.1; xy.ptr.pp_double[3][1] = 0.4; xy.ptr.pp_double[3][2] = 0.9; clusterizercreate(&s, _state); clusterizersetpoints(&s, &xy, 4, 3, 10, _state); clusterizersetahcalgo(&s, 1, _state); clusterizerrunahc(&s, &rep, _state); clusterizerseparatedbycorr(&rep, 0.999, &k, &cidx, &cz, _state); clusterizergetkclusters(&rep, 4, &cidx2, &cz2, _state); if( k!=4 ) { ae_frame_leave(_state); return result; } if( ((cidx.ptr.p_int[0]!=cidx2.ptr.p_int[0]||cidx.ptr.p_int[1]!=cidx2.ptr.p_int[1])||cidx.ptr.p_int[2]!=cidx2.ptr.p_int[2])||cidx.ptr.p_int[3]!=cidx2.ptr.p_int[3] ) { ae_frame_leave(_state); return result; } if( ((cz.ptr.p_int[0]!=cz2.ptr.p_int[0]||cz.ptr.p_int[1]!=cz2.ptr.p_int[1])||cz.ptr.p_int[2]!=cz2.ptr.p_int[2])||cz.ptr.p_int[3]!=cz2.ptr.p_int[3] ) { ae_frame_leave(_state); return result; } clusterizerseparatedbycorr(&rep, 0.900, &k, &cidx, &cz, _state); clusterizergetkclusters(&rep, 3, &cidx2, &cz2, _state); if( k!=3 ) { ae_frame_leave(_state); return result; } if( ((cidx.ptr.p_int[0]!=cidx2.ptr.p_int[0]||cidx.ptr.p_int[1]!=cidx2.ptr.p_int[1])||cidx.ptr.p_int[2]!=cidx2.ptr.p_int[2])||cidx.ptr.p_int[3]!=cidx2.ptr.p_int[3] ) { ae_frame_leave(_state); return result; } if( (cz.ptr.p_int[0]!=cz2.ptr.p_int[0]||cz.ptr.p_int[1]!=cz2.ptr.p_int[1])||cz.ptr.p_int[2]!=cz2.ptr.p_int[2] ) { ae_frame_leave(_state); return result; } clusterizerseparatedbycorr(&rep, 0.600, &k, &cidx, &cz, _state); clusterizergetkclusters(&rep, 2, &cidx2, &cz2, _state); if( k!=2 ) { ae_frame_leave(_state); return result; } if( ((cidx.ptr.p_int[0]!=cidx2.ptr.p_int[0]||cidx.ptr.p_int[1]!=cidx2.ptr.p_int[1])||cidx.ptr.p_int[2]!=cidx2.ptr.p_int[2])||cidx.ptr.p_int[3]!=cidx2.ptr.p_int[3] ) { ae_frame_leave(_state); return result; } if( cz.ptr.p_int[0]!=cz2.ptr.p_int[0]||cz.ptr.p_int[1]!=cz2.ptr.p_int[1] ) { ae_frame_leave(_state); return result; } /* * Single linkage vs. 
complete linkage */ ae_matrix_set_length(&xy, 6, 1, _state); xy.ptr.pp_double[0][0] = 0.0; xy.ptr.pp_double[1][0] = 1.0; xy.ptr.pp_double[2][0] = 2.1; xy.ptr.pp_double[3][0] = 3.3; xy.ptr.pp_double[4][0] = 6.0; xy.ptr.pp_double[5][0] = 4.6; clusterizercreate(&s, _state); clusterizersetpoints(&s, &xy, 6, 1, 0, _state); clusterizersetahcalgo(&s, 0, _state); clusterizerrunahc(&s, &rep, _state); if( rep.npoints!=6||rep.p.cnt!=6 ) { ae_frame_leave(_state); return result; } if( ((rep.z.rows!=5||rep.z.cols!=2)||rep.pz.rows!=5)||rep.pz.cols!=2 ) { ae_frame_leave(_state); return result; } berr = ae_false; berr = berr||rep.p.ptr.p_int[0]!=2; berr = berr||rep.p.ptr.p_int[1]!=3; berr = berr||rep.p.ptr.p_int[2]!=4; berr = berr||rep.p.ptr.p_int[3]!=5; berr = berr||rep.p.ptr.p_int[4]!=0; berr = berr||rep.p.ptr.p_int[5]!=1; berr = (berr||rep.z.ptr.pp_int[0][0]!=0)||rep.z.ptr.pp_int[0][1]!=1; berr = (berr||rep.z.ptr.pp_int[1][0]!=2)||rep.z.ptr.pp_int[1][1]!=3; berr = (berr||rep.z.ptr.pp_int[2][0]!=4)||rep.z.ptr.pp_int[2][1]!=5; berr = (berr||rep.z.ptr.pp_int[3][0]!=6)||rep.z.ptr.pp_int[3][1]!=7; berr = (berr||rep.z.ptr.pp_int[4][0]!=8)||rep.z.ptr.pp_int[4][1]!=9; berr = (berr||rep.pz.ptr.pp_int[0][0]!=2)||rep.pz.ptr.pp_int[0][1]!=3; berr = (berr||rep.pz.ptr.pp_int[1][0]!=4)||rep.pz.ptr.pp_int[1][1]!=5; berr = (berr||rep.pz.ptr.pp_int[2][0]!=0)||rep.pz.ptr.pp_int[2][1]!=1; berr = (berr||rep.pz.ptr.pp_int[3][0]!=6)||rep.pz.ptr.pp_int[3][1]!=7; berr = (berr||rep.pz.ptr.pp_int[4][0]!=8)||rep.pz.ptr.pp_int[4][1]!=9; if( berr ) { ae_frame_leave(_state); return result; } clusterizersetahcalgo(&s, 1, _state); clusterizerrunahc(&s, &rep, _state); if( (rep.npoints!=6||rep.z.rows!=5)||rep.z.cols!=2 ) { ae_frame_leave(_state); return result; } berr = ae_false; berr = (berr||rep.z.ptr.pp_int[0][0]!=0)||rep.z.ptr.pp_int[0][1]!=1; berr = (berr||rep.z.ptr.pp_int[1][0]!=2)||rep.z.ptr.pp_int[1][1]!=6; berr = (berr||rep.z.ptr.pp_int[2][0]!=3)||rep.z.ptr.pp_int[2][1]!=7; berr = (berr||rep.z.ptr.pp_int[3][0]!=5)||rep.z.ptr.pp_int[3][1]!=8; berr = (berr||rep.z.ptr.pp_int[4][0]!=4)||rep.z.ptr.pp_int[4][1]!=9; if( berr ) { ae_frame_leave(_state); return result; } /* * Test which differentiates complete linkage and average linkage from * single linkage: * * we have cluster C0={(-0.5), (0)}, * cluster C1={(19.0), (20.0), (21.0), (22.0), (23.0)}, * and point P between them - (10.0) * * we try three different strategies - single linkage, complete * linkage, average linkage. 
* * any strategy will merge C0 first, then merge points of C1, * and then merge P with C0 or C1 (depending on linkage type) * * we test that: * a) C0 is merged first * b) after 5 merges (including merge of C0), P is merged with C0 or C1 * c) P is merged with C1 when we have single linkage, with C0 otherwise */ ae_matrix_set_length(&xy, 8, 1, _state); xy.ptr.pp_double[0][0] = -0.5; xy.ptr.pp_double[1][0] = 0.0; xy.ptr.pp_double[2][0] = 10.0; xy.ptr.pp_double[3][0] = 19.0; xy.ptr.pp_double[4][0] = 20.0; xy.ptr.pp_double[5][0] = 21.0; xy.ptr.pp_double[6][0] = 22.0; xy.ptr.pp_double[7][0] = 23.0; clusterizercreate(&s, _state); clusterizersetpoints(&s, &xy, 8, 1, 0, _state); for(ahcalgo=0; ahcalgo<=2; ahcalgo++) { clusterizersetahcalgo(&s, ahcalgo, _state); clusterizerrunahc(&s, &rep, _state); if( (rep.npoints!=8||rep.z.rows!=7)||rep.z.cols!=2 ) { ae_frame_leave(_state); return result; } if( rep.z.ptr.pp_int[0][0]!=0||rep.z.ptr.pp_int[0][1]!=1 ) { ae_frame_leave(_state); return result; } if( rep.z.ptr.pp_int[5][0]!=2&&rep.z.ptr.pp_int[5][1]!=2 ) { ae_frame_leave(_state); return result; } if( rep.z.ptr.pp_int[5][0]!=2&&rep.z.ptr.pp_int[5][1]!=2 ) { ae_frame_leave(_state); return result; } if( (ahcalgo==0||ahcalgo==2)&&(rep.z.ptr.pp_int[5][0]!=8&&rep.z.ptr.pp_int[5][1]!=8) ) { ae_frame_leave(_state); return result; } if( ahcalgo==1&&(rep.z.ptr.pp_int[5][0]==8||rep.z.ptr.pp_int[5][1]==8) ) { ae_frame_leave(_state); return result; } } /* * Test which differentiates single linkage and average linkage from * complete linkage: * * we have cluster C0={(-2.5), (-2.0)}, * cluster C1={(19.0), (20.0), (21.0), (22.0), (23.0)}, * and point P between them - (10.0) * * we try three different strategies - single linkage, complete * linkage, average linkage. * * any strategy will merge C0 first, then merge points of C1, * and then merge P with C0 or C1 (depending on linkage type) * * we test that: * a) C0 is merged first * b) after 5 merges (including merge of C0), P is merged with C0 or C1 * c) P is merged with C0 when we have complete linkage, with C1 otherwise */ ae_matrix_set_length(&xy, 8, 1, _state); xy.ptr.pp_double[0][0] = -2.5; xy.ptr.pp_double[1][0] = -2.0; xy.ptr.pp_double[2][0] = 10.0; xy.ptr.pp_double[3][0] = 19.0; xy.ptr.pp_double[4][0] = 20.0; xy.ptr.pp_double[5][0] = 21.0; xy.ptr.pp_double[6][0] = 22.0; xy.ptr.pp_double[7][0] = 23.0; clusterizercreate(&s, _state); clusterizersetpoints(&s, &xy, 8, 1, 0, _state); for(ahcalgo=0; ahcalgo<=2; ahcalgo++) { clusterizersetahcalgo(&s, ahcalgo, _state); clusterizerrunahc(&s, &rep, _state); if( (rep.npoints!=8||rep.z.rows!=7)||rep.z.cols!=2 ) { ae_frame_leave(_state); return result; } if( rep.z.ptr.pp_int[0][0]!=0||rep.z.ptr.pp_int[0][1]!=1 ) { ae_frame_leave(_state); return result; } if( rep.z.ptr.pp_int[5][0]!=2&&rep.z.ptr.pp_int[5][1]!=2 ) { ae_frame_leave(_state); return result; } if( rep.z.ptr.pp_int[5][0]!=2&&rep.z.ptr.pp_int[5][1]!=2 ) { ae_frame_leave(_state); return result; } if( ahcalgo==0&&(rep.z.ptr.pp_int[5][0]!=8&&rep.z.ptr.pp_int[5][1]!=8) ) { ae_frame_leave(_state); return result; } if( (ahcalgo==1||ahcalgo==2)&&(rep.z.ptr.pp_int[5][0]==8||rep.z.ptr.pp_int[5][1]==8) ) { ae_frame_leave(_state); return result; } } /* * Test which differentiates weighred average linkage from unweighted average linkage: * * we have cluster C0={(0.0), (1.5), (2.5)}, * cluster C1={(7.5), (7.99)}, * and point P between them - (4.5) * * we try two different strategies - weighted average linkage and unweighted average linkage * * any strategy will merge C1 first, 
then merge points of C0, * and then merge P with C0 or C1 (depending on linkage type) * * we test that: * a) C1 is merged first, C0 is merged after that * b) after first 3 merges P is merged with C0 or C1 * c) P is merged with C1 when we have weighted average linkage, with C0 otherwise */ ae_matrix_set_length(&xy, 6, 1, _state); xy.ptr.pp_double[0][0] = 0.0; xy.ptr.pp_double[1][0] = 1.5; xy.ptr.pp_double[2][0] = 2.5; xy.ptr.pp_double[3][0] = 4.5; xy.ptr.pp_double[4][0] = 7.5; xy.ptr.pp_double[5][0] = 7.99; clusterizercreate(&s, _state); clusterizersetpoints(&s, &xy, 6, 1, 0, _state); for(ahcalgo=2; ahcalgo<=3; ahcalgo++) { clusterizersetahcalgo(&s, ahcalgo, _state); clusterizerrunahc(&s, &rep, _state); if( (rep.npoints!=6||rep.z.rows!=5)||rep.z.cols!=2 ) { ae_frame_leave(_state); return result; } if( rep.z.ptr.pp_int[0][0]!=4||rep.z.ptr.pp_int[0][1]!=5 ) { ae_frame_leave(_state); return result; } if( rep.z.ptr.pp_int[1][0]!=1||rep.z.ptr.pp_int[1][1]!=2 ) { ae_frame_leave(_state); return result; } if( rep.z.ptr.pp_int[2][0]!=0||rep.z.ptr.pp_int[2][1]!=7 ) { ae_frame_leave(_state); return result; } if( rep.z.ptr.pp_int[3][0]!=3 ) { ae_frame_leave(_state); return result; } if( ahcalgo==2&&rep.z.ptr.pp_int[3][1]!=8 ) { ae_frame_leave(_state); return result; } if( ahcalgo==3&&rep.z.ptr.pp_int[3][1]!=6 ) { ae_frame_leave(_state); return result; } } /* * Test which checks correctness of Ward's method on very basic problem */ ae_matrix_set_length(&xy, 4, 1, _state); xy.ptr.pp_double[0][0] = 0.0; xy.ptr.pp_double[1][0] = 1.0; xy.ptr.pp_double[2][0] = 3.1; xy.ptr.pp_double[3][0] = 4.0; clusterizercreate(&s, _state); clusterizersetpoints(&s, &xy, xy.rows, xy.cols, 2, _state); clusterizergetdistances(&xy, xy.rows, xy.cols, 2, &d, _state); clusterizersetahcalgo(&s, 4, _state); clusterizerrunahc(&s, &rep, _state); if( testclusteringunit_errorsinmerges(&d, &xy, xy.rows, xy.cols, &rep, 4, _state) ) { ae_frame_leave(_state); return result; } /* * One more Ward's test */ ae_matrix_set_length(&xy, 8, 2, _state); xy.ptr.pp_double[0][0] = 0.4700566262; xy.ptr.pp_double[0][1] = 0.4565938448; xy.ptr.pp_double[1][0] = 0.2394499506; xy.ptr.pp_double[1][1] = 0.1750209592; xy.ptr.pp_double[2][0] = 0.6518417019; xy.ptr.pp_double[2][1] = 0.6151370746; xy.ptr.pp_double[3][0] = 0.9863942841; xy.ptr.pp_double[3][1] = 0.7855012189; xy.ptr.pp_double[4][0] = 0.1517812919; xy.ptr.pp_double[4][1] = 0.2600174758; xy.ptr.pp_double[5][0] = 0.7840203638; xy.ptr.pp_double[5][1] = 0.9023597604; xy.ptr.pp_double[6][0] = 0.2604194835; xy.ptr.pp_double[6][1] = 0.9792704661; xy.ptr.pp_double[7][0] = 0.6353096042; xy.ptr.pp_double[7][1] = 0.8252606906; clusterizercreate(&s, _state); clusterizersetpoints(&s, &xy, xy.rows, xy.cols, 2, _state); clusterizergetdistances(&xy, xy.rows, xy.cols, 2, &d, _state); clusterizersetahcalgo(&s, 4, _state); clusterizerrunahc(&s, &rep, _state); if( rep.z.ptr.pp_int[0][0]!=1||rep.z.ptr.pp_int[0][1]!=4 ) { ae_frame_leave(_state); return result; } if( rep.z.ptr.pp_int[1][0]!=5||rep.z.ptr.pp_int[1][1]!=7 ) { ae_frame_leave(_state); return result; } if( rep.z.ptr.pp_int[2][0]!=0||rep.z.ptr.pp_int[2][1]!=2 ) { ae_frame_leave(_state); return result; } if( rep.z.ptr.pp_int[3][0]!=3||rep.z.ptr.pp_int[3][1]!=9 ) { ae_frame_leave(_state); return result; } if( rep.z.ptr.pp_int[4][0]!=10||rep.z.ptr.pp_int[4][1]!=11 ) { ae_frame_leave(_state); return result; } if( rep.z.ptr.pp_int[5][0]!=6||rep.z.ptr.pp_int[5][1]!=12 ) { ae_frame_leave(_state); return result; } if( rep.z.ptr.pp_int[6][0]!=8||rep.z.ptr.pp_int[6][1]!=13 
) { ae_frame_leave(_state); return result; } if( testclusteringunit_errorsinmerges(&d, &xy, xy.rows, xy.cols, &rep, 4, _state) ) { ae_frame_leave(_state); return result; } /* * Ability to solve problems with zero distance matrix */ npoints = 20; ae_matrix_set_length(&d, npoints, npoints, _state); for(i=0; i<=npoints-1; i++) { for(j=0; j<=npoints-1; j++) { d.ptr.pp_double[i][j] = 0.0; } } for(ahcalgo=0; ahcalgo<=4; ahcalgo++) { clusterizercreate(&s, _state); clusterizersetdistances(&s, &d, npoints, ae_true, _state); clusterizersetahcalgo(&s, ahcalgo, _state); clusterizerrunahc(&s, &rep, _state); if( (rep.npoints!=npoints||rep.z.rows!=npoints-1)||rep.z.cols!=2 ) { ae_frame_leave(_state); return result; } } /* * Test GetKClusters() */ ae_matrix_set_length(&xy, 8, 1, _state); xy.ptr.pp_double[0][0] = -2.5; xy.ptr.pp_double[1][0] = -2.0; xy.ptr.pp_double[2][0] = 10.0; xy.ptr.pp_double[3][0] = 19.0; xy.ptr.pp_double[4][0] = 20.0; xy.ptr.pp_double[5][0] = 21.0; xy.ptr.pp_double[6][0] = 22.0; xy.ptr.pp_double[7][0] = 23.0; clusterizercreate(&s, _state); clusterizersetpoints(&s, &xy, 8, 1, 0, _state); clusterizersetahcalgo(&s, 0, _state); clusterizerrunahc(&s, &rep, _state); clusterizergetkclusters(&rep, 3, &cidx, &cz, _state); if( ((((((cidx.ptr.p_int[0]!=1||cidx.ptr.p_int[1]!=1)||cidx.ptr.p_int[2]!=0)||cidx.ptr.p_int[3]!=2)||cidx.ptr.p_int[4]!=2)||cidx.ptr.p_int[5]!=2)||cidx.ptr.p_int[6]!=2)||cidx.ptr.p_int[7]!=2 ) { ae_frame_leave(_state); return result; } if( (cz.ptr.p_int[0]!=2||cz.ptr.p_int[1]!=8)||cz.ptr.p_int[2]!=12 ) { ae_frame_leave(_state); return result; } /* * Test is done */ result = ae_false; ae_frame_leave(_state); return result; } /************************************************************************* Advanced agglomerative hierarchical clustering tests : returns True on failure, False on success. Advanced testing subroutine perform several automatically generated tests. *************************************************************************/ static ae_bool testclusteringunit_advancedahctests(ae_state *_state) { ae_frame _frame_block; clusterizerstate s; ahcreport rep; ae_matrix xy; ae_matrix dm; ae_matrix dm2; ae_vector idx; ae_vector disttypes; ae_vector x0; ae_vector x1; ae_int_t d; ae_int_t n; ae_int_t npoints; ae_int_t ahcalgo; ae_int_t disttype; ae_int_t i; ae_int_t j; ae_int_t k; double v; ae_int_t t; ae_int_t euclidean; ae_bool result; ae_frame_make(_state, &_frame_block); _clusterizerstate_init(&s, _state); _ahcreport_init(&rep, _state); ae_matrix_init(&xy, 0, 0, DT_REAL, _state); ae_matrix_init(&dm, 0, 0, DT_REAL, _state); ae_matrix_init(&dm2, 0, 0, DT_REAL, _state); ae_vector_init(&idx, 0, DT_INT, _state); ae_vector_init(&disttypes, 0, DT_INT, _state); ae_vector_init(&x0, 0, DT_REAL, _state); ae_vector_init(&x1, 0, DT_REAL, _state); result = ae_false; euclidean = 2; /* * Test on D-dimensional problem: * * D = 2...5 * * D clusters, each has N points; * centers are located at x=(0 ... 1 ... 
0); * cluster radii are approximately 0.1 * * single/complete/unweighted_average/weighted_average linkage/Ward's method are tested * * Euclidean distance is used, either: * a) one given by distance matrix (ClusterizerSetDistances) * b) one calculated from dataset (ClusterizerSetPoints) * * we have N*D points, and N*D-1 merges in total * * points are randomly rearranged after generation * * For all kinds of linkage we perform following test: * * for each point we remember index of its cluster * (one which is determined during dataset generation) * * we clusterize points with ALGLIB capabilities * * we scan Rep.Z and perform first D*(N-1) merges * * for each merge we check that it merges points * from same cluster; * * Additonally, we call ErrorsInMerges(). See function comments * for more information about specific tests performed. This function * allows us to check that clusters are built exactly as specified by * definition of the clustering algorithm. */ for(d=2; d<=5; d++) { for(ahcalgo=0; ahcalgo<=4; ahcalgo++) { n = ae_round(ae_pow((double)(3), (double)(ae_randominteger(3, _state)), _state), _state); npoints = d*n; /* * 1. generate dataset. * 2. fill Idx (array of cluster indexes): * * first N*D elements store cluster indexes * * next D*(N-1) elements are filled during merges * 3. build distance matrix DM */ ae_matrix_set_length(&xy, n*d, d, _state); ae_vector_set_length(&idx, n*d+d*(n-1), _state); for(i=0; i<=n*d-1; i++) { for(j=0; j<=d-1; j++) { xy.ptr.pp_double[i][j] = 0.2*ae_randomreal(_state)-0.1; } xy.ptr.pp_double[i][i%d] = xy.ptr.pp_double[i][i%d]+1.0; idx.ptr.p_int[i] = i%d; } for(i=0; i<=n*d-1; i++) { k = ae_randominteger(n*d, _state); if( k!=i ) { for(j=0; j<=d-1; j++) { v = xy.ptr.pp_double[i][j]; xy.ptr.pp_double[i][j] = xy.ptr.pp_double[k][j]; xy.ptr.pp_double[k][j] = v; } t = idx.ptr.p_int[k]; idx.ptr.p_int[k] = idx.ptr.p_int[i]; idx.ptr.p_int[i] = t; } } ae_matrix_set_length(&dm, npoints, npoints, _state); ae_vector_set_length(&x0, d, _state); ae_vector_set_length(&x1, d, _state); for(i=0; i<=npoints-1; i++) { for(j=0; j<=npoints-1; j++) { ae_v_move(&x0.ptr.p_double[0], 1, &xy.ptr.pp_double[i][0], 1, ae_v_len(0,d-1)); ae_v_move(&x1.ptr.p_double[0], 1, &xy.ptr.pp_double[j][0], 1, ae_v_len(0,d-1)); dm.ptr.pp_double[i][j] = testclusteringunit_distfunc(&x0, &x1, d, euclidean, _state); } } /* * Clusterize with SetPoints() */ clusterizercreate(&s, _state); clusterizersetpoints(&s, &xy, n*d, d, euclidean, _state); clusterizersetahcalgo(&s, ahcalgo, _state); clusterizerrunahc(&s, &rep, _state); /* * Tests: * * replay first D*(N-1) merges; these merges should take place * within clusters, intercluster merges will be performed at the * last stages of the processing. * * test with ErrorsInMerges() */ if( rep.npoints!=npoints ) { result = ae_true; ae_frame_leave(_state); return result; } for(i=0; i<=d*(n-1)-1; i++) { /* * Check correctness of I-th row of Z */ if( (rep.z.ptr.pp_int[i][0]<0||rep.z.ptr.pp_int[i][0]>=rep.z.ptr.pp_int[i][1])||rep.z.ptr.pp_int[i][1]>=d*n+i ) { result = ae_true; ae_frame_leave(_state); return result; } /* * Check that merge is performed within cluster */ if( idx.ptr.p_int[rep.z.ptr.pp_int[i][0]]!=idx.ptr.p_int[rep.z.ptr.pp_int[i][1]] ) { result = ae_true; ae_frame_leave(_state); return result; } /* * Write new entry of Idx. 
* Both points from the same cluster, so result of the merge * belongs to the same cluster */ idx.ptr.p_int[n*d+i] = idx.ptr.p_int[rep.z.ptr.pp_int[i][1]]; } if( ((ahcalgo==0||ahcalgo==1)||ahcalgo==2)||ahcalgo==4 ) { if( testclusteringunit_errorsinmerges(&dm, &xy, d*n, d, &rep, ahcalgo, _state) ) { result = ae_true; ae_frame_leave(_state); return result; } } /* * Clusterize one more time, now with distance matrix */ clusterizercreate(&s, _state); clusterizersetdistances(&s, &dm, n*d, ae_fp_greater(ae_randomreal(_state),0.5), _state); clusterizersetahcalgo(&s, ahcalgo, _state); clusterizerrunahc(&s, &rep, _state); /* * Tests: * * replay first D*(N-1) merges; these merges should take place * within clusters, intercluster merges will be performed at the * last stages of the processing. * * test with ErrorsInMerges() */ if( rep.npoints!=npoints ) { result = ae_true; ae_frame_leave(_state); return result; } for(i=0; i<=d*(n-1)-1; i++) { /* * Check correctness of I-th row of Z */ if( (rep.z.ptr.pp_int[i][0]<0||rep.z.ptr.pp_int[i][0]>=rep.z.ptr.pp_int[i][1])||rep.z.ptr.pp_int[i][1]>=d*n+i ) { result = ae_true; ae_frame_leave(_state); return result; } /* * Check that merge is performed within cluster */ if( idx.ptr.p_int[rep.z.ptr.pp_int[i][0]]!=idx.ptr.p_int[rep.z.ptr.pp_int[i][1]] ) { result = ae_true; ae_frame_leave(_state); return result; } /* * Write new entry of Idx. * Both points from the same cluster, so result of the merge * belongs to the same cluster */ idx.ptr.p_int[n*d+i] = idx.ptr.p_int[rep.z.ptr.pp_int[i][1]]; } if( ((ahcalgo==0||ahcalgo==1)||ahcalgo==2)||ahcalgo==4 ) { if( testclusteringunit_errorsinmerges(&dm, &xy, d*n, d, &rep, ahcalgo, _state) ) { result = ae_true; ae_frame_leave(_state); return result; } } } } /* * Test on random D-dimensional problem: * * D = 2...5 * * N=1..16 random points from unit hypercube * * single/complete/unweighted_average linkage/Ward's method are tested * * different distance functions are tested * * we call ErrorsInMerges() and we check distance matrix * calculated by unit test against one returned by GetDistances() */ ae_vector_set_length(&disttypes, 9, _state); disttypes.ptr.p_int[0] = 0; disttypes.ptr.p_int[1] = 1; disttypes.ptr.p_int[2] = 2; disttypes.ptr.p_int[3] = 10; disttypes.ptr.p_int[4] = 11; disttypes.ptr.p_int[5] = 12; disttypes.ptr.p_int[6] = 13; disttypes.ptr.p_int[7] = 20; disttypes.ptr.p_int[8] = 21; for(disttype=0; disttype<=disttypes.cnt-1; disttype++) { for(ahcalgo=0; ahcalgo<=4; ahcalgo++) { if( ahcalgo==3 ) { continue; } if( ahcalgo==4&&disttype!=2 ) { continue; } npoints = ae_round(ae_pow((double)(2), (double)(ae_randominteger(5, _state)), _state), _state); d = 2+ae_randominteger(4, _state); /* * Generate dataset and distance matrix */ ae_matrix_set_length(&xy, npoints, d, _state); for(i=0; i<=npoints-1; i++) { for(j=0; j<=d-1; j++) { xy.ptr.pp_double[i][j] = ae_randomreal(_state); } } ae_matrix_set_length(&dm, npoints, npoints, _state); ae_vector_set_length(&x0, d, _state); ae_vector_set_length(&x1, d, _state); for(i=0; i<=npoints-1; i++) { for(j=0; j<=npoints-1; j++) { ae_v_move(&x0.ptr.p_double[0], 1, &xy.ptr.pp_double[i][0], 1, ae_v_len(0,d-1)); ae_v_move(&x1.ptr.p_double[0], 1, &xy.ptr.pp_double[j][0], 1, ae_v_len(0,d-1)); dm.ptr.pp_double[i][j] = testclusteringunit_distfunc(&x0, &x1, d, disttypes.ptr.p_int[disttype], _state); } } /* * Clusterize */ clusterizercreate(&s, _state); clusterizersetpoints(&s, &xy, npoints, d, disttypes.ptr.p_int[disttype], _state); clusterizersetahcalgo(&s, ahcalgo, _state); 
clusterizerrunahc(&s, &rep, _state); /* * Test with ErrorsInMerges() */ if( testclusteringunit_errorsinmerges(&dm, &xy, npoints, d, &rep, ahcalgo, _state) ) { result = ae_true; ae_frame_leave(_state); return result; } /* * Test distance matrix */ clusterizergetdistances(&xy, npoints, d, disttypes.ptr.p_int[disttype], &dm2, _state); for(i=0; i<=npoints-1; i++) { for(j=0; j<=npoints-1; j++) { if( !ae_isfinite(dm2.ptr.pp_double[i][j], _state)||ae_fp_greater(ae_fabs(dm.ptr.pp_double[i][j]-dm2.ptr.pp_double[i][j], _state),1.0E5*ae_machineepsilon) ) { result = ae_true; ae_frame_leave(_state); return result; } } } } } ae_frame_leave(_state); return result; } /************************************************************************* Simple test 1: ellipsoid in NF-dimensional space. compare k-means centers with random centers *************************************************************************/ static void testclusteringunit_kmeanssimpletest1(ae_int_t nvars, ae_int_t nc, ae_int_t passcount, ae_bool* converrors, ae_bool* othererrors, ae_bool* simpleerrors, ae_state *_state) { ae_frame _frame_block; ae_int_t npoints; ae_int_t majoraxis; ae_matrix xy; ae_vector tmp; double v; ae_int_t i; ae_int_t j; ae_int_t pass; ae_int_t restarts; double ekmeans; double erandom; double dclosest; ae_int_t cclosest; clusterizerstate s; kmeansreport rep; ae_frame_make(_state, &_frame_block); ae_matrix_init(&xy, 0, 0, DT_REAL, _state); ae_vector_init(&tmp, 0, DT_REAL, _state); _clusterizerstate_init(&s, _state); _kmeansreport_init(&rep, _state); npoints = nc*100; restarts = 5; passcount = 10; ae_vector_set_length(&tmp, nvars-1+1, _state); for(pass=1; pass<=passcount; pass++) { /* * Fill */ ae_matrix_set_length(&xy, npoints-1+1, nvars-1+1, _state); majoraxis = ae_randominteger(nvars, _state); for(i=0; i<=npoints-1; i++) { testclusteringunit_rsphere(&xy, nvars, i, _state); xy.ptr.pp_double[i][majoraxis] = nc*xy.ptr.pp_double[i][majoraxis]; } /* * Test */ clusterizercreate(&s, _state); clusterizersetpoints(&s, &xy, npoints, nvars, 2, _state); clusterizersetkmeanslimits(&s, restarts, 0, _state); clusterizerrunkmeans(&s, nc, &rep, _state); if( rep.terminationtype<=0 ) { *converrors = ae_true; ae_frame_leave(_state); return; } /* * Test that XYC is correct mapping to cluster centers */ for(i=0; i<=npoints-1; i++) { cclosest = -1; dclosest = ae_maxrealnumber; for(j=0; j<=nc-1; j++) { ae_v_move(&tmp.ptr.p_double[0], 1, &xy.ptr.pp_double[i][0], 1, ae_v_len(0,nvars-1)); ae_v_sub(&tmp.ptr.p_double[0], 1, &rep.c.ptr.pp_double[j][0], 1, ae_v_len(0,nvars-1)); v = ae_v_dotproduct(&tmp.ptr.p_double[0], 1, &tmp.ptr.p_double[0], 1, ae_v_len(0,nvars-1)); if( ae_fp_less(v,dclosest) ) { cclosest = j; dclosest = v; } } if( cclosest!=rep.cidx.ptr.p_int[i] ) { *othererrors = ae_true; ae_frame_leave(_state); return; } } /* * Use first NC rows of XY as random centers * (XY is totally random, so it is as good as any other choice). * * Compare potential functions. 
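*
* Here "potential function" (energy) means the sum over all points of the
* squared Euclidean distance from the point to the center it is assigned to;
* the k-means centers must not be worse than the randomly chosen centers
* under this measure.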
*/ ekmeans = (double)(0); for(i=0; i<=npoints-1; i++) { ae_v_move(&tmp.ptr.p_double[0], 1, &xy.ptr.pp_double[i][0], 1, ae_v_len(0,nvars-1)); ae_v_sub(&tmp.ptr.p_double[0], 1, &rep.c.ptr.pp_double[rep.cidx.ptr.p_int[i]][0], 1, ae_v_len(0,nvars-1)); v = ae_v_dotproduct(&tmp.ptr.p_double[0], 1, &tmp.ptr.p_double[0], 1, ae_v_len(0,nvars-1)); ekmeans = ekmeans+v; } erandom = (double)(0); for(i=0; i<=npoints-1; i++) { dclosest = ae_maxrealnumber; v = (double)(0); for(j=0; j<=nc-1; j++) { ae_v_move(&tmp.ptr.p_double[0], 1, &xy.ptr.pp_double[i][0], 1, ae_v_len(0,nvars-1)); ae_v_sub(&tmp.ptr.p_double[0], 1, &xy.ptr.pp_double[j][0], 1, ae_v_len(0,nvars-1)); v = ae_v_dotproduct(&tmp.ptr.p_double[0], 1, &tmp.ptr.p_double[0], 1, ae_v_len(0,nvars-1)); if( ae_fp_less(v,dclosest) ) { dclosest = v; } } erandom = erandom+v; } if( ae_fp_less(erandom,ekmeans) ) { *simpleerrors = ae_true; ae_frame_leave(_state); return; } } ae_frame_leave(_state); } /************************************************************************* This test perform several checks for special properties On failure sets error flag, on success leaves it unchanged. *************************************************************************/ static void testclusteringunit_kmeansspecialtests(ae_bool* othererrors, ae_state *_state) { ae_frame _frame_block; ae_int_t npoints; ae_int_t nfeatures; ae_int_t nclusters; ae_int_t initalgo; ae_matrix xy; ae_matrix c; ae_int_t idx0; ae_int_t idx1; ae_int_t idx2; ae_int_t i; ae_int_t j; ae_int_t pass; ae_int_t passcount; ae_int_t separation; ae_vector xyc; ae_vector xycref; ae_vector xydist2; ae_vector xydist2ref; ae_vector energies; hqrndstate rs; clusterizerstate s; kmeansreport rep; ae_shared_pool bufferpool; apbuffers bufferseed; ae_vector pointslist; ae_vector featureslist; ae_vector clusterslist; ae_frame_make(_state, &_frame_block); ae_matrix_init(&xy, 0, 0, DT_REAL, _state); ae_matrix_init(&c, 0, 0, DT_REAL, _state); ae_vector_init(&xyc, 0, DT_INT, _state); ae_vector_init(&xycref, 0, DT_INT, _state); ae_vector_init(&xydist2, 0, DT_REAL, _state); ae_vector_init(&xydist2ref, 0, DT_REAL, _state); ae_vector_init(&energies, 0, DT_REAL, _state); _hqrndstate_init(&rs, _state); _clusterizerstate_init(&s, _state); _kmeansreport_init(&rep, _state); ae_shared_pool_init(&bufferpool, _state); _apbuffers_init(&bufferseed, _state); ae_vector_init(&pointslist, 0, DT_INT, _state); ae_vector_init(&featureslist, 0, DT_INT, _state); ae_vector_init(&clusterslist, 0, DT_INT, _state); hqrndrandomize(&rs, _state); /* * Compare different initialization algorithms: * * dataset is K balls, chosen at random gaussian points, with * radius equal to 2^(-Separation). * * we generate random sample, run k-means initialization algorithm * and calculate mean energy for each initialization algorithm. * In order to suppress Lloyd's iteration we use KmeansDbgNoIts * debug flag. * * then, we compare mean energies; kmeans++ must be best one, * random initialization must be worst one. 
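*
* NOTE: the InitAlgo values are assumed to follow the ClusterizerSetKMeansInit()
* convention: 1 - random initialization, 2 - k-means++, 3 - fast-greedy init.
* This is why the checks below require Energies[2] and Energies[3] to be
* smaller than Energies[1].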
*/ ae_vector_set_length(&energies, 4, _state); passcount = 1000; npoints = 100; nfeatures = 3; nclusters = 6; ae_matrix_set_length(&xy, npoints, nfeatures, _state); ae_matrix_set_length(&c, nclusters, nfeatures, _state); clusterizercreate(&s, _state); s.kmeansdbgnoits = ae_true; for(separation=2; separation<=5; separation++) { /* * Try different init algorithms */ for(initalgo=1; initalgo<=3; initalgo++) { energies.ptr.p_double[initalgo] = 0.0; clusterizersetkmeansinit(&s, initalgo, _state); for(pass=1; pass<=passcount; pass++) { /* * Generate centers of balls */ for(i=0; i<=nclusters-1; i++) { for(j=0; j<=nfeatures-1; j++) { c.ptr.pp_double[i][j] = hqrndnormal(&rs, _state); } } /* * Generate points */ for(i=0; i<=npoints-1; i++) { for(j=0; j<=nfeatures-1; j++) { xy.ptr.pp_double[i][j] = hqrndnormal(&rs, _state)*ae_pow((double)(2), (double)(-separation), _state)+c.ptr.pp_double[i%nclusters][j]; } } /* * Run clusterization */ clusterizersetpoints(&s, &xy, npoints, nfeatures, 2, _state); clusterizerrunkmeans(&s, nclusters, &rep, _state); seterrorflag(othererrors, rep.terminationtype<=0, _state); energies.ptr.p_double[initalgo] = energies.ptr.p_double[initalgo]+rep.energy/passcount; } } /* * Compare */ seterrorflag(othererrors, !ae_fp_less(energies.ptr.p_double[2],energies.ptr.p_double[1]), _state); seterrorflag(othererrors, !ae_fp_less(energies.ptr.p_double[3],energies.ptr.p_double[1]), _state); } /* * Test distance calculation algorithm */ ae_vector_set_length(&pointslist, 6, _state); pointslist.ptr.p_int[0] = 1; pointslist.ptr.p_int[1] = 10; pointslist.ptr.p_int[2] = 32; pointslist.ptr.p_int[3] = 100; pointslist.ptr.p_int[4] = 512; pointslist.ptr.p_int[5] = 8000; ae_vector_set_length(&featureslist, 5, _state); featureslist.ptr.p_int[0] = 1; featureslist.ptr.p_int[1] = 5; featureslist.ptr.p_int[2] = 32; featureslist.ptr.p_int[3] = 50; featureslist.ptr.p_int[4] = 96; ae_vector_set_length(&clusterslist, 5, _state); clusterslist.ptr.p_int[0] = 1; clusterslist.ptr.p_int[1] = 5; clusterslist.ptr.p_int[2] = 32; clusterslist.ptr.p_int[3] = 50; clusterslist.ptr.p_int[4] = 96; ae_shared_pool_set_seed(&bufferpool, &bufferseed, sizeof(bufferseed), _apbuffers_init, _apbuffers_init_copy, _apbuffers_destroy, _state); for(idx0=0; idx0<=pointslist.cnt-1; idx0++) { for(idx1=0; idx1<=featureslist.cnt-1; idx1++) { for(idx2=0; idx2<=clusterslist.cnt-1; idx2++) { npoints = pointslist.ptr.p_int[idx0]; nfeatures = featureslist.ptr.p_int[idx1]; nclusters = clusterslist.ptr.p_int[idx2]; ae_matrix_set_length(&xy, npoints, nfeatures, _state); for(i=0; i<=npoints-1; i++) { for(j=0; j<=nfeatures-1; j++) { xy.ptr.pp_double[i][j] = hqrndnormal(&rs, _state); } } ae_matrix_set_length(&c, nclusters, nfeatures, _state); for(i=0; i<=nclusters-1; i++) { for(j=0; j<=nfeatures-1; j++) { c.ptr.pp_double[i][j] = hqrndnormal(&rs, _state); } } ae_vector_set_length(&xyc, npoints, _state); ae_vector_set_length(&xycref, npoints, _state); ae_vector_set_length(&xydist2, npoints, _state); ae_vector_set_length(&xydist2ref, npoints, _state); /* * Test */ kmeansupdatedistances(&xy, 0, npoints, nfeatures, &c, 0, nclusters, &xyc, &xydist2, &bufferpool, _state); testclusteringunit_kmeansreferenceupdatedistances(&xy, npoints, nfeatures, &c, nclusters, &xycref, &xydist2ref, _state); for(i=0; i<=npoints-1; i++) { seterrorflag(othererrors, xyc.ptr.p_int[i]!=xycref.ptr.p_int[i], _state); seterrorflag(othererrors, ae_fp_greater(ae_fabs(xydist2.ptr.p_double[i]-xydist2ref.ptr.p_double[i], _state),1.0E-6), _state); } } } } /* * Test degenerate dataset 
(less than NClusters distinct points) */ for(nclusters=2; nclusters<=10; nclusters++) { for(initalgo=0; initalgo<=3; initalgo++) { for(pass=1; pass<=10; pass++) { /* * Initialize points. Two algorithms are used: * * initialization by small integers (no rounding problems) * * initialization by "long" fraction */ npoints = 100; nfeatures = 10; ae_matrix_set_length(&xy, npoints, nfeatures, _state); if( ae_fp_greater(hqrndnormal(&rs, _state),(double)(0)) ) { for(i=0; i<=nclusters-2; i++) { for(j=0; j<=nfeatures-1; j++) { xy.ptr.pp_double[i][j] = ae_sin(hqrndnormal(&rs, _state), _state); } } } else { for(i=0; i<=nclusters-2; i++) { for(j=0; j<=nfeatures-1; j++) { xy.ptr.pp_double[i][j] = (double)(hqrnduniformi(&rs, 50, _state)); } } } for(i=nclusters-1; i<=npoints-1; i++) { idx0 = hqrnduniformi(&rs, nclusters-1, _state); for(j=0; j<=nfeatures-1; j++) { xy.ptr.pp_double[i][j] = xy.ptr.pp_double[idx0][j]; } } /* * Clusterize with unlimited number of iterations. * Correct error code must be returned. */ clusterizercreate(&s, _state); clusterizersetpoints(&s, &xy, npoints, nfeatures, 2, _state); clusterizersetkmeanslimits(&s, 1, 0, _state); clusterizersetkmeansinit(&s, initalgo, _state); clusterizerrunkmeans(&s, nclusters, &rep, _state); seterrorflag(othererrors, rep.terminationtype!=-3, _state); } } } ae_frame_leave(_state); } /************************************************************************* This test checks algorithm ability to handle degenerate problems without causing infinite loop. *************************************************************************/ static void testclusteringunit_kmeansinfinitelooptest(ae_bool* othererrors, ae_state *_state) { ae_frame _frame_block; ae_int_t npoints; ae_int_t nfeatures; ae_int_t nclusters; ae_int_t restarts; ae_matrix xy; ae_int_t i; ae_int_t j; clusterizerstate s; kmeansreport rep; ae_frame_make(_state, &_frame_block); ae_matrix_init(&xy, 0, 0, DT_REAL, _state); _clusterizerstate_init(&s, _state); _kmeansreport_init(&rep, _state); /* * Problem 1: all points are same. * * For NClusters=1 we must get correct solution, for NClusters>1 we must get failure. */ npoints = 100; nfeatures = 1; restarts = 5; ae_matrix_set_length(&xy, npoints, nfeatures, _state); for(j=0; j<=nfeatures-1; j++) { xy.ptr.pp_double[0][j] = ae_randomreal(_state); } for(i=1; i<=npoints-1; i++) { for(j=0; j<=nfeatures-1; j++) { xy.ptr.pp_double[i][j] = xy.ptr.pp_double[0][j]; } } nclusters = 1; clusterizercreate(&s, _state); clusterizersetpoints(&s, &xy, npoints, nfeatures, 2, _state); clusterizersetkmeanslimits(&s, restarts, 0, _state); clusterizerrunkmeans(&s, nclusters, &rep, _state); *othererrors = *othererrors||rep.terminationtype<=0; for(i=0; i<=nfeatures-1; i++) { *othererrors = *othererrors||ae_fp_greater(ae_fabs(rep.c.ptr.pp_double[0][i]-xy.ptr.pp_double[0][i], _state),1000*ae_machineepsilon); } for(i=0; i<=npoints-1; i++) { *othererrors = *othererrors||rep.cidx.ptr.p_int[i]!=0; } nclusters = 5; clusterizerrunkmeans(&s, nclusters, &rep, _state); *othererrors = *othererrors||rep.terminationtype>0; /* * Problem 2: degenerate dataset (report by Andreas). 
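*
* The 57 values below agree in all but the last few decimal digits, so with
* NClusters=4 the centers are nearly indistinguishable. The point of the test
* is that rounding in the center updates must not send the solver into an
* infinite loop, and a successful (positive) termination code is still expected.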
*/ npoints = 57; nfeatures = 1; restarts = 1; nclusters = 4; ae_matrix_set_length(&xy, npoints, nfeatures, _state); xy.ptr.pp_double[0][0] = 12.244689632138986; xy.ptr.pp_double[1][0] = 12.244689632138982; xy.ptr.pp_double[2][0] = 12.244689632138986; xy.ptr.pp_double[3][0] = 12.244689632138982; xy.ptr.pp_double[4][0] = 12.244689632138986; xy.ptr.pp_double[5][0] = 12.244689632138986; xy.ptr.pp_double[6][0] = 12.244689632138986; xy.ptr.pp_double[7][0] = 12.244689632138986; xy.ptr.pp_double[8][0] = 12.244689632138986; xy.ptr.pp_double[9][0] = 12.244689632138986; xy.ptr.pp_double[10][0] = 12.244689632138989; xy.ptr.pp_double[11][0] = 12.244689632138984; xy.ptr.pp_double[12][0] = 12.244689632138986; xy.ptr.pp_double[13][0] = 12.244689632138986; xy.ptr.pp_double[14][0] = 12.244689632138989; xy.ptr.pp_double[15][0] = 12.244689632138986; xy.ptr.pp_double[16][0] = 12.244689632138986; xy.ptr.pp_double[17][0] = 12.244689632138986; xy.ptr.pp_double[18][0] = 12.244689632138986; xy.ptr.pp_double[19][0] = 12.244689632138989; xy.ptr.pp_double[20][0] = 12.244689632138972; xy.ptr.pp_double[21][0] = 12.244689632138986; xy.ptr.pp_double[22][0] = 12.244689632138986; xy.ptr.pp_double[23][0] = 12.244689632138986; xy.ptr.pp_double[24][0] = 12.244689632138984; xy.ptr.pp_double[25][0] = 12.244689632138982; xy.ptr.pp_double[26][0] = 12.244689632138986; xy.ptr.pp_double[27][0] = 12.244689632138986; xy.ptr.pp_double[28][0] = 12.244689632138986; xy.ptr.pp_double[29][0] = 12.244689632138986; xy.ptr.pp_double[30][0] = 12.244689632138986; xy.ptr.pp_double[31][0] = 12.244689632138986; xy.ptr.pp_double[32][0] = 12.244689632138986; xy.ptr.pp_double[33][0] = 12.244689632138986; xy.ptr.pp_double[34][0] = 12.244689632138986; xy.ptr.pp_double[35][0] = 12.244689632138982; xy.ptr.pp_double[36][0] = 12.244689632138989; xy.ptr.pp_double[37][0] = 12.244689632138986; xy.ptr.pp_double[38][0] = 12.244689632138986; xy.ptr.pp_double[39][0] = 12.244689632138986; xy.ptr.pp_double[40][0] = 12.244689632138986; xy.ptr.pp_double[41][0] = 12.244689632138986; xy.ptr.pp_double[42][0] = 12.244689632138986; xy.ptr.pp_double[43][0] = 12.244689632138986; xy.ptr.pp_double[44][0] = 12.244689632138986; xy.ptr.pp_double[45][0] = 12.244689632138986; xy.ptr.pp_double[46][0] = 12.244689632138986; xy.ptr.pp_double[47][0] = 12.244689632138986; xy.ptr.pp_double[48][0] = 12.244689632138986; xy.ptr.pp_double[49][0] = 12.244689632138986; xy.ptr.pp_double[50][0] = 12.244689632138984; xy.ptr.pp_double[51][0] = 12.244689632138986; xy.ptr.pp_double[52][0] = 12.244689632138986; xy.ptr.pp_double[53][0] = 12.244689632138986; xy.ptr.pp_double[54][0] = 12.244689632138986; xy.ptr.pp_double[55][0] = 12.244689632138986; xy.ptr.pp_double[56][0] = 12.244689632138986; clusterizercreate(&s, _state); clusterizersetpoints(&s, &xy, npoints, nfeatures, 2, _state); clusterizersetkmeanslimits(&s, restarts, 0, _state); clusterizerrunkmeans(&s, nclusters, &rep, _state); *othererrors = *othererrors||rep.terminationtype<=0; ae_frame_leave(_state); } /************************************************************************* This non-deterministic test checks that Restarts>1 significantly improves quality of results. Subroutine generates random task 3 unit balls in 2D, each with 20 points, separated by 5 units wide gaps, and solves it with Restarts=1 and with Restarts=5. Potential functions are compared, outcome of the trial is either 0 or 1 (depending on what is better). Sequence of 1000 such tasks is solved. 
If Restarts>1 actually improve quality of solution, sum of outcome will be non-binomial. If it doesn't matter, it will be binomially distributed. P.S. This test was added after report from Gianluca Borello who noticed error in the handling of multiple restarts. *************************************************************************/ static void testclusteringunit_kmeansrestartstest(ae_bool* converrors, ae_bool* restartserrors, ae_state *_state) { ae_frame _frame_block; ae_int_t npoints; ae_int_t nvars; ae_int_t nclusters; ae_int_t clustersize; ae_int_t restarts; ae_int_t passcount; double sigmathreshold; double p; double s; ae_matrix xy; ae_vector tmp; ae_int_t i; ae_int_t j; ae_int_t pass; double ea; double eb; double v; clusterizerstate state; kmeansreport rep1; kmeansreport rep2; ae_frame_make(_state, &_frame_block); ae_matrix_init(&xy, 0, 0, DT_REAL, _state); ae_vector_init(&tmp, 0, DT_REAL, _state); _clusterizerstate_init(&state, _state); _kmeansreport_init(&rep1, _state); _kmeansreport_init(&rep2, _state); restarts = 5; passcount = 1000; clustersize = 20; nclusters = 3; nvars = 2; npoints = nclusters*clustersize; sigmathreshold = (double)(5); ae_matrix_set_length(&xy, npoints, nvars, _state); ae_vector_set_length(&tmp, nvars, _state); p = (double)(0); for(pass=1; pass<=passcount; pass++) { /* * Fill */ for(i=0; i<=npoints-1; i++) { testclusteringunit_rsphere(&xy, nvars, i, _state); for(j=0; j<=nvars-1; j++) { xy.ptr.pp_double[i][j] = xy.ptr.pp_double[i][j]+(double)i/(double)clustersize*5; } } clusterizercreate(&state, _state); clusterizersetpoints(&state, &xy, npoints, nvars, 2, _state); /* * Test: Restarts=1 */ clusterizersetkmeanslimits(&state, 1, 0, _state); clusterizerrunkmeans(&state, nclusters, &rep1, _state); if( rep1.terminationtype<=0 ) { *converrors = ae_true; ae_frame_leave(_state); return; } ea = (double)(0); for(i=0; i<=npoints-1; i++) { ae_v_move(&tmp.ptr.p_double[0], 1, &xy.ptr.pp_double[i][0], 1, ae_v_len(0,nvars-1)); ae_v_sub(&tmp.ptr.p_double[0], 1, &rep1.c.ptr.pp_double[rep1.cidx.ptr.p_int[i]][0], 1, ae_v_len(0,nvars-1)); v = ae_v_dotproduct(&tmp.ptr.p_double[0], 1, &tmp.ptr.p_double[0], 1, ae_v_len(0,nvars-1)); ea = ea+v; } /* * Test: Restarts>1 */ clusterizersetkmeanslimits(&state, restarts, 0, _state); clusterizerrunkmeans(&state, nclusters, &rep2, _state); if( rep2.terminationtype<=0 ) { *converrors = ae_true; ae_frame_leave(_state); return; } eb = (double)(0); for(i=0; i<=npoints-1; i++) { ae_v_move(&tmp.ptr.p_double[0], 1, &xy.ptr.pp_double[i][0], 1, ae_v_len(0,nvars-1)); ae_v_sub(&tmp.ptr.p_double[0], 1, &rep2.c.ptr.pp_double[rep2.cidx.ptr.p_int[i]][0], 1, ae_v_len(0,nvars-1)); v = ae_v_dotproduct(&tmp.ptr.p_double[0], 1, &tmp.ptr.p_double[0], 1, ae_v_len(0,nvars-1)); eb = eb+v; } /* * Calculate statistic. */ if( ae_fp_less(ea,eb) ) { p = p+1; } if( ae_fp_eq(ea,eb) ) { p = p+0.5; } } /* * If Restarts doesn't influence quality of centers found, P must be * binomially distributed random value with mean 0.5*PassCount and * standard deviation Sqrt(PassCount/4). * * If Restarts do influence quality of solution, P must be significantly * lower than 0.5*PassCount. 
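(Illustrative arithmetic, using the values set above, PassCount=1000 and SigmaThreshold=5: under the null hypothesis P has mean 0.5*1000=500 and standard deviation Sqrt(1000/4), roughly 15.8, so the check below passes only if the Restarts=1 run beats the Restarts=5 run in at most about 500-5*15.8, i.e. roughly 421 of the 1000 trials, with ties counting as one half.)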
*/ s = (p-0.5*passcount)/ae_sqrt((double)passcount/(double)4, _state); *restartserrors = *restartserrors||ae_fp_greater(s,-sigmathreshold); ae_frame_leave(_state); } /************************************************************************* Random normal number *************************************************************************/ static double testclusteringunit_rnormal(ae_state *_state) { double u; double v; double s; double x1; double result; for(;;) { u = 2*ae_randomreal(_state)-1; v = 2*ae_randomreal(_state)-1; s = ae_sqr(u, _state)+ae_sqr(v, _state); if( ae_fp_greater(s,(double)(0))&&ae_fp_less(s,(double)(1)) ) { s = ae_sqrt(-2*ae_log(s, _state)/s, _state); x1 = u*s; break; } } result = x1; return result; } /************************************************************************* Random point from sphere *************************************************************************/ static void testclusteringunit_rsphere(/* Real */ ae_matrix* xy, ae_int_t n, ae_int_t i, ae_state *_state) { ae_int_t j; double v; for(j=0; j<=n-1; j++) { xy->ptr.pp_double[i][j] = testclusteringunit_rnormal(_state); } v = ae_v_dotproduct(&xy->ptr.pp_double[i][0], 1, &xy->ptr.pp_double[i][0], 1, ae_v_len(0,n-1)); v = ae_randomreal(_state)/ae_sqrt(v, _state); ae_v_muld(&xy->ptr.pp_double[i][0], 1, ae_v_len(0,n-1), v); } /************************************************************************* Distance function: distance between X0 and X1 X0, X1 - array[D], points DistType - distance type *************************************************************************/ static double testclusteringunit_distfunc(/* Real */ ae_vector* x0, /* Real */ ae_vector* x1, ae_int_t d, ae_int_t disttype, ae_state *_state) { ae_int_t i; double s0; double s1; double result; ae_assert((((((((disttype==0||disttype==1)||disttype==2)||disttype==10)||disttype==11)||disttype==12)||disttype==13)||disttype==20)||disttype==21, "Assertion failed", _state); if( disttype==0 ) { result = 0.0; for(i=0; i<=d-1; i++) { result = ae_maxreal(result, ae_fabs(x0->ptr.p_double[i]-x1->ptr.p_double[i], _state), _state); } return result; } if( disttype==1 ) { result = 0.0; for(i=0; i<=d-1; i++) { result = result+ae_fabs(x0->ptr.p_double[i]-x1->ptr.p_double[i], _state); } return result; } if( disttype==2 ) { result = 0.0; for(i=0; i<=d-1; i++) { result = result+ae_sqr(x0->ptr.p_double[i]-x1->ptr.p_double[i], _state); } result = ae_sqrt(result, _state); return result; } if( disttype==10 ) { result = ae_maxreal(1-pearsoncorr2(x0, x1, d, _state), 0.0, _state); return result; } if( disttype==11 ) { result = ae_maxreal(1-ae_fabs(pearsoncorr2(x0, x1, d, _state), _state), 0.0, _state); return result; } if( disttype==12||disttype==13 ) { s0 = 0.0; s1 = 0.0; for(i=0; i<=d-1; i++) { s0 = s0+ae_sqr(x0->ptr.p_double[i], _state)/d; s1 = s1+ae_sqr(x1->ptr.p_double[i], _state)/d; } s0 = ae_sqrt(s0, _state); s1 = ae_sqrt(s1, _state); result = (double)(0); for(i=0; i<=d-1; i++) { result = result+x0->ptr.p_double[i]/s0*(x1->ptr.p_double[i]/s1)/d; } if( disttype==12 ) { result = ae_maxreal(1-result, 0.0, _state); } else { result = ae_maxreal(1-ae_fabs(result, _state), 0.0, _state); } return result; } if( disttype==20 ) { result = ae_maxreal(1-spearmancorr2(x0, x1, d, _state), 0.0, _state); return result; } if( disttype==21 ) { result = ae_maxreal(1-ae_fabs(spearmancorr2(x0, x1, d, _state), _state), 0.0, _state); return result; } result = (double)(0); return result; } /************************************************************************* This function replays merges 
and checks that: * Rep.NPoints, Rep.Z, Rep.PZ and Rep.PM are consistent and correct * Rep.MergeDist is consistent with distance between clusters being merged * clusters with minimal distance are merged at each step * GetKClusters() correctly unpacks clusters for each K NOTE: this algorithm correctly handle ties, i.e. situations where several pairs of clusters have same intercluster distance, and we can't unambiguously choose clusters to merge. INPUT PARAMETERS D - distance matrix, array[NPoints,NPoints], full matrix is given (including both triangles and zeros on the main diagonal) XY - dataset matrix, array[NPoints,NF] NPoints - dataset size NF - number of features Rep - clusterizer report AHCAlgo - AHC algorithm: * 0 - complete linkage * 1 - single linkage * 2 - unweighted average linkage This function returns True on failure, False on success. *************************************************************************/ static ae_bool testclusteringunit_errorsinmerges(/* Real */ ae_matrix* d, /* Real */ ae_matrix* xy, ae_int_t npoints, ae_int_t nf, ahcreport* rep, ae_int_t ahcalgo, ae_state *_state) { ae_frame _frame_block; ae_matrix dm; ae_matrix cm; ae_vector clustersizes; ae_vector clusterheights; ae_vector b; ae_vector x0; ae_vector x1; ae_bool bflag; ae_int_t i; ae_int_t j; ae_int_t k; ae_int_t i0; ae_int_t i1; ae_int_t c0; ae_int_t c1; ae_int_t s0; ae_int_t s1; double v; ae_int_t t; ae_int_t mergeidx; ae_vector kidx; ae_vector kidxz; ae_int_t currentelement; ae_bool result; ae_frame_make(_state, &_frame_block); ae_matrix_init(&dm, 0, 0, DT_REAL, _state); ae_matrix_init(&cm, 0, 0, DT_INT, _state); ae_vector_init(&clustersizes, 0, DT_INT, _state); ae_vector_init(&clusterheights, 0, DT_INT, _state); ae_vector_init(&b, 0, DT_BOOL, _state); ae_vector_init(&x0, 0, DT_REAL, _state); ae_vector_init(&x1, 0, DT_REAL, _state); ae_vector_init(&kidx, 0, DT_INT, _state); ae_vector_init(&kidxz, 0, DT_INT, _state); ae_assert(ahcalgo!=3, "integrity error", _state); result = ae_false; ae_vector_set_length(&x0, nf, _state); ae_vector_set_length(&x1, nf, _state); /* * Basic checks: * * positive completion code * * sizes of arrays * * Rep.P is correct permutation * * Rep.Z contains correct cluster indexes * * Rep.PZ is consistent with Rep.P/Rep.Z * * Rep.PM contains consistent indexes * * GetKClusters() for K=NPoints */ bflag = ae_false; bflag = bflag||rep->terminationtype<=0; if( bflag ) { result = ae_true; ae_frame_leave(_state); return result; } bflag = bflag||rep->npoints!=npoints; bflag = (bflag||rep->z.rows!=npoints-1)||(npoints>1&&rep->z.cols!=2); bflag = (bflag||rep->pz.rows!=npoints-1)||(npoints>1&&rep->pz.cols!=2); bflag = (bflag||rep->pm.rows!=npoints-1)||(npoints>1&&rep->pm.cols!=6); bflag = bflag||rep->mergedist.cnt!=npoints-1; bflag = bflag||rep->p.cnt!=npoints; if( bflag ) { result = ae_true; ae_frame_leave(_state); return result; } ae_vector_set_length(&b, npoints, _state); for(i=0; i<=npoints-1; i++) { b.ptr.p_bool[i] = ae_false; } for(i=0; i<=npoints-1; i++) { if( (rep->p.ptr.p_int[i]<0||rep->p.ptr.p_int[i]>=npoints)||b.ptr.p_bool[rep->p.ptr.p_int[i]] ) { result = ae_true; ae_frame_leave(_state); return result; } b.ptr.p_bool[rep->p.ptr.p_int[i]] = ae_true; } for(i=0; i<=npoints-2; i++) { if( (rep->z.ptr.pp_int[i][0]<0||rep->z.ptr.pp_int[i][0]>=rep->z.ptr.pp_int[i][1])||rep->z.ptr.pp_int[i][1]>=npoints+i ) { result = ae_true; ae_frame_leave(_state); return result; } if( (rep->pz.ptr.pp_int[i][0]<0||rep->pz.ptr.pp_int[i][0]>=rep->pz.ptr.pp_int[i][1])||rep->pz.ptr.pp_int[i][1]>=npoints+i 
) { result = ae_true; ae_frame_leave(_state); return result; } } for(i=0; i<=npoints-2; i++) { c0 = rep->z.ptr.pp_int[i][0]; c1 = rep->z.ptr.pp_int[i][1]; s0 = rep->pz.ptr.pp_int[i][0]; s1 = rep->pz.ptr.pp_int[i][1]; if( c0<npoints ) { c0 = rep->p.ptr.p_int[c0]; } if( c1<npoints ) { c1 = rep->p.ptr.p_int[c1]; } if( c0!=s0||c1!=s1 ) { result = ae_true; ae_frame_leave(_state); return result; } } clusterizergetkclusters(rep, npoints, &kidx, &kidxz, _state); if( kidx.cnt!=npoints||kidxz.cnt!=npoints ) { result = ae_true; ae_frame_leave(_state); return result; } for(i=0; i<=npoints-1; i++) { if( kidxz.ptr.p_int[i]!=i||kidx.ptr.p_int[i]!=i ) { result = ae_true; ae_frame_leave(_state); return result; } } /* * Test description: * * we generate (2*NPoints-1)x(2*NPoints-1) matrix of distances DM and * (2*NPoints-1)xNPoints matrix of clusters CM (I-th row contains indexes * of elements which belong to I-th cluster, negative indexes denote * empty cells). Leading N*N square of DM is just a distance matrix, * other elements are filled by some large number M (used to mark empty * elements). * * we replay all merges * * every time we merge clusters I and J into K, we: * * check that distance between I and J is equal to the smallest * element of DM (note: we account for rounding errors when we * decide on that) * * check that distance is consistent with Rep.MergeDist * * then, we enumerate all elements in clusters being merged, * and check that after permutation their indexes fall into range * prescribed by Rep.PM * * fill K-th column/row of D by distances to cluster K * * merge I-th and J-th rows of CM and store result into K-th row * * clear DM and CM: fill I-th and J-th column/row of DM by large * number M, fill I-th and J-th row of CM by -1. * * NOTE: DM is initialized by distance metric specific to AHC algorithm * being used. CLINK, SLINK and average linkage use user-provided * distance measure, say Euclidean one, without any modifications. * Ward's method uses (and reports) squared and scaled Euclidean * distances. */ ae_matrix_set_length(&dm, 2*npoints-1, 2*npoints-1, _state); ae_matrix_set_length(&cm, 2*npoints-1, npoints, _state); ae_vector_set_length(&clustersizes, 2*npoints-1, _state); for(i=0; i<=2*npoints-2; i++) { for(j=0; j<=2*npoints-2; j++) { if( i<npoints&&j<npoints ) { dm.ptr.pp_double[i][j] = d->ptr.pp_double[i][j]; if( ahcalgo==4 ) { dm.ptr.pp_double[i][j] = 0.5*ae_sqr(dm.ptr.pp_double[i][j], _state); } } else { dm.ptr.pp_double[i][j] = ae_maxrealnumber; } } } for(i=0; i<=2*npoints-2; i++) { for(j=0; j<=npoints-1; j++) { cm.ptr.pp_int[i][j] = -1; } } for(i=0; i<=npoints-1; i++) { cm.ptr.pp_int[i][0] = i; clustersizes.ptr.p_int[i] = 1; } for(i=npoints; i<=2*npoints-2; i++) { clustersizes.ptr.p_int[i] = 0; } ae_vector_set_length(&clusterheights, 2*npoints-1, _state); for(i=0; i<=npoints-1; i++) { clusterheights.ptr.p_int[i] = 0; } for(mergeidx=0; mergeidx<=npoints-2; mergeidx++) { /* * Check that clusters with minimum distance are merged, * and that MergeDist is consistent with results. * * NOTE: we do not check for specific cluster indexes, * because it is possible to have a tie. We just * check that distance between clusters is a true * minimum over all possible clusters. 
*/ v = ae_maxrealnumber; for(i=0; i<=2*npoints-2; i++) { for(j=0; j<=2*npoints-2; j++) { if( i!=j ) { v = ae_minreal(v, dm.ptr.pp_double[i][j], _state); } } } c0 = rep->z.ptr.pp_int[mergeidx][0]; c1 = rep->z.ptr.pp_int[mergeidx][1]; if( ae_fp_greater(dm.ptr.pp_double[c0][c1],v+10000*ae_machineepsilon) ) { result = ae_true; ae_frame_leave(_state); return result; } if( ae_fp_greater(rep->mergedist.ptr.p_double[mergeidx],v+10000*ae_machineepsilon) ) { result = ae_true; ae_frame_leave(_state); return result; } /* * Check that indexes of elements fall into range prescribed by Rep.PM, * and Rep.PM correctly described merge operation */ s0 = clustersizes.ptr.p_int[c0]; s1 = clustersizes.ptr.p_int[c1]; for(j=0; j<=clustersizes.ptr.p_int[c0]-1; j++) { if( rep->p.ptr.p_int[cm.ptr.pp_int[c0][j]]<rep->pm.ptr.pp_int[mergeidx][0]||rep->p.ptr.p_int[cm.ptr.pp_int[c0][j]]>rep->pm.ptr.pp_int[mergeidx][1] ) { /* * Element falls outside of range described by PM */ result = ae_true; ae_frame_leave(_state); return result; } } for(j=0; j<=clustersizes.ptr.p_int[c1]-1; j++) { if( rep->p.ptr.p_int[cm.ptr.pp_int[c1][j]]<rep->pm.ptr.pp_int[mergeidx][2]||rep->p.ptr.p_int[cm.ptr.pp_int[c1][j]]>rep->pm.ptr.pp_int[mergeidx][3] ) { /* * Element falls outside of range described by PM */ result = ae_true; ae_frame_leave(_state); return result; } } if( (rep->pm.ptr.pp_int[mergeidx][1]-rep->pm.ptr.pp_int[mergeidx][0]!=s0-1||rep->pm.ptr.pp_int[mergeidx][3]-rep->pm.ptr.pp_int[mergeidx][2]!=s1-1)||rep->pm.ptr.pp_int[mergeidx][2]!=rep->pm.ptr.pp_int[mergeidx][1]+1 ) { /* * Cluster size (as given by PM) is inconsistent with its actual size. */ result = ae_true; ae_frame_leave(_state); return result; } if( rep->pm.ptr.pp_int[mergeidx][4]!=clusterheights.ptr.p_int[rep->z.ptr.pp_int[mergeidx][0]]||rep->pm.ptr.pp_int[mergeidx][5]!=clusterheights.ptr.p_int[rep->z.ptr.pp_int[mergeidx][1]] ) { /* * Heights of subdendrograms as returned by PM are inconsistent with heights 
*/ result = ae_true; ae_frame_leave(_state); return result; } /* * Update cluster heights */ clusterheights.ptr.p_int[mergeidx+npoints] = ae_maxint(clusterheights.ptr.p_int[rep->z.ptr.pp_int[mergeidx][0]], clusterheights.ptr.p_int[rep->z.ptr.pp_int[mergeidx][1]], _state)+1; /* * Update CM */ t = 0; for(j=0; j<=clustersizes.ptr.p_int[rep->z.ptr.pp_int[mergeidx][0]]-1; j++) { cm.ptr.pp_int[npoints+mergeidx][t] = cm.ptr.pp_int[rep->z.ptr.pp_int[mergeidx][0]][j]; t = t+1; } for(j=0; j<=clustersizes.ptr.p_int[rep->z.ptr.pp_int[mergeidx][1]]-1; j++) { cm.ptr.pp_int[npoints+mergeidx][t] = cm.ptr.pp_int[rep->z.ptr.pp_int[mergeidx][1]][j]; t = t+1; } clustersizes.ptr.p_int[npoints+mergeidx] = t; clustersizes.ptr.p_int[rep->z.ptr.pp_int[mergeidx][0]] = 0; clustersizes.ptr.p_int[rep->z.ptr.pp_int[mergeidx][1]] = 0; /* * Update distance matrix D */ for(i=0; i<=2*npoints-2; i++) { /* * "Remove" columns/rows corresponding to clusters being merged */ dm.ptr.pp_double[i][rep->z.ptr.pp_int[mergeidx][0]] = ae_maxrealnumber; dm.ptr.pp_double[i][rep->z.ptr.pp_int[mergeidx][1]] = ae_maxrealnumber; dm.ptr.pp_double[rep->z.ptr.pp_int[mergeidx][0]][i] = ae_maxrealnumber; dm.ptr.pp_double[rep->z.ptr.pp_int[mergeidx][1]][i] = ae_maxrealnumber; } for(i=0; i<=npoints+mergeidx-1; i++) { if( clustersizes.ptr.p_int[i]>0 ) { /* * Calculate column/row corresponding to new cluster */ if( ahcalgo==0 ) { /* * Calculate distance between clusters I and NPoints+MergeIdx for CLINK */ v = 0.0; for(i0=0; i0<=clustersizes.ptr.p_int[i]-1; i0++) { for(i1=0; i1<=clustersizes.ptr.p_int[npoints+mergeidx]-1; i1++) { v = ae_maxreal(v, d->ptr.pp_double[cm.ptr.pp_int[i][i0]][cm.ptr.pp_int[npoints+mergeidx][i1]], _state); } } } if( ahcalgo==1 ) { /* * Calculate distance between clusters I and NPoints+MergeIdx for SLINK */ v = ae_maxrealnumber; for(i0=0; i0<=clustersizes.ptr.p_int[i]-1; i0++) { for(i1=0; i1<=clustersizes.ptr.p_int[npoints+mergeidx]-1; i1++) { v = ae_minreal(v, d->ptr.pp_double[cm.ptr.pp_int[i][i0]][cm.ptr.pp_int[npoints+mergeidx][i1]], _state); } } } if( ahcalgo==2 ) { /* * Calculate distance between clusters I and NPoints+MergeIdx for unweighted average */ v = 0.0; t = 0; for(i0=0; i0<=clustersizes.ptr.p_int[i]-1; i0++) { for(i1=0; i1<=clustersizes.ptr.p_int[npoints+mergeidx]-1; i1++) { v = v+d->ptr.pp_double[cm.ptr.pp_int[i][i0]][cm.ptr.pp_int[npoints+mergeidx][i1]]; t = t+1; } } v = v/t; } if( ahcalgo==3 ) { ae_assert(ae_false, "Assertion failed", _state); } if( ahcalgo==4 ) { /* * Calculate distance between clusters I and NPoints+MergeIdx for Ward's method: * * X0 = center of mass for cluster I * * X1 = center of mass for cluster NPoints+MergeIdx * * S0 = size of cluster I * * S1 = size of cluster NPoints+MergeIdx * * distance between clusters is S0*S1/(S0+S1)*|X0-X1|^2 * */ for(j=0; j<=nf-1; j++) { x0.ptr.p_double[j] = 0.0; x1.ptr.p_double[j] = 0.0; } for(i0=0; i0<=clustersizes.ptr.p_int[i]-1; i0++) { for(j=0; j<=nf-1; j++) { x0.ptr.p_double[j] = x0.ptr.p_double[j]+xy->ptr.pp_double[cm.ptr.pp_int[i][i0]][j]/clustersizes.ptr.p_int[i]; } } for(i1=0; i1<=clustersizes.ptr.p_int[npoints+mergeidx]-1; i1++) { for(j=0; j<=nf-1; j++) { x1.ptr.p_double[j] = x1.ptr.p_double[j]+xy->ptr.pp_double[cm.ptr.pp_int[npoints+mergeidx][i1]][j]/clustersizes.ptr.p_int[npoints+mergeidx]; } } v = 0.0; for(j=0; j<=nf-1; j++) { v = v+ae_sqr(x0.ptr.p_double[j]-x1.ptr.p_double[j], _state); } v = v*clustersizes.ptr.p_int[i]*clustersizes.ptr.p_int[npoints+mergeidx]/(clustersizes.ptr.p_int[i]+clustersizes.ptr.p_int[npoints+mergeidx]); } 
dm.ptr.pp_double[i][npoints+mergeidx] = v; dm.ptr.pp_double[npoints+mergeidx][i] = v; } } /* * Check that GetKClusters() correctly unpacks clusters for K=NPoints-(MergeIdx+1): * * check lengths of arays * * check consistency of CIdx/CZ parameters * * scan clusters (CZ parameter), for each cluster scan CM matrix which stores * cluster elements (according to our replay of merges), for each element of * the current cluster check that CIdx array correctly reflects its status. */ k = npoints-(mergeidx+1); clusterizergetkclusters(rep, k, &kidx, &kidxz, _state); if( kidx.cnt!=npoints||kidxz.cnt!=k ) { result = ae_true; ae_frame_leave(_state); return result; } for(i=0; i<=k-2; i++) { if( (kidxz.ptr.p_int[i]<0||kidxz.ptr.p_int[i]>=kidxz.ptr.p_int[i+1])||kidxz.ptr.p_int[i+1]>2*npoints-2 ) { /* * CZ is inconsistent */ result = ae_true; ae_frame_leave(_state); return result; } } for(i=0; i<=npoints-1; i++) { if( kidx.ptr.p_int[i]<0||kidx.ptr.p_int[i]>=k ) { /* * CIdx is inconsistent */ result = ae_true; ae_frame_leave(_state); return result; } } for(i=0; i<=k-1; i++) { for(j=0; j<=clustersizes.ptr.p_int[kidxz.ptr.p_int[i]]-1; j++) { currentelement = cm.ptr.pp_int[kidxz.ptr.p_int[i]][j]; if( kidx.ptr.p_int[currentelement]!=i ) { /* * We've found element which belongs to I-th cluster (according to CM * matrix, which reflects current status of agglomerative clustering), * but this element does not belongs to I-th cluster according to * results of ClusterizerGetKClusters() */ result = ae_true; ae_frame_leave(_state); return result; } } } } ae_frame_leave(_state); return result; } /************************************************************************* This procedure is a reference version of KMeansUpdateDistances(). INPUT PARAMETERS: XY - dataset, array [0..NPoints-1,0..NVars-1]. 
NPoints - dataset size, NPoints>=K NVars - number of variables, NVars>=1 CT - matrix of centers, centers are stored in rows K - number of centers, K>=1 XYC - preallocated output buffer XYDist2 - preallocated output buffer OUTPUT PARAMETERS: XYC - new assignment of points to centers XYDist2 - squared distances -- ALGLIB -- Copyright 21.01.2015 by Bochkanov Sergey *************************************************************************/ static void testclusteringunit_kmeansreferenceupdatedistances(/* Real */ ae_matrix* xy, ae_int_t npoints, ae_int_t nvars, /* Real */ ae_matrix* ct, ae_int_t k, /* Integer */ ae_vector* xyc, /* Real */ ae_vector* xydist2, ae_state *_state) { ae_frame _frame_block; ae_int_t i; ae_int_t j; ae_int_t cclosest; double dclosest; double v; ae_vector tmp; ae_frame_make(_state, &_frame_block); ae_vector_init(&tmp, 0, DT_REAL, _state); ae_vector_set_length(&tmp, nvars, _state); for(i=0; i<=npoints-1; i++) { cclosest = -1; dclosest = ae_maxrealnumber; for(j=0; j<=k-1; j++) { ae_v_move(&tmp.ptr.p_double[0], 1, &xy->ptr.pp_double[i][0], 1, ae_v_len(0,nvars-1)); ae_v_sub(&tmp.ptr.p_double[0], 1, &ct->ptr.pp_double[j][0], 1, ae_v_len(0,nvars-1)); v = ae_v_dotproduct(&tmp.ptr.p_double[0], 1, &tmp.ptr.p_double[0], 1, ae_v_len(0,nvars-1)); if( ae_fp_less(v,dclosest) ) { cclosest = j; dclosest = v; } } ae_assert(cclosest>=0, "KMeansUpdateDistances: internal error", _state); xyc->ptr.p_int[i] = cclosest; xydist2->ptr.p_double[i] = dclosest; } ae_frame_leave(_state); } static void testdforestunit_testprocessing(ae_bool* err, ae_state *_state); static void testdforestunit_basictest1(ae_int_t nvars, ae_int_t nclasses, ae_int_t passcount, ae_bool* err, ae_state *_state); static void testdforestunit_basictest2(ae_bool* err, ae_state *_state); static void testdforestunit_basictest3(ae_bool* err, ae_state *_state); static void testdforestunit_basictest4(ae_bool* err, ae_state *_state); static void testdforestunit_basictest5(ae_bool* err, ae_state *_state); static void testdforestunit_unsetdf(decisionforest* df, ae_state *_state); ae_bool testdforest(ae_bool silent, ae_state *_state) { ae_int_t ncmax; ae_int_t nvmax; ae_int_t passcount; ae_int_t nvars; ae_int_t nclasses; ae_bool waserrors; ae_bool basicerrors; ae_bool procerrors; ae_bool result; /* * Primary settings */ nvmax = 4; ncmax = 3; passcount = 10; basicerrors = ae_false; procerrors = ae_false; waserrors = ae_false; /* * Tests */ testdforestunit_testprocessing(&procerrors, _state); for(nvars=1; nvars<=nvmax; nvars++) { for(nclasses=1; nclasses<=ncmax; nclasses++) { testdforestunit_basictest1(nvars, nclasses, passcount, &basicerrors, _state); } } testdforestunit_basictest2(&basicerrors, _state); testdforestunit_basictest3(&basicerrors, _state); testdforestunit_basictest4(&basicerrors, _state); testdforestunit_basictest5(&basicerrors, _state); /* * Final report */ waserrors = basicerrors||procerrors; if( !silent ) { printf("RANDOM FOREST TEST\n"); printf("TOTAL RESULTS: "); if( !waserrors ) { printf("OK\n"); } else { printf("FAILED\n"); } printf("* PROCESSING FUNCTIONS: "); if( !procerrors ) { printf("OK\n"); } else { printf("FAILED\n"); } printf("* BASIC TESTS: "); if( !basicerrors ) { printf("OK\n"); } else { printf("FAILED\n"); } if( waserrors ) { printf("TEST SUMMARY: FAILED\n"); } else { printf("TEST SUMMARY: PASSED\n"); } printf("\n\n"); } result = !waserrors; return result; } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. 
*************************************************************************/ ae_bool _pexec_testdforest(ae_bool silent, ae_state *_state) { return testdforest(silent, _state); } /************************************************************************* Processing functions test *************************************************************************/ static void testdforestunit_testprocessing(ae_bool* err, ae_state *_state) { ae_frame _frame_block; ae_int_t nvars; ae_int_t nclasses; ae_int_t nsample; ae_int_t ntrees; ae_int_t nfeatures; ae_int_t flags; decisionforest df1; decisionforest df2; ae_int_t npoints; ae_matrix xy; ae_int_t pass; ae_int_t passcount; ae_int_t i; ae_int_t j; ae_bool allsame; ae_int_t info; dfreport rep; ae_vector x1; ae_vector x2; ae_vector y1; ae_vector y2; double v; ae_frame_make(_state, &_frame_block); _decisionforest_init(&df1, _state); _decisionforest_init(&df2, _state); ae_matrix_init(&xy, 0, 0, DT_REAL, _state); _dfreport_init(&rep, _state); ae_vector_init(&x1, 0, DT_REAL, _state); ae_vector_init(&x2, 0, DT_REAL, _state); ae_vector_init(&y1, 0, DT_REAL, _state); ae_vector_init(&y2, 0, DT_REAL, _state); passcount = 100; /* * Main cycle */ for(pass=1; pass<=passcount; pass++) { /* * initialize parameters */ nvars = 1+ae_randominteger(5, _state); nclasses = 1+ae_randominteger(3, _state); ntrees = 1+ae_randominteger(4, _state); nfeatures = 1+ae_randominteger(nvars, _state); flags = 0; if( ae_fp_greater(ae_randomreal(_state),0.5) ) { flags = flags+2; } /* * Initialize arrays and data */ npoints = 10+ae_randominteger(50, _state); nsample = ae_maxint(10, ae_randominteger(npoints, _state), _state); ae_vector_set_length(&x1, nvars-1+1, _state); ae_vector_set_length(&x2, nvars-1+1, _state); ae_vector_set_length(&y1, nclasses-1+1, _state); ae_vector_set_length(&y2, nclasses-1+1, _state); ae_matrix_set_length(&xy, npoints-1+1, nvars+1, _state); for(i=0; i<=npoints-1; i++) { for(j=0; j<=nvars-1; j++) { if( j%2==0 ) { xy.ptr.pp_double[i][j] = 2*ae_randomreal(_state)-1; } else { xy.ptr.pp_double[i][j] = (double)(ae_randominteger(2, _state)); } } if( nclasses==1 ) { xy.ptr.pp_double[i][nvars] = 2*ae_randomreal(_state)-1; } else { xy.ptr.pp_double[i][nvars] = (double)(ae_randominteger(nclasses, _state)); } } /* * create forest */ dfbuildinternal(&xy, npoints, nvars, nclasses, ntrees, nsample, nfeatures, flags, &info, &df1, &rep, _state); if( info<=0 ) { *err = ae_true; ae_frame_leave(_state); return; } /* * Same inputs leads to same outputs */ for(i=0; i<=nvars-1; i++) { x1.ptr.p_double[i] = 2*ae_randomreal(_state)-1; x2.ptr.p_double[i] = x1.ptr.p_double[i]; } for(i=0; i<=nclasses-1; i++) { y1.ptr.p_double[i] = 2*ae_randomreal(_state)-1; y2.ptr.p_double[i] = 2*ae_randomreal(_state)-1; } dfprocess(&df1, &x1, &y1, _state); dfprocess(&df1, &x2, &y2, _state); allsame = ae_true; for(i=0; i<=nclasses-1; i++) { allsame = allsame&&ae_fp_eq(y1.ptr.p_double[i],y2.ptr.p_double[i]); } *err = *err||!allsame; /* * Same inputs on original forest leads to same outputs * on copy created using DFCopy */ testdforestunit_unsetdf(&df2, _state); dfcopy(&df1, &df2, _state); for(i=0; i<=nvars-1; i++) { x1.ptr.p_double[i] = 2*ae_randomreal(_state)-1; x2.ptr.p_double[i] = x1.ptr.p_double[i]; } for(i=0; i<=nclasses-1; i++) { y1.ptr.p_double[i] = 2*ae_randomreal(_state)-1; y2.ptr.p_double[i] = 2*ae_randomreal(_state)-1; } dfprocess(&df1, &x1, &y1, _state); dfprocess(&df2, &x2, &y2, _state); allsame = ae_true; for(i=0; i<=nclasses-1; i++) { allsame = 
allsame&&ae_fp_eq(y1.ptr.p_double[i],y2.ptr.p_double[i]); } *err = *err||!allsame; /* * Same inputs on original forest leads to same outputs * on copy created using DFSerialize */ testdforestunit_unsetdf(&df2, _state); { /* * This code passes data structure through serializers * (serializes it to string and loads back) */ ae_serializer _local_serializer; ae_int_t _local_ssize; ae_frame _local_frame_block; ae_dyn_block _local_dynamic_block; ae_frame_make(_state, &_local_frame_block); ae_serializer_init(&_local_serializer); ae_serializer_alloc_start(&_local_serializer); dfalloc(&_local_serializer, &df1, _state); _local_ssize = ae_serializer_get_alloc_size(&_local_serializer); ae_db_malloc(&_local_dynamic_block, _local_ssize+1, _state, ae_true); ae_serializer_sstart_str(&_local_serializer, (char*)_local_dynamic_block.ptr); dfserialize(&_local_serializer, &df1, _state); ae_serializer_stop(&_local_serializer, _state); ae_serializer_clear(&_local_serializer); ae_serializer_init(&_local_serializer); ae_serializer_ustart_str(&_local_serializer, (char*)_local_dynamic_block.ptr); dfunserialize(&_local_serializer, &df2, _state); ae_serializer_stop(&_local_serializer, _state); ae_serializer_clear(&_local_serializer); ae_frame_leave(_state); } for(i=0; i<=nvars-1; i++) { x1.ptr.p_double[i] = 2*ae_randomreal(_state)-1; x2.ptr.p_double[i] = x1.ptr.p_double[i]; } for(i=0; i<=nclasses-1; i++) { y1.ptr.p_double[i] = 2*ae_randomreal(_state)-1; y2.ptr.p_double[i] = 2*ae_randomreal(_state)-1; } dfprocess(&df1, &x1, &y1, _state); dfprocess(&df2, &x2, &y2, _state); allsame = ae_true; for(i=0; i<=nclasses-1; i++) { allsame = allsame&&ae_fp_eq(y1.ptr.p_double[i],y2.ptr.p_double[i]); } *err = *err||!allsame; /* * Normalization properties */ if( nclasses>1 ) { for(i=0; i<=nvars-1; i++) { x1.ptr.p_double[i] = 2*ae_randomreal(_state)-1; } dfprocess(&df1, &x1, &y1, _state); v = (double)(0); for(i=0; i<=nclasses-1; i++) { v = v+y1.ptr.p_double[i]; *err = *err||ae_fp_less(y1.ptr.p_double[i],(double)(0)); } *err = *err||ae_fp_greater(ae_fabs(v-1, _state),1000*ae_machineepsilon); } } ae_frame_leave(_state); } /************************************************************************* Basic test: one-tree forest built using full sample must remember all the training cases *************************************************************************/ static void testdforestunit_basictest1(ae_int_t nvars, ae_int_t nclasses, ae_int_t passcount, ae_bool* err, ae_state *_state) { ae_frame _frame_block; ae_int_t pass; ae_matrix xy; ae_int_t npoints; ae_int_t i; ae_int_t j; ae_int_t k; double s; ae_int_t info; decisionforest df; ae_vector x; ae_vector y; dfreport rep; ae_bool hassame; ae_frame_make(_state, &_frame_block); ae_matrix_init(&xy, 0, 0, DT_REAL, _state); _decisionforest_init(&df, _state); ae_vector_init(&x, 0, DT_REAL, _state); ae_vector_init(&y, 0, DT_REAL, _state); _dfreport_init(&rep, _state); if( nclasses==1 ) { /* * only classification tasks */ ae_frame_leave(_state); return; } for(pass=1; pass<=passcount; pass++) { /* * select number of points */ if( pass<=3&&passcount>3 ) { npoints = pass; } else { npoints = 100+ae_randominteger(100, _state); } /* * Prepare task */ ae_matrix_set_length(&xy, npoints-1+1, nvars+1, _state); ae_vector_set_length(&x, nvars-1+1, _state); ae_vector_set_length(&y, nclasses-1+1, _state); for(i=0; i<=npoints-1; i++) { for(j=0; j<=nvars-1; j++) { xy.ptr.pp_double[i][j] = 2*ae_randomreal(_state)-1; } xy.ptr.pp_double[i][nvars] = (double)(ae_randominteger(nclasses, _state)); } /* * Test */ 
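/*
 * Note on the positional arguments of the call below (inferred from the call in
 * testdforestunit_testprocessing() above, which fills the same slots with NTrees,
 * NSample, NFeatures and Flags): here NTrees=1, NSample=NPoints, NFeatures=1 and
 * Flags=1, i.e. a single tree grown on the full sample, which is why it is
 * expected to reproduce every training case (up to ties in the inputs).
 */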
dfbuildinternal(&xy, npoints, nvars, nclasses, 1, npoints, 1, 1, &info, &df, &rep, _state); if( info<=0 ) { *err = ae_true; ae_frame_leave(_state); return; } for(i=0; i<=npoints-1; i++) { ae_v_move(&x.ptr.p_double[0], 1, &xy.ptr.pp_double[i][0], 1, ae_v_len(0,nvars-1)); dfprocess(&df, &x, &y, _state); s = (double)(0); for(j=0; j<=nclasses-1; j++) { if( ae_fp_less(y.ptr.p_double[j],(double)(0)) ) { *err = ae_true; ae_frame_leave(_state); return; } s = s+y.ptr.p_double[j]; } if( ae_fp_greater(ae_fabs(s-1, _state),1000*ae_machineepsilon) ) { *err = ae_true; ae_frame_leave(_state); return; } if( ae_fp_greater(ae_fabs(y.ptr.p_double[ae_round(xy.ptr.pp_double[i][nvars], _state)]-1, _state),1000*ae_machineepsilon) ) { /* * not an error if there exists such K,J that XY[K,J]=XY[I,J] * (may be we just can't distinguish two tied values). * * definitely error otherwise. */ hassame = ae_false; for(k=0; k<=npoints-1; k++) { if( k!=i ) { for(j=0; j<=nvars-1; j++) { if( ae_fp_eq(xy.ptr.pp_double[k][j],xy.ptr.pp_double[i][j]) ) { hassame = ae_true; } } } } if( !hassame ) { *err = ae_true; ae_frame_leave(_state); return; } } } } ae_frame_leave(_state); } /************************************************************************* Basic test: tests generalization ability on a simple noisy classification task: * 00.25 - P(class=0)=0 *************************************************************************/ static void testdforestunit_basictest3(ae_bool* err, ae_state *_state) { ae_frame _frame_block; ae_int_t pass; ae_int_t passcount; ae_matrix xy; ae_int_t npoints; ae_int_t ntrees; ae_int_t i; ae_int_t j; ae_int_t k; double s; ae_int_t info; decisionforest df; ae_vector x; ae_vector y; dfreport rep; ae_int_t testgridsize; double r; ae_frame_make(_state, &_frame_block); ae_matrix_init(&xy, 0, 0, DT_REAL, _state); _decisionforest_init(&df, _state); ae_vector_init(&x, 0, DT_REAL, _state); ae_vector_init(&y, 0, DT_REAL, _state); _dfreport_init(&rep, _state); passcount = 1; testgridsize = 50; for(pass=1; pass<=passcount; pass++) { /* * select npoints and ntrees */ npoints = 2000; ntrees = 100; /* * Prepare task */ ae_matrix_set_length(&xy, npoints-1+1, 2+1, _state); ae_vector_set_length(&x, 1+1, _state); ae_vector_set_length(&y, 1+1, _state); for(i=0; i<=npoints-1; i++) { xy.ptr.pp_double[i][0] = 2*ae_randomreal(_state)-1; xy.ptr.pp_double[i][1] = 2*ae_randomreal(_state)-1; if( ae_fp_less_eq(ae_sqr(xy.ptr.pp_double[i][0], _state)+ae_sqr(xy.ptr.pp_double[i][1], _state),0.25) ) { xy.ptr.pp_double[i][2] = (double)(0); } else { xy.ptr.pp_double[i][2] = (double)(1); } } /* * Test */ dfbuildinternal(&xy, npoints, 2, 2, ntrees, ae_round(0.1*npoints, _state), 1, 0, &info, &df, &rep, _state); if( info<=0 ) { *err = ae_true; ae_frame_leave(_state); return; } for(i=-testgridsize/2; i<=testgridsize/2; i++) { for(j=-testgridsize/2; j<=testgridsize/2; j++) { x.ptr.p_double[0] = (double)i/(double)(testgridsize/2); x.ptr.p_double[1] = (double)j/(double)(testgridsize/2); dfprocess(&df, &x, &y, _state); /* * Test for basic properties */ s = (double)(0); for(k=0; k<=1; k++) { if( ae_fp_less(y.ptr.p_double[k],(double)(0)) ) { *err = ae_true; ae_frame_leave(_state); return; } s = s+y.ptr.p_double[k]; } if( ae_fp_greater(ae_fabs(s-1, _state),1000*ae_machineepsilon) ) { *err = ae_true; ae_frame_leave(_state); return; } /* * test for good correlation with results */ r = ae_sqrt(ae_sqr(x.ptr.p_double[0], _state)+ae_sqr(x.ptr.p_double[1], _state), _state); if( ae_fp_less(r,0.5*0.5) ) { *err = *err||ae_fp_less(y.ptr.p_double[0],0.6); } if( 
ae_fp_greater(r,0.5*1.5) ) { *err = *err||ae_fp_less(y.ptr.p_double[1],0.6); } } } } ae_frame_leave(_state); } /************************************************************************* Basic test: simple regression task without noise: * |x|<1, |y|<1 * F(x,y) = x^2+y *************************************************************************/ static void testdforestunit_basictest4(ae_bool* err, ae_state *_state) { ae_frame _frame_block; ae_int_t pass; ae_int_t passcount; ae_matrix xy; ae_int_t npoints; ae_int_t ntrees; ae_int_t ns; ae_int_t strongc; ae_int_t i; ae_int_t j; ae_int_t info; decisionforest df; decisionforest df2; ae_vector x; ae_vector y; dfreport rep; dfreport rep2; ae_int_t testgridsize; double maxerr; double maxerr2; double avgerr; double avgerr2; ae_int_t cnt; double ey; ae_frame_make(_state, &_frame_block); ae_matrix_init(&xy, 0, 0, DT_REAL, _state); _decisionforest_init(&df, _state); _decisionforest_init(&df2, _state); ae_vector_init(&x, 0, DT_REAL, _state); ae_vector_init(&y, 0, DT_REAL, _state); _dfreport_init(&rep, _state); _dfreport_init(&rep2, _state); passcount = 1; testgridsize = 50; for(pass=1; pass<=passcount; pass++) { /* * select npoints and ntrees */ npoints = 5000; ntrees = 100; ns = ae_round(0.1*npoints, _state); strongc = 1; /* * Prepare task */ ae_matrix_set_length(&xy, npoints-1+1, 2+1, _state); ae_vector_set_length(&x, 1+1, _state); ae_vector_set_length(&y, 0+1, _state); for(i=0; i<=npoints-1; i++) { xy.ptr.pp_double[i][0] = 2*ae_randomreal(_state)-1; xy.ptr.pp_double[i][1] = 2*ae_randomreal(_state)-1; xy.ptr.pp_double[i][2] = ae_sqr(xy.ptr.pp_double[i][0], _state)+xy.ptr.pp_double[i][1]; } /* * Test */ dfbuildinternal(&xy, npoints, 2, 1, ntrees, ns, 1, 0, &info, &df, &rep, _state); if( info<=0 ) { *err = ae_true; ae_frame_leave(_state); return; } dfbuildinternal(&xy, npoints, 2, 1, ntrees, ns, 1, strongc, &info, &df2, &rep2, _state); if( info<=0 ) { *err = ae_true; ae_frame_leave(_state); return; } maxerr = (double)(0); maxerr2 = (double)(0); avgerr = (double)(0); avgerr2 = (double)(0); cnt = 0; for(i=ae_round(-0.7*testgridsize/2, _state); i<=ae_round(0.7*testgridsize/2, _state); i++) { for(j=ae_round(-0.7*testgridsize/2, _state); j<=ae_round(0.7*testgridsize/2, _state); j++) { x.ptr.p_double[0] = (double)i/(double)(testgridsize/2); x.ptr.p_double[1] = (double)j/(double)(testgridsize/2); ey = ae_sqr(x.ptr.p_double[0], _state)+x.ptr.p_double[1]; dfprocess(&df, &x, &y, _state); maxerr = ae_maxreal(maxerr, ae_fabs(y.ptr.p_double[0]-ey, _state), _state); avgerr = avgerr+ae_fabs(y.ptr.p_double[0]-ey, _state); dfprocess(&df2, &x, &y, _state); maxerr2 = ae_maxreal(maxerr2, ae_fabs(y.ptr.p_double[0]-ey, _state), _state); avgerr2 = avgerr2+ae_fabs(y.ptr.p_double[0]-ey, _state); cnt = cnt+1; } } avgerr = avgerr/cnt; avgerr2 = avgerr2/cnt; *err = *err||ae_fp_greater(maxerr,0.2); *err = *err||ae_fp_greater(maxerr2,0.2); *err = *err||ae_fp_greater(avgerr,0.1); *err = *err||ae_fp_greater(avgerr2,0.1); } ae_frame_leave(_state); } /************************************************************************* Basic test: extended variable selection leads to better results. Next task CAN be solved without EVS but it is very unlikely. With EVS it can be easily and exactly solved. Task matrix: 1 0 0 0 ... 0 0 0 1 0 0 ... 0 1 0 0 1 0 ... 0 2 0 0 0 1 ... 0 3 0 0 0 0 ... 
1 N-1 *************************************************************************/ static void testdforestunit_basictest5(ae_bool* err, ae_state *_state) { ae_frame _frame_block; ae_matrix xy; ae_int_t nvars; ae_int_t npoints; ae_int_t nfeatures; ae_int_t nsample; ae_int_t ntrees; ae_int_t evs; ae_int_t i; ae_int_t j; ae_bool eflag; ae_int_t info; decisionforest df; ae_vector x; ae_vector y; dfreport rep; ae_frame_make(_state, &_frame_block); ae_matrix_init(&xy, 0, 0, DT_REAL, _state); _decisionforest_init(&df, _state); ae_vector_init(&x, 0, DT_REAL, _state); ae_vector_init(&y, 0, DT_REAL, _state); _dfreport_init(&rep, _state); /* * select npoints and ntrees */ npoints = 50; nvars = npoints; ntrees = 1; nsample = npoints; evs = 2; nfeatures = 1; /* * Prepare task */ ae_matrix_set_length(&xy, npoints-1+1, nvars+1, _state); ae_vector_set_length(&x, nvars-1+1, _state); ae_vector_set_length(&y, 0+1, _state); for(i=0; i<=npoints-1; i++) { for(j=0; j<=nvars-1; j++) { xy.ptr.pp_double[i][j] = (double)(0); } xy.ptr.pp_double[i][i] = (double)(1); xy.ptr.pp_double[i][nvars] = (double)(i); } /* * Without EVS */ dfbuildinternal(&xy, npoints, nvars, 1, ntrees, nsample, nfeatures, 0, &info, &df, &rep, _state); if( info<=0 ) { *err = ae_true; ae_frame_leave(_state); return; } eflag = ae_false; for(i=0; i<=npoints-1; i++) { ae_v_move(&x.ptr.p_double[0], 1, &xy.ptr.pp_double[i][0], 1, ae_v_len(0,nvars-1)); dfprocess(&df, &x, &y, _state); if( ae_fp_greater(ae_fabs(y.ptr.p_double[0]-xy.ptr.pp_double[i][nvars], _state),1000*ae_machineepsilon) ) { eflag = ae_true; } } if( !eflag ) { *err = ae_true; ae_frame_leave(_state); return; } /* * With EVS */ dfbuildinternal(&xy, npoints, nvars, 1, ntrees, nsample, nfeatures, evs, &info, &df, &rep, _state); if( info<=0 ) { *err = ae_true; ae_frame_leave(_state); return; } eflag = ae_false; for(i=0; i<=npoints-1; i++) { ae_v_move(&x.ptr.p_double[0], 1, &xy.ptr.pp_double[i][0], 1, ae_v_len(0,nvars-1)); dfprocess(&df, &x, &y, _state); if( ae_fp_greater(ae_fabs(y.ptr.p_double[0]-xy.ptr.pp_double[i][nvars], _state),1000*ae_machineepsilon) ) { eflag = ae_true; } } if( eflag ) { *err = ae_true; ae_frame_leave(_state); return; } ae_frame_leave(_state); } /************************************************************************* Unsets DF *************************************************************************/ static void testdforestunit_unsetdf(decisionforest* df, ae_state *_state) { ae_frame _frame_block; ae_matrix xy; ae_int_t info; dfreport rep; ae_frame_make(_state, &_frame_block); ae_matrix_init(&xy, 0, 0, DT_REAL, _state); _dfreport_init(&rep, _state); ae_matrix_set_length(&xy, 0+1, 1+1, _state); xy.ptr.pp_double[0][0] = (double)(0); xy.ptr.pp_double[0][1] = (double)(0); dfbuildinternal(&xy, 1, 1, 1, 1, 1, 1, 0, &info, df, &rep, _state); ae_frame_leave(_state); } static double testgqunit_mapkind(ae_int_t k, ae_state *_state); static void testgqunit_buildgausslegendrequadrature(ae_int_t n, /* Real */ ae_vector* x, /* Real */ ae_vector* w, ae_state *_state); static void testgqunit_buildgaussjacobiquadrature(ae_int_t n, double alpha, double beta, /* Real */ ae_vector* x, /* Real */ ae_vector* w, ae_state *_state); static void testgqunit_buildgausslaguerrequadrature(ae_int_t n, double alpha, /* Real */ ae_vector* x, /* Real */ ae_vector* w, ae_state *_state); static void testgqunit_buildgausshermitequadrature(ae_int_t n, /* Real */ ae_vector* x, /* Real */ ae_vector* w, ae_state *_state); /************************************************************************* Test 
*************************************************************************/ ae_bool testgq(ae_bool silent, ae_state *_state) { ae_frame _frame_block; ae_vector alpha; ae_vector beta; ae_vector x; ae_vector w; ae_vector x2; ae_vector w2; double err; ae_int_t n; ae_int_t i; ae_int_t info; ae_int_t akind; ae_int_t bkind; double alphac; double betac; double errtol; double nonstricterrtol; double stricterrtol; ae_bool recerrors; ae_bool specerrors; ae_bool waserrors; ae_bool result; ae_frame_make(_state, &_frame_block); ae_vector_init(&alpha, 0, DT_REAL, _state); ae_vector_init(&beta, 0, DT_REAL, _state); ae_vector_init(&x, 0, DT_REAL, _state); ae_vector_init(&w, 0, DT_REAL, _state); ae_vector_init(&x2, 0, DT_REAL, _state); ae_vector_init(&w2, 0, DT_REAL, _state); recerrors = ae_false; specerrors = ae_false; waserrors = ae_false; errtol = 1.0E-12; nonstricterrtol = 1.0E-6; stricterrtol = 1000*ae_machineepsilon; /* * Three tests for rec-based Gauss quadratures with known weights/nodes: * 1. Gauss-Legendre with N=2 * 2. Gauss-Legendre with N=5 * 3. Gauss-Chebyshev with N=1, 2, 4, 8, ..., 512 */ err = (double)(0); ae_vector_set_length(&alpha, 2, _state); ae_vector_set_length(&beta, 2, _state); alpha.ptr.p_double[0] = (double)(0); alpha.ptr.p_double[1] = (double)(0); beta.ptr.p_double[1] = (double)1/(double)(4*1*1-1); gqgeneraterec(&alpha, &beta, 2.0, 2, &info, &x, &w, _state); if( info>0 ) { err = ae_maxreal(err, ae_fabs(x.ptr.p_double[0]+ae_sqrt((double)(3), _state)/3, _state), _state); err = ae_maxreal(err, ae_fabs(x.ptr.p_double[1]-ae_sqrt((double)(3), _state)/3, _state), _state); err = ae_maxreal(err, ae_fabs(w.ptr.p_double[0]-1, _state), _state); err = ae_maxreal(err, ae_fabs(w.ptr.p_double[1]-1, _state), _state); for(i=0; i<=0; i++) { recerrors = recerrors||ae_fp_greater_eq(x.ptr.p_double[i],x.ptr.p_double[i+1]); } } else { recerrors = ae_true; } ae_vector_set_length(&alpha, 5, _state); ae_vector_set_length(&beta, 5, _state); alpha.ptr.p_double[0] = (double)(0); for(i=1; i<=4; i++) { alpha.ptr.p_double[i] = (double)(0); beta.ptr.p_double[i] = ae_sqr((double)(i), _state)/(4*ae_sqr((double)(i), _state)-1); } gqgeneraterec(&alpha, &beta, 2.0, 5, &info, &x, &w, _state); if( info>0 ) { err = ae_maxreal(err, ae_fabs(x.ptr.p_double[0]+ae_sqrt(245+14*ae_sqrt((double)(70), _state), _state)/21, _state), _state); err = ae_maxreal(err, ae_fabs(x.ptr.p_double[0]+x.ptr.p_double[4], _state), _state); err = ae_maxreal(err, ae_fabs(x.ptr.p_double[1]+ae_sqrt(245-14*ae_sqrt((double)(70), _state), _state)/21, _state), _state); err = ae_maxreal(err, ae_fabs(x.ptr.p_double[1]+x.ptr.p_double[3], _state), _state); err = ae_maxreal(err, ae_fabs(x.ptr.p_double[2], _state), _state); err = ae_maxreal(err, ae_fabs(w.ptr.p_double[0]-(322-13*ae_sqrt((double)(70), _state))/900, _state), _state); err = ae_maxreal(err, ae_fabs(w.ptr.p_double[0]-w.ptr.p_double[4], _state), _state); err = ae_maxreal(err, ae_fabs(w.ptr.p_double[1]-(322+13*ae_sqrt((double)(70), _state))/900, _state), _state); err = ae_maxreal(err, ae_fabs(w.ptr.p_double[1]-w.ptr.p_double[3], _state), _state); err = ae_maxreal(err, ae_fabs(w.ptr.p_double[2]-(double)128/(double)225, _state), _state); for(i=0; i<=3; i++) { recerrors = recerrors||ae_fp_greater_eq(x.ptr.p_double[i],x.ptr.p_double[i+1]); } } else { recerrors = ae_true; } n = 1; while(n<=512) { ae_vector_set_length(&alpha, n, _state); ae_vector_set_length(&beta, n, _state); for(i=0; i<=n-1; i++) { alpha.ptr.p_double[i] = (double)(0); if( i==0 ) { beta.ptr.p_double[i] = (double)(0); } if( i==1 ) { 
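/*
 * Illustrative note: Alpha[k]=0, Beta[1]=1/2 and Beta[k]=1/4 for k>1 are the
 * recurrence coefficients of the monic Chebyshev polynomials of the first kind;
 * together with mu0=pi passed to GQGenerateRec below, they should reproduce the
 * analytically known Gauss-Chebyshev rule with nodes cos(pi*(n-i-0.5)/n) and
 * weights pi/n, which is exactly what this block checks.
 */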
beta.ptr.p_double[i] = (double)1/(double)2; } if( i>1 ) { beta.ptr.p_double[i] = (double)1/(double)4; } } gqgeneraterec(&alpha, &beta, ae_pi, n, &info, &x, &w, _state); if( info>0 ) { for(i=0; i<=n-1; i++) { err = ae_maxreal(err, ae_fabs(x.ptr.p_double[i]-ae_cos(ae_pi*(n-i-0.5)/n, _state), _state), _state); err = ae_maxreal(err, ae_fabs(w.ptr.p_double[i]-ae_pi/n, _state), _state); } for(i=0; i<=n-2; i++) { recerrors = recerrors||ae_fp_greater_eq(x.ptr.p_double[i],x.ptr.p_double[i+1]); } } else { recerrors = ae_true; } n = n*2; } recerrors = recerrors||ae_fp_greater(err,errtol); /* * Three tests for rec-based Gauss-Lobatto quadratures with known weights/nodes: * 1. Gauss-Lobatto with N=3 * 2. Gauss-Lobatto with N=4 * 3. Gauss-Lobatto with N=6 */ err = (double)(0); ae_vector_set_length(&alpha, 2, _state); ae_vector_set_length(&beta, 2, _state); alpha.ptr.p_double[0] = (double)(0); alpha.ptr.p_double[1] = (double)(0); beta.ptr.p_double[0] = (double)(0); beta.ptr.p_double[1] = (double)(1*1)/(double)(4*1*1-1); gqgenerategausslobattorec(&alpha, &beta, 2.0, (double)(-1), (double)(1), 3, &info, &x, &w, _state); if( info>0 ) { err = ae_maxreal(err, ae_fabs(x.ptr.p_double[0]+1, _state), _state); err = ae_maxreal(err, ae_fabs(x.ptr.p_double[1], _state), _state); err = ae_maxreal(err, ae_fabs(x.ptr.p_double[2]-1, _state), _state); err = ae_maxreal(err, ae_fabs(w.ptr.p_double[0]-(double)1/(double)3, _state), _state); err = ae_maxreal(err, ae_fabs(w.ptr.p_double[1]-(double)4/(double)3, _state), _state); err = ae_maxreal(err, ae_fabs(w.ptr.p_double[2]-(double)1/(double)3, _state), _state); for(i=0; i<=1; i++) { recerrors = recerrors||ae_fp_greater_eq(x.ptr.p_double[i],x.ptr.p_double[i+1]); } } else { recerrors = ae_true; } ae_vector_set_length(&alpha, 3, _state); ae_vector_set_length(&beta, 3, _state); alpha.ptr.p_double[0] = (double)(0); alpha.ptr.p_double[1] = (double)(0); alpha.ptr.p_double[2] = (double)(0); beta.ptr.p_double[0] = (double)(0); beta.ptr.p_double[1] = (double)(1*1)/(double)(4*1*1-1); beta.ptr.p_double[2] = (double)(2*2)/(double)(4*2*2-1); gqgenerategausslobattorec(&alpha, &beta, 2.0, (double)(-1), (double)(1), 4, &info, &x, &w, _state); if( info>0 ) { err = ae_maxreal(err, ae_fabs(x.ptr.p_double[0]+1, _state), _state); err = ae_maxreal(err, ae_fabs(x.ptr.p_double[1]+ae_sqrt((double)(5), _state)/5, _state), _state); err = ae_maxreal(err, ae_fabs(x.ptr.p_double[2]-ae_sqrt((double)(5), _state)/5, _state), _state); err = ae_maxreal(err, ae_fabs(x.ptr.p_double[3]-1, _state), _state); err = ae_maxreal(err, ae_fabs(w.ptr.p_double[0]-(double)1/(double)6, _state), _state); err = ae_maxreal(err, ae_fabs(w.ptr.p_double[1]-(double)5/(double)6, _state), _state); err = ae_maxreal(err, ae_fabs(w.ptr.p_double[2]-(double)5/(double)6, _state), _state); err = ae_maxreal(err, ae_fabs(w.ptr.p_double[3]-(double)1/(double)6, _state), _state); for(i=0; i<=2; i++) { recerrors = recerrors||ae_fp_greater_eq(x.ptr.p_double[i],x.ptr.p_double[i+1]); } } else { recerrors = ae_true; } ae_vector_set_length(&alpha, 5, _state); ae_vector_set_length(&beta, 5, _state); alpha.ptr.p_double[0] = (double)(0); alpha.ptr.p_double[1] = (double)(0); alpha.ptr.p_double[2] = (double)(0); alpha.ptr.p_double[3] = (double)(0); alpha.ptr.p_double[4] = (double)(0); beta.ptr.p_double[0] = (double)(0); beta.ptr.p_double[1] = (double)(1*1)/(double)(4*1*1-1); beta.ptr.p_double[2] = (double)(2*2)/(double)(4*2*2-1); beta.ptr.p_double[3] = (double)(3*3)/(double)(4*3*3-1); beta.ptr.p_double[4] = (double)(4*4)/(double)(4*4*4-1); 
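/*
 * Illustrative note: Alpha[k]=0 and Beta[k]=k^2/(4*k^2-1) filled in above are the
 * recurrence coefficients of the monic Legendre polynomials (mu0=2); the 6-point
 * Gauss-Lobatto rule built from them on [-1,1] is expected to have the classical
 * nodes +-1, +-sqrt((7-2*sqrt(7))/21), +-sqrt((7+2*sqrt(7))/21) and weights 1/15
 * at the endpoints, (14+sqrt(7))/30 at the inner pair and (14-sqrt(7))/30 at the
 * outer pair, which is what the checks below compare against.
 */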
gqgenerategausslobattorec(&alpha, &beta, 2.0, (double)(-1), (double)(1), 6, &info, &x, &w, _state); if( info>0 ) { err = ae_maxreal(err, ae_fabs(x.ptr.p_double[0]+1, _state), _state); err = ae_maxreal(err, ae_fabs(x.ptr.p_double[1]+ae_sqrt((7+2*ae_sqrt((double)(7), _state))/21, _state), _state), _state); err = ae_maxreal(err, ae_fabs(x.ptr.p_double[2]+ae_sqrt((7-2*ae_sqrt((double)(7), _state))/21, _state), _state), _state); err = ae_maxreal(err, ae_fabs(x.ptr.p_double[3]-ae_sqrt((7-2*ae_sqrt((double)(7), _state))/21, _state), _state), _state); err = ae_maxreal(err, ae_fabs(x.ptr.p_double[4]-ae_sqrt((7+2*ae_sqrt((double)(7), _state))/21, _state), _state), _state); err = ae_maxreal(err, ae_fabs(x.ptr.p_double[5]-1, _state), _state); err = ae_maxreal(err, ae_fabs(w.ptr.p_double[0]-(double)1/(double)15, _state), _state); err = ae_maxreal(err, ae_fabs(w.ptr.p_double[1]-(14-ae_sqrt((double)(7), _state))/30, _state), _state); err = ae_maxreal(err, ae_fabs(w.ptr.p_double[2]-(14+ae_sqrt((double)(7), _state))/30, _state), _state); err = ae_maxreal(err, ae_fabs(w.ptr.p_double[3]-(14+ae_sqrt((double)(7), _state))/30, _state), _state); err = ae_maxreal(err, ae_fabs(w.ptr.p_double[4]-(14-ae_sqrt((double)(7), _state))/30, _state), _state); err = ae_maxreal(err, ae_fabs(w.ptr.p_double[5]-(double)1/(double)15, _state), _state); for(i=0; i<=4; i++) { recerrors = recerrors||ae_fp_greater_eq(x.ptr.p_double[i],x.ptr.p_double[i+1]); } } else { recerrors = ae_true; } recerrors = recerrors||ae_fp_greater(err,errtol); /* * Three tests for rec-based Gauss-Radau quadratures with known weights/nodes: * 1. Gauss-Radau with N=2 * 2. Gauss-Radau with N=3 * 3. Gauss-Radau with N=3 (another case) */ err = (double)(0); ae_vector_set_length(&alpha, 1, _state); ae_vector_set_length(&beta, 2, _state); alpha.ptr.p_double[0] = (double)(0); beta.ptr.p_double[0] = (double)(0); beta.ptr.p_double[1] = (double)(1*1)/(double)(4*1*1-1); gqgenerategaussradaurec(&alpha, &beta, 2.0, (double)(-1), 2, &info, &x, &w, _state); if( info>0 ) { err = ae_maxreal(err, ae_fabs(x.ptr.p_double[0]+1, _state), _state); err = ae_maxreal(err, ae_fabs(x.ptr.p_double[1]-(double)1/(double)3, _state), _state); err = ae_maxreal(err, ae_fabs(w.ptr.p_double[0]-0.5, _state), _state); err = ae_maxreal(err, ae_fabs(w.ptr.p_double[1]-1.5, _state), _state); for(i=0; i<=0; i++) { recerrors = recerrors||ae_fp_greater_eq(x.ptr.p_double[i],x.ptr.p_double[i+1]); } } else { recerrors = ae_true; } ae_vector_set_length(&alpha, 2, _state); ae_vector_set_length(&beta, 3, _state); alpha.ptr.p_double[0] = (double)(0); alpha.ptr.p_double[1] = (double)(0); for(i=0; i<=2; i++) { beta.ptr.p_double[i] = ae_sqr((double)(i), _state)/(4*ae_sqr((double)(i), _state)-1); } gqgenerategaussradaurec(&alpha, &beta, 2.0, (double)(-1), 3, &info, &x, &w, _state); if( info>0 ) { err = ae_maxreal(err, ae_fabs(x.ptr.p_double[0]+1, _state), _state); err = ae_maxreal(err, ae_fabs(x.ptr.p_double[1]-(1-ae_sqrt((double)(6), _state))/5, _state), _state); err = ae_maxreal(err, ae_fabs(x.ptr.p_double[2]-(1+ae_sqrt((double)(6), _state))/5, _state), _state); err = ae_maxreal(err, ae_fabs(w.ptr.p_double[0]-(double)2/(double)9, _state), _state); err = ae_maxreal(err, ae_fabs(w.ptr.p_double[1]-(16+ae_sqrt((double)(6), _state))/18, _state), _state); err = ae_maxreal(err, ae_fabs(w.ptr.p_double[2]-(16-ae_sqrt((double)(6), _state))/18, _state), _state); for(i=0; i<=1; i++) { recerrors = recerrors||ae_fp_greater_eq(x.ptr.p_double[i],x.ptr.p_double[i+1]); } } else { recerrors = ae_true; } 
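/*
 * Illustrative note: the next block repeats the N=3 Gauss-Radau test with the
 * fixed node placed at +1 instead of -1; by symmetry of the Legendre weight on
 * [-1,1], the expected nodes are the negated, order-reversed nodes of the
 * previous case and the weights are the same values in reversed order, which is
 * why the checks compare against the same constants with opposite signs and
 * swapped indexes.
 */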
ae_vector_set_length(&alpha, 2, _state); ae_vector_set_length(&beta, 3, _state); alpha.ptr.p_double[0] = (double)(0); alpha.ptr.p_double[1] = (double)(0); for(i=0; i<=2; i++) { beta.ptr.p_double[i] = ae_sqr((double)(i), _state)/(4*ae_sqr((double)(i), _state)-1); } gqgenerategaussradaurec(&alpha, &beta, 2.0, (double)(1), 3, &info, &x, &w, _state); if( info>0 ) { err = ae_maxreal(err, ae_fabs(x.ptr.p_double[2]-1, _state), _state); err = ae_maxreal(err, ae_fabs(x.ptr.p_double[1]+(1-ae_sqrt((double)(6), _state))/5, _state), _state); err = ae_maxreal(err, ae_fabs(x.ptr.p_double[0]+(1+ae_sqrt((double)(6), _state))/5, _state), _state); err = ae_maxreal(err, ae_fabs(w.ptr.p_double[2]-(double)2/(double)9, _state), _state); err = ae_maxreal(err, ae_fabs(w.ptr.p_double[1]-(16+ae_sqrt((double)(6), _state))/18, _state), _state); err = ae_maxreal(err, ae_fabs(w.ptr.p_double[0]-(16-ae_sqrt((double)(6), _state))/18, _state), _state); for(i=0; i<=1; i++) { recerrors = recerrors||ae_fp_greater_eq(x.ptr.p_double[i],x.ptr.p_double[i+1]); } } else { recerrors = ae_true; } recerrors = recerrors||ae_fp_greater(err,errtol); /* * test recurrence-based special cases (Legendre, Jacobi, Hermite, ...) * against another implementation (polynomial root-finder) */ for(n=1; n<=20; n++) { /* * test gauss-legendre */ err = (double)(0); gqgenerategausslegendre(n, &info, &x, &w, _state); if( info>0 ) { testgqunit_buildgausslegendrequadrature(n, &x2, &w2, _state); for(i=0; i<=n-1; i++) { err = ae_maxreal(err, ae_fabs(x.ptr.p_double[i]-x2.ptr.p_double[i], _state), _state); err = ae_maxreal(err, ae_fabs(w.ptr.p_double[i]-w2.ptr.p_double[i], _state), _state); } } else { specerrors = ae_true; } specerrors = specerrors||ae_fp_greater(err,errtol); /* * Test Gauss-Jacobi. * Since task is much more difficult we will use less strict * threshold. 
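(Concretely, the Gauss-Legendre comparison above uses ErrTol=1.0E-12, while the comparison of general Gauss-Jacobi rules against the polynomial-root-finder reference implementation below is only required to agree to NonStrictErrTol=1.0E-6.)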
*/ err = (double)(0); for(akind=0; akind<=9; akind++) { for(bkind=0; bkind<=9; bkind++) { alphac = testgqunit_mapkind(akind, _state); betac = testgqunit_mapkind(bkind, _state); gqgenerategaussjacobi(n, alphac, betac, &info, &x, &w, _state); if( info>0 ) { testgqunit_buildgaussjacobiquadrature(n, alphac, betac, &x2, &w2, _state); for(i=0; i<=n-1; i++) { err = ae_maxreal(err, ae_fabs(x.ptr.p_double[i]-x2.ptr.p_double[i], _state), _state); err = ae_maxreal(err, ae_fabs(w.ptr.p_double[i]-w2.ptr.p_double[i], _state), _state); } } else { specerrors = ae_true; } } } specerrors = specerrors||ae_fp_greater(err,nonstricterrtol); /* * special test for Gauss-Jacobi (Chebyshev weight * function with analytically known nodes/weights) */ err = (double)(0); gqgenerategaussjacobi(n, -0.5, -0.5, &info, &x, &w, _state); if( info>0 ) { for(i=0; i<=n-1; i++) { err = ae_maxreal(err, ae_fabs(x.ptr.p_double[i]+ae_cos(ae_pi*(i+0.5)/n, _state), _state), _state); err = ae_maxreal(err, ae_fabs(w.ptr.p_double[i]-ae_pi/n, _state), _state); } } else { specerrors = ae_true; } specerrors = specerrors||ae_fp_greater(err,stricterrtol); /* * Test Gauss-Laguerre */ err = (double)(0); for(akind=0; akind<=9; akind++) { alphac = testgqunit_mapkind(akind, _state); gqgenerategausslaguerre(n, alphac, &info, &x, &w, _state); if( info>0 ) { testgqunit_buildgausslaguerrequadrature(n, alphac, &x2, &w2, _state); for(i=0; i<=n-1; i++) { err = ae_maxreal(err, ae_fabs(x.ptr.p_double[i]-x2.ptr.p_double[i], _state), _state); err = ae_maxreal(err, ae_fabs(w.ptr.p_double[i]-w2.ptr.p_double[i], _state), _state); } } else { specerrors = ae_true; } } specerrors = specerrors||ae_fp_greater(err,nonstricterrtol); /* * Test Gauss-Hermite */ err = (double)(0); gqgenerategausshermite(n, &info, &x, &w, _state); if( info>0 ) { testgqunit_buildgausshermitequadrature(n, &x2, &w2, _state); for(i=0; i<=n-1; i++) { err = ae_maxreal(err, ae_fabs(x.ptr.p_double[i]-x2.ptr.p_double[i], _state), _state); err = ae_maxreal(err, ae_fabs(w.ptr.p_double[i]-w2.ptr.p_double[i], _state), _state); } } else { specerrors = ae_true; } specerrors = specerrors||ae_fp_greater(err,nonstricterrtol); } /* * end */ waserrors = recerrors||specerrors; if( !silent ) { printf("TESTING GAUSS QUADRATURES\n"); printf("FINAL RESULT: "); if( waserrors ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("* SPECIAL CASES (LEGENDRE/JACOBI/..) "); if( specerrors ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("* RECURRENCE-BASED: "); if( recerrors ) { printf("FAILED\n"); } else { printf("OK\n"); } if( waserrors ) { printf("TEST FAILED\n"); } else { printf("TEST PASSED\n"); } printf("\n\n"); } result = !waserrors; ae_frame_leave(_state); return result; } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. 
*************************************************************************/ ae_bool _pexec_testgq(ae_bool silent, ae_state *_state) { return testgq(silent, _state); } /************************************************************************* Maps: 0 => -0.9 1 => -0.5 2 => -0.1 3 => 0.0 4 => +0.1 5 => +0.5 6 => +0.9 7 => +1.0 8 => +1.5 9 => +2.0 *************************************************************************/ static double testgqunit_mapkind(ae_int_t k, ae_state *_state) { double result; result = (double)(0); if( k==0 ) { result = -0.9; } if( k==1 ) { result = -0.5; } if( k==2 ) { result = -0.1; } if( k==3 ) { result = 0.0; } if( k==4 ) { result = 0.1; } if( k==5 ) { result = 0.5; } if( k==6 ) { result = 0.9; } if( k==7 ) { result = 1.0; } if( k==8 ) { result = 1.5; } if( k==9 ) { result = 2.0; } return result; } /************************************************************************* Gauss-Legendre, another variant *************************************************************************/ static void testgqunit_buildgausslegendrequadrature(ae_int_t n, /* Real */ ae_vector* x, /* Real */ ae_vector* w, ae_state *_state) { ae_int_t i; ae_int_t j; double r; double r1; double p1; double p2; double p3; double dp3; double tmp; ae_vector_clear(x); ae_vector_clear(w); ae_vector_set_length(x, n-1+1, _state); ae_vector_set_length(w, n-1+1, _state); for(i=0; i<=(n+1)/2-1; i++) { r = ae_cos(ae_pi*(4*i+3)/(4*n+2), _state); do { p2 = (double)(0); p3 = (double)(1); for(j=0; j<=n-1; j++) { p1 = p2; p2 = p3; p3 = ((2*j+1)*r*p2-j*p1)/(j+1); } dp3 = n*(r*p3-p2)/(r*r-1); r1 = r; r = r-p3/dp3; } while(ae_fp_greater_eq(ae_fabs(r-r1, _state),ae_machineepsilon*(1+ae_fabs(r, _state))*100)); x->ptr.p_double[i] = r; x->ptr.p_double[n-1-i] = -r; w->ptr.p_double[i] = 2/((1-r*r)*dp3*dp3); w->ptr.p_double[n-1-i] = 2/((1-r*r)*dp3*dp3); } for(i=0; i<=n-1; i++) { for(j=0; j<=n-2-i; j++) { if( ae_fp_greater_eq(x->ptr.p_double[j],x->ptr.p_double[j+1]) ) { tmp = x->ptr.p_double[j]; x->ptr.p_double[j] = x->ptr.p_double[j+1]; x->ptr.p_double[j+1] = tmp; tmp = w->ptr.p_double[j]; w->ptr.p_double[j] = w->ptr.p_double[j+1]; w->ptr.p_double[j+1] = tmp; } } } } /************************************************************************* Gauss-Jacobi, another variant *************************************************************************/ static void testgqunit_buildgaussjacobiquadrature(ae_int_t n, double alpha, double beta, /* Real */ ae_vector* x, /* Real */ ae_vector* w, ae_state *_state) { ae_int_t i; ae_int_t j; double r; double r1; double t1; double t2; double t3; double p1; double p2; double p3; double pp; double an; double bn; double a; double b; double c; double tmpsgn; double tmp; double alfbet; double temp; ae_vector_clear(x); ae_vector_clear(w); ae_vector_set_length(x, n-1+1, _state); ae_vector_set_length(w, n-1+1, _state); r = (double)(0); for(i=0; i<=n-1; i++) { if( i==0 ) { an = alpha/n; bn = beta/n; t1 = (1+alpha)*(2.78/(4+n*n)+0.768*an/n); t2 = 1+1.48*an+0.96*bn+0.452*an*an+0.83*an*bn; r = (t2-t1)/t2; } else { if( i==1 ) { t1 = (4.1+alpha)/((1+alpha)*(1+0.156*alpha)); t2 = 1+0.06*(n-8)*(1+0.12*alpha)/n; t3 = 1+0.012*beta*(1+0.25*ae_fabs(alpha, _state))/n; r = r-t1*t2*t3*(1-r); } else { if( i==2 ) { t1 = (1.67+0.28*alpha)/(1+0.37*alpha); t2 = 1+0.22*(n-8)/n; t3 = 1+8*beta/((6.28+beta)*n*n); r = r-t1*t2*t3*(x->ptr.p_double[0]-r); } else { if( i<n-2 ) { r = 3*x->ptr.p_double[i-1]-3*x->ptr.p_double[i-2]+x->ptr.p_double[i-3]; } else { if( i==n-2 ) { t1 = (1+0.235*beta)/(0.766+0.119*beta); t2 = 1/(1+0.639*(n-4)/(1+0.71*(n-4)));
t3 = 1/(1+20*alpha/((7.5+alpha)*n*n)); r = r+t1*t2*t3*(r-x->ptr.p_double[i-2]); } else { if( i==n-1 ) { t1 = (1+0.37*beta)/(1.67+0.28*beta); t2 = 1/(1+0.22*(n-8)/n); t3 = 1/(1+8*alpha/((6.28+alpha)*n*n)); r = r+t1*t2*t3*(r-x->ptr.p_double[i-2]); } } } } } } alfbet = alpha+beta; do { temp = 2+alfbet; p1 = (alpha-beta+temp*r)*0.5; p2 = (double)(1); for(j=2; j<=n; j++) { p3 = p2; p2 = p1; temp = 2*j+alfbet; a = 2*j*(j+alfbet)*(temp-2); b = (temp-1)*(alpha*alpha-beta*beta+temp*(temp-2)*r); c = 2*(j-1+alpha)*(j-1+beta)*temp; p1 = (b*p2-c*p3)/a; } pp = (n*(alpha-beta-temp*r)*p1+2*(n+alpha)*(n+beta)*p2)/(temp*(1-r*r)); r1 = r; r = r1-p1/pp; } while(ae_fp_greater_eq(ae_fabs(r-r1, _state),ae_machineepsilon*(1+ae_fabs(r, _state))*100)); x->ptr.p_double[i] = r; w->ptr.p_double[i] = ae_exp(lngamma(alpha+n, &tmpsgn, _state)+lngamma(beta+n, &tmpsgn, _state)-lngamma((double)(n+1), &tmpsgn, _state)-lngamma(n+alfbet+1, &tmpsgn, _state), _state)*temp*ae_pow((double)(2), alfbet, _state)/(pp*p2); } for(i=0; i<=n-1; i++) { for(j=0; j<=n-2-i; j++) { if( ae_fp_greater_eq(x->ptr.p_double[j],x->ptr.p_double[j+1]) ) { tmp = x->ptr.p_double[j]; x->ptr.p_double[j] = x->ptr.p_double[j+1]; x->ptr.p_double[j+1] = tmp; tmp = w->ptr.p_double[j]; w->ptr.p_double[j] = w->ptr.p_double[j+1]; w->ptr.p_double[j+1] = tmp; } } } } /************************************************************************* Gauss-Laguerre, another variant *************************************************************************/ static void testgqunit_buildgausslaguerrequadrature(ae_int_t n, double alpha, /* Real */ ae_vector* x, /* Real */ ae_vector* w, ae_state *_state) { ae_int_t i; ae_int_t j; double r; double r1; double p1; double p2; double p3; double dp3; double tsg; double tmp; ae_vector_clear(x); ae_vector_clear(w); ae_vector_set_length(x, n-1+1, _state); ae_vector_set_length(w, n-1+1, _state); r = (double)(0); for(i=0; i<=n-1; i++) { if( i==0 ) { r = (1+alpha)*(3+0.92*alpha)/(1+2.4*n+1.8*alpha); } else { if( i==1 ) { r = r+(15+6.25*alpha)/(1+0.9*alpha+2.5*n); } else { r = r+((1+2.55*(i-1))/(1.9*(i-1))+1.26*(i-1)*alpha/(1+3.5*(i-1)))/(1+0.3*alpha)*(r-x->ptr.p_double[i-2]); } } do { p2 = (double)(0); p3 = (double)(1); for(j=0; j<=n-1; j++) { p1 = p2; p2 = p3; p3 = ((-r+2*j+alpha+1)*p2-(j+alpha)*p1)/(j+1); } dp3 = (n*p3-(n+alpha)*p2)/r; r1 = r; r = r-p3/dp3; } while(ae_fp_greater_eq(ae_fabs(r-r1, _state),ae_machineepsilon*(1+ae_fabs(r, _state))*100)); x->ptr.p_double[i] = r; w->ptr.p_double[i] = -ae_exp(lngamma(alpha+n, &tsg, _state)-lngamma((double)(n), &tsg, _state), _state)/(dp3*n*p2); } for(i=0; i<=n-1; i++) { for(j=0; j<=n-2-i; j++) { if( ae_fp_greater_eq(x->ptr.p_double[j],x->ptr.p_double[j+1]) ) { tmp = x->ptr.p_double[j]; x->ptr.p_double[j] = x->ptr.p_double[j+1]; x->ptr.p_double[j+1] = tmp; tmp = w->ptr.p_double[j]; w->ptr.p_double[j] = w->ptr.p_double[j+1]; w->ptr.p_double[j+1] = tmp; } } } } /************************************************************************* Gauss-Hermite, another variant *************************************************************************/ static void testgqunit_buildgausshermitequadrature(ae_int_t n, /* Real */ ae_vector* x, /* Real */ ae_vector* w, ae_state *_state) { ae_int_t i; ae_int_t j; double r; double r1; double p1; double p2; double p3; double dp3; double pipm4; double tmp; ae_vector_clear(x); ae_vector_clear(w); ae_vector_set_length(x, n-1+1, _state); ae_vector_set_length(w, n-1+1, _state); pipm4 = ae_pow(ae_pi, -0.25, _state); r = (double)(0); for(i=0; i<=(n+1)/2-1; i++) { if( i==0 ) { r = 
ae_sqrt((double)(2*n+1), _state)-1.85575*ae_pow((double)(2*n+1), -(double)1/(double)6, _state); } else { if( i==1 ) { r = r-1.14*ae_pow((double)(n), 0.426, _state)/r; } else { if( i==2 ) { r = 1.86*r-0.86*x->ptr.p_double[0]; } else { if( i==3 ) { r = 1.91*r-0.91*x->ptr.p_double[1]; } else { r = 2*r-x->ptr.p_double[i-2]; } } } } do { p2 = (double)(0); p3 = pipm4; for(j=0; j<=n-1; j++) { p1 = p2; p2 = p3; p3 = p2*r*ae_sqrt((double)2/(double)(j+1), _state)-p1*ae_sqrt((double)j/(double)(j+1), _state); } dp3 = ae_sqrt((double)(2*j), _state)*p2; r1 = r; r = r-p3/dp3; } while(ae_fp_greater_eq(ae_fabs(r-r1, _state),ae_machineepsilon*(1+ae_fabs(r, _state))*100)); x->ptr.p_double[i] = r; w->ptr.p_double[i] = 2/(dp3*dp3); x->ptr.p_double[n-1-i] = -x->ptr.p_double[i]; w->ptr.p_double[n-1-i] = w->ptr.p_double[i]; } for(i=0; i<=n-1; i++) { for(j=0; j<=n-2-i; j++) { if( ae_fp_greater_eq(x->ptr.p_double[j],x->ptr.p_double[j+1]) ) { tmp = x->ptr.p_double[j]; x->ptr.p_double[j] = x->ptr.p_double[j+1]; x->ptr.p_double[j+1] = tmp; tmp = w->ptr.p_double[j]; w->ptr.p_double[j] = w->ptr.p_double[j+1]; w->ptr.p_double[j+1] = tmp; } } } } static double testgkqunit_mapkind(ae_int_t k, ae_state *_state); /************************************************************************* Test *************************************************************************/ ae_bool testgkq(ae_bool silent, ae_state *_state) { ae_frame _frame_block; ae_int_t pkind; double errtol; double eps; ae_int_t n; ae_int_t i; ae_int_t k; ae_int_t info; double err; ae_int_t akind; ae_int_t bkind; double alphac; double betac; ae_vector x1; ae_vector wg1; ae_vector wk1; ae_vector x2; ae_vector wg2; ae_vector wk2; ae_int_t info1; ae_int_t info2; ae_bool successatleastonce; ae_bool intblerrors; ae_bool vstblerrors; ae_bool generrors; ae_bool waserrors; ae_bool result; ae_frame_make(_state, &_frame_block); ae_vector_init(&x1, 0, DT_REAL, _state); ae_vector_init(&wg1, 0, DT_REAL, _state); ae_vector_init(&wk1, 0, DT_REAL, _state); ae_vector_init(&x2, 0, DT_REAL, _state); ae_vector_init(&wg2, 0, DT_REAL, _state); ae_vector_init(&wk2, 0, DT_REAL, _state); intblerrors = ae_false; vstblerrors = ae_false; generrors = ae_false; waserrors = ae_false; errtol = 10000*ae_machineepsilon; /* * test recurrence-based Legendre nodes against the precalculated table */ for(pkind=0; pkind<=5; pkind++) { n = 0; if( pkind==0 ) { n = 15; } if( pkind==1 ) { n = 21; } if( pkind==2 ) { n = 31; } if( pkind==3 ) { n = 41; } if( pkind==4 ) { n = 51; } if( pkind==5 ) { n = 61; } gkqlegendrecalc(n, &info, &x1, &wk1, &wg1, _state); gkqlegendretbl(n, &x2, &wk2, &wg2, &eps, _state); if( info<=0 ) { generrors = ae_true; break; } for(i=0; i<=n-1; i++) { vstblerrors = vstblerrors||ae_fp_greater(ae_fabs(x1.ptr.p_double[i]-x2.ptr.p_double[i], _state),errtol); vstblerrors = vstblerrors||ae_fp_greater(ae_fabs(wk1.ptr.p_double[i]-wk2.ptr.p_double[i], _state),errtol); vstblerrors = vstblerrors||ae_fp_greater(ae_fabs(wg1.ptr.p_double[i]-wg2.ptr.p_double[i], _state),errtol); } } /* * Test recurrence-baced Gauss-Kronrod nodes against Gauss-only nodes * calculated with subroutines from GQ unit. 
*/ for(k=1; k<=30; k++) { n = 2*k+1; /* * Gauss-Legendre */ err = (double)(0); gkqgenerategausslegendre(n, &info1, &x1, &wk1, &wg1, _state); gqgenerategausslegendre(k, &info2, &x2, &wg2, _state); if( info1>0&&info2>0 ) { for(i=0; i<=k-1; i++) { err = ae_maxreal(err, ae_fabs(x1.ptr.p_double[2*i+1]-x2.ptr.p_double[i], _state), _state); err = ae_maxreal(err, ae_fabs(wg1.ptr.p_double[2*i+1]-wg2.ptr.p_double[i], _state), _state); } } else { generrors = ae_true; } generrors = generrors||ae_fp_greater(err,errtol); } for(k=1; k<=15; k++) { n = 2*k+1; /* * Gauss-Jacobi */ successatleastonce = ae_false; err = (double)(0); for(akind=0; akind<=9; akind++) { for(bkind=0; bkind<=9; bkind++) { alphac = testgkqunit_mapkind(akind, _state); betac = testgkqunit_mapkind(bkind, _state); gkqgenerategaussjacobi(n, alphac, betac, &info1, &x1, &wk1, &wg1, _state); gqgenerategaussjacobi(k, alphac, betac, &info2, &x2, &wg2, _state); if( info1>0&&info2>0 ) { successatleastonce = ae_true; for(i=0; i<=k-1; i++) { err = ae_maxreal(err, ae_fabs(x1.ptr.p_double[2*i+1]-x2.ptr.p_double[i], _state), _state); err = ae_maxreal(err, ae_fabs(wg1.ptr.p_double[2*i+1]-wg2.ptr.p_double[i], _state), _state); } } else { generrors = generrors||info1!=-5; } } } generrors = (generrors||ae_fp_greater(err,errtol))||!successatleastonce; } /* * end */ waserrors = (intblerrors||vstblerrors)||generrors; if( !silent ) { printf("TESTING GAUSS-KRONROD QUADRATURES\n"); printf("FINAL RESULT: "); if( waserrors ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("* PRE-CALCULATED TABLE: "); if( intblerrors ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("* CALCULATED AGAINST THE TABLE: "); if( vstblerrors ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("* GENERAL PROPERTIES: "); if( generrors ) { printf("FAILED\n"); } else { printf("OK\n"); } if( waserrors ) { printf("TEST FAILED\n"); } else { printf("TEST PASSED\n"); } printf("\n\n"); } result = !waserrors; ae_frame_leave(_state); return result; } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. 
*************************************************************************/ ae_bool _pexec_testgkq(ae_bool silent, ae_state *_state) { return testgkq(silent, _state); } /************************************************************************* Maps: 0 => -0.9 1 => -0.5 2 => -0.1 3 => 0.0 4 => +0.1 5 => +0.5 6 => +0.9 7 => +1.0 8 => +1.5 9 => +2.0 *************************************************************************/ static double testgkqunit_mapkind(ae_int_t k, ae_state *_state) { double result; result = (double)(0); if( k==0 ) { result = -0.9; } if( k==1 ) { result = -0.5; } if( k==2 ) { result = -0.1; } if( k==3 ) { result = 0.0; } if( k==4 ) { result = 0.1; } if( k==5 ) { result = 0.5; } if( k==6 ) { result = 0.9; } if( k==7 ) { result = 1.0; } if( k==8 ) { result = 1.5; } if( k==9 ) { result = 2.0; } return result; } /************************************************************************* Test *************************************************************************/ ae_bool testautogk(ae_bool silent, ae_state *_state) { ae_frame _frame_block; double a; double b; autogkstate state; autogkreport rep; double v; double exact; double eabs; double alpha; ae_int_t pkind; double errtol; ae_bool simpleerrors; ae_bool sngenderrors; ae_bool waserrors; ae_bool result; ae_frame_make(_state, &_frame_block); _autogkstate_init(&state, _state); _autogkreport_init(&rep, _state); simpleerrors = ae_false; sngenderrors = ae_false; waserrors = ae_false; errtol = 10000*ae_machineepsilon; /* * Simple test: integral(exp(x),+-1,+-2), no maximum width requirements */ a = (2*ae_randominteger(2, _state)-1)*1.0; b = (2*ae_randominteger(2, _state)-1)*2.0; autogksmooth(a, b, &state, _state); while(autogkiteration(&state, _state)) { state.f = ae_exp(state.x, _state); } autogkresults(&state, &v, &rep, _state); exact = ae_exp(b, _state)-ae_exp(a, _state); eabs = ae_fabs(ae_exp(b, _state)-ae_exp(a, _state), _state); if( rep.terminationtype<=0 ) { simpleerrors = ae_true; } else { simpleerrors = simpleerrors||ae_fp_greater(ae_fabs(exact-v, _state),errtol*eabs); } /* * Simple test: integral(exp(x),+-1,+-2), XWidth=0.1 */ a = (2*ae_randominteger(2, _state)-1)*1.0; b = (2*ae_randominteger(2, _state)-1)*2.0; autogksmoothw(a, b, 0.1, &state, _state); while(autogkiteration(&state, _state)) { state.f = ae_exp(state.x, _state); } autogkresults(&state, &v, &rep, _state); exact = ae_exp(b, _state)-ae_exp(a, _state); eabs = ae_fabs(ae_exp(b, _state)-ae_exp(a, _state), _state); if( rep.terminationtype<=0 ) { simpleerrors = ae_true; } else { simpleerrors = simpleerrors||ae_fp_greater(ae_fabs(exact-v, _state),errtol*eabs); } /* * Simple test: integral(cos(100*x),0,2*pi), no maximum width requirements */ a = (double)(0); b = 2*ae_pi; autogksmooth(a, b, &state, _state); while(autogkiteration(&state, _state)) { state.f = ae_cos(100*state.x, _state); } autogkresults(&state, &v, &rep, _state); exact = (double)(0); eabs = (double)(4); if( rep.terminationtype<=0 ) { simpleerrors = ae_true; } else { simpleerrors = simpleerrors||ae_fp_greater(ae_fabs(exact-v, _state),errtol*eabs); } /* * Simple test: integral(cos(100*x),0,2*pi), XWidth=0.3 */ a = (double)(0); b = 2*ae_pi; autogksmoothw(a, b, 0.3, &state, _state); while(autogkiteration(&state, _state)) { state.f = ae_cos(100*state.x, _state); } autogkresults(&state, &v, &rep, _state); exact = (double)(0); eabs = (double)(4); if( rep.terminationtype<=0 ) { simpleerrors = ae_true; } else { simpleerrors = simpleerrors||ae_fp_greater(ae_fabs(exact-v, _state),errtol*eabs); } /* * singular 
problem on [a,b] = [0.1, 0.5] * f2(x) = (1+x)*(b-x)^alpha, -1 < alpha < 1 */ for(pkind=0; pkind<=6; pkind++) { a = 0.1; b = 0.5; alpha = 0.0; if( pkind==0 ) { alpha = -0.9; } if( pkind==1 ) { alpha = -0.5; } if( pkind==2 ) { alpha = -0.1; } if( pkind==3 ) { alpha = 0.0; } if( pkind==4 ) { alpha = 0.1; } if( pkind==5 ) { alpha = 0.5; } if( pkind==6 ) { alpha = 0.9; } /* * f1(x) = (1+x)*(x-a)^alpha, -1 < alpha < 1 * 1. use singular integrator for [a,b] * 2. use singular integrator for [b,a] */ exact = ae_pow(b-a, alpha+2, _state)/(alpha+2)+(1+a)*ae_pow(b-a, alpha+1, _state)/(alpha+1); eabs = ae_fabs(exact, _state); autogksingular(a, b, alpha, 0.0, &state, _state); while(autogkiteration(&state, _state)) { if( ae_fp_less(state.xminusa,0.01) ) { state.f = ae_pow(state.xminusa, alpha, _state)*(1+state.x); } else { state.f = ae_pow(state.x-a, alpha, _state)*(1+state.x); } } autogkresults(&state, &v, &rep, _state); if( rep.terminationtype<=0 ) { sngenderrors = ae_true; } else { sngenderrors = sngenderrors||ae_fp_greater(ae_fabs(v-exact, _state),errtol*eabs); } autogksingular(b, a, 0.0, alpha, &state, _state); while(autogkiteration(&state, _state)) { if( ae_fp_greater(state.bminusx,-0.01) ) { state.f = ae_pow(-state.bminusx, alpha, _state)*(1+state.x); } else { state.f = ae_pow(state.x-a, alpha, _state)*(1+state.x); } } autogkresults(&state, &v, &rep, _state); if( rep.terminationtype<=0 ) { sngenderrors = ae_true; } else { sngenderrors = sngenderrors||ae_fp_greater(ae_fabs(-v-exact, _state),errtol*eabs); } /* * f1(x) = (1+x)*(b-x)^alpha, -1 < alpha < 1 * 1. use singular integrator for [a,b] * 2. use singular integrator for [b,a] */ exact = (1+b)*ae_pow(b-a, alpha+1, _state)/(alpha+1)-ae_pow(b-a, alpha+2, _state)/(alpha+2); eabs = ae_fabs(exact, _state); autogksingular(a, b, 0.0, alpha, &state, _state); while(autogkiteration(&state, _state)) { if( ae_fp_less(state.bminusx,0.01) ) { state.f = ae_pow(state.bminusx, alpha, _state)*(1+state.x); } else { state.f = ae_pow(b-state.x, alpha, _state)*(1+state.x); } } autogkresults(&state, &v, &rep, _state); if( rep.terminationtype<=0 ) { sngenderrors = ae_true; } else { sngenderrors = sngenderrors||ae_fp_greater(ae_fabs(v-exact, _state),errtol*eabs); } autogksingular(b, a, alpha, 0.0, &state, _state); while(autogkiteration(&state, _state)) { if( ae_fp_greater(state.xminusa,-0.01) ) { state.f = ae_pow(-state.xminusa, alpha, _state)*(1+state.x); } else { state.f = ae_pow(b-state.x, alpha, _state)*(1+state.x); } } autogkresults(&state, &v, &rep, _state); if( rep.terminationtype<=0 ) { sngenderrors = ae_true; } else { sngenderrors = sngenderrors||ae_fp_greater(ae_fabs(-v-exact, _state),errtol*eabs); } } /* * end */ waserrors = simpleerrors||sngenderrors; if( !silent ) { printf("TESTING AUTOGK\n"); printf("INTEGRATION WITH GIVEN ACCURACY: "); if( simpleerrors||sngenderrors ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("* SIMPLE PROBLEMS: "); if( simpleerrors ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("* SINGULAR PROBLEMS (ENDS OF INTERVAL): "); if( sngenderrors ) { printf("FAILED\n"); } else { printf("OK\n"); } if( waserrors ) { printf("TEST FAILED\n"); } else { printf("TEST PASSED\n"); } printf("\n\n"); } result = !waserrors; ae_frame_leave(_state); return result; } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. 
*************************************************************************/ ae_bool _pexec_testautogk(ae_bool silent, ae_state *_state) { return testautogk(silent, _state); } static void testfftunit_reffftc1d(/* Complex */ ae_vector* a, ae_int_t n, ae_state *_state); static void testfftunit_reffftc1dinv(/* Complex */ ae_vector* a, ae_int_t n, ae_state *_state); static void testfftunit_refinternalcfft(/* Real */ ae_vector* a, ae_int_t nn, ae_bool inversefft, ae_state *_state); static void testfftunit_refinternalrfft(/* Real */ ae_vector* a, ae_int_t nn, /* Complex */ ae_vector* f, ae_state *_state); static void testfftunit_quicktest(ae_int_t n, double* referr, double* refrerr, ae_state *_state); /************************************************************************* Test *************************************************************************/ ae_bool testfft(ae_bool silent, ae_state *_state) { ae_frame _frame_block; ae_int_t n; ae_int_t i; ae_int_t k; ae_vector a1; ae_vector a2; ae_vector a3; ae_vector r1; ae_vector r2; ae_vector buf; fasttransformplan plan; ae_int_t maxsmalln; double bidierr; double bidirerr; double referr; double refrerr; double reinterr; double errtol; ae_bool referrors; ae_bool bidierrors; ae_bool refrerrors; ae_bool bidirerrors; ae_bool reinterrors; ae_bool waserrors; ae_bool result; ae_frame_make(_state, &_frame_block); ae_vector_init(&a1, 0, DT_COMPLEX, _state); ae_vector_init(&a2, 0, DT_COMPLEX, _state); ae_vector_init(&a3, 0, DT_COMPLEX, _state); ae_vector_init(&r1, 0, DT_REAL, _state); ae_vector_init(&r2, 0, DT_REAL, _state); ae_vector_init(&buf, 0, DT_REAL, _state); _fasttransformplan_init(&plan, _state); maxsmalln = 128; errtol = 100000*ae_pow((double)(maxsmalln), (double)3/(double)2, _state)*ae_machineepsilon; bidierrors = ae_false; referrors = ae_false; bidirerrors = ae_false; refrerrors = ae_false; reinterrors = ae_false; waserrors = ae_false; /* * Test bi-directional error: norm(x-invFFT(FFT(x))) */ bidierr = (double)(0); bidirerr = (double)(0); for(n=1; n<=maxsmalln; n++) { /* * Complex FFT/invFFT */ ae_vector_set_length(&a1, n, _state); ae_vector_set_length(&a2, n, _state); ae_vector_set_length(&a3, n, _state); for(i=0; i<=n-1; i++) { a1.ptr.p_complex[i].x = 2*ae_randomreal(_state)-1; a1.ptr.p_complex[i].y = 2*ae_randomreal(_state)-1; a2.ptr.p_complex[i] = a1.ptr.p_complex[i]; a3.ptr.p_complex[i] = a1.ptr.p_complex[i]; } fftc1d(&a2, n, _state); fftc1dinv(&a2, n, _state); fftc1dinv(&a3, n, _state); fftc1d(&a3, n, _state); for(i=0; i<=n-1; i++) { bidierr = ae_maxreal(bidierr, ae_c_abs(ae_c_sub(a1.ptr.p_complex[i],a2.ptr.p_complex[i]), _state), _state); bidierr = ae_maxreal(bidierr, ae_c_abs(ae_c_sub(a1.ptr.p_complex[i],a3.ptr.p_complex[i]), _state), _state); } /* * Real */ ae_vector_set_length(&r1, n, _state); ae_vector_set_length(&r2, n, _state); for(i=0; i<=n-1; i++) { r1.ptr.p_double[i] = 2*ae_randomreal(_state)-1; r2.ptr.p_double[i] = r1.ptr.p_double[i]; } fftr1d(&r2, n, &a1, _state); ae_v_muld(&r2.ptr.p_double[0], 1, ae_v_len(0,n-1), 0); fftr1dinv(&a1, n, &r2, _state); for(i=0; i<=n-1; i++) { bidirerr = ae_maxreal(bidirerr, ae_c_abs(ae_complex_from_d(r1.ptr.p_double[i]-r2.ptr.p_double[i]), _state), _state); } } bidierrors = bidierrors||ae_fp_greater(bidierr,errtol); bidirerrors = bidirerrors||ae_fp_greater(bidirerr,errtol); /* * Test against reference O(N^2) implementation for small N's * (we do not test large N's because reference implementation will be too slow). 
*/ referr = (double)(0); refrerr = (double)(0); for(n=1; n<=maxsmalln; n++) { /* * Complex FFT */ ae_vector_set_length(&a1, n, _state); ae_vector_set_length(&a2, n, _state); for(i=0; i<=n-1; i++) { a1.ptr.p_complex[i].x = 2*ae_randomreal(_state)-1; a1.ptr.p_complex[i].y = 2*ae_randomreal(_state)-1; a2.ptr.p_complex[i] = a1.ptr.p_complex[i]; } fftc1d(&a1, n, _state); testfftunit_reffftc1d(&a2, n, _state); for(i=0; i<=n-1; i++) { referr = ae_maxreal(referr, ae_c_abs(ae_c_sub(a1.ptr.p_complex[i],a2.ptr.p_complex[i]), _state), _state); } /* * Complex inverse FFT */ ae_vector_set_length(&a1, n, _state); ae_vector_set_length(&a2, n, _state); for(i=0; i<=n-1; i++) { a1.ptr.p_complex[i].x = 2*ae_randomreal(_state)-1; a1.ptr.p_complex[i].y = 2*ae_randomreal(_state)-1; a2.ptr.p_complex[i] = a1.ptr.p_complex[i]; } fftc1dinv(&a1, n, _state); testfftunit_reffftc1dinv(&a2, n, _state); for(i=0; i<=n-1; i++) { referr = ae_maxreal(referr, ae_c_abs(ae_c_sub(a1.ptr.p_complex[i],a2.ptr.p_complex[i]), _state), _state); } /* * Real forward/inverse FFT: * * calculate and check forward FFT * * use precalculated FFT to check backward FFT * fill unused parts of frequencies array with random numbers * to ensure that they are not really used */ ae_vector_set_length(&r1, n, _state); ae_vector_set_length(&r2, n, _state); for(i=0; i<=n-1; i++) { r1.ptr.p_double[i] = 2*ae_randomreal(_state)-1; r2.ptr.p_double[i] = r1.ptr.p_double[i]; } fftr1d(&r1, n, &a1, _state); testfftunit_refinternalrfft(&r2, n, &a2, _state); for(i=0; i<=n-1; i++) { refrerr = ae_maxreal(refrerr, ae_c_abs(ae_c_sub(a1.ptr.p_complex[i],a2.ptr.p_complex[i]), _state), _state); } ae_vector_set_length(&a3, ae_ifloor((double)n/(double)2, _state)+1, _state); for(i=0; i<=ae_ifloor((double)n/(double)2, _state); i++) { a3.ptr.p_complex[i] = a2.ptr.p_complex[i]; } a3.ptr.p_complex[0].y = 2*ae_randomreal(_state)-1; if( n%2==0 ) { a3.ptr.p_complex[ae_ifloor((double)n/(double)2, _state)].y = 2*ae_randomreal(_state)-1; } for(i=0; i<=n-1; i++) { r1.ptr.p_double[i] = (double)(0); } fftr1dinv(&a3, n, &r1, _state); for(i=0; i<=n-1; i++) { refrerr = ae_maxreal(refrerr, ae_fabs(r2.ptr.p_double[i]-r1.ptr.p_double[i], _state), _state); } } referrors = referrors||ae_fp_greater(referr,errtol); refrerrors = refrerrors||ae_fp_greater(refrerr,errtol); /* * Test for large N's: * * we perform FFT * * we selectively calculate K (small number) of DFT components (using reference formula) * and compare them with ones calculated by fast implementation * * K components to test are chosen at random (random sampling with possible repetitions) * * overall complexity of the test is O(N*logN+K*N) * Several N's are tested, with different kinds of factorizations */ referr = (double)(0); refrerr = (double)(0); testfftunit_quicktest(1000, &referr, &refrerr, _state); testfftunit_quicktest(1024, &referr, &refrerr, _state); testfftunit_quicktest(1025, &referr, &refrerr, _state); testfftunit_quicktest(2000, &referr, &refrerr, _state); testfftunit_quicktest(2048, &referr, &refrerr, _state); testfftunit_quicktest(6535, &referr, &refrerr, _state); testfftunit_quicktest(65536, &referr, &refrerr, _state); testfftunit_quicktest(104729, &referr, &refrerr, _state); testfftunit_quicktest(139129, &referr, &refrerr, _state); testfftunit_quicktest(141740, &referr, &refrerr, _state); referrors = referrors||ae_fp_greater(referr,errtol); refrerrors = refrerrors||ae_fp_greater(refrerr,errtol); /* * test internal real even FFT */ reinterr = (double)(0); for(k=1; k<=maxsmalln/2; k++) { n = 2*k; /* * Real forward FFT 
*/ ae_vector_set_length(&r1, n, _state); ae_vector_set_length(&r2, n, _state); for(i=0; i<=n-1; i++) { r1.ptr.p_double[i] = 2*ae_randomreal(_state)-1; r2.ptr.p_double[i] = r1.ptr.p_double[i]; } ftcomplexfftplan(n/2, 1, &plan, _state); ae_vector_set_length(&buf, n, _state); fftr1dinternaleven(&r1, n, &buf, &plan, _state); testfftunit_refinternalrfft(&r2, n, &a2, _state); reinterr = ae_maxreal(reinterr, ae_fabs(r1.ptr.p_double[0]-a2.ptr.p_complex[0].x, _state), _state); reinterr = ae_maxreal(reinterr, ae_fabs(r1.ptr.p_double[1]-a2.ptr.p_complex[n/2].x, _state), _state); for(i=1; i<=n/2-1; i++) { reinterr = ae_maxreal(reinterr, ae_fabs(r1.ptr.p_double[2*i+0]-a2.ptr.p_complex[i].x, _state), _state); reinterr = ae_maxreal(reinterr, ae_fabs(r1.ptr.p_double[2*i+1]-a2.ptr.p_complex[i].y, _state), _state); } /* * Real backward FFT */ ae_vector_set_length(&r1, n, _state); for(i=0; i<=n-1; i++) { r1.ptr.p_double[i] = 2*ae_randomreal(_state)-1; } ae_vector_set_length(&a2, ae_ifloor((double)n/(double)2, _state)+1, _state); a2.ptr.p_complex[0] = ae_complex_from_d(r1.ptr.p_double[0]); for(i=1; i<=ae_ifloor((double)n/(double)2, _state)-1; i++) { a2.ptr.p_complex[i].x = r1.ptr.p_double[2*i+0]; a2.ptr.p_complex[i].y = r1.ptr.p_double[2*i+1]; } a2.ptr.p_complex[ae_ifloor((double)n/(double)2, _state)] = ae_complex_from_d(r1.ptr.p_double[1]); ftcomplexfftplan(n/2, 1, &plan, _state); ae_vector_set_length(&buf, n, _state); fftr1dinvinternaleven(&r1, n, &buf, &plan, _state); fftr1dinv(&a2, n, &r2, _state); for(i=0; i<=n-1; i++) { reinterr = ae_maxreal(reinterr, ae_fabs(r1.ptr.p_double[i]-r2.ptr.p_double[i], _state), _state); } } reinterrors = reinterrors||ae_fp_greater(reinterr,errtol); /* * end */ waserrors = (((bidierrors||bidirerrors)||referrors)||refrerrors)||reinterrors; if( !silent ) { printf("TESTING FFT\n"); printf("FINAL RESULT: "); if( waserrors ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("* BI-DIRECTIONAL COMPLEX TEST: "); if( bidierrors ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("* AGAINST REFERENCE COMPLEX FFT: "); if( referrors ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("* BI-DIRECTIONAL REAL TEST: "); if( bidirerrors ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("* AGAINST REFERENCE REAL FFT: "); if( refrerrors ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("* INTERNAL EVEN FFT: "); if( reinterrors ) { printf("FAILED\n"); } else { printf("OK\n"); } if( waserrors ) { printf("TEST FAILED\n"); } else { printf("TEST PASSED\n"); } } result = !waserrors; ae_frame_leave(_state); return result; } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. 
*************************************************************************/ ae_bool _pexec_testfft(ae_bool silent, ae_state *_state) { return testfft(silent, _state); } /************************************************************************* Reference FFT *************************************************************************/ static void testfftunit_reffftc1d(/* Complex */ ae_vector* a, ae_int_t n, ae_state *_state) { ae_frame _frame_block; ae_vector buf; ae_int_t i; ae_frame_make(_state, &_frame_block); ae_vector_init(&buf, 0, DT_REAL, _state); ae_assert(n>0, "FFTC1D: incorrect N!", _state); ae_vector_set_length(&buf, 2*n, _state); for(i=0; i<=n-1; i++) { buf.ptr.p_double[2*i+0] = a->ptr.p_complex[i].x; buf.ptr.p_double[2*i+1] = a->ptr.p_complex[i].y; } testfftunit_refinternalcfft(&buf, n, ae_false, _state); for(i=0; i<=n-1; i++) { a->ptr.p_complex[i].x = buf.ptr.p_double[2*i+0]; a->ptr.p_complex[i].y = buf.ptr.p_double[2*i+1]; } ae_frame_leave(_state); } /************************************************************************* Reference inverse FFT *************************************************************************/ static void testfftunit_reffftc1dinv(/* Complex */ ae_vector* a, ae_int_t n, ae_state *_state) { ae_frame _frame_block; ae_vector buf; ae_int_t i; ae_frame_make(_state, &_frame_block); ae_vector_init(&buf, 0, DT_REAL, _state); ae_assert(n>0, "FFTC1DInv: incorrect N!", _state); ae_vector_set_length(&buf, 2*n, _state); for(i=0; i<=n-1; i++) { buf.ptr.p_double[2*i+0] = a->ptr.p_complex[i].x; buf.ptr.p_double[2*i+1] = a->ptr.p_complex[i].y; } testfftunit_refinternalcfft(&buf, n, ae_true, _state); for(i=0; i<=n-1; i++) { a->ptr.p_complex[i].x = buf.ptr.p_double[2*i+0]; a->ptr.p_complex[i].y = buf.ptr.p_double[2*i+1]; } ae_frame_leave(_state); } /************************************************************************* Internal complex FFT stub. Uses straightforward formula with O(N^2) complexity. *************************************************************************/ static void testfftunit_refinternalcfft(/* Real */ ae_vector* a, ae_int_t nn, ae_bool inversefft, ae_state *_state) { ae_frame _frame_block; ae_vector tmp; ae_int_t i; ae_int_t k; double hre; double him; double c; double s; double re; double im; ae_frame_make(_state, &_frame_block); ae_vector_init(&tmp, 0, DT_REAL, _state); ae_vector_set_length(&tmp, 2*nn-1+1, _state); if( !inversefft ) { for(i=0; i<=nn-1; i++) { hre = (double)(0); him = (double)(0); for(k=0; k<=nn-1; k++) { re = a->ptr.p_double[2*k]; im = a->ptr.p_double[2*k+1]; c = ae_cos(-2*ae_pi*k*i/nn, _state); s = ae_sin(-2*ae_pi*k*i/nn, _state); hre = hre+c*re-s*im; him = him+c*im+s*re; } tmp.ptr.p_double[2*i] = hre; tmp.ptr.p_double[2*i+1] = him; } for(i=0; i<=2*nn-1; i++) { a->ptr.p_double[i] = tmp.ptr.p_double[i]; } } else { for(k=0; k<=nn-1; k++) { hre = (double)(0); him = (double)(0); for(i=0; i<=nn-1; i++) { re = a->ptr.p_double[2*i]; im = a->ptr.p_double[2*i+1]; c = ae_cos(2*ae_pi*k*i/nn, _state); s = ae_sin(2*ae_pi*k*i/nn, _state); hre = hre+c*re-s*im; him = him+c*im+s*re; } tmp.ptr.p_double[2*k] = hre/nn; tmp.ptr.p_double[2*k+1] = him/nn; } for(i=0; i<=2*nn-1; i++) { a->ptr.p_double[i] = tmp.ptr.p_double[i]; } } ae_frame_leave(_state); } /************************************************************************* Internal real FFT stub. Uses straightforward formula with O(N^2) complexity. 
*************************************************************************/ static void testfftunit_refinternalrfft(/* Real */ ae_vector* a, ae_int_t nn, /* Complex */ ae_vector* f, ae_state *_state) { ae_frame _frame_block; ae_vector tmp; ae_int_t i; ae_frame_make(_state, &_frame_block); ae_vector_clear(f); ae_vector_init(&tmp, 0, DT_REAL, _state); ae_vector_set_length(&tmp, 2*nn-1+1, _state); for(i=0; i<=nn-1; i++) { tmp.ptr.p_double[2*i] = a->ptr.p_double[i]; tmp.ptr.p_double[2*i+1] = (double)(0); } testfftunit_refinternalcfft(&tmp, nn, ae_false, _state); ae_vector_set_length(f, nn, _state); for(i=0; i<=nn-1; i++) { f->ptr.p_complex[i].x = tmp.ptr.p_double[2*i+0]; f->ptr.p_complex[i].y = tmp.ptr.p_double[2*i+1]; } ae_frame_leave(_state); } /************************************************************************* This function performs real/complex FFT of given length on random data, selects K random components and compares them with values calculated by DFT definition. It updates RefErr and RefRErr as follows: RefErr:= max(RefErr, error_of_complex_FFT) RefRErr:= max(RefRErr,error_of_real_FFT) *************************************************************************/ static void testfftunit_quicktest(ae_int_t n, double* referr, double* refrerr, ae_state *_state) { ae_frame _frame_block; ae_vector a0; ae_vector a1; ae_int_t i; ae_int_t j; ae_int_t k; ae_int_t idx; ae_complex v; double c; double s; double re; double im; ae_frame_make(_state, &_frame_block); ae_vector_init(&a0, 0, DT_COMPLEX, _state); ae_vector_init(&a1, 0, DT_COMPLEX, _state); k = 10; /* * Complex FFT - forward and inverse */ ae_vector_set_length(&a0, n, _state); ae_vector_set_length(&a1, n, _state); for(i=0; i<=n-1; i++) { a0.ptr.p_complex[i].x = 2*ae_randomreal(_state)-1; a0.ptr.p_complex[i].y = 2*ae_randomreal(_state)-1; a1.ptr.p_complex[i] = a0.ptr.p_complex[i]; } fftc1d(&a0, n, _state); for(i=0; i<=k-1; i++) { idx = ae_randominteger(n, _state); v = ae_complex_from_i(0); for(j=0; j<=n-1; j++) { re = a1.ptr.p_complex[j].x; im = a1.ptr.p_complex[j].y; c = ae_cos(-2*ae_pi*j*idx/n, _state); s = ae_sin(-2*ae_pi*j*idx/n, _state); v.x = v.x+c*re-s*im; v.y = v.y+c*im+s*re; } *referr = ae_maxreal(*referr, ae_c_abs(ae_c_sub(v,a0.ptr.p_complex[idx]), _state), _state); } fftc1dinv(&a0, n, _state); for(i=0; i<=n-1; i++) { *referr = ae_maxreal(*referr, ae_c_abs(ae_c_sub(a0.ptr.p_complex[i],a1.ptr.p_complex[i]), _state), _state); } ae_frame_leave(_state); } static void testfhtunit_reffhtr1d(/* Real */ ae_vector* a, ae_int_t n, ae_state *_state); static void testfhtunit_reffhtr1dinv(/* Real */ ae_vector* a, ae_int_t n, ae_state *_state); /************************************************************************* Test *************************************************************************/ ae_bool testfht(ae_bool silent, ae_state *_state) { ae_frame _frame_block; ae_int_t n; ae_int_t i; ae_vector r1; ae_vector r2; ae_vector r3; ae_int_t maxn; double bidierr; double referr; double errtol; ae_bool referrors; ae_bool bidierrors; ae_bool waserrors; ae_bool result; ae_frame_make(_state, &_frame_block); ae_vector_init(&r1, 0, DT_REAL, _state); ae_vector_init(&r2, 0, DT_REAL, _state); ae_vector_init(&r3, 0, DT_REAL, _state); maxn = 128; errtol = 100000*ae_pow((double)(maxn), (double)3/(double)2, _state)*ae_machineepsilon; bidierrors = ae_false; referrors = ae_false; waserrors = ae_false; /* * Test bi-directional error: norm(x-invFHT(FHT(x))) */ bidierr = (double)(0); for(n=1; n<=maxn; n++) { /* * FHT/invFHT */ ae_vector_set_length(&r1, n, 
_state); ae_vector_set_length(&r2, n, _state); ae_vector_set_length(&r3, n, _state); for(i=0; i<=n-1; i++) { r1.ptr.p_double[i] = 2*ae_randomreal(_state)-1; r2.ptr.p_double[i] = r1.ptr.p_double[i]; r3.ptr.p_double[i] = r1.ptr.p_double[i]; } fhtr1d(&r2, n, _state); fhtr1dinv(&r2, n, _state); fhtr1dinv(&r3, n, _state); fhtr1d(&r3, n, _state); for(i=0; i<=n-1; i++) { bidierr = ae_maxreal(bidierr, ae_fabs(r1.ptr.p_double[i]-r2.ptr.p_double[i], _state), _state); bidierr = ae_maxreal(bidierr, ae_fabs(r1.ptr.p_double[i]-r3.ptr.p_double[i], _state), _state); } } bidierrors = bidierrors||ae_fp_greater(bidierr,errtol); /* * Test against reference O(N^2) implementation */ referr = (double)(0); for(n=1; n<=maxn; n++) { /* * FHT */ ae_vector_set_length(&r1, n, _state); ae_vector_set_length(&r2, n, _state); for(i=0; i<=n-1; i++) { r1.ptr.p_double[i] = 2*ae_randomreal(_state)-1; r2.ptr.p_double[i] = r1.ptr.p_double[i]; } fhtr1d(&r1, n, _state); testfhtunit_reffhtr1d(&r2, n, _state); for(i=0; i<=n-1; i++) { referr = ae_maxreal(referr, ae_fabs(r1.ptr.p_double[i]-r2.ptr.p_double[i], _state), _state); } /* * inverse FHT */ ae_vector_set_length(&r1, n, _state); ae_vector_set_length(&r2, n, _state); for(i=0; i<=n-1; i++) { r1.ptr.p_double[i] = 2*ae_randomreal(_state)-1; r2.ptr.p_double[i] = r1.ptr.p_double[i]; } fhtr1dinv(&r1, n, _state); testfhtunit_reffhtr1dinv(&r2, n, _state); for(i=0; i<=n-1; i++) { referr = ae_maxreal(referr, ae_fabs(r1.ptr.p_double[i]-r2.ptr.p_double[i], _state), _state); } } referrors = referrors||ae_fp_greater(referr,errtol); /* * end */ waserrors = bidierrors||referrors; if( !silent ) { printf("TESTING FHT\n"); printf("FINAL RESULT: "); if( waserrors ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("* BI-DIRECTIONAL TEST: "); if( bidierrors ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("* AGAINST REFERENCE FHT: "); if( referrors ) { printf("FAILED\n"); } else { printf("OK\n"); } if( waserrors ) { printf("TEST FAILED\n"); } else { printf("TEST PASSED\n"); } } result = !waserrors; ae_frame_leave(_state); return result; } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. 
*************************************************************************/ ae_bool _pexec_testfht(ae_bool silent, ae_state *_state) { return testfht(silent, _state); } /************************************************************************* Reference FHT *************************************************************************/ static void testfhtunit_reffhtr1d(/* Real */ ae_vector* a, ae_int_t n, ae_state *_state) { ae_frame _frame_block; ae_vector buf; ae_int_t i; ae_int_t j; double v; ae_frame_make(_state, &_frame_block); ae_vector_init(&buf, 0, DT_REAL, _state); ae_assert(n>0, "RefFHTR1D: incorrect N!", _state); ae_vector_set_length(&buf, n, _state); for(i=0; i<=n-1; i++) { v = (double)(0); for(j=0; j<=n-1; j++) { v = v+a->ptr.p_double[j]*(ae_cos(2*ae_pi*i*j/n, _state)+ae_sin(2*ae_pi*i*j/n, _state)); } buf.ptr.p_double[i] = v; } for(i=0; i<=n-1; i++) { a->ptr.p_double[i] = buf.ptr.p_double[i]; } ae_frame_leave(_state); } /************************************************************************* Reference inverse FHT *************************************************************************/ static void testfhtunit_reffhtr1dinv(/* Real */ ae_vector* a, ae_int_t n, ae_state *_state) { ae_int_t i; ae_assert(n>0, "RefFHTR1DInv: incorrect N!", _state); testfhtunit_reffhtr1d(a, n, _state); for(i=0; i<=n-1; i++) { a->ptr.p_double[i] = a->ptr.p_double[i]/n; } } static void testconvunit_refconvc1d(/* Complex */ ae_vector* a, ae_int_t m, /* Complex */ ae_vector* b, ae_int_t n, /* Complex */ ae_vector* r, ae_state *_state); static void testconvunit_refconvc1dcircular(/* Complex */ ae_vector* a, ae_int_t m, /* Complex */ ae_vector* b, ae_int_t n, /* Complex */ ae_vector* r, ae_state *_state); static void testconvunit_refconvr1d(/* Real */ ae_vector* a, ae_int_t m, /* Real */ ae_vector* b, ae_int_t n, /* Real */ ae_vector* r, ae_state *_state); static void testconvunit_refconvr1dcircular(/* Real */ ae_vector* a, ae_int_t m, /* Real */ ae_vector* b, ae_int_t n, /* Real */ ae_vector* r, ae_state *_state); /************************************************************************* Test *************************************************************************/ ae_bool testconv(ae_bool silent, ae_state *_state) { ae_frame _frame_block; ae_int_t m; ae_int_t n; ae_int_t i; ae_int_t rkind; ae_int_t circkind; ae_vector ra; ae_vector rb; ae_vector rr1; ae_vector rr2; ae_vector ca; ae_vector cb; ae_vector cr1; ae_vector cr2; ae_int_t maxn; double referr; double refrerr; double inverr; double invrerr; double errtol; ae_bool referrors; ae_bool refrerrors; ae_bool inverrors; ae_bool invrerrors; ae_bool waserrors; ae_bool result; ae_frame_make(_state, &_frame_block); ae_vector_init(&ra, 0, DT_REAL, _state); ae_vector_init(&rb, 0, DT_REAL, _state); ae_vector_init(&rr1, 0, DT_REAL, _state); ae_vector_init(&rr2, 0, DT_REAL, _state); ae_vector_init(&ca, 0, DT_COMPLEX, _state); ae_vector_init(&cb, 0, DT_COMPLEX, _state); ae_vector_init(&cr1, 0, DT_COMPLEX, _state); ae_vector_init(&cr2, 0, DT_COMPLEX, _state); maxn = 32; errtol = 100000*ae_pow((double)(maxn), (double)3/(double)2, _state)*ae_machineepsilon; referrors = ae_false; refrerrors = ae_false; inverrors = ae_false; invrerrors = ae_false; waserrors = ae_false; /* * Test against reference O(N^2) implementation. * * Automatic ConvC1D() and different algorithms of ConvC1DX() are tested. 
*/ referr = (double)(0); refrerr = (double)(0); for(m=1; m<=maxn; m++) { for(n=1; n<=maxn; n++) { for(circkind=0; circkind<=1; circkind++) { for(rkind=-3; rkind<=1; rkind++) { /* * skip impossible combinations of parameters: * * circular convolution, M<N * * RKind<>-3 - internal subroutine does not support M<N */ if( circkind!=0&&m<n ) { continue; } if( rkind==-3&&m<n ) { continue; } /* * Complex convolution */ ae_vector_set_length(&ca, m, _state); for(i=0; i<=m-1; i++) { ca.ptr.p_complex[i].x = 2*ae_randomreal(_state)-1; ca.ptr.p_complex[i].y = 2*ae_randomreal(_state)-1; } ae_vector_set_length(&cb, n, _state); for(i=0; i<=n-1; i++) { cb.ptr.p_complex[i].x = 2*ae_randomreal(_state)-1; cb.ptr.p_complex[i].y = 2*ae_randomreal(_state)-1; } ae_vector_set_length(&cr1, 1, _state); if( rkind==-3 ) { /* * test wrapper subroutine: * * circular/non-circular */ if( circkind==0 ) { convc1d(&ca, m, &cb, n, &cr1, _state); } else { convc1dcircular(&ca, m, &cb, n, &cr1, _state); } } else { if( m>=n ) { /* * test internal subroutine: * * circular/non-circular mode */ convc1dx(&ca, m, &cb, n, circkind!=0, rkind, 0, &cr1, _state); } else { /* * test internal subroutine - circular mode only */ ae_assert(circkind==0, "Convolution test: internal error!", _state); convc1dx(&cb, n, &ca, m, ae_false, rkind, 0, &cr1, _state); } } if( circkind==0 ) { testconvunit_refconvc1d(&ca, m, &cb, n, &cr2, _state); } else { testconvunit_refconvc1dcircular(&ca, m, &cb, n, &cr2, _state); } if( circkind==0 ) { for(i=0; i<=m+n-2; i++) { referr = ae_maxreal(referr, ae_c_abs(ae_c_sub(cr1.ptr.p_complex[i],cr2.ptr.p_complex[i]), _state), _state); } } else { for(i=0; i<=m-1; i++) { referr = ae_maxreal(referr, ae_c_abs(ae_c_sub(cr1.ptr.p_complex[i],cr2.ptr.p_complex[i]), _state), _state); } } /* * Real convolution */ ae_vector_set_length(&ra, m, _state); for(i=0; i<=m-1; i++) { ra.ptr.p_double[i] = 2*ae_randomreal(_state)-1; } ae_vector_set_length(&rb, n, _state); for(i=0; i<=n-1; i++) { rb.ptr.p_double[i] = 2*ae_randomreal(_state)-1; } ae_vector_set_length(&rr1, 1, _state); if( rkind==-3 ) { /* * test wrapper subroutine: * * circular/non-circular */ if( circkind==0 ) { convr1d(&ra, m, &rb, n, &rr1, _state); } else { convr1dcircular(&ra, m, &rb, n, &rr1, _state); } } else { if( m>=n ) { /* * test internal subroutine: * * circular/non-circular mode */ convr1dx(&ra, m, &rb, n, circkind!=0, rkind, 0, &rr1, _state); } else { /* * test internal subroutine - non-circular mode only */ convr1dx(&rb, n, &ra, m, circkind!=0, rkind, 0, &rr1, _state); } } if( circkind==0 ) { testconvunit_refconvr1d(&ra, m, &rb, n, &rr2, _state); } else { testconvunit_refconvr1dcircular(&ra, m, &rb, n, &rr2, _state); } if( circkind==0 ) { for(i=0; i<=m+n-2; i++) { refrerr = ae_maxreal(refrerr, ae_fabs(rr1.ptr.p_double[i]-rr2.ptr.p_double[i], _state), _state); } } else { for(i=0; i<=m-1; i++) { refrerr = ae_maxreal(refrerr, ae_fabs(rr1.ptr.p_double[i]-rr2.ptr.p_double[i], _state), _state); } } } } } } referrors = referrors||ae_fp_greater(referr,errtol); refrerrors = refrerrors||ae_fp_greater(refrerr,errtol); /* * Test inverse convolution */ inverr = (double)(0); invrerr = (double)(0); for(m=1; m<=maxn; m++) { for(n=1; n<=maxn; n++) { /* * Complex circular and non-circular */ ae_vector_set_length(&ca, m, _state); for(i=0; i<=m-1; i++) { ca.ptr.p_complex[i].x = 2*ae_randomreal(_state)-1; ca.ptr.p_complex[i].y = 2*ae_randomreal(_state)-1; } ae_vector_set_length(&cb, n, _state); for(i=0; i<=n-1; i++) { cb.ptr.p_complex[i].x = 2*ae_randomreal(_state)-1; cb.ptr.p_complex[i].y = 2*ae_randomreal(_state)-1; } ae_vector_set_length(&cr1, 1, _state); ae_vector_set_length(&cr2, 1, _state); convc1d(&ca, m, &cb, n, &cr2, _state); convc1dinv(&cr2, m+n-1, &cb, n, &cr1, _state); for(i=0; i<=m-1; i++) { inverr = ae_maxreal(inverr, ae_c_abs(ae_c_sub(cr1.ptr.p_complex[i],ca.ptr.p_complex[i]), _state), _state); } ae_vector_set_length(&cr1, 1, _state); ae_vector_set_length(&cr2, 1, _state); convc1dcircular(&ca, m, &cb, n, &cr2, _state); convc1dcircularinv(&cr2, m, &cb, n, &cr1, _state); for(i=0; i<=m-1; i++) { inverr = ae_maxreal(inverr, ae_c_abs(ae_c_sub(cr1.ptr.p_complex[i],ca.ptr.p_complex[i]), _state), _state); } /* * Real circular and non-circular */ 
ae_vector_set_length(&ra, m, _state); for(i=0; i<=m-1; i++) { ra.ptr.p_double[i] = 2*ae_randomreal(_state)-1; } ae_vector_set_length(&rb, n, _state); for(i=0; i<=n-1; i++) { rb.ptr.p_double[i] = 2*ae_randomreal(_state)-1; } ae_vector_set_length(&rr1, 1, _state); ae_vector_set_length(&rr2, 1, _state); convr1d(&ra, m, &rb, n, &rr2, _state); convr1dinv(&rr2, m+n-1, &rb, n, &rr1, _state); for(i=0; i<=m-1; i++) { invrerr = ae_maxreal(invrerr, ae_fabs(rr1.ptr.p_double[i]-ra.ptr.p_double[i], _state), _state); } ae_vector_set_length(&rr1, 1, _state); ae_vector_set_length(&rr2, 1, _state); convr1dcircular(&ra, m, &rb, n, &rr2, _state); convr1dcircularinv(&rr2, m, &rb, n, &rr1, _state); for(i=0; i<=m-1; i++) { invrerr = ae_maxreal(invrerr, ae_fabs(rr1.ptr.p_double[i]-ra.ptr.p_double[i], _state), _state); } } } inverrors = inverrors||ae_fp_greater(inverr,errtol); invrerrors = invrerrors||ae_fp_greater(invrerr,errtol); /* * end */ waserrors = ((referrors||refrerrors)||inverrors)||invrerrors; if( !silent ) { printf("TESTING CONVOLUTION\n"); printf("FINAL RESULT: "); if( waserrors ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("* AGAINST REFERENCE COMPLEX CONV: "); if( referrors ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("* AGAINST REFERENCE REAL CONV: "); if( refrerrors ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("* COMPLEX INVERSE: "); if( inverrors ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("* REAL INVERSE: "); if( invrerrors ) { printf("FAILED\n"); } else { printf("OK\n"); } if( waserrors ) { printf("TEST FAILED\n"); } else { printf("TEST PASSED\n"); } } result = !waserrors; ae_frame_leave(_state); return result; } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. 
*************************************************************************/ ae_bool _pexec_testconv(ae_bool silent, ae_state *_state) { return testconv(silent, _state); } /************************************************************************* Reference implementation *************************************************************************/ static void testconvunit_refconvc1d(/* Complex */ ae_vector* a, ae_int_t m, /* Complex */ ae_vector* b, ae_int_t n, /* Complex */ ae_vector* r, ae_state *_state) { ae_int_t i; ae_complex v; ae_vector_clear(r); ae_vector_set_length(r, m+n-1, _state); for(i=0; i<=m+n-2; i++) { r->ptr.p_complex[i] = ae_complex_from_i(0); } for(i=0; i<=m-1; i++) { v = a->ptr.p_complex[i]; ae_v_caddc(&r->ptr.p_complex[i], 1, &b->ptr.p_complex[0], 1, "N", ae_v_len(i,i+n-1), v); } } /************************************************************************* Reference implementation *************************************************************************/ static void testconvunit_refconvc1dcircular(/* Complex */ ae_vector* a, ae_int_t m, /* Complex */ ae_vector* b, ae_int_t n, /* Complex */ ae_vector* r, ae_state *_state) { ae_frame _frame_block; ae_int_t i1; ae_int_t i2; ae_int_t j2; ae_vector buf; ae_frame_make(_state, &_frame_block); ae_vector_clear(r); ae_vector_init(&buf, 0, DT_COMPLEX, _state); testconvunit_refconvc1d(a, m, b, n, &buf, _state); ae_vector_set_length(r, m, _state); ae_v_cmove(&r->ptr.p_complex[0], 1, &buf.ptr.p_complex[0], 1, "N", ae_v_len(0,m-1)); i1 = m; while(i1<=m+n-2) { i2 = ae_minint(i1+m-1, m+n-2, _state); j2 = i2-i1; ae_v_cadd(&r->ptr.p_complex[0], 1, &buf.ptr.p_complex[i1], 1, "N", ae_v_len(0,j2)); i1 = i1+m; } ae_frame_leave(_state); } /************************************************************************* Reference FFT *************************************************************************/ static void testconvunit_refconvr1d(/* Real */ ae_vector* a, ae_int_t m, /* Real */ ae_vector* b, ae_int_t n, /* Real */ ae_vector* r, ae_state *_state) { ae_int_t i; double v; ae_vector_clear(r); ae_vector_set_length(r, m+n-1, _state); for(i=0; i<=m+n-2; i++) { r->ptr.p_double[i] = (double)(0); } for(i=0; i<=m-1; i++) { v = a->ptr.p_double[i]; ae_v_addd(&r->ptr.p_double[i], 1, &b->ptr.p_double[0], 1, ae_v_len(i,i+n-1), v); } } /************************************************************************* Reference implementation *************************************************************************/ static void testconvunit_refconvr1dcircular(/* Real */ ae_vector* a, ae_int_t m, /* Real */ ae_vector* b, ae_int_t n, /* Real */ ae_vector* r, ae_state *_state) { ae_frame _frame_block; ae_int_t i1; ae_int_t i2; ae_int_t j2; ae_vector buf; ae_frame_make(_state, &_frame_block); ae_vector_clear(r); ae_vector_init(&buf, 0, DT_REAL, _state); testconvunit_refconvr1d(a, m, b, n, &buf, _state); ae_vector_set_length(r, m, _state); ae_v_move(&r->ptr.p_double[0], 1, &buf.ptr.p_double[0], 1, ae_v_len(0,m-1)); i1 = m; while(i1<=m+n-2) { i2 = ae_minint(i1+m-1, m+n-2, _state); j2 = i2-i1; ae_v_add(&r->ptr.p_double[0], 1, &buf.ptr.p_double[i1], 1, ae_v_len(0,j2)); i1 = i1+m; } ae_frame_leave(_state); } static void testcorrunit_refcorrc1d(/* Complex */ ae_vector* signal, ae_int_t n, /* Complex */ ae_vector* pattern, ae_int_t m, /* Complex */ ae_vector* r, ae_state *_state); static void testcorrunit_refcorrc1dcircular(/* Complex */ ae_vector* signal, ae_int_t n, /* Complex */ ae_vector* pattern, ae_int_t m, /* Complex */ ae_vector* r, ae_state *_state); static void 
testcorrunit_refcorrr1d(/* Real */ ae_vector* signal, ae_int_t n, /* Real */ ae_vector* pattern, ae_int_t m, /* Real */ ae_vector* r, ae_state *_state); static void testcorrunit_refcorrr1dcircular(/* Real */ ae_vector* signal, ae_int_t n, /* Real */ ae_vector* pattern, ae_int_t m, /* Real */ ae_vector* r, ae_state *_state); /************************************************************************* Test *************************************************************************/ ae_bool testcorr(ae_bool silent, ae_state *_state) { ae_frame _frame_block; ae_int_t m; ae_int_t n; ae_int_t i; ae_vector ra; ae_vector rb; ae_vector rr1; ae_vector rr2; ae_vector ca; ae_vector cb; ae_vector cr1; ae_vector cr2; ae_int_t maxn; double referr; double refrerr; double errtol; ae_bool referrors; ae_bool refrerrors; ae_bool waserrors; ae_bool result; ae_frame_make(_state, &_frame_block); ae_vector_init(&ra, 0, DT_REAL, _state); ae_vector_init(&rb, 0, DT_REAL, _state); ae_vector_init(&rr1, 0, DT_REAL, _state); ae_vector_init(&rr2, 0, DT_REAL, _state); ae_vector_init(&ca, 0, DT_COMPLEX, _state); ae_vector_init(&cb, 0, DT_COMPLEX, _state); ae_vector_init(&cr1, 0, DT_COMPLEX, _state); ae_vector_init(&cr2, 0, DT_COMPLEX, _state); maxn = 32; errtol = 100000*ae_pow((double)(maxn), (double)3/(double)2, _state)*ae_machineepsilon; referrors = ae_false; refrerrors = ae_false; waserrors = ae_false; /* * Test against reference O(N^2) implementation. */ referr = (double)(0); refrerr = (double)(0); for(m=1; m<=maxn; m++) { for(n=1; n<=maxn; n++) { /* * Complex correlation */ ae_vector_set_length(&ca, m, _state); for(i=0; i<=m-1; i++) { ca.ptr.p_complex[i].x = 2*ae_randomreal(_state)-1; ca.ptr.p_complex[i].y = 2*ae_randomreal(_state)-1; } ae_vector_set_length(&cb, n, _state); for(i=0; i<=n-1; i++) { cb.ptr.p_complex[i].x = 2*ae_randomreal(_state)-1; cb.ptr.p_complex[i].y = 2*ae_randomreal(_state)-1; } ae_vector_set_length(&cr1, 1, _state); corrc1d(&ca, m, &cb, n, &cr1, _state); testcorrunit_refcorrc1d(&ca, m, &cb, n, &cr2, _state); for(i=0; i<=m+n-2; i++) { referr = ae_maxreal(referr, ae_c_abs(ae_c_sub(cr1.ptr.p_complex[i],cr2.ptr.p_complex[i]), _state), _state); } ae_vector_set_length(&cr1, 1, _state); corrc1dcircular(&ca, m, &cb, n, &cr1, _state); testcorrunit_refcorrc1dcircular(&ca, m, &cb, n, &cr2, _state); for(i=0; i<=m-1; i++) { referr = ae_maxreal(referr, ae_c_abs(ae_c_sub(cr1.ptr.p_complex[i],cr2.ptr.p_complex[i]), _state), _state); } /* * Real correlation */ ae_vector_set_length(&ra, m, _state); for(i=0; i<=m-1; i++) { ra.ptr.p_double[i] = 2*ae_randomreal(_state)-1; } ae_vector_set_length(&rb, n, _state); for(i=0; i<=n-1; i++) { rb.ptr.p_double[i] = 2*ae_randomreal(_state)-1; } ae_vector_set_length(&rr1, 1, _state); corrr1d(&ra, m, &rb, n, &rr1, _state); testcorrunit_refcorrr1d(&ra, m, &rb, n, &rr2, _state); for(i=0; i<=m+n-2; i++) { refrerr = ae_maxreal(refrerr, ae_fabs(rr1.ptr.p_double[i]-rr2.ptr.p_double[i], _state), _state); } ae_vector_set_length(&rr1, 1, _state); corrr1dcircular(&ra, m, &rb, n, &rr1, _state); testcorrunit_refcorrr1dcircular(&ra, m, &rb, n, &rr2, _state); for(i=0; i<=m-1; i++) { refrerr = ae_maxreal(refrerr, ae_fabs(rr1.ptr.p_double[i]-rr2.ptr.p_double[i], _state), _state); } } } referrors = referrors||ae_fp_greater(referr,errtol); refrerrors = refrerrors||ae_fp_greater(refrerr,errtol); /* * end */ waserrors = referrors||refrerrors; if( !silent ) { printf("TESTING CORRELATION\n"); printf("FINAL RESULT: "); if( waserrors ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("* AGAINST 
REFERENCE COMPLEX CORR: "); if( referrors ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("* AGAINST REFERENCE REAL CORR: "); if( refrerrors ) { printf("FAILED\n"); } else { printf("OK\n"); } if( waserrors ) { printf("TEST FAILED\n"); } else { printf("TEST PASSED\n"); } } result = !waserrors; ae_frame_leave(_state); return result; } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. *************************************************************************/ ae_bool _pexec_testcorr(ae_bool silent, ae_state *_state) { return testcorr(silent, _state); } /************************************************************************* Reference implementation *************************************************************************/ static void testcorrunit_refcorrc1d(/* Complex */ ae_vector* signal, ae_int_t n, /* Complex */ ae_vector* pattern, ae_int_t m, /* Complex */ ae_vector* r, ae_state *_state) { ae_frame _frame_block; ae_int_t i; ae_int_t j; ae_complex v; ae_vector s; ae_frame_make(_state, &_frame_block); ae_vector_clear(r); ae_vector_init(&s, 0, DT_COMPLEX, _state); ae_vector_set_length(&s, m+n-1, _state); ae_v_cmove(&s.ptr.p_complex[0], 1, &signal->ptr.p_complex[0], 1, "N", ae_v_len(0,n-1)); for(i=n; i<=m+n-2; i++) { s.ptr.p_complex[i] = ae_complex_from_i(0); } ae_vector_set_length(r, m+n-1, _state); for(i=0; i<=n-1; i++) { v = ae_complex_from_i(0); for(j=0; j<=m-1; j++) { if( i+j>=n ) { break; } v = ae_c_add(v,ae_c_mul(ae_c_conj(pattern->ptr.p_complex[j], _state),s.ptr.p_complex[i+j])); } r->ptr.p_complex[i] = v; } for(i=1; i<=m-1; i++) { v = ae_complex_from_i(0); for(j=i; j<=m-1; j++) { v = ae_c_add(v,ae_c_mul(ae_c_conj(pattern->ptr.p_complex[j], _state),s.ptr.p_complex[j-i])); } r->ptr.p_complex[m+n-1-i] = v; } ae_frame_leave(_state); } /************************************************************************* Reference implementation *************************************************************************/ static void testcorrunit_refcorrc1dcircular(/* Complex */ ae_vector* signal, ae_int_t n, /* Complex */ ae_vector* pattern, ae_int_t m, /* Complex */ ae_vector* r, ae_state *_state) { ae_int_t i; ae_int_t j; ae_complex v; ae_vector_clear(r); ae_vector_set_length(r, n, _state); for(i=0; i<=n-1; i++) { v = ae_complex_from_i(0); for(j=0; j<=m-1; j++) { v = ae_c_add(v,ae_c_mul(ae_c_conj(pattern->ptr.p_complex[j], _state),signal->ptr.p_complex[(i+j)%n])); } r->ptr.p_complex[i] = v; } } /************************************************************************* Reference implementation *************************************************************************/ static void testcorrunit_refcorrr1d(/* Real */ ae_vector* signal, ae_int_t n, /* Real */ ae_vector* pattern, ae_int_t m, /* Real */ ae_vector* r, ae_state *_state) { ae_frame _frame_block; ae_int_t i; ae_int_t j; double v; ae_vector s; ae_frame_make(_state, &_frame_block); ae_vector_clear(r); ae_vector_init(&s, 0, DT_REAL, _state); ae_vector_set_length(&s, m+n-1, _state); ae_v_move(&s.ptr.p_double[0], 1, &signal->ptr.p_double[0], 1, ae_v_len(0,n-1)); for(i=n; i<=m+n-2; i++) { s.ptr.p_double[i] = (double)(0); } ae_vector_set_length(r, m+n-1, _state); for(i=0; i<=n-1; i++) { v = (double)(0); for(j=0; j<=m-1; j++) { if( i+j>=n ) { break; } v = v+pattern->ptr.p_double[j]*s.ptr.p_double[i+j]; } r->ptr.p_double[i] = v; } for(i=1; i<=m-1; i++) { v = (double)(0); for(j=i; j<=m-1; j++) { v = v+pattern->ptr.p_double[j]*s.ptr.p_double[-i+j]; } 
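/* the correlation at negative lag -i is stored at the tail of R, at index M+N-1-i, matching the output layout of corrr1d() */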
r->ptr.p_double[m+n-1-i] = v; } ae_frame_leave(_state); } /************************************************************************* Reference implementation *************************************************************************/ static void testcorrunit_refcorrr1dcircular(/* Real */ ae_vector* signal, ae_int_t n, /* Real */ ae_vector* pattern, ae_int_t m, /* Real */ ae_vector* r, ae_state *_state) { ae_int_t i; ae_int_t j; double v; ae_vector_clear(r); ae_vector_set_length(r, n, _state); for(i=0; i<=n-1; i++) { v = (double)(0); for(j=0; j<=m-1; j++) { v = v+pattern->ptr.p_double[j]*signal->ptr.p_double[(i+j)%n]; } r->ptr.p_double[i] = v; } } static void testidwintunit_testxy(/* Real */ ae_matrix* xy, ae_int_t n, ae_int_t nx, ae_int_t d, ae_int_t nq, ae_int_t nw, ae_bool* idwerrors, ae_state *_state); static void testidwintunit_testdegree(ae_int_t n, ae_int_t nx, ae_int_t d, ae_int_t dtask, ae_bool* idwerrors, ae_state *_state); static void testidwintunit_testnoisy(ae_bool* idwerrors, ae_state *_state); /************************************************************************* Testing IDW interpolation *************************************************************************/ ae_bool testidwint(ae_bool silent, ae_state *_state) { ae_frame _frame_block; ae_matrix xy; ae_int_t i; ae_int_t j; double vx; double vy; double vz; ae_int_t d; ae_int_t dtask; ae_int_t nx; ae_int_t nq; ae_int_t nw; ae_int_t smalln; ae_int_t largen; ae_bool waserrors; ae_bool idwerrors; ae_bool result; ae_frame_make(_state, &_frame_block); ae_matrix_init(&xy, 0, 0, DT_REAL, _state); idwerrors = ae_false; smalln = 256; largen = 1024; nq = 10; nw = 18; /* * Simple test: * * F = x^3 + sin(pi*y)*z^2 - (x+y)^2 * * space is either R1=[-1,+1] (other dimensions are * fixed at 0), R1^2 or R1^3. ** D = -1, 0, 1, 2 */ for(nx=1; nx<=2; nx++) { ae_matrix_set_length(&xy, largen, nx+1, _state); for(i=0; i<=largen-1; i++) { for(j=0; j<=nx-1; j++) { xy.ptr.pp_double[i][j] = 2*ae_randomreal(_state)-1; } if( nx>=1 ) { vx = xy.ptr.pp_double[i][0]; } else { vx = (double)(0); } if( nx>=2 ) { vy = xy.ptr.pp_double[i][1]; } else { vy = (double)(0); } if( nx>=3 ) { vz = xy.ptr.pp_double[i][2]; } else { vz = (double)(0); } xy.ptr.pp_double[i][nx] = vx*vx*vx+ae_sin(ae_pi*vy, _state)*ae_sqr(vz, _state)-ae_sqr(vx+vy, _state); } for(d=-1; d<=2; d++) { testidwintunit_testxy(&xy, largen, nx, d, nq, nw, &idwerrors, _state); } } /* * Another simple test: * * five points in 2D - (0,0), (0,1), (1,0), (-1,0) (0,-1) * * F is random * * D = -1, 0, 1, 2 */ nx = 2; ae_matrix_set_length(&xy, 5, nx+1, _state); xy.ptr.pp_double[0][0] = (double)(0); xy.ptr.pp_double[0][1] = (double)(0); xy.ptr.pp_double[0][2] = 2*ae_randomreal(_state)-1; xy.ptr.pp_double[1][0] = (double)(1); xy.ptr.pp_double[1][1] = (double)(0); xy.ptr.pp_double[1][2] = 2*ae_randomreal(_state)-1; xy.ptr.pp_double[2][0] = (double)(0); xy.ptr.pp_double[2][1] = (double)(1); xy.ptr.pp_double[2][2] = 2*ae_randomreal(_state)-1; xy.ptr.pp_double[3][0] = (double)(-1); xy.ptr.pp_double[3][1] = (double)(0); xy.ptr.pp_double[3][2] = 2*ae_randomreal(_state)-1; xy.ptr.pp_double[4][0] = (double)(0); xy.ptr.pp_double[4][1] = (double)(-1); xy.ptr.pp_double[4][2] = 2*ae_randomreal(_state)-1; for(d=-1; d<=2; d++) { testidwintunit_testxy(&xy, 5, nx, d, nq, nw, &idwerrors, _state); } /* * Degree test. 
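* (TestDegree chooses NQ=2*(NX*NX+NX+1) and NW=10 internally.)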
* * F is either: * * constant (DTask=0) * * linear (DTask=1) * * quadratic (DTask=2) * * Nodal functions are either * * constant (D=0) * * linear (D=1) * * quadratic (D=2) * * When DTask<=D, we can interpolate without errors. * When DTask>D, we MUST have errors. */ for(nx=1; nx<=3; nx++) { for(d=0; d<=2; d++) { for(dtask=0; dtask<=2; dtask++) { testidwintunit_testdegree(smalln, nx, d, dtask, &idwerrors, _state); } } } /* * Noisy test */ testidwintunit_testnoisy(&idwerrors, _state); /* * report */ waserrors = idwerrors; if( !silent ) { printf("TESTING INVERSE DISTANCE WEIGHTING\n"); printf("* IDW: "); if( !idwerrors ) { printf("OK\n"); } else { printf("FAILED\n"); } if( waserrors ) { printf("TEST FAILED\n"); } else { printf("TEST PASSED\n"); } printf("\n\n"); } result = !waserrors; ae_frame_leave(_state); return result; } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. *************************************************************************/ ae_bool _pexec_testidwint(ae_bool silent, ae_state *_state) { return testidwint(silent, _state); } /************************************************************************* Testing IDW: * generate model using N/NX/D/NQ/NW * test basic properties *************************************************************************/ static void testidwintunit_testxy(/* Real */ ae_matrix* xy, ae_int_t n, ae_int_t nx, ae_int_t d, ae_int_t nq, ae_int_t nw, ae_bool* idwerrors, ae_state *_state) { ae_frame _frame_block; double lipschitzstep; ae_int_t i; ae_int_t i1; ae_int_t i2; double v; double v1; double v2; double t; double l1; double l2; idwinterpolant z1; ae_vector x; ae_frame_make(_state, &_frame_block); _idwinterpolant_init(&z1, _state); ae_vector_init(&x, 0, DT_REAL, _state); lipschitzstep = 0.001; ae_vector_set_length(&x, nx, _state); /* * build */ idwbuildmodifiedshepard(xy, n, nx, d, nq, nw, &z1, _state); /* * first, test interpolation properties at nodes */ for(i=0; i<=n-1; i++) { ae_v_move(&x.ptr.p_double[0], 1, &xy->ptr.pp_double[i][0], 1, ae_v_len(0,nx-1)); *idwerrors = *idwerrors||ae_fp_neq(idwcalc(&z1, &x, _state),xy->ptr.pp_double[i][nx]); } /* * test Lipschitz continuity */ i1 = ae_randominteger(n, _state); do { i2 = ae_randominteger(n, _state); } while(i2==i1); l1 = (double)(0); t = (double)(0); while(ae_fp_less(t,(double)(1))) { v = 1-t; ae_v_moved(&x.ptr.p_double[0], 1, &xy->ptr.pp_double[i1][0], 1, ae_v_len(0,nx-1), v); v = t; ae_v_addd(&x.ptr.p_double[0], 1, &xy->ptr.pp_double[i2][0], 1, ae_v_len(0,nx-1), v); v1 = idwcalc(&z1, &x, _state); v = 1-(t+lipschitzstep); ae_v_moved(&x.ptr.p_double[0], 1, &xy->ptr.pp_double[i1][0], 1, ae_v_len(0,nx-1), v); v = t+lipschitzstep; ae_v_addd(&x.ptr.p_double[0], 1, &xy->ptr.pp_double[i2][0], 1, ae_v_len(0,nx-1), v); v2 = idwcalc(&z1, &x, _state); l1 = ae_maxreal(l1, ae_fabs(v2-v1, _state)/lipschitzstep, _state); t = t+lipschitzstep; } l2 = (double)(0); t = (double)(0); while(ae_fp_less(t,(double)(1))) { v = 1-t; ae_v_moved(&x.ptr.p_double[0], 1, &xy->ptr.pp_double[i1][0], 1, ae_v_len(0,nx-1), v); v = t; ae_v_addd(&x.ptr.p_double[0], 1, &xy->ptr.pp_double[i2][0], 1, ae_v_len(0,nx-1), v); v1 = idwcalc(&z1, &x, _state); v = 1-(t+lipschitzstep/3); ae_v_moved(&x.ptr.p_double[0], 1, &xy->ptr.pp_double[i1][0], 1, ae_v_len(0,nx-1), v); v = t+lipschitzstep/3; ae_v_addd(&x.ptr.p_double[0], 1, &xy->ptr.pp_double[i2][0], 1, ae_v_len(0,nx-1), v); v2 = idwcalc(&z1, &x, _state); l2 = ae_maxreal(l2, ae_fabs(v2-v1, 
_state)/(lipschitzstep/3), _state); t = t+lipschitzstep/3; } *idwerrors = *idwerrors||ae_fp_greater(l2,2.0*l1); ae_frame_leave(_state); } /************************************************************************* Testing degree properties F is either: * constant (DTask=0) * linear (DTask=1) * quadratic (DTask=2) Nodal functions are either * constant (D=0) * linear (D=1) * quadratic (D=2) When DTask<=D, we can interpolate without errors. When DTask>D, we MUST have errors. *************************************************************************/ static void testidwintunit_testdegree(ae_int_t n, ae_int_t nx, ae_int_t d, ae_int_t dtask, ae_bool* idwerrors, ae_state *_state) { ae_frame _frame_block; double threshold; ae_int_t nq; ae_int_t nw; ae_int_t i; ae_int_t j; double v; double c0; ae_vector c1; ae_matrix c2; ae_vector x; ae_matrix xy; idwinterpolant z1; double v1; double v2; ae_frame_make(_state, &_frame_block); ae_vector_init(&c1, 0, DT_REAL, _state); ae_matrix_init(&c2, 0, 0, DT_REAL, _state); ae_vector_init(&x, 0, DT_REAL, _state); ae_matrix_init(&xy, 0, 0, DT_REAL, _state); _idwinterpolant_init(&z1, _state); threshold = 1.0E6*ae_machineepsilon; nq = 2*(nx*nx+nx+1); nw = 10; ae_assert(nq<=n, "TestDegree: internal error", _state); /* * prepare model */ c0 = 2*ae_randomreal(_state)-1; ae_vector_set_length(&c1, nx, _state); for(i=0; i<=nx-1; i++) { c1.ptr.p_double[i] = 2*ae_randomreal(_state)-1; } ae_matrix_set_length(&c2, nx, nx, _state); for(i=0; i<=nx-1; i++) { for(j=i+1; j<=nx-1; j++) { c2.ptr.pp_double[i][j] = 2*ae_randomreal(_state)-1; c2.ptr.pp_double[j][i] = c2.ptr.pp_double[i][j]; } do { c2.ptr.pp_double[i][i] = 2*ae_randomreal(_state)-1; } while(ae_fp_less_eq(ae_fabs(c2.ptr.pp_double[i][i], _state),0.3)); } /* * prepare points */ ae_matrix_set_length(&xy, n, nx+1, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=nx-1; j++) { xy.ptr.pp_double[i][j] = 4*ae_randomreal(_state)-2; } xy.ptr.pp_double[i][nx] = c0; if( dtask>=1 ) { v = ae_v_dotproduct(&c1.ptr.p_double[0], 1, &xy.ptr.pp_double[i][0], 1, ae_v_len(0,nx-1)); xy.ptr.pp_double[i][nx] = xy.ptr.pp_double[i][nx]+v; } if( dtask==2 ) { for(j=0; j<=nx-1; j++) { v = ae_v_dotproduct(&c2.ptr.pp_double[j][0], 1, &xy.ptr.pp_double[i][0], 1, ae_v_len(0,nx-1)); xy.ptr.pp_double[i][nx] = xy.ptr.pp_double[i][nx]+xy.ptr.pp_double[i][j]*v; } } } /* * build interpolant, calculate value at random point */ idwbuildmodifiedshepard(&xy, n, nx, d, nq, nw, &z1, _state); ae_vector_set_length(&x, nx, _state); for(i=0; i<=nx-1; i++) { x.ptr.p_double[i] = 4*ae_randomreal(_state)-2; } v1 = idwcalc(&z1, &x, _state); /* * calculate model value at the same point */ v2 = c0; if( dtask>=1 ) { v = ae_v_dotproduct(&c1.ptr.p_double[0], 1, &x.ptr.p_double[0], 1, ae_v_len(0,nx-1)); v2 = v2+v; } if( dtask==2 ) { for(j=0; j<=nx-1; j++) { v = ae_v_dotproduct(&c2.ptr.pp_double[j][0], 1, &x.ptr.p_double[0], 1, ae_v_len(0,nx-1)); v2 = v2+x.ptr.p_double[j]*v; } } /* * Compare */ if( dtask<=d ) { *idwerrors = *idwerrors||ae_fp_greater(ae_fabs(v2-v1, _state),threshold); } else { *idwerrors = *idwerrors||ae_fp_less(ae_fabs(v2-v1, _state),threshold); } ae_frame_leave(_state); } /************************************************************************* Noisy test: * F = x^2 + y^2 + z^2 + noise on [-1,+1]^3 * space is either R1=[-1,+1] (other dimensions are fixed at 0), R1^2 or R1^3. * D = 1, 2 * 4096 points is used for function generation, 4096 points - for testing * RMS error of "noisy" model on test set must be lower than RMS error of interpolation model. 
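* (the "noisy" model is built with IDWBuildNoisy(), which fits the data approximately rather than interpolating it exactly, so it is expected to cope better with noise.)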
*************************************************************************/ static void testidwintunit_testnoisy(ae_bool* idwerrors, ae_state *_state) { ae_frame _frame_block; double noiselevel; ae_int_t nq; ae_int_t nw; ae_int_t d; ae_int_t nx; ae_int_t ntrn; ae_int_t ntst; ae_int_t i; ae_int_t j; double v; double t; double v1; double v2; double ve; ae_matrix xy; ae_vector x; idwinterpolant z1; idwinterpolant z2; double rms1; double rms2; ae_frame_make(_state, &_frame_block); ae_matrix_init(&xy, 0, 0, DT_REAL, _state); ae_vector_init(&x, 0, DT_REAL, _state); _idwinterpolant_init(&z1, _state); _idwinterpolant_init(&z2, _state); nq = 20; nw = 40; noiselevel = 0.2; ntrn = 256; ntst = 1024; for(d=1; d<=2; d++) { for(nx=1; nx<=2; nx++) { /* * prepare dataset */ ae_matrix_set_length(&xy, ntrn, nx+1, _state); for(i=0; i<=ntrn-1; i++) { v = noiselevel*(2*ae_randomreal(_state)-1); for(j=0; j<=nx-1; j++) { t = 2*ae_randomreal(_state)-1; v = v+ae_sqr(t, _state); xy.ptr.pp_double[i][j] = t; } xy.ptr.pp_double[i][nx] = v; } /* * build interpolants */ idwbuildmodifiedshepard(&xy, ntrn, nx, d, nq, nw, &z1, _state); idwbuildnoisy(&xy, ntrn, nx, d, nq, nw, &z2, _state); /* * calculate RMS errors */ ae_vector_set_length(&x, nx, _state); rms1 = (double)(0); rms2 = (double)(0); for(i=0; i<=ntst-1; i++) { ve = (double)(0); for(j=0; j<=nx-1; j++) { t = 2*ae_randomreal(_state)-1; x.ptr.p_double[j] = t; ve = ve+ae_sqr(t, _state); } v1 = idwcalc(&z1, &x, _state); v2 = idwcalc(&z2, &x, _state); rms1 = rms1+ae_sqr(v1-ve, _state); rms2 = rms2+ae_sqr(v2-ve, _state); } *idwerrors = *idwerrors||ae_fp_greater(rms2,rms1); } } ae_frame_leave(_state); } static void testratintunit_poldiff2(/* Real */ ae_vector* x, /* Real */ ae_vector* f, ae_int_t n, double t, double* p, double* dp, double* d2p, ae_state *_state); static void testratintunit_brcunset(barycentricinterpolant* b, ae_state *_state); ae_bool testratint(ae_bool silent, ae_state *_state) { ae_frame _frame_block; ae_bool waserrors; ae_bool bcerrors; ae_bool nperrors; double threshold; double lipschitztol; ae_int_t passcount; barycentricinterpolant b1; barycentricinterpolant b2; ae_vector x; ae_vector x2; ae_vector y; ae_vector y2; ae_vector w; ae_vector w2; ae_vector xc; ae_vector yc; ae_vector dc; double h; double s1; double s2; ae_int_t n; ae_int_t n2; ae_int_t i; ae_int_t j; ae_int_t k; ae_int_t d; ae_int_t pass; double maxerr; double t; double a; double b; double s; double v0; double v1; double v2; double v3; double d0; double d1; double d2; ae_bool result; ae_frame_make(_state, &_frame_block); _barycentricinterpolant_init(&b1, _state); _barycentricinterpolant_init(&b2, _state); ae_vector_init(&x, 0, DT_REAL, _state); ae_vector_init(&x2, 0, DT_REAL, _state); ae_vector_init(&y, 0, DT_REAL, _state); ae_vector_init(&y2, 0, DT_REAL, _state); ae_vector_init(&w, 0, DT_REAL, _state); ae_vector_init(&w2, 0, DT_REAL, _state); ae_vector_init(&xc, 0, DT_REAL, _state); ae_vector_init(&yc, 0, DT_REAL, _state); ae_vector_init(&dc, 0, DT_INT, _state); nperrors = ae_false; bcerrors = ae_false; waserrors = ae_false; /* * PassCount number of repeated passes * Threshold error tolerance * LipschitzTol Lipschitz constant increase allowed * when calculating constant on a twice denser grid */ passcount = 5; threshold = 1000000*ae_machineepsilon; lipschitztol = 1.3; /* * Basic barycentric functions */ for(n=1; n<=10; n++) { /* * randomized tests */ for(pass=1; pass<=passcount; pass++) { /* * generate weights from polynomial interpolation */ v0 = 1+0.4*ae_randomreal(_state)-0.2; v1 = 
2*ae_randomreal(_state)-1; v2 = 2*ae_randomreal(_state)-1; v3 = 2*ae_randomreal(_state)-1; ae_vector_set_length(&x, n, _state); ae_vector_set_length(&y, n, _state); ae_vector_set_length(&w, n, _state); for(i=0; i<=n-1; i++) { if( n==1 ) { x.ptr.p_double[i] = (double)(0); } else { x.ptr.p_double[i] = v0*ae_cos(i*ae_pi/(n-1), _state); } y.ptr.p_double[i] = ae_sin(v1*x.ptr.p_double[i], _state)+ae_cos(v2*x.ptr.p_double[i], _state)+ae_exp(v3*x.ptr.p_double[i], _state); } for(j=0; j<=n-1; j++) { w.ptr.p_double[j] = (double)(1); for(k=0; k<=n-1; k++) { if( k!=j ) { w.ptr.p_double[j] = w.ptr.p_double[j]/(x.ptr.p_double[j]-x.ptr.p_double[k]); } } } barycentricbuildxyw(&x, &y, &w, n, &b1, _state); /* * unpack, then pack again and compare */ testratintunit_brcunset(&b2, _state); barycentricunpack(&b1, &n2, &x2, &y2, &w2, _state); bcerrors = bcerrors||n2!=n; barycentricbuildxyw(&x2, &y2, &w2, n2, &b2, _state); t = 2*ae_randomreal(_state)-1; bcerrors = bcerrors||ae_fp_greater(ae_fabs(barycentriccalc(&b1, t, _state)-barycentriccalc(&b2, t, _state), _state),threshold); /* * copy, compare */ testratintunit_brcunset(&b2, _state); barycentriccopy(&b1, &b2, _state); t = 2*ae_randomreal(_state)-1; bcerrors = bcerrors||ae_fp_greater(ae_fabs(barycentriccalc(&b1, t, _state)-barycentriccalc(&b2, t, _state), _state),threshold); /* * test interpolation properties */ for(i=0; i<=n-1; i++) { /* * test interpolation at nodes */ bcerrors = bcerrors||ae_fp_greater(ae_fabs(barycentriccalc(&b1, x.ptr.p_double[i], _state)-y.ptr.p_double[i], _state),threshold*ae_fabs(y.ptr.p_double[i], _state)); /* * compare with polynomial interpolation */ t = 2*ae_randomreal(_state)-1; testratintunit_poldiff2(&x, &y, n, t, &v0, &v1, &v2, _state); bcerrors = bcerrors||ae_fp_greater(ae_fabs(barycentriccalc(&b1, t, _state)-v0, _state),threshold*ae_maxreal(ae_fabs(v0, _state), (double)(1), _state)); /* * test continuity between nodes * calculate Lipschitz constant on two grids - * dense and even more dense. 
If Lipschitz constant * on a denser grid is significantly increased, * continuity test is failed */ t = 3.0; k = 100; s1 = (double)(0); for(j=0; j<=k-1; j++) { v1 = x.ptr.p_double[i]+(t-x.ptr.p_double[i])*j/k; v2 = x.ptr.p_double[i]+(t-x.ptr.p_double[i])*(j+1)/k; s1 = ae_maxreal(s1, ae_fabs(barycentriccalc(&b1, v2, _state)-barycentriccalc(&b1, v1, _state), _state)/ae_fabs(v2-v1, _state), _state); } k = 2*k; s2 = (double)(0); for(j=0; j<=k-1; j++) { v1 = x.ptr.p_double[i]+(t-x.ptr.p_double[i])*j/k; v2 = x.ptr.p_double[i]+(t-x.ptr.p_double[i])*(j+1)/k; s2 = ae_maxreal(s2, ae_fabs(barycentriccalc(&b1, v2, _state)-barycentriccalc(&b1, v1, _state), _state)/ae_fabs(v2-v1, _state), _state); } bcerrors = bcerrors||(ae_fp_greater(s2,lipschitztol*s1)&&ae_fp_greater(s1,threshold*k)); } /* * test differentiation properties */ for(i=0; i<=n-1; i++) { t = 2*ae_randomreal(_state)-1; testratintunit_poldiff2(&x, &y, n, t, &v0, &v1, &v2, _state); d0 = (double)(0); d1 = (double)(0); d2 = (double)(0); barycentricdiff1(&b1, t, &d0, &d1, _state); bcerrors = bcerrors||ae_fp_greater(ae_fabs(v0-d0, _state),threshold*ae_maxreal(ae_fabs(v0, _state), (double)(1), _state)); bcerrors = bcerrors||ae_fp_greater(ae_fabs(v1-d1, _state),threshold*ae_maxreal(ae_fabs(v1, _state), (double)(1), _state)); d0 = (double)(0); d1 = (double)(0); d2 = (double)(0); barycentricdiff2(&b1, t, &d0, &d1, &d2, _state); bcerrors = bcerrors||ae_fp_greater(ae_fabs(v0-d0, _state),threshold*ae_maxreal(ae_fabs(v0, _state), (double)(1), _state)); bcerrors = bcerrors||ae_fp_greater(ae_fabs(v1-d1, _state),threshold*ae_maxreal(ae_fabs(v1, _state), (double)(1), _state)); bcerrors = bcerrors||ae_fp_greater(ae_fabs(v2-d2, _state),ae_sqrt(threshold, _state)*ae_maxreal(ae_fabs(v2, _state), (double)(1), _state)); } /* * test linear translation */ t = 2*ae_randomreal(_state)-1; a = 2*ae_randomreal(_state)-1; b = 2*ae_randomreal(_state)-1; testratintunit_brcunset(&b2, _state); barycentriccopy(&b1, &b2, _state); barycentriclintransx(&b2, a, b, _state); bcerrors = bcerrors||ae_fp_greater(ae_fabs(barycentriccalc(&b1, a*t+b, _state)-barycentriccalc(&b2, t, _state), _state),threshold); a = (double)(0); b = 2*ae_randomreal(_state)-1; testratintunit_brcunset(&b2, _state); barycentriccopy(&b1, &b2, _state); barycentriclintransx(&b2, a, b, _state); bcerrors = bcerrors||ae_fp_greater(ae_fabs(barycentriccalc(&b1, a*t+b, _state)-barycentriccalc(&b2, t, _state), _state),threshold); a = 2*ae_randomreal(_state)-1; b = 2*ae_randomreal(_state)-1; testratintunit_brcunset(&b2, _state); barycentriccopy(&b1, &b2, _state); barycentriclintransy(&b2, a, b, _state); bcerrors = bcerrors||ae_fp_greater(ae_fabs(a*barycentriccalc(&b1, t, _state)+b-barycentriccalc(&b2, t, _state), _state),threshold); } } for(pass=0; pass<=3; pass++) { /* * Crash-test: small numbers, large numbers */ ae_vector_set_length(&x, 4, _state); ae_vector_set_length(&y, 4, _state); ae_vector_set_length(&w, 4, _state); h = (double)(1); if( pass%2==0 ) { h = 100*ae_minrealnumber; } if( pass%2==1 ) { h = 0.01*ae_maxrealnumber; } x.ptr.p_double[0] = 0*h; x.ptr.p_double[1] = 1*h; x.ptr.p_double[2] = 2*h; x.ptr.p_double[3] = 3*h; y.ptr.p_double[0] = 0*h; y.ptr.p_double[1] = 1*h; y.ptr.p_double[2] = 2*h; y.ptr.p_double[3] = 3*h; w.ptr.p_double[0] = -1/(x.ptr.p_double[1]-x.ptr.p_double[0]); w.ptr.p_double[1] = 1*(1/(x.ptr.p_double[1]-x.ptr.p_double[0])+1/(x.ptr.p_double[2]-x.ptr.p_double[1])); w.ptr.p_double[2] = -1*(1/(x.ptr.p_double[2]-x.ptr.p_double[1])+1/(x.ptr.p_double[3]-x.ptr.p_double[2])); w.ptr.p_double[3] = 
1/(x.ptr.p_double[3]-x.ptr.p_double[2]); v0 = (double)(0); if( pass/2==0 ) { v0 = (double)(0); } if( pass/2==1 ) { v0 = 0.6*h; } barycentricbuildxyw(&x, &y, &w, 4, &b1, _state); t = barycentriccalc(&b1, v0, _state); d0 = (double)(0); d1 = (double)(0); d2 = (double)(0); barycentricdiff1(&b1, v0, &d0, &d1, _state); bcerrors = bcerrors||ae_fp_greater(ae_fabs(t-v0, _state),threshold*v0); bcerrors = bcerrors||ae_fp_greater(ae_fabs(d0-v0, _state),threshold*v0); bcerrors = bcerrors||ae_fp_greater(ae_fabs(d1-1, _state),1000*threshold); } /* * crash test: large abscissas, small argument * * test for errors in D0 is not very strict * because renormalization used in Diff1() * destroys part of precision. */ ae_vector_set_length(&x, 4, _state); ae_vector_set_length(&y, 4, _state); ae_vector_set_length(&w, 4, _state); h = 0.01*ae_maxrealnumber; x.ptr.p_double[0] = 0*h; x.ptr.p_double[1] = 1*h; x.ptr.p_double[2] = 2*h; x.ptr.p_double[3] = 3*h; y.ptr.p_double[0] = 0*h; y.ptr.p_double[1] = 1*h; y.ptr.p_double[2] = 2*h; y.ptr.p_double[3] = 3*h; w.ptr.p_double[0] = -1/(x.ptr.p_double[1]-x.ptr.p_double[0]); w.ptr.p_double[1] = 1*(1/(x.ptr.p_double[1]-x.ptr.p_double[0])+1/(x.ptr.p_double[2]-x.ptr.p_double[1])); w.ptr.p_double[2] = -1*(1/(x.ptr.p_double[2]-x.ptr.p_double[1])+1/(x.ptr.p_double[3]-x.ptr.p_double[2])); w.ptr.p_double[3] = 1/(x.ptr.p_double[3]-x.ptr.p_double[2]); v0 = 100*ae_minrealnumber; barycentricbuildxyw(&x, &y, &w, 4, &b1, _state); t = barycentriccalc(&b1, v0, _state); d0 = (double)(0); d1 = (double)(0); d2 = (double)(0); barycentricdiff1(&b1, v0, &d0, &d1, _state); bcerrors = bcerrors||ae_fp_greater(ae_fabs(t, _state),v0*(1+threshold)); bcerrors = bcerrors||ae_fp_greater(ae_fabs(d0, _state),v0*(1+threshold)); bcerrors = bcerrors||ae_fp_greater(ae_fabs(d1-1, _state),1000*threshold); /* * crash test: test safe barycentric formula */ ae_vector_set_length(&x, 4, _state); ae_vector_set_length(&y, 4, _state); ae_vector_set_length(&w, 4, _state); h = 2*ae_minrealnumber; x.ptr.p_double[0] = 0*h; x.ptr.p_double[1] = 1*h; x.ptr.p_double[2] = 2*h; x.ptr.p_double[3] = 3*h; y.ptr.p_double[0] = 0*h; y.ptr.p_double[1] = 1*h; y.ptr.p_double[2] = 2*h; y.ptr.p_double[3] = 3*h; w.ptr.p_double[0] = -1/(x.ptr.p_double[1]-x.ptr.p_double[0]); w.ptr.p_double[1] = 1*(1/(x.ptr.p_double[1]-x.ptr.p_double[0])+1/(x.ptr.p_double[2]-x.ptr.p_double[1])); w.ptr.p_double[2] = -1*(1/(x.ptr.p_double[2]-x.ptr.p_double[1])+1/(x.ptr.p_double[3]-x.ptr.p_double[2])); w.ptr.p_double[3] = 1/(x.ptr.p_double[3]-x.ptr.p_double[2]); v0 = ae_minrealnumber; barycentricbuildxyw(&x, &y, &w, 4, &b1, _state); t = barycentriccalc(&b1, v0, _state); bcerrors = bcerrors||ae_fp_greater(ae_fabs(t-v0, _state)/v0,threshold); /* * Testing "No Poles" interpolation */ maxerr = (double)(0); for(pass=1; pass<=passcount-1; pass++) { ae_vector_set_length(&x, 1, _state); ae_vector_set_length(&y, 1, _state); x.ptr.p_double[0] = 2*ae_randomreal(_state)-1; y.ptr.p_double[0] = 2*ae_randomreal(_state)-1; barycentricbuildfloaterhormann(&x, &y, 1, 1, &b1, _state); maxerr = ae_maxreal(maxerr, ae_fabs(barycentriccalc(&b1, 2*ae_randomreal(_state)-1, _state)-y.ptr.p_double[0], _state), _state); } for(n=2; n<=10; n++) { /* * compare interpolant built by subroutine * with interpolant built by hands */ ae_vector_set_length(&x, n, _state); ae_vector_set_length(&y, n, _state); ae_vector_set_length(&w, n, _state); ae_vector_set_length(&w2, n, _state); /* * D=1, non-equidistant nodes */ for(pass=1; pass<=passcount; pass++) { /* * Initialize X, Y, W */ a = 
-1-1*ae_randomreal(_state); b = 1+1*ae_randomreal(_state); for(i=0; i<=n-1; i++) { x.ptr.p_double[i] = ae_atan((b-a)*i/(n-1)+a, _state); } for(i=0; i<=n-1; i++) { y.ptr.p_double[i] = 2*ae_randomreal(_state)-1; } w.ptr.p_double[0] = -1/(x.ptr.p_double[1]-x.ptr.p_double[0]); s = (double)(1); for(i=1; i<=n-2; i++) { w.ptr.p_double[i] = s*(1/(x.ptr.p_double[i]-x.ptr.p_double[i-1])+1/(x.ptr.p_double[i+1]-x.ptr.p_double[i])); s = -s; } w.ptr.p_double[n-1] = s/(x.ptr.p_double[n-1]-x.ptr.p_double[n-2]); for(i=0; i<=n-1; i++) { k = ae_randominteger(n, _state); if( k!=i ) { t = x.ptr.p_double[i]; x.ptr.p_double[i] = x.ptr.p_double[k]; x.ptr.p_double[k] = t; t = y.ptr.p_double[i]; y.ptr.p_double[i] = y.ptr.p_double[k]; y.ptr.p_double[k] = t; t = w.ptr.p_double[i]; w.ptr.p_double[i] = w.ptr.p_double[k]; w.ptr.p_double[k] = t; } } /* * Build and test */ barycentricbuildfloaterhormann(&x, &y, n, 1, &b1, _state); barycentricbuildxyw(&x, &y, &w, n, &b2, _state); for(i=1; i<=2*n; i++) { t = a+(b-a)*ae_randomreal(_state); maxerr = ae_maxreal(maxerr, ae_fabs(barycentriccalc(&b1, t, _state)-barycentriccalc(&b2, t, _state), _state), _state); } } /* * D = 0, 1, 2. Equidistant nodes. */ for(d=0; d<=2; d++) { for(pass=1; pass<=passcount; pass++) { /* * Skip incorrect (N,D) pairs */ if( n<2*d ) { continue; } /* * Initialize X, Y, W */ a = -1-1*ae_randomreal(_state); b = 1+1*ae_randomreal(_state); for(i=0; i<=n-1; i++) { x.ptr.p_double[i] = (b-a)*i/(n-1)+a; } for(i=0; i<=n-1; i++) { y.ptr.p_double[i] = 2*ae_randomreal(_state)-1; } s = (double)(1); if( d==0 ) { for(i=0; i<=n-1; i++) { w.ptr.p_double[i] = s; s = -s; } } if( d==1 ) { w.ptr.p_double[0] = -s; for(i=1; i<=n-2; i++) { w.ptr.p_double[i] = 2*s; s = -s; } w.ptr.p_double[n-1] = s; } if( d==2 ) { w.ptr.p_double[0] = s; w.ptr.p_double[1] = -3*s; for(i=2; i<=n-3; i++) { w.ptr.p_double[i] = 4*s; s = -s; } w.ptr.p_double[n-2] = 3*s; w.ptr.p_double[n-1] = -s; } /* * Mix */ for(i=0; i<=n-1; i++) { k = ae_randominteger(n, _state); if( k!=i ) { t = x.ptr.p_double[i]; x.ptr.p_double[i] = x.ptr.p_double[k]; x.ptr.p_double[k] = t; t = y.ptr.p_double[i]; y.ptr.p_double[i] = y.ptr.p_double[k]; y.ptr.p_double[k] = t; t = w.ptr.p_double[i]; w.ptr.p_double[i] = w.ptr.p_double[k]; w.ptr.p_double[k] = t; } } /* * Build and test */ barycentricbuildfloaterhormann(&x, &y, n, d, &b1, _state); barycentricbuildxyw(&x, &y, &w, n, &b2, _state); for(i=1; i<=2*n; i++) { t = a+(b-a)*ae_randomreal(_state); maxerr = ae_maxreal(maxerr, ae_fabs(barycentriccalc(&b1, t, _state)-barycentriccalc(&b2, t, _state), _state), _state); } } } } if( ae_fp_greater(maxerr,threshold) ) { nperrors = ae_true; } /* * report */ waserrors = bcerrors||nperrors; if( !silent ) { printf("TESTING RATIONAL INTERPOLATION\n"); printf("BASIC BARYCENTRIC FUNCTIONS: "); if( bcerrors ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("FLOATER-HORMANN: "); if( nperrors ) { printf("FAILED\n"); } else { printf("OK\n"); } if( waserrors ) { printf("TEST FAILED\n"); } else { printf("TEST PASSED\n"); } printf("\n\n"); } /* * end */ result = !waserrors; ae_frame_leave(_state); return result; } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. 
*************************************************************************/ ae_bool _pexec_testratint(ae_bool silent, ae_state *_state) { return testratint(silent, _state); } static void testratintunit_poldiff2(/* Real */ ae_vector* x, /* Real */ ae_vector* f, ae_int_t n, double t, double* p, double* dp, double* d2p, ae_state *_state) { ae_frame _frame_block; ae_vector _f; ae_int_t m; ae_int_t i; ae_vector df; ae_vector d2f; ae_frame_make(_state, &_frame_block); ae_vector_init_copy(&_f, f, _state); f = &_f; *p = 0; *dp = 0; *d2p = 0; ae_vector_init(&df, 0, DT_REAL, _state); ae_vector_init(&d2f, 0, DT_REAL, _state); n = n-1; ae_vector_set_length(&df, n+1, _state); ae_vector_set_length(&d2f, n+1, _state); for(i=0; i<=n; i++) { d2f.ptr.p_double[i] = (double)(0); df.ptr.p_double[i] = (double)(0); } for(m=1; m<=n; m++) { for(i=0; i<=n-m; i++) { d2f.ptr.p_double[i] = ((t-x->ptr.p_double[i+m])*d2f.ptr.p_double[i]+(x->ptr.p_double[i]-t)*d2f.ptr.p_double[i+1]+2*df.ptr.p_double[i]-2*df.ptr.p_double[i+1])/(x->ptr.p_double[i]-x->ptr.p_double[i+m]); df.ptr.p_double[i] = ((t-x->ptr.p_double[i+m])*df.ptr.p_double[i]+f->ptr.p_double[i]+(x->ptr.p_double[i]-t)*df.ptr.p_double[i+1]-f->ptr.p_double[i+1])/(x->ptr.p_double[i]-x->ptr.p_double[i+m]); f->ptr.p_double[i] = ((t-x->ptr.p_double[i+m])*f->ptr.p_double[i]+(x->ptr.p_double[i]-t)*f->ptr.p_double[i+1])/(x->ptr.p_double[i]-x->ptr.p_double[i+m]); } } *p = f->ptr.p_double[0]; *dp = df.ptr.p_double[0]; *d2p = d2f.ptr.p_double[0]; ae_frame_leave(_state); } static void testratintunit_brcunset(barycentricinterpolant* b, ae_state *_state) { ae_frame _frame_block; ae_vector x; ae_vector y; ae_vector w; ae_frame_make(_state, &_frame_block); ae_vector_init(&x, 0, DT_REAL, _state); ae_vector_init(&y, 0, DT_REAL, _state); ae_vector_init(&w, 0, DT_REAL, _state); ae_vector_set_length(&x, 1, _state); ae_vector_set_length(&y, 1, _state); ae_vector_set_length(&w, 1, _state); x.ptr.p_double[0] = (double)(0); y.ptr.p_double[0] = (double)(0); w.ptr.p_double[0] = (double)(1); barycentricbuildxyw(&x, &y, &w, 1, b, _state); ae_frame_leave(_state); } static void testspline1dunit_lconst(double a, double b, spline1dinterpolant* c, double lstep, double* l0, double* l1, double* l2, ae_state *_state); static ae_bool testspline1dunit_enumerateallsplines(/* Real */ ae_vector* x, /* Real */ ae_vector* y, ae_int_t n, ae_int_t* splineindex, spline1dinterpolant* s, ae_state *_state); static ae_bool testspline1dunit_testunpack(spline1dinterpolant* c, /* Real */ ae_vector* x, ae_state *_state); static void testspline1dunit_unsetspline1d(spline1dinterpolant* c, ae_state *_state); static ae_bool testspline1dunit_testmonotonespline(ae_state *_state); ae_bool testspline1d(ae_bool silent, ae_state *_state) { ae_frame _frame_block; ae_bool waserrors; ae_bool crserrors; ae_bool cserrors; ae_bool hserrors; ae_bool aserrors; ae_bool lserrors; ae_bool dserrors; ae_bool uperrors; ae_bool cperrors; ae_bool lterrors; ae_bool ierrors; ae_bool monotoneerr; double threshold; ae_int_t passcount; double lstep; double h; ae_int_t maxn; ae_int_t bltype; ae_int_t brtype; ae_bool periodiccond; ae_int_t n; ae_int_t i; ae_int_t k; ae_int_t pass; ae_vector x; ae_vector y; ae_vector yp; ae_vector w; ae_vector w2; ae_vector y2; ae_vector d; ae_vector xc; ae_vector yc; ae_vector xtest; ae_int_t n2; ae_vector tmp0; ae_vector tmp1; ae_vector tmp2; ae_vector tmpx; ae_vector dc; spline1dinterpolant c; spline1dinterpolant c2; double a; double b; double bl; double br; double t; double sa; double sb; double v; double l10; double 
l11; double l12; double l20; double l21; double l22; double p0; double p1; double p2; double s; double ds; double d2s; double s2; double ds2; double d2s2; double vl; double vm; double vr; double err; double tension; double intab; ae_int_t splineindex; ae_bool result; ae_frame_make(_state, &_frame_block); ae_vector_init(&x, 0, DT_REAL, _state); ae_vector_init(&y, 0, DT_REAL, _state); ae_vector_init(&yp, 0, DT_REAL, _state); ae_vector_init(&w, 0, DT_REAL, _state); ae_vector_init(&w2, 0, DT_REAL, _state); ae_vector_init(&y2, 0, DT_REAL, _state); ae_vector_init(&d, 0, DT_REAL, _state); ae_vector_init(&xc, 0, DT_REAL, _state); ae_vector_init(&yc, 0, DT_REAL, _state); ae_vector_init(&xtest, 0, DT_REAL, _state); ae_vector_init(&tmp0, 0, DT_REAL, _state); ae_vector_init(&tmp1, 0, DT_REAL, _state); ae_vector_init(&tmp2, 0, DT_REAL, _state); ae_vector_init(&tmpx, 0, DT_REAL, _state); ae_vector_init(&dc, 0, DT_INT, _state); _spline1dinterpolant_init(&c, _state); _spline1dinterpolant_init(&c2, _state); waserrors = ae_false; passcount = 20; lstep = 0.005; h = 0.00001; maxn = 10; threshold = 10000*ae_machineepsilon; lserrors = ae_false; cserrors = ae_false; crserrors = ae_false; hserrors = ae_false; aserrors = ae_false; dserrors = ae_false; cperrors = ae_false; uperrors = ae_false; lterrors = ae_false; ierrors = ae_false; /* * General test: linear, cubic, Hermite, Akima */ for(n=2; n<=maxn; n++) { ae_vector_set_length(&x, n-1+1, _state); ae_vector_set_length(&y, n-1+1, _state); ae_vector_set_length(&yp, n-1+1, _state); ae_vector_set_length(&d, n-1+1, _state); for(pass=1; pass<=passcount; pass++) { /* * Prepare task: * * X contains abscissas from [A,B] * * Y contains function values * * YP contains periodic function values */ a = -1-ae_randomreal(_state); b = 1+ae_randomreal(_state); bl = 2*ae_randomreal(_state)-1; br = 2*ae_randomreal(_state)-1; for(i=0; i<=n-1; i++) { x.ptr.p_double[i] = 0.5*(b+a)+0.5*(b-a)*ae_cos(ae_pi*(2*i+1)/(2*n), _state); if( i==0 ) { x.ptr.p_double[i] = a; } if( i==n-1 ) { x.ptr.p_double[i] = b; } y.ptr.p_double[i] = ae_cos(1.3*ae_pi*x.ptr.p_double[i]+0.4, _state); yp.ptr.p_double[i] = y.ptr.p_double[i]; d.ptr.p_double[i] = -1.3*ae_pi*ae_sin(1.3*ae_pi*x.ptr.p_double[i]+0.4, _state); } yp.ptr.p_double[n-1] = yp.ptr.p_double[0]; for(i=0; i<=n-1; i++) { k = ae_randominteger(n, _state); if( k!=i ) { t = x.ptr.p_double[i]; x.ptr.p_double[i] = x.ptr.p_double[k]; x.ptr.p_double[k] = t; t = y.ptr.p_double[i]; y.ptr.p_double[i] = y.ptr.p_double[k]; y.ptr.p_double[k] = t; t = yp.ptr.p_double[i]; yp.ptr.p_double[i] = yp.ptr.p_double[k]; yp.ptr.p_double[k] = t; t = d.ptr.p_double[i]; d.ptr.p_double[i] = d.ptr.p_double[k]; d.ptr.p_double[k] = t; } } /* * Build linear spline * Test for general interpolation scheme properties: * * values at nodes * * continuous function * Test for specific properties is implemented below. */ spline1dbuildlinear(&x, &y, n, &c, _state); err = (double)(0); for(i=0; i<=n-1; i++) { err = ae_maxreal(err, ae_fabs(y.ptr.p_double[i]-spline1dcalc(&c, x.ptr.p_double[i], _state), _state), _state); } lserrors = lserrors||ae_fp_greater(err,threshold); testspline1dunit_lconst(a, b, &c, lstep, &l10, &l11, &l12, _state); testspline1dunit_lconst(a, b, &c, lstep/3, &l20, &l21, &l22, _state); lserrors = lserrors||ae_fp_greater(l20/l10,1.2); /* * Build cubic spline. 
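* (Boundary condition types used below: -1 = periodic, 0 = parabolic termination (the test checks that the third derivative vanishes at the endpoint), 1 = first derivative equal to BL/BR, 2 = second derivative equal to BL/BR.)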
* Test for interpolation scheme properties: * * values at nodes * * boundary conditions * * continuous function * * continuous first derivative * * continuous second derivative * * periodicity properties * * Spline1DGridDiff(), Spline1DGridDiff2() and Spline1DDiff() * calls must return same results */ for(bltype=-1; bltype<=2; bltype++) { for(brtype=-1; brtype<=2; brtype++) { /* * skip meaningless combination of boundary conditions * (one condition is periodic, another is not) */ periodiccond = bltype==-1||brtype==-1; if( periodiccond&&bltype!=brtype ) { continue; } /* * build */ if( periodiccond ) { spline1dbuildcubic(&x, &yp, n, bltype, bl, brtype, br, &c, _state); } else { spline1dbuildcubic(&x, &y, n, bltype, bl, brtype, br, &c, _state); } /* * interpolation properties */ err = (double)(0); if( periodiccond ) { /* * * check values at nodes; spline is periodic so * we add random number of periods to nodes * * we also test for periodicity of derivatives */ for(i=0; i<=n-1; i++) { v = x.ptr.p_double[i]; vm = v+(b-a)*(ae_randominteger(5, _state)-2); t = yp.ptr.p_double[i]-spline1dcalc(&c, vm, _state); err = ae_maxreal(err, ae_fabs(t, _state), _state); spline1ddiff(&c, v, &s, &ds, &d2s, _state); spline1ddiff(&c, vm, &s2, &ds2, &d2s2, _state); err = ae_maxreal(err, ae_fabs(s-s2, _state), _state); err = ae_maxreal(err, ae_fabs(ds-ds2, _state), _state); err = ae_maxreal(err, ae_fabs(d2s-d2s2, _state), _state); } /* * periodicity between nodes */ v = a+(b-a)*ae_randomreal(_state); vm = v+(b-a)*(ae_randominteger(5, _state)-2); err = ae_maxreal(err, ae_fabs(spline1dcalc(&c, v, _state)-spline1dcalc(&c, vm, _state), _state), _state); spline1ddiff(&c, v, &s, &ds, &d2s, _state); spline1ddiff(&c, vm, &s2, &ds2, &d2s2, _state); err = ae_maxreal(err, ae_fabs(s-s2, _state), _state); err = ae_maxreal(err, ae_fabs(ds-ds2, _state), _state); err = ae_maxreal(err, ae_fabs(d2s-d2s2, _state), _state); } else { /* * * check values at nodes */ for(i=0; i<=n-1; i++) { err = ae_maxreal(err, ae_fabs(y.ptr.p_double[i]-spline1dcalc(&c, x.ptr.p_double[i], _state), _state), _state); } } cserrors = cserrors||ae_fp_greater(err,threshold); /* * check boundary conditions */ err = (double)(0); if( bltype==0 ) { spline1ddiff(&c, a-h, &s, &ds, &d2s, _state); spline1ddiff(&c, a+h, &s2, &ds2, &d2s2, _state); t = (d2s2-d2s)/(2*h); err = ae_maxreal(err, ae_fabs(t, _state), _state); } if( bltype==1 ) { t = (spline1dcalc(&c, a+h, _state)-spline1dcalc(&c, a-h, _state))/(2*h); err = ae_maxreal(err, ae_fabs(bl-t, _state), _state); } if( bltype==2 ) { t = (spline1dcalc(&c, a+h, _state)-2*spline1dcalc(&c, a, _state)+spline1dcalc(&c, a-h, _state))/ae_sqr(h, _state); err = ae_maxreal(err, ae_fabs(bl-t, _state), _state); } if( brtype==0 ) { spline1ddiff(&c, b-h, &s, &ds, &d2s, _state); spline1ddiff(&c, b+h, &s2, &ds2, &d2s2, _state); t = (d2s2-d2s)/(2*h); err = ae_maxreal(err, ae_fabs(t, _state), _state); } if( brtype==1 ) { t = (spline1dcalc(&c, b+h, _state)-spline1dcalc(&c, b-h, _state))/(2*h); err = ae_maxreal(err, ae_fabs(br-t, _state), _state); } if( brtype==2 ) { t = (spline1dcalc(&c, b+h, _state)-2*spline1dcalc(&c, b, _state)+spline1dcalc(&c, b-h, _state))/ae_sqr(h, _state); err = ae_maxreal(err, ae_fabs(br-t, _state), _state); } if( bltype==-1||brtype==-1 ) { spline1ddiff(&c, a+100*ae_machineepsilon, &s, &ds, &d2s, _state); spline1ddiff(&c, b-100*ae_machineepsilon, &s2, &ds2, &d2s2, _state); err = ae_maxreal(err, ae_fabs(s-s2, _state), _state); err = ae_maxreal(err, ae_fabs(ds-ds2, _state), _state); err = ae_maxreal(err, 
ae_fabs(d2s-d2s2, _state), _state); } cserrors = cserrors||ae_fp_greater(err,1.0E-3); /* * Check Lipschitz continuity */ testspline1dunit_lconst(a, b, &c, lstep, &l10, &l11, &l12, _state); testspline1dunit_lconst(a, b, &c, lstep/3, &l20, &l21, &l22, _state); if( ae_fp_greater(l10,1.0E-6) ) { cserrors = cserrors||ae_fp_greater(l20/l10,1.2); } if( ae_fp_greater(l11,1.0E-6) ) { cserrors = cserrors||ae_fp_greater(l21/l11,1.2); } if( ae_fp_greater(l12,1.0E-6) ) { cserrors = cserrors||ae_fp_greater(l22/l12,1.2); } /* * compare spline1dgriddiff() and spline1ddiff() results */ err = (double)(0); if( periodiccond ) { spline1dgriddiffcubic(&x, &yp, n, bltype, bl, brtype, br, &tmp1, _state); } else { spline1dgriddiffcubic(&x, &y, n, bltype, bl, brtype, br, &tmp1, _state); } ae_assert(tmp1.cnt>=n, "Assertion failed", _state); for(i=0; i<=n-1; i++) { spline1ddiff(&c, x.ptr.p_double[i], &s, &ds, &d2s, _state); err = ae_maxreal(err, ae_fabs(ds-tmp1.ptr.p_double[i], _state), _state); } if( periodiccond ) { spline1dgriddiff2cubic(&x, &yp, n, bltype, bl, brtype, br, &tmp1, &tmp2, _state); } else { spline1dgriddiff2cubic(&x, &y, n, bltype, bl, brtype, br, &tmp1, &tmp2, _state); } for(i=0; i<=n-1; i++) { spline1ddiff(&c, x.ptr.p_double[i], &s, &ds, &d2s, _state); err = ae_maxreal(err, ae_fabs(ds-tmp1.ptr.p_double[i], _state), _state); err = ae_maxreal(err, ae_fabs(d2s-tmp2.ptr.p_double[i], _state), _state); } cserrors = cserrors||ae_fp_greater(err,threshold); /* * compare spline1dconv()/convdiff()/convdiff2() and spline1ddiff() results */ n2 = 2+ae_randominteger(2*n, _state); ae_vector_set_length(&tmpx, n2, _state); for(i=0; i<=n2-1; i++) { tmpx.ptr.p_double[i] = 0.5*(a+b)+(a-b)*(2*ae_randomreal(_state)-1); } err = (double)(0); if( periodiccond ) { spline1dconvcubic(&x, &yp, n, bltype, bl, brtype, br, &tmpx, n2, &tmp0, _state); } else { spline1dconvcubic(&x, &y, n, bltype, bl, brtype, br, &tmpx, n2, &tmp0, _state); } for(i=0; i<=n2-1; i++) { spline1ddiff(&c, tmpx.ptr.p_double[i], &s, &ds, &d2s, _state); err = ae_maxreal(err, ae_fabs(s-tmp0.ptr.p_double[i], _state), _state); } if( periodiccond ) { spline1dconvdiffcubic(&x, &yp, n, bltype, bl, brtype, br, &tmpx, n2, &tmp0, &tmp1, _state); } else { spline1dconvdiffcubic(&x, &y, n, bltype, bl, brtype, br, &tmpx, n2, &tmp0, &tmp1, _state); } for(i=0; i<=n2-1; i++) { spline1ddiff(&c, tmpx.ptr.p_double[i], &s, &ds, &d2s, _state); err = ae_maxreal(err, ae_fabs(s-tmp0.ptr.p_double[i], _state), _state); err = ae_maxreal(err, ae_fabs(ds-tmp1.ptr.p_double[i], _state), _state); } if( periodiccond ) { spline1dconvdiff2cubic(&x, &yp, n, bltype, bl, brtype, br, &tmpx, n2, &tmp0, &tmp1, &tmp2, _state); } else { spline1dconvdiff2cubic(&x, &y, n, bltype, bl, brtype, br, &tmpx, n2, &tmp0, &tmp1, &tmp2, _state); } for(i=0; i<=n2-1; i++) { spline1ddiff(&c, tmpx.ptr.p_double[i], &s, &ds, &d2s, _state); err = ae_maxreal(err, ae_fabs(s-tmp0.ptr.p_double[i], _state), _state); err = ae_maxreal(err, ae_fabs(ds-tmp1.ptr.p_double[i], _state), _state); err = ae_maxreal(err, ae_fabs(d2s-tmp2.ptr.p_double[i], _state), _state); } cserrors = cserrors||ae_fp_greater(err,threshold); } } /* * Build Catmull-Rom spline. 
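* (BLType=-1 selects the periodic variant, BLType=0 the non-periodic one; Tension is drawn from {0, 1, uniform on [0,1]}.)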
* Test for interpolation scheme properties: * * values at nodes * * boundary conditions * * continuous function * * continuous first derivative * * periodicity properties */ for(bltype=-1; bltype<=0; bltype++) { periodiccond = bltype==-1; /* * select random tension value, then build */ if( ae_fp_greater(ae_randomreal(_state),0.5) ) { if( ae_fp_greater(ae_randomreal(_state),0.5) ) { tension = (double)(0); } else { tension = (double)(1); } } else { tension = ae_randomreal(_state); } if( periodiccond ) { spline1dbuildcatmullrom(&x, &yp, n, bltype, tension, &c, _state); } else { spline1dbuildcatmullrom(&x, &y, n, bltype, tension, &c, _state); } /* * interpolation properties */ err = (double)(0); if( periodiccond ) { /* * * check values at nodes; spline is periodic so * we add random number of periods to nodes * * we also test for periodicity of first derivative */ for(i=0; i<=n-1; i++) { v = x.ptr.p_double[i]; vm = v+(b-a)*(ae_randominteger(5, _state)-2); t = yp.ptr.p_double[i]-spline1dcalc(&c, vm, _state); err = ae_maxreal(err, ae_fabs(t, _state), _state); spline1ddiff(&c, v, &s, &ds, &d2s, _state); spline1ddiff(&c, vm, &s2, &ds2, &d2s2, _state); err = ae_maxreal(err, ae_fabs(s-s2, _state), _state); err = ae_maxreal(err, ae_fabs(ds-ds2, _state), _state); } /* * periodicity between nodes */ v = a+(b-a)*ae_randomreal(_state); vm = v+(b-a)*(ae_randominteger(5, _state)-2); err = ae_maxreal(err, ae_fabs(spline1dcalc(&c, v, _state)-spline1dcalc(&c, vm, _state), _state), _state); spline1ddiff(&c, v, &s, &ds, &d2s, _state); spline1ddiff(&c, vm, &s2, &ds2, &d2s2, _state); err = ae_maxreal(err, ae_fabs(s-s2, _state), _state); err = ae_maxreal(err, ae_fabs(ds-ds2, _state), _state); } else { /* * * check values at nodes */ for(i=0; i<=n-1; i++) { err = ae_maxreal(err, ae_fabs(y.ptr.p_double[i]-spline1dcalc(&c, x.ptr.p_double[i], _state), _state), _state); } } crserrors = crserrors||ae_fp_greater(err,threshold); /* * check boundary conditions */ err = (double)(0); if( bltype==0 ) { spline1ddiff(&c, a-h, &s, &ds, &d2s, _state); spline1ddiff(&c, a+h, &s2, &ds2, &d2s2, _state); t = (d2s2-d2s)/(2*h); err = ae_maxreal(err, ae_fabs(t, _state), _state); spline1ddiff(&c, b-h, &s, &ds, &d2s, _state); spline1ddiff(&c, b+h, &s2, &ds2, &d2s2, _state); t = (d2s2-d2s)/(2*h); err = ae_maxreal(err, ae_fabs(t, _state), _state); } if( bltype==-1 ) { spline1ddiff(&c, a+100*ae_machineepsilon, &s, &ds, &d2s, _state); spline1ddiff(&c, b-100*ae_machineepsilon, &s2, &ds2, &d2s2, _state); err = ae_maxreal(err, ae_fabs(s-s2, _state), _state); err = ae_maxreal(err, ae_fabs(ds-ds2, _state), _state); } crserrors = crserrors||ae_fp_greater(err,1.0E-3); /* * Check Lipschitz continuity */ testspline1dunit_lconst(a, b, &c, lstep, &l10, &l11, &l12, _state); testspline1dunit_lconst(a, b, &c, lstep/3, &l20, &l21, &l22, _state); if( ae_fp_greater(l10,1.0E-6) ) { crserrors = crserrors||ae_fp_greater(l20/l10,1.2); } if( ae_fp_greater(l11,1.0E-6) ) { crserrors = crserrors||ae_fp_greater(l21/l11,1.2); } } /* * Build Hermite spline. 
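* (D holds the exact first derivatives of the sampled function at the nodes.)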
* Test for interpolation scheme properties: * * values and derivatives at nodes * * continuous function * * continuous first derivative */ spline1dbuildhermite(&x, &y, &d, n, &c, _state); err = (double)(0); for(i=0; i<=n-1; i++) { err = ae_maxreal(err, ae_fabs(y.ptr.p_double[i]-spline1dcalc(&c, x.ptr.p_double[i], _state), _state), _state); } hserrors = hserrors||ae_fp_greater(err,threshold); err = (double)(0); for(i=0; i<=n-1; i++) { t = (spline1dcalc(&c, x.ptr.p_double[i]+h, _state)-spline1dcalc(&c, x.ptr.p_double[i]-h, _state))/(2*h); err = ae_maxreal(err, ae_fabs(d.ptr.p_double[i]-t, _state), _state); } hserrors = hserrors||ae_fp_greater(err,1.0E-3); testspline1dunit_lconst(a, b, &c, lstep, &l10, &l11, &l12, _state); testspline1dunit_lconst(a, b, &c, lstep/3, &l20, &l21, &l22, _state); hserrors = hserrors||ae_fp_greater(l20/l10,1.2); hserrors = hserrors||ae_fp_greater(l21/l11,1.2); /* * Build Akima spline * Test for general interpolation scheme properties: * * values at nodes * * continuous function * * continuous first derivative * Test for Akima-specific properties is implemented below. */ spline1dbuildakima(&x, &y, n, &c, _state); err = (double)(0); for(i=0; i<=n-1; i++) { err = ae_maxreal(err, ae_fabs(y.ptr.p_double[i]-spline1dcalc(&c, x.ptr.p_double[i], _state), _state), _state); } aserrors = aserrors||ae_fp_greater(err,threshold); testspline1dunit_lconst(a, b, &c, lstep, &l10, &l11, &l12, _state); testspline1dunit_lconst(a, b, &c, lstep/3, &l20, &l21, &l22, _state); hserrors = hserrors||(ae_fp_greater(l10,1.0E-10)&&ae_fp_greater(l20/l10,1.2)); hserrors = hserrors||(ae_fp_greater(l11,1.0E-10)&&ae_fp_greater(l21/l11,1.2)); } } /* * Special linear spline test: * test for linearity between x[i] and x[i+1] */ for(n=2; n<=maxn; n++) { ae_vector_set_length(&x, n-1+1, _state); ae_vector_set_length(&y, n-1+1, _state); /* * Prepare task */ a = (double)(-1); b = (double)(1); for(i=0; i<=n-1; i++) { x.ptr.p_double[i] = a+(b-a)*i/(n-1); y.ptr.p_double[i] = 2*ae_randomreal(_state)-1; } spline1dbuildlinear(&x, &y, n, &c, _state); /* * Test */ err = (double)(0); for(k=0; k<=n-2; k++) { a = x.ptr.p_double[k]; b = x.ptr.p_double[k+1]; for(pass=1; pass<=passcount; pass++) { t = a+(b-a)*ae_randomreal(_state); v = y.ptr.p_double[k]+(t-a)/(b-a)*(y.ptr.p_double[k+1]-y.ptr.p_double[k]); err = ae_maxreal(err, ae_fabs(spline1dcalc(&c, t, _state)-v, _state), _state); } } lserrors = lserrors||ae_fp_greater(err,threshold); } /* * Special Akima test: test outlier sensitivity * Spline value at (x[i], x[i+1]) should depend from * f[i-2], f[i-1], f[i], f[i+1], f[i+2], f[i+3] only. 
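* Implementation: one node value at a time is replaced by an outlier (5.0), and the perturbed spline is compared with the unperturbed one outside [x[k-3], x[k+3]].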
*/ for(n=5; n<=maxn; n++) { ae_vector_set_length(&x, n-1+1, _state); ae_vector_set_length(&y, n-1+1, _state); ae_vector_set_length(&y2, n-1+1, _state); /* * Prepare unperturbed Akima spline */ a = (double)(-1); b = (double)(1); for(i=0; i<=n-1; i++) { x.ptr.p_double[i] = a+(b-a)*i/(n-1); y.ptr.p_double[i] = ae_cos(1.3*ae_pi*x.ptr.p_double[i]+0.4, _state); } spline1dbuildakima(&x, &y, n, &c, _state); /* * Process perturbed tasks */ err = (double)(0); for(k=0; k<=n-1; k++) { ae_v_move(&y2.ptr.p_double[0], 1, &y.ptr.p_double[0], 1, ae_v_len(0,n-1)); y2.ptr.p_double[k] = (double)(5); spline1dbuildakima(&x, &y2, n, &c2, _state); /* * Test left part independence */ if( k-3>=1 ) { a = (double)(-1); b = x.ptr.p_double[k-3]; for(pass=1; pass<=passcount; pass++) { t = a+(b-a)*ae_randomreal(_state); err = ae_maxreal(err, ae_fabs(spline1dcalc(&c, t, _state)-spline1dcalc(&c2, t, _state), _state), _state); } } /* * Test right part independence */ if( k+3<=n-2 ) { a = x.ptr.p_double[k+3]; b = (double)(1); for(pass=1; pass<=passcount; pass++) { t = a+(b-a)*ae_randomreal(_state); err = ae_maxreal(err, ae_fabs(spline1dcalc(&c, t, _state)-spline1dcalc(&c2, t, _state), _state), _state); } } } aserrors = aserrors||ae_fp_greater(err,threshold); } /* * Differentiation, copy/unpack test */ for(n=2; n<=maxn; n++) { ae_vector_set_length(&x, n-1+1, _state); ae_vector_set_length(&y, n-1+1, _state); /* * Prepare cubic spline */ a = -1-ae_randomreal(_state); b = 1+ae_randomreal(_state); for(i=0; i<=n-1; i++) { x.ptr.p_double[i] = a+(b-a)*i/(n-1); y.ptr.p_double[i] = ae_cos(1.3*ae_pi*x.ptr.p_double[i]+0.4, _state); } spline1dbuildcubic(&x, &y, n, 2, 0.0, 2, 0.0, &c, _state); /* * Test diff */ err = (double)(0); for(pass=1; pass<=passcount; pass++) { t = a+(b-a)*ae_randomreal(_state); spline1ddiff(&c, t, &s, &ds, &d2s, _state); vl = spline1dcalc(&c, t-h, _state); vm = spline1dcalc(&c, t, _state); vr = spline1dcalc(&c, t+h, _state); err = ae_maxreal(err, ae_fabs(s-vm, _state), _state); err = ae_maxreal(err, ae_fabs(ds-(vr-vl)/(2*h), _state), _state); err = ae_maxreal(err, ae_fabs(d2s-(vr-2*vm+vl)/ae_sqr(h, _state), _state), _state); } dserrors = dserrors||ae_fp_greater(err,0.001); /* * Test copy */ testspline1dunit_unsetspline1d(&c2, _state); spline1dcopy(&c, &c2, _state); err = (double)(0); for(pass=1; pass<=passcount; pass++) { t = a+(b-a)*ae_randomreal(_state); err = ae_maxreal(err, ae_fabs(spline1dcalc(&c, t, _state)-spline1dcalc(&c2, t, _state), _state), _state); } cperrors = cperrors||ae_fp_greater(err,threshold); /* * Test unpack */ uperrors = uperrors||!testspline1dunit_testunpack(&c, &x, _state); } /* * Linear translation errors */ for(n=2; n<=maxn; n++) { /* * Prepare: * * X, Y - grid points * * XTest - test points */ ae_vector_set_length(&x, n, _state); ae_vector_set_length(&y, n, _state); a = -1-ae_randomreal(_state); b = 1+ae_randomreal(_state); for(i=0; i<=n-1; i++) { x.ptr.p_double[i] = a+(b-a)*(i+0.2*ae_randomreal(_state)-0.1)/(n-1); y.ptr.p_double[i] = 2*ae_randomreal(_state)-1; } ae_vector_set_length(&xtest, 5*n+2, _state); for(i=0; i<=xtest.cnt-1; i++) { xtest.ptr.p_double[i] = a+(b-a)*(i-1)/(xtest.cnt-3); } splineindex = 0; while(testspline1dunit_enumerateallsplines(&x, &y, n, &splineindex, &c, _state)) { /* * LinTransX, general A */ sa = 4*ae_randomreal(_state)-2; sb = 2*ae_randomreal(_state)-1; spline1dcopy(&c, &c2, _state); spline1dlintransx(&c2, sa, sb, _state); for(i=0; i<=xtest.cnt-1; i++) { lterrors = lterrors||ae_fp_greater(ae_fabs(spline1dcalc(&c, xtest.ptr.p_double[i], 
_state)-spline1dcalc(&c2, (xtest.ptr.p_double[i]-sb)/sa, _state), _state),threshold); } /* * LinTransX, special case: A=0 */ sb = 2*ae_randomreal(_state)-1; spline1dcopy(&c, &c2, _state); spline1dlintransx(&c2, (double)(0), sb, _state); for(i=0; i<=xtest.cnt-1; i++) { lterrors = lterrors||ae_fp_greater(ae_fabs(spline1dcalc(&c, sb, _state)-spline1dcalc(&c2, xtest.ptr.p_double[i], _state), _state),threshold); } /* * LinTransY */ sa = 2*ae_randomreal(_state)-1; sb = 2*ae_randomreal(_state)-1; spline1dcopy(&c, &c2, _state); spline1dlintransy(&c2, sa, sb, _state); for(i=0; i<=xtest.cnt-1; i++) { lterrors = lterrors||ae_fp_greater(ae_fabs(sa*spline1dcalc(&c, xtest.ptr.p_double[i], _state)+sb-spline1dcalc(&c2, xtest.ptr.p_double[i], _state), _state),threshold); } } } /* * Testing integration. * Three tests are performed: * * * approximate test (well behaved smooth function, many points, * integration inside [a,b]), non-periodic spline * * * exact test (integration of parabola, outside of [a,b], non-periodic spline * * * approximate test for periodic splines. F(x)=cos(2*pi*x)+1. * Period length is equals to 1.0, so all operations with * multiples of period are done exactly. For each value of PERIOD * we calculate and test integral at four points: * - 0 < t0 < PERIOD * - t1 = PERIOD-eps * - t2 = PERIOD * - t3 = PERIOD+eps */ err = (double)(0); for(n=20; n<=35; n++) { ae_vector_set_length(&x, n-1+1, _state); ae_vector_set_length(&y, n-1+1, _state); for(pass=1; pass<=passcount; pass++) { /* * Prepare cubic spline */ a = -1-0.2*ae_randomreal(_state); b = 1+0.2*ae_randomreal(_state); for(i=0; i<=n-1; i++) { x.ptr.p_double[i] = a+(b-a)*i/(n-1); y.ptr.p_double[i] = ae_sin(ae_pi*x.ptr.p_double[i]+0.4, _state)+ae_exp(x.ptr.p_double[i], _state); } bl = ae_pi*ae_cos(ae_pi*a+0.4, _state)+ae_exp(a, _state); br = ae_pi*ae_cos(ae_pi*b+0.4, _state)+ae_exp(b, _state); spline1dbuildcubic(&x, &y, n, 1, bl, 1, br, &c, _state); /* * Test */ t = a+(b-a)*ae_randomreal(_state); v = -ae_cos(ae_pi*a+0.4, _state)/ae_pi+ae_exp(a, _state); v = -ae_cos(ae_pi*t+0.4, _state)/ae_pi+ae_exp(t, _state)-v; v = v-spline1dintegrate(&c, t, _state); err = ae_maxreal(err, ae_fabs(v, _state), _state); } } ierrors = ierrors||ae_fp_greater(err,0.001); p0 = 2*ae_randomreal(_state)-1; p1 = 2*ae_randomreal(_state)-1; p2 = 2*ae_randomreal(_state)-1; a = -ae_randomreal(_state)-0.5; b = ae_randomreal(_state)+0.5; n = 2; ae_vector_set_length(&x, n, _state); ae_vector_set_length(&y, n, _state); ae_vector_set_length(&d, n, _state); x.ptr.p_double[0] = a; y.ptr.p_double[0] = p0+p1*a+p2*ae_sqr(a, _state); d.ptr.p_double[0] = p1+2*p2*a; x.ptr.p_double[1] = b; y.ptr.p_double[1] = p0+p1*b+p2*ae_sqr(b, _state); d.ptr.p_double[1] = p1+2*p2*b; spline1dbuildhermite(&x, &y, &d, n, &c, _state); bl = ae_minreal(a, b, _state)-ae_fabs(b-a, _state); br = ae_minreal(a, b, _state)+ae_fabs(b-a, _state); err = (double)(0); for(pass=1; pass<=100; pass++) { t = bl+(br-bl)*ae_randomreal(_state); v = p0*t+p1*ae_sqr(t, _state)/2+p2*ae_sqr(t, _state)*t/3-(p0*a+p1*ae_sqr(a, _state)/2+p2*ae_sqr(a, _state)*a/3); v = v-spline1dintegrate(&c, t, _state); err = ae_maxreal(err, ae_fabs(v, _state), _state); } ierrors = ierrors||ae_fp_greater(err,threshold); n = 100; ae_vector_set_length(&x, n, _state); ae_vector_set_length(&y, n, _state); for(i=0; i<=n-1; i++) { x.ptr.p_double[i] = (double)i/(double)(n-1); y.ptr.p_double[i] = ae_cos(2*ae_pi*x.ptr.p_double[i], _state)+1; } y.ptr.p_double[0] = (double)(2); y.ptr.p_double[n-1] = (double)(2); spline1dbuildcubic(&x, &y, n, -1, 0.0, -1, 
0.0, &c, _state); intab = spline1dintegrate(&c, 1.0, _state); v = ae_randomreal(_state); vr = spline1dintegrate(&c, v, _state); ierrors = ierrors||ae_fp_greater(ae_fabs(intab-1, _state),0.001); for(i=-10; i<=10; i++) { ierrors = ierrors||ae_fp_greater(ae_fabs(spline1dintegrate(&c, i+v, _state)-(i*intab+vr), _state),0.001); ierrors = ierrors||ae_fp_greater(ae_fabs(spline1dintegrate(&c, i-1000*ae_machineepsilon, _state)-i*intab, _state),0.001); ierrors = ierrors||ae_fp_greater(ae_fabs(spline1dintegrate(&c, (double)(i), _state)-i*intab, _state),0.001); ierrors = ierrors||ae_fp_greater(ae_fabs(spline1dintegrate(&c, i+1000*ae_machineepsilon, _state)-i*intab, _state),0.001); } /* * Test fo monotone cubic Hermit interpolation */ monotoneerr = testspline1dunit_testmonotonespline(_state); /* * report */ waserrors = (((((((((lserrors||cserrors)||crserrors)||hserrors)||aserrors)||dserrors)||cperrors)||uperrors)||lterrors)||ierrors)||monotoneerr; if( !silent ) { printf("TESTING SPLINE INTERPOLATION\n"); /* * Normal tests */ printf("LINEAR SPLINE TEST: "); if( lserrors ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("CUBIC SPLINE TEST: "); if( cserrors ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("CATMULL-ROM SPLINE TEST: "); if( crserrors ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("HERMITE SPLINE TEST: "); if( hserrors ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("AKIMA SPLINE TEST: "); if( aserrors ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("DIFFERENTIATION TEST: "); if( dserrors ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("COPY/SERIALIZATION TEST: "); if( cperrors ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("UNPACK TEST: "); if( uperrors ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("LIN.TRANS. TEST: "); if( lterrors ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("INTEGRATION TEST: "); if( ierrors ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("TEST MONOTONE CUBIC HERMITE SPLINE: "); if( monotoneerr ) { printf("FAILED\n"); } else { printf("OK\n"); } if( waserrors ) { printf("TEST FAILED\n"); } else { printf("TEST PASSED\n"); } printf("\n\n"); } /* * end */ result = !waserrors; ae_frame_leave(_state); return result; } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. *************************************************************************/ ae_bool _pexec_testspline1d(ae_bool silent, ae_state *_state) { return testspline1d(silent, _state); } /************************************************************************* Lipschitz constants for spline inself, first and second derivatives. 
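* Constants are estimated by finite differences with step LStep over the slightly extended interval [A-0.1, B+0.1].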
*************************************************************************/ static void testspline1dunit_lconst(double a, double b, spline1dinterpolant* c, double lstep, double* l0, double* l1, double* l2, ae_state *_state) { double t; double vl; double vm; double vr; double prevf; double prevd; double prevd2; double f; double d; double d2; *l0 = 0; *l1 = 0; *l2 = 0; *l0 = (double)(0); *l1 = (double)(0); *l2 = (double)(0); t = a-0.1; vl = spline1dcalc(c, t-2*lstep, _state); vm = spline1dcalc(c, t-lstep, _state); vr = spline1dcalc(c, t, _state); f = vm; d = (vr-vl)/(2*lstep); d2 = (vr-2*vm+vl)/ae_sqr(lstep, _state); while(ae_fp_less_eq(t,b+0.1)) { prevf = f; prevd = d; prevd2 = d2; vl = vm; vm = vr; vr = spline1dcalc(c, t+lstep, _state); f = vm; d = (vr-vl)/(2*lstep); d2 = (vr-2*vm+vl)/ae_sqr(lstep, _state); *l0 = ae_maxreal(*l0, ae_fabs((f-prevf)/lstep, _state), _state); *l1 = ae_maxreal(*l1, ae_fabs((d-prevd)/lstep, _state), _state); *l2 = ae_maxreal(*l2, ae_fabs((d2-prevd2)/lstep, _state), _state); t = t+lstep; } } /************************************************************************* This function is used to enumerate all spline types which can be built from given dataset. It should be used as follows: > > init X, Y, N > SplineIndex:=0; > while EnumerateAllSplines(X, Y, N, SplineIndex, S) do > begin > do something with S > end; > On initial call EnumerateAllSplines accepts: * dataset X, Y, number of points N (N>=2) * SplineIndex, equal to 0 It returns: * True, in case there is a spline type which corresponds to SplineIndex. In this case S contains spline which was built using X/Y and spline type, as specified by input value of SplineIndex. SplineIndex is advanced to the next value. * False, in case SplineIndex contains past-the-end value, spline is not built. This function tries different variants of linear/cubic, periodic/nonperiodic splines. 
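The same loop, written as a hypothetical C caller sketch (local variable names are illustrative only):
>
> splineindex = 0;
> while( testspline1dunit_enumerateallsplines(&x, &y, n, &splineindex, &s, _state) )
> {
>     do something with S
> }
>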
*************************************************************************/ static ae_bool testspline1dunit_enumerateallsplines(/* Real */ ae_vector* x, /* Real */ ae_vector* y, ae_int_t n, ae_int_t* splineindex, spline1dinterpolant* s, ae_state *_state) { ae_int_t idxoffs; ae_bool result; _spline1dinterpolant_clear(s); ae_assert(*splineindex>=0, "Assertion failed", _state); result = ae_false; if( *splineindex==0 ) { /* * Linear spline */ spline1dbuildlinear(x, y, n, s, _state); *splineindex = *splineindex+1; result = ae_true; return result; } else { if( *splineindex>=1&&*splineindex<11 ) { /* * Cubic spline, either periodic or non-periodic */ idxoffs = *splineindex-1; if( idxoffs==9 ) { /* * Periodic spline */ spline1dbuildcubic(x, y, n, -1, 0.0, -1, 0.0, s, _state); } else { /* * Non-periodic spline */ spline1dbuildcubic(x, y, n, idxoffs/3, 2*ae_randomreal(_state)-1, idxoffs%3, 2*ae_randomreal(_state)-1, s, _state); } *splineindex = *splineindex+1; result = ae_true; return result; } } return result; } /************************************************************************* Unpack testing *************************************************************************/ static ae_bool testspline1dunit_testunpack(spline1dinterpolant* c, /* Real */ ae_vector* x, ae_state *_state) { ae_frame _frame_block; ae_int_t i; ae_int_t n; double err; double t; double v1; double v2; ae_int_t pass; ae_int_t passcount; ae_matrix tbl; ae_bool result; ae_frame_make(_state, &_frame_block); ae_matrix_init(&tbl, 0, 0, DT_REAL, _state); passcount = 20; err = (double)(0); spline1dunpack(c, &n, &tbl, _state); for(i=0; i<=n-2; i++) { for(pass=1; pass<=passcount; pass++) { t = ae_randomreal(_state)*(tbl.ptr.pp_double[i][1]-tbl.ptr.pp_double[i][0]); v1 = tbl.ptr.pp_double[i][2]+t*tbl.ptr.pp_double[i][3]+ae_sqr(t, _state)*tbl.ptr.pp_double[i][4]+t*ae_sqr(t, _state)*tbl.ptr.pp_double[i][5]; v2 = spline1dcalc(c, tbl.ptr.pp_double[i][0]+t, _state); err = ae_maxreal(err, ae_fabs(v1-v2, _state), _state); } } for(i=0; i<=n-2; i++) { err = ae_maxreal(err, ae_fabs(x->ptr.p_double[i]-tbl.ptr.pp_double[i][0], _state), _state); } for(i=0; i<=n-2; i++) { err = ae_maxreal(err, ae_fabs(x->ptr.p_double[i+1]-tbl.ptr.pp_double[i][1], _state), _state); } result = ae_fp_less(err,100*ae_machineepsilon); ae_frame_leave(_state); return result; } /************************************************************************* Unset spline, i.e. initialize it with random garbage *************************************************************************/ static void testspline1dunit_unsetspline1d(spline1dinterpolant* c, ae_state *_state) { ae_frame _frame_block; ae_vector x; ae_vector y; ae_vector d; ae_frame_make(_state, &_frame_block); _spline1dinterpolant_clear(c); ae_vector_init(&x, 0, DT_REAL, _state); ae_vector_init(&y, 0, DT_REAL, _state); ae_vector_init(&d, 0, DT_REAL, _state); ae_vector_set_length(&x, 2, _state); ae_vector_set_length(&y, 2, _state); ae_vector_set_length(&d, 2, _state); x.ptr.p_double[0] = (double)(-1); y.ptr.p_double[0] = ae_randomreal(_state); d.ptr.p_double[0] = ae_randomreal(_state); x.ptr.p_double[1] = (double)(1); y.ptr.p_double[1] = ae_randomreal(_state); d.ptr.p_double[1] = ae_randomreal(_state); spline1dbuildhermite(&x, &y, &d, 2, c, _state); ae_frame_leave(_state); } /************************************************************************* Tests that built spline is monotone. 
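Several cases are covered: N=2 with equal endpoint values (the result must coincide with a
zero-derivative Hermite spline and stay constant beyond the interval), N=3 with a single interior
maximum, N=5 staircase-like data with prescribed node values and derivatives, an N=4 dataset whose
interior point is swept over a grid while monotonicity on [0,1] is checked on a fine mesh, and
randomized piecewise-monotone and constant datasets built from several segments.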
*************************************************************************/ static ae_bool testspline1dunit_testmonotonespline(ae_state *_state) { ae_frame _frame_block; spline1dinterpolant c; spline1dinterpolant s2; double c0; double c1; ae_vector x; ae_vector y; ae_vector d; ae_int_t m; ae_vector n; ae_int_t alln; ae_int_t shift; double sign0; double sign1; double r; double st; double eps; double delta; double v; double dv; double d2v; ae_int_t nseg; ae_int_t npoints; ae_int_t tp; ae_int_t pass; ae_int_t passcount; ae_int_t i; ae_int_t j; ae_int_t k; ae_int_t kmax; ae_int_t l; ae_bool result; ae_frame_make(_state, &_frame_block); _spline1dinterpolant_init(&c, _state); _spline1dinterpolant_init(&s2, _state); ae_vector_init(&x, 0, DT_REAL, _state); ae_vector_init(&y, 0, DT_REAL, _state); ae_vector_init(&d, 0, DT_REAL, _state); ae_vector_init(&n, 0, DT_INT, _state); eps = 100*ae_machineepsilon; /* * Special test - N=2. * * Following properties are tested: * * monotone spline must be equal to the Hermite spline with * zero derivative at the ends * * monotone spline is constant beyond left/right boundaries */ ae_vector_set_length(&x, 2, _state); ae_vector_set_length(&y, 2, _state); ae_vector_set_length(&d, 2, _state); x.ptr.p_double[0] = -0.1-ae_randomreal(_state); y.ptr.p_double[0] = 2*ae_randomreal(_state)-1; d.ptr.p_double[0] = 0.0; x.ptr.p_double[1] = 0.1+ae_randomreal(_state); y.ptr.p_double[1] = y.ptr.p_double[0]; d.ptr.p_double[1] = 0.0; spline1dbuildmonotone(&x, &y, 2, &c, _state); spline1dbuildhermite(&x, &y, &d, 2, &s2, _state); v = 2*ae_randomreal(_state)-1; if( ae_fp_greater(ae_fabs(spline1dcalc(&c, v, _state)-spline1dcalc(&s2, v, _state), _state),eps) ) { result = ae_true; ae_frame_leave(_state); return result; } if( ae_fp_neq(spline1dcalc(&c, (double)(-5), _state),y.ptr.p_double[0]) ) { result = ae_true; ae_frame_leave(_state); return result; } if( ae_fp_neq(spline1dcalc(&c, (double)(5), _state),y.ptr.p_double[0]) ) { result = ae_true; ae_frame_leave(_state); return result; } /* * Special test - N=3, x=[0,1,2], y=[0,1,0]. * Monotone spline must be equal to the Hermite spline with * zero derivative at all points. */ ae_vector_set_length(&x, 3, _state); ae_vector_set_length(&y, 3, _state); ae_vector_set_length(&d, 3, _state); x.ptr.p_double[0] = 0.0; y.ptr.p_double[0] = 0.0; d.ptr.p_double[0] = 0.0; x.ptr.p_double[1] = 1.0; y.ptr.p_double[1] = 1.0; d.ptr.p_double[1] = 0.0; x.ptr.p_double[2] = 2.0; y.ptr.p_double[2] = 0.0; d.ptr.p_double[2] = 0.0; spline1dbuildmonotone(&x, &y, 3, &c, _state); spline1dbuildhermite(&x, &y, &d, 3, &s2, _state); for(i=0; i<=10; i++) { v = x.ptr.p_double[0]+(double)i/(double)10*(x.ptr.p_double[2]-x.ptr.p_double[0]); if( ae_fp_greater(ae_fabs(spline1dcalc(&c, v, _state)-spline1dcalc(&s2, v, _state), _state),eps) ) { result = ae_true; ae_frame_leave(_state); return result; } } /* * Special test - N=5, x=[0,1,2,3,4], y=[0,1,1,2,3]. 
* * 1) spline passes through all prescribed points * 2) spline derivative at all points except x=3 is exactly zero * 3) spline derivative at x=3 is 1.0 (within machine epsilon) */ ae_vector_set_length(&x, 5, _state); ae_vector_set_length(&y, 5, _state); x.ptr.p_double[0] = 0.0; y.ptr.p_double[0] = 0.0; x.ptr.p_double[1] = 1.0; y.ptr.p_double[1] = 1.0; x.ptr.p_double[2] = 2.0; y.ptr.p_double[2] = 1.0; x.ptr.p_double[3] = 3.0; y.ptr.p_double[3] = 2.0; x.ptr.p_double[4] = 4.0; y.ptr.p_double[4] = 3.0; spline1dbuildmonotone(&x, &y, 5, &c, _state); for(i=0; i<=4; i++) { spline1ddiff(&c, x.ptr.p_double[i], &v, &dv, &d2v, _state); if( ae_fp_greater(ae_fabs(v-y.ptr.p_double[i], _state),eps) ) { result = ae_true; ae_frame_leave(_state); return result; } if( (ae_fp_eq(x.ptr.p_double[i],3.0)&&ae_fp_greater(ae_fabs(dv-1.0, _state),eps))||(ae_fp_neq(x.ptr.p_double[i],3.0)&&ae_fp_neq(dv,(double)(0))) ) { result = ae_true; ae_frame_leave(_state); return result; } } /* * Special test: * * N=4 * * three fixed points - (0,0), (1,1), (2,0) * * one special point (x,y) with x in [0.1,0.9], y in [0.1,0.9] * * monotonicity of the interpolant at [0,1] is checked with very small step 1/KMax */ ae_vector_set_length(&x, 4, _state); ae_vector_set_length(&y, 4, _state); x.ptr.p_double[0] = 0.0; y.ptr.p_double[0] = 0.0; x.ptr.p_double[2] = 1.0; y.ptr.p_double[2] = 1.0; x.ptr.p_double[3] = 2.0; y.ptr.p_double[3] = 0.0; for(i=1; i<=9; i++) { for(j=1; j<=9; j++) { x.ptr.p_double[1] = (double)i/(double)10; y.ptr.p_double[1] = (double)j/(double)10; spline1dbuildmonotone(&x, &y, 4, &c, _state); kmax = 1000; for(k=0; k<=kmax-1; k++) { if( ae_fp_greater(spline1dcalc(&c, (double)k/(double)kmax, _state),spline1dcalc(&c, (double)(k+1)/(double)kmax, _state)) ) { result = ae_true; ae_frame_leave(_state); return result; } } } } /* * General case */ delta = (double)(0); nseg = 10; npoints = 15; passcount = 30; for(pass=1; pass<=passcount; pass++) { tp = ae_randominteger(6, _state)+4; r = (double)(ae_randominteger(76, _state)+25); m = ae_randominteger(nseg, _state)+1; ae_vector_set_length(&n, m, _state); alln = 0; for(i=0; i<=m-1; i++) { n.ptr.p_int[i] = ae_randominteger(npoints, _state)+2; alln = alln+n.ptr.p_int[i]; } ae_vector_set_length(&x, alln, _state); ae_vector_set_length(&y, alln, _state); x.ptr.p_double[0] = r*(2*ae_randomreal(_state)-1); y.ptr.p_double[0] = r*(2*ae_randomreal(_state)-1); /* * Builds monotone function */ st = 0.1+0.7*ae_randomreal(_state); shift = 0; sign0 = ae_pow((double)(-1), (double)(0), _state); for(i=0; i<=m-1; i++) { for(j=1; j<=n.ptr.p_int[i]-1; j++) { x.ptr.p_double[shift+j] = x.ptr.p_double[shift+j-1]+st+ae_randomreal(_state); delta = ae_maxreal(delta, x.ptr.p_double[shift+j]-x.ptr.p_double[shift+j-1], _state); y.ptr.p_double[shift+j] = y.ptr.p_double[shift+j-1]+sign0*(st+ae_randomreal(_state)); } shift = shift+n.ptr.p_int[i]; if( i!=m-1 ) { sign0 = ae_pow((double)(-1), (double)(i+1), _state); x.ptr.p_double[shift] = x.ptr.p_double[shift-1]+st+ae_randomreal(_state); y.ptr.p_double[shift] = y.ptr.p_double[shift-1]+sign0*ae_randomreal(_state); } } delta = 3*delta; spline1dbuildmonotone(&x, &y, alln, &c, _state); /* * Check that built function is monotone */ shift = 0; for(i=0; i<=m-1; i++) { for(j=1; j<=n.ptr.p_int[i]-1; j++) { st = (x.ptr.p_double[shift+j]-x.ptr.p_double[shift+j-1])/tp; sign0 = y.ptr.p_double[shift+j]-y.ptr.p_double[shift+j-1]; if( ae_fp_neq(sign0,(double)(0)) ) { sign0 = sign0/ae_fabs(sign0, _state); } for(l=0; l<=tp-1; l++) { c0 = spline1dcalc(&c, x.ptr.p_double[shift+j-1]+l*st, 
_state); c1 = spline1dcalc(&c, x.ptr.p_double[shift+j-1]+(l+1)*st, _state); sign1 = c1-c0; if( ae_fp_neq(sign1,(double)(0)) ) { sign1 = sign1/ae_fabs(sign1, _state); } if( ae_fp_less(sign0*sign1,(double)(0)) ) { result = ae_true; ae_frame_leave(_state); return result; } } } } c0 = spline1dcalc(&c, x.ptr.p_double[0]-delta, _state); c1 = spline1dcalc(&c, x.ptr.p_double[0], _state); if( ae_fp_greater(ae_fabs(c0-c1, _state),eps) ) { result = ae_true; ae_frame_leave(_state); return result; } c0 = spline1dcalc(&c, x.ptr.p_double[alln-1], _state); c1 = spline1dcalc(&c, x.ptr.p_double[alln-1]+delta, _state); if( ae_fp_greater(ae_fabs(c0-c1, _state),eps) ) { result = ae_true; ae_frame_leave(_state); return result; } /* * Builds constant function */ y.ptr.p_double[0] = r*(2*ae_randomreal(_state)-1); for(i=1; i<=alln-1; i++) { y.ptr.p_double[i] = y.ptr.p_double[0]; } spline1dbuildmonotone(&x, &y, alln, &c, _state); shift = 0; for(i=0; i<=m-1; i++) { for(j=1; j<=n.ptr.p_int[i]-1; j++) { st = (x.ptr.p_double[shift+j]-x.ptr.p_double[shift+j-1])/tp; sign0 = y.ptr.p_double[shift+j]-y.ptr.p_double[shift+j-1]; for(l=0; l<=tp-1; l++) { c0 = spline1dcalc(&c, x.ptr.p_double[shift+j-1]+l*st, _state); c1 = spline1dcalc(&c, x.ptr.p_double[shift+j-1]+(l+1)*st, _state); sign1 = c1-c0; if( ae_fp_greater(sign0,eps)||ae_fp_greater(sign1,eps) ) { result = ae_true; ae_frame_leave(_state); return result; } } } } } result = ae_false; ae_frame_leave(_state); return result; } static void testparametricunit_testrdp(ae_bool* errorflag, ae_state *_state); static void testparametricunit_unsetp2(pspline2interpolant* p, ae_state *_state); static void testparametricunit_unsetp3(pspline3interpolant* p, ae_state *_state); ae_bool testparametric(ae_bool silent, ae_state *_state) { ae_frame _frame_block; ae_bool waserrors; ae_bool p2errors; ae_bool p3errors; ae_bool rdperrors; double nonstrictthreshold; double threshold; ae_int_t maxn; ae_int_t periodicity; ae_int_t skind; ae_int_t pkind; ae_bool periodic; double a; double b; ae_int_t n; ae_int_t tmpn; ae_int_t i; double vx; double vy; double vz; double vx2; double vy2; double vz2; double vdx; double vdy; double vdz; double vdx2; double vdy2; double vdz2; double vd2x; double vd2y; double vd2z; double vd2x2; double vd2y2; double vd2z2; double v0; double v1; ae_vector x; ae_vector y; ae_vector z; ae_vector t; ae_vector t2; ae_vector t3; ae_matrix xy; ae_matrix xyz; pspline2interpolant p2; pspline3interpolant p3; spline1dinterpolant s; ae_bool result; ae_frame_make(_state, &_frame_block); ae_vector_init(&x, 0, DT_REAL, _state); ae_vector_init(&y, 0, DT_REAL, _state); ae_vector_init(&z, 0, DT_REAL, _state); ae_vector_init(&t, 0, DT_REAL, _state); ae_vector_init(&t2, 0, DT_REAL, _state); ae_vector_init(&t3, 0, DT_REAL, _state); ae_matrix_init(&xy, 0, 0, DT_REAL, _state); ae_matrix_init(&xyz, 0, 0, DT_REAL, _state); _pspline2interpolant_init(&p2, _state); _pspline3interpolant_init(&p3, _state); _spline1dinterpolant_init(&s, _state); waserrors = ae_false; maxn = 10; threshold = 10000*ae_machineepsilon; nonstrictthreshold = 0.00001; p2errors = ae_false; p3errors = ae_false; rdperrors = ae_false; testparametricunit_testrdp(&rdperrors, _state); /* * Test basic properties of 2- and 3-dimensional splines: * * PSpline2ParameterValues() properties * * values at nodes * * for periodic splines - periodicity properties * * Variables used: * * N points count * * SKind spline * * PKind parameterization * * Periodicity whether we have periodic spline or not */ for(n=2; n<=maxn; n++) { for(skind=0; 
skind<=2; skind++) { for(pkind=0; pkind<=2; pkind++) { for(periodicity=0; periodicity<=1; periodicity++) { periodic = periodicity==1; /* * skip unsupported combinations of parameters */ if( periodic&&n<3 ) { continue; } if( periodic&&skind==0 ) { continue; } if( n<5&&skind==0 ) { continue; } /* * init */ ae_matrix_set_length(&xy, n, 2, _state); ae_matrix_set_length(&xyz, n, 3, _state); taskgenint1dequidist((double)(-1), (double)(1), n, &t2, &x, _state); ae_v_move(&xy.ptr.pp_double[0][0], xy.stride, &x.ptr.p_double[0], 1, ae_v_len(0,n-1)); ae_v_move(&xyz.ptr.pp_double[0][0], xyz.stride, &x.ptr.p_double[0], 1, ae_v_len(0,n-1)); taskgenint1dequidist((double)(-1), (double)(1), n, &t2, &y, _state); ae_v_move(&xy.ptr.pp_double[0][1], xy.stride, &y.ptr.p_double[0], 1, ae_v_len(0,n-1)); ae_v_move(&xyz.ptr.pp_double[0][1], xyz.stride, &y.ptr.p_double[0], 1, ae_v_len(0,n-1)); taskgenint1dequidist((double)(-1), (double)(1), n, &t2, &z, _state); ae_v_move(&xyz.ptr.pp_double[0][2], xyz.stride, &z.ptr.p_double[0], 1, ae_v_len(0,n-1)); testparametricunit_unsetp2(&p2, _state); testparametricunit_unsetp3(&p3, _state); if( periodic ) { pspline2buildperiodic(&xy, n, skind, pkind, &p2, _state); pspline3buildperiodic(&xyz, n, skind, pkind, &p3, _state); } else { pspline2build(&xy, n, skind, pkind, &p2, _state); pspline3build(&xyz, n, skind, pkind, &p3, _state); } /* * PSpline2ParameterValues() properties */ pspline2parametervalues(&p2, &tmpn, &t2, _state); if( tmpn!=n ) { p2errors = ae_true; continue; } pspline3parametervalues(&p3, &tmpn, &t3, _state); if( tmpn!=n ) { p3errors = ae_true; continue; } p2errors = p2errors||ae_fp_neq(t2.ptr.p_double[0],(double)(0)); p3errors = p3errors||ae_fp_neq(t3.ptr.p_double[0],(double)(0)); for(i=1; i<=n-1; i++) { p2errors = p2errors||ae_fp_less_eq(t2.ptr.p_double[i],t2.ptr.p_double[i-1]); p3errors = p3errors||ae_fp_less_eq(t3.ptr.p_double[i],t3.ptr.p_double[i-1]); } if( periodic ) { p2errors = p2errors||ae_fp_greater_eq(t2.ptr.p_double[n-1],(double)(1)); p3errors = p3errors||ae_fp_greater_eq(t3.ptr.p_double[n-1],(double)(1)); } else { p2errors = p2errors||ae_fp_neq(t2.ptr.p_double[n-1],(double)(1)); p3errors = p3errors||ae_fp_neq(t3.ptr.p_double[n-1],(double)(1)); } /* * Now we have parameter values stored at T, * and want to test whether the actully correspond to * points */ for(i=0; i<=n-1; i++) { /* * 2-dimensional test */ pspline2calc(&p2, t2.ptr.p_double[i], &vx, &vy, _state); p2errors = p2errors||ae_fp_greater(ae_fabs(vx-x.ptr.p_double[i], _state),threshold); p2errors = p2errors||ae_fp_greater(ae_fabs(vy-y.ptr.p_double[i], _state),threshold); /* * 3-dimensional test */ pspline3calc(&p3, t3.ptr.p_double[i], &vx, &vy, &vz, _state); p3errors = p3errors||ae_fp_greater(ae_fabs(vx-x.ptr.p_double[i], _state),threshold); p3errors = p3errors||ae_fp_greater(ae_fabs(vy-y.ptr.p_double[i], _state),threshold); p3errors = p3errors||ae_fp_greater(ae_fabs(vz-z.ptr.p_double[i], _state),threshold); } /* * Test periodicity (if needed) */ if( periodic ) { /* * periodicity at nodes */ for(i=0; i<=n-1; i++) { /* * 2-dimensional test */ pspline2calc(&p2, t2.ptr.p_double[i]+ae_randominteger(10, _state)-5, &vx, &vy, _state); p2errors = p2errors||ae_fp_greater(ae_fabs(vx-x.ptr.p_double[i], _state),threshold); p2errors = p2errors||ae_fp_greater(ae_fabs(vy-y.ptr.p_double[i], _state),threshold); pspline2diff(&p2, t2.ptr.p_double[i]+ae_randominteger(10, _state)-5, &vx, &vdx, &vy, &vdy, _state); p2errors = p2errors||ae_fp_greater(ae_fabs(vx-x.ptr.p_double[i], _state),threshold); p2errors = 
p2errors||ae_fp_greater(ae_fabs(vy-y.ptr.p_double[i], _state),threshold); pspline2diff2(&p2, t2.ptr.p_double[i]+ae_randominteger(10, _state)-5, &vx, &vdx, &vd2x, &vy, &vdy, &vd2y, _state); p2errors = p2errors||ae_fp_greater(ae_fabs(vx-x.ptr.p_double[i], _state),threshold); p2errors = p2errors||ae_fp_greater(ae_fabs(vy-y.ptr.p_double[i], _state),threshold); /* * 3-dimensional test */ pspline3calc(&p3, t3.ptr.p_double[i]+ae_randominteger(10, _state)-5, &vx, &vy, &vz, _state); p3errors = p3errors||ae_fp_greater(ae_fabs(vx-x.ptr.p_double[i], _state),threshold); p3errors = p3errors||ae_fp_greater(ae_fabs(vy-y.ptr.p_double[i], _state),threshold); p3errors = p3errors||ae_fp_greater(ae_fabs(vz-z.ptr.p_double[i], _state),threshold); pspline3diff(&p3, t3.ptr.p_double[i]+ae_randominteger(10, _state)-5, &vx, &vdx, &vy, &vdy, &vz, &vdz, _state); p3errors = p3errors||ae_fp_greater(ae_fabs(vx-x.ptr.p_double[i], _state),threshold); p3errors = p3errors||ae_fp_greater(ae_fabs(vy-y.ptr.p_double[i], _state),threshold); p3errors = p3errors||ae_fp_greater(ae_fabs(vz-z.ptr.p_double[i], _state),threshold); pspline3diff2(&p3, t3.ptr.p_double[i]+ae_randominteger(10, _state)-5, &vx, &vdx, &vd2x, &vy, &vdy, &vd2y, &vz, &vdz, &vd2z, _state); p3errors = p3errors||ae_fp_greater(ae_fabs(vx-x.ptr.p_double[i], _state),threshold); p3errors = p3errors||ae_fp_greater(ae_fabs(vy-y.ptr.p_double[i], _state),threshold); p3errors = p3errors||ae_fp_greater(ae_fabs(vz-z.ptr.p_double[i], _state),threshold); } /* * periodicity between nodes */ v0 = ae_randomreal(_state); pspline2calc(&p2, v0, &vx, &vy, _state); pspline2calc(&p2, v0+ae_randominteger(10, _state)-5, &vx2, &vy2, _state); p2errors = p2errors||ae_fp_greater(ae_fabs(vx-vx2, _state),threshold); p2errors = p2errors||ae_fp_greater(ae_fabs(vy-vy2, _state),threshold); pspline3calc(&p3, v0, &vx, &vy, &vz, _state); pspline3calc(&p3, v0+ae_randominteger(10, _state)-5, &vx2, &vy2, &vz2, _state); p3errors = p3errors||ae_fp_greater(ae_fabs(vx-vx2, _state),threshold); p3errors = p3errors||ae_fp_greater(ae_fabs(vy-vy2, _state),threshold); p3errors = p3errors||ae_fp_greater(ae_fabs(vz-vz2, _state),threshold); /* * near-boundary test for continuity of function values and derivatives: * 2-dimensional curve */ ae_assert(skind==1||skind==2, "TEST: unexpected spline type!", _state); v0 = 100*ae_machineepsilon; v1 = 1-v0; pspline2calc(&p2, v0, &vx, &vy, _state); pspline2calc(&p2, v1, &vx2, &vy2, _state); p2errors = p2errors||ae_fp_greater(ae_fabs(vx-vx2, _state),threshold); p2errors = p2errors||ae_fp_greater(ae_fabs(vy-vy2, _state),threshold); pspline2diff(&p2, v0, &vx, &vdx, &vy, &vdy, _state); pspline2diff(&p2, v1, &vx2, &vdx2, &vy2, &vdy2, _state); p2errors = p2errors||ae_fp_greater(ae_fabs(vx-vx2, _state),threshold); p2errors = p2errors||ae_fp_greater(ae_fabs(vy-vy2, _state),threshold); p2errors = p2errors||ae_fp_greater(ae_fabs(vdx-vdx2, _state),nonstrictthreshold); p2errors = p2errors||ae_fp_greater(ae_fabs(vdy-vdy2, _state),nonstrictthreshold); pspline2diff2(&p2, v0, &vx, &vdx, &vd2x, &vy, &vdy, &vd2y, _state); pspline2diff2(&p2, v1, &vx2, &vdx2, &vd2x2, &vy2, &vdy2, &vd2y2, _state); p2errors = p2errors||ae_fp_greater(ae_fabs(vx-vx2, _state),threshold); p2errors = p2errors||ae_fp_greater(ae_fabs(vy-vy2, _state),threshold); p2errors = p2errors||ae_fp_greater(ae_fabs(vdx-vdx2, _state),nonstrictthreshold); p2errors = p2errors||ae_fp_greater(ae_fabs(vdy-vdy2, _state),nonstrictthreshold); if( skind==2 ) { /* * second derivative test only for cubic splines */ p2errors = 
p2errors||ae_fp_greater(ae_fabs(vd2x-vd2x2, _state),nonstrictthreshold); p2errors = p2errors||ae_fp_greater(ae_fabs(vd2y-vd2y2, _state),nonstrictthreshold); } /* * near-boundary test for continuity of function values and derivatives: * 3-dimensional curve */ ae_assert(skind==1||skind==2, "TEST: unexpected spline type!", _state); v0 = 100*ae_machineepsilon; v1 = 1-v0; pspline3calc(&p3, v0, &vx, &vy, &vz, _state); pspline3calc(&p3, v1, &vx2, &vy2, &vz2, _state); p3errors = p3errors||ae_fp_greater(ae_fabs(vx-vx2, _state),threshold); p3errors = p3errors||ae_fp_greater(ae_fabs(vy-vy2, _state),threshold); p3errors = p3errors||ae_fp_greater(ae_fabs(vz-vz2, _state),threshold); pspline3diff(&p3, v0, &vx, &vdx, &vy, &vdy, &vz, &vdz, _state); pspline3diff(&p3, v1, &vx2, &vdx2, &vy2, &vdy2, &vz2, &vdz2, _state); p3errors = p3errors||ae_fp_greater(ae_fabs(vx-vx2, _state),threshold); p3errors = p3errors||ae_fp_greater(ae_fabs(vy-vy2, _state),threshold); p3errors = p3errors||ae_fp_greater(ae_fabs(vz-vz2, _state),threshold); p3errors = p3errors||ae_fp_greater(ae_fabs(vdx-vdx2, _state),nonstrictthreshold); p3errors = p3errors||ae_fp_greater(ae_fabs(vdy-vdy2, _state),nonstrictthreshold); p3errors = p3errors||ae_fp_greater(ae_fabs(vdz-vdz2, _state),nonstrictthreshold); pspline3diff2(&p3, v0, &vx, &vdx, &vd2x, &vy, &vdy, &vd2y, &vz, &vdz, &vd2z, _state); pspline3diff2(&p3, v1, &vx2, &vdx2, &vd2x2, &vy2, &vdy2, &vd2y2, &vz2, &vdz2, &vd2z2, _state); p3errors = p3errors||ae_fp_greater(ae_fabs(vx-vx2, _state),threshold); p3errors = p3errors||ae_fp_greater(ae_fabs(vy-vy2, _state),threshold); p3errors = p3errors||ae_fp_greater(ae_fabs(vz-vz2, _state),threshold); p3errors = p3errors||ae_fp_greater(ae_fabs(vdx-vdx2, _state),nonstrictthreshold); p3errors = p3errors||ae_fp_greater(ae_fabs(vdy-vdy2, _state),nonstrictthreshold); p3errors = p3errors||ae_fp_greater(ae_fabs(vdz-vdz2, _state),nonstrictthreshold); if( skind==2 ) { /* * second derivative test only for cubic splines */ p3errors = p3errors||ae_fp_greater(ae_fabs(vd2x-vd2x2, _state),nonstrictthreshold); p3errors = p3errors||ae_fp_greater(ae_fabs(vd2y-vd2y2, _state),nonstrictthreshold); p3errors = p3errors||ae_fp_greater(ae_fabs(vd2z-vd2z2, _state),nonstrictthreshold); } } } } } } /* * Test differentiation, tangents, calculation between nodes. * * Because differentiation is done in parameterization/spline/periodicity * oblivious manner, we don't have to test all possible combinations * of spline types and parameterizations. * * Actually we test special combination with properties which allow us * to easily solve this problem: * * 2 (3) variables * * first variable is sampled from equidistant grid on [0,1] * * other variables are random * * uniform parameterization is used * * periodicity - none * * spline type - any (we use cubic splines) * Same problem allows us to test calculation BETWEEN nodes. 
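 *
 * Tangent vectors returned by PSpline2Tangent/PSpline3Tangent are also checked here:
 * they must equal the derivative vector normalized to unit length, e.g.
 * (X'(t),Y'(t))/sqrt(X'(t)^2+Y'(t)^2), where the derivatives are taken from the
 * component-wise Spline1DDiff results and the norm is computed via SafePythag2/SafePythag3.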
*/ for(n=2; n<=maxn; n++) { /* * init */ ae_matrix_set_length(&xy, n, 2, _state); ae_matrix_set_length(&xyz, n, 3, _state); taskgenint1dequidist((double)(0), (double)(1), n, &t, &x, _state); ae_v_move(&xy.ptr.pp_double[0][0], xy.stride, &x.ptr.p_double[0], 1, ae_v_len(0,n-1)); ae_v_move(&xyz.ptr.pp_double[0][0], xyz.stride, &x.ptr.p_double[0], 1, ae_v_len(0,n-1)); taskgenint1dequidist((double)(0), (double)(1), n, &t, &y, _state); ae_v_move(&xy.ptr.pp_double[0][1], xy.stride, &y.ptr.p_double[0], 1, ae_v_len(0,n-1)); ae_v_move(&xyz.ptr.pp_double[0][1], xyz.stride, &y.ptr.p_double[0], 1, ae_v_len(0,n-1)); taskgenint1dequidist((double)(0), (double)(1), n, &t, &z, _state); ae_v_move(&xyz.ptr.pp_double[0][2], xyz.stride, &z.ptr.p_double[0], 1, ae_v_len(0,n-1)); testparametricunit_unsetp2(&p2, _state); testparametricunit_unsetp3(&p3, _state); pspline2build(&xy, n, 2, 0, &p2, _state); pspline3build(&xyz, n, 2, 0, &p3, _state); /* * Test 2D/3D spline: * * build non-parametric cubic spline from T and X/Y * * calculate its value and derivatives at V0 * * compare with Spline2Calc/Spline2Diff/Spline2Diff2 * Because of task properties both variants should * return same answer. */ v0 = ae_randomreal(_state); spline1dbuildcubic(&t, &x, n, 0, 0.0, 0, 0.0, &s, _state); spline1ddiff(&s, v0, &vx2, &vdx2, &vd2x2, _state); spline1dbuildcubic(&t, &y, n, 0, 0.0, 0, 0.0, &s, _state); spline1ddiff(&s, v0, &vy2, &vdy2, &vd2y2, _state); spline1dbuildcubic(&t, &z, n, 0, 0.0, 0, 0.0, &s, _state); spline1ddiff(&s, v0, &vz2, &vdz2, &vd2z2, _state); /* * 2D test */ pspline2calc(&p2, v0, &vx, &vy, _state); p2errors = p2errors||ae_fp_greater(ae_fabs(vx-vx2, _state),threshold); p2errors = p2errors||ae_fp_greater(ae_fabs(vy-vy2, _state),threshold); pspline2diff(&p2, v0, &vx, &vdx, &vy, &vdy, _state); p2errors = p2errors||ae_fp_greater(ae_fabs(vx-vx2, _state),threshold); p2errors = p2errors||ae_fp_greater(ae_fabs(vy-vy2, _state),threshold); p2errors = p2errors||ae_fp_greater(ae_fabs(vdx-vdx2, _state),threshold); p2errors = p2errors||ae_fp_greater(ae_fabs(vdy-vdy2, _state),threshold); pspline2diff2(&p2, v0, &vx, &vdx, &vd2x, &vy, &vdy, &vd2y, _state); p2errors = p2errors||ae_fp_greater(ae_fabs(vx-vx2, _state),threshold); p2errors = p2errors||ae_fp_greater(ae_fabs(vy-vy2, _state),threshold); p2errors = p2errors||ae_fp_greater(ae_fabs(vdx-vdx2, _state),threshold); p2errors = p2errors||ae_fp_greater(ae_fabs(vdy-vdy2, _state),threshold); p2errors = p2errors||ae_fp_greater(ae_fabs(vd2x-vd2x2, _state),threshold); p2errors = p2errors||ae_fp_greater(ae_fabs(vd2y-vd2y2, _state),threshold); /* * 3D test */ pspline3calc(&p3, v0, &vx, &vy, &vz, _state); p3errors = p3errors||ae_fp_greater(ae_fabs(vx-vx2, _state),threshold); p3errors = p3errors||ae_fp_greater(ae_fabs(vy-vy2, _state),threshold); p3errors = p3errors||ae_fp_greater(ae_fabs(vz-vz2, _state),threshold); pspline3diff(&p3, v0, &vx, &vdx, &vy, &vdy, &vz, &vdz, _state); p3errors = p3errors||ae_fp_greater(ae_fabs(vx-vx2, _state),threshold); p3errors = p3errors||ae_fp_greater(ae_fabs(vy-vy2, _state),threshold); p3errors = p3errors||ae_fp_greater(ae_fabs(vz-vz2, _state),threshold); p3errors = p3errors||ae_fp_greater(ae_fabs(vdx-vdx2, _state),threshold); p3errors = p3errors||ae_fp_greater(ae_fabs(vdy-vdy2, _state),threshold); p3errors = p3errors||ae_fp_greater(ae_fabs(vdz-vdz2, _state),threshold); pspline3diff2(&p3, v0, &vx, &vdx, &vd2x, &vy, &vdy, &vd2y, &vz, &vdz, &vd2z, _state); p3errors = p3errors||ae_fp_greater(ae_fabs(vx-vx2, _state),threshold); p3errors = 
p3errors||ae_fp_greater(ae_fabs(vy-vy2, _state),threshold); p3errors = p3errors||ae_fp_greater(ae_fabs(vz-vz2, _state),threshold); p3errors = p3errors||ae_fp_greater(ae_fabs(vdx-vdx2, _state),threshold); p3errors = p3errors||ae_fp_greater(ae_fabs(vdy-vdy2, _state),threshold); p3errors = p3errors||ae_fp_greater(ae_fabs(vdz-vdz2, _state),threshold); p3errors = p3errors||ae_fp_greater(ae_fabs(vd2x-vd2x2, _state),threshold); p3errors = p3errors||ae_fp_greater(ae_fabs(vd2y-vd2y2, _state),threshold); p3errors = p3errors||ae_fp_greater(ae_fabs(vd2z-vd2z2, _state),threshold); /* * Test tangents for 2D/3D */ pspline2tangent(&p2, v0, &vx, &vy, _state); p2errors = p2errors||ae_fp_greater(ae_fabs(vx-vdx2/safepythag2(vdx2, vdy2, _state), _state),threshold); p2errors = p2errors||ae_fp_greater(ae_fabs(vy-vdy2/safepythag2(vdx2, vdy2, _state), _state),threshold); pspline3tangent(&p3, v0, &vx, &vy, &vz, _state); p3errors = p3errors||ae_fp_greater(ae_fabs(vx-vdx2/safepythag3(vdx2, vdy2, vdz2, _state), _state),threshold); p3errors = p3errors||ae_fp_greater(ae_fabs(vy-vdy2/safepythag3(vdx2, vdy2, vdz2, _state), _state),threshold); p3errors = p3errors||ae_fp_greater(ae_fabs(vz-vdz2/safepythag3(vdx2, vdy2, vdz2, _state), _state),threshold); } /* * Arc length test. * * Simple problem with easy solution (points on a straight line with * uniform parameterization). */ for(n=2; n<=maxn; n++) { ae_matrix_set_length(&xy, n, 2, _state); ae_matrix_set_length(&xyz, n, 3, _state); for(i=0; i<=n-1; i++) { xy.ptr.pp_double[i][0] = (double)(i); xy.ptr.pp_double[i][1] = (double)(i); xyz.ptr.pp_double[i][0] = (double)(i); xyz.ptr.pp_double[i][1] = (double)(i); xyz.ptr.pp_double[i][2] = (double)(i); } pspline2build(&xy, n, 1, 0, &p2, _state); pspline3build(&xyz, n, 1, 0, &p3, _state); a = ae_randomreal(_state); b = ae_randomreal(_state); p2errors = p2errors||ae_fp_greater(ae_fabs(pspline2arclength(&p2, a, b, _state)-(b-a)*ae_sqrt((double)(2), _state)*(n-1), _state),nonstrictthreshold); p3errors = p3errors||ae_fp_greater(ae_fabs(pspline3arclength(&p3, a, b, _state)-(b-a)*ae_sqrt((double)(3), _state)*(n-1), _state),nonstrictthreshold); } /* * report */ waserrors = (p2errors||p3errors)||rdperrors; if( !silent ) { printf("TESTING PARAMETRIC INTERPOLATION\n"); /* * Normal tests */ printf("2D SPLINES: "); if( p2errors ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("3D SPLINES: "); if( p3errors ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("RDP: "); if( rdperrors ) { printf("FAILED\n"); } else { printf("OK\n"); } if( waserrors ) { printf("TEST FAILED\n"); } else { printf("TEST PASSED\n"); } printf("\n\n"); } /* * end */ result = !waserrors; ae_frame_leave(_state); return result; } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. *************************************************************************/ ae_bool _pexec_testparametric(ae_bool silent, ae_state *_state) { return testparametric(silent, _state); } /************************************************************************* This function tests 4PL/5PL fitting. On error sets FitErrors flag variable; on success - flag is not changed. 
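In particular, the body below exercises the ParametricRDPFixed() curve simplification routine:
small open and closed polylines with hand-checked answers are processed with both the eps-based and
the section-count-based stopping criteria, and for random smooth multidimensional curves the
eps-based output is verified against the error bound and against a second run with the section
count fixed to the value returned by the first run.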
*************************************************************************/ static void testparametricunit_testrdp(ae_bool* errorflag, ae_state *_state) { ae_frame _frame_block; hqrndstate rs; ae_vector x; ae_vector y; ae_vector e; ae_vector x2; ae_vector y2; ae_vector x3; ae_vector y3; ae_matrix xy; ae_matrix xy2; ae_matrix xy3; ae_vector idx2; ae_vector idx3; ae_int_t nsections; ae_int_t nsections3; double eps; double v; ae_int_t i; ae_int_t j; ae_int_t n; ae_int_t d; spline1dinterpolant s; ae_frame_make(_state, &_frame_block); _hqrndstate_init(&rs, _state); ae_vector_init(&x, 0, DT_REAL, _state); ae_vector_init(&y, 0, DT_REAL, _state); ae_vector_init(&e, 0, DT_REAL, _state); ae_vector_init(&x2, 0, DT_REAL, _state); ae_vector_init(&y2, 0, DT_REAL, _state); ae_vector_init(&x3, 0, DT_REAL, _state); ae_vector_init(&y3, 0, DT_REAL, _state); ae_matrix_init(&xy, 0, 0, DT_REAL, _state); ae_matrix_init(&xy2, 0, 0, DT_REAL, _state); ae_matrix_init(&xy3, 0, 0, DT_REAL, _state); ae_vector_init(&idx2, 0, DT_INT, _state); ae_vector_init(&idx3, 0, DT_INT, _state); _spline1dinterpolant_init(&s, _state); hqrndrandomize(&rs, _state); /* * Parametric test 1: non-closed curve */ ae_matrix_set_length(&xy, 4, 2, _state); xy.ptr.pp_double[0][0] = (double)(0); xy.ptr.pp_double[0][1] = (double)(0); xy.ptr.pp_double[1][0] = (double)(1); xy.ptr.pp_double[1][1] = (double)(2); xy.ptr.pp_double[2][0] = (double)(3); xy.ptr.pp_double[2][1] = (double)(1); xy.ptr.pp_double[3][0] = (double)(3); xy.ptr.pp_double[3][1] = (double)(3); parametricrdpfixed(&xy, 4, 2, 0, ae_sqrt((double)(2), _state)+0.001, &xy2, &idx2, &nsections, _state); seterrorflag(errorflag, nsections!=1, _state); if( nsections==1 ) { seterrorflag(errorflag, ae_fp_neq(xy2.ptr.pp_double[0][0],(double)(0)), _state); seterrorflag(errorflag, ae_fp_neq(xy2.ptr.pp_double[0][1],(double)(0)), _state); seterrorflag(errorflag, idx2.ptr.p_int[0]!=0, _state); seterrorflag(errorflag, ae_fp_neq(xy2.ptr.pp_double[1][0],(double)(3)), _state); seterrorflag(errorflag, ae_fp_neq(xy2.ptr.pp_double[1][1],(double)(3)), _state); seterrorflag(errorflag, idx2.ptr.p_int[1]!=3, _state); } parametricrdpfixed(&xy, 4, 2, 0, ae_sqrt((double)(2), _state)-0.001, &xy2, &idx2, &nsections, _state); seterrorflag(errorflag, nsections!=3, _state); if( nsections==3 ) { seterrorflag(errorflag, ae_fp_neq(xy2.ptr.pp_double[0][0],(double)(0)), _state); seterrorflag(errorflag, ae_fp_neq(xy2.ptr.pp_double[0][1],(double)(0)), _state); seterrorflag(errorflag, idx2.ptr.p_int[0]!=0, _state); seterrorflag(errorflag, ae_fp_neq(xy2.ptr.pp_double[1][0],(double)(1)), _state); seterrorflag(errorflag, ae_fp_neq(xy2.ptr.pp_double[1][1],(double)(2)), _state); seterrorflag(errorflag, idx2.ptr.p_int[1]!=1, _state); seterrorflag(errorflag, ae_fp_neq(xy2.ptr.pp_double[2][0],(double)(3)), _state); seterrorflag(errorflag, ae_fp_neq(xy2.ptr.pp_double[2][1],(double)(1)), _state); seterrorflag(errorflag, idx2.ptr.p_int[2]!=2, _state); seterrorflag(errorflag, ae_fp_neq(xy2.ptr.pp_double[3][0],(double)(3)), _state); seterrorflag(errorflag, ae_fp_neq(xy2.ptr.pp_double[3][1],(double)(3)), _state); seterrorflag(errorflag, idx2.ptr.p_int[3]!=3, _state); } parametricrdpfixed(&xy, 4, 2, 1, 0.0, &xy2, &idx2, &nsections, _state); seterrorflag(errorflag, nsections!=1, _state); if( nsections==1 ) { seterrorflag(errorflag, ae_fp_neq(xy2.ptr.pp_double[0][0],(double)(0)), _state); seterrorflag(errorflag, ae_fp_neq(xy2.ptr.pp_double[0][1],(double)(0)), _state); seterrorflag(errorflag, idx2.ptr.p_int[0]!=0, _state); seterrorflag(errorflag, 
ae_fp_neq(xy2.ptr.pp_double[1][0],(double)(3)), _state); seterrorflag(errorflag, ae_fp_neq(xy2.ptr.pp_double[1][1],(double)(3)), _state); seterrorflag(errorflag, idx2.ptr.p_int[1]!=3, _state); } parametricrdpfixed(&xy, 4, 2, 2, 0.0, &xy2, &idx2, &nsections, _state); seterrorflag(errorflag, nsections!=2, _state); if( nsections==2 ) { seterrorflag(errorflag, ae_fp_neq(xy2.ptr.pp_double[0][0],(double)(0)), _state); seterrorflag(errorflag, ae_fp_neq(xy2.ptr.pp_double[0][1],(double)(0)), _state); seterrorflag(errorflag, idx2.ptr.p_int[0]!=0, _state); seterrorflag(errorflag, ae_fp_neq(xy2.ptr.pp_double[1][0],(double)(3)), _state); seterrorflag(errorflag, ae_fp_neq(xy2.ptr.pp_double[1][1],(double)(1)), _state); seterrorflag(errorflag, idx2.ptr.p_int[1]!=2, _state); seterrorflag(errorflag, ae_fp_neq(xy2.ptr.pp_double[2][0],(double)(3)), _state); seterrorflag(errorflag, ae_fp_neq(xy2.ptr.pp_double[2][1],(double)(3)), _state); seterrorflag(errorflag, idx2.ptr.p_int[2]!=3, _state); } parametricrdpfixed(&xy, 4, 2, 3, 0.0, &xy2, &idx2, &nsections, _state); seterrorflag(errorflag, nsections!=3, _state); if( nsections==3 ) { seterrorflag(errorflag, ae_fp_neq(xy2.ptr.pp_double[0][0],(double)(0)), _state); seterrorflag(errorflag, ae_fp_neq(xy2.ptr.pp_double[0][1],(double)(0)), _state); seterrorflag(errorflag, idx2.ptr.p_int[0]!=0, _state); seterrorflag(errorflag, ae_fp_neq(xy2.ptr.pp_double[1][0],(double)(1)), _state); seterrorflag(errorflag, ae_fp_neq(xy2.ptr.pp_double[1][1],(double)(2)), _state); seterrorflag(errorflag, idx2.ptr.p_int[1]!=1, _state); seterrorflag(errorflag, ae_fp_neq(xy2.ptr.pp_double[2][0],(double)(3)), _state); seterrorflag(errorflag, ae_fp_neq(xy2.ptr.pp_double[2][1],(double)(1)), _state); seterrorflag(errorflag, idx2.ptr.p_int[2]!=2, _state); seterrorflag(errorflag, ae_fp_neq(xy2.ptr.pp_double[3][0],(double)(3)), _state); seterrorflag(errorflag, ae_fp_neq(xy2.ptr.pp_double[3][1],(double)(3)), _state); seterrorflag(errorflag, idx2.ptr.p_int[3]!=3, _state); } parametricrdpfixed(&xy, 4, 2, 4, 0.0, &xy2, &idx2, &nsections, _state); seterrorflag(errorflag, nsections!=3, _state); if( nsections==3 ) { seterrorflag(errorflag, ae_fp_neq(xy2.ptr.pp_double[0][0],(double)(0)), _state); seterrorflag(errorflag, ae_fp_neq(xy2.ptr.pp_double[0][1],(double)(0)), _state); seterrorflag(errorflag, idx2.ptr.p_int[0]!=0, _state); seterrorflag(errorflag, ae_fp_neq(xy2.ptr.pp_double[1][0],(double)(1)), _state); seterrorflag(errorflag, ae_fp_neq(xy2.ptr.pp_double[1][1],(double)(2)), _state); seterrorflag(errorflag, idx2.ptr.p_int[1]!=1, _state); seterrorflag(errorflag, ae_fp_neq(xy2.ptr.pp_double[2][0],(double)(3)), _state); seterrorflag(errorflag, ae_fp_neq(xy2.ptr.pp_double[2][1],(double)(1)), _state); seterrorflag(errorflag, idx2.ptr.p_int[2]!=2, _state); seterrorflag(errorflag, ae_fp_neq(xy2.ptr.pp_double[3][0],(double)(3)), _state); seterrorflag(errorflag, ae_fp_neq(xy2.ptr.pp_double[3][1],(double)(3)), _state); seterrorflag(errorflag, idx2.ptr.p_int[3]!=3, _state); } /* * Parametric test 2: closed curve */ ae_matrix_set_length(&xy, 5, 2, _state); xy.ptr.pp_double[0][0] = (double)(0); xy.ptr.pp_double[0][1] = (double)(0); xy.ptr.pp_double[1][0] = (double)(1); xy.ptr.pp_double[1][1] = (double)(0); xy.ptr.pp_double[2][0] = (double)(1); xy.ptr.pp_double[2][1] = (double)(1); xy.ptr.pp_double[3][0] = (double)(0); xy.ptr.pp_double[3][1] = (double)(1); xy.ptr.pp_double[4][0] = (double)(0); xy.ptr.pp_double[4][1] = (double)(0); parametricrdpfixed(&xy, 5, 2, 0, ae_sqrt((double)(2), _state)+0.001, &xy2, &idx2, 
&nsections, _state); seterrorflag(errorflag, nsections!=1, _state); if( nsections==1 ) { seterrorflag(errorflag, ae_fp_neq(xy2.ptr.pp_double[0][0],(double)(0)), _state); seterrorflag(errorflag, ae_fp_neq(xy2.ptr.pp_double[0][1],(double)(0)), _state); seterrorflag(errorflag, idx2.ptr.p_int[0]!=0, _state); seterrorflag(errorflag, ae_fp_neq(xy2.ptr.pp_double[1][0],(double)(0)), _state); seterrorflag(errorflag, ae_fp_neq(xy2.ptr.pp_double[1][1],(double)(0)), _state); seterrorflag(errorflag, idx2.ptr.p_int[1]!=4, _state); } parametricrdpfixed(&xy, 5, 2, 0, ae_sqrt((double)(2), _state)-0.001, &xy2, &idx2, &nsections, _state); seterrorflag(errorflag, nsections!=2, _state); if( nsections==2 ) { seterrorflag(errorflag, ae_fp_neq(xy2.ptr.pp_double[0][0],(double)(0)), _state); seterrorflag(errorflag, ae_fp_neq(xy2.ptr.pp_double[0][1],(double)(0)), _state); seterrorflag(errorflag, idx2.ptr.p_int[0]!=0, _state); seterrorflag(errorflag, ae_fp_neq(xy2.ptr.pp_double[1][0],(double)(1)), _state); seterrorflag(errorflag, ae_fp_neq(xy2.ptr.pp_double[1][1],(double)(1)), _state); seterrorflag(errorflag, idx2.ptr.p_int[1]!=2, _state); seterrorflag(errorflag, ae_fp_neq(xy2.ptr.pp_double[2][0],(double)(0)), _state); seterrorflag(errorflag, ae_fp_neq(xy2.ptr.pp_double[2][1],(double)(0)), _state); seterrorflag(errorflag, idx2.ptr.p_int[2]!=4, _state); } parametricrdpfixed(&xy, 5, 2, 0, ae_sqrt((double)(2), _state)/2+0.001, &xy2, &idx2, &nsections, _state); seterrorflag(errorflag, nsections!=2, _state); if( nsections==2 ) { seterrorflag(errorflag, ae_fp_neq(xy2.ptr.pp_double[0][0],(double)(0)), _state); seterrorflag(errorflag, ae_fp_neq(xy2.ptr.pp_double[0][1],(double)(0)), _state); seterrorflag(errorflag, idx2.ptr.p_int[0]!=0, _state); seterrorflag(errorflag, ae_fp_neq(xy2.ptr.pp_double[1][0],(double)(1)), _state); seterrorflag(errorflag, ae_fp_neq(xy2.ptr.pp_double[1][1],(double)(1)), _state); seterrorflag(errorflag, idx2.ptr.p_int[1]!=2, _state); seterrorflag(errorflag, ae_fp_neq(xy2.ptr.pp_double[2][0],(double)(0)), _state); seterrorflag(errorflag, ae_fp_neq(xy2.ptr.pp_double[2][1],(double)(0)), _state); seterrorflag(errorflag, idx2.ptr.p_int[2]!=4, _state); } parametricrdpfixed(&xy, 5, 2, 0, ae_sqrt((double)(2), _state)/2-0.001, &xy2, &idx2, &nsections, _state); seterrorflag(errorflag, nsections!=4, _state); if( nsections==4 ) { seterrorflag(errorflag, ae_fp_neq(xy2.ptr.pp_double[0][0],(double)(0)), _state); seterrorflag(errorflag, ae_fp_neq(xy2.ptr.pp_double[0][1],(double)(0)), _state); seterrorflag(errorflag, idx2.ptr.p_int[0]!=0, _state); seterrorflag(errorflag, ae_fp_neq(xy2.ptr.pp_double[1][0],(double)(1)), _state); seterrorflag(errorflag, ae_fp_neq(xy2.ptr.pp_double[1][1],(double)(0)), _state); seterrorflag(errorflag, idx2.ptr.p_int[1]!=1, _state); seterrorflag(errorflag, ae_fp_neq(xy2.ptr.pp_double[2][0],(double)(1)), _state); seterrorflag(errorflag, ae_fp_neq(xy2.ptr.pp_double[2][1],(double)(1)), _state); seterrorflag(errorflag, idx2.ptr.p_int[2]!=2, _state); seterrorflag(errorflag, ae_fp_neq(xy2.ptr.pp_double[3][0],(double)(0)), _state); seterrorflag(errorflag, ae_fp_neq(xy2.ptr.pp_double[3][1],(double)(1)), _state); seterrorflag(errorflag, idx2.ptr.p_int[3]!=3, _state); seterrorflag(errorflag, ae_fp_neq(xy2.ptr.pp_double[4][0],(double)(0)), _state); seterrorflag(errorflag, ae_fp_neq(xy2.ptr.pp_double[4][1],(double)(0)), _state); seterrorflag(errorflag, idx2.ptr.p_int[4]!=4, _state); } /* * Parametric, variable precision test (non-fixed), results are compared against fixed-section test */ eps = 10.0; n 
= 100; while(ae_fp_greater_eq(eps,0.0001)) { /* * Try different dimension counts */ for(d=1; d<=5; d++) { /* * Generate dataset */ ae_matrix_set_length(&xy, n, d, _state); for(i=0; i<=n-1; i++) { v = ae_pi*i/(n-1); for(j=0; j<=d-1; j++) { xy.ptr.pp_double[i][j] = ae_pow(ae_sin(v, _state), (double)(j+1), _state)+0.01*(hqrnduniformr(&rs, _state)-0.5); } } /* * Perform run of eps-based RDP algorithm */ parametricrdpfixed(&xy, n, d, 0, eps, &xy2, &idx2, &nsections, _state); seterrorflag(errorflag, nsections==0, _state); if( nsections==0 ) { ae_frame_leave(_state); return; } /* * Check properties */ seterrorflag(errorflag, idx2.ptr.p_int[0]!=0, _state); for(i=0; i<=nsections-1; i++) { seterrorflag(errorflag, idx2.ptr.p_int[i]>=idx2.ptr.p_int[i+1], _state); } seterrorflag(errorflag, idx2.ptr.p_int[nsections]!=n-1, _state); for(i=0; i<=nsections; i++) { for(j=0; j<=d-1; j++) { seterrorflag(errorflag, ae_fp_neq(xy2.ptr.pp_double[i][j],xy.ptr.pp_double[idx2.ptr.p_int[i]][j]), _state); } } ae_vector_set_length(&x, nsections+1, _state); ae_vector_set_length(&y, nsections+1, _state); ae_vector_set_length(&e, n, _state); for(i=0; i<=n-1; i++) { e.ptr.p_double[i] = (double)(0); } for(j=0; j<=d-1; j++) { for(i=0; i<=nsections; i++) { x.ptr.p_double[i] = (double)(idx2.ptr.p_int[i]); y.ptr.p_double[i] = xy2.ptr.pp_double[i][j]; } spline1dbuildlinear(&x, &y, nsections+1, &s, _state); for(i=0; i<=n-1; i++) { e.ptr.p_double[i] = e.ptr.p_double[i]+ae_sqr(spline1dcalc(&s, (double)(i), _state)-xy.ptr.pp_double[i][j], _state); } } for(i=0; i<=n-1; i++) { seterrorflag(errorflag, ae_fp_greater(ae_sqrt(e.ptr.p_double[i], _state),eps), _state); } /* * compare results with values returned by section-based algorithm */ parametricrdpfixed(&xy, n, d, nsections, 0.0, &xy3, &idx3, &nsections3, _state); seterrorflag(errorflag, nsections3!=nsections, _state); if( *errorflag ) { ae_frame_leave(_state); return; } for(i=0; i<=nsections; i++) { seterrorflag(errorflag, idx2.ptr.p_int[i]!=idx3.ptr.p_int[i], _state); for(j=0; j<=d-1; j++) { seterrorflag(errorflag, ae_fp_greater(ae_fabs(xy2.ptr.pp_double[i][j]-xy3.ptr.pp_double[i][j], _state),1000*ae_machineepsilon), _state); } } } /* * Next epsilon */ eps = eps*0.5; } ae_frame_leave(_state); } /************************************************************************* Unset spline, i.e. initialize it with random garbage *************************************************************************/ static void testparametricunit_unsetp2(pspline2interpolant* p, ae_state *_state) { ae_frame _frame_block; ae_matrix xy; ae_frame_make(_state, &_frame_block); _pspline2interpolant_clear(p); ae_matrix_init(&xy, 0, 0, DT_REAL, _state); ae_matrix_set_length(&xy, 2, 2, _state); xy.ptr.pp_double[0][0] = (double)(-1); xy.ptr.pp_double[0][1] = (double)(-1); xy.ptr.pp_double[1][0] = (double)(1); xy.ptr.pp_double[1][1] = (double)(1); pspline2build(&xy, 2, 1, 0, p, _state); ae_frame_leave(_state); } /************************************************************************* Unset spline, i.e. 
initialize it with random garbage *************************************************************************/ static void testparametricunit_unsetp3(pspline3interpolant* p, ae_state *_state) { ae_frame _frame_block; ae_matrix xy; ae_frame_make(_state, &_frame_block); _pspline3interpolant_clear(p); ae_matrix_init(&xy, 0, 0, DT_REAL, _state); ae_matrix_set_length(&xy, 2, 3, _state); xy.ptr.pp_double[0][0] = (double)(-1); xy.ptr.pp_double[0][1] = (double)(-1); xy.ptr.pp_double[0][2] = (double)(-1); xy.ptr.pp_double[1][0] = (double)(1); xy.ptr.pp_double[1][1] = (double)(1); xy.ptr.pp_double[1][2] = (double)(1); pspline3build(&xy, 2, 1, 0, p, _state); ae_frame_leave(_state); } static ae_bool testspline3dunit_basictest(ae_state *_state); static ae_bool testspline3dunit_testunpack(ae_state *_state); static ae_bool testspline3dunit_testlintrans(ae_state *_state); static ae_bool testspline3dunit_testtrilinearresample(ae_state *_state); static void testspline3dunit_buildrndgrid(ae_bool isvect, ae_bool reorder, ae_int_t* n, ae_int_t* m, ae_int_t* l, ae_int_t* d, /* Real */ ae_vector* x, /* Real */ ae_vector* y, /* Real */ ae_vector* z, /* Real */ ae_vector* f, ae_state *_state); ae_bool testspline3d(ae_bool silence, ae_state *_state) { ae_bool waserrors; ae_bool basicerr; ae_bool unpackerr; ae_bool lintransferr; ae_bool trilinreserr; ae_bool result; basicerr = testspline3dunit_basictest(_state); unpackerr = testspline3dunit_testunpack(_state); lintransferr = testspline3dunit_testlintrans(_state); trilinreserr = testspline3dunit_testtrilinearresample(_state); waserrors = ((basicerr||unpackerr)||lintransferr)||trilinreserr; if( !silence ) { printf("TESTING 3D SPLINE\n"); printf("BASIC TEST: "); if( basicerr ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("UNPACK TEST: "); if( unpackerr ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("LIN_TRANSF TEST: "); if( lintransferr ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("TRILINEAR RESAMPLING TEST: "); if( trilinreserr ) { printf("FAILED\n"); } else { printf("OK\n"); } /* * Summary */ if( waserrors ) { printf("TEST FAILED"); } else { printf("TEST PASSED"); } printf("\n\n"); } result = !waserrors; return result; } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. *************************************************************************/ ae_bool _pexec_testspline3d(ae_bool silence, ae_state *_state) { return testspline3d(silence, _state); } /************************************************************************* The function does test basic functionality. 
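Two properties are verified: (1) a trilinear spline built on an integer grid reproduces, up to
rounding, vector test functions of the form f[idx](x,y,z) = idx + AX*x + AY*y + AZ*z + AXY*x*y + AYZ*y*z
at random evaluation points, because such functions are multilinear; (2) for random grids the spline
reproduces the prescribed values at every grid node.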
*************************************************************************/ static ae_bool testspline3dunit_basictest(ae_state *_state) { ae_frame _frame_block; spline3dinterpolant c; spline3dinterpolant cc; ae_vector vvf; double vsf; ae_int_t d; ae_int_t m; ae_int_t n; ae_int_t l; ae_vector x; ae_vector y; ae_vector z; ae_vector sf; ae_vector vf; double eps; ae_int_t pass; ae_int_t passcount; ae_int_t i; ae_int_t j; ae_int_t k; ae_int_t offs; ae_int_t di; double ax; double ay; double az; double axy; double ayz; double vx; double vy; double vz; ae_bool result; ae_frame_make(_state, &_frame_block); _spline3dinterpolant_init(&c, _state); _spline3dinterpolant_init(&cc, _state); ae_vector_init(&vvf, 0, DT_REAL, _state); ae_vector_init(&x, 0, DT_REAL, _state); ae_vector_init(&y, 0, DT_REAL, _state); ae_vector_init(&z, 0, DT_REAL, _state); ae_vector_init(&sf, 0, DT_REAL, _state); ae_vector_init(&vf, 0, DT_REAL, _state); eps = 1000*ae_machineepsilon; /* * Test spline ability to reproduce D-dimensional vector function * f[idx](x,y,z) = idx+AX*x + AY*y + AZ*z + AXY*x*y + AYZ*y*z * with random AX/AY/... * * We generate random test function, build spline, then evaluate * it in the random test point. */ for(d=1; d<=3; d++) { n = 2+ae_randominteger(4, _state); m = 2+ae_randominteger(4, _state); l = 2+ae_randominteger(4, _state); ae_vector_set_length(&x, n, _state); for(i=0; i<=n-1; i++) { x.ptr.p_double[i] = (double)(i); } ae_vector_set_length(&y, m, _state); for(i=0; i<=m-1; i++) { y.ptr.p_double[i] = (double)(i); } ae_vector_set_length(&z, l, _state); for(i=0; i<=l-1; i++) { z.ptr.p_double[i] = (double)(i); } ae_vector_set_length(&vf, l*m*n*d, _state); offs = 0; ax = 2*ae_randomreal(_state)-1; ay = 2*ae_randomreal(_state)-1; az = 2*ae_randomreal(_state)-1; axy = 2*ae_randomreal(_state)-1; ayz = 2*ae_randomreal(_state)-1; for(k=0; k<=l-1; k++) { for(j=0; j<=m-1; j++) { for(i=0; i<=n-1; i++) { for(di=0; di<=d-1; di++) { vf.ptr.p_double[offs] = di+ax*i+ay*j+az*k+axy*i*j+ayz*j*k; offs = offs+1; } } } } spline3dbuildtrilinearv(&x, n, &y, m, &z, l, &vf, d, &c, _state); vx = ae_randomreal(_state)*n; vy = ae_randomreal(_state)*m; vz = ae_randomreal(_state)*l; spline3dcalcv(&c, vx, vy, vz, &vf, _state); for(di=0; di<=d-1; di++) { if( ae_fp_greater(ae_fabs(di+ax*vx+ay*vy+az*vz+axy*vx*vy+ayz*vy*vz-vf.ptr.p_double[di], _state),eps) ) { result = ae_true; ae_frame_leave(_state); return result; } } if( d==1 ) { vsf = spline3dcalc(&c, vx, vy, vz, _state); if( ae_fp_greater(ae_fabs(ax*vx+ay*vy+az*vz+axy*vx*vy+ayz*vy*vz-vsf, _state),eps) ) { result = ae_true; ae_frame_leave(_state); return result; } } } /* * Generate random grid and test function. * Test spline ability to reproduce function values at grid nodes. 
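 * (In the flattened array VF, component DI of the value at node (I,J,K) is stored at
 * index D*(N*(M*K+J)+I)+DI, which is the layout expected by Spline3DBuildTrilinearV
 * and used by the nested loops below.)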
*/ passcount = 20; for(pass=1; pass<=passcount; pass++) { /* * Prepare a model and check that functions (Spline3DBuildTrilinear, * Spline3DCalc,Spline3DCalcV) work correctly and */ testspline3dunit_buildrndgrid(ae_true, ae_true, &n, &m, &l, &d, &x, &y, &z, &vf, _state); rvectorsetlengthatleast(&sf, n*m*l, _state); /* * Check that the model's values are equal to the function's values * in grid points */ spline3dbuildtrilinearv(&x, n, &y, m, &z, l, &vf, d, &c, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=m-1; j++) { for(k=0; k<=l-1; k++) { spline3dcalcv(&c, x.ptr.p_double[i], y.ptr.p_double[j], z.ptr.p_double[k], &vvf, _state); for(di=0; di<=d-1; di++) { if( ae_fp_greater(ae_fabs(vf.ptr.p_double[d*(n*(m*k+j)+i)+di]-vvf.ptr.p_double[di], _state),eps) ) { result = ae_true; ae_frame_leave(_state); return result; } } } } } } result = ae_false; ae_frame_leave(_state); return result; } /************************************************************************* Unpack/UnpackV test *************************************************************************/ static ae_bool testspline3dunit_testunpack(ae_state *_state) { ae_frame _frame_block; spline3dinterpolant c; ae_matrix tbl0; ae_matrix tbl1; ae_int_t n; ae_int_t m; ae_int_t l; ae_int_t d; ae_int_t sz; ae_int_t un; ae_int_t um; ae_int_t ul; ae_int_t ud; ae_int_t ust; ae_int_t uvn; ae_int_t uvm; ae_int_t uvl; ae_int_t uvd; ae_int_t uvst; ae_int_t ci; ae_int_t cj; ae_int_t ck; ae_vector x; ae_vector y; ae_vector z; ae_vector sf; ae_vector vf; ae_int_t p0; ae_int_t p1; double tx; double ty; double tz; double v1; double v2; double err; ae_int_t pass; ae_int_t passcount; ae_bool bperr; ae_int_t i; ae_int_t j; ae_int_t k; ae_int_t di; ae_int_t i0; ae_bool result; ae_frame_make(_state, &_frame_block); _spline3dinterpolant_init(&c, _state); ae_matrix_init(&tbl0, 0, 0, DT_REAL, _state); ae_matrix_init(&tbl1, 0, 0, DT_REAL, _state); ae_vector_init(&x, 0, DT_REAL, _state); ae_vector_init(&y, 0, DT_REAL, _state); ae_vector_init(&z, 0, DT_REAL, _state); ae_vector_init(&sf, 0, DT_REAL, _state); ae_vector_init(&vf, 0, DT_REAL, _state); passcount = 20; err = (double)(0); for(pass=1; pass<=passcount; pass++) { /* * generate random grid. * NOTE: for this test we need ordered grid, i.e. 
grid * with nodes in ascending order */ testspline3dunit_buildrndgrid(ae_true, ae_false, &n, &m, &l, &d, &x, &y, &z, &vf, _state); sz = n*m*l; rvectorsetlengthatleast(&sf, sz, _state); spline3dbuildtrilinearv(&x, n, &y, m, &z, l, &vf, d, &c, _state); spline3dunpackv(&c, &uvn, &uvm, &uvl, &uvd, &uvst, &tbl0, _state); for(di=0; di<=d-1; di++) { /* * DI-th component copy of a vector-function to * a scalar function */ for(i=0; i<=sz-1; i++) { sf.ptr.p_double[i] = vf.ptr.p_double[d*i+di]; } spline3dbuildtrilinearv(&x, n, &y, m, &z, l, &sf, 1, &c, _state); spline3dunpackv(&c, &un, &um, &ul, &ud, &ust, &tbl1, _state); for(i=0; i<=n-2; i++) { for(j=0; j<=m-2; j++) { for(k=0; k<=l-2; k++) { p1 = (n-1)*((m-1)*k+j)+i; p0 = d*p1+di; /* * Check that all components are correct: * *first check, that unpacked componets are equal * to packed components; */ bperr = (((((((((((((((((un!=n||um!=m)||ul!=l)||ae_fp_neq(tbl1.ptr.pp_double[p1][0],x.ptr.p_double[i]))||ae_fp_neq(tbl1.ptr.pp_double[p1][1],x.ptr.p_double[i+1]))||ae_fp_neq(tbl1.ptr.pp_double[p1][2],y.ptr.p_double[j]))||ae_fp_neq(tbl1.ptr.pp_double[p1][3],y.ptr.p_double[j+1]))||ae_fp_neq(tbl1.ptr.pp_double[p1][4],z.ptr.p_double[k]))||ae_fp_neq(tbl1.ptr.pp_double[p1][5],z.ptr.p_double[k+1]))||uvn!=n)||uvm!=m)||uvl!=l)||uvd!=d)||ae_fp_neq(tbl0.ptr.pp_double[p0][0],x.ptr.p_double[i]))||ae_fp_neq(tbl0.ptr.pp_double[p0][1],x.ptr.p_double[i+1]))||ae_fp_neq(tbl0.ptr.pp_double[p0][2],y.ptr.p_double[j]))||ae_fp_neq(tbl0.ptr.pp_double[p0][3],y.ptr.p_double[j+1]))||ae_fp_neq(tbl0.ptr.pp_double[p0][4],z.ptr.p_double[k]))||ae_fp_neq(tbl0.ptr.pp_double[p0][5],z.ptr.p_double[k+1]); /* * *check, that all components unpacked by Unpack * function are equal to all components unpacked * by UnpackV function. */ for(i0=0; i0<=13; i0++) { bperr = bperr||ae_fp_neq(tbl0.ptr.pp_double[p0][i0],tbl1.ptr.pp_double[p1][i0]); } if( bperr ) { result = ae_true; ae_frame_leave(_state); return result; } tx = (0.001+0.999*ae_randomreal(_state))*(tbl1.ptr.pp_double[p1][1]-tbl1.ptr.pp_double[p1][0]); ty = (0.001+0.999*ae_randomreal(_state))*(tbl1.ptr.pp_double[p1][3]-tbl1.ptr.pp_double[p1][2]); tz = (0.001+0.999*ae_randomreal(_state))*(tbl1.ptr.pp_double[p1][5]-tbl1.ptr.pp_double[p1][4]); /* * Interpolation properties for: * *scalar function; */ v1 = (double)(0); for(ci=0; ci<=1; ci++) { for(cj=0; cj<=1; cj++) { for(ck=0; ck<=1; ck++) { v1 = v1+tbl1.ptr.pp_double[p1][6+2*(2*ck+cj)+ci]*ae_pow(tx, (double)(ci), _state)*ae_pow(ty, (double)(cj), _state)*ae_pow(tz, (double)(ck), _state); } } } v2 = spline3dcalc(&c, tbl1.ptr.pp_double[p1][0]+tx, tbl1.ptr.pp_double[p1][2]+ty, tbl1.ptr.pp_double[p1][4]+tz, _state); err = ae_maxreal(err, ae_fabs(v1-v2, _state), _state); /* * *component of vector function. 
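 * (each row of the unpacked table stores the cell bounds X0,X1,Y0,Y1,Z0,Z1 in columns 0..5 and
 * the eight trilinear coefficients in columns 6..13; the in-cell value is the sum over
 * ci,cj,ck in {0,1} of Tbl[p][6+2*(2*ck+cj)+ci]*tx^ci*ty^cj*tz^ck, with (tx,ty,tz) measured
 * from the cell's lower corner, and must match Spline3DCalc)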
*/ v1 = (double)(0); for(ci=0; ci<=1; ci++) { for(cj=0; cj<=1; cj++) { for(ck=0; ck<=1; ck++) { v1 = v1+tbl0.ptr.pp_double[p0][6+2*(2*ck+cj)+ci]*ae_pow(tx, (double)(ci), _state)*ae_pow(ty, (double)(cj), _state)*ae_pow(tz, (double)(ck), _state); } } } v2 = spline3dcalc(&c, tbl0.ptr.pp_double[p0][0]+tx, tbl0.ptr.pp_double[p0][2]+ty, tbl0.ptr.pp_double[p0][4]+tz, _state); err = ae_maxreal(err, ae_fabs(v1-v2, _state), _state); } } } } } result = ae_fp_greater(err,1.0E+5*ae_machineepsilon); ae_frame_leave(_state); return result; } /************************************************************************* LinTrans test *************************************************************************/ static ae_bool testspline3dunit_testlintrans(ae_state *_state) { ae_frame _frame_block; spline3dinterpolant c; spline3dinterpolant c2; ae_int_t m; ae_int_t n; ae_int_t l; ae_int_t d; ae_vector x; ae_vector y; ae_vector z; ae_vector f; double a1; double a2; double a3; double b1; double b2; double b3; double tx; double ty; double tz; double vx; double vy; double vz; ae_vector v1; ae_vector v2; ae_int_t pass; ae_int_t passcount; ae_int_t xjob; ae_int_t yjob; ae_int_t zjob; double err; ae_int_t i; ae_bool result; ae_frame_make(_state, &_frame_block); _spline3dinterpolant_init(&c, _state); _spline3dinterpolant_init(&c2, _state); ae_vector_init(&x, 0, DT_REAL, _state); ae_vector_init(&y, 0, DT_REAL, _state); ae_vector_init(&z, 0, DT_REAL, _state); ae_vector_init(&f, 0, DT_REAL, _state); ae_vector_init(&v1, 0, DT_REAL, _state); ae_vector_init(&v2, 0, DT_REAL, _state); err = (double)(0); passcount = 15; for(pass=1; pass<=passcount; pass++) { testspline3dunit_buildrndgrid(ae_true, ae_false, &n, &m, &l, &d, &x, &y, &z, &f, _state); spline3dbuildtrilinearv(&x, n, &y, m, &z, l, &f, d, &c, _state); for(xjob=0; xjob<=1; xjob++) { for(yjob=0; yjob<=1; yjob++) { for(zjob=0; zjob<=1; zjob++) { /* * Prepare */ do { a1 = 2.0*ae_randomreal(_state)-1.0; } while(ae_fp_eq(a1,(double)(0))); a1 = a1*xjob; b1 = x.ptr.p_double[0]+ae_randomreal(_state)*(x.ptr.p_double[n-1]-x.ptr.p_double[0]+2.0)-1.0; do { a2 = 2.0*ae_randomreal(_state)-1.0; } while(ae_fp_eq(a2,(double)(0))); a2 = a2*yjob; b2 = y.ptr.p_double[0]+ae_randomreal(_state)*(y.ptr.p_double[m-1]-y.ptr.p_double[0]+2.0)-1.0; do { a3 = 2.0*ae_randomreal(_state)-1.0; } while(ae_fp_eq(a3,(double)(0))); a3 = a3*zjob; b3 = z.ptr.p_double[0]+ae_randomreal(_state)*(z.ptr.p_double[l-1]-z.ptr.p_double[0]+2.0)-1.0; /* * Test XYZ */ spline3dcopy(&c, &c2, _state); spline3dlintransxyz(&c2, a1, b1, a2, b2, a3, b3, _state); tx = x.ptr.p_double[0]+ae_randomreal(_state)*(x.ptr.p_double[n-1]-x.ptr.p_double[0]); ty = y.ptr.p_double[0]+ae_randomreal(_state)*(y.ptr.p_double[m-1]-y.ptr.p_double[0]); tz = z.ptr.p_double[0]+ae_randomreal(_state)*(z.ptr.p_double[l-1]-z.ptr.p_double[0]); if( xjob==0 ) { tx = b1; vx = x.ptr.p_double[0]+ae_randomreal(_state)*(x.ptr.p_double[n-1]-x.ptr.p_double[0]); } else { vx = (tx-b1)/a1; } if( yjob==0 ) { ty = b2; vy = y.ptr.p_double[0]+ae_randomreal(_state)*(y.ptr.p_double[m-1]-y.ptr.p_double[0]); } else { vy = (ty-b2)/a2; } if( zjob==0 ) { tz = b3; vz = z.ptr.p_double[0]+ae_randomreal(_state)*(z.ptr.p_double[l-1]-z.ptr.p_double[0]); } else { vz = (tz-b3)/a3; } spline3dcalcv(&c, tx, ty, tz, &v1, _state); spline3dcalcv(&c2, vx, vy, vz, &v2, _state); for(i=0; i<=d-1; i++) { err = ae_maxreal(err, ae_fabs(v1.ptr.p_double[i]-v2.ptr.p_double[i], _state), _state); } if( ae_fp_greater(err,1.0E+4*ae_machineepsilon) ) { result = ae_true; ae_frame_leave(_state); return result; } 
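/*
 * The XYZ test above relies on the identity C2(x,y,z)=C(A1*x+B1, A2*y+B2, A3*z+B3) produced by
 * Spline3DLinTransXYZ (degenerate A=0 cases pin the corresponding argument to the constant B),
 * while the F test below checks that Spline3DLinTransF gives C2(x,y,z)=A1*C(x,y,z)+B1.
 */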
/* * Test F */ spline3dcopy(&c, &c2, _state); spline3dlintransf(&c2, a1, b1, _state); tx = x.ptr.p_double[0]+ae_randomreal(_state)*(x.ptr.p_double[n-1]-x.ptr.p_double[0]); ty = y.ptr.p_double[0]+ae_randomreal(_state)*(y.ptr.p_double[m-1]-y.ptr.p_double[0]); tz = z.ptr.p_double[0]+ae_randomreal(_state)*(z.ptr.p_double[l-1]-z.ptr.p_double[0]); spline3dcalcv(&c, tx, ty, tz, &v1, _state); spline3dcalcv(&c2, tx, ty, tz, &v2, _state); for(i=0; i<=d-1; i++) { err = ae_maxreal(err, ae_fabs(a1*v1.ptr.p_double[i]+b1-v2.ptr.p_double[i], _state), _state); } } } } } result = ae_fp_greater(err,1.0E+4*ae_machineepsilon); ae_frame_leave(_state); return result; } /************************************************************************* Resample test *************************************************************************/ static ae_bool testspline3dunit_testtrilinearresample(ae_state *_state) { ae_frame _frame_block; spline3dinterpolant c; ae_int_t n; ae_int_t m; ae_int_t l; ae_int_t n2; ae_int_t m2; ae_int_t l2; ae_vector x; ae_vector y; ae_vector z; ae_vector f; ae_vector fr; double v1; double v2; double err; double mf; ae_int_t pass; ae_int_t passcount; ae_int_t i; ae_int_t j; ae_int_t k; ae_bool result; ae_frame_make(_state, &_frame_block); _spline3dinterpolant_init(&c, _state); ae_vector_init(&x, 0, DT_REAL, _state); ae_vector_init(&y, 0, DT_REAL, _state); ae_vector_init(&z, 0, DT_REAL, _state); ae_vector_init(&f, 0, DT_REAL, _state); ae_vector_init(&fr, 0, DT_REAL, _state); result = ae_false; passcount = 20; for(pass=1; pass<=passcount; pass++) { n = ae_randominteger(4, _state)+2; m = ae_randominteger(4, _state)+2; l = ae_randominteger(4, _state)+2; n2 = ae_randominteger(4, _state)+2; m2 = ae_randominteger(4, _state)+2; l2 = ae_randominteger(4, _state)+2; rvectorsetlengthatleast(&x, n, _state); rvectorsetlengthatleast(&y, m, _state); rvectorsetlengthatleast(&z, l, _state); rvectorsetlengthatleast(&f, n*m*l, _state); for(i=0; i<=n-1; i++) { x.ptr.p_double[i] = (double)i/(double)(n-1); } for(i=0; i<=m-1; i++) { y.ptr.p_double[i] = (double)i/(double)(m-1); } for(i=0; i<=l-1; i++) { z.ptr.p_double[i] = (double)i/(double)(l-1); } for(i=0; i<=n-1; i++) { for(j=0; j<=m-1; j++) { for(k=0; k<=l-1; k++) { f.ptr.p_double[n*(m*k+j)+i] = 2*ae_randomreal(_state)-1; } } } spline3dresampletrilinear(&f, l, m, n, l2, m2, n2, &fr, _state); spline3dbuildtrilinearv(&x, n, &y, m, &z, l, &f, 1, &c, _state); err = (double)(0); mf = (double)(0); for(i=0; i<=n2-1; i++) { for(j=0; j<=m2-1; j++) { for(k=0; k<=l2-1; k++) { v1 = spline3dcalc(&c, (double)i/(double)(n2-1), (double)j/(double)(m2-1), (double)k/(double)(l2-1), _state); v2 = fr.ptr.p_double[n2*(m2*k+j)+i]; err = ae_maxreal(err, ae_fabs(v1-v2, _state), _state); mf = ae_maxreal(mf, ae_fabs(v1, _state), _state); } } } result = result||ae_fp_greater(err/mf,1.0E+4*ae_machineepsilon); if( result ) { ae_frame_leave(_state); return result; } } ae_frame_leave(_state); return result; } /************************************************************************* The function does build random function on random grid with random number of points: * N, M, K - random from 2 to 5 * D - 1 in case IsVect=False, 1..3 in case IsVect=True * X, Y, Z - each variable spans from MinV to MaxV, with MinV is random number from [-1.5,0.5] and MaxV is random number from [0.5,1.5]. All nodes are well separated. All nodes are randomly reordered in case Reorder=False. When Reorder=True, nodes are returned in ascending order. 
* F - random values from [-1,+1] *************************************************************************/ static void testspline3dunit_buildrndgrid(ae_bool isvect, ae_bool reorder, ae_int_t* n, ae_int_t* m, ae_int_t* l, ae_int_t* d, /* Real */ ae_vector* x, /* Real */ ae_vector* y, /* Real */ ae_vector* z, /* Real */ ae_vector* f, ae_state *_state) { double st; ae_int_t i; ae_int_t j; ae_int_t k; ae_int_t di; double v; double mx; double maxv; double minv; *n = 0; *m = 0; *l = 0; *d = 0; ae_vector_clear(x); ae_vector_clear(y); ae_vector_clear(z); ae_vector_clear(f); st = 0.3; *m = ae_randominteger(4, _state)+2; *n = ae_randominteger(4, _state)+2; *l = ae_randominteger(4, _state)+2; if( isvect ) { *d = ae_randominteger(3, _state)+1; } else { *d = 1; } rvectorsetlengthatleast(x, *n, _state); rvectorsetlengthatleast(y, *m, _state); rvectorsetlengthatleast(z, *l, _state); rvectorsetlengthatleast(f, *n*(*m)*(*l)*(*d), _state); /* * Fill X */ x->ptr.p_double[0] = (double)(0); for(i=1; i<=*n-1; i++) { x->ptr.p_double[i] = x->ptr.p_double[i-1]+st+ae_randomreal(_state); } minv = -0.5-ae_randomreal(_state); maxv = 0.5+ae_randomreal(_state); mx = x->ptr.p_double[*n-1]; for(i=0; i<=*n-1; i++) { x->ptr.p_double[i] = x->ptr.p_double[i]/mx*(maxv-minv)+minv; } if( reorder ) { for(i=0; i<=*n-1; i++) { k = ae_randominteger(*n, _state); v = x->ptr.p_double[i]; x->ptr.p_double[i] = x->ptr.p_double[k]; x->ptr.p_double[k] = v; } } /* * Fill Y */ y->ptr.p_double[0] = (double)(0); for(i=1; i<=*m-1; i++) { y->ptr.p_double[i] = y->ptr.p_double[i-1]+st+ae_randomreal(_state); } minv = -0.5-ae_randomreal(_state); maxv = 0.5+ae_randomreal(_state); mx = y->ptr.p_double[*m-1]; for(i=0; i<=*m-1; i++) { y->ptr.p_double[i] = y->ptr.p_double[i]/mx*(maxv-minv)+minv; } if( reorder ) { for(i=0; i<=*m-1; i++) { k = ae_randominteger(*m, _state); v = y->ptr.p_double[i]; y->ptr.p_double[i] = y->ptr.p_double[k]; y->ptr.p_double[k] = v; } } /* * Fill Z */ z->ptr.p_double[0] = (double)(0); for(i=1; i<=*l-1; i++) { z->ptr.p_double[i] = z->ptr.p_double[i-1]+st+ae_randomreal(_state); } minv = -0.5-ae_randomreal(_state); maxv = 0.5+ae_randomreal(_state); mx = z->ptr.p_double[*l-1]; for(i=0; i<=*l-1; i++) { z->ptr.p_double[i] = z->ptr.p_double[i]/mx*(maxv-minv)+minv; } if( reorder ) { for(i=0; i<=*l-1; i++) { k = ae_randominteger(*l, _state); v = z->ptr.p_double[i]; z->ptr.p_double[i] = z->ptr.p_double[k]; z->ptr.p_double[k] = v; } } /* * Fill F */ for(i=0; i<=*n-1; i++) { for(j=0; j<=*m-1; j++) { for(k=0; k<=*l-1; k++) { for(di=0; di<=*d-1; di++) { f->ptr.p_double[*d*(*n*(*m*k+j)+i)+di] = 2*ae_randomreal(_state)-1; } } } } } static double testpolintunit_internalpolint(/* Real */ ae_vector* x, /* Real */ ae_vector* f, ae_int_t n, double t, ae_state *_state); static void testpolintunit_brcunset(barycentricinterpolant* b, ae_state *_state); /************************************************************************* Unit test *************************************************************************/ ae_bool testpolint(ae_bool silent, ae_state *_state) { ae_frame _frame_block; ae_bool waserrors; ae_bool interrors; double threshold; ae_vector x; ae_vector y; ae_vector w; ae_vector c; ae_vector c0; ae_vector c1; ae_vector c2; ae_vector x2; ae_vector y2; ae_vector w2; ae_vector xfull; ae_vector yfull; double a; double b; double t; ae_int_t i; ae_int_t k; ae_vector xc; ae_vector yc; ae_vector dc; double v; double v0; double v1; double v2; double v3; double v4; double pscale; double poffset; double eps; barycentricinterpolant p; 
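/*
 * Illustrative sketch (not executed by this test): the passes below compare the
 * "fast" evaluators (polynomialcalceqdist, polynomialcalccheb1, polynomialcalccheb2)
 * and the barycentric models built by polynomialbuild, polynomialbuildeqdist,
 * polynomialbuildcheb1 and polynomialbuildcheb2 against a straightforward O(N^2)
 * Neville-style reference evaluator. A typical user-level call through the C++
 * wrapper would be (wrapper names assumed from the core functions exercised
 * here, data made up):
 *
 *     alglib::real_1d_array xs = "[0.0, 1.0, 2.0, 3.0]";
 *     alglib::real_1d_array ys = "[1.0, 2.0, 0.0, 4.0]";
 *     alglib::barycentricinterpolant p;
 *     alglib::polynomialbuild(xs, ys, p);           // degree-3 interpolant on 4 nodes
 *     double v = alglib::barycentriccalc(p, 1.5);   // evaluate at t = 1.5
 */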
barycentricinterpolant p1; barycentricinterpolant p2; ae_int_t n; ae_int_t maxn; ae_int_t pass; ae_int_t passcount; ae_bool result; ae_frame_make(_state, &_frame_block); ae_vector_init(&x, 0, DT_REAL, _state); ae_vector_init(&y, 0, DT_REAL, _state); ae_vector_init(&w, 0, DT_REAL, _state); ae_vector_init(&c, 0, DT_REAL, _state); ae_vector_init(&c0, 0, DT_REAL, _state); ae_vector_init(&c1, 0, DT_REAL, _state); ae_vector_init(&c2, 0, DT_REAL, _state); ae_vector_init(&x2, 0, DT_REAL, _state); ae_vector_init(&y2, 0, DT_REAL, _state); ae_vector_init(&w2, 0, DT_REAL, _state); ae_vector_init(&xfull, 0, DT_REAL, _state); ae_vector_init(&yfull, 0, DT_REAL, _state); ae_vector_init(&xc, 0, DT_REAL, _state); ae_vector_init(&yc, 0, DT_REAL, _state); ae_vector_init(&dc, 0, DT_INT, _state); _barycentricinterpolant_init(&p, _state); _barycentricinterpolant_init(&p1, _state); _barycentricinterpolant_init(&p2, _state); waserrors = ae_false; interrors = ae_false; maxn = 5; passcount = 20; threshold = 1.0E8*ae_machineepsilon; /* * Test equidistant interpolation */ for(pass=1; pass<=passcount; pass++) { for(n=1; n<=maxn; n++) { /* * prepare task: * * equidistant points * * random Y * * T in [A,B] or near (within 10% of its width) */ do { a = 2*ae_randomreal(_state)-1; b = 2*ae_randomreal(_state)-1; } while(ae_fp_less_eq(ae_fabs(a-b, _state),0.2)); t = a+(1.2*ae_randomreal(_state)-0.1)*(b-a); taskgenint1dequidist(a, b, n, &x, &y, _state); /* * test "fast" equidistant interpolation (no barycentric model) */ interrors = interrors||ae_fp_greater(ae_fabs(polynomialcalceqdist(a, b, &y, n, t, _state)-testpolintunit_internalpolint(&x, &y, n, t, _state), _state),threshold); /* * test "slow" equidistant interpolation (create barycentric model) */ testpolintunit_brcunset(&p, _state); polynomialbuild(&x, &y, n, &p, _state); interrors = interrors||ae_fp_greater(ae_fabs(barycentriccalc(&p, t, _state)-testpolintunit_internalpolint(&x, &y, n, t, _state), _state),threshold); /* * test "fast" interpolation (create "fast" barycentric model) */ testpolintunit_brcunset(&p, _state); polynomialbuildeqdist(a, b, &y, n, &p, _state); interrors = interrors||ae_fp_greater(ae_fabs(barycentriccalc(&p, t, _state)-testpolintunit_internalpolint(&x, &y, n, t, _state), _state),threshold); } } /* * Test Chebyshev-1 interpolation */ for(pass=1; pass<=passcount; pass++) { for(n=1; n<=maxn; n++) { /* * prepare task: * * equidistant points * * random Y * * T in [A,B] or near (within 10% of its width) */ do { a = 2*ae_randomreal(_state)-1; b = 2*ae_randomreal(_state)-1; } while(ae_fp_less_eq(ae_fabs(a-b, _state),0.2)); t = a+(1.2*ae_randomreal(_state)-0.1)*(b-a); taskgenint1dcheb1(a, b, n, &x, &y, _state); /* * test "fast" interpolation (no barycentric model) */ interrors = interrors||ae_fp_greater(ae_fabs(polynomialcalccheb1(a, b, &y, n, t, _state)-testpolintunit_internalpolint(&x, &y, n, t, _state), _state),threshold); /* * test "slow" interpolation (create barycentric model) */ testpolintunit_brcunset(&p, _state); polynomialbuild(&x, &y, n, &p, _state); interrors = interrors||ae_fp_greater(ae_fabs(barycentriccalc(&p, t, _state)-testpolintunit_internalpolint(&x, &y, n, t, _state), _state),threshold); /* * test "fast" interpolation (create "fast" barycentric model) */ testpolintunit_brcunset(&p, _state); polynomialbuildcheb1(a, b, &y, n, &p, _state); interrors = interrors||ae_fp_greater(ae_fabs(barycentriccalc(&p, t, _state)-testpolintunit_internalpolint(&x, &y, n, t, _state), _state),threshold); } } /* * Test Chebyshev-2 interpolation */ for(pass=1; 
pass<=passcount; pass++) { for(n=1; n<=maxn; n++) { /* * prepare task: * * equidistant points * * random Y * * T in [A,B] or near (within 10% of its width) */ do { a = 2*ae_randomreal(_state)-1; b = 2*ae_randomreal(_state)-1; } while(ae_fp_less_eq(ae_fabs(a-b, _state),0.2)); t = a+(1.2*ae_randomreal(_state)-0.1)*(b-a); taskgenint1dcheb2(a, b, n, &x, &y, _state); /* * test "fast" interpolation (no barycentric model) */ interrors = interrors||ae_fp_greater(ae_fabs(polynomialcalccheb2(a, b, &y, n, t, _state)-testpolintunit_internalpolint(&x, &y, n, t, _state), _state),threshold); /* * test "slow" interpolation (create barycentric model) */ testpolintunit_brcunset(&p, _state); polynomialbuild(&x, &y, n, &p, _state); interrors = interrors||ae_fp_greater(ae_fabs(barycentriccalc(&p, t, _state)-testpolintunit_internalpolint(&x, &y, n, t, _state), _state),threshold); /* * test "fast" interpolation (create "fast" barycentric model) */ testpolintunit_brcunset(&p, _state); polynomialbuildcheb2(a, b, &y, n, &p, _state); interrors = interrors||ae_fp_greater(ae_fabs(barycentriccalc(&p, t, _state)-testpolintunit_internalpolint(&x, &y, n, t, _state), _state),threshold); } } /* * Testing conversion Barycentric<->Chebyshev */ for(pass=1; pass<=passcount; pass++) { for(k=1; k<=3; k++) { /* * Allocate */ ae_vector_set_length(&x, k, _state); ae_vector_set_length(&y, k, _state); /* * Generate problem */ a = 2*ae_randomreal(_state)-1; b = a+(0.1+ae_randomreal(_state))*(2*ae_randominteger(2, _state)-1); v0 = 2*ae_randomreal(_state)-1; v1 = 2*ae_randomreal(_state)-1; v2 = 2*ae_randomreal(_state)-1; if( k==1 ) { x.ptr.p_double[0] = 0.5*(a+b); y.ptr.p_double[0] = v0; } if( k==2 ) { x.ptr.p_double[0] = a; y.ptr.p_double[0] = v0-v1; x.ptr.p_double[1] = b; y.ptr.p_double[1] = v0+v1; } if( k==3 ) { x.ptr.p_double[0] = a; y.ptr.p_double[0] = v0-v1+v2; x.ptr.p_double[1] = 0.5*(a+b); y.ptr.p_double[1] = v0-v2; x.ptr.p_double[2] = b; y.ptr.p_double[2] = v0+v1+v2; } /* * Test forward conversion */ polynomialbuild(&x, &y, k, &p, _state); ae_vector_set_length(&c, 1, _state); polynomialbar2cheb(&p, a, b, &c, _state); interrors = interrors||c.cnt!=k; if( k>=1 ) { interrors = interrors||ae_fp_greater(ae_fabs(c.ptr.p_double[0]-v0, _state),threshold); } if( k>=2 ) { interrors = interrors||ae_fp_greater(ae_fabs(c.ptr.p_double[1]-v1, _state),threshold); } if( k>=3 ) { interrors = interrors||ae_fp_greater(ae_fabs(c.ptr.p_double[2]-v2, _state),threshold); } /* * Test backward conversion */ polynomialcheb2bar(&c, k, a, b, &p2, _state); v = a+ae_randomreal(_state)*(b-a); interrors = interrors||ae_fp_greater(ae_fabs(barycentriccalc(&p, v, _state)-barycentriccalc(&p2, v, _state), _state),threshold); } } /* * Testing conversion Barycentric<->Power */ for(pass=1; pass<=passcount; pass++) { for(k=1; k<=5; k++) { /* * Allocate */ ae_vector_set_length(&x, k, _state); ae_vector_set_length(&y, k, _state); /* * Generate problem */ poffset = 2*ae_randomreal(_state)-1; pscale = (0.1+ae_randomreal(_state))*(2*ae_randominteger(2, _state)-1); v0 = 2*ae_randomreal(_state)-1; v1 = 2*ae_randomreal(_state)-1; v2 = 2*ae_randomreal(_state)-1; v3 = 2*ae_randomreal(_state)-1; v4 = 2*ae_randomreal(_state)-1; if( k==1 ) { x.ptr.p_double[0] = poffset; y.ptr.p_double[0] = v0; } if( k==2 ) { x.ptr.p_double[0] = poffset-pscale; y.ptr.p_double[0] = v0-v1; x.ptr.p_double[1] = poffset+pscale; y.ptr.p_double[1] = v0+v1; } if( k==3 ) { x.ptr.p_double[0] = poffset-pscale; y.ptr.p_double[0] = v0-v1+v2; x.ptr.p_double[1] = poffset; y.ptr.p_double[1] = v0; x.ptr.p_double[2] = 
poffset+pscale; y.ptr.p_double[2] = v0+v1+v2; } if( k==4 ) { x.ptr.p_double[0] = poffset-pscale; y.ptr.p_double[0] = v0-v1+v2-v3; x.ptr.p_double[1] = poffset-0.5*pscale; y.ptr.p_double[1] = v0-0.5*v1+0.25*v2-0.125*v3; x.ptr.p_double[2] = poffset+0.5*pscale; y.ptr.p_double[2] = v0+0.5*v1+0.25*v2+0.125*v3; x.ptr.p_double[3] = poffset+pscale; y.ptr.p_double[3] = v0+v1+v2+v3; } if( k==5 ) { x.ptr.p_double[0] = poffset-pscale; y.ptr.p_double[0] = v0-v1+v2-v3+v4; x.ptr.p_double[1] = poffset-0.5*pscale; y.ptr.p_double[1] = v0-0.5*v1+0.25*v2-0.125*v3+0.0625*v4; x.ptr.p_double[2] = poffset; y.ptr.p_double[2] = v0; x.ptr.p_double[3] = poffset+0.5*pscale; y.ptr.p_double[3] = v0+0.5*v1+0.25*v2+0.125*v3+0.0625*v4; x.ptr.p_double[4] = poffset+pscale; y.ptr.p_double[4] = v0+v1+v2+v3+v4; } /* * Test forward conversion */ polynomialbuild(&x, &y, k, &p, _state); ae_vector_set_length(&c, 1, _state); polynomialbar2pow(&p, poffset, pscale, &c, _state); interrors = interrors||c.cnt!=k; if( k>=1 ) { interrors = interrors||ae_fp_greater(ae_fabs(c.ptr.p_double[0]-v0, _state),threshold); } if( k>=2 ) { interrors = interrors||ae_fp_greater(ae_fabs(c.ptr.p_double[1]-v1, _state),threshold); } if( k>=3 ) { interrors = interrors||ae_fp_greater(ae_fabs(c.ptr.p_double[2]-v2, _state),threshold); } if( k>=4 ) { interrors = interrors||ae_fp_greater(ae_fabs(c.ptr.p_double[3]-v3, _state),threshold); } if( k>=5 ) { interrors = interrors||ae_fp_greater(ae_fabs(c.ptr.p_double[4]-v4, _state),threshold); } /* * Test backward conversion */ polynomialpow2bar(&c, k, poffset, pscale, &p2, _state); v = poffset+(2*ae_randomreal(_state)-1)*pscale; interrors = interrors||ae_fp_greater(ae_fabs(barycentriccalc(&p, v, _state)-barycentriccalc(&p2, v, _state), _state),threshold); } } /* * crash-test: ability to solve tasks which will overflow/underflow * weights with straightforward implementation */ for(n=1; n<=20; n++) { a = -0.1*ae_maxrealnumber; b = 0.1*ae_maxrealnumber; taskgenint1dequidist(a, b, n, &x, &y, _state); polynomialbuild(&x, &y, n, &p, _state); for(i=0; i<=n-1; i++) { interrors = interrors||ae_fp_eq(p.w.ptr.p_double[i],(double)(0)); } } /* * Test issue #634: instability in PolynomialBar2Pow(). * * Function returns incorrect coefficients when called with * approximately-unit scale for data which have significantly * non-unit scale. 
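*
* The checks below rely on the scaling relation that must hold between the two
* conversions: if P interpolates Y on nodes X and P2 interpolates the same Y on
* nodes X*pscale, then the raw power-basis coefficients returned with center 0
* and scale 1 satisfy C0[i] = C1[i]*pscale^i (and likewise for C2). As a purely
* illustrative aside (assumed wrapper-level semantics, not part of the test),
* a caller working with data of magnitude ~pscale can also request coefficients
* in the pre-scaled variable, which keeps the conversion well-conditioned
* independently of this fix:
*
*     alglib::real_1d_array a;
*     alglib::polynomialbar2pow(p, 0.0, pscale, a);   // coefficients of (x/pscale)^i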
*/ n = 7; eps = 1.0E-8; ae_vector_set_length(&x, n, _state); ae_vector_set_length(&x2, n, _state); ae_vector_set_length(&y, n, _state); x.ptr.p_double[0] = ae_randomreal(_state)-0.5; y.ptr.p_double[0] = ae_randomreal(_state)-0.5; for(i=1; i<=n-1; i++) { x.ptr.p_double[i] = x.ptr.p_double[i-1]+ae_randomreal(_state)+0.1; y.ptr.p_double[i] = ae_randomreal(_state)-0.5; } polynomialbuild(&x, &y, n, &p, _state); polynomialbar2pow(&p, 0.0, 1.0, &c0, _state); pscale = 1.0E-10; for(i=0; i<=n-1; i++) { x2.ptr.p_double[i] = x.ptr.p_double[i]*pscale; } polynomialbuild(&x2, &y, n, &p, _state); polynomialbar2pow(&p, 0.0, 1.0, &c1, _state); for(i=0; i<=n-1; i++) { seterrorflag(&interrors, ae_fp_greater(ae_fabs(c0.ptr.p_double[i]-c1.ptr.p_double[i]*ae_pow(pscale, (double)(i), _state), _state),eps), _state); } pscale = 1.0E10; for(i=0; i<=n-1; i++) { x2.ptr.p_double[i] = x.ptr.p_double[i]*pscale; } polynomialbuild(&x2, &y, n, &p, _state); polynomialbar2pow(&p, 0.0, 1.0, &c2, _state); for(i=0; i<=n-1; i++) { seterrorflag(&interrors, ae_fp_greater(ae_fabs(c0.ptr.p_double[i]-c2.ptr.p_double[i]*ae_pow(pscale, (double)(i), _state), _state),eps), _state); } /* * report */ waserrors = interrors; if( !silent ) { printf("TESTING POLYNOMIAL INTERPOLATION\n"); /* * Normal tests */ printf("INTERPOLATION TEST: "); if( interrors ) { printf("FAILED\n"); } else { printf("OK\n"); } if( waserrors ) { printf("TEST FAILED\n"); } else { printf("TEST PASSED\n"); } printf("\n\n"); } /* * end */ result = !waserrors; ae_frame_leave(_state); return result; } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. *************************************************************************/ ae_bool _pexec_testpolint(ae_bool silent, ae_state *_state) { return testpolint(silent, _state); } static double testpolintunit_internalpolint(/* Real */ ae_vector* x, /* Real */ ae_vector* f, ae_int_t n, double t, ae_state *_state) { ae_frame _frame_block; ae_vector _f; ae_int_t i; ae_int_t j; double result; ae_frame_make(_state, &_frame_block); ae_vector_init_copy(&_f, f, _state); f = &_f; n = n-1; for(j=0; j<=n-1; j++) { for(i=j+1; i<=n; i++) { f->ptr.p_double[i] = ((t-x->ptr.p_double[j])*f->ptr.p_double[i]-(t-x->ptr.p_double[i])*f->ptr.p_double[j])/(x->ptr.p_double[i]-x->ptr.p_double[j]); } } result = f->ptr.p_double[n]; ae_frame_leave(_state); return result; } static void testpolintunit_brcunset(barycentricinterpolant* b, ae_state *_state) { ae_frame _frame_block; ae_vector x; ae_vector y; ae_vector w; ae_frame_make(_state, &_frame_block); ae_vector_init(&x, 0, DT_REAL, _state); ae_vector_init(&y, 0, DT_REAL, _state); ae_vector_init(&w, 0, DT_REAL, _state); ae_vector_set_length(&x, 1, _state); ae_vector_set_length(&y, 1, _state); ae_vector_set_length(&w, 1, _state); x.ptr.p_double[0] = (double)(0); y.ptr.p_double[0] = (double)(0); w.ptr.p_double[0] = (double)(1); barycentricbuildxyw(&x, &y, &w, 1, b, _state); ae_frame_leave(_state); } static void testlsfitunit_testpolynomialfitting(ae_bool* fiterrors, ae_state *_state); static void testlsfitunit_testrationalfitting(ae_bool* fiterrors, ae_state *_state); static void testlsfitunit_testsplinefitting(ae_bool* fiterrors, ae_state *_state); static void testlsfitunit_testgeneralfitting(ae_bool* llserrors, ae_bool* nlserrors, ae_state *_state); static void testlsfitunit_testrdp(ae_bool* errorflag, ae_state *_state); static void testlsfitunit_testlogisticfitting(ae_bool* fiterrors, ae_state *_state); static void 
testlsfitunit_testbcnls(ae_bool* errorflag, ae_state *_state); static void testlsfitunit_testlcnls(ae_bool* errorflag, ae_state *_state); static ae_bool testlsfitunit_isglssolution(ae_int_t n, ae_int_t m, ae_int_t k, /* Real */ ae_vector* y, /* Real */ ae_vector* w, /* Real */ ae_matrix* fmatrix, /* Real */ ae_matrix* cmatrix, /* Real */ ae_vector* c, ae_state *_state); static double testlsfitunit_getglserror(ae_int_t n, ae_int_t m, /* Real */ ae_vector* y, /* Real */ ae_vector* w, /* Real */ ae_matrix* fmatrix, /* Real */ ae_vector* c, ae_state *_state); static void testlsfitunit_fitlinearnonlinear(ae_int_t m, ae_int_t deravailable, /* Real */ ae_matrix* xy, lsfitstate* state, ae_bool* nlserrors, ae_state *_state); static void testlsfitunit_testgradientcheck(ae_bool* testg, ae_state *_state); static void testlsfitunit_funcderiv(/* Real */ ae_vector* c, /* Real */ ae_vector* x, /* Real */ ae_vector* x0, ae_int_t k, ae_int_t m, ae_int_t functype, double* f, /* Real */ ae_vector* g, ae_state *_state); static void testlsfitunit_testfunc1(ae_int_t k, /* Real */ ae_vector* x, /* Real */ ae_vector* z, double* f, ae_bool needf, /* Real */ ae_vector* g, ae_bool needg, ae_state *_state); static void testlsfitunit_testfunc2(/* Real */ ae_vector* x, ae_int_t nx, /* Real */ ae_vector* c, ae_int_t nc, double* f, ae_bool needf, /* Real */ ae_vector* g, ae_bool needg, ae_state *_state); static void testlsfitunit_testfunc3(/* Real */ ae_vector* x, ae_int_t nx, /* Real */ ae_vector* c, ae_int_t nc, double* f, ae_bool needf, /* Real */ ae_vector* g, ae_bool needg, ae_state *_state); ae_bool testlsfit(ae_bool silent, ae_state *_state) { ae_bool waserrors; ae_bool llserrors; ae_bool nlserrors; ae_bool polfiterrors; ae_bool ratfiterrors; ae_bool splfiterrors; ae_bool graderrors; ae_bool logisticerrors; ae_bool rdperrors; ae_bool result; waserrors = ae_false; polfiterrors = ae_false; ratfiterrors = ae_false; splfiterrors = ae_false; llserrors = ae_false; nlserrors = ae_false; graderrors = ae_false; logisticerrors = ae_false; rdperrors = ae_false; testlsfitunit_testrdp(&rdperrors, _state); testlsfitunit_testlogisticfitting(&logisticerrors, _state); testlsfitunit_testpolynomialfitting(&polfiterrors, _state); testlsfitunit_testrationalfitting(&ratfiterrors, _state); testlsfitunit_testsplinefitting(&splfiterrors, _state); testlsfitunit_testgeneralfitting(&llserrors, &nlserrors, _state); testlsfitunit_testgradientcheck(&graderrors, _state); /* * report */ waserrors = ((((((llserrors||nlserrors)||polfiterrors)||ratfiterrors)||splfiterrors)||graderrors)||logisticerrors)||rdperrors; if( !silent ) { printf("TESTING LEAST SQUARES\n"); printf("POLYNOMIAL LEAST SQUARES: "); if( polfiterrors ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("RATIONAL LEAST SQUARES: "); if( ratfiterrors ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("SPLINE LEAST SQUARES: "); if( splfiterrors ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("LINEAR LEAST SQUARES: "); if( llserrors ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("NON-LINEAR LEAST SQUARES: "); if( nlserrors ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("TEST FOR VERIFICATION OF THE GRADIENT: "); if( graderrors ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("LOGISTIC FITTING (4PL/5PL): "); if( logisticerrors ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("RDP ALGORITHM: "); if( rdperrors ) { printf("FAILED\n"); } else { printf("OK\n"); } if( waserrors ) { printf("TEST FAILED\n"); } else { printf("TEST 
PASSED\n"); } printf("\n\n"); } /* * end */ result = !waserrors; return result; } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. *************************************************************************/ ae_bool _pexec_testlsfit(ae_bool silent, ae_state *_state) { return testlsfit(silent, _state); } /************************************************************************* Unit test *************************************************************************/ static void testlsfitunit_testpolynomialfitting(ae_bool* fiterrors, ae_state *_state) { ae_frame _frame_block; double threshold; ae_vector x; ae_vector y; ae_vector w; ae_vector x2; ae_vector y2; ae_vector w2; ae_vector xfull; ae_vector yfull; double t; ae_int_t i; ae_int_t k; ae_vector xc; ae_vector yc; ae_vector dc; ae_int_t info; ae_int_t info2; double v; double v0; double v1; double v2; double s; double xmin; double xmax; double refrms; double refavg; double refavgrel; double refmax; barycentricinterpolant p; barycentricinterpolant p1; barycentricinterpolant p2; polynomialfitreport rep; polynomialfitreport rep2; ae_int_t n; ae_int_t m; ae_int_t maxn; ae_int_t pass; ae_int_t passcount; ae_frame_make(_state, &_frame_block); ae_vector_init(&x, 0, DT_REAL, _state); ae_vector_init(&y, 0, DT_REAL, _state); ae_vector_init(&w, 0, DT_REAL, _state); ae_vector_init(&x2, 0, DT_REAL, _state); ae_vector_init(&y2, 0, DT_REAL, _state); ae_vector_init(&w2, 0, DT_REAL, _state); ae_vector_init(&xfull, 0, DT_REAL, _state); ae_vector_init(&yfull, 0, DT_REAL, _state); ae_vector_init(&xc, 0, DT_REAL, _state); ae_vector_init(&yc, 0, DT_REAL, _state); ae_vector_init(&dc, 0, DT_INT, _state); _barycentricinterpolant_init(&p, _state); _barycentricinterpolant_init(&p1, _state); _barycentricinterpolant_init(&p2, _state); _polynomialfitreport_init(&rep, _state); _polynomialfitreport_init(&rep2, _state); *fiterrors = ae_false; maxn = 5; passcount = 20; threshold = 1.0E8*ae_machineepsilon; /* * Test polunomial fitting */ for(pass=1; pass<=passcount; pass++) { for(n=1; n<=maxn; n++) { /* * N=M+K fitting (i.e. interpolation) */ for(k=0; k<=n-1; k++) { taskgenint1d((double)(-1), (double)(1), n, &xfull, &yfull, _state); ae_vector_set_length(&x, n-k, _state); ae_vector_set_length(&y, n-k, _state); ae_vector_set_length(&w, n-k, _state); if( k>0 ) { ae_vector_set_length(&xc, k, _state); ae_vector_set_length(&yc, k, _state); ae_vector_set_length(&dc, k, _state); } for(i=0; i<=n-k-1; i++) { x.ptr.p_double[i] = xfull.ptr.p_double[i]; y.ptr.p_double[i] = yfull.ptr.p_double[i]; w.ptr.p_double[i] = 1+ae_randomreal(_state); } for(i=0; i<=k-1; i++) { xc.ptr.p_double[i] = xfull.ptr.p_double[n-k+i]; yc.ptr.p_double[i] = yfull.ptr.p_double[n-k+i]; dc.ptr.p_int[i] = 0; } polynomialfitwc(&x, &y, &w, n-k, &xc, &yc, &dc, k, n, &info, &p1, &rep, _state); if( info<=0 ) { *fiterrors = ae_true; } else { for(i=0; i<=n-k-1; i++) { *fiterrors = *fiterrors||ae_fp_greater(ae_fabs(barycentriccalc(&p1, x.ptr.p_double[i], _state)-y.ptr.p_double[i], _state),threshold); } for(i=0; i<=k-1; i++) { *fiterrors = *fiterrors||ae_fp_greater(ae_fabs(barycentriccalc(&p1, xc.ptr.p_double[i], _state)-yc.ptr.p_double[i], _state),threshold); } } } /* * Testing constraints on derivatives. * Special tasks which will always have solution: * 1. P(0)=YC[0] * 2. 
P(0)=YC[0], P'(0)=YC[1] */ if( n>1 ) { for(m=3; m<=5; m++) { for(k=1; k<=2; k++) { taskgenint1d((double)(-1), (double)(1), n, &x, &y, _state); ae_vector_set_length(&w, n, _state); ae_vector_set_length(&xc, 2, _state); ae_vector_set_length(&yc, 2, _state); ae_vector_set_length(&dc, 2, _state); for(i=0; i<=n-1; i++) { w.ptr.p_double[i] = 1+ae_randomreal(_state); } xc.ptr.p_double[0] = (double)(0); yc.ptr.p_double[0] = 2*ae_randomreal(_state)-1; dc.ptr.p_int[0] = 0; xc.ptr.p_double[1] = (double)(0); yc.ptr.p_double[1] = 2*ae_randomreal(_state)-1; dc.ptr.p_int[1] = 1; polynomialfitwc(&x, &y, &w, n, &xc, &yc, &dc, k, m, &info, &p1, &rep, _state); if( info<=0 ) { *fiterrors = ae_true; } else { barycentricdiff1(&p1, 0.0, &v0, &v1, _state); *fiterrors = *fiterrors||ae_fp_greater(ae_fabs(v0-yc.ptr.p_double[0], _state),threshold); if( k==2 ) { *fiterrors = *fiterrors||ae_fp_greater(ae_fabs(v1-yc.ptr.p_double[1], _state),threshold); } } } } } } } for(m=2; m<=8; m++) { for(pass=1; pass<=passcount; pass++) { /* * General fitting * * interpolating function through M nodes should have * greater RMS error than fitting it through the same M nodes */ n = 100; ae_vector_set_length(&x2, n, _state); ae_vector_set_length(&y2, n, _state); ae_vector_set_length(&w2, n, _state); xmin = (double)(0); xmax = 2*ae_pi; for(i=0; i<=n-1; i++) { x2.ptr.p_double[i] = 2*ae_pi*ae_randomreal(_state); y2.ptr.p_double[i] = ae_sin(x2.ptr.p_double[i], _state); w2.ptr.p_double[i] = (double)(1); } ae_vector_set_length(&x, m, _state); ae_vector_set_length(&y, m, _state); for(i=0; i<=m-1; i++) { x.ptr.p_double[i] = xmin+(xmax-xmin)*i/(m-1); y.ptr.p_double[i] = ae_sin(x.ptr.p_double[i], _state); } polynomialbuild(&x, &y, m, &p1, _state); polynomialfitwc(&x2, &y2, &w2, n, &xc, &yc, &dc, 0, m, &info, &p2, &rep, _state); if( info<=0 ) { *fiterrors = ae_true; } else { /* * calculate P1 (interpolant) RMS error, compare with P2 error */ v1 = (double)(0); v2 = (double)(0); for(i=0; i<=n-1; i++) { v1 = v1+ae_sqr(barycentriccalc(&p1, x2.ptr.p_double[i], _state)-y2.ptr.p_double[i], _state); v2 = v2+ae_sqr(barycentriccalc(&p2, x2.ptr.p_double[i], _state)-y2.ptr.p_double[i], _state); } v1 = ae_sqrt(v1/n, _state); v2 = ae_sqrt(v2/n, _state); *fiterrors = *fiterrors||ae_fp_greater(v2,v1); *fiterrors = *fiterrors||ae_fp_greater(ae_fabs(v2-rep.rmserror, _state),threshold); } /* * compare weighted and non-weighted */ n = 20; ae_vector_set_length(&x, n, _state); ae_vector_set_length(&y, n, _state); ae_vector_set_length(&w, n, _state); for(i=0; i<=n-1; i++) { x.ptr.p_double[i] = i+(ae_randomreal(_state)-0.5); y.ptr.p_double[i] = 2*ae_randomreal(_state)-1; w.ptr.p_double[i] = (double)(1); } polynomialfitwc(&x, &y, &w, n, &xc, &yc, &dc, 0, m, &info, &p1, &rep, _state); polynomialfit(&x, &y, n, m, &info2, &p2, &rep2, _state); if( info<=0||info2<=0 ) { *fiterrors = ae_true; } else { /* * calculate P1 (interpolant), compare with P2 error * compare RMS errors */ t = 2*ae_randomreal(_state)-1; v1 = barycentriccalc(&p1, t, _state); v2 = barycentriccalc(&p2, t, _state); *fiterrors = *fiterrors||!approxequal(v2, v1, 1.0E-12, _state); *fiterrors = *fiterrors||!approxequal(rep.rmserror, rep2.rmserror, 1.0E-12*maxreal3((double)(1), rep.rmserror, rep2.rmserror, _state), _state); *fiterrors = *fiterrors||!approxequal(rep.avgerror, rep2.avgerror, 1.0E-12*maxreal3((double)(1), rep.avgerror, rep2.avgerror, _state), _state); *fiterrors = *fiterrors||!approxequal(rep.avgrelerror, rep2.avgrelerror, 1.0E-12*maxreal3((double)(1), rep.avgrelerror, rep2.avgrelerror, _state), 
_state); *fiterrors = *fiterrors||!approxequal(rep.maxerror, rep2.maxerror, 1.0E-12*maxreal3((double)(1), rep.maxerror, rep2.maxerror, _state), _state); } } } for(m=1; m<=maxn; m++) { for(pass=1; pass<=passcount; pass++) { ae_assert(passcount>=2, "PassCount should be 2 or greater!", _state); /* * solve simple task (all X[] are the same, Y[] are specially * calculated to ensure simple form of all types of errors) * and check correctness of the errors calculated by subroutines * * First pass is done with zero Y[], other passes - with random Y[]. * It should test both ability to correctly calculate errors and * ability to not fail while working with zeros :) */ n = 4*maxn; if( pass==1 ) { v1 = (double)(0); v2 = (double)(0); v = (double)(0); } else { v1 = ae_randomreal(_state); v2 = ae_randomreal(_state); v = 1+ae_randomreal(_state); } ae_vector_set_length(&x, n, _state); ae_vector_set_length(&y, n, _state); ae_vector_set_length(&w, n, _state); for(i=0; i<=maxn-1; i++) { x.ptr.p_double[4*i+0] = (double)(i); y.ptr.p_double[4*i+0] = v-v2; w.ptr.p_double[4*i+0] = (double)(1); x.ptr.p_double[4*i+1] = (double)(i); y.ptr.p_double[4*i+1] = v-v1; w.ptr.p_double[4*i+1] = (double)(1); x.ptr.p_double[4*i+2] = (double)(i); y.ptr.p_double[4*i+2] = v+v1; w.ptr.p_double[4*i+2] = (double)(1); x.ptr.p_double[4*i+3] = (double)(i); y.ptr.p_double[4*i+3] = v+v2; w.ptr.p_double[4*i+3] = (double)(1); } refrms = ae_sqrt((ae_sqr(v1, _state)+ae_sqr(v2, _state))/2, _state); refavg = (ae_fabs(v1, _state)+ae_fabs(v2, _state))/2; if( pass==1 ) { refavgrel = (double)(0); } else { refavgrel = 0.25*(ae_fabs(v2, _state)/ae_fabs(v-v2, _state)+ae_fabs(v1, _state)/ae_fabs(v-v1, _state)+ae_fabs(v1, _state)/ae_fabs(v+v1, _state)+ae_fabs(v2, _state)/ae_fabs(v+v2, _state)); } refmax = ae_maxreal(v1, v2, _state); /* * Test errors correctness */ polynomialfit(&x, &y, n, m, &info, &p, &rep, _state); if( info<=0 ) { *fiterrors = ae_true; } else { s = barycentriccalc(&p, (double)(0), _state); *fiterrors = *fiterrors||ae_fp_greater(ae_fabs(s-v, _state),threshold); *fiterrors = *fiterrors||ae_fp_greater(ae_fabs(rep.rmserror-refrms, _state),threshold); *fiterrors = *fiterrors||ae_fp_greater(ae_fabs(rep.avgerror-refavg, _state),threshold); *fiterrors = *fiterrors||ae_fp_greater(ae_fabs(rep.avgrelerror-refavgrel, _state),threshold); *fiterrors = *fiterrors||ae_fp_greater(ae_fabs(rep.maxerror-refmax, _state),threshold); } } } ae_frame_leave(_state); } static void testlsfitunit_testrationalfitting(ae_bool* fiterrors, ae_state *_state) { ae_frame _frame_block; double threshold; ae_int_t maxn; ae_int_t passcount; barycentricinterpolant b1; barycentricinterpolant b2; ae_vector x; ae_vector x2; ae_vector y; ae_vector y2; ae_vector w; ae_vector w2; ae_vector xc; ae_vector yc; ae_vector dc; ae_int_t n; ae_int_t m; ae_int_t i; ae_int_t k; ae_int_t pass; double t; double s; double v; double v0; double v1; double v2; ae_int_t info; ae_int_t info2; double xmin; double xmax; double refrms; double refavg; double refavgrel; double refmax; barycentricfitreport rep; barycentricfitreport rep2; ae_frame_make(_state, &_frame_block); _barycentricinterpolant_init(&b1, _state); _barycentricinterpolant_init(&b2, _state); ae_vector_init(&x, 0, DT_REAL, _state); ae_vector_init(&x2, 0, DT_REAL, _state); ae_vector_init(&y, 0, DT_REAL, _state); ae_vector_init(&y2, 0, DT_REAL, _state); ae_vector_init(&w, 0, DT_REAL, _state); ae_vector_init(&w2, 0, DT_REAL, _state); ae_vector_init(&xc, 0, DT_REAL, _state); ae_vector_init(&yc, 0, DT_REAL, _state); ae_vector_init(&dc, 0, 
DT_INT, _state); _barycentricfitreport_init(&rep, _state); _barycentricfitreport_init(&rep2, _state); *fiterrors = ae_false; /* * PassCount number of repeated passes * Threshold error tolerance * LipschitzTol Lipschitz constant increase allowed * when calculating constant on a twice denser grid */ passcount = 5; maxn = 15; threshold = 1000000*ae_machineepsilon; /* * Test rational fitting: */ for(pass=1; pass<=passcount; pass++) { for(n=2; n<=maxn; n++) { /* * N=M+K fitting (i.e. interpolation) */ for(k=0; k<=n-1; k++) { ae_vector_set_length(&x, n-k, _state); ae_vector_set_length(&y, n-k, _state); ae_vector_set_length(&w, n-k, _state); if( k>0 ) { ae_vector_set_length(&xc, k, _state); ae_vector_set_length(&yc, k, _state); ae_vector_set_length(&dc, k, _state); } for(i=0; i<=n-k-1; i++) { x.ptr.p_double[i] = (double)i/(double)(n-1); y.ptr.p_double[i] = 2*ae_randomreal(_state)-1; w.ptr.p_double[i] = 1+ae_randomreal(_state); } for(i=0; i<=k-1; i++) { xc.ptr.p_double[i] = (double)(n-k+i)/(double)(n-1); yc.ptr.p_double[i] = 2*ae_randomreal(_state)-1; dc.ptr.p_int[i] = 0; } barycentricfitfloaterhormannwc(&x, &y, &w, n-k, &xc, &yc, &dc, k, n, &info, &b1, &rep, _state); if( info<=0 ) { *fiterrors = ae_true; } else { for(i=0; i<=n-k-1; i++) { *fiterrors = *fiterrors||ae_fp_greater(ae_fabs(barycentriccalc(&b1, x.ptr.p_double[i], _state)-y.ptr.p_double[i], _state),threshold); } for(i=0; i<=k-1; i++) { *fiterrors = *fiterrors||ae_fp_greater(ae_fabs(barycentriccalc(&b1, xc.ptr.p_double[i], _state)-yc.ptr.p_double[i], _state),threshold); } } } /* * Testing constraints on derivatives: * * several M's are tried * * several K's are tried - 1, 2. * * constraints at the ends of the interval */ for(m=3; m<=5; m++) { for(k=1; k<=2; k++) { ae_vector_set_length(&x, n, _state); ae_vector_set_length(&y, n, _state); ae_vector_set_length(&w, n, _state); ae_vector_set_length(&xc, 2, _state); ae_vector_set_length(&yc, 2, _state); ae_vector_set_length(&dc, 2, _state); for(i=0; i<=n-1; i++) { x.ptr.p_double[i] = 2*ae_randomreal(_state)-1; y.ptr.p_double[i] = 2*ae_randomreal(_state)-1; w.ptr.p_double[i] = 1+ae_randomreal(_state); } xc.ptr.p_double[0] = (double)(-1); yc.ptr.p_double[0] = 2*ae_randomreal(_state)-1; dc.ptr.p_int[0] = 0; xc.ptr.p_double[1] = (double)(1); yc.ptr.p_double[1] = 2*ae_randomreal(_state)-1; dc.ptr.p_int[1] = 0; barycentricfitfloaterhormannwc(&x, &y, &w, n, &xc, &yc, &dc, k, m, &info, &b1, &rep, _state); if( info<=0 ) { *fiterrors = ae_true; } else { for(i=0; i<=k-1; i++) { barycentricdiff1(&b1, xc.ptr.p_double[i], &v0, &v1, _state); *fiterrors = *fiterrors||ae_fp_greater(ae_fabs(v0-yc.ptr.p_double[i], _state),threshold); } } } } } } for(m=2; m<=8; m++) { for(pass=1; pass<=passcount; pass++) { /* * General fitting * * interpolating function through M nodes should have * greater RMS error than fitting it through the same M nodes */ n = 100; ae_vector_set_length(&x2, n, _state); ae_vector_set_length(&y2, n, _state); ae_vector_set_length(&w2, n, _state); xmin = ae_maxrealnumber; xmax = -ae_maxrealnumber; for(i=0; i<=n-1; i++) { x2.ptr.p_double[i] = 2*ae_pi*ae_randomreal(_state); y2.ptr.p_double[i] = ae_sin(x2.ptr.p_double[i], _state); w2.ptr.p_double[i] = (double)(1); xmin = ae_minreal(xmin, x2.ptr.p_double[i], _state); xmax = ae_maxreal(xmax, x2.ptr.p_double[i], _state); } ae_vector_set_length(&x, m, _state); ae_vector_set_length(&y, m, _state); for(i=0; i<=m-1; i++) { x.ptr.p_double[i] = xmin+(xmax-xmin)*i/(m-1); y.ptr.p_double[i] = ae_sin(x.ptr.p_double[i], _state); } 
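/*
 * Illustrative sketch (assumed C++ wrapper names, data prepared as above): the
 * comparison below builds a D=3 Floater-Hormann interpolant on the M nodes and
 * a least-squares Floater-Hormann fit with M basis functions on all N=100
 * samples; averaging over many samples, the fit must not have a larger RMS
 * error than the interpolant. A user-level fit, equivalent here to the
 * barycentricfitfloaterhormannwc call with zero constraints, looks roughly like:
 *
 *     alglib::ae_int_t info;
 *     alglib::barycentricinterpolant b;
 *     alglib::barycentricfitreport rep;
 *     alglib::barycentricfitfloaterhormann(x2, y2, 100, m, info, b, rep);
 *     // info>0 on success; rep.rmserror is the RMS deviation over the samples
 */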
barycentricbuildfloaterhormann(&x, &y, m, 3, &b1, _state); barycentricfitfloaterhormannwc(&x2, &y2, &w2, n, &xc, &yc, &dc, 0, m, &info, &b2, &rep, _state); if( info<=0 ) { *fiterrors = ae_true; } else { /* * calculate B1 (interpolant) RMS error, compare with B2 error */ v1 = (double)(0); v2 = (double)(0); for(i=0; i<=n-1; i++) { v1 = v1+ae_sqr(barycentriccalc(&b1, x2.ptr.p_double[i], _state)-y2.ptr.p_double[i], _state); v2 = v2+ae_sqr(barycentriccalc(&b2, x2.ptr.p_double[i], _state)-y2.ptr.p_double[i], _state); } v1 = ae_sqrt(v1/n, _state); v2 = ae_sqrt(v2/n, _state); *fiterrors = *fiterrors||ae_fp_greater(v2,v1); *fiterrors = *fiterrors||ae_fp_greater(ae_fabs(v2-rep.rmserror, _state),threshold); } /* * compare weighted and non-weighted */ n = 20; ae_vector_set_length(&x, n, _state); ae_vector_set_length(&y, n, _state); ae_vector_set_length(&w, n, _state); for(i=0; i<=n-1; i++) { x.ptr.p_double[i] = i+(ae_randomreal(_state)-0.5); y.ptr.p_double[i] = 2*ae_randomreal(_state)-1; w.ptr.p_double[i] = (double)(1); } barycentricfitfloaterhormannwc(&x, &y, &w, n, &xc, &yc, &dc, 0, m, &info, &b1, &rep, _state); barycentricfitfloaterhormann(&x, &y, n, m, &info2, &b2, &rep2, _state); if( info<=0||info2<=0 ) { *fiterrors = ae_true; } else { /* * calculate B1 (interpolant), compare with B2 * compare RMS errors */ t = 2*ae_randomreal(_state)-1; v1 = barycentriccalc(&b1, t, _state); v2 = barycentriccalc(&b2, t, _state); *fiterrors = *fiterrors||!approxequal(v2, v1, 1.0E-12, _state); *fiterrors = *fiterrors||!approxequal(rep.rmserror, rep2.rmserror, 1.0E-12*maxreal3((double)(1), rep.rmserror, rep2.rmserror, _state), _state); *fiterrors = *fiterrors||!approxequal(rep.avgerror, rep2.avgerror, 1.0E-12*maxreal3((double)(1), rep.avgerror, rep2.avgerror, _state), _state); *fiterrors = *fiterrors||!approxequal(rep.avgrelerror, rep2.avgrelerror, 1.0E-12*maxreal3((double)(1), rep.avgrelerror, rep2.avgrelerror, _state), _state); *fiterrors = *fiterrors||!approxequal(rep.maxerror, rep2.maxerror, 1.0E-12*maxreal3((double)(1), rep.maxerror, rep2.maxerror, _state), _state); } } } for(pass=1; pass<=passcount; pass++) { ae_assert(passcount>=2, "PassCount should be 2 or greater!", _state); /* * solve simple task (all X[] are the same, Y[] are specially * calculated to ensure simple form of all types of errors) * and check correctness of the errors calculated by subroutines * * First pass is done with zero Y[], other passes - with random Y[]. 
* It should test both ability to correctly calculate errors and * ability to not fail while working with zeros :) */ n = 4; if( pass==1 ) { v1 = (double)(0); v2 = (double)(0); v = (double)(0); } else { v1 = ae_randomreal(_state); v2 = ae_randomreal(_state); v = 1+ae_randomreal(_state); } ae_vector_set_length(&x, 4, _state); ae_vector_set_length(&y, 4, _state); ae_vector_set_length(&w, 4, _state); x.ptr.p_double[0] = (double)(0); y.ptr.p_double[0] = v-v2; w.ptr.p_double[0] = (double)(1); x.ptr.p_double[1] = (double)(0); y.ptr.p_double[1] = v-v1; w.ptr.p_double[1] = (double)(1); x.ptr.p_double[2] = (double)(0); y.ptr.p_double[2] = v+v1; w.ptr.p_double[2] = (double)(1); x.ptr.p_double[3] = (double)(0); y.ptr.p_double[3] = v+v2; w.ptr.p_double[3] = (double)(1); refrms = ae_sqrt((ae_sqr(v1, _state)+ae_sqr(v2, _state))/2, _state); refavg = (ae_fabs(v1, _state)+ae_fabs(v2, _state))/2; if( pass==1 ) { refavgrel = (double)(0); } else { refavgrel = 0.25*(ae_fabs(v2, _state)/ae_fabs(v-v2, _state)+ae_fabs(v1, _state)/ae_fabs(v-v1, _state)+ae_fabs(v1, _state)/ae_fabs(v+v1, _state)+ae_fabs(v2, _state)/ae_fabs(v+v2, _state)); } refmax = ae_maxreal(v1, v2, _state); /* * Test errors correctness */ barycentricfitfloaterhormann(&x, &y, 4, 2, &info, &b1, &rep, _state); if( info<=0 ) { *fiterrors = ae_true; } else { s = barycentriccalc(&b1, (double)(0), _state); *fiterrors = *fiterrors||ae_fp_greater(ae_fabs(s-v, _state),threshold); *fiterrors = *fiterrors||ae_fp_greater(ae_fabs(rep.rmserror-refrms, _state),threshold); *fiterrors = *fiterrors||ae_fp_greater(ae_fabs(rep.avgerror-refavg, _state),threshold); *fiterrors = *fiterrors||ae_fp_greater(ae_fabs(rep.avgrelerror-refavgrel, _state),threshold); *fiterrors = *fiterrors||ae_fp_greater(ae_fabs(rep.maxerror-refmax, _state),threshold); } } ae_frame_leave(_state); } static void testlsfitunit_testsplinefitting(ae_bool* fiterrors, ae_state *_state) { ae_frame _frame_block; double threshold; double nonstrictthreshold; ae_int_t passcount; ae_int_t n; ae_int_t m; ae_int_t i; ae_int_t k; ae_int_t pass; ae_vector x; ae_vector y; ae_vector w; ae_vector w2; ae_vector xc; ae_vector yc; ae_vector d; ae_vector dc; double sa; double sb; ae_int_t info; ae_int_t info1; ae_int_t info2; spline1dinterpolant c; spline1dinterpolant c2; spline1dfitreport rep; spline1dfitreport rep2; double s; double ds; double d2s; ae_int_t stype; double t; double v; double v1; double v2; double refrms; double refavg; double refavgrel; double refmax; double rho; ae_frame_make(_state, &_frame_block); ae_vector_init(&x, 0, DT_REAL, _state); ae_vector_init(&y, 0, DT_REAL, _state); ae_vector_init(&w, 0, DT_REAL, _state); ae_vector_init(&w2, 0, DT_REAL, _state); ae_vector_init(&xc, 0, DT_REAL, _state); ae_vector_init(&yc, 0, DT_REAL, _state); ae_vector_init(&d, 0, DT_REAL, _state); ae_vector_init(&dc, 0, DT_INT, _state); _spline1dinterpolant_init(&c, _state); _spline1dinterpolant_init(&c2, _state); _spline1dfitreport_init(&rep, _state); _spline1dfitreport_init(&rep2, _state); /* * Valyes: * * pass count * * threshold - for tests which must be satisfied exactly * * nonstrictthreshold - for approximate tests */ passcount = 20; threshold = 10000*ae_machineepsilon; nonstrictthreshold = 1.0E-4; *fiterrors = ae_false; /* * Test fitting by Cubic and Hermite splines (obsolete, but still supported) */ for(pass=1; pass<=passcount; pass++) { /* * Cubic splines * Ability to handle boundary constraints (1-4 constraints on F, dF/dx). 
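*
* Illustrative sketch (assumed C++ wrapper mirroring the core call used below,
* not executed by the test): a constraint triple (XC[i], YC[i], DC[i]) pins the
* fitted spline value (DC[i]=0), its first derivative (DC[i]=1) or its second
* derivative (DC[i]=2) at XC[i] to YC[i]. For example, forcing S(0)=1 and
* S'(0)=0 while fitting prepared arrays x, y, w of length n with m basis
* functions:
*
*     alglib::real_1d_array xc = "[0.0, 0.0]";
*     alglib::real_1d_array yc = "[1.0, 0.0]";
*     alglib::integer_1d_array dc = "[0, 1]";
*     alglib::ae_int_t info;
*     alglib::spline1dinterpolant s;
*     alglib::spline1dfitreport rep;
*     alglib::spline1dfitcubicwc(x, y, w, n, xc, yc, dc, 2, m, info, s, rep);
*     // info>0 on success, exactly as checked below via spline1ddiff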
*/ for(m=4; m<=8; m++) { for(k=1; k<=4; k++) { if( k>=m ) { continue; } n = 100; ae_vector_set_length(&x, n, _state); ae_vector_set_length(&y, n, _state); ae_vector_set_length(&w, n, _state); ae_vector_set_length(&xc, 4, _state); ae_vector_set_length(&yc, 4, _state); ae_vector_set_length(&dc, 4, _state); sa = 1+ae_randomreal(_state); sb = 2*ae_randomreal(_state)-1; for(i=0; i<=n-1; i++) { x.ptr.p_double[i] = sa*ae_randomreal(_state)+sb; y.ptr.p_double[i] = 2*ae_randomreal(_state)-1; w.ptr.p_double[i] = 1+ae_randomreal(_state); } xc.ptr.p_double[0] = sb; yc.ptr.p_double[0] = 2*ae_randomreal(_state)-1; dc.ptr.p_int[0] = 0; xc.ptr.p_double[1] = sb; yc.ptr.p_double[1] = 2*ae_randomreal(_state)-1; dc.ptr.p_int[1] = 1; xc.ptr.p_double[2] = sa+sb; yc.ptr.p_double[2] = 2*ae_randomreal(_state)-1; dc.ptr.p_int[2] = 0; xc.ptr.p_double[3] = sa+sb; yc.ptr.p_double[3] = 2*ae_randomreal(_state)-1; dc.ptr.p_int[3] = 1; spline1dfitcubicwc(&x, &y, &w, n, &xc, &yc, &dc, k, m, &info, &c, &rep, _state); if( info<=0 ) { *fiterrors = ae_true; } else { /* * Check that constraints are satisfied */ for(i=0; i<=k-1; i++) { spline1ddiff(&c, xc.ptr.p_double[i], &s, &ds, &d2s, _state); if( dc.ptr.p_int[i]==0 ) { *fiterrors = *fiterrors||ae_fp_greater(ae_fabs(s-yc.ptr.p_double[i], _state),threshold); } if( dc.ptr.p_int[i]==1 ) { *fiterrors = *fiterrors||ae_fp_greater(ae_fabs(ds-yc.ptr.p_double[i], _state),threshold); } if( dc.ptr.p_int[i]==2 ) { *fiterrors = *fiterrors||ae_fp_greater(ae_fabs(d2s-yc.ptr.p_double[i], _state),threshold); } } } } } /* * Cubic splines * Ability to handle one internal constraint */ for(m=4; m<=8; m++) { n = 100; ae_vector_set_length(&x, n, _state); ae_vector_set_length(&y, n, _state); ae_vector_set_length(&w, n, _state); ae_vector_set_length(&xc, 1, _state); ae_vector_set_length(&yc, 1, _state); ae_vector_set_length(&dc, 1, _state); sa = 1+ae_randomreal(_state); sb = 2*ae_randomreal(_state)-1; for(i=0; i<=n-1; i++) { x.ptr.p_double[i] = sa*ae_randomreal(_state)+sb; y.ptr.p_double[i] = 2*ae_randomreal(_state)-1; w.ptr.p_double[i] = 1+ae_randomreal(_state); } xc.ptr.p_double[0] = sa*ae_randomreal(_state)+sb; yc.ptr.p_double[0] = 2*ae_randomreal(_state)-1; dc.ptr.p_int[0] = ae_randominteger(2, _state); spline1dfitcubicwc(&x, &y, &w, n, &xc, &yc, &dc, 1, m, &info, &c, &rep, _state); if( info<=0 ) { *fiterrors = ae_true; } else { /* * Check that constraints are satisfied */ spline1ddiff(&c, xc.ptr.p_double[0], &s, &ds, &d2s, _state); if( dc.ptr.p_int[0]==0 ) { *fiterrors = *fiterrors||ae_fp_greater(ae_fabs(s-yc.ptr.p_double[0], _state),threshold); } if( dc.ptr.p_int[0]==1 ) { *fiterrors = *fiterrors||ae_fp_greater(ae_fabs(ds-yc.ptr.p_double[0], _state),threshold); } if( dc.ptr.p_int[0]==2 ) { *fiterrors = *fiterrors||ae_fp_greater(ae_fabs(d2s-yc.ptr.p_double[0], _state),threshold); } } } /* * Hermite splines * Ability to handle boundary constraints (1-4 constraints on F, dF/dx). 
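*
* Illustrative note (assumed C++ wrapper): spline1dfithermitewc takes the same
* argument list as spline1dfitcubicwc above, with one extra restriction visible
* in the "m%2!=0" guard below: M must be even, since the Hermite basis spends
* two degrees of freedom (value and first derivative) per node. A sketch,
* reusing the constraint arrays from the cubic example:
*
*     alglib::spline1dfithermitewc(x, y, w, n, xc, yc, dc, 2, m, info, s, rep);
*     // the passes below use even M between 4 and 8; info>0 on success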
*/ for(m=4; m<=8; m++) { for(k=1; k<=4; k++) { if( k>=m ) { continue; } if( m%2!=0 ) { continue; } n = 100; ae_vector_set_length(&x, n, _state); ae_vector_set_length(&y, n, _state); ae_vector_set_length(&w, n, _state); ae_vector_set_length(&xc, 4, _state); ae_vector_set_length(&yc, 4, _state); ae_vector_set_length(&dc, 4, _state); sa = 1+ae_randomreal(_state); sb = 2*ae_randomreal(_state)-1; for(i=0; i<=n-1; i++) { x.ptr.p_double[i] = sa*ae_randomreal(_state)+sb; y.ptr.p_double[i] = 2*ae_randomreal(_state)-1; w.ptr.p_double[i] = 1+ae_randomreal(_state); } xc.ptr.p_double[0] = sb; yc.ptr.p_double[0] = 2*ae_randomreal(_state)-1; dc.ptr.p_int[0] = 0; xc.ptr.p_double[1] = sb; yc.ptr.p_double[1] = 2*ae_randomreal(_state)-1; dc.ptr.p_int[1] = 1; xc.ptr.p_double[2] = sa+sb; yc.ptr.p_double[2] = 2*ae_randomreal(_state)-1; dc.ptr.p_int[2] = 0; xc.ptr.p_double[3] = sa+sb; yc.ptr.p_double[3] = 2*ae_randomreal(_state)-1; dc.ptr.p_int[3] = 1; spline1dfithermitewc(&x, &y, &w, n, &xc, &yc, &dc, k, m, &info, &c, &rep, _state); if( info<=0 ) { *fiterrors = ae_true; } else { /* * Check that constraints are satisfied */ for(i=0; i<=k-1; i++) { spline1ddiff(&c, xc.ptr.p_double[i], &s, &ds, &d2s, _state); if( dc.ptr.p_int[i]==0 ) { *fiterrors = *fiterrors||ae_fp_greater(ae_fabs(s-yc.ptr.p_double[i], _state),threshold); } if( dc.ptr.p_int[i]==1 ) { *fiterrors = *fiterrors||ae_fp_greater(ae_fabs(ds-yc.ptr.p_double[i], _state),threshold); } if( dc.ptr.p_int[i]==2 ) { *fiterrors = *fiterrors||ae_fp_greater(ae_fabs(d2s-yc.ptr.p_double[i], _state),threshold); } } } } } /* * Hermite splines * Ability to handle one internal constraint */ for(m=4; m<=8; m++) { if( m%2!=0 ) { continue; } n = 100; ae_vector_set_length(&x, n, _state); ae_vector_set_length(&y, n, _state); ae_vector_set_length(&w, n, _state); ae_vector_set_length(&xc, 1, _state); ae_vector_set_length(&yc, 1, _state); ae_vector_set_length(&dc, 1, _state); sa = 1+ae_randomreal(_state); sb = 2*ae_randomreal(_state)-1; for(i=0; i<=n-1; i++) { x.ptr.p_double[i] = sa*ae_randomreal(_state)+sb; y.ptr.p_double[i] = 2*ae_randomreal(_state)-1; w.ptr.p_double[i] = 1+ae_randomreal(_state); } xc.ptr.p_double[0] = sa*ae_randomreal(_state)+sb; yc.ptr.p_double[0] = 2*ae_randomreal(_state)-1; dc.ptr.p_int[0] = ae_randominteger(2, _state); spline1dfithermitewc(&x, &y, &w, n, &xc, &yc, &dc, 1, m, &info, &c, &rep, _state); if( info<=0 ) { *fiterrors = ae_true; } else { /* * Check that constraints are satisfied */ spline1ddiff(&c, xc.ptr.p_double[0], &s, &ds, &d2s, _state); if( dc.ptr.p_int[0]==0 ) { *fiterrors = *fiterrors||ae_fp_greater(ae_fabs(s-yc.ptr.p_double[0], _state),threshold); } if( dc.ptr.p_int[0]==1 ) { *fiterrors = *fiterrors||ae_fp_greater(ae_fabs(ds-yc.ptr.p_double[0], _state),threshold); } if( dc.ptr.p_int[0]==2 ) { *fiterrors = *fiterrors||ae_fp_greater(ae_fabs(d2s-yc.ptr.p_double[0], _state),threshold); } } } } for(m=4; m<=8; m++) { for(stype=0; stype<=1; stype++) { for(pass=1; pass<=passcount; pass++) { if( stype==1&&m%2!=0 ) { continue; } /* * cubic/Hermite spline fitting: * * generate "template spline" C2 * * generate 2*N points from C2, such that result of * ideal fit should be equal to C2 * * fit, store in C * * compare C and C2 */ sa = 1+ae_randomreal(_state); sb = 2*ae_randomreal(_state)-1; if( stype==0 ) { ae_vector_set_length(&x, m-2, _state); ae_vector_set_length(&y, m-2, _state); for(i=0; i<=m-2-1; i++) { x.ptr.p_double[i] = sa*i/(m-2-1)+sb; y.ptr.p_double[i] = 2*ae_randomreal(_state)-1; } spline1dbuildcubic(&x, &y, m-2, 1, 
2*ae_randomreal(_state)-1, 1, 2*ae_randomreal(_state)-1, &c2, _state); } if( stype==1 ) { ae_vector_set_length(&x, m/2, _state); ae_vector_set_length(&y, m/2, _state); ae_vector_set_length(&d, m/2, _state); for(i=0; i<=m/2-1; i++) { x.ptr.p_double[i] = sa*i/(m/2-1)+sb; y.ptr.p_double[i] = 2*ae_randomreal(_state)-1; d.ptr.p_double[i] = 2*ae_randomreal(_state)-1; } spline1dbuildhermite(&x, &y, &d, m/2, &c2, _state); } n = 50; ae_vector_set_length(&x, 2*n, _state); ae_vector_set_length(&y, 2*n, _state); ae_vector_set_length(&w, 2*n, _state); for(i=0; i<=n-1; i++) { /* * "if i=0" and "if i=1" are needed to * synchronize interval size for C2 and * spline being fitted (i.e. C). */ t = ae_randomreal(_state); x.ptr.p_double[i] = sa*ae_randomreal(_state)+sb; if( i==0 ) { x.ptr.p_double[i] = sb; } if( i==1 ) { x.ptr.p_double[i] = sa+sb; } v = spline1dcalc(&c2, x.ptr.p_double[i], _state); y.ptr.p_double[i] = v+t; w.ptr.p_double[i] = 1+ae_randomreal(_state); x.ptr.p_double[n+i] = x.ptr.p_double[i]; y.ptr.p_double[n+i] = v-t; w.ptr.p_double[n+i] = w.ptr.p_double[i]; } info = -1; if( stype==0 ) { spline1dfitcubicwc(&x, &y, &w, 2*n, &xc, &yc, &dc, 0, m, &info, &c, &rep, _state); } if( stype==1 ) { spline1dfithermitewc(&x, &y, &w, 2*n, &xc, &yc, &dc, 0, m, &info, &c, &rep, _state); } if( info<=0 ) { *fiterrors = ae_true; } else { for(i=0; i<=n-1; i++) { v = sa*ae_randomreal(_state)+sb; *fiterrors = *fiterrors||ae_fp_greater(ae_fabs(spline1dcalc(&c, v, _state)-spline1dcalc(&c2, v, _state), _state),threshold); } } } } } for(m=4; m<=8; m++) { for(pass=1; pass<=passcount; pass++) { /* * prepare points/weights */ n = 10+ae_randominteger(10, _state); ae_vector_set_length(&x, n, _state); ae_vector_set_length(&y, n, _state); ae_vector_set_length(&w, n, _state); for(i=0; i<=n-1; i++) { x.ptr.p_double[i] = i+(ae_randomreal(_state)-0.5); y.ptr.p_double[i] = 2*ae_randomreal(_state)-1; w.ptr.p_double[i] = (double)(1); } /* * Fit cubic with unity weights, without weights, then compare */ if( m>=4 ) { spline1dfitcubicwc(&x, &y, &w, n, &xc, &yc, &dc, 0, m, &info1, &c, &rep, _state); spline1dfitcubic(&x, &y, n, m, &info2, &c2, &rep2, _state); if( info1<=0||info2<=0 ) { *fiterrors = ae_true; } else { for(i=0; i<=n-1; i++) { v = ae_randomreal(_state)*(n-1); *fiterrors = *fiterrors||!approxequal(spline1dcalc(&c, v, _state), spline1dcalc(&c2, v, _state), 1.0E-12, _state); *fiterrors = *fiterrors||!approxequal(rep.taskrcond, rep2.taskrcond, 1.0E-12*maxreal3((double)(1), rep.taskrcond, rep2.taskrcond, _state), _state); *fiterrors = *fiterrors||!approxequal(rep.rmserror, rep2.rmserror, 1.0E-12*maxreal3((double)(1), rep.rmserror, rep2.rmserror, _state), _state); *fiterrors = *fiterrors||!approxequal(rep.avgerror, rep2.avgerror, 1.0E-12*maxreal3((double)(1), rep.avgerror, rep2.avgerror, _state), _state); *fiterrors = *fiterrors||!approxequal(rep.avgrelerror, rep2.avgrelerror, 1.0E-12*maxreal3((double)(1), rep.avgrelerror, rep2.avgrelerror, _state), _state); *fiterrors = *fiterrors||!approxequal(rep.maxerror, rep2.maxerror, 1.0E-12*maxreal3((double)(1), rep.maxerror, rep2.maxerror, _state), _state); } } } /* * Fit Hermite with unity weights, without weights, then compare */ if( m>=4&&m%2==0 ) { spline1dfithermitewc(&x, &y, &w, n, &xc, &yc, &dc, 0, m, &info1, &c, &rep, _state); spline1dfithermite(&x, &y, n, m, &info2, &c2, &rep2, _state); if( info1<=0||info2<=0 ) { *fiterrors = ae_true; } else { for(i=0; i<=n-1; i++) { v = ae_randomreal(_state)*(n-1); *fiterrors = *fiterrors||!approxequal(spline1dcalc(&c, v, _state), 
spline1dcalc(&c2, v, _state), 1.0E-12, _state); *fiterrors = *fiterrors||!approxequal(rep.taskrcond, rep2.taskrcond, 1.0E-12*maxreal3((double)(1), rep.taskrcond, rep2.taskrcond, _state), _state); *fiterrors = *fiterrors||!approxequal(rep.rmserror, rep2.rmserror, 1.0E-12*maxreal3((double)(1), rep.rmserror, rep2.rmserror, _state), _state); *fiterrors = *fiterrors||!approxequal(rep.avgerror, rep2.avgerror, 1.0E-12*maxreal3((double)(1), rep.avgerror, rep2.avgerror, _state), _state); *fiterrors = *fiterrors||!approxequal(rep.avgrelerror, rep2.avgrelerror, 1.0E-12*maxreal3((double)(1), rep.avgrelerror, rep2.avgrelerror, _state), _state); *fiterrors = *fiterrors||!approxequal(rep.maxerror, rep2.maxerror, 1.0E-12*maxreal3((double)(1), rep.maxerror, rep2.maxerror, _state), _state); } } } } } /* * check basic properties of penalized splines which are * preserved independently of Rho parameter. */ for(m=4; m<=10; m++) { for(k=-5; k<=5; k++) { rho = (double)(k); /* * when we have two points (even with different weights), * resulting spline must be equal to the straight line */ ae_vector_set_length(&x, 2, _state); ae_vector_set_length(&y, 2, _state); ae_vector_set_length(&w, 2, _state); x.ptr.p_double[0] = -0.5-ae_randomreal(_state); y.ptr.p_double[0] = 0.5+ae_randomreal(_state); w.ptr.p_double[0] = 1+ae_randomreal(_state); x.ptr.p_double[1] = 0.5+ae_randomreal(_state); y.ptr.p_double[1] = 0.5+ae_randomreal(_state); w.ptr.p_double[1] = 1+ae_randomreal(_state); spline1dfitpenalized(&x, &y, 2, m, rho, &info, &c, &rep, _state); if( info>0 ) { v = 2*ae_randomreal(_state)-1; v1 = (v-x.ptr.p_double[0])/(x.ptr.p_double[1]-x.ptr.p_double[0])*y.ptr.p_double[1]+(v-x.ptr.p_double[1])/(x.ptr.p_double[0]-x.ptr.p_double[1])*y.ptr.p_double[0]; *fiterrors = *fiterrors||ae_fp_greater(ae_fabs(v1-spline1dcalc(&c, v, _state), _state),nonstrictthreshold); } else { *fiterrors = ae_true; } spline1dfitpenalizedw(&x, &y, &w, 2, m, rho, &info, &c, &rep, _state); if( info>0 ) { v = 2*ae_randomreal(_state)-1; v1 = (v-x.ptr.p_double[0])/(x.ptr.p_double[1]-x.ptr.p_double[0])*y.ptr.p_double[1]+(v-x.ptr.p_double[1])/(x.ptr.p_double[0]-x.ptr.p_double[1])*y.ptr.p_double[0]; *fiterrors = *fiterrors||ae_fp_greater(ae_fabs(v1-spline1dcalc(&c, v, _state), _state),nonstrictthreshold); } else { *fiterrors = ae_true; } /* * spline fitting is invariant with respect to * scaling of weights (of course, ANY fitting algorithm * must be invariant, but we want to test this property * just to be sure that it is correctly implemented) */ for(n=2; n<=2*m; n++) { ae_vector_set_length(&x, n, _state); ae_vector_set_length(&y, n, _state); ae_vector_set_length(&w, n, _state); ae_vector_set_length(&w2, n, _state); s = 1+ae_exp(10*ae_randomreal(_state), _state); for(i=0; i<=n-1; i++) { x.ptr.p_double[i] = (double)i/(double)(n-1); y.ptr.p_double[i] = ae_randomreal(_state); w.ptr.p_double[i] = 0.1+ae_randomreal(_state); w2.ptr.p_double[i] = w.ptr.p_double[i]*s; } spline1dfitpenalizedw(&x, &y, &w, n, m, rho, &info, &c, &rep, _state); spline1dfitpenalizedw(&x, &y, &w2, n, m, rho, &info2, &c2, &rep2, _state); if( info>0&&info2>0 ) { v = ae_randomreal(_state); v1 = spline1dcalc(&c, v, _state); v2 = spline1dcalc(&c2, v, _state); *fiterrors = *fiterrors||ae_fp_greater(ae_fabs(v1-v2, _state),nonstrictthreshold); } else { *fiterrors = ae_true; } } } } /* * Advanced proprties: * * penalized spline with M about 5*N and sufficiently small Rho * must pass through all points on equidistant grid */ for(n=2; n<=10; n++) { m = 5*n; rho = (double)(-5); ae_vector_set_length(&x, 
n, _state); ae_vector_set_length(&y, n, _state); ae_vector_set_length(&w, n, _state); for(i=0; i<=n-1; i++) { x.ptr.p_double[i] = (double)i/(double)(n-1); y.ptr.p_double[i] = ae_randomreal(_state); w.ptr.p_double[i] = 0.1+ae_randomreal(_state); } spline1dfitpenalized(&x, &y, n, m, rho, &info, &c, &rep, _state); if( info>0 ) { for(i=0; i<=n-1; i++) { *fiterrors = *fiterrors||ae_fp_greater(ae_fabs(y.ptr.p_double[i]-spline1dcalc(&c, x.ptr.p_double[i], _state), _state),nonstrictthreshold); } } else { *fiterrors = ae_true; } spline1dfitpenalizedw(&x, &y, &w, n, m, rho, &info, &c, &rep, _state); if( info>0 ) { for(i=0; i<=n-1; i++) { *fiterrors = *fiterrors||ae_fp_greater(ae_fabs(y.ptr.p_double[i]-spline1dcalc(&c, x.ptr.p_double[i], _state), _state),nonstrictthreshold); } } else { *fiterrors = ae_true; } } /* * Check correctness of error reports */ for(pass=1; pass<=passcount; pass++) { ae_assert(passcount>=2, "PassCount should be 2 or greater!", _state); /* * solve simple task (all X[] are the same, Y[] are specially * calculated to ensure simple form of all types of errors) * and check correctness of the errors calculated by subroutines * * First pass is done with zero Y[], other passes - with random Y[]. * It should test both ability to correctly calculate errors and * ability to not fail while working with zeros :) */ n = 4; if( pass==1 ) { v1 = (double)(0); v2 = (double)(0); v = (double)(0); } else { v1 = ae_randomreal(_state); v2 = ae_randomreal(_state); v = 1+ae_randomreal(_state); } ae_vector_set_length(&x, 4, _state); ae_vector_set_length(&y, 4, _state); ae_vector_set_length(&w, 4, _state); x.ptr.p_double[0] = (double)(0); y.ptr.p_double[0] = v-v2; w.ptr.p_double[0] = (double)(1); x.ptr.p_double[1] = (double)(0); y.ptr.p_double[1] = v-v1; w.ptr.p_double[1] = (double)(1); x.ptr.p_double[2] = (double)(0); y.ptr.p_double[2] = v+v1; w.ptr.p_double[2] = (double)(1); x.ptr.p_double[3] = (double)(0); y.ptr.p_double[3] = v+v2; w.ptr.p_double[3] = (double)(1); refrms = ae_sqrt((ae_sqr(v1, _state)+ae_sqr(v2, _state))/2, _state); refavg = (ae_fabs(v1, _state)+ae_fabs(v2, _state))/2; if( pass==1 ) { refavgrel = (double)(0); } else { refavgrel = 0.25*(ae_fabs(v2, _state)/ae_fabs(v-v2, _state)+ae_fabs(v1, _state)/ae_fabs(v-v1, _state)+ae_fabs(v1, _state)/ae_fabs(v+v1, _state)+ae_fabs(v2, _state)/ae_fabs(v+v2, _state)); } refmax = ae_maxreal(v1, v2, _state); /* * Test penalized spline */ spline1dfitpenalizedw(&x, &y, &w, 4, 4, 0.0, &info, &c, &rep, _state); if( info<=0 ) { *fiterrors = ae_true; } else { s = spline1dcalc(&c, (double)(0), _state); *fiterrors = *fiterrors||ae_fp_greater(ae_fabs(s-v, _state),threshold); *fiterrors = *fiterrors||ae_fp_greater(ae_fabs(rep.rmserror-refrms, _state),threshold); *fiterrors = *fiterrors||ae_fp_greater(ae_fabs(rep.avgerror-refavg, _state),threshold); *fiterrors = *fiterrors||ae_fp_greater(ae_fabs(rep.avgrelerror-refavgrel, _state),threshold); *fiterrors = *fiterrors||ae_fp_greater(ae_fabs(rep.maxerror-refmax, _state),threshold); } /* * Test cubic fitting */ spline1dfitcubic(&x, &y, 4, 4, &info, &c, &rep, _state); if( info<=0 ) { *fiterrors = ae_true; } else { s = spline1dcalc(&c, (double)(0), _state); *fiterrors = *fiterrors||ae_fp_greater(ae_fabs(s-v, _state),threshold); *fiterrors = *fiterrors||ae_fp_greater(ae_fabs(rep.rmserror-refrms, _state),threshold); *fiterrors = *fiterrors||ae_fp_greater(ae_fabs(rep.avgerror-refavg, _state),threshold); *fiterrors = *fiterrors||ae_fp_greater(ae_fabs(rep.avgrelerror-refavgrel, _state),threshold); *fiterrors = 
*fiterrors||ae_fp_greater(ae_fabs(rep.maxerror-refmax, _state),threshold); } /* * Test Hermite fitting */ spline1dfithermite(&x, &y, 4, 4, &info, &c, &rep, _state); if( info<=0 ) { *fiterrors = ae_true; } else { s = spline1dcalc(&c, (double)(0), _state); *fiterrors = *fiterrors||ae_fp_greater(ae_fabs(s-v, _state),threshold); *fiterrors = *fiterrors||ae_fp_greater(ae_fabs(rep.rmserror-refrms, _state),threshold); *fiterrors = *fiterrors||ae_fp_greater(ae_fabs(rep.avgerror-refavg, _state),threshold); *fiterrors = *fiterrors||ae_fp_greater(ae_fabs(rep.avgrelerror-refavgrel, _state),threshold); *fiterrors = *fiterrors||ae_fp_greater(ae_fabs(rep.maxerror-refmax, _state),threshold); } } ae_frame_leave(_state); } static void testlsfitunit_testgeneralfitting(ae_bool* llserrors, ae_bool* nlserrors, ae_state *_state) { ae_frame _frame_block; double threshold; double nlthreshold; ae_int_t maxn; ae_int_t maxm; ae_int_t skind; ae_int_t pkind; ae_int_t passcount; ae_int_t n; ae_int_t m; ae_int_t i; ae_int_t j; ae_int_t k; ae_int_t pass; double xscale; double cscale; double wscale; double noiselevel; double tol; double diffstep; ae_vector x; ae_vector y; ae_vector y2; ae_vector w; ae_vector w2; ae_vector s; ae_vector c; ae_vector cstart; ae_vector cend; ae_vector c2; ae_matrix a; ae_matrix a2; ae_matrix cm; double v; double v1; double v2; lsfitreport rep; lsfitreport rep2; ae_int_t info; ae_int_t info2; double refrms; double refavg; double refavgrel; double refmax; double avgdeviationpar; double avgdeviationcurve; double avgdeviationnoise; double adccnt; double adpcnt; double adncnt; lsfitstate state; ae_frame_make(_state, &_frame_block); ae_vector_init(&x, 0, DT_REAL, _state); ae_vector_init(&y, 0, DT_REAL, _state); ae_vector_init(&y2, 0, DT_REAL, _state); ae_vector_init(&w, 0, DT_REAL, _state); ae_vector_init(&w2, 0, DT_REAL, _state); ae_vector_init(&s, 0, DT_REAL, _state); ae_vector_init(&c, 0, DT_REAL, _state); ae_vector_init(&cstart, 0, DT_REAL, _state); ae_vector_init(&cend, 0, DT_REAL, _state); ae_vector_init(&c2, 0, DT_REAL, _state); ae_matrix_init(&a, 0, 0, DT_REAL, _state); ae_matrix_init(&a2, 0, 0, DT_REAL, _state); ae_matrix_init(&cm, 0, 0, DT_REAL, _state); _lsfitreport_init(&rep, _state); _lsfitreport_init(&rep2, _state); _lsfitstate_init(&state, _state); *llserrors = ae_false; *nlserrors = ae_false; threshold = 10000*ae_machineepsilon; nlthreshold = 0.00001; diffstep = 0.0001; maxn = 6; maxm = 6; passcount = 4; /* * Test constrained NLS problems */ testlsfitunit_testbcnls(nlserrors, _state); testlsfitunit_testlcnls(nlserrors, _state); /* * Testing unconstrained least squares (linear/nonlinear) */ for(n=1; n<=maxn; n++) { for(m=1; m<=maxm; m++) { for(pass=1; pass<=passcount; pass++) { /* * Solve non-degenerate linear least squares task * Use Chebyshev basis. Its condition number is very good. */ ae_matrix_set_length(&a, n, m, _state); ae_vector_set_length(&x, n, _state); ae_vector_set_length(&y, n, _state); ae_vector_set_length(&w, n, _state); xscale = 0.9+0.1*ae_randomreal(_state); for(i=0; i<=n-1; i++) { if( n==1 ) { x.ptr.p_double[i] = 2*ae_randomreal(_state)-1; } else { x.ptr.p_double[i] = xscale*((double)(2*i)/(double)(n-1)-1); } y.ptr.p_double[i] = 3*x.ptr.p_double[i]+ae_exp(x.ptr.p_double[i], _state); w.ptr.p_double[i] = 1+ae_randomreal(_state); a.ptr.pp_double[i][0] = (double)(1); if( m>1 ) { a.ptr.pp_double[i][1] = x.ptr.p_double[i]; } for(j=2; j<=m-1; j++) { a.ptr.pp_double[i][j] = 2*x.ptr.p_double[i]*a.ptr.pp_double[i][j-1]-a.ptr.pp_double[i][j-2]; } } /* * 1. 
test weighted fitting (optimality) * 2. Solve degenerate least squares task built on the basis * of previous task */ lsfitlinearw(&y, &w, &a, n, m, &info, &c, &rep, _state); if( info<=0 ) { *llserrors = ae_true; } else { *llserrors = *llserrors||!testlsfitunit_isglssolution(n, m, 0, &y, &w, &a, &cm, &c, _state); } ae_matrix_set_length(&a2, n, 2*m, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=m-1; j++) { a2.ptr.pp_double[i][2*j+0] = a.ptr.pp_double[i][j]; a2.ptr.pp_double[i][2*j+1] = a.ptr.pp_double[i][j]; } } lsfitlinearw(&y, &w, &a2, n, 2*m, &info, &c2, &rep, _state); if( info<=0 ) { *llserrors = ae_true; } else { /* * test answer correctness using design matrix properties * and previous task solution */ for(j=0; j<=m-1; j++) { *llserrors = *llserrors||ae_fp_greater(ae_fabs(c2.ptr.p_double[2*j+0]+c2.ptr.p_double[2*j+1]-c.ptr.p_double[j], _state),threshold); } } /* * test non-weighted fitting */ ae_vector_set_length(&w2, n, _state); for(i=0; i<=n-1; i++) { w2.ptr.p_double[i] = (double)(1); } lsfitlinearw(&y, &w2, &a, n, m, &info, &c, &rep, _state); lsfitlinear(&y, &a, n, m, &info2, &c2, &rep2, _state); if( info<=0||info2<=0 ) { *llserrors = ae_true; } else { /* * test answer correctness */ for(j=0; j<=m-1; j++) { *llserrors = *llserrors||ae_fp_greater(ae_fabs(c.ptr.p_double[j]-c2.ptr.p_double[j], _state),threshold); } *llserrors = *llserrors||ae_fp_greater(ae_fabs(rep.taskrcond-rep2.taskrcond, _state),threshold); } /* * test nonlinear fitting on the linear task * (only non-degenerate tasks are tested) * and compare with answer from linear fitting subroutine */ if( n>=m ) { ae_vector_set_length(&c2, m, _state); /* * test function/gradient/Hessian-based weighted fitting */ lsfitlinearw(&y, &w, &a, n, m, &info, &c, &rep, _state); for(i=0; i<=m-1; i++) { c2.ptr.p_double[i] = 2*ae_randomreal(_state)-1; } lsfitcreatewf(&a, &y, &w, &c2, n, m, m, diffstep, &state, _state); lsfitsetcond(&state, nlthreshold, 0, _state); testlsfitunit_fitlinearnonlinear(m, 0, &a, &state, nlserrors, _state); lsfitresults(&state, &info, &c2, &rep2, _state); if( info<=0 ) { *nlserrors = ae_true; } else { for(i=0; i<=m-1; i++) { *nlserrors = *nlserrors||ae_fp_greater(ae_fabs(c.ptr.p_double[i]-c2.ptr.p_double[i], _state),100*nlthreshold); } } for(i=0; i<=m-1; i++) { c2.ptr.p_double[i] = 2*ae_randomreal(_state)-1; } lsfitcreatewfg(&a, &y, &w, &c2, n, m, m, ae_fp_greater(ae_randomreal(_state),0.5), &state, _state); lsfitsetcond(&state, nlthreshold, 0, _state); testlsfitunit_fitlinearnonlinear(m, 1, &a, &state, nlserrors, _state); lsfitresults(&state, &info, &c2, &rep2, _state); if( info<=0 ) { *nlserrors = ae_true; } else { for(i=0; i<=m-1; i++) { *nlserrors = *nlserrors||ae_fp_greater(ae_fabs(c.ptr.p_double[i]-c2.ptr.p_double[i], _state),100*nlthreshold); } } for(i=0; i<=m-1; i++) { c2.ptr.p_double[i] = 2*ae_randomreal(_state)-1; } lsfitcreatewfgh(&a, &y, &w, &c2, n, m, m, &state, _state); lsfitsetcond(&state, nlthreshold, 0, _state); testlsfitunit_fitlinearnonlinear(m, 2, &a, &state, nlserrors, _state); lsfitresults(&state, &info, &c2, &rep2, _state); if( info<=0 ) { *nlserrors = ae_true; } else { for(i=0; i<=m-1; i++) { *nlserrors = *nlserrors||ae_fp_greater(ae_fabs(c.ptr.p_double[i]-c2.ptr.p_double[i], _state),100*nlthreshold); } } /* * test gradient-only or Hessian-based fitting without weights */ lsfitlinear(&y, &a, n, m, &info, &c, &rep, _state); for(i=0; i<=m-1; i++) { c2.ptr.p_double[i] = 2*ae_randomreal(_state)-1; } lsfitcreatef(&a, &y, &c2, n, m, m, diffstep, &state, _state); lsfitsetcond(&state, nlthreshold, 0, 
_state); testlsfitunit_fitlinearnonlinear(m, 0, &a, &state, nlserrors, _state); lsfitresults(&state, &info, &c2, &rep2, _state); if( info<=0 ) { *nlserrors = ae_true; } else { for(i=0; i<=m-1; i++) { *nlserrors = *nlserrors||ae_fp_greater(ae_fabs(c.ptr.p_double[i]-c2.ptr.p_double[i], _state),100*nlthreshold); } } for(i=0; i<=m-1; i++) { c2.ptr.p_double[i] = 2*ae_randomreal(_state)-1; } lsfitcreatefg(&a, &y, &c2, n, m, m, ae_fp_greater(ae_randomreal(_state),0.5), &state, _state); lsfitsetcond(&state, nlthreshold, 0, _state); testlsfitunit_fitlinearnonlinear(m, 1, &a, &state, nlserrors, _state); lsfitresults(&state, &info, &c2, &rep2, _state); if( info<=0 ) { *nlserrors = ae_true; } else { for(i=0; i<=m-1; i++) { *nlserrors = *nlserrors||ae_fp_greater(ae_fabs(c.ptr.p_double[i]-c2.ptr.p_double[i], _state),100*nlthreshold); } } for(i=0; i<=m-1; i++) { c2.ptr.p_double[i] = 2*ae_randomreal(_state)-1; } lsfitcreatefgh(&a, &y, &c2, n, m, m, &state, _state); lsfitsetcond(&state, nlthreshold, 0, _state); testlsfitunit_fitlinearnonlinear(m, 2, &a, &state, nlserrors, _state); lsfitresults(&state, &info, &c2, &rep2, _state); if( info<=0 ) { *nlserrors = ae_true; } else { for(i=0; i<=m-1; i++) { *nlserrors = *nlserrors||ae_fp_greater(ae_fabs(c.ptr.p_double[i]-c2.ptr.p_double[i], _state),100*nlthreshold); } } } } } /* * test correctness of the RCond field */ ae_matrix_set_length(&a, n-1+1, n-1+1, _state); ae_vector_set_length(&x, n-1+1, _state); ae_vector_set_length(&y, n-1+1, _state); ae_vector_set_length(&w, n-1+1, _state); v1 = ae_maxrealnumber; v2 = ae_minrealnumber; for(i=0; i<=n-1; i++) { x.ptr.p_double[i] = 0.1+0.9*ae_randomreal(_state); y.ptr.p_double[i] = 0.1+0.9*ae_randomreal(_state); w.ptr.p_double[i] = (double)(1); for(j=0; j<=n-1; j++) { if( i==j ) { a.ptr.pp_double[i][i] = 0.1+0.9*ae_randomreal(_state); v1 = ae_minreal(v1, a.ptr.pp_double[i][i], _state); v2 = ae_maxreal(v2, a.ptr.pp_double[i][i], _state); } else { a.ptr.pp_double[i][j] = (double)(0); } } } lsfitlinearw(&y, &w, &a, n, n, &info, &c, &rep, _state); if( info<=0 ) { *llserrors = ae_true; } else { *llserrors = *llserrors||ae_fp_greater(ae_fabs(rep.taskrcond-v1/v2, _state),threshold); } } /* * Test constrained least squares */ for(pass=1; pass<=passcount; pass++) { for(n=1; n<=maxn; n++) { for(m=1; m<=maxm; m++) { /* * test for K<>0 */ for(k=1; k<=m-1; k++) { /* * Prepare Chebyshev basis. Its condition number is very good. 
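*
* The design matrix built below uses the Chebyshev recurrence
* T_0(x)=1, T_1(x)=x, T_j(x)=2*x*T_{j-1}(x)-T_{j-2}(x). A minimal sketch of
* filling one row of the N x M matrix in plain C (illustrative only, it simply
* mirrors the loop below):
*
*     a[0] = 1.0;
*     if( m>1 )
*         a[1] = x;
*     for(j=2; j<m; j++)
*         a[j] = 2.0*x*a[j-1]-a[j-2];
*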
* Prepare constraints (random numbers) */ ae_matrix_set_length(&a, n, m, _state); ae_vector_set_length(&x, n, _state); ae_vector_set_length(&y, n, _state); ae_vector_set_length(&w, n, _state); xscale = 0.9+0.1*ae_randomreal(_state); for(i=0; i<=n-1; i++) { if( n==1 ) { x.ptr.p_double[i] = 2*ae_randomreal(_state)-1; } else { x.ptr.p_double[i] = xscale*((double)(2*i)/(double)(n-1)-1); } y.ptr.p_double[i] = 3*x.ptr.p_double[i]+ae_exp(x.ptr.p_double[i], _state); w.ptr.p_double[i] = 1+ae_randomreal(_state); a.ptr.pp_double[i][0] = (double)(1); if( m>1 ) { a.ptr.pp_double[i][1] = x.ptr.p_double[i]; } for(j=2; j<=m-1; j++) { a.ptr.pp_double[i][j] = 2*x.ptr.p_double[i]*a.ptr.pp_double[i][j-1]-a.ptr.pp_double[i][j-2]; } } ae_matrix_set_length(&cm, k, m+1, _state); for(i=0; i<=k-1; i++) { for(j=0; j<=m; j++) { cm.ptr.pp_double[i][j] = 2*ae_randomreal(_state)-1; } } /* * Solve constrained task */ lsfitlinearwc(&y, &w, &a, &cm, n, m, k, &info, &c, &rep, _state); if( info<=0 ) { *llserrors = ae_true; } else { *llserrors = *llserrors||!testlsfitunit_isglssolution(n, m, k, &y, &w, &a, &cm, &c, _state); } /* * test non-weighted fitting */ ae_vector_set_length(&w2, n, _state); for(i=0; i<=n-1; i++) { w2.ptr.p_double[i] = (double)(1); } lsfitlinearwc(&y, &w2, &a, &cm, n, m, k, &info, &c, &rep, _state); lsfitlinearc(&y, &a, &cm, n, m, k, &info2, &c2, &rep2, _state); if( info<=0||info2<=0 ) { *llserrors = ae_true; } else { /* * test answer correctness */ for(j=0; j<=m-1; j++) { *llserrors = *llserrors||ae_fp_greater(ae_fabs(c.ptr.p_double[j]-c2.ptr.p_double[j], _state),threshold); } *llserrors = *llserrors||ae_fp_greater(ae_fabs(rep.taskrcond-rep2.taskrcond, _state),threshold); } } } } } /* * nonlinear task for nonlinear fitting: * * f(X,C) = 1/(1+C*X^2), * C(true) = 2. 
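*
* For reference, the derivative of this model with respect to C, which the
* gradient callback below implements, is dF/dC = -X^2/(1+C*X^2)^2. An
* illustrative evaluation in plain C (not part of the test itself):
*
*     double f  = 1.0/(1.0+c*x*x);
*     double dc = -(x*x)*f*f;
*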
*/ n = 100; ae_vector_set_length(&c, 1, _state); c.ptr.p_double[0] = 1+2*ae_randomreal(_state); ae_matrix_set_length(&a, n, 1, _state); ae_vector_set_length(&y, n, _state); for(i=0; i<=n-1; i++) { a.ptr.pp_double[i][0] = 4*ae_randomreal(_state)-2; y.ptr.p_double[i] = 1/(1+2*ae_sqr(a.ptr.pp_double[i][0], _state)); } lsfitcreatefg(&a, &y, &c, n, 1, 1, ae_true, &state, _state); lsfitsetcond(&state, nlthreshold, 0, _state); while(lsfititeration(&state, _state)) { if( state.needf ) { state.f = 1/(1+state.c.ptr.p_double[0]*ae_sqr(state.x.ptr.p_double[0], _state)); } if( state.needfg ) { state.f = 1/(1+state.c.ptr.p_double[0]*ae_sqr(state.x.ptr.p_double[0], _state)); state.g.ptr.p_double[0] = -ae_sqr(state.x.ptr.p_double[0], _state)/ae_sqr(1+state.c.ptr.p_double[0]*ae_sqr(state.x.ptr.p_double[0], _state), _state); } } lsfitresults(&state, &info, &c, &rep, _state); if( info<=0 ) { seterrorflag(nlserrors, ae_true, _state); } else { seterrorflag(nlserrors, ae_fp_greater(ae_fabs(c.ptr.p_double[0]-2, _state),100*nlthreshold), _state); } /* * solve simple task (fitting by constant function) and check * correctness of the errors calculated by subroutines */ for(pass=1; pass<=passcount; pass++) { /* * test on task with non-zero Yi */ n = 4; v1 = ae_randomreal(_state); v2 = ae_randomreal(_state); v = 1+ae_randomreal(_state); ae_vector_set_length(&c, 1, _state); c.ptr.p_double[0] = 1+2*ae_randomreal(_state); ae_matrix_set_length(&a, 4, 1, _state); ae_vector_set_length(&y, 4, _state); a.ptr.pp_double[0][0] = (double)(1); y.ptr.p_double[0] = v-v2; a.ptr.pp_double[1][0] = (double)(1); y.ptr.p_double[1] = v-v1; a.ptr.pp_double[2][0] = (double)(1); y.ptr.p_double[2] = v+v1; a.ptr.pp_double[3][0] = (double)(1); y.ptr.p_double[3] = v+v2; refrms = ae_sqrt((ae_sqr(v1, _state)+ae_sqr(v2, _state))/2, _state); refavg = (ae_fabs(v1, _state)+ae_fabs(v2, _state))/2; refavgrel = 0.25*(ae_fabs(v2, _state)/ae_fabs(v-v2, _state)+ae_fabs(v1, _state)/ae_fabs(v-v1, _state)+ae_fabs(v1, _state)/ae_fabs(v+v1, _state)+ae_fabs(v2, _state)/ae_fabs(v+v2, _state)); refmax = ae_maxreal(v1, v2, _state); /* * Test LLS */ lsfitlinear(&y, &a, 4, 1, &info, &c, &rep, _state); if( info<=0 ) { *llserrors = ae_true; } else { *llserrors = *llserrors||ae_fp_greater(ae_fabs(c.ptr.p_double[0]-v, _state),threshold); *llserrors = *llserrors||ae_fp_greater(ae_fabs(rep.rmserror-refrms, _state),threshold); *llserrors = *llserrors||ae_fp_greater(ae_fabs(rep.avgerror-refavg, _state),threshold); *llserrors = *llserrors||ae_fp_greater(ae_fabs(rep.avgrelerror-refavgrel, _state),threshold); *llserrors = *llserrors||ae_fp_greater(ae_fabs(rep.maxerror-refmax, _state),threshold); } /* * Test NLS */ lsfitcreatefg(&a, &y, &c, 4, 1, 1, ae_true, &state, _state); lsfitsetcond(&state, nlthreshold, 0, _state); while(lsfititeration(&state, _state)) { if( state.needf ) { state.f = state.c.ptr.p_double[0]; } if( state.needfg ) { state.f = state.c.ptr.p_double[0]; state.g.ptr.p_double[0] = (double)(1); } } lsfitresults(&state, &info, &c, &rep, _state); if( info<=0 ) { seterrorflag(nlserrors, ae_true, _state); } else { seterrorflag(nlserrors, ae_fp_greater(ae_fabs(c.ptr.p_double[0]-v, _state),threshold), _state); seterrorflag(nlserrors, ae_fp_greater(ae_fabs(rep.rmserror-refrms, _state),threshold), _state); seterrorflag(nlserrors, ae_fp_greater(ae_fabs(rep.avgerror-refavg, _state),threshold), _state); seterrorflag(nlserrors, ae_fp_greater(ae_fabs(rep.avgrelerror-refavgrel, _state),threshold), _state); seterrorflag(nlserrors, ae_fp_greater(ae_fabs(rep.maxerror-refmax, 
_state),threshold), _state); } } /* * Check covariance matrix, errors-in-parameters. * * We test three different solvers: * * nonlinear solver * * unconstrained linear solver * * constrained linear solver with empty set of constrains * on two random problems: * * problem with known prior, noise, unit weights * * problem with known prior, noise, non-unit weights * * We test that: * * rep.ErrPar=sqrt(diag(Rep.CovPar)) * * Rep.ErrPar is not too optimistic - average value of ratio * between |c_fit-c_prior| and ErrPar[] is less than TOL * * Rep.ErrPar is not too pessimistic - average value of ratio * is larger than 1/TOL * * similarly, Rep.ErrCurve gives good estimate of |A*c_fit - A*c_prior| * - not optimistic, not pessimistic. * * similarly, per-point noise estimates are good enough (we use * slightly different tolerances, though) * In order to have these estimates we perform many different tests * and calculate average deviation divided by ErrPar/ErrCurve. Then * we perform test. * * Due to stochastic nature of the test it is not good idea to * consider each case individually - it is better to average over * many runs. * */ tol = 10.0; for(n=1; n<=10; n++) { for(skind=0; skind<=2; skind++) { for(pkind=0; pkind<=1; pkind++) { /* * Generate problem: * * PKind=0 - unit weights * * PKind=1 - non-unit weights, exact estimate of noise at I-th point * * We generate: * * C - prior values of parameters * * CStart - random initial point * * A - function matrix * * Y - noisy version of A*C * * W - weights vector * * S - vector of per-point estimates of noise */ cscale = ae_pow(10.0, 2*randomnormal(_state), _state); xscale = ae_pow(10.0, 2*randomnormal(_state), _state); noiselevel = 0.01*cscale*xscale; ae_vector_set_length(&c, n, _state); ae_vector_set_length(&cstart, n, _state); for(i=0; i<=n-1; i++) { c.ptr.p_double[i] = cscale*randomnormal(_state); cstart.ptr.p_double[i] = cscale*randomnormal(_state); } ae_matrix_set_length(&a, 1000, n, _state); ae_vector_set_length(&y, a.rows, _state); ae_vector_set_length(&w, a.rows, _state); ae_vector_set_length(&s, a.rows, _state); for(i=0; i<=a.rows-1; i++) { for(j=0; j<=n-1; j++) { a.ptr.pp_double[i][j] = xscale*randomnormal(_state); } v = ae_v_dotproduct(&a.ptr.pp_double[i][0], 1, &c.ptr.p_double[0], 1, ae_v_len(0,n-1)); if( pkind==0 ) { w.ptr.p_double[i] = (double)(1); s.ptr.p_double[i] = noiselevel; y.ptr.p_double[i] = v+s.ptr.p_double[i]*randomnormal(_state); } if( pkind==1 ) { w.ptr.p_double[i] = 1/noiselevel; s.ptr.p_double[i] = noiselevel; y.ptr.p_double[i] = v+s.ptr.p_double[i]*randomnormal(_state); } } /* * Test different solvers: * * SKind=0 - nonlinear solver * * SKind=1 - linear unconstrained * * SKind=2 - linear constrained with empty set of constraints */ info = -1; if( skind==0 ) { if( ae_fp_greater(ae_randomreal(_state),0.5) ) { lsfitcreatefg(&a, &y, &cstart, a.rows, n, n, ae_true, &state, _state); } else { lsfitcreatef(&a, &y, &cstart, a.rows, n, n, 0.001*cscale, &state, _state); } lsfitsetcond(&state, 0.0, 10, _state); while(lsfititeration(&state, _state)) { if( state.needf ) { state.f = (double)(0); for(i=0; i<=n-1; i++) { state.f = state.f+state.c.ptr.p_double[i]*state.x.ptr.p_double[i]; } } if( state.needfg ) { state.f = (double)(0); for(i=0; i<=n-1; i++) { state.f = state.f+state.c.ptr.p_double[i]*state.x.ptr.p_double[i]; state.g.ptr.p_double[i] = state.x.ptr.p_double[i]; } } } lsfitresults(&state, &info, &cend, &rep, _state); } if( skind==1 ) { if( pkind==0 ) { lsfitlinear(&y, &a, a.rows, n, &info, &cend, &rep, _state); } else { 
lsfitlinearw(&y, &w, &a, a.rows, n, &info, &cend, &rep, _state); } } if( skind==2 ) { if( pkind==0 ) { lsfitlinearc(&y, &a, &a2, a.rows, n, 0, &info, &cend, &rep, _state); } else { lsfitlinearwc(&y, &w, &a, &a2, a.rows, n, 0, &info, &cend, &rep, _state); } } /* * Tests: * * check relation between CovPar and ErrPar * * accumulate average deviation in parameters * * accumulate average deviation in curve fit * * accumulate average deviation in noise estimate */ avgdeviationpar = (double)(0); adpcnt = (double)(0); avgdeviationcurve = (double)(0); adccnt = (double)(0); avgdeviationnoise = (double)(0); adncnt = (double)(0); for(i=0; i<=n-1; i++) { seterrorflag(llserrors, ae_fp_greater(ae_fabs(rep.covpar.ptr.pp_double[i][i]-ae_sqr(rep.errpar.ptr.p_double[i], _state), _state),100*ae_machineepsilon*ae_maxreal(ae_sqr(rep.errpar.ptr.p_double[i], _state), rep.covpar.ptr.pp_double[i][i], _state)), _state); } for(i=0; i<=n-1; i++) { avgdeviationpar = (avgdeviationpar*adpcnt+ae_fabs(c.ptr.p_double[i]-cend.ptr.p_double[i], _state)/rep.errpar.ptr.p_double[i])/(adpcnt+1); adpcnt = adpcnt+1; } for(i=0; i<=a.rows-1; i++) { v1 = ae_v_dotproduct(&c.ptr.p_double[0], 1, &a.ptr.pp_double[i][0], 1, ae_v_len(0,n-1)); v2 = ae_v_dotproduct(&cend.ptr.p_double[0], 1, &a.ptr.pp_double[i][0], 1, ae_v_len(0,n-1)); avgdeviationcurve = (avgdeviationcurve*adccnt+ae_fabs(v1-v2, _state)/rep.errcurve.ptr.p_double[i])/(adccnt+1); adccnt = adccnt+1; avgdeviationnoise = (avgdeviationnoise*adncnt+rep.noise.ptr.p_double[i]/s.ptr.p_double[i])/(adncnt+1); adncnt = adncnt+1; } /* * Check that estimates are not too optimistic. * This test is performed always. */ seterrorflag(llserrors, ae_fp_greater(avgdeviationpar,tol), _state); seterrorflag(llserrors, ae_fp_greater(avgdeviationcurve,tol), _state); seterrorflag(llserrors, ae_fp_greater(avgdeviationnoise,1.50), _state); seterrorflag(llserrors, ae_fp_less(avgdeviationnoise,0.66), _state); /* * Test for estimates being too pessimistic is performed only * when we have more than 4 parameters. */ seterrorflag(llserrors, n>=5&&ae_fp_less(avgdeviationcurve,0.01), _state); seterrorflag(llserrors, n>=5&&ae_fp_less(avgdeviationpar,0.01), _state); } } } /* * Check special property of the LSFit solver: it does not include points with * zero weight in the estimate of the noise level. Such property seems to be * quite natural, but in fact it requires some additional code in order to * ignore such points. * * In order to test it we solve two problems: one 300xN, with 150 non-zero * weights and 150 zero weights - and another one with only 150 points with * non-zero weights. Both problems should give us same covariance matrix. 
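*
* Rationale (informal): with weights W the weighted normal equations are
* (A'WA)c = A'Wy, and a point with W[i]=0 contributes nothing to either side.
* Hence both problems share the same A'WA and, provided zero-weight points are
* also excluded from the noise estimate, the same covariance matrix.
*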
*/ tol = (double)(10); for(n=1; n<=10; n++) { /* * Generate N-dimensional linear problem with 300 points: * * y = c'*x + noise * * prior values of coefficients C has scale CScale * * coordinates X has scale XScale * * noise in I-th point has magnitude 0.1*CScale*XScale*WScale/W[i] */ cscale = ae_pow(10.0, 2*randomnormal(_state), _state); xscale = ae_pow(10.0, 2*randomnormal(_state), _state); wscale = ae_pow(10.0, 2*randomnormal(_state), _state); noiselevel = 0.1*cscale*xscale; ae_vector_set_length(&c, n, _state); ae_vector_set_length(&cstart, n, _state); for(i=0; i<=n-1; i++) { c.ptr.p_double[i] = cscale*randomnormal(_state); cstart.ptr.p_double[i] = cscale*randomnormal(_state); } ae_matrix_set_length(&a, 300, n, _state); ae_vector_set_length(&y, a.rows, _state); ae_vector_set_length(&w, a.rows, _state); for(i=0; i<=a.rows-1; i++) { for(j=0; j<=n-1; j++) { a.ptr.pp_double[i][j] = xscale*randomnormal(_state); } v = ae_v_dotproduct(&a.ptr.pp_double[i][0], 1, &c.ptr.p_double[0], 1, ae_v_len(0,n-1)); if( i0 and NZ<>0, but in * some cases either N or NZ (but not both) is zero. * * X-values have scale equal to ScaleX */ scalex = ae_pow((double)(10), 30*hqrnduniformr(&rs, _state)-15, _state); n = 40+hqrnduniformi(&rs, 40, _state); nz = 4+hqrnduniformi(&rs, 4, _state); if( ae_fp_less(hqrnduniformr(&rs, _state),0.1) ) { if( ae_fp_less(hqrnduniformr(&rs, _state),0.5) ) { n = 0; } else { nz = 0; } } ntotal = 2*(n+nz); ae_vector_set_length(&x, ntotal, _state); for(i=0; i<=n-1; i++) { v = scalex*ae_exp(ae_log((double)(5), _state)*(2*hqrnduniformr(&rs, _state)-1), _state); x.ptr.p_double[2*i+0] = v; x.ptr.p_double[2*i+1] = v; } for(i=0; i<=nz-1; i++) { x.ptr.p_double[2*n+2*i+0] = (double)(0); x.ptr.p_double[2*n+2*i+1] = (double)(0); } /* * Fenerate A/B/C/D: * * A/D are random with scale equal to ScaleY * * B is in +-[0.25,4.0] * * for C we choose one of X[], if N>0; * if N=0, we set C=1. */ scaley = ae_pow((double)(10), 30*hqrnduniformr(&rs, _state)-15, _state); ae = scaley*(hqrnduniformr(&rs, _state)-0.5); be = (2*hqrnduniformi(&rs, 2, _state)-1)*ae_exp(ae_log((double)(4), _state)*(2*hqrnduniformr(&rs, _state)-1), _state); ce = scalex*ae_exp(ae_log((double)(2), _state)*(2*hqrnduniformr(&rs, _state)-1), _state); de = ae+scaley*(2*hqrnduniformi(&rs, 2, _state)-1)*(hqrnduniformr(&rs, _state)+0.5); /* * Choose noise level and generate Y[]. */ noise = 0.05*scaley; ae_vector_set_length(&y, ntotal, _state); for(i=0; i<=ntotal/2-1; i++) { if( ae_fp_neq(x.ptr.p_double[2*i+0],(double)(0)) ) { v = de+(ae-de)/(1.0+ae_pow(x.ptr.p_double[2*i+0]/ce, be, _state)); } else { if( ae_fp_greater_eq(be,(double)(0)) ) { v = ae; } else { v = de; } } y.ptr.p_double[2*i+0] = v+noise; y.ptr.p_double[2*i+1] = v-noise; } /* * Unconstrained fit and test * * NOTE: we test that B>=0 is returned. If BE<0, we use * symmetry property of 4PL model. 
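*
* The symmetry in question is easy to verify with t=(X/C)^B:
*
*     D + (A-D)/(1+(X/C)^B)  =  A + (D-A)/(1+(X/C)^(-B))
*
* i.e. swapping A and D while negating B leaves the 4PL curve unchanged, so the
* solver is free to always report B>=0. An illustrative evaluator in plain C
* (assumes <math.h>; the name is ours, not an ALGLIB function):
*
*     double fpl4(double x, double a, double b, double c, double d)
*     {
*         return x!=0 ? d+(a-d)/(1.0+pow(x/c, b)) : (b>=0 ? a : d);
*     }
*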
*/ logisticfit4(&x, &y, ntotal, &a, &b, &c, &d, &rep, _state); seterrorflag(fiterrors, !ae_isfinite(a, _state), _state); seterrorflag(fiterrors, !ae_isfinite(b, _state), _state); seterrorflag(fiterrors, !ae_isfinite(c, _state), _state); seterrorflag(fiterrors, !ae_isfinite(d, _state), _state); seterrorflag(fiterrors, ae_fp_less(b,(double)(0)), _state); v = 0.0; for(i=0; i<=ntotal-1; i++) { if( ae_fp_neq(x.ptr.p_double[i],(double)(0)) ) { vv = d+(a-d)/(1.0+ae_pow(x.ptr.p_double[i]/c, b, _state)); } else { vv = a; } v = v+ae_sqr(y.ptr.p_double[i]-vv, _state); } v = ae_sqrt(v/ntotal, _state); seterrorflag(fiterrors, ae_fp_greater(v,(1+tol)*noise), _state); /* * Constrained fit and test * * NOTE: we test that B>=0 is returned. If BE<0, we use * symmetry property of 4PL model. */ for(k0=0; k0<=1; k0++) { for(k1=0; k1<=1; k1++) { /* * Choose constraints. */ if( k0==0 ) { v0 = _state->v_nan; } else { if( ae_fp_greater_eq(be,(double)(0)) ) { v0 = ae; } else { v0 = de; } } if( k1==0 ) { v1 = _state->v_nan; } else { if( ae_fp_greater_eq(be,(double)(0)) ) { v1 = de; } else { v1 = ae; } } /* * Fit */ logisticfit4ec(&x, &y, ntotal, v0, v1, &a, &b, &c, &d, &rep, _state); /* * Check */ seterrorflag(fiterrors, !ae_isfinite(a, _state), _state); seterrorflag(fiterrors, !ae_isfinite(b, _state), _state); seterrorflag(fiterrors, !ae_isfinite(c, _state), _state); seterrorflag(fiterrors, !ae_isfinite(d, _state), _state); seterrorflag(fiterrors, ae_fp_less(b,(double)(0)), _state); seterrorflag(fiterrors, k0!=0&&ae_fp_neq(a,v0), _state); seterrorflag(fiterrors, k1!=0&&ae_fp_neq(d,v1), _state); v = 0.0; for(i=0; i<=ntotal-1; i++) { if( ae_fp_neq(x.ptr.p_double[i],(double)(0)) ) { vv = d+(a-d)/(1.0+ae_pow(x.ptr.p_double[i]/c, b, _state)); } else { if( ae_fp_greater_eq(b,(double)(0)) ) { vv = a; } else { vv = d; } } v = v+ae_sqr(y.ptr.p_double[i]-vv, _state); } v = ae_sqrt(v/ntotal, _state); seterrorflag(fiterrors, ae_fp_greater(v,(1+tol)*noise), _state); } } } /* * 5PL fitting * * Generate random AE/BE/CE/DE/GE, generate random set of points and for * each point generate two function values: F(x)+eps and F(x)-eps. * Such problem has solution which is exactly AE/BE/CE/DE which were * used to generate points. * * NOTE: because problem has higher condition number, we use lower * tolerance for power parameters B and G. * * This test checks both unconstrained and constrained fitting. */ tol = 1.0E-6; for(pass=1; pass<=10; pass++) { /* * Generate N points, N-1 of them with non-zero X and * last one with zero X. * X-values have scale equal to ScaleX */ scalex = ae_pow((double)(10), 30*hqrnduniformr(&rs, _state)-15, _state); k = 50; n = 2*k+1; ae_vector_set_length(&x, n, _state); for(i=0; i<=n-1; i++) { x.ptr.p_double[i] = scalex*ae_pow((double)(2), (double)(2*(i-k))/(double)k, _state); } x.ptr.p_double[n-1] = (double)(0); /* * Generate A/B/C/D/G: * * A/D are random with scale equal to ScaleY * * B is in +-[0.25,4.0] * * G is in [0.25,4.0] * * C is in [0.25,4.0]*ScaleX * if N=0, we set C=1. * Generate Y[]. 
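*
* The 5PL curve generated below is F(X) = D + (A-D)/(1+(X/C)^B)^G. An
* illustrative evaluator in plain C (assumes <math.h>; the name is ours,
* not an ALGLIB function):
*
*     double fpl5(double x, double a, double b, double c, double d, double g)
*     {
*         return x!=0 ? d+(a-d)/pow(1.0+pow(x/c, b), g) : (b>=0 ? a : d);
*     }
*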
*/ scaley = ae_pow((double)(10), 30*hqrnduniformr(&rs, _state)-15, _state); ae = scaley*(hqrnduniformr(&rs, _state)-0.5); be = (2*hqrnduniformi(&rs, 2, _state)-1)*ae_exp(ae_log((double)(2), _state)*(2*hqrnduniformr(&rs, _state)-1), _state); ce = scalex*ae_exp(ae_log((double)(2), _state)*(2*hqrnduniformr(&rs, _state)-1), _state); de = ae+scaley*(2*hqrnduniformi(&rs, 2, _state)-1)*(hqrnduniformr(&rs, _state)+0.5); ge = ae_exp(ae_log((double)(2), _state)*(2*hqrnduniformr(&rs, _state)-1), _state); ae_vector_set_length(&y, n, _state); for(i=0; i<=n-1; i++) { if( ae_fp_neq(x.ptr.p_double[i],(double)(0)) ) { v = de+(ae-de)/ae_pow(1.0+ae_pow(x.ptr.p_double[i]/ce, be, _state), ge, _state); } else { if( ae_fp_greater_eq(be,(double)(0)) ) { v = ae; } else { v = de; } } y.ptr.p_double[i] = v; } /* * Unconstrained fit and test * * NOTE: we test that B>=0 is returned. If BE<0, we use * symmetry property of 4PL model. */ logisticfit5(&x, &y, n, &a, &b, &c, &d, &g, &rep, _state); v = 0.0; for(i=0; i<=n-1; i++) { if( ae_fp_neq(x.ptr.p_double[i],(double)(0)) ) { vv = d+(a-d)/ae_pow(1.0+ae_pow(x.ptr.p_double[i]/c, b, _state), g, _state); } else { if( ae_fp_greater_eq(b,(double)(0)) ) { vv = a; } else { vv = d; } } v = v+ae_sqr(y.ptr.p_double[i]-vv, _state); } v = ae_sqrt(v/n, _state); seterrorflag(fiterrors, ae_fp_greater(v,scaley*tol), _state); /* * Constrained fit and test */ for(k0=0; k0<=1; k0++) { for(k1=0; k1<=1; k1++) { /* * Choose constraints. */ if( k0==0 ) { v0 = _state->v_nan; } else { if( ae_fp_greater_eq(be,(double)(0)) ) { v0 = ae; } else { v0 = de; } } if( k1==0 ) { v1 = _state->v_nan; } else { if( ae_fp_greater_eq(be,(double)(0)) ) { v1 = de; } else { v1 = ae; } } /* * Fit */ logisticfit5ec(&x, &y, n, v0, v1, &a, &b, &c, &d, &g, &rep, _state); seterrorflag(fiterrors, !ae_isfinite(a, _state), _state); seterrorflag(fiterrors, !ae_isfinite(b, _state), _state); seterrorflag(fiterrors, !ae_isfinite(c, _state), _state); seterrorflag(fiterrors, !ae_isfinite(d, _state), _state); if( ae_fp_greater(b,(double)(0)) ) { seterrorflag(fiterrors, k0!=0&&ae_fp_neq(a,v0), _state); seterrorflag(fiterrors, k1!=0&&ae_fp_neq(d,v1), _state); } else { seterrorflag(fiterrors, k0!=0&&ae_fp_neq(d,v0), _state); seterrorflag(fiterrors, k1!=0&&ae_fp_neq(a,v1), _state); } v = 0.0; for(i=0; i<=n-1; i++) { if( ae_fp_neq(x.ptr.p_double[i],(double)(0)) ) { vv = d+(a-d)/ae_pow(1.0+ae_pow(x.ptr.p_double[i]/c, b, _state), g, _state); } else { if( ae_fp_greater_eq(b,(double)(0)) ) { vv = a; } else { vv = d; } } v = v+ae_sqr(y.ptr.p_double[i]-vv, _state); } v = ae_sqrt(v/n, _state); seterrorflag(fiterrors, ae_fp_greater(v,scaley*tol), _state); } } } /* * Test correctness of errors */ tol = 1.0E-6; for(pass=1; pass<=20; pass++) { n = 10; meany = 0.0; ae_vector_set_length(&x, n, _state); ae_vector_set_length(&y, n, _state); for(i=0; i<=n-1; i++) { x.ptr.p_double[i] = (double)(i); y.ptr.p_double[i] = hqrnduniformr(&rs, _state)-0.5; meany = meany+y.ptr.p_double[i]; } meany = meany/n; x.ptr.p_double[1] = (double)(0); /* * Choose model fitting function to test */ k = hqrnduniformi(&rs, 4, _state); a = (double)(0); d = (double)(0); c = (double)(1); b = (double)(1); g = (double)(1); if( k==0 ) { logisticfit4(&x, &y, n, &a, &b, &c, &d, &rep, _state); g = 1.0; } if( k==1 ) { logisticfit4ec(&x, &y, n, hqrnduniformr(&rs, _state)-0.5, hqrnduniformr(&rs, _state)-0.5, &a, &b, &c, &d, &rep, _state); g = 1.0; } if( k==2 ) { logisticfit5(&x, &y, n, &a, &b, &c, &d, &g, &rep, _state); } if( k==3 ) { logisticfit5ec(&x, &y, n, hqrnduniformr(&rs, 
_state)-0.5, hqrnduniformr(&rs, _state)-0.5, &a, &b, &c, &d, &g, &rep, _state); } k = 0; erms = (double)(0); eavg = (double)(0); eavgrel = (double)(0); emax = (double)(0); rss = 0.0; tss = 0.0; for(i=0; i<=n-1; i++) { if( ae_fp_neq(x.ptr.p_double[i],(double)(0)) ) { v = d+(a-d)/ae_pow(1.0+ae_pow(x.ptr.p_double[i]/c, b, _state), g, _state); } else { if( ae_fp_greater_eq(b,(double)(0)) ) { v = a; } else { v = d; } } v = v-y.ptr.p_double[i]; rss = rss+v*v; tss = tss+ae_sqr(y.ptr.p_double[i]-meany, _state); erms = erms+ae_sqr(v, _state); eavg = eavg+ae_fabs(v, _state); if( ae_fp_neq(y.ptr.p_double[i],(double)(0)) ) { eavgrel = eavgrel+ae_fabs(v/y.ptr.p_double[i], _state); k = k+1; } emax = ae_maxreal(emax, ae_fabs(v, _state), _state); } er2 = 1.0-rss/tss; erms = ae_sqrt(erms/n, _state); eavg = eavg/n; if( k>0 ) { eavgrel = eavgrel/k; } seterrorflag(fiterrors, ae_fp_greater(ae_fabs(erms-rep.rmserror, _state),tol), _state); seterrorflag(fiterrors, ae_fp_greater(ae_fabs(eavg-rep.avgerror, _state),tol), _state); seterrorflag(fiterrors, ae_fp_greater(ae_fabs(emax-rep.maxerror, _state),tol), _state); seterrorflag(fiterrors, ae_fp_greater(ae_fabs(eavgrel-rep.avgrelerror, _state),tol), _state); seterrorflag(fiterrors, ae_fp_greater(ae_fabs(er2-rep.r2, _state),tol), _state); } ae_frame_leave(_state); } /************************************************************************* Test for box constrained problemsNLS. On failure sets error flag, leaves it unchanged on success. *************************************************************************/ static void testlsfitunit_testbcnls(ae_bool* errorflag, ae_state *_state) { ae_frame _frame_block; lsfitstate state; lsfitreport rep; ae_vector bl; ae_vector bu; ae_int_t nx; ae_int_t nc; ae_int_t m; ae_vector x; ae_vector c0; ae_vector c1; ae_vector c2; ae_matrix xx; ae_vector y; ae_int_t i; ae_int_t j; ae_int_t ii; double v; double h; hqrndstate rs; double epsx; double f0; double f1; ae_int_t tmpkind; ae_int_t terminationtype; ae_frame_make(_state, &_frame_block); _lsfitstate_init(&state, _state); _lsfitreport_init(&rep, _state); ae_vector_init(&bl, 0, DT_REAL, _state); ae_vector_init(&bu, 0, DT_REAL, _state); ae_vector_init(&x, 0, DT_REAL, _state); ae_vector_init(&c0, 0, DT_REAL, _state); ae_vector_init(&c1, 0, DT_REAL, _state); ae_vector_init(&c2, 0, DT_REAL, _state); ae_matrix_init(&xx, 0, 0, DT_REAL, _state); ae_vector_init(&y, 0, DT_REAL, _state); _hqrndstate_init(&rs, _state); hqrndrandomize(&rs, _state); /* * Fit * * [ ] * f(X|C)= SUM_j[ (alpha*c_j+power(c_j,3))*x_j ] * [ ] * * subject to non-negativity constraints on c_j */ epsx = 1.0E-9; for(tmpkind=0; tmpkind<=1; tmpkind++) { for(nc=1; nc<=20; nc++) { nx = nc; m = nc+hqrnduniformi(&rs, nc, _state); ae_vector_set_length(&bl, nc, _state); ae_vector_set_length(&bu, nc, _state); ae_vector_set_length(&c0, nc, _state); ae_vector_set_length(&x, nx, _state); for(i=0; i<=nc-1; i++) { bl.ptr.p_double[i] = (double)(0); bu.ptr.p_double[i] = _state->v_posinf; c0.ptr.p_double[i] = 1+hqrnduniformr(&rs, _state); } ae_matrix_set_length(&xx, m, nx, _state); ae_vector_set_length(&y, m, _state); for(i=0; i<=m-1; i++) { for(j=0; j<=nx-1; j++) { xx.ptr.pp_double[i][j] = hqrndnormal(&rs, _state); } y.ptr.p_double[i] = hqrndnormal(&rs, _state); } if( tmpkind==0 ) { lsfitcreatef(&xx, &y, &c0, m, nx, nc, 10*epsx, &state, _state); } if( tmpkind==1 ) { lsfitcreatefg(&xx, &y, &c0, m, nx, nc, ae_fp_greater(hqrndnormal(&rs, _state),(double)(0)), &state, _state); } lsfitsetcond(&state, epsx, 0, _state); lsfitsetbc(&state, &bl, 
&bu, _state); while(lsfititeration(&state, _state)) { for(i=0; i<=nc-1; i++) { seterrorflag(errorflag, ae_fp_less(state.c.ptr.p_double[i],bl.ptr.p_double[i]), _state); seterrorflag(errorflag, ae_fp_greater(state.c.ptr.p_double[i],bu.ptr.p_double[i]), _state); } if( state.needf ) { testlsfitunit_testfunc1(nx, &state.x, &state.c, &state.f, ae_true, &state.g, ae_false, _state); continue; } if( state.needfg ) { testlsfitunit_testfunc1(nx, &state.x, &state.c, &state.f, ae_true, &state.g, ae_true, _state); continue; } ae_assert(ae_false, "minlm test: integrity check failed", _state); } lsfitresults(&state, &terminationtype, &c1, &rep, _state); seterrorflag(errorflag, terminationtype<=0, _state); if( *errorflag ) { ae_frame_leave(_state); return; } f0 = (double)(0); for(ii=0; ii<=m-1; ii++) { ae_v_move(&x.ptr.p_double[0], 1, &xx.ptr.pp_double[ii][0], 1, ae_v_len(0,nx-1)); testlsfitunit_testfunc1(nx, &x, &c1, &v, ae_true, &state.g, ae_false, _state); f0 = f0+ae_sqr(v-y.ptr.p_double[ii], _state); } h = 0.001; ae_vector_set_length(&c2, nc, _state); for(i=0; i<=nc-1; i++) { seterrorflag(errorflag, ae_fp_less(c1.ptr.p_double[i],bl.ptr.p_double[i]), _state); seterrorflag(errorflag, ae_fp_greater(c1.ptr.p_double[i],bu.ptr.p_double[i]), _state); if( ae_fp_greater_eq(c1.ptr.p_double[i]+h,bl.ptr.p_double[i]) ) { for(j=0; j<=nc-1; j++) { c2.ptr.p_double[j] = c1.ptr.p_double[j]; } c2.ptr.p_double[i] = c2.ptr.p_double[i]+h; f1 = (double)(0); for(ii=0; ii<=m-1; ii++) { ae_v_move(&x.ptr.p_double[0], 1, &xx.ptr.pp_double[ii][0], 1, ae_v_len(0,nx-1)); testlsfitunit_testfunc1(nx, &x, &c2, &v, ae_true, &state.g, ae_false, _state); f1 = f1+ae_sqr(v-y.ptr.p_double[ii], _state); } seterrorflag(errorflag, ae_fp_less(f1,f0), _state); } if( ae_fp_greater_eq(c1.ptr.p_double[i]-h,bl.ptr.p_double[i]) ) { for(j=0; j<=nc-1; j++) { c2.ptr.p_double[j] = c1.ptr.p_double[j]; } c2.ptr.p_double[i] = c2.ptr.p_double[i]-h; f1 = (double)(0); for(ii=0; ii<=m-1; ii++) { ae_v_move(&x.ptr.p_double[0], 1, &xx.ptr.pp_double[ii][0], 1, ae_v_len(0,nx-1)); testlsfitunit_testfunc1(nx, &x, &c2, &v, ae_true, &state.g, ae_false, _state); f1 = f1+ae_sqr(v-y.ptr.p_double[ii], _state); } seterrorflag(errorflag, ae_fp_less(f1,f0), _state); } } } } /* * Fit * * [ ] * f(X|C)= SUM_j[ (alpha*c_j+power(c_j,3))*x_j ] * [ ] * * subject to random box constraints on c_j */ epsx = 1.0E-9; for(tmpkind=0; tmpkind<=1; tmpkind++) { for(nc=1; nc<=20; nc++) { nx = nc; m = nc+hqrnduniformi(&rs, nc, _state); ae_vector_set_length(&bl, nc, _state); ae_vector_set_length(&bu, nc, _state); ae_vector_set_length(&c0, nc, _state); ae_vector_set_length(&x, nx, _state); for(i=0; i<=nc-1; i++) { bl.ptr.p_double[i] = hqrndnormal(&rs, _state); bu.ptr.p_double[i] = bl.ptr.p_double[i]+hqrnduniformr(&rs, _state); c0.ptr.p_double[i] = 1+hqrnduniformr(&rs, _state); } ae_matrix_set_length(&xx, m, nx, _state); ae_vector_set_length(&y, m, _state); for(i=0; i<=m-1; i++) { for(j=0; j<=nx-1; j++) { xx.ptr.pp_double[i][j] = hqrndnormal(&rs, _state); } y.ptr.p_double[i] = hqrndnormal(&rs, _state); } if( tmpkind==0 ) { lsfitcreatef(&xx, &y, &c0, m, nx, nc, 10*epsx, &state, _state); } if( tmpkind==1 ) { lsfitcreatefg(&xx, &y, &c0, m, nx, nc, ae_fp_greater(hqrndnormal(&rs, _state),(double)(0)), &state, _state); } lsfitsetcond(&state, epsx, 0, _state); lsfitsetbc(&state, &bl, &bu, _state); while(lsfititeration(&state, _state)) { for(i=0; i<=nc-1; i++) { seterrorflag(errorflag, ae_fp_less(state.c.ptr.p_double[i],bl.ptr.p_double[i]), _state); seterrorflag(errorflag, 
ae_fp_greater(state.c.ptr.p_double[i],bu.ptr.p_double[i]), _state); } if( state.needf ) { testlsfitunit_testfunc1(nx, &state.x, &state.c, &state.f, ae_true, &state.g, ae_false, _state); continue; } if( state.needfg ) { testlsfitunit_testfunc1(nx, &state.x, &state.c, &state.f, ae_true, &state.g, ae_true, _state); continue; } ae_assert(ae_false, "minlm test: integrity check failed", _state); } lsfitresults(&state, &terminationtype, &c1, &rep, _state); seterrorflag(errorflag, terminationtype<=0, _state); if( *errorflag ) { ae_frame_leave(_state); return; } f0 = (double)(0); for(ii=0; ii<=m-1; ii++) { ae_v_move(&x.ptr.p_double[0], 1, &xx.ptr.pp_double[ii][0], 1, ae_v_len(0,nx-1)); testlsfitunit_testfunc1(nx, &x, &c1, &v, ae_true, &state.g, ae_false, _state); f0 = f0+ae_sqr(v-y.ptr.p_double[ii], _state); } h = 0.001; ae_vector_set_length(&c2, nc, _state); for(i=0; i<=nc-1; i++) { seterrorflag(errorflag, ae_fp_less(c1.ptr.p_double[i],bl.ptr.p_double[i]), _state); seterrorflag(errorflag, ae_fp_greater(c1.ptr.p_double[i],bu.ptr.p_double[i]), _state); if( ae_fp_greater_eq(c1.ptr.p_double[i]+h,bl.ptr.p_double[i])&&ae_fp_less_eq(c1.ptr.p_double[i]+h,bu.ptr.p_double[i]) ) { for(j=0; j<=nc-1; j++) { c2.ptr.p_double[j] = c1.ptr.p_double[j]; } c2.ptr.p_double[i] = c2.ptr.p_double[i]+h; f1 = (double)(0); for(ii=0; ii<=m-1; ii++) { ae_v_move(&x.ptr.p_double[0], 1, &xx.ptr.pp_double[ii][0], 1, ae_v_len(0,nx-1)); testlsfitunit_testfunc1(nx, &x, &c2, &v, ae_true, &state.g, ae_false, _state); f1 = f1+ae_sqr(v-y.ptr.p_double[ii], _state); } seterrorflag(errorflag, ae_fp_less(f1,f0), _state); } if( ae_fp_greater_eq(c1.ptr.p_double[i]-h,bl.ptr.p_double[i])&&ae_fp_less_eq(c1.ptr.p_double[i]-h,bu.ptr.p_double[i]) ) { for(j=0; j<=nc-1; j++) { c2.ptr.p_double[j] = c1.ptr.p_double[j]; } c2.ptr.p_double[i] = c2.ptr.p_double[i]-h; f1 = (double)(0); for(ii=0; ii<=m-1; ii++) { ae_v_move(&x.ptr.p_double[0], 1, &xx.ptr.pp_double[ii][0], 1, ae_v_len(0,nx-1)); testlsfitunit_testfunc1(nx, &x, &c2, &v, ae_true, &state.g, ae_false, _state); f1 = f1+ae_sqr(v-y.ptr.p_double[ii], _state); } seterrorflag(errorflag, ae_fp_less(f1,f0), _state); } } } } /* * Fit * * * f(X|C)= c_0 * * * subject to random box constraints on c_0, where X is N-dimensional vector * (f does not depend on X, and it is not an error! we just test that sizes * of X and C are correctly handled). 
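*
* A hypothetical callback for this constant model (the actual test delegates
* to the testfunc2 helper, defined elsewhere) would reduce to:
*
*     f    = c[0];        value is independent of x[0..nx-1]
*     g[0] = 1.0;         derivative with respect to the only parameter
*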
*/ epsx = 1.0E-9; for(tmpkind=0; tmpkind<=1; tmpkind++) { for(nx=1; nx<=20; nx++) { nc = 1; m = nx+hqrnduniformi(&rs, nx, _state); ae_vector_set_length(&bl, nc, _state); ae_vector_set_length(&bu, nc, _state); ae_vector_set_length(&c0, nc, _state); ae_vector_set_length(&x, nx, _state); for(i=0; i<=nc-1; i++) { bl.ptr.p_double[i] = hqrndnormal(&rs, _state); bu.ptr.p_double[i] = bl.ptr.p_double[i]+hqrnduniformr(&rs, _state); c0.ptr.p_double[i] = 1+hqrnduniformr(&rs, _state); } ae_matrix_set_length(&xx, m, nx, _state); ae_vector_set_length(&y, m, _state); for(i=0; i<=m-1; i++) { for(j=0; j<=nx-1; j++) { xx.ptr.pp_double[i][j] = hqrndnormal(&rs, _state); } y.ptr.p_double[i] = hqrndnormal(&rs, _state); } if( tmpkind==0 ) { lsfitcreatef(&xx, &y, &c0, m, nx, nc, 10*epsx, &state, _state); } if( tmpkind==1 ) { lsfitcreatefg(&xx, &y, &c0, m, nx, nc, ae_fp_greater(hqrndnormal(&rs, _state),(double)(0)), &state, _state); } lsfitsetcond(&state, epsx, 0, _state); lsfitsetbc(&state, &bl, &bu, _state); while(lsfititeration(&state, _state)) { for(i=0; i<=nc-1; i++) { seterrorflag(errorflag, ae_fp_less(state.c.ptr.p_double[i],bl.ptr.p_double[i]), _state); seterrorflag(errorflag, ae_fp_greater(state.c.ptr.p_double[i],bu.ptr.p_double[i]), _state); } if( state.needf ) { testlsfitunit_testfunc2(&state.x, nx, &state.c, nc, &state.f, ae_true, &state.g, ae_false, _state); continue; } if( state.needfg ) { testlsfitunit_testfunc2(&state.x, nx, &state.c, nc, &state.f, ae_true, &state.g, ae_true, _state); continue; } ae_assert(ae_false, "minlm test: integrity check failed", _state); } lsfitresults(&state, &terminationtype, &c1, &rep, _state); seterrorflag(errorflag, terminationtype<=0, _state); if( *errorflag ) { ae_frame_leave(_state); return; } f0 = (double)(0); for(ii=0; ii<=m-1; ii++) { ae_v_move(&x.ptr.p_double[0], 1, &xx.ptr.pp_double[ii][0], 1, ae_v_len(0,nx-1)); testlsfitunit_testfunc2(&x, nx, &c1, nc, &v, ae_true, &state.g, ae_false, _state); f0 = f0+ae_sqr(v-y.ptr.p_double[ii], _state); } h = 0.001; ae_vector_set_length(&c2, nc, _state); for(i=0; i<=nc-1; i++) { seterrorflag(errorflag, ae_fp_less(c1.ptr.p_double[i],bl.ptr.p_double[i]), _state); seterrorflag(errorflag, ae_fp_greater(c1.ptr.p_double[i],bu.ptr.p_double[i]), _state); if( ae_fp_greater_eq(c1.ptr.p_double[i]+h,bl.ptr.p_double[i])&&ae_fp_less_eq(c1.ptr.p_double[i]+h,bu.ptr.p_double[i]) ) { for(j=0; j<=nc-1; j++) { c2.ptr.p_double[j] = c1.ptr.p_double[j]; } c2.ptr.p_double[i] = c2.ptr.p_double[i]+h; f1 = (double)(0); for(ii=0; ii<=m-1; ii++) { ae_v_move(&x.ptr.p_double[0], 1, &xx.ptr.pp_double[ii][0], 1, ae_v_len(0,nx-1)); testlsfitunit_testfunc2(&x, nx, &c2, nc, &v, ae_true, &state.g, ae_false, _state); f1 = f1+ae_sqr(v-y.ptr.p_double[ii], _state); } seterrorflag(errorflag, ae_fp_less(f1,f0), _state); } if( ae_fp_greater_eq(c1.ptr.p_double[i]-h,bl.ptr.p_double[i])&&ae_fp_less_eq(c1.ptr.p_double[i]-h,bu.ptr.p_double[i]) ) { for(j=0; j<=nc-1; j++) { c2.ptr.p_double[j] = c1.ptr.p_double[j]; } c2.ptr.p_double[i] = c2.ptr.p_double[i]-h; f1 = (double)(0); for(ii=0; ii<=m-1; ii++) { ae_v_move(&x.ptr.p_double[0], 1, &xx.ptr.pp_double[ii][0], 1, ae_v_len(0,nx-1)); testlsfitunit_testfunc2(&x, nx, &c2, nc, &v, ae_true, &state.g, ae_false, _state); f1 = f1+ae_sqr(v-y.ptr.p_double[ii], _state); } seterrorflag(errorflag, ae_fp_less(f1,f0), _state); } } } } /* * Fit * * * f(X|C)= c_0 + c_1*x0 + c_2*x0^2 + ... * * * subject to random box constraints on c. 
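*
* An illustrative evaluation of this polynomial model together with its
* gradient (plain C sketch; the test itself delegates to the testfunc3
* helper, defined elsewhere):
*
*     p = 1.0;
*     f = 0.0;
*     for(j=0; j<nc; j++)
*     {
*         f    = f+c[j]*p;        F(X|C) = sum_j c_j*x0^j
*         g[j] = p;               dF(X|C) by dc_j equals x0^j
*         p    = p*x0;
*     }
*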
*/ epsx = 1.0E-9; for(tmpkind=0; tmpkind<=1; tmpkind++) { for(nc=1; nc<=5; nc++) { nx = 1; m = 10+nc+hqrnduniformi(&rs, nc, _state); ae_vector_set_length(&bl, nc, _state); ae_vector_set_length(&bu, nc, _state); ae_vector_set_length(&c0, nc, _state); ae_vector_set_length(&x, nx, _state); for(i=0; i<=nc-1; i++) { bl.ptr.p_double[i] = hqrndnormal(&rs, _state); bu.ptr.p_double[i] = bl.ptr.p_double[i]+hqrnduniformr(&rs, _state); c0.ptr.p_double[i] = hqrndnormal(&rs, _state); } ae_matrix_set_length(&xx, m, nx, _state); ae_vector_set_length(&y, m, _state); for(i=0; i<=m-1; i++) { for(j=0; j<=nx-1; j++) { xx.ptr.pp_double[i][j] = hqrndnormal(&rs, _state); } y.ptr.p_double[i] = hqrndnormal(&rs, _state); } if( tmpkind==0 ) { lsfitcreatef(&xx, &y, &c0, m, nx, nc, 10*epsx, &state, _state); } if( tmpkind==1 ) { lsfitcreatefg(&xx, &y, &c0, m, nx, nc, ae_fp_greater(hqrndnormal(&rs, _state),(double)(0)), &state, _state); } lsfitsetcond(&state, epsx, 0, _state); lsfitsetbc(&state, &bl, &bu, _state); while(lsfititeration(&state, _state)) { for(i=0; i<=nc-1; i++) { seterrorflag(errorflag, ae_fp_less(state.c.ptr.p_double[i],bl.ptr.p_double[i]), _state); seterrorflag(errorflag, ae_fp_greater(state.c.ptr.p_double[i],bu.ptr.p_double[i]), _state); } if( state.needf ) { testlsfitunit_testfunc3(&state.x, nx, &state.c, nc, &state.f, ae_true, &state.g, ae_false, _state); continue; } if( state.needfg ) { testlsfitunit_testfunc3(&state.x, nx, &state.c, nc, &state.f, ae_true, &state.g, ae_true, _state); continue; } ae_assert(ae_false, "minlm test: integrity check failed", _state); } lsfitresults(&state, &terminationtype, &c1, &rep, _state); seterrorflag(errorflag, terminationtype<=0, _state); if( *errorflag ) { ae_frame_leave(_state); return; } f0 = (double)(0); for(ii=0; ii<=m-1; ii++) { ae_v_move(&x.ptr.p_double[0], 1, &xx.ptr.pp_double[ii][0], 1, ae_v_len(0,nx-1)); testlsfitunit_testfunc3(&x, nx, &c1, nc, &v, ae_true, &state.g, ae_false, _state); f0 = f0+ae_sqr(v-y.ptr.p_double[ii], _state); } h = 0.001; ae_vector_set_length(&c2, nc, _state); for(i=0; i<=nc-1; i++) { seterrorflag(errorflag, ae_fp_less(c1.ptr.p_double[i],bl.ptr.p_double[i]), _state); seterrorflag(errorflag, ae_fp_greater(c1.ptr.p_double[i],bu.ptr.p_double[i]), _state); if( ae_fp_greater_eq(c1.ptr.p_double[i]+h,bl.ptr.p_double[i])&&ae_fp_less_eq(c1.ptr.p_double[i]+h,bu.ptr.p_double[i]) ) { for(j=0; j<=nc-1; j++) { c2.ptr.p_double[j] = c1.ptr.p_double[j]; } c2.ptr.p_double[i] = c2.ptr.p_double[i]+h; f1 = (double)(0); for(ii=0; ii<=m-1; ii++) { ae_v_move(&x.ptr.p_double[0], 1, &xx.ptr.pp_double[ii][0], 1, ae_v_len(0,nx-1)); testlsfitunit_testfunc3(&x, nx, &c2, nc, &v, ae_true, &state.g, ae_false, _state); f1 = f1+ae_sqr(v-y.ptr.p_double[ii], _state); } seterrorflag(errorflag, ae_fp_less(f1,f0), _state); } if( ae_fp_greater_eq(c1.ptr.p_double[i]-h,bl.ptr.p_double[i])&&ae_fp_less_eq(c1.ptr.p_double[i]-h,bu.ptr.p_double[i]) ) { for(j=0; j<=nc-1; j++) { c2.ptr.p_double[j] = c1.ptr.p_double[j]; } c2.ptr.p_double[i] = c2.ptr.p_double[i]-h; f1 = (double)(0); for(ii=0; ii<=m-1; ii++) { ae_v_move(&x.ptr.p_double[0], 1, &xx.ptr.pp_double[ii][0], 1, ae_v_len(0,nx-1)); testlsfitunit_testfunc3(&x, nx, &c2, nc, &v, ae_true, &state.g, ae_false, _state); f1 = f1+ae_sqr(v-y.ptr.p_double[ii], _state); } seterrorflag(errorflag, ae_fp_less(f1,f0), _state); } } } } ae_frame_leave(_state); } /************************************************************************* Test for linearly constrained NLS problems. On failure sets error flag, leaves it unchanged on success. 
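
For reference, the linear constraints in this test are passed to lsfitsetlc()
as a K x (NC+1) matrix whose last column holds the right-hand side, plus a type
vector CT in which CT[i]>0, CT[i]=0 and CT[i]<0 denote ">=", "=" and "<="
respectively (this is the convention checked by the feasibility tests below).
For example, a single constraint c_0+2*c_1 <= 3 on two parameters would be the
row [1, 2, 3] with CT[0] = -1 (illustrative values, not taken from the test).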
*************************************************************************/ static void testlsfitunit_testlcnls(ae_bool* errorflag, ae_state *_state) { ae_frame _frame_block; lsfitstate state; lsfitreport rep; ae_vector bl; ae_vector bu; ae_int_t nx; ae_int_t nc; ae_int_t m; ae_vector x; ae_vector y; ae_vector c0; ae_vector c1; ae_vector c2; ae_matrix rawc; ae_vector rawct; ae_int_t rawccnt; ae_matrix xx; ae_matrix xx12; ae_matrix z; ae_int_t trialidx; ae_int_t i; ae_int_t j; ae_int_t ii; double v; double h; ae_int_t optkind; hqrndstate rs; double epsx; double xtol; double f0; double f1; ae_bool bflag; ae_int_t terminationtype; ae_frame_make(_state, &_frame_block); _lsfitstate_init(&state, _state); _lsfitreport_init(&rep, _state); ae_vector_init(&bl, 0, DT_REAL, _state); ae_vector_init(&bu, 0, DT_REAL, _state); ae_vector_init(&x, 0, DT_REAL, _state); ae_vector_init(&y, 0, DT_REAL, _state); ae_vector_init(&c0, 0, DT_REAL, _state); ae_vector_init(&c1, 0, DT_REAL, _state); ae_vector_init(&c2, 0, DT_REAL, _state); ae_matrix_init(&rawc, 0, 0, DT_REAL, _state); ae_vector_init(&rawct, 0, DT_INT, _state); ae_matrix_init(&xx, 0, 0, DT_REAL, _state); ae_matrix_init(&xx12, 0, 0, DT_REAL, _state); ae_matrix_init(&z, 0, 0, DT_REAL, _state); _hqrndstate_init(&rs, _state); hqrndrandomize(&rs, _state); /* * Minimize * * [ [ ]2 ] * SUM_i[ SUM_j[ (0.1*x_j+power(x_j,3))*c_ij ] ] * [ [ ] ] * * subject to mix of box and linear inequality constraints on x_j * * We check correctness of solution by sampling a few random points * around one returned by optimizer, and comparing function value * with target. Sampling is performed with respect to inequality * constraints. */ epsx = 1.0E-12; xtol = 1.0E-8; optkind = 1; for(nc=5; nc<=20; nc++) { /* * Generate problem */ nx = nc; m = nc+hqrnduniformi(&rs, nc, _state); ae_vector_set_length(&bl, nc, _state); ae_vector_set_length(&bu, nc, _state); ae_vector_set_length(&c0, nc, _state); ae_vector_set_length(&x, nx, _state); for(i=0; i<=nc-1; i++) { bl.ptr.p_double[i] = hqrndnormal(&rs, _state); bu.ptr.p_double[i] = bl.ptr.p_double[i]+0.01+hqrnduniformr(&rs, _state); c0.ptr.p_double[i] = bl.ptr.p_double[i]+(bu.ptr.p_double[i]-bl.ptr.p_double[i])*hqrnduniformr(&rs, _state); } ae_matrix_set_length(&xx, m, nx, _state); ae_vector_set_length(&y, m, _state); for(i=0; i<=m-1; i++) { for(j=0; j<=nx-1; j++) { xx.ptr.pp_double[i][j] = hqrndnormal(&rs, _state); } y.ptr.p_double[i] = hqrndnormal(&rs, _state); } ae_assert(nc>=5, "Assertion failed", _state); rawccnt = 3; ae_matrix_set_length(&rawc, rawccnt, nc+1, _state); ae_vector_set_length(&rawct, rawccnt, _state); for(i=0; i<=rawccnt-1; i++) { v = (double)(0); for(j=0; j<=nc-1; j++) { rawc.ptr.pp_double[i][j] = hqrndnormal(&rs, _state); v = v+c0.ptr.p_double[j]*rawc.ptr.pp_double[i][j]; } rawc.ptr.pp_double[i][nc] = v; rawct.ptr.p_int[i] = 2*hqrnduniformi(&rs, 2, _state)-1; } /* * Solve */ if( optkind==0 ) { lsfitcreatef(&xx, &y, &c0, m, nx, nc, 1.0E-6, &state, _state); } if( optkind==1 ) { lsfitcreatefg(&xx, &y, &c0, m, nx, nc, ae_fp_greater(hqrndnormal(&rs, _state),(double)(0)), &state, _state); } lsfitsetcond(&state, epsx, 0, _state); lsfitsetbc(&state, &bl, &bu, _state); lsfitsetlc(&state, &rawc, &rawct, rawccnt, _state); while(lsfititeration(&state, _state)) { for(i=0; i<=nc-1; i++) { seterrorflag(errorflag, ae_fp_less(state.c.ptr.p_double[i],bl.ptr.p_double[i]), _state); seterrorflag(errorflag, ae_fp_greater(state.c.ptr.p_double[i],bu.ptr.p_double[i]), _state); } if( state.needf ) { testlsfitunit_testfunc1(nx, &state.x, &state.c, 
&state.f, ae_true, &state.g, ae_false, _state); continue; } if( state.needfg ) { testlsfitunit_testfunc1(nx, &state.x, &state.c, &state.f, ae_true, &state.g, ae_true, _state); continue; } ae_assert(ae_false, "lsfit test: integrity check failed", _state); } lsfitresults(&state, &terminationtype, &c1, &rep, _state); seterrorflag(errorflag, terminationtype<=0, _state); if( *errorflag ) { ae_frame_leave(_state); return; } /* * Test feasibility w.r.t. box and linear inequality constraints */ for(i=0; i<=nc-1; i++) { seterrorflag(errorflag, ae_fp_less(c1.ptr.p_double[i],bl.ptr.p_double[i]), _state); seterrorflag(errorflag, ae_fp_greater(c1.ptr.p_double[i],bu.ptr.p_double[i]), _state); } for(i=0; i<=rawccnt-1; i++) { v = ae_v_dotproduct(&rawc.ptr.pp_double[i][0], 1, &c1.ptr.p_double[0], 1, ae_v_len(0,nc-1)); v = v-rawc.ptr.pp_double[i][nc]; if( rawct.ptr.p_int[i]>0 ) { seterrorflag(errorflag, ae_fp_less(v,-xtol), _state); } if( rawct.ptr.p_int[i]<0 ) { seterrorflag(errorflag, ae_fp_greater(v,xtol), _state); } } /* * Make several random trial steps and: * 0) generate small random trial step * 1) if step is infeasible, skip to next trial * 2) compare function value in the trial point against one in other points */ f0 = (double)(0); for(ii=0; ii<=m-1; ii++) { ae_v_move(&x.ptr.p_double[0], 1, &xx.ptr.pp_double[ii][0], 1, ae_v_len(0,nx-1)); testlsfitunit_testfunc1(nx, &x, &c1, &v, ae_true, &state.g, ae_false, _state); f0 = f0+ae_sqr(v-y.ptr.p_double[ii], _state); } ae_vector_set_length(&c2, nc, _state); for(trialidx=0; trialidx<=10*nc; trialidx++) { h = 0.001; for(i=0; i<=nc-1; i++) { do { c2.ptr.p_double[i] = c1.ptr.p_double[i]+(hqrnduniformr(&rs, _state)*2-1)*h; } while(!(ae_fp_greater_eq(c2.ptr.p_double[i],bl.ptr.p_double[i])&&ae_fp_less_eq(c2.ptr.p_double[i],bu.ptr.p_double[i]))); } bflag = ae_false; for(i=0; i<=rawccnt-1; i++) { ae_assert(rawct.ptr.p_int[i]!=0, "Assertion failed", _state); v = ae_v_dotproduct(&rawc.ptr.pp_double[i][0], 1, &c2.ptr.p_double[0], 1, ae_v_len(0,nc-1)); v = v-rawc.ptr.pp_double[i][nc]; bflag = bflag||(rawct.ptr.p_int[i]>0&&ae_fp_less(v,(double)(0))); bflag = bflag||(rawct.ptr.p_int[i]<0&&ae_fp_greater(v,(double)(0))); } if( bflag ) { continue; } f1 = (double)(0); for(ii=0; ii<=m-1; ii++) { ae_v_move(&x.ptr.p_double[0], 1, &xx.ptr.pp_double[ii][0], 1, ae_v_len(0,nx-1)); testlsfitunit_testfunc1(nx, &x, &c2, &v, ae_true, &state.g, ae_false, _state); f1 = f1+ae_sqr(v-y.ptr.p_double[ii], _state); } seterrorflag(errorflag, ae_fp_less(f1,f0), _state); } } /* * Minimize * * [ [ ]2 ] * SUM_i[ SUM_j[ (0.1*x_j+power(x_j,3))*c_ij ] ] * [ [ ] ] * * subject to linear EQUALITY constraints on x_j. * * We check correctness of solution by sampling a few random points * around one returned by optimizer, and comparing function value * with target. Sampling is performed with respect to equality * constraints. In order to simplify algorithm we use orthogonal * equality constraints. * * NOTE: we solve problem using VJ mode (analytic Jacobian) because * roundoff errors from numerical differentiation sometimes * prevent us from converging with good precision. 
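*
* The trial-step loop further down relies on the constraint rows being rows of
* a random orthogonal matrix: a random direction is made feasible simply by
* removing its components along each (orthonormal) constraint row,
*
*     for each constraint row r_i:  step = step - (r_i . step) r_i
*
* which is exactly the dot-product-and-subtract pair performed there.
*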
*/ epsx = 1.0E-12; xtol = 1.0E-8; optkind = 1; for(nc=10; nc<=20; nc++) { /* * Generate problem */ nx = nc; m = nc+hqrnduniformi(&rs, nc, _state); ae_vector_set_length(&c0, nc, _state); for(i=0; i<=nc-1; i++) { c0.ptr.p_double[i] = hqrndnormal(&rs, _state); } ae_matrix_set_length(&xx, m, nx, _state); ae_vector_set_length(&y, m, _state); for(i=0; i<=m-1; i++) { for(j=0; j<=nx-1; j++) { xx.ptr.pp_double[i][j] = hqrndnormal(&rs, _state); } y.ptr.p_double[i] = hqrndnormal(&rs, _state); } ae_assert(nc>=10, "Assertion failed", _state); rawccnt = 1+hqrnduniformi(&rs, 5, _state); rmatrixrndorthogonal(nc, &z, _state); ae_matrix_set_length(&rawc, rawccnt, nc+1, _state); ae_vector_set_length(&rawct, rawccnt, _state); for(i=0; i<=rawccnt-1; i++) { v = (double)(0); for(j=0; j<=nc-1; j++) { rawc.ptr.pp_double[i][j] = z.ptr.pp_double[i][j]; v = v+c0.ptr.p_double[j]*rawc.ptr.pp_double[i][j]; } rawc.ptr.pp_double[i][nc] = v; rawct.ptr.p_int[i] = 0; } /* * Solve */ if( optkind==0 ) { lsfitcreatef(&xx, &y, &c0, m, nx, nc, 1.0E-6, &state, _state); } if( optkind==1 ) { lsfitcreatefg(&xx, &y, &c0, m, nx, nc, ae_fp_greater(hqrndnormal(&rs, _state),(double)(0)), &state, _state); } lsfitsetcond(&state, epsx, 0, _state); lsfitsetlc(&state, &rawc, &rawct, rawccnt, _state); while(lsfititeration(&state, _state)) { if( state.needf ) { testlsfitunit_testfunc1(nx, &state.x, &state.c, &state.f, ae_true, &state.g, ae_false, _state); continue; } if( state.needfg ) { testlsfitunit_testfunc1(nx, &state.x, &state.c, &state.f, ae_true, &state.g, ae_true, _state); continue; } ae_assert(ae_false, "lsfit test: integrity check failed", _state); } lsfitresults(&state, &terminationtype, &c1, &rep, _state); seterrorflag(errorflag, terminationtype<=0, _state); if( *errorflag ) { ae_frame_leave(_state); return; } /* * Test feasibility w.r.t. 
linear equality constraints */ for(i=0; i<=rawccnt-1; i++) { v = ae_v_dotproduct(&rawc.ptr.pp_double[i][0], 1, &c1.ptr.p_double[0], 1, ae_v_len(0,nc-1)); v = v-rawc.ptr.pp_double[i][nc]; seterrorflag(errorflag, ae_fp_greater(ae_fabs(v, _state),xtol), _state); } /* * Make several random trial steps and: * 0) generate small random trial step * 1) project it onto equality constrained subspace * 2) compare function value in the trial point against one in other points */ f0 = (double)(0); for(ii=0; ii<=m-1; ii++) { ae_v_move(&x.ptr.p_double[0], 1, &xx.ptr.pp_double[ii][0], 1, ae_v_len(0,nx-1)); testlsfitunit_testfunc1(nx, &x, &c1, &v, ae_true, &state.g, ae_false, _state); f0 = f0+ae_sqr(v-y.ptr.p_double[ii], _state); } ae_vector_set_length(&c2, nc, _state); for(trialidx=0; trialidx<=nc; trialidx++) { h = 0.001; for(i=0; i<=nc-1; i++) { c2.ptr.p_double[i] = hqrndnormal(&rs, _state); } for(i=0; i<=rawccnt-1; i++) { v = ae_v_dotproduct(&rawc.ptr.pp_double[i][0], 1, &c2.ptr.p_double[0], 1, ae_v_len(0,nc-1)); ae_v_subd(&c2.ptr.p_double[0], 1, &rawc.ptr.pp_double[i][0], 1, ae_v_len(0,nc-1), v); } v = ae_v_dotproduct(&c2.ptr.p_double[0], 1, &c2.ptr.p_double[0], 1, ae_v_len(0,nc-1)); ae_assert(ae_fp_greater(v,(double)(0)), "Assertion failed", _state); v = h/ae_sqrt(v, _state); ae_v_muld(&c2.ptr.p_double[0], 1, ae_v_len(0,nc-1), v); ae_v_add(&c2.ptr.p_double[0], 1, &c1.ptr.p_double[0], 1, ae_v_len(0,nc-1)); f1 = (double)(0); for(ii=0; ii<=m-1; ii++) { ae_v_move(&x.ptr.p_double[0], 1, &xx.ptr.pp_double[ii][0], 1, ae_v_len(0,nx-1)); testlsfitunit_testfunc1(nx, &x, &c2, &v, ae_true, &state.g, ae_false, _state); f1 = f1+ae_sqr(v-y.ptr.p_double[ii], _state); } seterrorflag(errorflag, ae_fp_less(f1,f0), _state); } } /* * Fit * * * f(X|C)= c_0 * * * subject to single general linear equality constraint on c_0. * * Here X has dimensionality NX, and C has dimensionality 1. * * We do not test convergence to solution, only feasibility of constraint. * The aim of this test is to assert that optimizer correctly handles * situations when dimensions of X and C differ. 
*/ epsx = 1.0E-9; for(optkind=0; optkind<=1; optkind++) { for(nx=1; nx<=20; nx++) { nc = 1; m = nx+hqrnduniformi(&rs, nx, _state); ae_vector_set_length(&c0, nc, _state); for(i=0; i<=nc-1; i++) { c0.ptr.p_double[i] = hqrndnormal(&rs, _state); } ae_matrix_set_length(&xx, m, nx, _state); ae_vector_set_length(&y, m, _state); for(i=0; i<=m-1; i++) { for(j=0; j<=nx-1; j++) { xx.ptr.pp_double[i][j] = hqrndnormal(&rs, _state); } y.ptr.p_double[i] = hqrndnormal(&rs, _state); } ae_assert(nc==1, "Assertion failed", _state); rawccnt = 1; ae_matrix_set_length(&rawc, 1, 2, _state); ae_vector_set_length(&rawct, 1, _state); rawc.ptr.pp_double[0][0] = (double)(1); rawc.ptr.pp_double[0][1] = hqrndnormal(&rs, _state); rawct.ptr.p_int[0] = 0; if( optkind==0 ) { lsfitcreatef(&xx, &y, &c0, m, nx, nc, 10*epsx, &state, _state); } if( optkind==1 ) { lsfitcreatefg(&xx, &y, &c0, m, nx, nc, ae_fp_greater(hqrndnormal(&rs, _state),(double)(0)), &state, _state); } lsfitsetcond(&state, epsx, 0, _state); lsfitsetlc(&state, &rawc, &rawct, rawccnt, _state); while(lsfititeration(&state, _state)) { if( state.needf ) { testlsfitunit_testfunc2(&state.x, nx, &state.c, nc, &state.f, ae_true, &state.g, ae_false, _state); continue; } if( state.needfg ) { testlsfitunit_testfunc2(&state.x, nx, &state.c, nc, &state.f, ae_true, &state.g, ae_true, _state); continue; } ae_assert(ae_false, "minlm test: integrity check failed", _state); } lsfitresults(&state, &terminationtype, &c1, &rep, _state); seterrorflag(errorflag, terminationtype<=0, _state); if( *errorflag ) { ae_frame_leave(_state); return; } seterrorflag(errorflag, ae_fp_greater(ae_fabs(c1.ptr.p_double[0]-rawc.ptr.pp_double[0][1], _state),1.0E-6), _state); } } /* * Fit * * * f(X|C)= c_0 + c_1*x0 + c_2*x0^2 + ... * * * subject to single general linear equality constraint on c. * * Here X has dimensionality 1, and C has dimensionality NC. * * We do not test convergence to solution, only feasibility of constraint. * The aim of this test is to assert that optimizer correctly handles * situations when dimensions of X and C differ. 
*/ epsx = 1.0E-9; for(optkind=0; optkind<=1; optkind++) { for(nc=1; nc<=5; nc++) { nx = 1; m = 10+nc+hqrnduniformi(&rs, nc, _state); ae_vector_set_length(&c0, nc, _state); for(i=0; i<=nc-1; i++) { c0.ptr.p_double[i] = hqrndnormal(&rs, _state); } ae_matrix_set_length(&xx, m, nx, _state); ae_vector_set_length(&y, m, _state); for(i=0; i<=m-1; i++) { for(j=0; j<=nx-1; j++) { xx.ptr.pp_double[i][j] = hqrndnormal(&rs, _state); } y.ptr.p_double[i] = hqrndnormal(&rs, _state); } rawccnt = 1; ae_matrix_set_length(&rawc, 1, nc+1, _state); ae_vector_set_length(&rawct, 1, _state); for(j=0; j<=nc; j++) { rawc.ptr.pp_double[0][j] = (2*hqrnduniformi(&rs, 2, _state)-1)*(0.1+hqrnduniformr(&rs, _state)); } rawct.ptr.p_int[0] = 0; if( optkind==0 ) { lsfitcreatef(&xx, &y, &c0, m, nx, nc, 10*epsx, &state, _state); } if( optkind==1 ) { lsfitcreatefg(&xx, &y, &c0, m, nx, nc, ae_fp_greater(hqrndnormal(&rs, _state),(double)(0)), &state, _state); } lsfitsetcond(&state, epsx, 0, _state); lsfitsetlc(&state, &rawc, &rawct, rawccnt, _state); while(lsfititeration(&state, _state)) { if( state.needf ) { testlsfitunit_testfunc3(&state.x, nx, &state.c, nc, &state.f, ae_true, &state.g, ae_false, _state); continue; } if( state.needfg ) { testlsfitunit_testfunc3(&state.x, nx, &state.c, nc, &state.f, ae_true, &state.g, ae_true, _state); continue; } ae_assert(ae_false, "minlm test: integrity check failed", _state); } lsfitresults(&state, &terminationtype, &c1, &rep, _state); seterrorflag(errorflag, terminationtype<=0, _state); if( *errorflag ) { ae_frame_leave(_state); return; } v = -rawc.ptr.pp_double[0][nc]; for(j=0; j<=nc-1; j++) { v = v+rawc.ptr.pp_double[0][j]*c1.ptr.p_double[j]; } seterrorflag(errorflag, ae_fp_greater(ae_fabs(v, _state),1.0E-6), _state); } } ae_frame_leave(_state); } /************************************************************************* Tests whether C is solution of (possibly) constrained LLS problem *************************************************************************/ static ae_bool testlsfitunit_isglssolution(ae_int_t n, ae_int_t m, ae_int_t k, /* Real */ ae_vector* y, /* Real */ ae_vector* w, /* Real */ ae_matrix* fmatrix, /* Real */ ae_matrix* cmatrix, /* Real */ ae_vector* c, ae_state *_state) { ae_frame _frame_block; ae_vector _c; ae_int_t i; ae_int_t j; ae_vector c2; ae_vector sv; ae_vector deltac; ae_vector deltaproj; ae_matrix u; ae_matrix vt; double v; double s1; double s2; double s3; double delta; double threshold; ae_bool result; ae_frame_make(_state, &_frame_block); ae_vector_init_copy(&_c, c, _state); c = &_c; ae_vector_init(&c2, 0, DT_REAL, _state); ae_vector_init(&sv, 0, DT_REAL, _state); ae_vector_init(&deltac, 0, DT_REAL, _state); ae_vector_init(&deltaproj, 0, DT_REAL, _state); ae_matrix_init(&u, 0, 0, DT_REAL, _state); ae_matrix_init(&vt, 0, 0, DT_REAL, _state); /* * Setup. 
* Threshold is small because CMatrix may be ill-conditioned */ delta = 0.001; threshold = ae_sqrt(ae_machineepsilon, _state); ae_vector_set_length(&c2, m, _state); ae_vector_set_length(&deltac, m, _state); ae_vector_set_length(&deltaproj, m, _state); /* * test whether C is feasible point or not (projC must be close to C) */ for(i=0; i<=k-1; i++) { v = ae_v_dotproduct(&cmatrix->ptr.pp_double[i][0], 1, &c->ptr.p_double[0], 1, ae_v_len(0,m-1)); if( ae_fp_greater(ae_fabs(v-cmatrix->ptr.pp_double[i][m], _state),threshold) ) { result = ae_false; ae_frame_leave(_state); return result; } } /* * find orthogonal basis of Null(CMatrix) (stored in rows from K to M-1) */ if( k>0 ) { rmatrixsvd(cmatrix, k, m, 0, 2, 2, &sv, &u, &vt, _state); } /* * Test result */ result = ae_true; s1 = testlsfitunit_getglserror(n, m, y, w, fmatrix, c, _state); for(j=0; j<=m-1; j++) { /* * prepare modification of C which leave us in the feasible set. * * let deltaC be increment on Jth coordinate, then project * deltaC in the Null(CMatrix) and store result in DeltaProj */ ae_v_move(&c2.ptr.p_double[0], 1, &c->ptr.p_double[0], 1, ae_v_len(0,m-1)); for(i=0; i<=m-1; i++) { if( i==j ) { deltac.ptr.p_double[i] = delta; } else { deltac.ptr.p_double[i] = (double)(0); } } if( k==0 ) { ae_v_move(&deltaproj.ptr.p_double[0], 1, &deltac.ptr.p_double[0], 1, ae_v_len(0,m-1)); } else { for(i=0; i<=m-1; i++) { deltaproj.ptr.p_double[i] = (double)(0); } for(i=k; i<=m-1; i++) { v = ae_v_dotproduct(&vt.ptr.pp_double[i][0], 1, &deltac.ptr.p_double[0], 1, ae_v_len(0,m-1)); ae_v_addd(&deltaproj.ptr.p_double[0], 1, &vt.ptr.pp_double[i][0], 1, ae_v_len(0,m-1), v); } } /* * now we have DeltaProj such that if C is feasible, * then C+DeltaProj is feasible too */ ae_v_move(&c2.ptr.p_double[0], 1, &c->ptr.p_double[0], 1, ae_v_len(0,m-1)); ae_v_add(&c2.ptr.p_double[0], 1, &deltaproj.ptr.p_double[0], 1, ae_v_len(0,m-1)); s2 = testlsfitunit_getglserror(n, m, y, w, fmatrix, &c2, _state); ae_v_move(&c2.ptr.p_double[0], 1, &c->ptr.p_double[0], 1, ae_v_len(0,m-1)); ae_v_sub(&c2.ptr.p_double[0], 1, &deltaproj.ptr.p_double[0], 1, ae_v_len(0,m-1)); s3 = testlsfitunit_getglserror(n, m, y, w, fmatrix, &c2, _state); result = (result&&ae_fp_greater_eq(s2,s1/(1+threshold)))&&ae_fp_greater_eq(s3,s1/(1+threshold)); } ae_frame_leave(_state); return result; } /************************************************************************* Tests whether C is solution of LLS problem *************************************************************************/ static double testlsfitunit_getglserror(ae_int_t n, ae_int_t m, /* Real */ ae_vector* y, /* Real */ ae_vector* w, /* Real */ ae_matrix* fmatrix, /* Real */ ae_vector* c, ae_state *_state) { ae_int_t i; double v; double result; result = (double)(0); for(i=0; i<=n-1; i++) { v = ae_v_dotproduct(&fmatrix->ptr.pp_double[i][0], 1, &c->ptr.p_double[0], 1, ae_v_len(0,m-1)); result = result+ae_sqr(w->ptr.p_double[i]*(v-y->ptr.p_double[i]), _state); } return result; } /************************************************************************* Subroutine for nonlinear fitting of linear problem DerAvailable: * 0 when only function value should be used * 1 when we can provide gradient/function * 2 when we can provide Hessian/gradient/function When something which is not permitted by DerAvailable is requested, this function sets NLSErrors to True. 
*************************************************************************/ static void testlsfitunit_fitlinearnonlinear(ae_int_t m, ae_int_t deravailable, /* Real */ ae_matrix* xy, lsfitstate* state, ae_bool* nlserrors, ae_state *_state) { ae_int_t i; ae_int_t j; double v; while(lsfititeration(state, _state)) { /* * assume that one and only one of flags is set * test that we didn't request hessian in hessian-free setting */ if( deravailable<1&&state->needfg ) { *nlserrors = ae_true; } if( deravailable<2&&state->needfgh ) { *nlserrors = ae_true; } i = 0; if( state->needf ) { i = i+1; } if( state->needfg ) { i = i+1; } if( state->needfgh ) { i = i+1; } if( i!=1 ) { *nlserrors = ae_true; } /* * test that PointIndex is consistent with actual point passed */ for(i=0; i<=m-1; i++) { *nlserrors = *nlserrors||ae_fp_neq(xy->ptr.pp_double[state->pointindex][i],state->x.ptr.p_double[i]); } /* * calculate */ if( state->needf ) { v = ae_v_dotproduct(&state->x.ptr.p_double[0], 1, &state->c.ptr.p_double[0], 1, ae_v_len(0,m-1)); state->f = v; continue; } if( state->needfg ) { v = ae_v_dotproduct(&state->x.ptr.p_double[0], 1, &state->c.ptr.p_double[0], 1, ae_v_len(0,m-1)); state->f = v; ae_v_move(&state->g.ptr.p_double[0], 1, &state->x.ptr.p_double[0], 1, ae_v_len(0,m-1)); continue; } if( state->needfgh ) { v = ae_v_dotproduct(&state->x.ptr.p_double[0], 1, &state->c.ptr.p_double[0], 1, ae_v_len(0,m-1)); state->f = v; ae_v_move(&state->g.ptr.p_double[0], 1, &state->x.ptr.p_double[0], 1, ae_v_len(0,m-1)); for(i=0; i<=m-1; i++) { for(j=0; j<=m-1; j++) { state->h.ptr.pp_double[i][j] = (double)(0); } } continue; } } } /************************************************************************* This function tests, that gradient verified correctly. *************************************************************************/ static void testlsfitunit_testgradientcheck(ae_bool* testg, ae_state *_state) { ae_frame _frame_block; lsfitstate state; lsfitreport rep; ae_int_t n; ae_int_t m; ae_int_t k; ae_vector c; ae_vector cres; ae_matrix x; ae_vector y; ae_vector x0; ae_int_t info; ae_vector bl; ae_vector bu; ae_int_t infcomp; double teststep; double noise; ae_int_t nbrcomp; double spp; ae_int_t func; ae_int_t pass; ae_int_t passcount; ae_int_t i; ae_int_t j; ae_frame_make(_state, &_frame_block); *testg = ae_false; _lsfitstate_init(&state, _state); _lsfitreport_init(&rep, _state); ae_vector_init(&c, 0, DT_REAL, _state); ae_vector_init(&cres, 0, DT_REAL, _state); ae_matrix_init(&x, 0, 0, DT_REAL, _state); ae_vector_init(&y, 0, DT_REAL, _state); ae_vector_init(&x0, 0, DT_REAL, _state); ae_vector_init(&bl, 0, DT_REAL, _state); ae_vector_init(&bu, 0, DT_REAL, _state); passcount = 35; spp = 1.0; teststep = 0.01; for(pass=1; pass<=passcount; pass++) { m = ae_randominteger(5, _state)+1; ae_vector_set_length(&x0, m, _state); k = ae_randominteger(5, _state)+1; ae_vector_set_length(&c, k, _state); ae_vector_set_length(&bl, k, _state); ae_vector_set_length(&bu, k, _state); /* * Prepare test's parameters */ func = ae_randominteger(3, _state)+1; n = ae_randominteger(8, _state)+3; ae_matrix_set_length(&x, n, m, _state); ae_vector_set_length(&y, n, _state); nbrcomp = ae_randominteger(k, _state); noise = (double)(2*ae_randominteger(2, _state)-1); /* * Prepare function's parameters */ for(i=0; i<=n-1; i++) { for(j=0; j<=m-1; j++) { x.ptr.pp_double[i][j] = spp*(2*ae_randomreal(_state)-1); } y.ptr.p_double[i] = spp*(2*ae_randomreal(_state)-1); } for(i=0; i<=k-1; i++) { c.ptr.p_double[i] = spp*(2*ae_randomreal(_state)-1); } for(i=0; i<=m-1; 
i++) { x0.ptr.p_double[i] = 10*(2*ae_randomreal(_state)-1); } /* * Prepare boundary parameters */ for(i=0; i<=k-1; i++) { bl.ptr.p_double[i] = ae_randomreal(_state)-spp; bu.ptr.p_double[i] = ae_randomreal(_state)+spp-1; } infcomp = ae_randominteger(k+1, _state); if( infcomp<k ) { bl.ptr.p_double[infcomp] = _state->v_neginf; } infcomp = ae_randominteger(k+1, _state); if( infcomp<k ) { bu.ptr.p_double[infcomp] = _state->v_posinf; } lsfitcreatefg(&x, &y, &c, n, m, k, ae_true, &state, _state); lsfitsetgradientcheck(&state, teststep, _state); lsfitsetcond(&state, 0.0, 100, _state); lsfitsetbc(&state, &bl, &bu, _state); /* * Check that the criterion passes a derivative if it is correct */ while(lsfititeration(&state, _state)) { if( state.needfg ) { testlsfitunit_funcderiv(&state.c, &state.x, &x0, k, m, func, &state.f, &state.g, _state); } } lsfitresults(&state, &info, &cres, &rep, _state); /* * Check that the error code is not equal to -7 and that parameter .VarIdx * is equal to -1. */ if( info==-7||rep.varidx!=-1 ) { *testg = ae_true; ae_frame_leave(_state); return; } /* * Create again and... */ lsfitcreatefg(&x, &y, &c, n, m, k, ae_true, &state, _state); lsfitsetgradientcheck(&state, teststep, _state); lsfitsetcond(&state, 0.0, 100, _state); lsfitsetbc(&state, &bl, &bu, _state); /* * Check that the criterion does not miss a derivative if * it is incorrect */ while(lsfititeration(&state, _state)) { if( state.needfg ) { testlsfitunit_funcderiv(&state.c, &state.x, &x0, k, m, func, &state.f, &state.g, _state); state.g.ptr.p_double[nbrcomp] = state.g.ptr.p_double[nbrcomp]+noise; } } lsfitresults(&state, &info, &cres, &rep, _state); /* * Check that the error code is equal to -7 and that parameter .VarIdx * is equal to the index of the incorrect component. */ if( info!=-7||rep.varidx!=nbrcomp ) { *testg = ae_true; ae_frame_leave(_state); return; } } *testg = ae_false; ae_frame_leave(_state); } /************************************************************************* This function returns the function's value (F=F(X,C)) and its derivatives (DF=dF/dC). Function dimension is M. Length(C) is K.
Function's list: * funcType=1: K>M: F(X)=C0^2*(X0-CX0)^2+C1^2*(X1-CX1)^2+...+CM^2*(XM-CXM)^2 +C(M+1)^2+...+CK^2; K<M: F(X)=C0^2*(X0-CX0)^2+C1^2*(X1-CX1)^2+...+CK^2*(XK-CXK)^2 +(X(K+1)-CX(K+1))^2+...+(XM-CXM)^2; * funcType=2: K>M: F(X)=C0*sin(X0-CX0)^2+C1*sin(X1-CX1)^2+...+CM*sin(XM-CXM)^2 +C(M+1)^3+...+CK^3; K<M: F(X)=C0*sin(X0-CX0)^2+C1*sin(X1-CX1)^2+...+CK*sin(XK-CXK)^2 +sin(X(K+1)-CX(K+1))^2+...+sin(XM-CXM)^2; * funcType=3: F(X)=(X0-CX0)^2+(X1-CX1)^2+...+(XM-CXM)^2+C0^2+C1^2+...+CK^2 *************************************************************************/ static void testlsfitunit_funcderiv(/* Real */ ae_vector* c, /* Real */ ae_vector* x, /* Real */ ae_vector* x0, ae_int_t k, ae_int_t m, ae_int_t functype, double* f, /* Real */ ae_vector* g, ae_state *_state) { ae_int_t i; *f = 0; ae_assert(functype>=1&&functype<=3, "FuncDeriv: incorrect funcType(funcType<1 or funcType>3).", _state); ae_assert(k>0, "FuncDeriv: K<=0", _state); ae_assert(m>0, "FuncDeriv: M<=0", _state); ae_assert(x->cnt>=m, "FuncDeriv: Length(X)<M", _state); ae_assert(x0->cnt>=m, "FuncDeriv: Length(X0)<M", _state); ae_assert(c->cnt>=k, "FuncDeriv: Length(C)<K", _state); if( functype==1 ) { *f = (double)(0); for(i=0; i<=ae_minint(m, k, _state)-1; i++) { *f = *f+ae_sqr(c->ptr.p_double[i]*(x->ptr.p_double[i]-x0->ptr.p_double[i]), _state); g->ptr.p_double[i] = 2*c->ptr.p_double[i]*ae_sqr(x->ptr.p_double[i]-x0->ptr.p_double[i], _state); } if( k>m ) { for(i=m; i<=k-1; i++) { *f = *f+ae_sqr(c->ptr.p_double[i], _state); g->ptr.p_double[i] = 2*c->ptr.p_double[i]; } } if( k<m ) { for(i=k; i<=m-1; i++) { *f = *f+ae_sqr(x->ptr.p_double[i]-x0->ptr.p_double[i], _state); } } return; } if( functype==2 ) { *f = (double)(0); for(i=0; i<=ae_minint(m, k, _state)-1; i++) { *f = *f+c->ptr.p_double[i]*ae_sqr(ae_sin(x->ptr.p_double[i]-x0->ptr.p_double[i], _state), _state); g->ptr.p_double[i] = ae_sqr(ae_sin(x->ptr.p_double[i]-x0->ptr.p_double[i], _state), _state); } if( k>m ) { for(i=m; i<=k-1; i++) { *f = *f+c->ptr.p_double[i]*c->ptr.p_double[i]*c->ptr.p_double[i]; g->ptr.p_double[i] = 3*ae_sqr(c->ptr.p_double[i], _state); } } if( k<m ) { for(i=k; i<=m-1; i++) { *f = *f+ae_sqr(ae_sin(x->ptr.p_double[i]-x0->ptr.p_double[i], _state), _state); } } return; } if( functype==3 ) { *f = (double)(0); for(i=0; i<=m-1; i++) { *f = *f+ae_sqr(x->ptr.p_double[i]-x0->ptr.p_double[i], _state); } for(i=0; i<=k-1; i++) { *f = *f+c->ptr.p_double[i]*c->ptr.p_double[i]; } for(i=0; i<=k-1; i++) { g->ptr.p_double[i] = 2*c->ptr.p_double[i]; } return; } } /************************************************************************* Test function 1: F(K, X, Z) = SUM( (power(z_j,3)+alpha*z_j)*x_ij ) here X is a space of points, Z is a space of parameters *************************************************************************/ static void testlsfitunit_testfunc1(ae_int_t k, /* Real */ ae_vector* x, /* Real */ ae_vector* z, double* f, ae_bool needf, /* Real */ ae_vector* g, ae_bool needg, ae_state *_state) { ae_int_t j; double v; double alpha; alpha = 0.01; v = (double)(0); for(j=0; j<=k-1; j++) { v = v+(alpha*z->ptr.p_double[j]+ae_pow(z->ptr.p_double[j], (double)(3), _state))*x->ptr.p_double[j]; if( needg ) { g->ptr.p_double[j] = (alpha+3*ae_pow(z->ptr.p_double[j], (double)(2), _state))*x->ptr.p_double[j]; } } if( needf ) { *f = v; } } /************************************************************************* Test function 2: F(X|C) = c_0 here X is a space of points, C is a space of parameters *************************************************************************/ static void testlsfitunit_testfunc2(/* Real */ ae_vector* x, ae_int_t nx, /* Real */ ae_vector* c, ae_int_t nc, double* f, ae_bool needf, /* Real */ ae_vector* g, ae_bool needg, ae_state *_state) { ae_assert(nc==1, "TestFunc2: integrity check failure", _state); if( needf ) { *f = c->ptr.p_double[0]; } if( needg ) { g->ptr.p_double[0] = (double)(1); } } /************************************************************************* Test function 3: F(X|C) = c_0 + c_1*x0 + c_2*x0^2 + ...
here X is a space of points, C is a space of parameters *************************************************************************/ static void testlsfitunit_testfunc3(/* Real */ ae_vector* x, ae_int_t nx, /* Real */ ae_vector* c, ae_int_t nc, double* f, ae_bool needf, /* Real */ ae_vector* g, ae_bool needg, ae_state *_state) { double v; ae_int_t i; ae_assert(nx==1, "TestFunc3: integrity check failure", _state); v = (double)(0); for(i=0; i<=nc-1; i++) { v = v+c->ptr.p_double[i]*ae_pow(x->ptr.p_double[0], (double)(i), _state); if( needg ) { g->ptr.p_double[i] = ae_pow(x->ptr.p_double[0], (double)(i), _state); } } if( needf ) { *f = v; } } static void testnsfitunit_testspherefittinggeneric(ae_bool* err, ae_state *_state); static void testnsfitunit_testspherefittingvosswinkel2(ae_bool* err, ae_state *_state); static void testnsfitunit_calcradii(/* Real */ ae_matrix* xy, ae_int_t npoints, ae_int_t nx, /* Real */ ae_vector* cx, double* rlo, double* rhi, ae_state *_state); static void testnsfitunit_printsupportpoints(/* Real */ ae_matrix* xy, ae_int_t npoints, ae_int_t nx, /* Real */ ae_vector* cx, ae_state *_state); static void testnsfitunit_addvalue(/* Real */ ae_matrix* xy, ae_int_t* cnt, double v, ae_state *_state); ae_bool testnsfit(ae_bool silent, ae_state *_state) { ae_bool sphereerrors; ae_bool wereerrors; ae_bool result; sphereerrors = ae_false; /* * Sphere fitting, several different test suites */ testnsfitunit_testspherefittinggeneric(&sphereerrors, _state); testnsfitunit_testspherefittingvosswinkel2(&sphereerrors, _state); /* * report */ wereerrors = sphereerrors; if( !silent ) { printf("TESTING NSFIT\n"); printf("* SPHERE FITTING (MCC, MIC, MZC) "); if( sphereerrors ) { printf("FAILED\n"); } else { printf("OK\n"); } if( wereerrors ) { printf("TEST FAILED\n"); } else { printf("TEST PASSED\n"); } printf("\n\n"); } result = !wereerrors; return result; } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. 
*************************************************************************/ ae_bool _pexec_testnsfit(ae_bool silent, ae_state *_state) { return testnsfit(silent, _state); } /************************************************************************* This function tests sphere fitting using generic synthetic datasets On failure sets Err to True (leaves it unchanged otherwise) *************************************************************************/ static void testnsfitunit_testspherefittinggeneric(ae_bool* err, ae_state *_state) { ae_frame _frame_block; hqrndstate rs; ae_matrix xy; ae_int_t npoints; ae_int_t nx; ae_vector cx; ae_vector cy; double rlo; double rhi; double rlo2; double rhi2; double xtol; double ftol; ae_int_t i; ae_int_t j; ae_int_t k; double v; double vv; ae_int_t problemtype; double vlo; double vhi; ae_frame_make(_state, &_frame_block); _hqrndstate_init(&rs, _state); ae_matrix_init(&xy, 0, 0, DT_REAL, _state); ae_vector_init(&cx, 0, DT_REAL, _state); ae_vector_init(&cy, 0, DT_REAL, _state); hqrndrandomize(&rs, _state); xtol = 1.0E-5; /* * Generate random problem */ for(nx=1; nx<=4; nx++) { /* * Generate synthetic dataset */ npoints = 50+ae_round(ae_pow((double)(4), (double)(nx), _state), _state); ae_matrix_set_length(&xy, npoints, nx, _state); for(i=0; i<=npoints-1; i++) { v = (double)(0); for(j=0; j<=nx-1; j++) { vv = hqrndnormal(&rs, _state); v = v+ae_sqr(vv, _state); xy.ptr.pp_double[i][j] = vv; } ae_assert(ae_fp_greater(v,(double)(0)), "Assertion failed", _state); v = (1+0.1*hqrnduniformr(&rs, _state))/ae_sqrt(v, _state); for(j=0; j<=nx-1; j++) { xy.ptr.pp_double[i][j] = xy.ptr.pp_double[i][j]*v; } } /* * Perform various kinds of fit, NLC solver is used */ for(problemtype=1; problemtype<=3; problemtype++) { /* * Solve with generic solver */ unsetrealarray(&cx, _state); rlo = (double)(0); rhi = (double)(0); nsfitspherex(&xy, npoints, nx, problemtype, 0.0, 0, 0.0, &cx, &rlo, &rhi, _state); /* * Check that small perturbations to center position increase target function * * NOTE: in fact, we do allow small increase in target function - but no more * than FTol=1E-6*XTol. It helps to avoid spurious error reports in * degenerate cases. 
*/ ftol = 1.0E-6*xtol; ae_vector_set_length(&cy, nx, _state); if( problemtype==2||problemtype==3 ) { vlo = (double)(1); } else { vlo = (double)(0); } if( problemtype==1||problemtype==3 ) { vhi = (double)(1); } else { vhi = (double)(0); } for(j=0; j<=nx-1; j++) { for(k=0; k<=nx-1; k++) { cy.ptr.p_double[k] = cx.ptr.p_double[k]; } cy.ptr.p_double[j] = cx.ptr.p_double[j]+xtol; testnsfitunit_calcradii(&xy, npoints, nx, &cy, &rlo2, &rhi2, _state); seterrorflag(err, ae_fp_less(rhi2*vhi-rlo2*vlo,rhi*vhi-rlo*vlo-ftol), _state); for(k=0; k<=nx-1; k++) { cy.ptr.p_double[k] = cx.ptr.p_double[k]; } cy.ptr.p_double[j] = cx.ptr.p_double[j]-xtol; testnsfitunit_calcradii(&xy, npoints, nx, &cy, &rlo2, &rhi2, _state); seterrorflag(err, ae_fp_less(rhi2*vhi-rlo2*vlo,rhi*vhi-rlo*vlo-ftol), _state); } /* * Compare against results returned by specific solver */ if( problemtype==1 ) { nsfitspheremcc(&xy, npoints, nx, &cy, &rhi2, _state); seterrorflag(err, ae_fp_neq(rhi2,rhi), _state); for(j=0; j<=nx-1; j++) { seterrorflag(err, ae_fp_neq(cy.ptr.p_double[j],cx.ptr.p_double[j]), _state); } } if( problemtype==2 ) { nsfitspheremic(&xy, npoints, nx, &cy, &rlo2, _state); seterrorflag(err, ae_fp_neq(rlo2,rlo), _state); for(j=0; j<=nx-1; j++) { seterrorflag(err, ae_fp_neq(cy.ptr.p_double[j],cx.ptr.p_double[j]), _state); } } if( problemtype==3 ) { nsfitspheremzc(&xy, npoints, nx, &cy, &rlo2, &rhi2, _state); seterrorflag(err, ae_fp_neq(rlo2,rlo), _state); seterrorflag(err, ae_fp_neq(rhi2,rhi), _state); for(j=0; j<=nx-1; j++) { seterrorflag(err, ae_fp_neq(cy.ptr.p_double[j],cx.ptr.p_double[j]), _state); } } } } ae_frame_leave(_state); } /************************************************************************* This function tests sphere fitting On failure sets Err to True (leaves it unchanged otherwise) *************************************************************************/ static void testnsfitunit_testspherefittingvosswinkel2(ae_bool* err, ae_state *_state) { ae_frame _frame_block; hqrndstate rs; ae_matrix xy; ae_int_t cnt; ae_vector cx; double rlo; double rhi; double tol; ae_frame_make(_state, &_frame_block); _hqrndstate_init(&rs, _state); ae_matrix_init(&xy, 0, 0, DT_REAL, _state); ae_vector_init(&cx, 0, DT_REAL, _state); hqrndrandomize(&rs, _state); /* * Test problem #2 by Vosswinkel GmbH */ ae_matrix_set_length(&xy, 40, 2, _state); cnt = 0; testnsfitunit_addvalue(&xy, &cnt, 0.1026, _state); testnsfitunit_addvalue(&xy, &cnt, 0.000036, _state); testnsfitunit_addvalue(&xy, &cnt, 0.101119, _state); testnsfitunit_addvalue(&xy, &cnt, 0.016144, _state); testnsfitunit_addvalue(&xy, &cnt, 0.096754, _state); testnsfitunit_addvalue(&xy, &cnt, 0.031654, _state); testnsfitunit_addvalue(&xy, &cnt, 0.088981, _state); testnsfitunit_addvalue(&xy, &cnt, 0.045634, _state); testnsfitunit_addvalue(&xy, &cnt, 0.082056, _state); testnsfitunit_addvalue(&xy, &cnt, 0.06008, _state); testnsfitunit_addvalue(&xy, &cnt, 0.074966, _state); testnsfitunit_addvalue(&xy, &cnt, 0.075647, _state); testnsfitunit_addvalue(&xy, &cnt, 0.065, _state); testnsfitunit_addvalue(&xy, &cnt, 0.090471, _state); testnsfitunit_addvalue(&xy, &cnt, 0.052411, _state); testnsfitunit_addvalue(&xy, &cnt, 0.104381, _state); testnsfitunit_addvalue(&xy, &cnt, 0.036436, _state); testnsfitunit_addvalue(&xy, &cnt, 0.114859, _state); testnsfitunit_addvalue(&xy, &cnt, 0.019034, _state); testnsfitunit_addvalue(&xy, &cnt, 0.126577, _state); testnsfitunit_addvalue(&xy, &cnt, -0.001191, _state); testnsfitunit_addvalue(&xy, &cnt, 0.139295, _state); testnsfitunit_addvalue(&xy, &cnt, 
-0.024689, _state); testnsfitunit_addvalue(&xy, &cnt, 0.147143, _state); testnsfitunit_addvalue(&xy, &cnt, -0.049729, _state); testnsfitunit_addvalue(&xy, &cnt, 0.147861, _state); testnsfitunit_addvalue(&xy, &cnt, -0.076402, _state); testnsfitunit_addvalue(&xy, &cnt, 0.145907, _state); testnsfitunit_addvalue(&xy, &cnt, -0.103928, _state); testnsfitunit_addvalue(&xy, &cnt, 0.139553, _state); testnsfitunit_addvalue(&xy, &cnt, -0.133726, _state); testnsfitunit_addvalue(&xy, &cnt, 0.130429, _state); testnsfitunit_addvalue(&xy, &cnt, -0.159051, _state); testnsfitunit_addvalue(&xy, &cnt, 0.112298, _state); testnsfitunit_addvalue(&xy, &cnt, -0.179496, _state); testnsfitunit_addvalue(&xy, &cnt, 0.08821, _state); testnsfitunit_addvalue(&xy, &cnt, -0.194562, _state); testnsfitunit_addvalue(&xy, &cnt, 0.059989, _state); testnsfitunit_addvalue(&xy, &cnt, -0.204838, _state); testnsfitunit_addvalue(&xy, &cnt, 0.029135, _state); testnsfitunit_addvalue(&xy, &cnt, -0.206971, _state); testnsfitunit_addvalue(&xy, &cnt, -0.00349, _state); testnsfitunit_addvalue(&xy, &cnt, -0.206207, _state); testnsfitunit_addvalue(&xy, &cnt, -0.036427, _state); testnsfitunit_addvalue(&xy, &cnt, -0.197079, _state); testnsfitunit_addvalue(&xy, &cnt, -0.06806, _state); testnsfitunit_addvalue(&xy, &cnt, -0.180492, _state); testnsfitunit_addvalue(&xy, &cnt, -0.096353, _state); testnsfitunit_addvalue(&xy, &cnt, -0.158203, _state); testnsfitunit_addvalue(&xy, &cnt, -0.119891, _state); testnsfitunit_addvalue(&xy, &cnt, -0.132669, _state); testnsfitunit_addvalue(&xy, &cnt, -0.138375, _state); testnsfitunit_addvalue(&xy, &cnt, -0.105652, _state); testnsfitunit_addvalue(&xy, &cnt, -0.152229, _state); testnsfitunit_addvalue(&xy, &cnt, -0.078587, _state); testnsfitunit_addvalue(&xy, &cnt, -0.16316, _state); testnsfitunit_addvalue(&xy, &cnt, -0.049984, _state); testnsfitunit_addvalue(&xy, &cnt, -0.167084, _state); testnsfitunit_addvalue(&xy, &cnt, -0.022067, _state); testnsfitunit_addvalue(&xy, &cnt, -0.165233, _state); testnsfitunit_addvalue(&xy, &cnt, 0.004002, _state); testnsfitunit_addvalue(&xy, &cnt, -0.16075, _state); testnsfitunit_addvalue(&xy, &cnt, 0.028058, _state); testnsfitunit_addvalue(&xy, &cnt, -0.151829, _state); testnsfitunit_addvalue(&xy, &cnt, 0.050088, _state); testnsfitunit_addvalue(&xy, &cnt, -0.141178, _state); testnsfitunit_addvalue(&xy, &cnt, 0.067646, _state); testnsfitunit_addvalue(&xy, &cnt, -0.124169, _state); testnsfitunit_addvalue(&xy, &cnt, 0.081421, _state); testnsfitunit_addvalue(&xy, &cnt, -0.10567, _state); testnsfitunit_addvalue(&xy, &cnt, 0.087305, _state); testnsfitunit_addvalue(&xy, &cnt, -0.082618, _state); testnsfitunit_addvalue(&xy, &cnt, 0.094189, _state); testnsfitunit_addvalue(&xy, &cnt, -0.064399, _state); testnsfitunit_addvalue(&xy, &cnt, 0.099445, _state); testnsfitunit_addvalue(&xy, &cnt, -0.047018, _state); testnsfitunit_addvalue(&xy, &cnt, 0.09936, _state); testnsfitunit_addvalue(&xy, &cnt, -0.028981, _state); testnsfitunit_addvalue(&xy, &cnt, 0.101784, _state); testnsfitunit_addvalue(&xy, &cnt, -0.012918, _state); tol = 1.0E-7; /* * MZC problem, NLC solver */ unsetrealarray(&cx, _state); rlo = (double)(0); rhi = (double)(0); nsfitspheremzc(&xy, xy.rows, xy.cols, &cx, &rlo, &rhi, _state); seterrorflag(err, cx.cnt!=2, _state); if( *err ) { ae_frame_leave(_state); return; } seterrorflag(err, ae_fp_greater(ae_fabs(cx.ptr.p_double[0]+0.050884688, _state),tol), _state); seterrorflag(err, ae_fp_greater(ae_fabs(cx.ptr.p_double[1]+0.011472328, _state),tol), _state); seterrorflag(err, 
ae_fp_greater(ae_fabs(rlo-0.150973382, _state),tol), _state); seterrorflag(err, ae_fp_greater(ae_fabs(rhi-0.164374709, _state),tol), _state); unsetrealarray(&cx, _state); rlo = (double)(0); rhi = (double)(0); nsfitspherex(&xy, xy.rows, xy.cols, 3, 0.0, 0, 0.0, &cx, &rlo, &rhi, _state); seterrorflag(err, cx.cnt!=2, _state); if( *err ) { ae_frame_leave(_state); return; } seterrorflag(err, ae_fp_greater(ae_fabs(cx.ptr.p_double[0]+0.050884688, _state),tol), _state); seterrorflag(err, ae_fp_greater(ae_fabs(cx.ptr.p_double[1]+0.011472328, _state),tol), _state); seterrorflag(err, ae_fp_greater(ae_fabs(rlo-0.150973382, _state),tol), _state); seterrorflag(err, ae_fp_greater(ae_fabs(rhi-0.164374709, _state),tol), _state); /* * MCC problem, NLC solver */ unsetrealarray(&cx, _state); rlo = (double)(0); rhi = (double)(0); nsfitspheremcc(&xy, xy.rows, xy.cols, &cx, &rhi, _state); seterrorflag(err, cx.cnt!=2, _state); if( *err ) { ae_frame_leave(_state); return; } seterrorflag(err, ae_fp_greater(ae_fabs(cx.ptr.p_double[0]+0.051137580, _state),tol), _state); seterrorflag(err, ae_fp_greater(ae_fabs(cx.ptr.p_double[1]+0.011680985, _state),tol), _state); seterrorflag(err, ae_fp_neq(rlo,(double)(0)), _state); seterrorflag(err, ae_fp_greater(ae_fabs(rhi-0.164365735, _state),tol), _state); unsetrealarray(&cx, _state); rlo = (double)(0); rhi = (double)(0); nsfitspherex(&xy, xy.rows, xy.cols, 1, 0.0, 0, 0.0, &cx, &rlo, &rhi, _state); seterrorflag(err, cx.cnt!=2, _state); if( *err ) { ae_frame_leave(_state); return; } seterrorflag(err, ae_fp_greater(ae_fabs(cx.ptr.p_double[0]+0.051137580, _state),tol), _state); seterrorflag(err, ae_fp_greater(ae_fabs(cx.ptr.p_double[1]+0.011680985, _state),tol), _state); seterrorflag(err, ae_fp_neq(rlo,(double)(0)), _state); seterrorflag(err, ae_fp_greater(ae_fabs(rhi-0.164365735, _state),tol), _state); /* * MIC problem, NLC solver */ unsetrealarray(&cx, _state); rlo = (double)(0); rhi = (double)(0); nsfitspheremic(&xy, xy.rows, xy.cols, &cx, &rlo, _state); seterrorflag(err, cx.cnt!=2, _state); if( *err ) { ae_frame_leave(_state); return; } seterrorflag(err, ae_fp_greater(ae_fabs(cx.ptr.p_double[0]+0.054593489, _state),tol), _state); seterrorflag(err, ae_fp_greater(ae_fabs(cx.ptr.p_double[1]+0.007459466, _state),tol), _state); seterrorflag(err, ae_fp_greater(ae_fabs(rlo-0.152429205, _state),tol), _state); seterrorflag(err, ae_fp_neq(rhi,(double)(0)), _state); unsetrealarray(&cx, _state); rlo = (double)(0); rhi = (double)(0); nsfitspherex(&xy, xy.rows, xy.cols, 2, 0.0, 0, 0.0, &cx, &rlo, &rhi, _state); seterrorflag(err, cx.cnt!=2, _state); if( *err ) { ae_frame_leave(_state); return; } seterrorflag(err, ae_fp_greater(ae_fabs(cx.ptr.p_double[0]+0.054593489, _state),tol), _state); seterrorflag(err, ae_fp_greater(ae_fabs(cx.ptr.p_double[1]+0.007459466, _state),tol), _state); seterrorflag(err, ae_fp_greater(ae_fabs(rlo-0.152429205, _state),tol), _state); seterrorflag(err, ae_fp_neq(rhi,(double)(0)), _state); ae_frame_leave(_state); } /************************************************************************* Used to calculate RLo/Rhi given XY and center position *************************************************************************/ static void testnsfitunit_calcradii(/* Real */ ae_matrix* xy, ae_int_t npoints, ae_int_t nx, /* Real */ ae_vector* cx, double* rlo, double* rhi, ae_state *_state) { ae_int_t i; ae_int_t j; double v; *rlo = 0; *rhi = 0; *rlo = ae_maxrealnumber; *rhi = (double)(0); for(i=0; i<=npoints-1; i++) { v = (double)(0); for(j=0; j<=nx-1; j++) { v = 
v+ae_sqr(xy->ptr.pp_double[i][j]-cx->ptr.p_double[j], _state); } v = ae_sqrt(v, _state); *rhi = ae_maxreal(*rhi, v, _state); *rlo = ae_minreal(*rlo, v, _state); } } /************************************************************************* Debug functions, used to print support points for RLo/RHi *************************************************************************/ static void testnsfitunit_printsupportpoints(/* Real */ ae_matrix* xy, ae_int_t npoints, ae_int_t nx, /* Real */ ae_vector* cx, ae_state *_state) { ae_int_t i; ae_int_t j; double v; double rlo; double rhi; ae_bool islo; ae_bool ishi; testnsfitunit_calcradii(xy, npoints, nx, cx, &rlo, &rhi, _state); for(i=0; i<=npoints-1; i++) { v = (double)(0); for(j=0; j<=nx-1; j++) { v = v+ae_sqr(xy->ptr.pp_double[i][j]-cx->ptr.p_double[j], _state); } v = ae_sqrt(v, _state); islo = ae_fp_less_eq(v,1.001*rlo); ishi = ae_fp_greater_eq(v,0.999*rhi); if( islo||ishi ) { printf("PT[%4d] ", (int)(i)); if( ishi ) { printf("hi-support "); } if( islo ) { printf("lo-support "); } printf("\n"); } } } /************************************************************************* Used to initialize dynamic array with constant values *************************************************************************/ static void testnsfitunit_addvalue(/* Real */ ae_matrix* xy, ae_int_t* cnt, double v, ae_state *_state) { xy->ptr.pp_double[*cnt/xy->cols][*cnt%xy->cols] = v; *cnt = *cnt+1; } static void testspline2dunit_lconst(spline2dinterpolant* c, /* Real */ ae_vector* lx, /* Real */ ae_vector* ly, ae_int_t m, ae_int_t n, double lstep, double* lc, double* lcx, double* lcy, double* lcxy, ae_state *_state); static void testspline2dunit_twodnumder(spline2dinterpolant* c, double x, double y, double h, double* f, double* fx, double* fy, double* fxy, ae_state *_state); static ae_bool testspline2dunit_testunpack(spline2dinterpolant* c, /* Real */ ae_vector* lx, /* Real */ ae_vector* ly, ae_state *_state); static ae_bool testspline2dunit_testlintrans(spline2dinterpolant* c, ae_int_t d, double ax, double bx, double ay, double by, ae_state *_state); static void testspline2dunit_unsetspline2d(spline2dinterpolant* c, ae_state *_state); static ae_bool testspline2dunit_testspline2dvf(ae_bool silent, ae_state *_state); ae_bool testspline2d(ae_bool silent, ae_state *_state) { ae_frame _frame_block; ae_bool waserrors; ae_bool blerrors; ae_bool bcerrors; ae_bool dserrors; ae_bool cperrors; ae_bool uperrors; ae_bool lterrors; ae_bool syerrors; ae_bool rlerrors; ae_bool rcerrors; ae_bool vferrors; ae_int_t pass; ae_int_t passcount; ae_int_t jobtype; double lstep; double h; ae_vector x; ae_vector y; spline2dinterpolant c; spline2dinterpolant c2; ae_vector lx; ae_vector ly; ae_vector fv; ae_matrix f; ae_matrix fr; ae_matrix ft; double ax; double ay; double bx; double by; ae_int_t i; ae_int_t j; ae_int_t k; ae_int_t n; ae_int_t m; ae_int_t d; ae_int_t n2; ae_int_t m2; double err; double t; double t1; double t2; double l1; double l1x; double l1y; double l1xy; double l2; double l2x; double l2y; double l2xy; double fm; double f1; double f2; double f3; double f4; double v1; double v1x; double v1y; double v1xy; double v2; double v2x; double v2y; double v2xy; double mf; ae_bool result; ae_frame_make(_state, &_frame_block); ae_vector_init(&x, 0, DT_REAL, _state); ae_vector_init(&y, 0, DT_REAL, _state); _spline2dinterpolant_init(&c, _state); _spline2dinterpolant_init(&c2, _state); ae_vector_init(&lx, 0, DT_REAL, _state); ae_vector_init(&ly, 0, DT_REAL, _state); ae_vector_init(&fv, 0, DT_REAL, 
_state); ae_matrix_init(&f, 0, 0, DT_REAL, _state); ae_matrix_init(&fr, 0, 0, DT_REAL, _state); ae_matrix_init(&ft, 0, 0, DT_REAL, _state); waserrors = ae_false; passcount = 10; h = 0.00001; lstep = 0.001; blerrors = ae_false; bcerrors = ae_false; dserrors = ae_false; cperrors = ae_false; uperrors = ae_false; lterrors = ae_false; syerrors = ae_false; rlerrors = ae_false; rcerrors = ae_false; vferrors = ae_false; /* * Test: bilinear, bicubic */ for(n=2; n<=7; n++) { for(m=2; m<=7; m++) { d = ae_randominteger(2, _state)+2; ae_vector_set_length(&x, n, _state); ae_vector_set_length(&y, m, _state); ae_vector_set_length(&lx, 2*n-1, _state); ae_vector_set_length(&ly, 2*m-1, _state); ae_matrix_set_length(&f, m, n, _state); ae_vector_set_length(&fv, m*n*d, _state); ae_matrix_set_length(&ft, n, m, _state); for(pass=1; pass<=passcount; pass++) { /* * Prepare task: * * X and Y stores grid * * F stores function values * * LX and LY stores twice dense grid (for Lipschitz testing) */ ax = -1-ae_randomreal(_state); bx = 1+ae_randomreal(_state); ay = -1-ae_randomreal(_state); by = 1+ae_randomreal(_state); for(j=0; j<=n-1; j++) { x.ptr.p_double[j] = 0.5*(bx+ax)-0.5*(bx-ax)*ae_cos(ae_pi*(2*j+1)/(2*n), _state); if( j==0 ) { x.ptr.p_double[j] = ax; } if( j==n-1 ) { x.ptr.p_double[j] = bx; } lx.ptr.p_double[2*j] = x.ptr.p_double[j]; if( j>0 ) { lx.ptr.p_double[2*j-1] = 0.5*(x.ptr.p_double[j]+x.ptr.p_double[j-1]); } } for(j=0; j<=n-1; j++) { k = ae_randominteger(n, _state); if( k!=j ) { t = x.ptr.p_double[j]; x.ptr.p_double[j] = x.ptr.p_double[k]; x.ptr.p_double[k] = t; } } for(i=0; i<=m-1; i++) { y.ptr.p_double[i] = 0.5*(by+ay)-0.5*(by-ay)*ae_cos(ae_pi*(2*i+1)/(2*m), _state); if( i==0 ) { y.ptr.p_double[i] = ay; } if( i==m-1 ) { y.ptr.p_double[i] = by; } ly.ptr.p_double[2*i] = y.ptr.p_double[i]; if( i>0 ) { ly.ptr.p_double[2*i-1] = 0.5*(y.ptr.p_double[i]+y.ptr.p_double[i-1]); } } for(i=0; i<=m-1; i++) { k = ae_randominteger(m, _state); if( k!=i ) { t = y.ptr.p_double[i]; y.ptr.p_double[i] = y.ptr.p_double[k]; y.ptr.p_double[k] = t; } } for(i=0; i<=m-1; i++) { for(j=0; j<=n-1; j++) { f.ptr.pp_double[i][j] = ae_exp(0.6*x.ptr.p_double[j], _state)-ae_exp(-0.3*y.ptr.p_double[i]+0.08*x.ptr.p_double[j], _state)+2*ae_cos(ae_pi*(x.ptr.p_double[j]+1.2*y.ptr.p_double[i]), _state)+0.1*ae_cos(20*x.ptr.p_double[j]+15*y.ptr.p_double[i], _state); } } for(i=0; i<=n-1; i++) { for(j=0; j<=m-1; j++) { for(k=0; k<=d-1; k++) { fv.ptr.p_double[d*(n*j+i)+k] = ae_exp(0.6*x.ptr.p_double[i], _state)-ae_exp(-0.3*y.ptr.p_double[j]+0.08*x.ptr.p_double[i], _state)+2*ae_cos(ae_pi*(x.ptr.p_double[i]+1.2*y.ptr.p_double[j]+k), _state)+0.1*ae_cos(20*x.ptr.p_double[i]+15*y.ptr.p_double[j]+k, _state); } } } /* * Test bilinear interpolation: * * interpolation at the nodes * * linearity * * continuity * * differentiation in the inner points */ spline2dbuildbilinear(&x, &y, &f, m, n, &c, _state); err = (double)(0); for(i=0; i<=m-1; i++) { for(j=0; j<=n-1; j++) { err = ae_maxreal(err, ae_fabs(f.ptr.pp_double[i][j]-spline2dcalc(&c, x.ptr.p_double[j], y.ptr.p_double[i], _state), _state), _state); } } blerrors = blerrors||ae_fp_greater(err,10000*ae_machineepsilon); err = (double)(0); for(i=0; i<=m-2; i++) { for(j=0; j<=n-2; j++) { /* * Test for linearity between grid points * (test point - geometric center of the cell) */ fm = spline2dcalc(&c, lx.ptr.p_double[2*j+1], ly.ptr.p_double[2*i+1], _state); f1 = spline2dcalc(&c, lx.ptr.p_double[2*j], ly.ptr.p_double[2*i], _state); f2 = spline2dcalc(&c, lx.ptr.p_double[2*j+2], ly.ptr.p_double[2*i], _state); f3 = 
spline2dcalc(&c, lx.ptr.p_double[2*j+2], ly.ptr.p_double[2*i+2], _state); f4 = spline2dcalc(&c, lx.ptr.p_double[2*j], ly.ptr.p_double[2*i+2], _state); err = ae_maxreal(err, ae_fabs(0.25*(f1+f2+f3+f4)-fm, _state), _state); } } blerrors = blerrors||ae_fp_greater(err,10000*ae_machineepsilon); testspline2dunit_lconst(&c, &lx, &ly, m, n, lstep, &l1, &l1x, &l1y, &l1xy, _state); testspline2dunit_lconst(&c, &lx, &ly, m, n, lstep/3, &l2, &l2x, &l2y, &l2xy, _state); blerrors = blerrors||ae_fp_greater(l2/l1,1.2); err = (double)(0); for(i=0; i<=m-2; i++) { for(j=0; j<=n-2; j++) { spline2ddiff(&c, lx.ptr.p_double[2*j+1], ly.ptr.p_double[2*i+1], &v1, &v1x, &v1y, &v1xy, _state); testspline2dunit_twodnumder(&c, lx.ptr.p_double[2*j+1], ly.ptr.p_double[2*i+1], h, &v2, &v2x, &v2y, &v2xy, _state); err = ae_maxreal(err, ae_fabs(v1-v2, _state), _state); err = ae_maxreal(err, ae_fabs(v1x-v2x, _state), _state); err = ae_maxreal(err, ae_fabs(v1y-v2y, _state), _state); err = ae_maxreal(err, ae_fabs(v1xy-v2xy, _state), _state); } } dserrors = dserrors||ae_fp_greater(err,1.0E-3); uperrors = uperrors||!testspline2dunit_testunpack(&c, &lx, &ly, _state); lterrors = lterrors||!testspline2dunit_testlintrans(&c, 1, ax, bx, ay, by, _state); /* * Lin.Trans. test for vector-function */ spline2dbuildbilinearv(&x, n, &y, m, &fv, d, &c, _state); lterrors = lterrors||!testspline2dunit_testlintrans(&c, d, ax, bx, ay, by, _state); /* * Test bicubic interpolation. * * interpolation at the nodes * * smoothness * * differentiation */ spline2dbuildbicubic(&x, &y, &f, m, n, &c, _state); err = (double)(0); for(i=0; i<=m-1; i++) { for(j=0; j<=n-1; j++) { err = ae_maxreal(err, ae_fabs(f.ptr.pp_double[i][j]-spline2dcalc(&c, x.ptr.p_double[j], y.ptr.p_double[i], _state), _state), _state); } } bcerrors = bcerrors||ae_fp_greater(err,10000*ae_machineepsilon); testspline2dunit_lconst(&c, &lx, &ly, m, n, lstep, &l1, &l1x, &l1y, &l1xy, _state); testspline2dunit_lconst(&c, &lx, &ly, m, n, lstep/3, &l2, &l2x, &l2y, &l2xy, _state); bcerrors = bcerrors||ae_fp_greater(l2/l1,1.2); bcerrors = bcerrors||ae_fp_greater(l2x/l1x,1.2); bcerrors = bcerrors||ae_fp_greater(l2y/l1y,1.2); if( ae_fp_greater(l2xy,0.01)&&ae_fp_greater(l1xy,0.01) ) { /* * Cross-derivative continuity is tested only when * bigger than 0.01. When the task size is too * small, the d2F/dXdY is nearly zero and Lipschitz * constant ratio is ill-conditioned. */ bcerrors = bcerrors||ae_fp_greater(l2xy/l1xy,1.2); } err = (double)(0); for(i=0; i<=2*m-2; i++) { for(j=0; j<=2*n-2; j++) { spline2ddiff(&c, lx.ptr.p_double[j], ly.ptr.p_double[i], &v1, &v1x, &v1y, &v1xy, _state); testspline2dunit_twodnumder(&c, lx.ptr.p_double[j], ly.ptr.p_double[i], h, &v2, &v2x, &v2y, &v2xy, _state); err = ae_maxreal(err, ae_fabs(v1-v2, _state), _state); err = ae_maxreal(err, ae_fabs(v1x-v2x, _state), _state); err = ae_maxreal(err, ae_fabs(v1y-v2y, _state), _state); err = ae_maxreal(err, ae_fabs(v1xy-v2xy, _state), _state); } } dserrors = dserrors||ae_fp_greater(err,1.0E-3); uperrors = uperrors||!testspline2dunit_testunpack(&c, &lx, &ly, _state); lterrors = lterrors||!testspline2dunit_testlintrans(&c, 1, ax, bx, ay, by, _state); /* * Lin.Trans. 
test for vector-function */ spline2dbuildbicubicv(&x, n, &y, m, &fv, d, &c, _state); lterrors = lterrors||!testspline2dunit_testlintrans(&c, d, ax, bx, ay, by, _state); /* * Copy test */ if( ae_fp_greater(ae_randomreal(_state),0.5) ) { spline2dbuildbicubic(&x, &y, &f, m, n, &c, _state); } else { spline2dbuildbilinear(&x, &y, &f, m, n, &c, _state); } testspline2dunit_unsetspline2d(&c2, _state); spline2dcopy(&c, &c2, _state); err = (double)(0); for(i=1; i<=5; i++) { t1 = ax+(bx-ax)*ae_randomreal(_state); t2 = ay+(by-ay)*ae_randomreal(_state); err = ae_maxreal(err, ae_fabs(spline2dcalc(&c, t1, t2, _state)-spline2dcalc(&c2, t1, t2, _state), _state), _state); } cperrors = cperrors||ae_fp_greater(err,10000*ae_machineepsilon); /* * Special symmetry test */ err = (double)(0); for(jobtype=0; jobtype<=1; jobtype++) { /* * Prepare */ for(i=0; i<=m-1; i++) { for(j=0; j<=n-1; j++) { ft.ptr.pp_double[j][i] = f.ptr.pp_double[i][j]; } } if( jobtype==0 ) { spline2dbuildbilinear(&x, &y, &f, m, n, &c, _state); spline2dbuildbilinear(&y, &x, &ft, n, m, &c2, _state); } else { spline2dbuildbicubic(&x, &y, &f, m, n, &c, _state); spline2dbuildbicubic(&y, &x, &ft, n, m, &c2, _state); } /* * Test */ for(i=1; i<=10; i++) { t1 = ax+(bx-ax)*ae_randomreal(_state); t2 = ay+(by-ay)*ae_randomreal(_state); err = ae_maxreal(err, ae_fabs(spline2dcalc(&c, t1, t2, _state)-spline2dcalc(&c2, t2, t1, _state), _state), _state); } } syerrors = syerrors||ae_fp_greater(err,10000*ae_machineepsilon); } } } /* * Test resample */ for(m=2; m<=6; m++) { for(n=2; n<=6; n++) { ae_matrix_set_length(&f, m-1+1, n-1+1, _state); ae_vector_set_length(&x, n-1+1, _state); ae_vector_set_length(&y, m-1+1, _state); for(j=0; j<=n-1; j++) { x.ptr.p_double[j] = (double)j/(double)(n-1); } for(i=0; i<=m-1; i++) { y.ptr.p_double[i] = (double)i/(double)(m-1); } for(i=0; i<=m-1; i++) { for(j=0; j<=n-1; j++) { f.ptr.pp_double[i][j] = ae_exp(0.6*x.ptr.p_double[j], _state)-ae_exp(-0.3*y.ptr.p_double[i]+0.08*x.ptr.p_double[j], _state)+2*ae_cos(ae_pi*(x.ptr.p_double[j]+1.2*y.ptr.p_double[i]), _state)+0.1*ae_cos(20*x.ptr.p_double[j]+15*y.ptr.p_double[i], _state); } } for(m2=2; m2<=6; m2++) { for(n2=2; n2<=6; n2++) { for(pass=1; pass<=passcount; pass++) { for(jobtype=0; jobtype<=1; jobtype++) { if( jobtype==0 ) { spline2dresamplebilinear(&f, m, n, &fr, m2, n2, _state); spline2dbuildbilinear(&x, &y, &f, m, n, &c, _state); } if( jobtype==1 ) { spline2dresamplebicubic(&f, m, n, &fr, m2, n2, _state); spline2dbuildbicubic(&x, &y, &f, m, n, &c, _state); } err = (double)(0); mf = (double)(0); for(i=0; i<=m2-1; i++) { for(j=0; j<=n2-1; j++) { v1 = spline2dcalc(&c, (double)j/(double)(n2-1), (double)i/(double)(m2-1), _state); v2 = fr.ptr.pp_double[i][j]; err = ae_maxreal(err, ae_fabs(v1-v2, _state), _state); mf = ae_maxreal(mf, ae_fabs(v1, _state), _state); } } if( jobtype==0 ) { rlerrors = rlerrors||ae_fp_greater(err/mf,10000*ae_machineepsilon); } if( jobtype==1 ) { rcerrors = rcerrors||ae_fp_greater(err/mf,10000*ae_machineepsilon); } } } } } } } /* * Test for vector-function */ vferrors = testspline2dunit_testspline2dvf(ae_true, _state); /* * Report */ waserrors = ((((((((blerrors||bcerrors)||dserrors)||cperrors)||uperrors)||lterrors)||syerrors)||rlerrors)||rcerrors)||vferrors; if( !silent ) { printf("TESTING 2D SPLINE\n"); /* * Normal tests */ printf("BILINEAR TEST: "); if( blerrors ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("BICUBIC TEST: "); if( bcerrors ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("DIFFERENTIATION TEST: "); if( dserrors ) { 
printf("FAILED\n"); } else { printf("OK\n"); } printf("COPY/SERIALIZE TEST: "); if( cperrors ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("UNPACK TEST: "); if( uperrors ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("LIN.TRANS. TEST: "); if( lterrors ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("SPECIAL SYMMETRY TEST: "); if( syerrors ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("BILINEAR RESAMPLING TEST: "); if( rlerrors ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("BICUBIC RESAMPLING TEST: "); if( rcerrors ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("VECTOR FUNCTION TEST: "); if( vferrors ) { printf("FAILED\n"); } else { printf("OK\n"); } /* * Summary */ if( waserrors ) { printf("TEST FAILED\n"); } else { printf("TEST PASSED\n"); } printf("\n\n"); } /* * end */ result = !waserrors; ae_frame_leave(_state); return result; } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. *************************************************************************/ ae_bool _pexec_testspline2d(ae_bool silent, ae_state *_state) { return testspline2d(silent, _state); } /************************************************************************* Lipschitz constants for spline inself, first and second derivatives. *************************************************************************/ static void testspline2dunit_lconst(spline2dinterpolant* c, /* Real */ ae_vector* lx, /* Real */ ae_vector* ly, ae_int_t m, ae_int_t n, double lstep, double* lc, double* lcx, double* lcy, double* lcxy, ae_state *_state) { ae_int_t i; ae_int_t j; double f1; double f2; double f3; double f4; double fx1; double fx2; double fx3; double fx4; double fy1; double fy2; double fy3; double fy4; double fxy1; double fxy2; double fxy3; double fxy4; double s2lstep; *lc = 0; *lcx = 0; *lcy = 0; *lcxy = 0; *lc = (double)(0); *lcx = (double)(0); *lcy = (double)(0); *lcxy = (double)(0); s2lstep = ae_sqrt((double)(2), _state)*lstep; for(i=0; i<=m-1; i++) { for(j=0; j<=n-1; j++) { /* * Calculate */ testspline2dunit_twodnumder(c, lx->ptr.p_double[j]-lstep/2, ly->ptr.p_double[i]-lstep/2, lstep/4, &f1, &fx1, &fy1, &fxy1, _state); testspline2dunit_twodnumder(c, lx->ptr.p_double[j]+lstep/2, ly->ptr.p_double[i]-lstep/2, lstep/4, &f2, &fx2, &fy2, &fxy2, _state); testspline2dunit_twodnumder(c, lx->ptr.p_double[j]+lstep/2, ly->ptr.p_double[i]+lstep/2, lstep/4, &f3, &fx3, &fy3, &fxy3, _state); testspline2dunit_twodnumder(c, lx->ptr.p_double[j]-lstep/2, ly->ptr.p_double[i]+lstep/2, lstep/4, &f4, &fx4, &fy4, &fxy4, _state); /* * Lipschitz constant for the function itself */ *lc = ae_maxreal(*lc, ae_fabs((f1-f2)/lstep, _state), _state); *lc = ae_maxreal(*lc, ae_fabs((f2-f3)/lstep, _state), _state); *lc = ae_maxreal(*lc, ae_fabs((f3-f4)/lstep, _state), _state); *lc = ae_maxreal(*lc, ae_fabs((f4-f1)/lstep, _state), _state); *lc = ae_maxreal(*lc, ae_fabs((f1-f3)/s2lstep, _state), _state); *lc = ae_maxreal(*lc, ae_fabs((f2-f4)/s2lstep, _state), _state); /* * Lipschitz constant for the first derivative */ *lcx = ae_maxreal(*lcx, ae_fabs((fx1-fx2)/lstep, _state), _state); *lcx = ae_maxreal(*lcx, ae_fabs((fx2-fx3)/lstep, _state), _state); *lcx = ae_maxreal(*lcx, ae_fabs((fx3-fx4)/lstep, _state), _state); *lcx = ae_maxreal(*lcx, ae_fabs((fx4-fx1)/lstep, _state), _state); *lcx = ae_maxreal(*lcx, ae_fabs((fx1-fx3)/s2lstep, _state), _state); *lcx = ae_maxreal(*lcx, ae_fabs((fx2-fx4)/s2lstep, _state), _state); 
/* * Lipschitz constant for the first derivative */ *lcy = ae_maxreal(*lcy, ae_fabs((fy1-fy2)/lstep, _state), _state); *lcy = ae_maxreal(*lcy, ae_fabs((fy2-fy3)/lstep, _state), _state); *lcy = ae_maxreal(*lcy, ae_fabs((fy3-fy4)/lstep, _state), _state); *lcy = ae_maxreal(*lcy, ae_fabs((fy4-fy1)/lstep, _state), _state); *lcy = ae_maxreal(*lcy, ae_fabs((fy1-fy3)/s2lstep, _state), _state); *lcy = ae_maxreal(*lcy, ae_fabs((fy2-fy4)/s2lstep, _state), _state); /* * Lipschitz constant for the cross-derivative */ *lcxy = ae_maxreal(*lcxy, ae_fabs((fxy1-fxy2)/lstep, _state), _state); *lcxy = ae_maxreal(*lcxy, ae_fabs((fxy2-fxy3)/lstep, _state), _state); *lcxy = ae_maxreal(*lcxy, ae_fabs((fxy3-fxy4)/lstep, _state), _state); *lcxy = ae_maxreal(*lcxy, ae_fabs((fxy4-fxy1)/lstep, _state), _state); *lcxy = ae_maxreal(*lcxy, ae_fabs((fxy1-fxy3)/s2lstep, _state), _state); *lcxy = ae_maxreal(*lcxy, ae_fabs((fxy2-fxy4)/s2lstep, _state), _state); } } } /************************************************************************* Numerical differentiation. *************************************************************************/ static void testspline2dunit_twodnumder(spline2dinterpolant* c, double x, double y, double h, double* f, double* fx, double* fy, double* fxy, ae_state *_state) { *f = 0; *fx = 0; *fy = 0; *fxy = 0; *f = spline2dcalc(c, x, y, _state); *fx = (spline2dcalc(c, x+h, y, _state)-spline2dcalc(c, x-h, y, _state))/(2*h); *fy = (spline2dcalc(c, x, y+h, _state)-spline2dcalc(c, x, y-h, _state))/(2*h); *fxy = (spline2dcalc(c, x+h, y+h, _state)-spline2dcalc(c, x-h, y+h, _state)-spline2dcalc(c, x+h, y-h, _state)+spline2dcalc(c, x-h, y-h, _state))/ae_sqr(2*h, _state); } /************************************************************************* Unpack test *************************************************************************/ static ae_bool testspline2dunit_testunpack(spline2dinterpolant* c, /* Real */ ae_vector* lx, /* Real */ ae_vector* ly, ae_state *_state) { ae_frame _frame_block; ae_int_t i; ae_int_t j; ae_int_t n; ae_int_t m; ae_int_t ci; ae_int_t cj; ae_int_t p; double err; double tx; double ty; double v1; double v2; ae_int_t pass; ae_int_t passcount; ae_matrix tbl; ae_bool result; ae_frame_make(_state, &_frame_block); ae_matrix_init(&tbl, 0, 0, DT_REAL, _state); passcount = 20; err = (double)(0); spline2dunpack(c, &m, &n, &tbl, _state); for(i=0; i<=m-2; i++) { for(j=0; j<=n-2; j++) { for(pass=1; pass<=passcount; pass++) { p = (n-1)*i+j; tx = (0.001+0.999*ae_randomreal(_state))*(tbl.ptr.pp_double[p][1]-tbl.ptr.pp_double[p][0]); ty = (0.001+0.999*ae_randomreal(_state))*(tbl.ptr.pp_double[p][3]-tbl.ptr.pp_double[p][2]); /* * Interpolation properties */ v1 = (double)(0); for(ci=0; ci<=3; ci++) { for(cj=0; cj<=3; cj++) { v1 = v1+tbl.ptr.pp_double[p][4+ci*4+cj]*ae_pow(tx, (double)(ci), _state)*ae_pow(ty, (double)(cj), _state); } } v2 = spline2dcalc(c, tbl.ptr.pp_double[p][0]+tx, tbl.ptr.pp_double[p][2]+ty, _state); err = ae_maxreal(err, ae_fabs(v1-v2, _state), _state); /* * Grid correctness */ err = ae_maxreal(err, ae_fabs(lx->ptr.p_double[2*j]-tbl.ptr.pp_double[p][0], _state), _state); err = ae_maxreal(err, ae_fabs(lx->ptr.p_double[2*(j+1)]-tbl.ptr.pp_double[p][1], _state), _state); err = ae_maxreal(err, ae_fabs(ly->ptr.p_double[2*i]-tbl.ptr.pp_double[p][2], _state), _state); err = ae_maxreal(err, ae_fabs(ly->ptr.p_double[2*(i+1)]-tbl.ptr.pp_double[p][3], _state), _state); } } } result = ae_fp_less(err,10000*ae_machineepsilon); ae_frame_leave(_state); return result; } 
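/*************************************************************************
Illustrative sketch, not referenced by the tests: evaluates one cell of the
coefficient table produced by Spline2DUnpack, assuming only the layout
checked by TestUnpack above - Tbl[P][0..1] are the X bounds of cell P,
Tbl[P][2..3] are its Y bounds, and Tbl[P][4+CI*4+CJ] multiplies
(X-X0)^CI*(Y-Y0)^CJ. The helper name ExampleEvalUnpackedCell is
hypothetical and is not part of ALGLIB; it reproduces the V1 accumulation
performed inside TestUnpack.
*************************************************************************/
static double example_evalunpackedcell(/* Real */ ae_matrix* tbl,
     ae_int_t p,
     double x,
     double y,
     ae_state *_state)
{
    ae_int_t ci;
    ae_int_t cj;
    double tx;
    double ty;
    double result;

    /*
     * TX/TY are offsets from the lower-left corner of cell P,
     * exactly as in TestUnpack above
     */
    tx = x-tbl->ptr.pp_double[p][0];
    ty = y-tbl->ptr.pp_double[p][2];
    result = (double)(0);
    for(ci=0; ci<=3; ci++)
    {
        for(cj=0; cj<=3; cj++)
        {
            result = result+tbl->ptr.pp_double[p][4+ci*4+cj]*ae_pow(tx, (double)(ci), _state)*ae_pow(ty, (double)(cj), _state);
        }
    }
    return result;
}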
/************************************************************************* LinTrans test for scalar *************************************************************************/ static ae_bool testspline2dunit_testlintrans(spline2dinterpolant* c, ae_int_t d, double ax, double bx, double ay, double by, ae_state *_state) { ae_frame _frame_block; double err; double a1; double a2; double b1; double b2; double tx; double ty; double vx; double vy; ae_vector v1; ae_vector v2; ae_int_t pass; ae_int_t passcount; ae_int_t xjob; ae_int_t yjob; spline2dinterpolant c2; ae_int_t di; ae_bool result; ae_frame_make(_state, &_frame_block); ae_vector_init(&v1, 0, DT_REAL, _state); ae_vector_init(&v2, 0, DT_REAL, _state); _spline2dinterpolant_init(&c2, _state); passcount = 5; err = (double)(0); for(xjob=0; xjob<=1; xjob++) { for(yjob=0; yjob<=1; yjob++) { for(pass=1; pass<=passcount; pass++) { /* * Prepare */ do { a1 = 2*ae_randomreal(_state)-1; } while(ae_fp_eq(a1,(double)(0))); a1 = a1*xjob; b1 = 2*ae_randomreal(_state)-1; do { a2 = 2*ae_randomreal(_state)-1; } while(ae_fp_eq(a2,(double)(0))); a2 = a2*yjob; b2 = 2*ae_randomreal(_state)-1; /* * Test XY */ spline2dcopy(c, &c2, _state); spline2dlintransxy(&c2, a1, b1, a2, b2, _state); tx = ax+ae_randomreal(_state)*(bx-ax); ty = ay+ae_randomreal(_state)*(by-ay); if( xjob==0 ) { tx = b1; vx = ax+ae_randomreal(_state)*(bx-ax); } else { vx = (tx-b1)/a1; } if( yjob==0 ) { ty = b2; vy = ay+ae_randomreal(_state)*(by-ay); } else { vy = (ty-b2)/a2; } spline2dcalcv(c, tx, ty, &v1, _state); spline2dcalcv(&c2, vx, vy, &v2, _state); for(di=0; di<=d-1; di++) { err = ae_maxreal(err, ae_fabs(v1.ptr.p_double[di]-v2.ptr.p_double[di], _state), _state); } /* * Test F */ spline2dcopy(c, &c2, _state); spline2dlintransf(&c2, a1, b1, _state); tx = ax+ae_randomreal(_state)*(bx-ax); ty = ay+ae_randomreal(_state)*(by-ay); spline2dcalcv(c, tx, ty, &v1, _state); spline2dcalcv(&c2, tx, ty, &v2, _state); for(di=0; di<=d-1; di++) { err = ae_maxreal(err, ae_fabs(a1*v1.ptr.p_double[di]+b1-v2.ptr.p_double[di], _state), _state); } } } } result = ae_fp_less(err,10000*ae_machineepsilon); ae_frame_leave(_state); return result; } /************************************************************************* Unset spline, i.e. initialize it with random garbage *************************************************************************/ static void testspline2dunit_unsetspline2d(spline2dinterpolant* c, ae_state *_state) { ae_frame _frame_block; ae_vector x; ae_vector y; ae_matrix f; ae_frame_make(_state, &_frame_block); _spline2dinterpolant_clear(c); ae_vector_init(&x, 0, DT_REAL, _state); ae_vector_init(&y, 0, DT_REAL, _state); ae_matrix_init(&f, 0, 0, DT_REAL, _state); ae_vector_set_length(&x, 2, _state); ae_vector_set_length(&y, 2, _state); ae_matrix_set_length(&f, 2, 2, _state); x.ptr.p_double[0] = (double)(-1); x.ptr.p_double[1] = (double)(1); y.ptr.p_double[0] = (double)(-1); y.ptr.p_double[1] = (double)(1); f.ptr.pp_double[0][0] = (double)(0); f.ptr.pp_double[0][1] = (double)(0); f.ptr.pp_double[1][0] = (double)(0); f.ptr.pp_double[1][1] = (double)(0); spline2dbuildbilinear(&x, &y, &f, 2, 2, c, _state); ae_frame_leave(_state); } /************************************************************************* The function check, that follow functions works correctly: Spline2DBilinearV, Spline2DBicubicV, Spline2DCalcV and Spline2DUnpackV. 
*************************************************************************/ static ae_bool testspline2dunit_testspline2dvf(ae_bool silent, ae_state *_state) { ae_frame _frame_block; spline2dinterpolant vc; spline2dinterpolant sc; double range; ae_vector x; ae_vector y; ae_vector f; double rndx; double rndy; ae_int_t nrnd; ae_vector resf; ae_matrix ef; double resef; ae_int_t m; ae_int_t n; ae_int_t d; ae_int_t tstn; ae_int_t tstm; ae_int_t tstd; ae_matrix tsttbl0; ae_matrix tsttbl1; double eps; double st; ae_int_t p0; ae_int_t p1; ae_int_t variant; ae_int_t pass; ae_int_t passcount; ae_int_t i; ae_int_t j; ae_int_t k; ae_int_t i0; ae_int_t j0; double xmin; double xmax; double ymin; double ymax; ae_bool result; ae_frame_make(_state, &_frame_block); _spline2dinterpolant_init(&vc, _state); _spline2dinterpolant_init(&sc, _state); ae_vector_init(&x, 0, DT_REAL, _state); ae_vector_init(&y, 0, DT_REAL, _state); ae_vector_init(&f, 0, DT_REAL, _state); ae_vector_init(&resf, 0, DT_REAL, _state); ae_matrix_init(&ef, 0, 0, DT_REAL, _state); ae_matrix_init(&tsttbl0, 0, 0, DT_REAL, _state); ae_matrix_init(&tsttbl1, 0, 0, DT_REAL, _state); eps = 10000.0*ae_machineepsilon; st = 0.1; passcount = 5; for(pass=1; pass<=passcount; pass++) { for(variant=1; variant<=2; variant++) { range = ae_randominteger(71, _state)+30.0; nrnd = ae_randominteger(26, _state)+25; range = (double)(ae_randominteger(71, _state)+30); m = ae_randominteger(4, _state)+2; n = ae_randominteger(4, _state)+2; d = ae_randominteger(3, _state)+1; rvectorsetlengthatleast(&x, n, _state); rvectorsetlengthatleast(&y, m, _state); rvectorsetlengthatleast(&f, n*m*d, _state); rmatrixsetlengthatleast(&ef, m, n, _state); /* * Build a grid for spline */ x.ptr.p_double[0] = range*(2*ae_randomreal(_state)-1); y.ptr.p_double[0] = range*(2*ae_randomreal(_state)-1); for(i=1; i<=n-1; i++) { x.ptr.p_double[i] = x.ptr.p_double[i-1]+st+ae_randomreal(_state); } for(i=1; i<=m-1; i++) { y.ptr.p_double[i] = y.ptr.p_double[i-1]+st+ae_randomreal(_state); } for(i=0; i<=n-1; i++) { for(j=0; j<=m-1; j++) { for(k=0; k<=d-1; k++) { f.ptr.p_double[d*(n*j+i)+k] = range*(2*ae_randomreal(_state)-1); } } } xmin = x.ptr.p_double[0]; xmax = x.ptr.p_double[n-1]; ymin = y.ptr.p_double[0]; ymax = y.ptr.p_double[m-1]; /* * Build a spline */ if( variant==1 ) { spline2dbuildbilinearv(&x, n, &y, m, &f, d, &vc, _state); } if( variant==2 ) { spline2dbuildbicubicv(&x, n, &y, m, &f, d, &vc, _state); } /* * Part of test, which shows that Spline2DBuildBilinearV function * works correctly. * And there is test for Spline2DUnpackV. 
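For reference, each row of the table returned by Spline2DUnpack describes one grid cell: row P = (N-1)*I + J corresponds to the cell [X_J, X_(J+1)] x [Y_I, Y_(I+1)], columns 0..1 hold the X bounds, columns 2..3 the Y bounds, and columns 4..19 hold the 16 coefficients of the cell's polynomial. Inside the cell the spline value is recovered exactly as in the unpack test earlier in this unit (illustrative snippet; DX, DY are offsets from the lower-left corner of cell P):

    v = 0.0;
    for(ci=0; ci<=3; ci++)
        for(cj=0; cj<=3; cj++)
            v = v + tbl[P][4+ci*4+cj]*pow(dx, ci)*pow(dy, cj);

Spline2DUnpackV uses the same 20-column per-cell layout; for a D-component spline, rows D*P+0 .. D*P+D-1 hold the coefficients of the D components of cell P, which is what the element-by-element comparison below relies on.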
*/ spline2dunpackv(&vc, &tstm, &tstn, &tstd, &tsttbl1, _state); if( (tstm!=m||tstn!=n)||tstd!=d ) { if( !silent ) { printf("TestSpline2DVF fail Spline2DUnpack:\n"); printf(" TstM=%0d; M=%0d;\n TstN=%0d; N=%0d;\n TstD=%0d; D=%0d.\n", (int)(tstm), (int)(m), (int)(tstn), (int)(n), (int)(tstd), (int)(d)); } result = ae_true; ae_frame_leave(_state); return result; } for(k=0; k<=d-1; k++) { for(i=0; i<=m-1; i++) { for(j=0; j<=n-1; j++) { ef.ptr.pp_double[i][j] = f.ptr.p_double[d*(i*n+j)+k]; } } if( variant==1 ) { spline2dbuildbilinear(&x, &y, &ef, m, n, &sc, _state); } if( variant==2 ) { spline2dbuildbicubic(&x, &y, &ef, m, n, &sc, _state); } spline2dunpack(&sc, &tstm, &tstn, &tsttbl0, _state); if( tstm!=m||tstn!=n ) { if( !silent ) { printf("TestSpline2DVF fail Spline2DUnpack:\n"); printf(" TstM=%0d; M=%0d;\n TstN=%0d; N=%0d.\n", (int)(tstm), (int)(m), (int)(tstn), (int)(n)); } result = ae_true; ae_frame_leave(_state); return result; } for(i=0; i<=m-2; i++) { for(j=0; j<=n-2; j++) { p0 = i*(n-1)+j; p1 = d*p0; for(i0=0; i0<=19; i0++) { if( ae_fp_neq(tsttbl1.ptr.pp_double[p1+k][i0],tsttbl0.ptr.pp_double[p0][i0]) ) { if( !silent ) { printf("TestSpline2DVF: Tbl error\n"); } result = ae_true; ae_frame_leave(_state); return result; } } } } } /* * Part of test, which shows that functions Spline2DCalcVBuf and Spline2DCalcV * works correctly */ for(i=0; i<=n-1; i++) { for(j=0; j<=m-1; j++) { /* * Test for grid points */ spline2dcalcv(&vc, x.ptr.p_double[i], y.ptr.p_double[j], &resf, _state); for(k=0; k<=d-1; k++) { for(i0=0; i0<=m-1; i0++) { for(j0=0; j0<=n-1; j0++) { ef.ptr.pp_double[i0][j0] = f.ptr.p_double[d*(i0*n+j0)+k]; } } if( variant==1 ) { spline2dbuildbilinear(&x, &y, &ef, m, n, &sc, _state); } if( variant==2 ) { spline2dbuildbicubic(&x, &y, &ef, m, n, &sc, _state); } resef = spline2dcalc(&sc, x.ptr.p_double[i], y.ptr.p_double[j], _state); if( ae_fp_greater(ae_fabs(resf.ptr.p_double[k]-resef, _state),eps) ) { if( !silent ) { printf("TestSpline2DVF fail Spline2DCalcV:\n"); printf(" %0.5f=|resF[%0d]-resEF|=|%0.5f-%0.5f|>Eps=%0.2e;\n", (double)(ae_fabs(resf.ptr.p_double[k]-resef, _state)), (int)(k), (double)(resf.ptr.p_double[k]), (double)(resef), (double)(eps)); printf(" resF[%0d]=%0.5f;\n", (int)(k), (double)(resf.ptr.p_double[k])); printf(" resEF=%0.5f.\n", (double)(resef)); } result = ae_true; ae_frame_leave(_state); return result; } } } } /* * Test for random points */ for(i=1; i<=nrnd; i++) { rndx = xmin+(xmax-xmin)*ae_randomreal(_state); rndy = ymin+(ymax-ymin)*ae_randomreal(_state); /* * Calculate value for vector-function in random point */ spline2dcalcv(&vc, rndx, rndy, &resf, _state); for(k=0; k<=d-1; k++) { /* * Build spline for scalar-function, each of which correspond * to one of vector-function's components. 
*/ for(i0=0; i0<=m-1; i0++) { for(j0=0; j0<=n-1; j0++) { ef.ptr.pp_double[i0][j0] = f.ptr.p_double[d*(i0*n+j0)+k]; } } if( variant==1 ) { spline2dbuildbilinear(&x, &y, &ef, m, n, &sc, _state); } if( variant==2 ) { spline2dbuildbicubic(&x, &y, &ef, m, n, &sc, _state); } resef = spline2dcalc(&sc, rndx, rndy, _state); if( ae_fp_greater(ae_fabs(resf.ptr.p_double[k]-resef, _state),eps) ) { if( !silent ) { printf("TestSpline2DVF fail Spline2DCalcV:\n"); printf(" %0.5f=|resF[%0d]-resEF|=|%0.5f-%0.5f|>Eps=%0.2e;\n", (double)(ae_fabs(resf.ptr.p_double[k]-resef, _state)), (int)(k), (double)(resf.ptr.p_double[k]), (double)(resef), (double)(eps)); printf(" resF[%0d]=%0.5f;\n", (int)(k), (double)(resf.ptr.p_double[k])); printf(" resEF=%0.5f.\n", (double)(resef)); } result = ae_true; ae_frame_leave(_state); return result; } } } } } if( !silent ) { printf("TestSpline2DVF: OK\n"); } result = ae_false; ae_frame_leave(_state); return result; } static double testrbfunit_tol = 1.0E-10; static ae_int_t testrbfunit_mxits = 0; static double testrbfunit_heps = 1.0E-12; static ae_bool testrbfunit_specialtest(ae_state *_state); static ae_bool testrbfunit_basicrbftest(ae_state *_state); static ae_bool testrbfunit_irregularrbftest(ae_state *_state); static ae_bool testrbfunit_linearitymodelrbftest(ae_state *_state); static ae_bool testrbfunit_serializationtest(ae_state *_state); static ae_bool testrbfunit_searcherr(/* Real */ ae_matrix* y0, /* Real */ ae_matrix* y1, ae_int_t n, ae_int_t ny, ae_int_t errtype, /* Real */ ae_vector* b1, /* Real */ ae_vector* delta, ae_state *_state); static ae_bool testrbfunit_basicmultilayerrbftest(ae_state *_state); static void testrbfunit_gridcalc23test(ae_bool* errorflag, ae_state *_state); static ae_bool testrbfunit_basichrbftest(ae_state *_state); static ae_bool testrbfunit_scaledhrbftest(ae_state *_state); static ae_bool testrbfunit_spechrbftest(ae_state *_state); static ae_bool testrbfunit_gridhrbftest(ae_state *_state); ae_bool testrbf(ae_bool silent, ae_state *_state) { ae_bool specialerrors; ae_bool basicrbferrors; ae_bool irregularrbferrors; ae_bool linearitymodelrbferr; ae_bool sqrdegmatrixrbferr; ae_bool sererrors; ae_bool multilayerrbf1derrors; ae_bool multilayerrbferrors; ae_bool gridcalc23errors; ae_bool hrbfbasicerrors; ae_bool hrbfscaleerrors; ae_bool hrbfspecerrors; ae_bool hrbfgriderrors; ae_bool hrbferrors; ae_bool waserrors; ae_bool result; /* * HRBF tests */ hrbfbasicerrors = testrbfunit_basichrbftest(_state); hrbfspecerrors = testrbfunit_spechrbftest(_state); hrbfscaleerrors = testrbfunit_scaledhrbftest(_state); hrbfgriderrors = testrbfunit_gridhrbftest(_state); hrbferrors = ((hrbfbasicerrors||hrbfspecerrors)||hrbfscaleerrors)||hrbfgriderrors; /* * Other tests */ specialerrors = testrbfunit_specialtest(_state); basicrbferrors = testrbfunit_basicrbftest(_state); irregularrbferrors = testrbfunit_irregularrbftest(_state); linearitymodelrbferr = testrbfunit_linearitymodelrbftest(_state); sqrdegmatrixrbferr = sqrdegmatrixrbftest(ae_true, _state); multilayerrbf1derrors = ae_false; multilayerrbferrors = testrbfunit_basicmultilayerrbftest(_state); sererrors = testrbfunit_serializationtest(_state); gridcalc23errors = ae_false; testrbfunit_gridcalc23test(&gridcalc23errors, _state); /* * report */ waserrors = ((((((((specialerrors||basicrbferrors)||irregularrbferrors)||linearitymodelrbferr)||sqrdegmatrixrbferr)||sererrors)||multilayerrbf1derrors)||multilayerrbferrors)||gridcalc23errors)||hrbferrors; if( !silent ) { printf("TESTING RBF\n"); printf("GENERAL TESTS:\n"); 
printf("* serialization test: "); if( sererrors ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("* special properties: "); if( specialerrors ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("RBF-V2:\n"); printf("* basic HRBF test: "); if( hrbfbasicerrors ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("* scale-related tests: "); if( hrbfscaleerrors ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("* grid calculation tests: "); if( hrbfgriderrors ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("* special properties: "); if( hrbfspecerrors ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("RBF-V1:\n"); printf("* basicRBFTest: "); if( basicrbferrors ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("* irregularRBFTest: "); if( irregularrbferrors ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("* linearity test: "); if( linearitymodelrbferr ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("* SqrDegMatrixRBFTest: "); if( sqrdegmatrixrbferr ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("* MultiLayerRBFErrors in 1D test: "); if( multilayerrbf1derrors ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("* MultiLayerRBFErrors in 2-3D test: "); if( multilayerrbferrors ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("* GridCalc2/3V: "); if( gridcalc23errors ) { printf("FAILED\n"); } else { printf("OK\n"); } /* * was errors? */ if( waserrors ) { printf("TEST FAILED\n"); } else { printf("TEST PASSED\n"); } printf("\n\n"); } result = !waserrors; return result; } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. *************************************************************************/ ae_bool _pexec_testrbf(ae_bool silent, ae_state *_state) { return testrbf(silent, _state); } /************************************************************************* The test has to check, that algorithm can solve problems of matrix are degenerate. * used model with linear term; * points locate in a subspace of dimension less than an original space. 
-- ALGLIB -- Copyright 13.12.2011 by Bochkanov Sergey *************************************************************************/ ae_bool sqrdegmatrixrbftest(ae_bool silent, ae_state *_state) { ae_frame _frame_block; rbfmodel s; rbfreport rep; ae_int_t nx; ae_int_t ny; ae_int_t k0; ae_int_t k1; ae_int_t np; double sx; double sy; double zx; double px; double zy; double py; double q; double z; ae_vector point; ae_matrix a; ae_vector d0; ae_vector d1; ae_int_t gen; ae_vector pvd0; ae_vector pvd1; double pvdnorm; double vnorm; double dd0; double dd1; ae_matrix gp; ae_vector x; ae_vector y; ae_int_t unx; ae_int_t uny; ae_matrix xwr; ae_matrix v; ae_int_t i; ae_int_t j; ae_int_t k; double eps; ae_int_t modelversion; ae_bool result; ae_frame_make(_state, &_frame_block); _rbfmodel_init(&s, _state); _rbfreport_init(&rep, _state); ae_vector_init(&point, 0, DT_REAL, _state); ae_matrix_init(&a, 0, 0, DT_REAL, _state); ae_vector_init(&d0, 0, DT_REAL, _state); ae_vector_init(&d1, 0, DT_REAL, _state); ae_vector_init(&pvd0, 0, DT_REAL, _state); ae_vector_init(&pvd1, 0, DT_REAL, _state); ae_matrix_init(&gp, 0, 0, DT_REAL, _state); ae_vector_init(&x, 0, DT_REAL, _state); ae_vector_init(&y, 0, DT_REAL, _state); ae_matrix_init(&xwr, 0, 0, DT_REAL, _state); ae_matrix_init(&v, 0, 0, DT_REAL, _state); zx = (double)(10); px = (double)(15); zy = (double)(10); py = (double)(15); eps = 1.0E-6; ny = 1; for(nx=2; nx<=3; nx++) { /* * prepare test problem */ sx = ae_pow(zx, px*(ae_randominteger(3, _state)-1), _state); sy = ae_pow(zy, py*(ae_randominteger(3, _state)-1), _state); ae_vector_set_length(&x, nx, _state); ae_vector_set_length(&y, ny, _state); ae_vector_set_length(&point, nx, _state); rbfcreate(nx, ny, &s, _state); rbfsetcond(&s, testrbfunit_heps, testrbfunit_heps, testrbfunit_mxits, _state); q = 0.25+ae_randomreal(_state); z = 4.5+ae_randomreal(_state); rbfsetalgoqnn(&s, q, z, _state); /* * start points for grid */ for(i=0; i<=nx-1; i++) { point.ptr.p_double[i] = sx*(2*ae_randomreal(_state)-1); } if( nx==2 ) { for(k0=2; k0<=4; k0++) { rmatrixrndorthogonal(nx, &a, _state); ae_vector_set_length(&d0, nx, _state); ae_v_move(&d0.ptr.p_double[0], 1, &a.ptr.pp_double[0][0], a.stride, ae_v_len(0,nx-1)); np = k0; ae_matrix_set_length(&gp, np, nx+ny, _state); /* * create grid */ for(i=0; i<=k0-1; i++) { gp.ptr.pp_double[i][0] = point.ptr.p_double[0]+sx*i*d0.ptr.p_double[0]; gp.ptr.pp_double[i][1] = point.ptr.p_double[1]+sx*i*d0.ptr.p_double[1]; for(k=0; k<=ny-1; k++) { gp.ptr.pp_double[i][nx+k] = sy*(2*ae_randomreal(_state)-1); } } rbfsetpoints(&s, &gp, np, _state); rbfbuildmodel(&s, &rep, _state); for(i=0; i<=np-1; i++) { x.ptr.p_double[0] = gp.ptr.pp_double[i][0]; x.ptr.p_double[1] = gp.ptr.pp_double[i][1]; rbfcalc(&s, &x, &y, _state); for(j=0; j<=ny-1; j++) { if( ae_fp_greater(ae_fabs(gp.ptr.pp_double[i][nx+j]-y.ptr.p_double[j], _state),sy*eps) ) { result = ae_true; ae_frame_leave(_state); return result; } } } } } if( nx==3 ) { for(k0=2; k0<=4; k0++) { for(k1=2; k1<=4; k1++) { for(gen=1; gen<=2; gen++) { rmatrixrndorthogonal(nx, &a, _state); ae_vector_set_length(&d0, nx, _state); ae_v_move(&d0.ptr.p_double[0], 1, &a.ptr.pp_double[0][0], a.stride, ae_v_len(0,nx-1)); /* * create grid */ np = -1; if( gen==1 ) { np = k0; ae_matrix_set_length(&gp, np, nx+ny, _state); for(i=0; i<=k0-1; i++) { gp.ptr.pp_double[i][0] = point.ptr.p_double[0]+sx*i*d0.ptr.p_double[0]; gp.ptr.pp_double[i][1] = point.ptr.p_double[1]+sx*i*d0.ptr.p_double[1]; gp.ptr.pp_double[i][2] = point.ptr.p_double[2]+sx*i*d0.ptr.p_double[2]; for(k=0; k<=ny-1; 
k++) { gp.ptr.pp_double[i][nx+k] = sy*(2*ae_randomreal(_state)-1); } } } if( gen==2 ) { ae_vector_set_length(&d1, nx, _state); ae_v_move(&d1.ptr.p_double[0], 1, &a.ptr.pp_double[0][1], a.stride, ae_v_len(0,nx-1)); np = k0*k1; ae_matrix_set_length(&gp, np, nx+ny, _state); for(i=0; i<=k0-1; i++) { for(j=0; j<=k1-1; j++) { gp.ptr.pp_double[i*k1+j][0] = sx*i*d0.ptr.p_double[0]+sx*j*d1.ptr.p_double[0]; gp.ptr.pp_double[i*k1+j][1] = sx*i*d0.ptr.p_double[1]+sx*j*d1.ptr.p_double[1]; gp.ptr.pp_double[i*k1+j][2] = sx*i*d0.ptr.p_double[2]+sx*j*d1.ptr.p_double[2]; for(k=0; k<=ny-1; k++) { gp.ptr.pp_double[i*k1+j][nx+k] = sy*(2*ae_randomreal(_state)-1); } } } } ae_assert(np>=0, "rbf test: integrity error", _state); rbfsetpoints(&s, &gp, np, _state); rbfbuildmodel(&s, &rep, _state); for(i=0; i<=np-1; i++) { x.ptr.p_double[0] = gp.ptr.pp_double[i][0]; x.ptr.p_double[1] = gp.ptr.pp_double[i][1]; x.ptr.p_double[2] = gp.ptr.pp_double[i][2]; rbfcalc(&s, &x, &y, _state); for(j=0; j<=ny-1; j++) { if( ae_fp_greater(ae_fabs(gp.ptr.pp_double[i][nx+j]-y.ptr.p_double[j], _state),sy*eps) ) { result = ae_true; ae_frame_leave(_state); return result; } } } if( gen==2 ) { rbfunpack(&s, &unx, &uny, &xwr, &np, &v, &modelversion, _state); dd0 = (d0.ptr.p_double[0]*v.ptr.pp_double[0][0]+d0.ptr.p_double[1]*v.ptr.pp_double[0][1]+d0.ptr.p_double[2]*v.ptr.pp_double[0][2])/(d0.ptr.p_double[0]*d0.ptr.p_double[0]+d0.ptr.p_double[1]*d0.ptr.p_double[1]+d0.ptr.p_double[2]*d0.ptr.p_double[2]); dd1 = (d1.ptr.p_double[0]*v.ptr.pp_double[0][0]+d1.ptr.p_double[1]*v.ptr.pp_double[0][1]+d1.ptr.p_double[2]*v.ptr.pp_double[0][2])/(d1.ptr.p_double[0]*d1.ptr.p_double[0]+d1.ptr.p_double[1]*d1.ptr.p_double[1]+d1.ptr.p_double[2]*d1.ptr.p_double[2]); ae_vector_set_length(&pvd0, nx, _state); ae_vector_set_length(&pvd1, nx, _state); for(i=0; i<=nx-1; i++) { pvd0.ptr.p_double[i] = dd0*d0.ptr.p_double[i]; pvd1.ptr.p_double[i] = dd1*d1.ptr.p_double[i]; } pvdnorm = ae_sqrt(ae_sqr(v.ptr.pp_double[0][0]-pvd0.ptr.p_double[0]-pvd1.ptr.p_double[0], _state)+ae_sqr(v.ptr.pp_double[0][1]-pvd0.ptr.p_double[1]-pvd1.ptr.p_double[1], _state)+ae_sqr(v.ptr.pp_double[0][2]-pvd0.ptr.p_double[2]-pvd1.ptr.p_double[2], _state), _state); vnorm = ae_sqrt(ae_sqr(v.ptr.pp_double[0][0], _state)+ae_sqr(v.ptr.pp_double[0][1], _state)+ae_sqr(v.ptr.pp_double[0][2], _state), _state); if( ae_fp_greater(pvdnorm,vnorm*testrbfunit_tol) ) { result = ae_true; ae_frame_leave(_state); return result; } } } } } } } result = ae_false; ae_frame_leave(_state); return result; } /************************************************************************* Function for testing basic functionality of RBF module on regular grids with multi-layer algorithm in 1D. 
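A condensed outline of the construction exercised below (constants are illustrative; the test varies NX, NY, the number of layers and the term type):

    rbfcreate(nx, ny, &s, _state);
    // RBase=1.0, NLayers layers, zero regularization; each successive layer
    // works with a halved radius and fits what the coarser layers left over
    rbfsetalgomultilayer(&s, 1.0, nlayers, 0.0, _state);
    rbfsetpoints(&s, &gp, n, _state);
    rbfbuildmodel(&s, &rep, _state);
    // with enough layers the model resolves both the slow A*cos(2*pi*F1*X)
    // term and the fast B*cos(2*pi*F2*X) term of the generated data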
-- ALGLIB -- Copyright 2.03.2012 by Bochkanov Sergey *************************************************************************/ ae_bool basicmultilayerrbf1dtest(ae_state *_state) { ae_frame _frame_block; rbfmodel s; rbfreport rep; ae_int_t nx; ae_int_t ny; ae_int_t linterm; ae_int_t n; double q; double r; ae_int_t errtype; ae_vector delta; ae_int_t nlayers; double a; double b; double f1; double f2; ae_vector a1; ae_vector b1; ae_matrix gp; ae_vector x; ae_vector y; ae_matrix mody0; ae_matrix mody1; ae_matrix gy; ae_vector gpgx0; ae_vector gpgx1; ae_int_t pass; ae_int_t passcount; ae_int_t i; ae_int_t j; ae_bool result; ae_frame_make(_state, &_frame_block); _rbfmodel_init(&s, _state); _rbfreport_init(&rep, _state); ae_vector_init(&delta, 0, DT_REAL, _state); ae_vector_init(&a1, 0, DT_REAL, _state); ae_vector_init(&b1, 0, DT_REAL, _state); ae_matrix_init(&gp, 0, 0, DT_REAL, _state); ae_vector_init(&x, 0, DT_REAL, _state); ae_vector_init(&y, 0, DT_REAL, _state); ae_matrix_init(&mody0, 0, 0, DT_REAL, _state); ae_matrix_init(&mody1, 0, 0, DT_REAL, _state); ae_matrix_init(&gy, 0, 0, DT_REAL, _state); ae_vector_init(&gpgx0, 0, DT_REAL, _state); ae_vector_init(&gpgx1, 0, DT_REAL, _state); a = 1.0; b = (double)1/(double)9; f1 = 1.0; f2 = 10.0; passcount = 5; n = 100; ae_vector_set_length(&gpgx0, n, _state); ae_vector_set_length(&gpgx1, n, _state); for(i=0; i<=n-1; i++) { gpgx0.ptr.p_double[i] = (double)i/(double)n; gpgx1.ptr.p_double[i] = (double)(0); } r = (double)(1); for(pass=0; pass<=passcount-1; pass++) { nx = ae_randominteger(2, _state)+2; ny = ae_randominteger(3, _state)+1; linterm = ae_randominteger(3, _state)+1; ae_vector_set_length(&x, nx, _state); ae_vector_set_length(&y, ny, _state); ae_vector_set_length(&a1, ny, _state); ae_vector_set_length(&b1, ny, _state); ae_vector_set_length(&delta, ny, _state); ae_matrix_set_length(&mody0, n, ny, _state); ae_matrix_set_length(&mody1, n, ny, _state); for(i=0; i<=ny-1; i++) { a1.ptr.p_double[i] = a+0.01*a*(2*ae_randomreal(_state)-1); b1.ptr.p_double[i] = b+0.01*b*(2*ae_randomreal(_state)-1); delta.ptr.p_double[i] = 0.35*b1.ptr.p_double[i]; } ae_matrix_set_length(&gp, n, nx+ny, _state); /* * create grid */ for(i=0; i<=n-1; i++) { for(j=0; j<=nx-1; j++) { gp.ptr.pp_double[i][j] = (double)(0); } gp.ptr.pp_double[i][0] = (double)i/(double)n; for(j=0; j<=ny-1; j++) { gp.ptr.pp_double[i][nx+j] = a1.ptr.p_double[j]*ae_cos(f1*2*ae_pi*gp.ptr.pp_double[i][0], _state)+b1.ptr.p_double[j]*ae_cos(f2*2*ae_pi*gp.ptr.pp_double[i][0], _state); mody0.ptr.pp_double[i][j] = gp.ptr.pp_double[i][nx+j]; } } q = (double)(1); nlayers = 1; errtype = 1; /* * test multilayer algorithm with different parameters */ while(ae_fp_greater_eq(q,1/(2*f2))) { rbfcreate(nx, ny, &s, _state); rbfsetalgomultilayer(&s, r, nlayers, 0.0, _state); if( linterm==1 ) { rbfsetlinterm(&s, _state); } if( linterm==2 ) { rbfsetconstterm(&s, _state); } if( linterm==3 ) { rbfsetzeroterm(&s, _state); } rbfsetpoints(&s, &gp, n, _state); rbfbuildmodel(&s, &rep, _state); if( ny==1 ) { for(i=0; i<=n-1; i++) { for(j=0; j<=nx-1; j++) { x.ptr.p_double[j] = gp.ptr.pp_double[i][j]; } if( nx==2 ) { mody1.ptr.pp_double[i][0] = rbfcalc2(&s, x.ptr.p_double[0], x.ptr.p_double[1], _state); } else { if( nx==3 ) { mody1.ptr.pp_double[i][0] = rbfcalc3(&s, x.ptr.p_double[0], x.ptr.p_double[1], x.ptr.p_double[2], _state); } else { ae_assert(ae_false, "BasicMultiLayerRBFTest1D: Invalid variable NX(NX neither 2 nor 3)", _state); } } } if( testrbfunit_searcherr(&mody0, &mody1, n, ny, errtype, &b1, &delta, _state) ) { result = 
ae_true; ae_frame_leave(_state); return result; } if( nx==2 ) { rbfgridcalc2(&s, &gpgx0, n, &gpgx1, n, &gy, _state); for(i=0; i<=n-1; i++) { mody1.ptr.pp_double[i][0] = gy.ptr.pp_double[i][0]; } } if( testrbfunit_searcherr(&mody0, &mody1, n, ny, errtype, &b1, &delta, _state) ) { result = ae_true; ae_frame_leave(_state); return result; } } for(i=0; i<=n-1; i++) { for(j=0; j<=nx-1; j++) { x.ptr.p_double[j] = gp.ptr.pp_double[i][j]; } rbfcalc(&s, &x, &y, _state); for(j=0; j<=ny-1; j++) { mody1.ptr.pp_double[i][j] = y.ptr.p_double[j]; } } if( testrbfunit_searcherr(&mody0, &mody1, n, ny, errtype, &b1, &delta, _state) ) { result = ae_true; ae_frame_leave(_state); return result; } for(i=0; i<=n-1; i++) { for(j=0; j<=nx-1; j++) { x.ptr.p_double[j] = gp.ptr.pp_double[i][j]; } rbfcalcbuf(&s, &x, &y, _state); for(j=0; j<=ny-1; j++) { mody1.ptr.pp_double[i][j] = y.ptr.p_double[j]; } } if( testrbfunit_searcherr(&mody0, &mody1, n, ny, errtype, &b1, &delta, _state) ) { result = ae_true; ae_frame_leave(_state); return result; } q = q/2; nlayers = nlayers+1; if( errtype==1&&ae_fp_less_eq(q,1/f2) ) { errtype = 2; } } } result = ae_false; ae_frame_leave(_state); return result; } /************************************************************************* This function tests special cases: * uninitialized RBF model will correctly return zero values * RBF correctly handles 1 or 2 distinct points * when we have many uniformly spaced points and one outlier, filter which is applied to radii, makes all radii equal (RBF-QNN). * RBF-ML with NLayers=0 gives linear model * Hierarchical RBF with NLayers=0 gives linear model -- ALGLIB -- Copyright 13.12.2011 by Bochkanov Sergey *************************************************************************/ static ae_bool testrbfunit_specialtest(ae_state *_state) { ae_frame _frame_block; rbfmodel s; rbfreport rep; ae_int_t n; ae_int_t nx; ae_int_t ny; ae_int_t i; ae_int_t j; ae_int_t k; ae_int_t t; ae_matrix xy; ae_matrix vf; ae_vector x; ae_vector y; ae_int_t termtype; double errtol; ae_int_t tmpnx; ae_int_t tmpny; ae_int_t tmpnc; ae_matrix xwr; ae_matrix v; double sx; double z; double va; double vb; double vc; double vd; ae_int_t modelversion; double vv; ae_bool result; ae_frame_make(_state, &_frame_block); _rbfmodel_init(&s, _state); _rbfreport_init(&rep, _state); ae_matrix_init(&xy, 0, 0, DT_REAL, _state); ae_matrix_init(&vf, 0, 0, DT_REAL, _state); ae_vector_init(&x, 0, DT_REAL, _state); ae_vector_init(&y, 0, DT_REAL, _state); ae_matrix_init(&xwr, 0, 0, DT_REAL, _state); ae_matrix_init(&v, 0, 0, DT_REAL, _state); errtol = 1.0E-9; result = ae_false; /* * Create model in the default state (no parameters/points specified). * With probability 0.5 we do one of the following: * * test that default state of the model is a zero model (all Calc() * functions return zero) * * call RBFBuildModel() (without specifying anything) and test that * all Calc() functions return zero. * * NOTE: because NX varies between 1 and 4, both V1 (old) and V2 RBFs * are tested. 
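In short (sketch; NX and NY are arbitrary here):

    rbfcreate(nx, ny, &s, _state);      // no algorithm or dataset attached yet
    rbfcalc(&s, &x, &y, _state);        // Y.cnt==NY and every Y[i] equals 0.0
    rbfbuildmodel(&s, &rep, _state);    // building the empty model succeeds...
    rbfcalc(&s, &x, &y, _state);        // ...and the result is still exactly zero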
*/ for(nx=1; nx<=4; nx++) { for(ny=1; ny<=3; ny++) { rbfcreate(nx, ny, &s, _state); if( ae_fp_greater(ae_randomreal(_state),0.5) ) { rbfbuildmodel(&s, &rep, _state); if( rep.terminationtype<=0 ) { seterrorflag(&result, ae_true, _state); ae_frame_leave(_state); return result; } } ae_vector_set_length(&x, nx, _state); ae_vector_set_length(&y, 1, _state); for(i=0; i<=nx-1; i++) { x.ptr.p_double[i] = 2*ae_randomreal(_state)-1; } rbfcalc(&s, &x, &y, _state); if( y.cnt!=ny ) { seterrorflag(&result, ae_true, _state); ae_frame_leave(_state); return result; } for(i=0; i<=ny-1; i++) { if( ae_fp_neq(y.ptr.p_double[i],(double)(0)) ) { seterrorflag(&result, ae_true, _state); ae_frame_leave(_state); return result; } } } } /* * Create default model with 1 point and different types of linear term. * Test algorithm on such dataset. * * NOTE: because NX varies between 1 and 4, both V1 (old) and V2 RBFs * are tested. */ for(nx=1; nx<=4; nx++) { for(ny=1; ny<=3; ny++) { rbfcreate(nx, ny, &s, _state); for(termtype=0; termtype<=1; termtype++) { if( termtype==0 ) { rbfsetlinterm(&s, _state); } if( termtype==1 ) { rbfsetconstterm(&s, _state); } ae_matrix_set_length(&xy, 1, nx+ny, _state); for(i=0; i<=nx+ny-1; i++) { xy.ptr.pp_double[0][i] = 2*ae_randomreal(_state)-1; } rbfsetpoints(&s, &xy, 1, _state); rbfbuildmodel(&s, &rep, _state); if( rep.terminationtype<=0 ) { seterrorflag(&result, ae_true, _state); ae_frame_leave(_state); return result; } /* * First, test that model exactly reproduces our dataset in the specified point */ ae_vector_set_length(&x, nx, _state); for(i=0; i<=nx-1; i++) { x.ptr.p_double[i] = xy.ptr.pp_double[0][i]; } rbfcalc(&s, &x, &y, _state); if( y.cnt!=ny ) { seterrorflag(&result, ae_true, _state); ae_frame_leave(_state); return result; } for(i=0; i<=ny-1; i++) { if( ae_fp_greater(ae_fabs(y.ptr.p_double[i]-xy.ptr.pp_double[0][nx+i], _state),errtol) ) { seterrorflag(&result, ae_true, _state); ae_frame_leave(_state); return result; } } /* * Second, test that model is constant */ for(i=0; i<=nx-1; i++) { x.ptr.p_double[i] = 2*ae_randomreal(_state)-1; } rbfcalc(&s, &x, &y, _state); if( y.cnt!=ny ) { seterrorflag(&result, ae_true, _state); ae_frame_leave(_state); return result; } for(i=0; i<=ny-1; i++) { if( ae_fp_greater(ae_fabs(y.ptr.p_double[i]-xy.ptr.pp_double[0][nx+i], _state),errtol) ) { seterrorflag(&result, ae_true, _state); ae_frame_leave(_state); return result; } } } } } /* * Create model with 2 points and different types of linear term. * Test algorithm on such dataset. 
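Sketch of the expected behaviour (the test randomizes the two nodes and checks every component):

    // two distinct nodes; a linear or constant term is selected
    rbfsetlinterm(&s, _state);          // or rbfsetconstterm(&s, _state)
    rbfsetpoints(&s, &xy, 2, _state);
    rbfbuildmodel(&s, &rep, _state);
    // the model must interpolate both nodes to within ErrTol:
    // fabs(y[i] - xy[j][nx+i]) <= errtol for j=0,1 and i=0..ny-1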
*/ for(nx=2; nx<=3; nx++) { for(ny=1; ny<=3; ny++) { rbfcreate(nx, ny, &s, _state); for(termtype=0; termtype<=1; termtype++) { if( termtype==0 ) { rbfsetlinterm(&s, _state); } if( termtype==1 ) { rbfsetconstterm(&s, _state); } if( termtype==2 ) { rbfsetzeroterm(&s, _state); } ae_matrix_set_length(&xy, 2, nx+ny, _state); for(i=0; i<=nx+ny-1; i++) { xy.ptr.pp_double[0][i] = 2*ae_randomreal(_state)-1; } for(i=0; i<=nx+ny-1; i++) { xy.ptr.pp_double[1][i] = xy.ptr.pp_double[0][i]+1.0; } rbfsetpoints(&s, &xy, 2, _state); rbfbuildmodel(&s, &rep, _state); if( rep.terminationtype<=0 ) { result = ae_true; ae_frame_leave(_state); return result; } for(j=0; j<=1; j++) { ae_vector_set_length(&x, nx, _state); for(i=0; i<=nx-1; i++) { x.ptr.p_double[i] = xy.ptr.pp_double[j][i]; } rbfcalc(&s, &x, &y, _state); if( y.cnt!=ny ) { result = ae_true; ae_frame_leave(_state); return result; } for(i=0; i<=ny-1; i++) { if( ae_fp_greater(ae_fabs(y.ptr.p_double[i]-xy.ptr.pp_double[j][nx+i], _state),errtol) ) { result = ae_true; ae_frame_leave(_state); return result; } } } } } } /* * Generate a set of points (xi,yi) = (SX*i,0), and one * outlier (x_far,y_far)=(-1000*SX,0). * * Radii filtering should place a bound on the radius of outlier. */ for(nx=2; nx<=3; nx++) { for(ny=1; ny<=3; ny++) { sx = ae_exp(-5+10*ae_randomreal(_state), _state); rbfcreate(nx, ny, &s, _state); ae_matrix_set_length(&xy, 20, nx+ny, _state); for(i=0; i<=xy.rows-1; i++) { xy.ptr.pp_double[i][0] = sx*i; for(j=1; j<=nx-1; j++) { xy.ptr.pp_double[i][j] = (double)(0); } for(j=0; j<=ny-1; j++) { xy.ptr.pp_double[i][nx+j] = ae_randomreal(_state); } } xy.ptr.pp_double[xy.rows-1][0] = -1000*sx; rbfsetpoints(&s, &xy, xy.rows, _state); /* * Try random Z from [1,5] */ z = 1+ae_randomreal(_state)*4; rbfsetalgoqnn(&s, 1.0, z, _state); rbfbuildmodel(&s, &rep, _state); if( rep.terminationtype<=0 ) { result = ae_true; ae_frame_leave(_state); return result; } rbfunpack(&s, &tmpnx, &tmpny, &xwr, &tmpnc, &v, &modelversion, _state); if( ((((tmpnx!=nx||tmpny!=ny)||tmpnc!=xy.rows)||xwr.cols!=nx+ny+1)||xwr.rows!=tmpnc)||modelversion!=1 ) { result = ae_true; ae_frame_leave(_state); return result; } for(i=0; i<=tmpnc-2; i++) { if( ae_fp_greater(ae_fabs(xwr.ptr.pp_double[i][nx+ny]-sx, _state),errtol) ) { result = ae_true; ae_frame_leave(_state); return result; } } if( ae_fp_greater(ae_fabs(xwr.ptr.pp_double[tmpnc-1][nx+ny]-z*sx, _state),errtol) ) { result = ae_true; ae_frame_leave(_state); return result; } } } /* * RBF-ML with NLayers=0 gives us linear model. * * In order to perform this test, we use test function which * is perfectly linear and see whether RBF model is able to * reproduce such function. */ n = 5; for(ny=1; ny<=3; ny++) { va = 2*ae_randomreal(_state)-1; vb = 2*ae_randomreal(_state)-1; vc = 2*ae_randomreal(_state)-1; vd = 2*ae_randomreal(_state)-1; /* * Test NX=2. * Generate linear function using random coefficients VA/VB/VC. * Function is K-dimensional vector-valued, each component has slightly * different coefficients. 
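A compressed version of this NX=2 check (the test uses random VA, VB, VC and several NY values):

    // data are exactly linear: F(I,J) = VA*I + VB*J + VC on an N*N grid
    rbfcreate(2, 1, &s, _state);
    rbfsetpoints(&s, &xy, n*n, _state);
    rbfsetalgomultilayer(&s, 1.0, 0, 0.01, _state);   // zero RBF layers
    rbfbuildmodel(&s, &rep, _state);
    // only the linear term remains, so for any (PX,PY) inside the grid
    // fabs(rbfcalc2(&s, px, py, _state) - (VA*px+VB*py+VC)) <= errtol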
*/ ae_matrix_set_length(&xy, n*n, 2+ny, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { xy.ptr.pp_double[n*i+j][0] = (double)(i); xy.ptr.pp_double[n*i+j][1] = (double)(j); for(k=0; k<=ny-1; k++) { xy.ptr.pp_double[n*i+j][2+k] = (va+0.1*k)*i+(vb+0.2*k)*j+(vc+0.3*k); } } } rbfcreate(2, ny, &s, _state); rbfsetpoints(&s, &xy, n*n, _state); rbfsetalgomultilayer(&s, 1.0, 0, 0.01, _state); rbfbuildmodel(&s, &rep, _state); if( rep.terminationtype<=0 ) { result = ae_true; ae_frame_leave(_state); return result; } ae_vector_set_length(&x, 2, _state); x.ptr.p_double[0] = (n-1)*ae_randomreal(_state); x.ptr.p_double[1] = (n-1)*ae_randomreal(_state); if( ny==1&&ae_fp_greater(ae_fabs(rbfcalc2(&s, x.ptr.p_double[0], x.ptr.p_double[1], _state)-(va*x.ptr.p_double[0]+vb*x.ptr.p_double[1]+vc), _state),errtol) ) { result = ae_true; ae_frame_leave(_state); return result; } rbfcalc(&s, &x, &y, _state); for(k=0; k<=ny-1; k++) { if( ae_fp_greater(ae_fabs(y.ptr.p_double[k]-((va+0.1*k)*x.ptr.p_double[0]+(vb+0.2*k)*x.ptr.p_double[1]+(vc+0.3*k)), _state),errtol) ) { result = ae_true; ae_frame_leave(_state); return result; } } /* * Test NX=3. * Generate linear function using random coefficients VA/VB/VC/VC. * Function is K-dimensional vector-valued, each component has slightly * different coefficients. */ ae_matrix_set_length(&xy, n*n*n, 3+ny, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { for(t=0; t<=n-1; t++) { xy.ptr.pp_double[n*n*i+n*j+t][0] = (double)(i); xy.ptr.pp_double[n*n*i+n*j+t][1] = (double)(j); xy.ptr.pp_double[n*n*i+n*j+t][2] = (double)(t); for(k=0; k<=ny-1; k++) { xy.ptr.pp_double[n*n*i+n*j+t][3+k] = (va+0.1*k)*i+(vb+0.2*k)*j+(vc+0.3*k)*t+(vd+0.4*k); } } } } rbfcreate(3, ny, &s, _state); rbfsetpoints(&s, &xy, n*n*n, _state); rbfsetalgomultilayer(&s, 1.0, 0, 0.01, _state); rbfbuildmodel(&s, &rep, _state); if( rep.terminationtype<=0 ) { result = ae_true; ae_frame_leave(_state); return result; } ae_vector_set_length(&x, 3, _state); x.ptr.p_double[0] = (n-1)*ae_randomreal(_state); x.ptr.p_double[1] = (n-1)*ae_randomreal(_state); x.ptr.p_double[2] = (n-1)*ae_randomreal(_state); if( ny==1&&ae_fp_greater(ae_fabs(rbfcalc3(&s, x.ptr.p_double[0], x.ptr.p_double[1], x.ptr.p_double[2], _state)-(va*x.ptr.p_double[0]+vb*x.ptr.p_double[1]+vc*x.ptr.p_double[2]+vd), _state),errtol) ) { result = ae_true; ae_frame_leave(_state); return result; } rbfcalc(&s, &x, &y, _state); for(k=0; k<=ny-1; k++) { if( ae_fp_greater(ae_fabs(y.ptr.p_double[k]-((va+0.1*k)*x.ptr.p_double[0]+(vb+0.2*k)*x.ptr.p_double[1]+(vc+0.3*k)*x.ptr.p_double[2]+(vd+0.4*k)), _state),errtol) ) { result = ae_true; ae_frame_leave(_state); return result; } } } /* * HierarchicalRBF with NLayers=0 gives us linear model. * * In order to perform this test, we use test function which * is perfectly linear and see whether RBF model is able to * reproduce such function. 
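The check itself is a shorter variant of the previous one (sketch; NX runs from 1 to 5 below and the generating function is random but exactly linear):

    rbfsetalgohierarchical(&s, 1.0, 0, 0.0, _state);  // RBase=1, zero layers
    rbfsetpoints(&s, &xy, n, _state);
    rbfbuildmodel(&s, &rep, _state);
    // rbfcalc() must reproduce the generating linear function to within ErrTol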
*/ n = 15; for(nx=1; nx<=5; nx++) { for(ny=1; ny<=3; ny++) { ae_matrix_set_length(&vf, ny, nx+1, _state); for(i=0; i<=ny-1; i++) { for(j=0; j<=nx; j++) { vf.ptr.pp_double[i][j] = 2*ae_randomreal(_state)-1; } } ae_matrix_set_length(&xy, n, nx+ny, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=nx-1; j++) { xy.ptr.pp_double[i][j] = ae_randomreal(_state); } for(j=0; j<=ny-1; j++) { xy.ptr.pp_double[i][nx+j] = vf.ptr.pp_double[j][nx]; for(k=0; k<=nx-1; k++) { xy.ptr.pp_double[i][nx+j] = xy.ptr.pp_double[i][nx+j]+vf.ptr.pp_double[j][k]*xy.ptr.pp_double[i][k]; } } } rbfcreate(nx, ny, &s, _state); rbfsetpoints(&s, &xy, n, _state); rbfsetalgohierarchical(&s, 1.0, 0, 0.0, _state); rbfbuildmodel(&s, &rep, _state); if( rep.terminationtype<=0 ) { seterrorflag(&result, ae_true, _state); ae_frame_leave(_state); return result; } ae_vector_set_length(&x, nx, _state); for(i=0; i<=nx-1; i++) { x.ptr.p_double[i] = ae_randomreal(_state); } rbfcalc(&s, &x, &y, _state); for(k=0; k<=ny-1; k++) { vv = vf.ptr.pp_double[k][nx]; for(j=0; j<=nx-1; j++) { vv = vv+vf.ptr.pp_double[k][j]*x.ptr.p_double[j]; } seterrorflag(&result, ae_fp_greater(ae_fabs(vv-y.ptr.p_double[k], _state),errtol), _state); } } } ae_frame_leave(_state); return result; } /************************************************************************* Function for testing basic functionality of RBF module on regular grids. -- ALGLIB -- Copyright 13.12.2011 by Bochkanov Sergey *************************************************************************/ static ae_bool testrbfunit_basicrbftest(ae_state *_state) { ae_frame _frame_block; rbfmodel s; rbfreport rep; rbfcalcbuffer calcbuf; ae_int_t nx; ae_int_t ny; ae_int_t k0; ae_int_t k1; ae_int_t k2; ae_int_t linterm; ae_int_t np; double sx; double sy; double zx; double px; double zy; double py; double q; double z; ae_vector point; ae_matrix gp; ae_vector x; ae_vector y; ae_matrix gy; ae_int_t unx; ae_int_t uny; ae_matrix xwr; ae_matrix v; ae_vector gpgx0; ae_vector gpgx1; ae_int_t i; ae_int_t j; ae_int_t k; ae_int_t l; ae_int_t fidx; double eps; ae_int_t modelversion; ae_bool result; ae_frame_make(_state, &_frame_block); _rbfmodel_init(&s, _state); _rbfreport_init(&rep, _state); _rbfcalcbuffer_init(&calcbuf, _state); ae_vector_init(&point, 0, DT_REAL, _state); ae_matrix_init(&gp, 0, 0, DT_REAL, _state); ae_vector_init(&x, 0, DT_REAL, _state); ae_vector_init(&y, 0, DT_REAL, _state); ae_matrix_init(&gy, 0, 0, DT_REAL, _state); ae_matrix_init(&xwr, 0, 0, DT_REAL, _state); ae_matrix_init(&v, 0, 0, DT_REAL, _state); ae_vector_init(&gpgx0, 0, DT_REAL, _state); ae_vector_init(&gpgx1, 0, DT_REAL, _state); zx = (double)(10); px = (double)(15); zy = (double)(10); py = (double)(15); eps = 1.0E-6; /* * Problem types: * * 2 and 3-dimensional problems * * problems with zero, constant, linear terms * * different scalings of X and Y values (1.0, 1E-15, 1E+15) * * regular grids different grid sizes (from 2 to 4 points for each dimension) * * We check that: * * RBF model correctly reproduces function value (testes with different Calc() functions) * * unpacked model containt correct radii * * linear term has correct form */ for(nx=2; nx<=3; nx++) { for(ny=1; ny<=3; ny++) { for(linterm=1; linterm<=3; linterm++) { /* * prepare test problem */ sx = ae_pow(zx, px*(ae_randominteger(3, _state)-1), _state); sy = ae_pow(zy, py*(ae_randominteger(3, _state)-1), _state); ae_vector_set_length(&x, nx, _state); ae_vector_set_length(&y, ny, _state); ae_vector_set_length(&point, nx, _state); rbfcreate(nx, ny, &s, _state); rbfsetcond(&s, 
testrbfunit_heps, testrbfunit_heps, testrbfunit_mxits, _state); q = 0.25+ae_randomreal(_state); z = 4.5+ae_randomreal(_state); rbfsetalgoqnn(&s, q, z, _state); if( linterm==1 ) { rbfsetlinterm(&s, _state); } if( linterm==2 ) { rbfsetconstterm(&s, _state); } if( linterm==3 ) { rbfsetzeroterm(&s, _state); } /* * start points for grid */ for(i=0; i<=nx-1; i++) { point.ptr.p_double[i] = sx*(2*ae_randomreal(_state)-1); } /* * 2-dimensional test problem */ if( nx==2 ) { for(k0=2; k0<=4; k0++) { for(k1=2; k1<=4; k1++) { np = k0*k1; ae_matrix_set_length(&gp, np, nx+ny, _state); /* * create grid */ for(i=0; i<=k0-1; i++) { for(j=0; j<=k1-1; j++) { gp.ptr.pp_double[i*k1+j][0] = point.ptr.p_double[0]+sx*i; gp.ptr.pp_double[i*k1+j][1] = point.ptr.p_double[1]+sx*j; for(k=0; k<=ny-1; k++) { gp.ptr.pp_double[i*k1+j][nx+k] = sy*(2*ae_randomreal(_state)-1); } } } rbfsetpoints(&s, &gp, np, _state); rbfbuildmodel(&s, &rep, _state); rbfcreatecalcbuffer(&s, &calcbuf, _state); if( ny==1 ) { ae_vector_set_length(&gpgx0, k0, _state); ae_vector_set_length(&gpgx1, k1, _state); for(i=0; i<=k0-1; i++) { gpgx0.ptr.p_double[i] = point.ptr.p_double[0]+sx*i; } for(i=0; i<=k1-1; i++) { gpgx1.ptr.p_double[i] = point.ptr.p_double[1]+sx*i; } rbfgridcalc2(&s, &gpgx0, k0, &gpgx1, k1, &gy, _state); for(i=0; i<=k0-1; i++) { for(j=0; j<=k1-1; j++) { if( ae_fp_greater(ae_fabs(gy.ptr.pp_double[i][j]-gp.ptr.pp_double[i*k1+j][nx], _state),sy*eps) ) { result = ae_true; ae_frame_leave(_state); return result; } } } } for(i=0; i<=np-1; i++) { /* * For each row we randomly choose a function to test * and call it. We do not call multiple functions per * row because carry-over effects may mask errors in * some function (say, it is possible that function * simply returns results from previous call of some * other function which were stored in the RBF model; * in this case, previous call with same parameters * may hide deficiencies in the function). 
*/ x.ptr.p_double[0] = gp.ptr.pp_double[i][0]; x.ptr.p_double[1] = gp.ptr.pp_double[i][1]; fidx = ae_randominteger(4, _state); if( fidx==0&&ny==1 ) { y.ptr.p_double[0] = rbfcalc2(&s, x.ptr.p_double[0], x.ptr.p_double[1], _state); if( ae_fp_greater(ae_fabs(gp.ptr.pp_double[i][nx]-y.ptr.p_double[0], _state),sy*eps) ) { result = ae_true; ae_frame_leave(_state); return result; } } if( fidx==1 ) { rbfcalc(&s, &x, &y, _state); for(j=0; j<=ny-1; j++) { if( ae_fp_greater(ae_fabs(gp.ptr.pp_double[i][nx+j]-y.ptr.p_double[j], _state),sy*eps) ) { result = ae_true; ae_frame_leave(_state); return result; } } } if( fidx==2 ) { rbfcalcbuf(&s, &x, &y, _state); for(j=0; j<=ny-1; j++) { if( ae_fp_greater(ae_fabs(gp.ptr.pp_double[i][nx+j]-y.ptr.p_double[j], _state),sy*eps) ) { result = ae_true; ae_frame_leave(_state); return result; } } } if( fidx==3 ) { rbftscalcbuf(&s, &calcbuf, &x, &y, _state); for(j=0; j<=ny-1; j++) { if( ae_fp_greater(ae_fabs(gp.ptr.pp_double[i][nx+j]-y.ptr.p_double[j], _state),sy*eps) ) { result = ae_true; ae_frame_leave(_state); return result; } } } } /* * test for RBFUnpack */ rbfunpack(&s, &unx, &uny, &xwr, &np, &v, &modelversion, _state); if( (((((nx!=unx||ny!=uny)||xwr.rows!=np)||xwr.cols!=nx+ny+1)||v.rows!=ny)||v.cols!=nx+1)||modelversion!=1 ) { seterrorflag(&result, ae_true, _state); ae_frame_leave(_state); return result; } for(i=0; i<=np-1; i++) { if( ae_fp_greater(ae_fabs(xwr.ptr.pp_double[i][unx+uny]-q*sx, _state),sx*eps) ) { result = ae_true; ae_frame_leave(_state); return result; } } if( linterm==2 ) { for(i=0; i<=unx-1; i++) { for(j=0; j<=uny-1; j++) { if( ae_fp_neq(v.ptr.pp_double[j][i],(double)(0)) ) { result = ae_true; ae_frame_leave(_state); return result; } } } } if( linterm==3 ) { for(i=0; i<=unx; i++) { for(j=0; j<=uny-1; j++) { if( ae_fp_neq(v.ptr.pp_double[j][i],(double)(0)) ) { result = ae_true; ae_frame_leave(_state); return result; } } } } } } } /* * 3-dimensional test problems */ if( nx==3 ) { for(k0=2; k0<=4; k0++) { for(k1=2; k1<=4; k1++) { for(k2=2; k2<=4; k2++) { np = k0*k1*k2; ae_matrix_set_length(&gp, np, nx+ny, _state); /* * create grid */ for(i=0; i<=k0-1; i++) { for(j=0; j<=k1-1; j++) { for(k=0; k<=k2-1; k++) { gp.ptr.pp_double[(i*k1+j)*k2+k][0] = point.ptr.p_double[0]+sx*i; gp.ptr.pp_double[(i*k1+j)*k2+k][1] = point.ptr.p_double[1]+sx*j; gp.ptr.pp_double[(i*k1+j)*k2+k][2] = point.ptr.p_double[2]+sx*k; for(l=0; l<=ny-1; l++) { gp.ptr.pp_double[(i*k1+j)*k2+k][nx+l] = sy*(2*ae_randomreal(_state)-1); } } } } rbfsetpoints(&s, &gp, np, _state); rbfbuildmodel(&s, &rep, _state); rbfcreatecalcbuffer(&s, &calcbuf, _state); for(i=0; i<=np-1; i++) { /* * For each row we randomly choose a function to test * and call it. We do not call multiple functions per * row because carry-over effects may mask errors in * some function (say, it is possible that function * simply returns results from previous call of some * other function which were stored in the RBF model; * in this case, previous call with same parameters * may hide deficiencies in the function). 
*/ x.ptr.p_double[0] = gp.ptr.pp_double[i][0]; x.ptr.p_double[1] = gp.ptr.pp_double[i][1]; x.ptr.p_double[2] = gp.ptr.pp_double[i][2]; fidx = ae_randominteger(4, _state); if( fidx==0&&ny==1 ) { y.ptr.p_double[0] = rbfcalc3(&s, x.ptr.p_double[0], x.ptr.p_double[1], x.ptr.p_double[2], _state); if( ae_fp_greater(ae_fabs(gp.ptr.pp_double[i][nx]-y.ptr.p_double[0], _state),sy*eps) ) { result = ae_true; ae_frame_leave(_state); return result; } } if( fidx==1 ) { rbfcalc(&s, &x, &y, _state); for(j=0; j<=ny-1; j++) { if( ae_fp_greater(ae_fabs(gp.ptr.pp_double[i][nx+j]-y.ptr.p_double[j], _state),sy*eps) ) { result = ae_true; ae_frame_leave(_state); return result; } } } if( fidx==2 ) { rbfcalcbuf(&s, &x, &y, _state); for(j=0; j<=ny-1; j++) { if( ae_fp_greater(ae_fabs(gp.ptr.pp_double[i][nx+j]-y.ptr.p_double[j], _state),sy*eps) ) { result = ae_true; ae_frame_leave(_state); return result; } } } if( fidx==3 ) { rbftscalcbuf(&s, &calcbuf, &x, &y, _state); for(j=0; j<=ny-1; j++) { if( ae_fp_greater(ae_fabs(gp.ptr.pp_double[i][nx+j]-y.ptr.p_double[j], _state),sy*eps) ) { result = ae_true; ae_frame_leave(_state); return result; } } } } /* * test for RBFUnpack */ rbfunpack(&s, &unx, &uny, &xwr, &np, &v, &modelversion, _state); if( (((((nx!=unx||ny!=uny)||xwr.rows!=np)||xwr.cols!=nx+ny+1)||v.rows!=ny)||v.cols!=nx+1)||modelversion!=1 ) { seterrorflag(&result, ae_true, _state); ae_frame_leave(_state); return result; } for(i=0; i<=np-1; i++) { if( ae_fp_greater(ae_fabs(xwr.ptr.pp_double[i][unx+uny]-q*sx, _state),sx*eps) ) { result = ae_true; ae_frame_leave(_state); return result; } } if( linterm==2 ) { for(i=0; i<=unx-1; i++) { for(j=0; j<=uny-1; j++) { if( ae_fp_neq(v.ptr.pp_double[j][i],(double)(0)) ) { result = ae_true; ae_frame_leave(_state); return result; } } } } if( linterm==3 ) { for(i=0; i<=unx; i++) { for(j=0; j<=uny-1; j++) { if( ae_fp_neq(v.ptr.pp_double[j][i],(double)(0)) ) { result = ae_true; ae_frame_leave(_state); return result; } } } } } } } } } } } result = ae_false; ae_frame_leave(_state); return result; } /************************************************************************* Function for testing RBF module on irregular grids. -- ALGLIB -- Copyright 13.12.2011 by Bochkanov Sergey *************************************************************************/ static ae_bool testrbfunit_irregularrbftest(ae_state *_state) { ae_frame _frame_block; rbfmodel s; rbfreport rep; ae_int_t nx; ae_int_t ny; ae_int_t k0; ae_int_t k1; ae_int_t k2; ae_int_t linterm; ae_int_t np; double sx; double sy; double zx; double px; double zy; double py; double q; double z; ae_vector point; ae_matrix gp; ae_vector x; ae_vector y; ae_matrix gy; double noiselevel; ae_int_t i; ae_int_t j; ae_int_t k; ae_int_t l; double eps; ae_bool result; ae_frame_make(_state, &_frame_block); _rbfmodel_init(&s, _state); _rbfreport_init(&rep, _state); ae_vector_init(&point, 0, DT_REAL, _state); ae_matrix_init(&gp, 0, 0, DT_REAL, _state); ae_vector_init(&x, 0, DT_REAL, _state); ae_vector_init(&y, 0, DT_REAL, _state); ae_matrix_init(&gy, 0, 0, DT_REAL, _state); zx = (double)(10); px = (double)(15); zy = (double)(10); py = (double)(15); noiselevel = 0.1; eps = 1.0E-6; /* * Problem types: * * 2 and 3-dimensional problems * * problems with zero, constant, linear terms * * different scalings of X and Y values (1.0, 1E-15, 1E+15) * * noisy grids, which are just regular grids with different grid sizes * (from 2 to 4 points for each dimension) and moderate amount of random * noise added to all node positions. 
* * We check that: * * RBF model correctly reproduces function value (testes with different Calc() functions) */ for(nx=2; nx<=3; nx++) { for(ny=1; ny<=3; ny++) { for(linterm=1; linterm<=3; linterm++) { /* * prepare test problem */ sx = ae_pow(zx, px*(ae_randominteger(3, _state)-1), _state); sy = ae_pow(zy, py*(ae_randominteger(3, _state)-1), _state); ae_vector_set_length(&x, nx, _state); ae_vector_set_length(&y, ny, _state); ae_vector_set_length(&point, nx, _state); rbfcreate(nx, ny, &s, _state); rbfsetcond(&s, testrbfunit_heps, testrbfunit_heps, testrbfunit_mxits, _state); q = 0.25+ae_randomreal(_state); z = 4.5+ae_randomreal(_state); rbfsetalgoqnn(&s, q, z, _state); if( linterm==1 ) { rbfsetlinterm(&s, _state); } if( linterm==2 ) { rbfsetconstterm(&s, _state); } if( linterm==3 ) { rbfsetzeroterm(&s, _state); } /* * start points for grid */ for(i=0; i<=nx-1; i++) { point.ptr.p_double[i] = sx*(2*ae_randomreal(_state)-1); } /* * 2-dimensional test problems */ if( nx==2 ) { for(k0=2; k0<=4; k0++) { for(k1=2; k1<=4; k1++) { np = k0*k1; ae_matrix_set_length(&gp, np, nx+ny, _state); /* * create grid */ for(i=0; i<=k0-1; i++) { for(j=0; j<=k1-1; j++) { gp.ptr.pp_double[i*k1+j][0] = point.ptr.p_double[0]+sx*i+noiselevel*sx*(2*ae_randomreal(_state)-1); gp.ptr.pp_double[i*k1+j][1] = point.ptr.p_double[1]+sx*j+noiselevel*sx*(2*ae_randomreal(_state)-1); for(k=0; k<=ny-1; k++) { gp.ptr.pp_double[i*k1+j][nx+k] = sy*(2*ae_randomreal(_state)-1); } } } rbfsetpoints(&s, &gp, np, _state); rbfbuildmodel(&s, &rep, _state); for(i=0; i<=np-1; i++) { x.ptr.p_double[0] = gp.ptr.pp_double[i][0]; x.ptr.p_double[1] = gp.ptr.pp_double[i][1]; if( ny==1 ) { y.ptr.p_double[0] = rbfcalc2(&s, x.ptr.p_double[0], x.ptr.p_double[1], _state); if( ae_fp_greater(ae_fabs(gp.ptr.pp_double[i][nx]-y.ptr.p_double[0], _state),sy*eps) ) { result = ae_true; ae_frame_leave(_state); return result; } } rbfcalc(&s, &x, &y, _state); for(j=0; j<=ny-1; j++) { if( ae_fp_greater(ae_fabs(gp.ptr.pp_double[i][nx+j]-y.ptr.p_double[j], _state),sy*eps) ) { result = ae_true; ae_frame_leave(_state); return result; } } rbfcalcbuf(&s, &x, &y, _state); for(j=0; j<=ny-1; j++) { if( ae_fp_greater(ae_fabs(gp.ptr.pp_double[i][nx+j]-y.ptr.p_double[j], _state),sy*eps) ) { result = ae_true; ae_frame_leave(_state); return result; } } } } } } /* * 2-dimensional test problems */ if( nx==3 ) { for(k0=2; k0<=4; k0++) { for(k1=2; k1<=4; k1++) { for(k2=2; k2<=4; k2++) { np = k0*k1*k2; ae_matrix_set_length(&gp, np, nx+ny, _state); /* * create grid */ for(i=0; i<=k0-1; i++) { for(j=0; j<=k1-1; j++) { for(k=0; k<=k2-1; k++) { gp.ptr.pp_double[(i*k1+j)*k2+k][0] = point.ptr.p_double[0]+sx*i+noiselevel*sx*(2*ae_randomreal(_state)-1); gp.ptr.pp_double[(i*k1+j)*k2+k][1] = point.ptr.p_double[1]+sx*j+noiselevel*sx*(2*ae_randomreal(_state)-1); gp.ptr.pp_double[(i*k1+j)*k2+k][2] = point.ptr.p_double[2]+sx*k+noiselevel*sx*(2*ae_randomreal(_state)-1); for(l=0; l<=ny-1; l++) { gp.ptr.pp_double[(i*k1+j)*k2+k][nx+l] = sy*(2*ae_randomreal(_state)-1); } } } } rbfsetpoints(&s, &gp, np, _state); rbfbuildmodel(&s, &rep, _state); for(i=0; i<=np-1; i++) { x.ptr.p_double[0] = gp.ptr.pp_double[i][0]; x.ptr.p_double[1] = gp.ptr.pp_double[i][1]; x.ptr.p_double[2] = gp.ptr.pp_double[i][2]; if( ny==1 ) { y.ptr.p_double[0] = rbfcalc3(&s, x.ptr.p_double[0], x.ptr.p_double[1], x.ptr.p_double[2], _state); if( ae_fp_greater(ae_fabs(gp.ptr.pp_double[i][nx]-y.ptr.p_double[0], _state),sy*eps) ) { result = ae_true; ae_frame_leave(_state); return result; } } rbfcalc(&s, &x, &y, _state); for(j=0; j<=ny-1; 
j++) { if( ae_fp_greater(ae_fabs(gp.ptr.pp_double[i][nx+j]-y.ptr.p_double[j], _state),sy*eps) ) { result = ae_true; ae_frame_leave(_state); return result; } } rbfcalcbuf(&s, &x, &y, _state); for(j=0; j<=ny-1; j++) { if( ae_fp_greater(ae_fabs(gp.ptr.pp_double[i][nx+j]-y.ptr.p_double[j], _state),sy*eps) ) { result = ae_true; ae_frame_leave(_state); return result; } } } } } } } } } } result = ae_false; ae_frame_leave(_state); return result; } /************************************************************************* The test does check, that algorithm can build linear model for the data sets, when Y depends on X linearly. -- ALGLIB -- Copyright 13.12.2011 by Bochkanov Sergey *************************************************************************/ static ae_bool testrbfunit_linearitymodelrbftest(ae_state *_state) { ae_frame _frame_block; rbfmodel s; rbfreport rep; ae_int_t nx; ae_int_t ny; ae_int_t k0; ae_int_t k1; ae_int_t k2; ae_int_t linterm; ae_int_t np; double sx; double sy; double zx; double px; double zy; double py; double q; double z; ae_vector point; ae_vector a; ae_matrix gp; ae_vector x; ae_vector y; ae_int_t unx; ae_int_t uny; ae_matrix xwr; ae_matrix v; ae_int_t i; ae_int_t j; ae_int_t k; ae_int_t l; ae_int_t modelversion; ae_bool result; ae_frame_make(_state, &_frame_block); _rbfmodel_init(&s, _state); _rbfreport_init(&rep, _state); ae_vector_init(&point, 0, DT_REAL, _state); ae_vector_init(&a, 0, DT_REAL, _state); ae_matrix_init(&gp, 0, 0, DT_REAL, _state); ae_vector_init(&x, 0, DT_REAL, _state); ae_vector_init(&y, 0, DT_REAL, _state); ae_matrix_init(&xwr, 0, 0, DT_REAL, _state); ae_matrix_init(&v, 0, 0, DT_REAL, _state); zx = (double)(10); px = (double)(15); zy = (double)(10); py = (double)(15); ny = 1; for(nx=2; nx<=3; nx++) { for(linterm=1; linterm<=3; linterm++) { /* * prepare test problem */ sx = ae_pow(zx, px*(ae_randominteger(3, _state)-1), _state); sy = ae_pow(zy, py*(ae_randominteger(3, _state)-1), _state); ae_vector_set_length(&x, nx, _state); ae_vector_set_length(&y, ny, _state); ae_vector_set_length(&point, nx, _state); rbfcreate(nx, ny, &s, _state); q = 0.25+ae_randomreal(_state); z = 4.5+ae_randomreal(_state); rbfsetalgoqnn(&s, q, z, _state); ae_vector_set_length(&a, nx+1, _state); if( linterm==1 ) { rbfsetlinterm(&s, _state); for(i=0; i<=nx-1; i++) { a.ptr.p_double[i] = sy*(2*ae_randomreal(_state)-1)/sx; } a.ptr.p_double[nx] = sy*(2*ae_randomreal(_state)-1); } if( linterm==2 ) { rbfsetconstterm(&s, _state); for(i=0; i<=nx-1; i++) { a.ptr.p_double[i] = (double)(0); } a.ptr.p_double[nx] = sy*(2*ae_randomreal(_state)-1); } if( linterm==3 ) { rbfsetzeroterm(&s, _state); for(i=0; i<=nx; i++) { a.ptr.p_double[i] = (double)(0); } } /* * start points for grid */ for(i=0; i<=nx-1; i++) { point.ptr.p_double[i] = sx*(2*ae_randomreal(_state)-1); } if( nx==2 ) { for(k0=2; k0<=4; k0++) { for(k1=2; k1<=4; k1++) { np = k0*k1; ae_matrix_set_length(&gp, np, nx+ny, _state); /* * create grid */ for(i=0; i<=k0-1; i++) { for(j=0; j<=k1-1; j++) { gp.ptr.pp_double[i*k1+j][0] = point.ptr.p_double[0]+sx*i; gp.ptr.pp_double[i*k1+j][1] = point.ptr.p_double[1]+sx*j; gp.ptr.pp_double[i*k1+j][nx] = a.ptr.p_double[nx]; for(k=0; k<=nx-1; k++) { gp.ptr.pp_double[i*k1+j][nx] = gp.ptr.pp_double[i*k1+j][nx]+gp.ptr.pp_double[i*k1+j][k]*a.ptr.p_double[k]; } } } rbfsetpoints(&s, &gp, np, _state); rbfbuildmodel(&s, &rep, _state); /* * test for RBFUnpack */ rbfunpack(&s, &unx, &uny, &xwr, &np, &v, &modelversion, _state); if( 
(((((nx!=unx||ny!=uny)||xwr.rows!=np)||xwr.cols!=nx+ny+1)||v.rows!=ny)||v.cols!=nx+1)||modelversion!=1 ) { result = ae_true; ae_frame_leave(_state); return result; } for(i=0; i<=nx-1; i++) { if( ae_fp_greater(ae_fabs(v.ptr.pp_double[0][i]-a.ptr.p_double[i], _state),sy/sx*testrbfunit_tol) ) { result = ae_true; ae_frame_leave(_state); return result; } } if( ae_fp_greater(ae_fabs(v.ptr.pp_double[0][nx]-a.ptr.p_double[nx], _state),sy*testrbfunit_tol) ) { result = ae_true; ae_frame_leave(_state); return result; } for(i=0; i<=np-1; i++) { if( ae_fp_greater(ae_fabs(xwr.ptr.pp_double[i][unx], _state),sy*testrbfunit_tol) ) { result = ae_true; ae_frame_leave(_state); return result; } } } } } if( nx==3 ) { for(k0=2; k0<=4; k0++) { for(k1=2; k1<=4; k1++) { for(k2=2; k2<=4; k2++) { np = k0*k1*k2; ae_matrix_set_length(&gp, np, nx+ny, _state); /* * create grid */ for(i=0; i<=k0-1; i++) { for(j=0; j<=k1-1; j++) { for(k=0; k<=k2-1; k++) { gp.ptr.pp_double[(i*k1+j)*k2+k][0] = point.ptr.p_double[0]+sx*i; gp.ptr.pp_double[(i*k1+j)*k2+k][1] = point.ptr.p_double[1]+sx*j; gp.ptr.pp_double[(i*k1+j)*k2+k][2] = point.ptr.p_double[2]+sx*k; gp.ptr.pp_double[(i*k1+j)*k2+k][nx] = a.ptr.p_double[nx]; for(l=0; l<=nx-1; l++) { gp.ptr.pp_double[(i*k1+j)*k2+k][nx] = gp.ptr.pp_double[(i*k1+j)*k2+k][nx]+gp.ptr.pp_double[(i*k1+j)*k2+k][l]*a.ptr.p_double[l]; } } } } rbfsetpoints(&s, &gp, np, _state); rbfbuildmodel(&s, &rep, _state); /* * test for RBFUnpack */ rbfunpack(&s, &unx, &uny, &xwr, &np, &v, &modelversion, _state); if( (((((nx!=unx||ny!=uny)||xwr.rows!=np)||xwr.cols!=nx+ny+1)||v.rows!=ny)||v.cols!=nx+1)||modelversion!=1 ) { result = ae_true; ae_frame_leave(_state); return result; } for(i=0; i<=nx-1; i++) { if( ae_fp_greater(ae_fabs(v.ptr.pp_double[0][i]-a.ptr.p_double[i], _state),sy/sx*testrbfunit_tol) ) { result = ae_true; ae_frame_leave(_state); return result; } } if( ae_fp_greater(ae_fabs(v.ptr.pp_double[0][nx]-a.ptr.p_double[nx], _state),sy*testrbfunit_tol) ) { result = ae_true; ae_frame_leave(_state); return result; } for(i=0; i<=np-1; i++) { if( ae_fp_greater(ae_fabs(xwr.ptr.pp_double[i][unx], _state),sy*testrbfunit_tol) ) { result = ae_true; ae_frame_leave(_state); return result; } } } } } } } } result = ae_false; ae_frame_leave(_state); return result; } /************************************************************************* This function tests serialization -- ALGLIB -- Copyright 02.02.2012 by Bochkanov Sergey *************************************************************************/ static ae_bool testrbfunit_serializationtest(ae_state *_state) { ae_frame _frame_block; rbfmodel s; rbfmodel s2; rbfreport rep; ae_int_t n; ae_int_t nx; ae_int_t ny; ae_int_t k0; ae_int_t k1; ae_int_t k2; ae_int_t i; ae_int_t i0; ae_int_t i1; ae_int_t i2; ae_int_t j; ae_int_t k; double rbase; ae_int_t nlayers; ae_int_t bf; ae_int_t gridsize; ae_matrix xy; ae_vector testpoint; ae_vector y0; ae_vector y1; ae_vector scalevec; ae_bool result; ae_frame_make(_state, &_frame_block); _rbfmodel_init(&s, _state); _rbfmodel_init(&s2, _state); _rbfreport_init(&rep, _state); ae_matrix_init(&xy, 0, 0, DT_REAL, _state); ae_vector_init(&testpoint, 0, DT_REAL, _state); ae_vector_init(&y0, 0, DT_REAL, _state); ae_vector_init(&y1, 0, DT_REAL, _state); ae_vector_init(&scalevec, 0, DT_REAL, _state); result = ae_false; /* * This function generates random 2 or 3 dimensional problem, * builds RBF model (QNN is used), serializes/unserializes it, then compares * models by calculating model value at some random point. 
* * Additionally we test that new model (one which was restored * after serialization) has lost all model construction settings, * i.e. if we call RBFBuildModel() on a NEW model, we will get * empty (zero) model. */ for(nx=2; nx<=3; nx++) { for(ny=1; ny<=2; ny++) { /* * prepare test problem */ rbfcreate(nx, ny, &s, _state); rbfsetalgoqnn(&s, 1.0, 5.0, _state); rbfsetlinterm(&s, _state); if( nx==2 ) { /* * 2-dimensional problem */ k0 = 2+ae_randominteger(4, _state); k1 = 2+ae_randominteger(4, _state); ae_matrix_set_length(&xy, k0*k1, nx+ny, _state); for(i0=0; i0<=k0-1; i0++) { for(i1=0; i1<=k1-1; i1++) { xy.ptr.pp_double[i0*k1+i1][0] = i0+0.1*(2*ae_randomreal(_state)-1); xy.ptr.pp_double[i0*k1+i1][1] = i1+0.1*(2*ae_randomreal(_state)-1); for(j=0; j<=ny-1; j++) { xy.ptr.pp_double[i0*k1+i1][nx+j] = 2*ae_randomreal(_state)-1; } } } ae_vector_set_length(&testpoint, nx, _state); testpoint.ptr.p_double[0] = ae_randomreal(_state)*(k0-1); testpoint.ptr.p_double[1] = ae_randomreal(_state)*(k1-1); } else { /* * 3-dimensional problem */ k0 = 2+ae_randominteger(4, _state); k1 = 2+ae_randominteger(4, _state); k2 = 2+ae_randominteger(4, _state); ae_matrix_set_length(&xy, k0*k1*k2, nx+ny, _state); for(i0=0; i0<=k0-1; i0++) { for(i1=0; i1<=k1-1; i1++) { for(i2=0; i2<=k2-1; i2++) { xy.ptr.pp_double[i0*k1*k2+i1*k2+i2][0] = i0+0.1*(2*ae_randomreal(_state)-1); xy.ptr.pp_double[i0*k1*k2+i1*k2+i2][1] = i1+0.1*(2*ae_randomreal(_state)-1); xy.ptr.pp_double[i0*k1*k2+i1*k2+i2][2] = i2+0.1*(2*ae_randomreal(_state)-1); for(j=0; j<=ny-1; j++) { xy.ptr.pp_double[i0*k1*k2+i1*k2+i2][nx+j] = 2*ae_randomreal(_state)-1; } } } } ae_vector_set_length(&testpoint, nx, _state); testpoint.ptr.p_double[0] = ae_randomreal(_state)*(k0-1); testpoint.ptr.p_double[1] = ae_randomreal(_state)*(k1-1); testpoint.ptr.p_double[2] = ae_randomreal(_state)*(k2-1); } rbfsetpoints(&s, &xy, xy.rows, _state); /* * Build model, serialize, compare */ rbfbuildmodel(&s, &rep, _state); { /* * This code passes data structure through serializers * (serializes it to string and loads back) */ ae_serializer _local_serializer; ae_int_t _local_ssize; ae_frame _local_frame_block; ae_dyn_block _local_dynamic_block; ae_frame_make(_state, &_local_frame_block); ae_serializer_init(&_local_serializer); ae_serializer_alloc_start(&_local_serializer); rbfalloc(&_local_serializer, &s, _state); _local_ssize = ae_serializer_get_alloc_size(&_local_serializer); ae_db_malloc(&_local_dynamic_block, _local_ssize+1, _state, ae_true); ae_serializer_sstart_str(&_local_serializer, (char*)_local_dynamic_block.ptr); rbfserialize(&_local_serializer, &s, _state); ae_serializer_stop(&_local_serializer, _state); ae_serializer_clear(&_local_serializer); ae_serializer_init(&_local_serializer); ae_serializer_ustart_str(&_local_serializer, (char*)_local_dynamic_block.ptr); rbfunserialize(&_local_serializer, &s2, _state); ae_serializer_stop(&_local_serializer, _state); ae_serializer_clear(&_local_serializer); ae_frame_leave(_state); } rbfcalc(&s, &testpoint, &y0, _state); rbfcalc(&s2, &testpoint, &y1, _state); if( y0.cnt!=ny||y1.cnt!=ny ) { result = ae_true; ae_frame_leave(_state); return result; } for(j=0; j<=ny-1; j++) { if( ae_fp_neq(y0.ptr.p_double[j],y1.ptr.p_double[j]) ) { result = ae_true; ae_frame_leave(_state); return result; } } /* * Check that calling RBFBuildModel() on S2 (new model) * will result in construction of zero model, i.e. test * that serialization restores model, but not dataset * which was used to build model. 
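In other words (sketch; S2 is the model restored by RBFUnserialize above):

    rbfbuildmodel(&s2, &rep, _state);        // S2 carries no dataset after restore
    rbfcalc(&s2, &testpoint, &y1, _state);
    // Y1 must now be an all-zero vector of length NY: serialization preserves
    // the fitted coefficients of a model, but not the points used to build it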
*/ rbfbuildmodel(&s2, &rep, _state); rbfcalc(&s2, &testpoint, &y1, _state); if( y1.cnt!=ny ) { result = ae_true; ae_frame_leave(_state); return result; } for(j=0; j<=ny-1; j++) { if( ae_fp_neq(y1.ptr.p_double[j],(double)(0)) ) { result = ae_true; ae_frame_leave(_state); return result; } } } } /* * This function generates random 2 or 3 dimensional problem, * builds model using RBF-ML algo, serializes/unserializes it, * then compares models by calculating model value at some * random point. * * Additionally we test that new model (one which was restored * after serialization) has lost all model construction settings, * i.e. if we call RBFBuildModel() on a NEW model, we will get * empty (zero) model. */ for(nx=2; nx<=3; nx++) { for(ny=1; ny<=2; ny++) { /* * prepare test problem */ rbfcreate(nx, ny, &s, _state); rbfsetalgomultilayer(&s, 5.0, 5, 1.0E-3, _state); rbfsetlinterm(&s, _state); if( nx==2 ) { /* * 2-dimensional problem */ k0 = 2+ae_randominteger(4, _state); k1 = 2+ae_randominteger(4, _state); ae_matrix_set_length(&xy, k0*k1, nx+ny, _state); for(i0=0; i0<=k0-1; i0++) { for(i1=0; i1<=k1-1; i1++) { xy.ptr.pp_double[i0*k1+i1][0] = i0+0.1*(2*ae_randomreal(_state)-1); xy.ptr.pp_double[i0*k1+i1][1] = i1+0.1*(2*ae_randomreal(_state)-1); for(j=0; j<=ny-1; j++) { xy.ptr.pp_double[i0*k1+i1][nx+j] = 2*ae_randomreal(_state)-1; } } } ae_vector_set_length(&testpoint, nx, _state); testpoint.ptr.p_double[0] = ae_randomreal(_state)*(k0-1); testpoint.ptr.p_double[1] = ae_randomreal(_state)*(k1-1); } else { /* * 3-dimensional problem */ k0 = 2+ae_randominteger(4, _state); k1 = 2+ae_randominteger(4, _state); k2 = 2+ae_randominteger(4, _state); ae_matrix_set_length(&xy, k0*k1*k2, nx+ny, _state); for(i0=0; i0<=k0-1; i0++) { for(i1=0; i1<=k1-1; i1++) { for(i2=0; i2<=k2-1; i2++) { xy.ptr.pp_double[i0*k1*k2+i1*k2+i2][0] = i0+0.1*(2*ae_randomreal(_state)-1); xy.ptr.pp_double[i0*k1*k2+i1*k2+i2][1] = i1+0.1*(2*ae_randomreal(_state)-1); xy.ptr.pp_double[i0*k1*k2+i1*k2+i2][2] = i2+0.1*(2*ae_randomreal(_state)-1); for(j=0; j<=ny-1; j++) { xy.ptr.pp_double[i0*k1*k2+i1*k2+i2][nx+j] = 2*ae_randomreal(_state)-1; } } } } ae_vector_set_length(&testpoint, nx, _state); testpoint.ptr.p_double[0] = ae_randomreal(_state)*(k0-1); testpoint.ptr.p_double[1] = ae_randomreal(_state)*(k1-1); testpoint.ptr.p_double[2] = ae_randomreal(_state)*(k2-1); } rbfsetpoints(&s, &xy, xy.rows, _state); /* * Build model, serialize, compare */ rbfbuildmodel(&s, &rep, _state); { /* * This code passes data structure through serializers * (serializes it to string and loads back) */ ae_serializer _local_serializer; ae_int_t _local_ssize; ae_frame _local_frame_block; ae_dyn_block _local_dynamic_block; ae_frame_make(_state, &_local_frame_block); ae_serializer_init(&_local_serializer); ae_serializer_alloc_start(&_local_serializer); rbfalloc(&_local_serializer, &s, _state); _local_ssize = ae_serializer_get_alloc_size(&_local_serializer); ae_db_malloc(&_local_dynamic_block, _local_ssize+1, _state, ae_true); ae_serializer_sstart_str(&_local_serializer, (char*)_local_dynamic_block.ptr); rbfserialize(&_local_serializer, &s, _state); ae_serializer_stop(&_local_serializer, _state); ae_serializer_clear(&_local_serializer); ae_serializer_init(&_local_serializer); ae_serializer_ustart_str(&_local_serializer, (char*)_local_dynamic_block.ptr); rbfunserialize(&_local_serializer, &s2, _state); ae_serializer_stop(&_local_serializer, _state); ae_serializer_clear(&_local_serializer); ae_frame_leave(_state); } rbfcalc(&s, &testpoint, &y0, _state); rbfcalc(&s2, &testpoint, 
&y1, _state); if( y0.cnt!=ny||y1.cnt!=ny ) { result = ae_true; ae_frame_leave(_state); return result; } for(j=0; j<=ny-1; j++) { if( ae_fp_neq(y0.ptr.p_double[j],y1.ptr.p_double[j]) ) { result = ae_true; ae_frame_leave(_state); return result; } } /* * Check that calling RBFBuildModel() on S2 (new model) * will result in construction of zero model, i.e. test * that serialization restores model, but not dataset * which was used to build model. */ rbfbuildmodel(&s2, &rep, _state); rbfcalc(&s2, &testpoint, &y1, _state); if( y1.cnt!=ny ) { result = ae_true; ae_frame_leave(_state); return result; } for(j=0; j<=ny-1; j++) { if( ae_fp_neq(y1.ptr.p_double[j],(double)(0)) ) { result = ae_true; ae_frame_leave(_state); return result; } } } } /* * This function generates random 1...4-dimensional problem, * builds model using RBF-H algo, serializes/unserializes it, * then compares models by calculating model value at some * random point. * * NOTE: we choose at random whether to use default scaling - * or user-supplied one. * * Additionally we test that new model (one which was restored * after serialization) has lost all model construction settings, * i.e. if we call RBFBuildModel() on a NEW model, we will get * empty (zero) model. */ for(nx=1; nx<=4; nx++) { for(ny=1; ny<=2; ny++) { /* * problem setup */ n = 150; rbase = 0.33; nlayers = 5; gridsize = ae_round(ae_pow((double)(n), (double)1/(double)nx, _state), _state)+1; bf = ae_randominteger(2, _state); n = ae_round(ae_pow((double)(gridsize), (double)(nx), _state), _state); ae_matrix_set_length(&xy, n, nx+ny, _state); ae_assert(gridsize>1, "Assertion failed", _state); ae_assert(ae_fp_eq((double)(n),ae_pow((double)(gridsize), (double)(nx), _state)), "Assertion failed", _state); for(i=0; i<=n-1; i++) { k = i; for(j=0; j<=nx-1; j++) { xy.ptr.pp_double[i][j] = (double)(k%gridsize)/(double)(gridsize-1); k = k/gridsize; } for(j=0; j<=ny-1; j++) { xy.ptr.pp_double[i][nx+j] = ae_randomreal(_state)-0.5; } } ae_vector_set_length(&testpoint, nx, _state); for(j=0; j<=nx-1; j++) { testpoint.ptr.p_double[j] = ae_randomreal(_state); } ae_vector_set_length(&scalevec, nx, _state); for(j=0; j<=nx-1; j++) { scalevec.ptr.p_double[j] = ae_pow((double)(2), 2*ae_randomreal(_state)-1, _state); } /* * prepare test problem */ rbfcreate(nx, ny, &s, _state); rbfsetv2bf(&s, bf, _state); rbfsetalgohierarchical(&s, rbase, nlayers, 0.0, _state); rbfsetlinterm(&s, _state); if( ae_fp_greater(ae_randomreal(_state),0.5) ) { rbfsetpoints(&s, &xy, xy.rows, _state); } else { rbfsetpointsandscales(&s, &xy, xy.rows, &scalevec, _state); } /* * Build model, serialize, compare */ rbfbuildmodel(&s, &rep, _state); { /* * This code passes data structure through serializers * (serializes it to string and loads back) */ ae_serializer _local_serializer; ae_int_t _local_ssize; ae_frame _local_frame_block; ae_dyn_block _local_dynamic_block; ae_frame_make(_state, &_local_frame_block); ae_serializer_init(&_local_serializer); ae_serializer_alloc_start(&_local_serializer); rbfalloc(&_local_serializer, &s, _state); _local_ssize = ae_serializer_get_alloc_size(&_local_serializer); ae_db_malloc(&_local_dynamic_block, _local_ssize+1, _state, ae_true); ae_serializer_sstart_str(&_local_serializer, (char*)_local_dynamic_block.ptr); rbfserialize(&_local_serializer, &s, _state); ae_serializer_stop(&_local_serializer, _state); ae_serializer_clear(&_local_serializer); ae_serializer_init(&_local_serializer); ae_serializer_ustart_str(&_local_serializer, (char*)_local_dynamic_block.ptr); rbfunserialize(&_local_serializer, 
&s2, _state); ae_serializer_stop(&_local_serializer, _state); ae_serializer_clear(&_local_serializer); ae_frame_leave(_state); } rbfcalc(&s, &testpoint, &y0, _state); rbfcalc(&s2, &testpoint, &y1, _state); if( y0.cnt!=ny||y1.cnt!=ny ) { result = ae_true; ae_frame_leave(_state); return result; } for(j=0; j<=ny-1; j++) { if( ae_fp_neq(y0.ptr.p_double[j],y1.ptr.p_double[j]) ) { result = ae_true; ae_frame_leave(_state); return result; } } /* * Check that calling RBFBuildModel() on S2 (new model) * will result in construction of zero model, i.e. test * that serialization restores model, but not dataset * which was used to build model. */ rbfbuildmodel(&s2, &rep, _state); rbfcalc(&s2, &testpoint, &y1, _state); if( y1.cnt!=ny ) { result = ae_true; ae_frame_leave(_state); return result; } for(j=0; j<=ny-1; j++) { if( ae_fp_neq(y1.ptr.p_double[j],(double)(0)) ) { result = ae_true; ae_frame_leave(_state); return result; } } } } ae_frame_leave(_state); return result; } 
static ae_bool testrbfunit_searcherr(/* Real */ ae_matrix* y0, /* Real */ ae_matrix* y1, ae_int_t n, ae_int_t ny, ae_int_t errtype, /* Real */ ae_vector* b1, /* Real */ ae_vector* delta, ae_state *_state) { ae_frame _frame_block; ae_matrix _y0; ae_matrix _y1; ae_vector _b1; ae_vector _delta; double oralerr; double iralerr; ae_vector irerr; ae_vector orerr; ae_int_t lb; ae_int_t rb; ae_int_t i; ae_int_t j; ae_bool result; ae_frame_make(_state, &_frame_block); ae_matrix_init_copy(&_y0, y0, _state); y0 = &_y0; ae_matrix_init_copy(&_y1, y1, _state); y1 = &_y1; ae_vector_init_copy(&_b1, b1, _state); b1 = &_b1; ae_vector_init_copy(&_delta, delta, _state); delta = &_delta; ae_vector_init(&irerr, 0, DT_REAL, _state); ae_vector_init(&orerr, 0, DT_REAL, _state); ae_assert(n>0, "SearchErr: invalid parameter N(N<=0).", _state); ae_assert(ny>0, "SearchErr: invalid parameter NY(NY<=0).", _state); oralerr = 1.0E-1; iralerr = 1.0E-2; lb = 25; rb = 75; ae_vector_set_length(&orerr, ny, _state); ae_vector_set_length(&irerr, ny, _state); for(j=0; j<=ny-1; j++) { orerr.ptr.p_double[j] = (double)(0); irerr.ptr.p_double[j] = (double)(0); } 
if( errtype==1 ) { for(i=0; i<=n-1; i++) { for(j=0; j<=ny-1; j++) { if( ae_fp_less(orerr.ptr.p_double[j],ae_fabs(y0->ptr.pp_double[i][j]-y1->ptr.pp_double[i][j], _state)) ) { orerr.ptr.p_double[j] = ae_fabs(y0->ptr.pp_double[i][j]-y1->ptr.pp_double[i][j], _state); } } } for(i=0; i<=ny-1; i++) { if( ae_fp_greater(orerr.ptr.p_double[i],b1->ptr.p_double[i]+delta->ptr.p_double[i])||ae_fp_less(orerr.ptr.p_double[i],b1->ptr.p_double[i]-delta->ptr.p_double[i]) ) { result = ae_true; ae_frame_leave(_state); return result; } } } 
else { if( errtype==2 ) { for(i=0; i<=n-1; i++) { for(j=0; j<=ny-1; j++) { if( i>lb&&i<rb ) { if( ae_fp_less(irerr.ptr.p_double[j],ae_fabs(y0->ptr.pp_double[i][j]-y1->ptr.pp_double[i][j], _state)) ) { irerr.ptr.p_double[j] = ae_fabs(y0->ptr.pp_double[i][j]-y1->ptr.pp_double[i][j], _state); } } else { if( ae_fp_less(orerr.ptr.p_double[j],ae_fabs(y0->ptr.pp_double[i][j]-y1->ptr.pp_double[i][j], _state)) ) { orerr.ptr.p_double[j] = ae_fabs(y0->ptr.pp_double[i][j]-y1->ptr.pp_double[i][j], _state); } } } } for(i=0; i<=ny-1; i++) { if( ae_fp_greater(orerr.ptr.p_double[i],oralerr)||ae_fp_greater(irerr.ptr.p_double[i],iralerr) ) { result = ae_true; ae_frame_leave(_state); return result; } } } else { ae_assert(ae_false, "SearchErr: invalid argument ErrType(ErrType neither 1 nor 2)", _state); } } result = ae_false; ae_frame_leave(_state); return result; } 
/************************************************************************* Function for testing basic functionality of 
RBF module on regular grids with multi-layer algorithm in 2-3D. -- ALGLIB -- Copyright 2.03.2012 by Bochkanov Sergey *************************************************************************/ static ae_bool testrbfunit_basicmultilayerrbftest(ae_state *_state) { ae_frame _frame_block; rbfmodel s; rbfreport rep; rbfcalcbuffer calcbuf; ae_int_t nx; ae_int_t ny; ae_int_t k0; ae_int_t k1; ae_int_t k2; ae_int_t linterm; ae_int_t np; double q; ae_int_t layers; double eps; ae_int_t range; double s1; double s2; double gstep; ae_vector point; ae_matrix gp; ae_vector x; ae_vector y; ae_matrix gy; ae_vector gpgx0; ae_vector gpgx1; ae_vector gpgx2; ae_vector gcy; ae_int_t pass; ae_int_t passcount; ae_int_t i; ae_int_t j; ae_int_t k; ae_int_t l; ae_int_t fidx; double r0; ae_int_t margin; ae_int_t gridsize; double threshold; double v; ae_bool result; ae_frame_make(_state, &_frame_block); _rbfmodel_init(&s, _state); _rbfreport_init(&rep, _state); _rbfcalcbuffer_init(&calcbuf, _state); ae_vector_init(&point, 0, DT_REAL, _state); ae_matrix_init(&gp, 0, 0, DT_REAL, _state); ae_vector_init(&x, 0, DT_REAL, _state); ae_vector_init(&y, 0, DT_REAL, _state); ae_matrix_init(&gy, 0, 0, DT_REAL, _state); ae_vector_init(&gpgx0, 0, DT_REAL, _state); ae_vector_init(&gpgx1, 0, DT_REAL, _state); ae_vector_init(&gpgx2, 0, DT_REAL, _state); ae_vector_init(&gcy, 0, DT_REAL, _state); result = ae_false; range = 10; passcount = 10; eps = 1.0E-6; /* * Test that RBF model with sufficient layers will exactly reproduce * target function. */ for(pass=0; pass<=passcount-1; pass++) { /* * prepare test problem */ k0 = 6+ae_randominteger(3, _state); k1 = 6+ae_randominteger(3, _state); k2 = 6+ae_randominteger(3, _state); s1 = ae_pow((double)(range), (double)(ae_randominteger(3, _state)-1), _state); s2 = ae_pow((double)(range), (double)(ae_randominteger(3, _state)-1), _state); nx = ae_randominteger(2, _state)+2; ny = ae_randominteger(2, _state)+1; linterm = ae_randominteger(3, _state)+1; layers = 5; gstep = s1/6; q = s1; /* * Create RBF structure and auxiliary structures */ ae_vector_set_length(&x, nx, _state); ae_vector_set_length(&y, ny, _state); ae_vector_set_length(&point, nx, _state); rbfcreate(nx, ny, &s, _state); rbfsetalgomultilayer(&s, q, layers, 0.0, _state); if( linterm==1 ) { rbfsetlinterm(&s, _state); } if( linterm==2 ) { rbfsetconstterm(&s, _state); } if( linterm==3 ) { rbfsetzeroterm(&s, _state); } for(i=0; i<=nx-1; i++) { point.ptr.p_double[i] = s1*(2*ae_randomreal(_state)-1); } /* * 2-dimensional test problem */ if( nx==2 ) { np = k0*k1; ae_matrix_set_length(&gp, np, nx+ny, _state); /* * create grid */ for(i=0; i<=k0-1; i++) { for(j=0; j<=k1-1; j++) { gp.ptr.pp_double[i*k1+j][0] = point.ptr.p_double[0]+gstep*i; gp.ptr.pp_double[i*k1+j][1] = point.ptr.p_double[1]+gstep*j; for(k=0; k<=ny-1; k++) { gp.ptr.pp_double[i*k1+j][nx+k] = s2*(2*ae_randomreal(_state)-1); } } } rbfsetpoints(&s, &gp, np, _state); rbfbuildmodel(&s, &rep, _state); rbfcreatecalcbuffer(&s, &calcbuf, _state); if( ny==1 ) { ae_vector_set_length(&gpgx0, k0, _state); ae_vector_set_length(&gpgx1, k1, _state); for(i=0; i<=k0-1; i++) { gpgx0.ptr.p_double[i] = point.ptr.p_double[0]+gstep*i; } for(i=0; i<=k1-1; i++) { gpgx1.ptr.p_double[i] = point.ptr.p_double[1]+gstep*i; } rbfgridcalc2(&s, &gpgx0, k0, &gpgx1, k1, &gy, _state); for(i=0; i<=k0-1; i++) { for(j=0; j<=k1-1; j++) { seterrorflag(&result, ae_fp_greater(ae_fabs(gy.ptr.pp_double[i][j]-gp.ptr.pp_double[i*k1+j][nx], _state),s2*eps), _state); } } } for(i=0; i<=np-1; i++) { /* * For each row we randomly choose 
a function to test * and call it. We do not call multiple functions per * row because carry-over effects may mask errors in * some function (say, it is possible that function * simply returns results from previous call of some * other function which were stored in the RBF model; * in this case, previous call with same parameters * may hide deficiencies in the function). */ x.ptr.p_double[0] = gp.ptr.pp_double[i][0]; x.ptr.p_double[1] = gp.ptr.pp_double[i][1]; fidx = ae_randominteger(4, _state); if( fidx==0&&ny!=1 ) { continue; } if( fidx==0 ) { y.ptr.p_double[0] = rbfcalc2(&s, x.ptr.p_double[0], x.ptr.p_double[1], _state); } if( fidx==1 ) { rbfcalc(&s, &x, &y, _state); } if( fidx==2 ) { rbfcalcbuf(&s, &x, &y, _state); } if( fidx==3 ) { rbftscalcbuf(&s, &calcbuf, &x, &y, _state); } for(j=0; j<=ny-1; j++) { seterrorflag(&result, ae_fp_greater(ae_fabs(gp.ptr.pp_double[i][nx+j]-y.ptr.p_double[j], _state),s2*eps), _state); } } } /* * 3-dimensional test problems */ if( nx==3 ) { np = k0*k1*k2; ae_matrix_set_length(&gp, np, nx+ny, _state); /* * create grid, build model */ ae_vector_set_length(&gpgx0, k0, _state); ae_vector_set_length(&gpgx1, k1, _state); ae_vector_set_length(&gpgx2, k2, _state); for(i=0; i<=k0-1; i++) { gpgx0.ptr.p_double[i] = point.ptr.p_double[0]+gstep*i; } for(i=0; i<=k1-1; i++) { gpgx1.ptr.p_double[i] = point.ptr.p_double[1]+gstep*i; } for(i=0; i<=k2-1; i++) { gpgx2.ptr.p_double[i] = point.ptr.p_double[2]+gstep*i; } for(i=0; i<=k0-1; i++) { for(j=0; j<=k1-1; j++) { for(k=0; k<=k2-1; k++) { gp.ptr.pp_double[(i*k1+j)*k2+k][0] = gpgx0.ptr.p_double[i]; gp.ptr.pp_double[(i*k1+j)*k2+k][1] = gpgx1.ptr.p_double[j]; gp.ptr.pp_double[(i*k1+j)*k2+k][2] = gpgx2.ptr.p_double[k]; for(l=0; l<=ny-1; l++) { gp.ptr.pp_double[(i*k1+j)*k2+k][nx+l] = s2*(2*ae_randomreal(_state)-1); } } } } rbfsetpoints(&s, &gp, np, _state); rbfbuildmodel(&s, &rep, _state); rbfcreatecalcbuffer(&s, &calcbuf, _state); /* * Test RBFCalc3(), RBFCalc() and RBFCalcBuf() vs expected values on the grid (we expect good fit). */ for(i=0; i<=np-1; i++) { /* * For each row we randomly choose a function to test * and call it. We do not call multiple functions per * row because carry-over effects may mask errors in * some function (say, it is possible that function * simply returns results from previous call of some * other function which were stored in the RBF model; * in this case, previous call with same parameters * may hide deficiencies in the function). 
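* 
* The RBFGridCalc3V() check further below relies on the flattened layout of the returned array: component L of the model value at grid node (I0,I1,I2) is stored at 
* 
*     Y[ L + NY*(I0 + I1*K0 + I2*K0*K1) ] 
* 
* so, for example, with K0=2, K1=3, K2=4 and NY=2 the first component (L=0) of the value at node (1,2,3) sits at index 2*(1 + 2*2 + 3*2*3) = 46. 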
*/ x.ptr.p_double[0] = gp.ptr.pp_double[i][0]; x.ptr.p_double[1] = gp.ptr.pp_double[i][1]; x.ptr.p_double[2] = gp.ptr.pp_double[i][2]; fidx = ae_randominteger(4, _state); if( fidx==0&&ny!=1 ) { continue; } if( fidx==0 ) { y.ptr.p_double[0] = rbfcalc3(&s, x.ptr.p_double[0], x.ptr.p_double[1], x.ptr.p_double[2], _state); } if( fidx==1 ) { rbfcalc(&s, &x, &y, _state); } if( fidx==2 ) { rbfcalcbuf(&s, &x, &y, _state); } if( fidx==3 ) { rbftscalcbuf(&s, &calcbuf, &x, &y, _state); } for(j=0; j<=ny-1; j++) { seterrorflag(&result, ae_fp_greater(ae_fabs(gp.ptr.pp_double[i][nx+j]-y.ptr.p_double[j], _state),s2*eps), _state); } } /* * Test RBFGridCalc3V vs RBFCalc() */ rbfgridcalc3v(&s, &gpgx0, k0, &gpgx1, k1, &gpgx2, k2, &gcy, _state); for(i=0; i<=k0-1; i++) { for(j=0; j<=k1-1; j++) { for(k=0; k<=k2-1; k++) { x.ptr.p_double[0] = gpgx0.ptr.p_double[i]; x.ptr.p_double[1] = gpgx1.ptr.p_double[j]; x.ptr.p_double[2] = gpgx2.ptr.p_double[k]; rbfcalcbuf(&s, &x, &y, _state); for(l=0; l<=ny-1; l++) { seterrorflag(&result, ae_fp_greater(ae_fabs(y.ptr.p_double[l]-gcy.ptr.p_double[l+ny*(i+j*k0+k*k0*k1)], _state),1.0E-9*s2), _state); } } } } } } /* * Test smoothing properties of RBF model: model with just one layer * and large initial radius will produce "average" value of neighbors. * * In order to test it we create regular grid, fill it with regular * +1/-1 pattern, and test model values in the inner points. Model * values should be around zero (we use handcrafted threshold to test * it). Radius is chosen to be several times larger than grid step. * * We perform test for 2D model, because same behavior is expected * regardless of dimensionality. */ r0 = (double)(3); margin = 5; gridsize = 2*margin+20; threshold = 0.1; nx = 2; ny = 1; rbfcreate(nx, ny, &s, _state); rbfsetalgomultilayer(&s, r0, 1, 0.0, _state); rbfsetzeroterm(&s, _state); ae_matrix_set_length(&gp, gridsize*gridsize, nx+ny, _state); for(i=0; i<=gridsize-1; i++) { for(j=0; j<=gridsize-1; j++) { gp.ptr.pp_double[i*gridsize+j][0] = (double)(i); gp.ptr.pp_double[i*gridsize+j][1] = (double)(j); gp.ptr.pp_double[i*gridsize+j][2] = 0.10*ae_randomreal(_state)-0.05+(2*((i+j)%2)-1); } } rbfsetpoints(&s, &gp, gridsize*gridsize, _state); rbfbuildmodel(&s, &rep, _state); v = 0.0; for(i=margin; i<=gridsize-margin-1; i++) { for(j=margin; j<=gridsize-margin-1; j++) { v = ae_maxreal(v, ae_fabs(rbfcalc2(&s, (double)(i), (double)(j), _state), _state), _state); } } seterrorflag(&result, ae_fp_greater(v,threshold), _state); ae_frame_leave(_state); return result; } /************************************************************************* Function for testing basic functionality of RBF module on regular grids with multi-layer algorithm in 2-3D. 
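
A rough sketch of the subset-on-grid evaluation exercised by this test (same C-core calls as below; N0..N2, GF, GY, GY2 are illustrative names for the grid sizes, the per-node flag array and the outputs, with GF stored in the same flattened order as the output array):

    rbfgridcalc3v      (&s, &x0, n0, &x1, n1, &x2, n2, &gy,  _state);
    rbfgridcalc3vsubset(&s, &x0, n0, &x1, n1, &x2, n2, &gf, &gy2, _state);

For every node whose flag is set the subset result must match the full-grid result (to within a small tolerance); rows containing no flagged nodes are expected (informal guarantee) to be left at zero in the subset output.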
-- ALGLIB -- Copyright 2.03.2012 by Bochkanov Sergey *************************************************************************/ static void testrbfunit_gridcalc23test(ae_bool* errorflag, ae_state *_state) { ae_frame _frame_block; rbfmodel s; rbfreport rep; hqrndstate rs; ae_int_t pass; ae_int_t i; ae_int_t j; ae_int_t k; ae_int_t l; double perturbation; ae_vector kx; double sx; double sy; ae_int_t nx; ae_int_t ny; ae_int_t linterm; ae_int_t layers; ae_int_t npoints; ae_matrix xy; ae_vector gf; ae_vector rf; ae_vector x0; ae_vector x1; ae_vector x2; ae_vector gy; ae_vector gy2; ae_vector x; ae_vector y; double sparsity; ae_frame_make(_state, &_frame_block); _rbfmodel_init(&s, _state); _rbfreport_init(&rep, _state); _hqrndstate_init(&rs, _state); ae_vector_init(&kx, 0, DT_INT, _state); ae_matrix_init(&xy, 0, 0, DT_REAL, _state); ae_vector_init(&gf, 0, DT_BOOL, _state); ae_vector_init(&rf, 0, DT_BOOL, _state); ae_vector_init(&x0, 0, DT_REAL, _state); ae_vector_init(&x1, 0, DT_REAL, _state); ae_vector_init(&x2, 0, DT_REAL, _state); ae_vector_init(&gy, 0, DT_REAL, _state); ae_vector_init(&gy2, 0, DT_REAL, _state); ae_vector_init(&x, 0, DT_REAL, _state); ae_vector_init(&y, 0, DT_REAL, _state); hqrndrandomize(&rs, _state); for(pass=0; pass<=24; pass++) { /* * prepare test problem */ ae_vector_set_length(&kx, 3, _state); for(i=0; i<=2; i++) { /* * 66% of cases - large grid */ if( hqrnduniformi(&rs, 3, _state)==0 ) { kx.ptr.p_int[i] = ae_round(10*ae_pow((double)(10), hqrnduniformr(&rs, _state), _state), _state); continue; } /* * 33% of cases - small grid */ k = hqrnduniformi(&rs, 3, _state); if( k==0 ) { kx.ptr.p_int[i] = 1; } if( k==1 ) { kx.ptr.p_int[i] = 2; } if( k==2 ) { kx.ptr.p_int[i] = 10; } } sx = ae_pow((double)(10), (double)(hqrnduniformi(&rs, 3, _state)-1), _state); sy = ae_pow((double)(10), (double)(hqrnduniformi(&rs, 3, _state)-1), _state); nx = 3; ny = 1+hqrnduniformi(&rs, 5, _state); linterm = hqrnduniformi(&rs, 3, _state)+1; layers = hqrnduniformi(&rs, 3, _state)+1; npoints = 100; ae_matrix_set_length(&xy, npoints, nx+ny, _state); for(i=0; i<=npoints-1; i++) { for(j=0; j<=nx-1; j++) { xy.ptr.pp_double[i][j] = hqrnduniformr(&rs, _state)*sx; } for(j=0; j<=ny-1; j++) { xy.ptr.pp_double[i][nx+j] = hqrnduniformr(&rs, _state)*sy; } } /* * Create RBF model */ rbfcreate(nx, ny, &s, _state); rbfsetalgomultilayer(&s, 0.1*sx, layers, 0.0, _state); if( linterm==1 ) { rbfsetlinterm(&s, _state); } if( linterm==2 ) { rbfsetconstterm(&s, _state); } if( linterm==3 ) { rbfsetzeroterm(&s, _state); } rbfsetpoints(&s, &xy, npoints, _state); rbfbuildmodel(&s, &rep, _state); seterrorflag(errorflag, rep.terminationtype<=0, _state); if( *errorflag ) { ae_frame_leave(_state); return; } /* * Prepare test grid */ ae_vector_set_length(&x0, kx.ptr.p_int[0], _state); for(i=0; i<=kx.ptr.p_int[0]-1; i++) { perturbation = 0.5*(hqrnduniformr(&rs, _state)-0.5); x0.ptr.p_double[i] = sx*(i+perturbation)/kx.ptr.p_int[0]; } ae_vector_set_length(&x1, kx.ptr.p_int[1], _state); for(i=0; i<=kx.ptr.p_int[1]-1; i++) { perturbation = 0.5*(hqrnduniformr(&rs, _state)-0.5); x1.ptr.p_double[i] = sx*(i+perturbation)/kx.ptr.p_int[1]; } ae_vector_set_length(&x2, kx.ptr.p_int[2], _state); for(i=0; i<=kx.ptr.p_int[2]-1; i++) { perturbation = 0.5*(hqrnduniformr(&rs, _state)-0.5); x2.ptr.p_double[i] = sx*(i+perturbation)/kx.ptr.p_int[2]; } /* * Test calculation on grid */ ae_vector_set_length(&x, nx, _state); ae_vector_set_length(&y, ny, _state); unsetrealarray(&gy, _state); rbfgridcalc3v(&s, &x0, kx.ptr.p_int[0], &x1, kx.ptr.p_int[1], 
&x2, kx.ptr.p_int[2], &gy, _state); for(i=0; i<=kx.ptr.p_int[0]-1; i++) { for(j=0; j<=kx.ptr.p_int[1]-1; j++) { for(k=0; k<=kx.ptr.p_int[2]-1; k++) { x.ptr.p_double[0] = x0.ptr.p_double[i]; x.ptr.p_double[1] = x1.ptr.p_double[j]; x.ptr.p_double[2] = x2.ptr.p_double[k]; rbfcalcbuf(&s, &x, &y, _state); for(l=0; l<=ny-1; l++) { seterrorflag(errorflag, ae_fp_greater(ae_fabs(y.ptr.p_double[l]-gy.ptr.p_double[l+ny*(i+j*kx.ptr.p_int[0]+k*kx.ptr.p_int[0]*kx.ptr.p_int[1])], _state),1.0E-9*sy), _state); } } } } /* * Test calculation on subset of regular grid: * * select sparsity coefficient (from 1.0 to 0.001) * * fill bitmap array * * Test 1: compare full and subset versions * * Test 2: check sparsity. Subset function may perform additional * evaluations because it processes data micro-row by micro-row. * So, we can't check that all elements which were not flagged * are zero - some of them may become non-zero. However, if entire * row is empty, we can reasonably expect (informal guarantee) * that it is not processed. So, we check empty (completely * unflagged) rows * */ sparsity = ae_pow((double)(10), (double)(-hqrnduniformi(&rs, 4, _state)), _state); ae_vector_set_length(&gf, kx.ptr.p_int[0]*kx.ptr.p_int[1]*kx.ptr.p_int[2], _state); ae_vector_set_length(&rf, kx.ptr.p_int[1]*kx.ptr.p_int[2], _state); for(i=0; i<=kx.ptr.p_int[1]*kx.ptr.p_int[2]-1; i++) { rf.ptr.p_bool[i] = ae_false; } for(i=0; i<=kx.ptr.p_int[0]*kx.ptr.p_int[1]*kx.ptr.p_int[2]-1; i++) { gf.ptr.p_bool[i] = ae_fp_less(hqrnduniformr(&rs, _state),sparsity); if( gf.ptr.p_bool[i] ) { rf.ptr.p_bool[i/kx.ptr.p_int[0]] = ae_true; } } ae_vector_set_length(&x, nx, _state); ae_vector_set_length(&y, ny, _state); unsetrealarray(&gy, _state); unsetrealarray(&gy2, _state); rbfgridcalc3vsubset(&s, &x0, kx.ptr.p_int[0], &x1, kx.ptr.p_int[1], &x2, kx.ptr.p_int[2], &gf, &gy, _state); rbfgridcalc3v(&s, &x0, kx.ptr.p_int[0], &x1, kx.ptr.p_int[1], &x2, kx.ptr.p_int[2], &gy2, _state); for(i=0; i<=ny*kx.ptr.p_int[0]*kx.ptr.p_int[1]*kx.ptr.p_int[2]-1; i++) { seterrorflag(errorflag, gf.ptr.p_bool[i/ny]&&ae_fp_greater(ae_fabs(gy.ptr.p_double[i]-gy2.ptr.p_double[i], _state),1.0E-9*sy), _state); seterrorflag(errorflag, !rf.ptr.p_bool[i/(ny*kx.ptr.p_int[0])]&&ae_fp_neq(gy.ptr.p_double[i],0.0), _state); } } ae_frame_leave(_state); } /************************************************************************* Function for testing basic functionality of RBF module with hierarchical algorithm. 
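
The build sequence exercised throughout this test, distilled from the code below (an initialized ae_state and a dataset matrix XY with N rows and NX+NY columns are assumed):

    rbfcreate(nx, ny, &s, _state);
    rbfsetv2bf(&s, bf, _state);                                 basis function 0 or 1
    rbfsetalgohierarchical(&s, rbase, nlayers, 0.0, _state);
    rbfsetlinterm(&s, _state);                                  or rbfsetconstterm()/rbfsetzeroterm()
    rbfsetpoints(&s, &xy, n, _state);
    rbfbuildmodel(&s, &rep, _state);                            rep.terminationtype>0 on success
    rbfcalc(&s, &x, &y, _state);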
-- ALGLIB -- Copyright 20.06.2016 by Bochkanov Sergey *************************************************************************/ static ae_bool testrbfunit_basichrbftest(ae_state *_state) { ae_frame _frame_block; rbfmodel s; rbfmodel s2; rbfreport rep; rbfcalcbuffer tsbuf; ae_int_t nx; ae_int_t ny; ae_int_t linterm; ae_int_t bf; double rbase; ae_int_t nlayers; double errtol; double scalefactor; ae_matrix xy; ae_matrix xytest; ae_matrix uxwr; ae_matrix uv; ae_matrix xy2; ae_vector x; ae_vector y; ae_vector y2; ae_vector xzero; ae_vector yref; ae_vector scalevec; ae_int_t n; ae_int_t ntest; ae_int_t gridsize; ae_int_t i; ae_int_t j; ae_int_t k; double r0; ae_int_t margin; double threshold; double v; ae_int_t functype; ae_int_t densitytype; double width; double lowprec; double highprec; ae_int_t modeltype; ae_int_t shaketype; double maxerr; ae_bool fractionalerror; ae_int_t unx; ae_int_t uny; ae_int_t unc; ae_int_t modelversion; ae_bool hasscale; ae_bool result; ae_frame_make(_state, &_frame_block); _rbfmodel_init(&s, _state); _rbfmodel_init(&s2, _state); _rbfreport_init(&rep, _state); _rbfcalcbuffer_init(&tsbuf, _state); ae_matrix_init(&xy, 0, 0, DT_REAL, _state); ae_matrix_init(&xytest, 0, 0, DT_REAL, _state); ae_matrix_init(&uxwr, 0, 0, DT_REAL, _state); ae_matrix_init(&uv, 0, 0, DT_REAL, _state); ae_matrix_init(&xy2, 0, 0, DT_REAL, _state); ae_vector_init(&x, 0, DT_REAL, _state); ae_vector_init(&y, 0, DT_REAL, _state); ae_vector_init(&y2, 0, DT_REAL, _state); ae_vector_init(&xzero, 0, DT_REAL, _state); ae_vector_init(&yref, 0, DT_REAL, _state); ae_vector_init(&scalevec, 0, DT_REAL, _state); result = ae_false; /* * First test - random problem, ability to build model * which reproduces function value in all points with * good precision. * * We also test properties of the linear term - that * model value in far away points is either constant * or exactly zero (for corresponding kinds of linear * term). * * All dataset points are located in unit cube on * regular grid. We do not use smoothing for this test. 
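* 
* The regular grid itself is produced by decomposing the point index into base-GridSize digits: 
* 
*     GridSize = Round(N^(1/NX)) + 1,    N = GridSize^NX, 
*     XY[i][j] = ((i div GridSize^j) mod GridSize) / (GridSize-1) 
* 
* For NX=2 and the initial N=150 this gives GridSize=13, i.e. a 13x13 = 169-point grid covering the unit square. 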
* * We use/test following functions: * * RBFCalc() * * RBFCalc2() * * RBFCalc3() * * RBFCalcBuf() * * RBFTsCalcBuf() */ errtol = 1.0E-6; for(nx=1; nx<=4; nx++) { for(ny=1; ny<=3; ny++) { /* * problem setup */ n = 150; rbase = 0.33; nlayers = 10; gridsize = ae_round(ae_pow((double)(n), (double)1/(double)nx, _state), _state)+1; linterm = 1+ae_randominteger(3, _state); bf = ae_randominteger(2, _state); n = ae_round(ae_pow((double)(gridsize), (double)(nx), _state), _state); ae_matrix_set_length(&xy, n, nx+ny, _state); ae_assert(gridsize>1, "Assertion failed", _state); ae_assert(ae_fp_eq((double)(n),ae_pow((double)(gridsize), (double)(nx), _state)), "Assertion failed", _state); for(i=0; i<=n-1; i++) { k = i; for(j=0; j<=nx-1; j++) { xy.ptr.pp_double[i][j] = (double)(k%gridsize)/(double)(gridsize-1); k = k/gridsize; } for(j=0; j<=ny-1; j++) { xy.ptr.pp_double[i][nx+j] = ae_randomreal(_state)-0.5; } } /* * Build model */ rbfcreate(nx, ny, &s, _state); rbfsetv2bf(&s, bf, _state); rbfsetalgohierarchical(&s, rbase, nlayers, 0.0, _state); if( linterm==1 ) { rbfsetlinterm(&s, _state); } if( linterm==2 ) { rbfsetconstterm(&s, _state); } if( linterm==3 ) { rbfsetzeroterm(&s, _state); } rbfsetpoints(&s, &xy, n, _state); rbfbuildmodel(&s, &rep, _state); if( rep.terminationtype<=0 ) { seterrorflag(&result, ae_true, _state); ae_frame_leave(_state); return result; } rbfcreatecalcbuffer(&s, &tsbuf, _state); /* * Test ability to reproduce function value * * NOTE: we use RBFCalc(XZero) to guarantee that internal state of * RBF model is "reset" between subsequent calls of different * functions. It allows us to make sure that we have no bug * like function simply returning latest result */ ae_vector_set_length(&x, nx, _state); ae_vector_set_length(&xzero, nx, _state); ae_vector_set_length(&y, ny, _state); for(j=0; j<=nx-1; j++) { xzero.ptr.p_double[j] = (double)(0); } for(i=0; i<=n-1; i++) { for(j=0; j<=nx-1; j++) { x.ptr.p_double[j] = xy.ptr.pp_double[i][j]; } rbfcalc(&s, &xzero, &y, _state); if( ae_fp_greater(ae_randomreal(_state),0.5) ) { ae_vector_set_length(&yref, ny+1, _state); } rbfcalc(&s, &x, &yref, _state); for(j=0; j<=ny-1; j++) { seterrorflag(&result, ae_fp_greater(ae_fabs(yref.ptr.p_double[j]-xy.ptr.pp_double[i][nx+j], _state),errtol), _state); } seterrorflag(&result, yref.cnt!=ny, _state); if( nx==1&&ny==1 ) { rbfcalc(&s, &xzero, &y, _state); seterrorflag(&result, ae_fp_neq(rbfcalc1(&s, x.ptr.p_double[0], _state),yref.ptr.p_double[0]), _state); } if( nx==2&&ny==1 ) { rbfcalc(&s, &xzero, &y, _state); seterrorflag(&result, ae_fp_neq(rbfcalc2(&s, x.ptr.p_double[0], x.ptr.p_double[1], _state),yref.ptr.p_double[0]), _state); } if( nx==3&&ny==1 ) { rbfcalc(&s, &xzero, &y, _state); seterrorflag(&result, ae_fp_neq(rbfcalc3(&s, x.ptr.p_double[0], x.ptr.p_double[1], x.ptr.p_double[2], _state),yref.ptr.p_double[0]), _state); } rbfcalc(&s, &xzero, &y, _state); if( ae_fp_greater(ae_randomreal(_state),0.5) ) { ae_vector_set_length(&y, ny+1, _state); rbfcalcbuf(&s, &x, &y, _state); seterrorflag(&result, y.cnt!=ny+1, _state); } else { ae_vector_set_length(&y, ny-1, _state); rbfcalcbuf(&s, &x, &y, _state); seterrorflag(&result, y.cnt!=ny, _state); } for(j=0; j<=ny-1; j++) { seterrorflag(&result, ae_fp_neq(y.ptr.p_double[j],yref.ptr.p_double[j]), _state); } rbfcalc(&s, &xzero, &y, _state); if( ae_fp_greater(ae_randomreal(_state),0.5) ) { ae_vector_set_length(&y, ny+1, _state); rbftscalcbuf(&s, &tsbuf, &x, &y, _state); seterrorflag(&result, y.cnt!=ny+1, _state); } else { ae_vector_set_length(&y, ny-1, _state); 
rbftscalcbuf(&s, &tsbuf, &x, &y, _state); seterrorflag(&result, y.cnt!=ny, _state); } for(j=0; j<=ny-1; j++) { seterrorflag(&result, ae_fp_neq(y.ptr.p_double[j],yref.ptr.p_double[j]), _state); } } /* * Test that: * a) model with zero linear term is zero far away from dataset * b) model with constant linear term is constant far away from dataset */ ae_vector_set_length(&x, nx, _state); if( linterm==2 ) { for(j=0; j<=nx-1; j++) { if( ae_fp_greater(ae_randomreal(_state),0.5) ) { x.ptr.p_double[j] = 1+1000*rbase; } else { x.ptr.p_double[j] = 0-1000*rbase; } } rbfcalc(&s, &x, &y, _state); for(j=0; j<=nx-1; j++) { if( ae_fp_greater(ae_randomreal(_state),0.5) ) { x.ptr.p_double[j] = 1+1000*rbase; } else { x.ptr.p_double[j] = 0-1000*rbase; } } rbfcalc(&s, &x, &y2, _state); for(j=0; j<=ny-1; j++) { seterrorflag(&result, ae_fp_neq(y.ptr.p_double[j],y2.ptr.p_double[j]), _state); } } if( linterm==3 ) { for(j=0; j<=nx-1; j++) { if( ae_fp_greater(ae_randomreal(_state),0.5) ) { x.ptr.p_double[j] = 1+1000*rbase; } else { x.ptr.p_double[j] = 0-1000*rbase; } } rbfcalc(&s, &x, &y, _state); for(j=0; j<=ny-1; j++) { seterrorflag(&result, ae_fp_neq(y.ptr.p_double[j],(double)(0)), _state); } } } } /* * Test rbfunpack() */ for(nx=1; nx<=2; nx++) { for(ny=1; ny<=2; ny++) { /* * problem setup */ n = 150; rbase = 0.33; nlayers = 5; gridsize = ae_round(ae_pow((double)(n), (double)1/(double)nx, _state), _state)+1; linterm = 1+ae_randominteger(3, _state); bf = ae_randominteger(2, _state); n = ae_round(ae_pow((double)(gridsize), (double)(nx), _state), _state); ae_matrix_set_length(&xy, n, nx+ny, _state); ae_assert(gridsize>1, "Assertion failed", _state); ae_assert(ae_fp_eq((double)(n),ae_pow((double)(gridsize), (double)(nx), _state)), "Assertion failed", _state); for(i=0; i<=n-1; i++) { k = i; for(j=0; j<=nx-1; j++) { xy.ptr.pp_double[i][j] = (double)(k%gridsize)/(double)(gridsize-1); k = k/gridsize; } for(j=0; j<=ny-1; j++) { xy.ptr.pp_double[i][nx+j] = ae_randomreal(_state)-0.5; } } hasscale = ae_randominteger(2, _state)==0; ae_vector_set_length(&scalevec, nx, _state); if( hasscale ) { for(j=0; j<=nx-1; j++) { scalevec.ptr.p_double[j] = ae_pow((double)(2), 2*ae_randomreal(_state)-1, _state); } } /* * Build model */ rbfcreate(nx, ny, &s, _state); rbfsetv2bf(&s, bf, _state); rbfsetalgohierarchical(&s, rbase, nlayers, 0.0, _state); if( linterm==1 ) { rbfsetlinterm(&s, _state); } if( linterm==2 ) { rbfsetconstterm(&s, _state); } if( linterm==3 ) { rbfsetzeroterm(&s, _state); } if( hasscale ) { rbfsetpointsandscales(&s, &xy, n, &scalevec, _state); } else { rbfsetpoints(&s, &xy, n, _state); } rbfbuildmodel(&s, &rep, _state); if( rep.terminationtype<=0 ) { seterrorflag(&result, ae_true, _state); ae_frame_leave(_state); return result; } /* * Test RBFUnpack() */ rbfunpack(&s, &unx, &uny, &uxwr, &unc, &uv, &modelversion, _state); if( modelversion!=2 ) { seterrorflag(&result, ae_true, _state); ae_frame_leave(_state); return result; } if( unx!=nx||uny!=ny ) { seterrorflag(&result, ae_true, _state); ae_frame_leave(_state); return result; } if( uv.cols!=nx+1||uv.rows!=ny ) { seterrorflag(&result, ae_true, _state); ae_frame_leave(_state); return result; } if( linterm==2 ) { for(i=0; i<=ny-1; i++) { for(j=0; j<=nx-1; j++) { seterrorflag(&result, ae_fp_neq(uv.ptr.pp_double[i][j],(double)(0)), _state); } } } if( linterm==3 ) { for(i=0; i<=ny-1; i++) { for(j=0; j<=nx; j++) { seterrorflag(&result, ae_fp_neq(uv.ptr.pp_double[i][j],(double)(0)), _state); } } } ae_vector_set_length(&x, nx, _state); ae_vector_set_length(&y, ny, _state); 
for(i=0; i<=9; i++) { for(j=0; j<=nx-1; j++) { x.ptr.p_double[j] = ae_randomreal(_state); } rbfcalc(&s, &x, &yref, _state); for(j=0; j<=ny-1; j++) { y.ptr.p_double[j] = uv.ptr.pp_double[j][nx]; for(k=0; k<=nx-1; k++) { y.ptr.p_double[j] = y.ptr.p_double[j]+x.ptr.p_double[k]*uv.ptr.pp_double[j][k]; } } for(k=0; k<=unc-1; k++) { v = (double)(0); for(j=0; j<=nx-1; j++) { v = v+ae_sqr(uxwr.ptr.pp_double[k][j]-x.ptr.p_double[j], _state)/ae_sqr(uxwr.ptr.pp_double[k][nx+ny+j], _state); } if( ae_fp_less(v,rbfv2farradius(bf, _state)*rbfv2farradius(bf, _state)) ) { v = rbfv2basisfunc(bf, v, _state); } else { v = (double)(0); } for(j=0; j<=ny-1; j++) { y.ptr.p_double[j] = y.ptr.p_double[j]+v*uxwr.ptr.pp_double[k][nx+j]; } } for(j=0; j<=ny-1; j++) { seterrorflag(&result, ae_fp_greater(ae_fabs(y.ptr.p_double[j]-yref.ptr.p_double[j], _state),1.0E-9), _state); } } } } /* * Test that smooth 1D function is reproduced (between nodes) * with good precision. We test two model types: model with * three layers and moderate initial radius, and model with * large initial radius and large number of layers. * * This test: * * generates test function on [-Width,+Width]. Two sets of * nodes are generated - "model" ones and "test" ones. * * builds RBF model using "model" dataset * * test model using "test" dataset. Test points are more * dense and are spread in [-0.9*Width, +0.9*Width] (reduced * interval is used because RBF models are too bad near the * boundaries). * * NOTE: we calculate maximum error for given function type * and grid density over all modifications of the task, * and only after that we perform comparison with tolerance * level. It allows easier debugging. */ for(functype=0; functype<=2; functype++) { for(densitytype=0; densitytype<=1; densitytype++) { /* * Select tolerance */ if( functype==0 ) { lowprec = 1.0E-2; highprec = 1.0E-3; } else { if( functype==1 ) { lowprec = 1.0E-1; highprec = 1.0E-2; } else { if( functype==2 ) { lowprec = 1.0E-3; highprec = 1.0E-4; } else { ae_assert(ae_false, "Assertion failed", _state); } } } if( densitytype==0 ) { errtol = lowprec; } else { errtol = highprec; } /* * Test */ maxerr = (double)(0); for(modeltype=0; modeltype<=1; modeltype++) { for(shaketype=0; shaketype<=1; shaketype++) { /* * Generate grid */ if( functype==0 ) { /* * sin(x) on [-2*pi,+2*pi] */ n = 17*ae_round(ae_pow((double)(4), (double)(densitytype), _state), _state); width = ae_pi; fractionalerror = ae_false; } else { if( functype==1 ) { /* * exp(x) on [-3,+3] */ n = 50*ae_round(ae_pow((double)(4), (double)(densitytype), _state), _state); width = (double)(3); fractionalerror = ae_true; } else { if( functype==2 ) { /* * 1/(1+x^2) on [-3,+3] */ n = 20*ae_round(ae_pow((double)(4), (double)(densitytype), _state), _state); width = (double)(3); fractionalerror = ae_false; } else { ae_assert(ae_false, "Assertion failed", _state); } } } ae_matrix_set_length(&xy, n, 2, _state); for(i=0; i<=n-1; i++) { v = shaketype*0.25*(ae_randomreal(_state)-0.5); v = (i+v)/(n-1); v = 2*v-1; xy.ptr.pp_double[i][0] = width*v; } ntest = n*10; ae_matrix_set_length(&xytest, ntest, 2, _state); for(i=0; i<=ntest-1; i++) { xytest.ptr.pp_double[i][0] = 0.9*width*((double)(2*i)/(double)(ntest-1)-1); } /* * Evaluate function */ if( functype==0 ) { /* * sin(x) */ for(i=0; i<=n-1; i++) { xy.ptr.pp_double[i][1] = ae_sin(xy.ptr.pp_double[i][0], _state); } for(i=0; i<=ntest-1; i++) { xytest.ptr.pp_double[i][1] = ae_sin(xytest.ptr.pp_double[i][0], _state); } } else { if( functype==1 ) { /* * exp(x) */ for(i=0; i<=n-1; i++) { 
xy.ptr.pp_double[i][1] = ae_exp(xy.ptr.pp_double[i][0], _state); } for(i=0; i<=ntest-1; i++) { xytest.ptr.pp_double[i][1] = ae_exp(xytest.ptr.pp_double[i][0], _state); } } else { if( functype==2 ) { /* * 1/(1+x^2) */ for(i=0; i<=n-1; i++) { xy.ptr.pp_double[i][1] = 1/(1+ae_sqr(xy.ptr.pp_double[i][0], _state)); } for(i=0; i<=ntest-1; i++) { xytest.ptr.pp_double[i][1] = 1/(1+ae_sqr(xytest.ptr.pp_double[i][0], _state)); } } else { ae_assert(ae_false, "Assertion failed", _state); } } } /* * Select model properties and precision */ if( modeltype==0 ) { rbase = 4.0*(2*width/n); nlayers = 3; } else { if( modeltype==1 ) { rbase = 16.0*(2*width/n); nlayers = 6; } else { ae_assert(ae_false, "Assertion failed", _state); } } /* * Build model */ bf = ae_randominteger(2, _state); rbfcreate(1, 1, &s, _state); rbfsetv2bf(&s, bf, _state); rbfsetalgohierarchical(&s, rbase, nlayers, 0.0, _state); rbfsetpoints(&s, &xy, n, _state); rbfbuildmodel(&s, &rep, _state); if( rep.terminationtype<=0 ) { seterrorflag(&result, ae_true, _state); ae_frame_leave(_state); return result; } /* * Check */ ae_vector_set_length(&x, 1, _state); for(i=0; i<=ntest-1; i++) { x.ptr.p_double[0] = xytest.ptr.pp_double[i][0]; rbfcalc(&s, &x, &y, _state); if( fractionalerror ) { maxerr = ae_maxreal(maxerr, ae_fabs(y.ptr.p_double[0]-xytest.ptr.pp_double[i][1], _state)/ae_fabs(xytest.ptr.pp_double[i][1], _state), _state); } else { maxerr = ae_maxreal(maxerr, ae_fabs(y.ptr.p_double[0]-xytest.ptr.pp_double[i][1], _state), _state); } } } } /* * Check error */ seterrorflag(&result, ae_fp_greater(maxerr,errtol), _state); } } /* * Scaling test - random problem, we test that after * scaling of all variables and radius by 2^K (for some K) * we will get exactly same results (up to the last bit of * mantissa). * * It is very strong requirement for algorithm stability, * but it is satisfiable in most implementations of RBFs, * because all operations involving spatial values are usually * followed by division by radius, and using multiplier which * is exactly power of 2 results in no changes in numbers * being returned. * * It allows to test different scale-related bugs * (say, situation when deep in kd-tree search code we compare * against R instead of R^2). * * All dataset points are located in unit cube on * regular grid. * * We do not use smoothing for this test. 
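* 
* A sketch of the invariance checked here (XY2 is XY with its NX coordinate columns multiplied by ScaleFactor, an exact power of 2): 
* 
*     rbfsetalgohierarchical(&s,  rbase,             nlayers, 0.0, _state); 
*     rbfsetpoints(&s,  &xy,  n, _state); 
*     rbfbuildmodel(&s,  &rep, _state); 
*     rbfsetalgohierarchical(&s2, rbase*scalefactor, nlayers, 0.0, _state); 
*     rbfsetpoints(&s2, &xy2, n, _state); 
*     rbfbuildmodel(&s2, &rep, _state); 
* 
* For every dataset point, rbfcalc() on S at X and rbfcalc() on S2 at X*ScaleFactor must agree exactly (the comparison uses ae_fp_neq, no tolerance). 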
*/ for(nx=1; nx<=4; nx++) { /* * problem setup */ n = 150; rbase = 0.33; nlayers = ae_randominteger(4, _state); scalefactor = ae_pow((double)(1024), (double)(2*ae_randominteger(2, _state)-1), _state); gridsize = ae_round(ae_pow((double)(n), (double)1/(double)nx, _state), _state)+1; ny = 1+ae_randominteger(3, _state); linterm = 1+ae_randominteger(3, _state); bf = ae_randominteger(2, _state); n = ae_round(ae_pow((double)(gridsize), (double)(nx), _state), _state); ae_matrix_set_length(&xy, n, nx+ny, _state); ae_matrix_set_length(&xy2, n, nx+ny, _state); ae_assert(gridsize>1, "Assertion failed", _state); ae_assert(ae_fp_eq((double)(n),ae_pow((double)(gridsize), (double)(nx), _state)), "Assertion failed", _state); for(i=0; i<=n-1; i++) { k = i; for(j=0; j<=nx-1; j++) { xy.ptr.pp_double[i][j] = (double)(k%gridsize)/(double)(gridsize-1); xy2.ptr.pp_double[i][j] = xy.ptr.pp_double[i][j]*scalefactor; k = k/gridsize; } for(j=0; j<=ny-1; j++) { xy.ptr.pp_double[i][nx+j] = ae_randomreal(_state)-0.5; xy2.ptr.pp_double[i][nx+j] = xy.ptr.pp_double[i][nx+j]; } } /* * Build model 1 */ rbfcreate(nx, ny, &s, _state); rbfsetv2bf(&s, bf, _state); rbfsetalgohierarchical(&s, rbase, nlayers, 0.0, _state); if( linterm==1 ) { rbfsetlinterm(&s, _state); } if( linterm==2 ) { rbfsetconstterm(&s, _state); } if( linterm==3 ) { rbfsetzeroterm(&s, _state); } rbfsetpoints(&s, &xy, n, _state); rbfbuildmodel(&s, &rep, _state); if( rep.terminationtype<=0 ) { seterrorflag(&result, ae_true, _state); ae_frame_leave(_state); return result; } /* * Build model 2 */ rbfcreate(nx, ny, &s2, _state); rbfsetv2bf(&s2, bf, _state); rbfsetalgohierarchical(&s2, rbase*scalefactor, nlayers, 0.0, _state); if( linterm==1 ) { rbfsetlinterm(&s2, _state); } if( linterm==2 ) { rbfsetconstterm(&s2, _state); } if( linterm==3 ) { rbfsetzeroterm(&s2, _state); } rbfsetpoints(&s2, &xy2, n, _state); rbfbuildmodel(&s2, &rep, _state); if( rep.terminationtype<=0 ) { seterrorflag(&result, ae_true, _state); ae_frame_leave(_state); return result; } /* * Compare models */ ae_vector_set_length(&x, nx, _state); ae_vector_set_length(&y, ny, _state); ae_vector_set_length(&y2, ny, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=nx-1; j++) { x.ptr.p_double[j] = xy.ptr.pp_double[i][j]; } rbfcalc(&s, &x, &y, _state); for(j=0; j<=nx-1; j++) { x.ptr.p_double[j] = xy2.ptr.pp_double[i][j]; } rbfcalc(&s2, &x, &y2, _state); for(j=0; j<=ny-1; j++) { seterrorflag(&result, ae_fp_neq(y.ptr.p_double[j],y2.ptr.p_double[j]), _state); } } } /* * Test that passing scaled dataset automatically results in V2 model * being built (even when algorithm type is not set explicitly). */ for(nx=1; nx<=4; nx++) { for(ny=1; ny<=3; ny++) { /* * problem setup */ n = 10; ae_matrix_set_length(&xy, n, nx+ny, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=nx+ny-1; j++) { xy.ptr.pp_double[i][j] = ae_randomreal(_state)-0.5; } } ae_vector_set_length(&scalevec, nx, _state); for(j=0; j<=nx-1; j++) { scalevec.ptr.p_double[j] = ae_pow((double)(2), 2*ae_randomreal(_state)-1, _state); } /* * prepare test problem */ rbfcreate(nx, ny, &s, _state); rbfsetpointsandscales(&s, &xy, xy.rows, &scalevec, _state); /* * Build model, check model version */ rbfbuildmodel(&s, &rep, _state); if( rep.terminationtype<=0 ) { seterrorflag(&result, ae_true, _state); ae_frame_leave(_state); return result; } seterrorflag(&result, rbfgetmodelversion(&s, _state)!=2, _state); } } /* * Test smoothing properties of RBF model: model with just one layer * and large initial radius will produce "average" value of neighbors. 
* * In order to test it we create regular grid, fill it with regular * +1/-1 pattern, and test model values in the inner points. Model * values should be around zero (we use handcrafted threshold to test * it). Radius is chosen to be several times larger than grid step. * * We perform test for 2D model, because same behavior is expected * regardless of dimensionality. */ r0 = (double)(6); margin = 10; threshold = 0.005; gridsize = 2*margin+10; nx = 2; ny = 1; for(bf=0; bf<=1; bf++) { rbfcreate(nx, ny, &s, _state); rbfsetv2bf(&s, bf, _state); rbfsetalgohierarchical(&s, r0, 1, 1.0E-1, _state); rbfsetzeroterm(&s, _state); ae_matrix_set_length(&xy, gridsize*gridsize, nx+ny, _state); for(i=0; i<=gridsize-1; i++) { for(j=0; j<=gridsize-1; j++) { xy.ptr.pp_double[i*gridsize+j][0] = (double)(i); xy.ptr.pp_double[i*gridsize+j][1] = (double)(j); xy.ptr.pp_double[i*gridsize+j][2] = 0.01*(ae_randomreal(_state)-0.5)+(2*((i+j)%2)-1); } } rbfsetpoints(&s, &xy, gridsize*gridsize, _state); rbfbuildmodel(&s, &rep, _state); if( rep.terminationtype<=0 ) { seterrorflag(&result, ae_true, _state); ae_frame_leave(_state); return result; } v = 0.0; for(i=margin; i<=gridsize-margin-1; i++) { for(j=margin; j<=gridsize-margin-1; j++) { v = ae_maxreal(v, ae_fabs(rbfcalc2(&s, (double)(i), (double)(j), _state), _state), _state); } } seterrorflag(&result, ae_fp_greater(v,threshold), _state); } ae_frame_leave(_state); return result; } /************************************************************************* Function for testing scaling-related functionality of RBF module with hierarchical algorithm. Returns True on failure (error flag is set). -- ALGLIB -- Copyright 20.06.2016 by Bochkanov Sergey *************************************************************************/ static ae_bool testrbfunit_scaledhrbftest(ae_state *_state) { ae_frame _frame_block; rbfmodel s; rbfmodel s2; rbfreport rep; rbfcalcbuffer tsbuf; ae_int_t nx; ae_int_t ny; ae_int_t linterm; ae_int_t bf; double rbase; ae_int_t nlayers; double errtol; ae_matrix xy; ae_matrix xy2; ae_vector x; ae_vector y; ae_vector y2; ae_vector xzero; ae_vector yref; ae_vector scalex; ae_vector scaley; ae_vector c0; ae_vector c1; ae_int_t n; ae_int_t gridsize; ae_int_t i; ae_int_t j; ae_int_t k; ae_int_t strictness; double lambdav; ae_bool result; ae_frame_make(_state, &_frame_block); _rbfmodel_init(&s, _state); _rbfmodel_init(&s2, _state); _rbfreport_init(&rep, _state); _rbfcalcbuffer_init(&tsbuf, _state); ae_matrix_init(&xy, 0, 0, DT_REAL, _state); ae_matrix_init(&xy2, 0, 0, DT_REAL, _state); ae_vector_init(&x, 0, DT_REAL, _state); ae_vector_init(&y, 0, DT_REAL, _state); ae_vector_init(&y2, 0, DT_REAL, _state); ae_vector_init(&xzero, 0, DT_REAL, _state); ae_vector_init(&yref, 0, DT_REAL, _state); ae_vector_init(&scalex, 0, DT_REAL, _state); ae_vector_init(&scaley, 0, DT_REAL, _state); ae_vector_init(&c0, 0, DT_REAL, _state); ae_vector_init(&c1, 0, DT_REAL, _state); result = ae_false; /* * First test - random problem, test that using scaling * does not change model significantly (except for * rounding-related errors). * * We test two kinds of scaling: * * "strict", which is scaling by some power of 2, and * which should result in bit-to-bit equivalence of results * * "non-strict", which is scaling by random number, and * which should result in approximate equivalence * * We also apply scaling to Y, in order to test that it * is correctly handled too. 
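* 
* A sketch of the comparison performed below (XY2 is XY with coordinate column J multiplied by ScaleX[J] and target column J multiplied by ScaleY[J]): 
* 
*     rbfsetpoints(&s, &xy, n, _state); 
*     rbfbuildmodel(&s, &rep, _state); 
*     rbfsetpointsandscales(&s2, &xy2, n, &scalex, _state); 
*     rbfbuildmodel(&s2, &rep, _state); 
*     rbfcalc(&s,  &x,       &y,  _state); 
*     rbfcalc(&s2, &xscaled, &y2, _state);        where xscaled[j] = x[j]*scalex[j] 
* 
* and Y2[j]/ScaleY[j] must reproduce Y[j]: exactly in the "strict" (power-of-2) branch, to within 1.0E-3 otherwise. 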
*/ for(strictness=0; strictness<=1; strictness++) { for(nx=1; nx<=2; nx++) { for(ny=1; ny<=2; ny++) { /* * problem setup */ if( strictness==1 ) { errtol = (double)(0); ae_vector_set_length(&scalex, nx, _state); for(i=0; i<=nx-1; i++) { scalex.ptr.p_double[i] = ae_pow((double)(16), (double)(ae_randominteger(3, _state)-1), _state); } ae_vector_set_length(&scaley, ny, _state); for(i=0; i<=ny-1; i++) { scaley.ptr.p_double[i] = ae_pow((double)(16), (double)(ae_randominteger(3, _state)-1), _state); } } else { errtol = 1.0E-3; ae_vector_set_length(&scalex, nx, _state); for(i=0; i<=nx-1; i++) { scalex.ptr.p_double[i] = ae_pow((double)(4), 2*ae_randomreal(_state)-1, _state); } ae_vector_set_length(&scaley, ny, _state); for(i=0; i<=ny-1; i++) { scaley.ptr.p_double[i] = ae_pow((double)(4), 2*ae_randomreal(_state)-1, _state); } } n = 150; rbase = 0.33; nlayers = 2; gridsize = ae_round(ae_pow((double)(n), (double)1/(double)nx, _state), _state)+1; linterm = 1+ae_randominteger(3, _state); bf = ae_randominteger(2, _state); lambdav = 1.0E-3*ae_randominteger(2, _state); n = ae_round(ae_pow((double)(gridsize), (double)(nx), _state), _state); ae_matrix_set_length(&xy, n, nx+ny, _state); ae_assert(gridsize>1, "Assertion failed", _state); ae_assert(ae_fp_eq((double)(n),ae_pow((double)(gridsize), (double)(nx), _state)), "Assertion failed", _state); ae_vector_set_length(&c0, nx, _state); for(j=0; j<=nx-1; j++) { c0.ptr.p_double[j] = ae_randomreal(_state)-0.5; } ae_vector_set_length(&c1, ny, _state); for(j=0; j<=ny-1; j++) { c1.ptr.p_double[j] = ae_randomreal(_state)-0.5; } for(i=0; i<=n-1; i++) { k = i; for(j=0; j<=nx-1; j++) { xy.ptr.pp_double[i][j] = (double)(k%gridsize)/(double)(gridsize-1); k = k/gridsize; } for(j=0; j<=ny-1; j++) { xy.ptr.pp_double[i][nx+j] = (double)(0); for(k=0; k<=nx-1; k++) { xy.ptr.pp_double[i][nx+j] = xy.ptr.pp_double[i][nx+j]+c0.ptr.p_double[k]*ae_cos(ae_pi*(1+k)*xy.ptr.pp_double[i][k], _state); } xy.ptr.pp_double[i][nx+j] = c1.ptr.p_double[j]*xy.ptr.pp_double[i][nx+j]; } } ae_matrix_set_length(&xy2, n, nx+ny, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=nx-1; j++) { xy2.ptr.pp_double[i][j] = xy.ptr.pp_double[i][j]*scalex.ptr.p_double[j]; } for(j=0; j<=ny-1; j++) { xy2.ptr.pp_double[i][nx+j] = xy.ptr.pp_double[i][nx+j]*scaley.ptr.p_double[j]; } } /* * Build models */ rbfcreate(nx, ny, &s, _state); rbfsetv2bf(&s, bf, _state); rbfsetalgohierarchical(&s, rbase, nlayers, lambdav, _state); if( linterm==1 ) { rbfsetlinterm(&s, _state); } if( linterm==2 ) { rbfsetconstterm(&s, _state); } if( linterm==3 ) { rbfsetzeroterm(&s, _state); } rbfsetpoints(&s, &xy, n, _state); rbfbuildmodel(&s, &rep, _state); if( rep.terminationtype<=0 ) { seterrorflag(&result, ae_true, _state); ae_frame_leave(_state); return result; } rbfcreate(nx, ny, &s2, _state); rbfsetv2bf(&s2, bf, _state); rbfsetalgohierarchical(&s2, rbase, nlayers, lambdav, _state); if( linterm==1 ) { rbfsetlinterm(&s2, _state); } if( linterm==2 ) { rbfsetconstterm(&s2, _state); } if( linterm==3 ) { rbfsetzeroterm(&s2, _state); } rbfsetpointsandscales(&s2, &xy2, n, &scalex, _state); rbfbuildmodel(&s2, &rep, _state); if( rep.terminationtype<=0 ) { seterrorflag(&result, ae_true, _state); ae_frame_leave(_state); return result; } /* * Compare model values in grid points */ ae_vector_set_length(&x, nx, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=nx-1; j++) { x.ptr.p_double[j] = xy.ptr.pp_double[i][j]; } rbfcalc(&s, &x, &y, _state); for(j=0; j<=nx-1; j++) { x.ptr.p_double[j] = xy2.ptr.pp_double[i][j]; } rbfcalc(&s2, &x, &y2, _state); for(j=0; 
j<=ny-1; j++) { seterrorflag(&result, ae_fp_greater(ae_fabs(y.ptr.p_double[j]-y2.ptr.p_double[j]/scaley.ptr.p_double[j], _state),errtol), _state); } } /* * Compare model values in random points */ ae_vector_set_length(&x, nx, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=nx-1; j++) { x.ptr.p_double[j] = ae_randomreal(_state); } rbfcalc(&s, &x, &y, _state); for(j=0; j<=nx-1; j++) { x.ptr.p_double[j] = x.ptr.p_double[j]*scalex.ptr.p_double[j]; } rbfcalc(&s2, &x, &y2, _state); for(j=0; j<=ny-1; j++) { seterrorflag(&result, ae_fp_greater(ae_fabs(y.ptr.p_double[j]-y2.ptr.p_double[j]/scaley.ptr.p_double[j], _state),errtol), _state); } } } } } ae_frame_leave(_state); return result; } /************************************************************************* Test special properties of hierarchical RBFs. Returns True on errors. -- ALGLIB -- Copyright 20.06.2016 by Bochkanov Sergey *************************************************************************/ static ae_bool testrbfunit_spechrbftest(ae_state *_state) { double d2; double vref; double vfunc; double maxerr; ae_bool result; result = ae_false; /* * Test properties of RBF basis functions - we compare values * returned by RBFV2BasisFunc() against analytic expressions * which are approximately modeled by RBFV2BasisFunc(). */ d2 = 0.0; while(ae_fp_less(d2,100.0)) { vref = ae_exp(-d2, _state); vfunc = rbfv2basisfunc(0, d2, _state); seterrorflag(&result, ae_fp_greater(ae_fabs(vref-vfunc, _state),1.0E-9*vref), _state); d2 = d2+(double)1/(double)64; } d2 = 0.0; maxerr = 0.0; while(ae_fp_less(d2,16.0)) { vref = ae_maxreal(1-d2/9, (double)(0), _state)*ae_exp(-d2, _state); vfunc = rbfv2basisfunc(1, d2, _state); maxerr = ae_maxreal(maxerr, ae_fabs(vref-vfunc, _state), _state); seterrorflag(&result, ae_fp_greater(ae_fabs(vref-vfunc, _state),0.005), _state); d2 = d2+(double)1/(double)64; } return result; } /************************************************************************* Test gridded evaluation of hierarchical RBFs. Returns True on errors. 
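
A sketch of the 2D gridded evaluation checked first (X0 and X1 are sorted arrays of N0 and N1 grid coordinates; names are illustrative):

    rbfgridcalc2v(&s, &x0, n0, &x1, n1, &yv, _state);

Component I of the model value at node (I0,I1) is stored at YV[ I + NY*(I0 + I1*N0) ], and every entry must agree with a pointwise rbfcalc() at the same node to within 1.0E-12. The legacy rbfgridcalc2() is compared against it as well (for NY>1 the test expects it to return zeros), and the 3D case is handled analogously with rbfgridcalc3v()/rbfgridcalc3vsubset().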
-- ALGLIB -- Copyright 20.06.2016 by Bochkanov Sergey *************************************************************************/ static ae_bool testrbfunit_gridhrbftest(ae_state *_state) { ae_frame _frame_block; rbfmodel s; rbfreport rep; ae_int_t linterm; ae_int_t bf; double rbase; ae_int_t nlayers; ae_int_t nx; ae_int_t ny; ae_bool hasscale; double errtol; ae_int_t n; ae_matrix xy; ae_matrix xy2; ae_matrix y2; ae_vector x0; ae_vector x1; ae_vector x2; ae_vector x02; ae_vector x12; ae_vector x22; ae_vector scalevec; ae_vector scalevec2; ae_vector needy; ae_vector rowflags; ae_int_t n0; ae_int_t n1; ae_int_t n2; ae_int_t nkind; double v; ae_int_t i; ae_int_t j; ae_int_t k; ae_int_t i0; ae_int_t i1; ae_int_t i2; ae_vector x; ae_vector y; ae_vector yv; ae_vector yv2; double scalefactor; double lambdav; ae_bool result; ae_frame_make(_state, &_frame_block); _rbfmodel_init(&s, _state); _rbfreport_init(&rep, _state); ae_matrix_init(&xy, 0, 0, DT_REAL, _state); ae_matrix_init(&xy2, 0, 0, DT_REAL, _state); ae_matrix_init(&y2, 0, 0, DT_REAL, _state); ae_vector_init(&x0, 0, DT_REAL, _state); ae_vector_init(&x1, 0, DT_REAL, _state); ae_vector_init(&x2, 0, DT_REAL, _state); ae_vector_init(&x02, 0, DT_REAL, _state); ae_vector_init(&x12, 0, DT_REAL, _state); ae_vector_init(&x22, 0, DT_REAL, _state); ae_vector_init(&scalevec, 0, DT_REAL, _state); ae_vector_init(&scalevec2, 0, DT_REAL, _state); ae_vector_init(&needy, 0, DT_BOOL, _state); ae_vector_init(&rowflags, 0, DT_BOOL, _state); ae_vector_init(&x, 0, DT_REAL, _state); ae_vector_init(&y, 0, DT_REAL, _state); ae_vector_init(&yv, 0, DT_REAL, _state); ae_vector_init(&yv2, 0, DT_REAL, _state); result = ae_false; /* * Test 2-dimensional grid calculation */ errtol = 1.0E-12; nx = 2; for(ny=1; ny<=4; ny++) { for(nkind=0; nkind<=2; nkind++) { /* * problem setup */ n = 150; rbase = 0.10; nlayers = ae_randominteger(3, _state); linterm = 1+ae_randominteger(3, _state); lambdav = 1.0E-3*ae_randominteger(2, _state); bf = ae_randominteger(2, _state); ae_matrix_set_length(&xy, n, nx+ny, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=nx-1; j++) { xy.ptr.pp_double[i][j] = ae_randomreal(_state); } for(j=0; j<=ny-1; j++) { xy.ptr.pp_double[i][nx+j] = ae_randomreal(_state)-0.5; } } hasscale = ae_randominteger(2, _state)==0; if( hasscale ) { ae_vector_set_length(&scalevec, nx, _state); for(j=0; j<=nx-1; j++) { scalevec.ptr.p_double[j] = ae_pow((double)(2), 2*ae_randomreal(_state)-1, _state); } } /* * Build model */ rbfcreate(nx, ny, &s, _state); rbfsetv2bf(&s, bf, _state); rbfsetalgohierarchical(&s, rbase, nlayers, lambdav, _state); if( linterm==1 ) { rbfsetlinterm(&s, _state); } if( linterm==2 ) { rbfsetconstterm(&s, _state); } if( linterm==3 ) { rbfsetzeroterm(&s, _state); } if( hasscale ) { rbfsetpointsandscales(&s, &xy, n, &scalevec, _state); } else { rbfsetpoints(&s, &xy, n, _state); } rbfbuildmodel(&s, &rep, _state); if( rep.terminationtype<=0 ) { seterrorflag(&result, ae_true, _state); ae_frame_leave(_state); return result; } /* * Prepare grid to test */ n0 = 1+ae_randominteger(50, _state); n1 = 1+ae_randominteger(50, _state); if( nkind==1 ) { k = ae_randominteger(2, _state); if( k==0 ) { n0 = 1; } if( k==1 ) { n1 = 1; } } else { if( nkind==2 ) { n0 = 1; n1 = 1; } else { ae_assert(nkind==0, "Assertion failed", _state); } } ae_vector_set_length(&x0, n0, _state); x0.ptr.p_double[0] = ae_randomreal(_state); for(i=1; i<=n0-1; i++) { x0.ptr.p_double[i] = x0.ptr.p_double[i-1]+ae_randomreal(_state)/n0; } ae_vector_set_length(&x1, n1, _state); x1.ptr.p_double[0] = 
ae_randomreal(_state); for(i=1; i<=n1-1; i++) { x1.ptr.p_double[i] = x1.ptr.p_double[i-1]+ae_randomreal(_state)/n1; } ae_vector_set_length(&needy, n0*n1, _state); v = ae_pow((double)(10), -3*ae_randomreal(_state), _state); for(i=0; i<=n0*n1-1; i++) { needy.ptr.p_bool[i] = ae_fp_less(ae_randomreal(_state),v); } /* * Test at grid */ ae_vector_set_length(&x, nx, _state); rbfgridcalc2v(&s, &x0, n0, &x1, n1, &yv, _state); for(i0=0; i0<=n0-1; i0++) { for(i1=0; i1<=n1-1; i1++) { x.ptr.p_double[0] = x0.ptr.p_double[i0]; x.ptr.p_double[1] = x1.ptr.p_double[i1]; rbfcalc(&s, &x, &y, _state); for(i=0; i<=ny-1; i++) { seterrorflag(&result, ae_fp_greater(ae_fabs(y.ptr.p_double[i]-yv.ptr.p_double[i+ny*(i0+i1*n0)], _state),errtol), _state); } } } /* * Test calculation on subset of regular grid: * * Test 1: compare full and subset versions * * Test 2: check sparsity. Subset function may perform additional * evaluations because it processes data micro-row by micro-row. * So, we can't check that all elements which were not flagged * are zero - some of them may become non-zero. However, if entire * row is empty, we can reasonably expect (informal guarantee) * that it is not processed. So, we check empty (completely * unflagged) rows * */ unsetrealarray(&yv2, _state); rbfgridcalc2vsubset(&s, &x0, n0, &x1, n1, &needy, &yv2, _state); for(i=0; i<=ny*n0*n1-1; i++) { seterrorflag(&result, needy.ptr.p_bool[i/ny]&&ae_fp_greater(ae_fabs(yv.ptr.p_double[i]-yv2.ptr.p_double[i], _state),errtol), _state); } /* * Test legacy function */ rbfgridcalc2(&s, &x0, n0, &x1, n1, &y2, _state); for(i=0; i<=n0*n1-1; i++) { if( ny==1 ) { seterrorflag(&result, ae_fp_greater(ae_fabs(yv.ptr.p_double[i]-y2.ptr.pp_double[i%n0][i/n0], _state),errtol), _state); } else { seterrorflag(&result, ae_fp_neq(y2.ptr.pp_double[i%n0][i/n0],(double)(0)), _state); } } /* * Test that scaling RBase, XY, X0, X1 by some power of 2 * does not change values at grid (quite a strict requirement, but * ALGLIB implementation of RBF may deal with it). */ scalefactor = ae_pow((double)(1024), (double)(2*ae_randominteger(2, _state)-1), _state); ae_matrix_set_length(&xy2, n, nx+ny, _state); ae_vector_set_length(&x02, n0, _state); ae_vector_set_length(&x12, n1, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=nx-1; j++) { xy2.ptr.pp_double[i][j] = xy.ptr.pp_double[i][j]*scalefactor; } for(j=0; j<=ny-1; j++) { xy2.ptr.pp_double[i][nx+j] = xy.ptr.pp_double[i][nx+j]; } } for(i=0; i<=n0-1; i++) { x02.ptr.p_double[i] = x0.ptr.p_double[i]*scalefactor; } for(i=0; i<=n1-1; i++) { x12.ptr.p_double[i] = x1.ptr.p_double[i]*scalefactor; } if( hasscale ) { rbfsetpointsandscales(&s, &xy2, n, &scalevec, _state); } else { rbfsetpoints(&s, &xy2, n, _state); } rbfsetv2bf(&s, bf, _state); rbfsetalgohierarchical(&s, rbase*scalefactor, nlayers, lambdav, _state); rbfbuildmodel(&s, &rep, _state); if( rep.terminationtype<=0 ) { seterrorflag(&result, ae_true, _state); ae_frame_leave(_state); return result; } unsetrealarray(&yv2, _state); rbfgridcalc2v(&s, &x02, n0, &x12, n1, &yv2, _state); for(i=0; i<=ny*n0*n1-1; i++) { seterrorflag(&result, ae_fp_neq(yv.ptr.p_double[i],yv2.ptr.p_double[i]), _state); } /* * Test that scaling RBase and scale vector by some power of 2 * (increase RBase and decreasing scale, or vice versa) does not * change values at grid (quite a strict requirement, but * ALGLIB implementation of RBF may deal with it). 
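* 
* Sketch of this check (ScaleFactor is a power of 2; ScaleVec2 is the per-dimension scale vector divided by ScaleFactor, with 1.0 substituted when no user scaling was set): 
* 
*     rbfsetpointsandscales(&s, &xy, n, &scalevec2, _state); 
*     rbfsetalgohierarchical(&s, rbase*scalefactor, nlayers, lambdav, _state); 
*     rbfbuildmodel(&s, &rep, _state); 
*     rbfgridcalc2v(&s, &x0, n0, &x1, n1, &yv2, _state); 
* 
* and YV2 must be bit-identical to the YV computed before rescaling (ae_fp_neq is used, no tolerance at all). 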
*/ scalefactor = ae_pow((double)(1024), (double)(2*ae_randominteger(2, _state)-1), _state); ae_vector_set_length(&scalevec2, nx, _state); for(i=0; i<=nx-1; i++) { if( hasscale ) { scalevec2.ptr.p_double[i] = scalevec.ptr.p_double[i]; } else { scalevec2.ptr.p_double[i] = 1.0; } scalevec2.ptr.p_double[i] = scalevec2.ptr.p_double[i]/scalefactor; } rbfsetpointsandscales(&s, &xy, n, &scalevec2, _state); rbfsetv2bf(&s, bf, _state); rbfsetalgohierarchical(&s, rbase*scalefactor, nlayers, lambdav, _state); rbfbuildmodel(&s, &rep, _state); if( rep.terminationtype<=0 ) { seterrorflag(&result, ae_true, _state); ae_frame_leave(_state); return result; } unsetrealarray(&yv2, _state); rbfgridcalc2v(&s, &x0, n0, &x1, n1, &yv2, _state); for(i=0; i<=ny*n0*n1-1; i++) { seterrorflag(&result, ae_fp_neq(yv.ptr.p_double[i],yv2.ptr.p_double[i]), _state); } } } /* * Test 3-dimensional grid calculation */ errtol = 1.0E-12; nx = 3; for(ny=1; ny<=4; ny++) { for(nkind=0; nkind<=2; nkind++) { /* * problem setup */ n = 150; rbase = 0.10; lambdav = 1.0E-3*ae_randominteger(2, _state); nlayers = ae_randominteger(3, _state); linterm = 1+ae_randominteger(3, _state); bf = ae_randominteger(2, _state); ae_matrix_set_length(&xy, n, nx+ny, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=nx-1; j++) { xy.ptr.pp_double[i][j] = ae_randomreal(_state); } for(j=0; j<=ny-1; j++) { xy.ptr.pp_double[i][nx+j] = ae_randomreal(_state)-0.5; } } hasscale = ae_randominteger(2, _state)==0; if( hasscale ) { ae_vector_set_length(&scalevec, nx, _state); for(j=0; j<=nx-1; j++) { scalevec.ptr.p_double[j] = ae_pow((double)(2), 2*ae_randomreal(_state)-1, _state); } } /* * Build model */ rbfcreate(nx, ny, &s, _state); rbfsetv2bf(&s, bf, _state); rbfsetalgohierarchical(&s, rbase, nlayers, lambdav, _state); if( linterm==1 ) { rbfsetlinterm(&s, _state); } if( linterm==2 ) { rbfsetconstterm(&s, _state); } if( linterm==3 ) { rbfsetzeroterm(&s, _state); } if( hasscale ) { rbfsetpointsandscales(&s, &xy, n, &scalevec, _state); } else { rbfsetpoints(&s, &xy, n, _state); } rbfbuildmodel(&s, &rep, _state); if( rep.terminationtype<=0 ) { seterrorflag(&result, ae_true, _state); ae_frame_leave(_state); return result; } /* * Prepare grid to test */ n0 = 1+ae_randominteger(50, _state); n1 = 1+ae_randominteger(50, _state); n2 = 1+ae_randominteger(50, _state); if( nkind==1 ) { k = ae_randominteger(3, _state); if( k==0 ) { n0 = 1; } if( k==1 ) { n1 = 1; } if( k==2 ) { n2 = 1; } } else { if( nkind==2 ) { n0 = 1; n1 = 1; n2 = 1; } else { ae_assert(nkind==0, "Assertion failed", _state); } } ae_vector_set_length(&x0, n0, _state); x0.ptr.p_double[0] = ae_randomreal(_state); for(i=1; i<=n0-1; i++) { x0.ptr.p_double[i] = x0.ptr.p_double[i-1]+ae_randomreal(_state)/n0; } ae_vector_set_length(&x1, n1, _state); x1.ptr.p_double[0] = ae_randomreal(_state); for(i=1; i<=n1-1; i++) { x1.ptr.p_double[i] = x1.ptr.p_double[i-1]+ae_randomreal(_state)/n1; } ae_vector_set_length(&x2, n2, _state); x2.ptr.p_double[0] = ae_randomreal(_state); for(i=1; i<=n2-1; i++) { x2.ptr.p_double[i] = x2.ptr.p_double[i-1]+ae_randomreal(_state)/n2; } ae_vector_set_length(&needy, n0*n1*n2, _state); v = ae_pow((double)(10), -3*ae_randomreal(_state), _state); for(i=0; i<=n0*n1*n2-1; i++) { needy.ptr.p_bool[i] = ae_fp_less(ae_randomreal(_state),v); } /* * Test at grid */ ae_vector_set_length(&x, nx, _state); rbfgridcalc3v(&s, &x0, n0, &x1, n1, &x2, n2, &yv, _state); for(i0=0; i0<=n0-1; i0++) { for(i1=0; i1<=n1-1; i1++) { for(i2=0; i2<=n2-1; i2++) { x.ptr.p_double[0] = x0.ptr.p_double[i0]; x.ptr.p_double[1] = 
x1.ptr.p_double[i1]; x.ptr.p_double[2] = x2.ptr.p_double[i2]; rbfcalc(&s, &x, &y, _state); for(i=0; i<=ny-1; i++) { seterrorflag(&result, ae_fp_greater(ae_fabs(y.ptr.p_double[i]-yv.ptr.p_double[i+ny*(i0+i1*n0+i2*n0*n1)], _state),errtol), _state); } } } } /* * Test calculation on subset of regular grid: * * Test 1: compare full and subset versions * * Test 2: check sparsity. Subset function may perform additional * evaluations because it processes data micro-row by micro-row. * So, we can't check that all elements which were not flagged * are zero - some of them may become non-zero. However, if entire * row is empty, we can reasonably expect (informal guarantee) * that it is not processed. So, we check empty (completely * unflagged) rows * */ ae_vector_set_length(&rowflags, n1*n2, _state); for(i=0; i<=n1*n2-1; i++) { rowflags.ptr.p_bool[i] = ae_false; } for(i=0; i<=n0*n1*n2-1; i++) { if( needy.ptr.p_bool[i] ) { rowflags.ptr.p_bool[i/n0] = ae_true; } } unsetrealarray(&yv2, _state); rbfgridcalc3vsubset(&s, &x0, n0, &x1, n1, &x2, n2, &needy, &yv2, _state); for(i=0; i<=ny*n0*n1*n2-1; i++) { seterrorflag(&result, needy.ptr.p_bool[i/ny]&&ae_fp_greater(ae_fabs(yv.ptr.p_double[i]-yv2.ptr.p_double[i], _state),errtol), _state); seterrorflag(&result, !rowflags.ptr.p_bool[i/(ny*n0)]&&ae_fp_neq(yv2.ptr.p_double[i],0.0), _state); } /* * Test that scaling RBase, XY, X0, X1 and X2 by some power of 2 * does not change values at grid (quite a strict requirement, but * ALGLIB implementation of RBF may deal with it). */ scalefactor = ae_pow((double)(1024), (double)(2*ae_randominteger(2, _state)-1), _state); ae_matrix_set_length(&xy2, n, nx+ny, _state); ae_vector_set_length(&x02, n0, _state); ae_vector_set_length(&x12, n1, _state); ae_vector_set_length(&x22, n2, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=nx-1; j++) { xy2.ptr.pp_double[i][j] = xy.ptr.pp_double[i][j]*scalefactor; } for(j=0; j<=ny-1; j++) { xy2.ptr.pp_double[i][nx+j] = xy.ptr.pp_double[i][nx+j]; } } for(i=0; i<=n0-1; i++) { x02.ptr.p_double[i] = x0.ptr.p_double[i]*scalefactor; } for(i=0; i<=n1-1; i++) { x12.ptr.p_double[i] = x1.ptr.p_double[i]*scalefactor; } for(i=0; i<=n2-1; i++) { x22.ptr.p_double[i] = x2.ptr.p_double[i]*scalefactor; } if( hasscale ) { rbfsetpointsandscales(&s, &xy2, n, &scalevec, _state); } else { rbfsetpoints(&s, &xy2, n, _state); } rbfsetv2bf(&s, bf, _state); rbfsetalgohierarchical(&s, rbase*scalefactor, nlayers, lambdav, _state); rbfbuildmodel(&s, &rep, _state); if( rep.terminationtype<=0 ) { seterrorflag(&result, ae_true, _state); ae_frame_leave(_state); return result; } unsetrealarray(&yv2, _state); rbfgridcalc3v(&s, &x02, n0, &x12, n1, &x22, n2, &yv2, _state); for(i=0; i<=ny*n0*n1*n2-1; i++) { seterrorflag(&result, ae_fp_neq(yv.ptr.p_double[i],yv2.ptr.p_double[i]), _state); } /* * Test that scaling RBase and scale vector by some power of 2 * (increase RBase and decreasing scale, or vice versa) does not * change values at grid (quite a strict requirement, but * ALGLIB implementation of RBF may deal with it). 
*/ scalefactor = ae_pow((double)(1024), (double)(2*ae_randominteger(2, _state)-1), _state); ae_vector_set_length(&scalevec2, nx, _state); for(i=0; i<=nx-1; i++) { if( hasscale ) { scalevec2.ptr.p_double[i] = scalevec.ptr.p_double[i]; } else { scalevec2.ptr.p_double[i] = 1.0; } scalevec2.ptr.p_double[i] = scalevec2.ptr.p_double[i]/scalefactor; } rbfsetpointsandscales(&s, &xy, n, &scalevec2, _state); rbfsetv2bf(&s, bf, _state); rbfsetalgohierarchical(&s, rbase*scalefactor, nlayers, lambdav, _state); rbfbuildmodel(&s, &rep, _state); if( rep.terminationtype<=0 ) { seterrorflag(&result, ae_true, _state); ae_frame_leave(_state); return result; } unsetrealarray(&yv2, _state); rbfgridcalc3v(&s, &x0, n0, &x1, n1, &x2, n2, &yv2, _state); for(i=0; i<=ny*n0*n1*n2-1; i++) { seterrorflag(&result, ae_fp_neq(yv.ptr.p_double[i],yv2.ptr.p_double[i]), _state); } } } ae_frame_leave(_state); return result; } ae_bool testhermite(ae_bool silent, ae_state *_state) { ae_frame _frame_block; double err; double sumerr; double cerr; double threshold; ae_int_t n; ae_int_t maxn; ae_int_t pass; ae_vector c; double x; double v; ae_bool waserrors; ae_bool result; ae_frame_make(_state, &_frame_block); ae_vector_init(&c, 0, DT_REAL, _state); err = (double)(0); sumerr = (double)(0); cerr = (double)(0); threshold = 1.0E-9; waserrors = ae_false; /* * Testing Hermite polynomials */ n = 0; err = ae_maxreal(err, ae_fabs(hermitecalculate(n, (double)(1), _state)-1, _state), _state); n = 1; err = ae_maxreal(err, ae_fabs(hermitecalculate(n, (double)(1), _state)-2, _state), _state); n = 2; err = ae_maxreal(err, ae_fabs(hermitecalculate(n, (double)(1), _state)-2, _state), _state); n = 3; err = ae_maxreal(err, ae_fabs(hermitecalculate(n, (double)(1), _state)+4, _state), _state); n = 4; err = ae_maxreal(err, ae_fabs(hermitecalculate(n, (double)(1), _state)+20, _state), _state); n = 5; err = ae_maxreal(err, ae_fabs(hermitecalculate(n, (double)(1), _state)+8, _state), _state); n = 6; err = ae_maxreal(err, ae_fabs(hermitecalculate(n, (double)(1), _state)-184, _state), _state); n = 7; err = ae_maxreal(err, ae_fabs(hermitecalculate(n, (double)(1), _state)-464, _state), _state); n = 11; err = ae_maxreal(err, ae_fabs(hermitecalculate(n, (double)(1), _state)-230848, _state), _state); n = 12; err = ae_maxreal(err, ae_fabs(hermitecalculate(n, (double)(1), _state)-280768, _state), _state); /* * Testing Clenshaw summation */ maxn = 10; ae_vector_set_length(&c, maxn+1, _state); for(pass=1; pass<=10; pass++) { x = 2*ae_randomreal(_state)-1; v = (double)(0); for(n=0; n<=maxn; n++) { c.ptr.p_double[n] = 2*ae_randomreal(_state)-1; v = v+hermitecalculate(n, x, _state)*c.ptr.p_double[n]; sumerr = ae_maxreal(sumerr, ae_fabs(v-hermitesum(&c, n, x, _state), _state), _state); } } /* * Testing coefficients */ hermitecoefficients(0, &c, _state); cerr = ae_maxreal(cerr, ae_fabs(c.ptr.p_double[0]-1, _state), _state); hermitecoefficients(1, &c, _state); cerr = ae_maxreal(cerr, ae_fabs(c.ptr.p_double[0]-0, _state), _state); cerr = ae_maxreal(cerr, ae_fabs(c.ptr.p_double[1]-2, _state), _state); hermitecoefficients(2, &c, _state); cerr = ae_maxreal(cerr, ae_fabs(c.ptr.p_double[0]+2, _state), _state); cerr = ae_maxreal(cerr, ae_fabs(c.ptr.p_double[1]-0, _state), _state); cerr = ae_maxreal(cerr, ae_fabs(c.ptr.p_double[2]-4, _state), _state); hermitecoefficients(3, &c, _state); cerr = ae_maxreal(cerr, ae_fabs(c.ptr.p_double[0]-0, _state), _state); cerr = ae_maxreal(cerr, ae_fabs(c.ptr.p_double[1]+12, _state), _state); cerr = ae_maxreal(cerr, ae_fabs(c.ptr.p_double[2]-0, 
_state), _state); cerr = ae_maxreal(cerr, ae_fabs(c.ptr.p_double[3]-8, _state), _state); hermitecoefficients(4, &c, _state); cerr = ae_maxreal(cerr, ae_fabs(c.ptr.p_double[0]-12, _state), _state); cerr = ae_maxreal(cerr, ae_fabs(c.ptr.p_double[1]-0, _state), _state); cerr = ae_maxreal(cerr, ae_fabs(c.ptr.p_double[2]+48, _state), _state); cerr = ae_maxreal(cerr, ae_fabs(c.ptr.p_double[3]-0, _state), _state); cerr = ae_maxreal(cerr, ae_fabs(c.ptr.p_double[4]-16, _state), _state); hermitecoefficients(5, &c, _state); cerr = ae_maxreal(cerr, ae_fabs(c.ptr.p_double[0]-0, _state), _state); cerr = ae_maxreal(cerr, ae_fabs(c.ptr.p_double[1]-120, _state), _state); cerr = ae_maxreal(cerr, ae_fabs(c.ptr.p_double[2]-0, _state), _state); cerr = ae_maxreal(cerr, ae_fabs(c.ptr.p_double[3]+160, _state), _state); cerr = ae_maxreal(cerr, ae_fabs(c.ptr.p_double[4]-0, _state), _state); cerr = ae_maxreal(cerr, ae_fabs(c.ptr.p_double[5]-32, _state), _state); hermitecoefficients(6, &c, _state); cerr = ae_maxreal(cerr, ae_fabs(c.ptr.p_double[0]+120, _state), _state); cerr = ae_maxreal(cerr, ae_fabs(c.ptr.p_double[1]-0, _state), _state); cerr = ae_maxreal(cerr, ae_fabs(c.ptr.p_double[2]-720, _state), _state); cerr = ae_maxreal(cerr, ae_fabs(c.ptr.p_double[3]-0, _state), _state); cerr = ae_maxreal(cerr, ae_fabs(c.ptr.p_double[4]+480, _state), _state); cerr = ae_maxreal(cerr, ae_fabs(c.ptr.p_double[5]-0, _state), _state); cerr = ae_maxreal(cerr, ae_fabs(c.ptr.p_double[6]-64, _state), _state); /* * Reporting */ waserrors = (ae_fp_greater(err,threshold)||ae_fp_greater(sumerr,threshold))||ae_fp_greater(cerr,threshold); if( !silent ) { printf("TESTING CALCULATION OF THE HERMITE POLYNOMIALS\n"); printf("Max error %5.2e\n", (double)(err)); printf("Summation error %5.2e\n", (double)(sumerr)); printf("Coefficients error %5.2e\n", (double)(cerr)); printf("Threshold %5.2e\n", (double)(threshold)); if( !waserrors ) { printf("TEST PASSED\n"); } else { printf("TEST FAILED\n"); } } result = !waserrors; ae_frame_leave(_state); return result; } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. 
*************************************************************************/ ae_bool _pexec_testhermite(ae_bool silent, ae_state *_state) { return testhermite(silent, _state); } ae_bool testlaguerre(ae_bool silent, ae_state *_state) { ae_frame _frame_block; double err; double sumerr; double cerr; double threshold; ae_int_t n; ae_int_t maxn; ae_int_t pass; ae_vector c; double x; double v; ae_bool waserrors; ae_bool result; ae_frame_make(_state, &_frame_block); ae_vector_init(&c, 0, DT_REAL, _state); err = (double)(0); sumerr = (double)(0); cerr = (double)(0); threshold = 1.0E-9; waserrors = ae_false; /* * Testing Laguerre polynomials */ n = 0; err = ae_maxreal(err, ae_fabs(laguerrecalculate(n, 0.5, _state)-1.0000000000, _state), _state); n = 1; err = ae_maxreal(err, ae_fabs(laguerrecalculate(n, 0.5, _state)-0.5000000000, _state), _state); n = 2; err = ae_maxreal(err, ae_fabs(laguerrecalculate(n, 0.5, _state)-0.1250000000, _state), _state); n = 3; err = ae_maxreal(err, ae_fabs(laguerrecalculate(n, 0.5, _state)+0.1458333333, _state), _state); n = 4; err = ae_maxreal(err, ae_fabs(laguerrecalculate(n, 0.5, _state)+0.3307291667, _state), _state); n = 5; err = ae_maxreal(err, ae_fabs(laguerrecalculate(n, 0.5, _state)+0.4455729167, _state), _state); n = 6; err = ae_maxreal(err, ae_fabs(laguerrecalculate(n, 0.5, _state)+0.5041449653, _state), _state); n = 7; err = ae_maxreal(err, ae_fabs(laguerrecalculate(n, 0.5, _state)+0.5183392237, _state), _state); n = 8; err = ae_maxreal(err, ae_fabs(laguerrecalculate(n, 0.5, _state)+0.4983629984, _state), _state); n = 9; err = ae_maxreal(err, ae_fabs(laguerrecalculate(n, 0.5, _state)+0.4529195204, _state), _state); n = 10; err = ae_maxreal(err, ae_fabs(laguerrecalculate(n, 0.5, _state)+0.3893744141, _state), _state); n = 11; err = ae_maxreal(err, ae_fabs(laguerrecalculate(n, 0.5, _state)+0.3139072988, _state), _state); n = 12; err = ae_maxreal(err, ae_fabs(laguerrecalculate(n, 0.5, _state)+0.2316496389, _state), _state); /* * Testing Clenshaw summation */ maxn = 20; ae_vector_set_length(&c, maxn+1, _state); for(pass=1; pass<=10; pass++) { x = 2*ae_randomreal(_state)-1; v = (double)(0); for(n=0; n<=maxn; n++) { c.ptr.p_double[n] = 2*ae_randomreal(_state)-1; v = v+laguerrecalculate(n, x, _state)*c.ptr.p_double[n]; sumerr = ae_maxreal(sumerr, ae_fabs(v-laguerresum(&c, n, x, _state), _state), _state); } } /* * Testing coefficients */ laguerrecoefficients(0, &c, _state); cerr = ae_maxreal(cerr, ae_fabs(c.ptr.p_double[0]-1, _state), _state); laguerrecoefficients(1, &c, _state); cerr = ae_maxreal(cerr, ae_fabs(c.ptr.p_double[0]-1, _state), _state); cerr = ae_maxreal(cerr, ae_fabs(c.ptr.p_double[1]+1, _state), _state); laguerrecoefficients(2, &c, _state); cerr = ae_maxreal(cerr, ae_fabs(c.ptr.p_double[0]-(double)2/(double)2, _state), _state); cerr = ae_maxreal(cerr, ae_fabs(c.ptr.p_double[1]+(double)4/(double)2, _state), _state); cerr = ae_maxreal(cerr, ae_fabs(c.ptr.p_double[2]-(double)1/(double)2, _state), _state); laguerrecoefficients(3, &c, _state); cerr = ae_maxreal(cerr, ae_fabs(c.ptr.p_double[0]-(double)6/(double)6, _state), _state); cerr = ae_maxreal(cerr, ae_fabs(c.ptr.p_double[1]+(double)18/(double)6, _state), _state); cerr = ae_maxreal(cerr, ae_fabs(c.ptr.p_double[2]-(double)9/(double)6, _state), _state); cerr = ae_maxreal(cerr, ae_fabs(c.ptr.p_double[3]+(double)1/(double)6, _state), _state); laguerrecoefficients(4, &c, _state); cerr = ae_maxreal(cerr, ae_fabs(c.ptr.p_double[0]-(double)24/(double)24, _state), _state); cerr = ae_maxreal(cerr, 
ae_fabs(c.ptr.p_double[1]+(double)96/(double)24, _state), _state); cerr = ae_maxreal(cerr, ae_fabs(c.ptr.p_double[2]-(double)72/(double)24, _state), _state); cerr = ae_maxreal(cerr, ae_fabs(c.ptr.p_double[3]+(double)16/(double)24, _state), _state); cerr = ae_maxreal(cerr, ae_fabs(c.ptr.p_double[4]-(double)1/(double)24, _state), _state); laguerrecoefficients(5, &c, _state); cerr = ae_maxreal(cerr, ae_fabs(c.ptr.p_double[0]-(double)120/(double)120, _state), _state); cerr = ae_maxreal(cerr, ae_fabs(c.ptr.p_double[1]+(double)600/(double)120, _state), _state); cerr = ae_maxreal(cerr, ae_fabs(c.ptr.p_double[2]-(double)600/(double)120, _state), _state); cerr = ae_maxreal(cerr, ae_fabs(c.ptr.p_double[3]+(double)200/(double)120, _state), _state); cerr = ae_maxreal(cerr, ae_fabs(c.ptr.p_double[4]-(double)25/(double)120, _state), _state); cerr = ae_maxreal(cerr, ae_fabs(c.ptr.p_double[5]+(double)1/(double)120, _state), _state); laguerrecoefficients(6, &c, _state); cerr = ae_maxreal(cerr, ae_fabs(c.ptr.p_double[0]-(double)720/(double)720, _state), _state); cerr = ae_maxreal(cerr, ae_fabs(c.ptr.p_double[1]+(double)4320/(double)720, _state), _state); cerr = ae_maxreal(cerr, ae_fabs(c.ptr.p_double[2]-(double)5400/(double)720, _state), _state); cerr = ae_maxreal(cerr, ae_fabs(c.ptr.p_double[3]+(double)2400/(double)720, _state), _state); cerr = ae_maxreal(cerr, ae_fabs(c.ptr.p_double[4]-(double)450/(double)720, _state), _state); cerr = ae_maxreal(cerr, ae_fabs(c.ptr.p_double[5]+(double)36/(double)720, _state), _state); cerr = ae_maxreal(cerr, ae_fabs(c.ptr.p_double[6]-(double)1/(double)720, _state), _state); /* * Reporting */ waserrors = (ae_fp_greater(err,threshold)||ae_fp_greater(sumerr,threshold))||ae_fp_greater(cerr,threshold); if( !silent ) { printf("TESTING CALCULATION OF THE LAGUERRE POLYNOMIALS\n"); printf("Max error %5.2e\n", (double)(err)); printf("Summation error %5.2e\n", (double)(sumerr)); printf("Coefficients error %5.2e\n", (double)(cerr)); printf("Threshold %5.2e\n", (double)(threshold)); if( !waserrors ) { printf("TEST PASSED\n"); } else { printf("TEST FAILED\n"); } } result = !waserrors; ae_frame_leave(_state); return result; } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. 
*************************************************************************/ ae_bool _pexec_testlaguerre(ae_bool silent, ae_state *_state) { return testlaguerre(silent, _state); } ae_bool testlegendre(ae_bool silent, ae_state *_state) { ae_frame _frame_block; double err; double sumerr; double cerr; double threshold; ae_int_t n; ae_int_t maxn; ae_int_t i; ae_int_t pass; ae_vector c; double x; double v; double t; ae_bool waserrors; ae_bool result; ae_frame_make(_state, &_frame_block); ae_vector_init(&c, 0, DT_REAL, _state); err = (double)(0); sumerr = (double)(0); cerr = (double)(0); threshold = 1.0E-9; waserrors = ae_false; /* * Testing Legendre polynomials values */ for(n=0; n<=10; n++) { legendrecoefficients(n, &c, _state); for(pass=1; pass<=10; pass++) { x = 2*ae_randomreal(_state)-1; v = legendrecalculate(n, x, _state); t = (double)(1); for(i=0; i<=n; i++) { v = v-c.ptr.p_double[i]*t; t = t*x; } err = ae_maxreal(err, ae_fabs(v, _state), _state); } } /* * Testing Clenshaw summation */ maxn = 20; ae_vector_set_length(&c, maxn+1, _state); for(pass=1; pass<=10; pass++) { x = 2*ae_randomreal(_state)-1; v = (double)(0); for(n=0; n<=maxn; n++) { c.ptr.p_double[n] = 2*ae_randomreal(_state)-1; v = v+legendrecalculate(n, x, _state)*c.ptr.p_double[n]; sumerr = ae_maxreal(sumerr, ae_fabs(v-legendresum(&c, n, x, _state), _state), _state); } } /* * Testing coefficients */ legendrecoefficients(0, &c, _state); cerr = ae_maxreal(cerr, ae_fabs(c.ptr.p_double[0]-1, _state), _state); legendrecoefficients(1, &c, _state); cerr = ae_maxreal(cerr, ae_fabs(c.ptr.p_double[0]-0, _state), _state); cerr = ae_maxreal(cerr, ae_fabs(c.ptr.p_double[1]-1, _state), _state); legendrecoefficients(2, &c, _state); cerr = ae_maxreal(cerr, ae_fabs(c.ptr.p_double[0]+(double)1/(double)2, _state), _state); cerr = ae_maxreal(cerr, ae_fabs(c.ptr.p_double[1]-0, _state), _state); cerr = ae_maxreal(cerr, ae_fabs(c.ptr.p_double[2]-(double)3/(double)2, _state), _state); legendrecoefficients(3, &c, _state); cerr = ae_maxreal(cerr, ae_fabs(c.ptr.p_double[0]-0, _state), _state); cerr = ae_maxreal(cerr, ae_fabs(c.ptr.p_double[1]+(double)3/(double)2, _state), _state); cerr = ae_maxreal(cerr, ae_fabs(c.ptr.p_double[2]-0, _state), _state); cerr = ae_maxreal(cerr, ae_fabs(c.ptr.p_double[3]-(double)5/(double)2, _state), _state); legendrecoefficients(4, &c, _state); cerr = ae_maxreal(cerr, ae_fabs(c.ptr.p_double[0]-(double)3/(double)8, _state), _state); cerr = ae_maxreal(cerr, ae_fabs(c.ptr.p_double[1]-0, _state), _state); cerr = ae_maxreal(cerr, ae_fabs(c.ptr.p_double[2]+(double)30/(double)8, _state), _state); cerr = ae_maxreal(cerr, ae_fabs(c.ptr.p_double[3]-0, _state), _state); cerr = ae_maxreal(cerr, ae_fabs(c.ptr.p_double[4]-(double)35/(double)8, _state), _state); legendrecoefficients(9, &c, _state); cerr = ae_maxreal(cerr, ae_fabs(c.ptr.p_double[0]-0, _state), _state); cerr = ae_maxreal(cerr, ae_fabs(c.ptr.p_double[1]-(double)315/(double)128, _state), _state); cerr = ae_maxreal(cerr, ae_fabs(c.ptr.p_double[2]-0, _state), _state); cerr = ae_maxreal(cerr, ae_fabs(c.ptr.p_double[3]+(double)4620/(double)128, _state), _state); cerr = ae_maxreal(cerr, ae_fabs(c.ptr.p_double[4]-0, _state), _state); cerr = ae_maxreal(cerr, ae_fabs(c.ptr.p_double[5]-(double)18018/(double)128, _state), _state); cerr = ae_maxreal(cerr, ae_fabs(c.ptr.p_double[6]-0, _state), _state); cerr = ae_maxreal(cerr, ae_fabs(c.ptr.p_double[7]+(double)25740/(double)128, _state), _state); cerr = ae_maxreal(cerr, ae_fabs(c.ptr.p_double[8]-0, _state), _state); cerr = ae_maxreal(cerr, 
ae_fabs(c.ptr.p_double[9]-(double)12155/(double)128, _state), _state); /* * Reporting */ waserrors = (ae_fp_greater(err,threshold)||ae_fp_greater(sumerr,threshold))||ae_fp_greater(cerr,threshold); if( !silent ) { printf("TESTING CALCULATION OF THE LEGENDRE POLYNOMIALS\n"); printf("Max error %5.2e\n", (double)(err)); printf("Summation error %5.2e\n", (double)(sumerr)); printf("Coefficients error %5.2e\n", (double)(cerr)); printf("Threshold %5.2e\n", (double)(threshold)); if( !waserrors ) { printf("TEST PASSED\n"); } else { printf("TEST FAILED\n"); } } result = !waserrors; ae_frame_leave(_state); return result; } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. *************************************************************************/ ae_bool _pexec_testlegendre(ae_bool silent, ae_state *_state) { return testlegendre(silent, _state); } ae_bool testchebyshev(ae_bool silent, ae_state *_state) { ae_frame _frame_block; double err; double sumerr; double cerr; double ferr; double threshold; double x; double v; ae_int_t pass; ae_int_t i; ae_int_t j; ae_int_t k; ae_int_t n; ae_int_t maxn; ae_vector c; ae_vector p1; ae_vector p2; ae_matrix a; ae_bool waserrors; ae_bool result; ae_frame_make(_state, &_frame_block); ae_vector_init(&c, 0, DT_REAL, _state); ae_vector_init(&p1, 0, DT_REAL, _state); ae_vector_init(&p2, 0, DT_REAL, _state); ae_matrix_init(&a, 0, 0, DT_REAL, _state); err = (double)(0); sumerr = (double)(0); cerr = (double)(0); ferr = (double)(0); threshold = 1.0E-9; waserrors = ae_false; /* * Testing Chebyshev polynomials of the first kind */ err = ae_maxreal(err, ae_fabs(chebyshevcalculate(1, 0, 0.00, _state)-1, _state), _state); err = ae_maxreal(err, ae_fabs(chebyshevcalculate(1, 0, 0.33, _state)-1, _state), _state); err = ae_maxreal(err, ae_fabs(chebyshevcalculate(1, 0, -0.42, _state)-1, _state), _state); x = 0.2; err = ae_maxreal(err, ae_fabs(chebyshevcalculate(1, 1, x, _state)-0.2, _state), _state); x = 0.4; err = ae_maxreal(err, ae_fabs(chebyshevcalculate(1, 1, x, _state)-0.4, _state), _state); x = 0.6; err = ae_maxreal(err, ae_fabs(chebyshevcalculate(1, 1, x, _state)-0.6, _state), _state); x = 0.8; err = ae_maxreal(err, ae_fabs(chebyshevcalculate(1, 1, x, _state)-0.8, _state), _state); x = 1.0; err = ae_maxreal(err, ae_fabs(chebyshevcalculate(1, 1, x, _state)-1.0, _state), _state); x = 0.2; err = ae_maxreal(err, ae_fabs(chebyshevcalculate(1, 2, x, _state)+0.92, _state), _state); x = 0.4; err = ae_maxreal(err, ae_fabs(chebyshevcalculate(1, 2, x, _state)+0.68, _state), _state); x = 0.6; err = ae_maxreal(err, ae_fabs(chebyshevcalculate(1, 2, x, _state)+0.28, _state), _state); x = 0.8; err = ae_maxreal(err, ae_fabs(chebyshevcalculate(1, 2, x, _state)-0.28, _state), _state); x = 1.0; err = ae_maxreal(err, ae_fabs(chebyshevcalculate(1, 2, x, _state)-1.00, _state), _state); n = 10; err = ae_maxreal(err, ae_fabs(chebyshevcalculate(1, n, 0.2, _state)-0.4284556288, _state), _state); n = 11; err = ae_maxreal(err, ae_fabs(chebyshevcalculate(1, n, 0.2, _state)+0.7996160205, _state), _state); n = 12; err = ae_maxreal(err, ae_fabs(chebyshevcalculate(1, n, 0.2, _state)+0.7483020370, _state), _state); /* * Testing Chebyshev polynomials of the second kind */ n = 0; err = ae_maxreal(err, ae_fabs(chebyshevcalculate(2, n, 0.2, _state)-1.0000000000, _state), _state); n = 1; err = ae_maxreal(err, ae_fabs(chebyshevcalculate(2, n, 0.2, _state)-0.4000000000, _state), _state); n = 2; err = ae_maxreal(err, 
ae_fabs(chebyshevcalculate(2, n, 0.2, _state)+0.8400000000, _state), _state); n = 3; err = ae_maxreal(err, ae_fabs(chebyshevcalculate(2, n, 0.2, _state)+0.7360000000, _state), _state); n = 4; err = ae_maxreal(err, ae_fabs(chebyshevcalculate(2, n, 0.2, _state)-0.5456000000, _state), _state); n = 10; err = ae_maxreal(err, ae_fabs(chebyshevcalculate(2, n, 0.2, _state)-0.6128946176, _state), _state); n = 11; err = ae_maxreal(err, ae_fabs(chebyshevcalculate(2, n, 0.2, _state)+0.6770370970, _state), _state); n = 12; err = ae_maxreal(err, ae_fabs(chebyshevcalculate(2, n, 0.2, _state)+0.8837094564, _state), _state); /* * Testing Clenshaw summation */ maxn = 20; ae_vector_set_length(&c, maxn+1, _state); for(k=1; k<=2; k++) { for(pass=1; pass<=10; pass++) { x = 2*ae_randomreal(_state)-1; v = (double)(0); for(n=0; n<=maxn; n++) { c.ptr.p_double[n] = 2*ae_randomreal(_state)-1; v = v+chebyshevcalculate(k, n, x, _state)*c.ptr.p_double[n]; sumerr = ae_maxreal(sumerr, ae_fabs(v-chebyshevsum(&c, k, n, x, _state), _state), _state); } } } /* * Testing coefficients */ chebyshevcoefficients(0, &c, _state); cerr = ae_maxreal(cerr, ae_fabs(c.ptr.p_double[0]-1, _state), _state); chebyshevcoefficients(1, &c, _state); cerr = ae_maxreal(cerr, ae_fabs(c.ptr.p_double[0]-0, _state), _state); cerr = ae_maxreal(cerr, ae_fabs(c.ptr.p_double[1]-1, _state), _state); chebyshevcoefficients(2, &c, _state); cerr = ae_maxreal(cerr, ae_fabs(c.ptr.p_double[0]+1, _state), _state); cerr = ae_maxreal(cerr, ae_fabs(c.ptr.p_double[1]-0, _state), _state); cerr = ae_maxreal(cerr, ae_fabs(c.ptr.p_double[2]-2, _state), _state); chebyshevcoefficients(3, &c, _state); cerr = ae_maxreal(cerr, ae_fabs(c.ptr.p_double[0]-0, _state), _state); cerr = ae_maxreal(cerr, ae_fabs(c.ptr.p_double[1]+3, _state), _state); cerr = ae_maxreal(cerr, ae_fabs(c.ptr.p_double[2]-0, _state), _state); cerr = ae_maxreal(cerr, ae_fabs(c.ptr.p_double[3]-4, _state), _state); chebyshevcoefficients(4, &c, _state); cerr = ae_maxreal(cerr, ae_fabs(c.ptr.p_double[0]-1, _state), _state); cerr = ae_maxreal(cerr, ae_fabs(c.ptr.p_double[1]-0, _state), _state); cerr = ae_maxreal(cerr, ae_fabs(c.ptr.p_double[2]+8, _state), _state); cerr = ae_maxreal(cerr, ae_fabs(c.ptr.p_double[3]-0, _state), _state); cerr = ae_maxreal(cerr, ae_fabs(c.ptr.p_double[4]-8, _state), _state); chebyshevcoefficients(9, &c, _state); cerr = ae_maxreal(cerr, ae_fabs(c.ptr.p_double[0]-0, _state), _state); cerr = ae_maxreal(cerr, ae_fabs(c.ptr.p_double[1]-9, _state), _state); cerr = ae_maxreal(cerr, ae_fabs(c.ptr.p_double[2]-0, _state), _state); cerr = ae_maxreal(cerr, ae_fabs(c.ptr.p_double[3]+120, _state), _state); cerr = ae_maxreal(cerr, ae_fabs(c.ptr.p_double[4]-0, _state), _state); cerr = ae_maxreal(cerr, ae_fabs(c.ptr.p_double[5]-432, _state), _state); cerr = ae_maxreal(cerr, ae_fabs(c.ptr.p_double[6]-0, _state), _state); cerr = ae_maxreal(cerr, ae_fabs(c.ptr.p_double[7]+576, _state), _state); cerr = ae_maxreal(cerr, ae_fabs(c.ptr.p_double[8]-0, _state), _state); cerr = ae_maxreal(cerr, ae_fabs(c.ptr.p_double[9]-256, _state), _state); /* * Testing FromChebyshev */ maxn = 10; ae_matrix_set_length(&a, maxn+1, maxn+1, _state); for(i=0; i<=maxn; i++) { for(j=0; j<=maxn; j++) { a.ptr.pp_double[i][j] = (double)(0); } chebyshevcoefficients(i, &c, _state); ae_v_move(&a.ptr.pp_double[i][0], 1, &c.ptr.p_double[0], 1, ae_v_len(0,i)); } ae_vector_set_length(&c, maxn+1, _state); ae_vector_set_length(&p1, maxn+1, _state); for(n=0; n<=maxn; n++) { for(pass=1; pass<=10; pass++) { for(i=0; i<=n; i++) { p1.ptr.p_double[i] 
= (double)(0); } for(i=0; i<=n; i++) { c.ptr.p_double[i] = 2*ae_randomreal(_state)-1; v = c.ptr.p_double[i]; ae_v_addd(&p1.ptr.p_double[0], 1, &a.ptr.pp_double[i][0], 1, ae_v_len(0,i), v); } fromchebyshev(&c, n, &p2, _state); for(i=0; i<=n; i++) { ferr = ae_maxreal(ferr, ae_fabs(p1.ptr.p_double[i]-p2.ptr.p_double[i], _state), _state); } } } /* * Reporting */ waserrors = ((ae_fp_greater(err,threshold)||ae_fp_greater(sumerr,threshold))||ae_fp_greater(cerr,threshold))||ae_fp_greater(ferr,threshold); if( !silent ) { printf("TESTING CALCULATION OF THE CHEBYSHEV POLYNOMIALS\n"); printf("Max error against table %5.2e\n", (double)(err)); printf("Summation error %5.2e\n", (double)(sumerr)); printf("Coefficients error %5.2e\n", (double)(cerr)); printf("FrobChebyshev error %5.2e\n", (double)(ferr)); printf("Threshold %5.2e\n", (double)(threshold)); if( !waserrors ) { printf("TEST PASSED\n"); } else { printf("TEST FAILED\n"); } } result = !waserrors; ae_frame_leave(_state); return result; } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. *************************************************************************/ ae_bool _pexec_testchebyshev(ae_bool silent, ae_state *_state) { return testchebyshev(silent, _state); } ae_bool testwsr(ae_bool silent, ae_state *_state) { ae_frame _frame_block; ae_bool waserrors; ae_vector xa; ae_int_t n; ae_int_t i; double taill; double tailr; double tailb; double taillprev; double tailrprev; double ebase; double eshift; ae_bool result; ae_frame_make(_state, &_frame_block); ae_vector_init(&xa, 0, DT_REAL, _state); waserrors = ae_false; /* * Test monotonicity of tail values for moving value of E */ for(n=5; n<=50; n++) { /* * Generate uniform and sorted X spanning [0,1] */ ae_vector_set_length(&xa, n, _state); for(i=0; i<=n-1; i++) { xa.ptr.p_double[i] = (double)i/(double)(n-1); } /* * Test N+1 values of E */ ebase = -0.5/(n-1); eshift = (double)1/(double)(n-1); tailrprev = (double)(0); taillprev = (double)(1); for(i=0; i<=n; i++) { wilcoxonsignedranktest(&xa, n, ebase+eshift*i, &tailb, &taill, &tailr, _state); seterrorflag(&waserrors, ae_fp_neq(tailb,2*ae_minreal(taill, tailr, _state)), _state); seterrorflag(&waserrors, ae_fp_greater(tailrprev,tailr), _state); seterrorflag(&waserrors, ae_fp_less(taillprev,taill), _state); tailrprev = tailr; taillprev = taill; } } /* * Test for integer overflow in the function: if one crucial * calculation step is performed in 32-bit integer arithmetics, * it will return incorrect results. * * We use special handcrafted N, such that in 32-bit integer * arithmetics int32(N*N)<0. Such negative N leads to domain * error in the sqrt() function. */ n = 50000; ae_vector_set_length(&xa, n, _state); for(i=0; i<=n-1; i++) { xa.ptr.p_double[i] = ae_sin((double)(10*i), _state); } wilcoxonsignedranktest(&xa, n, 0.0, &tailb, &taill, &tailr, _state); seterrorflag(&waserrors, !ae_isfinite(tailb, _state), _state); seterrorflag(&waserrors, !ae_isfinite(taill, _state), _state); seterrorflag(&waserrors, !ae_isfinite(tailr, _state), _state); /* * */ if( !silent ) { if( waserrors ) { printf("TEST FAILED\n"); } else { printf("TEST PASSED\n"); } } result = !waserrors; ae_frame_leave(_state); return result; } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. 
*************************************************************************/ ae_bool _pexec_testwsr(ae_bool silent, ae_state *_state) { return testwsr(silent, _state); } ae_bool teststest(ae_bool silent, ae_state *_state) { ae_frame _frame_block; ae_vector x; double taill; double tailr; double tailb; ae_bool waserrors; double eps; ae_bool result; ae_frame_make(_state, &_frame_block); ae_vector_init(&x, 0, DT_REAL, _state); waserrors = ae_false; eps = 1.0E-3; /* * Test 1 */ ae_vector_set_length(&x, 6, _state); x.ptr.p_double[0] = (double)(-3); x.ptr.p_double[1] = (double)(-2); x.ptr.p_double[2] = (double)(-1); x.ptr.p_double[3] = (double)(1); x.ptr.p_double[4] = (double)(2); x.ptr.p_double[5] = (double)(3); onesamplesigntest(&x, 6, 0.0, &tailb, &taill, &tailr, _state); waserrors = waserrors||ae_fp_greater(ae_fabs(taill-0.65625, _state),eps); waserrors = waserrors||ae_fp_greater(ae_fabs(tailr-0.65625, _state),eps); waserrors = waserrors||ae_fp_greater(ae_fabs(tailb-1.00000, _state),eps); onesamplesigntest(&x, 6, -1.0, &tailb, &taill, &tailr, _state); waserrors = waserrors||ae_fp_greater(ae_fabs(taill-0.81250, _state),eps); waserrors = waserrors||ae_fp_greater(ae_fabs(tailr-0.50000, _state),eps); waserrors = waserrors||ae_fp_greater(ae_fabs(tailb-1.00000, _state),eps); onesamplesigntest(&x, 6, -1.5, &tailb, &taill, &tailr, _state); waserrors = waserrors||ae_fp_greater(ae_fabs(taill-0.89062, _state),eps); waserrors = waserrors||ae_fp_greater(ae_fabs(tailr-0.34375, _state),eps); waserrors = waserrors||ae_fp_greater(ae_fabs(tailb-0.68750, _state),eps); onesamplesigntest(&x, 6, -3.0, &tailb, &taill, &tailr, _state); waserrors = waserrors||ae_fp_greater(ae_fabs(taill-1.00000, _state),eps); waserrors = waserrors||ae_fp_greater(ae_fabs(tailr-0.03125, _state),eps); waserrors = waserrors||ae_fp_greater(ae_fabs(tailb-0.06250, _state),eps); /* * Test 2 */ ae_vector_set_length(&x, 3, _state); x.ptr.p_double[0] = (double)(2); x.ptr.p_double[1] = (double)(2); x.ptr.p_double[2] = (double)(2); onesamplesigntest(&x, 3, 2.0, &tailb, &taill, &tailr, _state); waserrors = waserrors||ae_fp_neq(taill,(double)(1)); waserrors = waserrors||ae_fp_neq(tailr,(double)(1)); waserrors = waserrors||ae_fp_neq(tailb,(double)(1)); /* * Final report */ if( !silent ) { printf("SIGN TEST: "); if( !waserrors ) { printf("OK\n"); } else { printf("FAILED\n"); } if( waserrors ) { printf("TEST SUMMARY: FAILED\n"); } else { printf("TEST SUMMARY: PASSED\n"); } printf("\n\n"); } result = !waserrors; ae_frame_leave(_state); return result; } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. 
*************************************************************************/ ae_bool _pexec_teststest(ae_bool silent, ae_state *_state) { return teststest(silent, _state); } ae_bool teststudentttests(ae_bool silent, ae_state *_state) { ae_frame _frame_block; ae_bool waserrors; double eps; ae_vector x; ae_vector y; ae_vector xa; ae_vector ya; ae_vector xb; ae_vector yb; ae_int_t n; ae_int_t i; double taill; double tailr; double tailb; double taill1; double tailr1; double tailb1; ae_bool result; ae_frame_make(_state, &_frame_block); ae_vector_init(&x, 0, DT_REAL, _state); ae_vector_init(&y, 0, DT_REAL, _state); ae_vector_init(&xa, 0, DT_REAL, _state); ae_vector_init(&ya, 0, DT_REAL, _state); ae_vector_init(&xb, 0, DT_REAL, _state); ae_vector_init(&yb, 0, DT_REAL, _state); waserrors = ae_false; eps = 0.001; /* * 1-sample test */ n = 8; ae_vector_set_length(&x, 8, _state); x.ptr.p_double[0] = -3.0; x.ptr.p_double[1] = -1.5; x.ptr.p_double[2] = -1.0; x.ptr.p_double[3] = -0.5; x.ptr.p_double[4] = 0.5; x.ptr.p_double[5] = 1.0; x.ptr.p_double[6] = 1.5; x.ptr.p_double[7] = 3.0; studentttest1(&x, n, 0.0, &tailb, &taill, &tailr, _state); waserrors = waserrors||ae_fp_greater(ae_fabs(tailb-1.00000, _state),eps); waserrors = waserrors||ae_fp_greater(ae_fabs(taill-0.50000, _state),eps); waserrors = waserrors||ae_fp_greater(ae_fabs(tailr-0.50000, _state),eps); studentttest1(&x, n, 1.0, &tailb, &taill, &tailr, _state); waserrors = waserrors||ae_fp_greater(ae_fabs(tailb-0.17816, _state),eps); waserrors = waserrors||ae_fp_greater(ae_fabs(taill-0.08908, _state),eps); waserrors = waserrors||ae_fp_greater(ae_fabs(tailr-0.91092, _state),eps); studentttest1(&x, n, -1.0, &tailb, &taill, &tailr, _state); waserrors = waserrors||ae_fp_greater(ae_fabs(tailb-0.17816, _state),eps); waserrors = waserrors||ae_fp_greater(ae_fabs(taill-0.91092, _state),eps); waserrors = waserrors||ae_fp_greater(ae_fabs(tailr-0.08908, _state),eps); x.ptr.p_double[0] = 1.1; x.ptr.p_double[1] = 1.1; x.ptr.p_double[2] = 1.1; x.ptr.p_double[3] = 1.1; x.ptr.p_double[4] = 1.1; x.ptr.p_double[5] = 1.1; x.ptr.p_double[6] = 1.1; x.ptr.p_double[7] = 1.1; studentttest1(&x, n, 1.1, &tailb, &taill, &tailr, _state); waserrors = waserrors||ae_fp_neq(tailb,(double)(1)); waserrors = waserrors||ae_fp_neq(taill,(double)(1)); waserrors = waserrors||ae_fp_neq(tailr,(double)(1)); studentttest1(&x, n, 0.0, &tailb, &taill, &tailr, _state); waserrors = waserrors||ae_fp_neq(tailb,(double)(0)); waserrors = waserrors||ae_fp_neq(taill,(double)(1)); waserrors = waserrors||ae_fp_neq(tailr,(double)(0)); x.ptr.p_double[7] = 1.1; studentttest1(&x, 1, 1.1, &tailb, &taill, &tailr, _state); waserrors = waserrors||ae_fp_neq(tailb,(double)(1)); waserrors = waserrors||ae_fp_neq(taill,(double)(1)); waserrors = waserrors||ae_fp_neq(tailr,(double)(1)); studentttest1(&x, 1, 0.0, &tailb, &taill, &tailr, _state); waserrors = waserrors||ae_fp_neq(tailb,(double)(0)); waserrors = waserrors||ae_fp_neq(taill,(double)(1)); waserrors = waserrors||ae_fp_neq(tailr,(double)(0)); /* * 2-sample pooled (equal variance) test */ n = 8; ae_vector_set_length(&x, 8, _state); ae_vector_set_length(&y, 8, _state); x.ptr.p_double[0] = -3.0; x.ptr.p_double[1] = -1.5; x.ptr.p_double[2] = -1.0; x.ptr.p_double[3] = -0.5; x.ptr.p_double[4] = 0.5; x.ptr.p_double[5] = 1.0; x.ptr.p_double[6] = 1.5; x.ptr.p_double[7] = 3.0; y.ptr.p_double[0] = -2.0; y.ptr.p_double[1] = -0.5; y.ptr.p_double[2] = 0.0; y.ptr.p_double[3] = 0.5; y.ptr.p_double[4] = 1.5; y.ptr.p_double[5] = 2.0; y.ptr.p_double[6] = 2.5; y.ptr.p_double[7] = 
4.0; studentttest2(&x, n, &y, n, &tailb, &taill, &tailr, _state); waserrors = waserrors||ae_fp_greater(ae_fabs(tailb-0.30780, _state),eps); waserrors = waserrors||ae_fp_greater(ae_fabs(taill-0.15390, _state),eps); waserrors = waserrors||ae_fp_greater(ae_fabs(tailr-0.84610, _state),eps); studentttest2(&x, n, &y, n-1, &tailb, &taill, &tailr, _state); waserrors = waserrors||ae_fp_greater(ae_fabs(tailb-0.53853, _state),eps); waserrors = waserrors||ae_fp_greater(ae_fabs(taill-0.26927, _state),eps); waserrors = waserrors||ae_fp_greater(ae_fabs(tailr-0.73074, _state),eps); studentttest2(&x, n-1, &y, n, &tailb, &taill, &tailr, _state); waserrors = waserrors||ae_fp_greater(ae_fabs(tailb-0.13829, _state),eps); waserrors = waserrors||ae_fp_greater(ae_fabs(taill-0.06915, _state),eps); waserrors = waserrors||ae_fp_greater(ae_fabs(tailr-0.93086, _state),eps); x.ptr.p_double[0] = -1.0; x.ptr.p_double[1] = -1.0; x.ptr.p_double[2] = -1.0; x.ptr.p_double[3] = -1.0; x.ptr.p_double[4] = -1.0; x.ptr.p_double[5] = -1.0; x.ptr.p_double[6] = -1.0; x.ptr.p_double[7] = -1.0; y.ptr.p_double[0] = 1.0; y.ptr.p_double[1] = 1.0; y.ptr.p_double[2] = 1.0; y.ptr.p_double[3] = 1.0; y.ptr.p_double[4] = 1.0; y.ptr.p_double[5] = 1.0; y.ptr.p_double[6] = 1.0; y.ptr.p_double[7] = 1.0; studentttest2(&x, n, &y, n, &tailb, &taill, &tailr, _state); waserrors = waserrors||ae_fp_neq(tailb,(double)(0)); waserrors = waserrors||ae_fp_neq(taill,(double)(0)); waserrors = waserrors||ae_fp_neq(tailr,(double)(1)); studentttest2(&x, n, &y, n-1, &tailb, &taill, &tailr, _state); waserrors = waserrors||ae_fp_neq(tailb,(double)(0)); waserrors = waserrors||ae_fp_neq(taill,(double)(0)); waserrors = waserrors||ae_fp_neq(tailr,(double)(1)); studentttest2(&x, n, &y, 1, &tailb, &taill, &tailr, _state); waserrors = waserrors||ae_fp_neq(tailb,(double)(0)); waserrors = waserrors||ae_fp_neq(taill,(double)(0)); waserrors = waserrors||ae_fp_neq(tailr,(double)(1)); studentttest2(&x, n-1, &y, n, &tailb, &taill, &tailr, _state); waserrors = waserrors||ae_fp_neq(tailb,(double)(0)); waserrors = waserrors||ae_fp_neq(taill,(double)(0)); waserrors = waserrors||ae_fp_neq(tailr,(double)(1)); studentttest2(&x, 1, &y, n, &tailb, &taill, &tailr, _state); waserrors = waserrors||ae_fp_neq(tailb,(double)(0)); waserrors = waserrors||ae_fp_neq(taill,(double)(0)); waserrors = waserrors||ae_fp_neq(tailr,(double)(1)); studentttest2(&x, 1, &y, 1, &tailb, &taill, &tailr, _state); waserrors = waserrors||ae_fp_neq(tailb,(double)(0)); waserrors = waserrors||ae_fp_neq(taill,(double)(0)); waserrors = waserrors||ae_fp_neq(tailr,(double)(1)); studentttest2(&y, 1, &x, 1, &tailb, &taill, &tailr, _state); waserrors = waserrors||ae_fp_neq(tailb,(double)(0)); waserrors = waserrors||ae_fp_neq(taill,(double)(1)); waserrors = waserrors||ae_fp_neq(tailr,(double)(0)); x.ptr.p_double[0] = 1.1; x.ptr.p_double[1] = 1.1; x.ptr.p_double[2] = 1.1; x.ptr.p_double[3] = 1.1; x.ptr.p_double[4] = 1.1; x.ptr.p_double[5] = 1.1; x.ptr.p_double[6] = 1.1; x.ptr.p_double[7] = 1.1; y.ptr.p_double[0] = 1.1; y.ptr.p_double[1] = 1.1; y.ptr.p_double[2] = 1.1; y.ptr.p_double[3] = 1.1; y.ptr.p_double[4] = 1.1; y.ptr.p_double[5] = 1.1; y.ptr.p_double[6] = 1.1; y.ptr.p_double[7] = 1.1; studentttest2(&x, n, &y, n, &tailb, &taill, &tailr, _state); waserrors = waserrors||ae_fp_neq(tailb,(double)(1)); waserrors = waserrors||ae_fp_neq(taill,(double)(1)); waserrors = waserrors||ae_fp_neq(tailr,(double)(1)); studentttest2(&x, n, &y, n-1, &tailb, &taill, &tailr, _state); waserrors = waserrors||ae_fp_neq(tailb,(double)(1)); 
waserrors = waserrors||ae_fp_neq(taill,(double)(1)); waserrors = waserrors||ae_fp_neq(tailr,(double)(1)); studentttest2(&x, n, &y, 1, &tailb, &taill, &tailr, _state); waserrors = waserrors||ae_fp_neq(tailb,(double)(1)); waserrors = waserrors||ae_fp_neq(taill,(double)(1)); waserrors = waserrors||ae_fp_neq(tailr,(double)(1)); studentttest2(&x, n-1, &y, n, &tailb, &taill, &tailr, _state); waserrors = waserrors||ae_fp_neq(tailb,(double)(1)); waserrors = waserrors||ae_fp_neq(taill,(double)(1)); waserrors = waserrors||ae_fp_neq(tailr,(double)(1)); studentttest2(&x, 1, &y, n, &tailb, &taill, &tailr, _state); waserrors = waserrors||ae_fp_neq(tailb,(double)(1)); waserrors = waserrors||ae_fp_neq(taill,(double)(1)); waserrors = waserrors||ae_fp_neq(tailr,(double)(1)); studentttest2(&x, 1, &y, 1, &tailb, &taill, &tailr, _state); waserrors = waserrors||ae_fp_neq(tailb,(double)(1)); waserrors = waserrors||ae_fp_neq(taill,(double)(1)); waserrors = waserrors||ae_fp_neq(tailr,(double)(1)); /* * 2-sample unpooled (unequal variance) test: * * test on two non-constant samples * * tests on different combinations of non-constant and constant samples */ n = 8; ae_vector_set_length(&xa, 8, _state); ae_vector_set_length(&ya, 8, _state); ae_vector_set_length(&xb, 8, _state); ae_vector_set_length(&yb, 8, _state); xa.ptr.p_double[0] = -3.0; xa.ptr.p_double[1] = -1.5; xa.ptr.p_double[2] = -1.0; xa.ptr.p_double[3] = -0.5; xa.ptr.p_double[4] = 0.5; xa.ptr.p_double[5] = 1.0; xa.ptr.p_double[6] = 1.5; xa.ptr.p_double[7] = 3.0; ya.ptr.p_double[0] = -1.0; ya.ptr.p_double[1] = -0.5; ya.ptr.p_double[2] = 0.0; ya.ptr.p_double[3] = 0.5; ya.ptr.p_double[4] = 1.5; ya.ptr.p_double[5] = 2.0; ya.ptr.p_double[6] = 2.5; ya.ptr.p_double[7] = 3.0; xb.ptr.p_double[0] = -1.1; xb.ptr.p_double[1] = -1.1; xb.ptr.p_double[2] = -1.1; xb.ptr.p_double[3] = -1.1; xb.ptr.p_double[4] = -1.1; xb.ptr.p_double[5] = -1.1; xb.ptr.p_double[6] = -1.1; xb.ptr.p_double[7] = -1.1; yb.ptr.p_double[0] = 1.1; yb.ptr.p_double[1] = 1.1; yb.ptr.p_double[2] = 1.1; yb.ptr.p_double[3] = 1.1; yb.ptr.p_double[4] = 1.1; yb.ptr.p_double[5] = 1.1; yb.ptr.p_double[6] = 1.1; yb.ptr.p_double[7] = 1.1; unequalvariancettest(&xa, n, &ya, n, &tailb, &taill, &tailr, _state); waserrors = waserrors||ae_fp_greater(ae_fabs(tailb-0.25791, _state),eps); waserrors = waserrors||ae_fp_greater(ae_fabs(taill-0.12896, _state),eps); waserrors = waserrors||ae_fp_greater(ae_fabs(tailr-0.87105, _state),eps); unequalvariancettest(&xa, n, &yb, n, &tailb, &taill, &tailr, _state); studentttest1(&xa, n, 1.1, &tailb1, &taill1, &tailr1, _state); waserrors = waserrors||ae_fp_greater(ae_fabs(tailb-tailb1, _state),eps); waserrors = waserrors||ae_fp_greater(ae_fabs(taill-taill1, _state),eps); waserrors = waserrors||ae_fp_greater(ae_fabs(tailr-tailr1, _state),eps); unequalvariancettest(&xa, n, &yb, 1, &tailb, &taill, &tailr, _state); studentttest1(&xa, n, 1.1, &tailb1, &taill1, &tailr1, _state); waserrors = waserrors||ae_fp_greater(ae_fabs(tailb-tailb1, _state),eps); waserrors = waserrors||ae_fp_greater(ae_fabs(taill-taill1, _state),eps); waserrors = waserrors||ae_fp_greater(ae_fabs(tailr-tailr1, _state),eps); unequalvariancettest(&xb, n, &ya, n, &tailb, &taill, &tailr, _state); studentttest1(&ya, n, -1.1, &tailb1, &taill1, &tailr1, _state); waserrors = waserrors||ae_fp_greater(ae_fabs(tailb-tailb1, _state),eps); waserrors = waserrors||ae_fp_greater(ae_fabs(taill-tailr1, _state),eps); waserrors = waserrors||ae_fp_greater(ae_fabs(tailr-taill1, _state),eps); unequalvariancettest(&xb, 1, &ya, n, &tailb, 
&taill, &tailr, _state); studentttest1(&ya, n, -1.1, &tailb1, &taill1, &tailr1, _state); waserrors = waserrors||ae_fp_greater(ae_fabs(tailb-tailb1, _state),eps); waserrors = waserrors||ae_fp_greater(ae_fabs(taill-tailr1, _state),eps); waserrors = waserrors||ae_fp_greater(ae_fabs(tailr-taill1, _state),eps); unequalvariancettest(&xb, 1, &yb, 1, &tailb, &taill, &tailr, _state); waserrors = waserrors||ae_fp_neq(tailb,(double)(0)); waserrors = waserrors||ae_fp_neq(taill,(double)(0)); waserrors = waserrors||ae_fp_neq(tailr,(double)(1)); unequalvariancettest(&yb, 1, &xb, 1, &tailb, &taill, &tailr, _state); waserrors = waserrors||ae_fp_neq(tailb,(double)(0)); waserrors = waserrors||ae_fp_neq(taill,(double)(1)); waserrors = waserrors||ae_fp_neq(tailr,(double)(0)); unequalvariancettest(&xb, 1, &xb, 1, &tailb, &taill, &tailr, _state); waserrors = waserrors||ae_fp_neq(tailb,(double)(1)); waserrors = waserrors||ae_fp_neq(taill,(double)(1)); waserrors = waserrors||ae_fp_neq(tailr,(double)(1)); /* * Test for integer overflow in the function: if one crucial * calculation step is performed in 32-bit integer arithmetics, * it will return incorrect results. * * We use special handcrafted N, such that in 32-bit integer * arithmetics int32(N*N)<0. Such negative N leads to domain * error in the incomplete beta function. */ n = 50000; ae_vector_set_length(&xa, n, _state); ae_vector_set_length(&ya, n, _state); for(i=0; i<=n-1; i++) { xa.ptr.p_double[i] = ae_randomreal(_state); ya.ptr.p_double[i] = ae_randomreal(_state); } unequalvariancettest(&xa, n, &ya, n, &tailb, &taill, &tailr, _state); /* * */ if( !silent ) { if( waserrors ) { printf("TEST FAILED\n"); } else { printf("TEST PASSED\n"); } } result = !waserrors; ae_frame_leave(_state); return result; } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. *************************************************************************/ ae_bool _pexec_teststudentttests(ae_bool silent, ae_state *_state) { return teststudentttests(silent, _state); } ae_bool testmannwhitneyu(ae_bool silent, ae_state *_state) { ae_frame _frame_block; ae_bool waserrors; ae_vector x; ae_vector y; ae_int_t testmin; ae_int_t testmax; ae_int_t testcnt; ae_int_t pass; ae_int_t n; ae_int_t m; ae_int_t i; ae_int_t k; double taill; double tailr; double tailb; double taill1; double tailr1; double tailb1; double taillprev; double tailrprev; double ebase; double eshift; ae_int_t ecnt; double worsterr; double v; hqrndstate rs; ae_bool result; ae_frame_make(_state, &_frame_block); ae_vector_init(&x, 0, DT_REAL, _state); ae_vector_init(&y, 0, DT_REAL, _state); _hqrndstate_init(&rs, _state); waserrors = ae_false; hqrndrandomize(&rs, _state); /* * Test monotonicity of tail values for monotinically moving distributions. 
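* (Concretely, the checks below require the right-tail p-value to be
*  non-decreasing and the left-tail p-value to be non-increasing as Y is
*  shifted upwards, and the two-tailed p-value to equal twice the smaller
*  one-sided value at every step.)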
*/ for(n=5; n<=20; n++) { for(m=5; m<=20; m++) { /* * Generate uniform and sorted X/Y spanning [0,1] */ ae_vector_set_length(&x, n, _state); for(i=0; i<=n-1; i++) { x.ptr.p_double[i] = (double)i/(double)(n-1)+100*ae_machineepsilon*hqrndnormal(&rs, _state); } ae_vector_set_length(&y, m, _state); for(i=0; i<=m-1; i++) { y.ptr.p_double[i] = (double)i/(double)(m-1)+100*ae_machineepsilon*hqrndnormal(&rs, _state); } /* * Test 100 values of E */ ecnt = 100; ebase = -1.1; eshift = -2*ebase/(ecnt-1); tailrprev = (double)(0); taillprev = (double)(1); for(i=0; i<=m-1; i++) { y.ptr.p_double[i] = y.ptr.p_double[i]+ebase; } for(k=0; k<=ecnt-1; k++) { mannwhitneyutest(&x, n, &y, m, &tailb, &taill, &tailr, _state); seterrorflag(&waserrors, ae_fp_neq(tailb,2*ae_minreal(taill, tailr, _state)), _state); seterrorflag(&waserrors, ae_fp_greater(tailrprev,tailr), _state); seterrorflag(&waserrors, ae_fp_less(taillprev,taill), _state); tailrprev = tailr; taillprev = taill; for(i=0; i<=m-1; i++) { y.ptr.p_double[i] = y.ptr.p_double[i]+eshift; } } } } /* * Test frequency of p-value 0.05 */ testmin = 5; testmax = 50; testcnt = 10000; worsterr = 0.0; for(n=testmin; n<=testmax; n++) { m = n+hqrnduniformi(&rs, testmax-n+1, _state); ae_vector_set_length(&x, n, _state); ae_vector_set_length(&y, m, _state); /* * Generate two uniformly distributed values, calculate p-value for both-tails, repeat */ k = 0; for(pass=0; pass<=testcnt-1; pass++) { for(i=0; i<=n-1; i++) { x.ptr.p_double[i] = hqrnduniformr(&rs, _state); } for(i=0; i<=m-1; i++) { y.ptr.p_double[i] = hqrnduniformr(&rs, _state); } mannwhitneyutest(&x, n, &y, m, &tailb, &taill, &tailr, _state); if( ae_fp_less(tailb,0.05) ) { k = k+1; } } v = ae_fabs((double)k/(double)testcnt-0.05, _state); worsterr = ae_maxreal(worsterr, v, _state); /* * Test error in quantile; for different N's we have different tolerances */ if( n<10 ) { seterrorflag(&waserrors, ae_fp_greater(v,0.030), _state); } else { if( n<15 ) { seterrorflag(&waserrors, ae_fp_greater(v,0.020), _state); } else { if( n<30 ) { seterrorflag(&waserrors, ae_fp_greater(v,0.015), _state); } else { seterrorflag(&waserrors, ae_fp_greater(v,0.010), _state); } } } } /* * Test symmetry properties */ for(n=5; n<=50; n++) { for(m=5; m<=50; m++) { ae_vector_set_length(&x, n, _state); ae_vector_set_length(&y, m, _state); for(i=0; i<=n-1; i++) { x.ptr.p_double[i] = hqrnduniformr(&rs, _state); } for(i=0; i<=m-1; i++) { y.ptr.p_double[i] = hqrnduniformr(&rs, _state); } mannwhitneyutest(&x, n, &y, m, &tailb, &taill, &tailr, _state); mannwhitneyutest(&y, m, &x, n, &tailb1, &taill1, &tailr1, _state); seterrorflag(&waserrors, ae_fp_greater(ae_fabs(tailb-tailb1, _state),1.0E-12), _state); seterrorflag(&waserrors, ae_fp_greater(ae_fabs(taill-tailr1, _state),1.0E-12), _state); seterrorflag(&waserrors, ae_fp_greater(ae_fabs(tailr-taill1, _state),1.0E-12), _state); } } /* * Test for integer overflow in the function: if one crucial * calculation step is performed in 32-bit integer arithmetics, * it will return incorrect results. * * We use special handcrafted N, such that in 32-bit integer * arithmetics int32(N*N)<0. Such negative N leads to domain * error in the sqrt() function. 
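* For example, N=50000 gives N*N = 2,500,000,000, which exceeds the int32
* maximum 2,147,483,647; in wrapping 32-bit arithmetic it would come out as
* 2,500,000,000 - 2^32 = -1,794,967,296, and passing such a negative
* intermediate to sqrt() would produce a NaN caught by the finiteness
* checks below.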
*/ n = 50000; ae_vector_set_length(&x, n, _state); ae_vector_set_length(&y, n, _state); for(i=0; i<=n-1; i++) { x.ptr.p_double[i] = ae_sin((double)(10*i), _state); y.ptr.p_double[i] = ae_sin((double)(13*i), _state); } mannwhitneyutest(&x, n, &y, n, &tailb, &taill, &tailr, _state); seterrorflag(&waserrors, !ae_isfinite(tailb, _state), _state); seterrorflag(&waserrors, !ae_isfinite(taill, _state), _state); seterrorflag(&waserrors, !ae_isfinite(tailr, _state), _state); /* * */ if( !silent ) { if( waserrors ) { printf("TEST FAILED\n"); } else { printf("TEST PASSED\n"); } } result = !waserrors; ae_frame_leave(_state); return result; } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. *************************************************************************/ ae_bool _pexec_testmannwhitneyu(ae_bool silent, ae_state *_state) { return testmannwhitneyu(silent, _state); } static void testschurunit_testschurproblem(/* Real */ ae_matrix* a, ae_int_t n, double* materr, double* orterr, ae_bool* errstruct, ae_bool* wfailed, ae_state *_state); /************************************************************************* Testing Schur decomposition subroutine *************************************************************************/ ae_bool testschur(ae_bool silent, ae_state *_state) { ae_frame _frame_block; ae_matrix a; ae_int_t n; ae_int_t maxn; ae_int_t i; ae_int_t j; ae_int_t pass; ae_int_t passcount; ae_bool waserrors; ae_bool errstruct; ae_bool wfailed; double materr; double orterr; double threshold; ae_bool result; ae_frame_make(_state, &_frame_block); ae_matrix_init(&a, 0, 0, DT_REAL, _state); materr = (double)(0); orterr = (double)(0); errstruct = ae_false; wfailed = ae_false; waserrors = ae_false; maxn = 70; passcount = 1; threshold = 5*100*ae_machineepsilon; ae_matrix_set_length(&a, maxn-1+1, maxn-1+1, _state); /* * zero matrix, several cases */ for(i=0; i<=maxn-1; i++) { for(j=0; j<=maxn-1; j++) { a.ptr.pp_double[i][j] = (double)(0); } } for(n=1; n<=maxn; n++) { if( n>30&&n%2==0 ) { continue; } testschurunit_testschurproblem(&a, n, &materr, &orterr, &errstruct, &wfailed, _state); } /* * Dense matrix */ for(pass=1; pass<=passcount; pass++) { for(n=1; n<=maxn; n++) { if( n>30&&n%2==0 ) { continue; } for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { a.ptr.pp_double[i][j] = 2*ae_randomreal(_state)-1; } } testschurunit_testschurproblem(&a, n, &materr, &orterr, &errstruct, &wfailed, _state); } } /* * report */ waserrors = ((ae_fp_greater(materr,threshold)||ae_fp_greater(orterr,threshold))||errstruct)||wfailed; if( !silent ) { printf("TESTING SCHUR DECOMPOSITION\n"); printf("Schur decomposition error: %5.3e\n", (double)(materr)); printf("Schur orthogonality error: %5.3e\n", (double)(orterr)); printf("T matrix structure: "); if( !errstruct ) { printf("OK\n"); } else { printf("FAILED\n"); } printf("Always converged: "); if( !wfailed ) { printf("OK\n"); } else { printf("FAILED\n"); } printf("Threshold: %5.3e\n", (double)(threshold)); if( waserrors ) { printf("TEST FAILED\n"); } else { printf("TEST PASSED\n"); } printf("\n\n"); } result = !waserrors; ae_frame_leave(_state); return result; } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. 
*************************************************************************/ ae_bool _pexec_testschur(ae_bool silent, ae_state *_state) { return testschur(silent, _state); } static void testschurunit_testschurproblem(/* Real */ ae_matrix* a, ae_int_t n, double* materr, double* orterr, ae_bool* errstruct, ae_bool* wfailed, ae_state *_state) { ae_frame _frame_block; ae_matrix s; ae_matrix t; ae_vector sr; ae_vector astc; ae_vector sastc; ae_int_t i; ae_int_t j; ae_int_t k; double v; double locerr; ae_frame_make(_state, &_frame_block); ae_matrix_init(&s, 0, 0, DT_REAL, _state); ae_matrix_init(&t, 0, 0, DT_REAL, _state); ae_vector_init(&sr, 0, DT_REAL, _state); ae_vector_init(&astc, 0, DT_REAL, _state); ae_vector_init(&sastc, 0, DT_REAL, _state); ae_vector_set_length(&sr, n-1+1, _state); ae_vector_set_length(&astc, n-1+1, _state); ae_vector_set_length(&sastc, n-1+1, _state); /* * Schur decomposition, convergence test */ ae_matrix_set_length(&t, n-1+1, n-1+1, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { t.ptr.pp_double[i][j] = a->ptr.pp_double[i][j]; } } if( !rmatrixschur(&t, n, &s, _state) ) { *wfailed = ae_true; ae_frame_leave(_state); return; } /* * decomposition error */ locerr = (double)(0); for(j=0; j<=n-1; j++) { ae_v_move(&sr.ptr.p_double[0], 1, &s.ptr.pp_double[j][0], 1, ae_v_len(0,n-1)); for(k=0; k<=n-1; k++) { v = ae_v_dotproduct(&t.ptr.pp_double[k][0], 1, &sr.ptr.p_double[0], 1, ae_v_len(0,n-1)); astc.ptr.p_double[k] = v; } for(k=0; k<=n-1; k++) { v = ae_v_dotproduct(&s.ptr.pp_double[k][0], 1, &astc.ptr.p_double[0], 1, ae_v_len(0,n-1)); sastc.ptr.p_double[k] = v; } for(k=0; k<=n-1; k++) { locerr = ae_maxreal(locerr, ae_fabs(sastc.ptr.p_double[k]-a->ptr.pp_double[k][j], _state), _state); } } *materr = ae_maxreal(*materr, locerr, _state); /* * orthogonality error */ locerr = (double)(0); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { v = ae_v_dotproduct(&s.ptr.pp_double[0][i], s.stride, &s.ptr.pp_double[0][j], s.stride, ae_v_len(0,n-1)); if( i!=j ) { locerr = ae_maxreal(locerr, ae_fabs(v, _state), _state); } else { locerr = ae_maxreal(locerr, ae_fabs(v-1, _state), _state); } } } *orterr = ae_maxreal(*orterr, locerr, _state); /* * T matrix structure */ for(j=0; j<=n-1; j++) { for(i=j+2; i<=n-1; i++) { if( ae_fp_neq(t.ptr.pp_double[i][j],(double)(0)) ) { *errstruct = ae_true; } } } ae_frame_leave(_state); } /************************************************************************* Testing bidiagonal SVD decomposition subroutine *************************************************************************/ ae_bool testspdgevd(ae_bool silent, ae_state *_state) { ae_frame _frame_block; ae_int_t pass; ae_int_t n; ae_int_t passcount; ae_int_t maxn; ae_int_t atask; ae_int_t btask; ae_vector d; ae_vector t1; ae_matrix a; ae_matrix b; ae_matrix afull; ae_matrix bfull; ae_matrix l; ae_matrix z; ae_bool isuppera; ae_bool isupperb; ae_int_t i; ae_int_t j; ae_int_t minij; double v; double v1; double v2; double err; double valerr; double threshold; ae_bool waserrors; ae_bool wfailed; ae_bool wnsorted; ae_bool result; ae_frame_make(_state, &_frame_block); ae_vector_init(&d, 0, DT_REAL, _state); ae_vector_init(&t1, 0, DT_REAL, _state); ae_matrix_init(&a, 0, 0, DT_REAL, _state); ae_matrix_init(&b, 0, 0, DT_REAL, _state); ae_matrix_init(&afull, 0, 0, DT_REAL, _state); ae_matrix_init(&bfull, 0, 0, DT_REAL, _state); ae_matrix_init(&l, 0, 0, DT_REAL, _state); ae_matrix_init(&z, 0, 0, DT_REAL, _state); threshold = 10000*ae_machineepsilon; valerr = (double)(0); wfailed = ae_false; wnsorted = ae_false; 
maxn = 20; passcount = 5; /* * Main cycle */ for(n=1; n<=maxn; n++) { for(pass=1; pass<=passcount; pass++) { for(atask=0; atask<=1; atask++) { for(btask=0; btask<=1; btask++) { isuppera = atask==0; isupperb = btask==0; /* * Initialize A, B, AFull, BFull */ ae_vector_set_length(&t1, n-1+1, _state); ae_matrix_set_length(&a, n-1+1, n-1+1, _state); ae_matrix_set_length(&b, n-1+1, n-1+1, _state); ae_matrix_set_length(&afull, n-1+1, n-1+1, _state); ae_matrix_set_length(&bfull, n-1+1, n-1+1, _state); ae_matrix_set_length(&l, n-1+1, n-1+1, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { a.ptr.pp_double[i][j] = 2*ae_randomreal(_state)-1; a.ptr.pp_double[j][i] = a.ptr.pp_double[i][j]; afull.ptr.pp_double[i][j] = a.ptr.pp_double[i][j]; afull.ptr.pp_double[j][i] = a.ptr.pp_double[i][j]; } } for(i=0; i<=n-1; i++) { for(j=i+1; j<=n-1; j++) { l.ptr.pp_double[i][j] = ae_randomreal(_state); l.ptr.pp_double[j][i] = l.ptr.pp_double[i][j]; } l.ptr.pp_double[i][i] = 1.5+ae_randomreal(_state); } for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { minij = ae_minint(i, j, _state); v = ae_v_dotproduct(&l.ptr.pp_double[i][0], 1, &l.ptr.pp_double[0][j], l.stride, ae_v_len(0,minij)); b.ptr.pp_double[i][j] = v; b.ptr.pp_double[j][i] = v; bfull.ptr.pp_double[i][j] = v; bfull.ptr.pp_double[j][i] = v; } } for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { if( isuppera ) { if( jptr.pp_double[i][j] = a->ptr.pp_double[i][j]; } } } /************************************************************************* LU decomposition *************************************************************************/ static void testinverseupdateunit_matlu(/* Real */ ae_matrix* a, ae_int_t m, ae_int_t n, /* Integer */ ae_vector* pivots, ae_state *_state) { ae_frame _frame_block; ae_int_t i; ae_int_t j; ae_int_t jp; ae_vector t1; double s; ae_frame_make(_state, &_frame_block); ae_vector_clear(pivots); ae_vector_init(&t1, 0, DT_REAL, _state); ae_vector_set_length(pivots, ae_minint(m-1, n-1, _state)+1, _state); ae_vector_set_length(&t1, ae_maxint(m-1, n-1, _state)+1, _state); ae_assert(m>=0&&n>=0, "Error in LUDecomposition: incorrect function arguments", _state); /* * Quick return if possible */ if( m==0||n==0 ) { ae_frame_leave(_state); return; } for(j=0; j<=ae_minint(m-1, n-1, _state); j++) { /* * Find pivot and test for singularity. */ jp = j; for(i=j+1; i<=m-1; i++) { if( ae_fp_greater(ae_fabs(a->ptr.pp_double[i][j], _state),ae_fabs(a->ptr.pp_double[jp][j], _state)) ) { jp = i; } } pivots->ptr.p_int[j] = jp; if( ae_fp_neq(a->ptr.pp_double[jp][j],(double)(0)) ) { /* *Apply the interchange to rows */ if( jp!=j ) { ae_v_move(&t1.ptr.p_double[0], 1, &a->ptr.pp_double[j][0], 1, ae_v_len(0,n-1)); ae_v_move(&a->ptr.pp_double[j][0], 1, &a->ptr.pp_double[jp][0], 1, ae_v_len(0,n-1)); ae_v_move(&a->ptr.pp_double[jp][0], 1, &t1.ptr.p_double[0], 1, ae_v_len(0,n-1)); } /* *Compute elements J+1:M of J-th column. 
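*(note: the pivot row has already been swapped into place, so the subdiagonal part of the J-th column is simply scaled by 1/A[j][j])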
*/ if( j+1ptr.pp_double[j][j]; ae_v_muld(&a->ptr.pp_double[jp][j], a->stride, ae_v_len(jp,m-1), s); } } if( jptr.pp_double[i][j]; ae_v_subd(&a->ptr.pp_double[i][jp], 1, &a->ptr.pp_double[j][jp], 1, ae_v_len(jp,n-1), s); } } } ae_frame_leave(_state); } /************************************************************************* Generate matrix with given condition number C (2-norm) *************************************************************************/ static void testinverseupdateunit_generaterandomorthogonalmatrix(/* Real */ ae_matrix* a0, ae_int_t n, ae_state *_state) { ae_frame _frame_block; double t; double lambdav; ae_int_t s; ae_int_t i; ae_int_t j; double u1; double u2; ae_vector w; ae_vector v; ae_matrix a; double sm; ae_frame_make(_state, &_frame_block); ae_vector_init(&w, 0, DT_REAL, _state); ae_vector_init(&v, 0, DT_REAL, _state); ae_matrix_init(&a, 0, 0, DT_REAL, _state); if( n<=0 ) { ae_frame_leave(_state); return; } ae_vector_set_length(&w, n+1, _state); ae_vector_set_length(&v, n+1, _state); ae_matrix_set_length(&a, n+1, n+1, _state); ae_matrix_set_length(a0, n-1+1, n-1+1, _state); /* * Prepare A */ for(i=1; i<=n; i++) { for(j=1; j<=n; j++) { if( i==j ) { a.ptr.pp_double[i][j] = (double)(1); } else { a.ptr.pp_double[i][j] = (double)(0); } } } /* * Calculate A using Stewart algorithm */ for(s=2; s<=n; s++) { /* * Prepare v and Lambda = v'*v */ do { i = 1; while(i<=s) { u1 = 2*ae_randomreal(_state)-1; u2 = 2*ae_randomreal(_state)-1; sm = u1*u1+u2*u2; if( ae_fp_eq(sm,(double)(0))||ae_fp_greater(sm,(double)(1)) ) { continue; } sm = ae_sqrt(-2*ae_log(sm, _state)/sm, _state); v.ptr.p_double[i] = u1*sm; if( i+1<=s ) { v.ptr.p_double[i+1] = u2*sm; } i = i+2; } lambdav = ae_v_dotproduct(&v.ptr.p_double[1], 1, &v.ptr.p_double[1], 1, ae_v_len(1,s)); } while(ae_fp_eq(lambdav,(double)(0))); lambdav = 2/lambdav; /* * A * (I - 2 vv'/v'v ) = * = A - (2/v'v) * A * v * v' = * = A - (2/v'v) * w * v' * where w = Av */ for(i=1; i<=s; i++) { t = ae_v_dotproduct(&a.ptr.pp_double[i][1], 1, &v.ptr.p_double[1], 1, ae_v_len(1,s)); w.ptr.p_double[i] = t; } for(i=1; i<=s; i++) { t = w.ptr.p_double[i]*lambdav; ae_v_subd(&a.ptr.pp_double[i][1], 1, &v.ptr.p_double[1], 1, ae_v_len(1,s), t); } } /* * */ for(i=1; i<=n; i++) { for(j=1; j<=n; j++) { a0->ptr.pp_double[i-1][j-1] = a.ptr.pp_double[i][j]; } } ae_frame_leave(_state); } static void testinverseupdateunit_generaterandommatrixcond(/* Real */ ae_matrix* a0, ae_int_t n, double c, ae_state *_state) { ae_frame _frame_block; double l1; double l2; ae_matrix q1; ae_matrix q2; ae_vector cc; ae_int_t i; ae_int_t j; ae_int_t k; ae_frame_make(_state, &_frame_block); ae_matrix_init(&q1, 0, 0, DT_REAL, _state); ae_matrix_init(&q2, 0, 0, DT_REAL, _state); ae_vector_init(&cc, 0, DT_REAL, _state); testinverseupdateunit_generaterandomorthogonalmatrix(&q1, n, _state); testinverseupdateunit_generaterandomorthogonalmatrix(&q2, n, _state); ae_vector_set_length(&cc, n-1+1, _state); l1 = (double)(0); l2 = ae_log(1/c, _state); cc.ptr.p_double[0] = ae_exp(l1, _state); for(i=1; i<=n-2; i++) { cc.ptr.p_double[i] = ae_exp(ae_randomreal(_state)*(l2-l1)+l1, _state); } cc.ptr.p_double[n-1] = ae_exp(l2, _state); ae_matrix_set_length(a0, n-1+1, n-1+1, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { a0->ptr.pp_double[i][j] = (double)(0); for(k=0; k<=n-1; k++) { a0->ptr.pp_double[i][j] = a0->ptr.pp_double[i][j]+q1.ptr.pp_double[i][k]*cc.ptr.p_double[k]*q2.ptr.pp_double[j][k]; } } } ae_frame_leave(_state); } 
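/*************************************************************************
Illustrative sketch (added, not part of ALGLIB): GenerateRandomMatrixCond
above builds A0 = Q1 * diag(s) * Q2' with random orthogonal Q1, Q2 and
singular values s[k] spread log-uniformly between 1 and 1/C, which fixes
the 2-norm condition number at C.  The standalone helper below reproduces
only the diagonal part of that construction; the function name and the use
of the C standard library RNG are assumptions made for this example.
*************************************************************************/
#include <math.h>
#include <stdlib.h>

static void example_logspaced_singular_values(double *s, int n, double c)
{
    int i;
    double t;
    if( n==1 )
    {
        s[0] = 1.0;          /* a 1x1 matrix always has cond = 1 */
        return;
    }
    s[0]   = 1.0;            /* largest singular value                 */
    s[n-1] = 1.0/c;          /* smallest one, so that s[0]/s[n-1] = C  */
    for(i=1; i<=n-2; i++)
    {
        t = (double)rand()/(double)RAND_MAX;   /* uniform in [0,1]        */
        s[i] = exp(t*log(1.0/c));              /* log-uniform in [1/C, 1] */
    }
}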
/************************************************************************* triangular inverse *************************************************************************/ static ae_bool testinverseupdateunit_invmattr(/* Real */ ae_matrix* a, ae_int_t n, ae_bool isupper, ae_bool isunittriangular, ae_state *_state) { ae_frame _frame_block; ae_bool nounit; ae_int_t i; ae_int_t j; double v; double ajj; ae_vector t; ae_bool result; ae_frame_make(_state, &_frame_block); ae_vector_init(&t, 0, DT_REAL, _state); result = ae_true; ae_vector_set_length(&t, n-1+1, _state); /* * Test the input parameters. */ nounit = !isunittriangular; if( isupper ) { /* * Compute inverse of upper triangular matrix. */ for(j=0; j<=n-1; j++) { if( nounit ) { if( ae_fp_eq(a->ptr.pp_double[j][j],(double)(0)) ) { result = ae_false; ae_frame_leave(_state); return result; } a->ptr.pp_double[j][j] = 1/a->ptr.pp_double[j][j]; ajj = -a->ptr.pp_double[j][j]; } else { ajj = (double)(-1); } /* * Compute elements 1:j-1 of j-th column. */ if( j>0 ) { ae_v_move(&t.ptr.p_double[0], 1, &a->ptr.pp_double[0][j], a->stride, ae_v_len(0,j-1)); for(i=0; i<=j-1; i++) { if( iptr.pp_double[i][i+1], 1, &t.ptr.p_double[i+1], 1, ae_v_len(i+1,j-1)); } else { v = (double)(0); } if( nounit ) { a->ptr.pp_double[i][j] = v+a->ptr.pp_double[i][i]*t.ptr.p_double[i]; } else { a->ptr.pp_double[i][j] = v+t.ptr.p_double[i]; } } ae_v_muld(&a->ptr.pp_double[0][j], a->stride, ae_v_len(0,j-1), ajj); } } } else { /* * Compute inverse of lower triangular matrix. */ for(j=n-1; j>=0; j--) { if( nounit ) { if( ae_fp_eq(a->ptr.pp_double[j][j],(double)(0)) ) { result = ae_false; ae_frame_leave(_state); return result; } a->ptr.pp_double[j][j] = 1/a->ptr.pp_double[j][j]; ajj = -a->ptr.pp_double[j][j]; } else { ajj = (double)(-1); } if( jptr.pp_double[j+1][j], a->stride, ae_v_len(j+1,n-1)); for(i=j+1; i<=n-1; i++) { if( i>j+1 ) { v = ae_v_dotproduct(&a->ptr.pp_double[i][j+1], 1, &t.ptr.p_double[j+1], 1, ae_v_len(j+1,i-1)); } else { v = (double)(0); } if( nounit ) { a->ptr.pp_double[i][j] = v+a->ptr.pp_double[i][i]*t.ptr.p_double[i]; } else { a->ptr.pp_double[i][j] = v+t.ptr.p_double[i]; } } ae_v_muld(&a->ptr.pp_double[j+1][j], a->stride, ae_v_len(j+1,n-1), ajj); } } } ae_frame_leave(_state); return result; } /************************************************************************* LU inverse *************************************************************************/ static ae_bool testinverseupdateunit_invmatlu(/* Real */ ae_matrix* a, /* Integer */ ae_vector* pivots, ae_int_t n, ae_state *_state) { ae_frame _frame_block; ae_vector work; ae_int_t i; ae_int_t j; ae_int_t jp; double v; ae_bool result; ae_frame_make(_state, &_frame_block); ae_vector_init(&work, 0, DT_REAL, _state); result = ae_true; /* * Quick return if possible */ if( n==0 ) { ae_frame_leave(_state); return result; } ae_vector_set_length(&work, n-1+1, _state); /* * Form inv(U) */ if( !testinverseupdateunit_invmattr(a, n, ae_true, ae_false, _state) ) { result = ae_false; ae_frame_leave(_state); return result; } /* * Solve the equation inv(A)*L = inv(U) for inv(A). */ for(j=n-1; j>=0; j--) { /* * Copy current column of L to WORK and replace with zeros. */ for(i=j+1; i<=n-1; i++) { work.ptr.p_double[i] = a->ptr.pp_double[i][j]; a->ptr.pp_double[i][j] = (double)(0); } /* * Compute current column of inv(A). */ if( jptr.pp_double[i][j+1], 1, &work.ptr.p_double[j+1], 1, ae_v_len(j+1,n-1)); a->ptr.pp_double[i][j] = a->ptr.pp_double[i][j]-v; } } } /* * Apply column interchanges. 
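* (undoing the row pivoting recorded in Pivots as column swaps converts inv(L*U) into inv(A); the factorization produced by MatLU is A = P*L*U)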
*/ for(j=n-2; j>=0; j--) { jp = pivots->ptr.p_int[j]; if( jp!=j ) { ae_v_move(&work.ptr.p_double[0], 1, &a->ptr.pp_double[0][j], a->stride, ae_v_len(0,n-1)); ae_v_move(&a->ptr.pp_double[0][j], a->stride, &a->ptr.pp_double[0][jp], a->stride, ae_v_len(0,n-1)); ae_v_move(&a->ptr.pp_double[0][jp], a->stride, &work.ptr.p_double[0], 1, ae_v_len(0,n-1)); } } ae_frame_leave(_state); return result; } /************************************************************************* Matrix inverse *************************************************************************/ static ae_bool testinverseupdateunit_invmat(/* Real */ ae_matrix* a, ae_int_t n, ae_state *_state) { ae_frame _frame_block; ae_vector pivots; ae_bool result; ae_frame_make(_state, &_frame_block); ae_vector_init(&pivots, 0, DT_INT, _state); testinverseupdateunit_matlu(a, n, n, &pivots, _state); result = testinverseupdateunit_invmatlu(a, &pivots, n, _state); ae_frame_leave(_state); return result; } /************************************************************************* Diff *************************************************************************/ static double testinverseupdateunit_matrixdiff(/* Real */ ae_matrix* a, /* Real */ ae_matrix* b, ae_int_t m, ae_int_t n, ae_state *_state) { ae_int_t i; ae_int_t j; double result; result = (double)(0); for(i=0; i<=m-1; i++) { for(j=0; j<=n-1; j++) { result = ae_maxreal(result, ae_fabs(b->ptr.pp_double[i][j]-a->ptr.pp_double[i][j], _state), _state); } } return result; } /************************************************************************* Update and inverse *************************************************************************/ static ae_bool testinverseupdateunit_updandinv(/* Real */ ae_matrix* a, /* Real */ ae_vector* u, /* Real */ ae_vector* v, ae_int_t n, ae_state *_state) { ae_frame _frame_block; ae_vector pivots; ae_int_t i; double r; ae_bool result; ae_frame_make(_state, &_frame_block); ae_vector_init(&pivots, 0, DT_INT, _state); for(i=0; i<=n-1; i++) { r = u->ptr.p_double[i]; ae_v_addd(&a->ptr.pp_double[i][0], 1, &v->ptr.p_double[0], 1, ae_v_len(0,n-1), r); } testinverseupdateunit_matlu(a, n, n, &pivots, _state); result = testinverseupdateunit_invmatlu(a, &pivots, n, _state); ae_frame_leave(_state); return result; } /************************************************************************* Test *************************************************************************/ ae_bool testpolynomialsolver(ae_bool silent, ae_state *_state) { ae_frame _frame_block; ae_bool wereerrors; ae_vector a; ae_vector x; double eps; ae_int_t n; polynomialsolverreport rep; ae_bool result; ae_frame_make(_state, &_frame_block); ae_vector_init(&a, 0, DT_REAL, _state); ae_vector_init(&x, 0, DT_COMPLEX, _state); _polynomialsolverreport_init(&rep, _state); wereerrors = ae_false; /* * Basic tests */ eps = 1.0E-6; n = 1; ae_vector_set_length(&a, n+1, _state); a.ptr.p_double[0] = (double)(2); a.ptr.p_double[1] = (double)(3); polynomialsolve(&a, n, &x, &rep, _state); seterrorflag(&wereerrors, ae_fp_greater(ae_fabs(x.ptr.p_complex[0].x+(double)2/(double)3, _state),eps), _state); seterrorflag(&wereerrors, ae_fp_neq(x.ptr.p_complex[0].y,(double)(0)), _state); seterrorflag(&wereerrors, ae_fp_greater(rep.maxerr,100*ae_machineepsilon), _state); n = 2; ae_vector_set_length(&a, n+1, _state); a.ptr.p_double[0] = (double)(1); a.ptr.p_double[1] = (double)(-2); a.ptr.p_double[2] = (double)(1); polynomialsolve(&a, n, &x, &rep, _state); seterrorflag(&wereerrors, ae_fp_greater(ae_c_abs(ae_c_sub_d(x.ptr.p_complex[0],1), 
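/* p(x) = 1 - 2x + x^2 = (x-1)^2: the quadratic has a double root at 1, so both reported roots must lie within eps of 1 */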
_state),eps), _state); seterrorflag(&wereerrors, ae_fp_greater(ae_c_abs(ae_c_sub_d(x.ptr.p_complex[1],1), _state),eps), _state); seterrorflag(&wereerrors, ae_fp_greater(rep.maxerr,100*ae_machineepsilon), _state); n = 2; ae_vector_set_length(&a, n+1, _state); a.ptr.p_double[0] = (double)(2); a.ptr.p_double[1] = (double)(-3); a.ptr.p_double[2] = (double)(1); polynomialsolve(&a, n, &x, &rep, _state); if( ae_fp_less(x.ptr.p_complex[0].x,x.ptr.p_complex[1].x) ) { seterrorflag(&wereerrors, ae_fp_greater(ae_fabs(x.ptr.p_complex[0].x-1, _state),eps), _state); seterrorflag(&wereerrors, ae_fp_greater(ae_fabs(x.ptr.p_complex[1].x-2, _state),eps), _state); } else { seterrorflag(&wereerrors, ae_fp_greater(ae_fabs(x.ptr.p_complex[0].x-2, _state),eps), _state); seterrorflag(&wereerrors, ae_fp_greater(ae_fabs(x.ptr.p_complex[1].x-1, _state),eps), _state); } seterrorflag(&wereerrors, ae_fp_neq(x.ptr.p_complex[0].y,(double)(0)), _state); seterrorflag(&wereerrors, ae_fp_neq(x.ptr.p_complex[1].y,(double)(0)), _state); seterrorflag(&wereerrors, ae_fp_greater(rep.maxerr,100*ae_machineepsilon), _state); n = 2; ae_vector_set_length(&a, n+1, _state); a.ptr.p_double[0] = (double)(1); a.ptr.p_double[1] = (double)(0); a.ptr.p_double[2] = (double)(1); polynomialsolve(&a, n, &x, &rep, _state); seterrorflag(&wereerrors, ae_fp_greater(ae_c_abs(ae_c_add_d(ae_c_mul(x.ptr.p_complex[0],x.ptr.p_complex[0]),(double)(1)), _state),eps), _state); seterrorflag(&wereerrors, ae_fp_greater(rep.maxerr,100*ae_machineepsilon), _state); n = 4; ae_vector_set_length(&a, n+1, _state); a.ptr.p_double[0] = (double)(0); a.ptr.p_double[1] = (double)(0); a.ptr.p_double[2] = (double)(0); a.ptr.p_double[3] = (double)(0); a.ptr.p_double[4] = (double)(1); polynomialsolve(&a, n, &x, &rep, _state); seterrorflag(&wereerrors, ae_c_neq_d(x.ptr.p_complex[0],(double)(0)), _state); seterrorflag(&wereerrors, ae_c_neq_d(x.ptr.p_complex[1],(double)(0)), _state); seterrorflag(&wereerrors, ae_c_neq_d(x.ptr.p_complex[2],(double)(0)), _state); seterrorflag(&wereerrors, ae_c_neq_d(x.ptr.p_complex[3],(double)(0)), _state); seterrorflag(&wereerrors, ae_fp_greater(rep.maxerr,100*ae_machineepsilon), _state); n = 2; ae_vector_set_length(&a, n+1, _state); a.ptr.p_double[0] = (double)(0); a.ptr.p_double[1] = (double)(3); a.ptr.p_double[2] = (double)(2); polynomialsolve(&a, n, &x, &rep, _state); if( ae_fp_greater(x.ptr.p_complex[0].x,x.ptr.p_complex[1].x) ) { seterrorflag(&wereerrors, ae_c_neq_d(x.ptr.p_complex[0],(double)(0)), _state); seterrorflag(&wereerrors, ae_fp_greater(ae_fabs(x.ptr.p_complex[1].x+(double)3/(double)2, _state),eps), _state); seterrorflag(&wereerrors, ae_fp_neq(x.ptr.p_complex[1].y,(double)(0)), _state); } else { seterrorflag(&wereerrors, ae_c_neq_d(x.ptr.p_complex[1],(double)(0)), _state); seterrorflag(&wereerrors, ae_fp_greater(ae_fabs(x.ptr.p_complex[0].x+(double)3/(double)2, _state),eps), _state); seterrorflag(&wereerrors, ae_fp_neq(x.ptr.p_complex[0].y,(double)(0)), _state); } seterrorflag(&wereerrors, ae_fp_greater(rep.maxerr,100*ae_machineepsilon), _state); if( !silent ) { printf("TESTING POLYNOMIAL SOLVER\n"); if( wereerrors ) { printf("TEST FAILED\n"); } else { printf("TEST PASSED\n"); } } result = !wereerrors; ae_frame_leave(_state); return result; } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. 
*************************************************************************/ ae_bool _pexec_testpolynomialsolver(ae_bool silent, ae_state *_state) { return testpolynomialsolver(silent, _state); } static void testnlequnit_testfunchbm(nleqstate* state, ae_state *_state); static void testnlequnit_testfunchb1(nleqstate* state, ae_state *_state); static void testnlequnit_testfuncshbm(nleqstate* state, ae_state *_state); ae_bool testnleq(ae_bool silent, ae_state *_state) { ae_frame _frame_block; ae_bool waserrors; ae_bool basicserrors; ae_bool converror; ae_bool othererrors; ae_int_t n; ae_vector x; ae_int_t i; ae_int_t k; double v; double flast; ae_vector xlast; ae_bool firstrep; ae_int_t nfunc; ae_int_t njac; ae_int_t itcnt; nleqstate state; nleqreport rep; ae_int_t pass; ae_int_t passcount; double epsf; double stpmax; ae_bool result; ae_frame_make(_state, &_frame_block); ae_vector_init(&x, 0, DT_REAL, _state); ae_vector_init(&xlast, 0, DT_REAL, _state); _nleqstate_init(&state, _state); _nleqreport_init(&rep, _state); waserrors = ae_false; basicserrors = ae_false; converror = ae_false; othererrors = ae_false; /* * Basic tests * * Test with Himmelblau's function (M): * * ability to find correct result * * ability to work after soft restart (restart after finish) * * ability to work after hard restart (restart in the middle of optimization) */ passcount = 100; for(pass=0; pass<=passcount-1; pass++) { /* * Ability to find correct result */ ae_vector_set_length(&x, 2, _state); x.ptr.p_double[0] = 20*ae_randomreal(_state)-10; x.ptr.p_double[1] = 20*ae_randomreal(_state)-10; nleqcreatelm(2, 2, &x, &state, _state); epsf = 1.0E-9; nleqsetcond(&state, epsf, 0, _state); while(nleqiteration(&state, _state)) { testnlequnit_testfunchbm(&state, _state); } nleqresults(&state, &x, &rep, _state); if( rep.terminationtype>0 ) { basicserrors = basicserrors||ae_fp_greater(ae_sqr(x.ptr.p_double[0]*x.ptr.p_double[0]+x.ptr.p_double[1]-11, _state)+ae_sqr(x.ptr.p_double[0]+x.ptr.p_double[1]*x.ptr.p_double[1]-7, _state),ae_sqr(epsf, _state)); } else { basicserrors = ae_true; } /* * Ability to work after soft restart */ ae_vector_set_length(&x, 2, _state); x.ptr.p_double[0] = 20*ae_randomreal(_state)-10; x.ptr.p_double[1] = 20*ae_randomreal(_state)-10; nleqcreatelm(2, 2, &x, &state, _state); epsf = 1.0E-9; nleqsetcond(&state, epsf, 0, _state); while(nleqiteration(&state, _state)) { testnlequnit_testfunchbm(&state, _state); } nleqresults(&state, &x, &rep, _state); ae_vector_set_length(&x, 2, _state); x.ptr.p_double[0] = 20*ae_randomreal(_state)-10; x.ptr.p_double[1] = 20*ae_randomreal(_state)-10; nleqrestartfrom(&state, &x, _state); while(nleqiteration(&state, _state)) { testnlequnit_testfunchbm(&state, _state); } nleqresults(&state, &x, &rep, _state); if( rep.terminationtype>0 ) { basicserrors = basicserrors||ae_fp_greater(ae_sqr(x.ptr.p_double[0]*x.ptr.p_double[0]+x.ptr.p_double[1]-11, _state)+ae_sqr(x.ptr.p_double[0]+x.ptr.p_double[1]*x.ptr.p_double[1]-7, _state),ae_sqr(epsf, _state)); } else { basicserrors = ae_true; } /* * Ability to work after hard restart: * * stopping condition: small F * * StpMax is so small that we need about 10000 iterations to * find solution (steps are small) * * choose random K significantly less that 9999 * * iterate for some time, then break, restart optimization */ ae_vector_set_length(&x, 2, _state); x.ptr.p_double[0] = (double)(100); x.ptr.p_double[1] = (double)(100); nleqcreatelm(2, 2, &x, &state, _state); epsf = 1.0E-9; nleqsetcond(&state, epsf, 0, _state); nleqsetstpmax(&state, 0.01, 
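/* StpMax=0.01 keeps steps small on purpose: the solver is interrupted after a random number of iterations below and then restarted from a new point (hard-restart test) */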
_state); k = 1+ae_randominteger(100, _state); for(i=0; i<=k-1; i++) { if( !nleqiteration(&state, _state) ) { break; } testnlequnit_testfunchbm(&state, _state); } ae_vector_set_length(&x, 2, _state); x.ptr.p_double[0] = 20*ae_randomreal(_state)-10; x.ptr.p_double[1] = 20*ae_randomreal(_state)-10; nleqrestartfrom(&state, &x, _state); while(nleqiteration(&state, _state)) { testnlequnit_testfunchbm(&state, _state); } nleqresults(&state, &x, &rep, _state); if( rep.terminationtype>0 ) { basicserrors = basicserrors||ae_fp_greater(ae_sqr(x.ptr.p_double[0]*x.ptr.p_double[0]+x.ptr.p_double[1]-11, _state)+ae_sqr(x.ptr.p_double[0]+x.ptr.p_double[1]*x.ptr.p_double[1]-7, _state),ae_sqr(epsf, _state)); } else { basicserrors = ae_true; } } /* * Basic tests * * Test with Himmelblau's function (1): * * ability to find correct result */ passcount = 100; for(pass=0; pass<=passcount-1; pass++) { /* * Ability to find correct result */ ae_vector_set_length(&x, 2, _state); x.ptr.p_double[0] = 20*ae_randomreal(_state)-10; x.ptr.p_double[1] = 20*ae_randomreal(_state)-10; nleqcreatelm(2, 1, &x, &state, _state); epsf = 1.0E-9; nleqsetcond(&state, epsf, 0, _state); while(nleqiteration(&state, _state)) { testnlequnit_testfunchb1(&state, _state); } nleqresults(&state, &x, &rep, _state); if( rep.terminationtype>0 ) { basicserrors = basicserrors||ae_fp_greater(ae_sqr(x.ptr.p_double[0]*x.ptr.p_double[0]+x.ptr.p_double[1]-11, _state)+ae_sqr(x.ptr.p_double[0]+x.ptr.p_double[1]*x.ptr.p_double[1]-7, _state),epsf); } else { basicserrors = ae_true; } } /* * Basic tests * * Ability to detect situation when we can't find minimum */ passcount = 100; for(pass=0; pass<=passcount-1; pass++) { ae_vector_set_length(&x, 2, _state); x.ptr.p_double[0] = 20*ae_randomreal(_state)-10; x.ptr.p_double[1] = 20*ae_randomreal(_state)-10; nleqcreatelm(2, 3, &x, &state, _state); epsf = 1.0E-9; nleqsetcond(&state, epsf, 0, _state); while(nleqiteration(&state, _state)) { testnlequnit_testfuncshbm(&state, _state); } nleqresults(&state, &x, &rep, _state); basicserrors = basicserrors||rep.terminationtype!=-4; } /* * Test correctness of intermediate reports and final report: * * first report is starting point * * function value decreases on subsequent reports * * function value is correctly reported * * last report is final point * * NFunc and NJac are compared with values counted directly * * IterationsCount is compared with value counter directly */ n = 2; ae_vector_set_length(&x, n, _state); ae_vector_set_length(&xlast, n, _state); x.ptr.p_double[0] = 20*ae_randomreal(_state)-10; x.ptr.p_double[1] = 20*ae_randomreal(_state)-10; xlast.ptr.p_double[0] = ae_maxrealnumber; xlast.ptr.p_double[1] = ae_maxrealnumber; nleqcreatelm(n, 2, &x, &state, _state); nleqsetcond(&state, 1.0E-6, 0, _state); nleqsetxrep(&state, ae_true, _state); firstrep = ae_true; flast = ae_maxrealnumber; nfunc = 0; njac = 0; itcnt = 0; while(nleqiteration(&state, _state)) { if( state.xupdated ) { /* * first report must be starting point */ if( firstrep ) { for(i=0; i<=n-1; i++) { othererrors = othererrors||ae_fp_neq(state.x.ptr.p_double[i],x.ptr.p_double[i]); } firstrep = ae_false; } /* * function value must decrease */ othererrors = othererrors||ae_fp_greater(state.f,flast); /* * check correctness of function value */ v = ae_sqr(state.x.ptr.p_double[0]*state.x.ptr.p_double[0]+state.x.ptr.p_double[1]-11, _state)+ae_sqr(state.x.ptr.p_double[0]+state.x.ptr.p_double[1]*state.x.ptr.p_double[1]-7, _state); othererrors = othererrors||ae_fp_greater(ae_fabs(v-state.f, _state)/ae_maxreal(v, 
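/* relative error of the reported function value; max(v,1) in the denominator guards against division by a tiny v */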
(double)(1), _state),100*ae_machineepsilon); /* * update info and continue */ ae_v_move(&xlast.ptr.p_double[0], 1, &state.x.ptr.p_double[0], 1, ae_v_len(0,n-1)); flast = state.f; itcnt = itcnt+1; continue; } if( state.needf ) { nfunc = nfunc+1; } if( state.needfij ) { nfunc = nfunc+1; njac = njac+1; } testnlequnit_testfunchbm(&state, _state); } nleqresults(&state, &x, &rep, _state); if( rep.terminationtype>0 ) { othererrors = (othererrors||ae_fp_neq(xlast.ptr.p_double[0],x.ptr.p_double[0]))||ae_fp_neq(xlast.ptr.p_double[1],x.ptr.p_double[1]); v = ae_sqr(x.ptr.p_double[0]*x.ptr.p_double[0]+x.ptr.p_double[1]-11, _state)+ae_sqr(x.ptr.p_double[0]+x.ptr.p_double[1]*x.ptr.p_double[1]-7, _state); othererrors = othererrors||ae_fp_greater(ae_fabs(flast-v, _state)/ae_maxreal(v, (double)(1), _state),100*ae_machineepsilon); } else { converror = ae_true; } othererrors = othererrors||rep.nfunc!=nfunc; othererrors = othererrors||rep.njac!=njac; othererrors = othererrors||rep.iterationscount!=itcnt-1; /* * Test ability to set limit on algorithm steps */ ae_vector_set_length(&x, 2, _state); ae_vector_set_length(&xlast, 2, _state); x.ptr.p_double[0] = 20*ae_randomreal(_state)+20; x.ptr.p_double[1] = 20*ae_randomreal(_state)+20; xlast.ptr.p_double[0] = x.ptr.p_double[0]; xlast.ptr.p_double[1] = x.ptr.p_double[1]; stpmax = 0.1+0.1*ae_randomreal(_state); epsf = 1.0E-9; nleqcreatelm(2, 3, &x, &state, _state); nleqsetstpmax(&state, stpmax, _state); nleqsetcond(&state, epsf, 0, _state); nleqsetxrep(&state, ae_true, _state); while(nleqiteration(&state, _state)) { if( state.needf||state.needfij ) { testnlequnit_testfunchbm(&state, _state); } if( (state.needf||state.needfij)||state.xupdated ) { othererrors = othererrors||ae_fp_greater(ae_sqrt(ae_sqr(state.x.ptr.p_double[0]-xlast.ptr.p_double[0], _state)+ae_sqr(state.x.ptr.p_double[1]-xlast.ptr.p_double[1], _state), _state),1.00001*stpmax); } if( state.xupdated ) { xlast.ptr.p_double[0] = state.x.ptr.p_double[0]; xlast.ptr.p_double[1] = state.x.ptr.p_double[1]; } } /* * end */ waserrors = (basicserrors||converror)||othererrors; if( !silent ) { printf("TESTING NLEQ SOLVER\n"); printf("BASIC FUNCTIONALITY: "); if( basicserrors ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("CONVERGENCE: "); if( converror ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("OTHER PROPERTIES: "); if( othererrors ) { printf("FAILED\n"); } else { printf("OK\n"); } if( waserrors ) { printf("TEST FAILED\n"); } else { printf("TEST PASSED\n"); } printf("\n\n"); } result = !waserrors; ae_frame_leave(_state); return result; } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. 
*************************************************************************/ ae_bool _pexec_testnleq(ae_bool silent, ae_state *_state) { return testnleq(silent, _state); } /************************************************************************* Himmelblau's function F = (x^2+y-11)^2 + (x+y^2-7)^2 posed as system of M functions: f0 = x^2+y-11 f1 = x+y^2-7 *************************************************************************/ static void testnlequnit_testfunchbm(nleqstate* state, ae_state *_state) { double x; double y; ae_assert(state->needf||state->needfij, "TestNLEQUnit: internal error!", _state); x = state->x.ptr.p_double[0]; y = state->x.ptr.p_double[1]; if( state->needf ) { state->f = ae_sqr(x*x+y-11, _state)+ae_sqr(x+y*y-7, _state); return; } if( state->needfij ) { state->fi.ptr.p_double[0] = x*x+y-11; state->fi.ptr.p_double[1] = x+y*y-7; state->j.ptr.pp_double[0][0] = 2*x; state->j.ptr.pp_double[0][1] = (double)(1); state->j.ptr.pp_double[1][0] = (double)(1); state->j.ptr.pp_double[1][1] = 2*y; return; } } /************************************************************************* Himmelblau's function F = (x^2+y-11)^2 + (x+y^2-7)^2 posed as system of 1 function *************************************************************************/ static void testnlequnit_testfunchb1(nleqstate* state, ae_state *_state) { double x; double y; ae_assert(state->needf||state->needfij, "TestNLEQUnit: internal error!", _state); x = state->x.ptr.p_double[0]; y = state->x.ptr.p_double[1]; if( state->needf ) { state->f = ae_sqr(ae_sqr(x*x+y-11, _state)+ae_sqr(x+y*y-7, _state), _state); return; } if( state->needfij ) { state->fi.ptr.p_double[0] = ae_sqr(x*x+y-11, _state)+ae_sqr(x+y*y-7, _state); state->j.ptr.pp_double[0][0] = 2*(x*x+y-11)*2*x+2*(x+y*y-7); state->j.ptr.pp_double[0][1] = 2*(x*x+y-11)+2*(x+y*y-7)*2*y; return; } } /************************************************************************* Shifted Himmelblau's function F = (x^2+y-11)^2 + (x+y^2-7)^2 + 1 posed as system of M functions: f0 = x^2+y-11 f1 = x+y^2-7 f2 = 1 This function is used to test algorithm on problem which has no solution. 
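Since f2 is identically 1, F = f0^2 + f1^2 + f2^2 >= 1 for every (x,y), so no root exists and the solver is expected to return TerminationType = -4.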
*************************************************************************/ static void testnlequnit_testfuncshbm(nleqstate* state, ae_state *_state) { double x; double y; ae_assert(state->needf||state->needfij, "TestNLEQUnit: internal error!", _state); x = state->x.ptr.p_double[0]; y = state->x.ptr.p_double[1]; if( state->needf ) { state->f = ae_sqr(x*x+y-11, _state)+ae_sqr(x+y*y-7, _state)+1; return; } if( state->needfij ) { state->fi.ptr.p_double[0] = x*x+y-11; state->fi.ptr.p_double[1] = x+y*y-7; state->fi.ptr.p_double[2] = (double)(1); state->j.ptr.pp_double[0][0] = 2*x; state->j.ptr.pp_double[0][1] = (double)(1); state->j.ptr.pp_double[1][0] = (double)(1); state->j.ptr.pp_double[1][1] = 2*y; state->j.ptr.pp_double[2][0] = (double)(0); state->j.ptr.pp_double[2][1] = (double)(0); return; } } static double testlincgunit_e0 = 1.0E-6; static double testlincgunit_maxcond = 30; static ae_bool testlincgunit_complextest(ae_bool silent, ae_state *_state); static ae_bool testlincgunit_complexres(ae_bool silent, ae_state *_state); static ae_bool testlincgunit_basictestx(ae_bool silent, ae_state *_state); static ae_bool testlincgunit_testrcorrectness(ae_bool silent, ae_state *_state); static ae_bool testlincgunit_basictestiters(ae_bool silent, ae_state *_state); static ae_bool testlincgunit_krylovsubspacetest(ae_bool silent, ae_state *_state); static ae_bool testlincgunit_sparsetest(ae_bool silent, ae_state *_state); static ae_bool testlincgunit_precondtest(ae_bool silent, ae_state *_state); static void testlincgunit_gramshmidtortnorm(/* Real */ ae_matrix* a, ae_int_t n, ae_int_t k, double eps, /* Real */ ae_matrix* b, ae_int_t* k2, ae_state *_state); static ae_bool testlincgunit_frombasis(/* Real */ ae_vector* x, /* Real */ ae_matrix* basis, ae_int_t n, ae_int_t k, double eps, ae_state *_state); ae_bool testlincg(ae_bool silent, ae_state *_state) { ae_bool basictestxerrors; ae_bool basictestiterserr; ae_bool complexreserrors; ae_bool complexerrors; ae_bool rcorrectness; ae_bool krylovsubspaceerr; ae_bool sparseerrors; ae_bool preconderrors; ae_bool waserrors; ae_bool result; basictestxerrors = testlincgunit_basictestx(ae_true, _state); basictestiterserr = testlincgunit_basictestiters(ae_true, _state); complexreserrors = testlincgunit_complexres(ae_true, _state); complexerrors = testlincgunit_complextest(ae_true, _state); rcorrectness = testlincgunit_testrcorrectness(ae_true, _state); krylovsubspaceerr = testlincgunit_krylovsubspacetest(ae_true, _state); sparseerrors = testlincgunit_sparsetest(ae_true, _state); preconderrors = testlincgunit_precondtest(ae_true, _state); /* * report */ waserrors = ((((((basictestxerrors||complexreserrors)||complexerrors)||rcorrectness)||basictestiterserr)||krylovsubspaceerr)||sparseerrors)||preconderrors; if( !silent ) { printf("TESTING LinCG\n"); printf("BasicTestX: "); if( basictestxerrors ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("BasicTestIters: "); if( basictestiterserr ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("ComplexResTest: "); if( complexreserrors ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("ComplexTest: "); if( complexerrors ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("R2 correctness: "); if( rcorrectness ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("KrylovSubSpaceTest: "); if( krylovsubspaceerr ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("SparseTest: "); if( sparseerrors ) { printf("FAILED\n"); } else { printf("OK\n"); } printf("PrecondTest: "); if( preconderrors ) { 
printf("FAILED\n"); } else { printf("OK\n"); } /* *was errors? */ if( waserrors ) { printf("TEST FAILED\n"); } else { printf("TEST PASSED\n"); } printf("\n\n"); } result = !waserrors; return result; } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. *************************************************************************/ ae_bool _pexec_testlincg(ae_bool silent, ae_state *_state) { return testlincg(silent, _state); } /************************************************************************* Function for testing LinCGIteration function(custom option), which to solve Ax=b(here A is random positive definite matrix NxN, b is random vector). It uses the default stopping criterion(RNormk; 2. (rk,rm)=0 for any m<>k; 3. (rk,pm)=0 for any m<>k; INPUT: Silent - if true then function output report -- ALGLIB -- Copyright 14.11.2011 by Bochkanov Sergey *************************************************************************/ static ae_bool testlincgunit_complextest(ae_bool silent, ae_state *_state) { ae_frame _frame_block; lincgstate state; lincgreport rep; ae_matrix a; ae_vector b; ae_int_t n; double c; ae_vector x0; ae_vector residual; double normofresidual; double sclr; double na; double nv0; double nv1; ae_int_t sz; double mx; ae_int_t i; ae_int_t j; ae_int_t k; ae_int_t l; double tmp; ae_matrix mtx; ae_matrix mtp; ae_matrix mtr; double getrnorm; ae_int_t numofit; ae_bool result; ae_frame_make(_state, &_frame_block); _lincgstate_init(&state, _state); _lincgreport_init(&rep, _state); ae_matrix_init(&a, 0, 0, DT_REAL, _state); ae_vector_init(&b, 0, DT_REAL, _state); ae_vector_init(&x0, 0, DT_REAL, _state); ae_vector_init(&residual, 0, DT_REAL, _state); ae_matrix_init(&mtx, 0, 0, DT_REAL, _state); ae_matrix_init(&mtp, 0, 0, DT_REAL, _state); ae_matrix_init(&mtr, 0, 0, DT_REAL, _state); mx = (double)(100); n = 5; for(sz=1; sz<=n; sz++) { /* * Generate: * * random A with norm NA (equal to 1.0), * * random right part B whose elements are uniformly distributed in [-MX,+MX] * * random starting point X0 whose elements are uniformly distributed in [-MX,+MX] */ c = 15+15*ae_randomreal(_state); spdmatrixrndcond(sz, c, &a, _state); na = (double)(1); ae_vector_set_length(&b, sz, _state); for(i=0; i<=sz-1; i++) { b.ptr.p_double[i] = mx*(2*ae_randomreal(_state)-1); } ae_vector_set_length(&x0, sz, _state); for(i=0; i<=sz-1; i++) { x0.ptr.p_double[i] = mx*(2*ae_randomreal(_state)-1); } ae_matrix_set_length(&mtx, sz+1, sz, _state); /* * Start optimization, record its progress for further analysis * NOTE: we set update frequency of R to 2 in order to test that R is updated correctly */ lincgcreate(sz, &state, _state); lincgsetxrep(&state, ae_true, _state); lincgsetb(&state, &b, _state); lincgsetstartingpoint(&state, &x0, _state); lincgsetcond(&state, (double)(0), sz, _state); lincgsetrupdatefreq(&state, 2, _state); numofit = 0; getrnorm = ae_maxrealnumber; while(lincgiteration(&state, _state)) { if( state.needmv ) { for(i=0; i<=sz-1; i++) { state.mv.ptr.p_double[i] = (double)(0); for(j=0; j<=sz-1; j++) { state.mv.ptr.p_double[i] = state.mv.ptr.p_double[i]+a.ptr.pp_double[i][j]*state.x.ptr.p_double[j]; } } } if( state.needvmv ) { state.vmv = (double)(0); for(i=0; i<=sz-1; i++) { state.mv.ptr.p_double[i] = (double)(0); for(j=0; j<=sz-1; j++) { state.mv.ptr.p_double[i] = state.mv.ptr.p_double[i]+a.ptr.pp_double[i][j]*state.x.ptr.p_double[j]; } state.vmv = state.vmv+state.mv.ptr.p_double[i]*state.x.ptr.p_double[i]; } } if( 
state.needprec ) { for(i=0; i<=sz-1; i++) { state.pv.ptr.p_double[i] = state.x.ptr.p_double[i]; } } if( state.xupdated ) { /* * Save current point to MtX, it will be used later for additional tests */ if( numofit>=mtx.rows ) { result = ae_true; ae_frame_leave(_state); return result; } for(i=0; i<=sz-1; i++) { mtx.ptr.pp_double[numofit][i] = state.x.ptr.p_double[i]; } getrnorm = state.r2; numofit = numofit+1; } } lincgresults(&state, &x0, &rep, _state); if( ae_fp_neq(getrnorm,rep.r2) ) { if( !silent ) { printf("IterationsCount=%0d;\nNMV=%0d;\nTerminationType=%0d;\n", (int)(rep.iterationscount), (int)(rep.nmv), (int)(rep.terminationtype)); printf("Size=%0d;\nCond=%0.5f;\nComplexTest::Fail::GetRNorm<>Rep.R2!(%0.2e<>%0.2e)\n", (int)(sz), (double)(c), (double)(getrnorm), (double)(rep.r2)); } result = ae_true; ae_frame_leave(_state); return result; } /* * Calculate residual, check result */ ae_vector_set_length(&residual, sz, _state); for(i=0; i<=sz-1; i++) { tmp = (double)(0); for(j=0; j<=sz-1; j++) { tmp = tmp+a.ptr.pp_double[i][j]*x0.ptr.p_double[j]; } residual.ptr.p_double[i] = b.ptr.p_double[i]-tmp; } normofresidual = (double)(0); for(i=0; i<=sz-1; i++) { if( ae_fp_greater(ae_fabs(residual.ptr.p_double[i], _state),testlincgunit_e0) ) { if( !silent ) { printf("IterationsCount=%0d;\nNMV=%0d;\nTerminationType=%0d;\n", (int)(rep.iterationscount), (int)(rep.nmv), (int)(rep.terminationtype)); printf("Size=%0d;\nCond=%0.5f;\nComplexTest::Fail::Discripancy[%0d]>E0!(%0.2e>%0.2e)\n", (int)(sz), (double)(c), (int)(i), (double)(residual.ptr.p_double[i]), (double)(testlincgunit_e0)); } result = ae_true; ae_frame_leave(_state); return result; } normofresidual = normofresidual+residual.ptr.p_double[i]*residual.ptr.p_double[i]; } if( ae_fp_greater(ae_fabs(normofresidual-rep.r2, _state),testlincgunit_e0) ) { if( !silent ) { printf("IterationsCount=%0d;\nNMV=%0d;\nTerminationType=%0d;\n", (int)(rep.iterationscount), (int)(rep.nmv), (int)(rep.terminationtype)); printf("Size=%0d;\nCond=%0.5f;\nComplexTest::Fail::||NormOfResidual-Rep.R2||>E0!(%0.2e>%0.2e)\n", (int)(sz), (double)(c), (double)(ae_fabs(normofresidual-rep.r2, _state)), (double)(testlincgunit_e0)); printf("NormOfResidual=%0.2e; Rep.R2=%0.2e\n", (double)(normofresidual), (double)(rep.r2)); } result = ae_true; ae_frame_leave(_state); return result; } /* * Check algorithm properties (conjugacy/orthogonality). * Here we use MtX which was filled during algorithm progress towards solution. * * NOTE: this test is skipped when algorithm converged in less than SZ iterations. */ if( sz>1&&rep.iterationscount==sz ) { ae_matrix_set_length(&mtp, sz, sz, _state); ae_matrix_set_length(&mtr, sz, sz, _state); for(i=0; i<=sz-1; i++) { for(j=0; j<=sz-1; j++) { mtp.ptr.pp_double[i][j] = mtx.ptr.pp_double[i+1][j]-mtx.ptr.pp_double[i][j]; tmp = (double)(0); for(k=0; k<=sz-1; k++) { tmp = tmp+a.ptr.pp_double[j][k]*mtx.ptr.pp_double[i][k]; } mtr.ptr.pp_double[i][j] = b.ptr.p_double[j]-tmp; } } /* *(Api,pj)=0? 
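* (the steps P_i = X_(i+1)-X_i, proportional to the CG search directions, must be mutually A-conjugate)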
*/ sclr = (double)(0); nv0 = (double)(0); nv1 = (double)(0); for(i=0; i<=sz-1; i++) { for(j=0; j<=sz-1; j++) { if( i==j ) { continue; } for(k=0; k<=sz-1; k++) { tmp = (double)(0); for(l=0; l<=sz-1; l++) { tmp = tmp+a.ptr.pp_double[k][l]*mtp.ptr.pp_double[i][l]; } sclr = sclr+tmp*mtp.ptr.pp_double[j][k]; nv0 = nv0+mtp.ptr.pp_double[i][k]*mtp.ptr.pp_double[i][k]; nv1 = nv1+mtp.ptr.pp_double[j][k]*mtp.ptr.pp_double[j][k]; } nv0 = ae_sqrt(nv0, _state); nv1 = ae_sqrt(nv1, _state); if( ae_fp_greater(ae_fabs(sclr, _state),testlincgunit_e0*na*nv0*nv1) ) { if( !silent ) { printf("IterationsCount=%0d;\nNMV=%0d;\nTerminationType=%0d;\n", (int)(rep.iterationscount), (int)(rep.nmv), (int)(rep.terminationtype)); printf("Size=%0d;\nCond=%0.5f;\nComplexTest::Fail::(Ap%0d,p%0d)!=0\n{Sclr=%0.15f; NA=%0.15f NV0=%0.15f NV1=%0.15f;}\n", (int)(sz), (double)(c), (int)(i), (int)(j), (double)(sclr), (double)(na), (double)(nv0), (double)(nv1)); } result = ae_true; ae_frame_leave(_state); return result; } } } /* *(ri,pj)=0? */ for(i=1; i<=sz-1; i++) { for(j=0; j<=i-1; j++) { sclr = (double)(0); nv0 = (double)(0); nv1 = (double)(0); for(k=0; k<=sz-1; k++) { sclr = sclr+mtr.ptr.pp_double[i][k]*mtp.ptr.pp_double[j][k]; nv0 = nv0+mtr.ptr.pp_double[i][k]*mtr.ptr.pp_double[i][k]; nv1 = nv1+mtp.ptr.pp_double[j][k]*mtp.ptr.pp_double[j][k]; } nv0 = ae_sqrt(nv0, _state); nv1 = ae_sqrt(nv1, _state); if( ae_fp_greater(ae_fabs(sclr, _state),testlincgunit_e0*nv0*nv1) ) { if( !silent ) { printf("IterationsCount=%0d;\nNMV=%0d;\nTerminationType=%0d;\n", (int)(rep.iterationscount), (int)(rep.nmv), (int)(rep.terminationtype)); printf("Size=%0d;\nCond=%0.5f;\nComplexTest::Fail::(r%0d,p%0d)!=0\n{Sclr=%0.15f; NV0=%0.15f NV1=%0.15f;}\n", (int)(sz), (double)(c), (int)(i), (int)(j), (double)(sclr), (double)(nv0), (double)(nv1)); } result = ae_true; ae_frame_leave(_state); return result; } } } /* *(ri,rj)=0? */ for(i=0; i<=sz-1; i++) { for(j=i+1; j<=sz-1; j++) { sclr = (double)(0); nv0 = (double)(0); nv1 = (double)(0); for(k=0; k<=sz-1; k++) { sclr = sclr+mtr.ptr.pp_double[i][k]*mtr.ptr.pp_double[j][k]; nv0 = nv0+mtr.ptr.pp_double[i][k]*mtr.ptr.pp_double[i][k]; nv1 = nv1+mtr.ptr.pp_double[j][k]*mtr.ptr.pp_double[j][k]; } nv0 = ae_sqrt(nv0, _state); nv1 = ae_sqrt(nv1, _state); if( ae_fp_greater(ae_fabs(sclr, _state),testlincgunit_e0*nv0*nv1) ) { if( !silent ) { printf("IterationsCount=%0d;\nNMV=%0d;\nTerminationType=%0d;\n", (int)(rep.iterationscount), (int)(rep.nmv), (int)(rep.terminationtype)); printf("Size=%0d;\nCond=%0.5f;\nComplexTest::Fail::(rm,rk)!=0\n{Sclr=%0.15f; NV0=%0.15f NV1=%0.15f;}\n", (int)(sz), (double)(c), (double)(sclr), (double)(nv0), (double)(nv1)); } result = ae_true; ae_frame_leave(_state); return result; } } } } } if( !silent ) { printf("ComplexTest::Ok\n"); } result = ae_false; ae_frame_leave(_state); return result; } /************************************************************************* This function prepare problem with a known solution 'Xs'(A*Xs-b=0). There b is A*Xs. After, function check algorithm result and 'Xs'. There used two stopping criterions: 1. achieved the required precision(StCrit=0); 2. execution of the required number of iterations(StCrit=1). 
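The accuracy check is relative: the residual norm ||A*X0-B|| must not exceed 1.0E-6*||B||.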
-- ALGLIB -- Copyright 14.11.2011 by Bochkanov Sergey *************************************************************************/ static ae_bool testlincgunit_complexres(ae_bool silent, ae_state *_state) { ae_frame _frame_block; lincgstate s; lincgreport rep; ae_matrix a; ae_vector b; ae_vector xs; ae_vector x0; double err; ae_int_t n; ae_int_t sz; double c; ae_int_t i; ae_int_t j; ae_int_t stcrit; double mx; double tmp; double eps; ae_int_t xp; ae_int_t nxp; ae_bool result; ae_frame_make(_state, &_frame_block); _lincgstate_init(&s, _state); _lincgreport_init(&rep, _state); ae_matrix_init(&a, 0, 0, DT_REAL, _state); ae_vector_init(&b, 0, DT_REAL, _state); ae_vector_init(&xs, 0, DT_REAL, _state); ae_vector_init(&x0, 0, DT_REAL, _state); sz = 5; mx = (double)(100); nxp = 100; for(xp=0; xp<=nxp-1; xp++) { for(n=1; n<=sz; n++) { for(stcrit=0; stcrit<=1; stcrit++) { /* * Generate: * * random A with norm NA (equal to 1.0), * * random solution XS whose elements are uniformly distributed in [-MX,+MX] * * random starting point X0 whose elements are uniformly distributed in [-MX,+MX] * * B = A*Xs */ c = (testlincgunit_maxcond-1)*ae_randomreal(_state)+1; spdmatrixrndcond(n, c, &a, _state); ae_vector_set_length(&b, n, _state); ae_vector_set_length(&x0, n, _state); ae_vector_set_length(&xs, n, _state); for(i=0; i<=n-1; i++) { x0.ptr.p_double[i] = mx*(2*ae_randomreal(_state)-1); xs.ptr.p_double[i] = mx*(2*ae_randomreal(_state)-1); } eps = (double)(0); for(i=0; i<=n-1; i++) { b.ptr.p_double[i] = (double)(0); for(j=0; j<=n-1; j++) { b.ptr.p_double[i] = b.ptr.p_double[i]+a.ptr.pp_double[i][j]*xs.ptr.p_double[j]; } eps = eps+b.ptr.p_double[i]*b.ptr.p_double[i]; } eps = 1.0E-6*ae_sqrt(eps, _state); /* * Solve with different stopping criteria */ lincgcreate(n, &s, _state); lincgsetb(&s, &b, _state); lincgsetstartingpoint(&s, &x0, _state); lincgsetxrep(&s, ae_true, _state); if( stcrit==0 ) { lincgsetcond(&s, 1.0E-6, 0, _state); } else { lincgsetcond(&s, (double)(0), n, _state); } while(lincgiteration(&s, _state)) { if( s.needmv ) { for(i=0; i<=n-1; i++) { s.mv.ptr.p_double[i] = (double)(0); for(j=0; j<=n-1; j++) { s.mv.ptr.p_double[i] = s.mv.ptr.p_double[i]+a.ptr.pp_double[i][j]*s.x.ptr.p_double[j]; } } } if( s.needvmv ) { s.vmv = (double)(0); for(i=0; i<=n-1; i++) { s.mv.ptr.p_double[i] = (double)(0); for(j=0; j<=n-1; j++) { s.mv.ptr.p_double[i] = s.mv.ptr.p_double[i]+a.ptr.pp_double[i][j]*s.x.ptr.p_double[j]; } s.vmv = s.vmv+s.mv.ptr.p_double[i]*s.x.ptr.p_double[i]; } } if( s.needprec ) { for(i=0; i<=n-1; i++) { s.pv.ptr.p_double[i] = s.x.ptr.p_double[i]; } } } lincgresults(&s, &x0, &rep, _state); /* * Check result */ err = 0.0; for(i=0; i<=n-1; i++) { tmp = (double)(0); for(j=0; j<=n-1; j++) { tmp = tmp+a.ptr.pp_double[i][j]*x0.ptr.p_double[j]; } err = err+ae_sqr(b.ptr.p_double[i]-tmp, _state); } err = ae_sqrt(err, _state); if( ae_fp_greater(err,eps) ) { if( !silent ) { printf("ComplexRes::fail\n"); printf("IterationsCount=%0d\n", (int)(rep.iterationscount)); printf("NMV=%0d\n", (int)(rep.nmv)); printf("TerminationType=%0d\n", (int)(rep.terminationtype)); printf("X and Xs...\n"); for(j=0; j<=n-1; j++) { printf("x[%0d]=%0.10f; xs[%0d]=%0.10f\n", (int)(j), (double)(x0.ptr.p_double[j]), (int)(j), (double)(xs.ptr.p_double[j])); } } result = ae_true; ae_frame_leave(_state); return result; } } } } /* *test has been passed */ if( !silent ) { printf("ComplexRes::Ok\n"); } result = ae_false; ae_frame_leave(_state); return result; } /************************************************************************* This 
function check, that XUpdated return State.X=X0 at zero iteration and State.X=X(algorithm result) at last. -- ALGLIB -- Copyright 14.11.2011 by Bochkanov Sergey *************************************************************************/ static ae_bool testlincgunit_basictestx(ae_bool silent, ae_state *_state) { ae_frame _frame_block; lincgstate s; lincgreport rep; ae_matrix a; ae_vector b; ae_vector x0; ae_vector x00; ae_vector x01; ae_int_t n; ae_int_t sz; double c; ae_int_t i; ae_int_t j; double mx; ae_int_t iters; ae_bool result; ae_frame_make(_state, &_frame_block); _lincgstate_init(&s, _state); _lincgreport_init(&rep, _state); ae_matrix_init(&a, 0, 0, DT_REAL, _state); ae_vector_init(&b, 0, DT_REAL, _state); ae_vector_init(&x0, 0, DT_REAL, _state); ae_vector_init(&x00, 0, DT_REAL, _state); ae_vector_init(&x01, 0, DT_REAL, _state); sz = 5; mx = (double)(100); for(n=1; n<=sz; n++) { /* * Generate: * * random A with norm NA (equal to 1.0), * * random right part B whose elements are uniformly distributed in [-MX,+MX] * * random starting point X0 whose elements are uniformly distributed in [-MX,+MX] */ c = (testlincgunit_maxcond-1)*ae_randomreal(_state)+1; spdmatrixrndcond(n, c, &a, _state); ae_vector_set_length(&b, n, _state); ae_vector_set_length(&x0, n, _state); ae_vector_set_length(&x00, n, _state); ae_vector_set_length(&x01, n, _state); for(i=0; i<=n-1; i++) { x0.ptr.p_double[i] = mx*(2*ae_randomreal(_state)-1); b.ptr.p_double[i] = mx*(2*ae_randomreal(_state)-1); } /* * Solve, save first and last reported points to x00 and x01 */ lincgcreate(n, &s, _state); lincgsetb(&s, &b, _state); lincgsetstartingpoint(&s, &x0, _state); lincgsetxrep(&s, ae_true, _state); lincgsetcond(&s, (double)(0), n, _state); iters = 0; while(lincgiteration(&s, _state)) { if( s.needmv ) { for(i=0; i<=n-1; i++) { s.mv.ptr.p_double[i] = (double)(0); for(j=0; j<=n-1; j++) { s.mv.ptr.p_double[i] = s.mv.ptr.p_double[i]+a.ptr.pp_double[i][j]*s.x.ptr.p_double[j]; } } } if( s.needvmv ) { s.vmv = (double)(0); for(i=0; i<=n-1; i++) { s.mv.ptr.p_double[i] = (double)(0); for(j=0; j<=n-1; j++) { s.mv.ptr.p_double[i] = s.mv.ptr.p_double[i]+a.ptr.pp_double[i][j]*s.x.ptr.p_double[j]; } s.vmv = s.vmv+s.mv.ptr.p_double[i]*s.x.ptr.p_double[i]; } } if( s.needprec ) { for(i=0; i<=n-1; i++) { s.pv.ptr.p_double[i] = s.x.ptr.p_double[i]; } } if( s.xupdated ) { if( iters==0 ) { for(i=0; i<=n-1; i++) { x00.ptr.p_double[i] = s.x.ptr.p_double[i]; } } if( iters==n ) { for(i=0; i<=n-1; i++) { x01.ptr.p_double[i] = s.x.ptr.p_double[i]; } } iters = iters+1; } } /* * Check first and last points */ for(i=0; i<=n-1; i++) { if( ae_fp_neq(x00.ptr.p_double[i],x0.ptr.p_double[i]) ) { if( !silent ) { printf("BasicTestX::fail\n"); printf("IterationsCount=%0d\n", (int)(rep.iterationscount)); printf("NMV=%0d\n", (int)(rep.nmv)); printf("TerminationType=%0d\n", (int)(rep.terminationtype)); for(j=0; j<=n-1; j++) { printf("x0=%0.5f; x00=%0.5f;\n", (double)(x0.ptr.p_double[j]), (double)(x00.ptr.p_double[j])); } } result = ae_true; ae_frame_leave(_state); return result; } } lincgresults(&s, &x0, &rep, _state); for(i=0; i<=n-1; i++) { if( ae_fp_neq(x01.ptr.p_double[i],x0.ptr.p_double[i]) ) { if( !silent ) { printf("BasicTestX::fail\n"); printf("IterationsCount=%0d\n", (int)(rep.iterationscount)); printf("NMV=%0d\n", (int)(rep.nmv)); printf("TerminationType=%0d\n", (int)(rep.terminationtype)); for(j=0; j<=n-1; j++) { printf("x0=%0.5f; x01=%0.5f;\n", (double)(x0.ptr.p_double[j]), (double)(x01.ptr.p_double[j])); } } result = ae_true; ae_frame_leave(_state); return 
result; } } } /* *test has been passed */ if( !silent ) { printf("BasicTestIters::Ok\n"); } result = ae_false; ae_frame_leave(_state); return result; } /************************************************************************* This function checks that XUpdated returns correct State.R2. It creates large badly conditioned problem (N=50), which should be large enough and ill-conditioned enough to cause periodic recalculation of R. -- ALGLIB -- Copyright 14.11.2011 by Bochkanov Sergey *************************************************************************/ static ae_bool testlincgunit_testrcorrectness(ae_bool silent, ae_state *_state) { ae_frame _frame_block; lincgstate s; lincgreport rep; ae_matrix a; ae_vector b; ae_int_t n; double c; ae_int_t i; ae_int_t j; double r2; double v; double rtol; ae_int_t maxits; ae_bool result; ae_frame_make(_state, &_frame_block); _lincgstate_init(&s, _state); _lincgreport_init(&rep, _state); ae_matrix_init(&a, 0, 0, DT_REAL, _state); ae_vector_init(&b, 0, DT_REAL, _state); rtol = 1.0E6*ae_machineepsilon; n = 50; maxits = n/2; c = (double)(10000); spdmatrixrndcond(n, c, &a, _state); ae_vector_set_length(&b, n, _state); for(i=0; i<=n-1; i++) { b.ptr.p_double[i] = 2*ae_randomreal(_state)-1; } lincgcreate(n, &s, _state); lincgsetb(&s, &b, _state); lincgsetxrep(&s, ae_true, _state); lincgsetcond(&s, (double)(0), maxits, _state); while(lincgiteration(&s, _state)) { if( s.needmv ) { for(i=0; i<=n-1; i++) { s.mv.ptr.p_double[i] = (double)(0); for(j=0; j<=n-1; j++) { s.mv.ptr.p_double[i] = s.mv.ptr.p_double[i]+a.ptr.pp_double[i][j]*s.x.ptr.p_double[j]; } } } if( s.needvmv ) { s.vmv = (double)(0); for(i=0; i<=n-1; i++) { s.mv.ptr.p_double[i] = (double)(0); for(j=0; j<=n-1; j++) { s.mv.ptr.p_double[i] = s.mv.ptr.p_double[i]+a.ptr.pp_double[i][j]*s.x.ptr.p_double[j]; } s.vmv = s.vmv+s.mv.ptr.p_double[i]*s.x.ptr.p_double[i]; } } if( s.needprec ) { for(i=0; i<=n-1; i++) { s.pv.ptr.p_double[i] = s.x.ptr.p_double[i]; } } if( s.xupdated ) { /* * calculate R2, compare with value returned in state.R2 */ r2 = (double)(0); for(i=0; i<=n-1; i++) { v = (double)(0); for(j=0; j<=n-1; j++) { v = v+a.ptr.pp_double[i][j]*s.x.ptr.p_double[j]; } r2 = r2+ae_sqr(v-b.ptr.p_double[i], _state); } if( ae_fp_greater(ae_fabs(r2-s.r2, _state),rtol) ) { result = ae_true; ae_frame_leave(_state); return result; } } } lincgresults(&s, &b, &rep, _state); if( rep.iterationscount!=maxits ) { result = ae_true; ae_frame_leave(_state); return result; } result = ae_false; ae_frame_leave(_state); return result; } /************************************************************************* This function check, that number of iterations are't more than MaxIts. 
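The same check is repeated after LinCGRestart to verify that restarting the solver resets the iteration counter.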
-- ALGLIB -- Copyright 14.11.2011 by Bochkanov Sergey *************************************************************************/ static ae_bool testlincgunit_basictestiters(ae_bool silent, ae_state *_state) { ae_frame _frame_block; lincgstate s; lincgreport rep; ae_matrix a; ae_vector b; ae_vector x0; ae_int_t n; ae_int_t sz; double c; ae_int_t i; ae_int_t j; double mx; ae_int_t iters; ae_bool result; ae_frame_make(_state, &_frame_block); _lincgstate_init(&s, _state); _lincgreport_init(&rep, _state); ae_matrix_init(&a, 0, 0, DT_REAL, _state); ae_vector_init(&b, 0, DT_REAL, _state); ae_vector_init(&x0, 0, DT_REAL, _state); sz = 5; mx = (double)(100); for(n=1; n<=sz; n++) { /* * Generate: * * random A with norm NA (equal to 1.0), * * random right part B whose elements are uniformly distributed in [-MX,+MX] * * random starting point X0 whose elements are uniformly distributed in [-MX,+MX] */ c = (testlincgunit_maxcond-1)*ae_randomreal(_state)+1; spdmatrixrndcond(n, c, &a, _state); ae_vector_set_length(&b, n, _state); ae_vector_set_length(&x0, n, _state); for(i=0; i<=n-1; i++) { x0.ptr.p_double[i] = mx*(2*ae_randomreal(_state)-1); b.ptr.p_double[i] = mx*(2*ae_randomreal(_state)-1); } /* * Solve */ lincgcreate(n, &s, _state); lincgsetb(&s, &b, _state); lincgsetstartingpoint(&s, &x0, _state); lincgsetxrep(&s, ae_true, _state); lincgsetcond(&s, (double)(0), n, _state); iters = 0; while(lincgiteration(&s, _state)) { if( s.needmv ) { for(i=0; i<=n-1; i++) { s.mv.ptr.p_double[i] = (double)(0); for(j=0; j<=n-1; j++) { s.mv.ptr.p_double[i] = s.mv.ptr.p_double[i]+a.ptr.pp_double[i][j]*s.x.ptr.p_double[j]; } } } if( s.needvmv ) { s.vmv = (double)(0); for(i=0; i<=n-1; i++) { s.mv.ptr.p_double[i] = (double)(0); for(j=0; j<=n-1; j++) { s.mv.ptr.p_double[i] = s.mv.ptr.p_double[i]+a.ptr.pp_double[i][j]*s.x.ptr.p_double[j]; } s.vmv = s.vmv+s.mv.ptr.p_double[i]*s.x.ptr.p_double[i]; } } if( s.needprec ) { for(i=0; i<=n-1; i++) { s.pv.ptr.p_double[i] = s.x.ptr.p_double[i]; } } if( s.xupdated ) { iters = iters+1; } } lincgresults(&s, &x0, &rep, _state); /* * Check */ if( iters!=rep.iterationscount+1||iters>n+1 ) { if( !silent ) { printf("BasicTestIters::fail\n"); printf("IterationsCount=%0d\n", (int)(rep.iterationscount)); printf("NMV=%0d\n", (int)(rep.nmv)); printf("TerminationType=%0d\n", (int)(rep.terminationtype)); printf("Iters=%0d\n", (int)(iters)); } result = ae_true; ae_frame_leave(_state); return result; } /* * Restart problem */ c = (testlincgunit_maxcond-1)*ae_randomreal(_state)+1; spdmatrixrndcond(n, c, &a, _state); for(i=0; i<=n-1; i++) { x0.ptr.p_double[i] = mx*(2*ae_randomreal(_state)-1); b.ptr.p_double[i] = mx*(2*ae_randomreal(_state)-1); } lincgsetstartingpoint(&s, &x0, _state); lincgrestart(&s, _state); lincgsetb(&s, &b, _state); iters = 0; while(lincgiteration(&s, _state)) { if( s.needmv ) { for(i=0; i<=n-1; i++) { s.mv.ptr.p_double[i] = (double)(0); for(j=0; j<=n-1; j++) { s.mv.ptr.p_double[i] = s.mv.ptr.p_double[i]+a.ptr.pp_double[i][j]*s.x.ptr.p_double[j]; } } } if( s.needvmv ) { s.vmv = (double)(0); for(i=0; i<=n-1; i++) { s.mv.ptr.p_double[i] = (double)(0); for(j=0; j<=n-1; j++) { s.mv.ptr.p_double[i] = s.mv.ptr.p_double[i]+a.ptr.pp_double[i][j]*s.x.ptr.p_double[j]; } s.vmv = s.vmv+s.mv.ptr.p_double[i]*s.x.ptr.p_double[i]; } } if( s.needprec ) { for(i=0; i<=n-1; i++) { s.pv.ptr.p_double[i] = s.x.ptr.p_double[i]; } } if( s.xupdated ) { iters = iters+1; } } lincgresults(&s, &x0, &rep, _state); /* *check */ if( iters!=rep.iterationscount+1||iters>n+1 ) { if( !silent ) { 
printf("BasicTestIters::fail\n"); printf("IterationsCount=%0d\n", (int)(rep.iterationscount)); printf("NMV=%0d\n", (int)(rep.nmv)); printf("TerminationType=%0d\n", (int)(rep.terminationtype)); printf("Iters=%0d\n", (int)(iters)); } result = ae_true; ae_frame_leave(_state); return result; } } /* *test has been passed */ if( !silent ) { printf("BasicTestIters::Ok\n"); } result = ae_false; ae_frame_leave(_state); return result; } /************************************************************************* This function check, that programmed method is Krylov subspace methed. -- ALGLIB -- Copyright 14.11.2011 by Bochkanov Sergey *************************************************************************/ static ae_bool testlincgunit_krylovsubspacetest(ae_bool silent, ae_state *_state) { ae_frame _frame_block; lincgstate s; lincgreport rep; ae_matrix a; ae_vector b; ae_vector x0; ae_matrix ksr; ae_vector r0; ae_vector tarray; ae_matrix mtx; ae_int_t n; ae_int_t sz; double c; ae_int_t i; ae_int_t j; ae_int_t l; ae_int_t m; double mx; double tmp; double normr0; ae_int_t numofit; ae_int_t maxits; ae_int_t k2; double eps; ae_bool result; ae_frame_make(_state, &_frame_block); _lincgstate_init(&s, _state); _lincgreport_init(&rep, _state); ae_matrix_init(&a, 0, 0, DT_REAL, _state); ae_vector_init(&b, 0, DT_REAL, _state); ae_vector_init(&x0, 0, DT_REAL, _state); ae_matrix_init(&ksr, 0, 0, DT_REAL, _state); ae_vector_init(&r0, 0, DT_REAL, _state); ae_vector_init(&tarray, 0, DT_REAL, _state); ae_matrix_init(&mtx, 0, 0, DT_REAL, _state); eps = 1.0E-6; maxits = 3; sz = 5; mx = (double)(100); for(n=1; n<=sz; n++) { /* * Generate: * * random A with unit norm * * cond(A) in [0.5*MaxCond, 1.0*MaxCond] * * random x0 and b such that |A*x0-b| is large enough for algorithm to make at least one iteration. * * IMPORTANT: it is very important to have cond(A) both (1) not very large and * (2) not very small. Large cond(A) will make our problem ill-conditioned, * thus analytic properties won't hold. Small cond(A), from the other side, * will give us rapid convergence of the algorithm - in fact, too rapid. * Krylov basis will be dominated by numerical noise and test may fail. */ c = testlincgunit_maxcond*(0.5*ae_randomreal(_state)+0.5); spdmatrixrndcond(n, c, &a, _state); ae_matrix_set_length(&mtx, n+1, n, _state); ae_matrix_set_length(&ksr, n, n, _state); ae_vector_set_length(&r0, n, _state); ae_vector_set_length(&tarray, n, _state); ae_vector_set_length(&b, n, _state); ae_vector_set_length(&x0, n, _state); do { for(i=0; i<=n-1; i++) { x0.ptr.p_double[i] = mx*(2*ae_randomreal(_state)-1); b.ptr.p_double[i] = mx*(2*ae_randomreal(_state)-1); } normr0 = (double)(0); for(i=0; i<=n-1; i++) { tmp = (double)(0); for(j=0; j<=n-1; j++) { tmp = tmp+a.ptr.pp_double[i][j]*x0.ptr.p_double[j]; } r0.ptr.p_double[i] = b.ptr.p_double[i]-tmp; normr0 = normr0+r0.ptr.p_double[i]*r0.ptr.p_double[i]; } } while(ae_fp_less_eq(ae_sqrt(normr0, _state),eps)); /* * Fill KSR by {r0, A*r0, A^2*r0, ... } */ for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { ksr.ptr.pp_double[i][j] = r0.ptr.p_double[j]; } for(j=0; j<=i-1; j++) { for(l=0; l<=n-1; l++) { tarray.ptr.p_double[l] = (double)(0); for(m=0; m<=n-1; m++) { tarray.ptr.p_double[l] = tarray.ptr.p_double[l]+a.ptr.pp_double[l][m]*ksr.ptr.pp_double[i][m]; } } for(l=0; l<=n-1; l++) { ksr.ptr.pp_double[i][l] = tarray.ptr.p_double[l]; } } } /* * Solve system, record intermediate points for futher analysis. 
* NOTE: we set update frequency of R to 2 in order to test that R is updated correctly */ lincgcreate(n, &s, _state); lincgsetb(&s, &b, _state); lincgsetstartingpoint(&s, &x0, _state); lincgsetxrep(&s, ae_true, _state); lincgsetcond(&s, (double)(0), n, _state); lincgsetrupdatefreq(&s, 2, _state); numofit = 0; while(lincgiteration(&s, _state)) { if( s.needmv ) { for(i=0; i<=n-1; i++) { s.mv.ptr.p_double[i] = (double)(0); for(j=0; j<=n-1; j++) { s.mv.ptr.p_double[i] = s.mv.ptr.p_double[i]+a.ptr.pp_double[i][j]*s.x.ptr.p_double[j]; } } } if( s.needvmv ) { s.vmv = (double)(0); for(i=0; i<=n-1; i++) { s.mv.ptr.p_double[i] = (double)(0); for(j=0; j<=n-1; j++) { s.mv.ptr.p_double[i] = s.mv.ptr.p_double[i]+a.ptr.pp_double[i][j]*s.x.ptr.p_double[j]; } s.vmv = s.vmv+s.mv.ptr.p_double[i]*s.x.ptr.p_double[i]; } } if( s.needprec ) { for(i=0; i<=n-1; i++) { s.pv.ptr.p_double[i] = s.x.ptr.p_double[i]; } } if( s.xupdated ) { for(i=0; i<=n-1; i++) { mtx.ptr.pp_double[numofit][i] = s.x.ptr.p_double[i]; } numofit = numofit+1; } } /* * Check that I-th step S_i=X[I+1]-X[i] belongs to I-th Krylov subspace. * Checks are done for first K2 steps, with K2 small enough to avoid * numerical errors. */ if( n<=maxits ) { k2 = n; } else { k2 = maxits; } for(i=0; i<=k2-1; i++) { for(j=0; j<=n-1; j++) { tarray.ptr.p_double[j] = mtx.ptr.pp_double[i+1][j]-mtx.ptr.pp_double[i][j]; } if( !testlincgunit_frombasis(&tarray, &ksr, n, i+1, testlincgunit_e0, _state) ) { if( !silent ) { printf("KrylovSubspaceTest::FAIL\n"); printf("Size=%0d; Iters=%0d;\n", (int)(n), (int)(i)); } result = ae_true; ae_frame_leave(_state); return result; } } } if( !silent ) { printf("KrylovSubspaceTest::OK\n"); } result = ae_false; ae_frame_leave(_state); return result; } /************************************************************************* Function for testing LinCgSolveSparse. This function prepare problem with a known solution 'Xs'(A*Xs-b=0). There b is A*Xs. After, function calculate result by LinCGSolveSparse and compares it with 'Xs'. 
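(Concretely: the test builds b=A*Xs for a random SPD matrix A and a random Xs,
copies the upper and the lower triangle of A into two CRS sparse matrices,
solves with LinCGSolveSparse once per triangle, and requires that the solution
satisfies ||x-Xs|| <= 1.0E-6*||b||.)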
-- ALGLIB -- Copyright 14.11.2011 by Bochkanov Sergey *************************************************************************/ static ae_bool testlincgunit_sparsetest(ae_bool silent, ae_state *_state) { ae_frame _frame_block; lincgstate s; lincgreport rep; ae_matrix a; ae_vector b; ae_vector xs; ae_vector x0; ae_vector x1; sparsematrix uppera; sparsematrix lowera; double err; ae_int_t n; ae_int_t sz; double c; ae_int_t i; ae_int_t j; double mx; double eps; ae_bool result; ae_frame_make(_state, &_frame_block); _lincgstate_init(&s, _state); _lincgreport_init(&rep, _state); ae_matrix_init(&a, 0, 0, DT_REAL, _state); ae_vector_init(&b, 0, DT_REAL, _state); ae_vector_init(&xs, 0, DT_REAL, _state); ae_vector_init(&x0, 0, DT_REAL, _state); ae_vector_init(&x1, 0, DT_REAL, _state); _sparsematrix_init(&uppera, _state); _sparsematrix_init(&lowera, _state); sz = 5; mx = (double)(100); for(n=1; n<=sz; n++) { /* * Generate: * * random A with unit norm * * random X0 (starting point) and XS (known solution) * Copy dense A to sparse SA */ c = (testlincgunit_maxcond-1)*ae_randomreal(_state)+1; spdmatrixrndcond(n, c, &a, _state); ae_vector_set_length(&b, n, _state); ae_vector_set_length(&xs, n, _state); for(i=0; i<=n-1; i++) { xs.ptr.p_double[i] = mx*(2*ae_randomreal(_state)-1); } eps = (double)(0); for(i=0; i<=n-1; i++) { b.ptr.p_double[i] = (double)(0); for(j=0; j<=n-1; j++) { b.ptr.p_double[i] = b.ptr.p_double[i]+a.ptr.pp_double[i][j]*xs.ptr.p_double[j]; } eps = eps+b.ptr.p_double[i]*b.ptr.p_double[i]; } eps = 1.0E-6*ae_sqrt(eps, _state); sparsecreate(n, n, 0, &uppera, _state); sparsecreate(n, n, 0, &lowera, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { if( j>=i ) { sparseset(&uppera, i, j, a.ptr.pp_double[i][j], _state); } if( j<=i ) { sparseset(&lowera, i, j, a.ptr.pp_double[i][j], _state); } } } sparseconverttocrs(&uppera, _state); sparseconverttocrs(&lowera, _state); /* * Test upper triangle */ lincgcreate(n, &s, _state); lincgsetcond(&s, (double)(0), n, _state); lincgsolvesparse(&s, &uppera, ae_true, &b, _state); lincgresults(&s, &x0, &rep, _state); err = (double)(0); for(i=0; i<=n-1; i++) { err = err+ae_sqr(x0.ptr.p_double[i]-xs.ptr.p_double[i], _state); } err = ae_sqrt(err, _state); if( ae_fp_greater(err,eps) ) { result = ae_true; ae_frame_leave(_state); return result; } /* * Test lower triangle */ lincgcreate(n, &s, _state); lincgsetcond(&s, (double)(0), n, _state); lincgsolvesparse(&s, &lowera, ae_false, &b, _state); lincgresults(&s, &x1, &rep, _state); err = (double)(0); for(i=0; i<=n-1; i++) { err = err+ae_sqr(x1.ptr.p_double[i]-xs.ptr.p_double[i], _state); } err = ae_sqrt(err, _state); if( ae_fp_greater(err,eps) ) { result = ae_true; ae_frame_leave(_state); return result; } } result = ae_false; ae_frame_leave(_state); return result; } /************************************************************************* Function for testing the preconditioned conjugate gradient method. 
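Two properties are verified below: (1) preconditioned CG with a diagonal
preconditioner M is algebraically equivalent to plain CG applied to the
symmetrically scaled system, and (2) the automatic diagonal preconditioning
used by LinCGSolveSparse recovers the solution of a badly scaled problem,
while the unit preconditioner fails to do so (see the in-function comments).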
-- ALGLIB -- Copyright 14.11.2011 by Bochkanov Sergey *************************************************************************/ static ae_bool testlincgunit_precondtest(ae_bool silent, ae_state *_state) { ae_frame _frame_block; lincgstate s; lincgreport rep; ae_matrix a; ae_matrix ta; sparsematrix sa; ae_vector m; ae_matrix mtx; ae_matrix mtprex; ae_vector de; ae_vector rde; ae_vector b; ae_vector tb; ae_vector d; ae_vector xe; ae_vector x0; ae_vector tx0; ae_vector err; ae_int_t n; ae_int_t sz; ae_int_t numofit; double c; ae_int_t i; ae_int_t j; ae_int_t k; double eps; ae_bool bflag; ae_bool result; ae_frame_make(_state, &_frame_block); _lincgstate_init(&s, _state); _lincgreport_init(&rep, _state); ae_matrix_init(&a, 0, 0, DT_REAL, _state); ae_matrix_init(&ta, 0, 0, DT_REAL, _state); _sparsematrix_init(&sa, _state); ae_vector_init(&m, 0, DT_REAL, _state); ae_matrix_init(&mtx, 0, 0, DT_REAL, _state); ae_matrix_init(&mtprex, 0, 0, DT_REAL, _state); ae_vector_init(&de, 0, DT_REAL, _state); ae_vector_init(&rde, 0, DT_REAL, _state); ae_vector_init(&b, 0, DT_REAL, _state); ae_vector_init(&tb, 0, DT_REAL, _state); ae_vector_init(&d, 0, DT_REAL, _state); ae_vector_init(&xe, 0, DT_REAL, _state); ae_vector_init(&x0, 0, DT_REAL, _state); ae_vector_init(&tx0, 0, DT_REAL, _state); ae_vector_init(&err, 0, DT_REAL, _state); /* * Test 1. * * Preconditioned CG for A*x=b with preconditioner M=E*E' is algebraically * equivalent to non-preconditioned CG for (inv(E)*A*inv(E'))*z = inv(E)*b * with z=E'*x. * * We test it by generating random preconditioner, running algorithm twice - * one time for original problem with preconditioner , another one for * modified problem without preconditioner. */ sz = 5; for(n=1; n<=sz; n++) { /* * Generate: * * random A with unit norm * * random positive definite diagonal preconditioner M * * dE=sqrt(M) * * rdE=dE^(-1) * * tA = rdE*A*rdE * * random x0 and b - for original preconditioned problem * * tx0 and tb - for modified problem */ c = (testlincgunit_maxcond-1)*ae_randomreal(_state)+1; spdmatrixrndcond(n, c, &a, _state); ae_matrix_set_length(&ta, n, n, _state); ae_matrix_set_length(&mtx, n+1, n, _state); ae_matrix_set_length(&mtprex, n+1, n, _state); ae_vector_set_length(&m, n, _state); ae_vector_set_length(&de, n, _state); ae_vector_set_length(&rde, n, _state); for(i=0; i<=n-1; i++) { m.ptr.p_double[i] = ae_randomreal(_state)+0.5; de.ptr.p_double[i] = ae_sqrt(m.ptr.p_double[i], _state); rde.ptr.p_double[i] = 1/de.ptr.p_double[i]; } for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { ta.ptr.pp_double[i][j] = rde.ptr.p_double[i]*a.ptr.pp_double[i][j]*rde.ptr.p_double[j]; } } ae_vector_set_length(&b, n, _state); ae_vector_set_length(&tb, n, _state); ae_vector_set_length(&x0, n, _state); ae_vector_set_length(&tx0, n, _state); ae_vector_set_length(&err, n, _state); for(i=0; i<=n-1; i++) { x0.ptr.p_double[i] = 2*ae_randomreal(_state)-1; b.ptr.p_double[i] = 2*ae_randomreal(_state)-1; } eps = 1.0E-5; for(i=0; i<=n-1; i++) { tx0.ptr.p_double[i] = de.ptr.p_double[i]*x0.ptr.p_double[i]; tb.ptr.p_double[i] = rde.ptr.p_double[i]*b.ptr.p_double[i]; } /* * Solve two problems, intermediate points are saved to MtX and MtPreX */ lincgcreate(n, &s, _state); lincgsetb(&s, &b, _state); lincgsetstartingpoint(&s, &x0, _state); lincgsetxrep(&s, ae_true, _state); lincgsetcond(&s, (double)(0), n, _state); numofit = 0; while(lincgiteration(&s, _state)) { if( s.needmv ) { for(i=0; i<=n-1; i++) { s.mv.ptr.p_double[i] = (double)(0); for(j=0; j<=n-1; j++) { s.mv.ptr.p_double[i] = 
s.mv.ptr.p_double[i]+a.ptr.pp_double[i][j]*s.x.ptr.p_double[j]; } } } if( s.needvmv ) { s.vmv = (double)(0); for(i=0; i<=n-1; i++) { s.mv.ptr.p_double[i] = (double)(0); for(j=0; j<=n-1; j++) { s.mv.ptr.p_double[i] = s.mv.ptr.p_double[i]+a.ptr.pp_double[i][j]*s.x.ptr.p_double[j]; } s.vmv = s.vmv+s.mv.ptr.p_double[i]*s.x.ptr.p_double[i]; } } if( s.needprec ) { for(i=0; i<=n-1; i++) { s.pv.ptr.p_double[i] = s.x.ptr.p_double[i]/m.ptr.p_double[i]; } } if( s.xupdated ) { if( numofit>=mtx.rows ) { result = ae_true; ae_frame_leave(_state); return result; } for(i=0; i<=n-1; i++) { mtx.ptr.pp_double[numofit][i] = s.x.ptr.p_double[i]; } numofit = numofit+1; } } lincgsetstartingpoint(&s, &tx0, _state); lincgsetb(&s, &tb, _state); lincgrestart(&s, _state); numofit = 0; while(lincgiteration(&s, _state)) { if( s.needmv ) { for(i=0; i<=n-1; i++) { s.mv.ptr.p_double[i] = (double)(0); for(j=0; j<=n-1; j++) { s.mv.ptr.p_double[i] = s.mv.ptr.p_double[i]+ta.ptr.pp_double[i][j]*s.x.ptr.p_double[j]; } } } if( s.needvmv ) { s.vmv = (double)(0); for(i=0; i<=n-1; i++) { s.mv.ptr.p_double[i] = (double)(0); for(j=0; j<=n-1; j++) { s.mv.ptr.p_double[i] = s.mv.ptr.p_double[i]+ta.ptr.pp_double[i][j]*s.x.ptr.p_double[j]; } s.vmv = s.vmv+s.mv.ptr.p_double[i]*s.x.ptr.p_double[i]; } } if( s.needprec ) { for(i=0; i<=n-1; i++) { s.pv.ptr.p_double[i] = s.x.ptr.p_double[i]; } } if( s.xupdated ) { if( numofit>=mtprex.rows ) { result = ae_true; ae_frame_leave(_state); return result; } for(i=0; i<=n-1; i++) { mtprex.ptr.pp_double[numofit][i] = s.x.ptr.p_double[i]; } numofit = numofit+1; } } /* * Compare results - sequence of points generated when solving original problem with * points generated by modified problem. */ for(i=0; i<=numofit-1; i++) { for(j=0; j<=n-1; j++) { if( ae_fp_greater(ae_fabs(mtx.ptr.pp_double[i][j]-rde.ptr.p_double[j]*mtprex.ptr.pp_double[i][j], _state),eps) ) { if( !silent ) { printf("PrecondTest::fail\n"); printf("Size=%0d\n", (int)(n)); printf("IterationsCount=%0d\n", (int)(rep.iterationscount)); printf("NMV=%0d\n", (int)(rep.nmv)); printf("TerminationType=%0d\n", (int)(rep.terminationtype)); printf("X and X^...\n"); for(k=0; k<=n-1; k++) { printf("I=%0d; mtx[%0d]=%0.10f; mtx^[%0d]=%0.10f\n", (int)(i), (int)(k), (double)(mtx.ptr.pp_double[i][k]), (int)(k), (double)(mtprex.ptr.pp_double[i][k])); } } result = ae_true; ae_frame_leave(_state); return result; } } } } /* * Test 2. * * We test automatic diagonal preconditioning used by SolveSparse. * In order to do so we: * 1. generate 20*20 matrix A0 with condition number equal to 1.0E1 * 2. generate random "exact" solution xe and right part b=A0*xe * 3. generate random ill-conditioned diagonal scaling matrix D with * condition number equal to 1.0E50: * 4. transform A*x=b into badly scaled problem: * A0*x0=b0 * A0*D*(inv(D)*x0)=b0 * (D*A0*D)*(inv(D)*x0)=(D*b0) * finally we got new problem A*x=b with A=D*A0*D, b=D*b0, x=inv(D)*x0 * * Then we solve A*x=b: * 1. with default preconditioner * 2. with explicitly activayed diagonal preconditioning * 3. with unit preconditioner. * 1st and 2nd solutions must be close to xe, 3rd solution must be very * far from the true one. 
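*
* (Worked form of the scaling used below: with A=D*A0*D and b=D*b0 the exact
*  solution of A*x=b is x=inv(D)*x0, so after the transform both xe and the
*  per-component tolerance are divided by d[i] - hence the 5.0E-2/d[i] bound
*  used in the checks that follow.)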
*/ n = 20; spdmatrixrndcond(n, 1.0E1, &ta, _state); ae_vector_set_length(&xe, n, _state); for(i=0; i<=n-1; i++) { xe.ptr.p_double[i] = randomnormal(_state); } ae_vector_set_length(&b, n, _state); for(i=0; i<=n-1; i++) { b.ptr.p_double[i] = (double)(0); for(j=0; j<=n-1; j++) { b.ptr.p_double[i] = b.ptr.p_double[i]+ta.ptr.pp_double[i][j]*xe.ptr.p_double[j]; } } ae_vector_set_length(&d, n, _state); for(i=0; i<=n-1; i++) { d.ptr.p_double[i] = ae_pow((double)(10), 100*ae_randomreal(_state)-50, _state); } ae_matrix_set_length(&a, n, n, _state); sparsecreate(n, n, n*n, &sa, _state); for(i=0; i<=n-1; i++) { for(j=0; j<=n-1; j++) { a.ptr.pp_double[i][j] = d.ptr.p_double[i]*ta.ptr.pp_double[i][j]*d.ptr.p_double[j]; sparseset(&sa, i, j, a.ptr.pp_double[i][j], _state); } b.ptr.p_double[i] = b.ptr.p_double[i]*d.ptr.p_double[i]; xe.ptr.p_double[i] = xe.ptr.p_double[i]/d.ptr.p_double[i]; } sparseconverttocrs(&sa, _state); lincgcreate(n, &s, _state); lincgsetcond(&s, (double)(0), 2*n, _state); lincgsolvesparse(&s, &sa, ae_true, &b, _state); lincgresults(&s, &x0, &rep, _state); if( rep.terminationtype<=0 ) { result = ae_true; ae_frame_leave(_state); return result; } for(i=0; i<=n-1; i++) { if( ae_fp_greater(ae_fabs(xe.ptr.p_double[i]-x0.ptr.p_double[i], _state),5.0E-2/d.ptr.p_double[i]) ) { result = ae_true; ae_frame_leave(_state); return result; } } lincgsetprecunit(&s, _state); lincgsolvesparse(&s, &sa, ae_true, &b, _state); lincgresults(&s, &x0, &rep, _state); if( rep.terminationtype>0 ) { bflag = ae_false; for(i=0; i<=n-1; i++) { bflag = bflag||ae_fp_greater(ae_fabs(xe.ptr.p_double[i]-x0.ptr.p_double[i], _state),5.0E-2/d.ptr.p_double[i]); } if( !bflag ) { result = ae_true; ae_frame_leave(_state); return result; } } lincgsetprecdiag(&s, _state); lincgsolvesparse(&s, &sa, ae_true, &b, _state); lincgresults(&s, &x0, &rep, _state); if( rep.terminationtype<=0 ) { result = ae_true; ae_frame_leave(_state); return result; } for(i=0; i<=n-1; i++) { if( ae_fp_greater(ae_fabs(xe.ptr.p_double[i]-x0.ptr.p_double[i], _state),5.0E-2/d.ptr.p_double[i]) ) { result = ae_true; ae_frame_leave(_state); return result; } } /* *test has been passed */ if( !silent ) { printf("PrecondTest::Ok\n"); } result = ae_false; ae_frame_leave(_state); return result; } /************************************************************************* Orthogonalization by Gram-Shmidt method. 
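(A is a K x N matrix whose rows are the input vectors; B receives an
orthonormal set spanning approximately the same subspace, and K2 is the number
of rows actually kept. Rows whose norm - before or after projection onto the
already accepted rows - falls below Eps times the largest input row norm are
dropped.)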
*************************************************************************/ static void testlincgunit_gramshmidtortnorm(/* Real */ ae_matrix* a, ae_int_t n, ae_int_t k, double eps, /* Real */ ae_matrix* b, ae_int_t* k2, ae_state *_state) { double scaling; double tmp; double e; ae_int_t i; ae_int_t j; ae_int_t l; ae_int_t m; double sc; ae_matrix_clear(b); *k2 = 0; *k2 = 0; scaling = (double)(0); ae_matrix_set_length(b, k, n, _state); for(i=0; i<=k-1; i++) { tmp = (double)(0); for(j=0; j<=n-1; j++) { tmp = tmp+a->ptr.pp_double[i][j]*a->ptr.pp_double[i][j]; } if( ae_fp_greater(tmp,scaling) ) { scaling = tmp; } } scaling = ae_sqrt(scaling, _state); e = eps*scaling; for(i=0; i<=k-1; i++) { tmp = (double)(0); for(j=0; j<=n-1; j++) { b->ptr.pp_double[*k2][j] = a->ptr.pp_double[i][j]; tmp = tmp+a->ptr.pp_double[i][j]*a->ptr.pp_double[i][j]; } tmp = ae_sqrt(tmp, _state); if( ae_fp_less_eq(tmp,e) ) { continue; } for(j=0; j<=*k2-1; j++) { sc = (double)(0); for(m=0; m<=n-1; m++) { sc = sc+b->ptr.pp_double[*k2][m]*b->ptr.pp_double[j][m]; } for(l=0; l<=n-1; l++) { b->ptr.pp_double[*k2][l] = b->ptr.pp_double[*k2][l]-sc*b->ptr.pp_double[j][l]; } } tmp = (double)(0); for(j=0; j<=n-1; j++) { tmp = tmp+b->ptr.pp_double[*k2][j]*b->ptr.pp_double[*k2][j]; } tmp = ae_sqrt(tmp, _state); if( ae_fp_less_eq(tmp,e) ) { continue; } else { for(j=0; j<=n-1; j++) { b->ptr.pp_double[*k2][j] = b->ptr.pp_double[*k2][j]/tmp; } } *k2 = *k2+1; } } /************************************************************************* Checks that a vector belongs to the basis. *************************************************************************/ static ae_bool testlincgunit_frombasis(/* Real */ ae_vector* x, /* Real */ ae_matrix* basis, ae_int_t n, ae_int_t k, double eps, ae_state *_state) { ae_frame _frame_block; double normx; ae_matrix ortnormbasis; ae_int_t k2; ae_int_t i; ae_int_t j; double alpha; ae_vector alphas; ae_bool result; ae_frame_make(_state, &_frame_block); ae_matrix_init(&ortnormbasis, 0, 0, DT_REAL, _state); ae_vector_init(&alphas, 0, DT_REAL, _state); ae_vector_set_length(&alphas, k, _state); /* *calculating NORM for X */ normx = (double)(0); for(i=0; i<=n-1; i++) { normx = normx+x->ptr.p_double[i]*x->ptr.p_double[i]; } normx = ae_sqrt(normx, _state); /* *Gram-Shmidt method */ testlincgunit_gramshmidtortnorm(basis, n, k, eps, &ortnormbasis, &k2, _state); for(i=0; i<=k2-1; i++) { alpha = (double)(0); for(j=0; j<=n-1; j++) { alpha = alpha+x->ptr.p_double[j]*ortnormbasis.ptr.pp_double[i][j]; } alphas.ptr.p_double[i] = alpha; } /* *check */ for(i=0; i<=n-1; i++) { alpha = (double)(0); for(j=0; j<=k2-1; j++) { alpha = alpha+alphas.ptr.p_double[j]*ortnormbasis.ptr.pp_double[j][i]; } if( ae_fp_greater(ae_fabs(x->ptr.p_double[i]-alpha, _state),normx*eps) ) { result = ae_false; ae_frame_leave(_state); return result; } } result = ae_true; ae_frame_leave(_state); return result; } static ae_bool testalglibbasicsunit_testcomplexarithmetics(ae_bool silent, ae_state *_state); static ae_bool testalglibbasicsunit_testieeespecial(ae_bool silent, ae_state *_state); static ae_bool testalglibbasicsunit_testswapfunctions(ae_bool silent, ae_state *_state); static ae_bool testalglibbasicsunit_teststandardfunctions(ae_bool silent, ae_state *_state); static ae_bool testalglibbasicsunit_testserializationfunctions(ae_bool silent, ae_state *_state); static void testalglibbasicsunit_createpoolandrecords(poolrec2* seedrec2, poolrec2* seedrec2copy, ae_shared_pool* pool, ae_state *_state); static ae_bool testalglibbasicsunit_sharedpoolerrors(ae_state 
*_state); static ae_bool testalglibbasicsunit_testsharedpool(ae_bool silent, ae_state *_state); static void testalglibbasicsunit_testsort0func(/* Integer */ ae_vector* a, /* Integer */ ae_vector* buf, ae_int_t idx0, ae_int_t idx2, ae_state *_state); static ae_bool testalglibbasicsunit_performtestsort0(ae_state *_state); static void testalglibbasicsunit_testsort1func(/* Integer */ ae_vector* a, /* Integer */ ae_vector* buf, ae_int_t idx0, ae_int_t idx2, ae_bool usesmp, ae_state *_state); static ae_bool testalglibbasicsunit_performtestsort1(ae_state *_state); static void testalglibbasicsunit_testsort2func(/* Integer */ ae_vector* a, /* Integer */ ae_vector* buf, ae_int_t idx0, ae_int_t idx2, ae_state *_state); static ae_bool testalglibbasicsunit_performtestsort2(ae_state *_state); static ae_bool testalglibbasicsunit_performtestpoolsum(ae_state *_state); static void testalglibbasicsunit_parallelpoolsum(ae_shared_pool* sumpool, ae_int_t ind0, ae_int_t ind1, ae_state *_state); static void testalglibbasicsunit_mergesortedarrays(/* Integer */ ae_vector* a, /* Integer */ ae_vector* buf, ae_int_t idx0, ae_int_t idx1, ae_int_t idx2, ae_state *_state); static ae_bool testalglibbasicsunit_testsmp(ae_bool silent, ae_state *_state); void rec4serializationalloc(ae_serializer* s, rec4serialization* v, ae_state *_state) { ae_int_t i; /* * boolean fields */ ae_serializer_alloc_entry(s); for(i=0; i<=v->b.cnt-1; i++) { ae_serializer_alloc_entry(s); } /* * integer fields */ ae_serializer_alloc_entry(s); for(i=0; i<=v->i.cnt-1; i++) { ae_serializer_alloc_entry(s); } /* * real fields */ ae_serializer_alloc_entry(s); for(i=0; i<=v->r.cnt-1; i++) { ae_serializer_alloc_entry(s); } } void rec4serializationserialize(ae_serializer* s, rec4serialization* v, ae_state *_state) { ae_int_t i; /* * boolean fields */ ae_serializer_serialize_int(s, v->b.cnt, _state); for(i=0; i<=v->b.cnt-1; i++) { ae_serializer_serialize_bool(s, v->b.ptr.p_bool[i], _state); } /* * integer fields */ ae_serializer_serialize_int(s, v->i.cnt, _state); for(i=0; i<=v->i.cnt-1; i++) { ae_serializer_serialize_int(s, v->i.ptr.p_int[i], _state); } /* * real fields */ ae_serializer_serialize_int(s, v->r.cnt, _state); for(i=0; i<=v->r.cnt-1; i++) { ae_serializer_serialize_double(s, v->r.ptr.p_double[i], _state); } } void rec4serializationunserialize(ae_serializer* s, rec4serialization* v, ae_state *_state) { ae_int_t i; ae_int_t k; ae_bool bv; ae_int_t iv; double rv; _rec4serialization_clear(v); /* * boolean fields */ ae_serializer_unserialize_int(s, &k, _state); if( k>0 ) { ae_vector_set_length(&v->b, k, _state); for(i=0; i<=k-1; i++) { ae_serializer_unserialize_bool(s, &bv, _state); v->b.ptr.p_bool[i] = bv; } } /* * integer fields */ ae_serializer_unserialize_int(s, &k, _state); if( k>0 ) { ae_vector_set_length(&v->i, k, _state); for(i=0; i<=k-1; i++) { ae_serializer_unserialize_int(s, &iv, _state); v->i.ptr.p_int[i] = iv; } } /* * real fields */ ae_serializer_unserialize_int(s, &k, _state); if( k>0 ) { ae_vector_set_length(&v->r, k, _state); for(i=0; i<=k-1; i++) { ae_serializer_unserialize_double(s, &rv, _state); v->r.ptr.p_double[i] = rv; } } } ae_bool testalglibbasics(ae_bool silent, ae_state *_state) { ae_bool result; result = ae_true; result = result&&testalglibbasicsunit_testcomplexarithmetics(silent, _state); result = result&&testalglibbasicsunit_testieeespecial(silent, _state); result = result&&testalglibbasicsunit_testswapfunctions(silent, _state); result = result&&testalglibbasicsunit_teststandardfunctions(silent, _state); result = 
result&&testalglibbasicsunit_testserializationfunctions(silent, _state); result = result&&testalglibbasicsunit_testsharedpool(silent, _state); result = result&&testalglibbasicsunit_testsmp(silent, _state); if( !silent ) { printf("\n\n"); } return result; } /************************************************************************* Single-threaded stub. HPC ALGLIB replaces it by multithreaded code. *************************************************************************/ ae_bool _pexec_testalglibbasics(ae_bool silent, ae_state *_state) { return testalglibbasics(silent, _state); } /************************************************************************* Complex arithmetics test *************************************************************************/ static ae_bool testalglibbasicsunit_testcomplexarithmetics(ae_bool silent, ae_state *_state) { ae_bool absc; ae_bool addcc; ae_bool addcr; ae_bool addrc; ae_bool subcc; ae_bool subcr; ae_bool subrc; ae_bool mulcc; ae_bool mulcr; ae_bool mulrc; ae_bool divcc; ae_bool divcr; ae_bool divrc; ae_complex ca; ae_complex cb; ae_complex res; double ra; double rb; double threshold; ae_int_t pass; ae_int_t passcount; ae_bool result; threshold = 100*ae_machineepsilon; passcount = 1000; result = ae_true; absc = ae_true; addcc = ae_true; addcr = ae_true; addrc = ae_true; subcc = ae_true; subcr = ae_true; subrc = ae_true; mulcc = ae_true; mulcr = ae_true; mulrc = ae_true; divcc = ae_true; divcr = ae_true; divrc = ae_true; for(pass=1; pass<=passcount; pass++) { /* * Test AbsC */ ca.x = 2*ae_randomreal(_state)-1; ca.y = 2*ae_randomreal(_state)-1; ra = ae_c_abs(ca, _state); absc = absc&&ae_fp_less(ae_fabs(ra-ae_sqrt(ae_sqr(ca.x, _state)+ae_sqr(ca.y, _state), _state), _state),threshold); /* * test Add */ ca.x = 2*ae_randomreal(_state)-1; ca.y = 2*ae_randomreal(_state)-1; cb.x = 2*ae_randomreal(_state)-1; cb.y = 2*ae_randomreal(_state)-1; ra = 2*ae_randomreal(_state)-1; rb = 2*ae_randomreal(_state)-1; res = ae_c_add(ca,cb); addcc = (addcc&&ae_fp_less(ae_fabs(res.x-ca.x-cb.x, _state),threshold))&&ae_fp_less(ae_fabs(res.y-ca.y-cb.y, _state),threshold); res = ae_c_add_d(ca,rb); addcr = (addcr&&ae_fp_less(ae_fabs(res.x-ca.x-rb, _state),threshold))&&ae_fp_less(ae_fabs(res.y-ca.y, _state),threshold); res = ae_c_add_d(cb,ra); addrc = (addrc&&ae_fp_less(ae_fabs(res.x-ra-cb.x, _state),threshold))&&ae_fp_less(ae_fabs(res.y-cb.y, _state),threshold); /* * test Sub */ ca.x = 2*ae_randomreal(_state)-1; ca.y = 2*ae_randomreal(_state)-1; cb.x = 2*ae_randomreal(_state)-1; cb.y = 2*ae_randomreal(_state)-1; ra = 2*ae_randomreal(_state)-1; rb = 2*ae_randomreal(_state)-1; res = ae_c_sub(ca,cb); subcc = (subcc&&ae_fp_less(ae_fabs(res.x-(ca.x-cb.x), _state),threshold))&&ae_fp_less(ae_fabs(res.y-(ca.y-cb.y), _state),threshold); res = ae_c_sub_d(ca,rb); subcr = (subcr&&ae_fp_less(ae_fabs(res.x-(ca.x-rb), _state),threshold))&&ae_fp_less(ae_fabs(res.y-ca.y, _state),threshold); res = ae_c_d_sub(ra,cb); subrc = (subrc&&ae_fp_less(ae_fabs(res.x-(ra-cb.x), _state),threshold))&&ae_fp_less(ae_fabs(res.y+cb.y, _state),threshold); /* * test Mul */ ca.x = 2*ae_randomreal(_state)-1; ca.y = 2*ae_randomreal(_state)-1; cb.x = 2*ae_randomreal(_state)-1; cb.y = 2*ae_randomreal(_state)-1; ra = 2*ae_randomreal(_state)-1; rb = 2*ae_randomreal(_state)-1; res = ae_c_mul(ca,cb); mulcc = (mulcc&&ae_fp_less(ae_fabs(res.x-(ca.x*cb.x-ca.y*cb.y), _state),threshold))&&ae_fp_less(ae_fabs(res.y-(ca.x*cb.y+ca.y*cb.x), _state),threshold); res = ae_c_mul_d(ca,rb); mulcr = (mulcr&&ae_fp_less(ae_fabs(res.x-ca.x*rb, 
_state),threshold))&&ae_fp_less(ae_fabs(res.y-ca.y*rb, _state),threshold); res = ae_c_mul_d(cb,ra); mulrc = (mulrc&&ae_fp_less(ae_fabs(res.x-ra*cb.x, _state),threshold))&&ae_fp_less(ae_fabs(res.y-ra*cb.y, _state),threshold); /* * test Div */ ca.x = 2*ae_randomreal(_state)-1; ca.y = 2*ae_randomreal(_state)-1; do { cb.x = 2*ae_randomreal(_state)-1; cb.y = 2*ae_randomreal(_state)-1; } while(ae_fp_less_eq(ae_c_abs(cb, _state),0.5)); ra = 2*ae_randomreal(_state)-1; do { rb = 2*ae_randomreal(_state)-1; } while(ae_fp_less_eq(ae_fabs(rb, _state),0.5)); res = ae_c_div(ca,cb); divcc = (divcc&&ae_fp_less(ae_fabs(ae_c_mul(res,cb).x-ca.x, _state),threshold))&&ae_fp_less(ae_fabs(ae_c_mul(res,cb).y-ca.y, _state),threshold); res = ae_c_div_d(ca,rb); divcr = (divcr&&ae_fp_less(ae_fabs(res.x-ca.x/rb, _state),threshold))&&ae_fp_less(ae_fabs(res.y-ca.y/rb, _state),threshold); res = ae_c_d_div(ra,cb); divrc = (divrc&&ae_fp_less(ae_fabs(ae_c_mul(res,cb).x-ra, _state),threshold))&&ae_fp_less(ae_fabs(ae_c_mul(res,cb).y, _state),threshold); } /* * summary */ result = result&&absc; result = result&&addcc; result = result&&addcr; result = result&&addrc; result = result&&subcc; result = result&&subcr; result = result&&subrc; result = result&&mulcc; result = result&&mulcr; result = result&&mulrc; result = result&&divcc; result = result&&divcr; result = result&&divrc; if( !silent ) { if( result ) { printf("COMPLEX ARITHMETICS: OK\n"); } else { printf("COMPLEX ARITHMETICS: FAILED\n"); printf("* AddCC - - - - - - - - - - - - - - - - "); if( addcc ) { printf("OK\n"); } else { printf("FAILED\n"); } printf("* AddCR - - - - - - - - - - - - - - - - "); if( addcr ) { printf("OK\n"); } else { printf("FAILED\n"); } printf("* AddRC - - - - - - - - - - - - - - - - "); if( addrc ) { printf("OK\n"); } else { printf("FAILED\n"); } printf("* SubCC - - - - - - - - - - - - - - - - "); if( subcc ) { printf("OK\n"); } else { printf("FAILED\n"); } printf("* SubCR - - - - - - - - - - - - - - - - "); if( subcr ) { printf("OK\n"); } else { printf("FAILED\n"); } printf("* SubRC - - - - - - - - - - - - - - - - "); if( subrc ) { printf("OK\n"); } else { printf("FAILED\n"); } printf("* MulCC - - - - - - - - - - - - - - - - "); if( mulcc ) { printf("OK\n"); } else { printf("FAILED\n"); } printf("* MulCR - - - - - - - - - - - - - - - - "); if( mulcr ) { printf("OK\n"); } else { printf("FAILED\n"); } printf("* MulRC - - - - - - - - - - - - - - - - "); if( mulrc ) { printf("OK\n"); } else { printf("FAILED\n"); } printf("* DivCC - - - - - - - - - - - - - - - - "); if( divcc ) { printf("OK\n"); } else { printf("FAILED\n"); } printf("* DivCR - - - - - - - - - - - - - - - - "); if( divcr ) { printf("OK\n"); } else { printf("FAILED\n"); } printf("* DivRC - - - - - - - - - - - - - - - - "); if( divrc ) { printf("OK\n"); } else { printf("FAILED\n"); } } } return result; } /************************************************************************* Tests for IEEE special quantities *************************************************************************/ static ae_bool testalglibbasicsunit_testieeespecial(ae_bool silent, ae_state *_state) { ae_bool oknan; ae_bool okinf; ae_bool okother; double v1; double v2; ae_bool result; result = ae_true; oknan = ae_true; okinf = ae_true; okother = ae_true; /* * Test classification functions */ okother = okother&&!ae_isinf(_state->v_nan, _state); okother = okother&&ae_isinf(_state->v_posinf, _state); okother = okother&&!ae_isinf(ae_maxrealnumber, _state); okother = okother&&!ae_isinf(1.0, _state); okother = 
okother&&!ae_isinf(ae_minrealnumber, _state); okother = okother&&!ae_isinf(0.0, _state); okother = okother&&!ae_isinf(-ae_minrealnumber, _state); okother = okother&&!ae_isinf(-1.0, _state); okother = okother&&!ae_isinf(-ae_maxrealnumber, _state); okother = okother&&ae_isinf(_state->v_neginf, _state); okother = okother&&!ae_isposinf(_state->v_nan, _state); okother = okother&&ae_isposinf(_state->v_posinf, _state); okother = okother&&!ae_isposinf(ae_maxrealnumber, _state); okother = okother&&!ae_isposinf(1.0, _state); okother = okother&&!ae_isposinf(ae_minrealnumber, _state); okother = okother&&!ae_isposinf(0.0, _state); okother = okother&&!ae_isposinf(-ae_minrealnumber, _state); okother = okother&&!ae_isposinf(-1.0, _state); okother = okother&&!ae_isposinf(-ae_maxrealnumber, _state); okother = okother&&!ae_isposinf(_state->v_neginf, _state); okother = okother&&!ae_isneginf(_state->v_nan, _state); okother = okother&&!ae_isneginf(_state->v_posinf, _state); okother = okother&&!ae_isneginf(ae_maxrealnumber, _state); okother = okother&&!ae_isneginf(1.0, _state); okother = okother&&!ae_isneginf(ae_minrealnumber, _state); okother = okother&&!ae_isneginf(0.0, _state); okother = okother&&!ae_isneginf(-ae_minrealnumber, _state); okother = okother&&!ae_isneginf(-1.0, _state); okother = okother&&!ae_isneginf(-ae_maxrealnumber, _state); okother = okother&&ae_isneginf(_state->v_neginf, _state); okother = okother&&ae_isnan(_state->v_nan, _state); okother = okother&&!ae_isnan(_state->v_posinf, _state); okother = okother&&!ae_isnan(ae_maxrealnumber, _state); okother = okother&&!ae_isnan(1.0, _state); okother = okother&&!ae_isnan(ae_minrealnumber, _state); okother = okother&&!ae_isnan(0.0, _state); okother = okother&&!ae_isnan(-ae_minrealnumber, _state); okother = okother&&!ae_isnan(-1.0, _state); okother = okother&&!ae_isnan(-ae_maxrealnumber, _state); okother = okother&&!ae_isnan(_state->v_neginf, _state); okother = okother&&!ae_isfinite(_state->v_nan, _state); okother = okother&&!ae_isfinite(_state->v_posinf, _state); okother = okother&&ae_isfinite(ae_maxrealnumber, _state); okother = okother&&ae_isfinite(1.0, _state); okother = okother&&ae_isfinite(ae_minrealnumber, _state); okother = okother&&ae_isfinite(0.0, _state); okother = okother&&ae_isfinite(-ae_minrealnumber, _state); okother = okother&&ae_isfinite(-1.0, _state); okother = okother&&ae_isfinite(-ae_maxrealnumber, _state); okother = okother&&!ae_isfinite(_state->v_neginf, _state); /* * Test NAN */ v1 = _state->v_nan; v2 = _state->v_nan; oknan = oknan&&ae_isnan(v1, _state); oknan = oknan&&ae_fp_neq(v1,v2); oknan = oknan&&!ae_fp_eq(v1,v2); /* * Test INF: * * basic properties * * comparisons involving PosINF on one of the sides * * comparisons involving NegINF on one of the sides */ v1 = _state->v_posinf; v2 = _state->v_neginf; okinf = okinf&&ae_isinf(_state->v_posinf, _state); okinf = okinf&&ae_isinf(v1, _state); okinf = okinf&&ae_isinf(_state->v_neginf, _state); okinf = okinf&&ae_isinf(v2, _state); okinf = okinf&&ae_isposinf(_state->v_posinf, _state); okinf = okinf&&ae_isposinf(v1, _state); okinf = okinf&&!ae_isposinf(_state->v_neginf, _state); okinf = okinf&&!ae_isposinf(v2, _state); okinf = okinf&&!ae_isneginf(_state->v_posinf, _state); okinf = okinf&&!ae_isneginf(v1, _state); okinf = okinf&&ae_isneginf(_state->v_neginf, _state); okinf = okinf&&ae_isneginf(v2, _state); okinf = okinf&&ae_fp_eq(_state->v_posinf,_state->v_posinf); okinf = okinf&&ae_fp_eq(_state->v_posinf,v1); okinf = okinf&&!ae_fp_eq(_state->v_posinf,_state->v_neginf); okinf = 
okinf&&!ae_fp_eq(_state->v_posinf,v2); okinf = okinf&&!ae_fp_eq(_state->v_posinf,(double)(0)); okinf = okinf&&!ae_fp_eq(_state->v_posinf,1.2); okinf = okinf&&!ae_fp_eq(_state->v_posinf,-1.2); okinf = okinf&&ae_fp_eq(v1,_state->v_posinf); okinf = okinf&&!ae_fp_eq(_state->v_neginf,_state->v_posinf); okinf = okinf&&!ae_fp_eq(v2,_state->v_posinf); okinf = okinf&&!ae_fp_eq((double)(0),_state->v_posinf); okinf = okinf&&!ae_fp_eq(1.2,_state->v_posinf); okinf = okinf&&!ae_fp_eq(-1.2,_state->v_posinf); okinf = okinf&&!ae_fp_neq(_state->v_posinf,_state->v_posinf); okinf = okinf&&!ae_fp_neq(_state->v_posinf,v1); okinf = okinf&&ae_fp_neq(_state->v_posinf,_state->v_neginf); okinf = okinf&&ae_fp_neq(_state->v_posinf,v2); okinf = okinf&&ae_fp_neq(_state->v_posinf,(double)(0)); okinf = okinf&&ae_fp_neq(_state->v_posinf,1.2); okinf = okinf&&ae_fp_neq(_state->v_posinf,-1.2); okinf = okinf&&!ae_fp_neq(v1,_state->v_posinf); okinf = okinf&&ae_fp_neq(_state->v_neginf,_state->v_posinf); okinf = okinf&&ae_fp_neq(v2,_state->v_posinf); okinf = okinf&&ae_fp_neq((double)(0),_state->v_posinf); okinf = okinf&&ae_fp_neq(1.2,_state->v_posinf); okinf = okinf&&ae_fp_neq(-1.2,_state->v_posinf); okinf = okinf&&!ae_fp_less(_state->v_posinf,_state->v_posinf); okinf = okinf&&!ae_fp_less(_state->v_posinf,v1); okinf = okinf&&!ae_fp_less(_state->v_posinf,_state->v_neginf); okinf = okinf&&!ae_fp_less(_state->v_posinf,v2); okinf = okinf&&!ae_fp_less(_state->v_posinf,(double)(0)); okinf = okinf&&!ae_fp_less(_state->v_posinf,1.2); okinf = okinf&&!ae_fp_less(_state->v_posinf,-1.2); okinf = okinf&&!ae_fp_less(v1,_state->v_posinf); okinf = okinf&&ae_fp_less(_state->v_neginf,_state->v_posinf); okinf = okinf&&ae_fp_less(v2,_state->v_posinf); okinf = okinf&&ae_fp_less((double)(0),_state->v_posinf); okinf = okinf&&ae_fp_less(1.2,_state->v_posinf); okinf = okinf&&ae_fp_less(-1.2,_state->v_posinf); okinf = okinf&&ae_fp_less_eq(_state->v_posinf,_state->v_posinf); okinf = okinf&&ae_fp_less_eq(_state->v_posinf,v1); okinf = okinf&&!ae_fp_less_eq(_state->v_posinf,_state->v_neginf); okinf = okinf&&!ae_fp_less_eq(_state->v_posinf,v2); okinf = okinf&&!ae_fp_less_eq(_state->v_posinf,(double)(0)); okinf = okinf&&!ae_fp_less_eq(_state->v_posinf,1.2); okinf = okinf&&!ae_fp_less_eq(_state->v_posinf,-1.2); okinf = okinf&&ae_fp_less_eq(v1,_state->v_posinf); okinf = okinf&&ae_fp_less_eq(_state->v_neginf,_state->v_posinf); okinf = okinf&&ae_fp_less_eq(v2,_state->v_posinf); okinf = okinf&&ae_fp_less_eq((double)(0),_state->v_posinf); okinf = okinf&&ae_fp_less_eq(1.2,_state->v_posinf); okinf = okinf&&ae_fp_less_eq(-1.2,_state->v_posinf); okinf = okinf&&!ae_fp_greater(_state->v_posinf,_state->v_posinf); okinf = okinf&&!ae_fp_greater(_state->v_posinf,v1); okinf = okinf&&ae_fp_greater(_state->v_posinf,_state->v_neginf); okinf = okinf&&ae_fp_greater(_state->v_posinf,v2); okinf = okinf&&ae_fp_greater(_state->v_posinf,(double)(0)); okinf = okinf&&ae_fp_greater(_state->v_posinf,1.2); okinf = okinf&&ae_fp_greater(_state->v_posinf,-1.2); okinf = okinf&&!ae_fp_greater(v1,_state->v_posinf); okinf = okinf&&!ae_fp_greater(_state->v_neginf,_state->v_posinf); okinf = okinf&&!ae_fp_greater(v2,_state->v_posinf); okinf = okinf&&!ae_fp_greater((double)(0),_state->v_posinf); okinf = okinf&&!ae_fp_greater(1.2,_state->v_posinf); okinf = okinf&&!ae_fp_greater(-1.2,_state->v_posinf); okinf = okinf&&ae_fp_greater_eq(_state->v_posinf,_state->v_posinf); okinf = okinf&&ae_fp_greater_eq(_state->v_posinf,v1); okinf = okinf&&ae_fp_greater_eq(_state->v_posinf,_state->v_neginf); okinf = 
okinf&&ae_fp_greater_eq(_state->v_posinf,v2); okinf = okinf&&ae_fp_greater_eq(_state->v_posinf,(double)(0)); okinf = okinf&&ae_fp_greater_eq(_state->v_posinf,1.2); okinf = okinf&&ae_fp_greater_eq(_state->v_posinf,-1.2); okinf = okinf&&ae_fp_greater_eq(v1,_state->v_posinf); okinf = okinf&&!ae_fp_greater_eq(_state->v_neginf,_state->v_posinf); okinf = okinf&&!ae_fp_greater_eq(v2,_state->v_posinf); okinf = okinf&&!ae_fp_greater_eq((double)(0),_state->v_posinf); okinf = okinf&&!ae_fp_greater_eq(1.2,_state->v_posinf); okinf = okinf&&!ae_fp_greater_eq(-1.2,_state->v_posinf); okinf = okinf&&!ae_fp_eq(_state->v_neginf,_state->v_posinf); okinf = okinf&&!ae_fp_eq(_state->v_neginf,v1); okinf = okinf&&ae_fp_eq(_state->v_neginf,_state->v_neginf); okinf = okinf&&ae_fp_eq(_state->v_neginf,v2); okinf = okinf&&!ae_fp_eq(_state->v_neginf,(double)(0)); okinf = okinf&&!ae_fp_eq(_state->v_neginf,1.2); okinf = okinf&&!ae_fp_eq(_state->v_neginf,-1.2); okinf = okinf&&!ae_fp_eq(v1,_state->v_neginf); okinf = okinf&&ae_fp_eq(_state->v_neginf,_state->v_neginf); okinf = okinf&&ae_fp_eq(v2,_state->v_neginf); okinf = okinf&&!ae_fp_eq((double)(0),_state->v_neginf); okinf = okinf&&!ae_fp_eq(1.2,_state->v_neginf); okinf = okinf&&!ae_fp_eq(-1.2,_state->v_neginf); okinf = okinf&&ae_fp_neq(_state->v_neginf,_state->v_posinf); okinf = okinf&&ae_fp_neq(_state->v_neginf,v1); okinf = okinf&&!ae_fp_neq(_state->v_neginf,_state->v_neginf); okinf = okinf&&!ae_fp_neq(_state->v_neginf,v2); okinf = okinf&&ae_fp_neq(_state->v_neginf,(double)(0)); okinf = okinf&&ae_fp_neq(_state->v_neginf,1.2); okinf = okinf&&ae_fp_neq(_state->v_neginf,-1.2); okinf = okinf&&ae_fp_neq(v1,_state->v_neginf); okinf = okinf&&!ae_fp_neq(_state->v_neginf,_state->v_neginf); okinf = okinf&&!ae_fp_neq(v2,_state->v_neginf); okinf = okinf&&ae_fp_neq((double)(0),_state->v_neginf); okinf = okinf&&ae_fp_neq(1.2,_state->v_neginf); okinf = okinf&&ae_fp_neq(-1.2,_state->v_neginf); okinf = okinf&&ae_fp_less(_state->v_neginf,_state->v_posinf); okinf = okinf&&ae_fp_less(_state->v_neginf,v1); okinf = okinf&&!ae_fp_less(_state->v_neginf,_state->v_neginf); okinf = okinf&&!ae_fp_less(_state->v_neginf,v2); okinf = okinf&&ae_fp_less(_state->v_neginf,(double)(0)); okinf = okinf&&ae_fp_less(_state->v_neginf,1.2); okinf = okinf&&ae_fp_less(_state->v_neginf,-1.2); okinf = okinf&&!ae_fp_less(v1,_state->v_neginf); okinf = okinf&&!ae_fp_less(_state->v_neginf,_state->v_neginf); okinf = okinf&&!ae_fp_less(v2,_state->v_neginf); okinf = okinf&&!ae_fp_less((double)(0),_state->v_neginf); okinf = okinf&&!ae_fp_less(1.2,_state->v_neginf); okinf = okinf&&!ae_fp_less(-1.2,_state->v_neginf); okinf = okinf&&ae_fp_less_eq(_state->v_neginf,_state->v_posinf); okinf = okinf&&ae_fp_less_eq(_state->v_neginf,v1); okinf = okinf&&ae_fp_less_eq(_state->v_neginf,_state->v_neginf); okinf = okinf&&ae_fp_less_eq(_state->v_neginf,v2); okinf = okinf&&ae_fp_less_eq(_state->v_neginf,(double)(0)); okinf = okinf&&ae_fp_less_eq(_state->v_neginf,1.2); okinf = okinf&&ae_fp_less_eq(_state->v_neginf,-1.2); okinf = okinf&&!ae_fp_less_eq(v1,_state->v_neginf); okinf = okinf&&ae_fp_less_eq(_state->v_neginf,_state->v_neginf); okinf = okinf&&ae_fp_less_eq(v2,_state->v_neginf); okinf = okinf&&!ae_fp_less_eq((double)(0),_state->v_neginf); okinf = okinf&&!ae_fp_less_eq(1.2,_state->v_neginf); okinf = okinf&&!ae_fp_less_eq(-1.2,_state->v_neginf); okinf = okinf&&!ae_fp_greater(_state->v_neginf,_state->v_posinf); okinf = okinf&&!ae_fp_greater(_state->v_neginf,v1); okinf = okinf&&!ae_fp_greater(_state->v_neginf,_state->v_neginf); okinf = 
okinf&&!ae_fp_greater(_state->v_neginf,v2); okinf = okinf&&!ae_fp_greater(_state->v_neginf,(double)(0)); okinf = okinf&&!ae_fp_greater(_state->v_neginf,1.2); okinf = okinf&&!ae_fp_greater(_state->v_neginf,-1.2); okinf = okinf&&ae_fp_greater(v1,_state->v_neginf); okinf = okinf&&!ae_fp_greater(_state->v_neginf,_state->v_neginf); okinf = okinf&&!ae_fp_greater(v2,_state->v_neginf); okinf = okinf&&ae_fp_greater((double)(0),_state->v_neginf); okinf = okinf&&ae_fp_greater(1.2,_state->v_neginf); okinf = okinf&&ae_fp_greater(-1.2,_state->v_neginf); okinf = okinf&&!ae_fp_greater_eq(_state->v_neginf,_state->v_posinf); okinf = okinf&&!ae_fp_greater_eq(_state->v_neginf,v1); okinf = okinf&&ae_fp_greater_eq(_state->v_neginf,_state->v_neginf); okinf = okinf&&ae_fp_greater_eq(_state->v_neginf,v2); okinf = okinf&&!ae_fp_greater_eq(_state->v_neginf,(double)(0)); okinf = okinf&&!ae_fp_greater_eq(_state->v_neginf,1.2); okinf = okinf&&!ae_fp_greater_eq(_state->v_neginf,-1.2); okinf = okinf&&ae_fp_greater_eq(v1,_state->v_neginf); okinf = okinf&&ae_fp_greater_eq(_state->v_neginf,_state->v_neginf); okinf = okinf&&ae_fp_greater_eq(v2,_state->v_neginf); okinf = okinf&&ae_fp_greater_eq((double)(0),_state->v_neginf); okinf = okinf&&ae_fp_greater_eq(1.2,_state->v_neginf); okinf = okinf&&ae_fp_greater_eq(-1.2,_state->v_neginf); /* * summary */ result = result&&oknan; result = result&&okinf; result = result&&okother; if( !silent ) { if( result ) { printf("IEEE SPECIAL VALUES: OK\n"); } else { printf("IEEE SPECIAL VALUES: FAILED\n"); printf("* NAN - - - - - - - - - - - - - - - - - "); if( oknan ) { printf("OK\n"); } else { printf("FAILED\n"); } printf("* INF - - - - - - - - - - - - - - - - - "); if( okinf ) { printf("OK\n"); } else { printf("FAILED\n"); } printf("* FUNCTIONS - - - - - - - - - - - - - - "); if( okother ) { printf("OK\n"); } else { printf("FAILED\n"); } } } return result; } /************************************************************************* Tests for swapping functions *************************************************************************/ static ae_bool testalglibbasicsunit_testswapfunctions(ae_bool silent, ae_state *_state) { ae_frame _frame_block; ae_bool okb1; ae_bool okb2; ae_bool oki1; ae_bool oki2; ae_bool okr1; ae_bool okr2; ae_bool okc1; ae_bool okc2; ae_vector b11; ae_vector b12; ae_vector i11; ae_vector i12; ae_vector r11; ae_vector r12; ae_vector c11; ae_vector c12; ae_matrix b21; ae_matrix b22; ae_matrix i21; ae_matrix i22; ae_matrix r21; ae_matrix r22; ae_matrix c21; ae_matrix c22; ae_bool result; ae_frame_make(_state, &_frame_block); ae_vector_init(&b11, 0, DT_BOOL, _state); ae_vector_init(&b12, 0, DT_BOOL, _state); ae_vector_init(&i11, 0, DT_INT, _state); ae_vector_init(&i12, 0, DT_INT, _state); ae_vector_init(&r11, 0, DT_REAL, _state); ae_vector_init(&r12, 0, DT_REAL, _state); ae_vector_init(&c11, 0, DT_COMPLEX, _state); ae_vector_init(&c12, 0, DT_COMPLEX, _state); ae_matrix_init(&b21, 0, 0, DT_BOOL, _state); ae_matrix_init(&b22, 0, 0, DT_BOOL, _state); ae_matrix_init(&i21, 0, 0, DT_INT, _state); ae_matrix_init(&i22, 0, 0, DT_INT, _state); ae_matrix_init(&r21, 0, 0, DT_REAL, _state); ae_matrix_init(&r22, 0, 0, DT_REAL, _state); ae_matrix_init(&c21, 0, 0, DT_COMPLEX, _state); ae_matrix_init(&c22, 0, 0, DT_COMPLEX, _state); result = ae_true; okb1 = ae_true; okb2 = ae_true; oki1 = ae_true; oki2 = ae_true; okr1 = ae_true; okr2 = ae_true; okc1 = ae_true; okc2 = ae_true; /* * Test B1 swaps */ ae_vector_set_length(&b11, 1, _state); ae_vector_set_length(&b12, 2, _state); b11.ptr.p_bool[0] 
= ae_true; b12.ptr.p_bool[0] = ae_false; b12.ptr.p_bool[1] = ae_true; ae_swap_vectors(&b11, &b12); if( b11.cnt==2&&b12.cnt==1 ) { okb1 = okb1&&!b11.ptr.p_bool[0]; okb1 = okb1&&b11.ptr.p_bool[1]; okb1 = okb1&&b12.ptr.p_bool[0]; } else { okb1 = ae_false; } /* * Test I1 swaps */ ae_vector_set_length(&i11, 1, _state); ae_vector_set_length(&i12, 2, _state); i11.ptr.p_int[0] = 1; i12.ptr.p_int[0] = 2; i12.ptr.p_int[1] = 3; ae_swap_vectors(&i11, &i12); if( i11.cnt==2&&i12.cnt==1 ) { oki1 = oki1&&i11.ptr.p_int[0]==2; oki1 = oki1&&i11.ptr.p_int[1]==3; oki1 = oki1&&i12.ptr.p_int[0]==1; } else { oki1 = ae_false; } /* * Test R1 swaps */ ae_vector_set_length(&r11, 1, _state); ae_vector_set_length(&r12, 2, _state); r11.ptr.p_double[0] = 1.5; r12.ptr.p_double[0] = 2.5; r12.ptr.p_double[1] = 3.5; ae_swap_vectors(&r11, &r12); if( r11.cnt==2&&r12.cnt==1 ) { okr1 = okr1&&ae_fp_eq(r11.ptr.p_double[0],2.5); okr1 = okr1&&ae_fp_eq(r11.ptr.p_double[1],3.5); okr1 = okr1&&ae_fp_eq(r12.ptr.p_double[0],1.5); } else { okr1 = ae_false; } /* * Test C1 swaps */ ae_vector_set_length(&c11, 1, _state); ae_vector_set_length(&c12, 2, _state); c11.ptr.p_complex[0] = ae_complex_from_i(1); c12.ptr.p_complex[0] = ae_complex_from_i(2); c12.ptr.p_complex[1] = ae_complex_from_i(3); ae_swap_vectors(&c11, &c12); if( c11.cnt==2&&c12.cnt==1 ) { okc1 = okc1&&ae_c_eq_d(c11.ptr.p_complex[0],(double)(2)); okc1 = okc1&&ae_c_eq_d(c11.ptr.p_complex[1],(double)(3)); okc1 = okc1&&ae_c_eq_d(c12.ptr.p_complex[0],(double)(1)); } else { okc1 = ae_false; } /* * Test B2 swaps */ ae_matrix_set_length(&b21, 1, 2, _state); ae_matrix_set_length(&b22, 2, 1, _state); b21.ptr.pp_bool[0][0] = ae_true; b21.ptr.pp_bool[0][1] = ae_false; b22.ptr.pp_bool[0][0] = ae_false; b22.ptr.pp_bool[1][0] = ae_true; ae_swap_matrices(&b21, &b22); if( ((b21.rows==2&&b21.cols==1)&&b22.rows==1)&&b22.cols==2 ) { okb2 = okb2&&!b21.ptr.pp_bool[0][0]; okb2 = okb2&&b21.ptr.pp_bool[1][0]; okb2 = okb2&&b22.ptr.pp_bool[0][0]; okb2 = okb2&&!b22.ptr.pp_bool[0][1]; } else { okb2 = ae_false; } /* * Test I2 swaps */ ae_matrix_set_length(&i21, 1, 2, _state); ae_matrix_set_length(&i22, 2, 1, _state); i21.ptr.pp_int[0][0] = 1; i21.ptr.pp_int[0][1] = 2; i22.ptr.pp_int[0][0] = 3; i22.ptr.pp_int[1][0] = 4; ae_swap_matrices(&i21, &i22); if( ((i21.rows==2&&i21.cols==1)&&i22.rows==1)&&i22.cols==2 ) { oki2 = oki2&&i21.ptr.pp_int[0][0]==3; oki2 = oki2&&i21.ptr.pp_int[1][0]==4; oki2 = oki2&&i22.ptr.pp_int[0][0]==1; oki2 = oki2&&i22.ptr.pp_int[0][1]==2; } else { oki2 = ae_false; } /* * Test R2 swaps */ ae_matrix_set_length(&r21, 1, 2, _state); ae_matrix_set_length(&r22, 2, 1, _state); r21.ptr.pp_double[0][0] = (double)(1); r21.ptr.pp_double[0][1] = (double)(2); r22.ptr.pp_double[0][0] = (double)(3); r22.ptr.pp_double[1][0] = (double)(4); ae_swap_matrices(&r21, &r22); if( ((r21.rows==2&&r21.cols==1)&&r22.rows==1)&&r22.cols==2 ) { okr2 = okr2&&ae_fp_eq(r21.ptr.pp_double[0][0],(double)(3)); okr2 = okr2&&ae_fp_eq(r21.ptr.pp_double[1][0],(double)(4)); okr2 = okr2&&ae_fp_eq(r22.ptr.pp_double[0][0],(double)(1)); okr2 = okr2&&ae_fp_eq(r22.ptr.pp_double[0][1],(double)(2)); } else { okr2 = ae_false; } /* * Test C2 swaps */ ae_matrix_set_length(&c21, 1, 2, _state); ae_matrix_set_length(&c22, 2, 1, _state); c21.ptr.pp_complex[0][0] = ae_complex_from_i(1); c21.ptr.pp_complex[0][1] = ae_complex_from_i(2); c22.ptr.pp_complex[0][0] = ae_complex_from_i(3); c22.ptr.pp_complex[1][0] = ae_complex_from_i(4); ae_swap_matrices(&c21, &c22); if( ((c21.rows==2&&c21.cols==1)&&c22.rows==1)&&c22.cols==2 ) { okc2 = 
okc2&&ae_c_eq_d(c21.ptr.pp_complex[0][0],(double)(3)); okc2 = okc2&&ae_c_eq_d(c21.ptr.pp_complex[1][0],(double)(4)); okc2 = okc2&&ae_c_eq_d(c22.ptr.pp_complex[0][0],(double)(1)); okc2 = okc2&&ae_c_eq_d(c22.ptr.pp_complex[0][1],(double)(2)); } else { okc2 = ae_false; } /* * summary */ result = result&&okb1; result = result&&okb2; result = result&&oki1; result = result&&oki2; result = result&&okr1; result = result&&okr2; result = result&&okc1; result = result&&okc2; if( !silent ) { if( result ) { printf("SWAPPING FUNCTIONS: OK\n"); } else { printf("SWAPPING FUNCTIONS: FAILED\n"); } } ae_frame_leave(_state); return result; } /************************************************************************* Tests for standard functions *************************************************************************/ static ae_bool testalglibbasicsunit_teststandardfunctions(ae_bool silent, ae_state *_state) { ae_bool result; result = ae_true; /* * Test Sign() */ result = result&&ae_sign(1.2, _state)==1; result = result&&ae_sign((double)(0), _state)==0; result = result&&ae_sign(-1.2, _state)==-1; /* * summary */ if( !silent ) { if( result ) { printf("STANDARD FUNCTIONS: OK\n"); } else { printf("STANDARD FUNCTIONS: FAILED\n"); } } return result; } /************************************************************************* Tests for serualization functions *************************************************************************/ static ae_bool testalglibbasicsunit_testserializationfunctions(ae_bool silent, ae_state *_state) { ae_frame _frame_block; ae_bool okb; ae_bool oki; ae_bool okr; ae_int_t nb; ae_int_t ni; ae_int_t nr; ae_int_t i; rec4serialization r0; rec4serialization r1; ae_bool result; ae_frame_make(_state, &_frame_block); _rec4serialization_init(&r0, _state); _rec4serialization_init(&r1, _state); result = ae_true; okb = ae_true; oki = ae_true; okr = ae_true; for(nb=1; nb<=4; nb++) { for(ni=1; ni<=4; ni++) { for(nr=1; nr<=4; nr++) { ae_vector_set_length(&r0.b, nb, _state); for(i=0; i<=nb-1; i++) { r0.b.ptr.p_bool[i] = ae_randominteger(2, _state)!=0; } ae_vector_set_length(&r0.i, ni, _state); for(i=0; i<=ni-1; i++) { r0.i.ptr.p_int[i] = ae_randominteger(10, _state)-5; } ae_vector_set_length(&r0.r, nr, _state); for(i=0; i<=nr-1; i++) { r0.r.ptr.p_double[i] = 2*ae_randomreal(_state)-1; } { /* * This code passes data structure through serializers * (serializes it to string and loads back) */ ae_serializer _local_serializer; ae_int_t _local_ssize; ae_frame _local_frame_block; ae_dyn_block _local_dynamic_block; ae_frame_make(_state, &_local_frame_block); ae_serializer_init(&_local_serializer); ae_serializer_alloc_start(&_local_serializer); rec4serializationalloc(&_local_serializer, &r0, _state); _local_ssize = ae_serializer_get_alloc_size(&_local_serializer); ae_db_malloc(&_local_dynamic_block, _local_ssize+1, _state, ae_true); ae_serializer_sstart_str(&_local_serializer, (char*)_local_dynamic_block.ptr); rec4serializationserialize(&_local_serializer, &r0, _state); ae_serializer_stop(&_local_serializer, _state); ae_serializer_clear(&_local_serializer); ae_serializer_init(&_local_serializer); ae_serializer_ustart_str(&_local_serializer, (char*)_local_dynamic_block.ptr); rec4serializationunserialize(&_local_serializer, &r1, _state); ae_serializer_stop(&_local_serializer, _state); ae_serializer_clear(&_local_serializer); ae_frame_leave(_state); } if( (r0.b.cnt==r1.b.cnt&&r0.i.cnt==r1.i.cnt)&&r0.r.cnt==r1.r.cnt ) { for(i=0; i<=nb-1; i++) { okb = 
okb&&((r0.b.ptr.p_bool[i]&&r1.b.ptr.p_bool[i])||(!r0.b.ptr.p_bool[i]&&!r1.b.ptr.p_bool[i])); } for(i=0; i<=ni-1; i++) { oki = oki&&r0.i.ptr.p_int[i]==r1.i.ptr.p_int[i]; } for(i=0; i<=nr-1; i++) { okr = okr&&ae_fp_eq(r0.r.ptr.p_double[i],r1.r.ptr.p_double[i]); } } else { oki = ae_false; } } } } /* * summary */ result = result&&okb; result = result&&oki; result = result&&okr; if( !silent ) { if( result ) { printf("SERIALIZATION FUNCTIONS: OK\n"); } else { printf("SERIALIZATION FUNCTIONS: FAILED\n"); printf("* BOOLEAN - - - - - - - - - - - - - - - "); if( okb ) { printf("OK\n"); } else { printf("FAILED\n"); } printf("* INTEGER - - - - - - - - - - - - - - - "); if( oki ) { printf("OK\n"); } else { printf("FAILED\n"); } printf("* REAL - - - - - - - - - - - - - - - - "); if( okr ) { printf("OK\n"); } else { printf("FAILED\n"); } } } ae_frame_leave(_state); return result; } /************************************************************************* Tests for pool functions *************************************************************************/ static void testalglibbasicsunit_createpoolandrecords(poolrec2* seedrec2, poolrec2* seedrec2copy, ae_shared_pool* pool, ae_state *_state) { _poolrec2_clear(seedrec2); _poolrec2_clear(seedrec2copy); ae_shared_pool_clear(pool); seedrec2->bval = ae_fp_greater(ae_randomreal(_state),0.5); seedrec2->recval.bval = ae_fp_greater(ae_randomreal(_state),0.5); seedrec2->recval.ival = ae_randominteger(10, _state); seedrec2->recval.rval = ae_randomreal(_state); seedrec2->recval.cval.x = ae_randomreal(_state); seedrec2->recval.cval.y = ae_randomreal(_state); ae_vector_set_length(&seedrec2->recval.i1val, 3, _state); seedrec2->recval.i1val.ptr.p_int[0] = ae_randominteger(10, _state); seedrec2->recval.i1val.ptr.p_int[1] = ae_randominteger(10, _state); seedrec2->recval.i1val.ptr.p_int[2] = ae_randominteger(10, _state); seedrec2copy->bval = seedrec2->bval; seedrec2copy->recval.bval = seedrec2->recval.bval; seedrec2copy->recval.ival = seedrec2->recval.ival; seedrec2copy->recval.rval = seedrec2->recval.rval; seedrec2copy->recval.cval = seedrec2->recval.cval; ae_vector_set_length(&seedrec2copy->recval.i1val, 3, _state); seedrec2copy->recval.i1val.ptr.p_int[0] = seedrec2->recval.i1val.ptr.p_int[0]; seedrec2copy->recval.i1val.ptr.p_int[1] = seedrec2->recval.i1val.ptr.p_int[1]; seedrec2copy->recval.i1val.ptr.p_int[2] = seedrec2->recval.i1val.ptr.p_int[2]; ae_shared_pool_set_seed(pool, seedrec2, sizeof(*seedrec2), _poolrec2_init, _poolrec2_init_copy, _poolrec2_destroy, _state); } static ae_bool testalglibbasicsunit_sharedpoolerrors(ae_state *_state) { ae_frame _frame_block; poolrec1 seedrec1; poolrec2 seedrec2; poolrec2 seedrec2copy; ae_shared_pool pool; ae_shared_pool pool2; poolrec2 *prec2; ae_smart_ptr _prec2; poolrec2 *p0; ae_smart_ptr _p0; poolrec2 *p1; ae_smart_ptr _p1; poolrec2 *p2; ae_smart_ptr _p2; poolrec1 *q0; ae_smart_ptr _q0; poolrec1 *q1; ae_smart_ptr _q1; ae_shared_pool *ppool0; ae_smart_ptr _ppool0; ae_shared_pool *ppool1; ae_smart_ptr _ppool1; ae_int_t val100cnt; ae_int_t val101cnt; ae_int_t val102cnt; ae_int_t tmpval; ae_bool result; ae_frame_make(_state, &_frame_block); _poolrec1_init(&seedrec1, _state); _poolrec2_init(&seedrec2, _state); _poolrec2_init(&seedrec2copy, _state); ae_shared_pool_init(&pool, _state); ae_shared_pool_init(&pool2, _state); ae_smart_ptr_init(&_prec2, (void**)&prec2, _state); ae_smart_ptr_init(&_p0, (void**)&p0, _state); ae_smart_ptr_init(&_p1, (void**)&p1, _state); ae_smart_ptr_init(&_p2, (void**)&p2, _state); ae_smart_ptr_init(&_q0, 
(void**)&q0, _state); ae_smart_ptr_init(&_q1, (void**)&q1, _state); ae_smart_ptr_init(&_ppool0, (void**)&ppool0, _state); ae_smart_ptr_init(&_ppool1, (void**)&ppool1, _state); result = ae_true; /* * Test 1: test that: * a) smart pointer is null by default * b) "conventional local" is valid by default * b) unitinitialized shared pool is "not initialized" */ if( prec2!=NULL ) { ae_frame_leave(_state); return result; } if( !(&seedrec1!=NULL) ) { ae_frame_leave(_state); return result; } if( ae_shared_pool_is_initialized(&pool) ) { ae_frame_leave(_state); return result; } /* * Test 2: basic copying of complex structures * * check that pool is recognized as "initialized" * * change original seed record, * * retrieve value from pool, * * check that it is valid * * and it is unchanged. */ testalglibbasicsunit_createpoolandrecords(&seedrec2, &seedrec2copy, &pool, _state); if( !ae_shared_pool_is_initialized(&pool) ) { ae_frame_leave(_state); return result; } seedrec2.bval = !seedrec2.bval; seedrec2.recval.i1val.ptr.p_int[0] = seedrec2.recval.i1val.ptr.p_int[0]+1; ae_shared_pool_retrieve(&pool, &_prec2, _state); if( !(prec2!=NULL) ) { ae_frame_leave(_state); return result; } if( (seedrec2copy.bval&&!prec2->bval)||(prec2->bval&&!seedrec2copy.bval) ) { ae_frame_leave(_state); return result; } if( seedrec2copy.recval.i1val.ptr.p_int[0]!=prec2->recval.i1val.ptr.p_int[0] ) { ae_frame_leave(_state); return result; } /* * Test 3: unrecycled values are lost * * retrieve value from pool, * * change it, * * retrieve one more time, * * check that it is unchanged. */ testalglibbasicsunit_createpoolandrecords(&seedrec2, &seedrec2copy, &pool, _state); ae_shared_pool_retrieve(&pool, &_prec2, _state); prec2->recval.ival = prec2->recval.ival+1; ae_shared_pool_retrieve(&pool, &_prec2, _state); if( prec2->recval.ival!=seedrec2copy.recval.ival ) { ae_frame_leave(_state); return result; } /* * Test 4: recycled values are reused, PoolClearRecycled() removes recycled values * * retrieve value from pool, * * change it, * * recycle, * * check that recycled pointer is null * * retrieve one more time, * * check that modified value was returned, * * recycle, * * clear pool, * * retrieve one more time, * * check that unmodified value was returned, */ testalglibbasicsunit_createpoolandrecords(&seedrec2, &seedrec2copy, &pool, _state); ae_shared_pool_retrieve(&pool, &_prec2, _state); prec2->recval.ival = prec2->recval.ival+1; ae_shared_pool_recycle(&pool, &_prec2, _state); if( prec2!=NULL ) { ae_frame_leave(_state); return result; } ae_shared_pool_retrieve(&pool, &_prec2, _state); if( !(prec2!=NULL) ) { ae_frame_leave(_state); return result; } if( prec2->recval.ival!=seedrec2copy.recval.ival+1 ) { ae_frame_leave(_state); return result; } ae_shared_pool_recycle(&pool, &_prec2, _state); ae_shared_pool_clear_recycled(&pool, _state); ae_shared_pool_retrieve(&pool, &_prec2, _state); if( !(prec2!=NULL) ) { ae_frame_leave(_state); return result; } if( prec2->recval.ival!=seedrec2copy.recval.ival ) { ae_frame_leave(_state); return result; } /* * Test 5: basic enumeration * * retrieve 3 values from pool * * fill RecVal.iVal by 100, 101, 102 * * recycle values * * enumerate, check that each iVal occurs only once during enumeration * * repeat enumeration to make sure that it can be repeated */ testalglibbasicsunit_createpoolandrecords(&seedrec2, &seedrec2copy, &pool, _state); ae_shared_pool_retrieve(&pool, &_p0, _state); ae_shared_pool_retrieve(&pool, &_p1, _state); ae_shared_pool_retrieve(&pool, &_p2, _state); p0->recval.ival = 100; 
p1->recval.ival = 101; p2->recval.ival = 102; ae_shared_pool_recycle(&pool, &_p1, _state); ae_shared_pool_recycle(&pool, &_p2, _state); ae_shared_pool_recycle(&pool, &_p0, _state); val100cnt = 0; val101cnt = 0; val102cnt = 0; ae_shared_pool_first_recycled(&pool, &_prec2, _state); while(prec2!=NULL) { if( prec2->recval.ival==100 ) { val100cnt = val100cnt+1; } if( prec2->recval.ival==101 ) { val101cnt = val101cnt+1; } if( prec2->recval.ival==102 ) { val102cnt = val102cnt+1; } ae_shared_pool_next_recycled(&pool, &_prec2, _state); } if( (val100cnt!=1||val101cnt!=1)||val102cnt!=1 ) { ae_frame_leave(_state); return result; } val100cnt = 0; val101cnt = 0; val102cnt = 0; ae_shared_pool_first_recycled(&pool, &_prec2, _state); while(prec2!=NULL) { if( prec2->recval.ival==100 ) { val100cnt = val100cnt+1; } if( prec2->recval.ival==101 ) { val101cnt = val101cnt+1; } if( prec2->recval.ival==102 ) { val102cnt = val102cnt+1; } ae_shared_pool_next_recycled(&pool, &_prec2, _state); } if( (val100cnt!=1||val101cnt!=1)||val102cnt!=1 ) { ae_frame_leave(_state); return result; } /* * Test 6: pool itself can be pooled * * pool can be seeded with another pool * * smart pointers to pool are correctly handled * * pool correctly returns different references on "retrieve": * * we retrieve, modify and recycle back to PPool0 * * we retrieve from PPool1 - unmodified value is returned * * we retrievefrom PPool0 - modified value is returned */ testalglibbasicsunit_createpoolandrecords(&seedrec2, &seedrec2copy, &pool, _state); ae_shared_pool_set_seed(&pool2, &pool, sizeof(pool), ae_shared_pool_init, ae_shared_pool_init_copy, ae_shared_pool_destroy, _state); if( ppool0!=NULL||ppool1!=NULL ) { ae_frame_leave(_state); return result; } ae_shared_pool_retrieve(&pool2, &_ppool0, _state); ae_shared_pool_retrieve(&pool2, &_ppool1, _state); if( !(ppool0!=NULL&&ppool1!=NULL) ) { ae_frame_leave(_state); return result; } ae_shared_pool_retrieve(ppool0, &_p0, _state); p0->recval.ival = p0->recval.ival+1; tmpval = p0->recval.ival; ae_shared_pool_recycle(ppool0, &_p0, _state); ae_shared_pool_retrieve(ppool1, &_p1, _state); if( p1->recval.ival==tmpval ) { ae_frame_leave(_state); return result; } ae_shared_pool_recycle(ppool1, &_p1, _state); ae_shared_pool_retrieve(ppool0, &_p0, _state); if( p0->recval.ival!=tmpval ) { ae_frame_leave(_state); return result; } /* * Test 7: pools which are fields of records are correctly handled * * pool can be seeded with record which has initialized pool as its field * * when record is retrieved from pool, its fields are correctly copied (including * fields which are pools) */ testalglibbasicsunit_createpoolandrecords(&seedrec2, &seedrec2copy, &pool, _state); tmpval = 99; seedrec1.ival = tmpval; ae_shared_pool_set_seed(&seedrec2.pool, &seedrec1, sizeof(seedrec1), _poolrec1_init, _poolrec1_init_copy, _poolrec1_destroy, _state); ae_shared_pool_set_seed(&pool, &seedrec2, sizeof(seedrec2), _poolrec2_init, _poolrec2_init_copy, _poolrec2_destroy, _state); ae_shared_pool_retrieve(&pool, &_p0, _state); ae_shared_pool_retrieve(&p0->pool, &_q0, _state); q0->ival = tmpval-1; ae_shared_pool_recycle(&p0->pool, &_q0, _state); ae_shared_pool_retrieve(&pool, &_p1, _state); ae_shared_pool_retrieve(&p1->pool, &_q1, _state); if( q1->ival!=tmpval ) { ae_frame_leave(_state); return result; } ae_shared_pool_recycle(&p1->pool, &_q1, _state); ae_shared_pool_retrieve(&p0->pool, &_q0, _state); if( q0->ival!=tmpval-1 ) { ae_frame_leave(_state); return result; } /* * Test 8: after call to PoolReset(), call to PoolFirstRecycled() 
returns null references */ testalglibbasicsunit_createpoolandrecords(&seedrec2, &seedrec2copy, &pool, _state); ae_shared_pool_retrieve(&pool, &_p0, _state); ae_shared_pool_retrieve(&pool, &_p1, _state); ae_shared_pool_retrieve(&pool, &_p2, _state); ae_shared_pool_recycle(&pool, &_p1, _state); ae_shared_pool_recycle(&pool, &_p2, _state); ae_shared_pool_recycle(&pool, &_p0, _state); ae_shared_pool_first_recycled(&pool, &_p0, _state); if( !(p0!=NULL) ) { ae_frame_leave(_state); return result; } ae_shared_pool_next_recycled(&pool, &_p0, _state); if( !(p0!=NULL) ) { ae_frame_leave(_state); return result; } ae_shared_pool_next_recycled(&pool, &_p0, _state); if( !(p0!=NULL) ) { ae_frame_leave(_state); return result; } ae_shared_pool_next_recycled(&pool, &_p0, _state); if( p0!=NULL ) { ae_frame_leave(_state); return result; } ae_shared_pool_reset(&pool, _state); ae_shared_pool_first_recycled(&pool, &_p0, _state); if( p0!=NULL ) { ae_frame_leave(_state); return result; } ae_shared_pool_next_recycled(&pool, &_p0, _state); if( p0!=NULL ) { ae_frame_leave(_state); return result; } ae_shared_pool_next_recycled(&pool, &_p0, _state); if( p0!=NULL ) { ae_frame_leave(_state); return result; } ae_shared_pool_next_recycled(&pool, &_p0, _state); if( p0!=NULL ) { ae_frame_leave(_state); return result; } /* * Test 9: invalid pointer is recognized as non-null (we do not reference it, just test) */ testalglibbasicsunit_createpoolandrecords(&seedrec2, &seedrec2copy, &pool, _state); ae_shared_pool_retrieve(&pool, &_p0, _state); ae_shared_pool_retrieve(&pool, &_p1, _state); ae_shared_pool_retrieve(&pool, &_p2, _state); ae_shared_pool_recycle(&pool, &_p1, _state); ae_shared_pool_recycle(&pool, &_p2, _state); ae_shared_pool_recycle(&pool, &_p0, _state); ae_shared_pool_first_recycled(&pool, &_p0, _state); if( !(p0!=NULL) ) { ae_frame_leave(_state); return result; } ae_shared_pool_clear_recycled(&pool, _state); if( !(p0!=NULL) ) { ae_frame_leave(_state); return result; } /* * Test 9: non-null pointer is nulled by calling SetNull() */ testalglibbasicsunit_createpoolandrecords(&seedrec2, &seedrec2copy, &pool, _state); ae_shared_pool_retrieve(&pool, &_p0, _state); if( !(p0!=NULL) ) { ae_frame_leave(_state); return result; } ae_smart_ptr_assign(&_p0, NULL, ae_false, ae_false, NULL); if( p0!=NULL ) { ae_frame_leave(_state); return result; } result = ae_false; ae_frame_leave(_state); return result; } static ae_bool testalglibbasicsunit_testsharedpool(ae_bool silent, ae_state *_state) { ae_bool result; result = !testalglibbasicsunit_sharedpoolerrors(_state); if( !silent ) { if( result ) { printf("SHARED POOL: OK\n"); } else { printf("SHARED POOL: FAILED\n"); } } return result; } /************************************************************************* Tests for SMP functions testSort0: sort function *************************************************************************/ static void testalglibbasicsunit_testsort0func(/* Integer */ ae_vector* a, /* Integer */ ae_vector* buf, ae_int_t idx0, ae_int_t idx2, ae_state *_state) { ae_int_t idx1; if( idx2<=idx0+1 ) { return; } idx1 = (idx0+idx2)/2; testalglibbasicsunit_testsort0func(a, buf, idx0, idx1, _state); testalglibbasicsunit_testsort0func(a, buf, idx1, idx2, _state); testalglibbasicsunit_mergesortedarrays(a, buf, idx0, idx1, idx2, _state); } /************************************************************************* testSort0: recursive sorting by splitting array into two subarrays. Returns True on success, False on failure. 
*************************************************************************/ static ae_bool testalglibbasicsunit_performtestsort0(ae_state *_state) { ae_frame _frame_block; ae_vector a; ae_vector buf; ae_int_t i; ae_int_t k; ae_int_t t; ae_int_t n; ae_bool result; ae_frame_make(_state, &_frame_block); ae_vector_init(&a, 0, DT_INT, _state); ae_vector_init(&buf, 0, DT_INT, _state); n = 100000; ae_vector_set_length(&a, n, _state); ae_vector_set_length(&buf, n, _state); for(i=0; i<=n-1; i++) { a.ptr.p_int[i] = i; } for(i=0; i<=n-1; i++) { k = ae_randominteger(n, _state); if( k!=i ) { t = a.ptr.p_int[i]; a.ptr.p_int[i] = a.ptr.p_int[k]; a.ptr.p_int[k] = t; } } testalglibbasicsunit_testsort0func(&a, &buf, 0, n, _state); result = ae_true; for(i=0; i<=n-1; i++) { result = result&&a.ptr.p_int[i]==i; } ae_frame_leave(_state); return result; } /************************************************************************* TestSort0: sort function *************************************************************************/ static void testalglibbasicsunit_testsort1func(/* Integer */ ae_vector* a, /* Integer */ ae_vector* buf, ae_int_t idx0, ae_int_t idx2, ae_bool usesmp, ae_state *_state) { ae_int_t idxa; ae_int_t idxb; ae_int_t idxc; ae_int_t cnt4; if( idx2<=idx0+1 ) { return; } if( idx2==idx0+2 ) { testalglibbasicsunit_mergesortedarrays(a, buf, idx0, idx0+1, idx0+2, _state); return; } if( idx2==idx0+3 ) { testalglibbasicsunit_mergesortedarrays(a, buf, idx0+0, idx0+1, idx0+2, _state); testalglibbasicsunit_mergesortedarrays(a, buf, idx0+0, idx0+2, idx0+3, _state); return; } if( idx2==idx0+4 ) { testalglibbasicsunit_mergesortedarrays(a, buf, idx0+0, idx0+1, idx0+2, _state); testalglibbasicsunit_mergesortedarrays(a, buf, idx0+2, idx0+3, idx0+4, _state); testalglibbasicsunit_mergesortedarrays(a, buf, idx0+0, idx0+2, idx0+4, _state); return; } cnt4 = (idx2-idx0)/4; idxa = idx0+cnt4; idxb = idx0+2*cnt4; idxc = idx0+3*cnt4; testalglibbasicsunit_testsort1func(a, buf, idx0, idxa, usesmp, _state); testalglibbasicsunit_testsort1func(a, buf, idxa, idxb, usesmp, _state); testalglibbasicsunit_testsort1func(a, buf, idxb, idxc, usesmp, _state); testalglibbasicsunit_testsort1func(a, buf, idxc, idx2, usesmp, _state); testalglibbasicsunit_mergesortedarrays(a, buf, idx0, idxa, idxb, _state); testalglibbasicsunit_mergesortedarrays(a, buf, idxb, idxc, idx2, _state); testalglibbasicsunit_mergesortedarrays(a, buf, idx0, idxb, idx2, _state); } /************************************************************************* TestSort0: recursive sorting by splitting array into 4 subarrays. Sorting is performed in three rounds: * parallel sorting of randomly permuted array * result is randomly shuffled and sequentially sorted * result is randomly shuffled (again) and sorted in parallel mode (again) The idea of such "multitry sort" is that we test ability of SMP core to interleave highly parallel parts of code with long sequential parts. Returns True on success, False on failure. 
*************************************************************************/ static ae_bool testalglibbasicsunit_performtestsort1(ae_state *_state) { ae_frame _frame_block; ae_vector a; ae_vector buf; ae_int_t i; ae_int_t k; ae_int_t t; ae_int_t n; ae_bool result; ae_frame_make(_state, &_frame_block); ae_vector_init(&a, 0, DT_INT, _state); ae_vector_init(&buf, 0, DT_INT, _state); /* * Generate array */ n = 100000; ae_vector_set_length(&a, n, _state); ae_vector_set_length(&buf, n, _state); for(i=0; i<=n-1; i++) { a.ptr.p_int[i] = i; } /* * round 0: parallel sorting of randomly permuted array */ for(i=0; i<=n-1; i++) { k = ae_randominteger(n, _state); if( k!=i ) { t = a.ptr.p_int[i]; a.ptr.p_int[i] = a.ptr.p_int[k]; a.ptr.p_int[k] = t; } } testalglibbasicsunit_testsort1func(&a, &buf, 0, n, ae_true, _state); /* * round 1: result is randomly shuffled and sequentially sorted */ for(i=0; i<=n-1; i++) { k = ae_randominteger(n, _state); if( k!=i ) { t = a.ptr.p_int[i]; a.ptr.p_int[i] = a.ptr.p_int[k]; a.ptr.p_int[k] = t; } } testalglibbasicsunit_testsort1func(&a, &buf, 0, n, ae_false, _state); /* * round 2: result is randomly shuffled (again) and sorted in parallel mode (again) */ for(i=0; i<=n-1; i++) { k = ae_randominteger(n, _state); if( k!=i ) { t = a.ptr.p_int[i]; a.ptr.p_int[i] = a.ptr.p_int[k]; a.ptr.p_int[k] = t; } } testalglibbasicsunit_testsort1func(&a, &buf, 0, n, ae_true, _state); /* * Test */ result = ae_true; for(i=0; i<=n-1; i++) { result = result&&a.ptr.p_int[i]==i; } ae_frame_leave(_state); return result; } /************************************************************************* Tests for SMP functions testSort2: sort function *************************************************************************/ static void testalglibbasicsunit_testsort2func(/* Integer */ ae_vector* a, /* Integer */ ae_vector* buf, ae_int_t idx0, ae_int_t idx2, ae_state *_state) { ae_int_t idx1; if( idx2<=idx0+1 ) { return; } idx1 = idx0+1+ae_randominteger(idx2-idx0-1, _state); testalglibbasicsunit_testsort0func(a, buf, idx0, idx1, _state); testalglibbasicsunit_testsort0func(a, buf, idx1, idx2, _state); testalglibbasicsunit_mergesortedarrays(a, buf, idx0, idx1, idx2, _state); } /************************************************************************* testSort2: recursive sorting by splitting array into two subarrays of different length (main difference from testsort0). Returns True on success, False on failure. *************************************************************************/ static ae_bool testalglibbasicsunit_performtestsort2(ae_state *_state) { ae_frame _frame_block; ae_vector a; ae_vector buf; ae_int_t i; ae_int_t k; ae_int_t t; ae_int_t n; ae_bool result; ae_frame_make(_state, &_frame_block); ae_vector_init(&a, 0, DT_INT, _state); ae_vector_init(&buf, 0, DT_INT, _state); n = 100000; ae_vector_set_length(&a, n, _state); ae_vector_set_length(&buf, n, _state); for(i=0; i<=n-1; i++) { a.ptr.p_int[i] = i; } for(i=0; i<=n-1; i++) { k = ae_randominteger(n, _state); if( k!=i ) { t = a.ptr.p_int[i]; a.ptr.p_int[i] = a.ptr.p_int[k]; a.ptr.p_int[k] = t; } } testalglibbasicsunit_testsort2func(&a, &buf, 0, n, _state); result = ae_true; for(i=0; i<=n-1; i++) { result = result&&a.ptr.p_int[i]==i; } ae_frame_leave(_state); return result; } /************************************************************************* TestPoolSum: summation with pool We perform summation of 500000 numbers (each of them is equal to 1) in the recurrent manner, by accumulation of result in the pool. 
This test checks pool ability to handle continuous stream of operations. Returns True on success, False on failure. *************************************************************************/ static ae_bool testalglibbasicsunit_performtestpoolsum(ae_state *_state) { ae_frame _frame_block; ae_shared_pool pool; poolsummand *ptr; ae_smart_ptr _ptr; poolsummand seed; ae_int_t n; ae_int_t sum; ae_bool result; ae_frame_make(_state, &_frame_block); ae_shared_pool_init(&pool, _state); ae_smart_ptr_init(&_ptr, (void**)&ptr, _state); _poolsummand_init(&seed, _state); n = 500000; seed.val = 0; ae_shared_pool_set_seed(&pool, &seed, sizeof(seed), _poolsummand_init, _poolsummand_init_copy, _poolsummand_destroy, _state); testalglibbasicsunit_parallelpoolsum(&pool, 0, n, _state); sum = 0; ae_shared_pool_first_recycled(&pool, &_ptr, _state); while(ptr!=NULL) { sum = sum+ptr->val; ae_shared_pool_next_recycled(&pool, &_ptr, _state); } result = sum==n; ae_frame_leave(_state); return result; } /************************************************************************* Summation routune for parallel summation test. *************************************************************************/ static void testalglibbasicsunit_parallelpoolsum(ae_shared_pool* sumpool, ae_int_t ind0, ae_int_t ind1, ae_state *_state) { ae_frame _frame_block; ae_int_t i; poolsummand *ptr; ae_smart_ptr _ptr; ae_frame_make(_state, &_frame_block); ae_smart_ptr_init(&_ptr, (void**)&ptr, _state); if( ind1-ind0<=2 ) { ae_shared_pool_retrieve(sumpool, &_ptr, _state); ptr->val = ptr->val+ind1-ind0; ae_shared_pool_recycle(sumpool, &_ptr, _state); } else { i = (ind0+ind1)/2; testalglibbasicsunit_parallelpoolsum(sumpool, ind0, i, _state); testalglibbasicsunit_parallelpoolsum(sumpool, i, ind1, _state); } ae_frame_leave(_state); } /************************************************************************* This function merges sorted A[Idx0,Idx1) and A[Idx1,Idx2) into sorted array A[Idx0,Idx2) using corresponding elements of Buf. 
*************************************************************************/ static void testalglibbasicsunit_mergesortedarrays(/* Integer */ ae_vector* a, /* Integer */ ae_vector* buf, ae_int_t idx0, ae_int_t idx1, ae_int_t idx2, ae_state *_state) { ae_int_t srcleft; ae_int_t srcright; ae_int_t dst; srcleft = idx0; srcright = idx1; dst = idx0; for(;;) { if( srcleft==idx1&&srcright==idx2 ) { break; } if( srcleft==idx1 ) { buf->ptr.p_int[dst] = a->ptr.p_int[srcright]; srcright = srcright+1; dst = dst+1; continue; } if( srcright==idx2 ) { buf->ptr.p_int[dst] = a->ptr.p_int[srcleft]; srcleft = srcleft+1; dst = dst+1; continue; } if( a->ptr.p_int[srcleft]<a->ptr.p_int[srcright] ) { buf->ptr.p_int[dst] = a->ptr.p_int[srcleft]; srcleft = srcleft+1; dst = dst+1; } else { buf->ptr.p_int[dst] = a->ptr.p_int[srcright]; srcright = srcright+1; dst = dst+1; } } for(dst=idx0; dst<=idx2-1; dst++) { a->ptr.p_int[dst] = buf->ptr.p_int[dst]; } } static ae_bool testalglibbasicsunit_testsmp(ae_bool silent, ae_state *_state) { ae_bool t0; ae_bool t1; ae_bool t2; ae_bool ts; ae_bool result; t0 = testalglibbasicsunit_performtestsort0(_state); t1 = testalglibbasicsunit_performtestsort1(_state); t2 = testalglibbasicsunit_performtestsort2(_state); ts = testalglibbasicsunit_performtestpoolsum(_state); result = ((t0&&t1)&&t2)&&ts; if( !silent ) { if( result ) { printf("SMP FUNCTIONS: OK\n"); } else { printf("SMP FUNCTIONS: FAILED\n"); printf("* TEST SORT0 (sorting, split-2) "); if( t0 ) { printf("OK\n"); } else { printf("FAILED\n"); } printf("* TEST SORT1 (sorting, split-4) "); if( t1 ) { printf("OK\n"); } else { printf("FAILED\n"); } printf("* TEST SORT2 (sorting, split-2, unequal) "); if( t2 ) { printf("OK\n"); } else { printf("FAILED\n"); } printf("* TEST POOLSUM (accumulation with pool) "); if( ts ) { printf("OK\n"); } else { printf("FAILED\n"); } } } return result; } void _rec1_init(void* _p, ae_state *_state) { rec1 *p = (rec1*)_p; ae_touch_ptr((void*)p); ae_vector_init(&p->b1field, 0, DT_BOOL, _state); ae_vector_init(&p->r1field, 0, DT_REAL, _state); ae_vector_init(&p->i1field, 0, DT_INT, _state); ae_vector_init(&p->c1field, 0, DT_COMPLEX, _state); ae_matrix_init(&p->b2field, 0, 0, DT_BOOL, _state); ae_matrix_init(&p->r2field, 0, 0, DT_REAL, _state); ae_matrix_init(&p->i2field, 0, 0, DT_INT, _state); ae_matrix_init(&p->c2field, 0, 0, DT_COMPLEX, _state); } void _rec1_init_copy(void* _dst, void* _src, ae_state *_state) { rec1 *dst = (rec1*)_dst; rec1 *src = (rec1*)_src; dst->bfield = src->bfield; dst->rfield = src->rfield; dst->ifield = src->ifield; dst->cfield = src->cfield; ae_vector_init_copy(&dst->b1field, &src->b1field, _state); ae_vector_init_copy(&dst->r1field, &src->r1field, _state); ae_vector_init_copy(&dst->i1field, &src->i1field, _state); ae_vector_init_copy(&dst->c1field, &src->c1field, _state); ae_matrix_init_copy(&dst->b2field, &src->b2field, _state); ae_matrix_init_copy(&dst->r2field, &src->r2field, _state); ae_matrix_init_copy(&dst->i2field, &src->i2field, _state); ae_matrix_init_copy(&dst->c2field, &src->c2field, _state); } void _rec1_clear(void* _p) { rec1 *p = (rec1*)_p; ae_touch_ptr((void*)p); ae_vector_clear(&p->b1field); ae_vector_clear(&p->r1field); ae_vector_clear(&p->i1field); ae_vector_clear(&p->c1field); ae_matrix_clear(&p->b2field); ae_matrix_clear(&p->r2field); ae_matrix_clear(&p->i2field); ae_matrix_clear(&p->c2field); } void _rec1_destroy(void* _p) { rec1 *p = (rec1*)_p; ae_touch_ptr((void*)p); ae_vector_destroy(&p->b1field); ae_vector_destroy(&p->r1field);
ae_vector_destroy(&p->i1field); ae_vector_destroy(&p->c1field); ae_matrix_destroy(&p->b2field); ae_matrix_destroy(&p->r2field); ae_matrix_destroy(&p->i2field); ae_matrix_destroy(&p->c2field); } void _rec4serialization_init(void* _p, ae_state *_state) { rec4serialization *p = (rec4serialization*)_p; ae_touch_ptr((void*)p); ae_vector_init(&p->b, 0, DT_BOOL, _state); ae_vector_init(&p->i, 0, DT_INT, _state); ae_vector_init(&p->r, 0, DT_REAL, _state); } void _rec4serialization_init_copy(void* _dst, void* _src, ae_state *_state) { rec4serialization *dst = (rec4serialization*)_dst; rec4serialization *src = (rec4serialization*)_src; ae_vector_init_copy(&dst->b, &src->b, _state); ae_vector_init_copy(&dst->i, &src->i, _state); ae_vector_init_copy(&dst->r, &src->r, _state); } void _rec4serialization_clear(void* _p) { rec4serialization *p = (rec4serialization*)_p; ae_touch_ptr((void*)p); ae_vector_clear(&p->b); ae_vector_clear(&p->i); ae_vector_clear(&p->r); } void _rec4serialization_destroy(void* _p) { rec4serialization *p = (rec4serialization*)_p; ae_touch_ptr((void*)p); ae_vector_destroy(&p->b); ae_vector_destroy(&p->i); ae_vector_destroy(&p->r); } void _poolrec1_init(void* _p, ae_state *_state) { poolrec1 *p = (poolrec1*)_p; ae_touch_ptr((void*)p); ae_vector_init(&p->i1val, 0, DT_INT, _state); } void _poolrec1_init_copy(void* _dst, void* _src, ae_state *_state) { poolrec1 *dst = (poolrec1*)_dst; poolrec1 *src = (poolrec1*)_src; dst->cval = src->cval; dst->rval = src->rval; dst->ival = src->ival; dst->bval = src->bval; ae_vector_init_copy(&dst->i1val, &src->i1val, _state); } void _poolrec1_clear(void* _p) { poolrec1 *p = (poolrec1*)_p; ae_touch_ptr((void*)p); ae_vector_clear(&p->i1val); } void _poolrec1_destroy(void* _p) { poolrec1 *p = (poolrec1*)_p; ae_touch_ptr((void*)p); ae_vector_destroy(&p->i1val); } void _poolrec2_init(void* _p, ae_state *_state) { poolrec2 *p = (poolrec2*)_p; ae_touch_ptr((void*)p); _poolrec1_init(&p->recval, _state); ae_shared_pool_init(&p->pool, _state); } void _poolrec2_init_copy(void* _dst, void* _src, ae_state *_state) { poolrec2 *dst = (poolrec2*)_dst; poolrec2 *src = (poolrec2*)_src; dst->bval = src->bval; _poolrec1_init_copy(&dst->recval, &src->recval, _state); ae_shared_pool_init_copy(&dst->pool, &src->pool, _state); } void _poolrec2_clear(void* _p) { poolrec2 *p = (poolrec2*)_p; ae_touch_ptr((void*)p); _poolrec1_clear(&p->recval); ae_shared_pool_clear(&p->pool); } void _poolrec2_destroy(void* _p) { poolrec2 *p = (poolrec2*)_p; ae_touch_ptr((void*)p); _poolrec1_destroy(&p->recval); ae_shared_pool_destroy(&p->pool); } void _poolsummand_init(void* _p, ae_state *_state) { poolsummand *p = (poolsummand*)_p; ae_touch_ptr((void*)p); } void _poolsummand_init_copy(void* _dst, void* _src, ae_state *_state) { poolsummand *dst = (poolsummand*)_dst; poolsummand *src = (poolsummand*)_src; dst->val = src->val; } void _poolsummand_clear(void* _p) { poolsummand *p = (poolsummand*)_p; ae_touch_ptr((void*)p); } void _poolsummand_destroy(void* _p) { poolsummand *p = (poolsummand*)_p; ae_touch_ptr((void*)p); } #if (AE_OS==AE_WINDOWS) || defined(AE_DEBUG4WINDOWS) #include #endif #if (AE_OS==AE_POSIX) || defined(AE_DEBUG4POSIX) #include #include #endif #define AE_SINGLECORE 1 #define AE_SEQUENTIAL_MULTICORE 2 #define AE_PARALLEL_SINGLECORE 3 #define AE_PARALLEL_MULTICORE 4 #define AE_SKIP_TEST 5 unsigned seed; int global_failure_flag = 0; ae_bool use_smp = ae_false; #if (AE_OS==AE_WINDOWS) || defined(AE_DEBUG4WINDOWS) CRITICAL_SECTION tests_lock; CRITICAL_SECTION print_lock; #elif 
(AE_OS==AE_POSIX) || defined(AE_DEBUG4POSIX) pthread_mutex_t tests_lock; pthread_mutex_t print_lock; #else void *tests_lock = NULL; void *print_lock = NULL; #endif typedef struct { const char *name; ae_bool (*seq_testfunc)(ae_bool, ae_state*); ae_bool(*smp_testfunc)(ae_bool, ae_state*); } _s_testrecord; int unittests_processed = 0; _s_testrecord unittests[] = { {"tsort",testtsort,_pexec_testtsort}, {"nearestneighbor",testnearestneighbor,_pexec_testnearestneighbor}, {"hqrnd",testhqrnd,_pexec_testhqrnd}, {"odesolver",testodesolver,_pexec_testodesolver}, {"sparse",testsparse,_pexec_testsparse}, {"reflections",testreflections,_pexec_testreflections}, {"creflections",testcreflections,_pexec_testcreflections}, {"matgen",testmatgen,_pexec_testmatgen}, {"ablas",testablas,_pexec_testablas}, {"trfac",testtrfac,_pexec_testtrfac}, {"trlinsolve",testtrlinsolve,_pexec_testtrlinsolve}, {"safesolve",testsafesolve,_pexec_testsafesolve}, {"rcond",testrcond,_pexec_testrcond}, {"matinv",testmatinv,_pexec_testmatinv}, {"hblas",testhblas,_pexec_testhblas}, {"sblas",testsblas,_pexec_testsblas}, {"ortfac",testortfac,_pexec_testortfac}, {"fbls",testfbls,_pexec_testfbls}, {"cqmodels",testcqmodels,_pexec_testcqmodels}, {"bdsvd",testbdsvd,_pexec_testbdsvd}, {"blas",testblas,_pexec_testblas}, {"svd",testsvd,_pexec_testsvd}, {"optserv",testoptserv,_pexec_testoptserv}, {"snnls",testsnnls,_pexec_testsnnls}, {"sactivesets",testsactivesets,_pexec_testsactivesets}, {"linmin",testlinmin,_pexec_testlinmin}, {"minlbfgs",testminlbfgs,_pexec_testminlbfgs}, {"xblas",testxblas,_pexec_testxblas}, {"densesolver",testdensesolver,_pexec_testdensesolver}, {"normestimator",testnormestimator,_pexec_testnormestimator}, {"linlsqr",testlinlsqr,_pexec_testlinlsqr}, {"mincg",testmincg,_pexec_testmincg}, {"minbleic",testminbleic,_pexec_testminbleic}, {"minqp",testminqp,_pexec_testminqp}, {"minnlc",testminnlc,_pexec_testminnlc}, {"minbc",testminbc,_pexec_testminbc}, {"minns",testminns,_pexec_testminns}, {"minlm",testminlm,_pexec_testminlm}, {"evd",testevd,_pexec_testevd}, {"basestat",testbasestat,_pexec_testbasestat}, {"pca",testpca,_pexec_testpca}, {"bdss",testbdss,_pexec_testbdss}, {"mlpbase",testmlpbase,_pexec_testmlpbase}, {"lda",testlda,_pexec_testlda}, {"gammafunc",testgammafunc,_pexec_testgammafunc}, {"linreg",testlinreg,_pexec_testlinreg}, {"filters",testfilters,_pexec_testfilters}, {"mcpd",testmcpd,_pexec_testmcpd}, {"mlpe",testmlpe,_pexec_testmlpe}, {"mlptrain",testmlptrain,_pexec_testmlptrain}, {"clustering",testclustering,_pexec_testclustering}, {"dforest",testdforest,_pexec_testdforest}, {"gq",testgq,_pexec_testgq}, {"gkq",testgkq,_pexec_testgkq}, {"autogk",testautogk,_pexec_testautogk}, {"fft",testfft,_pexec_testfft}, {"fht",testfht,_pexec_testfht}, {"conv",testconv,_pexec_testconv}, {"corr",testcorr,_pexec_testcorr}, {"idwint",testidwint,_pexec_testidwint}, {"ratint",testratint,_pexec_testratint}, {"spline1d",testspline1d,_pexec_testspline1d}, {"parametric",testparametric,_pexec_testparametric}, {"spline3d",testspline3d,_pexec_testspline3d}, {"polint",testpolint,_pexec_testpolint}, {"lsfit",testlsfit,_pexec_testlsfit}, {"nsfit",testnsfit,_pexec_testnsfit}, {"spline2d",testspline2d,_pexec_testspline2d}, {"rbf",testrbf,_pexec_testrbf}, {"hermite",testhermite,_pexec_testhermite}, {"laguerre",testlaguerre,_pexec_testlaguerre}, {"legendre",testlegendre,_pexec_testlegendre}, {"chebyshev",testchebyshev,_pexec_testchebyshev}, {"wsr",testwsr,_pexec_testwsr}, {"stest",teststest,_pexec_teststest}, 
{"studentttests",teststudentttests,_pexec_teststudentttests}, {"mannwhitneyu",testmannwhitneyu,_pexec_testmannwhitneyu}, {"schur",testschur,_pexec_testschur}, {"spdgevd",testspdgevd,_pexec_testspdgevd}, {"inverseupdate",testinverseupdate,_pexec_testinverseupdate}, {"polynomialsolver",testpolynomialsolver,_pexec_testpolynomialsolver}, {"nleq",testnleq,_pexec_testnleq}, {"lincg",testlincg,_pexec_testlincg}, {"alglibbasics",testalglibbasics,_pexec_testalglibbasics}, {NULL, NULL, NULL} }; #if (AE_OS==AE_WINDOWS) || defined(AE_DEBUG4WINDOWS) void acquire_lock(CRITICAL_SECTION *p_lock) { EnterCriticalSection(p_lock); } void release_lock(CRITICAL_SECTION *p_lock) { LeaveCriticalSection(p_lock); } #elif (AE_OS==AE_POSIX) || defined(AE_DEBUG4POSIX) void acquire_lock(pthread_mutex_t *p_lock) { pthread_mutex_lock(p_lock); } void release_lock(pthread_mutex_t *p_lock) { pthread_mutex_unlock(p_lock); } #else void acquire_lock(void **p_lock) { } void release_lock(void **p_lock) { } #endif ae_bool call_unittest( ae_bool(*seq_testfunc)(ae_bool, ae_state*), ae_bool(*smp_testfunc)(ae_bool, ae_state*), int *psticky) { #ifndef AE_USE_CPP_ERROR_HANDLING ae_state _alglib_env_state; ae_frame _frame_block; jmp_buf _break_jump; ae_bool result; ae_state_init(&_alglib_env_state); if( setjmp(_break_jump) ) { *psticky = 1; return ae_false; } ae_state_set_break_jump(&_alglib_env_state, &_break_jump); ae_frame_make(&_alglib_env_state, &_frame_block); if( use_smp ) result = smp_testfunc(ae_true, &_alglib_env_state); else result = seq_testfunc(ae_true, &_alglib_env_state); ae_state_clear(&_alglib_env_state); if( !result ) *psticky = 1; return result; #else try { ae_state _alglib_env_state; ae_frame _frame_block; ae_bool result; ae_state_init(&_alglib_env_state); ae_frame_make(&_alglib_env_state, &_frame_block); if( use_smp ) result = smp_testfunc(ae_true, &_alglib_env_state); else result = seq_testfunc(ae_true, &_alglib_env_state); ae_state_clear(&_alglib_env_state); if( !result ) *psticky = 1; return result; } catch(...) 
{ *psticky = 1; return ae_false; } #endif } #if (AE_OS==AE_WINDOWS) || defined(AE_DEBUG4WINDOWS) DWORD WINAPI tester_function(LPVOID T) #elif AE_OS==AE_POSIX || defined(AE_DEBUG4POSIX) void* tester_function(void *T) #else void tester_function(void *T) #endif { int idx; ae_bool status; for(;;) { /* * try to acquire test record */ acquire_lock(&tests_lock); if( unittests[unittests_processed].name==NULL ) { release_lock(&tests_lock); break; } idx = unittests_processed; unittests_processed++; release_lock(&tests_lock); /* * Call unit test */ status = call_unittest( unittests[idx].seq_testfunc, unittests[idx].smp_testfunc, &global_failure_flag); acquire_lock(&print_lock); if( status ) printf("%-32s OK\n", unittests[idx].name); else printf("%-32s FAILED\n", unittests[idx].name); fflush(stdout); release_lock(&print_lock); } #if AE_OS==AE_WINDOWS || defined(AE_DEBUG4WINDOWS) return 0; #elif AE_OS==AE_POSIX || defined(AE_DEBUG4POSIX) return NULL; #else #endif } int main(int argc, char **argv) { time_t time_0, time_1; union { double a; ae_int32_t p[2]; } u; if( argc==2 ) seed = (unsigned)atoi(argv[1]); else { time_t t; seed = (unsigned)time(&t); } #if (AE_OS==AE_WINDOWS) || defined(AE_DEBUG4WINDOWS) InitializeCriticalSection(&tests_lock); InitializeCriticalSection(&print_lock); #elif (AE_OS==AE_POSIX) || defined(AE_DEBUG4POSIX) pthread_mutex_init(&tests_lock, NULL); pthread_mutex_init(&print_lock, NULL); #endif /* * SMP settings */ #if AE_TEST==AE_PARALLEL_MULTICORE || AE_TEST==AE_SEQUENTIAL_MULTICORE use_smp = ae_true; #else use_smp = ae_false; #endif /* * Seed */ printf("SEED: %u\n", (unsigned int)seed); srand(seed); /* * Compiler */ #if AE_COMPILER==AE_GNUC printf("COMPILER: GCC\n"); #elif AE_COMPILER==AE_SUNC printf("COMPILER: SunStudio\n"); #elif AE_COMPILER==AE_MSVC printf("COMPILER: MSVC\n"); #else printf("COMPILER: unknown\n"); #endif /* * Architecture */ if( sizeof(void*)==4 ) printf("HARDWARE: 32-bit\n"); else if( sizeof(void*)==8 ) printf("HARDWARE: 64-bit\n"); else printf("HARDWARE: strange (non-32, non-64)\n"); /* * determine endianness of hardware. * 1983 is a good number - non-periodic double representation allow us to * easily distinguish between upper and lower halfs and to detect mixed endian hardware. */ u.a = 1.0/1983.0; if( u.p[1]==0x3f408642 ) printf("HARDWARE: little-endian\n"); else if( u.p[0]==0x3f408642 ) printf("HARDWARE: big-endian\n"); else printf("HARDWARE: mixed-endian\n"); /* * CPU (as defined) */ #if AE_CPU==AE_INTEL printf("CPU: Intel\n"); #elif AE_CPU==AE_SPARC printf("CPU: SPARC\n"); #else printf("CPU: unknown\n"); #endif /* * Cores count */ #ifdef AE_HPC printf("CORES: %d\n", (int)ae_cores_count()); #else printf("CORES: 1 (serial version)\n"); #endif /* * Support for vendor libraries */ #ifdef AE_MKL printf("LIBS: MKL (Intel)\n"); #else printf("LIBS: \n"); #endif /* * CPUID results */ printf("CPUID: %s\n", ae_cpuid()&CPU_SSE2 ? 
"sse2" : ""); /* * OS */ #if AE_OS==AE_WINDOWS printf("OS: Windows\n"); #elif AE_OS==AE_POSIX printf("OS: POSIX\n"); #else printf("OS: unknown\n"); #endif /* * Testing mode */ #if (AE_TEST==0) || (AE_TEST==AE_SINGLECORE) printf("TESTING MODE: single core\n"); #elif AE_TEST==AE_PARALLEL_SINGLECORE printf("TESTING MODE: single core, parallel\n"); #elif AE_TEST==AE_SEQUENTIAL_MULTICORE printf("TESTING MODE: milti-core, sequential\n"); #elif AE_TEST==AE_PARALLEL_MULTICORE printf("TESTING MODE: milti-core, parallel\n"); #elif AE_TEST==AE_SKIP_TEST printf("TESTING MODE: just compiling\n"); printf("Done in 0 seconds\n"); return 0; #else #error Unknown AE_TEST being passed #endif /* * now we are ready to test! */ time(&time_0); #ifdef AE_HPC if( ae_smpselftests() ) printf("%-32s OK\n", "SMP self tests"); else { printf("%-32s FAILED\n", "SMP self tests"); return 1; } #endif fflush(stdout); #if AE_TEST==0 || AE_TEST==AE_SINGLECORE || AE_TEST==AE_SEQUENTIAL_MULTICORE || AE_TEST==AE_SKIP_TEST tester_function(NULL); #elif AE_TEST==AE_PARALLEL_MULTICORE || AE_TEST==AE_PARALLEL_SINGLECORE #ifdef AE_HPC ae_set_cores_to_use(0); #endif #if AE_OS==AE_WINDOWS || defined(AE_DEBUG4WINDOWS) { SYSTEM_INFO sysInfo; HANDLE *hThreads = NULL; int idx; GetSystemInfo(&sysInfo); ae_assert(sysInfo.dwNumberOfProcessors>=1, "processors count is less than 1", NULL); hThreads = (HANDLE*)malloc(sizeof(HANDLE)*sysInfo.dwNumberOfProcessors); ae_assert(hThreads!=NULL, "malloc failure", NULL); for(idx=0; idx=1, "processors count is less than 1", NULL); threads = (pthread_t*)malloc(sizeof(pthread_t)*cpu_cnt); ae_assert(threads!=NULL, "malloc failure", NULL); for(idx=0; idx #include #include "alglibmisc.h" #include "alglibinternal.h" #include "linalg.h" #include "statistics.h" #include "dataanalysis.h" #include "specialfunctions.h" #include "solvers.h" #include "optimization.h" #include "diffequations.h" #include "fasttransforms.h" #include "integration.h" #include "interpolation.h" using namespace alglib; const char *fmt_str = "%-29s %s\n"; // // Flag variables // bool issue505_passed = true; bool issue478_passed = true; bool issue528_passed = true; bool issue591_passed = true; bool issue594_passed = true; // // Service datatypes // typedef struct { alglib_impl::ae_complex cval; double rval; alglib_impl::ae_int_t ival; ae_bool bval; alglib_impl::ae_vector i1val; } innerrec; typedef struct { ae_bool bval; innerrec recval; alglib_impl::ae_shared_pool pool; } seedrec; void _innerrec_init(void* _p, alglib_impl::ae_state *_state) { innerrec *p = (innerrec*)_p; alglib_impl::ae_touch_ptr((void*)p); alglib_impl::ae_vector_init(&p->i1val, 0, alglib_impl::DT_INT, _state); } void _innerrec_init_copy(void* _dst, void* _src, alglib_impl::ae_state *_state) { innerrec *dst = (innerrec*)_dst; innerrec *src = (innerrec*)_src; dst->cval = src->cval; dst->rval = src->rval; dst->ival = src->ival; dst->bval = src->bval; alglib_impl::ae_vector_init_copy(&dst->i1val, &src->i1val, _state); } void _innerrec_clear(void* _p) { innerrec *p = (innerrec*)_p; alglib_impl::ae_touch_ptr((void*)p); alglib_impl::ae_vector_clear(&p->i1val); } void _innerrec_destroy(void* _p) { innerrec *p = (innerrec*)_p; alglib_impl::ae_touch_ptr((void*)p); alglib_impl::ae_vector_destroy(&p->i1val); } void _seedrec_init(void* _p, alglib_impl::ae_state *_state) { seedrec *p = (seedrec*)_p; alglib_impl::ae_touch_ptr((void*)p); _innerrec_init(&p->recval, _state); alglib_impl::ae_shared_pool_init(&p->pool, _state); } void _seedrec_init_copy(void* _dst, void* _src, alglib_impl::ae_state 
*_state) { seedrec *dst = (seedrec*)_dst; seedrec *src = (seedrec*)_src; dst->bval = src->bval; _innerrec_init_copy(&dst->recval, &src->recval, _state); alglib_impl::ae_shared_pool_init_copy(&dst->pool, &src->pool, _state); } void _seedrec_clear(void* _p) { seedrec *p = (seedrec*)_p; alglib_impl::ae_touch_ptr((void*)p); _innerrec_clear(&p->recval); alglib_impl::ae_shared_pool_clear(&p->pool); } void _seedrec_destroy(void* _p) { seedrec *p = (seedrec*)_p; alglib_impl::ae_touch_ptr((void*)p); _innerrec_destroy(&p->recval); alglib_impl::ae_shared_pool_destroy(&p->pool); } void func505_grad(const real_1d_array &x, double &func, real_1d_array &grad, void *ptr) { double x0 = *((double*)ptr); // // This block assigns zero vector to gradient. Because gradient is a proxy vector // (vector which uses another object as storage), sizes of gradient and vector being // assigned must be equal. In this case data are copied in the memory linked with // proxy. // // Early versions of ALGLIB failed to handle such assignment (it discrupted link // between proxy vector and actual gradient stored in the internals of ALGLIB). // real_1d_array z = "[0]"; grad = "[0]"; grad = z; // // This block tries to perform operations which are forbidden for proxy vector: // * assign vector of non-matching size // * change length of the vector // Correct implementation must throw an exception without breaking a link between // proxy object and actual vector. // z = "[0,1]"; try { grad = "[0,1]"; issue505_passed = false; } catch(...) {} try { grad = z; issue505_passed = false; } catch(...) {} try { grad.setlength(10); issue505_passed = false; } catch(...) {} try { grad.setlength(1); issue505_passed = false; } catch(...) {} // // This block actually calculates function/gradient // func = pow(x[0]-x0,4); grad[0] = 4*pow(x[0]-x0,3); } void func505_vec(const real_1d_array &x, real_1d_array &fi, void *ptr) { double x0 = *((double*)ptr); fi[0] = x[0]-x0; fi[1] = pow(x[0]-x0,2); } void func505_jac(const real_1d_array &x, real_1d_array &fi, real_2d_array &jac, void *ptr) { double x0 = *((double*)ptr); // // This block assigns zero matrix to Jacobian. Because Jacobian is a proxy matrix // (matrix which uses another object as storage), sizes of Jacobian and matrix being // assigned must be equal. In this case data are copied in the memory linked with // proxy. // // Early versions of ALGLIB failed to handle such assignment (it discrupted link // between proxy and actual matrix stored in the internals of ALGLIB). // real_2d_array z = "[[0],[0]]"; jac = "[[0],[0]]"; jac = z; // // This block tries to perform operations which are forbidden for proxy vector: // * assign vector of non-matching size // * change length of the vector // Correct implementation must throw an exception without breaking a link between // proxy object and actual vector. // try { jac = "[[0]]"; issue505_passed = false; } catch(...) {} try { jac = "[[0,0],[1,1]]"; issue505_passed = false; } catch(...) {} try { z = "[[0,1]]"; jac = z; issue505_passed = false; } catch(...) {} try { jac.setlength(10,6); issue505_passed = false; } catch(...) {} try { jac.setlength(2,1); issue505_passed = false; } catch(...) 
{} // // This block actually calculates function/gradient // fi[0] = x[0]-x0; fi[1] = pow(x[0]-x0,2); jac[0][0] = 1.0; jac[1][0] = 2*(x[0]-x0); } void file_put_contents(const char *filename, const char *contents) { FILE *f = fopen(filename, "wb"); if( f==NULL ) throw alglib::ap_error("file_put_contents: failed opening file"); if( fwrite((void*)contents, 1, strlen(contents), f)!=strlen(contents) ) throw alglib::ap_error("file_put_contents: failed writing to file"); fclose(f); } int main() { // // Report system properties // printf("System:\n"); #ifdef AE_HPC printf("* cores count %3ld\n", (long)alglib_impl::ae_cores_count()); #else printf("* cores count %3ld\n", (long)1); #endif // // Testing basic functionality // printf("Basic functions:\n"); { // // Testing 1D array functionality // bool passed = true; try { // // 1D boolean // // Default constructor, string constructor, copy constructor, assignment constructors: // * test that array sizes as reported by length match to what was specified // * test item-by-item access // * test to_string() // * test that modification of the copied array does not change original // * test that setlength() changes length // * test setcontent/getcontent // * test getcontent(), operator() and operator[] on constant arrays // (in this case distinct implementation is used which must be tested separately) // alglib::boolean_1d_array arr_0, arr_1("[]"), arr_2("[true,false,true]"), arr_3(arr_2), arr_4, arr_5; arr_4 = arr_2; arr_5 = "[true,true,false]"; passed = passed && (arr_0.length()==0); passed = passed && (arr_1.length()==0); passed = passed && (arr_2.length()==3); passed = passed && (arr_3.length()==3); passed = passed && (arr_2[0]==arr_2(0)) && (arr_2[1]==arr_2(1)) && (arr_2[2]==arr_2(2)); passed = passed && arr_2[0] && !arr_2[1] && arr_2[2]; passed = passed && arr_3[0] && !arr_3[1] && arr_3[2]; passed = passed && arr_4[0] && !arr_4[1] && arr_4[2]; passed = passed && arr_5[0] && arr_5[1] && !arr_5[2]; passed = passed && (arr_2.tostring()=="[true,false,true]"); passed = passed && (arr_3.tostring()=="[true,false,true]"); passed = passed && (arr_4.tostring()=="[true,false,true]"); passed = passed && (arr_5.tostring()=="[true,true,false]"); arr_2[0] = false; passed = passed && !arr_2[0] && arr_3[0] && arr_4[0]; arr_5.setlength(99); passed = passed && (arr_5.length()==99); // setcontent/getcontent bool a0[] = {true, false, true, false, false}; bool a0_mod = false; bool a0_orig = true; bool *p6; alglib::boolean_1d_array arr_6; arr_6.setcontent(5, a0); passed = passed && (arr_6[0]==a0[0]) && (arr_6[1]==a0[1]) && (arr_6[2]==a0[2]) && (arr_6[3]==a0[3]) && (arr_6[4]==a0[4]); p6 = arr_6.getcontent(); passed = passed && (p6!=a0); passed = passed && (p6[0]==a0[0]) && (p6[1]==a0[1]) && (p6[2]==a0[2]) && (p6[3]==a0[3]) && (p6[4]==a0[4]); a0[0] = a0_mod; passed = passed && (arr_6[0]!=a0[0]); a0[0] = a0_orig; // operations on constant arrays { const alglib::boolean_1d_array &ac = arr_6; passed = passed && (ac[0]==a0[0]) && (ac[1]==a0[1]) && (ac[2]==a0[2]) && (ac[3]==a0[3]) && (ac[4]==a0[4]); passed = passed && (ac(0)==a0[0]) && (ac(1)==a0[1]) && (ac(2)==a0[2]) && (ac(3)==a0[3]) && (ac(4)==a0[4]); const bool *p = ac.getcontent(); passed = passed && (p[0]==a0[0]) && (p[1]==a0[1]) && (p[2]==a0[2]) && (p[3]==a0[3]) && (p[4]==a0[4]); } // // Operations with proxy arrays: // * changes in target are propagated to proxy and vice versa // * assignments where proxy is source create new independent copy // * assignments to proxy are checked (their size must match to that of the 
target) // * incorrect assignments or attempts to change length must generate exception // * attempts to call setlength() must fail even when new size match original size // of the array // alglib::boolean_1d_array proxy, targt, acopy; targt = "[true,false,false,true]"; proxy.attach_to(targt.c_ptr()); acopy = proxy; passed = passed && targt[0] && !targt[1] && !targt[2] && targt[3]; passed = passed && proxy[0] && !proxy[1] && !proxy[2] && proxy[3]; passed = passed && acopy[0] && !acopy[1] && !acopy[2] && acopy[3]; targt[0] = false; passed = passed && !targt[0] && !proxy[0] && acopy[0]; proxy[0] = true; passed = passed && targt[0] && proxy[0] && acopy[0]; acopy = "[false,true,true,true]"; proxy = acopy; passed = passed && !targt[0] && targt[1] && targt[2] && targt[3]; passed = passed && !proxy[0] && proxy[1] && proxy[2] && proxy[3]; proxy = "[true,false,true,true]"; passed = passed && targt[0] && !targt[1] && targt[2] && targt[3]; passed = passed && proxy[0] && !proxy[1] && proxy[2] && proxy[3]; try { acopy = "[false,true,true]"; proxy = acopy; passed = false; } catch(alglib::ap_error e) { } catch(...) { passed = false; } try { proxy = "[true,true,true]"; passed = false; } catch(alglib::ap_error e) { } catch(...) { passed = false; } try { proxy.setlength(100); passed = false; } catch(alglib::ap_error e) { } catch(...) { passed = false; } try { proxy.setlength(proxy.length()); passed = false; } catch(alglib::ap_error e) { } catch(...) { passed = false; } } catch(...) { passed = false; } try { // // 1D integer // // Default constructor, string constructor, copy constructor, assignment constructors: // * test that array sizes as reported by length match to what was specified // * test item-by-item access // * test to_string() // * test that modification of the copied array does not change original // * test that setlength() changes length // const char *s1 = "[2,3,-1]"; const char *s2 = "[5,4,3]"; const char *s3 = "[6,7,3,-4]"; const char *s4 = "[9,5,-12,-0]"; const char *s5 = "[1,7,2,1]"; const char *s6 = "[7,7,7]"; int v10 = 2, v11 = 3, v12 = -1, v10_mod = 9; int v20 = 5, v21 = 4, v22 = 3; int v30 = 6, v31 = 7, v32 = 3, v33 = -4, v30_mod = -6; int v40 = 9, v41 = 5, v42 =-12, v43 = 0; int v50 = 1, v51 = 7, v52 = 2, v53 = 1; alglib::integer_1d_array arr_0, arr_1("[]"), arr_2(s1), arr_3(arr_2), arr_4, arr_5; arr_4 = arr_2; arr_5 = s2; passed = passed && (arr_0.length()==0); passed = passed && (arr_1.length()==0); passed = passed && (arr_2.length()==3); passed = passed && (arr_3.length()==3); passed = passed && (arr_2[0]==arr_2(0)) && (arr_2[1]==arr_2(1)) && (arr_2[2]==arr_2(2)); passed = passed && (arr_2[0]==v10) && (arr_2[1]==v11) && (arr_2[2]==v12); passed = passed && (arr_3[0]==v10) && (arr_3[1]==v11) && (arr_3[2]==v12); passed = passed && (arr_4[0]==v10) && (arr_4[1]==v11) && (arr_4[2]==v12); passed = passed && (arr_5[0]==v20) && (arr_5[1]==v21) && (arr_5[2]==v22); passed = passed && (arr_2.tostring()==s1); passed = passed && (arr_3.tostring()==s1); passed = passed && (arr_4.tostring()==s1); passed = passed && (arr_5.tostring()==s2); arr_2[0] = v10_mod; passed = passed && (arr_2[0]==v10_mod) && (arr_3[0]==v10) && (arr_4[0]==v10); arr_5.setlength(99); passed = passed && (arr_5.length()==99); // setcontent/getcontent alglib::ae_int_t a0[] = {2, 3, 1, 9, 2}; alglib::ae_int_t a0_mod = 7; alglib::ae_int_t a0_orig = 2; alglib::ae_int_t *p6; alglib::integer_1d_array arr_6; arr_6.setcontent(5, a0); passed = passed && (arr_6[0]==a0[0]) && (arr_6[1]==a0[1]) && (arr_6[2]==a0[2]) && (arr_6[3]==a0[3]) && 
(arr_6[4]==a0[4]); p6 = arr_6.getcontent(); passed = passed && (p6!=a0); passed = passed && (p6[0]==a0[0]) && (p6[1]==a0[1]) && (p6[2]==a0[2]) && (p6[3]==a0[3]) && (p6[4]==a0[4]); a0[0] = a0_mod; passed = passed && (arr_6[0]!=a0[0]); a0[0] = a0_orig; // operations on constant arrays { const alglib::integer_1d_array &ac = arr_6; passed = passed && (ac[0]==a0[0]) && (ac[1]==a0[1]) && (ac[2]==a0[2]) && (ac[3]==a0[3]) && (ac[4]==a0[4]); passed = passed && (ac(0)==a0[0]) && (ac(1)==a0[1]) && (ac(2)==a0[2]) && (ac(3)==a0[3]) && (ac(4)==a0[4]); const alglib::ae_int_t *p = ac.getcontent(); passed = passed && (p[0]==a0[0]) && (p[1]==a0[1]) && (p[2]==a0[2]) && (p[3]==a0[3]) && (p[4]==a0[4]); } // // Operations with proxy arrays: // * changes in target are propagated to proxy and vice versa // * assignments where proxy is source create new independent copy // * assignments to proxy are checked (their size must match to that of the target) // * incorrect assignments or attempts to change length must generate exception // * attempts to call setlength() must fail even when new size match original size // of the array // alglib::integer_1d_array proxy, targt, acopy; targt = s3; proxy.attach_to(targt.c_ptr()); acopy = proxy; passed = passed && (targt[0]==v30) && (targt[1]==v31) && (targt[2]==v32) && (targt[3]==v33); passed = passed && (proxy[0]==v30) && (proxy[1]==v31) && (proxy[2]==v32) && (proxy[3]==v33); passed = passed && (acopy[0]==v30) && (acopy[1]==v31) && (acopy[2]==v32) && (acopy[3]==v33); targt[0] = v30_mod; passed = passed && (targt[0]==v30_mod) && (proxy[0]==v30_mod) && (acopy[0]==v30); proxy[0] = v30; passed = passed && (targt[0]==v30) && (proxy[0]==v30) && (acopy[0]==v30); acopy = s4; proxy = acopy; passed = passed && (targt[0]==v40) && (targt[1]==v41) && (targt[2]==v42) && (targt[3]==v43); passed = passed && (proxy[0]==v40) && (proxy[1]==v41) && (proxy[2]==v42) && (proxy[3]==v43); proxy = s5; passed = passed && (targt[0]==v50) && (targt[1]==v51) && (targt[2]==v52) && (targt[3]==v53); passed = passed && (proxy[0]==v50) && (proxy[1]==v51) && (proxy[2]==v52) && (proxy[3]==v53); try { acopy = s6; proxy = acopy; passed = false; } catch(alglib::ap_error e) { } catch(...) { passed = false; } try { proxy = s6; passed = false; } catch(alglib::ap_error e) { } catch(...) { passed = false; } try { proxy.setlength(100); passed = false; } catch(alglib::ap_error e) { } catch(...) { passed = false; } try { proxy.setlength(proxy.length()); passed = false; } catch(alglib::ap_error e) { } catch(...) { passed = false; } } catch(...) 
{ passed = false; } try { // // 1D real // // Default constructor, string constructor, copy constructor, assignment constructors: // * test that array sizes as reported by length match to what was specified // * test item-by-item access // * test to_string() // * test that modification of the copied array does not change original // * test that setlength() changes length // const char *s1 = "[2,3.5,-2.5E-1]"; const char *s1_fmt = "[2.00,3.50,-0.25]"; const char *s2 = "[5,4,3.126]"; const char *s2_fmt = "[5.00,4.00,3.13]"; const char *s3 = "[6,7,3,-4E2]"; const char *s4 = "[9,5,-12,-0.01]"; const char *s5 = "[1,7,2,1]"; const char *s6 = "[7,7,7]"; const int dps = 2; double v10 = 2, v11 = 3.5, v12 = -0.25, v10_mod = 9; double v20 = 5, v21 = 4, v22 = 3.126; double v30 = 6, v31 = 7, v32 = 3, v33 = -400, v30_mod = -6; double v40 = 9, v41 = 5, v42 =-12, v43 = -0.01; double v50 = 1, v51 = 7, v52 = 2, v53 = 1; alglib::real_1d_array arr_0, arr_1("[]"), arr_2(s1), arr_3(arr_2), arr_4, arr_5; arr_4 = arr_2; arr_5 = s2; passed = passed && (arr_0.length()==0); passed = passed && (arr_1.length()==0); passed = passed && (arr_2.length()==3); passed = passed && (arr_3.length()==3); passed = passed && (arr_2[0]==arr_2(0)) && (arr_2[1]==arr_2(1)) && (arr_2[2]==arr_2(2)); passed = passed && (arr_2[0]==v10) && (arr_2[1]==v11) && (arr_2[2]==v12); passed = passed && (arr_3[0]==v10) && (arr_3[1]==v11) && (arr_3[2]==v12); passed = passed && (arr_4[0]==v10) && (arr_4[1]==v11) && (arr_4[2]==v12); passed = passed && (arr_5[0]==v20) && (arr_5[1]==v21) && (arr_5[2]==v22); passed = passed && (arr_2.tostring(dps)==s1_fmt); passed = passed && (arr_3.tostring(dps)==s1_fmt); passed = passed && (arr_4.tostring(dps)==s1_fmt); passed = passed && (arr_5.tostring(dps)==s2_fmt); arr_2[0] = v10_mod; passed = passed && (arr_2[0]==v10_mod) && (arr_3[0]==v10) && (arr_4[0]==v10); arr_5.setlength(99); passed = passed && (arr_5.length()==99); // setcontent/getcontent double a0[] = {2, 3.5, 1, 9.125, 2}; double a0_mod = 7; double a0_orig = 2; double *p6; alglib::real_1d_array arr_6; arr_6.setcontent(5, a0); passed = passed && (arr_6[0]==a0[0]) && (arr_6[1]==a0[1]) && (arr_6[2]==a0[2]) && (arr_6[3]==a0[3]) && (arr_6[4]==a0[4]); p6 = arr_6.getcontent(); passed = passed && (p6!=a0); passed = passed && (p6[0]==a0[0]) && (p6[1]==a0[1]) && (p6[2]==a0[2]) && (p6[3]==a0[3]) && (p6[4]==a0[4]); a0[0] = a0_mod; passed = passed && (arr_6[0]!=a0[0]); a0[0] = a0_orig; // operations on constant arrays { const alglib::real_1d_array &ac = arr_6; passed = passed && (ac[0]==a0[0]) && (ac[1]==a0[1]) && (ac[2]==a0[2]) && (ac[3]==a0[3]) && (ac[4]==a0[4]); passed = passed && (ac(0)==a0[0]) && (ac(1)==a0[1]) && (ac(2)==a0[2]) && (ac(3)==a0[3]) && (ac(4)==a0[4]); const double *p = ac.getcontent(); passed = passed && (p[0]==a0[0]) && (p[1]==a0[1]) && (p[2]==a0[2]) && (p[3]==a0[3]) && (p[4]==a0[4]); } // // Operations with proxy arrays: // * changes in target are propagated to proxy and vice versa // * assignments where proxy is source create new independent copy // * assignments to proxy are checked (their size must match to that of the target) // * incorrect assignments or attempts to change length must generate exception // * attempts to call setlength() must fail even when new size match original size // of the array // alglib::real_1d_array proxy, targt, acopy; targt = s3; proxy.attach_to(targt.c_ptr()); acopy = proxy; passed = passed && (targt[0]==v30) && (targt[1]==v31) && (targt[2]==v32) && (targt[3]==v33); passed = passed && (proxy[0]==v30) && 
(proxy[1]==v31) && (proxy[2]==v32) && (proxy[3]==v33); passed = passed && (acopy[0]==v30) && (acopy[1]==v31) && (acopy[2]==v32) && (acopy[3]==v33); targt[0] = v30_mod; passed = passed && (targt[0]==v30_mod) && (proxy[0]==v30_mod) && (acopy[0]==v30); proxy[0] = v30; passed = passed && (targt[0]==v30) && (proxy[0]==v30) && (acopy[0]==v30); acopy = s4; proxy = acopy; passed = passed && (targt[0]==v40) && (targt[1]==v41) && (targt[2]==v42) && (targt[3]==v43); passed = passed && (proxy[0]==v40) && (proxy[1]==v41) && (proxy[2]==v42) && (proxy[3]==v43); proxy = s5; passed = passed && (targt[0]==v50) && (targt[1]==v51) && (targt[2]==v52) && (targt[3]==v53); passed = passed && (proxy[0]==v50) && (proxy[1]==v51) && (proxy[2]==v52) && (proxy[3]==v53); try { acopy = s6; proxy = acopy; passed = false; } catch(alglib::ap_error e) { } catch(...) { passed = false; } try { proxy = s6; passed = false; } catch(alglib::ap_error e) { } catch(...) { passed = false; } try { proxy.setlength(100); passed = false; } catch(alglib::ap_error e) { } catch(...) { passed = false; } try { proxy.setlength(proxy.length()); passed = false; } catch(alglib::ap_error e) { } catch(...) { passed = false; } } catch(...) { passed = false; } try { // // 1D complex // // Default constructor, string constructor, copy constructor, assignment constructors: // * test that array sizes as reported by length match to what was specified // * test item-by-item access // * test to_string() // * test that modification of the copied array does not change original // * test that setlength() changes length // const char *s1 = "[2,3.5i,1-2.5E-1i]"; const char *s1_fmt = "[2.00,3.50i,1.00-0.25i]"; const char *s2 = "[5,-4+1i,3.126]"; const char *s2_fmt = "[5.00,-4.00+1.00i,3.13]"; const char *s3 = "[6,7,3,-4E2]"; const char *s4 = "[9,5,-12,-0.01]"; const char *s5 = "[1,7,2,1]"; const char *s6 = "[7,7,7]"; const int dps = 2; alglib::complex v10 = 2, v11 = alglib::complex(0,3.5), v12 = alglib::complex(1,-0.25), v10_mod = 9; alglib::complex v20 = 5, v21 = alglib::complex(-4,1), v22 = 3.126; alglib::complex v30 = 6, v31 = 7, v32 = 3, v33 = -400, v30_mod = -6; alglib::complex v40 = 9, v41 = 5, v42 =-12, v43 = -0.01; alglib::complex v50 = 1, v51 = 7, v52 = 2, v53 = 1; alglib::complex_1d_array arr_0, arr_1("[]"), arr_2(s1), arr_3(arr_2), arr_4, arr_5; arr_4 = arr_2; arr_5 = s2; passed = passed && (arr_0.length()==0); passed = passed && (arr_1.length()==0); passed = passed && (arr_2.length()==3); passed = passed && (arr_3.length()==3); passed = passed && (arr_2[0]==arr_2(0)) && (arr_2[1]==arr_2(1)) && (arr_2[2]==arr_2(2)); passed = passed && (arr_2[0]==v10) && (arr_2[1]==v11) && (arr_2[2]==v12); passed = passed && (arr_3[0]==v10) && (arr_3[1]==v11) && (arr_3[2]==v12); passed = passed && (arr_4[0]==v10) && (arr_4[1]==v11) && (arr_4[2]==v12); passed = passed && (arr_5[0]==v20) && (arr_5[1]==v21) && (arr_5[2]==v22); passed = passed && (arr_2.tostring(dps)==s1_fmt); passed = passed && (arr_3.tostring(dps)==s1_fmt); passed = passed && (arr_4.tostring(dps)==s1_fmt); passed = passed && (arr_5.tostring(dps)==s2_fmt); arr_2[0] = v10_mod; passed = passed && (arr_2[0]==v10_mod) && (arr_3[0]==v10) && (arr_4[0]==v10); arr_5.setlength(99); passed = passed && (arr_5.length()==99); // setcontent/getcontent alglib::complex a0[] = {2, 3.5, 1, 9.125, 2}; alglib::complex a0_mod = 7; alglib::complex a0_orig = 2; alglib::complex *p6; alglib::complex_1d_array arr_6; arr_6.setcontent(5, a0); passed = passed && (arr_6[0]==a0[0]) && (arr_6[1]==a0[1]) && (arr_6[2]==a0[2]) && 
(arr_6[3]==a0[3]) && (arr_6[4]==a0[4]); p6 = arr_6.getcontent(); passed = passed && (p6!=a0); passed = passed && (p6[0]==a0[0]) && (p6[1]==a0[1]) && (p6[2]==a0[2]) && (p6[3]==a0[3]) && (p6[4]==a0[4]); a0[0] = a0_mod; passed = passed && (arr_6[0]!=a0[0]); a0[0] = a0_orig; // operations on constant arrays { const alglib::complex_1d_array &ac = arr_6; passed = passed && (ac[0]==a0[0]) && (ac[1]==a0[1]) && (ac[2]==a0[2]) && (ac[3]==a0[3]) && (ac[4]==a0[4]); passed = passed && (ac(0)==a0[0]) && (ac(1)==a0[1]) && (ac(2)==a0[2]) && (ac(3)==a0[3]) && (ac(4)==a0[4]); const alglib::complex *p = ac.getcontent(); passed = passed && (p[0]==a0[0]) && (p[1]==a0[1]) && (p[2]==a0[2]) && (p[3]==a0[3]) && (p[4]==a0[4]); } // // Operations with proxy arrays: // * changes in target are propagated to proxy and vice versa // * assignments where proxy is source create new independent copy // * assignments to proxy are checked (their size must match to that of the target) // * incorrect assignments or attempts to change length must generate exception // * attempts to call setlength() must fail even when new size match original size // of the array // alglib::complex_1d_array proxy, targt, acopy; targt = s3; proxy.attach_to(targt.c_ptr()); acopy = proxy; passed = passed && (targt[0]==v30) && (targt[1]==v31) && (targt[2]==v32) && (targt[3]==v33); passed = passed && (proxy[0]==v30) && (proxy[1]==v31) && (proxy[2]==v32) && (proxy[3]==v33); passed = passed && (acopy[0]==v30) && (acopy[1]==v31) && (acopy[2]==v32) && (acopy[3]==v33); targt[0] = v30_mod; passed = passed && (targt[0]==v30_mod) && (proxy[0]==v30_mod) && (acopy[0]==v30); proxy[0] = v30; passed = passed && (targt[0]==v30) && (proxy[0]==v30) && (acopy[0]==v30); acopy = s4; proxy = acopy; passed = passed && (targt[0]==v40) && (targt[1]==v41) && (targt[2]==v42) && (targt[3]==v43); passed = passed && (proxy[0]==v40) && (proxy[1]==v41) && (proxy[2]==v42) && (proxy[3]==v43); proxy = s5; passed = passed && (targt[0]==v50) && (targt[1]==v51) && (targt[2]==v52) && (targt[3]==v53); passed = passed && (proxy[0]==v50) && (proxy[1]==v51) && (proxy[2]==v52) && (proxy[3]==v53); try { acopy = s6; proxy = acopy; passed = false; } catch(alglib::ap_error e) { } catch(...) { passed = false; } try { proxy = s6; passed = false; } catch(alglib::ap_error e) { } catch(...) { passed = false; } try { proxy.setlength(100); passed = false; } catch(alglib::ap_error e) { } catch(...) { passed = false; } try { proxy.setlength(proxy.length()); passed = false; } catch(alglib::ap_error e) { } catch(...) { passed = false; } } catch(...) { passed = false; } // // Report // printf(fmt_str, "* 1D arrays", passed ? 
"OK" : "FAILED"); fflush(stdout); if( !passed ) return 1; } { // // Testing 2D array functionality // bool passed = true; try { // // 2D real // // Default constructor, string constructor, copy constructor, assignment constructors: // * test that array sizes as reported by length match to what was specified // * test item-by-item access // * test to_string() // * test that modification of the copied array does not change original // * test that setlength() changes length // const char *s1 = "[[2,3.5,-2.5E-1],[1,2,3]]"; const char *s1_fmt = "[[2.00,3.50,-0.25],[1.00,2.00,3.00]]"; const char *s2 = "[[5],[4],[3.126]]"; const char *s2_fmt = "[[5.00],[4.00],[3.13]]"; const char *s3 = "[[6,7],[3,-4E2],[-3,-1]]"; const char *s4 = "[[9,5],[-12,-0.01],[-1,-2]]"; const char *s5 = "[[1,7],[2,1],[0,4]]"; const char *s60 = "[[7,7],[7,7]]"; const char *s61 = "[[7],[7],[7]]"; const int dps = 2; double v10 = 2, v11 = 3.5, v12 = -0.25, v13=1, v14 = 2, v15 = 3, v10_mod = 9; double v20 = 5, v21 = 4, v22 = 3.126; /*double v30 = 6, v31 = 7, v32 = 3, v33 = -400, v30_mod = -6; double v40 = 9, v41 = 5, v42 =-12, v43 = -0.01; double v50 = 1, v51 = 7, v52 = 2, v53 = 1;*/ double r; alglib::real_2d_array arr_0, arr_1("[[]]"), arr_2(s1), arr_3(arr_2), arr_4, arr_5; arr_4 = arr_2; arr_5 = s2; passed = passed && (arr_0.rows()==0) && (arr_0.cols()==0) && (arr_0.getstride()==0); passed = passed && (arr_1.rows()==0) && (arr_1.cols()==0) && (arr_1.getstride()==0); passed = passed && (arr_2.rows()==2) && (arr_2.cols()==3) && (arr_2.getstride()>=arr_2.cols()); passed = passed && (arr_3.rows()==2) && (arr_3.cols()==3) && (arr_3.getstride()>=arr_3.cols()); passed = passed && (arr_4.rows()==2) && (arr_4.cols()==3) && (arr_4.getstride()>=arr_4.cols()); passed = passed && (arr_5.rows()==3) && (arr_5.cols()==1) && (arr_5.getstride()>=arr_5.cols()); passed = passed && (arr_2[0][0]==arr_2(0,0)) && (arr_2[0][1]==arr_2(0,1)) && (arr_2[0][2]==arr_2(0,2)); passed = passed && (arr_2[1][0]==arr_2(1,0)) && (arr_2[1][1]==arr_2(1,1)) && (arr_2[1][2]==arr_2(1,2)); passed = passed && (arr_2[0][0]==v10) && (arr_2[0][1]==v11) && (arr_2[0][2]==v12); passed = passed && (arr_2[1][0]==v13) && (arr_2[1][1]==v14) && (arr_2[1][2]==v15); passed = passed && (arr_3[0][0]==v10) && (arr_3[0][1]==v11) && (arr_3[0][2]==v12); passed = passed && (arr_3[1][0]==v13) && (arr_3[1][1]==v14) && (arr_3[1][2]==v15); passed = passed && (arr_4[0][0]==v10) && (arr_4[0][1]==v11) && (arr_4[0][2]==v12); passed = passed && (arr_4[1][0]==v13) && (arr_4[1][1]==v14) && (arr_4[1][2]==v15); passed = passed && (arr_5[0][0]==v20) && (arr_5[1][0]==v21) && (arr_5[2][0]==v22); passed = passed && (arr_2.tostring(dps)==s1_fmt); passed = passed && (arr_3.tostring(dps)==s1_fmt); passed = passed && (arr_4.tostring(dps)==s1_fmt); passed = passed && (arr_5.tostring(dps)==s2_fmt); arr_2[0][0] = v10_mod; passed = passed && (arr_2[0][0]==v10_mod) && (arr_3[0][0]==v10) && (arr_4[0][0]==v10); arr_5.setlength(99,97); passed = passed && (arr_5.rows()==99) && (arr_5.cols()==97); // // setcontent/elementwise access/constant arrays // ae_int_t n, m, i, j; for(n=1; n<=10; n++) for(m=1; m<=10; m++) { alglib::real_2d_array arr_6; double a0[100]; // fill array by random values, test setcontent(0 for(i=0; i"; std::istringstream stream(_s); alglib::rbfmodel model; alglib::rbfunserialize(stream, model); for(int i=0; i'); } catch(...) { passed = false; } // // Report // printf(fmt_str, "* Serialization (RBF)", passed ? 
"OK" : "FAILED"); fflush(stdout); if( !passed ) return 1; } // // Testing issues which must be fixed // printf("Issues:\n"); { // // Testing issue #505 (http://bugs.alglib.net/view.php?id=505) in optimizers. // This issue was present in ALL optimizers, but we test it only on two: CG and LM. // try { // // Test CG // Stopping criteria - EpsX // mincgstate state; mincgreport rep; real_1d_array x = "[0.0]"; double x0 = 20*alglib::randomreal()-10; double epsx = 1.0E-9; mincgcreate(1, x, state); mincgsetcond(state, 0.0, 0.0, epsx, 0); mincgoptimize(state, func505_grad, NULL, &x0); mincgresults(state, x, rep); issue505_passed = issue505_passed && (fabs(4*pow(x[0]-x0,3))<1.0E-3); } catch(...) { issue505_passed = false; } try { // // Test LM // Stopping criteria - after |grad|4 ) { // // 64-bit mode, perform test: // * use large NMax>2^31 // * generate 1.000.000 random numbers // * use two bins - one for numbers less then NMax/2, // another one for the rest of them // * bin sizes are equal to n0, n1 // * both bins should be approximately equal, we use // ad hoc threshold 0.45 < n0,n1 < 0.55. // try { alglib::hqrndstate rs; alglib::ae_int_t nmax[3]; alglib::ae_int_t ncnt = 3, nidx; double n0, n1; alglib::hqrndrandomize(rs); // // nmax: // * first nmax is just large value to test basic uniformity of generator // nmax[0] = 1000000; nmax[0] = nmax[0]*nmax[0]; nmax[1] = 2147483562; nmax[1] *= 1.5; nmax[2] = 2147483562; nmax[2] *= 3; for(nidx=0; nidx=0) && (v0.45); issue478_passed = issue478_passed && (n0/(n0+n1)<0.55); issue478_passed = issue478_passed && (n1/(n0+n1)>0.45); issue478_passed = issue478_passed && (n1/(n0+n1)<0.55); } } catch(...) { issue478_passed = false; } printf(fmt_str, "* issue 478", issue478_passed ? "OK" : "FAILED"); fflush(stdout); if( !issue478_passed ) return 1; } else { // // 32-bit mode, skip test // printf(fmt_str, "* issue 478", "OK (skipped in 32-bit mode)"); fflush(stdout); } // // Testing issue #528 (http://bugs.alglib.net/view.php?id=528) // in shared pool and smart pointer which leak memory. // // In order to test it we create pool, seed it with specially // created structure, perform several operations, then clear it. // We test allocation counter before and after this operation. 
// #ifndef AE_USE_ALLOC_COUNTER #error AE_USE_ALLOC_COUNTER must be defined #endif try { int alloc_cnt; alglib_impl::ae_state _alglib_env_state; alglib_impl::ae_frame _frame_block; alglib_impl::ae_shared_pool pool; alglib_impl::ae_smart_ptr ptr0, ptr1; void *p0, *p1; seedrec seed; // case #0: just seeding the pool alloc_cnt = alglib_impl::_alloc_counter; alglib_impl::ae_state_init(&_alglib_env_state); alglib_impl::ae_frame_make(&_alglib_env_state, &_frame_block); alglib_impl::ae_shared_pool_init(&pool, &_alglib_env_state); _seedrec_init(&seed, &_alglib_env_state); alglib_impl::ae_shared_pool_set_seed(&pool, &seed, sizeof(seed), _seedrec_init, _seedrec_init_copy, _seedrec_destroy, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); issue528_passed = issue528_passed && (alloc_cnt==alglib_impl::_alloc_counter); // case #1: seeding and retrieving, not recycling alloc_cnt = alglib_impl::_alloc_counter; alglib_impl::ae_state_init(&_alglib_env_state); alglib_impl::ae_frame_make(&_alglib_env_state, &_frame_block); alglib_impl::ae_smart_ptr_init(&ptr0, (void**)&p0, &_alglib_env_state); alglib_impl::ae_shared_pool_init(&pool, &_alglib_env_state); _seedrec_init(&seed, &_alglib_env_state); alglib_impl::ae_shared_pool_set_seed(&pool, &seed, sizeof(seed), _seedrec_init, _seedrec_init_copy, _seedrec_destroy, &_alglib_env_state); alglib_impl::ae_shared_pool_retrieve(&pool, &ptr0, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); issue528_passed = issue528_passed && (alloc_cnt==alglib_impl::_alloc_counter); // case #2: seeding and retrieving twice, recycling both alloc_cnt = alglib_impl::_alloc_counter; alglib_impl::ae_state_init(&_alglib_env_state); alglib_impl::ae_frame_make(&_alglib_env_state, &_frame_block); alglib_impl::ae_smart_ptr_init(&ptr0, (void**)&p0, &_alglib_env_state); alglib_impl::ae_smart_ptr_init(&ptr1, (void**)&p1, &_alglib_env_state); alglib_impl::ae_shared_pool_init(&pool, &_alglib_env_state); _seedrec_init(&seed, &_alglib_env_state); alglib_impl::ae_shared_pool_set_seed(&pool, &seed, sizeof(seed), _seedrec_init, _seedrec_init_copy, _seedrec_destroy, &_alglib_env_state); alglib_impl::ae_shared_pool_retrieve(&pool, &ptr0, &_alglib_env_state); alglib_impl::ae_shared_pool_retrieve(&pool, &ptr1, &_alglib_env_state); alglib_impl::ae_shared_pool_recycle(&pool, &ptr0, &_alglib_env_state); alglib_impl::ae_shared_pool_recycle(&pool, &ptr1, &_alglib_env_state); alglib_impl::ae_state_clear(&_alglib_env_state); issue528_passed = issue528_passed && (alloc_cnt==alglib_impl::_alloc_counter); } catch(...) { issue528_passed = false; } printf(fmt_str, "* issue 528", issue528_passed ? "OK" : "FAILED"); fflush(stdout); if( !issue528_passed ) return 1; // // Testing issue #591 (http://bugs.alglib.net/view.php?id=591) // in copying of object containing shared pool as one of its // fields. // // Unfixed ALGLIB crashes because of unneeded assertion in the // ae_shared_pool_init_copy() function. // try { alglib::multilayerperceptron net0, net1; alglib::real_1d_array x("[1,2]"), y0("[0,0]"), y1("[0,0]"), y2("[0,0]"); alglib::mlpcreate0(2, 2, net0); alglib::mlpprocess(net0, x, y0); // // Test assignment constructor // net1 = net0; alglib::mlpprocess(net1, x, y1); issue591_passed = issue591_passed && (fabs(y0[0]-y1[0])<1.0E-9) && (fabs(y0[1]-y1[1])<1.0E-9); // // Test copy constructor // alglib::multilayerperceptron net2(net0); alglib::mlpprocess(net2, x, y2); issue591_passed = issue591_passed && (fabs(y0[0]-y2[0])<1.0E-9) && (fabs(y0[1]-y2[1])<1.0E-9); } catch(...) 
{ issue591_passed = false; } printf(fmt_str, "* issue 591", issue591_passed ? "OK" : "FAILED"); fflush(stdout); if( !issue591_passed ) return 1; // // Task #594 (http://bugs.alglib.net/view.php?id=594) - additional // test for correctness of copying of objects. When we copy ALGLIB // object, indenendent new copy is created. // // This test checks both copying with copy constructor and assignment // constructor // try { alglib::multilayerperceptron net0, net1; alglib::real_1d_array x("[1,2]"), y0("[0,0]"), y1("[0,0]"), y2("[0,0]"); alglib::mlpcreate0(2, 2, net0); alglib::mlpprocess(net0, x, y0); // // Test assignment and copy constructors: // * copy object with one of the constructors // * process vector with original network // * randomize original network // * process vector with copied networks and compare // net1 = net0; alglib::multilayerperceptron net2(net0); alglib::mlprandomize(net0); alglib::mlpprocess(net1, x, y1); alglib::mlpprocess(net2, x, y2); issue594_passed = issue594_passed && (fabs(y0[0]-y1[0])<1.0E-9) && (fabs(y0[1]-y1[1])<1.0E-9); issue594_passed = issue594_passed && (fabs(y0[0]-y2[0])<1.0E-9) && (fabs(y0[1]-y2[1])<1.0E-9); } catch(...) { issue594_passed = false; } printf(fmt_str, "* issue 594", issue594_passed ? "OK" : "FAILED"); fflush(stdout); if( !issue594_passed ) return 1; } // // Performance testing // printf("Performance:\n"); { { int _n[] = { 16, 32, 64, 1024, 0}; int i, j, k, t, nidx; double desiredflops = 1.0E10; for(nidx=0; _n[nidx]!=0; nidx++) { // // Settings: // * n - matrix size // * nrepeat - number of repeated multiplications, always divisible by 4 // int n = _n[nidx]; int nrepeat = (int)(desiredflops/(2*pow((double)n,3.0))); nrepeat = 4*(nrepeat/4+1); // // Actual processing // alglib::real_2d_array a, b, c; double perf0, perf1, perf2; a.setlength(n, n); b.setlength(n, n); c.setlength(n, n); for(i=0; i Everyone is permitted to copy and distribute verbatim copies of this license document, but changing it is not allowed. Preamble The GNU General Public License is a free, copyleft license for software and other kinds of works. The licenses for most software and other practical works are designed to take away your freedom to share and change the works. By contrast, the GNU General Public License is intended to guarantee your freedom to share and change all versions of a program--to make sure it remains free software for all its users. We, the Free Software Foundation, use the GNU General Public License for most of our software; it applies also to any other work released this way by its authors. You can apply it to your programs, too. When we speak of free software, we are referring to freedom, not price. Our General Public Licenses are designed to make sure that you have the freedom to distribute copies of free software (and charge for them if you wish), that you receive source code or can get it if you want it, that you can change the software or use pieces of it in new free programs, and that you know you can do these things. To protect your rights, we need to prevent others from denying you these rights or asking you to surrender the rights. Therefore, you have certain responsibilities if you distribute copies of the software, or if you modify it: responsibilities to respect the freedom of others. For example, if you distribute copies of such a program, whether gratis or for a fee, you must pass on to the recipients the same freedoms that you received. You must make sure that they, too, receive or can get the source code. 
And you must show them these terms so they know their rights. Developers that use the GNU GPL protect your rights with two steps: (1) assert copyright on the software, and (2) offer you this License giving you legal permission to copy, distribute and/or modify it. For the developers' and authors' protection, the GPL clearly explains that there is no warranty for this free software. For both users' and authors' sake, the GPL requires that modified versions be marked as changed, so that their problems will not be attributed erroneously to authors of previous versions. Some devices are designed to deny users access to install or run modified versions of the software inside them, although the manufacturer can do so. This is fundamentally incompatible with the aim of protecting users' freedom to change the software. The systematic pattern of such abuse occurs in the area of products for individuals to use, which is precisely where it is most unacceptable. Therefore, we have designed this version of the GPL to prohibit the practice for those products. If such problems arise substantially in other domains, we stand ready to extend this provision to those domains in future versions of the GPL, as needed to protect the freedom of users. Finally, every program is threatened constantly by software patents. States should not allow patents to restrict development and use of software on general-purpose computers, but in those that do, we wish to avoid the special danger that patents applied to a free program could make it effectively proprietary. To prevent this, the GPL assures that patents cannot be used to render the program non-free. The precise terms and conditions for copying, distribution and modification follow. TERMS AND CONDITIONS 0. Definitions. "This License" refers to version 3 of the GNU General Public License. "Copyright" also means copyright-like laws that apply to other kinds of works, such as semiconductor masks. "The Program" refers to any copyrightable work licensed under this License. Each licensee is addressed as "you". "Licensees" and "recipients" may be individuals or organizations. To "modify" a work means to copy from or adapt all or part of the work in a fashion requiring copyright permission, other than the making of an exact copy. The resulting work is called a "modified version" of the earlier work or a work "based on" the earlier work. A "covered work" means either the unmodified Program or a work based on the Program. To "propagate" a work means to do anything with it that, without permission, would make you directly or secondarily liable for infringement under applicable copyright law, except executing it on a computer or modifying a private copy. Propagation includes copying, distribution (with or without modification), making available to the public, and in some countries other activities as well. To "convey" a work means any kind of propagation that enables other parties to make or receive copies. Mere interaction with a user through a computer network, with no transfer of a copy, is not conveying. An interactive user interface displays "Appropriate Legal Notices" to the extent that it includes a convenient and prominently visible feature that (1) displays an appropriate copyright notice, and (2) tells the user that there is no warranty for the work (except to the extent that warranties are provided), that licensees may convey the work under this License, and how to view a copy of this License. 
If the interface presents a list of user commands or options, such as a menu, a prominent item in the list meets this criterion. 1. Source Code. The "source code" for a work means the preferred form of the work for making modifications to it. "Object code" means any non-source form of a work. A "Standard Interface" means an interface that either is an official standard defined by a recognized standards body, or, in the case of interfaces specified for a particular programming language, one that is widely used among developers working in that language. The "System Libraries" of an executable work include anything, other than the work as a whole, that (a) is included in the normal form of packaging a Major Component, but which is not part of that Major Component, and (b) serves only to enable use of the work with that Major Component, or to implement a Standard Interface for which an implementation is available to the public in source code form. A "Major Component", in this context, means a major essential component (kernel, window system, and so on) of the specific operating system (if any) on which the executable work runs, or a compiler used to produce the work, or an object code interpreter used to run it. The "Corresponding Source" for a work in object code form means all the source code needed to generate, install, and (for an executable work) run the object code and to modify the work, including scripts to control those activities. However, it does not include the work's System Libraries, or general-purpose tools or generally available free programs which are used unmodified in performing those activities but which are not part of the work. For example, Corresponding Source includes interface definition files associated with source files for the work, and the source code for shared libraries and dynamically linked subprograms that the work is specifically designed to require, such as by intimate data communication or control flow between those subprograms and other parts of the work. The Corresponding Source need not include anything that users can regenerate automatically from other parts of the Corresponding Source. The Corresponding Source for a work in source code form is that same work. 2. Basic Permissions. All rights granted under this License are granted for the term of copyright on the Program, and are irrevocable provided the stated conditions are met. This License explicitly affirms your unlimited permission to run the unmodified Program. The output from running a covered work is covered by this License only if the output, given its content, constitutes a covered work. This License acknowledges your rights of fair use or other equivalent, as provided by copyright law. You may make, run and propagate covered works that you do not convey, without conditions so long as your license otherwise remains in force. You may convey covered works to others for the sole purpose of having them make modifications exclusively for you, or provide you with facilities for running those works, provided that you comply with the terms of this License in conveying all material for which you do not control copyright. Those thus making or running the covered works for you must do so exclusively on your behalf, under your direction and control, on terms that prohibit them from making any copies of your copyrighted material outside their relationship with you. Conveying under any other circumstances is permitted solely under the conditions stated below. 
Sublicensing is not allowed; section 10 makes it unnecessary. 3. Protecting Users' Legal Rights From Anti-Circumvention Law. No covered work shall be deemed part of an effective technological measure under any applicable law fulfilling obligations under article 11 of the WIPO copyright treaty adopted on 20 December 1996, or similar laws prohibiting or restricting circumvention of such measures. When you convey a covered work, you waive any legal power to forbid circumvention of technological measures to the extent such circumvention is effected by exercising rights under this License with respect to the covered work, and you disclaim any intention to limit operation or modification of the work as a means of enforcing, against the work's users, your or third parties' legal rights to forbid circumvention of technological measures. 4. Conveying Verbatim Copies. You may convey verbatim copies of the Program's source code as you receive it, in any medium, provided that you conspicuously and appropriately publish on each copy an appropriate copyright notice; keep intact all notices stating that this License and any non-permissive terms added in accord with section 7 apply to the code; keep intact all notices of the absence of any warranty; and give all recipients a copy of this License along with the Program. You may charge any price or no price for each copy that you convey, and you may offer support or warranty protection for a fee. 5. Conveying Modified Source Versions. You may convey a work based on the Program, or the modifications to produce it from the Program, in the form of source code under the terms of section 4, provided that you also meet all of these conditions: a) The work must carry prominent notices stating that you modified it, and giving a relevant date. b) The work must carry prominent notices stating that it is released under this License and any conditions added under section 7. This requirement modifies the requirement in section 4 to "keep intact all notices". c) You must license the entire work, as a whole, under this License to anyone who comes into possession of a copy. This License will therefore apply, along with any applicable section 7 additional terms, to the whole of the work, and all its parts, regardless of how they are packaged. This License gives no permission to license the work in any other way, but it does not invalidate such permission if you have separately received it. d) If the work has interactive user interfaces, each must display Appropriate Legal Notices; however, if the Program has interactive interfaces that do not display Appropriate Legal Notices, your work need not make them do so. A compilation of a covered work with other separate and independent works, which are not by their nature extensions of the covered work, and which are not combined with it such as to form a larger program, in or on a volume of a storage or distribution medium, is called an "aggregate" if the compilation and its resulting copyright are not used to limit the access or legal rights of the compilation's users beyond what the individual works permit. Inclusion of a covered work in an aggregate does not cause this License to apply to the other parts of the aggregate. 6. Conveying Non-Source Forms. 
You may convey a covered work in object code form under the terms of sections 4 and 5, provided that you also convey the machine-readable Corresponding Source under the terms of this License, in one of these ways: a) Convey the object code in, or embodied in, a physical product (including a physical distribution medium), accompanied by the Corresponding Source fixed on a durable physical medium customarily used for software interchange. b) Convey the object code in, or embodied in, a physical product (including a physical distribution medium), accompanied by a written offer, valid for at least three years and valid for as long as you offer spare parts or customer support for that product model, to give anyone who possesses the object code either (1) a copy of the Corresponding Source for all the software in the product that is covered by this License, on a durable physical medium customarily used for software interchange, for a price no more than your reasonable cost of physically performing this conveying of source, or (2) access to copy the Corresponding Source from a network server at no charge. c) Convey individual copies of the object code with a copy of the written offer to provide the Corresponding Source. This alternative is allowed only occasionally and noncommercially, and only if you received the object code with such an offer, in accord with subsection 6b. d) Convey the object code by offering access from a designated place (gratis or for a charge), and offer equivalent access to the Corresponding Source in the same way through the same place at no further charge. You need not require recipients to copy the Corresponding Source along with the object code. If the place to copy the object code is a network server, the Corresponding Source may be on a different server (operated by you or a third party) that supports equivalent copying facilities, provided you maintain clear directions next to the object code saying where to find the Corresponding Source. Regardless of what server hosts the Corresponding Source, you remain obligated to ensure that it is available for as long as needed to satisfy these requirements. e) Convey the object code using peer-to-peer transmission, provided you inform other peers where the object code and Corresponding Source of the work are being offered to the general public at no charge under subsection 6d. A separable portion of the object code, whose source code is excluded from the Corresponding Source as a System Library, need not be included in conveying the object code work. A "User Product" is either (1) a "consumer product", which means any tangible personal property which is normally used for personal, family, or household purposes, or (2) anything designed or sold for incorporation into a dwelling. In determining whether a product is a consumer product, doubtful cases shall be resolved in favor of coverage. For a particular product received by a particular user, "normally used" refers to a typical or common use of that class of product, regardless of the status of the particular user or of the way in which the particular user actually uses, or expects or is expected to use, the product. A product is a consumer product regardless of whether the product has substantial commercial, industrial or non-consumer uses, unless such uses represent the only significant mode of use of the product. 
"Installation Information" for a User Product means any methods, procedures, authorization keys, or other information required to install and execute modified versions of a covered work in that User Product from a modified version of its Corresponding Source. The information must suffice to ensure that the continued functioning of the modified object code is in no case prevented or interfered with solely because modification has been made. If you convey an object code work under this section in, or with, or specifically for use in, a User Product, and the conveying occurs as part of a transaction in which the right of possession and use of the User Product is transferred to the recipient in perpetuity or for a fixed term (regardless of how the transaction is characterized), the Corresponding Source conveyed under this section must be accompanied by the Installation Information. But this requirement does not apply if neither you nor any third party retains the ability to install modified object code on the User Product (for example, the work has been installed in ROM). The requirement to provide Installation Information does not include a requirement to continue to provide support service, warranty, or updates for a work that has been modified or installed by the recipient, or for the User Product in which it has been modified or installed. Access to a network may be denied when the modification itself materially and adversely affects the operation of the network or violates the rules and protocols for communication across the network. Corresponding Source conveyed, and Installation Information provided, in accord with this section must be in a format that is publicly documented (and with an implementation available to the public in source code form), and must require no special password or key for unpacking, reading or copying. 7. Additional Terms. "Additional permissions" are terms that supplement the terms of this License by making exceptions from one or more of its conditions. Additional permissions that are applicable to the entire Program shall be treated as though they were included in this License, to the extent that they are valid under applicable law. If additional permissions apply only to part of the Program, that part may be used separately under those permissions, but the entire Program remains governed by this License without regard to the additional permissions. When you convey a copy of a covered work, you may at your option remove any additional permissions from that copy, or from any part of it. (Additional permissions may be written to require their own removal in certain cases when you modify the work.) You may place additional permissions on material, added by you to a covered work, for which you have or can give appropriate copyright permission. 
Notwithstanding any other provision of this License, for material you add to a covered work, you may (if authorized by the copyright holders of that material) supplement the terms of this License with terms: a) Disclaiming warranty or limiting liability differently from the terms of sections 15 and 16 of this License; or b) Requiring preservation of specified reasonable legal notices or author attributions in that material or in the Appropriate Legal Notices displayed by works containing it; or c) Prohibiting misrepresentation of the origin of that material, or requiring that modified versions of such material be marked in reasonable ways as different from the original version; or d) Limiting the use for publicity purposes of names of licensors or authors of the material; or e) Declining to grant rights under trademark law for use of some trade names, trademarks, or service marks; or f) Requiring indemnification of licensors and authors of that material by anyone who conveys the material (or modified versions of it) with contractual assumptions of liability to the recipient, for any liability that these contractual assumptions directly impose on those licensors and authors. All other non-permissive additional terms are considered "further restrictions" within the meaning of section 10. If the Program as you received it, or any part of it, contains a notice stating that it is governed by this License along with a term that is a further restriction, you may remove that term. If a license document contains a further restriction but permits relicensing or conveying under this License, you may add to a covered work material governed by the terms of that license document, provided that the further restriction does not survive such relicensing or conveying. If you add terms to a covered work in accord with this section, you must place, in the relevant source files, a statement of the additional terms that apply to those files, or a notice indicating where to find the applicable terms. Additional terms, permissive or non-permissive, may be stated in the form of a separately written license, or stated as exceptions; the above requirements apply either way. 8. Termination. You may not propagate or modify a covered work except as expressly provided under this License. Any attempt otherwise to propagate or modify it is void, and will automatically terminate your rights under this License (including any patent licenses granted under the third paragraph of section 11). However, if you cease all violation of this License, then your license from a particular copyright holder is reinstated (a) provisionally, unless and until the copyright holder explicitly and finally terminates your license, and (b) permanently, if the copyright holder fails to notify you of the violation by some reasonable means prior to 60 days after the cessation. Moreover, your license from a particular copyright holder is reinstated permanently if the copyright holder notifies you of the violation by some reasonable means, this is the first time you have received notice of violation of this License (for any work) from that copyright holder, and you cure the violation prior to 30 days after your receipt of the notice. Termination of your rights under this section does not terminate the licenses of parties who have received copies or rights from you under this License. If your rights have been terminated and not permanently reinstated, you do not qualify to receive new licenses for the same material under section 10. 9. 
Acceptance Not Required for Having Copies. You are not required to accept this License in order to receive or run a copy of the Program. Ancillary propagation of a covered work occurring solely as a consequence of using peer-to-peer transmission to receive a copy likewise does not require acceptance. However, nothing other than this License grants you permission to propagate or modify any covered work. These actions infringe copyright if you do not accept this License. Therefore, by modifying or propagating a covered work, you indicate your acceptance of this License to do so. 10. Automatic Licensing of Downstream Recipients. Each time you convey a covered work, the recipient automatically receives a license from the original licensors, to run, modify and propagate that work, subject to this License. You are not responsible for enforcing compliance by third parties with this License. An "entity transaction" is a transaction transferring control of an organization, or substantially all assets of one, or subdividing an organization, or merging organizations. If propagation of a covered work results from an entity transaction, each party to that transaction who receives a copy of the work also receives whatever licenses to the work the party's predecessor in interest had or could give under the previous paragraph, plus a right to possession of the Corresponding Source of the work from the predecessor in interest, if the predecessor has it or can get it with reasonable efforts. You may not impose any further restrictions on the exercise of the rights granted or affirmed under this License. For example, you may not impose a license fee, royalty, or other charge for exercise of rights granted under this License, and you may not initiate litigation (including a cross-claim or counterclaim in a lawsuit) alleging that any patent claim is infringed by making, using, selling, offering for sale, or importing the Program or any portion of it. 11. Patents. A "contributor" is a copyright holder who authorizes use under this License of the Program or a work on which the Program is based. The work thus licensed is called the contributor's "contributor version". A contributor's "essential patent claims" are all patent claims owned or controlled by the contributor, whether already acquired or hereafter acquired, that would be infringed by some manner, permitted by this License, of making, using, or selling its contributor version, but do not include claims that would be infringed only as a consequence of further modification of the contributor version. For purposes of this definition, "control" includes the right to grant patent sublicenses in a manner consistent with the requirements of this License. Each contributor grants you a non-exclusive, worldwide, royalty-free patent license under the contributor's essential patent claims, to make, use, sell, offer for sale, import and otherwise run, modify and propagate the contents of its contributor version. In the following three paragraphs, a "patent license" is any express agreement or commitment, however denominated, not to enforce a patent (such as an express permission to practice a patent or covenant not to sue for patent infringement). To "grant" such a patent license to a party means to make such an agreement or commitment not to enforce a patent against the party. 
If you convey a covered work, knowingly relying on a patent license, and the Corresponding Source of the work is not available for anyone to copy, free of charge and under the terms of this License, through a publicly available network server or other readily accessible means, then you must either (1) cause the Corresponding Source to be so available, or (2) arrange to deprive yourself of the benefit of the patent license for this particular work, or (3) arrange, in a manner consistent with the requirements of this License, to extend the patent license to downstream recipients. "Knowingly relying" means you have actual knowledge that, but for the patent license, your conveying the covered work in a country, or your recipient's use of the covered work in a country, would infringe one or more identifiable patents in that country that you have reason to believe are valid. If, pursuant to or in connection with a single transaction or arrangement, you convey, or propagate by procuring conveyance of, a covered work, and grant a patent license to some of the parties receiving the covered work authorizing them to use, propagate, modify or convey a specific copy of the covered work, then the patent license you grant is automatically extended to all recipients of the covered work and works based on it. A patent license is "discriminatory" if it does not include within the scope of its coverage, prohibits the exercise of, or is conditioned on the non-exercise of one or more of the rights that are specifically granted under this License. You may not convey a covered work if you are a party to an arrangement with a third party that is in the business of distributing software, under which you make payment to the third party based on the extent of your activity of conveying the work, and under which the third party grants, to any of the parties who would receive the covered work from you, a discriminatory patent license (a) in connection with copies of the covered work conveyed by you (or copies made from those copies), or (b) primarily for and in connection with specific products or compilations that contain the covered work, unless you entered into that arrangement, or that patent license was granted, prior to 28 March 2007. Nothing in this License shall be construed as excluding or limiting any implied license or other defenses to infringement that may otherwise be available to you under applicable patent law. 12. No Surrender of Others' Freedom. If conditions are imposed on you (whether by court order, agreement or otherwise) that contradict the conditions of this License, they do not excuse you from the conditions of this License. If you cannot convey a covered work so as to satisfy simultaneously your obligations under this License and any other pertinent obligations, then as a consequence you may not convey it at all. For example, if you agree to terms that obligate you to collect a royalty for further conveying from those to whom you convey the Program, the only way you could satisfy both those terms and this License would be to refrain entirely from conveying the Program. 13. Use with the GNU Affero General Public License. Notwithstanding any other provision of this License, you have permission to link or combine any covered work with a work licensed under version 3 of the GNU Affero General Public License into a single combined work, and to convey the resulting work. 
The terms of this License will continue to apply to the part which is the covered work, but the special requirements of the GNU Affero General Public License, section 13, concerning interaction through a network will apply to the combination as such. 14. Revised Versions of this License. The Free Software Foundation may publish revised and/or new versions of the GNU General Public License from time to time. Such new versions will be similar in spirit to the present version, but may differ in detail to address new problems or concerns. Each version is given a distinguishing version number. If the Program specifies that a certain numbered version of the GNU General Public License "or any later version" applies to it, you have the option of following the terms and conditions either of that numbered version or of any later version published by the Free Software Foundation. If the Program does not specify a version number of the GNU General Public License, you may choose any version ever published by the Free Software Foundation. If the Program specifies that a proxy can decide which future versions of the GNU General Public License can be used, that proxy's public statement of acceptance of a version permanently authorizes you to choose that version for the Program. Later license versions may give you additional or different permissions. However, no additional obligations are imposed on any author or copyright holder as a result of your choosing to follow a later version. 15. Disclaimer of Warranty. THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION. 16. Limitation of Liability. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MODIFIES AND/OR CONVEYS THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES. 17. Interpretation of Sections 15 and 16. If the disclaimer of warranty and limitation of liability provided above cannot be given local legal effect according to their terms, reviewing courts shall apply local law that most closely approximates an absolute waiver of all civil liability in connection with the Program, unless a warranty or assumption of liability accompanies a copy of the Program in return for a fee. END OF TERMS AND CONDITIONS How to Apply These Terms to Your New Programs If you develop a new program, and you want it to be of the greatest possible use to the public, the best way to achieve this is to make it free software which everyone can redistribute and change under these terms. To do so, attach the following notices to the program. 
It is safest to attach them to the start of each source file to most effectively state the exclusion of warranty; and each file should have at least the "copyright" line and a pointer to where the full notice is found. Copyright (C) This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with this program. If not, see . Also add information on how to contact you by electronic and paper mail. If the program does terminal interaction, make it output a short notice like this when it starts in an interactive mode: Copyright (C) This program comes with ABSOLUTELY NO WARRANTY; for details type `show w'. This is free software, and you are welcome to redistribute it under certain conditions; type `show c' for details. The hypothetical commands `show w' and `show c' should show the appropriate parts of the General Public License. Of course, your program's commands might be different; for a GUI interface, you would use an "about box". You should also get your employer (if you work as a programmer) or school, if any, to sign a "copyright disclaimer" for the program, if necessary. For more information on this, and how to apply and follow the GNU GPL, see . The GNU General Public License does not permit incorporating your program into proprietary programs. If your program is a subroutine library, you may consider it more useful to permit linking proprietary applications with the library. If this is what you want to do, use the GNU Lesser General Public License instead of this License. But first, please read .cpp/gpl2.txt0000755000175000017500000004325413105126766012645 0ustar sergeysergey GNU GENERAL PUBLIC LICENSE Version 2, June 1991 Copyright (C) 1989, 1991 Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA Everyone is permitted to copy and distribute verbatim copies of this license document, but changing it is not allowed. Preamble The licenses for most software are designed to take away your freedom to share and change it. By contrast, the GNU General Public License is intended to guarantee your freedom to share and change free software--to make sure the software is free for all its users. This General Public License applies to most of the Free Software Foundation's software and to any other program whose authors commit to using it. (Some other Free Software Foundation software is covered by the GNU Lesser General Public License instead.) You can apply it to your programs, too. When we speak of free software, we are referring to freedom, not price. Our General Public Licenses are designed to make sure that you have the freedom to distribute copies of free software (and charge for this service if you wish), that you receive source code or can get it if you want it, that you can change the software or use pieces of it in new free programs; and that you know you can do these things. To protect your rights, we need to make restrictions that forbid anyone to deny you these rights or to ask you to surrender the rights. 
These restrictions translate to certain responsibilities for you if you distribute copies of the software, or if you modify it. For example, if you distribute copies of such a program, whether gratis or for a fee, you must give the recipients all the rights that you have. You must make sure that they, too, receive or can get the source code. And you must show them these terms so they know their rights. We protect your rights with two steps: (1) copyright the software, and (2) offer you this license which gives you legal permission to copy, distribute and/or modify the software. Also, for each author's protection and ours, we want to make certain that everyone understands that there is no warranty for this free software. If the software is modified by someone else and passed on, we want its recipients to know that what they have is not the original, so that any problems introduced by others will not reflect on the original authors' reputations. Finally, any free program is threatened constantly by software patents. We wish to avoid the danger that redistributors of a free program will individually obtain patent licenses, in effect making the program proprietary. To prevent this, we have made it clear that any patent must be licensed for everyone's free use or not licensed at all. The precise terms and conditions for copying, distribution and modification follow. GNU GENERAL PUBLIC LICENSE TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION 0. This License applies to any program or other work which contains a notice placed by the copyright holder saying it may be distributed under the terms of this General Public License. The "Program", below, refers to any such program or work, and a "work based on the Program" means either the Program or any derivative work under copyright law: that is to say, a work containing the Program or a portion of it, either verbatim or with modifications and/or translated into another language. (Hereinafter, translation is included without limitation in the term "modification".) Each licensee is addressed as "you". Activities other than copying, distribution and modification are not covered by this License; they are outside its scope. The act of running the Program is not restricted, and the output from the Program is covered only if its contents constitute a work based on the Program (independent of having been made by running the Program). Whether that is true depends on what the Program does. 1. You may copy and distribute verbatim copies of the Program's source code as you receive it, in any medium, provided that you conspicuously and appropriately publish on each copy an appropriate copyright notice and disclaimer of warranty; keep intact all the notices that refer to this License and to the absence of any warranty; and give any other recipients of the Program a copy of this License along with the Program. You may charge a fee for the physical act of transferring a copy, and you may at your option offer warranty protection in exchange for a fee. 2. You may modify your copy or copies of the Program or any portion of it, thus forming a work based on the Program, and copy and distribute such modifications or work under the terms of Section 1 above, provided that you also meet all of these conditions: a) You must cause the modified files to carry prominent notices stating that you changed the files and the date of any change. 
b) You must cause any work that you distribute or publish, that in whole or in part contains or is derived from the Program or any part thereof, to be licensed as a whole at no charge to all third parties under the terms of this License. c) If the modified program normally reads commands interactively when run, you must cause it, when started running for such interactive use in the most ordinary way, to print or display an announcement including an appropriate copyright notice and a notice that there is no warranty (or else, saying that you provide a warranty) and that users may redistribute the program under these conditions, and telling the user how to view a copy of this License. (Exception: if the Program itself is interactive but does not normally print such an announcement, your work based on the Program is not required to print an announcement.) These requirements apply to the modified work as a whole. If identifiable sections of that work are not derived from the Program, and can be reasonably considered independent and separate works in themselves, then this License, and its terms, do not apply to those sections when you distribute them as separate works. But when you distribute the same sections as part of a whole which is a work based on the Program, the distribution of the whole must be on the terms of this License, whose permissions for other licensees extend to the entire whole, and thus to each and every part regardless of who wrote it. Thus, it is not the intent of this section to claim rights or contest your rights to work written entirely by you; rather, the intent is to exercise the right to control the distribution of derivative or collective works based on the Program. In addition, mere aggregation of another work not based on the Program with the Program (or with a work based on the Program) on a volume of a storage or distribution medium does not bring the other work under the scope of this License. 3. You may copy and distribute the Program (or a work based on it, under Section 2) in object code or executable form under the terms of Sections 1 and 2 above provided that you also do one of the following: a) Accompany it with the complete corresponding machine-readable source code, which must be distributed under the terms of Sections 1 and 2 above on a medium customarily used for software interchange; or, b) Accompany it with a written offer, valid for at least three years, to give any third party, for a charge no more than your cost of physically performing source distribution, a complete machine-readable copy of the corresponding source code, to be distributed under the terms of Sections 1 and 2 above on a medium customarily used for software interchange; or, c) Accompany it with the information you received as to the offer to distribute corresponding source code. (This alternative is allowed only for noncommercial distribution and only if you received the program in object code or executable form with such an offer, in accord with Subsection b above.) The source code for a work means the preferred form of the work for making modifications to it. For an executable work, complete source code means all the source code for all modules it contains, plus any associated interface definition files, plus the scripts used to control compilation and installation of the executable. 
However, as a special exception, the source code distributed need not include anything that is normally distributed (in either source or binary form) with the major components (compiler, kernel, and so on) of the operating system on which the executable runs, unless that component itself accompanies the executable. If distribution of executable or object code is made by offering access to copy from a designated place, then offering equivalent access to copy the source code from the same place counts as distribution of the source code, even though third parties are not compelled to copy the source along with the object code. 4. You may not copy, modify, sublicense, or distribute the Program except as expressly provided under this License. Any attempt otherwise to copy, modify, sublicense or distribute the Program is void, and will automatically terminate your rights under this License. However, parties who have received copies, or rights, from you under this License will not have their licenses terminated so long as such parties remain in full compliance. 5. You are not required to accept this License, since you have not signed it. However, nothing else grants you permission to modify or distribute the Program or its derivative works. These actions are prohibited by law if you do not accept this License. Therefore, by modifying or distributing the Program (or any work based on the Program), you indicate your acceptance of this License to do so, and all its terms and conditions for copying, distributing or modifying the Program or works based on it. 6. Each time you redistribute the Program (or any work based on the Program), the recipient automatically receives a license from the original licensor to copy, distribute or modify the Program subject to these terms and conditions. You may not impose any further restrictions on the recipients' exercise of the rights granted herein. You are not responsible for enforcing compliance by third parties to this License. 7. If, as a consequence of a court judgment or allegation of patent infringement or for any other reason (not limited to patent issues), conditions are imposed on you (whether by court order, agreement or otherwise) that contradict the conditions of this License, they do not excuse you from the conditions of this License. If you cannot distribute so as to satisfy simultaneously your obligations under this License and any other pertinent obligations, then as a consequence you may not distribute the Program at all. For example, if a patent license would not permit royalty-free redistribution of the Program by all those who receive copies directly or indirectly through you, then the only way you could satisfy both it and this License would be to refrain entirely from distribution of the Program. If any portion of this section is held invalid or unenforceable under any particular circumstance, the balance of the section is intended to apply and the section as a whole is intended to apply in other circumstances. It is not the purpose of this section to induce you to infringe any patents or other property right claims or to contest validity of any such claims; this section has the sole purpose of protecting the integrity of the free software distribution system, which is implemented by public license practices. 
Many people have made generous contributions to the wide range of software distributed through that system in reliance on consistent application of that system; it is up to the author/donor to decide if he or she is willing to distribute software through any other system and a licensee cannot impose that choice. This section is intended to make thoroughly clear what is believed to be a consequence of the rest of this License. 8. If the distribution and/or use of the Program is restricted in certain countries either by patents or by copyrighted interfaces, the original copyright holder who places the Program under this License may add an explicit geographical distribution limitation excluding those countries, so that distribution is permitted only in or among countries not thus excluded. In such case, this License incorporates the limitation as if written in the body of this License. 9. The Free Software Foundation may publish revised and/or new versions of the General Public License from time to time. Such new versions will be similar in spirit to the present version, but may differ in detail to address new problems or concerns. Each version is given a distinguishing version number. If the Program specifies a version number of this License which applies to it and "any later version", you have the option of following the terms and conditions either of that version or of any later version published by the Free Software Foundation. If the Program does not specify a version number of this License, you may choose any version ever published by the Free Software Foundation. 10. If you wish to incorporate parts of the Program into other free programs whose distribution conditions are different, write to the author to ask for permission. For software which is copyrighted by the Free Software Foundation, write to the Free Software Foundation; we sometimes make exceptions for this. Our decision will be guided by the two goals of preserving the free status of all derivatives of our free software and of promoting the sharing and reuse of software generally. NO WARRANTY 11. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION. 12. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES. 
END OF TERMS AND CONDITIONS How to Apply These Terms to Your New Programs If you develop a new program, and you want it to be of the greatest possible use to the public, the best way to achieve this is to make it free software which everyone can redistribute and change under these terms. To do so, attach the following notices to the program. It is safest to attach them to the start of each source file to most effectively convey the exclusion of warranty; and each file should have at least the "copyright" line and a pointer to where the full notice is found. Copyright (C) This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA. Also add information on how to contact you by electronic and paper mail. If the program is interactive, make it output a short notice like this when it starts in an interactive mode: Gnomovision version 69, Copyright (C) year name of author Gnomovision comes with ABSOLUTELY NO WARRANTY; for details type `show w'. This is free software, and you are welcome to redistribute it under certain conditions; type `show c' for details. The hypothetical commands `show w' and `show c' should show the appropriate parts of the General Public License. Of course, the commands you use may be called something other than `show w' and `show c'; they could even be mouse-clicks or menu items--whatever suits your program. You should also get your employer (if you work as a programmer) or your school, if any, to sign a "copyright disclaimer" for the program, if necessary. Here is a sample; alter the names: Yoyodyne, Inc., hereby disclaims all copyright interest in the program `Gnomovision' (which makes passes at compilers) written by James Hacker. , 1 April 1989 Ty Coon, President of Vice This General Public License does not permit incorporating your program into proprietary programs. If your program is a subroutine library, you may consider it more useful to permit linking proprietary applications with the library. If this is what you want to do, use the GNU Lesser General Public License instead of this License.